Livestreams are down
Incident Report for Livepeer Studio
Postmortem

Summary

This post-mortem describes the incident investigated on 04/08/24: https://status.livepeer.studio/incidents/ss8n6px77ny5

Incident

Description

After a fix was deployed to production, the Livepeer team received an internal alert about a spike in 500 errors. Shortly after, a user reported that their livestream playback wasn't working and that, when they tried to restart the stream, it could not be ingested. The Livepeer Studio team verified the problem and began investigating.

Impact

  • Livestreams:

    • New livestreams could not be ingested in any region
  • Viewers:

    • Viewers in all regions were unable to play back streams

Regions:

  • All Regions

Current status

The service has been fully restored.

https://status.livepeer.studio/

Timeline

  • 10:12 AM EST - The Livepeer Studio team was alerted to an increase in 500 errors
  • 10:13 AM EST - A user reported that existing livestream broadcasts had stopped working and that playback had stopped
  • 10:14 AM EST - The Livepeer Studio team acknowledged the incident and started an investigation
  • 10:29 AM EST - The investigation traced the issue to a deployment made at 9:37 AM EST; once those changes were in production, an alert went off, and the deployment was quickly reverted, which resolved the issue
  • 10:38 AM EST - After monitoring the fix for the incident, the Livepeer Studio team concluded that the issue was resolved

Prevention

  • Although the fix being deployed had already been tested on our Staging environment, its rollout to Production caused a non-graceful restart of our media server, which temporarily disrupted ongoing streams and prevented new streams from being created.
  • We are putting a fix in place to ensure this doesn't affect future deployments and are reviewing our deployment procedures to try to catch these kinds of issues before they reach Production.
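
For readers unfamiliar with the term, a graceful restart generally means the server stops admitting new work and waits for in-flight work to finish before it exits. The following is a minimal Python sketch of that draining pattern, offered only as an illustration and not as the fix Livepeer Studio put in place. The class `StreamRegistry`, its methods, and the timeout values are hypothetical names and numbers chosen for this example.

```python
import threading


class StreamRegistry:
    """Hypothetical tracker of active streams, used to drain before a restart."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()
        self._draining = False
        self._idle = threading.Event()
        self._idle.set()  # no streams yet, so the registry starts idle

    def start_stream(self, stream_id):
        """Admit a new stream unless a drain is in progress."""
        with self._lock:
            if self._draining:
                # The caller would route this stream to another instance.
                return False
            self._active.add(stream_id)
            self._idle.clear()
            return True

    def end_stream(self, stream_id):
        """Mark a stream as finished and signal idleness if none remain."""
        with self._lock:
            self._active.discard(stream_id)
            if not self._active:
                self._idle.set()

    def drain(self, timeout_seconds):
        """Stop admitting streams, then wait for active ones to end.

        Returns True if all streams ended before the timeout, else False.
        """
        with self._lock:
            self._draining = True
        return self._idle.wait(timeout_seconds)


if __name__ == "__main__":
    registry = StreamRegistry()
    registry.start_stream("stream-a")
    # Simulate the stream ending shortly after the drain begins.
    threading.Timer(0.1, registry.end_stream, args=("stream-a",)).start()
    if registry.drain(timeout_seconds=5.0):
        print("all streams finished; safe to restart")
    else:
        print("timeout reached; restart would interrupt active streams")
```

In a sketch like this, a deployment script would call `drain` before stopping the process and proceed only when it returns True, or after a deliberately chosen timeout.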
Posted Apr 12, 2024 - 16:17 UTC

Resolved
This incident has been resolved.
Posted Apr 08, 2024 - 14:40 UTC
Identified
We have identified the cause of the livestreaming issue and are implementing a fix.
Posted Apr 08, 2024 - 14:26 UTC
This incident affected: Livepeer Streaming API.