Kinescope - Incident history

All systems operational

kinescope.io - Operational

100% uptime
Sep 2025 · 99.98% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

Dashboard - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

Uploading - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

Transcoding - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

Player embeds - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

Live streaming - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

Analytics - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

API - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 99.78%

DNS - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

CDN - Operational

100% uptime
Sep 2025 · 100.0% · Oct 2025 · 100.0% · Nov 2025 · 100.0%

Third Party: Webflow → Hosted Websites - Operational

Third Party: Mailgun → Email Services → Outbound Delivery - Operational

Incident history

Nov 2025

API Service outage
  • Postmortem

    During a scheduled drive replacement performed in two of our Moscow data centers, the application cluster responsible for the API service went offline. Redis failed to promote master roles to the remaining healthy node, causing the API service to repeatedly attempt connections to an unavailable Redis master. A manual failover resolved the issue and services were restored.

    Timeline & Root Cause Analysis

    • Planned maintenance was performed to replace hard drives in two data centers.

    • As part of the work, the app server in DC1 was shut down.

    • Later, the app server in DC2 was also shut down for the same maintenance.

      • The DC1 app server did not have enough time to fully come back online before the second shutdown.

    • As a result, two app servers in the cluster went down simultaneously.

    • Redis did not switch the master role to the remaining node in the other DC as expected.

    • The API service failed to start because it kept trying to connect to the Redis master located in DC1, which was unavailable.

    • Redis master roles were manually promoted to the servers in DC2.

    • Once Redis topology was corrected, the API and dashboard services recovered and returned to normal operation.
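The manual promotion in the last two steps of the timeline can be sketched as plain decision logic: given each node's Redis role and reachability, pick a healthy replica to promote (via `REPLICAOF NO ONE`). This is a minimal illustration only; the node names are hypothetical, not Kinescope's actual topology, and a production runbook would also compare replication offsets before promoting.

```python
def plan_manual_failover(nodes):
    """nodes: dict of name -> {"role": "master" | "replica", "up": bool}.

    Returns the name of a healthy replica to promote to master, or None
    if the current master is still reachable (nothing to do) or no
    healthy candidate exists (cluster fully down).
    """
    master_up = any(n["role"] == "master" and n["up"] for n in nodes.values())
    if master_up:
        return None  # topology intact; no manual intervention needed
    # Pick any healthy replica; a real procedure would also check
    # replication offsets to minimise data loss.
    for name, node in nodes.items():
        if node["role"] == "replica" and node["up"]:
            return name
    return None

# The incident state: the DC1 master is down, the DC2 replica is healthy,
# so DC2 is the node to promote manually.
incident_state = {
    "dc1-app": {"role": "master", "up": False},
    "dc2-app": {"role": "replica", "up": True},
}
```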

    Resolution

    • All Redis masters were manually switched to the healthy nodes in DC2.

    • Application services (API, dashboard) successfully started and functioned as expected.

    Next Steps / Preventive Actions

    • Applied corrections to the maintenance algorithm, ensuring app servers are never taken down simultaneously and Redis failover logic is properly validated before each step.

    • Review and improve the Redis automatic failover configuration.

    • Add additional health checks and monitoring around Redis master availability and app server readiness.

    • Adjust maintenance sequencing to guarantee sufficient startup time between operations.
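The health-check and sequencing items above could start from something as simple as parsing the output of Redis `INFO replication` before each shutdown step: proceed only if the surviving node reports itself as a master with at least one connected replica. A minimal sketch, with the sample strings standing in for real server responses:

```python
def replication_ready(info_text, min_replicas=1):
    """Parse text in the format of Redis `INFO replication` output and
    return True only if this node is a master with enough connected
    replicas - i.e. it is safe to take another app server down."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if ":" in line and not line.startswith("#"):
            key, _, value = line.partition(":")
            fields[key] = value
    return (fields.get("role") == "master"
            and int(fields.get("connected_slaves", "0")) >= min_replicas)

# Sample responses (illustrative, not captured from the incident):
healthy = ("# Replication\r\n"
           "role:master\r\n"
           "connected_slaves:1\r\n"
           "slave0:ip=10.0.0.2,port=6379,state=online")
degraded = ("# Replication\r\n"
            "role:slave\r\n"
            "master_link_status:down")
```

Gating each maintenance step on a check like this would have blocked the second shutdown while the DC1 server was still rejoining the cluster.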

  • Resolved
  • Investigating

    We are currently investigating this incident.

Oct 2025

No incidents reported this month

Sep 2025 to Nov 2025
