Kinescope - Notification history

All systems operational

kinescope.io - Operational

100% - uptime
Sep 2025 · 99.98%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

Dashboard - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

Uploading - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

Transcoding - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

Player embeds - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

Live streaming - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

Analytics - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

API - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 99.79%
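
To put a monthly figure in perspective, an uptime percentage can be converted into approximate downtime. The sketch below assumes the percentage is measured over the full calendar month, which the status page does not state, and `downtime_minutes` is a hypothetical helper name used only for illustration.

```python
import calendar


def downtime_minutes(uptime_percent: float, year: int, month: int) -> float:
    """Approximate downtime in minutes implied by a monthly uptime percentage.

    Assumes the measurement window is the whole calendar month.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


# API in Nov 2025 at 99.79% uptime (30 days, 43,200 minutes in total)
print(round(downtime_minutes(99.79, 2025, 11), 1))
```

Under that assumption, 99.79% over a 30-day month corresponds to roughly an hour and a half of downtime. If the month was still in progress when the figure was computed, the real number would be smaller.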

DNS - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

CDN - Operational

100% - uptime
Sep 2025 · 100.0%, Oct 2025 · 100.0%, Nov 2025 · 100.0%

Third Party: Webflow → Hosted Websites - Operational

Third Party: Mailgun → Email Services → Outbound Delivery - Operational

Notification history

Nov 2025

API is down
  • Postmortem

    During a scheduled drive replacement in two of our Moscow data centers, the application cluster responsible for the API service went offline. Redis failed to promote masters on the remaining healthy node, so the API service repeatedly attempted to connect to an unavailable Redis master. A manual failover resolved the issue and restored service.

    Timeline & Root Cause Analysis

    • Planned maintenance was performed to replace hard drives in two data centers.

    • As part of the work, the app server in DC1 was shut down.

    • Later, the app server in DC2 was also shut down for the same maintenance.

      • The DC1 app server did not have enough time to fully come back online before the second shutdown.

    • As a result, two app servers in the cluster went down simultaneously.

    • Contrary to expectations, Redis did not switch the master role to the remaining node in the other DC.

    • The API service failed to start because it kept trying to connect to the Redis master located in DC1, which was unavailable.

    • Redis master roles were manually promoted to the servers in DC2.

    • Once Redis topology was corrected, the API and dashboard services recovered and returned to normal operation.
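
The postmortem does not say why Redis did not fail over automatically. One common cause in setups with majority-based failover (for example Redis Sentinel, where a majority of Sentinel processes must authorize a failover) is that too few voters remain reachable. The sketch below illustrates only that general rule. The three-voter layout and the name `can_authorize_failover` are assumptions, not details taken from the incident.

```python
def can_authorize_failover(total_voters: int, reachable_voters: int) -> bool:
    """Return True if the reachable voters form a strict majority.

    Models a generic majority rule; it is not Kinescope's actual configuration.
    """
    if total_voters <= 0 or not 0 <= reachable_voters <= total_voters:
        raise ValueError("voter counts out of range")
    return reachable_voters > total_voters // 2


# Hypothetical layout: one voter per app server, three servers in total.
print(can_authorize_failover(3, 2))  # one server down: a majority is still reachable
print(can_authorize_failover(3, 1))  # two servers down: no majority remains
```

If the cluster followed a rule like this, losing two of three voters at once would leave the surviving node unable to promote itself, which would be consistent with the need for a manual promotion.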

    Resolution

    • All Redis masters were manually switched to the healthy nodes in DC2.

    • Application services (API, dashboard) successfully started and functioned as expected.

    Next Steps / Preventive Actions

    • Applied corrections to the maintenance procedure, ensuring that app servers are never taken down simultaneously and that Redis failover is validated before each step.

    • Review and improve the Redis automatic failover configuration.

    • Add health checks and monitoring for Redis master availability and app server readiness.

    • Adjust maintenance sequencing to guarantee sufficient startup time between operations.
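
The sequencing and readiness items above can be sketched as a rolling procedure that services one server at a time and refuses to continue until that server reports ready. This is a minimal illustration, not the team's actual tooling: `rolling_maintenance` and its callback parameters (`take_down`, `bring_up`, `is_ready`) are hypothetical names, and the timeout and polling values are placeholders.

```python
import time
from typing import Callable, Iterable


def rolling_maintenance(
    servers: Iterable[str],
    take_down: Callable[[str], None],
    bring_up: Callable[[str], None],
    is_ready: Callable[[str], bool],
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
) -> None:
    """Service servers one at a time, waiting for readiness before moving on.

    Raises TimeoutError and stops if a server does not become ready in time,
    so the next server is never taken down while the previous one is still out.
    """
    for server in servers:
        take_down(server)   # hypothetical hook: stop the server for maintenance
        bring_up(server)    # hypothetical hook: start it again afterwards
        deadline = time.monotonic() + timeout_s
        while not is_ready(server):  # e.g. app health check plus Redis role check
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{server} not ready; aborting maintenance")
            time.sleep(poll_s)
```

In practice `is_ready` would combine the health checks mentioned above, such as confirming that the app server responds and that a Redis master is reachable, before the loop moves to the next data center.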

  • Resolved
  • Investigating

    We are currently investigating this incident.

Oct 2025

No notifications were reported this month

Sep 2025 to Nov 2025
