About downtime on 2020-03-04
What happened
On March 4th, from 8:38 AM to 8:50 AM, Integrates was inaccessible because the Kubernetes cluster was overloaded. The overload was caused by new experimental ephemeral environments that were not being shut down.
What we’ve done
Once we found the cause, we restarted the cluster’s overloaded nodes, added extra nodes to increase computing capacity, and wrote a function that removes experimental ephemeral environments.
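As an illustration only, the selection step of such a cleanup function could look roughly like the sketch below. It is not the code we deployed: the `Environment` record, the `select_expired` helper, the environment names, and the 24-hour age limit are all hypothetical, and the actual deletion call against the cluster is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Environment:
    """Hypothetical record describing one deployed environment."""
    name: str
    ephemeral: bool
    created_at: datetime


def select_expired(environments, now, max_age=timedelta(hours=24)):
    """Return the names of ephemeral environments older than max_age.

    Non-ephemeral environments are never selected, whatever their age.
    The 24-hour default is an assumed value, not a documented setting.
    """
    return [
        env.name
        for env in environments
        if env.ephemeral and now - env.created_at > max_age
    ]


# Example data (made up for this sketch).
now = datetime(2020, 3, 4, 9, 0, tzinfo=timezone.utc)
environments = [
    Environment("production", False, now - timedelta(days=90)),
    Environment("review-old", True, now - timedelta(days=3)),
    Environment("review-new", True, now - timedelta(hours=1)),
]

expired = select_expired(environments, now)
print(expired)  # prints ['review-old']
```

Keeping the selection separate from the deletion makes the rule easy to test without touching the cluster: the returned names would then be passed to whatever removes the environments.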
What’s the impact
Users were unable to access Integrates during those 12 minutes.
What we are doing to help
We restarted the overloaded nodes, increased the cluster size, and wrote a function that stops experimental ephemeral environments.