About Integrates Outage on 2020-10-09
What happened
- We are currently working on standardizing our infrastructure with Nix.
- At 14:33 (COT), our team deployed a change that used experimental Docker syntax. We expected it to affect only the base infrastructure, but because of how Docker works internally, the caches for the Integrates containers were also lost.
- At the same time, one of our sub-dependencies was updated, which broke compatibility with our source code.
What we’ve done
- After an intensive debugging process, our team reproduced the issue at 10:33 (COT) on 2020/10/10.
- We implemented a temporary solution and restored the service at 10:40 (COT) on 2020/10/10.
- We committed a definitive fix at 11:00 (COT).
- We committed a complementary solution at 11:32 (COT).
What the impact was
- Users were unable to log in to Integrates or use the API from 2020/10/09 14:33 (COT) until 2020/10/10 10:40 (COT).
What we are doing to help
- Continuing to standardize our infrastructure: https://gitlab.com/fluidattacks/product/-/issues/3504
- Freezing all dependencies and sub-dependencies: https://gitlab.com/fluidattacks/product/-/issues/3522
- Standardizing async helpers: https://gitlab.com/fluidattacks/product/-/issues/3521
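
As an illustration of what freezing dependencies can look like in practice, here is a minimal sketch of a check that flags any requirement not pinned to an exact version. It assumes a pip-style requirements list, which this report does not confirm is what Integrates uses; the function name `unpinned_requirements` and the package names in the usage example are hypothetical, and the pattern is deliberately simplified.

```python
import re
from typing import List

# Matches an exactly pinned requirement such as "name==1.2.3".
# Assumption: simplified pattern that ignores extras, environment markers,
# and hashes, and treats wildcard versions like "==1.*" as not pinned.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[A-Za-z0-9.!+_-]+$")


def unpinned_requirements(text: str) -> List[str]:
    """Return the requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if line and not PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    # Hypothetical requirements content, for illustration only.
    sample = "aiohttp==3.6.2\nrequests>=2.0\n# a comment\nboto3\n"
    print(unpinned_requirements(sample))
```

A check like this only covers the packages listed in the file. Sub-dependencies stay unfrozen unless they are listed and pinned as well, for example by generating the list from the fully resolved environment.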