Infrastructure outage incident report
Tuesday, Sep 28, 2021
By the Flowte Infrastructure and DevOps Team
Between 06:00 and 06:30 GMT on Sep 25th, requests to flowte.me began to have intermittent connectivity issues, which continued until 16:04 GMT. Flowte applications that rely on these calls also returned errors or had reduced functionality. At its peak, the issue affected 100% of traffic to our AWS infrastructure. The root cause of this outage was a set of system upgrades carried out by the Flowte team late on Friday, which caused unexpected issues and required a rollback to a previous version of the system.
Maintenance work carried out by the Flowte engineering team ran into issues, which made it necessary to roll back the system to its previous version. The rollback process took a few hours to complete and caused the system to go offline unexpectedly. The process was overseen by the engineering team located in Buenos Aires, which added extra hours to the logistics.
Corrective and Preventative Measures
New rules have been put in place to prevent any upgrades or maintenance work from being carried out on Fridays, a day previously chosen because it is the quietest day of the week.
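As an illustration only, a rule like this could be enforced automatically by a check that runs before any upgrade starts. The sketch below is a hypothetical Python example: the function name `is_maintenance_allowed` and the idea of a pre-upgrade check are assumptions made for illustration and do not describe Flowte's actual tooling.

```python
from datetime import date

# date.weekday() counts from Monday = 0, so Friday is 4.
FRIDAY = 4


def is_maintenance_allowed(day: date) -> bool:
    """Hypothetical guard: return False on Fridays, True on any other day."""
    return day.weekday() != FRIDAY


if __name__ == "__main__":
    # Sep 24, 2021 is the Friday before the outage described in this report.
    print(is_maintenance_allowed(date(2021, 9, 24)))
```

A check of this kind could be called at the start of an upgrade script so that the work stops early when the date falls on a Friday.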
On behalf of everyone here at Flowte, we would like to offer our most sincere apologies for the severe disruption caused on Saturday. We are resolved to do all that is in our power to prevent such issues from happening again.
The Flowte Infrastructure Team