On October 14, 2019, multiple Aventri clients reported receiving an error message when accessing the back end and front end of Aventri in North America (i.e., attendee registration). The Aventri Support Team picked up the ticket immediately, verified the issue, and escalated the ticket to the Aventri DevOps Team at the highest priority. Within 22 minutes, the Aventri DevOps Team identified and resolved the issue and informed the Aventri Support Team of the resolution. The Aventri Support Team independently verified the resolution and then informed our clients, at which point the ticket was closed.
Why it happened
As part of the investigation, the Aventri DevOps Team performed a root cause analysis and determined that the issue was the result of an overly restrictive database max connections setting. This setting is used to optimize the performance of our database. When the database came under heavy load, it reached the maximum number of connections and began rejecting additional connections, which produced the errors seen by our clients.
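The failure mode described above can be illustrated with a small simulation. This is a sketch only: the class and function names (SimulatedDatabase, ConnectionLimitExceeded, simulate_load) are hypothetical and do not represent the Aventri platform or any particular database product. It shows how a fixed connection cap causes every connection attempt beyond the cap to be rejected.

```python
class ConnectionLimitExceeded(Exception):
    """Raised when the simulated database has no free connection slots."""


class SimulatedDatabase:
    """Toy model of a database with a hard cap on concurrent connections."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.open_connections = 0

    def connect(self):
        # Reject the attempt once the cap has been reached.
        if self.open_connections >= self.max_connections:
            raise ConnectionLimitExceeded(
                f"limit of {self.max_connections} connections reached"
            )
        self.open_connections += 1


def simulate_load(db, attempts):
    """Try to open `attempts` connections; return (accepted, rejected)."""
    accepted = 0
    rejected = 0
    for _ in range(attempts):
        try:
            db.connect()
            accepted += 1
        except ConnectionLimitExceeded:
            rejected += 1
    return accepted, rejected
```

With a cap of 3 and 5 attempts, the first 3 attempts are accepted and the remaining 2 are rejected, which corresponds to the errors clients saw once the limit was hit.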
What we did about it
As the database approached the max connections setting value, the Aventri DevOps Team was alerted and began diagnosing the issue. By the time clients began seeing and reporting the issue, the Aventri DevOps Team had already diagnosed it and begun resolving it, which accounts for the quick resolution time. The Aventri DevOps Team began by killing sleeping connections to bring the total number of connections under the max connections threshold, and also raised the max connections setting. Once a sufficient number of sleeping connections had been killed and the updated max connections setting took effect, the platform began running properly. The Aventri DevOps Team then reassigned team members to actively monitor database connection and active process levels to confirm that the issue was resolved and did not recur.
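The first remediation step, killing sleeping connections until the total falls under the threshold, can be sketched as follows. This is an illustration under stated assumptions, not the procedure the Aventri DevOps Team used: the Connection record, the pick_connections_to_kill function, and the choice to kill the longest-idle connections first are all hypothetical. Only sleeping connections are ever selected, so active work is left untouched.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    conn_id: int
    state: str          # "sleeping" or "active" (hypothetical labels)
    idle_seconds: int   # how long the connection has been idle


def pick_connections_to_kill(connections, target_total):
    """Return ids of sleeping connections to kill so that the total
    connection count drops to at most `target_total`, if possible."""
    excess = len(connections) - target_total
    if excess <= 0:
        return []  # already at or under the target; nothing to kill

    # Only sleeping connections are candidates; active ones are never killed.
    sleeping = [c for c in connections if c.state == "sleeping"]

    # Assumption: prefer the connections that have been idle the longest.
    sleeping.sort(key=lambda c: c.idle_seconds, reverse=True)

    return [c.conn_id for c in sleeping[:excess]]
```

If there are fewer sleeping connections than the excess, the function returns all of them and the total stays above the target, which is one reason raising the limit may be needed alongside the cleanup.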
Corrective and Preventative Measures
As a result of the root cause analysis associated with this issue, Aventri is making a series of changes to the platform database environment.