On May 23rd, 2022, 1500 people were sent 3 surveys. Soon after, survey respondents complained of app server downtime and unresponsive pages.
The associated DocumentDB still logged their responses.
Based on @sumants-dev's initial analysis, the problem appears to be that DocumentDB could not handle so many survey-result connections at once (up to 1500*3 = 4500).
Possible Solutions
Rate Management:
Establish a ceiling on the number of surveys sent in a given stretch of time.
Force requests to occur no faster than a certain rate.
how/where to encode (?)
DB Size: Add instances so that DocumentDB can better handle this.
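A minimal sketch of the ceiling idea under Rate Management, assuming surveys can be sent in batches on a timer. The helper name `buildSendSchedule`, the batch size of 25, and the one-hour window are all hypothetical and not taken from the surveyor code; the point is only to show how a send ceiling could be expressed as a schedule.

```javascript
// Hypothetical helper: split `recipients` into batches and give each batch a
// start delay so that the batches are spread evenly across `windowMs`.
function buildSendSchedule(recipients, batchSize, windowMs) {
  if (batchSize <= 0) throw new RangeError("batchSize must be positive");
  const batchCount = Math.ceil(recipients.length / batchSize);
  // Gap between consecutive batch start times.
  const gapMs = batchCount > 1 ? windowMs / batchCount : 0;
  const schedule = [];
  for (let i = 0; i < batchCount; i++) {
    schedule.push({
      delayMs: Math.round(i * gapMs),
      batch: recipients.slice(i * batchSize, (i + 1) * batchSize),
    });
  }
  return schedule;
}

// Assumed numbers: 1500 recipients, 25 per batch, spread over one hour.
const recipients = Array.from({ length: 1500 }, (_, i) => `user-${i}`);
const schedule = buildSendSchedule(recipients, 25, 60 * 60 * 1000);
console.log(schedule.length, schedule[1].delayMs);
```

Each entry could then be handed to a timer or job queue that calls whatever function actually sends the surveys. Where that logic should live in the code base is still the open "how/where to encode" question.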
Resources
DocumentDB configuration (?)
Code for deploying surveys (?)
TODO
Evaluate Rate Management solution
Evaluate DB scale solution
Document DocumentDB configuration/deployment
Document Survey Deployment code
Based on initial discussions, it looks like Rate Management might be the solution to start with. But ideally we need to space requests out across an hour or more, not just a few minutes.
Question: How many requests can it currently handle?
I set up alerts on the MongoDB for when the DB connections hit the limit of 30. If you go to the monitoring dashboard for the DocumentDB, it shows the database connections at the limit of 30 during the period when there were server issues.
The database is the DocumentDB.
The code base communicating with surveyor and the database is in surveyor. Check out the databases folder in surveyor for the connection. I think the fix is likely to remove all the awaits around database inserts and make them asynchronous (use .then syntax). What's happening is that the database can only handle 30 connections at a time, and the driver will typically keep retrying until the insert is submitted. This is fine. However, because the code uses await, the user is left waiting for the eventual submission to go through. This can be done in the background asynchronously.
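A rough sketch of the await-versus-.then change described above. It is not the actual surveyor code: `insertResponse` is a stand-in for the real DocumentDB insert (assumed to return a Promise), and `submitBlocking` / `submitInBackground` are hypothetical names used only for illustration.

```javascript
// Stand-in for the real insert; assumed to return a Promise that resolves
// once the driver has finally written the document.
function insertResponse(response) {
  return new Promise((resolve) =>
    setTimeout(() => resolve({ ok: true, id: response.id }), 50)
  );
}

// Current behaviour (as described): the user's request does not complete
// until the insert, including any driver retries, has finished.
async function submitBlocking(response, insert = insertResponse) {
  await insert(response);
  return { status: "submitted" };
}

// Proposed behaviour: start the insert, attach handlers with .then/.catch,
// and reply to the user without waiting for the write to finish.
function submitInBackground(response, insert = insertResponse, onError = console.error) {
  const pending = insert(response)
    .then(() => true)
    .catch((err) => {
      // A failed background write must be surfaced somewhere, because the
      // user has already been told the submission was accepted.
      onError("insert failed for response", response.id, err);
      return false;
    });
  return { status: "accepted", pending };
}

const result = submitInBackground({ id: 1, answers: ["a", "b"] });
console.log(result.status); // logged before the insert has completed
```

The trade-off is that the user sees a success page before the write is confirmed, so failed inserts would need logging or a retry path rather than an error shown to the respondent.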
Here he references
a) database connection limit
Also: there was a discussion on our Slack about adding 'instances'. I'm not sure how that differs from the 'database connection limit', but that's something we can explore at our meeting tomorrow. @rivera-lanasm can you comment on this?