
Service Manager Issues [RESOLVED]


Jeremy

Today we are seeing the issue shown in the image below on several people's logins to various applications. There are also reports of slowness and 'cannot connect to database' errors. Is there something going on that we need to be aware of?

image.thumb.png.307f2c1a001c265392ec2724cf54bcdd.png


  • Victor changed the title to Service Manager Issues [RESOLVED]

@all

The Infrastructure team confirms the issue should now be resolved and full functionality restored. Let us know if you see any further issues. We are investigating what caused the issue and will update when we have more information.

We are deeply sorry for all the trouble this has caused.


@all   

Our Infrastructure team have completed their analysis and determined the following:

At 14:09 our monitoring systems alerted us simultaneously to a number of issues with around 10% of our customer instances. All issues were related to the performance of the underlying disks on a given node, which would have resulted in customers seeing below-expected performance or occasional disconnects. Our cloud team immediately identified the cause as a disk concurrency issue affecting one of the underlying nodes and began reducing the load.

The issue was resolved by 14:13.

At 14:16 the same issue occurred again and we took the same steps to resolve it. This second occurrence ended at 14:19.

The root cause has been identified as an issue with session cloning (usually during elevation of Flowcode) when the same session, or multiple sessions, are cloned repeatedly in a very short time and those sessions hold a large volume of cached data. The chance of this combination of events is small.

This caused concurrency issues with the other instances running on the same node/disks.

We have now identified the root cause and have a development plan to prevent the issue going forward (Session Cloning will no longer copy cached data unless forced). We expect to see this change rolled out over the next few weeks. Given the low likelihood of this occurring, we do not see the need to produce a patch.
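For anyone curious what "will no longer copy cached data unless forced" could look like in practice, here is a rough Python sketch. It is purely illustrative and not the actual product code: the `Session` class, the `clone_session` function and the `copy_cache` flag are all made-up names, and the real implementation may differ in every detail.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical stand-in for a user session (not the product's real type)."""
    user: str
    settings: dict = field(default_factory=dict)
    cache: dict = field(default_factory=dict)  # potentially large cached data


def clone_session(source: Session, copy_cache: bool = False) -> Session:
    """Clone a session, skipping cached data unless the caller forces it.

    `copy_cache` is an assumed name for the "forced" option in the post.
    """
    clone = Session(user=source.user, settings=deepcopy(source.settings))
    if copy_cache:
        # Only pay the cost of duplicating the cache when explicitly asked.
        clone.cache = deepcopy(source.cache)
    return clone


original = Session("example-user", {"theme": "dark"}, {"report": list(range(1000))})
light = clone_session(original)                  # default: cache is left empty
full = clone_session(original, copy_cache=True)  # forced: cache is duplicated
```

With the default, repeated clones in quick succession carry only the small settings data, so the large cache is never duplicated by the clone itself; the forced path keeps the old behaviour for cases that need it.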

We apologise for any inconvenience this may have caused.
