Posted

Got this error message in service manager. I closed it down and now the instance itself will not load for any of us. Please investigate ASAP.

[Three screenshots attached]

Posted

At 12:11 on Tuesday 30th July we detected the loss of one of our nodes. The root cause was identified as a Blue Screen of Death, which was resolved by restarting the node and all services. All instances were recovered by 12:15.

The operating system should not fail in this way. The BSOD reports a driver IRQ conflict which, given that it's a virtual machine and not subject to hardware changes, should be virtually impossible. Unfortunately, we have been unable to identify a solution or patch from Microsoft thus far.

As with any failure of this nature, even when the cause is out of our hands, we develop a strategy to prevent future problems. To this end, we are now doing the following.

Long-Term - Remove Windows from our server stack
We have already begun the process of moving our application code from Windows to Linux. Windows has been less reliable and more difficult to work with in our stack, so we are keen to remove it. This change will be transparent to customers. The migration to Linux should not only significantly increase stability, but is also part of a larger strategic move toward an even more flexible microservices architecture.
