Discourse Down for about 2 hours tonight (due to AWS issues)

Teke · July 3, 2014, 3:26am

Hi there, Spark Community - Thanks for your patience as our crack team of engineers ( @jgoggins @Dave!! ) worked for the last two hours to bring back the Forum. Please enjoy yourselves responsibly, and thank you for being amazing.

jgoggins · July 3, 2014, 4:00am

Hey Folks,

Thanks for sticking with us. According to a very helpful and knowledgeable Amazon tech rep, community.spark.io went down because of “an underlying hardware fault which caused the instance to become unresponsive. Unfortunately, this issue affected one piece of hardware and you were unlucky to be have this occur to your instance.”

It sounded like it was a freak occurrence. When I asked why it didn’t appear on the data center’s status page, he said “the status page updates are for wide scale issues.” This surprised me. If the physical hardware was what failed, there were likely other virtual machines that were impacted too. At what scale does a system failure warrant a status update or notification of those who manage that infrastructure? I would have expected to receive some notification. But of course from Amazon’s perspective, the recommendation is to architect for high availability so systems self-heal under scenarios like this (like the other parts of the Spark Cloud already do). Doing this for Discourse is on the roadmap.

It is a shabby thing that the computing power behind this wonderful digital habitat got benched for a few hours due to a freak occurrence, and I’m glad we’re up again. If anyone has any follow-up questions, please post them here or reach out to me directly.

Thanks for your patience and happy hacking.

-joe
