Resolved and monitoring System outage

Discussion in 'Outages and maintenance' started by dmitri, Aug 20, 2019.

Thread Status:
Not open for further replies.
  1. dmitri

    dmitri Everleap staff

    There is a networking issue in the data center that is causing errors on sites. Our system administrators are investigating the cause and should be able to resolve it soon.
     
    ravi likes this.
  2. Is this related to the 502 error? It's appearing on all of my sites.
     
    ravi likes this.
  3. Our sites have been down for half an hour, which is an eternity to our clients. Is there an update or ETA for the fix?
     
  4. dmitri

    dmitri Everleap staff

    Yes, all of the user sites are getting 502 errors. Our entire SA team is working on it.
     
    Mike Whatley likes this.
  5. dmitri

    dmitri Everleap staff

    We do not have an ETA at this time; however, our system administrators were able to isolate the problem to the network storage. Sites cannot connect to the network storage and are returning 502 errors.
     
    ravi likes this.
  6. Frankc

    Frankc Everleap staff

    The outage is related to a failure in the backend storage system. Our team is working on fixing it.

    We will post more updates when available.
     
    ravi likes this.
  7. Isn't there some alternate means to keep us running while the root cause is figured out? Even a temporary slower connection is better than no connection for this amount of time.
     
    Kristy Pearce, Leitzel, ravi and 2 others like this.
  8. Martin Ortega

    Martin Ortega Everleap staff

    As soon as we hear more information we'll go ahead and provide it here. Please monitor this forum post for any new information.
     
  9. Martin Ortega

    Martin Ortega Everleap staff

    We'll have to ask our higher level of support whether that is possible after we review this outage.

    Also note that we don't take these types of outages lightly. We'll investigate possible workarounds after we determine the cause of the outage and provide more information at a later time.

    At this moment we're providing updates regarding this problem.
     
    ravi likes this.
  10. Joseph Jun

    Joseph Jun Everleap staff

    The engineering team is still working on getting the storage system back online but we're making some progress. We'll post another update as soon as possible.
     
  11. Martin Ortega

    Martin Ortega Everleap staff

    We're making progress and certain websites may be up and running, but please note we're still working on resolving the issue. We'll provide more information as we get it.
     
  12. Martin Ortega

    Martin Ortega Everleap staff

    We're still working on getting service restored. However, we don't have any new information.
     
  13. Joseph Jun

    Joseph Jun Everleap staff

    We've been able to restore some sites on the reserved cloud servers but it's a slow and gradual process. We're still making progress on the shared sites.

    We'll have another update in approximately 30 minutes from the engineering team.
     
  14. Joseph Jun

    Joseph Jun Everleap staff

    I don't have any new information at this time; status remains the same.

    I'll contact engineering for an update and hopefully we'll have some new details to share in about 30 minutes or so.
     
  15. Joseph Jun

    Joseph Jun Everleap staff

    Service to the Everleap Control Panel has been restored.
     
  16. Martin Ortega

    Martin Ortega Everleap staff

    Our service has been restored. We'll have more information about what happened at a later time. At this moment we're still monitoring the services for any new problems.
     
  17. Takeshi

    Takeshi Everleap staff

    Following up on what happened.

    At 7:10am PST Aug 20, 2019, we received an alert from our monitoring system that the backend storage array went down.
    At 9:50am, our engineers were able to bring back the storage array.
    At 10:15am, we started to rebuild the servers within the Azure system. This took a while and websites started coming back online in a staggered fashion.
    The first websites started coming back online at about 11:00am. The rebuild was fully completed at approximately 1:00pm PST.
    We are working with the storage system vendor to identify the root cause of the problem.
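
For readers who want the elapsed time at each step, the timeline above can be turned into minutes since the first alert. This is only a rough sketch: the event labels and the `minutes_since_start` helper are made up for illustration, the timestamps are copied from the post as written, and the 11:00am and 1:00pm entries are approximate in the original.

```python
from datetime import datetime

# Times copied from the timeline above; all on Aug 20, 2019, same time zone.
# Labels are illustrative, not official incident terminology.
timeline = [
    ("storage array down (alert)", datetime(2019, 8, 20, 7, 10)),
    ("storage array back", datetime(2019, 8, 20, 9, 50)),
    ("server rebuild started", datetime(2019, 8, 20, 10, 15)),
    ("first websites back (approx.)", datetime(2019, 8, 20, 11, 0)),
    ("rebuild completed (approx.)", datetime(2019, 8, 20, 13, 0)),
]


def minutes_since_start(events):
    """Return {label: whole minutes elapsed since the first event}."""
    start = events[0][1]
    return {
        label: int((when - start).total_seconds() // 60)
        for label, when in events
    }


elapsed = minutes_since_start(timeline)
for label, minutes in elapsed.items():
    print(f"{label}: +{minutes // 60}h{minutes % 60:02d}m")
```

By this arithmetic the storage array was unavailable for about 2 hours 40 minutes, and the full rebuild finished roughly 5 hours 50 minutes after the first alert.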
     