Office 365 mail was unavailable! For a full day, Japan time

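Microsoft Japan has replied about the incident described below, so I am reprinting the response here. It's the cloud, so there really is nothing we can do about it.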

■ Incident information
Incident ID : EX21572
Incident name : Exchange Online access issue
Affected services : Exchange Online / Outlook / Outlook Web App / Exchange Active Sync / Exchange Web Service

■ Summary
Users affected by this incident could not connect to the Exchange Online service, or could connect only intermittently, over multiple protocols (Outlook, Outlook Web App (OWA), Exchange Active Sync (EAS), Exchange Web Service (EWS), and others).

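■ Timeline (Japan time)
Monday, April 20, 2015
7:15 : The incident began as a low-level packet loss issue.
8:06 : The engineering team began investigating alerts about database connection failures.
8:22 : The engineering team reported that only one database was raising connection alerts. Other databases had also been disconnected earlier but had since recovered.
9:11 : A second, more serious issue occurred, related to asymmetric link usage. Because multiple mailboxes were affected, restarts were started.
9:30 to 11:59 : To identify the cause, engineers began investigating database connectivity and availability, network resource utilization, packet loss, and network errors. While network engineers investigated, Exchange engineers kept restarting the affected systems to restore service temporarily.
11:59 to 12:20 : Load-balancing mode was stopped to improve the situation further.
12:23 : Network engineers applied a configuration update to resolve the asymmetric link issue. However, service was not fully restored.
12:26 : Increased packet loss between datacenters was observed.
13:03 to 17:26 : Engineers investigated the network resource utilization issue.
17:26 : A faulty network link was found, separate from the asymmetric link issue.
18:00 : To restore availability, traffic was routed away from the degraded network link to another area. Network availability was confirmed to be improving, but monitoring of the affected system components continued.
19:07 : Service recovery was confirmed.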

■ Incident start date and time
Monday, April 20, 2015, 7:15 (Japan time)

■ Service recovery date and time
Monday, April 20, 2015, 18:00 (Japan time)

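■ Root cause
Two separate network problems occurred, and connections to the Exchange Online mailbox systems failed intermittently. The first was a problem with asymmetric link usage in the network infrastructure. In addition, a network link degraded, which caused packet loss in part of the network infrastructure.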

■ Next steps
Finding : Two network links within the Exchange Online service degraded.
Action : Implement enhanced tooling to diagnose and resolve network routing faults.
Target completion : May 2015


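At my company we noticed the outage from around 15:00 on April 20. Looking at the incident records, it seems to have started in the morning. Service has now recovered.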

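This is an outage that comes with depending on the cloud, and there is nothing we can do to fight it. Let's go through the information that has been published.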

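This time, multiple clients and protocols, namely Outlook, OWA (Outlook Web App, the webmail client), Exchange ActiveSync (EAS), and Exchange Web Services (EWS), could not reach Exchange Online, the mail service that is part of Office 365. The notices are written in UTC (Coordinated Universal Time), so you have to add 9 hours to get Japan time. What a pain...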

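The incident began at 7:15 on April 20, 2015 and was resolved at 18:00, both in Japan time. Ironically, that timing means regions near UTC+0 were largely unaffected: for them it ran from 22:15 on Sunday to 9:00 on Monday, mostly outside business hours.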
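The time-zone arithmetic here can be checked with a short script. This is only a sketch using the start and recovery times quoted in this post. The variable names are my own, and JST is modeled as a fixed UTC+9 offset, which is safe because Japan does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: JST modeled as a fixed +09:00 offset (Japan has no DST).
JST = timezone(timedelta(hours=9), "JST")

# Start and recovery times as reported in Japan time.
start_jst = datetime(2015, 4, 20, 7, 15, tzinfo=JST)
end_jst = datetime(2015, 4, 20, 18, 0, tzinfo=JST)

# Convert to UTC, the time zone used in the dashboard notices.
start_utc = start_jst.astimezone(timezone.utc)
end_utc = end_jst.astimezone(timezone.utc)

# Length of the outage between the reported start and recovery times.
duration = end_jst - start_jst

print("Start (UTC):", start_utc.strftime("%Y-%m-%d %H:%M"))
print("End   (UTC):", end_utc.strftime("%Y-%m-%d %H:%M"))
print("Duration   :", duration)
```

The converted start time falls on the previous calendar day in UTC, which matches the "Sunday, April 19" start time in the English notices.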

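However, I can't find any explanation of why the fault occurred in the first place.

According to the preliminary root cause, high network resource utilization made the Exchange Online mailbox systems lose connectivity intermittently, but this is still under investigation. As a provisional fix, engineers are restarting services on the mailbox systems that show residual impact and confirming that service is restored.

In past outages, Outlook and similar clients would fail while OWA kept working. This time nothing could connect over any service, which makes it a large-scale outage.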

I will keep updating this post as more details about the problem come in.

JST to UTC (UTC hours 15 to 23, shown in red in the original, fall on the previous day)
JST   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
UTC  15 16 17 18 19 20 21 22 23  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
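The table above follows one rule: subtract 9 hours and wrap around midnight. A minimal sketch of that rule is below. The function name `jst_hour_to_utc` is made up for illustration, and it handles whole hours only.

```python
def jst_hour_to_utc(jst_hour):
    """Return (utc_hour, is_previous_day) for a whole hour given in JST (UTC+9)."""
    utc_hour = (jst_hour - 9) % 24
    # JST hours 0 to 8 map to 15 to 23 UTC on the previous calendar day.
    is_previous_day = jst_hour < 9
    return utc_hour, is_previous_day


# Print the same conversion table, marking the previous-day hours.
for hour in range(24):
    utc_hour, previous_day = jst_hour_to_utc(hour)
    note = " (previous day)" if previous_day else ""
    print(f"JST {hour:2d} -> UTC {utc_hour:2d}{note}")
```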
Incident / Date and time / Status / Details

EX21572 / April 20, 2015 19:08 / Service restored

In an effort to improve our incident communications, we are adding a survey to some of our service notifications. Please take a moment to fill out the brief survey by going to the following link: http://research.ipsosinteractive.com/surveys/?pid=S1043972&id=&SupplierID=9999&CultureInfo=EN-US&ReturnCode=NR&idType=real&groupcode=0009

Final Status: Engineers redirected traffic to bypass a degraded network link, which restored service.

User Experience: Affected users were unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange Active Sync (EAS), and Exchange Web Service (EWS).

Customer Impact: Customers reported that they were experiencing this issue.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Incident End Time: Monday, April 20, 2015, at 9:00 AM UTC

Preliminary Root Cause: A degraded network link caused Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Steps: The following is a list of known action item(s) associated with this incident. As part of the Office 365 problem management process, additional engineering actions may be identified to improve the overall service.
- Action: Review network routing to avoid communication interruptions related to Exchange Online activity.

A post-incident report will be available on the Service Health Dashboard within five business days.

April 20, 2015 18:26 / Restoring service

Current Status: In order to restore availability, engineers routed traffic away from the degraded network link while it is fixed. Availability is improving, but engineers are continuing to monitor the affected system components.

User Experience: Affected users are intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: Customers are reporting that they are experiencing this issue.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Preliminary Root Cause: A degraded network link may have caused Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Update by: Monday, April 20, 2015, at 10:30 AM UTC.

April 20, 2015 17:26 / Service degradation

Current Status: Engineers have determined that some network infrastructure has become degraded and is causing intermittent packet loss which, in turn, is causing availability drops. Engineers continue to monitor connectivity to the affected infrastructure while investigating options to connect mailbox systems to alternate capacity.

User Experience: Affected users are intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: Customers are reporting that they are experiencing this issue.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Preliminary Root Cause: High network resource utilization may have caused Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Update by: Monday, April 20, 2015, at 9:30 AM UTC.

April 20, 2015 16:26 / Service degradation

Current Status: Network engineers are examining diagnostic logs to determine the cause of the connectivity issues as some affected infrastructure continues to exhibit availability drops.

User Experience: Affected users are intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: A few customers are reporting that they are experiencing this issue.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Preliminary Root Cause: High network resource utilization may have caused Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Update by: Monday, April 20, 2015, at 8:30 AM UTC.

April 20, 2015 15:08 / Restoring service

Current Status: Engineers continue to restart services on mailbox systems that are exhibiting residual impact. Additionally, engineers continue to monitor the affected environment to confirm resolution. Customers may begin to experience service restoration as the service restarts progress.

User Experience: Affected users are intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: A few customers are reporting that they are experiencing this issue.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Preliminary Root Cause: High network resource utilization caused Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Update by: Monday, April 20, 2015, at 7:30 AM UTC

April 20, 2015 14:01 / Restoring service

Current Status: Network engineers have updated a network configuration. Engineers are restarting services on mailbox systems that are exhibiting residual impact following the update. Additionally, engineers are monitoring the affected environment to confirm resolution.

User Experience: Affected users are intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: A few customers are reporting that they are experiencing this issue.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Preliminary Root Cause: High network resource utilization is causing Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Update by: Monday, April 20, 2015, at 6:30 AM UTC

April 20, 2015 13:08 / Restoring service

Current Status: The investigation determined that the network resource utilization issue persists. A potential source of the issue is a networking configuration update. Engineers are in the process of reverting the configuration update and connecting affected systems to alternate capacity.

User Experience: Affected users are intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: A few customers are reporting that they are experiencing this issue.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Preliminary Root Cause: High network resource utilization is causing Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Update by: Monday, April 20, 2015, at 5:30 AM UTC

April 20, 2015 11:51 / Service degradation

Current Status: While monitoring the affected environment, engineers observed that mailbox systems continue to encounter network errors. Engineers continue to investigate to determine possible remediation steps.

User Experience: Affected users may be intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: Customer impact appears to be limited at this time.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Preliminary Root Cause: High network resource utilization is causing Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Update by: Monday, April 20, 2015, at 5:30 AM UTC

April 20, 2015 10:29 / Restoring service

Current Status: The investigation determined that high network resource utilization is causing Exchange Online mailbox systems to intermittently lose connectivity. Engineers have halted specific types of mailbox migration operations to reduce resource utilization. Additionally, engineers have restarted mailbox systems that became degraded due to the network issue. Engineers are monitoring the affected environment to confirm service restoration.

User Experience: Affected users may be intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: Customer impact appears to be limited at this time.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Preliminary Root Cause: High network resource utilization is causing Exchange Online mailbox systems to intermittently lose connectivity. The underlying root cause is being investigated.

Next Update by: Monday, April 20, 2015, at 3:30 AM UTC

April 20, 2015 9:24 / Service degradation

Current Status: Engineers are investigating system logs and networking components to determine the source of the issue.

User Experience: Affected users may be intermittently unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: Customer impact appears to be limited at this time.

Incident Start Time: Sunday, April 19, 2015, at 10:15 PM UTC

Next Update by: Monday, April 20, 2015, at 1:30 AM UTC

April 20, 2015 8:28 / Service degradation

Current Status: Engineers continue to investigate an issue in which some users may be experiencing problems accessing or using Exchange Online services or features. This event is actively being investigated. More information will be provided shortly.

User Experience: Affected users may be unable to connect to the Exchange Online service when using multiple protocols including Outlook, Outlook Web App (OWA), Exchange ActiveSync (EAS), and Exchange Web Services (EWS).

Customer Impact: Customer impact appears to be limited at this time.

April 20, 2015 7:47 / Investigating

Current Status: Engineers are investigating an issue in which some customers may be experiencing problems accessing or using Exchange Online services or features. This event is actively being investigated. More information will be provided shortly.

Incident / Date and time / Status / Details

EX21583 / April 20, 2015 18:29 / False positive

Final Status: The investigation is complete and engineers have determined the service is healthy. A service incident did not actually occur.

April 20, 2015 17:35 / Investigating

Current Status: Initial service health assessments indicate that the Exchange Online service is healthy, and the issue may be affecting monitoring alert infrastructure only. Engineers are continuing to perform a complete assessment of system health to confirm that there is no user impact.

User Experience: Affected users may be unable to connect to the Exchange Online service.

Customer Impact: Preliminary service health assessments indicate that there is no customer impact at this time.

Next Update by: Monday, April 20, 2015, at 9:30 AM UTC

April 20, 2015 17:04 / Investigating

Current Status: Engineers are reviewing system monitoring logs to investigate a potential issue with the Exchange Online service.

User Experience: Affected users may be unable to connect to the Exchange Online service.

Customer Impact: Customer impact appears to be limited at this time.

April 20, 2015 16:34 / Investigating

Current Status: Engineers are investigating an issue in which some customers may be experiencing problems accessing or using Exchange Online services or features. This event is actively being investigated. More information will be provided shortly.
