Metro outage caused by hardware glitch

Metro officials have concluded that a hardware failure caused their train monitoring system to go dark twice this past weekend, prompting them to halt service on all five lines. But they say they still cannot rule out the possibility of another outage.

Metro spokesman Dan Stessel said that a module about the size of a pizza box failed, apparently hampering the flow of data used to monitor train movement throughout the system. He likened the breakdown to an engine that does not get enough gas. What caused that module to fail, however, is still under investigation.

After each outage, Metro officials temporarily suspended train operations until the system could be rebooted, and they searched for the cause of the problem.

Stessel said officials are now focused on finding ways to prevent such a breakdown from recurring by developing a backup system. Although there is a system in place that would have ensured continued operation, it is designed to kick in when there is a clear signal that a module has failed, Stessel said. In this case, the module continued to send indications it was working even though it was not.

Technicians replaced the faulty module, housed at the central control center in suburban Maryland, early Wednesday. The system was restarted in time for the morning rush hour, Stessel said.

Still, the ability of one module to bring down the monitoring system has experts questioning why the failure occurred. Stessel said that had problems with the monitoring system continued, Metro could have used a second system at its control center in the District.

The monitoring system helps controllers track and manage trains as they travel throughout Metro’s 106-mile system of tracks. While critical to operations, the monitoring system is different from the train protection system. That relies on track circuits to detect the presence of trains and sends speed commands to keep them properly spaced.

Just after 2 p.m. Saturday, Metro halted service on all lines after the train monitoring system went dark. The outage lasted for a little more than 30 minutes. Less than 12 hours later, the monitoring system again went dark and officials halted train service. In both instances, technicians rebooted the system to get it back online as they continued to scour computer logs for signs of an anomaly.

The weekend outage came on the heels of two other incidents and had both passengers and elected officials once again questioning the safety and reliability of the system.

On July 3, passengers complained they did not get clear evacuation instructions after the Green Line train on which they were riding broke down just outside the College Park station. Three days later, a Green Line train derailed near the West Hyattsville stop. Metro officials believe excessive heat made a section of track buckle, causing the accident.

Speed restrictions because of high temperatures were not in place before the derailment, but immediately after, officials slowed trains to 35 miles per hour on aboveground sections of track.

On Tuesday, Metro officials updated their criteria for determining when to slow trains because of excessive heat.

At a news conference Monday to tout recently approved safety standards that will increase federal oversight of Metro and other similar systems, Sen. Barbara A. Mikulski (D-Md.) took Metro officials to task, saying her offices had been flooded with e-mails, “some volcanic, some even more so.”

“We have to make sure that people do not lose their confidence in Metro,” she said.

Mikulski also called on officials at the U.S. Department of Transportation and the Federal Transit Administration to update the federal audit of the system they did three years ago.

“These are complicated systems,” said Mort Downey, chairman of the Metro board’s safety and security committee, which will likely be briefed on the outage at its next meeting. “It’s been a bad month, no question about that. But the fundamental premise is safety comes first.”

 