There’s nothing magical about this. Your computer/phone needs to know how to make this webpage look the way The Washington Post’s designers intended it to look, and the HTML file includes all of the information (and links to other files) needed to have that happen. By viewing that source code, you are doing the equivalent of looking under the hood of your car. All that stuff in there makes it work, even if you don’t know how. This is not some brilliant technological wizardry, nor are you hacking valuable information.
Unless, it seems, you live in the state of Missouri.
On Thursday, Gov. Michael Parson (R) called a news conference to warn his state’s citizens about a nefarious plot against a teachers’ database by a reporter from the St. Louis Post-Dispatch.
“Through a multistep process,” Parson said with great solemnity, “an individual took the records of at least three educators, decoded the HTML source code and viewed the Social Security number of those specific educators.”
Let’s stop here for a moment. The phrase “decoded the HTML source code” certainly sounds intimidating to a layperson. But HTML source code is only “encoded” when it’s traveling from a secure website to your browser. Your browser “decodes” the HTML automatically because that’s the point of a browser: to interpret HTML instructions. So when you went to “View source,” you were viewing “decoded HTML source code” that was in regular text on The Post’s website, encrypted to travel over the Internet and converted back to regular text when you received it.
“Though no private information was clearly visible nor searchable on any of the web pages,” the Post-Dispatch’s report stated, “the newspaper found that teachers’ Social Security numbers were contained in the HTML source code of the pages involved.”
In other words, it seems, a search tool for teacher credentials responded to searches by including a bunch of information, some of which was embedded in the source code of the page but not visible when just reading the page. If you used “View source” on this page, you noticed that there’s a ton of stuff included in the HTML that isn’t displayed, most of which is instructions for scripts and things like that.
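To make that distinction concrete, here is a minimal sketch of how a page can carry data that a reader never sees. Everything in it is an assumption made for illustration: the markup, the field names, and the obviously fake number are invented, and nothing here reflects how the state's site actually embedded its data, which the reporting does not spell out.

```python
from html.parser import HTMLParser

# Hypothetical page: the markup, field names, and the obviously fake number
# are invented for illustration and are not taken from the state's site.
PAGE = """<html><body>
<table><tr><td>Name</td><td>Jane Doe</td></tr>
<tr><td>Certificate</td><td>Active</td></tr></table>
<input type="hidden" name="ssn" value="000-00-0000">
</body></html>"""


class PageInspector(HTMLParser):
    """Collects the text a reader would see and the values of hidden inputs."""

    def __init__(self):
        super().__init__()
        self.visible_text = []
        self.hidden_fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Hidden inputs are sent to the browser but never drawn on screen.
        if tag == "input" and attributes.get("type") == "hidden":
            self.hidden_fields[attributes.get("name")] = attributes.get("value")

    def handle_data(self, data):
        # Text between tags is what the browser actually displays.
        if data.strip():
            self.visible_text.append(data.strip())


inspector = PageInspector()
inspector.feed(PAGE)
print("Displayed on the page:", inspector.visible_text)
print("Present only in the source:", inspector.hidden_fields)
```

The hidden value never shows up in the displayed text, yet it sits in the same file the server handed to every visitor. No special tool is involved beyond reading what was sent.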
Automated webpages often include information visible only to the browser to facilitate navigation by users. It seems likely that, in this case, the database dumped a bunch of information about the teachers into the search results, including a Social Security number. Not to be petty, but if you look at the archive of the site, you will see that this was not sophisticated coding — an Active Server Page (ASP) that uses tables for formatting. The idea that there was some sophisticated encoding within the HTML page that was then decrypted defies logic. Even if it were something more, that a reporter could decode it suggests that actual criminals could, too. (Efforts to contact Renaud and an expert he consulted were not successful by the time of publication.)
According to his report, Renaud ran three searches for teachers to see whether the bug was consistent and found that it was. The paper informed the state, which is standard procedure with such breaches because it allows the entity to fix the problem before the news becomes public. Instead, the state decided to target the Post-Dispatch.
Parson’s rhetoric was over the top to the point of near hilarity.
“Let me be clear,” he added a bit later, “this administration is standing up against any and all perpetrators who attempt to steal personal information and harm Missourians. It is unlawful to access encoded data and systems to examine other people’s personal information, and we are coordinating state resources to respond and utilize all legal methods available.”
Fixing the problem, he said, might cost the state up to $50 million. It’s not clear how, given that it would presumably involve little more than changing an ASP template to remove the embedded numbers. (The odds are good, to be fair, that this system has not been updated in a while.) But the governor kept going: The paper “had no authorization to convert or decode, so this was clearly a hack”; it was doing this to “sell headlines”; the data was “not freely available and had to be converted and decoded to be revealed”; and nothing on the website “gave permission or authorization for this individual to access teacher data.”
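For a sense of scale, the sort of template change presumed above might look something like the following sketch. It is written in Python rather than ASP, and the record, field names, and `render_result` function are all hypothetical; the state's actual code has not been published, so this shows only the general idea of writing nothing into the page except the fields meant for the public.

```python
from html import escape

# Hypothetical record and field names, invented for illustration.
RECORD = {"name": "Jane Doe", "certificate": "Active", "ssn": "000-00-0000"}

# Only these fields are ever written into the page.
PUBLIC_FIELDS = ("name", "certificate")


def render_result(record):
    """Build the HTML for one search result from public fields only."""
    rows = "".join(
        f"<tr><td>{escape(field)}</td><td>{escape(str(record[field]))}</td></tr>"
        for field in PUBLIC_FIELDS
    )
    return f"<table>{rows}</table>"


html_page = render_result(RECORD)
print(html_page)
```

Because the sensitive field is never copied into the output, there is nothing for anyone to find in the source, decoded or otherwise.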
If this is what it looks like, data included in an HTML file sent to users, then the lack of permission to access teacher data is beside the point. It’s as though I put a $5 bill on the sidewalk outside my house and then yelled at you for picking it up without permission. It’s also as though I then attacked you for having no authorization to walk on the sidewalk, given that the authority to use a sidewalk is as presumed as the authority to use a browser to access a public HTML file.
Again, there may be more to the story. The state may have uncovered some additional step that hasn’t been part of the coverage to this point. During his news conference, Parson offered no such evidence, nor did a news release from the state reported by the Post-Dispatch. This seems, instead, to be an example of someone who does not really understand technology but understands very well the downside of running a system that hands out information that could be used for identity theft.
Incidentally, going to the contact form at the state’s website allows you to pick a reason for your message, including that you’re reporting a “website issue.” Just know that, if you report any such issue, you run at least some risk of being labeled a hacker.