Big Brother and Big Data

Word now comes that Edward Snowden, the former NSA contractor who leaked intelligence-gathering secrets, employed a very low-tech web-crawler program to scour top-secret servers for classified documents. According to The New York Times, Snowden may have downloaded an estimated 1.7 million files using this program. We say “estimated” because intelligence analysts still don’t know the extent of Snowden’s file pilfering.

Web-crawlers are programs that can move from website to website and can, in effect, click on every link in a website, all the while reading and capturing the contents of every web page. It’s also possible to program the crawlers to search for certain key words and have pages containing those words downloaded.
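To show how little code such a program takes, here is a minimal sketch of a keyword-seeking crawler in Python. It is an illustration only, not the tool Snowden used. The names `crawl`, `LinkExtractor`, `fetch` and `SITE` are made up for this example, and a small in-memory set of pages stands in for a real server so the sketch runs without a network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, keywords, max_pages=100):
    """Breadth-first crawl from start_url; return {url: html} for keyword matches.

    fetch is a caller-supplied function (hypothetical) that maps a URL to
    its HTML text, or None if the page cannot be retrieved.
    """
    seen = {start_url}
    queue = deque([start_url])
    matches = {}
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited += 1
        if html is None:
            continue
        # "Download" the page only if it mentions one of the key words.
        if any(word.lower() in html.lower() for word in keywords):
            matches[url] = html
        # "Click" every link on the page by queueing unseen targets.
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return matches


# A tiny in-memory "site" stands in for a real server (assumed data).
SITE = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": 'Budget report <a href="/">home</a>',
    "http://example.test/b": 'Lunch menu <a href="/c">C</a>',
    "http://example.test/c": "Annual BUDGET summary",
}

found = crawl("http://example.test/", SITE.get, ["budget"])
print(sorted(found))
```

Pointed at a real server, `fetch` would be replaced by a function that retrieves each page over the network; the link-following and keyword-matching logic would stay the same.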

That such a breach could happen in what should be one of the most secure networks in the nation raises the question of who is minding the store at the National Security Agency. The breach is all the more appalling coming after the WikiLeaks incident, in which a low-ranking Army private accessed and downloaded classified documents using a similar program.

What makes matters worse is that investigators at the NSA noticed Snowden’s suspicious activity and even questioned him about it, but he apparently blew them off with some phony blather about the downloading being necessary for system backups.

Blame and appropriate punishment should be meted out to those at the NSA who believed such a ludicrous cover story. Disk backups are a routine data-center activity necessary to make sure that another set of data exists for recovery; no system administrator with any experience at all would use a web-crawler to perform the data backups. Using a web-crawler to back up files rather than using standard back-up tools that copy whole disks makes about as much sense as shoveling snow with a tablespoon.

Even if Snowden wasn’t doing any spying, he should have been dismissed for incompetence as a systems administrator. Not only does the NSA appear clueless in its security measures, but its investigators and auditors don’t know the rudiments of running an information technology shop.

Another nagging, unanswered question surrounding this lapse of security is this: Even if NSA auditors bought Snowden’s absurd alibi, how did he get the files off the government’s servers? Standard security features block use of an external drive for copying files. How did Snowden manage to bypass this security feature?

According to the article in the Times, part of the reason Snowden got away with what he did was that he worked in a facility that didn’t have upgraded security measures in its data center. In this day and age, when data centers can be easily centralized in a few secure locations (a strategy that corporations have been using for years), the existence of data centers lacking proper security controls represents total incompetence and a lack of concern for the nation’s most sensitive intelligence secrets.

Snowden’s leaks have made Americans nervous about what their government is collecting about them without their knowledge and what it’s doing with the information. But the latest revelations about the NSA should also make Americans anxious about who has access to their private data: not only whom they are calling and what they are accessing online, but also information that they willingly provide to the government. Who has access to their tax returns, Social Security information, and disability and medical records, the aggregation of information known as big data, which can be used to discriminate against job hunters, home buyers and health insurance applicants?

This question becomes all the more pertinent now that ObamaCare is collecting an enormous amount of health-care data on millions of Americans. Initially, the federal government gave the contract for the health-care website to a Canadian contracting company. When that website roll-out became a fiasco, Accenture, a large consulting and software contracting company, took over the contract. Accenture has a huge presence in India, where it’s outsourcing a lot of the work. Given the government’s sloppy efforts at protecting sensitive NSA data, Americans should be very uncomfortable knowing that their personal data may be sitting on servers in India, or accessible by software developers sitting in Bangalore.

Americans have to be concerned not only with Big Brother, but with who has access to their Big Data.