Pulling startling conclusions out of the Web’s wealth of raw information is a growing genre of reporting known as “data journalism.” It is refreshing to see how the Web can be a boon to individual reporters, even as it has been accused of eroding the profits of the news companies they work for.
Reporters’ access to vast amounts of raw data gives them an edge they’ve never had before. Google CEO Eric Schmidt estimates that the Internet contains about 5 million terabytes of data, a number that increases daily. A journalist with the right know-how can mine this data for information that would otherwise be unavailable, even to eyewitness reporters on the ground.
A recent data journalism panel at the Columbia Journalism School explored how technology has become a wonderful tool for investigative reporters in particular. Here are the takeaways from the panel.
It is increasingly important for journalists to be “data literate.”
Many newsrooms have a phobia of math and numbers, and that is no longer a viable way to operate. As the tools of journalism expand, so must the skills of the journalists who use them, says Aaron Pilhofer of the New York Times, who stressed the importance of what he calls “data literacy.” To make the most of the vast amount of information available, journalists must be able to process and read it in a way that yields understanding.
Sometimes you have to look for what isn’t there.
Jo Craven McGinty, a specialist in computer-assisted reporting at the New York Times, recounted an instance in which the best story turned out to be what the data didn’t show. When gathering data on cases of justifiable homicide by police officers, she noticed that records that should have existed had been removed. By investigating this “easter egg in the data,” as McGinty called it, she was able to put out a report that earned her a Pulitzer Prize.
In a similar vein, putting data to creative use can yield unexpected information. For instance, by looking at betting records kept by casinos, the New York Times was able to learn a great deal about the frequency of racehorse injuries for a recent report.
Data often conflicts with the perspective of journalists on the ground.
When collected data conflicts with the testimony of reporters on the ground, who is the more reliable source? The answer is tough to agree upon. A few panelists, such as Maurice Tamman, bemoaned the naiveté of certain reporters who push to have their investigative reports published, even in the face of data that conflicts with their eyewitness reports.
Ideally, the data and field reporting support each other. However, when the two do not line up, there doesn’t seem to be a consensus as to which is more reliable, as neither form of investigation can be considered completely accurate. While most panelists seemed to feel that data was the stronger source, Pilhofer expressed a complementary perspective.
As he sees it, “You have qualitative and quantitative data gathering that merge together to form the spine of the story and give you both sides of the story.”
An app is worth a thousand words.
Stories based on data journalism rely on large amounts of data. Sometimes the most effective approach is to sift through this information and then convey the findings in a written story. Often, however, a less typical format conveys an idea built on large amounts of information more effectively. In these cases a graphic, an interactive app or something else may tell the story better than an article.
Take, for example, a recent ProPublica report on the opportunity gap between primary education in low- and high-income areas. The app accompanying the report lets readers see how their own schools measure up to others in areas like access to advanced math courses and subsidized lunches, making for a much more personal report.
Human intelligence is still key.
The new availability of data tools is opening up worlds of opportunity for journalists. And while data journalism has certainly already proved itself a valuable journalistic tool, it’s unlikely to bring about immense change in the greater field of journalism.
After all, for every story that can be researched through calculations and numbers, countless more require the same on-the-ground fact-finding that journalists have always relied upon; no data set can displace this.