Data Visualization and the State of the Union

While I’m on the topic of games and IF with educational or persuasive value, I should mention (though I’m not sure how to place it relative to everything else) the State of the Union explorer. It allows the reader/player/experiencer to explore statistical information about the State of the Union addresses, discovering which words gain and lose prominence in political consciousness, and comparing any two specific years in overlay.

I am not quite sure what to call this: it’s neither a game nor exactly what I tend to think of when I think of new media artwork or digital literature. It is not fictional, narrative, or goal-oriented. It exists chiefly to uncover or reveal existing texts. On the other hand, it yields to exploration, offers aesthetic pleasure, and may encourage the user to draw certain conclusions that he might not otherwise have drawn.

For the most part, I don’t think the tool is designed to be persuasive as such. Perhaps an exception: by charting the calculated grade level of each speech from 1790 through the present, and showing a clear decline, it does suggest an argument about the declining sophistication of political discourse in the United States. While I agree that discourse is in a pretty bad state now, I’m not sure I find this form of argument quite as persuasive as it initially looks: sentence complexity (which is all that’s measured here) is not always a measure of complex thought. So I accept the implicit thesis in this presentation of information, but only partly. On the other hand, this graph is only a small portion of what the visualization software allows us to see. The bulk of the exercise is less polemical.
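To see how little a grade-level score actually looks at, here is a minimal sketch in Python. It assumes the Flesch-Kincaid grade formula, which may or may not be the one the explorer uses, and the syllable counter is a crude vowel-group heuristic invented for illustration. The function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level: depends only on sentence length and syllables per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# The same claims, once as three short sentences and once strung together.
short_version = "We cut taxes. We built roads. We kept peace."
long_version = "We cut taxes and we built roads and we kept peace."
print(fk_grade(short_version) < fk_grade(long_version))
```

The two sample texts say the same thing, yet the run-on version scores higher simply because it has more words per sentence. A formula of this kind registers length, not thought.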

Does it educate? I find it politically revealing, but chiefly about eras I already know something about. It is funny, in a sad way, to see the shift from the social-policy topics of the mid-90s towards the rhetoric of terror and warfare under the present administration; and in looking at the statistics for the years preceding, say, the world wars, one can see omens. In 1857, “slavery” floats to the top of the chart, with an ominous 32 occurrences. “Banks”, “depression”, and “unemployment” dominate in 1931; “housing”, “production”, “prosperity” in 1949. But there are other years which are opaque to me and reveal mostly how entirely we forget what must have seemed very deep concerns. Why, in 1924, was “nitrogen” so important? 1906 gives us “colored”, “judge”, “rape”, “mob”, “race”, “negro”, “lynching” — the significance horrible but unmistakeable — but what do we make of “pelagic”? On the other hand, how dull were things in 1827, that the most characteristic word of the State of the Union address was “conformably”? A timeline of US history running down the right side of the screen helps, but not enough. To understand any of this thoroughly would require actually reading the speeches in question, not just scanning their statistics for the most telling signs.

I find this tool alluring. I first ran into it a few months ago, but I have returned several times to play with it, and have shown it to others. The appeal lies not just in the smooth way the words move around the screen (though that’s part of it) or in the important-looking starkness of the color scheme (how different would this site be if executed sedately in blue and grey?). There’s also a sense of satisfaction in having so much information in so manipulable a form. We live with information overload all the time. This sort of tool says: don’t worry, you do not have to be overwhelmed; you can control and understand this huge wad of material; it’s at your disposal.

What I’m not sure of is how much it has to reveal. It demonstrates, certainly, that we can pulp a continuous text, something with structure and persuasive power, and reduce it down to its most-potent, most-relevant words. The fibrous structure, though — the arguments, the rhetorical devices, the flaws and inanities — all this is lost. And since one of the great failures of our current political discourse is the tendency to use single words rather than sentences — to deal in labels and packaged ideologies, rather than in articulate rebuttals and subtle distinctions — I wonder how much this tool is a symptom of the very decline it tries to demonstrate.
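The pulping can be made concrete with a small Python sketch. It is not the explorer's method, which is documented on its own site. It assumes a simple score: how often a word appears in one speech relative to its smoothed frequency in the other speeches. The function names and the toy "speeches" are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep only alphabetic words; order is discarded later."""
    return re.findall(r"[a-z]+", text.lower())


def characteristic_words(target: str, others: list[str], top: int = 3) -> list[str]:
    """Rank the words of `target` by frequency relative to the other texts (assumed scoring)."""
    target_counts = Counter(tokenize(target))
    background = Counter()
    for text in others:
        background.update(tokenize(text))
    target_total = sum(target_counts.values())
    background_total = sum(background.values())
    vocabulary = len(set(target_counts) | set(background))

    def score(word: str) -> float:
        p_target = target_counts[word] / target_total
        # Add-one smoothing so words absent from the background do not divide by zero.
        p_background = (background[word] + 1) / (background_total + vocabulary)
        return p_target / p_background

    # Ties are broken alphabetically so the result is deterministic.
    return sorted(target_counts, key=lambda w: (-score(w), w))[:top]


# Toy stand-ins for speeches, not real quotations.
speech = "slavery slavery slavery the union the territory"
scrambled = "territory the slavery union slavery the slavery"
rest = ["the union and the treasury", "the banks and the union"]
print(characteristic_words(speech, rest, top=2))
print(characteristic_words(speech, rest) == characteristic_words(scrambled, rest))
```

The second comparison is the telling one: a speech and a scrambled copy of it produce exactly the same ranking, because only the counts survive.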

I’m not entirely knocking it. I think it’s a neat thing, cleverly made, enjoyable to use, and in many ways instructive. Moreover — in its defense — it lays out a rudimentary historical context, and also provides the full text of every speech for the user to peruse if he wants to find something out. (“pelagic”, it turns out, was at issue in 1906 because of the environmentally destructive practice of hunting seals at sea.) It offers a detailed account of its own methodologies, and the explanatory essay addresses many of the points I’ve raised, even agreeing that this statistical method is an incomplete and shallow way of addressing a body of text.

There’s one thing the essay doesn’t talk about, though. Namely, in my observation the SOTU tool chiefly reinforces conclusions the user was already inclined to draw. It offers the reader/user great interpretive freedom: this may be a good thing in a tool (and I find I keep calling it a tool, rather than a game, artwork, text, fiction, etc.), except that, by shedding context and structure, it offers more interpretive freedom than the texts on which it is based. When confronted with symbols juxtaposed semi-randomly (or apparently semi-randomly), the human reaction is to ignore confusing data and to draw the connections that appeal. This kind of tool encourages us to play at linguistic tasseomancy, finding whatever we want to find in the dregs of speech.

