A few weeks ago I wrote about studying digital culture through the lens of specific file types. In the fall I’m teaching DIG 101 (Introduction to Digital Studies)—an amorphous course that is part new media studies, part digital humanities, part science and technology studies. I was imagining spending a week on, say, something like GIFs as way to understand Internet culture. My question is, what other file types could be similarly productive to explore?
That short post generated great ideas in the comments, on Facebook, and on Twitter. To make things easier to find again (for me and others), here are just some of the file type ideas that bubbled to the surface:
As commentator Sam Popowich put it, “love it or hate it” PDFs are everywhere. Ryan Cordell pointed out that Lisa Gitelman has a chapter devoted to PDFs in Paper Knowledge. Gitelman is exactly the kind of scholar I want undergraduates to read. Clear, perceptive, uncovering seemingly archaic history and showing why it matters.
Quite a few people suggested WADs, composite files made up sounds, sprites, graphics, level information, and other digital assets for PC games. Doom popularized WADs, but PC games continue to use similar composite files. You can use tools like GCFScape to unpack these files, and they lend themselves to digital forensic lab work in the classroom. Every time I teach Gone Home, for example, students explore unpacked sound and graphic files. It’s an alternative way of experiencing the game. My own research digging to WADS to find misogynistic game developer comments could come into play here too.
At first I thought studying JPGs would be redundant if GIFs are already on the table. Allison Parrish and Jeff Thompson make a strong case for JPGs though: they organize information differently, compress differently, and of course, are glitchable. Like PDFs, their very ubiquity renders them invisible as file types, especially to students who have grown up carrying a camera with them at all times.
Vika Zafrin and Tim Owens recommended EXIF, one of the few file types I hadn’t already considered as a possibility. Technically I guess EXIF is a metadata standard, not a file type per se, but the relationship between metadata and data is crucial to understand, and EXIF can get us there. Plus, we can talk about privacy, tracking, and my colleague Owen Mundy’s fantastic I Know Where Your Cat Lives project.
Before the rise of HTML5, YouTube videos were Flash files (FLV = Flash Video), and there were (and are) tricks to downloading these videos to watch offline. But it was a format you weren’t supposed to encounter; YouTube strove to make streaming seamless, hiding the actual video file. I would love to spend some time in DIG 101 studying all of these stigmatized file types, not so much to understand the technical features of the file formats themselves, but to better understand the cultural rules that influence the circulation of knowledge.
The Big Picture
The above list is certainly incomplete. And leaves off the file types that originally inspired this idea (MP3s, GIFs, HTML, and JSON). But it’s a great start. It’s also important to zoom out and see the big picture. To this end, Amelia Acker pointed me toward this surprisingly philosophical technical report from Microsoft Research: “What is a File?”
Indeed, what is a file and what do they mean is something we’ll be asking in DIG 101.
I am revamping “Introduction to Digital Studies,” my program’s overview of digital culture, creativity, and methodology. One approach is to partially organize the class around file types, the idea being that a close reading of certain file types can help us better understand contemporary culture, both online and off.
It’s a bit like Raymond William’s Keywords, except with file types. A few of the file types that seem especially generative to consider:
MP3 (Jonathan Sterne’s work on MP3s is the gold standard to follow)
GIF (especially the rise and fall and rise of the animated GIF)
HTML (a gateway to understanding the early history and ethos of the web)
JSON (as a way to talk about data and APIs)
This list is just an initial start, of course. What other culturally significant file types would you have students consider? And what undergrad-friendly readings about those file types would you recommend?
The Digital Studies program at Davidson College is growing! We now offer an interdisciplinary minor and, through our Center for Interdisciplinary Studies (CIS), an interdisciplinary major. Last year Digital Studies and the History Department partnered on a tenure-track search—leading to Dr. Jakub Kabala joining Davidson as a digital medievalist with a background in computational philology and digital spatial analysis.
I’m delighted to announce that Digital Studies is collaborating once again on a tenure line search, this time with the Art Department. Along with Jakub and myself, this position will form the core of the Digital Studies faculty. My vision for Digital Studies has always emphasized three areas: (1) the history, practice, and critique of digital methodologies; (2) the study of cultural texts, media, and practices made possible by modern technology; and (3) the design and creation of digital art and new media, which includes robotics, interactive installations, and physical computing. Roughly speaking, I think of these three areas in terms of methodology, culture, and creativity. This latest tenure track search addresses the last area, though of course the areas blur into each other in very interesting ways.
Here is the official search ad for the digital artist position. Please share widely!
Davidson College invites applications for a tenure-track Assistant Professor of Art and Digital Studies, with a specialization in interactive installation, transmedia art, robotics, data art, physical computing, or a similar creative field. Artists must demonstrate a distinguished record of creative work and a commitment to undergraduate education. Preference will be given to artists with a broad understanding of contemporary trends in Digital and New Media Art, including its history, theory, and practice. MFA by August 1, 2016 is required.
This tenure-track position is shared between the Art Department and Digital Studies Program. Art and Digital Studies at Davidson explore the contemporary technologies that shape daily life, focusing on critical making and digital culture. The successful applicant will teach in both Studio Art and Digital Studies. The candidate’s letter of application should highlight experiences that speak to both roles. The teaching load is 5 courses per year (reduced to 4 courses the first year). Classes include introductory and advanced digital art studio courses, as well as classes that focus on digital theory and practice.
Apply online at http://jobs.davidson.edu/. A complete application includes a letter of application, CV, artist’s statement, teaching philosophy, and a list of three or more references. In addition, submit links for up to 20 still images or up to 7 minutes of video in lieu of a portfolio. The application deadline is December 1, 2015. Do not send letters of reference until requested.
Davidson is strongly committed to achieving excellence and cultural diversity and welcomes applications from women, members of minority groups, and others who would bring additional dimensions to the college’s mission. Consistently ranked among the nation’s top liberal arts colleges, Davidson College is a highly selective, independent liberal arts college located in Davidson, North Carolina, close to the city of Charlotte. Davidson faculty enjoy a low student-faculty ratio, emphasis on and appreciation of excellence in teaching, and a collegial, respectful atmosphere that honors academic achievement and integrity.
Here (finally) is the talk I gave at the 2012 MLA Convention in Seattle. I was on Lori Emerson’s Reading Writing Interfaces: E-Literature’s Past and Present panel, along with Dene Grigar, Stephanie Strickland, and Marjorie Luesebrink. Lori’s talk on e-lit’s stand against the interface-free aesthetic worked particularly well with my own talk, which focused on Erik Loyer’s Strange Rain. I don’t offer a reading of Strange Rain so much as I use the piece as an entry point to think about interfaces—and my larger goal of reframing our concept of interfaces.
Today I want to talk about Strange Rain, an experiment in digital storytelling by the new media artist Erik Loyer.
Strange Rain came out in 2010 and runs on Apple iOS devices—the iPhone, iPod Touch, and iPad. As Loyer describes the work, Strange Rain turns your iPad into a “skylight on a rainy day.” You can play Strange Rain in four different modes. In the wordless mode, dark storm clouds shroud the screen, and the player can touch and tap its surface, causing columns of rain to pitter patter down upon the player’s first-person perspective. The raindrops appear to splatter on the screen, streaking it for a moment, and then slowly fade away. Each tap also plays a note or two of a bell-like celesta.
The other modes build upon this core mechanic. In the “whispers” mode, each tap causes words as well as raindrops to fall from the sky.
The “story” mode is the heart of Strange Rain. Here the player triggers the thoughts of Alphonse, a man standing in the rain, pondering a family tragedy.
And finally, with the most recent update of the app, there’s a fourth mode, the“feeds” mode. This allows players to replace the text of the story with tweets from a Twitter search, say the #MLA12 hashtag.
Note that any authorial information—Twitter user name, time or date—is stripped from the tweet when it appears,as if the tweet were the player’s own thoughts, making the feed mode more intimate than you might expect.
Like many of the best works of electronic literature, there are a number of ways to talk about Strange Rain, a number of ways to frame it. Especially in the wordless mode, Strange Rain fits alongside the growing genre of meditation apps for mobile devices, apps meant to calm the mind and sooth the spirit—like Pocket Pond:
In Pocket Pond, every touch of the screen creates a rippling effect.
The digital equivalent of a miniature zen garden, these apps allow us to contemplate minimalistic nature scenes on devices built by women workers in a FoxConn factory in Chengdu, China.
It’s appropriate that it’s the “wordless mode” that provides the seemingly most unmediated or direct experience of Strange Rain, when those workers who built the device upon which it runs are all but silent or silenced.
The “whispers” mode, meanwhile, with its words falling from the sky, recalls the trope in new media of falling letters—words that descend on the screen or even in large-scale multimedia installation pieces such as Camille Utterback and Romy Achituv’s Text Rain (1999).
And of course, the story mode even more directly situates Strange Rain as a work of electronic literature, allowing the reader to tap through “Convertible,” a short story by Loyer, which, not coincidentally I think, involvesa car crash, another long-standing trope of electronic literature.
As early as 1994, in fact, Stuart Moulthrop asked the question, “Why are there so many car wrecks in hypertext fiction?” (Moulthrop, “Crash” 5). Moulthrop speculated that it’s because hypertext and car crashes share the same kind of “hyperkinetic hurtle” and “disintegrating sensory whirl” (8). Perhaps Moulthrop’s characterization of hypertext held up in 1994…
…(though I’m not sure it did), but certainly today there are many more metaphors one can use to describe electronic literature than a car crash. And in fact I’d suggest that Strange Rain is intentionally playing with the car crash metaphor and even overturning it with its slow, meditative pace.
At the same time as this reflective component of Strange Rain, there are elements that make the work very much a game, featuring what any player of modern console or PC games would find familiar: achievements, unlocked by triggering particular moments in the story. Strange Rain even shows up on the iOS’s “Game Center.”
The way users can tap through Alphonse’s thoughts in Strange Rain recalls one of Moulthrop’s own works, the post-9/11 Pax, which Moulthrop calls, using a term from John Cayley, a “textual instrument”—as if the piece were a musical instrument that produces text rather than music.
We could think of Strange Rain as a textual instrument, then, or to use Noah Wardrip-Fruin’s reformulation of Cayley’s idea, as “playable media.” Wardrip-Fruin suggests that thinking of electronic literature in terms of playable media replaces a rather uninteresting question—“Is this a game?”—with a more productive inquiry, “How is this played?”
There’s plenty to say about all of these framing elements of Strange Rain—as an artwork, a story, a game, an instrument—but I want to follow Wardrip-Fuin’s advice and think about the question, how is Strange Rain played? More specifically, what is its interface? What happens when we think about Strange Rain in terms of the poetics of motion and touch?
Let me show you a quick video of Erik Loyer demonstrating the interface of Strange Rain, because there are a few characteristics of the piece that are lost in my description of it.
A key element that I hope you can see from this video is that the dominant visual element of Strange Rain—the background photograph—is never entirely visible on the screen. The photograph was taken during a tornado watch in Paulding County in northwest Ohio in 2007 and posted as a Creative Commons image on the photo-sharing site Flickr. But we never see this entire image at once on the iPad or iPhone screen. The boundaries of the photograph exceed the dimensions of the screen, and Strange Rain uses the hardware accelerometer to detect your motion, your movements. So that when you tilt the iPad even slightly, the image tilts slightly in the opposite direction. It’s as if there’s a larger world inside the screen, or rather, behind the screen. And this world is broader and deeper than what’s seen on the surface. Loyer described it to me this way: it’s “like augmented reality, but without the annoying distraction of trying to actually see the real world through the display” (Loyer 1).
This kinetic screen is one of the most compelling features of Strange Rain. As soon as you pick up the iPad or iPhone with Strange Rain running, it reacts to you. The work interacts with you before you even realize you’re interacting with it. Strange Rain taps into a kind of “camcorder subjectivity”—the entirely naturalized practice we now have of viewing the world through devices that have cameras on one end and screens on the other. Think about older videocameras, which you held up to your eye, and you saw the world straight through the camera. And then think of Flip cams or smartphone cameras we hold out in front of us. We looked through older videocameras as we filmed. We look at smartphone cameras as we film.
So when we pick up Strange Rain we have already been trained to accept this camcorder model, but we’re momentarily taken aback, I think, to discover that it doesn’t work quite the way we think it should. That is, it’s as if we are shooting a handheld camcorder onto a scene we cannot really control.
This aspect of the interface plays out in interesting ways. Loyer has an illustrative story about the first public test of Strange Rain. As people began to play the piece, many of them held it up over their heads so that “it looked like the rain was falling on them from above—many people thought that was the intended way to play the piece” (Loyer 1).
That is, people wanted it to work like a camcorder, and when it didn’t, they themselves tried to match their exterior actions to the interior environment of the piece.
There’s more to say about the poetics of motion with Strange Rain but I want to move on to the idea of touch. We’ve seen how touch propels the narrative of Strange Rain. Originally Loyer had planned on having each tap generate a single word, though he found that to be too tedious, requiring too many taps to telegraph a single thought (Loyer 1). It was, oddly enough in a work of playable media that was meant to be intimate and contemplative, too slow. Or rather, it required too much action—too much tapping—on the part of reader. So much tapping destroyed the slow, recursive feeling of the piece. It becomes frantic instead of serene.
Loyer tweaked the mechanic then, making each tap produce a distinct thought. Nonetheless, from my own experience and from watching other readers, I know that there’s an urge to tap quickly. In the story mode of Strange Rain you sometimes get caught in narrative loops—which again is Loyer playing with the idea of recursivity found in early hypertext fiction rather than merely reproducing it. Given the repetitive nature of Strange Rain, I’ve seen people want to fight against the system and tap fast. You see the same thought five times in a row, and you start tapping faster, even drumming using multiple fingers. And the piece paradoxically encourages this, as the only way to bring about a conclusion is to provoke an intense moment of anxiety for Alphonse, which you do by tapping more frantically.
I’m fascinated by with this tension between slow tapping and fast tapping—what I call haptic density—because it reveals the outer edges of the interface of the system. Quite literally.
Move from three fingers to four—easy to do when you want to bring Alphonse to a crisis moment—and the iPad translates your gestures differently. Four fingers tells the iPad you want to swipe to another application, the Windows equivalent of ALT-TAB. The multi-touch interface of the iPad trumps the touch interface of Strange Rain. There’s a slipperiness of the screen. The text is precipitously and perilously fragile and inadvertently escapable. The immersive nature of new media that years ago Janet Murray highlighted as an essential element of the form is entirely an illusion.
I want to conclude then by asking a question: what happen when we begin to think differently about interfaces? We usually think of an interface as a shared contact point between two distinct objects. The focus is on what is common. But what if we begin thinking—and I think Strange Rain encourages this—what if we begin thinking about interfaces in terms of difference. Instead of interfaces, what about thresholds, liminal spaces between two distinct elements. How does Strange Rain or any piece of digital expressive culture have both an interface, and a threshold, or thresholds? What are the edges of the work? And what do we discover when we transgress them?
Each of these works offers a meditation upon the act of reading or writing, the power of stories, the role of storytellers, and the materiality of books themselves as physical objects. In addition to these printed and (mostly) bound texts, my English Honors Seminar students will encounter a range of other unconventional narrative forms, from Jonathan Blow’s Braid to Kate Pullinger’s Inanimate Alice, from Christopher Strachey’s machine-generated love letters to Robert Coover’s deck of storytelling playing cards. Along the way we’ll also consider mash-ups, databased stories, role-playing games, interactive fiction, and a host of other narrative forms. We’ll also (a heads-up to my students) create some of our own post-print beasties…
• Italo Calvino, If on a Winter’s Night a Traveler (Harvest Books, ISBN 0156439611)
• Don DeLillo, Mao II (Penguin, ISBN 978-0140152746)
• Mark Z. Danielewski, House of Leaves (Pantheon, ISBN 978-0375703768)
• Salvador Plascencia, The People of Paper (Mariner, ISBN 978-0156032117)
• Anne Carson, Nox (New Directions, ISBN 978-0811218702)
[This is the text, more or less, of the talk I delivered at the 2011 biennial meeting of the Society for Textual Scholarship, which took place March 16-18 at Penn State University. I originally planned on talking about the role of metadata in two digital media projects—a topic that would have fit nicely with STS’s official mandate of investigating print and digital textual culture. But at the last minute (i.e. the night before), I changed the focus of my talk, turning it into a thinly-veiled call for digital textual scholarship (primarily the creation of digital editions of print works) to rethink everything it does. (Okay, that’s an exaggeration. But I do argue that there’s a lot the creators of digital editions of texts should learn from born-digital creative projects.)
Also, it was the day after St. Patrick’s Day. And the fire alarm went off several times during my talk.
None of these events are related.]
The Poetics of Metadata and the Potential of Paradata
in We Feel Fine and The Whale Hunt
I once made fun of the tendency of academics to begin their papers by apologizing in advance for the very same papers they were about to begin. I’m not exactly going to apologize for this paper. But I do want to begin by saying that this is not the paper I came to give. I had that paper, it was written, and it was a good paper. It was the kind of paper I wouldn’t have to apologize for.
But, last night, I trashed it.
I trashed that paper. Call it the Danny Boy effect, I don’t know. But it wasn’t the paper I felt I needed to deliver, here, today.
Throughout the past two days I’ve detected a low level background hum in the conference rooms, a kind of anxiety about digital texts and how we interact with them. And I wanted to acknowledge that anxiety, and perhaps even gesture toward a way forward in my paper. So, I rewrote it. Last night, in my hotel room. And, well, it’s not exactly finished. So I want to apologize in advance, not for what I say in the paper, but for all the things I don’t say.
My original talk had positioned two online works by the new media artist Jonathan Harris as two complementary expressions of metadata. I had a nice title for that paper. I even coined a new word in my title.
But this title doesn’t work anymore.
I have a new title. It’s a bit more ambitious.
But at least I’ve still got that word I coined.
It’s a lovely word. And truth be told, just between you and me, I didn’t coin it. In the social sciences, paradata refers to data about the data collection process itself—say the date or time of a survey, or other information about how a survey was conducted. But there are other senses of the prefix “para” I’m trying to evoke. In textual studies, of course, para-, as in paratext, is what Genette calls the threshold of the text. I’m guessing I don’t have to say anything more about paratext to this audience.
But there’s a third notion of “para” that I want to play with. It comes from the idea of paracinema, which Jeffrey Sconce first described in 1996. Paracinema is a kind of “reading protocol” that valorizes what most audiences would otherwise consider to be cinematic trash. The paracinematic aesthetic redeems films that are so bad that they actually become worth watching—worth enjoying—and it does so in a confrontational way that seeks to establish a counter-cinema.
Following Sconce’s work, the videogame theorist Jesper Juul has wondered if there can be such a thing as paragames—illogical, improbable, and unreasonably bad games. Such games, Juul suggests, might teach us about our tastes and playing habits, and what the limits of those tastes are. And even more, such paragames might actually revel in their badness, becoming fun to play in the process.
Trying to tap into these three different senses of “para,” I’ve been thinking about paradata. And I’ve got to tell you, so far, it’s a mess. (And this part of my paper was actually a mess in the original version of my paper as well). My concept of paradata is a big mess and it may not mean anything at all.
This is what I have so far: paradata is metadata at a threshold, or paraphrasing Genette, data that exists in a zone between metadata and not metadata. At the same time, in many cases it’s data that’s so flawed, so imperfect that it actually tells us more than compliant, well-structured metadata does.
So let me turn now to We Feel Fine, a massive, ongoing digital databased storytelling project rich with metadata—and possibly, paradata.
We Feel Fine is an astonishing collection of tens of thousands of sentences extracted from tens of thousands of blog posts, all containing the phrase “I feel” or “I am feeling.” It was designed by new media artist Jonathan Harris and the computer scientist Sep Kamvar and launched in May 2006.
The project is essentially an automated script that visits thousands of blogs every minute, and whenever the script detects the words “I feel” or “I am feeling,” it captures that sentence and sends it to a database. As of early this year, the project has harvested 14 million expressions of emotions from 2.5 million people. And the site has done this at a rate of 10,000 to 15,000 “feelings” a day.
Let me repeat that: every day approximately 10,000 new entries are added to We Feel Fine.
The heart of the project appears to be the multifaceted interface that has six so-called “movements”—six ways of visualizing the data collected by We Feel Fine’s crawler.
The default movement is Madness, a swarm of fifteen-hundred colored circles and squares, each one representing a single sentence from a blog post, a single “feeling.” The circles contain text only, while the squares include images associated with the respective blog post.
The colors of the particles signify emotional valence, with shades of yellow representing more positive emotions, red signaling anger. Blue is associated with sad feelings, and so on. This graphic, by the way, comes from the book version of We Feel Fine.
The book came out in 2009. In it, Harris and Kamvar curate hundreds of the most compelling additions to We Feel Fine, as well as analyze the millions of blog posts they’ve collected with with extensive data visualizations—graphs, and charts, and diagrams.
The book is an amazing project in and of itself and deserves its own separate talk. It raises important questions about archives, authorship, editorial practices, the material differences between a dynamic online project and a static printed work, and so on. I’ll leaves aside these questions right now; instead, I want to turn to the site itself. Let’s look at the Madness movement in action.
(And here I went online and interacted with the site. Why don’t you do that too, and come back later?)
(Also, right about here a fire alarm went off. Which, semantically, makes no sense. The alarm turned on, but I said it went off.)
(I can’t reproduce the sound of that particular fire alarm going off. I bet you have some sort of alarm on your phone or something you could make go off, right?)
(No? You don’t? Or you’re just as confused about on and off as I am? Then enjoy this short video intermission, which interrupts my talk, which I’m writing and which you’re reading, about as intrusively as the alarms interrupted my panel.)
(Okay. Back to my talk, which I’m writing, and which you’re reading.)
In the Madness movement you can click on any single circle, and the “feeling” will appear at the top of the screen. Another click on that feeling will drill down to the original blog post in its original context. So what’s important here is that a single click transitions from the general to the particular, from the crowd to the individual. You can also click on the squares to show “feelings” that have an image associated with them. And you have the option to “save” these images, which sends them to a gallery, just about the only way you can be sure to ever find any given image in We Feel Fine again.
At the top of the screen are are six filters you can use to narrow down what appears in the Madness movement. Working right to left, you can search by date, by location, the weather at that location at the time of the original blog post, the age of the blogger, the gender of the blogger, and finally, the feeling itself that is named in the blog post. While every item in the We Feel Fine database will have the feeling and date information attached to it, the age, gender, location, and weather fields are populated only for those items in which that information is publicly available—say a LiveJournal or Blogger profile that lists that information, or a Flickr photo that’s been geotagged.
What I want to call your attention to before I run through the other five movements of We Feel Fine is that these filters depend upon metadata. By metadata, I mean the descriptive information the database associates with the original blog post. This metadata not only makes We Feel Fine browsable, it makes it possible. The metadata is the data. The story—if there is one to be found in We Feel Fine—emerges only through the metadata.
You can manipulate the other five movements using these filters. At first, for example, the Murmurs movement displays a reverse chronological streaming, like movie credits, of the most recent emotions. The text appears letter-by-letter, as if it were being typed. This visual trick heightens the voyeuristic sensibility of We Feel Fine and makes it seem less like a database and more like a narrative, or even more to the point, like a confessional.
The Montage movement, meanwhile, organizes the emotions into browsable photo galleries:
By clicking on a photo and selecting save, you can add photos to a permanent “gallery.” Because the database grows so incredibly fast, this is the only way to ensure that you’ll be able to find any given photograph again in the future. There’s a strong ethos of ephemerality in We Feel Fine. To use one of Marie-Laure Ryan’s metaphors for a certain kind of new media, We Feel Fine is a kaleidoscope, an assemblage of fragments always in motion, never the same reading or viewing experience twice. We have little control over the experience. It’s only through manipulating the filters that we can hope to bring even a little coherency to what we read.
The next of the five movements is the Mobs movement. Mobs provides five separate data visualization of the most recent fifteen-hundred feelings. One of the most interesting aspects of the Mobs movement is that it highlights those moments when the filters don’t work, or at least not very well, because of missing metadata.
For instance, clicking the Age visualizations tells us that 1,223 (of the most recent 1,500) feelings have no age information attached to them. Similarly, the Location visualization draws attention to the large number of blog posts that lack any metadata regarding their location.
Unlike many other massive datamining projects, say, Google’s Ngram Viewer, We Feel Fine turns its missing metadata into a new source of information. In a kind of playful return of the repressed, the missing metadata is colorfully highlighted—it becomes paradata. The null set finds representation in We Feel Fine.
The Metrics movement is the fourth movement. And it shows what Kamvar and Harris call the “most salient” feelings, by which they mean “the ways in which a given population differs from the global average.”
Right now, for example, we see that “Crazy” is trending 3.8 times more than normal, while people are feeling “alive” 3.1 times more than usual. (Good for them!). Here again we see an ability to map the local against the global. It addresses what I see as one of the problems of large-scale data visualization projects, like the ones that Lev Manovich calls “cultural analytics.”
Ngram and the like are not forms of distant reading. There’s distant reading, and then there’s simply distance, which is all they offer. We Feel Fine mediates that distance, both visually, and practically.
(And here I was going to also say the following, but I was already in hot water at the conference for my provocations, so I didn’t say it, but I’ll write it here: Cultural analytics echo a totalitarian impulse for precise vision and control over broad swaths of populations.)
And finally, the Mounds movement, which simply shows big piles of emotion, beginning with whatever feeling is the most common at the moment, and moving on down the line towards less common emotions. The Mounds movement is at once the least useful visualization but also the most playful, with its globs that jiggle as you move your cursor over them.
(Obviously you can’t see it above, in the static image but…) The mounds convey what game designers call “juiciness.” As Jesper Juul characterizes juiciness, it’s “excessive positive feedback in response to the player’s actions.” Or, as one game designer puts it, a juicy game “will bounce and wiggle and squirt…it feels alive and responds to everything that you do.”
Harris’s work abounds with juicy, playful elements, and they’re not just eye candy. They are part of the interface, part of the design, and they make We Feel Fine welcoming, inviting. You want to spend time with it. Those aren’t characteristics you’d normally associate with a database. And make no mistake about it. We Feel Fine is a database. All of these movements are simply its frontend—a GUI Java applet written in Processing that obscures a very deliberate and structured data flow.
The true heart of We Feel Fine is not the responsive interface, but the 26,000 lines of code running on 5 different servers, and the MySQL database that stores the 10,000 new feelings collected each and every day. In their book, Kamvar and Harris provide an overview of the dozen or so main components that make up We Feel Fine’s backend.
It begins with a URL server that maintains the list of URLs to be crawled and the crawler itself, which runs on a single dedicated server.
Pages retrieved by the crawler are sent to the “Feeling Indexer,” which locates the words “feel” or “feeling” in the blog post. The adjective following “feel” or “feeling” is matched against the “emotional lexicon”—a list of 2,178 feelings that are indexed by We Feel Fine. If the emotion is not in the lexicon, it won’t be saved. That emotion is dead to We Feel Fine. But if the emotion does match the index, the script extracts the sentence with that feeling and any other information available (this is where the gender, location, and date data are parsed).
Next there’s the actual MySQL database, which stores the following fields for each data item: the extracted sentence, the feeling, the date, time, post URL, weather, and gender, age, and location information.
Then there’s an open API server and several other client applications. And finally, we reach the front end.
Now, why have I just taken this detour into the backend of We Feel Fine?
Because, if we pay attention to the hardware and software of We Feel Fine, we’ll notice important details that might otherwise escape us. For example, I don’t know if you noticed from the examples I showed earlier, but all of the sentences in We Feel Fine are stripped of their formatting. This is because the Perl code in the backend converts all of the text to lowercase, removes any HTML tags, and eliminates any non-alphanumeric characters:
The algorithm tampers with the data. The code mediates the raw information. In doing so, We Feel Fine makes both an editorial and aesthetic statement.
In fact, once we understand some of the procedural logic of We Feel Fine, we can discover all sorts of ways that the database proves itself to be unreliable.
I’ve already mentioned that if you express a feeling that is not among the 2,178 emotions tabulated, then your feeling doesn’t count. But there’s also the tricky language misdirection the algorithm pulls off, in which the same “feeling” is interpreted by the machine to be the same, no matter how it is used in the sentence. In this way, the machine exhibits the same kind of “naïve empiricism” (using Johanna Drucker’s dismissive phrase) that some humanists do interpreting quantitative data.
And finally, consider many of the images in the Montage movement. When there are multiple images on a blog page, the crawler only grabs the biggest one—and not biggest in dimensions, but biggest in file size, because that’s easier for the algorithm to detect—and this image often ends up being the header image for the blog, rather than connected to the actual feeling itself, as in this example.
The star pattern happens to be a sidebar image, rather than anything associated with the actual blog post that states the feeling:
So We Feel Fine forces associations. In experimental poetry or electronic literature communities, these kinds of random associations are celebrated. The procedural creation of art, literature, or music has a long tradition.
But in a database that seeks to be a representative “almanac of human emotions”? We’re in new territory there.
But in fact, it is representative, in the sense that human emotions are fungible, ephemeral, disjunctive, and, let’s face, sometimes random.
Let me bring this full circle, by returning to the revised title of my talk. I mentioned at the beginning that I felt this low-grade but pervasive concern about digital work these past few days at STS. I’ve heard questions like Are we doing everything we can to make digital editions accessible, legible, readable, and teachable? Where are we failing, some people have wondered. Why are we failing? Or at least, Why have we not yet reached the level of success that many of the very same people at this conference were predicting ten or fifteen or, dare I say it, twenty years ago?
Maybe because we’re doing it wrong.
I want to propose that we can learn a lot from We Feel Fine as we exit out the far end of what some media scholars have called the Gutenberg Parenthesis.
What can we learn from We Feel Fine?
Imagine if textual scholars built their digital editions and archives using these four principles.
Think about We Feel Fine and what makes work. Most importantly, We Feel Fine is a compelling reading experience. It’s not daunting. There’s a playful balance between interactivity and narrative coherence.
Secondly, and this goes back to my idea of paradata. Harris and Kamvar are not afraid to corrupt the source data, or to create metadata that blurs the line between metadata and not-metadata. They are not afraid to play with their sources, and for the most part, they are up front about how they’re playing with them.
This relates to the third feature of We Feel Fine that we should learn from. It’s open. Some of the source code is available. The list of emotions is available. There’s an open API, which anyone can use to build their own application on top of We Feel Fine, or more generally extract data from.
And finally, it’s juicy. I admit, this is probably not a term many textual scholars use in their research, but it’s essential for the success of We Feel Fine. The text responds to you. It’s alive in your hands, and I don’t think there’s much more we could ever ask from a text.
Drucker, Johanna. 2010. “Humanistic Approaches to the Graphical Expression of Interpretation” presented at the Hyperstudio: Digital Humanities at MIT, May 20, Cambridge, MA. http://mitworld.mit.edu/video/796.
Genette, Gerard. 1997. Paratexts: Thresholds of Interpretation. Cambridge: Cambridge University Press.
[This is the text of my second talk at the 2011 MLA convention in Los Angeles, for a panel on “Close Reading the Digital.” My talk was accompanied by a Prezi “Zooming” presentation, which I have replicated here with still images (the original slideshow is at the end of this post). In 15 minutes I could only gesture toward some of the broader historical and cultural meanings that resonate outward from code—but I am pursuing this project further and I welcome your thoughts and questions.]
New media critics such as Nick Montfort and Matthew Kirschenbaum have observed that a certain “screen essentialism” pervades new media studies, in which the “digital event on the screen,” as Kirschenbaum puts it (Kirschenbaum 4), becomes the sole object of study at the expense of the underlying computer code, the hardware, the storage devices, and even the non-digital inputs and outputs that make the digital object possible in the first place. There are a number of ways to remedy this essentialism, and the approach that I want to focus on today is the close reading of code.
Frederich Kittler has said that code is the only language that does what it says. But the close reading of code insists that code not only does what it says, it says things it does not do. Like any language, code operates on a literal plane—literal to the machine, that is—but it also operates on an evocative plane, rife with gaps, idiosyncrasies, and suggestive traces of its context. And the more the language emphasizes human legibility (for example, a high-level language like BASIC or Inform 7), the greater the chance that there’s some slippage in the code that is readable by the machine one way and readable by scholars and critics in another.
Today I want to close read some snippets of code from Micropolis, the open-source version of SimCity that was included on the Linux-based XO computers in the One Laptop per Child program.
Designed by the legendary Will Wright, SimCity was released by Maxis in 1989 on the Commodore 64, and it was the first of many popular Sim games, such as SimAnt and SimFarm, not to mention the enduring SimCity series of games—that were ported to dozens of platforms, from DOS to the iPad. Electronic Arts owns the rights to the SimCity brand, and in 2008, EA released the source code of the original game into the wild under a GPL License—a General Public License. EA prohibited any resulting branch of the game from using the SimCity name, so the developers, led by Don Hopkins, called it Micropolis, which was in fact Wright’s original name for his city simulation.
From the beginning, SimCity was criticized for presenting a naive vision of urban planning, if not an altogether egregious one. I don’t need to rehearse all those critiques here, but they boil down to what Ian Bogost calls the procedural rhetoric of the game. By procedural rhetoric, Bogost simply means the implicit or explicit argument a computer model makes. Rather than using words like a book, or images like a film, a game “makes a claim about how something works by modeling its processes” (Bogost, “The Proceduralist Style“).
In the case of SimCity, I want to explore a particularly rich site of embedded procedural rhetoric—the procedural rhetoric of crime. I’m hardly the first to think about the way SimCity or Micropolis models crime. Again, these criticisms date back to the nineties. And as recently as 2007, the legendary computer scientist Alan Kay called SimCity a “pernicious…black box,” full of assumptions and “somewhat arbitrary knowledge” that can’t be questioned or changed (Kay).
Kay goes on to illustrate his point using the example of crime in SimCity. SimCity, Kay notes, “gets the players to discover that the way to counter rising crime is to put in more police stations.” Of all the possible options in the real world—increasing funding for education, creating jobs, and so on—it’s the presence of the police that lowers crime in SimCity. That is the procedural rhetoric of the game.
And it doesn’t take long for players to figure it out. In fact, the original manual itself tells the player that “Police Departments lower the crime rate in the surrounding area. This in turn raises property values.”
It’s one thing for the manual to propose a relationship between crime, property values, and law enforcement, but quite another for the player to see that relationship enacted within the simulation. Players have to get a feel for it on their own as they play the game. The goal of the simulation, then, is not so much to win the game as it is to uncover what Lev Manovich calls the “hidden logic” of the game (Manovich 222). A player’s success in a simulation hinges upon discovering the algorithm underlying the game.
But, if the manual describes the model to us and players can discover it for themselves through gameplay, then what’s the value of looking at the code of the game. Why bother? What can it tell us that playing the game cannot?
Before I go any further, I want to be clear: I am not a programmer. I couldn’t code my way out of a paper bag. And this leads me to a crucial point I’d like to make today: you don’t have to be a coder to talk about code. Anybody can talk about code. Anybody can close read code. But you do need to develop some degree of what Michael Mateas has called “procedural literacy” (Mateas 1).
Let’s look at a piece of code from Micropolis and practice procedural literacy. This is a snippet from span.cpp, one of the many sub-programs called by the core Micropolis engine.
It’s written in C++, one of the most common middle-level programming languages—Firefox is written in C++, for example, as well as Photoshop, and nearly every Microsoft product. By paying attention to variable names, even a non-programmer might be able to discern that this code scans the player’s city map and calculates a number of critical statistics: population density, the likelihood of fire, pollution, land value, and the function that originally interested me in Micropolis,a neighborhood’s crime rate.
This specific calculation appears in lines 413-424. We start off with the crime rate variable Z at a baseline of 128, which is not as random at it seems, being exactly half of 256, the highest 8-bit binary value available on the original SimCity platform, the 8-bit Commodore 64.
128 is the baseline and the crime rate either goes up or down from there. The land value variable is subtracted from Z, and then the population density is added to Z:
While the number of police stations lowers Z.
It’s just as the manual said: crime is a function of population density, land value, and police stations, and a strict function at that. But the code makes visible nuances that are absent from the manual’s pithy description of crime rates. For example, land that has no value—land that hasn’t been built upon or utilized in your city—has no crime rate. This shows up in lines 433-434:
Also, because of this strict algorithm, there is no chance of a neighborhood existing outside of this model. The algorithm is, in Jeremy Douglass’s words when he saw this code, “absolutely deterministic.” A populous neighborhood with little police presence can never be crime free. Land value is likewise reduced to set formula, seen in this equation in lines 264-271:
Essentially these lines tell us that land value is a function of the property’s distance from the city center, the type of terrain, the nearby pollution, and the crime rate. Again, though, players will likely discover this for themselves, even if they don’t read the manual, which spells out the formula, explicitly telling us that “the land value of an area is based on terrain, accessibility, pollution, and distance to downtown.”
So there’s an interesting puzzle I’m trying to get at here. How does looking at the code teach us something new? If the manual describes the process, and the game enacts it, what does exploring the code do?
I think back to Sherry Turkle’s now classic work, Life on the Screen, about the relationship between identity formation and what we would now call social media. Turkle spends a great deal of time talking about what she calls, in a Baudrillardian fashion, the “seduction of the simulation.” And by simulations Turkle has in mind exactly what I’m talking about here, the Maxis games like SimCity, SimLife, and SimAnt that were so popular 15 years ago.
Turkle suggests that players can, on the one hand, surrender themselves totally to the simulation, openly accepting whatever processes are modeled within. On the other hand, players can reject the simulation entirely—what Turkle calls “simulation denial.” These are stark opposites, and our reaction to simulations obviously need not be entirely one or the other.
There’s a third alternative Turkle proposes: understanding the simulation, exploring its assumptions, both procedural and cultural (Turkle 71-72).
I’d argue that the close reading of code adds a fourth possibility, a fourth response to a simulation. Instead of surrendering to it, or rejecting it, or understanding it, we can deconstruct it. Take it apart. Open up the black box. See all the pieces and how they fit together. Even tweak the code ourselves and recompile it with our own algorithms inside.
When we crack open the code like this, we may well find surprises that playing the game or reading the manual will not tell us. Remember, code does what it says, but it also says things it does not do. Let’s consider the code for a file called disasters.cpp. Anyone with a passing familiarity with SimCity might be able to guess what a file called disasters.cpp does. It’s the routine that determines which random disasters will strike your city. The entire 408 line routine is worth looking at, but what I’ll draw your attention to is the section that begins at line 109, where the probability of the different possible disasters appears:
In the midst of rather generic biblical disasters (you see here there’s a 22% chance of a fire, and a 22% chance of a flood), there is a startling excision of code, the trace of which is only visible in the programmer’s comments. In the original SimCity there was a 1 out of 9 chance that an airplane would crash into the city. After 9/11 this disaster was removed from the code at the request Electronic Arts.
Playing Micropolis, say perhaps as one of the children in the OLPC program, this erasure is something we’d never notice. And we’d never notice because the machine doesn’t notice—it stands outside the procedural rhetoric of the game. It’s only visible when we read the code. And then, it pops, even to non-programmers. We could raise any number of questions about this decision to elide 9/11. There are questions, for example, about the way the code is commented. None of the other disasters have any kind of contextual, historically-rooted comments, the effect of which is that the other disasters are naturalized—even the human-made disasters like Godzilla-like monster that terrorizes an over-polluted city.
There are questions about the relationship between simulation, disaster, and history that call to mind Don DeLillo’s White Noise, where one character tells another, “The more we rehearse disaster, the safer we’ll be from the real thing…..There is no substitute for a planned simulation” (196).
And finally there are questions about corporate influence and censorship—was EA’s request to remove the airplane crash really a request, or more of a condition? How does this relate to EA’s more recent decision in October of 2010 to remove the Taliban from its most recent version of Medal of Honor? If you don’t know, a controversy erupted last fall when word leaked out that Medal of Honor players would be able to assume the role of the Taliban in the multiplayer game. After weeks of holding out, EA ended up changing all references to the Taliban to the unimaginative “Opposing Force.” So at least twice, EA, and by proxy, the videogame industry in general, has erased history, making it more palatable, or as a cynic might see it, more marketable.
I want to close by circling back to Michael Mateas’s idea of procedural literacy. My commitment to critical code studies is ultimately pedagogical as much as it is methodological. I’m interested in how we can teach everyday people, and in particular, nonprogramming undergraduate students, procedural literacy. I think these pieces of code from Micropolis make excellent material for novices, and in fact, I do have my videogame studies students dig around in this source code. Most of them have never programmed, let alone in C++, so I give them some prompts to get them started.
And for you today, here in the audience, I have similar questions, about the snippets of code that I showed, but also questions more generally about close reading digital objects. What other approaches are worth taking? What other games, simulations, or applications have the source available for study, and what might you want to look at with those programs? And finally, what are the limits of reading code from a humanist perspective?