Movie Tagger Alpha: Critical Tagging in Emerging Methods of Media Scholarship

By Joshua McVeigh-Schultz

The Movie Tagger project (1) was initially inspired by media artist and researcher Michael Naimark’s grand vision of a future in which every movie ever made could be richly tagged and parsed with time-based metadata (Naimark et al. 2010). To help realize this vision, in the alpha stage of the project on which I have been lead researcher and which I will discuss below, Naimark and our team collaborated with Zane Vella and others at Related Content Database (RCDb – now Watchwith), a company focused on commercial applications of time-based metadata.

Previous academic research had used such tagging, or annotation, to help identify formal patterns in film and other linear media (Cutting, Brunick, and Delong 2011; Tsivian 2009; Butler 2009; Manovich and Douglass). By focusing on an entire film, or corpus of films, rather than a sequence, these projects represented a departure from the traditional methodological emphasis on close readings in film and media scholarship. But, even as they expanded the scope of analysis, these earlier metadata projects also typically constrained the focus of inquiry to very specific formal parameters like shot length.

More subjective interpretive strategies of close reading (like those we are most familiar with in film and media studies) are traditionally applied to a single sequence, and thus can be difficult to translate into formal tagging schemas applicable across multiple films or even across multiple sequences within a single film. Questions of form, context, and filmic interpretation cut to the heart of familiar debates in film studies, often pitting formalism against more theoretically driven frameworks of analysis—psychoanalytic, post-colonial, feminist, queer, or Marxist, to name just a few. (2) The Movie Tagger project engages these issues insofar as time-based metadata tags introduce implicit arguments about the relationship between form and meaning, and raise questions about the degree of abstraction versus context specificity required for analytical interpretation.

Critical interpretive frameworks, such as those that examine the relationship between ideology and form, do indeed present significant challenges for hybrid human-machine collaborations due to inherent tensions between computational modes of abstraction and human analysts’ ability to recognize context specificity. But even neoformalist approaches to understanding meaning are not intended to be context independent. Specific formal devices elude universal interpretation, and as Kristin Thompson writes:

It is risky to assume that a given device has a fixed function from film to film…. Any given device serves different functions according to the context of the work, and one of the analyst’s main jobs is to find the device’s functions in this or that context. (Thompson 1988, 15)

This recognition of context specificity points to the role of the human analyst in synthesizing meaning. Such an acknowledgement stands in contrast to much of the existing innovation in the area of film analytics which relies on computer vision to do the formal analysis of the film frame, detecting elements like shot changes, luminosity, chroma, composition and dynamism within a frame. Rather than focus on these sorts of computationally abstractable features, we were instead interested in combining human annotation with computational visualisation strategies in order to explore opportunities for alternative “modes of seeing” in media scholarship. In an attempt to defamiliarise the process of film scholarship itself, we wanted to put more subjective interpretive strategies (strategies that push beyond formalism) in dialogue with macro-scale approaches to tagging.

Folksonomic tagging platforms, such as the online image sharing community Flickr, offer opportunities to discover unexpected connections among user-contributed media. By opening up the process of tagging media to a large audience of users, folksonomies enable serendipitous categorisation strategies to emerge that no one contributor could have predicted on their own. In this sense, they represent a collaboration of what might be termed ‘ontological effort’: the hashing out in public of what it means for something to stand in as a category for a larger set of things. For static images on Flickr, for example, questions of ontology are implicit in an often silent conversation that plays out among large groups of people who, through aggregation, gravitate to particular semantic conventions. Furthermore, in folksonomies, tagging schemas ultimately take on a bi-directional relationship to the contents of the database. As Steve Anderson, Movie Tagger Alpha Principal Investigator and Associate Professor at the University of Southern California’s School of Cinematic Arts, has pointed out, rather than understanding metadata strictly as a supplementary and ex post facto practice oriented toward the organization, storage and retrieval of database contents, we can also think of metadata as having a performative dimension. (3)

But this kind of collaboration becomes more complicated when applied to temporal media. Time-based metadata not only involves an interpretive framework applied to a piece of media, but also implies a selection of temporal boundaries (“in” and “out” points). While in film studies, the ‘shot’ has often been defined as a primary unit of analysis, time-based metadata does not privilege particular kinds of temporal boundaries, and instead, can apply just as easily to an entire film or to a single frame. This openness means that the ontological work of time-based metadata often involves miniature theories about change over time. For example, when ‘in’ and ‘out’ points are coupled with interpretive metadata, the calibration of these points involves tacit assumptions about the beginning and completion of pro-filmic actions. And in this sense, time-based metadata makes itself available to verb-like sequential or causal frameworks of meaning, in contrast to the more nominative or adjectival tendencies of metadata associated with static images. Or rather, it is perhaps more correct to say that time-based metadata addresses modalities of both change and stasis while the tags of static images primarily accommodate the latter (and can merely suggest the former).
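To make the point about temporal boundaries concrete, the following is a minimal sketch of how a single time-based tag might be represented. It is an illustration only and not the format used by the Movie Tagger team or by RCDb; the class name, field names, and sample values are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedTag:
    """One time-based annotation: an interpretive label plus its temporal boundaries."""
    label: str    # interpretive metadata, e.g. a theme or action
    t_in: float   # "in" point, in seconds from the start of the film
    t_out: float  # "out" point, in seconds from the start of the film

    def __post_init__(self):
        # An annotation cannot end before it begins.
        if self.t_out < self.t_in:
            raise ValueError("out point precedes in point")

    @property
    def duration(self) -> float:
        return self.t_out - self.t_in

    def overlaps(self, other: "TimedTag") -> bool:
        # Two closed intervals overlap when each starts before the other ends.
        return self.t_in <= other.t_out and other.t_in <= self.t_out


# Hypothetical values for illustration; not drawn from the project's data.
whole_film = TimedTag("dystopian setting", 0.0, 8700.0)
single_frame = TimedTag("freeze frame", 4212.5, 4212.5)
```

Because nothing in the structure privileges the shot, the same record can span an entire film or collapse to a single instant; the choice of "in" and "out" values is where the tagger's tacit theory of when an action begins and ends gets recorded.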

Responding to these unique challenges, a variety of projects (academic, artistic, and commercial) have been carrying out work that uses time-based metadata as an organizing principle. This eclectic mix of projects includes work by film scholars, information scientists, interactive cinema artists, and emerging web start-ups. In each case, the projects’ designers face questions about what and how to annotate in films.

A number of researchers have been developing analytical tools to enable large-scale research on formal features of time-based media. These projects include: Cinemetrics (Tsivian and Civjans), a platform for film scholars to record precise shot lengths while viewing in real-time; Shot Logger (Butler), a platform for annotating visual style in film and television; and the suite of projects produced by the Software Studies Initiative under the title Cultural Analytics (Manovich et al.). These works visualize large datasets of media (films, animations, video games, comics, artwork, web sites, etc.), and they have collaborated with Cinemetrics to visualize how shot length has evolved in different periods and geographical centres over the history of film.

Another set of projects utilises time-based metadata primarily as a way of organizing database narrative engines. In such projects, metadata categorisation supports the recombinant rules that enable smaller clips to be reorganised with each viewing. Database narrative research includes work by the Garage Cinema lab (Davis 2003), the Soft Cinema project (Manovich and Kratky 2005), and works by the Labyrinth Project research initiative (Kinder et al.). More recently, a set of projects under the title Enactive Cinema uses the spectator’s emotional participation to drive live editing decisions, relying on viewer bio-feedback to alter the course of the project’s narrative trajectory (Tikka 2008).

A further set of projects focuses on the clip as a unit of analysis. In these, metadata is used in a more curatorial role (as tags that enable users to discover connections within a film or across multiple films). In Critical Commons, tags serve as scaffolding for the opening up of critical commentary in the academy. A project by Steve Anderson, the website serves as a repository for connecting film clips and critical analysis. Utilised by film scholars, educators, and media makers alike, the project represents an unprecedented assertion of fair use for the public redistribution of copyrighted media in critical contexts (Anderson). (4)

Alongside these academic projects, commercial websites are also creating clip-based repositories in an effort to capitalise on social interactions among fans. For example, Anyclip enables users to find clips of interest by searching for keywords in a script and emphasises sharing through other social media sites. Movieclips seems to have a somewhat larger content base and focuses on curated categories oriented to film buffs. It should be noted that both Anyclip and Movieclips currently constrain the metadata categories available to those imposed by the sites’ creators—in other words, the antithesis of a folksonomic tagging structure.

In addition to these existing projects, this alpha phase of the Movie Tagger project also sought to draw from the rich community of filmmakers and film scholars at the University of Southern California (USC). Using this academic community as a test bed, we shifted away from a traditional notion of crowd sourcing to explore instead models that might more aptly be described as “expert sourcing” or “partner sourcing.” It should also be noted that this phase of research was not specifically focused on the mechanics or user interfaces that might support a sustained collaborative platform; rather, we were interested in exploring more fundamental questions about what would happen if we remapped existing interests of filmmakers and scholars to the research opportunities afforded by time-based metadata. What questions does this approach help answer? What sorts of research questions might it open up that we could not have anticipated?

Movie Tagger Process / Methodology

In the early stages of the project I interviewed a broad range of faculty to explore a variety of approaches to metadata annotation. Interviewees ranged from prominent industry figures and innovators to academic film scholars, and accordingly their interests and research foci ranged across a variety of topics. A total of twelve interviews were conducted including eight with USC faculty (Steve Anderson, Tara McPherson, Ellen Seiter, Henry Jenkins, Jeremy Kagan, Norm Hollyn, Tom Holman, Midge Costin), two with either current or former USC graduate students (Chris Hanson and Jesse MacKinnon), and finally two with external faculty whose research intersects directly with metadata analysis of film (Jeremy Butler from the University of Alabama and Eric Faden from Bucknell University).

My role as interviewer was, in some ways, also that of a cultural ambassador or translator. The process involved working with faculty to map their insights and research questions onto possibilities of metadata analysis and data visualisation. For example, we might start from a scholar’s interest in a subject like the evolving representations of ‘work’ and ‘race’ in a contemporary television series set in the American South (as was the case for Tara McPherson, who was preparing to write about the television series True Blood). Within the interview we would try to unpack how this interest might be adapted to a set of time-based tags that could be applied over an entire episode or corpus of episodes. In this example, tags might include specific workplace locations, racial identification of characters, and additional information about whether a character depicted a service worker, physical labourer, or other kind of worker. This faculty model would later be tested and further refined by our team once we had begun the process of annotation.

While some of our faculty informants, like McPherson, were quick to understand the research potential of aggregating data, for others this approach needed further explanation, given that most of the interviewees were accustomed to researching and thinking about films in terms of a methodology of close reading. Accordingly, this stage often involved a process of disciplinary translation in order to expand the perspective of existing research interests from a micro- to macro-analytical lens. As a testament to the open mindedness of our faculty collaborators, though, each of the interviews resulted in a variety of potential metadata models.

A small but dedicated team of undergraduate researchers from USC’s Institute for Multimedia Literacy (including Jason Lipshin, Corianda Dimes, and Kera Kadir) were paid hourly to take on the Herculean task of watching and exhaustively annotating select movies with time-based metadata tags. This process involved the typing of tags into a spreadsheet while watching a film or television series. During early stages of the project, we chose to use ordinary spreadsheet formatting as an intentionally basic annotation platform. Later, as we iterated with our commercial partners at RCDb, (5) we used spreadsheet templates and, in later stages, adopted customizable notation to help facilitate data visualisations. Our batch of test films included 5000 Fingers of Dr. T, Harry Potter and the Sorcerer’s Stone, Minority Report, The Nightmare Before Christmas, Singin’ in the Rain, Strange Days, True Blood (episode 1), Vertigo, West Side Story, and Written on the Wind.

In the early stages we deliberately held off on imposing rigid tagging protocols in order not to foreclose unexpected discoveries. Accordingly, metadata coders were encouraged to experiment with more open tagging strategies as a way of exploring the conceptual terrain of each media work under analysis. So, for example, the concept of ‘race’ might more loosely be connected to the ‘race’ of vampires in True Blood or to racial tension between vampires and humans, and taggers organized their metadata schemes organically to accommodate such ontological anomalies. As we started to hone metadata models derived from faculty research, and as we developed more specific data visualisation strategies, the coding format gradually became more standardised. For example, at a certain point, we constrained the annotation to specific tags and particular contexts in which they should be used. Nevertheless, throughout the process, insights and observations of the coders themselves served an important role in helping us to refine, challenge, and iterate on the faculty-inspired metadata models. As our team worked, they became increasingly adept at identifying and addressing anomalous instances that challenged our existing tagging ontology. In this sense, it should be pointed out that they progressively became a group of expert taggers, an ideal complement to our faculty informants. Moreover, as they conducted tagging, they would frequently meet with other members of the Movie Tagger team to iterate on tagging schemes, and revise or revisit the conceptual assumptions behind our faculty models.

Finally, in collaboration with Eddie Elliott at RCDb, we developed a series of data visualisations in order to explore our annotated films. Some of these visualisations also served as interactive portals to the films themselves, so that as one scrubbed through an annotated timeline, corresponding moments from the film would play. Other visualisations deemphasised the temporal dimension and instead focused on mapping aggregate data about relationships between characters in a film.

Examples of Deeper Metadata Exploration

Many of the faculty interviews provided inspiration for metadata tagging, but the research of two film scholars in particular, Steve Anderson and Henry Jenkins, went on to serve as instigation for deeper exploration. Both of these scholars have used close reading of individual clips as a launch pad for critical analysis. While each began his academic career under the tutelage of mentors steeped in the traditions of formal analysis, they subsequently developed research trajectories that took them beyond the boundaries of traditional film scholarship into areas of emerging media practice. In this sense, they were ideally positioned both as ethnographic informants and experimental collaborators, able to operate in a hybrid mode of inquiry.

In his research Jenkins has investigated, inter alia, representations of adult-child relationships in media (Jenkins 2003), so, based on his work, tagging schemas were developed to map, for example, the occurrence of intergenerational affectionate and aggressive touch and language in the films 5000 Fingers of Dr. T and Harry Potter and the Sorcerer’s Stone. Meanwhile, Anderson’s research has explored the representation of virtual reality in Hollywood films, including the way these technologies are used to remediate historical events such as the Rodney King beating. This interest led to an exploration of how race, gender, crowds, and police violence are depicted in the film Strange Days. With his expert input, tagging schemas were developed to understand the representation of crowds and police in Strange Days, and later, in West Side Story.

To begin with Anderson’s input, he had been interested in how Strange Days, like other movies, enacts repeated associations of new media technologies (in this case the Virtual Reality [VR] “wire”) with violence, here directed against women and African Americans. Through the interviews with Anderson, we also became interested in studying the way that the film depicts crowds and bodies in motion, in the hope of understanding how that dimension evolves over the course of the film. Researcher Jason Lipshin adapted these foci to a metadata schema for annotating bodies in movement that distinguished between dancing bodies and mobbing bodies.

During this early stage of the project, we also encouraged experimentation with a wide range of annotation topics. The tags for Strange Days included ‘police,’ ‘virtual reality’ (for references to the VR “wire” in the film), ‘brutality towards women,’ ‘violent,’ ‘non-violent,’ ‘dancing bodies,’ ‘unrest,’ ‘sexualized,’ and so on. These tags were then used as indices for an interactive iPad visualisation. [Figure 1] shows a screen shot of the interface dial using the touchscreen of the iPad to scroll in real time through the movie, allowing us to access any point in the movie that had been annotated by a particular tag. Using this iPad interface we can verify that the presence of the VR wire on screen clearly and consistently correlates with violence and sex, which is a longstanding trope of virtual reality in the cultural imaginary.

[Figure 1] iPad app utilizing circular timeline that visualizes the tags collected for Strange Days and plays clips when the playhead overlaps a tag (for example the two tags in red above).

In addition, the iPad interface helped us to identify instances throughout the film of dancing, celebrating bodies that later shift to mobbing bodies (often indicated through tags of ‘Racial Protest’). In between these two phases we often saw tags for ‘LAPD’ or ‘Police Brutality,’ suggesting that the presence of police consistently preceded or exacerbated violent unrest, often transforming dancing bodies into violent mobbing bodies. To illustrate a single example, [Figure 2] depicts a crowd of dancing bodies at a Y2K New Year’s celebration.

[Figure 2] ‘Dancing Bodies’ in Strange Days

The image below depicts a crowd from the same scene but at this stage they have become violent. Notice the police in the mid-ground clubbing a person in the mob.

[Figure 3] ‘Police Brutality’ in Strange Days

In the case above, we see a direct inciting of violence by the police, a scenario that was highlighted by metadata tags but would also have been accessible through close analysis. In other cases, however, the relationship appeared less obvious (police just “happened” to be cut in sequence, sometimes at an entirely different location from those dancing). Regardless, the police seem to be serving the thematic role of precipitating violence, a reading that is supported by the historical context of amplified racial tensions between the LAPD and African-Americans in post-Rodney King Los Angeles. By analysing the tags in this way we deliberately avoided distinguishing between causal and non-causal forms of sequentiality, but in doing so we revealed larger patterns and opened up new questions for close analysis. Such broader structural patterns suggest the potential of data aggregation across a much more extensive data set: for example, mapping the impact of police presence across time in dozens or hundreds of films and then looking for correlation with historical statistics such as the incidence of police brutality in specific contexts. Along these lines, we began to experiment with using similar tagging schemas across multiple films.
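The kind of sequential pattern described here can be sketched as a simple count over a list of time-based tags. The snippet below is a hedged illustration rather than the analysis the team actually ran: the function name, the tuple layout, the label ‘Mobbing Bodies,’ the 60-second window, and the sample timings are all assumptions. Like the reading above, it measures only sequence (one tag starting shortly after another) and makes no claim about causation.

```python
from typing import List, Tuple

# (label, in point, out point), with times in seconds from the start of the film
Tag = Tuple[str, float, float]


def followed_by(tags: List[Tag], first: str, then: str, window: float) -> Tuple[int, int]:
    """Return (hits, total): of the `total` tags labelled `first`, `hits` are
    followed by a tag labelled `then` that starts no more than `window`
    seconds after the `first` tag starts."""
    first_starts = [t_in for label, t_in, _ in tags if label == first]
    then_starts = [t_in for label, t_in, _ in tags if label == then]
    hits = 0
    for start in first_starts:
        if any(start <= s <= start + window for s in then_starts):
            hits += 1
    return hits, len(first_starts)


# Invented timings for illustration only; not the project's annotations.
sample = [
    ("Dancing Bodies", 100.0, 160.0),
    ("LAPD", 150.0, 170.0),
    ("Mobbing Bodies", 165.0, 220.0),
    ("LAPD", 900.0, 910.0),
    ("Dancing Bodies", 1500.0, 1550.0),
]
hits, total = followed_by(sample, "LAPD", "Mobbing Bodies", window=60.0)
```

Run over many films with a shared schema, a ratio of this kind is the sort of aggregate that could then be compared against historical statistics, while the individual hits and misses remain available as entry points for close analysis.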

Building on what we had learned from our experiments with tagging Strange Days, we wanted to explore the themes of dancing and violence in a radically different genre of film. We chose West Side Story in part because of similar themes of dancing bodies and bodies in violence as well as the presence of police.

[Figure 4] Linear timeline visualisation of tags in West Side Story

This time we also decided to look at intensities of dancing to see if that would reveal any consistent patterns. We also chose to experiment with linear timelines, this time as a way of exploring more detailed relationships between tags in sequence. In contrast to Strange Days, in West Side Story the police often precipitated the breaking up of the rival gangs. When the LAPD is tagged in Strange Days, subsequent sequences demonstrate greater chaos, violence and dynamism in the frame, whereas in West Side Story, the police have a consistently calming or mollifying effect on the movement of bodies on screen. In [Figure 4] above, we see that the presence of police (in yellow) corresponds, in many cases, either to a subsequent drop in high-intensity interracial dancing (purple), to a cessation of interracial conflict (black), to an interruption of interracial love (pink), or to some combination of these factors.

[Figure 5] Cops breaking up dancing/violence in West Side Story

In thinking through the implications of this comparison between Strange Days and West Side Story, Anderson reflected upon the importance of reflexivity in our tagging scheme, which he suggests enabled the discovery of emergent analytical vectors and serendipitous sites of comparison. Strange Days and West Side Story would surely have otherwise remained segregated by genre and chronology, but by juxtaposing them we can start to see the potential of extending this analytical vector to include a great many more films dealing with related themes. Through these unlikely connections, time-based metadata not only encourages the unmooring of context for particular formal features or pro-filmic actions but also productively defamiliarises the demarcations of genre, subject matter, and historical period.

Connecting these observations about our reflexive process to the productive pairing of micro- and macro-analytic frameworks, Anderson also argues that the coexistence of close reading and aggregated analytics across two or more films could represent a “cinematic corollary of the type of ‘distant reading’ advocated by literary historian Franco Moretti (2007). Like Moretti, this mode of analysis seeks to derive insights that are both highly detailed about the contents of a single work (the literal remediation of George Holliday’s video of the King beating in Strange Days, for example) and concurrently embedded in a context so broad that its contours may only be legible with the aid of computational processes.” (6)

To turn next to Movie Tagger’s collaboration with Henry Jenkins, this internationally renowned media scholar has written about the advent of the permissive parenting movement in the 1950s in relation to pop culture, in particular examining the interactions between adults and children in the film 5000 Fingers of Dr. T, an adaptation of one of Dr. Seuss’s stories (Jenkins 2003). Jenkins points out how the film illustrates a tension between opposing philosophies of parenting, one permissive (epitomized by Mr. Zabladowski) and the other domineering (embodied by Dr. Terwilliker). Taking a cue from the themes of Jenkins’s research, we were interested in aggregating metadata about communication between adults and children. In particular, we looked at how authoritative and permissive language towards children played out in the film and at whether aggressive or affectionate touch between adults and children might also serve as a key signifier. (7) As we refined our metadata model, we found we wanted to specify the actor and receiver of the action. So we included information about which party was initiating the touch and which party was the receiver. Likewise, we distinguished between speakers and addressees in examples of authoritative and permissive speech. The image in [Figure 6], below, depicts an example of Dr. T. waking Bart from a nightmare. This was tagged as an example of ‘aggressive touch’ (from Dr. T. to Bart).

[Figure 6] Aggressive touching in 5000 Fingers of Dr. T.

[Figure 7] Authoritative language in 5000 Fingers of Dr. T.

In Bart’s dream world Dr. Terwilliker becomes an autocratic figure who commands an army of piano playing pupils. The image in [Figure 7] depicts a moment that was tagged as authoritative language.


[Figure 8] Permissive Language and Affectionate Touching.

Mr. Zabladowski also figures in Bart’s dream world. In [Figure 8] he is shown in a sequence that was tagged as both affectionate touch and permissive language.

[Figure 9] Network graph illustrating authoritative and permissive language over the entirety of 5000 Fingers of Dr. T.

In [Figure 9] we visualised authoritative and permissive language using a graph that illustrates network relationships. Red signifies authoritative language and green signifies permissive. We also indicated speaker and addressee by varying the opacity of the joining line: opacity is highest near the addressee and decreases as the line approaches the speaker.

Using the above graph we can make observations about which adults are speaking most authoritatively to Bart (with Dr. T, and to a lesser degree, Bart’s mother both speaking more authoritatively than Mr. Zabladowski). We can also observe that Bart himself is speaking permissively to Mr. Zabladowski, perhaps revealing an implicit claim the film makes about the effectiveness of permissive parenting strategies.
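The aggregation behind a graph of this kind can be sketched as a count of tagged utterances per directed speaker-to-addressee pair. The following is a minimal illustration under stated assumptions: the function names and tuple layout are hypothetical, and the counts are invented for the example rather than taken from the project's annotations of the film.

```python
from collections import Counter
from typing import Iterable, Tuple

# (speaker, addressee, kind), where kind is "authoritative" or "permissive"
Utterance = Tuple[str, str, str]


def edge_weights(utterances: Iterable[Utterance]) -> Counter:
    """Count tagged utterances per directed (speaker, addressee, kind) edge."""
    return Counter(utterances)


def authoritative_share(weights: Counter, speaker: str, addressee: str) -> float:
    """Fraction of a speaker's tagged utterances to an addressee that are authoritative."""
    auth = weights[(speaker, addressee, "authoritative")]
    perm = weights[(speaker, addressee, "permissive")]
    total = auth + perm
    return auth / total if total else 0.0


# Invented counts for illustration only.
tagged = (
    [("Dr. T", "Bart", "authoritative")] * 3
    + [("Mr. Zabladowski", "Bart", "authoritative")]
    + [("Mr. Zabladowski", "Bart", "permissive")] * 3
    + [("Bart", "Mr. Zabladowski", "permissive")] * 2
)
weights = edge_weights(tagged)
```

Keeping speaker and addressee as an ordered pair is what allows a visualisation to distinguish the two ends of each line, and it also means that Bart's speech to adults is counted separately from their speech to him.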

[Figure 10] Network graph illustrating affectionate and aggressive touch in 5000 Fingers of Dr. T.

[Figure 10] depicts the distribution of touch (coded as affectionate, aggressive, or neutral). Additional information is noted if the touch is mutual, attempted (but failed), or magic. As in the authoritative and permissive language network graph, this visualization distinguishes the actor from the receiver of touch by varying opacity. The opacity of the joining line is highest near the touchee and decreases as the line approaches the toucher. From this data visualisation we can observe that both Mr. Zabladowski (Mr. Z.) and Bart’s mother exhibit a combination of affectionate and aggressive touch towards Bart while Dr. T. exhibits instances of aggressive touch only.

We were interested in understanding how a metadata tagging scheme might be ported to another film. Given Henry Jenkins’s research on Harry Potter fandom, and given the very different take on parental authority that the Harry Potter stories seem to offer, we decided to use the film Harry Potter and the Sorcerer’s Stone as a test-subject for the same metadata framework that we had employed for 5000 Fingers of Dr. T.

In contrast to 5000 Fingers of Dr. T., the Harry Potter story world accommodates authoritative language from adults who are depicted warmly. For example, in [Figure 11] below, Professor McGonagall, a stalwart ally of the children throughout the story, can nevertheless be seen sternly lecturing Harry.

[Figure 11] Professor McGonagall using authoritative language

At the same time, the figure of Hagrid relates more permissively to Harry and also frequently initiates affectionate touch, as in [Figure 12], below.

[Figure 12] Hagrid demonstrating affectionate touch towards Harry

The contrast between McGonagall and Hagrid, both of whom represent beneficent forces in Harry’s life, suggests that Harry, in Harry Potter and the Sorcerer’s Stone, has a more complicated relationship to the adults in his life (compared to Bart in 5000 Fingers of Dr. T.).

The data visualization in [Figure 13], below, demonstrates that Hagrid and, to a lesser degree, McGonagall and Dumbledore express permissive language towards Harry. Authoritative language towards Harry comes from his guardians (Uncle Vernon and Aunt Petunia), McGonagall, Quirrell, and Filch.

[Figure 13] Network graph illustrating authoritative and permissive language in Harry Potter and the Sorcerer’s Stone.

Instances of permissive language towards Harry from Snape and Voldemort point to anomalous cases in which sarcasm (in the case of Snape) and bargaining (in the case of Voldemort) created tensions for our existing tagging schema. Rather than see these anomalous examples as failures, however, we found them to be important opportunities for revisiting close analysis.

[Figure 14] Network graph illustrating aggressive and affectionate touching in Harry Potter and the Sorcerer’s Stone.

Similarly, [Figure 14], a network graph illustrating instances of touch, contains examples in which pro-filmic actions created ontological tensions for our tagging schemas. In particular, we found anomalous cases of touching that did not easily conform to our metadata categories.

[Figure 15] Harry Potter touching (and being touched by) Quirrell/Voldemort.

For example, in [Figure 15] Harry attempts to remove Voldemort’s hand (which is actually the hand of Quirrell, whose body has been possessed by Voldemort). Harry does not realise that his power over Voldemort will result in Quirrell’s hand getting burned. In such a case, a number of questions emerge: Which character, Quirrell or Voldemort, is the receiver of the touch? According to metadata tags, both Harry and Quirrell exhibited instances of aggressive touching towards one another, as illustrated in [Figure 14]. But is this example of touch aggressive, since Voldemort is being harmed, or neutral, since Harry is not aware that he has this power yet? Similarly, in other moments in the film, characters frequently “touch” one another remotely through their wands. Should such instances count as touching?

Such interpretational questions point to issues of abstraction and context specificity that I raised in my introduction. The lens of close analysis can help us to probe anomalies in the data and productively complicate our tagging framework by pointing out gaps and ontological tensions. Casting these anomalies as “merely” anomalous, then, would miss the point, for they speak to the important role that close analysis can play in critically re-examining the assumptions behind macro-analytic frameworks of data visualisation. Rather than seeing close-analysis and metadata as incompatible, we should emphasize instead the valuable dialogue that can occur at the intersection of micro- and macro-analytical lenses. The metadata collection that followed on from our collaborations with Anderson and Jenkins, then, led to revealing data visualisations and also provided a rich resource for revisiting these scholars’ topics of close analysis. Since computationally assisted “seeing” can risk obfuscating the very processes of meaning making that close analysis is adept at identifying, our experience is that pairing the two methodologies helped to challenge our data aggregation models to attend more closely to the hidden assumptions behind particular schemas of categorisation.

Conclusions and Future Directions

This phase of the Movie Tagger project can be understood as a design intervention in two very different domains. First, we took an existing approach to visualizing metadata that our partner company RCDb had developed with commercial contexts in mind—for example, using metadata to track product placement—and we adapted their architecture to serve as a tool of ideological analysis. Second, the project represents an intervention into the existing methodologies of film and media research. Along these lines, Anderson has flagged up the following two points as indicative of the transformational impact that the process of thinking through metadata has had on his own scholarship:

(1) Once you start thinking in terms of tagging movies in real time, you don’t ever really go back. My mental model for movie watching for the past year has been the construction of metadata schemes and cognitive databases for all media that I take in.

(2) I’ve begun to take seriously the merging of data with metadata when it comes to the massive database that is the history of film and television. It’s only a matter of time before we have the realistic ability to have metadata function as the primary viewing interface for what may be considered One Big Movie – that movie being everything ever committed to a format that can be accessed by the metadata. (8)

Anderson’s point about “One Big Movie” suggests a world in which metadata authoring, manipulation, and visualisation become a taken-for-granted, always-available norm of viewership. Such an emerging media landscape would open up as yet vastly unexplored areas of research. But it would also mean that questions about how to strike the right balance between context specificity and abstraction would no longer be confined to academic debates about meaning and criticality. Likewise, the commercial embrace of Big Data (9) also represents an unmooring of interpretive context; the attendant ontological tensions created by porting metadata frameworks from one domain into another may become a feature of our everyday media consumption. As humanists we have an important opportunity to put our own analytical toolsets in dialogue with the rhetoric and machinery of data visualisation, to exhume ontological tensions from metadata frameworks, and to demand that critical frameworks such as gender, race, power, and ideology, among others, be addressed by emerging tools of media research.


(1) Michael Naimark originated the Movie Tagger concept and his leadership guided the project through each stage of development. Additional founding members included: Steve Anderson, Maya Churi, Perry Hoberman, Andreas Kratky, Erik Loyer. Earlier phases of the project have involved collaborations with Scott Fisher and the Mobile and Environmental Media Lab. During this research phase, Naimark acted as the primary liaison with Zane Vella, of RCDb, a company that is pioneering commercial applications of time-based metadata. Steve Anderson also guided and facilitated the research in his role as Principal Investigator for this phase of the project.

For the alpha phase of the project, we explored with RCDb possible intersections between metadata tagging and approaches to film research in the academy. The core collaborators at RCDb included Eddie Elliott, who designed and engineered data visualisations of the metadata that we collected, and Sunny Lee, who served as project manager from the RCDb side of operations and met regularly with our USC team. Darren Lepke also served as a metadata adviser.

With an eye also toward investigating new models of folksonomic “expert-sourcing”, ten film scholars were interviewed in order to adapt insights from their filmmaking practice, teaching, and media scholarship to metadata tagging schemas. My own role in this effort was as lead researcher, interviewing USC faculty, supervising metadata annotation, and synthesizing research reports. Our undergraduate research assistants—whose creativity and devoted tagging efforts undergirded the project’s metadata collection component—included Corianda Dimes, Kera Kadir, and Jason Lipshin. 

(2) Despite the eclecticism of critical approaches, pedagogy in film studies is strongly influenced by the methodology of neoformalism, especially in introductory undergraduate classes. Indeed, the prominence of Bordwell and Thompson’s Film Art: An Introduction in undergraduate film programs speaks to the impact of neoformalist models of film scholarship in the academy. Even pedagogical approaches that reach substantially outside of the frame of the film-as-text, to grapple with subject matter like power and ideology, still require students to ground their arguments within specific formal observations. For example, Timothy Corrigan’s A Short Guide to Writing About Film encourages students to take notes on concrete stylistic features, then reflect on those notes, and finally identify themes that synthesize formal observations into higher level arguments about a particular film (Corrigan 2003, 26-37). However, these familiar methods of note taking and annotation (such as those taught in introductory texts) seem ripe for design intervention.

(3) The tags added to flickr, for instance, become constituent elements of database contents, driving the creation and expansion of categories of images. A flickr tag such as “cats in sinks” becomes part of the motivation for flickr users to put their cat in the sink and take a photograph in order to add to the database. In our case, the creation of analytical categories for cinematic analysis served to transform our own critical perspectives and drove the selection of particular films and content features. Moreover, the potential to perform simultaneously close and distant readings within a single analytical platform prompted a range of unexpected outcomes for both researchers and taggers, which in turn continued to shape the kinds of observations we were able to make.

(4) See Anderson’s article on his work elsewhere in this inaugural issue of Frames.

(5) See footnote 1.

(6) Excerpted from correspondence with Anderson. See his contribution on issues of fair use elsewhere in the special issue of Frames.

(7) Team member Corianda Dimes led the efforts in tagging and refining the metadata model for this example.

(8) Anderson, Steve. 2012. “Technologies of Critical Writing: On the War Between Data and Images.” Paper presented at Society for Cinema and Media Studies, March 24th, Boston.

(9) As noted in footnote 1 above, RCDb developed their metadata architecture in part to track commercial product placement.


Anderson, Steve. Critical Commons.

Anderson, Steve. 2012. “Technologies of Critical Writing: On the War Between Data and Images.” Paper presented at Society for Cinema and Media Studies, March 24th, Boston.

Butler, Jeremy G. Shot Logger 2.0: Overview.

———. 2009. Television Style. Routledge.

Cutting, James E., Kaitlin L. Brunick, and Jordan E. Delong. 2011. “How Act Structure Sculpts Shot Lengths and Shot Transitions in Hollywood Film.” Projections 5 (1) (June 15): 1-16. doi:10.3167/proj.2011.050102.

Davis, Marc. 2003. “Editing Out Video Editing.” IEEE MultiMedia 10 (2).

Jenkins, Henry. 2003. “No Matter How Small”: The Democratic Imagination of Dr. Seuss. In Hop on Pop: The Politics and Pleasures of Popular Culture. Duke University Press Books.

Kinder, Marsha, Kristy Kang, Rosemary Comella, and Scott Mahoy. The Labyrinth Project on Interactive Narrative.

Manovich, Lev, and Jeremy Douglass. “Visualizing Temporal Patterns in Visual Media.” Unpublished Manuscript.

Manovich, Lev, and Andreas Kratky. 2005. Soft Cinema: Navigating the Database (DVD-video with 40 page color booklet). MIT Press.

Manovich, Lev, Noah Wardrip-Fruin, Jeremy Douglass, and Benjamin Bratton. Software Studies Initiative.

Moretti, Franco. 2007. Graphs, Maps, Trees: Abstract Models for Literary History. Verso.

Naimark, Michael, Steve Anderson, Maya Churi, Perry Hoberman, Andreas Kratky, and Erik Loyer. 2010. Movie Tagger: a method and system for parsing and richly tagging every movie ever made.

Thompson, Kristin. 1988. Breaking the Glass Armor. Princeton University Press.

Tikka, Pia. 2008. Enactive Cinema: Simulatorium Eisensteinense. PhD Disser. Helsinki: University of Art and Design Publication Series.

Tsivian, Yuri. 2009. Cinemetrics, Part of the Humanities’ Cyberinfrastructure. In Digital Tools in Media Studies: Analysis and Research, An Overview, ed. Michael Ross, Joseph Garncarz, Manfred Grauer, and Bernd Freisleben, 220. Transcript Verlag.

Tsivian, Yuri, and Gunars Civjans. Cinemetrics: Movie Measurement and Study Tool Database.


Frames # 1 Film and Moving Image Studies Re-Born Digital? 2012-07-02, this article © Joshua McVeigh-Schultz. This article has been blind peer-reviewed.