Empirical Methods

The idea of asking questions about literature empirically, and getting to answer them, can be one of the most exciting things about cognitive approaches to the study of literature. It can also be one of the most daunting. There are as many methods as there are questions to ask, and the development of methods adequate to answering the questions we really want to ask about literature is still very much in progress. Many methods can be tried out in a modest way without too long a learning curve or too substantial an investment of resources. This section provides an overview of some commonly employed methods for studying both texts and readers, and the kinds of question they can be used to tackle. Of course, many of the studies in the main Cognitive Humanities bibliography also employ empirical methods, so the Methods sections there can be mined for ideas too.

Compiled by Emily Troscianko

Why do it?

The most basic argument for working empirically is that putting our theories and arguments to some kind of empirical test is an important way of corroborating, falsifying, and refining them. Here are some general introductions to reasons and methods for studying literature empirically:

Steen, G. J. (1991). The empirical study of literary reading: Methods of data collection. Poetics, 20, 559-575. https://bit.ly/3moGzQs (full text)

An outline of the most common measures used in experiments with literary texts, including verbal and nonverbal, with varying degrees of control imposed by the experimenter, and administered before, during, or after reading. Each strikes at a different point the balance between analysability and manipulability on the one hand, and richness and validity on the other.

Martindale, C. (1996). Empirical questions deserve empirical answers. Philosophy and Literature, 20, 347-361. https://muse.jhu.edu/article/26842/summary (paywall)

A forceful early advocacy of empirical literary studies, focused on the question of how much consistency or variation there is in trained and untrained readers’ interpretations and classifications of literature and other media. The sentence ‘Theories imply hypotheses, and hypotheses imply empirical or experimental testing’ nicely sums it up.

Bortolussi, M., and Dixon, P. (2003). Preliminaries. In M. Bortolussi and P. Dixon, Psychonarratology: Foundations for the empirical study of literary response, pp. 34-59. Cambridge: Cambridge University Press. https://www.google.co.uk/books/edition/Psychonarratology/549cb3w1OiUC?hl=en&gbpv=1&dq=Psychonarratology%3A%20Foundations%20for%20the%20empirical%20study%20of%20literary%20response&pg=PA34&printsec=frontcover (partial preview)

‘This chapter was designed in part to address the needs, interests, and concerns of literary scholars who may be intrigued by the empirical study of literary response but lack the confidence to pursue it on their own.’ The chapter includes an introduction to psychonarratology (the study of the psychological processing of narrative) and to the concepts of the ‘statistical reader’ and ‘measurement distributions’ of particular variables within a given population. It sets out some of the epistemological assumptions involved in empirical research on literature, as well as arguments for the importance of distinguishing clearly between text features and reader constructions, and the value of carrying out controlled ‘textual experiments’ in which texts are manipulated and changes in readers’ responses observed so as to eliminate potential confounds and move closer to causal explanations. (There are also some brief remarks on significance testing.) Subsequent chapters go into detail on the topics of narrator, events and plot, characters and characterisation, perception and focalisation, and represented speech and thought, in each case clarifying what it means to take a psychonarratological approach, and presenting existing empirical evidence on reader responses.

Lauer, G. (2009). Going empirical. Why we need cognitive literary studies. Journal of Literary Theory, 3(1), 145-154. https://goedoc.uni-goettingen.de/bitstream/handle/1/8377/jlt.2009.007_Lauer.pdf?sequence=1&isAllowed=y (full text)

A riposte to arguments against cognitive or neuroscientific approaches to the study of literature, emphasising the value of empiricism.

van Peer, W. (ed.) (2011). The future of scientific studies in literature. Special issue, Scientific Study of Literature, 1(1). https://benjamins.com/catalog/ssol.1.1 (paywall)

The contributions to SSOL’s first issue include a range of theoretically and empirically orientated discussions, on topics including science and literariness, empirical narratives and ‘symptoms of science’, literature as entertainment, and cultural colonialism; the importance of interdisciplinarity in the teaching of literature; corpus and computational linguistics; the use of ‘textoids’ versus more naturalistic texts in experimental work; and individual differences and commonalities amongst readers and uses of language.

Hanauer, D. I., Kuiken, D., and Hakemulder, F. (2013). The scope of SSOL: A discussion of the boundaries of science and literature. Scientific Study of Literature, 3(2), 169-174. https://www.researchgate.net/publication/263186188_The_scope_of_SSOL_A_discussion_of_the_boundaries_of_science_and_literature (full text)

The editors of the then two-year-old journal try to clarify what they mean by science and by literature.

Rating scales and questionnaires

Some variant on rating-scale research is the most common way of working empirically with literary response. You can make up your own questions, or use sets of questions designed and validated by other people, or a mixture. The validated scales below were designed specifically with aesthetic responses in mind, but you can also find validated scales online to measure countless trait and state variables that may differentiate your participants or their responses.
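
If you do make up your own questions, scoring the answers is usually simple. The following is a minimal sketch, not taken from any of the questionnaires listed below: the item names, the two subscales, the reverse-keyed item, and the 7-point format are all hypothetical, and a validated scale will come with its own scoring key.

```python
from statistics import mean

SCALE_MAX = 7  # assumed 7-point scale: 1 = strongly disagree, 7 = strongly agree

# Hypothetical item-to-subscale key; published questionnaires supply their own.
SUBSCALES = {
    "absorption": ["q1", "q2", "q3"],
    "imagery": ["q4", "q5"],
}
REVERSE_KEYED = {"q3"}  # hypothetical negatively worded item


def score_item(item, rating):
    """Flip reverse-keyed items so that a higher score always means 'more'."""
    if not 1 <= rating <= SCALE_MAX:
        raise ValueError(f"{item}: rating {rating} is outside 1-{SCALE_MAX}")
    return SCALE_MAX + 1 - rating if item in REVERSE_KEYED else rating


def score_participant(responses):
    """Return the mean rating per subscale for one participant."""
    return {
        name: mean(score_item(item, responses[item]) for item in items)
        for name, items in SUBSCALES.items()
    }


# Invented responses from one participant.
example = {"q1": 6, "q2": 5, "q3": 2, "q4": 3, "q5": 4}
print(score_participant(example))
```

Averaging rather than summing keeps subscales with different numbers of items on the same 1-7 range, which makes them easier to compare at a glance.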

Validated literature/narrative-specific questionnaires

Miall, D. S., and Kuiken, D. (1995). Aspects of literary response: A new questionnaire. Research in the Teaching of English, 29, 37-58. https://sites.ualberta.ca/~dmiall/reading/LRQ_95.htm (full text)

The Literary Response Questionnaire provides scales to measure insight, empathy, imagery vividness, leisure escape, concern with author, story-driven reading, and rejection of literary values in readers’ orientations towards literary texts. This paper describes the LRQ’s development and relates its subscales to readers’ personality traits and learning skills.

Green, M. C., and Brock, T. C. (2000). The role of transportation in the persuasiveness of public narratives. Journal of Personality and Social Psychology, 79(5), 701-721. http://www.communicationcache.com/uploads/1/0/8/8/10887248/the_role_of_transportation_in_the_persuasiveness_of_public_narratives.pdf

Develops and validates a transportation scale to measure absorption into a story (conceived of as involving mental imagery, emotion, and attentional focus). Experiments using the scale are reported that show effects of transportation level on story-consistent beliefs, and on evaluations of protagonists and other textual features, while finding no effect on transportation from labelling stories as fact or as fiction.

(For an application of this scale in two experiments investigating interactions between readers’ pre-reading emotional states and the emotional tone of the narrative, see also Green, M. C., Chatham, C., and Sestir, M. A. (2012). Emotion and transportation into fact and fiction. Scientific Study of Literature, 2(1), 37-59. https://www.jbe-platform.com/content/journals/10.1075/ssol.2.1.03gre (paywall). See also Thompson and Haddock 2012 (in the Cognitive Humanities bibliography section Personality and Individual Difference) for use of the transportation scale in the context of a study on individuals’ drinking habits, attitudes, and intentions.)

Busselle, R., and Bilandzic, H. (2009). Measuring narrative engagement. Media Psychology, 12, 321-347. https://www.tandfonline.com/doi/abs/10.1080/15213260903287259 (paywall)

Describes the development and validation of a scale to measure narrative engagement based on a mental-models approach to narrative processing across media. The scale distinguishes between narrative understanding, attentional focus, emotional engagement, and narrative presence, and is validated using data from viewers of movies and TV programmes in different viewing situations and from the USA and Germany.

Kuijpers, M. M., Hakemulder, F., Tan, E. S., and Doicaru, M. M. (2014). Exploring absorbing reading experiences: Developing and validating a self-report scale to measure story world absorption. Scientific Study of Literature, 4(1), 89-122. https://www.researchgate.net/publication/265968904_Exploring_absorbing_reading_experiences_Developing_and_validating_a_self-report_scale_to_measure_story_world_absorption (full text)

The scale includes the dimensions of attention, transportation, emotional engagement, and mental imagery (these subscales can also be used independently), and predicts two distinct evaluative responses: enjoyment and impact. The authors argue that the subscale of narrative presence in Busselle and Bilandzic’s narrative engagement scale confounds two dimensions (transportation and attention), and that Green and Brock’s transportation scale also lacks precision relative to theirs.

Moore, M., & Gordon, P. C. (2015). Reading ability and print exposure: Item response theory analysis of the author recognition test. Behavior Research Methods, 47(4), 1095-1109. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4732519/ (full text)

Outlines the use of the Author Recognition Test (a list of real and distractor author names) as a strong predictor of reading skill, with a potential two-factor structure differentiating between literary and popular authors. This test is often used to control for reading ability, a variable that may otherwise be a serious confound in reading experiments.
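
As a rough illustration of how a checklist like this can be scored, here is a minimal sketch using the simple convention of hits minus false alarms (real names ticked minus distractors ticked). The names and list length are invented for the example, and the paper's own item response theory analysis is considerably more sophisticated than this.

```python
# Hypothetical mini-checklist; real versions of the test use much longer lists.
REAL_AUTHORS = {"Toni Morrison", "Kazuo Ishiguro", "Agatha Christie"}
FOILS = {"Marta Ellingwood", "Peter Hallbrook"}  # invented distractor names


def score_art(ticked):
    """Score one participant's checklist as hits minus false alarms."""
    ticked = set(ticked)
    unknown = ticked - REAL_AUTHORS - FOILS
    if unknown:
        raise ValueError(f"names not on the checklist: {sorted(unknown)}")
    hits = len(ticked & REAL_AUTHORS)        # real authors recognised
    false_alarms = len(ticked & FOILS)       # distractors wrongly ticked
    return {"hits": hits, "false_alarms": false_alarms, "score": hits - false_alarms}


print(score_art(["Toni Morrison", "Agatha Christie", "Peter Hallbrook"]))
```

Subtracting false alarms penalises guessing, so a participant who ticks every name does not come out as the best-read person in the sample.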

Ad hoc scales

Hilscher, M. C., and Cupchik, G. C. (2005). Reading, hearing, and seeing poetry performed. Empirical Studies of the Arts, 23(1), 47-64. https://www.researchgate.net/profile/Gerald_Cupchik/publication/250145971_Reading_Hearing_and_Seeing_Poetry_Performed/links/567fd43808ae1e63f1e9344b/Reading-Hearing-and-Seeing-Poetry-Performed.pdf (full text)

Investigates different ways of experiencing poetry by means of two questionnaires constructed for the purposes of this study: the General Poetry Questionnaire (about general experiences and impressions of poetry, administered beforehand) and the Poetry Reception Questionnaire (about the cognitive-emotional nuances of poetry reception, administered after reading), plus one open-ended question about the poem’s meaning (answered before the PRQ). The free-response paragraphs were analysed using a qualitative method of category construction. The study found that people prefer reading poetry rather than hearing it read or seeing it performed, since this lets them explore the text more independently and creatively.

Carney, J., Wlodarski, R., and Dunbar, R. (2014). Inference or enaction? The impact of genre on the narrative processing of other minds. PLoS One, 9(12), e114172. http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0114172 (full text)

Administers rating scales tapping engagement with the text and evaluations of literary quality to investigate whether narratives affect how we interact with other minds. A contrast was found in assessment of the literary quality of relationship versus espionage stories (as examples of evolutionarily familiar versus unfamiliar scenarios), depending on how many levels of intentionality they incorporated.

Online surveys

Koopman, E. (2011). Predictors of insight and catharsis among readers who use literature as a coping strategy. Scientific Study of Literature, 1(2), 241-259. https://www.jbe-platform.com/content/journals/10.1075/ssol.1.2.04koo (paywall)

Uses an online survey to ask respondents to report on music and literature that helped them get through a difficult time in their lives. They were asked questions (on 7-point scales) about their loss experience, coping style, and engagement with the music or literature. Aesthetic feelings (attention to and appreciation of stylistic features) were found to correlate with absorption and with experiencing more thoughts during reading, but not with catharsis or insight, which instead correlated with narrative feelings (identifying with the character and feeling absorbed in the narrative world). However, a subgroup also found therapeutic value in the comfort of aesthetic beauty.

Troscianko, E. T. (2018). Literary reading and eating disorders: survey evidence of therapeutic help and harm. Journal of Eating Disorders, 6(1), 1-17. https://link.springer.com/article/10.1186/s40337-018-0191-5 (full text)

A large online survey investigating respondents’ perceptions of the connections between their reading habits and their mental health, with a focus on eating disorders. The results show a strong correlation between reading eating disorder-themed texts and self-assessed significant detrimental effects on all studied dimensions (mood, self-esteem, feelings about one’s body, and diet and exercise habits), while self-reported responses to respondents’ preferred type of other fiction were neutral or positive.  

Qualitative methods

There are a variety of responses to the restrictions imposed by rating-scale paradigms; here are just a few.

Van Peer, W. (1990). The measurement of metre: Its cognitive and affective functions. Poetics, 19, 259-275. https://www.sciencedirect.com/science/article/abs/pii/0304422X9090023X (paywall)

Participants’ responses to a metrical and a non-metrical version of a poem are measured using 16 semantic differential scales (pairs of adjectives chosen to tap aesthetic reactions) plus a multiple-choice recall task and some comprehension questions. Metrical structure was found to enhance aesthetic pleasure and the ability to recognise segments of the text afterwards.

(For a review of the semantic differential method, see also Heise, D. R. (1970). The semantic differential and attitude research, in G. F. Summers (Ed), Attitude measurement (pp. 235-253). Chicago: Rand McNally. http://www.indiana.edu/~socpsy/papers/AttMeasure/attitude..htm.)

Dixon, P., Bortolussi, M., and Mullins, B. (2015). Judging a book by its cover. Scientific Study of Literature, 5(1), 23-48.

In this study, self-identified science-fiction fans and mystery fans sorted 80 randomly selected book covers from both genres into groups of their own devising; their sorts were used to identify similarity among books, and that similarity structure was used to measure similarity among participants, with cluster analysis used to find groups who sorted similarly. Group membership was related to reported knowledge about the genres, indicating the effectiveness of covers as an implicit signalling system between publishers and experienced readers of a given genre.
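
The first step of a sorting analysis, turning free sorts into a similarity structure among items, can be sketched as follows. This is an illustrative assumption about one simple way to do it (the proportion of participants who put two items in the same pile), not the authors' actual procedure, and it stops short of the participant-similarity and cluster-analysis steps.

```python
from itertools import combinations


def co_sort_proportions(sorts):
    """For each pair of items, the share of participants who grouped them together.

    `sorts` holds one entry per participant; each entry is a list of piles,
    and each pile is a collection of item labels.
    """
    together = {}
    for piles in sorts:
        for pile in piles:
            # Sorting the pile keeps each pair in a consistent (a, b) order.
            for a, b in combinations(sorted(pile), 2):
                together[(a, b)] = together.get((a, b), 0) + 1
    items = sorted({item for piles in sorts for pile in piles for item in pile})
    return {
        pair: together.get(pair, 0) / len(sorts)
        for pair in combinations(items, 2)
    }


# Three invented participants sorting three covers labelled A, B, and C.
sorts = [
    [["A", "B"], ["C"]],
    [["A", "B", "C"]],
    [["A"], ["B", "C"]],
]
print(co_sort_proportions(sorts))
```

The resulting pairwise proportions can be read as a similarity matrix and handed to whatever clustering routine you prefer.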

Otis, L. (2015). The value of qualitative research for cognitive literary studies. In The Oxford handbook of cognitive literary studies (pp. 505-524). Oxford University Press. https://pure.mpg.de/rest/items/item_3181889/component/file_3181890/content (full text)

Analyses the introspections, generated in one-to-one interviews, of 34 prominent scientists, writers, designers, and scholars, including Temple Grandin and Salman Rushdie, about the visual mental imagery they form while thinking, reading, and writing. The results are connected to debates on visual versus verbal cognitive styles, and suggest a high degree of individual variation in object versus spatial visualisation.

Open-ended questions and free-response/think-aloud protocols

Another simple way to avoid the limitations of rating scales is to ask open-ended questions. (This can also be done in a real-time fashion to avoid the drawbacks of only asking people about their experience once it’s over.) The only difficulty then is how to analyse the reams of raw verbal data you end up with.

Miall, D. S., and Kuiken, D. (2001). Shifting perspectives: Readers’ feelings and literary response. In W. van Peer and S. Chatman (Eds), New perspectives on narrative perspective, (pp. 289-302). Albany, NY: State University of New York Press. http://www.neurohumanitiestudies.eu/archivio/Shifting_Perspectives.pdf (preprint full text)

Investigates the relationships between aesthetic feeling, foregrounding, and reader perspective using a combination of measures: pre-validated questionnaires (the Literary Response Questionnaire and the Multidimensional Personality Questionnaire), reading times, and think-aloud protocols (relating to thoughts and feelings while reading), plus pre-existing and tailored discourse measures (for propositional features, discontinuities, and experiential perspective). The think-aloud protocols were analysed only informally, to add depth to the results gathered using the other measures. The findings contribute to a model of the interpretive process as a phasic cycle.

Claassen, E. (2012). Author inferences in thinking aloud. In E. Claassen, Author representations in literary reading (pp. 61-101). Amsterdam: John Benjamins. https://www.google.co.uk/books/edition/Author_Representations_in_Literary_Readi/_E4eK6KhV-YC?hl=en&gbpv=1&dq=Author%20representations%20in%20literary%20reading&pg=PA61&printsec=frontcover (partial preview)

Asks whether and how readers generate inferences about authors during the reading of narrative text, and if so, whether they can be revealed through think-aloud protocols. Includes methodological reflections on think-aloud as opposed to other methods, and what they can and cannot be used to demonstrate. In particular, the exploratory study described investigates whether inferences about authors contribute to a mental representation of the communication context, and whether a narrator’s visibility and/or particular reading strategies affect these inferences. Participants were asked to share their thoughts when they reached a black mark in the text, and also to give a summary of the text afterwards (with or without an instruction to reflect on authorial intention), as well as completing a short questionnaire on text evaluation and reading behaviour. The protocol coding procedure and statistical (chi-squared) analysis are also described in detail, and the findings include reflections on the limitations of the think-aloud method when it comes to automatically generated inferences.
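
To give a sense of what a chi-squared analysis of coded protocols involves, here is a minimal sketch that computes Pearson's chi-squared statistic for a table of counts. The table layout and the counts are invented for illustration and are not taken from the study.

```python
def chi_squared(table):
    """Pearson's chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Invented counts of coded think-aloud segments.
# Rows: narrator visible / narrator not visible.
# Columns: coded as an author inference / coded as anything else.
counts = [[30, 70], [15, 85]]
statistic, dof = chi_squared(counts)
print(round(statistic, 2), dof)
```

With one degree of freedom, the conventional 5% critical value is about 3.84, so a statistic above that would usually be reported as a significant association between the row and column categories. The sketch assumes every row and column total is non-zero.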

Gibbs, R. W., and Blackwell, N. (2012). Climbing the ladder to literary Heaven: A case study of allegorical interpretation. Scientific Study of Literature, 2(2), 199-217. https://www.jbe-platform.com/content/journals/10.1075/ssol.2.2.02gib (paywall)

The authors asked participants to read a passage of extended metaphor from a novel and immediately write out their responses to a series of prompts/questions like ‘Please describe what the infinitely tall ladder refers to or represents’, ‘What would happen if the author “loosened his grip” while on the ladder and “fell to one side”?’, and ‘Describe the bodily sensations you felt while reading the story’. Participants’ responses manifested common features of the LIFE IS A JOURNEY metaphorical field and also elaborated on it with personal readings.

Probes during reading

The trouble with think-aloud methods is how severely they disrupt the ‘normal reading experience’. Other kinds of real-time probe may be much less intrusive. They seem to have been developed primarily for studying film (though see Other physiological measures below), but would transfer well to literary studies.

Troscianko, T., Meese, T., and Hinde, S. (2012). Perception while watching movies: Effects of physical screen size and scene type. i-Perception, 3, 414-425. https://core.ac.uk/download/pdf/9624299.pdf?repositoryId=7

Develops a simple measure to track ongoing ‘presence’ (involvement) in real time: participants were prompted to report their level of presence using a line-bisection task at intervals in a 45-minute section of film (‘You should make a mark on the line to indicate how “present” you feel in the movie just before the light came on. If you feel completely “in the story,” then your mark should be at the far right of the line. If you feel that you are viewing the movie, then place your mark on the far left of the line.’). Measures of pupil dilation and reaction times (a timed button press signalled by a beep) were also obtained. The first study found correlations between presence and physical screen size, and between presence and scenes focused on faces rather than landscapes. The second study found a correlation between presence and pupil dilation (even when controlling for variations in screen luminance), though not between presence and reaction times (but overall presence levels were lower in this experiment, presumably due to increased intrusion from the three measures), suggesting that pupil dilation may be a more sensitive measure than reaction times.

Bezdek, M., and Gerrig, R. (2016). When narrative transportation narrows attention: Changes in attentional focus during suspenseful viewing. Media Psychology, 20(1), 60-89. https://www.tandfonline.com/doi/abs/10.1080/15213269.2015.1121830 (paywall)

In order to explore the dynamic role of attention in narrative transportation, a reaction time-based method of real-time probing is developed to indicate the extent to which the primary task (watching a movie excerpt) commands viewers’ cognitive resources. Participants were asked to respond to a probe tone by pressing a button on a computer keyboard. Of particular interest were ‘hot spot’ moments where potential negative outcomes are emphasised. Reaction times were found to be longer, and more probes were missed, during hot spots than during cold spots. The researchers also administered a visual recognition memory task, asking participants to identify still images from the film clips, but confounding factors were identified for this measure, and they concluded that measuring narrative recall may work better.
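
The two outcome measures described here, reaction time to the probe and the proportion of probes missed, can be summarised per segment type along the following lines. This is a minimal sketch with an invented probe log; the data format and the labels are assumptions for illustration only.

```python
from statistics import mean

# Invented probe log: (segment type, reaction time in ms, or None if missed).
PROBES = [
    ("hot", 640), ("hot", None), ("hot", 710),
    ("cold", 480), ("cold", 520), ("cold", 500),
]


def summarise(probes):
    """Mean reaction time and miss rate for each segment type."""
    summary = {}
    for kind in sorted({k for k, _ in probes}):
        times = [rt for k, rt in probes if k == kind]
        answered = [rt for rt in times if rt is not None]
        summary[kind] = {
            # Missed probes have no reaction time, so they only affect miss_rate.
            "mean_rt": mean(answered) if answered else None,
            "miss_rate": (len(times) - len(answered)) / len(times),
        }
    return summary


print(summarise(PROBES))
```

Keeping missed probes out of the mean but counting them separately matters, because slower responses and more misses are two distinct signs that the film is absorbing attention.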

Eye tracking

Eye tracking is a way of gathering lots of detailed information about cognitive processing indirectly (i.e. without needing to ask people to report on anything), and as the technology advances, this method also becomes ever less intrusive.

Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372-422. https://psychology.illinoisstate.edu/jccutti/psych369/readings/rayner1998.pdf (full text)

A review of the specifics of how eye movements reflect moment-to-moment cognitive processes involved in reading. The paper includes an outline of the basic characteristics of eye movements and fixations, some methods for measuring them, and features of eye-movement patterns like regressions, refixations, and word skipping, as well as discussion of variation in reading practices during typical development and in people with reading difficulties.
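
To make the terms regression, refixation, and word skipping a little more concrete, here is a deliberately simplified sketch that counts them from a sequence of fixated word positions. The input format is an assumption for illustration (real eye-tracking output is far richer), and 'skipped' here just means never fixated at all, which is cruder than the first-pass definitions used in the research literature.

```python
def summarise_fixations(fixated_words, n_words):
    """Count regressions, refixations, and skipped words.

    `fixated_words` is the sequence of word positions (0-based) that the eyes
    landed on, in temporal order; `n_words` is the length of the text in words.
    """
    regressions = refixations = 0
    for previous, current in zip(fixated_words, fixated_words[1:]):
        if current < previous:
            regressions += 1      # the eyes moved back to an earlier word
        elif current == previous:
            refixations += 1      # a second fixation on the same word
    skipped = sorted(set(range(n_words)) - set(fixated_words))
    return {"regressions": regressions, "refixations": refixations, "skipped": skipped}


# Invented reading of a six-word line.
print(summarise_fixations([0, 1, 1, 3, 2, 4], n_words=6))
```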

Kaakinen, J. K., and Hyönä, J. (2008). Perspective-driven text comprehension. Applied Cognitive Psychology, 22, 319-334. https://bit.ly/2JucVKQ (full text)

The texts used here are in no sense literary, but the study illustrates a combination of eye tracking with written free recall of a text, following instructions to adopt a particular perspective when reading a text version where the (ir)relevance of the text to the perspective is either transparent or opaque. The authors conclude that perspective-related prior knowledge modulates the perspective effects observed in text processing, and that signalling of (ir)relevance helps readers encode relevant information to memory.

Koops van ‘t Jagt, R., Hoeks, J., Dorleijn, G., and Hendriks, P. (2014). Look before you leap: How enjambment affects the processing of poetry. Scientific Study of Literature, 4(1), 3-24. https://core.ac.uk/download/pdf/232448462.pdf (preprint full text)

Uses eye tracking to investigate the differences in reading poetry with or without enjambments (of both the prospective and the retrospective kind), using both authentic and specially constructed examples of enjambment. The study found significant differences between conditions, favouring a dynamic model of integrative language processing.

Hoven, E., Hartung, F. C., Burke, M., & Willems, R. M. (2016). Individual differences in sensitivity to style during literary reading: Insights from eye-tracking. https://repository.ubn.ru.nl/bitstream/handle/2066/168499/168499.pdf (full text)

Reports on an eye-tracking study finding significant variation in readers’ sensitivity to foregrounding features: some readers don’t slow down at all when reading foregrounded passages, while others slow down a lot during foregrounded passages as well as for high-perplexity words (as one might expect as a result of higher processing demands).

Brain imaging

Certainly the trendiest way of doing cognitive research right now, if not the most informative, brain imaging has a solid history in the study of metaphor processing, and is now making its way into the study of specifically literary reading.

Mashal, N., Faust, M., Hendler, T., and Jung-Beeman, M. (2007). An fMRI investigation of the neural correlates underlying the processing of novel metaphoric expressions. Brain and Language, 100, 115-126. https://bit.ly/2JjN0pj (full text)

One of many fMRI studies on the processing of literal versus metaphorical language (often also using irony as an additional control) that demonstrate the involvement of the right hemisphere (usually associated with visuospatial rather than linguistic processing) in metaphor processing, this one suggests that interpretive salience—the extent to which a metaphor is novel (nonsalient) versus conventional (salient)—is actually the primary factor predicting RH involvement.

Yarkoni, T., Speer, N. K., Balota, D. A., McAvoy, M. P., and Zacks, J. M. (2008). Pictures of a thousand words: Investigating the neural mechanisms of reading with extremely rapid event-related fMRI. NeuroImage, 42, 973-987. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2572222/ (full text)

Describes a new ‘event-related reading’ method for fMRI, tested in an experiment in which reading of coherent narrative revealed widespread effects of orthographic, phonological, contextual, and semantic variables on brain activation. Results appear to replicate across previous single-word fMRI experiments, and to predict individual differences in reading comprehension.

Miall, D. S. (2011). Emotions and the structuring of narrative experience. Poetics Today, 32, 323-348. http://www.neurohumanitiestudies.eu/archivio/Emotions_PT_2011.pdf (full text)

Argues that findings from studies of evoked-response potentials (ERPs—an electrical potential recorded using EEG or EMG) indicate an early (within the first half second of response) and central role for emotion in the cognitive processing of language, including in areas such as inferencing, autobiographical memory and self-reference, anticipation, narrativising, and empathy.

Nijhof, A. D., & Willems, R. M. (2015). Simulating fiction: Individual differences in literature comprehension revealed with fMRI. PLoS One, 10(2), e0116492. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4324766/ (full text)

An fMRI study using excerpts from Dutch literary fiction, suggesting that ‘some people are mostly drawn into a story by mentalizing about the thoughts and beliefs of others, whereas others engage in literature by simulating more concrete events such as actions’.

Phillips, N. M. (2015). Literary neuroscience and history of mind: An interdisciplinary fMRI study of attention and Jane Austen. In L. Zunshine (Ed.), The Oxford handbook of cognitive literary studies (pp. 55-81). New York: Oxford University Press. https://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780199978069.001.0001/oxfordhb-9780199978069-e-4 (paywall)

Describes an experiment in which participants (18 PhD students expert in literary analysis) were asked to switch between close reading and reading for pleasure as they read Chapter 2 of Mansfield Park while lying in the fMRI scanner, finding overlapping but distinctive activation patterns between the two. The study also used fMRI-compatible eye tracking, plus heart rate and breathing measures, but it’s not clear whether those data were ever published.

Hartung, F., Hagoort, P., & Willems, R. M. (2017). Readers select a comprehension mode independent of pronoun: Evidence from fMRI during narrative comprehension. Brain and Language, 170, 29-38. https://pure.mpg.de/rest/items/item_2419703/component/file_2419702/content (full text)

‘To investigate how linguistically encoded perspective relates to cognitive perspective taking, this study involved participants listening to short literary stories using either 1st- or 3rd-person pronouns to refer to the protagonist, while undergoing fMRI. When comparing action events with 1st- versus 3rd-person pronouns, we found no evidence for a neural dissociation depending on pronoun. A split sample approach based on the self-reported experience of perspective taking revealed 3 comprehension preferences: a strong 1st-person preference, a strong 3rd-person preference, or engagement in 1st- and 3rd-person perspective taking simultaneously. Comparing brain activations of the groups revealed different neural networks. Our results suggest that comprehension is perspective-dependent, but that it depends not on the perspective suggested by the text, but on the reader’s (situational) preference.’

Other physiological measures

If you don’t have the cash or the inclination to go down the neuro route, there are other physiological measures that can serve as indirect indicators of particular types of response.

Ravaja, N., Saari, T., Kallinen, K., and Laarni, J. (2006). The role of mood in the processing of media messages from a small screen: Effects on subjective and physiological responses. Media Psychology, 8(3), 239-265. https://www.tandfonline.com/doi/abs/10.1207/s1532785xmep0803_3 (paywall)

Facial electromyography (facial EMG) measures muscle activity in the face by detecting and amplifying the electrical pulses generated when muscle fibres contract; the focus is usually on two major muscle groups associated with frowning and smiling respectively. This study used facial EMG and cardiac heartbeat intervals as indicators of valence and arousal in the emotional reception of verbal versus video messages in different starting mood conditions and with different levels of relevance to the participants. Participants were also asked to rate their emotional reactions on a valence scale consisting of 9 pictures of human faces with expressions ranging from a severe frown to a broad smile. The researchers found higher relevance, arousal, and associated muscular activity for the verbal condition when in a depressed mood, and the reverse when in a joyful, relaxed, or playful mood.

Riese, Bayer, Lauer, and Schacht. (2014). Pupillary responses to suspense in literary classics. Scientific Study of Literature, 4(2), 211-232. http://gerhardlauer.io/files/8714/2917/2052/Riese-etal_suspense.pdf (full text)

Pupil dilation is well known to correlate with arousal (see also Troscianko et al., 2012, in Probes during reading above), and also to be related to attention. This study investigated feelings of suspense using pupillometry while participants listened to recordings of passages from Fontane and Storm. Detailed suspense ratings had previously been obtained from expert and nonexpert readings using two different methods: applying an 11-point scale to each sentence in printed copies of the texts, or listening to the texts being read while noting the suspense value (also on a scale from 0 to 10) at the end of each line of the transcript, in both cases relying on subjective appraisal rather than technical reasoning. Significant correlations were found between pupil diameter and the changing arc of suspense, offering evidence for the usefulness of this technique as an indicator of suspense.
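
The core of an analysis like this is a correlation between two series measured over the same stretch of text. The sketch below computes a plain Pearson correlation on invented numbers; it is only an illustration of the idea, and leaves out everything a real pupillometry analysis needs before this point (blink removal, baseline correction, alignment of audio and recording, and so on).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented values: one mean suspense rating (0-10) and one mean pupil
# diameter (mm) per sentence of a passage.
suspense = [2.0, 3.5, 5.0, 7.5, 9.0]
pupil_mm = [3.1, 3.2, 3.2, 3.5, 3.6]
print(round(pearson_r(suspense, pupil_mm), 2))
```

A value near 1 would mean that pupil diameter rises and falls with rated suspense; a value near 0 would mean no linear relationship. The function assumes neither series is constant, since a constant series has zero variance and the coefficient is then undefined.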

Dunbar, R. I. M., Teasdale, B., Thompson, J., Budelmann, F., Duncan, S., van Emde Boas, E., & Maguire, L. (2016). Emotional arousal when watching drama increases pain threshold and social bonding. Royal Society Open Science, 3(9), 160288. https://royalsocietypublishing.org/doi/pdf/10.1098/rsos.160288 (full text)

Tests the hypothesis that emotionally arousing drama, in particular, triggers the same neurobiological mechanism (the endorphin system, reflected in increased pain thresholds) that underpins anthropoid primate and human social bonding. The results show that, compared to participants who watch an emotionally neutral film, those who watch an emotionally arousing film have increased pain thresholds and an increased sense of group bonding. Participants completed two rating scales (one measuring inclusion-of-other-in-self, the other positive and negative affect) and a pain threshold test (how long they could sit unsupported with their back against a wall) before and after viewing.

Indirect behavioural measures

On the behavioural side of things, too, indirect measures can be useful, though the benefit of not needing to rely on verbal self-report is countered by the lack of certainty that a given measure reflects the variable you think it does, and only that one.

Bryant, D. J., Tversky, B., and Franklin, N. (1992). Internal and external spatial frameworks for representing described scenes. Journal of Memory and Language, 31, 74-98. https://www.sciencedirect.com/science/article/pii/0749596X9290006J (paywall)

Explored readers’ mental models of narrative scenes described from the perspective of either an observer within the scene, surrounded by objects, or one outside the scene, with objects in front (and spatial relations described relative to other objects, not the observer). Participants were asked questions about the locations of objects, and their reaction times differed for the two conditions, with faster responses to questions of front than back in the internal condition (reflecting physiological constants), no difference between front and back questions in the external condition (where the body axis is irrelevant), and responses faster overall to questions answered from an external perspective. Subsequent experiments configured descriptions from the perspective of a central person or inanimate object, leaving the reader freer to choose what perspective to adopt. This method offers a way of ascertaining the perspective readers adopt in response to a text where perspective is unspecified or complex.

Zwaan, R. A., Radvansky, G. A., Hilliard, A. E., and Curiel, J. M. (1998). Constructing multidimensional situation models during reading. Scientific Studies of Reading, 2(3), 199-220. https://memorylab.nd.edu/assets/257376/zwaan_radvansky_hilliard_curiel_1998_scientific_studies_of_reading_.pdf (full text)

Used reading times to investigate which dimensions (time, space, physical and psychological causation, protagonist motivation, and new protagonists) of the mental model, or situation model, are monitored by readers of narrative, finding that the spatial dimension was the only one not monitored (i.e. spatial discontinuities did not lead to increased reading times). The spatial dimension was brought into line with the others in a second experiment where participants memorised a map of the building in which the described events took place, which encouraged them to monitor spatial continuity as they would otherwise not bother to do.
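
The basic logic of the reading-times method can be sketched as follows: sentences are coded for whether they continue or break a given dimension, and mean reading times are compared across the two codes. The condition labels and timings here are hypothetical, and published studies typically use regression or mixed-effects models that control for factors like sentence length; this shows only the simplest comparison.

```python
from collections import defaultdict
from statistics import mean

def mean_reading_time_by_condition(trials):
    """Average reading times (ms) for each condition label.

    `trials` is an iterable of (condition, reading_time_ms) pairs.
    """
    grouped = defaultdict(list)
    for condition, rt_ms in trials:
        grouped[condition].append(rt_ms)
    return {condition: mean(rts) for condition, rts in grouped.items()}

# Hypothetical data: sentences coded as continuing or shifting the
# temporal dimension of the situation model.
trials = [
    ("temporal_continuous", 1800),
    ("temporal_continuous", 1900),
    ("temporal_shift", 2200),
    ("temporal_shift", 2400),
]

means = mean_reading_time_by_condition(trials)
# A positive slowdown suggests that readers are monitoring this dimension.
slowdown_ms = means["temporal_shift"] - means["temporal_continuous"]
print(means, slowdown_ms)
```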

Emmott, C., Sanford, A. J., and Dawydiak, E. J. (2007). Stylistics meets cognitive science: Studying style in fiction and readers’ attention from an interdisciplinary perspective. Style, 41(2), 204-224. https://www.jstor.org/stable/10.5325/style.41.2.204?seq=1 (paywall) http://citeseerx.ist.psu.edu/viewdoc/download?doi= (additional materials)

Outlines the principles and practice of ‘depth of processing’ testing, with literary relevance in the realm of foregrounding devices in particular. Traditional methods and the newer ‘change detection’ method are introduced, the latter adapted for verbal texts from vision research on the phenomenon of change blindness. A study is described to assess the effects of ‘attention-capturing’ devices at different linguistic and narrative levels, with the surprising finding that narrative foregrounding actually reduces change detection, a result that demands further exploration.

Zacks, J. M., Speer, N. K., and Reynolds, J. R. (2009). Segmentation in reading and film comprehension. Journal of Experimental Psychology: General, 138(2), 307-327. https://pdfs.semanticscholar.org/553f/99d942c3a17a2fd5c4a2a9494a25452b4e30.pdf (full text)

Applied the reading-times method to test a hypothesis about how readers use natural boundaries in ongoing activity evoked in narrative film as an important part of comprehension. The set of experiments involved collecting reading times, segmentation judgements (press a button when you judge that one meaningful unit of activity has ended and another begun), and predictability ratings (also gathered during the viewing), plus cued recall questions to test comprehension for a film that had been pre-coded for situational changes. In this study, interactions between variables of situation change in the film and patterns of segmentation and reading times were analysed at an individual level, rather than just using group-averaged reading times; good agreement about event boundaries was found between individuals, but consistency was better within individuals.

Whalen, D. H., Zunshine, L., and Holquist, M. (2012). Theory of Mind and embedding of perspective: A psychological test of a literary ‘sweet spot’. Scientific Study of Literature, 2(2), 301-315. http://www.haskins.yale.edu/Reprints/HL1713.pdf

Reading times alongside comprehension questions of varying difficulty were used to test a hypothesis about theory of mind and literature: that three levels of perspective embedding might provide a helpful amount of information about characters without overwhelming, and that this ‘sweet spot’ could be preferred in part thanks to ease of processing. A second study imposed a fixed reading speed based on the average reading times from the first experiment. Both found that degree of perspective embedding affected cognitive engagement, with zero embedding read slower than 1-3 levels, and about the same as 4 levels, while comprehension error increased only with 5 levels.

Content analysis / discourse analysis / stylometrics

In this final section, we turn to methods directed at the texts themselves, rather than at readers’ responses to them.

Content analysis software


Coh-Metrix

Funnels a text into a range of basic descriptive indices, plus outputs for dimensions including word concreteness, syntactic simplicity, referential cohesion (words and ideas that overlap across sentences and the entire text), deep cohesion (causal and intentional connectives), verb cohesion (overlapping verbs), connectivity (explicit conveying of logical connections), temporality, and narrativity, as well as measures for similarity between adjacent sentences, lexical diversity, situation model construction, syntactic complexity and pattern density, words before the main verb (an index of working memory load), word information, and readability.

Free online trial versions of the web tools plus extensive documentation are available here: http://cohmetrix.com. A free text analysis service for corpora over 15,000 words is also on offer.
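
To give a feel for what indices of this kind measure, here is a minimal sketch of two of them: lexical diversity (as a type-token ratio) and similarity between adjacent sentences (as word-type overlap). These are deliberately simplified stand-ins written for illustration, with hypothetical function names; they are not the tool's own algorithms, which are considerably more sophisticated.

```python
import re

def tokenize(text):
    """Lowercase word tokens (ASCII letters and straight apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(text):
    """Distinct words divided by total words: a crude lexical diversity index."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def adjacent_sentence_overlap(text):
    """Mean Jaccard overlap of word types between neighbouring sentences."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    word_sets = [set(tokenize(s)) for s in sentences]
    scores = [
        len(a & b) / len(a | b)
        for a, b in zip(word_sets, word_sets[1:])
        if a | b
    ]
    return sum(scores) / len(scores) if scores else 0.0

sample = "The cat sat. The cat ran."
print(type_token_ratio(sample), adjacent_sentence_overlap(sample))
```

Note that the type-token ratio is sensitive to text length, so it is only comparable across passages of similar size.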

LIWC (pronounced Luke): Linguistic Inquiry and Word Count dictionary

Contains 4,500 words and word stems, filed in one or more of around 80 outputs: 4 general descriptor categories (total word count, words per sentence, % of words captured by the dictionary, % of words longer than 6 letters), 22 standard linguistic dimensions (e.g. % of words that are pronouns, auxiliary verbs, etc.), 32 word categories tapping psychological constructs (e.g. affect, cognition, biological processes), 7 personal concern categories (e.g. work, home, leisure activities), 3 paralinguistic dimensions (assents, fillers, nonfluencies), and 12 punctuation categories. See http://www.liwc.net/LIWC2007LanguageManual.pdf for the development and psychometric properties, and http://liwc.wpengine.com to buy. The academic version currently costs £84.95 for an unlimited licence, or £9.95 for 30 days.
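
The underlying principle of dictionary-based word counting can be sketched in a few lines. The mini-dictionary below is entirely hypothetical and is not drawn from the LIWC dictionary; the sketch assumes a convention in which a trailing asterisk marks a word stem that matches any word beginning with it, and reports each category as a percentage of total words.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary for illustration only (NOT the LIWC dictionary).
# Entries ending in "*" are treated as stems.
TOY_DICTIONARY = {
    "happy": {"affect"},
    "sad*": {"affect"},
    "think*": {"cognition"},
    "i": {"pronoun"},
    "she": {"pronoun"},
}

def categories_for(word, dictionary):
    """Return every category whose entry matches the word."""
    cats = set()
    for entry, entry_cats in dictionary.items():
        if entry.endswith("*"):
            if word.startswith(entry[:-1]):
                cats |= entry_cats
        elif word == entry:
            cats |= entry_cats
    return cats

def category_percentages(text, dictionary):
    """Percentage of all words in the text that fall into each category."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for cat in categories_for(word, dictionary):
            counts[cat] += 1
    if not words:
        return {}
    return {cat: 100 * n / len(words) for cat, n in counts.items()}

print(category_percentages("I think she was sadly happy", TOY_DICTIONARY))
```

Because a word can be filed in more than one category, the percentages need not sum to 100.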

Word norm data

Word norm databases are language corpora that have been rated by human participants along a specific dimension or set of dimensions. They provide an empirically validated way of evaluating rich linguistic data without resorting to qualitative methods, which are unavoidably subjective and usually low in replicability. The two papers below outline VAD norms (valence, arousal, and dominance—dimensions of emotional response) and sensorimotor norms (touch, hearing, smell, taste, vision, and interoception).

Lynott, D., Connell, L., Brysbaert, M., Brand, J., & Carney, J. (2019). The Lancaster sensorimotor norms: Multidimensional measures of perceptual and action strength for 40,000 English words. Behavior Research Methods, 1-21. https://europepmc.org/article/med/31832879 (full text)

Warriner, A. B., Kuperman, V., & Brysbaert, M. (2013). Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45(4), 1191-1207. https://link.springer.com/article/10.3758/s13428-012-0314-x (full text)
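
One simple way of applying norms like these to a passage is to look up each word and average the ratings, while keeping track of how many words the norms actually cover. The ratings below are hypothetical values on a 1-9 scale, invented for this sketch rather than taken from any published norms; in practice the ratings would be loaded from the published dataset, and because the VAD norms are for lemmas, words would need to be lemmatised first.

```python
import re

# Hypothetical valence ratings (1 = very negative, 9 = very positive).
# Invented for illustration; not values from any published norms.
TOY_VALENCE = {"storm": 3.0, "garden": 7.0, "death": 2.0, "smile": 8.0}

def mean_norm_score(text, norms):
    """Return (mean rating of covered words, proportion of words covered).

    The mean is None when no word in the text appears in the norms.
    """
    words = re.findall(r"[a-z]+", text.lower())
    scores = [norms[w] for w in words if w in norms]
    coverage = len(scores) / len(words) if words else 0.0
    mean_score = sum(scores) / len(scores) if scores else None
    return mean_score, coverage

score, coverage = mean_norm_score("A smile in the garden", TOY_VALENCE)
print(score, coverage)
```

Reporting coverage alongside the mean is a useful safeguard: an average based on a small fraction of a passage's words may say little about the passage as a whole.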

General advocacies

Graesser, A. C., Dowell, N., and Moldovan, C. (2011). A computer’s understanding of literature. Scientific Study of Literature, 1(1), 24-33. https://bit.ly/36khu3m (preprint full text)

An introduction to how and why to use LIWC and Coh-Metrix to analyse literary texts.

Pennebaker, J. W., and Ireland, M. E. (2011). Using literature to understand authors: The case for computerized text analysis. Scientific Study of Literature, 1(1), 34-48. https://www.jbe-platform.com/content/journals/10.1075/ssol.1.1.04pen (paywall)

Makes a general case for the importance of analysing ‘almost-invisible function words’ (pronouns, prepositions, etc.) rather than just ‘content’ words to learn about the social and psychological worlds of the authors who created them.

Quantitative methods

Zöllner, K. (1990). ‘Quotation analysis’ as a means of understanding comprehension processes of longer and more difficult texts. Poetics, 19, 293-322. https://www.sciencedirect.com/science/article/pii/0304422X9090025Z (paywall)

Quotation analysis is presented as a method for explaining and predicting which sections of well-known texts are quoted from and interpreted more frequently than others, which can in turn help us understand the semantic and structural makeup of ‘classical’ texts. The paper sets out the method used to code quotations and interpretations of VIPs (very important passages) from Gulliver’s Travels for analysis, and draws out points of interest such as the high polyvalence of VIPs and the purposes for which they are used, and how certain interpretations become classics in their own right.

Anderson, T., and Crossley, S. (2011). ‘Rue with a difference’: A computational stylistic analysis of the rhetoric of suicide in Hamlet. In M. Ravassat and J. Culpeper (Eds), Stylistics and Shakespeare’s language: Transdisciplinary approaches, pp. 192-214. London: Continuum. https://www.google.co.uk/books/edition/Stylistics_and_Shakespeare_s_Language/0q3UAwAAQBAJ?hl=en&gbpv=1&dq=%E2%80%98Rue%20with%20a%20difference%E2%80%99%3A%20A%20computational%20stylistic%20analysis%20of%20the%20rhetoric%20of%20suicide%20in%20Hamlet%E2%80%99%2C&pg=PA192&printsec=frontcover (partial preview)

Demonstrates the complementarity of stylistic and literary interpretations using lexico-semantic and corpus approaches to analyse Hamlet’s and Ophelia’s dialogue for suicidal discourse, to reveal semantic prosodies and multiword meaning structures that have been overlooked by literary critics. Using the LIWC, and with Horatio’s and Laertes’s dialogue as controls, the authors construct and test specific linguistic hypotheses about suicidal rhetoric in the play.

Egbert, J. (2012). Style in nineteenth century fiction: A multi-dimensional analysis. Scientific Study of Literature, 2(2), 167-198. https://www.jbe-platform.com/content/journals/10.1075/ssol.2.2.01egb (paywall)

This paper presents a study of a large, principled corpus of 19th-century fiction using a multidimensional corpus stylistics approach which aims to consider ‘the full set of core linguistic features’. The key dimensions of variation are interpreted as ‘thought presentation versus description’, ‘abstract exposition versus concrete action’, and ‘dialogue versus narrative’, and can be used to compare authorial styles and assess the level of variation within novels by the same author.

Nichols, R., Lynn, J., and Purzycki, B. G. (2014). Toward a science of science fiction: Applying quantitative methods to genre individuation. Scientific Study of Literature, 4(1), 25-45. https://www.jbe-platform.com/content/journals/10.1075/ssol.4.1.02nic (paywall)

To address the question of what genre is, and what distinguishes a genre like science fiction from other genres, this article presents a method of quantitative genre profiling that uses the word categories of the LIWC analyses to test a well-known literary theory in which science fiction offers particular representations of cognition and estrangement, as distinct from fantasy and mystery. Following presentation of the findings on similarities and differences between the three genres, the paper also includes a general discussion of the value of conducting quantitative empirical work on literature and of testing the hypotheses of literary theory.

Bruhn, M. J. (2018). Citation analysis: An empirical approach to professional literary interpretation. Scientific Study of Literature, 8(1), 77-113. https://www.jbe-platform.com/content/journals/10.1075/ssol.17009.bru (paywall)

‘This paper presents a series of historiometric studies that exemplify the value of “citation analysis” as an empirical approach to professional literary-critical interpretation, especially with respect to the question of the “literariness” of literary texts. Specifically, the studies show that professional interpreters of Wordsworth’s poetry, across more than a century of time and despite widely varying critical approaches, tend to pay more attention to and therefore more frequently cite lines that involve prospective enjambments. Lines involving nominative noun phrase and retrospective enjambments, however, did not reveal the same correlation with frequency of citation. The studies thus suggest that literariness does indeed have a relatively stable textual component that may be discriminated through citation analysis of professional interpretations of individual literary texts by authors writing in distinct genres of literature and in different periods in literary history.’ The method involves collecting interpretations, counting citations, and performing statistical analysis to ask whether the resulting frequencies reveal significant patterns distinguishing the most from the least cited passages.
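
The counting step of such a method might be sketched as below: each interpretation is reduced to the set of line numbers it cites, citations are tallied per line, and lines with and without a textual feature of interest are then compared. The data, the feature coding, and the function name are hypothetical, and this is not the paper's own procedure; a real study would follow the comparison with a significance test.

```python
from collections import Counter
from statistics import mean

def citation_counts(interpretations, n_lines):
    """Count how many interpretations cite each line of a poem.

    `interpretations` is a list of lists of 1-indexed line numbers.
    A line cited twice within one interpretation is counted once.
    """
    counts = Counter()
    for cited_lines in interpretations:
        counts.update(set(cited_lines))
    return [counts[i] for i in range(1, n_lines + 1)]

# Hypothetical data: four critical interpretations of a four-line passage,
# and a hand-coded flag for whether each line has the feature of interest.
interpretations = [[1, 2], [2, 3], [2], [4, 2, 2]]
has_feature = [False, True, False, True]

counts = citation_counts(interpretations, n_lines=4)
with_feature = mean([c for c, f in zip(counts, has_feature) if f])
without_feature = mean([c for c, f in zip(counts, has_feature) if not f])
print(counts, with_feature, without_feature)
```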

Dávid-Barrett, T., Carney, J., Rotkirch, A., & Izquierdo, I. B. (2019). Social Network Complexity in Mozart’s Marriage of Figaro. In Evolution and Popular Narrative (pp. 106-118). Brill Rodopi. https://brill.com/view/book/edcoll/9789004391161/BP000006.xml (paywall)

This study investigates the number and types of dyadic interactions in the libretto for The Marriage of Figaro, and analyses representations of the social network during the opera. The results suggest that part of this opera’s enduring appeal is its narrative and structural solutions to representing complex and ecologically valid social interactions onstage.
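
A minimal sketch of how dyadic interactions might be tallied from a coded libretto or script: each interaction is recorded as a pair of characters, pairs are counted regardless of direction, and each character's number of distinct partners is derived from the result. The interaction list is hypothetical and the coding scheme is an assumption made for this example, not the one used in the study.

```python
from collections import Counter

def dyad_counts(interactions):
    """Count interactions per unordered pair of characters."""
    counts = Counter()
    for a, b in interactions:
        if a != b:  # ignore self-addressed lines such as soliloquies
            counts[frozenset((a, b))] += 1
    return counts

def degree(dyads):
    """Number of distinct interaction partners for each character."""
    partners = {}
    for pair in dyads:
        a, b = tuple(pair)
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {name: len(others) for name, others in partners.items()}

# Hypothetical coded interactions (speaker, addressee); invented for
# illustration rather than taken from the libretto.
interactions = [
    ("Figaro", "Susanna"),
    ("Susanna", "Figaro"),
    ("Count", "Susanna"),
]

dyads = dyad_counts(interactions)
print(degree(dyads))
```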

Qualitative methods

Braun, V., and Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77-101. https://uwe-repository.worktribe.com/preview/1043068/thematic_analysis_revised_-_final.pdf (preprint full text) 

Introduces thematic analysis as an accessible and theoretically flexible way of analysing qualitative data, and clearly outlines the analytical process (including setting out the difference between codes and themes, and how to draw conclusions from the analysis) as well as pitfalls to avoid.

Schreier, M. (2012). Qualitative content analysis in practice. London: Sage. https://www.amazon.co.uk/Qualitative-Content-Analysis-Practice-Schreier/dp/1849205930 (good chunks of the key sections are available when signed in) 

Covers what qualitative content analysis is, how blurred the boundary is between it and its quantitative sister, and how the coding process works, from how to build and evaluate a coding frame to carrying out the main analysis and presenting the results. 

Other methods

A creative category in need of expansion; watch this space! (or contact us if you know of anything fun that could be added)

Manin, D. Y. (2012). The right word in the left place: Measuring lexical foregrounding in poetry and prose. Scientific Study of Literature, 2(2), 273-300. https://www.researchgate.net/profile/Dmitrii_Manin/publication/253235395_The_right_word_in_the_left_place_Measuring_lexical_foregrounding_in_poetry_and_prose/links/0046352773560e7ac3000000/The-right-word-in-the-left-place-Measuring-lexical-foregrounding-in-poetry-and-prose.pdf (full text)

Uses an online literary game where players guess words in fragments of real texts to quantify two aspects of lexical foregrounding—unpredictability and constrainedness (or irreplaceability)—as they help characterise the poetry/prose distinction and illuminate the formal constraints of different poetic forms.


And last but not least, with all datasets comes the need to analyse them—and once you want to assess relative probabilities, and hence infer causation, statistics are what you’ll need. Here are some relatively painless ways to get started.

Hoover, D. L. (2008). Quantitative analysis and literary studies. In S. Schreibman and R. Siemens (Eds), A companion to digital literary studies, Ch 28. Oxford: Blackwell. http://www.digitalhumanities.org/companionDLS/

This article argues for and sketches out the basic principles of quantitative stylometrics, but also includes examples of statistical analysis. (The book also contains chapters on many other aspects of digital literary studies, including a bibliography of online tools and archives.)


This online archive of Jonathan Marchini’s statistics teaching notes provides introductions to basic statistical measures; probability; specific probability models including the binomial, Poisson, and normal distributions; hypothesis testing (including chi-squared tests); error probability; and confidence intervals.
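
As a small worked example of hypothesis testing, here is a sketch of a chi-squared test of independence on a 2x2 table of counts, using only the Python standard library. The counts are hypothetical. The sketch applies no continuity correction and assumes that every row and column total is greater than zero; the p-value formula used is valid only for one degree of freedom, as in a 2x2 table.

```python
import math

def chi_squared_2x2(table):
    """Chi-squared statistic and p-value for a 2x2 table of counts.

    `table` is ((a, b), (c, d)). Assumes all row and column totals are
    non-zero. No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: readers who did or did not notice a textual change,
# in a foregrounded versus a neutral condition.
table = ((20, 10), (10, 20))
chi2, p = chi_squared_2x2(table)
print(f"chi-squared = {chi2:.2f}, p = {p:.4f}")
```

A small p-value indicates that the pattern of counts would be unlikely if the two variables were unrelated; it does not by itself establish why they are related.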

Stanford’s four statistics MOOCs are a good way to familiarise yourself with the basics of describing and drawing conclusions from data:

Intro to statistics with Sebastian Thrun: https://www.class-central.com/mooc/361/udacity-intro-to-statistics

Statistics: The science of decisions, with Sean Laraway, Ronald Rogers, and Katie Kormanik: https://www.class-central.com/mooc/631/udacity-statistics-the-science-of-decisions

Intro to descriptive statistics, with Sean Laraway, Ronald Rogers, and Katie Kormanik: https://www.class-central.com/mooc/2309/udacity-intro-to-descriptive-statistics

Intro to inferential statistics (follows on from descriptive stats), with Sean Laraway, Ronald Rogers, and Katie Kormanik: https://www.class-central.com/mooc/2310/udacity-intro-to-inferential-statistics

They also offer an intro to data science, and courses on data analysis with the open-source package R and programming foundations with Python. See https://www.classcentral.com/university/stanford for the full list.