Guest blog: Should we share qualitative data?

Conversation analysts soon accumulate many hours of tapes and transcripts; usually these have been collected on the understanding that they are for the researcher’s own use, with permission only to publish extracts anonymously. But should such data be open to other researchers? Jack B. Joyce, Catrin S. Rhys, Bethan Benwell, Adrian Kerrison, Ruth Parry summarise here the arguments examined in a recent paper.

Jack B Joyce, Catrin S Rhys, Bethan Benwell, Adrian Kerrison and Ruth Parry

Data sharing has been central to the development of Conversation Analysis (CA) and is considered the bread-and-butter of the approach. Here we are talking about data sharing generally, but we should point to Elliot Hoey and Chase Raymond’s Rolsi blog, where they critically reflect on ‘Classic data’ and some of the pros and cons of drawing on a narrow range of data. CA has a distinctive relationship with data sharing. It is not an afterthought or a response to the Open Science movement; rather, data sharing and ‘secondary’ analysis have been “baked in” to the CA approach from its inception.

Harvey Sacks (often!) illustrates this point:

“It was not from any large interest in language or from some theoretical formulation of what should be studied that I started with tape-recorded conversations, but simply because I could get my hands on it and I could study it again and again, and also, consequentially, because others could look at what I had studied and make of it what they could, if, for example, they wanted to be able to disagree with me” 

(Sacks, 1984, p. 26, emphasis added). 

Ongoing debates

While the replication crisis and issues of data fabrication have not really touched the shores of qualitative research, the waves of the Open Science movement have brought these rumbling debates into sharp focus, and many funders and publishers now encourage or mandate data sharing. The arguments for reusing qualitative data include checking findings, fostering trust in science, enhancing researcher training and, importantly, producing new findings in a way that is cost-effective for researchers and avoids unnecessarily burdening participants (Kuula, 2011; DuBois et al., 2018). All in all, sharing data maximises the social value of publicly funded research.

Nonetheless, there are concerns about sharing qualitative data when it is highly sensitive. On-high mandates to share data can mean that we fail to protect our participants’ anonymity, or fail to prevent subsequent researchers from misinterpreting data that is highly contextual. Moreover, many of the policies and repositories (locally, nationally and internationally) are designed with quantitative research in mind. The extent to which repository specialists or ethics panels are prepared to advise qualitative researchers varies; Mozersky and colleagues (2020) find that those groups felt responsibilities lay elsewhere.

Such is the difficulty of navigating the choppy waters of data sharing that many qualitative researchers understandably opt not to share their data. Can CA sail in and save the day? Not really; but CA can speak to some of these epistemological, ethical and practice issues and tilt the balance toward data sharing.

Practices of sharing CA data

Conversation analysts have, for the longest time, been sharing their data. You can see three distinct categories of sharing: making a whole corpus available to authorised scholars (as do Jepson et al. (2017) with their ‘one-in-a-million’ corpus of primary care interaction); sharing among a small number of fellow researchers working on the same project (as often happens in close-knit groups of colleagues); and, most publicly but much more selectively, making selected extracts available in published articles.

For conversation analysts, the chief drivers for sharing data have been to add a level of rigour to the analytic findings and to give others the ability to check the analysis. Seeing the evidence means that “others could look at what I had studied and make of it what they could, if, for example, they wanted to be able to disagree with me” (Sacks, 1984, p. 26, emphasis added).

Primary/Secondary distinction of data

What can CA add to these ongoing debates about sharing qualitative data?

In a new publication we argue that the data CA draws on is inherently sensitive even if not formally, legally protected. A person’s openly available actions during a meal may typically be regulated quite differently from their private medical records, but CA researchers quickly learn how prized and intimate mundane interactions can become when we request access to them. Although the ethical concerns associated with handling interactional data are not uniquely experienced by CA researchers, we argue that our experiences in needing to share sensitive data means we have much to contribute to ethical discussions around participant consent regarding data access and practices for maintaining anonymity even in widely-available recordings and transcripts.

An anonymised image of the kind typically used in a journal article

On epistemology, the highly contextual nature of qualitative data, and thus the question of whether that data can ever be reinterpreted in secondary analyses, has dictated the direction of data sharing discussions. We argue that CA’s endogenous conception of ‘context’ (in brief: that CA does not entertain or assume contextual explanations of phenomena, which was debated at length by Billig, Schegloff and Wetherell in the late 90s) means that data is always constituted for the first time at the point of analysis. Therefore, the distinction between ‘primary’ and ‘secondary’ data which directs the debates on whether to share qualitative data does not exist in CA. In fact, a more significant concern for CA researchers should be the relationship between researcher and data at each analysis (where data are “constituted for the first time” in each project) because of the way that we draw on our own memberships to recognise practices despite the CA appeal to ‘unmotivated looking’. (As an example, see the Twitter thread on this point and related concerns about “the whiteness in the erasure of the researcher” by Edward Reynolds.)

There is a strong practical legacy too: CA’s baked-in approach to sharing data during and after a research project should widen the existing debates around qualitative data sharing and can inform how other approaches might share their data.


CA is unique in its conception of ‘context’ and the real emphasis that is placed on data sharing as part of a rigorous approach. Our rich history (some of which we might now critically reflect on) means that we have experience and knowledge which can usefully inform these debates and shape others’ thinking on how and why data sharing can be at the bow of research design. 

So, should we share qualitative data? Well, you’ll have to read the article to find out… 

Joyce, J. B., Douglass, T., Benwell, B., Rhys, C. S., Parry, R., Simmons, R. & Kerrison, A. (2022). Should we share qualitative data? International Journal of Social Research Methodology.


