New paper: Informal communication and archaeological data work

The first peer-reviewed paper deriving from my dissertation is finally published in Open Archaeology! It showcases qualitative research on scholarly communication within archaeological projects — specifically the role of informal communication styles in archaeological knowledge production, and how they complement more formally structured documentary media.

On the Value of Informal Communication in Archaeological Data Work
https://doi.org/10.1515/opar-2024-0014

Archaeological data simultaneously serve as formal documentary evidence that supports and legitimizes chains of analytical inference and as communicative media that bind together scholarly activities distributed across time, place, and social circumstance. This contributes to a sense of “epistemic anxiety,” whereby archaeologists require that data be objective and decisive to support computational analysis but also intuitively understand data to be subjective and situated based on their own experiences as participants in an archaeological community of practice. In this article, I present observations of and elicitations about archaeological practices relating to the constitution and transformation of data in three cases in order to articulate this tension and document how archaeologists cope with it. I found that archaeologists rely on a wide variety of situated representations of archaeological experiences – which are either not recorded at all or occupy entirely separate and unpublished data streams – to make sense of more formal records. This undervalued information is crucial for ensuring that relatively local, bounded, and private collaborative ties may be extended beyond the scope of a project and, therefore, should be given more attention as we continue to develop open data infrastructures.

New paper: Exploring collaborative practices in archaeological software development

I’m happy to announce that Joe Roe and I just published a paper in Internet Archaeology that explores collaborative practices in archaeological open source software development. This paper has been in development for a while, and we’re glad to finally release our work.

To briefly summarize: we investigated the under-explored practices involved in research software engineering in archaeology, with an emphasis on collaborative experiences involved in open source software development. We identified not only what kinds of software archaeologists are making, but how archaeologists create these tools as part of a broader community of practice. We conducted exploratory data analysis and network analysis on data from open-archaeo, supplemented with additional data pulled from the GitHub API, to trace how archaeologists use various languages, forges, licenses and supporting features (e.g. issues, stars, pull requests), and to discern trends regarding projects’ longevity, degree of community participation, and overall structure of collaborative ties.

Open archaeology, open source? Collaborative practices in an emerging community of archaeological software engineers

Surveying the first quarter-century of computer applications in archaeology, Scollar (1999) lamented that the field relied almost exclusively on “hand-me-down” tools repurposed from other disciplines. Twenty five years later, this is no longer the case: computational archaeologists often find themselves practicing the dual roles of data analyst and research software engineer (Baxter et al. 2012; Schmidt and Marwick 2020), developing and applying new tools that are tailored specifically to archaeological problems and archaeological methods. Though this trend can be traced to the very earliest days of the field (Cowgill 1967), its most recent manifestation is distinguished by its apparent embrace of practices from free and open source software. Most prominently, since around 2015, there has been a rapid uptake of workflow tools designed for open source development communities, such as the version control system git and associated online source code management platforms (e.g. GitHub, GitLab). These tools facilitate collaboration among developers and users of open source software using patterns that can diverge quite radically from conventional scholarly norms (Tennant et al. 2020).

In this paper, we investigate modes of collaboration in this emerging community of practice using ‘open-archaeo’, a curated list of archaeological software, and data on the activity of associated GitHub repositories and users. We conduct an exploratory quantitative analysis to characterize the nature and intensity of these collaborations and map the collaborative networks that emerge from them. We document uneven adoption of open source collaborative practices beyond the basic use of git as a version control system and GitHub to host source code. Most projects do make use of collaborative features and, through shared contributions, we can trace a collaborative network that includes the majority of archaeologists active on GitHub. However, a majority of repositories have 1–3 contributors, with only a few projects distinguished by an active and diverse developer base. Direct collaboration on code or other repository content, as opposed to the more passive, social media-style interaction that GitHub supports, remains very limited. In other words, there is little evidence that archaeologists’ adoption of open source tools (git and GitHub) has been accompanied by the decentralized, participatory forms of collaboration that characterise other open source communities. On the contrary, our results indicate that research software engineering in archaeology remains largely embedded in conventional professional norms and organizational structures of academia.

Finished my dissertation!

I finally defended my doctoral dissertation a few weeks ago, and after 7 years I’m happy to put it out into the world: https://doi.org/10.5281/zenodo.8373390

To briefly summarize: I observed and interviewed archaeologists while they worked, focusing on how they collaborate to produce information commons within relatively small, bounded communities. I relate these observations to issues experienced when sharing data globally on the web using open data platforms. This is part of an effort to reorient data sharing (and other aspects of open science) as a social, collaborative, communicative, and commensal experience.

Many thanks to my supervisor, Costis Dallas, for being such a great mentor, and to Matt Ratto and Ted Banning for their constant constructive feedback. And special thanks to the external examiners, Jeremy Huggett and Ed Swenson, for critically engaging with my work.

Archaeological data work as continuous and collaborative practice

This dissertation critically examines the sociotechnical structures that archaeologists rely on to coordinate their research and manage their data. I frame data as discursive media that communicate archaeological encounters, which enable archaeologists to form productive collaboration relationships. All archaeological activities involve data work, as archaeologists simultaneously account for the decisions and circumstances that framed the information they rely on to perform their own practices, while anticipating how their information outputs will be used by others in the future. All archaeological activities are therefore loci of practical epistemic convergence, where meanings are negotiated in relation to communally-held objectives.

Through observations of and interviews with archaeologists at work, and analysis of the documents they produce, I articulate how data sharing relates distributed work experiences as part of a continuum of practice. I highlight the assumptions and value regimes that underlie the social and technical structures that support productive archaeological work, and draw attention to the inseparable relationship between the management of labour and data. I also relate this discursive view of data sharing to the open data movement, and suggest that it is necessary to develop new collaborative commitments pertaining to data publication and reuse that are more in line with disciplinary norms, expectations, and value regimes.

Comments on a recent “science mapping” paper

A new paper examining published research outputs to describe the makeup of archaeology as a discipline just dropped, and it’s getting a lot of positive attention.

Sinclair, A. 2022 Archaeological Research 2014 to 2021: an examination of its intellectual base, collaborative networks and conceptual language using science maps, Internet Archaeology 59. https://doi.org/10.11141/ia.59.10

I see some issues with the paper that I think are worth addressing. This is not a comprehensive review, more like a commentary based on my own interests and experiences. I welcome dialog with the author and anyone else who is interested in discussing this further.

I’m a bit hesitant to post this because I do not know the author, Anthony Sinclair, and I don’t want to come across as too harsh. I intentionally did not look him up prior to writing this post. This is a commentary on the paper, not the person behind it.

Simplistic description of network graphs

My first criticism is about the surface-level description of the network visualizations. Network visualizations are one of many ways of rendering a dataset, and the paper would have really benefited from more multifaceted statistical analysis of the underlying data. For example, it would have been nice to see the distribution of nodes’ centrality scores compared against some other variable, such as gender. The author reverts to a plain and simple citation count in his analysis of gender disparities, and misses a great opportunity to draw upon centrality measurements as a key indicator of inequitable aspects of professional development across the genders.
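To make the suggestion concrete, here is a minimal sketch of the kind of comparison I have in mind: compute degree centrality for each node, then summarize it by a node attribute. The edge list and the attribute labels ("x" and "y") are invented for illustration and have nothing to do with the paper’s data, and degree centrality is only one of several centrality measures that could be used.

```python
from collections import defaultdict
from statistics import median

# Hypothetical co-citation edges and node attributes, invented for illustration.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "F")]
attribute = {"A": "x", "B": "y", "C": "x", "D": "y", "E": "y", "F": "x"}

def degree_centrality(edges):
    """Return each node's degree divided by (n - 1), the maximum possible degree."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}

def centrality_by_attribute(centrality, attribute):
    """Collect the centrality scores of the nodes sharing each attribute value."""
    scores = defaultdict(list)
    for node, value in centrality.items():
        scores[attribute[node]].append(value)
    return {key: sorted(vals) for key, vals in scores.items()}

centrality = degree_centrality(edges)
by_attribute = centrality_by_attribute(centrality, attribute)
for key, vals in sorted(by_attribute.items()):
    print(key, "median degree centrality:", median(vals))
```

With real data, the per-group lists could be plotted as distributions or compared statistically rather than reduced to a single median.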

The author also annotated the graphs with diagrams that look kind of like a compass rose. I only found one instance in the text describing them and their function:

“In certain maps, the key dimensions that affect the layout of the maps are identified in one of the upper corners of the map.”

One of the network visualizations from the original paper. Note the compass rose in the top left corner.

What do these compass roses actually represent? Are they derived from the author’s interpretations, or are they derived from the dataset? This is unclear. In either case, I would have liked to understand the reasoning or approach for identifying the extremes at each end of the gradients, and how a node’s situation along the scale is determined.

Framing of science and non-science

This paper perpetuates an outdated dichotomy between science and the arts and humanities. It never really defines either of these things, or attempts to reconcile the terms used by the citation databases against the author’s own notion of what science and the arts and humanities mean. But these terms appear in the compass roses and in the descriptions of the graph visualizations as if their meanings are self-evident.

Also very interesting is that the author says a lot about science but not much about arts and humanities. In fact, it may be more apt to say that this paper describes science and non-science, rather than science and some other cohesive entity. The author describes journals, topics and methods that he identifies as scientific, but does not do this at all for entities that he relates to the arts and humanities. The lack of distinction between these terms reveals a lack of willingness to treat the things they represent as things in themselves rather than as a lack of something, namely, science.

The paper also relies on really outdated visions of the character of various disciplines and of archaeology specifically. As far as I can tell, it relies on two sources:

  • Pantin, C.F.A. 1968 The Relations Between the Sciences, Cambridge: Cambridge University Press.
  • Becher, T. and Trowler, P.R. 2001 Academic Tribes and Territories, 2nd Edition, Maidenhead: Society for Research into Higher Education/Open University Press.

Becher is extremely outdated and falls within a period when scientists (especially social scientists, including Binford) were aching to make their disciplines seem more scientific. So there is a strange value judgement at play, and they often failed to capture the reality of how science actually works. The other source is mentioned only very briefly in passing, but follows a similar essentialist rhetoric regarding the fundamental nature of specific disciplines, which rubs me the wrong way. A lot of excellent work that examines the pragmatic reality of scientific practice, which highlights contradictions and misrepresentations, and that presents the fluidity across disciplines rather than hard distinctions, is simply ignored (e.g. Latour and Woolgar’s Laboratory Life, Latour’s Pandora’s Hope, Knorr-Cetina’s Epistemic Cultures, Bowker’s Science on the Run, to name just a few).

Critical reflection on what the networks actually represent

The value of analyzing citation networks is unclear to me, and the author doesn’t really convince me that they represent a “window on the shape of the discipline”. Citation networks depict clusters of citations, but the jump to making these clusters meaningful in relation to some broader social or epistemic phenomenon is never really articulated. Moreover, the author indicates that he applied the Girvan-Newman method for identifying clusters, but doesn’t really incorporate the means through which this algorithm operates, including its limitations, into the analysis. Clusters do not simply exist; they are highlighted through some method, which impacts what we see.
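To illustrate that last point, here is a rough sketch of how the Girvan-Newman method works: it repeatedly removes the edge with the highest betweenness, and each time the graph splits it produces a new, finer partition. This is my own simplified version in plain Python, not the implementation used in the paper; the function names and the toy graph are made up, and ties between edges are broken arbitrarily.

```python
from collections import defaultdict, deque

def components(adj):
    """Return the connected components of an undirected graph as sorted lists."""
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            node = queue.popleft()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        result.append(sorted(comp))
    return result

def edge_betweenness(adj):
    """Score each edge by the shortest paths that cross it (Brandes-style)."""
    score = defaultdict(float)
    for source in adj:
        dist, paths, preds = {source: 0}, defaultdict(float), defaultdict(list)
        paths[source] = 1.0
        order, queue = [], deque([source])
        while queue:  # breadth-first search counting shortest paths
            node = queue.popleft()
            order.append(node)
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    paths[nb] += paths[node]
                    preds[nb].append(node)
        credit = defaultdict(float)
        for node in reversed(order):  # push path credit back towards the source
            for p in preds[node]:
                share = paths[p] / paths[node] * (1.0 + credit[node])
                score[frozenset((p, node))] += share
                credit[p] += share
    return score

def girvan_newman_levels(edges):
    """Yield a new partition each time removing an edge splits the graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    current = len(components(adj))
    while any(adj.values()):
        score = edge_betweenness(adj)  # recomputed after every removal
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
        parts = components(adj)
        if len(parts) > current:
            current = len(parts)
            yield parts

# Toy graph: two triangles joined by a single bridge (c-d).
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
             ("d", "e"), ("e", "f"), ("d", "f")]
levels = list(girvan_newman_levels(toy_edges))
print(levels[0])  # the first split separates the two triangles
```

The method does not return "the" clusters. It returns a whole hierarchy of partitions, ending with every node on its own, and the analyst has to decide where to cut it. That choice shapes what the resulting map appears to show.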

After writing an initial draft I found that the author has done other relevant scientometric work with a more targeted scope:

Sinclair, A. (2020). From Specialty to Specialist: a citation analysis of Evolutionary Anthropology, Palaeolithic Archaeology and the Work of John Gowlett 1970-2018. In J. Cole, J. McNabb, M. Grove, & R. Hosfield (Eds.), Landscapes of Evolution: Studies in Honour of John Gowlett (pp. 175-201). Oxford: Archaeopress.

Although I do not have access to this paper, it is likely to be much more effective. These kinds of analyses tend to work better when a more specific objective is outlined, because it’s easier to ground the relationships within a specific set of experiences, rather than relying too much on generalizations and abstractions.

Analysis of language and push for standardized terminology

I like the analysis of language and keywords. I think it’s the strongest part of the paper, and there’s a lot of potential there. However, the author draws this into a push towards standardization, which seems kind of forced and not relevant to the analysis of keywords across the literature. The author frames the diverse array of terminology as a problem that needs to be overcome, rather than a very interesting aspect of archaeological research practice with its own benefits and affordances. Standards implemented in harder sciences are stated as goals worth attaining, but I’m left unconvinced that this is really worth doing based on the findings presented here.