Finished my dissertation!

I finally defended my doctoral dissertation a few weeks ago, and after 7 years I’m happy to put it out into the world:

To briefly summarize: I observed and interviewed archaeologists while they worked, focusing on how they collaborate to produce information commons within relatively small, bounded communities. I relate these observations to issues experienced when sharing data globally on the web using open data platforms. This is part of an effort to reorient data sharing (and other aspects of open science) as a social, collaborative, communicative, and commensal experience.

Many thanks to my supervisor, Costis Dallas, for being such a great mentor, and to Matt Ratto and Ted Banning for their constant constructive feedback. And special thanks to the external examiners, Jeremy Huggett and Ed Swenson, for critically engaging with my work.

Archaeological data work as continuous and collaborative practice

This dissertation critically examines the sociotechnical structures that archaeologists rely on to coordinate their research and manage their data. I frame data as discursive media that communicate archaeological encounters, which enable archaeologists to form productive collaborative relationships. All archaeological activities involve data work, as archaeologists simultaneously account for the decisions and circumstances that framed the information they rely on to perform their own practices, while anticipating how their information outputs will be used by others in the future. All archaeological activities are therefore loci of practical epistemic convergence, where meanings are negotiated in relation to communally-held objectives.

Through observations of and interviews with archaeologists at work, and analysis of the documents they produce, I articulate how data sharing relates distributed work experiences as part of a continuum of practice. I highlight the assumptions and value regimes that underlie the social and technical structures that support productive archaeological work, and draw attention to the inseparable relationship between the management of labour and data. I also relate this discursive view of data sharing to the open data movement, and suggest that it is necessary to develop new collaborative commitments pertaining to data publication and reuse that are more in line with disciplinary norms, expectations, and value regimes.

Comments on “The rise and fall of peer review”

A Substack post about peer review is getting a lot of attention, and I’m here to rant about it. Basically the post is calling out the peer review process as a terrible and broken system. And it is. But the author’s rhetoric about it is kind of problematic.

1. Peer review is not an experiment.

The author claims that it is, but contradicts himself straight away:

The experimental design wasn’t great; there was no randomization and no control group. Nobody was in charge, exactly, and nobody was really taking consistent measurements. And yet it was the most massive experiment ever run, and it included every scientist on Earth.

These are not just things that make an experiment bad, they are things that preclude peer review from being an experiment altogether. Experiments are run on samples, they are run with intent, they are performed in controlled environments. As someone who calls himself an experimental psychologist and who calls his blog “experimental history”, he really extends the term experiment in weird ways. This use of the term is like referring to the “experiment of democracy”, basically just grand rhetoric for “we’re figuring things out and learning as we go”.

The author also seems to think of the experiment of peer review as a pass/fail test, which again, is not what experiments are for. He sets bars for what successful scientific evaluation ought to look like, and measures his experiences of peer review against it. But this is not an experiment, this is qualitative assessment. There’s nothing wrong with that, but it’s troubling how the author wraps his proclamation that peer review is bad and should be abolished within some phony hearkening to science-core.

2. General discourse is not enough to validate truth statements.

Various parts of the post indicate that the author considers science to be the evaluation of statements of truth, which can only be verified by their fidelity to observed reality. Ok, fair enough. But he refers to Einstein’s large body of non-reviewed work as an argument for relying on discourse among educated fellows as an efficient way of evaluating the quality of scientific work. Einstein’s apparent genius is cemented in popular imagination, but he also happened to be wrong about some things, and his example is not reason enough to abolish peer review. Moreover, the author does not consider that the general acceptance of non-reviewed ideas that happened to be wrong is a counterpoint that clearly refutes his main point.

3. What about non-experimental methods?

The author has a huge blind spot for non-experimental methods. He suggests that if the results of a scientific analysis can be replicated, then that is good enough for acceptance into an authoritative canon of truth. Moreover, he indicates that work that cannot replicate is “a whole lotta money for nothing”, basically a waste of time and resources. But a lot of science cannot be replicated, by virtue of the fact that science doesn’t always follow experimental protocols that allow for replication tests to be performed. Fantastic and valuable work that relies on non-experimental heuristics, including a lot of work in the social sciences and humanities, climate science, ecology, astronomy and various other fields, is left in the lurch. His take on non-replicability in these disciplines reads a lot like the unethical and ironically non-replicable Sokal hoaxes that serve as the basis for unhinged right-wing attacks on the social sciences and humanities.

This also contradicts the author’s hearkening to discourse among learned men of olde as a way of dealing with problems relating to peer review. Opening up the comments section, even if just limited to a curated list of credentialed scholars, is not the same as conducting independent replication studies. I think the reasoning behind this link is that if other people have experienced similar phenomena in their own labs, then it’s more likely to be accepted as true. But this is not the same as replication under the same conditions; it is just the same uncontrolled, consensus-based evaluation as peer review, but with an open filter.

4. Peer-review in context

I agree with many of the things that the author is saying. Yes, there are many ways in which peer review is broken and could be improved. For instance, I agree with the notion that peer reviewers do not dive deep enough into the data and aren’t always critical enough. But I think that this is because most people are unprepared to do so, either because they do not have access to data or do not know how to work with statistics or read code. Moreover, certain journals like PNAS give preferential treatment to certain authors over others, and there are definitely major issues with racism and sexism in the evaluation process. Open peer review does not resolve these issues, namely because it treats peer review in isolation.

The only way to make peer review better is by instilling good scholarly practices in the next generation of scholars. However, this is inhibited by structural issues, such as the tight job market that favours quantity of peer reviewed articles over any other factor, and the general prestige economy of academia. These are the root issues. The foul state of peer review is one aspect of this mess, alongside structural racism, sexism and transphobia, the sheer expense of obtaining an advanced degree and excelling in the years immediately post-PhD, and the pressures to conform to trends that get you funding. You cannot separate the problems with peer review from these issues. Yet somehow the author manages to completely sidestep these concerns, identifying the broken peer review system as a purely epistemic problem, rather than a problem with tangible and far-reaching social implications.

Open science and its weird conception of data

In an early draft of one of my dissertation’s background chapters I wrote a ranty section about notions of data held by the open science movement that I find really annoying. I eventually excised this bit of text, and while it isn’t really worth assembling into any publication, I thought it may still be worth sharing here. So here is a lightly adapted version, original circa May 2022.

Abstract submitted for DAB23 (Bern, Switzerland)

Today I submitted an abstract to present at the DAB23 colloquium hosted by the Bern Computational and Digital Archaeology lab. The conference is about “advancing open research into the next decade” and my paper is titled Documenting the collaborative commitments that support data sharing within archaeological project collectives. Here is the abstract:

Archaeological research is inherently collaborative, in that it involves many people coming together to examine a material assemblage of mutual interest by implementing a variety of tools and methods in tandem. Independent projects establish organizational structures and information systems to help coordinate labour and pool information derived thereof into a communal data stream, which can then be applied towards the production and publication of analytical findings. Albeit not necessarily egalitarian, and with different expectations set for people assigned different roles, archaeological projects thus constitute a form of commons, whereby participants contribute to and obtain value from a collective endeavour. Adopting open research practices, including sharing data beyond a project’s original scope, involves altering the collaborative commitments that bind work together. This paper, drawn from my doctoral dissertation, examines how archaeologists are presently navigating this juncture between established professional norms and expectations on the one hand, and the potential benefits and limitations afforded by open research on the other.

I applied an abductive qualitative data analysis approach based on recorded observations, interviews, and documents collected from three cases, including two independent archaeological projects and one regional data sharing consortium with limited scope and targeted research objectives. My analysis documents a few underappreciated aspects of archaeological projects’ sociotechnical arrangements that open data infrastructures should account for more thoroughly:

  1. boundaries, whether they restrict membership within a collective, delimit a project’s scope, or limit the time frame under which a project operates, have practical positive value, and are not just arbitrary impediments;
  2. systems designed to direct the flow of information do so via the coordination of labour, and the strategic arrangement of human and object agency, as well as resistances against such managerial control, are rarely accounted for in data documentation; and
  3. information systems and the institutional structures that support them tend to reinforce and reify existing power structures and divisions of labour, including implicit rules that govern ownership and control over research materials and that designate who may benefit from their use.

By framing data sharing, whether it occurs between close colleagues or as mediated by open data platforms among strangers, as comprising a series of collaborative commitments, my work highlights the broader social contexts within which we develop open archaeological research infrastructures. As we move forward, we should be aware of and account for how the data governance models embedded within open research infrastructures either complement or challenge existing social dynamics.