DH2016: What’s the relation between crowdsourcing cultural heritage and Wikimedia?

The yearly Digital Humanities conference brings together academic researchers who use digital tools to answer humanities research questions. In the sciences, digital technologies are embedded in each research field, but the humanities have been slower to adopt them. This year in Krakow, Poland, the Digital Humanities 2016 conference showcased and discussed the growing body of work within the domain. Topics ranged from geospatial analysis of visibility in ancient Maya sites to animating archived Javanese puppets using robotics.

At the conference, I wanted to deepen my understanding of participation and crowdsourcing in cultural heritage research projects, and to envision the role of Wikimedia and other open collaborative projects in humanities research.

Why Digital Humanities + Wikimedia

Creating and using open source tools and sharing results is becoming increasingly common in academic research projects. Wikimedia projects are still rarely used for research, even though many of them could be an obvious part of the academic research workflow.

Pepe Flores recently wrote about the connection between DH and Wikimedia on the Wikimedia Foundation blog:
Is Wikipedia the largest-ever digital humanities project? Exploring an emerging relationship.

There is still a long way to go. For humanities scholars, there is a tension between maintaining control over the interpretation and modification of the corpus of study, and the perceived loss of control on open platforms. Many of the researchers I talked with were concerned about whether their contributions would meet the notability guidelines set by the communities in Wikimedia projects.

MediaWiki’s editing and knowledge discovery tools pose additional challenges to researchers, although the tools have come a long way. The user interfaces are not as flexible as what is considered a contemporary standard. Researchers have little experience in managing and taking advantage of the APIs, extensions, tools and other services in the technical ecosystem. But for research, the benefits are evident. The data is maintained and fostered together, continuously evolving and enriched with more information as it becomes available, in several languages, and in real time.

What next for crowdsourcing?

My primary interest in this year’s conference was to take part in the workshop Beyond Basics – What next for crowdsourcing? The workshop discussed the methods, technologies, and ethics of engaging volunteers in digital humanities projects, such as Transcribe Bentham by UCL or Emigrant City by NYPL & Zooniverse. In these projects, citizen participants find names and places in diaries, identify animals or galaxies in images, transcribe letters and diaries, classify paintings, and so on. These tasks are often arranged into workflows that the users follow.

By contrast, Wikimedia projects go beyond crowdsourcing. The participants are invited to take part in open-ended collaborative work. The closest thing to crowdsourcing in Wikimedia projects is the fact-checking of Wikidata entries. These microtasks can be carried out with the Wikidata Game, for example.

The crowdsourcerer’s workshop

How could these approaches be combined: the open-ended knowledge production of the Wikimedia projects, and the guided and schematic crowdsourcing workflows? Mia Ridge from the British Library, a co-organizer of the workshop, promotes scaffolding: participants start off with simple tasks and gain more responsibility and freedom as they progress.

Let’s also make crowdsourcing workflows in Wikimedia and with Wikimedia content! A workflow for an image collection could include uploading, structuring metadata, identifying people/objects/themes, geolocating, dating, and making annotations. Maps and manuscripts need different tasks: indexing or transcribing, or identifying names of known people and places. The workflows could be broken into fun microtasks and games, and the results would be saved in Wikimedia projects.
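To make the idea a little more concrete, here is a minimal sketch of how such a workflow could be modelled as microtasks with scaffolding. It is only an illustration: the step names come from the image collection example above, while the order of the steps, the unlock thresholds, and names such as `IMAGE_WORKFLOW`, `Volunteer`, `available_tasks` and `record_contribution` are my own assumptions and do not correspond to any existing Wikimedia tool or API.

```python
from dataclasses import dataclass

# Steps of the image-collection workflow mentioned in the text.
# The ordering and the "contributions needed to unlock" numbers are
# hypothetical values chosen only for this illustration.
IMAGE_WORKFLOW = [
    ("identify people/objects/themes", 0),
    ("geolocate", 5),
    ("date", 10),
    ("annotate", 20),
    ("structure metadata", 40),
    ("upload", 80),
]


@dataclass
class Volunteer:
    name: str
    completed: int = 0  # number of microtasks finished so far


def available_tasks(volunteer, workflow=IMAGE_WORKFLOW):
    """Return the task types the volunteer has unlocked so far (scaffolding)."""
    return [task for task, needed in workflow if volunteer.completed >= needed]


def record_contribution(volunteer, task, workflow=IMAGE_WORKFLOW):
    """Count one finished microtask; refuse tasks that are not unlocked yet."""
    if task not in available_tasks(volunteer, workflow):
        raise ValueError(f"{task!r} is not unlocked for {volunteer.name}")
    volunteer.completed += 1
    return volunteer.completed
```

A newcomer would start with identification tasks only, and harder steps such as geolocating would open up after a few contributions. In a real project, the results of each microtask would be saved to a Wikimedia project, a step that this sketch leaves out.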

Christy Henshaw from the Wellcome Library in London is also a co-organizer of the workshop. She points out how much energy is spent on copying images over to Wikimedia Commons. The focus could be shifted to the reuse of the imagery: the images would be made available through metadata only and accessed in their original locations whenever needed.

MediaWiki can be used as the platform for a custom transcription project. Joanna Iranowska works with the Norwegian eMunch project, which transcribes Edvard Munch’s texts. It is a MediaWiki project, like the Transcribe Bentham project on which it is based.

A page from Edvard Munch’s notebook in the eMunch project, Munchmuseet

Ben Brumfield, a third co-organizer and a Wikipedian himself, runs the transcription platform FromThePage. He sees both opportunities and challenges in connecting external tools with Wikisource, for example. The key issue is not so much the technology as engaging the participants. It would be important for a small memory institution to foster and relate to its own volunteer community within the vastness of a Wikimedia project.

What brings people together to love a project?

Engaging people, especially when they provide information about their own lives, requires active presence and focusing on the people themselves.

Crowdsourcing is like a party. You can’t just lock yourself in the kitchen. You have to have a gracious way to invite people, introduce them, and let them know when it’s time to go.

Susan Schreibman runs The Letters of 1916 project to transcribe letters written around the time of the Easter Rising in Ireland. She recounts that the social and educational side of the project has become so significant that she prefers to describe the project as public engagement.

Projects cannot overlook the social process that is taking place. A classic example is the Old Weather project by Zooniverse. The aim was to invite volunteers to transcribe historic ships’ logs to uncover historical weather data for climate model assimilation. The volunteers also ended up researching the stories of sailors, ships and life aboard.

Could there be novel ways for experts and volunteers to work together in the Wikimedia projects?


In volunteer projects, many actors are both giving and receiving. When institutions share, they expect to get enriched data back. When people spend their time, they expect their contributions to be valuable.

Transparency about the project goals and about how the project results are being used emerged as a key principle. Open licensing of the results secures their reuse, both for the participants and beyond the project.

If I’ve given my time freely, I take the view that the content I generate should also be freely available for anyone to reuse. – Siobhan Leachman

There are many ways to reward participants for their work. How it is done matters less than making sure that people who contribute voluntarily are recognized, thanked, and respected.

If you want to study the discussions more closely, the workshop notes are online, along with notes on each topic. The conference abstracts cover a multitude of topics and are well worth reading. And help make the Crowdsourcing page in Wikipedia a great resource for humanities projects as well!

Let’s Edit Wikimedia + the Digital Humanities together!

Digital Humanities 2017 will be held in Montréal, Canada, right before Wikimania. DH2017 will be held on August 8–11 on the campus of McGill University, while Wikimania 2017 takes place at the Centre Sheraton Montréal in downtown Montréal on August 11–13. This will surely be a unique opportunity for crossovers between academic and volunteer digital humanities research!

In the meantime, we should start imagining ways to bring Wikimedia projects together with the vibrant new space created by the Digital Humanities. To share those ideas, I have started the page Why Digital Humanities + Wikimedia on Wikimedia Outreach. Let’s edit it together!

