Big cultural data
The key motivation for my activities with cultural heritage is the wish to preserve and uncover the historical evidence that is embedded in private albums and family histories. The counterparts of these privately held items are kept in memory institutions, but until now it has not been possible to piece together the fragments of the broken mirror (echoing how Peter von Bagh characterizes compilation film). With the sharing capacity of the Internet and the efforts of the free culture movement, it is now possible to create platforms and productions for stitching and weaving stories that incorporate materials from museums, archives, libraries, albums and attics.
Locked, scattered, unmanageable
However, piecing together the broken mirror is a tedious task. Even though ordinary photographs taken before 1966 are already out of copyright, and reproductions of artworks by authors who died before 1945 are covered only by the copyright of the reproduction, only samples of works are made openly available. In Europe, privacy concerns legally limit the publication of content, while images spread uncontrollably in US-based social media, where they end up double copyrighted by the services.
Memory organizations are struggling with the demands placed on them and with diminishing resources. A vast majority of documents in archives are and will remain undigitized. Interpretation of the materials is considered the core competency of the domain experts, which complicates inviting the wider public to participate. Contextualizing would take so much time that many organizations prefer not to expose their holdings to the public. Even for openly licensed content, cumbersome legacy content management systems make efficient access difficult for humans, let alone computer programs. On the bottom are the samples that have deliberately been made openly available for reuse and redistribution.
And finally, there is little offered for the preservation of privately held media. Consumers are left with commercial services while the preservation of private photos, letters and documents is nobody’s responsibility.
Genealogy is a goldmine for web service developers. Ancestry.com has invested heavily in tools to make its 2 million genealogical researchers stay and pay a monthly fee of $30 or $40. In October 2012, Ancestry.com agreed to be acquired by an investor group led by European private equity firm Permira for about $1.6 billion.
Institutions need help. They are slowly letting go of their role as the only authorities capable of making meaning and letting the audience help in laborious tasks like classifying, transcribing or giving geographic coordinates to objects.
Online volunteering comes in many forms and under many names: crowdsourcing, citizen science, citizen history, volunteered geographic information, and more.
More often than not, the institution invites laymen to help out with repetitive, relatively simple tasks. However, there is growing competition for the volunteer researcher’s labor. Where does the motivation of the volunteer stem from? Appreciation from peers, an altruistic will to help, or compensation?
I propose to work towards developing reciprocity between institutional and individual Internet users in submitting material for research and in interpreting, enriching, connecting and finally reusing it. For the hobbyist local historian or genealogist to take advantage of the knowledge provided by the memory institutions, and for the institutions to benefit from the assistance of engaged citizen researchers, they must find common ground where they can collaboratively share their research.
Open spaces, existing or new, are needed where one can make materials available, access previous studies, welcome non-expert contributions, share expertise, provide tools, and circulate and congregate. Are these spaces the web of existing environments, a working method, or are new platforms needed?
What Wikimedia is doing with cultural heritage
I have worked with Wikimedia for several years, contributing to the GLAM activities. GLAM stands for the collaboration of Wikimedia chapters or volunteers with Galleries, Libraries, Archives and Museums. The aim is to help organizations open up their content: to help license it openly and to transfer files and data to Wikimedia services. Material added to these services becomes legally open for redistribution and reuse for any purpose, including commercial purposes.
Editathons are Wikipedia editing events, organized together by a memory institution, its community partners and Wikimedians. They have been arranged on all imaginable topics around the world. Wikimedia Finland arranged Tuo kulttuuri Wikipediaan, a series of editathons with six institutions in Helsinki in 2014.
We also assist in uploading media files to Wikimedia Commons. This can be tricky. Files may not be licensed, may not be available online or may not be readable, the metadata may be messy, and there are no consumer tools for mass uploads. We decided to develop tools of our own, including one that converts the messy output of a content management system into a format that Wikimedia Commons understands. So far we have uploaded historical maps from the National Archives and the National Library, and we are also preparing to upload maps from the Norwegian Mapping Authority. Photographs by I. K. Inha, from Yle Archives, the National Gallery, Gallen-Kallela Museum and many others have already been uploaded or are waiting to be uploaded.
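To make the conversion step concrete, here is a minimal sketch in Python of turning one exported record into Commons wikitext. It is not the tool described above. The column names (Title, Year, Institution, Creator) and the PD-old license tag are assumptions chosen for illustration; only the {{Information}} template and its description, date, source and author parameters are taken from Wikimedia Commons itself.

```python
# Hypothetical mapping from {{Information}} parameters to the column
# names of an imagined content management system export.
FIELD_MAP = {
    "description": "Title",
    "date": "Year",
    "source": "Institution",
    "author": "Creator",
}


def clean(value):
    """Collapse stray whitespace; treat a missing value as an empty string."""
    if value is None:
        return ""
    return " ".join(str(value).split())


def record_to_wikitext(record, license_tag="PD-old"):
    """Render one exported record (a dict) as a Commons file description.

    The license tag is a placeholder; the correct tag has to be decided
    for each collection.
    """
    lines = ["{{Information"]
    for param, column in FIELD_MAP.items():
        lines.append(f"|{param}={clean(record.get(column))}")
    lines.append("}}")
    lines.append("{{" + license_tag + "}}")
    return "\n".join(lines)


if __name__ == "__main__":
    example = {"Title": "  Map of   Helsinki ", "Year": 1900, "Creator": None}
    print(record_to_wikitext(example))
```

Empty parameters are written out rather than dropped, so that gaps in the source metadata stay visible to whoever reviews the upload.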
The latest project by the Wikimedia Foundation, Wikidata, combines the capability of linked open data with the knowledge and user base of Wikimedia. Wikidata is a central repository for structured data collected from the Wikimedia projects. The data is being enriched step by step, yet already at a swift pace, by mapping it to external sources, thesauri and ontologies.
We realized that we cannot master Wikidata, but there are also very few others in Finland who can. Therefore, we set up a learning project and invited 10 memory organizations to participate. In the project Mobilizing Open Cultural Data we arrange two workshops, work with each organization individually and prepare learning material online together. The data that we want to bring to Wikidata includes historical place names of Finland, Finnish names for species and all the paintings in the National Gallery, among others.
Projects with historical maps
Working with GIS (geographic information systems) has been an expert activity. On the other hand, there is a lot of untapped potential among hobbyist researchers who are interested in investigating old documents. One of the cornerstones of any data is data about location.
With the Wikimaps project I set out to think about how to bring historical geographic data to Wikimedia: how to read it from old maps and how to link it to what we know about people and events. There had been encouraging examples of inviting the crowd to georeference old maps. The New York Public Library had produced the Mapwarper, and the British Library had used another platform to run successful georeferencing drives, where people aligned old maps to match today’s coordinates.
The Wikimaps project started out with the goal of including the New York Public Library tool for historical maps in Wikimedia Commons. Since the beginning, the project has embraced Nordic collaboration, hack events and other aspects of historical geodata, such as place names or locations for historical photographs. We connect with OpenHistoricalMap, a project that uses scanned maps as sources to create a street map through time.
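The idea of aligning an old map to today’s coordinates can be illustrated with a small sketch in Python. It assumes the simplest possible case: a volunteer marks three control points on the scan, each pairing a pixel position with a known longitude and latitude, and an affine transform is solved from them. This is not how the Mapwarper or the British Library platform is implemented; real tools typically take more control points and offer other kinds of warps. All function names and coordinates here are made up for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("control points lie on one line")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(control_points):
    """Fit lon = a*x + b*y + c and lat = d*x + e*y + f.

    control_points: exactly three ((pixel_x, pixel_y), (lon, lat)) pairs.
    """
    if len(control_points) != 3:
        raise ValueError("this sketch needs exactly three control points")
    matrix = [[float(px), float(py), 1.0] for (px, py), _ in control_points]
    lon_coeffs = solve3(matrix, [geo[0] for _, geo in control_points])
    lat_coeffs = solve3(matrix, [geo[1] for _, geo in control_points])
    return lon_coeffs, lat_coeffs


def pixel_to_geo(transform, px, py):
    """Convert a pixel position on the scan to (lon, lat)."""
    (a, b, c), (d, e, f) = transform
    return (a * px + b * py + c, d * px + e * py + f)


if __name__ == "__main__":
    # Made-up control points for a 1000 x 1000 pixel scan.
    points = [((0, 0), (24.0, 60.2)),
              ((1000, 0), (25.0, 60.2)),
              ((0, 1000), (24.0, 59.7))]
    print(pixel_to_geo(fit_affine(points), 500, 500))
```

An affine transform can only shift, scale, rotate and shear the scan. Old maps with uneven distortion need more control points and a more flexible warp, which is one reason why crowdsourced georeferencing asks volunteers to place many points.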
Wiki Loves Maps
This spring we arranged a seminar and hackathon, Wiki Loves Maps, where we invited international guest speakers and developers to hack historical space, and Finnish speakers to discuss how to capture do-it-yourself history.
One of the hackathon projects that we would like to develop further is the idea of a historical street view. When enough photographs of the same place are collected and placed in relation to each other, it will be possible to create a three-dimensional model from the images. We later tested the technology in a workshop event, DroneArt Helsinki!, where the participants created 3D models of public artworks.
Bring in everyone!
According to the study Suomalaiset ja historia, learning about one’s family history is the most popular form of enjoying history among Finns. The reliability of stories heard from relatives is considered equal to that of school teaching and university-level historical research. Still, there is no free and open place to collaboratively store, research and experience the past.
While Wikipedia is a ubiquitous platform that would seem ideal for hosting this kind of enquiry, it has limiting practices.
Wikipedia has quite successfully raised the quality and reliability of its articles by requiring better references, in the manner of academic writing. Generally, the sources need to be reputable publications. For working with intangible culture or oral history, this would be a harsh requirement.
Wikipedia always aims for consensus. All viewpoints are presented in a single article. For amateur history practice that should not be a requirement. It should be possible to fork articles to reflect the different worldviews of the writers.
The project will give people tools for making sense of the materials that have been made available: tools for researching and interpreting. In events we can together create new ways of working and invent new tools.
The story of independence through archives and albums
When Finland celebrates the centenary of its independence in 2017, Wikimedia Finland wishes to challenge archives and museums to share films, photographs, printed matter, maps and paintings from the era of National Romanticism, leading up to independence and culminating in the civil war. We also wish to invite everyone to open archives and to share albums, letters and family histories from the period.
We hope to enable a space for discovery for individuals and collectives. One option is to launch a new MediaWiki platform as a starting point, with strong connections to Wikimedia projects. The tools can be enhanced for collaborative writing, contextualizing photographs, searching audiovisual content, making audiovisual presentations, transcribing documents, using historical maps or making data visualizations.
Collecting and remediating the archive would take place in encounters: collection events, hackathons and participatory events. The resulting narrative presentations would not be predefined. They would be developed through interventions and collaborations.
 Bagh, Peter von. Peili jolla oli muisti : elokuvallinen kollaasi kadonneen ajan merkityksien hahmottajana (1895-1970). Helsinki: Suomalaisen Kirjallisuuden Seura, 2002.
 “Ancestry.com Agrees To $1.6 Billion Cash Buyout Led By European Private Equity Firm Permira, Eyes ‘New Geographies’.” TechCrunch, accessed October 22, 2012.
 Ridge, Mia. “From Tagging to Theorizing: Deepening Engagement with Cultural Heritage through Crowdsourcing.” Curator: The Museum Journal 56, no. 4 (2013): 435–50.
 Algan, Yann, Yochai Benkler, Mayo Fuster Morell, and Jérôme Hergueux. “Cooperation in a Peer Production Economy: Experimental Evidence from Wikipedia.” In Workshop on Information Systems and Economics, Milan, Italy, 1–31, 2013.
 https://fi.wikipedia.org/wiki/Wikipedia:Wikiprojekti_Tuo_kulttuuri_Wikipediaan Helsinki Region Summer University, Helsinki City Library, Helsinki Art Museum, Finnish National Gallery and Ateneum Art museum, Finnish Photography Museum, Svenska litteratursällskapet i Finland and Yle Archives.
 Torsti, Pilvi. Suomalaiset ja historia. Helsinki: Gaudeamus, 2012, p. 30.
 Torsti, p. 54.