Handwritten report from CID
Photo: Stefan Jensen

The Detective Branch: Transcribing, cataloguing and visualising police reports and photographs from Gothenburg in 1868–1902

Research project
Active research
Project period
2020 - ongoing
Project owner
GPS400: Centrum för samverkande visuell forskning

Short description

The Detective Branch is a GPS400 project conducted jointly with a research and development initiative at the Swedish National Archives. The project material comprises police reports from the detective branch of the Gothenburg Police Force covering 1868–1902. These reports are held at the Regional State Archives in Gothenburg and add up to around 22,500 handwritten pages. The aim is to find innovative ways of improving access to archival information and to find links between visual and text-based archival sources. The long-term hope is that the project will also create new foundations for collaborative research into a unique period in Gothenburg’s history, strongly defined by urbanisation, industrialisation and migration.

The Criminal Investigation Department was set up in the 1850s and tasked with conducting investigations as non-uniformed police officers. In 1868, the department established a permanent team of personnel, and it was at this point that “report books” began to be kept, containing copies of all outgoing reports from the department. The image shows the HTR processing of a police report from 1896.


Automatic transcription using machine learning

In recent years, research has made huge advances in the automated transcription of old, handwritten texts. Handwritten Text Recognition (HTR) is a method based on artificial intelligence, in which a computer programme is trained to recognise the handwriting in an image and translate it into characters. In the CID project, an extensive collection of police reports will be processed using HTR methods that will automatically transcribe the handwritten text into a digital format.



The content of the report books offers enormous potential for many archive users. The results of the project will also be made freely available to everyone via the Swedish National Archives’ Digital Research Room. Several GPS400 projects, including CID, will use the transcribed archival information as contextual information and metadata for photographs from the same period. Searching for and using information from this transcribed archive material will be very different from the traditionally time-consuming task of searching through analogue archives. It will be possible, for example, to conduct a free text search of the entire archive and quickly process large quantities of data. The Swedish National Archives are also running a Vinnova-funded project to expand the use of HTR and to create a search interface for the transcribed data.
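To give a sense of what a free text search over transcribed pages involves, here is a minimal sketch in Python. It is not the search interface being developed by the Swedish National Archives; the function names, page identifiers and sample texts are all invented for illustration. It builds a simple word index and returns the pages that contain every word in a query.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case a transcription and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(pages):
    """Map each word to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the sorted ids of pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical page ids and texts, invented for this example only.
pages = {
    "page-001": "Report concerning the theft of a watch near the harbour",
    "page-002": "Report concerning a missing sailor",
}
index = build_index(pages)
print(search(index, "report theft"))
```

Because the index is built once, each query only has to look up a few words rather than reread every page, which is what makes searching a large transcribed archive quick compared with leafing through the originals.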


Now you can join in!

Spring 2020 saw the manual transcription of 500 two-page spreads by a group of participants drawn from the general public. Based on this transcribed data, an HTR model was then trained to conduct automatic transcription of the remaining material with 97 per cent accuracy. However, to achieve as high a quality as possible, the project is once again inviting the public to take part, this time in the correction of the automatically transcribed police reports. The work will take place over the internet and requires no prior knowledge, beyond normal computer skills and some experience of reading texts from the turn of the 20th century. Participating in the project will give you an insight into the Gothenburg of this period and how its people lived. You will also be making a major contribution to future research, creating new opportunities to generate knowledge about the history of the city. Are you interested in helping out? Email us and we’ll tell you all about it. Come and join in!
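The text does not say how the 97 per cent accuracy figure is measured. One common way to score automatic transcription is character accuracy: compare the automatic text with a manually corrected version and count how many single-character edits separate them. The sketch below assumes that definition; the function names are hypothetical and this is not the project's own evaluation code.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference, hypothesis):
    """Return 1 minus the character error rate, floored at 0.
    `reference` is the corrected text, `hypothesis` the automatic one."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    error_rate = edit_distance(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - error_rate)


# One wrong character out of four gives an accuracy of 0.75.
print(character_accuracy("abcd", "abxd"))
```

Under this measure, 97 per cent accuracy would mean roughly three characters in every hundred still need correcting, which is the kind of work volunteers are invited to help with.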