The records of the Dutch East India Company (1602-1795) are kept at the National Archives in The Hague. The archives span 1.25 km (0.78 miles). Many of these records have been scanned and are available through the finding aid.
Description of the project
To make the records easily findable, the National Archives is participating in an ambitious project called IJsberg [iceberg]. The project uses a range of state-of-the-art artificial intelligence technologies. I am involved in the project in a minor role and am intrigued by the possibilities it offers!
The records are processed in multiple steps:
- The records are scanned (done)
- Artificial intelligence is used to automatically create transcriptions (underway)
- Artificial intelligence is used to automatically detect names of persons and places, and dates and times, in the transcriptions (needs training data first)
The artificial intelligence models used for the project require training data. Volunteers are asked to help transcribe the records and tag the names and dates in them.
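To give an idea of what such training data can look like, here is a minimal sketch in Python. It is not the project's actual format or software, which this post does not describe. It assumes the tags are stored as character positions in the transcription and converts them to the widely used "BIO" labeling scheme (B = beginning of a name, I = inside a name, O = outside). The example sentence, the label names PERSON, PLACE and DATE, and the helper functions are all made up for illustration.

```python
import re


def find_span(text, phrase, label):
    """Return (start, end, label) for the first occurrence of phrase in text."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def spans_to_bio(text, spans):
    """Convert character-level tagged spans into per-token BIO labels.

    spans is a list of (start, end, label) tuples, with end exclusive.
    Tokens are split on whitespace, which is a simplification.
    """
    result = []
    for match in re.finditer(r"\S+", text):
        tag = "O"  # default: token is not part of any tagged span
        for start, end, label in spans:
            if start <= match.start() < end:
                # First token of a span gets B-, later tokens get I-
                prefix = "B-" if match.start() == start else "I-"
                tag = prefix + label
                break
        result.append((match.group(), tag))
    return result


# Hypothetical transcription line and volunteer tags (invented for this sketch).
text = "Jan Pietersz vertrok uit Batavia op 3 maart 1710"
spans = [
    find_span(text, "Jan Pietersz", "PERSON"),
    find_span(text, "Batavia", "PLACE"),
    find_span(text, "3 maart 1710", "DATE"),
]
labels = spans_to_bio(text, spans)

for token, tag in labels:
    print(f"{token}\t{tag}")
```

Each tagged page yields many labeled words like these, which is why even a few pages of volunteer work add up to useful training material.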
Volunteer project Tag the Text
Last month, the National Archives launched the volunteer project Tag the Text at the Vele Handen website to help train the automatic name and date detection. They are looking for more volunteers.
If you join the project, you will be presented with the image and the automatically generated transcription, and asked to tag the people, places, and dates in the transcribed text. You do not need to create or correct the transcription; that is done in another project.
Please consider joining if you have the following skills:
- Necessary: ability to understand modern Dutch to follow the instructions
- Necessary: ability to understand the Dutch text from the 1600s or 1700s
- Helpful: ability to read the original Dutch text
- Helpful: knowledge of geography in the Dutch East Indies and other places where the Dutch East India Company traded, to recognize the difference between the name of a place and the name of a person.
The project allows you to select the types of records you want to work on. Many people are choosing to work on the easier notarial records, which means there is now a shortage of people working on the Dutch East India Company records. It’s especially important that these records are tagged so that computers can do the monumental task for us!
Even if you only do a few pages, that will help train the computer program, which can then do thousands or even millions more. Together, we can help make the millions of records searchable.