A method to machine-read messy early manuscript sources and automatically create clean, structured historical data.

I have been experimenting with Transkribus, an automatic handwriting recognition program. It has been used to transcribe letters and handwritten notes but not, as far as I know, for database creation.

The image above shows a training model for text recognition I am building for the port books of Bridgewater 1672-77.

Port books contain entries for millions of English and Welsh merchant voyages 1565-1790. The script has in the past been very hard to transcribe, meaning that most of the data contained in this source remains unstudied. Can Transkribus help to unlock the vast repository of information about trade, shipping and consumption in early modern Europe contained in this source and others?

I have now created data using Transkribus for Newcastle shipping movements from the 1590s, with 11,000 observations. My slides describe the results and the method developed:

Handwritten text recognition

The following text describes some of the elements:

Transkribus and database creation

 

 
