Literature
Will be uploaded on Moodle.

Preconditions
For this class, you will need to bring your laptop. Familiarity with Python is essential for understanding the data-creation pipeline. You will also need to have installed Microsoft Visual Studio Code and a Python IDE (such as Spyder or PyCharm, both of which can be accessed through Anaconda).
 

Official Course Description
This exercise seminar introduces the creation of a gold-standard dataset: preprocessing data, annotating text manually and automatically for features such as part of speech and lemma, evaluating data formats such as TSV and XML/TEI, and checking and correcting annotations. In this class, we will annotate medical texts written in the Romance languages of medieval Europe (Old and Middle French, Anglo-Norman, Old Occitan) using the annotation tool INCEpTION. We will also look at annotating texts with spaCy.
Prior knowledge of these languages is not required. Knowledge of modern French is useful but also not required. Resources such as dictionaries, grammar books (in English), lemma lists and translations of the texts will be provided on Moodle.
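
To give a rough idea of what checking annotations against a gold standard can look like, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the course's actual pipeline: the three-column TSV layout (token, pos, lemma), the sample tokens and tags, and the helper names to_tsv, from_tsv and disagreements are all hypothetical.

```python
import csv
import io

# Hypothetical gold-standard rows: (token, part of speech, lemma).
# Tokens and tags are illustrative only, not taken from the course texts.
GOLD = [
    ("li", "DET", "le"),
    ("rois", "NOUN", "roi"),
    ("est", "AUX", "estre"),
]

# Hypothetical automatic annotation of the same tokens, with one POS error.
AUTO = [
    ("li", "DET", "le"),
    ("rois", "NOUN", "roi"),
    ("est", "VERB", "estre"),
]


def to_tsv(rows):
    """Serialise annotation rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["token", "pos", "lemma"])
    writer.writerows(rows)
    return buffer.getvalue()


def from_tsv(text):
    """Parse tab-separated annotation text back into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def disagreements(gold, auto, field):
    """Return (index, token, gold value, auto value) where `field` differs."""
    return [
        (i, g["token"], g[field], a[field])
        for i, (g, a) in enumerate(zip(gold, auto))
        if g[field] != a[field]
    ]


# Round-trip both annotation layers through TSV, then compare them.
gold_rows = from_tsv(to_tsv(GOLD))
auto_rows = from_tsv(to_tsv(AUTO))
pos_errors = disagreements(gold_rows, auto_rows, "pos")
pos_accuracy = 1 - len(pos_errors) / len(gold_rows)
```

The list pos_errors then holds the rows that would need manual correction, and pos_accuracy gives the share of tokens on which the automatic tags agree with the gold standard.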

The class is organised in association with the Knowledge Networks in Medieval Romance Speaking Europe (ALMA) project, based at the Heidelberg Academy of Sciences and Humanities and Heidelberg University. You can find more information about the project here: https://www.hadw-bw.de/en/research/research-center/knowledge-networks-medieval-romance-speaking-europe-alma


To pass this course, you will need to complete the annotation tasks throughout the semester, along with a final term paper. More details will be announced in the first session.

Online Offerings
Moodle

Semester: WT 2025/26
Jupyterhub API Server: https://tu-jupyter-t.ca.hrz.tu-darmstadt.de