Course Contents
This seminar is concerned with topics, theories, methods and data in corpus and computational linguistics, the area of linguistics that involves the construction and analysis of principled collections of linguistic data in digital form as well as methods for the corpus-based and computational modelling and analysis of language. We are going to discuss types of research questions and applications that are best pursued by means of corpus and computational linguistic methods and techniques. Examples are lexicological questions such as research on terminology, collocations and other types of multi-word lexical units; the corpus-based description of special-purpose grammars, e.g. the extraction of register-specific patterns; and the role of corpora in statistical natural language processing.

[b]Expectations and Goals[/b]

In this seminar, you are going to learn about empirical methodology in linguistics, with a focus on corpus and computational linguistics and the use of digital corpora for the study of language. You will also learn about the theoretical and methodological assumptions and implications of studying language on the basis of digital corpora. We are going to lay the foundations for corpus study and see how particular research questions are operationalized in terms of accessing relevant categories and patterns at different levels of linguistic organization (lexical, syntactic, semantic, pragmatic etc.). On the practical side, you are going to learn how to generate, read and interpret word frequency lists, query digital corpora for particular lexical and grammatical patterns, and decide which patterns and quantitative results are suitable for answering specific research questions. The seminar takes a critical look at approaches to the study of language, investigates the structure and composition of a variety of standard corpora, and begins to ask what we can learn from corpora that other types of evidence are unable to provide.

The goal of the seminar is to acquaint students with the basic tenets of corpus linguistic theory and methods, introduce corpus design and some important corpora and present some sample corpus studies alongside some first hands-on experience with data and tools. The course is designed to lay the basic foundations for your own projects in the CCL II seminar in the summer term.

[b]Expectations:[/b] We do not assume any previous experience with tools or data, but we do assume that you have completed the module Introduction to linguistics, and we expect you to be ready to get your head around some theory and your hands dirty with data and methodology. This requires that you read the set texts and work with the tools and data introduced in the course. Note that the techniques learned in this seminar form the foundation for other seminars and projects in Digital Philology.

Literature
Literature for this course will be announced via the accompanying Moodle course.

Semester: Winter 2023/24