Category Archives: Learner corpus research

International Conference for Learner Corpus Research (LCR 2024), Tartu, 26-28 September 2024

Keynote speakers: 

    Gaëtanelle Gilquin (Université Catholique de Louvain, Belgium) 
    Ilmari Ivaska (Turun Yliopisto, Finland) 
    Cristóbal Lozano (Universidad de Granada, Spain) 

Event Details: 

Date: 26-28 September 2024 
Location: Institute of Foreign Languages and Cultures and the Institute of Estonian and General Linguistics, University of Tartu, Estonia.
Topics: Areas of interest include, but are not limited to, the following:  
        Language for academic purposes  
        Language for specific purposes  
        Language teaching, assessment and testing  
        Learner corpus-based SLA studies  
        Corpora as pedagogical resources  
        Multimodal learner corpora  
        Software for learner corpus analysis  
        Corpus-based translation studies  
        English as a Medium of Instruction (EMI)  
        English as a Lingua Franca (ELF)  
        Data mining and other explorative approaches to learner corpora  
        Statistical methods in learner corpus studies  
        Discourse analysis and pragmatics  
        Studies related to lexis: semantics, metaphor, etc.  
        NLP approaches  
        Complexity, accuracy and/or fluency (CAF) analysis  


Abstracts should give a short summary of the intended presentation, capturing the central idea along with the research questions, the research methods and the (possibly tentative) key conclusions, and citing any relevant previous work or theoretical background of the field. They are limited to 300 words, excluding keywords and references, and must be anonymous: the abstract itself should contain no reference to the author or their affiliation.

Further information is available at our webpage.


The Core Metadata Schema for L2 data

The Core Metadata Schema for L2 data: Collaborative efforts towards improved data findability, metadata quality and study comparability in L2 research

Dr Magali Paquot, UCLouvain

October 30, 18:00 (Madrid time) / 17:00 (UK time)


You can check out the 2021 and 2022 talks here:


The Core Metadata Schema for L2 data consists of a comprehensive set of variables that encapsulate crucial information about L2 data. It is organized into several sections that describe specific aspects of a learner corpus. These include administrative details (e.g. authors or license), corpus design, text-related variables, learner-related variables, in-built annotation (e.g. details about manual or automatic annotation), information about annotators or transcribers (e.g. native language or language repertoire) and task-related details (e.g. instructions, time constraints) (Paquot et al., 2023). It is the result of extensive collaboration between learner corpus compilers at the Centre for English Corpus Linguistics (UCLouvain, Belgium) and EURAC Research (Bolzano, Italy), and a research data infrastructure expert and member of CLARIN’s metadata taskforce (König et al., 2022; Frey et al., 2023).

In this presentation, I will discuss the underlying rationale for the development of such a resource and present its second version. This will give me the opportunity to clarify how we have tried to involve learner corpus researchers in this initiative and to reiterate our hope that the LCR community will collaborate with us to refine the schema and align it with the evolving needs of the field.


Frey, J.-C., König, A., Stemle, E. & M. Paquot (2023). A core metadata schema for L2 data. Paper presented at the 32nd Conference of the European Second Language Association (EUROSLA), 30 August – 2 September 2023, University of Birmingham, UK.

König, A., Frey, J.-C., Stemle, E., Glaznieks, A. & M. Paquot (2022). Towards standardizing LCR metadata. Paper presented at Learner Corpus Research 6, 22-24 September 2022, University of Padua, Italy.

Paquot, M., König, A., Stemle, E. & J.-C. Frey (2023). Core Metadata Schema for Learner Corpora.

Dr Magali Paquot is a permanent FNRS research associate at the Centre for English Corpus Linguistics, Institut Langage et Communication, UCLouvain, and an affiliate member of the Corpus Linguistics Lab, University of Florida. She holds a PhD in Linguistics (Université catholique de Louvain) and a degree in Natural Language Processing (Université de Liège). Her research interests include (but are not limited to) corpus linguistics, learner corpus research, vocabulary, phraseology (collocations, lexical bundles, …), pedagogical lexicography, electronic lexicography, terminology, EAP (English for Academic Purposes), ESP (English for Specific Purposes), EFL (English as a Foreign Language), SLA (Second Language Acquisition), linguistic complexity and L1 influence.

This online event is organized by the Universidad de Murcia and the E020-07 research group (Lenguajes de especialidad, corpus lingüísticos y lingüística inglesa aplicada a la ingeniería del conocimiento).

Coordination: Prof Pascual Pérez-Paredes & Dr Carlos Ordoñana Guillamón

The International Conference for Learner Corpus Research – LCR 2022 University of Padua


The International Conference for Learner Corpus Research (LCR 2022) will be held at the University of Padua (Italy), at the Department of Linguistic and Literary Studies, on 22-24 September 2022.

The LCR 2022 Conference aims to showcase the latest developments in the field of learner corpus research regarding the description of learner language and the design of innovative methods and tools to analyse it.

The conference will feature keynote lectures, full paper presentations, work in progress reports, poster presentations, software demonstrations and a book exhibition. Pre-conference workshops are also planned. 

Keynote speakers

–          Silvia Bernardini (Università di Bologna, Italy)

–          Anke Lüdeling (Humboldt-Universität zu Berlin, Germany)

–          Hilary Nesi (Coventry University, England)


All topics related to learner corpus research based on any language are welcome. Areas of interest include, but are not limited to, the following: 

  • Language for Academic Purposes; 
  • Language for Specific Purposes;
  • Language Teaching, Assessment and Testing;
  • Learner corpus-based SLA studies;
  • Corpora as pedagogical resources;
  • Multimodal learner corpora;
  • Software for learner corpus analysis;
  • Corpus-based translation studies;
  • English as a Medium of Instruction (EMI);
  • English as a Lingua Franca (ELF);
  • Data mining and other explorative approaches to learner corpora;
  • Statistical methods in learner corpus studies.


Abstracts, written in English, should be between 600 and 700 words (excluding a list of references) and should provide the following:

–          clearly articulated research question(s) and its/their relevance;

–          the most important details about research approach, data and methods;

–          (preliminary) results and their interpretation.

Abstracts will be submitted through EasyChair. Abstract submission will open on 18 November 2021 and the deadline for submission is 23 January 2022. Abstracts will be reviewed anonymously by the scientific committee. Notification of the outcome of the review process will be sent by 31 March 2022.

Further information is available on the conference website.

The Graduate Student Conference in Learner Corpus Research 2021

A virtual conference, under the aegis of the Learner Corpus Association at Inland Norway University of Applied Sciences (INN)


This event offers a great opportunity for MA and PhD students to present their (in-progress) results, share ideas, receive feedback from senior researchers in the field, and further develop their professional networks. Senior researchers are welcome as delegates, helping to ensure the high quality of the event and foster the careers of Early Career Researchers.

The Learner Corpus Association (LCA) is an international association which aims to promote the field of learner corpus research and provide an interdisciplinary forum for all the researchers and professionals who are actively involved in the field or simply want to know more about it. 

LCA supports the compilation of learner corpora (i.e. electronic collections of written and/or spoken language produced by foreign/second language learners) in a wide range of languages and the design of innovative methods and tools to analyze them. It seeks to link up learner corpus research to second language acquisition theory, first language acquisition theory and linguistic theory in general and to promote applications in fields including foreign language teaching, language testing and natural language processing (automated scoring, spell- and grammar-checking, L1 identification).