| Authors | Elena Volodina, Arild Matsson, Dan Rosén, Mats Wirén |
|---|---|
| Published in | Learner Corpus Research conference (LCR-2019), Warsaw, 12-14 September 2019, Book of abstracts |
| Publication year | 2019 |
| Published at | Department of Swedish Language (Institutionen för svenska språket) |
| Language | en |
| Links | https://lcr2019.ils.uw.edu.pl/progr... |
| Keywords | SweLL, L2 infrastructure, annotation tool, learner corpus research, parallel corpora |
| Subject categories | Language technology (computational linguistics), Comparative language studies and linguistics, Learning |

Learner corpora are actively used for research on language acquisition and in Learner Corpus Research (LCR). The data is, however, very expensive to collect and manually annotate, a process that includes steps such as anonymization, normalization, error annotation, and linguistic annotation. In the past, projects often reused tools from a number of different projects for these steps. As a result, the tools' differing input and output formats had to be converted, which increased the complexity of the task. In the present project, we are developing a tool that handles all of the above-mentioned steps in one environment, maintaining a stable, interpretable format between the steps. A distinguishing feature of the tool is that users work in a familiar environment (plain text) while the tool visualizes all performed edits via a graph that links the original learner text with the edited one, token by token.
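
The idea of linking an original learner text to its edited version token by token can be illustrated with a minimal sketch. This is not the tool's actual implementation or data format, which the abstract does not describe. The function name `link_tokens`, the edge representation, and the use of whitespace tokenization with Python's standard `difflib` alignment are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def link_tokens(original: str, edited: str):
    """Return a list of (source_tokens, target_tokens, label) edges.

    Hypothetical helper: the edge format is an assumption, not the
    format used by the tool described in the abstract.
    """
    # Naive whitespace tokenization (an assumption for this sketch).
    src = original.split()
    tgt = edited.split()

    # Align the two token sequences with the standard-library matcher.
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)

    edges = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged tokens are linked one to one.
            for s, t in zip(src[i1:i2], tgt[j1:j2]):
                edges.append(([s], [t], "equal"))
        else:
            # Replaced, inserted, or deleted spans become one edge each;
            # an empty list on one side marks an insertion or deletion.
            edges.append((src[i1:i2], tgt[j1:j2], tag))
    return edges


# Example: a learner sentence and a normalized version of it.
for source, target, label in link_tokens(
    "Their was many problem", "There were many problems"
):
    print(label, source, "->", target)
```

In this sketch each edge groups the source tokens and target tokens it connects, so a single edit can span several tokens on either side. A real annotation tool would likely attach error labels to such edges and let annotators adjust the links by hand.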