CLARIN Tool Portal

The CLASSLA-StanfordNLP model for lemmatisation of non-standard Croatian 1.0

2 resources

The model for lemmatisation of non-standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1210), the ReLDI-NormTagNER-hr corpus (http://hdl.handle.net/11356/1241), the RAPUT corpus (https://www.aclweb.org/anthology/L16-1513/) and the ReLDI-NormTagNER-sr corpus (http://hdl.handle.net/11356/1240), using the hrLex inflectional lexicon (http://hdl.handle.net/11356/1232). These corpora were additionally augmented for handling missing diacritics by repeating parts of the corpora with diacritics removed. The estimated F1 of the lemma annotations is ~97.54.
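The diacritic augmentation described above can be illustrated with a minimal sketch. It is not the authors' actual procedure: the function names, the `fraction` parameter, the mapping of "đ" to "d", and the choice to leave lemmas unchanged are all assumptions made for illustration.

```python
import random
import unicodedata

# "đ"/"Đ" do not decompose under NFD, so they need an explicit mapping.
# Mapping them to plain "d"/"D" is an assumption; "dj" is another common spelling.
EXTRA = str.maketrans({"đ": "d", "Đ": "D"})


def strip_diacritics(text: str) -> str:
    """Return text with diacritics removed (č -> c, š -> s, đ -> d, ...)."""
    decomposed = unicodedata.normalize("NFD", text.translate(EXTRA))
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def augment(sentences, fraction=0.5, seed=0):
    """Append diacritic-free copies of a random subset of sentences.

    Each sentence is a list of (form, lemma) pairs. Only the surface form
    is stripped; keeping the lemma intact is an assumption of this sketch.
    """
    rng = random.Random(seed)
    extra = []
    for tokens in sentences:
        if rng.random() < fraction:
            extra.append([(strip_diacritics(form), lemma) for form, lemma in tokens])
    return list(sentences) + extra
```

With `fraction=1.0` every sentence is repeated once in diacritic-free form, so a lemmatiser trained on the result sees both "kuće" and "kuce" paired with the lemma "kuća".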

Liner2 events model

1 resource

Liner2 model for event and event relation recognition

BlogReader

1 resource

BlogReader - corpus acquisition from structured web sources

Poliqarp for DjVu -a demonstration (open Virtual Appliance)

3 resources

A server for DjVu corpora.

The CLASSLA-StanfordNLP model for lemmatisation of non-standard Serbian 1.0

2 resources

The model for lemmatisation of non-standard Serbian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the SETimes.SR training corpus (http://hdl.handle.net/11356/1200), the ReLDI-NormTagNER-sr corpus (http://hdl.handle.net/11356/1240), the ReLDI-NormTagNER-hr corpus (http://hdl.handle.net/11356/1241), the hr500k training corpus (http://hdl.handle.net/11356/1210) and the RAPUT corpus (https://www.aclweb.org/anthology/L16-1513/), using the srLex inflectional lexicon (http://hdl.handle.net/11356/1233). These corpora were additionally augmented for handling missing diacritics by repeating parts of the corpora with diacritics removed. The estimated F1 of the lemma annotations is ~97.62.

SpatialPL

3 resources

SpatialPL is a tool for automatic recognition of spatial expressions in Polish texts.

Defender

1 resource

Deepened lexical parser for nominal phrases.

Voice control and question answering (22.10)

3 resources

The goal of this work package was to develop Kaldi recipes for voice control and question answering systems for Icelandic. We defined six tasks, generated or gathered data for each, normalized the data, and trained Kaldi language models. This submission includes six ASR language models, an acoustic model, the training data for the language models, and all the code used to generate the data and create the models. For further information, see the file README.md.

Kaldi Recipe for Faroese

3 resources

The "Kaldi Recipe for Faroese" is a code recipe that shows how to use the corpus "Ravnursson Faroese Speech and Transcripts" [1] to create automatic speech recognition systems with the Kaldi toolkit [2].

[1] Hernández Mena, Carlos Daniel; Simonsen, Annika. "Ravnursson Faroese Speech and Transcripts". Available at: http://hdl.handle.net/20.500.12537/276

[2] Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., ... & Vesely, K. (2011). The Kaldi speech recognition toolkit. In IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society.

CUBBITT Translation Models (en-cs) (v1.0)

3 resources

CUBBITT En-Cs translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). Models are compatible with Tensor2Tensor version 1.6.6. For details about the model training (data, model hyper-parameters), please contact the archive maintainer. BLEU on newstest2014: en->cs 27.6, cs->en 34.4 (evaluated using multeval: https://github.com/jhclark/multeval).
