Active filters:

  • Metadata provider: DSpace
419 record(s) found

Search results

  • NeMo Neural Machine Translation service RSDO-DS4-NMT-API 1.0

    Neural Machine Translation service for NeMo AAYN Base models. For more details about building such models, see the official NVIDIA NeMo documentation (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/machine_translation/machine_translation.html) and the NVIDIA NeMo GitHub repository (https://github.com/NVIDIA/NeMo). A model for the SL-EN language pair can be downloaded from http://hdl.handle.net/11356/1736. The service accepts the source language, the target language, and either a single string or a list of strings to be translated. The result is returned in the same format as the request, either as a single string or as a list of strings. The maximum accepted text length is 5000 characters. Note that translating one 5000-character text block on CPU will use all available cores, consume up to 3 GB of RAM and may take about 200 s (on a system with 24 vCPUs). See the service README.md for further details.
  • Slovenian commonsense reasoning model SloMET-ATOMIC 2020

    SloMET-ATOMIC 2020 is a Slovene commonsense reasoning model that predicts commonsense descriptions in natural language for a given input sentence. The model is an adaptation of the Slovene GPT-2 model (https://huggingface.co/cjvt/gpt-sl-base) that has been fine-tuned on the SloATOMIC 2020 corpus (http://hdl.handle.net/11356/1724), which consists of 1.33M everyday inference knowledge tuples about entities and events. The released model is a PyTorch neural network model, intended for use with the transformers library (https://github.com/huggingface/transformers).
  • CEC6-Converter

    This software converts *.cec6.gz files into 24 formats that are commonly used in corpus linguistics / NLProc. It runs on all modern operating systems (Windows, Linux, macOS). The binary files have been compiled for the x64 architecture. If you are using a processor (CPU) with an x86 or ARM architecture, please follow the instructions for "other operating systems or x86 / ARM / ARM64".
  • Slowal (2018-06-29)

    Slowal is a web tool designed for creating, editing and browsing valence dictionaries. So far, it has mainly been used for creating the Polish Valence Dictionary (Walenty). Slowal supports the process of creating the dictionary; it also facilitates access by making it possible to browse the dictionary using an advanced built-in filtering system, covering both syntactic and semantic phenomena. Slowal also gives control over the work of the lexicographers involved in creating the dictionary, for instance by using predefined lists of values, which prevents spelling errors and enforces consistency, as well as by imposing strict validation rules. Last but not least, the created dictionary can be exported from Slowal in various formats: plain text, TeX, PDF, and TEI XML. This version was adapted for creating the semantics of nouns and adjectives.
  • Punctuation model (20.09)

    A Python package that punctuates Icelandic text. The input is unpunctuated text, and punctuated text is returned. The user can choose between two punctuation models: a BERT-based Transformer and a bidirectional RNN (Punctuator 2, www.github.com/ottokart/punctuator2) in TensorFlow 2.
  • Yfirlestur Word 22.10

    Yfirlestur Word is the source code for a spelling and grammar correction add-on for Icelandic, for use with Microsoft Word. The plugin provides error annotation and replacement, based on user interaction. The source code is intended for third-party development and can be installed and tested locally using Node.js. The plugin requires third-party correction software for its functionality. For development and testing, the open-access Yfirlestur.is API produced by Miðeind was used (see https://github.com/icelandic-lt/Yfirlestur), but it is not intended for production use. This software is licensed under the MIT License. More information at https://github.com/icelandic-lt/Yfirlestur-Word.
  • GreynirSeq - A Natural Language Processing Toolkit for Icelandic (v0.2.0)

    GreynirSeq is a natural language parsing toolkit for Icelandic focused on sequence modeling with neural networks. The modeling part (nicenlp) of GreynirSeq is built on top of Fairseq from Meta (which is itself built on top of PyTorch). Command-line interfaces for POS tagging, NER tagging and machine translation are included in this version (v0.2.0). For updated versions of the software, please refer to https://github.com/mideind/GreynirSeq.
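
The NeMo Neural Machine Translation service entry above describes a request made up of a source language, a target language, and either a single string or a list of strings, with a maximum text length of 5000 characters. The Python sketch below only illustrates that shape. The field names (src_language, tgt_language, text), the helper names, and the choice to apply the limit to each string separately are assumptions made for illustration and are not taken from the service; the service README.md defines the actual interface.

```python
from typing import Dict, List, Union

MAX_CHARS = 5000  # maximum text length stated in the service description

Text = Union[str, List[str]]


def build_translation_request(text: Text, src: str = "sl", tgt: str = "en") -> Dict[str, object]:
    """Build a hypothetical request payload (field names are assumptions).

    The text is passed through unchanged, so a single string stays a single
    string and a list stays a list, mirroring the described behaviour that
    the result has the same format as the request.
    """
    items = [text] if isinstance(text, str) else list(text)
    for item in items:
        # Assumption: the 5000-character limit applies to each string.
        if len(item) > MAX_CHARS:
            raise ValueError(f"text block exceeds {MAX_CHARS} characters")
    return {"src_language": src, "tgt_language": tgt, "text": text}


def split_into_blocks(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split long text into blocks of at most `limit` characters.

    Breaks at the last space inside the limit when one exists,
    otherwise cuts hard at the limit.
    """
    blocks: List[str] = []
    while len(text) > limit:
        cut = text.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit
        blocks.append(text[:cut])
        text = text[cut:].lstrip(" ")
    if text:
        blocks.append(text)
    return blocks
```

A longer document could be passed through split_into_blocks first and the resulting list handed to build_translation_request. Splitting at spaces rather than sentence boundaries is a simplification and may affect translation quality.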