Open Access
Ungoliant: An Optimized Pipeline for the Generation of a Very Large-Scale Multilingual Web Corpus
Author(s) - Julien Abadji, Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot
Publication year - 2021
Publication title - hal (le centre pour la communication scientifique directe)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.14618/ids-pub-10468
Subject(s) - computer science, pipeline (software), metadata, natural language processing, license, artificial intelligence, text corpus, information retrieval, resource (disambiguation), corpus linguistics, modular design, world wide web, programming language, computer network, operating system
