Improving NLTK for Processing Portuguese

Media type: Electronic conference proceedings; E-article; Other publication

Title: Improving NLTK for Processing Portuguese

Contributors: Ferreira, João [Author]; Gonçalo Oliveira, Hugo [Author]; Rodrigues, Ricardo [Author]

Published: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019

Language: English

DOI: https://doi.org/10.4230/OASIcs.SLATE.2019.18

Keywords: NLP; Tokenization; Lemmatization; PoS tagging; Named Entity Recognition

Notes: This data source also contains holdings records that do not lead to a full text.

Description: Python has a growing community of users, especially in the AI and ML fields. Yet the computational processing of Portuguese in this programming language is limited, both in the tools available and in the results they achieve. This paper describes NLPyPort, an NLP pipeline in Python, primarily based on NLTK and focused on Portuguese. It is mostly assembled from pre-existing resources or adaptations of them, but it improves on the performance of existing alternatives in Python, namely in the tasks of tokenization, PoS tagging, lemmatization and NER.

Access status: Open access
