Low-resource Languages: A Review of Past Work and Future Challenges

2020-06-12
Computation and Language
Alexandre Magueresse, Vincent Carles, Evan Heetderks
IF: 0
DOI: 10.48550/arXiv.2006.07264
Open access: Yes
Research category: No data

Comprehensive information

Keywords
Low-resource Languages
NLP
Supervised data
Native speakers
Experts

Abstract

A current problem in NLP is massaging and processing low-resource languages which lack useful training attributes such as supervised data, number of native speakers or experts, etc. This review paper concisely summarizes previous groundbreaking achievements made towards resolving this problem, and analyzes potential improvements in the context of the overall future research direction.

References


Related articles

[1] Opportunities and Challenges of Large Language Models for Low-Resource Languages in Humanities Research
Low-resource languages serve as invaluable repositories of human history, embodying cultural evolution and intellectual diversity.
Tianyang Zhong, Zhenyuan Yang, and 10 others
2024-11-30
Computation and Language
[2] Natural language processing applications for low-resource languages
Natural language processing (NLP) has significantly advanced our ability to model and interact with human language through technology.
Partha Pakray, Alexander Gelbukh, and 1 other
2025-02-28
Natural Language Processing
[3] Multilingual Knowledge Graphs and Low-Resource Languages: A Review
There is a lack of multilingual data to support applications in a large number of languages, especially for low-resource languages.
Lucie-Aimée Kaffee, Russa Biswas, and 3 others
2023-01-01
Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023
[4] Natural Language Understanding of Low-Resource Languages in Voice Assistants: Advancements, Challenges and Mitigation Strategies
This paper presents an exploration of low resource languages and the specific challenges that arise in natural language understanding of these by a voice assistant.
Ashlesha V Kadam
2023-01-01
International Journal of Language, Literature and Culture
[5] Linguistic Challenges in Generative Artificial Intelligence: Implications for Low-Resource Languages in the Developing World
Proficiency in English is pivotal for leveraging information and communication technologies, but it holds even greater significance in the realm of generative artificial intelligence (GAI), which is poised as the next digital frontier.
Nir Kshetri
2024-04-29
Journal of Global Information Technology Management (IF 3)
[6] How Low is Too Low? A Computational Perspective on Extremely Low-Resource Languages
Despite the recent advancements of attention-based deep learning architectures across a majority of Natural Language Processing tasks, their application remains limited in a low-resource setting because of a lack of pre-trained models for such languages.
Rachit Bansal, Himanshu Choudhary, and 4 others
2021-05-30
Computation and Language
[7] Transcending Language Boundaries: Harnessing LLMs for Low-Resource Language Translation
Large Language Models (LLMs) have demonstrated remarkable success across a wide range of tasks and domains.
Peng Shu, Junhao Chen, and 14 others
2024-11-18
Computation and Language
[8] Steps towards Addressing Text Classification in Low-Resource Languages
Text classification is an area of NLP in which major improvements have been observed in recent years, primarily via pre-training and fine-tuning of large language models (LLMs).
Maximilian Weissenbacher, Udo Kruschwitz
2023-01-01
Proceedings of the 19th Conference on Natural Language Processing (KONVENS …, 2023

Journal information

Journal name: arXiv - Computation and Language
Journal name abbreviation: Computer Science / cs.CL
Official website: https://arxiv.org/list/cs.CL/recent
IF: 0
Open access: Yes
Publisher: arXiv
Review journal: No
Journal description: arXiv.org is a free online archive of preprint and postprint manuscripts in physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. arXiv.org does not perform peer review. However, all articles are subject to a moderation process that classifies material by subject area and checks for scholarly value. Authors may submit preprint articles to arXiv.org prior to, or simultaneously with, submission to a journal. arXiv.org thus allows authors to make their findings immediately available to the scientific community without undergoing the peer review process. This makes arXiv.org a useful source for finding new research, but it is important to remember that preprint articles have not been peer reviewed, nor have they undergone the editing that journal articles receive.