Arabic verb pattern extraction

E. M. Saad, M. H. Awadalla, A. Alajmi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Arabic is a highly inflected language, so stemming and root extraction pose a challenge to researchers. A new method is presented for extracting the stem and lemma of Arabic text. Stemming sometimes alters the meaning of a word, whereas lemmatization preserves it. The approach is based on pattern extraction. It uses a special encoding that divides letters into original and non-original letters. Codes are automatically generated for each pattern and then matched against the input text to extract the root, pattern, and lemma of a word. A comparison with other methods shows promising results, with accuracy of up to 96%.
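
The abstract does not give the encoding itself, so the following is only a minimal sketch of how pattern codes of this kind might be generated and matched. It assumes that "original" letters are the three root slots written with the placeholder letters of the traditional pattern notation, and that every other letter in a pattern is a "non-original" letter that must match literally. The pattern list, the code format, and the function names `encode`, `match`, and `analyze` are all hypothetical, and lemma extraction is left out.

```python
# Hypothetical sketch: the pattern list and the code format are assumptions,
# not the encoding published in the paper.

# Placeholder letters that stand for the three root consonants in a pattern.
ROOT_SLOTS = "فعل"

# A few well-known Arabic verb patterns, written without diacritics.
PATTERNS = ["فعل", "فاعل", "افعل", "تفعل", "تفاعل", "انفعل", "افتعل", "استفعل"]


def encode(pattern):
    """Build a code for a pattern: None marks an original (root) slot,
    any other entry is a non-original letter that must match literally."""
    return [None if ch in ROOT_SLOTS else ch for ch in pattern]


# Codes are generated automatically, once per pattern.
CODES = {pattern: encode(pattern) for pattern in PATTERNS}


def match(word, code):
    """Return the root letters if the word fits the code, else None."""
    if len(word) != len(code):
        return None
    root = []
    for letter, slot in zip(word, code):
        if slot is None:
            root.append(letter)   # original letter: collect it as part of the root
        elif slot != letter:
            return None           # non-original letter does not match
    return "".join(root)


def analyze(word):
    """Return every (root, pattern) pair whose code matches the word."""
    results = []
    for pattern, code in CODES.items():
        root = match(word, code)
        if root is not None:
            results.append((root, pattern))
    return results


if __name__ == "__main__":
    for word in ["استخرج", "كاتب", "انكسر", "انتقل"]:
        print(word, analyze(word))
```

In this sketch a word can match more than one code: انتقل fits both انفعل and افتعل, so `analyze` returns both candidates and leaves disambiguation to a later step, for example a root lexicon lookup. Words with prefixes, suffixes, or weak root letters would need extra handling that is not shown here.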

Original language: English
Title of host publication: 10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010
Pages: 642-645
Number of pages: 4
DOIs: https://doi.org/10.1109/ISSPA.2010.5605427
Publication status: Published - 2010
Event: 10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010 - Kuala Lumpur, Malaysia
Duration: May 10 2010 to May 13 2010

Keywords

  • Morphological analyzer
  • Natural language processing
  • Root extraction

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Signal Processing

Cite this

Saad, E. M., Awadalla, M. H., & Alajmi, A. (2010). Arabic verb pattern extraction. In 10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010 (pp. 642-645). [5605427] https://doi.org/10.1109/ISSPA.2010.5605427

@inproceedings{be1438d57c3944948522bcd61166afc5,
title = "Arabic verb pattern extraction",
abstract = "Arabic is a highly inflected language, so stemming and root extraction pose a challenge to researchers. A new method is presented for extracting the stem and lemma of Arabic text. Stemming sometimes alters the meaning of a word, whereas lemmatization preserves it. The approach is based on pattern extraction. It uses a special encoding that divides letters into original and non-original letters. Codes are automatically generated for each pattern and then matched against the input text to extract the root, pattern, and lemma of a word. A comparison with other methods shows promising results, with accuracy of up to 96{\%}.",
keywords = "Morphological analyzer, Natural language processing, Root extraction",
author = "Saad, {E. M.} and Awadalla, {M. H.} and Alajmi, A.",
year = "2010",
doi = "10.1109/ISSPA.2010.5605427",
language = "English",
isbn = "9781424471676",
pages = "642--645",
booktitle = "10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010",

}

TY  - GEN
T1  - Arabic verb pattern extraction
AU  - Saad, E. M.
AU  - Awadalla, M. H.
AU  - Alajmi, A.
PY  - 2010
Y1  - 2010
AB  - Arabic is a highly inflected language, so stemming and root extraction pose a challenge to researchers. A new method is presented for extracting the stem and lemma of Arabic text. Stemming sometimes alters the meaning of a word, whereas lemmatization preserves it. The approach is based on pattern extraction. It uses a special encoding that divides letters into original and non-original letters. Codes are automatically generated for each pattern and then matched against the input text to extract the root, pattern, and lemma of a word. A comparison with other methods shows promising results, with accuracy of up to 96%.
KW  - Morphological analyzer
KW  - Natural language processing
KW  - Root extraction
UR  - http://www.scopus.com/inward/record.url?scp=78650299233&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=78650299233&partnerID=8YFLogxK
U2  - 10.1109/ISSPA.2010.5605427
DO  - 10.1109/ISSPA.2010.5605427
M3  - Conference contribution
SN  - 9781424471676
SP  - 642
EP  - 645
BT  - 10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010
ER  -