Three-level morphological analyzer for Arabic verbs and particles

Fatima T. Al-Raisi, Anisa M. Al-Hafeedh, Salha M. Al-Farsi, Hamza Z. Zidoum

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a three-level morphological analyzer (MA). Our approach mimics the morphological processing carried out by an expert human linguist, so great emphasis is placed on the analysis and representation of Arabic linguistic rules; this step is crucial to building a reliable MA. In the three-level MA, surface words (tokens) are stemmed to produce the corresponding stems, and roots are then generated from the resulting stems. Stemming follows a multi-affix approach: the algorithm performs iterative light stemming, which strips part of the prefix or suffix at each iteration. From a linguistic point of view, a prefix or suffix is not a single string of characters but a combination of letters that may represent several distinct entities, and light stemming helps extract information from each of them by considering each separately. The root-generating algorithm identifies the form of a stem and extracts the root from it; it also manipulates deviated stems so that they can be treated uniformly. The MA is equipped with a comprehensive lexicon to ensure the correctness of results.
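
The pipeline described in the abstract (token to stem by iterative light stemming, then stem to root by form matching, validated against a lexicon) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the affix list, form templates, lexicon, and function names are all hypothetical, the words are written in Buckwalter-style transliteration for readability, and the handling of deviated stems is omitted.

```python
# Illustrative sketch only: all tables below are hypothetical stand-ins,
# written in Buckwalter-style transliteration, not the paper's rule set.
PREFIXES = ["w", "f", "s", "y", "t", "n"]   # conjunctions, future marker, person markers
SUFFIXES = ["hA", "hm", "wn", "wA", "t"]    # object pronouns, subject markers
# Verb-form templates: digits 1-3 mark root slots, other letters are literal.
PATTERNS = ["123", "1A23", "A1t23", "Ast123"]
LEXICON = {"ktb", "drs", "Eml"}             # toy root lexicon used for validation


def light_stem(token, min_len=3):
    """Level 1 -> 2: strip at most one prefix part and one suffix part per pass."""
    prefixes, suffixes, stem = [], [], token
    changed = True
    while changed:
        changed = False
        for p in PREFIXES:
            if stem.startswith(p) and len(stem) - len(p) >= min_len:
                prefixes.append(p)
                stem = stem[len(p):]
                changed = True
                break
        for s in SUFFIXES:
            if stem.endswith(s) and len(stem) - len(s) >= min_len:
                suffixes.insert(0, s)   # keep suffix parts in surface order
                stem = stem[:-len(s)]
                changed = True
                break
    return prefixes, stem, suffixes


def extract_root(stem):
    """Level 2 -> 3: find a template the stem fits and read off the slot letters."""
    for pattern in PATTERNS:
        if len(pattern) != len(stem):
            continue
        root = ""
        for p_ch, s_ch in zip(pattern, stem):
            if p_ch.isdigit():
                root += s_ch            # root slot: collect the stem letter
            elif p_ch != s_ch:
                break                   # literal template letter does not match
        else:
            if root in LEXICON:         # accept only roots the lexicon knows
                return root, pattern
    return None


def analyze(token):
    """Run both steps and report every separated affix part with the root."""
    prefixes, stem, suffixes = light_stem(token)
    return {"token": token, "prefixes": prefixes, "stem": stem,
            "suffixes": suffixes, "root": extract_root(stem)}
```

With these toy tables, `analyze("wsyktbwnhA")` separates the prefix parts `w`, `s`, `y` and the suffix parts `wn`, `hA`, leaving the stem `ktb`. Recording each stripped part individually is what lets an analyzer attach a distinct meaning to each piece of a compound prefix or suffix.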

Original language: English
Title of host publication: Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing
Editors: A.P. Pobil
Pages: 41-47
Number of pages: 7
Publication status: Published - 2004
Event: Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing - Marbella, Spain
Duration: Sep 1 2004 → Sep 3 2004

Other

Other: Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing
Country: Spain
City: Marbella
Period: 9/1/04 → 9/3/04

Fingerprint

Linguistics
Processing

Keywords

  • Arabic processing
  • Computational linguistics
  • Morphological analysis

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Al-Raisi, F. T., Al-Hafeedh, A. M., Al-Farsi, S. M., & Zidoum, H. Z. (2004). Three-level morphological analyzer for Arabic verbs and particles. In A. P. Pobil (Ed.), Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing (pp. 41-47).

Three-level morphological analyzer for Arabic verbs and particles. / Al-Raisi, Fatima T.; Al-Hafeedh, Anisa M.; Al-Farsi, Salha M.; Zidoum, Hamza Z.

Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing. ed. / A.P. Pobil. 2004. p. 41-47.

Al-Raisi, FT, Al-Hafeedh, AM, Al-Farsi, SM & Zidoum, HZ 2004, Three-level morphological analyzer for Arabic verbs and particles. in AP Pobil (ed.), Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing. pp. 41-47, Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing, Marbella, Spain, 9/1/04.
Al-Raisi FT, Al-Hafeedh AM, Al-Farsi SM, Zidoum HZ. Three-level morphological analyzer for Arabic verbs and particles. In Pobil AP, editor, Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing. 2004. p. 41-47.
Al-Raisi, Fatima T. ; Al-Hafeedh, Anisa M. ; Al-Farsi, Salha M. ; Zidoum, Hamza Z. / Three-level morphological analyzer for Arabic verbs and particles. Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing. editor / A.P. Pobil. 2004. pp. 41-47.
@inproceedings{65d33983cea14d2c80304aa9f2a0b0ca,
title = "Three-level morphological analyzer for {A}rabic verbs and particles",
abstract = "This paper presents a Three-level Morphological Analyzer (MA). Our approach consists of mimicking morphology processing carried out by a human linguist expert. Hence, a great emphasis is put on the analysis and representation of Arabic linguistic rules. This step is very crucial in order to come up with a reliable MA. In the Three-level MA, surface words (tokens) undergo stemming to produce corresponding stems. Roots are then generated from resultant stems. A multi-affix approach is considered when stemming tokens. The stemming algorithm performs iterative light stemming which strips a part of the prefix/suffix. Indeed, from the linguistic point of view, a prefix/suffix is not just one string of characters. It is rather a combination of letters that may represent a number of distinct entities. Light stemming helps extracting information from each prefix/suffix by considering each separately. The root generating algorithm identifies the form of a stem, wherefrom, it extracts the root. The root generating algorithm manipulates deviated stems for unified treatment purposes. The MA is equipped with a comprehensive coverage lexicon to ensure correctness of results.",
keywords = "Arabic processing, Computational linguistics, Morphological analysis",
author = "Al-Raisi, {Fatima T.} and Al-Hafeedh, {Anisa M.} and Al-Farsi, {Salha M.} and Zidoum, {Hamza Z.}",
year = "2004",
language = "English",
isbn = "0889864586",
pages = "41--47",
editor = "A.P. Pobil",
booktitle = "Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing",

}

TY - GEN

T1 - Three-level morphological analyzer for Arabic verbs and particles

AU - Al-Raisi, Fatima T.

AU - Al-Hafeedh, Anisa M.

AU - Al-Farsi, Salha M.

AU - Zidoum, Hamza Z.

PY - 2004

Y1 - 2004

AB - This paper presents a Three-level Morphological Analyzer (MA). Our approach consists of mimicking morphology processing carried out by a human linguist expert. Hence, a great emphasis is put on the analysis and representation of Arabic linguistic rules. This step is very crucial in order to come up with a reliable MA. In the Three-level MA, surface words (tokens) undergo stemming to produce corresponding stems. Roots are then generated from resultant stems. A multi-affix approach is considered when stemming tokens. The stemming algorithm performs iterative light stemming which strips a part of the prefix/suffix. Indeed, from the linguistic point of view, a prefix/suffix is not just one string of characters. It is rather a combination of letters that may represent a number of distinct entities. Light stemming helps extracting information from each prefix/suffix by considering each separately. The root generating algorithm identifies the form of a stem, wherefrom, it extracts the root. The root generating algorithm manipulates deviated stems for unified treatment purposes. The MA is equipped with a comprehensive coverage lexicon to ensure correctness of results.

KW - Arabic processing

KW - Computational linguistics

KW - Morphological analysis

UR - http://www.scopus.com/inward/record.url?scp=10444256511&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=10444256511&partnerID=8YFLogxK

M3 - Conference contribution

SN - 0889864586

SP - 41

EP - 47

BT - Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing

A2 - Pobil, A.P.

ER -