Abstract
This paper presents a Three-level Morphological Analyzer (MA) for Arabic. The approach mimics the morphological processing carried out by an expert human linguist, so great emphasis is placed on analyzing and representing Arabic linguistic rules, a step that is crucial to building a reliable MA. In the Three-level MA, surface words (tokens) are stemmed to produce the corresponding stems, and roots are then generated from the resulting stems. Stemming follows a multi-affix approach: the algorithm performs iterative light stemming, stripping one part of the prefix or suffix at each pass. From a linguistic point of view, a prefix or suffix is not a single string of characters but a combination of letters that may represent several distinct entities. Light stemming helps extract information from each prefix or suffix by considering each of these entities separately. The root-generating algorithm identifies the form of a stem and extracts the root from it. It also manipulates deviated stems so that they can be treated uniformly. The MA is equipped with a comprehensive lexicon to ensure the correctness of its results.
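The token → stem → root flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the affix lists, the pattern list, the `MIN_STEM` threshold, and the function names `light_stem` and `extract_root` are hypothetical and are not taken from the paper. The sketch also omits the lexicon check and the handling of deviated stems that the abstract mentions, so it will over- or under-stem many real words.

```python
# Hypothetical, deliberately tiny inventories; NOT the paper's rules or lexicon.
PREFIXES = ["ال", "و", "ف", "ب", "ل"]          # article, conjunctions, prepositions
SUFFIXES = ["ها", "هم", "ات", "ون", "ين", "ة"]  # pronouns, plural/feminine markers
# Patterns use the letters ف ع ل as placeholders for the three root consonants.
PATTERNS = ["فاعل", "مفعول", "فعال", "فعل"]
MIN_STEM = 3  # assumed lower bound on stem length, to limit over-stemming


def light_stem(token):
    """Strip one prefix part and one suffix part per pass until nothing matches.

    Returns (prefix_parts, stem, suffix_parts) so each affix part can be
    inspected separately, in the spirit of the multi-affix approach.
    """
    prefixes, suffixes = [], []
    stem = token
    changed = True
    while changed:
        changed = False
        for p in PREFIXES:
            if stem.startswith(p) and len(stem) - len(p) >= MIN_STEM:
                prefixes.append(p)
                stem = stem[len(p):]
                changed = True
                break
        for s in SUFFIXES:
            if stem.endswith(s) and len(stem) - len(s) >= MIN_STEM:
                suffixes.insert(0, s)  # keep suffix parts in surface order
                stem = stem[:-len(s)]
                changed = True
                break
    return prefixes, stem, suffixes


def extract_root(stem):
    """Match the stem against each pattern and collect the placeholder slots."""
    for pattern in PATTERNS:
        if len(pattern) != len(stem):
            continue
        root = []
        for pc, sc in zip(pattern, stem):
            if pc in "فعل":      # placeholder position: take the stem letter
                root.append(sc)
            elif pc != sc:        # fixed pattern letter must match exactly
                break
        else:
            return "".join(root)
    return None  # no known form matched


if __name__ == "__main__":
    prefixes, stem, suffixes = light_stem("والكاتب")
    print(prefixes, stem, suffixes, extract_root(stem))
```

Under these assumptions, the token والكاتب is split into the prefix parts و and ال and the stem كاتب, which matches the form فاعل and yields the root كتب. Returning the affix parts as a list rather than one concatenated string is the point of the design: each part can then be looked up and interpreted on its own.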
Original language | English |
---|---|
Title of host publication | Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing |
Editors | A.P. Pobil |
Pages | 41-47 |
Number of pages | 7 |
Publication status | Published - 2004 |
Event | Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing - Marbella, Spain. Duration: Sep 1 2004 → Sep 3 2004 |
Other
Other | Proceedings of the Eighth IASTED International Conference on Artificial Intelligence and Soft Computing |
---|---|
Country | Spain |
City | Marbella |
Period | 9/1/04 → 9/3/04 |
Keywords
- Arabic processing
- Computational linguistics
- Morphological analysis
ASJC Scopus subject areas
- Engineering(all)