TY - GEN
T1 - Arabic verb pattern extraction
AU - Saad, E. M.
AU - Awadalla, M. H.
AU - Alajmi, A.
PY - 2010
Y1 - 2010
N2 - Arabic is a highly inflected language, and therefore the processes of stemming and root extraction represent a challenge to researchers. A new method is presented for extracting the stem and lemma of Arabic text. Stemming sometimes alters the semantics of a word, whereas the lemma preserves its meaning. The approach is based on pattern extraction. It uses a special encoding based on dividing letters into original and non-original letters. Codes are automatically generated for each pattern and then matched against the input text to extract the root, pattern, and lemma of a word. A comparison with other methods shows promising results, with accuracy of up to 96%.
AB - Arabic is a highly inflected language, and therefore the processes of stemming and root extraction represent a challenge to researchers. A new method is presented for extracting the stem and lemma of Arabic text. Stemming sometimes alters the semantics of a word, whereas the lemma preserves its meaning. The approach is based on pattern extraction. It uses a special encoding based on dividing letters into original and non-original letters. Codes are automatically generated for each pattern and then matched against the input text to extract the root, pattern, and lemma of a word. A comparison with other methods shows promising results, with accuracy of up to 96%.
KW - Morphological analyzer
KW - Natural language processing
KW - Root extraction
UR - http://www.scopus.com/inward/record.url?scp=78650299233&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78650299233&partnerID=8YFLogxK
U2 - 10.1109/ISSPA.2010.5605427
DO - 10.1109/ISSPA.2010.5605427
M3 - Conference contribution
AN - SCOPUS:78650299233
SN - 9781424471676
T3 - 10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010
SP - 642
EP - 645
BT - 10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010
T2 - 10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010
Y2 - 10 May 2010 through 13 May 2010
ER -