Part of Speech (POS) tag sets reduction and analysis using rough set techniques

Mohamed Elhadi*, Amjd Al-Tobi

*Corresponding author for this work

Research output: Conference contribution

Abstract

The motivation behind this work stems from an earlier study in which text was transformed into strings of syntactic structures, generated by a POS tagger, and compared using a sequence algorithm to calculate similarity. The performance of these computations was strongly affected by the length of the string, which in turn depends on the type of tags used. Generated tag sets range from a few general tags (a minimum of nine) to hundreds of detailed ones. Determining which tags, and which combinations of tags, affect the realization of the meanings, dependencies, or relationships that exist in the text is an important issue. Reducing the tag set using rough sets, and consequently reducing the strings, improved the efficiency of similarity calculations between documents while maintaining the same level of accuracy. This finding was very encouraging.
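The idea of a rough-set tag reduction can be illustrated with a minimal sketch. It is not the authors' implementation: the decision table, the tag names, the class labels, and the helper names (`partition`, `positive_region`, `minimal_reducts`, `reduce_tag_string`) are all invented for illustration. The sketch searches by brute force for the smallest subsets of tags that classify the toy data as well as the full tag set does, then, as a further assumption, shortens a tag string by keeping only the retained tags.

```python
from itertools import combinations

TAGS = ["NN", "VB", "JJ", "DT"]  # hypothetical condition attributes

# Toy decision table (invented data): tag presence per text unit, 1 = present.
ROWS = [
    {"NN": 1, "VB": 0, "JJ": 0, "DT": 1},
    {"NN": 1, "VB": 1, "JJ": 0, "DT": 1},
    {"NN": 0, "VB": 1, "JJ": 0, "DT": 1},
    {"NN": 0, "VB": 0, "JJ": 1, "DT": 1},
]
DECISIONS = ["A", "B", "C", "D"]  # hypothetical class label of each row


def partition(rows, attrs):
    """Group row indices that are indiscernible on the given attributes."""
    blocks = {}
    for i, row in enumerate(rows):
        blocks.setdefault(tuple(row[a] for a in attrs), []).append(i)
    return list(blocks.values())


def positive_region(rows, decisions, attrs):
    """Rows whose indiscernibility class carries a single decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({decisions[i] for i in block}) == 1:
            pos.update(block)
    return pos


def minimal_reducts(rows, decisions, attrs):
    """Smallest attribute subsets that preserve the full positive region."""
    full = positive_region(rows, decisions, attrs)
    for size in range(1, len(attrs) + 1):
        found = [set(c) for c in combinations(attrs, size)
                 if positive_region(rows, decisions, c) == full]
        if found:
            return found
    return [set(attrs)]


def reduce_tag_string(tag_sequence, kept):
    """Assumed reduction step: drop every tag outside the retained set."""
    return [t for t in tag_sequence if t in kept]


reducts = minimal_reducts(ROWS, DECISIONS, TAGS)
print([sorted(r) for r in reducts])  # [['NN', 'VB']]
print(reduce_tag_string(["DT", "JJ", "NN", "VB", "DT", "NN"], reducts[0]))
# ['NN', 'VB', 'NN']
```

The exhaustive search is exponential in the number of tags, so it only suits a toy table like this one; for tag sets with hundreds of entries a heuristic reduct search would be needed.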

Original language: English
Title of host publication: Rough Sets, Fuzzy Sets, Data Mining and Granular Computing - 12th International Conference, RSFDGrC 2009, Proceedings
Pages: 223-230
Number of pages: 8
Publication status: Published - 2009
Externally published: Yes
Event: 12th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, RSFDGrC 2009 - Delhi, India
Duration: 15 December 2009 to 18 December 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5908 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, RSFDGrC 2009
Country/Territory: India
City: Delhi
Duration: 15 December 2009 to 18 December 2009

ASJC Scopus subject areas

  • Theoretical Computer Science (ASJC 2614)
  • General Computer Science (ASJC 1700)
