Human emotion recognition in Thai short text

Ohm Sornil
Jirawan Charoensuk
2023-05-15
2018
b207808
https://repository.nida.ac.th/handle/662723737/6423
Thesis (Ph.D. (Computer Science and Information Systems))--National Institute of Development Administration, 2018

Emotion classification is a topic in affective computing that applies to research areas such as speech synthesis, image processing, and especially text processing. Its aim is to identify a suitable emotion label for each review. In this research, a hierarchical classification framework is proposed to identify emotions (objective opinion, anger, disgust, fear, sadness, happiness, and surprise) in actual customer reviews written in Thai. The framework consists of three levels: opinion, sentiment, and emotion. First, the opinion level separates customer reviews into two types, namely objective and subjective opinions. Second, the sentiment level categorizes the subjective opinions as either positive or negative. Last, at the emotion level, an emotion label is assigned to each opinion: anger, disgust, fear, happiness, sadness, or surprise. The proposed method consists of three main processes: (1) text preprocessing, (2) feature extraction, and (3) emotion classification. Text preprocessing provides the necessary information and normalizes the words in the reviews; it comprises word segmentation, part-of-speech (POS) tagging, word replacement, and stop-word elimination. Feature extraction constructs a vector space model (VSM) for opinion classification. Five feature sets for generating the VSM are created using a corpus- and lexicon-based approach: the term frequency-inverse document frequency (Tf-Idf) of unigram words (TUW), bigram words (TBW), unigram POS (TUP), and bigram POS (TBP), and a Thai sentiment lexicon (TSL).
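As an illustration of the Tf-Idf feature sets named above, the following is a minimal sketch in plain Python, not the thesis implementation. It assumes the reviews are already segmented into tokens, and it assumes one common weighting (relative term frequency times log(N / df)); the abstract does not state which Tf-Idf variant was used. The names `ngrams`, `tfidf_vectors`, and the sample reviews are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n):
    """Build one sparse Tf-Idf vector (a dict) per document over n-gram terms.

    Assumed weighting: tf = count / number of n-grams in the document,
    idf = log(N / df). This variant is an assumption, not taken from the thesis.
    """
    grams_per_doc = [ngrams(doc, n) for doc in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, already-segmented reviews (English stand-ins for Thai tokens).
reviews = [
    ["food", "very", "good"],
    ["service", "very", "slow"],
    ["food", "cold"],
]
tuw = tfidf_vectors(reviews, 1)  # unigram words (TUW)
tbw = tfidf_vectors(reviews, 2)  # bigram words (TBW)
```

Passing sequences of POS tags instead of words to the same function would give analogues of the TUP and TBP feature sets.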
Furthermore, a decision tree, multinomial naïve Bayes, and a support vector machine (SVM) are used as classifiers in the emotion classification process. The experimental results show that the hierarchical approach, in which the subjectivity of a review is determined first, the polarity of the opinion is identified next, and the emotion label is assigned last, yielded the highest performance, with an accuracy of 69.60%. Overall, TBW was the most effective feature subset for filtering opinions, determining polarity, and classifying negative emotions. Lexicon resources such as TSL and the POS tag sets at the morphological level improved the accuracy of opinion filtering in two- and three-level hierarchical classification. SVM achieved high performance in identifying contrasting opinions, such as objective versus subjective opinions and positive versus negative sentiment. Meanwhile, multinomial naïve Bayes performed best when identifying closely related emotions, such as happiness versus surprise in positive emotion classification.

68 leaves
application/pdf
eng
This work is published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license (CC BY-NC-ND 4.0)
Emotion
User Interfaces and Human Computer Interaction
text--thesis--doctoral thesis
10.14457/NIDA.the.2018.127
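The three-level routing described in the abstract (opinion, then sentiment, then emotion) can be sketched as a simple cascade. This is an illustrative sketch only: the function `classify_hierarchical` and the keyword-based `toy_*` rules are hypothetical stand-ins for trained classifiers such as SVM or multinomial naïve Bayes, and grouping anger, disgust, fear, and sadness under the negative branch is an inference from the abstract rather than a stated detail.

```python
def classify_hierarchical(tokens, opinion_clf, sentiment_clf, emotion_clfs):
    """Route one tokenized review through the opinion, sentiment, and emotion levels."""
    # Level 1: objective reviews stop here and receive no emotion label.
    if opinion_clf(tokens) == "objective":
        return "objective"
    # Level 2: polarity of the subjective opinion ("positive" or "negative").
    polarity = sentiment_clf(tokens)
    # Level 3: an emotion classifier specific to that polarity.
    return emotion_clfs[polarity](tokens)


# Toy keyword rules standing in for trained models (hypothetical).
POSITIVE_CUES = {"good", "love", "wow"}
NEGATIVE_CUES = {"bad", "angry", "scared"}


def toy_opinion(tokens):
    has_cue = (POSITIVE_CUES | NEGATIVE_CUES) & set(tokens)
    return "subjective" if has_cue else "objective"


def toy_sentiment(tokens):
    return "positive" if POSITIVE_CUES & set(tokens) else "negative"


def toy_positive(tokens):
    return "surprise" if "wow" in tokens else "happiness"


def toy_negative(tokens):
    if "angry" in tokens:
        return "anger"
    if "scared" in tokens:
        return "fear"
    return "sadness"


emotion_clfs = {"positive": toy_positive, "negative": toy_negative}

label = classify_hierarchical(
    ["wow", "good", "food"], toy_opinion, toy_sentiment, emotion_clfs
)
```

Keeping each level as a separate callable makes it possible to use a different model per level, which matches the abstract's observation that SVM and multinomial naïve Bayes performed best on different sub-tasks.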