Human emotion recognition in Thai short text
Issued Date
2018
Issued Date (B.E.)
2561
Language
eng
File Type
application/pdf
No. of Pages/File Size
68 leaves
Other identifier(s)
b207808
Rights
This work is published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license (CC BY-NC-ND 4.0)
Physical Location
National Institute of Development Administration. Library and Information Center
Citation
Jirawan Charoensuk (2018). Human emotion recognition in Thai short text. Retrieved from: https://repository.nida.ac.th/handle/662723737/6423.
Title
Human emotion recognition in Thai short text
Abstract
Emotion classification is one of the topics in affective computing applicable in
various research areas such as speech synthesis, image processing, and especially, text
processing. Emotion classification is aimed at identifying a suitable emotion label for
each review. In this research, a hierarchical classification framework to identify
emotions (objective opinion and anger, disgust, fear, sadness, happiness, and surprise)
is proposed for actual customer reviews written in Thai. The hierarchical classification
framework consists of three levels: opinion, sentiment, and emotion. First, the opinion
level distinguishes customers’ reviews into two types, namely objective and subjective
opinions. Second, the sentiment level is used to categorize the subjective opinions as
either positive or negative. Last, in the emotion level, an emotion label is assigned to
an opinion as either anger, disgust, fear, happiness, sadness, or surprise. The proposed
method consists of three main processes: (1) text preprocessing, (2) feature extraction,
and (3) emotion classification. Text preprocessing provides necessary information and
normalization of words in the reviews and comprises word segmentation, part-of-speech
(POS) tagging, word replacement, and stop-word elimination. Feature
extraction is a process to construct a vector space model (VSM) for opinion
classification. Five feature sets for generating the VSM are created by using a
corpus- and lexicon-based approach: the term frequency-inverse document frequency (Tf-Idf)
of unigram words (TUW), bigram words (TBW), unigram POS (TUP), and bigram POS
(TBP), and a Thai sentiment lexicon (TSL). Furthermore, a decision tree, multinomial
naïve Bayes, and a support vector machine (SVM) are used as classifiers in the emotion
classification process.
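
The Tf-Idf n-gram feature sets described above (for example TUW and TBW) can be illustrated with a minimal sketch. It assumes the reviews have already been segmented into word tokens, and it uses one common Tf-Idf weighting (relative term frequency times log(N/df)); the exact weighting used in the thesis is not stated in this abstract. The function names and the toy English tokens are hypothetical and stand in for segmented Thai words.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n):
    """Build one sparse Tf-Idf vector (a dict) per tokenized document.

    Assumed weighting (illustrative only):
      tf  = n-gram count / number of n-grams in the document
      idf = log(number of documents / document frequency)
    """
    grams_per_doc = [ngrams(doc, n) for doc in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)
        vectors.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return vectors


# Toy, pre-segmented reviews (hypothetical data).
docs = [
    ["food", "very", "good"],
    ["food", "not", "good"],
    ["service", "very", "slow"],
]
tuw = tfidf_vectors(docs, 1)  # unigram-word features
tbw = tfidf_vectors(docs, 2)  # bigram-word features
```

The same function could be applied to sequences of POS tags instead of words to obtain the unigram-POS and bigram-POS counterparts, assuming a POS tagger has produced those sequences.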
The experimental results show that the hierarchical approach, in which the
subjectivity of a review is first determined, the polarity of the opinion is then
identified, and the emotion label is finally assigned, yielded the highest performance,
with an accuracy of 69.60%. Overall, TBW was the most effective feature subset used for filtering
opinions, determining polarity, and classifying negative emotions. Lexicon resources
such as TSL and the POS tag sets in the morphology level improved the accuracy of
opinion filtering in two- and three-level hierarchical classification. SVM achieved a
high performance in identifying contrasting opinions such as objective versus
subjective opinions and positive versus negative sentiment. Meanwhile, multinomial
naïve Bayes performed the best when identifying closely related emotions such as
happiness versus surprise in positive emotion classification.
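
The three-level routing described in the abstract (opinion, then sentiment, then emotion) can be sketched as a simple cascade. This is an illustration under stated assumptions, not the thesis implementation: the keyword-based classifiers below are hypothetical stand-ins for trained models such as the SVM or multinomial naïve Bayes, and the grouping of anger, disgust, fear, and sadness under the negative branch is inferred from the abstract rather than stated in it.

```python
def classify_emotion(tokens, opinion_clf, sentiment_clf, emotion_clfs):
    """Route a tokenized review through the opinion, sentiment, and emotion levels.

    opinion_clf   -> "objective" or "subjective"
    sentiment_clf -> "positive" or "negative"
    emotion_clfs  -> dict mapping each polarity to an emotion classifier
    """
    # Level 1: objective reviews get no emotion label.
    if opinion_clf(tokens) == "objective":
        return "objective"
    # Level 2: polarity selects which emotion classifier is used.
    polarity = sentiment_clf(tokens)
    # Level 3: choose an emotion within the selected polarity.
    return emotion_clfs[polarity](tokens)


# Hypothetical keyword stubs standing in for trained classifiers.
POSITIVE_WORDS = {"love", "wow", "great"}
NEGATIVE_WORDS = {"hate", "awful", "scary"}


def toy_opinion(tokens):
    has_cue = any(t in POSITIVE_WORDS | NEGATIVE_WORDS for t in tokens)
    return "subjective" if has_cue else "objective"


def toy_sentiment(tokens):
    return "positive" if any(t in POSITIVE_WORDS for t in tokens) else "negative"


def toy_positive_emotion(tokens):
    return "surprise" if "wow" in tokens else "happiness"


def toy_negative_emotion(tokens):
    return "fear" if "scary" in tokens else "anger"


emotion_clfs = {"positive": toy_positive_emotion, "negative": toy_negative_emotion}

label_a = classify_emotion(
    ["the", "shop", "opens", "at", "nine"], toy_opinion, toy_sentiment, emotion_clfs
)
label_b = classify_emotion(
    ["i", "hate", "this"], toy_opinion, toy_sentiment, emotion_clfs
)
label_c = classify_emotion(
    ["wow", "love", "it"], toy_opinion, toy_sentiment, emotion_clfs
)
```

Because each level is passed in as a separate callable, a different classifier can be used per level, which matches the finding that different classifiers performed best at different levels.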
Description
Thesis (Ph.D. (Computer Science and Information Systems))--National Institute of Development Administration, 2018