Thai rhetorical structure analysis
Issued Date
2009
Language
eng
File Type
application/pdf
No. of Pages/File Size
99 leaves : ill. ; 30 cm.
Rights
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Physical Location
National Institute of Development Administration. Library and Information Center
Citation
Somnuk Sinthupoun (2009). Thai rhetorical structure analysis. Retrieved from: http://repository.nida.ac.th/handle/662723737/283.
Title
Thai rhetorical structure analysis
Author(s)
Somnuk Sinthupoun
Abstract
Rhetorical Structure Analysis (RSA) explores Discourse Relations (DRs)
among related Elementary Discourse Units (EDUs) in a Rhetorical Structure Tree (RS
tree) to describe the meaning of a text. It is useful in many text processing tasks
that take relationships among EDUs as input, such as text understanding,
summarization, discourse parsing, machine translation, and question answering. The
Thai language, with its distinctive linguistic characteristics, requires its own
technique. Thai text has no explicit EDU boundaries, and EDU constituents are
often omitted. Within a Thai RS tree, DR determination is further complicated by
adjacent markers, implicit markers, and marker ambiguities.
This dissertation proposes a new approach to Thai RSA that consists of
three steps. The first is EDU segmentation, where EDUs are segmented by phrase and
syntactic hidden Markov models, which are derived from phrase and syntactic
structure rules, respectively. The second is RS tree construction, where an RS tree is
constructed using a clustering technique whose similarity matrix is calculated from
three Thai semantic rules: Repetition rules, established from the repetition of EDU
constituents; Omission rules, established from the omission of EDU constituents; and
Addition rules, established by adding markers to EDUs. The third is DR
determination, where decision tree learning, with features derived by relating two EDUs
in an RS tree using the semantic rules, determines the Discourse Relations.
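As a rough illustration of the second step only, the sketch below builds a binary tree over EDUs by repeatedly merging the most similar pair of adjacent spans. It is not the dissertation's algorithm: the function name build_rs_tree, the average-linkage scoring, the restriction to adjacent spans, and the similarity values are all assumptions made for this example. The Repetition, Omission, and Addition rules that would produce the real similarity matrix are not implemented here.

```python
from typing import List, Tuple, Union

# A tree is either an EDU index (leaf) or a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]


def build_rs_tree(similarity: List[List[float]]) -> Tree:
    """Greedy bottom-up clustering over adjacent EDU spans.

    `similarity[i][j]` is an assumed, precomputed similarity between
    EDU i and EDU j (hypothetical values; not the thesis's rules).
    """
    n = len(similarity)
    if n == 0:
        raise ValueError("at least one EDU is required")

    # Each span holds its subtree and the EDU indices it covers.
    spans = [(i, [i]) for i in range(n)]

    def span_similarity(a: List[int], b: List[int]) -> float:
        # Average linkage: mean pairwise similarity between two spans.
        total = sum(similarity[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(spans) > 1:
        # Pick the most similar pair of neighbouring spans.
        best = max(
            range(len(spans) - 1),
            key=lambda k: span_similarity(spans[k][1], spans[k + 1][1]),
        )
        left, right = spans[best], spans[best + 1]
        merged = ((left[0], right[0]), left[1] + right[1])
        spans[best:best + 2] = [merged]

    return spans[0][0]


# Hypothetical similarity matrix for four EDUs.
example = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
tree = build_rs_tree(example)
```

With this made-up matrix, EDUs 0 and 1 merge first, then 2 and 3, and the two resulting spans are joined last. Merging only adjacent spans keeps every subtree contiguous in the text, which is a simplifying choice for the sketch.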
Description
Thesis (Ph.D. (Computer Science))--National Institute of Development Administration, 2009