Thai rhetorical structure analysis

Somnuk Sinthupoun

Files

nida-diss-b166399.pdf (12.18 MB)

Publisher

National Institute of Development Administration

Issued Date

2009

Issued Date (B.E.)

2552

Resource Type

Dissertation

Language

eng

File Type

application/pdf

No. of Pages/File Size

99 leaves : ill. ; 30 cm.

DOI

10.14457/NIDA.the.2009.139

Rights

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Physical Location

National Institute of Development Administration. Library and Information Center

Citation

Somnuk Sinthupoun (2009). Thai rhetorical structure analysis. Retrieved from: http://repository.nida.ac.th/handle/662723737/283.

Title

Thai rhetorical structure analysis

Author(s)

Somnuk Sinthupoun

Advisor(s)

Ohm Sornil, advisor

Abstract

Rhetorical Structure Analysis (RSA) explores Discourse Relations (DRs) among related Elementary Discourse Units (EDUs) in a Rhetorical Structure Tree (RS tree) to describe the meaning of a text. It is useful in many text processing tasks that take relationships among EDUs as input, such as text understanding, summarization, discourse parsing, machine translation, and question answering. The Thai language, with its distinctive linguistic characteristics, requires its own technique. Thai text has no explicit EDU boundaries and exhibits EDU constituent omissions. Within a Thai RS tree, DR determination must also cope with adjacent markers, implicit markers, and marker ambiguities. This dissertation proposes a new approach to Thai RSA that consists of three steps. The first is EDU segmentation, where EDUs are segmented by phrase and syntactic hidden Markov models, which are derived from phrase and syntactic structure rules, respectively. The second is RS tree construction, where an RS tree is built using a clustering technique whose similarity matrix is calculated from three Thai semantic rules: Repetition rules, established from the repetition of EDU constituents; Omission rules, established from the omission of EDU constituents; and Addition rules, established by adding markers to EDUs. The final step is DR determination, where decision tree learning, with features derived by relating two EDUs in an RS tree using the semantic rules, is used to determine Discourse Relations.
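
The second step described in the abstract (building an RS tree by clustering EDUs over a similarity matrix) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `build_rs_tree`, the restriction to merging adjacent spans, the average-linkage scoring, and the toy similarity values are all assumptions made for illustration. In the dissertation the similarity matrix is calculated from the Repetition, Omission, and Addition rules, which are not specified in this record.

```python
def build_rs_tree(edus, sim):
    """Greedy bottom-up clustering of EDUs into a binary tree.

    Assumptions (not from the source): only adjacent spans may merge,
    and span similarity is the average of the pairwise EDU similarities.
    `sim[i][j]` is a hypothetical similarity score between EDU i and EDU j.
    """
    # Each node is (subtree, first_edu_index, last_edu_index).
    nodes = [(edu, i, i) for i, edu in enumerate(edus)]
    while len(nodes) > 1:
        best_k, best_score = 0, float("-inf")
        for k in range(len(nodes) - 1):
            _, a0, a1 = nodes[k]
            _, b0, b1 = nodes[k + 1]
            pairs = [(i, j) for i in range(a0, a1 + 1)
                     for j in range(b0, b1 + 1)]
            score = sum(sim[i][j] for i, j in pairs) / len(pairs)
            if score > best_score:
                best_k, best_score = k, score
        left, right = nodes[best_k], nodes[best_k + 1]
        # Replace the two most similar adjacent spans with their parent node.
        nodes[best_k:best_k + 2] = [((left[0], right[0]), left[1], right[2])]
    return nodes[0][0]


# Toy example: four EDUs with made-up similarity scores.
toy_sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
tree = build_rs_tree(["e0", "e1", "e2", "e3"], toy_sim)
print(tree)
```

With these toy scores the first two EDUs and the last two EDUs are grouped before the two groups are joined at the root. Labeling each internal node with a Discourse Relation would correspond to the third step, which the abstract assigns to decision tree learning.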

Description

Thesis (Ph.D. (Computer Science))--National Institute of Development Administration, 2009

Degree Name

Doctor of Philosophy

Degree Level

Doctoral

Degree Department

School of Applied Statistics

Degree Discipline

Computer Science

Degree Grantor(s)

National Institute of Development Administration

LCC

P 98.3 So55 2009

Subject(s)

Computational linguistics
Natural language processing (Computer science)
Thai language -- Rhetoric -- Computer programs

URI

http://repository.nida.ac.th/handle/662723737/283

Collections

GSAS: Dissertations
