
dc.contributor.advisor: Surapong Auwatanamongkol, advisor [th]
dc.contributor.author: Pornpimol Bungkomkhun [th]
dc.date.accessioned: 2014-05-05T08:49:38Z
dc.date.available: 2014-05-05T08:49:38Z
dc.date.issued: 2012 [th]
dc.identifier.uri: http://repository.nida.ac.th/handle/662723737/277 [th]
dc.description: Thesis (Ph.D. (Computer Science))--National Institute of Development Administration, 2012 [th]
dc.description.abstract: Cluster analysis is one of the primary data mining methods; its objective is to understand the natural grouping (or structure) of data objects in a dataset. Clustering aims to segment the entire dataset into relatively homogeneous subgroups, or clusters, such that the similarity of data objects within a cluster is maximized and the similarity of data objects belonging to different clusters is minimized. In supervised clustering, both the attribute variables and the class variable of the data objects take part in grouping the objects into clusters, so that each cluster has high homogeneity in terms of the classes of its data objects. This dissertation proposes a grid-based supervised clustering algorithm that is able to identify clusters of any shape and size without presuming any canonical form for the data distribution. The algorithm needs no pre-specified number of clusters and is insensitive to the order of the input data objects. The proposed algorithm gradually partitions the data space into equal-size grid cells, one dimension at a time. A greedy method is used to choose the order of dimensions for the gradual partitioning that gives the best clustering quality, while a gradient descent method is used to find the optimal number of intervals for each partitioning. After all dimensions have been partitioned, any connected dense grid cells containing a majority of data objects from the same class are merged into a cluster. By using the greedy and gradient descent methods in this way, the proposed algorithm can produce high-quality clusters while reducing the time needed to find the best partitioning and avoiding the memory confinement problem during the process. On two-dimensional synthetic datasets, the proposed algorithm correctly identifies clusters with different shapes and sizes. The proposed algorithm also outperforms five other supervised clustering algorithms on some UCI datasets. [th]
dc.description.provenance: Made available in DSpace on 2014-05-05T08:49:38Z (GMT). No. of bitstreams: 1 nida-diss-b175320.pdf: 8936080 bytes, checksum: 0b7bac8cc9d3f2cb78efb903b476e2ef (MD5) Previous issue date: 2012 [th]
dc.format.extent: ix, 81 leaves : ill ; 30 cm. [th]
dc.format.mimetype: application/pdf [th]
dc.language.iso: eng [th]
dc.publisher: National Institute of Development Administration [th]
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. [th]
dc.subject.lcc: QA 278 P826 2012 [th]
dc.subject.other: Cluster analysis [th]
dc.subject.other: Algorithms [th]
dc.title: Grid-based supervised clustering algorithm using greedy and gradient descent methods to build clusters [th]
dc.type: Text [th]
mods.genre: Dissertation [th]
mods.physicalLocation: National Institute of Development Administration. Library and Information Center [th]
thesis.degree.name: Doctor of Philosophy [th]
thesis.degree.level: Doctoral [th]
thesis.degree.discipline: Computer Science [th]
thesis.degree.grantor: National Institute of Development Administration [th]
thesis.degree.department: School of Applied Statistics [th]
dc.identifier.doi: 10.14457/NIDA.the.2012.6
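
Illustrative note on the abstract: the final merging step it describes (connected dense grid cells with the same majority class become one cluster) might look roughly like the Python sketch below. This is not the dissertation's implementation. The function name grid_supervised_clusters, the bins and min_points parameters, the density test, and the rule that cells are "connected" when they share a face are all assumptions made for illustration. The greedy ordering of dimensions and the gradient descent search for the number of intervals are not shown; the interval counts are simply passed in.

```python
from collections import Counter, defaultdict, deque


def grid_supervised_clusters(points, labels, bins, min_points=2):
    """Merge connected dense grid cells that share a majority class.

    points: list of equal-length numeric tuples
    labels: class label for each point
    bins: number of equal-width intervals per dimension (assumed given)
    min_points: hypothetical density threshold for a cell
    """
    dims = len(bins)
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    def cell_of(p):
        # Map a point to the index of its equal-width grid cell.
        idx = []
        for d in range(dims):
            width = (highs[d] - lows[d]) / bins[d] or 1.0
            idx.append(min(int((p[d] - lows[d]) / width), bins[d] - 1))
        return tuple(idx)

    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[cell_of(p)].append(i)

    # Keep cells that are dense and have a strict majority class.
    majority = {}
    for cell, members in cells.items():
        cls, count = Counter(labels[i] for i in members).most_common(1)[0]
        if len(members) >= min_points and 2 * count > len(members):
            majority[cell] = cls

    # Breadth-first search over face-adjacent cells of the same class.
    clusters, seen = [], set()
    for start in sorted(majority):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:
            cell = queue.popleft()
            group.append(cell)
            for d in range(dims):
                for step in (-1, 1):
                    nb = cell[:d] + (cell[d] + step,) + cell[d + 1:]
                    if (nb in majority and nb not in seen
                            and majority[nb] == majority[start]):
                        seen.add(nb)
                        queue.append(nb)
        clusters.append((majority[start], sorted(group)))
    return clusters


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.1), (1.1, 0.2), (1.2, 0.1),
           (3.9, 3.9), (4.0, 4.0), (3.8, 4.0)]
    cls = ["A", "A", "A", "A", "B", "B", "B"]
    for label, group in grid_supervised_clusters(pts, cls, bins=(4, 4)):
        print(label, group)
```

Each returned entry pairs a class label with the list of grid-cell indices merged into that cluster. Because the cells are visited in sorted order and the grid itself does not depend on input order, shuffling the input points does not change the result, which mirrors the order-insensitivity claimed in the abstract.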

