A filter-based feature selection using two criterion functions and evolutionary fuzzification

dc.contributor.advisor: Ohm Sornil
dc.contributor.author: Jitwadee Chaiyakarn
dc.date.accessioned: 2016-05-16T04:49:13Z
dc.date.available: 2016-05-16T04:49:13Z
dc.date.issued: 2013
dc.date.issuedBE: 2556
dc.description: Dissertation (Ph.D. (Computer Science)), National Institute of Development Administration, 2013.
dc.description.abstract: In the information age, data have grown increasingly large in both dimension (the number of features) and volume. Data mining processes such as classification and clustering, when performed on high-dimensional data, can be time-consuming and can produce poor results because of the so-called curse of dimensionality. Feature selection is a fundamental technique that keeps only the most significant features and eliminates irrelevant and redundant ones from the full feature set. This dissertation focuses on filter-based feature selection. The technique takes less time to select significant features, especially for high-dimensional data, but cannot guarantee an optimal feature set. Filter-based feature selection comprises two important parts: a search process and criterion function evaluation. Floating search is commonly used for the search process. It is a heuristic search that does not take much time but cannot guarantee an optimal feature set. The evaluation part relies on a criterion function, an independent measure that evaluates and selects feature subsets without actually running a data mining algorithm. It therefore does not inherit any bias of the data mining algorithm. Usually only one criterion function is used, so only one characteristic of the data is considered at a time. In this dissertation, two criterion functions are proposed for feature evaluation. The two functions complement each other, so two or more characteristics of the data can be considered together to select features effectively. Noise, ambiguity, and uncertainty, which are frequently found in real-world data, can affect the data mining process. Hence, fuzzy logic was applied to cope with these problems. Fuzzy logic requires a membership function to fuzzify the original data, that is, to map the data into fuzzy values.
The fuzzy values were then passed through the feature selection process in place of the original data. A genetic algorithm (GA) was used to determine the irregular shape of the membership function, instead of relying on a human expert. In the experiments, the two proposed criterion functions were found to be effective at selecting features that increase classification accuracy. The proposed method outperforms two existing methods, the hybrid method and the single-criterion-function filter-based method. The experimental results also show that the proposed method with fuzzy logic enhances classification accuracy: it outperforms some wrapper-based feature selection methods, which are widely known to achieve higher accuracy than filter-based methods. The proposed feature selection method can also be used to reduce data dimension for unsupervised learning problems such as data clustering. Unlike supervised learning problems, these have no class label attribute to guide the grouping of data objects, so selecting discriminant features is not an easy task. Criterion functions, or measures, for unsupervised learning problems were therefore also proposed for use with the method. The experimental results showed that the proposed method helps improve clustering accuracy compared with other approaches. The proposed feature selection method can therefore be used for both supervised and unsupervised learning problems.
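The idea of combining a search process with two complementary criterion functions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the dissertation's method: the two criteria (`relevance` and `redundancy`, both based on Pearson correlation), their combination by subtraction, and the plain greedy forward search, which is a simpler stand-in for floating search. The record does not state which criterion functions the author actually used.

```python
import numpy as np

def relevance(X, y, j):
    """Hypothetical criterion 1: |Pearson correlation| of feature j with the class label."""
    return abs(np.corrcoef(X[:, j], y)[0, 1])

def redundancy(X, selected, j):
    """Hypothetical criterion 2: mean |correlation| of feature j with already chosen features."""
    if not selected:
        return 0.0
    return float(np.mean([abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) for k in selected]))

def forward_select(X, y, n_features):
    """Greedy sequential forward search (no backtracking, unlike floating search)."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < n_features:
        # Combine the two criteria: reward relevance, penalise redundancy.
        scores = [relevance(X, y, j) - redundancy(X, selected, j) for j in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: the class label depends on two independent signals, a and b.
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = rng.normal(size=n)
y = (a + b > 0).astype(float)
X = np.column_stack([
    a,                               # feature 0: informative
    a + 0.01 * rng.normal(size=n),   # feature 1: near-duplicate of feature 0
    rng.normal(size=n),              # feature 2: pure noise
    b,                               # feature 3: informative, independent of feature 0
])

chosen = forward_select(X, y, 2)
print(chosen)
```

With a relevance criterion alone, the near-duplicate features 0 and 1 would both rank highly. The second criterion is what steers the search toward one of them plus feature 3, which is the sense in which two criteria can complement each other.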
dc.format.extent: 83 leaves
dc.format.mimetype: application/pdf
dc.identifier.doi: 10.14457/NIDA.the.2013.21
dc.identifier.other: b184489
dc.identifier.uri: http://repository.nida.ac.th/handle/662723737/3027
dc.language.iso: eng
dc.publisher: National Institute of Development Administration
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject: Evolutionary fuzzification
dc.subject.other: Criterion Functions
dc.title: A filter-based feature selection using two criterion functions and evolutionary fuzzification
dc.type: text--thesis--doctoral thesis
mods.genre: Dissertation
mods.physicalLocation: National Institute of Development Administration. Library and Information Center
thesis.degree.department: School of Applied Statistics
thesis.degree.discipline: Computer Science
thesis.degree.grantor: National Institute of Development Administration
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
Files
Original bundle
Name: b184489.pdf
Size: 1.89 MB
Format: Adobe Portable Document Format
Description: fulltext