A Feature Group Weighting Method for Classifying High-Dimensional Big Data

UIU Institutional Repository

    A Feature Group Weighting Method for Classifying High-Dimensional Big Data

    View/Open
    Final_Thesis_Book__Shakila.pdf (593.0Kb)
    Date
    2019-11-25
    Author
    Sarker, Shakila
    Abstract
    Features hold the distinctive characteristics and intrinsic values of data, but they are of little use if the important information and patterns cannot be extracted from data coming from disparate sources and applications. In the area of big data, feature selection is one of the most important pre-processing steps for removing the many unessential, irrelevant and noisy features that can seriously affect the outcomes of classifier models. The main motivation for applying feature selection is to reduce the high dimensionality of large-scale data. Because high-dimensional big data has more features for training, measuring classifier performance becomes challenging and costly. The aim of this research is to build models with several hybrid feature selection techniques so that the classification algorithms receive only those features that are truly relevant and help to achieve better performance. A further aim is to find the informative features and group them so that knowledge can be extracted from big data. In this research, we collected 10 benchmark datasets from the UC Irvine Machine Learning Repository. We applied several feature selection methods and tested their performance: CFS, Chi-Square, Consistency Subset Evaluator, Gain Ratio, Information Gain, OneR, PCA, ReliefF, Symmetrical Uncertainty and Wrapper. The feature grouping methods are named Random Grouping, Correlation-based Grouping and Attribute Weighting Grouping; the resulting groups were evaluated with ensemble classifiers: Random Forest, Bagging and Boosting (AdaBoost). The observed results show that these groups achieve similar or even better results than the entire feature sets of the datasets. The Attribute Weighting Grouping method has shown promising performance for big data.
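    The abstract does not spell out how the grouping methods work, so the following is only an illustrative sketch of one way attribute weighting could drive feature grouping. It assumes discrete features, uses Information Gain (one of the methods named above) as the weight, and splits the ranked features into contiguous groups. The function names, the grouping rule and the toy data are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """H(labels) - H(labels | column) for one discrete feature column."""
    by_value = defaultdict(list)
    for value, label in zip(column, labels):
        by_value[value].append(label)
    conditional = sum(
        len(subset) / len(labels) * entropy(subset) for subset in by_value.values()
    )
    return entropy(labels) - conditional


def group_by_weight(rows, labels, n_groups):
    """Rank features by information gain and split the ranking into groups.

    Assumed grouping rule (hypothetical): contiguous chunks of the ranked
    list, highest-weighted features first. The thesis may group differently.
    """
    n_features = len(rows[0])
    weights = [
        information_gain([row[j] for row in rows], labels) for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: weights[j], reverse=True)
    size = math.ceil(n_features / n_groups)
    groups = [ranked[i:i + size] for i in range(0, n_features, size)]
    return weights, groups


# Toy data (invented for illustration): feature 0 determines the class,
# feature 1 is partly informative, feature 2 is constant, feature 3 is noise.
rows = [
    ("a", "x", "k", "p"),
    ("a", "x", "k", "q"),
    ("b", "y", "k", "p"),
    ("b", "x", "k", "q"),
]
labels = ["yes", "yes", "no", "no"]

weights, groups = group_by_weight(rows, labels, n_groups=2)
print(weights)
print(groups)
```

    Each group of feature indices could then be passed to an ensemble classifier such as Random Forest, Bagging or AdaBoost and compared against the full feature set, which is the kind of comparison the abstract reports.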
    URI
    http://dspace.uiu.ac.bd/handle/52243/1507
    Collections
    • M.Sc Thesis/Project [151]

    Copyright 2003-2017 United International University
    Contact Us | Send Feedback
    Developed by UIU CITS
     

     
