Classification by Clustering (CbC): An Approach of Classifying Big Data based on Similarities

Khan, Sakib Shahriar; Ahamed, Shakim; Jannat, Miftahul; Monwar, Irin

dc.contributor.author	Khan, Sakib Shahriar
dc.contributor.author	Ahamed, Shakim
dc.contributor.author	Jannat, Miftahul
dc.contributor.author	Monwar, Irin
dc.date.accessioned	2019-01-30T02:35:43Z
dc.date.available	2019-01-30T02:35:43Z
dc.date.issued	2019-01-30
dc.identifier.uri	http://dspace.uiu.ac.bd/handle/52243/743
dc.description.abstract	Data classification in supervised learning is the process of assigning data to classes for data mining tasks, which helps analyse data for decision making. The objective of a classification model is to correctly predict the categorical class labels of known or unknown instances. In machine learning for data mining applications, classification models are trained on labelled training data sets. In this paper, we investigate whether a classification model can be built from the similarities among instances instead of from their class labels. Data labeling is a costly and time-consuming process, and it becomes very difficult when the data set is large. The proposed approach clusters the big data and builds the classifier on the resulting clusters without considering the class labels, which improves the performance of the classifier. The clusters can still be related to class labels afterwards. We collected 10 big data sets from the UC Irvine Machine Learning Repository for experimental analysis and applied three popular decision tree induction algorithms for classifier construction: ID3 (Iterative Dichotomiser 3), C4.5 (an extension of ID3), and CART (Classification and Regression Tree).	en_US
dc.language.iso	en	en_US
dc.subject	Classification	en_US
dc.subject	Classifier	en_US
dc.subject	Clustering	en_US
dc.subject	Data mining	en_US
dc.title	Classification by Clustering (CbC): An Approach of Classifying Big Data based on Similarities	en_US
dc.type	Thesis	en_US
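
The idea summarised in the abstract (cluster unlabelled instances, then train a decision tree on the cluster assignments) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here, and a small Gini-based, CART-style tree stands in for the ID3, C4.5, and CART learners. The function names, the synthetic two-blob data, and the class names "A" and "B" are all hypothetical.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumed choice of clustering algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    def nearest(p):
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))

    for _ in range(iters):
        labels = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return [nearest(p) for p in points]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def build_tree(X, y, depth=0, max_depth=3):
    """CART-style tree: internal nodes are (feature, threshold, left, right)."""
    if depth == max_depth or len(set(y)) == 1:
        return max(set(y), key=y.count)  # leaf holds the majority cluster id
    best = None
    for f in range(len(X[0])):
        # Exclude the largest value so both sides of the split are non-empty.
        for t in sorted(set(x[f] for x in X))[:-1]:
            left = [l for x, l in zip(X, y) if x[f] <= t]
            right = [l for x, l in zip(X, y) if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # no usable split: all feature values identical
        return max(set(y), key=y.count)
    _, f, t = best
    li = [i for i, x in enumerate(X) if x[f] <= t]
    ri = [i for i, x in enumerate(X) if x[f] > t]
    return (f, t,
            build_tree([X[i] for i in li], [y[i] for i in li], depth + 1, max_depth),
            build_tree([X[i] for i in ri], [y[i] for i in ri], depth + 1, max_depth))

def predict(tree, x):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

# Synthetic, unlabelled data: two well-separated blobs (hypothetical example).
rng = random.Random(42)
blob_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
blob_b = [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(30)]
X = blob_a + blob_b

cluster_ids = kmeans(X, k=2)        # step 1: cluster without class labels
tree = build_tree(X, cluster_ids)   # step 2: train the tree on cluster ids

# Step 3: relate clusters to classes using one labelled instance per cluster.
cluster_to_class = {cluster_ids[0]: "A", cluster_ids[30]: "B"}
print(cluster_to_class[predict(tree, (0.2, -0.1))])
```

In this sketch only one labelled instance per cluster is needed to name the clusters, which reflects the motivation stated in the abstract: the classifier itself is trained without class labels.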

Files in this item

Name: Classification by Clustering ...
Size: 968.6Kb
Format: PDF

This item appears in the following Collection(s)

B.Sc Thesis/Project [82]