1. MReC4.5: C4.5 Ensemble Classification with MapReduce
Gongqing Wu¹, Haiguang Li¹, Xuegang Hu¹, Yuanjun Bi¹, Jing Zhang¹, Xindong Wu¹,²
¹ School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China
² Department of Computer Science, University of Vermont, Burlington, U.S.A.
Yantai, China, 7/1/2012
2. Outline
Motivation
C4.5
Bagging
MapReduce
MReC4.5
Evaluations
Conclusions
3. Motivation
Classification plays an important role in data mining
C4.5 is a decision tree classification algorithm regarded as a landmark in data mining
Ensemble learning is naturally suited to parallel and distributed computing models
MapReduce is a distributed programming paradigm proposed by Google for parallel and distributed processing of large data sets
We propose MReC4.5, an ensemble C4.5 classification method based on MapReduce
The MReC4.5 classifier is made “Construct Once, Use Anywhere” by providing a series of serialization operations at the model level
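The idea in the last two points can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the base learner is a one-feature threshold stump standing in for C4.5, the "map" and "reduce" phases are plain Python functions rather than a MapReduce framework, and all names (train_stump, map_phase, reduce_phase, predict_ensemble) are hypothetical. Each map call trains a base classifier on a bootstrap sample of its partition, the reduce call gathers the classifiers into one voting ensemble, and pickle stands in for the model-level serialization.

```python
import pickle
import random
from collections import Counter


def train_stump(sample):
    """Stand-in base learner (NOT C4.5): a threshold rule on one numeric feature."""
    best = None
    for t, _ in sample:
        left = [y for x, y in sample if x <= t]   # never empty: contains t itself
        right = [y for x, y in sample if x > t]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    _, t, l_lab, r_lab = best
    return (t, l_lab, r_lab)


def predict_stump(model, x):
    t, l_lab, r_lab = model
    return l_lab if x <= t else r_lab


def map_phase(partition, seed):
    """Map: draw a bootstrap sample from one partition and train one base model."""
    rng = random.Random(seed)
    sample = [rng.choice(partition) for _ in partition]
    return train_stump(sample)


def reduce_phase(models):
    """Reduce: assemble the base models into a single ensemble."""
    return list(models)


def predict_ensemble(ensemble, x):
    """Classify by majority vote over the base models."""
    votes = Counter(predict_stump(m, x) for m in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (assumed for illustration): one feature, two classes split at 50.
data = [(float(i), "low" if i < 50 else "high") for i in range(100)]
partitions = [data[i::4] for i in range(4)]

ensemble = reduce_phase(map_phase(p, seed) for seed, p in enumerate(partitions))

# Serialize the whole ensemble once, then restore it wherever it is needed.
blob = pickle.dumps(ensemble)
restored = pickle.loads(blob)
```

The restored object is an ordinary list of base models, so predict_ensemble(restored, x) can be called on any machine that has the same code available, which is the sense of "Construct Once, Use Anywhere" assumed in this sketch.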
4. C4.5