Student Grade Prediction Intelligent System • Under the guidance of Dr. Sudip Sanyal, IIIT-Allahabad • Student: Dinh Ngoc Lan (MS200507), M.Tech in Software Engineering, IIIT-Allahabad
Contents • Objective of the Project • C4.5 Algorithm in the Student Grade Prediction Intelligent System • Software Architecture and Design • Screenshots and Result Comparison with Another Method (CART Pro 6.0) • Advantages of the Software • Limitations/Concerns • Scope of Improvement • Software Demo • References
Objective of the Project • Predict the grade of any student, in any given paper, for currently running courses, i.e. courses whose grades have not yet been declared. • Predictions help teachers/instructors classify students based on predicted grades and advise them to focus on certain subjects. • Predictions also help students set study goals and improve their grades. • Develop a classification tool that can be reused for future work. (The basic technique used for prediction is the C4.5 algorithm.)
Training Table • Columns: Student, marks/grades in each subject, and the grade in the target subject. This training table will be used to build a decision tree using the C4.5 algorithm.
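As a rough illustration of what one row of such a training table could look like in Java, here is a minimal sketch. The class name and field layout are assumptions (the slides do not show the project's actual classes), and the subject codes SRE, SEN, OOS and the target subject ASS are simply taken from the worked example on the later slides; the sample rows are illustrative, not the real table.

    import java.util.List;
    import java.util.Map;

    // One row of the training table: a student's grades in the known
    // subjects plus the grade in the target subject (here: ASS).
    final class StudentRecord {
        final String studentId;
        final Map<String, String> subjectGrades; // e.g. {"SRE" -> "A plus", "SEN" -> "A", "OOS" -> "B plus"}
        final String targetGrade;                // grade in the target subject, e.g. "A plus"

        StudentRecord(String studentId, Map<String, String> subjectGrades, String targetGrade) {
            this.studentId = studentId;
            this.subjectGrades = subjectGrades;
            this.targetGrade = targetGrade;
        }
    }

    // A tiny training set in the spirit of the slide's table (made-up rows).
    class TrainingTableExample {
        static List<StudentRecord> sampleTable() {
            return List.of(
                new StudentRecord("s01", Map.of("SRE", "A plus", "SEN", "A plus", "OOS", "A"), "A plus"),
                new StudentRecord("s02", Map.of("SRE", "A",      "SEN", "A",      "OOS", "B plus"), "A"),
                new StudentRecord("s03", Map.of("SRE", "B plus", "SEN", "A",      "OOS", "A"), "A")
            );
        }
    }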
Decision Tree created from the student grade table • A decision tree is built from the previous training table. The root node tests SRE; the branches for SRE = A plus and SRE = A are split further on SEN, while the remaining SRE branches lead directly to leaves. Each leaf represents the predicted value of the target subject ASS.
Construct Rule Set (if … then …) • A decision tree can be represented as a set of IF-THEN rules. Example of the rule set obtained from the previous decision tree: • if SRE = A plus and SEN = A plus then ASS = A plus • if SRE = A plus and SEN = A then ASS = B plus • if SRE = A and SEN = A plus then ASS = A • if SRE = A and SEN = A then ASS = A • if SRE = B plus then ASS = A • if SRE = B then ASS = B plus • if SRE = C then ASS = A
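A minimal Java sketch of such a rule set, assuming the hypothetical StudentRecord layout above, is shown below. It encodes the slide's rules as data and applies the first matching rule to predict ASS; the class names and matching strategy are assumptions for illustration, not the project's actual code.

    import java.util.List;
    import java.util.Map;

    // A single IF-THEN rule: (subject -> required grade) conditions and a predicted target grade.
    final class Rule {
        final Map<String, String> conditions;
        final String prediction;
        Rule(Map<String, String> conditions, String prediction) {
            this.conditions = conditions;
            this.prediction = prediction;
        }
        boolean matches(Map<String, String> studentGrades) {
            return conditions.entrySet().stream()
                    .allMatch(c -> c.getValue().equals(studentGrades.get(c.getKey())));
        }
    }

    class RuleSetExample {
        // The rule set from the slide, written out as data.
        static final List<Rule> RULES = List.of(
            new Rule(Map.of("SRE", "A plus", "SEN", "A plus"), "A plus"),
            new Rule(Map.of("SRE", "A plus", "SEN", "A"),      "B plus"),
            new Rule(Map.of("SRE", "A",      "SEN", "A plus"), "A"),
            new Rule(Map.of("SRE", "A",      "SEN", "A"),      "A"),
            new Rule(Map.of("SRE", "B plus"),                  "A"),
            new Rule(Map.of("SRE", "B"),                       "B plus"),
            new Rule(Map.of("SRE", "C"),                       "A")
        );

        // Predict the ASS grade for a student by applying the first matching rule.
        static String predictASS(Map<String, String> studentGrades) {
            return RULES.stream()
                    .filter(r -> r.matches(studentGrades))
                    .map(r -> r.prediction)
                    .findFirst()
                    .orElse("unknown");
        }

        public static void main(String[] args) {
            // Example: a student with SRE = A and SEN = A plus -> predicted ASS = A.
            System.out.println(predictASS(Map.of("SRE", "A", "SEN", "A plus")));
        }
    }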
C4.5 Algorithm
Function C4.5(R: a set of non-target attributes, C: the target attribute, S: a training set) returns a decision tree;
Begin
• If (S is empty) {
• return a single node with value Failure;
• } else if (S consists of records all with the same value for the target attribute) {
• return a single leaf node with that value;
• } else if (R is empty) {
• return a single node with the most frequent value of the target attribute found in the records of S; [in this situation there may be errors, i.e. examples that will be improperly classified]
• } else {
• Let A be the attribute with the largest Gain Ratio(A, S) among the attributes in R;
• Let {aj | j = 1, 2, ..., m} be the values of attribute A;
• Let {Sj | j = 1, 2, ..., m} be the subsets of S consisting, respectively, of the records with value aj for A;
• Return a tree with root labeled A and arcs labeled a1, a2, ..., am going, respectively, to the subtrees C4.5(R-{A}, C, S1), C4.5(R-{A}, C, S2), ..., C4.5(R-{A}, C, Sm);
• (C4.5 is thus applied recursively to the subsets {Sj | j = 1, 2, ..., m}.)
• }
End
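A minimal Java sketch of this recursion is given below. It is only an outline of the pseudocode above, not the project's implementation: it reuses the hypothetical StudentRecord class sketched earlier, relies on a gainRatio helper like the one sketched after the worked example that follows, and omits pruning.

    import java.util.*;
    import java.util.stream.Collectors;

    // A tree node: either a leaf carrying a predicted grade, or an internal
    // node that tests one attribute and has one child per attribute value.
    final class DecisionNode {
        String attribute;                       // null for a leaf
        String prediction;                      // non-null only for a leaf
        Map<String, DecisionNode> children = new LinkedHashMap<>();
    }

    class C45Sketch {
        // rows: the training records; attributes: the remaining non-target subjects.
        static DecisionNode build(List<StudentRecord> rows, Set<String> attributes) {
            DecisionNode node = new DecisionNode();
            if (rows.isEmpty()) {                        // S is empty -> "Failure" node
                node.prediction = "Failure";
                return node;
            }
            Set<String> classes = rows.stream().map(r -> r.targetGrade).collect(Collectors.toSet());
            if (classes.size() == 1) {                   // all records share the target value
                node.prediction = classes.iterator().next();
                return node;
            }
            if (attributes.isEmpty()) {                  // no attributes left -> majority class
                node.prediction = majorityClass(rows);
                return node;
            }
            // Pick the attribute with the largest gain ratio (helper assumed, sketched later).
            String best = attributes.stream()
                    .max(Comparator.comparingDouble(a -> GainRatioSketch.gainRatio(rows, a)))
                    .orElseThrow();
            node.attribute = best;
            Set<String> remaining = new HashSet<>(attributes);
            remaining.remove(best);
            // One branch per observed value of the chosen attribute.
            Map<String, List<StudentRecord>> partitions = rows.stream()
                    .collect(Collectors.groupingBy(r -> r.subjectGrades.get(best)));
            for (Map.Entry<String, List<StudentRecord>> e : partitions.entrySet()) {
                node.children.put(e.getKey(), build(e.getValue(), remaining));
            }
            return node;
        }

        static String majorityClass(List<StudentRecord> rows) {
            return rows.stream()
                    .collect(Collectors.groupingBy(r -> r.targetGrade, Collectors.counting()))
                    .entrySet().stream()
                    .max(Map.Entry.comparingByValue())
                    .orElseThrow()
                    .getKey();
        }
    }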
Example: steps for choosing the best subject as a candidate node in the tree
• For the target subject ASS, 4 cases belong to A plus, 4 cases to A and 5 cases to B plus, so:
info(T) = – 4/13 x log2(4/13) – 4/13 x log2(4/13) – 5/13 x log2(5/13) = 1.576
(this is the average amount of information needed to identify the class of a case in T).
• Calculate infoX(T) and gain(X) for each non-target subject X in the training table. Take OOS first. Using OOS divides T into three subsets (5 cases have OOS = A plus, 5 have OOS = A and 3 have OOS = B plus):
infoOOS(T) = 5/13 x (– 3/5 x log2(3/5) – 1/5 x log2(1/5) – 1/5 x log2(1/5)) + 5/13 x (– 1/5 x log2(1/5) – 2/5 x log2(2/5) – 2/5 x log2(2/5)) + 3/13 x (– 2/3 x log2(2/3) – 1/3 x log2(1/3)) = 1.324
so gain(OOS) = 1.576 – 1.324 = 0.252. Computing the other possible choices (SEN and SRE) in the same way: infoSEN(T) = 1.084, gain(SEN) = 1.576 – 1.084 = 0.492; infoSRE(T) = 0.739, gain(SRE) = 1.576 – 0.739 = 0.837.
• Calculate split info(X) and gain ratio(X) = gain(X) / split info(X) for each non-target subject X:
split info(OOS) = – 5/13 x log2(5/13) – 5/13 x log2(5/13) – 3/13 x log2(3/13) ≈ 1.549, so gain ratio(OOS) ≈ 0.252 / 1.549 ≈ 0.163.
Similarly, for the other subjects we get gain ratio(SEN) = 0.312 and gain ratio(SRE) = 0.392.
• Finally, SRE is chosen as the best subject because it has the highest gain ratio.
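To make the arithmetic above concrete, here is a small Java sketch of the entropy, gain and gain-ratio calculations, written as a companion to the C4.5 skeleton above. It reuses the hypothetical StudentRecord class from the earlier sketch and uses log base 2 throughout, so on the slide's 13-record table it would be expected to reproduce values such as info(T) = 1.576 and gain ratio(SRE) = 0.392; it is a sketch, not the project's actual code.

    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    class GainRatioSketch {
        static double log2(double x) { return Math.log(x) / Math.log(2); }

        // info(T): entropy of the target-subject grades over the whole set.
        static double info(List<StudentRecord> rows) {
            return entropyOfCounts(rows.stream()
                    .collect(Collectors.groupingBy(r -> r.targetGrade, Collectors.counting())), rows.size());
        }

        // infoX(T): weighted entropy after partitioning the rows by subject X.
        static double infoX(List<StudentRecord> rows, String x) {
            Map<String, List<StudentRecord>> parts = rows.stream()
                    .collect(Collectors.groupingBy(r -> r.subjectGrades.get(x)));
            return parts.values().stream()
                    .mapToDouble(sub -> (double) sub.size() / rows.size() * info(sub))
                    .sum();
        }

        static double gain(List<StudentRecord> rows, String x) {
            return info(rows) - infoX(rows, x);
        }

        // split info(X): entropy of the sizes of the subsets produced by X.
        static double splitInfo(List<StudentRecord> rows, String x) {
            return entropyOfCounts(rows.stream()
                    .collect(Collectors.groupingBy(r -> r.subjectGrades.get(x), Collectors.counting())), rows.size());
        }

        static double gainRatio(List<StudentRecord> rows, String x) {
            return gain(rows, x) / splitInfo(rows, x);
        }

        private static double entropyOfCounts(Map<String, Long> counts, int total) {
            return counts.values().stream()
                    .mapToDouble(c -> {
                        double p = (double) c / total;
                        return -p * log2(p);
                    })
                    .sum();
        }
    }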
Error-based pruning: Example
• Consider the sub-tree represented by the rule set below:
if SRE = A and SEN = A plus then ASS = A (1|2)
if SRE = A and SEN = A then ASS = A (1|1)
• The key idea: if the predicted error of the leaf is smaller than the sub-tree's predicted error, prune the tree by replacing the sub-tree with that leaf.
Calculate the number of predicted errors for the sub-tree: 2 x U(1,2) + 1 x U(1,1) = 2 x 0.9065 + 1 x 0.75 = 2.563.
Calculate the number of predicted errors for the leaf A over the 3 cases of the sub-tree: 3 x U(1,3) = 3 x 0.663 = 1.989.
The number of predicted errors of the leaf is smaller than that of the sub-tree, so we can replace this sub-tree with the leaf A: if SRE = A then ASS = A (2|3).
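The pruning decision itself is just a comparison of two expected error counts. The small Java sketch below illustrates it using the U(E, N) values quoted on this slide as plain inputs; one way the bound U itself can be computed is sketched after the appendix slide on error-based pruning at the end of this document.

    class PruningDecisionExample {
        public static void main(String[] args) {
            // Predicted errors of the sub-tree: sum over its leaves of (cases at leaf) x U(E, N).
            double subTreeErrors = 2 * 0.9065   // leaf "SEN = A plus": U(1,2) from the slide
                                 + 1 * 0.75;    // leaf "SEN = A":      U(1,1) from the slide
            // Predicted errors if the sub-tree is collapsed to the single leaf "ASS = A" (3 cases).
            double leafErrors = 3 * 0.663;      // U(1,3) from the slide

            System.out.printf("sub-tree: %.3f, leaf: %.3f%n", subTreeErrors, leafErrors);
            if (leafErrors < subTreeErrors) {
                System.out.println("Prune: replace the sub-tree with the leaf ASS = A");
            } else {
                System.out.println("Keep the sub-tree");
            }
        }
    }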
Software Architecture • The software follows the 3-tier model architecture.
Package Diagram • The core software has nearly 2,000 lines of code, organized into 5 packages and 16 classes, and is developed in the Java programming language.
Result Comparison with CART Pro 6.0 • The input for CART Pro 6.0 is an Excel file with 1,000 student records.
Advantages of the software • Performance: Using Java with web service technology reduces bandwidth consumption and makes the environment more reliable, available and secure. • Resource Utilization Efficiency: Database connection pooling allows the application to reuse connections to the Oracle database. • Security: Using the Sun Java Application Server as a middle tier between clients and the RDBMS, with the advanced security features provided by Sun. • Interfaces: Using the new JavaServer Faces framework to develop the interface with Web 2.0 asynchronous technology. • Ease of Use, Portability, Maintainability, Expandability and Ease of System Administration: Using advanced features of J2EE to develop the software.
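For the connection-pooling point, a typical J2EE pattern is to look up a container-managed DataSource through JNDI instead of opening raw JDBC connections, so that closing a connection returns it to the pool. The sketch below shows that pattern only; the JNDI name, table and column names are hypothetical, since the project's actual names are not given in the slides.

    import javax.naming.InitialContext;
    import javax.naming.NamingException;
    import javax.sql.DataSource;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    class PooledConnectionExample {
        // Look up the application server's pooled DataSource; the JNDI name is hypothetical.
        static String lastDeclaredGrade(String studentId, String subject) throws NamingException, SQLException {
            DataSource pool = (DataSource) new InitialContext().lookup("java:comp/env/jdbc/StudentGradesDS");
            // try-with-resources hands the connection back to the pool rather than closing it physically.
            try (Connection con = pool.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT grade FROM student_grades WHERE student_id = ? AND subject = ?")) {
                ps.setString(1, studentId);
                ps.setString(2, subject);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("grade") : null;
                }
            }
        }
    }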
Limitations/concerns • Lack of student records to construct the decision tree. • Teachers' suggestions need to be updated regularly according to new situations as teaching progresses. • The prediction is based on statistical classification and does not capture the full capability of the students.
Scope of improvement • Integrate with other student applications such as Student Study Progress management and Student Research management. • Automatically send results to the students concerned by e-mail to help them focus on certain subjects. • Integrate with other prediction techniques, such as neural networks and genetic algorithms, to give more accurate predictions.
References
[1] Quinlan, J. R., “C4.5: Programs for Machine Learning”. San Mateo, CA: Morgan Kaufmann, 1993.
[2] Carter, C., and Catlett, J., “Assessing credit card applications using machine learning”. IEEE Expert, Fall issue, pp. 71-79, 1987.
[3] Aha, D. W., Kibler, D., and Albert, M. K., “Instance-based learning algorithms”. Machine Learning, pp. 37-66, 1991.
[4] Stanfill, C., and Waltz, D., “Toward memory-based reasoning”. Communications of the ACM, pp. 1213-1228, 1986.
[5] Nilsson, N. J., “Learning Machines”. New York: McGraw Hill, 1965.
[6] Hinton, G. E., “Learning distributed representations of concepts”. Proceedings of the Eighth Annual Conference of the Cognitive Science Society, Amherst, MA. Reprinted in R. G. M. Morris (ed.), Parallel Distributed Processing: Implications for Psychology and Neurobiology. Oxford, UK: Oxford University Press, 1986.
[7] McClelland, J. L., and Rumelhart, D. E., “Explorations in Parallel Distributed Processing”. Cambridge, MA: MIT Press, 1988.
[8] Dietterich, T. G., Hill, H., and Bakiri, G., “A comparative study of ID3 and back propagation for English text-to-speech mapping”. Proceedings of the Seventh International Conference on Machine Learning, pp. 24-31. San Mateo, CA: Morgan Kaufmann, 1989.
[9] Holland, J. H., “Escaping brittleness: The possibilities of general purpose learning algorithms applied to parallel rule-based systems”. In R. S. Michalski, J. G. Carbonell, and T. M. Mitchell (eds.), Machine Learning: An Artificial Intelligence Approach (Vol. 2). San Mateo, CA: Morgan Kaufmann, 1986.
[10] Java EE 5 Tutorial, http://java.sun.com/javaee/5/docs/tutorial/doc/
(See the thesis for the complete list of references.)
Thank You • Questions, please!
Which subject is the best classifier? Method 1 (information gain; used when there are few possible outcomes, i.e. the subjects take few distinct values)
• info(T) = – Σj ( freq(Cj, T)/|T| ) x log2( freq(Cj, T)/|T| ): the average amount of information needed to identify the target-subject class of a row in T (this quantity is also known as the entropy of the set T).
• infoX(T) = Σi ( |Ti|/|T| ) x info(Ti): the sum of the information of the n subsets T1, ..., Tn obtained after T has been partitioned according to the n values of X.
• gain(X) = info(T) – infoX(T), where X is a subject and T1, ..., Tn are the subsets into which X partitions T.
Which subject is the best classifier? Method 2 (gain ratio; used when there are many possible outcomes, i.e. a subject takes many distinct values)
• split info(X) = – Σi ( |Ti|/|T| ) x log2( |Ti|/|T| )
• gain ratio(X) = gain(X) / split info(X)
• split info(X) represents the potential information generated by dividing T into n subsets, whereas the information gain measures the information relevant to classification obtained from the same division.
Error-based pruning
• The normal approximation to the binomial distribution is used, with an upper confidence bound, to calculate the predicted error for sub-trees and leaves.
• If the predicted error of the leaf is smaller than the sub-tree's predicted error, the tree is pruned by replacing the sub-tree with that leaf.
Where: U(E, N) is the upper bound on the error rate; N is the number of classification cases covered; E is the number of errors; p = E/N is the error rate on the training data; the z-value is obtained from the z-table for the chosen confidence level.
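The slides do not reproduce the bound formula itself, so the following is a hedged illustration only: one common way to obtain an upper bound U(E, N) from the normal approximation to the binomial is the upper limit of the Wilson score interval, sketched below in Java. The confidence level (and hence the z-value) is an assumption here, and the values quoted earlier, such as U(1, 2) = 0.9065, may come from a different convention, so this shows the shape of the calculation rather than the exact numbers used in the project.

    class PessimisticErrorBound {
        // Upper bound on the true error rate, using the Wilson score interval's upper limit
        // (one standard normal-approximation bound; z is the one-sided z-value for the
        // chosen confidence level, e.g. z of about 0.674 for a 25% confidence setting).
        static double upperBound(int errors, int cases, double z) {
            double n = cases;
            double p = errors / n;                       // observed error rate on the training data
            double z2 = z * z;
            double centre = p + z2 / (2 * n);
            double spread = z * Math.sqrt(p * (1 - p) / n + z2 / (4 * n * n));
            return (centre + spread) / (1 + z2 / n);     // upper limit of the interval
        }

        public static void main(String[] args) {
            double z = 0.674;                            // assumed confidence setting
            // Predicted errors for a leaf covering N cases with E training errors: N x U(E, N).
            System.out.println(3 * upperBound(1, 3, z));
        }
    }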