Here are the two Discovery Methods for Data Mining
• Descriptive Models: Used to describe patterns and to create meaningful subgroups or clusters.
  • 2005, 2006, and 2007 student data: Answers the question "what is happening?", with students clustered into groups. This is what we are used to seeing, for example: kids who are White scored with these patterns over the last three years.
• Predictive Models: Used to forecast explicit values, based upon patterns in known results.
  • Based on 2005, 2006, and 2007 student data, what would you predict the 2008 data to look like?
  • Apply the same model for 2008, 2009, and 2010, and see if your predictions hold up. If not, change the model by adding additional data fields.
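To make the predictive idea concrete, here is a minimal sketch of forecasting a 2008 value from three known years with a straight-line trend. The yearly values are invented for illustration, and a linear trend is only one possible model, not necessarily the one used in the study.

```python
# Hypothetical yearly values (for example, percent of a student group
# meeting a benchmark). These numbers are invented for illustration only.
history = {2005: 58.0, 2006: 61.0, 2007: 63.0}


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    xs = list(points.keys())
    ys = list(points.values())
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(points, year):
    """Extend the fitted trend line to the requested year."""
    slope, intercept = fit_line(points)
    return slope * year + intercept


forecast_2008 = predict(history, 2008)
print(f"Forecast for 2008: {forecast_2008:.1f}")
```

Once the real 2008 value is known, comparing it with the forecast shows whether the model holds up or needs additional data fields.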
Here is a comparison of Data Mining and Traditional Scientific Methods
• The Basic Steps of the Scientific Method
  • Choose a problem
  • Formulate a hypothesis
  • Perform an experiment
  • Draw conclusions
  • Verify conclusions
• The Basic Steps of Data Mining
  • Identify the goal
  • Create target data
  • Data processing
  • Data transformation
  • Data mining
  • Interpretation/Evaluation
  • Take action
Data mining is research-like in the sense that you can use multiple correlations to attempt to make predictions about the things you think are important to keep track of. For example, if you know that a certain “type” of student is a distraction in large group settings, you might use that information to do better course scheduling.
Intuition and Kids with Disabilities
• Often when running data mining activities, people familiar with the subject in question say, “I thought that was true but did not have data to prove it.”
• The following slides are ones that we used to explore the special education data that counties are required to submit to the state on an annual basis (end of October). The most significant item is where students receive services. There are several options for kids with disabilities:
  • Receive services in their home school
  • Participate in a combination of general and special education
  • Receive services in another school within their district
  • Receive services in another district or private placement
• This is reported as the percentage of time spent in special education classrooms as opposed to general education classes. If you are familiar with inclusion, this is why students are participating in your classes.
Least Restrictive Environment
• Mandated through the Individuals with Disabilities Education Act 1997 and 2004
• The definition for LRE A (the categories run A through M) is:
  • “A 6-21 student enrolled in a comprehensive school who receives special education and related services outside the general education setting for less than 21% of the school day.”
• The federal goal for LRE A is 100%
• The federal “realistic” goal set for LRE A is (was) 80%
• The Maryland statewide goal is now about 60%
Here is the 2005 data for Least Restrictive Environment by Disability. LRE A is 77% or more time in general ed classes. LRE B is between 33% and 76% time in general education classes. LRE C is pretty much all time in special education classes.
• 90% of all children in the state of Maryland, ages 3 through 21, are in LRE A, B, and C.
• Distribution of Groups
  • Specific Learning Disabilities (99% Total ABC): LRE A = 61%, LRE B = 26%, LRE C = 12%
  • Speech and Language (74% Total ABC): LRE A = 61%, LRE B = 8%, LRE C = 5%
  • Emotional Disturbance (69% Total ABC): LRE A = 26%, LRE B = 13%, LRE C = 30%
  • Other Health Impairments (95% Total ABC): LRE A = 61%, LRE B = 20%, LRE C = 14%
  • Mental Retardation (84% Total ABC): LRE A = 9%, LRE B = 21%, LRE C = 54%
  • Multiple Disabilities (91% Total ABC): LRE A = 16%, LRE B = 21%, LRE C = 54%
  • Autism (66% Total ABC): LRE A = 23%, LRE B = 10%, LRE C = 33%
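As a quick sanity check on the figures above, the following sketch confirms that each group's A, B, and C percentages add up to its reported "Total ABC", and computes the remainder outside A, B, and C. The percentages are transcribed from the slide; the function names are just illustrative, and reading the remainder as "students in the other LRE categories" is an assumption.

```python
# Percentages transcribed from the 2005 slide:
# disability -> (LRE A, LRE B, LRE C, reported Total ABC)
lre_2005 = {
    "Specific Learning Disabilities": (61, 26, 12, 99),
    "Speech and Language": (61, 8, 5, 74),
    "Emotional Disturbance": (26, 13, 30, 69),
    "Other Health Impairments": (61, 20, 14, 95),
    "Mental Retardation": (9, 21, 54, 84),
    "Multiple Disabilities": (16, 21, 54, 91),
    "Autism": (23, 10, 33, 66),
}


def mismatches(table):
    """Return the groups whose A + B + C does not equal the reported total."""
    return [name for name, (a, b, c, total) in table.items()
            if a + b + c != total]


def outside_abc(table):
    """Percent of each group not accounted for by LRE A, B, or C."""
    return {name: 100 - total for name, (_, _, _, total) in table.items()}


print(mismatches(lre_2005))
for name, pct in outside_abc(lre_2005).items():
    print(f"{name}: {pct}% outside LRE A, B, and C")
```

The groups with the largest remainders (Autism, Emotional Disturbance, Speech and Language) are the ones where looking only at A, B, and C leaves the most students out of the picture.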
That data is useful; however, we wanted to explore other ways to look at it, so we conducted the first data mining study in Maryland.
• THE GOAL: To analyze specific attributes of the special education population (ages 6-21) who are within LRE A, B, and C, based on available 2003 data, with 2004/2005 data added later (we will look at this when we meet face to face)
• Groups were segmented into: Grades 1-5, 6-8, and 9-12
• SSIS data included only age, race, gender, grade, disability, whether or not transportation was provided, and whether the student received medical assistance
We Applied Decision-Tree Analysis
• A decision tree approach starts with a population:
  • parent group = all disability codes for students age 6-21
• and divides that population into two subgroups (child groups)
  • using available attributes (race, gender, etc.)
  • such that the two subgroups are maximally different in their outcome, i.e. the LRE A rate.
• The decision tree approach then takes the two subgroups and keeps dividing each parent group into two child subgroups until no further subdivision would achieve a difference in the outcome.
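The splitting procedure described above can be sketched in a few lines. This is a minimal illustration, not the tool used in the study: the records are invented toy data, only one attribute (disability code) is considered, and the split score (rate difference squared, weighted by group sizes) is one common choice that keeps tiny subgroups from winning on rate difference alone.

```python
from itertools import combinations

# Hypothetical toy records: (disability code, 1 if the student is in LRE A else 0).
records = [
    ("04", 1), ("04", 1), ("04", 1), ("04", 0),
    ("09", 1), ("09", 1), ("09", 0),
    ("06", 0), ("06", 0), ("06", 1),
    ("14", 0), ("14", 0),
]


def rate(rows):
    """Fraction of rows with outcome 1 (the LRE A rate of the group)."""
    return sum(outcome for _, outcome in rows) / len(rows)


def best_binary_split(rows):
    """Try every two-way grouping of categories and keep the best-scoring one.

    Score = n_left * n_right / n * (rate difference)^2, an assumed criterion.
    Returns (score, left_categories, right_categories), or None if the rows
    contain a single category and cannot be split.
    """
    categories = sorted({cat for cat, _ in rows})
    best = None
    for size in range(1, len(categories)):
        for left_cats in combinations(categories, size):
            left = [r for r in rows if r[0] in left_cats]
            right = [r for r in rows if r[0] not in left_cats]
            gap = rate(left) - rate(right)
            score = len(left) * len(right) / len(rows) * gap ** 2
            if best is None or score > best[0]:
                best = (score, set(left_cats), set(categories) - set(left_cats))
    return best


def grow(rows, min_score=0.5):
    """Split recursively until no split reaches min_score (assumed stopping rule)."""
    split = best_binary_split(rows)
    if split is None or split[0] < min_score:
        return {"n": len(rows), "rate": rate(rows)}
    _, left_cats, right_cats = split
    left = [r for r in rows if r[0] in left_cats]
    right = [r for r in rows if r[0] in right_cats]
    return {
        "split": (sorted(left_cats), sorted(right_cats)),
        "children": [grow(left, min_score), grow(right, min_score)],
    }


tree = grow(records)
print(tree)
```

On this toy data the first split separates codes 04 and 09 from codes 06 and 14, and neither child is split further because no remaining subdivision scores above the threshold.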
The overall LRE A rate is 65.6%. The population is split into 2 different subgroups on disability.
• The first subgroup (Terminal Node 1) contains disabilities 01, 03, 05, 06, 07, 10, 12, 13, 14 and 15, with an LRE A rate of 27.9%.
• The other subgroup (Node 2) contains all the disabilities not in the first subgroup, with an LRE A rate of 74.9%:
  • 02 - Hearing Impairment
  • 04 - Speech/Language
  • 08 - Other Health Impairment
  • 09 - Specific Learning Disability
• Node 2 is then split again on disability:
  • All the disabilities in Node 2 except Speech and Language, with an LRE A rate of 62.3%
  • Speech and Language, with an LRE A rate of 86.7%
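The slide gives rates but not group sizes. Since a parent node's rate is the size-weighted average of its two child rates, the approximate share of students in each child can be backed out. This is a rough sketch that assumes the two children fully partition the parent and that the rates are only rounded, so the results are approximate; the function name is illustrative.

```python
def implied_share(parent_rate, child_a_rate, child_b_rate):
    """Share of the parent falling in child A.

    Solves parent = s * child_a + (1 - s) * child_b for s.
    """
    return (parent_rate - child_b_rate) / (child_a_rate - child_b_rate)


# Root (65.6%) splits into Terminal Node 1 (27.9%) and Node 2 (74.9%).
node1_share = implied_share(65.6, 27.9, 74.9)

# Node 2 (74.9%) splits into non-Speech/Language (62.3%) and Speech/Language (86.7%).
non_speech_share = implied_share(74.9, 62.3, 86.7)

print(f"Terminal Node 1 holds about {node1_share:.0%} of all students")
print(f"Within Node 2, about {non_speech_share:.0%} are not Speech and Language")
```

Under these assumptions, roughly one student in five falls in the low-rate Terminal Node 1, and Node 2 divides almost evenly between Speech and Language and the other three disabilities.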