RStat: Release 1.2
Ali-Zain Rahim, Strategic Product Manager
March 18, 2010
Agenda:
• Differentiators and Benefits
• Review 1.2 Enhancements
• Survival Analysis demo - Child welfare
• Questions
RStat: Differentiators & Benefits
• Based on R-Project
  • Open Source
  • Maintained by a worldwide consortium of universities, scientists, government-funded research organizations, and statisticians
  • Over 2000 packages
• RStat is a GUI to R
  • Intuitive guided approach to modeling
  • Simple model evaluation
  • Intended both for business analysts and advanced modelers
• Single BI and Predictive Modeling Environment
  • Re-use metadata and queries
  • Perform data manipulation and sampling
  • Build scoring applications
• Unique Deployment Method for Scoring Solutions
  • Scoring models are built directly into WebFOCUS (WF) metadata
  • Deployment on any platform and operating system: Windows, Unix, Linux, z/OS, and iSeries
RStat 1.2 Enhancements:
• New Modeling Technique: Survival Analysis
  • Two techniques: Cox Regression and Parametric Time Regression
    • Cox Regression: risk scoring routine
    • Parametric regression: time scoring routine
• What Survival Analysis does and when to use it
  • Survival analysis encompasses a wide variety of methods for analyzing the timing of events with censored data. (Censoring: nearly every sample contains some cases that do not experience an event during the observation period.)
  • It can be used to study the causes of:
    • Births and Deaths
    • Marriages and Divorces
    • Arrests and Convictions
    • Job Changes and Promotions
    • Bankruptcies and Mergers
    • Wars and Revolutions
    • Residence Changes
    • Consumer Purchases
    • Adoption of Innovations
    • Hospitalizations
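The role of censoring described above can be illustrated with a small sketch. It uses the Kaplan-Meier estimator, which is not one of the two RStat 1.2 techniques (Cox regression and parametric regression) but is a simple standard way to show that censored cases still contribute information while they are under observation. The function name `kaplan_meier` and the sample data are hypothetical and are not taken from RStat.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Estimate survival probabilities at each distinct event time.

    durations: how long each case was followed (any consistent time unit)
    observed:  True if the event occurred at that time,
               False if the case was censored (no event seen by then)
    Returns a list of (time, survival_probability) pairs.
    """
    # Number of events (not censorings) at each time point.
    events = Counter(t for t, e in zip(durations, observed) if e)
    # Number of cases (events or censorings) whose follow-up ends at each time.
    leaving = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            # Censored cases are still counted in at_risk up to this point.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ended at t leaves the risk set.
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times; False marks a censored case.
    times = [2, 3, 3, 5, 8]
    event_seen = [True, True, False, True, False]
    for t, s in kaplan_meier(times, event_seen):
        print(f"t={t}: S(t)={s:.2f}")
```

Dropping the censored cases instead of keeping them in the risk set would understate survival, which is why survival methods treat them explicitly.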
RStat 1.2 Enhancements – cont’d
• New Scoring Routines:
  • Neural Network model with comprehensive output: enables users to compile NNET models into WebFOCUS functions for use in applications.
  • Transformation capabilities for scoring routines: allow data manipulation within the RStat tool. Methods include imputation, scaling, and remapping.
• Enhanced statistical output:
  • Significance indicators in the ANOVA table for Regression models: enable users to determine which variables are significant to the model.
• Performance and Usability optimization
  • Auto sampling for faster visualization of large data sets in the KMeans model: enables more efficient resource usage when displaying Cluster model statistics and data plots.
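The three transformation methods named above can be sketched as follows. The slides do not say how RStat implements them, so this assumes common definitions: mean imputation for missing values, min-max scaling to the range 0 to 1, and remapping through a lookup table. All function names and sample values are hypothetical.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the present ones."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute: no non-missing values")
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def scale_min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def remap(values, mapping, default=None):
    """Translate category labels through a lookup table."""
    return [mapping.get(v, default) for v in values]


if __name__ == "__main__":
    ages = impute_mean([4, None, 8])          # fill the gap first
    print(scale_min_max(ages))                # then rescale
    print(remap(["kin", "group", "kin"], {"kin": 0, "group": 1}))
```

Applying the same transformations inside the scoring routine as during model building keeps scored data on the scale the model was trained on.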
RStat 1.2 Enhancements – cont’d
• Performance and Usability optimization
  • Model optimization: only the variables used to create the model are included in the exported C file. [In RStat 1.1 all variables selected by the user were included in the model]
  • Enhanced Log functionality: allows users to create R scripts for use with other applications, such as a Dialogue Manager application.
  • Process Cancellation capability: allows users to cancel a long-running process from within RStat.
  • Special characters functionality: enables efficient handling of data with special characters.
  • Timestamp within the RConsole and Log Textview: enables users to match log entries with any errors received, making troubleshooting easier.
Demo: Child Welfare Use Case
Identify the children who will stay in Child Welfare programs and the age at which they will leave the programs: a time-to-event analysis.
Foster Care Analytical Framework: Background and Optimization Goals
• Half a million children in foster care
• Managed by county departments and the private agencies who train families
• It is a team effort to find a child a permanent home
• Severe consequence of bad foster care:
  • Youth who leave the system are more likely to be homeless, incarcerated, unemployed, and unskilled.
• Foster Care Analytical Framework: Goals & Benefits:
  • Provide better understanding of the factors that contribute to better foster care to all parties involved in the process
  • Provide standardized analytic and reporting system
  • Match children with better foster parents
  • Optimize child foster care duration
Thank you!
• "...if you are serious about statistics as a career, you need to become familiar with R because it is the most powerful and flexible language available, and may become the lingua franca of statistical programming in the near future."
• Source: "Statistics in a Nutshell" by Sarah Boslaugh, published by O'Reilly