FirstDIG: First Data Investigation on the Grid
Paul Graham, Terry Sloan, Adam Carter (EPCC)
Ian Gregory, Darren Unwin (First South Yorkshire)
tel: +44 (0)131 650 5155
email: t.sloan@epcc.ed.ac.uk

Description
• First plc – UK’s largest public transport operator
• Data sources
  • Huge range – mileage, revenue, fuel, maintenance, routes …
  • Collected – manually, ticket machines, GPS …
• Disparate DBMS
  • Acquisitions, historical, OS, physical location, representation …
  • Issues NOT unique to the bus industry
• Fine for day-to-day operations, but …
• Business questions – data from >1 source
  • Complaints vs Lateness, Revenue vs Lost Miles …
  • Aggregation – by service, by day, weekdays only …
• Introduces challenges for data analysis

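The "weekdays only" aggregation mentioned above can be sketched in a few lines. This is an illustration only: the record layout (service, date, lost miles), the service numbers and the values are hypothetical and are not taken from First's databases.

```python
from collections import defaultdict
from datetime import date


def lost_miles_by_service(records, weekdays_only=True):
    """Sum lost miles per service, optionally skipping Saturdays and Sundays.

    `records` is an iterable of (service, day, miles) tuples; this layout is
    an assumption made for the example.
    """
    totals = defaultdict(float)
    for service, day, miles in records:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        if weekdays_only and day.weekday() >= 5:
            continue
        totals[service] += miles
    return dict(totals)


# Hypothetical sample data.
records = [
    ("52", date(2003, 6, 2), 12.5),   # Monday
    ("52", date(2003, 6, 7), 3.0),    # Saturday
    ("120", date(2003, 6, 3), 4.0),   # Tuesday
    ("120", date(2003, 6, 8), 6.0),   # Sunday
]

print(lost_miles_by_service(records))
print(lost_miles_by_service(records, weekdays_only=False))
```

The same filter-then-group pattern applies to aggregation by day or by service; the hard part in practice is that the inputs live in different DBMS.
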
Description
• First South Yorkshire situation
  • No common interface
  • No common reporting process
• Statistics produced manually when required
  • Labour intensive
  • Not performed often or well
• Process to produce what is needed
  • Expensive
  • Impractical

Description and Aims
• Open Grid Services Architecture: Data Access and Integration (OGSA-DAI)
  • Assists with the access and integration of data from separate data sources via Grid Services
• Our remit: to evaluate the suitability of OGSA-DAI for use in a commercial environment, i.e. whether OGSA-DAI:
  • Is appropriate, secure, straightforward to deploy and use …
  • Does what we need!
• Provide feedback to the OGSA-DAI team
• Aims
  • Demonstrate deployment of OGSA-DAI within the First South Yorkshire bus operational environment and learn from it
  • Carry out a short data analysis, using OGSA-DAI service-enabled data sources, to answer business questions posed by First South Yorkshire

Status: Workpackages
• WP 1: Data Source requirements capture (FINISHED)
  • D1.1 Data Source Requirements Capture & D1.2 Organisation Data Schema (COMMERCIAL-IN-CONFIDENCE)
• WP 2: Development of data interfaces (FINISHED)
  • OGSA-DAI Deployment
• WP 3: Deployment & refinement of OGSA-DAI (FINISHED)
  • First Data Service Browser User Guide
  • First Data Service Browser Software
• WP 4: Data mining requirements capture (FINISHED)
  • D4.1 Data Mining Requirements Capture (COMMERCIAL-IN-CONFIDENCE)
• WP 5: Initial data mining analysis (FINISHED)
  • D5.1 Initial Data Mining Report (COMMERCIAL-IN-CONFIDENCE)
• WP 6: Data mining detailed analysis (FINISHED)
  • D6.1 Final Data Mining Report (COMMERCIAL-IN-CONFIDENCE)

Technical Achievements 1
• Data Mining
• Combined two databases to answer First’s business questions
  • The Customer Contact System (CCS)
    • Microsoft Access
    • Information on customer complaints, e.g. time, service, nature
  • The Mileage database
    • dBASE IV
    • Information on bus mileage, e.g. lost miles
• Also investigated the suitability of Revenue and Schedule Adherence data for data mining
• Produced a detailed data mining report

Technical Achievements 2
• OGSA-DAI deployment at First South Yorkshire
• Created Grid Data Services (GDS) for DBMS previously unsupported by OGSA-DAI
  • MS Access – CCS, dBASE IV – Mileage
  • Investigated GDS for SQL Server and CSV-based DBMS
• Rigorously exercised OGSA-DAI in a commercial setting:
  • Identified numerous areas for improvement in OGSA-DAI
  • Identified new requirements for use of OGSA-DAI in business
  • Confirmed the relevance and potential of OGSA-DAI for business

Technical Achievements 3
• Data Service Browser
• Identified need to aid ‘ease of use’ for OGSA-DAI
  • Middleware
• Developed a generic Grid Data Service Browser
  • Simple GUI – avoids XML etc.
  • Allows SQL queries and updates to databases
  • Enables JOIN queries across databases
  • Will be included in future OGSA-DAI releases
• … demo later

Achievements – First’s perspective
• Project has proven that:
  • There is a cost-effective solution that First South Yorkshire can utilise
  • First can get to its data and analyse it in a useful manner
  • With considerably reduced labour time, First can produce more accurate and more wide-ranging information for the business management

Achievements
“the results of this exercise will revolutionise the way we do things in the bus industry”
Darren Unwin, Divisional IT Manager

Dissemination
• Presentations
  • Ernst & Young, WestInfo Services, Strategy & Performance Associates, SingTel Optus, Executive Briefing Centre, Curtin Business School, Curtin University of Technology, Perth, Australia, February 24th, 26th, 2004
  • Curtin Business School Information Systems Seminar, Curtin University of Technology, Perth, Australia, February 20th 2004
  • UK e-Science booth, Supercomputing 2003, Phoenix, USA, November 2003
• Flyers
  • UK e-Science All Hands Conference, Nottingham, UK, 2-4 September 2003
• Posters
  • UK e-Science All Hands Conference, Nottingham, UK, 2-4 September 2003
• Articles
  • T. M. Sloan, A. Carter, P. J. Graham, D. Unwin, I. Gregory, "First Data Investigation on the Grid: FirstDIG", Proceedings of the 2nd UK e-Science All Hands Meeting, 2-4 September 2003, Nottingham, UK

Exploitation
• First Data Service Browser is being used and extended in the INWA project with Curtin Business School, Perth, Australia
• First are keen to extend their deployment to other databases

Future Plans
• Project is finished, no effort remaining
• Incorporation of First Data Service Browser into future releases of OGSA-DAI
• First South Yorkshire want to build management reporting applications based on OGSA-DAI

Demo
• Data Service Browser
• Accessing three different DBMS
  • Mileage, CCS, MySQL
• A JOIN – similar to the queries required for the data mining
  • Easy within one DB, requires intermediary steps for distributed DBs
  • Would have been impractical without OGSA-DAI
• Looking at Lost Miles and Customer Complaints

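The "intermediary steps" needed for a JOIN across separate databases can be sketched as follows. This is a minimal illustration and not the Data Service Browser's implementation or the OGSA-DAI API: two in-memory SQLite databases stand in for the Mileage and CCS databases, and every table name, column name and value is hypothetical.

```python
import sqlite3


def make_mileage_db():
    """Stand-in for the Mileage database (hypothetical schema and data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE lost_miles (service TEXT, day TEXT, miles REAL)")
    db.executemany(
        "INSERT INTO lost_miles VALUES (?, ?, ?)",
        [("52", "2003-06-02", 12.5),
         ("52", "2003-06-03", 0.0),
         ("120", "2003-06-02", 4.0)],
    )
    return db


def make_ccs_db():
    """Stand-in for the Customer Contact System (hypothetical schema and data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE complaints (service TEXT, day TEXT, nature TEXT)")
    db.executemany(
        "INSERT INTO complaints VALUES (?, ?, ?)",
        [("52", "2003-06-02", "late"),
         ("52", "2003-06-02", "no show"),
         ("120", "2003-06-02", "late")],
    )
    return db


def join_across_databases(mileage_db, ccs_db):
    """Join lost miles with complaint counts held in two separate databases."""
    # Step 1: query each source on its own.
    lost = mileage_db.execute(
        "SELECT service, day, miles FROM lost_miles").fetchall()
    counts = ccs_db.execute(
        "SELECT service, day, COUNT(*) FROM complaints "
        "GROUP BY service, day").fetchall()

    # Step 2: copy both result sets into one intermediary database.
    staging = sqlite3.connect(":memory:")
    staging.execute("CREATE TABLE lost_miles (service TEXT, day TEXT, miles REAL)")
    staging.execute("CREATE TABLE complaint_counts (service TEXT, day TEXT, n INTEGER)")
    staging.executemany("INSERT INTO lost_miles VALUES (?, ?, ?)", lost)
    staging.executemany("INSERT INTO complaint_counts VALUES (?, ?, ?)", counts)

    # Step 3: run an ordinary single-database JOIN on the staged copies.
    return staging.execute(
        "SELECT l.service, l.day, l.miles, COALESCE(c.n, 0) "
        "FROM lost_miles AS l LEFT JOIN complaint_counts AS c "
        "ON l.service = c.service AND l.day = c.day "
        "ORDER BY l.service, l.day"
    ).fetchall()


for row in join_across_databases(make_mileage_db(), make_ccs_db()):
    print(row)
```

Within a single database only step 3 is needed; the extra fetch-and-stage steps are what make the distributed case laborious to do by hand.
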
In Conclusion
• Successfully demonstrated the use of Grid middleware in a ‘real-world’ environment
• OGSA-DAI team:
  • Gained (in)valuable feedback
  • Incorporated Data Service Browser
• First:
  • Discovered valuable information from their data which would otherwise have been practically unobtainable
  • Keen to extend to other DBMS