The Data Exchange Project A Concept Viability Workshop delivered in Partnership with techUK 8th November, Russell Square, London
Context: The Current State
• DfE manages around 20 annual, bulk upload collections from schools and LAs. These are critical for policy monitoring, allocation of funding, local accountability and informing inspections.
• To do this we use COLLECT – a bespoke collection, validation and storage system. It contains a Studio and Portal, and can handle multiple collections from all our data providers in a secure way. It 'works' for the current ways of working (i.e. a school management system set up to generate and feed COLLECT a bulk file).
• Other systems are maintained to transfer information between schools and LAs – 'School to School', 'Key to Success' and 'Common Transfer Files'.
Context: The Current State
Criticisms / weaknesses of the current set-up include:
• Very long lead-in times to make changes or additions to collections
• Long time lags before information is published
• Schools have to send 'the same' data via multiple systems, to multiple destinations, at multiple times
• Data is stored in silos, in bespoke fashions, and accessed through a wide range of websites with varied but limited functionality
• A standard data definition exists (the Common Basic Data Set), but it is neither broad enough for all the data we move nor granular enough to support some ambitions
• Risk / issue that data in different systems can diverge over time
Context: The Case for Change
In addition to those weaknesses…
• We move a great deal of data around the sector that hasn't actually changed since the last time we asked for it
• The human effort involved in shifting data means we can't do it very often without increasing burden. Even where it should happen…it doesn't always!
• The ever-increasing number of Academies means more providers of data. The old model becomes harder and harder to support
• There are operational and policy issues which could be enhanced by quicker data flows – e.g. identifying children dropping out of education nationally, tracking children with Special Educational Needs outside LA boundaries, and supporting fairer funding
The current landscape – a summary of problems
The process: 1. Gather Data → 2. Process and Store Data → 3. Make Data Available
The Problem
1. Gather Data
• Bulk collections limit detail
• Significant front line 'compliance' cost
• Changes reliant on MIS systems responding to COLLECT specifications
• Schools have to send the same data to several places
• Data cleaning can happen outside the school MIS
2. Process and Store Data
• Data stored in silos
• Data stored inconsistently (version control)
• Data processed with locally chosen software
• Data not stored at the lowest level
• Responding to new policies or queries can be time consuming / inflexible
3. Make Data Available
• Several places to go for the 'same' data. Multiple websites and passwords.
• Varying analytical and visualisation tools
• Parents, schools, DfE, inspectors, researchers…not all pointing to the same data for the same issue
• 3rd party access to data is a bespoke and labour intensive process
• Taken together, accessing, combining and then using data is more difficult than it ought to be
The Solution (the Data Transformation Programme)
1. DATA EXCHANGE
2. WAREHOUSE and 3. PORTAL – delivered under the School Performance Data Programme
Context: The vision for the end state
For richer, more accurate data to be available quickly in accessible and usable forms, in order to enable others to drive up the quality of education and services received by children.
Specifically within the context of data exchange:
• From bulk upload to regular movement with minimal manual intervention…a business process should trigger a movement
• Able to tell someone plugged into the exchange 'within minutes', though most typically 'within hours' will do
• Data could be pushed on change, at defined times, or pulled
• Schools don't have to repackage data for different users – they are just plugged into the exchange
Context: The vision for the end state
• We anticipate every school and LA that uses an MIS being plugged into the exchange via that MIS
• Appropriate role-based authorised access and security
• Data movements controlled via a central hub as part of a 'hub and spoke' configuration (as opposed to hierarchical, distributed or centralised)
• SPDP's data warehouse and portal will be key consumers, so the exchange architecture should be closely integrated with them to maximise performance
• Able to handle a variety of formats for moving data around the sector
• When data moves it will conform to the ISB Enterprise Data Architecture, but the agents / adaptors particular to each MIS will manage the translation needed (a minimal sketch follows below)
• Significant front line involvement in governance
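To make the 'agents / adaptors' idea concrete, here is a minimal sketch of how an MIS-side adaptor might map a supplier-specific record onto a common exchange format and push it to the central hub. All class names, field names and the hub URL are illustrative assumptions, not part of the ISB Enterprise Data Architecture or any MIS supplier's interface.

```python
# Illustrative sketch only: field names, classes and the hub URL are assumptions,
# not part of the ISB Enterprise Data Architecture or any MIS supplier's API.
from dataclasses import dataclass, asdict
import json
import urllib.request


@dataclass
class CommonPupilRecord:
    """A hypothetical record in the common exchange format."""
    upn: str             # unique pupil number
    school_urn: str      # school unique reference number
    attendance_code: str
    session_date: str    # ISO 8601 date


def translate(mis_row: dict) -> CommonPupilRecord:
    """Adaptor step: map one supplier-specific MIS row onto the common format."""
    return CommonPupilRecord(
        upn=mis_row["PupilUPN"],
        school_urn=mis_row["SchoolURN"],
        attendance_code=mis_row["AMCode"],
        session_date=mis_row["SessionDate"],
    )


def push_to_hub(record: CommonPupilRecord,
                hub_url: str = "https://hub.example/exchange") -> None:
    """Spoke-to-hub push: the hub, not the school, then routes the message onward."""
    body = json.dumps(asdict(record)).encode("utf-8")
    req = urllib.request.Request(hub_url, data=body,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)  # in practice: authenticated, over TLS, with retries
```

The point of the sketch is the division of labour: the exchange only ever sees the common shape, while each supplier's adaptor owns the mapping from its own data model.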
Designing the Architecture – what do we know?
• 25,000 schools, 152 LAs. Nearly all have an MIS, but not all use it to the full degree. Data Exchange will be 'one size fits most', but we need a way to bring data into the data store for the tail of schools not using an MIS
• Given that DfE is already buying a warehouse and portal under the School Performance Data Programme, we should fully exploit the elements of those which can deliver part of the solution for Data Exchange
• Hub and spoke is considered the most efficient design
Data Exchange: What's out of scope?
The data / organisational scope could potentially be massive, but the risk of never getting off the ground would be substantial. The initial scope will focus on individualised data sitting in school and LA MIS as end points. By building a scalable solution using open standards, we will avoid a cul-de-sac in future.
Within that scope, a number of scenarios have been identified which fall outside the scope of DTP, including:
• the transfer of information between systems within an organisation, for example to keep common data in separate systems within the organisation in a consistent state on a sub-second timescale
• the transfer of information between schools working collaboratively, for example to move in-lesson attainment data captured in an interactive learning environment from one school to another during the lesson
• alerting Local Authority children's services immediately when a learner that is being monitored does not arrive at school
Scope of today's discussion…
To integrate an exchange with the warehouse and portal solutions in hand, we need:
• School / LA MIS to be able to communicate with a Data Exchange Hub
• A data exchange hub, with appropriate routing, control, audit and security (a minimal routing sketch follows below)
• The hub to integrate seamlessly with the SPDP data warehouse, which provides the storage area for all the data DfE receive
We do not need a data store or a way to present data – these are to be delivered by SPDP.
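As a way of picturing the hub's routing, control and audit responsibilities, here is a minimal sketch. The routing table, message shape and end point names are assumptions, not the actual Data Exchange design, and the transport itself is deliberately omitted.

```python
# Minimal hub routing sketch. Routing table, message shape and end point names
# are illustrative assumptions, not the actual Data Exchange design.
from datetime import datetime, timezone

ROUTES = {
    # message type -> destinations registered to receive it
    "attendance.session": ["spdp-warehouse", "local-authority-303"],
    "exclusion.notice": ["spdp-warehouse"],
}

AUDIT_LOG = []


def route(message: dict) -> list:
    """Deliver a message to every destination registered for its type,
    writing an audit record for each hop."""
    destinations = ROUTES.get(message["type"], [])
    for dest in destinations:
        AUDIT_LOG.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "from": message["source"],
            "to": dest,
            "type": message["type"],
        })
        # deliver(dest, message)  # transport (queue, HTTPS, etc.) omitted here
    return destinations
```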
Challenges / Risks
• The number of end points and the variation in technical ability across schools
• Implementing the ISB Enterprise Data Architecture with MIS suppliers with whom we have no formal contractual relationship
• Ensuring integration with the SPDP architecture, which itself has not been built yet
• A cultural shift for data providers, from annual data collections which are physically sent to data flowing out of their systems automatically
• Greater transparency of information than ever before at a local and national level
• Data cleaning / validation: ensuring we better support front-line data entry by developing easily accessible rules across the end-to-end solution, without throwing the baby out with the bathwater in terms of the current cleaning and checking roles played by schools and LAs
Current situation: weeks/months from data capture to availability on portals.
The future: hours from data capture to availability on portals.
Data exchange – where might it go?
A hub-and-spoke view, with the Data Exchange System at the centre, connected to: Deputy Directors, Pupil Referral Units, the Educational Data Division, Independent Schools, the Standards and Testing Agency, Maintained Schools, the Education Funding Agency, Academies / Free Schools, the National College for Teaching & Learning, Local Authorities, Ofsted, Awarding Bodies, Higher Education, Further Education and the Information Authority.
Key requirements for transfer mechanism
• Automatically transfer information
  • Guaranteed transfer of information between any two end points with no manual intervention (order of delivery not guaranteed)
  • Addition / removal of end points with minimum effort
  • Prioritisation / precedence to meet SLAs for different message types
• Configure dataflows (control capability) – see the sketch after this list
  • Enable authorised users to define dataflow services
  • Data flow contents
  • Trigger (on change, scheduled, on request, others TBD)
  • Performance targets
  • Source and destination end points (individual or multiple)
• Validate and cleanse data
  • Data quality is important for the end-to-end solution
  • Transfer mechanism must validate against XSD
  • (Rely on end points and SPDP for data-model-specific validation rules)
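To illustrate the 'configure dataflows' and XSD validation requirements, the sketch below shows what a dataflow definition and a structural validation step might look like. Flow names, schedules and file paths are placeholders; only the lxml calls are real library API.

```python
# Sketch of a dataflow definition plus structural validation against an XSD.
# Flow names, schedules and file paths are placeholders, not agreed specifications.
from lxml import etree

attendance_flow = {
    "name": "session-attendance",
    "trigger": "scheduled",            # on change | scheduled | on request
    "schedule": ["12:00", "16:30"],    # e.g. twice per school day
    "source": "school-mis/*",          # any school MIS end point
    "destinations": ["spdp-warehouse"],
    "sla_hours": 1,                    # portal availability target
}


def validate_payload(xml_path: str, xsd_path: str) -> bool:
    """Structural check only: the transfer mechanism confirms the XML matches the
    agreed XSD, while data-model-specific rules stay with end points and SPDP."""
    schema = etree.XMLSchema(etree.parse(xsd_path))
    document = etree.parse(xml_path)
    return schema.validate(document)
```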
Key requirements for transfer mechanism
• Monitor and improve performance
  • Performance logging and alerting
• Maintain security (see the sketch after this list)
  • Solution will need accreditation to 'Official' level
  • Authenticate end points (and end points must authenticate the hub)
  • Protect information in transit
  • Ensure only authorised users can configure data flows
  • Ensure data flows reflect the access privileges of end points
  • Accounting / audit capability
  • (SPDP protects data at rest)
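One way of meeting 'authenticate end points (and end points must authenticate the hub)' and 'protect information in transit' is mutual TLS. The certificate file names and hub address below are placeholders; the cert/verify parameters are standard in the Python requests library.

```python
# Sketch of mutual authentication and in-transit protection using TLS client
# certificates. File names and the hub URL are placeholders.
import requests


def send_to_hub(payload: bytes) -> requests.Response:
    return requests.post(
        "https://hub.example/exchange",         # placeholder hub address
        data=payload,
        cert=("endpoint.crt", "endpoint.key"),  # end point proves its identity
        verify="hub-ca.pem",                    # and verifies the hub in return
        timeout=30,
    )
```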
Key non-functional requirements
• Support tens of thousands of end points
  • 25k schools; there could be several end points for some schools
  • Scalable to support future growth
• Message volumes still being developed
  • 25k schools reporting on (non-)attendance twice per day
  • Message per session/school or message per class/session TBD (a rough volume calculation follows below)
  • Other messages one or more orders of magnitude lower in frequency
• Performance
  • Session attendance available on the portal within 1 hour of leaving the school
  • The 'performance budget' needs to be split between the transfer mechanism and SPDP analysis / reporting / publication activities
• Availability
  • 24/7 at 99% availability, higher during core working hours
  • < 30 minutes interruption of service
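A rough back-of-the-envelope check of the volume and availability figures above; the per-class multiplier (around 40 classes per school) is an assumption for illustration only.

```python
# Back-of-the-envelope volume and availability check based on the figures above.
schools = 25_000
sessions_per_day = 2
classes_per_school = 40  # assumption, for the per-class/session option only

per_school_msgs = schools * sessions_per_day                      # 50,000 messages/day
per_class_msgs = schools * sessions_per_day * classes_per_school  # ~2,000,000 messages/day

# 99% availability over 24/7 operation allows roughly:
minutes_per_week = 24 * 7 * 60
allowed_downtime = minutes_per_week * 0.01   # ~100 minutes of downtime per week
# ...but any single interruption must still stay under 30 minutes.

print(per_school_msgs, per_class_msgs, round(allowed_downtime))
```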
Key non-functional requirements
• Flexibility
  • Keep the data model and the transfer mechanism separate (see the sketch after this list)
  • Maximum flexibility in terms of modes – push, pub/sub, etc.
• Standards
  • Use open standards in widespread use
  • Identify output-based requirements and assess the solutions offered against value for money (VFM) for the sector
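To illustrate 'keep the data model and the transfer mechanism separate', the sketch below wraps an opaque, versioned payload in an envelope that the transfer layer can route without ever parsing the payload itself. All field names are illustrative assumptions.

```python
# Sketch of separating the data model from the transfer mechanism: the envelope
# is all the exchange understands; the payload schema can evolve independently.
from dataclasses import dataclass


@dataclass
class Envelope:
    source: str        # sending end point identifier
    destination: str   # receiving end point or topic
    mode: str          # "push", "pub/sub", "pull", ...
    schema_id: str     # which data model version the payload uses (assumed label)
    payload: bytes     # opaque to the transfer mechanism


def deliver(envelope: Envelope) -> None:
    """The transfer layer reads only envelope metadata; it never parses payload."""
    # look up a route from (destination, mode), enqueue, audit - payload untouched
    ...
```

The design point is that changes to the data model (new fields, new collections) then require no change to the routing, security or audit layers.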