
OpenMRS Data Synchronization Implementing OpenMRS in loosely connected environments

27-Nov-2008, Maros Cunderlik, openmrs.org


Presentation Transcript


  1. OpenMRS Data Synchronization: Implementing OpenMRS in loosely connected environments. 27-Nov-2008, Maros Cunderlik, openmrs.org

  2. OpenMRS Software Architecture
     • Software Architecture
       • “Architecture is defined by the recommended practice as the fundamental organization of a system, embodied in its components, their relationships to each other and the environment, and the principles governing its design and evolution”
       • WHAT does it do?
       • HOW does it do it?
       • WHY does it do it the way it does?
     • Architecture Documentation: ‘Views’
       • Logical: documents functional composition of the system elements and their relationships
       • Physical: distribution of the logical units onto physical resources
         • Servers, technologies, protocols, ports, etc.
     References: http://www.sei.cmu.edu/architecture/ http://www.sei.cmu.edu/architecture/published_definitions.html

  3. OpenMRS : Logical View

  4. OpenMRS: Physical View
     [Diagram labels: Reporting, Data Entry, Clinical Decision Support; Internet (HTTP); Tomcat + Spring + Hibernate (HTTP + Java Application Server + Java Persistence); JDBC; MySQL DB]
     • OpenMRS Server:
       • OS: Windows or Unix
       • Application server: Tomcat, Java
       • Database: MySQL or other RDBMS

  5. OpenMRS: Challenges in rural areas
     • Goals:
       • Allow convenient and up-to-date access to the system in rural districts and health centers
       • Data collected in rural areas must be available to central systems in a timely manner
     • Challenges?
       • Power
       • Connectivity (packet loss, corruption) and bandwidth
       • Travel: data cannot be easily shipped to/from central locations
       • HW and SW maintenance and upgrades in remote areas
         • HW failures, patches, SW upgrades, etc.

  6. OpenMRS: Loosely connected
     • Solutions?
       • #1: Collect data on paper and ship it back to a central location
         • Pros vs. Cons?
       • #2: OpenMRS ‘Lite’
         • Make a lightweight copy of OpenMRS that supports minimal functionality and distribute it to remote areas
         • Pros vs. Cons?
       • #3: Separate Desktop Application
         • Make a completely separate application that works in disconnected mode
         • Pros vs. Cons?
       • #4: Connected installs of OpenMRS with data synchronization
         • Pros vs. Cons?

  7. OpenMRS: #4 Data Synchronization
     • Reasons for #4:
       • Health centers need functionality beyond simple data entry (e.g. reporting, updated drug information), so making a separate application would be costly
       • Health centers also need to *receive* data about patients from other centers in their district/province, so ‘one-way’ data flow is not sufficient
       • On-site connectivity: there *will* be onsite Internet connectivity; it may be sporadic and at times unreliable
       • Given the limited amount of dev resources, reuse as much of the core OpenMRS Java code as possible

  8. OpenMRS: Data Synchronization Design
     • Q: What capabilities must exist in a system for two installations to exchange data?
     • A: Four things
       • 1. Serialization
         • Facility to reliably export and then import business objects
       • 2. Globally Unique Identification of data/records
         • Primary keys are unique only to a single *local* database
       • 3. Change tracking mechanism
         • How do we know what changed on a given system since ‘last time’?
       • 4. Transport mechanism capable of working on unreliable networks

  9. Data Synchronization: Implementation
     • #1: Serialization: Serializing object graphs can be tricky

     #1
         public class Person {
             protected Address primaryAddress;
             protected Address secondaryAddress;
             …
             public Address getPrimaryAddress() {..}
             public void setPrimaryAddress(Address a) {..}
             public Address getSecondaryAddress() {..}
             public void setSecondaryAddress(Address a) {..}
         }

     #2
         public static void main(String[] args) {
             Address a = new Address("Kigali");
             Person p = new Person();
             p.setPrimaryAddress(a);
             p.setSecondaryAddress(a);
             a.setValue("Kirehe");
             assert(p.getPrimaryAddress().equals(p.getSecondaryAddress()));  // true?
         }

     #3
         <Person>
             <primaryAddress value='Kigali' />
             <secondaryAddress value='Kigali' />
         </Person>

     #4
         ..
         Address a1 = p.getPrimaryAddress();
         Address a2 = p.getSecondaryAddress();
         a1.setValue("Rwinkwavu");
         assert(a1.equals(a2));  // true?
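The question the slide poses can be tried out in a small program. The following is a minimal sketch, not the OpenMRS sync code: the nested Address and Person classes are hypothetical stand-ins for the ones on the slide, naiveRoundTrip imitates an export that writes each field by value (as in the XML of #3), and graphRoundTrip uses the standard java.io object streams only to show what a graph-aware serializer preserves.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SharedReferenceDemo {
    // Hypothetical stand-in for the Address class on the slide.
    static class Address implements Serializable {
        private static final long serialVersionUID = 1L;
        private String value;
        Address(String value) { this.value = value; }
        String getValue() { return value; }
        void setValue(String value) { this.value = value; }
        @Override public boolean equals(Object o) {
            return o instanceof Address && ((Address) o).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    // Hypothetical stand-in for the Person class on the slide.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        Address primaryAddress;
        Address secondaryAddress;
    }

    // Naive export/import: every field is copied by value, so an Address
    // shared by both fields comes back as two independent objects.
    static Person naiveRoundTrip(Person p) {
        Person copy = new Person();
        copy.primaryAddress = new Address(p.primaryAddress.getValue());
        copy.secondaryAddress = new Address(p.secondaryAddress.getValue());
        return copy;
    }

    // Graph-aware export/import: object streams write a shared object once
    // and restore both references to the same instance.
    static Person graphRoundTrip(Person p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Address a = new Address("Kigali");
        Person p = new Person();
        p.primaryAddress = a;
        p.secondaryAddress = a;

        Person naive = naiveRoundTrip(p);
        naive.primaryAddress.setValue("Rwinkwavu");
        System.out.println(naive.primaryAddress.equals(naive.secondaryAddress)); // false

        Person graph = graphRoundTrip(p);
        graph.primaryAddress.setValue("Rwinkwavu");
        System.out.println(graph.primaryAddress.equals(graph.secondaryAddress)); // true
    }
}
```

In this sketch the assertion from #4 fails after the naive round trip because the shared Address has been split in two, which is the kind of surprise the slide warns about.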

  10. Data Synchronization: Implementation
      • Serialization Options:
        • Java native: java.io.Serializable
          • doesn’t work well for durable state; cannot move from one JVM to another
        • 3rd party tools: Simple, XStream
        • Custom
          • e.g. implement java.io.Externalizable
      • Data Synchronization: leverage Hibernate Persistence Mechanism
        • Pros:
          • Reuse what is already in use in OpenMRS
          • Also provides a simple solution to the #3 problem
        • Cons:
          • Dependent on persistence layer: any changes made outside of it will not be serialized or understood
        • Longer-term: replace with a robust serialization framework in core OpenMRS
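To make the "Custom" option concrete, here is a minimal sketch of a class that implements java.io.Externalizable and controls its own wire format. The Address class, its single field and the version number written at the start of the record are assumptions made for illustration; this is not how OpenMRS sync serializes objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    // Hypothetical business object that writes and reads its own fields.
    static class Address implements Externalizable {
        private String value;

        public Address() { }                 // public no-arg constructor required by Externalizable
        Address(String value) { this.value = value; }
        String getValue() { return value; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(1);                 // illustrative format version
            out.writeUTF(value);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            int version = in.readInt();
            if (version != 1) {
                throw new IOException("unsupported format version " + version);
            }
            value = in.readUTF();
        }
    }

    // Serializes the address to bytes and reads it back.
    static Address roundTrip(Address address) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(address);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Address) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Address copy = roundTrip(new Address("Kirehe"));
        System.out.println(copy.getValue()); // Kirehe
    }
}
```

Writing an explicit version first is one common way to let the format evolve; whether it suits a given installation is a design decision this sketch does not settle.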

  11. Data Synchronization: Implementation
      • #2: Record Uniqueness
        • How do we know the patient_id of a given patient in two different databases?
        • Example: 2 servers: Rwinkwavu and Kirehe
          • 0. Both Rwink and Kirehe have the exact same # of patients in their tables
          • 1. Rwink: Add patient Joe, system assigns next id, assume patient_id = 34
          • 2. Kirehe: Add patient Patrick, system assigns next available primary key, say 34
          • 3. If Rwink sends its patient data, patient #34 will be Joe but Kirehe ‘thinks’ it is Patrick: how to fix?
        • Two common solutions:
          • Create mapping tables: server_id, table_id, pk
            • Cons: using one central mapping table creates a single point of failure; keeping a distributed version of the mapping table up to date is tricky
          • Use something that *is* globally unique: Universally Unique IDentifier (UUID)
            • java.util.UUID, http://en.wikipedia.org/wiki/UUID
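The Joe/Patrick collision and the UUID fix can be sketched as follows. This is an illustration under assumptions: the Patient class and the merge helper are hypothetical, and only java.util.UUID comes from the slide. Both patients get the same local patient_id of 34, yet they stay distinct once records are keyed by UUID.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

class UuidDemo {
    // Hypothetical patient record: the local primary key is kept, a UUID is added.
    static class Patient {
        final int patientId;   // unique only within one local database
        final String uuid;     // intended to be unique across all servers
        final String name;

        Patient(int patientId, String name) {
            this.patientId = patientId;
            this.name = name;
            this.uuid = UUID.randomUUID().toString();
        }
    }

    // Merges incoming records into one store keyed by UUID instead of patient_id.
    // A record that arrives twice is stored once.
    static Map<String, Patient> merge(Patient... incoming) {
        Map<String, Patient> byUuid = new LinkedHashMap<>();
        for (Patient p : incoming) {
            byUuid.putIfAbsent(p.uuid, p);
        }
        return byUuid;
    }

    public static void main(String[] args) {
        Patient joe = new Patient(34, "Joe");          // created at Rwinkwavu
        Patient patrick = new Patient(34, "Patrick");  // created at Kirehe

        Map<String, Patient> merged = merge(joe, patrick, joe);
        System.out.println(joe.patientId == patrick.patientId); // true: local ids collide
        System.out.println(merged.size());                      // 2: UUIDs keep them apart
    }
}
```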

  12. Data Synchronization: Implementation
      • #3: Change tracking mechanism
        • Servers need to somehow ‘tell’ each other the last time they ‘saw’ each other
        • Classic problem in distributed computing
        • Two (at least) common solutions:
          • #1: Versioning
          • #2: Change logs/journals
        • OpenMRS sync: Change journal + Hibernate interceptor (for now)
          • We actually want versioning, but doing so requires extensive changes to the data model; compromise: a change journal table analogous to a DB transaction log
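The change journal idea can be illustrated with a small in-memory sketch. It is only an approximation of the concept: the Entry and ChangeJournal classes, the sequence numbering and the operation names are assumptions, and the real module is described on the slide as a journal table fed by a Hibernate interceptor, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

class ChangeJournalDemo {
    // One journal row: which record changed, how, and in what order.
    static class Entry {
        final long sequence;
        final String recordUuid;
        final String operation;

        Entry(long sequence, String recordUuid, String operation) {
            this.sequence = sequence;
            this.recordUuid = recordUuid;
            this.operation = operation;
        }
    }

    // Hypothetical append-only journal, loosely analogous to a transaction log.
    static class ChangeJournal {
        private final List<Entry> entries = new ArrayList<>();
        private long nextSequence = 1;

        // Called whenever a record is inserted, updated or deleted.
        synchronized Entry record(String recordUuid, String operation) {
            Entry entry = new Entry(nextSequence++, recordUuid, operation);
            entries.add(entry);
            return entry;
        }

        // Everything a peer has not seen yet, given the last sequence it acknowledged.
        synchronized List<Entry> changesSince(long lastSeenSequence) {
            List<Entry> pending = new ArrayList<>();
            for (Entry entry : entries) {
                if (entry.sequence > lastSeenSequence) {
                    pending.add(entry);
                }
            }
            return pending;
        }
    }

    public static void main(String[] args) {
        ChangeJournal journal = new ChangeJournal();
        journal.record("uuid-a", "INSERT");
        journal.record("uuid-b", "INSERT");
        journal.record("uuid-a", "UPDATE");

        long lastSeenByPeer = 1; // the peer acknowledged entry 1 during the previous sync
        for (Entry entry : journal.changesSince(lastSeenByPeer)) {
            System.out.println(entry.sequence + " " + entry.operation + " " + entry.recordUuid);
        }
        // prints:
        // 2 INSERT uuid-b
        // 3 UPDATE uuid-a
    }
}
```

Tracking "last seen" as a sequence number rather than a wall-clock time is a choice made for this sketch; it avoids depending on synchronized clocks between servers.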

  13. Data Synchronization: Implementation
      • #4: Robust Transport
        • Needs:
          • Cannot be a connected protocol (e.g. RPC)
          • Efficient on the wire
          • A backup mode for transport needs to be available in case of no connectivity
          • Able to withstand network transport corruption
        • OpenMRS Sync solution:
          • HTTP + checksum + compression
          • ‘USB’ flat file based data interchange as backup
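The "checksum + compression" part of the transport can be sketched with the JDK alone. The slide does not say which checksum or compression format the sync module uses, so GZIP and CRC32 here are assumptions, as are the Packet class and the pack and unpack helpers. The HTTP step is left out; the packet bytes could equally be written to a flat file for the USB fallback.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class SyncPacketDemo {
    // Hypothetical transfer unit: compressed payload plus checksum of the original bytes.
    static class Packet {
        final byte[] compressed;
        final long checksum;

        Packet(byte[] compressed, long checksum) {
            this.compressed = compressed;
            this.checksum = checksum;
        }
    }

    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    // Sender side: compress the payload and record a checksum of the uncompressed bytes.
    static Packet pack(String payload) {
        byte[] raw = payload.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new IllegalStateException("compression failed", e);
        }
        return new Packet(bytes.toByteArray(), crc32(raw));
    }

    // Receiver side: decompress, then refuse the packet if the checksum does not match.
    static String unpack(Packet packet) {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(
                new ByteArrayInputStream(packet.compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                raw.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new IllegalStateException("packet is corrupt", e);
        }
        byte[] bytes = raw.toByteArray();
        if (crc32(bytes) != packet.checksum) {
            throw new IllegalStateException("checksum mismatch, ask the sender to retransmit");
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String payload = "<Person><primaryAddress value='Kigali' /></Person>";
        Packet packet = pack(payload);
        System.out.println(unpack(packet).equals(payload)); // true
    }
}
```

A receiver that rejects a packet on mismatch can simply wait for the next scheduled sync, which fits a transport that must not depend on a live connection.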

  14. Data Synchronization: Implementation
      • Addressing the Challenges
        • Power and maintenance
          • Use Ubuntu to minimize need for patches
          • Application will automatically start up when server boots up and sync on schedule; no intervention needed
          • Investment in key infrastructure: solar power, sat. connections
          • TODOs: application self-update, database self-migrate
        • Connectivity, Travel
          • Sync transmissions checksum-ed and compressed
          • Data can be retransmitted without concern about corruption/duplication
          • If prolonged outage: sync-via-USB available
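One way retransmission can be made safe against duplication is for the receiver to remember which sync items it has already applied. The sketch below shows that idea only; the Receiver class, the item ids and the payload strings are hypothetical, and the slides do not state that OpenMRS sync works exactly this way.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class IdempotentReceiverDemo {
    // Hypothetical receiving side of a sync: applies each item at most once.
    static class Receiver {
        private final Set<String> appliedItemIds = new HashSet<>();
        private final Map<String, String> recordsByUuid = new HashMap<>();

        // Returns true if the item was applied, false if it was a repeat and ignored.
        boolean apply(String itemId, String recordUuid, String payload) {
            if (!appliedItemIds.add(itemId)) {
                return false; // already seen: a retransmission changes nothing
            }
            recordsByUuid.put(recordUuid, payload);
            return true;
        }

        int recordCount() {
            return recordsByUuid.size();
        }
    }

    public static void main(String[] args) {
        Receiver receiver = new Receiver();
        System.out.println(receiver.apply("item-1", "uuid-a", "Joe"));     // true
        System.out.println(receiver.apply("item-1", "uuid-a", "Joe"));     // false: resent after a dropped reply
        System.out.println(receiver.apply("item-2", "uuid-b", "Patrick")); // true
        System.out.println(receiver.recordCount());                        // 2
    }
}
```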

  15. Data Synchronization: Implementation • DEMO

  16. Data Synchronization: Vision
