Alternative Storage

Regions of Interest Alternative Storage

Overview • What’s in an ROI? • Use cases • Requirements • Current Storage System • Problems • Alternative Storage

What’s in an ROI? • ROI • Geometry • Measurements • ROI on Channel • Annotations • ROI • Measurement • Links
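The parts listed on this slide can be sketched as a small data model. This is only an illustration under assumptions: the class and field names below (Annotation, Measurement, Shape, Roi, params, links) are hypothetical and do not reproduce any existing schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Annotation:
    namespace: str
    tag: str


@dataclass
class Measurement:
    name: str                      # e.g. "area" or "mean_intensity"
    value: float
    channel: Optional[int] = None  # set when the measurement is made on a channel
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class Shape:
    type: str                      # e.g. "Ellipse"
    z: int
    t: int
    params: Dict[str, float]       # geometry, e.g. cx, cy, rx, ry
    measurements: List[Measurement] = field(default_factory=list)


@dataclass
class Roi:
    id: int
    shapes: List[Shape] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)
    links: List[int] = field(default_factory=list)  # ids of linked ROIs


# Build one ROI with a single ellipse, a channel measurement, a tag and a link.
roi = Roi(id=565)
ellipse = Shape("Ellipse", z=0, t=0, params={"cx": 3, "cy": 75, "rx": 17, "ry": 17})
ellipse.measurements.append(Measurement("mean_intensity", 41.5, channel=1))
roi.shapes.append(ellipse)
roi.annotations.append(Annotation("bob", "foo1"))
roi.links.append(566)
```

Annotations appear on both the ROI and the measurement because the slide lists both as annotation targets.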

Use Cases • User created ROI • Measurement tools • HCS generated ROI • Automatic • External • External analysis • Particle Tracking • Other • Templates • ROIs without images

Use Cases – Human Generated • Human generated • More interactions • Merge, Propagate, Split, Delete • Measurements • Geometry • Intensity • Path • ROI/ROI Links • Tags mostly on ROI • Write Many/Read Many

Use Cases - HCS • HCS Generated ROI • Lots of ROI • Attached to Channel • Measurements Attached • Multiple measurements • Tags on ROI, Measurements • Analysis, results and meta. • Write Once, Read Many

Use Cases – External Tools • External Tool can Generate ROI (+ scripts) • Can be tagged • Links (ROI/ROI, ROI/Image) • Results can be in any format

Use Cases - Templates • ROI need not be attached to image • Template to define other ROI

ROI from the Nth Dimension • N-Dimensional Data • Storage of Image data simple • ROI more complex • Database entry, file format • We don’t just want to store in HDF

Current Storage Solutions • Database • ROI • ROI Annotations • PyTables • Mask ROI • Measurements

Current Status • PyTables • ROI are heterogeneous • Concurrency • Python behind a core service call • Measurements are optimal • Tagging is an issue • Inside file • Multiple annotations reported to be slow

Database • ROI can be stored in database • Mask data can be an issue • Tagging in an RDBMS is not ideal • Many more annotations than we’d like • Link to external source for measurements

Alternative Storage • Key-Value Pair Stores • Berkeley DB • Project Voldemort • Tokyo Cabinet • Document DB • MongoDB • CouchDB • Graph DB • Neo4J • InfoGrid • Table DB • Cassandra • Hypertable • HBase

Where others have gone before • Other opinions on the storage solutions • MongoDB vs CouchDB, Cassandra, .. • CouchDB vs MongoDB • Pros and cons of MongoDB • Digg on Cassandra • What is a supercolumn • Cassandra talk • Indexing nodes in Neo4J

MongoDB • Document Database • NoSQL movement • Schemaless • No Tables • Collections of like data • No Joins • A document is the equivalent of a row of data • Distributed file system (GridFS)

MongoDB – Pros and Cons
Pros • It has bindings to numerous languages (C++, C#, Java, Python, ...) • Allows storage, indexing, linking of any user data • Annotations are now very easy, efficient • Has mechanisms for schema upgrade • Dynamic queries • Replication • Sharding • Map-Reduce framework • Fast • GridFS is a distributed file storage mechanism within Mongo • Easy to install
Cons • Schemaless, data integrity will need to be worked on • Graph structures not inherently supported
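The Map-Reduce item in the list above can be illustrated in plain Python. This sketch does not call MongoDB; it only shows the shape of a map step and a reduce step over ROI-like documents. The sample documents and the function names (map_shapes, reduce_counts, map_reduce) are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical ROI documents, reduced to the fields this example needs.
rois = [
    {"id": 565, "shapes": [{"type": "Ellipse"}, {"type": "Ellipse"}]},
    {"id": 566, "shapes": [{"type": "Rect"}, {"type": "Ellipse"}]},
]


def map_shapes(doc):
    """Map step: emit (shape type, 1) for every shape in one ROI document."""
    for shape in doc["shapes"]:
        yield shape["type"], 1


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    """Group the mapper output by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


shape_counts = map_reduce(rois, map_shapes, reduce_counts)
```

In a distributed store the map step would run next to each shard's data, which is why the pattern is listed alongside sharding.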

MongoDB - Deployments • SourceForge http://sourceforge.net/ • BusinessInsider http://www.businessinsider.com/ • New York Times http://www.nytimes.com/ • Disqus http://www.disqus.com/

MongoDB– ROI Use cases

MongoDB – Example insert

    # Legacy PyMongo API, as on the original slide; later releases use MongoClient.
    from pymongo import Connection

    connection = Connection()
    db = connection['databaseName']
    collection = db['collectionName']

    # One ROI document holding two tagged ellipse shapes.
    collection.insert({
        "tags": [],
        "label": "MyROI",
        "shapes": [
            {"tags": [{"tag": "foo1", "namespace": "bob"}],
             "rx": 17, "ry": 17, "label": None, "cy": 75, "cx": 3,
             "t": 0, "z": 0, "type": "Ellipse", "id": 3},
            {"tags": [{"tag": "foo2", "namespace": "bob"}],
             "rx": 10, "ry": 16, "label": None, "cy": 82, "cx": 45,
             "t": 0, "z": 0, "type": "Ellipse", "id": 5},
        ],
        "type": "Roi",
        "id": 565,
    })

MongoDB – Example query

Find ROIs with tag foofoo that have a shape with tag foo1:

    # Legacy PyMongo API, as on the original slide; later releases use MongoClient.
    from pymongo import Connection

    connection = Connection()
    db = connection['databaseName']
    collection = db['collectionName']
    collection.find({"shapes.tags.tag": "foo1", "tags.tag": "foofoo"})

Find ROIs that have a shape tag containing mitosis (case-insensitive):

    import re

    # A compiled pattern is sent as a regular expression; a quoted '/.../i' string would be matched literally.
    collection.find({"shapes.tags.tag": re.compile("mitosis", re.IGNORECASE)})
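The dotted paths in the queries above walk through nested documents and arrays. The following pure-Python sketch mimics that lookup in a simplified form so the behaviour can be tried without a database. It is not MongoDB's implementation, and the helper names (values_at, matches) and sample documents are assumptions.

```python
import re


def values_at(doc, path):
    """Collect every value reachable by a dotted path, descending into lists."""
    current = [doc]
    for key in path.split("."):
        found = []
        for item in current:
            candidates = item if isinstance(item, list) else [item]
            for candidate in candidates:
                if isinstance(candidate, dict) and key in candidate:
                    found.append(candidate[key])
        current = found
    # Flatten a final list level so array elements are matched individually.
    flat = []
    for item in current:
        flat.extend(item if isinstance(item, list) else [item])
    return flat


def matches(doc, query):
    """True when every path in the query has at least one matching value."""
    for path, condition in query.items():
        values = values_at(doc, path)
        if isinstance(condition, re.Pattern):
            ok = any(isinstance(v, str) and condition.search(v) for v in values)
        else:
            ok = condition in values
        if not ok:
            return False
    return True


roi_a = {"id": 565, "tags": [{"tag": "foofoo"}],
         "shapes": [{"tags": [{"tag": "foo1"}]}, {"tags": [{"tag": "foo2"}]}]}
roi_b = {"id": 566, "tags": [],
         "shapes": [{"tags": [{"tag": "Mitosis-early"}]}]}

tag_query = {"shapes.tags.tag": "foo1", "tags.tag": "foofoo"}
regex_query = {"shapes.tags.tag": re.compile("mitosis", re.IGNORECASE)}

tag_hits = [r["id"] for r in (roi_a, roi_b) if matches(r, tag_query)]
regex_hits = [r["id"] for r in (roi_a, roi_b) if matches(r, regex_query)]
```

Each condition is checked independently, so two conditions on the same array may be satisfied by different elements of that array.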

Neo4J • Graph Database • use nodes to represent objects • User specifies relationship between nodes • Allows complex traversal of node structures

Neo4J – Pros and Cons
Pros • Handles graph structures nicely • Transactional • Supported by Gremlin • Native RDF http://components.neo4j.org/neo-rdf-sail/ • Easy to install
Cons • No C++ language binding • Not distributed • Tables are not so easily modeled • Difficult to query on node contents

Neo4J - Deployments • The Swedish Defence forces http://www.mil.se • Windh Technologies http://www.windh.com • Flextoll http://www.flextoll.se

Neo4J - Example

    public enum OMERORelations implements RelationshipType {
        ASSOCIATE, DERIVE, AGGREGATE, COMPOSE
    }

    // Note: Neo4j property values are primitives, Strings or arrays of them,
    // so the "IObject" and "roi" values would need to be serialised first.
    Node image = neo.createNode();
    image.setProperty("IObject", imageI);
    image.setProperty("id", imageI.getId().getValue());
    image.setProperty("name", imageI.getName().getValue());

    Node derivedImage = neo.createNode();
    derivedImage.setProperty("IObject", derivedImageI);
    derivedImage.setProperty("id", derivedImageI.getId().getValue());
    derivedImage.setProperty("name", derivedImageI.getName().getValue());

    // Link the two images and record how the second was derived from the first.
    Relationship relationship = image.createRelationshipTo(derivedImage, OMERORelations.DERIVE);
    relationship.setProperty("type", "ROI");
    relationship.setProperty("operation", "crop");
    relationship.setProperty("roi", cropRoiI);
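The node and relationship pattern in the example above can be sketched without Neo4j to show what traversing typed relationships means. This is a toy in-memory graph, not the Neo4j API. The Graph class, its method names and the sample file names are assumptions.

```python
from collections import deque


class Graph:
    """Minimal property graph: nodes and typed, directed relationships."""

    def __init__(self):
        self.nodes = {}   # node id -> property dict
        self.edges = []   # (start id, relationship type, end id, properties)

    def create_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        return node_id

    def create_relationship(self, start, rel_type, end, **properties):
        self.edges.append((start, rel_type, end, properties))

    def traverse(self, start, rel_type):
        """Breadth-first walk along outgoing relationships of one type."""
        seen, queue, order = {start}, deque([start]), []
        while queue:
            node = queue.popleft()
            for src, kind, dst, _ in self.edges:
                if src == node and kind == rel_type and dst not in seen:
                    seen.add(dst)
                    order.append(dst)
                    queue.append(dst)
        return order


graph = Graph()
graph.create_node(1, name="original.tiff")
graph.create_node(2, name="cropped.tiff")
graph.create_node(3, name="cropped-projected.tiff")
graph.create_relationship(1, "DERIVE", 2, type="ROI", operation="crop")
graph.create_relationship(2, "DERIVE", 3, operation="projection")
graph.create_relationship(1, "ASSOCIATE", 3)

# Every image derived, directly or indirectly, from image 1.
derived = graph.traverse(1, "DERIVE")
```

The traversal follows relationships rather than inspecting node properties, which fits the slide's point that graph stores traverse well but are harder to query on node contents.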

Neo4J – ROI Use cases

Cassandra • An implementation of Google’s BigTable design: a complex key/value store used to represent a table • A sophisticated toolset is required to get the most out of this kind of solution; Google, for instance, created Sawzall to query its system • Digg has released Lazyboy, a Python library for working with Cassandra • Works by creating a table whose columns are grouped into column families; like data lives in the same column family (e.g. Ellipse ROI)

Cassandra – Pros and Cons
Pros • Quick • Handles heterogeneous data well • Different rows can have different columns • Can manage distributed data • Map/Reduce • Focus on writes not reads • Scales nicely • Easy to install
Cons • Not simple to work with • Building hierarchical structures • Sorting • Querying • Ad hoc queries are bad; Digg still uses MySQL for certain queries • Have to manage secondary indexes (K/V) • Version 0.5
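Two points from the list above, rows with different columns and hand-managed secondary indexes, can be sketched in plain Python. This is a toy model of a column family, not Cassandra's API. The ColumnFamily class, its methods and the row keys are assumptions.

```python
from collections import defaultdict


class ColumnFamily:
    """Rows keyed by string; each row is its own sparse set of columns."""

    def __init__(self, indexed_columns=()):
        self.rows = {}
        # Secondary indexes are ordinary key/value maps maintained by hand:
        # column name -> column value -> set of row keys.
        self.indexes = {name: defaultdict(set) for name in indexed_columns}

    def insert(self, row_key, columns):
        """Merge columns into a row and keep every secondary index in step."""
        old = self.rows.get(row_key, {})
        for name, index in self.indexes.items():
            if name in old:
                index[old[name]].discard(row_key)
        merged = {**old, **columns}
        self.rows[row_key] = merged
        for name, index in self.indexes.items():
            if name in merged:
                index[merged[name]].add(row_key)

    def get(self, row_key):
        return self.rows[row_key]

    def find(self, column, value):
        """Look up row keys through the secondary index for one column."""
        return sorted(self.indexes[column].get(value, set()))


# Like data (ellipse shapes) shares one column family; rows may differ in columns.
ellipses = ColumnFamily(indexed_columns=["tag"])
ellipses.insert("roi565:shape3", {"cx": 3, "cy": 75, "rx": 17, "ry": 17, "tag": "foo1"})
ellipses.insert("roi565:shape5", {"cx": 45, "cy": 82, "rx": 10, "ry": 16,
                                  "tag": "foo2", "label": "nucleus"})
ellipses.insert("roi565:shape3", {"tag": "foo2"})  # retagging must update the index

foo2_rows = ellipses.find("tag", "foo2")
```

The retagging step shows the cost named in the cons: every write to an indexed column has to update the index as well as the row.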

Cassandra - Deployments • Facebook (maybe) http://www.facebook.com • Digg http://www.digg.com

Cassandra – ROI Use cases

HyperTable • An implementation of Google’s BigTable design: a complex key/value store used to represent a table • A sophisticated toolset is required to get the most out of this kind of solution; Google, for instance, created Sawzall to query its system • HyperTable has a query language called HQL • Works by creating a table whose columns are grouped into column families; like data lives in the same column family (e.g. Ellipse ROI)

Hypertable – Pros and Cons
Pros • Quick • Handles heterogeneous data well • Different rows can have different columns • Can manage distributed data • Map/Reduce • Scales nicely • Easy to install
Cons • GPL License • Building hierarchical structures • Docs are weak • HQL works for simple queries only • Map/Reduce for other work • Limit of 255 column families • Secondary keys

HyperTable - Deployments • Rediff http://www.rediff.com • Zvents http://www.zvents.com/

HyperTable–ROI Use cases

Are we Normal? • Why do we have an RDBMS? • We don’t normalise the data • Each import will normalise on: • Image, ObjectiveSettings, LogicalChannel, LightSettings, DetectorSettings • Object Penalty • Difference between normalisation and view
