
Jake Lin Shmulevich Lab jake.lin@systemsbiology



Presentation Transcript


  1. LIMS SOLR Integration  Jake Lin Shmulevich Lab jake.lin@systemsbiology.org

  2. LIMS for Systems Genetics • Systems Genetics - study of complex traits (phenotypes) resulting from multiple genotypes and environment interactions  • LIMS web app- content and process management • Spring MVC with Addama components • Aid research and improve operations • Sample and experiment tracking • Annotations • Visualizing - relationships and results • Pipelines - bash + python + http • Data sharing

  3. Resources and Content • 8 Natural Variant Crossings • ~3000 progeny • 67 Sequencing submissions  • 46 Multiplexed ~48 degrees • 400,000 progeny images • ~10X more content

  4. Robust Search Heart of Information Management • Simple & Fast • Accurate & Meaningful

  5. XPath Search • Hierarchy • file directory • RESTful - http/ajax • Domain • Drawbacks: • Slow • (Diagram: LIMS Web Search, Addama, JCR)

  6. Wraps Lucene - .jar • Doug Cutting • Apache • Mature • Ported to C++/C#, Python, Perl, ... • IBM, Apple, ... • High performance text search engine library • indexing • querying • Simple Configuration • Web admin • REST/HTTP APIs • solr.war • (Diagram: LIMS Web Search, SOLR + Lucene)

  7. SOLR schema.xml
  $TOMCAT_HOME/webapps/ROOT/solr/conf/schema.xml

    <!-- progeny -->
    <field name="ypgKey" type="string" indexed="true" stored="true"/>
    <field name="ypgMatingType" type="text" indexed="true" stored="true"/>
    <field name="ypgGenotype" type="text" indexed="true" stored="true"/>
    <field name="ypgParentA" type="text" indexed="true" stored="true"/>
    <field name="ypgParentAlpha" type="text" indexed="true" stored="true"/>
    <field name="ypgSiblings" type="text" indexed="true" stored="true"/>
    <field name="ypgCrossingRef" type="text" indexed="true" stored="true"/>
    ...
    <!-- composite -->
    <field name="ypgFields" type="text" indexed="true" stored="true" multiValued="true" />
    <copyField source="*Key" dest="limsKey"/>
    <copyField source="ypg*" dest="ypgFields"/>
    <copyField source="*" dest="allFields" />

  • Field types determine tokenizing and indexing • impact 'fuzzy' and 'like' search • http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
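The three copyField rules on this slide route each source field into composite fields by glob pattern. A minimal Python sketch of that routing, assuming plain glob matching on the field name; it only emulates the rules for illustration and does not call Solr, and the function name `copy_destinations` is hypothetical.

```python
from fnmatch import fnmatchcase

# copyField rules copied from the schema.xml excerpt: (source pattern, destination)
COPY_RULES = [
    ("*Key", "limsKey"),
    ("ypg*", "ypgFields"),
    ("*", "allFields"),
]

def copy_destinations(field_name):
    """Return the composite fields a source field would be copied into.

    Assumption: simple case-sensitive glob matching, which is a
    simplification of what Solr does internally.
    """
    return [dest for pattern, dest in COPY_RULES if fnmatchcase(field_name, pattern)]

print(copy_destinations("ypgKey"))
print(copy_destinations("ypgGenotype"))
```

Under these assumptions a key field such as ypgKey lands in all three composites, which is what lets one query against allFields or limsKey span every record type.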

  8. SOLR HTTP Post

    # Update/Insert - CSV
    curl 'http://saskatoon:8080/solr/update/csv?commit=true' --data-binary @YCR_YPGAll.csv -H 'Content-type:text/plain; charset=utf-8'

    # Update - JSON
    curl 'http://saskatoon:8080/solr/update/json?commit=true' --data-binary @YCR_YPG500.json -H 'Content-type:application/json'

    # Delete
    curl 'http://saskatoon:8080/solr/update?commit=true' -H "Content-Type: text/xml" --data-binary '<delete><query>ypgKey:testKey_001</query></delete>'
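The same update and delete calls can be issued from a Python pipeline step. The sketch below only builds the requests with the standard library, using the host and paths from the curl commands; the helper names are hypothetical, and sending is left to the caller via `urllib.request.urlopen(req)`.

```python
from urllib.request import Request
from xml.sax.saxutils import escape

SOLR_BASE = "http://saskatoon:8080/solr"  # host and port as shown on the slide

def csv_update_request(csv_bytes):
    """Build the CSV update/insert POST (same URL and header as the curl call)."""
    return Request(
        SOLR_BASE + "/update/csv?commit=true",
        data=csv_bytes,
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )

def delete_request(query):
    """Build the delete-by-query POST; the query is XML-escaped before embedding."""
    body = "<delete><query>{}</query></delete>".format(escape(query))
    return Request(
        SOLR_BASE + "/update?commit=true",
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )

req = delete_request("ypgKey:testKey_001")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect before anything touches the index.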

  9. Data Migration • Update/Insert - CSV • LIMS built-in function exports results to CSV • Import from Database • http://wiki.apache.org/solr/DataImportHandler

  10. SOLR HTTP Get

    //Find all progenies for YCR6
    http://systemsgenetics.systemsbiology.net:8080/solr/select/?q=ypgCrossingRef:YCR6&wt=json&rows=5000&fl=ypgKey,ypgBoxNumber,ypgCrossingRef,ypgMatingType,ypgGenotype,ypgParentA,ypgParentAlpha,ypgAlias,ypgTetrad,ypgStatus,ypgPosition,ypgComments,ypgDateFrozen,ypgSiblings

    QTime: 9 ms
    {"response":{"numFound":400,"docs":[{"ypgKey":"ypgX",...}, {"ypgKey":"ypgX",...}, ...]}}

  • Highlighting: append &hl=true&hl.fl=ypgPosition,ypgStatus
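A select URL like the one on this slide can be assembled and its JSON response unpacked with the Python standard library. This is a sketch under assumptions: the helper names are hypothetical, and the sample response string is made up to mirror the response shape shown on the slide rather than captured from the server.

```python
import json
from urllib.parse import urlencode

def select_url(base, query, fields, rows=5000):
    """Build a /select/ URL; urlencode percent-encodes ':' and ',' in values."""
    params = {"q": query, "wt": "json", "rows": rows, "fl": ",".join(fields)}
    return base + "/select/?" + urlencode(params)

def doc_keys(response_text, key_field="ypgKey"):
    """Pull the key of every returned doc out of a wt=json response body."""
    body = json.loads(response_text)
    return [doc[key_field] for doc in body["response"]["docs"]]

url = select_url("http://saskatoon:8080/solr", "ypgCrossingRef:YCR6",
                 ["ypgKey", "ypgBoxNumber"])
print(url)

# Made-up response for illustration only.
sample = '{"response": {"numFound": 2, "docs": [{"ypgKey": "a"}, {"ypgKey": "b"}]}}'
print(doc_keys(sample))
```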

  11. More get examples

    //range
    http://saskatoon:8080/solr/select/?q=yoBoxNumber:[1%20TO%202]&wt=json

    //AND + OR
    http://saskatoon:8080/solr/select/?q=(yoBoxNumber:[1%20TO%202]%20AND%20yoFields:wine)&wt=json
    ...
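Range and boolean queries follow a regular pattern, so they can be composed rather than typed by hand. A small Python sketch, with hypothetical helper names, that reproduces the query strings on this slide and percent-encodes the spaces the same way:

```python
from urllib.parse import quote

def range_clause(field, low, high):
    """Inclusive range clause, e.g. yoBoxNumber:[1 TO 2]."""
    return "{}:[{} TO {}]".format(field, low, high)

def all_of(*clauses):
    """Join clauses with AND inside one parenthesised group."""
    return "(" + " AND ".join(clauses) + ")"

def any_of(*clauses):
    """Join clauses with OR inside one parenthesised group."""
    return "(" + " OR ".join(clauses) + ")"

def encode_query(query):
    """Percent-encode spaces but keep the query punctuation readable."""
    return quote(query, safe="()[]:")

q = all_of(range_clause("yoBoxNumber", 1, 2), "yoFields:wine")
print(q)
print("http://saskatoon:8080/solr/select/?q=" + encode_query(q) + "&wt=json")
```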

  12. SOLR ExtJs AJAX Get

    function getYPGSolrUrl(searchTerm) {
        return "/solr/select/?" + "q=" + searchTerm + "&wt=json&rows=5000&" +
               "fl=ypgKey,ypgBoxNumber,ypgCrossingRef,ypgMatingType,ypgGenotype,ypgParentA,ypgParentAlpha," +
               "ypgAlias,ypgTetrad,ypgStatus,ypgPosition,ypgComments,ypgDateFrozen,ypgSiblings";
    }

    function goSearch(index, ypgSearchInput, ypgSearchOption) {
        if (ypgSearchInput == '') {
            ypgSearchInput = 'YPG';
        }
        ypgSearchInput = checkWildcard(ypgSearchInput);
        var searchWin = getSearchLoadingWindow("yprogeny-");
        var searchUrl = getYPGSolrUrl(ypgSearchOption + ":" + ypgSearchInput);
        searchWin.on("show", function () {
            var sb = Ext.getCmp("yprogeny-search-statusbar");
            sb.showBusy();
        });
        searchWin.show();
        Ext.Ajax.request({
            url: searchUrl,
            method: "GET",
            success: function(response) {
                var searchResultObj = Ext.util.JSON.decode(response.responseText);
                myYPGData = [];
                loadYPGSearchResult(index, searchResultObj.response, function() {
                    Ext.getDom("sample-search-result-list").innerHTML = "";
                    Ext.getDom("yo-form").innerHTML = "";
                    searchWin.close();
                    renderYPGSearchResult();
                });
            },
            failure: function() {
                eventManager.fireStatusMessageEvent({ text: "Search Results failed for url:" + searchUrl, level: "error" });
            }
        });
    }

    //Post
    function postSolrUpdates(jsonObj, callback) {
        var docsol = {};
        docsol["doc"] = jsonObj;
        var add = {};
        add["add"] = docsol;
        Ext.Ajax.request({
            url: "/solr/update/json?commit=true",
            method: "POST",
            jsonData: {
                add: docsol
            },
            success: function() {
                callback();
            },
            failure: function() {
                Ext.Msg.alert("Error", "Failed updating/adding record - please let Jake know:" + jsonObj);
            }
        });
    }

  13. SOLR ExtJs AJAX Post

    /* jsonObj contains new and existing annotation values from form */
    function postSolrUpdates(jsonObj, callback) {
        var docsol = {};
        docsol["doc"] = jsonObj;
        var add = {};
        add["add"] = docsol;
        Ext.Ajax.request({
            url: "/solr/update/json?commit=true",
            method: "POST",
            jsonData: {
                add: docsol
            },
            success: function() {
                callback();
            },
            failure: function() {
                Ext.Msg.alert("Error", "Failed updating/adding record - please contact Infocore with this info:" + jsonObj);
            }
        });
    }
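The ExtJs post wraps the record in an {"add": {"doc": ...}} envelope before sending it to /solr/update/json. A minimal Python sketch of the same envelope, useful for checking a payload outside the browser; the function name and the sample field values are hypothetical.

```python
import json

def add_body(doc):
    """Serialize one record in the add/doc envelope used for JSON updates."""
    return json.dumps({"add": {"doc": doc}})

# Hypothetical record using field names from the progeny schema.
record = {"ypgKey": "testKey_001", "ypgCrossingRef": "YCR6"}
print(add_body(record))
```

The resulting string is what would go in the POST body with a Content-Type of application/json.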

  14. SOLR Java HttpClient

    public void testPost(String url, JSONObject jsonObject) {
        try {
            HttpClient client = new HttpClient();
            PostMethod post = new PostMethod(url);
            post.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
                    new DefaultHttpMethodRetryHandler(3, false));
            JSONObject postObject = new JSONObject();
            postObject.put("doc", jsonObject);
            JSONObject addObject = new JSONObject();
            addObject.put("add", postObject);
            //"docs":[{"limsadminKey":"dudley_limsadminkey","limsKey":"dudley_limsadminkey","limsadminYoCount":791,
            // "limsadminYoMaxNum":512,"limsadminYoBoxNum":7,"limsadminYoPosition":"G3",
            // "allFields":["dudley_limsadminkey","791","512","7","G3","420","420","6","B5","1768","1768","20","A1","367","367","531","531","240","240","344","344"]}]}}
            post.setParameter("jsonData", "application/json");
            post.setRequestEntity(new StringRequestEntity(addObject.toString(), "application/json", null));
            post.setRequestHeader("Content-Type", "application/json");
            int statusCode = client.executeMethod(post);
            System.out.println("Post " + url + "\nStatus code:" + statusCode);
            System.out.println(IOUtils.toString(post.getResponseBodyAsStream(), "UTF-8"));
            post.releaseConnection();
            assertEquals(0,0);
        } catch (IOException e) {
            e.printStackTrace();
            assertEquals(0,1);
        } catch (JSONException ej) {
            ej.printStackTrace();
            assertEquals(0,1);
        }
    }

  15. Notes and observations • updates act as inserts: the existing doc is deleted and replaced • must use lowercase for wildcard (*) search • keys must be primitive type • index corruption with java 1.7 • start/stop tomcat

    [www@saskatoon ROOT]$ ../../bin/shutdown.sh
    [www@saskatoon ROOT]$ ../../bin/startup.sh
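The lowercase rule for wildcard searches can be applied before the query is sent. The sketch below is one possible Python counterpart to the checkWildcard call on slide 12, whose body is not shown in the deck, so the behaviour here is an assumption: terms containing * are lowercased and everything else is left untouched.

```python
def normalize_wildcard(term):
    """Lowercase a search term only when it contains a wildcard (*).

    Assumption: the indexed text is lowercased at index time, so an
    uppercase wildcard term would not match anything.
    """
    if "*" in term:
        return term.lower()
    return term

print(normalize_wildcard("YCR6*"))
print(normalize_wildcard("YCR6"))
```

This would not suit fields indexed without lowercasing, such as the string-typed ypgKey, so it is probably best limited to the text-typed fields.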

  16. References
  • Lucene in Action - Manning Press
  • http://lucene.apache.org/solr/
  • http://lucene.apache.org/solr/tutorial.html
  • http://wiki.apache.org/solr/SchemaXml
  • http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
  • http://www.systemsbiology.org/Scientists_and_Research/Faculty_Groups/Dudley_Group
  • Science Perspective
  • http://www.systemsbiology.org/Scientists_and_Research/Faculty_Groups/Shmulevich_Group
  • In progress
  • http://code.google.com/p/lims-systemsgenetics/

  17. Thanks • Shmulevich Lab: Andrea Eakin, Hector Rovira, John Boyle, Ilya Shmulevich • Dudley Lab: Gareth Cromie, Cathy Ludlow, Patrick May, Adrian Scott, Aimee Dudley
