Data Handling Breakout
Neil Chue Hong, OMII-UK
Data Handling
• What's important?
  • Ease of management
  • Lifetime / durability
  • Interoperability with other systems / software
  • Functionality
  • Security / trust
  • Data formats
  • Size limitations
  • Standards for annotation of databases
• Are these different for different communities?
Questions
• Challenges from your research area
• What does the NGS currently provide that's useful?
• What does the NGS need to provide?
• What should we do next (as a community)?
Challenges
• Security of certain types of data is very important
  • E.g. storage of anonymous MRI image data for large-scale research projects
  • If this is not resolved, data will stay within the lab
• Data formats are all different and divergent
  • Need functionality to aggregate and integrate data from different formats
• Policy varies between areas and internationally
• Standards for annotation of databases needed
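The need to aggregate and integrate data from different formats can be illustrated with a minimal Python sketch. It assumes two hypothetical sources, one CSV and one JSON, that hold the same kind of record under different field names. The field names, the `FIELD_MAPS` table and the common schema (`sample_id`, `value`) are all invented for illustration and do not describe any NGS service.

```python
import csv
import io
import json

# Hypothetical mappings from each source's field names to a common schema.
FIELD_MAPS = {
    "csv": {"sample": "sample_id", "val": "value"},
    "json": {"id": "sample_id", "measurement": "value"},
}


def normalise(record, fmt):
    """Rename a record's fields to the common schema; drop unmapped fields."""
    mapping = FIELD_MAPS[fmt]
    return {mapping[key]: val for key, val in record.items() if key in mapping}


def load_csv(text):
    """Parse CSV text (with a header row) into normalised records."""
    return [normalise(row, "csv") for row in csv.DictReader(io.StringIO(text))]


def load_json(text):
    """Parse a JSON array of objects into normalised records."""
    return [normalise(rec, "json") for rec in json.loads(text)]


def aggregate(csv_text, json_text):
    """Merge both sources and coerce 'value' to float for consistency."""
    records = load_csv(csv_text) + load_json(json_text)
    for rec in records:
        rec["value"] = float(rec["value"])
    return records


# Illustrative inputs only.
csv_text = "sample,val\ns1,1.5\n"
json_text = '[{"id": "s2", "measurement": 2}]'
combined = aggregate(csv_text, json_text)
```

Keeping the per-format mapping in a table means a new format only needs a new entry and a loader, which is one way of coping with formats that keep diverging.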
NGS Wishlist
• Identify and host up-to-date key databases in each field
• Prevent decay, divergence and desynchronisation of locally copied datasets
• Easy way for database providers to publish datasets to NGS
• Map VO attributes to Unix groups so VOs can control authorisation to their data
• Make it easy to make data available on worker nodes when it's needed
• Provide more information for submission of data, and how-tos for common usage scenarios (e.g. SNP calling, BLAST search)
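One way to detect desynchronisation of locally copied datasets is to compare checksums against a reference copy. The sketch below is a minimal illustration in Python, assuming the hosting service publishes a digest per dataset. The dataset names, contents and the `find_stale` helper are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a dataset's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_stale(reference: dict, local: dict) -> list:
    """Return sorted names of datasets whose local digest is missing or differs."""
    return sorted(
        name for name, digest in reference.items() if local.get(name) != digest
    )


# Hypothetical reference digests, as a central host might publish them.
reference = {
    "blast_db": sha256_of(b"release 2"),
    "snp_table": sha256_of(b"release 7"),
}

# Hypothetical digests of the copies held in a local lab.
local = {
    "blast_db": sha256_of(b"release 1"),  # out of date
    "snp_table": sha256_of(b"release 7"),  # in sync
}

stale = find_stale(reference, local)
```

Here `stale` lists only the datasets that need refreshing, so a site can re-fetch those rather than every copy.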