NIH-NCRR Advisory Panel Meeting, August 11, 2000. Collaboratory Testbed for Macromolecular Crystallography at SSRL. Peter Kuhn, Stanford Synchrotron Radiation Laboratory, pkuhn@stanford.edu SSRL is funded by the US Dept. of Energy and the National Institutes of Health.
Agenda for the NCRR Collaboratory Advisory Meeting • Overview, History, Evolution of the Collaboratory • Demonstration of the Current Tools • Introduction to the Assessment System (Frank Topper) • Report on the previously prioritized success indicators • Development of new success indicators • What is missing from the previous set • What are the global, evolutionary goals • What are the new development goals • Coffee Break • Ranking of new success indicators • Collaboratory Software and its use at other synchrotrons and within other disciplines • Short report on current status • Maintenance vs. development; service vs. collaboration • What is needed to develop a Collaboratory environment • Future Directions and the Interface with High-Throughput Data collection and the Joint Center for Structural Genomics
The Collaboratory for Protein Crystallography • Goals • Allow a team of researchers distributed anywhere in the world to perform a complete crystallographic experiment, from data collection to structure publication. • Enhance productivity by allowing remote collaborators to participate in experimental choices at the beam line. • Facilitate collaborative experiments in such areas as drug design and structural genomics. • Fully utilize National resources for crystallographic experiments.
Video Feed Data Collection Web-based Data Viewer Data Reduction and Structure Analysis File and Project Management A Collaborative Research Environment Local/Remote Users
Design Choices for Collaboratory Implementation • Distributed architecture. • Collaboratory services will be hosted by a large number of computers at the National labs, but this infrastructure will be transparent to the remote scientist, who will see an integrated view of the experiment through the client software. • Platform independence. • The remote scientist will be able to run the client software on any widely available computer operating system, and Collaboratory servers will be designed to avoid obsolescence when computer hardware is upgraded. • Network performance. • “Thin” clients will optimally utilize the available network bandwidth. • Secure access. • Remote access will be via a secure channel, and users will be able to specify who can access the data. • Crystallographic applications. • A full suite of widely used crystallographic software will be made available to remote users through a Windows Terminal Server platform. • Permanent archive. • Raw data will be written to a 1000-Terabyte tape storage system at the San Diego Supercomputer Center. Permanent data access will permit more accurate structural analysis. • Tiered approach. • WWW applications for minimal access • Windows Terminal Server ICA environment for access to the full suite of X-ray software • BLU-ICE in native client-server mode for a full-performance environment
History of the Collaboratory • Development Progress (planned) • 1998: Assessment of basic needs • 1999: Design, evaluation of needs, and testing of existing software • 2000: Implementation of standalone communication tools, networking of crystallographic software and development of Collaboratory backbone • 2001: Beta-testing of the Collaboratory backbone • 2002: End of testing and launch of Collaboratory • Development Progress (implemented) • 1998: Assessment of basic needs • 1999: Advisory Panel Meeting for definition of priorities and development plan; Diffraction Image Viewer as first web application; design and specification of BLU-ICE as the unified control and data collection environment; implementation of a single OS environment at the beam lines. • 2000: Access to data collection, data analysis, image viewing, and office software via a single user account with fully integrated file access; BLU-ICE launched on BL9-2 (fall 1999), with full launch on all beam lines in November 2000; prototype developments of database implementations for data collection. • Initial Proposal: March 1998 • Initiation of Funding: September 1998 • Staffing: • Nick Sauter and Limin Yang were hired in late 1998; both moved on in Winter 1999, to LBNL and a software company. • In Fall 1999, significant responsibility for the Collaboratory was assumed by Timothy McPhillips for software design, Scott McPhillips for software engineering and Peter Kuhn for scientific direction. • Thomas Eriksson joined in March 2000 as Systems Developer. • Fred Bertsch will join on August 14th 2000 as Sci. Software Developer. • An offer to the candidate for lead scientist is currently being drafted, with an expected starting date of October 2000.
WWW-Diffraction Image Viewer • Prototype Web-application • Logon via authorized Unix Account • Browse live data directory • View JPG of diffraction images • Zoom, Contrast controls • Image viewer is now implemented at NSLS
Legacy X-Windows Applications Can Run Within an ICA Client without Modification Citrix ICA Client Showing SGI Desktop at SSRL Data Analysis Application Running at SSRL SGI Desktop at home lab
Citrix ICA Client as an Example of a “Good” Thin Client • Complete working environment • Feels like a complete workstation in a window. • Supports multiple graphical applications running simultaneously. • User need only install the free ICA client. • High performance • X11 performance and responsiveness in ICA session comparable to a local workstation. • Cross-platform • Client available for all popular operating systems, including DOS, Windows, MacOS, Linux, and many flavors of Unix. • Applications run identically on all client platforms. • Integrated with Client Computer • Local file systems and printers on client computer are automatically accessible in ICA session. • Thin • Does not take up significant CPU or memory resources on client machine. • Only 20 kbit/sec of bandwidth needed for full performance • Robust • Does not hang or cause client computer to crash. [Diagram labels: ICA client anywhere in the world; modem or Internet (< 20 Kbps); Collaboratory Citrix server; X11 protocol over Gigabit LAN; Unix beam line computers and central CPU and file servers]
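To put the 20 kbit/sec figure in context, a rough worked calculation can compare it with shipping raw images to the remote site instead of a screen session. The sketch below is illustrative only: the 8 MB image size and the function name are assumptions, not figures from the presentation, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes: int, link_bits_per_sec: int) -> float:
    """Idealized time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_sec


# Hypothetical raw diffraction image size (an assumption for illustration).
ASSUMED_IMAGE_BYTES = 8_000_000
# 20 kbit/sec, the bandwidth quoted on the slide for a full-performance ICA session.
THIN_CLIENT_LINK_BPS = 20_000

seconds = transfer_seconds(ASSUMED_IMAGE_BYTES, THIN_CLIENT_LINK_BPS)
print(f"{seconds:.0f} s (~{seconds / 60:.0f} min) per raw image at 20 kbit/sec")
```

Under these assumptions a single raw image would need roughly 53 minutes on such a link, which is one way to see why the design keeps the data and applications at SSRL and sends only the display to the remote user.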
BLU-ICE – a unified data collection interface • Insert movie here
Assessment of the Current Status of the Collaboratory • Evolutionary History of the SSRL Collaboratory • Previous success indicators ranked by importance • Previous success indicators grouped by ‘theme’ • Unfulfilled success indicators • Largest project: Archival System • New success indicators • What are the global, evolutionary goals • What will be the future bottlenecks in SMB • What are the projects that need particular attention • What are the projects that benefit the most from a collaborative environment • What are the new development goals • Database environment for all system parameters to enter imgCIF; database will be made available and becomes information source for http://biosync.sdsc.edu web-sites and SSRL internal web-sites • Integrated account system that enables individual user accounts and shared group accounts • Grouping and prioritization of success indicators
Top 30 Success Indicators from 2/5/1999 Meeting
1.25 Ability to transfer data in and out of Collaboratory
1.27 24/7 availability
1.31 Rapid feedback to the user during the data collection run
1.44 Beam line safety
1.47 Ease of use; friendliness of user environment
1.53 Security and reliability (free of malicious and accidental interruption)
1.53 Responsiveness to user suggestions
1.57 Increased percentage of successful experiments
1.63 User-friendly interface for camera and beam line motion control
1.63 Database of methods, tutorials, and example files
1.63 Responsive, high-speed user interface at remote (worldwide) locations
1.63 Rapid processing of CPU-intensive jobs
1.81 Early characterization of user needs and wishes
1.86 Reduced time from data collection to structure solution
1.87 Availability and quality of training: safety, hardware, crystallography methods, software
1.88 Ability of researcher at the home lab to monitor & participate in real-time strategic decisions at the beam line
1.88 Access to computer resources from remote locations once the data collection run has ended
2.06 Availability of complete toolkit for solving X-ray structure
2.19 Permanent archiving of data
2.20 Compliance with IUCr standards
2.25 24/7 user support
2.27 Ability of researchers at multiple locations applications on screen, e.g., molecular modeling.
2.31 Turn-key operation instead of traditional methods
2.40 Increased throughput: number of user groups and number of data frames collected
2.50 Availability of all legacy applications for solving X-ray structure
2.53 Scalability and ability of other synchrotron sources to use Collaboratory model
2.53 Willingness of users to collaborate & involve more researchers on a given project
2.59 High resolution video feed to monitor microscope and goniometer
2.67 Beam line control from remote location
Grouped success indicators – 1 • Capabilities • Remote control & video presence: • High resolution video feed to monitor microscope and goniometer • under development • Beam line control from remote location • available now through the Citrix ICA client; on native platforms in Q1 2001 • currently developing the security protocols needed to allow remote access • Ability of researcher at the home lab to monitor & participate in real-time strategic decisions at the beam line • current tools allow test users to experiment with this capability; the WWW image viewer gives all users access to their data • User-friendly interface for detector and beam line motion control • available now at beam line 9-2 with BLU-ICE • Data processing: • Availability of complete tool kit for solving X-ray structure • MAD structures are routinely solved at BL9-2; not yet implemented in a collaborative way • It will require a larger effort to integrate software from different sources • Access to computer resources from remote locations once the data collection run has ended • Available now for test users, but requirements are not yet defined for larger-scale use • Access to sufficient compute resources for rapid data processing during the experiment • Part of ‘regular’ user operations; three 4-processor 667 MHz systems will support the 5 beam lines from Nov • Permanent archiving of data • all image data will be in imgCIF format from Nov 2000; archival under development • Ability to transfer data in and out of Collaboratory (see archival) • Turn-key operation instead of traditional methods • 120-second movie shows impact of advanced instrumentation and software environments. Still developing methods for rapid determination of ‘best’ energies for MAD and general data collection strategies • Database of methods, tutorials, and example files • To be developed
Grouped success indicators - 2 • Accessibility • Availability: • 24/7 availability • Test systems are ‘standalone’, production systems will include multi-system failover environment • Distributed control system has built in ‘watch-dog’ and other safety features that enhance high uptime • Software engineering principles result in high robustness • Security and reliability: • Free of malicious and accidental interruption • implemented basic security precautions for all remote access avenues • Beam line safety • X-ray accidents are not possible because regular hutch safety protocols are never circumvented. • Responsiveness: • Responsive, high-speed user interface at remote locations • Tested from Singapore, Hong Kong, Erice (Italy), and numerous places in the US • Rapid processing of CPU-intensive jobs • Adequate CPU power for current use but expansion of capabilities required for post-experimental access • 24/7 user support • Support at the beam lines is 24/7, extension to remote users is under study • Collaboratory development has triggered equipping all support staff with cell phones and high-speed internet connections to home locations • Compliance with IUCr standards • Compliance with mmCIF standards and SDSC archive formats; mmCIF will be used from Nov 2000
Grouped success indicators - 3 • User Experience • Rapid feedback to the user during data collection • Users generally process their data in real-time at the beam line. Test users process data remotely. • Ease of use; friendliness of user environment • User feedback has been positive • Responsiveness to user suggestions • Software upgrades and system improvements are based on user feedback. BLU-ICE for data collection was initiated in mid-1999, developed with user feedback, launched in Nov 1999, and revised with user feedback; it will control all beam lines by Nov 2000 • Early characterization of user needs and wishes • See above • Availability and quality of training: safety, hardware, crystallography methods, software • Expansion of smb.slac.stanford.edu web pages; the remote collaboration tools will be documented in full when they become available; enhanced scientific support through additional support from NIGMS; SMB School2000 in September 2000. • Scientific Progress • Increased percentage of successful experiments • Reduced time from data collection to structure solution • Increased throughput; number of user groups and data frames collected
Current Plans from Previous Success Indicators • Capabilities • Remote control & video presence: • High resolution video feed to monitor microscope and goniometer • under development • Ability of researcher at the home lab to monitor & participate in real-time strategic decisions at the beam line • full collaborative environment under development; BLU-ICE client-server to include data reduction • Data processing: • Availability of complete tool kit for solving X-ray structure • MAD structures are routinely solved at BL9-2; not yet implemented in a collaborative way • Specialized RUNS window within BLU-ICE that allows auto-selection of MAD energies, ultra-high resolution strategies, integration of Kappa strategy • Permanent archiving of data • under development; highest demand project because it carries responsibility for the data • Ability to transfer data in and out of Collaboratory (see archival) • Database of methods, tutorials, and example files • To be developed; SMB Team is collaborating with outside groups
A Distributed Architecture for Data Archiving Web Browser Interface Data Collection Software Collaboratory File Browser Unix Command Line Interface SSRL Data Archive Server SSRL Data Archive Database Hard Disk At Home Lab Storage Resource Broker (SRB) At SDSC SSRL RAID System HPSS at SDSC
Metadata for Diffraction Images • File Parameters • Creation date • Access control list • Tape archive status • User annotation • Annotation by data processing software • Move, rename, and copy tracking • Image File Header • Thumbnail View • Larger JPEG View
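The file parameters listed on this slide could be pictured as one record per archived image. The following is a minimal sketch, not the archive's actual schema: the class name, field names, types, and example paths are all hypothetical and chosen only to mirror the slide's bullet list.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Set, Tuple


@dataclass
class ImageFileRecord:
    """Hypothetical per-file metadata record; every field name is illustrative."""
    path: str
    creation_date: datetime
    access_control_list: Set[str] = field(default_factory=set)   # users allowed to read
    on_tape: bool = False                                         # tape archive status
    user_annotation: str = ""
    software_annotations: List[str] = field(default_factory=list)
    # Move, rename, and copy tracking as (operation, old_path, new_path) entries.
    history: List[Tuple[str, str, str]] = field(default_factory=list)

    def can_read(self, user: str) -> bool:
        """Return True if the user appears in the access control list."""
        return user in self.access_control_list

    def rename(self, new_path: str) -> None:
        """Record the rename in the tracking history, then update the path."""
        self.history.append(("rename", self.path, new_path))
        self.path = new_path


# Example usage with made-up values.
record = ImageFileRecord(
    path="/data/demo/img_001.img",
    creation_date=datetime(2000, 8, 11),
    access_control_list={"alice"},
)
record.rename("/data/demo/lysozyme_001.img")
```

Keeping the rename history on the record itself is just one possible design; a real archive database might store tracking events in a separate table.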
New Success Indicators • What are the global, evolutionary goals; how do we define scientific progress • Increased percentage of successful experiments • Reduced time from data collection to structure solution • Increased throughput; number of user groups and data frames collected • What will be the future bottlenecks in SMB • What are the projects that need particular attention • What are the projects that benefit the most from a collaborative environment • What are the new development goals • Database environment for all system parameters to enter imgCIF; database will be made available and becomes information source for http://biosync.sdsc.edu web-sites and SSRL internal web-sites • Integrated account system that enables individual user accounts and shared group accounts for data sharing and project separation as needed • New WWW-Diffraction data viewer, additional WWW tools • Integration with the scheduling process • Integration with the proposal review process • Closing the chapter of BLU-ICE • Next revision as full production version • Only additional modules but no new developments
Future Plans and Directions • Collaboratory Software and its use at other synchrotrons and within other disciplines • XAS-Collaboratory • Short report on current status • SRRC, Spring8 • Canadian Light Source • Implementation of BLU-ICE on rotating anode systems • ALS MCF BL5.0. • ALS superbending magnet beam lines • Maintenance vs. development; service vs. collaboration • SRRC MoU for multi-year collaboration • What is needed to develop a Collaboratory environment • Future Directions and the Interface with High-Throughput Data collection and the Joint Center for Structural Genomics • Overview of JCSG • Overview of ASAP