Scalable Cryptographic Authentication for High Performance Computing. Andrew Prout, William Arcand, David Bestor, Chansup Byun, Bill Bergeron, Matthew Hubbell, Jeremy Kepner, Peter Michaleas, Julie Mullen, Albert Reuther, Antonio Rosa 2012 IEEE High Performance Extreme Computing Conference
Scalable Cryptographic Authentication for High Performance Computing
Andrew Prout, William Arcand, David Bestor, Chansup Byun, Bill Bergeron, Matthew Hubbell, Jeremy Kepner, Peter Michaleas, Julie Mullen, Albert Reuther, Antonio Rosa
2012 IEEE High Performance Extreme Computing Conference, 10-12 September 2012
This work is sponsored by the Department of the Air Force under Air Force contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the United States Government.
Outline
• What is the LLGrid
• The Problem: External services authentication
• The Solution: Cryptographic authentication
• Results
LLGrid System Architecture
• LLGrid is a ~500 user, ~2000 processor system
• World’s only desktop interactive supercomputer
• Dramatically easier to use than any other supercomputer
• Highest fraction of staff (20%) using supercomputing of any organization on the planet
• Foundation of Supercomputing in Massachusetts
[Diagram components: Users, LAN Switch (to Lincoln LAN, LLAN), Cluster Switch, Service Nodes, Compute Nodes, Network Storage, Resource Manager, Configuration Server]
LLGrid Usage
• Desktop Computing: CPU time <20 minutes
• Classic Supercomputing: wall-clock time >3 hours
• Interactive Supercomputing: between desktop and classic supercomputing
• Shortens the “time to insight”
• Ten development turns/day instead of one turn/week
[Figure: All jobs run on LLGrid. Axes: total job duration (seconds, 1 to 1M) and processors used by job (1 to 1000). Regions: Desktop Computing, Interactive Supercomputing, Classic Supercomputing. Systems: TX-2500 (952 cores), TX-X (220 cores), TX-3d (540 cores).]
Challenges with Interactive Supercomputing
• As the line between a shared supercomputer and a “really powerful personal computer” blurs, users expect to have access to network resources (storage, svn, cvs, etc.).
Challenge: Users expect seamless access to other network resources from the HPC.
Challenges with Interactive Supercomputing
• However, these commands raise security concerns.
• They store passwords as plain text on the HPC central storage (for example, “S3cr3t”).
• Password synchronization has made this one password very sensitive.
Challenge: Ensure seamless access without putting the user’s “one common password” at risk.
Cryptographic Authentication
• Cryptographic authentication of clients using X509 PKI certificates has long been part of the SSL and TLS standards.
• The root of trust certifies that a specific keypair belongs to a specific user or process.
[Diagram: exchange between User and Server]
1. User to Server: Connection Request
2. Server to User: Authentication Request
3. User to Server: Signed Authentication Response and copy of PKI certificate
4. Server to User: Access Granted: Welcome Andy!
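The exchange above is what a TLS handshake with client authentication performs. As a rough illustration only, the following Python sketch uses the standard `ssl` module to build a server context that refuses clients without a certificate from a trusted root, and a matching client context. The function names and file parameters are hypothetical, and the sketch loads the client key from a file purely to show the handshake settings; it does not reflect the PKCS#11-based key handling described later in the slides.

```python
import ssl


def make_server_context(ca_file=None, cert_file=None, key_file=None):
    """Build a TLS server context that demands a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Reject any client that does not present a certificate chaining
    # to a trusted root (the "Authentication Request" step).
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        # The root of trust that certifies user keypairs (assumed path).
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file is not None:
        # The server's own certificate and key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def make_client_context(ca_file=None, cert_file=None, key_file=None):
    """Build a TLS client context that can answer a certificate request."""
    # PROTOCOL_TLS_CLIENT verifies the server certificate and hostname by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file is not None:
        # The client certificate sent with the signed response.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


server_ctx = make_server_context()
client_ctx = make_client_context()
print(server_ctx.verify_mode, client_ctx.check_hostname)
```

With real certificate files supplied, wrapping a socket with each context would carry out the four steps listed above during the handshake.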
Challenges with Cryptographic Authentication
• Cryptographic authentication depends on both the security of the user’s private key and access to it.
• Storing the private key on central storage is little different than storing a user’s password.
• No guarantee the key won’t be lost, copied or left unprotected.
Challenge: Where to store the private key?
Challenges with Cryptographic Authentication
• One traditional solution is to store the key on the client system and forward authentication requests back to the user’s system.
• The key could reside on the client system itself or in a smart card.
• However, this fails if the user disconnects from the HPC.
• Forwarding requests back doesn’t work for semi-interactive computing or background jobs.
Challenges with Cryptographic Authentication
• Connecting smart cards to the HPC is not practical.
• Some network-attached key storage devices exist, but their practical benefit in this scenario is questionable.
Challenges with Cryptographic Authentication
• We implemented a virtual smart card that runs on each node.
• Allows keys to be used on any node, connected or disconnected.
• Allows for different keys on each node.
Virtual Smart Card Defined
• Uses the smart card communication API: PKCS#11.
• Authenticates users and allows authorized users to perform cryptographic operations.
• Protects private keys from being copied, even by authorized users of the key.
• High throughput capability and low latency.
• Physical smart cards have a latency of approximately 800-900 ms.
The keyd Daemon: A Virtual Smart Card
• We created the keyd daemon to be the brains of our virtual smart card.
• Runs as its own user account.
• Has access to all the keys.
• We then created a library that conforms to the PKCS#11 standard and can talk to this daemon.
• Loaded by applications running as an HPC user.
• Connects through a Unix socket.
• User credentials are passed through the socket.
• Secure, provided you trust your Linux kernel.
• The SVN client can then load the PKCS#11 library and use the keys to authenticate to the SVN server.
• Other applications can be enabled in the future.
[Diagram: application loads the PKCS#11 library, which talks to keyd]
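The slides do not say which mechanism keyd uses to receive user credentials over the Unix socket. One common Linux option is the `SO_PEERCRED` socket option, where the kernel itself reports the peer's process, user, and group IDs, which fits the "trust your Linux kernel" remark. The following Python sketch is an illustration under that assumption; `peer_credentials` and `authorize` are hypothetical helper names, not part of keyd.

```python
import os
import socket
import struct

# Layout of Linux "struct ucred": pid_t pid, uid_t uid, gid_t gid.
UCRED_FORMAT = "iII"


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process at the other end of a Unix socket."""
    raw = conn.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(UCRED_FORMAT)
    )
    return struct.unpack(UCRED_FORMAT, raw)


def authorize(conn, key_owner_uid):
    """Allow a key operation only if the kernel-reported UID owns the key."""
    _pid, uid, _gid = peer_credentials(conn)
    return uid == key_owner_uid


# Demonstration with a connected pair of Unix sockets in one process:
# the kernel reports this process's own credentials for the peer.
daemon_end, client_end = socket.socketpair()
with daemon_end, client_end:
    print(peer_credentials(daemon_end))
    print(authorize(daemon_end, os.getuid()))
```

Because the credentials come from the kernel rather than from data the client sends, a client cannot claim another user's identity, which is why the daemon can hold every key while releasing operations only to the right user.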
Configuring SVN for TLS Client Auth
• The SVN server was configured to accept the LLGrid’s root of trust.
• The SVN client on the LLGrid was configured to load the keyd daemon PKCS#11 library.
• One configuration entry: ssl-pkcs11-provider=libkeyd_pkcs11
[Diagram: exchange between SVN User (with Keyd Daemon) and SVN Server: Connection Request, Authentication Request, Signed Authentication Response and copy of PKI certificate]
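The slide gives only the configuration entry itself. As a small illustration, the Python sketch below parses an INI-style fragment of the kind Subversion keeps in its client `servers` file and reads the provider back. Placing the entry under `[global]` is an assumption here, and `pkcs11_provider` is a hypothetical helper for checking a node's configuration, not part of Subversion or keyd.

```python
import configparser

# Assumed layout: the entry from the slide placed in the [global] section.
SERVERS_SNIPPET = """\
[global]
ssl-pkcs11-provider = libkeyd_pkcs11
"""


def pkcs11_provider(servers_text, section="global"):
    """Return the configured PKCS#11 provider name, or None if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(servers_text)
    return parser.get(section, "ssl-pkcs11-provider", fallback=None)


print(pkcs11_provider(SERVERS_SNIPPET))
```

A check like this could be run on each node to confirm the client is pointed at the keyd library before users rely on it.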
X509 PKI Certificate Enrollment
• Keypair generation and X509 PKI certificate creation are performed during user account creation.
• LLGrid Administrators act as the root of trust.
• We developed scripts that execute parallel key generation across nodes in the cluster.
• Each certificate asserts both the user identity and the node identity, so that it meets the guidelines for use in either server or client TLS authentication.
[Figure: Keypair and certificate generation time (seconds) versus number of nodes]
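The enrollment scripts themselves are not shown in the slides. The following Python sketch only illustrates the general idea of fanning key generation out over many nodes in parallel. The subject layout (`CN` for the user, `OU` for the node), the output directory, and the function names are all assumptions for illustration. The sketch builds one `openssl req` command per node and hands it to a pluggable runner; a real deployment would execute each command on its own node and then have the administrators sign the resulting requests.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def keygen_command(user, node, out_dir="/tmp/keys"):
    """Build an openssl command creating a keypair and signing request."""
    stem = f"{out_dir}/{user}.{node}"
    return [
        "openssl", "req", "-new",
        "-newkey", "rsa:2048",   # generate a fresh 2048-bit RSA keypair
        "-nodes",                # leave the key file unencrypted
        "-keyout", f"{stem}.key",
        "-out", f"{stem}.csr",
        # Hypothetical subject naming both the user and the node.
        "-subj", f"/CN={user}/OU={node}",
    ]


def generate_for_nodes(user, nodes, run=subprocess.check_call, max_workers=16):
    """Run one key generation per node concurrently; return results by node."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            node: pool.submit(run, keygen_command(user, node)) for node in nodes
        }
        return {node: future.result() for node, future in futures.items()}


# Dry run: print the commands instead of executing them.
generate_for_nodes("alice", ["node-001", "node-002"], run=print)
```

Passing `run=print` keeps the example free of side effects; swapping in a runner that wraps the command in a remote shell would move the work onto the nodes themselves.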
Results
• Created a general purpose key storage and certificate management solution for HPC.
• Keys are not managed by the end-user, ensuring a low risk of compromise requiring revocation.
• Demonstrated that it can be used to enable single sign-on integration to systems outside of the HPC.
• Mitigated security concerns over passwords being stored on the LLGrid central storage.
• Avoided the issue of periodic password changes impacting batch processing.
Future Work
• Future work will look to use these PKI certificates to secure inter-node web services communication.
• Certificates are valid for both TLS client and server authentication.