Learn how Condor provides a secure environment for users and admins through authentication, encryption, integrity checks, and flexible authorization policies. Explore network security methods, user mapping, and scalability solutions.
Overview • Principles • Network Security • Local Security
Principles • Goals • Provide a secure environment for both users and administrators • Do so in a flexible and configurable way • Make sure the system is scalable
Principles • Condor supports a wide array of security features • Authentication mechanisms • Encryption • Integrity checks • User mapping • Flexible authorization policies
Principles • Condor supports a wide array of security features • Advanced session management • Integration with the Windows LSA secure registry storage • Privilege separation mechanisms • Condor’s PrivSep implementation • glExec
Network Security • Because Condor is deployed on many computers connected via a network, point-to-point security is very important • It is generally the job of the Condor administrator to configure the network security of their Condor cluster
Network Security • Authentication – Identifying the entity we are communicating with • Basic authentication mechanisms • FS – (UNIX only) Relies on the security of the filesystem for local authentication • NTSSPI – (Windows only) Uses the standard Windows authentication mechanism
Network Security • Strong authentication mechanisms • KERBEROS (ActiveDirectory) – uses an online KDC (ADC on Windows) with user and service principals • SSL – based on PKI and X.509 certificates • GSI – based on the Globus Toolkit with support for multiple certificate authorities, proxies, VOMS attributes, etc.
Network Security • Other authentication mechanisms • PASSWORD – simple and strong but not useful beyond securing a single pool • ANONYMOUS – the remote entity is not known • CLAIMTOBE – generally for testing purposes only
Network Security • Example Condor configurations
• Authenticate administrator operations with Kerberos
    SEC_ADMINISTRATOR_AUTHENTICATION = REQUIRED
    SEC_ADMINISTRATOR_AUTHENTICATION_METHODS = KERBEROS
• Authenticate everything
    SEC_DEFAULT_AUTHENTICATION = REQUIRED
    SEC_DEFAULT_AUTHENTICATION_METHODS = FS, SSL, GSI, KERBEROS
Network Security • Encrypting data means that an outside party cannot view the contents of the data • Integrity checks allow Condor to determine if data was modified by an outside party while en route • Condor can require both with two configuration settings
    SEC_DEFAULT_ENCRYPTION = REQUIRED
    SEC_DEFAULT_INTEGRITY = REQUIRED
Security Negotiation • When two entities communicate, they may have different abilities, preferences, and requirements for security features • For example, the client may support X509 and KERBEROS, and the server may support FS, GSI, and KERBEROS • Either side may require encryption or integrity • The two sides must negotiate to agree on which mechanisms to use
Security Negotiation • This negotiation is done by exchanging ClassAds specifying security capabilities, preferences, and requirements • The server has final say over which mechanisms will be used • The client can choose to reject this decision and close the connection if its requirements are not met
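The negotiation described above can be sketched in a few lines of Python. This is a simplified illustration, not Condor's implementation: the level names follow Condor's REQUIRED / PREFERRED / OPTIONAL / NEVER settings, but the combination rule and the function names `resolve_feature` and `choose_method` are assumptions made for the example.

```python
# Simplified sketch of security negotiation; not Condor's actual code.
LEVELS = ("NEVER", "OPTIONAL", "PREFERRED", "REQUIRED")


def resolve_feature(client: str, server: str) -> bool:
    """Decide whether a feature such as encryption is turned on.

    Raises ValueError when one side requires the feature and the other
    refuses it, which corresponds to closing the connection.
    """
    for level in (client, server):
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
    pair = {client, server}
    if "REQUIRED" in pair and "NEVER" in pair:
        raise ValueError("incompatible requirements; connection refused")
    if "REQUIRED" in pair:
        return True
    if "NEVER" in pair:
        return False
    # Neither side insists: enable only if at least one side prefers it.
    return "PREFERRED" in pair


def choose_method(client_methods, server_methods):
    """Pick the first method in the server's list that the client supports."""
    for method in server_methods:  # server order wins: it has the final say
        if method in client_methods:
            return method
    raise ValueError("no authentication method in common")


# The example from the slide: only KERBEROS is common to both sides.
method = choose_method(["X509", "KERBEROS"], ["FS", "GSI", "KERBEROS"])
```

Iterating over the server's list first is one way to model "the server has final say"; the client's only remaining option is to walk away, modeled here by the raised exception.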
User Mapping • Once a connection is authenticated, Condor maps the authenticated identity to a canonical user • Can be done with a simple grid-mapfile • Condor supports its own mapfile which allows the use of regular expressions • Condor can also use VOMS attributes, extract portions of a Kerberos principal or X.509 DN, or make callouts to other mapping services
Example Condor Map File
    GSI "/DC=org/DC=doegrids/OU=People/CN=Zach Miller 428167" zmiller
    GSI "/DC=org/DC=doegrids/OU=People/CN=.*,/vdt/Role=Admin" condor
    GSI (.*) GSS_ASSIST_GRIDMAP
    FS (.*) \1
    FS_REMOTE (.*) \1
    SSL (.*) ssl@unmapped
    KERBEROS ([^/]*)/?[^@]*@(.*) \1@\2
    NTSSPI (.*) \1
    PASSWORD (.*) \1
    CLAIMTOBE (.*) \1
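To make the regular-expression entries concrete, here is a small Python sketch that applies a few of the map file entries above. It assumes that entries are tried in order, that the first match wins, and that a pattern must match the whole identity; these are assumptions for illustration and may differ from Condor's own matching rules. The function name `map_identity` is hypothetical, and the special `GSS_ASSIST_GRIDMAP` callout is left out.

```python
import re

# A few entries from the example map file: (method, pattern, canonical name).
MAP_ENTRIES = [
    ("GSI", r"/DC=org/DC=doegrids/OU=People/CN=Zach Miller 428167", "zmiller"),
    ("FS", r"(.*)", r"\1"),
    ("SSL", r"(.*)", "ssl@unmapped"),
    ("KERBEROS", r"([^/]*)/?[^@]*@(.*)", r"\1@\2"),
]


def map_identity(method, identity, entries=MAP_ENTRIES):
    """Return the canonical user for an authenticated identity, or None."""
    for entry_method, pattern, template in entries:
        if entry_method != method:
            continue
        match = re.fullmatch(pattern, identity)  # assumption: whole-string match
        if match:
            # Substitute \1, \2, ... with the captured groups.
            return match.expand(template)
    return None


# The KERBEROS entry drops the instance part of a principal such as
# "zmiller/admin@CS.WISC.EDU" and keeps the user and realm.
canonical = map_identity("KERBEROS", "zmiller/admin@CS.WISC.EDU")
```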
Network Security • So, why not always authenticate, encrypt, and integrity check everything with the strongest methods? • Strong authentication has a fair amount of overhead! • X.509 authentication can take on the order of 0.25 seconds • KERBEROS authentication relies on contacting the KDC over the network • Encrypting everything also isn’t free • Since Condor may be running hundreds or thousands of jobs, the submit machine could easily be overwhelmed
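A quick back-of-the-envelope calculation shows why this matters. The sketch below uses the 0.25 second figure quoted above; the connection counts are hypothetical and the estimate assumes authentications are handled one after another.

```python
AUTH_SECONDS = 0.25  # per-authentication cost quoted above


def total_auth_seconds(connections, seconds_per_auth=AUTH_SECONDS):
    """Time spent authenticating if every connection authenticates from scratch."""
    return connections * seconds_per_auth


# Hypothetical connection counts for a busy submit machine.
for n in (100, 1000, 5000):
    print(f"{n} authentications: about {total_auth_seconds(n):.0f} s")
```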
What to do? • Allow users to specify whether or not their data files need to be encrypted • These settings override the default set by the administrator • This can be done on a file-by-file basis:
    transfer_input_files = private.data, database
    encrypt_input_files = private.data
    dont_encrypt_input_files = database
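The override logic can be sketched as follows. This is an illustration of the idea, not Condor's code: the function names are hypothetical, the submit description is modeled as a plain dictionary, and the rule that `encrypt_input_files` wins when a file appears in both lists is an assumption.

```python
def parse_list(value):
    """Split a comma-separated submit-file value into clean file names."""
    return [item.strip() for item in value.split(",") if item.strip()]


def encryption_plan(submit, default_encrypt):
    """Map each input file to True (encrypt) or False (send in the clear).

    Per-file settings from the user override the administrator's default.
    """
    files = parse_list(submit.get("transfer_input_files", ""))
    encrypt = set(parse_list(submit.get("encrypt_input_files", "")))
    plain = set(parse_list(submit.get("dont_encrypt_input_files", "")))
    plan = {}
    for name in files:
        if name in encrypt:  # assumption: encrypt wins if listed in both
            plan[name] = True
        elif name in plain:
            plan[name] = False
        else:
            plan[name] = default_encrypt
    return plan


submit = {
    "transfer_input_files": "private.data, database",
    "encrypt_input_files": "private.data",
    "dont_encrypt_input_files": "database",
}
plan = encryption_plan(submit, default_encrypt=False)
```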
Scalability • Authentication is expensive • A pool with thousands of machines can easily overwhelm its Central Manager • So, Condor creates security sessions with private symmetric keys • Security sessions can be reused until they expire • Drastically reduces the amount of authentication and key generation • Still provides strong security
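The session idea can be illustrated with a small cache: authenticate a peer once, generate a symmetric key, and reuse both until the session expires. This is a minimal sketch under stated assumptions, not Condor's session manager; the class name `SessionCache`, the 32-byte key size, and the injectable `clock` are all choices made for the example.

```python
import secrets
import time


class SessionCache:
    """Reuse an authenticated session per peer until it expires."""

    def __init__(self, authenticate, lifetime_seconds, clock=time.monotonic):
        self._authenticate = authenticate  # expensive callable: peer -> identity
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._sessions = {}  # peer -> (identity, key, expires_at)

    def get(self, peer):
        """Return (identity, key), authenticating only when no live session exists."""
        now = self._clock()
        cached = self._sessions.get(peer)
        if cached is not None and cached[2] > now:
            return cached[0], cached[1]
        identity = self._authenticate(peer)  # full authentication and mapping
        key = secrets.token_bytes(32)  # fresh private symmetric key
        self._sessions[peer] = (identity, key, now + self._lifetime)
        return identity, key
```

Because the mapped identity is stored with the key, a cache hit skips both the authentication handshake and the user mapping step.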
Scalability • Using sessions also avoids mapping the user each time, which could be a problem if using a mapping service rather than a simple text map file • Sessions can also be “delegated” from one Condor daemon to another. • When a match is made, the Condor negotiator delegates a session to the submit and execute machines so they have a secure channel already in place to launch the job • For more info on advanced session usage, see: http://www.cs.wisc.edu/condor/doc/flexible_sessions.pdf
Local Security • Condor has a number of mechanisms to enhance “Local Security”. • These mechanisms protect running user jobs from each other, protect the local machine from malicious jobs, and limit exposure of sensitive information when running on an untrusted machine.
Local Security • Jobs can be run as the user who submitted them, giving the jobs only the privileges that user has on that system • Condor can also run jobs as “nobody”, giving the jobs even less privilege than the user who submitted them • Condor can even run each job on the system as a different “nobody” user so that the jobs cannot interfere with each other
Local Security • Condor often runs with root privilege so that it can act and spawn jobs on behalf of multiple users • Even in this case, Condor has many safety nets and kill switches. For example, it cannot start a job as root.
Local Security • Condor tries to follow the concept of “least privilege” which means giving a user or process only those abilities which should be needed to complete the task • Execute machines can be configured to use privilege separation • In this case, the Condor daemons do not have root privilege and instead rely on an external mechanism to perform privileged operations: • Condor’s built-in PrivSep • glExec
Condor PrivSep • Condor’s PrivSep mechanism allows Condor to run as an unprivileged user (no root privilege) but still perform some privileged operations • To perform certain tasks, Condor invokes the PrivSep switchboard • Reading and writing files as a specific user • Spawning a process (running a job) as a specific user • Cleaning up data files left behind • The switchboard is a setuid binary that is separate from the rest of the Condor code base
Condor PrivSep • PrivSep switchboard • Being separate from the rest of the (hundreds of thousands of lines of) Condor code makes it much easier to verify correctness and audit for security • The switchboard has a configuration file that contains whitelists of users to act as, directories to act within, and abilities that the switchboard can perform • This limits the capabilities of an attacker who somehow gains control of the Condor binaries • Prevents local privilege escalation to root
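The whitelist check can be pictured with a short sketch. This is a hypothetical model of the idea, not the switchboard's real code or configuration format: `SwitchboardConfig`, `authorize`, the operation names, and the example paths are all invented for illustration, and the path check is purely lexical (it does not resolve symlinks).

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchboardConfig:
    """Hypothetical whitelists, modeled on the description above."""
    allowed_users: frozenset
    allowed_dirs: tuple
    allowed_operations: frozenset


def is_within(path, directory):
    """Lexical containment check after normalizing '..' and '.' components."""
    if not posixpath.isabs(path):
        return False
    path = posixpath.normpath(path)
    directory = posixpath.normpath(directory)
    return path == directory or path.startswith(directory.rstrip("/") + "/")


def authorize(config, operation, user, path):
    """Allow a request only if every part of it is whitelisted."""
    if operation not in config.allowed_operations:
        return False
    if user == "root" or user not in config.allowed_users:
        return False  # never act as root, even if misconfigured
    return any(is_within(path, d) for d in config.allowed_dirs)


config = SwitchboardConfig(
    allowed_users=frozenset({"alice", "bob"}),
    allowed_dirs=("/var/condor/execute",),
    allowed_operations=frozenset({"spawn", "read", "write", "cleanup"}),
)
```

Keeping the decision in one small function mirrors the point made above: a small, separate component is much easier to audit than the full code base.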
glExec • glExec is another mechanism to provide privilege separation • Also allows Condor binaries to run without root privilege • Requires an X.509 user credential to be presented when asking to perform privileged operations • Maps the X.509 credential DN to a local UID and performs the task as that user
glExec • Because glExec requires a valid proxy, it further limits what an attacker could do if they gain control of the Condor processes – the attacker also needs to possess a valid user proxy • When the proxy expires, it can no longer be used to invoke glExec
glExec • Condor’s use of glExec has some shortcomings which are being addressed • If the proxy expires while a job is running, Condor cannot clean up after the job since it can no longer act on behalf of that user • If a new proxy is created while the job is running (via MyProxy or any other mechanism) Condor needs to check that the new proxy maps to the same user
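The second check, that a refreshed proxy must map to the same user, could look roughly like this. It is a sketch under stated assumptions rather than Condor or glExec code: the `Proxy` type, the dictionary-based DN mapping, and the example DNs are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proxy:
    """Hypothetical stand-in for an X.509 proxy credential."""
    subject_dn: str
    expires_at: float  # seconds since the epoch


def map_dn(dn, gridmap):
    """Look up the local user for a DN; None if the DN is unmapped."""
    return gridmap.get(dn)


def may_replace_proxy(old, new, gridmap, now):
    """Accept a new proxy only if it is still valid and maps to the same user."""
    if new.expires_at <= now:
        return False  # an expired proxy cannot be used to invoke glExec
    old_user = map_dn(old.subject_dn, gridmap)
    new_user = map_dn(new.subject_dn, gridmap)
    return old_user is not None and old_user == new_user


# Hypothetical mapping and credentials for illustration.
gridmap = {"/DC=org/CN=Alice": "alice", "/DC=org/CN=Bob": "bob"}
old_proxy = Proxy("/DC=org/CN=Alice", expires_at=1000.0)
renewed = Proxy("/DC=org/CN=Alice", expires_at=5000.0)
```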
Questions? • Ask me now • Email condor-admin@cs.wisc.edu • Email zmiller@cs.wisc.edu • See other talks on security: http://www.cs.wisc.edu/condor/talks.html