GENI Spiral 4 Architecture Plan Marshall Brinn, GPO 11-18-2011
Overview • Over the course of the remaining GENI spirals, I suggest we expand and define the GENI Architecture in several dimensions: • Interoperability: Assuring that GENI can readily manage heterogeneous aggregates and control planes • Accountability: Assuring that all actions taken by experimenters are authorized and logged • Usability: Making it easier to create, monitor, modify and shut down experiments. • Scalability: Assuring that GENI can support additional aggregates and experimenters nearly without limit.
Themes for Spiral 4 Some goals along these dimensions for the end of Spiral 4 (GEC15). The details of these are provided in following slides. • Interoperability • Enhancement of AM API or RSpecs to include required resource attributes, to be matched to resource attributes managed by AM’s. • Integration of current stitching tools into AM API (or Stitching API on AM’s) • Accountability • Design and initial implementation of Clearinghouse (CH) including request portal, AuthN/AuthZ, logging, forensics • Usability • Expansion of Experimenter tools for supporting life-cycle of an experiment to GENI-wide applicability • Integration of GENI I&M and AM architectures • Design for “Opt-in User” support • Scalability • Design of CH, AM to support hierarchical composition, delegation, redundancy
Challenge Problems • To motivate and focus the Spiral 4 effort, I suggest these challenge problems to be demonstrated at GEC15: • Bad Actor Shutdown • Aggregate Independence • These scenarios will require and foster tight collaboration between the I&M, Operations and Software Architecture GENI groups. • We should plan for these scenarios to run and be demonstrated on the new (ExoGENI, InstaGENI) racks.
Challenge Problem: Bad Actor Shutdown • In this scenario, we have our two standard actors, “Coffee Girl” (CG) and “Test Tube Guy” (TTG). • Both TTG and CG allocate their own slices and start to work with them • They share some of the same aggregates • CG begins to “act badly” • She requests resources beyond her policy limitations • The PI of her project is given a warning email. • She starts a DoS attack on her resources • The GMOC notices the behavior and decides to shut her down • All processes associated with her experiment are killed, all resources are released • Her certificate is withdrawn, a notification to project PI is sent. • Through all this, TTG’s work continues relatively undisturbed. This scenario emphasizes the role of the GENI Clearinghouse in real-time security control.
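The shutdown sequence above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Aggregate` class, the `shutdown` function, and the in-memory credential set and PI inbox are hypothetical names invented here, not part of any GENI API.

```python
class Aggregate:
    """Hypothetical aggregate that tracks slivers per experimenter."""

    def __init__(self, name):
        self.name = name
        self.slivers = {}  # experimenter -> list of resource ids

    def release_all(self, experimenter):
        # Release everything held by one experimenter; others are untouched.
        return self.slivers.pop(experimenter, [])


def shutdown(experimenter, valid_credentials, aggregates, pi_inbox):
    """Withdraw the credential, release resources, and notify the PI."""
    valid_credentials.discard(experimenter)
    released = {}
    for am in aggregates:
        freed = am.release_all(experimenter)
        if freed:
            released[am.name] = freed
    pi_inbox.append(f"{experimenter} was shut down")
    return released


# Two experimenters sharing one aggregate (assumed example data).
shared = Aggregate("shared-am")
shared.slivers = {"cg": ["vm1", "vm2"], "ttg": ["vm3"]}
creds = {"cg", "ttg"}
inbox = []
released = shutdown("cg", creds, [shared], inbox)
```

After the call, only the bad actor's slivers and credential are gone; the other experimenter's allocation on the shared aggregate is left as it was.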
Challenge Problem: Aggregate Independence • This scenario allows Experimenters to specify their requirements and find appropriate resources without naming specific resources in their RSpecs. • By augmenting the RSpec, aggregates can advertise and experimenters can request services at a level above individual resources • The AM API supports a match service to allow for finding resources on a given AM matching a given spec. • The Experimenter (or proxy service) is able to construct a slice without any knowledge of underlying available hardware resources. This scenario emphasizes a technology-agnostic approach to GENI federation: We should be able to add and immediately use new resources by specifying their capabilities and user requirements rather than specific resources by name/URN.
Interoperability for Aggregate Managers: Require Little, Allow Much I suggest a ‘big tent’ approach by which we encourage/allow many groups to provide their resources to GENI via the AM API. • GENI defines a set of services and attributes that are required • Any AM that provides less than these is non-compliant. • GENI defines a set of services and attributes that are optional • Any AM that provides more than these is non-standard. • Beyond this, we should take a technology-agnostic approach: we don’t care how you implement the AM API.
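The required/optional rule above amounts to a set comparison. A minimal sketch, assuming placeholder service names (the slides do not list which services are required or optional, so the sets below are purely illustrative):

```python
# Illustrative placeholders only; the actual required and optional
# sets are not given in the slides.
REQUIRED = frozenset({"list_resources", "create_sliver", "delete_sliver"})
OPTIONAL = frozenset({"stitching", "match"})


def classify_am(provided):
    """Classify an AM by the set of services it provides."""
    provided = frozenset(provided)
    if not REQUIRED <= provided:
        return "non-compliant"   # provides less than the required set
    if provided - REQUIRED - OPTIONAL:
        return "non-standard"    # provides more than required + optional
    return "compliant"
```

Under the 'big tent' approach, a "non-standard" result need not mean rejection; it only flags that the AM offers something beyond what GENI has defined.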
Interoperability: AM API Enhancement • Extensions to AM API to support interoperability and ‘technology agnostic’ approach to computation resource allocation • RSpec provides optional S/W or H/W attributes on requested resources • E.g. “can run VM”, “can run GPU”, “can run Java 1.6”, “provides persistence”, “can access layer 1 network”, “is ‘near’ a given location” • Aggregates provide attributes of what they can provide • Match requested attributes of requested resources against attributes of available resources By ‘technology agnostic’, we mean that experimenters shouldn’t care whether they are getting PL or PG or other resources, so long as they fit their requirements. If they don’t specify requirements, they shouldn’t care what kind of resource they get back.
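The matching step described above could look roughly like the following. This is a sketch under assumptions: attributes are modeled as plain strings, and the `find_matches` function and the aggregate dictionary layout are hypothetical, not an existing AM API call.

```python
def find_matches(requested, aggregates):
    """Return (aggregate, resource) pairs whose advertised attributes
    cover every requested attribute.

    aggregates: dict mapping aggregate name -> {resource name: set of attributes}
    """
    wanted = set(requested)
    return [
        (am, res)
        for am, resources in sorted(aggregates.items())
        for res, attrs in sorted(resources.items())
        if wanted <= attrs
    ]


# Assumed example advertisements from two aggregates.
advertised = {
    "am-a": {"node1": {"can run VM", "provides persistence"}},
    "am-b": {"node2": {"can run VM", "can run GPU"}},
}
vm_hosts = find_matches({"can run VM"}, advertised)
gpu_hosts = find_matches({"can run VM", "can run GPU"}, advertised)
anything = find_matches(set(), advertised)
```

An empty request matches every resource, which mirrors the point that experimenters who specify no requirements should not care what kind of resource they get back.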
Interoperability: Stitching API • We should integrate and standardize, as appropriate, the current stitching tools and RSpec provisions into an interface provided by each AM • Possibly as an extension of the AM API, possibly a different required ‘Stitching’ API.
GENI Clearinghouse and Authentication, Authorization, Accountability
[Diagram: the Experimenter, the Aggregate Manager (AM), and the Clearinghouse (CH), which holds Request Logs and Aggregate State.]
0) CH contains credentials, roles, policies of all AM’s, experimenters, PI’s, projects.
1) Experimenter identifies self to CH and receives authenticated credentials.
2) Experimenter approaches AM’s with RSpec (requirements) and credentials. Experimenter may be supported by automated proxy service (e.g. Slice Manager, Experiment Manager) authorized to interact with AM’s and aggregates on behalf of the Experimenter.
3) AM registers resource request with CH and receives a policy-based approval/denial. It is a configuration decision whether this is a synchronous or asynchronous response, and if asynchronous, whether the AM may allocate the resource provisionally while awaiting final authorization.
4) All requests (successful or not) to access resources by a principal are logged, so that current state of resource allocations can be readily traced back to experimenter and PI as needed.
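The request flow on this slide can be sketched as a small in-memory model. Everything here is an assumption made for illustration: the class and method names, the string credentials, and the per-experimenter resource limit standing in for "policy". It models only the synchronous case.

```python
from dataclasses import dataclass, field


@dataclass
class Clearinghouse:
    limits: dict                                    # step 0: policy (max resources per experimenter)
    issued: set = field(default_factory=set)        # credentials handed out
    allocated: dict = field(default_factory=dict)   # aggregate state
    log: list = field(default_factory=list)         # request log

    def authenticate(self, experimenter):
        # Step 1: a known experimenter receives a credential.
        if experimenter not in self.limits:
            raise PermissionError(f"unknown experimenter: {experimenter}")
        credential = f"cred:{experimenter}"
        self.issued.add(credential)
        return credential

    def authorize(self, credential, am_name, count):
        # Step 3: policy-based approval or denial.
        known = credential in self.issued
        experimenter = credential.split(":", 1)[1] if known else None
        used = self.allocated.get(experimenter, 0)
        approved = known and used + count <= self.limits[experimenter]
        if approved:
            self.allocated[experimenter] = used + count
        # Step 4: every request is logged, successful or not.
        self.log.append((credential, am_name, count, approved))
        return approved


class AggregateManager:
    def __init__(self, name, ch):
        self.name = name
        self.ch = ch

    def request(self, credential, count):
        # Step 2 arrives here; the AM registers the request with the CH.
        return self.ch.authorize(credential, self.name, count)


ch = Clearinghouse(limits={"ttg": 2})
am = AggregateManager("am1", ch)
cred = ch.authenticate("ttg")
first = am.request(cred, 2)              # within the limit
second = am.request(cred, 1)             # would exceed the limit
forged = am.request("cred:forged", 1)    # credential never issued
```

Because denied requests are logged alongside approved ones, the log is enough to reconstruct who asked for what, which is the accountability property the slide asks for.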
Accountability: Clearinghouse Implementation • CH will support • Portal API for allowing experimenters to request slice services by gaining authenticated credentials • Authentication of experimenters, validation of credentials, authorization of requests relative to project, role, resource policy requirements • Logging service (either CH-internal or based on common long-lived logging service) to account for all attempts to access resources • Extremely high availability and defensibility • Spiral 4 implementation may not be redundant/federated • But to be designed for future scalability in Spirals 5 and 6
Usability: Experiment Management Tools • There are many useful tools available to Experimenters to help them manage the life-cycle of an experiment: • Slice request, allocation, reconfiguration, release • Experiment configuration, controls, result logging, reproducible configurations and workflows • Some have been developed or ported to support GENI API’s • E.g. Flack, Raven, omni • Many of these are control framework specific • The goal is to make as many of them as we can applicable GENI-wide • I’m thinking of (not exclusively): • Slice Manager (ORCA) • Broker (ORCA) • Slicing (Lehman and ORCA) • I&M Tools (GIMI: OMF/OML and GEMINI) These tools should not require significant changes to the AM API, but will significantly lower the barrier to entry for new experimenters or other application developers.
Usability: Integration of I&M and AM Architectures • We currently have two relatively independent architectures co-located on GENI Resources: the I&M (Instrumentation and Measurement) and AM (Aggregate Manager) • In Spiral 4, I would like to start the process of unifying these • Specify what sensors and data sinks should be placed where within slices or long-lived services • Expand/standardize notions of “Aspect-oriented” I&M • Integrate (or jointly develop) I&M and slice management tools • Define a standard ‘Report’ interface from Aggregates and slices to GMOC or other monitoring tools
Usability: Opt-in User Design • While there are many legal and social issues to be worked out to make ‘Opt-in Users’ a reality, we can move forward to design the architectural approach • How is an opt-in user presented with the option to ‘opt-in’? • What happens when they take that option? • What possibility is there to subsequently ‘opt-out’? • What can/should we do to handle opt-in user data specially?
Scalability: AM • We should support AM’s that are “managers of AM’s”, i.e. present a set of AM’s as a single aggregate • This should allow for a highly scalable configuration: arbitrary hierarchies of AM managers, which can satisfy requests or delegate requests to subordinates. • Similarly, we should support coordinated replication of AM state to allow for redundant (more reliable) storage • Future spirals will focus on scalability of CH and common services
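The "manager of AM's" idea maps naturally onto a composite structure in which a parent exposes the same interface as its children. A minimal sketch, assuming hypothetical `LeafAM` and `CompositeAM` classes, interchangeable resources counted as integers, unique leaf names, and a simple greedy delegation order (none of which is specified in the slides):

```python
class LeafAM:
    """Hypothetical AM that directly owns a pool of identical resources."""

    def __init__(self, name, capacity):
        self.name = name
        self.free = capacity

    def available(self):
        return self.free

    def allocate(self, count):
        if count > self.free:
            raise ValueError(f"{self.name}: insufficient resources")
        self.free -= count
        return {self.name: count} if count else {}


class CompositeAM:
    """Presents a set of AMs (leaf or composite) as a single aggregate."""

    def __init__(self, name, children):
        self.name = name
        self.children = list(children)

    def available(self):
        return sum(child.available() for child in self.children)

    def allocate(self, count):
        if count > self.available():
            raise ValueError(f"{self.name}: insufficient resources")
        grants = {}
        for child in self.children:
            if count == 0:
                break
            take = min(count, child.available())
            if take:
                grants.update(child.allocate(take))  # delegate to subordinate
                count -= take
        return grants


# A two-level hierarchy: "top" manages "racks" (itself a manager) and leaf "c".
racks = CompositeAM("racks", [LeafAM("a", 2), LeafAM("b", 3)])
top = CompositeAM("top", [racks, LeafAM("c", 1)])
first_grant = top.allocate(4)
second_grant = top.allocate(2)
```

Since a composite and a leaf answer the same two calls, hierarchies can be nested to arbitrary depth without the requester knowing how many levels sit beneath the AM it talks to.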
Architecture Expansion Themes To be integrated with expansion from other groups (Operations, I&M).
Themes for Future Spirals • QoS-based specs • Often, Experimenters know what kind of performance they need but don’t (or shouldn’t) know about the underlying resources. • We should provide a high-level spec of performance that compiles into RSpecs, and augment the SM to dynamically allocate/release resources to try to attain the requested QoS performance. • Creation/support of scalable, distributed long-lived services available to other slices
Themes for Future Spirals [2] • Requirements-based resource allocation • Allocating/releasing resources at points in a slice’s workflow so that they are only taken when needed • Clearinghouse Scalability • Distribution, Redundancy, Federation of Clearinghouses to allow multiple entry points, sharing of AuthZ/AuthN services, logging. • Other long-lived services should be redundantly federated at this point as well. • Opt-In User Policy-based Implementation • Implement design captured in Spiral 4, incorporate in future challenge problem • Subverted Aggregate Manager • How to detect, contain and recover from an Aggregate Manager that has been taken over by an outsider
Themes for Future Spirals [3] • Production Quality I&M Services • Production Quality Operations Services • Seamless, complete integration with WiMAX, OpenFlow • Ongoing enhancement of Experimenter Tools