A Collaborative Monitoring Mechanism for Making a Multitenant Platform Accountable HotCloud '10 By Xuanran Zong
Background • Applications are moving to the cloud • Pay-as-you-go basis • Resource multiplexing • Reduced over-provisioning cost • Cloud service uncertainty • How do clients know whether the cloud provider handles their data and logic correctly? • Logic correctness • Consistency constraints • Performance
Service level agreement (SLA) • To assure clients that data and logic are handled correctly, the service provider offers a service level agreement • Performance • e.g., one EC2 compute unit has the computation power of a 1-1.2 GHz CPU • Availability • e.g., the service will be up 99.9% of the time
SLA • Problems • Clients are given few means to hold an SLA accountable when a problem occurs • Accountable means we know who is responsible when things go wrong • Monitoring is provided by the provider itself • Clients are often required to furnish all the evidence themselves to be eligible to claim credit for an SLA violation
EC2 SLA Reference: http://usenix.org/events/hotcloud10/tech/slides/wangc.pdf
Accountability service • Provided by a third party • Responsibilities • Collect evidence based on the SLA • Runtime compliance checking and problem detection
Problem description • Clients have a set of endpoints {ep0, ep1, …, epn-1} that operate on data stored in a multitenant environment • Many things can go wrong • Data is modified without the owner's permission • A consistency requirement is broken • The accountability service should detect these issues and provide evidence
System architecture • Wrapper provided by the third party • The wrapper captures the input/output of each endpoint epi and sends it to the accountability service (a sketch follows below)
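A minimal sketch of what such a wrapper could look like, assuming a hypothetical `acct_service` client with a `send_log` method (these names are illustrative, not from the paper):

```python
import functools
import time

def accountable(endpoint_id, acct_service):
    """Wrap an endpoint ep_i so every invocation's input and output
    are forwarded to the accountability service as a log message."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)        # run the real endpoint
            acct_service.send_log({             # evidence for the monitor
                "endpoint": endpoint_id,
                "timestamp": time.time(),
                "input": {"args": args, "kwargs": kwargs},
                "output": result,
            })
            return result
        return wrapper
    return decorator
```

A client would decorate each data operation, e.g. `@accountable("ep0", acct_service)`, so logging requires no change to the endpoint's own logic.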
Accountability service • The accountability service maintains a view of the data state • Reflects what the data should be from the users' perspective • Aggregates the users' data update requests to calculate the data state • Authenticates query results against the calculated data state
Evidence collection and processing • The logging service wep extracts operation information and sends a log message to the accountability service W • If it is an update operation, W updates the MB-tree • If it is a query operation, W authenticates the result against the MB-tree, checking correctness and completeness • The MB-tree maintains the data state (see the dispatch sketch below)
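A sketch of W's dispatch logic under these assumptions; the MB-tree interface shown here (`update`, `verify`) is hypothetical:

```python
class AccountabilityService:
    """Consumes log messages produced by the per-endpoint loggers w_ep."""

    def __init__(self, mbtree):
        self.mbtree = mbtree                    # authenticated data state

    def process_log(self, log):
        if log["kind"] == "update":
            # Advance the data state to reflect the client's update.
            self.mbtree.update(log["key"], log["value"])
        elif log["kind"] == "query":
            # Check the returned result and its VO against the current root.
            if not self.mbtree.verify(log["range"], log["result"], log["vo"]):
                self.report_violation(log)      # evidence of an SLA breach

    def report_violation(self, log):
        print("violation detected:", log)
```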
Data state calculation • Use a Merkle B-tree (MB-tree) to maintain the data state • By combining the items in the verification object (VO), we can recalculate the root of the MB-tree and compare it against the maintained root, revealing whether the query result is correct and complete
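As a concrete illustration of the idea, here is a minimal sketch using a plain binary Merkle tree rather than the paper's Merkle B-tree: the service keeps only the root hash, and a returned item is accepted if it recombines with the sibling hashes in the VO back to that root.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Fold leaf hashes pairwise up to a single root hash."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                  # duplicate last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_leaf(leaf, vo_path, trusted_root):
    """Recombine a returned item with the sibling hashes in its VO.
    A match with the trusted root proves the item is genuine; a full
    range-query check would also include boundary items in the VO to
    establish completeness."""
    node = h(leaf)
    for sibling, sibling_is_left in vo_path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == trusted_root

# Example: authenticate leaf b"a" out of four items.
leaves = [b"a", b"b", b"c", b"d"]
root = merkle_root(leaves)
vo = [(h(b"b"), False), (h(h(b"c") + h(b"d")), False)]
assert verify_leaf(b"a", vo, root)
```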
Consistency issue • What if log messages arrive out of order? • Assume eventual consistency • Clocks are synchronized • Maintain a sliding window of log messages sorted by timestamp • The window size is determined by the maximum delay of delivering a log message from a client to W (see the sketch below)
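A sketch of such a window, assuming synchronized clocks and a known maximum delivery delay (all names are illustrative):

```python
import heapq
import itertools

class LogWindow:
    """Buffer log messages for up to max_delay seconds so that late
    arrivals can be merged in timestamp order before processing."""

    def __init__(self, max_delay):
        self.max_delay = max_delay
        self.heap = []                      # min-heap keyed on timestamp
        self.seq = itertools.count()        # tie-breaker for equal timestamps

    def add(self, timestamp, msg):
        heapq.heappush(self.heap, (timestamp, next(self.seq), msg))

    def drain(self, now):
        """Release, oldest first, every message that can no longer be
        overtaken by a late arrival."""
        ready = []
        while self.heap and self.heap[0][0] <= now - self.max_delay:
            ts, _, msg = heapq.heappop(self.heap)
            ready.append((ts, msg))
        return ready
```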
Collaborative monitoring mechanism • Current approach is centralized, raising concerns about availability, scalability, and trustworthiness • Let's make it distributed • The data state is maintained by a set of services • Each service maintains a view of the data state
Design choice I • A log is sent to one data state service, which then propagates it to the other services synchronously • Pros • Strong consistency • A request can be answered by any service • Cons • Large overhead due to synchronous communication
Design choice II • A log is sent to one service, which propagates it asynchronously • Pros • Better logging performance • Cons • Uncertainty in answering an authentication request
Their design • Somewhere between the two extremes • Partition the key range into a few disjoint regions • A log message is sent only to its designated region • Log messages are propagated synchronously within a region and asynchronously across regions • An authentication request is directed to the service whose region overlaps most with the request range • Answer with certainty if the request range falls inside the service's region • Wait otherwise (see the routing sketch below)
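A sketch of the routing rule, assuming half-open key ranges; the region layout and function names are illustrative:

```python
def overlap(a, b):
    """Length of the intersection of two half-open key ranges."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return max(0, hi - lo)

def route_request(request_range, regions):
    """Direct an authentication request to the data state service whose
    region overlaps the request range the most. The service can answer
    immediately only if its region covers the whole range; otherwise it
    must wait for asynchronous propagation from the other regions."""
    best = max(regions, key=lambda r: overlap(request_range, r))
    fully_covered = best[0] <= request_range[0] and request_range[1] <= best[1]
    return best, fully_covered

# Example: key space [0, 300) split into three regions.
regions = [(0, 100), (100, 200), (200, 300)]
print(route_request((120, 180), regions))   # ((100, 200), True)  -> answer now
print(route_request((150, 250), regions))   # ((100, 200), False) -> must wait
```

The second request straddles two regions, which is exactly the case where the design trades certainty for logging performance.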
Evaluation • Overhead • Centralized design • Where does the overhead come from?
Evaluation • VO calculation overhead
Evaluation • Performance improvement with multiple data state services
Discussion • The paper articulates the problem clearly and shows one solution that employs a third party to make the data state accountable • Which part is the main overhead: communication or VO calculation? • The distributed design does not help much when the query range is large • Are clients willing to sacrifice performance (at least doubling the time) to make the service accountable? • Can a similar design make other aspects, such as performance, accountable?