Saturday, October 30, 2010

Component based offers and custom constraints

It has been a while since my last blog post. Although nothing appeared on this blog, I've been busy. In the past 2 weeks I've, among other things, continued developing the first market prototype. At this moment, that prototype contains most of the features described in its design, which I discussed in a previous post.
Having implemented the prototype, I tried to validate it by creating bids and asks based on offers of some real-world cloud providers. The results were not really satisfying.

Although some of the providers fit nicely into my proposed double auction model, others really don't. Most notably, providers that offer component based pricing (such as CloudSigma) cannot participate in the market model I proposed. I also realized that many providers offer specific functionality that cannot easily be expressed as a resource constraint (SLAs, response times, certification, etc.). Furthermore, cloud customers might have specific wishes or constraints (region of deployment, SLAs, certification, etc.) that similarly cannot be expressed as resource constraints.

In other words, by developing the prototype I discovered 2 problems:
  1. There is no way to allow component based pricing in the current model
  2. A more generic constraint mechanism is needed for both consumers and providers
When brainstorming about solutions to these 2 issues, I quickly discovered the (presumably) easy fix: allow asks to contain the logic to calculate the price for a bid, and allow custom constraints in the market that include custom logic to determine whether they match another (customer) constraint.
Although both ideas allow greatly improved flexibility and differentiation for cloud providers, they make things considerably harder from a technical point of view. When allowing remote code to run within the market, we need mechanisms in place that verify that the remote code does not breach market security. In other words, we cannot allow arbitrary code to run within the market; extensive analysis of the remote code is absolutely needed if we want the market to remain credible.

Doing such a security analysis is far from trivial. Although examples exist (such as the mechanisms in Apache River, formerly Jini), implementing them in the market would require a lot of time and effort.

There is, however, another way to solve this problem. Instead of running remote code on the market itself, the market can contact providers during the matching process and ask them for a price for a specific resource set, or ask them whether a specific constraint matches another constraint.
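To make this idea a bit more concrete, here is a minimal sketch (in Java) of what such a remote interface could look like. All names and signatures below are made up for illustration; this is not the actual prototype code:

    import java.util.Map;

    // Hypothetical sketch of the two operations the market delegates to a
    // provider (or its broker); names and signatures are illustrative only.
    public interface ProviderService {

        // Quote a price for the resource set requested in a bid,
        // e.g. {"cpu": "2", "memoryMb": "4096", "diskGb": "50"}.
        double quoteFor(Map<String, String> requestedResources);

        // Report whether the provider-side constraint matches the given
        // customer constraint (both expressed as (key, value) pairs).
        boolean matches(Map<String, String> providerConstraint,
                        Map<String, String> customerConstraint);
    }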
Before we have a closer look at both situations (requesting a price and matching a constraint), we first need to consider the impact of this 'reversal of control'.

By delegating market logic to external parties, we can offer improved flexibility without needing mechanisms for complex code analysis, since no remote code runs within the market. However, making such calls to external parties is typically a slow operation, since it involves webservice calls over a network. Because the market potentially needs to make a lot of these calls, it is important that we require a certain amount of QoS from the services that execute the remote logic. The most important constraint will probably be response time (max, avg).
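As a sketch of how the market could enforce such a response time limit, reusing the hypothetical ProviderService interface from above (the exact mechanism and values are just an example):

    import java.util.Map;
    import java.util.concurrent.*;

    // Sketch: enforce a maximum response time on a remote quote request.
    public class QuoteWithTimeout {

        private final ExecutorService executor = Executors.newCachedThreadPool();

        // Returns the quoted price, or null if the provider exceeds maxMillis.
        public Double quote(final ProviderService provider,
                            final Map<String, String> resources,
                            long maxMillis) {
            Future<Double> future = executor.submit(new Callable<Double>() {
                public Double call() {
                    return provider.quoteFor(resources);
                }
            });
            try {
                return future.get(maxMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true);  // too slow: ignore this provider for this round
                return null;
            } catch (Exception e) {   // interrupted or failed: treat as 'no quote'
                return null;
            }
        }
    }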

To achieve this, it is probably interesting for providers to work with brokers that have very fast network access to the market. Such brokers can, for example, run in the same datacenter or even on the same physical machine (in a different JVM) as the market itself. Providers can then deploy their custom logic on these brokers, which guarantees that the QoS requirements are met. Note that the brokers will still need a mechanism to verify that the remote code of providers can be trusted. The market, however, does not: a compromised broker can never lead to a security breach on the market. (A compromised broker can, however, start generating a lot of network traffic or start using a lot of CPU, disk or memory resources, which can be problematic when the market runs on the same host or within the same datacenter. This is not a real issue though, since both the JVM and the virtual machines in which the brokers run can easily be set up in a way that limits their resources.)

The diagram below gives a conceptual overview of how the brokers, market and providers interact (shown for a component based price ask; custom constraint checking uses the same idea).


Let's take a moment to give some more thought to the specific issues of component based pricing and custom constraints.

Component Based Pricing
When introducing component priced asks, the market needs an updated mechanism to efficiently match incoming bids with asks (note that the market still accepts 'normal', fixed-price asks as well).
This can be done by combining the already existing continuous double auction mechanism with a reverse auction for the component priced asks. Basically, the market now contains 2 queues and a list. The 2 queues still have the same function as before: there is an ask queue and a bid queue, sorted by increasing and decreasing price respectively. The list contains the component priced asks. When a new component based ask comes in, it is added to the list. A new 'normal' ask is first matched against the bid queue and, if no match is found, added to the ask queue (the same behaviour as before).
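A minimal sketch of these data structures (all types are illustrative stubs, not the prototype's actual classes):

    import java.util.*;

    // Sketch of the market's book: a price-sorted ask queue, a price-sorted
    // bid queue and a plain list of component priced asks.
    public class MarketBook {

        // Minimal stand-ins for the real offer types.
        public static class Ask { double price; }
        public static class Bid { double price; }
        public static class ComponentAsk { String providerEndpoint; }

        // Asks sorted by increasing price: the cheapest ask is tried first.
        private final PriorityQueue<Ask> asks = new PriorityQueue<Ask>(11,
                new Comparator<Ask>() {
                    public int compare(Ask a, Ask b) {
                        return Double.compare(a.price, b.price);
                    }
                });

        // Bids sorted by decreasing price: the highest bid is tried first.
        private final PriorityQueue<Bid> bids = new PriorityQueue<Bid>(11,
                new Comparator<Bid>() {
                    public int compare(Bid a, Bid b) {
                        return Double.compare(b.price, a.price);
                    }
                });

        // Component priced asks have no fixed price; they are queried
        // on demand during the reverse auction.
        private final List<ComponentAsk> componentAsks = new ArrayList<ComponentAsk>();
    }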
The situation is a bit different when a new bid comes in. In that case, 2 threads are started. The first thread does the normal matching as before. The second thread runs the reverse auction: it contacts every provider (or broker, typically referred to as a 'bidding agent') in the component based ask list (using webservices) and requests a price for the incoming bid. When all these prices are in, the algorithm selects the cheapest component based provider. There are then 2 possibilities: either the other thread has already found a match, or it hasn't. If a match was already found, the algorithm compares the price of that match with the cheapest component based price and returns the cheaper of the two as the actual match. If no match has been found yet, the algorithm checks whether the price of the cheapest component based match is less than the price of the ask that is currently being matched against the incoming bid. If it is, the component based match is returned. If it is not, the algorithm continues searching for a match that is cheaper than the cheapest component based match; if no such match exists, the component based match is returned. If the price of the cheapest component based match equals the price of the cheapest non-component based match, one of the two is chosen at random and returned.
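The sketch below shows a simplified version of this procedure: it just waits for both threads and returns the cheaper result, leaving out the refinement where the first thread keeps searching below the component based price. All types and the two lookup methods are hypothetical placeholders:

    import java.util.Random;
    import java.util.concurrent.*;

    // Simplified sketch of the two-threaded bid matching described above.
    public class BidMatcher {

        public static class Bid { /* resource set, limit price, ... */ }
        public static class Match {
            final double price;
            Match(double price) { this.price = price; }
        }

        private final ExecutorService pool = Executors.newFixedThreadPool(2);

        public Match matchBid(final Bid bid) throws Exception {
            // Thread 1: the normal continuous double auction matching.
            Future<Match> normal = pool.submit(new Callable<Match>() {
                public Match call() { return matchAgainstAskQueue(bid); }
            });
            // Thread 2: the reverse auction over the component priced asks.
            Future<Match> component = pool.submit(new Callable<Match>() {
                public Match call() { return cheapestComponentQuote(bid); }
            });

            Match a = normal.get();
            Match b = component.get();
            if (a == null) return b;
            if (b == null) return a;
            if (a.price != b.price) return (a.price < b.price) ? a : b;
            return new Random().nextBoolean() ? a : b;  // equal prices: pick randomly
        }

        // Placeholders for the real matching logic.
        private Match matchAgainstAskQueue(Bid bid) { return null; }
        private Match cheapestComponentQuote(Bid bid) { return null; }
    }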

This algorithm can be improved by caching the prices of the component based offers. This can easily be done by associating an expiration date (or lease) with the returned price. By letting the market charge more for component based asks that return a short expiration date, there is an incentive for providers to limit the volatility of their component based prices, which increases the effectiveness of the caching mechanism.
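A sketch of what such a lease-based cache could look like (names are illustrative):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch of lease-based price caching: a quote is reused until its
    // expiration time has passed.
    public class QuoteCache {

        public static class Quote {
            final double price;
            final long expiresAtMillis;  // the 'lease' returned by the provider
            Quote(double price, long expiresAtMillis) {
                this.price = price;
                this.expiresAtMillis = expiresAtMillis;
            }
        }

        private final Map<String, Quote> cache = new ConcurrentHashMap<String, Quote>();

        // Returns a still-valid cached quote, or null if it is absent or expired
        // (in which case the caller must request a fresh quote via the webservice).
        public Quote get(String providerAndResourceKey) {
            Quote quote = cache.get(providerAndResourceKey);
            if (quote == null || quote.expiresAtMillis <= System.currentTimeMillis()) {
                return null;
            }
            return quote;
        }

        public void put(String providerAndResourceKey, Quote quote) {
            cache.put(providerAndResourceKey, quote);
        }
    }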

By making sure that providers (or brokers) meet certain QoS constraints and by caching the prices, this algorithm remains fairly efficient while improving the market's flexibility.

Constraints
If we want to allow more generic constraints, it is a good idea to make our matching mechanism more generic by letting the constraints match against each other. In other words: move the matching logic out of the market itself and into the constraints. By doing this (note that we are using the idea of the command pattern here) the market no longer needs to be aware of specific constraint logic, which simplifies the overall market design. An offer (both a bid and an ask) will thus have fewer components:
  • Details about who made the offer (including accounting info, etc)
  • A set of constraints
  • A price
Note that, in this model, there is no longer a difference between a resource, time or any other constraint.

A constraint contains an id which uniquely identifies its type. The market will only try to match constraints with the same type-id (this is done for efficiency reasons; trying to match every constraint of a bid against every constraint of an ask would be very inefficient). Besides a type-id, a constraint also contains (key, value) pairs that give it meaning. For example, a time constraint of an ask may contain ("start", time) and ("end", time+5) to indicate the time interval in which the ask is valid.
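A minimal sketch of this offer/constraint model, including the time constraint example (all names are illustrative, not taken from the prototype):

    import java.util.*;

    // Sketch of the offer/constraint model described above.
    public class Offer {

        public static class Constraint {
            final String typeId;                   // only same-type constraints are matched
            final Map<String, String> properties;  // (key, value) pairs that give it meaning
            Constraint(String typeId, Map<String, String> properties) {
                this.typeId = typeId;
                this.properties = properties;
            }
        }

        final String ownerId;               // who made the offer (accounting info, ...)
        final Set<Constraint> constraints;  // resource, time, legal, custom, ...
        final double price;

        Offer(String ownerId, Set<Constraint> constraints, double price) {
            this.ownerId = ownerId;
            this.constraints = constraints;
            this.price = price;
        }

        public static void main(String[] args) {
            // Example: a time constraint stating that an ask is valid in [time, time+5].
            long time = System.currentTimeMillis();
            Map<String, String> interval = new HashMap<String, String>();
            interval.put("start", String.valueOf(time));
            interval.put("end", String.valueOf(time + 5));
            Constraint timeConstraint = new Constraint("time", interval);

            Offer ask = new Offer("provider-42", Collections.singleton(timeConstraint), 0.12);
            System.out.println(ask.constraints.size() + " constraint(s), price " + ask.price);
        }
    }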

Now, how do custom constraints fit into this picture? The idea is that many types of constraints are already offered by the market. These include resource constraints (e.g. memory/cpu/disk constraints), legal constraints (e.g. SLAs, certification), time constraints and possibly others. These constraints are offered by the market itself and are thus trusted by it. The code that does the actual constraint checking can therefore safely run within the market, which is fast since it is executed locally.
Besides these, there are the so-called custom constraints. These constraints contain (key, value) pairs like normal constraints, but are also annotated with a webservice that can be called to perform the constraint check. That is, when the market wants to match 2 custom constraints of the same type, it makes a call to the annotated webservice (passing the 2 custom constraints), requesting a constraint check.
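Sketched in code, a custom constraint could look like this; how the webservice call itself is made (SOAP, REST, ...) is left open here, so checkRemotely is just a placeholder:

    import java.util.Map;

    // Sketch of a custom constraint: ordinary (key, value) pairs plus the
    // endpoint of the webservice that performs the actual check.
    public class CustomConstraint {

        final String typeId;
        final Map<String, String> properties;
        final String checkServiceUrl;  // the annotated webservice that decides the match

        CustomConstraint(String typeId, Map<String, String> properties,
                         String checkServiceUrl) {
            this.typeId = typeId;
            this.properties = properties;
            this.checkServiceUrl = checkServiceUrl;
        }

        // The market only calls this for two constraints with the same type-id.
        public boolean matches(CustomConstraint other) {
            return checkRemotely(checkServiceUrl, this.properties, other.properties);
        }

        // Placeholder for the actual webservice invocation.
        private static boolean checkRemotely(String url, Map<String, String> a,
                                             Map<String, String> b) {
            throw new UnsupportedOperationException("not implemented in this sketch");
        }
    }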

In order to optimize this idea, we need to minimize the number of webservice calls that are made. Caching might be an option, but it is possibly less interesting here since it is more complex (can we use leases?); I'll need to have a further look into this. Caching is however not the only way to optimize the use of custom constraints. By providing a set of generic constraints that check for property equality or ordering, we can decrease the number of custom constraints that are needed. That is, if we provide an 'EqualConstraint' that checks whether the properties of the 'EqualConstraint' it is matched against are equal to its own properties, we can probably already avoid a lot of custom constraints. Similarly, 'LessThan' and 'GreaterThan' constraints could verify that the properties of the constraint they are matched against are all less or greater than their own properties. Since the code that does this checking is provided by the market, it can run safely and fast within the market.
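For example, a market-provided 'EqualConstraint' could be as simple as the sketch below (the 'LessThan' and 'GreaterThan' variants would compare numeric values instead):

    import java.util.Map;

    // Sketch of the market-provided 'EqualConstraint': it matches when the
    // other constraint holds an equal value for each of its own properties.
    public class EqualConstraint {

        final Map<String, String> properties;

        EqualConstraint(Map<String, String> properties) {
            this.properties = properties;
        }

        // Runs locally inside the market: no webservice call is needed.
        public boolean matches(EqualConstraint other) {
            for (Map.Entry<String, String> entry : properties.entrySet()) {
                String otherValue = other.properties.get(entry.getKey());
                if (!entry.getValue().equals(otherValue)) {
                    return false;
                }
            }
            return true;
        }
    }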

The only question that remains is how these custom constraints are shared between providers and consumers. A directory service in which providers can add custom constraints is the most logical and simple solution.

Prototypes
Although flawed in some ways, the initial prototype already contains most of the CDA logic, so I can definitely reuse parts of it. During this week, I also made a quick (local) prototype of the ideas concerning the custom constraints. I'll probably update this post to put that code online tomorrow. I still have to make a reverse auction prototype for the component based asks, but I think that can be done rather quickly.

The next step involves putting all these prototypes together and making the actual webservice implementation. However, I think I'll explore some other open questions first. 
These include:
  • Can this updated model be applied in the real world?
  • Exploring application based bid specification (how do data storage and transfer costs fit into the model?)
  • Accounting: how does the market earn money?
  • Trust: how can we be sure that a provider actually delivers?
  • Price discovery (see Dube)
  • Match execution: a match is found, how do consumer and provider connect with each other?
  • How do we bootstrap the market?
  • Updating the market architecture (see previous blog post)
