Thesis Joris Roovers: The GridEcon Project

As I mentioned in a previous blog post, I will now write a bit more about the GridEcon Project.
The website of the GridEcon Project gives the following project description:

GridEcon 1.0 is a computational resource auctioning system built upon an innovative bid matching algorithm tailored specifically for the trading of computing power. It brings together both consumers and providers of available computational resources and not only has the capability to deliver those resources just like Amazon, IBM and Microsoft, but also allows you to trade your own excess computational power... all in a single, cost-effective marketplace.

In other words, the GridEcon project is in fact an attempt to implement the system that I am currently researching.

Is my work done then? Not quite. Although GridEcon's market mechanism contains some good ideas, it also makes an assumption that prevents it from being useful in a real-world situation.
That is, it assumes that the 'unit of trade' is a single VM, and that providers let consumers run such a VM on their machines. Although GridEcon acknowledges that providers offer different types of VMs with different characteristics, it does not provide an elaborate way for providers and consumers to specify which kind of VM is offered or requested: both parties have to choose from a predefined set of VMs. Virtualization is a core technology behind most IaaS implementations, and many consider standardization of VM formats critical for the real commoditization of cloud resources, but I believe that offering only a predefined set of VMs is too constraining. A more elaborate resource specification model (such as the one proposed by Dube) is required. This allows for more differentiation between cloud providers and more choice for the consumer.
Note that this also allows consumers to specify only their actual requirements, without needing to worry about all the other specifics of the VM. For example, a university researcher only needs to know that the machine he'll be using runs Fedora Linux 11 and has a dual-core 2.2 GHz processor with 2 GB of RAM. The amount of disk space, network connectivity, etc. might not be important for the researcher's application (of course, these components should still have a certain minimal value, e.g. 2 GB of disk space and 10 Mbit WAN connectivity). Using a more elaborate requirement/offering syntax (again, such as the one proposed by Dube) would allow matching of such partial requirement specifications.
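
To make the idea of matching a partial requirement specification a bit more concrete, here is a minimal Java sketch. It is neither Dube's syntax nor anything from GridEcon: the class name `PartialSpecMatcher`, the attribute keys and the "at least this much" rule for numeric attributes are assumptions made purely for illustration, and non-numeric attributes such as the operating system are left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: checks an offer against a partial requirement specification. */
final class PartialSpecMatcher {

    /**
     * Returns true when the offer meets every attribute the consumer specified,
     * plus the market-wide minimum for every attribute the consumer left out.
     * All attributes are numeric and interpreted as "at least this much".
     */
    static boolean satisfies(Map<String, Double> offer,
                             Map<String, Double> required,
                             Map<String, Double> marketMinimums) {
        // Start from the market minimums, then let the consumer's own
        // requirements replace the minimum for any attribute they mention.
        Map<String, Double> effective = new HashMap<>(marketMinimums);
        effective.putAll(required);

        for (Map.Entry<String, Double> entry : effective.entrySet()) {
            Double offered = offer.get(entry.getKey());
            // An attribute the offer does not describe counts as not satisfied.
            if (offered == null || offered < entry.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```

With requirements of `cores = 2`, `cpuGhz = 2.2` and `ramGb = 2`, and market minimums of `diskGb = 2` and `wanMbit = 10`, an offer only has to list disk and network values above the minimums; the researcher never has to mention them.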

Besides this remark, I do believe that GridEcon provides some other valuable ideas that, combined with some of Dube's concepts, can lead to a good, more realistic market model.
For example, the GridEcon project gives more attention to the time constraints that are often part of offers and bids in a grid/cloud resource auction. Dube also briefly mentions time constraints, but doesn't take them into account when matching bids with offers (this is partly because Dube's focus is not on the matching algorithm itself, but more on the market behavior that results from the actions of consumers and providers). In a real-world implementation of a cloud spot market, taking those time constraints into account is a necessity. Because the market clearing algorithm of the GridEcon project does take these constraints into account, I believe that it may be valuable when I look more closely at the matching algorithm.
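
As a small illustration of what "taking time constraints into account" could mean during matching, the sketch below checks whether a requested run fits inside a provider's availability. This is an assumed, simplified model (one availability window per offer, one requested window plus a duration per bid), not the rule GridEcon actually uses; the names `TimeWindow` and `canHost` are hypothetical.

```java
/** Hypothetical time window, in arbitrary time units (e.g. minutes since some epoch). */
final class TimeWindow {
    final long start; // inclusive
    final long end;   // exclusive

    TimeWindow(long start, long end) {
        if (end < start) {
            throw new IllegalArgumentException("end before start");
        }
        this.start = start;
        this.end = end;
    }

    /**
     * Returns true when a job of the given duration can run entirely inside
     * both the provider's availability and the consumer's requested window.
     */
    static boolean canHost(TimeWindow availability, TimeWindow requested, long duration) {
        // The job cannot start before both windows are open...
        long earliestStart = Math.max(availability.start, requested.start);
        // ...and must finish before either window closes.
        long latestEnd = Math.min(availability.end, requested.end);
        return duration > 0 && earliestStart + duration <= latestEnd;
    }
}
```

A matching algorithm could use such a check as a filter before it even compares prices, so that a bid is never paired with an offer that cannot host it in time.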

GridEcon also gives a good overview of which parts of the market can be considered auxiliary services, and can thus be provided by brokers instead of the market itself. Although GridEcon's market architecture is rather complex, I like the concept of keeping the market itself fairly simple and delegating more complex tasks to brokers (possibly controlled or monitored by the market or by a separate Grid Authority). Such a simple market would only consist of mechanisms to accept bids and offers, to match them, and to contact other brokers that handle accounting, administration and connecting consumers with providers.
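
The split between a simple market and its brokers could look roughly like the following Java sketch. It does not mirror GridEcon's actual architecture or interfaces: `Broker`, `SimpleMarket` and the first-come-first-served pairing are hypothetical placeholders that only show the delegation idea.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical auxiliary service, e.g. accounting or a consumer-provider connector. */
interface Broker {
    void onMatch(String consumerId, String providerId);
}

/** Hypothetical market core: accepts bids and offers, matches them, informs brokers. */
final class SimpleMarket {
    private final List<Broker> brokers = new ArrayList<>();
    private final List<String> openBids = new ArrayList<>();
    private final List<String> openOffers = new ArrayList<>();

    void register(Broker broker) {
        brokers.add(broker);
    }

    void submitBid(String consumerId) {
        openBids.add(consumerId);
        match();
    }

    void submitOffer(String providerId) {
        openOffers.add(providerId);
        match();
    }

    // Placeholder matching: pairs the oldest open bid with the oldest open offer.
    // A real market would compare requirements, prices and time constraints here.
    private void match() {
        while (!openBids.isEmpty() && !openOffers.isEmpty()) {
            String consumer = openBids.remove(0);
            String provider = openOffers.remove(0);
            // Everything beyond the match itself is delegated to the brokers.
            for (Broker broker : brokers) {
                broker.onMatch(consumer, provider);
            }
        }
    }
}
```

The market knows nothing about accounting or connection setup; it only announces a match, and each registered broker decides what to do with it.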

GridEcon demo and implementation
Since the GridEcon project already implements many of the ideas I'm researching, I thought it was a good idea to take a closer look at the online demo (username:user1, password:user1) and at the source code.

Although the online demo seems to work well (I was able to place bids and offers and let them match), I was a bit disappointed by the offered functionality. For example, although the GridEcon research papers mention different types of VMs that can be chosen by the provider/consumer, the demo only seems to allow a single type of VM, and instead lets the user choose between different types of applications to offer/use. This is not really the functionality I had in mind. In my opinion, the market should not know which applications are run on the offered VMs. What is more, the market really should not be allowed to know this.

Submitting an offer in the GridEcon demo

Although the basic functionality seems to work, I was not really convinced by this demo (although some of its concepts might be usable).

When looking at the Java source code, I confirmed that the whole project uses web services to connect the different brokers and components (I had already read about this in the various papers). I think that using web services is certainly a good choice since, after all, with cloud computing we are working in a web services environment.

After investigating some more, I was surprised to see that some parts of the code were relatively well documented. However, my promoter Kurt Vanmechelen pointed out to me that large parts of the code were less well documented and that a lot of the classes are generated using Apache Axis. Setting up the whole project (to see whether it could be reused) would therefore probably involve regenerating all those classes, or using older versions of Axis, Tomcat, etc., which is not very attractive either. Since this could involve quite a lot of work (without being sure that I can actually use some of the code), I decided that I won't be doing that for now.

Combining GridEcon and Dube: an overview
Having read about the GridEcon project and Dube's thesis, I will now try to outline a first rough idea of how I believe a market model for IaaS resources can be implemented.

I believe that the best way to approach the problem is to work top-down and incrementally. That is, we should start by defining and implementing the market itself, gradually specifying the exact workings of its inner components and gradually adding other components that interact with it. As said before, this market should focus on accepting bids and offers and matching them. This functionality should be offered through a web service API. The syntax for bids should be expressive enough to contain detailed application requirements, a price (range) and time constraints. The syntax for offers should allow a detailed instance specification, a price and time constraints. Using XML to define this syntax, so that it is human-readable and plays nicely with web services, is probably the way to go.
The matching algorithm should maximize profit for both parties (note that this is a core concept of the double auction), while taking price and time constraints into account. It is probably a good idea to first consider future bids/offers (in which time constraints are less of an issue), and then handle spot bids/offers (more real-time behaviour). The exact details of the algorithm are to be defined later (taking a more in-depth look at GridEcon's and Dube's matching algorithms is probably a good idea).
Once those components are in place, a notification service will probably be required to keep the provider, the consumer and any (future) administrative services updated.
After that, I think that providing a separate emulation service is a good idea in order to test the market and monitor its behaviour under various provider and consumer strategies. Note that this implies that the market itself should provide an interface to easily extract information about its current state.
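
To give a feeling for the price side of the matching step described above, here is a minimal Java sketch of a common textbook way to clear a double auction: sort bids from high to low and offers from low to high, pair them while the bid still covers the offer, and split the difference. This is not GridEcon's or Dube's algorithm, it ignores requirements and time constraints, and the names `PricedOrder`, `Trade` and `DoubleAuction` as well as the midpoint pricing rule are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical single-unit bid or offer: who placed it and at which price. */
final class PricedOrder {
    final String party;
    final double price;

    PricedOrder(String party, double price) {
        this.party = party;
        this.price = price;
    }
}

/** Hypothetical result of matching one bid with one offer. */
final class Trade {
    final String consumer;
    final String provider;
    final double price;

    Trade(String consumer, String provider, double price) {
        this.consumer = consumer;
        this.provider = provider;
        this.price = price;
    }
}

final class DoubleAuction {
    /** Pairs the highest bids with the lowest offers while the bid covers the offer. */
    static List<Trade> clear(List<PricedOrder> bids, List<PricedOrder> offers) {
        // Work on copies so the caller's lists are left untouched.
        List<PricedOrder> sortedBids = new ArrayList<>(bids);
        List<PricedOrder> sortedOffers = new ArrayList<>(offers);
        sortedBids.sort(Comparator.comparingDouble((PricedOrder o) -> o.price).reversed());
        sortedOffers.sort(Comparator.comparingDouble((PricedOrder o) -> o.price));

        List<Trade> trades = new ArrayList<>();
        int pairs = Math.min(sortedBids.size(), sortedOffers.size());
        for (int i = 0; i < pairs; i++) {
            PricedOrder bid = sortedBids.get(i);
            PricedOrder offer = sortedOffers.get(i);
            if (bid.price < offer.price) {
                break; // no later pair can trade either, since both lists are sorted
            }
            // Assumed pricing rule: both parties share the surplus equally.
            double tradePrice = (bid.price + offer.price) / 2.0;
            trades.add(new Trade(bid.party, offer.party, tradePrice));
        }
        return trades;
    }
}
```

An emulation service could call such a `clear` method repeatedly with bids and offers generated by different consumer and provider strategies, and inspect the returned trades to monitor the market's behaviour.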

Initial Market Design

Of course, once this is implemented, other components should also be provided (such as accounting and consumer-provider connector services). Although providing these components is a necessity to make the market actually usable, I'm not sure whether I'll implement them, as they require a substantial coding effort. I'm a strong believer in quality over quantity, and as such I will first focus on delivering quality market and emulation components.

A last remark: technical details
Although it is too early to decide which technologies I'll be using, it goes without saying that I'll be using open technologies that are widely used by cloud providers and on the web. The only thing that I'm quite sure of is that, because of its pervasiveness on the web (and quite frankly, because of my personal affinity), I'll most probably be using the Java programming language.
I'll also pay attention to the ease of deployment, since I personally feel that this is important if any component is to be reused in the future.

Expect an update within a couple of days, when I've worked out some of the details of this first proposal.


Thursday, October 14, 2010

The GridEcon Project - Thoughts and Ideas
