Portability layer

Andrew Spyker edited this page Jul 25, 2014 · 22 revisions

This project already demonstrates cloud portability and cross-cloud capabilities through the changes made to port from Amazon EC2 to SoftLayer and RightScale, but we did not introduce a portability layer in every case. In some cases, the NetflixOSS technology was changed to support the SoftLayer/RightScale IaaS instead of the Amazon EC2 IaaS. We see value in this new cloud support, but recognize that it is not the best long-term strategy. This page describes our experiments with NetflixOSS changes that could allow NetflixOSS to work across multiple IaaS layers without code changes to NetflixOSS for each IaaS layer.

API vs. Domain Model

Various aspects of the NetflixOSS platform are tied both to the Amazon IaaS APIs and to the Amazon IaaS domain model that is returned from and passed to API calls. In most NetflixOSS projects, both the API access and the domain model are provided by the Amazon Java SDK library.

There are NetflixOSS projects where the IaaS API is the key dependency and very little is done with the domain model. An example is Chaos Monkey, which only needs a basic query for an instance id to kill, followed by a terminate call on that instance.
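
To make the "API is the key dependency" case concrete, here is a minimal sketch of how small that API surface can be. The names `InstanceTerminator`, `InMemoryCloud`, and `ChaosSketch` are hypothetical and exist only for illustration; they are not taken from Chaos Monkey, the Amazon Java SDK, or jClouds, and the in-memory class merely stands in for a real cloud client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal IaaS API surface for a Chaos Monkey style tool. */
interface InstanceTerminator {
    /** Returns the ids of the instances currently in the given group. */
    List<String> listInstanceIds(String group);

    /** Terminates the instance with the given id. */
    void terminate(String instanceId);
}

/** In-memory stand-in for a real cloud client (assumption, illustration only). */
class InMemoryCloud implements InstanceTerminator {
    private final Map<String, List<String>> groups = new LinkedHashMap<>();

    void addInstance(String group, String instanceId) {
        groups.computeIfAbsent(group, g -> new ArrayList<>()).add(instanceId);
    }

    @Override
    public List<String> listInstanceIds(String group) {
        List<String> ids = groups.get(group);
        // Return a copy so callers cannot mutate the internal state.
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    @Override
    public void terminate(String instanceId) {
        for (List<String> ids : groups.values()) {
            ids.remove(instanceId);
        }
    }
}

class ChaosSketch {
    /**
     * Queries the group, selects one instance by index, and terminates it.
     * Returns the terminated id, or null when the group has no instances.
     */
    static String killOne(InstanceTerminator cloud, String group, int pick) {
        List<String> ids = cloud.listInstanceIds(group);
        if (ids.isEmpty()) {
            return null;
        }
        String victim = ids.get(Math.floorMod(pick, ids.size()));
        cloud.terminate(victim);
        return victim;
    }
}
```

Because `killOne` depends only on the two-method interface, each IaaS would need just one implementation of `InstanceTerminator`, and no domain model classes leak into the caller.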

There are other NetflixOSS projects with very deep ties to, and usage of, the Amazon domain model. An example is the Asgard management console. Asgard not only makes extensive use of the Amazon APIs, but every screen shown in the console is a view of the model defined by the Java objects in the Asgard caches, which come from the Amazon Java SDK domain model classes.

Looking across the three IaaS APIs that matter within this project (Amazon, SoftLayer, and RightScale) and a fourth that is shown to be important below (OpenStack), it is clear that each provides a different API, domain model, level of detail within the domain model, and availability of a Java library for access. Other languages are important, but NetflixOSS requires a library in a JVM-based language.

See the following table for an example using the IaaS "Image" domain model and APIs. The Image domain entity and API are used only as a simple example. A similar comparison is needed for other IaaS concepts such as instances, clusters, etc.

| Cloud | Image domain model | Number of image domain attributes | Image access API | Java access libraries |
| --- | --- | --- | --- | --- |
| SoftLayer | model | 24 | API | Direct REST (no Java library) |
| RightScale | model | 10 | API | Direct REST (no Java library) |
| Amazon EC2 | model | 21 | API | Amazon Java SDK |
| OpenStack | model | 9 + metadata map | API | OpenStack Java SDK |

Cloud portability library layers

We researched a few options for cloud portability. Each had its own benefits. They are summarized below, along with the experiments we ran to evaluate them.

Apache jClouds (as an API)

Apache jClouds strives to create a common Java access library across multiple clouds. We rewrote the Chaos Monkey port to use it. You can see the changes here. For simple scenarios we found it was possible to implement a single Java API client with a single Java model. However, we do not know whether the client code will also become more complex as the logic and model information become more advanced, because of differences between jClouds providers. We also hit an [issue](https://issues.apache.org/jira/browse/JCLOUDS-163) with jClouds and SoftLayer, which is still being resolved. We believe that for Chaos Monkey this library is likely sufficient (as long as IaaS providers continue to keep the jClouds providers up to date).

After working through this Chaos Monkey/jClouds experiment, here is what we learned:

  1. Once the issue is fixed, jClouds will provide a Java library that simplifies access to the SoftLayer REST APIs from Java.
  2. It is important to use the correct id to reference a running instance. This id is specific to each cloud layer, and its correct usage needs investigation when it is shared across multiple libraries (Asgard, RightScale, SoftLayer, and Simian Army).
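
One way to guard against mixing up layer-specific ids is to carry the issuing layer together with the id. The sketch below is only an illustration of that idea under our own assumptions: the `CloudInstanceId` class and the layer labels used with it are hypothetical and do not come from jClouds, Asgard, RightScale, SoftLayer, or Simian Army.

```java
import java.util.Objects;

/** Hypothetical value type: an instance id paired with the layer that issued it. */
final class CloudInstanceId {
    private final String layer;    // illustrative label, e.g. "softlayer" or "rightscale"
    private final String nativeId; // the id exactly as that layer reports it

    CloudInstanceId(String layer, String nativeId) {
        this.layer = Objects.requireNonNull(layer, "layer");
        this.nativeId = Objects.requireNonNull(nativeId, "nativeId");
    }

    String layer() {
        return layer;
    }

    /** Returns the native id only when the caller targets the issuing layer. */
    String nativeIdFor(String targetLayer) {
        if (!layer.equals(targetLayer)) {
            throw new IllegalArgumentException(
                "id " + nativeId + " was issued by " + layer + ", not " + targetLayer);
        }
        return nativeId;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof CloudInstanceId)) {
            return false;
        }
        CloudInstanceId that = (CloudInstanceId) other;
        return layer.equals(that.layer) && nativeId.equals(that.nativeId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(layer, nativeId);
    }

    @Override
    public String toString() {
        return layer + ":" + nativeId;
    }
}
```

With a type like this, passing an id obtained from one layer to a client for another layer fails immediately instead of silently referencing the wrong instance (or none at all).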

Apache Deltacloud

Similar to RightScale, Apache Deltacloud provides a service that exposes a common REST API across IaaS clouds, along with a server implementation that runs the service. Unlike RightScale, Apache Deltacloud stops there. RightScale goes further in many ways, providing its own services (such as auto scaling and monitoring) and application lifecycle management. We did not want to introduce yet another layer of servers (the Deltacloud services) into our approach and therefore avoided Deltacloud. It might still have been worth considering the REST domain model of Deltacloud because it, by definition, must abstract the IaaS domain model across clouds. Deltacloud provides a native REST API as well as the DMTF CIMI REST API.

OpenStack Java SDK (as a domain)

OpenStack provides a REST API for the IaaS layer that OpenStack implements. Much like Deltacloud (or the RightScale REST API), this IaaS layer defines both a domain model and an API for performing actions against the IaaS that OpenStack provides. In the cases where we knew there was deep usage of the IaaS APIs (beyond what jClouds can currently abstract), we looked to the OpenStack Java SDK (which is a Java binding to the OpenStack REST domain model) for a domain model. We decided to focus on this domain model because OpenStack is a strong leader in open, standards-based IaaS cloud computing (with more than 200 participating companies).

Note that the code below was not maintained going forward and the links below are broken. You can find the code in a semi-checked-in state at: https://github.com/EmergingTechnologyInstitute/asgard/tree/cloudprize-openstackdomainmodel

To experiment with how well the OpenStack Java SDK would work as an IaaS domain model within NetflixOSS, we forked our own Asgard port. You can see how we changed the underlying caches to be based on the OpenStack Nova Compute Model. Then, while working with IaaS providers, we converted from their IaaS model to the common OpenStack model (EC2, RightScale). We then adjusted the Asgard controller and view to work with this OpenStack domain model. Finally, we looked at RightScale returning its view of an instance, mapped to the same OpenStack domain model.

After working through this Asgard/OpenStack experiment (again using Images as the example), here is what we learned:

  1. The domain model required by Asgard is extensive, with every entity from EC2 being exposed to the end user. This meant that the 21 fields exposed in EC2 needed to be mapped to the 9 fields plus the "metadata" map. We weren't able to fit all EC2 fields into those two areas, but the exercise showed that it would be good to either (a) consider adding more common attributes to the base fields of the OpenStack image entity or (b) consider a data model architecture of common/default/required fields plus optional extensions, using the metadata fields more strategically.

  2. The domain model populated from RightScale images was less completely filled in than the one populated from some EC2 images. Similarly, some fields that came from RightScale images had no equivalent among the fields coming from EC2 images. Mostly these were non-core OpenStack image entity fields and were therefore stored in the metadata map. This suggests, as documented in the Asgard part of the wiki, the need for a formal UI skinning architecture in Asgard. A user working with an Image from one cloud should see neither the lowest common denominator model nor a model polluted with fields that do not apply to their cloud.
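
The "core fields plus metadata map" idea from the two lessons above can be sketched as follows. This is a simplified illustration under our own assumptions, not the code from the Asgard fork: `CommonImage` and `ImageMapper` are hypothetical names, the three core fields are a stand-in for the real OpenStack image fields, and the provider-prefixed metadata keys are just one possible convention for keeping cloud-specific attributes apart.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical common image entity: a few core fields plus an open metadata map. */
final class CommonImage {
    final String id;
    final String name;
    final String status;
    final Map<String, String> metadata = new LinkedHashMap<>();

    CommonImage(String id, String name, String status) {
        this.id = id;
        this.name = name;
        this.status = status;
    }
}

final class ImageMapper {
    // Attributes treated as core in this sketch; not the real OpenStack field list.
    private static final Set<String> CORE =
        new HashSet<>(Arrays.asList("id", "name", "status"));

    /** Maps provider attributes to core fields; everything else goes to metadata. */
    static CommonImage fromProvider(String provider, Map<String, String> attrs) {
        CommonImage image =
            new CommonImage(attrs.get("id"), attrs.get("name"), attrs.get("status"));
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            if (!CORE.contains(e.getKey())) {
                // Prefix the key so fields from different clouds never collide.
                image.metadata.put(provider + ":" + e.getKey(), e.getValue());
            }
        }
        return image;
    }

    /** View-side filter: only the extension fields that belong to one provider. */
    static Map<String, String> extensionsFor(CommonImage image, String provider) {
        String prefix = provider + ":";
        Map<String, String> view = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : image.metadata.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                view.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return view;
    }
}
```

A skinned UI could then render the core fields for every cloud and call something like `extensionsFor(image, "ec2")` to decide which additional fields to show, so a user never sees attributes that belong to a different cloud.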

Overall Summary

We have learned the following from all of the portability work:

  1. We have shown that the NetflixOSS platform can be ported to work on another cloud - specifically SoftLayer and RightScale. Certain projects (like Asgard) can also, once the porting exists, work across multiple clouds simultaneously.

  2. There is work beyond basic porting to allow NetflixOSS code to remain unchanged when working across multiple clouds. This work requires a portability layer for both IaaS APIs and IaaS domain models.

  3. The domain models differ considerably between IaaS cloud portability REST interfaces and libraries. These differences require both lower-level extensibility in the common domain model and upper-level skinning to handle the aspects of the domain model that are not common.

  4. There is no "silver bullet" at this point that allows for a single Java codebase without IaaS cloud specialization. There are front runners on the API front (jClouds) and on the domain model front (OpenStack). It should also be noted that if the focus is to work across multiple clouds, RightScale is a leader in this space.

  5. In deciding which cloud portability projects to focus on, it is as important to look at the community adopting a technology as at the technology itself. Open source projects have been started with good intentions, but that doesn't mean they can offer the same level of support of open standards such as OpenStack. Having standards around the portability layer is likely to help the cloud, as many implementations will rally around the same APIs and domain models, reducing the complexity of implementing higher-level solutions (such as NetflixOSS).

  6. It will be interesting to see if IaaS-level abstractions (such as Deltacloud with a REST proxy, or OpenStack) step beyond their current focus to solve these issues. While neither's REST API was intended to provide a pluggable IaaS abstraction layer in the topology needed for this work, the value of common, standards-based REST APIs should not be underestimated.