AWS vs. OpenStack

Let us compare the popularity of two top cloud-computing platforms: 1) the well-known Amazon Web Services, which companies typically leverage for the speed and convenience of Amazon's global, hosted, cloud-computing infrastructure, and 2) the increasingly versatile OpenStack, which allows organizations to roll their own cloud-computing services on standard hardware.

As you can see in the Google Trends graph below, AWS is far ahead in popularity. Keep in mind its head start, however. And let us not discount the 500+ companies now contributing to OpenStack.

Interest in the Amazon Web Services (AWS) Application Programming Interface (API) has grown steadily since its launch in 2006. March 2008 shows the first large jump, and interest surges again in May of this year (2015).

OpenStack has similarly grown in popularity since its launch in 2010, with a notable jump in the spring of 2013. Some of the more notable companies contributing to OpenStack include: AT&T, AMD, Canonical, Cisco, Citrix, Comcast, Cray, Dell, Dreamhost, EMC, Ericsson, Fujitsu, GoDaddy, Google, HP, Hitachi, Huawei, IBM, Intel, Juniper Networks, Mirantis, Oracle, Red Hat, SUSE Linux, VMware, and Yahoo!.

While OpenStack has a lot of diverse contributors, AWS is the fifth largest web hosting provider globally.

Worldwide Market Share by Number of Clients in 2015:

  1. GoDaddy - 4.26%
  2. BlueHost - 2.56%
  3. HostGator - 2.15%
  4. OVH.com - 1.91%
  5. Amazon Web Services - 1.81%
  6. Rackspace - 1.59%
  7. 1&1 - 1.54%
  8. Hetzner - 1.29%
  9. SoftLayer - 1.19%
  10. DreamHost - 1.01%

source: http://hostadvice.com/marketshare/ (2015)
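To put these figures in perspective, here is a small sketch that works only with the ten numbers listed above. The function names are made up for illustration; nothing here comes from HostAdvice itself.

```python
# Market-share figures copied from the list above (percent of clients, 2015).
SHARES = {
    "GoDaddy": 4.26,
    "BlueHost": 2.56,
    "HostGator": 2.15,
    "OVH.com": 1.91,
    "Amazon Web Services": 1.81,
    "Rackspace": 1.59,
    "1&1": 1.54,
    "Hetzner": 1.29,
    "SoftLayer": 1.19,
    "DreamHost": 1.01,
}


def rank_of(provider, shares=SHARES):
    """Return the 1-based rank of a provider by share, largest first."""
    ordered = sorted(shares, key=shares.get, reverse=True)
    return ordered.index(provider) + 1


def combined_share(shares=SHARES):
    """Sum the listed shares, rounded to two decimals."""
    return round(sum(shares.values()), 2)


def gap_to_leader(provider, shares=SHARES):
    """Percentage points between the leader and the given provider."""
    return round(max(shares.values()) - shares[provider], 2)


aws_rank = rank_of("Amazon Web Services")
top_ten_total = combined_share()
aws_gap = gap_to_leader("Amazon Web Services")
```

Taken together, the ten providers listed account for under a fifth of clients, so the hosting market in this survey is quite fragmented.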

As an open-source cloud-computing platform, OpenStack obviously can't compete on these terms with a multi-billion dollar cloud-computing and software-as-a-service company. There are a number of selling points to consider, however:

  • Walmart uses OpenStack to coordinate 100,000+ cores, which provided 100% uptime during Black Friday last year
  • Developers gave over 300 talks at the OpenStack Summit in Tokyo this October
  • Debian, Canonical, Red Hat, and SUSE Linux all support OpenStack and are active contributors
  • OpenStack has enabled companies like Disney, Bloomberg, and Wells Fargo to manage their own clouds at a fraction of the cost of proprietary solutions like AWS
  • OpenStack is the only solution that supports mixed hypervisor and bare metal server environments

I think these points lend themselves to the conclusion that adoption and further development in OpenStack are likely to keep pace.

As for the features of AWS and OpenStack, let's do a little comparison:

Here is a subset of AWS services:

In the Compute realm we have...
  • Amazon Elastic Compute Cloud (EC2) scalable virtual private servers using Xen
  • Amazon Elastic MapReduce (EMR) Hadoop-based big data analytics
In Networking we have...
  • Amazon Route 53 scalable DNS
  • Amazon Virtual Private Cloud (VPC) isolated EC2 instances with the ability to extend corporate networks via VPN
  • Amazon Elastic Load Balancing (ELB)
In Content Delivery we have...
  • Amazon CloudFront CDN
In Storage we have...
  • Amazon Simple Storage Service (S3)
  • Amazon Glacier low-cost, long-term storage for data archival
  • Amazon Elastic File System (EFS) to accompany EC2
In the Database realm we have...
  • Amazon DynamoDB low-latency NoSQL SSD-backed databases
  • Amazon Relational Database Service (RDS) with MySQL, Oracle, SQL Server, and PostgreSQL support
  • Amazon SimpleDB distributed database with EC2 and S3 interoperability, written in Erlang
In the Deployment realm we have...
  • AWS Elastic Beanstalk for quick deployment and cloud app management
  • AWS OpsWorks EC2 configuration services via Chef, which we discussed previously
In Management we have...
  • AWS Identity and Access Management (IAM) to authenticate into the various services
  • AWS Directory Service for tying into an on-premises Microsoft Active Directory or for setting up a new stand-alone AWS directory
  • Amazon CloudWatch for application and resource monitoring
  • AWS CloudHSM Hardware Security Module for data security and for meeting regulatory compliance requirements
  • AWS Key Management Service (KMS) for creating and managing encryption keys
In the Application Services realm we have...
  • Amazon DevPay (beta) for billing and account management
  • Amazon Elastic Transcoder (ETS) for mobile video transcoding from S3
  • Amazon Simple Email Service (SES) for sending bulk and transactional email
  • Amazon Simple Notification Service (SNS) multi-protocol application "push" notifications
  • Amazon Cognito secure application-user data management and synchronization tool
In Analytics we have...

Here are the main components of the modular OpenStack architecture:

Compute (Nova)
  • An Infrastructure as a Service (IaaS) system
  • Management and automation of pools of computer resources
  • Bare metal and high-performance computing (HPC) configurations
  • KVM, VMware, and Xen hypervisor virtualization
  • Hyper-V virtualization and LXC containerization
  • Python-based with various external libraries: Eventlet for concurrent programming, Kombu for AMQP communication, SQLAlchemy for database access, etc.
  • Designed to scale horizontally on standard hardware with no proprietary hardware or software requirements
  • Interoperable with legacy systems
Image Service (Glance)
  • OpenStack Image Service for discovery, registration, and delivery of services for disk and server images
  • Template-building from stored images
  • Storage and cataloging of unlimited backups
  • REST interface for querying disk image information
  • Streaming of images to servers
  • VMware integration, with vMotion Dynamic Resource Scheduling (DRS) and live migration of running virtual machines
  • All OpenStack OS images built on virtual machines
  • Maintenance of image metadata
  • Creation, deletion, sharing, and duplication of images
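To give a feel for the REST interface mentioned above, here is a minimal sketch that only builds the URL and headers for an image-listing request against the Image Service v2 API. It does not send anything over the network. The hostname `controller` and the token value are placeholders, and the helper names are my own.

```python
from urllib.parse import urlencode


def image_list_url(endpoint, **filters):
    """Build a GET URL for listing images, with optional query filters."""
    base = endpoint.rstrip("/") + "/v2/images"
    # Sort the filters so the resulting URL is deterministic.
    query = urlencode(sorted(filters.items()))
    return base + ("?" + query if query else "")


def auth_headers(token):
    """Headers for an authenticated request; the token comes from Keystone."""
    return {"X-Auth-Token": token, "Accept": "application/json"}


# "controller" is a placeholder hostname; 9292 is the usual Glance port.
url = image_list_url("http://controller:9292", visibility="public", limit=5)
headers = auth_headers("PLACEHOLDER-TOKEN")
```

Passing `url` and `headers` to any HTTP client should return a JSON document describing the matching images, assuming a reachable Glance endpoint and a valid token.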
Object Storage (Swift)
  • Scalable redundant storage system
  • Automatic replication of content from failed disks to other active nodes
  • Suitable for inexpensive commodity hard drives and servers
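The replication idea can be illustrated with a toy model. To be clear, this is not how Swift is implemented (Swift uses a hash ring to place data); it is just a sketch, assuming three replicas per object and a handful of hypothetical node names, of what "re-replicate after a failure" means.

```python
REPLICAS = 3  # assumed replica count for this toy model


def place(objects, nodes, replicas=REPLICAS):
    """Assign each object to `replicas` distinct nodes, round-robin style.

    Requires replicas <= len(nodes).
    """
    placement = {}
    for i, obj in enumerate(objects):
        placement[obj] = {nodes[(i + k) % len(nodes)] for k in range(replicas)}
    return placement


def handle_failure(placement, failed, nodes, replicas=REPLICAS):
    """Drop the failed node and re-replicate its objects onto healthy nodes."""
    healthy = [n for n in nodes if n != failed]
    for holders in placement.values():
        holders.discard(failed)
        for candidate in healthy:
            if len(holders) >= replicas:
                break
            # In a real system this would copy data from a surviving replica.
            holders.add(candidate)
    return placement


nodes = ["n1", "n2", "n3", "n4"]
placement = place(["a", "b", "c", "d"], nodes)
after = handle_failure(placement, "n2", nodes)
```

After the simulated failure of `n2`, every object is back to three copies, all on healthy nodes.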
Dashboard (Horizon)
  • GUI for access, provision, and automation of cloud-based resources for administrators and users
  • Third-party billing, monitoring, management tool integration
  • Customizable (brandable) dashboard
  • EC2 compatibility
Identity Service (Keystone)
  • Unified authentication system across the cloud OS
  • Integration with existing backend directory services such as LDAP
  • Various authentication methods: username/password, token-based systems, and AWS-style logins
  • Queryable, single registry of all deployed services, with programmatic determination of access for users and third-party tools
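As an illustration of the username/password method, the sketch below builds the JSON body that the Identity v3 API expects for a password authentication request scoped to a project. It only constructs the payload; the user, password, and project names are placeholders.

```python
import json


def password_auth_request(username, password, project, domain="Default"):
    """Build the JSON body for a Keystone v3 password authentication request."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "domain": {"name": domain},
                        "password": password,
                    }
                },
            },
            # Scoping to a project is what makes the service catalog
            # appear in the response.
            "scope": {
                "project": {"name": project, "domain": {"name": domain}}
            },
        }
    }


# Placeholder credentials, for illustration only.
request = password_auth_request("demo", "secret", "demo-project")
body = json.dumps(request)
```

This body would be POSTed to `/v3/auth/tokens` on the Keystone endpoint. The token comes back in the `X-Subject-Token` response header, and the response body carries the catalog of deployed services.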
Networking (Neutron)
  • Manual and automatic management of networks and IP addresses
  • Distinct networking models for different applications and user groups
  • Flat networks or VLANs for separating servers and traffic
  • Static IP addresses, DHCP
  • Floating IP addresses for dynamic rerouting to resources on the network
  • Software-defined networking (SDN), such as OpenFlow, for multi-tenancy and scalability
  • Management of intrusion detection systems (IDS), load balancing, firewalls, VPNs, etc.
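Floating IP addresses are easier to grasp with a small model. The class below is a toy, not Neutron code: it assumes a pool of public addresses and a simple mapping from each floating address to an instance's fixed private address. The class name and address ranges are invented for the example.

```python
import ipaddress


class FloatingIPPool:
    """Toy pool that maps public (floating) addresses to fixed private ones."""

    def __init__(self, cidr):
        self.free = list(ipaddress.ip_network(cidr).hosts())
        self.bindings = {}  # floating IP -> fixed IP

    def allocate(self):
        """Reserve the next free floating address."""
        return self.free.pop(0)

    def associate(self, floating_ip, fixed_ip):
        """Point a floating address at an instance's fixed address."""
        self.bindings[floating_ip] = ipaddress.ip_address(fixed_ip)

    def resolve(self, floating_ip):
        """Return the fixed address currently behind a floating address."""
        return self.bindings.get(floating_ip)


pool = FloatingIPPool("203.0.113.0/29")  # documentation-only address range
fip = pool.allocate()
pool.associate(fip, "10.0.0.5")  # traffic goes to the first instance
pool.associate(fip, "10.0.0.9")  # reroute to a replacement instance
```

The public address stays the same while the resource behind it changes, which is the "dynamic rerouting" the bullet above refers to.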
Block Storage (Cinder)
  • Persistent block-level storage for databases and expandable file systems
  • Block storage integration into OpenStack Compute and Dashboard for allocation of storage
  • Various storage platforms supported: Ceph, CloudByte, Coraid, EMC (ScaleIO, VMAX and VNX), GlusterFS, Hitachi Data Systems, IBM Storage (Storwize family, SAN Volume Controller, XIV Storage System, and GPFS), Linux LIO, NetApp, Nexenta, Scality, SolidFire, HP (StoreVirtual and 3PAR StoreServ families) and Pure Storage
  • Snapshot management for backing up data stored on block storage volumes
  • Restoring of snapshots, use of snapshots as templates for new block storage volumes
Orchestration (Heat)
  • Orchestration of multiple composite cloud applications using templates
  • OpenStack-native REST API
  • CloudFormation-compatible Query API
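To show what a template looks like, here is a sketch that assembles a minimal Heat Orchestration Template (HOT) describing a single Nova server. HOT templates are normally written in YAML; the example emits JSON, which should parse as YAML as well. The image name `cirros`, the flavor `m1.tiny`, and the resource name `demo_server` are placeholders, not values taken from any real deployment.

```python
import json


def single_server_template(image, flavor, name="demo_server"):
    """Return a minimal HOT template (as a dict) with one Nova server."""
    return {
        "heat_template_version": "2015-04-30",
        "description": "One Nova server, for illustration only",
        "resources": {
            name: {
                "type": "OS::Nova::Server",
                "properties": {"image": image, "flavor": flavor},
            }
        },
    }


# Placeholder image and flavor names.
template = single_server_template("cirros", "m1.tiny")
template_text = json.dumps(template, indent=2)
```

A real application would add more resources (networks, volumes, and so on) to the `resources` section, and Heat would create them together as one stack.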
Telemetry (Ceilometer)
  • Single point of contact for billing systems
  • Traceable, auditable delivery of counters for billing
  • Counters extensible to new projects
  • Independent data collection
Database (Trove)
  • Database-as-a-service (DaaS) provisioning of relational database engines
  • DaaS provisioning of non-relational database engines
Elastic Map Reduce (Sahara)
  • Hadoop cluster provisioning
  • Setting of parameters based on: Hadoop version, cluster topology, node hardware details, etc.
  • Cluster deployment in minutes
  • Scaling of already-provisioned clusters by adding and removing worker nodes on demand
Bare Metal Provisioning (Ironic)
  • Provisioning of bare metal machines (as opposed to virtual machines)
  • Bare-metal hypervisor API
  • Plugins for interacting with bare-metal hypervisors
  • PXE and IPMI simultaneous provisioning, turning machines on and off as needed
  • Extensible with vendor-specific plugins for additional functionality
Multiple Tenant Cloud Messaging (Zaqar)
  • Multi-tenant cloud messaging service for Web developers
  • Some components inspired by Amazon's SQS, with additional semantics for event broadcasting
  • Fully RESTful API that developers can use to send messages between the various components of their SaaS and mobile applications
  • Surfacing of events to end users and guest agents that run in the "over-cloud" layer
Shared File System Service (Manila)
  • Vendor-agnostic share management API
  • Creating, deleting, and granting or denying access to a share
  • Support for commercial storage appliances from: EMC, NetApp, HP, IBM, Oracle, Quobyte, and Hitachi Data Systems
  • Support for Red Hat's GlusterFS filesystem
DNSaaS (Designate)
  • DNS as a Service
Security API (Barbican)
  • REST API for secure storage, provisioning and management of secrets
  • Built for use in all environments, including large ephemeral clouds
AWS Compatibility
  • Interoperability with Amazon EC2 and Amazon S3
  • Minimal effort to port AWS client applications to OpenStack

Contrasting AWS and OpenStack:

While OpenStack clearly lacks some of the preconfigured services Amazon has, such as its Simple Email Service (SES), and its pre-built Hardware Security Modules (HSM) for regulatory compliance and data security, large and small enterprises alike are not necessarily at a disadvantage in having to configure these components themselves.

The Case for Using OpenStack

Being able to leverage existing, standard hardware to build a private cloud with OpenStack's open-source framework lends itself to both corporate profitability and increased resiliency. Existing hardware is a sunk cost; the API is free. Building an in-house solution is likely to boost developer and administrator confidence as well as company morale. Consider what it could mean to your team to have built their own private cloud and to be able to maintain it themselves.

The Case for a Hybrid Solution (e.g. OpenStack + AWS)

Should you need some additional services that OpenStack doesn't include, let's say HSM for confidential data, there are myriad options open to you, not the least of which is Amazon. A hybrid solution could work and save you money. Many of the companies contributing to OpenStack offer Software/Database/Infrastructure services, such as HP and its Helion cloud service, which can easily tie into your own private cloud.

The Case for Using Amazon Web Services

Ultimately, if you're a startup with a small IT staff, the convenience of Amazon's many offerings, along with its flexible customer support options, may prove to be the most prudent and effective choice. With more players in the space all the time, Amazon will likely double down on its efforts in an attempt to continue to provide some of the best hosting, backup, data management, and analytics services available.