Review: AWS Storage Gateway



Cloud storage gateways provide the ability to write data directly into a cloud service or storage provider, via either a physical or a virtual onsite appliance.  There are many vendors in the marketplace today and some are making serious money (as reported by Chris Mellor today in The Register).  Amazon Web Services, seen as a leader in Infrastructure as a Service cloud solutions, has offered its Storage Gateway solution for some time.  Recently the feature set was improved to allow a Storage Gateway to be deployed directly in EC2 (Elastic Compute Cloud) rather than on the customer’s site.  As Amazon offers a 60-day trial, I’ve been doing some research into exactly how the AWS Storage Gateway works and where its strengths and weaknesses lie.  The review will form part of a future white paper covering the wider market.  In the meantime, here are my findings so far.
 

The Basics

The AWS Storage Gateway runs as a virtual machine, delivered as a VMware vSphere OVA (Open Virtual Appliance).  In essence the appliance is a customised installation of CentOS 5.3, with an additional component to provide the gateway functions.  Each deployed appliance needs a minimum of 4 vCPUs, 7.5GB of RAM and 75GB of local disk space, excluding cache and data.  There’s very little that can be configured through the console on the appliance, other than networking settings.  Although the appliance is deployed with only one virtual NIC, it’s likely most users will configure at least two, with one used to talk to the outside world and one or more used for internal connectivity.  AWS provide example configurations of the external network using an Internet IP address; however, they also support configuration via a SOCKS proxy, and in my case I used firewall redirection to translate an external IP address to the appliance’s internal interface.  There were some glitches, however.  The appliance seems to have a problem with gateway definitions; my network and route settings didn’t display consistently (I believe this is a known problem).

The Storage Gateway is essentially a block-based device, presenting iSCSI LUNs to internal hosts.  These LUNs are then stored on AWS S3 (Simple Storage Service) as volumes.  There are two basic configurations: gateway-stored volumes and gateway-cached volumes.

  • Gateway-stored – data is stored on the local appliance and “periodically” offloaded into S3.  I haven’t yet found anything that describes what this actually means in terms of how out of date the S3 copy of a local LUN can be.
  • Gateway-cached – data is stored on S3 with only a local cache of data.  Again, I’m not clear on how closely the replication back to S3 follows local writes: whether writes are synchronised immediately or cached and offloaded later.

Both volume types allow for snapshots to be taken either on demand or on a scheduled basis.  The snapshots are available in EC2 as EBS (Elastic Block Store) snapshots and so can be mounted onto EC2 instances for DR or other recovery purposes.  To support either of the two volume types, two LUNs need to be added to the appliance: one for storing persistent LUNs (gateway-stored) and one for storing the temporary cache data.
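
To make the question of how stale the S3 copy can be a little more concrete, here is a toy write-back model in Python.  It is not based on any documented AWS behaviour; the class `ToyGateway`, its fields and the batch size are all hypothetical names invented for this sketch.  It only shows that, if uploads happen in batches, any block still marked dirty is missing from the remote copy at that moment.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGateway:
    """Toy write-back model (hypothetical): writes land locally, uploads happen in batches."""
    local: dict = field(default_factory=dict)   # block number -> data held on the appliance
    remote: dict = field(default_factory=dict)  # block number -> data in the cloud copy
    dirty: set = field(default_factory=set)     # blocks written locally but not yet uploaded

    def write(self, block: int, data: bytes) -> None:
        """Accept a host write: store it locally and mark the block as dirty."""
        self.local[block] = data
        self.dirty.add(block)

    def upload(self, max_blocks: int) -> int:
        """Upload up to max_blocks dirty blocks (lowest block number first); return the count sent."""
        batch = sorted(self.dirty)[:max_blocks]
        for block in batch:
            self.remote[block] = self.local[block]
            self.dirty.discard(block)
        return len(batch)


gw = ToyGateway()
for n in range(5):
    gw.write(n, b"x" * 4096)

sent = gw.upload(max_blocks=3)
print(f"uploaded {sent} blocks, still dirty: {sorted(gw.dirty)}")
```

In this model a snapshot of the remote copy taken straight after the upload would be missing the blocks still listed as dirty, which is exactly the exposure I would like AWS to quantify.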

 

Configuration, Management & Monitoring

All configuration and management tasks are performed from the AWS Web Console, rather than the appliance itself.  This includes the initial activation of the appliance, the creation of volumes, security settings and snapshot functions.  Appliances are associated with a specific AWS region in a similar fashion to EC2.  I guess it’s in keeping with the AWS philosophy of having everything configured in one place via a single console, but for me it doesn’t really work or feel right.  The standard Storage Gateway console provides very little information other than details on each gateway and any configured volumes.  More detailed metrics on the status of a gateway and its volumes have to be obtained using the AWS CloudWatch offering.  This displays graphs of various operational metrics, including bytes transferred in and out and the concurrency of the cache areas.
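
For anyone wanting to pull these figures outside the console, the sketch below builds the parameters for a CloudWatch GetMetricStatistics request in Python.  The namespace `AWS/StorageGateway`, the `GatewayId` and `GatewayName` dimensions and the metric name `CachePercentDirty` are assumptions here and should be checked against the current AWS documentation; the gateway ID and name are placeholders, and `build_metric_query` is a hypothetical helper, not part of any SDK.

```python
from datetime import datetime, timedelta, timezone


def build_metric_query(gateway_id, gateway_name, metric_name, end_time,
                       hours=1, period_seconds=300):
    """Return keyword arguments for a CloudWatch GetMetricStatistics request.

    Namespace and dimension names are assumptions; verify them before use.
    """
    return {
        "Namespace": "AWS/StorageGateway",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "GatewayId", "Value": gateway_id},
            {"Name": "GatewayName", "Value": gateway_name},
        ],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": period_seconds,      # one datapoint per five minutes
        "Statistics": ["Average"],
    }


# Placeholder gateway identifiers, for illustration only.
query = build_metric_query(
    gateway_id="sgw-EXAMPLE",
    gateway_name="office-gateway",
    metric_name="CachePercentDirty",
    end_time=datetime(2013, 1, 31, 12, 0, tzinfo=timezone.utc),
)
print(query["Namespace"], query["MetricName"], query["StartTime"].isoformat())
```

With the boto3 SDK installed and credentials configured, the dictionary could be passed as `boto3.client("cloudwatch").get_metric_statistics(**query)`; that call is left out so the sketch runs without network access.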

 

Pricing

There are separate costs covering the use of gateways, data stored and data transferred between the local appliance and AWS.  Current pricing details can be found on the Storage Gateway pricing page.  At $125/month (correct as of January 2013) per gateway, exclusive of data charges, the solution appears expensive.  Pricing for disk space appears to be based on LUN size, rather than stored data.  If this is correct, then the solution costs could become untenable, as customers would be charged for storage capacity they have yet to use.
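
To show why the billing basis matters, here is a rough worked example in Python.  Only the $125 gateway fee comes from the figures above; the per-GB rate and the volume sizes are hypothetical numbers chosen for illustration, and data transfer charges are ignored.

```python
GATEWAY_FEE_USD = 125.00         # per gateway per month (January 2013 figure quoted above)
STORAGE_RATE_USD_PER_GB = 0.10   # HYPOTHETICAL monthly rate per GB, for illustration only


def monthly_cost(billed_gb, gateways=1, rate_per_gb=STORAGE_RATE_USD_PER_GB):
    """Gateway fee plus storage charge for one month; transfer costs are not modelled."""
    return round(gateways * GATEWAY_FEE_USD + billed_gb * rate_per_gb, 2)


provisioned_gb = 2048   # size of the LUN as configured (hypothetical)
consumed_gb = 300       # data actually written so far (hypothetical)

cost_if_provisioned = monthly_cost(provisioned_gb)
cost_if_consumed = monthly_cost(consumed_gb)

print(f"billed on LUN size:    ${cost_if_provisioned:.2f}")
print(f"billed on stored data: ${cost_if_consumed:.2f}")
print(f"difference:            ${cost_if_provisioned - cost_if_consumed:.2f}")
```

With a large, mostly empty LUN the gap between the two readings is bigger than the gateway fee itself, which is why it matters which one applies.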

Issues

  1. The minimum processor and memory requirements of the appliance seem quite high; perhaps they could have been scaled down with more recommendations for upward scaling depending on workload.
  2. There’s not enough transparency around data consistency between local and remote copies.  It’s not easy to see how much data has yet to be written to S3, or how much data is being stored on S3.
  3. There’s no consistency of snapshotting.  All snapshot copies will seem like “crash” copies of data.  Integration with VSS on Windows, for example, would be useful.
  4. The lag before the AWS console pulled back the latest configuration information from a gateway appliance wasn’t great.  I can see scenarios where this could be a real problem; for instance, if there was a network outage, it would be impossible to see the status of an appliance.
  5. There’s no granularity of access control (e.g. delegating permissions for a specific gateway to a separate user) and iSCSI security is simply mutual CHAP, which isn’t scalable to implement or manage.
  6. Reporting is particularly poor.  CloudWatch provides only basic metrics, presented in simplistic formats that wouldn’t scale with large numbers of gateways.
  7. Pricing seems to be based on LUNs configured rather than consumed (no thin provisioning), making costs prohibitive or requiring careful monitoring of space usage; unfortunately monitoring isn’t up to the task.

 

The Architect’s View

Amazon are normally market leading with IaaS features but in this instance seem to have dropped the ball.  There are glaring feature omissions and design issues, which cause problems from configuring the appliance onwards.  Although both the Storage Gateway and CloudWatch have APIs enabling more elegant interfaces to be developed, I would imagine many customers would expect the out-of-the-box offering to be either significantly cheaper or to offer a much better interface.  The cloud storage gateway market is competitive with the likes of Nasuni offering much more mature solutions.  We’ve also seen other vendors fail spectacularly before – Nirvanix’s CloudNAS product (reviewed by me in 2009) had a similar “science project” feel.  Amazon need to seriously review their plans for the Storage Gateway if they want to compete on even a level playing field.

 

Comments are always welcome; please indicate if you work for a vendor as it’s only fair.  If you have any related links of interest, please feel free to add them as a comment for consideration.


About Chris M Evans

  • Attila Sukosd

    Can you actually see a potential use-case for this?

    To me, the limitations you’ve listed above, plus the fact that storing in the cloud is also heavy on the external infrastructure, make it seem prohibitively expensive, and the performance, reliability and data integrity could also be questioned…

  • Atomic Ice

    Based on https://forums.aws.amazon.com/thread.jspa?threadID=118180&tstart=0, the pricing overview in the article is incorrect. It charges for stored data rather than provisioned data.
