Making Efficient Use of Cloud Computing Resources - Part I
Written by Eric Novikoff   

Introduction

In response to requests from customers, I'm going to start an occasional series of blog articles on making the most efficient use of cloud computing resources, such as those provided by ENKI's Computing Utility.  I will use examples from our customer base, as well as from the experience the staff at ENKI have gained working at start-up and enterprise companies.

Ultimately, making efficient use of resources in the Cloud, where computing resources are utility-billed, is very similar to dotting your i's and crossing your t's with respect to efficiency in any computing environment: you want to use your allocated resources as effectively as possible.  But there are also some interesting new problems and opportunities that come with being able to resize or re-allocate your computing resources on demand.  Today I am going to write about bursty loads, such as those involved with media transcoding, image processing, and database-intensive tasks.

Many of the applications I created in the past, and those of customers who come to ENKI, are developed and possibly deployed on fixed hardware such as a PC under their desk, or a physical server in a data center.  In this case, the hardware is typically dedicated to the application and runs all parts (or tiers) of the application.  If the application is too slow, the choices are to improve the efficiency of the code (which Dave wrote about in Data Center Power Consumption, Part III: The software), buy a bigger server, or split the application into parts, each running on a different server.  With ENKI's Computing Utility service, you can increase the size of your server on demand, but it won't increase the efficiency of your application, only your monthly bill!  Splitting the application is where you can get some big efficiency gains from utility computing.

Before we dig into the numbers, please remember that this discussion involves a tradeoff between the cost of the computing service and the user experience.  If you make the wrong choices, you may no longer have a business to optimize!

A bursty batch processing example 

Let's take a look at a sample application which transcodes stored video files into a customer-selected format for download.  We will start by assuming that a standard server can process 1000 videos an hour.  If you deploy this server to transcode a video when a user requests it, the user would have to wait (60*60/1000) = 3.6 seconds to be able to start downloading it, which is not a problem.  Suppose we deploy the application this way and observe how the users use it.  For most of the day, requests come in once a minute on average, which means that our server is busy for only 3.6 seconds out of every 60, or 6% utilization.  However, in the evenings, there are peaks where users submit a request about once every 2 seconds for periods of up to 15 minutes.  During such a peak, each request takes 3.6 seconds to process but the next one arrives after only 2 seconds, so every request adds 1.6 seconds to the backlog.  After 15 minutes, the last requester has to wait for a time equal to the number of requests submitted during the peak times the backlog added by each one: 15*30*(3.6-2) = 720 seconds, or 12 minutes.
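The arithmetic in this example can be sketched in a few lines of Python.  This is only a rough model under the same simplifying assumptions as the text (evenly spaced requests, a queue that is empty when the peak starts); the function names are illustrative and not part of any ENKI tooling.

```python
def service_time_s(videos_per_hour: float) -> float:
    """Seconds needed to transcode one video at the given throughput."""
    return 3600.0 / videos_per_hour


def utilization(service_s: float, interarrival_s: float) -> float:
    """Fraction of time the server is busy when requests arrive at a steady rate."""
    return service_s / interarrival_s


def peak_wait_s(service_s: float, interarrival_s: float, burst_s: float) -> float:
    """Approximate wait of the last request submitted during a burst.

    Assumes the queue is empty when the burst starts. Each arrival then adds
    (service_s - interarrival_s) seconds of backlog; if the server keeps up,
    there is no backlog at all.
    """
    requests_in_burst = burst_s / interarrival_s
    return max(0.0, requests_in_burst * (service_s - interarrival_s))


if __name__ == "__main__":
    per_video = service_time_s(1000)
    print(f"service time per video: {per_video:.1f} s")
    print(f"daytime utilization:    {utilization(per_video, 60):.0%}")
    print(f"wait after 15 min peak: {peak_wait_s(per_video, 2, 15 * 60) / 60:.0f} min")
```

Plugging in your own measured throughput and request rates gives a quick first estimate of how long a peak will leave users waiting.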

Whether this 12-minute delay is a problem for your business or not depends on how you set your users' expectations.  If they have to sit and watch a spinning film reel, they'll go away and not come back.  If you tell them you'll send them an email when their processing job is ready to download, they will most likely accept that.  The very successful (until the RIAA killed it) music download service, AllofMP3, allowed you to transcode your music this way, and their customers were delighted.  Similarly, NetSuite submits long-running reports for background execution, notifying you when they are complete.

So, how can we change the deployment of this application to save you money during most of the day, when your server is only being used at 6%?  The trick is to move the transcoding to a separate virtual machine instance (called an "Appliance" at ENKI) which is dedicated to processing only the transcoding tasks, and to implement a processing queue that lets the instance manage its backlog.  You can then set the amount of CPU allocated to the instance to control your maximum queue length at peak periods.  A side benefit of doing this is that you can reduce the resources allocated to the Appliance that runs the remaining tasks in your application (UI, downloading, etc.) because it no longer has to process transcoding jobs quickly.
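To make the idea of a processing queue concrete, here is a minimal in-process sketch in Python using the standard `queue` and `threading` modules.  It is an illustration under stated assumptions, not how ENKI's Appliances are implemented: `transcode` is a hypothetical placeholder for the real conversion, and in a real deployment the queue would live in a database or message broker so that the web tier and the transcoding instance can run on separate machines.

```python
import queue
import threading


def transcode(video_name: str) -> str:
    """Hypothetical placeholder for the real (slow) transcoding step."""
    return f"{video_name}.converted"


def worker(jobs: queue.Queue, results: list) -> None:
    """Pull jobs off the queue one at a time until the None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: no more work
            jobs.task_done()
            break
        results.append(transcode(job))
        jobs.task_done()


def run_batch(video_names: list) -> list:
    """Queue every request, let a single worker drain the backlog, return results."""
    jobs: queue.Queue = queue.Queue()
    results: list = []
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    for name in video_names:
        jobs.put(name)           # the web tier returns to the user immediately
    jobs.put(None)
    thread.join()
    return results


if __name__ == "__main__":
    print(run_batch(["holiday.avi", "demo.mov"]))
```

Because requests are only enqueued, the front end stays responsive no matter how slowly the worker drains the backlog, and the current queue size (`jobs.qsize()` in this sketch) gives a rough measure of how far behind the worker is.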

For example, if you chose to set your CPU allocation to 12% of one standard CPU for your queue processing Appliance, your processing time for one video would increase to (1/0.12)*3.6 or 30 seconds.  In this case, using the same math as before, users would have to wait a maximum of 15*30*(30-2) = 12,600 seconds, or 210 minutes, at a peak usage time.  However, for most of the day their wait would be about 30 seconds.  In return for this increased delay, you save 88% on your resource charges.  You could even look at the queue length and, if users would have to wait longer than 30 seconds, offer them the option of waiting or getting an email when their job is done.
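The same tradeoff can be explored for other CPU allocations with a short Python sketch.  It assumes, as the example does, that processing speed and resource charges both scale linearly with the CPU fraction; the constants come from the example, while `tradeoff` and `should_offer_email` are hypothetical helper names introduced here for illustration.

```python
FULL_SPEED_SERVICE_S = 3.6    # one video on a full standard CPU (from the example)
BURST_INTERARRIVAL_S = 2.0    # one request every 2 seconds at peak
BURST_LENGTH_S = 15 * 60      # peaks last up to 15 minutes


def tradeoff(cpu_fraction: float) -> dict:
    """Estimate per-video time, worst-case peak wait, and savings for an allocation.

    Assumes speed and cost scale linearly with cpu_fraction and that the queue
    is empty when the peak begins.
    """
    service_s = FULL_SPEED_SERVICE_S / cpu_fraction
    burst_requests = BURST_LENGTH_S / BURST_INTERARRIVAL_S
    peak_wait_s = max(0.0, burst_requests * (service_s - BURST_INTERARRIVAL_S))
    return {
        "service_s": service_s,
        "peak_wait_min": peak_wait_s / 60.0,
        "savings": 1.0 - cpu_fraction,
    }


def should_offer_email(queue_length: int, service_s: float,
                       threshold_s: float = 30.0) -> bool:
    """True if the jobs ahead plus the user's own job would exceed the threshold."""
    estimated_wait_s = (queue_length + 1) * service_s
    return estimated_wait_s > threshold_s


if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.25, 0.12):
        t = tradeoff(fraction)
        print(f"{fraction:>4.0%} CPU: {t['service_s']:5.1f} s/video, "
              f"peak wait {t['peak_wait_min']:6.1f} min, savings {t['savings']:.0%}")
```

Running the loop for a handful of allocations shows how quickly the peak wait grows as the CPU share shrinks, which is the number to weigh against the savings.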

Making the cost vs. speed tradeoff 

This is a tradeoff that you will have to make based on the best information you have about your business.  You need to know what kind of user experience is required for your business to succeed before you can analyze performance numbers.  You may want to conduct a focus group or user feedback study.  There may also be automatic ways to check whether performance is unacceptable, such as correlating abandoned sessions with changes in resource allocation.  If you are an internet startup company or an internal IT department rolling out a new application, you only have one chance to make a first impression, so it's probably wise to start with a generous allocation and reduce it over time while checking back with your users to make sure they aren't too frustrated.  However, by setting the users' expectations correctly, such as with the email-on-completion method described above, you can take a lot of pressure off your application.
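As one possible starting point for the automatic check mentioned above, the following Python sketch compares the share of abandoned sessions before and after a change in resource allocation.  The `Session` record and its fields are hypothetical; how you detect an abandoned session depends entirely on your application, and a real analysis would also need to account for time-of-day and other effects.

```python
from dataclasses import dataclass


@dataclass
class Session:
    started_at: float   # seconds since the epoch (hypothetical field)
    abandoned: bool     # True if the user left before the job finished


def abandonment_rate(sessions: list) -> float:
    """Fraction of sessions that were abandoned (0.0 for an empty list)."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if s.abandoned) / len(sessions)


def compare_around_change(sessions: list, change_time: float) -> tuple:
    """Return (rate_before, rate_after) relative to an allocation change."""
    before = [s for s in sessions if s.started_at < change_time]
    after = [s for s in sessions if s.started_at >= change_time]
    return abandonment_rate(before), abandonment_rate(after)


if __name__ == "__main__":
    log = [
        Session(100.0, False), Session(200.0, False),
        Session(300.0, True), Session(400.0, False),
        Session(600.0, True), Session(700.0, True),
        Session(800.0, False), Session(900.0, True),
    ]
    before, after = compare_around_change(log, change_time=500.0)
    print(f"abandoned before change: {before:.0%}, after change: {after:.0%}")
```

A clear jump in the second number after you reduce an allocation is a hint that you have cut too far.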

In the next article, I'll write about another technique for making efficient use of utility-billed computing, which is varying the breadth (the parallelism) of your application as deployed in the Cloud in response to increasing load. 
