on Apr 13th, 2009Limitations and Challenges in Cloud Computing for Applications
I was supposed to be involved in a discussion about cloud computing at Cloudcamp Bangalore, but due to other commitments, I could not attend the event. I had a small writeup about the limitations and challenges in Application clouds. Here is the full text of it.
Cloud Computing is a way of providing dynamically scalable and available resources such as computation, storage etc as a service to users who can use it to deploy their applications and data. Cloud Computing can handle data in both the public and the private domain. But this seemingly harmless way of thinking about building applications has its own set of issues.I am primarily referring to application cloud providers, the kind where you deploy your applications. Not storage and service clouds. Google AppEngine would be a good example for the cloud that I am describing. I note some of them here :
From the Users perspective:
- New unstructured and non standard paradigm of programming: Each cloud has its own supported programming language and syntax requirements for programming, though most of these clouds expose the typical hashtable based cache and datastore interfaces. There is an urgent need for standardization of interfaces and methods of programming them. One of the reasons why shared hosting environments work great is because , as a programmer, I know that I can move my PHP/PERL code to another server and it will work without too much of a fuss. Moving from one of the dozen odd cloud providers to another requires considerable developmental efforts, not to forget time (for businesses, this could spell doom). A look back at history shows languages like SQL, C etc being standardized to stop exactly this sort of undesirable proliferation.
- Restrictions on the programming model : For cloud based applications to be highly available, they must be easy to dynamically mirror on multiple machines. Once these applications are mirrored, they can be served on demand by load balancing servers which makes them highly available and the user doesn’t face delays in being serviced. This is an old trick used by busy websites from the early days of web publishing but these solutions were custom built for websites. So, extending this concept to cloud based platforms, servicing thousands of applications, mandates the platform providers to automate this task of replication and mirroring. This job is easier said than done. This process can be made seamless when the program stores as little state information as possible. By state, I mean transactional variables, static variables, variables in the context of the entire application etc. These things are almost a given in traditional programming environments but are very hard to come by in cloud based environments. The unnatural way of dealing with this situation is using the datastore or the cache to store state of an application. There are a lot of restrictions like lack of privileges to install third party libraries, no access to file system to write files etc ( which forces you to use the datastore and pay for it)
- A good local debugging experience: A good local development environment, debugging experience is a must for programming on the cloud. Most cloud providers do not provide good local development environments. There is also a lack of good IDE’s that can help with programming and debugging programs written for the cloud. The providers that do provide a local debug experience, do not simulate real cloud like conditions. Both from my personal experience and from conversations with other developers, I have come to realize that most people face problems when moving code from their local development servers to the actual cloud. This is only due to inconsistencies in the behavior of the local dev env compared to the cloud.
- Appropriate metrics and documentation of programming best practices : On a cloud, since a user pays for almost every CPU cycle, appropriate metrics on usage of processing time and memory must be presented to the users. Typically a profile of the application with function names and their corresponding time taken, memory used, processing cycles used will definitely help the developer tune his/her code to optimize on usage of processing power. The best solution for this is for cloud providers to abstract common code patterns into optimal libraries so that the users can be assured that they are running the most optimal code for a certain operation. An example of this is Apache PIG, which gives a scripting like interface to Apache Hadoop’s HDFS for data analysis. Also, Most cloud providers do not provide enough statistics and also profiling capabilities.
From the providers perspective:
Here I look at challenges that cloud providers have to face:
- Ensuring availability of the cloud: This proves to be crucial as Clouds host critical business applications, for whom, downtime would mean monetary losses. Effective monitoring and load balancing solutions are to be built. Most clouds employ virtualization technology to get the most out of any resource. In such cases, tools should be written to figure out a resource hog early and move the application to a more powerful grid or a machine, so that the other users get their share of the cloud without delays.
- Ensuring Consistency: Both the data and code is replicated on the cloud and maintaining consistency of data is extremely crucial. This is the reason why most transactional updates are not allowed on the cloud. Example: sequence objects, which are almost a given in traditional databases are not provided, probably because maintaining state across machines for such statements is non trivial. Problems like distributed updates, locking, partitioning, sharding etc arise when dealing with data. Such constructs are to be provided to the users as most of it is given in the non cloud deployment space.
Most datastores provided by cloud vendors (except the ones that provide cloud based database services) do not support relational models. Which means all object relations have to be programmatically established. This could always lead to bad code, unnecessary joins, cascading problems and tons of other problems that developers faced before working with relational datastores. - Program verification : One of the biggest worries about deploying applications on the cloud is the correctness of the program in execution. Erroneous conditions, like infinite loops, can not only put the machine at the risk of being overloaded and unavailable, but also cost the user a significant amount of money. Tools like static analysis should be used to analyze code uploaded on the cloud and it should be checked for infinite loops, possible race conditions, null references, unreachable code etc. The code uploaded should also be optimized or suggestions should be provided to the users about how they could optimize code to best utilize the available resources.
Conclusion : The cloud should become a complete nonrestrictive platform for applications. There should be no restrictions on the constructs, functionality and privileges on the cloud. Also, it should be dead simple to move everyday applications onto the cloud without too much of rework. This could mean writing migration utilities, import/export options and other artifacts that make the transition to a cloud much easier. This will prove essential as most live applications, at least currently, do not run on a cloud and helping them migrate easily will mean more revenue and adoption.


Limitations and Challenges in Cloud Computing for Applications…
I was supposed to be involved in a discussion about cloud computing at Cloudcamp Bangalore, but due to other commitments, I could not attend the event. I had a small writeup about the limitations and challenges in Application clouds. Here is the full t…
New Post : Limitations and Challenges in application clouds http://is.gd/s9QQ
Thanks for the info. Telstra – Australia’s biggest telco has just announced (on 17th August) a $500m investment into cloud computing which is pretty huge.
Seems you are confused about cloud infra & cloud based systems..
cloud computing is advantageous to those companies which can reduce their capital ependiture on servers and data centres. it facilitates pay as per usage and the consumer need not pay full for the service. the service is almost similar to how we pay for using electricity and water.
?????? ????? ??? ????? ????????? ??? ??????, ?? ??? ? ?? ?????((