Greetings to all the smart people around here!
I'd like to ask whether it is feasible, or a good idea at all, to deploy a Java enterprise web application to a cloud such as Amazon EC2. More precisely, I'm looking for infrastructure options for an application that must handle a few hundred users with long sessions that are neither CPU nor memory intensive. I'm considering dedicated servers, virtual private servers (VPSs) and EC2. I've noticed that there is a project called JBoss Cloud, so people are working on enabling such a deployment. On the other hand, it doesn't seem to be mature yet, and I'm not sure that the cloud is ready for this kind of application, which differs from typical cloud-based applications like Twitter. Would you recommend deploying it to the cloud? What are the pros and cons?
The application is a Java EE 5 web application whose main function is to let users compose their own customized Product by combining the available Parts. It uses stateless and stateful session beans and JPA to persist entities to an RDBMS, and it fetches information about Parts from the company's inventory system via a web service. Besides external users, it is also used by a few internal ones, who are authenticated against the company's LDAP. The application should handle around 300-400 concurrent users building their products and should be reasonably scalable and available, though these qualities are only of medium importance at this stage.
I've proposed an architecture consisting of a firewall (FW) and a load balancer supporting sticky sessions and HTTPS (in the cloud these would be replaced by EC2's Elastic Load Balancing service and a FW on the app. servers; in a physical architecture the load balancer would be a hardware device), then two physical clustered application servers combined with web servers (so that if one fails, a user doesn't lose the product he/she has spent a long time building) and finally a database server. The DB server would need a slave backup instance that can replace the master instance if it fails. This should provide reasonable availability and fault tolerance, and good scalability as long as a single RDBMS can keep up with the load. That should be OK for quite a while, because most operations are done in memory using a stateful bean, data is only occasionally stored in or retrieved from the DB, and the amount of data is low too. A problematic part could be the dependency on the remote inventory system web service, but with good caching of its outputs in the application it should be OK too.
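To make the "good caching of its outputs" idea concrete, here is a minimal sketch of a read-through cache with a time-to-live in front of the inventory call. Everything in it is an assumption for illustration: the class name PartInfoCache, the loader function standing in for the web service client, and the injected clock are all hypothetical, and it uses Java 8 lambdas for brevity, which a Java EE 5 runtime would not offer. In a clustered setup each application server would hold its own copy, so this only reduces calls to the inventory system and does not keep the nodes in sync.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical read-through cache with a time-to-live; not part of any Java EE API. */
class PartInfoCache<K, V> {

    /** A cached value together with the time at which it stops being valid. */
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for the remote inventory web service call
    private final long ttlMillis;
    private final LongSupplier clock;    // injected so that expiry can be tested without sleeping

    PartInfoCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, calling the loader only if the entry is missing or expired. */
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now >= entry.expiresAtMillis) {
            // Two threads may both reload the same key at the same time; that costs
            // one extra remote call but keeps the sketch free of locking.
            V fresh = loader.apply(key);
            entry = new Entry<>(fresh, now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

It could be wired up as new PartInfoCache<String, String>(id -> inventoryClient.lookup(id), 60000L, System::currentTimeMillis), where inventoryClient.lookup is again a made-up name. Note that if the loader throws, the exception propagates and no stale value is served; whether serving stale data during an inventory outage is acceptable is a business decision the sketch does not make.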
Unfortunately I have only a vague idea of the system resources (memory size, number and speed of CPUs/cores) that such an "average Java EE application" for a few hundred users needs. My rough and mostly unfounded estimate, based on current Amazon offerings, is that 1.7GB of RAM and a single, 2-core "modern CPU" with a speed of around 2.5GHz (the High-CPU Medium Instance) should be sufficient for each of the two application servers (since we can handle higher load by provisioning more of them). Alternatively, I would consider using the Large instance (64-bit, 7.5GB RAM, 2 cores at 1GHz).
So my question is whether such a deployment to the cloud is technically and financially feasible or whether dedicated/VPS servers would be a better option, and whether anyone has real-world experience with something similar.
Thank you very much! /Jakub Holy
PS: I've found the JBoss EAP in a Cloud Case Study, which shows that it is possible to deploy a real-world Java EE application to the EC2 cloud, but unfortunately it gives no details about the topology, instance types, or anything else :-(
I'm serving a "few hundred users" from a single EC2 High-CPU Medium instance. No load balancing, no dedicated DB servers, nothing fancy at all. Simply a single box. Additionally I'm using some services:
As I said, nothing fancy - at least in Amazon's cloud environment. And everything for less than $200/month. Regarding pricing, you should take care though. Amazon did a good job of obfuscating the main costs. For example, looking at CloudFront pricing, you might notice the $0.15 per GB but ignore the $0.01 per 10,000 requests - it's a ridiculously small price for a lot of requests, isn't it? Big surprise: 2/3 of our CloudFront cost is for requests (at about 3 KB per request). I/O requests for EBS are a similar story.
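For anyone who wants to check that 2/3 figure, here is the arithmetic as a small sketch. It uses only the two prices quoted above (which may well differ from whatever Amazon charges by the time you read this) and assumes 3 KB means 3,000 bytes and 1 GB means 10^9 bytes. The class name and the request count are made up for illustration; the share does not depend on the number of requests.

```java
/** Back-of-the-envelope split of CloudFront cost into transfer and request fees. */
class CloudFrontCostSketch {
    static final double PRICE_PER_GB = 0.15;            // USD, as quoted in the post
    static final double PRICE_PER_10K_REQUESTS = 0.01;  // USD, as quoted in the post
    static final double BYTES_PER_GB = 1_000_000_000.0; // assumption: decimal gigabytes

    /** Cost of the bytes transferred for the given number of requests. */
    static double transferCost(long requests, long bytesPerRequest) {
        return requests * bytesPerRequest / BYTES_PER_GB * PRICE_PER_GB;
    }

    /** Cost of the per-request fee alone. */
    static double requestCost(long requests) {
        return requests / 10_000.0 * PRICE_PER_10K_REQUESTS;
    }

    /** Fraction of the total bill caused by the per-request fee. */
    static double requestShare(long requests, long bytesPerRequest) {
        double perRequestFees = requestCost(requests);
        return perRequestFees / (perRequestFees + transferCost(requests, bytesPerRequest));
    }

    public static void main(String[] args) {
        long requests = 100_000_000L; // hypothetical monthly volume
        long bytesPerRequest = 3_000; // "about 3 KB per request"
        System.out.printf("transfer: $%.2f, requests: $%.2f, request share: %.0f%%%n",
                transferCost(requests, bytesPerRequest),
                requestCost(requests),
                100 * requestShare(requests, bytesPerRequest));
    }
}
```

With these inputs the transfer comes to $45 and the request fees to $100, so requests make up roughly 69% of the total, which matches the "2/3" above. The smaller your average object, the more the per-request fee dominates.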
As it would be extremely easy to scale (use a bigger instance, move the DB to the Relational Database Service), I'd suggest you start with the same setup. As you said, throwing more boxes in is pretty simple (assuming your setup supports adding/removing nodes on the fly). This makes it feasible to choose the appropriate setup by trial and error - some thorough load testing should do the job. Choose something that works for your expected load (plus some extra power) and grow/shrink as soon as you have production data.
As a conclusion: yes, it's certainly possible to host Java EE apps on EC2 :)
Edit: as a side note, comparing the pricing of EC2 with traditional hosting is comparing apples and oranges - at least as long as traditional hosting doesn't come with an SLA for your network, nearly unlimited scalability, no hardware issues, nearly unlimited and redundant storage, different availability zones and a bunch of extra services. If somebody tells you that traditional hosting is cheaper, he might be a sysadmin anxious about his job ;) Don't get me wrong, it is cheaper - but you get much less for a little less money.
And by the way, I'm in no way affiliated with Amazon ... but I feel that I should be rewarded for being a good spokesman, shouldn't I? :D