Amazon’s cloud computing platform, EC2, has a new Elastic IP feature that expands its hosting capabilities. However, for projects requiring high availability, a load-balanced cluster within EC2 is limited by slow IP propagation times. A dedicated load balancing solution is needed for truly highly available clusters.
April 24, 2008: Amazon is building a revolutionary cloud computing platform with the Elastic Compute Cloud (EC2) service. The recently announced Elastic IP feature greatly expands the possibilities of EC2 as a true hosting environment.
For standard website requirements, the current implementation seems fine, but for projects that require high availability, there is at least one significant limitation.
We envision a load-balanced cluster entirely within EC2. The front end of this setup would be managed by two small EC2 instances that would effectively act as load balancers or routers. Requests would arrive at the primary router and then be routed to the least loaded instance within the cluster. Since a single router serves as a single point of failure, at least one additional router is required for a truly highly available system. A monitor might ping the primary router regularly, and if something went wrong, the secondary router would have to reassign the IP address to itself and assume the role of primary router.
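To make the layout concrete, here is a minimal sketch of the two pieces described above: picking the least loaded instance, and a secondary router that takes over after repeated failed health checks. Every name in it (`pick_least_loaded`, `FailoverMonitor`, `is_primary_healthy`, `take_over_ip`, `max_failures`) is hypothetical. The health probe and the IP reassignment are passed in as plain callables because the real calls depend on your tooling, and the three-failure threshold is an assumption, not something we tested.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def pick_least_loaded(loads: Dict[str, float]) -> str:
    """Return the ID of the instance reporting the lowest load."""
    if not loads:
        raise ValueError("no instances available")
    return min(loads, key=loads.get)


@dataclass
class FailoverMonitor:
    """Runs on the secondary router and watches the primary."""
    is_primary_healthy: Callable[[], bool]  # hypothetical health probe (e.g. a ping)
    take_over_ip: Callable[[], None]        # hypothetical hook that reassigns the IP
    max_failures: int = 3                   # assumed threshold before failing over
    failures: int = 0
    promoted: bool = False

    def tick(self) -> bool:
        """Run one health check; return True once this router is primary."""
        if self.promoted:
            return True
        if self.is_primary_healthy():
            self.failures = 0  # a healthy reply resets the failure streak
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.take_over_ip()
            self.promoted = True
        return self.promoted
```

Calling `tick()` on a timer gives the regular ping described above. Requiring several consecutive failures keeps a single dropped ping from triggering an unnecessary takeover.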
Combined with Amazon’s Availability Zones, such a system would have no single point of failure. To test the feasibility of this layout, we spawned two small EC2 instances and monitored how long it took for a second instance to acquire the IP address of the first. In three tests, it took an average of 3 1/2 minutes and never less than 3 minutes for this to occur.
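A measurement like this can be scripted along the following lines. This is only a sketch, not the script we used: `measure_takeover` and its `ip_points_to_secondary` probe are hypothetical names, and the probe stands in for whatever check tells you the address is now answered by the second instance (for example, fetching a page that reports the instance ID).

```python
import time
from typing import Callable, Optional


def measure_takeover(
    ip_points_to_secondary: Callable[[], bool],  # hypothetical probe
    timeout_s: float = 600.0,
    poll_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until the IP is served by the secondary instance.

    Returns the elapsed seconds, or None if timeout_s passes first.
    Start the timer right after requesting the IP reassignment.
    """
    start = clock()
    while True:
        if ip_points_to_secondary():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(poll_s)
```

The clock and sleep functions are parameters so the loop can be exercised without waiting several real minutes. The result is only as precise as the polling interval.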
We are running a similar cluster in a traditional hosting environment and the IP captures take about 2 seconds.
The result, of course, is that if the primary router fails, there would be a theoretical downtime of roughly 3.5 minutes (and at least 3 minutes in our tests) while the secondary router waits for IP propagation. We assume that the large number of routers within Amazon’s network makes faster IP propagation a non-trivial task.
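To put those numbers in perspective, the sketch below turns a failover time into yearly availability. The 210-second and 2-second figures come from our measurements above. The rate of twelve failovers per year is purely an assumption for illustration, and `availability` is a hypothetical helper.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def availability(failovers_per_year: int, seconds_per_failover: float) -> float:
    """Fraction of the year the service is reachable, counting only failover downtime."""
    downtime = failovers_per_year * seconds_per_failover
    return 1.0 - downtime / SECONDS_PER_YEAR


# Assumed: one router failure per month.
ec2 = availability(12, 210.0)        # 3.5-minute IP takeover inside EC2
traditional = availability(12, 2.0)  # 2-second takeover in traditional hosting

print(f"EC2: {ec2:.5%}  traditional: {traditional:.5%}")
```

Whatever failure rate you assume, each failover inside EC2 costs about 105 times the downtime of the same event in the traditional setup.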
One solution would be to move the routers outside of EC2, but the increased ping times and latency make this suboptimal.
Ideally, Amazon would offer a dedicated load balancing solution designed specifically for such purposes. Unless some other solution is offered, “high availability” clusters entirely within Amazon’s EC2 service will not be truly highly available.