I just wanted to get opinions from you fine folks on this forum. Where I work, our web site is becoming more critical and we want to implement some type of failover or load balancing to ensure we are up 99%-ish of the time if possible. We're on an IBM i using ZendServer w/ Apache. There is a known bug where the server jobs get stuck in a timewait status and we have to restart Apache to clear those jobs and get everything back on track. It is very sporadic: maybe we'll go months w/o it dropping, or we'll have to restart 4 times in a row. IBM has an open ticket on this and we've worked with them to fine-tune Apache to try to keep it stable.

What do you all do, have done, or suggest in regard to either failing over to a second instance of Apache or load balancing multiple instances, so that if one drops, traffic will flow to a live instance?

I did find this link: Pound

https://linuxtechlab.com/load-balancing-web-servers-using-pound-load-balancer/
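For what it's worth, a Pound setup like the one in that article boils down to a small config file. A minimal sketch (addresses and ports below are placeholders, not taken from the article) might look like:

```
ListenHTTP
    Address 0.0.0.0
    Port    80

    Service
        BackEnd
            Address 192.168.0.10    # first Apache instance
            Port    8080
        End
        BackEnd
            Address 192.168.0.11    # second Apache instance
            Port    8080
        End
    End
End
```

Pound probes its backends and stops sending traffic to one it can't reach, which is exactly the "if one drops, traffic flows to the live one" behavior.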

Watched this YouTube tutorial and it seems like it could be an option.

https://www.youtube.com/watch?v=-WuW27hpHWc

I guess another related question would be how do you run multiple instances of Apache?
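On the multiple-instances question, the usual approach on a Unix-ish box is one httpd binary started several times with different config files, each with its own Listen port, PidFile, and log paths. A rough sketch (all paths and instance names below are made up; on IBM i, as far as I know, each HTTP server instance keeps its config under /www/<name>/conf and is started by name with STRTCPSVR):

```
# Generic Apache: same binary, two configs that differ only in
# Listen port, PidFile, and log locations
/usr/sbin/httpd -f /etc/httpd/instance-a.conf    # Listen 8081
/usr/sbin/httpd -f /etc/httpd/instance-b.conf    # Listen 8082

# IBM i flavor: start each named instance from a CL command line
# STRTCPSVR SERVER(*HTTP) HTTPSVR(INSTA)
# STRTCPSVR SERVER(*HTTP) HTTPSVR(INSTB)
```

A load balancer (Pound, or anything similar) in front would then spread traffic across the two ports.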

I appreciate your insight.

    Hmm ... depending on your numbers, I might be no help. Currently we don't load-balance at work. Here are some things we've done to help keep pages serving quickly as traffic has risen.

    1. Optimize all images, minify JS and CSS files, etc., and serve them from a cookie-free domain with long expiry dates in the headers. (This often means that when we change one of these assets we have to adjust a query string in the HTML, which we do via our config files.)
    2. Cache HTML assets that are used in mashup pages on disk or a RAM disk (for example, featured blocks that show on multiple pages throughout the site).
    3. Optimize queries like nobody's business. Make sure nothing runs without an index, avoid subqueries, etc.
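    For point 1, the long-expiry part can be handled in Apache with mod_expires. A sketch (the MIME types and "1 year" lifetime are just examples, not our exact config):

    ```
    <IfModule mod_expires.c>
        ExpiresActive On
        ExpiresByType text/css               "access plus 1 year"
        ExpiresByType application/javascript "access plus 1 year"
        ExpiresByType image/png              "access plus 1 year"
    </IfModule>
    ```

    The query-string trick is then just referencing something like style.css?v=123 in the HTML and bumping the number when the file changes, so clients re-fetch despite the long expiry.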

    If you've done all that, it might be time to investigate a CDN, load balancing, etc. In my experience, page speed/performance is a constant battle of additional features vs. code simplicity and execution times.

      I think we have a considerable amount of traffic on our instance of Apache, but working here is my only frame of reference. Our main ecommerce site has around 200+ concurrent users during normal business hours. We then have about 15 virtual servers hosting "catalog"-type generic websites that our customers link from their own sites so their customers can view products. I'd say we have about 10k-15k customers who have those sites linked on their pages, and individually they receive around 100k+ views per week. We also send out A LOT of email: close to 50k email addresses that we email 3-5 times per week. Then we have a couple dozen internal tools built for business-logic-type queries... all running through our one instance of Apache.

      We cache as much as possible. Minify scripts. Optimize images and db queries.

      Two years ago we were running on only 1 CPU and it was pegged at about 80-90% utilization throughout the day. During this time the site crashed fairly often, I'd say every other day. We finally bought another license to spin up a second CPU, and that sped up everything on our network. It took our average page request from about 3-5 seconds down to around 100-300 milliseconds currently, minus the occasional spike.

      IMO it seems like a lot of traffic for 1 server, but I am not 100% sure how much Apache can handle.

        I'm not very strong in the devops world. All I can tell you is that the project I work on is hosted on AWS, and we use multiple Docker containers (in ECS tasks) across multiple AWS EC2 instances, both for webservers and load-balancers. (One of those diagrams where you have several load-balancers with lines to each webserver, resulting in lots of crisscrossing lines. 🙂 ).

          I'm probably in over my head also. I'm not familiar with ZendServer; what OS is on the IBM box, and do you have a link to the bug?

          The point there being, if you just have two servers with the same bug ... is that really a failover?

          If I go back to my original line of thought ... are you running (My)SQL on the same box as the Apache server, or is it separate?

          200 concurrent users is probably an order of magnitude more than I'm dealing with. If I were facing load issues right now, the first thing I'd do is get the SQL server onto another box; the BI stuff (I assume you mean stats or BI software when you say "internal tools built for business logic type of queries") would probably come off the public web server onto another box as well.

          Apache comes out of the box configured for 150 concurrent connections. It can be tuned for a much higher number of concurrent connections, but you & I both know that at levels like that it's not Apache but PHP and (My)SQL that are chewing up resources, especially RAM.
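          To put numbers on that: the 150 figure is the prefork MPM's default MaxRequestWorkers (called MaxClients in older versions). Raising it looks something like this sketch; the values are illustrative only, and since each worker is a full PHP-loaded process, RAM rather than Apache is usually the real ceiling:

          ```
          <IfModule mpm_prefork_module>
              ServerLimit            400
              MaxRequestWorkers      400    # default is 150
              StartServers            10
              MinSpareServers         10
              MaxSpareServers         25
              MaxConnectionsPerChild 10000  # recycle children to contain leaks
          </IfModule>
          ```

          Rule of thumb: measure the resident size of one PHP-loaded child, divide available RAM by that, and cap MaxRequestWorkers below the result so the box never swaps.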

          If you can separate the DB from the Apache box, get the BI off the server, and you're still pushing your systems, I'd say you're in CDN/load-balancing territory. At that point you're often talking about redundant systems with round-robin DNS, perhaps something like: https://help.dnsmadeeasy.com/dns-failover/configure-dns-failover-round-robin/
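          Plain round-robin at the zone level is just multiple A records for the same name; the DNS-failover services layer health checks on top so a dead address gets pulled. A sketch in zone-file notation (the addresses are documentation placeholders):

          ```
          ; two A records for one name; resolvers rotate between them
          www   300   IN   A   203.0.113.10
          www   300   IN   A   203.0.113.11
          ```

          The short TTL (300 seconds here) matters: it bounds how long clients keep hitting a dead address after a record is pulled.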

          (Note that this is an easy Google result, not a recommendation for DNSMadeEasy --- I have no experience with them one way or the other).
