Looking for opinions, advice on redundancy strategy

Post by Aaron Harsh » Thu, 17 Dec 1998 04:00:00

My employer is starting to move more and more of the business onto Linux
servers, and we're getting to the point where it could be very expensive if
a server died.

We're considering building some redundancy into our servers as follows:

- The applications and data for each service (e.g., Oracle database, SMTP
server, HTTP server, etc.) will be stored on sets of RAID disks.

- We'll have N+2 machines (where there are N services running on our Linux
machines), with the extra two just sitting around waiting for a problem.

- Each of the N+2 machines will have a SCSI controller, an Ethernet
controller, and a small internal disk with nothing on it but the kernel,
drivers, and bare operating system (that is, no web/mail/database server).

- Each of the RAID sets will also include a script that makes any settings
necessary on the server and starts the appropriate application.  This script
will also bind the necessary IP address to our network card (e.g., the HTTP
disks will bind www.rentrak.com to the card, the SMTP disks will bind
mail.rentrak.com, etc.)
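As a rough illustration of the per-RAID-set startup script described above, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the `ServiceConfig` and `build_startup_commands` names, the 192.0.2.10 documentation address, the `eth0` interface, and the `/mnt/www` mount point are all hypothetical. It only builds the command lines (binding the service address with `ip addr add`, then starting the application) and prints them, so it can be read and tried without root.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceConfig:
    """Settings stored on one RAID set (all values here are hypothetical)."""
    name: str
    ip_cidr: str              # address the service answers on, e.g. for www
    interface: str            # network card on the host machine
    start_command: List[str]  # command that launches the application


def build_startup_commands(cfg: ServiceConfig) -> List[List[str]]:
    """Return the commands to run, in order, once the disks are attached."""
    return [
        # Bind the service's IP address to the network card.
        ["ip", "addr", "add", cfg.ip_cidr, "dev", cfg.interface],
        # Start the application that lives on the RAID set.
        list(cfg.start_command),
    ]


if __name__ == "__main__":
    http = ServiceConfig(
        name="http",
        ip_cidr="192.0.2.10/24",
        interface="eth0",
        start_command=["/mnt/www/bin/apachectl", "start"],
    )
    # Dry run: print the commands instead of executing them.
    for command in build_startup_commands(http):
        print(" ".join(command))
```

To actually apply the settings, each command list could be passed to `subprocess.run(command, check=True)` as root; keeping the build step separate makes the script easy to check on a spare machine before it is needed.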

The goal is obviously to get rid of any single points of failure.  Disk
failure will be handled automatically by RAID, and any other hardware
problems can be fixed by moving the disks over to one of the backup
machines.

A nice benefit of this separation of hardware/software is that we can
upgrade hardware less painfully; we build the hardware, make sure the
kernel's set up, and just move the application disks to the new box.
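The bookkeeping behind the N+2 scheme (which machine currently hosts which service's disks, and which spares are still free) can be sketched as follows. This is only an illustration under assumptions: the `fail_over` function and the machine names `box1` to `box5` are hypothetical, and the physical step of moving the disks is still done by hand.

```python
from typing import Dict, List, Tuple


def fail_over(
    assignments: Dict[str, str],
    spares: List[str],
    failed_machine: str,
) -> Tuple[Dict[str, str], List[str]]:
    """Return updated (assignments, spares) after failed_machine dies.

    assignments maps service name -> machine holding that service's disks.
    spares lists idle machines in the order they should be used.
    """
    new_assignments = dict(assignments)
    # A failed spare is simply no longer available.
    new_spares = [m for m in spares if m != failed_machine]

    for service, machine in assignments.items():
        if machine == failed_machine:
            if not new_spares:
                raise RuntimeError(
                    f"no spare machine left to host {service!r}"
                )
            # Move the service's disks to the next idle machine.
            new_assignments[service] = new_spares.pop(0)

    return new_assignments, new_spares


if __name__ == "__main__":
    services = {"oracle": "box1", "smtp": "box2", "http": "box3"}
    idle = ["box4", "box5"]
    services, idle = fail_over(services, idle, "box2")
    print(services, idle)
```

With two spares, the sketch tolerates two machine failures before it raises an error, which matches the "extra two" in the plan.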

Does anyone see any problems with this scheme?  I'm not sure whether it
would even be possible to move the RAID disks to a machine with a different
RAID controller (or to a machine using software RAID).  Am I missing
anything else?

Aaron Harsh


1. strategies for fault-tolerance/redundancy?

I may have to put together a high-reliability e-commerce site.  Can
anybody point me to a good reference on fault-tolerance/redundancy
strategies?
Beyond a good tutorial, book, or link collection, there's one question
that particularly interests me:  If I run multiple machines that are
geographically separated - how can I get around the problem of a client's
DNS resolver caching the IP address of a machine that's gone down (short
of using an extremely short TTL)?

Thanks very much,

Miles Fidelman

2. Compiling 1.3.3* Kernels

3. Need advice for redundancy connections guidelines.

4. aphid v0.10a -- quick Apache installer

5. vpn - not sure if its what Im looking for - looking for advice

6. ipautofw with redhat 5.2

7. Need advice re SCO developer product strategy

8. Slow LAN with FreeBSD 3.4

9. Backup Strategy Advice Needed

10. Advice, opinions, and ideas sought.

11. Seeking advice/opinions about Conner TS4000 tape drive

12. Looking for Floppy Tape Drive Backup Strategy.

13. Thanks (was: Advice, opinions, and ideas sought.)