Generic gateway for fault tolerance

Post by Helge Hess » Fri, 29 Jun 2001 00:30:21



Hi,

I am designing a high-availability system in which two redundant servers
(each running the same applications, "warm passive") serve a fixed
number of client machines. Our policy is that a client application should
never send a request directly to a server application. That way a client
does not have to handle failover or timeout detection. The decision
that an application has failed should not be based on its reachability,
since the application can still be responsive when system control has
decided that the whole machine has a problem. So each request should be
routed through a local gateway on the client's machine. This gateway decides
to which replica the request is forwarded. The gateway has to forward the
request itself; returning a LocationForward to the client would make the
client responsible for handling a failure (particularly TCP/IP timeouts).
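
The routing decision described above can be sketched roughly as follows. This is only a minimal illustration, assuming system control pushes its verdicts into the gateway; the class name `ReplicaSelector`, its methods, and the machine names are hypothetical and not taken from any ORB or product.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaSelector:
    """Tracks which server machine the local gateway should forward to.

    Hypothetical sketch: the selection depends only on what system control
    has reported, never on whether a replica happens to answer.
    """
    replicas: list                       # preference order, e.g. ["server-a", "server-b"]
    failed: set = field(default_factory=set)

    def mark_failed(self, machine: str) -> None:
        # Called when system control declares the whole machine faulty,
        # even if the applications on it are still responsive.
        self.failed.add(machine)

    def mark_healthy(self, machine: str) -> None:
        # Called when system control clears the machine again.
        self.failed.discard(machine)

    def target(self) -> str:
        # First replica in preference order that is not declared failed.
        for machine in self.replicas:
            if machine not in self.failed:
                return machine
        raise RuntimeError("no replica available")
```

The client never sees this state; it only ever talks to the gateway on its own machine.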

Since this gateway will forward requests for quite a few interfaces, it
should be a generic solution, not an application-level gateway. DII/DSI is
too slow and needs a redundant (and therefore replicated) IFR. Using static
stubs/skeletons is not a generic solution and would require changing the
gateway for every new interface.
Supposing that an algorithm exists which can extract the requested service
from the object key, is it possible to build a generic gateway without
DII/DSI?

One of the main problems with the CORBA FT spec is that it depends on the
client's ability to detect a failure before it goes to use another object
from a multiple profile IOR. To detect that a cable has been unplugged may
take a long time (several minutes). Setting a request timeout on a stub
object is only a heuristic. If a request is processed by the servant but the
reply is not received in time, the request is sent to another object (of an
object group). In this case the request is processed twice, which violates
the CORBA "exactly once" semantics.
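
To make the double-processing hazard concrete, here is a small simulation. It is a sketch under simplifying assumptions, not CORBA code: `Replica` and `invoke_with_failover` are hypothetical names, and a "lost reply" simply stands for a client-side timeout that fires after the servant has already done the work.

```python
class Replica:
    """A stand-in for one member of an object group."""

    def __init__(self, name):
        self.name = name
        self.processed = []          # request ids this replica has executed

    def handle(self, request_id):
        # The servant executes the request before any reply is sent.
        self.processed.append(request_id)
        return f"{self.name} handled {request_id}"


def invoke_with_failover(replicas, request_id, replies_lost_from=()):
    """Try each replica in turn, moving on when the reply does not arrive.

    replies_lost_from names the replicas whose reply never reaches the
    client (for example because a cable was pulled after processing).
    """
    for replica in replicas:
        reply = replica.handle(request_id)
        if replica.name in replies_lost_from:
            # The client cannot tell "not processed" from "reply lost",
            # so after its timeout it retries on the next replica.
            continue
        return reply
    raise TimeoutError("no reply from any replica")
```

If the reply from the first replica is lost, both replicas end up with the same request id in `processed`, which is exactly the duplicate execution described above.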

Is there any way to actively forward a request, ideally with ORB
support?

Hope some of you have some hints,

Thanks,
Helge

 
 
 

Post by Jim Thom » Tue, 10 Jul 2001 10:17:27


There is an interesting article on the issues of CORBA fault-tolerance
from HP labs at http://www.hpl.hp.com/org/stl/emd/docs/corba.pdf.

 
 
 

Post by Ke J » Wed, 11 Jul 2001 16:42:34



> Hi,

> I am designing a high-availability system in which two redundant servers
> (each running the same applications, "warm passive") serve a fixed
> number of client machines. Our policy is that a client application should
> never send a request directly to a server application. That way a client
> does not have to handle failover or timeout detection. The decision
> that an application has failed should not be based on its reachability,
> since the application can still be responsive when system control has
> decided that the whole machine has a problem. So each request should be
> routed through a local gateway on the client's machine. This gateway decides
> to which replica the request is forwarded. The gateway has to forward the
> request itself; returning a LocationForward to the client would make the
> client responsible for handling a failure (particularly TCP/IP timeouts).

What is the link/connection between the clients and the gateway?
Is that link more reliable than the gateway-to-server or the
direct client-to-server connections?

Quote:

> Since this gateway will forward requests for quite a few interfaces, it
> should be a generic solution, not an application-level gateway. DII/DSI is
> too slow and needs a redundant (and therefore replicated) IFR. Using static
> stubs/skeletons is not a generic solution and would require changing the
> gateway for every new interface.
> Supposing that an algorithm exists which can extract the requested service
> from the object key, is it possible to build a generic gateway without
> DII/DSI?

Why not simply work at the message level? I assume the gateway
has no interest in the content of the message payload. Why should
it need to know the interface and demarshal the payload?
Simply forward the request to the target with the original payload
and the original CDR byte order. At most, the object key/target in
the request header has to be replaced. GIOP 1.2 is designed to make
such message forwarding easy (for example, the payload body is
always aligned on an 8-byte boundary). Neither DII nor an IFR is necessary.
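
As a rough illustration of message-level handling, the sketch below reads the 12-byte GIOP header and pulls the object key out of a GIOP 1.2 Request without touching the payload. It assumes an unfragmented request that uses KeyAddr addressing; the function names are hypothetical, and rewriting the key (which shifts the alignment of the fields that follow it) is deliberately left out.

```python
import struct
from typing import NamedTuple

GIOP_HEADER_LEN = 12
MSG_REQUEST = 0      # GIOP message type "Request"
KEY_ADDR = 0         # TargetAddress discriminator for a plain object key


class GiopHeader(NamedTuple):
    major: int
    minor: int
    little_endian: bool
    msg_type: int
    body_size: int


def parse_giop_header(data: bytes) -> GiopHeader:
    """Decode the fixed 12-byte GIOP message header."""
    if len(data) < GIOP_HEADER_LEN or data[:4] != b"GIOP":
        raise ValueError("not a GIOP message")
    major, minor, flags, msg_type = data[4], data[5], data[6], data[7]
    little = bool(flags & 0x01)                  # bit 0 selects the byte order
    (size,) = struct.unpack_from("<I" if little else ">I", data, 8)
    return GiopHeader(major, minor, little, msg_type, size)


def extract_object_key(message: bytes) -> bytes:
    """Return the object key of a GIOP 1.2 Request addressed by KeyAddr.

    Offsets are counted from the start of the message, as CDR alignment is:
    12 request_id, 16 response_flags, 17-19 reserved, 20 addressing
    disposition (short), 24 key length (unsigned long), 28 key bytes.
    """
    hdr = parse_giop_header(message)
    if (hdr.major, hdr.minor) != (1, 2) or hdr.msg_type != MSG_REQUEST:
        raise ValueError("only GIOP 1.2 Request messages are handled here")
    order = "<" if hdr.little_endian else ">"
    if len(message) < 28:
        raise ValueError("truncated request header")
    (disposition,) = struct.unpack_from(order + "h", message, 20)
    if disposition != KEY_ADDR:
        raise ValueError("target is not addressed by object key")
    (key_len,) = struct.unpack_from(order + "I", message, 24)
    key = message[28:28 + key_len]
    if len(key) != key_len:
        raise ValueError("truncated object key")
    return key
```

A gateway built this way could map the extracted key to a replica and then write the original bytes to that replica's connection unchanged, so no interface knowledge is involved.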

Quote:

> One of the main problems with the CORBA FT spec is that it depends on the
> client's ability to detect a failure before it goes to use another object
> from a multiple profile IOR. To detect that a cable has been unplugged may
> take a long time (several minutes).

Is this an artificial case or a practical one? If a cable link is
that vulnerable, in my opinion it should be replaced or supplemented
by another link layer, e.g. a PPP circuit link, which can detect a
link-layer failure immediately.

Quote:

> Setting a request timeout on a stub
> object is only a heuristic. If a request is processed by the servant but the
> reply is not received in time, the request is sent to another object (of an
> object group). In this case the request is processed twice, which violates
> the CORBA "exactly once" semantics.

> Is there any way to actively forward a request, ideally with ORB
> support?

An OMG firewall proxy (such as VisiBroker's Gatekeeper) seems close to
what you need. Although it is specified for a very specific purpose, most
proxy implementations are capable of serving more general purposes
than firewall tunneling alone.

Ke

Quote:

> Hope some of you have some hints,

> Thanks,
> Helge

 
 
 

1. help for DosFs recovery tools, disk fault tolerance

Hi , all

I'm working with DosFs on vxWorks 5.3 and it does not seem to be robust
against power failures. Has any of you already faced this problem?
How can I add some fault tolerance for my hard disk data?
What about a "read only" dosfs partition with all the vital files
inside?

I know that the file system library maintains and keeps updated all the
FAT copies you specified at start. But it never uses them for error
recovery. They are left available to the user.
Did anyone try to use these FAT copies to build some "boot disk
check and recover" utilities?

There is a chkdsk utility for vxWorks written by "RST Software
Industries Ltd. Israel" but it is only a starting point! In fact, if you
have any file that cannot be rebuilt, it is cut by chkdsk.

( My target is VR4300 mips  based)

Thanks for attention
Lorenzo Ferigo
Bergamo Hard Copy
Hewlett-Packard Italy

2. Lists with unknown contents

3. Looking for Fault Tolerance Libraries

4. C/C++ Self Teaching guides

5. Error Recovery and Fault Tolerance

6. A2630: How to Address 4.0 MB as 32-bit

7. JOB: Bay area start up; Need an Expert in Fault Tolerance Computing

8. how does WWW relate to IP adresses & Winsock?

9. CFP - Conf on dependability & fault-tolerance...

10. Fault tolerance for MPI on a LAN/WAN of Windows workstations

11. LAM-MPI fault tolerance

12. Q:Fault tolerance in LAM

13. CFP: Hardware & Software Fault Tolerance in Distributed Systems