HACMP - Disabling cascading failback

Post by Pascal » Wed, 27 Oct 1999 04:00:00



Hi,

I am using 3 systems with HACMP 4.3.  They are configured as follows:
2 production systems (HA) and one standby system.

When one production system goes down, the standby takes over the HA
resources.  Because we are using cascading mode, as soon as the production
system is back up it tells the standby to release the resources and takes
them back.

My question: how can I avoid this?  I don't want an automatic failback.

I know that one solution is to use rotating mode, but that mode is
limited to 2 nodes per cluster (or at least per resource group).

Thanks in advance!

Pascal R.
Belgium

 
 
 

Post by Dr. Marku » Wed, 27 Oct 1999 04:00:00




> My question : how can I avoid this ?  I don't want an automatic failback.


Hi,
Of course, you could always schedule the reintegration of the failed node
into the cluster by using the at command.
On the failed node: echo "rc.cluster -boot" | at <the time you want>
That way reintegration happens when it suits your availability needs.
Just my USD 0.02.
Regards
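
The at approach above could be wrapped roughly as follows. This is only a sketch: the function name and the DRY_RUN switch are made up for illustration, and "rc.cluster -boot" is copied from the post as-is, so check the path and flags against your own installation before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not part of HACMP): queue cluster start-up with at.
# All arguments are passed to at as the time specification.
# By default it only prints what it would do; set DRY_RUN=0 to submit the job.
schedule_reintegration() {
    # Command text taken from the post; path and flags may differ on your system.
    cmd='rc.cluster -boot'
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: echo \"$cmd\" | at $*"
    else
        echo "$cmd" | at "$@"
    fi
}

# Example: show the job that would be queued for 02:00.
schedule_reintegration 02:00
```

Running it in the default dry-run mode first lets you confirm the time specification before anything is queued on the failed node.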


 
 
 

Post by Matthew Land » Wed, 27 Oct 1999 04:00:00



> My question : how can I avoid this ?  I don't want an automatic failback.

> I know that one solution is to use the rotating mode, but this mode is
> limited to
> 2 nodes per cluster (or at least per resource group).


Unfortunately, cascading is designed to reintegrate.  Rotating is the way
to go.

There isn't a rotating limitation of 2 nodes per resource group.  You
could have 8 rotating nodes (1 active and 7 standby).  There is a
limitation of n-1 (n = # nodes) resource groups per network.  That means
in your 3 node environment, you could have 2 rotating resource groups
(3-1 = 2) on a network and have ALL 3 participate.  The first node up
will take the first resource group.  The second node up will take the
second resource group, and the final node up will be the standby for BOTH
rotating resource groups, although only one resource group could fail at
a time until another standby became available (failed node restarted will
be a new standby).
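
As a rough illustration of the n-1 rule described above, the arithmetic can be written out as a tiny shell function. This is only a sketch; the function name is made up and it is not an HACMP tool.

```shell
#!/bin/sh
# Hypothetical helper: applies the n-1 rule from the post.
# With n nodes on a network, at most n-1 rotating resource groups fit.
max_rotating_groups() {
    nodes=$1
    if [ "$nodes" -lt 1 ]; then
        # No nodes, no resource groups.
        echo 0
    else
        echo $((nodes - 1))
    fi
}

# The 3-node cluster from the question: prints 2.
max_rotating_groups 3
```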

Having 2 production and 1 standby that you DON'T want to reintegrate
after a failure is *EXACTLY* the rotating environment I described above:
2 of 3 can be production, and the third is a non-reintegrating standby.

Another way around this: bring the cascading node back into the cluster at
a scheduled maintenance time.  However, while a system is down you are
exposed.  In a standard environment, the cluster cannot absorb another
system failure until that node is reintegrated.

 - Matt
--
_______________________________________________________________________

      Comments, views, and opinions are mine alone, not IBM's.

 
 
 

Post by John Newma » Wed, 27 Oct 1999 04:00:00


> My question : how can I avoid this ?  I don't want an automatic failback.
> I know that one solution is to use the rotating mode, but this mode is
> limited to
> 2 nodes per cluster (or at least per resource group).

Hi,

I am not sure whether you are on HA 4.3.1.  That version has a
new feature which allows you to make a resource sticky.

I.e., if BoxA fails over to BoxB, you can tell BoxB to hold on to the
resource, so that when you restart HA on BoxA the resource does not
cascade back to BoxA.

I think this is what you are looking for.  There are commands
to make a resource sticky, and you could probably put them
into a post-install script of some kind.

Later,  John.


 
 
 

Post by Matthew Land » Wed, 27 Oct 1999 04:00:00



> > My question : how can I avoid this ?  I don't want an automatic failback.
> > I know that one solution is to use the rotating mode, but this mode is
> > limited to
> > 2 nodes per cluster (or at least per resource group).

> Hi,

> I am not sure whether you are on HA 4.3.1, in this version, it has a
> new feature which allows you to make a resource sticky.

> i.e. If boxA fails over to BoxB then you can say to BoxB hold on to this
> resource, so when you restart HA on BoxA, the resource does not cascade
> back to BoxA.

> I think that this is what you a looking for, The are obviously commands
> to allow you to make a resource sticky, and you could probably put them
> into a post install script of some kind.

> Later,  John.


The sticky option has been around a while.  If you have HACMP installed
and its man pages, try man cldare.

cldare -M resgrp :location :sticky   # etc.

However, if you really want 2 prod servers to fail over to one standby
for both, and leave the failed system as the new standby when it comes
up, rotating is the way to go.  If you then want to migrate the rotating
resources back to the first production node, you can use the same cldare
-M command to migrate them back at a time of your choice.  I prefer the
control of a standard environment over customizing one function to act
like another.

If you decide to go this route, you need to be CAREFUL and test
it outside of production.  The values of default and sticky can be
reset by HACMP under certain circumstances.  I recommend reading (and
the man page recommends reading) the chapter on "Changing Resources and
Resource Groups" in the HACMP for AIX Administration Guide.

 - Matt

--
_______________________________________________________________________

      Comments, views, and opinions are mine alone, not IBM's.

 
 
 

Post by Matthew Land » Wed, 27 Oct 1999 04:00:00


I also forgot to recommend an egroups mailing list for hacmp.

an excellent source of HACMP administrators and their skills trying to
help one another.

 - Matt

--
_______________________________________________________________________

      Comments, views, and opinions are mine alone, not IBM's.

 
 
 

Post by Darrell Frappi » Thu, 28 Oct 1999 04:00:00




>I am using 3 systems whith HACMP 4.3.  They are configured as follow
>2 production systems (HA) and one standby system.

>When one production system goes down, the standby takes the HA resources
> and, as we are using cascading mode, as soon as the production system is
>back again it sends a message to standby to release the resource and take
>them back...

>My question : how can I avoid this ?  I don't want an automatic failback.

The most common "solution" I've seen for this is to remove the HACMP
start entry from /etc/inittab on the production systems.  That way the
failed machine can be booted and tested without fear that it will
reintegrate before you are ready.

Manual procedures will need to ensure that HA is started on the
production systems before the backup is started.  After the production
machine is fixed, start HA when you want it to take over.

There are some risks, of course.  If all 3 machines reboot after a
power failure, both production workloads will be started on the backup,
which should be possible in a cascading configuration.  Someone needs to
be available to diagnose failures and decide when and how to start HA.
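
Before relying on this, it helps to confirm whether a node still has an active HACMP start entry. A minimal sketch, assuming the entry invokes rc.cluster (the script name used elsewhere in this thread) and that commented-out inittab lines begin with a colon, as on AIX; the function name is made up and the entry label on your system may differ.

```shell
#!/bin/sh
# Hypothetical helper: list active inittab lines that mention rc.cluster.
# Assumptions: the HACMP autostart entry calls rc.cluster, and lines
# starting with ":" are commented out (AIX inittab convention).
find_hacmp_inittab_entries() {
    inittab=${1:-/etc/inittab}
    # Drop commented lines, then keep only those that reference rc.cluster.
    grep -v '^:' "$inittab" | grep 'rc\.cluster'
}

# Usage (prints nothing if HACMP will not start at boot):
#   find_hacmp_inittab_entries /etc/inittab
```

An empty result on the production systems is what you want if HA is to be started by hand.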


 
 
 

1. HACMP unexpected fallback with Cascading Without Fallback

We experienced an unexpected fallback in the following situation:

Our setup

node 1:                         node 2:
resource group 1                (just a standby)
(cascading without fallback)

---

During testing, we stopped node 1, resulting in fallover of the resource
group to node2.

We brought node 1 up again, and the resource group stayed on node 2,
as expected.

Then we added a volume group to the resource group, and synchronized
the configuration. WHAM! Fallback of resource group to node 1.

We tried to understand this effect, Reading The Fscking Manuals, but were
not enlightened.

Can anybody explain this?

Florian

--
This thing, that hath a code and not a core,
Hath set acquaintance where might be affections,
And nothing now
                Disturbeth his reflections.          -- Ezra Pound, "An Object"
