DiskSuite Hot Spares and Filesystem Layout

Post by Brian E. Seppanen » Fri, 25 Jun 2004 04:35:38



Hi Folks:

I have a question.  I'm running Solaris 8 and we have a DiskSuite RAID
configuration.   After initializing the RAID 5, I laid out a new
filesystem on the metadevice and mounted it:

newfs -m 0 /dev/md/rdsk/d15

Now the question I have concerns how I handle hot spares and filesystem
layout.   At this point, I've labeled the hot spare devices, so they
have a valid partition table, but I have not laid out a filesystem on
these devices.   If one slice within the RAID 5 set dies, it will fail
over to the hot spare device, but the hot spare at this point does not
have a filesystem laid out on it.   How will this be handled?   Will
Solaris automatically incorporate the device and lay out the filesystem
as necessary?   Or should I lay out the filesystem on the hot spares as
if they were individual disks?

Thanks for the help.

Brian Seppanen

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Darren Dunham » Fri, 25 Jun 2004 05:52:11



Quote:> Now the question I have concerns how I handle hot spares and filesystem
> layout.   At this point, I've labeled the hot spare devices, so they
> have a valid partition table, but I have not laid out a filesystem on
> these devices.   If one slice within the RAID 5 set dies, it will fail
> over to the hot spare device, but the hot spare at this point does not
> have a filesystem laid out on it.   How will this be handled?   Will
> Solaris automatically incorporate the device and lay out the filesystem
> as necessary?   Or should I lay out the filesystem on the hot spares as
> if they were individual disks?

There's no filesystem on a single disk; the filesystem exists on the
metadevice as a whole.

You didn't put a filesystem on each of the constituent disks, did you?
If you did, they no longer exist anyway.

The RAID 5 device allows all data on a single column to be recreated if
necessary.  This data includes the filesystem.
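(As an aside, that "recreate a column" property is plain parity arithmetic, and it applies to filesystem bits like any other bits.  A toy sketch in shell with made-up byte values, not anything SDS-specific:)

```shell
# Three data "columns" plus a parity column, as in a 4-slice RAID 5 stripe.
d0=170; d1=204; d2=240        # arbitrary example bytes
p=$(( d0 ^ d1 ^ d2 ))         # parity = XOR of all data columns

# Lose column d1, then rebuild it from the survivors plus parity.
rebuilt=$(( d0 ^ d2 ^ p ))
echo "$rebuilt"               # prints 204 -- identical to the lost d1
```

The same arithmetic holds whether those bytes are UFS metadata or file contents; the reconstruction doesn't care.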

--

Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Brian E. Seppanen » Fri, 25 Jun 2004 06:05:39



> There's no filesystem on a single disk; the filesystem exists on the
> metadevice as a whole.

> You didn't put a filesystem on each of the constituent disks, did you?
> If you did, they no longer exist anyway.

> The RAID 5 device allows all data on a single column to be recreated if
> necessary.  This data includes the filesystem.

Yes, I guess more succinctly my question is:

At failover, does the hotspare automatically have its filesystem created
as a member of the larger RAID configuration?  Is it part of the
failover process, or is there prep work involved that I'm not aware of?

At this time there is no individualized filesystem layout on the
hotspare disk.   My concern is that there was some sort of filesystem
prep, which I don't know about, that was necessary for hot spares, and
at failover it would become obvious that I missed a step.

Being paranoid.   Thanks for the help.

Brian Seppanen

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Darren Dunham » Fri, 25 Jun 2004 06:11:22



Quote:>> You didn't put a filesystem on each of the constituent disks, did you?
>> If you did, they no longer exist anyway.

>> The RAID 5 device allows all data on a single column to be recreated if
>> necessary.  This data includes the filesystem.

> Yes, I guess more succinctly my question is:
> At failover, does the hotspare automatically have its filesystem created
> as a member of the larger RAID configuration?  Is it part of the
> failover process, or is there prep work involved that I'm not aware
> of?

Once again, I disagree with the phrase.  The hotspare does not have a
filesystem of its own.

Portions of *the* filesystem will be moved onto the hotspare
automatically as part of the failover process.  The only prep is to
configure the slice on the disk, and register it with SDS, prior to the
event.
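(For reference, that prep is just a couple of commands -- a sketch with hypothetical device names, not the poster's actual config:)

```shell
# Create a hot spare pool containing the labeled spare slice.
metainit hsp001 c2t5d0s0

# Associate the pool with the RAID 5 metadevice (d15 here).
metaparam -h hsp001 d15

# Verify the association and the spare's status.
metastat d15
metahs -i
```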

Quote:> At this time there is no individualized filesystem layout on the
> hotspare disk.

No.  Likewise there is no individualized filesystem layout on any of the
other disks.  The filesystem exists usefully only on the metadevice,
which is spread across all the constituent disks in a RAID 5 setup.

Quote:> My concern is that there was some sort of file system
> prep, of which I know not, that was necessary for hot spares and at
> failover it would become obvious that I missed a step.

Nope.  The filesystem is just bits on the metadevice.  The failover
process moves the necessary bits onto the hotspare.  Whether those bits
are a filesystem or a raw Oracle database, SDS does not know and does
not care.

--

Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Juhan Leemet » Fri, 25 Jun 2004 09:30:11




>>> You didn't put a filesystem on each of the constituent disks, did you?
>>> If you did, they no longer exist anyway.

>>> The RAID 5 device allows all data on a single column to be recreated if
>>> necessary.  This data includes the filesystem.

>> Yes, I guess more succinctly my question is:

>> At failover, does the hotspare automatically have its filesystem created
>> as a member of the larger raid configuration?  Is it part of the
>> failover process, or is there prep work involved that I'm not aware
>> of.

> Once again, I disagree with the phrase.  The hotspare does not have a
> filesystem of its own.

Ah, trying to translate...?  ...or am I playing devil's advocate?
Maybe the OP is talking about partitioning?

I myself have wondered about how partitioning for hot spares works.  You
do have to partition disks before you set up root mirrors using SDS.  I
assume that you also have to define partitions identically for RAID disks
and their hot spares?  When you define things using meta* commands you
are always talking about slices, not disks.  Therefore all disks have to
be identically partitioned (either format/label the same size using
slice 2, or by creating a slice with format/partition).  That suggests
that one cannot have "generic" (unpartitioned) disks as a pool of hot
spares for different RAIDs?  If they are identically formatted/labelled
disks, that's OK (preferred, actually), isn't it?

Should each disk have a metadb replica?  In my RAID array, I have a copy
of the metadb at the beginning of each RAID slice (no separate
partition).  I guess I can assume that it is copied/recreated when a
hotspare takes over?  I think that is the case, since I did "fail" a
disk, fell over to a hotspare, and then later replaced it with the
original disk: the metadb was OK.

I haven't yet put root mirrors on that machine with the RAID.  When I do
that, I think I'll put another couple of metadb replicas on each disk
(sd0, sd1, in an Ultra2).  That gives me 5+4=9 replicas, but flexibility
in case I want to move off the RAID array (or have a catastrophic
hardware failure).  If I move the array intact, the next machine should
find the metadb on those slices?  If I "leave behind" 4 replicas, my
root mirror will still work (w/o the RAID).  It probably won't boot
(unattended) since the number of replicas has dropped below half.  I'm
not sure how to fix that, but I guess some more RTFM will sort that out.

I did read the SDS stuff, but it's been a while since I set up my RAID,
and in the meantime I've gotten confused (again?).  I have a 711 with
software RAID (infrequent writes), 5 active disks and 1 hot spare.  My
disks had actually been partitioned identically, but I wasn't sure if I
had to.

I would also think that RAID and hot spares work best with identical
disks.  This is probably the strongest argument for using Sun-branded
disks with their custom firmware, because they are guaranteed to be
identical (even if the manufacturer is different: Fujitsu, IBM, Seagate,
etc.) for each generic size (9GB, 18GB).  Have I understood that right?

What if you don't have identical disks?  Could you make do by defining a
slice on a bigger disk to be identical in size to a RAID slice, and use
that in the hot spares pool?  I don't see why that shouldn't work.

Oh, also, for this RAID array, I've defined it using slice 2 (backup =
whole disk). That basically ignores any other partitioning on the disk,
doesn't it? Is that a reasonable thing to do? Is there a better way?

Quote:>> My concern is that there was some sort of file system
>> prep, of which I know not, that was necessary for hot spares and at
>> failover it would become obvious that I missed a step.

This is where I think he's talking about partitioning.

Quote:> Nope.  The filesystem is just bits on the metadevice.  The failover
> process moves the necessary bits onto the hotspare.  Whether those bits
> are a filesystem or a raw oracle database, it does not know and does not
> care.

At one point I thought that SDS might (re)write the VTOC with the
necessary partitioning, but that sounds too dangerous, and implausible.

So, conceptually, rebuilding on a hotspare is (sort of) like doing a dd
from the virtual (reconstructed) RAID slice onto a new disk, except that
meanwhile the RAID array can keep on trucking, doing reads and writes.
(That's awesome, when stuff actually works the way it should!)

--
Juhan Leemet
Logicognosis, Inc.

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Darren Dunham » Fri, 25 Jun 2004 14:08:11



> have to partition disks before you setup root mirrors using SDS. I assume
> that you also have to define partitions identically for raid disks and
> their hot spares? When you define things using meta* commands you are
> always talking about slices, not disks. Therefore all disks have to be
> identically partitioned (either format/label same size using slice 2, or
> by creating a slice by format/partition). That suggests that one cannot
> have "generic" disks (unpartitioned) as a pool of hotspares for different
> RAIDs? If they are identical formatted/labelled disks, that's OK
> (preferred, actually), isn't it?
> Should each disk have a metadb replica? In my RAID array, I have a copy of
> the metadb at the beginning of each RAID slice (no separate partition). I
> guess I can assume that it is copied/recreated when a hotspare takes
> over? I think that is the case, since I did "fail" a disk, fall over to a
> hotspare, and then later replace it with the original disk: metadb is OK.

No.  The metadb is managed separately from the metadevices.  All metadb
copies are "active" all the time, so there's nothing extra to happen
when a disk fails.  Only a redundant metadevice will "fail over" to a
hot spare.

Quote:> I haven't yet put root mirrors on that machine with the RAID. When I do
> that, I think I'll put another couple of metadb replicas on each disk
> (sd0, sd1, in Ultra2). That gives me 5+4=9 replicas, but flexibility in
> case I want to move off (or have a catastrophic hardware failure) in the
> RAID array. If I move the array intact, the next machine should find the
> metadb on those slices?

It wouldn't know about the metadbs, so they would be ignored.

Quote:> If I "leave behind" 4 replicas, my root mirror
> will still work (w/o the RAID). It probably won't boot (unattended) since
> the number of replicas has dropped <1/2. I'm not sure how to fix that, but
> I guess some more RTFM will sort that out.

It won't survive with only 4.  You would need to delete at least 2
replicas from the array before disconnecting it.  

Perhaps you would rather not have more than 50% of your replicas on the
external array.
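(Sketched with hypothetical slice names, the bookkeeping before disconnecting the array would look something like:)

```shell
# See where the replicas currently live and their status flags.
metadb -i

# Delete two of the array-side replicas so the local disks keep a
# majority after the array is detached.
metadb -d c2t1d0s0 c2t2d0s0

# Confirm the remaining replica count and placement.
metadb
```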

Quote:> I did read the SDS stuff, but it's been a while since I setup my RAID, and
> in the mean time I've gotten confused (? again?). I have a 711 with
> software (infrequent writes) RAID, 5 active and 1 hotspare. My disks had
> actually been partitioned identically, but I wasn't sure if I had to.
> I would also think that RAID and hotspares works best with identical
> disks. This is probably the strongest argument for using Sun branded disks
> with their custom firmware, because they are guaranteed to be identical
> (even if the mfg is different: Fujitsu, IBM, Seagate, etc.), for each
> generic size (9GB, 18GB). Have I understood that right?
> What if you don't have identical disks. Could you make do, by defining a
> slice on a bigger disk to be identical in size to a RAID slice and use
> that in the hotspares pool? I don't see why that shouldn't work.

Yup.

Quote:> Oh, also, for this RAID array, I've defined it using slice 2 (backup =
> whole disk). That basically ignores any other partitioning on the disk,
> doesn't it? Is that a reasonable thing to do? Is there a better way?

It's one way to do it, and it will work.  Some places avoid using slice
2 for any purpose and instead take something else (say 6 or 7) and
define the entire disk on that slice.

Personally, I'd rather define a separate slice for the replicas.  If you
ever forget to put a replica on the disk, the metadevice will start at
offset 0 in the raw slice, different from all the other disks.  You
won't be able to then add a replica on the disk easily.
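(A sketch of that layout, with hypothetical disk/slice names -- a small dedicated slice for the replica so every column starts at the same offset:)

```shell
# Reserve a small slice (s7 here; ~10MB is plenty) for the state
# database replica on each disk, and put the metadevice column on s0.
metadb -a c1t2d0s7

# The RAID 5 columns then all use s0, so their offsets match whether
# or not a given disk ever received a replica.
metainit d15 -r c1t0d0s0 c1t1d0s0 c1t2d0s0
```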

Quote:>>> My concern is that there was some sort of file system
>>> prep, of which I know not, that was necessary for hot spares and at
>>> failover it would become obvious that I missed a step.
> This is where I think he's talking about partitioning.

Except he previously mentioned that the partitioning setup on the disks
was identical.

Quote:>> Nope.  The filesystem is just bits on the metadevice.  The failover
>> process moves the necessary bits onto the hotspare.  Whether those bits
>> are a filesystem or a raw oracle database, it does not know and does not
>> care.
> At one point I thought that SDS might (re)write the vtoc with the
> necessary partitioning, but that sounds too dangerous, and implausible.

Right.  It does not.

I haven't really examined how soft partitioning changes this.
Presumably you could have several soft partitions rebuild onto a generic
hot spare.  

That would be similar to VxVM, which does the VTOC work up front and
just moves the pieces around in a single large slice.

Quote:> So, conceptually, rebuilding on a hotspare is (sort of) like doing a dd
> from the virtual (reconstructed) RAID slice onto a new disk? except that
> meanwhile the RAID array can keep on trucking, doing reads and writes.
> (that's awesome! when stuff actually works the way it should!)

Yes, that's exactly what happens.  

--

Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Brian E. Seppanen » Fri, 25 Jun 2004 20:18:55



> Once again, I disagree with the phrase.  The hotspare does not have a
> filesystem of its own.

> Portions of *the* filesystem will be moved onto the hotspare
> automatically as part of the failover process.  The only prep is to
> configure the slice on the disk and with SDS prior to the event.

Thanks!  You just answered my question.   I wasn't stating that it
should have an individual filesystem laid out.   I just had no idea
whether it should or not, and so asked the question.   It's all good!

Brian Seppanen

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Juhan Leemet » Sat, 26 Jun 2004 09:15:36




>> Should each disk have a metadb replica? In my RAID array...
> No.  The metadb is managed separately from the metadevices...

[...etc...]

Quote:>> Oh, also, for this RAID array, I've defined it using slice 2...
> It's one way to do it, and it will work.  Some places avoid using slice
> 2 for any purpose and instead take something else (say 6 or 7) and
> define the entire disk on that slice.

> Personally, I'd rather define a separate slice for the replicas.  If you
> ever forget to put a replica on the disk, the metadevice will start at
> offset 0 in the raw slice, different from all the other disks.  You
> won't be able to then add a replica on the disk easily.

Thanks for the good advice, and for clarifying my understanding.
P.S. I think in the future I will also avoid using slice 2.  Cleaner.

Uh, one more thing (Columbo apologies)...

If in my case I were to define the metadb replicas in separate slices,
then I'm unclear as to how the metadb could get reconstructed on a hot
spare.  Would it?  Having defined the metadb in each slice of the 5-disk
software RAID, I believe it sort of "rides along" with that slice and
gets reconstructed onto the hotspare that is "switched in".  I don't
remember seeing any replica "disappear" when I yanked a drive.  Maybe I
should do that test again?  If metadb replicas are in separate slices,
then how would I tell SDS that a new copy is to be replicated onto the
hot spare?  Does having metadb replicas embedded in RAID slices work
best in this particular case?

--
Juhan Leemet
Logicognosis, Inc.

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Greg Menke » Sat, 26 Jun 2004 09:35:01





> > Personally, I'd rather define a separate slice for the replicas.  If you
> > ever forget to put a replica on the disk, the metadevice will start at
> > offset 0 in the raw slice, different from all the other disks.  You
> > won't be able to then add a replica on the disk easily.

> Thanks for the good advice, and for clarifying my understanding.
> p.s. I think in future I will also avoid using slice 2. Cleaner.

> Uh, one more thing (Columbo apologies)...

> If in my case I were to define the metadb replicas in separate slices,
> then I'm unclear as to how the metadb could get reconstructed on a hot
> spare.  Would it?  Having defined the metadb in each slice of the 5-disk
> software RAID, I believe it sort of "rides along" with that slice and
> gets reconstructed onto the hotspare that is "switched in".  I don't
> remember seeing any replica "disappear" when I yanked a drive.  Maybe I
> should do that test again?  If metadb replicas are in separate slices,
> then how would I tell SDS that a new copy is to be replicated onto the
> hot spare?  Does having metadb replicas embedded in RAID slices work
> best in this particular case?

My guess is the replica(s) on the failed drive would be marked as
errored, and the other slice(s) on the drive which are protected by RAID
would be moved to the hot spare slice(s).  It seems to be up to you to
set up more replicas.

The volume mangler section in the Solaris 9 docs goes through how hot
sparing works - I should read it again myself at this point...

Gregm

 
 
 

DiskSuite Hot Spares and Filesystem Layout

Post by Darren Dunham » Sun, 27 Jun 2004 00:05:48



>>> Oh, also, for this RAID array, I've defined it using slice 2...
>> It's one way to do it, and it will work.  Some places avoid using slice
>> 2 for any purpose and instead take something else (say 6 or 7) and
>> define the entire disk on that slice.

>> Personally, I'd rather define a separate slice for the replicas.  If you
>> ever forget to put a replica on the disk, the metadevice will start at
>> offset 0 in the raw slice, different from all the other disks.  You
>> won't be able to then add a replica on the disk easily.
> Thanks for the good advice, and for clarifying my understanding.
> p.s. I think in future I will also avoid using slice 2. Cleaner.

If you like.  For me it's not slice 2 versus non-slice 2; it's a
combined slice for metadb & metadevice versus separate slices.

Quote:> Uh, one more thing (Columbo apologies)...

Yes, lieutenant?

Quote:> If in my case I were to define the metadb replicas in separate slices,
> then I'm unclear as to how the metadb could get reconstructed on a hot spare.

metadb replicas are not reconstructed on a hotspare.

All metadb replicas are active all the time.  If a disk fails, it may be
that some of the replicas are no longer accessible.  If more than 50%
of the replicas fail, the machine will panic and reboot.

After replacing a disk, you would manually add replicas to it if you
want.

Think of it like an N-way mirror.
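(In other words, after replacing a disk the replicas come back only by hand -- a sketch with a hypothetical slice name:)

```shell
# Re-add two replicas on the repartitioned replacement disk; SDS will
# not re-create them as part of any hot spare or resync activity.
metadb -a -c 2 c1t3d0s7

# All replicas are active peers -- verify the new ones are listed.
metadb -i
```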

Quote:> Would
> it? Having defined the metadb in each slice of the 5 disk software RAID,
> I believe it sort of "rides along" with that slice and gets reconstructed
> onto the hotspare that is "switched in".

No, you would need to add the replica to the disk beforehand.

Quote:> I don't remember seeing any
> replica "disappear" when I yanked a drive. Maybe I should do that test
> again?

If you like.  They should appear with different flags in 'metadb' output
if they are inaccessible.

Quote:> If metadb replicas are separate slices, then how would I tell SDS
> that a new copy is to be replicated onto the hot spare?

I don't understand the question.  You simply put the replicas where you
want.  If that's the hotspare disk, put one there.

Quote:> Does having metadb
> replicas embedded in RAID slices work best in this particular case?

No.  

--

Senior Technical Consultant         TAOS            http://www.taos.com/
Got some Dr Pepper?                           San Francisco, CA bay area
         < This line left intentionally blank to confuse you. >

 
 
 

1. Disksuite - Removing a Hot spare

How do you go about removing a hot spare which is in use?
i.e. disassociating it from the submirror that it is now
part of.

I basically want to return the disk to its associated hot spare
pool, detach the submirror it was part of, and then build a new
submirror from some spare disks.

N.B. I don't want to use the spare disks as hot spares.

Thanks a lot.

2. DiskSuite + Hot Spare

3. Hot spare when both submirrors fails (DiskSuite 4.2)?

4. DiskSuite hot spare trouble

5. Advise for filesystem layout with DiskSuite

6. Can hot spare be added after metadevices were created ?

7. Veritas Mirroring (Hot Spare Problems)

8. hot spare system

9. adding a hot spare to a RAID5 device...