Network RAID with Linux network block device

Post by C. Ch » Thu, 07 Jun 2001 23:34:31



From what I've read about Linux nbd, it will allow aggregation
of separate remote nbd-exported partitions into a RAID 0, 1, or
even 5 md volume. However, because of the lack of distributed
locking, only one client can mount the volumes r/w.

From what I've read of GFS, it provides locking so multiple
clients can mount r/w but GFS does not allow aggregation.

I'd like to do both: aggregate several separate disk pools
on the network into a 0+1 and allow r/w by multiple clients.

Is there any Linux project working on this? I know there
are commercial solutions but they are expensive.
--


 
 
 

Network RAID with Linux network block device

Post by Peter T. Breue » Sat, 09 Jun 2001 02:22:56



> From what I've read about Linux nbd, it will allow aggregation
> of separate remote nbd-exported partitions into a RAID 0, 1, or
> even 5 md volume. However, because of the lack of distributed
> locking, only one client can mount the volumes r/w.

This is correct. But the aggregation is not necessary. You can always
run RAID linear over several nbd devices.
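A minimal sketch of that setup might look like the following. Hostnames, ports, and device names are illustrative, and the md creation is shown with `mdadm` syntax rather than the raidtab/mkraid tools of the day:

```shell
# On each storage host, export a local partition with the NBD server
# (port and partition are assumptions for illustration):
nbd-server 2000 /dev/sdb1          # run on host1
nbd-server 2000 /dev/sdb1          # run on host2

# On the single read/write client, import each remote export:
nbd-client host1 2000 /dev/nbd0
nbd-client host2 2000 /dev/nbd1

# Concatenate the imports with md RAID linear, then put a filesystem on top:
mdadm --create /dev/md0 --level=linear --raid-devices=2 /dev/nbd0 /dev/nbd1
mkfs.ext2 /dev/md0
mount /dev/md0 /mnt/big
```

Only this one client may mount the result r/w, for the locking reasons discussed above.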

> From what I've read of GFS, it provides locking so multiple
> clients can mount r/w but GFS does not allow aggregation.

Well, locking is not really the problem anyway.  Isn't GFS a file
system?

> I'd like to do both: aggregate several separate disk pools
> on the network into a 0+1 and allow r/w by multiple clients.

See above. You need to invest more in your imagination!

> Is there any Linux project working on this? I know there
> are commercial solutions but they are expensive.

Well, there's my ENBD, which is better than NBD, but also only
multiple-readonly. You can set it multiple-readwrite, but the result
will be chaos. It's bound to be if you don't have FS-level atomicity
of operations.

The short term solution is to simply export the nbd (or enbd)
aggregation via NFS. This adds to network traffic, but you probably
won't notice.
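Concretely, that short-term setup might look like this (paths, hostnames, and export options are illustrative assumptions):

```shell
# On the machine that assembled the nbd/enbd aggregation into /dev/md0:
mount /dev/md0 /export/shared

# Add an /etc/exports entry and re-export, so other clients can reach it:
echo '/export/shared  *.example.com(rw,sync)' >> /etc/exports
exportfs -ra

# On each additional client -- NFS supplies the coherency and locking
# that raw shared block access lacks:
mount -t nfs mdhost:/export/shared /mnt/shared
```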

The long term solution is either

  a) use a journalling fs such as XFS on nbd, and add FS hooks
     to atomicise accesses,

  b) use a journalling fs ... , and maintain time-order of block
      accesses across all the machines (hic).

If somebody could tell me about how to do (b), I would be pleased.

Peter

 
 
 

Network RAID with Linux network block device

Post by C. Ch » Sun, 10 Jun 2001 04:53:38




Hello Mr Breuer, I'm glad to hear from an NBD developer.


>> From what I've read about Linux nbd, it will allow aggregation
>> of separate remote nbd-exported partitions into a RAID 0, 1, or
>> even 5 md volume. However, because of the lack of distributed
>> locking, only one client can mount the volumes r/w.

>This is correct. But the aggregation is not necessary. You can always
>run RAID linear over several nbd devices.

Are there any advantages to doing so rather than treating them
as one logical volume?

>> From what I've read of GFS, it provides locking so multiple
>> clients can mount r/w but GFS does not allow aggregation.

>Well, locking is not really the problem anyway.  Isn't GFS a file
>system?

Yes, but it also provides Fibre Channel, parallel SCSI, and
Ethernet/IP tools. GFS has its own network block device
driver GNBD and an IP-based lock server memexpd. The filesystem
part of GFS is a journalling FS, and the exports allow
multiple clients to mount r/w.


>See above. You need to invest more in your imagination!

After my misspent years in graduate school I think I'll invest
in more tangible assets...;-)

>Well, there's my ENBD, which is better than NBD, but also only
>multiple-readonly. You can set it multiple-readwrite, but the result
>will be chaos. It's bound to be if you don't have FS-level atomicity
>of operations.

>The short term solution is to simply export the nbd (or enbd)
>aggregation via NFS. This adds to network traffic, but you probably
>won't notice.

I hadn't thought of that, but there doesn't seem to be any reason why
this wouldn't work. Has anyone tried it?

>The long term solution is either

>  a) use a journalling fs such as XFS on nbd, and add FS hooks
>     to atomicise accesses,

>  b) use a journalling fs ... , and maintain time-order of block
>      accesses across all the machines (hic).

>If somebody could tell me about how to do (b), I would be pleased.

>Peter

The GFS project seems close. The GFS FAQ mentions a cluster-wide
LVM is in the works, though RAID will be harder due to the issues
you mentioned.

Some commercial startups have been using Linux as the base
for their products. Tricord has software called Illumina which
does RAID across a NAS cluster, and Falconstor has a pure software
storage virtualization product. But I don't think they've made any
contributions of source code back to the open or free software
communities.
--


 
 
 

Network RAID with Linux network block device

Post by Peter T. Breue » Mon, 11 Jun 2001 07:09:46






>>> From what I've read about Linux nbd, it will allow aggregation
>>> of separate remote nbd-exported partitions into a RAID 0, 1, or
>>> even 5 md volume. However, because of the lack of distributed
>>> locking, only one client can mount the volumes r/w.

>>This is correct. But the aggregation is not necessary. You can always
>>run RAID linear over [several] nbd devices.
> Are there any advantages to doing so rather than treating them
> as one logical volume?

Eh?  What do you mean by the latter?  Nobody mentioned LVM, I think.
The distinction I was drawing was between the NBD system doing the
resource aggregation, and using a separate aggregation mechanism.

I'm not sure which I'd prefer. I can tell you that it's easy to build
raid aggregation into the NBD server (because I have), but not so easy
to build it into the client (because I haven't). In my opinion it is
better to keep the aggregation separate. If all the resources are at
the server end, you can use software RAID to aggregate them there, and
export the result via NBD. If the resources are scattered, then you
will have to export them individually via NBD, then aggregate them via
softRAID.
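For the first case (all the resources at the server end), the division of labour might be sketched as follows; device names and the port are assumptions:

```shell
# On the server: aggregate the local disks with software RAID first...
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1

# ...then export the single composite device over NBD:
nbd-server 2000 /dev/md0

# On the client: one import, no client-side md layer needed:
nbd-client server 2000 /dev/nbd0
mount /dev/nbd0 /mnt/pool
```

In the scattered case the `mdadm` step simply moves to the client, running over several imported `/dev/nbdN` devices instead of local partitions.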

Or you could look at drbd, which I think uses mirrored servers.

>>> From what I've read of GFS, it provides locking so multiple
>>> clients can mount r/w but GFS does not allow aggregation.

>>Well, locking is not really the problem anyway.  Isn't GFS a file
>>system?
> Yes, but it also provides Fibre Channel, parallel SCSI, and
> Ethernet/IP tools. GFS has its own network block device

I'm not sure what relevance that has. The transport system should
really be invisible to the device. I'd prefer something like IPv6, and
leave it to someone else to provide PtP IPv6 over SCSI!

> driver GNBD and an IP-based lock server memexpd. The filesystem

Interesting. Do you know anything about GNBD?

> part of GFS is a journalling FS, and the exports allow
> multiple clients to mount r/w.

Journalling FSs (such as XFS) over nbd indeed work well, and I am
assured that the only condition required for their working is that
client block requests retain their time ordering at the server end.
This is of course certain if there is only one client.  With two
clients, I would have to implement a clocks mechanism (I haven't, but
will).  I don't believe that FS hooks to get atomic locking (i.e.
temporary exclusive access to the server) are necessary under those
circumstances.

The most generic mechanism I know of right now using NBDs is
to export a RAID device via NFS or another shared network FS.
The RAID device should be a composite of NBD devices. This allows
a network FS to be made up of components distributed over the net.

A next generation device would be one that virtualized this
architecture, so that it appears to be as I have described it,
but isn't.

>>See above. You need to invest more in your imagination!
> After my misspent years in graduate school I think I'll invest
> in more tangible assets...;-)
>>Well, there's my ENBD, which is better than NBD, but also only
>>multiple-readonly. You can set it multiple-readwrite, but the result
>>will be chaos. It's bound to be if you don't have FS-level atomicity
>>of operations.

>>The short term solution is to simply export the nbd (or enbd)
>>aggregation via NFS. This adds to network traffic, but you probably
>>won't notice.
> I hadn't thought of that, but there doesn't seem to be any reason why
> this wouldn't work. Has anyone tried it?

I suppose they have (although I haven't). It's completely unremarkable
as an idea, so I doubt if anyone would mention it to me.

>>The long term solution is either

>>  a) use a journalling fs such as XFS on nbd, and add FS hooks
>>     to atomicise accesses,

>>  b) use a journalling fs ... , and maintain time-order of block
>>      accesses across all the machines (hic).

>>If somebody could tell me about how to do (b), I would be pleased.
> The GFS project seems close. The GFS FAQ mentions a cluster-wide
> LVM is in the works, though RAID will be harder due to the issues
> you mentioned.
> Some commercial startups have been using Linux as the base
> for their products. Tricord has software called Illumina which
> does RAID across a NAS cluster, and Falconstor has a pure software
> storage virtualization product. But I don't think they've made any
> contributions of source code back to the open or free software
> communities.

I am involved with some commercial (and noncommercial) applications.
I don't know if I am at liberty to mention anything.

Peter

 
 
 

1. network block device w/ RAID-1 -> distr. fs ?

Hello everyone. We have two servers that need to have access to a shared
filesystem. When serverA goes down, serverB needs to be able to take over
and use the state of the filesystem that serverA last had before it went
down. NFS is not a solution obviously since we don't want to be dependent
on a single NFS server.

Here's what I just set up:

* I created a file /sharefile of exactly 1 GB on both serverA and serverB.
* I set up the /dev/loop0 loopback device and associated it with /sharefile
  on both serverA and serverB.
* I exported the /dev/loop0 device on port 4000 of both servers using
  the network block device server. Now the /dev/loop0 of a server is
  accessible as /dev/nd0 on the other server.
* I created a RAID-1 mirror set on serverA of two devices: /dev/loop0 and
  /dev/nd0.
* I created an ext2 filesystem on the RAID-1 mirror.
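In command form, the steps above would be roughly as follows (device and path names as described; the mirror creation is shown with `mdadm` syntax rather than the raidtab equivalent):

```shell
# On both serverA and serverB: create the 1 GB backing file and loop device
dd if=/dev/zero of=/sharefile bs=1M count=1024
losetup /dev/loop0 /sharefile

# On both servers: export /dev/loop0 over NBD on port 4000
nbd-server 4000 /dev/loop0

# On serverA: import serverB's export, then mirror the local loop device
# with the remote NBD device and build the filesystem on the mirror
nbd-client serverB 4000 /dev/nd0
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/loop0 /dev/nd0
mkfs.ext2 /dev/md0
mount /dev/md0 /share
```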

ServerA should have the RAID-1 set mounted under /share. Whatever
operations are done on that filesystem should be replicated on the
/dev/loop0 of serverB. That way when serverA fails, serverB will be able
to take over with the most recent data.

It turns out that I can mount the /dev/loop0 on serverB under /share while
it is actually in the RAID-1 set on serverA. Now when I create a file on
serverA:/share, it shows up on serverB:/share as well after a small delay.
However, all file sizes are 0 bytes on serverB! A little experimenting
shows that while /dev/loop0 is not mounted on serverB, all changes on
serverA are correctly replicated on serverB's /dev/loop0. As soon as I
mount it, it does not work properly anymore.

Now I understand that I can't reasonably expect this to work flawlessly. I
am just interested in what exactly is causing this to fail. If anybody
could give me some insight into this matter, that'd be greatly appreciated.

Best regards,

Martijn de Vries
