Yet another Linux filesystem: with version control

Post by Jerome de Vivi » Wed, 25 Jul 2001 06:10:08



Hi all,

Handling multiple versions is a tough challenge (...even in the Linux
kernel). Working under software configuration management (SCM) helps,
but with some overhead, and it works only if everybody supports it.

From CVS to ClearCase, I haven't seen any easy tool. I feel a real
need to handle SCM simply.

The multiple version filesystem (mvfs) of ClearCase gives
transparent access to the data. I found this feature cool, but the
overall system is too complex. I would like to write an extension
module for the Linux kernel to handle version control in a simple way.

Here are the main features:

-no check-out/check-in
-labeling
-private copies
-transparent access to data
-configuration selected with a single environment variable
-a mix of normal files (from the base FS) and files managed under
version control (C-files) in the same filesystem

Here's how I see it working:

When a C-file is created, the label "init" is put on it.  The first
write to a C-file creates a private copy for the user who runs the
process. This C-file is added to a "User File List" (UFL). This
private copy is now selected by the FS in place of version "init".
Each user can start his own private copy by writing to a C-file.

When a developer has reached a milestone and would like to share his
work, he creates a new label. This label is put on every private copy
listed in the UFL, and the UFL is emptied. Those new versions
are now public. They are viewed by setting $CONFIGURATION to the new
label. New development can start from this label.
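
The labeling step described above can be sketched in C. This is only an illustration under stated assumptions: the names struct version, struct ufl, ufl_add and ufl_publish are hypothetical, the UFL is a fixed-size array, and label id 0 is assumed to mean "private copy, not labeled yet".

```c
#include <assert.h>
#include <stddef.h>

#define UFL_MAX 16

/* Hypothetical version record: a private copy carries no label yet. */
struct version {
        int label_id;                   /* 0 = unlabeled private copy */
};

/* Hypothetical User File List: the private copies owned by one user. */
struct ufl {
        struct version *entry[UFL_MAX];
        size_t count;
};

/* Record a new private copy; returns 0 on success, -1 if the list is full. */
static int ufl_add(struct ufl *u, struct version *v)
{
        assert(u != NULL && v != NULL);
        if (u->count == UFL_MAX)
                return -1;
        u->entry[u->count++] = v;
        return 0;
}

/* Put label_id on every private copy in the UFL, then empty the list.
 * Returns the number of versions that became public. */
static size_t ufl_publish(struct ufl *u, int label_id)
{
        size_t n = u->count;

        assert(label_id != 0);
        for (size_t i = 0; i < n; i++)
                u->entry[i]->label_id = label_id;
        u->count = 0;
        return n;
}
```

A "mklabel" tool would end up calling something like ufl_publish() once per user; after the call the list is empty, so the next write to a C-file starts a fresh private copy.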

The label "init" is predefined. Labels will be organized in a tree,
and the structure will look like this:

struct label {
       int id;
       char *name;
       struct label *parent;
};

When we access a C-file with a "read" or a "write", the extension
module selects one version with the following rules:

First, if the C-file is in the UFL, we have a private copy to
select. Otherwise, we choose the version labeled by "$CONFIGURATION".
If no such version exists, we search for the version marked by the
nearest "parent" label (at the least, the label "init" matches).
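
The selection rules above can be sketched as a walk up the label tree. This is a minimal sketch, not a kernel implementation: struct version, struct cfile, find_labeled and select_version are hypothetical names, the per-file versions are kept in a flat array rather than a tree, and the private copy is modeled as a single pointer standing in for the UFL lookup.

```c
#include <assert.h>
#include <stddef.h>

struct label {
        int id;
        const char *name;
        struct label *parent;           /* NULL for the root label "init" */
};

/* Hypothetical record for one labeled version of a C-file. */
struct version {
        const struct label *label;
        const char *data;
};

/* Hypothetical C-file: its public versions plus the caller's private copy. */
struct cfile {
        const struct version *versions;
        size_t nversions;
        const struct version *private_copy;     /* non-NULL if in the caller's UFL */
};

/* Return the version carrying exactly this label, or NULL. */
static const struct version *find_labeled(const struct cfile *f,
                                          const struct label *l)
{
        for (size_t i = 0; i < f->nversions; i++)
                if (f->versions[i].label == l)
                        return &f->versions[i];
        return NULL;
}

/* Rule 1: a private copy wins.  Rule 2: otherwise try the label named by
 * $CONFIGURATION.  Rule 3: otherwise try each parent label up to "init". */
static const struct version *select_version(const struct cfile *f,
                                            const struct label *config)
{
        assert(f != NULL);
        if (f->private_copy)
                return f->private_copy;
        for (const struct label *l = config; l != NULL; l = l->parent) {
                const struct version *v = find_labeled(f, l);
                if (v)
                        return v;
        }
        return NULL;    /* cannot happen if every C-file has an "init" version */
}
```

Because every C-file receives the "init" label at creation, the walk always terminates with a match as long as the configured label descends from "init".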

On the kernel side, we need to manage the following structures:
-a tree of versions for each C-file.
-a tree of labels.
-a UFL for each developer.

In userland, we need:
-a "mklabel" tool.
-a "CONFIGURATION" environment variable.
-an existing tool for "merge" operations.

If my design matches your needs, and if there is enough feedback, I will
start this project. As I'm not a super kernel hacker, I need your help.

Any volunteers are welcome!

j.

--

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 


Post by Larry McVo » Wed, 25 Jul 2001 06:30:06


> The multiple version filesystem (mvfs) of ClearCase gives
> transparent access to the data. I found this feature cool, but the
> overall system is too complex. I would like to write an extension
> module for the Linux kernel to handle version control in a simple way.

Having been through this a time or two, a few points to consider:

a) This is a hard area to get right.  I've done it twice, I told Linus that
   I could do it the second time in 6 months, and that was 3 years ago and
   we're up to 6 full time people working on this.  Your mileage may vary.
b) Filesystem support for SCM is really a flawed approach.  No matter how
   much you hate all SCM systems out there, shoving the problem into the
   kernel isn't the answer.  All that means is that you have an ongoing
   battle to keep your VFS up to date with the kernel.  Ask Rational
   how much fun that is...
c) If you have to do a file system, may I suggest that you clone the SunOS
   4.x TFS (translucent file system)?  It's a useful model, you "stack" a
   directory on top of a directory and you can see through to the underlying
   directory.  When you write to a file, the file is copied forward to the
   top directory.  So a hack attack is

        mount -t TFS my_linux /usr/src/linux
        cd my_linux
        hack hack hack
        ... many hours later
        cd ..
        umount my_linux
        find . -type f -print   # this is your list of modified files

   It's a cool thing but only semi needed - most serious programmers already
   know how to do the same thing with hard links.

More brains are better than less brains, so welcome to the SCM mess...
--
---
Larry McVoy              lm at bitmover.com           http://www.bitmover.com/lm

Post by Rik van Rie » Wed, 25 Jul 2001 07:00:11



> b) Filesystem support for SCM is really a flawed approach.

Agreed.  I mean, how can you cleanly group changesets and
versions with a filesystem level "transparent" SCM ?

The goal of an SCM is to _manage_ versions and changesets,
if it doesn't do that we're back at CVS's "every file its
own versioning and to hell with manageability" ...

regards,

Rik
--
Executive summary of a recent Microsoft press release:
   "we are concerned about the GNU General Public License (GPL)"

                http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/


Post by Jerome de Vivi » Wed, 25 Jul 2001 07:10:07


Larry McVoy wrote:

> Having been through this a time or two, a few points to consider:

> a) This is a hard area to get right.  I've done it twice, I told Linus that
>    I could do it the second time in 6 months, and that was 3 years ago and
>    we're up to 6 full time people working on this.  Your mileage may vary.

Yeah, I'm not alone!

I have absolutely no idea how much work it is. Will you work on this
topic again?
You + me + the 5 people who work with you = 7 people.

If we need 50 people, there is room enough for 43 volunteers!

> b) Filesystem support for SCM is really a flawed approach.  No matter how
>    much you hate all SCM systems out there, shoving the problem into the
>    kernel isn't the answer.  All that means is that you have an ongoing

A filesystem seems to be the best place to store files. My first
intent was to get rid of additional layers and to be able to use all
UNIX tools directly on the data. As I said, I have only one idea in
mind: "keep it simple"!

>    battle to keep your VFS up to date with the kernel.  Ask Rational
>    how much fun that is...

> c) If you have to do a file system, may I suggest that you clone the SunOS
>    4.x TFS (translucent file system)?  It's a useful model, you "stack" a
>    directory on top of a directory and you can see through to the underlying
>    directory.  When you write to a file, the file is copied forward to the
>    top directory.  So a hack attack is

>         mount -t TFS my_linux /usr/src/linux
>         cd my_linux
>         hack hack hack
>         ... many hours later
>         cd ..
>         umount my_linux
>         find . -type f -print   # this is your list of modified files

>    It's a cool thing but only semi needed - most serious programmers already
>    know how to do the same thing with hard links.

I've already built this kind of solution:
-copy every directory and sub-directory of v1/ into v2/
-create a symlink from v2 to v1 for each file.
-protect v1/

To work on a file, we just break the link and copy the file. But I
don't see how to work with 2 versions of the same file with hard links.

> More brains are better than less brains, so welcome to the SCM mess...

Yeah, it's a true mess!

j.

--


Post by Larry McVo » Wed, 25 Jul 2001 07:20:10



> I absolutely don't know how much work it is. Will you work again on this
> topic ?

Err, I've got a young but healthy company that is already doing it.  I'm
happy to offer what advice I can to help you but I can't really commit
substantial resources towards this.  I make my living off of my company
and that has to come first.  That said, it's an interesting area and it's
nice to see others take an interest, so I'll help a little...

> To work on a file, we just break and copy the link. But, i don't see how
> to work with 2 versions of the same file with hard link.

You don't want to do so.  You save little by doing so.  Please tell me you
weren't going to version control at the block level, therein lies the path
to insanity.  Getting it right at the file boundary is hard enough.
--
---
Larry McVoy              lm at bitmover.com           http://www.bitmover.com/lm

Post by Jerome de Vivi » Wed, 25 Jul 2001 07:20:11


Rik van Riel wrote:


> > b) Filesystem support for SCM is really a flawed approach.

> Agreed.  I mean, how can you cleanly group changesets and
> versions with a filesystem level "transparent" SCM ?

With labels!

In my initial post, I explained that labels are used to
identify individual files AND are also used to select, for
each file of a set, one version (= select a configuration).
It works!

> The goal of an SCM is to _manage_ versions and changesets,
> if it doesn't do that we're back at CVS's "every file its
> own versioning and to hell with manageability" ...

Versioning is already a first step.

j.

--


Post by Jerome de Vivi » Wed, 25 Jul 2001 07:30:15


Larry McVoy wrote:


> > I absolutely don't know how much work it is. Will you work again on this
> > topic ?

> Err, I've got a young but healthy company that is already doing it.  I'm
> happy to offer what advice I can to help you but I can't really commit
> substantial resources towards this.  I make my living off of my company
> and that has to come first.  That said, it's an interesting area and it's
> nice to see others take an interest, so I'll help a little...

OK, thanks!

> > To work on a file, we just break and copy the link. But, i don't see how
> > to work with 2 versions of the same file with hard link.

> You don't want to do so.  You save little by doing so.  Please tell me you
> weren't going to version control at the block level, therein lies the path
> to insanity.  Getting it right at the file boundary is hard enough.

Yes, it was block-level version control, but it fits our needs (I
scattered files across directories when there were no dependencies).

j.

--


Post by Rik van Rie » Wed, 25 Jul 2001 07:40:06



> Rik van Riel wrote:

> > > b) Filesystem support for SCM is really a flawed approach.

> > Agreed.  I mean, how can you cleanly group changesets and
> > versions with a filesystem level "transparent" SCM ?

> With labels!

> In my initial post, I explained that labels are used to
> identify individual files AND are also used to select, for
> each file of a set, one version (= select a configuration).
> It works!

Hmmmm, so it's not completely transparent. Good.

Now if you want to make this kernel-accessible, why
not make a userland NFS daemon which uses something
like bitkeeper or PRCS as its backend ?

The system would then look like this:

 _____    _______    _____    _____
|     |  |       |  |     |  |     |
| SCM |--| UNFSD |--| NET |--| NFS |
|_____|  |_______|  |_____|  |_____|

And there, you have a transparent SCM filesystem
that works over the network ... without ever having
to modify the kernel or implement SCM.

> Versioning is already a first step.

And I'm not convinced it is even needed. All you
really need is the glue layer between the SCM
system and the kernel. A user level NFS server
will do this just fine.

regards,

Rik
--
Executive summary of a recent Microsoft press release:
   "we are concerned about the GNU General Public License (GPL)"

                http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/


Post by Jerome de Vivi » Wed, 25 Jul 2001 08:10:08


Rik van Riel wrote:

> > Rik van Riel wrote:

> > > > b) Filesystem support for SCM is really a flawed approach.

> > > Agreed.  I mean, how can you cleanly group changesets and
> > > versions with a filesystem level "transparent" SCM ?

> > With labels!

> > In my initial post, I explained that labels are used to
> > identify individual files AND are also used to select, for
> > each file of a set, one version (= select a configuration).
> > It works!

> Hmmmm, so it's not completely transparent. Good.

You only set a global variable to select which configuration
you want to work on. You can't make it simpler, Rik: everything
else is transparent: read, write, ...!

> Now if you want to make this kernel-accessible, why
> not make a userland NFS daemon which uses something
> like bitkeeper or PRCS as its backend ?

> The system would then look like this:

>  _____    _______    _____    _____
> |     |  |       |  |     |  |     |
> | SCM |--| UNFSD |--| NET |--| NFS |
> |_____|  |_______|  |_____|  |_____|

Your architecture is too complex for me.

> And there, you have a transparent SCM filesystem
> that works over the network ... without ever having
> to modify the kernel or implement SCM.

I can't do it outside the kernel. There is one important
feature I mentioned: I would like to mix files from the
"base" filesystem and files that are managed under
configuration. Why is this feature really important?
Because in the product, there are two kinds of files:
-source files (leaves of the dependency tree)
-and generated files.
As you know, in SCM generated files are not identified by a version
number, but by a configuration (a set with one version for each
dependency). So there is no need to manage all objects of a
partition under version control.

j.

--


Post by Larry McVo » Wed, 25 Jul 2001 08:20:10



> Now if you want to make this kernel-accessible, why
> not make a userland NFS daemon which uses something
> like bitkeeper or PRCS as its backend ?

> The system would then look like this:

>  _____    _______    _____    _____
> |     |  |       |  |     |  |     |
> | SCM |--| UNFSD |--| NET |--| NFS |
> |_____|  |_______|  |_____|  |_____|

> And there, you have a transparent SCM filesystem
> that works over the network ... without ever having
> to modify the kernel or implement SCM.

I like the way you think, Rik.  About 2 years ago I did a very quick and ugly
version of exactly this, just as a proof of concept.  You could mount old
versions of the repositories and diff them, etc.  Quite cool.  It's long
since out of date and it adds a layer of caching and performance loss that
I wasn't willing to live with, but it's a cool idea.  When we have more time
than problems I might get back to that.  I think it is the right approach.

As to the comments he made about mixing files, that's not a problem.  You
do need some way to tell UNFSD that this file is to be revision controlled
and that one is not, but with that you can let .o's be created and just
managed in the backing file system.  Works fine.  The interface to
revision control stuff seems ugly because you have to be explicit, but that
can be made nice.  Suppose we used fake subdirectories as a way of doing
operations, such that

        mv *.c ./.checkin

does a checkin, etc.  That's not so bad and you need the interface anyway
to tell the system you are ready to check things in. You don't want it to
check in a new version every time you modify the file, that's excessive.
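
To illustrate the fake-subdirectory idea, here is a small C sketch of the test a user-level server might run on the destination path of a rename. It is an assumption-laden illustration: is_checkin_rename is a hypothetical helper, the magic directory name ".checkin" is taken from the example above, and real NFS rename requests carry a directory handle plus a name rather than a path string.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if new_path places a file directly inside a directory
 * named ".checkin", e.g. "./.checkin/foo.c" or "src/.checkin/foo.c". */
static bool is_checkin_rename(const char *new_path)
{
        assert(new_path != NULL);

        const char *slash = strrchr(new_path, '/');
        if (!slash || slash[1] == '\0')
                return false;           /* no directory part, or no file name */

        /* Walk back to the start of the parent directory's name. */
        const char *dir = slash;
        while (dir > new_path && dir[-1] != '/')
                dir--;

        size_t len = (size_t)(slash - dir);
        return len == strlen(".checkin") && memcmp(dir, ".checkin", len) == 0;
}
```

A server would treat a rename for which this returns true as a request to check the file in, and pass every other rename through to the backing filesystem unchanged.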
--
---
Larry McVoy              lm at bitmover.com           http://www.bitmover.com/lm

Post by Rik van Rie » Wed, 25 Jul 2001 08:40:05



> Rik van Riel wrote:
> > Hmmmm, so it's not completely transparent. Good.

> You only set a global variable to select which configuration
> you want to work on. You can't make it simpler, Rik: everything
> else is transparent: read, write, ...!

*nod*

Sounds like a great idea indeed.

> > Now if you want to make this kernel-accessible, why
> > not make a userland NFS daemon which uses something
> > like bitkeeper or PRCS as its backend ?

> > The system would then look like this:

> >  _____    _______    _____    _____
> > |     |  |       |  |     |  |     |
> > | SCM |--| UNFSD |--| NET |--| NFS |
> > |_____|  |_______|  |_____|  |_____|

> Your architecture is too complex for me.

But you only have to implement 10% of it, the rest already
exists.

You already have:
1) Source Control Management system (SCM)
2) Userland NFS daemon (UNFSD)
3) network layer
4) NFS filesystem support (for every OS!)

All you need is a backend for the NFS server daemon to
get its files from a version control system (the SCM)
instead of from disk.

> > And there, you have a transparent SCM filesystem
> > that works over the network ... without ever having
> > to modify the kernel or implement SCM.

> I can't do it outside the kernel.

So choose the appropriate "magic directories" for the
NFS daemon ... maybe even "magic mount paths" ?

You're looking at reimplementing the 90% which is
already there (the versioning and the filesystem code)
while leaving the other 10% (the management code) for
a later date ;)

regards,

Rik
--
Executive summary of a recent Microsoft press release:
   "we are concerned about the GNU General Public License (GPL)"

                http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/


Post by Keith Owen » Wed, 25 Jul 2001 11:20:09


On Mon, 23 Jul 2001 23:06:34 +0200,

>Handling multiple versions is a tough challenge (...even in the Linux
>kernel). Working under software configuration management (SCM) helps,
>but with some overhead, and it works only if everybody supports it.

FYI, you do not need this for the kernel.  kbuild 2.5 already supports
multiple source trees for building the linux kernel.  Current beta is
http://prdownloads.sourceforge.net/kbuild/kbuild-2.5-2.4.7-2.gz, read
Documentation/kbuild/kbuild-2.5.txt.


Post by Albert D. Cahala » Wed, 25 Jul 2001 14:30:08


Larry McVoy writes:
> b) Filesystem support for SCM is really a flawed approach.  No matter how
>    much you hate all SCM systems out there, shoving the problem into the
>    kernel isn't the answer.  All that means is that you have an ongoing
>    battle to keep your VFS up to date with the kernel.  Ask Rational
>    how much fun that is...

I'm sure it is a pain to maintain, but consider recovery
with revision control in your root filesystem:


Nice, isn't it? You can trash /bin/* all you want.

Distributed filesystems like Coda seem to get pretty close
to having revision control anyway. They need something like
it for conflict resolution.

The traditional revision control approach seems to get pretty
wasteful as well. Maybe you have a few dozen developers, each
with a few files checked out of a multi-gigabyte source tree.
The kernel solution has less trouble sharing resources among
all the developers, especially when people share a machine.


Post by Larry McVo » Wed, 25 Jul 2001 14:40:08


> > b) Filesystem support for SCM is really a flawed approach.  No matter how
> >    much you hate all SCM systems out there, shoving the problem into the
> >    kernel isn't the answer.  All that means is that you have an ongoing
> >    battle to keep your VFS up to date with the kernel.  Ask Rational
> >    how much fun that is...

> I'm sure it is a pain to maintain, but consider recovery
> with revision control in your root filesystem:


> Nice, isn't it? You can trash /bin/* all you want.

Yeah, that's cool.  I'm with you in spirit on this one Albert, I've long
promoted that we use revision control for all the config files (stuff
like /etc/sendmail.cf, etc).

And we have customers who use BitKeeper to manage their entire OS, I mean
all the binaries are in there.

That said, I'd really urge people to listen to Rik, he has the right idea
with the user level NFS idea.  There is no good reason and a lot of bad
reasons to put this stuff in the kernel.

I realize that since this is our business that my credibility is low,
you'll expect that I'm pushing this because it somehow benefits us (how,
I'm not sure, but I have faith that someone will think that).  Anyway,
that's not the case, this is purely from a kernel point of view, I think
this is a dead end.

Useful stuff would be the copy on write file system, that's good for SCM
and other things.  And the user level NFS approach.  That way if you hate
the BK license you can plug PRCS or CVS or my-favorite-SCM system into the
back end.  I'd much rather see that than BK in the kernel.  Yuck.

> Distributed filesystems like Coda seem to get pretty close
> to having revision control anyway. They need something like
> it for conflict resolution.

Yeah!  No kidding.  If Coda had this I think there is a reasonable chance
that most SCM systems would go away.  Certainly the trivial ones would.
--
---
Larry McVoy              lm at bitmover.com           http://www.bitmover.com/lm

Post by Alexander Vir » Wed, 25 Jul 2001 15:10:09



> That said, I'd really urge people to listen to Rik, he has the right idea
> with the user level NFS idea.  There is no good reason and a lot of bad
> reasons to put this stuff in the kernel.
> > Distributed filesystems like Coda seem to get pretty close
> > to having revision control anyway. They need something like
> > it for conflict resolution.

> Yeah!  No kidding.  If Coda had this I think there is a reasonable chance
> that most SCM systems would go away.  Certainly the trivial ones would.

        CODA servers tend to be simpler than NFS ones (stateful protocol,
commit-on-close, all file IO handled by local fs code, you name it).
Full-blown Venus is, indeed, a lurking horror from beyond, but that's a
different story - nightmarish stuff is in the distributed fs part. As a
glue for userland fs CODA wins hands down (BTW, that goes not only for
simplicity of code, but for performance and deadlock avoidance reasons).

        There's a whole shitcan of worms around the semantics of versioned
fs, though - e.g. what happens if you create a link to an old version of
file? What happens if you rename an old version away? What happens if you
rename _over_ it? There are obvious answers to that (e.g. all versions
except the last one are read-only and can be freely moved around or removed;
all association between them is semblance of names), but I doubt that
any of the easy variants will satisfy those who want that stuff. Personally,
I'd go for "you can take a read-only snapshot of a subtree and then bind
its parts anywhere you want", but that's not the only variant and I really
doubt that _any_ variant would satisfy everyone.

        No matter what implementation you choose, semantics will be a fscking
minefield and I'd rather _not_ see that flamewar on l-k. If somebody cares
to set a maillist - great, but let's keep it separate from l-k. This stuff
has a potential for flamewar worse than devfs, forked-files, bk licensing
and CML2 ones combined (and is very likely to resurrect the first two, in
bargain).
