[RFC] 2.5.42 (1/2): Filesystem capabilities kernel patch

Post by Olaf Dietsche » Sat, 19 Oct 2002 21:20:11



This patch adds filesystem capabilities to 2.5.42, but it applies to
2.5.43 as well.

It's very simple. In the root directory of every filesystem, there
must be a file named ".capabilities". This is the capability database
indexed by inode number. These files are populated by a chcap tool,
see next mail.

This fs capability system should work on any filesystem that can
provide long dotted names and has some sort of inode. It also helps
if the filesystem allows holes in files; otherwise the .capabilities
file could grow pretty large.
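
As a rough user-space illustration of the layout described above, the
sketch below treats the database as an array of three 32-bit masks per
inode, in the order the kernel side of the patch reads them (effective,
inheritable, permitted). The struct and function names are hypothetical
and are not taken from the patch or from chcap.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* One database record: three 32-bit capability masks, in the order
 * the kernel code in the patch reads them. Names are illustrative. */
struct fscap_record {
    uint32_t effective;
    uint32_t inheritable;
    uint32_t permitted;
};

/* Byte offset of the record that belongs to inode number `ino`. */
static off_t fscap_offset(ino_t ino)
{
    return (off_t)ino * (off_t)sizeof(struct fscap_record);
}

/* Read the record for `ino` from an open database descriptor.
 * Returns 0 on success, -1 if no complete record could be read
 * (for example because the file ends before that offset). */
static int fscap_lookup(int db_fd, ino_t ino, struct fscap_record *out)
{
    ssize_t n = pread(db_fd, out, sizeof(*out), fscap_offset(ino));
    return n == (ssize_t)sizeof(*out) ? 0 : -1;
}

/* Store the record for `ino`. Writing far beyond the current end of
 * the file leaves a hole on filesystems that support sparse files. */
static int fscap_store(int db_fd, ino_t ino, const struct fscap_record *rec)
{
    ssize_t n = pwrite(db_fd, rec, sizeof(*rec), fscap_offset(ino));
    return n == (ssize_t)sizeof(*rec) ? 0 : -1;
}
```

With 12 bytes per record, a single entry for a high inode number puts
the end of the file at roughly 12 times that number, which is why
support for holes matters for the size of the database.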

I use this on an ext2 filesystem. It boots and seems to work so far.

Comments?

Regards, Olaf.

diff -urN a/security/Config.in b/security/Config.in
--- a/security/Config.in        Sat Oct  5 18:44:05 2002

 #
 mainmenu_option next_comment
 comment 'Security options'
-define_bool CONFIG_SECURITY_CAPABILITIES y
+tristate 'Security Capabilities' CONFIG_SECURITY_CAPABILITIES
+dep_bool '  Filesystem Capabilities (EXPERIMENTAL)' CONFIG_FS_CAPABILITIES $CONFIG_EXPERIMENTAL
 endmenu
diff -urN a/security/capability.c b/security/capability.c
--- a/security/capability.c     Sat Oct 12 14:24:21 2002

 #include <linux/smp_lock.h>
 #include <linux/skbuff.h>
 #include <linux/netlink.h>
+#include <linux/namei.h>

 /* flag to keep track of how we were registered */

        return 0;
 }

+#ifdef CONFIG_FS_CAPABILITIES
+static struct file *open_capabilities(struct linux_binprm *bprm)
+{
+       static char name[] = ".capabilities";
+       struct nameidata nd;
+       int err;
+       nd.mnt = mntget(bprm->file->f_vfsmnt);
+       nd.dentry = dget(nd.mnt->mnt_root);
+//     nd.last_type = LAST_ROOT;
+       nd.flags = 0;
+       err = path_walk(name, &nd);
+       if (err)
+               return ERR_PTR(err);
+
+       return dentry_open(nd.dentry, nd.mnt, O_RDONLY);
+}
+
+static void read_capabilities(struct file *filp, struct linux_binprm *bprm)
+{
+       __u32 fscaps[3];
+       unsigned long ino = bprm->file->f_dentry->d_inode->i_ino;
+       int n = kernel_read(filp, ino * sizeof(fscaps), (char *) fscaps, sizeof(fscaps));
+       if (n == sizeof(fscaps)) {
+               bprm->cap_effective = fscaps[0];
+               bprm->cap_inheritable = fscaps[1];
+               bprm->cap_permitted = fscaps[2];
+       }
+}
+#endif
+
 static int cap_bprm_set_security (struct linux_binprm *bprm)
 {
+#ifdef CONFIG_FS_CAPABILITIES
+       struct file *filp;
+#endif
        /* Copied from fs/exec.c:prepare_binprm. */

-       /* We don't have VFS support for capabilities yet */
        cap_clear (bprm->cap_inheritable);
        cap_clear (bprm->cap_permitted);
        cap_clear (bprm->cap_effective);
+#ifdef CONFIG_FS_CAPABILITIES
+       filp = open_capabilities(bprm);
+       if (filp && !IS_ERR(filp)) {
+               read_capabilities(filp, bprm);
+               filp_close(filp, 0);
+       }
+#endif

        /*  To support inheritance of root-permissions and suid-root
         *  executables under compatibility mode, we raise all three
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Post by Alexander Viro » Sun, 20 Oct 2002 01:10:06



> This patch adds filesystem capabilities to 2.5.42, but it applies to
> 2.5.43 as well.

> It's very simple. In the root directory of every filesystem, there
> must be a file named ".capabilities". This is the capability database
> indexed by inode number. These files are populated by a chcap tool,
> see next mail.

> This fs capability system should work on any filesystem that can
> provide long dotted names and has some sort of inode. It also helps
> if the filesystem allows holes in files; otherwise the .capabilities
> file could grow pretty large.

> I use this on an ext2 filesystem. It boots and seems to work so far.

> Comments?

His-fscking-terical.  Seriously, what comments do you expect?  To start
with, on a bunch of filesystems inode numbers are unstable.  Moreover,
owner of that file suddenly gets _all_ capabilities that exist in the
system, ditto for any task capable of mount(2), ditto for owner of
root directory on some filesystem.  And there is no way to recognize
that file as such, so additional checks on write(), mount(), unlink(),
etc. are not possible.  And that is not to mention that binding of
non-root will play silly *s with the entire scheme.

IOW, idea is unsalvageable.


Post by Olaf Dietsche » Sun, 20 Oct 2002 02:20:06




>> This patch adds filesystem capabilities to 2.5.42, but it applies to
>> 2.5.43 as well.

>> It's very simple. In the root directory of every filesystem, there
>> must be a file named ".capabilities". This is the capability database
>> indexed by inode number. These files are populated by a chcap tool,
>> see next mail.

>> This fs capability system should work on any filesystem that can
>> provide long dotted names and has some sort of inode. It also helps
>> if the filesystem allows holes in files; otherwise the .capabilities
>> file could grow pretty large.

>> I use this on an ext2 filesystem. It boots and seems to work so far.

>> Comments?

> His-fscking-terical.

Yes, I like it very much, too ;-)

> Seriously, what comments do you expect?

Seriously, I'm more or less a newbie in this area, so I want thoughts
and suggestions from more experienced people. That's what this list is
about, isn't it?

> To start
> with, on a bunch of filesystems inode numbers are unstable.

Not really a problem, so restrict it to stable inode systems only.

> Moreover,
> owner of that file suddenly gets _all_ capabilities that exist in the
> system,

Yup, like root for example.

> ditto for any task capable of mount(2),

How's that? I think this task must own the filesystem and root
directory too.

> ditto for owner of
> root directory on some filesystem.

Which is a problem for foreign (network) filesystems only. Should be
solvable with a mount option (i.e. mount -o nocaps ...).

> And there is no way to recognize
> that file as such, so additional checks on write(), mount(), unlink(),
> etc. are not possible.

Depends on whether I want to recognize it and do these checks. Anyway,
could be solved with a mount option too or something like quotactl(2)
maybe.

> And that is not to mention that binding of
> non-root will play silly *s with the entire scheme.

I don't understand this sentence. What do you mean by "binding of
non-root"?

> IOW, idea is unsalvageable.

I'm working on it. Thanks for sharing your thoughts.

Regards, Olaf.

Post by Alexander Viro » Sun, 20 Oct 2002 02:30:10



> > To start
> > with, on a bunch of filesystems inode numbers are unstable.

> Not really a problem, so restrict it to stable inode systems only.

So exec.c code should go looking for fs type and try and match it
against some table?  OK...

> > Moreover,
> > owner of that file suddenly gets _all_ capabilities that exist in the
> > system,

> Yup, like root for example.

> > ditto for any task capable of mount(2),

> How's that? I think this task must own the filesystem and root
> directory too.

mount --bind my_file /usr/.capabilities

> > ditto for owner of
> > root directory on some filesystem.

> Which is a problem for foreign (network) filesystems only. Should be
> solvable with a mount option (i.e. mount -o nocaps ...).

> > And there is no way to recognize
> > that file as such, so additional checks on write(), mount(), unlink(),
> > etc. are not possible.

> Depends on whether I want to recognize it and do these checks. Anyway,
> could be solved with a mount option too or something like quotactl(2)
> maybe.

Ahem.  You had made several capabilities equivalent to "everything".
E.g. "anyone who can override checks in chown() can set arbitrary
capabilities", etc.  Which changes model big way and makes the affected
capabilities pretty much useless - they can be elevated to any other
capability.

> > And that is not to mention that binding of
> > non-root will play silly *s with the entire scheme.

> I don't understand this sentence. What do you mean by "binding of
> non-root"?

mount --bind /usr/bin /mnt
Suddenly /mnt/foo and /usr/bin/foo (same file) have different capabilities.
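
To make the bind-mount point concrete: both paths name one inode, yet
the first patch starts its lookup of ".capabilities" at the root of the
mount the binary was opened through (nd.mnt->mnt_root), so, as far as
can be read from that code, the two paths can end up consulting
different database files. The helper below is a hypothetical user-space
sketch, not part of the patch; it only checks that two paths refer to
the same file.

```c
#include <sys/stat.h>

/* Returns 1 if both paths name the same inode on the same device,
 * 0 if they do not, and -1 if either stat() call fails. */
static int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

After the bind mount shown above, same_file("/mnt/foo", "/usr/bin/foo")
would be expected to return 1, so a database keyed by inode number
should give the same answer for both paths.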


Post by Andreas Gruenbache » Mon, 21 Oct 2002 02:30:09



> This patch adds filesystem capabilities to 2.5.42, but it applies to
> 2.5.43 as well.

> It's very simple. In the root directory of every filesystem, there
> must be a file named ".capabilities". This is the capability database
> indexed by inode number. These files are populated by a chcap tool,
> see next mail.

> This fs capability system should work on any filesystem that can
> provide long dotted names and has some sort of inode. It also helps
> if the filesystem allows holes in files; otherwise the .capabilities
> file could grow pretty large.

> I use this on an ext2 filesystem. It boots and seems to work so far.

> Comments?

Capabilities should be implemented as extended attributes; see Ted's recent
postings. Adding the necessary kernel infrastructure as extended attributes
is pretty simple. We will need to spend more time on producing good user
space tools, and figuring out ways so that the whole thing remains
manageable.

--Andreas.


Post by Olaf Dietsche » Tue, 22 Oct 2002 23:50:12



> Capabilities should be implemented as extended attributes;

Why "should" this be implemented as extended attributes? What are the
benefits in doing so?

> see Ted's recent postings.

Ted's recent postings argue against capabilities altogether. So what do
you mean?

Regards, Olaf.

Post by Andreas Gruenbache » Wed, 23 Oct 2002 00:10:10


Hi,

I believe that Capabilities on the file system are a useful thing. They
obviously also are quite controversial. If deployed without the right tools
they may certainly lead to less secure systems. So these supporting tools
need to be developed first, and some real-world experience seems necessary to
learn more.

Whatever the result of this process will be, should we decide to have
filesystem capabilities we would need to associate some pieces of information
with individual inodes, and this is exactly what Extended Attributes were
designed for. There are implementations for ext2, ext3, jfs, xfs, reiserfs,
so I think it makes no sense to reinvent the wheel. (Xattrs (or EAs) were
actually not invented for Linux; Irix and other OSes support almost identical
schemes.)

Do you happen to know the attr(5) manual page? An online version is available
at <http://acl.bestbits.at/cgi-man/attr.5>; perhaps that helps.
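
For comparison with the inode-indexed file, here is a minimal
user-space sketch of keeping the same three masks in an extended
attribute on the executable itself. It uses the Linux setxattr(2) and
getxattr(2) calls; the attribute name "user.fscaps_demo" and the struct
are hypothetical, chosen only for illustration, and the masks are
stored in host byte order.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical attribute name, used for illustration only. */
#define DEMO_CAPS_XATTR "user.fscaps_demo"

struct demo_caps {
    uint32_t effective;
    uint32_t inheritable;
    uint32_t permitted;
};

/* Attach the three masks to the file itself. Returns 0 on success,
 * -1 with errno set otherwise (for example when the filesystem has
 * no extended attribute support). */
static int demo_caps_set(const char *path, const struct demo_caps *caps)
{
    return setxattr(path, DEMO_CAPS_XATTR, caps, sizeof(*caps), 0);
}

/* Read the masks back. Returns 0 on success, -1 if the attribute is
 * missing, has an unexpected size, or the call fails. */
static int demo_caps_get(const char *path, struct demo_caps *caps)
{
    ssize_t n = getxattr(path, DEMO_CAPS_XATTR, caps, sizeof(*caps));
    return n == (ssize_t)sizeof(*caps) ? 0 : -1;
}
```

Because the data travels with the inode, there is no separate database
file to locate, protect, or keep in step with inode numbers.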

--Andreas.



> > Capabilities should be implemented as extended attributes;

> Why "should" this be implemented as extended attributes? What are the
> benefits in doing so?

> > see Ted's recent postings.

> Ted's recent postings argue against capabilities altogether. So what do
> you mean?

> Regards, Olaf.


Post by Olaf Dietsche » Fri, 25 Oct 2002 14:30:17


Alexander Viro <v...@math.psu.edu> writes:
> On Sat, 19 Oct 2002, Olaf Dietsche wrote:

> mount --bind my_file /usr/.capabilities

This is still open.

>> > ditto for owner of
>> > root directory on some filesystem.

>> Which is a problem for foreign (network) filesystems only. Should be
>> solvable with a mount option (i.e. mount -o nocaps ...).

I use option nosuid for now.
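
The patch below implements this by refusing to open the database when
the mount carries MNT_NOSUID. From user space, the same mount flag can
be inspected with statvfs(3); the helper name here is hypothetical and
the code is only a sketch of that check.

```c
#include <sys/statvfs.h>

/* Returns 1 if the filesystem holding `path` is mounted nosuid,
 * 0 if it is not, and -1 if statvfs() fails. */
static int mounted_nosuid(const char *path)
{
    struct statvfs sv;

    if (statvfs(path, &sv) != 0)
        return -1;
    return (sv.f_flag & ST_NOSUID) ? 1 : 0;
}
```

A result of 1 for the directory holding a binary would mean that, with
this patch, no capabilities are read for it at exec time.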

>> > And there is no way to recognize
>> > that file as such, so additional checks on write(), mount(), unlink(),
>> > etc. are not possible.

>> Depends on whether I want to recognize it and do these checks. Anyway,
>> could be solved with a mount option too or something like quotactl(2)
>> maybe.

> Ahem.  You had made several capabilities equivalent to "everything".
> E.g. "anyone who can override checks in chown() can set arbitrary
> capabilities", etc.  Which changes model big way and makes the affected
> capabilities pretty much useless - they can be elevated to any other
> capability.

Which is nothing new, as I learnt recently: CAP_SYS_RAWIO and
CAP_SYS_MODULE, for example, not to mention CAP_SETPCAP. But maybe
pinning the .capabilities inode in memory and providing an appropriate
i_op could be a solution?

> mount --bind /usr/bin /mnt
> Suddenly /mnt/foo and /usr/bin/foo (same file) have different capabilities.

I use super_block->s_root, so this is no problem anymore.

It now drops a file's capabilities when a chown() or open() is done.
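
The drop step can be pictured in user space roughly as follows. This is
a sketch that mirrors what __fscap_drop() in the patch below appears to
do (read the record, and zero it only if any mask is set); the function
name and return values are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Zero the record for `ino` if any of its three masks is set.
 * Returns 1 if a record was cleared, 0 if there was nothing to
 * clear, and -1 if writing the cleared record failed. */
static int demo_drop(int db_fd, ino_t ino)
{
    uint32_t rec[3];
    off_t off = (off_t)ino * (off_t)sizeof(rec);

    if (pread(db_fd, rec, sizeof(rec), off) != (ssize_t)sizeof(rec))
        return 0;   /* no complete record stored for this inode */
    if (!(rec[0] | rec[1] | rec[2]))
        return 0;   /* record is already empty */

    rec[0] = rec[1] = rec[2] = 0;
    if (pwrite(db_fd, rec, sizeof(rec), off) != (ssize_t)sizeof(rec))
        return -1;
    return 1;
}
```

Skipping the write when the record is already zero avoids touching the
database on every chown() or open() of a file without capabilities.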

Well, here is my next try against 2.5.44. It's slightly better than
before, boots and seems to work as expected :-)

Regards, Olaf.

diff -urN a/fs/Config.in b/fs/Config.in
--- a/fs/Config.in      Wed Oct 16 09:23:41 2002
+++ b/fs/Config.in      Thu Oct 24 00:11:51 2002
@@ -108,6 +108,8 @@
    define_bool CONFIG_QUOTACTL y
 fi

+dep_bool 'Filesystem capabilities (Experimental)' CONFIG_FS_CAPABILITIES $CONFIG_EXPERIMENTAL
+
 if [ "$CONFIG_NET" = "y" ]; then

    mainmenu_option next_comment
diff -urN a/fs/Makefile b/fs/Makefile
--- a/fs/Makefile       Sat Oct 19 13:52:46 2002
+++ b/fs/Makefile       Thu Oct 24 00:11:51 2002
@@ -6,7 +6,7 @@
 #

 export-objs := open.o dcache.o buffer.o bio.o inode.o dquot.o mpage.o aio.o \
-                fcntl.o read_write.o dcookies.o
+                fcntl.o read_write.o dcookies.o fscaps.o

 obj-y :=       open.o read_write.o devices.o file_table.o buffer.o \
                bio.o super.o block_dev.o char_dev.o stat.o exec.o pipe.o \
@@ -41,7 +41,8 @@
 obj-y                          += devpts/

 obj-$(CONFIG_PROFILING)                += dcookies.o
-
+obj-$(CONFIG_FS_CAPABILITIES)  += fscaps.o
+
 # Do not add any filesystems before this line
 obj-$(CONFIG_EXT3_FS)          += ext3/ # Before ext2 so root fs can be ext3
 obj-$(CONFIG_JBD)              += jbd/
diff -urN a/fs/attr.c b/fs/attr.c
--- a/fs/attr.c Sat Oct  5 18:44:20 2002
+++ b/fs/attr.c Thu Oct 24 00:11:51 2002
@@ -13,6 +13,7 @@
 #include <linux/fcntl.h>
 #include <linux/quotaops.h>
 #include <linux/security.h>
+#include <linux/fscaps.h>

 /* Taken over from the old code... */

@@ -170,6 +171,9 @@
        }
        if (!error) {
                unsigned long dn_mask = setattr_mask(ia_valid);
+               if (ia_valid & (ATTR_KILL_SUID | ATTR_KILL_SGID))
+                       fscap_drop(dentry);
+
                if (dn_mask)
                        dnotify_parent(dentry, dn_mask);
        }
diff -urN a/fs/fscaps.c b/fs/fscaps.c
--- a/fs/fscaps.c       Thu Jan  1 01:00:00 1970
+++ b/fs/fscaps.c       Thu Oct 24 13:44:37 2002
@@ -0,0 +1,102 @@
+/*
+ * Copyright (c) 2002 Olaf Dietsche
+ *
+ * Filesystem capabilities for linux.
+ */
+
+#include <linux/fscaps.h>
+#include <linux/module.h>
+#include <linux/binfmts.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <asm/uaccess.h>
+
+static int __fscap_lookup(struct vfsmount *mnt, struct nameidata *nd)
+{
+       static char name[] = ".capabilities";
+       nd->mnt = mntget(mnt);
+       nd->dentry = dget(mnt->mnt_sb->s_root);
+       nd->flags = 0;
+       return path_walk(name, nd);
+}
+
+static struct file *__fscap_open(struct dentry *de, struct vfsmount *mnt, int flags)
+{
+       struct inode *inode = de->d_inode;
+       if (mnt->mnt_flags & MNT_NOSUID)
+               return ERR_PTR(-EPERM);
+
+       if (inode->i_uid != 0 || inode->i_gid != 0)
+               return ERR_PTR(-EPERM);
+
+       if ((inode->i_mode & 077) != 0)
+               return ERR_PTR(-EACCES);
+
+       return dentry_open(de, mnt, flags);
+}
+
+static void __fscap_read(struct file *filp, struct linux_binprm *bprm)
+{
+       __u32 fscaps[3];
+       unsigned long ino = bprm->file->f_dentry->d_inode->i_ino;
+       int n = kernel_read(filp, ino * sizeof(fscaps), (char *) fscaps, sizeof(fscaps));
+       if (n == sizeof(fscaps)) {
+               bprm->cap_effective = fscaps[0];
+               bprm->cap_inheritable = fscaps[1];
+               bprm->cap_permitted = fscaps[2];
+       }
+}
+
+static int kernel_write(struct file *file, unsigned long offset,
+                char *addr, unsigned long count)
+{
+       mm_segment_t old_fs;
+       loff_t pos = offset;
+       int result;
+
+       old_fs = get_fs();
+       set_fs(get_ds());
+       result = vfs_write(file, addr, count, &pos);
+       set_fs(old_fs);
+       return result;
+}
+
+static void __fscap_drop(struct file *filp, struct dentry *de)
+{
+       __u32 fscaps[3];
+       unsigned long ino = de->d_inode->i_ino;
+       int n = kernel_read(filp, ino * sizeof(fscaps), (char *) fscaps, sizeof(fscaps));
+       if (n == sizeof(fscaps) && (fscaps[0] || fscaps[1] || fscaps[2])) {
+               fscaps[0] = fscaps[1] = fscaps[2] = 0;
+               kernel_write(filp, ino * sizeof(fscaps), (char *) fscaps, sizeof(fscaps));
+       }
+}
+
+void fscap_read(struct linux_binprm *bprm)
+{
+       struct nameidata nd;
+       int err = __fscap_lookup(bprm->file->f_vfsmnt, &nd);
+       if (!err) {
+               struct file *filp = __fscap_open(nd.dentry, nd.mnt, O_RDONLY);
+               if (filp && !IS_ERR(filp)) {
+                       __fscap_read(filp, bprm);
+                       filp_close(filp, 0);
+               }
+       }
+}
+
+void fscap_drop(struct dentry *de)
+{
+       struct nameidata nd;
+       int err = __fscap_lookup(de->d_sb->s_rootmnt, &nd);
+       if (!err) {
+               struct file *filp = __fscap_open(nd.dentry, nd.mnt, O_RDWR);
+               if (filp && !IS_ERR(filp)) {
+                       __fscap_drop(filp, de);
+                       filp_close(filp, 0);
+               }
+       }
+}
+
+EXPORT_SYMBOL(fscap_read);
+EXPORT_SYMBOL(fscap_drop);
diff -urN a/fs/open.c b/fs/open.c
--- a/fs/open.c Wed Oct 16 09:23:41 2002
+++ b/fs/open.c Thu Oct 24 13:53:25 2002
@@ -17,6 +17,7 @@
 #include <linux/namei.h>
 #include <linux/backing-dev.h>
 #include <linux/security.h>
+#include <linux/fscaps.h>

 #include <asm/uaccess.h>

@@ -665,6 +666,9 @@
                                goto cleanup_all;
                }
        }
+
+       if (flags & O_CREAT)
+               fscap_drop(dentry);

        return f;

diff -urN a/fs/super.c b/fs/super.c
--- a/fs/super.c        Sat Oct  5 18:45:22 2002
+++ b/fs/super.c        Thu Oct 24 00:11:51 2002
@@ -72,6 +72,7 @@
                s->s_maxbytes = MAX_NON_LFS;
                s->dq_op = sb_dquot_ops;
                s->s_qcop = sb_quotactl_ops;
+               s->s_rootmnt = NULL;
        }
 out:
        return s;
@@ -619,6 +620,7 @@
        sb = type->get_sb(type, flags, name, data);
        if (IS_ERR(sb))
                goto out_mnt;
+       sb->s_rootmnt = mnt;
        mnt->mnt_sb = sb;
        mnt->mnt_root = dget(sb->s_root);
        mnt->mnt_mountpoint = sb->s_root;
diff -urN a/include/linux/fs.h b/include/linux/fs.h
--- a/include/linux/fs.h        Sat Oct 19 13:52:47 2002
+++ b/include/linux/fs.h        Thu Oct 24 00:11:51 2002
@@ -650,6 +650,7 @@
        unsigned long           s_flags;
        unsigned long           s_magic;
        struct dentry           *s_root;
+       struct vfsmount         *s_rootmnt;
        struct rw_semaphore     s_umount;
        struct semaphore        s_lock;
        int                     s_count;
diff -urN a/include/linux/fscaps.h b/include/linux/fscaps.h
--- a/include/linux/fscaps.h    Thu Jan  1 01:00:00 1970
+++ b/include/linux/fscaps.h    Thu Oct 24 13:44:52 2002
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2002 Olaf Dietsche
+ *
+ * Filesystem capabilities for linux.
+ */
+
+#ifndef _LINUX_FS_CAPS_H
+#define _LINUX_FS_CAPS_H
+
+#include <linux/config.h>
+
+struct linux_binprm;
+struct dentry;
+
+#if defined(CONFIG_FS_CAPABILITIES) || defined(CONFIG_FS_CAPABILITIES_MODULE)
+extern void fscap_read(struct linux_binprm *bprm);
+extern void fscap_drop(struct dentry *de);
+#else
+/* !CONFIG_FS_CAPABILITIES */
+static inline void fscap_read(struct linux_binprm *bprm) {}
+static inline void fscap_drop(struct dentry *de) {}
+#endif
+
+#endif
diff -urN a/security/capability.c b/security/capability.c
--- a/security/capability.c     Sat Oct 19 13:52:48 2002
+++ b/security/capability.c     Thu Oct 24 00:11:51 2002
@@ -18,6 +18,7 @@
 #include <linux/smp_lock.h>
 #include <linux/skbuff.h>
 #include <linux/netlink.h>
+#include <linux/fscaps.h>

 /* flag to keep track of how we were registered */
 static int secondary;
@@ -119,10 +120,11 @@
 {
        /* Copied from fs/exec.c:prepare_binprm. */

-       /* We don't have VFS support for capabilities yet */
        cap_clear (bprm->cap_inheritable);
        cap_clear (bprm->cap_permitted);
        cap_clear (bprm->cap_effective);
+
+       fscap_read(bprm);

        /*  To support inheritance of root-permissions and suid-root
         *  executables under compatibility mode, we raise all three

Post by Olaf Dietsche » Wed, 30 Oct 2002 01:00:12


Well, here I am again monologising :-)

Olaf Dietsche <olaf.dietsche#list.linux-ker...@t-online.de> writes:
> Alexander Viro <v...@math.psu.edu> writes:

>> On Sat, 19 Oct 2002, Olaf Dietsche wrote:

>> mount --bind my_file /usr/.capabilities

> This is still open.

Fixed with this version.

This is as good as it gets without digging deeper into filesystem
details. Right now, I'm running this code and I'm content with this
patch. Thanks to Al Viro for pointing out the weakness of earlier
versions.

Solving the last issue (checking access to the capabilities database)
involves filesystem support, I guess. So, this will be the next step
to address.

If you're careful about giving away capabilities, however, this patch
can make your system more secure than it is now. But this isn't fully
explored, so you might achieve the opposite and open new security
holes.
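
The help text in the patch below gives "chcap cap_net_raw=ep /bin/ping"
as its example. As a sketch of what such a grant could look like in the
three-word record, assuming chcap stores plain bit masks in the order
the kernel reads them (effective, inheritable, permitted): CAP_NET_RAW
is capability number 13 in <linux/capability.h>, and "=ep" sets it in
the effective and permitted sets only. The names below are hypothetical
and the chcap tool itself is not shown in this mail.

```c
#include <stdint.h>

/* CAP_NET_RAW is capability number 13 in <linux/capability.h>. */
#define DEMO_CAP_NET_RAW 13

/* Bit mask for a single capability number below 32. */
static uint32_t demo_cap_mask(unsigned cap)
{
    return UINT32_C(1) << cap;
}

/* Fill a three-word record (effective, inheritable, permitted) with
 * the given capability in the effective and permitted sets only. */
static void demo_record_ep(unsigned cap, uint32_t rec[3])
{
    rec[0] = demo_cap_mask(cap);  /* effective */
    rec[1] = 0;                   /* inheritable */
    rec[2] = demo_cap_mask(cap);  /* permitted */
}
```

Granting one bit like this is far narrower than a setuid-root binary,
which is the case for care made above: every extra bit in the record
widens what the executable may do.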

If you have comments, feedback, suggestions, let me know.

Regards, Olaf.

diff -urN a/fs/Config.help b/fs/Config.help
--- a/fs/Config.help    Wed Oct 16 09:23:41 2002
+++ b/fs/Config.help    Mon Oct 28 14:12:02 2002
@@ -1152,3 +1152,15 @@

   If unsure, say N.

+CONFIG_FS_CAPABILITIES
+  If you say Y here, you will be able to grant selective privileges to
+  executables on a needed basis. This means for some executables,
+  there is no need anymore to run as root or as a suid binary.
+
+  For example, you may drop the SUID bit from ping and grant the
+  CAP_NET_RAW capability:
+  # chmod u-s /bin/ping
+  # chcap cap_net_raw=ep /bin/ping
+
+  If you're unsure, say N.
+
diff -urN a/fs/Config.in b/fs/Config.in
--- a/fs/Config.in      Wed Oct 16 09:23:41 2002
+++ b/fs/Config.in      Thu Oct 24 00:11:51 2002
@@ -108,6 +108,8 @@
    define_bool CONFIG_QUOTACTL y
 fi

+dep_bool 'Filesystem capabilities (Experimental)' CONFIG_FS_CAPABILITIES $CONFIG_EXPERIMENTAL
+
 if [ "$CONFIG_NET" = "y" ]; then

    mainmenu_option next_comment
diff -urN a/fs/Makefile b/fs/Makefile
--- a/fs/Makefile       Sat Oct 19 13:52:46 2002
+++ b/fs/Makefile       Thu Oct 24 00:11:51 2002
@@ -6,7 +6,7 @@
 #

 export-objs := open.o dcache.o buffer.o bio.o inode.o dquot.o mpage.o aio.o \
-                fcntl.o read_write.o dcookies.o
+                fcntl.o read_write.o dcookies.o fscaps.o

 obj-y :=       open.o read_write.o devices.o file_table.o buffer.o \
                bio.o super.o block_dev.o char_dev.o stat.o exec.o pipe.o \
@@ -41,7 +41,8 @@
 obj-y                          += devpts/

 obj-$(CONFIG_PROFILING)                += dcookies.o
-
+obj-$(CONFIG_FS_CAPABILITIES)  += fscaps.o
+
 # Do not add any filesystems before this line
 obj-$(CONFIG_EXT3_FS)          += ext3/ # Before ext2 so root fs can be ext3
 obj-$(CONFIG_JBD)              += jbd/
diff -urN a/fs/attr.c b/fs/attr.c
--- a/fs/attr.c Sat Oct  5 18:44:20 2002
+++ b/fs/attr.c Mon Oct 28 14:52:02 2002
@@ -13,6 +13,7 @@
 #include <linux/fcntl.h>
 #include <linux/quotaops.h>
 #include <linux/security.h>
+#include <linux/fscaps.h>

 /* Taken over from the old code... */

@@ -170,6 +171,9 @@
        }
        if (!error) {
                unsigned long dn_mask = setattr_mask(ia_valid);
+               if (ia_valid & (ATTR_KILL_SUID | ATTR_KILL_SGID))
+                       fscap_drop(inode);
+
                if (dn_mask)
                        dnotify_parent(dentry, dn_mask);
        }
diff -urN a/fs/fscaps.c b/fs/fscaps.c
--- a/fs/fscaps.c       Thu Jan  1 01:00:00 1970
+++ b/fs/fscaps.c       Mon Oct 28 15:58:37 2002
@@ -0,0 +1,155 @@
+/*
+ * Copyright (c) 2002 Olaf Dietsche
+ *
+ * Filesystem capabilities for linux.
+ */
+
+#include <linux/fscaps.h>
+#include <linux/module.h>
+#include <linux/binfmts.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/slab.h>
+#include <asm/uaccess.h>
+
+struct fscap_info {
+       struct vfsmount *mnt;
+       struct dentry *dentry;
+};
+
+static void __info_init(struct vfsmount *mnt, struct dentry *dentry)
+{
+       struct fscap_info *info = kmalloc(sizeof(struct fscap_info), GFP_KERNEL);
+       if (info) {
+               info->mnt = mnt;
+               info->dentry = dget(dentry);
+               mnt->mnt_sb->s_fscaps = info;
+       }
+}
+
+static void __info_free(struct fscap_info *info)
+{
+       if (info) {
+               dput(info->dentry);
+               kfree(info);
+       }
+}
+
+static inline struct fscap_info *__info_lookup(struct super_block *sb)
+{
+       return sb->s_fscaps;
+}
+
+static int __fscap_lookup(struct vfsmount *mnt, struct nameidata *nd)
+{
+       static char name[] = ".capabilities";
+       nd->mnt = mntget(mnt);
+       nd->dentry = dget(mnt->mnt_sb->s_root);
+       nd->flags = 0;
+       return path_walk(name, nd);
+}
+
+static struct file *__fscap_open(struct dentry *dentry, struct vfsmount *mnt, int flags)
+{
+       struct inode *inode = dentry->d_inode;
+       if (mnt->mnt_flags & MNT_NOSUID)
+               return ERR_PTR(-EPERM);
+
+       if (inode->i_uid != 0 || inode->i_gid != 0)
+               return ERR_PTR(-EPERM);
+
+       if ((inode->i_mode & 077) != 0)
+               return ERR_PTR(-EACCES);
+
+       dentry = dget(dentry);
+       mnt = mntget(mnt);
+       return dentry_open(dentry, mnt, flags);
+}
+
+static void __fscap_read(struct file *filp, struct linux_binprm *bprm)
+{
+       __u32 fscaps[3];
+       unsigned long ino = bprm->file->f_dentry->d_inode->i_ino;
+       int n = kernel_read(filp, ino * sizeof(fscaps), (char *) fscaps, sizeof(fscaps));
+       if (n == sizeof(fscaps)) {
+               bprm->cap_effective = fscaps[0];
+               bprm->cap_inheritable = fscaps[1];
+               bprm->cap_permitted = fscaps[2];
+       }
+}
+
+static int kernel_write(struct file *file, unsigned long offset,
+                char *addr, unsigned long count)
+{
+       mm_segment_t old_fs;
+       loff_t pos = offset;
+       int result;
+
+       old_fs = get_fs();
+       set_fs(get_ds());
+       result = vfs_write(file, addr, count, &pos);
+       set_fs(old_fs);
+       return result;
+}
+
+static void __fscap_drop(struct file *filp, struct inode *inode)
+{
+       __u32 fscaps[3];
+       unsigned long ino = inode->i_ino;
+       int n = kernel_read(filp, ino * sizeof(fscaps), (char *) fscaps, sizeof(fscaps));
+       if (n == sizeof(fscaps) && (fscaps[0] || fscaps[1] || fscaps[2])) {
+               fscaps[0] = fscaps[1] = fscaps[2] = 0;
+               kernel_write(filp, ino * sizeof(fscaps), (char *) fscaps, sizeof(fscaps));
+       }
+}
+
+void fscap_mount(struct vfsmount *mnt)
+{
+       struct nameidata nd;
+       if (__info_lookup(mnt->mnt_sb))
+               return;
+
+       if (__fscap_lookup(mnt, &nd))
+               return;
+
+       __info_init(mnt, nd.dentry);
+}
+
+void fscap_umount(struct super_block *sb)
+{
+       struct fscap_info *info = __info_lookup(sb);
+       __info_free(info);
+}
+
+void fscap_read(struct linux_binprm *bprm)
+{
+       struct file *filp;
+       struct fscap_info *info = __info_lookup(bprm->file->f_vfsmnt->mnt_sb);
+       if (!info || !info->dentry)
+               return;
+
+       filp = __fscap_open(info->dentry, info->mnt, O_RDONLY);
+       if (filp && !IS_ERR(filp)) {
+               __fscap_read(filp, bprm);
+               filp_close(filp, 0);
+       }
+}
+
+void fscap_drop(struct inode *inode)
+{
+       struct file *filp;
+       struct fscap_info *info = __info_lookup(inode->i_sb);
+       if (!info || !info->dentry)
+               return;
+
+       filp = __fscap_open(info->dentry, info->mnt, O_RDWR);
+       if (filp && !IS_ERR(filp)) {
+               __fscap_drop(filp, inode);
+               filp_close(filp, 0);
+       }
+}
+
+EXPORT_SYMBOL(fscap_mount);
+EXPORT_SYMBOL(fscap_umount);
+EXPORT_SYMBOL(fscap_read);
+EXPORT_SYMBOL(fscap_drop);
diff -urN a/fs/namespace.c b/fs/namespace.c
--- a/fs/namespace.c    Sat Oct  5 18:45:36 2002
+++ b/fs/namespace.c    Sun Oct 27 19:29:58 2002
@@ -19,6 +19,7 @@
 #include <linux/seq_file.h>
 #include <linux/namespace.h>
 #include <linux/namei.h>
+#include <linux/fscaps.h>

 #include <asm/uaccess.h>

@@ -340,6 +341,7 @@
                lock_kernel();
                DQUOT_OFF(sb);
                acct_auto_close(sb);
+               fscap_umount(sb);
                unlock_kernel();
                security_ops->sb_umount_close(mnt);
                spin_lock(&dcache_lock);
@@ -655,6 +657,8 @@

        mnt->mnt_flags = mnt_flags;
        err = graft_tree(mnt, nd);
+       if (!err)
+               fscap_mount(mnt);
 unlock:
        up_write(&current->namespace->sem);
        mntput(mnt);
diff -urN a/fs/open.c b/fs/open.c
--- a/fs/open.c Wed Oct 16 09:23:41 2002
+++ b/fs/open.c Mon Oct 28 14:52:55 2002
@@ -17,6 +17,7 @@
 #include <linux/namei.h>
 #include <linux/backing-dev.h>
 #include <linux/security.h>
+#include <linux/fscaps.h>

 #include <asm/uaccess.h>

@@ -665,6 +666,9 @@
                                goto cleanup_all;
                }
        }
+
+       if (flags & O_CREAT)
+               fscap_drop(inode);

        return f;

diff -urN a/fs/super.c b/fs/super.c
--- a/fs/super.c        Sat Oct  5 18:45:22 2002
+++ b/fs/super.c        Mon Oct 28 11:25:05 2002
@@ -72,6 +72,7 @@
                s->s_maxbytes = MAX_NON_LFS;
                s->dq_op = sb_dquot_ops;
                s->s_qcop = sb_quotactl_ops;
+               s->s_fscaps = NULL;
        }
 out:
        return s;
diff -urN a/include/linux/fs.h b/include/linux/fs.h
--- a/include/linux/fs.h        Sat Oct 19 13:52:47 2002
+++ b/include/linux/fs.h        Mon Oct 28 11:25:49 2002
@@ -665,6 +665,7 @@
        struct block_device     *s_bdev;
        struct list_head        s_instances;
        struct quota_info       s_dquot;        /* Diskquota specific options */
+       struct fscap_info       *s_fscaps;      /* Filesystem capability stuff */

        char s_id[32];                          /* Informational name */

diff -urN a/include/linux/fscaps.h b/include/linux/fscaps.h
--- a/include/linux/fscaps.h    Thu Jan  1 01:00:00 1970
+++ b/include/linux/fscaps.h    Mon Oct 28 14:46:22 2002
@@ -0,0 +1,30 @@
+/*
+ * Copyright (c) 2002 Olaf Dietsche
+ *
+ * Filesystem capabilities for linux.
+ */
+
+#ifndef _LINUX_FS_CAPS_H
+#define _LINUX_FS_CAPS_H
+
+#include <linux/config.h>
+
+struct vfsmount;
+struct super_block;
+struct linux_binprm;
+struct inode;
+
+#if defined(CONFIG_FS_CAPABILITIES) || defined(CONFIG_FS_CAPABILITIES_MODULE)
+extern void fscap_mount(struct vfsmount *mnt);
+extern void fscap_umount(struct super_block *sb);
+extern void fscap_read(struct linux_binprm *bprm);
+extern void fscap_drop(struct inode *inode);
+#else  
+/* !CONFIG_FS_CAPABILITIES */
+static inline void fscap_mount(struct vfsmount *mnt) {}
+static inline void fscap_umount(struct super_block *sb) {}
+static inline void fscap_read(struct linux_binprm *bprm) {}
+static inline void fscap_drop(struct inode *inode) {}
+#endif
+
+#endif
diff -urN a/security/capability.c b/security/capability.c
--- a/security/capability.c     Sat Oct 19 13:52:48 2002
+++ b/security/capability.c     Thu Oct 24 00:11:51 2002
@@ -18,6 +18,7 @@
 #include <linux/smp_lock.h>
 #include <linux/skbuff.h>
 #include <linux/netlink.h>
+#include <linux/fscaps.h>

 /* flag to keep track
...


[RFC] 2.5.42 (1/2): Filesystem capabilities kernel patch

Post by <ch.. » Wed, 30 Oct 2002 01:40:07



> Solving the last issue (checking access to the capabilities database)
> involves filesystem support, I guess. So, this will be the next step
> to address.

> If you're careful with giving away capabilities however, this patch
> can make your system more secure as it is. But this isn't fully
> explored, so you might achieve the opposite and open new security
> holes.

Have you checked how glibc handles an executable with filesystem
capabilities? e.g. can an LD_PRELOAD hack subvert the privileged
executable?
I'm not sure what the current glibc security check is, but it used to be
simple *uid() vs. *euid() checks. This would not catch an executable with
filesystem capabilities.
Have a look at
http://security-archive.merton.ox.ac.uk/security-audit-199907/0192.html

I think the eventual plan was that we pass the kernel's current->dumpable
as an ELF note. Not sure if it got done. Alternatively glibc could use
prctl(PR_GET_DUMPABLE).
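
A rough userspace sketch of the two checks being contrasted here: the traditional ID comparison and a query of the dumpable flag through prctl(PR_GET_DUMPABLE). The function names and the combined policy are hypothetical, not glibc code. The second check only helps if the kernel actually clears the flag for executables that gain capabilities, which this thread proposes but does not confirm.

```c
#include <sys/prctl.h>   /* prctl(), PR_GET_DUMPABLE (Linux-specific) */
#include <sys/types.h>
#include <unistd.h>      /* getuid(), geteuid(), getgid(), getegid() */

/* Traditional check: the process counts as privileged only when the real
 * and effective IDs differ. A binary that gained capabilities without
 * changing IDs is not detected by this test. */
static int ids_differ(void)
{
    return getuid() != geteuid() || getgid() != getegid();
}

/* Alternative: ask the kernel for the dumpable flag.
 * Returns 1 only if the flag is exactly 1; any other value, including
 * a prctl() error, is treated as "not dumpable" to fail safe. */
static int is_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0) == 1;
}

/* Hypothetical policy: honour LD_PRELOAD only if both checks pass. */
int preload_allowed(void)
{
    return !ids_differ() && is_dumpable();
}
```

A loader following this idea would call preload_allowed() once at startup and ignore LD_PRELOAD when it returns 0.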

Cheers
Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

[RFC] 2.5.42 (1/2): Filesystem capabilities kernel patch

Post by Olaf Dietsch » Wed, 30 Oct 2002 02:30:08




>> If you're careful with giving away capabilities however, this patch
>> can make your system more secure as it is. But this isn't fully
>> explored, so you might achieve the opposite and open new security
>> holes.

> Have you checked how glibc handles an executable with filesystem
> capabilities? e.g. can an LD_PRELOAD hack subvert the privileged
> executable?

No, I didn't check. Thanks for this hint, I will look into this.

Regards, Olaf.

[RFC] 2.5.42 (1/2): Filesystem capabilities kernel patch

Post by Olaf Dietsch » Wed, 30 Oct 2002 03:10:10





>>> If you're careful with giving away capabilities however, this patch
>>> can make your system more secure as it is. But this isn't fully
>>> explored, so you might achieve the opposite and open new security
>>> holes.

Famous last words :-(

>> Have you checked how glibc handles an executable with filesystem
>> capabilities? e.g. can an LD_PRELOAD hack subvert the privileged
>> executable?

> No, I didn't check. Thanks for this hint, I will look into this.

I just downloaded glibc 2.3.1 and would say you can subvert a
privileged executable with LD_PRELOAD. There's no mention of
PR_GET_DUMPABLE anywhere and __libc_enable_secure is set according to
some euid/egid tests.

Hopefully someone more fluent in glibc internals can shed some light.
Is there a way to switch LD_PRELOAD off completely, or on an as-needed
basis?

Regards, Olaf.

[RFC] 2.5.42 (1/2): Filesystem capabilities kernel patch

Post by Andreas Gruenbache » Wed, 30 Oct 2002 04:40:06




> > Solving the last issue (checking access to the capabilities database)
> > involves filesystem support, I guess. So, this will be the next step
> > to address.

> > If you're careful with giving away capabilities however, this patch
> > can make your system more secure as it is. But this isn't fully
> > explored, so you might achieve the opposite and open new security
> > holes.

> Have you checked how glibc handles an executable with filesystem
> capabilities? e.g. can an LD_PRELOAD hack subvert the privileged
> executable?
> I'm not sure what the current glibc security check is, but it used to be
> simple *uid() vs. *euid() checks. This would not catch an executable with
> filesystem capabilities.
> Have a look at
> http://security-archive.merton.ox.ac.uk/security-audit-199907/0192.html

It seems an additional mechanism is needed to prevent LD_PRELOAD from loading
non-standard libraries for executables that are not suid/sgid, if those
executables have any effective or permitted capabilities that the calling
process doesn't have already.
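
The condition described above can be sketched as a bitmask comparison. This is an illustration only: the 32-bit cap_mask_t type and the function name are hypothetical, and comparing against the caller's permitted set is an assumption, since the text does not say which of the caller's sets is meant.

```c
#include <stdint.h>

/* Hypothetical capability set for the sketch: one bit per capability. */
typedef uint32_t cap_mask_t;

/* Returns non-zero if the new program would hold any permitted or
 * effective capability that is missing from the caller's permitted set.
 * In that case LD_PRELOAD of non-standard libraries should be refused. */
int gains_capabilities(cap_mask_t caller_permitted,
                       cap_mask_t new_permitted,
                       cap_mask_t new_effective)
{
    cap_mask_t acquired = (new_permitted | new_effective) & ~caller_permitted;
    return acquired != 0;
}
```

With this rule, an executable that only holds capabilities the caller already has is treated like an ordinary binary.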

This shouldn't be too hard; perhaps Ulrich has an opinion on that.

> I think the eventual plan was that we pass the kernel's current->dumpable
> as an ELF note. Not sure if it got done. Alternatively glibc could use
> prctl(PR_GET_DUMPABLE).

Sorry, I don't know exactly what your plan was here. Could you please explain?

A perhaps unrelated note: We once had Pavel Machek's elfcap implementation, in
which capabilities were stored in ELF. This was a bad idea because being able
to create executables does not imply the user is capable of CAP_SETFCAP, and
users shouldn't be able to freely choose their capabilities :-] We still want
to be able to grant additional capabilities to binaries that are not owned by
root, though. Extended attributes could be used to overcome this limitation.

There also has to be a mechanism to strip capabilities from binaries when they
are written to (on write or perhaps on open).
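
A small userspace sketch of the "drop on open" variant, assuming an in-memory table indexed by inode number. Every name here (fscap_entry, sketch_set_caps, sketch_open_hook, the table size) is hypothetical and only loosely modelled on the idea of an inode-indexed capability database; it is not the code from the patch.

```c
#include <fcntl.h>    /* O_ACCMODE, O_WRONLY, O_RDWR */
#include <stdint.h>
#include <string.h>   /* memset() */

#define FSCAP_SLOTS 128   /* hypothetical table size for the sketch */

struct fscap_entry {
    uint32_t effective;
    uint32_t permitted;
    uint32_t inheritable;
};

/* Stand-in for the capability database, indexed by inode number. */
static struct fscap_entry fscap_table[FSCAP_SLOTS];

void sketch_set_caps(unsigned long ino, uint32_t permitted)
{
    if (ino < FSCAP_SLOTS)
        fscap_table[ino].permitted = permitted;
}

uint32_t sketch_get_permitted(unsigned long ino)
{
    return ino < FSCAP_SLOTS ? fscap_table[ino].permitted : 0;
}

/* Called on every open: if the file may be written through this
 * descriptor, clear all capabilities stored for its inode. */
void sketch_open_hook(unsigned long ino, int flags)
{
    int mode = flags & O_ACCMODE;

    if (ino < FSCAP_SLOTS && (mode == O_WRONLY || mode == O_RDWR))
        memset(&fscap_table[ino], 0, sizeof(fscap_table[ino]));
}
```

Dropping on open is coarser than dropping on the first write, but it needs only one hook and cannot miss writes made through mmap.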

The final goal would be the `incapable root user', i.e., we would not give
suid root binaries any capabilities except those that are explicitly defined.

--Andreas.


[RFC] 2.5.42 (1/2): Filesystem capabilities kernel patch

Post by Olaf Dietsch » Wed, 30 Oct 2002 13:20:07




>> I'm not sure what the current glibc security check is, but it used to be
>> simple *uid() vs. *euid() checks. This would not catch an executable with
>> filesystem capabilities.
>> Have a look at
>> http://security-archive.merton.ox.ac.uk/security-audit-199907/0192.html
[...]
>> I think the eventual plan was that we pass the kernel's current->dumpable
>> as an ELF note. Not sure if it got done. Alternatively glibc could use
>> prctl(PR_GET_DUMPABLE).

> Sorry, I don't know exactly what was your plan here. Could you please explain?

Judging from the mail archive above: instead of checking uid vs. euid
and gid vs. egid, glibc would ask the kernel and grant or deny LD_PRELOAD
according to the dumpable flag (see prctl(2)). This flag is cleared if
uid != euid, etc., so it could be cleared when an executable gains
capabilities as well.
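
The rule just described can be modelled in a few lines. This is a sketch with hypothetical names (exec_ids, stays_dumpable); the ID comparison stands for the existing behaviour mentioned above, while the capability term is the proposed extension and not something the kernel is known to do here.

```c
#include <stdint.h>

/* Hypothetical snapshot of the credentials involved in an exec. */
struct exec_ids {
    unsigned int uid, euid;   /* real and effective user ID */
    unsigned int gid, egid;   /* real and effective group ID */
    uint32_t caps_before;     /* permitted capabilities before exec (bitmask) */
    uint32_t caps_after;      /* permitted capabilities after exec (bitmask) */
};

/* Returns 1 if the process should stay dumpable, 0 if the flag
 * should be cleared. */
int stays_dumpable(const struct exec_ids *c)
{
    if (c->uid != c->euid || c->gid != c->egid)
        return 0;             /* existing rule: IDs differ */
    if (c->caps_after & ~c->caps_before)
        return 0;             /* proposed rule: capabilities were gained */
    return 1;
}
```

glibc would then only need to read the flag, without knowing whether it was cleared because of set-ID bits or because of filesystem capabilities.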

> A perhaps unrelated note: We once had Pavel Machek's elfcap implementation, in
> which capabilities were stored in ELF. This was a bad idea because being able
> to create executables does not imply the user is capable of CAP_SETFCAP, and
> users shouldn't be able to freely choose their capabilities :-] We still want

I remember this hack, and since I hear this claim every now and then, I
downloaded his patch and checked the claim against the source. Pavel's
capability patch was about _restricting_, not granting, capabilities, so
it's more like an inheritable, rather than a permitted, set.

At least that was his intention. I didn't verify this against the
corresponding kernel sources from 1999.

Regards, Olaf.