PROPOSAL: dot-proc interface [was: /proc stuff]

Post by Jakob Østergaard » Tue, 06 Nov 2001 06:30:09




> Jakob Østergaard writes:

> > Please tell me, is "1610612736" a 32-bit integer, a 64-bit integer, is
> > it signed or unsigned?

> > I could even live with parsing ASCII, as long as there'd just be type
> > information to go with the values.

> You are looking for something called the registry. It's something
> that was introduced with Windows 95. It's basically a filesystem
> with typed files: char, int, string, string array, etc.

Nope   :)

It does not have "char, int, string, string array, etc." it has "String, binary
and DWORD".

Having read out 64 bit values, floating point data etc. from the registry, I'm
old enough to know that it is *NOT* what I'm looking for   :)

...

> Funny you should mention that one. I wrote the code used by procps
> to read this file. I love that file! The parentheses issue is just
> a beauty wart. People rarely feel the urge to * with raw numbers.
> In all the other files, idiots like to: add headers, change the
> spelling of field names, change the order, add spaces and random
> punctuation, etc. Nothing is as stable and easy to use as the
> /proc/self/stat file.

Imagine every field in a file by itself, with well-defined type
information and unit information.
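
For readers unfamiliar with the "parentheses issue" in /proc/self/stat mentioned in the quoted text: the second field is the command name wrapped in parentheses, and that name may itself contain spaces or parentheses, so a plain whitespace split can miscount fields. A minimal Python sketch of one common workaround (split at the last closing parenthesis) follows. The sample line and the function name `parse_stat_line` are made up for illustration; this is not the procps code.

```python
def parse_stat_line(line):
    """Split a /proc/<pid>/stat style line into (pid, comm, remaining fields)."""
    open_paren = line.index("(")
    # Use the LAST ')' because the command name may contain ')' itself.
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    rest = line[close_paren + 1:].split()
    return pid, comm, rest


# Hypothetical, shortened sample line with an awkward command name.
sample = "1234 (my (odd) name) S 1 1234 1234 0 -1"
print(parse_stat_line(sample))
```

Everything after the command name is still untyped text; the sketch only makes the field boundaries reliable.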

...

> Linus clearly doesn't give a * about /proc performance.
> That's his right, and you are welcome to patch your kernel to
> have something better: http://www.veryComputer.com/

Performance is one thing.  Not being able to know whether numbers are i32, u32,
u64, or measured in kilobytes or carrots is another thing.
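
The type question can be made concrete with a small Python sketch. It lists which fixed-width integer types could hold a given decimal string; the type names and the helper `candidate_types` are illustrative assumptions, not part of any /proc interface.

```python
# Inclusive value ranges for the integer types named in the thread.
RANGES = {
    "i32": (-2**31, 2**31 - 1),
    "u32": (0, 2**32 - 1),
    "i64": (-2**63, 2**63 - 1),
    "u64": (0, 2**64 - 1),
}


def candidate_types(text):
    """Return every type whose range contains the decimal value in `text`."""
    value = int(text)
    return [name for name, (lo, hi) in RANGES.items() if lo <= value <= hi]


print(candidate_types("1610612736"))  # fits all four types
print(candidate_types("4294967295"))  # too large for i32 only
```

The digits alone do not say which type the kernel used, so a reader cannot tell where the value will wrap or overflow.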

--
................................................................

:.........................: putrid forms of man                :
:   Jakob ?stergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in

More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

 
 
 

Post by Alex Bligh - linux-kernel » Tue, 06 Nov 2001 06:30:13


--On Sunday, 04 November, 2001 4:12 PM -0500 "Albert D. Cahalan"


>> Now you are proposing to dink with the format. See above comments.

Attribution error: that was me, disagreeing with Jakob. The point was that
if you want to dink with the format to achieve the objectives
he seemed to be after (which I thought had to do, at least
in part, with consistency etc.), it is theoretically possible
to do such dinking with minimal change and certainly to retain
the text format (and note I said retain the original /proc files too). Whether
it's worth it as a practical exercise, with all the inherent
disruption it would no doubt cause and the questionable net benefit,
is a completely different question. I was just saying that
a binary format wasn't necessary to achieve what I think
Jakob wanted to achieve. The full thought
experiment was in a later email. I suspect you don't disagree,
given your previous post.

>>> 3. Try and rearrange all the /proc entries this way, which
>>>    means sysctl can be implemented by a straight ASCII
>>>    write - nice and easy to parse files.

> This is exactly what the sysctl command does.

Sorry, I meant 'this way a consistent interface cf
sysctl could be used for more of what's currently
done through /proc'. Last time I looked there was
stuff you could read/write to through /proc which
couldn't be done through sysctl.
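
As background for the sysctl point: the sysctl command maps a dotted key onto a file under /proc/sys and then does a plain text read or write. A minimal Python sketch of that mapping follows. The helper names are made up, the demo runs against a throwaway directory standing in for /proc/sys, and it ignores corner cases such as keys whose components contain dots.

```python
import os
import tempfile


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted key to its file, e.g. kernel.ostype -> /proc/sys/kernel/ostype."""
    return root + "/" + key.replace(".", "/")


def read_sysctl(key, root="/proc/sys"):
    with open(sysctl_path(key, root)) as f:
        return f.read().strip()


def write_sysctl(key, value, root="/proc/sys"):
    # On a real system this needs root privileges and a writable entry.
    with open(sysctl_path(key, root), "w") as f:
        f.write(str(value) + "\n")


# Demo against a temporary directory instead of the real /proc/sys.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "net", "ipv4"))
write_sysctl("net.ipv4.ip_forward", 1, root=demo_root)
print(read_sysctl("net.ipv4.ip_forward", root=demo_root))
```

Only entries that live under /proc/sys are reachable this way, which is the limitation described above for the rest of /proc.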

--
Alex Bligh

Post by Daniel Phillips » Tue, 06 Nov 2001 07:10:12


(Whoops, sorry about the 200K spam the first time, hope that was bounced
from the list, here's the abridged version.)

On November 4, 2001 08:52 pm, Alexander Viro wrote:

> On Sun, 4 Nov 2001, Jakob Østergaard wrote:

> > > If you feel it's too hard to write use scanf(), use sh, awk, perl
> > > etc. which all have their own implementations that appear to have
> > > served UNIX quite well for a long while.

> > Witness ten lines of vmstat output taking 300+ millions of clock cycles.

> Would the esteemed sir care to check where these cycles are spent?
> How about "traversing page tables of every damn process out there"?
> Doesn't sound like a string operation to me...

Doing 'top -d .1' eats 18% of a 1GHz cpu, which is abominable.  A kernel
profile courtesy of sgi's kernprof shows that scanning pages does not move
the needle, whereas sprintf does.  Notice that the biggest chunk of time
is spent in user space, possibly decoding proc values.  I didn't profile
user space, and I should but I'm not set up to do that just now.  Another
main cause of this embarrassing waste of cpu cycles is the bazillions of
file operations.  Enjoy:

                     Call graph (explanation follows)

granularity: each sample hit covers 4 byte(s) for 0.00% of 38.05 seconds

index % time    self  children    called     name
                                                 <spontaneous>
[1]     78.0    0.00   29.67                 cpu_idle [1]
               29.65    0.00   54034/54034       default_idle [2]
                0.01    0.00    1805/3791        schedule [81]
                0.00    0.00    1804/1806        check_pgt_cache [225]
-----------------------------------------------
               29.65    0.00   54034/54034       cpu_idle [1]
[2]     77.9   29.65    0.00   54034         default_idle [2]
-----------------------------------------------
                                                 <spontaneous>
[3]     11.2    4.26    0.00                 USER [3]
-----------------------------------------------
                                                 <spontaneous>
[4]     10.4    0.04    3.94                 system_call [4]
                0.01    2.85   19776/19776       sys_read [5]
                0.01    0.48   17031/17031       sys_open [14]
                0.00    0.15    5742/5742        sys_stat64 [24]
                0.00    0.12    4554/4554        sys_write [28]
                0.00    0.09     642/642         sys_getdents64 [30]
                0.01    0.08    1544/1544        sys_select [32]
                0.00    0.03     767/767         sys_poll [69]
                0.01    0.02   16901/16903       sys_close [75]
                0.00    0.02    3969/3969        sys_fcntl64 [82]
                0.00    0.01    1046/1046        sys_ioctl [137]
                0.00    0.01     134/134         old_mmap [142]
                0.00    0.01    5376/5376        sys_alarm [145]
                0.00    0.00    3644/3644        sys_gettimeofday [154]
                0.00    0.00       1/1           sys_fork [165]
                0.00    0.00     128/128         sys_access [176]
                0.00    0.00     130/130         sys_munmap [188]
                0.00    0.00       1/1           sys_execve [201]
                0.00    0.00     385/385         sys_lseek [214]
                0.00    0.00     263/263         sys_fstat64 [219]
                0.00    0.00    3615/3615        sys_rt_sigaction [224]
                0.00    0.00     128/128         sys_llseek [232]
                0.00    0.00      17/17          sys_wait4 [237]
                0.00    0.00       1/1           sys_exit [239]
                0.00    0.00       4/4           sys_brk [297]
                0.00    0.00       6/6           sys_writev [304]
                0.00    0.00       1/1           sys_mprotect [350]
                0.00    0.00     148/148         sys_time [425]
                0.00    0.00      34/34          sys_rt_sigprocmask [433]
                0.00    0.00       2/2           sys_setpgid [504]
                0.00    0.00       1/1           sys_newuname [550]
                0.00    0.00       1/1           sys_getpid [549]
                0.00    0.00       1/1           sys_sigreturn [551]
-----------------------------------------------
                0.01    2.85   19776/19776       system_call [4]
[5]      7.5    0.01    2.85   19776         sys_read [5]
                0.01    1.50   16512/16512       proc_info_read [6]
                0.00    1.27     524/524         proc_file_read [7]
                0.00    0.02     781/781         tty_read [92]
                0.00    0.02    1793/1798        generic_file_read [98]
                0.01    0.01   19776/80631       fput [45]
                0.00    0.01     166/166         sock_read [121]
                0.01    0.00   19776/52388       fget [90]
-----------------------------------------------
                0.01    1.50   16512/16512       sys_read [5]
[6]      4.0    0.01    1.50   16512         proc_info_read [6]
                0.03    1.04    5504/5504        proc_pid_statm [8]
                0.12    0.12    5504/5504        proc_pid_stat [23]
                0.09    0.00   15360/27661       _generic_copy_to_user [25]
                0.01    0.04    5504/5504        proc_pid_cmdline [48]
                0.00    0.04   16512/20003       _get_free_pages [50]
                0.02    0.00   16512/20160       _free_pages_ok [77]
                0.00    0.00   16512/87748       _free_pages [107]
                0.00    0.00   16512/80548       free_pages [167]
-----------------------------------------------
                0.00    1.27     524/524         sys_read [5]
[7]      3.3    0.00    1.27     524         proc_file_read [7]
                0.00    0.92     128/128         meminfo_read_proc [10]
                0.01    0.29     128/128         kstat_read_proc [21]
                0.00    0.03      12/12          ksyms_read_proc [68]
                0.00    0.00     524/27661       _generic_copy_to_user [25]
                0.00    0.00     128/128         loadavg_read_proc [230]
                0.00    0.00     128/128         uptime_read_proc [231]
                0.00    0.00     524/20003       _get_free_pages [50]
                0.00    0.00     524/20160       _free_pages_ok [77]
                0.00    0.00     524/87748       _free_pages [107]
                0.00    0.00     524/80548       free_pages [167]
                0.00    0.00     384/512         proc_calc_metrics [417]
-----------------------------------------------
                0.03    1.04    5504/5504        proc_info_read [6]
[8]      2.8    0.03    1.04    5504         proc_pid_statm [8]
                0.98    0.00  177792/177792      statm_pgd_range [9]
                0.00    0.05    5504/44307       sprintf [16]
                0.00    0.00    4352/17928       mmput [152]
-----------------------------------------------
                0.98    0.00  177792/177792      proc_pid_statm [8]
[9]      2.6    0.98    0.00  177792         statm_pgd_range [9]
-----------------------------------------------
                0.00    0.92     128/128         proc_file_read [7]
[10]     2.4    0.00    0.92     128         meminfo_read_proc [10]
                0.92    0.00     128/128         si_swapinfo [11]
                0.00    0.00     256/44307       sprintf [16]
                0.00    0.00     128/128         si_meminfo [271]
                0.00    0.00     128/128         nr_inactive_clean_pages [429]
                0.00    0.00     128/512         proc_calc_metrics [417]
-----------------------------------------------
                0.92    0.00     128/128         meminfo_read_proc [10]
[11]     2.4    0.92    0.00     128         si_swapinfo [11]
-----------------------------------------------
[12]     1.3    0.07    0.44   22903+4       <cycle 1 as a whole> [12]
                0.07    0.44   22905             path_walk <cycle 1> [13]
-----------------------------------------------
                                   2             vfs_follow_link <cycle 1> [506]
                0.00    0.00       2/22903       open_exec [326]
                0.02    0.11    5870/22903       _user_walk [26]
                0.05    0.33   17031/22903       open_namei [19]
[13]     1.3    0.07    0.44   22905         path_walk <cycle 1> [13]
                0.03    0.24   38529/38529       real_lookup [22]
                0.02    0.08   85429/108334      dput [27]
                0.01    0.02   63040/63041       cached_lookup [74]
                0.01    0.01   63042/80076       vfs_permission [71]
                0.01    0.00   63042/80076       permission [120]
                0.00    0.00   22389/22389       lookup_mnt [200]
                0.00    0.00     244/5999        path_release [136]
                0.00    0.00       2/1809        update_atime [402]
                0.00    0.00       2/2           ext2_follow_link [492]
                                   2             vfs_follow_link <cycle 1> [506]
-----------------------------------------------
                0.01    0.48   17031/17031       system_call [4]
[14]     1.3    0.01    0.48   17031         sys_open [14]
                0.00    0.44   17031/17031       filp_open [15]
                0.01    0.01   17031/22902       getname [86]
                0.01    0.00   17031/17032       get_unused_fd [110]
                0.00    0.00   17031/111161      kmem_cache_free [70]
-----------------------------------------------
                0.00    0.44   17031/17031       sys_open [14]
[15]     1.2    0.00    0.44   17031         filp_open [15]
                0.01    0.40   17031/17031       open_namei [19]
                0.01    0.01   16902/16904       dentry_open [84]
-----------------------------------------------
                0.00    0.00     128/44307       loadavg_read_proc [230]
                0.00  
...


Post by Albert D. Cahalan » Tue, 06 Nov 2001 07:20:10


Jakob Østergaard writes:

>> You are looking for something called the registry. It's something
>> that was introduced with Windows 95. It's basically a filesystem
>> with typed files: char, int, string, string array, etc.

> Nope   :)

> It does not have "char, int, string, string array, etc." it
> has "String, binary and DWORD".

I'm pretty sure that newer implementations have additional types.
BTW, we could call the persistent part of our registry "reiserfs4".

> Imagine every field in a file by itself, with well-defined type
> information and unit information.

I suppose I could print a warning if the type or unit info
isn't what was expected. That's insignificantly useful.

Individual files are nice, until you realize: open, read, close
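
The open/read/close cost can be illustrated with a rough Python sketch that reads the same four values from one-file-per-field and from a single combined file, tallying the system calls it issues itself. The field names and values are invented, and the files live in a temporary directory rather than /proc.

```python
import os
import tempfile


def read_file(path, counter):
    """Read a small file with raw syscalls, tallying each call we make."""
    fd = os.open(path, os.O_RDONLY)
    counter["open"] += 1
    data = os.read(fd, 4096)
    counter["read"] += 1
    os.close(fd)
    counter["close"] += 1
    return data.decode()


# Hypothetical fields; not real /proc entries.
fields = {"total": 1024, "free": 512, "shared": 64, "buffers": 128}
root = tempfile.mkdtemp()

# Layout 1: one file per field.
for name, value in fields.items():
    with open(os.path.join(root, name), "w") as f:
        f.write(f"{value}\n")

# Layout 2: all fields in a single file, one "name value" pair per line.
with open(os.path.join(root, "all"), "w") as f:
    f.write("".join(f"{n} {v}\n" for n, v in fields.items()))

per_field_calls = {"open": 0, "read": 0, "close": 0}
per_field = {n: int(read_file(os.path.join(root, n), per_field_calls))
             for n in fields}

single_calls = {"open": 0, "read": 0, "close": 0}
single = {}
for line in read_file(os.path.join(root, "all"), single_calls).splitlines():
    name, value = line.split()
    single[name] = int(value)

print(sum(per_field_calls.values()), sum(single_calls.values()))
```

With N fields the per-field layout costs 3N calls against 3 for the combined file, before counting any path lookup work inside the kernel.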

> Performance is one thing.  Not being able to know whether
> numbers are i32, u32, u64, or measured in kilobytes or
> carrots is another thing.

I don't see what the code is supposed to do if it was expecting
kilobytes and you serve it carrots. Certainly nothing useful can
be done when this happens.

Post by Alexander Viro » Tue, 06 Nov 2001 08:50:10



> Doing 'top -d .1' eats 18% of a 1GHz cpu, which is abominable.  A kernel
> profile courtesy of sgi's kernprof shows that scanning pages does not move
> the needle, whereas sprintf does.  Notice that the biggest chunk of time

Huh?  Scanning pages is statm_pgd_range().  I'd say that it takes
seriously more than vsnprintf() - look at your own results.


Post by Daniel Phillips » Tue, 06 Nov 2001 09:20:08




> > Doing 'top -d .1' eats 18% of a 1GHz cpu, which is abominable.  A kernel
> > profile courtesy of sgi's kernprof shows that scanning pages does not move
> > the needle, whereas sprintf does.  Notice that the biggest chunk of time

> Huh?  Scanning pages is statm_pgd_range().  I'd say that it takes
> seriously more than vsnprintf() - look at your own results.

Yes, true, 2.6 seconds for the statm_pgd_range vs 1.2 for sprintf.  Still,
sprintf is definitely burning cycles, pretty much the whole 1.2 seconds would
be recovered with a binary interface.

Now look at the total time we spend in the kernel: 10.4 seconds, 4 times the
page scanning overhead.  This is really wasteful.
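
A note on the figures: the 2.6 and 10.4 quoted here appear to match the "% time" column of the call graph posted earlier in the thread rather than seconds (the graph lists about 0.98 s for statm_pgd_range and 0.04 + 3.94 s for system_call). The ratio is the same either way. A small Python sketch of the arithmetic, using only numbers copied from that call graph:

```python
TOTAL_SECONDS = 38.05  # from the "granularity" line of the call graph

# "% time" column for a few entries, copied from the call graph.
percent = {
    "cpu_idle": 78.0,
    "USER": 11.2,
    "system_call": 10.4,
    "statm_pgd_range": 2.6,
}

# Convert each share of the sampled time into approximate seconds.
seconds = {name: TOTAL_SECONDS * p / 100 for name, p in percent.items()}
for name, s in seconds.items():
    print(f"{name:16s} {percent[name]:5.1f}%  ~{s:5.2f} s")

busy = percent["USER"] + percent["system_call"]
ratio = percent["system_call"] / percent["statm_pgd_range"]
print(f"non-idle share: {busy:.1f}%")
print(f"system_call / statm_pgd_range: {ratio:.1f}x")
```

The non-idle share covers the whole machine during the sample, so it is only roughly comparable to the 18% reported for top.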

For top does it really matter?  (yes, think slow computer)  What happens when
proc stabilizes and applications start relying on it heavily as a kernel
interface?  If we're still turning in this kind of stunningly poor
performance, it won't be nice.

It's not that it doesn't work, it's just that it isn't the best.

--
Daniel

Post by Alexander Viro » Tue, 06 Nov 2001 13:10:09



> Any reason we can't move all the process info into something like
> /proc/pid/* instead of in the root /proc tree?

Thanks, but no thanks.  If we are starting to move stuff around, we
would be much better off leaving in /proc only what it was supposed
to contain - per-process information.


Post by Daniel Phillips » Tue, 06 Nov 2001 19:30:15




> > At the very least, use canonical bytesex and field sizes.  Anything less
> > is just begging for trouble.  And in case of procfs or its equivalents,
> > _use_ the_ _damn_ _ASCII_ _representations_.  scanf(3) is there for
> > purpose.

> And the purpose of scanf in system level applications is to introduce
> nice opportunities for buffer overruns and string formatting bugs.

I've done quite a bit more kernel profiling and I've found that overhead for
converting numbers to ASCII for transport to proc is significant, and there
are other overheads as well, such as the sprintf and proc file open.  These
must be matched by corresponding overhead on the user space side, which I
have not profiled.  I'll take some time and present these numbers properly at
some point.

Not that I think we are going to change this way of doing things any time
soon - Linus has spoken - but at least we should know what the overheads are.
Programmers should not labor under the misapprehension that this is an
efficient interface.

--
Daniel

Post by Martin Dalecki » Tue, 06 Nov 2001 19:40:09


Every BASTARD out there telling the world that parsing ASCII-formatted
files is easy should be punished by having to provide a BNF definition of
their syntax. Otherwise I won't trust him. Having a struct {} with a version
field indicating possible semantic changes will always be easier, faster,
and more immune to errors to use in user-level programs.
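
For illustration, here is a minimal Python sketch of what a versioned binary record could look like from user space, using the standard `struct` module with fixed byte order and field sizes. The layout (a u32 version followed by two u64 values and an i32) and all field names are hypothetical; no such kernel interface is described in this thread.

```python
import struct

# Hypothetical layout: u32 version, u64 total_kb, u64 free_kb, i32 nice.
# "<" selects little-endian with no padding, so sizes and offsets are fixed.
V1 = struct.Struct("<IQQi")


def encode_v1(total_kb, free_kb, nice):
    """Build a version-1 record (stands in for what a producer would emit)."""
    return V1.pack(1, total_kb, free_kb, nice)


def decode(buf):
    """Check the version field first, then unpack the rest of the record."""
    (version,) = struct.unpack_from("<I", buf, 0)
    if version != 1:
        raise ValueError(f"unsupported record version {version}")
    _, total_kb, free_kb, nice = V1.unpack(buf)
    return {"total_kb": total_kb, "free_kb": free_kb, "nice": nice}


record = encode_v1(1610612736, 4096, -5)
print(len(record), decode(record))
```

Type and signedness are fixed by the layout, and a reader that sees an unknown version can refuse the record instead of misparsing it. The cost is that the layout has to be documented and kept stable.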

Post by Alexander Viro » Wed, 07 Nov 2001 01:10:07




> Every BASTARD out there telling the world, that parsing ASCII formatted
> files

What was your username, again?
