'df' and 'du' show wrong sizes


Post by Frank Adl » Sun, 03 May 1998 04:00:00



Adabas D 10.0 Linux Business
Redhat 4.0 / RedHat 4.2

I've installed Adabas D and created a new DB using xcontrol. The Data,
SysDev, DataDev and Transactionlog have been added to the same partition,
/dev/sdb5 aka /home/adabas. As a result, the file sizes shown by 'ls -l'
are correct, but the device space shown by 'df' or 'du' is not. Has anyone
seen this before? I have run a filesystem check with e2fsck, but it does not
detect any error.

Moving the files to another volume and back solves the problem.

Frank

Filesystem         1024-blocks  Used Available Capacity Mounted on
/dev/sdb5             303251   26150   261440      9%   /home/adabas


total 26151
drwxr-xr-x   3 adabas   adabas       1024 May  1 00:21 .
drwxr-xr-x  47 root     root         1024 Mar 27 19:39 ..
drwxr-xr-x   2 root     root        12288 Apr 30 13:01 lost+found
-rw-r--r--   1 adabas   database      204 Apr 30 14:45 util1.prot
-rw-r--r--   1 adabas   database 102404096 May  1 00:34 www.data
-rw-r--r--   1 adabas   database  1527808 May  1 03:04 www.sys
-rw-r--r--   1 adabas   database 20484096 May  1 00:38 www.trans
-rw-r--r--   1 adabas   database      240 Apr 30 23:56 xparam.prot


12      lost+found
1       util1.prot
5434    www.data
617     www.sys
20084   www.trans
1       xparam.prot

 
 
 


Post by David Z. Maz » Sun, 03 May 1998 04:00:00


FA> I've installed Adabas D and created a new DB using xcontrol. The
FA> Data, SysDev, DataDev and Transactionlog have been added to the
FA> same partition, /dev/sdb5 aka /home/adabas. As a result, the
FA> file sizes shown by 'ls -l' are correct, but the device space shown
FA> by 'df' or 'du' is not. Has anyone seen this before? I have run
FA> a filesystem check with e2fsck, but it does not detect any error.
FA>
FA> Moving the files to another volume and back solves the problem.

My guess is that the database software is creating "sparse" files.  If
a part of a file is completely empty, the software can skip over the
empty blocks while writing, and nothing ever gets written to disk.  If
you try to read the empty blocks, you just get zeros.  The net effect
is that you get what looks like a big file that doesn't have quite as
many disk blocks allocated as it seems like it should.  This is
perfectly normal.
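
The effect described above is easy to reproduce. The following is a minimal Python sketch, not anything Adabas itself does: the helper names and the file name demo.data are made up for illustration. It creates a file by seeking past the end and writing a single byte, then compares the size 'ls -l' would report with the space 'du' would report. It assumes a Unix system where st_blocks is counted in 512-byte units, as on Linux.

```python
import os
import tempfile


def make_sparse_file(path, size):
    """Create a file of `size` bytes by seeking past the end and writing one byte."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")


def apparent_and_allocated(path):
    """Return (size as shown by 'ls -l', bytes actually allocated on disk)."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.data")  # hypothetical file name
        make_sparse_file(path, 100 * 1024 * 1024)
        apparent, allocated = apparent_and_allocated(path)
        print("apparent size (ls -l):", apparent)
        print("allocated bytes (du): ", allocated)
```

On a filesystem that supports holes, the second figure should come out far smaller than the first; on one that does not, the two will be roughly equal.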

--
 _____________________________
/                             \       "Dad was reading a book called
|          David Maze         |     _Schroedinger's Kittens_.  A*

| http://www.veryComputer.com/|               -- Abra Mitchell
\_____________________________/

 
 
 


Post by Ronald Wah » Sun, 03 May 1998 04:00:00



> Adabas D 10.0 Linux Business
> Redhat 4.0 / RedHat 4.2

> I've installed Adabas D and created a new DB using xcontrol. The Data,
> SysDev, DataDev and Transactionlog have been added to the same partition,
> /dev/sdb5 aka /home/adabas. As a result, the file sizes shown by 'ls -l'
> are correct, but the device space shown by 'df' or 'du' is not. Has anyone
> seen this before? I have run a filesystem check with e2fsck, but it does not
> detect any error.

"du" reports the amount of disk space an object actually occupies. This value
may be lower than the file size printed by "ls -l". If so, the file contains
holes - a very nice feature. Holes are created by positioning the file
pointer beyond the current end of the file and writing there. A hole reads
back as zeros but occupies no disk blocks. A plain copy replaces the holes
with real zeros, so the copy really consumes the amount of space shown by
"ls -l", which is a waste of disk space.

> Moving the files to another volume and back solves the problem.

Moving to another volume is copying+deleting. Moving on the same volume
doesn't touch the file contents and the holes will remain.
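
The difference between a copy that fills the holes and one that keeps them can be sketched in Python. This is only an illustration under stated assumptions: the function names and the 64 KiB chunk size are arbitrary choices, and the second function does not find the real holes, it simply treats every all-zero chunk as one.

```python
import os

CHUNK = 64 * 1024  # arbitrary chunk size for this sketch


def copy_dense(src, dst):
    """Copy every byte, so holes in src become real zero-filled blocks in dst."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(CHUNK)
            if not buf:
                break
            fout.write(buf)


def copy_keep_holes(src, dst):
    """Skip all-zero chunks by seeking, so dst gets holes where src had zeros."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(CHUNK)
            if not buf:
                break
            if buf.count(0) == len(buf):
                fout.seek(len(buf), os.SEEK_CUR)  # leave a hole instead of writing
            else:
                fout.write(buf)
        # If the file ends in a hole, nothing was written there; set the length.
        fout.truncate(fin.tell())
```

Both copies read back identically; they differ only in how many blocks the filesystem allocates. GNU cp offers comparable behaviour through its --sparse option.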

ron

--

 \ WWW: http://www.tu-chemnitz.de/~row/  \                           /

   \ PGP key available                     \                       /

 
 
 


Post by Frank Adl » Mon, 04 May 1998 04:00:00



> My guess is that the database software is creating "sparse" files.  If
> a part of a file is completely empty, the software can skip over the
> empty blocks while writing, and nothing ever gets written to disk.  If
> you try to read the empty blocks, you just get zeros.  The net effect
> is that you get what looks like a big file that doesn't have quite as
> many disk blocks allocated as it seems like it should.  This is
> perfectly normal.

Thank you for explaining this. But I wonder whether this behaviour makes sense
for a database application. Each write into a hole requires allocating new
blocks on disk. The database application may never check whether there is any
free space on disk, because it never reads or writes beyond the file's limits.
This may lead to hazardous situations.
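
The risk raised here can at least be measured from outside the database. A rough Python sketch follows, assuming a Unix system with 512-byte st_blocks units; the function names are invented for illustration. It adds up the space the sparse files may still demand and compares it with the free space on the filesystem.

```python
import os


def unallocated_bytes(path):
    """Bytes the file claims (st_size) that have no disk blocks behind them yet."""
    st = os.stat(path)
    # st_blocks can exceed the size slightly (block rounding, metadata), hence max().
    return max(0, st.st_size - st.st_blocks * 512)


def overcommit_report(paths, mount_point):
    """Compare the space still owed to sparse files with the filesystem's free space."""
    owed = sum(unallocated_bytes(p) for p in paths)
    vfs = os.statvfs(mount_point)
    free = vfs.f_bavail * vfs.f_frsize  # bytes available to unprivileged users
    return owed, free, owed <= free
```

Called with the devspace files and their mount point, for example overcommit_report(["www.data", "www.sys"], "/home/adabas"), a False in the third position means the files could not all be filled without running out of space. The figures are approximate, and other users of the filesystem can change them at any time.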

Frank

 
 
 


Post by Brian McCaule » Mon, 04 May 1998 04:00:00



> I've installed Adabas D and created a new DB using xcontrol. The Data,
> SysDev, DataDev and Transactionlog have been added to the same partition,
> /dev/sdb5 aka /home/adabas. As a result, the file sizes shown by 'ls -l'
> are correct, but the device space shown by 'df' or 'du' is not. Has anyone
> seen this before? I have run a filesystem check with e2fsck, but it does not
> detect any error.

Probably there is no error; the file is sparse.  "ls -l" shows
the notional size of the file, not the number of allocated blocks.
Under Linux ext2fs (like most Unix filesystems) disk blocks are not
allocated until you write something to them.
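
Applied to the listings in the original post, the gap can be worked out per file. The small Python sketch below uses only the figures posted there and assumes that 'du' counted 1 KiB blocks, which is consistent with the "1024-blocks" header of the 'df' output.

```python
# Figures copied from the 'ls -l' and 'du' listings in the original post:
# name -> (size in bytes from 'ls -l', blocks from 'du', assumed to be 1 KiB each)
listing = {
    "www.data": (102404096, 5434),
    "www.sys": (1527808, 617),
    "www.trans": (20484096, 20084),
}

for name, (size_bytes, du_kib) in listing.items():
    apparent_kib = size_bytes // 1024
    print(f"{name:10s} apparent {apparent_kib:6d} KiB, allocated {du_kib:6d} KiB")
```

The apparent sizes come to 100004, 1492 and 20004 KiB. So www.data has only about 5% of its blocks allocated and www.sys about 40%, while www.trans is fully written; its 'du' figure is 80 KiB above its size, presumably filesystem overhead such as indirect blocks.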

> Moving the files to another volume and back solves the problem.

Depending on how you do this, it will often make a sparse file non-sparse.
This may be desirable, as sparse files allow you to overcommit your
disk, so you can suddenly run out of space when writing to the
middle of a file.  Any DB s/w for a Unix platform should either create
non-sparse files or be prepared to cope with this eventuality.

--

 .  _\\__[oo       from       | Phones: +44 121 471 3789 (home)

.  l___\\    /~~) /~~[  /   [ | PGP-fp: D7 03 2A 4B D8 3A 05 37...
 # ll  l\\  ~~~~ ~   ~ ~    ~ | http://wcl-l.bham.ac.uk/~bam/

 
 
 


Post by Benedikt Hein » Thu, 07 May 1998 04:00:00


[[[ After creating a new database -- looking at the DB files ]]]

>> As a result, the file sizes shown by 'ls -l'
>> are correct, but the device space shown by 'df' or 'du' is not.
>Probably there is no error; the file is sparse.  "ls -l" shows
>the notional size of the file, not the number of allocated blocks.
>Under Linux ext2fs (like most Unix filesystems) disk blocks are not
>allocated until you write something to them.

Depending on which system you're on, you might even see the "discrepancy"
with "ls -l":


        total 12158
        -rw-rw-r--   1 beh      ermes    61444096 Nov  3  1997 dd.01
        -rw-rw-r--   1 beh      ermes      925696 Feb 27 07:01 sd
        -rw-rw-r--   1 beh      ermes     6148096 Feb 27 07:01 tl

As you can see, the file size appears to be correct (well - the 60M are
the correct figure), but you might notice that the "total" figure looks
messed up...

This is due to the use of sparse files (files with "holes" in them).
I'd really prefer if SAG would change to something writing some random
patterns to the database, so that the space gets taken up. I dislike
seeing a full DB, but then at a later stage seeing the database moan
"out of diskspace", just because someone else took up a couple of KBs
before Adabas got around to actually trying and using the space...

  Benedikt

--
Windows 95: n.
    32-bit extensions and a graphical shell for a 16-bit patch to an 8-bit
    operating system originally coded for a 4-bit microprocessor,  written
         by a 2-bit company that can't stand for 1 bit of competition.

 
 
 


Post by Joerg Brueh » Tue, 12 May 1998 04:00:00


Hi,

regarding the ADABAS D "devspace" files
that do not take as much diskspace as their size implies:

The analysis is correct: these are "sparse files", a general UNIX / Linux feature.


> ((...))

> I'd really prefer if SAG would change to something writing some random
> patterns to the database, so that the space gets taken up.

That was the behaviour in previous releases, and several customers complained
about the long time it took to create a new database -
especially when it would very soon be filled with contents, so the pages
would be written twice in short succession: first "empty", then with data.

Most installations use disk partitions for their devspaces, where allocation
by writing is no issue at all - they would complain the most.

> I dislike seeing a full DB,

If the DB is "full", all pages have been written, so the devspace file has
expanded to its defined size - at this stage no problem can occur.

> but then at a later stage seeing the database moan "out of diskspace",

That can only happen when the DB is not yet full (judging by its defined
size), but the filesystem is full - and yes, that is the real problem.

> just because someone else took up a couple of KBs
> before Adabas got around to actually trying and using the space...

So the main question is: would people accept longer initialization times
(writing all file devspace pages) in order to avoid the risk involved in
"sparse files" ?

Hoping for comments,
Joerg Bruehe

--
Joerg Bruehe, SQL Datenbanksysteme GmbH, Berlin, Germany
     (speaking only for himself)
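
The trade-off in question, longer initialization in exchange for space that is guaranteed to be there, can be sketched in Python. This is not how Adabas D initializes a devspace; the function name, the 4096-byte page size and the pages_per_write value are assumptions made for illustration.

```python
import os


def create_dense_file(path, size, page=4096, pages_per_write=256):
    """Create `path` with `size` bytes, writing every page so blocks are allocated up front."""
    chunk = b"\0" * (page * pages_per_write)
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(remaining, len(chunk))
            f.write(chunk[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # force the data out so a full disk is reported now
```

If the filesystem is too small, the OSError is raised at creation time instead of in the middle of later database activity. The cost is the one described above: every page is written once when the file is created and again when real data arrives. On systems that provide it, os.posix_fallocate asks the filesystem to reserve the blocks without the caller writing them.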

 
 
 


Post by Frank Adl » Wed, 13 May 1998 04:00:00



[discussion about usage of sparse files when creating a database using  
ADABAS D]

> Most installations use disk partitions for their devspaces, where allocation
> by writing is no issue at all - they would complain the most.

I would prefer installing on "raw devices", which is not supported in the
Linux edition as far as I remember (if I am right, this is due to a
limitation of the Linux file system).

> So the main question is: would people accept longer initialization times
> (writing all file devspace pages) in order to avoid the risk involved in
> "sparse files" ?

The latter can be avoided by setting disk quotas or using designated  
partitions for the database (which I prefer). But how about performance? I  
thought the database kernel does some type of optimization of disk access?  
Wouldn't it be better to have all sectors allocated in sequential order?

Frank Adler