Wednesday, May 11, 2011

Sparse files: du vs. ls -h

When you check the size of a sparse file with `ls -lh' and `du -h', the two report different sizes. This is not a bug in du, ls, or the filesystem; it comes from the way du and ls each calculate the file size.

du uses fts(3) to walk the file hierarchy and add up the sizes of the files. (This is on the BSDs; I haven't checked the GNU version of du, but it should be more or less the same.)

du calculates the size by looking at fts_statp->st_blocks in the FTSENT structure, where fts_statp is a pointer to the struct stat filled in by stat(2). st_blocks is the number of blocks allocated to the file, counted in 512-byte units, so the size on disk is fts_statp->st_blocks * 512.
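The calculation described above can be sketched in a few lines of Python, whose os.lstat exposes the same st_blocks field. This is only an illustration, not du's actual code: it assumes a Unix-like system where st_blocks is counted in 512-byte units, and the helper name allocated_bytes is made up for this example.

```python
import os

# Assumption: st_blocks is counted in 512-byte units (true on the BSDs and Linux).
BLOCK_UNIT = 512


def allocated_bytes(path):
    """Return the space allocated to path on disk, computed as st_blocks * 512."""
    st = os.lstat(path)  # lstat: a symlink is sized itself, not its target
    return st.st_blocks * BLOCK_UNIT
```

Calling allocated_bytes() on a sparse file should return far less than the file's length, since the holes have no blocks behind them.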

`ls -l', on the other hand, gets at the stat structure the same way as `du' but looks at the st_size field to determine the file size. This gives the logical length of the file in bytes, holes included, rather than the space actually allocated, so for a sparse file `ls -l' reports more than `du'.
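To see the two numbers side by side, here is a small Python sketch that creates a sparse file and prints both fields from a single stat call. The helper names make_sparse_file and apparent_and_allocated are hypothetical, and whether the hole really stays unallocated depends on the filesystem, so the second number may vary from system to system.

```python
import os
import tempfile


def make_sparse_file(path, hole_bytes):
    """Create a file that is a hole of hole_bytes followed by one real byte."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)  # move past the end without writing anything
        f.write(b"\0")      # one byte of actual data after the hole


def apparent_and_allocated(path):
    """Return (st_size, st_blocks * 512): the ls -l view and the du view."""
    st = os.stat(path)
    # Assumption: st_blocks is counted in 512-byte units.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.img")
        make_sparse_file(path, 10 * 1024 * 1024)
        size, allocated = apparent_and_allocated(path)
        print("st_size (ls -l):       ", size)
        print("st_blocks * 512 (du):  ", allocated)
```

On a filesystem that supports sparse files, the first number should be a little over 10 MB while the second stays small.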