Differences in du and df command results

If the du and df commands report different disk usage and you want to know why it happens and how to fix it, this post is for you.

Have you ever run out of space on your Linux machine, checked with the df command, and seen something like this?

vignesh@localhost:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           768M  2.2M  766M   1% /run
/dev/nvme0n1p4  222G   222G   0  100% /  <------
tmpfs           3.8G  697M  3.1G  19% /dev/shm
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           4.0M     0  4.0M   0% /sys/fs/cgroup
/dev/nvme0n1p1  970M  216M  754M  23% /boot
/dev/nvme0n1p3  241M  5.2M  236M   3% /boot/efi
/dev/sda2       256G  1.9G  255G   1% /home/vignesh/VMs
/dev/sda1       256G  3.1G  253G   2% /home/vignesh/Downloads
/dev/sda3       420G  115G  305G  28% /home/vignesh/WorkFiles
tmpfs           768M  176K  768M   1% /run/user/1000

So you run the du command to find which directory, and then which file, is using all that space, only to find that the total reported by du does not match what df shows.

vignesh@localhost:~$ sudo du -sch /*
[sudo] password for vignesh: 
0	/bin
182M	/boot
0	/cdrom
80K	/dev
17M	/etc
130G	/home
0	/lib
0	/lib32
0	/lib64
0	/libx32
0	/media
0	/mnt
667M	/opt
du: cannot access '/proc/51557/task/51557/fd/4': No such file or directory
du: cannot access '/proc/51557/task/51557/fdinfo/4': No such file or directory
du: cannot access '/proc/51557/fd/3': No such file or directory
du: cannot access '/proc/51557/fdinfo/3': No such file or directory
0	/proc
7.7M	/root
du: cannot access '/run/user/1000/gvfs': Permission denied
2.3M	/run
0	/sbin
1.9G	/snap
0	/srv
0	/sys
56K	/tmp
8.3G	/usr
8.4G	/var
150G	total

And now you are like

[Image: Stewie, the baby from Family Guy, saying "What the Deuce?"]
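
One thing to keep in mind when comparing the two outputs above: du -sch /* also descends into any other file systems mounted below / (here, the three /dev/sda partitions under /home/vignesh), so its total is not directly comparable with the df line for /. The following is a rough sketch for comparing the two figures for a single file system. It assumes GNU coreutils (df --output is a GNU option), and the function name fs_gap is made up for this example.

```shell
# Compare what df counts as used with what du can reach, for one file system.
# fs_gap is a hypothetical helper name, not a standard command.
fs_gap() {
    mount_point=$1
    # df: used bytes according to the file system's own counters
    df_used=$(df -B1 --output=used "$mount_point" | tail -n 1 | tr -d ' ')
    # du: bytes reachable from the directory tree; -x stays on one file system
    du_used=$(du -sx -B1 "$mount_point" 2>/dev/null | cut -f1)
    echo "df used: $df_used bytes"
    echo "du sees: $du_used bytes"
    echo "gap:     $((df_used - du_used)) bytes"
}
```

Calling fs_gap / from a root shell gives a like-for-like comparison for the root partition. A small gap is normal (file system metadata, for example), but a gap of many gigabytes points to space that du cannot see.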

Let's fix it

The usual cause is a file that was deleted while a process still had it open. To fix it, you have to find the process that is keeping the deleted file open and kill it.

There are two ways to find the process ID.

Finding deleted file from proc

To find all the files that are kept open by all the processes, you can run the following command.

sudo find /proc/*/fd -ls

We only want to see the deleted files, so we filter the output:

sudo find /proc/*/fd -ls | grep '(deleted)'

This shows every open file that has been deleted, along with the process that has it open:

vignesh@localhost:~$ sudo find /proc/*/fd -ls | grep  '(deleted)'
[sudo] password for vignesh: 
  1222058      0 lrwx------   1 vignesh  vignesh        64 Jul 13 15:03 /proc/13964/fd/32 -> /tmp/.org.chromium.Chromium.hX85lB\ (deleted)
  1222060      0 lrwx------   1 vignesh  vignesh        64 Jul 13 15:03 /proc/13964/fd/34 -> /dev/shm/.org.chromium.Chromium.EatHOj\ (deleted)
  1222062      0 lrwx------   1 vignesh  vignesh        64 Jul 13 15:03 /proc/13964/fd/36 -> /dev/shm/.org.chromium.Chromium.zBwPPh\ (deleted)
  1222063      0 lrwx------   1 vignesh  vignesh        64 Jul 13 15:03 /proc/13964/fd/38 -> /dev/shm/.org.chromium.Chromium.N0KVxf\ (deleted)
  1222064      0 lrwx------   1 vignesh  vignesh        64 Jul 13 15:03 /proc/13964/fd/39 -> /dev/shm/.org.chromium.Chromium.LwqpXZ\ (deleted)
  1222065      0 lrwx------   1 vignesh  vignesh        64 Jul 13 15:03 /proc/13964/fd/40 -> /dev/shm/.org.chromium.Chromium.dWTkhn\ (deleted)
  1222066      0 lrwx------   1 vignesh  vignesh        64 Jul 13 15:03 /proc/13964/fd/41 -> /dev/shm/.org.chromium.Chromium.PY7RYu\ (deleted)

The process ID is the number that follows /proc/ in each path.

For the first entry in the above example, the path is /proc/13964/fd/32

Here, the PID is 13964. In general, the format is

/proc/<pid>/fd/<file_num>
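
When the list is long, picking PIDs out by eye gets tedious. As a small sketch (the function name deleted_fd_pids is invented here), the PIDs can be extracted from the paths and counted:

```shell
# Read lines such as the find output above on stdin and print "PID COUNT",
# where COUNT is the number of deleted files that PID still holds open.
# deleted_fd_pids is a hypothetical helper name.
deleted_fd_pids() {
    grep -o '/proc/[0-9]*/fd/[0-9]*' |   # keep only the /proc/<pid>/fd/<n> part
        cut -d/ -f3 |                    # third field of the path is the PID
        sort -n | uniq -c |              # count occurrences of each PID
        awk '{ printf "%s %s\n", $2, $1 }'
}
```

It would be used as sudo find /proc/*/fd -ls 2>/dev/null | grep '(deleted)' | deleted_fd_pids, and ps -p <PID> -o pid,comm then shows which program each PID belongs to.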

Using lsof command

Most Linux machines come with the lsof command. If yours does not, you can use the previous method.

The following command shows all files that are open but deleted (+L1 selects open files with a link count of less than 1):

sudo lsof +L1

That gives the following output:

vignesh@localhost:~$ sudo lsof +L1
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
COMMAND     PID    USER   FD   TYPE DEVICE SIZE/OFF NLINK      NODE NAME
pipewire   2557 vignesh   26u   REG    0,1     2312     0         2 /memfd:pipewire-memfd (deleted)
pipewire   2557 vignesh   29u   REG    0,1     2312     0      1027 /memfd:pipewire-memfd (deleted)
pulseaudi  2559 vignesh    6u   REG    0,1 67108864     0      2051 /memfd:pulseaudio (deleted)
gnome-she  2725 vignesh   35r   REG  259,4      828     0 272568322 /home/vignesh/.local/share/gvfs-metadata/home (deleted)
gnome-she  2725 vignesh   37r   REG  259,4       64     0 272338791 /home/vignesh/.local/share/gvfs-metadata/root (deleted)
gnome-she  2725 vignesh   52u   REG    0,1    51027     0      7183 /memfd:mutter-shared (deleted)

Here the PID of the process is shown under the PID column, and the NLINK column is 0 because no hard link to the file remains.
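
To see which process is holding the most space, the SIZE/OFF column can be summed per PID. This is only a sketch: it assumes the default column layout shown above, it assumes SIZE/OFF holds a size (lsof can show an offset there for some entries), and the function name largest_deleted is made up.

```shell
# Read `lsof +L1` output on stdin and print "BYTES PID COMMAND",
# largest first. Only regular files (TYPE == REG) are counted.
# largest_deleted is a hypothetical helper name.
largest_deleted() {
    awk '$5 == "REG" { size[$2 " " $1] += $7 }
         END { for (k in size) printf "%.0f %s\n", size[k], k }' |
        sort -rn
}
```

Run as sudo lsof +L1 2>/dev/null | largest_deleted. For the sample output above it prints:

```shell
67108864 2559 pulseaudi
51919 2725 gnome-she
4624 2557 pipewire
```

Entries named /memfd:... live in memory rather than on disk, so they do not explain a full partition. Look for large files on the device that is actually full.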

Now it's time to kill the process

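You can kill the process with the following command. The -9 option sends SIGKILL, which stops the process immediately without giving it a chance to clean up.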

sudo kill -9 <PID>

[GIF: a sloth covering the face of a cat, captioned "Shhh... Only dreams now"]

If the process is a service, you can restart the service instead. If that doesn't work, kill the process first and then restart the service using systemctl or init.
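
Since kill -9 gives the process no chance to flush data or clean up, a gentler sequence is often worth trying first. The following is a minimal sketch, not the only way to do it: it sends SIGTERM, waits up to five seconds, and only then falls back to SIGKILL. The function name stop_process and the five-second limit are arbitrary choices for this example.

```shell
# Ask a process to exit, and force it only if it ignores the request.
# stop_process is a hypothetical helper name.
stop_process() {
    pid=$1
    # Polite request first; give up if the signal cannot be delivered.
    kill -TERM "$pid" 2>/dev/null || { echo "cannot signal $pid" >&2; return 1; }
    for i in 1 2 3 4 5; do
        # kill -0 sends no signal; it only checks whether the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
    done
    # Still there after five seconds: same effect as kill -9.
    kill -KILL "$pid" 2>/dev/null
}
```

For a process owned by another user, this has to run from a root shell, for example stop_process 13964.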

Explanation

If you would like to know why this happened, read on.

The df and du commands measure used space in entirely different ways.

The df command

The df command reports space for a whole partition (file system) based on block usage.

When a file is written, its data is stored in whatever blocks of secondary storage are available, so the physical location can be fairly arbitrary. The file system finds the data again through an index entry (the inode), which records which blocks belong to the file.

When the df command is issued, it does not look at individual files at all. It asks the file system (through the statfs system call) for the block counters it already maintains: total blocks, free blocks and blocks available to unprivileged users. Used space is simply total minus free, so every block that is still allocated is counted, whether or not any directory entry points to it.

This is really fast, which is why df usually completes in well under a second.
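
The same file-system-wide counters can be read directly with stat -f. As a sketch, assuming GNU stat (the format letters below are GNU's) and with a made-up function name, the used space can be computed like this:

```shell
# Print the used bytes of the file system that contains the given path,
# using only the file system's own block counters (no files are visited).
# fs_used_bytes is a hypothetical helper name.
fs_used_bytes() {
    # %b = total data blocks, %f = free blocks, %S = fundamental block size
    stat -f -c '%b %f %S' "$1" | {
        read -r total free bsize
        echo $(( (total - free) * bsize ))
    }
}
```

Running fs_used_bytes / should give roughly the figure that df shows in its Used column for /, and it returns just as quickly because nothing is scanned.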

The du command

The du command works at the level of directories and files.

In Linux, a file name in a directory is a hard link to the file's index entry (inode) on disk. When a user or program opens a file by its path, the kernel follows the hard link to find the inode, and the inode is then used for all further processing of the file.

This is why a single file can have multiple hard links or soft links. The only restriction on hard links is that they must all be on the same file system as the file. Soft links point to a path rather than to an inode, so they can live on any partition.

The du command walks the directory tree, follows each hard link it finds to the inode, and reads the file's size from there. This is necessary because du has to report the space used inside a directory, not the space used on the file system as a whole.

Thus, du recursively checks every hard link under that directory and adds up the space used by all the files, counting a file with several hard links only once.

This is a slow process: on a large directory tree it can easily take more than 10 seconds.
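
A quick way to see the hard link handling in action is to give one file two names and compare du with and without the -l (count links) option. This is a throwaway sketch in a temporary directory. The file names are arbitrary, and the exact numbers depend on the file system.

```shell
demo=$(mktemp -d)
head -c 1048576 /dev/urandom > "$demo/data"   # one 1 MiB file
ln "$demo/data" "$demo/second-name"           # second hard link, same inode

once=$(du -sk "$demo" | cut -f1)     # default: each inode is counted once
twice=$(du -skl "$demo" | cut -f1)   # -l: every hard link is counted separately
echo "du -sk: ${once}K, du -skl: ${twice}K"

rm -r "$demo"
```

The first figure should be about 1 MiB and the second about 2 MiB, even though only one copy of the data exists on disk.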

What's the problem?

If a process keeps a file open and keeps writing to it, its file descriptor references the inode directly rather than the file name. This means you can delete the file in Linux while it is still in use. Unlike Windows, Linux will not throw a "File is currently in use" error.

When you delete the file, only the hard link to that inode is removed. Because the process still has the file open, the file system keeps the inode and its data blocks until the process closes the file or exits.

As a result, df shows the disk filling up, and if the process isn't stopped, the disk eventually reaches 100%. But when you use du to locate the file, you cannot find it, and the total that du reports for / does not add up to the used space that df reports for /.

In other words, du cannot see a file whose hard link has been deleted, so it leaves it out of its total, while df still counts it because the file system's block counters show that the space is not actually free.
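
The whole situation can be reproduced safely in a temporary directory. The sketch below is Linux-only because it reads /proc. It keeps a file open on file descriptor 9 of the current shell, deletes the file, and then checks what du and /proc report. The descriptor number and file name are arbitrary.

```shell
demo=$(mktemp -d)
exec 9>"$demo/big.log"                 # open the file and keep it open on fd 9
head -c 8388608 /dev/urandom >&9       # write 8 MiB through that descriptor

before=$(du -sk "$demo" | cut -f1)     # du can still reach the file by name
rm "$demo/big.log"                     # remove the name; fd 9 still holds the inode
after=$(du -sk "$demo" | cut -f1)      # du no longer sees the file
target=$(readlink /proc/self/fd/9)     # readlink inherits fd 9 from the shell

echo "du before rm: ${before}K, after rm: ${after}K"
echo "fd 9 -> $target"

exec 9>&-                              # closing the descriptor releases the blocks
rmdir "$demo"
```

The second echo prints a path ending in (deleted), just like the find output earlier, and the space is only returned to the file system when the descriptor is closed.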

So df is more accurate than du about how much space is really in use, but du is more precise than df about where that space is being used.