Pythian Blog: Technical Track

The Enigma of Disappearing Disk Space: Why Space Isn't Freed After Deleting Files in Linux?

Introduction

If you're a Linux user, you might have encountered a perplexing scenario where you delete a large file or folder, expecting to reclaim disk space immediately, only to find that the freed space is not reflected in the available storage. This enigmatic phenomenon can be frustrating and confusing for many users. In this blog, we will unravel the mysteries behind why space isn't immediately freed after deleting files in Linux and explore the inner workings of the filesystem.

Understanding the File System Basics

Before we delve into the specifics, it's essential to understand the fundamentals of how Linux filesystems manage data. Many Linux systems use the "ext" (ext2, ext3, ext4) filesystem family, which employs a hierarchical structure to store files and directories. Each filesystem stores file contents in data blocks and tracks them through inodes, and the way those blocks are allocated and released explains the behavior of disk space.

File Deletion in Linux

When you delete a file in Linux, the actual data in the file is not immediately removed from the disk. The file's directory entry is deleted, making it seem like the file has vanished, and once nothing else refers to the file, the filesystem marks the corresponding data blocks as available for reuse. The content itself is not erased; it remains on the disk until the space is needed for new data.
Consider a root volume that is almost full:
root@localhost ~ % df -h
Filesystem       Size     Used     Avail   Use%   Mounted on
/dev/disk3s1s1   400Gi    384Gi    16Gi    96%    /
devfs            215Ki    215Ki    0Bi     100%   /dev
In this example, you find a file named test_log, generated by an application, that occupies about 40 GB in the directory. You perform the following operations to try to release space:
  1. Delete the test_log file.
    rm test_log
  2. Check the file system disk space usage.
    root@localhost ~ % df -h
    Filesystem       Size     Used     Avail   Use%   Mounted on
    /dev/disk3s1s1   400Gi    384Gi    16Gi    96%    /
    devfs            215Ki    215Ki    0Bi     100%   /dev
However, the output shows that the usage is still 96% on the root volume.

The Role of the "unlink" Operation

The process of deleting a file in Linux involves an operation known as "unlink" or "unlinking." This operation removes the link between the file's name and its inode (a data structure that contains information about the file). Unlinking a name does not by itself free anything: the inode is marked as free, and its data blocks become eligible for reuse, only when its link count has dropped to zero and no process still has the file open.
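
To make the link-count idea concrete, here is a minimal sketch that gives one inode two names and then removes one of them. It assumes bash and GNU coreutils (the `stat -c` format option), and the scratch directory and file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: unlink removes a name; the inode survives while other links remain.
# Assumes GNU coreutils `stat` (-c format option). All paths are hypothetical.
set -eu

workdir=$(mktemp -d)                          # scratch directory for the demo
printf 'payload\n' > "$workdir/original"
ln "$workdir/original" "$workdir/alias"       # second name for the same inode

links_before=$(stat -c %h "$workdir/alias")   # %h = number of hard links
rm "$workdir/original"                        # unlink one of the two names
links_after=$(stat -c %h "$workdir/alias")
survivor=$(cat "$workdir/alias")              # data is still reachable

echo "links before rm: $links_before, after rm: $links_after, data: $survivor"

rm -r "$workdir"                              # clean up the scratch directory
```

Only when the last name is removed, and no process holds the file open, does the filesystem release the inode and its blocks.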

Delayed Disk Space Reclamation

Now that we understand the basics of file deletion, let's explore why disk space isn't immediately reclaimed.

File Still Open: If a program has a file open at the time of deletion, the space won't be freed until the file is closed. Linux allows files to remain open even after they are unlinked, as long as they are still in use by a process. This feature ensures that processes can continue to work with the file until they finish using it. To check for this, use a tool such as lsof, which lists open file handles, and look for entries marked as deleted to see whether any process is still holding the test_log file:
root@localhost ~# lsof -n | grep deleted
COMMAND   PID     USER    FD   TYPE   DEVICE   SIZE          NODE   NAME
httpd     13423   root    5u   REG    253,3    42949672960   17     (deleted)
mysqld    13423   mysql   6u   REG    253,3    0             18     (deleted)
mysqld    13423   mysql   7u   REG    253,3    0             19     (deleted)
As shown in the command output, the test_log file is still held open by the httpd process, which keeps writing data into it. The (deleted) marker in the NAME column indicates that the file has been deleted, and the SIZE column shows that it still occupies 42949672960 bytes (40 GiB). The space is not released because httpd still has the file open.
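
If lsof is not installed, the same information can be read from /proc on Linux. The following is a rough sketch rather than a replacement for lsof: it assumes bash and GNU find (for `-lname` and `-printf`), the helper name `list_deleted_open` is invented for this example, and the shell itself plays the role of the process that keeps a deleted log open.

```shell
#!/usr/bin/env bash
# Sketch: spot deleted-but-open files by reading /proc directly (Linux only).
# Assumes GNU find. The helper name and all paths are hypothetical.
set -eu

# Each /proc/<pid>/fd/<n> is a symlink to the open file; the kernel appends
# " (deleted)" to the link target once the last name has been removed.
list_deleted_open() {
    find /proc/[0-9]*/fd -lname '*(deleted)' -printf '%p -> %l\n' 2>/dev/null || true
}

workdir=$(mktemp -d)
demo="$workdir/test_log"
printf 'log line\n' > "$demo"
exec 3>> "$demo"          # this shell stands in for the writing process
rm "$demo"                # the name is gone, descriptor 3 is still open

# Pick this shell's own entry out of the system-wide listing.
own_entry=$(list_deleted_open | grep "^/proc/$$/fd/3 ")
echo "$own_entry"

exec 3>&-                 # closing the descriptor lets the blocks be freed
rmdir "$workdir"
```

Without root privileges the listing only covers processes you are allowed to inspect. Where lsof is available, `lsof +L1` is another way to list open files whose link count has dropped below one.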

So how do we resolve this issue? By Forcing Disk Space Reclamation

While Linux automatically manages disk space for efficiency and performance reasons, you can manually force disk space reclamation in certain situations. Some techniques include:
  1. Closing Open Handles: Identify and close any open file handles that might be holding on to deleted files. Restarting the httpd process (or, as a last resort, the system) closes its handles and releases the space held by the deleted file. The next time the log grows too large, empty the contents of test_log rather than deleting the file.
Run the following command to empty test_log:
# : > /test_log
Truncating the file in place releases the disk space immediately, and the process can continue writing data to the file. Now run the df -h command to check the space usage again. Usage on the root volume has dropped from 96% to 86%, which corresponds to the 40 GiB that was freed.
root@localhost ~ % df -h
Filesystem       Size     Used     Avail   Use%   Mounted on
/dev/disk3s1s1   400Gi    344Gi    56Gi    86%    /
devfs            215Ki    215Ki    0Bi     100%   /dev
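
The difference between truncating and deleting can be tried safely in a scratch directory. This is a small sketch under a few assumptions: bash, GNU stat, a 1 MiB stand-in for the log, and a writer that opened the file in append mode. The paths are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: truncating (not deleting) a log that a writer still holds open.
# Assumes GNU coreutils `stat`. All paths are hypothetical.
set -eu

workdir=$(mktemp -d)                    # scratch directory for the demo
demo="$workdir/test_log"

head -c 1048576 /dev/zero > "$demo"     # 1 MiB of filler "log" data
exec 3>> "$demo"                        # this shell stands in for the writer (append mode)

size_before=$(stat -c %s "$demo")       # %s = size in bytes
: > "$demo"                             # truncate in place; name and inode stay
size_truncated=$(stat -c %s "$demo")

printf 'new\n' >&3                      # the writer carries on through its old descriptor
size_after_write=$(stat -c %s "$demo")

echo "before: $size_before, truncated: $size_truncated, after new write: $size_after_write"

exec 3>&-                               # close the writer's descriptor
rm -r "$workdir"                        # clean up
```

If the file has already been deleted, there is no name left to truncate. On Linux the same truncation can usually be applied through the descriptor path, for example `: > /proc/13423/fd/5` for the httpd entry in the lsof output above, though it is worth confirming first that the application tolerates it. A writer that did not open the log in append mode keeps its old offset, so its next write would leave a hole at the start of the file rather than start again at zero.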
Snapshots and Backups: Some filesystems support snapshots or have backup mechanisms that retain deleted files for a certain period. These features are essential for data recovery and version control purposes, but they can temporarily prevent the immediate release of disk space.

Conclusion

In conclusion, the mystery of why disk space isn't immediately freed after deleting files in Linux comes down to how the filesystem handles deletion: blocks are released only after the last name has been unlinked and the last open handle has been closed, and snapshots or backups can hold on to them for longer. Understanding how the filesystem manages data and being aware of the factors that delay space reclamation will help users avoid confusion and make better use of their storage resources. While Linux's automatic management of disk space ensures efficient performance, users can apply manual techniques such as closing open handles or truncating files to free up space when needed.
