When you work with a lot of media and documents, multiple copies of the same file inevitably accumulate on your computer. The result is cluttered disk space filled with redundant files, which makes it worth checking your system for duplicates regularly.

For this purpose, you can find various programs that identify and delete duplicate files. And fdupes happens to be one such program for Linux. This guide introduces fdupes and walks you through the steps to find and delete duplicate files on Linux.

What is fdupes?

Fdupes is a CLI-based program for finding and deleting duplicate files on Linux. It is published on GitHub under the MIT license.

In its simplest form, the program scans the specified directory and compares the MD5 signatures of its files. It then performs a byte-by-byte comparison on files with matching signatures to confirm they are true duplicates and ensure no false matches slip through.
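The same basic idea can be sketched with standard shell tools: hash every file, then group the files that share a checksum. This is only an illustration of the technique, not how fdupes is implemented internally, and the /tmp paths and file names below are made up for the demo.

```shell
# Set up a scratch directory with two identical files and one unique file
# (all names here are hypothetical, for the demo only).
mkdir -p /tmp/dupe-demo
echo "hello" > /tmp/dupe-demo/a.txt
echo "hello" > /tmp/dupe-demo/b.txt
echo "world" > /tmp/dupe-demo/c.txt

# Hash every file, sort so identical checksums land next to each other,
# then print only the groups that share a checksum. -w32 tells uniq to
# compare just the 32-character MD5 field; --all-repeated prints every
# member of each duplicate group.
md5sum /tmp/dupe-demo/*.txt | sort | uniq -w32 --all-repeated=separate
```

Here a.txt and b.txt would be reported as a duplicate pair, while c.txt is left out. Unlike fdupes, this sketch stops at the checksum stage and skips the confirming byte-by-byte comparison.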

Once fdupes identifies duplicate files, you can either delete them or replace them with hard links (links to the original files), depending on what you need.
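To see what the hard-link option amounts to, the same replacement can be done by hand with the standard ln command. A minimal sketch, using made-up file names under /tmp:

```shell
# Create an original file and a duplicate copy (hypothetical names).
mkdir -p /tmp/link-demo
echo "some data" > /tmp/link-demo/original.txt
cp /tmp/link-demo/original.txt /tmp/link-demo/copy.txt

# Replace the copy with a hard link to the original; -f overwrites
# the existing copy.txt in place.
ln -f /tmp/link-demo/original.txt /tmp/link-demo/copy.txt

# Both names now refer to the same inode, so the data is stored only once.
stat -c %i /tmp/link-demo/original.txt /tmp/link-demo/copy.txt
```

After the ln command, deleting either name leaves the data intact under the other, which is why hard-linking is a safe space-saving alternative to outright deletion.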

How do I install fdupes on Linux?

Fdupes is available on most major Linux distributions, including Ubuntu, Arch, and Fedora. Depending on the distribution you are running, use the appropriate command below.

On Ubuntu or Debian based systems:

sudo apt install fdupes

To install fdupes on Fedora / CentOS and other RHEL-based distributions:

sudo dnf install fdupes

On Arch Linux and Manjaro:

sudo pacman -S fdupes

How do you use fdupes?

After installing the program on your computer, follow the steps below to find and remove duplicate files.

Finding duplicate files with fdupes

Let's start by looking for all of the duplicate files in a directory. The basic syntax for this is:

fdupes path/to/directory

For example, if you have duplicate files in the documents directory, you would run:

fdupes ~/documents

Output:

[Screenshot: duplicate files identified in a single directory]

If fdupes finds duplicate files in the specified directory, it returns a list of all redundant files grouped into sets, and you can then perform further operations on them as needed.

However, if the directory you specified contains subdirectories, the above command will not detect duplicates inside them. In such situations, you need to do a recursive search to find all duplicate files in the subdirectories.

To do a recursive search in fdupes, use the -r flag:

fdupes -r path/to/directory

For example:

fdupes -r ~/documents

Output:

[Screenshot: recursive duplicate search with fdupes]

While the above two commands can easily find duplicate files within the specified directory (and its subdirectories), their output also includes zero-length (or empty) duplicate files.

While this behavior can still be useful if you have many empty duplicate files on your system, it can create confusion if you just want to find non-empty duplicates in a directory.

Fortunately, fdupes lets you exclude zero-length files from the search results with the -n option.

Note: You can exclude zero-length duplicate files in both normal and recursive searches.

To only check for non-empty duplicate files on your computer:

fdupes -n ~/documents

Output:

[Screenshot: non-empty duplicate file search with fdupes]

When dealing with multiple sets of duplicate files, it is a good idea to output the results to a text file for future reference.

To do this, redirect the output:

fdupes path/to/directory > filename.txt

…where path/to/directory is the directory you want to search.

For example, to find duplicate files in the Documents directory and send the output to a file:

fdupes /home/Documents > output.txt

Finally, if you want to see a summary of all duplicate file information in a directory, you can use the -m flag:

fdupes -m path/to/directory

For example, to get duplicate file information for the documents directory:

fdupes -m ~/documents

Output:

[Screenshot: summary of duplicate files shown with fdupes -m]

If you ever need help with a command or function while using fdupes, use the -h option to view the command-line help:

fdupes -h

[Screenshot: fdupes help menu]

Deleting duplicate files in Linux with fdupes

After identifying the duplicate files in a directory, you can proceed with removing those files from your system to clean up clutter and free up disk space.

To delete duplicate files, add the -d flag to the command and hit Enter:

fdupes -d path/to/directory

For example, to remove duplicate files in the Downloads directory:

fdupes -d ~/Downloads

Fdupes will now present you with a list of all the duplicate files in this directory and prompt you to choose which ones to keep on your computer.

For example, if you want to keep the first file in set 1, type 1 at the prompt and hit Enter.

[Screenshot: deleting duplicate files on Linux with fdupes]

You can also keep multiple files in a set of returned duplicates if necessary. To do this, enter the numbers of the files you want to keep as a comma-separated list and press Enter.

For example, if you want to keep files 1, 3, and 5, you would type:

1,3,5

If you want to keep the first file in each set of duplicates and skip the prompt entirely, use the -N switch together with -d, as shown in the following command:

fdupes -d -N path/to/directory

For example:

fdupes -d -N ~/documents

Successful deletion of duplicate files in Linux

Organizing files is a tedious task in and of itself. Add the problems that duplicate files cause, and you can easily lose hours of time and effort sorting out your disordered storage.

But thanks to utilities like fdupes, it's much easier and more efficient to identify and delete duplicate files. And the above guide should help you with these operations on your Linux computer.

Similar to duplicate files, duplicate words and repeated lines in a file can be frustrating and require dedicated tools to remove. If you face such problems too, you can use uniq to remove duplicate lines from a text file.
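As a quick taste of that workflow, sorting a file before piping it through uniq collapses repeated lines, since uniq only removes adjacent duplicates. The file name below is made up for the demo:

```shell
# Build a small file with a repeated line (hypothetical path).
printf 'apple\nbanana\napple\n' > /tmp/fruits.txt

# uniq only drops adjacent repeats, so sort first to group them together.
sort /tmp/fruits.txt | uniq
# apple
# banana
```

Without the sort step, the two "apple" lines would not be adjacent and both would survive.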


About the author

Yash Wate
(23 articles published)

Yash is a Staff Writer at MUO covering DIY, Linux, programming, and security. Before he discovered his passion for writing, he developed for the web and iOS. You can also find his writing on TechPP, where he covers other verticals. Aside from technology, he likes to talk about astronomy, Formula 1, and watches.
