
DupeFinder.java—Find duplicate files

Note: an explanation of how this program works is still missing.

The DupeFinder program builds a list of files from its command-line parameters and reports the files that have the same content. It first discards every file whose size is unique, then computes a CRC32 checksum for each remaining file, and finally prints all files that share both checksum and size to standard output. Because CRC32 checksums can collide, a match in size and checksum is strong evidence of identical content rather than proof.
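
The two filtering steps described above can be sketched as follows. This is a minimal illustration, not the actual DupeFinder source: the class name DupeSketch and the methods crc32Of and findDuplicates are hypothetical, and the sketch assumes that every argument is a regular file (directory handling is left out).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical sketch class; not part of the original DupeFinder.java.
class DupeSketch {

    // Computes the CRC32 checksum of a file, reading it in 8 KiB chunks.
    static long crc32Of(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                crc.update(buffer, 0, n);
            }
        }
        return crc.getValue();
    }

    // Returns the groups of files that share both size and CRC32 checksum.
    static List<List<Path>> findDuplicates(List<Path> files) throws IOException {
        // Step 1: group all files by size.
        Map<Long, List<Path>> bySize = new TreeMap<>();
        for (Path f : files) {
            bySize.computeIfAbsent(Files.size(f), k -> new ArrayList<>()).add(f);
        }

        List<List<Path>> result = new ArrayList<>();
        for (List<Path> sameSize : bySize.values()) {
            // A file with a unique size cannot have a duplicate, so skip it
            // without ever reading its content.
            if (sameSize.size() < 2) {
                continue;
            }
            // Step 2: within one size group, group again by checksum.
            Map<Long, List<Path>> byCrc = new TreeMap<>();
            for (Path f : sameSize) {
                byCrc.computeIfAbsent(crc32Of(f), k -> new ArrayList<>()).add(f);
            }
            for (List<Path> group : byCrc.values()) {
                if (group.size() > 1) {
                    result.add(group);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg)); // assumption: each argument is a regular file
        }
        for (List<Path> group : findDuplicates(files)) {
            System.out.println(group);
        }
    }
}
```

Grouping by size first is cheap because it only needs file metadata. The checksum step, which has to read every byte, then runs only on files that still have a candidate partner.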

This program covers the following topics:

Note that this program turned out a bit larger and more complex than I expected. You may want to try the smaller examples first.

Once you have mastered the program's complexity, you will find some tips on extending it as a student project.

Compiling and running the program

These instructions are deliberately a bit verbose so that they stay beginner-friendly.

  1. Save the source code in a file named DupeFinder.java (the file name is case-sensitive).
  2. Open a command prompt (shell), change to the directory where you saved the file, and compile it:
    javac DupeFinder.java
    You should now have two new files, DupeFinder.class and FileInfo.class, in the same directory. The second class file exists because the source code file contains two class declarations.
  3. Run the program with this command:
    java DupeFinder FILE1 FILE2 ... DIR1 DIR2 ...
    where each FILE and DIR is the name of a file or directory; you can list them in any order.
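
To illustrate how a mixed list of FILE and DIR parameters could be turned into one flat file list, here is a small sketch. It is not taken from DupeFinder.java: the class ArgExpander and the method collectFiles are hypothetical names, and the sketch assumes that directories are searched recursively, which the real program may or may not do.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch class; not part of the original DupeFinder.java.
class ArgExpander {

    // Expands file and directory names, given in any order, into a list of regular files.
    static List<Path> collectFiles(String[] args) throws IOException {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            Path p = Paths.get(arg);
            if (Files.isDirectory(p)) {
                // Assumption: directories are walked recursively.
                try (Stream<Path> walk = Files.walk(p)) {
                    files.addAll(walk.filter(Files::isRegularFile)
                                     .sorted()
                                     .collect(Collectors.toList()));
                }
            } else if (Files.isRegularFile(p)) {
                files.add(p);
            } else {
                System.err.println("Skipping " + arg + ": not a file or directory");
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        for (Path f : collectFiles(args)) {
            System.out.println(f);
        }
    }
}
```

Running java ArgExpander FILE1 DIR1 prints one path per line, which makes it easy to check what a duplicate search would actually look at.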

Explanation

TODO.

Extending the program

If you have studied the program and find it interesting, here are some suggestions on how you could enhance it.

Source code of DupeFinder.java

DupeFinder.java