
Identify truncated JPEG files

A command-line program that identifies files which are not properly terminated JPEG files, prints their names to standard output and optionally (if the switch -d is given) deletes them.

This program may be useful to identify JPEG files which were truncated because of file transfer errors or partial recovery. It works by reading the last two bytes of a file and comparing them to the mandatory JPEG end-of-stream marker. If that marker is not present, the file is either a corrupted JPEG or never was a JPEG file in the first place.
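As a rough illustration of the check described above, here is a minimal sketch. It is not the program's actual source. It assumes that the end-of-stream marker in question is the JPEG EOI marker, i.e. the two bytes 0xFF 0xD9; the class name JpegTailCheck and the method endsWithEoi are made up for this example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper for illustration; not the original program. */
final class JpegTailCheck {

    /** Returns true if the last two bytes of the file are FF D9 (assumed EOI marker). */
    static boolean endsWithEoi(String fileName) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(fileName, "r")) {
            long length = file.length();
            if (length < 2) {
                return false; // too short to contain the two-byte marker
            }
            file.seek(length - 2);      // jump to the last two bytes
            int first = file.read();    // each read() yields 0..255, or -1 at end of file
            int second = file.read();
            return first == 0xFF && second == 0xD9;
        }
    }

    /** Prints the name of every argument file that fails the check; deletes nothing. */
    public static void main(String[] args) throws IOException {
        for (String name : args) {
            if (!endsWithEoi(name)) {
                System.out.println(name);
            }
        }
    }
}
```

Once compiled, the sketch can be run as java JpegTailCheck followed by file names. It only prints names and never deletes anything.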

Note that some software packages seem to append data after the end of the JPEG (JFIF) bitstream. Such files are also reported as corrupted. I have yet to find an explanation of why some software developers think that might be a good idea.

Program parameters have to be either file names or the switch "-d" (for delete).

Note: if -d is used, every file that does not end in the expected byte sequence is deleted, not only those which appear to be JPEG but aren't. So use it with care.

Note: This program is not a full JPEG parser. It uses a heuristic approach which may identify files as JPEG which in fact are not. However, those "false positives" are relatively unlikely (1 in 65,536 for a file with random content, because both of the last two bytes have to match).

License

This program and its source code are contributed to the Public Domain.

Usage

It is best to copy all files to be examined to a temporary directory and run this program on that directory, to avoid accidental deletion:

java IdentifyTruncatedJpeg -d c:\temp\*

To just check which files with a .jpg name extension are not properly terminated JPEGs, leave out the deletion switch:

java IdentifyTruncatedJpeg c:\images\*.jpg

Each name printed as a result of that call is either a damaged JPEG file or a valid file which isn't in JPEG format. Sometimes image files in one format have the wrong file extension, e.g. a GIF file which is called image.jpg. It may be a valid GIF file which displays just fine; it's just not a JPEG file. To avoid deleting such non-corrupt but misnamed files, use additional software to identify the file format, e.g. one of the software packages listed on my file formats main page.
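To give an idea of how such a format check can work, here is a minimal sketch that looks at the first bytes of a file. It is only an illustration and not taken from any of the packages mentioned. It assumes that JPEG files start with the bytes 0xFF 0xD8 and GIF files with the ASCII characters "GIF8", and it needs Java 9 or later for InputStream.readNBytes. The names SignatureCheck and guessFormat are made up for this example.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper for illustration; recognizes only two formats. */
final class SignatureCheck {

    /** Returns "JPEG", "GIF" or "unknown", judging by the first bytes only. */
    static String guessFormat(String fileName) throws IOException {
        byte[] head = new byte[4];
        int count;
        try (InputStream in = new FileInputStream(fileName)) {
            // readNBytes blocks until 4 bytes are read or the file ends
            count = in.readNBytes(head, 0, head.length);
        }
        // Assumed JPEG signature: FF D8 (start-of-image marker)
        if (count >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xD8) {
            return "JPEG";
        }
        // Assumed GIF signature: "GIF8" (covers GIF87a and GIF89a)
        if (count >= 4 && head[0] == 'G' && head[1] == 'I'
                && head[2] == 'F' && head[3] == '8') {
            return "GIF";
        }
        return "unknown";
    }

    /** Prints each argument file name followed by the guessed format. */
    public static void main(String[] args) throws IOException {
        for (String name : args) {
            System.out.println(name + ": " + guessFormat(name));
        }
    }
}
```

A file reported here as JPEG but also listed by the truncation check is probably a damaged JPEG; a file reported as GIF or unknown is more likely just misnamed.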

Download

See also

ImageInfo identifies JPEG files (among others) by looking at the signature at the beginning of files. ImageInfo will also identify truncated JPEG files as JPEG.