JPEG / JFIF image file format
- Typical file name extensions
.jpg
.jpeg
- Magic bytes
0xff 0xd8 at file offset 0x00
- MIME types
image/jpeg
image/pjpeg (for progressive JPEGs)
- Image types
- Continuous tone images (photos) with 5 or more bits per channel.
- Most popular color types: grayscale, 8 bits (one channel) and
YCbCr color, 24 bits (three channels).
Also YCCK, 32 bits (four channels).
- Popularity
- Very high. Standard file format for the exchange of photos.
- Other formats
-
JPEG (or at least some of its compression types) can be used as a compression type within TIFF.
TIFF defines two compression values for it: 6 (old-style JPEG, now discouraged) and 7 (modern JPEG).
Various other formats embed JFIF bitstreams, e.g. PDF.
In 1990, the Joint Photographic Experts Group
defined various methods to be used for photo image compression.
They also defined a bitstream to store the various data structures and
compressed image data.
The US company C-Cube Microsystems (acquired in 2001 by LSI Logic, later renamed LSI Corporation)
added a few details to make the interchange of JPEG bitstreams possible,
including the color space.
Their format was called JPEG File Interchange Format, or JFIF for short.
JPEG2000 is not an extension of JPEG but a completely new development.
A good introduction to JPEG is part of the JPEG FAQ.
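A JPEG bitstream is a sequence of data chunks (segments), each of which starts with a marker value.
A marker is a 16-bit integer value, stored in big-endian byte order, with the most significant byte set to 0xff.
The lower byte of the marker value determines its type.
Most markers are followed by a 16-bit big-endian size value that counts the two size bytes themselves plus the payload; a few markers, such as start of image (0xffd8) and end of image (0xffd9), stand alone.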
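The marker layout described above can be walked with a few lines of code. The following is a minimal sketch in Python, not a full parser: the function name iter_segments is made up for this example, and it deliberately stops at the first start-of-scan marker (0xda), because the entropy-coded image data that follows does not use the simple marker-plus-size layout.

```python
import struct

# Markers that stand alone and carry no size field:
# TEM (0x01), RST0-RST7 (0xd0-0xd7), SOI (0xd8) and EOI (0xd9).
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))


def iter_segments(data: bytes):
    """Yield (marker_type, payload) pairs up to and including the first SOS.

    marker_type is the low byte of the 16-bit marker, e.g. 0xe0 for APP0.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: SOI marker missing at offset 0")
    yield 0xD8, b""
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker_type = data[pos + 1]
        if marker_type == 0xFF:
            pos += 1  # 0xff fill byte in front of the real marker
            continue
        pos += 2
        if marker_type in STANDALONE:
            yield marker_type, b""
            if marker_type == 0xD9:  # end of image
                return
            continue
        if pos + 2 > len(data):
            raise ValueError("stream ends inside a size field")
        (size,) = struct.unpack(">H", data[pos:pos + 2])
        if size < 2 or pos + size > len(data):
            raise ValueError(f"bad segment size {size} at offset {pos}")
        # The size counts its own two bytes, so the payload is size - 2 bytes.
        yield marker_type, data[pos + 2:pos + size]
        pos += size
        if marker_type == 0xDA:  # start of scan: compressed data follows
            return
```

Calling list(iter_segments(open("in.jpg", "rb").read())) on a typical file gives a quick overview of which tables and application markers it contains.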
JPEG defines a number of compression types to be used.
However, only two of these types are in widespread use,
Baseline DCT / Huffman and its progressive version.
Both of these popular compression types are lossy: decompressing
a compressed JPEG does not yield an identical copy of the original.
There are lossless compression types, but they are rarely used,
so the common saying that JPEG is lossy isn't perfectly accurate,
but it is true most of the time.
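Which compression type a given file uses can be read from its start-of-frame (SOF) marker. The sketch below is a hedged illustration rather than a complete classifier: the function name frame_info and the wording of the process names are choices made for this example, and only the four Huffman-coded SOF types are named explicitly.

```python
import struct

# Low byte of the SOF marker and the compression process it announces.
SOF_NAMES = {
    0xC0: "baseline DCT, Huffman",
    0xC1: "extended sequential DCT, Huffman",
    0xC2: "progressive DCT, Huffman",
    0xC3: "lossless, Huffman",
}
# 0xc4 (DHT), 0xc8 (reserved) and 0xcc (DAC) sit in the SOF range
# but are not frame headers.
NOT_SOF = {0xC4, 0xC8, 0xCC}
NO_SIZE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))


def frame_info(data: bytes):
    """Return (process, bits_per_sample, width, height, channels)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker_type = data[pos + 1]
        if marker_type == 0xFF:  # fill byte
            pos += 1
            continue
        if marker_type in NO_SIZE:
            pos += 2
            continue
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if 0xC0 <= marker_type <= 0xCF and marker_type not in NOT_SOF:
            # Frame header: precision, height, width, number of components.
            bits, height, width, channels = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10])
            name = SOF_NAMES.get(
                marker_type, f"other process (SOF{marker_type - 0xC0})")
            return name, bits, width, height, channels
        if marker_type == 0xDA:  # scan data reached without a frame header
            break
        pos += 2 + size
    raise ValueError("no SOF segment found")
```

A file reporting anything other than the baseline or progressive process is one of the rarely used types mentioned above.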
Optimizing JPEG compression for size
There are various options to choose from when saving an image file as JPEG,
and a number of optimization tools.
Some of the following steps can be performed by almost every application
capable of saving images in JPEG format.
-
Quality settings.
JPEG encoders typically allow the user to set a quality number between 1 and 100.
This allows a trade-off between loss of image quality and file size.
The quality number influences the quantization tables which are ultimately responsible
for how much information is thrown away.
There is no single perfect quality value, so you might want to test various
numbers with images typical for your projects.
-
Color sub-sampling.
This option does not apply to grayscale images.
Storing the two chrominance channels Cb and Cr with a lower resolution than the luminance channel (Y) reduces the amount of data to be compressed.
Typical settings halve the chrominance resolution either horizontally only (often written 4:2:2) or both horizontally and vertically (often written 4:2:0).
Thus, only one Cb and one Cr sample is stored for every 2x1 or 2x2 block of luminance samples.
-
Optimized Huffman tables.
If the encoding software offers an option such as Optimize Huffman,
enabling it requires two passes over the data (so encoding takes longer)
but chooses codes for the lossless entropy-coding stage which are tailored to each image,
improving the compression ratio.
-
Progressive mode.
Many applications support progressive mode JPEGs these days.
Progressive JPEGs store an image in several iterations,
adding details in every iteration.
In progressive mode, files are usually a bit smaller compared to the standard
baseline JPEGs.
-
Remove metadata.
Some of the metadata written to JPEG files by image editors or digital cameras
can take up several kilobytes of space, including thumbnails, audio tracks or
application-specific markers like those used by Photoshop.
While all of this data may be useful under certain conditions,
it is not essential.
When exporting images as JPEG files, some software packages offer to not include
metadata or remove it from existing JPEGs.
The original file with all the metadata should be kept as a backup, though.
-
Reduce pixel resolution.
This isn't a JPEG-specific step.
In many cases, a lower resolution of an image is sufficient.
All software packages capable of image scaling can perform this reduction step.
-
Reduce detail.
Blurring the complete input image or non-essential parts of it
before compressing it with JPEG usually results in a smaller file.
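Two of the steps above, the quality setting and color sub-sampling, can be made concrete with a little arithmetic. The sketch below makes two assumptions that are not from this text: the quality-to-scale mapping follows the convention of the IJG library as far as I recall it (other encoders map quality numbers differently), and the base table is the example luminance table from Annex K of the JPEG specification. The function names are made up for this illustration.

```python
# Example luminance quantization table from Annex K of the JPEG
# specification, listed row by row (natural order, not zigzag order).
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scaling_factor(quality: int) -> int:
    """Map a 1..100 quality number to a percentage scale (IJG-style, assumed)."""
    quality = max(1, min(100, quality))
    if quality < 50:
        return 5000 // quality
    return 200 - 2 * quality


def scaled_table(quality: int, base=BASE_LUMA) -> list:
    """Scale the base table; larger entries discard more information."""
    scale = scaling_factor(quality)
    # Clamp to 1..255 so that every entry fits the 8-bit tables of baseline JPEG.
    return [max(1, min(255, (value * scale + 50) // 100)) for value in base]


def samples_per_pixel(h_factor: int, v_factor: int) -> float:
    """Average stored samples per pixel: one Y sample plus sub-sampled Cb and Cr.

    h_factor and v_factor are the horizontal and vertical reduction
    factors applied to both chrominance channels.
    """
    return 1 + 2 / (h_factor * v_factor)
```

Under these assumptions, quality 50 reproduces the base table unchanged and quality 100 sets every entry to 1, which is why files grow so quickly at the top of the scale. Sub-sampling both chrominance channels by 2 in each direction stores 1.5 samples per pixel instead of 3, halving the data before any compression takes place.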
With the free command line tool jpegtran (see the Applications section below)
some of the steps can be performed like this:
jpegtran -copy none -optimize -progressive < in.jpg > out.jpg
The -copy none switch removes extra markers, which includes Exif metadata.
The -optimize switch gives the resulting JPEG optimized Huffman tables, while
-progressive turns on progressive mode.
A command line program like this is easy to run from a script, such as one
you may already be using to export images for the Web.
The additional call to jpegtran can save some traffic in the long run.
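As an illustration of such a script, here is a minimal Python sketch that wraps the jpegtran call shown above. It assumes that jpegtran is installed and on the search path; the function names are hypothetical, and error handling is limited to letting a failed jpegtran run raise an exception.

```python
import subprocess
from pathlib import Path


def jpegtran_command() -> list:
    """Return the jpegtran invocation used in the text, one list item per argument."""
    return ["jpegtran", "-copy", "none", "-optimize", "-progressive"]


def optimize_jpeg(src, dst) -> int:
    """Run jpegtran on src, write the result to dst, return the bytes saved.

    jpegtran reads from standard input and writes to standard output here,
    mirroring the shell redirection in the command line above.
    The return value is negative if the output came out larger.
    """
    src, dst = Path(src), Path(dst)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        subprocess.run(jpegtran_command(), stdin=fin, stdout=fout, check=True)
    return src.stat().st_size - dst.stat().st_size
```

Since jpegtran rearranges the already compressed data instead of decoding and re-encoding the image, the pixels are unchanged; keeping the original is still advisable because the metadata is gone from the output.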
-
The comment marker (COM, 0xfffe) is a special marker type for textual comments.
It stores one byte per character; everything beyond ASCII is probably not safe to use across platform boundaries.
-
Exif (Exchangeable Image File Format) markers store additional information on the image,
optionally including a thumbnail and even audio information.
Exif is used mostly by digital cameras to record capture-related information (whether a flash was used, aperture, shutter speed and so on).
Exif uses the format of TIFF image file directories to store its information.
-
A Photoshop application marker can be used both for Photoshop-specific information like clipping paths and for metadata following the IPTC standard.
Within the application marker, data is stored as 8BIM resources, which are documented in the Photoshop PSD specification.
-
A more recent development is Adobe's metadata standard XMP (based on XML).
It is supposed to replace the IPTC header structures and is capable of including user extensions.
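The metadata types listed above can be told apart by marker type and by the identifier string at the start of the payload. The following Python sketch is a hedged illustration: the function name describe_metadata and the labels it returns are invented for this example, and the identifier strings are the ones commonly used in practice for Exif, XMP and Photoshop data, not something this text specifies.

```python
import struct

EXIF_ID = b"Exif\x00\x00"                      # APP1 (0xe1)
XMP_ID = b"http://ns.adobe.com/xap/1.0/\x00"   # APP1 (0xe1)
PHOTOSHOP_ID = b"Photoshop 3.0\x00"            # APP13 (0xed)


def describe_metadata(data: bytes) -> list:
    """List the metadata segments found in front of the first scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    found = []
    pos = 2
    # Segments between SOI and the first scan all carry a size field.
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker_type = data[pos + 1]
        if marker_type == 0xFF:  # fill byte
            pos += 1
            continue
        if marker_type in (0xD9, 0xDA):  # end of image or start of scan
            break
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + size]
        if marker_type == 0xFE:
            # Comments: one byte per character, only ASCII is portable.
            found.append("comment: " + payload.decode("ascii", errors="replace"))
        elif marker_type == 0xE1 and payload.startswith(EXIF_ID):
            found.append("Exif")
        elif marker_type == 0xE1 and payload.startswith(XMP_ID):
            found.append("XMP")
        elif marker_type == 0xED and payload.startswith(PHOTOSHOP_ID):
            found.append("Photoshop 8BIM resources")
        pos += 2 + size
    return found
```

Decoding the actual Exif, IPTC or XMP content takes considerably more work and is better left to a dedicated metadata library.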
Metadata libraries
Libraries
- Independent JPEG Group C library, free, cross-platform.
- Pegasus Imaging, a company offering various proprietary JPEG-related libraries and tools.
- pasjpeg, a port of the IJG library to Pascal.
- Small JPEG Decoder Library - C++ library.
- JPEG reading code has been part of the Java runtime library since version 1.0 (class java.awt.Toolkit).
Writing code has been available since 1.4 (package javax.imageio).
Both internally use IJG's library (native code).
See the list of Java image I/O libraries for more.
- The above libraries all deal with the popular subset of JPEG for lossy compression.
At HP Labs there is a C implementation and Photoshop plugin for
lossless JPEG (which doesn't include all of JPEG's
lossless modes, as far as I understand).
Applications
- Just about any program that reads or writes image files supports JPEG, so there is no need to list those applications here.
- jpegtran allows for all kinds of modifications of a JPEG file,
including several lossless transformations.
- Lossless jpegtran applications,
a list of applications that support lossless operations of JPEG files.
Note that only certain operations can be done losslessly, e.g. flipping or rotation in steps of 90 degrees.
- jpegdump,
a utility to extract header information of a JPEG file.
Can also guess quality factor for certain JPEG files.
- Note that there is another tool called
jpegdump, which is to be used for image recovery
with digital cameras.
- Identify truncated JPEG files: a tool
to sort out JPEG files that were partially transferred or recovered.
- Extract JPEGs from arbitrary files: identifies embedded
JPEG/JFIF streams and writes them to new files.
- The official specs can be bought from ISO: DIS 10918-1 and draft DIS 10918-2.
- The book JPEG Still Image Data Compression Standard by W.B. Pennebaker and J. Mitchell,
published by Van Nostrand Reinhold, New York, NY in 1993, ISBN 0442012721, can be considered the
JPEG bible.
It contains both detailed explanations and the complete ISO documents in its appendix.
Cf. Google Books.
- IJG.org offers relevant document files.
- There are various scientific papers published at IEEE (JPEG search)
and ACM (search portal).
They deal with different aspects of JPEG, including compression optimization, data recovery,
hardware implementation, quality evaluation and transmission error recovery.