Basics of Images
Reference
Most of the material on this page comes from the book
- Encyclopedia of Graphics File Formats,
by James D. Murray and William Van Ryper,
O'Reilly and Associates, Sebastopol, CA, 1994.
(http://www.ora.com/).
This is an excellent reference book for anyone dealing with computer
graphics image files; it contains a detailed description of many file
formats, as well as several good chapters at the beginning which
explain many concepts related to images. It also comes with a CD
containing hundreds of sample images, image processing software (for
Macintosh, Windows, and UNIX), and detailed image format
specifications.
Pixels
An image consists of a rectangular array of dots called
pixels. The size of the image is usually specified as
width x height, measured in pixels. The
physical size of the image, in inches, centimeters, or any other unit,
depends on the resolution of the device on which the image is
displayed. Resolution is usually measured in terms of DPI,
which stands for dots per inch. An image will appear smaller
(and generally sharper) on a device with a higher resolution than on
one with a lower resolution.
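As a rough illustration of the relationship above, the physical size of a displayed image is its pixel count divided by the device resolution. The function name and the sample resolutions below are assumptions chosen for this example only.

```python
def physical_size_inches(width_px, height_px, dpi):
    """Return the (width, height) in inches of an image shown at the given DPI."""
    return width_px / dpi, height_px / dpi

# The same 600 x 300 pixel image on two hypothetical devices.
print(physical_size_inches(600, 300, 75))   # (8.0, 4.0) on a 75 DPI screen
print(physical_size_inches(600, 300, 300))  # (2.0, 1.0) on a 300 DPI printer
```

The pixel data is identical in both cases; only the size of each dot on the device changes.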
Note: The term resolution is also sometimes used in the
context of an image, in which case it specifies the resolution at
which the image is intended to be displayed. Some image file formats
allow for the specification of an image resolution. It is important
to understand the distinction between this number, which expresses a
preferred DPI for displaying the image, and the size of the
image, which gives the number of pixels in the image.
Image Depth / Bitplanes
For each pixel in an image, one needs to know what color to make that
pixel when displaying the image. For a black and white image there
are only two choices --- each pixel is either black or white --- so
one bit of information is all that is needed for each pixel. Such
images are sometimes called 1-bit, or monochrome
images.
For color images, one needs enough bits per pixel to represent all the
colors in the image. The number of bits per pixel is sometimes called
the depth of the image, or the number of bitplanes.
A number consisting of n bits can have 2^n different
values, so an image of depth n can store up to 2^n
colors. The most common depths in computer graphics today are
probably 8 and 24, although 2-bit and 16-bit images are also common.
The human eye can discern between roughly 2^24 different colors (this
number obviously varies greatly from person to person and depends a
lot on viewing conditions), so a 24-bit image is needed to represent
the full range of colors that we can perceive. In many cases,
however, the quality of an 8-bit image is very acceptable, and such
images are often preferable because they require less storage space.
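The trade-off between depth and storage can be made concrete with a little arithmetic. The sketch below assumes uncompressed pixel data with no row padding (real file formats often pad rows), and the 640 x 480 image size is just an example.

```python
def max_colors(depth):
    """Number of distinct values an n-bit pixel can hold."""
    return 2 ** depth

def raw_size_bytes(width, height, depth):
    """Bytes needed for uncompressed pixel data, rounded up to whole bytes."""
    total_bits = width * height * depth
    return (total_bits + 7) // 8

for depth in (1, 8, 24):
    print(depth, max_colors(depth), raw_size_bytes(640, 480, depth))
# 1 2 38400
# 8 256 307200
# 24 16777216 921600
```

An 8-bit version of this image needs one third of the space of the 24-bit version, before any compression is applied.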
Color: Direct vs Colormaps
A color in computer graphics is usually represented as an ordered
triple of numbers in one of several color spaces. The most
common color space in computer graphics is RGB, which stands for Red,
Green, and Blue. If you look very closely at a computer screen with
a magnifying glass you will see that it consists of lots of tiny red,
green, and blue dots which light up in various combinations to create
the impression of a range of colors (each individual pixel on the
screen usually consists of several of these single-color dots).
The most direct way to store color information in an image is for the
data representing each pixel to directly specify the color of that
pixel by giving the values of the red, green, and blue components (or
the components in some other color space) of that color. We sometimes
use the term RGB image to refer to an image which stores
color in a direct RGB representation.
Another way of storing colors is to use the data for each pixel in an
image to store an index into a table of colors, rather than to store the
color value directly. The advantage of this method is that it can
drastically reduce the amount of storage space necessary when only a
relatively small number of colors appear in the image. The table of
colors is called a color table, a color map, a
palette, an index map, or a look-up table
(LUT); these terms are all equivalent.
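The difference between the two representations can be sketched with plain Python lists. The palette contents, the tiny 4 x 2 image, and the helper name to_direct_rgb are all made up for this illustration.

```python
# A color table: each entry is an (R, G, B) triple.
palette = [
    (255, 255, 255),  # index 0: white
    (0, 0, 0),        # index 1: black
    (255, 0, 0),      # index 2: red
]

# A 4 x 2 image stored as indices into the palette.
indexed_pixels = [
    [0, 0, 2, 2],
    [1, 0, 2, 1],
]

def to_direct_rgb(indexed, table):
    """Expand an indexed image into a direct RGB image."""
    return [[table[i] for i in row] for row in indexed]

rgb_pixels = to_direct_rgb(indexed_pixels, palette)
print(rgb_pixels[1][0])  # (0, 0, 0)
```

For a 640 x 480 image that uses at most 256 colors, the indexed form needs 307200 bytes of pixel data plus a 768-byte table, compared with 921600 bytes for direct 24-bit RGB.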
Transparency
It is sometimes useful to combine two images by overlaying one onto
the other. Transparency is a technique whereby certain pixels in the
image being laid on "top" can be labeled as "transparent", which
means that the "bottom" image will show through them. Transparency is
usually handled as additional color information, for example by
storing a fourth number for each pixel along with its R, G, and B
values. Simple transparency can be stored in just 1 bit; in this case
a pixel is either transparent or not, which means that the pixel is
either ignored or used when the image is combined with another image.
It's also possible to use varying degrees of transparency, which can
allow us to combine images in a way such that parts of the bottom
image show through parts of the top image to varying degrees. This
requires more than 1 bit of transparency information.
The transparency value of a pixel is sometimes called its
alpha value, and the collection of all an image's alpha
values is sometimes called its alpha channel. The term
RGBA is sometimes used to refer to an RGB image with an alpha
channel.
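One common way to combine a top pixel with a bottom pixel is a linear blend weighted by the alpha value. The sketch below assumes an 8-bit alpha where 0 means fully transparent and 255 means fully opaque, and an opaque bottom image; both conventions, and the function name blend_pixel, are assumptions of this example rather than something fixed by the text above.

```python
def blend_pixel(top_rgba, bottom_rgb):
    """Composite one RGBA pixel over an opaque RGB pixel."""
    r, g, b, a = top_rgba
    alpha = a / 255  # 0.0 = fully transparent, 1.0 = fully opaque
    return tuple(
        round(alpha * t + (1 - alpha) * u)
        for t, u in zip((r, g, b), bottom_rgb)
    )

red_over_blue = (0, 0, 255)
print(blend_pixel((255, 0, 0, 0), red_over_blue))    # (0, 0, 255): bottom shows through
print(blend_pixel((255, 0, 0, 255), red_over_blue))  # (255, 0, 0): top hides bottom
print(blend_pixel((255, 0, 0, 128), red_over_blue))  # roughly half red, half blue
```

With only 1 bit of transparency, the alpha value is effectively restricted to the first two cases.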
In the context of the Web, transparency is often used to get an image
to blend in well with the browser's background.
Data Compression
Several data compression schemes are often used to reduce the amount
of space necessary to store an image. It isn't necessary for the
purposes of this course to get into the details of these techniques,
but it is helpful to know what they are and to have a general idea of
what they involve.
- Run Length Encoding (RLE)
Replaces runs (multiple consecutive occurrences) of a pattern in
the data with a single occurrence and a count. Supported by many
bitmap file formats, including TIFF and BMP (Microsoft Windows
Bitmap format). Does not usually give as high a compression
ratio as more sophisticated methods, but is fast and easy to implement.
RLE is well-suited for monochrome images which consist mostly
of one (background) color, such as line drawings.
- Lempel-Ziv-Welch (LZW) Compression
A very general and efficient dictionary substitution algorithm.
Replaces common patterns in the data with abbreviations for
those patterns. Supported by several file formats such as GIF
and TIFF.
- CCITT (Huffman) Encoding
A collection of several algorithms, used mostly by fax and
document imaging systems (CCITT stands for International
Telegraph and Telephone Consultative Committee). Some
bitmap file formats, such as TIFF, also support versions of
CCITT encoding.
The above algorithms are all lossless, meaning that the
process of compressing and restoring the data does not change the data
at all. Some algorithms are lossy, meaning that the
compression/decompression step may actually alter or destroy some of
the data (but hopefully only parts that are in some way expendable
anyway).
- JPEG Compression
A lossy compression algorithm particularly well-suited to
compressing photographs and video images. (JPEG stands for
Joint Photographic Experts Group.) Discards some
parts of the data in a way that minimizes visible changes to
the image, in order to achieve a high compression ratio.
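To give a feel for the simplest of the lossless schemes described earlier, here is a minimal run-length encoding sketch. It stores each run as a (count, value) pair in a Python list; this layout is an assumption made for readability and is not the byte layout used by TIFF, BMP, or any other real format.

```python
def rle_encode(data):
    """Encode a sequence as a list of (count, value) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][1] == value:
            # Same value as the current run: extend it by one.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # A different value starts a new run.
            runs.append((1, value))
    return runs

def rle_decode(runs):
    """Rebuild the original sequence from (count, value) pairs."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out

# One row of a mostly-background monochrome image.
row = [0] * 12 + [1] * 3 + [0] * 9
encoded = rle_encode(row)
print(encoded)                     # [(12, 0), (3, 1), (9, 0)]
print(rle_decode(encoded) == row)  # True: nothing was lost
```

The 24 pixels collapse to three pairs because the row contains long runs. A row in which neighboring pixels rarely match would produce more pairs than pixels, which is why RLE suits line drawings better than photographs.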
Bitmap Files
There are many different bitmap file formats in use today, but they
all have a more or less similar structure:
- Header
A section at the beginning of the file giving some information
about the image, such as its size, the number of colors, what
compression scheme it uses for the bitmap data, and so on.
Some formats allow for comments in the header, which can be
used to store notes about the image.
- Color Table
If the image uses a colormap, it usually follows (or is
included in) the header.
- Pixel Data
The actual pixel data of the image, usually compressed.
- Footer
Some formats have a footer which signals the end of the bitmap
data and possibly contains additional information about the
image.
Note that it is usually only the pixel data itself that is compressed
in an image file; the header, color table, and other parts are usually
left uncompressed.
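The layout described above can be sketched with a deliberately simple, made-up format. The "TOY1" signature, the field order, and the function names are hypothetical and do not correspond to any real file format; the pixel data is left uncompressed and the footer is omitted to keep the example short.

```python
import struct

MAGIC = b"TOY1"  # hypothetical signature, not a real format

def write_toy(width, height, palette, indices):
    """Build the bytes of a toy image: header, color table, pixel data."""
    # Header: signature, width, height, number of color table entries.
    header = MAGIC + struct.pack("<HHH", width, height, len(palette))
    # Color table: three bytes (R, G, B) per entry.
    table = b"".join(bytes(rgb) for rgb in palette)
    # Pixel data: one index byte per pixel.
    pixels = bytes(indices)
    return header + table + pixels

def read_toy(blob):
    """Parse the bytes produced by write_toy."""
    if blob[:4] != MAGIC:
        raise ValueError("not a toy image")
    width, height, ncolors = struct.unpack_from("<HHH", blob, 4)
    table_start = 4 + 6  # signature plus three 16-bit fields
    table_end = table_start + 3 * ncolors
    palette = [tuple(blob[i:i + 3]) for i in range(table_start, table_end, 3)]
    indices = list(blob[table_end:table_end + width * height])
    return width, height, palette, indices

blob = write_toy(2, 2, [(255, 255, 255), (0, 0, 0)], [0, 1, 1, 0])
print(read_toy(blob))
# (2, 2, [(255, 255, 255), (0, 0, 0)], [0, 1, 1, 0])
```

A reader must parse the header first, because the header tells it how large the color table is and therefore where the pixel data begins.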
Bitmap File Formats
There are hundreds, if not thousands, of bitmap file formats. The
ones you are most likely to encounter in the process of producing and
using materials for the Web (at least at the Geometry Center) are:
- GIF (Graphics Interchange Format)
The most common image format on the Web, and possibly on the
Internet. Invented by CompuServe Inc. Stores 1- to 8-bit
images. Uses LZW compression. Inline images in Web pages are
usually GIFs.
- TIFF (Tagged Image File Format)
The standard image format found in most paint, imaging, and
desktop publishing programs. Very flexible; supports 1- to
24-bit images and several different compression schemes.
- SGI Image
Silicon Graphics' native image file format; produced by many
programs that run on SGI workstations. Stores direct 24-bit
RGB color.
- Sun Raster
Sun's native image file format; produced by many programs that
run on Sun workstations.
- PICT
Macintosh's native image file format; produced by many
programs that run on Macs. Stores up to 24-bit color.
- BMP (Microsoft Windows Bitmap)
Main format supported by Microsoft Windows. Stores 1-, 4-,
8-, and 24-bit images.
- XBM (X Bitmap)
A format for monochrome (1-bit) images common in the X Window
System.
- JPEG
The term JPEG itself actually refers to a standard
for data compression (and the organization that developed the
standard), not a file format. However, there is a particular
file format that uses JPEG compression, developed by C-Cube
Microsystems, called the JPEG File Interchange
Format, which is also sometimes called simply the
JPEG file format. It can store up to 24 bits of
color. Some Web browsers can display JPEG images inline (in
particular, Netscape can), but this feature is not a part of
the HTML standard.
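As a concrete case of reading a header, the sketch below pulls the image size and color table size out of the first 13 bytes of a GIF file. The field layout (a 6-byte signature, then little-endian 16-bit width and height, then a packed flags byte) follows the published GIF specification; the function name and the hand-built sample bytes are assumptions of this example, and a real program would normally use an image library instead.

```python
import struct

def read_gif_header(data):
    """Parse the signature and logical screen descriptor of a GIF."""
    signature = data[:6]
    if signature not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    width, height, packed = struct.unpack_from("<HHB", data, 6)
    # The top bit of the packed byte says whether a global color table follows.
    has_global_table = bool(packed & 0x80)
    # The low three bits give the table size as 2 ** (n + 1) entries.
    table_entries = 2 ** ((packed & 0x07) + 1) if has_global_table else 0
    return signature.decode("ascii"), width, height, table_entries

# Hand-built header for a hypothetical 320 x 200 image with a 256-entry table.
sample = b"GIF89a" + struct.pack("<HHBBB", 320, 200, 0xF7, 0, 0)
print(read_gif_header(sample))  # ('GIF89a', 320, 200, 256)
```

The largest possible table has 2 ** 8 = 256 entries, which matches the 8-bit limit on GIF images mentioned above.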
Comments to:
webmaster@geom.umn.edu
Created: May 31 1996 ---
Last modified: Sun Jun 2 20:30:49 1996