
 Digital Image Processing 

Suggested Book: Digital Image Processing Using MATLAB

Authors: Rafael C. Gonzalez, Richard E. Woods, Steven L. Eddins
ISBN-10: 0130085197 (2nd ed.) - 013168728X (3rd ed.)
ISBN-13: 978-0130085191 (2nd ed.) - 978-0131687288 (3rd ed.)
Publisher: Pearson Prentice Hall - 609 pages
Copyright: 2004 (2nd ed.), 2007 (3rd ed.)
Table of Contents

1 Introduction
2 Digital Image Fundamentals
3 Intensity Transformations and Spatial Filtering
4 Filtering in the Frequency Domain
5 Image Restoration and Reconstruction
6 Color Image Processing
7 Wavelets and Multiresolution Processing
8 Image Compression
9 Morphological Image Processing
10 Image Segmentation
11 Representation and Description
12 Object Recognition

Summary: The leader in the field for more than twenty years, this introduction to basic concepts and methodologies for digital image processing continues its cutting-edge focus on contemporary developments in all mainstream areas of image processing. Completely self-contained, heavily illustrated, and mathematically accessible, it has a scope of application that is not limited to the solution of specialized problems.
Read Through: Samples of Chapter 1 and Chapter 2.

More about computer graphics can be found in the Lecture Notes section.

My Project:

Digital images are used in two primary disciplines: computer vision and image processing, with image analysis being the key component in the development and deployment of both. The output images from computer vision applications are for computer use, whereas the output images from image processing applications are for human use.

Image processing mechanisms can be used for image restoration, image enhancement and/or image compression.

  • Image enhancement is the process of improving images visually; for example, obtaining high-contrast images from low-contrast or unclear images.

  • Image restoration is the process of taking an image degraded by noise and restoring it to its original appearance. A median filter, for example, can be used to restore an image corrupted by salt-and-pepper noise.

  • Image compression is the process of reducing the amount of data needed to represent images.

On the other hand, image analysis mechanisms examine image data, such as objects and segments, and use them for feature extraction and pattern classification. This package provides mechanisms for image analysis, image enhancement, edge detection and boundary extraction algorithms.



1.    Image Analysis

1.1. Image Geometry Operations

1.2. Arithmetic and Logic Operations

1.3. Image Quantization

2.    Image Enhancement

2.1. Image Filtering

2.2. Histogram

3.    Edge/Line Detection

3.1. Gradient Operators

3.2. Compass Masks

3.3. Advanced Edge Detection Techniques

4.    Segmentation

4.1. Boundary Extraction

Images by nature contain enormous amounts of data, much of which is unnecessary for a specific application. Image analysis is therefore primarily a data reduction process, aimed at determining which information in the image is necessary for a particular problem. Image analysis consists of three stages:

Data reduction can be done using several mechanisms, shown in the following figure:


1.1     Image Geometry Operations

For image analysis, we need to concentrate on a specific region called a Region-of-Interest (ROI).
To do so, image geometry operations are used to modify the spatial coordinates of the image. The image geometry operations implemented in this package include zoom, shrink, crop, translate and rotate.

1.1.1   Image Zoom

Zooming is often done after cropping the image and is used to enlarge the image so that detailed objects can be seen. A typical zoom is done in one of two ways:


Zero-order hold:

A zero-order hold is performed by repeating previous pixel values. The following example implements the zero-order-hold algorithm on a small array of data.
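As an illustration, here is a minimal Python sketch of a 2x zero-order hold (the package itself implements this in MATLAB; this is not the package code):

```python
def zero_order_hold(img):
    """Enlarge a 2-D image (list of lists) by repeating each pixel
    once horizontally and once vertically (2x zoom)."""
    out = []
    for row in img:
        # repeat each pixel value twice within the row
        expanded = [p for p in row for _ in range(2)]
        # repeat the expanded row twice (two distinct list objects)
        out.append(expanded)
        out.append(list(expanded))
    return out

small = [[1, 2],
         [3, 4]]
zoomed = zero_order_hold(small)
# each original pixel becomes a 2x2 block of the same value
```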


 The previous sample code is implemented in the package by using the following code:



First-order hold:

A first-order hold is performed by doing linear interpolation between adjacent pixels.


First-order hold, Method (1): One way to do a first-order hold is to find the average value of every pair of adjacent pixels in each row and use that value as the pixel between the two. The result is then expanded column-wise in the same way. As can be seen, this method enlarges the image from size MxN to (2M-1)x(2N-1).
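A Python sketch of this method (illustrative only; the package code is MATLAB):

```python
def first_order_hold(img):
    """Enlarge an MxN image to (2M-1)x(2N-1) by inserting the average
    of each adjacent pixel pair, first along rows, then along columns."""
    # expand each row: insert the average between horizontal neighbours
    rows = []
    for row in img:
        expanded = [row[0]]
        for a, b in zip(row, row[1:]):
            expanded.append((a + b) / 2)
            expanded.append(b)
        rows.append(expanded)
    # expand columns: insert an averaged row between vertical neighbours
    out = [rows[0]]
    for r1, r2 in zip(rows, rows[1:]):
        out.append([(a + b) / 2 for a, b in zip(r1, r2)])
        out.append(r2)
    return out
```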



The previous method is implemented in the package by using the following code:


First-order hold, Method (2): Another method to achieve a first-order hold uses convolution. This method is applied in two steps:

1.     Expand the image by adding rows and columns of zeros between the existing rows and columns.

2.     Apply the following convolution mask on the expanded image:
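The mask values are not reproduced above; a commonly used choice for first-order (bilinear) interpolation is the 3x3 mask with weights 1/4, 1/2 and 1 assumed below. With that assumption, the two steps can be sketched in Python:

```python
def expand_with_zeros(img):
    """Step 1: insert rows and columns of zeros between existing ones."""
    n_cols = 2 * len(img[0]) - 1
    out = []
    for i, row in enumerate(img):
        expanded = [0] * n_cols
        expanded[::2] = row           # original pixels land on even columns
        out.append(expanded)
        if i < len(img) - 1:
            out.append([0] * n_cols)  # zero row between original rows
    return out

def convolve_same(img, mask):
    """Step 2: 3x3 mask applied at every pixel, 'same' output size,
    zero padding outside the image."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += img[rr][cc] * mask[dr + 1][dc + 1]
            out[r][c] = acc
    return out

# assumed bilinear-interpolation mask (symmetric, so no flipping is needed)
MASK = [[0.25, 0.5, 0.25],
        [0.5,  1.0, 0.5],
        [0.25, 0.5, 0.25]]
```

On a symmetric mask like this one, convolution and correlation coincide, which is why the loop applies the mask without flipping it.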


1.1.2   Image Crop

 Image crop is the process of selecting a portion of the image (sub-image) and cutting it away from the rest of the image. Image crop is useful, for instance, to remove a border from an image before doing further enhancement or analysis.

1.1.3   Geometric Transformations


The translation process can be achieved by using the following equations:

r' = r + r0
c' = c + c0

where (r', c') are the new coordinates, r and c are the original coordinates, and r0 and c0 are the distances to move or translate the image in the row and column directions.



The rotation process can be achieved by using the following equations:

r' = r cos(θ) + c sin(θ)
c' = -r sin(θ) + c cos(θ)

where (r', c') are the new coordinates, r and c are the original coordinates, and θ is the angle to rotate the image.


Combination of Translation and Rotation:

The rotation and translation process can be combined into one set of equations:

r' = (r + r0) cos(θ) + (c + c0) sin(θ)
c' = -(r + r0) sin(θ) + (c + c0) cos(θ)
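A small Python sketch of the combined coordinate transform (an illustration, not the package code; translate first, then rotate):

```python
import math

def translate_rotate(r, c, r0, c0, theta):
    """Map original coordinates (r, c) to new coordinates by first
    translating by (r0, c0) and then rotating by theta (radians)."""
    rt, ct = r + r0, c + c0
    r_new = rt * math.cos(theta) + ct * math.sin(theta)
    c_new = -rt * math.sin(theta) + ct * math.cos(theta)
    return r_new, c_new
```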


1.2     Arithmetic and Logic Operations

Arithmetic and logic operations are often applied as pre-processing steps before image analysis. These operations are performed on two images, except the NOT logic operation which requires only one image, and are done on a pixel-by-pixel basis.

Arithmetic Operations:


Note that multiplication or division by a constant can be used to brighten or darken the image. The following figure shows an image divided by a value less than one to brighten it, and another image divided by a value greater than one to darken it.


Logical Operations:

Logical AND and OR are very useful to perform a masking operation. AND and OR can be used as a simple method to extract a ROI from an image.


AND Operation


OR Operation

Note that the direct logical operators & and | are not useful for this application, since they return only 0 or 1 and therefore produce binary images. Instead, bitwise AND (the bitand function) and bitwise OR (the bitor function) in MATLAB are used to perform AND and OR on a bit-by-bit basis (i.e. 0 is treated as 00000000, 255 as 11111111, and so on). As a result, when the mask is applied with the AND operation, a mask value of 00000000 yields 0, while a mask value of 11111111 yields the image pixel value. Similarly, with the OR operation, a mask value of 11111111 yields 255, while a mask value of 00000000 yields the image pixel value.


Logical NOT

Logical NOT creates a negative of the original image by inverting each bit within each pixel value.
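The masking and inversion operations above can be sketched in pure Python as follows (an illustration of the bitand/bitor idea, not the package's MATLAB code):

```python
def roi_and(image, mask):
    """Bitwise AND: keeps pixel values where the mask is 255 (11111111)
    and zeroes them where the mask is 0 (00000000)."""
    return [[p & m for p, m in zip(ir, mr)] for ir, mr in zip(image, mask)]

def roi_or(image, mask):
    """Bitwise OR: forces pixels to 255 where the mask is 255
    and keeps them where the mask is 0."""
    return [[p | m for p, m in zip(ir, mr)] for ir, mr in zip(image, mask)]

def negate(image, max_val=255):
    """Logical NOT: produce the negative image; for 8-bit values,
    max_val - p is the same as inverting every bit of p."""
    return [[max_val - p for p in row] for row in image]
```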

1.3     Image Quantization


Image quantization is the process of reducing the image data by removing some of the detail information by mapping a group of data points to a single point.  

1.3.1     Gray-Level Reduction

The simplest method for gray-level reduction is applying a threshold. Thresholding is the process of selecting a gray-level value, called the threshold value; every pixel value above the threshold is set to 1 (255 in 8-bit images) and every pixel value below it is set to 0.
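A minimal Python sketch of thresholding:

```python
def threshold(image, t, high=255):
    """Map every pixel above threshold t to `high` and the rest to 0,
    producing a two-level (binary) image."""
    return [[high if p > t else 0 for p in row] for row in image]
```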


Two-level (Binary Images)





Thirty two-level

The general case is to keep k bits, where 2^k is the number of gray levels required in the image.
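A Python sketch of k-bit gray-level reduction, assuming 8-bit pixels:

```python
def reduce_gray_levels(image, k):
    """Keep only the k most-significant bits of each 8-bit pixel,
    leaving 2**k distinct gray levels."""
    mask = 0xFF & ~((1 << (8 - k)) - 1)  # e.g. k=3 -> 11100000
    return [[p & mask for p in row] for row in image]
```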

1.3.2   Gray-Level Modification


1.3.3  Spatial Reduction

Quantization of the spatial coordinates reduces the size of the image data. First step: specify the desired size, in pixels, of the resulting image. For example, reducing a 512x512 image to 25% of its size means the resulting image has size 256x256.

Second step: apply one of the following three methods to create the reduced image: averaging, median, or decimation.

Averaging: take all pixels in each group and replace them in the reduced image with their average value.


Median: sort the pixels in each group from lowest to highest and replace them in the reduced image with the middle value.


Decimation (also called sub-sampling): eliminate rows and columns at regular intervals, keeping only the remaining pixels.
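The three reduction methods can be sketched in Python as follows (illustrative only; the `factor` and `method` parameter names are chosen here, not taken from the package):

```python
def reduce_spatial(image, factor, method="average"):
    """Reduce an image by `factor` along each dimension using
    averaging, median, or decimation over each factor x factor block."""
    h, w = len(image) // factor, len(image[0]) // factor
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            if method == "average":
                row.append(sum(block) / len(block))
            elif method == "median":
                block.sort()
                row.append(block[len(block) // 2])
            else:  # decimation: keep one pixel (top-left) per block
                row.append(block[0])
        out.append(row)
    return out
```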


2.1     Image Filtering


Filters can be either in spatial domain or frequency domain.

Many spatial filters are implemented with convolution masks. Since a convolution mask provides a result that is a weighted sum of the values of a pixel and its neighbours, it is called a linear filter.  

2.1.1   Linear Filtering: Mean Filter

The mean filter is used primarily to deal with noisy images; it also adds a softer look to images. Mean filters are averaging filters: they work on a neighbourhood of pixels and replace the centre pixel with the average of the pixels in the neighbourhood.

Mean filter convolution mask:

Note that the border will still have noise, because the convolution mask operation does not involve the pixels on the border of the image.
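A Python sketch of a 3x3 mean filter that, as noted, leaves the border pixels untouched:

```python
def mean_filter(image):
    """Apply a 3x3 mean (averaging) filter. Border pixels are copied
    unchanged, which is why noise survives along the border."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]   # copy; border keeps original values
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            s = sum(image[r + dr][c + dc]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            out[r][c] = s / 9         # average of the 3x3 neighbourhood
    return out
```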

Median Filter

The median filter is a nonlinear filter: its result cannot be obtained as a weighted sum of the neighborhood pixels the way a convolution mask works. The result of the median filter is striking: an image with salt-and-pepper noise, for instance, is restored to nearly its original appearance.

Laplacian Filters

Laplacian filters are linear filters implemented with convolution masks that have alternating positive and negative coefficients, which enhance image detail.

Convolution masks for Laplacian Filters:

FILTER1                                            FILTER2                               FILTER3


Laplacian filters are called rotationally invariant because they tend to enhance the details of the image in all directions equally.

Difference Filters

Difference filters, also called emboss filters, tend to enhance the image in the direction specified by the mask. There are four primary directions: vertical, horizontal, and two diagonal directions.

VERTICAL                        HORIZONTAL                    DIAGONAL1                      DIAGONAL2

                            Rotated Difference Filters

By applying rotation to the previous difference masks, four more difference filters are obtained.

Enhanced Details Filter

The convolution mask used for this filter is a 3x3 matrix as follows:

2.1.2   Non-linear Transformations

Power-law Transform

The mapping equation for the power-law transform is s = c · r^γ, where r is the input gray level, s is the output gray level, c is a scaling constant, and γ controls the shape of the curve.

Imaging equipment, such as printers, responds according to a power-law transform; that is, its response is nonlinear. For values of γ greater than one, images appear darker, and for values less than one, images appear lighter.

Gamma-Correction Transform

The mapping equation for the gamma-correction transform is s = c · r^(1/γ).

If the response function of a device is given by the power-law transform above, an application can compensate for it by using the gamma-correction equation.
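The two transforms can be sketched in Python as follows (pixel values are normalized to [0, 1] before applying the exponent; taking the constant c to be 1 by default is an assumption made here):

```python
def power_law(pixel, gamma, c=1.0, max_val=255):
    """Power-law transform of a pixel: normalize, raise to gamma,
    rescale back to the original range."""
    normalized = pixel / max_val
    return c * (normalized ** gamma) * max_val

def gamma_correct(pixel, gamma, max_val=255):
    """Compensate a device's power-law response by applying
    the inverse exponent 1/gamma."""
    return power_law(pixel, 1.0 / gamma, max_val=max_val)
```

Applying the power-law transform and then the gamma correction with the same γ recovers the original pixel value, which is exactly the compensation described above.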


2.2     Histogram

A gray-level histogram of an image is the distribution of the gray levels in the image. A histogram with a small spread indicates that the image has low contrast, while a histogram with widely spread (stretched) levels indicates high contrast. The histogram can be modified by a mapping function that stretches, shrinks (compresses) or slides the histogram.

2.2.1   Histogram Modification

Histogram Stretch

The mapping function for the histogram stretch:

Stretch(I(r,c)) = [ (I(r,c) - I_min) / (I_max - I_min) ] (MAX - MIN) + MIN

where I_max is the largest gray-level value in the image, I_min is the smallest gray-level value in the image, and MIN and MAX are the smallest and largest possible gray-level values (0 and 255 for an 8-bit image). The effect of the histogram stretch is to increase the contrast of low-contrast images.

Histogram Shrink (Compress)

The mapping function for the histogram shrink:

Shrink(I(r,c)) = [ (Shrink_max - Shrink_min) / (I_max - I_min) ] (I(r,c) - I_min) + Shrink_min

where I_max is the largest gray-level value in the image, I_min is the smallest gray-level value in the image, and Shrink_max and Shrink_min are the maximum and minimum values desired in the compressed histogram. The effect of the histogram shrink is to decrease the contrast of images.

Histogram Slide

The histogram slide technique can be used to make an image either darker or lighter while retaining the relationships between gray-level values. It is accomplished simply by adding or subtracting a fixed number from every pixel:

Slide(I(r,c)) = I(r,c) + OFFSET

where OFFSET is the amount by which to slide the histogram.


2.2.2   Histogram Equalization

Histogram equalization is an effective technique for improving the appearance of poor images. Its function is similar to histogram stretch, with two differences: first, after histogram equalization the histogram is as flat as possible, while histogram stretch preserves the overall shape of the histogram; second, the result of histogram equalization is more pleasing and effective across a wide range of images.
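A Python sketch of histogram equalization via the cumulative distribution function (the standard technique; not the package's MATLAB code):

```python
def histogram_equalize(image, levels=256):
    """Remap gray levels so the output histogram is as flat as possible,
    using the cumulative distribution of the input levels."""
    hist = [0] * levels
    n = 0
    for row in image:
        for p in row:
            hist[p] += 1
            n += 1
    # cumulative distribution of gray levels
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    # map each input level to its equalized output level
    mapping = [round((levels - 1) * c / n) for c in cdf]
    return [[mapping[p] for p in row] for row in image]
```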

Edge detection methods are used as a first step in line detection. They are also used to find complex object boundaries. Edge detection operators are based on the idea that edge information in an image is found by looking at the relationship between a pixel and its neighbors. If a pixel's gray-level value is similar to those around it, there is probably no edge at that point. However, if a pixel's gray level varies widely from its neighbors, it may represent an edge.



3.1     Gradient Operators

3.1.1   Roberts Operator

The Roberts operator is the simplest edge detection operator; it marks edge points only and returns no information about edge orientation. It works best with binary images, so ideally a gray-level image is first converted to a binary image using a threshold operation.


Roberts Operator, First Form: the square root of the sum of the squared diagonal differences:

sqrt( [I(r,c) - I(r-1,c-1)]^2 + [I(r,c-1) - I(r-1,c)]^2 )

Roberts Operator, Second Form: the sum of the magnitudes of the diagonal differences:

|I(r,c) - I(r-1,c-1)| + |I(r,c-1) - I(r-1,c)|



3.1.2   Sobel Operator

This method approximates the gradient by using a row mask and a column mask, which approximate the first derivative in each direction. The Sobel edge detection masks look for edges in both the horizontal and vertical directions and then combine this information into a single metric. The masks are as follows:

VERTICAL EDGE                 HORIZONTAL EDGE                       


These masks are convolved with the image. At each pixel location this gives two numbers: s1, the result of the vertical edge mask, and s2, the result of the horizontal edge mask. From s1 and s2, two matrices are computed: edge magnitude and edge direction.

In the following image, the edge magnitude is calculated using:

magnitude = sqrt(s1^2 + s2^2)

When the edge magnitude is instead calculated by adding the absolute values, |s1| + |s2|, the edges become clearer.
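A Python sketch of the Sobel computation at a single interior pixel; the mask coefficients below are the commonly quoted Sobel values, assumed here since the mask figures are not reproduced above:

```python
import math

# commonly quoted Sobel masks (assumed; vertical-edge and horizontal-edge)
SOBEL_VERTICAL = [[-1, 0, 1],
                  [-2, 0, 2],
                  [-1, 0, 1]]
SOBEL_HORIZONTAL = [[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]]

def sobel_at(image, r, c):
    """Return (edge magnitude, edge direction in radians) of the Sobel
    gradient at an interior pixel (r, c)."""
    s1 = s2 = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            p = image[r + dr][c + dc]
            s1 += p * SOBEL_VERTICAL[dr + 1][dc + 1]     # vertical-edge response
            s2 += p * SOBEL_HORIZONTAL[dr + 1][dc + 1]   # horizontal-edge response
    return math.hypot(s1, s2), math.atan2(s2, s1)
```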

3.1.3   Prewitt Operator

Similar to the Sobel operator but with different masks. The masks are simpler, which makes the calculations faster.

  VERTICAL EDGE     HORIZONTAL EDGE                     


These masks are convolved with the image. At each pixel location this gives two numbers: p1, the result of the vertical edge mask, and p2, the result of the horizontal edge mask. From p1 and p2, two matrices are computed: edge magnitude and edge direction.


3.1.4   Laplacian Operators

There are three Laplacian masks for edge detection. Unlike the Sobel and Prewitt masks, Laplacian masks are rotationally symmetric, which means edges in all directions contribute to the result. One of these masks is applied to the image, and the sign of the result from two adjacent pixels provides information about the edge direction. The resulting image can then be enhanced with a threshold value so that the edges contrast with the background.




3.2     Compass Masks

Kirsch and Robinson edge detection masks are called compass masks, since they are defined by taking a single mask and rotating it to the eight major compass orientations: North, Northwest, West, Southwest, South, Southeast, East and Northeast.

3.2.1   Kirsch Compass Masks

K0        K1     K2     K3

K4        K5     K6     K7

How to apply the Kirsch compass masks:

1.     Convolve the image with the 8 masks

2.     The edge magnitude is defined as the maximum value found at each point over the eight convolution masks.

3.     The edge direction is the direction of the mask that gives the maximum magnitude.
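These steps can be sketched in Python; the base mask coefficients below follow the standard published Kirsch definition (5s and -3s), which is an assumption since the K0..K7 values are shown only as labels above:

```python
def rotate_mask(mask):
    """Rotate a 3x3 mask's outer ring one position clockwise (45 degrees)."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    out = [row[:] for row in mask]
    for (r1, c1), (r2, c2) in zip(ring, ring[1:] + ring[:1]):
        out[r2][c2] = mask[r1][c1]
    return out

# base Kirsch mask; the other seven are successive 45-degree rotations
K0 = [[ 5,  5,  5],
      [-3,  0, -3],
      [-3, -3, -3]]
KIRSCH = [K0]
for _ in range(7):
    KIRSCH.append(rotate_mask(KIRSCH[-1]))

def kirsch_at(image, r, c):
    """Edge magnitude = maximum response over the eight masks;
    edge direction = index of the winning mask."""
    responses = []
    for mask in KIRSCH:
        acc = sum(image[r + dr][c + dc] * mask[dr + 1][dc + 1]
                  for dr in (-1, 0, 1) for dc in (-1, 0, 1))
        responses.append(acc)
    best = max(responses)
    return best, responses.index(best)
```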


3.2.2   Robinson Compass Masks

Used in a manner similar to the Kirsch masks, but the Robinson masks are easier to implement since they rely only on the coefficients 0, 1 and 2.

r0            r1         r2      r3

r4            r5         r6      r7


3.3     Advanced Edge Detection Techniques

3.3.1   Laplacian of Gaussian (LoG) Edge Detection

Laplacian of Gaussian (LoG) edge detection method requires the following steps:

1.     Convolve the image with a Gaussian Smoothing filter

2.     Convolve the image with a Laplacian mask

A common 5x5 mask that approximates the combination of the Gaussian and Laplacian into one convolution mask is as follows:


3.3.2   Canny Algorithm

The Canny algorithm consists of the following steps:

1.     Apply the Gaussian filter mask to smooth the image.

2.     Find the magnitude and edge direction of the gradient using the following convolution masks and equations


The goal of image segmentation is to find regions that represent objects or meaningful parts of objects. Image segmentation techniques can be divided into three main categories:

1.     Region growing and shrinking

2.     Clustering methods

3.     Boundary detection



4.1     Boundary Extraction

Boundary detection for image segmentation is performed by finding boundaries between objects, thus indirectly defining the objects.

Boundary Detection General Steps:

(1)  Mark potential edge points by finding discontinuities in features such as brightness.

Edge operators tend to mark points of rapid change, thus indicating the possibility of an object boundary. In this package, the Sobel operator is used as the edge detection method. The resulting image and its histogram appear as follows:

(2)  Threshold the results.

One method is to examine the histogram of the edge detection results and look for the best valley manually.

  • With a bimodal histogram (a histogram with two major peaks), a good threshold value can be determined easily. A bimodal histogram is typical of images that have one object against a high-contrast background.
  • With a histogram that has more than two peaks, a method called minimizing the within-group variance, or the Otsu method, can be used.

Often, the histogram of an image that has been operated on by an edge operator is unimodal (one peak), so it may be difficult to find a good valley. In this case, the average value is used for the threshold.

(3)  Merge edge segments into boundaries via edge linking

In this package, the edge points have been marked by using the plot(x,y) method. The technique used for marking points is as follows: if we find three successive points above the threshold value, they are considered part of the object boundary.

After Marking the edge points:


Download the presentation slides - by Salha Alzahrani

Important Note: Many of these filters and image processing tools are built-in functions in the MATLAB Image Processing Toolbox. Nevertheless, in my project I deal with images as arrays of pixels and re-define (re-build) my own code that implements each image processing task.


All rights reserved for © 2008-2012
