Digital Image Processing
Suggested Book: Digital Image Processing Using MATLAB (2nd ed., 3rd ed.)
Authors: Rafael C. Gonzalez, Richard E. Woods, Steven L. Eddins
ISBN-10: 0130085197 (2nd ed.), 013168728X (3rd ed.)
ISBN-13: 9780130085191 (2nd ed.), 9780131687288 (3rd ed.)
Publisher: Pearson Prentice Hall, 2004, 609 pages
Copyright: 2004, 2007
Table of Contents 
1 Introduction
2 Digital Image Fundamentals
3 Intensity Transformations and Spatial Filtering
4 Filtering in the Frequency Domain
5 Image Restoration and Reconstruction
6 Color Image Processing
7 Wavelets and Multiresolution Processing
8 Image Compression
9 Morphological Image Processing
10 Image Segmentation
11 Representation and Description
12 Object Recognition

Summary: 
The leader in the field for more than twenty years, this introduction to basic concepts and methodologies for digital image processing continues its cutting-edge focus on contemporary developments in all mainstream areas of image processing. Completely self-contained, heavily illustrated, and mathematically accessible, it has a scope of application that is not limited to the solution of specialized problems. Digital Image Fundamentals. Image Enhancement in the Spatial Domain. Image Enhancement in the Frequency Domain. Image Restoration. Color Image Processing. Wavelets and Multiresolution Processing. Image Compression. Morphological Image Processing. Image Segmentation. Representation and Description. Object Recognition. (Summary from Amazon.com)
Read Through: 
Samples of Chapter 1 and Chapter 2.

More about computer graphics can be found in the Lecture Notes section.
My Project:
Digital images are used in two primary disciplines: computer vision
and image processing, with image analysis being the key
component in the development and deployment of both. The output images from
computer vision applications are for computer use, whereas the output images
from image processing applications are for human use.
Image processing
mechanisms can be used for image restoration, image enhancement and/or image
compression.

Image enhancement:
is the process of improving the visual appearance of images. For example, obtaining high-contrast images from low-contrast or unclear images.

Image restoration:
is the process of taking a noisy image and restoring it to its original appearance. A median filter, for example, can be used to restore an image after salt-and-pepper noise has been applied to it.

Image compression:
is the process of reducing the amount of data needed
to represent images.
On the other hand, image analysis mechanisms are used to examine the image data, such as objects and segments, and use them for feature extraction and pattern classification.
This package provides image analysis, image enhancement, edge detection and boundary extraction algorithms.
Contents
1.1.
Image Geometry Operations
1.2.
Arithmetic and Logic Operations
1.3.
Image Quantization
2.1.
Image Filtering
2.2.
Histogram
3.1.
Gradient Operators
3.2.
Compass Masks
3.3.
Advanced Edge Detection Techniques
4.1.
Boundary Extraction
Images by nature contain enormous amounts of data, much of which is unnecessary for specific applications.
Therefore, image analysis is primarily a data reduction process aimed at determining which information in the image is necessary for a particular problem. Image analysis contains three stages:
Data reduction can be done by using several mechanisms shown in the following figure:
1.1
Image Geometry Operations
For image analysis, we need to concentrate on a specific region called a Region of Interest (ROI).
To do so, image geometry operations are used to modify the spatial coordinates of the image. The image geometry operations implemented in this package include: zoom, shrink, crop, translate and rotate.
1.1.1
Image Zoom
Zooming is often done after cropping the image and is used to enlarge the image so that objects in it can be seen in detail.
A typical zoom process can be done in two ways:
Zero-order hold:
A zero-order hold is performed by repeating previous pixel values. The following example implements the zero-order hold algorithm on a small array of data.
The previous sample code is implemented in the package by using the
following code:
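The package code itself is written in MATLAB and was not preserved on this page; the following NumPy sketch illustrates the same zero-order hold idea.

```python
import numpy as np

def zero_order_hold(img, factor=2):
    """Enlarge an image by repeating each pixel value `factor` times
    along both the rows and the columns (nearest-neighbour zoom)."""
    img = np.asarray(img)
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

# A small array of data, as in the original example:
a = np.array([[1, 2],
              [3, 4]])
print(zero_order_hold(a))
# Each pixel becomes a 2x2 block of the same value.
```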
First-order hold:
A first-order hold is performed by linear interpolation between adjacent pixels.
First-order hold, Method (1):
One way to do a first-order hold is to find the average value between every pair of adjacent pixels in each row and use that value as the pixel between the two. Next, take the result and expand the columns in the same way. As can be seen, this method enlarges the image from size MxN to (2M-1)x(2N-1).
The previous method is implemented in the package by using the following
code:
First-order hold, Method (2): Another method to achieve a first-order hold requires a convolution process.
This method is applied in two steps:
1. Expand the image by adding rows and columns of zeros between the existing rows and columns.
2. Apply the following convolution mask on the expanded image:
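The mask figure was not preserved on this page; the standard bilinear-interpolation mask [[1/4, 1/2, 1/4], [1/2, 1, 1/2], [1/4, 1/2, 1/4]] is assumed in this NumPy sketch of the two steps.

```python
import numpy as np

# Assumed bilinear-interpolation mask (the original mask figure is lost).
BILINEAR_MASK = np.array([[0.25, 0.5, 0.25],
                          [0.5,  1.0, 0.5 ],
                          [0.25, 0.5, 0.25]])

def first_order_hold(img):
    """First-order hold: insert zero rows/columns between the original
    pixels, then convolve with the bilinear-interpolation mask."""
    img = np.asarray(img, dtype=float)
    m, n = img.shape
    expanded = np.zeros((2 * m - 1, 2 * n - 1))
    expanded[::2, ::2] = img            # original pixels, zeros in between
    padded = np.pad(expanded, 1)        # zero-pad so the mask fits at the borders
    out = np.zeros_like(expanded)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * BILINEAR_MASK)
    return out

print(first_order_hold([[1, 3], [5, 7]]))
# The inserted pixels are the averages of their neighbours.
```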
1.1.2
Image Crop
Image crop is the process of selecting a portion of the image (a sub-image) and cutting it away from the rest of the image. Cropping is useful, for instance, to remove a border from an image before doing further enhancement or analysis.
1.1.3
Geometric Transformations
Translation:
The translation process can be achieved by using the following equations:

r' = r + r0
c' = c + c0

where (r', c') are the new coordinates, r and c are the original coordinates, and r0 and c0 are the distances to move or translate the image.
Rotation:
The rotation process can be achieved by using the following equations:

r' = r cos(θ) + c sin(θ)
c' = -r sin(θ) + c cos(θ)

where (r', c') are the new coordinates, r and c are the original coordinates, and θ is the angle to rotate the image.
Combination of Translation and Rotation:
The rotation and translation process can be combined into one set of
equations:
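The combined equations were not preserved on this page; one standard form, assumed here, rotates first and then translates (r0 and c0 are the translation distances):

```python
import numpy as np

def rotate_then_translate(r, c, theta, r0, c0):
    """Map original coordinates (r, c) to new coordinates by rotating
    through angle theta and then translating by (r0, c0)."""
    r_new = r * np.cos(theta) + c * np.sin(theta) + r0
    c_new = -r * np.sin(theta) + c * np.cos(theta) + c0
    return r_new, c_new
```

With theta = 0 the mapping reduces to a pure translation, and with r0 = c0 = 0 it reduces to a pure rotation.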
1.2
Arithmetic and Logic Operations
Arithmetic and logic operations are often applied as preprocessing steps before image analysis. These operations are performed on two images, except for the NOT logic operation, which requires only one image, and are done on a pixel-by-pixel basis.
Arithmetic Operations:
Note that multiplication or division by a constant can be used to brighten or darken the image. The following figure shows one image divided by a value less than one to brighten it, and another image divided by a value greater than one to darken it.
Logical Operations:
Logical AND and OR are very useful to perform a masking operation. AND and
OR can be used as a simple method to extract a ROI from an image.
AND Operation
OR Operation
Note that direct logical & and logical | operations are not useful for this application since they return only 0 or 1, which results in binary images. Instead, the bitwise AND (bitand) and bitwise OR (bitor) functions in MATLAB have been used to perform logical AND and OR on a bit basis (i.e. 0 is treated as 00000000, 255 as 11111111, and so on). As a result, when we apply the mask for the AND operation, if the mask value is 00000000 the result is 0, while if the mask value is 11111111 the result is the image pixel value. Similarly, when we apply the mask for the OR operation, if the mask value is 11111111 the result is 255 (white), while if the mask value is 00000000 the result is the image pixel value.
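The same bitand/bitor-style masking can be sketched in NumPy, where `&` and `|` operate bitwise on uint8 arrays; the 4x4 image and the 2x2 ROI below are hypothetical.

```python
import numpy as np

# Hypothetical 8-bit image and a mask whose ROI is the top-left 2x2 corner.
img = np.arange(16, dtype=np.uint8).reshape(4, 4) * 10
mask = np.zeros((4, 4), dtype=np.uint8)
mask[:2, :2] = 255            # 11111111 inside the ROI, 00000000 outside

roi_and = img & mask          # AND: keeps pixel values inside the ROI, 0 elsewhere
roi_or = img | mask           # OR: forces the ROI to 255, keeps pixels elsewhere
print(roi_and)
print(roi_or)
```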
Logical NOT
Logical NOT creates a negative of the original image by inverting each bit
within each pixel value.
1.3
Image Quantization
Image quantization is the process of reducing the image data by removing
some of the detail information by mapping a group of data points to a single
point.
1.3.1
GrayLevel Reduction
The simplest method for gray-level reduction is applying a threshold. Thresholding is the process of selecting a gray-level value, called the threshold value; every pixel value above this value is set to 1 (255 in 8-bit images) and every pixel value below the threshold value is set to 0.
Two-level (binary images)
Four-level
Sixteen-level
Thirty-two-level
The general case is to mask the upper k bits, where 2^k is the number of gray levels required in the image.
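A NumPy sketch of the general case, keeping only the k most significant bits of each 8-bit pixel:

```python
import numpy as np

def reduce_gray_levels(img, k):
    """Keep only the k most significant bits of each 8-bit pixel,
    which quantizes the image to 2**k gray levels."""
    img = np.asarray(img, dtype=np.uint8)
    mask = np.uint8((0xFF << (8 - k)) & 0xFF)   # e.g. k=2 -> 11000000
    return img & mask

print(reduce_gray_levels(np.array([[200, 37]], dtype=np.uint8), 2))
```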
1.3.2
GrayLevel Modification
1.3.3
Spatial Reduction
Quantization in the spatial coordinates reduces the size of the image data. First step: specify the desired size, in pixels, of the resulting image. For example, reducing a 512x512 image to 25% of its original number of pixels produces a 256x256 image.
Second Step: apply one of following three methods to create the reduced
image:
Averaging
Median
Decimation
Averaging:
take all pixels in each group and replace them with their average value in the reduced image.
Median:
sort all pixels in each group from lowest to highest and replace them with the middle value in the reduced image.
Decimation:
(also called subsampling) eliminates unnecessary rows and columns.
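Two of the three methods can be sketched in NumPy for a 2:1 reduction in each dimension (the package itself is MATLAB; this is an illustrative equivalent):

```python
import numpy as np

def decimate(img, factor=2):
    """Subsampling: keep every `factor`-th row and column."""
    return np.asarray(img)[::factor, ::factor]

def reduce_by_averaging(img, factor=2):
    """Replace each factor x factor block of pixels with its average value."""
    img = np.asarray(img, dtype=float)
    m, n = img.shape
    blocks = img.reshape(m // factor, factor, n // factor, factor)
    return blocks.mean(axis=(1, 3))
```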
2.1
Image Filtering
Filters can operate in either the spatial domain or the frequency domain.
Many spatial filters are implemented with convolution masks. Since a convolution mask produces a result that is a weighted sum of the values of a pixel and its neighbours, it is called a linear filter.
2.1.1
Linear Filtering
2.1.1.1
Mean Filter
A mean filter is used primarily to deal with noisy images. In addition, a mean filter adds a softer look to images.
Mean filters are averaging filters: they work on a neighbourhood of pixels and replace the centre pixel with the average of the pixels in the neighbourhood.
Mean filter convolution mask:
Note that the border will still have noise, because the convolution mask operation does not involve the pixels on the border of the image.
(Figure: the pixels marked X along the image border are left unprocessed by the convolution mask.)
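A NumPy sketch of a 3x3 mean filter that, as described above, leaves the border pixels untouched:

```python
import numpy as np

def mean_filter(img):
    """3x3 mean filter; border pixels are left unchanged, matching the
    behaviour described in the text."""
    img = np.asarray(img, dtype=float)
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = img[i - 1:i + 2, j - 1:j + 2].mean()
    return out
```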
2.1.1.2
Median Filter
The median filter is a nonlinear filter. A nonlinear filter produces a result that cannot be obtained as a weighted sum of the neighborhood pixels the way a convolution mask does.
The result of the median filter is amazing! An image with salt-and-pepper noise, for instance, will be restored to nearly its original appearance.
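A NumPy sketch of a 3x3 median filter (border pixels again left unchanged), which removes isolated salt or pepper pixels:

```python
import numpy as np

def median_filter(img):
    """3x3 median filter; border pixels are left unchanged."""
    img = np.asarray(img, dtype=float)
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])
    return out
```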
2.1.1.3
Laplacian Filters
Laplacian filters are linear filters implemented with convolution masks having alternating positive and negative coefficients to enhance image details.
Convolution masks for Laplacian Filters:
FILTER1
FILTER2
FILTER3
Laplacian filters are called rotationally invariant because they tend to enhance the details of the image in all directions equally.
2.1.1.4
Difference Filters
Difference filters, also called emboss filters, tend to enhance the image in the direction specified by the mask. There are four primary directions: vertical, horizontal, and two diagonal directions.
VERTICAL
HORIZONTAL
DIAGONAL1 DIAGONAL2
2.1.1.5
Rotated Difference Filters
By applying the rotation on the previous difference masks, four more
difference filters are obtained.
2.1.1.6
Enhanced Details Filter
The convolution mask used for this filter is 3 x 3 matrix as follows:
2.1.2
Nonlinear Transformations
2.1.2.1
Powerlaw Transform
The mapping equation for the power-law transform:

s = c * r^γ

Imaging equipment, such as printers, uses the power-law transform; that is, its response is nonlinear. For values of γ greater than one, images will appear darker, and for values of γ less than one, images will appear lighter.
2.1.2.2
GammaCorrection Transform
The mapping equation for the gamma-correction transform:

r = (s / c)^(1/γ)

If the response function of a device is given by the above power-law transform, then it can be compensated for by applying the above gamma-correction equation.
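The original equation figures were not preserved; assuming the usual form s = c * r^γ on an image normalised to [0, 1], the pair of transforms can be sketched as:

```python
import numpy as np

def power_law(img, gamma, c=1.0):
    """Power-law (gamma) transform on an image normalised to [0, 1]:
    s = c * r ** gamma."""
    return c * np.asarray(img, dtype=float) ** gamma

def gamma_correct(img, gamma, c=1.0):
    """Compensate for a device response s = c * r ** gamma by applying
    the inverse exponent 1/gamma."""
    return (np.asarray(img, dtype=float) / c) ** (1.0 / gamma)
```

Applying gamma correction after the power-law response recovers the original values.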
2.2
Histogram
A gray-level histogram of an image is the distribution of the gray levels in the image. A histogram with a small spread indicates that the image has low contrast, while a histogram with widely stretched levels indicates that the image has high contrast. The histogram can be modified by a mapping function, which will stretch, shrink (compress) or slide the histogram.
2.2.1
Histogram Modification
2.2.1.1
Histogram Stretch
The mapping function for histogram stretch:

stretch(I(r,c)) = ( (I(r,c) - I_MIN) / (I_MAX - I_MIN) ) * (MAX - MIN) + MIN

where I_MAX is the largest gray-level value in the image, I_MIN is the smallest gray-level value in the image, and MIN and MAX correspond to the smallest and largest possible gray-level values (0 and 255 for an 8-bit image). The effect of histogram stretch is to increase the contrast of low-contrast images.
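A NumPy sketch of the stretch mapping for an 8-bit image:

```python
import numpy as np

def histogram_stretch(img, new_min=0.0, new_max=255.0):
    """Linearly map the image's [min, max] range onto [new_min, new_max]."""
    img = np.asarray(img, dtype=float)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) * (new_max - new_min) + new_min
```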
2.2.1.2
Histogram Shrink (compress)
The mapping function for histogram shrink:

shrink(I(r,c)) = ( (I(r,c) - I_MIN) / (I_MAX - I_MIN) ) * (SHRINK_MAX - SHRINK_MIN) + SHRINK_MIN

where I_MAX is the largest gray-level value in the image, I_MIN is the smallest gray-level value in the image, and SHRINK_MAX and SHRINK_MIN are the maximum and minimum values desired in the compressed histogram. The effect of histogram shrink is to decrease the contrast of images.
2.2.1.3
Histogram Slide
The histogram slide technique can be used to make an image either darker or lighter while retaining the relationship between gray-level values. This can be accomplished simply by adding or subtracting a fixed number from all pixel values in the image:

slide(I(r,c)) = I(r,c) + OFFSET

where OFFSET is the amount to slide the histogram.
2.2.2
Histogram Equalization
Histogram equalization is an effective technique that can be used to improve the appearance of poor images. Its function is similar to histogram stretch, with two differences:
First, the histogram after performing histogram equalization is as flat as possible, while in histogram stretch the overall shape of the histogram remains the same. Second, the result of histogram equalization is more pleasing and effective over a wide range of images.
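The standard equalization procedure, via the normalised cumulative histogram, can be sketched in NumPy as follows (the package's own MATLAB code is not preserved here):

```python
import numpy as np

def equalize(img, levels=256):
    """Histogram equalization: map each gray level through the
    normalised cumulative histogram of the image."""
    img = np.asarray(img, dtype=np.uint8)
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum() / img.size                    # cumulative distribution
    mapping = np.round(cdf * (levels - 1)).astype(np.uint8)
    return mapping[img]
```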
Edge detection methods are used as a first step in line detection. They are also used to find complex object boundaries. Edge detection operators are based on the idea that edge information in an image is found by looking at the relationship between a pixel and its neighbors. If a pixel's gray-level value is similar to those around it, there is probably no edge at that point. However, if a pixel's gray level varies widely from those of its neighbors, it may represent an edge.
3
Edge Detection
3.1
Gradient Operators
3.1.1
Roberts Operator
The Roberts operator is the simplest edge detection operator; it marks edge points only and does not return any information about edge orientation. The Roberts operator works best with binary images; therefore, ideally, gray-level images should first be converted to binary images by using a threshold operation.
Roberts Operator: First Form
Roberts Operator: Second Form
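The mask figures for the two forms were not preserved on this page. In the standard formulation, assumed here, both forms combine the two cross differences of each 2x2 block: the first form takes the square root of the sum of squares, the second the sum of absolute values.

```python
import numpy as np

def roberts(img, form=2):
    """Roberts operator on the two cross differences of each 2x2 block.
    form=1: sqrt of sum of squares; form=2: sum of absolute values."""
    img = np.asarray(img, dtype=float)
    d1 = img[:-1, :-1] - img[1:, 1:]   # diagonal difference
    d2 = img[:-1, 1:] - img[1:, :-1]   # anti-diagonal difference
    if form == 1:
        return np.sqrt(d1 ** 2 + d2 ** 2)
    return np.abs(d1) + np.abs(d2)
```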
3.1.2
Sobel Operator
This method approximates the gradient by using a row mask and a column mask, which approximate the first derivative in each direction. The Sobel edge detection masks look for edges in both the horizontal and vertical directions and then combine this information into a single metric. The masks are as follows:
VERTICAL EDGE
HORIZONTAL EDGE
These masks are convolved with the image. At each pixel location this gives two numbers: s1, the result from the vertical edge mask, and s2, the result from the horizontal edge mask. s1 and s2 are used to compute two matrices: edge magnitude and edge direction.
In the following image, the edge magnitude is calculated by using:

M = sqrt(s1^2 + s2^2)

When the edge magnitude is instead calculated by adding the absolute values of s1 and s2, the edges become clearer.
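A NumPy sketch of the Sobel computation (border pixels skipped), using the standard vertical and horizontal masks:

```python
import numpy as np

SOBEL_V = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])      # responds to vertical edges
SOBEL_H = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])    # responds to horizontal edges

def sobel(img):
    """Return edge magnitude and direction from the two Sobel responses."""
    img = np.asarray(img, dtype=float)
    m, n = img.shape
    mag = np.zeros((m, n))
    direction = np.zeros((m, n))
    for i in range(1, m - 1):
        for j in range(1, n - 1):
            win = img[i - 1:i + 2, j - 1:j + 2]
            s1 = np.sum(win * SOBEL_V)
            s2 = np.sum(win * SOBEL_H)
            mag[i, j] = np.hypot(s1, s2)       # sqrt(s1^2 + s2^2)
            direction[i, j] = np.arctan2(s2, s1)
    return mag, direction
```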
3.1.3
Prewitt Operator
Similar to the Sobel operator but with different masks. The masks are simpler, which makes the calculations faster.
VERTICAL EDGE
HORIZONTAL EDGE
These masks are convolved with the image. At each pixel location this gives two numbers: p1, the result from the vertical edge mask, and p2, the result from the horizontal edge mask. p1 and p2 are used to compute two matrices: edge magnitude and edge direction.
3.1.4
Laplacian Operators
There are three Laplacian masks for edge detection. Unlike the Sobel and Prewitt masks, the Laplacian masks are rotationally symmetric, which means edges in all directions contribute to the result. One of these masks is applied to the image, and the sign of the result from two adjacent pixels provides information about the edge direction.
The resultant image can then be enhanced with a threshold value so that the edges contrast with the background.
3.2
Compass Masks
Kirsch and Robinson edge detection masks are called compass masks since they are defined by taking a single mask and rotating it to the eight major compass orientations: North, Northwest, West, Southwest, South, Southeast, East and Northeast.
3.2.1
Kirsch Compass Masks
K_{0}
K_{1}
K_{2}
K_{3}
K_{4}
K_{5}
K_{6}
K_{7}
How to apply the Kirsch compass masks?
1.
Convolve the image with the 8 masks.
2.
The edge magnitude is defined as the maximum value found at each point among the results of the above convolution masks.
3.
The edge direction is the direction of the mask that gives the maximum magnitude.
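The mask figures K0..K7 were not preserved on this page. The Kirsch masks can be generated by rotating the outer ring of coefficients (three 5s and five -3s) around a zero centre, and the three steps above sketched as:

```python
import numpy as np

def kirsch_masks():
    """Generate the 8 Kirsch masks by rotating the outer ring of coefficients."""
    ring = [5, 5, 5, -3, -3, -3, -3, -3]                    # clockwise values
    idx = [(0, 0), (0, 1), (0, 2), (1, 2),
           (2, 2), (2, 1), (2, 0), (1, 0)]                  # clockwise positions
    masks = []
    for k in range(8):
        m = np.zeros((3, 3))
        for p, (i, j) in enumerate(idx):
            m[i, j] = ring[(p - k) % 8]
        masks.append(m)
    return masks

def kirsch(img):
    """Edge magnitude = maximum response over the 8 masks;
    edge direction = index of the mask giving that maximum."""
    img = np.asarray(img, dtype=float)
    m, n = img.shape
    masks = kirsch_masks()
    mag = np.zeros((m, n))
    direction = np.zeros((m, n), dtype=int)
    for i in range(1, m - 1):
        for j in range(1, n - 1):
            win = img[i - 1:i + 2, j - 1:j + 2]
            responses = [np.sum(win * mk) for mk in masks]
            mag[i, j] = max(responses)
            direction[i, j] = int(np.argmax(responses))
    return mag, direction
```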
3.2.2
Robinson Compass Masks
Used in a manner similar to the Kirsch masks, but the Robinson masks are easier to implement since they rely only on the coefficients 0, 1 and 2.
r_{0}
r_{1}
r_{2}
r_{3}
r_{4}
r_{5}
r_{6}
r_{7}
3.3
Advanced Edge Detection Techniques
3.3.1
Laplacian of Gaussian (LoG) Edge Detection
The Laplacian of Gaussian (LoG) edge detection method requires the following steps:
1.
Convolve the image with a Gaussian smoothing filter.
2.
Convolve the result with a Laplacian mask.
A common 5x5 mask that approximates the combination of the Gaussian and the Laplacian in one convolution mask is as follows:
3.3.2
Canny Algorithm
The Canny algorithm consists of the following steps:
1.
Apply a Gaussian filter mask to smooth the image.
2.
Find the gradient magnitude and edge direction using the following convolution masks and equations:
The goal of image segmentation is to find regions that represent objects or
meaningful parts of objects. Image segmentation techniques can be divided
into three main categories:
1.
Region growing and shrinking
2.
Clustering methods
3.
Boundary detection
4
Image Segmentation
4.1
Boundary Extraction
Boundary detection for image segmentation is performed by finding boundaries
between objects, thus indirectly defining the objects.
Boundary Detection General Steps:
(1)
Mark potential edge points by finding discontinuities in features such as brightness.
Edge operators tend to mark points of rapid change, thus indicating the possibility of an object boundary. In this package, the Sobel operator is used as the edge detection method. The resultant image and its histogram appear as follows:
(2)
Threshold the results.
One method is to examine the histogram of the edge detection results, looking for the best valley manually.

With a bimodal histogram, which is a histogram with two major peaks, a good threshold value can be determined easily. A bimodal histogram is typical of images that have one object against a high-contrast background.

With a histogram that has more than two peaks, a method called minimizing within-group variance, also known as the Otsu method, can be used.
Often, the histogram of an image that has been operated on by an edge operator is unimodal (one peak), so it may be difficult to find a good valley. In this case, the average value is used as the threshold.
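When no valley is evident, the average-value fallback described above can be sketched as:

```python
import numpy as np

def threshold_at_mean(mag):
    """Threshold an edge-magnitude image at its average value (the
    fallback for unimodal histograms); returns a binary image."""
    mag = np.asarray(mag, dtype=float)
    return (mag > mag.mean()).astype(np.uint8)
```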
(3)
Merge edge segments into boundaries via edge linking
In this package, the edge points have been marked by using the plot(x,y) method.
The technique used for marking points is as follows:
if three successive points above the threshold value are found, they are considered part of the object boundary.
After Marking the edge points:
Downloads!
Download the presentation slides by Salha Alzahrani.


Important Note: Many of these filters and image processing tools are built-in functions in the MATLAB Image Processing Toolbox. Nevertheless, in my project I deal with images as arrays of pixels and redefine (or rebuild) my own code that implements each image processing task.

