Commit 30e393df authored by Vadim Pisarevsky

added latest tutorials from the trunk; fixed a few build problems

parent 05c98f56
.. _Table-Of-Content-Calib3D:
*calib3d* module. Camera calibration and 3D reconstruction
-----------------------------------------------------------
Although most of our images come in a 2D format, they originate from a 3D world. Here you will learn how to recover information about the 3D world from 2D images.
.. include:: ../../definitions/noContent.rst
.. _Adding_Images:
Adding (blending) two images using OpenCV
*******************************************
Goal
=====
In this tutorial you will learn:
* What *linear blending* is and why it is useful.
* How to add two images using :add_weighted:`addWeighted <>`
Cool Theory
=================
.. note::
The explanation below belongs to the book `Computer Vision: Algorithms and Applications <http://szeliski.org/Book/>`_ by Richard Szeliski
From our previous tutorial, we know already a bit of *Pixel operators*. An interesting dyadic (two-input) operator is the *linear blend operator*:
.. math::
g(x) = (1 - \alpha)f_{0}(x) + \alpha f_{1}(x)
By varying :math:`\alpha` from :math:`0 \rightarrow 1` this operator can be used to perform a temporal *cross-dissolve* between two images or videos, as seen in slide shows and film production (cool, eh?)
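The blend operator above can be sketched per sample without any library. The helper name ``linear_blend`` is hypothetical (it is not an OpenCV function), and the sketch assumes both inputs are equally sized buffers of 8-bit samples:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of OpenCV): g(x) = (1 - alpha)*f0(x) + alpha*f1(x)
// applied to two equally sized buffers of 8-bit samples.
std::vector<std::uint8_t> linear_blend( const std::vector<std::uint8_t>& f0,
                                        const std::vector<std::uint8_t>& f1,
                                        double alpha )
{
    assert( f0.size() == f1.size() );        // both inputs must have the same size
    assert( alpha >= 0.0 && alpha <= 1.0 );  // blend weight must stay in [0, 1]

    std::vector<std::uint8_t> g( f0.size() );
    for( std::size_t i = 0; i < f0.size(); ++i )
    {
        // With alpha in [0, 1] the result always stays inside [0, 255].
        double value = ( 1.0 - alpha ) * f0[i] + alpha * f1[i];
        g[i] = static_cast<std::uint8_t>( std::lround( value ) );
    }
    return g;
}
```

With :math:`\alpha = 0` the output equals :math:`f_{0}`, with :math:`\alpha = 1` it equals :math:`f_{1}`, and intermediate values mix the two.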
Code
=====
As usual, after the not-so-lengthy explanation, let's go to the code. Here it is:
.. code-block:: cpp
#include <cv.h>
#include <highgui.h>
#include <iostream>
#include <stdio.h>   // printf, used for the error messages below
using namespace cv;
int main( int argc, char** argv )
{
double alpha = 0.5; double beta; double input;
Mat src1, src2, dst;
/// Ask the user to enter alpha
std::cout<<" Simple Linear Blender "<<std::endl;
std::cout<<"-----------------------"<<std::endl;
std::cout<<"* Enter alpha [0-1]: ";
std::cin>>input;
/// We use the alpha provided by the user only if it is between 0 and 1
/// (the check must be made on the value just read, not on the default alpha)
if( input >= 0 && input <= 1 )
{ alpha = input; }
/// Read image ( same size, same type )
src1 = imread("../../images/LinuxLogo.jpg");
src2 = imread("../../images/WindowsLogo.jpg");
if( !src1.data ) { printf("Error loading src1 \n"); return -1; }
if( !src2.data ) { printf("Error loading src2 \n"); return -1; }
/// Create Windows
namedWindow("Linear Blend", 1);
beta = ( 1.0 - alpha );
addWeighted( src1, alpha, src2, beta, 0.0, dst);
imshow( "Linear Blend", dst );
waitKey(0);
return 0;
}
Explanation
============
#. Since we are going to perform:
.. math::
g(x) = (1 - \alpha)f_{0}(x) + \alpha f_{1}(x)
We need two source images (:math:`f_{0}(x)` and :math:`f_{1}(x)`). So, we load them in the usual way:
.. code-block:: cpp
src1 = imread("../../images/LinuxLogo.jpg");
src2 = imread("../../images/WindowsLogo.jpg");
.. warning::
Since we are *adding* *src1* and *src2*, they both have to be of the same size (width and height) and type.
#. Now we need to generate the :math:`g(x)` image. For this, the function :add_weighted:`addWeighted <>` comes in quite handy:
.. code-block:: cpp
beta = ( 1.0 - alpha );
addWeighted( src1, alpha, src2, beta, 0.0, dst);
since :add_weighted:`addWeighted <>` produces:
.. math::
dst = \alpha \cdot src1 + \beta \cdot src2 + \gamma
In this case, :math:`\gamma` is the argument :math:`0.0` in the code above.
#. Create windows, show the images and wait for the user to end the program.
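The weighted-sum formula above can be illustrated for a single 8-bit sample. This is a minimal sketch, not OpenCV's implementation: the name ``weighted_sum_u8`` is hypothetical, and the rounding and clamping shown here are assumptions about how an 8-bit result is kept in range.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the per-element arithmetic:
// dst = alpha*src1 + beta*src2 + gamma, rounded and clamped to [0, 255].
std::uint8_t weighted_sum_u8( std::uint8_t src1, double alpha,
                              std::uint8_t src2, double beta, double gamma )
{
    double value = alpha * src1 + beta * src2 + gamma;
    assert( std::isfinite( value ) );   // weights and offset must be finite numbers

    long rounded = std::lround( value );
    if( rounded < 0 )   return 0;       // clamp below the 8-bit range
    if( rounded > 255 ) return 255;     // clamp above the 8-bit range
    return static_cast<std::uint8_t>( rounded );
}
```

Choosing :math:`\beta = 1 - \alpha` and :math:`\gamma = 0`, as the tutorial does, keeps the result inside the valid range, so the clamping only matters for other weight choices.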
Result
=======
.. image:: images/Adding_Images_Tutorial_Result_0.png
:alt: Blending Images Tutorial - Final Result
:align: center
.. _Drawing_1:
Basic Drawing
****************
Goals
======
In this tutorial you will learn how to:
* Use :point:`Point <>` to define 2D points in an image.
* Use :scalar:`Scalar <>` and understand why it is useful
* Draw a **line** by using the OpenCV function :line:`line <>`
* Draw an **ellipse** by using the OpenCV function :ellipse:`ellipse <>`
* Draw a **rectangle** by using the OpenCV function :rectangle:`rectangle <>`
* Draw a **circle** by using the OpenCV function :circle:`circle <>`
* Draw a **filled polygon** by using the OpenCV function :fill_poly:`fillPoly <>`
OpenCV Theory
===============
For this tutorial, we will heavily use two structures: :point:`Point <>` and :scalar:`Scalar <>`:
Point
-------
It represents a 2D point, specified by its image coordinates :math:`x` and :math:`y`. We can define it as:
.. code-block:: cpp
Point pt;
pt.x = 10;
pt.y = 8;
or
.. code-block:: cpp
Point pt = Point(10, 8);
Scalar
-------
* Represents a 4-element vector. The type Scalar is widely used in OpenCV for passing pixel values.
* In this tutorial, we will use it extensively to represent BGR color values (3 parameters). It is not necessary to define the last argument if it is not going to be used.
* Let's see an example: if we are asked for a color argument and we give:
.. code-block:: cpp
Scalar( a, b, c )
We would be defining a BGR color such that: *Blue = a*, *Green = b* and *Red = c*
Code
=====
* This code is in your OpenCV sample folder. Otherwise you can grab it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/Basic/Drawing_1.cpp>`_
Explanation
=============
#. Since we plan to draw two examples (an atom and a rook), we have to create two images and two windows to display them.
.. code-block:: cpp
/// Windows names
char atom_window[] = "Drawing 1: Atom";
char rook_window[] = "Drawing 2: Rook";
/// Create black empty images
Mat atom_image = Mat::zeros( w, w, CV_8UC3 );
Mat rook_image = Mat::zeros( w, w, CV_8UC3 );
#. We created functions to draw different geometric shapes. For instance, to draw the atom we used *MyEllipse* and *MyFilledCircle*:
.. code-block:: cpp
/// 1. Draw a simple atom:
/// 1.a. Creating ellipses
MyEllipse( atom_image, 90 );
MyEllipse( atom_image, 0 );
MyEllipse( atom_image, 45 );
MyEllipse( atom_image, -45 );
/// 1.b. Creating circles
MyFilledCircle( atom_image, Point( w/2.0, w/2.0) );
#. And to draw the rook we employed *MyLine*, *rectangle* and *MyPolygon*:
.. code-block:: cpp
/// 2. Draw a rook
/// 2.a. Create a convex polygon
MyPolygon( rook_image );
/// 2.b. Creating rectangles
rectangle( rook_image,
Point( 0, 7*w/8.0 ),
Point( w, w),
Scalar( 0, 255, 255 ),
-1,
8 );
/// 2.c. Create a few lines
MyLine( rook_image, Point( 0, 15*w/16 ), Point( w, 15*w/16 ) );
MyLine( rook_image, Point( w/4, 7*w/8 ), Point( w/4, w ) );
MyLine( rook_image, Point( w/2, 7*w/8 ), Point( w/2, w ) );
MyLine( rook_image, Point( 3*w/4, 7*w/8 ), Point( 3*w/4, w ) );
#. Let's check what is inside each of these functions:
* *MyLine*
.. code-block:: cpp
void MyLine( Mat img, Point start, Point end )
{
int thickness = 2;
int lineType = 8;
line( img,
start,
end,
Scalar( 0, 0, 0 ),
thickness,
lineType );
}
As we can see, *MyLine* just calls the function :line:`line <>`, which does the following:
* Draw a line from Point **start** to Point **end**
* The line is displayed in the image **img**
* The line color is defined by **Scalar( 0, 0, 0)** which is the BGR value corresponding to **Black**
* The line thickness is set to **thickness** (in this case 2)
* The line is an 8-connected one (**lineType** = 8)
* *MyEllipse*
.. code-block:: cpp
void MyEllipse( Mat img, double angle )
{
int thickness = 2;
int lineType = 8;
ellipse( img,
Point( w/2.0, w/2.0 ),
Size( w/4.0, w/16.0 ),
angle,
0,
360,
Scalar( 255, 0, 0 ),
thickness,
lineType );
}
From the code above, we can observe that the function :ellipse:`ellipse <>` draws an ellipse such that:
* The ellipse is displayed in the image **img**
* The ellipse center is located at the point **(w/2.0, w/2.0)** and its half-axes are given by the size **(w/4.0, w/16.0)**
* The ellipse is rotated **angle** degrees
* The ellipse extends an arc between **0** and **360** degrees
* The color of the figure will be **Scalar( 255, 0, 0)** which means blue in BGR order.
* The ellipse's **thickness** is 2.
* *MyFilledCircle*
.. code-block:: cpp
void MyFilledCircle( Mat img, Point center )
{
int thickness = -1;
int lineType = 8;
circle( img,
center,
w/32.0,
Scalar( 0, 0, 255 ),
thickness,
lineType );
}
Similar to the ellipse function, we can observe that *circle* receives as arguments:
* The image where the circle will be displayed (**img**)
* The center of the circle denoted as the Point **center**
* The radius of the circle: **w/32.0**
* The color of the circle: **Scalar(0, 0, 255)** which means *Red* in BGR
* Since **thickness** = -1, the circle will be drawn filled.
* *MyPolygon*
.. code-block:: cpp
void MyPolygon( Mat img )
{
int lineType = 8;
/** Create some points */
Point rook_points[1][20];
rook_points[0][0] = Point( w/4.0, 7*w/8.0 );
rook_points[0][1] = Point( 3*w/4.0, 7*w/8.0 );
rook_points[0][2] = Point( 3*w/4.0, 13*w/16.0 );
rook_points[0][3] = Point( 11*w/16.0, 13*w/16.0 );
rook_points[0][4] = Point( 19*w/32.0, 3*w/8.0 );
rook_points[0][5] = Point( 3*w/4.0, 3*w/8.0 );
rook_points[0][6] = Point( 3*w/4.0, w/8.0 );
rook_points[0][7] = Point( 26*w/40.0, w/8.0 );
rook_points[0][8] = Point( 26*w/40.0, w/4.0 );
rook_points[0][9] = Point( 22*w/40.0, w/4.0 );
rook_points[0][10] = Point( 22*w/40.0, w/8.0 );
rook_points[0][11] = Point( 18*w/40.0, w/8.0 );
rook_points[0][12] = Point( 18*w/40.0, w/4.0 );
rook_points[0][13] = Point( 14*w/40.0, w/4.0 );
rook_points[0][14] = Point( 14*w/40.0, w/8.0 );
rook_points[0][15] = Point( w/4.0, w/8.0 );
rook_points[0][16] = Point( w/4.0, 3*w/8.0 );
rook_points[0][17] = Point( 13*w/32.0, 3*w/8.0 );
rook_points[0][18] = Point( 5*w/16.0, 13*w/16.0 );
rook_points[0][19] = Point( w/4.0, 13*w/16.0) ;
const Point* ppt[1] = { rook_points[0] };
int npt[] = { 20 };
fillPoly( img,
ppt,
npt,
1,
Scalar( 255, 255, 255 ),
lineType );
}
To draw a filled polygon we use the function :fill_poly:`fillPoly <>`. We note that:
* The polygon will be drawn on **img**
* The vertices of the polygon are the set of points in **ppt**
* The total number of vertices to be drawn is **npt**
* The number of polygons to be drawn is only **1**
* The color of the polygon is defined by **Scalar( 255, 255, 255)**, which is the BGR value for *white*
* *rectangle*
.. code-block:: cpp
rectangle( rook_image,
Point( 0, 7*w/8.0 ),
Point( w, w),
Scalar( 0, 255, 255 ),
-1,
8 );
Finally we have the :rectangle:`rectangle <>` function (we did not create a special function for this guy). We note that:
* The rectangle will be drawn on **rook_image**
* Two opposite vertices of the rectangle are defined by **Point( 0, 7*w/8.0 )** and **Point( w, w)**
* The color of the rectangle is given by **Scalar(0, 255, 255)** which is the BGR value for *yellow*
* Since the thickness value is given by **-1**, the rectangle will be filled.
Result
=======
Compiling and running your program should give you a result like this:
.. image:: images/Drawing_1_Tutorial_Result_0.png
:alt: Drawing Tutorial 1 - Final Result
:align: center
.. _Basic_Linear_Transform:
Changing the contrast and brightness of an image!
***************************************************
Goal
=====
In this tutorial you will learn how to:
* Access pixel values
* Initialize a matrix with zeros
* Understand what :saturate_cast:`saturate_cast <>` does and why it is useful
* Get some cool info about pixel transformations
Cool Theory
=================
.. note::
The explanation below belongs to the book `Computer Vision: Algorithms and Applications <http://szeliski.org/Book/>`_ by Richard Szeliski
Image Processing
--------------------
* A general image processing operator is a function that takes one or more input images and produces an output image.
* Image transforms can be seen as:
* Point operators (pixel transforms)
* Neighborhood (area-based) operators
Pixel Transforms
^^^^^^^^^^^^^^^^^
* In this kind of image processing transform, each output pixel's value depends on only the corresponding input pixel value (plus, potentially, some globally collected information or parameters).
* Examples of such operators include *brightness and contrast adjustments* as well as color correction and transformations.
Brightness and contrast adjustments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Two commonly used point processes are *multiplication* and *addition* with a constant:
.. math::
g(x) = \alpha f(x) + \beta
* The parameters :math:`\alpha > 0` and :math:`\beta` are often called the *gain* and *bias* parameters; sometimes these parameters are said to control *contrast* and *brightness* respectively.
* You can think of :math:`f(x)` as the source image pixels and :math:`g(x)` as the output image pixels. Then, more conveniently we can write the expression as:
.. math::
g(i,j) = \alpha \cdot f(i,j) + \beta
where :math:`i` and :math:`j` indicate that the pixel is located in the *i-th* row and *j-th* column.
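As a quick worked example, take the gain :math:`\alpha = 2.2` and bias :math:`\beta = 50` used in the Result section of this tutorial. The two source values 40 and 100 are arbitrary choices for illustration:

.. math::

   g = 2.2 \cdot 40 + 50 = 138, \qquad g = 2.2 \cdot 100 + 50 = 270 > 255

The first result fits in an 8-bit pixel, while the second exceeds the maximum of 255 and has to be clamped, which is why the code relies on a saturating conversion.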
Code
=====
* The following code performs the operation :math:`g(i,j) = \alpha \cdot f(i,j) + \beta`
* Here it is:
.. code-block:: cpp
#include <cv.h>
#include <highgui.h>
#include <iostream>
using namespace cv;
double alpha; /**< Simple contrast control */
int beta; /**< Simple brightness control */
int main( int argc, char** argv )
{
/// Read image given by user
Mat image = imread( argv[1] );
Mat new_image = Mat::zeros( image.size(), image.type() );
/// Initialize values
std::cout<<" Basic Linear Transforms "<<std::endl;
std::cout<<"-------------------------"<<std::endl;
std::cout<<"* Enter the alpha value [1.0-3.0]: ";std::cin>>alpha;
std::cout<<"* Enter the beta value [0-100]: "; std::cin>>beta;
/// Do the operation new_image(i,j) = alpha*image(i,j) + beta
for( int y = 0; y < image.rows; y++ )
{ for( int x = 0; x < image.cols; x++ )
{ for( int c = 0; c < 3; c++ )
{
new_image.at<Vec3b>(y,x)[c] = saturate_cast<uchar>( alpha*( image.at<Vec3b>(y,x)[c] ) + beta );
}
}
}
/// Create Windows
namedWindow("Original Image", 1);
namedWindow("New Image", 1);
/// Show stuff
imshow("Original Image", image);
imshow("New Image", new_image);
/// Wait until the user presses a key
waitKey();
return 0;
}
Explanation
============
#. We begin by creating parameters to save :math:`\alpha` and :math:`\beta` to be entered by the user:
.. code-block:: cpp
double alpha;
int beta;
#. We load an image using :imread:`imread <>` and save it in a Mat object:
.. code-block:: cpp
Mat image = imread( argv[1] );
#. Now, since we will make some transformations to this image, we need a new Mat object to store it. Also, we want this to have the following features:
* Initial pixel values equal to zero
* Same size and type as the original image
.. code-block:: cpp
Mat new_image = Mat::zeros( image.size(), image.type() );
We observe that :mat_zeros:`Mat::zeros <>` returns a Matlab-style zero initializer based on *image.size()* and *image.type()*
#. Now, to perform the operation :math:`g(i,j) = \alpha \cdot f(i,j) + \beta` we will access each pixel in the image. Since we are operating with BGR images, we will have three values per pixel (B, G and R), so we will also access them separately. Here is the piece of code:
.. code-block:: cpp
for( int y = 0; y < image.rows; y++ )
{ for( int x = 0; x < image.cols; x++ )
{ for( int c = 0; c < 3; c++ )
{ new_image.at<Vec3b>(y,x)[c] = saturate_cast<uchar>( alpha*( image.at<Vec3b>(y,x)[c] ) + beta ); }
}
}
Notice the following:
* To access each pixel in the images we are using this syntax: *image.at<Vec3b>(y,x)[c]* where *y* is the row, *x* is the column and *c* is B, G or R (0, 1 or 2).
* Since the operation :math:`\alpha \cdot f(i,j) + \beta` can give values that are out of range or not integers (if :math:`\alpha` is a float), we use :saturate_cast:`saturate_cast <>` to make sure the values are valid.
#. Finally, we create windows and show the images, the usual way.
.. code-block:: cpp
namedWindow("Original Image", 1);
namedWindow("New Image", 1);
imshow("Original Image", image);
imshow("New Image", new_image);
waitKey(0);
.. note::
Instead of using the **for** loops to access each pixel, we could have simply used this command:
.. code-block:: cpp
image.convertTo(new_image, -1, alpha, beta);
where :convert_to:`convertTo <>` would effectively perform :math:`new\_image = \alpha \cdot image + \beta`. However, we wanted to show you how to access each pixel. In any case, both methods give the same result.
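To make the role of the saturating conversion concrete, the loop above can be sketched on a plain byte buffer, without OpenCV. The names ``clamp_to_u8`` and ``adjust_contrast_brightness`` are hypothetical, and the interleaved layout (rows x cols x 3 bytes) is an assumption of this sketch:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical clamp-and-round helper, standing in for the role that
// saturate_cast<uchar> plays in the tutorial code.
std::uint8_t clamp_to_u8( double value )
{
    if( value <= 0.0 )   return 0;
    if( value >= 255.0 ) return 255;
    return static_cast<std::uint8_t>( std::lround( value ) );
}

// Applies g = alpha*f + beta to every sample of an interleaved 3-channel
// buffer (rows * cols * 3 bytes, channel order B, G, R).
std::vector<std::uint8_t> adjust_contrast_brightness( const std::vector<std::uint8_t>& image,
                                                      int rows, int cols,
                                                      double alpha, int beta )
{
    const int channels = 3;
    assert( image.size() == static_cast<std::size_t>( rows ) * cols * channels );

    std::vector<std::uint8_t> new_image( image.size(), 0 );   // zero-initialized, same size
    for( int y = 0; y < rows; y++ )
        for( int x = 0; x < cols; x++ )
            for( int c = 0; c < channels; c++ )
            {
                std::size_t idx = ( static_cast<std::size_t>( y ) * cols + x ) * channels + c;
                new_image[idx] = clamp_to_u8( alpha * image[idx] + beta );
            }
    return new_image;
}
```

Without the clamp, a value such as 270 would wrap around when stored in a byte and show up as a dark pixel instead of a bright one.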
Result
=======
* Running our code and using :math:`\alpha = 2.2` and :math:`\beta = 50`
.. code-block:: bash
$ ./BasicLinearTransforms lena.png
Basic Linear Transforms
-------------------------
* Enter the alpha value [1.0-3.0]: 2.2
* Enter the beta value [0-100]: 50
* We get this:
.. image:: images/Basic_Linear_Transform_Tutorial_Result_0.png
:height: 400px
:alt: Basic Linear Transform - Final Result
:align: center
.. _Table-Of-Content-Core:
*core* module. The Core Functionality
-----------------------------------------------------------
Here you will learn about the basic building blocks of the library. A must-read for understanding how to manipulate images at the pixel level.
.. toctree::
:hidden:
../adding_images/adding_images
../basic_linear_transform/basic_linear_transform
../basic_geometric_drawing/basic_geometric_drawing
../random_generator_and_text/random_generator_and_text
.. |Author_AnaH| unicode:: Ana U+0020 Huam U+00E1 n
* :ref:`Adding_Images`
=============== ======================================================
 |Beginners_4|   *Title:* **Linear Blending**

                 *Compatibility:* > OpenCV 2.0

                 *Author:* |Author_AnaH|

                 We will learn how to blend two images!
=============== ======================================================
.. |Beginners_4| image:: images/Adding_Images_Tutorial_Result_0.png
:height: 100pt
:width: 100pt
* :ref:`Basic_Linear_Transform`
=============== ====================================================
 |Bas_Lin_Tran|  *Title:* **Changing the contrast and brightness of an image**

                 *Compatibility:* > OpenCV 2.0

                 *Author:* |Author_AnaH|

                 We will learn how to change our image appearance!
=============== ====================================================
.. |Bas_Lin_Tran| image:: images/Basic_Linear_Transform_Tutorial_Result_0.png
:height: 100pt
:width: 100pt
* :ref:`Drawing_1`
=============== ======================================================
 |Beginners_6|   *Title:* **Basic Drawing**

                 *Compatibility:* > OpenCV 2.0

                 *Author:* |Author_AnaH|

                 We will learn how to draw simple geometry with OpenCV!
=============== ======================================================
.. |Beginners_6| image:: images/Drawing_1_Tutorial_Result_0.png
:height: 100pt
:width: 100pt
* :ref:`Drawing_2`
=============== ======================================================
 |Beginners_7|   *Title:* **Cool Drawing**

                 *Compatibility:* > OpenCV 2.0

                 *Author:* |Author_AnaH|

                 We will draw some *fancy-looking* stuff using OpenCV!
=============== ======================================================
.. |Beginners_7| image:: images/Drawing_2_Tutorial_Result_7.png
:height: 100pt
:width: 100pt
.. note::
Unfortunately we have no tutorials in this section yet. Nevertheless, our tutorial writing team is working on it. If you have a tutorial suggestion, or you have written a tutorial (or coded a sample) that you would like to see here, please contact us via our :opencv_group:`user group <>`.
.. _Table-Of-Content-Feature2D:
*feature2d* module. 2D Features framework
-----------------------------------------------------------
Learn about how to use the feature points detectors, descriptors and matching framework found inside OpenCV.
.. include:: ../../definitions/noContent.rst
.. _Table-Of-Content-General:
General tutorials
-----------------------------------------------------------
These tutorials are the bottom of the iceberg, as they link together several of the modules presented above in order to solve complex problems.
.. include:: ../../definitions/noContent.rst
.. _Table-Of-Content-GPU:
*gpu* module. GPU-Accelerated Computer Vision
-----------------------------------------------------------
Squeeze every bit of computing power out of your system by using your video card to run the OpenCV algorithms.
.. include:: ../../definitions/noContent.rst
.. _Table-Of-Content-HighGui:
*highgui* module. High Level GUI and Media
-----------------------------------------------------------
This section contains valuable tutorials about how to read/save your image/video files and how to use the built-in graphical user interface of the library.
.. toctree::
:hidden:
../trackbar/trackbar
* :ref:`Adding_Trackbars`
=============== ======================================================
 |Beginners_5|   *Title:* **Creating Trackbars**

                 *Compatibility:* > OpenCV 2.0

                 We will learn how to add a Trackbar to our applications
=============== ======================================================
.. |Beginners_5| image:: images/Adding_Trackbars_Tutorial_Cover.png
:height: 100pt
:width: 100pt
.. _Adding_Trackbars:
Adding a Trackbar to our applications!
***************************************
* In the previous tutorials (about *linear blending* and the *brightness and contrast adjustments*) you might have noted that we needed to give some **input** to our programs, such as :math:`\alpha` and :math:`\beta`. We accomplished that by entering this data using the Terminal.
* Well, it is time to use some fancy GUI tools. OpenCV provides some GUI utilities (*highgui.h*) for you. An example of this is a **Trackbar**.
.. image:: images/Adding_Trackbars_Tutorial_Trackbar.png
:alt: Trackbar example
:align: center
* In this tutorial we will just modify our two previous programs so that they get the input information from the trackbar.
Goals
======
In this tutorial you will learn how to:
* Add a Trackbar in an OpenCV window by using :create_trackbar:`createTrackbar <>`
Code
=====
Let's modify the program made in the tutorial :ref:`Adding_Images`. We will let the user enter the :math:`\alpha` value by using the Trackbar.
.. code-block:: cpp
#include <cv.h>
#include <highgui.h>
#include <stdio.h>   // printf and sprintf, used below
using namespace cv;
/// Global Variables
const int alpha_slider_max = 100;
int alpha_slider;
double alpha;
double beta;
/// Matrices to store images
Mat src1;
Mat src2;
Mat dst;
/**
* @function on_trackbar
* @brief Callback for trackbar
*/
void on_trackbar( int, void* )
{
alpha = (double) alpha_slider/alpha_slider_max ;
beta = ( 1.0 - alpha );
addWeighted( src1, alpha, src2, beta, 0.0, dst);
imshow( "Linear Blend", dst );
}
int main( int argc, char** argv )
{
/// Read image ( same size, same type )
src1 = imread("../../images/LinuxLogo.jpg");
src2 = imread("../../images/WindowsLogo.jpg");
if( !src1.data ) { printf("Error loading src1 \n"); return -1; }
if( !src2.data ) { printf("Error loading src2 \n"); return -1; }
/// Initialize values
alpha_slider = 0;
/// Create Windows
namedWindow("Linear Blend", 1);
/// Create Trackbars
char TrackbarName[50];
sprintf( TrackbarName, "Alpha x %d", alpha_slider_max );
createTrackbar( TrackbarName, "Linear Blend", &alpha_slider, alpha_slider_max, on_trackbar );
/// Show some stuff
on_trackbar( alpha_slider, 0 );
/// Wait until the user presses a key
waitKey(0);
return 0;
}
Explanation
============
We only analyze the code that is related to Trackbar:
#. First, we load two images, which are going to be blended.
.. code-block:: cpp
src1 = imread("../../images/LinuxLogo.jpg");
src2 = imread("../../images/WindowsLogo.jpg");
#. To create a trackbar, first we have to create the window in which it is going to be located. So:
.. code-block:: cpp
namedWindow("Linear Blend", 1);
#. Now we can create the Trackbar:
.. code-block:: cpp
createTrackbar( TrackbarName, "Linear Blend", &alpha_slider, alpha_slider_max, on_trackbar );
Note the following:
* Our Trackbar has a label **TrackbarName**
* The Trackbar is located in the window named **"Linear Blend"**
* The Trackbar values will be in the range from :math:`0` to **alpha_slider_max** (the minimum limit is always **zero**).
* The numerical value of Trackbar is stored in **alpha_slider**
* Whenever the user moves the Trackbar, the callback function **on_trackbar** is called
#. Finally, we have to define the callback function **on_trackbar**
.. code-block:: cpp
void on_trackbar( int, void* )
{
alpha = (double) alpha_slider/alpha_slider_max ;
beta = ( 1.0 - alpha );
addWeighted( src1, alpha, src2, beta, 0.0, dst);
imshow( "Linear Blend", dst );
}
Note that:
* We use the value of **alpha_slider** (integer) to get a double value for **alpha**.
* **alpha_slider** is updated each time the trackbar is displaced by the user.
* We define *src1*, *src2*, *dst*, *alpha*, *alpha_slider* and *beta* as global variables, so they can be used everywhere.
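The conversion from the integer slider position to a blend weight, done in the first line of **on_trackbar**, can be isolated in a small sketch. The helper name ``slider_to_alpha`` is hypothetical and only serves to show why the cast is needed:

```cpp
#include <cassert>

// Hypothetical helper mirroring the first line of on_trackbar: maps an integer
// slider position in [0, slider_max] to a blend weight in [0.0, 1.0].
double slider_to_alpha( int slider, int slider_max )
{
    assert( slider_max > 0 );
    assert( slider >= 0 && slider <= slider_max );
    // Cast before dividing: slider / slider_max in integer arithmetic would
    // truncate to 0 for every position except the maximum.
    return static_cast<double>( slider ) / slider_max;
}
```

For example, a slider at position 50 out of 100 yields a weight of 0.5, so both images contribute equally to the blend.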
Result
=======
* Our program produces the following output:
.. image:: images/Adding_Trackbars_Tutorial_Result_0.png
:alt: Adding Trackbars - Windows Linux
:align: center
* As practice, you can also add two trackbars to the program made in :ref:`Basic_Linear_Transform`: one trackbar to set :math:`\alpha` and another for :math:`\beta`. The output might look like:
.. image:: images/Adding_Trackbars_Tutorial_Result_1.png
:alt: Adding Trackbars - Lena
:height: 500px
:align: center
.. _Morphology_1:
Eroding and Dilating
**********************
Goal
=====
In this tutorial you will learn how to:
* Apply two very common morphology operators: Dilation and Erosion. For this purpose, you will use the following OpenCV functions:
* :erode:`erode <>`
* :dilate:`dilate <>`
Cool Theory
============
.. note::
The explanation below belongs to the book **Learning OpenCV** by Bradski and Kaehler.
Morphological Operations
--------------------------
* In short: A set of operations that process images based on shapes. Morphological operations apply a *structuring element* to an input image and generate an output image.
* The two most basic morphological operations are Erosion and Dilation. They have a wide array of uses, e.g.:
* Removing noise
* Isolation of individual elements and joining disparate elements in an image.
* Finding of intensity bumps or holes in an image
* We will explain dilation and erosion briefly, using the following image as an example:
.. image:: images/Morphology_1_Tutorial_Theory_Original_Image.png
:alt: Original image
:height: 100px
:align: center
Dilation
^^^^^^^^^
* This operation consists of convolving an image :math:`A` with some kernel (:math:`B`), which can have any shape or size, usually a square or circle.
* The kernel :math:`B` has a defined *anchor point*, usually being the center of the kernel.
* As the kernel :math:`B` is scanned over the image, we compute the maximal pixel value overlapped by :math:`B` and replace the image pixel in the anchor point position with that maximal value. As you can deduce, this maximizing operation causes bright regions within an image to "grow" (therefore the name *dilation*). Take as an example the image above. Applying dilation we can get:
.. image:: images/Morphology_1_Tutorial_Theory_Dilation.png
:alt: Dilation result - Theory example
:height: 100px
:align: center
The background (bright) dilates around the black regions of the letter.
Erosion
^^^^^^^^
* This operation is the sister of dilation. What this does is to compute a local minimum over the area of the kernel.
* As the kernel :math:`B` is scanned over the image, we compute the minimal pixel value overlapped by :math:`B` and replace the image pixel under the anchor point with that minimal value.
* Analogously to the example for dilation, we can apply the erosion operator to the original image (shown above). You can see in the result below that the bright areas of the image (the background, apparently) get thinner, whereas the dark zones (the "writing") get bigger.
.. image:: images/Morphology_1_Tutorial_Theory_Erosion.png
:alt: Erosion result - Theory example
:height: 100px
:align: center
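Both operations described above reduce to a sliding maximum or minimum. The following is a minimal sketch on a plain grayscale grid, not OpenCV's implementation: the name ``morph_square`` is hypothetical, the kernel is assumed to be a square anchored at its center, and pixels outside the image are simply skipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::vector< std::vector<std::uint8_t> > Gray;   // rows of 8-bit pixels

// Hypothetical helper: replaces each pixel by the maximum (dilation) or minimum
// (erosion) over a (2n+1) x (2n+1) square centred on it. Pixels outside the
// image are skipped, which is an assumption of this sketch.
Gray morph_square( const Gray& src, int n, bool dilate )
{
    assert( n >= 0 && !src.empty() );
    const int rows = static_cast<int>( src.size() );
    const int cols = static_cast<int>( src[0].size() );
    Gray dst = src;

    for( int y = 0; y < rows; y++ )
        for( int x = 0; x < cols; x++ )
        {
            std::uint8_t best = src[y][x];
            for( int dy = -n; dy <= n; dy++ )
                for( int dx = -n; dx <= n; dx++ )
                {
                    int yy = y + dy, xx = x + dx;
                    if( yy < 0 || yy >= rows || xx < 0 || xx >= cols ) continue;
                    best = dilate ? std::max( best, src[yy][xx] )
                                  : std::min( best, src[yy][xx] );
                }
            dst[y][x] = best;   // written to a copy so later pixels still read the source
        }
    return dst;
}
```

Applied to a single bright pixel on a dark background, dilation with n = 1 grows it into a 3 x 3 bright block, while erosion removes it entirely.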
Code
======
The code for this tutorial is shown below. You can also download it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/Image_Processing/Morphology_1.cpp>`_
.. code-block:: cpp
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "highgui.h"
#include <stdlib.h>
#include <stdio.h>
using namespace cv;
/// Global variables
Mat src, erosion_dst, dilation_dst;
int erosion_elem = 0;
int erosion_size = 0;
int dilation_elem = 0;
int dilation_size = 0;
int const max_elem = 2;
int const max_kernel_size = 21;
/** Function Headers */
void Erosion( int, void* );
void Dilation( int, void* );
/** @function main */
int main( int argc, char** argv )
{
/// Load an image
src = imread( argv[1] );
if( !src.data )
{ return -1; }
/// Create windows
namedWindow( "Erosion Demo", CV_WINDOW_AUTOSIZE );
namedWindow( "Dilation Demo", CV_WINDOW_AUTOSIZE );
cvMoveWindow( "Dilation Demo", src.cols, 0 );
/// Create Erosion Trackbar
createTrackbar( "Element:\n 0: Rect \n 1: Cross \n 2: Ellipse", "Erosion Demo",
&erosion_elem, max_elem,
Erosion );
createTrackbar( "Kernel size:\n 2n +1", "Erosion Demo",
&erosion_size, max_kernel_size,
Erosion );
/// Create Dilation Trackbar
createTrackbar( "Element:\n 0: Rect \n 1: Cross \n 2: Ellipse", "Dilation Demo",
&dilation_elem, max_elem,
Dilation );
createTrackbar( "Kernel size:\n 2n +1", "Dilation Demo",
&dilation_size, max_kernel_size,
Dilation );
/// Default start
Erosion( 0, 0 );
Dilation( 0, 0 );
waitKey(0);
return 0;
}
/** @function Erosion */
void Erosion( int, void* )
{
int erosion_type;
if( erosion_elem == 0 ){ erosion_type = MORPH_RECT; }
else if( erosion_elem == 1 ){ erosion_type = MORPH_CROSS; }
else if( erosion_elem == 2) { erosion_type = MORPH_ELLIPSE; }
Mat element = getStructuringElement( erosion_type,
Size( 2*erosion_size + 1, 2*erosion_size+1 ),
Point( erosion_size, erosion_size ) );
/// Apply the erosion operation
erode( src, erosion_dst, element );
imshow( "Erosion Demo", erosion_dst );
}
/** @function Dilation */
void Dilation( int, void* )
{
int dilation_type;
if( dilation_elem == 0 ){ dilation_type = MORPH_RECT; }
else if( dilation_elem == 1 ){ dilation_type = MORPH_CROSS; }
else if( dilation_elem == 2) { dilation_type = MORPH_ELLIPSE; }
Mat element = getStructuringElement( dilation_type,
Size( 2*dilation_size + 1, 2*dilation_size+1 ),
Point( dilation_size, dilation_size ) );
/// Apply the dilation operation
dilate( src, dilation_dst, element );
imshow( "Dilation Demo", dilation_dst );
}
Explanation
=============
#. Most of the stuff shown is known by you (if you have any doubt, please refer to the tutorials in previous sections). Let's check the general structure of the program:
* Load an image (can be RGB or grayscale)
* Create two windows (one for dilation output, the other for erosion)
* Create a set of two Trackbars for each operation:
* The first trackbar "Element" returns either **erosion_elem** or **dilation_elem**
* The second trackbar "Kernel size" returns **erosion_size** or **dilation_size** for the corresponding operation.
* Every time we move any slider, the user's function **Erosion** or **Dilation** will be called and it will update the output image based on the current trackbar values.
Let's analyze these two functions:
#. **erosion:**
.. code-block:: cpp
/** @function Erosion */
void Erosion( int, void* )
{
int erosion_type;
if( erosion_elem == 0 ){ erosion_type = MORPH_RECT; }
else if( erosion_elem == 1 ){ erosion_type = MORPH_CROSS; }
else if( erosion_elem == 2) { erosion_type = MORPH_ELLIPSE; }
Mat element = getStructuringElement( erosion_type,
Size( 2*erosion_size + 1, 2*erosion_size+1 ),
Point( erosion_size, erosion_size ) );
/// Apply the erosion operation
erode( src, erosion_dst, element );
imshow( "Erosion Demo", erosion_dst );
}
* The function that performs the *erosion* operation is :erode:`erode <>`. As we can see, it receives three arguments:
* *src*: The source image
* *erosion_dst*: The output image
* *element*: This is the kernel we will use to perform the operation. If we do not specify, the default is a simple :math:`3 \times 3` matrix. Otherwise, we can specify its shape. For this, we need to use the function :get_structuring_element:`getStructuringElement <>`:
.. code-block:: cpp
Mat element = getStructuringElement( erosion_type,
Size( 2*erosion_size + 1, 2*erosion_size+1 ),
Point( erosion_size, erosion_size ) );
We can choose any of three shapes for our kernel:
* Rectangular box: MORPH_RECT
* Cross: MORPH_CROSS
* Ellipse: MORPH_ELLIPSE
Then, we just have to specify the size of our kernel and the *anchor point*. If not specified, it is assumed to be in the center.
* That is all. We are ready to perform the erosion of our image.
.. note::
Additionally, there is another parameter that allows you to perform multiple erosions (iterations) at once. We are not using it in this simple tutorial, though. You can check out the Reference for more details.
#. **dilation:**
The code is below. As you can see, it is very similar to the snippet of code for **erosion**. Here we also have the option of defining our kernel, its anchor point and the size of the operator to be used.
.. code-block:: cpp
/** @function Dilation */
void Dilation( int, void* )
{
int dilation_type;
if( dilation_elem == 0 ){ dilation_type = MORPH_RECT; }
else if( dilation_elem == 1 ){ dilation_type = MORPH_CROSS; }
else if( dilation_elem == 2) { dilation_type = MORPH_ELLIPSE; }
Mat element = getStructuringElement( dilation_type,
Size( 2*dilation_size + 1, 2*dilation_size+1 ),
Point( dilation_size, dilation_size ) );
/// Apply the dilation operation
dilate( src, dilation_dst, element );
imshow( "Dilation Demo", dilation_dst );
}
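To make the two operations more concrete, here is a minimal sketch of what an erosion or dilation with a rectangular kernel computes: the minimum (erosion) or maximum (dilation) of the pixels under the kernel. It does not use OpenCV; the ``Image`` alias and the function ``morphRect`` are hypothetical names introduced only for this illustration, and skipping the neighbours that fall outside the image is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Image = std::vector<std::vector<int>>;  // hypothetical grayscale image, [row][column]

// Apply a (2*size+1) x (2*size+1) rectangular kernel anchored at its centre.
// takeMax == false : erosion  (local minimum)
// takeMax == true  : dilation (local maximum)
// Neighbours that fall outside the image are skipped (assumption of this sketch).
Image morphRect(const Image& src, int size, bool takeMax)
{
    assert(size >= 0 && !src.empty());
    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    Image dst(rows, std::vector<int>(cols, 0));

    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
        {
            int best = src[y][x];
            for (int dy = -size; dy <= size; ++dy)
                for (int dx = -size; dx <= size; ++dx)
                {
                    const int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= rows || xx < 0 || xx >= cols)
                        continue;  // outside the image: ignore
                    best = takeMax ? std::max(best, src[yy][xx])
                                   : std::min(best, src[yy][xx]);
                }
            dst[y][x] = best;
        }
    return dst;
}
```

With a single bright pixel on a black background, the dilation grows it into a bright square the size of the kernel, while the erosion removes it entirely.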
Results
========
* Compile the code above and execute it with an image as argument. For instance, using this image:
.. image:: images/Morphology_1_Tutorial_Original_Image.png
:alt: Original image
:height: 200px
:align: center
We get the results below. Varying the indices in the Trackbars gives different output images, naturally. Try them out! You can even try to add a third Trackbar to control the number of iterations.
.. image:: images/Morphology_1_Tutorial_Cover.png
:alt: Dilation and Erosion application
:height: 400px
:align: center
.. _canny_detector:
Canny Edge Detector
********************
Goal
=====
In this tutorial you will learn how to:
a. Use the OpenCV function :canny:`Canny <>` to implement the Canny Edge Detector.
Theory
=======
#. The *Canny Edge detector* was developed by John F. Canny in 1986. Also known to many as the *optimal detector*, the Canny algorithm aims to satisfy three main criteria:
* **Low error rate:** Meaning a good detection of only existent edges.
* **Good localization:** The distance between edge pixels detected and real edge pixels has to be minimized.
* **Minimal response:** Only one detector response per edge.
Steps
------
#. Filter out any noise. The Gaussian filter is used for this purpose. An example of a Gaussian kernel of :math:`size = 5` that might be used is shown below:
.. math::
K = \dfrac{1}{159}\begin{bmatrix}
2 & 4 & 5 & 4 & 2 \\
4 & 9 & 12 & 9 & 4 \\
5 & 12 & 15 & 12 & 5 \\
4 & 9 & 12 & 9 & 4 \\
2 & 4 & 5 & 4 & 2
\end{bmatrix}
#. Find the intensity gradient of the image. For this, we follow a procedure analogous to Sobel:
a. Apply a pair of convolution masks (in :math:`x` and :math:`y` directions):
.. math::
G_{x} = \begin{bmatrix}
-1 & 0 & +1 \\
-2 & 0 & +2 \\
-1 & 0 & +1
\end{bmatrix}
G_{y} = \begin{bmatrix}
-1 & -2 & -1 \\
0 & 0 & 0 \\
+1 & +2 & +1
\end{bmatrix}
b. Find the gradient strength and direction with:
.. math::
\begin{array}{l}
G = \sqrt{ G_{x}^{2} + G_{y}^{2} } \\
\theta = \arctan(\dfrac{ G_{y} }{ G_{x} })
\end{array}
The direction is rounded to one of four possible angles (namely 0, 45, 90 or 135 degrees).
#. *Non-maximum* suppression is applied. This removes pixels that are not considered to be part of an edge. Hence, only thin lines (candidate edges) will remain.
#. *Hysteresis*: The final step. Canny uses two thresholds (upper and lower):
a. If a pixel gradient is higher than the *upper* threshold, the pixel is accepted as an edge
b. If a pixel gradient value is below the *lower* threshold, then it is rejected.
c. If the pixel gradient is between the two thresholds, then it will be accepted only if it is connected to a pixel that is above the *upper* threshold.
Canny recommended an *upper*:*lower* ratio between 2:1 and 3:1.
#. For more details, you can always consult your favorite Computer Vision book.
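Two of the steps above, the gradient computation and the hysteresis rule, can be sketched in a few lines. This is only an illustration and not how OpenCV implements them: the names ``gradientMagnitude``, ``quantizedDirection``, ``EdgeLabel`` and ``classify`` are hypothetical, and ``std::atan2`` is used instead of a plain arctangent of the quotient so that :math:`G_{x} = 0` needs no special case.

```cpp
#include <cassert>
#include <cmath>

// Gradient strength G = sqrt(Gx^2 + Gy^2).
double gradientMagnitude(double gx, double gy)
{
    return std::sqrt(gx * gx + gy * gy);
}

// Gradient direction rounded to 0, 45, 90 or 135 degrees.
int quantizedDirection(double gx, double gy)
{
    const double pi = 3.14159265358979323846;
    double deg = std::atan2(gy, gx) * 180.0 / pi;  // in [-180, 180]
    if (deg < 0.0)
        deg += 180.0;                              // opposite directions are equivalent
    const int bin = static_cast<int>(std::floor((deg + 22.5) / 45.0)) % 4;
    return bin * 45;
}

enum class EdgeLabel { Rejected, Weak, Strong };

// Hysteresis rule for a single pixel gradient value g.
EdgeLabel classify(double g, double lower, double upper)
{
    assert(lower <= upper);
    if (g > upper) return EdgeLabel::Strong;    // accepted as an edge
    if (g < lower) return EdgeLabel::Rejected;  // discarded
    return EdgeLabel::Weak;  // kept only if connected to a Strong pixel
}
```

The connectivity check for ``Weak`` pixels is left out to keep the sketch short; a full implementation would follow chains of weak pixels starting from the strong ones.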
Code
=====
#. **What does this program do?**
* Asks the user to enter a numerical value to set the lower threshold for our *Canny Edge Detector* (by means of a Trackbar)
* Applies the *Canny Detector* and generates a **mask** (bright lines representing the edges on a black background).
* Applies the mask obtained to the original image and displays it in a window.
#. The tutorial code is shown below. You can also download it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/ImgTrans/CannyDetector_Demo.cpp>`_
.. code-block:: cpp
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <stdlib.h>
#include <stdio.h>
using namespace cv;
/// Global variables
Mat src, src_gray;
Mat dst, detected_edges;
int edgeThresh = 1;
int lowThreshold;
int const max_lowThreshold = 100;
int ratio = 3;
int kernel_size = 3;
char* window_name = "Edge Map";
/**
* @function CannyThreshold
* @brief Trackbar callback - Canny thresholds input with a ratio 1:3
*/
void CannyThreshold(int, void*)
{
/// Reduce noise with a kernel 3x3
blur( src_gray, detected_edges, Size(3,3) );
/// Canny detector
Canny( detected_edges, detected_edges, lowThreshold, lowThreshold*ratio, kernel_size );
/// Using Canny's output as a mask, we display our result
dst = Scalar::all(0);
src.copyTo( dst, detected_edges);
imshow( window_name, dst );
}
/** @function main */
int main( int argc, char** argv )
{
/// Load an image
src = imread( argv[1] );
if( !src.data )
{ return -1; }
/// Create a matrix of the same type and size as src (for dst)
dst.create( src.size(), src.type() );
/// Convert the image to grayscale
cvtColor( src, src_gray, CV_BGR2GRAY );
/// Create a window
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
/// Create a Trackbar for user to enter threshold
createTrackbar( "Min Threshold:", window_name, &lowThreshold, max_lowThreshold, CannyThreshold );
/// Show the image
CannyThreshold(0, 0);
/// Wait until user exit program by pressing a key
waitKey(0);
return 0;
}
Explanation
============
#. Create some needed variables:
.. code-block:: cpp
Mat src, src_gray;
Mat dst, detected_edges;
int edgeThresh = 1;
int lowThreshold;
int const max_lowThreshold = 100;
int ratio = 3;
int kernel_size = 3;
char* window_name = "Edge Map";
Note the following:
a. We establish an upper:lower threshold ratio of 3:1 (with the variable *ratio*)
b. We set the kernel size of :math:`3` (for the Sobel operations to be performed internally by the Canny function)
c. We set a maximum value for the lower Threshold of :math:`100`.
#. Load the source image:
.. code-block:: cpp
/// Load an image
src = imread( argv[1] );
if( !src.data )
{ return -1; }
#. Create a matrix of the same type and size of *src* (to be *dst*)
.. code-block:: cpp
dst.create( src.size(), src.type() );
#. Convert the image to grayscale (using the function :cvt_color:`cvtColor <>`):
.. code-block:: cpp
cvtColor( src, src_gray, CV_BGR2GRAY );
#. Create a window to display the results
.. code-block:: cpp
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
#. Create a Trackbar for the user to enter the lower threshold for our Canny detector:
.. code-block:: cpp
createTrackbar( "Min Threshold:", window_name, &lowThreshold, max_lowThreshold, CannyThreshold );
Observe the following:
a. The variable to be controlled by the Trackbar is *lowThreshold* with a limit of *max_lowThreshold* (which we set to 100 previously)
b. Each time the Trackbar registers an action, the callback function *CannyThreshold* will be invoked.
#. Let's check the *CannyThreshold* function, step by step:
a. First, we blur the image with a filter of kernel size 3:
.. code-block:: cpp
blur( src_gray, detected_edges, Size(3,3) );
b. Second, we apply the OpenCV function :canny:`Canny <>`:
.. code-block:: cpp
Canny( detected_edges, detected_edges, lowThreshold, lowThreshold*ratio, kernel_size );
where the arguments are:
* *detected_edges*: Source image, grayscale
* *detected_edges*: Output of the detector (can be the same as the input)
* *lowThreshold*: The value entered by the user moving the Trackbar
* *highThreshold*: Set in the program as three times the lower threshold (following Canny's recommendation)
* *kernel_size*: We defined it to be 3 (the size of the Sobel kernel to be used internally)
#. We fill a *dst* image with zeros (meaning the image is completely black).
.. code-block:: cpp
dst = Scalar::all(0);
#. Finally, we will use the function :copy_to:`copyTo <>` to map only the areas of the image that are identified as edges (on a black background).
.. code-block:: cpp
src.copyTo( dst, detected_edges);
:copy_to:`copyTo <>` copies the *src* image onto *dst*. However, it only copies the pixels at locations where the mask (here *detected_edges*) has non-zero values. Since the output of the Canny detector is the edge contours on a black background, the resulting *dst* will be black everywhere except at the detected edges.
#. We display our result:
.. code-block:: cpp
imshow( window_name, dst );
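The masked copy used in the callback can be pictured with plain arrays. The following sketch is an illustration under simplifying assumptions (single-channel pixels stored in a flat vector); ``maskedCopy`` is a hypothetical helper, not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy src into dst only where mask is non-zero; other dst pixels are left untouched.
void maskedCopy(const std::vector<int>& src,
                const std::vector<unsigned char>& mask,
                std::vector<int>& dst)
{
    assert(src.size() == mask.size() && src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        if (mask[i] != 0)
            dst[i] = src[i];
}
```

Because *dst* is filled with zeros beforehand, every pixel that is not on an edge stays black.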
Result
=======
#. After compiling the code above, we can run it giving as argument the path to an image. For example, using as an input the following image:
.. image:: images/Canny_Detector_Tutorial_Original_Image.jpg
:alt: Original test image
:width: 200pt
:align: center
and moving the slider, trying different thresholds, we obtain the following result:
.. image:: images/Canny_Detector_Tutorial_Result.jpg
:alt: Result after running Canny
:width: 200pt
:align: center
Notice how the image is superimposed on the black background in the edge regions.
.. _copyMakeBorder:
Adding borders to your images
******************************
Goal
=====
In this tutorial you will learn how to:
#. Use the OpenCV function :copy_make_border:`copyMakeBorder <>` to set the borders (extra padding to your image).
Theory
============
.. note::
The explanation below belongs to the book **Learning OpenCV** by Bradski and Kaehler.
#. In our previous tutorial we learned to use convolution to operate on images. One problem that naturally arises is how to handle the boundaries. How can we convolve them if the evaluated points are at the edge of the image?
#. What most OpenCV functions do is copy the given image onto a slightly larger image and then automatically pad the boundary (by any of the methods explained in the sample code just below). This way, the convolution can be performed over the needed pixels without problems (the extra padding is cut after the operation is done).
#. In this tutorial, we will briefly explore two ways of defining the extra padding (border) for an image:
a. **BORDER_CONSTANT**: Pad the image with a constant value (e.g. black or :math:`0`).
b. **BORDER_REPLICATE**: The row or column at the very edge of the original is replicated to the extra border.
This will be seen more clearly in the Code section.
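As a rough picture of the two border types, the sketch below pads a single image row. It is a simplified illustration that does not call OpenCV; ``padRow`` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Pad a 1-D row with `border` extra pixels on each side.
// replicate == true  : repeat the nearest edge pixel (in the spirit of BORDER_REPLICATE)
// replicate == false : fill with `value`             (in the spirit of BORDER_CONSTANT)
std::vector<int> padRow(const std::vector<int>& row, int border,
                        bool replicate, int value = 0)
{
    assert(!row.empty() && border >= 0);
    const int n = static_cast<int>(row.size());
    std::vector<int> out;
    out.reserve(n + 2 * border);
    for (int i = -border; i < n + border; ++i)
    {
        if (i >= 0 && i < n)
            out.push_back(row[i]);                       // original pixel
        else if (replicate)
            out.push_back(i < 0 ? row[0] : row[n - 1]);  // nearest edge pixel
        else
            out.push_back(value);                        // constant fill
    }
    return out;
}
```

For the row ``1 2 3`` and a border of 2, replication gives ``1 1 1 2 3 3 3`` while a constant fill of 9 gives ``9 9 1 2 3 9 9``.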
Code
======
#. **What does this program do?**
* Load an image
* Let the user choose what kind of padding to use on the input image. There are two options:
#. *Constant value border*: Applies a padding of a constant value for the whole border. This value will be updated randomly every 0.5 seconds.
#. *Replicated border*: The border will be replicated from the pixel values at the edges of the original image.
The user chooses either option by pressing 'c' (constant) or 'r' (replicate)
* The program finishes when the user presses 'ESC'
#. The tutorial code is shown below. You can also download it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/ImgTrans/copyMakeBorder_demo.cpp>`_
.. code-block:: cpp
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <stdlib.h>
#include <stdio.h>
using namespace cv;
/// Global Variables
Mat src, dst;
int top, bottom, left, right;
int borderType;
Scalar value;
char* window_name = "copyMakeBorder Demo";
RNG rng(12345);
/** @function main */
int main( int argc, char** argv )
{
int c;
/// Load an image
src = imread( argv[1] );
if( !src.data )
{ printf(" No data entered, please enter the path to an image file \n");
return -1;
}
/// Brief how-to for this program
printf( "\n \t copyMakeBorder Demo: \n" );
printf( "\t -------------------- \n" );
printf( " ** Press 'c' to set the border to a random constant value \n");
printf( " ** Press 'r' to set the border to be replicated \n");
printf( " ** Press 'ESC' to exit the program \n");
/// Create window
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
/// Initialize arguments for the filter
top = (int) (0.05*src.rows); bottom = (int) (0.05*src.rows);
left = (int) (0.05*src.cols); right = (int) (0.05*src.cols);
dst = src;
imshow( window_name, dst );
while( true )
{
c = waitKey(500);
if( (char)c == 27 )
{ break; }
else if( (char)c == 'c' )
{ borderType = BORDER_CONSTANT; }
else if( (char)c == 'r' )
{ borderType = BORDER_REPLICATE; }
value = Scalar( rng.uniform(0, 255), rng.uniform(0, 255), rng.uniform(0, 255) );
copyMakeBorder( src, dst, top, bottom, left, right, borderType, value );
imshow( window_name, dst );
}
return 0;
}
Explanation
=============
#. First we declare the variables we are going to use:
.. code-block:: cpp
Mat src, dst;
int top, bottom, left, right;
int borderType;
Scalar value;
char* window_name = "copyMakeBorder Demo";
RNG rng(12345);
The variable *rng*, a random number generator, deserves special attention. We use it to generate the random border color, as we will see soon.
#. As usual we load our source image *src*:
.. code-block:: cpp
src = imread( argv[1] );
if( !src.data )
{ printf(" No data entered, please enter the path to an image file \n");
return -1;
}
#. After giving a short intro of how to use the program, we create a window:
.. code-block:: cpp
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
#. Now we initialize the arguments that define the size of the borders (*top*, *bottom*, *left* and *right*). We give them a value of 5% of the size of *src*.
.. code-block:: cpp
top = (int) (0.05*src.rows); bottom = (int) (0.05*src.rows);
left = (int) (0.05*src.cols); right = (int) (0.05*src.cols);
#. The program begins a *while* loop. If the user presses 'c' or 'r', the *borderType* variable takes the value of *BORDER_CONSTANT* or *BORDER_REPLICATE* respectively:
.. code-block:: cpp
while( true )
{
c = waitKey(500);
if( (char)c == 27 )
{ break; }
else if( (char)c == 'c' )
{ borderType = BORDER_CONSTANT; }
else if( (char)c == 'r' )
{ borderType = BORDER_REPLICATE; }
#. In each iteration (after 0.5 seconds), the variable *value* is updated...
.. code-block:: cpp
value = Scalar( rng.uniform(0, 255), rng.uniform(0, 255), rng.uniform(0, 255) );
with a random value generated by the **RNG** variable *rng*. Each channel is an integer picked randomly in the range :math:`[0,255)` (the upper bound is excluded).
#. Finally, we call the function :copy_make_border:`copyMakeBorder <>` to apply the respective padding:
.. code-block:: cpp
copyMakeBorder( src, dst, top, bottom, left, right, borderType, value );
The arguments are:
a. *src*: Source image
#. *dst*: Destination image
#. *top*, *bottom*, *left*, *right*: Length in pixels of the borders at each side of the image. We define them as being 5% of the original size of the image.
#. *borderType*: Define what type of border is applied. It can be constant or replicate for this example.
#. *value*: If *borderType* is *BORDER_CONSTANT*, this is the value used to fill the border pixels.
#. We display our output image in the window created previously
.. code-block:: cpp
imshow( window_name, dst );
Results
========
#. After compiling the code above, you can execute it giving as argument the path of an image. The result should be:
* By default, it begins with the border set to BORDER_CONSTANT. Hence, a succession of random colored borders will be shown.
* If you press 'r', the border will become a replica of the edge pixels.
* If you press 'c', the random colored borders will appear again
* If you press 'ESC' the program will exit.
Below are some screenshots showing how the border changes color and how the *BORDER_REPLICATE* option looks:
.. image:: images/CopyMakeBorder_Tutorial_Results.jpg
:alt: Final result after copyMakeBorder application
:width: 750pt
:align: center
.. _filter_2d:
Making your own linear filters!
********************************
Goal
=====
In this tutorial you will learn how to:
* Use the OpenCV function :filter2d:`filter2D <>` to create your own linear filters.
Theory
============
.. note::
The explanation below belongs to the book **Learning OpenCV** by Bradski and Kaehler.
Convolution
------------
In a very general sense, convolution is an operation between every part of an image and an operator (kernel).
What is a kernel?
------------------
A kernel is essentially a fixed-size array of numerical coefficients along with an *anchor point* in that array, which is typically located at the center.
.. image:: images/filter_2d_tutorial_kernel_theory.png
:alt: kernel example
:align: center
How does convolution with a kernel work?
-----------------------------------------
Assume you want to know the resulting value of a particular location in the image. The value of the convolution is calculated in the following way:
#. Place the kernel anchor on top of a determined pixel, with the rest of the kernel overlaying the corresponding local pixels in the image.
#. Multiply the kernel coefficients by the corresponding image pixel values and sum the result.
#. Place the result at the location of the *anchor* in the output image.
#. Repeat the process for all pixels by scanning the kernel over the entire image.
Expressing the procedure above in the form of an equation we would have:
.. math::
H(x,y) = \sum_{i=0}^{M_{i} - 1} \sum_{j=0}^{M_{j}-1} I(x+i - a_{i}, y + j - a_{j})K(i,j)
Fortunately, OpenCV provides you with the function :filter2d:`filter2D <>` so you do not have to code all these operations.
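For illustration only, the equation above can be written out directly as nested loops. The sketch below is not the OpenCV implementation: ``Grid`` and ``applyKernel`` are hypothetical names, the kernel is applied exactly as in the equation (without flipping it), and pixels outside the image are simply treated as zero, which is an assumption of this sketch rather than the border handling OpenCV uses by default.

```cpp
#include <cassert>
#include <vector>

using Grid = std::vector<std::vector<double>>;  // hypothetical [row][column] grid

// H(x, y) = sum_i sum_j I(x + i - a_i, y + j - a_j) * K(i, j)
// i runs over kernel columns and j over kernel rows; (anchorX, anchorY)
// plays the role of (a_i, a_j). Pixels outside the image count as zero.
Grid applyKernel(const Grid& image, const Grid& kernel, int anchorX, int anchorY)
{
    assert(!image.empty() && !kernel.empty());
    const int rows = static_cast<int>(image.size());
    const int cols = static_cast<int>(image[0].size());
    const int kRows = static_cast<int>(kernel.size());
    const int kCols = static_cast<int>(kernel[0].size());
    Grid out(rows, std::vector<double>(cols, 0.0));

    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
        {
            double sum = 0.0;
            for (int j = 0; j < kRows; ++j)
                for (int i = 0; i < kCols; ++i)
                {
                    const int xx = x + i - anchorX;
                    const int yy = y + j - anchorY;
                    if (xx >= 0 && xx < cols && yy >= 0 && yy < rows)
                        sum += image[yy][xx] * kernel[j][i];
                }
            out[y][x] = sum;  // result stored at the anchor position
        }
    return out;
}
```

With a :math:`3 \times 3` kernel of ones anchored at its center ``(1, 1)`` and an image of ones, an interior pixel sums nine values while a corner pixel only sees four, which shows why border handling matters.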
Code
======
#. **What does this program do?**
* Loads an image
* Performs a *normalized box filter*. For instance, for a kernel of size :math:`size = 3`, the kernel would be:
.. math::
K = \dfrac{1}{3 \cdot 3} \begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{bmatrix}
The program will perform the filter operation with kernels of sizes 3, 5, 7, 9 and 11.
* The filter output (with each kernel) will be shown for 500 milliseconds
#. The tutorial code is shown below. You can also download it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/ImgTrans/filter2D_demo.cpp>`_
.. code-block:: cpp
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <stdlib.h>
#include <stdio.h>
using namespace cv;
/** @function main */
int main ( int argc, char** argv )
{
/// Declare variables
Mat src, dst;
Mat kernel;
Point anchor;
double delta;
int ddepth;
int kernel_size;
char* window_name = "filter2D Demo";
int c;
/// Load an image
src = imread( argv[1] );
if( !src.data )
{ return -1; }
/// Create window
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
/// Initialize arguments for the filter
anchor = Point( -1, -1 );
delta = 0;
ddepth = -1;
/// Loop - Will filter the image with different kernel sizes each 0.5 seconds
int ind = 0;
while( true )
{
c = waitKey(500);
/// Press 'ESC' to exit the program
if( (char)c == 27 )
{ break; }
/// Update kernel size for a normalized box filter
kernel_size = 3 + 2*( ind%5 );
kernel = Mat::ones( kernel_size, kernel_size, CV_32F )/ (float)(kernel_size*kernel_size);
/// Apply filter
filter2D(src, dst, ddepth , kernel, anchor, delta, BORDER_DEFAULT );
imshow( window_name, dst );
ind++;
}
return 0;
}
Explanation
=============
#. Load an image
.. code-block:: cpp
src = imread( argv[1] );
if( !src.data )
{ return -1; }
#. Create a window to display the result
.. code-block:: cpp
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
#. Initialize the arguments for the linear filter
.. code-block:: cpp
anchor = Point( -1, -1 );
delta = 0;
ddepth = -1;
#. Perform an infinite loop updating the kernel size and applying our linear filter to the input image. Let's analyze that in more detail:
#. First we define the kernel our filter is going to use. Here it is:
.. code-block:: cpp
kernel_size = 3 + 2*( ind%5 );
kernel = Mat::ones( kernel_size, kernel_size, CV_32F )/ (float)(kernel_size*kernel_size);
The first line updates *kernel_size* to odd values in the range :math:`[3,11]`. The second line actually builds the kernel by setting its value to a matrix filled with ones and normalizing it by dividing it by the number of elements.
#. After setting the kernel, we can generate the filter by using the function :filter2d:`filter2D <>`:
.. code-block:: cpp
filter2D(src, dst, ddepth , kernel, anchor, delta, BORDER_DEFAULT );
The arguments denote:
a. *src*: Source image
#. *dst*: Destination image
#. *ddepth*: The depth of *dst*. A negative value (such as :math:`-1`) indicates that the depth is the same as the source.
#. *kernel*: The kernel to be scanned through the image
#. *anchor*: The position of the anchor relative to its kernel. The location *Point(-1, -1)* indicates the center by default.
#. *delta*: A value to be added to each pixel during the convolution. By default it is :math:`0`
#. *BORDER_DEFAULT*: We leave this value as the default (more details in the following tutorial)
#. Our program runs a *while* loop; every 500 ms the kernel size of our filter is updated within the range indicated.
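The kernel update in the loop can be checked in isolation. The sketch below mirrors the two lines of the tutorial with plain C++; ``kernelSizeFor`` and ``boxKernel`` are hypothetical helper names, and the kernel is stored as a flat vector instead of a ``Mat``.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kernel size used at iteration `ind`: 3, 5, 7, 9, 11, 3, 5, ...
int kernelSizeFor(int ind)
{
    assert(ind >= 0);
    return 3 + 2 * (ind % 5);
}

// Normalized box kernel: size*size coefficients, each equal to 1/(size*size),
// so that the coefficients sum to 1 and the image brightness is preserved.
std::vector<float> boxKernel(int size)
{
    assert(size > 0);
    return std::vector<float>(static_cast<std::size_t>(size) * size,
                              1.0f / static_cast<float>(size * size));
}
```

After the fifth iteration the size wraps around from 11 back to 3, which is why the blur in the demo appears to grow and then reset.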
Results
========
#. After compiling the code above, you can execute it giving as argument the path of an image. The result should be a window that shows an image blurred by a normalized filter. Every 0.5 seconds the kernel size should change, as can be seen in the series of snapshots below:
.. image:: images/filter_2d_tutorial_result.png
:alt: kernel example
:align: center
.. _hough_circle:
Hough Circle Transform
***********************
Goal
=====
In this tutorial you will learn how to:
* Use the OpenCV function :hough_circles:`HoughCircles <>` to detect circles in an image.
Theory
=======
Hough Circle Transform
------------------------
* The Hough Circle Transform works in a *roughly* analogous way to the Hough Line Transform explained in the previous tutorial.
* In the line detection case, a line was defined by two parameters :math:`(r, \theta)`. In the circle case, we need three parameters to define a circle:
.. math::
C : ( x_{center}, y_{center}, r )
where :math:`(x_{center}, y_{center})` defines the center position (green point) and :math:`r` is the radius, which allows us to completely define a circle, as can be seen below:
.. image:: images/Hough_Circle_Tutorial_Theory_0.jpg
:alt: Result of detecting circles with Hough Transform
:height: 200pt
:align: center
* For the sake of efficiency, OpenCV implements a detection method slightly trickier than the standard Hough Transform: *The Hough gradient method*. For more details, please check the book *Learning OpenCV* or your favorite Computer Vision bibliography.
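To give an idea of the voting principle, here is a minimal sketch of the *standard* Hough approach for circles of one known radius. It is deliberately not the Hough gradient method that OpenCV uses, and it searches only over the center :math:`(x_{center}, y_{center})` rather than all three parameters. The names ``Pixel`` and ``bestCenter`` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <set>
#include <utility>
#include <vector>

using Pixel = std::pair<int, int>;  // (x, y)

// Every edge pixel votes for all centres that lie `radius` away from it;
// the centre that collects the most votes wins. Each edge pixel votes at
// most once per candidate centre.
Pixel bestCenter(const std::vector<Pixel>& edges, int radius)
{
    assert(radius > 0 && !edges.empty());
    const double pi = 3.14159265358979323846;
    std::map<Pixel, int> accumulator;

    for (const Pixel& p : edges)
    {
        std::set<Pixel> candidates;  // de-duplicates this pixel's votes
        for (int deg = 0; deg < 360; ++deg)
        {
            const double t = deg * pi / 180.0;
            const int cx = p.first - static_cast<int>(std::lround(radius * std::cos(t)));
            const int cy = p.second - static_cast<int>(std::lround(radius * std::sin(t)));
            candidates.insert(Pixel(cx, cy));
        }
        for (const Pixel& c : candidates)
            ++accumulator[c];
    }

    // Pick the accumulator cell with the highest vote count.
    Pixel best = accumulator.begin()->first;
    int bestVotes = 0;
    for (const auto& entry : accumulator)
        if (entry.second > bestVotes)
        {
            best = entry.first;
            bestVotes = entry.second;
        }
    return best;
}
```

Extending this to an unknown radius requires a three-dimensional accumulator, which is costly; that cost is part of the motivation for cheaper variants such as the gradient method mentioned above.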
Code
======
#. **What does this program do?**
* Loads an image and blurs it to reduce the noise
* Applies the *Hough Circle Transform* to the blurred image.
* Displays the detected circles in a window.
#. The sample code that we will explain can be downloaded from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/houghlines.cpp>`_. A slightly fancier version (which shows both Hough standard and probabilistic with trackbars for changing the threshold values) can be found `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/ImgTrans/HoughCircle_Demo.cpp>`_
.. code-block:: cpp
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
using namespace cv;
/** @function main */
int main(int argc, char** argv)
{
Mat src, src_gray;
/// Read the image
src = imread( argv[1], 1 );
if( !src.data )
{ return -1; }
/// Convert it to gray
cvtColor( src, src_gray, CV_BGR2GRAY );
/// Reduce the noise so we avoid false circle detection
GaussianBlur( src_gray, src_gray, Size(9, 9), 2, 2 );
vector<Vec3f> circles;
/// Apply the Hough Transform to find the circles
HoughCircles( src_gray, circles, CV_HOUGH_GRADIENT, 1, src_gray.rows/8, 200, 100, 0, 0 );
/// Draw the circles detected
for( size_t i = 0; i < circles.size(); i++ )
{
Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
int radius = cvRound(circles[i][2]);
// circle center
circle( src, center, 3, Scalar(0,255,0), -1, 8, 0 );
// circle outline
circle( src, center, radius, Scalar(0,0,255), 3, 8, 0 );
}
/// Show your results
namedWindow( "Hough Circle Transform Demo", CV_WINDOW_AUTOSIZE );
imshow( "Hough Circle Transform Demo", src );
waitKey(0);
return 0;
}
Explanation
============
#. Load an image
.. code-block:: cpp
src = imread( argv[1], 1 );
if( !src.data )
{ return -1; }
#. Convert it to grayscale:
.. code-block:: cpp
cvtColor( src, src_gray, CV_BGR2GRAY );
#. Apply a Gaussian blur to reduce noise and avoid false circle detection:
.. code-block:: cpp
GaussianBlur( src_gray, src_gray, Size(9, 9), 2, 2 );
#. Proceed to apply Hough Circle Transform:
.. code-block:: cpp
vector<Vec3f> circles;
HoughCircles( src_gray, circles, CV_HOUGH_GRADIENT, 1, src_gray.rows/8, 200, 100, 0, 0 );
with the arguments:
* *src_gray*: Input image (grayscale)
* *circles*: A vector that stores sets of 3 values: :math:`x_{c}, y_{c}, r` for each detected circle.
* *CV_HOUGH_GRADIENT*: Define the detection method. Currently this is the only one available in OpenCV
* *dp = 1*: The inverse ratio of resolution
* *min_dist = src_gray.rows/8*: Minimum distance between detected centers
* *param_1 = 200*: Upper threshold for the internal Canny edge detector
* *param_2 = 100*: Threshold for center detection.
* *min_radius = 0*: Minimum radius to be detected. If unknown, put zero as default.
* *max_radius = 0*: Maximum radius to be detected. If unknown, put zero as default.
#. Draw the detected circles:
.. code-block:: cpp
for( size_t i = 0; i < circles.size(); i++ )
{
Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
int radius = cvRound(circles[i][2]);
// circle center
circle( src, center, 3, Scalar(0,255,0), -1, 8, 0 );
// circle outline
circle( src, center, radius, Scalar(0,0,255), 3, 8, 0 );
}
You can see that we draw the circle(s) in red and the center(s) as a small green dot.
#. Display the detected circle(s):
.. code-block:: cpp
namedWindow( "Hough Circle Transform Demo", CV_WINDOW_AUTOSIZE );
imshow( "Hough Circle Transform Demo", src );
#. Wait for the user to exit the program
.. code-block:: cpp
waitKey(0);
Result
=======
The result of running the code above with a test image is shown below:
.. image:: images/Hough_Circle_Tutorial_Result.jpg
:alt: Result of detecting circles with Hough Transform
:align: center
.. _laplace_operator:
Laplace Operator
*****************
Goal
=====
In this tutorial you will learn how to:
a. Use the OpenCV function :laplacian:`Laplacian <>` to implement a discrete analog of the *Laplacian operator*.
Theory
=======
#. In the previous tutorial we learned how to use the *Sobel Operator*. It was based on the fact that in the edge area, the pixel intensity shows a "jump" or a high variation of intensity. Getting the first derivative of the intensity, we observed that an edge is characterized by a maximum, as it can be seen in the figure:
.. image:: images/Laplace_Operator_Tutorial_Theory_Previous.jpg
:alt: Previous theory
:height: 200pt
:align: center
#. And...what happens if we take the second derivative?
.. image:: images/Laplace_Operator_Tutorial_Theory_ddIntensity.jpg
:alt: Second derivative
:height: 200pt
:align: center
You can observe that the second derivative is zero! So, we can also use this criterion to attempt to detect edges in an image. However, note that zeros will not only appear in edges (they can actually appear in other meaningless locations); this can be solved by applying filtering where needed.
Laplacian Operator
-------------------
#. From the explanation above, we deduce that the second derivative can be used to *detect edges*. Since images are "*2D*", we would need to take the derivative in both dimensions. Here, the Laplacian operator comes in handy.
#. The *Laplacian operator* is defined by:
.. math::
Laplace(f) = \dfrac{\partial^{2} f}{\partial x^{2}} + \dfrac{\partial^{2} f}{\partial y^{2}}
#. The Laplacian operator is implemented in OpenCV by the function :laplacian:`Laplacian <>`. In fact, since the Laplacian uses the gradient of images, it internally calls the *Sobel* operator to perform its computation.
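As a small worked example of the definition above, the sum of the two second derivatives can be approximated with finite differences. This sketch uses the classic five-point stencil, which is an assumption made for illustration and not necessarily the kernel that OpenCV applies for a given kernel size; ``laplacianAt`` is a hypothetical helper.

```cpp
#include <cassert>
#include <vector>

// Discrete Laplacian at an interior pixel (x, y): second difference in x
// plus second difference in y, i.e. the stencil
//   0  1  0
//   1 -4  1
//   0  1  0
int laplacianAt(const std::vector<std::vector<int>>& f, int x, int y)
{
    assert(y > 0 && y + 1 < static_cast<int>(f.size()));
    assert(x > 0 && x + 1 < static_cast<int>(f[y].size()));
    const int d2x = f[y][x + 1] - 2 * f[y][x] + f[y][x - 1];
    const int d2y = f[y + 1][x] - 2 * f[y][x] + f[y - 1][x];
    return d2x + d2y;
}
```

For :math:`f(x, y) = x^{2} + y^{2}` the result is 4 at every interior pixel, matching the analytic Laplacian, and for a flat image it is 0.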
Code
======
#. **What does this program do?**
* Loads an image
* Removes noise by applying a Gaussian blur and then converts the original image to grayscale
* Applies a Laplacian operator to the grayscale image and stores the output image
* Displays the result in a window
#. The tutorial code is shown below. You can also download it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/ImgTrans/Laplace_Demo.cpp>`_
.. code-block:: cpp
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <stdlib.h>
#include <stdio.h>
using namespace cv;
/** @function main */
int main( int argc, char** argv )
{
Mat src, src_gray, dst;
int kernel_size = 3;
int scale = 1;
int delta = 0;
int ddepth = CV_16S;
char* window_name = "Laplace Demo";
int c;
/// Load an image
src = imread( argv[1] );
if( !src.data )
{ return -1; }
/// Remove noise by blurring with a Gaussian filter
GaussianBlur( src, src, Size(3,3), 0, 0, BORDER_DEFAULT );
/// Convert the image to grayscale
cvtColor( src, src_gray, CV_BGR2GRAY );
/// Create window
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
/// Apply Laplace function
Mat abs_dst;
Laplacian( src_gray, dst, ddepth, kernel_size, scale, delta, BORDER_DEFAULT );
convertScaleAbs( dst, abs_dst );
/// Show what you got
imshow( window_name, abs_dst );
waitKey(0);
return 0;
}
Explanation
============
#. Create some needed variables:
.. code-block:: cpp
Mat src, src_gray, dst;
int kernel_size = 3;
int scale = 1;
int delta = 0;
int ddepth = CV_16S;
char* window_name = "Laplace Demo";
#. Load the source image:
.. code-block:: cpp
src = imread( argv[1] );
if( !src.data )
{ return -1; }
#. Apply a Gaussian blur to reduce noise:
.. code-block:: cpp
GaussianBlur( src, src, Size(3,3), 0, 0, BORDER_DEFAULT );
#. Convert the image to grayscale using :cvt_color:`cvtColor <>`
.. code-block:: cpp
cvtColor( src, src_gray, CV_BGR2GRAY );
#. Apply the Laplacian operator to the grayscale image:
.. code-block:: cpp
Laplacian( src_gray, dst, ddepth, kernel_size, scale, delta, BORDER_DEFAULT );
where the arguments are:
* *src_gray*: The input image.
* *dst*: Destination (output) image
* *ddepth*: Depth of the destination image. Since our input is *CV_8U* we define *ddepth* = *CV_16S* to avoid overflow
* *kernel_size*: The kernel size of the Sobel operator to be applied internally. We use 3 in this example.
* *scale*, *delta* and *BORDER_DEFAULT*: We leave them as default values.
#. Convert the output from the Laplacian operator to a *CV_8U* image:
.. code-block:: cpp
convertScaleAbs( dst, abs_dst );
#. Display the result in a window:
.. code-block:: cpp
imshow( window_name, abs_dst );
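The reason for the intermediate *CV_16S* depth and the final conversion can be seen per pixel: the Laplacian response may be negative or larger than 255, so it is stored in 16 bits and then mapped back to 8 bits. The sketch below shows the absolute-value-and-clamp step for one sample with the default scale of 1 and offset of 0; ``absToU8`` is a hypothetical helper, not an OpenCV function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Absolute value of a 16-bit signed sample, clamped to the 8-bit range [0, 255].
unsigned char absToU8(int v)
{
    assert(v >= -32768 && v <= 32767);  // the sample must fit in 16 signed bits
    const int magnitude = std::abs(v);
    return static_cast<unsigned char>(std::min(magnitude, 255));
}
```

A strong negative response such as -300 therefore ends up as a fully bright pixel (255), while a weak one such as -7 becomes 7.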
Results
========
#. After compiling the code above, we can run it giving as argument the path to an image. For example, using as an input:
.. image:: images/Laplace_Operator_Tutorial_Original_Image.jpg
:alt: Original test image
:width: 250pt
:align: center
#. We obtain the following result. Notice how the trees and the silhouette of the cow are reasonably well defined (except in areas where the intensities are very similar, e.g. around the cow's head). Also, note that the roof of the house behind the trees (right side) is strongly marked. This is due to the fact that the contrast is higher in that region.
.. image:: images/Laplace_Operator_Tutorial_Result.jpg
:alt: Original test image
:width: 250pt
:align: center
.. _sobel_derivatives:
Sobel Derivatives
******************
Goal
=====
In this tutorial you will learn how to:
#. Use the OpenCV function :sobel:`Sobel <>` to calculate the derivatives from an image.
#. Use the OpenCV function :scharr:`Scharr <>` to calculate a more accurate derivative for a kernel of size :math:`3 \times 3`
Theory
========
.. note::
The explanation below belongs to the book **Learning OpenCV** by Bradski and Kaehler.
#. In the last two tutorials we have seen practical examples of convolutions. One of the most important convolutions is the computation of derivatives in an image (or an approximation to them).
#. Why is computing the derivatives of an image important? Let's imagine we want to detect the *edges* present in the image. For instance:
.. image:: images/Sobel_Derivatives_Tutorial_Theory_0.jpg
:alt: How intensity changes in an edge
:height: 200pt
:align: center
You can easily notice that at an *edge* the pixel intensity *changes* sharply. A good way to express *changes* is by using *derivatives*: a large gradient value indicates a major change in the image.
#. To be more graphical, let's assume we have a 1D-image. An edge is shown by the "jump" in intensity in the plot below:
.. image:: images/Sobel_Derivatives_Tutorial_Theory_Intensity_Function.jpg
:alt: Intensity Plot for an edge
:height: 200pt
:align: center
#. The edge "jump" can be seen more easily if we take the first derivative (here it appears as a maximum):
.. image:: images/Sobel_Derivatives_Tutorial_Theory_dIntensity_Function.jpg
:alt: First derivative of Intensity - Plot for an edge
:height: 200pt
:align: center
#. So, from the explanation above, we can deduce that edges in an image can be detected by locating the pixels where the gradient is higher than at their neighbors (or, to generalize, higher than a threshold).
#. For a more detailed explanation, please refer to **Learning OpenCV** by Bradski and Kaehler.
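The thresholding idea from the list above can be tried on a 1D "image". This is a minimal sketch, assuming an unnormalised central difference as the derivative and zeros at the two border samples; ``firstDerivative`` and ``edgePositions`` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Unnormalised central difference of a 1D signal: d[i] = s[i+1] - s[i-1].
// The two border samples are left at 0 (an assumption made for brevity).
std::vector<int> firstDerivative(const std::vector<int>& signal)
{
    std::vector<int> d(signal.size(), 0);
    for (std::size_t i = 1; i + 1 < signal.size(); ++i)
        d[i] = signal[i + 1] - signal[i - 1];
    return d;
}

// Indices whose absolute derivative exceeds the given threshold.
std::vector<std::size_t> edgePositions(const std::vector<int>& signal, int threshold)
{
    std::vector<int> d = firstDerivative(signal);
    std::vector<std::size_t> edges;
    for (std::size_t i = 0; i < d.size(); ++i)
        if (std::abs(d[i]) > threshold)
            edges.push_back(i);
    return edges;
}
```

For a signal such as ``{10, 10, 10, 12, 60, 110, 112, 112, 112}`` the derivative peaks in the middle of the ramp, and only that sample survives a threshold of 60.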
Sobel Operator
---------------
#. The Sobel Operator is a discrete differentiation operator. It computes an approximation of the gradient of an image intensity function.
#. The Sobel Operator combines Gaussian smoothing and differentiation.
Formulation
^^^^^^^^^^^^
Assuming that the image to be operated on is :math:`I`:
#. We calculate two derivatives:
a. **Horizontal changes**: This is computed by convolving :math:`I` with a kernel of odd size. For example, for a kernel size of 3, :math:`G_{x}` would be computed as:
.. math::
G_{x} = \begin{bmatrix}
-1 & 0 & +1 \\
-2 & 0 & +2 \\
-1 & 0 & +1
\end{bmatrix} * I
b. **Vertical changes**: This is computed by convolving :math:`I` with a kernel of odd size. For example, for a kernel size of 3, :math:`G_{y}` would be computed as:
.. math::
G_{y} = \begin{bmatrix}
-1 & -2 & -1 \\
0 & 0 & 0 \\
+1 & +2 & +1
\end{bmatrix} * I
#. At each point of the image we calculate an approximation of the *gradient* at that point by combining both results above:
.. math::
G = \sqrt{ G_{x}^{2} + G_{y}^{2} }
Although sometimes the following simpler equation is used:
.. math::
G = |G_{x}| + |G_{y}|
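The formulation above can be checked by hand on a tiny image. The sketch below is an illustration only, not the OpenCV implementation: it applies the two :math:`3 \times 3` kernels as a sliding weighted sum at a single interior pixel, and the names ``Gradient``, ``sobelAt``, ``magnitudeL2`` and ``magnitudeL1`` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

struct Gradient { int gx; int gy; };

// Apply the 3x3 Sobel kernels at interior pixel (x, y) of a row-major image.
// The kernels are used as written (sliding weighted sum, no flipping).
Gradient sobelAt(const std::vector<int>& img, int width, int x, int y)
{
    static const int kx[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
    static const int ky[3][3] = { {-1, -2, -1}, {0, 0, 0}, {1, 2, 1} };
    int height = static_cast<int>(img.size()) / width;
    assert(x > 0 && y > 0 && x < width - 1 && y < height - 1);

    Gradient g = {0, 0};
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
        {
            int p = img[(y + dy) * width + (x + dx)];
            g.gx += kx[dy + 1][dx + 1] * p;
            g.gy += ky[dy + 1][dx + 1] * p;
        }
    return g;
}

// G = sqrt(Gx^2 + Gy^2)
double magnitudeL2(Gradient g) { return std::sqrt(double(g.gx) * g.gx + double(g.gy) * g.gy); }

// G = |Gx| + |Gy|
int magnitudeL1(Gradient g) { return std::abs(g.gx) + std::abs(g.gy); }
```

On a purely vertical edge both formulas agree, while on a diagonal edge the simpler sum overestimates the magnitude (600 against roughly 424 in the test image used to check this sketch).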
.. note::
When the size of the kernel is :math:`3`, the Sobel kernel shown above may produce noticeable inaccuracies (after all, Sobel is only an approximation of the derivative). OpenCV addresses this inaccuracy for kernels of size 3 with the :scharr:`Scharr <>` function, which is as fast as the standard Sobel function but more accurate. It implements the following kernels:
.. math::
G_{x} = \begin{bmatrix}
-3 & 0 & +3 \\
-10 & 0 & +10 \\
-3 & 0 & +3
\end{bmatrix}
G_{y} = \begin{bmatrix}
-3 & -10 & -3 \\
0 & 0 & 0 \\
+3 & +10 & +3
\end{bmatrix}
You can find more information about this function in the OpenCV reference (:scharr:`Scharr <>`). Also, in the sample code below, you will notice that above each call to the :sobel:`Sobel <>` function there is a commented-out call to the :scharr:`Scharr <>` function. Uncommenting it (and commenting out the Sobel call) should give you an idea of how this function works.
Code
=====
#. **What does this program do?**
* Applies the *Sobel Operator* and generates as output an image with the detected *edges* bright on a darker background.
#. The tutorial code is shown below. You can also download it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/ImgTrans/Sobel_Demo.cpp>`_
.. code-block:: cpp
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <stdlib.h>
#include <stdio.h>
using namespace cv;
/** @function main */
int main( int argc, char** argv )
{
Mat src, src_gray;
Mat grad;
const char* window_name = "Sobel Demo - Simple Edge Detector";
int scale = 1;
int delta = 0;
int ddepth = CV_16S;
/// Load an image
src = imread( argv[1] );
if( !src.data )
{ return -1; }
GaussianBlur( src, src, Size(3,3), 0, 0, BORDER_DEFAULT );
/// Convert it to gray
cvtColor( src, src_gray, CV_BGR2GRAY ); // imread returns channels in BGR order
/// Create window
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
/// Generate grad_x and grad_y
Mat grad_x, grad_y;
Mat abs_grad_x, abs_grad_y;
/// Gradient X
//Scharr( src_gray, grad_x, ddepth, 1, 0, scale, delta, BORDER_DEFAULT );
Sobel( src_gray, grad_x, ddepth, 1, 0, 3, scale, delta, BORDER_DEFAULT );
convertScaleAbs( grad_x, abs_grad_x );
/// Gradient Y
//Scharr( src_gray, grad_y, ddepth, 0, 1, scale, delta, BORDER_DEFAULT );
Sobel( src_gray, grad_y, ddepth, 0, 1, 3, scale, delta, BORDER_DEFAULT );
convertScaleAbs( grad_y, abs_grad_y );
/// Total Gradient (approximate)
addWeighted( abs_grad_x, 0.5, abs_grad_y, 0.5, 0, grad );
imshow( window_name, grad );
waitKey(0);
return 0;
}
Explanation
=============
#. First we declare the variables we are going to use:
.. code-block:: cpp
Mat src, src_gray;
Mat grad;
const char* window_name = "Sobel Demo - Simple Edge Detector";
int scale = 1;
int delta = 0;
int ddepth = CV_16S;
#. As usual we load our source image *src*:
.. code-block:: cpp
src = imread( argv[1] );
if( !src.data )
{ return -1; }
#. First, we apply a :gaussian_blur:`GaussianBlur <>` to our image to reduce the noise ( kernel size = 3 )
.. code-block:: cpp
GaussianBlur( src, src, Size(3,3), 0, 0, BORDER_DEFAULT );
#. Now we convert our filtered image to grayscale:
.. code-block:: cpp
cvtColor( src, src_gray, CV_BGR2GRAY ); // imread returns channels in BGR order
#. Next, we calculate the *derivatives* in the *x* and *y* directions. For this, we use the function :sobel:`Sobel <>` as shown below:
.. code-block:: cpp
Mat grad_x, grad_y;
Mat abs_grad_x, abs_grad_y;
/// Gradient X
Sobel( src_gray, grad_x, ddepth, 1, 0, 3, scale, delta, BORDER_DEFAULT );
/// Gradient Y
Sobel( src_gray, grad_y, ddepth, 0, 1, 3, scale, delta, BORDER_DEFAULT );
The function takes the following arguments:
* *src_gray*: In our example, the input image. Here it is *CV_8U*
* *grad_x*/*grad_y*: The output image.
* *ddepth*: The depth of the output image. We set it to *CV_16S* to avoid overflow.
* *x_order*: The order of the derivative in **x** direction.
* *y_order*: The order of the derivative in **y** direction.
* *3*: The size of the Sobel kernel (*ksize*).
* *scale*, *delta* and *BORDER_DEFAULT*: We use default values.
Notice that to calculate the gradient in the *x* direction we use :math:`x_{order} = 1` and :math:`y_{order} = 0`. The *y* direction is handled analogously.
#. We convert our partial results back to *CV_8U*:
.. code-block:: cpp
convertScaleAbs( grad_x, abs_grad_x );
convertScaleAbs( grad_y, abs_grad_y );
#. Next, we approximate the *gradient* by combining both directional gradients with equal weights (note that this is not an exact calculation at all, but it is good enough for our purposes).
.. code-block:: cpp
addWeighted( abs_grad_x, 0.5, abs_grad_y, 0.5, 0, grad );
#. Finally, we show our result:
.. code-block:: cpp
imshow( window_name, grad );
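The last two steps of the walkthrough (absolute value, 8-bit saturation, then an equal-weight blend) can be followed on plain numbers. This is a rough sketch with hypothetical helper names; rounding to the nearest integer is an assumption and may differ from OpenCV on exact ties.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>

// |v| saturated to the 8-bit range, as convertScaleAbs does with default arguments.
std::uint8_t absSaturate(int v)
{
    return static_cast<std::uint8_t>(std::min(std::abs(v), 255));
}

// Equal-weight blend of the two absolute gradients (weights 0.5 and 0.5, offset 0).
std::uint8_t blendGradients(int gx, int gy)
{
    double blended = 0.5 * absSaturate(gx) + 0.5 * absSaturate(gy);
    assert(blended >= 0.0 && blended <= 255.0);
    return static_cast<std::uint8_t>(std::lround(blended));
}
```

For ``gx = -60`` and ``gy = 80`` the blend gives 70, whereas the Euclidean magnitude would be 100, which shows why the result is described as an approximation.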
Results
========
#. Here is the output of applying our basic detector to *lena.jpg*:
.. image:: images/Sobel_Derivatives_Tutorial_Result.jpg
:alt: Result of applying Sobel operator to lena.jpg
:width: 300pt
:align: center
.. _Morphology_2:
More Morphology Transformations
*********************************
Goal
=====
In this tutorial you will learn how to:
* Use the OpenCV function :morphology_ex:`morphologyEx <>` to apply morphological transformations such as:
* Opening
* Closing
* Morphological Gradient
* Top Hat
* Black Hat
Cool Theory
============
.. note::
The explanation below belongs to the book **Learning OpenCV** by Bradski and Kaehler.
In the previous tutorial we covered two basic Morphology operations:
* Erosion
* Dilation
Based on these two we can perform more sophisticated transformations on our images. Here we briefly discuss five operations offered by OpenCV:
Opening
---------
* It is obtained by the erosion of an image followed by a dilation.
.. math::
dst = open( src, element) = dilate( erode( src, element ) )
* Useful for removing small objects (it is assumed that the objects are bright on a dark background)
* For instance, check out the example below. The image on the left is the original and the image on the right is the result after applying the opening transformation. We can observe that the small spaces in the corners of the letter tend to disappear.
.. image:: images/Morphology_2_Tutorial_Theory_Opening.png
:height: 150pt
:alt: Opening
:align: center
Closing
---------
* It is obtained by the dilation of an image followed by an erosion.
.. math::
dst = close( src, element ) = erode( dilate( src, element ) )
* Useful to remove small holes (dark regions).
.. image:: images/Morphology_2_Tutorial_Theory_Closing.png
:height: 150pt
:alt: Closing example
:align: center
Morphological Gradient
------------------------
* It is the difference between the dilation and the erosion of an image.
.. math::
dst = morph_{grad}( src, element ) = dilate( src, element ) - erode( src, element )
* It is useful for finding the outline of an object as can be seen below:
.. image:: images/Morphology_2_Tutorial_Theory_Gradient.png
:height: 150pt
:alt: Gradient
:align: center
Top Hat
---------
* It is the difference between an input image and its opening.
.. math::
dst = tophat( src, element ) = src - open( src, element )
.. image:: images/Morphology_2_Tutorial_Theory_TopHat.png
:height: 150pt
:alt: Top Hat
:align: center
Black Hat
----------
* It is the difference between the closing and its input image
.. math::
dst = blackhat( src, element ) = close( src, element ) - src
.. image:: images/Morphology_2_Tutorial_Theory_BlackHat.png
:height: 150pt
:alt: Black Hat
:align: center
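The five definitions above compose directly from erosion and dilation. The following 1D sketch is an illustration under stated assumptions, not the OpenCV implementation: it uses a flat window of ``2 * radius + 1`` samples, ignores samples outside the signal, and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Signal;

// Sliding-window minimum (erosion) or maximum (dilation) with a flat window
// of 2 * radius + 1 samples. Samples outside the signal are ignored.
static Signal slide(const Signal& src, int radius, bool takeMax)
{
    assert(radius >= 0);
    const int n = static_cast<int>(src.size());
    Signal dst(src.size());
    for (int i = 0; i < n; ++i)
    {
        int lo = std::max(0, i - radius), hi = std::min(n - 1, i + radius);
        int v = src[lo];
        for (int j = lo + 1; j <= hi; ++j)
            v = takeMax ? std::max(v, src[j]) : std::min(v, src[j]);
        dst[i] = v;
    }
    return dst;
}

static Signal subtract(const Signal& a, const Signal& b)
{
    assert(a.size() == b.size());
    Signal d(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) d[i] = a[i] - b[i];
    return d;
}

Signal erode(const Signal& s, int r)   { return slide(s, r, false); }
Signal dilate(const Signal& s, int r)  { return slide(s, r, true); }
Signal opening(const Signal& s, int r) { return dilate(erode(s, r), r); }   // erode, then dilate
Signal closing(const Signal& s, int r) { return erode(dilate(s, r), r); }   // dilate, then erode
Signal morphGradient(const Signal& s, int r) { return subtract(dilate(s, r), erode(s, r)); }
Signal topHat(const Signal& s, int r)        { return subtract(s, opening(s, r)); }
Signal blackHat(const Signal& s, int r)      { return subtract(closing(s, r), s); }
```

With a radius of 1, opening removes an isolated bright sample but keeps a plateau three samples wide, and the top hat returns exactly the sample that opening removed. Closing fills a one-sample dark gap, which the black hat then isolates.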
Code
======
The code for this tutorial is shown below. You can also download it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/ImgProc/Morphology_2.cpp>`_
.. code-block:: cpp
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <stdlib.h>
#include <stdio.h>
using namespace cv;
/// Global variables
Mat src, dst;
int morph_elem = 0;
int morph_size = 0;
int morph_operator = 0;
int const max_operator = 4;
int const max_elem = 2;
int const max_kernel_size = 21;
const char* window_name = "Morphology Transformations Demo";
/** Function Headers */
void Morphology_Operations( int, void* );
/** @function main */
int main( int argc, char** argv )
{
/// Load an image
src = imread( argv[1] );
if( !src.data )
{ return -1; }
/// Create window
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
/// Create Trackbar to select Morphology operation
createTrackbar("Operator:\n 0: Opening - 1: Closing \n 2: Gradient - 3: Top Hat \n 4: Black Hat", window_name, &morph_operator, max_operator, Morphology_Operations );
/// Create Trackbar to select kernel type
createTrackbar( "Element:\n 0: Rect - 1: Cross - 2: Ellipse", window_name,
&morph_elem, max_elem,
Morphology_Operations );
/// Create Trackbar to choose kernel size
createTrackbar( "Kernel size:\n 2n +1", window_name,
&morph_size, max_kernel_size,
Morphology_Operations );
/// Default start
Morphology_Operations( 0, 0 );
waitKey(0);
return 0;
}
/**
* @function Morphology_Operations
*/
void Morphology_Operations( int, void* )
{
// MORPH_OPEN, MORPH_CLOSE, MORPH_GRADIENT, MORPH_TOPHAT and MORPH_BLACKHAT are 2, 3, 4, 5 and 6
int operation = morph_operator + 2;
Mat element = getStructuringElement( morph_elem, Size( 2*morph_size + 1, 2*morph_size+1 ), Point( morph_size, morph_size ) );
/// Apply the specified morphology operation
morphologyEx( src, dst, operation, element );
imshow( window_name, dst );
}
Explanation
=============
#. Let's check the general structure of the program:
* Load an image
* Create a window to display results of the Morphological operations
* Create three trackbars for the user to enter parameters:
* The first trackbar **"Operator"** returns the kind of morphology operation to use (**morph_operator**).
.. code-block:: cpp
createTrackbar("Operator:\n 0: Opening - 1: Closing \n 2: Gradient - 3: Top Hat \n 4: Black Hat",
window_name, &morph_operator, max_operator,
Morphology_Operations );
* The second trackbar **"Element"** returns **morph_elem**, which indicates what kind of structure our kernel is:
.. code-block:: cpp
createTrackbar( "Element:\n 0: Rect - 1: Cross - 2: Ellipse", window_name,
&morph_elem, max_elem,
Morphology_Operations );
* The final trackbar **"Kernel Size"** returns the size of the kernel to be used (**morph_size**)
.. code-block:: cpp
createTrackbar( "Kernel size:\n 2n +1", window_name,
&morph_size, max_kernel_size,
Morphology_Operations );
* Every time we move any slider, the user-defined function **Morphology_Operations** is called to perform a new morphology operation and update the output image based on the current trackbar values.
.. code-block:: cpp
/**
* @function Morphology_Operations
*/
void Morphology_Operations( int, void* )
{
// MORPH_OPEN, MORPH_CLOSE, MORPH_GRADIENT, MORPH_TOPHAT and MORPH_BLACKHAT are 2, 3, 4, 5 and 6
int operation = morph_operator + 2;
Mat element = getStructuringElement( morph_elem, Size( 2*morph_size + 1, 2*morph_size+1 ), Point( morph_size, morph_size ) );
/// Apply the specified morphology operation
morphologyEx( src, dst, operation, element );
imshow( window_name, dst );
}
We can observe that the key function to perform the morphology transformations is :morphology_ex:`morphologyEx <>`. In this example we use four arguments (leaving the rest as defaults):
* **src** : Source (input) image
* **dst**: Output image
* **operation**: The kind of morphology transformation to be performed. Note that we have 5 alternatives:
* *Opening*: MORPH_OPEN : 2
* *Closing*: MORPH_CLOSE: 3
* *Gradient*: MORPH_GRADIENT: 4
* *Top Hat*: MORPH_TOPHAT: 5
* *Black Hat*: MORPH_BLACKHAT: 6
As you can see, the values range from 2 to 6, which is why we add 2 to the value entered with the trackbar:
.. code-block:: cpp
int operation = morph_operator + 2;
* **element**: The kernel to be used. We use the function :get_structuring_element:`getStructuringElement <>` to define our own structure.
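The mapping from trackbar values to operation codes and kernel sizes can be written out explicitly. The sketch below is illustrative only: the enum mirrors the numeric codes listed above under hypothetical names, and ``makeElement`` is a hypothetical stand-in for a structuring-element builder that covers just the rectangle and cross shapes (the ellipse is omitted for brevity).

```cpp
#include <cassert>
#include <vector>

// Numeric codes quoted in the list above (trackbar value + 2).
enum MorphOperation { OP_OPEN = 2, OP_CLOSE = 3, OP_GRADIENT = 4, OP_TOPHAT = 5, OP_BLACKHAT = 6 };

MorphOperation operationFromTrackbar(int trackbarValue)
{
    assert(trackbarValue >= 0 && trackbarValue <= 4);
    return static_cast<MorphOperation>(trackbarValue + 2);
}

// Side length of the square kernel for trackbar value n: 2n + 1 is always odd,
// so the anchor Point(n, n) is the exact centre.
int kernelSide(int n) { return 2 * n + 1; }

// Hypothetical helper: shape 0 = rectangle (all ones), shape 1 = cross
// (ones on the centre row and centre column only).
std::vector<std::vector<int> > makeElement(int shape, int n)
{
    assert((shape == 0 || shape == 1) && n >= 0);
    int side = kernelSide(n);
    std::vector<std::vector<int> > mask(side, std::vector<int>(side, 0));
    for (int r = 0; r < side; ++r)
        for (int c = 0; c < side; ++c)
            mask[r][c] = (shape == 0 || r == n || c == n) ? 1 : 0;
    return mask;
}
```

Note that a kernel-size trackbar value of 0 yields a :math:`1 \times 1` element, for which every one of these operations leaves the image unchanged (or all black, for the three difference-based ones).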
Results
========
* After compiling the code above we can execute it giving an image path as an argument. For this tutorial we use as input the image: **baboon.jpg**:
.. image:: images/Morphology_2_Tutorial_Original_Image.jpg
:height: 200pt
:alt: Morphology 2: Original image
:align: center
* And here are two snapshots of the display window. The first picture shows the output after using the operator **Opening** with a cross kernel. The second picture (right side) shows the result of using a **Blackhat** operator with an ellipse kernel.
.. image:: images/Morphology_2_Tutorial_Cover.png
:height: 300pt
:alt: Morphology 2: Result sample
:align: center
.. _Pyramids:
Image Pyramids
***************
Goal
=====
In this tutorial you will learn how to:
* Use the OpenCV functions :pyr_up:`pyrUp <>` and :pyr_down:`pyrDown <>` to upsample or downsample a given image.
Theory
=======
.. note::
The explanation below belongs to the book **Learning OpenCV** by Bradski and Kaehler.
* Usually we need to convert an image to a size different than its original. For this, there are two possible options:
* *Upsize* the image (zoom in) or
* *Downsize* it (zoom out).
* Although there is a *geometric transformation* function in OpenCV that literally resizes an image (:resize:`resize <>`, which we will show in a future tutorial), in this section we first analyze the use of **Image Pyramids**, which are widely applied in a huge range of vision applications.
Image Pyramid
--------------
* An image pyramid is a collection of images - all arising from a single original image - that are successively downsampled until some desired stopping point is reached.
* There are two common kinds of image pyramids:
* **Gaussian pyramid:** Used to downsample images
* **Laplacian pyramid:** Used to reconstruct an upsampled image from an image lower in the pyramid (with less resolution)
* In this tutorial we'll use the *Gaussian pyramid*.
Gaussian Pyramid
^^^^^^^^^^^^^^^^^
* Imagine the pyramid as a set of layers in which the higher the layer, the smaller the size.
.. image:: images/Pyramids_Tutorial_Pyramid_Theory.png
:alt: Pyramid figure
:align: center
* Every layer is numbered from bottom to top, so layer :math:`(i+1)` (denoted as :math:`G_{i+1}`) is smaller than layer :math:`i` (:math:`G_{i}`).
* To produce layer :math:`(i+1)` in the Gaussian pyramid, we do the following:
* Convolve :math:`G_{i}` with a Gaussian kernel:
.. math::
\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}
* Remove every even-numbered row and column.
* You can easily notice that the resulting image will be exactly one-quarter the area of its predecessor. Iterating this process on the input image :math:`G_{0}` (original image) produces the entire pyramid.
* The procedure above was useful to downsample an image. What if we want to make it bigger?:
* First, upsize the image to twice the original in each dimension, with the new even rows and columns filled with zeros (:math:`0`)
* Perform a convolution with the same kernel shown above (multiplied by 4) to approximate the values of the "missing pixels"
* These two procedures (downsampling and upsampling as explained above) are implemented by the OpenCV functions :pyr_down:`pyrDown <>` and :pyr_up:`pyrUp <>`, as we will see in an example with the code below:
.. note::
When we reduce the size of an image, we are actually *losing* information of the image.
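The two procedures described above can be sketched in one dimension, where the smoothing kernel is ``[1 4 6 4 1] / 16`` (the :math:`5 \times 5` kernel is its outer product with itself, so its entries sum to 256). This is a minimal illustration and not the OpenCV implementation: border samples are replicated, which is an assumption, and ``pyramidDown`` / ``pyramidUp`` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef std::vector<double> Row;

// Read src[i] with the index clamped to the valid range (border replication).
static double at(const Row& src, int i)
{
    int last = static_cast<int>(src.size()) - 1;
    return src[std::max(0, std::min(i, last))];
}

// One level down: smooth with [1 4 6 4 1] / 16, then keep samples 0, 2, 4, ...
Row pyramidDown(const Row& src)
{
    assert(!src.empty() && src.size() % 2 == 0);
    Row dst(src.size() / 2);
    for (int k = 0; k < static_cast<int>(dst.size()); ++k)
    {
        int c = 2 * k;
        dst[k] = (at(src, c - 2) + 4 * at(src, c - 1) + 6 * at(src, c)
                + 4 * at(src, c + 1) + at(src, c + 2)) / 16.0;
    }
    return dst;
}

// One level up: insert zeros between samples, then smooth with the same kernel
// multiplied by 2 (the 1D counterpart of "multiplied by 4" in 2D). Written out
// per output sample, the inserted zeros drop out of the sums.
Row pyramidUp(const Row& src)
{
    assert(!src.empty());
    Row dst(src.size() * 2);
    for (int k = 0; k < static_cast<int>(src.size()); ++k)
    {
        dst[2 * k]     = (at(src, k - 1) + 6 * at(src, k) + at(src, k + 1)) / 8.0;
        dst[2 * k + 1] = (4 * at(src, k) + 4 * at(src, k + 1)) / 8.0;
    }
    return dst;
}
```

A constant row survives a round trip unchanged, but a row with fine detail such as ``{0, 16, 0, 16, ...}`` does not come back after ``pyramidUp(pyramidDown(row))``, which is the loss of information mentioned in the note.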
Code
======
The code for this tutorial is shown below. You can also download it from `here <https://code.ros.org/svn/opencv/trunk/opencv/samples/cpp/tutorial_code/ImgProc/Pyramids.cpp>`_
.. code-block:: cpp
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
using namespace cv;
/// Global variables
Mat src, dst, tmp;
const char* window_name = "Pyramids Demo";
/**
* @function main
*/
int main( int argc, char** argv )
{
/// General instructions
printf( "\n Zoom In-Out demo \n " );
printf( "------------------ \n" );
printf( " * [u] -> Zoom in \n" );
printf( " * [d] -> Zoom out \n" );
printf( " * [ESC] -> Close program \n \n" );
/// Test image - Make sure its size is divisible by 2^{n}
src = imread( "../images/chicky_512.png" );
if( !src.data )
{ printf(" No data! -- Exiting the program \n");
return -1; }
tmp = src;
dst = tmp;
/// Create window
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
imshow( window_name, dst );
/// Loop
while( true )
{
int c;
c = waitKey(10);
if( (char)c == 27 )
{ break; }
if( (char)c == 'u' )
{ pyrUp( tmp, dst, Size( tmp.cols*2, tmp.rows*2 ) );
printf( "** Zoom In: Image x 2 \n" );
}
else if( (char)c == 'd' )
{ pyrDown( tmp, dst, Size( tmp.cols/2, tmp.rows/2 ) );
printf( "** Zoom Out: Image / 2 \n" );
}
imshow( window_name, dst );
tmp = dst;
}
return 0;
}
Explanation
=============
#. Let's check the general structure of the program:
* Load an image (in this case it is defined in the program, the user does not have to enter it as an argument)
.. code-block:: cpp
/// Test image - Make sure its size is divisible by 2^{n}
src = imread( "../images/chicky_512.png" );
if( !src.data )
{ printf(" No data! -- Exiting the program \n");
return -1; }
* Create a Mat object to store the result of the operations (*dst*) and one to save temporary results (*tmp*).
.. code-block:: cpp
Mat src, dst, tmp;
/* ... */
tmp = src;
dst = tmp;
* Create a window to display the result
.. code-block:: cpp
namedWindow( window_name, CV_WINDOW_AUTOSIZE );
imshow( window_name, dst );
* Perform an infinite loop waiting for user input.
.. code-block:: cpp
while( true )
{
int c;
c = waitKey(10);
if( (char)c == 27 )
{ break; }
if( (char)c == 'u' )
{ pyrUp( tmp, dst, Size( tmp.cols*2, tmp.rows*2 ) );
printf( "** Zoom In: Image x 2 \n" );
}
else if( (char)c == 'd' )
{ pyrDown( tmp, dst, Size( tmp.cols/2, tmp.rows/2 ) );
printf( "** Zoom Out: Image / 2 \n" );
}
imshow( window_name, dst );
tmp = dst;
}
Our program exits if the user presses *ESC*. Otherwise, it offers two options:
* **Perform upsampling (after pressing 'u')**
.. code-block:: cpp
pyrUp( tmp, dst, Size( tmp.cols*2, tmp.rows*2 ) );
We use the function :pyr_up:`pyrUp <>` with three arguments:
* *tmp*: The current image. It is initialized with the original image *src*.
* *dst*: The destination image (to be shown on screen, twice the size of the input image)
* *Size( tmp.cols*2, tmp.rows*2 )* : The destination size. Since we are upsampling, :pyr_up:`pyrUp <>` expects a size that is double that of the input image (in this case *tmp*).
* **Perform downsampling (after pressing 'd')**
.. code-block:: cpp
pyrDown( tmp, dst, Size( tmp.cols/2, tmp.rows/2 ) );
Similarly to :pyr_up:`pyrUp <>`, we use the function :pyr_down:`pyrDown <>` with three arguments:
* *tmp*: The current image. It is initialized with the original image *src*.
* *dst*: The destination image (to be shown on screen, half the size of the input image)
* *Size( tmp.cols/2, tmp.rows/2 )* : The destination size. Since we are downsampling, :pyr_down:`pyrDown <>` expects half the size of the input image (in this case *tmp*).
* Notice that it is important that the input image can be divided by a factor of two (in both dimensions). Otherwise the integer division truncates the requested size, and an error may be shown.
* Finally, we update the input image **tmp** with the current image displayed, so the subsequent operations are performed on it.
.. code-block:: cpp
tmp = dst;
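The size bookkeeping done by the loop can be followed without any image data. The sketch below is only an illustration with hypothetical names (``ImageSize``, ``nextSize``, ``exactHalvings``); it computes the destination size requested for each key press and how many times a dimension can be halved before integer division starts truncating.

```cpp
#include <cassert>

struct ImageSize { int cols; int rows; };

// Size requested from pyrUp / pyrDown for a key press; other keys keep the size.
ImageSize nextSize(ImageSize current, char key)
{
    assert(current.cols > 0 && current.rows > 0);
    ImageSize next = current;
    if (key == 'u')      { next.cols = current.cols * 2; next.rows = current.rows * 2; }
    else if (key == 'd') { next.cols = current.cols / 2; next.rows = current.rows / 2; }
    return next;
}

// How many times a dimension can be halved before integer division truncates.
int exactHalvings(int dimension)
{
    assert(dimension > 0);
    int count = 0;
    while (dimension % 2 == 0) { dimension /= 2; ++count; }
    return count;
}
```

Starting from :math:`512 \times 512`, two presses of 'd' give :math:`128 \times 128` and two presses of 'u' return to the original size; a dimension of 512 can be halved exactly nine times, whereas 500 can only be halved twice.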
Results
========
* After compiling the code above we can test it. The program loads the image **chicky_512.png**, which comes in the *tutorial_code/image* folder. Notice that this image is :math:`512 \times 512`, so repeated downsampling won't generate any error (:math:`512 = 2^{9}`). The original image is shown below:
.. image:: images/Pyramids_Tutorial_Original_Image.png
:alt: Pyramids: Original image
:align: center
* First we apply two successive :pyr_down:`pyrDown <>` operations by pressing 'd'. Our output is:
.. image:: images/Pyramids_Tutorial_PyrDown_Result.png
:alt: Pyramids: PyrDown Result
:align: center
* Note that we have lost some resolution because we reduced the size of the image. This is evident after we apply :pyr_up:`pyrUp <>` twice (by pressing 'u'). Our output is now:
.. image:: images/Pyramids_Tutorial_PyrUp_Result.png
:alt: Pyramids: PyrUp Result
:align: center