Commit 61f36de5 authored by Maksim Shabunin

Doxygen tutorials support

parent 312c8fa7
......@@ -57,7 +57,7 @@ namespace bgsegm
/** @brief Gaussian Mixture-based Background/Foreground Segmentation Algorithm.
The class implements the algorithm described in @cite KB2001 .
*/
class CV_EXPORTS_W BackgroundSubtractorMOG : public BackgroundSubtractor
{
......@@ -86,9 +86,9 @@ means some automatic value.
CV_EXPORTS_W Ptr<BackgroundSubtractorMOG>
createBackgroundSubtractorMOG(int history=200, int nmixtures=5,
double backgroundRatio=0.7, double noiseSigma=0);
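/* Hedged usage sketch (not part of the original header): create the subtractor once and call
 * apply() (inherited from cv::BackgroundSubtractor) on every frame to obtain the foreground mask.
 * The VideoCapture source is an assumption for illustration.
 * @code{.cpp}
 * cv::VideoCapture cap(0);
 * cv::Ptr<cv::bgsegm::BackgroundSubtractorMOG> mog = cv::bgsegm::createBackgroundSubtractorMOG();
 * cv::Mat frame, fgMask;
 * while (cap.read(frame))
 *     mog->apply(frame, fgMask); // 255 = foreground, 0 = background
 * @endcode
 */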
/** @brief Background Subtractor module based on the algorithm given in @cite Gold2012 .
Takes a series of images and returns a sequence of mask (8UC1)
images of the same size, where 255 indicates Foreground and 0 represents Background.
......
Bioinspired Module Retina Introduction {#bioinspired_retina}
======================================
Retina
------
@note Do not forget that the retina model is included in the following namespace: cv::bioinspired
### Introduction
......@@ -18,14 +17,13 @@ separable spatio-temporal filter modelling the two main retina information chann
From a general point of view, this filter whitens the image spectrum and corrects luminance thanks
to local adaptation. Another important property is its ability to filter out spatio-temporal noise
while enhancing details. This model originates from Jeanny Herault's work @cite Herault2010 . It has been
involved in Alexandre Benoit's PhD and his current research @cite Benoit2010, @cite Strat2013 (he
currently maintains this module within OpenCV). It includes the work of Jeanny's other PhD students
such as @cite Chaix2007 and the log polar transformations of Barthelemy Durette described in Jeanny's
book.
@note
- For ease of use in computer vision applications, the two retina channels are applied
homogeneously on all the input images. This does not follow the real retina topology but this
can still be done using the log sampling capabilities proposed within the class.
......@@ -71,7 +69,7 @@ described hereafter. XML parameters file samples are shown at the end of the pag
Here is an overview of the abstract Retina interface; allocate one instance with the *createRetina*
functions:
@code{.cpp}
namespace cv{namespace bioinspired{
class Retina : public Algorithm
......@@ -122,6 +120,7 @@ functions.:
cv::Ptr<Retina> createRetina (Size inputSize);
cv::Ptr<Retina> createRetina (Size inputSize, const bool colorMode, RETINA_COLORSAMPLINGMETHOD colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const double reductionFactor=1.0, const double samplingStrenght=10.0);
}} // cv and bioinspired namespaces end
@endcode
### Description
......@@ -146,59 +145,47 @@ Use : this model can be used basically for spatio-temporal video effects but als
- performing motion analysis also taking benefit of the previously cited properties (check out the
magnocellular retina channel output, by using the provided **getMagno** methods)
- general image/video sequence description using either one or both channels. An example of the
use of Retina in a Bag of Words approach is given in @cite Strat2013 .
Literature
----------
For more information, refer to the following papers :
- Model description : @cite Benoit2010
- Model use in a Bag of Words approach : @cite Strat2013
- Please have a look at the reference work of Jeanny Herault that you can read in his book : @cite Herault2010
This retina filter code includes the research contributions of PhD/research colleagues from which
code has been redrawn by the author:
- take a look at the *retinacolor.hpp* module to discover Brice Chaix de Lavarene's PhD color
mosaicing/demosaicing and his reference paper: @cite Chaix2007
- take a look at *imagelogpolprojection.hpp* to discover retina spatial log sampling which
originates from Barthelemy Durette's PhD with Jeanny Herault. A Retina / V1 cortex projection is
also proposed and originates from Jeanny's discussions. More information in the above cited
Jeanny Herault's book.
- Meylan et al.'s work on HDR tone mapping that is implemented as a specific method within the model : @cite Meylan2007
Demos and experiments !
-----------------------
@note Complementary to the following examples, have a look at the Retina tutorial in the
tutorial/contrib section for additional explanations.
Take a look at the C++ examples provided with OpenCV :
- **samples/cpp/retinademo.cpp** shows how to use the retina module for details enhancement (Parvo channel output) and transient maps observation (Magno channel output). You can play with images, video sequences and webcam video.
Typical uses are (provided your OpenCV installation is situated in folder *OpenCVReleaseFolder*)
- image processing : **OpenCVReleaseFolder/bin/retinademo -image myPicture.jpg**
- video processing : **OpenCVReleaseFolder/bin/retinademo -video myMovie.avi**
- webcam processing: **OpenCVReleaseFolder/bin/retinademo -video**
@note This demo generates the file *RetinaDefaultParameters.xml* which contains the
default parameters of the retina. Then, rename this as *RetinaSpecificParameters.xml*, adjust
the parameters the way you want and reload the program to check the effect.
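For completeness, here is a minimal C++ sketch of using the retina directly from code, based on the
abstract interface shown above (the headers, webcam index and window names are illustrative
assumptions, not part of the demos):
@code{.cpp}
#include <opencv2/bioinspired.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>

int main()
{
    cv::VideoCapture cap(0);                 // webcam input (assumption)
    cv::Mat frame, parvo, magno;
    if (!cap.read(frame)) return -1;
    // allocate the retina with the input frame size, default color sampling
    cv::Ptr<cv::bioinspired::Retina> retina = cv::bioinspired::createRetina(frame.size());
    do
    {
        retina->run(frame);                  // feed the model with the current frame
        retina->getParvo(parvo);             // details / foveal (parvocellular) channel
        retina->getMagno(magno);             // transient / motion (magnocellular) channel
        cv::imshow("parvo", parvo);
        cv::imshow("magno", magno);
    } while (cap.read(frame) && cv::waitKey(1) < 0);
    return 0;
}
@endcode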
......@@ -217,7 +204,7 @@ Take a look at the provided C++ examples provided with OpenCV :
Note that some sliders are made available to allow you to play with luminance compression.
If not using the 'fast' option, tone mapping is performed using the full retina model
@cite Benoit2010 . It includes spectral whitening that allows luminance energy to be reduced.
When using the 'fast' option, a simpler method is used; it is an adaptation of the
algorithm presented in @cite Meylan2007 . This method also gives good results and is faster to
process but it sometimes requires some additional parameter adjustment.
Interactive Visual Debugging of Computer Vision applications {#tutorial_cvv_introduction}
============================================================
What is the most common way to debug computer vision applications? Usually the answer is temporary,
hacked together, custom code that must be removed from the code for release compilation.
In this tutorial we will show how to use the visual debugging features of the **cvv** module
(*opencv2/cvv.hpp*) instead.
Goals
-----
In this tutorial you will learn how to:
- Add cvv debug calls to your application
- Use the visual debug GUI
- Enable and disable the visual debug features during compilation (with zero runtime overhead when
disabled)
Code
----
The example code
- captures images (*videoio*), e.g. from a webcam,
- applies some filters to each image (*imgproc*),
- detects image features and matches them to the previous image (*features2d*).
If the program is compiled without visual debugging (see CMakeLists.txt below) the only result is
some information printed to the command line. We want to demonstrate how much debugging or
development functionality is added by just a few lines of *cvv* commands.
@includelineno cvv/samples/cvv_demo.cpp
@code{.cmake}
cmake_minimum_required(VERSION 2.8)
project(cvvisual_test)
SET(CMAKE_PREFIX_PATH ~/software/opencv/install)
SET(CMAKE_CXX_COMPILER "g++-4.8")
SET(CMAKE_CXX_FLAGS "-std=c++11 -O2 -pthread -Wall -Werror")
# (un)set: cmake -DCVV_DEBUG_MODE=OFF ..
OPTION(CVV_DEBUG_MODE "cvvisual-debug-mode" ON)
if(CVV_DEBUG_MODE MATCHES ON)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DCVVISUAL_DEBUGMODE")
endif()
FIND_PACKAGE(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(cvvt main.cpp)
target_link_libraries(cvvt
opencv_core opencv_videoio opencv_imgproc opencv_features2d
opencv_cvv
)
@endcode
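Before stepping through the sample, here is a condensed sketch of the instrumentation pattern it
uses (the variable names follow the snippets in the Explanation below; the processing itself and
error handling are omitted, and the capture source is an assumption):
@code{.cpp}
#include <string>
#include <opencv2/videoio.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/cvv.hpp>

int main()
{
    cv::VideoCapture cap(0);
    cv::Mat imgRead, imgGray;
    for (int frame = 0; cap.read(imgRead); ++frame)
    {
        std::string imgIdString = "frame " + std::to_string(frame);
        cvv::showImage(imgRead, CVVISUAL_LOCATION, imgIdString.c_str());  // plain image view
        cv::cvtColor(imgRead, imgGray, cv::COLOR_BGR2GRAY);
        cvv::debugFilter(imgRead, imgGray, CVVISUAL_LOCATION, "to gray"); // input/output pair
    }
    cvv::finalShow();  // blocks and shows everything passed to cvv so far
    return 0;
}
@endcode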
Explanation
-----------
-# We compile the program either using the above CMakeLists.txt with option *CVV_DEBUG_MODE=ON*
(*cmake -DCVV_DEBUG_MODE=ON*) or by adding the corresponding define *CVVISUAL_DEBUGMODE* to
our compiler (e.g. *g++ -DCVVISUAL_DEBUGMODE*).
-# The first cvv call simply shows the image (similar to *imshow*) with the imgIdString as comment.
@code{.cpp}
cvv::showImage(imgRead, CVVISUAL_LOCATION, imgIdString.c_str());
@endcode
The image is added to the overview tab in the visual debug GUI and the cvv call blocks.
![image](images/01_overview_single.jpg)
The image can then be selected and viewed
![image](images/02_single_image_view.jpg)
Whenever you want to continue in the code, i.e. unblock the cvv call, you can either continue
until the next cvv call (*Step*), continue until the last cvv call (*\>\>*) or run the
application until it exits (*Close*).
We decide to press the green *Step* button.
-# The next cvv calls are used to debug all kinds of filter operations, i.e. operations that take a
picture as input and return a picture as output.
@code{.cpp}
cvv::debugFilter(imgRead, imgGray, CVVISUAL_LOCATION, "to gray");
@endcode
As with every cvv call, you first end up in the overview.
![image](images/03_overview_two.jpg)
We decide not to care about the conversion to gray scale and press *Step*.
@code{.cpp}
cvv::debugFilter(imgGray, imgGraySmooth, CVVISUAL_LOCATION, "smoothed");
@endcode
If you open the filter call, you will end up in the so-called "DefaultFilterView". Both images
are shown next to each other and you can (synchronized) zoom into them.
![image](images/04_default_filter_view.jpg)
When you go to very high zoom levels, each pixel is annotated with its numeric values.
![image](images/05_default_filter_view_high_zoom.jpg)
We press *Step* twice and have a look at the dilated image.
@code{.cpp}
cvv::debugFilter(imgEdges, imgEdgesDilated, CVVISUAL_LOCATION, "dilated edges");
@endcode
The DefaultFilterView showing both images
![image](images/06_default_filter_view_edges.jpg)
Now we use the *View* selector in the top right and select the "DualFilterView". We select
"Changed Pixels" as filter and apply it (middle image).
![image](images/07_dual_filter_view_edges.jpg)
After we had a close look at these images, perhaps using different views, filters or other GUI
features, we decide to let the program run through. Therefore we press the yellow *\>\>* button.
The program will block at
@code{.cpp}
cvv::finalShow();
@endcode
and display the overview with everything that was passed to cvv in the meantime.
![image](images/08_overview_all.jpg)
-# The cvv debugDMatch call is used in a situation where there are two images each with a set of
descriptors that are matched to each other.
We pass both images, both sets of keypoints and their matching to the visual debug module.
@code{.cpp}
cvv::debugDMatch(prevImgGray, prevKeypoints, imgGray, keypoints, matches, CVVISUAL_LOCATION, allMatchIdString.c_str());
@endcode
Since we want to have a look at matches, we use the filter capabilities (*\#type match*) in the
overview to only show match calls.
![image](images/09_overview_filtered_type_match.jpg)
We want to have a closer look at one of them, e.g. to tune our parameters that use the matching.
The view has various settings how to display keypoints and matches. Furthermore, there is a
mouseover tooltip.
![image](images/10_line_match_view.jpg)
We see (visual debugging!) that there are many bad matches. We decide that only 70% of the
matches should be shown - those 70% with the lowest match distance.
![image](images/11_line_match_view_portion_selector.jpg)
Having successfully reduced the visual distraction, we want to see more clearly what changed
between the two images. We select the "TranslationMatchView" that shows, in a different way,
where the keypoint was matched to.
![image](images/12_translation_match_view_portion_selector.jpg)
It is easy to see that the cup was moved to the left between the two images.
Although cvv is all about interactively *seeing* the computer vision bugs, this is complemented
by a "RawView" that allows you to have a look at the underlying numeric data.
![image](images/13_raw_view.jpg)
-# There are many more useful features contained in the cvv GUI. For instance, one can group the
overview tab.
![image](images/14_overview_group_by_line.jpg)
Result
------
- By adding a few expressive lines to our computer vision program we can interactively debug it
through different visualizations.
- Once we are done developing/debugging we do not have to remove those lines. We simply disable
cvv debugging (*cmake -DCVV_DEBUG_MODE=OFF* or g++ without *-DCVVISUAL_DEBUGMODE*) and our
program runs without any debug overhead.
Enjoy computer vision!
......@@ -45,7 +45,7 @@ the use of this software, even if advised of the possibility of such damage.
@defgroup face Face Recognition
- @ref face_changelog
- @ref tutorial_face_main
*/
......
......@@ -61,7 +61,7 @@ Discriminatively Trained Part Based Models for Object Detection
---------------------------------------------------------------
The object detector described below has been initially proposed by P.F. Felzenszwalb in
@cite Felzenszwalb2010a . It is based on a Dalal-Triggs detector that uses a single filter on histogram
of oriented gradients (HOG) features to represent an object category. This detector uses a sliding
window approach, where a filter is applied at all positions and scales of an image. The first
innovation is enriching the Dalal-Triggs model using a star-structured part-based model defined by a
......@@ -77,7 +77,7 @@ and scale is the maximum over components, of the score of that component model a
location.
The detector was dramatically sped up with the cascade algorithm proposed by P.F. Felzenszwalb in
@cite Felzenszwalb2010b . The algorithm prunes partial hypotheses using thresholds on their scores. The
basic idea of the algorithm is to use a hierarchy of models defined by an ordering of the original
model's parts. For a model with (n+1) parts, including the root, a sequence of (n+1) models is
obtained. The i-th model in this sequence is defined by the first i parts from the original model.
......
......@@ -63,8 +63,8 @@ Computation of binary descriptors
---------------------------------
To obtain a binary descriptor representing a certain line detected from a certain octave of an
image, we first compute a non-binary descriptor as described in @cite LBD . Such an algorithm works on
lines extracted using the EDLine detector, as explained in @cite EDL . Given a line, we consider a
rectangular region centered at it and called *line support region (LSR)*. Such a region is divided
into a set of bands \f$\{B_1, B_2, ..., B_m\}\f$, whose length equals that of the line.
......
......@@ -854,7 +854,7 @@ std::vector<cv::Mat> octaveImages;
Lines extraction methodology
----------------------------
The lines extraction methodology described in the following is mainly based on @cite EDL . The
extraction starts with a Gaussian pyramid generated from an original image, downsampled N-1 times,
blurred N times, to obtain N layers (one for each octave), with layer 0 corresponding to the input
image. Then, from each layer (octave) in the pyramid, lines are extracted using the LSD algorithm.
......@@ -931,7 +931,7 @@ based on *Multi-Index Hashing (MiHashing)* will be described.
Multi-Index Hashing
-------------------
The theory described in this section is based on @cite MIH . Given a dataset populated with binary
codes, each code is indexed *m* times into *m* different hash tables, according to *m* substrings it
has been divided into. Thus, given a query code, all the entries close to it at least in one
substring are returned by search as *neighbor candidates*. Returned entries are then checked for
......
Line Features Tutorial {#tutorial_line_descriptor_main}
======================
In this tutorial it will be shown how to:
- use the *BinaryDescriptor* interface to extract lines and store them in *KeyLine* objects
- use the same interface to compute descriptors for every extracted line
- use the *BinaryDescriptorMatcher* to determine matches among descriptors obtained from different
images
Lines extraction and descriptors computation
--------------------------------------------
In the following snippet of code, it is shown how to detect lines from an image. The LSD extractor
is initialized with *LSD\_REFINE\_ADV* option; remaining parameters are left to their default
values. A mask of ones is used in order to accept all extracted lines, which, at the end, are
displayed using random colors for octave 0.
@includelineno line_descriptor/samples/lsd_lines_extraction.cpp
This is the result obtained for the famous cameraman image:
![alternate text](pics/lines_cameraman_edl.png)
Another way to extract lines is using the *LSDDetector* class; such a class uses the LSD extractor to
compute lines. To obtain this result, it is sufficient to use the snippet seen above, just
replacing the relevant rows with the following:
@code{.cpp}
// create a pointer to an LSDDetector object
Ptr<LSDDetector> lsd = LSDDetector::createLSDDetector();
// compute lines
std::vector<KeyLine> keylines;
lsd->detect( imageMat, keylines, mask );
@endcode
Here's the result returned by the LSD detector, again on the cameraman picture:
![alternate text](pics/cameraman_lines2.png)
Once keylines have been detected, it is possible to compute their descriptors as shown in the
following:
@includelineno line_descriptor/samples/compute_descriptors.cpp
Matching among descriptors
--------------------------
If we have extracted descriptors from two different images, it is possible to search for matches
among them. One way of doing it is matching each input query descriptor to exactly one descriptor,
choosing the one at closest distance:
@includelineno line_descriptor/samples/matching.cpp
Sometimes, we could be interested in searching for the closest *k* descriptors, given an input one.
This requires slightly modifying the previous code:
@code{.cpp}
// prepare a structure to host matches
std::vector<std::vector<DMatch> > matches;
// require knn match
bdm->knnMatch( descr1, descr2, matches, 6 );
@endcode
In the above example, the closest 6 descriptors are returned for every query. In some cases, we
could have a search radius and look for all descriptors distant at most *r* from the input query.
The previous code must be modified:
@code{.cpp}
// prepare a structure to host matches
std::vector<std::vector<DMatch> > matches;
// compute matches
bdm->radiusMatch( queries, matches, 30 );
@endcode
Here's an example of matching among descriptors extracted from the original cameraman image and its
downsampled (and blurred) version:
![alternate text](pics/matching2.png)
Querying internal database
--------------------------
The *BinaryDescriptorMatcher* class owns an internal database that can be populated with
descriptors extracted from different images and queried using one of the modalities described in
the previous section. Population of the internal dataset can be done using the *add* function; such a
function doesn't directly add new data to the database, but just stores it locally. The real update
happens when the *train* function is invoked or when any querying function is executed, since each of
them invokes *train* before querying. When queried, the internal database not only returns the
required descriptors but, for every returned match, it is able to tell which image the matched
descriptor was extracted from. An example of internal dataset usage is described in the following
code (see also the sketch after the snippet); after adding new descriptors locally, a radius search
is invoked. This causes local data to be transferred to the dataset, which, in turn, is then queried.
@includelineno line_descriptor/samples/radius_matching.cpp
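A minimal sketch of the same add/train/query flow (assuming descriptors have already been computed
for a set of images as shown earlier; `databaseDescriptors` and `queryDescriptors` are placeholder
names):
@code{.cpp}
// assume one descriptor Mat per image has been pushed into databaseDescriptors
std::vector<cv::Mat> databaseDescriptors;
cv::Mat queryDescriptors;

Ptr<BinaryDescriptorMatcher> bdm = BinaryDescriptorMatcher::createBinaryDescriptorMatcher();

// store descriptors locally; the internal dataset is updated lazily
bdm->add(databaseDescriptors);

// train() is invoked implicitly by any query, but can also be called explicitly
bdm->train();

// radius query against the internal database
std::vector<std::vector<DMatch> > matches;
bdm->radiusMatch(queryDescriptors, matches, 30);
@endcode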
......@@ -99,7 +99,7 @@ for pixel
@param speed_up_thr threshold to detect point with irregular flow - where flow should be
recalculated after upscale
See @cite Tao2012 and the project site - <http://graphics.berkeley.edu/papers/Tao-SAN-2012-05/>.
@note
- An example using the simpleFlow algorithm can be found at samples/simpleflow_demo.cpp
......
......@@ -66,7 +66,7 @@ That is, MHI pixels where the motion occurs are set to the current timestamp , w
where the motion happened last time a long time ago are cleared.
The function, together with calcMotionGradient and calcGlobalOrientation , implements a motion
templates technique described in @cite Davis97 and @cite Bradski00 .
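A minimal usage sketch (an illustrative assumption, not part of the original documentation: the
silhouette is a binary CV_8UC1 mask, e.g. from frame differencing, and mhi is a zero-initialized
CV_32FC1 image of the same size):
@code{.cpp}
cv::Mat mask, orientation;
double timestamp = (double)cv::getTickCount() / cv::getTickFrequency();
updateMotionHistory(silhouette, mhi, timestamp, 0.5);            // keep 0.5 s of motion history
calcMotionGradient(mhi, mask, orientation, 0.05, 0.5, 3);
double angle = calcGlobalOrientation(orientation, mask, mhi, timestamp, 0.5);
@endcode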
*/
CV_EXPORTS_W void updateMotionHistory( InputArray silhouette, InputOutputArray mhi,
double timestamp, double duration );
......
......@@ -45,7 +45,7 @@
The Registration module implements parametric image registration. The implemented method is direct
alignment, that is, it uses directly the pixel values for calculating the registration between a
pair of images, as opposed to feature-based registration. The implementation follows essentially the
corresponding part of @cite Szeliski06 .
Feature-based methods have some advantages over pixel-based methods when we are trying to register
pictures that have been shot under different lighting conditions or exposure times, or when the
......
......@@ -365,7 +365,7 @@ accurate representation. However, note that number of point pair features to be
quadratically increased as the complexity is O(N\^2). This is especially a concern for 32 bit
systems, where large models can easily overshoot the available memory. Typically, values in the
range of 0.025 - 0.05 seem adequate for most of the applications, where the default value is 0.03.
(Note that there is a difference in this parameter with the one presented in @cite drost2010 . In
@cite drost2010 a uniform cuboid is used for quantization and model diameter is used for reference of
sampling. In my implementation, the cuboid is a rectangular prism, and each dimension is quantized
independently. I do not take reference from the diameter but along the individual dimensions.
......
Tracking diagrams {#tracking_diagrams}
=================
General diagram
===============
@startuml{tracking_uml_general.png}
package "Tracker"
package "TrackerFeature"
package "TrackerSampler"
package "TrackerModel"
Tracker -> TrackerModel: create
Tracker -> TrackerSampler: create
Tracker -> TrackerFeature: create
@enduml
Tracker diagram
===============
@startuml{tracking_uml_tracking.png}
package "Tracker package" #DDDDDD {
class Algorithm
class Tracker{
Ptr<TrackerFeatureSet> featureSet;
Ptr<TrackerSampler> sampler;
Ptr<TrackerModel> model;
---
+static Ptr<Tracker> create(const string& trackerType);
+bool init(const Mat& image, const Rect& boundingBox);
+bool update(const Mat& image, Rect& boundingBox);
}
class Tracker
note right: Tracker is the general interface for all specialized trackers
class TrackerMIL{
+static Ptr<TrackerMIL> createTracker(const TrackerMIL::Params &parameters);
+virtual ~TrackerMIL();
}
class TrackerBoosting{
+static Ptr<TrackerBoosting> createTracker(const TrackerBoosting::Params &parameters);
+virtual ~TrackerBoosting();
}
Algorithm <|-- Tracker : virtual inheritance
Tracker <|-- TrackerMIL
Tracker <|-- TrackerBoosting
note "Single instance of the Tracker" as N1
TrackerBoosting .. N1
TrackerMIL .. N1
}
@enduml
TrackerFeatureSet diagram
=========================
@startuml{tracking_uml_feature.png}
package "TrackerFeature package" #DDDDDD {
class TrackerFeatureSet{
-vector<pair<string, Ptr<TrackerFeature> > > features
-vector<Mat> responses
...
TrackerFeatureSet();
~TrackerFeatureSet();
--
+extraction(const std::vector<Mat>& images);
+selection();
+removeOutliers();
+vector<Mat> response getResponses();
+vector<pair<string TrackerFeatureType, Ptr<TrackerFeature> > > getTrackerFeatures();
+bool addTrackerFeature(string trackerFeatureType);
+bool addTrackerFeature(Ptr<TrackerFeature>& feature);
-clearResponses();
}
class TrackerFeature <<virtual>>{
static Ptr<TrackerFeature> = create(const string& trackerFeatureType);
compute(const std::vector<Mat>& images, Mat& response);
selection(Mat& response, int npoints);
}
note bottom: Can be specialized as in table II\nA tracker can use more types of features
class TrackerFeatureFeature2D{
-vector<Keypoints> keypoints
---
TrackerFeatureFeature2D(string detectorType, string descriptorType);
~TrackerFeatureFeature2D();
---
compute(const std::vector<Mat>& images, Mat& response);
selection( Mat& response, int npoints);
}
class TrackerFeatureHOG{
TrackerFeatureHOG();
~TrackerFeatureHOG();
---
compute(const std::vector<Mat>& images, Mat& response);
selection(Mat& response, int npoints);
}
TrackerFeatureSet *-- TrackerFeature
TrackerFeature <|-- TrackerFeatureHOG
TrackerFeature <|-- TrackerFeatureFeature2D
note "Per readability and simplicity in this diagram\n there are only two TrackerFeature but you\n can considering the implementation of the other TrackerFeature" as N1
TrackerFeatureHOG .. N1
TrackerFeatureFeature2D .. N1
}
@enduml
TrackerModel diagram
====================
@startuml{tracking_uml_model.png}
package "TrackerModel package" #DDDDDD {
class Typedef << (T,#FF7700) >>{
ConfidenceMap
Trajectory
}
class TrackerModel{
-vector<ConfidenceMap> confidenceMaps;
-Trajectory trajectory;
-Ptr<TrackerStateEstimator> stateEstimator;
...
TrackerModel();
~TrackerModel();
+bool setTrackerStateEstimator(Ptr<TrackerStateEstimator> trackerStateEstimator);
+Ptr<TrackerStateEstimator> getTrackerStateEstimator();
+void modelEstimation(const vector<Mat>& responses);
+void modelUpdate();
+void setLastTargetState(const Ptr<TrackerTargetState> lastTargetState);
+void runStateEstimator();
+const vector<ConfidenceMap>& getConfidenceMaps();
+const ConfidenceMap& getLastConfidenceMap();
}
class TrackerTargetState <<virtual>>{
Point2f targetPosition;
---
Point2f getTargetPosition();
void setTargetPosition(Point2f position);
}
class TrackerTargetState
note bottom: Each tracker can create own state
class TrackerStateEstimator <<virtual>>{
~TrackerStateEstimator();
static Ptr<TrackerStateEstimator> create(const String& trackeStateEstimatorType);
Ptr<TrackerTargetState> estimate(const vector<ConfidenceMap>& confidenceMaps)
void update(vector<ConfidenceMap>& confidenceMaps)
}
class TrackerStateEstimatorSVM{
TrackerStateEstimatorSVM()
~TrackerStateEstimatorSVM()
Ptr<TrackerTargetState> estimate(const vector<ConfidenceMap>& confidenceMaps)
void update(vector<ConfidenceMap>& confidenceMaps)
}
class TrackerStateEstimatorMILBoosting{
TrackerStateEstimatorMILBoosting()
~TrackerStateEstimatorMILBoosting()
Ptr<TrackerTargetState> estimate(const vector<ConfidenceMap>& confidenceMaps)
void update(vector<ConfidenceMap>& confidenceMaps)
}
TrackerModel -> TrackerStateEstimator: create
TrackerModel *-- TrackerTargetState
TrackerStateEstimator <|-- TrackerStateEstimatorMILBoosting
TrackerStateEstimator <|-- TrackerStateEstimatorSVM
}
@enduml
TrackerSampler diagram
======================
@startuml{tracking_uml_sampler.png}
package "TrackerSampler package" #DDDDDD {
class TrackerSampler{
-vector<pair<String, Ptr<TrackerSamplerAlgorithm> > > samplers
-vector<Mat> samples;
...
TrackerSampler();
~TrackerSampler();
+sampling(const Mat& image, Rect boundingBox);
+const vector<pair<String, Ptr<TrackerSamplerAlgorithm> > >& getSamplers();
+const vector<Mat>& getSamples();
+bool addTrackerSamplerAlgorithm(String trackerSamplerAlgorithmType);
+bool addTrackerSamplerAlgorithm(Ptr<TrackerSamplerAlgorithm>& sampler);
---
-void clearSamples();
}
class TrackerSamplerAlgorithm{
~TrackerSamplerAlgorithm();
+static Ptr<TrackerSamplerAlgorithm> create(const String& trackerSamplerType);
+bool sampling(const Mat& image, Rect boundingBox, vector<Mat>& sample);
}
note bottom: A tracker could sample the target\nor it could sample the target and the background
class TrackerSamplerCS{
TrackerSamplerCS();
~TrackerSamplerCS();
+bool sampling(const Mat& image, Rect boundingBox, vector<Mat>& sample);
}
class TrackerSamplerCSC{
TrackerSamplerCSC();
~TrackerSamplerCSC();
+bool sampling(const Mat& image, Rect boundingBox, vector<Mat>& sample);
}
}
@enduml
......@@ -52,7 +52,7 @@ Long-term optical tracking API
Long-term optical tracking is one of the most important issues for many computer vision applications in
real world scenarios. The development in this area is very fragmented and this API is a unique
interface useful for plugging in several algorithms and comparing them. This work is partially based on
@cite AAM and @cite AMVOT .
These algorithms start from a bounding box of the target and, with their internal representation, they
avoid drift during the tracking. These long-term trackers are able to evaluate online the
......@@ -67,36 +67,15 @@ most likely target states). The class TrackerTargetState represents a possible s
The TrackerSampler and the TrackerFeatureSet are the visual representation of the target, whereas
the TrackerModel is the statistical model.
A recent benchmark between these algorithms can be found in @cite OOT .
UML design: see @ref tracking_diagrams
To see how API works, try tracker demo:
<https://github.com/lenlen/opencv/blob/tracking_api/samples/cpp/tracker.cpp>
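For a quick feel of the interface, here is a minimal sketch based on the Tracker methods shown in
the diagrams (the tracker type string, capture source and initial bounding box are illustrative
placeholders; a Rect2d bounding box is assumed, as in recent versions of the module):
@code{.cpp}
cv::VideoCapture cap(0);
cv::Mat frame;
cap >> frame;
cv::Rect2d boundingBox(100, 100, 80, 80);                   // initial target location
cv::Ptr<cv::Tracker> tracker = cv::Tracker::create("MIL");  // or "BOOSTING", "MEDIANFLOW", "TLD"
tracker->init(frame, boundingBox);
while (cap.read(frame))
{
    if (tracker->update(frame, boundingBox))                // false if the target is lost
        cv::rectangle(frame, cv::Rect(boundingBox), cv::Scalar(0, 255, 0), 2);
    cv::imshow("tracking", frame);
    if (cv::waitKey(1) == 27) break;
}
@endcode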
@note This Tracking API has been designed with PlantUML. If you modify this API please change UML
in <em>modules/tracking/doc/tracking_diagrams.markdown</em>. The following reference was used in the API
Creating Own Tracker
--------------------
......
......@@ -1073,7 +1073,7 @@ class CV_EXPORTS_W TrackerFeatureLBP : public TrackerFeature
background.
Multiple Instance Learning avoids the drift problem for a robust tracking. The implementation is
based on @cite MIL .
Original code can be found here <http://vision.ucsd.edu/~bbabenko/project_miltrack.shtml>
*/
......@@ -1105,7 +1105,7 @@ class CV_EXPORTS_W TrackerMIL : public Tracker
/** @brief This is a real-time object tracking based on a novel on-line version of the AdaBoost algorithm.
The classifier uses the surrounding background as negative examples in update step to avoid the
drifting problem. The implementation is based on @cite OLB .
*/
class CV_EXPORTS_W TrackerBoosting : public Tracker
{
......@@ -1137,7 +1137,7 @@ class CV_EXPORTS_W TrackerBoosting : public Tracker
/** @brief Median Flow tracker implementation.
Implementation of the paper @cite MedianFlow .
The tracker is suitable for very smooth and predictable movements when the object is visible throughout
the whole sequence. It's quite accurate for this type of problem (in particular, it was shown
......@@ -1168,7 +1168,7 @@ tracking, learning and detection.
The tracker follows the object from frame to frame. The detector localizes all appearances that
have been observed so far and corrects the tracker if necessary. The learning estimates detector’s
errors and updates it to avoid these errors in the future. The implementation is based on @cite TLD .
The Median Flow algorithm (see cv::TrackerMedianFlow) was chosen as a tracking component in this
implementation, following the authors. The tracker is supposed to be able to handle rapid motions, partial
......
......@@ -64,7 +64,7 @@ namespace xfeatures2d
//! @addtogroup xfeatures2d_experiment
//! @{
/** @brief Class implementing the FREAK (*Fast Retina Keypoint*) keypoint descriptor, described in @cite AOV12 .
The algorithm proposes a novel keypoint descriptor inspired by the human visual system and more
precisely the retina, coined Fast Retina Keypoint (FREAK). A cascade of binary strings is
......@@ -116,7 +116,7 @@ public:
* BRIEF Descriptor
*/
/** @brief Class for computing BRIEF descriptors described in @cite calon2010 .
@note
- A complete BRIEF extractor sample can be found at
......
......@@ -54,7 +54,7 @@ namespace xfeatures2d
//! @{
/** @brief Class for extracting keypoints and computing descriptors using the Scale Invariant Feature Transform
(SIFT) algorithm by D. Lowe @cite Lowe04 .
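A brief usage sketch (an illustrative assumption following the common Feature2D workflow, not part
of the original documentation; `image` is a placeholder input):
@code{.cpp}
cv::Ptr<cv::xfeatures2d::SIFT> sift = cv::xfeatures2d::SIFT::create();
std::vector<cv::KeyPoint> keypoints;
cv::Mat descriptors;
sift->detectAndCompute(image, cv::noArray(), keypoints, descriptors);
@endcode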
*/
class CV_EXPORTS_W SIFT : public Feature2D
{
......@@ -84,7 +84,7 @@ public:
typedef SIFT SiftFeatureDetector;
typedef SIFT SiftDescriptorExtractor;
/** @brief Class for extracting Speeded Up Robust Features from an image @cite Bay06 .
The algorithm parameters:
- member int extended
......
......@@ -46,3 +46,12 @@
year={2010},
publisher={Springer}
}
@inproceedings{Lim2013,
title={Sketch tokens: A learned mid-level representation for contour and object detection},
author={Lim, Joseph J and Zitnick, C Lawrence and Doll{\'a}r, Piotr},
booktitle={Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on},
pages={3158--3165},
year={2013},
organization={IEEE}
}
......@@ -61,7 +61,7 @@ enum EdgeAwareFiltersList
/** @brief Interface for realizations of Domain Transform filter.
For more details about this filter see @cite Gastal11 .
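A short usage sketch (hedged: createDTFilter and DTFilter::filter are assumed to follow their usual
signatures; guide, src and dst are placeholders):
@code{.cpp}
cv::Ptr<cv::ximgproc::DTFilter> dtf = cv::ximgproc::createDTFilter(guide, 30.0, 10.0);
dtf->filter(src, dst);
@endcode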
*/
class CV_EXPORTS_W DTFilter : public Algorithm
{
......@@ -125,7 +125,7 @@ void dtFilter(InputArray guide, InputArray src, OutputArray dst, double sigmaSpa
/** @brief Interface for realizations of Guided Filter.
For more details about this filter see @cite Kaiming10 .
*/
class CV_EXPORTS_W GuidedFilter : public Algorithm
{
......@@ -153,7 +153,7 @@ channels then only first 3 channels will be used.
@param eps regularization term of Guided Filter. \f${eps}^2\f$ is similar to the sigma in the color
space into bilateralFilter.
For more details about Guided Filter parameters, see the original article @cite Kaiming10 .
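A short usage sketch (hedged: GuidedFilter::filter is assumed to take source and destination images
as in the other filter interfaces of this module; guide, src and dst are placeholders):
@code{.cpp}
cv::Ptr<cv::ximgproc::GuidedFilter> gf = cv::ximgproc::createGuidedFilter(guide, 8, 0.02);
gf->filter(src, dst);
@endcode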
*/
CV_EXPORTS_W Ptr<GuidedFilter> createGuidedFilter(InputArray guide, int radius, double eps);
......@@ -228,7 +228,7 @@ bilateralFilter.
@param adjust_outliers optional, specify whether to perform the outliers adjust operation or not, (Eq. 9) in the
original paper.
For more details about Adaptive Manifold Filter parameters, see the original article @cite Gastal12 .
@note Joint images with CV_8U and CV_16U depth converted to images with CV_32F depth and [0; 1]
color range before processing. Hence color space sigma sigma_r must be in [0; 1] range, unlike same
......
......@@ -54,7 +54,7 @@ namespace ximgproc
//! @{
/** @brief Class implementing the SEEDS (Superpixels Extracted via Energy-Driven Sampling) superpixels
algorithm described in @cite VBRV14 .
The algorithm uses an efficient hill-climbing algorithm to optimize the superpixels' energy
function that is based on color histograms and a boundary term, which is optional. The energy
......
Structured forests for fast edge detection {#tutorial_ximgproc_prediction}
==========================================
Introduction
------------
In this tutorial you will learn how to use structured forests for the purpose of edge detection in
an image.
Examples
--------
![image](images/01.jpg)
![image](images/02.jpg)
![image](images/03.jpg)
![image](images/04.jpg)
![image](images/05.jpg)
![image](images/06.jpg)
![image](images/07.jpg)
![image](images/08.jpg)
![image](images/09.jpg)
![image](images/10.jpg)
![image](images/11.jpg)
![image](images/12.jpg)
@note binarization techniques like the Canny edge detector are applicable to edges produced by both
algorithms (Sobel and StructuredEdgeDetection::detectEdges).
Source Code
-----------
@includelineno ximgproc/samples/structured_edge_detection.cpp
Explanation
-----------
-# **Load source color image**
@code{.cpp}
cv::Mat image = cv::imread(inFilename, 1);
if ( image.empty() )
{
printf("Cannot read image file: %s\n", inFilename.c_str());
return -1;
}
@endcode
-# **Convert source image to [0;1] range**
@code{.cpp}
image.convertTo(image, cv::DataType<float>::type, 1/255.0);
@endcode
-# **Run main algorithm**
@code{.cpp}
cv::Mat edges(image.size(), image.type());
cv::Ptr<StructuredEdgeDetection> pDollar =
cv::createStructuredEdgeDetection(modelFilename);
pDollar->detectEdges(image, edges);
@endcode
-# **Show results**
@code{.cpp}
if ( outFilename == "" )
{
cv::namedWindow("edges", 1);
cv::imshow("edges", edges);
cv::waitKey(0);
}
else
cv::imwrite(outFilename, 255*edges);
@endcode
Literature
----------
For more information, refer to the following papers : @cite Dollar2013 @cite Lim2013
function modelConvert(model, outname)
%% script for converting Piotr's matlab model into YAML format
outfile = fopen(outname, 'w');
fprintf(outfile, '%%YAML:1.0\n\n');
fprintf(outfile, ['options:\n'...
' numberOfTrees: 8\n'...
' numberOfTreesToEvaluate: 4\n'...
' selfsimilarityGridSize: 5\n'...
' stride: 2\n'...
' shrinkNumber: 2\n'...
' patchSize: 32\n'...
' patchInnerSize: 16\n'...
' numberOfGradientOrientations: 4\n'...
' gradientSmoothingRadius: 0\n'...
' regFeatureSmoothingRadius: 2\n'...
' ssFeatureSmoothingRadius: 8\n'...
' gradientNormalizationRadius: 4\n\n']);
fprintf(outfile, 'childs:\n');
printToYML(outfile, model.child', 0);
fprintf(outfile, 'featureIds:\n');
printToYML(outfile, model.fids', 0);
fprintf(outfile, 'thresholds:\n');
printToYML(outfile, model.thrs', 0);
N = 1000;
fprintf(outfile, 'edgeBoundaries:\n');
printToYML(outfile, model.eBnds, N);
fprintf(outfile, 'edgeBins:\n');
printToYML(outfile, model.eBins, N);
fclose(outfile);
gzip(outname);
end
function printToYML(outfile, A, N)
%% append matrix A to outfile as
%% - [a11, a12, a13, a14, ..., a1n]
%% - [a21, a22, a23, a24, ..., a2n]
%% ...
%%
%% if size(A, 2) == 1, A is printed N elements per row
if (length(size(A)) ~= 2)
error('printToYML: second-argument matrix should have two dimensions');
end
if (size(A,2) ~= 1)
for i=1:size(A,1)
fprintf(outfile, ' - [');
fprintf(outfile, '%d,', A(i, 1:end-1));
fprintf(outfile, '%d]\n', A(i, end));
end
else
len = length(A);
for i=1:ceil(len/N)
first = (i-1)*N + 1;
last = min(i*N, len) - 1;
fprintf(outfile, ' - [');
fprintf(outfile, '%d,', A(first:last));
fprintf(outfile, '%d]\n', A(last + 1));
end
end
fprintf(outfile, '\n');
end
\ No newline at end of file
Structured forest training {#tutorial_ximgproc_training}
==========================
Introduction
------------
In this tutorial we show how to train your own structured forest using the author's initial Matlab
implementation.
Training pipeline
-----------------
-# Download "Piotr's Toolbox" from [link](http://vision.ucsd.edu/~pdollar/toolbox/doc/index.html)
and put it into separate directory, e.g. PToolbox
-# Download the BSDS500 dataset from
link \<http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/BSR/\> and put it into a
separate directory named exactly BSR
-# Add both directories and their subdirectories to the Matlab path.
-# Download the detector code from
link \<http://research.microsoft.com/en-us/downloads/389109f6-b4e8-404c-84bf-239f7cbf4e3d/\> and
put it into the root directory. Now you should have :
@code
.
BSR
PToolbox
models
private
Contents.m
edgesChns.m
edgesDemo.m
edgesDemoRgbd.m
edgesDetect.m
edgesEval.m
edgesEvalDir.m
edgesEvalImg.m
edgesEvalPlot.m
edgesSweeps.m
edgesTrain.m
license.txt
readme.txt
@endcode
-# Rename models/forest/modelFinal.mat to models/forest/modelFinal.mat.backup
-# Open edgesChns.m and comment out lines 26--41. Add the following after the commented lines:
@code{.cpp}
shrink=opts.shrink;
chns = single(getFeatures( im2double(I) ));
@endcode
-# Now it is time to compile the promised getFeatures. I do it with the following code:
@code{.cpp}
#include <cv.h>
#include <highgui.h>
#include <mat.h>
#include <mex.h>
#include "MxArray.hpp" // https://github.com/kyamagu/mexopencv
class NewRFFeatureGetter : public cv::RFFeatureGetter
{
public:
NewRFFeatureGetter() : name("NewRFFeatureGetter"){}
virtual void getFeatures(const cv::Mat &src, NChannelsMat &features,
const int gnrmRad, const int gsmthRad,
const int shrink, const int outNum, const int gradNum) const
{
// here your feature extraction code, the default one is:
// resulting features Mat should be n-channels, floating point matrix
}
protected:
cv::String name;
};
MEXFUNCTION_LINKAGE void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
if (nlhs != 1) mexErrMsgTxt("nlhs != 1");
if (nrhs != 1) mexErrMsgTxt("nrhs != 1");
cv::Mat src = MxArray(prhs[0]).toMat();
src.convertTo(src, cv::DataType<float>::type);
cv::Ptr<NewRFFeatureGetter> pDollar = cv::makePtr<NewRFFeatureGetter>(); // feature getter defined above
cv::Mat edges;
pDollar->getFeatures(src, edges, 4, 0, 2, 13, 4);
// you can use other numbers here
edges.convertTo(edges, cv::DataType<double>::type);
plhs[0] = MxArray(edges);
}
@endcode
-# Place the compiled mex file into the root dir and run edgesDemo. You will need to wait a couple of hours;
after that the new model will appear inside models/forest/.
-# The final step is converting the trained model from Matlab binary format to YAML which you can use
with our cv::StructuredEdgeDetection. For this purpose run
opencv_contrib/ximgproc/tutorials/scripts/modelConvert(model, "model.yml")
How to use your model
---------------------
Just use the expanded constructor with the above-defined class NewRFFeatureGetter
@code{.cpp}
cv::Ptr<cv::StructuredEdgeDetection> pDollar =
    cv::createStructuredEdgeDetection( modelName, cv::makePtr<NewRFFeatureGetter>() );
@endcode
......@@ -131,7 +131,7 @@ struct CV_EXPORTS WaldBoostParams
{}
};
/** @brief WaldBoost object detector from @cite Sochman05 .
*/
class CV_EXPORTS WaldBoost : public Algorithm
{
......@@ -190,7 +190,7 @@ struct CV_EXPORTS ICFDetectorParams
{}
};
/** @brief Integral Channel Features from @cite Dollar09 .
*/
class CV_EXPORTS ICFDetector
{
......