erGrouping now uses a classifier for group validation instead of a set of heuristic if-conditions.

Updated documentation and sample to use the new function API

Commit 047b568f authored Sep 27, 2013 by lluis
13 changed files
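
The new `minProbability` argument sets the minimum classifier probability a candidate group needs in order to be reported. The sketch below illustrates that acceptance step only. It is not the code from `erfilter.cpp`, whose body is not shown in this diff. `Box`, `GroupCandidate` and `acceptGroups` are hypothetical stand-ins so the example compiles without OpenCV, and the inclusive comparison (`>=`) is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for cv::Rect, used only for illustration.
struct Box { int x, y, width, height; };

// Hypothetical candidate: a bounding box plus the probability that a
// group classifier assigned to it.
struct GroupCandidate {
    Box box;
    float probability;
};

// Keep only the candidates whose probability reaches the threshold.
// Assumption: the comparison is inclusive; the diff does not show
// whether erGrouping uses >= or >.
std::vector<Box> acceptGroups(const std::vector<GroupCandidate>& candidates,
                              float minProbability)
{
    assert(minProbability >= 0.0f && minProbability <= 1.0f);
    std::vector<Box> accepted;
    for (const GroupCandidate& c : candidates)
        if (c.probability >= minProbability)
            accepted.push_back(c.box);
    return accepted;
}
```

With a threshold of 0.5, the value the updated sample passes to `erGrouping`, a candidate scored 0.3 would be dropped and one scored 0.9 would be kept.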
--- a/modules/objdetect/doc/erfilter.rst
+++ b/modules/objdetect/doc/erfilter.rst
@@ -198,12 +198,14 @@ erGrouping
 ----------
 Find groups of Extremal Regions that are organized as text blocks.

-.. ocv:function:: void erGrouping( InputArrayOfArrays src, std::vector<std::vector<ERStat> > &regions, std::vector<Rect> &groups )
+.. ocv:function:: void erGrouping( InputArrayOfArrays src, std::vector<std::vector<ERStat> > &regions, const std::string& filename, float minProbability, std::vector<Rect > &groups)

    :param src: Vector of sinle channel images CV_8UC1 from wich the regions were extracted
    :param regions: Vector of ER's retreived from the ERFilter algorithm from each channel
+    :param filename: The XML or YAML file with the classifier model (e.g. trained_classifier_erGrouping.xml)
+    :param minProbability: The minimum probability for accepting a group
    :param groups: The output of the algorithm are stored in this parameter as list of rectangles.

 This function implements the grouping algorithm described in [Gomez13]. Notice that this implementation constrains the results to horizontally-aligned text and latin script (since ERFilter classifiers are trained only for latin script detection).

-The algorithm combines two different clustering techniques in a single parameter-free procedure to detect groups of regions organized as text. The maximally meaningful groups are fist detected in several feature spaces, where each feature space is a combination of proximity information (x,y coordinates) and a similarity measure (intensity, color, size, gradient magnitude, etc.), thus providing a set of hypotheses of text groups. Evidence Accumulation framework is used to combine all these hypotheses to get the final estimate. Each of the resulting groups are finally heuristically validated in order to assess if they form a valid horizontally-aligned text block.
+The algorithm combines two different clustering techniques in a single parameter-free procedure to detect groups of regions organized as text. The maximally meaningful groups are first detected in several feature spaces, where each feature space is a combination of proximity information (x,y coordinates) and a similarity measure (intensity, color, size, gradient magnitude, etc.), thus providing a set of hypotheses of text groups. An Evidence Accumulation framework is used to combine all these hypotheses to get the final estimate. Each of the resulting groups is finally validated using a classifier in order to assess if it forms a valid horizontally-aligned text block.
--- a/modules/objdetect/include/opencv2/objdetect/erfilter.hpp
+++ b/modules/objdetect/include/opencv2/objdetect/erfilter.hpp
@@ -250,14 +250,17 @@ CV_EXPORTS void computeNMChannels(InputArray _src, OutputArrayOfArrays _channels
    (x,y coordinates) and a similarity measure (intensity, color, size, gradient magnitude, etc.),
    thus providing a set of hypotheses of text groups. Evidence Accumulation framework is used to
    combine all these hypotheses to get the final estimate. Each of the resulting groups are finally
-    heuristically validated in order to assest if they form a valid horizontally-aligned text block.
+    validated using a classifier in order to assess if they form a valid horizontally-aligned text block.

    \param  src            Vector of sinle channel images CV_8UC1 from wich the regions were extracted.
    \param  regions        Vector of ER's retreived from the ERFilter algorithm from each channel
+    \param  filename       The XML or YAML file with the classifier model (e.g. trained_classifier_erGrouping.xml)
+    \param  minProbability The minimum probability for accepting a group
    \param  groups         The output of the algorithm are stored in this parameter as list of rectangles.
 */
 CV_EXPORTS void erGrouping(InputArrayOfArrays src, std::vector<std::vector<ERStat> > &regions,
-                                                   std::vector<Rect> &groups);
+                                                   const std::string& filename, float minProbability,
+                                                   std::vector<Rect > &groups);

 }
 #endif // _OPENCV_ERFILTER_HPP_
--- a/modules/objdetect/src/erfilter.cpp
+++ b/modules/objdetect/src/erfilter.cpp
--- a/samples/cpp/scenetext.jpg
+++ b/samples/cpp/scenetext.jpg
--- a/samples/cpp/scenetext01.jpg
+++ b/samples/cpp/scenetext01.jpg
--- a/samples/cpp/scenetext02.jpg
+++ b/samples/cpp/scenetext02.jpg
--- a/samples/cpp/scenetext03.jpg
+++ b/samples/cpp/scenetext03.jpg
--- a/samples/cpp/scenetext04.jpg
+++ b/samples/cpp/scenetext04.jpg
--- a/samples/cpp/scenetext05.jpg
+++ b/samples/cpp/scenetext05.jpg
--- a/samples/cpp/scenetext06.jpg
+++ b/samples/cpp/scenetext06.jpg
--- a/samples/cpp/scenetext_GT.png
+++ b/samples/cpp/scenetext_GT.png
--- a/samples/cpp/erfilter.cpp
+++ b/samples/cpp/erfilter.cpp
-
-//--------------------------------------------------------------------------------------------------
-//  A demo program of the Extremal Region Filter algorithm described in
-//  Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012
-//--------------------------------------------------------------------------------------------------
+/*
+ * textdetection.cpp
+ *
+ * A demo program of the Extremal Region Filter algorithm described in
+ * Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012
+ *
+ * Created on: Sep 23, 2013
+ *     Author: Lluis Gomez i Bigorda <lgomez AT cvc.uab.es>
+ */

 #include  "opencv2/opencv.hpp"
 #include  "opencv2/objdetect.hpp"
@@ -18,10 +22,13 @@ using  namespace cv;

 void show_help_and_exit(const char *cmd);
 void groups_draw(Mat &src, vector<Rect> &groups);
-void er_draw(Mat &src, Mat &dst, ERStat& er);
+void er_show(vector<Mat> &channels, vector<vector<ERStat> > &regions);

 int  main(int argc, const char * argv[])
 {
+    cout << endl << argv[0] << endl << endl;
+    cout << "Demo program of the Extremal Region Filter algorithm described in " << endl;
+    cout << "Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012" << endl << endl;

    if (argc < 2) show_help_and_exit(argv[0]);

@@ -37,11 +44,13 @@ int  main(int argc, const char * argv[])
        channels.push_back(255-channels[c]);

    // Create ERFilter objects with the 1st and 2nd stage default classifiers
-    Ptr<ERFilter> er_filter1 = createERFilterNM1(loadClassifierNM1("trained_classifierNM1.xml"),8,0.00025,0.13,0.4,true,0.1);
-    Ptr<ERFilter> er_filter2 = createERFilterNM2(loadClassifierNM2("trained_classifierNM2.xml"),0.3);
+    Ptr<ERFilter> er_filter1 = createERFilterNM1(loadClassifierNM1("trained_classifierNM1.xml"),16,0.00015,0.13,0.2,true,0.1);
+    Ptr<ERFilter> er_filter2 = createERFilterNM2(loadClassifierNM2("trained_classifierNM2.xml"),0.5);

    vector<vector<ERStat> > regions(channels.size());
    // Apply the default cascade classifier to each independent channel (could be done in parallel)
+    cout << "Extracting Class Specific Extremal Regions from " << (int)channels.size() << " channels ..." << endl;
+    cout << "    (...) this may take a while (...)" << endl << endl;
    for (int c=0; c<(int)channels.size(); c++)
    {
        er_filter1->run(channels[c], regions[c]);
@@ -49,13 +58,18 @@ int  main(int argc, const char * argv[])
    }

    // Detect character groups
+    cout << "Grouping extracted ERs ... ";
    vector<Rect> groups;
-    erGrouping(channels, regions, groups);
+    erGrouping(channels, regions, "trained_classifier_erGrouping.xml", 0.5, groups);

    // draw groups
    groups_draw(src, groups);
    imshow("grouping",src);
-    waitKey(-1);
+
+    cout << "Done!" << endl << endl;
+    cout << "Press 'e' to show the extracted Extremal Regions, any other key to exit." << endl << endl;
+    if( waitKey (-1) == 101)
+        er_show(channels,regions);

    // memory clean-up
    er_filter1.release();
@@ -73,9 +87,6 @@ int  main(int argc, const char * argv[])

 void show_help_and_exit(const char *cmd)
 {
-    cout << endl << cmd << endl << endl;
-    cout << "Demo program of the Extremal Region Filter algorithm described in " << endl;
-    cout << "Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012" << endl << endl;
    cout << "    Usage: " << cmd << " <input_image> " << endl;
    cout << "    Default classifier files (trained_classifierNM*.xml) must be in current directory" << endl << endl;
    exit(-1);
@@ -92,14 +103,25 @@ void groups_draw(Mat &src, vector<Rect> &groups)
    }
 }

-void er_draw(Mat &src, Mat &dst, ERStat& er)
+void er_show(vector<Mat> &channels, vector<vector<ERStat> > &regions)
 {
-
-    if (er.parent != NULL) // deprecate the root region
+    for (int c=0; c<(int)channels.size(); c++)
    {
-        int newMaskVal = 255;
-        int flags = 4 + (newMaskVal << 8) + FLOODFILL_FIXED_RANGE + FLOODFILL_MASK_ONLY;
-        floodFill(src,dst,Point(er.pixel%src.cols,er.pixel/src.cols),Scalar(255),0,Scalar(er.level),Scalar(0),flags);
+        Mat dst = Mat::zeros(channels[0].rows+2,channels[0].cols+2,CV_8UC1);
+        for (int r=0; r<(int)regions[c].size(); r++)
+        {
+            ERStat er = regions[c][r];
+            if (er.parent != NULL) // deprecate the root region
+            {
+                int newMaskVal = 255;
+                int flags = 4 + (newMaskVal << 8) + FLOODFILL_FIXED_RANGE + FLOODFILL_MASK_ONLY;
+                floodFill(channels[c],dst,Point(er.pixel%channels[c].cols,er.pixel/channels[c].cols),
+                          Scalar(255),0,Scalar(er.level),Scalar(0),flags);
+            }
+        }
+        char buff[32]; // large enough for "channel " plus any int index
+        snprintf(buff, sizeof(buff), "channel %d", c);
+        imshow(buff, dst);
    }
-
+    waitKey(-1);
 }
--- a/samples/cpp/trained_classifier_erGrouping.xml
+++ b/samples/cpp/trained_classifier_erGrouping.xml
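
The `er_show` helper in the sample above relies on three small pieces of arithmetic: turning the linear `er.pixel` index into image coordinates, packing the `floodFill` flags into one integer, and formatting a per-channel window title. The sketch below isolates them so they can be checked without OpenCV. `PixelPos`, `pixelToPoint`, `maskOnlyFloodFlags` and `channelWindowName` are hypothetical names introduced for illustration, and the two constants are local copies of the values of OpenCV's `FLOODFILL_FIXED_RANGE` and `FLOODFILL_MASK_ONLY`.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Local copies of OpenCV's flood-fill option flags, redefined here so the
// sketch builds without OpenCV.
const int kFloodfillFixedRange = 1 << 16;
const int kFloodfillMaskOnly   = 1 << 17;

// Hypothetical stand-in for cv::Point.
struct PixelPos { int x, y; };

// Convert a row-major linear pixel index into (x, y), the same arithmetic
// the sample applies to er.pixel.
PixelPos pixelToPoint(int pixel, int cols)
{
    assert(cols > 0 && pixel >= 0);
    PixelPos p = { pixel % cols, pixel / cols };
    return p;
}

// Pack flood-fill flags: connectivity in the low bits, the value written
// into the mask in bits 8-15, and the option flags above them.
int maskOnlyFloodFlags(int connectivity, int newMaskVal)
{
    return connectivity + (newMaskVal << 8)
         + kFloodfillFixedRange + kFloodfillMaskOnly;
}

// Build a window title such as "channel 3" with a bounded write, so a
// large channel index cannot overflow the buffer.
std::string channelWindowName(int c)
{
    char buff[32];
    std::snprintf(buff, sizeof(buff), "channel %d", c);
    return std::string(buff);
}
```

For example, `pixelToPoint(23, 10)` yields x = 3, y = 2, and `maskOnlyFloodFlags(4, 255)` reproduces the flag value the sample computes inline.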