Commit 7af368f7 authored by Dmitriy Anisimov's avatar Dmitriy Anisimov

updated documentation

parent 38168334
......@@ -4,16 +4,24 @@ datasets. Framework for working with different datasets
.. highlight:: cpp
The datasets module includes classes for working with different datasets: loading data, evaluating different algorithms on them, running benchmarks, etc.
It is planned to have:
* basic: loading code for all datasets, to help start working with them (see the sketch below).
* next stage: quick benchmarks for all datasets, showing how to solve them using OpenCV, plus evaluation code.
* finally: implementations in OpenCV of state-of-the-art algorithms that solve these tasks.
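All dataset classes share the same loading interface: an instance is obtained from a static create() method, the annotation files are parsed with load(), and the items are then available through getTrain()/getTest()/getValidation(), as in the bundled example_datasets_* samples. The sketch below is a minimal illustration of this pattern for AR_hmdb; the path is assumed to point at the unpacked dataset folders.

.. code-block:: cpp

    #include <opencv2/datasets/ar_hmdb.hpp>

    #include <cstdio>
    #include <string>

    using namespace cv;
    using namespace cv::datasets;

    int main(int argc, char *argv[])
    {
        // path to the unpacked dataset, e.g. /home/user/path_to_unpacked_folders/
        std::string path = (argc > 1) ? argv[1] : "./";

        // create the dataset object and parse the annotation files
        Ptr<AR_hmdb> dataset = AR_hmdb::create();
        dataset->load(path);

        // the parsed items are exposed as generic Object pointers
        printf("train size: %u\n", (unsigned)dataset->getTrain().size());
        printf("test size:  %u\n", (unsigned)dataset->getTest().size());

        return 0;
    }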
.. toctree::
:hidden:
datasets/ar_hmdb
datasets/ar_sports
datasets/fr_adience
datasets/fr_lfw
datasets/gr_chalearn
datasets/gr_skig
datasets/hpe_humaneva
datasets/hpe_parse
datasets/ir_affine
datasets/ir_robot
......@@ -33,14 +41,16 @@ The datasets module includes classes for working with different datasets: load d
Action Recognition
------------------
:doc:`datasets/ar_hmdb` [#f1]_
:doc:`datasets/ar_sports`
Face Recognition
----------------
:doc:`datasets/fr_adience`
:doc:`datasets/fr_lfw` [#f1]_
Gesture Recognition
-------------------
......@@ -52,6 +62,8 @@ Gesture Recognition
Human Pose Estimation
---------------------
:doc:`datasets/hpe_humaneva`
:doc:`datasets/hpe_parse`
Image Registration
......@@ -80,14 +92,14 @@ Object Recognition
:doc:`datasets/or_imagenet`
:doc:`datasets/or_mnist` [#f2]_
:doc:`datasets/or_sun`
Pedestrian Detection
--------------------
:doc:`datasets/pd_caltech` [#f2]_
SLAM
----
......@@ -101,5 +113,9 @@ Text Recognition
:doc:`datasets/tr_chars`
:doc:`datasets/tr_svt` [#f1]_
*Footnotes*
.. [#f1] Benchmark implemented
.. [#f2] Not used in Vision Challenge
......@@ -17,7 +17,7 @@ _`"HMDB: A Large Human Motion Database"`: http://serre-lab.clps.brown.edu/resour
Benchmark
"""""""""
A benchmark was implemented for this dataset; it gives an accuracy of 0.107407 (using the precomputed HOG/HOF "STIP" features from the site, averaged over 3 splits).
To run this benchmark, execute:
......@@ -27,3 +27,10 @@ To run this benchmark execute:
(precomputed features should be unpacked in the same folder: /home/user/path_to_unpacked_folders/hmdb51_org_stips/)
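For orientation, the sketch below shows the split averaging this benchmark performs, using the module's generic getNumSplits()/getTrain(split)/getTest(split) accessors; trainAndEvaluate() is a placeholder for a user-supplied classifier over the precomputed STIP features, not a function of the module.

.. code-block:: cpp

    #include <opencv2/datasets/ar_hmdb.hpp>

    #include <cstdio>
    #include <string>
    #include <vector>

    using namespace cv;
    using namespace cv::datasets;

    // Placeholder: train on one split and return its accuracy.
    // The shipped benchmark does this with the precomputed STIP features.
    static double trainAndEvaluate(const std::vector< Ptr<Object> > &train,
                                   const std::vector< Ptr<Object> > &test)
    {
        (void)train; (void)test;
        return 0.0; // user-supplied evaluation goes here
    }

    int main(int argc, char *argv[])
    {
        std::string path = (argc > 1) ? argv[1] : "./";

        Ptr<AR_hmdb> dataset = AR_hmdb::create();
        dataset->load(path);

        // average the per-split accuracy over all splits (3 for HMDB51)
        double sum = 0.0;
        int numSplits = dataset->getNumSplits();
        for (int s = 0; s < numSplits; ++s)
            sum += trainAndEvaluate(dataset->getTrain(s), dataset->getTest(s));
        printf("mean accuracy over %d splits: %f\n", numSplits, sum / numSplits);

        return 0;
    }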
**References:**
.. [Kuehne11] H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. HMDB: A Large Video Database for Human Motion Recognition. ICCV, 2011
.. [Laptev08] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning Realistic Human Actions From Movies. CVPR, 2008
......@@ -12,3 +12,7 @@ _`"Sports-1M Dataset"`: http://cs.stanford.edu/people/karpathy/deepvideo/
2. To load data run: ./opencv/build/bin/example_datasets_ar_sports -p=/home/user/path_to_downloaded_folders/
**References:**
.. [KarpathyCVPR14] Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei. Large-scale Video Classification with Convolutional Neural Networks. CVPR, 2014
Adience
=======
.. ocv:class:: FR_adience
Implements loading dataset:
_`"Adience"`: http://www.openu.ac.il/home/hassner/Adience/data.html
.. note:: Usage
1. From link above download any dataset file: faces.tar.gz\\aligned.tar.gz and the files with splits: fold_0_data.txt-fold_4_data.txt, fold_frontal_0_data.txt-fold_frontal_4_data.txt. (For the face recognition task, different splits should be created.)
2. Unpack the dataset file to some folder and place the split files in the same folder.
3. To load data run: ./opencv/build/bin/example_datasets_fr_adience -p=/home/user/path_to_created_folder/
**References:**
.. [Eidinger] E. Eidinger, R. Enbar, and T. Hassner. Age and Gender Estimation of Unfiltered Faces
......@@ -8,7 +8,7 @@ _`"Labeled Faces in the Wild"`: http://vis-www.cs.umass.edu/lfw/
.. note:: Usage
1. From link above download any dataset file: lfw.tgz\\lfwa.tar.gz\\lfw-deepfunneled.tgz\\lfw-funneled.tgz and the files with pairs: the 10 test splits pairs.txt and the developer train split pairsDevTrain.txt.
2. Unpack the dataset file and place pairs.txt and pairsDevTrain.txt in the created folder.
......@@ -17,7 +17,7 @@ _`"Labeled Faces in the Wild"`: http://vis-www.cs.umass.edu/lfw/
Benchmark
"""""""""
A benchmark was implemented for this dataset; it gives an accuracy of 0.623833 +- 0.005223 (train split: pairsDevTrain.txt, dataset: lfwa).
To run this benchmark, execute:
......@@ -25,3 +25,7 @@ To run this benchmark execute:
./opencv/build/bin/example_datasets_fr_lfw_benchmark -p=/home/user/path_to_unpacked_folder/lfw2/
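After loading, the generic Object pointers can be downcast to the dataset-specific type to reach its fields. The sketch below assumes the FR_lfwObj type with image1/image2/same members, following the current module sources; treat the field names as assumptions and check fr_lfw.hpp for your version.

.. code-block:: cpp

    #include <opencv2/datasets/fr_lfw.hpp>

    #include <cstdio>
    #include <string>
    #include <vector>

    using namespace cv;
    using namespace cv::datasets;

    int main(int argc, char *argv[])
    {
        // path to the unpacked folder, e.g. /home/user/path_to_unpacked_folder/lfw2/
        std::string path = (argc > 1) ? argv[1] : "./";

        Ptr<FR_lfw> dataset = FR_lfw::create();
        dataset->load(path);

        // each test item of split 0 is a pair of image paths plus a same/different flag
        const std::vector< Ptr<Object> > &pairs = dataset->getTest(0);
        for (size_t i = 0; i < pairs.size() && i < 5; ++i)
        {
            FR_lfwObj *pair = static_cast<FR_lfwObj *>(pairs[i].get());
            printf("%s vs %s -> %s\n", pair->image1.c_str(), pair->image2.c_str(),
                   pair->same ? "same person" : "different people");
        }

        return 0;
    }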
**References:**
.. [Huang07] G.B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. 2007
......@@ -16,3 +16,7 @@ _`"ChaLearn Looking at People"`: http://gesture.chalearn.org/
4. To load data run: ./opencv/build/bin/example_datasets_gr_chalearn -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Escalera14] S. Escalera, X. Baró, J. Gonzàlez, M.A. Bautista, M. Madadi, M. Reyes, V. Ponce-López, H.J. Escalante, J. Shotton, I. Guyon, "ChaLearn Looking at People Challenge 2014: Dataset and Results", ECCV Workshops, 2014
......@@ -14,3 +14,7 @@ _`"Sheffield Kinect Gesture Dataset"`: http://lshao.staff.shef.ac.uk/data/Sheffi
3. To load data run: ./opencv/build/bin/example_datasets_gr_skig -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Liu13] L. Liu and L. Shao, “Learning Discriminative Representations from RGB-D Video Data”, In Proc. International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013.
HumanEva Dataset
================
.. ocv:class:: HPE_humaneva
Implements loading dataset:
_`"HumanEva Dataset"`: http://humaneva.is.tue.mpg.de
.. note:: Usage
1. From link above download dataset files for HumanEva-I (tar) & HumanEva-II.
2. Unpack them to HumanEva_1 & HumanEva_2 respectively.
3. To load data run: ./opencv/build/bin/example_datasets_hpe_humaneva -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Sigal10] L. Sigal, A. Balan and M. J. Black. HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion, In International Journal of Computer Vision, Vol. 87 (1-2), 2010
.. [Sigal06] L. Sigal and M. J. Black. HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion, Technical Report CS-06-08, Brown University, 2006
......@@ -14,3 +14,7 @@ _`"PARSE Dataset"`: http://www.ics.uci.edu/~dramanan/papers/parse/
3. To load data run: ./opencv/build/bin/example_datasets_hpe_parse -p=/home/user/path_to_unpacked_folder/people_all/
**References:**
.. [Ramanan06] D. Ramanan. Learning to Parse Images of Articulated Bodies. Neural Information Processing Systems (NIPS), December 2006.
......@@ -14,3 +14,7 @@ _`"Affine Covariant Regions Datasets"`: http://www.robots.ox.ac.uk/~vgg/data/dat
3. To load data, for example, for "bark", run: ./opencv/build/bin/example_datasets_ir_affine -p=/home/user/path_to_unpacked_folder/bark/
**References:**
.. [Mikolajczyk05] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool. A Comparison of Affine Region Detectors. International Journal of Computer Vision, Volume 65, Number 1/2, pages 43-72, 2005
......@@ -4,12 +4,17 @@ Robot Data Set
Implements loading dataset:
_`"Robot Data Set"`: http://roboimagedata.compute.dtu.dk/?page_id=24
_`"Robot Data Set, Point Feature Data Set – 2010"`: http://roboimagedata.compute.dtu.dk/?page_id=24
.. note:: Usage
1. From link above download dataset files: SET001_6.tar.gz-SET055_60.tar.gz
2. Unpack them to one folder.
3. To load data run: ./opencv/build/bin/example_datasets_ir_robot -p=/home/user/path_to_unpacked_folder/
**References:**
.. [aanæsinteresting] H. Aanæs, A.L. Dahl, and K. Steenstrup Pedersen. Interesting Interest Points. International Journal of Computer Vision, 2012.
......@@ -14,3 +14,7 @@ _`"The Berkeley Segmentation Dataset and Benchmark"`: https://www.eecs.berkeley.
3. To load data run: ./opencv/build/bin/example_datasets_is_bsds -p=/home/user/path_to_unpacked_folder/BSDS300/
**References:**
.. [MartinFTM01] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. 2001
......@@ -14,3 +14,7 @@ _`"Weizmann Segmentation Evaluation Database"`: http://www.wisdom.weizmann.ac.il
3. To load data, for example, for 1 object dataset, run: ./opencv/build/bin/example_datasets_is_weizmann -p=/home/user/path_to_unpacked_folder/1obj/
**References:**
.. [AlpertGBB07] Sharon Alpert, Meirav Galun, Ronen Basri, and Achi Brandt. Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration. 2007
......@@ -14,3 +14,7 @@ _`"EPFL Multi-View Stereo"`: http://cvlabwww.epfl.ch/~strecha/multiview/denseMVS
3. To load data, for example, for "fountain", run: ./opencv/build/bin/example_datasets_msm_epfl -p=/home/user/path_to_unpacked_folder/fountain/
**References:**
.. [Strecha08] C. Strecha, W. von Hansen, L. Van Gool, P. Fua, U. Thoennessen. On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery. CVPR, 2008
......@@ -14,3 +14,7 @@ _`"Stereo – Middlebury Computer Vision"`: http://vision.middlebury.edu/mview/
3. To load data, for example "temple" dataset, run: ./opencv/build/bin/example_datasets_msm_middlebury -p=/home/user/path_to_unpacked_folder/temple/
**References:**
.. [Seitz06] S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, R. Szeliski. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, CVPR, 2006
......@@ -6,13 +6,34 @@ Implements loading dataset:
_`"ImageNet"`: http://www.image-net.org/
.. note:: Usage
1. From link above download the dataset files: ILSVRC2010_images_train.tar\\ILSVRC2010_images_test.tar\\ILSVRC2010_images_val.tar and the devkit: ILSVRC2010_devkit-1.0.tar.gz. (Loading of the 2010 dataset is implemented because only this release has ground truth for the test data; the structure of ILSVRC2014 is similar.)
2. Unpack them to: some_folder/train/\\some_folder/test/\\some_folder/val and the ground truth files to: some_folder/ILSVRC2010_validation_ground_truth.txt\\some_folder/ILSVRC2010_test_ground_truth.txt.
3. Create a file with labels: some_folder/labels.txt, for example using the :ref:`python script <python-script>` below (each row has the format synset,labelID,description, e.g. "n07751451,18,plum").
4. Unpack all tar files in train/.
5. To load data run: ./opencv/build/bin/example_datasets_or_imagenet -p=/home/user/some_folder/
.. _python-script:
Python script to parse meta.mat:
::

    import scipy.io

    # load the ILSVRC2010 devkit metadata
    meta_mat = scipy.io.loadmat("devkit-1.0/data/meta.mat")

    # map each synset id (e.g. "n07751451") to its 0-based label id and its description
    labels_dic = dict((m[0][1][0], m[0][0][0][0]-1) for m in meta_mat['synsets'])
    label_names_dic = dict((m[0][1][0], m[0][2][0]) for m in meta_mat['synsets'])

    # print rows in the labels.txt format: synset,labelID,description
    for label in labels_dic.keys():
        print("{0},{1},{2}".format(label, labels_dic[label], label_names_dic[label]))
**References:**
.. [ILSVRCarxiv14] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. 2014
......@@ -14,3 +14,7 @@ _`"MNIST"`: http://yann.lecun.com/exdb/mnist/
3. To load data run: ./opencv/build/bin/example_datasets_or_mnist -p=/home/user/path_to_unpacked_files/
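After loading, each item carries the decoded 28x28 image and its digit label; the sketch below flattens the training images into the one-row-per-sample layout expected by cv::ml classifiers. The OR_mnistObj field names (image, label) follow the current or_mnist.hpp and should be verified against your OpenCV version.

.. code-block:: cpp

    #include <opencv2/datasets/or_mnist.hpp>
    #include <opencv2/core.hpp>

    #include <cstdio>
    #include <string>
    #include <vector>

    using namespace cv;
    using namespace cv::datasets;

    int main(int argc, char *argv[])
    {
        // path to the folder with the unpacked MNIST files
        std::string path = (argc > 1) ? argv[1] : "./";

        Ptr<OR_mnist> dataset = OR_mnist::create();
        dataset->load(path);

        const std::vector< Ptr<Object> > &train = dataset->getTrain();
        Mat samples((int)train.size(), 28 * 28, CV_32F);
        Mat responses((int)train.size(), 1, CV_32S);
        for (size_t i = 0; i < train.size(); ++i)
        {
            OR_mnistObj *item = static_cast<OR_mnistObj *>(train[i].get());
            Mat row = item->image.reshape(1, 1);                // 1 x 784, CV_8U
            row.convertTo(samples.row((int)i), CV_32F, 1.0 / 255.0);
            responses.at<int>((int)i, 0) = item->label;
        }
        printf("prepared %d training samples\n", samples.rows);

        return 0;
    }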
**References:**
.. [LeCun98a] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
......@@ -4,15 +4,19 @@ SUN Database
Implements loading dataset:
_`"SUN Database"`: http://sundatabase.mit.edu/
Currently implemented loading "Scene Recognition Benchmark. SUN397". Planned to implement also "Object Detection Benchmark. SUN2012".
_`"SUN Database, Scene Recognition Benchmark. SUN397"`: http://vision.cs.princeton.edu/projects/2010/SUN/
.. note:: Usage
1. From link above download dataset file: SUN397.tar & file with splits: Partitions.zip
2. Unpack SUN397.tar into folder: SUN397/ & Partitions.zip into folder: SUN397/Partitions/
3. To load data run: ./opencv/build/bin/example_datasets_or_sun -p=/home/user/path_to_unpacked_files/SUN397/
**References:**
.. [Xiao10] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN Database: Large-scale Scene Recognition from Abbey to Zoo. IEEE Conference on Computer Vision and Pattern Recognition. CVPR, 2010
.. [Xiao14] J. Xiao, K. A. Ehinger, J. Hays, A. Torralba, and A. Oliva. SUN Database: Exploring a Large Collection of Scene Categories. International Journal of Computer Vision. IJCV, 2014
......@@ -21,3 +21,9 @@ _`"Caltech Pedestrian Detection Benchmark"`: http://www.vision.caltech.edu/Image
3. To load data run: ./opencv/build/bin/example_datasets_pd_caltech -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Dollár12] P. Dollár, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: An Evaluation of the State of the Art. PAMI, 2012.
.. [DollárCVPR09] P. Dollár, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: A Benchmark. CVPR, 2009
......@@ -14,3 +14,11 @@ _`"KITTI Vision Benchmark"`: http://www.cvlibs.net/datasets/kitti/eval_odometry.
3. To load data run: ./opencv/build/bin/example_datasets_slam_kitti -p=/home/user/path_to_unpacked_folder/dataset/
**References:**
.. [Geiger2012CVPR] Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. CVPR, 2012
.. [Geiger2013IJRR] Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. Vision meets Robotics: The KITTI Dataset. IJRR, 2013
.. [Fritsch2013ITSC] Jannik Fritsch, Tobias Kuehnl, and Andreas Geiger. A New Performance Measure and Evaluation Benchmark for Road Detection Algorithms. ITSC, 2013
......@@ -14,3 +14,7 @@ _`"TUMindoor Dataset"`: http://www.navvis.lmt.ei.tum.de/dataset/
3. To load each dataset run: ./opencv/build/bin/example_datasets_slam_tumindoor -p=/home/user/path_to_unpacked_folders/
**References:**
.. [TUMindoor] R. Huitl, G. Schroth, S. Hilsenbeck, F. Schweiger, and E. Steinbach. TUMindoor: An Extensive Image and Point Cloud Dataset for Visual Indoor Localization and Mapping. 2012
......@@ -16,3 +16,7 @@ _`"The Chars74K Dataset"`: http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
4. To load data, for example "EnglishImg", run: ./opencv/build/bin/example_datasets_tr_chars -p=/home/user/path_to_unpacked_folder/English/
**References:**
.. [Campos09] T. E. de Campos, B. R. Babu and M. Varma. Character recognition in natural images. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), 2009
......@@ -14,3 +14,20 @@ _`"The Street View Text Dataset"`: http://vision.ucsd.edu/~kai/svt/
3. To load data run: ./opencv/build/bin/example_datasets_tr_svt -p=/home/user/path_to_unpacked_folder/svt/svt1/
Benchmark
"""""""""
A benchmark was implemented for this dataset; it gives an accuracy (mean f1) of 0.217.
To run this benchmark, execute:
.. code-block:: bash
./opencv/build/bin/example_datasets_tr_svt_benchmark -p=/home/user/path_to_unpacked_folders/svt/svt1/
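For reference, the f1 score combines precision and recall as below; "mean f1" here is read as this score averaged over the test images (an interpretation of the figure above, not a statement of the benchmark's exact implementation).

.. math::

    f_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}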
**References:**
.. [Wang11] Kai Wang, Boris Babenko and Serge Belongie. End-to-end Scene Text Recognition. ICCV, 2011
.. [Wang10] Kai Wang and Serge Belongie. Word Spotting in the Wild. ECCV, 2010