Commit 501033db authored by Vadim Pisarevsky

integrated grammar fixes from tech writer (part I)

parent 84e4f597
...@@ -3,12 +3,14 @@ ...@@ -3,12 +3,14 @@
Gradient Boosted Trees Gradient Boosted Trees
====================== ======================
Gradient Boosted Trees (GBT) is a generalized boosting algorithm, introduced by .. highlight:: cpp
Gradient Boosted Trees (GBT) is a generalized boosting algorithm introduced by
Jerome Friedman: http://www.salfordsystems.com/doc/GreedyFuncApproxSS.pdf . Jerome Friedman: http://www.salfordsystems.com/doc/GreedyFuncApproxSS.pdf .
In contrast to AdaBoost.M1 algorithm GBT can deal with both multiclass In contrast to the AdaBoost.M1 algorithm, GBT can deal with both multiclass
classification and regression problems. More than that it can use any classification and regression problems. Moreover, it can use any
differential loss function, some popular ones are implemented. differential loss function, some popular ones are implemented.
Decision trees (:ref:`CvDTree`) usage as base learners allows to process ordered Decision trees (:ocv:class:`CvDTree`) usage as base learners allows to process ordered
and categorical variables. and categorical variables.
...@@ -17,10 +19,10 @@ and categorical variables. ...@@ -17,10 +19,10 @@ and categorical variables.
Training the GBT model Training the GBT model
---------------------- ----------------------
Gradient Boosted Trees model represents an ensemble of single regression trees, Gradient Boosted Trees model represents an ensemble of single regression trees
that are built in a greedy fashion. Training procedure is an iterative process built in a greedy fashion. Training procedure is an iterative process
similar to the numerical optimazation via gradient descent method. Summary loss similar to the numerical optimization via the gradient descent method. Summary loss
on the training set depends only from the current model predictions on the on the training set depends only on the current model predictions for the
training samples, in other words training samples, in other words
:math:`\sum^N_{i=1}L(y_i, F(x_i)) \equiv \mathcal{L}(F(x_1), F(x_2), ... , F(x_N)) :math:`\sum^N_{i=1}L(y_i, F(x_i)) \equiv \mathcal{L}(F(x_1), F(x_2), ... , F(x_N))
\equiv \mathcal{L}(F)`. And the :math:`\mathcal{L}(F)` \equiv \mathcal{L}(F)`. And the :math:`\mathcal{L}(F)`
...@@ -30,12 +32,13 @@ gradient can be computed as follows: ...@@ -30,12 +32,13 @@ gradient can be computed as follows:
grad(\mathcal{L}(F)) = \left( \dfrac{\partial{L(y_1, F(x_1))}}{\partial{F(x_1)}}, grad(\mathcal{L}(F)) = \left( \dfrac{\partial{L(y_1, F(x_1))}}{\partial{F(x_1)}},
\dfrac{\partial{L(y_2, F(x_2))}}{\partial{F(x_2)}}, ... , \dfrac{\partial{L(y_2, F(x_2))}}{\partial{F(x_2)}}, ... ,
\dfrac{\partial{L(y_N, F(x_N))}}{\partial{F(x_N)}} \right) . \dfrac{\partial{L(y_N, F(x_N))}}{\partial{F(x_N)}} \right) .
On every training step a single regression tree is built to predict an
At every training step, a single regression tree is built to predict an
antigradient vector components. Step length is computed corresponding to the antigradient vector components. Step length is computed corresponding to the
loss function and separately for every region determined by the tree leaf, and loss function and separately for every region determined by the tree leaf. It
can be eliminated by changing leaves' values directly. can be eliminated by changing values of the leaves directly.
The main scheme of the training process is shown below. See below the main scheme of the training process:
#. #.
Find the best constant model. Find the best constant model.
...@@ -52,49 +55,50 @@ The main scheme of the training proccess is shown below. ...@@ -52,49 +55,50 @@ The main scheme of the training proccess is shown below.
Add the tree to the model. Add the tree to the model.
The following loss functions are implemented: The following loss functions are implemented for regression problems:
*for regression problems:*
#. *
Squared loss (``CvGBTrees::SQUARED_LOSS``): Squared loss (``CvGBTrees::SQUARED_LOSS``):
:math:`L(y,f(x))=\dfrac{1}{2}(y-f(x))^2` :math:`L(y,f(x))=\dfrac{1}{2}(y-f(x))^2`
#. *
Absolute loss (``CvGBTrees::ABSOLUTE_LOSS``): Absolute loss (``CvGBTrees::ABSOLUTE_LOSS``):
:math:`L(y,f(x))=|y-f(x)|` :math:`L(y,f(x))=|y-f(x)|`
#. *
Huber loss (``CvGBTrees::HUBER_LOSS``): Huber loss (``CvGBTrees::HUBER_LOSS``):
:math:`L(y,f(x)) = \left\{ \begin{array}{lr} :math:`L(y,f(x)) = \left\{ \begin{array}{lr}
\delta\cdot\left(|y-f(x)|-\dfrac{\delta}{2}\right) & : |y-f(x)|>\delta\\ \delta\cdot\left(|y-f(x)|-\dfrac{\delta}{2}\right) & : |y-f(x)|>\delta\\
\dfrac{1}{2}\cdot(y-f(x))^2 & : |y-f(x)|\leq\delta \end{array} \right.`, \dfrac{1}{2}\cdot(y-f(x))^2 & : |y-f(x)|\leq\delta \end{array} \right.`,
where :math:`\delta` is the :math:`\alpha`-quantile estimation of the
where :math:`\delta` is the :math:`\alpha`-quantile estimation of the
:math:`|y-f(x)|`. In the current implementation :math:`\alpha=0.2`. :math:`|y-f(x)|`. In the current implementation :math:`\alpha=0.2`.
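The three regression losses listed above can be written out directly from their formulas. The following is a minimal sketch, not the library implementation: the function names are hypothetical, and ``delta`` is passed in by the caller rather than estimated as a quantile of the residuals.

```cpp
#include <cmath>

// Hypothetical helpers that mirror the formulas in the text.
// They are not part of the CvGBTrees API.

// Squared loss: L(y, f) = 1/2 * (y - f)^2.
double squaredLoss(double y, double f)
{
    const double r = y - f;
    return 0.5 * r * r;
}

// Absolute loss: L(y, f) = |y - f|.
double absoluteLoss(double y, double f)
{
    return std::fabs(y - f);
}

// Huber loss: quadratic for residuals up to delta, linear beyond it.
// delta is assumed to be supplied by the caller.
double huberLoss(double y, double f, double delta)
{
    const double r = std::fabs(y - f);
    if (r > delta)
        return delta * (r - 0.5 * delta);
    return 0.5 * r * r;
}
```

For a residual of 3 and ``delta = 1``, the Huber loss takes the linear branch and grows more slowly than the squared loss, which is why it is less sensitive to outliers.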
*for classification problems:*
4. The following loss functions are implemented for classification problems:
*
Deviance or cross-entropy loss (``CvGBTrees::DEVIANCE_LOSS``): Deviance or cross-entropy loss (``CvGBTrees::DEVIANCE_LOSS``):
:math:`K` functions are built, one function for each output class, and :math:`K` functions are built, one function for each output class, and
:math:`L(y,f_1(x),...,f_K(x)) = -\sum^K_{k=0}1(y=k)\ln{p_k(x)}`, :math:`L(y,f_1(x),...,f_K(x)) = -\sum^K_{k=0}1(y=k)\ln{p_k(x)}`,
where :math:`p_k(x)=\dfrac{\exp{f_k(x)}}{\sum^K_{i=1}\exp{f_i(x)}}` where :math:`p_k(x)=\dfrac{\exp{f_k(x)}}{\sum^K_{i=1}\exp{f_i(x)}}`
is the estimation of the probability that :math:`y=k`. is the estimation of the probability of :math:`y=k`.
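The class probabilities and the deviance loss can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, classes are indexed from 0 to K-1, and the subtraction of the largest score is a common numerical-stability step that the text does not mention.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: p_k = exp(f_k) / sum_i exp(f_i).
// f holds one score per output class.
std::vector<double> classProbabilities(const std::vector<double>& f)
{
    std::vector<double> p(f.size());
    if (f.empty())
        return p;

    // Shifting by the maximum score leaves the ratios unchanged
    // and avoids overflow in exp() (an addition, not from the source).
    const double maxF = *std::max_element(f.begin(), f.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < f.size(); ++i)
    {
        p[i] = std::exp(f[i] - maxF);
        sum += p[i];
    }
    for (std::size_t i = 0; i < p.size(); ++i)
        p[i] /= sum;
    return p;
}

// Deviance loss for one sample whose true class is y (0-based).
// Only the term with y == k survives the indicator, so L = -ln p_y.
double devianceLoss(std::size_t y, const std::vector<double>& f)
{
    return -std::log(classProbabilities(f)[y]);
}
```

With two classes and equal scores, both probabilities are 0.5 and the loss equals ln 2 regardless of the true class.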
In the end we get the model in the following form: As a result, you get the following model:
.. math:: f(x) = f_0 + \nu\cdot\sum^M_{i=1}T_i(x) , .. math:: f(x) = f_0 + \nu\cdot\sum^M_{i=1}T_i(x) ,
where :math:`f_0` is the initial guess (the best constant model) and :math:`\nu`
where :math:`f_0` is an initial guess (the best constant model) and :math:`\nu`
is a regularization parameter from the interval :math:`(0,1]`, further called is a regularization parameter from the interval :math:`(0,1]`, further called
*shrinkage*. *shrinkage*.
.. _Predicting with GBT model: .. _Predicting with GBT model:
Predicting with GBT model Predicting with the GBT Model
------------------------- -------------------------
To get the GBT model prediction it is needed to compute the sum of responses of To get the GBT model prediction, you need to compute the sum of responses of
all the trees in the ensemble. For regression problems it is the answer, and all the trees in the ensemble. For regression problems, it is the answer.
for classification problems the result is :math:`\arg\max_{i=1..K}(f_i(x))`. For classification problems, the result is :math:`\arg\max_{i=1..K}(f_i(x))`.
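The prediction rule described above can be sketched in a few lines. This is a hypothetical illustration, not the ``CvGBTrees::predict`` code: the tree outputs are assumed to be already computed and stored in plain vectors, and class indices are 0-based.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for regression: f(x) = f0 + nu * sum_i T_i(x).
// treeOutputs holds the values T_i(x) of the individual trees.
double gbtResponse(double f0, double shrinkage,
                   const std::vector<double>& treeOutputs)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < treeOutputs.size(); ++i)
        sum += treeOutputs[i];
    return f0 + shrinkage * sum;
}

// Hypothetical helper for classification: the index k (0-based)
// of the largest per-class response f_k(x).
std::size_t gbtClassIndex(const std::vector<double>& f)
{
    std::size_t best = 0;
    for (std::size_t k = 1; k < f.size(); ++k)
        if (f[k] > f[best])
            best = k;
    return best;
}
```

For classification, ``gbtResponse`` would be evaluated once per class ensemble, and ``gbtClassIndex`` would then pick the winning class from the resulting values.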
.. highlight:: cpp .. highlight:: cpp
...@@ -105,9 +109,9 @@ for classification problems the result is :math:`\arg\max_{i=1..K}(f_i(x))`. ...@@ -105,9 +109,9 @@ for classification problems the result is :math:`\arg\max_{i=1..K}(f_i(x))`.
CvGBTreesParams CvGBTreesParams
--------------- ---------------
.. c:type:: CvGBTreesParams .. ocv:class:: CvGBTreesParams
GBT training parameters :: GBT training parameters. ::
struct CvGBTreesParams : public CvDTreeParams struct CvGBTreesParams : public CvDTreeParams
{ {
...@@ -123,43 +127,29 @@ GBT training parameters :: ...@@ -123,43 +127,29 @@ GBT training parameters ::
The structure contains parameters for each single decision tree in the ensemble, The structure contains parameters for each single decision tree in the ensemble,
as well as the whole model characteristics. The structure is derived from as well as the whole model characteristics. The structure is derived from
:ref:`CvDTreeParams` but not all of the decision tree parameters are supported: :ocv:class:`CvDTreeParams` but not all of the decision tree parameters are supported:
cross-validation, pruning and class priorities are not used. The whole cross-validation, pruning, and class priorities are not used.
parameters list is shown below:
``weak_count`` :param weak_count: Count of boosting algorithm iterations. ``weak_count*K`` is the total
The count of boosting algorithm iterations. ``weak_count*K`` -- is the total
count of trees in the GBT model, where ``K`` is the output classes count count of trees in the GBT model, where ``K`` is the output classes count
(equal to one in the case of regression). (equal to one in case of a regression).
``loss_function_type`` :param loss_function_type: Type of the loss function used for training
The type of the loss function used for training
(see :ref:`Training the GBT model`). It must be one of the (see :ref:`Training the GBT model`). It must be one of the
following: ``CvGBTrees::SQUARED_LOSS``, ``CvGBTrees::ABSOLUTE_LOSS``, following types: ``CvGBTrees::SQUARED_LOSS``, ``CvGBTrees::ABSOLUTE_LOSS``,
``CvGBTrees::HUBER_LOSS``, ``CvGBTrees::DEVIANCE_LOSS``. The first three ``CvGBTrees::HUBER_LOSS``, ``CvGBTrees::DEVIANCE_LOSS``. The first three
ones are used for the case of regression problems, and the last one for types are used for regression problems, and the last one for
classification. classification.
``shrinkage`` :param shrinkage: Regularization parameter (see :ref:`Training the GBT model`).
Regularization parameter (see :ref:`Training the GBT model`).
``subsample_portion`` :param subsample_portion: Portion of the whole training set used for each algorithm iteration.
Subset is generated randomly. For more information see
http://www.salfordsystems.com/doc/StochasticBoostingSS.pdf.
The portion of the whole training set used on each algorithm iteration. :param max_depth: Maximal depth of each decision tree in the ensemble (see :ocv:class:`CvDTree`).
Subset is generated randomly
(For more information see
http://www.salfordsystems.com/doc/StochasticBoostingSS.pdf).
``max_depth`` :param use_surrogates: If ``true``, surrogate splits are built (see :ocv:class:`CvDTree`).
The maximal depth of each decision tree in the ensemble (see :ref:`CvDTree`).
``use_surrogates``
If ``true`` surrogate splits are built (see :ref:`CvDTree`).
By default the following constructor is used: By default the following constructor is used:
...@@ -175,9 +165,9 @@ By default the following constructor is used: ...@@ -175,9 +165,9 @@ By default the following constructor is used:
CvGBTrees CvGBTrees
--------- ---------
.. c:type:: CvGBTrees .. ocv:class:: CvGBTrees
GBT model :: GBT model. ::
class CvGBTrees : public CvStatModel class CvGBTrees : public CvStatModel
{ {
...@@ -248,25 +238,25 @@ GBT model :: ...@@ -248,25 +238,25 @@ GBT model ::
CvGBTrees::train CvGBTrees::train
---------------- ----------------
.. c:function:: bool train(const Mat & trainData, int tflag, const Mat & responses, const Mat & varIdx=Mat(), const Mat & sampleIdx=Mat(), const Mat & varType=Mat(), const Mat & missingDataMask=Mat(), CvGBTreesParams params=CvGBTreesParams(), bool update=false) .. ocv:function:: bool train(const Mat & trainData, int tflag, const Mat & responses, const Mat & varIdx=Mat(), const Mat & sampleIdx=Mat(), const Mat & varType=Mat(), const Mat & missingDataMask=Mat(), CvGBTreesParams params=CvGBTreesParams(), bool update=false)
.. c:function:: bool train(CvMLData* data, CvGBTreesParams params=CvGBTreesParams(), bool update=false) .. ocv:function:: bool train(CvMLData* data, CvGBTreesParams params=CvGBTreesParams(), bool update=false)
Trains a Gradient boosted tree model. Trains a Gradient boosted tree model.
The first train method follows the common template (see :ref:`CvStatModel::train`). The first train method follows the common template (see :ocv:func:`CvStatModel::train`).
Both ``tflag`` values (``CV_ROW_SAMPLE``, ``CV_COL_SAMPLE``) are supported. Both ``tflag`` values (``CV_ROW_SAMPLE``, ``CV_COL_SAMPLE``) are supported.
``trainData`` must be of ``CV_32F`` type. ``responses`` must be a matrix of type ``trainData`` must be of the ``CV_32F`` type. ``responses`` must be a matrix of type
``CV_32S`` or ``CV_32F``, in both cases it is converted into the ``CV_32F`` ``CV_32S`` or ``CV_32F``. In both cases it is converted into the ``CV_32F``
matrix inside the training procedure. ``varIdx`` and ``sampleIdx`` must be a matrix inside the training procedure. ``varIdx`` and ``sampleIdx`` must be a
list of indices (``CV_32S``), or a mask (``CV_8U`` or ``CV_8S``). ``update`` is list of indices (``CV_32S``) or a mask (``CV_8U`` or ``CV_8S``). ``update`` is
a dummy parameter. a dummy parameter.
The second form of :ref:`CvGBTrees::train` function uses :ref:`CvMLData` as a The second form of :ocv:func:`CvGBTrees::train` function uses :ocv:class:`CvMLData` as a
data set container. ``update`` is still a dummy parameter. data set container. ``update`` is still a dummy parameter.
All parameters specific to the GBT model are passed into the training function All parameters specific to the GBT model are passed into the training function
as a :ref:`CvGBTreesParams` structure. as a :ocv:class:`CvGBTreesParams` structure.
.. index:: CvGBTrees::predict .. index:: CvGBTrees::predict
...@@ -275,52 +265,41 @@ as a :ref:`CvGBTreesParams` structure. ...@@ -275,52 +265,41 @@ as a :ref:`CvGBTreesParams` structure.
CvGBTrees::predict CvGBTrees::predict
------------------ ------------------
.. c:function:: float predict(const Mat & sample, const Mat & missing=Mat(), const Range & slice = Range::all(), int k=-1) const .. ocv:function:: float predict(const Mat & sample, const Mat & missing=Mat(), const Range & slice = Range::all(), int k=-1) const
Predicts a response for an input sample. Predicts a response for an input sample.
The method predicts the response, corresponding to the given sample :param sample: Input feature vector that has the same format as every training set
(see :ref:`Predicting with GBT model`). element. If not all the variables were actualy used during training,
The result is either the class label or the estimated function value. ``sample`` contains forged values at the appropriate places.
:c:func:`predict` method allows to use the parallel version of the GBT model
prediction if the OpenCV is built with the TBB library. In this case predicitons
of single trees are computed in a parallel fashion.
``sample``
An input feature vector, that has the same format as every training set
element. Hence, if not all the variables were actualy used while training,
``sample`` have to contain fictive values on the appropriate places.
``missing`` :param missing: Missing values mask, which is a one-dimensional matrix of the same size as
``sample`` having the ``CV_8U`` type. ``1`` corresponds to the missing value
The missing values mask. The one dimentional matrix of the same size as
``sample`` having a ``CV_8U`` type. ``1`` corresponds to the missing value
in the same position in the ``sample`` vector. If there are no missing values in the same position in the ``sample`` vector. If there are no missing values
in the feature vector empty matrix can be passed instead of the missing mask. in the feature vector, an empty matrix can be passed instead of the missing mask.
``weak_responses`` :param weak_responses: Matrix used to obtain predictions of all the trees.
The matrix has :math:`K` rows,
In addition to the prediciton of the whole model all the trees' predcitions where :math:`K` is the count of output classes (1 for the regression case).
can be obtained by passing a ``weak_responses`` matrix with :math:`K` rows, The matrix has as many columns as the ``slice`` length.
where :math:`K` is the output classes count (1 for the case of regression)
and having as many columns as the ``slice`` length.
``slice``
Defines the part of the ensemble used for prediction. :param slice: Parameter defining the part of the ensemble used for prediction.
All trees are used when ``slice = Range::all()``. This parameter is useful to If ``slice = Range::all()``, all trees are used. Use this parameter to
get predictions of the GBT models with different ensemble sizes learning get predictions of the GBT models with different ensemble sizes learning
only the one model actually. only one model.
``k``
In the case of the classification problem not the one, but :math:`K` tree
ensembles are built (see :ref:`Training the GBT model`). By passing this
parameter the ouput can be changed to sum of the trees' predictions in the
``k``'th ensemble only. To get the total GBT model prediction ``k`` value
must be -1. For regression problems ``k`` have to be equal to -1 also.
:param k: Number of tree ensembles built in case of the classification problem
(see :ref:`Training the GBT model`). Use this
parameter to change the output to the sum of the trees' predictions in the
``k``-th ensemble only. To get the total GBT model prediction, the ``k`` value
must be -1. For regression problems, ``k`` must also be equal to -1.
The method predicts the response corresponding to the given sample
(see :ref:`Predicting with the GBT model`).
The result is either the class label or the estimated function value. The
:ocv:func:`predict` method enables using the parallel version of the GBT model
prediction if the OpenCV is built with the TBB library. In this case, predictions
of single trees are computed in a parallel fashion.
.. index:: CvGBTrees::clear .. index:: CvGBTrees::clear
...@@ -329,12 +308,12 @@ of single trees are computed in a parallel fashion. ...@@ -329,12 +308,12 @@ of single trees are computed in a parallel fashion.
CvGBTrees::clear CvGBTrees::clear
---------------- ----------------
.. c:function:: void clear() .. ocv:function:: void clear()
Clears the model. Clears the model.
Deletes the data set information, all the weak models and sets all internal The function deletes the data set information and all the weak models and sets all internal
variables to the initial state. Is called in :ref:`CvGBTrees::train` and in the variables to the initial state. The function is called in :ocv:func:`CvGBTrees::train` and in the
destructor. destructor.
...@@ -344,28 +323,21 @@ destructor. ...@@ -344,28 +323,21 @@ destructor.
CvGBTrees::calc_error CvGBTrees::calc_error
--------------------- ---------------------
.. c:function:: float calc_error( CvMLData* _data, int type, std::vector<float> *resp = 0 ) .. ocv:function:: float calc_error( CvMLData* _data, int type, std::vector<float> *resp = 0 )
Calculates training or testing error.
If the :ref:`CvMLData` data is used to store the data set :c:func:`calc_error` can be
used to get the training or testing error easily and (optionally) all predictions
on the training/testing set. If TBB library is used, the error is computed in a
parallel way: predictions for different samples are computed at the same time.
In the case of regression problem mean squared error is returned. For
classifications the result is the misclassification error in percent.
``_data``
Data set. Calculates a training or testing error.
``type`` :param _data: Data set.
Defines what error should be computed: train (``CV_TRAIN_ERROR``) or test :param type: Parameter defining the error that should be computed: train (``CV_TRAIN_ERROR``) or test
(``CV_TEST_ERROR``). (``CV_TEST_ERROR``).
``resp`` :param resp: If non-zero, a vector of predictions on the corresponding data set is
If not ``0`` a vector of predictions on the corresponding data set is
returned. returned.
If the :ocv:class:`CvMLData` data is used to store the data set, :ocv:func:`calc_error` can be
used to get a training/testing error easily and (optionally) all predictions
on the training/testing set. If the Intel* TBB* library is used, the error is computed in a
parallel way, namely, predictions for different samples are computed at the same time.
In case of a regression problem, a mean squared error is returned. For
classifications, the result is a misclassification error in percent.
\ No newline at end of file
K Nearest Neighbors K-Nearest Neighbors
=================== ===================
.. highlight:: cpp
The algorithm caches all training samples and predicts the response for a new sample by analyzing a certain number ( The algorithm caches all training samples and predicts the response for a new sample by analyzing a certain number (
**K** **K**
) of the nearest neighbors of the sample (using voting, calculating weighted sum, and so on). The method is sometimes referred to as "learning by example" because for prediction it looks for the feature vector with a known response that is closest to the given vector. ) of the nearest neighbors of the sample using voting, calculating weighted sum, and so on. The method is sometimes referred to as "learning by example" because for prediction it looks for the feature vector with a known response that is closest to the given vector.
.. index:: CvKNearest .. index:: CvKNearest
...@@ -11,9 +13,9 @@ The algorithm caches all training samples and predicts the response for a new sa ...@@ -11,9 +13,9 @@ The algorithm caches all training samples and predicts the response for a new sa
CvKNearest CvKNearest
---------- ----------
.. c:type:: CvKNearest .. ocv:class:: CvKNearest
K-Nearest Neighbors model :: K-Nearest Neighbors model. ::
class CvKNearest : public CvStatModel class CvKNearest : public CvStatModel
{ {
...@@ -53,7 +55,8 @@ CvKNearest::train ...@@ -53,7 +55,8 @@ CvKNearest::train
Trains the model. Trains the model.
The method trains the K-Nearest model. It follows the conventions of the generic ``train`` "method" with the following limitations: The method trains the K-Nearest model. It follows the conventions of the generic ``train`` approach with the following limitations:
* Only ``CV_ROW_SAMPLE`` data layout is supported. * Only ``CV_ROW_SAMPLE`` data layout is supported.
* Input variables are all ordered. * Input variables are all ordered.
* Output variables can be either categorical ( ``is_regression=false`` ) or ordered ( ``is_regression=true`` ). * Output variables can be either categorical ( ``is_regression=false`` ) or ordered ( ``is_regression=true`` ).
...@@ -87,7 +90,7 @@ For each input vector, the neighbors are sorted by their distances to the vector ...@@ -87,7 +90,7 @@ For each input vector, the neighbors are sorted by their distances to the vector
If only a single input vector is passed, all output matrices are optional and the predicted value is returned by the method. If only a single input vector is passed, all output matrices are optional and the predicted value is returned by the method.
The sample below (currently using the obsolete ``CvMat`` structures) demonstrates the use of the k-nearest classifier for 2D point classification :: The sample below (currently using the obsolete ``CvMat`` structures) demonstrates the use of the k-nearest classifier for 2D point classification: ::
#include "ml.h" #include "ml.h"
#include "highgui.h" #include "highgui.h"
......
...@@ -3,13 +3,13 @@ MLData ...@@ -3,13 +3,13 @@ MLData
.. highlight:: cpp .. highlight:: cpp
For the machine learning algorithms usage it is often that data set is saved in file of format like .csv. The supported format file must contains the table of predictors and responses values, each row of the table must correspond to one sample. Missing values are supported. Famous UC Irvine Machine Learning Repository (http://archive.ics.uci.edu/ml/) provides many stored in such format data sets to the machine learning community. The class MLData has been implemented to ease the loading data for the training one of the existing in OpenCV machine learning algorithm. For float values only separator ``'.'`` is supported. For the machine learning algorithms, the data set is often stored in a file of the ``.csv``-like format. The file contains a table of predictor and response values where each row of the table corresponds to a sample. Missing values are supported. The UC Irvine Machine Learning Repository (http://archive.ics.uci.edu/ml/) provides many data sets stored in such a format to the machine learning community. The class ``MLData`` is implemented to easily load the data for training one of the OpenCV machine learning algorithms. For float values, only the ``'.'`` separator is supported.
CvMLData CvMLData
-------- --------
.. ocv:class:: CvMLData .. ocv:class:: CvMLData
The class to load the data from .csv file. Class for loading the data from a ``.csv`` file.
:: ::
class CV_EXPORTS CvMLData class CV_EXPORTS CvMLData
...@@ -58,91 +58,118 @@ CvMLData::read_csv ...@@ -58,91 +58,118 @@ CvMLData::read_csv
------------------ ------------------
.. ocv:function:: int CvMLData::read_csv(const char* filename); .. ocv:function:: int CvMLData::read_csv(const char* filename);
This method reads the data set from .csv-like file named ``filename`` and store all read values in one matrix. While reading the method tries to define variables (predictors and response) type: ordered or categorical. If some value of the variable is not a number (e.g. contains the letters) exept a label for missing value, then the type of the variable is set to ``CV_VAR_CATEGORICAL``. If all unmissing values of the variable are the numbers, then the type of the variable is set to ``CV_VAR_ORDERED``. So default definition of variables types works correctly for all cases except the case of categorical variable that has numerical class labeles. In such case the type ``CV_VAR_ORDERED`` will be set and user should change the type to ``CV_VAR_CATEGORICAL`` using method :ocv:func:`CvMLData::change_var_type`. For categorical variables the common map is built to convert string class label to the numerical class label and this map can be got by :ocv:func:`CvMLData::get_class_labels_map`. Also while reading the data the method constructs the mask of missing values (e.g. values are egual to `'?'`). Reads the data set from a ``.csv``-like ``filename`` file and stores all read values in a matrix.
While reading the data, the method tries to define the type of variables (predictors and responses): ordered or categorical. If a value of the variable is not numerical (except for the label for a missing value), the type of the variable is set to ``CV_VAR_CATEGORICAL``. If all existing values of the variable are numerical, the type of the variable is set to ``CV_VAR_ORDERED``. So, the default definition of variable types works correctly for all cases except the case of a categorical variable with numerical class labels. In this case, the type ``CV_VAR_ORDERED`` is set. You should change the type to ``CV_VAR_CATEGORICAL`` using the method :ocv:func:`CvMLData::change_var_type`. For categorical variables, a common map is built to convert a string class label to the numerical class label. Use :ocv:func:`CvMLData::get_class_labels_map` to obtain this map.
Also, when reading the data, the method constructs the mask of missing values, for example, values equal to ``'?'``.
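The type-detection rule described above can be illustrated with a small standalone sketch. It does not reproduce the ``CvMLData`` parser: the enum, the function names, and the use of ``std::strtod`` as the "is numerical" test are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Stand-ins for CV_VAR_ORDERED and CV_VAR_CATEGORICAL (hypothetical).
enum VarType { VAR_ORDERED, VAR_CATEGORICAL };

// Returns true if the whole string parses as a floating-point number.
// Note that strtod also accepts forms such as "inf" or hex floats;
// the real parser may be stricter.
bool isNumber(const std::string& s)
{
    if (s.empty())
        return false;
    char* end = 0;
    std::strtod(s.c_str(), &end);
    return end != s.c_str() && *end == '\0';
}

// A column is categorical if any non-missing value is not a number;
// otherwise it is ordered.
VarType inferVarType(const std::vector<std::string>& column,
                     const std::string& missingLabel)
{
    for (std::size_t i = 0; i < column.size(); ++i)
        if (column[i] != missingLabel && !isNumber(column[i]))
            return VAR_CATEGORICAL;
    return VAR_ORDERED;
}
```

A column of class labels such as ``0, 1, 1`` is classified as ordered by this rule, which is the case where the text recommends changing the type manually.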
CvMLData::get_values CvMLData::get_values
-------------------- --------------------
.. ocv:function:: const CvMat* CvMLData::get_values() const; .. ocv:function:: const CvMat* CvMLData::get_values() const;
Returns the pointer to the predictors and responses ``values`` matrix or ``0`` if data has not been loaded from file yet. This matrix has rows count equal to samples count, columns count equal to predictors ``+ 1`` for response (if exist) count (i.e. each row of matrix is values of one sample predictors and response) and type ``CV_32FC1``. Returns a pointer to the matrix of predictor and response ``values`` or ``0`` if the data has not been loaded from the file yet.
The row count of this matrix equals the sample count. The column count equals the predictor count ``+ 1`` for the response (if it exists). This means that each row of the matrix contains the predictor and response values of one sample. The matrix type is ``CV_32FC1``.
CvMLData::get_responses CvMLData::get_responses
----------------------- -----------------------
.. ocv:function:: const CvMat* CvMLData::get_responses(); .. ocv:function:: const CvMat* CvMLData::get_responses();
Returns the pointer to the responses values matrix or throw exception if data has not been loaded from file yet. This matrix has rows count equal to samples count, one column and type ``CV_32FC1``. Returns a pointer to the matrix of response values or throws an exception if the data has not been loaded from the file yet.
This is a single-column matrix of the type ``CV_32FC1``. Its row count is equal to the sample count.
CvMLData::get_missing CvMLData::get_missing
--------------------- ---------------------
.. ocv:function:: const CvMat* CvMLData::get_missing() const; .. ocv:function:: const CvMat* CvMLData::get_missing() const;
Returns the pointer to the missing values mask matrix or throw exception if data has not been loaded from file yet. This matrix has the same size as ``values`` matrix (see :ocv:func:`CvMLData::get_values`) and type ``CV_8UC1``. Returns a pointer to the mask matrix of missing values or throws an exception if the data has not been loaded from the file yet.
This matrix has the same size as the ``values`` matrix (see :ocv:func:`CvMLData::get_values`) and the type ``CV_8UC1``.
CvMLData::set_response_idx CvMLData::set_response_idx
-------------------------- --------------------------
.. ocv:function:: void CvMLData::set_response_idx( int idx ); .. ocv:function:: void CvMLData::set_response_idx( int idx );
Sets index of response column in ``values`` matrix (see :ocv:func:`CvMLData::get_values`) or throw exception if data has not been loaded from file yet. The old response column become pridictors. If ``idx < 0`` there will be no response. Sets the index of a response column in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`) or throws an exception if the data has not been loaded from the file yet.
The old response columns become predictors. If ``idx < 0``, there is no response.
CvMLData::get_response_idx CvMLData::get_response_idx
---------- ----------
.. ocv:function:: int CvMLData::get_response_idx() const; .. ocv:function:: int CvMLData::get_response_idx() const;
Gets response column index in ``values`` matrix (see :ocv:func:`CvMLData::get_values`), negative value there is no response or throw exception if data has not been loaded from file yet. Gets the index of a response column in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`) or throws an exception if the data has not been loaded from the file yet.
If the returned index is negative, there is no response.
CvMLData::set_train_test_split CvMLData::set_train_test_split
------------------------------ ------------------------------
.. ocv:function:: void CvMLData::set_train_test_split( const CvTrainTestSplit * spl ); .. ocv:function:: void CvMLData::set_train_test_split( const CvTrainTestSplit * spl );
For different purposes it can be useful to devide the read data set into two disjoint subsets: training and test ones. This method sets parametes for such split (using ``spl``, see :ocv:class:`CvTrainTestSplit`) and make the data split or throw exception if data has not been loaded from file yet. Divides the read data set into two disjoint training and test subsets.
This method sets parameters for such a split using ``spl`` (see :ocv:class:`CvTrainTestSplit`) or throws an exception if the data has not been loaded from the file yet.
CvMLData::get_train_sample_idx CvMLData::get_train_sample_idx
------------------------------ ------------------------------
.. ocv:function:: const CvMat* CvMLData::get_train_sample_idx() const; .. ocv:function:: const CvMat* CvMLData::get_train_sample_idx() const;
The read data set can be devided on training and test data subsets by setting split (see :ocv:func:`CvMLData::set_train_test_split`). Current method returns the matrix of samples indices for training subset (this matrix has one row and type ``CV_32SC1``). If data split is not set then the method returns ``0``. If data has not been loaded from file yet an exception is thrown. Returns the matrix of sample indices for the training subset.
The data set can be divided into training and test subsets by setting a split (see :ocv:func:`CvMLData::set_train_test_split`). The returned matrix is a single-row matrix of the type ``CV_32SC1``. If the data split is not set, the method returns ``0``. If the data has not been loaded from the file yet, an exception is thrown.
CvMLData::get_test_sample_idx CvMLData::get_test_sample_idx
----------------------------- -----------------------------
.. ocv:function:: const CvMat* CvMLData::get_test_sample_idx() const; .. ocv:function:: const CvMat* CvMLData::get_test_sample_idx() const;
Analogically with :ocv:func:`CvMLData::get_train_sample_idx`, but for test subset. Provides functionality similar to :ocv:func:`CvMLData::get_train_sample_idx` but for a test subset.
CvMLData::mix_train_and_test_idx CvMLData::mix_train_and_test_idx
-------------------------------- --------------------------------
.. ocv:function:: void CvMLData::mix_train_and_test_idx(); .. ocv:function:: void CvMLData::mix_train_and_test_idx();
Mixes the indices of training and test samples preserving sizes of training and test subsets (if data split is set by :ocv:func:`CvMLData::get_values`). If data has not been loaded from file yet an exception is thrown. Mixes the indices of training and test samples preserving sizes of training and test subsets if the data split is set by :ocv:func:`CvMLData::set_train_test_split`. If the data has not been loaded from the file yet, an exception is thrown.
CvMLData::get_var_idx CvMLData::get_var_idx
--------------------- ---------------------
.. ocv:function:: const CvMat* CvMLData::get_var_idx(); .. ocv:function:: const CvMat* CvMLData::get_var_idx();
Returns used variables (columns) indices in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`), ``0`` if used subset is not set or throw exception if data has not been loaded from file yet. Returned matrix has one row, columns count equel to used variable subset size and type ``CV_32SC1``. Returns the indices of variables (columns) used in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`).
The function returns ``0`` if the used subset is not set. It throws an exception if the data has not been loaded from the file yet. The returned matrix is a single-row matrix of the type ``CV_32SC1``. Its column count is equal to the size of the used variable subset.
CvMLData::chahge_var_idx CvMLData::chahge_var_idx
------------------------ ------------------------
.. ocv:function:: void CvMLData::chahge_var_idx( int vi, bool state ); .. ocv:function:: void CvMLData::chahge_var_idx( int vi, bool state );
By default after reading the data set all variables in ``values`` matrix (see :ocv:func:`CvMLData::get_values`) are used. But the user may want to use only subset of variables and can include on/off (depends on ``state`` value) a variable with ``vi`` index from used subset. If data has not been loaded from file yet an exception is thrown. Enables or disables a variable in the used subset.
By default, after reading the data set all variables in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`) are used. But you may want to use only a subset of variables and include/exclude (depending on ``state`` value) a variable with the ``vi`` index from the used subset. If the data has not been loaded from the file yet, an exception is thrown.
CvMLData::get_var_types CvMLData::get_var_types
----------------------- -----------------------
.. ocv:function:: const CvMat* CvMLData::get_var_types(); .. ocv:function:: const CvMat* CvMLData::get_var_types();
Returns matrix of used variable types. The matrix has one row, column count equel to used variables count and type ``CV_8UC1``. If data has not been loaded from file yet an exception is thrown.
Returns a matrix of used variable types.
The function returns a single-row matrix of the type ``CV_8UC1``. Its column count is equal to the number of used variables. If the data has not been loaded from the file yet, an exception is thrown.
CvMLData::set_var_types CvMLData::set_var_types
----------------------- -----------------------
.. ocv:function:: void CvMLData::set_var_types( const char* str ); .. ocv:function:: void CvMLData::set_var_types( const char* str );
Sets variables types according to given string ``str``. The better description of the supporting string format is several examples of it: ``"ord[0-17],cat[18]"``, ``"ord[0,2,4,10-12], cat[1,3,5-9,13,14]"``, ``"cat"`` (all variables are categorical), ``"ord"`` (all variables are ordered). That is after the variable type a list of such type variables indices is followed. Sets variables types according to the given string ``str``.
In the string, a variable type is followed by a list of variables indices. For example: ``"ord[0-17],cat[18]"``, ``"ord[0,2,4,10-12], cat[1,3,5-9,13,14]"``, ``"cat"`` (all variables are categorical), ``"ord"`` (all variables are ordered).
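The string format above can be illustrated with a small standalone parser. This is only a sketch of the format as described here, not the parser used by ``CvMLData``: the names ``VarType``, ``read_index``, and ``parse_var_types`` are hypothetical, and the error handling (exceptions on malformed input or out-of-range indices) is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for CV_VAR_ORDERED and CV_VAR_CATEGORICAL.
enum VarType { VAR_ORDERED, VAR_CATEGORICAL };

// Reads a non-negative integer from s, starting at position i and stopping before end.
static int read_index(const std::string& s, std::size_t& i, std::size_t end)
{
    if (i >= end || !std::isdigit(static_cast<unsigned char>(s[i])))
        throw std::invalid_argument("expected a variable index");
    int value = 0;
    while (i < end && std::isdigit(static_cast<unsigned char>(s[i])))
        value = value * 10 + (s[i++] - '0');
    return value;
}

// Converts a string such as "ord[0-17],cat[18]" into one type per variable.
std::vector<VarType> parse_var_types(const std::string& str, int var_count)
{
    std::vector<VarType> types(var_count, VAR_ORDERED);
    std::size_t pos = 0;
    while (pos < str.size())
    {
        // Skip separators between type groups.
        if (str[pos] == ',' || std::isspace(static_cast<unsigned char>(str[pos])))
        {
            ++pos;
            continue;
        }
        VarType type;
        if (str.compare(pos, 3, "ord") == 0)
            type = VAR_ORDERED;
        else if (str.compare(pos, 3, "cat") == 0)
            type = VAR_CATEGORICAL;
        else
            throw std::invalid_argument("unknown variable type");
        pos += 3;

        // A bare "ord" or "cat" applies to every variable.
        if (pos >= str.size() || str[pos] != '[')
        {
            types.assign(var_count, type);
            continue;
        }
        const std::size_t close = str.find(']', pos);
        if (close == std::string::npos)
            throw std::invalid_argument("missing ']'");

        // Parse the comma-separated indices and ranges between the brackets.
        std::size_t i = pos + 1;
        while (i < close)
        {
            if (str[i] == ',' || std::isspace(static_cast<unsigned char>(str[i])))
            {
                ++i;
                continue;
            }
            const int first = read_index(str, i, close);
            int last = first;
            if (i < close && str[i] == '-')
            {
                ++i;
                last = read_index(str, i, close);
            }
            if (first > last || last >= var_count)
                throw std::out_of_range("variable index out of range");
            for (int v = first; v <= last; ++v)
                types[v] = type;
        }
        pos = close + 1;
    }
    return types;
}
```

For example, ``parse_var_types("ord[0-17],cat[18]", 19)`` marks variables 0 to 17 as ordered and variable 18 as categorical.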
CvMLData::get_var_type CvMLData::get_var_type
---------------------- ----------------------
.. ocv:function:: int CvMLData::get_var_type( int var_idx ) const; .. ocv:function:: int CvMLData::get_var_type( int var_idx ) const;
Returns type of variable by index ``var_idx`` ( ``CV_VAR_ORDERED`` or ``CV_VAR_CATEGORICAL``). Returns the type of a variable by the index ``var_idx`` ( ``CV_VAR_ORDERED`` or ``CV_VAR_CATEGORICAL``).
CvMLData::change_var_type CvMLData::change_var_type
------------------------- -------------------------
...@@ -154,37 +181,37 @@ CvMLData::set_delimiter ...@@ -154,37 +181,37 @@ CvMLData::set_delimiter
----------------------- -----------------------
.. ocv:function:: void CvMLData::set_delimiter( char ch ); .. ocv:function:: void CvMLData::set_delimiter( char ch );
Sets the delimiter for the variable values in file. E.g. ``','`` (default), ``';'``, ``' '`` (space) or other character (exapt float separator ``'.'``). Sets the delimiter for variable values in a file. For example: ``','`` (default), ``';'``, ``' '`` (space), or other characters. The float separator ``'.'`` is not allowed.
CvMLData::get_delimiter CvMLData::get_delimiter
----------------------- -----------------------
.. ocv:function:: char CvMLData::get_delimiter() const; .. ocv:function:: char CvMLData::get_delimiter() const;
Gets the set delimiter charecter. Gets the set delimiter character.
CvMLData::set_miss_ch CvMLData::set_miss_ch
--------------------- ---------------------
.. ocv:function:: void CvMLData::set_miss_ch( char ch ); .. ocv:function:: void CvMLData::set_miss_ch( char ch );
Sets the character denoting the missing of value. E.g. ``'?'`` (default), ``'-'``, etc (exapt float separator ``'.'``). Sets the character for a missing value. For example: ``'?'`` (default), ``'-'``. The float separator ``'.'`` is not allowed.
CvMLData::get_miss_ch CvMLData::get_miss_ch
--------------------- ---------------------
.. ocv:function:: char CvMLData::get_miss_ch() const; .. ocv:function:: char CvMLData::get_miss_ch() const;
Gets the character denoting the missing value. Gets the character for a missing value.
CvMLData::get_class_labels_map CvMLData::get_class_labels_map
------------------------------- -------------------------------
.. ocv:function:: const std::map<std::string, int>& CvMLData::get_class_labels_map() const; .. ocv:function:: const std::map<std::string, int>& CvMLData::get_class_labels_map() const;
Returns map that converts string class labels to the numerical class labels. It can be used to get original class label (as in file). Returns a map that converts string class labels to the numerical class labels. It can be used to get an original class label as in a file.
CvTrainTestSplit CvTrainTestSplit
---------------- ----------------
.. ocv:class:: CvTrainTestSplit .. ocv:class:: CvTrainTestSplit
The structure to set split of data set read by :ocv:class:`CvMLData`. Structure setting the split of a data set read by :ocv:class:`CvMLData`.
:: ::
struct CvTrainTestSplit struct CvTrainTestSplit
...@@ -203,4 +230,8 @@ The structure to set split of data set read by :ocv:class:`CvMLData`. ...@@ -203,4 +230,8 @@ The structure to set split of data set read by :ocv:class:`CvMLData`.
bool mix; bool mix;
}; };
There are two ways to construct split. The first is by setting training sample count (subset size) ``train_sample_count``; other existing samples will be in test subset. The second is by setting training sample portion in ``[0,..1]``. The flag ``mix`` is used to mix training and test samples indices when split will be set, otherwise the data set will be devided in the storing order (first part of samples of given size is the training subset, other part is the test one). There are two ways to construct a split:
* Set the training sample count (subset size) ``train_sample_count``. Other existing samples are located in a test subset.
* Set a training sample portion in ``[0, 1]``.
The flag ``mix`` is used to mix training and test samples indices when the split is set. Otherwise, the data set is split in the storing order: the first part of samples of a given size is a training subset, the second part is a test subset.
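The two ways of constructing a split can be sketched without OpenCV as follows. The functions ``split_by_count`` and ``split_by_portion`` are hypothetical helpers written for illustration. The truncating conversion from a portion to a count and the use of ``std::shuffle`` for mixing are assumptions and may differ from what ``CvMLData`` does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

typedef std::pair<std::vector<int>, std::vector<int> > IndexSplit;  // (train, test)

// Splits sample indices 0..sample_count-1 using an explicit training subset size.
IndexSplit split_by_count(int sample_count, int train_sample_count,
                          bool mix, unsigned seed = 0)
{
    assert(train_sample_count >= 0 && train_sample_count <= sample_count);
    std::vector<int> idx(sample_count);
    std::iota(idx.begin(), idx.end(), 0);
    if (mix)
    {
        // Mixing keeps the subset sizes and only changes which samples land where.
        std::mt19937 rng(seed);
        std::shuffle(idx.begin(), idx.end(), rng);
    }
    // Without mixing, the first train_sample_count samples form the training subset.
    std::vector<int> train(idx.begin(), idx.begin() + train_sample_count);
    std::vector<int> test(idx.begin() + train_sample_count, idx.end());
    return std::make_pair(train, test);
}

// Splits using a training portion in [0, 1].
IndexSplit split_by_portion(int sample_count, float train_sample_portion,
                            bool mix, unsigned seed = 0)
{
    assert(train_sample_portion >= 0.f && train_sample_portion <= 1.f);
    // Assumption: the portion is converted to a count by truncation.
    const int count = static_cast<int>(train_sample_portion * sample_count);
    return split_by_count(sample_count, count, mix, seed);
}
```

With ``mix`` set to ``false``, ``split_by_count(10, 7, false)`` puts samples 0 to 6 into the training subset and samples 7 to 9 into the test subset.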
Neural Networks Neural Networks
=============== ===============
ML implements feed-forward artificial neural networks, more particularly, multi-layer perceptrons (MLP), the most commonly used type of neural networks. MLP consists of the input layer, output layer, and one or more hidden layers. Each layer of MLP includes one or more neurons that are directionally linked with the neurons from the previous and the next layer. The example below represents a 3-layer perceptron with three inputs, two outputs, and the hidden layer including five neurons: .. highlight:: cpp
ML implements feed-forward artificial neural networks or, more particularly, multi-layer perceptrons (MLP), the most commonly used type of neural networks. MLP consists of the input layer, output layer, and one or more hidden layers. Each layer of MLP includes one or more neurons directionally linked with the neurons from the previous and the next layer. The example below represents a 3-layer perceptron with three inputs, two outputs, and the hidden layer including five neurons:
.. image:: pics/mlp.png .. image:: pics/mlp.png
...@@ -45,10 +47,13 @@ In ML, all the neurons have the same activation functions, with the same free pa ...@@ -45,10 +47,13 @@ In ML, all the neurons have the same activation functions, with the same free pa
So, the whole trained network works as follows: So, the whole trained network works as follows:
#. It takes the feature vector as input. The vector size is equal to the size of the input layer. #. Take the feature vector as input. The vector size is equal to the size of the input layer.
#. Values are passed as input to the first hidden layer.
#. Outputs of the hidden layer are computed using the weights and the activation functions. #. Pass values as input to the first hidden layer.
#. Outputs are passed further downstream until you compute the output layer.
#. Compute outputs of the hidden layer using the weights and the activation functions.
#. Pass outputs further downstream until you compute the output layer.
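The steps above can be sketched as a plain forward pass. This is an illustration only, not the ``CvANN_MLP`` implementation: the types ``Vec`` and ``Layer`` and the functions ``layer_forward`` and ``mlp_forward`` are hypothetical, the bias is stored as the last weight of each neuron by assumption, and ``std::tanh`` stands in for whichever activation function the network is configured with.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;
// weights[j] holds the incoming weights of neuron j; the last element is its bias.
typedef std::vector<Vec> Layer;

// Computes the outputs of one layer from the outputs of the previous layer.
Vec layer_forward(const Layer& weights, const Vec& input)
{
    Vec output(weights.size());
    for (std::size_t j = 0; j < weights.size(); ++j)
    {
        assert(weights[j].size() == input.size() + 1);
        double sum = weights[j][input.size()];  // bias term
        for (std::size_t i = 0; i < input.size(); ++i)
            sum += weights[j][i] * input[i];
        output[j] = std::tanh(sum);  // assumed activation function
    }
    return output;
}

// Passes the feature vector through all layers until the output layer is computed.
Vec mlp_forward(const std::vector<Layer>& layers, Vec values)
{
    for (std::size_t k = 0; k < layers.size(); ++k)
        values = layer_forward(layers[k], values);
    return values;
}
```

For the 3-layer perceptron pictured above, ``layers`` would hold two entries: a hidden layer of five neurons with four weights each (three inputs plus bias), and an output layer of two neurons with six weights each.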
So, to compute the network, you need to know all the So, to compute the network, you need to know all the
weights weights
...@@ -66,10 +71,10 @@ so the error on the test set usually starts increasing after the network ...@@ -66,10 +71,10 @@ so the error on the test set usually starts increasing after the network
size reaches a limit. Besides, the larger networks are trained much size reaches a limit. Besides, the larger networks are trained much
longer than the smaller ones, so it is reasonable to pre-process the data, longer than the smaller ones, so it is reasonable to pre-process the data,
using using
:ref:`PCA::operator ()` or similar technique, and train a smaller network :ocv:func:`PCA::operator ()` or similar technique, and train a smaller network
on only essential features. on only essential features.
Another feature of MLP's is their inability to handle categorical Another MLP feature is an inability to handle categorical
data as is. However, there is a workaround. If a certain feature in the data as is. However, there is a workaround. If a certain feature in the
input or output (in case of ``n`` -class classifier for input or output (in case of ``n`` -class classifier for
:math:`n>2` ) layer is categorical and can take :math:`n>2` ) layer is categorical and can take
...@@ -101,9 +106,9 @@ References: ...@@ -101,9 +106,9 @@ References:
CvANN_MLP_TrainParams CvANN_MLP_TrainParams
--------------------- ---------------------
.. c:type:: CvANN_MLP_TrainParams .. ocv:class:: CvANN_MLP_TrainParams
Parameters of the MLP training algorithm :: Parameters of the MLP training algorithm. ::
struct CvANN_MLP_TrainParams struct CvANN_MLP_TrainParams
{ {
...@@ -134,9 +139,9 @@ The structure has a default constructor that initializes parameters for the ``RP ...@@ -134,9 +139,9 @@ The structure has a default constructor that initializes parameters for the ``RP
CvANN_MLP CvANN_MLP
--------- ---------
.. c:type:: CvANN_MLP .. ocv:class:: CvANN_MLP
MLP model :: MLP model. ::
class CvANN_MLP : public CvStatModel class CvANN_MLP : public CvStatModel
{ {
...@@ -259,9 +264,9 @@ CvANN_MLP::train ...@@ -259,9 +264,9 @@ CvANN_MLP::train
:param _flags: Various parameters to control the training algorithm. A combination of the following parameters is possible: :param _flags: Various parameters to control the training algorithm. A combination of the following parameters is possible:
* **UPDATE_WEIGHTS = 1** Algorithm updates the network weights, rather than computes them from scratch (in the latter case the weights are initialized using the Nguyen-Widrow algorithm). * **UPDATE_WEIGHTS = 1** Algorithm updates the network weights, rather than computes them from scratch. In the latter case the weights are initialized using the Nguyen-Widrow algorithm.
* **NO_INPUT_SCALE** Algorithm does not normalize the input vectors. If this flag is not set, the training algorithm normalizes each input feature independently, shifting its mean value to 0 and making the standard deviation =1. If the network is assumed to be updated frequently, the new training data could be much different from original one. In this case, you should take care of proper normalization. * **NO_INPUT_SCALE** Algorithm does not normalize the input vectors. If this flag is not set, the training algorithm normalizes each input feature independently, shifting its mean value to 0 and making the standard deviation equal to 1. If the network is assumed to be updated frequently, the new training data could be much different from original one. In this case, you should take care of proper normalization.
* **NO_OUTPUT_SCALE** Algorithm does not normalize the output vectors. If the flag is not set, the training algorithm normalizes each output feature independently, by transforming it to the certain range depending on the used activation function. * **NO_OUTPUT_SCALE** Algorithm does not normalize the output vectors. If the flag is not set, the training algorithm normalizes each output feature independently, by transforming it to the certain range depending on the used activation function.
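When ``NO_INPUT_SCALE`` is set and the inputs have to be normalized by hand, the per-feature normalization described above (mean shifted to 0, standard deviation scaled to 1) could look like the following sketch. The function ``normalize_features`` is hypothetical. Using the population variance and leaving constant features unscaled are assumptions, not documented behavior of the training algorithm.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// samples[s][f] is feature f of sample s; all rows must have the same length.
void normalize_features(std::vector<std::vector<double> >& samples)
{
    if (samples.empty())
        return;
    const std::size_t n = samples.size();
    const std::size_t feature_count = samples[0].size();
    for (std::size_t f = 0; f < feature_count; ++f)
    {
        // Each feature is handled independently of the others.
        double mean = 0.0;
        for (std::size_t s = 0; s < n; ++s)
            mean += samples[s][f];
        mean /= n;

        double variance = 0.0;
        for (std::size_t s = 0; s < n; ++s)
        {
            const double d = samples[s][f] - mean;
            variance += d * d;
        }
        variance /= n;  // assumption: population variance

        const double stddev = std::sqrt(variance);
        // Assumption: a constant feature is only shifted, never divided by zero.
        const double scale = stddev > 0.0 ? 1.0 / stddev : 1.0;
        for (std::size_t s = 0; s < n; ++s)
            samples[s][f] = (samples[s][f] - mean) * scale;
    }
}
```

If the network is updated with new data later, the mean and standard deviation computed on the original training data would normally be reused, so that old and new inputs are on the same scale.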
......
...@@ -3,7 +3,9 @@ ...@@ -3,7 +3,9 @@
Normal Bayes Classifier Normal Bayes Classifier
======================= =======================
This is a simple classification model assuming that feature vectors from each class are normally distributed (though, not necessarily independently distributed). So, the whole data distribution function is assumed to be a Gaussian mixture, one component per class. Using the training data the algorithm estimates mean vectors and covariance matrices for every class, and then it uses them for prediction. .. highlight:: cpp
This simple classification model assumes that feature vectors from each class are normally distributed (though, not necessarily independently distributed). So, the whole data distribution function is assumed to be a Gaussian mixture, one component per class. Using the training data the algorithm estimates mean vectors and covariance matrices for every class, and then it uses them for prediction.
[Fukunaga90] K. Fukunaga. *Introduction to Statistical Pattern Recognition*. second ed., New York: Academic Press, 1990. [Fukunaga90] K. Fukunaga. *Introduction to Statistical Pattern Recognition*. second ed., New York: Academic Press, 1990.
...@@ -11,9 +13,9 @@ This is a simple classification model assuming that feature vectors from each cl ...@@ -11,9 +13,9 @@ This is a simple classification model assuming that feature vectors from each cl
CvNormalBayesClassifier CvNormalBayesClassifier
----------------------- -----------------------
.. c:type:: CvNormalBayesClassifier .. ocv:class:: CvNormalBayesClassifier
Bayes classifier for normally distributed data :: Bayes classifier for normally distributed data. ::
class CvNormalBayesClassifier : public CvStatModel class CvNormalBayesClassifier : public CvStatModel
{ {
...@@ -50,7 +52,7 @@ CvNormalBayesClassifier::train ...@@ -50,7 +52,7 @@ CvNormalBayesClassifier::train
Trains the model. Trains the model.
The method trains the Normal Bayes classifier. It follows the conventions of the generic ``train`` "method" with the following limitations: The method trains the Normal Bayes classifier. It follows the conventions of the generic ``train`` approach with the following limitations:
* Only ``CV_ROW_SAMPLE`` data layout is supported. * Only ``CV_ROW_SAMPLE`` data layout is supported.
* Input variables are all ordered. * Input variables are all ordered.
......
...@@ -3,9 +3,9 @@ Support Vector Machines ...@@ -3,9 +3,9 @@ Support Vector Machines
.. highlight:: cpp .. highlight:: cpp
Originally, support vector machines (SVM) was a technique for building an optimal binary (2-class) classifier. Later the technique has been extended to regression and clustering problems. SVM is a partial case of kernel-based methods. It maps feature vectors into a higher-dimensional space using a kernel function and builds an optimal linear discriminating function in this space or an optimal hyper-plane that fits into the training data. In case of SVM, the kernel is not defined explicitly. Instead, a distance between any 2 points in the hyper-space needs to be defined. Originally, support vector machines (SVM) was a technique for building an optimal binary (2-class) classifier. Later the technique was extended to regression and clustering problems. SVM is a partial case of kernel-based methods. It maps feature vectors into a higher-dimensional space using a kernel function and builds an optimal linear discriminating function in this space or an optimal hyper-plane that fits into the training data. In case of SVM, the kernel is not defined explicitly. Instead, a distance between any 2 points in the hyper-space needs to be defined.
The solution is optimal, which means that the margin between the separating hyper-plane and the nearest feature vectors from both classes (in case of 2-class classifier) is maximal. The feature vectors that are the closest to the hyper-plane are called "support vectors", which means that the position of other vectors does not affect the hyper-plane (the decision function). The solution is optimal, which means that the margin between the separating hyper-plane and the nearest feature vectors from both classes (in case of 2-class classifier) is maximal. The feature vectors that are the closest to the hyper-plane are called *support vectors*, which means that the position of other vectors does not affect the hyper-plane (the decision function).
There are a lot of good references on SVM. You may consider starting with the following: There are a lot of good references on SVM. You may consider starting with the following:
...@@ -27,9 +27,9 @@ There are a lot of good references on SVM. You may consider starting with the fo ...@@ -27,9 +27,9 @@ There are a lot of good references on SVM. You may consider starting with the fo
CvSVM CvSVM
----- -----
.. c:type:: CvSVM .. ocv:class:: CvSVM
Support Vector Machines :: Support Vector Machines. ::
class CvSVM : public CvStatModel class CvSVM : public CvStatModel
{ {
...@@ -90,9 +90,9 @@ Support Vector Machines :: ...@@ -90,9 +90,9 @@ Support Vector Machines ::
CvSVMParams CvSVMParams
----------- -----------
.. c:type:: CvSVMParams .. ocv:class:: CvSVMParams
SVM training parameters :: SVM training parameters. ::
struct CvSVMParams struct CvSVMParams
{ {
...@@ -117,7 +117,7 @@ SVM training parameters :: ...@@ -117,7 +117,7 @@ SVM training parameters ::
The structure must be initialized and passed to the training method of The structure must be initialized and passed to the training method of
:ref:`CvSVM` . :ocv:class:`CvSVM` .
.. index:: CvSVM::train .. index:: CvSVM::train
...@@ -127,17 +127,20 @@ CvSVM::train ...@@ -127,17 +127,20 @@ CvSVM::train
------------ ------------
.. ocv:function:: bool CvSVM::train( const Mat& _train_data, const Mat& _responses, const Mat& _var_idx=Mat(), const Mat& _sample_idx=Mat(), CvSVMParams _params=CvSVMParams() ) .. ocv:function:: bool CvSVM::train( const Mat& _train_data, const Mat& _responses, const Mat& _var_idx=Mat(), const Mat& _sample_idx=Mat(), CvSVMParams _params=CvSVMParams() )
Trains SVM. Trains an SVM.
The method trains the SVM model. It follows the conventions of the generic ``train`` "method" with the following limitations: The method trains the SVM model. It follows the conventions of the generic ``train`` approach with the following limitations:
* Only the ``CV_ROW_SAMPLE`` data layout is supported. * Only the ``CV_ROW_SAMPLE`` data layout is supported.
* Input variables are all ordered. * Input variables are all ordered.
* Output variables can be either categorical ( ``_params.svm_type=CvSVM::C_SVC`` or ``_params.svm_type=CvSVM::NU_SVC`` ), or ordered ( ``_params.svm_type=CvSVM::EPS_SVR`` or ``_params.svm_type=CvSVM::NU_SVR`` ), or not required at all ( ``_params.svm_type=CvSVM::ONE_CLASS`` ). * Output variables can be either categorical ( ``_params.svm_type=CvSVM::C_SVC`` or ``_params.svm_type=CvSVM::NU_SVC`` ), or ordered ( ``_params.svm_type=CvSVM::EPS_SVR`` or ``_params.svm_type=CvSVM::NU_SVR`` ), or not required at all ( ``_params.svm_type=CvSVM::ONE_CLASS`` ).
* Missing measurements are not supported. * Missing measurements are not supported.
All the other parameters are gathered in the All the other parameters are gathered in the
:ref:`CvSVMParams` structure. :ocv:class:`CvSVMParams` structure.
.. index:: CvSVM::train_auto .. index:: CvSVM::train_auto
...@@ -147,16 +150,16 @@ CvSVM::train_auto ...@@ -147,16 +150,16 @@ CvSVM::train_auto
----------------- -----------------
.. ocv:function:: train_auto( const Mat& _train_data, const Mat& _responses, const Mat& _var_idx, const Mat& _sample_idx, CvSVMParams params, int k_fold = 10, CvParamGrid C_grid = get_default_grid(CvSVM::C), CvParamGrid gamma_grid = get_default_grid(CvSVM::GAMMA), CvParamGrid p_grid = get_default_grid(CvSVM::P), CvParamGrid nu_grid = get_default_grid(CvSVM::NU), CvParamGrid coef_grid = get_default_grid(CvSVM::COEF), CvParamGrid degree_grid = get_default_grid(CvSVM::DEGREE) ) .. ocv:function:: train_auto( const Mat& _train_data, const Mat& _responses, const Mat& _var_idx, const Mat& _sample_idx, CvSVMParams params, int k_fold = 10, CvParamGrid C_grid = get_default_grid(CvSVM::C), CvParamGrid gamma_grid = get_default_grid(CvSVM::GAMMA), CvParamGrid p_grid = get_default_grid(CvSVM::P), CvParamGrid nu_grid = get_default_grid(CvSVM::NU), CvParamGrid coef_grid = get_default_grid(CvSVM::COEF), CvParamGrid degree_grid = get_default_grid(CvSVM::DEGREE) )
Trains SVM with optimal parameters. Trains an SVM with optimal parameters.
:param k_fold: Cross-validation parameter. The training set is divided into ``k_fold`` subsets. One subset is used to test the model, the others form the training set. So, the SVM algorithm is executed ``k_fold`` times. :param k_fold: Cross-validation parameter. The training set is divided into ``k_fold`` subsets. One subset is used to test the model, the others form the training set. So, the SVM algorithm is executed ``k_fold`` times.
The method trains the SVM model automatically by choosing the optimal The method trains the SVM model automatically by choosing the optimal
parameters ``C`` , ``gamma`` , ``p`` , ``nu`` , ``coef0`` , ``degree`` from parameters ``C`` , ``gamma`` , ``p`` , ``nu`` , ``coef0`` , ``degree`` from
:ref:`CvSVMParams`. Parameters are considered optimal :ocv:class:`CvSVMParams`. Parameters are considered optimal
when the cross-validation estimate of the test set error when the cross-validation estimate of the test set error
is minimal. The parameters are iterated by a logarithmic grid, for is minimal. The parameters are iterated by a logarithmic grid, for
example, the parameter ``gamma`` takes the values in the set example, the parameter ``gamma`` takes values in the set
( (
:math:`min`, :math:`min`,
:math:`min*step`, :math:`min*step`,
...@@ -165,7 +168,7 @@ example, the parameter ``gamma`` takes the values in the set ...@@ -165,7 +168,7 @@ example, the parameter ``gamma`` takes the values in the set
where where
:math:`min` is ``gamma_grid.min_val`` , :math:`min` is ``gamma_grid.min_val`` ,
:math:`step` is ``gamma_grid.step`` , and :math:`step` is ``gamma_grid.step`` , and
:math:`n` is the maximal index such that :math:`n` is the maximal index where
.. math:: .. math::
...@@ -173,12 +176,12 @@ where ...@@ -173,12 +176,12 @@ where
So ``step`` must always be greater than 1. So ``step`` must always be greater than 1.
If there is no need to optimize a parameter, the corresponding grid step should be set to any value less or equal to 1. For example, to avoid optimization in ``gamma`` , set ``gamma_grid.step = 0`` , ``gamma_grid.min_val`` , ``gamma_grid.max_val`` as arbitrary numbers. In this case, the value ``params.gamma`` is taken for ``gamma`` . If there is no need to optimize a parameter, the corresponding grid step should be set to any value less than or equal to 1. For example, to avoid optimization in ``gamma`` , set ``gamma_grid.step = 0`` , ``gamma_grid.min_val`` , ``gamma_grid.max_val`` as arbitrary numbers. In this case, the value ``params.gamma`` is taken for ``gamma`` .
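The logarithmic grid can be enumerated with a few lines of standalone code. This is a sketch under assumptions: the function ``log_grid`` is hypothetical, the stopping condition (values strictly below ``max_val``) is assumed because the exact inequality is not shown in this excerpt, and an empty result is used here to signal that the parameter is not optimized.

```cpp
#include <cassert>
#include <vector>

// Enumerates min_val, min_val*step, min_val*step^2, ... while the value stays below max_val.
std::vector<double> log_grid(double min_val, double max_val, double step)
{
    std::vector<double> values;
    // A step less than or equal to 1 means the parameter is not optimized.
    if (step <= 1.0)
        return values;
    assert(min_val > 0.0);  // a non-positive start would never reach max_val
    for (double v = min_val; v < max_val; v *= step)  // assumed strict upper bound
        values.push_back(v);
    return values;
}
```

Because each value is the previous one multiplied by ``step``, a step of 1 or less would never move the value towards ``max_val``, which is why such a step can double as the "do not optimize" marker.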
And, finally, if the optimization in a parameter is required but And, finally, if the optimization in a parameter is required but
the corresponding grid is unknown, you may call the function ``CvSVM::get_default_grid`` . To generate a grid, for example, for ``gamma`` , call ``CvSVM::get_default_grid(CvSVM::GAMMA)`` . the corresponding grid is unknown, you may call the function ``CvSVM::get_default_grid`` . To generate a grid, for example, for ``gamma`` , call ``CvSVM::get_default_grid(CvSVM::GAMMA)`` .
This function works for the case of classification This function works for the classification
( ``params.svm_type=CvSVM::C_SVC`` or ``params.svm_type=CvSVM::NU_SVC`` ) ( ``params.svm_type=CvSVM::C_SVC`` or ``params.svm_type=CvSVM::NU_SVC`` )
as well as for the regression as well as for the regression
( ``params.svm_type=CvSVM::EPS_SVR`` or ``params.svm_type=CvSVM::NU_SVR`` ). If ``params.svm_type=CvSVM::ONE_CLASS`` , no optimization is made and the usual SVM with parameters specified in ``params`` is executed. ( ``params.svm_type=CvSVM::EPS_SVR`` or ``params.svm_type=CvSVM::NU_SVR`` ). If ``params.svm_type=CvSVM::ONE_CLASS`` , no optimization is made and the usual SVM with parameters specified in ``params`` is executed.
...@@ -207,7 +210,7 @@ CvSVM::get_default_grid ...@@ -207,7 +210,7 @@ CvSVM::get_default_grid
* **CvSVM::DEGREE** * **CvSVM::DEGREE**
The grid will be generated for the parameter with this ID. The grid is generated for the parameter with this ID.
The function generates a grid for the specified parameter of the SVM algorithm. The grid may be passed to the function ``CvSVM::train_auto`` . The function generates a grid for the specified parameter of the SVM algorithm. The grid may be passed to the function ``CvSVM::train_auto`` .
......