integrated grammar fixes from tech writer (part I)

Commit 501033db authored Jun 24, 2011 by Vadim Pisarevsky
6 changed files
--- a/modules/ml/doc/gradient_boosted_trees.rst
+++ b/modules/ml/doc/gradient_boosted_trees.rst
@@ -3,12 +3,14 @@
 Gradient Boosted Trees
 ======================
-Gradient Boosted Trees (GBT) is a generalized boosting algorithm, introduced by
+.. highlight:: cpp
+Gradient Boosted Trees (GBT) is a generalized boosting algorithm introduced by
 Jerome Friedman: http://www.salfordsystems.com/doc/GreedyFuncApproxSS.pdf .
-In contrast to AdaBoost.M1 algorithm GBT can deal with both multiclass
+In contrast to the AdaBoost.M1 algorithm, GBT can deal with both multiclass
-classification and regression problems. More than that it can use any
+classification and regression problems. Moreover, it can use any
 differential loss function, some popular ones are implemented.
-Decision trees (:ref:`CvDTree`) usage as base learners allows to process ordered
+Using decision trees (:ocv:class:`CvDTree`) as base learners makes it possible to process ordered
 and categorical variables.
@@ -17,10 +19,10 @@ and categorical variables.
 Training the GBT model
 ----------------------
-Gradient Boosted Trees model represents an ensemble of single regression trees,
+The Gradient Boosted Trees model represents an ensemble of single regression trees
-that are built in a greedy fashion. Training procedure is an iterative proccess
+built in a greedy fashion. The training procedure is an iterative process
-similar to the numerical optimazation via gradient descent method. Summary loss
+similar to the numerical optimization via the gradient descent method. The total loss
-on the training set depends only from the current model predictions on the
+on the training set depends only on the current model predictions for the
 thaining samples,  in other words
 :math:`\sum^N_{i=1}L(y_i, F(x_i)) \equiv \mathcal{L}(F(x_1), F(x_2), ... , F(x_N))
 \equiv \mathcal{L}(F)`. And the :math:`\mathcal{L}(F)`
@@ -30,12 +32,13 @@ gradient can be computed as follows:
    grad(\mathcal{L}(F)) = \left( \dfrac{\partial{L(y_1, F(x_1))}}{\partial{F(x_1)}},
    \dfrac{\partial{L(y_2, F(x_2))}}{\partial{F(x_2)}}, ... ,
    \dfrac{\partial{L(y_N, F(x_N))}}{\partial{F(x_N)}} \right) .
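As a worked instance of the gradient above, consider the squared loss. The derivation below is a sketch that uses only the symbols already defined in this section; it shows that, for this particular loss, each antigradient component is the ordinary residual of the current model.

```latex
L(y_i, F(x_i)) = \dfrac{1}{2}\,(y_i - F(x_i))^2
\quad\Longrightarrow\quad
-\dfrac{\partial L(y_i, F(x_i))}{\partial F(x_i)} = y_i - F(x_i)
```

For the absolute loss the same computation gives :math:`\mathrm{sign}(y_i - F(x_i))` wherever :math:`y_i \neq F(x_i)`.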
-On every training step a single regression tree is built to predict an
+At every training step, a single regression tree is built to predict an
 antigradient vector components. Step length is computed corresponding to the
-loss function and separately for every region determined by the tree leaf, and
+loss function and separately for every region determined by the tree leaf. It
-can be eliminated by changing leaves' values directly.
+can be eliminated by changing values of the leaves directly.
-The main scheme of the training proccess is shown below.
+See below the main scheme of the training process:
 #.
    Find the best constant model.
@@ -52,49 +55,50 @@ The main scheme of the training proccess is shown below.
        Add the tree to the model.
-The following loss functions are implemented:
+The following loss functions are implemented for regression problems:
-*for regression problems:*
-#.
+*
    Squared loss (``CvGBTrees::SQUARED_LOSS``):
    :math:`L(y,f(x))=\dfrac{1}{2}(y-f(x))^2`
-#.
+*
    Absolute loss (``CvGBTrees::ABSOLUTE_LOSS``):
    :math:`L(y,f(x))=|y-f(x)|`
-#.
+*
    Huber loss (``CvGBTrees::HUBER_LOSS``):
    :math:`L(y,f(x)) = \left\{ \begin{array}{lr}
    \delta\cdot\left(|y-f(x)|-\dfrac{\delta}{2}\right) & : |y-f(x)|>\delta\\
    \dfrac{1}{2}\cdot(y-f(x))^2 & : |y-f(x)|\leq\delta \end{array} \right.`,
-    where :math:`\delta` is the :math:`\alpha`-quantile estimation of the
+	where :math:`\delta` is the :math:`\alpha`-quantile estimation of the
    :math:`|y-f(x)|`. In the current implementation :math:`\alpha=0.2`.
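The three regression losses listed above can be written down directly. The following is a minimal standalone sketch, not the ``CvGBTrees`` implementation: the function names are hypothetical, and ``delta`` is passed in by the caller instead of being estimated as a quantile of the residuals.

```cpp
#include <cassert>
#include <cmath>

// Squared loss: L(y, f) = 1/2 * (y - f)^2.
double squaredLoss(double y, double f)
{
    const double r = y - f;
    return 0.5 * r * r;
}

// Absolute loss: L(y, f) = |y - f|.
double absoluteLoss(double y, double f)
{
    return std::fabs(y - f);
}

// Huber loss: quadratic for residuals up to delta, linear beyond it.
// Assumption: delta is supplied by the caller and must be positive.
double huberLoss(double y, double f, double delta)
{
    assert(delta > 0.0);
    const double r = std::fabs(y - f);
    return r > delta ? delta * (r - 0.5 * delta) : 0.5 * r * r;
}
```

The Huber loss matches the squared loss for small residuals and grows only linearly for large ones, which makes it less sensitive to outliers.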
-*for classification problems:*
-4.
+The following loss functions are implemented for classification problems:
+*
    Deviance or cross-entropy loss (``CvGBTrees::DEVIANCE_LOSS``):
    :math:`K` functions are built, one function for each output class, and
    :math:`L(y,f_1(x),...,f_K(x)) = -\sum^K_{k=0}1(y=k)\ln{p_k(x)}`,
    where :math:`p_k(x)=\dfrac{\exp{f_k(x)}}{\sum^K_{i=1}\exp{f_i(x)}}`
-    is the estimation of the probability that :math:`y=k`.
+    is the estimation of the probability of :math:`y=k`.
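The class probabilities and the deviance loss described above can be sketched as follows. This is an illustrative standalone example, not the library code: the function names are hypothetical, classes are indexed from zero, and the maximum response is subtracted before exponentiation, which is a common numerical safeguard that does not change the resulting probabilities.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// p_k = exp(f_k) / sum_i exp(f_i), where f holds one response per class.
std::vector<double> classProbabilities(const std::vector<double>& f)
{
    assert(!f.empty());
    const double fmax = *std::max_element(f.begin(), f.end());
    std::vector<double> p(f.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < f.size(); ++k)
    {
        p[k] = std::exp(f[k] - fmax);  // shift by the maximum to avoid overflow
        sum += p[k];
    }
    for (std::size_t k = 0; k < p.size(); ++k)
        p[k] /= sum;
    return p;
}

// Deviance (cross-entropy) loss for a sample whose true class is y:
// only the term with 1(y = k) survives, so the loss is -ln p_y.
double devianceLoss(std::size_t y, const std::vector<double>& f)
{
    assert(y < f.size());
    return -std::log(classProbabilities(f)[y]);
}
```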
-In the end we get the model in the following form:
+As a result, you get the following model:
 .. math:: f(x) = f_0 + \nu\cdot\sum^M_{i=1}T_i(x) ,
-where :math:`f_0` is the initial guess (the best constant model) and :math:`\nu`
+where :math:`f_0` is an initial guess (the best constant model) and :math:`\nu`
 is a regularization parameter from the interval :math:`(0,1]`, futher called
 *shrinkage*.
 .. _Predicting with GBT model:
-Predicting with GBT model
+Predicting with the GBT Model
 -------------------------
-To get the GBT model prediciton it is needed to compute the sum of responses of
+To get the GBT model prediction, you need to compute the sum of responses of
-all the trees in the ensemble. For regression problems it is the answer, and
+all the trees in the ensemble. For regression problems, it is the answer.
-for classification problems the result is :math:`\arg\max_{i=1..K}(f_i(x))`.
+For classification problems, the result is :math:`\arg\max_{i=1..K}(f_i(x))`.
 .. highlight:: cpp
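The prediction rule described above can be sketched in a few lines. This is a simplified standalone illustration: ``Tree`` is a hypothetical stand-in for a trained regression tree, and the function names are not part of the OpenCV API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

typedef std::vector<double> Sample;
// Hypothetical stand-in for a trained regression tree: maps a sample to a response.
typedef std::function<double(const Sample&)> Tree;

// f(x) = f0 + shrinkage * sum_i T_i(x)
double ensembleResponse(double f0, double shrinkage,
                        const std::vector<Tree>& trees, const Sample& x)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < trees.size(); ++i)
        sum += trees[i](x);
    return f0 + shrinkage * sum;
}

// For classification, the predicted class is the index of the largest
// per-class response (zero-based here; the first maximum wins on ties).
std::size_t predictClass(const std::vector<double>& classResponses)
{
    assert(!classResponses.empty());
    std::size_t best = 0;
    for (std::size_t k = 1; k < classResponses.size(); ++k)
        if (classResponses[k] > classResponses[best])
            best = k;
    return best;
}
```

For a regression model, ``ensembleResponse`` already is the answer. For a classification model, it would be evaluated once per class ensemble and the results passed to ``predictClass``.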
@@ -105,9 +109,9 @@ for classification problems the result is :math:`\arg\max_{i=1..K}(f_i(x))`.
 CvGBTreesParams
 ---------------
-.. c:type:: CvGBTreesParams
+.. ocv:class:: CvGBTreesParams
-GBT training parameters ::
+GBT training parameters. ::
    struct CvGBTreesParams : public CvDTreeParams
    {
@@ -123,43 +127,29 @@ GBT training parameters ::
 The structure contains parameters for each sigle decision tree in the ensemble,
 as well as the whole model characteristics. The structure is derived from
-:ref:`CvDTreeParams` but not all of the decision tree parameters are supported:
+:ocv:class:`CvDTreeParams` but not all of the decision tree parameters are supported:
-cross-validation, pruning and class priorities are not used. The whole
+cross-validation, pruning, and class priorities are not used.
-parameters list is shown below:
-``weak_count``
+   :param weak_count: Count of boosting algorithm iterations. ``weak_count*K`` is the total
-    The count of boosting algorithm iterations. ``weak_count*K`` -- is the total
    count of trees in the GBT model, where ``K`` is the output classes count
-    (equal to one in the case of regression).
+    (equal to one in case of a regression).
-``loss_function_type``
+   :param loss_function_type: Type of the loss function used for training
-    The type of the loss function used for training
    (see :ref:`Training the GBT model`). It must be one of the
-    following: ``CvGBTrees::SQUARED_LOSS``, ``CvGBTrees::ABSOLUTE_LOSS``,
+    following types: ``CvGBTrees::SQUARED_LOSS``, ``CvGBTrees::ABSOLUTE_LOSS``,
    ``CvGBTrees::HUBER_LOSS``, ``CvGBTrees::DEVIANCE_LOSS``. The first three
-    ones are used for the case of regression problems, and the last one for
+    types are used for regression problems, and the last one for
    classification.
-``shrinkage``
+   :param shrinkage: Regularization parameter (see :ref:`Training the GBT model`).
-    Regularization parameter (see :ref:`Training the GBT model`).
-``subsample_portion``
+   :param subsample_portion: Portion of the whole training set used for each algorithm iteration.
+    Subset is generated randomly. For more information see
+    http://www.salfordsystems.com/doc/StochasticBoostingSS.pdf.
-    The portion of the whole training set used on each algorithm iteration.
+   :param max_depth: Maximal depth of each decision tree in the ensemble (see :ocv:class:`CvDTree`).
-    Subset is generated randomly
-    (For more information see
-    http://www.salfordsystems.com/doc/StochasticBoostingSS.pdf).
-``max_depth``
+   :param use_surrogates: If ``true``, surrogate splits are built (see :ocv:class:`CvDTree`).
-    The maximal depth of each decision tree in the ensemble (see :ref:`CvDTree`).
-``use_surrogates``
-    If ``true`` surrogate splits are built (see :ref:`CvDTree`).
 By default the following constructor is used:
@@ -175,9 +165,9 @@ By default the following constructor is used:
 CvGBTrees
 ---------
-.. c:type:: CvGBTrees
+.. ocv:class:: CvGBTrees
-GBT model ::
+GBT model. ::
 	class CvGBTrees : public CvStatModel
 	{
@@ -248,25 +238,25 @@ GBT model ::
 CvGBTrees::train
 ----------------
-.. c:function:: bool train(const Mat & trainData, int tflag, const Mat & responses, const Mat & varIdx=Mat(), const Mat & sampleIdx=Mat(), const Mat & varType=Mat(), const Mat & missingDataMask=Mat(), CvGBTreesParams params=CvGBTreesParams(), bool update=false)
+.. ocv:function:: bool train(const Mat & trainData, int tflag, const Mat & responses, const Mat & varIdx=Mat(), const Mat & sampleIdx=Mat(), const Mat & varType=Mat(), const Mat & missingDataMask=Mat(), CvGBTreesParams params=CvGBTreesParams(), bool update=false)
-.. c:function:: bool train(CvMLData* data, CvGBTreesParams params=CvGBTreesParams(), bool update=false)
+.. ocv:function:: bool train(CvMLData* data, CvGBTreesParams params=CvGBTreesParams(), bool update=false)
 	Trains a Gradient boosted tree model.
-The first train method follows the common template (see :ref:`CvStatModel::train`).
+The first train method follows the common template (see :ocv:func:`CvStatModel::train`).
 Both ``tflag`` values (``CV_ROW_SAMPLE``, ``CV_COL_SAMPLE``) are supported.
-``trainData`` must be of ``CV_32F`` type. ``responses`` must be a matrix of type
+``trainData`` must be of the ``CV_32F`` type. ``responses`` must be a matrix of type
-``CV_32S`` or ``CV_32F``, in both cases it is converted into the ``CV_32F``
+``CV_32S`` or ``CV_32F``. In both cases it is converted into the ``CV_32F``
 matrix inside the training procedure. ``varIdx`` and ``sampleIdx`` must be a
-list of indices (``CV_32S``), or a mask (``CV_8U`` or ``CV_8S``). ``update`` is
+list of indices (``CV_32S``) or a mask (``CV_8U`` or ``CV_8S``). ``update`` is
 a dummy parameter.
-The second form of :ref:`CvGBTrees::train` function uses :ref:`CvMLData` as a
+The second form of :ocv:func:`CvGBTrees::train` function uses :ocv:class:`CvMLData` as a
 data set container. ``update`` is still a dummy parameter. 
 All parameters specific to the GBT model are passed into the training function
-as a :ref:`CvGBTreesParams` structure.
+as a :ocv:class:`CvGBTreesParams` structure.
 .. index:: CvGBTrees::predict
@@ -275,52 +265,41 @@ as a :ref:`CvGBTreesParams` structure.
 CvGBTrees::predict
 ------------------
-.. c:function:: float predict(const Mat & sample, const Mat & missing=Mat(), const Range & slice = Range::all(), int k=-1) const
+.. ocv:function:: float predict(const Mat & sample, const Mat & missing=Mat(), const Range & slice = Range::all(), int k=-1) const
    Predicts a response for an input sample.
-The method predicts the response, corresponding to the given sample
+   :param sample: Input feature vector that has the same format as every training set
-(see :ref:`Predicting with GBT model`).
+    element. If not all the variables were actually used during training,
-The result is either the class label or the estimated function value.
+    ``sample`` contains dummy values at the appropriate places.
-:c:func:`predict` method allows to use the parallel version of the GBT model
-prediction if the OpenCV is built with the TBB library. In this case predicitons
-of single trees are computed in a parallel fashion.
-``sample``
-    An input feature vector, that has the same format as every training set
-    element. Hence, if not all the variables were actualy used while training,
-    ``sample`` have to contain fictive values on the appropriate places.
-``missing``
+   :param missing: Missing values mask, which is a one-dimensional matrix of the same size as
+    ``sample`` having the ``CV_8U`` type. ``1`` corresponds to the missing value
-    The missing values mask. The one dimentional matrix of the same size as
-    ``sample`` having a ``CV_8U`` type. ``1`` corresponds to the missing value
    in the same position in the ``sample`` vector. If there are no missing values
-    in the feature vector empty matrix can be passed instead of the missing mask.
+    in the feature vector, an empty matrix can be passed instead of the missing mask.
-``weak_responses``
+   :param weak_responses: Matrix used to obtain predictions of all the trees.
+    The matrix has :math:`K` rows,
-    In addition to the prediciton of the whole model all the trees' predcitions
+    where :math:`K` is the count of output classes (1 for the regression case).
-    can be obtained by passing a ``weak_responses`` matrix with :math:`K` rows,
+    The matrix has as many columns as the ``slice`` length.
-    where :math:`K` is the output classes count (1 for the case of regression)
-    and having as many columns as the ``slice`` length.
-``slice``
-    Defines the part of the ensemble used for prediction.
+   :param slice: Parameter defining the part of the ensemble used for prediction.
-    All trees are used when ``slice = Range::all()``. This parameter is useful to
+    If ``slice = Range::all()``, all trees are used. Use this parameter to
    get predictions of the GBT models with different ensemble sizes learning
-    only the one model actually.
+    only one model.
-``k``
-    In the case of the classification problem not the one, but :math:`K` tree
-    ensembles are built (see :ref:`Training the GBT model`). By passing this
-    parameter the ouput can be changed to sum of the trees' predictions in the
-    ``k``'th ensemble only. To get the total GBT model prediction ``k`` value
-    must be -1. For regression problems ``k`` have to be equal to -1 also.
+   :param k: Index of the tree ensemble whose response is returned. In case of the classification problem,
+    :math:`K` tree ensembles are built (see :ref:`Training the GBT model`). Use this
+    parameter to change the output to the sum of the trees' predictions in the
+    ``k``-th ensemble only. To get the total GBT model prediction, the ``k`` value
+    must be -1. For regression problems, ``k`` must also be equal to -1.
+The method predicts the response corresponding to the given sample
+(see :ref:`Predicting with GBT model`).
+The result is either the class label or the estimated function value. The
+:ocv:func:`predict` method enables using the parallel version of the GBT model
+prediction if the OpenCV is built with the TBB library. In this case, predictions
+of single trees are computed in a parallel fashion. 
 .. index:: CvGBTrees::clear
@@ -329,12 +308,12 @@ of single trees are computed in a parallel fashion.
 CvGBTrees::clear
 ----------------
-.. c:function:: void clear()
+.. ocv:function:: void clear()
    Clears the model.
-Deletes the data set information, all the weak models and sets all internal
+The function deletes the data set information and all the weak models and sets all internal
-variables to the initial state. Is called in :ref:`CvGBTrees::train` and in the
+variables to the initial state. The function is called in :ocv:func:`CvGBTrees::train` and in the
 destructor.
@@ -344,28 +323,21 @@ destructor.
 CvGBTrees::calc_error
 ---------------------
-.. c:function:: float calc_error( CvMLData* _data, int type, std::vector<float> *resp = 0 )
+.. ocv:function:: float calc_error( CvMLData* _data, int type, std::vector<float> *resp = 0 )
-    Calculates training or testing error.
-If the :ref:`CvMLData` data is used to store the data set :c:func:`calc_error` can be
-used to get the training or testing error easily and (optionally) all predictions
-on the training/testing set. If TBB library is used, the error is computed in a
-parallel way: predictions for different samples are computed at the same time.
-In the case of regression problem mean squared error is returned. For
-classifications the result is the misclassification error in percent.
-``_data``
-    Data set.
+    Calculates a training or testing error.
-``type``
+   :param _data: Data set.
-    Defines what error should be computed: train (``CV_TRAIN_ERROR``) or test
+   :param type: Parameter defining the error that should be computed: train (``CV_TRAIN_ERROR``) or test
    (``CV_TEST_ERROR``).
-``resp``
+   :param resp: If non-zero, a vector of predictions on the corresponding data set is
-    If not ``0`` a vector of predictions on the corresponding data set is
    returned.
+If the :ocv:class:`CvMLData` data is used to store the data set, :ocv:func:`calc_error` can be
+used to get a training/testing error easily and (optionally) all predictions
+on the training/testing set. If the Intel* TBB* library is used, the error is computed in a
+parallel way, namely, predictions for different samples are computed at the same time.
+In case of a regression problem, a mean squared error is returned. For
+classifications, the result is a misclassification error in percent.
\ No newline at end of file
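The two error measures that ``calc_error`` is described as returning can be sketched as follows. This is a standalone illustration with hypothetical function names; it operates on plain vectors instead of ``CvMLData`` and does not reproduce the parallel TBB code path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Regression case: mean squared error between predictions and true responses.
double meanSquaredError(const std::vector<float>& predicted,
                        const std::vector<float>& actual)
{
    assert(predicted.size() == actual.size() && !actual.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < actual.size(); ++i)
    {
        const double d = static_cast<double>(predicted[i]) - static_cast<double>(actual[i]);
        sum += d * d;
    }
    return sum / static_cast<double>(actual.size());
}

// Classification case: share of wrongly predicted labels, in percent.
double misclassificationPercent(const std::vector<int>& predicted,
                                const std::vector<int>& actual)
{
    assert(predicted.size() == actual.size() && !actual.empty());
    std::size_t wrong = 0;
    for (std::size_t i = 0; i < actual.size(); ++i)
        if (predicted[i] != actual[i])
            ++wrong;
    return 100.0 * static_cast<double>(wrong) / static_cast<double>(actual.size());
}
```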
--- a/modules/ml/doc/k_nearest_neighbors.rst
+++ b/modules/ml/doc/k_nearest_neighbors.rst
-K Nearest Neighbors
+K-Nearest Neighbors
 ===================
+.. highlight:: cpp
 The algorithm caches all training samples and predicts the response for a new sample by analyzing a certain number (
 **K**
-) of the nearest neighbors of the sample (using voting, calculating weighted sum, and so on). The method is sometimes referred to as "learning by example" because for prediction it looks for the feature vector with a known response that is closest to the given vector.
+) of the nearest neighbors of the sample using voting, calculating weighted sum, and so on. The method is sometimes referred to as "learning by example" because for prediction it looks for the feature vector with a known response that is closest to the given vector.
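The voting variant of the prediction described above can be sketched with a brute-force search. This is a standalone illustration, not the ``CvKNearest`` implementation: the names are hypothetical, the distance is squared Euclidean, and ties between classes go to the smallest label.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

typedef std::vector<float> Point;

// Squared Euclidean distance between two points of equal dimension.
float squaredDistance(const Point& a, const Point& b)
{
    assert(a.size() == b.size());
    float sum = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += (a[i] - b[i]) * (a[i] - b[i]);
    return sum;
}

// Returns the label that occurs most often among the k stored samples
// closest to the query.
int classifyByVoting(const std::vector<Point>& samples, const std::vector<int>& labels,
                     const Point& query, std::size_t k)
{
    assert(samples.size() == labels.size() && k >= 1 && k <= samples.size());
    std::vector<std::pair<float, int> > byDistance;
    for (std::size_t i = 0; i < samples.size(); ++i)
        byDistance.push_back(std::make_pair(squaredDistance(samples[i], query), labels[i]));
    // Only the k smallest distances need to be ordered.
    std::partial_sort(byDistance.begin(), byDistance.begin() + k, byDistance.end());

    std::map<int, std::size_t> votes;
    for (std::size_t i = 0; i < k; ++i)
        ++votes[byDistance[i].second];

    int best = byDistance[0].second;
    std::size_t bestVotes = 0;
    for (std::map<int, std::size_t>::const_iterator it = votes.begin(); it != votes.end(); ++it)
        if (it->second > bestVotes)
        {
            best = it->first;
            bestVotes = it->second;
        }
    return best;
}
```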
 .. index:: CvKNearest
@@ -11,9 +13,9 @@ The algorithm caches all training samples and predicts the response for a new sa
 CvKNearest
 ----------
-.. c:type:: CvKNearest
+.. ocv:class:: CvKNearest
-K-Nearest Neighbors model ::
+K-Nearest Neighbors model. ::
    class CvKNearest : public CvStatModel
    {
@@ -53,7 +55,8 @@ CvKNearest::train
    Trains the model.
-The method trains the K-Nearest model. It follows the conventions of the generic ``train`` "method" with the following limitations: 
+The method trains the K-Nearest model. It follows the conventions of the generic ``train`` approach with the following limitations: 
 * Only ``CV_ROW_SAMPLE`` data layout is supported.
 * Input variables are all ordered.
 * Output variables can be either categorical ( ``is_regression=false`` ) or ordered ( ``is_regression=true`` ).
@@ -87,7 +90,7 @@ For each input vector, the neighbors are sorted by their distances to the vector
 If only a single input vector is passed, all output matrices are optional and the predicted value is returned by the method.
-The sample below (currently using the obsolete ``CvMat`` structures) demonstrates the use of the k-nearest classifier for 2D point classification ::
+The sample below (currently using the obsolete ``CvMat`` structures) demonstrates the use of the k-nearest classifier for 2D point classification: ::
    #include "ml.h"
    #include "highgui.h"

--- a/modules/ml/doc/mldata.rst
+++ b/modules/ml/doc/mldata.rst
@@ -3,13 +3,13 @@ MLData
 .. highlight:: cpp
-For the machine learning algorithms usage it is often that data set is saved in file of format like .csv. The supported format file must contains the table of predictors and responses values, each row of the table must correspond to one sample. Missing values are supported. Famous UC Irvine Machine Learning Repository (http://archive.ics.uci.edu/ml/) provides many stored in such format data sets to the machine learning community. The class MLData has been implemented to ease the loading data for the training one of the existing in OpenCV machine learning algorithm. For float values only separator ``'.'`` is supported.
+For the machine learning algorithms, the data set is often stored in a file of the ``.csv``-like format. The file contains a table of predictor and response values where each row of the table corresponds to a sample. Missing values are supported. The UC Irvine Machine Learning Repository (http://archive.ics.uci.edu/ml/) provides many data sets stored in such a format to the machine learning community. The class ``CvMLData`` is implemented to easily load the data for training one of the OpenCV machine learning algorithms. For float values, only the ``'.'`` separator is supported.
 CvMLData
 --------
 .. ocv:class:: CvMLData
-The class to load the data from .csv file. 
+Class for loading the data from a ``.csv`` file. 
 ::
    class CV_EXPORTS CvMLData
@@ -58,91 +58,118 @@ CvMLData::read_csv
 ------------------
 .. ocv:function:: int CvMLData::read_csv(const char* filename);
-    This method reads the data set from .csv-like file named ``filename`` and store all read values in one matrix. While reading the method tries to define variables (predictors and response) type: ordered or categorical. If some value of the variable is not a number (e.g. contains the letters) exept a label for missing value, then the type of the variable is set to ``CV_VAR_CATEGORICAL``. If all unmissing values of the variable are the numbers, then the type of the variable is set to ``CV_VAR_ORDERED``. So default definition of variables types works correctly for all cases except the case of categorical variable that has numerical class labeles. In such case the type ``CV_VAR_ORDERED`` will be set and user should change the type to ``CV_VAR_CATEGORICAL`` using method :ocv:func:`CvMLData::change_var_type`. For categorical variables the common map is built to convert string class label to the numerical class label and this map can be got by :ocv:func:`CvMLData::get_class_labels_map`. Also while reading the data the method constructs the mask of missing values (e.g. values are egual to `'?'`).
+    Reads the data set from a ``.csv``-like ``filename`` file and stores all read values in a matrix. 
+While reading the data, the method tries to define the type of variables (predictors and responses): ordered or categorical. If a value of the variable is not numerical (except for the label for a missing value), the type of the variable is set to ``CV_VAR_CATEGORICAL``. If all existing values of the variable are numerical, the type of the variable is set to ``CV_VAR_ORDERED``. So, the default definition of variable types works correctly for all cases except the case of a categorical variable with numerical class labels. In this case, the type ``CV_VAR_ORDERED`` is set. You should change the type to ``CV_VAR_CATEGORICAL`` using the method :ocv:func:`CvMLData::change_var_type`. For categorical variables, a common map is built to convert a string class label to the numerical class label. Use :ocv:func:`CvMLData::get_class_labels_map` to obtain this map. 
+Also, when reading the data, the method constructs the mask of missing values. For example, values equal to ``'?'`` are marked as missing.
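The type inference rule described above can be sketched for a single column of string tokens. This is a standalone illustration, not the ``read_csv`` code: ``VarType``, ``isNumber``, and ``inferVarType`` are hypothetical names, and ``std::strtod`` is used as the numeric test, which is slightly more permissive than a strict decimal parser (it also accepts forms such as ``1e3`` or ``inf``).

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-ins for CV_VAR_ORDERED and CV_VAR_CATEGORICAL.
enum VarType { VAR_ORDERED, VAR_CATEGORICAL };

// True if the whole token parses as a floating-point number.
bool isNumber(const std::string& token)
{
    if (token.empty())
        return false;
    char* end = 0;
    std::strtod(token.c_str(), &end);
    return end == token.c_str() + token.size();
}

// A column is categorical as soon as one non-missing value is not numeric;
// otherwise it is ordered. Tokens equal to the missing-value character are skipped.
VarType inferVarType(const std::vector<std::string>& column, char missCh = '?')
{
    for (std::size_t i = 0; i < column.size(); ++i)
    {
        const std::string& token = column[i];
        if (token.size() == 1 && token[0] == missCh)
            continue;
        if (!isNumber(token))
            return VAR_CATEGORICAL;
    }
    return VAR_ORDERED;
}
```

A column of numeric class labels such as ``0``, ``1``, ``1`` comes out as ordered under this rule, which is exactly the case where the type has to be changed manually.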
 CvMLData::get_values
 --------------------
 .. ocv:function:: const CvMat* CvMLData::get_values() const;
-    Returns the pointer to the predictors and responses ``values`` matrix or ``0`` if data has not been loaded from file yet. This matrix has rows count equal to samples count, columns count equal to predictors ``+ 1`` for response (if exist) count (i.e. each row of matrix is values of one sample predictors and response) and type ``CV_32FC1``.
+    Returns a pointer to the matrix of predictor and response ``values`` or ``0`` if the data has not been loaded from the file yet. 
+The row count of this matrix equals the sample count. The column count equals the predictor count ``+ 1`` for the response (if it exists). This means that each row of the matrix contains the predictor and response values of one sample. The matrix type is ``CV_32FC1``.
 CvMLData::get_responses
 -----------------------
 .. ocv:function:: const CvMat* CvMLData::get_responses();
-    Returns the pointer to the responses values matrix or throw exception if data has not been loaded from file yet. This matrix has rows count equal to samples count, one column and type ``CV_32FC1``.
+    Returns a pointer to the matrix of response values or throws an exception if the data has not been loaded from the file yet. 
+This is a single-column matrix of the type ``CV_32FC1``. Its row count is equal to the sample count.
 CvMLData::get_missing
 ---------------------
 .. ocv:function:: const CvMat* CvMLData::get_missing() const;
-    Returns the pointer to the missing values mask matrix or throw exception if data has not been loaded from file yet. This matrix has the same size as ``values`` matrix (see :ocv:func:`CvMLData::get_values`) and type ``CV_8UC1``.
+    Returns a pointer to the mask matrix of missing values or throws an exception if the data has not been loaded from the file yet. 
+This matrix has the same size as the ``values`` matrix (see :ocv:func:`CvMLData::get_values`) and the type ``CV_8UC1``.
 CvMLData::set_response_idx
 --------------------------
 .. ocv:function:: void CvMLData::set_response_idx( int idx );
-    Sets index of response column in ``values`` matrix (see :ocv:func:`CvMLData::get_values`) or throw exception if data has not been loaded from file yet. The old response column become pridictors. If ``idx < 0`` there will be no response.
+    Sets the index of a response column in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`) or throws an exception if the data has not been loaded from the file yet. 
+The old response column becomes a predictor. If ``idx < 0``, there is no response.
 CvMLData::get_response_idx
 ----------
 .. ocv:function:: int CvMLData::get_response_idx() const;
-    Gets response column index in ``values`` matrix (see :ocv:func:`CvMLData::get_values`), negative value there is no response or throw exception if data has not been loaded from file yet.
+    Gets the index of a response column in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`) or throws an exception if the data has not been loaded from the file yet.
+A negative return value means that there is no response.
 CvMLData::set_train_test_split
 ------------------------------
 .. ocv:function:: void CvMLData::set_train_test_split( const CvTrainTestSplit * spl );
-    For different purposes it can be useful to devide the read data set into two disjoint subsets: training and test ones. This method sets parametes for such split (using ``spl``, see :ocv:class:`CvTrainTestSplit`) and make the data split or throw exception if data has not been loaded from file yet. 
+    Divides the read data set into two disjoint training and test subsets. 
+This method sets parameters for such a split using ``spl`` (see :ocv:class:`CvTrainTestSplit`) or throws an exception if the data has not been loaded from the file yet. 
 CvMLData::get_train_sample_idx
 ------------------------------
 .. ocv:function:: const CvMat* CvMLData::get_train_sample_idx() const;
-    The read data set can be devided on training and test data subsets by setting split (see :ocv:func:`CvMLData::set_train_test_split`). Current method returns the matrix of samples indices for training subset (this matrix has one row and type ``CV_32SC1``). If data split is not set then the method returns ``0``. If data has not been loaded from file yet an exception is thrown.
+    Returns the matrix of sample indices for a training subset. 
+The read data set can be divided into training and test subsets by setting a split (see :ocv:func:`CvMLData::set_train_test_split`). The returned matrix is a single-row matrix of the type ``CV_32SC1``. If the data split is not set, the method returns ``0``. If the data has not been loaded from the file yet, an exception is thrown.
 CvMLData::get_test_sample_idx
 -----------------------------
 .. ocv:function:: const CvMat* CvMLData::get_test_sample_idx() const;
-    Analogically with :ocv:func:`CvMLData::get_train_sample_idx`, but for test subset.
+    Provides functionality similar to :ocv:func:`CvMLData::get_train_sample_idx` but for a test subset.
 CvMLData::mix_train_and_test_idx
 --------------------------------
 .. ocv:function:: void CvMLData::mix_train_and_test_idx();
-    Mixes the indices of training and test samples preserving sizes of training and test subsets (if data split is set by :ocv:func:`CvMLData::get_values`). If data has not been loaded from file yet an exception is thrown.
+    Mixes the indices of training and test samples preserving sizes of training and test subsets if the data split is set by :ocv:func:`CvMLData::set_train_test_split`. If the data has not been loaded from the file yet, an exception is thrown.
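The idea of a size-preserving train/test split with optional mixing can be sketched as follows. This is a standalone illustration and an assumption about the general approach, not the ``CvMLData`` code: ``IndexSplit`` and ``splitIndices`` are hypothetical names, and the shuffle uses the C++11 ``<random>`` facilities.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical container for the two disjoint index subsets.
struct IndexSplit
{
    std::vector<int> train;
    std::vector<int> test;
};

// Splits indices 0..sampleCount-1 into trainCount training indices and the
// remaining test indices. With mix == true the indices are shuffled first,
// so membership changes while both subset sizes stay the same.
IndexSplit splitIndices(int sampleCount, int trainCount, bool mix, unsigned seed)
{
    assert(sampleCount >= 0 && trainCount >= 0 && trainCount <= sampleCount);
    std::vector<int> idx(static_cast<std::size_t>(sampleCount));
    std::iota(idx.begin(), idx.end(), 0);
    if (mix)
    {
        std::mt19937 rng(seed);
        std::shuffle(idx.begin(), idx.end(), rng);
    }
    IndexSplit split;
    split.train.assign(idx.begin(), idx.begin() + trainCount);
    split.test.assign(idx.begin() + trainCount, idx.end());
    return split;
}
```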
 CvMLData::get_var_idx
 ---------------------
 .. ocv:function:: const CvMat* CvMLData::get_var_idx();
-    Returns used variables (columns) indices in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`), ``0`` if used subset is not set or throw exception if data has not been loaded from file yet. Returned matrix has one row, columns count equel to used variable subset size and type ``CV_32SC1``.
+    Returns the indices of variables (columns) used in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`). 
+The function returns ``0`` if the used subset is not set. It throws an exception if the data has not been loaded from the file yet. The returned matrix is a single-row matrix of the type ``CV_32SC1``. Its column count is equal to the size of the used variable subset.
 CvMLData::chahge_var_idx
 ------------------------
 .. ocv:function:: void CvMLData::chahge_var_idx( int vi, bool state );
-    By default after reading the data set all variables in ``values`` matrix (see :ocv:func:`CvMLData::get_values`) are used. But the user may want to use only subset of variables and can include on/off (depends on ``state`` value) a variable with ``vi`` index from used subset. If data has not been loaded from file yet an exception is thrown.
+    Includes a variable in the used subset or excludes it from the subset.
+By default, after reading the data set all variables in the ``values`` matrix (see :ocv:func:`CvMLData::get_values`) are used. But you may want to use only a subset of variables and include/exclude (depending on ``state`` value) a variable with the ``vi`` index from the used subset. If the data has not been loaded from the file yet, an exception is thrown.
 CvMLData::get_var_types
 -----------------------
 .. ocv:function:: const CvMat* CvMLData::get_var_types();
-    Returns matrix of used variable types. The matrix has one row, column count equel to used variables count and type ``CV_8UC1``. If data has not been loaded from file yet an exception is thrown.
+    Returns a matrix of used variable types. 
+The function returns a single-row matrix of the type ``CV_8UC1``. Its column count is equal to the count of used variables. If the data has not been loaded from the file yet, an exception is thrown.
 CvMLData::set_var_types
 -----------------------
 .. ocv:function:: void CvMLData::set_var_types( const char* str );
-    Sets variables types according to given string ``str``. The better description of the supporting string format is several examples of it: ``"ord[0-17],cat[18]"``, ``"ord[0,2,4,10-12], cat[1,3,5-9,13,14]"``, ``"cat"`` (all variables are categorical), ``"ord"`` (all variables are ordered). That is after the variable type a list of such type variables indices is followed.
+    Sets variable types according to the given string ``str``. 
+In the string, a variable type is followed by a list of variable indices. For example: ``"ord[0-17],cat[18]"``, ``"ord[0,2,4,10-12], cat[1,3,5-9,13,14]"``, ``"cat"`` (all variables are categorical), ``"ord"`` (all variables are ordered). 
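The string format shown in the examples above can be parsed with a small hand-written scanner. The following is a standalone sketch, not the ``set_var_types`` implementation: ``VarType``, ``readIndex``, and ``parseVarTypes`` are hypothetical names, the caller sizes ``types`` to the variable count beforehand, and inputs that do not follow the pattern of the examples (including spaces inside the brackets) are rejected.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for CV_VAR_ORDERED and CV_VAR_CATEGORICAL.
enum VarType { VAR_ORDERED, VAR_CATEGORICAL };

// Reads a non-negative integer starting at pos; fails if no digit is found.
static bool readIndex(const std::string& s, std::size_t& pos, std::size_t& value)
{
    if (pos >= s.size() || !std::isdigit(static_cast<unsigned char>(s[pos])))
        return false;
    value = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        value = value * 10 + static_cast<std::size_t>(s[pos++] - '0');
    return true;
}

// Applies a specification such as "ord[0,2,4,10-12], cat[1,3,5-9,13,14]" to types.
// A bare "ord" or "cat" sets every variable. Returns false on malformed input
// or on an index outside the range of types.
bool parseVarTypes(const std::string& str, std::vector<VarType>& types)
{
    std::size_t pos = 0;
    while (pos < str.size())
    {
        if (str[pos] == ' ' || str[pos] == ',') { ++pos; continue; }

        VarType type;
        if (str.compare(pos, 3, "ord") == 0)      type = VAR_ORDERED;
        else if (str.compare(pos, 3, "cat") == 0) type = VAR_CATEGORICAL;
        else return false;
        pos += 3;

        if (pos >= str.size() || str[pos] != '[')
        {
            types.assign(types.size(), type);  // bare keyword: all variables
            continue;
        }
        ++pos;  // skip '['
        for (;;)
        {
            std::size_t first = 0, last = 0;
            if (!readIndex(str, pos, first)) return false;
            last = first;
            if (pos < str.size() && str[pos] == '-')
            {
                ++pos;
                if (!readIndex(str, pos, last)) return false;
            }
            if (first > last || last >= types.size()) return false;
            for (std::size_t i = first; i <= last; ++i)
                types[i] = type;
            if (pos < str.size() && str[pos] == ',') { ++pos; continue; }
            break;
        }
        if (pos >= str.size() || str[pos] != ']') return false;
        ++pos;  // skip ']'
    }
    return true;
}
```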
 CvMLData::get_var_type
 ----------------------
 .. ocv:function:: int CvMLData::get_var_type( int var_idx ) const;
-    Returns type of variable by index ``var_idx`` ( ``CV_VAR_ORDERED`` or ``CV_VAR_CATEGORICAL``).
+    Returns the type of a variable by the index ``var_idx`` ( ``CV_VAR_ORDERED`` or ``CV_VAR_CATEGORICAL``).
 CvMLData::change_var_type
 -------------------------
@@ -154,37 +181,37 @@ CvMLData::set_delimiter
 -----------------------
 .. ocv:function:: void CvMLData::set_delimiter( char ch );
-    Sets the delimiter for the variable values in file. E.g. ``','`` (default), ``';'``, ``' '`` (space) or other character (exapt float separator ``'.'``).
+    Sets the delimiter for variable values in a file. For example: ``','`` (default), ``';'``, ``' '`` (space), or other characters. The float separator ``'.'`` is not allowed.
 CvMLData::get_delimiter
 -----------------------
 .. ocv:function:: char CvMLData::get_delimiter() const;
-    Gets the set delimiter charecter.
+    Gets the delimiter character that is currently set.
 CvMLData::set_miss_ch
 ---------------------
 .. ocv:function:: void CvMLData::set_miss_ch( char ch );
-    Sets the character denoting the missing of value. E.g. ``'?'`` (default), ``'-'``, etc (exapt float separator ``'.'``).
+    Sets the character for a missing value. For example: ``'?'`` (default), ``'-'``. The float separator ``'.'`` is not allowed.
 CvMLData::get_miss_ch
 ---------------------
 .. ocv:function:: char CvMLData::get_miss_ch() const;
-    Gets the character denoting the missing value.
+    Gets the character for a missing value.
 CvMLData::get_class_labels_map
 -------------------------------
 .. ocv:function:: const std::map<std::string, int>& CvMLData::get_class_labels_map() const;
-    Returns map that converts string class labels to the numerical class labels. It can be used to get original class label (as in file).
+    Returns a map that converts string class labels to numerical class labels. It can be used to get the original class label as it appears in the file.
 CvTrainTestSplit
 ----------------
 .. ocv:class:: CvTrainTestSplit
-The structure to set split of data set read by :ocv:class:`CvMLData`.
+Structure setting the split of a data set read by :ocv:class:`CvMLData`.
 ::
    struct CvTrainTestSplit
@@ -203,4 +230,8 @@ The structure to set split of data set read by :ocv:class:`CvMLData`.
        bool mix;
    };
-There are two ways to construct split. The first is by setting training sample count (subset size) ``train_sample_count``; other existing samples will be in test subset. The second is by setting training sample portion in ``[0,..1]``. The flag ``mix`` is used to mix training and test samples indices when split will be set, otherwise the data set will be devided in the storing order (first part of samples of given size is the training subset, other part is the test one).
+There are two ways to construct a split:
+* Set the training sample count (subset size) ``train_sample_count``. The remaining samples form the test subset.
+* Set the training sample portion as a fraction in ``[0, 1]``.
+The flag ``mix`` controls whether the training and test sample indices are mixed when the split is set. Otherwise, the data set is split in storage order: the first part of the samples, of the given size, is the training subset, and the rest is the test subset.
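The two ways of constructing a split can be sketched on plain index vectors. This is a standalone illustration, not the OpenCV implementation: the functions ``split_indices`` and ``split_by_portion``, the ``seed`` parameter, and rounding the portion down to a whole count are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

typedef std::pair<std::vector<int>, std::vector<int> > IndexSplit; // {train, test}

// Hypothetical helper: splits sample indices 0..sample_count-1 by training count.
// If mix is true, indices are shuffled first; otherwise storage order is kept.
IndexSplit split_indices(int sample_count, int train_sample_count, bool mix, unsigned seed = 0)
{
    std::vector<int> idx(sample_count);
    std::iota(idx.begin(), idx.end(), 0); // 0, 1, ..., sample_count - 1
    if (mix)
    {
        std::mt19937 rng(seed);
        std::shuffle(idx.begin(), idx.end(), rng);
    }
    // Clamp the requested count to the valid range.
    train_sample_count = std::max(0, std::min(train_sample_count, sample_count));
    std::vector<int> train(idx.begin(), idx.begin() + train_sample_count);
    std::vector<int> test(idx.begin() + train_sample_count, idx.end());
    return std::make_pair(train, test);
}

// Hypothetical helper: the same split, given as a portion in [0, 1].
// Rounding down to a whole number of samples is an assumption of this sketch.
IndexSplit split_by_portion(int sample_count, double train_sample_portion, bool mix, unsigned seed = 0)
{
    const int count = static_cast<int>(std::floor(train_sample_portion * sample_count));
    return split_indices(sample_count, count, mix, seed);
}
```

With ``mix`` set to ``false``, ``split_indices(10, 7, false)`` puts samples 0 through 6 in the training subset and samples 7 through 9 in the test subset.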
--- a/modules/ml/doc/neural_networks.rst
+++ b/modules/ml/doc/neural_networks.rst
 Neural Networks
 ===============
-ML implements feed-forward artificial neural networks, more particularly, multi-layer perceptrons (MLP), the most commonly used type of neural networks. MLP consists of the input layer, output layer, and one or more hidden layers. Each layer of MLP includes one or more neurons that are directionally linked with the neurons from the previous and the next layer. The example below represents a 3-layer perceptron with three inputs, two outputs, and the hidden layer including five neurons:
+.. highlight:: cpp
+ML implements feed-forward artificial neural networks or, more particularly, multi-layer perceptrons (MLP), the most commonly used type of neural networks. MLP consists of the input layer, output layer, and one or more hidden layers. Each layer of MLP includes one or more neurons directionally linked with the neurons from the previous and the next layer. The example below represents a 3-layer perceptron with three inputs, two outputs, and the hidden layer including five neurons:
 .. image:: pics/mlp.png
@@ -45,10 +47,13 @@ In ML, all the neurons have the same activation functions, with the same free pa
 So, the whole trained network works as follows: 
-#. It takes the feature vector as input. The vector size is equal to the size of the input layer.
+#. Take the feature vector as input. The vector size is equal to the size of the input layer.
-#. Values are passed as input to the first hidden layer.
-#. Outputs of the hidden layer are computed using the weights and the activation functions.
+#. Pass values as input to the first hidden layer.
-#. Outputs are passed further downstream until you compute the output layer.
+#. Compute outputs of the hidden layer using the weights and the activation functions.
+#. Pass outputs further downstream until you compute the output layer.
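The steps above can be sketched as a plain forward pass. This is a standalone illustration rather than the ``CvANN_MLP`` code: the function names are hypothetical, the per-neuron bias term stored as the last weight is an assumption, and ``std::tanh`` stands in for whatever activation function the network actually uses.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > LayerWeights; // one row per neuron

// Hypothetical helper: computes the outputs of one layer.
// weights[j] holds the input weights of neuron j followed by its bias term,
// so each row must have input.size() + 1 elements.
std::vector<double> layer_forward(const std::vector<double>& input, const LayerWeights& weights)
{
    std::vector<double> output(weights.size());
    for (std::size_t j = 0; j < weights.size(); ++j)
    {
        double sum = weights[j].back(); // bias term (assumed layout)
        for (std::size_t i = 0; i < input.size(); ++i)
            sum += weights[j][i] * input[i];
        output[j] = std::tanh(sum); // assumed activation function
    }
    return output;
}

// Hypothetical helper: passes the feature vector through all layers in order
// and returns the values of the output layer.
std::vector<double> network_forward(std::vector<double> values, const std::vector<LayerWeights>& layers)
{
    for (std::size_t l = 0; l < layers.size(); ++l)
        values = layer_forward(values, layers[l]);
    return values;
}
```

Each call to ``layer_forward`` corresponds to one "compute outputs and pass them downstream" step. The loop in ``network_forward`` repeats it until the output layer is reached.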
 So, to compute the network, you need to know all the
 weights
@@ -66,10 +71,10 @@ so the error on the test set usually starts increasing after the network
 size reaches a limit. Besides, the larger networks are trained much
 longer than the smaller ones, so it is reasonable to pre-process the data,
 using
-:ref:`PCA::operator ()` or similar technique, and train a smaller network
+:ocv:func:`PCA::operator ()` or a similar technique, and train a smaller network
 on only essential features.
-Another feature of MLP's is their inability to handle categorical
+Another MLP feature is the inability to handle categorical
 data as is. However, there is a workaround. If a certain feature in the
 input or output (in case of ``n`` -class classifier for
 :math:`n>2` ) layer is categorical and can take
@@ -101,9 +106,9 @@ References:
 CvANN_MLP_TrainParams
 ---------------------
-.. c:type:: CvANN_MLP_TrainParams
+.. ocv:class:: CvANN_MLP_TrainParams
-Parameters of the MLP training algorithm ::
+Parameters of the MLP training algorithm. ::
    struct CvANN_MLP_TrainParams
    {
@@ -134,9 +139,9 @@ The structure has a default constructor that initializes parameters for the ``RP
 CvANN_MLP
 ---------
-.. c:type:: CvANN_MLP
+.. ocv:class:: CvANN_MLP
-MLP model ::
+MLP model. ::
    class CvANN_MLP : public CvStatModel
    {
@@ -259,9 +264,9 @@ CvANN_MLP::train
    :param _flags: Various parameters to control the training algorithm. A combination of the following parameters is possible:
-            * **UPDATE_WEIGHTS = 1** Algorithm updates the network weights, rather than computes them from scratch (in the latter case the weights are initialized using the  Nguyen-Widrow  algorithm).
+            * **UPDATE_WEIGHTS = 1** Algorithm updates the network weights, rather than computes them from scratch. In the latter case the weights are initialized using the  Nguyen-Widrow  algorithm.
-            * **NO_INPUT_SCALE** Algorithm does not normalize the input vectors. If this flag is not set, the training algorithm normalizes each input feature independently, shifting its mean value to 0 and making the standard deviation =1. If the network is assumed to be updated frequently, the new training data could be much different from original one. In this case, you should take care of proper normalization.
+            * **NO_INPUT_SCALE** Algorithm does not normalize the input vectors. If this flag is not set, the training algorithm normalizes each input feature independently, shifting its mean value to 0 and making the standard deviation equal to 1. If the network is expected to be updated frequently, the new training data may differ considerably from the original data. In this case, you should take care of proper normalization yourself.
            * **NO_OUTPUT_SCALE** Algorithm does not normalize the output vectors. If the flag is not set, the training algorithm normalizes each output feature independently by transforming it to a certain range that depends on the activation function used.
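The input normalization described for ``NO_INPUT_SCALE`` (mean shifted to 0, standard deviation made equal to 1, each feature independently) can be sketched as follows. This is a standalone illustration, not the library code: ``normalize_features`` is a hypothetical name, and both the use of the population standard deviation and the handling of constant features are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: normalizes each feature (column) in place so that its
// mean becomes 0 and its standard deviation becomes 1.
// samples[r][c] is feature c of sample r; all rows must have the same length.
void normalize_features(std::vector<std::vector<double> >& samples)
{
    if (samples.empty()) return;
    const std::size_t n = samples.size();
    const std::size_t dims = samples[0].size();
    for (std::size_t c = 0; c < dims; ++c)
    {
        double mean = 0.0;
        for (std::size_t r = 0; r < n; ++r) mean += samples[r][c];
        mean /= n;

        // Population variance (dividing by n) is an assumption of this sketch.
        double var = 0.0;
        for (std::size_t r = 0; r < n; ++r)
            var += (samples[r][c] - mean) * (samples[r][c] - mean);
        const double sd = std::sqrt(var / n);

        // A constant feature has zero deviation; map it to 0 (assumed behavior).
        for (std::size_t r = 0; r < n; ++r)
            samples[r][c] = sd > 0.0 ? (samples[r][c] - mean) / sd : 0.0;
    }
}
```

If the training data changes between updates, applying the same kind of transform yourself keeps the inputs on a comparable scale.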

--- a/modules/ml/doc/normal_bayes_classifier.rst
+++ b/modules/ml/doc/normal_bayes_classifier.rst
@@ -3,7 +3,9 @@
 Normal Bayes Classifier
 =======================
-This is a simple classification model assuming that feature vectors from each class are normally distributed (though, not necessarily independently distributed). So, the whole data distribution function is assumed to be a Gaussian mixture, one component per  class. Using the training data the algorithm estimates mean vectors and covariance matrices for every class, and then it uses them for prediction.
+.. highlight:: cpp
+This simple classification model assumes that feature vectors from each class are normally distributed (though not necessarily independently distributed). So, the whole data distribution function is assumed to be a Gaussian mixture, one component per class. Using the training data, the algorithm estimates mean vectors and covariance matrices for every class, and then it uses them for prediction.
 [Fukunaga90] K. Fukunaga. *Introduction to Statistical Pattern Recognition*. second ed., New York: Academic Press, 1990.
@@ -11,9 +13,9 @@ This is a simple classification model assuming that feature vectors from each cl
 CvNormalBayesClassifier
 -----------------------
-.. c:type:: CvNormalBayesClassifier
+.. ocv:class:: CvNormalBayesClassifier
-Bayes classifier for normally distributed data ::
+Bayes classifier for normally distributed data. ::
    class CvNormalBayesClassifier : public CvStatModel
    {
@@ -50,7 +52,7 @@ CvNormalBayesClassifier::train
    Trains the model.
-The method trains the Normal Bayes classifier. It follows the conventions of the generic ``train`` "method" with the following limitations: 
+The method trains the Normal Bayes classifier. It follows the conventions of the generic ``train`` approach with the following limitations: 
 * Only ``CV_ROW_SAMPLE`` data layout is supported.
 * Input variables are all ordered.

--- a/modules/ml/doc/support_vector_machines.rst
+++ b/modules/ml/doc/support_vector_machines.rst
@@ -3,9 +3,9 @@ Support Vector Machines
 .. highlight:: cpp
-Originally, support vector machines (SVM) was a technique for building an optimal binary (2-class) classifier. Later the technique has been extended to regression and clustering problems. SVM is a partial case of kernel-based methods. It maps feature vectors into a higher-dimensional space using a kernel function and builds an optimal linear discriminating function in this space or an optimal hyper-plane that fits into the training data. In case of SVM, the kernel is not defined explicitly. Instead, a distance between any 2 points in the hyper-space needs to be defined.
+Originally, support vector machines (SVM) was a technique for building an optimal binary (2-class) classifier. Later, the technique was extended to regression and clustering problems. SVM is a special case of kernel-based methods. It maps feature vectors into a higher-dimensional space using a kernel function and builds an optimal linear discriminating function in this space or an optimal hyper-plane that fits into the training data. In case of SVM, the kernel is not defined explicitly. Instead, a distance between any two points in the hyper-space needs to be defined.
-The solution is optimal, which means that the margin between the separating hyper-plane and the nearest feature vectors from both classes (in case of 2-class classifier) is maximal. The feature vectors that are the closest to the hyper-plane are called "support vectors", which means that the position of other vectors does not affect the hyper-plane (the decision function).
+The solution is optimal, which means that the margin between the separating hyper-plane and the nearest feature vectors from both classes (in case of 2-class classifier) is maximal. The feature vectors that are the closest to the hyper-plane are called *support vectors*, which means that the position of other vectors does not affect the hyper-plane (the decision function).
 There are a lot of good references on SVM. You may consider starting with the following:
@@ -27,9 +27,9 @@ There are a lot of good references on SVM. You may consider starting with the fo
 CvSVM
 -----
-.. c:type:: CvSVM
+.. ocv:class:: CvSVM
-Support Vector Machines ::
+Support Vector Machines. ::
    class CvSVM : public CvStatModel
    {
@@ -90,9 +90,9 @@ Support Vector Machines ::
 CvSVMParams
 -----------
-.. c:type:: CvSVMParams
+.. ocv:class:: CvSVMParams
-SVM training parameters ::
+SVM training parameters. ::
    struct CvSVMParams
    {
@@ -117,7 +117,7 @@ SVM training parameters ::
 The structure must be initialized and passed to the training method of
-:ref:`CvSVM` .
+:ocv:class:`CvSVM` .
 .. index:: CvSVM::train
@@ -127,17 +127,20 @@ CvSVM::train
 ------------
 .. ocv:function:: bool CvSVM::train(  const Mat& _train_data,  const Mat& _responses,                     const Mat& _var_idx=Mat(),  const Mat& _sample_idx=Mat(),                     CvSVMParams _params=CvSVMParams() )
-    Trains SVM.
+    Trains an SVM.
-The method trains the SVM model. It follows the conventions of the generic ``train`` "method" with the following limitations: 
+The method trains the SVM model. It follows the conventions of the generic ``train`` approach with the following limitations: 
 * Only the ``CV_ROW_SAMPLE`` data layout is supported.
 * Input variables are all ordered.
 * Output variables can be either categorical ( ``_params.svm_type=CvSVM::C_SVC`` or ``_params.svm_type=CvSVM::NU_SVC`` ), or ordered ( ``_params.svm_type=CvSVM::EPS_SVR`` or ``_params.svm_type=CvSVM::NU_SVR`` ), or not required at all ( ``_params.svm_type=CvSVM::ONE_CLASS`` ).
 * Missing measurements are not supported.
 All the other parameters are gathered in the
-:ref:`CvSVMParams` structure.
+:ocv:class:`CvSVMParams` structure.
 .. index:: CvSVM::train_auto
@@ -147,16 +150,16 @@ CvSVM::train_auto
 -----------------
 .. ocv:function:: bool CvSVM::train_auto(  const Mat& _train_data,  const Mat& _responses,          const Mat& _var_idx,  const Mat& _sample_idx,          CvSVMParams params,  int k_fold = 10,          CvParamGrid C_grid      = get_default_grid(CvSVM::C),          CvParamGrid gamma_grid  = get_default_grid(CvSVM::GAMMA),          CvParamGrid p_grid      = get_default_grid(CvSVM::P),          CvParamGrid nu_grid     = get_default_grid(CvSVM::NU),          CvParamGrid coef_grid   = get_default_grid(CvSVM::COEF),          CvParamGrid degree_grid = get_default_grid(CvSVM::DEGREE) )
-    Trains SVM with optimal parameters.
+    Trains an SVM with optimal parameters.
    :param k_fold: Cross-validation parameter. The training set is divided into  ``k_fold``  subsets. One subset is used to test the model, the others form the training set. So, the SVM algorithm is executed  ``k_fold``  times.
 The method trains the SVM model automatically by choosing the optimal
 parameters ``C`` , ``gamma`` , ``p`` , ``nu`` , ``coef0`` , ``degree`` from
-:ref:`CvSVMParams`. Parameters are considered optimal
+:ocv:class:`CvSVMParams`. Parameters are considered optimal
 when the cross-validation estimate of the test set error
 is minimal. The parameters are iterated over a logarithmic grid. For
-example, the parameter ``gamma`` takes the values in the set
+example, the parameter ``gamma`` takes values in the set
 (
 :math:`min`,
 :math:`min*step`,
@@ -165,7 +168,7 @@ example, the parameter ``gamma`` takes the values in the set
 where
 :math:`min` is ``gamma_grid.min_val`` ,
 :math:`step` is ``gamma_grid.step`` , and
-:math:`n` is the maximal index such that
+:math:`n` is the maximal index where
 .. math::
@@ -173,12 +176,12 @@ where
 So ``step`` must always be greater than 1.
-If there is no need to optimize a parameter, the corresponding grid step should be set to any value less or equal to 1. For example, to avoid optimization in ``gamma`` , set ``gamma_grid.step = 0`` , ``gamma_grid.min_val`` , ``gamma_grid.max_val`` as arbitrary numbers. In this case, the value ``params.gamma`` is taken for ``gamma`` .
+If there is no need to optimize a parameter, the corresponding grid step should be set to any value less than or equal to 1. For example, to avoid optimization in ``gamma`` , set ``gamma_grid.step = 0`` and set ``gamma_grid.min_val`` and ``gamma_grid.max_val`` to arbitrary numbers. In this case, the value ``params.gamma`` is taken for ``gamma`` .
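The logarithmic grid iteration can be illustrated with a short standalone function. This is a sketch under assumptions, not the ``CvSVM`` code: ``log_grid`` is a hypothetical name, the strict upper bound ``value < max_val`` is assumed because the exact condition is not shown in this excerpt, and returning an empty list for a step less than or equal to 1 simply mirrors the "no optimization" case described above.

```cpp
#include <vector>

// Hypothetical helper: lists the values min_val, min_val*step, min_val*step^2, ...
// that stay below max_val (strict upper bound assumed).
// Returns an empty list when step <= 1, that is, when the parameter is not optimized.
std::vector<double> log_grid(double min_val, double max_val, double step)
{
    std::vector<double> values;
    // A step <= 1 or a non-positive start would never reach max_val.
    if (step <= 1.0 || min_val <= 0.0) return values;
    for (double v = min_val; v < max_val; v *= step)
        values.push_back(v);
    return values;
}
```

For example, ``log_grid(0.5, 5.0, 2.0)`` yields 0.5, 1, 2, and 4, while ``log_grid(1.0, 100.0, 0.0)`` yields no values, so the caller would fall back to the fixed value from ``params``.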
 And, finally, if the optimization in a parameter is required but
 the corresponding grid is unknown, you may call the function ``CvSVM::get_default_grid`` . To generate a grid, for example, for ``gamma`` , call ``CvSVM::get_default_grid(CvSVM::GAMMA)`` .
-This function works for the case of classification
+This function works for the classification
 ( ``params.svm_type=CvSVM::C_SVC`` or ``params.svm_type=CvSVM::NU_SVC`` )
 as well as for the regression
 ( ``params.svm_type=CvSVM::EPS_SVR`` or ``params.svm_type=CvSVM::NU_SVR`` ). If ``params.svm_type=CvSVM::ONE_CLASS`` , no optimization is made and the usual SVM with parameters specified in ``params``  is executed.
@@ -207,7 +210,7 @@ CvSVM::get_default_grid
            * **CvSVM::DEGREE**
-        The grid will be generated for the parameter with this ID.
+        The grid is generated for the parameter with this ID.
 The function generates a grid for the specified parameter of the SVM algorithm. The grid may be passed to the function ``CvSVM::train_auto`` .