@@ -3,15 +3,14 @@ Feature Detection and Description
.. highlight:: cpp
.. index:: gpu::SURF_GPU
gpu::SURF_GPU
-------------
.. cpp:class:: gpu::SURF_GPU
Class used for extracting Speeded Up Robust Features (SURF) from an image.
::
class SURF_GPU : public CvSURFParams
{
...
...
@@ -20,8 +19,7 @@ Class for extracting Speeded Up Robust Features from an image. ::
SURF_GPU();
//! the full constructor taking all the necessary parameters
explicit SURF_GPU(double _hessianThreshold, int _nOctaves=4,
int _nOctaveLayers=2, bool _extended=false, float _keypointsRatio=0.01f,
bool _upright = false);
//! returns the descriptor size in float's (64 or 128)
int descriptorSize() const;
...
...
@@ -61,8 +59,6 @@ Class for extracting Speeded Up Robust Features from an image. ::
//! max keypoints = keypointsRatio * img.size().area()
float keypointsRatio;
bool upright;
GpuMat sum, mask1, maskSum, intBuffer;
...
...
@@ -73,15 +69,15 @@ Class for extracting Speeded Up Robust Features from an image. ::
GpuMat keypointsBuffer;
};
The class ``SURF_GPU`` implements the Speeded Up Robust Features descriptor. It includes a fast multi-scale Hessian keypoint detector that can be used to find the keypoints (which is the default option), but the descriptors can also be computed for user-specified keypoints. Only 8-bit grayscale images are supported.

The class ``SURF_GPU`` can store results in the GPU and CPU memory. It provides functions to convert results between the CPU and GPU versions ( ``uploadKeypoints``, ``downloadKeypoints``, ``downloadDescriptors`` ). The format of CPU results is the same as that of ``SURF`` results. GPU results are stored in ``GpuMat``. The ``keypoints`` matrix is a one-row matrix of the ``CV_32FC6`` type. It contains 6 float values per feature: ``x, y, laplacian, size, dir, hessian``. The ``descriptors`` matrix is an :math:`\texttt{nFeatures} \times \texttt{descriptorSize}` matrix with the ``CV_32FC1`` type.

The class ``SURF_GPU`` uses some buffers and provides access to them. All buffers can be safely released between function calls.

See Also: :c:type:`SURF`.
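Below is a minimal usage sketch based on the interface above; it assumes the GPU module headers are included and that ``img_cpu`` is an 8-bit grayscale ``cv::Mat``: ::

    cv::gpu::GpuMat img_gpu(img_cpu);                // upload the CV_8UC1 image
    cv::gpu::GpuMat keypoints_gpu, descriptors_gpu;  // results stay in GPU memory

    cv::gpu::SURF_GPU surf;
    // detect keypoints and compute descriptors; an empty mask means "use the whole image"
    surf(img_gpu, cv::gpu::GpuMat(), keypoints_gpu, descriptors_gpu);

    // convert the CV_32FC6 keypoints matrix to the usual CPU representation
    std::vector<cv::KeyPoint> keypoints;
    surf.downloadKeypoints(keypoints_gpu, keypoints);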
.. index:: gpu::BruteForceMatcher_GPU
...
...
@@ -104,7 +100,7 @@ Brute-force descriptor matcher. For each descriptor in the first set, this match
// Clear train descriptors collection.
void clear();
// Return true if there are no train descriptors in collection.
bool empty() const;
// Return true if the matcher supports mask in match methods.
...
...
@@ -173,27 +169,23 @@ Brute-force descriptor matcher. For each descriptor in the first set, this match
std::vector<GpuMat> trainDescCollection;
};
The class ``BruteForceMatcher_GPU`` has an interface similar to the class :c:type:`DescriptorMatcher`. It has two groups of ``match`` methods: for matching descriptors of one image with another image or with an image set. Also, all functions have an alternative to save results either to the GPU memory or to the CPU memory. The ``Distance`` template parameter is kept for CPU/GPU interface similarity. ``BruteForceMatcher_GPU`` supports only the ``L1<float>`` and ``L2<float>`` distance types.
See also: :c:type:`DescriptorMatcher`, :c:type:`BruteForceMatcher`.
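A minimal sketch of the CPU-result path, assuming ``queryDescs`` and ``trainDescs`` are ``CV_32FC1`` descriptor matrices already uploaded to the GPU (for example, computed by ``SURF_GPU``): ::

    cv::gpu::BruteForceMatcher_GPU< cv::L2<float> > matcher;

    std::vector<cv::DMatch> matches;
    matcher.match(queryDescs, trainDescs, matches);  // results are downloaded to CPU memory

    // matches[i].queryIdx / matches[i].trainIdx index rows of the descriptor matrices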
Finds the best match for each query descriptor. Results are stored in the GPU memory.

:param queryDescs: Query set of descriptors.

:param trainDescs: Training set of descriptors. It is not added to the train descriptors collection stored in the class object.

:param trainIdx: Output single-row ``CV_32SC1`` matrix that contains the best train index for each query. If some query descriptors are masked out in ``mask``, it contains -1.

:param distance: Output single-row ``CV_32FC1`` matrix that contains the best distance for each query. If some query descriptors are masked out in ``mask``, it contains ``FLT_MAX``.

:param mask: Mask specifying permissible matches between the input query and train matrices of descriptors.
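A sketch of the GPU-result path described above; the raw ``trainIdx`` and ``distance`` matrices can later be converted to ``DMatch`` objects with ``matchDownload`` (see below): ::

    cv::gpu::BruteForceMatcher_GPU< cv::L2<float> > matcher;

    cv::gpu::GpuMat trainIdx, distance;              // single-row CV_32SC1 / CV_32FC1 results
    matcher.matchSingle(queryDescs, trainDescs, trainIdx, distance);

    std::vector<cv::DMatch> matches;
    cv::gpu::BruteForceMatcher_GPU< cv::L2<float> >::matchDownload(trainIdx, distance, matches);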
Finds the best match for each query descriptor from the train collection. Results are stored in the GPU memory.

:param queryDescs: Query set of descriptors.

:param trainCollection: :cpp:class:`gpu::GpuMat` containing the train collection. It can be obtained from the train descriptors collection that was set using the ``add`` method, by means of :cpp:func:`gpu::BruteForceMatcher_GPU::makeGpuCollection`, or it may contain a user-defined collection. It must be a one-row matrix where each element is a :cpp:class:`gpu::DevMem2D_` pointing to one matrix of train descriptors.

:param trainIdx: Output single-row ``CV_32SC1`` matrix that contains the best train index for each query. If some query descriptors are masked out in ``maskCollection``, it contains -1.

:param imgIdx: Output single-row ``CV_32SC1`` matrix that contains the image train index for each query. If some query descriptors are masked out in ``maskCollection``, it contains -1.

:param distance: Output single-row ``CV_32FC1`` matrix that contains the best distance for each query. If some query descriptors are masked out in ``maskCollection``, it contains ``FLT_MAX``.

:param maskCollection: :cpp:class:`gpu::GpuMat` containing a set of masks. It can be obtained from ``std::vector<GpuMat>`` by :cpp:func:`gpu::BruteForceMatcher_GPU::makeGpuCollection`, or it may contain a user-defined mask set. It must be an empty matrix or a one-row matrix where each element is a :cpp:class:`gpu::PtrStep_` that points to one mask.
Downloads the ``trainIdx``, ``imgIdx``, and ``distance`` matrices obtained via :cpp:func:`gpu::BruteForceMatcher_GPU::matchSingle` or :cpp:func:`gpu::BruteForceMatcher_GPU::matchCollection` to a CPU vector of :c:type:`DMatch`.
Finds the k best matches for each descriptor from a query set with train descriptors. The found k (or less if not possible) matches are returned in the increasing order by distance.

See Also: :c:func:`DescriptorMatcher::knnMatch`.

.. index:: gpu::BruteForceMatcher_GPU::knnMatch

Finds the k best matches for each descriptor from a query set with train descriptors. The found k (or less if not possible) matches are returned in the increasing order by distance. Results are stored in the GPU memory.

:param queryDescs: Query set of descriptors.

:param trainDescs: Training set of descriptors. It is not added to the train descriptors collection stored in the class object.

:param trainIdx: Output matrix of ``queryDescs.rows x k`` size and ``CV_32SC1`` type. ``trainIdx.at<int>(i, j)`` contains the index of the j-th best match for the i-th query descriptor. If some query descriptors are masked out in ``mask``, it contains -1.

:param distance: Output matrix of ``queryDescs.rows x k`` size and ``CV_32FC1`` type. ``distance.at<float>(i, j)`` contains the distance from the j-th best match for the i-th query descriptor to the query descriptor. If some query descriptors are masked out in ``mask``, it contains ``FLT_MAX``.

:param allDist: Floating-point matrix of the ``queryDescs.rows x trainDescs.rows`` size. This is a buffer to store all distances between each query descriptor and each train descriptor. On output, ``allDist.at<float>(queryIdx, trainIdx)`` contains ``FLT_MAX`` if ``trainIdx`` is one of the k best; otherwise, it contains the distance between the ``queryIdx`` and ``trainIdx`` descriptors.

:param k: Number of the best matches found per each query descriptor (or less if it is not possible).

:param mask: Mask specifying permissible matches between the input query and train matrices of descriptors.
Downloads the ``trainIdx`` and ``distance`` matrices obtained via :cpp:func:`gpu::BruteForceMatcher_GPU::knnMatch` to a CPU vector of :c:type:`DMatch`. If ``compactResult`` is true, the ``matches`` vector does not contain matches for fully masked-out query descriptors.
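A sketch of the simplest ``knnMatch`` overload, which hides the ``trainIdx``/``distance``/``allDist`` buffers and returns CPU results directly (the ratio-test threshold is only an illustration): ::

    cv::gpu::BruteForceMatcher_GPU< cv::L2<float> > matcher;

    // keep the 2 best candidates for each query descriptor
    std::vector< std::vector<cv::DMatch> > knnMatches;
    matcher.knnMatch(queryDescs, trainDescs, knnMatches, 2);

    std::vector<cv::DMatch> good;
    for (size_t i = 0; i < knnMatches.size(); ++i)
        if (knnMatches[i].size() == 2 &&
            knnMatches[i][0].distance < 0.8f * knnMatches[i][1].distance)
            good.push_back(knnMatches[i][0]);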
For each query descriptor, finds the best matches with a distance less than a given threshold. The function returns the detected matches in the increasing order by distance.

This function works only on devices with a compute capability :math:`>=` 1.1.

See Also: :c:func:`DescriptorMatcher::radiusMatch`.

For each query descriptor, finds the best matches with a distance less than a given threshold (``maxDistance``). The results are stored in the GPU memory.

:param queryDescs: Query set of descriptors.

:param trainDescs: Training set of descriptors. It is not added to the train descriptors collection stored in the class object.

:param trainIdx: ``trainIdx.at<int>(i, j)`` is the index of the j-th training descriptor that is close enough to the i-th query descriptor. If ``trainIdx`` is empty, it is created with the ``queryDescs.rows x trainDescs.rows`` size. When the matrix is pre-allocated, it can have less than ``trainDescs.rows`` columns; then the function returns as many matches for each query descriptor as fit into the matrix.

:param nMatches: ``nMatches.at<unsigned int>(0, i)`` contains the number of matching descriptors for the i-th query descriptor. The value can be larger than ``trainIdx.cols``, which means that the function could not store all the matches since it did not have enough memory.

:param distance: ``distance.at<float>(i, j)`` is the distance between the j-th match for the i-th query descriptor and this very query descriptor. The matrix has the ``CV_32FC1`` type and the same size as ``trainIdx``.

:param maxDistance: Distance threshold.

:param mask: Mask specifying permissible matches between the input query and train matrices of descriptors.

In contrast to :cpp:func:`gpu::BruteForceMatcher_GPU::knnMatch`, here the results are not sorted by the distance. This function works only on devices with a compute capability :math:`>=` 1.1.
Downloads the ``trainIdx``, ``nMatches``, and ``distance`` matrices obtained via :cpp:func:`gpu::BruteForceMatcher_GPU::radiusMatch` to a CPU vector of :c:type:`DMatch`. If ``compactResult`` is true, the ``matches`` vector does not contain matches for fully masked-out query descriptors.
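A sketch of the CPU-result ``radiusMatch`` overload; remember the compute capability :math:`>=` 1.1 requirement noted above and that the returned matches are not sorted by distance: ::

    cv::gpu::BruteForceMatcher_GPU< cv::L2<float> > matcher;

    const float maxDistance = 0.25f;                 // task-specific threshold
    std::vector< std::vector<cv::DMatch> > matches;  // one (unsorted) list per query descriptor
    matcher.radiusMatch(queryDescs, trainDescs, matches, maxDistance);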
Functions and classes described in this section are used to perform various linear or non-linear filtering operations on 2D images.
See also: :ref:`ImageFiltering`.
.. index:: gpu::BaseRowFilter_GPU
gpu::BaseRowFilter_GPU
...
...
@@ -28,9 +24,8 @@ The base class for linear or non-linear filters that processes rows of 2D arrays
int ksize, anchor;
};
**Note:** This class does not allocate memory for a destination image. Usually this class is used inside :cpp:class:`gpu::FilterEngine_GPU`.
.. index:: gpu::BaseColumnFilter_GPU
...
...
@@ -49,9 +44,9 @@ The base class for linear or non-linear filters that processes columns of 2D arr
int ksize, anchor;
};
**Note:** This class does not allocate memory for a destination image. Usually this class is used inside :cpp:class:`gpu::FilterEngine_GPU`.
.. index:: gpu::BaseFilter_GPU
...
...
@@ -72,9 +67,8 @@ The base class for non-separable 2D filters. ::
};
**Note:** This class does not allocate memory for a destination image. Usually this class is used inside :cpp:class:`gpu::FilterEngine_GPU`.
.. index:: gpu::FilterEngine_GPU
...
...
@@ -93,7 +87,9 @@ The base class for Filter Engine. ::
Rect roi = Rect(0,0,-1,-1)) = 0;
};
The class can be used to apply an arbitrary filtering operation to an image. It contains all the necessary intermediate buffers. Pointers to the initialized ``FilterEngine_GPU`` instances are returned by various ``create*Filter_GPU`` functions (see below), and they are used inside high-level functions such as :cpp:func:`gpu::filter2D`, :cpp:func:`gpu::erode`, :cpp:func:`gpu::Sobel`, and others.
By using ``FilterEngine_GPU`` instead of functions you can avoid unnecessary memory allocation for intermediate buffers and get much better performance: ::
...
...
@@ -117,31 +113,27 @@ By using ``FilterEngine_GPU`` instead of functions you can avoid unnecessary mem
// Release buffers only once
filter.release();
``FilterEngine_GPU`` can process a rectangular sub-region of an image. By default, if ``roi == Rect(0,0,-1,-1)``, ``FilterEngine_GPU`` processes the inner region of an image ( ``Rect(anchor.x, anchor.y, src_size.width - ksize.width, src_size.height - ksize.height)`` ) because, for better performance, some filters do not check whether indices are outside the image. See below to understand which filters support processing the whole image, which do not, and what the image type limitations are.

**Note:** The GPU filters do not support the in-place mode.
.. cpp:function:: Ptr<BaseFilter_GPU> gpu::getBoxFilter_GPU(int srcType, int dstType, const Size& ksize, Point anchor = Point(-1, -1))
Creates a normalized 2D box filter.
:param srcType: Input image type. Supports ``CV_8UC1`` and ``CV_8UC4``.
...
...
@@ -221,13 +207,11 @@ gpu::createBoxFilter_GPU
:param ksize: Kernel size.
:param anchor: Anchor point. The default value ``Point(-1, -1)`` means that the anchor is at the kernel center.
**Note:** This filter does not check out-of-border accesses, so only a proper sub-matrix of a bigger matrix has to be passed to it.
See Also: :c:func:`boxFilter`.
.. index:: gpu::boxFilter
...
...
@@ -237,43 +221,39 @@ gpu::boxFilter
Smooths the image using the normalized box filter.
:param src: Input image. ``CV_8UC1`` and ``CV_8UC4`` source types are supported.

:param dst: Output image. The size and type are the same as for ``src``.

:param ddepth: Output image depth. If -1, the output image has the same depth as the input one. The only values allowed here are ``CV_8U`` and -1.

:param ksize: Kernel size.

:param anchor: Anchor point. The default value ``Point(-1, -1)`` means that the anchor is at the kernel center.

**Note:** This filter does not check out-of-border accesses, so only a proper sub-matrix of a bigger matrix has to be passed to it.

See Also: :c:func:`boxFilter`, :cpp:func:`gpu::createBoxFilter_GPU`.
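A minimal call sketch, assuming ``img_cpu`` is an 8-bit image on the CPU side: ::

    cv::gpu::GpuMat src(img_cpu);                    // CV_8UC1 or CV_8UC4
    cv::gpu::GpuMat dst;

    // -1 keeps the source depth (CV_8U); 5x5 kernel, anchor at the kernel center
    cv::gpu::boxFilter(src, dst, -1, cv::Size(5, 5));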
.. cpp:function:: Ptr<BaseFilter_GPU> gpu::getLinearFilter_GPU(int srcType, int dstType, const Mat& kernel, const Size& ksize, Point anchor = Point(-1, -1))
Creates the non-separable linear filter.
:param srcType: Input image type. ``CV_8UC1`` and ``CV_8UC4`` types are supported.

:param dstType: Output image type. Only the same type as the source is supported.

:param kernel: 2D array of filter coefficients. This filter works with integer kernels; if ``kernel`` has a ``float`` or ``double`` type, it is converted to a fixed-point representation before the actual processing.

:param ksize: Kernel size.

:param anchor: Anchor point. The default value ``Point(-1, -1)`` means that the anchor is at the kernel center.

**Note:** This filter does not check out-of-border accesses, so only a proper sub-matrix of a bigger matrix has to be passed to it.

See Also: :c:func:`createLinearFilter`.
.. index:: gpu::filter2D
...
...
@@ -417,23 +388,21 @@ gpu::filter2D
-----------------
.. cpp:function:: void gpu::filter2D(const GpuMat& src, GpuMat& dst, int ddepth, const Mat& kernel, Point anchor=Point(-1,-1))
Applies the non-separable 2D linear filter to an image.

:param src: Source image. ``CV_8UC1`` and ``CV_8UC4`` source types are supported.

:param dst: Destination image. The size and the number of channels are the same as for ``src``.

:param ddepth: Desired depth of the destination image. If it is negative, it is the same as ``src.depth()``. Only the same depth as the source image depth is supported.

:param kernel: 2D array of filter coefficients. This filter works with integer kernels; if ``kernel`` has a ``float`` or ``double`` type, fixed-point arithmetic is used.

:param anchor: Anchor of the kernel that indicates the relative position of a filtered point within the kernel. The anchor should lie within the kernel. The special default value ``(-1,-1)`` means that the anchor is at the kernel center.

**Note:** This filter does not check out-of-border accesses, so only a proper sub-matrix of a bigger matrix has to be passed to it.

See Also: :c:func:`filter2D`, :cpp:func:`gpu::createLinearFilter_GPU`.
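A short sketch of ``gpu::filter2D`` with a simple sharpening kernel; the kernel values and ``img_cpu`` are only an illustration: ::

    // 3x3 sharpening kernel; float coefficients are processed with fixed-point arithmetic
    cv::Mat kernel = (cv::Mat_<float>(3, 3) <<  0, -1,  0,
                                               -1,  5, -1,
                                                0, -1,  0);

    cv::gpu::GpuMat src(img_cpu);                    // CV_8UC1 or CV_8UC4
    cv::gpu::GpuMat dst;
    cv::gpu::filter2D(src, dst, -1, kernel);         // -1: same depth as the source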
.. index:: gpu::Laplacian
...
...
@@ -441,23 +410,22 @@ gpu::Laplacian
------------------
.. cpp:function:: void gpu::Laplacian(const GpuMat& src, GpuMat& dst, int ddepth, int ksize = 1, double scale = 1)
Applies the Laplacian operator to an image.

:param src: Source image. ``CV_8UC1`` and ``CV_8UC4`` source types are supported.

:param dst: Destination image. The size and number of channels are the same as for ``src``.

:param ddepth: Desired depth of the destination image. Only the same depth as the source image depth is supported.

:param ksize: Aperture size used to compute the second-derivative filters (see :c:func:`getDerivKernels`). It must be positive and odd. Only ``ksize`` = 1 and ``ksize`` = 3 are supported.

:param scale: Optional scale factor for the computed Laplacian values. By default, no scaling is applied (see :c:func:`getDerivKernels`).

**Note:** This filter does not check out-of-border accesses, so only a proper sub-matrix of a bigger matrix has to be passed to it.

See Also: :c:func:`Laplacian`, :cpp:func:`gpu::filter2D`.
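A call sketch with the 3x3 aperture, assuming an image already available as ``img_cpu``: ::

    cv::gpu::GpuMat src(img_cpu);                    // CV_8UC1 or CV_8UC4
    cv::gpu::GpuMat dst;
    cv::gpu::Laplacian(src, dst, src.depth(), 3);    // same depth as the source, ksize = 3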
.. index:: gpu::getLinearRowFilter_GPU
...
...
@@ -465,23 +433,23 @@ gpu::getLinearRowFilter_GPU
-------------------------------
.. cpp:function:: Ptr<BaseRowFilter_GPU> gpu::getLinearRowFilter_GPU(int srcType, int bufType, const Mat& rowKernel, int anchor = -1, int borderType = BORDER_CONSTANT)
Creates a primitive row filter with the specified kernel.
:param srcType: Source array type. Only ``CV_8UC1``, ``CV_8UC4``, ``CV_16SC1``, ``CV_16SC2``, ``CV_32SC1``, ``CV_32FC1`` source types are supported.
:param bufType: Intermediate buffer type. It must have as many channels as ``srcType``.
:param rowKernel: Filter coefficients.
:param anchor: Anchor position within the kernel. Negative values mean that the anchor is positioned at the aperture center.

:param borderType: Pixel extrapolation method. For details, see :c:func:`borderInterpolate`. For details on limitations, see below.

There are two versions of the algorithm: NPP and OpenCV.

* The NPP version is called when ``srcType == CV_8UC1`` or ``srcType == CV_8UC4`` and ``bufType == srcType``. Otherwise, the OpenCV version is called. NPP supports only the ``BORDER_CONSTANT`` border type and does not check indices outside the image.

* The OpenCV version supports only the ``CV_32F`` buffer depth and the ``BORDER_REFLECT101``, ``BORDER_REPLICATE``, and ``BORDER_CONSTANT`` border types. It checks indices outside the image.

See Also: :cpp:func:`gpu::getLinearColumnFilter_GPU`, :c:func:`createSeparableLinearFilter`.
:param anchor: Anchor position within the kernel. Negative values mean that the anchor is positioned at the aperture center.

:param borderType: Pixel extrapolation method. For details, see :c:func:`borderInterpolate`. For details on limitations, see below.

There are two versions of the algorithm: NPP and OpenCV.

* The NPP version is called when ``dstType == CV_8UC1`` or ``dstType == CV_8UC4`` and ``bufType == dstType``. Otherwise, the OpenCV version is called. NPP supports only the ``BORDER_CONSTANT`` border type and does not check indices outside the image.

* The OpenCV version supports only the ``CV_32F`` buffer depth and the ``BORDER_REFLECT101``, ``BORDER_REPLICATE``, and ``BORDER_CONSTANT`` border types. It checks indices outside the image.

See Also: :cpp:func:`gpu::getLinearRowFilter_GPU`, :c:func:`createSeparableLinearFilter`.
.. index:: gpu::createSeparableLinearFilter_GPU
gpu::createSeparableLinearFilter_GPU
----------------------------------------
.. cpp:function:: Ptr<FilterEngine_GPU> gpu::createSeparableLinearFilter_GPU(int srcType, int dstType, const Mat& rowKernel, const Mat& columnKernel, const Point& anchor = Point(-1,-1), int rowBorderType = BORDER_DEFAULT, int columnBorderType = -1)
:param anchor: Anchor position within the kernel. Negative values mean that the anchor is positioned at the aperture center.

:param rowBorderType, columnBorderType: Pixel extrapolation method in the horizontal and vertical directions. For details, see :c:func:`borderInterpolate`. For details on limitations, see :cpp:func:`gpu::getLinearRowFilter_GPU` and :cpp:func:`gpu::getLinearColumnFilter_GPU`.

See Also: :cpp:func:`gpu::getLinearRowFilter_GPU`, :cpp:func:`gpu::getLinearColumnFilter_GPU`, :c:func:`createSeparableLinearFilter`.
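A sketch of building a reusable separable filter engine from one-dimensional kernels; the Gaussian kernel, the ``CV_8UC4`` source type, and ``img_cpu`` are assumptions for illustration: ::

    cv::Mat rowKernel = cv::getGaussianKernel(7, -1, CV_32F);   // 7x1 CV_32F coefficients
    cv::Mat columnKernel = rowKernel;

    cv::Ptr<cv::gpu::FilterEngine_GPU> filter =
        cv::gpu::createSeparableLinearFilter_GPU(CV_8UC4, CV_8UC4, rowKernel, columnKernel);

    cv::gpu::GpuMat src(img_cpu);                    // CV_8UC4
    cv::gpu::GpuMat dst;
    filter->apply(src, dst, cv::Rect(0, 0, src.cols, src.rows));

    filter.release();                                // release the intermediate buffers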
.. index:: gpu::sepFilter2D
...
...
@@ -535,180 +502,166 @@ gpu::sepFilter2D
--------------------
.. cpp:function:: void gpu::sepFilter2D(const GpuMat& src, GpuMat& dst, int ddepth, const Mat& kernelX, const Mat& kernelY, Point anchor = Point(-1,-1), int rowBorderType = BORDER_DEFAULT, int columnBorderType = -1)
:param ddepth: Destination image depth. ``CV_8U``, ``CV_16S``, ``CV_32S``, and ``CV_32F`` are supported.
:param kernelX, kernelY: Filter coefficients.
:param anchor: Anchor position within the kernel. The default value ``(-1, -1)`` means that the anchor is at the kernel center.

:param rowBorderType, columnBorderType: Pixel extrapolation method. For details, see :c:func:`borderInterpolate`.
See also: :cpp:func:`gpu::createSeparableLinearFilter_GPU`, :c:func:`sepFilter2D`.
.. index:: gpu::createDerivFilter_GPU
gpu::createDerivFilter_GPU
------------------------------
.. cpp:function:: Ptr<FilterEngine_GPU> gpu::createDerivFilter_GPU(int srcType, int dstType, int dx, int dy, int ksize, int rowBorderType = BORDER_DEFAULT, int columnBorderType = -1)
Creates a filter engine for the generalized Sobel operator.

:param dstType: Destination image type with as many channels as ``srcType``. ``CV_8U``, ``CV_16S``, ``CV_32S``, and ``CV_32F`` depths are supported.

:param dx: Derivative order with respect to x.

:param dy: Derivative order with respect to y.

:param ksize: Aperture size. See :c:func:`getDerivKernels` for details.

:param rowBorderType, columnBorderType: Pixel extrapolation method. See :c:func:`borderInterpolate` for details.
See also: :cpp:func:`gpu::createSeparableLinearFilter_GPU`, :c:func:`createDerivFilter`.
.. index:: gpu::Sobel
gpu::Sobel
--------------
.. cpp:function:: void gpu::Sobel(const GpuMat& src, GpuMat& dst, int ddepth, int dx, int dy, int ksize = 3, double scale = 1, int rowBorderType = BORDER_DEFAULT, int columnBorderType = -1)
Applies the generalized Sobel operator to an image.

:param ddepth: Destination image depth. ``CV_8U``, ``CV_16S``, ``CV_32S``, and ``CV_32F`` are supported.

:param dx: Derivative order with respect to x.

:param dy: Derivative order with respect to y.

:param ksize: Size of the extended Sobel kernel. Possible values are 1, 3, 5, or 7.

:param scale: Optional scale factor for the computed derivative values. By default, no scaling is applied. For details, see :c:func:`getDerivKernels`.

:param rowBorderType, columnBorderType: Pixel extrapolation method. See :c:func:`borderInterpolate` for details.
See also: :cpp:func:`gpu::createSeparableLinearFilter_GPU`, :c:func:`Sobel`.
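A call sketch computing the first x-derivative into a 16-bit destination to avoid overflow (``img_cpu`` is an assumed 8-bit input): ::

    cv::gpu::GpuMat src(img_cpu);                    // CV_8UC1
    cv::gpu::GpuMat dx;
    cv::gpu::Sobel(src, dx, CV_16S, 1, 0, 3);        // dx order 1, dy order 0, 3x3 kernel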
.. index:: gpu::Scharr
gpu::Scharr
---------------
.. cpp:function:: void gpu::Scharr(const GpuMat& src, GpuMat& dst, int ddepth, int dx, int dy, double scale = 1, int rowBorderType = BORDER_DEFAULT, int columnBorderType = -1)
Calculates the first x- or y- image derivative using the Scharr operator.

:param dst: Destination image. The size and type are the same as for ``src``.
:param ksize: Gaussian kernel size. ``ksize.width`` and ``ksize.height`` can differ, but they both must be positive and odd. If they are zeros, they are computed from ``sigmaX`` and ``sigmaY``.

:param sigmaX, sigmaY: Gaussian kernel standard deviations in the X and Y directions. If ``sigmaY`` is zero, it is set to be equal to ``sigmaX``. If they are both zeros, they are computed from ``ksize.width`` and ``ksize.height``, respectively (see :c:func:`getGaussianKernel` for details). To fully control the result regardless of possible future modification of all this semantics, it is recommended to specify all of ``ksize``, ``sigmaX``, and ``sigmaY``.

:param rowBorderType, columnBorderType: Pixel extrapolation method. See :c:func:`borderInterpolate` for details.
See also: :cpp:func:`gpu::createGaussianFilter_GPU`, :c:func:`GaussianBlur`.
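A call sketch for the Gaussian smoothing described by the parameters above; the kernel size, sigma, and ``img_cpu`` are only illustrative: ::

    cv::gpu::GpuMat src(img_cpu);                    // for example, CV_8UC1
    cv::gpu::GpuMat dst;
    cv::gpu::GaussianBlur(src, dst, cv::Size(5, 5), 1.5);  // sigmaY defaults to sigmaX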
.. index:: gpu::getMaxFilter_GPU
gpu::getMaxFilter_GPU
-------------------------
.. cpp:function:: Ptr<BaseFilter_GPU> gpu::getMaxFilter_GPU(int srcType, int dstType, const Size& ksize, Point anchor = Point(-1,-1))
Creates the maximum filter.

:param srcType: Input image type. Only ``CV_8UC1`` and ``CV_8UC4`` are supported.

:param dstType: Output image type. Only the same type as the source is supported.

:param ksize: Kernel size.

:param anchor: Anchor point. The default value ``Point(-1, -1)`` means that the anchor is at the kernel center.

**Note:** This filter does not check out-of-border accesses, so only a proper sub-matrix of a bigger matrix has to be passed to it.
.. index:: gpu::getMinFilter_GPU
gpu::getMinFilter_GPU
-------------------------
.. cpp:function:: Ptr<BaseFilter_GPU> gpu::getMinFilter_GPU(int srcType, int dstType, const Size& ksize, Point anchor = Point(-1,-1))
Creates the minimum filter.

:param srcType: Input image type. Only ``CV_8UC1`` and ``CV_8UC4`` are supported.

:param dstType: Output image type. Only the same type as the source is supported.

:param ksize: Kernel size.

:param anchor: Anchor point. The default value ``Point(-1, -1)`` means that the anchor is at the kernel center.

**Note:** This filter does not check out-of-border accesses, so only a proper sub-matrix of a bigger matrix has to be passed to it.
The OpenCV GPU module is a set of classes and functions to utilize GPU computational capabilities. It is implemented using the NVidia CUDA Runtime API and supports only NVidia GPUs. The OpenCV GPU module includes utility functions, low-level vision primitives, and high-level algorithms. The utility functions and low-level primitives provide a powerful infrastructure for developing fast vision algorithms taking advantage of the GPU, whereas the high-level functionality includes some state-of-the-art algorithms (such as stereo correspondence, face and people detectors, and others) ready to be used by application developers.
The GPU module is designed as a host-level API. This means that if you have pre-compiled OpenCV GPU binaries, you are not required to have the CUDA Toolkit installed or to write any extra code to make use of the GPU.
The GPU module depends on the CUDA Toolkit and the NVidia Performance Primitives (NPP) library. Make sure you have the latest versions of this software installed. Both libraries can be downloaded from the NVidia site for all supported platforms. To compile the OpenCV GPU module, you need a compiler compatible with the CUDA Runtime Toolkit.
The OpenCV GPU module is designed for ease of use and does not require any knowledge of CUDA. Such knowledge will certainly be useful, though, to handle non-trivial cases or to achieve the highest performance. It is helpful to understand the cost of various operations, what the GPU does, what the preferred data formats are, and so on. The GPU module is an effective instrument for quick implementation of GPU-accelerated computer vision algorithms. However, if your algorithm involves many simple operations, then, for the best possible performance, you may still need to write your own kernels to avoid extra write and read operations on the intermediate results.
To enable CUDA support, configure OpenCV using CMake with ``WITH_CUDA=ON``. When the flag is set and CUDA is installed, the full-featured OpenCV GPU module is built. Otherwise, the module is still built, but at runtime all functions from the module throw :c:type:`Exception` with the ``CV_GpuNotSupported`` error code, except for :cpp:func:`gpu::getCudaEnabledDeviceCount`. The latter function returns a zero GPU count in this case. Building OpenCV without CUDA support does not perform device code compilation, so it does not require the CUDA Toolkit installed. Therefore, using the :cpp:func:`gpu::getCudaEnabledDeviceCount` function, you can implement a high-level algorithm that detects GPU presence at runtime and chooses the appropriate implementation (CPU or GPU) accordingly.
Compilation for Different NVidia* Platforms
-------------------------------------------
The NVidia* compiler enables generating binary code (cubin and fatbin) and intermediate code (PTX). Binary code often implies a specific GPU architecture and generation, so the compatibility with other GPUs is not guaranteed. PTX is targeted for a virtual platform that is defined entirely by the set of capabilities or features. Depending on the selected virtual platform, some of the instructions are emulated or disabled, even if the real hardware supports all the features.

At the first call, the PTX code is compiled to binary code for the particular GPU using a JIT compiler. When the target GPU has a compute capability (CC) lower than the PTX code, JIT fails.
By default, the OpenCV GPU module includes:
* Binaries for compute capabilities 1.3 and 2.0 (controlled by ``CUDA_ARCH_BIN`` in CMake)

* PTX code for compute capabilities 1.1 and 1.3 (controlled by ``CUDA_ARCH_PTX`` in CMake)
This means that for devices with CC 1.3 and 2.0, binary images are ready to run. For all newer platforms, the PTX code for 1.3 is JIT'ed to a binary image. For devices with CC 1.1 and 1.2, the PTX for 1.1 is JIT'ed. For devices with CC 1.0, no code is available and the functions throw :c:type:`Exception`. For platforms where JIT compilation is performed, the first run is slow.

On a GPU with CC 1.0, you can still compile the GPU module and most of the functions will run flawlessly. To achieve this, add "1.0" to the list of binaries, for example, ``CUDA_ARCH_BIN="1.0 1.3 2.0"``. The functions that cannot be run on CC 1.0 GPUs throw an exception.
You can always determine at runtime whether the OpenCV GPU-built binaries (or PTX code) are compatible with your GPU. The function :cpp:func:`gpu::DeviceInfo::isCompatible` returns the compatibility status (true/false).
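A sketch of the runtime check suggested above, combining :cpp:func:`gpu::getCudaEnabledDeviceCount` and :cpp:func:`gpu::DeviceInfo::isCompatible` to choose between CPU and GPU code paths (``runGpuPipeline`` and ``runCpuPipeline`` are hypothetical application functions): ::

    bool useGpu = false;
    if (cv::gpu::getCudaEnabledDeviceCount() > 0)
    {
        cv::gpu::DeviceInfo info;                    // information about the current device
        useGpu = info.isCompatible();                // true if the built binaries/PTX match this GPU
    }

    if (useGpu)
        runGpuPipeline();                            // hypothetical GPU branch of the application
    else
        runCpuPipeline();                            // hypothetical CPU fallback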
Threading and Multi-threading
------------------------------
The OpenCV GPU module follows the CUDA Runtime API conventions regarding multi-threaded programming. This means that at the first API call, a CUDA context is created implicitly, attached to the current CPU thread, and then used as the thread's "current" context. All further operations, such as memory allocation and GPU code compilation, are associated with the context and the thread. Because any other thread is not attached to the context, memory (and other resources) allocated in the first thread cannot be accessed by the other thread. Instead, for this other thread, CUDA creates another context associated with it. In short, by default, different threads do not share resources.

But you can remove this limitation by using the CUDA Driver API (version 3.1 or later). You can retrieve the context reference for one thread, attach it to another thread, and make it "current" for that thread. As a result, the threads can share memory and other resources. It is also possible to create a context explicitly before calling any GPU code and attach it to all the threads you want to share the resources with.

It is also possible to create the context explicitly using the CUDA Driver API, then attach it and make it "current" for all necessary threads. The CUDA Runtime API (and OpenCV functions, respectively) picks it up.
Utilizing Multiple GPUs
-----------------------
In the current version, each of the OpenCV GPU algorithms can use only a single GPU. So, to utilize multiple GPUs, you have to manually distribute the work between GPUs. Here are the two ways of utilizing multiple GPUs (a minimal sketch of the first approach follows at the end of this section):

* If you only use synchronous functions, create several CPU threads (one per each GPU) and from within each thread create a CUDA context for the corresponding GPU using :cpp:func:`gpu::setDevice` or the Driver API. Each of the threads will then use the associated GPU.

* If you use asynchronous functions, you can use the Driver API to create several CUDA contexts associated with different GPUs but attached to one CPU thread. Within the thread, you can switch from one GPU to another by making the corresponding context "current". With non-blocking GPU calls, managing the algorithm is clear.

While developing algorithms for multiple GPUs, note the data passing overhead. For primitive functions and small images, it can be significant, which may eliminate all the advantages of having multiple GPUs. But for high-level algorithms, consider using multi-GPU acceleration. For example, the Stereo Block Matching algorithm has been successfully parallelized using the following algorithm:

1. Split each image of the stereo pair into two horizontal overlapping stripes.

2. Process each pair of stripes (from the left and right images) on a separate Fermi* GPU.

3. Merge the results into a single disparity map.

With this algorithm, a dual GPU gave a 180% performance increase compared to a single Fermi GPU. The source code of the example is available at https://code.ros.org/svn/opencv/trunk/opencv/examples/gpu/.
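Below is a minimal sketch of the synchronous, one-thread-per-GPU approach from the first item above. The worker function is meant to be called from a dedicated CPU thread for each device; the threading API itself and the per-device processing (a box filter here) are up to the application: ::

    // Call this function from its own CPU thread, once per GPU.
    void workerThread(int deviceId, const cv::Mat& input_cpu, cv::Mat& result_cpu)
    {
        cv::gpu::setDevice(deviceId);                // bind this thread's CUDA context to the GPU

        cv::gpu::GpuMat src(input_cpu);              // all allocations below belong to this device
        cv::gpu::GpuMat dst;
        cv::gpu::boxFilter(src, dst, -1, cv::Size(5, 5)); // any per-device processing

        dst.download(result_cpu);                    // download the partial result for merging
    }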
Histogram of Oriented Gradients [Navneet Dalal and Bill Triggs. Histogram of oriented gradients for human detection. 2005.] descriptor and detector.
::
struct CV_EXPORTS HOGDescriptor
{
enum { DEFAULT_WIN_SIGMA = -1 };
enum { DEFAULT_NLEVELS = 64 };
...
...
@@ -62,47 +61,46 @@ Histogram of Oriented Gradients [Navneet Dalal and Bill Triggs. Histogram of ori
}
Interfaces of all methods are kept similar to the CPU HOG descriptor and detector analogues as much as possible.

Performs object detection without a multi-scale window.

:param img: Source image. ``CV_8UC1`` and ``CV_8UC4`` types are supported for now.

:param found_locations: Left-top corner points of detected objects boundaries.

:param hit_threshold: Threshold for the distance between features and SVM classifying plane. Usually it is 0 and should be specified in the detector coefficients (as the last free coefficient). But if the free coefficient is omitted (which is allowed), you can specify it manually here.

:param win_stride: Window stride. It must be a multiple of block stride.

:param padding: Mock parameter to keep the CPU interface compatibility. Must be (0,0).

:param hit_threshold: Threshold for the distance between features and SVM classifying plane. See :cpp:func:`gpu::HOGDescriptor::detect` for details.

:param win_stride: Window stride. It must be a multiple of block stride.

:param padding: Mock parameter to keep the CPU interface compatibility. Must be (0,0).

:param scale0: Coefficient of the detection window increase.

:param group_threshold: Coefficient to regulate the similarity threshold. When detected, some objects can be covered by many rectangles. 0 means not to perform grouping. See :c:func:`groupRectangles`.
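A usage sketch of the GPU HOG people detector; it assumes ``img_cpu`` is a CPU image and that the default people detector coefficients (``getDefaultPeopleDetector``) are available, as in the CPU version: ::

    cv::gpu::HOGDescriptor hog;
    hog.setSVMDetector(cv::gpu::HOGDescriptor::getDefaultPeopleDetector());

    cv::gpu::GpuMat img_gpu(img_cpu);                // CV_8UC1 or CV_8UC4
    std::vector<cv::Rect> found;                     // bounding rectangles of detected people
    hog.detectMultiScale(img_gpu, found);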
:param filename: Name of the file from which the classifier is loaded. Only the old ``haar`` classifier (trained by the haartraining application) and NVidia's ``nvbin`` are supported.

Loads the classifier from a file. The previous content is destroyed.

:param filename: Name of the file from which the classifier is loaded. Only the old ``haar`` classifier (trained by the haartraining application) and NVidia's ``nvbin`` are supported.
.. cpp:function:: int gpu::CascadeClassifier_GPU::detectMultiScale(const GpuMat& image, GpuMat& objectsBuf, double scaleFactor=1.2, int minNeighbors=4, Size minSize=Size())
Detects objects of different sizes in the input image. The detected objects are returned as a list of rectangles.
:param image: Matrix of type ``CV_8U`` containing an image where objects should be detected.
:param objects: Buffer to store detected objects (rectangles). If it is empty, it is allocated with the default size. If not empty, the function searches not more than N objects, where ``N = sizeof(objectsBuf's data)/sizeof(cv::Rect)``.
:param scaleFactor: Value to specify how much the image size is reduced at each image scale.
:param minNeighbors: Value to specify how many neighbors each candidate rectangle should have to retain it.
:param minSize: Minimum possible object size. Objects smaller than that are ignored.
The function returns the number of detected objects, so you can retrieve them as in the following example: ::
gpu::CascadeClassifier_GPU cascade_gpu(...);
Mat image_cpu = imread(...)
GpuMat image_gpu(image_cpu);
...
...
@@ -336,5 +323,6 @@ The function returns number of detected objects, so you can retrieve them as in
imshow("Faces", image_cpu);
See Also: :c:func:`CascadeClassifier::detectMultiScale`.
Transforms the source matrix into the destination matrix using the given look-up table: ``dst(I) = lut(src(I))``

:param src: Source matrix. ``CV_8UC1`` and ``CV_8UC3`` matrices are supported for now.

:param lut: Look-up table of 256 elements. It must be a continuous ``CV_8U`` matrix whose area satisfies ``lut.rows`` :math:`\times` ``lut.cols`` = 256.

:param dst: Destination matrix with the same depth as ``lut`` and the same number of channels as ``src``.

See Also: :c:func:`LUT`.
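A call sketch for ``gpu::LUT`` that inverts an 8-bit image through a 256-element table (``img_cpu`` is an assumed input image): ::

    // build a 1x256 CV_8U look-up table on the CPU
    cv::Mat lut(1, 256, CV_8U);
    for (int i = 0; i < 256; ++i)
        lut.at<uchar>(i) = static_cast<uchar>(255 - i);  // simple inversion

    cv::gpu::GpuMat src(img_cpu);                    // CV_8UC1 or CV_8UC3
    cv::gpu::GpuMat dst;
    cv::gpu::LUT(src, lut, dst);                     // dst(I) = lut(src(I))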