Commit 3165baa1 authored by Alexander Alekhin

Merge remote-tracking branch 'upstream/3.4' into merge-3.4

parents 9a8a964b 74da80db
...@@ -1624,7 +1624,7 @@ endif()
macro(ocv_git_describe var_name path)
  if(GIT_FOUND)
-    execute_process(COMMAND "${GIT_EXECUTABLE}" describe --tags --tags --exact-match --dirty
+    execute_process(COMMAND "${GIT_EXECUTABLE}" describe --tags --exact-match --dirty
      WORKING_DIRECTORY "${path}"
      OUTPUT_VARIABLE ${var_name}
      RESULT_VARIABLE GIT_RESULT
...
...@@ -16,42 +16,152 @@ Theory

Code
----

@add_toggle_cpp
This tutorial code is shown in the lines below. You can also download it from
[here](https://github.com/opencv/opencv/tree/master/samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp).
@include samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp
@end_toggle

@add_toggle_java
This tutorial code is shown in the lines below. You can also download it from
[here](https://github.com/opencv/opencv/tree/master/samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java)
@include samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java
@end_toggle

@add_toggle_python
This tutorial code is shown in the lines below. You can also download it from
[here](https://github.com/opencv/opencv/tree/master/samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py)
@include samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py
@end_toggle

Explanation / Result
--------------------

-   Load the source image and check that it is loaded without any problem, then show it:

@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp load_image
@end_toggle

@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java load_image
@end_toggle

@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py load_image
@end_toggle

![](images/source.jpeg)

-   Then, if we have an image with a white background, it is good to transform it to black. This will help us discriminate the foreground objects more easily when we apply the Distance Transform:

@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp black_bg
@end_toggle

@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java black_bg
@end_toggle

@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py black_bg
@end_toggle

![](images/black_bg.jpeg)

-   Afterwards we will sharpen our image in order to accentuate the edges of the foreground objects. We will apply a Laplacian filter with a quite strong kernel (an approximation of the second derivative):

@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp sharp
@end_toggle

@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java sharp
@end_toggle

@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py sharp
@end_toggle

![](images/laplace.jpeg)
![](images/sharp.jpeg)

-   Now we transform our new sharpened source image to a grayscale and a binary one, respectively:

@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp bin
@end_toggle

@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java bin
@end_toggle

@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py bin
@end_toggle

![](images/bin.jpeg)

-   We are now ready to apply the Distance Transform on the binary image. Moreover, we normalize the output image in order to be able to visualize and threshold the result:

@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp dist
@end_toggle

@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java dist
@end_toggle

@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py dist
@end_toggle

![](images/dist_transf.jpeg)

-   We threshold the *dist* image and then perform some morphology operations (i.e. dilation) in order to extract the peaks from the above image:

@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp peaks
@end_toggle

@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java peaks
@end_toggle

@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py peaks
@end_toggle

![](images/peaks.jpeg)

-   From each blob we then create a seed/marker for the watershed algorithm with the help of the @ref cv::findContours function:

@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp seeds
@end_toggle

@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java seeds
@end_toggle

@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py seeds
@end_toggle

![](images/markers.jpeg)

-   Finally, we can apply the watershed algorithm, and visualize the result:

@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp watershed
@end_toggle

@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java watershed
@end_toggle

@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py watershed
@end_toggle

![](images/final.jpeg)
\ No newline at end of file
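The toggled snippets above reference the full samples. As a quick orientation only, here is a condensed, hedged C++ sketch of the same pipeline; the input path and the threshold values are illustrative, and the complete annotated sample appears further down in this commit:

```cpp
#include <opencv2/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main()
{
    cv::Mat src = cv::imread("cards.png");              // hypothetical input image
    if (src.empty()) return -1;

    // Sharpen with a strong Laplacian kernel (approximation of the second derivative)
    cv::Mat kernel = (cv::Mat_<float>(3, 3) << 1, 1, 1, 1, -8, 1, 1, 1, 1);
    cv::Mat lap, srcF;
    cv::filter2D(src, lap, CV_32F, kernel);
    src.convertTo(srcF, CV_32F);
    cv::Mat imgResult = srcF - lap;
    imgResult.convertTo(imgResult, CV_8UC3);

    // Binary image, distance transform, and peaks (markers for the foreground objects)
    cv::Mat bw, dist;
    cv::cvtColor(imgResult, bw, cv::COLOR_BGR2GRAY);
    cv::threshold(bw, bw, 40, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
    cv::distanceTransform(bw, dist, cv::DIST_L2, 3);
    cv::normalize(dist, dist, 0, 1.0, cv::NORM_MINMAX);
    cv::threshold(dist, dist, 0.4, 1.0, cv::THRESH_BINARY);
    cv::dilate(dist, dist, cv::Mat::ones(3, 3, CV_8U));

    // One marker per blob, plus a background marker, then watershed
    cv::Mat dist8u, markers = cv::Mat::zeros(dist.size(), CV_32S);
    dist.convertTo(dist8u, CV_8U, 255);
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(dist8u, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    for (size_t i = 0; i < contours.size(); i++)
        cv::drawContours(markers, contours, (int)i, cv::Scalar((int)i + 1), -1);
    cv::circle(markers, cv::Point(5, 5), 3, cv::Scalar(255), -1);
    cv::watershed(imgResult, markers);                   // labels written into 'markers'
    return 0;
}
```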
...@@ -285,6 +285,8 @@ In this section you will learn about the image processing (manipulation) functio

- @subpage tutorial_distance_transform

+   *Languages:* C++, Java, Python

    *Compatibility:* \> OpenCV 2.0

    *Author:* Theodore Tsesmelis
...
...@@ -1985,6 +1985,31 @@ CV_EXPORTS_W int decomposeHomographyMat(InputArray H,
                                        OutputArrayOfArrays translations,
                                        OutputArrayOfArrays normals);
/** @brief Filters homography decompositions based on additional information.
@param rotations Vector of rotation matrices.
@param normals Vector of plane normal matrices.
@param beforePoints Vector of (rectified) visible reference points before the homography is applied
@param afterPoints Vector of (rectified) visible reference points after the homography is applied
@param possibleSolutions Vector of int indices representing the viable solution set after filtering
@param pointsMask optional Mat/Vector of 8u type representing the mask for the inliers as given by the findHomography function
This function is intended to filter the output of the decomposeHomographyMat based on additional
information as described in @cite Malis . The summary of the method: the decomposeHomographyMat function
returns 2 unique solutions and their "opposites" for a total of 4 solutions. If we have access to the
sets of points visible in the camera frame before and after the homography transformation is applied,
we can determine which are the true potential solutions and which are the opposites by verifying which
homographies are consistent with all visible reference points being in front of the camera. The inputs
are left unchanged; the filtered solution set is returned as indices into the existing one.
*/
CV_EXPORTS_W void filterHomographyDecompByVisibleRefpoints(InputArrayOfArrays rotations,
InputArrayOfArrays normals,
InputArray beforePoints,
InputArray afterPoints,
OutputArray possibleSolutions,
InputArray pointsMask = noArray());
/** @brief The base class for stereo correspondence algorithms.
 */
class CV_EXPORTS_W StereoMatcher : public Algorithm
...
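For orientation, a minimal usage sketch of the new filtering API in combination with cv::decomposeHomographyMat (not part of the patch; the homography, camera matrix and point values below are made-up placeholders):

```cpp
#include <opencv2/calib3d.hpp>
#include <vector>

int main()
{
    // Hypothetical inputs: a homography and camera matrix estimated elsewhere
    // (e.g. with cv::findHomography and a prior calibration).
    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320, 0, 800, 240, 0, 0, 1);
    cv::Mat H = (cv::Mat_<double>(3, 3) << 1.0, 0.01, 5.0, -0.01, 1.0, 3.0, 0.0, 0.0, 1.0);

    std::vector<cv::Mat> rotations, translations, normals;
    cv::decomposeHomographyMat(H, K, rotations, translations, normals);

    // Rectified (normalized camera) coordinates of points known to be visible
    // before and after the homography, e.g. from cv::undistortPoints; dummy values here.
    std::vector<cv::Point2f> beforePoints = { {0.10f, 0.05f}, {-0.20f, 0.15f}, {0.05f, -0.10f} };
    std::vector<cv::Point2f> afterPoints  = { {0.11f, 0.05f}, {-0.19f, 0.16f}, {0.06f, -0.09f} };

    std::vector<int> possibleSolutions;
    cv::filterHomographyDecompByVisibleRefpoints(rotations, normals,
                                                 beforePoints, afterPoints,
                                                 possibleSolutions);
    // possibleSolutions now indexes the decompositions for which all reference
    // points lie in front of the camera in both views.
    return 0;
}
```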
/*M///////////////////////////////////////////////////////////////////////////////////////
//
//  This is a homography decomposition implementation contributed to OpenCV
//  by Samson Yilma. It implements the homography decomposition algorithm
//  described in the research report:
//  Malis, E and Vargas, M, "Deeper understanding of the homography decomposition
//  for vision-based control", Research Report 6303, INRIA (2007)
//
//  IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
//  By downloading, copying, installing or using the software you agree to this license.
//  If you do not agree to this license, do not download, install,
//  copy or use the software.
//
//
//                          License Agreement
//                For Open Source Computer Vision Library
//
// Copyright (C) 2014, Samson Yilma (samson_yilma@yahoo.com), all rights reserved.
+// Copyright (C) 2018, Intel Corporation, all rights reserved.
//
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
//   * Redistribution's of source code must retain the above copyright notice,
//     this list of conditions and the following disclaimer.
//
//   * Redistribution's in binary form must reproduce the above copyright notice,
//     this list of conditions and the following disclaimer in the documentation
//     and/or other materials provided with the distribution.
//
//   * The name of the copyright holders may not be used to endorse or promote products
//     derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/

#include "precomp.hpp"
#include <memory>
...@@ -489,4 +490,67 @@ int decomposeHomographyMat(InputArray _H,
    return nsols;
}
void filterHomographyDecompByVisibleRefpoints(InputArrayOfArrays _rotations,
                                              InputArrayOfArrays _normals,
                                              InputArray _beforeRectifiedPoints,
                                              InputArray _afterRectifiedPoints,
                                              OutputArray _possibleSolutions,
                                              InputArray _pointsMask)
{
    CV_Assert(_beforeRectifiedPoints.type() == CV_32FC2 && _afterRectifiedPoints.type() == CV_32FC2);
    CV_Assert(_pointsMask.empty() || _pointsMask.type() == CV_8U);

    Mat beforeRectifiedPoints = _beforeRectifiedPoints.getMat();
    Mat afterRectifiedPoints = _afterRectifiedPoints.getMat();
    Mat pointsMask = _pointsMask.getMat();
    int nsolutions = (int)_rotations.total();
    int npoints = (int)beforeRectifiedPoints.total();
    CV_Assert(pointsMask.empty() || pointsMask.checkVector(1, CV_8U) == npoints);
    const uchar* pointsMaskPtr = pointsMask.data;

    std::vector<uchar> solutionMask(nsolutions, (uchar)1);
    std::vector<Mat> normals(nsolutions);
    std::vector<Mat> rotnorm(nsolutions);
    Mat R;

    for( int i = 0; i < nsolutions; i++ )
    {
        _normals.getMat(i).convertTo(normals[i], CV_64F);
        CV_Assert(normals[i].total() == 3);
        _rotations.getMat(i).convertTo(R, CV_64F);
        rotnorm[i] = R*normals[i];
        CV_Assert(rotnorm[i].total() == 3);
    }

    for( int j = 0; j < npoints; j++ )
    {
        if( !pointsMaskPtr || pointsMaskPtr[j] )
        {
            Point2f prevPoint = beforeRectifiedPoints.at<Point2f>(j);
            Point2f currPoint = afterRectifiedPoints.at<Point2f>(j);

            for( int i = 0; i < nsolutions; i++ )
            {
                if( !solutionMask[i] )
                    continue;

                const double* normal_i = normals[i].ptr<double>();
                const double* rotnorm_i = rotnorm[i].ptr<double>();
                double prevNormDot = normal_i[0]*prevPoint.x + normal_i[1]*prevPoint.y + normal_i[2];
                double currNormDot = rotnorm_i[0]*currPoint.x + rotnorm_i[1]*currPoint.y + rotnorm_i[2];

                if (prevNormDot <= 0 || currNormDot <= 0)
                    solutionMask[i] = (uchar)0;
            }
        }
    }

    std::vector<int> possibleSolutions;
    for( int i = 0; i < nsolutions; i++ )
        if( solutionMask[i] )
            possibleSolutions.push_back(i);

    Mat(possibleSolutions).copyTo(_possibleSolutions);
}

} //namespace cv
...@@ -12,7 +12,8 @@ ocv_add_dispatched_file_force_all("layers/layers_common" AVX AVX2 AVX512_SKX)
ocv_add_module(dnn opencv_core opencv_imgproc WRAP python matlab java js)
-ocv_option(OPENCV_DNN_OPENCL "Build with OpenCL support" HAVE_OPENCL)
+ocv_option(OPENCV_DNN_OPENCL "Build with OpenCL support" HAVE_OPENCL AND NOT APPLE)
if(OPENCV_DNN_OPENCL AND HAVE_OPENCL)
  add_definitions(-DCV_OCL4DNN=1)
else()
...
...@@ -1446,7 +1446,7 @@ struct Net::Impl
        // TODO: OpenCL target support more fusion styles.
        if ( preferableBackend == DNN_BACKEND_OPENCV && IS_DNN_OPENCL_TARGET(preferableTarget) &&
             (!cv::ocl::useOpenCL() || (ld.layerInstance->type != "Convolution" &&
-             ld.layerInstance->type != "MVN")) )
+             ld.layerInstance->type != "MVN" && ld.layerInstance->type != "Pooling")) )
            continue;

        Ptr<Layer>& currLayer = ld.layerInstance;
...@@ -1993,11 +1993,17 @@ Net Net::readFromModelOptimizer(const String& xml, const String& bin)
    backendNode->net = Ptr<InfEngineBackendNet>(new InfEngineBackendNet(ieNet));
    for (auto& it : ieNet.getOutputsInfo())
    {
+        Ptr<Layer> cvLayer(new InfEngineBackendLayer(it.second));
+        InferenceEngine::CNNLayerPtr ieLayer = ieNet.getLayerByName(it.first.c_str());
+        CV_Assert(ieLayer);

        LayerParams lp;
        int lid = cvNet.addLayer(it.first, "", lp);
        LayerData& ld = cvNet.impl->layers[lid];
-        ld.layerInstance = Ptr<Layer>(new InfEngineBackendLayer(it.second));
+        cvLayer->name = it.first;
+        cvLayer->type = ieLayer->type;
+        ld.layerInstance = cvLayer;
        ld.backendNodes[DNN_BACKEND_INFERENCE_ENGINE] = backendNode;

        for (int i = 0; i < inputsNames.size(); ++i)
...
...@@ -165,6 +165,7 @@ public:
                                  (type == AVE ? LIBDNN_POOLING_METHOD_AVE :
                                                 LIBDNN_POOLING_METHOD_STO);
            config.avePoolPaddedArea = avePoolPaddedArea;
+            config.computeMaxIdx = computeMaxIdx;
            config.use_half = use_half;
            poolOp = Ptr<OCL4DNNPool<float> >(new OCL4DNNPool<float>(config));
        }
...
...@@ -352,6 +352,7 @@ struct OCL4DNNPoolConfig
        pool_method(LIBDNN_POOLING_METHOD_MAX),
        global_pooling(false),
        avePoolPaddedArea(true),
+        computeMaxIdx(true),
        use_half(false)
    {}
    MatShape in_shape;
...@@ -365,6 +366,7 @@ struct OCL4DNNPoolConfig
    ocl4dnnPoolingMethod_t pool_method; // = LIBDNN_POOLING_METHOD_MAX;
    bool global_pooling; // = false;
    bool avePoolPaddedArea;
+    bool computeMaxIdx;
    bool use_half;
};
...@@ -399,6 +401,7 @@ class OCL4DNNPool
    int32_t pooled_height_;
    int32_t pooled_width_;
    bool avePoolPaddedArea;
+    bool computeMaxIdx;
    bool use_half;
};
...
...@@ -56,6 +56,7 @@ OCL4DNNPool<Dtype>::OCL4DNNPool(OCL4DNNPoolConfig config)
    channels_ = config.channels;
    pool_method_ = config.pool_method;
    avePoolPaddedArea = config.avePoolPaddedArea;
+    computeMaxIdx = config.computeMaxIdx;
    use_half = config.use_half;

    for (int i = 0; i < spatial_dims; ++i)
...@@ -97,7 +98,7 @@ bool OCL4DNNPool<Dtype>::Forward(const UMat& bottom,
                                 UMat& top_mask)
{
    bool ret = true;
-    size_t global[] = { 128 * 128 };
+    size_t global[] = { (size_t)count_ };
    size_t local[] = { 128 };

    // support 2D case
...@@ -105,8 +106,7 @@ bool OCL4DNNPool<Dtype>::Forward(const UMat& bottom,
    {
    case LIBDNN_POOLING_METHOD_MAX:
        {
-            bool haveMask = !top_mask.empty();
-            String kname = haveMask ? "max_pool_forward_mask" : "max_pool_forward";
+            String kname = computeMaxIdx ? "max_pool_forward_mask" : "max_pool_forward";
            kname += (use_half) ? "_half" : "_float";
            ocl::Kernel oclk_max_pool_forward(
                kname.c_str(),
...@@ -118,7 +118,7 @@ bool OCL4DNNPool<Dtype>::Forward(const UMat& bottom,
                kernel_w_, kernel_h_,
                stride_w_, stride_h_,
                pad_w_, pad_h_,
-                haveMask ? " -D HAVE_MASK=1" : ""
+                computeMaxIdx ? " -D HAVE_MASK=1" : ""
            ));
            if (oclk_max_pool_forward.empty())
...
...@@ -65,28 +65,33 @@ __kernel void
#endif
    )
{
-  for (int index = get_global_id(0); index < nthreads;
-      index += get_global_size(0))
-  {
+  int index = get_global_id(0);
+  if (index >= nthreads)
+    return;
  const int pw = index % pooled_width;
-  const int ph = (index / pooled_width) % pooled_height;
-  const int c = (index / pooled_width / pooled_height) % channels;
-  const int n = index / pooled_width / pooled_height / channels;
+  const int xx = index / pooled_width;
+  const int ph = xx % pooled_height;
+  const int ch = xx / pooled_height;
  int hstart = ph * STRIDE_H - PAD_H;
  int wstart = pw * STRIDE_W - PAD_W;
-  const int hend = min(hstart + KERNEL_H, height);
-  const int wend = min(wstart + KERNEL_W, width);
-  hstart = max(hstart, (int)0);
-  wstart = max(wstart, (int)0);
  Dtype maxval = -FLT_MAX;
  int maxidx = -1;
-  __global const Dtype* bottom_slice = bottom_data
-      + (n * channels + c) * height * width;
-  for (int h = hstart; h < hend; ++h) {
-    for (int w = wstart; w < wend; ++w) {
-      if (bottom_slice[h * width + w] > maxval) {
-        maxidx = h * width + w;
-        maxval = bottom_slice[maxidx];
+  int in_offset = ch * height * width;
+  for (int h = 0; h < KERNEL_H; ++h)
+  {
+    int off_y = hstart + h;
+    if (off_y >= 0 && off_y < height)
+    {
+      for (int w = 0; w < KERNEL_W; ++w)
+      {
+        int off_x = wstart + w;
+        if (off_x >= 0 && off_x < width)
+        {
+          Dtype val = bottom_data[in_offset + off_y * width + off_x];
+          maxidx = (val > maxval) ? (off_y * width + off_x) : maxidx;
+          maxval = fmax(val, maxval);
+        }
      }
    }
  }
...@@ -94,7 +99,6 @@ __kernel void
#ifdef HAVE_MASK
  mask[index] = maxidx;
#endif
-  }
}

#elif defined KERNEL_AVE_POOL

...@@ -105,14 +109,14 @@ __kernel void TEMPLATE(ave_pool_forward, Dtype)(
    const int pooled_height, const int pooled_width,
    __global Dtype* top_data)
{
-  for (int index = get_global_id(0); index < nthreads;
-      index += get_global_size(0))
-  {
+  int index = get_global_id(0);
+  if (index >= nthreads)
+    return;
  {
    const int pw = index % pooled_width;
-    const int ph = (index / pooled_width) % pooled_height;
-    const int c = (index / pooled_width / pooled_height) % channels;
-    const int n = index / pooled_width / pooled_height / channels;
+    const int xx = index / pooled_width;
+    const int ph = xx % pooled_height;
+    const int ch = xx / pooled_height;
    int hstart = ph * STRIDE_H - PAD_H;
    int wstart = pw * STRIDE_W - PAD_W;
    int hend = min(hstart + KERNEL_H, height + PAD_H);
...@@ -132,16 +136,15 @@ __kernel void TEMPLATE(ave_pool_forward, Dtype)(
    pool_size = (hend - hstart) * (wend - wstart);
#endif
    Dtype aveval = 0;
-    __global const Dtype* bottom_slice = bottom_data
-        + (n * channels + c) * height * width;
-    for (int h = hstart; h < hend; ++h) {
-      for (int w = wstart; w < wend; ++w) {
-        aveval += bottom_slice[h * width + w];
+    int in_offset = ch * height * width;
+    for (int h = hstart; h < hend; ++h)
+    {
+      for (int w = wstart; w < wend; ++w)
+      {
+        aveval += bottom_data[in_offset + h * width + w];
      }
    }
    top_data[index] = aveval / pool_size;
-    }
  }
}

#elif defined KERNEL_STO_POOL
...
...@@ -182,11 +182,9 @@ TEST_P(DNNTestNetwork, MobileNet_SSD_Caffe)
        throw SkipTestException("");
    Mat sample = imread(findDataFile("dnn/street.png", false));
    Mat inp = blobFromImage(sample, 1.0f / 127.5, Size(300, 300), Scalar(127.5, 127.5, 127.5), false);
-    float l1 = (backend == DNN_BACKEND_OPENCV && target == DNN_TARGET_OPENCL_FP16) ? 0.0007 : 0.0;
-    float lInf = (backend == DNN_BACKEND_OPENCV && target == DNN_TARGET_OPENCL_FP16) ? 0.011 : 0.0;
+    float diffScores = (target == DNN_TARGET_OPENCL_FP16) ? 6e-3 : 0.0;
    processNet("dnn/MobileNetSSD_deploy.caffemodel", "dnn/MobileNetSSD_deploy.prototxt",
-               inp, "detection_out", "", l1, lInf);
+               inp, "detection_out", "", diffScores);
}

TEST_P(DNNTestNetwork, MobileNet_SSD_v1_TensorFlow)
...
...@@ -157,7 +157,8 @@ static inline bool checkMyriadTarget()
    net.addLayerToPrev("testLayer", "Identity", lp);
    net.setPreferableBackend(cv::dnn::DNN_BACKEND_INFERENCE_ENGINE);
    net.setPreferableTarget(cv::dnn::DNN_TARGET_MYRIAD);
-    net.setInput(cv::Mat::zeros(1, 1, CV_32FC1));
+    static int inpDims[] = {1, 2, 3, 4};
+    net.setInput(cv::Mat(4, &inpDims[0], CV_32FC1, cv::Scalar(0)));
    try
    {
        net.forward();
...
...@@ -143,7 +143,7 @@ TEST_P(Test_Darknet_nets, YoloVoc)
    classIds[0] = 6;  confidences[0] = 0.750469f; boxes[0] = Rect2d(0.577374, 0.127391, 0.325575, 0.173418);  // a car
    classIds[1] = 1;  confidences[1] = 0.780879f; boxes[1] = Rect2d(0.270762, 0.264102, 0.461713, 0.48131);   // a bicycle
    classIds[2] = 11; confidences[2] = 0.901615f; boxes[2] = Rect2d(0.1386, 0.338509, 0.282737, 0.60028);     // a dog
-    double scoreDiff = (targetId == DNN_TARGET_OPENCL_FP16 || targetId == DNN_TARGET_MYRIAD) ? 7e-3 : 8e-5;
+    double scoreDiff = (targetId == DNN_TARGET_OPENCL_FP16 || targetId == DNN_TARGET_MYRIAD) ? 1e-2 : 8e-5;
    double iouDiff = (targetId == DNN_TARGET_OPENCL_FP16 || targetId == DNN_TARGET_MYRIAD) ? 0.013 : 3e-5;
    testDarknetModel("yolo-voc.cfg", "yolo-voc.weights", outNames,
                     classIds, confidences, boxes, backendId, targetId, scoreDiff, iouDiff);
...
...@@ -925,6 +925,10 @@ TEST(Layer_Test_Convolution_DLDT, Accuracy)
    Mat out = net.forward();

    normAssert(outDefault, out);
+
+    std::vector<int> outLayers = net.getUnconnectedOutLayers();
+    ASSERT_EQ(net.getLayer(outLayers[0])->name, "output_merge");
+    ASSERT_EQ(net.getLayer(outLayers[0])->type, "Concat");
}

// 1. Create a .prototxt file with the following network:
...@@ -1183,6 +1187,7 @@ TEST(Layer_Test_PoolingIndices, Accuracy)
            }
        }
    }
+    net.setPreferableBackend(DNN_BACKEND_OPENCV);
    net.setInput(blobFromImage(inp));
    std::vector<Mat> outputs;
...
...@@ -127,6 +127,7 @@ TEST_P(Test_TensorFlow_layers, conv)
    runTensorFlowNet("atrous_conv2d_same", targetId);
    runTensorFlowNet("depthwise_conv2d", targetId);
    runTensorFlowNet("keras_atrous_conv2d_same", targetId);
+    runTensorFlowNet("conv_pool_nchw", targetId);
}

TEST_P(Test_TensorFlow_layers, padding)
...@@ -142,9 +143,10 @@ TEST_P(Test_TensorFlow_layers, eltwise_add_mul)
    runTensorFlowNet("eltwise_add_mul", GetParam());
}

-TEST_P(Test_TensorFlow_layers, pad_and_concat)
+TEST_P(Test_TensorFlow_layers, concat)
{
    runTensorFlowNet("pad_and_concat", GetParam());
+    runTensorFlowNet("concat_axis_1", GetParam());
}

TEST_P(Test_TensorFlow_layers, batch_norm)
...@@ -440,4 +442,20 @@ TEST(Test_TensorFlow, resize_bilinear)
    runTensorFlowNet("resize_bilinear_factor");
}

+TEST(Test_TensorFlow, two_inputs)
+{
+    Net net = readNet(path("two_inputs_net.pbtxt"));
+    net.setPreferableBackend(DNN_BACKEND_OPENCV);
+    Mat firstInput(2, 3, CV_32FC1), secondInput(2, 3, CV_32FC1);
+    randu(firstInput, -1, 1);
+    randu(secondInput, -1, 1);
+    net.setInput(firstInput, "first_input");
+    net.setInput(secondInput, "second_input");
+
+    Mat out = net.forward();
+
+    normAssert(out, firstInput + secondInput);
+}
+
}
...@@ -175,8 +175,6 @@ bool SunRasterDecoder::readData( Mat& img )
    AutoBuffer<uchar> _src(src_pitch + 32);
    uchar* src = _src;
-    AutoBuffer<uchar> _bgr(m_width*3 + 32);
-    uchar* bgr = _bgr;

    if( !color && m_maptype == RMT_EQUAL_RGB )
        CvtPaletteToGray( m_palette, gray_palette, 1 << m_bpp );
...@@ -340,16 +338,18 @@ bad_decoding_end:
        case 24:
            for( y = 0; y < m_height; y++, data += step )
            {
-                m_strm.getBytes( color ? data : bgr, src_pitch );
+                m_strm.getBytes(src, src_pitch );

                if( color )
                {
                    if( m_type == RAS_FORMAT_RGB )
-                        icvCvt_RGB2BGR_8u_C3R( data, 0, data, 0, cvSize(m_width,1) );
+                        icvCvt_RGB2BGR_8u_C3R(src, 0, data, 0, cvSize(m_width,1) );
+                    else
+                        memcpy(data, src, std::min(step, (size_t)src_pitch));
                }
                else
                {
-                    icvCvt_BGR2Gray_8u_C3C1R( bgr, 0, data, 0, cvSize(m_width,1),
+                    icvCvt_BGR2Gray_8u_C3C1R(src, 0, data, 0, cvSize(m_width,1),
                                              m_type == RAS_FORMAT_RGB ? 2 : 0 );
                }
            }
...
...@@ -670,6 +670,14 @@ public:
    void groupRectangles(std::vector<cv::Rect>& rectList, std::vector<double>& weights, int groupThreshold, double eps) const;
};

/** @brief Detect a QR code in an image and return the quadrangle of minimal area that describes it.

    @param in Matrix of type CV_8UC1 containing an image where a QR code is to be detected.
    @param points Output vector of vertices of a quadrangle of minimal area that describes the QR code.
    @param eps_x Epsilon neighborhood that allows detecting the horizontal 1:1:3:1:1 pattern required by the QR code standard.
    @param eps_y Epsilon neighborhood that allows detecting the vertical 1:1:3:1:1 pattern required by the QR code standard.
*/
CV_EXPORTS bool detectQRCode(InputArray in, std::vector<Point> &points, double eps_x = 0.2, double eps_y = 0.1);

//! @} objdetect
}
...
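A hedged usage sketch for the declaration above (not part of the patch; the image file name is hypothetical, and only the calls shown in the header are used):

```cpp
#include <opencv2/objdetect.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main()
{
    // Hypothetical input: a grayscale image that may contain a QR code (CV_8UC1 as required)
    cv::Mat gray = cv::imread("qrcode.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return -1;

    std::vector<cv::Point> corners;
    // Uses the default eps_x / eps_y from the declaration; returns false if no code is found
    bool found = cv::detectQRCode(gray, corners);

    // On success, 'corners' holds the vertices of the minimal quadrangle around the code
    return found ? 0 : 1;
}
```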
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// Intel License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2000, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of Intel Corporation may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "test_precomp.hpp"
namespace opencv_test { namespace {
TEST(Objdetect_QRCode, regression)
{
String root = cvtest::TS::ptr()->get_data_path() + "qrcode/";
// String cascades[] =
// {
// root + "haarcascade_frontalface_alt.xml",
// root + "lbpcascade_frontalface.xml",
// String()
// };
// vector<Rect> objects;
// RNG rng((uint64)-1);
// for( int i = 0; !cascades[i].empty(); i++ )
// {
// printf("%d. %s\n", i, cascades[i].c_str());
// CascadeClassifier cascade(cascades[i]);
// for( int j = 0; j < 100; j++ )
// {
// int width = rng.uniform(1, 100);
// int height = rng.uniform(1, 100);
// Mat img(height, width, CV_8U);
// randu(img, 0, 256);
// cascade.detectMultiScale(img, objects);
// }
// }
}
}} // namespace
...@@ -250,7 +250,9 @@ when fullAffine=false.

@sa
estimateAffine2D, estimateAffinePartial2D, getAffineTransform, getPerspectiveTransform, findHomography
*/
-CV_EXPORTS_W Mat estimateRigidTransform( InputArray src, InputArray dst, bool fullAffine );
+CV_EXPORTS_W Mat estimateRigidTransform( InputArray src, InputArray dst, bool fullAffine);
+CV_EXPORTS_W Mat estimateRigidTransform( InputArray src, InputArray dst, bool fullAffine, int ransacMaxIters, double ransacGoodRatio,
+                                         int ransacSize0);

enum
...
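A minimal sketch of calling the new overload (not part of the patch; the point correspondences are made up, and the parameter values shown are the ones the three-argument overload forwards, per the implementation change below):

```cpp
#include <opencv2/video/tracking.hpp>
#include <vector>

int main()
{
    // Hypothetical correspondences between two point sets
    std::vector<cv::Point2f> src = { {10, 10}, {100, 10}, {10, 100}, {100, 100} };
    std::vector<cv::Point2f> dst = { {12, 11}, {102, 12}, {11, 102}, {101, 103} };

    // New overload: the RANSAC parameters that used to be hard-coded constants
    // (500 iterations, good ratio 0.5, subset size 3) are now caller-controlled.
    cv::Mat M = cv::estimateRigidTransform(src, dst, /*fullAffine=*/false,
                                           /*ransacMaxIters=*/500,
                                           /*ransacGoodRatio=*/0.5,
                                           /*ransacSize0=*/3);

    // M is a 2x3 transform matrix, or empty if RANSAC failed to find a consensus.
    return M.empty() ? 1 : 0;
}
```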
...@@ -1402,7 +1402,7 @@ namespace cv
{

static void
-getRTMatrix( const Point2f* a, const Point2f* b,
+getRTMatrix( const std::vector<Point2f> a, const std::vector<Point2f> b,
             int count, Mat& M, bool fullAffine )
{
    CV_Assert( M.isContinuous() );
...@@ -1478,6 +1478,12 @@ getRTMatrix( const Point2f* a, const Point2f* b,
}

cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullAffine )
+{
+    return estimateRigidTransform(src1, src2, fullAffine, 500, 0.5, 3);
+}
+
+cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullAffine, int ransacMaxIters, double ransacGoodRatio,
+                                    const int ransacSize0)
{
    CV_INSTRUMENT_REGION()
...@@ -1485,9 +1491,6 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
    const int COUNT = 15;
    const int WIDTH = 160, HEIGHT = 120;
-    const int RANSAC_MAX_ITERS = 500;
-    const int RANSAC_SIZE0 = 3;
-    const double RANSAC_GOOD_RATIO = 0.5;

    std::vector<Point2f> pA, pB;
    std::vector<int> good_idx;
...@@ -1499,6 +1502,12 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
    RNG rng((uint64)-1);
    int good_count = 0;

+    if( ransacSize0 < 3 )
+        CV_Error( Error::StsBadArg, "ransacSize0 should have value bigger than 2.");
+
+    if( ransacGoodRatio > 1 || ransacGoodRatio < 0)
+        CV_Error( Error::StsBadArg, "ransacGoodRatio should have value between 0 and 1");
+
    if( A.size() != B.size() )
        CV_Error( Error::StsUnmatchedSizes, "Both input images must have the same size" );
...@@ -1587,23 +1596,23 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
    good_idx.resize(count);

-    if( count < RANSAC_SIZE0 )
+    if( count < ransacSize0 )
        return Mat();

    Rect brect = boundingRect(pB);

+    std::vector<Point2f> a(ransacSize0);
+    std::vector<Point2f> b(ransacSize0);
+
    // RANSAC stuff:
    // 1. find the consensus
-    for( k = 0; k < RANSAC_MAX_ITERS; k++ )
+    for( k = 0; k < ransacMaxIters; k++ )
    {
-        int idx[RANSAC_SIZE0];
-        Point2f a[RANSAC_SIZE0];
-        Point2f b[RANSAC_SIZE0];
-        // choose random 3 non-coplanar points from A & B
-        for( i = 0; i < RANSAC_SIZE0; i++ )
+        std::vector<int> idx(ransacSize0);
+        // choose random 3 non-complanar points from A & B
+        for( i = 0; i < ransacSize0; i++ )
        {
-            for( k1 = 0; k1 < RANSAC_MAX_ITERS; k1++ )
+            for( k1 = 0; k1 < ransacMaxIters; k1++ )
            {
                idx[i] = rng.uniform(0, count);
...@@ -1623,7 +1632,7 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
                if( j < i )
                    continue;

-                if( i+1 == RANSAC_SIZE0 )
+                if( i+1 == ransacSize0 )
                {
                    // additional check for non-complanar vectors
                    a[0] = pA[idx[0]];
...@@ -1647,11 +1656,11 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
                    break;
            }

-            if( k1 >= RANSAC_MAX_ITERS )
+            if( k1 >= ransacMaxIters )
                break;
        }

-        if( i < RANSAC_SIZE0 )
+        if( i < ransacSize0 )
            continue;

        // estimate the transformation using 3 points
...@@ -1665,11 +1674,11 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
                good_idx[good_count++] = i;
        }

-        if( good_count >= count*RANSAC_GOOD_RATIO )
+        if( good_count >= count*ransacGoodRatio )
            break;
    }

-    if( k >= RANSAC_MAX_ITERS )
+    if( k >= ransacMaxIters )
        return Mat();

    if( good_count < count )
...@@ -1682,7 +1691,7 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
        }
    }

-    getRTMatrix( &pA[0], &pB[0], good_count, M, fullAffine );
+    getRTMatrix( pA, pB, good_count, M, fullAffine );

    M.at<double>(0, 2) /= scale;
    M.at<double>(1, 2) /= scale;
...
/**
- * @function Watershed_and_Distance_Transform.cpp
 * @brief Sample code showing how to segment overlapping objects using Laplacian filtering, in addition to Watershed and Distance Transformation
 * @author OpenCV Team
 */
...@@ -12,39 +11,43 @@
using namespace std;
using namespace cv;

-int main()
+int main(int argc, char *argv[])
{
    //! [load_image]
    // Load the image
-    Mat src = imread("../data/cards.png");
+    CommandLineParser parser( argc, argv, "{@input | ../data/cards.png | input image}" );
+    Mat src = imread( parser.get<String>( "@input" ) );
-    // Check if everything was fine
-    if (!src.data)
+    if( src.empty() )
+    {
+        cout << "Could not open or find the image!\n" << endl;
+        cout << "Usage: " << argv[0] << " <Input image>" << endl;
        return -1;
+    }

    // Show source image
    imshow("Source Image", src);
    //! [load_image]

    //! [black_bg]
    // Change the background from white to black, since that will help later to extract
    // better results during the use of Distance Transform
-    for( int x = 0; x < src.rows; x++ ) {
+    for ( int i = 0; i < src.rows; i++ ) {
-        for( int y = 0; y < src.cols; y++ ) {
+        for ( int j = 0; j < src.cols; j++ ) {
-            if ( src.at<Vec3b>(x, y) == Vec3b(255,255,255) ) {
+            if ( src.at<Vec3b>(i, j) == Vec3b(255,255,255) )
-                src.at<Vec3b>(x, y)[0] = 0;
+            {
-                src.at<Vec3b>(x, y)[1] = 0;
+                src.at<Vec3b>(i, j)[0] = 0;
-                src.at<Vec3b>(x, y)[2] = 0;
+                src.at<Vec3b>(i, j)[1] = 0;
+                src.at<Vec3b>(i, j)[2] = 0;
            }
        }
    }

    // Show output image
    imshow("Black Background Image", src);
    //! [black_bg]

    //! [sharp]
-    // Create a kernel that we will use for accuting/sharpening our image
+    // Create a kernel that we will use to sharpen our image
    Mat kernel = (Mat_<float>(3,3) <<
                  1,  1, 1,
                  1, -8, 1,
...@@ -57,8 +60,8 @@ int main()
    // BUT a 8bits unsigned int (the one we are working with) can contain values from 0 to 255
    // so the possible negative number will be truncated
    Mat imgLaplacian;
-    Mat sharp = src; // copy source image to another temporary one
-    filter2D(sharp, imgLaplacian, CV_32F, kernel);
+    filter2D(src, imgLaplacian, CV_32F, kernel);
+    Mat sharp;
    src.convertTo(sharp, CV_32F);
    Mat imgResult = sharp - imgLaplacian;
...@@ -68,41 +71,39 @@ int main()
    // imshow( "Laplace Filtered Image", imgLaplacian );
    imshow( "New Sharped Image", imgResult );
    //! [sharp]
-    src = imgResult; // copy back

    //! [bin]
    // Create binary image from source image
    Mat bw;
-    cvtColor(src, bw, COLOR_BGR2GRAY);
+    cvtColor(imgResult, bw, COLOR_BGR2GRAY);
    threshold(bw, bw, 40, 255, THRESH_BINARY | THRESH_OTSU);
    imshow("Binary Image", bw);
    //! [bin]

    //! [dist]
    // Perform the distance transform algorithm
    Mat dist;
    distanceTransform(bw, dist, DIST_L2, 3);

    // Normalize the distance image for range = {0.0, 1.0}
    // so we can visualize and threshold it
-    normalize(dist, dist, 0, 1., NORM_MINMAX);
+    normalize(dist, dist, 0, 1.0, NORM_MINMAX);
    imshow("Distance Transform Image", dist);
    //! [dist]

    //! [peaks]
    // Threshold to obtain the peaks
    // This will be the markers for the foreground objects
-    threshold(dist, dist, .4, 1., THRESH_BINARY);
+    threshold(dist, dist, 0.4, 1.0, THRESH_BINARY);

    // Dilate a bit the dist image
-    Mat kernel1 = Mat::ones(3, 3, CV_8UC1);
+    Mat kernel1 = Mat::ones(3, 3, CV_8U);
    dilate(dist, dist, kernel1);
    imshow("Peaks", dist);
    //! [peaks]

    //! [seeds]
    // Create the CV_8U version of the distance image
    // It is needed for findContours()
    Mat dist_8u;
...@@ -113,34 +114,36 @@ int main()
    findContours(dist_8u, contours, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE);

    // Create the marker image for the watershed algorithm
-    Mat markers = Mat::zeros(dist.size(), CV_32SC1);
+    Mat markers = Mat::zeros(dist.size(), CV_32S);

    // Draw the foreground markers
    for (size_t i = 0; i < contours.size(); i++)
-        drawContours(markers, contours, static_cast<int>(i), Scalar::all(static_cast<int>(i)+1), -1);
+    {
+        drawContours(markers, contours, static_cast<int>(i), Scalar(static_cast<int>(i)+1), -1);
+    }

    // Draw the background marker
-    circle(markers, Point(5,5), 3, CV_RGB(255,255,255), -1);
+    circle(markers, Point(5,5), 3, Scalar(255), -1);
    imshow("Markers", markers*10000);
    //! [seeds]

    //! [watershed]
    // Perform the watershed algorithm
-    watershed(src, markers);
+    watershed(imgResult, markers);

-    Mat mark = Mat::zeros(markers.size(), CV_8UC1);
+    Mat mark;
-    markers.convertTo(mark, CV_8UC1);
+    markers.convertTo(mark, CV_8U);
    bitwise_not(mark, mark);
    // imshow("Markers_v2", mark); // uncomment this if you want to see how the mark
    // image looks like at that point

    // Generate random colors
    vector<Vec3b> colors;
    for (size_t i = 0; i < contours.size(); i++)
    {
-        int b = theRNG().uniform(0, 255);
+        int b = theRNG().uniform(0, 256);
-        int g = theRNG().uniform(0, 255);
+        int g = theRNG().uniform(0, 256);
-        int r = theRNG().uniform(0, 255);
+        int r = theRNG().uniform(0, 256);

        colors.push_back(Vec3b((uchar)b, (uchar)g, (uchar)r));
    }
...@@ -155,16 +158,16 @@ int main()
        {
            int index = markers.at<int>(i,j);
            if (index > 0 && index <= static_cast<int>(contours.size()))
+            {
                dst.at<Vec3b>(i,j) = colors[index-1];
-            else
+            }
-                dst.at<Vec3b>(i,j) = Vec3b(0,0,0);
        }
    }

    // Visualize the final image
    imshow("Final Result", dst);
    //! [watershed]

-    waitKey(0);
+    waitKey();
    return 0;
}
...@@ -22,6 +22,7 @@ const char* keys =
    "{ height | -1 | Preprocess input image by resizing to a specific height. }"
    "{ rgb    |    | Indicate that model works with RGB input images instead BGR ones. }"
    "{ thr    | .5 | Confidence threshold. }"
"{ thr | .4 | Non-maximum suppression threshold. }"
"{ backend | 0 | Choose one of computation backends: " "{ backend | 0 | Choose one of computation backends: "
"0: automatically (by default), " "0: automatically (by default), "
"1: Halide language (http://halide-lang.org/), " "1: Halide language (http://halide-lang.org/), "
...@@ -37,7 +38,7 @@ const char* keys = ...@@ -37,7 +38,7 @@ const char* keys =
using namespace cv; using namespace cv;
using namespace dnn; using namespace dnn;
float confThreshold; float confThreshold, nmsThreshold;
std::vector<std::string> classes; std::vector<std::string> classes;
 void postprocess(Mat& frame, const std::vector<Mat>& out, Net& net);
@@ -59,6 +60,7 @@ int main(int argc, char** argv)
     }
     confThreshold = parser.get<float>("thr");
+    nmsThreshold = parser.get<float>("nms");
     float scale = parser.get<float>("scale");
     Scalar mean = parser.get<Scalar>("mean");
     bool swapRB = parser.get<bool>("rgb");
@@ -144,6 +146,9 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
     static std::vector<int> outLayers = net.getUnconnectedOutLayers();
     static std::string outLayerType = net.getLayer(outLayers[0])->type;

+    std::vector<int> classIds;
+    std::vector<float> confidences;
+    std::vector<Rect> boxes;
     if (net.getLayer(0)->outputNameToIndex("im_info") != -1)  // Faster-RCNN or R-FCN
     {
         // Network produces output blob with a shape 1x1xNx7 where N is a number of
@@ -160,8 +165,11 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
                 int top = (int)data[i + 4];
                 int right = (int)data[i + 5];
                 int bottom = (int)data[i + 6];
-                int classId = (int)(data[i + 1]) - 1;  // Skip 0th background class id.
-                drawPred(classId, confidence, left, top, right, bottom, frame);
+                int width = right - left + 1;
+                int height = bottom - top + 1;
+                classIds.push_back((int)(data[i + 1]) - 1);  // Skip 0th background class id.
+                boxes.push_back(Rect(left, top, width, height));
+                confidences.push_back(confidence);
             }
         }
     }
@@ -181,16 +189,16 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
                 int top = (int)(data[i + 4] * frame.rows);
                 int right = (int)(data[i + 5] * frame.cols);
                 int bottom = (int)(data[i + 6] * frame.rows);
-                int classId = (int)(data[i + 1]) - 1;  // Skip 0th background class id.
-                drawPred(classId, confidence, left, top, right, bottom, frame);
+                int width = right - left + 1;
+                int height = bottom - top + 1;
+                classIds.push_back((int)(data[i + 1]) - 1);  // Skip 0th background class id.
+                boxes.push_back(Rect(left, top, width, height));
+                confidences.push_back(confidence);
             }
         }
     }
     else if (outLayerType == "Region")
     {
-        std::vector<int> classIds;
-        std::vector<float> confidences;
-        std::vector<Rect> boxes;
         for (size_t i = 0; i < outs.size(); ++i)
         {
             // Network produces output blob with a shape NxC where N is a number of
@@ -218,8 +226,12 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
             }
         }
     }
+    else
+        CV_Error(Error::StsNotImplemented, "Unknown output layer type: " + outLayerType);
+
     std::vector<int> indices;
-    NMSBoxes(boxes, confidences, confThreshold, 0.4f, indices);
+    NMSBoxes(boxes, confidences, confThreshold, nmsThreshold, indices);
     for (size_t i = 0; i < indices.size(); ++i)
     {
         int idx = indices[i];
@@ -227,9 +239,6 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
         drawPred(classIds[idx], confidences[idx], box.x, box.y,
                  box.x + box.width, box.y + box.height, frame);
     }
-    }
-    else
-        CV_Error(Error::StsNotImplemented, "Unknown output layer type: " + outLayerType);
 }

 void drawPred(int classId, float conf, int left, int top, int right, int bottom, Mat& frame)
......
@@ -31,6 +31,7 @@ parser.add_argument('--height', type=int,
 parser.add_argument('--rgb', action='store_true',
                     help='Indicate that model works with RGB input images instead BGR ones.')
 parser.add_argument('--thr', type=float, default=0.5, help='Confidence threshold')
+parser.add_argument('--nms', type=float, default=0.4, help='Non-maximum suppression threshold')
 parser.add_argument('--backend', choices=backends, default=cv.dnn.DNN_BACKEND_DEFAULT, type=int,
                     help="Choose one of computation backends: "
                          "%d: automatically (by default), "
@@ -57,6 +58,7 @@ net.setPreferableBackend(args.backend)
 net.setPreferableTarget(args.target)

 confThreshold = args.thr
+nmsThreshold = args.nms

 def getOutputsNames(net):
     layersNames = net.getLayerNames()
@@ -86,12 +88,14 @@ def postprocess(frame, outs):
     lastLayerId = net.getLayerId(layerNames[-1])
     lastLayer = net.getLayer(lastLayerId)

+    classIds = []
+    confidences = []
+    boxes = []
     if net.getLayer(0).outputNameToIndex('im_info') != -1:  # Faster-RCNN or R-FCN
         # Network produces output blob with a shape 1x1xNx7 where N is a number of
         # detections and an every detection is a vector of values
         # [batchId, classId, confidence, left, top, right, bottom]
-        assert(len(outs) == 1)
-        out = outs[0]
+        for out in outs:
             for detection in out[0, 0]:
                 confidence = detection[2]
                 if confidence > confThreshold:
@@ -99,14 +103,16 @@ def postprocess(frame, outs):
                     top = int(detection[4])
                     right = int(detection[5])
                     bottom = int(detection[6])
-                    classId = int(detection[1]) - 1  # Skip background label
-                    drawPred(classId, confidence, left, top, right, bottom)
+                    width = right - left + 1
+                    height = bottom - top + 1
+                    classIds.append(int(detection[1]) - 1)  # Skip background label
+                    confidences.append(float(confidence))
+                    boxes.append([left, top, width, height])
     elif lastLayer.type == 'DetectionOutput':
         # Network produces output blob with a shape 1x1xNx7 where N is a number of
         # detections and an every detection is a vector of values
         # [batchId, classId, confidence, left, top, right, bottom]
-        assert(len(outs) == 1)
-        out = outs[0]
+        for out in outs:
             for detection in out[0, 0]:
                 confidence = detection[2]
                 if confidence > confThreshold:
@@ -114,8 +120,11 @@ def postprocess(frame, outs):
                     top = int(detection[4] * frameHeight)
                     right = int(detection[5] * frameWidth)
                     bottom = int(detection[6] * frameHeight)
-                    classId = int(detection[1]) - 1  # Skip background label
-                    drawPred(classId, confidence, left, top, right, bottom)
+                    width = right - left + 1
+                    height = bottom - top + 1
+                    classIds.append(int(detection[1]) - 1)  # Skip background label
+                    confidences.append(float(confidence))
+                    boxes.append([left, top, width, height])
     elif lastLayer.type == 'Region':
         # Network produces output blob with a shape NxC where N is a number of
         # detected objects and C is a number of classes + 4 where the first 4
@@ -138,7 +147,11 @@ def postprocess(frame, outs):
                 classIds.append(classId)
                 confidences.append(float(confidence))
                 boxes.append([left, top, width, height])
-    indices = cv.dnn.NMSBoxes(boxes, confidences, confThreshold, 0.4)
+    else:
+        print('Unknown output layer type: ' + lastLayer.type)
+        exit()
+
+    indices = cv.dnn.NMSBoxes(boxes, confidences, confThreshold, nmsThreshold)
     for i in indices:
         i = i[0]
         box = boxes[i]
......
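The net effect of the change above, in both the C++ and Python samples, is that detections from every supported output-layer type are first collected into classIds / confidences / boxes and only then filtered once with non-maximum suppression, driven by the new --nms threshold instead of the hard-coded 0.4. A minimal, self-contained sketch of that final step is shown below; the box coordinates and scores are made-up placeholder values, not data from the samples.

import cv2 as cv

# Hypothetical detections in the format postprocess() accumulates:
# each box is [left, top, width, height]; confidences are the per-box scores.
boxes = [[50, 40, 100, 80], [55, 45, 100, 80], [300, 200, 60, 60]]
confidences = [0.9, 0.75, 0.6]
classIds = [1, 1, 7]

confThreshold = 0.5  # plays the role of --thr
nmsThreshold = 0.4   # plays the role of the new --nms argument

# Keep only the highest-scoring box in each group of strongly overlapping boxes.
indices = cv.dnn.NMSBoxes(boxes, confidences, confThreshold, nmsThreshold)
for i in indices:
    i = i[0]  # this OpenCV version returns the kept indices as Nx1 rows
    print('class', classIds[i], 'score', confidences[i], 'box', boxes[i])

With these placeholder values the second box is suppressed, because it overlaps the first one with an IoU above nmsThreshold, while the third box survives on its own.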
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Point;
import org.opencv.core.Scalar;
import org.opencv.highgui.HighGui;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;
/**
*
* @brief Sample code showing how to segment overlapping objects using Laplacian filtering, in addition to Watershed
* and Distance Transformation
*
*/
class ImageSegmentation {
public void run(String[] args) {
//! [load_image]
// Load the image
String filename = args.length > 0 ? args[0] : "../data/cards.png";
Mat srcOriginal = Imgcodecs.imread(filename);
if (srcOriginal.empty()) {
System.err.println("Cannot read image: " + filename);
System.exit(0);
}
// Show source image
HighGui.imshow("Source Image", srcOriginal);
//! [load_image]
//! [black_bg]
// Change the background from white to black, since that will help later to
// extract better results during the use of Distance Transform
Mat src = srcOriginal.clone();
byte[] srcData = new byte[(int) (src.total() * src.channels())];
src.get(0, 0, srcData);
for (int i = 0; i < src.rows(); i++) {
for (int j = 0; j < src.cols(); j++) {
if (srcData[(i * src.cols() + j) * 3] == (byte) 255 && srcData[(i * src.cols() + j) * 3 + 1] == (byte) 255
&& srcData[(i * src.cols() + j) * 3 + 2] == (byte) 255) {
srcData[(i * src.cols() + j) * 3] = 0;
srcData[(i * src.cols() + j) * 3 + 1] = 0;
srcData[(i * src.cols() + j) * 3 + 2] = 0;
}
}
}
src.put(0, 0, srcData);
// Show output image
HighGui.imshow("Black Background Image", src);
//! [black_bg]
//! [sharp]
// Create a kernel that we will use to sharpen our image
Mat kernel = new Mat(3, 3, CvType.CV_32F);
// an approximation of second derivative, a quite strong kernel
float[] kernelData = new float[(int) (kernel.total() * kernel.channels())];
kernelData[0] = 1; kernelData[1] = 1; kernelData[2] = 1;
kernelData[3] = 1; kernelData[4] = -8; kernelData[5] = 1;
kernelData[6] = 1; kernelData[7] = 1; kernelData[8] = 1;
kernel.put(0, 0, kernelData);
// do the laplacian filtering as it is
// well, we need to convert everything to something deeper than CV_8U
// because the kernel has some negative values,
// and we can expect in general to have a Laplacian image with negative values
// BUT an 8-bit unsigned int (the one we are working with) can contain values
// from 0 to 255
// so the possible negative numbers will be truncated
Mat imgLaplacian = new Mat();
Imgproc.filter2D(src, imgLaplacian, CvType.CV_32F, kernel);
Mat sharp = new Mat();
src.convertTo(sharp, CvType.CV_32F);
Mat imgResult = new Mat();
Core.subtract(sharp, imgLaplacian, imgResult);
// convert back to 8bits gray scale
imgResult.convertTo(imgResult, CvType.CV_8UC3);
imgLaplacian.convertTo(imgLaplacian, CvType.CV_8UC3);
// imshow( "Laplace Filtered Image", imgLaplacian );
HighGui.imshow("New Sharped Image", imgResult);
//! [sharp]
//! [bin]
// Create binary image from source image
Mat bw = new Mat();
Imgproc.cvtColor(imgResult, bw, Imgproc.COLOR_BGR2GRAY);
Imgproc.threshold(bw, bw, 40, 255, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU);
HighGui.imshow("Binary Image", bw);
//! [bin]
//! [dist]
// Perform the distance transform algorithm
Mat dist = new Mat();
Imgproc.distanceTransform(bw, dist, Imgproc.DIST_L2, 3);
// Normalize the distance image for range = {0.0, 1.0}
// so we can visualize and threshold it
Core.normalize(dist, dist, 0, 1., Core.NORM_MINMAX);
Mat distDisplayScaled = dist.mul(dist, 255);
Mat distDisplay = new Mat();
distDisplayScaled.convertTo(distDisplay, CvType.CV_8U);
HighGui.imshow("Distance Transform Image", distDisplay);
//! [dist]
//! [peaks]
// Threshold to obtain the peaks
// This will be the markers for the foreground objects
Imgproc.threshold(dist, dist, .4, 1., Imgproc.THRESH_BINARY);
// Dilate the dist image a bit
Mat kernel1 = Mat.ones(3, 3, CvType.CV_8U);
Imgproc.dilate(dist, dist, kernel1);
Mat distDisplay2 = new Mat();
dist.convertTo(distDisplay2, CvType.CV_8U);
distDisplay2 = distDisplay2.mul(distDisplay2, 255);
HighGui.imshow("Peaks", distDisplay2);
//! [peaks]
//! [seeds]
// Create the CV_8U version of the distance image
// It is needed for findContours()
Mat dist_8u = new Mat();
dist.convertTo(dist_8u, CvType.CV_8U);
// Find total markers
List<MatOfPoint> contours = new ArrayList<>();
Mat hierarchy = new Mat();
Imgproc.findContours(dist_8u, contours, hierarchy, Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
// Create the marker image for the watershed algorithm
Mat markers = Mat.zeros(dist.size(), CvType.CV_32S);
// Draw the foreground markers
for (int i = 0; i < contours.size(); i++) {
Imgproc.drawContours(markers, contours, i, new Scalar(i + 1), -1);
}
// Draw the background marker
Imgproc.circle(markers, new Point(5, 5), 3, new Scalar(255, 255, 255), -1);
Mat markersScaled = markers.mul(markers, 10000);
Mat markersDisplay = new Mat();
markersScaled.convertTo(markersDisplay, CvType.CV_8U);
HighGui.imshow("Markers", markersDisplay);
//! [seeds]
//! [watershed]
// Perform the watershed algorithm
Imgproc.watershed(imgResult, markers);
Mat mark = Mat.zeros(markers.size(), CvType.CV_8U);
markers.convertTo(mark, CvType.CV_8UC1);
Core.bitwise_not(mark, mark);
// imshow("Markers_v2", mark); // uncomment this if you want to see how the mark
// image looks at that point
// Generate random colors
Random rng = new Random(12345);
List<Scalar> colors = new ArrayList<>(contours.size());
for (int i = 0; i < contours.size(); i++) {
int b = rng.nextInt(256);
int g = rng.nextInt(256);
int r = rng.nextInt(256);
colors.add(new Scalar(b, g, r));
}
// Create the result image
Mat dst = Mat.zeros(markers.size(), CvType.CV_8UC3);
byte[] dstData = new byte[(int) (dst.total() * dst.channels())];
dst.get(0, 0, dstData);
// Fill labeled objects with random colors
int[] markersData = new int[(int) (markers.total() * markers.channels())];
markers.get(0, 0, markersData);
for (int i = 0; i < markers.rows(); i++) {
for (int j = 0; j < markers.cols(); j++) {
int index = markersData[i * markers.cols() + j];
if (index > 0 && index <= contours.size()) {
dstData[(i * dst.cols() + j) * 3 + 0] = (byte) colors.get(index - 1).val[0];
dstData[(i * dst.cols() + j) * 3 + 1] = (byte) colors.get(index - 1).val[1];
dstData[(i * dst.cols() + j) * 3 + 2] = (byte) colors.get(index - 1).val[2];
} else {
dstData[(i * dst.cols() + j) * 3 + 0] = 0;
dstData[(i * dst.cols() + j) * 3 + 1] = 0;
dstData[(i * dst.cols() + j) * 3 + 2] = 0;
}
}
}
dst.put(0, 0, dstData);
// Visualize the final image
HighGui.imshow("Final Result", dst);
//! [watershed]
HighGui.waitKey();
System.exit(0);
}
}
public class ImageSegmentationDemo {
public static void main(String[] args) {
// Load the native OpenCV library
System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
new ImageSegmentation().run(args);
}
}
from __future__ import print_function
import cv2 as cv
import numpy as np
import argparse
import random as rng
rng.seed(12345)
## [load_image]
# Load the image
parser = argparse.ArgumentParser(description='Code for Image Segmentation with Distance Transform and Watershed Algorithm.\
Sample code showing how to segment overlapping objects using Laplacian filtering, \
in addition to Watershed and Distance Transformation')
parser.add_argument('--input', help='Path to input image.', default='../data/cards.png')
args = parser.parse_args()
src = cv.imread(args.input)
if src is None:
    print('Could not open or find the image:', args.input)
    exit(0)
# Show source image
cv.imshow('Source Image', src)
## [load_image]
## [black_bg]
# Change the background from white to black, since that will help later to extract
# better results during the use of Distance Transform
src[np.all(src == 255, axis=2)] = 0
# Show output image
cv.imshow('Black Background Image', src)
## [black_bg]
## [sharp]
# Create a kernel that we will use to sharpen our image
# an approximation of second derivative, a quite strong kernel
kernel = np.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]], dtype=np.float32)
# do the laplacian filtering as it is
# well, we need to convert everything to something deeper than CV_8U
# because the kernel has some negative values,
# and we can expect in general to have a Laplacian image with negative values
# BUT an 8-bit unsigned int (the one we are working with) can contain values from 0 to 255
# so the possible negative numbers will be truncated
imgLaplacian = cv.filter2D(src, cv.CV_32F, kernel)
sharp = np.float32(src)
imgResult = sharp - imgLaplacian
# convert back to 8bits gray scale
imgResult = np.clip(imgResult, 0, 255)
imgResult = imgResult.astype('uint8')
imgLaplacian = np.clip(imgLaplacian, 0, 255)
imgLaplacian = np.uint8(imgLaplacian)
#cv.imshow('Laplace Filtered Image', imgLaplacian)
cv.imshow('New Sharped Image', imgResult)
## [sharp]
## [bin]
# Create binary image from source image
bw = cv.cvtColor(imgResult, cv.COLOR_BGR2GRAY)
_, bw = cv.threshold(bw, 40, 255, cv.THRESH_BINARY | cv.THRESH_OTSU)
cv.imshow('Binary Image', bw)
## [bin]
## [dist]
# Perform the distance transform algorithm
dist = cv.distanceTransform(bw, cv.DIST_L2, 3)
# Normalize the distance image for range = {0.0, 1.0}
# so we can visualize and threshold it
cv.normalize(dist, dist, 0, 1.0, cv.NORM_MINMAX)
cv.imshow('Distance Transform Image', dist)
## [dist]
## [peaks]
# Threshold to obtain the peaks
# This will be the markers for the foreground objects
_, dist = cv.threshold(dist, 0.4, 1.0, cv.THRESH_BINARY)
# Dilate the dist image a bit
kernel1 = np.ones((3,3), dtype=np.uint8)
dist = cv.dilate(dist, kernel1)
cv.imshow('Peaks', dist)
## [peaks]
## [seeds]
# Create the CV_8U version of the distance image
# It is needed for findContours()
dist_8u = dist.astype('uint8')
# Find total markers
_, contours, _ = cv.findContours(dist_8u, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
# Create the marker image for the watershed algorithm
markers = np.zeros(dist.shape, dtype=np.int32)
# Draw the foreground markers
for i in range(len(contours)):
    cv.drawContours(markers, contours, i, (i+1), -1)
# Draw the background marker
cv.circle(markers, (5,5), 3, (255,255,255), -1)
cv.imshow('Markers', markers*10000)
## [seeds]
## [watershed]
# Perform the watershed algorithm
cv.watershed(imgResult, markers)
#mark = np.zeros(markers.shape, dtype=np.uint8)
mark = markers.astype('uint8')
mark = cv.bitwise_not(mark)
# uncomment this if you want to see how the mark
# image looks at that point
#cv.imshow('Markers_v2', mark)
# Generate random colors
colors = []
for contour in contours:
    colors.append((rng.randint(0,256), rng.randint(0,256), rng.randint(0,256)))
# Create the result image
dst = np.zeros((markers.shape[0], markers.shape[1], 3), dtype=np.uint8)
# Fill labeled objects with random colors
for i in range(markers.shape[0]):
    for j in range(markers.shape[1]):
        index = markers[i,j]
        if index > 0 and index <= len(contours):
            dst[i,j,:] = colors[index-1]
# Visualize the final image
cv.imshow('Final Result', dst)
## [watershed]
cv.waitKey()
@@ -28,10 +28,9 @@ knn_matches = matcher.knnMatch(descriptors1, descriptors2, 2)
 #-- Filter matches using the Lowe's ratio test
 ratio_thresh = 0.7
 good_matches = []
-for matches in knn_matches:
-    if len(matches) > 1:
-        if matches[0].distance / matches[1].distance <= ratio_thresh:
-            good_matches.append(matches[0])
+for m,n in knn_matches:
+    if m.distance / n.distance <= ratio_thresh:
+        good_matches.append(m)

 #-- Draw matches
 img_matches = np.empty((max(img1.shape[0], img2.shape[0]), img1.shape[1]+img2.shape[1], 3), dtype=np.uint8)
......
@@ -28,10 +28,9 @@ knn_matches = matcher.knnMatch(descriptors_obj, descriptors_scene, 2)
 #-- Filter matches using the Lowe's ratio test
 ratio_thresh = 0.75
 good_matches = []
-for matches in knn_matches:
-    if len(matches) > 1:
-        if matches[0].distance / matches[1].distance <= ratio_thresh:
-            good_matches.append(matches[0])
+for m,n in knn_matches:
+    if m.distance / n.distance <= ratio_thresh:
+        good_matches.append(m)

 #-- Draw matches
 img_matches = np.empty((max(img_object.shape[0], img_scene.shape[0]), img_object.shape[1]+img_scene.shape[1], 3), dtype=np.uint8)
......
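The two hunks above replace the explicit length guard with direct unpacking of the two best matches per query descriptor. That relies on knnMatch(..., 2) returning a pair for every descriptor, which holds whenever the matched descriptor set contains at least two entries. Below is a small self-contained sketch of the same ratio-test pattern; the image paths are hypothetical, and ORB with a Hamming-norm brute-force matcher is used here only so the sketch has no extra module dependencies.

import cv2 as cv

# Hypothetical input images; any two overlapping views will do.
img1 = cv.imread('box.png', cv.IMREAD_GRAYSCALE)
img2 = cv.imread('box_in_scene.png', cv.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors.
detector = cv.ORB_create()
keypoints1, descriptors1 = detector.detectAndCompute(img1, None)
keypoints2, descriptors2 = detector.detectAndCompute(img2, None)

# Brute-force matching with the Hamming norm, two nearest neighbours per descriptor.
matcher = cv.BFMatcher(cv.NORM_HAMMING)
knn_matches = matcher.knnMatch(descriptors1, descriptors2, 2)

# Lowe's ratio test: keep a match only if it is clearly better than its runner-up.
ratio_thresh = 0.7
good_matches = [m for m, n in knn_matches if m.distance / n.distance <= ratio_thresh]
print(len(good_matches), 'matches kept out of', len(knn_matches))

The list comprehension is just a compact form of the loop shown in the diff; the filtering logic is identical.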