Commit 3165baa1 authored by Alexander Alekhin

Merge remote-tracking branch 'upstream/3.4' into merge-3.4

parents 9a8a964b 74da80db
......@@ -1624,7 +1624,7 @@ endif()
macro(ocv_git_describe var_name path)
if(GIT_FOUND)
execute_process(COMMAND "${GIT_EXECUTABLE}" describe --tags --tags --exact-match --dirty
execute_process(COMMAND "${GIT_EXECUTABLE}" describe --tags --exact-match --dirty
WORKING_DIRECTORY "${path}"
OUTPUT_VARIABLE ${var_name}
RESULT_VARIABLE GIT_RESULT
......
......@@ -16,42 +16,152 @@ Theory
Code
----
@add_toggle_cpp
This tutorial's code is shown in the lines below. You can also download it from
[here](https://github.com/opencv/opencv/tree/master/samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp).
@include samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp
@end_toggle
@add_toggle_java
This tutorial's code is shown in the lines below. You can also download it from
[here](https://github.com/opencv/opencv/tree/master/samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java)
@include samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java
@end_toggle
@add_toggle_python
This tutorial's code is shown in the lines below. You can also download it from
[here](https://github.com/opencv/opencv/tree/master/samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py)
@include samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py
@end_toggle
Explanation / Result
--------------------
- Load the source image and check if it is loaded without any problem, then show it:
@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp load_image
@end_toggle
@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java load_image
@end_toggle
@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py load_image
@end_toggle
![](images/source.jpeg)
- Then if we have an image with a white background, it is good to transform it to black. This will make it easier to discriminate the foreground objects when we apply the Distance Transform:
@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp black_bg
@end_toggle
@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java black_bg
@end_toggle
@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py black_bg
@end_toggle
![](images/black_bg.jpeg)
- Afterwards we will sharpen our image in order to accentuate the edges of the foreground objects. We will apply a Laplacian filter with a quite strong kernel (an approximation of the second derivative):
@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp sharp
@end_toggle
@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java sharp
@end_toggle
@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py sharp
@end_toggle
![](images/laplace.jpeg)
![](images/sharp.jpeg)
- Now we transform our new sharpened source image to a grayscale and a binary one, respectively:
@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp bin
@end_toggle
@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java bin
@end_toggle
@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py bin
@end_toggle
![](images/bin.jpeg)
- We are now ready to apply the Distance Transform on the binary image. Moreover, we normalize the output image in order to be able to visualize and threshold the result:
@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp dist
@end_toggle
@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java dist
@end_toggle
@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py dist
@end_toggle
![](images/dist_transf.jpeg)
- We threshold the *dist* image and then perform some morphology operations (i.e. dilation) in order to extract the peaks from the above image:
@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp peaks
@end_toggle
@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java peaks
@end_toggle
@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py peaks
@end_toggle
![](images/peaks.jpeg)
- Then, from each blob we create a seed/marker for the watershed algorithm with the help of the @ref cv::findContours function:
@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp seeds
@end_toggle
@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java seeds
@end_toggle
@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py seeds
@end_toggle
![](images/markers.jpeg)
- Finally, we can apply the watershed algorithm, and visualize the result:
@add_toggle_cpp
@snippet samples/cpp/tutorial_code/ImgTrans/imageSegmentation.cpp watershed
@end_toggle
@add_toggle_java
@snippet samples/java/tutorial_code/ImgTrans/distance_transformation/ImageSegmentationDemo.java watershed
@end_toggle
@add_toggle_python
@snippet samples/python/tutorial_code/ImgTrans/distance_transformation/imageSegmentation.py watershed
@end_toggle
![](images/final.jpeg)
......@@ -285,6 +285,8 @@ In this section you will learn about the image processing (manipulation) functio
- @subpage tutorial_distance_transform
*Languages:* C++, Java, Python
*Compatibility:* \> OpenCV 2.0
*Author:* Theodore Tsesmelis
......
......@@ -1985,6 +1985,31 @@ CV_EXPORTS_W int decomposeHomographyMat(InputArray H,
OutputArrayOfArrays translations,
OutputArrayOfArrays normals);
/** @brief Filters homography decompositions based on additional information.
@param rotations Vector of rotation matrices.
@param normals Vector of plane normal matrices.
@param beforePoints Vector of (rectified) visible reference points before the homography is applied.
@param afterPoints Vector of (rectified) visible reference points after the homography is applied.
@param possibleSolutions Vector of int indices representing the viable solution set after filtering.
@param pointsMask Optional Mat/Vector of 8u type representing the mask for the inliers as given by the findHomography function.
This function is intended to filter the output of decomposeHomographyMat based on additional
information as described in @cite Malis . The summary of the method: the decomposeHomographyMat function
returns 2 unique solutions and their "opposites" for a total of 4 solutions. If we have access to the
sets of points visible in the camera frame before and after the homography transformation is applied,
we can determine which are the true potential solutions and which are the opposites by verifying which
homographies are consistent with all visible reference points being in front of the camera. The inputs
are left unchanged; the filtered solution set is returned as indices into the existing one.
*/
CV_EXPORTS_W void filterHomographyDecompByVisibleRefpoints(InputArrayOfArrays rotations,
InputArrayOfArrays normals,
InputArray beforePoints,
InputArray afterPoints,
OutputArray possibleSolutions,
InputArray pointsMask = noArray());
/** @brief The base class for stereo correspondence algorithms.
*/
class CV_EXPORTS_W StereoMatcher : public Algorithm
......
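A minimal usage sketch for the new filterHomographyDecompByVisibleRefpoints API (a hedged example, not part of the commit: it assumes the point correspondences were already rectified with the camera intrinsics and that the inlier mask comes from findHomography; all variable names are illustrative):
// Sketch: keep only the physically consistent homography decompositions.
#include <opencv2/calib3d.hpp>
#include <vector>

std::vector<int> pickValidDecompositions(const cv::Mat& H, const cv::Mat& K,
                                         const std::vector<cv::Point2f>& beforePts,  // rectified points before H
                                         const std::vector<cv::Point2f>& afterPts,   // rectified points after H
                                         const cv::Mat& inlierMask)                  // 8u mask from findHomography
{
    std::vector<cv::Mat> rotations, translations, normals;
    cv::decomposeHomographyMat(H, K, rotations, translations, normals);

    std::vector<int> possibleSolutions;
    cv::filterHomographyDecompByVisibleRefpoints(rotations, normals, beforePts, afterPts,
                                                 possibleSolutions, inlierMask);
    // Indices into rotations/translations/normals for which every visible
    // reference point stays in front of the camera.
    return possibleSolutions;
}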
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// This is a homography decomposition implementation contributed to OpenCV
// by Samson Yilma. It implements the homography decomposition algorithm
// described in the research report:
// Malis, E and Vargas, M, "Deeper understanding of the homography decomposition
// for vision-based control", Research Report 6303, INRIA (2007)
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2014, Samson Yilma (samson_yilma@yahoo.com), all rights reserved.
//
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
//
// This is a homography decomposition implementation contributed to OpenCV
// by Samson Yilma. It implements the homography decomposition algorithm
// described in the research report:
// Malis, E and Vargas, M, "Deeper understanding of the homography decomposition
// for vision-based control", Research Report 6303, INRIA (2007)
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2014, Samson Yilma (samson_yilma@yahoo.com), all rights reserved.
// Copyright (C) 2018, Intel Corporation, all rights reserved.
//
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "precomp.hpp"
#include <memory>
......@@ -489,4 +490,67 @@ int decomposeHomographyMat(InputArray _H,
return nsols;
}
void filterHomographyDecompByVisibleRefpoints(InputArrayOfArrays _rotations,
InputArrayOfArrays _normals,
InputArray _beforeRectifiedPoints,
InputArray _afterRectifiedPoints,
OutputArray _possibleSolutions,
InputArray _pointsMask)
{
CV_Assert(_beforeRectifiedPoints.type() == CV_32FC2 && _afterRectifiedPoints.type() == CV_32FC2);
CV_Assert(_pointsMask.empty() || _pointsMask.type() == CV_8U);
Mat beforeRectifiedPoints = _beforeRectifiedPoints.getMat();
Mat afterRectifiedPoints = _afterRectifiedPoints.getMat();
Mat pointsMask = _pointsMask.getMat();
int nsolutions = (int)_rotations.total();
int npoints = (int)beforeRectifiedPoints.total();
CV_Assert(pointsMask.empty() || pointsMask.checkVector(1, CV_8U) == npoints);
const uchar* pointsMaskPtr = pointsMask.data;
std::vector<uchar> solutionMask(nsolutions, (uchar)1);
std::vector<Mat> normals(nsolutions);
std::vector<Mat> rotnorm(nsolutions);
Mat R;
for( int i = 0; i < nsolutions; i++ )
{
_normals.getMat(i).convertTo(normals[i], CV_64F);
CV_Assert(normals[i].total() == 3);
_rotations.getMat(i).convertTo(R, CV_64F);
rotnorm[i] = R*normals[i];
CV_Assert(rotnorm[i].total() == 3);
}
for( int j = 0; j < npoints; j++ )
{
if( !pointsMaskPtr || pointsMaskPtr[j] )
{
Point2f prevPoint = beforeRectifiedPoints.at<Point2f>(j);
Point2f currPoint = afterRectifiedPoints.at<Point2f>(j);
for( int i = 0; i < nsolutions; i++ )
{
if( !solutionMask[i] )
continue;
const double* normal_i = normals[i].ptr<double>();
const double* rotnorm_i = rotnorm[i].ptr<double>();
double prevNormDot = normal_i[0]*prevPoint.x + normal_i[1]*prevPoint.y + normal_i[2];
double currNormDot = rotnorm_i[0]*currPoint.x + rotnorm_i[1]*currPoint.y + rotnorm_i[2];
if (prevNormDot <= 0 || currNormDot <= 0)
solutionMask[i] = (uchar)0;
}
}
}
std::vector<int> possibleSolutions;
for( int i = 0; i < nsolutions; i++ )
if( solutionMask[i] )
possibleSolutions.push_back(i);
Mat(possibleSolutions).copyTo(_possibleSolutions);
}
} //namespace cv
......@@ -12,7 +12,8 @@ ocv_add_dispatched_file_force_all("layers/layers_common" AVX AVX2 AVX512_SKX)
ocv_add_module(dnn opencv_core opencv_imgproc WRAP python matlab java js)
ocv_option(OPENCV_DNN_OPENCL "Build with OpenCL support" HAVE_OPENCL)
ocv_option(OPENCV_DNN_OPENCL "Build with OpenCL support" HAVE_OPENCL AND NOT APPLE)
if(OPENCV_DNN_OPENCL AND HAVE_OPENCL)
add_definitions(-DCV_OCL4DNN=1)
else()
......
......@@ -1446,7 +1446,7 @@ struct Net::Impl
// TODO: OpenCL target support more fusion styles.
if ( preferableBackend == DNN_BACKEND_OPENCV && IS_DNN_OPENCL_TARGET(preferableTarget) &&
(!cv::ocl::useOpenCL() || (ld.layerInstance->type != "Convolution" &&
ld.layerInstance->type != "MVN")) )
ld.layerInstance->type != "MVN" && ld.layerInstance->type != "Pooling")) )
continue;
Ptr<Layer>& currLayer = ld.layerInstance;
......@@ -1993,11 +1993,17 @@ Net Net::readFromModelOptimizer(const String& xml, const String& bin)
backendNode->net = Ptr<InfEngineBackendNet>(new InfEngineBackendNet(ieNet));
for (auto& it : ieNet.getOutputsInfo())
{
Ptr<Layer> cvLayer(new InfEngineBackendLayer(it.second));
InferenceEngine::CNNLayerPtr ieLayer = ieNet.getLayerByName(it.first.c_str());
CV_Assert(ieLayer);
LayerParams lp;
int lid = cvNet.addLayer(it.first, "", lp);
LayerData& ld = cvNet.impl->layers[lid];
ld.layerInstance = Ptr<Layer>(new InfEngineBackendLayer(it.second));
cvLayer->name = it.first;
cvLayer->type = ieLayer->type;
ld.layerInstance = cvLayer;
ld.backendNodes[DNN_BACKEND_INFERENCE_ENGINE] = backendNode;
for (int i = 0; i < inputsNames.size(); ++i)
......
......@@ -165,6 +165,7 @@ public:
(type == AVE ? LIBDNN_POOLING_METHOD_AVE :
LIBDNN_POOLING_METHOD_STO);
config.avePoolPaddedArea = avePoolPaddedArea;
config.computeMaxIdx = computeMaxIdx;
config.use_half = use_half;
poolOp = Ptr<OCL4DNNPool<float> >(new OCL4DNNPool<float>(config));
}
......
......@@ -352,6 +352,7 @@ struct OCL4DNNPoolConfig
pool_method(LIBDNN_POOLING_METHOD_MAX),
global_pooling(false),
avePoolPaddedArea(true),
computeMaxIdx(true),
use_half(false)
{}
MatShape in_shape;
......@@ -365,6 +366,7 @@ struct OCL4DNNPoolConfig
ocl4dnnPoolingMethod_t pool_method; // = LIBDNN_POOLING_METHOD_MAX;
bool global_pooling; // = false;
bool avePoolPaddedArea;
bool computeMaxIdx;
bool use_half;
};
......@@ -399,6 +401,7 @@ class OCL4DNNPool
int32_t pooled_height_;
int32_t pooled_width_;
bool avePoolPaddedArea;
bool computeMaxIdx;
bool use_half;
};
......
......@@ -56,6 +56,7 @@ OCL4DNNPool<Dtype>::OCL4DNNPool(OCL4DNNPoolConfig config)
channels_ = config.channels;
pool_method_ = config.pool_method;
avePoolPaddedArea = config.avePoolPaddedArea;
computeMaxIdx = config.computeMaxIdx;
use_half = config.use_half;
for (int i = 0; i < spatial_dims; ++i)
......@@ -97,7 +98,7 @@ bool OCL4DNNPool<Dtype>::Forward(const UMat& bottom,
UMat& top_mask)
{
bool ret = true;
size_t global[] = { 128 * 128 };
size_t global[] = { (size_t)count_ };
size_t local[] = { 128 };
// support 2D case
......@@ -105,8 +106,7 @@ bool OCL4DNNPool<Dtype>::Forward(const UMat& bottom,
{
case LIBDNN_POOLING_METHOD_MAX:
{
bool haveMask = !top_mask.empty();
String kname = haveMask ? "max_pool_forward_mask" : "max_pool_forward";
String kname = computeMaxIdx ? "max_pool_forward_mask" : "max_pool_forward";
kname += (use_half) ? "_half" : "_float";
ocl::Kernel oclk_max_pool_forward(
kname.c_str(),
......@@ -118,7 +118,7 @@ bool OCL4DNNPool<Dtype>::Forward(const UMat& bottom,
kernel_w_, kernel_h_,
stride_w_, stride_h_,
pad_w_, pad_h_,
haveMask ? " -D HAVE_MASK=1" : ""
computeMaxIdx ? " -D HAVE_MASK=1" : ""
));
if (oclk_max_pool_forward.empty())
......
......@@ -65,36 +65,40 @@ __kernel void
#endif
)
{
for (int index = get_global_id(0); index < nthreads;
index += get_global_size(0))
int index = get_global_id(0);
if (index >= nthreads)
return;
const int pw = index % pooled_width;
const int xx = index / pooled_width;
const int ph = xx % pooled_height;
const int ch = xx / pooled_height;
int hstart = ph * STRIDE_H - PAD_H;
int wstart = pw * STRIDE_W - PAD_W;
Dtype maxval = -FLT_MAX;
int maxidx = -1;
int in_offset = ch * height * width;
for (int h = 0; h < KERNEL_H; ++h)
{
const int pw = index % pooled_width;
const int ph = (index / pooled_width) % pooled_height;
const int c = (index / pooled_width / pooled_height) % channels;
const int n = index / pooled_width / pooled_height / channels;
int hstart = ph * STRIDE_H - PAD_H;
int wstart = pw * STRIDE_W - PAD_W;
const int hend = min(hstart + KERNEL_H, height);
const int wend = min(wstart + KERNEL_W, width);
hstart = max(hstart, (int)0);
wstart = max(wstart, (int)0);
Dtype maxval = -FLT_MAX;
int maxidx = -1;
__global const Dtype* bottom_slice = bottom_data
+ (n * channels + c) * height * width;
for (int h = hstart; h < hend; ++h) {
for (int w = wstart; w < wend; ++w) {
if (bottom_slice[h * width + w] > maxval) {
maxidx = h * width + w;
maxval = bottom_slice[maxidx];
int off_y = hstart + h;
if (off_y >= 0 && off_y < height)
{
for (int w = 0; w < KERNEL_W; ++w)
{
int off_x = wstart + w;
if (off_x >= 0 && off_x < width)
{
Dtype val = bottom_data[in_offset + off_y * width + off_x];
maxidx = (val > maxval) ? (off_y * width + off_x) : maxidx;
maxval = fmax(val, maxval);
}
}
}
top_data[index] = maxval;
}
top_data[index] = maxval;
#ifdef HAVE_MASK
mask[index] = maxidx;
mask[index] = maxidx;
#endif
}
}
#elif defined KERNEL_AVE_POOL
......@@ -105,43 +109,42 @@ __kernel void TEMPLATE(ave_pool_forward, Dtype)(
const int pooled_height, const int pooled_width,
__global Dtype* top_data)
{
for (int index = get_global_id(0); index < nthreads;
index += get_global_size(0))
{
{
const int pw = index % pooled_width;
const int ph = (index / pooled_width) % pooled_height;
const int c = (index / pooled_width / pooled_height) % channels;
const int n = index / pooled_width / pooled_height / channels;
int hstart = ph * STRIDE_H - PAD_H;
int wstart = pw * STRIDE_W - PAD_W;
int hend = min(hstart + KERNEL_H, height + PAD_H);
int wend = min(wstart + KERNEL_W, width + PAD_W);
int pool_size;
int index = get_global_id(0);
if (index >= nthreads)
return;
const int pw = index % pooled_width;
const int xx = index / pooled_width;
const int ph = xx % pooled_height;
const int ch = xx / pooled_height;
int hstart = ph * STRIDE_H - PAD_H;
int wstart = pw * STRIDE_W - PAD_W;
int hend = min(hstart + KERNEL_H, height + PAD_H);
int wend = min(wstart + KERNEL_W, width + PAD_W);
int pool_size;
#ifdef AVE_POOL_PADDING_AREA
pool_size = (hend - hstart) * (wend - wstart);
hstart = max(hstart, (int)0);
wstart = max(wstart, (int)0);
hend = min(hend, height);
wend = min(wend, width);
pool_size = (hend - hstart) * (wend - wstart);
hstart = max(hstart, (int)0);
wstart = max(wstart, (int)0);
hend = min(hend, height);
wend = min(wend, width);
#else
hstart = max(hstart, (int)0);
wstart = max(wstart, (int)0);
hend = min(hend, height);
wend = min(wend, width);
pool_size = (hend - hstart) * (wend - wstart);
hstart = max(hstart, (int)0);
wstart = max(wstart, (int)0);
hend = min(hend, height);
wend = min(wend, width);
pool_size = (hend - hstart) * (wend - wstart);
#endif
Dtype aveval = 0;
__global const Dtype* bottom_slice = bottom_data
+ (n * channels + c) * height * width;
for (int h = hstart; h < hend; ++h) {
for (int w = wstart; w < wend; ++w) {
aveval += bottom_slice[h * width + w];
}
}
top_data[index] = aveval / pool_size;
Dtype aveval = 0;
int in_offset = ch * height * width;
for (int h = hstart; h < hend; ++h)
{
for (int w = wstart; w < wend; ++w)
{
aveval += bottom_data[in_offset + h * width + w];
}
}
top_data[index] = aveval / pool_size;
}
#elif defined KERNEL_STO_POOL
......
......@@ -182,11 +182,9 @@ TEST_P(DNNTestNetwork, MobileNet_SSD_Caffe)
throw SkipTestException("");
Mat sample = imread(findDataFile("dnn/street.png", false));
Mat inp = blobFromImage(sample, 1.0f / 127.5, Size(300, 300), Scalar(127.5, 127.5, 127.5), false);
float l1 = (backend == DNN_BACKEND_OPENCV && target == DNN_TARGET_OPENCL_FP16) ? 0.0007 : 0.0;
float lInf = (backend == DNN_BACKEND_OPENCV && target == DNN_TARGET_OPENCL_FP16) ? 0.011 : 0.0;
float diffScores = (target == DNN_TARGET_OPENCL_FP16) ? 6e-3 : 0.0;
processNet("dnn/MobileNetSSD_deploy.caffemodel", "dnn/MobileNetSSD_deploy.prototxt",
inp, "detection_out", "", l1, lInf);
inp, "detection_out", "", diffScores);
}
TEST_P(DNNTestNetwork, MobileNet_SSD_v1_TensorFlow)
......
......@@ -157,7 +157,8 @@ static inline bool checkMyriadTarget()
net.addLayerToPrev("testLayer", "Identity", lp);
net.setPreferableBackend(cv::dnn::DNN_BACKEND_INFERENCE_ENGINE);
net.setPreferableTarget(cv::dnn::DNN_TARGET_MYRIAD);
net.setInput(cv::Mat::zeros(1, 1, CV_32FC1));
static int inpDims[] = {1, 2, 3, 4};
net.setInput(cv::Mat(4, &inpDims[0], CV_32FC1, cv::Scalar(0)));
try
{
net.forward();
......
......@@ -143,7 +143,7 @@ TEST_P(Test_Darknet_nets, YoloVoc)
classIds[0] = 6; confidences[0] = 0.750469f; boxes[0] = Rect2d(0.577374, 0.127391, 0.325575, 0.173418); // a car
classIds[1] = 1; confidences[1] = 0.780879f; boxes[1] = Rect2d(0.270762, 0.264102, 0.461713, 0.48131); // a bicycle
classIds[2] = 11; confidences[2] = 0.901615f; boxes[2] = Rect2d(0.1386, 0.338509, 0.282737, 0.60028); // a dog
double scoreDiff = (targetId == DNN_TARGET_OPENCL_FP16 || targetId == DNN_TARGET_MYRIAD) ? 7e-3 : 8e-5;
double scoreDiff = (targetId == DNN_TARGET_OPENCL_FP16 || targetId == DNN_TARGET_MYRIAD) ? 1e-2 : 8e-5;
double iouDiff = (targetId == DNN_TARGET_OPENCL_FP16 || targetId == DNN_TARGET_MYRIAD) ? 0.013 : 3e-5;
testDarknetModel("yolo-voc.cfg", "yolo-voc.weights", outNames,
classIds, confidences, boxes, backendId, targetId, scoreDiff, iouDiff);
......
......@@ -925,6 +925,10 @@ TEST(Layer_Test_Convolution_DLDT, Accuracy)
Mat out = net.forward();
normAssert(outDefault, out);
std::vector<int> outLayers = net.getUnconnectedOutLayers();
ASSERT_EQ(net.getLayer(outLayers[0])->name, "output_merge");
ASSERT_EQ(net.getLayer(outLayers[0])->type, "Concat");
}
// 1. Create a .prototxt file with the following network:
......@@ -1183,6 +1187,7 @@ TEST(Layer_Test_PoolingIndices, Accuracy)
}
}
}
net.setPreferableBackend(DNN_BACKEND_OPENCV);
net.setInput(blobFromImage(inp));
std::vector<Mat> outputs;
......
......@@ -127,6 +127,7 @@ TEST_P(Test_TensorFlow_layers, conv)
runTensorFlowNet("atrous_conv2d_same", targetId);
runTensorFlowNet("depthwise_conv2d", targetId);
runTensorFlowNet("keras_atrous_conv2d_same", targetId);
runTensorFlowNet("conv_pool_nchw", targetId);
}
TEST_P(Test_TensorFlow_layers, padding)
......@@ -142,9 +143,10 @@ TEST_P(Test_TensorFlow_layers, eltwise_add_mul)
runTensorFlowNet("eltwise_add_mul", GetParam());
}
TEST_P(Test_TensorFlow_layers, pad_and_concat)
TEST_P(Test_TensorFlow_layers, concat)
{
runTensorFlowNet("pad_and_concat", GetParam());
runTensorFlowNet("concat_axis_1", GetParam());
}
TEST_P(Test_TensorFlow_layers, batch_norm)
......@@ -440,4 +442,20 @@ TEST(Test_TensorFlow, resize_bilinear)
runTensorFlowNet("resize_bilinear_factor");
}
TEST(Test_TensorFlow, two_inputs)
{
Net net = readNet(path("two_inputs_net.pbtxt"));
net.setPreferableBackend(DNN_BACKEND_OPENCV);
Mat firstInput(2, 3, CV_32FC1), secondInput(2, 3, CV_32FC1);
randu(firstInput, -1, 1);
randu(secondInput, -1, 1);
net.setInput(firstInput, "first_input");
net.setInput(secondInput, "second_input");
Mat out = net.forward();
normAssert(out, firstInput + secondInput);
}
}
......@@ -175,8 +175,6 @@ bool SunRasterDecoder::readData( Mat& img )
AutoBuffer<uchar> _src(src_pitch + 32);
uchar* src = _src;
AutoBuffer<uchar> _bgr(m_width*3 + 32);
uchar* bgr = _bgr;
if( !color && m_maptype == RMT_EQUAL_RGB )
CvtPaletteToGray( m_palette, gray_palette, 1 << m_bpp );
......@@ -340,16 +338,18 @@ bad_decoding_end:
case 24:
for( y = 0; y < m_height; y++, data += step )
{
m_strm.getBytes( color ? data : bgr, src_pitch );
m_strm.getBytes(src, src_pitch );
if( color )
{
if( m_type == RAS_FORMAT_RGB )
icvCvt_RGB2BGR_8u_C3R( data, 0, data, 0, cvSize(m_width,1) );
icvCvt_RGB2BGR_8u_C3R(src, 0, data, 0, cvSize(m_width,1) );
else
memcpy(data, src, std::min(step, (size_t)src_pitch));
}
else
{
icvCvt_BGR2Gray_8u_C3C1R( bgr, 0, data, 0, cvSize(m_width,1),
icvCvt_BGR2Gray_8u_C3C1R(src, 0, data, 0, cvSize(m_width,1),
m_type == RAS_FORMAT_RGB ? 2 : 0 );
}
}
......
......@@ -670,6 +670,14 @@ public:
void groupRectangles(std::vector<cv::Rect>& rectList, std::vector<double>& weights, int groupThreshold, double eps) const;
};
/** @brief Detects a QR code in an image and returns the minimum-area quadrangle that describes it.
@param in Matrix of type CV_8UC1 containing the image in which the QR code is detected.
@param points Output vector of vertices of the minimal-area quadrangle that describes the QR code.
@param eps_x Epsilon neighborhood, which allows you to determine the horizontal pattern of the 1:1:3:1:1 scheme according to the QR code standard.
@param eps_y Epsilon neighborhood, which allows you to determine the vertical pattern of the 1:1:3:1:1 scheme according to the QR code standard.
*/
CV_EXPORTS bool detectQRCode(InputArray in, std::vector<Point> &points, double eps_x = 0.2, double eps_y = 0.1);
//! @} objdetect
}
......
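A short usage sketch for the new detectQRCode function (hedged: the file names and the drawing step are illustrative; only the detectQRCode call follows the declaration above):
// Sketch: find the QR code quadrangle in a grayscale image and outline it.
#include <opencv2/objdetect.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

int main()
{
    cv::Mat gray = cv::imread("qrcode.png", cv::IMREAD_GRAYSCALE); // CV_8UC1 input
    if (gray.empty())
        return -1;

    std::vector<cv::Point> corners;
    if (cv::detectQRCode(gray, corners)) // default eps_x = 0.2, eps_y = 0.1
    {
        cv::Mat vis;
        cv::cvtColor(gray, vis, cv::COLOR_GRAY2BGR);
        std::vector<std::vector<cv::Point> > quad(1, corners);
        cv::polylines(vis, quad, true, cv::Scalar(0, 255, 0), 2); // draw the quadrangle
        cv::imwrite("qrcode_detected.png", vis);
    }
    return 0;
}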
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// Intel License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2000, Intel Corporation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of Intel Corporation may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "test_precomp.hpp"
namespace opencv_test { namespace {
TEST(Objdetect_QRCode, regression)
{
String root = cvtest::TS::ptr()->get_data_path() + "qrcode/";
// String cascades[] =
// {
// root + "haarcascade_frontalface_alt.xml",
// root + "lbpcascade_frontalface.xml",
// String()
// };
// vector<Rect> objects;
// RNG rng((uint64)-1);
// for( int i = 0; !cascades[i].empty(); i++ )
// {
// printf("%d. %s\n", i, cascades[i].c_str());
// CascadeClassifier cascade(cascades[i]);
// for( int j = 0; j < 100; j++ )
// {
// int width = rng.uniform(1, 100);
// int height = rng.uniform(1, 100);
// Mat img(height, width, CV_8U);
// randu(img, 0, 256);
// cascade.detectMultiScale(img, objects);
// }
// }
}
}} // namespace
......@@ -250,7 +250,9 @@ when fullAffine=false.
@sa
estimateAffine2D, estimateAffinePartial2D, getAffineTransform, getPerspectiveTransform, findHomography
*/
CV_EXPORTS_W Mat estimateRigidTransform( InputArray src, InputArray dst, bool fullAffine );
CV_EXPORTS_W Mat estimateRigidTransform( InputArray src, InputArray dst, bool fullAffine);
CV_EXPORTS_W Mat estimateRigidTransform( InputArray src, InputArray dst, bool fullAffine, int ransacMaxIters, double ransacGoodRatio,
int ransacSize0);
enum
......
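A brief sketch of calling the new estimateRigidTransform overload with explicit RANSAC parameters (hedged: the chosen values are only examples, and it assumes the opencv2/video/tracking.hpp header that declares the function in 3.4):
// Sketch: estimate a partial 2D affine transform (rotation + uniform scale +
// translation) between two point sets with custom RANSAC settings.
#include <opencv2/video/tracking.hpp>
#include <vector>

cv::Mat fitRigidTransform(const std::vector<cv::Point2f>& src,
                          const std::vector<cv::Point2f>& dst)
{
    // Up to 1000 RANSAC iterations, stop once 60% of the points agree,
    // and draw 3-point minimal samples (ransacSize0 must be >= 3).
    return cv::estimateRigidTransform(src, dst, /*fullAffine=*/false,
                                      /*ransacMaxIters=*/1000,
                                      /*ransacGoodRatio=*/0.6,
                                      /*ransacSize0=*/3);
}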
......@@ -1402,7 +1402,7 @@ namespace cv
{
static void
getRTMatrix( const Point2f* a, const Point2f* b,
getRTMatrix( const std::vector<Point2f> a, const std::vector<Point2f> b,
int count, Mat& M, bool fullAffine )
{
CV_Assert( M.isContinuous() );
......@@ -1478,6 +1478,12 @@ getRTMatrix( const Point2f* a, const Point2f* b,
}
cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullAffine )
{
return estimateRigidTransform(src1, src2, fullAffine, 500, 0.5, 3);
}
cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullAffine, int ransacMaxIters, double ransacGoodRatio,
const int ransacSize0)
{
CV_INSTRUMENT_REGION()
......@@ -1485,9 +1491,6 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
const int COUNT = 15;
const int WIDTH = 160, HEIGHT = 120;
const int RANSAC_MAX_ITERS = 500;
const int RANSAC_SIZE0 = 3;
const double RANSAC_GOOD_RATIO = 0.5;
std::vector<Point2f> pA, pB;
std::vector<int> good_idx;
......@@ -1499,6 +1502,12 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
RNG rng((uint64)-1);
int good_count = 0;
if( ransacSize0 < 3 )
CV_Error( Error::StsBadArg, "ransacSize0 should have value bigger than 2.");
if( ransacGoodRatio > 1 || ransacGoodRatio < 0)
CV_Error( Error::StsBadArg, "ransacGoodRatio should have value between 0 and 1");
if( A.size() != B.size() )
CV_Error( Error::StsUnmatchedSizes, "Both input images must have the same size" );
......@@ -1587,23 +1596,23 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
good_idx.resize(count);
if( count < RANSAC_SIZE0 )
if( count < ransacSize0 )
return Mat();
Rect brect = boundingRect(pB);
std::vector<Point2f> a(ransacSize0);
std::vector<Point2f> b(ransacSize0);
// RANSAC stuff:
// 1. find the consensus
for( k = 0; k < RANSAC_MAX_ITERS; k++ )
for( k = 0; k < ransacMaxIters; k++ )
{
int idx[RANSAC_SIZE0];
Point2f a[RANSAC_SIZE0];
Point2f b[RANSAC_SIZE0];
// choose random 3 non-coplanar points from A & B
for( i = 0; i < RANSAC_SIZE0; i++ )
std::vector<int> idx(ransacSize0);
// choose random 3 non-coplanar points from A & B
for( i = 0; i < ransacSize0; i++ )
{
for( k1 = 0; k1 < RANSAC_MAX_ITERS; k1++ )
for( k1 = 0; k1 < ransacMaxIters; k1++ )
{
idx[i] = rng.uniform(0, count);
......@@ -1623,7 +1632,7 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
if( j < i )
continue;
if( i+1 == RANSAC_SIZE0 )
if( i+1 == ransacSize0 )
{
// additional check for non-coplanar vectors
a[0] = pA[idx[0]];
......@@ -1647,11 +1656,11 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
break;
}
if( k1 >= RANSAC_MAX_ITERS )
if( k1 >= ransacMaxIters )
break;
}
if( i < RANSAC_SIZE0 )
if( i < ransacSize0 )
continue;
// estimate the transformation using 3 points
......@@ -1665,11 +1674,11 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
good_idx[good_count++] = i;
}
if( good_count >= count*RANSAC_GOOD_RATIO )
if( good_count >= count*ransacGoodRatio )
break;
}
if( k >= RANSAC_MAX_ITERS )
if( k >= ransacMaxIters )
return Mat();
if( good_count < count )
......@@ -1682,7 +1691,7 @@ cv::Mat cv::estimateRigidTransform( InputArray src1, InputArray src2, bool fullA
}
}
getRTMatrix( &pA[0], &pB[0], good_count, M, fullAffine );
getRTMatrix( pA, pB, good_count, M, fullAffine );
M.at<double>(0, 2) /= scale;
M.at<double>(1, 2) /= scale;
......
/**
* @function Watershed_and_Distance_Transform.cpp
* @brief Sample code showing how to segment overlapping objects using Laplacian filtering, in addition to Watershed and Distance Transformation
* @author OpenCV Team
*/
......@@ -12,43 +11,47 @@
using namespace std;
using namespace cv;
int main()
int main(int argc, char *argv[])
{
//! [load_image]
//! [load_image]
// Load the image
Mat src = imread("../data/cards.png");
// Check if everything was fine
if (!src.data)
CommandLineParser parser( argc, argv, "{@input | ../data/cards.png | input image}" );
Mat src = imread( parser.get<String>( "@input" ) );
if( src.empty() )
{
cout << "Could not open or find the image!\n" << endl;
cout << "Usage: " << argv[0] << " <Input image>" << endl;
return -1;
}
// Show source image
imshow("Source Image", src);
//! [load_image]
//! [load_image]
//! [black_bg]
//! [black_bg]
// Change the background from white to black, since that will help later to extract
// better results during the use of Distance Transform
for( int x = 0; x < src.rows; x++ ) {
for( int y = 0; y < src.cols; y++ ) {
if ( src.at<Vec3b>(x, y) == Vec3b(255,255,255) ) {
src.at<Vec3b>(x, y)[0] = 0;
src.at<Vec3b>(x, y)[1] = 0;
src.at<Vec3b>(x, y)[2] = 0;
}
for ( int i = 0; i < src.rows; i++ ) {
for ( int j = 0; j < src.cols; j++ ) {
if ( src.at<Vec3b>(i, j) == Vec3b(255,255,255) )
{
src.at<Vec3b>(i, j)[0] = 0;
src.at<Vec3b>(i, j)[1] = 0;
src.at<Vec3b>(i, j)[2] = 0;
}
}
}
// Show output image
imshow("Black Background Image", src);
//! [black_bg]
//! [black_bg]
//! [sharp]
// Create a kernel that we will use for accuting/sharpening our image
//! [sharp]
// Create a kernel that we will use to sharpen our image
Mat kernel = (Mat_<float>(3,3) <<
1, 1, 1,
1, -8, 1,
1, 1, 1); // an approximation of second derivative, a quite strong kernel
1, 1, 1,
1, -8, 1,
1, 1, 1); // an approximation of second derivative, a quite strong kernel
// do the laplacian filtering as it is
// well, we need to convert everything into something deeper than CV_8U
......@@ -57,8 +60,8 @@ int main()
// BUT a 8bits unsigned int (the one we are working with) can contain values from 0 to 255
// so the possible negative number will be truncated
Mat imgLaplacian;
Mat sharp = src; // copy source image to another temporary one
filter2D(sharp, imgLaplacian, CV_32F, kernel);
filter2D(src, imgLaplacian, CV_32F, kernel);
Mat sharp;
src.convertTo(sharp, CV_32F);
Mat imgResult = sharp - imgLaplacian;
......@@ -68,41 +71,39 @@ int main()
// imshow( "Laplace Filtered Image", imgLaplacian );
imshow( "New Sharped Image", imgResult );
//! [sharp]
//! [sharp]
src = imgResult; // copy back
//! [bin]
//! [bin]
// Create binary image from source image
Mat bw;
cvtColor(src, bw, COLOR_BGR2GRAY);
cvtColor(imgResult, bw, COLOR_BGR2GRAY);
threshold(bw, bw, 40, 255, THRESH_BINARY | THRESH_OTSU);
imshow("Binary Image", bw);
//! [bin]
//! [bin]
//! [dist]
//! [dist]
// Perform the distance transform algorithm
Mat dist;
distanceTransform(bw, dist, DIST_L2, 3);
// Normalize the distance image for range = {0.0, 1.0}
// so we can visualize and threshold it
normalize(dist, dist, 0, 1., NORM_MINMAX);
normalize(dist, dist, 0, 1.0, NORM_MINMAX);
imshow("Distance Transform Image", dist);
//! [dist]
//! [dist]
//! [peaks]
//! [peaks]
// Threshold to obtain the peaks
// This will be the markers for the foreground objects
threshold(dist, dist, .4, 1., THRESH_BINARY);
threshold(dist, dist, 0.4, 1.0, THRESH_BINARY);
// Dilate a bit the dist image
Mat kernel1 = Mat::ones(3, 3, CV_8UC1);
Mat kernel1 = Mat::ones(3, 3, CV_8U);
dilate(dist, dist, kernel1);
imshow("Peaks", dist);
//! [peaks]
//! [peaks]
//! [seeds]
//! [seeds]
// Create the CV_8U version of the distance image
// It is needed for findContours()
Mat dist_8u;
......@@ -113,34 +114,36 @@ int main()
findContours(dist_8u, contours, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE);
// Create the marker image for the watershed algorithm
Mat markers = Mat::zeros(dist.size(), CV_32SC1);
Mat markers = Mat::zeros(dist.size(), CV_32S);
// Draw the foreground markers
for (size_t i = 0; i < contours.size(); i++)
drawContours(markers, contours, static_cast<int>(i), Scalar::all(static_cast<int>(i)+1), -1);
{
drawContours(markers, contours, static_cast<int>(i), Scalar(static_cast<int>(i)+1), -1);
}
// Draw the background marker
circle(markers, Point(5,5), 3, CV_RGB(255,255,255), -1);
circle(markers, Point(5,5), 3, Scalar(255), -1);
imshow("Markers", markers*10000);
//! [seeds]
//! [seeds]
//! [watershed]
//! [watershed]
// Perform the watershed algorithm
watershed(src, markers);
watershed(imgResult, markers);
Mat mark = Mat::zeros(markers.size(), CV_8UC1);
markers.convertTo(mark, CV_8UC1);
Mat mark;
markers.convertTo(mark, CV_8U);
bitwise_not(mark, mark);
// imshow("Markers_v2", mark); // uncomment this if you want to see how the mark
// image looks like at that point
// imshow("Markers_v2", mark); // uncomment this if you want to see how the mark
// image looks like at that point
// Generate random colors
vector<Vec3b> colors;
for (size_t i = 0; i < contours.size(); i++)
{
int b = theRNG().uniform(0, 255);
int g = theRNG().uniform(0, 255);
int r = theRNG().uniform(0, 255);
int b = theRNG().uniform(0, 256);
int g = theRNG().uniform(0, 256);
int r = theRNG().uniform(0, 256);
colors.push_back(Vec3b((uchar)b, (uchar)g, (uchar)r));
}
......@@ -155,16 +158,16 @@ int main()
{
int index = markers.at<int>(i,j);
if (index > 0 && index <= static_cast<int>(contours.size()))
{
dst.at<Vec3b>(i,j) = colors[index-1];
else
dst.at<Vec3b>(i,j) = Vec3b(0,0,0);
}
}
}
// Visualize the final image
imshow("Final Result", dst);
//! [watershed]
//! [watershed]
waitKey(0);
waitKey();
return 0;
}
......@@ -22,6 +22,7 @@ const char* keys =
"{ height | -1 | Preprocess input image by resizing to a specific height. }"
"{ rgb | | Indicate that model works with RGB input images instead BGR ones. }"
"{ thr | .5 | Confidence threshold. }"
"{ thr | .4 | Non-maximum suppression threshold. }"
"{ backend | 0 | Choose one of computation backends: "
"0: automatically (by default), "
"1: Halide language (http://halide-lang.org/), "
......@@ -37,7 +38,7 @@ const char* keys =
using namespace cv;
using namespace dnn;
float confThreshold;
float confThreshold, nmsThreshold;
std::vector<std::string> classes;
void postprocess(Mat& frame, const std::vector<Mat>& out, Net& net);
......@@ -59,6 +60,7 @@ int main(int argc, char** argv)
}
confThreshold = parser.get<float>("thr");
nmsThreshold = parser.get<float>("nms");
float scale = parser.get<float>("scale");
Scalar mean = parser.get<Scalar>("mean");
bool swapRB = parser.get<bool>("rgb");
......@@ -144,6 +146,9 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
static std::vector<int> outLayers = net.getUnconnectedOutLayers();
static std::string outLayerType = net.getLayer(outLayers[0])->type;
std::vector<int> classIds;
std::vector<float> confidences;
std::vector<Rect> boxes;
if (net.getLayer(0)->outputNameToIndex("im_info") != -1) // Faster-RCNN or R-FCN
{
// Network produces output blob with a shape 1x1xNx7 where N is a number of
......@@ -160,8 +165,11 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
int top = (int)data[i + 4];
int right = (int)data[i + 5];
int bottom = (int)data[i + 6];
int classId = (int)(data[i + 1]) - 1; // Skip 0th background class id.
drawPred(classId, confidence, left, top, right, bottom, frame);
int width = right - left + 1;
int height = bottom - top + 1;
classIds.push_back((int)(data[i + 1]) - 1); // Skip 0th background class id.
boxes.push_back(Rect(left, top, width, height));
confidences.push_back(confidence);
}
}
}
......@@ -181,16 +189,16 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
int top = (int)(data[i + 4] * frame.rows);
int right = (int)(data[i + 5] * frame.cols);
int bottom = (int)(data[i + 6] * frame.rows);
int classId = (int)(data[i + 1]) - 1; // Skip 0th background class id.
drawPred(classId, confidence, left, top, right, bottom, frame);
int width = right - left + 1;
int height = bottom - top + 1;
classIds.push_back((int)(data[i + 1]) - 1); // Skip 0th background class id.
boxes.push_back(Rect(left, top, width, height));
confidences.push_back(confidence);
}
}
}
else if (outLayerType == "Region")
{
std::vector<int> classIds;
std::vector<float> confidences;
std::vector<Rect> boxes;
for (size_t i = 0; i < outs.size(); ++i)
{
// Network produces output blob with a shape NxC where N is a number of
......@@ -218,18 +226,19 @@ void postprocess(Mat& frame, const std::vector<Mat>& outs, Net& net)
}
}
}
std::vector<int> indices;
NMSBoxes(boxes, confidences, confThreshold, 0.4f, indices);
for (size_t i = 0; i < indices.size(); ++i)
{
int idx = indices[i];
Rect box = boxes[idx];
drawPred(classIds[idx], confidences[idx], box.x, box.y,
box.x + box.width, box.y + box.height, frame);
}
}
else
CV_Error(Error::StsNotImplemented, "Unknown output layer type: " + outLayerType);
std::vector<int> indices;
NMSBoxes(boxes, confidences, confThreshold, nmsThreshold, indices);
for (size_t i = 0; i < indices.size(); ++i)
{
int idx = indices[i];
Rect box = boxes[idx];
drawPred(classIds[idx], confidences[idx], box.x, box.y,
box.x + box.width, box.y + box.height, frame);
}
}
void drawPred(int classId, float conf, int left, int top, int right, int bottom, Mat& frame)
......
......@@ -31,6 +31,7 @@ parser.add_argument('--height', type=int,
parser.add_argument('--rgb', action='store_true',
help='Indicate that model works with RGB input images instead BGR ones.')
parser.add_argument('--thr', type=float, default=0.5, help='Confidence threshold')
parser.add_argument('--nms', type=float, default=0.4, help='Non-maximum suppression threshold')
parser.add_argument('--backend', choices=backends, default=cv.dnn.DNN_BACKEND_DEFAULT, type=int,
help="Choose one of computation backends: "
"%d: automatically (by default), "
......@@ -57,6 +58,7 @@ net.setPreferableBackend(args.backend)
net.setPreferableTarget(args.target)
confThreshold = args.thr
nmsThreshold = args.nms
def getOutputsNames(net):
layersNames = net.getLayerNames()
......@@ -86,36 +88,43 @@ def postprocess(frame, outs):
lastLayerId = net.getLayerId(layerNames[-1])
lastLayer = net.getLayer(lastLayerId)
classIds = []
confidences = []
boxes = []
if net.getLayer(0).outputNameToIndex('im_info') != -1: # Faster-RCNN or R-FCN
# Network produces output blob with a shape 1x1xNx7 where N is a number of
# detections and an every detection is a vector of values
# [batchId, classId, confidence, left, top, right, bottom]
assert(len(outs) == 1)
out = outs[0]
for detection in out[0, 0]:
confidence = detection[2]
if confidence > confThreshold:
left = int(detection[3])
top = int(detection[4])
right = int(detection[5])
bottom = int(detection[6])
classId = int(detection[1]) - 1 # Skip background label
drawPred(classId, confidence, left, top, right, bottom)
for out in outs:
for detection in out[0, 0]:
confidence = detection[2]
if confidence > confThreshold:
left = int(detection[3])
top = int(detection[4])
right = int(detection[5])
bottom = int(detection[6])
width = right - left + 1
height = bottom - top + 1
classIds.append(int(detection[1]) - 1) # Skip background label
confidences.append(float(confidence))
boxes.append([left, top, width, height])
elif lastLayer.type == 'DetectionOutput':
# Network produces output blob with a shape 1x1xNx7 where N is a number of
# detections and an every detection is a vector of values
# [batchId, classId, confidence, left, top, right, bottom]
assert(len(outs) == 1)
out = outs[0]
for detection in out[0, 0]:
confidence = detection[2]
if confidence > confThreshold:
left = int(detection[3] * frameWidth)
top = int(detection[4] * frameHeight)
right = int(detection[5] * frameWidth)
bottom = int(detection[6] * frameHeight)
classId = int(detection[1]) - 1 # Skip background label
drawPred(classId, confidence, left, top, right, bottom)
for out in outs:
for detection in out[0, 0]:
confidence = detection[2]
if confidence > confThreshold:
left = int(detection[3] * frameWidth)
top = int(detection[4] * frameHeight)
right = int(detection[5] * frameWidth)
bottom = int(detection[6] * frameHeight)
width = right - left + 1
height = bottom - top + 1
classIds.append(int(detection[1]) - 1) # Skip background label
confidences.append(float(confidence))
boxes.append([left, top, width, height])
elif lastLayer.type == 'Region':
# Network produces output blob with a shape NxC where N is a number of
# detected objects and C is a number of classes + 4 where the first 4
......@@ -138,15 +147,19 @@ def postprocess(frame, outs):
classIds.append(classId)
confidences.append(float(confidence))
boxes.append([left, top, width, height])
indices = cv.dnn.NMSBoxes(boxes, confidences, confThreshold, 0.4)
for i in indices:
i = i[0]
box = boxes[i]
left = box[0]
top = box[1]
width = box[2]
height = box[3]
drawPred(classIds[i], confidences[i], left, top, left + width, top + height)
else:
print('Unknown output layer type: ' + lastLayer.type)
exit()
indices = cv.dnn.NMSBoxes(boxes, confidences, confThreshold, nmsThreshold)
for i in indices:
i = i[0]
box = boxes[i]
left = box[0]
top = box[1]
width = box[2]
height = box[3]
drawPred(classIds[i], confidences[i], left, top, left + width, top + height)
# Process inputs
winName = 'Deep learning object detection in OpenCV'
......
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Point;
import org.opencv.core.Scalar;
import org.opencv.highgui.HighGui;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;
/**
*
* @brief Sample code showing how to segment overlapping objects using Laplacian filtering, in addition to Watershed
* and Distance Transformation
*
*/
class ImageSegmentation {
public void run(String[] args) {
//! [load_image]
// Load the image
String filename = args.length > 0 ? args[0] : "../data/cards.png";
Mat srcOriginal = Imgcodecs.imread(filename);
if (srcOriginal.empty()) {
System.err.println("Cannot read image: " + filename);
System.exit(0);
}
// Show source image
HighGui.imshow("Source Image", srcOriginal);
//! [load_image]
//! [black_bg]
// Change the background from white to black, since that will help later to
// extract
// better results during the use of Distance Transform
Mat src = srcOriginal.clone();
byte[] srcData = new byte[(int) (src.total() * src.channels())];
src.get(0, 0, srcData);
for (int i = 0; i < src.rows(); i++) {
for (int j = 0; j < src.cols(); j++) {
if (srcData[(i * src.cols() + j) * 3] == (byte) 255 && srcData[(i * src.cols() + j) * 3 + 1] == (byte) 255
&& srcData[(i * src.cols() + j) * 3 + 2] == (byte) 255) {
srcData[(i * src.cols() + j) * 3] = 0;
srcData[(i * src.cols() + j) * 3 + 1] = 0;
srcData[(i * src.cols() + j) * 3 + 2] = 0;
}
}
}
src.put(0, 0, srcData);
// Show output image
HighGui.imshow("Black Background Image", src);
//! [black_bg]
//! [sharp]
// Create a kernel that we will use to sharpen our image
Mat kernel = new Mat(3, 3, CvType.CV_32F);
// an approximation of second derivative, a quite strong kernel
float[] kernelData = new float[(int) (kernel.total() * kernel.channels())];
kernelData[0] = 1; kernelData[1] = 1; kernelData[2] = 1;
kernelData[3] = 1; kernelData[4] = -8; kernelData[5] = 1;
kernelData[6] = 1; kernelData[7] = 1; kernelData[8] = 1;
kernel.put(0, 0, kernelData);
// do the laplacian filtering as it is
// well, we need to convert everything into something deeper than CV_8U
// because the kernel has some negative values,
// and we can expect in general to have a Laplacian image with negative values
// BUT a 8bits unsigned int (the one we are working with) can contain values
// from 0 to 255
// so the possible negative number will be truncated
Mat imgLaplacian = new Mat();
Imgproc.filter2D(src, imgLaplacian, CvType.CV_32F, kernel);
Mat sharp = new Mat();
src.convertTo(sharp, CvType.CV_32F);
Mat imgResult = new Mat();
Core.subtract(sharp, imgLaplacian, imgResult);
// convert back to 8bits gray scale
imgResult.convertTo(imgResult, CvType.CV_8UC3);
imgLaplacian.convertTo(imgLaplacian, CvType.CV_8UC3);
// imshow( "Laplace Filtered Image", imgLaplacian );
HighGui.imshow("New Sharped Image", imgResult);
//! [sharp]
//! [bin]
// Create binary image from source image
Mat bw = new Mat();
Imgproc.cvtColor(imgResult, bw, Imgproc.COLOR_BGR2GRAY);
Imgproc.threshold(bw, bw, 40, 255, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU);
HighGui.imshow("Binary Image", bw);
//! [bin]
//! [dist]
// Perform the distance transform algorithm
Mat dist = new Mat();
Imgproc.distanceTransform(bw, dist, Imgproc.DIST_L2, 3);
// Normalize the distance image for range = {0.0, 1.0}
// so we can visualize and threshold it
Core.normalize(dist, dist, 0, 1., Core.NORM_MINMAX);
Mat distDisplayScaled = dist.mul(dist, 255);
Mat distDisplay = new Mat();
distDisplayScaled.convertTo(distDisplay, CvType.CV_8U);
HighGui.imshow("Distance Transform Image", distDisplay);
//! [dist]
//! [peaks]
// Threshold to obtain the peaks
// This will be the markers for the foreground objects
Imgproc.threshold(dist, dist, .4, 1., Imgproc.THRESH_BINARY);
// Dilate a bit the dist image
Mat kernel1 = Mat.ones(3, 3, CvType.CV_8U);
Imgproc.dilate(dist, dist, kernel1);
Mat distDisplay2 = new Mat();
dist.convertTo(distDisplay2, CvType.CV_8U);
distDisplay2 = distDisplay2.mul(distDisplay2, 255);
HighGui.imshow("Peaks", distDisplay2);
//! [peaks]
//! [seeds]
// Create the CV_8U version of the distance image
// It is needed for findContours()
Mat dist_8u = new Mat();
dist.convertTo(dist_8u, CvType.CV_8U);
// Find total markers
List<MatOfPoint> contours = new ArrayList<>();
Mat hierarchy = new Mat();
Imgproc.findContours(dist_8u, contours, hierarchy, Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
// Create the marker image for the watershed algorithm
Mat markers = Mat.zeros(dist.size(), CvType.CV_32S);
// Draw the foreground markers
for (int i = 0; i < contours.size(); i++) {
Imgproc.drawContours(markers, contours, i, new Scalar(i + 1), -1);
}
// Draw the background marker
Imgproc.circle(markers, new Point(5, 5), 3, new Scalar(255, 255, 255), -1);
Mat markersScaled = markers.mul(markers, 10000);
Mat markersDisplay = new Mat();
markersScaled.convertTo(markersDisplay, CvType.CV_8U);
HighGui.imshow("Markers", markersDisplay);
//! [seeds]
//! [watershed]
// Perform the watershed algorithm
Imgproc.watershed(imgResult, markers);
Mat mark = Mat.zeros(markers.size(), CvType.CV_8U);
markers.convertTo(mark, CvType.CV_8UC1);
Core.bitwise_not(mark, mark);
// imshow("Markers_v2", mark); // uncomment this if you want to see how the mark
// image looks like at that point
// Generate random colors
Random rng = new Random(12345);
List<Scalar> colors = new ArrayList<>(contours.size());
for (int i = 0; i < contours.size(); i++) {
int b = rng.nextInt(256);
int g = rng.nextInt(256);
int r = rng.nextInt(256);
colors.add(new Scalar(b, g, r));
}
// Create the result image
Mat dst = Mat.zeros(markers.size(), CvType.CV_8UC3);
byte[] dstData = new byte[(int) (dst.total() * dst.channels())];
dst.get(0, 0, dstData);
// Fill labeled objects with random colors
int[] markersData = new int[(int) (markers.total() * markers.channels())];
markers.get(0, 0, markersData);
for (int i = 0; i < markers.rows(); i++) {
for (int j = 0; j < markers.cols(); j++) {
int index = markersData[i * markers.cols() + j];
if (index > 0 && index <= contours.size()) {
dstData[(i * dst.cols() + j) * 3 + 0] = (byte) colors.get(index - 1).val[0];
dstData[(i * dst.cols() + j) * 3 + 1] = (byte) colors.get(index - 1).val[1];
dstData[(i * dst.cols() + j) * 3 + 2] = (byte) colors.get(index - 1).val[2];
} else {
dstData[(i * dst.cols() + j) * 3 + 0] = 0;
dstData[(i * dst.cols() + j) * 3 + 1] = 0;
dstData[(i * dst.cols() + j) * 3 + 2] = 0;
}
}
}
dst.put(0, 0, dstData);
// Visualize the final image
HighGui.imshow("Final Result", dst);
//! [watershed]
HighGui.waitKey();
System.exit(0);
}
}
public class ImageSegmentationDemo {
public static void main(String[] args) {
// Load the native OpenCV library
System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
new ImageSegmentation().run(args);
}
}
from __future__ import print_function
import cv2 as cv
import numpy as np
import argparse
import random as rng
rng.seed(12345)
## [load_image]
# Load the image
parser = argparse.ArgumentParser(description='Code for Image Segmentation with Distance Transform and Watershed Algorithm.\
Sample code showing how to segment overlapping objects using Laplacian filtering, \
in addition to Watershed and Distance Transformation')
parser.add_argument('--input', help='Path to input image.', default='../data/cards.png')
args = parser.parse_args()
src = cv.imread(args.input)
if src is None:
print('Could not open or find the image:', args.input)
exit(0)
# Show source image
cv.imshow('Source Image', src)
## [load_image]
## [black_bg]
# Change the background from white to black, since that will help later to extract
# better results during the use of Distance Transform
src[np.all(src == 255, axis=2)] = 0
# Show output image
cv.imshow('Black Background Image', src)
## [black_bg]
## [sharp]
# Create a kernel that we will use to sharpen our image
# an approximation of second derivative, a quite strong kernel
kernel = np.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]], dtype=np.float32)
# do the laplacian filtering as it is
# well, we need to convert everything into something deeper than CV_8U
# because the kernel has some negative values,
# and we can expect in general to have a Laplacian image with negative values
# BUT a 8bits unsigned int (the one we are working with) can contain values from 0 to 255
# so the possible negative number will be truncated
imgLaplacian = cv.filter2D(src, cv.CV_32F, kernel)
sharp = np.float32(src)
imgResult = sharp - imgLaplacian
# convert back to 8bits gray scale
imgResult = np.clip(imgResult, 0, 255)
imgResult = imgResult.astype('uint8')
imgLaplacian = np.clip(imgLaplacian, 0, 255)
imgLaplacian = np.uint8(imgLaplacian)
#cv.imshow('Laplace Filtered Image', imgLaplacian)
cv.imshow('New Sharped Image', imgResult)
## [sharp]
## [bin]
# Create binary image from source image
bw = cv.cvtColor(imgResult, cv.COLOR_BGR2GRAY)
_, bw = cv.threshold(bw, 40, 255, cv.THRESH_BINARY | cv.THRESH_OTSU)
cv.imshow('Binary Image', bw)
## [bin]
## [dist]
# Perform the distance transform algorithm
dist = cv.distanceTransform(bw, cv.DIST_L2, 3)
# Normalize the distance image for range = {0.0, 1.0}
# so we can visualize and threshold it
cv.normalize(dist, dist, 0, 1.0, cv.NORM_MINMAX)
cv.imshow('Distance Transform Image', dist)
## [dist]
## [peaks]
# Threshold to obtain the peaks
# This will be the markers for the foreground objects
_, dist = cv.threshold(dist, 0.4, 1.0, cv.THRESH_BINARY)
# Dilate a bit the dist image
kernel1 = np.ones((3,3), dtype=np.uint8)
dist = cv.dilate(dist, kernel1)
cv.imshow('Peaks', dist)
## [peaks]
## [seeds]
# Create the CV_8U version of the distance image
# It is needed for findContours()
dist_8u = dist.astype('uint8')
# Find total markers
_, contours, _ = cv.findContours(dist_8u, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
# Create the marker image for the watershed algorithm
markers = np.zeros(dist.shape, dtype=np.int32)
# Draw the foreground markers
for i in range(len(contours)):
cv.drawContours(markers, contours, i, (i+1), -1)
# Draw the background marker
cv.circle(markers, (5,5), 3, (255,255,255), -1)
cv.imshow('Markers', markers*10000)
## [seeds]
## [watershed]
# Perform the watershed algorithm
cv.watershed(imgResult, markers)
#mark = np.zeros(markers.shape, dtype=np.uint8)
mark = markers.astype('uint8')
mark = cv.bitwise_not(mark)
# uncomment this if you want to see how the mark
# image looks like at that point
#cv.imshow('Markers_v2', mark)
# Generate random colors
colors = []
for contour in contours:
colors.append((rng.randint(0,256), rng.randint(0,256), rng.randint(0,256)))
# Create the result image
dst = np.zeros((markers.shape[0], markers.shape[1], 3), dtype=np.uint8)
# Fill labeled objects with random colors
for i in range(markers.shape[0]):
for j in range(markers.shape[1]):
index = markers[i,j]
if index > 0 and index <= len(contours):
dst[i,j,:] = colors[index-1]
# Visualize the final image
cv.imshow('Final Result', dst)
## [watershed]
cv.waitKey()
......@@ -28,10 +28,9 @@ knn_matches = matcher.knnMatch(descriptors1, descriptors2, 2)
#-- Filter matches using the Lowe's ratio test
ratio_thresh = 0.7
good_matches = []
for matches in knn_matches:
if len(matches) > 1:
if matches[0].distance / matches[1].distance <= ratio_thresh:
good_matches.append(matches[0])
for m,n in knn_matches:
if m.distance / n.distance <= ratio_thresh:
good_matches.append(m)
#-- Draw matches
img_matches = np.empty((max(img1.shape[0], img2.shape[0]), img1.shape[1]+img2.shape[1], 3), dtype=np.uint8)
......
......@@ -28,10 +28,9 @@ knn_matches = matcher.knnMatch(descriptors_obj, descriptors_scene, 2)
#-- Filter matches using the Lowe's ratio test
ratio_thresh = 0.75
good_matches = []
for matches in knn_matches:
if len(matches) > 1:
if matches[0].distance / matches[1].distance <= ratio_thresh:
good_matches.append(matches[0])
for m,n in knn_matches:
if m.distance / n.distance <= ratio_thresh:
good_matches.append(m)
#-- Draw matches
img_matches = np.empty((max(img_object.shape[0], img_scene.shape[0]), img_object.shape[1]+img_scene.shape[1], 3), dtype=np.uint8)
......