Unverified commit 88c4ed01, authored by sunitanyk and committed by GitHub

Computer Vision based Alpha Matting (#2306)

* Computer Vision based Alpha Matting Code

alpha matting code

This is a combination of 3 commits.

removed whitespaces

addressed issues raised in the PR

removed whitespaces

* removed global variable

* incorporated changes suggested by second round of review

* updated build instructions

* changed to OutputArray

* removed whitespaces

* alphamat: fix bugs triggered by assertions of Debug builds

* alphamat: fix documentation

* alphamat: coding style fixes

- get rid of std::cout
- remove clock_t
- drop unnecessary cast: float pix = tmap.at<uchar>(i, j);
- global 'dim' => 'ALPHAMAT_DIM'
- fix sample command line handling

* alphamat: apply clang-format

* clang-format fixups
parent bf0075a5
$ cmake -D OPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -D BUILD_opencv_<r
- **aruco**: ArUco and ChArUco Markers -- Augmented reality ArUco markers and ChArUco markers, where ArUco markers are embedded inside the white squares of a checkerboard.
- **alphamat**: Computer Vision based Alpha Matting -- Given an input image and a trimap, generate an alpha matte.
- **bgsegm**: Background segmentation algorithm combining statistical background image estimation and per-pixel Bayesian segmentation.
- **bioinspired**: Biological Vision -- Biologically inspired vision model: minimize noise and luminance variance, transient event segmentation, high dynamic range tone mapping methods.
if(NOT HAVE_EIGEN)
message(STATUS "Module opencv_alphamat disabled because the following dependencies are not found: Eigen")
ocv_module_disable(alphamat)
endif()
ocv_define_module(alphamat
opencv_core
opencv_imgproc
)
# Computer Vision based Alpha Matting
This project was part of the Google Summer of Code 2019.
#### Student: Muskaan Kularia
#### Mentor: Sunita Nayak
***
Alpha matting is the problem of extracting the foreground from an image. Given an input image and its corresponding trimap, we try to separate the foreground from the background.
This project is an implementation of "[Designing Effective Inter-Pixel Information Flow for Natural Image Matting](http://people.inf.ethz.ch/aksoyy/ifm/)" by Yağız Aksoy, Tunç Ozan Aydın and Marc Pollefeys [1]. It also required implementing parts of other papers [2, 3, 4].
## References
[1] Yagiz Aksoy, Tunc Ozan Aydin, Marc Pollefeys, "[Designing Effective Inter-Pixel Information Flow for Natural Image Matting](http://people.inf.ethz.ch/aksoyy/ifm/)", CVPR, 2017.
[2] Roweis, Sam T., and Lawrence K. Saul. "[Nonlinear dimensionality reduction by locally linear embedding](https://science.sciencemag.org/content/290/5500/2323)" Science 290.5500 (2000): 2323-2326.
[3] Anat Levin, Dani Lischinski, Yair Weiss, "[A Closed Form Solution to Natural Image Matting](https://www.researchgate.net/publication/5764820_A_Closed-Form_Solution_to_Natural_Image_Matting)", IEEE TPAMI, 2008.
[4] Qifeng Chen, Dingzeyu Li, Chi-Keung Tang, "[KNN Matting](http://dingzeyu.li/files/knn-matting-tpami.pdf)", IEEE TPAMI, 2013.
[5] Yagiz Aksoy, "[Affinity Based Matting Toolbox](https://github.com/yaksoy/AffinityBasedMattingToolbox)".
@inproceedings{aksoy2017designing,
title={Designing effective inter-pixel information flow for natural image matting},
author={Aksoy, Yagiz and Ozan Aydin, Tunc and Pollefeys, Marc},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={29--37},
year={2017}
}
@article{roweis2000nonlinear,
title={Nonlinear dimensionality reduction by locally linear embedding},
author={Roweis, Sam T and Saul, Lawrence K},
journal={science},
volume={290},
number={5500},
pages={2323--2326},
year={2000},
publisher={American Association for the Advancement of Science}
}
@inproceedings{shahrian2013improving,
title={Improving image matting using comprehensive sampling sets},
author={Shahrian, Ehsan and Rajan, Deepu and Price, Brian and Cohen, Scott},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={636--643},
year={2013}
}
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
/** Information Flow algorithm implementation for alpha matting */
#ifndef _OPENCV_ALPHAMAT_HPP_
#define _OPENCV_ALPHAMAT_HPP_
/**
* @defgroup alphamat Alpha Matting
 * This module is dedicated to computing the alpha matte of an image, given the input image and an input trimap.
* The samples directory includes easy examples of how to use the module.
*/
namespace cv { namespace alphamat {
//! @addtogroup alphamat
//! @{
/**
 * The implementation is based on Designing Effective Inter-Pixel Information Flow for Natural Image Matting by Yağız Aksoy, Tunç Ozan Aydın and Marc Pollefeys, CVPR 2017.
*
* This module has been originally developed by Muskaan Kularia and Sunita Nayak as a project
* for Google Summer of Code 2019 (GSoC 19).
*
*/
CV_EXPORTS_W void infoFlow(InputArray image, InputArray tmap, OutputArray result);
//! @}
}} // namespace
#endif
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include <iostream>
#include "opencv2/highgui.hpp"
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/alphamat.hpp>
using namespace std;
using namespace cv;
using namespace cv::alphamat;
const char* keys =
"{img || input image name}"
"{tri || input trimap image name}"
"{out || output image name}"
"{help h || print help message}"
;
int main(int argc, char* argv[])
{
CommandLineParser parser(argc, argv, keys);
parser.about("This sample demonstrates Information Flow Alpha Matting");
if (parser.has("help"))
{
parser.printMessage();
return 0;
}
string img_path = parser.get<std::string>("img");
string trimap_path = parser.get<std::string>("tri");
string result_path = parser.get<std::string>("out");
if (!parser.check()
|| img_path.empty() || trimap_path.empty())
{
parser.printMessage();
parser.printErrors();
return 1;
}
Mat image, tmap;
image = imread(img_path, IMREAD_COLOR); // Read the input image file
if (image.empty())
{
printf("Cannot read image file: '%s'\n", img_path.c_str());
return 1;
}
tmap = imread(trimap_path, IMREAD_GRAYSCALE);
if (tmap.empty())
{
printf("Cannot read trimap file: '%s'\n", trimap_path.c_str());
return 1;
}
Mat result;
infoFlow(image, tmap, result);
if (result_path.empty())
{
namedWindow("result alpha matte", WINDOW_NORMAL);
imshow("result alpha matte", result);
waitKey(0);
}
else
{
imwrite(result_path, result);
printf("Result saved: '%s'\n", result_path.c_str());
}
return 0;
}
/***********************************************************************
* Software License Agreement (BSD License)
*
* Copyright 2011-16 Jose Luis Blanco (joseluisblancoc@gmail.com).
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*************************************************************************/
#pragma once
#include "nanoflann.hpp"
#include <vector>
// ===== This example shows how to use nanoflann with these types of containers: =======
//typedef std::vector<std::vector<double> > my_vector_of_vectors_t;
//typedef std::vector<Eigen::VectorXd> my_vector_of_vectors_t; // This requires #include <Eigen/Dense>
// =====================================================================================
/** A simple vector-of-vectors adaptor for nanoflann, without duplicating the storage.
* The i'th vector represents a point in the state space.
*
* \tparam DIM If set to >0, it specifies a compile-time fixed dimensionality for the points in the data set, allowing more compiler optimizations.
* \tparam num_t The type of the point coordinates (typically, double or float).
* \tparam Distance The distance metric to use: nanoflann::metric_L1, nanoflann::metric_L2, nanoflann::metric_L2_Simple, etc.
 * \tparam IndexType The type for indices in the KD-tree index (typically, size_t or int)
*/
template <class VectorOfVectorsType, typename num_t = double, int DIM = -1, class Distance = nanoflann::metric_L2, typename IndexType = size_t>
struct KDTreeVectorOfVectorsAdaptor
{
typedef KDTreeVectorOfVectorsAdaptor<VectorOfVectorsType, num_t, DIM,Distance> self_t;
typedef typename Distance::template traits<num_t, self_t>::distance_t metric_t;
typedef nanoflann::KDTreeSingleIndexAdaptor< metric_t, self_t, DIM, IndexType> index_t;
index_t* index; //! The kd-tree index for the user to call its methods as usual with any other FLANN index.
/// Constructor: takes a const ref to the vector of vectors object with the data points
KDTreeVectorOfVectorsAdaptor(const size_t /* dimensionality */, const VectorOfVectorsType &mat, const int leaf_max_size = 10) : m_data(mat)
{
assert(mat.size() != 0 && mat[0].size() != 0);
const size_t dims = mat[0].size();
if (DIM>0 && static_cast<int>(dims) != DIM)
throw std::runtime_error("Data set dimensionality does not match the 'DIM' template argument");
index = new index_t( static_cast<int>(dims), *this /* adaptor */, nanoflann::KDTreeSingleIndexAdaptorParams(leaf_max_size ) );
index->buildIndex();
}
~KDTreeVectorOfVectorsAdaptor() {
delete index;
}
const VectorOfVectorsType &m_data;
/** Query for the \a num_closest closest points to a given point (entered as query_point[0:dim-1]).
* Note that this is a short-cut method for index->findNeighbors().
* The user can also call index->... methods as desired.
* \note nChecks_IGNORED is ignored but kept for compatibility with the original FLANN interface.
*/
//inline void query(const num_t *query_point, const size_t num_closest, IndexType *out_indices, num_t *out_distances_sq, const int nChecks_IGNORED = 10) const
inline void query(const num_t *query_point, const size_t num_closest, IndexType *out_indices, num_t *out_distances_sq) const
{
nanoflann::KNNResultSet<num_t, IndexType> resultSet(num_closest);
resultSet.init(out_indices, out_distances_sq);
index->findNeighbors(resultSet, query_point, nanoflann::SearchParams());
}
/** @name Interface expected by KDTreeSingleIndexAdaptor
* @{ */
const self_t & derived() const {
return *this;
}
self_t & derived() {
return *this;
}
// Must return the number of data points
inline size_t kdtree_get_point_count() const {
return m_data.size();
}
// Returns the dim'th component of the idx'th point in the class:
inline num_t kdtree_get_pt(const size_t idx, const size_t dim) const {
return m_data[idx][dim];
}
// Optional bounding-box computation: return false to default to a standard bbox computation loop.
// Return true if the BBOX was already computed by the class and returned in "bb" so it can be avoided to redo it again.
// Look at bb.size() to find out the expected dimensionality (e.g. 2 or 3 for point clouds)
template <class BBOX>
bool kdtree_get_bbox(BBOX & /*bb*/) const {
return false;
}
/** @} */
}; // end of KDTreeVectorOfVectorsAdaptor
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "precomp.hpp"
#include "intraU.hpp"
#include "cm.hpp"
namespace cv { namespace alphamat {
static
void generateFVectorCM(my_vector_of_vectors_t& samples, Mat& img)
{
int nRows = img.rows;
int nCols = img.cols;
samples.resize(nRows * nCols);
int i, j;
for (i = 0; i < nRows; ++i)
{
for (j = 0; j < nCols; ++j)
{
samples[i * nCols + j].resize(ALPHAMAT_DIM);
samples[i * nCols + j][0] = img.at<cv::Vec3b>(i, j)[0] / 255.0;
samples[i * nCols + j][1] = img.at<cv::Vec3b>(i, j)[1] / 255.0;
samples[i * nCols + j][2] = img.at<cv::Vec3b>(i, j)[2] / 255.0;
samples[i * nCols + j][3] = double(i) / nRows;
samples[i * nCols + j][4] = double(j) / nCols;
}
}
}
static
void kdtree_CM(Mat& img, my_vector_of_vectors_t& indm, my_vector_of_vectors_t& samples, std::unordered_set<int>& unk)
{
// Generate feature vectors for intra U:
generateFVectorCM(samples, img);
// Query point: same as samples from which KD tree is generated
// construct a kd-tree index:
// Dimensionality set at run-time (default: L2)
// ------------------------------------------------------------
typedef KDTreeVectorOfVectorsAdaptor<my_vector_of_vectors_t, double> my_kd_tree_t;
my_kd_tree_t mat_index(ALPHAMAT_DIM /*dim*/, samples, 10 /* max leaf */);
mat_index.index->buildIndex();
// do a knn search with cm = 20
const size_t num_results = 20 + 1;
int N = unk.size();
std::vector<size_t> ret_indexes(num_results);
std::vector<double> out_dists_sqr(num_results);
nanoflann::KNNResultSet<double> resultSet(num_results);
indm.resize(N);
int i = 0;
for (std::unordered_set<int>::iterator it = unk.begin(); it != unk.end(); it++)
{
resultSet.init(&ret_indexes[0], &out_dists_sqr[0]);
mat_index.index->findNeighbors(resultSet, &samples[*it][0], nanoflann::SearchParams(10));
indm[i].resize(num_results - 1);
for (std::size_t j = 1; j < num_results; j++)
{
indm[i][j - 1] = ret_indexes[j];
}
i++;
}
}
static
void lle(my_vector_of_vectors_t& indm, my_vector_of_vectors_t& samples, float eps, std::unordered_set<int>& unk,
SparseMatrix<double>& Wcm, SparseMatrix<double>& Dcm, Mat& img)
{
CV_LOG_INFO(NULL, "ALPHAMAT: In cm's lle function");
int k = indm[0].size(); //number of neighbours that we are considering
int n = indm.size(); //number of unknown pixels
typedef Triplet<double> T;
std::vector<T> triplets, td;
my_vector_of_vectors_t wcm;
wcm.resize(n);
Mat C(20, 20, DataType<float>::type), rhs(20, 1, DataType<float>::type), Z(3, 20, DataType<float>::type), weights(20, 1, DataType<float>::type), pt(3, 1, DataType<float>::type);
Mat ptDotN(20, 1, DataType<float>::type), imd(20, 1, DataType<float>::type);
Mat Cones(20, 1, DataType<float>::type), Cinv(20, 1, DataType<float>::type);
float alpha, beta, lagrangeMult;
Cones = 1;
C = 0;
rhs = 1;
int i, ind = 0;
for (std::unordered_set<int>::iterator it = unk.begin(); it != unk.end(); it++)
{
// filling values in Z
i = *it;
int index_nbr;
for (int j = 0; j < k; j++)
{
index_nbr = indm[ind][j];
for (int p = 0; p < ALPHAMAT_DIM - 2; p++)
{
Z.at<float>(p, j) = samples[index_nbr][p];
}
}
pt.at<float>(0, 0) = samples[i][0];
pt.at<float>(1, 0) = samples[i][1];
pt.at<float>(2, 0) = samples[i][2];
C = Z.t() * Z;
for (int p = 0; p < k; p++)
{
C.at<float>(p, p) += eps;
}
ptDotN = Z.t() * pt;
solve(C, ptDotN, imd);
alpha = 1 - cv::sum(imd)[0];
solve(C, Cones, Cinv);
beta = cv::sum(Cinv)[0]; // sum of elements of inv(C)
lagrangeMult = alpha / beta;
solve(C, ptDotN + lagrangeMult * Cones, weights);
float sum = cv::sum(weights)[0];
weights = weights / sum;
int cMaj_i = findColMajorInd(i, img.rows, img.cols);
for (int j = 0; j < k; j++)
{
int cMaj_ind_j = findColMajorInd(indm[ind][j], img.rows, img.cols);
triplets.push_back(T(cMaj_i, cMaj_ind_j, weights.at<float>(j, 0)));
td.push_back(T(cMaj_i, cMaj_i, weights.at<float>(j, 0)));
}
ind++;
}
Wcm.setFromTriplets(triplets.begin(), triplets.end());
Dcm.setFromTriplets(td.begin(), td.end());
}
void cm(Mat& image, Mat& tmap, SparseMatrix<double>& Wcm, SparseMatrix<double>& Dcm)
{
my_vector_of_vectors_t samples, indm, Euu;
int i, j;
std::unordered_set<int> unk;
for (i = 0; i < tmap.rows; i++)
{
for (j = 0; j < tmap.cols; j++)
{
uchar pix = tmap.at<uchar>(i, j);
if (pix == 128)
unk.insert(i * tmap.cols + j);
}
}
kdtree_CM(image, indm, samples, unk);
float eps = 0.00001;
lle(indm, samples, eps, unk, Wcm, Dcm, image);
CV_LOG_INFO(NULL, "ALPHAMAT: cm DONE");
}
}} // namespace cv::alphamat
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#ifndef __OPENCV_ALPHAMAT_CM_H__
#define __OPENCV_ALPHAMAT_CM_H__
namespace cv { namespace alphamat {
using namespace Eigen;
using namespace nanoflann;
void cm(Mat& image, Mat& tmap, SparseMatrix<double>& Wcm, SparseMatrix<double>& Dcm);
}}
#endif
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "precomp.hpp"
#include <Eigen/Sparse>
using namespace Eigen;
namespace cv { namespace alphamat {
static
void solve(SparseMatrix<double> Wcm, SparseMatrix<double> Wuu, SparseMatrix<double> Wl, SparseMatrix<double> Dcm,
SparseMatrix<double> Duu, SparseMatrix<double> Dl, SparseMatrix<double> T,
Mat& wf, Mat& alpha)
{
float suu = 0.01, sl = 0.1, lamd = 100;
SparseMatrix<double> Lifm = ((Dcm - Wcm).transpose()) * (Dcm - Wcm) + sl * (Dl - Wl) + suu * (Duu - Wuu);
SparseMatrix<double> A;
int n = wf.rows;
VectorXd b(n), x(n);
Eigen::VectorXd wf_;
cv2eigen(wf, wf_);
A = Lifm + lamd * T;
b = (lamd * T) * (wf_);
ConjugateGradient<SparseMatrix<double>, Lower | Upper> cg;
cg.setMaxIterations(500);
cg.compute(A);
x = cg.solve(b);
CV_LOG_INFO(NULL, "ALPHAMAT: #iterations: " << cg.iterations());
CV_LOG_INFO(NULL, "ALPHAMAT: estimated error: " << cg.error());
int nRows = alpha.rows;
int nCols = alpha.cols;
float pix_alpha;
for (int j = 0; j < nCols; ++j)
{
for (int i = 0; i < nRows; ++i)
{
pix_alpha = x(i + j * nRows);
if (pix_alpha < 0)
pix_alpha = 0;
if (pix_alpha > 1)
pix_alpha = 1;
alpha.at<uchar>(i, j) = uchar(pix_alpha * 255);
}
}
}
void infoFlow(InputArray image_ia, InputArray tmap_ia, OutputArray result)
{
Mat image = image_ia.getMat();
Mat tmap = tmap_ia.getMat();
int64 begin = cv::getTickCount();
int nRows = image.rows;
int nCols = image.cols;
int N = nRows * nCols;
SparseMatrix<double> T(N, N);
typedef Triplet<double> Tr;
std::vector<Tr> triplets;
//Pre-process trimap
for (int i = 0; i < nRows; ++i)
{
for (int j = 0; j < nCols; ++j)
{
uchar& pix = tmap.at<uchar>(i, j);
if (pix <= 0.2f * 255)
pix = 0;
else if (pix >= 0.8f * 255)
pix = 255;
else
pix = 128;
}
}
Mat wf = Mat::zeros(nRows * nCols, 1, CV_8U);
// Column Major Interpretation for working with SparseMatrix
for (int i = 0; i < nRows; ++i)
{
for (int j = 0; j < nCols; ++j)
{
uchar pix = tmap.at<uchar>(i, j);
// collection of known pixels samples
triplets.push_back(Tr(i + j * nRows, i + j * nRows, (pix != 128) ? 1 : 0));
// foreground pixel
wf.at<uchar>(i + j * nRows, 0) = (pix > 200) ? 1 : 0;
}
}
SparseMatrix<double> Wl(N, N), Dl(N, N);
local_info(image, tmap, Wl, Dl);
SparseMatrix<double> Wcm(N, N), Dcm(N, N);
cm(image, tmap, Wcm, Dcm);
Mat new_tmap = tmap.clone();
SparseMatrix<double> Wuu(N, N), Duu(N, N);
Mat image_t = image.t();
Mat tmap_t = tmap.t();
UU(image, tmap, Wuu, Duu);
double elapsed_secs = ((double)(getTickCount() - begin)) / getTickFrequency();
T.setFromTriplets(triplets.begin(), triplets.end());
Mat alpha = Mat::zeros(nRows, nCols, CV_8UC1);
solve(Wcm, Wuu, Wl, Dcm, Duu, Dl, T, wf, alpha);
alpha.copyTo(result);
elapsed_secs = ((double)(getTickCount() - begin)) / getTickFrequency();
CV_LOG_INFO(NULL, "ALPHAMAT: total time: " << elapsed_secs);
}
}} // namespace cv::alphamat
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "precomp.hpp"
#include "intraU.hpp"
namespace cv { namespace alphamat {
int findColMajorInd(int rowMajorInd, int nRows, int nCols)
{
int iInd = rowMajorInd / nCols;
int jInd = rowMajorInd % nCols;
return (jInd * nRows + iInd);
}
static
void generateFVectorIntraU(my_vector_of_vectors_t& samples, Mat& img, Mat& tmap, std::vector<int>& orig_ind)
{
int nRows = img.rows;
int nCols = img.cols;
int unk_count = 0;
int i, j;
for (i = 0; i < nRows; ++i)
{
for (j = 0; j < nCols; ++j)
{
uchar pix = tmap.at<uchar>(i, j);
if (pix == 128)
unk_count++;
}
}
samples.resize(unk_count);
orig_ind.resize(unk_count);
int c1 = 0;
for (i = 0; i < nRows; ++i)
{
for (j = 0; j < nCols; ++j)
{
uchar pix = tmap.at<uchar>(i, j);
if (pix == 128) // collection of unknown pixels samples
{
samples[c1].resize(ALPHAMAT_DIM);
samples[c1][0] = img.at<cv::Vec3b>(i, j)[0] / 255.0;
samples[c1][1] = img.at<cv::Vec3b>(i, j)[1] / 255.0;
samples[c1][2] = img.at<cv::Vec3b>(i, j)[2] / 255.0;
samples[c1][3] = (double(i + 1) / nRows) / 20;
samples[c1][4] = (double(j + 1) / nCols) / 20;
orig_ind[c1] = i * nCols + j;
c1++;
}
}
}
CV_LOG_INFO(NULL, "ALPHAMAT: Total number of unknown pixels : " << c1);
}
static
void kdtree_intraU(Mat& img, Mat& tmap, my_vector_of_vectors_t& indm, my_vector_of_vectors_t& samples, std::vector<int>& orig_ind)
{
// Generate feature vectors for intra U:
generateFVectorIntraU(samples, img, tmap, orig_ind);
typedef KDTreeVectorOfVectorsAdaptor<my_vector_of_vectors_t, double> my_kd_tree_t;
my_kd_tree_t mat_index(ALPHAMAT_DIM /*dim*/, samples, 10 /* max leaf */);
mat_index.index->buildIndex();
// do a knn search with ku = 5
const size_t num_results = 5 + 1;
int N = samples.size(); // no. of unknown samples
std::vector<size_t> ret_indexes(num_results);
std::vector<double> out_dists_sqr(num_results);
nanoflann::KNNResultSet<double> resultSet(num_results);
indm.resize(N);
for (int i = 0; i < N; i++)
{
resultSet.init(&ret_indexes[0], &out_dists_sqr[0]);
mat_index.index->findNeighbors(resultSet, &samples[i][0], nanoflann::SearchParams(10));
indm[i].resize(num_results - 1);
for (std::size_t j = 1; j < num_results; j++)
{
indm[i][j - 1] = ret_indexes[j];
}
}
}
static
double l1norm(std::vector<double>& x, std::vector<double>& y)
{
double sum = 0;
for (int i = 0; i < ALPHAMAT_DIM; i++)
sum += abs(x[i] - y[i]);
return sum / ALPHAMAT_DIM;
}
static
void intraU(Mat& img, my_vector_of_vectors_t& indm, my_vector_of_vectors_t& samples,
std::vector<int>& orig_ind, SparseMatrix<double>& Wuu, SparseMatrix<double>& Duu)
{
// input: indm, samples
int n = indm.size(); // num of unknown samples
CV_LOG_INFO(NULL, "ALPHAMAT: num of unknown samples, n : " << n);
int i, j, nbr_ind;
for (i = 0; i < n; i++)
{
samples[i][3] /= 100.0;
samples[i][4] /= 100.0;
}
my_vector_of_vectors_t weights;
typedef Triplet<double> T;
std::vector<T> triplets, td;
double weight;
for (i = 0; i < n; i++)
{
int num_nbr = indm[i].size();
int cMaj_i = findColMajorInd(orig_ind[i], img.rows, img.cols);
for (j = 0; j < num_nbr; j++)
{
nbr_ind = indm[i][j];
int cMaj_nbr_j = findColMajorInd(orig_ind[nbr_ind], img.rows, img.cols);
weight = max(1 - l1norm(samples[i], samples[nbr_ind]), 0.0);
triplets.push_back(T(cMaj_i, cMaj_nbr_j, weight / 2));
td.push_back(T(cMaj_i, cMaj_i, weight / 2));
triplets.push_back(T(cMaj_nbr_j, cMaj_i, weight / 2));
td.push_back(T(cMaj_nbr_j, cMaj_nbr_j, weight / 2));
}
}
Wuu.setFromTriplets(triplets.begin(), triplets.end());
Duu.setFromTriplets(td.begin(), td.end());
}
void UU(Mat& image, Mat& tmap, SparseMatrix<double>& Wuu, SparseMatrix<double>& Duu)
{
my_vector_of_vectors_t samples, indm;
std::vector<int> orig_ind;
kdtree_intraU(image, tmap, indm, samples, orig_ind);
intraU(image, indm, samples, orig_ind, Wuu, Duu);
CV_LOG_INFO(NULL, "ALPHAMAT: Intra U Done");
}
}} // namespace cv::alphamat
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#ifndef __OPENCV_ALPHAMAT_INTRAU_H__
#define __OPENCV_ALPHAMAT_INTRAU_H__
namespace cv { namespace alphamat {
const int ALPHAMAT_DIM = 5; // dimension of feature vectors
using namespace Eigen;
using namespace nanoflann;
typedef std::vector<std::vector<double>> my_vector_of_vectors_t;
int findColMajorInd(int rowMajorInd, int nRows, int nCols);
void UU(Mat& image, Mat& tmap, SparseMatrix<double>& Wuu, SparseMatrix<double>& Duu);
}} // namespace
#endif
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
// #ifndef local_info
// #define local_info
#include "precomp.hpp"
#include "local_info.hpp"
namespace cv { namespace alphamat {
void local_info(Mat& img, Mat& tmap, SparseMatrix<double>& Wl, SparseMatrix<double>& Dl)
{
float eps = 0.000001;
int win_size = 1;
int nRows = img.rows;
int nCols = img.cols;
int N = img.rows * img.cols;
Mat unk_img = Mat::zeros(cv::Size(nCols, nRows), CV_32FC1);
for (int i = 0; i < nRows; ++i)
{
for (int j = 0; j < nCols; ++j)
{
uchar pix = tmap.at<uchar>(i, j);
if (pix == 128) // collection of unknown pixels samples
{
unk_img.at<float>(i, j) = 255;
}
}
}
Mat element = getStructuringElement(MORPH_RECT, Size(2 * win_size + 1, 2 * win_size + 1));
/// Apply the dilation operation
Mat dilation_dst = unk_img.clone();
//dilate(unk_img, dilation_dst, element);
int num_win = (win_size * 2 + 1) * (win_size * 2 + 1); // number of pixels in window
typedef Triplet<double> T;
std::vector<T> triplets, td, tl;
int neighInd[9];
int i, j;
for (j = win_size; j < nCols - win_size; j++)
{
for (i = win_size; i < nRows - win_size; i++)
{
uchar pix = tmap.at<uchar>(i, j);
if (pix != 128)
continue;
// extract the window out of image
Mat win = img.rowRange(i - win_size, i + win_size + 1);
win = win.colRange(j - win_size, j + win_size + 1);
Mat win_ravel = Mat::zeros(9, 3, CV_64F); // window pixels flattened column-major into a 9x3 matrix
double sum1 = 0;
double sum2 = 0;
double sum3 = 0;
int c = 0;
for (int q = -1; q <= 1; q++)
{
for (int p = -1; p <= 1; p++)
{
neighInd[c] = (j + q) * nRows + (i + p); // column major
c++;
}
}
c = 0;
//parsing column major way in the window
for (int q = 0; q < win_size * 2 + 1; q++)
{
for (int p = 0; p < win_size * 2 + 1; p++)
{
win_ravel.at<double>(c, 0) = win.at<cv::Vec3b>(p, q)[0] / 255.0;
win_ravel.at<double>(c, 1) = win.at<cv::Vec3b>(p, q)[1] / 255.0;
win_ravel.at<double>(c, 2) = win.at<cv::Vec3b>(p, q)[2] / 255.0;
sum1 += win.at<cv::Vec3b>(p, q)[0] / 255.0;
sum2 += win.at<cv::Vec3b>(p, q)[1] / 255.0;
sum3 += win.at<cv::Vec3b>(p, q)[2] / 255.0;
c++;
}
}
win = win_ravel;
Mat win_mean = Mat::zeros(1, 3, CV_64F);
win_mean.at<double>(0, 0) = sum1 / num_win;
win_mean.at<double>(0, 1) = sum2 / num_win;
win_mean.at<double>(0, 2) = sum3 / num_win;
// calculate the covariance matrix
Mat covariance = (win.t() * win / num_win) - (win_mean.t() * win_mean);
Mat I = Mat::eye(img.channels(), img.channels(), CV_64F);
Mat I1 = (covariance + (eps / num_win) * I);
Mat I1_inv = I1.inv();
Mat X = win - repeat(win_mean, num_win, 1);
Mat vals = (1 + X * I1_inv * X.t()) / num_win;
for (int q = 0; q < num_win; q++)
{
for (int p = 0; p < num_win; p++)
{
triplets.push_back(T(neighInd[p], neighInd[q], vals.at<double>(p, q)));
}
}
}
}
std::vector<T> tsp;
SparseMatrix<double> W(N, N), Wsp(N, N);
W.setFromTriplets(triplets.begin(), triplets.end());
SparseMatrix<double> Wt = W.transpose();
SparseMatrix<double> Ws = Wt + W;
W = Ws;
for (int k = 0; k < W.outerSize(); ++k)
{
double sumCol = 0;
for (SparseMatrix<double>::InnerIterator it(W, k); it; ++it)
{
sumCol += it.value();
}
if (sumCol < 0.05)
sumCol = 1;
tsp.push_back(T(k, k, 1 / sumCol));
}
Wsp.setFromTriplets(tsp.begin(), tsp.end());
Wl = Wsp * W; // For normalization
//Wl = W; // No normalization
SparseMatrix<double> Wlt = Wl.transpose();
for (int k = 0; k < Wlt.outerSize(); ++k)
{
double sumarr = 0;
for (SparseMatrix<double>::InnerIterator it(Wlt, k); it; ++it)
sumarr += it.value();
td.push_back(T(k, k, sumarr));
}
Dl.setFromTriplets(td.begin(), td.end());
CV_LOG_INFO(NULL, "ALPHAMAT: local_info DONE");
}
}} // namespace cv::alphamat
// #endif
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#ifndef __OPENCV_ALPHAMAT_LOCAL_INFO_H__
#define __OPENCV_ALPHAMAT_LOCAL_INFO_H__
namespace cv { namespace alphamat {
using namespace Eigen;
void local_info(Mat& img, Mat& tmap, SparseMatrix<double>& Wl, SparseMatrix<double>& Dl);
}} // namespace
#endif
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#ifndef __OPENCV_PRECOMP_H__
#define __OPENCV_PRECOMP_H__
#include <vector>
#include <unordered_set>
#include <set>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/core/utils/logger.hpp>
#include <opencv2/alphamat.hpp>
#include "3rdparty/nanoflann.hpp"
#include "3rdparty/KDTreeVectorOfVectorsAdaptor.h"
#ifdef HAVE_EIGEN
#include <Eigen/Eigen>
#include <opencv2/core/eigen.hpp>
#include <Eigen/IterativeLinearSolvers>
#endif
#include "intraU.hpp"
#include "cm.hpp"
#include "local_info.hpp"
#endif
Information Flow Alpha Matting {#tutorial_alphamat}
============================
This project was part of Google Summer of Code 2019.
*Student:* Muskaan Kularia
*Mentor:* Sunita Nayak
Alpha matting is the problem of extracting the foreground from an image. The extracted foreground can then be used for further operations, such as changing the background of an image.
Given an input image and its corresponding trimap, we try to separate the foreground from the background. An example follows:
Input Image: ![](samples/input_images/plant.jpg)
Input Trimap: ![](samples/trimaps/plant.png)
Output alpha Matte: ![](samples/output_mattes/plant_result.jpg)
This project is an implementation of @cite aksoy2017designing. It also required implementing parts of other papers [2, 3, 4].
# Building
This module uses the Eigen package.
Build the sample code of the alphamat module by running the following two cmake commands from inside the build folder:
```
cmake -DOPENCV_EXTRA_MODULES_PATH=<path to opencv_contrib modules> -DBUILD_EXAMPLES=ON ..
cmake --build . --config Release --target example_alphamat_information_flow_matting
```
Please refer to OpenCV building tutorials for further details, if needed.
# Testing
The built target can be tested as follows:
```
example_alphamat_information_flow_matting -img=<path to input image file> -tri=<path to the corresponding trimap> -out=<path to save output matte file>
```
# Source Code of the sample
@includelineno alphamat/samples/information_flow_matting.cpp
# References
[1] Yagiz Aksoy, Tunc Ozan Aydin, Marc Pollefeys, "[Designing Effective Inter-Pixel Information Flow for Natural Image Matting](http://people.inf.ethz.ch/aksoyy/ifm/)", CVPR, 2017.
[2] Roweis, Sam T., and Lawrence K. Saul. "[Nonlinear dimensionality reduction by locally linear embedding](https://science.sciencemag.org/content/290/5500/2323)" Science 290.5500 (2000): 2323-2326.
[3] Anat Levin, Dani Lischinski, Yair Weiss, "[A Closed Form Solution to Natural Image Matting](https://www.researchgate.net/publication/5764820_A_Closed-Form_Solution_to_Natural_Image_Matting)", IEEE TPAMI, 2008.
[4] Qifeng Chen, Dingzeyu Li, Chi-Keung Tang, "[KNN Matting](http://dingzeyu.li/files/knn-matting-tpami.pdf)", IEEE TPAMI, 2013.
[5] Yagiz Aksoy, "[Affinity Based Matting Toolbox](https://github.com/yaksoy/AffinityBasedMattingToolbox)".