Commit bfacad1a authored by Vadim Pisarevsky's avatar Vadim Pisarevsky

Merge pull request #3119 from mshabunin:logistic-regression

parents 0ffc53ba 108caae2
Logistic Regression
===================
.. highlight:: cpp
ML implements logistic regression, which is a probabilistic classification technique. Logistic Regression is a binary classification algorithm which is closely related to Support Vector Machines (SVM).
Like SVM, Logistic Regression can be extended to multi-class classification problems such as digit recognition (i.e. recognizing digits like 0, 1, 2, 3, ... from given images).
This version of Logistic Regression supports both binary and multi-class classification (for multi-class it creates multiple 2-class classifiers).
In order to train the logistic regression classifier, the Batch Gradient Descent and Mini-Batch Gradient Descent algorithms are used (see [BatchDesWiki]_).
Logistic Regression is a discriminative classifier (see [LogRegTomMitch]_ for more details) and is implemented as the C++ class ``LogisticRegression``.
In Logistic Regression, we try to optimize the training parameter
:math:`\theta`
such that the hypothesis
:math:`0 \leq h_\theta(x) \leq 1` is achieved.
We have
:math:`h_\theta(x) = g(\theta^T x)`
where
:math:`g(z) = \frac{1}{1+e^{-z}}`
is the logistic or sigmoid function.
The term "Logistic" in Logistic Regression refers to this function.
For given data of a binary classification problem of classes 0 and 1,
one can determine that a given data instance belongs to class 1 if
:math:`h_\theta(x) \geq 0.5`
or to class 0 if
:math:`h_\theta(x) < 0.5`.
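For illustration, the hypothesis and the decision rule above can be written in a few lines of C++ (a minimal sketch; ``sigmoid`` and ``classify`` are illustrative helpers, not part of the ``LogisticRegression`` API):

::

    #include <cmath>

    // g(z) = 1 / (1 + e^{-z}), the logistic (sigmoid) function
    static double sigmoid(double z)
    {
        return 1.0 / (1.0 + std::exp(-z));
    }

    // h_theta(x) = g(theta^T x); class 1 if h_theta(x) >= 0.5, class 0 otherwise
    static int classify(double theta_dot_x)
    {
        return sigmoid(theta_dot_x) >= 0.5 ? 1 : 0;
    }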
In Logistic Regression, choosing the right parameters is of utmost importance for reducing the training error and ensuring high training accuracy.
``LogisticRegression::Params`` is the structure that defines the parameters required to train a Logistic Regression classifier.
The learning rate is determined by ``LogisticRegression::Params.alpha``. It determines how fast we approach the solution and must be a positive real number.
Optimization algorithms like Batch Gradient Descent and Mini-Batch Gradient Descent are supported in ``LogisticRegression``.
It is important to specify the number of iterations these optimization algorithms have to run, which is given by ``LogisticRegression::Params.num_iters``.
The number of iterations can be thought of as the number of steps taken, while the learning rate specifies whether each step is a long or a short one. Together, these two parameters define how fast we arrive at a possible solution.
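For reference, each iteration of gradient descent updates the parameters by a step proportional to the learning rate (the standard unregularized form; the implementation's exact update also accounts for the chosen regularization):

.. math::

    \theta := \theta - \alpha \frac{\partial J(\theta)}{\partial \theta}

where :math:`J(\theta)` is the logistic cost function computed over the training set and :math:`\alpha` is the learning rate ``LogisticRegression::Params.alpha``.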
In order to compensate for overfitting, regularization can be performed; it is enabled by setting ``LogisticRegression::Params.regularized`` to a positive integer (greater than zero).
One can specify the kind of regularization to be performed by setting ``LogisticRegression::Params.norm`` to ``LogisticRegression::REG_L1`` or ``LogisticRegression::REG_L2``.
``LogisticRegression`` provides a choice of two training methods, Batch Gradient Descent or Mini-Batch Gradient Descent. To specify which, set ``LogisticRegression::Params.train_method`` to either ``LogisticRegression::BATCH`` or ``LogisticRegression::MINI_BATCH``.
If ``LogisticRegression::Params.train_method`` is set to ``LogisticRegression::MINI_BATCH``, the size of the mini batch has to be set to a positive integer using ``LogisticRegression::Params.mini_batch_size``.
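In terms of the cost function, the two choices correspond to adding an L1 or an L2 penalty term to :math:`J(\theta)` (standard formulations, with :math:`\lambda` denoting the regularization strength):

.. math::

    J_{L1}(\theta) = J(\theta) + \lambda \sum_j |\theta_j|, \qquad
    J_{L2}(\theta) = J(\theta) + \lambda \sum_j \theta_j^2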
A sample set of training parameters for the Logistic Regression classifier can be initialized as follows:
::
    LogisticRegression::Params params;
    params.alpha = 0.5;
    params.num_iters = 10000;
    params.norm = LogisticRegression::REG_L2;
    params.regularized = 1;
    params.train_method = LogisticRegression::MINI_BATCH;
    params.mini_batch_size = 10;
**References:**
.. [LogRegWiki] http://en.wikipedia.org/wiki/Logistic_regression. Wikipedia article about the Logistic Regression algorithm.
.. [RenMalik2003] Learning a Classification Model for Segmentation. Proc. CVPR, Nice, France (2003).
.. [LogRegTomMitch] http://www.cs.cmu.edu/~tom/NewChapters.html. "Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression" in Machine Learning, Tom Mitchell.
.. [BatchDesWiki] http://en.wikipedia.org/wiki/Gradient_descent_optimization. Wikipedia article about Gradient Descent based optimization.
LogisticRegression::Params
--------------------------
.. ocv:struct:: LogisticRegression::Params
Parameters of the Logistic Regression training algorithm. You can initialize the structure using a constructor or by declaring the variable and initializing the individual parameters.
The training parameters for Logistic Regression:
.. ocv:member:: double alpha
The learning rate of the optimization algorithm. The higher the value, the faster the rate, and vice versa. If the value is too high, the learning algorithm may overshoot the optimal parameters and result in lower training accuracy. If the value is too low, the learning algorithm converges towards the optimal parameters very slowly. The value must be a positive real number. You can experiment with different values in small increments, such as 0.0001, 0.0003, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, ..., and select the learning rate that gives the least training error.
.. ocv:member:: int num_iters
The number of iterations required for the learning algorithm (Gradient Descent or Mini-Batch Gradient Descent). It has to be a positive integer. You can try different numbers of iterations, such as 100, 1000, 2000, 3000, 5000, 10000, and so on.
.. ocv:member:: int norm
The type of regularization norm applied. It takes the value ``LogisticRegression::REG_L1`` or ``LogisticRegression::REG_L2``.
.. ocv:member:: int regularized
It should be set to a positive integer (greater than zero) in order to enable regularization.
.. ocv:member:: int train_method
The kind of training method used to train the classifier. It should be set to either ``LogisticRegression::BATCH`` or ``LogisticRegression::MINI_BATCH``.
.. ocv:member:: int mini_batch_size
If the training method is set to ``LogisticRegression::MINI_BATCH``, it has to be set to a positive integer. It can range from 1 to the number of training samples.
.. ocv:member:: cv::TermCriteria term_crit
Sets the termination criteria for the training algorithm.
LogisticRegression::Params::Params
----------------------------------
The constructors
.. ocv:function:: LogisticRegression::Params::Params(double learning_rate = 0.001, int iters = 1000, int method = LogisticRegression::BATCH, int normalization = LogisticRegression::REG_L2, int reg = 1, int batch_size = 1)
:param learning_rate: Specifies the learning rate.
:param iters: Specifies the number of iterations.
:param method: Specifies the kind of training method used. It should be set to either ``LogisticRegression::BATCH`` or ``LogisticRegression::MINI_BATCH``. If using ``LogisticRegression::MINI_BATCH``, set ``LogisticRegression::Params.mini_batch_size`` to a positive integer.
:param normalization: Specifies the kind of regularization to be applied: ``LogisticRegression::REG_L1`` or ``LogisticRegression::REG_L2`` (L1 norm or L2 norm). For this to take effect, set ``LogisticRegression::Params.regularized`` to an integer greater than zero.
:param reg: Enables or disables regularization. Set to a positive integer (greater than zero) to enable, or to 0 to disable.
:param batch_size: Specifies the number of training samples taken in each step of Mini-Batch Gradient Descent. It is only used with the ``LogisticRegression::MINI_BATCH`` training method, and has to be less than the total number of training samples.
By initializing this structure, one can set all the parameters required for Logistic Regression classifier.
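For example, the parameter set shown earlier can equivalently be passed to the constructor in a single call (a sketch using the constructor signature above):

::

    LogisticRegression::Params params(0.5, 10000, LogisticRegression::MINI_BATCH,
                                      LogisticRegression::REG_L2, 1, 10);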
LogisticRegression
------------------
.. ocv:class:: LogisticRegression : public StatModel
Implements Logistic Regression classifier.
LogisticRegression::create
--------------------------
Creates an empty model.
.. ocv:function:: Ptr<LogisticRegression> LogisticRegression::create( const Params& params = Params() )
:param params: The training parameters for the classifier of type ``LogisticRegression::Params``.
Creates a Logistic Regression model with the given parameters.
LogisticRegression::train
-------------------------
Trains the Logistic Regression classifier and returns true if successful.
.. ocv:function:: bool LogisticRegression::train( const Ptr<TrainData>& trainData, int flags=0 )
:param trainData: Instance of ml::TrainData class holding learning data.
:param flags: Not used.
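A typical call sequence creates the model and trains it on a ``TrainData`` instance (a minimal sketch; the CSV file name is illustrative and ``params`` is a ``LogisticRegression::Params`` structure initialized as shown earlier):

::

    Ptr<TrainData> tdata = TrainData::loadFromCSV("iris.data", 0);
    Ptr<LogisticRegression> lr = LogisticRegression::create(params);
    lr->train(tdata);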
LogisticRegression::predict
---------------------------
Predicts responses for input samples and returns a value of type ``float``.
.. ocv:function:: float LogisticRegression::predict( InputArray samples, OutputArray results=noArray(), int flags=0 ) const
:param samples: The input data for the prediction algorithm. Matrix [m x n], where each row contains variables (features) of one object being classified. Should have data type ``CV_32F``.
:param results: Predicted labels as a column matrix of type ``CV_32S``.
:param flags: Not used.
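Continuing the sketch above, predictions for a matrix of samples can be obtained as follows (``lr`` is the trained model):

::

    Mat samples; // CV_32F matrix with one row of features per object
    // ... fill samples ...
    Mat responses;
    lr->predict(samples, responses); // responses: CV_32S column of predicted labels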
LogisticRegression::get_learnt_thetas
-------------------------------------
This function returns the trained parameters arranged across rows. For a two-class classification problem, it returns a single row matrix.
.. ocv:function:: Mat LogisticRegression::get_learnt_thetas() const
It returns the learnt parameters of the Logistic Regression as a matrix of type ``CV_32F``.
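For example, the parameters learnt by the trained model above can be inspected with:

::

    Mat thetas = lr->get_learnt_thetas(); // CV_32F; one row per trained 2-class classifier
    cout << thetas << endl;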
LogisticRegression::read
------------------------
This function reads the trained LogisticRegression classifier from disk.
.. ocv:function:: void LogisticRegression::read(const FileNode& fn)
LogisticRegression::write
-------------------------
This function writes the trained LogisticRegression classifier to disk.
.. ocv:function:: void LogisticRegression::write(FileStorage& fs) const
@@ -18,4 +18,5 @@ Most of the classification and regression algorithms are implemented as C++ clas
random_trees
expectation_maximization
neural_networks
logistic_regression
mldata
@@ -90,7 +90,6 @@ public:
CV_PROP_RW double logStep;
};
class CV_EXPORTS TrainData
{
public:
@@ -566,6 +565,48 @@ public:
static Ptr<ANN_MLP> create(const Params& params=Params());
};
/****************************************************************************************\
* Logistic Regression *
\****************************************************************************************/
class CV_EXPORTS LogisticRegression : public StatModel
{
public:
    class CV_EXPORTS Params
    {
    public:
        Params(double learning_rate = 0.001,
               int iters = 1000,
               int method = LogisticRegression::BATCH,
               int normalization = LogisticRegression::REG_L2,
               int reg = 1,
               int batch_size = 1);
        double alpha;           //!< learning rate
        int num_iters;          //!< number of iterations
        int norm;               //!< regularization norm: REG_L1 or REG_L2
        int regularized;        //!< greater than zero enables regularization
        int train_method;       //!< BATCH or MINI_BATCH
        int mini_batch_size;    //!< mini-batch size, used with MINI_BATCH only
        TermCriteria term_crit; //!< termination criteria of the training algorithm
    };

    enum { REG_L1 = 0, REG_L2 = 1 };
    enum { BATCH = 0, MINI_BATCH = 1 };

    // Algorithm interface
    virtual void write( FileStorage &fs ) const = 0;
    virtual void read( const FileNode &fn ) = 0;

    // StatModel interface
    virtual bool train( const Ptr<TrainData>& trainData, int flags=0 ) = 0;
    virtual float predict( InputArray samples, OutputArray results=noArray(), int flags=0 ) const = 0;
    virtual void clear() = 0;

    virtual Mat get_learnt_thetas() const = 0;

    static Ptr<LogisticRegression> create( const Params& params = Params() );
};
/****************************************************************************************\
*                            Auxiliary functions declarations                              *
\****************************************************************************************/
///////////////////////////////////////////////////////////////////////////////////////
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
// This is an implementation of the Logistic Regression algorithm in C++ in OpenCV.
// AUTHOR:
// Rahul Kavi rahulkavi[at]live[at]com
//
// contains a subset of data from the popular Iris Dataset (taken from "http://archive.ics.uci.edu/ml/datasets/Iris")
// # You are free to use, change, or redistribute the code in any way you wish for
// # non-commercial purposes, but please maintain the name of the original author.
// # This code comes with no warranty of any kind.
// #
// # Logistic Regression ALGORITHM
// License Agreement
// For Open Source Computer Vision Library
// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
// Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
// Third party copyrights are property of their respective owners.
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
// * Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
#include "test_precomp.hpp"
using namespace std;
using namespace cv;
using namespace cv::ml;
static bool calculateError( const Mat& _p_labels, const Mat& _o_labels, float& error)
{
error = 0.0f;
float accuracy = 0.0f;
Mat _p_labels_temp;
Mat _o_labels_temp;
_p_labels.convertTo(_p_labels_temp, CV_32S);
_o_labels.convertTo(_o_labels_temp, CV_32S);
CV_Assert(_p_labels_temp.total() == _o_labels_temp.total());
CV_Assert(_p_labels_temp.rows == _o_labels_temp.rows);
accuracy = (float)countNonZero(_p_labels_temp == _o_labels_temp)/_p_labels_temp.rows;
error = 1 - accuracy;
return true;
}
//--------------------------------------------------------------------------------------------
class CV_LRTest : public cvtest::BaseTest
{
public:
    CV_LRTest() {}
protected:
    virtual void run( int start_from );
};
void CV_LRTest::run( int /*start_from*/ )
{
    // initialize variables from the popular Iris Dataset
    string dataFileName = ts->get_data_path() + "iris.data";
    Ptr<TrainData> tdata = TrainData::loadFromCSV(dataFileName, 0);

    LogisticRegression::Params params = LogisticRegression::Params();
    params.alpha = 1.0;
    params.num_iters = 10001;
    params.norm = LogisticRegression::REG_L2;
    params.regularized = 1;
    params.train_method = LogisticRegression::BATCH;
    params.mini_batch_size = 10;

    // train the LR classifier
    Ptr<LogisticRegression> p = LogisticRegression::create(params);
    p->train(tdata);

    // predict using the same data
    Mat responses;
    p->predict(tdata->getSamples(), responses);

    // calculate error
    int test_code = cvtest::TS::OK;
    float error = 0.0f;
    if(!calculateError(responses, tdata->getResponses(), error))
    {
        ts->printf(cvtest::TS::LOG, "Bad prediction labels\n" );
        test_code = cvtest::TS::FAIL_INVALID_OUTPUT;
    }
    else if(error > 0.05f)
    {
        ts->printf(cvtest::TS::LOG, "Bad accuracy of (%f)\n", error);
        test_code = cvtest::TS::FAIL_BAD_ACCURACY;
    }

    {
        // dump intermediate results for debugging
        FileStorage s("debug.xml", FileStorage::WRITE);
        s << "original" << tdata->getResponses();
        s << "predicted1" << responses;
        s << "learnt" << p->get_learnt_thetas();
        s << "error" << error;
        s.release();
    }
    ts->set_failed_test_info(test_code);
}
//--------------------------------------------------------------------------------------------
class CV_LRTest_SaveLoad : public cvtest::BaseTest
{
public:
    CV_LRTest_SaveLoad(){}
protected:
    virtual void run(int start_from);
};
void CV_LRTest_SaveLoad::run( int /*start_from*/ )
{
    int code = cvtest::TS::OK;

    // initialize variables from the popular Iris Dataset
    string dataFileName = ts->get_data_path() + "iris.data";
    Ptr<TrainData> tdata = TrainData::loadFromCSV(dataFileName, 0);

    Mat responses1, responses2;
    Mat learnt_mat1, learnt_mat2;

    LogisticRegression::Params params1 = LogisticRegression::Params();
    params1.alpha = 1.0;
    params1.num_iters = 10001;
    params1.norm = LogisticRegression::REG_L2;
    params1.regularized = 1;
    params1.train_method = LogisticRegression::BATCH;
    params1.mini_batch_size = 10;

    // train and save the classifier
    String filename = tempfile(".xml");
    try
    {
        Ptr<LogisticRegression> lr1 = LogisticRegression::create(params1);
        lr1->train(tdata);
        lr1->predict(tdata->getSamples(), responses1);
        learnt_mat1 = lr1->get_learnt_thetas();
        lr1->save(filename);
    }
    catch(...)
    {
        ts->printf(cvtest::TS::LOG, "Crash in write method.\n" );
        ts->set_failed_test_info(cvtest::TS::FAIL_EXCEPTION);
    }

    // and load it into another object
    try
    {
        Ptr<LogisticRegression> lr2 = StatModel::load<LogisticRegression>(filename);
        lr2->predict(tdata->getSamples(), responses2);
        learnt_mat2 = lr2->get_learnt_thetas();
    }
    catch(...)
    {
        ts->printf(cvtest::TS::LOG, "Crash in read method.\n" );
        ts->set_failed_test_info(cvtest::TS::FAIL_EXCEPTION);
    }

    CV_Assert(responses1.rows == responses2.rows);

    // compare the learnt matrices before and after loading from disk
    Mat comp_learnt_mats;
    comp_learnt_mats = (learnt_mat1 == learnt_mat2);
    comp_learnt_mats = comp_learnt_mats.reshape(1, comp_learnt_mats.rows*comp_learnt_mats.cols);
    comp_learnt_mats.convertTo(comp_learnt_mats, CV_32S);
    comp_learnt_mats = comp_learnt_mats/255;

    // compare prediction outputs and check if there is any difference
    // between the computed learnt mat and the retrieved mat
    float errorCount = 0.0;
    errorCount += 1 - (float)countNonZero(responses1 == responses2)/responses1.rows;
    errorCount += 1 - (float)sum(comp_learnt_mats)[0]/comp_learnt_mats.rows;

    if(errorCount > 0)
    {
        ts->printf( cvtest::TS::LOG, "Different prediction results before writing and after reading (errorCount=%f).\n", errorCount );
        code = cvtest::TS::FAIL_BAD_ACCURACY;
    }

    remove( filename.c_str() );
    ts->set_failed_test_info( code );
}
TEST(ML_LR, accuracy) { CV_LRTest test; test.safe_run(); }
TEST(ML_LR, save_load) { CV_LRTest_SaveLoad test; test.safe_run(); }
/*//////////////////////////////////////////////////////////////////////////////////////
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
// This is an implementation of the Logistic Regression algorithm in C++ in OpenCV.
// AUTHOR:
// Rahul Kavi rahulkavi[at]live[at]com
//
// contains a subset of data from the popular Iris Dataset (taken from
// "http://archive.ics.uci.edu/ml/datasets/Iris")
// # You are free to use, change, or redistribute the code in any way you wish for
// # non-commercial purposes, but please maintain the name of the original author.
// # This code comes with no warranty of any kind.
// #
// # Logistic Regression ALGORITHM
// License Agreement
// For Open Source Computer Vision Library
// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
// Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
// Third party copyrights are property of their respective owners.
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
// * Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.*/
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>
#include <opencv2/highgui.hpp>
using namespace std;
using namespace cv;
using namespace cv::ml;
static void showImage(const Mat &data, int columns, const String &name)
{
    Mat bigImage;
    for(int i = 0; i < data.rows; ++i)
    {
        bigImage.push_back(data.row(i).reshape(0, columns));
    }
    imshow(name, bigImage.t());
}

static float calculateAccuracyPercent(const Mat &original, const Mat &predicted)
{
    return 100 * (float)countNonZero(original == predicted) / predicted.rows;
}
int main()
{
    const String filename = "data01.xml";
    cout << "**********************************************************************" << endl;
    cout << filename
         << " contains digits 0 and 1 of 20 samples each, collected on an Android device" << endl;
    cout << "Each of the collected images are of size 28 x 28 re-arranged to 1 x 784 matrix"
         << endl;
    cout << "**********************************************************************" << endl;

    Mat data, labels;
    {
        cout << "loading the dataset...";
        FileStorage f;
        if(f.open(filename, FileStorage::READ))
        {
            f["datamat"] >> data;
            f["labelsmat"] >> labels;
            f.release();
        }
        else
        {
            cerr << "file cannot be opened: " << filename << endl;
            return 1;
        }
        data.convertTo(data, CV_32F);
        labels.convertTo(labels, CV_32F);
        cout << "read " << data.rows << " rows of data" << endl;
    }

    // split the data into training and test sets (alternate rows)
    Mat data_train, data_test;
    Mat labels_train, labels_test;
    for(int i = 0; i < data.rows; i++)
    {
        if(i % 2 == 0)
        {
            data_train.push_back(data.row(i));
            labels_train.push_back(labels.row(i));
        }
        else
        {
            data_test.push_back(data.row(i));
            labels_test.push_back(labels.row(i));
        }
    }
    cout << "training/testing samples count: " << data_train.rows << "/" << data_test.rows << endl;

    // display sample images
    showImage(data_train, 28, "train data");
    showImage(data_test, 28, "test data");

    // simple case with batch gradient
    LogisticRegression::Params params = LogisticRegression::Params(
        0.001, 10, LogisticRegression::BATCH, LogisticRegression::REG_L2, 1, 1);
    // simple case with mini-batch gradient
    // LogisticRegression::Params params = LogisticRegression::Params(
    //     0.001, 10, LogisticRegression::MINI_BATCH, LogisticRegression::REG_L2, 1, 1);
    // mini-batch gradient with higher accuracy
    // LogisticRegression::Params params = LogisticRegression::Params(
    //     0.000001, 10, LogisticRegression::MINI_BATCH, LogisticRegression::REG_L2, 1, 1);

    cout << "training...";
    Ptr<StatModel> lr1 = LogisticRegression::create(params);
    lr1->train(data_train, ROW_SAMPLE, labels_train);
    cout << "done!" << endl;

    cout << "predicting...";
    Mat responses;
    lr1->predict(data_test, responses);
    cout << "done!" << endl;

    // show prediction report
    cout << "original vs predicted:" << endl;
    labels_test.convertTo(labels_test, CV_32S);
    cout << labels_test.t() << endl;
    cout << responses.t() << endl;
    cout << "accuracy: " << calculateAccuracyPercent(labels_test, responses) << "%" << endl;

    // save the classifier
    const String saveFilename = "NewLR_Trained.xml";
    cout << "saving the classifier to " << saveFilename << endl;
    lr1->save(saveFilename);

    // load the classifier onto a new object
    cout << "loading a new classifier from " << saveFilename << endl;
    Ptr<LogisticRegression> lr2 = StatModel::load<LogisticRegression>(saveFilename);

    // predict using the loaded classifier
    cout << "predicting the dataset using the loaded classifier...";
    Mat responses2;
    lr2->predict(data_test, responses2);
    cout << "done!" << endl;

    // calculate accuracy
    cout << labels_test.t() << endl;
    cout << responses2.t() << endl;
    cout << "accuracy: " << calculateAccuracyPercent(labels_test, responses2) << "%" << endl;

    waitKey(0);
    return 0;
}