feat: switch backend to PaddleOCR-NCNN, switch project to CMake
1. Migrated the project backend to the PaddleOCR-NCNN algorithm; basic compatibility tests have passed. 2. The project is now organized with CMake; to better accommodate third-party libraries, a QMake project will no longer be provided. 3. Reorganized the rights/notice files and the code tree to minimize infringement risk. Log: switch backend to PaddleOCR-NCNN, switch project to CMake Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
2
3rdparty/opencv-4.5.4/modules/ml/CMakeLists.txt
vendored
Normal file
@@ -0,0 +1,2 @@
set(the_description "Machine Learning")
ocv_define_module(ml opencv_core WRAP java objc python)
481
3rdparty/opencv-4.5.4/modules/ml/doc/ml_intro.markdown
vendored
Normal file
@@ -0,0 +1,481 @@
Machine Learning Overview {#ml_intro}
=========================

[TOC]

Training Data {#ml_intro_data}
=============

In machine learning algorithms there is a notion of training data. Training data includes several
components:

- A set of training samples. Each training sample is a vector of values (in Computer Vision it's
  sometimes referred to as a feature vector). Usually all the vectors have the same number of
  components (features); the OpenCV ml module assumes that. Each feature can be ordered (i.e. its
  values are floating-point numbers that can be compared with each other and strictly ordered,
  i.e. sorted) or categorical (i.e. its value belongs to a fixed set of values that can be
  integers, strings etc.).
- An optional set of responses corresponding to the samples. Training data with no responses is used
  in unsupervised learning algorithms that learn the structure of the supplied data based on distances
  between different samples. Training data with responses is used in supervised learning
  algorithms, which learn the function mapping samples to responses. Usually the responses are
  scalar values, ordered (when we deal with a regression problem) or categorical (when we deal with a
  classification problem; in this case the responses are often called "labels"). Some algorithms,
  most notably neural networks, can handle not only scalar, but also multi-dimensional or
  vector responses.
- Another optional component is the mask of missing measurements. Most algorithms require all the
  components in all the training samples to be valid, but some other algorithms, such as decision
  trees, can handle the cases of missing measurements.
- In the case of a classification problem the user may want to give different weights to different
  classes. This is useful, for example, when:
  - the user wants to shift prediction accuracy towards a lower false-alarm rate or a higher hit-rate.
  - the user wants to compensate for significantly different amounts of training samples from
    different classes.
- In addition to that, each training sample may be given a weight, if the user wants the algorithm to
  pay special attention to certain training samples and adjust the training model accordingly.
- Also, the user may wish not to use the whole training data at once, but rather use parts of it, e.g.
  to do parameter optimization via a cross-validation procedure.

As you can see, training data can have a rather complex structure; besides, it may be very big and/or
not entirely available, so there is a need to make an abstraction for this concept. In OpenCV ml there is
the cv::ml::TrainData class for that; a minimal construction sketch is shown below.

@sa cv::ml::TrainData

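As an illustration of the structure described above, here is a minimal C++ sketch that packs a handful of made-up samples and responses into a cv::ml::TrainData object. The values and the train/test split ratio are purely illustrative; method names follow the OpenCV 4.x cv::ml API.

@code{.cpp}
#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

int main()
{
    using namespace cv;
    using namespace cv::ml;

    // Four training samples with two ordered (numerical) features each.
    Mat samples = (Mat_<float>(4, 2) << 1.f, 2.f,
                                        2.f, 1.f,
                                        7.f, 8.f,
                                        8.f, 7.f);
    // One categorical response (class label) per sample.
    Mat responses = (Mat_<int>(4, 1) << 0, 0, 1, 1);

    // ROW_SAMPLE: each training sample occupies one row of the samples matrix.
    Ptr<TrainData> data = TrainData::create(samples, ROW_SAMPLE, responses);

    // Optionally hold part of the data back for testing (e.g. cross-validation style checks).
    data->setTrainTestSplitRatio(0.75, /*shuffle=*/true);
    return 0;
}
@endcode
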
Normal Bayes Classifier {#ml_intro_bayes}
=======================

This simple classification model assumes that feature vectors from each class are normally
distributed (though, not necessarily independently distributed). So, the whole data distribution
function is assumed to be a Gaussian mixture, one component per class. Using the training data the
algorithm estimates mean vectors and covariance matrices for every class, and then it uses them for
prediction.

@sa cv::ml::NormalBayesClassifier

K-Nearest Neighbors {#ml_intro_knn}
===================

The algorithm caches all training samples and predicts the response for a new sample by analyzing a
certain number (__K__) of the nearest neighbors of the sample using voting, calculating weighted
sum, and so on. The method is sometimes referred to as "learning by example" because for prediction
it looks for the feature vector with a known response that is closest to the given vector.

@sa cv::ml::KNearest

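A minimal sketch of the nearest-neighbour flow described above, reusing the `samples`, `responses` and `query` matrices from the earlier TrainData sketch (an assumption for brevity); findNearest returns the voted response together with the neighbours that produced it.

@code{.cpp}
cv::Ptr<cv::ml::KNearest> knn = cv::ml::KNearest::create();
knn->setDefaultK(3);
knn->train(samples, cv::ml::ROW_SAMPLE, responses);

cv::Mat query = (cv::Mat_<float>(1, 2) << 7.5f, 7.5f);
cv::Mat votedResponse, neighborResponses, dists;
knn->findNearest(query, 3, votedResponse, neighborResponses, dists);
// votedResponse.at<float>(0) now holds the label chosen by majority vote among the 3 neighbours.
@endcode
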
Support Vector Machines {#ml_intro_svm}
=======================

Originally, support vector machines (SVM) was a technique for building an optimal binary (2-class)
classifier. Later the technique was extended to regression and clustering problems. SVM is a particular
case of kernel-based methods. It maps feature vectors into a higher-dimensional space using a kernel
function and builds an optimal linear discriminating function in this space or an optimal hyper-
plane that fits the training data. In the case of SVM, the kernel is not defined explicitly.
Instead, a distance between any 2 points in the hyper-space needs to be defined.

The solution is optimal, which means that the margin between the separating hyper-plane and the
nearest feature vectors from both classes (in the case of a 2-class classifier) is maximal. The feature
vectors that are the closest to the hyper-plane are called _support vectors_, which means that the
position of the other vectors does not affect the hyper-plane (the decision function).

SVM implementation in OpenCV is based on @cite LibSVM

@sa cv::ml::SVM

Prediction with SVM {#ml_intro_svm_predict}
-------------------

StatModel::predict(samples, results, flags) should be used. Pass flags=StatModel::RAW_OUTPUT to get
the raw response from SVM (in the case of a regression, 1-class or 2-class classification problem).

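A short sketch of the call described above, again assuming the small two-class `samples`/`responses`/`query` matrices from the earlier sketches; with StatModel::RAW_OUTPUT the returned value is the raw decision-function value rather than the class label. Parameter values are illustrative only.

@code{.cpp}
cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
svm->setType(cv::ml::SVM::C_SVC);
svm->setKernel(cv::ml::SVM::RBF);
svm->setC(1.0);
svm->setGamma(0.5);
svm->train(samples, cv::ml::ROW_SAMPLE, responses);

cv::Mat rawResponse;
svm->predict(query, rawResponse, cv::ml::StatModel::RAW_OUTPUT);  // raw decision value
float label = svm->predict(query);                                // default: class label
@endcode
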
Decision Trees {#ml_intro_trees}
==============

The ML classes discussed in this section implement Classification and Regression Tree algorithms
described in @cite Breiman84 .

The class cv::ml::DTrees represents a single decision tree or a collection of decision trees. It's
also a base class for RTrees and Boost.

A decision tree is a binary tree (a tree where each non-leaf node has two child nodes). It can be used
either for classification or for regression. For classification, each tree leaf is marked with a
class label; multiple leaves may have the same label. For regression, a constant is also assigned to
each tree leaf, so the approximation function is piecewise constant.

@sa cv::ml::DTrees

Predicting with Decision Trees {#ml_intro_trees_predict}
------------------------------

To reach a leaf node and to obtain a response for the input feature vector, the prediction procedure
starts with the root node. From each non-leaf node the procedure goes to the left (selects the left
child node as the next observed node) or to the right based on the value of a certain variable whose
index is stored in the observed node. The following variables are possible:

- __Ordered variables.__ The variable value is compared with a threshold that is also stored in
  the node. If the value is less than the threshold, the procedure goes to the left. Otherwise, it
  goes to the right. For example, if the weight is less than 1 kilogram, the procedure goes to the
  left, else to the right.

- __Categorical variables.__ A discrete variable value is tested to see whether it belongs to a
  certain subset of values (also stored in the node) from a limited set of values the variable
  could take. If it does, the procedure goes to the left. Otherwise, it goes to the right. For
  example, if the color is green or red, go to the left, else to the right.

So, in each node, a pair of entities (variable_index, `decision_rule (threshold/subset)`) is used.
This pair is called a _split_ (split on the variable variable_index). Once a leaf node is reached,
the value assigned to this node is used as the output of the prediction procedure.

Sometimes, certain features of the input vector are missed (for example, in the darkness it is
difficult to determine the object color), and the prediction procedure may get stuck in a certain
node (in the mentioned example, if the node is split by color). To avoid such situations, decision
trees use so-called _surrogate splits_. That is, in addition to the best "primary" split, every tree
node may also be split on one or more other variables with nearly the same results.

Training Decision Trees {#ml_intro_trees_train}
-----------------------

The tree is built recursively, starting from the root node. All training data (feature vectors and
responses) is used to split the root node. In each node the optimum decision rule (the best
"primary" split) is found based on some criteria. In machine learning, the Gini "purity" criterion is
used for classification, and the sum of squared errors is used for regression. Then, if necessary, the
surrogate splits are found. They resemble the results of the primary split on the training data. All
the data is divided using the primary and the surrogate splits (as is done in the prediction
procedure) between the left and the right child node. Then, the procedure recursively splits both
left and right nodes. At each node the recursive procedure may stop (that is, stop splitting the
node further) in one of the following cases:

- Depth of the constructed tree branch has reached the specified maximum value.
- Number of training samples in the node is less than the specified threshold, so it is not
  statistically representative to split the node further.
- All the samples in the node belong to the same class or, in case of regression, the variation is
  too small.
- The best found split does not give any noticeable improvement compared to a random choice.

When the tree is built, it may be pruned using a cross-validation procedure, if necessary. That is,
some branches of the tree that may lead to the model overfitting are cut off. Normally, this
procedure is only applied to standalone decision trees. Tree ensembles usually build trees that are
small enough and use their own protection schemes against overfitting.

Variable Importance {#ml_intro_trees_var}
-------------------

Besides prediction, which is an obvious use of decision trees, a tree can also be used for
various data analyses. One of the key properties of the constructed decision tree algorithms is the
ability to compute the importance (relative decisive power) of each variable. For example, in a spam
filter that uses a set of words occurring in the message as a feature vector, the variable importance
rating can be used to determine the most "spam-indicating" words and thus help keep the dictionary
size reasonable.

Importance of each variable is computed over all the splits on this variable in the tree, primary
and surrogate ones. Thus, to compute variable importance correctly, the surrogate splits must be
enabled in the training parameters, even if there is no missing data.

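A minimal training/prediction sketch for a single decision tree under the same assumptions as the earlier sketches (the parameter values are arbitrary). Whether surrogate splits and variable importance are actually computed depends on the concrete OpenCV version, so treat the setUseSurrogates call as illustrative rather than a guarantee.

@code{.cpp}
cv::Ptr<cv::ml::DTrees> dtree = cv::ml::DTrees::create();
dtree->setMaxDepth(8);            // stop criterion: maximum branch depth
dtree->setMinSampleCount(2);      // stop criterion: minimum samples per node
dtree->setCVFolds(0);             // disable built-in cross-validation pruning
dtree->setUseSurrogates(false);   // surrogate splits (see text above); support is version-dependent
dtree->train(samples, cv::ml::ROW_SAMPLE, responses);

float predicted = dtree->predict(query);  // walks the tree from the root to a leaf
@endcode
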
Boosting {#ml_intro_boost}
========

A common machine learning task is supervised learning. In supervised learning, the goal is to learn
the functional relationship \f$F: y = F(x)\f$ between the input \f$x\f$ and the output \f$y\f$ .
Predicting the qualitative output is called _classification_, while predicting the quantitative
output is called _regression_.

Boosting is a powerful learning concept that provides a solution to the supervised classification
learning task. It combines the performance of many "weak" classifiers to produce a powerful
committee @cite HTF01 . A weak classifier is only required to be better than chance, and thus can be
very simple and computationally inexpensive. However, many of them smartly combined result in a
strong classifier that often outperforms most "monolithic" strong classifiers such as SVMs and
Neural Networks.

Decision trees are the most popular weak classifiers used in boosting schemes. Often the simplest
decision trees with only a single split node per tree (called _stumps_) are sufficient.

The boosted model is based on \f$N\f$ training examples \f$\{(x_i,y_i)\}_{1}^{N}\f$ with \f$x_i \in{R^K}\f$
and \f$y_i \in \{-1, +1\}\f$ . \f$x_i\f$ is a \f$K\f$ -component vector. Each component encodes a
feature relevant to the learning task at hand. The desired two-class output is encoded as -1 and +1.

Different variants of boosting are known as Discrete AdaBoost, Real AdaBoost, LogitBoost, and Gentle
AdaBoost @cite FHT98 . All of them are very similar in their overall structure. Therefore, this
chapter focuses only on the standard two-class Discrete AdaBoost algorithm, outlined below.
Initially the same weight is assigned to each sample (step 2). Then, a weak classifier
\f$f_m(x)\f$ is trained on the weighted training data (step 3a). Its weighted training error and
scaling factor \f$c_m\f$ are computed (step 3b). The weights are increased for training samples that
have been misclassified (step 3c). All weights are then normalized, and the process of finding the
next weak classifier continues for another \f$M-1\f$ times. The final classifier \f$F(x)\f$ is the
sign of the weighted sum over the individual weak classifiers (step 4).

__Two-class Discrete AdaBoost Algorithm__

- Set \f$N\f$ examples \f$\{(x_i,y_i)\}_{1}^{N}\f$ with \f$x_i \in{R^K}, y_i \in \{-1, +1\}\f$ .

- Assign weights as \f$w_i = 1/N, i = 1,...,N\f$ .

- Repeat for \f$m = 1,2,...,M\f$ :

  - Fit the classifier \f$f_m(x) \in \{-1,1\}\f$, using weights \f$w_i\f$ on the training data.

  - Compute \f$err_m = E_w [1_{(y \neq f_m(x))}], c_m = \log((1 - err_m)/err_m)\f$ .

  - Set \f$w_i \Leftarrow w_i \exp[c_m 1_{(y_i \neq f_m(x_i))}], i = 1,2,...,N,\f$ and
    renormalize so that \f$\Sigma_i w_i = 1\f$ .

- Classify new samples _x_ using the formula: \f$\textrm{sign} (\Sigma_{m=1}^{M} c_m f_m(x))\f$ .

@note Similar to the classical boosting methods, the current implementation supports two-class
classifiers only. For M \> 2 classes, there is the __AdaBoost.MH__ algorithm (described in
@cite FHT98) that reduces the problem to the two-class problem, yet with a much larger training set.

To reduce computation time for boosted models without substantially losing accuracy, the influence
trimming technique can be employed. As the training algorithm proceeds and the number of trees in
the ensemble is increased, a larger number of the training samples are classified correctly and with
increasing confidence, and thereby those samples receive smaller weights on the subsequent iterations.
Examples with a very low relative weight have a small impact on the weak classifier training. Thus,
such examples may be excluded during the weak classifier training without having much effect on the
induced classifier. This process is controlled with the weight_trim_rate parameter. Only examples
with the summary fraction weight_trim_rate of the total weight mass are used in the weak classifier
training. Note that the weights for __all__ training examples are recomputed at each training
iteration. Examples deleted at a particular iteration may be used again for learning some of the
weak classifiers further @cite FHT98 .

@sa cv::ml::Boost

Prediction with Boost {#ml_intro_boost_predict}
---------------------

StatModel::predict(samples, results, flags) should be used. Pass flags=StatModel::RAW_OUTPUT to get
the raw sum from the Boost classifier.

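A hedged sketch of training a boosted ensemble of stumps and reading its raw sum, assuming the same small two-class data as before; parameter values are illustrative only.

@code{.cpp}
cv::Ptr<cv::ml::Boost> boost = cv::ml::Boost::create();
boost->setBoostType(cv::ml::Boost::DISCRETE); // Discrete AdaBoost, as outlined above
boost->setWeakCount(50);                      // number of weak classifiers M
boost->setMaxDepth(1);                        // depth-1 trees, i.e. stumps
boost->setWeightTrimRate(0.95);               // influence trimming threshold
boost->train(samples, cv::ml::ROW_SAMPLE, responses);

cv::Mat rawSum;
boost->predict(query, rawSum, cv::ml::StatModel::RAW_OUTPUT); // weighted sum before taking the sign
@endcode
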
Random Trees {#ml_intro_rtrees}
============

Random trees have been introduced by Leo Breiman and Adele Cutler:
<http://www.stat.berkeley.edu/users/breiman/RandomForests/> . The algorithm can deal with both
classification and regression problems. Random trees is a collection (ensemble) of tree predictors
that is called _forest_ further in this section (the term has also been introduced by L. Breiman).
The classification works as follows: the random trees classifier takes the input feature vector,
classifies it with every tree in the forest, and outputs the class label that received the majority
of "votes". In case of a regression, the classifier response is the average of the responses over
all the trees in the forest.

All the trees are trained with the same parameters but on different training sets. These sets are
generated from the original training set using the bootstrap procedure: for each training set, you
randomly select the same number of vectors as in the original set (= N). The vectors are chosen
with replacement. That is, some vectors will occur more than once and some will be absent. At each
node of each trained tree, not all the variables are used to find the best split, but a random
subset of them. With each node a new subset is generated. However, its size is fixed for all the
nodes and all the trees. It is a training parameter set to \f$\sqrt{number\_of\_variables}\f$ by
default. None of the built trees are pruned.

In random trees there is no need for any accuracy estimation procedures, such as cross-validation or
bootstrap, or a separate test set to get an estimate of the training error. The error is estimated
internally during the training. When the training set for the current tree is drawn by sampling with
replacement, some vectors are left out (so-called _oob (out-of-bag) data_). The size of oob data is
about N/3. The classification error is estimated by using this oob-data as follows:

- Get a prediction for each vector, which is oob relative to the i-th tree, using the very i-th
  tree.

- After all the trees have been trained, for each vector that has ever been oob, find the
  class-<em>winner</em> for it (the class that has got the majority of votes in the trees where
  the vector was oob) and compare it to the ground-truth response.

- Compute the classification error estimate as a ratio of the number of misclassified oob vectors
  to all the vectors in the original data. In case of regression, the oob-error is computed as the
  squared difference between the predicted and the true responses for the oob vectors, divided by the
  total number of vectors.

For a random trees usage example, please see the letter_recog.cpp sample in the OpenCV distribution.

@sa cv::ml::RTrees

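A short random-trees sketch under the same assumptions as the earlier examples, showing where the per-variable importance discussed above is exposed; the termination criterion controls the forest size, and the values are illustrative.

@code{.cpp}
cv::Ptr<cv::ml::RTrees> rtrees = cv::ml::RTrees::create();
rtrees->setCalculateVarImportance(true);   // needed for a non-empty importance vector
rtrees->setTermCriteria(cv::TermCriteria(cv::TermCriteria::MAX_ITER, 100, 0.0)); // up to 100 trees
rtrees->train(samples, cv::ml::ROW_SAMPLE, responses);

cv::Mat importance = rtrees->getVarImportance(); // one entry per input variable
// The oob error itself is estimated internally during training, as described above.
@endcode
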
__References:__

- _Machine Learning_, Wald I, July 2002.
  <http://stat-www.berkeley.edu/users/breiman/wald2002-1.pdf>
- _Looking Inside the Black Box_, Wald II, July 2002.
  <http://stat-www.berkeley.edu/users/breiman/wald2002-2.pdf>
- _Software for the Masses_, Wald III, July 2002.
  <http://stat-www.berkeley.edu/users/breiman/wald2002-3.pdf>
- And other articles from the web site
  <http://www.stat.berkeley.edu/users/breiman/RandomForests/cc_home.htm>

Expectation Maximization {#ml_intro_em}
========================

The Expectation Maximization (EM) algorithm estimates the parameters of the multivariate probability
density function in the form of a Gaussian mixture distribution with a specified number of mixtures.

Consider the set of N feature vectors { \f$x_1, x_2,...,x_{N}\f$ } from a d-dimensional Euclidean
space drawn from a Gaussian mixture:

\f[p(x;a_k,S_k, \pi _k) = \sum _{k=1}^{m} \pi _kp_k(x), \quad \pi _k \geq 0, \quad \sum _{k=1}^{m} \pi _k=1,\f]

\f[p_k(x)= \varphi (x;a_k,S_k)= \frac{1}{(2\pi)^{d/2}\mid{S_k}\mid^{1/2}} exp \left \{ - \frac{1}{2} (x-a_k)^TS_k^{-1}(x-a_k) \right \} ,\f]

where \f$m\f$ is the number of mixtures, \f$p_k\f$ is the normal distribution density with the mean
\f$a_k\f$ and covariance matrix \f$S_k\f$, and \f$\pi_k\f$ is the weight of the k-th mixture. Given the
number of mixtures \f$M\f$ and the samples \f$x_i\f$, \f$i=1..N\f$ the algorithm finds the maximum-
likelihood estimates (MLE) of all the mixture parameters, that is, \f$a_k\f$, \f$S_k\f$ and
\f$\pi_k\f$ :

\f[L(x, \theta )=logp(x, \theta )= \sum _{i=1}^{N}log \left ( \sum _{k=1}^{m} \pi _kp_k(x) \right ) \to \max _{ \theta \in \Theta },\f]

\f[\Theta = \left \{ (a_k,S_k, \pi _k): a_k \in \mathbbm{R} ^d,S_k=S_k^T>0,S_k \in \mathbbm{R} ^{d \times d}, \pi _k \geq 0, \sum _{k=1}^{m} \pi _k=1 \right \} .\f]

The EM algorithm is an iterative procedure. Each iteration includes two steps. At the first step
(Expectation step or E-step), you find a probability \f$p_{i,k}\f$ (denoted \f$\alpha_{i,k}\f$ in
the formula below) of sample i to belong to mixture k using the currently available mixture
parameter estimates:

\f[\alpha _{ki} = \frac{\pi_k\varphi(x;a_k,S_k)}{\sum\limits_{j=1}^{m}\pi_j\varphi(x;a_j,S_j)} .\f]

At the second step (Maximization step or M-step), the mixture parameter estimates are refined using
the computed probabilities:

\f[\pi _k= \frac{1}{N} \sum _{i=1}^{N} \alpha _{ki}, \quad a_k= \frac{\sum\limits_{i=1}^{N}\alpha_{ki}x_i}{\sum\limits_{i=1}^{N}\alpha_{ki}} , \quad S_k= \frac{\sum\limits_{i=1}^{N}\alpha_{ki}(x_i-a_k)(x_i-a_k)^T}{\sum\limits_{i=1}^{N}\alpha_{ki}}\f]

Alternatively, the algorithm may start with the M-step when the initial values for \f$p_{i,k}\f$ can
be provided. Another alternative when \f$p_{i,k}\f$ are unknown is to use a simpler clustering
algorithm to pre-cluster the input samples and thus obtain initial \f$p_{i,k}\f$ . Often (including
in machine learning) the k-means algorithm is used for that purpose.

One of the main problems of the EM algorithm is the large number of parameters to estimate. The
majority of the parameters reside in covariance matrices, which are \f$d \times d\f$ elements each
where \f$d\f$ is the feature space dimensionality. However, in many practical problems, the
covariance matrices are close to diagonal or even to \f$\mu_k*I\f$ , where \f$I\f$ is an identity
matrix and \f$\mu_k\f$ is a mixture-dependent "scale" parameter. So, a robust computation scheme
could start with harder constraints on the covariance matrices and then use the estimated parameters
as an input for a less constrained optimization problem (often a diagonal covariance matrix is
already a good enough approximation).

@sa cv::ml::EM

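A minimal sketch of fitting a Gaussian mixture with cv::ml::EM along the lines described above; `points` is an assumed N x d CV_32F matrix of unlabeled feature vectors, and the number of mixtures and the diagonal-covariance constraint are illustrative choices.

@code{.cpp}
cv::Ptr<cv::ml::EM> em = cv::ml::EM::create();
em->setClustersNumber(3);                                    // number of mixtures m
em->setCovarianceMatrixType(cv::ml::EM::COV_MAT_DIAGONAL);   // harder constraint on S_k (see text)

cv::Mat logLikelihoods, labels, probs;
em->trainEM(points, logLikelihoods, labels, probs);          // alternating E-step / M-step

// probs now holds the per-sample posteriors alpha_{ki}; labels holds the most likely mixture index.
@endcode
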
References:
- Bilmes98 J. A. Bilmes. _A Gentle Tutorial of the EM Algorithm and its Application to Parameter
  Estimation for Gaussian Mixture and Hidden Markov Models_. Technical Report TR-97-021,
  International Computer Science Institute and Computer Science Division, University of California
  at Berkeley, April 1998.

Neural Networks {#ml_intro_ann}
===============

ML implements feed-forward artificial neural networks or, more particularly, multi-layer perceptrons
(MLP), the most commonly used type of neural networks. MLP consists of the input layer, output
layer, and one or more hidden layers. Each layer of MLP includes one or more neurons directionally
linked with the neurons from the previous and the next layer. The example below represents a 3-layer
perceptron with three inputs, two outputs, and the hidden layer including five neurons:



All the neurons in MLP are similar. Each of them has several input links (it takes the output values
from several neurons in the previous layer as input) and several output links (it passes the
response to several neurons in the next layer). The values retrieved from the previous layer are
summed up with certain weights, individual for each neuron, plus the bias term. The sum is
transformed using the activation function \f$f\f$ that may also be different for different neurons.



In other words, given the outputs \f$x_j\f$ of the layer \f$n\f$ , the outputs \f$y_i\f$ of the
layer \f$n+1\f$ are computed as:

\f[u_i = \sum _j (w^{n+1}_{i,j}*x_j) + w^{n+1}_{i,bias}\f]

\f[y_i = f(u_i)\f]

Different activation functions may be used. ML implements three standard functions:

- Identity function ( cv::ml::ANN_MLP::IDENTITY ): \f$f(x)=x\f$

- Symmetrical sigmoid ( cv::ml::ANN_MLP::SIGMOID_SYM ): \f$f(x)=\beta*(1-e^{-\alpha x})/(1+e^{-\alpha x})\f$ ,
  which is the default choice for MLP. The standard sigmoid with
  \f$\beta =1, \alpha =1\f$ is shown below:

  

- Gaussian function ( cv::ml::ANN_MLP::GAUSSIAN ): \f$f(x)=\beta e^{-\alpha x*x}\f$ , which is not
  completely supported at the moment.

In ML, all the neurons have the same activation functions, with the same free parameters
( \f$\alpha, \beta\f$ ) that are specified by the user and are not altered by the training algorithms.

So, the whole trained network works as follows:

1. Take the feature vector as input. The vector size is equal to the size of the input layer.
2. Pass values as input to the first hidden layer.
3. Compute outputs of the hidden layer using the weights and the activation functions.
4. Pass outputs further downstream until you compute the output layer.

So, to compute the network, you need to know all the weights \f$w^{n+1}_{i,j}\f$ . The weights are
computed by the training algorithm. The algorithm takes a training set, multiple input vectors with
the corresponding output vectors, and iteratively adjusts the weights to enable the network to give
the desired response to the provided input vectors.

The larger the network size (the number of hidden layers and their sizes) is, the greater the potential
network flexibility. The error on the training set could be made arbitrarily small. But at the
same time the learned network also "learns" the noise present in the training set, so the error on
the test set usually starts increasing after the network size reaches a limit. Besides, larger
networks take much longer to train than smaller ones, so it is reasonable to pre-process the data,
using cv::PCA or a similar technique, and train a smaller network on only the essential features.

Another MLP feature is an inability to handle categorical data as is. However, there is a
workaround. If a certain feature in the input or output (in case of an n-class classifier for
\f$n>2\f$ ) layer is categorical and can take \f$M>2\f$ different values, it makes sense to
represent it as a binary tuple of M elements, where the i-th element is 1 if and only if the
feature is equal to the i-th value out of M possible. It increases the size of the input/output
layer but speeds up the training algorithm convergence and at the same time enables "fuzzy" values
of such variables, that is, a tuple of probabilities instead of a fixed value.

ML implements two algorithms for training MLPs. The first algorithm is a classical random
sequential back-propagation algorithm. The second (default) one is a batch RPROP algorithm.

@sa cv::ml::ANN_MLP

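A small MLP training sketch consistent with the description above: one hidden layer, symmetric sigmoid activation, and one-hot (binary tuple) encoding of the categorical output. Layer sizes and parameters are illustrative, and the `samples`/`responses`/`query` matrices from the earlier sketches are assumed.

@code{.cpp}
// Two input features, five hidden neurons, two output neurons (one per class).
cv::Mat layerSizes = (cv::Mat_<int>(1, 3) << 2, 5, 2);

cv::Ptr<cv::ml::ANN_MLP> mlp = cv::ml::ANN_MLP::create();
mlp->setLayerSizes(layerSizes);
mlp->setActivationFunction(cv::ml::ANN_MLP::SIGMOID_SYM, 1.0, 1.0); // alpha, beta
mlp->setTrainMethod(cv::ml::ANN_MLP::BACKPROP, 0.001, 0.1);
mlp->setTermCriteria(cv::TermCriteria(cv::TermCriteria::MAX_ITER, 300, 0.0));

// Responses must be floating point: one row per sample, one column per class ("fuzzy" labels).
cv::Mat oneHot = cv::Mat::zeros(samples.rows, 2, CV_32F);
for (int i = 0; i < samples.rows; i++)
    oneHot.at<float>(i, responses.at<int>(i)) = 1.f;

mlp->train(samples, cv::ml::ROW_SAMPLE, oneHot);

cv::Mat output;
mlp->predict(query, output);  // take the index of the largest output as the predicted class
@endcode
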
Logistic Regression {#ml_intro_lr}
===================

ML implements logistic regression, which is a probabilistic classification technique. Logistic
Regression is a binary classification algorithm which is closely related to Support Vector Machines
(SVM). Like SVM, Logistic Regression can be extended to work on multi-class classification problems
like digit recognition (i.e. recognizing digits like 0, 1, 2, 3, ... from the given images). This
version of Logistic Regression supports both binary and multi-class classifications (for multi-class
it creates multiple 2-class classifiers). In order to train the logistic regression classifier,
Batch Gradient Descent and Mini-Batch Gradient Descent algorithms are used (see
<http://en.wikipedia.org/wiki/Gradient_descent_optimization>). Logistic Regression is a
discriminative classifier (see <http://www.cs.cmu.edu/~tom/NewChapters.html> for more details).
Logistic Regression is implemented as a C++ class in LogisticRegression.

In Logistic Regression, we try to optimize the training parameter \f$\theta\f$ such that the
hypothesis \f$0 \leq h_\theta(x) \leq 1\f$ is achieved. We have \f$h_\theta(x) = g(\theta^T x)\f$,
where \f$g(z) = \frac{1}{1+e^{-z}}\f$ is the logistic or sigmoid function. The term "Logistic" in
Logistic Regression refers to this function. For given data of a binary classification problem of
classes 0 and 1, one can determine that the given data instance belongs to class 1 if \f$h_\theta(x)
\geq 0.5\f$ or class 0 if \f$h_\theta(x) < 0.5\f$ .

In Logistic Regression, choosing the right parameters is of utmost importance for reducing the
training error and ensuring high training accuracy:

- The learning rate can be set with the @ref cv::ml::LogisticRegression::setLearningRate "setLearningRate"
  method. It determines how fast we approach the solution. It is a positive real number.

- Optimization algorithms like Batch Gradient Descent and Mini-Batch Gradient Descent are supported
  in LogisticRegression. It is important that we specify the number of iterations these optimization
  algorithms have to run. The number of iterations can be set with @ref
  cv::ml::LogisticRegression::setIterations "setIterations". This parameter can be thought of as
  the number of steps taken, while the learning rate specifies whether each is a long step or a short
  one. This and the previous parameter define how fast we arrive at a possible solution.

- In order to compensate for overfitting, regularization is performed, which can be enabled with
  @ref cv::ml::LogisticRegression::setRegularization "setRegularization". One can specify what
  kind of regularization has to be performed by passing one of the @ref
  cv::ml::LogisticRegression::RegKinds "regularization kinds" to this method.

- The Logistic Regression implementation provides a choice of 2 training methods: Batch Gradient
  Descent or Mini-Batch Gradient Descent. To specify this, call @ref
  cv::ml::LogisticRegression::setTrainMethod "setTrainMethod" with either @ref
  cv::ml::LogisticRegression::BATCH "LogisticRegression::BATCH" or @ref
  cv::ml::LogisticRegression::MINI_BATCH "LogisticRegression::MINI_BATCH". If the training method is
  set to @ref cv::ml::LogisticRegression::MINI_BATCH "MINI_BATCH", the size of the mini batch has
  to be set to a positive integer with @ref cv::ml::LogisticRegression::setMiniBatchSize
  "setMiniBatchSize".

A sample set of training parameters for the Logistic Regression classifier can be initialized as follows:
@snippet samples/cpp/logistic_regression.cpp init

@sa cv::ml::LogisticRegression
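For completeness, a hedged sketch exercising the setters listed above (the official initialization sample is the @snippet referenced earlier); values are illustrative, and LogisticRegression expects single-column floating-point responses.

@code{.cpp}
cv::Ptr<cv::ml::LogisticRegression> lr = cv::ml::LogisticRegression::create();
lr->setLearningRate(0.001);
lr->setIterations(100);
lr->setRegularization(cv::ml::LogisticRegression::REG_L2);
lr->setTrainMethod(cv::ml::LogisticRegression::MINI_BATCH);
lr->setMiniBatchSize(4);

cv::Mat floatResponses;
responses.convertTo(floatResponses, CV_32F);          // labels must be CV_32F for this class
lr->train(samples, cv::ml::ROW_SAMPLE, floatResponses);
float predictedClass = lr->predict(query);
@endcode
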
BIN
3rdparty/opencv-4.5.4/modules/ml/doc/pics/SVM_Comparison.png
vendored
Normal file
Binary file not shown.
After Width: | Height: | Size: 92 KiB |
BIN
3rdparty/opencv-4.5.4/modules/ml/doc/pics/mlp.png
vendored
Normal file
Binary file not shown.
After Width: | Height: | Size: 11 KiB |
BIN
3rdparty/opencv-4.5.4/modules/ml/doc/pics/neuron_model.png
vendored
Normal file
Binary file not shown.
After Width: | Height: | Size: 9.8 KiB |
BIN
3rdparty/opencv-4.5.4/modules/ml/doc/pics/sigmoid_bipolar.png
vendored
Normal file
Binary file not shown.
After Width: | Height: | Size: 7.0 KiB |
1956
3rdparty/opencv-4.5.4/modules/ml/include/opencv2/ml.hpp
vendored
Normal file
File diff suppressed because it is too large
48
3rdparty/opencv-4.5.4/modules/ml/include/opencv2/ml/ml.hpp
vendored
Normal file
@@ -0,0 +1,48 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
//  IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
//  By downloading, copying, installing or using the software you agree to this license.
//  If you do not agree to this license, do not download, install,
//  copy or use the software.
//
//
//                        License Agreement
//                For Open Source Computer Vision Library
//
// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
// Copyright (C) 2009, Willow Garage Inc., all rights reserved.
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
//   * Redistribution's of source code must retain the above copyright notice,
//     this list of conditions and the following disclaimer.
//
//   * Redistribution's in binary form must reproduce the above copyright notice,
//     this list of conditions and the following disclaimer in the documentation
//     and/or other materials provided with the distribution.
//
//   * The name of the copyright holders may not be used to endorse or promote products
//     derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/

#ifdef __OPENCV_BUILD
#error this is a compatibility header which should not be used inside the OpenCV library
#endif

#include "opencv2/ml.hpp"
60
3rdparty/opencv-4.5.4/modules/ml/include/opencv2/ml/ml.inl.hpp
vendored
Normal file
@@ -0,0 +1,60 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.

#ifndef OPENCV_ML_INL_HPP
#define OPENCV_ML_INL_HPP

namespace cv { namespace ml {

// declared in ml.hpp
template<class SimulatedAnnealingSolverSystem>
int simulatedAnnealingSolver(SimulatedAnnealingSolverSystem& solverSystem,
     double initialTemperature, double finalTemperature, double coolingRatio,
     size_t iterationsPerStep,
     CV_OUT double* lastTemperature,
     cv::RNG& rngEnergy
)
{
    CV_Assert(finalTemperature > 0);
    CV_Assert(initialTemperature > finalTemperature);
    CV_Assert(iterationsPerStep > 0);
    CV_Assert(coolingRatio < 1.0f);
    double Ti = initialTemperature;
    double previousEnergy = solverSystem.energy();
    int exchange = 0;
    while (Ti > finalTemperature)
    {
        for (size_t i = 0; i < iterationsPerStep; i++)
        {
            solverSystem.changeState();
            double newEnergy = solverSystem.energy();
            if (newEnergy < previousEnergy)
            {
                previousEnergy = newEnergy;
                exchange++;
            }
            else
            {
                double r = rngEnergy.uniform(0.0, 1.0);
                if (r < std::exp(-(newEnergy - previousEnergy) / Ti))
                {
                    previousEnergy = newEnergy;
                    exchange++;
                }
                else
                {
                    solverSystem.reverseState();
                }
            }
        }
        Ti *= coolingRatio;
    }
    if (lastTemperature)
        *lastTemperature = Ti;
    return exchange;
}

}} //namespace

#endif // OPENCV_ML_INL_HPP
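// --- Illustrative usage sketch (not part of the vendored file above) ----------------------------
// The solver only requires a system type exposing energy(), changeState() and reverseState();
// QuadraticSystem below is a hypothetical example that minimizes f(x) = (x - 3)^2 by a random walk.
#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

struct QuadraticSystem
{
    double x = 0.0, previousX = 0.0;
    cv::RNG rng;

    double energy() const { return (x - 3.0) * (x - 3.0); }            // current cost
    void changeState() { previousX = x; x += rng.uniform(-0.5, 0.5); } // random proposal
    void reverseState() { x = previousX; }                             // undo a rejected proposal
};

int runAnnealingExample()
{
    QuadraticSystem system;
    cv::RNG rngEnergy(0x5678);
    double lastTemperature = 0.0;
    // Cooling-schedule values are illustrative only.
    int accepted = cv::ml::simulatedAnnealingSolver(system, 10.0 /*initial T*/, 0.01 /*final T*/,
                                                    0.95 /*cooling ratio*/, 50 /*iters per step*/,
                                                    &lastTemperature, rngEnergy);
    return accepted; // system.x should now be close to 3.0
}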
42
3rdparty/opencv-4.5.4/modules/ml/misc/java/test/MLTest.java
vendored
Normal file
@@ -0,0 +1,42 @@
package org.opencv.test.ml;

import org.opencv.ml.Ml;
import org.opencv.ml.SVM;
import org.opencv.core.Mat;
import org.opencv.core.MatOfFloat;
import org.opencv.core.MatOfInt;
import org.opencv.core.CvType;
import org.opencv.test.OpenCVTestCase;
import org.opencv.test.OpenCVTestRunner;

public class MLTest extends OpenCVTestCase {

    public void testSaveLoad() {
        Mat samples = new MatOfFloat(new float[] {
            5.1f, 3.5f, 1.4f, 0.2f,
            4.9f, 3.0f, 1.4f, 0.2f,
            4.7f, 3.2f, 1.3f, 0.2f,
            4.6f, 3.1f, 1.5f, 0.2f,
            5.0f, 3.6f, 1.4f, 0.2f,
            7.0f, 3.2f, 4.7f, 1.4f,
            6.4f, 3.2f, 4.5f, 1.5f,
            6.9f, 3.1f, 4.9f, 1.5f,
            5.5f, 2.3f, 4.0f, 1.3f,
            6.5f, 2.8f, 4.6f, 1.5f
        }).reshape(1, 10);
        Mat responses = new MatOfInt(new int[] {
            0, 0, 0, 0, 0, 1, 1, 1, 1, 1
        }).reshape(1, 10);
        SVM saved = SVM.create();
        assertFalse(saved.isTrained());

        saved.train(samples, Ml.ROW_SAMPLE, responses);
        assertTrue(saved.isTrained());

        String filename = OpenCVTestRunner.getTempFileName("yml");
        saved.save(filename);
        SVM loaded = SVM.load(filename);
        assertTrue(loaded.isTrained());
    }

}
9
3rdparty/opencv-4.5.4/modules/ml/misc/objc/gen_dict.json
vendored
Normal file
@@ -0,0 +1,9 @@
{
    "enum_fix" : {
        "EM" : { "Types": "EMTypes" },
        "SVM" : { "Types": "SVMTypes" },
        "KNearest" : { "Types": "KNearestTypes" },
        "DTrees" : { "Flags": "DTreeFlags" },
        "StatModel" : { "Flags": "StatModelFlags" }
    }
}
22
3rdparty/opencv-4.5.4/modules/ml/misc/python/pyopencv_ml.hpp
vendored
Normal file
@@ -0,0 +1,22 @@
template<>
bool pyopencv_to(PyObject *obj, CvTermCriteria& dst, const ArgInfo& info)
{
    CV_UNUSED(info);
    if(!obj)
        return true;
    return PyArg_ParseTuple(obj, "iid", &dst.type, &dst.max_iter, &dst.epsilon) > 0;
}

template<>
bool pyopencv_to(PyObject* obj, CvSlice& r, const ArgInfo& info)
{
    CV_UNUSED(info);
    if(!obj || obj == Py_None)
        return true;
    if(PyObject_Size(obj) == 0)
    {
        r = CV_WHOLE_SEQ;
        return true;
    }
    return PyArg_ParseTuple(obj, "ii", &r.start_index, &r.end_index) > 0;
}
201
3rdparty/opencv-4.5.4/modules/ml/misc/python/test/test_digits.py
vendored
Normal file
@@ -0,0 +1,201 @@
|
||||
#!/usr/bin/env python
|
||||
|
||||
'''
|
||||
SVM and KNearest digit recognition.
|
||||
|
||||
Sample loads a dataset of handwritten digits from '../data/digits.png'.
|
||||
Then it trains a SVM and KNearest classifiers on it and evaluates
|
||||
their accuracy.
|
||||
|
||||
Following preprocessing is applied to the dataset:
|
||||
- Moment-based image deskew (see deskew())
|
||||
- Digit images are split into 4 10x10 cells and 16-bin
|
||||
histogram of oriented gradients is computed for each
|
||||
cell
|
||||
- Transform histograms to space with Hellinger metric (see [1] (RootSIFT))
|
||||
|
||||
|
||||
[1] R. Arandjelovic, A. Zisserman
|
||||
"Three things everyone should know to improve object retrieval"
|
||||
http://www.robots.ox.ac.uk/~vgg/publications/2012/Arandjelovic12/arandjelovic12.pdf
|
||||
|
||||
'''
|
||||
|
||||
|
||||
# Python 2/3 compatibility
|
||||
from __future__ import print_function
|
||||
|
||||
# built-in modules
|
||||
from multiprocessing.pool import ThreadPool
|
||||
|
||||
import cv2 as cv
|
||||
|
||||
import numpy as np
|
||||
from numpy.linalg import norm
|
||||
|
||||
|
||||
SZ = 20 # size of each digit is SZ x SZ
|
||||
CLASS_N = 10
|
||||
DIGITS_FN = 'samples/data/digits.png'
|
||||
|
||||
def split2d(img, cell_size, flatten=True):
|
||||
h, w = img.shape[:2]
|
||||
sx, sy = cell_size
|
||||
cells = [np.hsplit(row, w//sx) for row in np.vsplit(img, h//sy)]
|
||||
cells = np.array(cells)
|
||||
if flatten:
|
||||
cells = cells.reshape(-1, sy, sx)
|
||||
return cells
|
||||
|
||||
def deskew(img):
|
||||
m = cv.moments(img)
|
||||
if abs(m['mu02']) < 1e-2:
|
||||
return img.copy()
|
||||
skew = m['mu11']/m['mu02']
|
||||
M = np.float32([[1, skew, -0.5*SZ*skew], [0, 1, 0]])
|
||||
img = cv.warpAffine(img, M, (SZ, SZ), flags=cv.WARP_INVERSE_MAP | cv.INTER_LINEAR)
|
||||
return img
|
||||
|
||||
class StatModel(object):
|
||||
def load(self, fn):
|
||||
self.model.load(fn) # Known bug: https://github.com/opencv/opencv/issues/4969
|
||||
def save(self, fn):
|
||||
self.model.save(fn)
|
||||
|
||||
class KNearest(StatModel):
|
||||
def __init__(self, k = 3):
|
||||
self.k = k
|
||||
self.model = cv.ml.KNearest_create()
|
||||
|
||||
def train(self, samples, responses):
|
||||
self.model.train(samples, cv.ml.ROW_SAMPLE, responses)
|
||||
|
||||
def predict(self, samples):
|
||||
_retval, results, _neigh_resp, _dists = self.model.findNearest(samples, self.k)
|
||||
return results.ravel()
|
||||
|
||||
class SVM(StatModel):
|
||||
def __init__(self, C = 1, gamma = 0.5):
|
||||
self.model = cv.ml.SVM_create()
|
||||
self.model.setGamma(gamma)
|
||||
self.model.setC(C)
|
||||
self.model.setKernel(cv.ml.SVM_RBF)
|
||||
self.model.setType(cv.ml.SVM_C_SVC)
|
||||
|
||||
def train(self, samples, responses):
|
||||
self.model.train(samples, cv.ml.ROW_SAMPLE, responses)
|
||||
|
||||
def predict(self, samples):
|
||||
return self.model.predict(samples)[1].ravel()
|
||||
|
||||
|
||||
def evaluate_model(model, digits, samples, labels):
|
||||
resp = model.predict(samples)
|
||||
err = (labels != resp).mean()
|
||||
|
||||
confusion = np.zeros((10, 10), np.int32)
|
||||
for i, j in zip(labels, resp):
|
||||
confusion[int(i), int(j)] += 1
|
||||
|
||||
return err, confusion
|
||||
|
||||
def preprocess_simple(digits):
|
||||
return np.float32(digits).reshape(-1, SZ*SZ) / 255.0
|
||||
|
||||
def preprocess_hog(digits):
|
||||
samples = []
|
||||
for img in digits:
|
||||
gx = cv.Sobel(img, cv.CV_32F, 1, 0)
|
||||
gy = cv.Sobel(img, cv.CV_32F, 0, 1)
|
||||
mag, ang = cv.cartToPolar(gx, gy)
|
||||
bin_n = 16
|
||||
bin = np.int32(bin_n*ang/(2*np.pi))
|
||||
bin_cells = bin[:10,:10], bin[10:,:10], bin[:10,10:], bin[10:,10:]
|
||||
mag_cells = mag[:10,:10], mag[10:,:10], mag[:10,10:], mag[10:,10:]
|
||||
hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
|
||||
hist = np.hstack(hists)
|
||||
|
||||
# transform to Hellinger kernel
|
||||
eps = 1e-7
|
||||
hist /= hist.sum() + eps
|
||||
hist = np.sqrt(hist)
|
||||
hist /= norm(hist) + eps
|
||||
|
||||
samples.append(hist)
|
||||
return np.float32(samples)
|
||||
|
||||
from tests_common import NewOpenCVTests
|
||||
|
||||
class digits_test(NewOpenCVTests):
|
||||
|
||||
def load_digits(self, fn):
|
||||
digits_img = self.get_sample(fn, 0)
|
||||
digits = split2d(digits_img, (SZ, SZ))
|
||||
labels = np.repeat(np.arange(CLASS_N), len(digits)/CLASS_N)
|
||||
return digits, labels
|
||||
|
||||
def test_digits(self):
|
||||
|
||||
digits, labels = self.load_digits(DIGITS_FN)
|
||||
|
||||
# shuffle digits
|
||||
rand = np.random.RandomState(321)
|
||||
shuffle = rand.permutation(len(digits))
|
||||
digits, labels = digits[shuffle], labels[shuffle]
|
||||
|
||||
digits2 = list(map(deskew, digits))
|
||||
samples = preprocess_hog(digits2)
|
||||
|
||||
train_n = int(0.9*len(samples))
|
||||
_digits_train, digits_test = np.split(digits2, [train_n])
|
||||
samples_train, samples_test = np.split(samples, [train_n])
|
||||
labels_train, labels_test = np.split(labels, [train_n])
|
||||
errors = list()
|
||||
confusionMatrixes = list()
|
||||
|
||||
model = KNearest(k=4)
|
||||
model.train(samples_train, labels_train)
|
||||
error, confusion = evaluate_model(model, digits_test, samples_test, labels_test)
|
||||
errors.append(error)
|
||||
confusionMatrixes.append(confusion)
|
||||
|
||||
model = SVM(C=2.67, gamma=5.383)
|
||||
model.train(samples_train, labels_train)
|
||||
error, confusion = evaluate_model(model, digits_test, samples_test, labels_test)
|
||||
errors.append(error)
|
||||
confusionMatrixes.append(confusion)
|
||||
|
||||
eps = 0.001
|
||||
normEps = len(samples_test) * 0.02
|
||||
|
||||
confusionKNN = [[45, 0, 0, 0, 0, 0, 0, 0, 0, 0],
|
||||
[ 0, 57, 0, 0, 0, 0, 0, 0, 0, 0],
|
||||
[ 0, 0, 59, 1, 0, 0, 0, 0, 1, 0],
|
||||
[ 0, 0, 0, 43, 0, 0, 0, 1, 0, 0],
|
||||
[ 0, 0, 0, 0, 38, 0, 2, 0, 0, 0],
|
||||
[ 0, 0, 0, 2, 0, 48, 0, 0, 1, 0],
|
||||
[ 0, 1, 0, 0, 0, 0, 51, 0, 0, 0],
|
||||
[ 0, 0, 1, 0, 0, 0, 0, 54, 0, 0],
|
||||
[ 0, 0, 0, 0, 0, 1, 0, 0, 46, 0],
|
||||
[ 1, 1, 0, 1, 1, 0, 0, 0, 2, 42]]
|
||||
|
||||
confusionSVM = [[45, 0, 0, 0, 0, 0, 0, 0, 0, 0],
|
||||
[ 0, 57, 0, 0, 0, 0, 0, 0, 0, 0],
|
||||
[ 0, 0, 59, 2, 0, 0, 0, 0, 0, 0],
|
||||
[ 0, 0, 0, 43, 0, 0, 0, 1, 0, 0],
|
||||
[ 0, 0, 0, 0, 40, 0, 0, 0, 0, 0],
|
||||
[ 0, 0, 0, 1, 0, 50, 0, 0, 0, 0],
|
||||
[ 0, 0, 0, 0, 1, 0, 51, 0, 0, 0],
|
||||
[ 0, 0, 1, 0, 0, 0, 0, 54, 0, 0],
|
||||
[ 0, 0, 0, 0, 0, 0, 0, 0, 47, 0],
|
||||
[ 0, 1, 0, 1, 0, 0, 0, 0, 1, 45]]
|
||||
|
||||
self.assertLess(cv.norm(confusionMatrixes[0] - confusionKNN, cv.NORM_L1), normEps)
|
||||
self.assertLess(cv.norm(confusionMatrixes[1] - confusionSVM, cv.NORM_L1), normEps)
|
||||
|
||||
self.assertLess(errors[0] - 0.034, eps)
|
||||
self.assertLess(errors[1] - 0.018, eps)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
NewOpenCVTests.bootstrap()
|
||||
40
3rdparty/opencv-4.5.4/modules/ml/misc/python/test/test_goodfeatures.py
vendored
Normal file
@@ -0,0 +1,40 @@
#!/usr/bin/env python

# Python 2/3 compatibility
from __future__ import print_function

import cv2 as cv
import numpy as np

from tests_common import NewOpenCVTests

class TestGoodFeaturesToTrack_test(NewOpenCVTests):
    def test_goodFeaturesToTrack(self):
        arr = self.get_sample('samples/data/lena.jpg', 0)
        original = arr.copy()
        threshes = [ x / 100. for x in range(1,10) ]
        numPoints = 20000

        results = dict([(t, cv.goodFeaturesToTrack(arr, numPoints, t, 2, useHarrisDetector=True)) for t in threshes])
        # Check that GoodFeaturesToTrack has not modified input image
        self.assertTrue(arr.tostring() == original.tostring())
        # Check for repeatability
        for i in range(1):
            results2 = dict([(t, cv.goodFeaturesToTrack(arr, numPoints, t, 2, useHarrisDetector=True)) for t in threshes])
            for t in threshes:
                self.assertTrue(len(results2[t]) == len(results[t]))
                for i in range(len(results[t])):
                    self.assertTrue(cv.norm(results[t][i][0] - results2[t][i][0]) == 0)

        for t0,t1 in zip(threshes, threshes[1:]):
            r0 = results[t0]
            r1 = results[t1]
            # Increasing thresh should make result list shorter
            self.assertTrue(len(r0) > len(r1))
            # Increasing thresh should only truncate result list
            for i in range(len(r1)):
                self.assertTrue(cv.norm(r1[i][0] - r0[i][0])==0)


if __name__ == '__main__':
    NewOpenCVTests.bootstrap()
13
3rdparty/opencv-4.5.4/modules/ml/misc/python/test/test_knearest.py
vendored
Normal file
@@ -0,0 +1,13 @@
#!/usr/bin/env python
import cv2 as cv

from tests_common import NewOpenCVTests

class knearest_test(NewOpenCVTests):
    def test_load(self):
        k_nearest = cv.ml.KNearest_load(self.find_file("ml/opencv_ml_knn.xml"))
        self.assertFalse(k_nearest.empty())
        self.assertTrue(k_nearest.isTrained())

if __name__ == '__main__':
    NewOpenCVTests.bootstrap()
171
3rdparty/opencv-4.5.4/modules/ml/misc/python/test/test_letter_recog.py
vendored
Normal file
@@ -0,0 +1,171 @@
|
||||
#!/usr/bin/env python
|
||||
|
||||
'''
|
||||
The sample demonstrates how to train Random Trees classifier
|
||||
(or Boosting classifier, or MLP, or Knearest, or Support Vector Machines) using the provided dataset.
|
||||
|
||||
We use the sample database letter-recognition.data
|
||||
from UCI Repository, here is the link:
|
||||
|
||||
Newman, D.J. & Hettich, S. & Blake, C.L. & Merz, C.J. (1998).
|
||||
UCI Repository of machine learning databases
|
||||
[http://www.ics.uci.edu/~mlearn/MLRepository.html].
|
||||
Irvine, CA: University of California, Department of Information and Computer Science.
|
||||
|
||||
The dataset consists of 20000 feature vectors along with the
|
||||
responses - capital latin letters A..Z.
|
||||
The first 10000 samples are used for training
|
||||
and the remaining 10000 - to test the classifier.
|
||||
======================================================
|
||||
Models: RTrees, KNearest, Boost, SVM, MLP
|
||||
'''
|
||||
|
||||
# Python 2/3 compatibility
|
||||
from __future__ import print_function
|
||||
|
||||
import numpy as np
|
||||
import cv2 as cv
|
||||
|
||||
def load_base(fn):
|
||||
a = np.loadtxt(fn, np.float32, delimiter=',', converters={ 0 : lambda ch : ord(ch)-ord('A') })
|
||||
samples, responses = a[:,1:], a[:,0]
|
||||
return samples, responses
|
||||
|
||||
class LetterStatModel(object):
|
||||
class_n = 26
|
||||
train_ratio = 0.5
|
||||
|
||||
def load(self, fn):
|
||||
self.model.load(fn)
|
||||
def save(self, fn):
|
||||
self.model.save(fn)
|
||||
|
||||
def unroll_samples(self, samples):
|
||||
sample_n, var_n = samples.shape
|
||||
new_samples = np.zeros((sample_n * self.class_n, var_n+1), np.float32)
|
||||
new_samples[:,:-1] = np.repeat(samples, self.class_n, axis=0)
|
||||
new_samples[:,-1] = np.tile(np.arange(self.class_n), sample_n)
|
||||
return new_samples
|
||||
|
||||
def unroll_responses(self, responses):
|
||||
sample_n = len(responses)
|
||||
new_responses = np.zeros(sample_n*self.class_n, np.int32)
|
||||
resp_idx = np.int32( responses + np.arange(sample_n)*self.class_n )
|
||||
new_responses[resp_idx] = 1
|
||||
return new_responses
|
||||
|
||||
class RTrees(LetterStatModel):
    def __init__(self):
        self.model = cv.ml.RTrees_create()

    def train(self, samples, responses):
        #sample_n, var_n = samples.shape
        self.model.setMaxDepth(20)
        self.model.train(samples, cv.ml.ROW_SAMPLE, responses.astype(int))

    def predict(self, samples):
        _ret, resp = self.model.predict(samples)
        return resp.ravel()


class KNearest(LetterStatModel):
    def __init__(self):
        self.model = cv.ml.KNearest_create()

    def train(self, samples, responses):
        self.model.train(samples, cv.ml.ROW_SAMPLE, responses)

    def predict(self, samples):
        _retval, results, _neigh_resp, _dists = self.model.findNearest(samples, k = 10)
        return results.ravel()


class Boost(LetterStatModel):
    def __init__(self):
        self.model = cv.ml.Boost_create()

    def train(self, samples, responses):
        _sample_n, var_n = samples.shape
        new_samples = self.unroll_samples(samples)
        new_responses = self.unroll_responses(responses)
        var_types = np.array([cv.ml.VAR_NUMERICAL] * var_n + [cv.ml.VAR_CATEGORICAL, cv.ml.VAR_CATEGORICAL], np.uint8)

        self.model.setWeakCount(15)
        self.model.setMaxDepth(10)
        self.model.train(cv.ml.TrainData_create(new_samples, cv.ml.ROW_SAMPLE, new_responses.astype(int), varType = var_types))

    def predict(self, samples):
        new_samples = self.unroll_samples(samples)
        _ret, resp = self.model.predict(new_samples)

        return resp.ravel().reshape(-1, self.class_n).argmax(1)


class SVM(LetterStatModel):
    def __init__(self):
        self.model = cv.ml.SVM_create()

    def train(self, samples, responses):
        self.model.setType(cv.ml.SVM_C_SVC)
        self.model.setC(1)
        self.model.setKernel(cv.ml.SVM_RBF)
        self.model.setGamma(.1)
        self.model.train(samples, cv.ml.ROW_SAMPLE, responses.astype(int))

    def predict(self, samples):
        _ret, resp = self.model.predict(samples)
        return resp.ravel()


class MLP(LetterStatModel):
    def __init__(self):
        self.model = cv.ml.ANN_MLP_create()

    def train(self, samples, responses):
        _sample_n, var_n = samples.shape
        new_responses = self.unroll_responses(responses).reshape(-1, self.class_n)
        layer_sizes = np.int32([var_n, 100, 100, self.class_n])

        self.model.setLayerSizes(layer_sizes)
        self.model.setTrainMethod(cv.ml.ANN_MLP_BACKPROP)
        self.model.setBackpropMomentumScale(0)
        self.model.setBackpropWeightScale(0.001)
        self.model.setTermCriteria((cv.TERM_CRITERIA_COUNT, 20, 0.01))
        self.model.setActivationFunction(cv.ml.ANN_MLP_SIGMOID_SYM, 2, 1)

        self.model.train(samples, cv.ml.ROW_SAMPLE, np.float32(new_responses))

    def predict(self, samples):
        _ret, resp = self.model.predict(samples)
        return resp.argmax(-1)


from tests_common import NewOpenCVTests


class letter_recog_test(NewOpenCVTests):

    def test_letter_recog(self):

        eps = 0.01

        models = [RTrees, KNearest, Boost, SVM, MLP]
        models = dict( [(cls.__name__.lower(), cls) for cls in models] )
        testErrors = {RTrees: (98.930000, 92.390000), KNearest: (94.960000, 92.010000),
                      Boost: (85.970000, 74.920000), SVM: (99.780000, 95.680000), MLP: (90.060000, 87.410000)}

        for model in models:
            Model = models[model]
            classifier = Model()

            samples, responses = load_base(self.repoPath + '/samples/data/letter-recognition.data')
            train_n = int(len(samples)*classifier.train_ratio)

            classifier.train(samples[:train_n], responses[:train_n])
            train_rate = np.mean(classifier.predict(samples[:train_n]) == responses[:train_n].astype(int))
            test_rate = np.mean(classifier.predict(samples[train_n:]) == responses[train_n:].astype(int))

            self.assertLess(train_rate - testErrors[Model][0], eps)
            self.assertLess(test_rate - testErrors[Model][1], eps)


if __name__ == '__main__':
    NewOpenCVTests.bootstrap()
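For context, a minimal standalone sketch (not part of the vendored sources) of the same cv.ml training/prediction pattern exercised by the test above; the random data here is purely illustrative, standing in for the letter-recognition samples:

```python
import numpy as np
import cv2 as cv

# Illustrative only: random 2-class data with 16 float32 features per sample.
samples = np.random.rand(100, 16).astype(np.float32)
responses = (samples[:, 0] > 0.5).astype(np.int32)

# Same create/configure/train/predict pattern as the RTrees wrapper above.
model = cv.ml.RTrees_create()
model.setMaxDepth(20)
model.train(samples, cv.ml.ROW_SAMPLE, responses)

_ret, resp = model.predict(samples)
accuracy = np.mean(resp.ravel().astype(np.int32) == responses)
print('train accuracy:', accuracy)
```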
1533
3rdparty/opencv-4.5.4/modules/ml/src/ann_mlp.cpp
vendored
Normal file
File diff suppressed because it is too large
533
3rdparty/opencv-4.5.4/modules/ml/src/boost.cpp
vendored
Normal file
@@ -0,0 +1,533 @@
|
||||
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// License Agreement
|
||||
// For Open Source Computer Vision Library
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Copyright (C) 2014, Itseez Inc, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of the copyright holders may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
|
||||
namespace cv { namespace ml {
|
||||
|
||||
static inline double
|
||||
log_ratio( double val )
|
||||
{
|
||||
const double eps = 1e-5;
|
||||
val = std::max( val, eps );
|
||||
val = std::min( val, 1. - eps );
|
||||
return log( val/(1. - val) );
|
||||
}
|
||||
|
||||
|
||||
BoostTreeParams::BoostTreeParams()
|
||||
{
|
||||
boostType = Boost::REAL;
|
||||
weakCount = 100;
|
||||
weightTrimRate = 0.95;
|
||||
}
|
||||
|
||||
BoostTreeParams::BoostTreeParams( int _boostType, int _weak_count,
|
||||
double _weightTrimRate)
|
||||
{
|
||||
boostType = _boostType;
|
||||
weakCount = _weak_count;
|
||||
weightTrimRate = _weightTrimRate;
|
||||
}
|
||||
|
||||
class DTreesImplForBoost CV_FINAL : public DTreesImpl
|
||||
{
|
||||
public:
|
||||
DTreesImplForBoost()
|
||||
{
|
||||
params.setCVFolds(0);
|
||||
params.setMaxDepth(1);
|
||||
}
|
||||
virtual ~DTreesImplForBoost() {}
|
||||
|
||||
bool isClassifier() const CV_OVERRIDE { return true; }
|
||||
|
||||
void clear() CV_OVERRIDE
|
||||
{
|
||||
DTreesImpl::clear();
|
||||
}
|
||||
|
||||
void startTraining( const Ptr<TrainData>& trainData, int flags ) CV_OVERRIDE
|
||||
{
|
||||
CV_Assert(!trainData.empty());
|
||||
DTreesImpl::startTraining(trainData, flags);
|
||||
sumResult.assign(w->sidx.size(), 0.);
|
||||
|
||||
if( bparams.boostType != Boost::DISCRETE )
|
||||
{
|
||||
_isClassifier = false;
|
||||
int i, n = (int)w->cat_responses.size();
|
||||
w->ord_responses.resize(n);
|
||||
|
||||
double a = -1, b = 1;
|
||||
if( bparams.boostType == Boost::LOGIT )
|
||||
{
|
||||
a = -2, b = 2;
|
||||
}
|
||||
for( i = 0; i < n; i++ )
|
||||
w->ord_responses[i] = w->cat_responses[i] > 0 ? b : a;
|
||||
}
|
||||
|
||||
normalizeWeights();
|
||||
}
|
||||
|
||||
void normalizeWeights()
|
||||
{
|
||||
int i, n = (int)w->sidx.size();
|
||||
double sumw = 0, a, b;
|
||||
for( i = 0; i < n; i++ )
|
||||
sumw += w->sample_weights[w->sidx[i]];
|
||||
if( sumw > DBL_EPSILON )
|
||||
{
|
||||
a = 1./sumw;
|
||||
b = 0;
|
||||
}
|
||||
else
|
||||
{
|
||||
a = 0;
|
||||
b = 1;
|
||||
}
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
double& wval = w->sample_weights[w->sidx[i]];
|
||||
wval = wval*a + b;
|
||||
}
|
||||
}
|
||||
|
||||
void endTraining() CV_OVERRIDE
|
||||
{
|
||||
DTreesImpl::endTraining();
|
||||
vector<double> e;
|
||||
std::swap(sumResult, e);
|
||||
}
|
||||
|
||||
void scaleTree( int root, double scale )
|
||||
{
|
||||
int nidx = root, pidx = 0;
|
||||
Node *node = 0;
|
||||
|
||||
// traverse the tree and save all the nodes in depth-first order
|
||||
for(;;)
|
||||
{
|
||||
for(;;)
|
||||
{
|
||||
node = &nodes[nidx];
|
||||
node->value *= scale;
|
||||
if( node->left < 0 )
|
||||
break;
|
||||
nidx = node->left;
|
||||
}
|
||||
|
||||
for( pidx = node->parent; pidx >= 0 && nodes[pidx].right == nidx;
|
||||
nidx = pidx, pidx = nodes[pidx].parent )
|
||||
;
|
||||
|
||||
if( pidx < 0 )
|
||||
break;
|
||||
|
||||
nidx = nodes[pidx].right;
|
||||
}
|
||||
}
|
||||
|
||||
void calcValue( int nidx, const vector<int>& _sidx ) CV_OVERRIDE
|
||||
{
|
||||
DTreesImpl::calcValue(nidx, _sidx);
|
||||
WNode* node = &w->wnodes[nidx];
|
||||
if( bparams.boostType == Boost::DISCRETE )
|
||||
{
|
||||
node->value = node->class_idx == 0 ? -1 : 1;
|
||||
}
|
||||
else if( bparams.boostType == Boost::REAL )
|
||||
{
|
||||
double p = (node->value+1)*0.5;
|
||||
node->value = 0.5*log_ratio(p);
|
||||
}
|
||||
}
|
||||
|
||||
bool train( const Ptr<TrainData>& trainData, int flags ) CV_OVERRIDE
|
||||
{
|
||||
CV_Assert(!trainData.empty());
|
||||
startTraining(trainData, flags);
|
||||
int treeidx, ntrees = bparams.weakCount >= 0 ? bparams.weakCount : 10000;
|
||||
vector<int> sidx = w->sidx;
|
||||
|
||||
for( treeidx = 0; treeidx < ntrees; treeidx++ )
|
||||
{
|
||||
int root = addTree( sidx );
|
||||
if( root < 0 )
|
||||
return false;
|
||||
updateWeightsAndTrim( treeidx, sidx );
|
||||
}
|
||||
endTraining();
|
||||
return true;
|
||||
}
|
||||
|
||||
void updateWeightsAndTrim( int treeidx, vector<int>& sidx )
|
||||
{
|
||||
int i, n = (int)w->sidx.size();
|
||||
int nvars = (int)varIdx.size();
|
||||
double sumw = 0., C = 1.;
|
||||
cv::AutoBuffer<double> buf(n + nvars);
|
||||
double* result = buf.data();
|
||||
float* sbuf = (float*)(result + n);
|
||||
Mat sample(1, nvars, CV_32F, sbuf);
|
||||
int predictFlags = bparams.boostType == Boost::DISCRETE ? (PREDICT_MAX_VOTE | RAW_OUTPUT) : PREDICT_SUM;
|
||||
predictFlags |= COMPRESSED_INPUT;
|
||||
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
w->data->getSample(varIdx, w->sidx[i], sbuf );
|
||||
result[i] = predictTrees(Range(treeidx, treeidx+1), sample, predictFlags);
|
||||
}
|
||||
|
||||
// now update weights and other parameters for each type of boosting
|
||||
if( bparams.boostType == Boost::DISCRETE )
|
||||
{
|
||||
// Discrete AdaBoost:
|
||||
// weak_eval[i] (=f(x_i)) is in {-1,1}
|
||||
// err = sum(w_i*(f(x_i) != y_i))/sum(w_i)
|
||||
// C = log((1-err)/err)
|
||||
// w_i *= exp(C*(f(x_i) != y_i))
|
||||
double err = 0.;
|
||||
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
int si = w->sidx[i];
|
||||
double wval = w->sample_weights[si];
|
||||
sumw += wval;
|
||||
err += wval*(result[i] != w->cat_responses[si]);
|
||||
}
|
||||
|
||||
if( sumw != 0 )
|
||||
err /= sumw;
|
||||
C = -log_ratio( err );
|
||||
double scale = std::exp(C);
|
||||
|
||||
sumw = 0;
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
int si = w->sidx[i];
|
||||
double wval = w->sample_weights[si];
|
||||
if( result[i] != w->cat_responses[si] )
|
||||
wval *= scale;
|
||||
sumw += wval;
|
||||
w->sample_weights[si] = wval;
|
||||
}
|
||||
|
||||
scaleTree(roots[treeidx], C);
|
||||
}
|
||||
else if( bparams.boostType == Boost::REAL || bparams.boostType == Boost::GENTLE )
|
||||
{
|
||||
// Real AdaBoost:
|
||||
// weak_eval[i] = f(x_i) = 0.5*log(p(x_i)/(1-p(x_i))), p(x_i)=P(y=1|x_i)
|
||||
// w_i *= exp(-y_i*f(x_i))
|
||||
|
||||
// Gentle AdaBoost:
|
||||
// weak_eval[i] = f(x_i) in [-1,1]
|
||||
// w_i *= exp(-y_i*f(x_i))
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
int si = w->sidx[i];
|
||||
CV_Assert( std::abs(w->ord_responses[si]) == 1 );
|
||||
double wval = w->sample_weights[si]*std::exp(-result[i]*w->ord_responses[si]);
|
||||
sumw += wval;
|
||||
w->sample_weights[si] = wval;
|
||||
}
|
||||
}
|
||||
else if( bparams.boostType == Boost::LOGIT )
|
||||
{
|
||||
// LogitBoost:
|
||||
// weak_eval[i] = f(x_i) in [-z_max,z_max]
|
||||
// sum_response = F(x_i).
|
||||
// F(x_i) += 0.5*f(x_i)
|
||||
// p(x_i) = exp(F(x_i))/(exp(F(x_i)) + exp(-F(x_i))=1/(1+exp(-2*F(x_i)))
|
||||
// reuse weak_eval: weak_eval[i] <- p(x_i)
|
||||
// w_i = p(x_i)*1(1 - p(x_i))
|
||||
// z_i = ((y_i+1)/2 - p(x_i))/(p(x_i)*(1 - p(x_i)))
|
||||
// store z_i to the data->data_root as the new target responses
|
||||
const double lb_weight_thresh = FLT_EPSILON;
|
||||
const double lb_z_max = 10.;
|
||||
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
int si = w->sidx[i];
|
||||
sumResult[i] += 0.5*result[i];
|
||||
double p = 1./(1 + std::exp(-2*sumResult[i]));
|
||||
double wval = std::max( p*(1 - p), lb_weight_thresh ), z;
|
||||
w->sample_weights[si] = wval;
|
||||
sumw += wval;
|
||||
if( w->ord_responses[si] > 0 )
|
||||
{
|
||||
z = 1./p;
|
||||
w->ord_responses[si] = std::min(z, lb_z_max);
|
||||
}
|
||||
else
|
||||
{
|
||||
z = 1./(1-p);
|
||||
w->ord_responses[si] = -std::min(z, lb_z_max);
|
||||
}
|
||||
}
|
||||
}
|
||||
else
|
||||
CV_Error(CV_StsNotImplemented, "Unknown boosting type");
|
||||
|
||||
/*if( bparams.boostType != Boost::LOGIT )
|
||||
{
|
||||
double err = 0;
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
sumResult[i] += result[i]*C;
|
||||
if( bparams.boostType != Boost::DISCRETE )
|
||||
err += sumResult[i]*w->ord_responses[w->sidx[i]] < 0;
|
||||
else
|
||||
err += sumResult[i]*w->cat_responses[w->sidx[i]] < 0;
|
||||
}
|
||||
printf("%d trees. C=%.2f, training error=%.1f%%, working set size=%d (out of %d)\n", (int)roots.size(), C, err*100./n, (int)sidx.size(), n);
|
||||
}*/
|
||||
|
||||
// renormalize weights
|
||||
if( sumw > FLT_EPSILON )
|
||||
normalizeWeights();
|
||||
|
||||
if( bparams.weightTrimRate <= 0. || bparams.weightTrimRate >= 1. )
|
||||
return;
|
||||
|
||||
for( i = 0; i < n; i++ )
|
||||
result[i] = w->sample_weights[w->sidx[i]];
|
||||
std::sort(result, result + n);
|
||||
|
||||
// as weight trimming occurs immediately after updating the weights,
|
||||
// where they are renormalized, we assume that the weight sum = 1.
|
||||
sumw = 1. - bparams.weightTrimRate;
|
||||
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
double wval = result[i];
|
||||
if( sumw <= 0 )
|
||||
break;
|
||||
sumw -= wval;
|
||||
}
|
||||
|
||||
double threshold = i < n ? result[i] : DBL_MAX;
|
||||
sidx.clear();
|
||||
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
int si = w->sidx[i];
|
||||
if( w->sample_weights[si] >= threshold )
|
||||
sidx.push_back(si);
|
||||
}
|
||||
}
|
||||
|
||||
float predictTrees( const Range& range, const Mat& sample, int flags0 ) const CV_OVERRIDE
|
||||
{
|
||||
int flags = (flags0 & ~PREDICT_MASK) | PREDICT_SUM;
|
||||
float val = DTreesImpl::predictTrees(range, sample, flags);
|
||||
if( flags != flags0 )
|
||||
{
|
||||
int ival = (int)(val > 0);
|
||||
if( !(flags0 & RAW_OUTPUT) )
|
||||
ival = classLabels[ival];
|
||||
val = (float)ival;
|
||||
}
|
||||
return val;
|
||||
}
|
||||
|
||||
void writeTrainingParams( FileStorage& fs ) const CV_OVERRIDE
|
||||
{
|
||||
fs << "boosting_type" <<
|
||||
(bparams.boostType == Boost::DISCRETE ? "DiscreteAdaboost" :
|
||||
bparams.boostType == Boost::REAL ? "RealAdaboost" :
|
||||
bparams.boostType == Boost::LOGIT ? "LogitBoost" :
|
||||
bparams.boostType == Boost::GENTLE ? "GentleAdaboost" : "Unknown");
|
||||
|
||||
DTreesImpl::writeTrainingParams(fs);
|
||||
fs << "weight_trimming_rate" << bparams.weightTrimRate;
|
||||
}
|
||||
|
||||
void write( FileStorage& fs ) const CV_OVERRIDE
|
||||
{
|
||||
if( roots.empty() )
|
||||
CV_Error( CV_StsBadArg, "RTrees have not been trained" );
|
||||
|
||||
writeFormat(fs);
|
||||
writeParams(fs);
|
||||
|
||||
int k, ntrees = (int)roots.size();
|
||||
|
||||
fs << "ntrees" << ntrees
|
||||
<< "trees" << "[";
|
||||
|
||||
for( k = 0; k < ntrees; k++ )
|
||||
{
|
||||
fs << "{";
|
||||
writeTree(fs, roots[k]);
|
||||
fs << "}";
|
||||
}
|
||||
|
||||
fs << "]";
|
||||
}
|
||||
|
||||
void readParams( const FileNode& fn ) CV_OVERRIDE
|
||||
{
|
||||
DTreesImpl::readParams(fn);
|
||||
|
||||
FileNode tparams_node = fn["training_params"];
|
||||
// check for old layout
|
||||
String bts = (String)(fn["boosting_type"].empty() ?
|
||||
tparams_node["boosting_type"] : fn["boosting_type"]);
|
||||
bparams.boostType = (bts == "DiscreteAdaboost" ? Boost::DISCRETE :
|
||||
bts == "RealAdaboost" ? Boost::REAL :
|
||||
bts == "LogitBoost" ? Boost::LOGIT :
|
||||
bts == "GentleAdaboost" ? Boost::GENTLE : -1);
|
||||
_isClassifier = bparams.boostType == Boost::DISCRETE;
|
||||
// check for old layout
|
||||
bparams.weightTrimRate = (double)(fn["weight_trimming_rate"].empty() ?
|
||||
tparams_node["weight_trimming_rate"] : fn["weight_trimming_rate"]);
|
||||
}
|
||||
|
||||
void read( const FileNode& fn ) CV_OVERRIDE
|
||||
{
|
||||
clear();
|
||||
|
||||
int ntrees = (int)fn["ntrees"];
|
||||
readParams(fn);
|
||||
|
||||
FileNode trees_node = fn["trees"];
|
||||
FileNodeIterator it = trees_node.begin();
|
||||
CV_Assert( ntrees == (int)trees_node.size() );
|
||||
|
||||
for( int treeidx = 0; treeidx < ntrees; treeidx++, ++it )
|
||||
{
|
||||
FileNode nfn = (*it)["nodes"];
|
||||
readTree(nfn);
|
||||
}
|
||||
}
|
||||
|
||||
BoostTreeParams bparams;
|
||||
vector<double> sumResult;
|
||||
};
|
||||
|
||||
|
||||
class BoostImpl : public Boost
|
||||
{
|
||||
public:
|
||||
BoostImpl() {}
|
||||
virtual ~BoostImpl() {}
|
||||
|
||||
inline int getBoostType() const CV_OVERRIDE { return impl.bparams.boostType; }
|
||||
inline void setBoostType(int val) CV_OVERRIDE { impl.bparams.boostType = val; }
|
||||
inline int getWeakCount() const CV_OVERRIDE { return impl.bparams.weakCount; }
|
||||
inline void setWeakCount(int val) CV_OVERRIDE { impl.bparams.weakCount = val; }
|
||||
inline double getWeightTrimRate() const CV_OVERRIDE { return impl.bparams.weightTrimRate; }
|
||||
inline void setWeightTrimRate(double val) CV_OVERRIDE { impl.bparams.weightTrimRate = val; }
|
||||
|
||||
inline int getMaxCategories() const CV_OVERRIDE { return impl.params.getMaxCategories(); }
|
||||
inline void setMaxCategories(int val) CV_OVERRIDE { impl.params.setMaxCategories(val); }
|
||||
inline int getMaxDepth() const CV_OVERRIDE { return impl.params.getMaxDepth(); }
|
||||
inline void setMaxDepth(int val) CV_OVERRIDE { impl.params.setMaxDepth(val); }
|
||||
inline int getMinSampleCount() const CV_OVERRIDE { return impl.params.getMinSampleCount(); }
|
||||
inline void setMinSampleCount(int val) CV_OVERRIDE { impl.params.setMinSampleCount(val); }
|
||||
inline int getCVFolds() const CV_OVERRIDE { return impl.params.getCVFolds(); }
|
||||
inline void setCVFolds(int val) CV_OVERRIDE { impl.params.setCVFolds(val); }
|
||||
inline bool getUseSurrogates() const CV_OVERRIDE { return impl.params.getUseSurrogates(); }
|
||||
inline void setUseSurrogates(bool val) CV_OVERRIDE { impl.params.setUseSurrogates(val); }
|
||||
inline bool getUse1SERule() const CV_OVERRIDE { return impl.params.getUse1SERule(); }
|
||||
inline void setUse1SERule(bool val) CV_OVERRIDE { impl.params.setUse1SERule(val); }
|
||||
inline bool getTruncatePrunedTree() const CV_OVERRIDE { return impl.params.getTruncatePrunedTree(); }
|
||||
inline void setTruncatePrunedTree(bool val) CV_OVERRIDE { impl.params.setTruncatePrunedTree(val); }
|
||||
inline float getRegressionAccuracy() const CV_OVERRIDE { return impl.params.getRegressionAccuracy(); }
|
||||
inline void setRegressionAccuracy(float val) CV_OVERRIDE { impl.params.setRegressionAccuracy(val); }
|
||||
inline cv::Mat getPriors() const CV_OVERRIDE { return impl.params.getPriors(); }
|
||||
inline void setPriors(const cv::Mat& val) CV_OVERRIDE { impl.params.setPriors(val); }
|
||||
|
||||
String getDefaultName() const CV_OVERRIDE { return "opencv_ml_boost"; }
|
||||
|
||||
bool train( const Ptr<TrainData>& trainData, int flags ) CV_OVERRIDE
|
||||
{
|
||||
CV_Assert(!trainData.empty());
|
||||
return impl.train(trainData, flags);
|
||||
}
|
||||
|
||||
float predict( InputArray samples, OutputArray results, int flags ) const CV_OVERRIDE
|
||||
{
|
||||
CV_CheckEQ(samples.cols(), getVarCount(), "");
|
||||
return impl.predict(samples, results, flags);
|
||||
}
|
||||
|
||||
void write( FileStorage& fs ) const CV_OVERRIDE
|
||||
{
|
||||
impl.write(fs);
|
||||
}
|
||||
|
||||
void read( const FileNode& fn ) CV_OVERRIDE
|
||||
{
|
||||
impl.read(fn);
|
||||
}
|
||||
|
||||
int getVarCount() const CV_OVERRIDE { return impl.getVarCount(); }
|
||||
|
||||
bool isTrained() const CV_OVERRIDE { return impl.isTrained(); }
|
||||
bool isClassifier() const CV_OVERRIDE { return impl.isClassifier(); }
|
||||
|
||||
const vector<int>& getRoots() const CV_OVERRIDE { return impl.getRoots(); }
|
||||
const vector<Node>& getNodes() const CV_OVERRIDE { return impl.getNodes(); }
|
||||
const vector<Split>& getSplits() const CV_OVERRIDE { return impl.getSplits(); }
|
||||
const vector<int>& getSubsets() const CV_OVERRIDE { return impl.getSubsets(); }
|
||||
|
||||
DTreesImplForBoost impl;
|
||||
};
|
||||
|
||||
|
||||
Ptr<Boost> Boost::create()
|
||||
{
|
||||
return makePtr<BoostImpl>();
|
||||
}
|
||||
|
||||
Ptr<Boost> Boost::load(const String& filepath, const String& nodeName)
|
||||
{
|
||||
return Algorithm::load<Boost>(filepath, nodeName);
|
||||
}
|
||||
|
||||
}}
|
||||
|
||||
/* End of file. */
|
||||
1045
3rdparty/opencv-4.5.4/modules/ml/src/data.cpp
vendored
Normal file
File diff suppressed because it is too large
859
3rdparty/opencv-4.5.4/modules/ml/src/em.cpp
vendored
Normal file
@@ -0,0 +1,859 @@
|
||||
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// Intel License Agreement
|
||||
// For Open Source Computer Vision Library
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of Intel Corporation may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
|
||||
namespace cv
|
||||
{
|
||||
namespace ml
|
||||
{
|
||||
|
||||
const double minEigenValue = DBL_EPSILON;
|
||||
|
||||
class CV_EXPORTS EMImpl CV_FINAL : public EM
|
||||
{
|
||||
public:
|
||||
|
||||
int nclusters;
|
||||
int covMatType;
|
||||
TermCriteria termCrit;
|
||||
|
||||
inline TermCriteria getTermCriteria() const CV_OVERRIDE { return termCrit; }
|
||||
inline void setTermCriteria(const TermCriteria& val) CV_OVERRIDE { termCrit = val; }
|
||||
|
||||
void setClustersNumber(int val) CV_OVERRIDE
|
||||
{
|
||||
nclusters = val;
|
||||
CV_Assert(nclusters >= 1);
|
||||
}
|
||||
|
||||
int getClustersNumber() const CV_OVERRIDE
|
||||
{
|
||||
return nclusters;
|
||||
}
|
||||
|
||||
void setCovarianceMatrixType(int val) CV_OVERRIDE
|
||||
{
|
||||
covMatType = val;
|
||||
CV_Assert(covMatType == COV_MAT_SPHERICAL ||
|
||||
covMatType == COV_MAT_DIAGONAL ||
|
||||
covMatType == COV_MAT_GENERIC);
|
||||
}
|
||||
|
||||
int getCovarianceMatrixType() const CV_OVERRIDE
|
||||
{
|
||||
return covMatType;
|
||||
}
|
||||
|
||||
EMImpl()
|
||||
{
|
||||
nclusters = DEFAULT_NCLUSTERS;
|
||||
covMatType=EM::COV_MAT_DIAGONAL;
|
||||
termCrit = TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, EM::DEFAULT_MAX_ITERS, 1e-6);
|
||||
}
|
||||
|
||||
virtual ~EMImpl() {}
|
||||
|
||||
void clear() CV_OVERRIDE
|
||||
{
|
||||
trainSamples.release();
|
||||
trainProbs.release();
|
||||
trainLogLikelihoods.release();
|
||||
trainLabels.release();
|
||||
|
||||
weights.release();
|
||||
means.release();
|
||||
covs.clear();
|
||||
|
||||
covsEigenValues.clear();
|
||||
invCovsEigenValues.clear();
|
||||
covsRotateMats.clear();
|
||||
|
||||
logWeightDivDet.release();
|
||||
}
|
||||
|
||||
bool train(const Ptr<TrainData>& data, int) CV_OVERRIDE
|
||||
{
|
||||
CV_Assert(!data.empty());
|
||||
Mat samples = data->getTrainSamples(), labels;
|
||||
return trainEM(samples, labels, noArray(), noArray());
|
||||
}
|
||||
|
||||
bool trainEM(InputArray samples,
|
||||
OutputArray logLikelihoods,
|
||||
OutputArray labels,
|
||||
OutputArray probs) CV_OVERRIDE
|
||||
{
|
||||
Mat samplesMat = samples.getMat();
|
||||
setTrainData(START_AUTO_STEP, samplesMat, 0, 0, 0, 0);
|
||||
return doTrain(START_AUTO_STEP, logLikelihoods, labels, probs);
|
||||
}
|
||||
|
||||
bool trainE(InputArray samples,
|
||||
InputArray _means0,
|
||||
InputArray _covs0,
|
||||
InputArray _weights0,
|
||||
OutputArray logLikelihoods,
|
||||
OutputArray labels,
|
||||
OutputArray probs) CV_OVERRIDE
|
||||
{
|
||||
Mat samplesMat = samples.getMat();
|
||||
std::vector<Mat> covs0;
|
||||
_covs0.getMatVector(covs0);
|
||||
|
||||
Mat means0 = _means0.getMat(), weights0 = _weights0.getMat();
|
||||
|
||||
setTrainData(START_E_STEP, samplesMat, 0, !_means0.empty() ? &means0 : 0,
|
||||
!_covs0.empty() ? &covs0 : 0, !_weights0.empty() ? &weights0 : 0);
|
||||
return doTrain(START_E_STEP, logLikelihoods, labels, probs);
|
||||
}
|
||||
|
||||
bool trainM(InputArray samples,
|
||||
InputArray _probs0,
|
||||
OutputArray logLikelihoods,
|
||||
OutputArray labels,
|
||||
OutputArray probs) CV_OVERRIDE
|
||||
{
|
||||
Mat samplesMat = samples.getMat();
|
||||
Mat probs0 = _probs0.getMat();
|
||||
|
||||
setTrainData(START_M_STEP, samplesMat, !_probs0.empty() ? &probs0 : 0, 0, 0, 0);
|
||||
return doTrain(START_M_STEP, logLikelihoods, labels, probs);
|
||||
}
|
||||
|
||||
float predict(InputArray _inputs, OutputArray _outputs, int) const CV_OVERRIDE
|
||||
{
|
||||
bool needprobs = _outputs.needed();
|
||||
Mat samples = _inputs.getMat(), probs, probsrow;
|
||||
int ptype = CV_64F;
|
||||
float firstres = 0.f;
|
||||
int i, nsamples = samples.rows;
|
||||
|
||||
if( needprobs )
|
||||
{
|
||||
if( _outputs.fixedType() )
|
||||
ptype = _outputs.type();
|
||||
_outputs.create(samples.rows, nclusters, ptype);
|
||||
probs = _outputs.getMat();
|
||||
}
|
||||
else
|
||||
nsamples = std::min(nsamples, 1);
|
||||
|
||||
for( i = 0; i < nsamples; i++ )
|
||||
{
|
||||
if( needprobs )
|
||||
probsrow = probs.row(i);
|
||||
Vec2d res = computeProbabilities(samples.row(i), needprobs ? &probsrow : 0, ptype);
|
||||
if( i == 0 )
|
||||
firstres = (float)res[1];
|
||||
}
|
||||
return firstres;
|
||||
}
|
||||
|
||||
Vec2d predict2(InputArray _sample, OutputArray _probs) const CV_OVERRIDE
|
||||
{
|
||||
int ptype = CV_64F;
|
||||
Mat sample = _sample.getMat();
|
||||
CV_Assert(isTrained());
|
||||
|
||||
CV_Assert(!sample.empty());
|
||||
if(sample.type() != CV_64FC1)
|
||||
{
|
||||
Mat tmp;
|
||||
sample.convertTo(tmp, CV_64FC1);
|
||||
sample = tmp;
|
||||
}
|
||||
sample = sample.reshape(1, 1);
|
||||
|
||||
Mat probs;
|
||||
if( _probs.needed() )
|
||||
{
|
||||
if( _probs.fixedType() )
|
||||
ptype = _probs.type();
|
||||
_probs.create(1, nclusters, ptype);
|
||||
probs = _probs.getMat();
|
||||
}
|
||||
|
||||
return computeProbabilities(sample, !probs.empty() ? &probs : 0, ptype);
|
||||
}
|
||||
|
||||
bool isTrained() const CV_OVERRIDE
|
||||
{
|
||||
return !means.empty();
|
||||
}
|
||||
|
||||
bool isClassifier() const CV_OVERRIDE
|
||||
{
|
||||
return true;
|
||||
}
|
||||
|
||||
int getVarCount() const CV_OVERRIDE
|
||||
{
|
||||
return means.cols;
|
||||
}
|
||||
|
||||
String getDefaultName() const CV_OVERRIDE
|
||||
{
|
||||
return "opencv_ml_em";
|
||||
}
|
||||
|
||||
static void checkTrainData(int startStep, const Mat& samples,
|
||||
int nclusters, int covMatType, const Mat* probs, const Mat* means,
|
||||
const std::vector<Mat>* covs, const Mat* weights)
|
||||
{
|
||||
// Check samples.
|
||||
CV_Assert(!samples.empty());
|
||||
CV_Assert(samples.channels() == 1);
|
||||
|
||||
int nsamples = samples.rows;
|
||||
int dim = samples.cols;
|
||||
|
||||
// Check training params.
|
||||
CV_Assert(nclusters > 0);
|
||||
CV_Assert(nclusters <= nsamples);
|
||||
CV_Assert(startStep == START_AUTO_STEP ||
|
||||
startStep == START_E_STEP ||
|
||||
startStep == START_M_STEP);
|
||||
CV_Assert(covMatType == COV_MAT_GENERIC ||
|
||||
covMatType == COV_MAT_DIAGONAL ||
|
||||
covMatType == COV_MAT_SPHERICAL);
|
||||
|
||||
CV_Assert(!probs ||
|
||||
(!probs->empty() &&
|
||||
probs->rows == nsamples && probs->cols == nclusters &&
|
||||
(probs->type() == CV_32FC1 || probs->type() == CV_64FC1)));
|
||||
|
||||
CV_Assert(!weights ||
|
||||
(!weights->empty() &&
|
||||
(weights->cols == 1 || weights->rows == 1) && static_cast<int>(weights->total()) == nclusters &&
|
||||
(weights->type() == CV_32FC1 || weights->type() == CV_64FC1)));
|
||||
|
||||
CV_Assert(!means ||
|
||||
(!means->empty() &&
|
||||
means->rows == nclusters && means->cols == dim &&
|
||||
means->channels() == 1));
|
||||
|
||||
CV_Assert(!covs ||
|
||||
(!covs->empty() &&
|
||||
static_cast<int>(covs->size()) == nclusters));
|
||||
if(covs)
|
||||
{
|
||||
const Size covSize(dim, dim);
|
||||
for(size_t i = 0; i < covs->size(); i++)
|
||||
{
|
||||
const Mat& m = (*covs)[i];
|
||||
CV_Assert(!m.empty() && m.size() == covSize && (m.channels() == 1));
|
||||
}
|
||||
}
|
||||
|
||||
if(startStep == START_E_STEP)
|
||||
{
|
||||
CV_Assert(means);
|
||||
}
|
||||
else if(startStep == START_M_STEP)
|
||||
{
|
||||
CV_Assert(probs);
|
||||
}
|
||||
}
|
||||
|
||||
static void preprocessSampleData(const Mat& src, Mat& dst, int dstType, bool isAlwaysClone)
|
||||
{
|
||||
if(src.type() == dstType && !isAlwaysClone)
|
||||
dst = src;
|
||||
else
|
||||
src.convertTo(dst, dstType);
|
||||
}
|
||||
|
||||
static void preprocessProbability(Mat& probs)
|
||||
{
|
||||
max(probs, 0., probs);
|
||||
|
||||
const double uniformProbability = (double)(1./probs.cols);
|
||||
for(int y = 0; y < probs.rows; y++)
|
||||
{
|
||||
Mat sampleProbs = probs.row(y);
|
||||
|
||||
double maxVal = 0;
|
||||
minMaxLoc(sampleProbs, 0, &maxVal);
|
||||
if(maxVal < FLT_EPSILON)
|
||||
sampleProbs.setTo(uniformProbability);
|
||||
else
|
||||
normalize(sampleProbs, sampleProbs, 1, 0, NORM_L1);
|
||||
}
|
||||
}
|
||||
|
||||
void setTrainData(int startStep, const Mat& samples,
|
||||
const Mat* probs0,
|
||||
const Mat* means0,
|
||||
const std::vector<Mat>* covs0,
|
||||
const Mat* weights0)
|
||||
{
|
||||
clear();
|
||||
|
||||
checkTrainData(startStep, samples, nclusters, covMatType, probs0, means0, covs0, weights0);
|
||||
|
||||
bool isKMeansInit = (startStep == START_AUTO_STEP) || (startStep == START_E_STEP && (covs0 == 0 || weights0 == 0));
|
||||
// Set checked data
|
||||
preprocessSampleData(samples, trainSamples, isKMeansInit ? CV_32FC1 : CV_64FC1, false);
|
||||
|
||||
// set probs
|
||||
if(probs0 && startStep == START_M_STEP)
|
||||
{
|
||||
preprocessSampleData(*probs0, trainProbs, CV_64FC1, true);
|
||||
preprocessProbability(trainProbs);
|
||||
}
|
||||
|
||||
// set weights
|
||||
if(weights0 && (startStep == START_E_STEP && covs0))
|
||||
{
|
||||
weights0->convertTo(weights, CV_64FC1);
|
||||
weights = weights.reshape(1,1);
|
||||
preprocessProbability(weights);
|
||||
}
|
||||
|
||||
// set means
|
||||
if(means0 && (startStep == START_E_STEP/* || startStep == START_AUTO_STEP*/))
|
||||
means0->convertTo(means, isKMeansInit ? CV_32FC1 : CV_64FC1);
|
||||
|
||||
// set covs
|
||||
if(covs0 && (startStep == START_E_STEP && weights0))
|
||||
{
|
||||
covs.resize(nclusters);
|
||||
for(size_t i = 0; i < covs0->size(); i++)
|
||||
(*covs0)[i].convertTo(covs[i], CV_64FC1);
|
||||
}
|
||||
}
|
||||
|
||||
void decomposeCovs()
|
||||
{
|
||||
CV_Assert(!covs.empty());
|
||||
covsEigenValues.resize(nclusters);
|
||||
if(covMatType == COV_MAT_GENERIC)
|
||||
covsRotateMats.resize(nclusters);
|
||||
invCovsEigenValues.resize(nclusters);
|
||||
for(int clusterIndex = 0; clusterIndex < nclusters; clusterIndex++)
|
||||
{
|
||||
CV_Assert(!covs[clusterIndex].empty());
|
||||
|
||||
SVD svd(covs[clusterIndex], SVD::MODIFY_A + SVD::FULL_UV);
|
||||
|
||||
if(covMatType == COV_MAT_SPHERICAL)
|
||||
{
|
||||
double maxSingularVal = svd.w.at<double>(0);
|
||||
covsEigenValues[clusterIndex] = Mat(1, 1, CV_64FC1, Scalar(maxSingularVal));
|
||||
}
|
||||
else if(covMatType == COV_MAT_DIAGONAL)
|
||||
{
|
||||
covsEigenValues[clusterIndex] = covs[clusterIndex].diag().clone(); //Preserve the original order of eigen values.
|
||||
}
|
||||
else //COV_MAT_GENERIC
|
||||
{
|
||||
covsEigenValues[clusterIndex] = svd.w;
|
||||
covsRotateMats[clusterIndex] = svd.u;
|
||||
}
|
||||
max(covsEigenValues[clusterIndex], minEigenValue, covsEigenValues[clusterIndex]);
|
||||
invCovsEigenValues[clusterIndex] = 1./covsEigenValues[clusterIndex];
|
||||
}
|
||||
}
|
||||
|
||||
void clusterTrainSamples()
|
||||
{
|
||||
int nsamples = trainSamples.rows;
|
||||
|
||||
// Cluster samples, compute/update means
|
||||
|
||||
// Convert samples and means to 32F, because kmeans requires this type.
|
||||
Mat trainSamplesFlt, meansFlt;
|
||||
if(trainSamples.type() != CV_32FC1)
|
||||
trainSamples.convertTo(trainSamplesFlt, CV_32FC1);
|
||||
else
|
||||
trainSamplesFlt = trainSamples;
|
||||
if(!means.empty())
|
||||
{
|
||||
if(means.type() != CV_32FC1)
|
||||
means.convertTo(meansFlt, CV_32FC1);
|
||||
else
|
||||
meansFlt = means;
|
||||
}
|
||||
|
||||
Mat labels;
|
||||
kmeans(trainSamplesFlt, nclusters, labels,
|
||||
TermCriteria(TermCriteria::COUNT, means.empty() ? 10 : 1, 0.5),
|
||||
10, KMEANS_PP_CENTERS, meansFlt);
|
||||
|
||||
// Convert samples and means back to 64F.
|
||||
CV_Assert(meansFlt.type() == CV_32FC1);
|
||||
if(trainSamples.type() != CV_64FC1)
|
||||
{
|
||||
Mat trainSamplesBuffer;
|
||||
trainSamplesFlt.convertTo(trainSamplesBuffer, CV_64FC1);
|
||||
trainSamples = trainSamplesBuffer;
|
||||
}
|
||||
meansFlt.convertTo(means, CV_64FC1);
|
||||
|
||||
// Compute weights and covs
|
||||
weights = Mat(1, nclusters, CV_64FC1, Scalar(0));
|
||||
covs.resize(nclusters);
|
||||
for(int clusterIndex = 0; clusterIndex < nclusters; clusterIndex++)
|
||||
{
|
||||
Mat clusterSamples;
|
||||
for(int sampleIndex = 0; sampleIndex < nsamples; sampleIndex++)
|
||||
{
|
||||
if(labels.at<int>(sampleIndex) == clusterIndex)
|
||||
{
|
||||
const Mat sample = trainSamples.row(sampleIndex);
|
||||
clusterSamples.push_back(sample);
|
||||
}
|
||||
}
|
||||
CV_Assert(!clusterSamples.empty());
|
||||
|
||||
calcCovarMatrix(clusterSamples, covs[clusterIndex], means.row(clusterIndex),
|
||||
CV_COVAR_NORMAL + CV_COVAR_ROWS + CV_COVAR_USE_AVG + CV_COVAR_SCALE, CV_64FC1);
|
||||
weights.at<double>(clusterIndex) = static_cast<double>(clusterSamples.rows)/static_cast<double>(nsamples);
|
||||
}
|
||||
|
||||
decomposeCovs();
|
||||
}
|
||||
|
||||
void computeLogWeightDivDet()
|
||||
{
|
||||
CV_Assert(!covsEigenValues.empty());
|
||||
|
||||
Mat logWeights;
|
||||
cv::max(weights, DBL_MIN, weights);
|
||||
log(weights, logWeights);
|
||||
|
||||
logWeightDivDet.create(1, nclusters, CV_64FC1);
|
||||
// note: logWeightDivDet = log(weight_k) - 0.5 * log(|det(cov_k)|)
|
||||
|
||||
for(int clusterIndex = 0; clusterIndex < nclusters; clusterIndex++)
|
||||
{
|
||||
double logDetCov = 0.;
|
||||
const int evalCount = static_cast<int>(covsEigenValues[clusterIndex].total());
|
||||
for(int di = 0; di < evalCount; di++)
|
||||
logDetCov += std::log(covsEigenValues[clusterIndex].at<double>(covMatType != COV_MAT_SPHERICAL ? di : 0));
|
||||
|
||||
logWeightDivDet.at<double>(clusterIndex) = logWeights.at<double>(clusterIndex) - 0.5 * logDetCov;
|
||||
}
|
||||
}
|
||||
|
||||
bool doTrain(int startStep, OutputArray logLikelihoods, OutputArray labels, OutputArray probs)
|
||||
{
|
||||
int dim = trainSamples.cols;
|
||||
// Precompute the empty initial train data in the cases of START_E_STEP and START_AUTO_STEP
|
||||
if(startStep != START_M_STEP)
|
||||
{
|
||||
if(covs.empty())
|
||||
{
|
||||
CV_Assert(weights.empty());
|
||||
clusterTrainSamples();
|
||||
}
|
||||
}
|
||||
|
||||
if(!covs.empty() && covsEigenValues.empty() )
|
||||
{
|
||||
CV_Assert(invCovsEigenValues.empty());
|
||||
decomposeCovs();
|
||||
}
|
||||
|
||||
if(startStep == START_M_STEP)
|
||||
mStep();
|
||||
|
||||
double trainLogLikelihood, prevTrainLogLikelihood = 0.;
|
||||
int maxIters = (termCrit.type & TermCriteria::MAX_ITER) ?
|
||||
termCrit.maxCount : DEFAULT_MAX_ITERS;
|
||||
double epsilon = (termCrit.type & TermCriteria::EPS) ? termCrit.epsilon : 0.;
|
||||
|
||||
for(int iter = 0; ; iter++)
|
||||
{
|
||||
eStep();
|
||||
trainLogLikelihood = sum(trainLogLikelihoods)[0];
|
||||
|
||||
if(iter >= maxIters - 1)
|
||||
break;
|
||||
|
||||
double trainLogLikelihoodDelta = trainLogLikelihood - prevTrainLogLikelihood;
|
||||
if( iter != 0 &&
|
||||
(trainLogLikelihoodDelta < -DBL_EPSILON ||
|
||||
trainLogLikelihoodDelta < epsilon * std::fabs(trainLogLikelihood)))
|
||||
break;
|
||||
|
||||
mStep();
|
||||
|
||||
prevTrainLogLikelihood = trainLogLikelihood;
|
||||
}
|
||||
|
||||
if( trainLogLikelihood <= -DBL_MAX/10000. )
|
||||
{
|
||||
clear();
|
||||
return false;
|
||||
}
|
||||
|
||||
// postprocess covs
|
||||
covs.resize(nclusters);
|
||||
for(int clusterIndex = 0; clusterIndex < nclusters; clusterIndex++)
|
||||
{
|
||||
if(covMatType == COV_MAT_SPHERICAL)
|
||||
{
|
||||
covs[clusterIndex].create(dim, dim, CV_64FC1);
|
||||
setIdentity(covs[clusterIndex], Scalar(covsEigenValues[clusterIndex].at<double>(0)));
|
||||
}
|
||||
else if(covMatType == COV_MAT_DIAGONAL)
|
||||
{
|
||||
covs[clusterIndex] = Mat::diag(covsEigenValues[clusterIndex]);
|
||||
}
|
||||
}
|
||||
|
||||
if(labels.needed())
|
||||
trainLabels.copyTo(labels);
|
||||
if(probs.needed())
|
||||
trainProbs.copyTo(probs);
|
||||
if(logLikelihoods.needed())
|
||||
trainLogLikelihoods.copyTo(logLikelihoods);
|
||||
|
||||
trainSamples.release();
|
||||
trainProbs.release();
|
||||
trainLabels.release();
|
||||
trainLogLikelihoods.release();
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
Vec2d computeProbabilities(const Mat& sample, Mat* probs, int ptype) const
|
||||
{
|
||||
// L_ik = log(weight_k) - 0.5 * log(|det(cov_k)|) - 0.5 *(x_i - mean_k)' cov_k^(-1) (x_i - mean_k)]
|
||||
// q = arg(max_k(L_ik))
|
||||
// probs_ik = exp(L_ik - L_iq) / (1 + sum_j!=q (exp(L_ij - L_iq))
|
||||
// see Alex Smola's blog http://blog.smola.org/page/2 for
|
||||
// details on the log-sum-exp trick
|
||||
|
||||
int stype = sample.type();
|
||||
CV_Assert(!means.empty());
|
||||
CV_Assert((stype == CV_32F || stype == CV_64F) && (ptype == CV_32F || ptype == CV_64F));
|
||||
CV_Assert(sample.size() == Size(means.cols, 1));
|
||||
|
||||
int dim = sample.cols;
|
||||
|
||||
Mat L(1, nclusters, CV_64FC1), centeredSample(1, dim, CV_64F);
|
||||
int i, label = 0;
|
||||
for(int clusterIndex = 0; clusterIndex < nclusters; clusterIndex++)
|
||||
{
|
||||
const double* mptr = means.ptr<double>(clusterIndex);
|
||||
double* dptr = centeredSample.ptr<double>();
|
||||
if( stype == CV_32F )
|
||||
{
|
||||
const float* sptr = sample.ptr<float>();
|
||||
for( i = 0; i < dim; i++ )
|
||||
dptr[i] = sptr[i] - mptr[i];
|
||||
}
|
||||
else
|
||||
{
|
||||
const double* sptr = sample.ptr<double>();
|
||||
for( i = 0; i < dim; i++ )
|
||||
dptr[i] = sptr[i] - mptr[i];
|
||||
}
|
||||
|
||||
Mat rotatedCenteredSample = covMatType != COV_MAT_GENERIC ?
|
||||
centeredSample : centeredSample * covsRotateMats[clusterIndex];
|
||||
|
||||
double Lval = 0;
|
||||
for(int di = 0; di < dim; di++)
|
||||
{
|
||||
double w = invCovsEigenValues[clusterIndex].at<double>(covMatType != COV_MAT_SPHERICAL ? di : 0);
|
||||
double val = rotatedCenteredSample.at<double>(di);
|
||||
Lval += w * val * val;
|
||||
}
|
||||
CV_DbgAssert(!logWeightDivDet.empty());
|
||||
L.at<double>(clusterIndex) = logWeightDivDet.at<double>(clusterIndex) - 0.5 * Lval;
|
||||
|
||||
if(L.at<double>(clusterIndex) > L.at<double>(label))
|
||||
label = clusterIndex;
|
||||
}
|
||||
|
||||
double maxLVal = L.at<double>(label);
|
||||
double expDiffSum = 0;
|
||||
for( i = 0; i < L.cols; i++ )
|
||||
{
|
||||
double v = std::exp(L.at<double>(i) - maxLVal);
|
||||
L.at<double>(i) = v;
|
||||
expDiffSum += v; // sum_j(exp(L_ij - L_iq))
|
||||
}
|
||||
|
||||
CV_Assert(expDiffSum > 0);
|
||||
if(probs)
|
||||
L.convertTo(*probs, ptype, 1./expDiffSum);
|
||||
|
||||
Vec2d res;
|
||||
res[0] = std::log(expDiffSum) + maxLVal - 0.5 * dim * CV_LOG2PI;
|
||||
res[1] = label;
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
void eStep()
|
||||
{
|
||||
// Compute probs_ik from means_k, covs_k and weights_k.
|
||||
trainProbs.create(trainSamples.rows, nclusters, CV_64FC1);
|
||||
trainLabels.create(trainSamples.rows, 1, CV_32SC1);
|
||||
trainLogLikelihoods.create(trainSamples.rows, 1, CV_64FC1);
|
||||
|
||||
computeLogWeightDivDet();
|
||||
|
||||
CV_DbgAssert(trainSamples.type() == CV_64FC1);
|
||||
CV_DbgAssert(means.type() == CV_64FC1);
|
||||
|
||||
for(int sampleIndex = 0; sampleIndex < trainSamples.rows; sampleIndex++)
|
||||
{
|
||||
Mat sampleProbs = trainProbs.row(sampleIndex);
|
||||
Vec2d res = computeProbabilities(trainSamples.row(sampleIndex), &sampleProbs, CV_64F);
|
||||
trainLogLikelihoods.at<double>(sampleIndex) = res[0];
|
||||
trainLabels.at<int>(sampleIndex) = static_cast<int>(res[1]);
|
||||
}
|
||||
}
|
||||
|
||||
void mStep()
|
||||
{
|
||||
// Update means_k, covs_k and weights_k from probs_ik
|
||||
int dim = trainSamples.cols;
|
||||
|
||||
// Update weights
|
||||
// not normalized first
|
||||
reduce(trainProbs, weights, 0, CV_REDUCE_SUM);
|
||||
|
||||
// Update means
|
||||
means.create(nclusters, dim, CV_64FC1);
|
||||
means = Scalar(0);
|
||||
|
||||
const double minPosWeight = trainSamples.rows * DBL_EPSILON;
|
||||
double minWeight = DBL_MAX;
|
||||
int minWeightClusterIndex = -1;
|
||||
for(int clusterIndex = 0; clusterIndex < nclusters; clusterIndex++)
|
||||
{
|
||||
if(weights.at<double>(clusterIndex) <= minPosWeight)
|
||||
continue;
|
||||
|
||||
if(weights.at<double>(clusterIndex) < minWeight)
|
||||
{
|
||||
minWeight = weights.at<double>(clusterIndex);
|
||||
minWeightClusterIndex = clusterIndex;
|
||||
}
|
||||
|
||||
Mat clusterMean = means.row(clusterIndex);
|
||||
for(int sampleIndex = 0; sampleIndex < trainSamples.rows; sampleIndex++)
|
||||
clusterMean += trainProbs.at<double>(sampleIndex, clusterIndex) * trainSamples.row(sampleIndex);
|
||||
clusterMean /= weights.at<double>(clusterIndex);
|
||||
}
|
||||
|
||||
// Update covsEigenValues and invCovsEigenValues
|
||||
covs.resize(nclusters);
|
||||
covsEigenValues.resize(nclusters);
|
||||
if(covMatType == COV_MAT_GENERIC)
|
||||
covsRotateMats.resize(nclusters);
|
||||
invCovsEigenValues.resize(nclusters);
|
||||
for(int clusterIndex = 0; clusterIndex < nclusters; clusterIndex++)
|
||||
{
|
||||
if(weights.at<double>(clusterIndex) <= minPosWeight)
|
||||
continue;
|
||||
|
||||
if(covMatType != COV_MAT_SPHERICAL)
|
||||
covsEigenValues[clusterIndex].create(1, dim, CV_64FC1);
|
||||
else
|
||||
covsEigenValues[clusterIndex].create(1, 1, CV_64FC1);
|
||||
|
||||
if(covMatType == COV_MAT_GENERIC)
|
||||
covs[clusterIndex].create(dim, dim, CV_64FC1);
|
||||
|
||||
Mat clusterCov = covMatType != COV_MAT_GENERIC ?
|
||||
covsEigenValues[clusterIndex] : covs[clusterIndex];
|
||||
|
||||
clusterCov = Scalar(0);
|
||||
|
||||
Mat centeredSample;
|
||||
for(int sampleIndex = 0; sampleIndex < trainSamples.rows; sampleIndex++)
|
||||
{
|
||||
centeredSample = trainSamples.row(sampleIndex) - means.row(clusterIndex);
|
||||
|
||||
if(covMatType == COV_MAT_GENERIC)
|
||||
clusterCov += trainProbs.at<double>(sampleIndex, clusterIndex) * centeredSample.t() * centeredSample;
|
||||
else
|
||||
{
|
||||
double p = trainProbs.at<double>(sampleIndex, clusterIndex);
|
||||
for(int di = 0; di < dim; di++ )
|
||||
{
|
||||
double val = centeredSample.at<double>(di);
|
||||
clusterCov.at<double>(covMatType != COV_MAT_SPHERICAL ? di : 0) += p*val*val;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if(covMatType == COV_MAT_SPHERICAL)
|
||||
clusterCov /= dim;
|
||||
|
||||
clusterCov /= weights.at<double>(clusterIndex);
|
||||
|
||||
// Update covsRotateMats for COV_MAT_GENERIC only
|
||||
if(covMatType == COV_MAT_GENERIC)
|
||||
{
|
||||
SVD svd(covs[clusterIndex], SVD::MODIFY_A + SVD::FULL_UV);
|
||||
covsEigenValues[clusterIndex] = svd.w;
|
||||
covsRotateMats[clusterIndex] = svd.u;
|
||||
}
|
||||
|
||||
max(covsEigenValues[clusterIndex], minEigenValue, covsEigenValues[clusterIndex]);
|
||||
|
||||
// update invCovsEigenValues
|
||||
invCovsEigenValues[clusterIndex] = 1./covsEigenValues[clusterIndex];
|
||||
}
|
||||
|
||||
for(int clusterIndex = 0; clusterIndex < nclusters; clusterIndex++)
|
||||
{
|
||||
if(weights.at<double>(clusterIndex) <= minPosWeight)
|
||||
{
|
||||
Mat clusterMean = means.row(clusterIndex);
|
||||
means.row(minWeightClusterIndex).copyTo(clusterMean);
|
||||
covs[minWeightClusterIndex].copyTo(covs[clusterIndex]);
|
||||
covsEigenValues[minWeightClusterIndex].copyTo(covsEigenValues[clusterIndex]);
|
||||
if(covMatType == COV_MAT_GENERIC)
|
||||
covsRotateMats[minWeightClusterIndex].copyTo(covsRotateMats[clusterIndex]);
|
||||
invCovsEigenValues[minWeightClusterIndex].copyTo(invCovsEigenValues[clusterIndex]);
|
||||
}
|
||||
}
|
||||
|
||||
// Normalize weights
|
||||
weights /= trainSamples.rows;
|
||||
}
|
||||
|
||||
void write_params(FileStorage& fs) const
|
||||
{
|
||||
fs << "nclusters" << nclusters;
|
||||
fs << "cov_mat_type" << (covMatType == COV_MAT_SPHERICAL ? String("spherical") :
|
||||
covMatType == COV_MAT_DIAGONAL ? String("diagonal") :
|
||||
covMatType == COV_MAT_GENERIC ? String("generic") :
|
||||
format("unknown_%d", covMatType));
|
||||
writeTermCrit(fs, termCrit);
|
||||
}
|
||||
|
||||
void write(FileStorage& fs) const CV_OVERRIDE
|
||||
{
|
||||
writeFormat(fs);
|
||||
fs << "training_params" << "{";
|
||||
write_params(fs);
|
||||
fs << "}";
|
||||
fs << "weights" << weights;
|
||||
fs << "means" << means;
|
||||
|
||||
size_t i, n = covs.size();
|
||||
|
||||
fs << "covs" << "[";
|
||||
for( i = 0; i < n; i++ )
|
||||
fs << covs[i];
|
||||
fs << "]";
|
||||
}
|
||||
|
||||
void read_params(const FileNode& fn)
|
||||
{
|
||||
nclusters = (int)fn["nclusters"];
|
||||
String s = (String)fn["cov_mat_type"];
|
||||
covMatType = s == "spherical" ? COV_MAT_SPHERICAL :
|
||||
s == "diagonal" ? COV_MAT_DIAGONAL :
|
||||
s == "generic" ? COV_MAT_GENERIC : -1;
|
||||
CV_Assert(covMatType >= 0);
|
||||
termCrit = readTermCrit(fn);
|
||||
}
|
||||
|
||||
void read(const FileNode& fn) CV_OVERRIDE
|
||||
{
|
||||
clear();
|
||||
read_params(fn["training_params"]);
|
||||
|
||||
fn["weights"] >> weights;
|
||||
fn["means"] >> means;
|
||||
|
||||
FileNode cfn = fn["covs"];
|
||||
FileNodeIterator cfn_it = cfn.begin();
|
||||
int i, n = (int)cfn.size();
|
||||
covs.resize(n);
|
||||
|
||||
for( i = 0; i < n; i++, ++cfn_it )
|
||||
(*cfn_it) >> covs[i];
|
||||
|
||||
decomposeCovs();
|
||||
computeLogWeightDivDet();
|
||||
}
|
||||
|
||||
Mat getWeights() const CV_OVERRIDE { return weights; }
|
||||
Mat getMeans() const CV_OVERRIDE { return means; }
|
||||
void getCovs(std::vector<Mat>& _covs) const CV_OVERRIDE
|
||||
{
|
||||
_covs.resize(covs.size());
|
||||
std::copy(covs.begin(), covs.end(), _covs.begin());
|
||||
}
|
||||
|
||||
// all inner matrices have type CV_64FC1
|
||||
Mat trainSamples;
|
||||
Mat trainProbs;
|
||||
Mat trainLogLikelihoods;
|
||||
Mat trainLabels;
|
||||
|
||||
Mat weights;
|
||||
Mat means;
|
||||
std::vector<Mat> covs;
|
||||
|
||||
std::vector<Mat> covsEigenValues;
|
||||
std::vector<Mat> covsRotateMats;
|
||||
std::vector<Mat> invCovsEigenValues;
|
||||
Mat logWeightDivDet;
|
||||
};
|
||||
|
||||
Ptr<EM> EM::create()
|
||||
{
|
||||
return makePtr<EMImpl>();
|
||||
}
|
||||
|
||||
Ptr<EM> EM::load(const String& filepath, const String& nodeName)
|
||||
{
|
||||
return Algorithm::load<EM>(filepath, nodeName);
|
||||
}
|
||||
|
||||
}
|
||||
} // namespace cv
|
||||
|
||||
/* End of file. */
|
||||
1373
3rdparty/opencv-4.5.4/modules/ml/src/gbt.cpp
vendored
Normal file
File diff suppressed because it is too large
222
3rdparty/opencv-4.5.4/modules/ml/src/inner_functions.cpp
vendored
Normal file
@@ -0,0 +1,222 @@
|
||||
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// Intel License Agreement
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of Intel Corporation may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
|
||||
namespace cv { namespace ml {
|
||||
|
||||
ParamGrid::ParamGrid() { minVal = maxVal = 0.; logStep = 1; }
|
||||
ParamGrid::ParamGrid(double _minVal, double _maxVal, double _logStep)
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
minVal = std::min(_minVal, _maxVal);
|
||||
maxVal = std::max(_minVal, _maxVal);
|
||||
logStep = std::max(_logStep, 1.);
|
||||
}
|
||||
|
||||
Ptr<ParamGrid> ParamGrid::create(double minval, double maxval, double logstep) {
|
||||
return makePtr<ParamGrid>(minval, maxval, logstep);
|
||||
}
|
||||
|
||||
bool StatModel::empty() const { return !isTrained(); }
|
||||
|
||||
int StatModel::getVarCount() const { return 0; }
|
||||
|
||||
bool StatModel::train(const Ptr<TrainData>& trainData, int )
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
CV_Assert(!trainData.empty());
|
||||
CV_Error(CV_StsNotImplemented, "");
|
||||
return false;
|
||||
}
|
||||
|
||||
bool StatModel::train( InputArray samples, int layout, InputArray responses )
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
CV_Assert(!samples.empty());
|
||||
return train(TrainData::create(samples, layout, responses));
|
||||
}
|
||||
|
||||
class ParallelCalcError : public ParallelLoopBody
|
||||
{
|
||||
private:
|
||||
const Ptr<TrainData>& data;
|
||||
bool &testerr;
|
||||
Mat &resp;
|
||||
const StatModel &s;
|
||||
vector<double> &errStrip;
|
||||
public:
|
||||
ParallelCalcError(const Ptr<TrainData>& d, bool &t, Mat &_r,const StatModel &w, vector<double> &e) :
|
||||
data(d),
|
||||
testerr(t),
|
||||
resp(_r),
|
||||
s(w),
|
||||
errStrip(e)
|
||||
{
|
||||
}
|
||||
virtual void operator()(const Range& range) const CV_OVERRIDE
|
||||
{
|
||||
int idxErr = range.start;
|
||||
CV_TRACE_FUNCTION_SKIP_NESTED();
|
||||
Mat samples = data->getSamples();
|
||||
Mat weights=testerr? data->getTestSampleWeights() : data->getTrainSampleWeights();
|
||||
int layout = data->getLayout();
|
||||
Mat sidx = testerr ? data->getTestSampleIdx() : data->getTrainSampleIdx();
|
||||
const int* sidx_ptr = sidx.ptr<int>();
|
||||
bool isclassifier = s.isClassifier();
|
||||
Mat responses = data->getResponses();
|
||||
int responses_type = responses.type();
|
||||
double err = 0;
|
||||
|
||||
|
||||
const float* sw = weights.empty() ? 0 : weights.ptr<float>();
|
||||
for (int i = range.start; i < range.end; i++)
|
||||
{
|
||||
int si = sidx_ptr ? sidx_ptr[i] : i;
|
||||
double sweight = sw ? static_cast<double>(sw[i]) : 1.;
|
||||
Mat sample = layout == ROW_SAMPLE ? samples.row(si) : samples.col(si);
|
||||
float val = s.predict(sample);
|
||||
float val0 = (responses_type == CV_32S) ? (float)responses.at<int>(si) : responses.at<float>(si);
|
||||
|
||||
if (isclassifier)
|
||||
err += sweight * fabs(val - val0) > FLT_EPSILON;
|
||||
else
|
||||
err += sweight * (val - val0)*(val - val0);
|
||||
if (!resp.empty())
|
||||
resp.at<float>(i) = val;
|
||||
}
|
||||
|
||||
|
||||
errStrip[idxErr]=err ;
|
||||
|
||||
};
|
||||
ParallelCalcError& operator=(const ParallelCalcError &) {
|
||||
return *this;
|
||||
};
|
||||
};
|
||||
|
||||
|
||||
float StatModel::calcError(const Ptr<TrainData>& data, bool testerr, OutputArray _resp) const
|
||||
{
|
||||
CV_TRACE_FUNCTION_SKIP_NESTED();
|
||||
CV_Assert(!data.empty());
|
||||
Mat samples = data->getSamples();
|
||||
Mat sidx = testerr ? data->getTestSampleIdx() : data->getTrainSampleIdx();
|
||||
Mat weights = testerr ? data->getTestSampleWeights() : data->getTrainSampleWeights();
|
||||
int n = (int)sidx.total();
|
||||
bool isclassifier = isClassifier();
|
||||
Mat responses = data->getResponses();
|
||||
|
||||
if (n == 0)
|
||||
{
|
||||
n = data->getNSamples();
|
||||
weights = data->getTrainSampleWeights();
|
||||
testerr =false;
|
||||
}
|
||||
|
||||
if (n == 0)
|
||||
return -FLT_MAX;
|
||||
|
||||
Mat resp;
|
||||
if (_resp.needed())
|
||||
resp.create(n, 1, CV_32F);
|
||||
|
||||
double err = 0;
|
||||
vector<double> errStrip(n,0.0);
|
||||
ParallelCalcError x(data, testerr, resp, *this,errStrip);
|
||||
|
||||
parallel_for_(Range(0,n),x);
|
||||
|
||||
for (size_t i = 0; i < errStrip.size(); i++)
|
||||
err += errStrip[i];
|
||||
float weightSum= weights.empty() ? n: static_cast<float>(sum(weights)(0));
|
||||
if (_resp.needed())
|
||||
resp.copyTo(_resp);
|
||||
|
||||
return (float)(err/ weightSum * (isclassifier ? 100 : 1));
|
||||
}
|
||||
|
||||
/* Calculates upper triangular matrix S, where A is a symmetrical matrix A=S'*S */
|
||||
static void Cholesky( const Mat& A, Mat& S )
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
CV_Assert(A.type() == CV_32F);
|
||||
|
||||
S = A.clone();
|
||||
cv::Cholesky ((float*)S.ptr(),S.step, S.rows,NULL, 0, 0);
|
||||
S = S.t();
|
||||
for (int i=1;i<S.rows;i++)
|
||||
for (int j=0;j<i;j++)
|
||||
S.at<float>(i,j)=0;
|
||||
}
|
||||
|
||||
/* Generates <sample> from multivariate normal distribution, where <mean> - is an
|
||||
average row vector, <cov> - symmetric covariation matrix */
|
||||
void randMVNormal( InputArray _mean, InputArray _cov, int nsamples, OutputArray _samples )
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
// check mean vector and covariance matrix
|
||||
Mat mean = _mean.getMat(), cov = _cov.getMat();
|
||||
int dim = (int)mean.total(); // dimensionality
|
||||
CV_Assert(mean.rows == 1 || mean.cols == 1);
|
||||
CV_Assert(cov.rows == dim && cov.cols == dim);
|
||||
mean = mean.reshape(1,1); // ensure a row vector
|
||||
|
||||
// generate n-samples of the same dimension, from ~N(0,1)
|
||||
_samples.create(nsamples, dim, CV_32F);
|
||||
Mat samples = _samples.getMat();
|
||||
randn(samples, Scalar::all(0), Scalar::all(1));
|
||||
|
||||
// decompose covariance using Cholesky: cov = U'*U
|
||||
// (cov must be square, symmetric, and positive semi-definite matrix)
|
||||
Mat utmat;
|
||||
Cholesky(cov, utmat);
|
||||
|
||||
// transform random numbers using specified mean and covariance
|
||||
for( int i = 0; i < nsamples; i++ )
|
||||
{
|
||||
Mat sample = samples.row(i);
|
||||
sample = sample * utmat + mean;
|
||||
}
|
||||
}
|
||||
|
||||
}}
|
||||
|
||||
/* End of file */
|
||||
533
3rdparty/opencv-4.5.4/modules/ml/src/kdtree.cpp
vendored
Normal file
@@ -0,0 +1,533 @@
|
||||
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// License Agreement
|
||||
// For Open Source Computer Vision Library
|
||||
//
|
||||
// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
|
||||
// Copyright (C) 2009, Willow Garage Inc., all rights reserved.
|
||||
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
|
||||
// Copyright (C) 2014, Itseez Inc, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of the copyright holders may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
#include "kdtree.hpp"
|
||||
|
||||
namespace cv
|
||||
{
|
||||
namespace ml
|
||||
{
|
||||
// This is a reimplementation of kd-trees from cvkdtree*.* by Xavier Delacour, cleaned-up and
// adapted to work with the new OpenCV data structures.
|
||||
|
||||
// The algorithm is taken from:
|
||||
// J.S. Beis and D.G. Lowe. Shape indexing using approximate nearest-neighbor search
|
||||
// in high-dimensional spaces. In Proc. IEEE Conf. Comp. Vision Patt. Recog.,
|
||||
// pages 1000--1006, 1997. http://citeseer.ist.psu.edu/beis97shape.html
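//
// In short (BBF): a query descends the tree to a leaf, and every branch not taken is pushed
// onto a priority queue keyed by the distance from the query to that branch's splitting
// boundary; the closest pending branches are then revisited until at most Emax leaves have
// been examined, giving an approximate (or, for large Emax, exact) nearest-neighbor result.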
|
||||
|
||||
const int MAX_TREE_DEPTH = 32;
|
||||
|
||||
KDTree::KDTree()
|
||||
{
|
||||
maxDepth = -1;
|
||||
normType = NORM_L2;
|
||||
}
|
||||
|
||||
KDTree::KDTree(InputArray _points, bool _copyData)
|
||||
{
|
||||
maxDepth = -1;
|
||||
normType = NORM_L2;
|
||||
build(_points, _copyData);
|
||||
}
|
||||
|
||||
KDTree::KDTree(InputArray _points, InputArray _labels, bool _copyData)
|
||||
{
|
||||
maxDepth = -1;
|
||||
normType = NORM_L2;
|
||||
build(_points, _labels, _copyData);
|
||||
}
|
||||
|
||||
struct SubTree
|
||||
{
|
||||
SubTree() : first(0), last(0), nodeIdx(0), depth(0) {}
|
||||
SubTree(int _first, int _last, int _nodeIdx, int _depth)
|
||||
: first(_first), last(_last), nodeIdx(_nodeIdx), depth(_depth) {}
|
||||
int first;
|
||||
int last;
|
||||
int nodeIdx;
|
||||
int depth;
|
||||
};
|
||||
|
||||
|
||||
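// Partially sorts ofs[a..b] (offsets into the point data) around the median of the values
// along one dimension, so that the median element ends up at position (a+b)/2 with smaller
// values before it and larger values after it; used to split a subtree into two halves of
// (almost) equal size. Returns the median value.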
static float
|
||||
medianPartition( size_t* ofs, int a, int b, const float* vals )
|
||||
{
|
||||
int k, a0 = a, b0 = b;
|
||||
int middle = (a + b)/2;
|
||||
while( b > a )
|
||||
{
|
||||
int i0 = a, i1 = (a+b)/2, i2 = b;
|
||||
float v0 = vals[ofs[i0]], v1 = vals[ofs[i1]], v2 = vals[ofs[i2]];
|
||||
int ip = v0 < v1 ? (v1 < v2 ? i1 : v0 < v2 ? i2 : i0) :
|
||||
v0 < v2 ? (v1 == v0 ? i2 : i0): (v1 < v2 ? i2 : i1);
|
||||
float pivot = vals[ofs[ip]];
|
||||
std::swap(ofs[ip], ofs[i2]);
|
||||
|
||||
for( i1 = i0, i0--; i1 <= i2; i1++ )
|
||||
if( vals[ofs[i1]] <= pivot )
|
||||
{
|
||||
i0++;
|
||||
std::swap(ofs[i0], ofs[i1]);
|
||||
}
|
||||
if( i0 == middle )
|
||||
break;
|
||||
if( i0 > middle )
|
||||
b = i0 - (b == i0);
|
||||
else
|
||||
a = i0;
|
||||
}
|
||||
|
||||
float pivot = vals[ofs[middle]];
|
||||
int less = 0, more = 0;
|
||||
for( k = a0; k < middle; k++ )
|
||||
{
|
||||
CV_Assert(vals[ofs[k]] <= pivot);
|
||||
less += vals[ofs[k]] < pivot;
|
||||
}
|
||||
for( k = b0; k > middle; k-- )
|
||||
{
|
||||
CV_Assert(vals[ofs[k]] >= pivot);
|
||||
more += vals[ofs[k]] > pivot;
|
||||
}
|
||||
|
||||
return vals[ofs[middle]];
|
||||
}
|
||||
|
||||
static void
|
||||
computeSums( const Mat& points, const size_t* ofs, int a, int b, double* sums )
|
||||
{
|
||||
int i, j, dims = points.cols;
|
||||
const float* data = points.ptr<float>(0);
|
||||
for( j = 0; j < dims; j++ )
|
||||
sums[j*2] = sums[j*2+1] = 0;
|
||||
for( i = a; i <= b; i++ )
|
||||
{
|
||||
const float* row = data + ofs[i];
|
||||
for( j = 0; j < dims; j++ )
|
||||
{
|
||||
double t = row[j], s = sums[j*2] + t, s2 = sums[j*2+1] + t*t;
|
||||
sums[j*2] = s; sums[j*2+1] = s2;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void KDTree::build(InputArray _points, bool _copyData)
|
||||
{
|
||||
build(_points, noArray(), _copyData);
|
||||
}
|
||||
|
||||
|
||||
void KDTree::build(InputArray __points, InputArray __labels, bool _copyData)
|
||||
{
|
||||
Mat _points = __points.getMat(), _labels = __labels.getMat();
|
||||
CV_Assert(_points.type() == CV_32F && !_points.empty());
|
||||
std::vector<KDTree::Node>().swap(nodes);
|
||||
|
||||
if( !_copyData )
|
||||
points = _points;
|
||||
else
|
||||
{
|
||||
points.release();
|
||||
points.create(_points.size(), _points.type());
|
||||
}
|
||||
|
||||
int i, j, n = _points.rows, ptdims = _points.cols, top = 0;
|
||||
const float* data = _points.ptr<float>(0);
|
||||
float* dstdata = points.ptr<float>(0);
|
||||
size_t step = _points.step1();
|
||||
size_t dstep = points.step1();
|
||||
int ptpos = 0;
|
||||
labels.resize(n);
|
||||
const int* _labels_data = 0;
|
||||
|
||||
if( !_labels.empty() )
|
||||
{
|
||||
int nlabels = _labels.checkVector(1, CV_32S, true);
|
||||
CV_Assert(nlabels == n);
|
||||
_labels_data = _labels.ptr<int>();
|
||||
}
|
||||
|
||||
Mat sumstack(MAX_TREE_DEPTH*2, ptdims*2, CV_64F);
|
||||
SubTree stack[MAX_TREE_DEPTH*2];
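// (explicit stack for iterative construction; row 'top' of sumstack keeps, for the pending
//  subtree, the per-dimension running sum and sum of squares used to pick the split axis)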
|
||||
|
||||
std::vector<size_t> _ptofs(n);
|
||||
size_t* ptofs = &_ptofs[0];
|
||||
|
||||
for( i = 0; i < n; i++ )
|
||||
ptofs[i] = i*step;
|
||||
|
||||
nodes.push_back(Node());
|
||||
computeSums(points, ptofs, 0, n-1, sumstack.ptr<double>(top));
|
||||
stack[top++] = SubTree(0, n-1, 0, 0);
|
||||
int _maxDepth = 0;
|
||||
|
||||
while( --top >= 0 )
|
||||
{
|
||||
int first = stack[top].first, last = stack[top].last;
|
||||
int depth = stack[top].depth, nidx = stack[top].nodeIdx;
|
||||
int count = last - first + 1, dim = -1;
|
||||
const double* sums = sumstack.ptr<double>(top);
|
||||
double invCount = 1./count, maxVar = -1.;
|
||||
|
||||
if( count == 1 )
|
||||
{
|
||||
int idx0 = (int)(ptofs[first]/step);
|
||||
int idx = _copyData ? ptpos++ : idx0;
|
||||
nodes[nidx].idx = ~idx;
|
||||
if( _copyData )
|
||||
{
|
||||
const float* src = data + ptofs[first];
|
||||
float* dst = dstdata + idx*dstep;
|
||||
for( j = 0; j < ptdims; j++ )
|
||||
dst[j] = src[j];
|
||||
}
|
||||
labels[idx] = _labels_data ? _labels_data[idx0] : idx0;
|
||||
_maxDepth = std::max(_maxDepth, depth);
|
||||
continue;
|
||||
}
|
||||
|
||||
// find the dimensionality with the biggest variance
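// (per-dimension variance from the accumulated sums: var_j = sumsq_j/count - (sum_j/count)^2)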
|
||||
for( j = 0; j < ptdims; j++ )
|
||||
{
|
||||
double m = sums[j*2]*invCount;
|
||||
double varj = sums[j*2+1]*invCount - m*m;
|
||||
if( maxVar < varj )
|
||||
{
|
||||
maxVar = varj;
|
||||
dim = j;
|
||||
}
|
||||
}
|
||||
|
||||
int left = (int)nodes.size(), right = left + 1;
|
||||
nodes.push_back(Node());
|
||||
nodes.push_back(Node());
|
||||
nodes[nidx].idx = dim;
|
||||
nodes[nidx].left = left;
|
||||
nodes[nidx].right = right;
|
||||
nodes[nidx].boundary = medianPartition(ptofs, first, last, data + dim);
|
||||
|
||||
int middle = (first + last)/2;
|
||||
double *lsums = (double*)sums, *rsums = lsums + ptdims*2;
|
||||
computeSums(points, ptofs, middle+1, last, rsums);
|
||||
for( j = 0; j < ptdims*2; j++ )
|
||||
lsums[j] = sums[j] - rsums[j];
|
||||
stack[top++] = SubTree(first, middle, left, depth+1);
|
||||
stack[top++] = SubTree(middle+1, last, right, depth+1);
|
||||
}
|
||||
maxDepth = _maxDepth;
|
||||
}
|
||||
|
||||
|
||||
struct PQueueElem
|
||||
{
|
||||
PQueueElem() : dist(0), idx(0) {}
|
||||
PQueueElem(float _dist, int _idx) : dist(_dist), idx(_idx) {}
|
||||
float dist;
|
||||
int idx;
|
||||
};
|
||||
|
||||
|
||||
int KDTree::findNearest(InputArray _vec, int K, int emax,
|
||||
OutputArray _neighborsIdx, OutputArray _neighbors,
|
||||
OutputArray _dist, OutputArray _labels) const
|
||||
|
||||
{
|
||||
Mat vecmat = _vec.getMat();
|
||||
CV_Assert( vecmat.isContinuous() && vecmat.type() == CV_32F && vecmat.total() == (size_t)points.cols );
|
||||
const float* vec = vecmat.ptr<float>();
|
||||
K = std::min(K, points.rows);
|
||||
int ptdims = points.cols;
|
||||
|
||||
CV_Assert(K > 0 && (normType == NORM_L2 || normType == NORM_L1));
|
||||
|
||||
AutoBuffer<uchar> _buf((K+1)*(sizeof(float) + sizeof(int)));
|
||||
int* idx = (int*)_buf.data();
|
||||
float* dist = (float*)(idx + K + 1);
|
||||
int i, j, ncount = 0, e = 0;
|
||||
|
||||
int qsize = 0, maxqsize = 1 << 10;
|
||||
AutoBuffer<uchar> _pqueue(maxqsize*sizeof(PQueueElem));
|
||||
PQueueElem* pqueue = (PQueueElem*)_pqueue.data();
|
||||
emax = std::max(emax, 1);
|
||||
|
||||
for( e = 0; e < emax; )
|
||||
{
|
||||
float d, alt_d = 0.f;
|
||||
int nidx;
|
||||
|
||||
if( e == 0 )
|
||||
nidx = 0;
|
||||
else
|
||||
{
|
||||
// take the next node from the priority queue
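// (the queue is a binary min-heap on distance: pop the root, move the last
//  element to the root and sift it down to restore the heap property)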
|
||||
if( qsize == 0 )
|
||||
break;
|
||||
nidx = pqueue[0].idx;
|
||||
alt_d = pqueue[0].dist;
|
||||
if( --qsize > 0 )
|
||||
{
|
||||
std::swap(pqueue[0], pqueue[qsize]);
|
||||
d = pqueue[0].dist;
|
||||
for( i = 0;;)
|
||||
{
|
||||
int left = i*2 + 1, right = i*2 + 2;
|
||||
if( left >= qsize )
|
||||
break;
|
||||
if( right < qsize && pqueue[right].dist < pqueue[left].dist )
|
||||
left = right;
|
||||
if( pqueue[left].dist >= d )
|
||||
break;
|
||||
std::swap(pqueue[i], pqueue[left]);
|
||||
i = left;
|
||||
}
|
||||
}
|
||||
|
||||
if( ncount == K && alt_d > dist[ncount-1] )
|
||||
continue;
|
||||
}
|
||||
|
||||
for(;;)
|
||||
{
|
||||
if( nidx < 0 )
|
||||
break;
|
||||
const Node& n = nodes[nidx];
|
||||
|
||||
if( n.idx < 0 )
|
||||
{
|
||||
i = ~n.idx;
|
||||
const float* row = points.ptr<float>(i);
|
||||
if( normType == NORM_L2 )
|
||||
for( j = 0, d = 0.f; j < ptdims; j++ )
|
||||
{
|
||||
float t = vec[j] - row[j];
|
||||
d += t*t;
|
||||
}
|
||||
else
|
||||
for( j = 0, d = 0.f; j < ptdims; j++ )
|
||||
d += std::abs(vec[j] - row[j]);
|
||||
|
||||
dist[ncount] = d;
|
||||
idx[ncount] = i;
|
||||
for( i = ncount-1; i >= 0; i-- )
|
||||
{
|
||||
if( dist[i] <= d )
|
||||
break;
|
||||
std::swap(dist[i], dist[i+1]);
|
||||
std::swap(idx[i], idx[i+1]);
|
||||
}
|
||||
ncount += ncount < K;
|
||||
e++;
|
||||
break;
|
||||
}
|
||||
|
||||
int alt;
|
||||
if( vec[n.idx] <= n.boundary )
|
||||
{
|
||||
nidx = n.left;
|
||||
alt = n.right;
|
||||
}
|
||||
else
|
||||
{
|
||||
nidx = n.right;
|
||||
alt = n.left;
|
||||
}
|
||||
|
||||
d = vec[n.idx] - n.boundary;
|
||||
if( normType == NORM_L2 )
|
||||
d = d*d + alt_d;
|
||||
else
|
||||
d = std::abs(d) + alt_d;
|
||||
// subtree pruning
|
||||
if( ncount == K && d > dist[ncount-1] )
|
||||
continue;
|
||||
// add alternative subtree to the priority queue
|
||||
pqueue[qsize] = PQueueElem(d, alt);
|
||||
for( i = qsize; i > 0; )
|
||||
{
|
||||
int parent = (i-1)/2;
|
||||
if( parent < 0 || pqueue[parent].dist <= d )
|
||||
break;
|
||||
std::swap(pqueue[i], pqueue[parent]);
|
||||
i = parent;
|
||||
}
|
||||
qsize += qsize+1 < maxqsize;
|
||||
}
|
||||
}
|
||||
|
||||
K = std::min(K, ncount);
|
||||
if( _neighborsIdx.needed() )
|
||||
{
|
||||
_neighborsIdx.create(K, 1, CV_32S, -1, true);
|
||||
Mat nidx = _neighborsIdx.getMat();
|
||||
Mat(nidx.size(), CV_32S, &idx[0]).copyTo(nidx);
|
||||
}
|
||||
if( _dist.needed() )
|
||||
sqrt(Mat(K, 1, CV_32F, dist), _dist);
|
||||
|
||||
if( _neighbors.needed() || _labels.needed() )
|
||||
getPoints(Mat(K, 1, CV_32S, idx), _neighbors, _labels);
|
||||
return K;
|
||||
}
|
||||
|
||||
|
||||
void KDTree::findOrthoRange(InputArray _lowerBound,
|
||||
InputArray _upperBound,
|
||||
OutputArray _neighborsIdx,
|
||||
OutputArray _neighbors,
|
||||
OutputArray _labels ) const
|
||||
{
|
||||
int ptdims = points.cols;
|
||||
Mat lowerBound = _lowerBound.getMat(), upperBound = _upperBound.getMat();
|
||||
CV_Assert( lowerBound.size == upperBound.size &&
|
||||
lowerBound.isContinuous() &&
|
||||
upperBound.isContinuous() &&
|
||||
lowerBound.type() == upperBound.type() &&
|
||||
lowerBound.type() == CV_32F &&
|
||||
lowerBound.total() == (size_t)ptdims );
|
||||
const float* L = lowerBound.ptr<float>();
|
||||
const float* R = upperBound.ptr<float>();
|
||||
|
||||
std::vector<int> idx;
|
||||
AutoBuffer<int> _stack(MAX_TREE_DEPTH*2 + 1);
|
||||
int* stack = _stack.data();
|
||||
int top = 0;
|
||||
|
||||
stack[top++] = 0;
|
||||
|
||||
while( --top >= 0 )
|
||||
{
|
||||
int nidx = stack[top];
|
||||
if( nidx < 0 )
|
||||
break;
|
||||
const Node& n = nodes[nidx];
|
||||
if( n.idx < 0 )
|
||||
{
|
||||
int j, i = ~n.idx;
|
||||
const float* row = points.ptr<float>(i);
|
||||
for( j = 0; j < ptdims; j++ )
|
||||
if( row[j] < L[j] || row[j] >= R[j] )
|
||||
break;
|
||||
if( j == ptdims )
|
||||
idx.push_back(i);
|
||||
continue;
|
||||
}
|
||||
if( L[n.idx] <= n.boundary )
|
||||
stack[top++] = n.left;
|
||||
if( R[n.idx] > n.boundary )
|
||||
stack[top++] = n.right;
|
||||
}
|
||||
|
||||
if( _neighborsIdx.needed() )
|
||||
{
|
||||
_neighborsIdx.create((int)idx.size(), 1, CV_32S, -1, true);
|
||||
Mat nidx = _neighborsIdx.getMat();
|
||||
Mat(nidx.size(), CV_32S, &idx[0]).copyTo(nidx);
|
||||
}
|
||||
getPoints( idx, _neighbors, _labels );
|
||||
}
|
||||
|
||||
|
||||
void KDTree::getPoints(InputArray _idx, OutputArray _pts, OutputArray _labels) const
|
||||
{
|
||||
Mat idxmat = _idx.getMat(), pts, labelsmat;
|
||||
CV_Assert( idxmat.isContinuous() && idxmat.type() == CV_32S &&
|
||||
(idxmat.cols == 1 || idxmat.rows == 1) );
|
||||
const int* idx = idxmat.ptr<int>();
|
||||
int* dstlabels = 0;
|
||||
|
||||
int ptdims = points.cols;
|
||||
int i, nidx = (int)idxmat.total();
|
||||
if( nidx == 0 )
|
||||
{
|
||||
_pts.release();
|
||||
_labels.release();
|
||||
return;
|
||||
}
|
||||
|
||||
if( _pts.needed() )
|
||||
{
|
||||
_pts.create( nidx, ptdims, points.type());
|
||||
pts = _pts.getMat();
|
||||
}
|
||||
|
||||
if(_labels.needed())
|
||||
{
|
||||
_labels.create(nidx, 1, CV_32S, -1, true);
|
||||
labelsmat = _labels.getMat();
|
||||
CV_Assert( labelsmat.isContinuous() );
|
||||
dstlabels = labelsmat.ptr<int>();
|
||||
}
|
||||
const int* srclabels = !labels.empty() ? &labels[0] : 0;
|
||||
|
||||
for( i = 0; i < nidx; i++ )
|
||||
{
|
||||
int k = idx[i];
|
||||
CV_Assert( (unsigned)k < (unsigned)points.rows );
|
||||
const float* src = points.ptr<float>(k);
|
||||
if( !pts.empty() )
|
||||
std::copy(src, src + ptdims, pts.ptr<float>(i));
|
||||
if( dstlabels )
|
||||
dstlabels[i] = srclabels ? srclabels[k] : k;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
const float* KDTree::getPoint(int ptidx, int* label) const
|
||||
{
|
||||
CV_Assert( (unsigned)ptidx < (unsigned)points.rows);
|
||||
if(label)
|
||||
*label = labels[ptidx];
|
||||
return points.ptr<float>(ptidx);
|
||||
}
|
||||
|
||||
|
||||
int KDTree::dims() const
|
||||
{
|
||||
return !points.empty() ? points.cols : 0;
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
97
3rdparty/opencv-4.5.4/modules/ml/src/kdtree.hpp
vendored
Normal file
@@ -0,0 +1,97 @@
|
||||
#ifndef KDTREE_H
|
||||
#define KDTREE_H
|
||||
|
||||
#include "precomp.hpp"
|
||||
|
||||
namespace cv
|
||||
{
|
||||
namespace ml
|
||||
{
|
||||
|
||||
/*!
|
||||
Fast Nearest Neighbor Search Class.
|
||||
|
||||
The class implements the D. Lowe BBF (Best-Bin-First) algorithm for fast
approximate (or accurate) nearest neighbor search in multi-dimensional spaces.
|
||||
|
||||
First, a set of vectors is passed to KDTree::KDTree() constructor
|
||||
or KDTree::build() method, where it is reordered.
|
||||
|
||||
Then arbitrary vectors can be passed to KDTree::findNearest() methods, which
|
||||
find the K nearest neighbors among the vectors from the initial set.
|
||||
The user can balance between the speed and accuracy of the search by varying the Emax
parameter, which is the number of leaves that the algorithm checks.
Larger values of Emax yield more accurate results at the expense of lower processing speed.
|
||||
|
||||
\code
|
||||
KDTree T(points, false);
|
||||
const int K = 3, Emax = INT_MAX;
|
||||
int idx[K];
|
||||
float dist[K];
|
||||
T.findNearest(query_vec, K, Emax, idx, 0, dist);
|
||||
CV_Assert(dist[0] <= dist[1] && dist[1] <= dist[2]);
|
||||
\endcode
|
||||
*/
|
||||
class CV_EXPORTS_W KDTree
|
||||
{
|
||||
public:
|
||||
/*!
|
||||
The node of the search tree.
|
||||
*/
|
||||
struct Node
|
||||
{
|
||||
Node() : idx(-1), left(-1), right(-1), boundary(0.f) {}
|
||||
Node(int _idx, int _left, int _right, float _boundary)
|
||||
: idx(_idx), left(_left), right(_right), boundary(_boundary) {}
|
||||
|
||||
//! split dimension; >=0 for nodes (dim), < 0 for leaves (index of the point)
|
||||
int idx;
|
||||
//! node indices of the left and the right branches
|
||||
int left, right;
|
||||
//! go to the left if query_vec[node.idx]<=node.boundary, otherwise go to the right
|
||||
float boundary;
|
||||
};
|
||||
|
||||
//! the default constructor
|
||||
CV_WRAP KDTree();
|
||||
//! the full constructor that builds the search tree
|
||||
CV_WRAP KDTree(InputArray points, bool copyAndReorderPoints = false);
|
||||
//! the full constructor that builds the search tree
|
||||
CV_WRAP KDTree(InputArray points, InputArray _labels,
|
||||
bool copyAndReorderPoints = false);
|
||||
//! builds the search tree
|
||||
CV_WRAP void build(InputArray points, bool copyAndReorderPoints = false);
|
||||
//! builds the search tree
|
||||
CV_WRAP void build(InputArray points, InputArray labels,
|
||||
bool copyAndReorderPoints = false);
|
||||
//! finds the K nearest neighbors of "vec" while looking at Emax (at most) leaves
|
||||
CV_WRAP int findNearest(InputArray vec, int K, int Emax,
|
||||
OutputArray neighborsIdx,
|
||||
OutputArray neighbors = noArray(),
|
||||
OutputArray dist = noArray(),
|
||||
OutputArray labels = noArray()) const;
|
||||
//! finds all the points from the initial set that belong to the specified box
|
||||
CV_WRAP void findOrthoRange(InputArray minBounds,
|
||||
InputArray maxBounds,
|
||||
OutputArray neighborsIdx,
|
||||
OutputArray neighbors = noArray(),
|
||||
OutputArray labels = noArray()) const;
|
||||
//! returns vectors with the specified indices
|
||||
CV_WRAP void getPoints(InputArray idx, OutputArray pts,
|
||||
OutputArray labels = noArray()) const;
|
||||
//! returns the vector with the specified index
|
||||
const float* getPoint(int ptidx, int* label = 0) const;
|
||||
//! returns the search space dimensionality
|
||||
CV_WRAP int dims() const;
|
||||
|
||||
std::vector<Node> nodes; //!< all the tree nodes
|
||||
CV_PROP Mat points; //!< all the points. It can be a reordered copy of the input vector set or the original vector set.
|
||||
CV_PROP std::vector<int> labels; //!< the parallel array of labels.
|
||||
CV_PROP int maxDepth; //!< maximum depth of the search tree. Do not modify it
|
||||
CV_PROP_RW int normType; //!< type of the distance (cv::NORM_L1 or cv::NORM_L2) used for search. Initially set to cv::NORM_L2, but you can modify it
|
||||
};
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
#endif
|
||||
521
3rdparty/opencv-4.5.4/modules/ml/src/knearest.cpp
vendored
Normal file
@@ -0,0 +1,521 @@
|
||||
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// License Agreement
|
||||
// For Open Source Computer Vision Library
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Copyright (C) 2014, Itseez Inc, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of the copyright holders may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
#include "kdtree.hpp"
|
||||
|
||||
/****************************************************************************************\
|
||||
* K-Nearest Neighbors Classifier *
|
||||
\****************************************************************************************/
|
||||
|
||||
namespace cv {
|
||||
namespace ml {
|
||||
|
||||
const String NAME_BRUTE_FORCE = "opencv_ml_knn";
|
||||
const String NAME_KDTREE = "opencv_ml_knn_kd";
|
||||
|
||||
class Impl
|
||||
{
|
||||
public:
|
||||
Impl()
|
||||
{
|
||||
defaultK = 10;
|
||||
isclassifier = true;
|
||||
Emax = INT_MAX;
|
||||
}
|
||||
|
||||
virtual ~Impl() {}
|
||||
virtual String getModelName() const = 0;
|
||||
virtual int getType() const = 0;
|
||||
virtual float findNearest( InputArray _samples, int k,
|
||||
OutputArray _results,
|
||||
OutputArray _neighborResponses,
|
||||
OutputArray _dists ) const = 0;
|
||||
|
||||
bool train( const Ptr<TrainData>& data, int flags )
|
||||
{
|
||||
CV_Assert(!data.empty());
|
||||
Mat new_samples = data->getTrainSamples(ROW_SAMPLE);
|
||||
Mat new_responses;
|
||||
data->getTrainResponses().convertTo(new_responses, CV_32F);
|
||||
bool update = (flags & ml::KNearest::UPDATE_MODEL) != 0 && !samples.empty();
|
||||
|
||||
CV_Assert( new_samples.type() == CV_32F );
|
||||
|
||||
if( !update )
|
||||
{
|
||||
clear();
|
||||
}
|
||||
else
|
||||
{
|
||||
CV_Assert( new_samples.cols == samples.cols &&
|
||||
new_responses.cols == responses.cols );
|
||||
}
|
||||
|
||||
samples.push_back(new_samples);
|
||||
responses.push_back(new_responses);
|
||||
|
||||
doTrain(samples);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
virtual void doTrain(InputArray points) { CV_UNUSED(points); }
|
||||
|
||||
void clear()
|
||||
{
|
||||
samples.release();
|
||||
responses.release();
|
||||
}
|
||||
|
||||
void read( const FileNode& fn )
|
||||
{
|
||||
clear();
|
||||
isclassifier = (int)fn["is_classifier"] != 0;
|
||||
defaultK = (int)fn["default_k"];
|
||||
|
||||
fn["samples"] >> samples;
|
||||
fn["responses"] >> responses;
|
||||
}
|
||||
|
||||
void write( FileStorage& fs ) const
|
||||
{
|
||||
fs << "is_classifier" << (int)isclassifier;
|
||||
fs << "default_k" << defaultK;
|
||||
|
||||
fs << "samples" << samples;
|
||||
fs << "responses" << responses;
|
||||
}
|
||||
|
||||
public:
|
||||
int defaultK;
|
||||
bool isclassifier;
|
||||
int Emax;
|
||||
|
||||
Mat samples;
|
||||
Mat responses;
|
||||
};
|
||||
|
||||
class BruteForceImpl CV_FINAL : public Impl
|
||||
{
|
||||
public:
|
||||
String getModelName() const CV_OVERRIDE { return NAME_BRUTE_FORCE; }
|
||||
int getType() const CV_OVERRIDE { return ml::KNearest::BRUTE_FORCE; }
|
||||
|
||||
void findNearestCore( const Mat& _samples, int k, const Range& range,
|
||||
Mat* results, Mat* neighbor_responses,
|
||||
Mat* dists, float* presult ) const
|
||||
{
|
||||
int testidx, baseidx, i, j, d = samples.cols, nsamples = samples.rows;
|
||||
int testcount = range.end - range.start;
|
||||
|
||||
AutoBuffer<float> buf(testcount*k*2);
|
||||
float* dbuf = buf.data();
|
||||
float* rbuf = dbuf + testcount*k;
|
||||
|
||||
const float* rptr = responses.ptr<float>();
|
||||
|
||||
for( testidx = 0; testidx < testcount; testidx++ )
|
||||
{
|
||||
for( i = 0; i < k; i++ )
|
||||
{
|
||||
dbuf[testidx*k + i] = FLT_MAX;
|
||||
rbuf[testidx*k + i] = 0.f;
|
||||
}
|
||||
}
|
||||
|
||||
for( baseidx = 0; baseidx < nsamples; baseidx++ )
|
||||
{
|
||||
for( testidx = 0; testidx < testcount; testidx++ )
|
||||
{
|
||||
const float* v = samples.ptr<float>(baseidx);
|
||||
const float* u = _samples.ptr<float>(testidx + range.start);
|
||||
|
||||
float s = 0;
|
||||
for( i = 0; i <= d - 4; i += 4 )
|
||||
{
|
||||
float t0 = u[i] - v[i], t1 = u[i+1] - v[i+1];
|
||||
float t2 = u[i+2] - v[i+2], t3 = u[i+3] - v[i+3];
|
||||
s += t0*t0 + t1*t1 + t2*t2 + t3*t3;
|
||||
}
|
||||
|
||||
for( ; i < d; i++ )
|
||||
{
|
||||
float t0 = u[i] - v[i];
|
||||
s += t0*t0;
|
||||
}
|
||||
|
||||
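// Note: squared distances are non-negative, so their IEEE-754 bit patterns can be compared
// as signed 32-bit integers (via Cv32suf) with the same ordering as the floats; the k best
// neighbours are kept sorted by the small insertion step below.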
Cv32suf si;
|
||||
si.f = (float)s;
|
||||
Cv32suf* dd = (Cv32suf*)(&dbuf[testidx*k]);
|
||||
float* nr = &rbuf[testidx*k];
|
||||
|
||||
for( i = k; i > 0; i-- )
|
||||
if( si.i >= dd[i-1].i )
|
||||
break;
|
||||
if( i >= k )
|
||||
continue;
|
||||
|
||||
for( j = k-2; j >= i; j-- )
|
||||
{
|
||||
dd[j+1].i = dd[j].i;
|
||||
nr[j+1] = nr[j];
|
||||
}
|
||||
dd[i].i = si.i;
|
||||
nr[i] = rptr[baseidx];
|
||||
}
|
||||
}
|
||||
|
||||
float result = 0.f;
|
||||
float inv_scale = 1.f/k;
|
||||
|
||||
for( testidx = 0; testidx < testcount; testidx++ )
|
||||
{
|
||||
if( neighbor_responses )
|
||||
{
|
||||
float* nr = neighbor_responses->ptr<float>(testidx + range.start);
|
||||
for( j = 0; j < k; j++ )
|
||||
nr[j] = rbuf[testidx*k + j];
|
||||
for( ; j < k; j++ )
|
||||
nr[j] = 0.f;
|
||||
}
|
||||
|
||||
if( dists )
|
||||
{
|
||||
float* dptr = dists->ptr<float>(testidx + range.start);
|
||||
for( j = 0; j < k; j++ )
|
||||
dptr[j] = dbuf[testidx*k + j];
|
||||
for( ; j < k; j++ )
|
||||
dptr[j] = 0.f;
|
||||
}
|
||||
|
||||
if( results || testidx+range.start == 0 )
|
||||
{
|
||||
if( !isclassifier || k == 1 )
|
||||
{
|
||||
float s = 0.f;
|
||||
for( j = 0; j < k; j++ )
|
||||
s += rbuf[testidx*k + j];
|
||||
result = (float)(s*inv_scale);
|
||||
}
|
||||
else
|
||||
{
|
||||
float* rp = rbuf + testidx*k;
|
||||
std::sort(rp, rp+k);
|
||||
|
||||
result = rp[0];
|
||||
int prev_start = 0;
|
||||
int best_count = 0;
|
||||
for( j = 1; j <= k; j++ )
|
||||
{
|
||||
if( j == k || rp[j] != rp[j-1] )
|
||||
{
|
||||
int count = j - prev_start;
|
||||
if( best_count < count )
|
||||
{
|
||||
best_count = count;
|
||||
result = rp[j-1];
|
||||
}
|
||||
prev_start = j;
|
||||
}
|
||||
}
|
||||
}
|
||||
if( results )
|
||||
results->at<float>(testidx + range.start) = result;
|
||||
if( presult && testidx+range.start == 0 )
|
||||
*presult = result;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
struct findKNearestInvoker : public ParallelLoopBody
|
||||
{
|
||||
findKNearestInvoker(const BruteForceImpl* _p, int _k, const Mat& __samples,
|
||||
Mat* __results, Mat* __neighbor_responses, Mat* __dists, float* _presult)
|
||||
{
|
||||
p = _p;
|
||||
k = _k;
|
||||
_samples = &__samples;
|
||||
_results = __results;
|
||||
_neighbor_responses = __neighbor_responses;
|
||||
_dists = __dists;
|
||||
presult = _presult;
|
||||
}
|
||||
|
||||
void operator()(const Range& range) const CV_OVERRIDE
|
||||
{
|
||||
int delta = std::min(range.end - range.start, 256);
|
||||
for( int start = range.start; start < range.end; start += delta )
|
||||
{
|
||||
p->findNearestCore( *_samples, k, Range(start, std::min(start + delta, range.end)),
|
||||
_results, _neighbor_responses, _dists, presult );
|
||||
}
|
||||
}
|
||||
|
||||
const BruteForceImpl* p;
|
||||
int k;
|
||||
const Mat* _samples;
|
||||
Mat* _results;
|
||||
Mat* _neighbor_responses;
|
||||
Mat* _dists;
|
||||
float* presult;
|
||||
};
|
||||
|
||||
float findNearest( InputArray _samples, int k,
|
||||
OutputArray _results,
|
||||
OutputArray _neighborResponses,
|
||||
OutputArray _dists ) const CV_OVERRIDE
|
||||
{
|
||||
float result = 0.f;
|
||||
CV_Assert( 0 < k );
|
||||
k = std::min(k, samples.rows);
|
||||
|
||||
Mat test_samples = _samples.getMat();
|
||||
CV_Assert( test_samples.type() == CV_32F && test_samples.cols == samples.cols );
|
||||
int testcount = test_samples.rows;
|
||||
|
||||
if( testcount == 0 )
|
||||
{
|
||||
_results.release();
|
||||
_neighborResponses.release();
|
||||
_dists.release();
|
||||
return 0.f;
|
||||
}
|
||||
|
||||
Mat res, nr, d, *pres = 0, *pnr = 0, *pd = 0;
|
||||
if( _results.needed() )
|
||||
{
|
||||
_results.create(testcount, 1, CV_32F);
|
||||
pres = &(res = _results.getMat());
|
||||
}
|
||||
if( _neighborResponses.needed() )
|
||||
{
|
||||
_neighborResponses.create(testcount, k, CV_32F);
|
||||
pnr = &(nr = _neighborResponses.getMat());
|
||||
}
|
||||
if( _dists.needed() )
|
||||
{
|
||||
_dists.create(testcount, k, CV_32F);
|
||||
pd = &(d = _dists.getMat());
|
||||
}
|
||||
|
||||
findKNearestInvoker invoker(this, k, test_samples, pres, pnr, pd, &result);
|
||||
parallel_for_(Range(0, testcount), invoker);
|
||||
//invoker(Range(0, testcount));
|
||||
return result;
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
class KDTreeImpl CV_FINAL : public Impl
|
||||
{
|
||||
public:
|
||||
String getModelName() const CV_OVERRIDE { return NAME_KDTREE; }
|
||||
int getType() const CV_OVERRIDE { return ml::KNearest::KDTREE; }
|
||||
|
||||
void doTrain(InputArray points) CV_OVERRIDE
|
||||
{
|
||||
tr.build(points);
|
||||
}
|
||||
|
||||
float findNearest( InputArray _samples, int k,
|
||||
OutputArray _results,
|
||||
OutputArray _neighborResponses,
|
||||
OutputArray _dists ) const CV_OVERRIDE
|
||||
{
|
||||
float result = 0.f;
|
||||
CV_Assert( 0 < k );
|
||||
k = std::min(k, samples.rows);
|
||||
|
||||
Mat test_samples = _samples.getMat();
|
||||
CV_Assert( test_samples.type() == CV_32F && test_samples.cols == samples.cols );
|
||||
int testcount = test_samples.rows;
|
||||
|
||||
if( testcount == 0 )
|
||||
{
|
||||
_results.release();
|
||||
_neighborResponses.release();
|
||||
_dists.release();
|
||||
return 0.f;
|
||||
}
|
||||
|
||||
Mat res, nr, d;
|
||||
if( _results.needed() )
|
||||
{
|
||||
res = _results.getMat();
|
||||
}
|
||||
if( _neighborResponses.needed() )
|
||||
{
|
||||
nr = _neighborResponses.getMat();
|
||||
}
|
||||
if( _dists.needed() )
|
||||
{
|
||||
d = _dists.getMat();
|
||||
}
|
||||
|
||||
for (int i=0; i<test_samples.rows; ++i)
|
||||
{
|
||||
Mat _res, _nr, _d;
|
||||
tr.findNearest(test_samples.row(i), k, Emax, _res, _nr, _d, noArray());
|
||||
res.push_back(_res.t());
|
||||
_results.assign(res);
|
||||
}
|
||||
|
||||
return result; // currently always 0
|
||||
}
|
||||
|
||||
KDTree tr;
|
||||
};
|
||||
|
||||
//================================================================
|
||||
|
||||
class KNearestImpl CV_FINAL : public KNearest
|
||||
{
|
||||
inline int getDefaultK() const CV_OVERRIDE { return impl->defaultK; }
|
||||
inline void setDefaultK(int val) CV_OVERRIDE { impl->defaultK = val; }
|
||||
inline bool getIsClassifier() const CV_OVERRIDE { return impl->isclassifier; }
|
||||
inline void setIsClassifier(bool val) CV_OVERRIDE { impl->isclassifier = val; }
|
||||
inline int getEmax() const CV_OVERRIDE { return impl->Emax; }
|
||||
inline void setEmax(int val) CV_OVERRIDE { impl->Emax = val; }
|
||||
|
||||
public:
|
||||
int getAlgorithmType() const CV_OVERRIDE
|
||||
{
|
||||
return impl->getType();
|
||||
}
|
||||
void setAlgorithmType(int val) CV_OVERRIDE
|
||||
{
|
||||
if (val != BRUTE_FORCE && val != KDTREE)
|
||||
val = BRUTE_FORCE;
|
||||
|
||||
int k = getDefaultK();
|
||||
int e = getEmax();
|
||||
bool c = getIsClassifier();
|
||||
|
||||
initImpl(val);
|
||||
|
||||
setDefaultK(k);
|
||||
setEmax(e);
|
||||
setIsClassifier(c);
|
||||
}
|
||||
|
||||
public:
|
||||
KNearestImpl()
|
||||
{
|
||||
initImpl(BRUTE_FORCE);
|
||||
}
|
||||
~KNearestImpl()
|
||||
{
|
||||
}
|
||||
|
||||
bool isClassifier() const CV_OVERRIDE { return impl->isclassifier; }
|
||||
bool isTrained() const CV_OVERRIDE { return !impl->samples.empty(); }
|
||||
|
||||
int getVarCount() const CV_OVERRIDE { return impl->samples.cols; }
|
||||
|
||||
void write( FileStorage& fs ) const CV_OVERRIDE
|
||||
{
|
||||
writeFormat(fs);
|
||||
impl->write(fs);
|
||||
}
|
||||
|
||||
void read( const FileNode& fn ) CV_OVERRIDE
|
||||
{
|
||||
int algorithmType = BRUTE_FORCE;
|
||||
if (fn.name() == NAME_KDTREE)
|
||||
algorithmType = KDTREE;
|
||||
initImpl(algorithmType);
|
||||
impl->read(fn);
|
||||
}
|
||||
|
||||
float findNearest( InputArray samples, int k,
|
||||
OutputArray results,
|
||||
OutputArray neighborResponses=noArray(),
|
||||
OutputArray dist=noArray() ) const CV_OVERRIDE
|
||||
{
|
||||
return impl->findNearest(samples, k, results, neighborResponses, dist);
|
||||
}
|
||||
|
||||
float predict(InputArray inputs, OutputArray outputs, int) const CV_OVERRIDE
|
||||
{
|
||||
return impl->findNearest( inputs, impl->defaultK, outputs, noArray(), noArray() );
|
||||
}
|
||||
|
||||
bool train( const Ptr<TrainData>& data, int flags ) CV_OVERRIDE
|
||||
{
|
||||
CV_Assert(!data.empty());
|
||||
return impl->train(data, flags);
|
||||
}
|
||||
|
||||
String getDefaultName() const CV_OVERRIDE { return impl->getModelName(); }
|
||||
|
||||
protected:
|
||||
void initImpl(int algorithmType)
|
||||
{
|
||||
if (algorithmType != KDTREE)
|
||||
impl = makePtr<BruteForceImpl>();
|
||||
else
|
||||
impl = makePtr<KDTreeImpl>();
|
||||
}
|
||||
Ptr<Impl> impl;
|
||||
};
|
||||
|
||||
Ptr<KNearest> KNearest::create()
|
||||
{
|
||||
return makePtr<KNearestImpl>();
|
||||
}
|
||||
|
||||
Ptr<KNearest> KNearest::load(const String& filepath)
|
||||
{
|
||||
FileStorage fs;
|
||||
fs.open(filepath, FileStorage::READ);
|
||||
|
||||
Ptr<KNearest> knearest = makePtr<KNearestImpl>();
|
||||
|
||||
((KNearestImpl*)knearest.get())->read(fs.getFirstTopLevelNode());
|
||||
return knearest;
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
/* End of file */
|
||||
604
3rdparty/opencv-4.5.4/modules/ml/src/lr.cpp
vendored
Normal file
@@ -0,0 +1,604 @@
|
||||
// This file is part of OpenCV project.
|
||||
// It is subject to the license terms in the LICENSE file found in the top-level directory
|
||||
// of this distribution and at http://opencv.org/license.html.
|
||||
//
|
||||
// AUTHOR: Rahul Kavi rahulkavi[at]live[at]com
|
||||
|
||||
//
|
||||
// This is an implementation of the Logistic Regression algorithm
|
||||
//
|
||||
|
||||
#include "precomp.hpp"
|
||||
|
||||
using namespace std;
|
||||
|
||||
namespace cv {
|
||||
namespace ml {
|
||||
|
||||
class LrParams
|
||||
{
|
||||
public:
|
||||
LrParams()
|
||||
{
|
||||
alpha = 0.001;
|
||||
num_iters = 1000;
|
||||
norm = LogisticRegression::REG_L2;
|
||||
train_method = LogisticRegression::BATCH;
|
||||
mini_batch_size = 1;
|
||||
term_crit = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, num_iters, alpha);
|
||||
}
|
||||
|
||||
double alpha; //!< learning rate.
|
||||
int num_iters; //!< number of iterations.
|
||||
int norm;
|
||||
int train_method;
|
||||
int mini_batch_size;
|
||||
TermCriteria term_crit;
|
||||
};
|
||||
|
||||
class LogisticRegressionImpl CV_FINAL : public LogisticRegression
|
||||
{
|
||||
public:
|
||||
|
||||
LogisticRegressionImpl() { }
|
||||
virtual ~LogisticRegressionImpl() {}
|
||||
|
||||
inline double getLearningRate() const CV_OVERRIDE { return params.alpha; }
|
||||
inline void setLearningRate(double val) CV_OVERRIDE { params.alpha = val; }
|
||||
inline int getIterations() const CV_OVERRIDE { return params.num_iters; }
|
||||
inline void setIterations(int val) CV_OVERRIDE { params.num_iters = val; }
|
||||
inline int getRegularization() const CV_OVERRIDE { return params.norm; }
|
||||
inline void setRegularization(int val) CV_OVERRIDE { params.norm = val; }
|
||||
inline int getTrainMethod() const CV_OVERRIDE { return params.train_method; }
|
||||
inline void setTrainMethod(int val) CV_OVERRIDE { params.train_method = val; }
|
||||
inline int getMiniBatchSize() const CV_OVERRIDE { return params.mini_batch_size; }
|
||||
inline void setMiniBatchSize(int val) CV_OVERRIDE { params.mini_batch_size = val; }
|
||||
inline TermCriteria getTermCriteria() const CV_OVERRIDE { return params.term_crit; }
|
||||
inline void setTermCriteria(TermCriteria val) CV_OVERRIDE { params.term_crit = val; }
|
||||
|
||||
virtual bool train( const Ptr<TrainData>& trainData, int=0 ) CV_OVERRIDE;
|
||||
virtual float predict(InputArray samples, OutputArray results, int flags=0) const CV_OVERRIDE;
|
||||
virtual void clear() CV_OVERRIDE;
|
||||
virtual void write(FileStorage& fs) const CV_OVERRIDE;
|
||||
virtual void read(const FileNode& fn) CV_OVERRIDE;
|
||||
virtual Mat get_learnt_thetas() const CV_OVERRIDE { return learnt_thetas; }
|
||||
virtual int getVarCount() const CV_OVERRIDE { return learnt_thetas.cols; }
|
||||
virtual bool isTrained() const CV_OVERRIDE { return !learnt_thetas.empty(); }
|
||||
virtual bool isClassifier() const CV_OVERRIDE { return true; }
|
||||
virtual String getDefaultName() const CV_OVERRIDE { return "opencv_ml_lr"; }
|
||||
protected:
|
||||
Mat calc_sigmoid(const Mat& data) const;
|
||||
double compute_cost(const Mat& _data, const Mat& _labels, const Mat& _init_theta);
|
||||
void compute_gradient(const Mat& _data, const Mat& _labels, const Mat &_theta, const double _lambda, Mat & _gradient );
|
||||
Mat batch_gradient_descent(const Mat& _data, const Mat& _labels, const Mat& _init_theta);
|
||||
Mat mini_batch_gradient_descent(const Mat& _data, const Mat& _labels, const Mat& _init_theta);
|
||||
bool set_label_map(const Mat& _labels_i);
|
||||
Mat remap_labels(const Mat& _labels_i, const map<int, int>& lmap) const;
|
||||
protected:
|
||||
LrParams params;
|
||||
Mat learnt_thetas;
|
||||
map<int, int> forward_mapper;
|
||||
map<int, int> reverse_mapper;
|
||||
Mat labels_o;
|
||||
Mat labels_n;
|
||||
};
|
||||
|
||||
Ptr<LogisticRegression> LogisticRegression::create()
|
||||
{
|
||||
return makePtr<LogisticRegressionImpl>();
|
||||
}
|
||||
|
||||
Ptr<LogisticRegression> LogisticRegression::load(const String& filepath, const String& nodeName)
|
||||
{
|
||||
return Algorithm::load<LogisticRegression>(filepath, nodeName);
|
||||
}
|
||||
|
||||
|
||||
bool LogisticRegressionImpl::train(const Ptr<TrainData>& trainData, int)
|
||||
{
|
||||
CV_TRACE_FUNCTION_SKIP_NESTED();
|
||||
CV_Assert(!trainData.empty());
|
||||
|
||||
// return value
|
||||
bool ok = false;
|
||||
clear();
|
||||
Mat _data_i = trainData->getSamples();
|
||||
Mat _labels_i = trainData->getResponses();
|
||||
|
||||
// check size and type of training data
|
||||
CV_Assert( !_labels_i.empty() && !_data_i.empty());
|
||||
if(_labels_i.cols != 1)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "labels should be a column matrix" );
|
||||
}
|
||||
if(_data_i.type() != CV_32FC1 || _labels_i.type() != CV_32FC1)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "data and labels must be a floating point matrix" );
|
||||
}
|
||||
if(_labels_i.rows != _data_i.rows)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "number of rows in data and labels should be equal" );
|
||||
}
|
||||
|
||||
// class labels
|
||||
set_label_map(_labels_i);
|
||||
Mat labels_l = remap_labels(_labels_i, this->forward_mapper);
|
||||
int num_classes = (int) this->forward_mapper.size();
|
||||
if(num_classes < 2)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "data should have atleast 2 classes" );
|
||||
}
|
||||
|
||||
// add a column of ones to the data (bias/intercept term)
|
||||
Mat data_t;
|
||||
hconcat( cv::Mat::ones( _data_i.rows, 1, CV_32F ), _data_i, data_t );
|
||||
|
||||
// coefficient matrix (zero-initialized)
|
||||
Mat thetas;
|
||||
Mat init_theta = Mat::zeros(data_t.cols, 1, CV_32F);
|
||||
|
||||
// fit the model (handles binary and multiclass cases)
|
||||
Mat new_theta;
|
||||
Mat labels;
|
||||
if(num_classes == 2)
|
||||
{
|
||||
labels_l.convertTo(labels, CV_32F);
|
||||
if(this->params.train_method == LogisticRegression::BATCH)
|
||||
new_theta = batch_gradient_descent(data_t, labels, init_theta);
|
||||
else
|
||||
new_theta = mini_batch_gradient_descent(data_t, labels, init_theta);
|
||||
thetas = new_theta.t();
|
||||
}
|
||||
else
|
||||
{
|
||||
/* train one binary classifier per class (one-vs-rest): in the multi-class
   scenario we end up with n theta vectors for n classes */
|
||||
thetas.create(num_classes, data_t.cols, CV_32F);
|
||||
Mat labels_binary;
|
||||
int ii = 0;
|
||||
for(map<int,int>::iterator it = this->forward_mapper.begin(); it != this->forward_mapper.end(); ++it)
|
||||
{
|
||||
// one-vs-rest (OvR) scheme
|
||||
labels_binary = (labels_l == it->second)/255;
|
||||
labels_binary.convertTo(labels, CV_32F);
|
||||
if(this->params.train_method == LogisticRegression::BATCH)
|
||||
new_theta = batch_gradient_descent(data_t, labels, init_theta);
|
||||
else
|
||||
new_theta = mini_batch_gradient_descent(data_t, labels, init_theta);
|
||||
hconcat(new_theta.t(), thetas.row(ii));
|
||||
ii += 1;
|
||||
}
|
||||
}
|
||||
|
||||
// check that the estimates are stable and finite
|
||||
this->learnt_thetas = thetas.clone();
|
||||
if( cvIsNaN( (double)sum(this->learnt_thetas)[0] ) )
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "check training parameters. Invalid training classifier" );
|
||||
}
|
||||
|
||||
// success
|
||||
ok = true;
|
||||
return ok;
|
||||
}
|
||||
|
||||
float LogisticRegressionImpl::predict(InputArray samples, OutputArray results, int flags) const
|
||||
{
|
||||
// check that the model has been trained, i.e. learnt_thetas is populated
|
||||
if(!this->isTrained())
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "classifier should be trained first" );
|
||||
}
|
||||
|
||||
// coefficient matrix
|
||||
Mat thetas;
|
||||
if ( learnt_thetas.type() == CV_32F )
|
||||
{
|
||||
thetas = learnt_thetas;
|
||||
}
|
||||
else
|
||||
{
|
||||
this->learnt_thetas.convertTo( thetas, CV_32F );
|
||||
}
|
||||
CV_Assert(thetas.rows > 0);
|
||||
|
||||
// data samples
|
||||
Mat data = samples.getMat();
|
||||
if(data.type() != CV_32F)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "data must be of floating type" );
|
||||
}
|
||||
|
||||
// add a column of ones to the data (bias/intercept term)
|
||||
Mat data_t;
|
||||
hconcat( cv::Mat::ones( data.rows, 1, CV_32F ), data, data_t );
|
||||
CV_Assert(data_t.cols == thetas.cols);
|
||||
|
||||
// predict class labels for samples (handles binary and multiclass cases)
|
||||
Mat labels_c;
|
||||
Mat pred_m;
|
||||
Mat temp_pred;
|
||||
if(thetas.rows == 1)
|
||||
{
|
||||
// apply sigmoid function
|
||||
temp_pred = calc_sigmoid(data_t * thetas.t());
|
||||
CV_Assert(temp_pred.cols==1);
|
||||
pred_m = temp_pred.clone();
|
||||
|
||||
// threshold at 0.5: probabilities above 0.5 map to class 1, otherwise to class 0
|
||||
temp_pred = (temp_pred > 0.5f) / 255;
|
||||
temp_pred.convertTo(labels_c, CV_32S);
|
||||
}
|
||||
else
|
||||
{
|
||||
// apply sigmoid function
|
||||
pred_m.create(data_t.rows, thetas.rows, data.type());
|
||||
for(int i = 0; i < thetas.rows; i++)
|
||||
{
|
||||
temp_pred = calc_sigmoid(data_t * thetas.row(i).t());
|
||||
vconcat(temp_pred, pred_m.col(i));
|
||||
}
|
||||
|
||||
// predict class with the maximum output
|
||||
Point max_loc;
|
||||
Mat labels;
|
||||
for(int i = 0; i < pred_m.rows; i++)
|
||||
{
|
||||
temp_pred = pred_m.row(i);
|
||||
minMaxLoc( temp_pred, NULL, NULL, NULL, &max_loc );
|
||||
labels.push_back(max_loc.x);
|
||||
}
|
||||
labels.convertTo(labels_c, CV_32S);
|
||||
}
|
||||
|
||||
// return label of the predicted class. class names can be 1,2,3,...
|
||||
Mat pred_labs = remap_labels(labels_c, this->reverse_mapper);
|
||||
pred_labs.convertTo(pred_labs, CV_32S);
|
||||
|
||||
// return either the labels or the raw output
|
||||
if ( results.needed() )
|
||||
{
|
||||
if ( flags & StatModel::RAW_OUTPUT )
|
||||
{
|
||||
pred_m.copyTo( results );
|
||||
}
|
||||
else
|
||||
{
|
||||
pred_labs.copyTo(results);
|
||||
}
|
||||
}
|
||||
|
||||
return ( pred_labs.empty() ? 0.f : static_cast<float>(pred_labs.at<int>(0)) );
|
||||
}
|
||||
|
||||
Mat LogisticRegressionImpl::calc_sigmoid(const Mat& data) const
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
Mat dest;
|
||||
exp(-data, dest);
|
||||
return 1.0/(1.0+dest);
|
||||
}
|
||||
|
||||
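// Regularized logistic-regression cost over m samples:
//   J(theta) = -(1/m) * sum_i [ y_i*log(h(x_i)) + (1 - y_i)*log(1 - h(x_i)) ] + r
// where h(x) = sigmoid(x*theta) and r is the regularization term built from the
// coefficients theta_1..theta_n (the bias theta_0 is excluded), as computed below.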
double LogisticRegressionImpl::compute_cost(const Mat& _data, const Mat& _labels, const Mat& _init_theta)
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
float llambda = 0; /*changed llambda from int to float to solve issue #7924*/
|
||||
int m;
|
||||
int n;
|
||||
double cost = 0;
|
||||
double rparameter = 0;
|
||||
Mat theta_b;
|
||||
Mat theta_c;
|
||||
Mat d_a;
|
||||
Mat d_b;
|
||||
|
||||
m = _data.rows;
|
||||
n = _data.cols;
|
||||
|
||||
theta_b = _init_theta(Range(1, n), Range::all());
|
||||
|
||||
if (params.norm != REG_DISABLE)
|
||||
{
|
||||
llambda = 1;
|
||||
}
|
||||
|
||||
if(this->params.norm == LogisticRegression::REG_L1)
|
||||
{
|
||||
rparameter = (llambda/(2*m)) * sum(theta_b)[0];
|
||||
}
|
||||
else
|
||||
{
|
||||
// assuming it to be L2 by default
|
||||
multiply(theta_b, theta_b, theta_c, 1);
|
||||
rparameter = (llambda/(2*m)) * sum(theta_c)[0];
|
||||
}
|
||||
|
||||
d_a = calc_sigmoid(_data * _init_theta);
|
||||
log(d_a, d_a);
|
||||
multiply(d_a, _labels, d_a);
|
||||
|
||||
// use the fact that: log(1 - sigmoid(x)) = log(sigmoid(-x))
|
||||
d_b = calc_sigmoid(- _data * _init_theta);
|
||||
log(d_b, d_b);
|
||||
multiply(d_b, 1-_labels, d_b);
|
||||
|
||||
cost = (-1.0/m) * (sum(d_a)[0] + sum(d_b)[0]);
|
||||
cost = cost + rparameter;
|
||||
|
||||
if(cvIsNaN( cost ) == 1)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "check training parameters. Invalid training classifier" );
|
||||
}
|
||||
|
||||
return cost;
|
||||
}
|
||||
|
||||
struct LogisticRegressionImpl_ComputeDradient_Impl : ParallelLoopBody
|
||||
{
|
||||
const Mat* data;
|
||||
const Mat* theta;
|
||||
const Mat* pcal_a;
|
||||
Mat* gradient;
|
||||
double lambda;
|
||||
|
||||
LogisticRegressionImpl_ComputeDradient_Impl(const Mat& _data, const Mat &_theta, const Mat& _pcal_a, const double _lambda, Mat & _gradient)
|
||||
: data(&_data)
|
||||
, theta(&_theta)
|
||||
, pcal_a(&_pcal_a)
|
||||
, gradient(&_gradient)
|
||||
, lambda(_lambda)
|
||||
{
|
||||
|
||||
}
|
||||
|
||||
void operator()(const cv::Range& r) const CV_OVERRIDE
|
||||
{
|
||||
const Mat& _data = *data;
|
||||
const Mat &_theta = *theta;
|
||||
Mat & _gradient = *gradient;
|
||||
const Mat & _pcal_a = *pcal_a;
|
||||
const int m = _data.rows;
|
||||
Mat pcal_ab;
|
||||
|
||||
for (int ii = r.start; ii<r.end; ii++)
|
||||
{
|
||||
Mat pcal_b = _data(Range::all(), Range(ii,ii+1));
|
||||
multiply(_pcal_a, pcal_b, pcal_ab, 1);
|
||||
|
||||
_gradient.row(ii) = (1.0/m)*sum(pcal_ab)[0] + (lambda/m) * _theta.row(ii);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
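// Gradient of the regularized cost:
//   grad_0 = (1/m) * sum_i (h(x_i) - y_i) * x_i0                       (bias term, no regularization)
//   grad_j = (1/m) * sum_i (h(x_i) - y_i) * x_ij + (lambda/m)*theta_j  (j >= 1, computed in parallel)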
void LogisticRegressionImpl::compute_gradient(const Mat& _data, const Mat& _labels, const Mat &_theta, const double _lambda, Mat & _gradient )
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
const int m = _data.rows;
|
||||
Mat pcal_a, pcal_b, pcal_ab;
|
||||
|
||||
const Mat z = _data * _theta;
|
||||
|
||||
CV_Assert( _gradient.rows == _theta.rows && _gradient.cols == _theta.cols );
|
||||
|
||||
pcal_a = calc_sigmoid(z) - _labels;
|
||||
pcal_b = _data(Range::all(), Range(0,1));
|
||||
multiply(pcal_a, pcal_b, pcal_ab, 1);
|
||||
|
||||
_gradient.row(0) = ((float)1/m) * sum(pcal_ab)[0];
|
||||
|
||||
//cout<<"for each training data entry"<<endl;
|
||||
LogisticRegressionImpl_ComputeDradient_Impl invoker(_data, _theta, pcal_a, _lambda, _gradient);
|
||||
cv::parallel_for_(cv::Range(1, _gradient.rows), invoker);
|
||||
}
|
||||
|
||||
|
||||
Mat LogisticRegressionImpl::batch_gradient_descent(const Mat& _data, const Mat& _labels, const Mat& _init_theta)
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
// implements batch gradient descent
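// (each iteration updates all coefficients at once: theta := theta - (alpha/m) * gradient)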
|
||||
if(this->params.alpha<=0)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "check training parameters (learning rate) for the classifier" );
|
||||
}
|
||||
|
||||
if(this->params.num_iters <= 0)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "number of iterations cannot be zero or a negative number" );
|
||||
}
|
||||
|
||||
int llambda = 0;
|
||||
int m;
|
||||
Mat theta_p = _init_theta.clone();
|
||||
Mat gradient( theta_p.rows, theta_p.cols, theta_p.type() );
|
||||
m = _data.rows;
|
||||
|
||||
if (params.norm != REG_DISABLE)
|
||||
{
|
||||
llambda = 1;
|
||||
}
|
||||
|
||||
for(int i = 0;i<this->params.num_iters;i++)
|
||||
{
|
||||
// this seems to only be called to ensure that cost is not NaN
|
||||
compute_cost(_data, _labels, theta_p);
|
||||
|
||||
compute_gradient( _data, _labels, theta_p, llambda, gradient );
|
||||
|
||||
theta_p = theta_p - ( static_cast<double>(this->params.alpha)/m)*gradient;
|
||||
}
|
||||
return theta_p;
|
||||
}
|
||||
|
||||
Mat LogisticRegressionImpl::mini_batch_gradient_descent(const Mat& _data, const Mat& _labels, const Mat& _init_theta)
|
||||
{
|
||||
// implements mini-batch gradient descent
|
||||
int lambda_l = 0;
|
||||
int m;
|
||||
int j = 0;
|
||||
int size_b = this->params.mini_batch_size;
|
||||
|
||||
if(this->params.mini_batch_size <= 0 || this->params.alpha == 0)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "check training parameters for the classifier" );
|
||||
}
|
||||
|
||||
if(this->params.num_iters <= 0)
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "number of iterations cannot be zero or a negative number" );
|
||||
}
|
||||
|
||||
Mat theta_p = _init_theta.clone();
|
||||
Mat gradient( theta_p.rows, theta_p.cols, theta_p.type() );
|
||||
Mat data_d;
|
||||
Mat labels_l;
|
||||
|
||||
if (params.norm != REG_DISABLE)
|
||||
{
|
||||
lambda_l = 1;
|
||||
}
|
||||
|
||||
for(int i = 0;i<this->params.term_crit.maxCount;i++)
|
||||
{
|
||||
if(j+size_b<=_data.rows)
|
||||
{
|
||||
data_d = _data(Range(j,j+size_b), Range::all());
|
||||
labels_l = _labels(Range(j,j+size_b),Range::all());
|
||||
}
|
||||
else
|
||||
{
|
||||
data_d = _data(Range(j, _data.rows), Range::all());
|
||||
labels_l = _labels(Range(j, _labels.rows),Range::all());
|
||||
}
|
||||
|
||||
m = data_d.rows;
|
||||
|
||||
// this seems to only be called to ensure that cost is not NaN
|
||||
compute_cost(data_d, labels_l, theta_p);
|
||||
|
||||
compute_gradient(data_d, labels_l, theta_p, lambda_l, gradient);
|
||||
|
||||
theta_p = theta_p - ( static_cast<double>(this->params.alpha)/m)*gradient;
|
||||
|
||||
j += this->params.mini_batch_size;
|
||||
|
||||
// wrap around once all training samples have been used
|
||||
if (j >= _data.rows) {
|
||||
j = 0;
|
||||
}
|
||||
}
|
||||
return theta_p;
|
||||
}
|
||||
|
||||
bool LogisticRegressionImpl::set_label_map(const Mat &_labels_i)
|
||||
{
|
||||
// builds two maps that translate between user-defined labels and contiguous internal labels (in both directions).
|
||||
int ii = 0;
|
||||
Mat labels;
|
||||
|
||||
this->labels_o = Mat(0,1, CV_8U);
|
||||
this->labels_n = Mat(0,1, CV_8U);
|
||||
|
||||
_labels_i.convertTo(labels, CV_32S);
|
||||
|
||||
for(int i = 0;i<labels.rows;i++)
|
||||
{
|
||||
this->forward_mapper[labels.at<int>(i)] += 1;
|
||||
}
|
||||
|
||||
for(map<int,int>::iterator it = this->forward_mapper.begin(); it != this->forward_mapper.end(); ++it)
|
||||
{
|
||||
this->forward_mapper[it->first] = ii;
|
||||
this->labels_o.push_back(it->first);
|
||||
this->labels_n.push_back(ii);
|
||||
ii += 1;
|
||||
}
|
||||
|
||||
for(map<int,int>::iterator it = this->forward_mapper.begin(); it != this->forward_mapper.end(); ++it)
|
||||
{
|
||||
this->reverse_mapper[it->second] = it->first;
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
Mat LogisticRegressionImpl::remap_labels(const Mat& _labels_i, const map<int, int>& lmap) const
|
||||
{
|
||||
Mat labels;
|
||||
_labels_i.convertTo(labels, CV_32S);
|
||||
|
||||
Mat new_labels = Mat::zeros(labels.rows, labels.cols, labels.type());
|
||||
|
||||
CV_Assert( !lmap.empty() );
|
||||
|
||||
for(int i =0;i<labels.rows;i++)
|
||||
{
|
||||
map<int, int>::const_iterator val = lmap.find(labels.at<int>(i,0));
|
||||
CV_Assert(val != lmap.end());
|
||||
new_labels.at<int>(i,0) = val->second;
|
||||
}
|
||||
return new_labels;
|
||||
}
|
||||
|
||||
void LogisticRegressionImpl::clear()
|
||||
{
|
||||
this->learnt_thetas.release();
|
||||
this->labels_o.release();
|
||||
this->labels_n.release();
|
||||
}
|
||||
|
||||
void LogisticRegressionImpl::write(FileStorage& fs) const
|
||||
{
|
||||
// check that the file storage is open
|
||||
if(fs.isOpened() == 0)
|
||||
{
|
||||
CV_Error(CV_StsBadArg,"file can't open. Check file path");
|
||||
}
|
||||
writeFormat(fs);
|
||||
string desc = "Logistic Regression Classifier";
|
||||
fs<<"classifier"<<desc.c_str();
|
||||
fs<<"alpha"<<this->params.alpha;
|
||||
fs<<"iterations"<<this->params.num_iters;
|
||||
fs<<"norm"<<this->params.norm;
|
||||
fs<<"train_method"<<this->params.train_method;
|
||||
if(this->params.train_method == LogisticRegression::MINI_BATCH)
|
||||
{
|
||||
fs<<"mini_batch_size"<<this->params.mini_batch_size;
|
||||
}
|
||||
fs<<"learnt_thetas"<<this->learnt_thetas;
|
||||
fs<<"n_labels"<<this->labels_n;
|
||||
fs<<"o_labels"<<this->labels_o;
|
||||
}
|
||||
|
||||
void LogisticRegressionImpl::read(const FileNode& fn)
|
||||
{
|
||||
// check if empty
|
||||
if(fn.empty())
|
||||
{
|
||||
CV_Error( CV_StsBadArg, "empty FileNode object" );
|
||||
}
|
||||
|
||||
this->params.alpha = (double)fn["alpha"];
|
||||
this->params.num_iters = (int)fn["iterations"];
|
||||
this->params.norm = (int)fn["norm"];
|
||||
this->params.train_method = (int)fn["train_method"];
|
||||
|
||||
if(this->params.train_method == LogisticRegression::MINI_BATCH)
|
||||
{
|
||||
this->params.mini_batch_size = (int)fn["mini_batch_size"];
|
||||
}
|
||||
|
||||
fn["learnt_thetas"] >> this->learnt_thetas;
|
||||
fn["o_labels"] >> this->labels_o;
|
||||
fn["n_labels"] >> this->labels_n;
|
||||
|
||||
for(int ii =0;ii<labels_o.rows;ii++)
|
||||
{
|
||||
this->forward_mapper[labels_o.at<int>(ii,0)] = labels_n.at<int>(ii,0);
|
||||
this->reverse_mapper[labels_n.at<int>(ii,0)] = labels_o.at<int>(ii,0);
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
/* End of file. */
|
||||
471
3rdparty/opencv-4.5.4/modules/ml/src/nbayes.cpp
vendored
Normal file
@@ -0,0 +1,471 @@
|
||||
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// Intel License Agreement
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of Intel Corporation may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
|
||||
namespace cv {
|
||||
namespace ml {
|
||||
|
||||
|
||||
class NormalBayesClassifierImpl : public NormalBayesClassifier
|
||||
{
|
||||
public:
|
||||
NormalBayesClassifierImpl()
|
||||
{
|
||||
nallvars = 0;
|
||||
}
|
||||
|
||||
bool train( const Ptr<TrainData>& trainData, int flags ) CV_OVERRIDE
|
||||
{
|
||||
CV_Assert(!trainData.empty());
|
||||
const float min_variation = FLT_EPSILON;
|
||||
Mat responses = trainData->getNormCatResponses();
|
||||
Mat __cls_labels = trainData->getClassLabels();
|
||||
Mat __var_idx = trainData->getVarIdx();
|
||||
Mat samples = trainData->getTrainSamples();
|
||||
int nclasses = (int)__cls_labels.total();
|
||||
|
||||
int nvars = trainData->getNVars();
|
||||
int s, c1, c2, cls;
|
||||
|
||||
int __nallvars = trainData->getNAllVars();
|
||||
bool update = (flags & UPDATE_MODEL) != 0;
|
||||
|
||||
if( !update )
|
||||
{
|
||||
nallvars = __nallvars;
|
||||
count.resize(nclasses);
|
||||
sum.resize(nclasses);
|
||||
productsum.resize(nclasses);
|
||||
avg.resize(nclasses);
|
||||
inv_eigen_values.resize(nclasses);
|
||||
cov_rotate_mats.resize(nclasses);
|
||||
|
||||
for( cls = 0; cls < nclasses; cls++ )
|
||||
{
|
||||
count[cls] = Mat::zeros( 1, nvars, CV_32SC1 );
|
||||
sum[cls] = Mat::zeros( 1, nvars, CV_64FC1 );
|
||||
productsum[cls] = Mat::zeros( nvars, nvars, CV_64FC1 );
|
||||
avg[cls] = Mat::zeros( 1, nvars, CV_64FC1 );
|
||||
inv_eigen_values[cls] = Mat::zeros( 1, nvars, CV_64FC1 );
|
||||
cov_rotate_mats[cls] = Mat::zeros( nvars, nvars, CV_64FC1 );
|
||||
}
|
||||
|
||||
var_idx = __var_idx;
|
||||
cls_labels = __cls_labels;
|
||||
|
||||
c.create(1, nclasses, CV_64FC1);
|
||||
}
|
||||
else
|
||||
{
|
||||
// check that the new training data has the same dimensionality etc.
|
||||
if( nallvars != __nallvars ||
|
||||
var_idx.size() != __var_idx.size() ||
|
||||
norm(var_idx, __var_idx, NORM_INF) != 0 ||
|
||||
cls_labels.size() != __cls_labels.size() ||
|
||||
norm(cls_labels, __cls_labels, NORM_INF) != 0 )
|
||||
CV_Error( CV_StsBadArg,
|
||||
"The new training data is inconsistent with the original training data; varIdx and the class labels should be the same" );
|
||||
}
|
||||
|
||||
Mat cov( nvars, nvars, CV_64FC1 );
|
||||
int nsamples = samples.rows;
|
||||
|
||||
// process train data (count, sum , productsum)
|
||||
for( s = 0; s < nsamples; s++ )
|
||||
{
|
||||
cls = responses.at<int>(s);
|
||||
int* count_data = count[cls].ptr<int>();
|
||||
double* sum_data = sum[cls].ptr<double>();
|
||||
double* prod_data = productsum[cls].ptr<double>();
|
||||
const float* train_vec = samples.ptr<float>(s);
|
||||
|
||||
for( c1 = 0; c1 < nvars; c1++, prod_data += nvars )
|
||||
{
|
||||
double val1 = train_vec[c1];
|
||||
sum_data[c1] += val1;
|
||||
count_data[c1]++;
|
||||
for( c2 = c1; c2 < nvars; c2++ )
|
||||
prod_data[c2] += train_vec[c2]*val1;
|
||||
}
|
||||
}
|
||||
|
||||
Mat vt;
|
||||
|
||||
// calculate avg, covariance matrix, c
|
||||
for( cls = 0; cls < nclasses; cls++ )
|
||||
{
|
||||
double det = 1;
|
||||
int i, j;
|
||||
Mat& w = inv_eigen_values[cls];
|
||||
int* count_data = count[cls].ptr<int>();
|
||||
double* avg_data = avg[cls].ptr<double>();
|
||||
double* sum1 = sum[cls].ptr<double>();
|
||||
|
||||
completeSymm(productsum[cls], 0);
|
||||
|
||||
for( j = 0; j < nvars; j++ )
|
||||
{
|
||||
int n = count_data[j];
|
||||
avg_data[j] = n ? sum1[j] / n : 0.;
|
||||
}
|
||||
|
||||
count_data = count[cls].ptr<int>();
|
||||
avg_data = avg[cls].ptr<double>();
|
||||
sum1 = sum[cls].ptr<double>();
|
||||
|
||||
for( i = 0; i < nvars; i++ )
|
||||
{
|
||||
double* avg2_data = avg[cls].ptr<double>();
|
||||
double* sum2 = sum[cls].ptr<double>();
|
||||
double* prod_data = productsum[cls].ptr<double>(i);
|
||||
double* cov_data = cov.ptr<double>(i);
|
||||
double s1val = sum1[i];
|
||||
double avg1 = avg_data[i];
|
||||
int _count = count_data[i];
|
||||
|
||||
for( j = 0; j <= i; j++ )
|
||||
{
|
||||
double avg2 = avg2_data[j];
|
||||
double cov_val = prod_data[j] - avg1 * sum2[j] - avg2 * s1val + avg1 * avg2 * _count;
|
||||
cov_val = (_count > 1) ? cov_val / (_count - 1) : cov_val;
|
||||
cov_data[j] = cov_val;
|
||||
}
|
||||
}
|
||||
|
||||
completeSymm( cov, 1 );
|
||||
|
||||
SVD::compute(cov, w, cov_rotate_mats[cls], noArray());
|
||||
transpose(cov_rotate_mats[cls], cov_rotate_mats[cls]);
|
||||
cv::max(w, min_variation, w);
|
||||
for( j = 0; j < nvars; j++ )
|
||||
det *= w.at<double>(j);
|
||||
|
||||
divide(1., w, w);
|
||||
c.at<double>(cls) = det > 0 ? log(det) : -700;
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
class NBPredictBody : public ParallelLoopBody
|
||||
{
|
||||
public:
|
||||
NBPredictBody( const Mat& _c, const vector<Mat>& _cov_rotate_mats,
|
||||
const vector<Mat>& _inv_eigen_values,
|
||||
const vector<Mat>& _avg,
|
||||
const Mat& _samples, const Mat& _vidx, const Mat& _cls_labels,
|
||||
Mat& _results, Mat& _results_prob, bool _rawOutput )
|
||||
{
|
||||
c = &_c;
|
||||
cov_rotate_mats = &_cov_rotate_mats;
|
||||
inv_eigen_values = &_inv_eigen_values;
|
||||
avg = &_avg;
|
||||
samples = &_samples;
|
||||
vidx = &_vidx;
|
||||
cls_labels = &_cls_labels;
|
||||
results = &_results;
|
||||
results_prob = !_results_prob.empty() ? &_results_prob : 0;
|
||||
rawOutput = _rawOutput;
|
||||
value = 0;
|
||||
}
|
||||
|
||||
const Mat* c;
|
||||
const vector<Mat>* cov_rotate_mats;
|
||||
const vector<Mat>* inv_eigen_values;
|
||||
const vector<Mat>* avg;
|
||||
const Mat* samples;
|
||||
const Mat* vidx;
|
||||
const Mat* cls_labels;
|
||||
|
||||
Mat* results_prob;
|
||||
Mat* results;
|
||||
float* value;
|
||||
bool rawOutput;
|
||||
|
||||
void operator()(const Range& range) const CV_OVERRIDE
|
||||
{
|
||||
int cls = -1;
|
||||
int rtype = 0, rptype = 0;
|
||||
size_t rstep = 0, rpstep = 0;
|
||||
int nclasses = (int)cls_labels->total();
|
||||
int nvars = avg->at(0).cols;
|
||||
double probability = 0;
|
||||
const int* vptr = vidx && !vidx->empty() ? vidx->ptr<int>() : 0;
|
||||
|
||||
if (results)
|
||||
{
|
||||
rtype = results->type();
|
||||
rstep = results->isContinuous() ? 1 : results->step/results->elemSize();
|
||||
}
|
||||
if (results_prob)
|
||||
{
|
||||
rptype = results_prob->type();
|
||||
rpstep = results_prob->isContinuous() ? results_prob->cols : results_prob->step/results_prob->elemSize();
|
||||
}
|
||||
// allocate memory and initializing headers for calculating
|
||||
cv::AutoBuffer<double> _buffer(nvars*2);
|
||||
double* _diffin = _buffer.data();
|
||||
double* _diffout = _buffer.data() + nvars;
|
||||
Mat diffin( 1, nvars, CV_64FC1, _diffin );
|
||||
Mat diffout( 1, nvars, CV_64FC1, _diffout );
|
||||
|
||||
for(int k = range.start; k < range.end; k++ )
|
||||
{
|
||||
double opt = FLT_MAX;
|
||||
|
||||
for(int i = 0; i < nclasses; i++ )
|
||||
{
|
||||
double cur = c->at<double>(i);
|
||||
const Mat& u = cov_rotate_mats->at(i);
|
||||
const Mat& w = inv_eigen_values->at(i);
|
||||
|
||||
const double* avg_data = avg->at(i).ptr<double>();
|
||||
const float* x = samples->ptr<float>(k);
|
||||
|
||||
// cov = u w u' --> cov^(-1) = u w^(-1) u'
|
||||
for(int j = 0; j < nvars; j++ )
|
||||
_diffin[j] = avg_data[j] - x[vptr ? vptr[j] : j];
|
||||
|
||||
gemm( diffin, u, 1, noArray(), 0, diffout, GEMM_2_T );
|
||||
for(int j = 0; j < nvars; j++ )
|
||||
{
|
||||
double d = _diffout[j];
|
||||
cur += d*d*w.ptr<double>()[j];
|
||||
}
|
||||
|
||||
if( cur < opt )
|
||||
{
|
||||
cls = i;
|
||||
opt = cur;
|
||||
}
|
||||
probability = exp( -0.5 * cur );
|
||||
|
||||
if( results_prob )
|
||||
{
|
||||
if ( rptype == CV_32FC1 )
|
||||
results_prob->ptr<float>()[k*rpstep + i] = (float)probability;
|
||||
else
|
||||
results_prob->ptr<double>()[k*rpstep + i] = probability;
|
||||
}
|
||||
}
|
||||
|
||||
int ival = rawOutput ? cls : cls_labels->at<int>(cls);
|
||||
if( results )
|
||||
{
|
||||
if( rtype == CV_32SC1 )
|
||||
results->ptr<int>()[k*rstep] = ival;
|
||||
else
|
||||
results->ptr<float>()[k*rstep] = (float)ival;
|
||||
}
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
float predict( InputArray _samples, OutputArray _results, int flags ) const CV_OVERRIDE
|
||||
{
|
||||
return predictProb(_samples, _results, noArray(), flags);
|
||||
}
|
||||
|
||||
float predictProb( InputArray _samples, OutputArray _results, OutputArray _resultsProb, int flags ) const CV_OVERRIDE
|
||||
{
|
||||
int value=0;
|
||||
Mat samples = _samples.getMat(), results, resultsProb;
|
||||
int nsamples = samples.rows, nclasses = (int)cls_labels.total();
|
||||
bool rawOutput = (flags & RAW_OUTPUT) != 0;
|
||||
|
||||
if( samples.type() != CV_32F || samples.cols != nallvars )
|
||||
CV_Error( CV_StsBadArg,
|
||||
"The input samples must be 32f matrix with the number of columns = nallvars" );
|
||||
|
||||
if( (samples.rows > 1) && (! _results.needed()) )
|
||||
CV_Error( CV_StsNullPtr,
|
||||
"When the number of input samples is >1, the output vector of results must be passed" );
|
||||
|
||||
if( _results.needed() )
|
||||
{
|
||||
_results.create(nsamples, 1, CV_32S);
|
||||
results = _results.getMat();
|
||||
}
|
||||
else
|
||||
results = Mat(1, 1, CV_32S, &value);
|
||||
|
||||
if( _resultsProb.needed() )
|
||||
{
|
||||
_resultsProb.create(nsamples, nclasses, CV_32F);
|
||||
resultsProb = _resultsProb.getMat();
|
||||
}
|
||||
|
||||
cv::parallel_for_(cv::Range(0, nsamples),
|
||||
NBPredictBody(c, cov_rotate_mats, inv_eigen_values, avg, samples,
|
||||
var_idx, cls_labels, results, resultsProb, rawOutput));
|
||||
|
||||
return (float)value;
|
||||
}
|
||||
|
||||
void write( FileStorage& fs ) const CV_OVERRIDE
|
||||
{
|
||||
int nclasses = (int)cls_labels.total(), i;
|
||||
|
||||
writeFormat(fs);
|
||||
fs << "var_count" << (var_idx.empty() ? nallvars : (int)var_idx.total());
|
||||
fs << "var_all" << nallvars;
|
||||
|
||||
if( !var_idx.empty() )
|
||||
fs << "var_idx" << var_idx;
|
||||
fs << "cls_labels" << cls_labels;
|
||||
|
||||
fs << "count" << "[";
|
||||
for( i = 0; i < nclasses; i++ )
|
||||
fs << count[i];
|
||||
|
||||
fs << "]" << "sum" << "[";
|
||||
for( i = 0; i < nclasses; i++ )
|
||||
fs << sum[i];
|
||||
|
||||
fs << "]" << "productsum" << "[";
|
||||
for( i = 0; i < nclasses; i++ )
|
||||
fs << productsum[i];
|
||||
|
||||
fs << "]" << "avg" << "[";
|
||||
for( i = 0; i < nclasses; i++ )
|
||||
fs << avg[i];
|
||||
|
||||
fs << "]" << "inv_eigen_values" << "[";
|
||||
for( i = 0; i < nclasses; i++ )
|
||||
fs << inv_eigen_values[i];
|
||||
|
||||
fs << "]" << "cov_rotate_mats" << "[";
|
||||
for( i = 0; i < nclasses; i++ )
|
||||
fs << cov_rotate_mats[i];
|
||||
|
||||
fs << "]";
|
||||
|
||||
fs << "c" << c;
|
||||
}
|
||||
|
||||
void read( const FileNode& fn ) CV_OVERRIDE
|
||||
{
|
||||
clear();
|
||||
|
||||
fn["var_all"] >> nallvars;
|
||||
|
||||
if( nallvars <= 0 )
|
||||
CV_Error( CV_StsParseError,
|
||||
"The field \"var_count\" of NBayes classifier is missing or non-positive" );
|
||||
|
||||
fn["var_idx"] >> var_idx;
|
||||
fn["cls_labels"] >> cls_labels;
|
||||
|
||||
int nclasses = (int)cls_labels.total(), i;
|
||||
|
||||
if( cls_labels.empty() || nclasses < 1 )
|
||||
CV_Error( CV_StsParseError, "No or invalid \"cls_labels\" in NBayes classifier" );
|
||||
|
||||
FileNodeIterator
|
||||
count_it = fn["count"].begin(),
|
||||
sum_it = fn["sum"].begin(),
|
||||
productsum_it = fn["productsum"].begin(),
|
||||
avg_it = fn["avg"].begin(),
|
||||
inv_eigen_values_it = fn["inv_eigen_values"].begin(),
|
||||
cov_rotate_mats_it = fn["cov_rotate_mats"].begin();
|
||||
|
||||
count.resize(nclasses);
|
||||
sum.resize(nclasses);
|
||||
productsum.resize(nclasses);
|
||||
avg.resize(nclasses);
|
||||
inv_eigen_values.resize(nclasses);
|
||||
cov_rotate_mats.resize(nclasses);
|
||||
|
||||
for( i = 0; i < nclasses; i++, ++count_it, ++sum_it, ++productsum_it, ++avg_it,
|
||||
++inv_eigen_values_it, ++cov_rotate_mats_it )
|
||||
{
|
||||
*count_it >> count[i];
|
||||
*sum_it >> sum[i];
|
||||
*productsum_it >> productsum[i];
|
||||
*avg_it >> avg[i];
|
||||
*inv_eigen_values_it >> inv_eigen_values[i];
|
||||
*cov_rotate_mats_it >> cov_rotate_mats[i];
|
||||
}
|
||||
|
||||
fn["c"] >> c;
|
||||
}
|
||||
|
||||
void clear() CV_OVERRIDE
|
||||
{
|
||||
count.clear();
|
||||
sum.clear();
|
||||
productsum.clear();
|
||||
avg.clear();
|
||||
inv_eigen_values.clear();
|
||||
cov_rotate_mats.clear();
|
||||
|
||||
var_idx.release();
|
||||
cls_labels.release();
|
||||
c.release();
|
||||
nallvars = 0;
|
||||
}
|
||||
|
||||
bool isTrained() const CV_OVERRIDE { return !avg.empty(); }
|
||||
bool isClassifier() const CV_OVERRIDE { return true; }
|
||||
int getVarCount() const CV_OVERRIDE { return nallvars; }
|
||||
String getDefaultName() const CV_OVERRIDE { return "opencv_ml_nbayes"; }
|
||||
|
||||
int nallvars;
|
||||
Mat var_idx, cls_labels, c;
|
||||
vector<Mat> count, sum, productsum, avg, inv_eigen_values, cov_rotate_mats;
|
||||
};
|
||||
|
||||
|
||||
Ptr<NormalBayesClassifier> NormalBayesClassifier::create()
|
||||
{
|
||||
Ptr<NormalBayesClassifierImpl> p = makePtr<NormalBayesClassifierImpl>();
|
||||
return p;
|
||||
}
|
||||
|
||||
Ptr<NormalBayesClassifier> NormalBayesClassifier::load(const String& filepath, const String& nodeName)
|
||||
{
|
||||
return Algorithm::load<NormalBayesClassifier>(filepath, nodeName);
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
/* End of file. */
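A minimal sketch of exercising the NormalBayesClassifierImpl above through the public cv::ml::NormalBayesClassifier interface; the toy clusters are assumptions for illustration only, and the per-class values returned by predictProb() are the unnormalized exp(-0.5*cur) scores computed in NBPredictBody.

#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

int main()
{
    using namespace cv;
    using namespace cv::ml;

    // Two small 2-D clusters, one per class (CV_32F samples, CV_32S labels).
    Mat samples = (Mat_<float>(4, 2) << 1.f, 1.f,  1.2f, 0.9f,  8.f, 8.f,  7.8f, 8.2f);
    Mat labels  = (Mat_<int>(4, 1)   << 0, 0, 1, 1);

    Ptr<NormalBayesClassifier> nb = NormalBayesClassifier::create();
    nb->train(samples, ROW_SAMPLE, labels);

    Mat query = (Mat_<float>(1, 2) << 7.5f, 8.1f);
    Mat out, outScores;
    nb->predictProb(query, out, outScores);   // class label plus per-class scores
    return 0;
}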
401
3rdparty/opencv-4.5.4/modules/ml/src/precomp.hpp
vendored
Normal file
@@ -0,0 +1,401 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// Intel License Agreement
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of Intel Corporation may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#ifndef __OPENCV_ML_PRECOMP_HPP__
|
||||
#define __OPENCV_ML_PRECOMP_HPP__
|
||||
|
||||
#include "opencv2/core.hpp"
|
||||
#include "opencv2/ml.hpp"
|
||||
#include "opencv2/core/core_c.h"
|
||||
#include "opencv2/core/utility.hpp"
|
||||
|
||||
#include "opencv2/core/private.hpp"
|
||||
|
||||
#include <assert.h>
|
||||
#include <float.h>
|
||||
#include <limits.h>
|
||||
#include <math.h>
|
||||
#include <stdlib.h>
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
#include <time.h>
|
||||
#include <vector>
|
||||
|
||||
/****************************************************************************************\
|
||||
* Main struct definitions *
|
||||
\****************************************************************************************/
|
||||
|
||||
/* log(2*PI) */
|
||||
#define CV_LOG2PI (1.8378770664093454835606594728112)
|
||||
|
||||
namespace cv
|
||||
{
|
||||
namespace ml
|
||||
{
|
||||
using std::vector;
|
||||
|
||||
#define CV_DTREE_CAT_DIR(idx,subset) \
|
||||
(2*((subset[(idx)>>5]&(1 << ((idx) & 31)))==0)-1)
|
||||
|
||||
template<typename _Tp> struct cmp_lt_idx
|
||||
{
|
||||
cmp_lt_idx(const _Tp* _arr) : arr(_arr) {}
|
||||
bool operator ()(int a, int b) const { return arr[a] < arr[b]; }
|
||||
const _Tp* arr;
|
||||
};
|
||||
|
||||
template<typename _Tp> struct cmp_lt_ptr
|
||||
{
|
||||
cmp_lt_ptr() {}
|
||||
bool operator ()(const _Tp* a, const _Tp* b) const { return *a < *b; }
|
||||
};
|
||||
|
||||
static inline void setRangeVector(std::vector<int>& vec, int n)
|
||||
{
|
||||
vec.resize(n);
|
||||
for( int i = 0; i < n; i++ )
|
||||
vec[i] = i;
|
||||
}
|
||||
|
||||
static inline void writeTermCrit(FileStorage& fs, const TermCriteria& termCrit)
|
||||
{
|
||||
if( (termCrit.type & TermCriteria::EPS) != 0 )
|
||||
fs << "epsilon" << termCrit.epsilon;
|
||||
if( (termCrit.type & TermCriteria::COUNT) != 0 )
|
||||
fs << "iterations" << termCrit.maxCount;
|
||||
}
|
||||
|
||||
static inline TermCriteria readTermCrit(const FileNode& fn)
|
||||
{
|
||||
TermCriteria termCrit;
|
||||
double epsilon = (double)fn["epsilon"];
|
||||
if( epsilon > 0 )
|
||||
{
|
||||
termCrit.type |= TermCriteria::EPS;
|
||||
termCrit.epsilon = epsilon;
|
||||
}
|
||||
int iters = (int)fn["iterations"];
|
||||
if( iters > 0 )
|
||||
{
|
||||
termCrit.type |= TermCriteria::COUNT;
|
||||
termCrit.maxCount = iters;
|
||||
}
|
||||
return termCrit;
|
||||
}
|
||||
|
||||
struct TreeParams
|
||||
{
|
||||
TreeParams();
|
||||
TreeParams( int maxDepth, int minSampleCount,
|
||||
double regressionAccuracy, bool useSurrogates,
|
||||
int maxCategories, int CVFolds,
|
||||
bool use1SERule, bool truncatePrunedTree,
|
||||
const Mat& priors );
|
||||
|
||||
inline void setMaxCategories(int val)
|
||||
{
|
||||
if( val < 2 )
|
||||
CV_Error( CV_StsOutOfRange, "max_categories should be >= 2" );
|
||||
maxCategories = std::min(val, 15 );
|
||||
}
|
||||
inline void setMaxDepth(int val)
|
||||
{
|
||||
if( val < 0 )
|
||||
CV_Error( CV_StsOutOfRange, "max_depth should be >= 0" );
|
||||
maxDepth = std::min( val, 25 );
|
||||
}
|
||||
inline void setMinSampleCount(int val)
|
||||
{
|
||||
minSampleCount = std::max(val, 1);
|
||||
}
|
||||
inline void setCVFolds(int val)
|
||||
{
|
||||
if( val < 0 )
|
||||
CV_Error( CV_StsOutOfRange,
|
||||
"params.CVFolds should be =0 (the tree is not pruned) "
|
||||
"or n>0 (tree is pruned using n-fold cross-validation)" );
|
||||
if(val > 1)
|
||||
CV_Error( CV_StsNotImplemented,
|
||||
"tree pruning using cross-validation is not implemented."
|
||||
"Set CVFolds to 1");
|
||||
|
||||
if( val == 1 )
|
||||
val = 0;
|
||||
CVFolds = val;
|
||||
}
|
||||
inline void setRegressionAccuracy(float val)
|
||||
{
|
||||
if( val < 0 )
|
||||
CV_Error( CV_StsOutOfRange, "params.regression_accuracy should be >= 0" );
|
||||
regressionAccuracy = val;
|
||||
}
|
||||
|
||||
inline int getMaxCategories() const { return maxCategories; }
|
||||
inline int getMaxDepth() const { return maxDepth; }
|
||||
inline int getMinSampleCount() const { return minSampleCount; }
|
||||
inline int getCVFolds() const { return CVFolds; }
|
||||
inline float getRegressionAccuracy() const { return regressionAccuracy; }
|
||||
|
||||
inline bool getUseSurrogates() const { return useSurrogates; }
|
||||
inline void setUseSurrogates(bool val) { useSurrogates = val; }
|
||||
inline bool getUse1SERule() const { return use1SERule; }
|
||||
inline void setUse1SERule(bool val) { use1SERule = val; }
|
||||
inline bool getTruncatePrunedTree() const { return truncatePrunedTree; }
|
||||
inline void setTruncatePrunedTree(bool val) { truncatePrunedTree = val; }
|
||||
inline cv::Mat getPriors() const { return priors; }
|
||||
inline void setPriors(const cv::Mat& val) { priors = val; }
|
||||
|
||||
public:
|
||||
bool useSurrogates;
|
||||
bool use1SERule;
|
||||
bool truncatePrunedTree;
|
||||
Mat priors;
|
||||
|
||||
protected:
|
||||
int maxCategories;
|
||||
int maxDepth;
|
||||
int minSampleCount;
|
||||
int CVFolds;
|
||||
float regressionAccuracy;
|
||||
};
|
||||
|
||||
struct RTreeParams
|
||||
{
|
||||
RTreeParams();
|
||||
RTreeParams(bool calcVarImportance, int nactiveVars, TermCriteria termCrit );
|
||||
bool calcVarImportance;
|
||||
int nactiveVars;
|
||||
TermCriteria termCrit;
|
||||
};
|
||||
|
||||
struct BoostTreeParams
|
||||
{
|
||||
BoostTreeParams();
|
||||
BoostTreeParams(int boostType, int weakCount, double weightTrimRate);
|
||||
int boostType;
|
||||
int weakCount;
|
||||
double weightTrimRate;
|
||||
};
|
||||
|
||||
class DTreesImpl : public DTrees
|
||||
{
|
||||
public:
|
||||
struct WNode
|
||||
{
|
||||
WNode()
|
||||
{
|
||||
class_idx = sample_count = depth = complexity = 0;
|
||||
parent = left = right = split = defaultDir = -1;
|
||||
Tn = INT_MAX;
|
||||
value = maxlr = alpha = node_risk = tree_risk = tree_error = 0.;
|
||||
}
|
||||
|
||||
int class_idx;
|
||||
double Tn;
|
||||
double value;
|
||||
|
||||
int parent;
|
||||
int left;
|
||||
int right;
|
||||
int defaultDir;
|
||||
|
||||
int split;
|
||||
|
||||
int sample_count;
|
||||
int depth;
|
||||
double maxlr;
|
||||
|
||||
// global pruning data
|
||||
int complexity;
|
||||
double alpha;
|
||||
double node_risk, tree_risk, tree_error;
|
||||
};
|
||||
|
||||
struct WSplit
|
||||
{
|
||||
WSplit()
|
||||
{
|
||||
varIdx = next = 0;
|
||||
inversed = false;
|
||||
quality = c = 0.f;
|
||||
subsetOfs = -1;
|
||||
}
|
||||
|
||||
int varIdx;
|
||||
bool inversed;
|
||||
float quality;
|
||||
int next;
|
||||
float c;
|
||||
int subsetOfs;
|
||||
};
|
||||
|
||||
struct WorkData
|
||||
{
|
||||
WorkData(const Ptr<TrainData>& _data);
|
||||
|
||||
Ptr<TrainData> data;
|
||||
vector<WNode> wnodes;
|
||||
vector<WSplit> wsplits;
|
||||
vector<int> wsubsets;
|
||||
vector<double> cv_Tn;
|
||||
vector<double> cv_node_risk;
|
||||
vector<double> cv_node_error;
|
||||
vector<int> cv_labels;
|
||||
vector<double> sample_weights;
|
||||
vector<int> cat_responses;
|
||||
vector<double> ord_responses;
|
||||
vector<int> sidx;
|
||||
int maxSubsetSize;
|
||||
};
|
||||
|
||||
inline int getMaxCategories() const CV_OVERRIDE { return params.getMaxCategories(); }
|
||||
inline void setMaxCategories(int val) CV_OVERRIDE { params.setMaxCategories(val); }
|
||||
inline int getMaxDepth() const CV_OVERRIDE { return params.getMaxDepth(); }
|
||||
inline void setMaxDepth(int val) CV_OVERRIDE { params.setMaxDepth(val); }
|
||||
inline int getMinSampleCount() const CV_OVERRIDE { return params.getMinSampleCount(); }
|
||||
inline void setMinSampleCount(int val) CV_OVERRIDE { params.setMinSampleCount(val); }
|
||||
inline int getCVFolds() const CV_OVERRIDE { return params.getCVFolds(); }
|
||||
inline void setCVFolds(int val) CV_OVERRIDE { params.setCVFolds(val); }
|
||||
inline bool getUseSurrogates() const CV_OVERRIDE { return params.getUseSurrogates(); }
|
||||
inline void setUseSurrogates(bool val) CV_OVERRIDE { params.setUseSurrogates(val); }
|
||||
inline bool getUse1SERule() const CV_OVERRIDE { return params.getUse1SERule(); }
|
||||
inline void setUse1SERule(bool val) CV_OVERRIDE { params.setUse1SERule(val); }
|
||||
inline bool getTruncatePrunedTree() const CV_OVERRIDE { return params.getTruncatePrunedTree(); }
|
||||
inline void setTruncatePrunedTree(bool val) CV_OVERRIDE { params.setTruncatePrunedTree(val); }
|
||||
inline float getRegressionAccuracy() const CV_OVERRIDE { return params.getRegressionAccuracy(); }
|
||||
inline void setRegressionAccuracy(float val) CV_OVERRIDE { params.setRegressionAccuracy(val); }
|
||||
inline cv::Mat getPriors() const CV_OVERRIDE { return params.getPriors(); }
|
||||
inline void setPriors(const cv::Mat& val) CV_OVERRIDE { params.setPriors(val); }
|
||||
|
||||
DTreesImpl();
|
||||
virtual ~DTreesImpl() CV_OVERRIDE;
|
||||
virtual void clear() CV_OVERRIDE;
|
||||
|
||||
String getDefaultName() const CV_OVERRIDE { return "opencv_ml_dtree"; }
|
||||
bool isTrained() const CV_OVERRIDE { return !roots.empty(); }
|
||||
bool isClassifier() const CV_OVERRIDE { return _isClassifier; }
|
||||
int getVarCount() const CV_OVERRIDE { return varType.empty() ? 0 : (int)(varType.size() - 1); }
|
||||
int getCatCount(int vi) const { return catOfs[vi][1] - catOfs[vi][0]; }
|
||||
int getSubsetSize(int vi) const { return (getCatCount(vi) + 31)/32; }
|
||||
|
||||
virtual void setDParams(const TreeParams& _params);
|
||||
virtual void startTraining( const Ptr<TrainData>& trainData, int flags );
|
||||
virtual void endTraining();
|
||||
virtual void initCompVarIdx();
|
||||
virtual bool train( const Ptr<TrainData>& trainData, int flags ) CV_OVERRIDE;
|
||||
|
||||
virtual int addTree( const vector<int>& sidx );
|
||||
virtual int addNodeAndTrySplit( int parent, const vector<int>& sidx );
|
||||
virtual const vector<int>& getActiveVars();
|
||||
virtual int findBestSplit( const vector<int>& _sidx );
|
||||
virtual void calcValue( int nidx, const vector<int>& _sidx );
|
||||
|
||||
virtual WSplit findSplitOrdClass( int vi, const vector<int>& _sidx, double initQuality );
|
||||
|
||||
// simple k-means, slightly modified to take into account the "weight" (L1-norm) of each vector.
|
||||
virtual void clusterCategories( const double* vectors, int n, int m, double* csums, int k, int* labels );
|
||||
virtual WSplit findSplitCatClass( int vi, const vector<int>& _sidx, double initQuality, int* subset );
|
||||
|
||||
virtual WSplit findSplitOrdReg( int vi, const vector<int>& _sidx, double initQuality );
|
||||
virtual WSplit findSplitCatReg( int vi, const vector<int>& _sidx, double initQuality, int* subset );
|
||||
|
||||
virtual int calcDir( int splitidx, const vector<int>& _sidx, vector<int>& _sleft, vector<int>& _sright );
|
||||
virtual int pruneCV( int root );
|
||||
|
||||
virtual double updateTreeRNC( int root, double T, int fold );
|
||||
virtual bool cutTree( int root, double T, int fold, double min_alpha );
|
||||
virtual float predictTrees( const Range& range, const Mat& sample, int flags ) const;
|
||||
virtual float predict( InputArray inputs, OutputArray outputs, int flags ) const CV_OVERRIDE;
|
||||
|
||||
virtual void writeTrainingParams( FileStorage& fs ) const;
|
||||
virtual void writeParams( FileStorage& fs ) const;
|
||||
virtual void writeSplit( FileStorage& fs, int splitidx ) const;
|
||||
virtual void writeNode( FileStorage& fs, int nidx, int depth ) const;
|
||||
virtual void writeTree( FileStorage& fs, int root ) const;
|
||||
virtual void write( FileStorage& fs ) const CV_OVERRIDE;
|
||||
|
||||
virtual void readParams( const FileNode& fn );
|
||||
virtual int readSplit( const FileNode& fn );
|
||||
virtual int readNode( const FileNode& fn );
|
||||
virtual int readTree( const FileNode& fn );
|
||||
virtual void read( const FileNode& fn ) CV_OVERRIDE;
|
||||
|
||||
virtual const std::vector<int>& getRoots() const CV_OVERRIDE { return roots; }
|
||||
virtual const std::vector<Node>& getNodes() const CV_OVERRIDE { return nodes; }
|
||||
virtual const std::vector<Split>& getSplits() const CV_OVERRIDE { return splits; }
|
||||
virtual const std::vector<int>& getSubsets() const CV_OVERRIDE { return subsets; }
|
||||
|
||||
TreeParams params;
|
||||
|
||||
vector<int> varIdx;
|
||||
vector<int> compVarIdx;
|
||||
vector<uchar> varType;
|
||||
vector<Vec2i> catOfs;
|
||||
vector<int> catMap;
|
||||
vector<int> roots;
|
||||
vector<Node> nodes;
|
||||
vector<Split> splits;
|
||||
vector<int> subsets;
|
||||
vector<int> classLabels;
|
||||
vector<float> missingSubst;
|
||||
vector<int> varMapping;
|
||||
bool _isClassifier;
|
||||
|
||||
Ptr<WorkData> w;
|
||||
};
|
||||
|
||||
template <typename T>
|
||||
static inline void readVectorOrMat(const FileNode & node, std::vector<T> & v)
|
||||
{
|
||||
if (node.type() == FileNode::MAP)
|
||||
{
|
||||
Mat m;
|
||||
node >> m;
|
||||
m.copyTo(v);
|
||||
}
|
||||
else if (node.type() == FileNode::SEQ)
|
||||
{
|
||||
node >> v;
|
||||
}
|
||||
}
|
||||
|
||||
}}
|
||||
|
||||
#endif /* __OPENCV_ML_PRECOMP_HPP__ */
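The TreeParams setters declared above validate and clamp the public decision-tree parameters (max depth capped at 25, max categories at 15, CVFolds restricted to 0 because cross-validation pruning is not implemented). A rough usage sketch through the public cv::ml::DTrees interface, with assumed toy data:

#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

int main()
{
    using namespace cv;
    using namespace cv::ml;

    Mat samples = (Mat_<float>(6, 1) << 1.f, 2.f, 3.f, 10.f, 11.f, 12.f);
    Mat labels  = (Mat_<int>(6, 1)   << 0, 0, 0, 1, 1, 1);

    Ptr<DTrees> dtree = DTrees::create();
    dtree->setMaxDepth(5);          // validated by TreeParams::setMaxDepth (capped at 25)
    dtree->setMinSampleCount(1);    // clamped to >= 1
    dtree->setCVFolds(0);           // pruning via cross-validation is not implemented
    dtree->setMaxCategories(10);    // must be >= 2, capped at 15
    dtree->train(samples, ROW_SAMPLE, labels);

    float cls = dtree->predict((Mat_<float>(1, 1) << 11.5f));
    (void)cls;
    return 0;
}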
531
3rdparty/opencv-4.5.4/modules/ml/src/rtrees.cpp
vendored
Normal file
@@ -0,0 +1,531 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// License Agreement
|
||||
// For Open Source Computer Vision Library
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Copyright (C) 2014, Itseez Inc, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of the copyright holders may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
namespace cv {
|
||||
namespace ml {
|
||||
|
||||
//////////////////////////////////////////////////////////////////////////////////////////
|
||||
// Random trees //
|
||||
//////////////////////////////////////////////////////////////////////////////////////////
|
||||
RTreeParams::RTreeParams()
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
calcVarImportance = false;
|
||||
nactiveVars = 0;
|
||||
termCrit = TermCriteria(TermCriteria::EPS + TermCriteria::COUNT, 50, 0.1);
|
||||
}
|
||||
|
||||
RTreeParams::RTreeParams(bool _calcVarImportance,
|
||||
int _nactiveVars,
|
||||
TermCriteria _termCrit )
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
calcVarImportance = _calcVarImportance;
|
||||
nactiveVars = _nactiveVars;
|
||||
termCrit = _termCrit;
|
||||
}
|
||||
|
||||
|
||||
class DTreesImplForRTrees CV_FINAL : public DTreesImpl
|
||||
{
|
||||
public:
|
||||
DTreesImplForRTrees()
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
params.setMaxDepth(5);
|
||||
params.setMinSampleCount(10);
|
||||
params.setRegressionAccuracy(0.f);
|
||||
params.useSurrogates = false;
|
||||
params.setMaxCategories(10);
|
||||
params.setCVFolds(0);
|
||||
params.use1SERule = false;
|
||||
params.truncatePrunedTree = false;
|
||||
params.priors = Mat();
|
||||
oobError = 0;
|
||||
}
|
||||
virtual ~DTreesImplForRTrees() {}
|
||||
|
||||
void clear() CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
DTreesImpl::clear();
|
||||
oobError = 0.;
|
||||
}
|
||||
|
||||
const vector<int>& getActiveVars() CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
RNG &rng = theRNG();
|
||||
int i, nvars = (int)allVars.size(), m = (int)activeVars.size();
|
||||
for( i = 0; i < nvars; i++ )
|
||||
{
|
||||
int i1 = rng.uniform(0, nvars);
|
||||
int i2 = rng.uniform(0, nvars);
|
||||
std::swap(allVars[i1], allVars[i2]);
|
||||
}
|
||||
for( i = 0; i < m; i++ )
|
||||
activeVars[i] = allVars[i];
|
||||
return activeVars;
|
||||
}
|
||||
|
||||
void startTraining( const Ptr<TrainData>& trainData, int flags ) CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
CV_Assert(!trainData.empty());
|
||||
DTreesImpl::startTraining(trainData, flags);
|
||||
int nvars = w->data->getNVars();
|
||||
int i, m = rparams.nactiveVars > 0 ? rparams.nactiveVars : cvRound(std::sqrt((double)nvars));
|
||||
m = std::min(std::max(m, 1), nvars);
|
||||
allVars.resize(nvars);
|
||||
activeVars.resize(m);
|
||||
for( i = 0; i < nvars; i++ )
|
||||
allVars[i] = varIdx[i];
|
||||
}
|
||||
|
||||
void endTraining() CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
DTreesImpl::endTraining();
|
||||
vector<int> a, b;
|
||||
std::swap(allVars, a);
|
||||
std::swap(activeVars, b);
|
||||
}
|
||||
|
||||
bool train( const Ptr<TrainData>& trainData, int flags ) CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
RNG &rng = theRNG();
|
||||
CV_Assert(!trainData.empty());
|
||||
startTraining(trainData, flags);
|
||||
int treeidx, ntrees = (rparams.termCrit.type & TermCriteria::COUNT) != 0 ?
|
||||
rparams.termCrit.maxCount : 10000;
|
||||
int i, j, k, vi, vi_, n = (int)w->sidx.size();
|
||||
int nclasses = (int)classLabels.size();
|
||||
double eps = (rparams.termCrit.type & TermCriteria::EPS) != 0 &&
|
||||
rparams.termCrit.epsilon > 0 ? rparams.termCrit.epsilon : 0.;
|
||||
vector<int> sidx(n);
|
||||
vector<uchar> oobmask(n);
|
||||
vector<int> oobidx;
|
||||
vector<int> oobperm;
|
||||
vector<double> oobres(n, 0.);
|
||||
vector<int> oobcount(n, 0);
|
||||
vector<int> oobvotes(n*nclasses, 0);
|
||||
int nvars = w->data->getNVars();
|
||||
int nallvars = w->data->getNAllVars();
|
||||
const int* vidx = !varIdx.empty() ? &varIdx[0] : 0;
|
||||
vector<float> samplebuf(nallvars);
|
||||
Mat samples = w->data->getSamples();
|
||||
float* psamples = samples.ptr<float>();
|
||||
size_t sstep0 = samples.step1(), sstep1 = 1;
|
||||
Mat sample0, sample(nallvars, 1, CV_32F, &samplebuf[0]);
|
||||
int predictFlags = _isClassifier ? (PREDICT_MAX_VOTE + RAW_OUTPUT) : PREDICT_SUM;
|
||||
|
||||
bool calcOOBError = eps > 0 || rparams.calcVarImportance;
|
||||
double max_response = 0.;
|
||||
|
||||
if( w->data->getLayout() == COL_SAMPLE )
|
||||
std::swap(sstep0, sstep1);
|
||||
|
||||
if( !_isClassifier )
|
||||
{
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
double val = std::abs(w->ord_responses[w->sidx[i]]);
|
||||
max_response = std::max(max_response, val);
|
||||
}
|
||||
CV_Assert(fabs(max_response) > 0);
|
||||
}
|
||||
|
||||
if( rparams.calcVarImportance )
|
||||
varImportance.resize(nallvars, 0.f);
|
||||
|
||||
for( treeidx = 0; treeidx < ntrees; treeidx++ )
|
||||
{
|
||||
for( i = 0; i < n; i++ )
|
||||
oobmask[i] = (uchar)1;
|
||||
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
j = rng.uniform(0, n);
|
||||
sidx[i] = w->sidx[j];
|
||||
oobmask[j] = (uchar)0;
|
||||
}
|
||||
int root = addTree( sidx );
|
||||
if( root < 0 )
|
||||
return false;
|
||||
|
||||
if( calcOOBError )
|
||||
{
|
||||
oobidx.clear();
|
||||
for( i = 0; i < n; i++ )
|
||||
{
|
||||
if( oobmask[i] )
|
||||
oobidx.push_back(i);
|
||||
}
|
||||
int n_oob = (int)oobidx.size();
|
||||
// if there is no out-of-bag samples, we can not compute OOB error
|
||||
// nor update the variable importance vector; so we proceed to the next tree
|
||||
if( n_oob == 0 )
|
||||
continue;
|
||||
double ncorrect_responses = 0.;
|
||||
|
||||
oobError = 0.;
|
||||
for( i = 0; i < n_oob; i++ )
|
||||
{
|
||||
j = oobidx[i];
|
||||
sample = Mat( nallvars, 1, CV_32F, psamples + sstep0*w->sidx[j], sstep1*sizeof(psamples[0]) );
|
||||
|
||||
double val = predictTrees(Range(treeidx, treeidx+1), sample, predictFlags);
|
||||
double sample_weight = w->sample_weights[w->sidx[j]];
|
||||
if( !_isClassifier )
|
||||
{
|
||||
oobres[j] += val;
|
||||
oobcount[j]++;
|
||||
double true_val = w->ord_responses[w->sidx[j]];
|
||||
double a = oobres[j]/oobcount[j] - true_val;
|
||||
oobError += sample_weight * a*a;
|
||||
val = (val - true_val)/max_response;
|
||||
ncorrect_responses += std::exp( -val*val );
|
||||
}
|
||||
else
|
||||
{
|
||||
int ival = cvRound(val);
|
||||
//Voting scheme to combine OOB errors of each tree
|
||||
int* votes = &oobvotes[j*nclasses];
|
||||
votes[ival]++;
|
||||
int best_class = 0;
|
||||
for( k = 1; k < nclasses; k++ )
|
||||
if( votes[best_class] < votes[k] )
|
||||
best_class = k;
|
||||
int diff = best_class != w->cat_responses[w->sidx[j]];
|
||||
oobError += sample_weight * diff;
|
||||
ncorrect_responses += diff == 0;
|
||||
}
|
||||
}
|
||||
|
||||
oobError /= n_oob;
|
||||
if( rparams.calcVarImportance && n_oob > 1 )
|
||||
{
|
||||
Mat sample_clone;
|
||||
oobperm.resize(n_oob);
|
||||
for( i = 0; i < n_oob; i++ )
|
||||
oobperm[i] = oobidx[i];
|
||||
for (i = n_oob - 1; i > 0; --i) //Randomly shuffle indices so we can permute features
|
||||
{
|
||||
int r_i = rng.uniform(0, n_oob);
|
||||
std::swap(oobperm[i], oobperm[r_i]);
|
||||
}
|
||||
|
||||
for( vi_ = 0; vi_ < nvars; vi_++ )
|
||||
{
|
||||
vi = vidx ? vidx[vi_] : vi_; //Ensure that only the user specified predictors are used for training
|
||||
double ncorrect_responses_permuted = 0;
|
||||
|
||||
for( i = 0; i < n_oob; i++ )
|
||||
{
|
||||
j = oobidx[i];
|
||||
int vj = oobperm[i];
|
||||
sample0 = Mat( nallvars, 1, CV_32F, psamples + sstep0*w->sidx[j], sstep1*sizeof(psamples[0]) );
|
||||
sample0.copyTo(sample_clone); //create a copy so we don't mess up the original data
|
||||
sample_clone.at<float>(vi) = psamples[sstep0*w->sidx[vj] + sstep1*vi];
|
||||
|
||||
double val = predictTrees(Range(treeidx, treeidx+1), sample_clone, predictFlags);
|
||||
if( !_isClassifier )
|
||||
{
|
||||
val = (val - w->ord_responses[w->sidx[j]])/max_response;
|
||||
ncorrect_responses_permuted += exp( -val*val );
|
||||
}
|
||||
else
|
||||
{
|
||||
ncorrect_responses_permuted += cvRound(val) == w->cat_responses[w->sidx[j]];
|
||||
}
|
||||
}
|
||||
varImportance[vi] += (float)(ncorrect_responses - ncorrect_responses_permuted);
|
||||
}
|
||||
}
|
||||
}
|
||||
if( calcOOBError && oobError < eps )
|
||||
break;
|
||||
}
|
||||
|
||||
if( rparams.calcVarImportance )
|
||||
{
|
||||
for( vi_ = 0; vi_ < nallvars; vi_++ )
|
||||
varImportance[vi_] = std::max(varImportance[vi_], 0.f);
|
||||
normalize(varImportance, varImportance, 1., 0, NORM_L1);
|
||||
}
|
||||
endTraining();
|
||||
return true;
|
||||
}
|
||||
|
||||
void writeTrainingParams( FileStorage& fs ) const CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
DTreesImpl::writeTrainingParams(fs);
|
||||
fs << "nactive_vars" << rparams.nactiveVars;
|
||||
}
|
||||
|
||||
void write( FileStorage& fs ) const CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
if( roots.empty() )
|
||||
CV_Error( CV_StsBadArg, "RTrees have not been trained" );
|
||||
|
||||
writeFormat(fs);
|
||||
writeParams(fs);
|
||||
|
||||
fs << "oob_error" << oobError;
|
||||
if( !varImportance.empty() )
|
||||
fs << "var_importance" << varImportance;
|
||||
|
||||
int k, ntrees = (int)roots.size();
|
||||
|
||||
fs << "ntrees" << ntrees
|
||||
<< "trees" << "[";
|
||||
|
||||
for( k = 0; k < ntrees; k++ )
|
||||
{
|
||||
fs << "{";
|
||||
writeTree(fs, roots[k]);
|
||||
fs << "}";
|
||||
}
|
||||
|
||||
fs << "]";
|
||||
}
|
||||
|
||||
void readParams( const FileNode& fn ) CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
DTreesImpl::readParams(fn);
|
||||
|
||||
FileNode tparams_node = fn["training_params"];
|
||||
rparams.nactiveVars = (int)tparams_node["nactive_vars"];
|
||||
}
|
||||
|
||||
void read( const FileNode& fn ) CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
clear();
|
||||
|
||||
//int nclasses = (int)fn["nclasses"];
|
||||
//int nsamples = (int)fn["nsamples"];
|
||||
oobError = (double)fn["oob_error"];
|
||||
int ntrees = (int)fn["ntrees"];
|
||||
|
||||
readVectorOrMat(fn["var_importance"], varImportance);
|
||||
|
||||
readParams(fn);
|
||||
|
||||
FileNode trees_node = fn["trees"];
|
||||
FileNodeIterator it = trees_node.begin();
|
||||
CV_Assert( ntrees == (int)trees_node.size() );
|
||||
|
||||
for( int treeidx = 0; treeidx < ntrees; treeidx++, ++it )
|
||||
{
|
||||
FileNode nfn = (*it)["nodes"];
|
||||
readTree(nfn);
|
||||
}
|
||||
}
|
||||
|
||||
void getVotes( InputArray input, OutputArray output, int flags ) const
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
CV_Assert( !roots.empty() );
|
||||
int nclasses = (int)classLabels.size(), ntrees = (int)roots.size();
|
||||
Mat samples = input.getMat(), results;
|
||||
int i, j, nsamples = samples.rows;
|
||||
|
||||
int predictType = flags & PREDICT_MASK;
|
||||
if( predictType == PREDICT_AUTO )
|
||||
{
|
||||
predictType = !_isClassifier || (classLabels.size() == 2 && (flags & RAW_OUTPUT) != 0) ?
|
||||
PREDICT_SUM : PREDICT_MAX_VOTE;
|
||||
}
|
||||
|
||||
if( predictType == PREDICT_SUM )
|
||||
{
|
||||
output.create(nsamples, ntrees, CV_32F);
|
||||
results = output.getMat();
|
||||
for( i = 0; i < nsamples; i++ )
|
||||
{
|
||||
for( j = 0; j < ntrees; j++ )
|
||||
{
|
||||
float val = predictTrees( Range(j, j+1), samples.row(i), flags);
|
||||
results.at<float> (i, j) = val;
|
||||
}
|
||||
}
|
||||
} else
|
||||
{
|
||||
vector<int> votes;
|
||||
output.create(nsamples+1, nclasses, CV_32S);
|
||||
results = output.getMat();
|
||||
|
||||
for ( j = 0; j < nclasses; j++)
|
||||
{
|
||||
results.at<int> (0, j) = classLabels[j];
|
||||
}
|
||||
|
||||
for( i = 0; i < nsamples; i++ )
|
||||
{
|
||||
votes.clear();
|
||||
for( j = 0; j < ntrees; j++ )
|
||||
{
|
||||
int val = (int)predictTrees( Range(j, j+1), samples.row(i), flags);
|
||||
votes.push_back(val);
|
||||
}
|
||||
|
||||
for ( j = 0; j < nclasses; j++)
|
||||
{
|
||||
results.at<int> (i+1, j) = (int)std::count(votes.begin(), votes.end(), classLabels[j]);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
double getOOBError() const {
|
||||
return oobError;
|
||||
}
|
||||
|
||||
RTreeParams rparams;
|
||||
double oobError;
|
||||
vector<float> varImportance;
|
||||
vector<int> allVars, activeVars;
|
||||
};
|
||||
|
||||
|
||||
class RTreesImpl CV_FINAL : public RTrees
|
||||
{
|
||||
public:
|
||||
inline bool getCalculateVarImportance() const CV_OVERRIDE { return impl.rparams.calcVarImportance; }
|
||||
inline void setCalculateVarImportance(bool val) CV_OVERRIDE { impl.rparams.calcVarImportance = val; }
|
||||
inline int getActiveVarCount() const CV_OVERRIDE { return impl.rparams.nactiveVars; }
|
||||
inline void setActiveVarCount(int val) CV_OVERRIDE { impl.rparams.nactiveVars = val; }
|
||||
inline TermCriteria getTermCriteria() const CV_OVERRIDE { return impl.rparams.termCrit; }
|
||||
inline void setTermCriteria(const TermCriteria& val) CV_OVERRIDE { impl.rparams.termCrit = val; }
|
||||
|
||||
inline int getMaxCategories() const CV_OVERRIDE { return impl.params.getMaxCategories(); }
|
||||
inline void setMaxCategories(int val) CV_OVERRIDE { impl.params.setMaxCategories(val); }
|
||||
inline int getMaxDepth() const CV_OVERRIDE { return impl.params.getMaxDepth(); }
|
||||
inline void setMaxDepth(int val) CV_OVERRIDE { impl.params.setMaxDepth(val); }
|
||||
inline int getMinSampleCount() const CV_OVERRIDE { return impl.params.getMinSampleCount(); }
|
||||
inline void setMinSampleCount(int val) CV_OVERRIDE { impl.params.setMinSampleCount(val); }
|
||||
inline int getCVFolds() const CV_OVERRIDE { return impl.params.getCVFolds(); }
|
||||
inline void setCVFolds(int val) CV_OVERRIDE { impl.params.setCVFolds(val); }
|
||||
inline bool getUseSurrogates() const CV_OVERRIDE { return impl.params.getUseSurrogates(); }
|
||||
inline void setUseSurrogates(bool val) CV_OVERRIDE { impl.params.setUseSurrogates(val); }
|
||||
inline bool getUse1SERule() const CV_OVERRIDE { return impl.params.getUse1SERule(); }
|
||||
inline void setUse1SERule(bool val) CV_OVERRIDE { impl.params.setUse1SERule(val); }
|
||||
inline bool getTruncatePrunedTree() const CV_OVERRIDE { return impl.params.getTruncatePrunedTree(); }
|
||||
inline void setTruncatePrunedTree(bool val) CV_OVERRIDE { impl.params.setTruncatePrunedTree(val); }
|
||||
inline float getRegressionAccuracy() const CV_OVERRIDE { return impl.params.getRegressionAccuracy(); }
|
||||
inline void setRegressionAccuracy(float val) CV_OVERRIDE { impl.params.setRegressionAccuracy(val); }
|
||||
inline cv::Mat getPriors() const CV_OVERRIDE { return impl.params.getPriors(); }
|
||||
inline void setPriors(const cv::Mat& val) CV_OVERRIDE { impl.params.setPriors(val); }
|
||||
inline void getVotes(InputArray input, OutputArray output, int flags) const CV_OVERRIDE {return impl.getVotes(input,output,flags);}
|
||||
|
||||
RTreesImpl() {}
|
||||
virtual ~RTreesImpl() CV_OVERRIDE {}
|
||||
|
||||
String getDefaultName() const CV_OVERRIDE { return "opencv_ml_rtrees"; }
|
||||
|
||||
bool train( const Ptr<TrainData>& trainData, int flags ) CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
CV_Assert(!trainData.empty());
|
||||
if (impl.getCVFolds() != 0)
|
||||
CV_Error(Error::StsBadArg, "Cross validation for RTrees is not implemented");
|
||||
return impl.train(trainData, flags);
|
||||
}
|
||||
|
||||
float predict( InputArray samples, OutputArray results, int flags ) const CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
CV_CheckEQ(samples.cols(), getVarCount(), "");
|
||||
return impl.predict(samples, results, flags);
|
||||
}
|
||||
|
||||
void write( FileStorage& fs ) const CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
impl.write(fs);
|
||||
}
|
||||
|
||||
void read( const FileNode& fn ) CV_OVERRIDE
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
impl.read(fn);
|
||||
}
|
||||
|
||||
Mat getVarImportance() const CV_OVERRIDE { return Mat_<float>(impl.varImportance, true); }
|
||||
int getVarCount() const CV_OVERRIDE { return impl.getVarCount(); }
|
||||
|
||||
bool isTrained() const CV_OVERRIDE { return impl.isTrained(); }
|
||||
bool isClassifier() const CV_OVERRIDE { return impl.isClassifier(); }
|
||||
|
||||
const vector<int>& getRoots() const CV_OVERRIDE { return impl.getRoots(); }
|
||||
const vector<Node>& getNodes() const CV_OVERRIDE { return impl.getNodes(); }
|
||||
const vector<Split>& getSplits() const CV_OVERRIDE { return impl.getSplits(); }
|
||||
const vector<int>& getSubsets() const CV_OVERRIDE { return impl.getSubsets(); }
|
||||
double getOOBError() const CV_OVERRIDE { return impl.getOOBError(); }
|
||||
|
||||
|
||||
DTreesImplForRTrees impl;
|
||||
};
|
||||
|
||||
|
||||
Ptr<RTrees> RTrees::create()
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
return makePtr<RTreesImpl>();
|
||||
}
|
||||
|
||||
//Function needed for Python and Java wrappers
|
||||
Ptr<RTrees> RTrees::load(const String& filepath, const String& nodeName)
|
||||
{
|
||||
CV_TRACE_FUNCTION();
|
||||
return Algorithm::load<RTrees>(filepath, nodeName);
|
||||
}
|
||||
|
||||
}}
|
||||
|
||||
// End of file.
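A minimal sketch (assumed toy data) of driving the random forest above through the public cv::ml::RTrees interface, including the variable importance and out-of-bag error maintained by DTreesImplForRTrees:

#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

int main()
{
    using namespace cv;
    using namespace cv::ml;

    Mat samples = (Mat_<float>(6, 2) << 1.f, 1.f, 2.f, 1.f, 1.f, 2.f,
                                        9.f, 9.f, 8.f, 9.f, 9.f, 8.f);
    Mat labels  = (Mat_<int>(6, 1)   << 0, 0, 0, 1, 1, 1);

    Ptr<RTrees> forest = RTrees::create();
    forest->setCalculateVarImportance(true);
    forest->setActiveVarCount(0);   // 0 -> sqrt(nvars), as in startTraining() above
    forest->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER + TermCriteria::EPS, 50, 0.01));
    forest->train(samples, ROW_SAMPLE, labels);

    Mat importance = forest->getVarImportance();  // normalized to an L1 sum of 1
    double oob = forest->getOOBError();           // out-of-bag error estimate
    float cls  = forest->predict((Mat_<float>(1, 2) << 8.5f, 8.7f));
    (void)importance; (void)oob; (void)cls;
    return 0;
}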
2363
3rdparty/opencv-4.5.4/modules/ml/src/svm.cpp
vendored
Normal file
File diff suppressed because it is too large
524
3rdparty/opencv-4.5.4/modules/ml/src/svmsgd.cpp
vendored
Normal file
@@ -0,0 +1,524 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// License Agreement
|
||||
// For Open Source Computer Vision Library
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Copyright (C) 2016, Itseez Inc, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of the copyright holders may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
#include "limits"
|
||||
|
||||
#include <iostream>
|
||||
|
||||
using std::cout;
|
||||
using std::endl;
|
||||
|
||||
/****************************************************************************************\
|
||||
* Stochastic Gradient Descent SVM Classifier *
|
||||
\****************************************************************************************/
|
||||
|
||||
namespace cv
|
||||
{
|
||||
namespace ml
|
||||
{
|
||||
|
||||
class SVMSGDImpl CV_FINAL : public SVMSGD
|
||||
{
|
||||
|
||||
public:
|
||||
SVMSGDImpl();
|
||||
|
||||
virtual ~SVMSGDImpl() {}
|
||||
|
||||
virtual bool train(const Ptr<TrainData>& data, int) CV_OVERRIDE;
|
||||
|
||||
virtual float predict( InputArray samples, OutputArray results=noArray(), int flags = 0 ) const CV_OVERRIDE;
|
||||
|
||||
virtual bool isClassifier() const CV_OVERRIDE;
|
||||
|
||||
virtual bool isTrained() const CV_OVERRIDE;
|
||||
|
||||
virtual void clear() CV_OVERRIDE;
|
||||
|
||||
virtual void write(FileStorage &fs) const CV_OVERRIDE;
|
||||
|
||||
virtual void read(const FileNode &fn) CV_OVERRIDE;
|
||||
|
||||
virtual Mat getWeights() CV_OVERRIDE { return weights_; }
|
||||
|
||||
virtual float getShift() CV_OVERRIDE { return shift_; }
|
||||
|
||||
virtual int getVarCount() const CV_OVERRIDE { return weights_.cols; }
|
||||
|
||||
virtual String getDefaultName() const CV_OVERRIDE {return "opencv_ml_svmsgd";}
|
||||
|
||||
virtual void setOptimalParameters(int svmsgdType = ASGD, int marginType = SOFT_MARGIN) CV_OVERRIDE;
|
||||
|
||||
inline int getSvmsgdType() const CV_OVERRIDE { return params.svmsgdType; }
|
||||
inline void setSvmsgdType(int val) CV_OVERRIDE { params.svmsgdType = val; }
|
||||
inline int getMarginType() const CV_OVERRIDE { return params.marginType; }
|
||||
inline void setMarginType(int val) CV_OVERRIDE { params.marginType = val; }
|
||||
inline float getMarginRegularization() const CV_OVERRIDE { return params.marginRegularization; }
|
||||
inline void setMarginRegularization(float val) CV_OVERRIDE { params.marginRegularization = val; }
|
||||
inline float getInitialStepSize() const CV_OVERRIDE { return params.initialStepSize; }
|
||||
inline void setInitialStepSize(float val) CV_OVERRIDE { params.initialStepSize = val; }
|
||||
inline float getStepDecreasingPower() const CV_OVERRIDE { return params.stepDecreasingPower; }
|
||||
inline void setStepDecreasingPower(float val) CV_OVERRIDE { params.stepDecreasingPower = val; }
|
||||
inline cv::TermCriteria getTermCriteria() const CV_OVERRIDE { return params.termCrit; }
|
||||
inline void setTermCriteria(const cv::TermCriteria& val) CV_OVERRIDE { params.termCrit = val; }
|
||||
|
||||
private:
|
||||
void updateWeights(InputArray sample, bool positive, float stepSize, Mat &weights);
|
||||
|
||||
void writeParams( FileStorage &fs ) const;
|
||||
|
||||
void readParams( const FileNode &fn );
|
||||
|
||||
static inline bool isPositive(float val) { return val > 0; }
|
||||
|
||||
static void normalizeSamples(Mat &matrix, Mat &average, float &multiplier);
|
||||
|
||||
float calcShift(InputArray _samples, InputArray _responses) const;
|
||||
|
||||
static void makeExtendedTrainSamples(const Mat &trainSamples, Mat &extendedTrainSamples, Mat &average, float &multiplier);
|
||||
|
||||
// Vector with SVM weights
|
||||
Mat weights_;
|
||||
float shift_;
|
||||
|
||||
// Parameters for learning
|
||||
struct SVMSGDParams
|
||||
{
|
||||
float marginRegularization;
|
||||
float initialStepSize;
|
||||
float stepDecreasingPower;
|
||||
TermCriteria termCrit;
|
||||
int svmsgdType;
|
||||
int marginType;
|
||||
};
|
||||
|
||||
SVMSGDParams params;
|
||||
};
|
||||
|
||||
Ptr<SVMSGD> SVMSGD::create()
|
||||
{
|
||||
return makePtr<SVMSGDImpl>();
|
||||
}
|
||||
|
||||
Ptr<SVMSGD> SVMSGD::load(const String& filepath, const String& nodeName)
|
||||
{
|
||||
return Algorithm::load<SVMSGD>(filepath, nodeName);
|
||||
}
|
||||
|
||||
|
||||
void SVMSGDImpl::normalizeSamples(Mat &samples, Mat &average, float &multiplier)
|
||||
{
|
||||
int featuresCount = samples.cols;
|
||||
int samplesCount = samples.rows;
|
||||
|
||||
average = Mat(1, featuresCount, samples.type());
|
||||
CV_Assert(average.type() == CV_32FC1);
|
||||
for (int featureIndex = 0; featureIndex < featuresCount; featureIndex++)
|
||||
{
|
||||
average.at<float>(featureIndex) = static_cast<float>(mean(samples.col(featureIndex))[0]);
|
||||
}
|
||||
|
||||
for (int sampleIndex = 0; sampleIndex < samplesCount; sampleIndex++)
|
||||
{
|
||||
samples.row(sampleIndex) -= average;
|
||||
}
|
||||
|
||||
double normValue = norm(samples);
|
||||
|
||||
multiplier = static_cast<float>(sqrt(static_cast<double>(samples.total())) / normValue);
|
||||
|
||||
samples *= multiplier;
|
||||
}
|
||||
|
||||
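// Normalizes a copy of the training samples and appends a constant 1.0 column, so that the bias
// term can be learned as the last component of the extended weight vector.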
void SVMSGDImpl::makeExtendedTrainSamples(const Mat &trainSamples, Mat &extendedTrainSamples, Mat &average, float &multiplier)
|
||||
{
|
||||
Mat normalizedTrainSamples = trainSamples.clone();
|
||||
int samplesCount = normalizedTrainSamples.rows;
|
||||
|
||||
normalizeSamples(normalizedTrainSamples, average, multiplier);
|
||||
|
||||
Mat onesCol = Mat::ones(samplesCount, 1, CV_32F);
|
||||
cv::hconcat(normalizedTrainSamples, onesCol, extendedTrainSamples);
|
||||
}
|
||||
|
||||
void SVMSGDImpl::updateWeights(InputArray _sample, bool positive, float stepSize, Mat& weights)
|
||||
{
|
||||
Mat sample = _sample.getMat();
|
||||
|
||||
int response = positive ? 1 : -1; // ensure that trainResponses are -1 or 1
|
||||
|
||||
if ( sample.dot(weights) * response > 1)
|
||||
{
|
||||
// Not a support vector, only apply weight decay
|
||||
weights *= (1.f - stepSize * params.marginRegularization);
|
||||
}
|
||||
else
|
||||
{
|
||||
// It's a support vector, add it to the weights
|
||||
weights -= (stepSize * params.marginRegularization) * weights - (stepSize * response) * sample;
|
||||
}
|
||||
}
|
||||
|
||||
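// HARD_MARGIN case: places the separating hyperplane halfway between the closest positive sample
// and the closest negative sample along the direction of weights_.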
float SVMSGDImpl::calcShift(InputArray _samples, InputArray _responses) const
|
||||
{
|
||||
float margin[2] = { std::numeric_limits<float>::max(), std::numeric_limits<float>::max() };
|
||||
|
||||
Mat trainSamples = _samples.getMat();
|
||||
int trainSamplesCount = trainSamples.rows;
|
||||
|
||||
Mat trainResponses = _responses.getMat();
|
||||
|
||||
CV_Assert(trainResponses.type() == CV_32FC1);
|
||||
for (int samplesIndex = 0; samplesIndex < trainSamplesCount; samplesIndex++)
|
||||
{
|
||||
Mat currentSample = trainSamples.row(samplesIndex);
|
||||
float dotProduct = static_cast<float>(currentSample.dot(weights_));
|
||||
|
||||
bool positive = isPositive(trainResponses.at<float>(samplesIndex));
|
||||
int index = positive ? 0 : 1;
|
||||
float signToMul = positive ? 1.f : -1.f;
|
||||
float curMargin = dotProduct * signToMul;
|
||||
|
||||
if (curMargin < margin[index])
|
||||
{
|
||||
margin[index] = curMargin;
|
||||
}
|
||||
}
|
||||
|
||||
return -(margin[0] - margin[1]) / 2.f;
|
||||
}
|
||||
|
||||
bool SVMSGDImpl::train(const Ptr<TrainData>& data, int)
|
||||
{
|
||||
CV_Assert(!data.empty());
|
||||
clear();
|
||||
CV_Assert( isClassifier() ); //toDo: consider
|
||||
|
||||
Mat trainSamples = data->getTrainSamples();
|
||||
|
||||
int featureCount = trainSamples.cols;
|
||||
Mat trainResponses = data->getTrainResponses(); // (trainSamplesCount x 1) matrix
|
||||
|
||||
CV_Assert(trainResponses.rows == trainSamples.rows);
|
||||
|
||||
if (trainResponses.empty())
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
int positiveCount = countNonZero(trainResponses >= 0);
|
||||
int negativeCount = countNonZero(trainResponses < 0);
|
||||
|
||||
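    // Degenerate case: only one class is present in the responses. Return a constant classifier
    // (zero weights, shift of +1 or -1) that always predicts that class.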
if ( positiveCount <= 0 || negativeCount <= 0 )
|
||||
{
|
||||
weights_ = Mat::zeros(1, featureCount, CV_32F);
|
||||
shift_ = (positiveCount > 0) ? 1.f : -1.f;
|
||||
return true;
|
||||
}
|
||||
|
||||
Mat extendedTrainSamples;
|
||||
Mat average;
|
||||
float multiplier = 0;
|
||||
makeExtendedTrainSamples(trainSamples, extendedTrainSamples, average, multiplier);
|
||||
|
||||
int extendedTrainSamplesCount = extendedTrainSamples.rows;
|
||||
int extendedFeatureCount = extendedTrainSamples.cols;
|
||||
|
||||
Mat extendedWeights = Mat::zeros(1, extendedFeatureCount, CV_32F);
|
||||
Mat previousWeights = Mat::zeros(1, extendedFeatureCount, CV_32F);
|
||||
Mat averageExtendedWeights;
|
||||
if (params.svmsgdType == ASGD)
|
||||
{
|
||||
averageExtendedWeights = Mat::zeros(1, extendedFeatureCount, CV_32F);
|
||||
}
|
||||
|
||||
RNG rng(0);
|
||||
|
||||
CV_Assert (params.termCrit.type & TermCriteria::COUNT || params.termCrit.type & TermCriteria::EPS);
|
||||
int maxCount = (params.termCrit.type & TermCriteria::COUNT) ? params.termCrit.maxCount : INT_MAX;
|
||||
double epsilon = (params.termCrit.type & TermCriteria::EPS) ? params.termCrit.epsilon : 0;
|
||||
|
||||
double err = DBL_MAX;
|
||||
CV_Assert (trainResponses.type() == CV_32FC1);
|
||||
// Stochastic gradient descent SVM
|
||||
for (int iter = 0; (iter < maxCount) && (err > epsilon); iter++)
|
||||
{
|
||||
int randomNumber = rng.uniform(0, extendedTrainSamplesCount); //generate sample number
|
||||
|
||||
Mat currentSample = extendedTrainSamples.row(randomNumber);
|
||||
|
||||
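        // Polynomially decaying learning rate:
        // stepSize = initialStepSize * (1 + marginRegularization * initialStepSize * iter)^(-stepDecreasingPower)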
float stepSize = params.initialStepSize * std::pow((1 + params.marginRegularization * params.initialStepSize * (float)iter), (-params.stepDecreasingPower)); //update stepSize
|
||||
|
||||
updateWeights( currentSample, isPositive(trainResponses.at<float>(randomNumber)), stepSize, extendedWeights );
|
||||
|
||||
//average weights (only for ASGD model)
|
||||
if (params.svmsgdType == ASGD)
|
||||
{
|
||||
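            // Running arithmetic mean of all weight iterates (Polyak-Ruppert averaging);
            // convergence is measured on the averaged weights rather than on the raw iterate.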
averageExtendedWeights = ((float)iter/ (1 + (float)iter)) * averageExtendedWeights + extendedWeights / (1 + (float) iter);
|
||||
err = norm(averageExtendedWeights - previousWeights);
|
||||
averageExtendedWeights.copyTo(previousWeights);
|
||||
}
|
||||
else
|
||||
{
|
||||
err = norm(extendedWeights - previousWeights);
|
||||
extendedWeights.copyTo(previousWeights);
|
||||
}
|
||||
}
|
||||
|
||||
if (params.svmsgdType == ASGD)
|
||||
{
|
||||
extendedWeights = averageExtendedWeights;
|
||||
}
|
||||
|
||||
Rect roi(0, 0, featureCount, 1);
|
||||
weights_ = extendedWeights(roi);
|
||||
weights_ *= multiplier;
|
||||
|
||||
CV_Assert((params.marginType == SOFT_MARGIN || params.marginType == HARD_MARGIN) && (extendedWeights.type() == CV_32FC1));
|
||||
|
||||
if (params.marginType == SOFT_MARGIN)
|
||||
{
|
||||
shift_ = extendedWeights.at<float>(featureCount) - static_cast<float>(weights_.dot(average));
|
||||
}
|
||||
else
|
||||
{
|
||||
shift_ = calcShift(trainSamples, trainResponses);
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
float SVMSGDImpl::predict( InputArray _samples, OutputArray _results, int ) const
|
||||
{
|
||||
float result = 0;
|
||||
cv::Mat samples = _samples.getMat();
|
||||
int nSamples = samples.rows;
|
||||
cv::Mat results;
|
||||
|
||||
CV_Assert( samples.cols == weights_.cols && samples.type() == CV_32FC1);
|
||||
|
||||
if( _results.needed() )
|
||||
{
|
||||
_results.create( nSamples, 1, samples.type() );
|
||||
results = _results.getMat();
|
||||
}
|
||||
else
|
||||
{
|
||||
CV_Assert( nSamples == 1 );
|
||||
results = Mat(1, 1, CV_32FC1, &result);
|
||||
}
|
||||
|
||||
for (int sampleIndex = 0; sampleIndex < nSamples; sampleIndex++)
|
||||
{
|
||||
Mat currentSample = samples.row(sampleIndex);
|
||||
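        // Decision function: sign(weights_ . sample + shift_)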
float criterion = static_cast<float>(currentSample.dot(weights_)) + shift_;
|
||||
results.at<float>(sampleIndex) = (criterion >= 0) ? 1.f : -1.f;
|
||||
}
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
bool SVMSGDImpl::isClassifier() const
|
||||
{
|
||||
return (params.svmsgdType == SGD || params.svmsgdType == ASGD)
|
||||
&&
|
||||
(params.marginType == SOFT_MARGIN || params.marginType == HARD_MARGIN)
|
||||
&&
|
||||
(params.marginRegularization > 0) && (params.initialStepSize > 0) && (params.stepDecreasingPower >= 0);
|
||||
}
|
||||
|
||||
bool SVMSGDImpl::isTrained() const
|
||||
{
|
||||
return !weights_.empty();
|
||||
}
|
||||
|
||||
void SVMSGDImpl::write(FileStorage& fs) const
|
||||
{
|
||||
if( !isTrained() )
|
||||
CV_Error( CV_StsParseError, "SVMSGD model data is invalid, it hasn't been trained" );
|
||||
|
||||
writeFormat(fs);
|
||||
writeParams( fs );
|
||||
|
||||
fs << "weights" << weights_;
|
||||
fs << "shift" << shift_;
|
||||
}
|
||||
|
||||
void SVMSGDImpl::writeParams( FileStorage& fs ) const
|
||||
{
|
||||
String SvmsgdTypeStr;
|
||||
|
||||
switch (params.svmsgdType)
|
||||
{
|
||||
case SGD:
|
||||
SvmsgdTypeStr = "SGD";
|
||||
break;
|
||||
case ASGD:
|
||||
SvmsgdTypeStr = "ASGD";
|
||||
break;
|
||||
default:
|
||||
SvmsgdTypeStr = format("Unknown_%d", params.svmsgdType);
|
||||
}
|
||||
|
||||
fs << "svmsgdType" << SvmsgdTypeStr;
|
||||
|
||||
String marginTypeStr;
|
||||
|
||||
switch (params.marginType)
|
||||
{
|
||||
case SOFT_MARGIN:
|
||||
marginTypeStr = "SOFT_MARGIN";
|
||||
break;
|
||||
case HARD_MARGIN:
|
||||
marginTypeStr = "HARD_MARGIN";
|
||||
break;
|
||||
default:
|
||||
marginTypeStr = format("Unknown_%d", params.marginType);
|
||||
}
|
||||
|
||||
fs << "marginType" << marginTypeStr;
|
||||
|
||||
fs << "marginRegularization" << params.marginRegularization;
|
||||
fs << "initialStepSize" << params.initialStepSize;
|
||||
fs << "stepDecreasingPower" << params.stepDecreasingPower;
|
||||
|
||||
fs << "term_criteria" << "{:";
|
||||
if( params.termCrit.type & TermCriteria::EPS )
|
||||
fs << "epsilon" << params.termCrit.epsilon;
|
||||
if( params.termCrit.type & TermCriteria::COUNT )
|
||||
fs << "iterations" << params.termCrit.maxCount;
|
||||
fs << "}";
|
||||
}
|
||||
void SVMSGDImpl::readParams( const FileNode& fn )
|
||||
{
|
||||
String svmsgdTypeStr = (String)fn["svmsgdType"];
|
||||
int svmsgdType =
|
||||
svmsgdTypeStr == "SGD" ? SGD :
|
||||
svmsgdTypeStr == "ASGD" ? ASGD : -1;
|
||||
|
||||
if( svmsgdType < 0 )
|
||||
CV_Error( CV_StsParseError, "Missing or invalid SVMSGD type" );
|
||||
|
||||
params.svmsgdType = svmsgdType;
|
||||
|
||||
String marginTypeStr = (String)fn["marginType"];
|
||||
int marginType =
|
||||
marginTypeStr == "SOFT_MARGIN" ? SOFT_MARGIN :
|
||||
marginTypeStr == "HARD_MARGIN" ? HARD_MARGIN : -1;
|
||||
|
||||
if( marginType < 0 )
|
||||
CV_Error( CV_StsParseError, "Missing or invalid margin type" );
|
||||
|
||||
params.marginType = marginType;
|
||||
|
||||
CV_Assert ( fn["marginRegularization"].isReal() );
|
||||
params.marginRegularization = (float)fn["marginRegularization"];
|
||||
|
||||
CV_Assert ( fn["initialStepSize"].isReal() );
|
||||
params.initialStepSize = (float)fn["initialStepSize"];
|
||||
|
||||
CV_Assert ( fn["stepDecreasingPower"].isReal() );
|
||||
params.stepDecreasingPower = (float)fn["stepDecreasingPower"];
|
||||
|
||||
FileNode tcnode = fn["term_criteria"];
|
||||
CV_Assert(!tcnode.empty());
|
||||
params.termCrit.epsilon = (double)tcnode["epsilon"];
|
||||
params.termCrit.maxCount = (int)tcnode["iterations"];
|
||||
params.termCrit.type = (params.termCrit.epsilon > 0 ? TermCriteria::EPS : 0) +
|
||||
(params.termCrit.maxCount > 0 ? TermCriteria::COUNT : 0);
|
||||
CV_Assert ((params.termCrit.type & TermCriteria::COUNT || params.termCrit.type & TermCriteria::EPS));
|
||||
}
|
||||
|
||||
void SVMSGDImpl::read(const FileNode& fn)
|
||||
{
|
||||
clear();
|
||||
|
||||
readParams(fn);
|
||||
|
||||
fn["weights"] >> weights_;
|
||||
fn["shift"] >> shift_;
|
||||
}
|
||||
|
||||
void SVMSGDImpl::clear()
|
||||
{
|
||||
weights_.release();
|
||||
shift_ = 0;
|
||||
}
|
||||
|
||||
|
||||
SVMSGDImpl::SVMSGDImpl()
|
||||
{
|
||||
clear();
|
||||
setOptimalParameters();
|
||||
}
|
||||
|
||||
void SVMSGDImpl::setOptimalParameters(int svmsgdType, int marginType)
|
||||
{
|
||||
switch (svmsgdType)
|
||||
{
|
||||
case SGD:
|
||||
params.svmsgdType = SGD;
|
||||
params.marginType = (marginType == SOFT_MARGIN) ? SOFT_MARGIN :
|
||||
(marginType == HARD_MARGIN) ? HARD_MARGIN : -1;
|
||||
params.marginRegularization = 0.0001f;
|
||||
params.initialStepSize = 0.05f;
|
||||
params.stepDecreasingPower = 1.f;
|
||||
params.termCrit = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 100000, 0.00001);
|
||||
break;
|
||||
|
||||
case ASGD:
|
||||
params.svmsgdType = ASGD;
|
||||
params.marginType = (marginType == SOFT_MARGIN) ? SOFT_MARGIN :
|
||||
(marginType == HARD_MARGIN) ? HARD_MARGIN : -1;
|
||||
params.marginRegularization = 0.00001f;
|
||||
params.initialStepSize = 0.05f;
|
||||
params.stepDecreasingPower = 0.75f;
|
||||
params.termCrit = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 100000, 0.00001);
|
||||
break;
|
||||
|
||||
default:
|
||||
CV_Error( CV_StsParseError, "SVMSGD model data is invalid" );
|
||||
}
|
||||
}
|
||||
} //ml
|
||||
} //cv
|
||||
113
3rdparty/opencv-4.5.4/modules/ml/src/testset.cpp
vendored
Normal file
@@ -0,0 +1,113 @@
|
||||
/*M///////////////////////////////////////////////////////////////////////////////////////
|
||||
//
|
||||
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
|
||||
//
|
||||
// By downloading, copying, installing or using the software you agree to this license.
|
||||
// If you do not agree to this license, do not download, install,
|
||||
// copy or use the software.
|
||||
//
|
||||
//
|
||||
// Intel License Agreement
|
||||
//
|
||||
// Copyright (C) 2000, Intel Corporation, all rights reserved.
|
||||
// Third party copyrights are property of their respective owners.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without modification,
|
||||
// are permitted provided that the following conditions are met:
|
||||
//
|
||||
// * Redistribution's of source code must retain the above copyright notice,
|
||||
// this list of conditions and the following disclaimer.
|
||||
//
|
||||
// * Redistribution's in binary form must reproduce the above copyright notice,
|
||||
// this list of conditions and the following disclaimer in the documentation
|
||||
// and/or other materials provided with the distribution.
|
||||
//
|
||||
// * The name of Intel Corporation may not be used to endorse or promote products
|
||||
// derived from this software without specific prior written permission.
|
||||
//
|
||||
// This software is provided by the copyright holders and contributors "as is" and
|
||||
// any express or implied warranties, including, but not limited to, the implied
|
||||
// warranties of merchantability and fitness for a particular purpose are disclaimed.
|
||||
// In no event shall the Intel Corporation or contributors be liable for any direct,
|
||||
// indirect, incidental, special, exemplary, or consequential damages
|
||||
// (including, but not limited to, procurement of substitute goods or services;
|
||||
// loss of use, data, or profits; or business interruption) however caused
|
||||
// and on any theory of liability, whether in contract, strict liability,
|
||||
// or tort (including negligence or otherwise) arising in any way out of
|
||||
// the use of this software, even if advised of the possibility of such damage.
|
||||
//
|
||||
//M*/
|
||||
|
||||
#include "precomp.hpp"
|
||||
|
||||
namespace cv { namespace ml {
|
||||
|
||||
struct PairDI
|
||||
{
|
||||
double d;
|
||||
int i;
|
||||
};
|
||||
|
||||
struct CmpPairDI
|
||||
{
|
||||
bool operator ()(const PairDI& e1, const PairDI& e2) const
|
||||
{
|
||||
return (e1.d < e2.d) || (e1.d == e2.d && e1.i < e2.i);
|
||||
}
|
||||
};
|
||||
|
||||
void createConcentricSpheresTestSet( int num_samples, int num_features, int num_classes,
|
||||
OutputArray _samples, OutputArray _responses)
|
||||
{
|
||||
if( num_samples < 1 )
|
||||
CV_Error( CV_StsBadArg, "num_samples parameter must be positive" );
|
||||
|
||||
if( num_features < 1 )
|
||||
CV_Error( CV_StsBadArg, "num_features parameter must be positive" );
|
||||
|
||||
if( num_classes < 1 )
|
||||
CV_Error( CV_StsBadArg, "num_classes parameter must be positive" );
|
||||
|
||||
int i, cur_class;
|
||||
|
||||
_samples.create( num_samples, num_features, CV_32F );
|
||||
_responses.create( 1, num_samples, CV_32S );
|
||||
|
||||
Mat responses = _responses.getMat();
|
||||
|
||||
Mat mean = Mat::zeros(1, num_features, CV_32F);
|
||||
Mat cov = Mat::eye(num_features, num_features, CV_32F);
|
||||
|
||||
// fill the feature values matrix with random numbers drawn from standard normal distribution
|
||||
randMVNormal( mean, cov, num_samples, _samples );
|
||||
Mat samples = _samples.getMat();
|
||||
|
||||
// calculate distances from the origin to the samples and put them
|
||||
// into the sequence along with indices
|
||||
std::vector<PairDI> dis(samples.rows);
|
||||
|
||||
for( i = 0; i < samples.rows; i++ )
|
||||
{
|
||||
PairDI& elem = dis[i];
|
||||
elem.i = i;
|
||||
elem.d = norm(samples.row(i), NORM_L2);
|
||||
}
|
||||
|
||||
std::sort(dis.begin(), dis.end(), CmpPairDI());
|
||||
|
||||
// assign class labels
|
||||
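    // split the samples, ordered by distance from the origin, into num_classes groups of
    // (almost) equal size; each group forms one concentric shell and gets its own label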
num_classes = std::min( num_samples, num_classes );
|
||||
for( i = 0, cur_class = 0; i < num_samples; ++cur_class )
|
||||
{
|
||||
int last_idx = num_samples * (cur_class + 1) / num_classes - 1;
|
||||
double max_dst = dis[last_idx].d;
|
||||
max_dst = std::max( max_dst, dis[i].d );
|
||||
|
||||
for( ; i < num_samples && dis[i].d <= max_dst; ++i )
|
||||
responses.at<int>(dis[i].i) = cur_class;
|
||||
}
|
||||
}
|
||||
|
||||
}}
|
||||
|
||||
/* End of file. */
|
||||
1990
3rdparty/opencv-4.5.4/modules/ml/src/tree.cpp
vendored
Normal file
File diff suppressed because it is too large
200
3rdparty/opencv-4.5.4/modules/ml/test/test_ann.cpp
vendored
Normal file
@@ -0,0 +1,200 @@
|
||||
// This file is part of OpenCV project.
|
||||
// It is subject to the license terms in the LICENSE file found in the top-level directory
|
||||
// of this distribution and at http://opencv.org/license.html.
|
||||
|
||||
#include "test_precomp.hpp"
|
||||
|
||||
// #define GENERATE_TESTDATA
|
||||
|
||||
namespace opencv_test { namespace {
|
||||
|
||||
struct Activation
|
||||
{
|
||||
int id;
|
||||
const char * name;
|
||||
};
|
||||
void PrintTo(const Activation &a, std::ostream *os) { *os << a.name; }
|
||||
|
||||
Activation activation_list[] =
|
||||
{
|
||||
{ ml::ANN_MLP::IDENTITY, "identity" },
|
||||
{ ml::ANN_MLP::SIGMOID_SYM, "sigmoid_sym" },
|
||||
{ ml::ANN_MLP::GAUSSIAN, "gaussian" },
|
||||
{ ml::ANN_MLP::RELU, "relu" },
|
||||
{ ml::ANN_MLP::LEAKYRELU, "leakyrelu" },
|
||||
};
|
||||
|
||||
typedef testing::TestWithParam< Activation > ML_ANN_Params;
|
||||
|
||||
TEST_P(ML_ANN_Params, ActivationFunction)
|
||||
{
|
||||
const Activation &activation = GetParam();
|
||||
const string dataname = "waveform";
|
||||
const string data_path = findDataFile(dataname + ".data");
|
||||
const string model_name = dataname + "_" + activation.name + ".yml";
|
||||
|
||||
Ptr<TrainData> tdata = TrainData::loadFromCSV(data_path, 0);
|
||||
ASSERT_FALSE(tdata.empty());
|
||||
|
||||
// hack?
|
||||
const uint64 old_state = theRNG().state;
|
||||
theRNG().state = 1027401484159173092;
|
||||
tdata->setTrainTestSplit(500);
|
||||
theRNG().state = old_state;
|
||||
|
||||
Mat_<int> layerSizes(1, 4);
|
||||
layerSizes(0, 0) = tdata->getNVars();
|
||||
layerSizes(0, 1) = 100;
|
||||
layerSizes(0, 2) = 100;
|
||||
layerSizes(0, 3) = tdata->getResponses().cols;
|
||||
|
||||
Mat testSamples = tdata->getTestSamples();
|
||||
Mat rx, ry;
|
||||
|
||||
{
|
||||
Ptr<ml::ANN_MLP> x = ml::ANN_MLP::create();
|
||||
x->setActivationFunction(activation.id);
|
||||
x->setLayerSizes(layerSizes);
|
||||
x->setTrainMethod(ml::ANN_MLP::RPROP, 0.01, 0.1);
|
||||
x->setTermCriteria(TermCriteria(TermCriteria::COUNT, 300, 0.01));
|
||||
x->train(tdata, ml::ANN_MLP::NO_OUTPUT_SCALE);
|
||||
ASSERT_TRUE(x->isTrained());
|
||||
x->predict(testSamples, rx);
|
||||
#ifdef GENERATE_TESTDATA
|
||||
x->save(cvtest::TS::ptr()->get_data_path() + model_name);
|
||||
#endif
|
||||
}
|
||||
|
||||
{
|
||||
const string model_path = findDataFile(model_name);
|
||||
Ptr<ml::ANN_MLP> y = Algorithm::load<ANN_MLP>(model_path);
|
||||
ASSERT_TRUE(y);
|
||||
y->predict(testSamples, ry);
|
||||
EXPECT_MAT_NEAR(rx, ry, FLT_EPSILON);
|
||||
}
|
||||
}
|
||||
|
||||
INSTANTIATE_TEST_CASE_P(/**/, ML_ANN_Params, testing::ValuesIn(activation_list));
|
||||
|
||||
//==================================================================================================
|
||||
|
||||
CV_ENUM(ANN_MLP_METHOD, ANN_MLP::RPROP, ANN_MLP::ANNEAL)
|
||||
|
||||
typedef tuple<ANN_MLP_METHOD, string, int> ML_ANN_METHOD_Params;
|
||||
typedef TestWithParam<ML_ANN_METHOD_Params> ML_ANN_METHOD;
|
||||
|
||||
TEST_P(ML_ANN_METHOD, Test)
|
||||
{
|
||||
int methodType = get<0>(GetParam());
|
||||
string methodName = get<1>(GetParam());
|
||||
int N = get<2>(GetParam());
|
||||
|
||||
String folder = string(cvtest::TS::ptr()->get_data_path());
|
||||
String original_path = findDataFile("waveform.data");
|
||||
string dataname = "waveform_" + methodName;
|
||||
string weight_name = dataname + "_init_weight.yml.gz";
|
||||
string model_name = dataname + ".yml.gz";
|
||||
string response_name = dataname + "_response.yml.gz";
|
||||
|
||||
Ptr<TrainData> tdata2 = TrainData::loadFromCSV(original_path, 0);
|
||||
ASSERT_FALSE(tdata2.empty());
|
||||
|
||||
Mat samples = tdata2->getSamples()(Range(0, N), Range::all());
|
||||
Mat responses(N, 3, CV_32FC1, Scalar(0));
|
||||
for (int i = 0; i < N; i++)
|
||||
responses.at<float>(i, static_cast<int>(tdata2->getResponses().at<float>(i, 0))) = 1;
|
||||
|
||||
Ptr<TrainData> tdata = TrainData::create(samples, ml::ROW_SAMPLE, responses);
|
||||
ASSERT_FALSE(tdata.empty());
|
||||
|
||||
// hack?
|
||||
const uint64 old_state = theRNG().state;
|
||||
theRNG().state = 0;
|
||||
tdata->setTrainTestSplitRatio(0.8);
|
||||
theRNG().state = old_state;
|
||||
|
||||
Mat testSamples = tdata->getTestSamples();
|
||||
|
||||
// train 1st stage
|
||||
|
||||
Ptr<ml::ANN_MLP> xx = ml::ANN_MLP::create();
|
||||
Mat_<int> layerSizes(1, 4);
|
||||
layerSizes(0, 0) = tdata->getNVars();
|
||||
layerSizes(0, 1) = 30;
|
||||
layerSizes(0, 2) = 30;
|
||||
layerSizes(0, 3) = tdata->getResponses().cols;
|
||||
xx->setLayerSizes(layerSizes);
|
||||
xx->setActivationFunction(ml::ANN_MLP::SIGMOID_SYM);
|
||||
xx->setTrainMethod(ml::ANN_MLP::RPROP);
|
||||
xx->setTermCriteria(TermCriteria(TermCriteria::COUNT, 1, 0.01));
|
||||
xx->train(tdata, ml::ANN_MLP::NO_OUTPUT_SCALE + ml::ANN_MLP::NO_INPUT_SCALE);
|
||||
#ifdef GENERATE_TESTDATA
|
||||
{
|
||||
FileStorage fs;
|
||||
fs.open(cvtest::TS::ptr()->get_data_path() + weight_name, FileStorage::WRITE + FileStorage::BASE64);
|
||||
xx->write(fs);
|
||||
}
|
||||
#endif
|
||||
|
||||
// train 2nd stage
|
||||
Mat r_gold;
|
||||
Ptr<ml::ANN_MLP> x = ml::ANN_MLP::create();
|
||||
{
|
||||
const string weight_file = findDataFile(weight_name);
|
||||
FileStorage fs;
|
||||
fs.open(weight_file, FileStorage::READ);
|
||||
x->read(fs.root());
|
||||
}
|
||||
x->setTrainMethod(methodType);
|
||||
if (methodType == ml::ANN_MLP::ANNEAL)
|
||||
{
|
||||
x->setAnnealEnergyRNG(RNG(CV_BIG_INT(0xffffffff)));
|
||||
x->setAnnealInitialT(12);
|
||||
x->setAnnealFinalT(0.15);
|
||||
x->setAnnealCoolingRatio(0.96);
|
||||
x->setAnnealItePerStep(11);
|
||||
}
|
||||
x->setTermCriteria(TermCriteria(TermCriteria::COUNT, 100, 0.01));
|
||||
x->train(tdata, ml::ANN_MLP::NO_OUTPUT_SCALE + ml::ANN_MLP::NO_INPUT_SCALE + ml::ANN_MLP::UPDATE_WEIGHTS);
|
||||
ASSERT_TRUE(x->isTrained());
|
||||
#ifdef GENERATE_TESTDATA
|
||||
x->save(cvtest::TS::ptr()->get_data_path() + model_name);
|
||||
x->predict(testSamples, r_gold);
|
||||
{
|
||||
FileStorage fs_response(cvtest::TS::ptr()->get_data_path() + response_name, FileStorage::WRITE + FileStorage::BASE64);
|
||||
fs_response << "response" << r_gold;
|
||||
}
|
||||
#endif
|
||||
{
|
||||
const string response_file = findDataFile(response_name);
|
||||
FileStorage fs_response(response_file, FileStorage::READ);
|
||||
fs_response["response"] >> r_gold;
|
||||
}
|
||||
ASSERT_FALSE(r_gold.empty());
|
||||
|
||||
// verify
|
||||
const string model_file = findDataFile(model_name);
|
||||
Ptr<ml::ANN_MLP> y = Algorithm::load<ANN_MLP>(model_file);
|
||||
ASSERT_TRUE(y);
|
||||
Mat rx, ry;
|
||||
for (int j = 0; j < 4; j++)
|
||||
{
|
||||
rx = x->getWeights(j);
|
||||
ry = y->getWeights(j);
|
||||
EXPECT_MAT_NEAR(rx, ry, FLT_EPSILON) << "Weights are not equal for layer: " << j;
|
||||
}
|
||||
x->predict(testSamples, rx);
|
||||
y->predict(testSamples, ry);
|
||||
EXPECT_MAT_NEAR(ry, rx, FLT_EPSILON) << "Predict are not equal to result of the saved model";
|
||||
EXPECT_MAT_NEAR(r_gold, rx, FLT_EPSILON) << "Predict are not equal to 'gold' response";
|
||||
}
|
||||
|
||||
INSTANTIATE_TEST_CASE_P(/*none*/, ML_ANN_METHOD,
|
||||
testing::Values(
|
||||
ML_ANN_METHOD_Params(ml::ANN_MLP::RPROP, "rprop", 5000),
|
||||
ML_ANN_METHOD_Params(ml::ANN_MLP::ANNEAL, "anneal", 1000)
|
||||
// ML_ANN_METHOD_Params(ml::ANN_MLP::BACKPROP, "backprop", 500) -----> NO BACKPROP TEST
|
||||
)
|
||||
);
|
||||
|
||||
}} // namespace
|
||||
56
3rdparty/opencv-4.5.4/modules/ml/test/test_bayes.cpp
vendored
Normal file
@@ -0,0 +1,56 @@
|
||||
// This file is part of OpenCV project.
|
||||
// It is subject to the license terms in the LICENSE file found in the top-level directory
|
||||
// of this distribution and at http://opencv.org/license.html.
|
||||
|
||||
#include "test_precomp.hpp"
|
||||
|
||||
namespace opencv_test { namespace {
|
||||
|
||||
TEST(ML_NBAYES, regression_5911)
|
||||
{
|
||||
int N=12;
|
||||
Ptr<ml::NormalBayesClassifier> nb = cv::ml::NormalBayesClassifier::create();
|
||||
|
||||
// data:
|
||||
float X_data[] = {
|
||||
1,2,3,4, 1,2,3,4, 1,2,3,4, 1,2,3,4,
|
||||
5,5,5,5, 5,5,5,5, 5,5,5,5, 5,5,5,5,
|
||||
4,3,2,1, 4,3,2,1, 4,3,2,1, 4,3,2,1
|
||||
};
|
||||
Mat_<float> X(N, 4, X_data);
|
||||
|
||||
// labels:
|
||||
int Y_data[] = { 0,0,0,0, 1,1,1,1, 2,2,2,2 };
|
||||
Mat_<int> Y(N, 1, Y_data);
|
||||
|
||||
nb->train(X, ml::ROW_SAMPLE, Y);
|
||||
|
||||
// single prediction:
|
||||
Mat R1,P1;
|
||||
for (int i=0; i<N; i++)
|
||||
{
|
||||
Mat r,p;
|
||||
nb->predictProb(X.row(i), r, p);
|
||||
R1.push_back(r);
|
||||
P1.push_back(p);
|
||||
}
|
||||
|
||||
// bulk prediction (continuous memory):
|
||||
Mat R2,P2;
|
||||
nb->predictProb(X, R2, P2);
|
||||
|
||||
EXPECT_EQ(255 * R2.total(), sum(R1 == R2)[0]);
|
||||
EXPECT_EQ(255 * P2.total(), sum(P1 == P2)[0]);
|
||||
|
||||
// bulk prediction, with non-continuous memory storage
|
||||
Mat R3_(N, 1+1, CV_32S),
|
||||
P3_(N, 3+1, CV_32F);
|
||||
nb->predictProb(X, R3_.col(0), P3_.colRange(0,3));
|
||||
Mat R3 = R3_.col(0).clone(),
|
||||
P3 = P3_.colRange(0,3).clone();
|
||||
|
||||
EXPECT_EQ(255 * R3.total(), sum(R1 == R3)[0]);
|
||||
EXPECT_EQ(255 * P3.total(), sum(P1 == P3)[0]);
|
||||
}
|
||||
|
||||
}} // namespace
|
||||
186
3rdparty/opencv-4.5.4/modules/ml/test/test_em.cpp
vendored
Normal file
@@ -0,0 +1,186 @@
|
||||
// This file is part of OpenCV project.
|
||||
// It is subject to the license terms in the LICENSE file found in the top-level directory
|
||||
// of this distribution and at http://opencv.org/license.html.
|
||||
|
||||
#include "test_precomp.hpp"
|
||||
|
||||
namespace opencv_test { namespace {
|
||||
|
||||
CV_ENUM(EM_START_STEP, EM::START_AUTO_STEP, EM::START_M_STEP, EM::START_E_STEP)
|
||||
CV_ENUM(EM_COV_MAT, EM::COV_MAT_GENERIC, EM::COV_MAT_DIAGONAL, EM::COV_MAT_SPHERICAL)
|
||||
|
||||
typedef testing::TestWithParam< tuple<EM_START_STEP, EM_COV_MAT> > ML_EM_Params;
|
||||
|
||||
TEST_P(ML_EM_Params, accuracy)
|
||||
{
|
||||
const int nclusters = 3;
|
||||
const int sizesArr[] = { 500, 700, 800 };
|
||||
const vector<int> sizes( sizesArr, sizesArr + sizeof(sizesArr) / sizeof(sizesArr[0]) );
|
||||
const int pointsCount = sizesArr[0] + sizesArr[1] + sizesArr[2];
|
||||
Mat means;
|
||||
vector<Mat> covs;
|
||||
defaultDistribs( means, covs, CV_64FC1 );
|
||||
Mat trainData(pointsCount, 2, CV_64FC1 );
|
||||
Mat trainLabels;
|
||||
generateData( trainData, trainLabels, sizes, means, covs, CV_64FC1, CV_32SC1 );
|
||||
Mat testData( pointsCount, 2, CV_64FC1 );
|
||||
Mat testLabels;
|
||||
generateData( testData, testLabels, sizes, means, covs, CV_64FC1, CV_32SC1 );
|
||||
Mat probs(trainData.rows, nclusters, CV_64FC1, cv::Scalar(1));
|
||||
Mat weights(1, nclusters, CV_64FC1, cv::Scalar(1));
|
||||
TermCriteria termCrit(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 100, FLT_EPSILON);
|
||||
int startStep = get<0>(GetParam());
|
||||
int covMatType = get<1>(GetParam());
|
||||
cv::Mat labels;
|
||||
|
||||
Ptr<EM> em = EM::create();
|
||||
em->setClustersNumber(nclusters);
|
||||
em->setCovarianceMatrixType(covMatType);
|
||||
em->setTermCriteria(termCrit);
|
||||
if( startStep == EM::START_AUTO_STEP )
|
||||
em->trainEM( trainData, noArray(), labels, noArray() );
|
||||
else if( startStep == EM::START_E_STEP )
|
||||
em->trainE( trainData, means, covs, weights, noArray(), labels, noArray() );
|
||||
else if( startStep == EM::START_M_STEP )
|
||||
em->trainM( trainData, probs, noArray(), labels, noArray() );
|
||||
|
||||
{
|
||||
SCOPED_TRACE("Train");
|
||||
float err = 1000;
|
||||
EXPECT_TRUE(calcErr( labels, trainLabels, sizes, err , false, false ));
|
||||
EXPECT_LE(err, 0.008f);
|
||||
}
|
||||
|
||||
{
|
||||
SCOPED_TRACE("Test");
|
||||
float err = 1000;
|
||||
labels.create( testData.rows, 1, CV_32SC1 );
|
||||
for( int i = 0; i < testData.rows; i++ )
|
||||
{
|
||||
Mat sample = testData.row(i);
|
||||
Mat out_probs;
|
||||
labels.at<int>(i) = static_cast<int>(em->predict2( sample, out_probs )[1]);
|
||||
}
|
||||
EXPECT_TRUE(calcErr( labels, testLabels, sizes, err, false, false ));
|
||||
EXPECT_LE(err, 0.008f);
|
||||
}
|
||||
}
|
||||
|
||||
INSTANTIATE_TEST_CASE_P(/**/, ML_EM_Params,
|
||||
testing::Combine(
|
||||
testing::Values(EM::START_AUTO_STEP, EM::START_M_STEP, EM::START_E_STEP),
|
||||
testing::Values(EM::COV_MAT_GENERIC, EM::COV_MAT_DIAGONAL, EM::COV_MAT_SPHERICAL)
|
||||
));
|
||||
|
||||
//==================================================================================================
|
||||
|
||||
TEST(ML_EM, save_load)
|
||||
{
|
||||
const int nclusters = 2;
|
||||
Mat_<double> samples(3, 1);
|
||||
samples << 1., 2., 3.;
|
||||
|
||||
std::vector<double> firstResult;
|
||||
string filename = cv::tempfile(".xml");
|
||||
{
|
||||
Mat labels;
|
||||
Ptr<EM> em = EM::create();
|
||||
em->setClustersNumber(nclusters);
|
||||
em->trainEM(samples, noArray(), labels, noArray());
|
||||
for( int i = 0; i < samples.rows; i++)
|
||||
{
|
||||
Vec2d res = em->predict2(samples.row(i), noArray());
|
||||
firstResult.push_back(res[1]);
|
||||
}
|
||||
{
|
||||
FileStorage fs = FileStorage(filename, FileStorage::WRITE);
|
||||
ASSERT_NO_THROW(fs << "em" << "{");
|
||||
ASSERT_NO_THROW(em->write(fs));
|
||||
ASSERT_NO_THROW(fs << "}");
|
||||
}
|
||||
}
|
||||
{
|
||||
Ptr<EM> em;
|
||||
ASSERT_NO_THROW(em = Algorithm::load<EM>(filename));
|
||||
for( int i = 0; i < samples.rows; i++)
|
||||
{
|
||||
SCOPED_TRACE(i);
|
||||
Vec2d res = em->predict2(samples.row(i), noArray());
|
||||
EXPECT_DOUBLE_EQ(firstResult[i], res[1]);
|
||||
}
|
||||
}
|
||||
remove(filename.c_str());
|
||||
}
|
||||
|
||||
//==================================================================================================
|
||||
|
||||
TEST(ML_EM, classification)
|
||||
{
|
||||
// This test classifies spam by the following way:
|
||||
// 1. estimates distributions of "spam" / "not spam"
|
||||
// 2. predict classID using Bayes classifier for estimated distributions.
|
||||
string dataFilename = findDataFile("spambase.data");
|
||||
Ptr<TrainData> data = TrainData::loadFromCSV(dataFilename, 0);
|
||||
ASSERT_FALSE(data.empty());
|
||||
|
||||
Mat samples = data->getSamples();
|
||||
ASSERT_EQ(samples.cols, 57);
|
||||
Mat responses = data->getResponses();
|
||||
|
||||
vector<int> trainSamplesMask(samples.rows, 0);
|
||||
const int trainSamplesCount = (int)(0.5f * samples.rows);
|
||||
const int testSamplesCount = samples.rows - trainSamplesCount;
|
||||
for(int i = 0; i < trainSamplesCount; i++)
|
||||
trainSamplesMask[i] = 1;
|
||||
RNG &rng = cv::theRNG();
|
||||
for(size_t i = 0; i < trainSamplesMask.size(); i++)
|
||||
{
|
||||
int i1 = rng(static_cast<unsigned>(trainSamplesMask.size()));
|
||||
int i2 = rng(static_cast<unsigned>(trainSamplesMask.size()));
|
||||
std::swap(trainSamplesMask[i1], trainSamplesMask[i2]);
|
||||
}
|
||||
|
||||
Mat samples0, samples1;
|
||||
for(int i = 0; i < samples.rows; i++)
|
||||
{
|
||||
if(trainSamplesMask[i])
|
||||
{
|
||||
Mat sample = samples.row(i);
|
||||
int resp = (int)responses.at<float>(i);
|
||||
if(resp == 0)
|
||||
samples0.push_back(sample);
|
||||
else
|
||||
samples1.push_back(sample);
|
||||
}
|
||||
}
|
||||
|
||||
Ptr<EM> model0 = EM::create();
|
||||
model0->setClustersNumber(3);
|
||||
model0->trainEM(samples0, noArray(), noArray(), noArray());
|
||||
|
||||
Ptr<EM> model1 = EM::create();
|
||||
model1->setClustersNumber(3);
|
||||
model1->trainEM(samples1, noArray(), noArray(), noArray());
|
||||
|
||||
// confusion matrices
|
||||
Mat_<int> trainCM(2, 2, 0);
|
||||
Mat_<int> testCM(2, 2, 0);
|
||||
const double lambda = 1.;
|
||||
for(int i = 0; i < samples.rows; i++)
|
||||
{
|
||||
Mat sample = samples.row(i);
|
||||
double sampleLogLikelihoods0 = model0->predict2(sample, noArray())[0];
|
||||
double sampleLogLikelihoods1 = model1->predict2(sample, noArray())[0];
|
||||
int classID = (sampleLogLikelihoods0 >= lambda * sampleLogLikelihoods1) ? 0 : 1;
|
||||
int resp = (int)responses.at<float>(i);
|
||||
EXPECT_TRUE(resp == 0 || resp == 1);
|
||||
if(trainSamplesMask[i])
|
||||
trainCM(resp, classID)++;
|
||||
else
|
||||
testCM(resp, classID)++;
|
||||
}
|
||||
EXPECT_LE((double)(trainCM(1,0) + trainCM(0,1)) / trainSamplesCount, 0.23);
|
||||
EXPECT_LE((double)(testCM(1,0) + testCM(0,1)) / testSamplesCount, 0.26);
|
||||
}
|
||||
|
||||
}} // namespace
|
||||
53
3rdparty/opencv-4.5.4/modules/ml/test/test_kmeans.cpp
vendored
Normal file
@@ -0,0 +1,53 @@
|
||||
// This file is part of OpenCV project.
|
||||
// It is subject to the license terms in the LICENSE file found in the top-level directory
|
||||
// of this distribution and at http://opencv.org/license.html.
|
||||
|
||||
#include "test_precomp.hpp"
|
||||
|
||||
namespace opencv_test { namespace {
|
||||
|
||||
TEST(ML_KMeans, accuracy)
|
||||
{
|
||||
const int iters = 100;
|
||||
int sizesArr[] = { 5000, 7000, 8000 };
|
||||
int pointsCount = sizesArr[0]+ sizesArr[1] + sizesArr[2];
|
||||
|
||||
Mat data( pointsCount, 2, CV_32FC1 ), labels;
|
||||
vector<int> sizes( sizesArr, sizesArr + sizeof(sizesArr) / sizeof(sizesArr[0]) );
|
||||
Mat means;
|
||||
vector<Mat> covs;
|
||||
defaultDistribs( means, covs );
|
||||
generateData( data, labels, sizes, means, covs, CV_32FC1, CV_32SC1 );
|
||||
TermCriteria termCriteria( TermCriteria::COUNT, iters, 0.0);
|
||||
|
||||
{
|
||||
SCOPED_TRACE("KMEANS_PP_CENTERS");
|
||||
float err = 1000;
|
||||
Mat bestLabels;
|
||||
kmeans( data, 3, bestLabels, termCriteria, 0, KMEANS_PP_CENTERS, noArray() );
|
||||
EXPECT_TRUE(calcErr( bestLabels, labels, sizes, err , false ));
|
||||
EXPECT_LE(err, 0.01f);
|
||||
}
|
||||
{
|
||||
SCOPED_TRACE("KMEANS_RANDOM_CENTERS");
|
||||
float err = 1000;
|
||||
Mat bestLabels;
|
||||
kmeans( data, 3, bestLabels, termCriteria, 0, KMEANS_RANDOM_CENTERS, noArray() );
|
||||
EXPECT_TRUE(calcErr( bestLabels, labels, sizes, err, false ));
|
||||
EXPECT_LE(err, 0.01f);
|
||||
}
|
||||
{
|
||||
SCOPED_TRACE("KMEANS_USE_INITIAL_LABELS");
|
||||
float err = 1000;
|
||||
Mat bestLabels;
|
||||
labels.copyTo( bestLabels );
|
||||
RNG &rng = cv::theRNG();
|
||||
for( int i = 0; i < 0.5f * pointsCount; i++ )
|
||||
bestLabels.at<int>( rng.next() % pointsCount, 0 ) = rng.next() % 3;
|
||||
kmeans( data, 3, bestLabels, termCriteria, 0, KMEANS_USE_INITIAL_LABELS, noArray() );
|
||||
EXPECT_TRUE(calcErr( bestLabels, labels, sizes, err, false ));
|
||||
EXPECT_LE(err, 0.01f);
|
||||
}
|
||||
}
|
||||
|
||||
}} // namespace
|
||||
112
3rdparty/opencv-4.5.4/modules/ml/test/test_knearest.cpp
vendored
Normal file
@@ -0,0 +1,112 @@
|
||||
// This file is part of OpenCV project.
|
||||
// It is subject to the license terms in the LICENSE file found in the top-level directory
|
||||
// of this distribution and at http://opencv.org/license.html.
|
||||
|
||||
#include "test_precomp.hpp"
|
||||
|
||||
namespace opencv_test { namespace {
|
||||
|
||||
using cv::ml::TrainData;
|
||||
using cv::ml::EM;
|
||||
using cv::ml::KNearest;
|
||||
|
||||
TEST(ML_KNearest, accuracy)
|
||||
{
|
||||
int sizesArr[] = { 500, 700, 800 };
|
||||
int pointsCount = sizesArr[0]+ sizesArr[1] + sizesArr[2];
|
||||
|
||||
Mat trainData( pointsCount, 2, CV_32FC1 ), trainLabels;
|
||||
vector<int> sizes( sizesArr, sizesArr + sizeof(sizesArr) / sizeof(sizesArr[0]) );
|
||||
Mat means;
|
||||
vector<Mat> covs;
|
||||
defaultDistribs( means, covs );
|
||||
generateData( trainData, trainLabels, sizes, means, covs, CV_32FC1, CV_32FC1 );
|
||||
|
||||
Mat testData( pointsCount, 2, CV_32FC1 );
|
||||
Mat testLabels;
|
||||
generateData( testData, testLabels, sizes, means, covs, CV_32FC1, CV_32FC1 );
|
||||
|
||||
{
|
||||
SCOPED_TRACE("Default");
|
||||
Mat bestLabels;
|
||||
float err = 1000;
|
||||
Ptr<KNearest> knn = KNearest::create();
|
||||
knn->train(trainData, ml::ROW_SAMPLE, trainLabels);
|
||||
knn->findNearest(testData, 4, bestLabels);
|
||||
EXPECT_TRUE(calcErr( bestLabels, testLabels, sizes, err, true ));
|
||||
EXPECT_LE(err, 0.01f);
|
||||
}
|
||||
{
|
||||
SCOPED_TRACE("KDTree");
|
||||
Mat neighborIndexes;
|
||||
float err = 1000;
|
||||
Ptr<KNearest> knn = KNearest::create();
|
||||
knn->setAlgorithmType(KNearest::KDTREE);
|
||||
knn->train(trainData, ml::ROW_SAMPLE, trainLabels);
|
||||
knn->findNearest(testData, 4, neighborIndexes);
|
||||
Mat bestLabels;
|
||||
// The output of the KDTree are the neighbor indexes, not actual class labels
|
||||
// so we need to do some extra work to get actual predictions
|
||||
for(int row_num = 0; row_num < neighborIndexes.rows; ++row_num){
|
||||
vector<float> labels;
|
||||
for(int index = 0; index < neighborIndexes.row(row_num).cols; ++index) {
|
||||
labels.push_back(trainLabels.at<float>(neighborIndexes.row(row_num).at<int>(0, index) , 0));
|
||||
}
|
||||
// computing the mode of the output class predictions to determine overall prediction
|
||||
std::vector<int> histogram(3,0);
|
||||
for( int i=0; i<3; ++i )
|
||||
++histogram[ static_cast<int>(labels[i]) ];
|
||||
int bestLabel = static_cast<int>(std::max_element( histogram.begin(), histogram.end() ) - histogram.begin());
|
||||
bestLabels.push_back(bestLabel);
|
||||
}
|
||||
bestLabels.convertTo(bestLabels, testLabels.type());
|
||||
EXPECT_TRUE(calcErr( bestLabels, testLabels, sizes, err, true ));
|
||||
EXPECT_LE(err, 0.01f);
|
||||
}
|
||||
}
|
||||
|
||||
TEST(ML_KNearest, regression_12347)
|
||||
{
|
||||
Mat xTrainData = (Mat_<float>(5,2) << 1, 1.1, 1.1, 1, 2, 2, 2.1, 2, 2.1, 2.1);
|
||||
Mat yTrainLabels = (Mat_<float>(5,1) << 1, 1, 2, 2, 2);
|
||||
Ptr<KNearest> knn = KNearest::create();
|
||||
knn->train(xTrainData, ml::ROW_SAMPLE, yTrainLabels);
|
||||
|
||||
Mat xTestData = (Mat_<float>(2,2) << 1.1, 1.1, 2, 2.2);
|
||||
Mat zBestLabels, neighbours, dist;
|
||||
// check output shapes:
|
||||
int K = 16, Kexp = std::min(K, xTrainData.rows);
|
||||
knn->findNearest(xTestData, K, zBestLabels, neighbours, dist);
|
||||
EXPECT_EQ(xTestData.rows, zBestLabels.rows);
|
||||
EXPECT_EQ(neighbours.cols, Kexp);
|
||||
EXPECT_EQ(dist.cols, Kexp);
|
||||
// see if the result is still correct:
|
||||
K = 2;
|
||||
knn->findNearest(xTestData, K, zBestLabels, neighbours, dist);
|
||||
EXPECT_EQ(1, zBestLabels.at<float>(0,0));
|
||||
EXPECT_EQ(2, zBestLabels.at<float>(1,0));
|
||||
}
|
||||
|
||||
TEST(ML_KNearest, bug_11877)
|
||||
{
|
||||
Mat trainData = (Mat_<float>(5,2) << 3, 3, 3, 3, 4, 4, 4, 4, 4, 4);
|
||||
Mat trainLabels = (Mat_<float>(5,1) << 0, 0, 1, 1, 1);
|
||||
|
||||
Ptr<KNearest> knnKdt = KNearest::create();
|
||||
knnKdt->setAlgorithmType(KNearest::KDTREE);
|
||||
knnKdt->setIsClassifier(true);
|
||||
|
||||
knnKdt->train(trainData, ml::ROW_SAMPLE, trainLabels);
|
||||
|
||||
Mat testData = (Mat_<float>(2,2) << 3.1, 3.1, 4, 4.1);
|
||||
Mat testLabels = (Mat_<int>(2,1) << 0, 1);
|
||||
Mat result;
|
||||
|
||||
knnKdt->findNearest(testData, 1, result);
|
||||
|
||||
EXPECT_EQ(1, int(result.at<int>(0, 0)));
|
||||
EXPECT_EQ(2, int(result.at<int>(1, 0)));
|
||||
EXPECT_EQ(0, trainLabels.at<int>(result.at<int>(0, 0), 0));
|
||||
}
|
||||
|
||||
}} // namespace
|
||||
81
3rdparty/opencv-4.5.4/modules/ml/test/test_lr.cpp
vendored
Normal file
@@ -0,0 +1,81 @@
|
||||
// This file is part of OpenCV project.
|
||||
// It is subject to the license terms in the LICENSE file found in the top-level directory
|
||||
// of this distribution and at http://opencv.org/license.html.
|
||||
//
|
||||
// AUTHOR: Rahul Kavi rahulkavi[at]live[at]com
|
||||
|
||||
//
|
||||
// Test data uses subset of data from the popular Iris Dataset (1936):
|
||||
// - http://archive.ics.uci.edu/ml/datasets/Iris
|
||||
// - https://en.wikipedia.org/wiki/Iris_flower_data_set
|
||||
//
|
||||
|
||||
#include "test_precomp.hpp"
|
||||
|
||||
namespace opencv_test { namespace {
|
||||
|
||||
TEST(ML_LR, accuracy)
|
||||
{
|
||||
std::string dataFileName = findDataFile("iris.data");
|
||||
Ptr<TrainData> tdata = TrainData::loadFromCSV(dataFileName, 0);
|
||||
ASSERT_FALSE(tdata.empty());
|
||||
|
||||
Ptr<LogisticRegression> p = LogisticRegression::create();
|
||||
p->setLearningRate(1.0);
|
||||
p->setIterations(10001);
|
||||
p->setRegularization(LogisticRegression::REG_L2);
|
||||
p->setTrainMethod(LogisticRegression::BATCH);
|
||||
p->setMiniBatchSize(10);
|
||||
p->train(tdata);
|
||||
|
||||
Mat responses;
|
||||
p->predict(tdata->getSamples(), responses);
|
||||
|
||||
float error = 1000;
|
||||
EXPECT_TRUE(calculateError(responses, tdata->getResponses(), error));
|
||||
EXPECT_LE(error, 0.05f);
|
||||
}
|
||||
|
||||
//==================================================================================================
|
||||
|
||||
TEST(ML_LR, save_load)
|
||||
{
|
||||
string dataFileName = findDataFile("iris.data");
|
||||
Ptr<TrainData> tdata = TrainData::loadFromCSV(dataFileName, 0);
|
||||
ASSERT_FALSE(tdata.empty());
|
||||
Mat responses1, responses2;
|
||||
Mat learnt_mat1, learnt_mat2;
|
||||
String filename = tempfile(".xml");
|
||||
{
|
||||
Ptr<LogisticRegression> lr1 = LogisticRegression::create();
|
||||
lr1->setLearningRate(1.0);
|
||||
lr1->setIterations(10001);
|
||||
lr1->setRegularization(LogisticRegression::REG_L2);
|
||||
lr1->setTrainMethod(LogisticRegression::BATCH);
|
||||
lr1->setMiniBatchSize(10);
|
||||
ASSERT_NO_THROW(lr1->train(tdata));
|
||||
ASSERT_NO_THROW(lr1->predict(tdata->getSamples(), responses1));
|
||||
ASSERT_NO_THROW(lr1->save(filename));
|
||||
learnt_mat1 = lr1->get_learnt_thetas();
|
||||
}
|
||||
{
|
||||
Ptr<LogisticRegression> lr2;
|
||||
ASSERT_NO_THROW(lr2 = Algorithm::load<LogisticRegression>(filename));
|
||||
ASSERT_NO_THROW(lr2->predict(tdata->getSamples(), responses2));
|
||||
learnt_mat2 = lr2->get_learnt_thetas();
|
||||
}
|
||||
// compare difference in prediction outputs and stored inputs
|
||||
EXPECT_MAT_NEAR(responses1, responses2, 0.f);
|
||||
|
||||
Mat comp_learnt_mats;
|
||||
comp_learnt_mats = (learnt_mat1 == learnt_mat2);
|
||||
comp_learnt_mats = comp_learnt_mats.reshape(1, comp_learnt_mats.rows*comp_learnt_mats.cols);
|
||||
comp_learnt_mats.convertTo(comp_learnt_mats, CV_32S);
|
||||
comp_learnt_mats = comp_learnt_mats/255;
|
||||
// check if there is any difference between computed learnt mat and retrieved mat
|
||||
EXPECT_EQ(comp_learnt_mats.rows, sum(comp_learnt_mats)[0]);
|
||||
|
||||
remove( filename.c_str() );
|
||||
}
|
||||
|
||||
}} // namespace
|
||||
10
3rdparty/opencv-4.5.4/modules/ml/test/test_main.cpp
vendored
Normal file
@@ -0,0 +1,10 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "test_precomp.hpp"

#if defined(HAVE_HPX)
#include <hpx/hpx_main.hpp>
#endif

CV_TEST_MAIN("ml")
373
3rdparty/opencv-4.5.4/modules/ml/test/test_mltests.cpp
vendored
Normal file
@@ -0,0 +1,373 @@
|
||||
// This file is part of OpenCV project.
|
||||
// It is subject to the license terms in the LICENSE file found in the top-level directory
|
||||
// of this distribution and at http://opencv.org/license.html.
|
||||
|
||||
#include "test_precomp.hpp"
|
||||
|
||||
namespace opencv_test { namespace {
|
||||
|
||||
struct DatasetDesc
|
||||
{
|
||||
string name;
|
||||
int resp_idx;
|
||||
int train_count;
|
||||
int cat_num;
|
||||
string type_desc;
|
||||
public:
|
||||
Ptr<TrainData> load()
|
||||
{
|
||||
string filename = findDataFile(name + ".data");
|
||||
Ptr<TrainData> data = TrainData::loadFromCSV(filename, 0, resp_idx, resp_idx + 1, type_desc);
|
||||
data->setTrainTestSplit(train_count);
|
||||
data->shuffleTrainTest();
|
||||
return data;
|
||||
}
|
||||
};
|
||||
|
||||
// see testdata/ml/protocol.txt (?)
|
||||
DatasetDesc datasets[] = {
|
||||
{ "mushroom", 0, 4000, 16, "cat" },
|
||||
{ "adult", 14, 22561, 16, "ord[0,2,4,10-12],cat[1,3,5-9,13,14]" },
|
||||
{ "vehicle", 18, 761, 4, "ord[0-17],cat[18]" },
|
||||
{ "abalone", 8, 3133, 16, "ord[1-8],cat[0]" },
|
||||
{ "ringnorm", 20, 300, 2, "ord[0-19],cat[20]" },
|
||||
{ "spambase", 57, 3221, 3, "ord[0-56],cat[57]" },
|
||||
{ "waveform", 21, 300, 3, "ord[0-20],cat[21]" },
|
||||
{ "elevators", 18, 5000, 0, "ord" },
|
||||
{ "letter", 16, 10000, 26, "ord[0-15],cat[16]" },
|
||||
{ "twonorm", 20, 300, 3, "ord[0-19],cat[20]" },
|
||||
{ "poletelecomm", 48, 2500, 0, "ord" },
|
||||
};
|
||||
|
||||
static DatasetDesc & getDataset(const string & name)
|
||||
{
|
||||
const int sz = sizeof(datasets)/sizeof(datasets[0]);
|
||||
for (int i = 0; i < sz; ++i)
|
||||
{
|
||||
DatasetDesc & desc = datasets[i];
|
||||
if (desc.name == name)
|
||||
return desc;
|
||||
}
|
||||
CV_Error(Error::StsInternal, "");
|
||||
}
|
||||
|
||||
//==================================================================================================
|
||||
|
||||
// interfaces and templates
|
||||
|
||||
template <typename T> string modelName() { return "Unknown"; };
|
||||
template <typename T> Ptr<T> tuneModel(const DatasetDesc &, Ptr<T> m) { return m; }
|
||||
|
||||
struct IModelFactory
|
||||
{
|
||||
virtual Ptr<StatModel> createNew(const DatasetDesc &dataset) const = 0;
|
||||
virtual Ptr<StatModel> loadFromFile(const string &filename) const = 0;
|
||||
virtual string name() const = 0;
|
||||
virtual ~IModelFactory() {}
|
||||
};
|
||||
|
||||
template <typename T>
|
||||
struct ModelFactory : public IModelFactory
|
||||
{
|
||||
Ptr<StatModel> createNew(const DatasetDesc &dataset) const CV_OVERRIDE
|
||||
{
|
||||
return tuneModel<T>(dataset, T::create());
|
||||
}
|
||||
Ptr<StatModel> loadFromFile(const string & filename) const CV_OVERRIDE
|
||||
{
|
||||
return T::load(filename);
|
||||
}
|
||||
string name() const CV_OVERRIDE { return modelName<T>(); }
|
||||
};
|
||||
|
||||
// implementation
|
||||
|
||||
template <> string modelName<NormalBayesClassifier>() { return "NormalBayesClassifier"; }
|
||||
template <> string modelName<DTrees>() { return "DTrees"; }
|
||||
template <> string modelName<KNearest>() { return "KNearest"; }
|
||||
template <> string modelName<RTrees>() { return "RTrees"; }
|
||||
template <> string modelName<SVMSGD>() { return "SVMSGD"; }
|
||||
|
||||
template<> Ptr<DTrees> tuneModel<DTrees>(const DatasetDesc &dataset, Ptr<DTrees> m)
|
||||
{
|
||||
m->setMaxDepth(10);
|
||||
m->setMinSampleCount(2);
|
||||
m->setRegressionAccuracy(0);
|
||||
m->setUseSurrogates(false);
|
||||
m->setCVFolds(0);
|
||||
m->setUse1SERule(false);
|
||||
m->setTruncatePrunedTree(false);
|
||||
m->setPriors(Mat());
|
||||
m->setMaxCategories(dataset.cat_num);
|
||||
return m;
|
||||
}
|
||||
|
||||
template<> Ptr<RTrees> tuneModel<RTrees>(const DatasetDesc &dataset, Ptr<RTrees> m)
|
||||
{
|
||||
m->setMaxDepth(20);
|
||||
m->setMinSampleCount(2);
|
||||
m->setRegressionAccuracy(0);
|
||||
m->setUseSurrogates(false);
|
||||
m->setPriors(Mat());
|
||||
m->setCalculateVarImportance(true);
|
||||
m->setActiveVarCount(0);
|
||||
m->setTermCriteria(TermCriteria(TermCriteria::COUNT, 100, 0.0));
|
||||
m->setMaxCategories(dataset.cat_num);
|
||||
return m;
|
||||
}
|
||||
|
||||
template<> Ptr<SVMSGD> tuneModel<SVMSGD>(const DatasetDesc &, Ptr<SVMSGD> m)
|
||||
{
|
||||
m->setSvmsgdType(SVMSGD::ASGD);
|
||||
m->setMarginType(SVMSGD::SOFT_MARGIN);
|
||||
m->setMarginRegularization(0.00001f);
|
||||
m->setInitialStepSize(0.1f);
|
||||
m->setStepDecreasingPower(0.75);
|
||||
m->setTermCriteria(TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 10000, 0.00001));
|
||||
return m;
|
||||
}
|
||||
|
||||
template <>
|
||||
struct ModelFactory<Boost> : public IModelFactory
|
||||
{
|
||||
ModelFactory(int boostType_) : boostType(boostType_) {}
|
||||
Ptr<StatModel> createNew(const DatasetDesc &) const CV_OVERRIDE
|
||||
{
|
||||
Ptr<Boost> m = Boost::create();
|
||||
m->setBoostType(boostType);
|
||||
m->setWeakCount(20);
|
||||
m->setWeightTrimRate(0.95);
|
||||
m->setMaxDepth(4);
|
||||
m->setUseSurrogates(false);
|
||||
m->setPriors(Mat());
|
||||
return m;
|
||||
}
|
||||
Ptr<StatModel> loadFromFile(const string &filename) const { return Boost::load(filename); }
|
||||
string name() const CV_OVERRIDE { return "Boost"; }
|
||||
int boostType;
|
||||
};
|
||||
|
||||
template <>
|
||||
struct ModelFactory<SVM> : public IModelFactory
|
||||
{
|
||||
ModelFactory(int svmType_, int kernelType_, double gamma_, double c_, double nu_)
|
||||
: svmType(svmType_), kernelType(kernelType_), gamma(gamma_), c(c_), nu(nu_) {}
|
||||
Ptr<StatModel> createNew(const DatasetDesc &) const CV_OVERRIDE
|
||||
{
|
||||
Ptr<SVM> m = SVM::create();
|
||||
m->setType(svmType);
|
||||
m->setKernel(kernelType);
|
||||
m->setDegree(0);
|
||||
m->setGamma(gamma);
|
||||
m->setCoef0(0);
|
||||
m->setC(c);
|
||||
m->setNu(nu);
|
||||
m->setP(0);
|
||||
return m;
|
||||
}
|
||||
Ptr<StatModel> loadFromFile(const string &filename) const { return SVM::load(filename); }
|
||||
string name() const CV_OVERRIDE { return "SVM"; }
|
||||
int svmType;
|
||||
int kernelType;
|
||||
double gamma;
|
||||
double c;
|
||||
double nu;
|
||||
};
|
||||
|
||||
//==================================================================================================
|
||||
|
||||
struct ML_Params_t
|
||||
{
|
||||
Ptr<IModelFactory> factory;
|
||||
string dataset;
|
||||
float mean;
|
||||
float sigma;
|
||||
};
|
||||
|
||||
void PrintTo(const ML_Params_t & param, std::ostream *os)
|
||||
{
|
||||
*os << param.factory->name() << "_" << param.dataset;
|
||||
}
|
||||
|
||||
ML_Params_t ML_Params_List[] = {
|
||||
{ makePtr< ModelFactory<DTrees> >(), "mushroom", 0.027401f, 0.036236f },
|
||||
{ makePtr< ModelFactory<DTrees> >(), "adult", 14.279000f, 0.354323f },
|
||||
{ makePtr< ModelFactory<DTrees> >(), "vehicle", 29.761162f, 4.823927f },
|
||||
{ makePtr< ModelFactory<DTrees> >(), "abalone", 7.297540f, 0.510058f },
|
||||
{ makePtr< ModelFactory<Boost> >(Boost::REAL), "adult", 13.894001f, 0.337763f },
|
||||
{ makePtr< ModelFactory<Boost> >(Boost::DISCRETE), "mushroom", 0.007274f, 0.029400f },
|
||||
{ makePtr< ModelFactory<Boost> >(Boost::LOGIT), "ringnorm", 9.993943f, 0.860256f },
|
||||
{ makePtr< ModelFactory<Boost> >(Boost::GENTLE), "spambase", 5.404347f, 0.581716f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "waveform", 17.100641f, 0.630052f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "mushroom", 0.006547f, 0.028248f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "adult", 13.5129f, 0.266065f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "abalone", 4.745199f, 0.282112f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "vehicle", 24.964712f, 4.469287f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "letter", 5.334999f, 0.261142f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "ringnorm", 6.248733f, 0.904713f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "twonorm", 4.506479f, 0.449739f },
|
||||
{ makePtr< ModelFactory<RTrees> >(), "spambase", 5.243477f, 0.54232f },
|
||||
};
|
||||
|
||||
typedef testing::TestWithParam<ML_Params_t> ML_Params;

TEST_P(ML_Params, accuracy)
{
    const ML_Params_t & param = GetParam();
    DatasetDesc &dataset = getDataset(param.dataset);
    Ptr<TrainData> data = dataset.load();
    ASSERT_TRUE(data);
    ASSERT_TRUE(data->getNSamples() > 0);

    Ptr<StatModel> m = param.factory->createNew(dataset);
    ASSERT_TRUE(m);
    ASSERT_TRUE(m->train(data, 0));

    float err = m->calcError(data, true, noArray());
    EXPECT_NEAR(err, param.mean, 4 * param.sigma);
}

INSTANTIATE_TEST_CASE_P(/**/, ML_Params, testing::ValuesIn(ML_Params_List));


//==================================================================================================

struct ML_SL_Params_t
{
    Ptr<IModelFactory> factory;
    string dataset;
};

void PrintTo(const ML_SL_Params_t & param, std::ostream *os)
{
    *os << param.factory->name() << "_" << param.dataset;
}

ML_SL_Params_t ML_SL_Params_List[] = {
    { makePtr< ModelFactory<NormalBayesClassifier> >(), "waveform" },
    { makePtr< ModelFactory<KNearest> >(), "waveform" },
    { makePtr< ModelFactory<KNearest> >(), "abalone" },
    { makePtr< ModelFactory<SVM> >(SVM::C_SVC, SVM::LINEAR, 1, 0.5, 0), "waveform" },
    { makePtr< ModelFactory<SVM> >(SVM::NU_SVR, SVM::RBF, 0.00225, 62.5, 0.03), "poletelecomm" },
    { makePtr< ModelFactory<DTrees> >(), "mushroom" },
    { makePtr< ModelFactory<DTrees> >(), "abalone" },
    { makePtr< ModelFactory<Boost> >(Boost::REAL), "adult" },
    { makePtr< ModelFactory<RTrees> >(), "waveform" },
    { makePtr< ModelFactory<RTrees> >(), "abalone" },
    { makePtr< ModelFactory<SVMSGD> >(), "waveform" },
};

typedef testing::TestWithParam<ML_SL_Params_t> ML_SL_Params;

TEST_P(ML_SL_Params, save_load)
{
    const ML_SL_Params_t & param = GetParam();

    DatasetDesc &dataset = getDataset(param.dataset);
    Ptr<TrainData> data = dataset.load();
    ASSERT_TRUE(data);
    ASSERT_TRUE(data->getNSamples() > 0);

    Mat responses1, responses2;
    string file1 = tempfile(".json.gz");
    string file2 = tempfile(".json.gz");
    {
        Ptr<StatModel> m = param.factory->createNew(dataset);
        ASSERT_TRUE(m);
        ASSERT_TRUE(m->train(data, 0));
        m->calcError(data, true, responses1);
        m->save(file1 + "?base64");
    }
    {
        Ptr<StatModel> m = param.factory->loadFromFile(file1);
        ASSERT_TRUE(m);
        m->calcError(data, true, responses2);
        m->save(file2 + "?base64");
    }
    EXPECT_MAT_NEAR(responses1, responses2, 0.0);
    {
        ifstream f1(file1.c_str(), std::ios_base::binary);
        ifstream f2(file2.c_str(), std::ios_base::binary);
        ASSERT_TRUE(f1.is_open() && f2.is_open());
        const size_t BUFSZ = 10000;
        vector<char> buf1(BUFSZ, 0);
        vector<char> buf2(BUFSZ, 0);
        while (true)
        {
            f1.read(&buf1[0], BUFSZ);
            f2.read(&buf2[0], BUFSZ);
            EXPECT_EQ(f1.gcount(), f2.gcount());
            EXPECT_EQ(f1.eof(), f2.eof());
            if (!f1.good() || !f2.good() || f1.gcount() != f2.gcount())
                break;
            ASSERT_EQ(buf1, buf2);
        }
    }
    remove(file1.c_str());
    remove(file2.c_str());
}

INSTANTIATE_TEST_CASE_P(/**/, ML_SL_Params, testing::ValuesIn(ML_SL_Params_List));

//==================================================================================================

TEST(TrainDataGet, layout_ROW_SAMPLE) // Details: #12236
{
    cv::Mat test = cv::Mat::ones(150, 30, CV_32FC1) * 2;
    test.col(3) += Scalar::all(3);
    cv::Mat labels = cv::Mat::ones(150, 3, CV_32SC1) * 5;
    labels.col(1) += 1;
    cv::Ptr<cv::ml::TrainData> train_data = cv::ml::TrainData::create(test, cv::ml::ROW_SAMPLE, labels);
    train_data->setTrainTestSplitRatio(0.9);

    Mat tidx = train_data->getTestSampleIdx();
    EXPECT_EQ((size_t)15, tidx.total());

    Mat tresp = train_data->getTestResponses();
    EXPECT_EQ(15, tresp.rows);
    EXPECT_EQ(labels.cols, tresp.cols);
    EXPECT_EQ(5, tresp.at<int>(0, 0)) << tresp;
    EXPECT_EQ(6, tresp.at<int>(0, 1)) << tresp;
    EXPECT_EQ(6, tresp.at<int>(14, 1)) << tresp;
    EXPECT_EQ(5, tresp.at<int>(14, 2)) << tresp;

    Mat tsamples = train_data->getTestSamples();
    EXPECT_EQ(15, tsamples.rows);
    EXPECT_EQ(test.cols, tsamples.cols);
    EXPECT_EQ(2, tsamples.at<float>(0, 0)) << tsamples;
    EXPECT_EQ(5, tsamples.at<float>(0, 3)) << tsamples;
    EXPECT_EQ(2, tsamples.at<float>(14, test.cols - 1)) << tsamples;
    EXPECT_EQ(5, tsamples.at<float>(14, 3)) << tsamples;
}

TEST(TrainDataGet, layout_COL_SAMPLE) // Details: #12236
{
    cv::Mat test = cv::Mat::ones(30, 150, CV_32FC1) * 3;
    test.row(3) += Scalar::all(3);
    cv::Mat labels = cv::Mat::ones(3, 150, CV_32SC1) * 5;
    labels.row(1) += 1;
    cv::Ptr<cv::ml::TrainData> train_data = cv::ml::TrainData::create(test, cv::ml::COL_SAMPLE, labels);
    train_data->setTrainTestSplitRatio(0.9);

    Mat tidx = train_data->getTestSampleIdx();
    EXPECT_EQ((size_t)15, tidx.total());

    Mat tresp = train_data->getTestResponses(); // always row-based, transposed
    EXPECT_EQ(15, tresp.rows);
    EXPECT_EQ(labels.rows, tresp.cols);
    EXPECT_EQ(5, tresp.at<int>(0, 0)) << tresp;
    EXPECT_EQ(6, tresp.at<int>(0, 1)) << tresp;
    EXPECT_EQ(6, tresp.at<int>(14, 1)) << tresp;
    EXPECT_EQ(5, tresp.at<int>(14, 2)) << tresp;


    Mat tsamples = train_data->getTestSamples();
    EXPECT_EQ(15, tsamples.cols);
    EXPECT_EQ(test.rows, tsamples.rows);
    EXPECT_EQ(3, tsamples.at<float>(0, 0)) << tsamples;
    EXPECT_EQ(6, tsamples.at<float>(3, 0)) << tsamples;
    EXPECT_EQ(6, tsamples.at<float>(3, 14)) << tsamples;
    EXPECT_EQ(3, tsamples.at<float>(test.rows - 1, 14)) << tsamples;
}

}} // namespace
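Note (not part of the diff): the ML_SL_Params test above exercises the train / save / reload round-trip through the project's dataset factories. A minimal standalone sketch of the same round-trip against the public cv::ml API is given below; the toy samples, labels, and file name are illustrative assumptions, not taken from this commit.

#include <opencv2/ml.hpp>
using namespace cv;
using namespace cv::ml;

int main()
{
    // Toy 2-class problem; the values are illustrative only.
    Mat samples = (Mat_<float>(4, 2) << 0.f, 0.f,  0.1f, 0.1f,  1.f, 1.f,  0.9f, 1.1f);
    Mat labels  = (Mat_<int>(4, 1)   << 0, 0, 1, 1);
    Ptr<TrainData> data = TrainData::create(samples, ROW_SAMPLE, labels);

    Ptr<SVM> svm = SVM::create();
    svm->train(data);
    float errBefore = svm->calcError(data, false, noArray()); // % misclassified on the training set

    svm->save("svm_model.json");                               // serialize the trained model
    Ptr<SVM> loaded = Algorithm::load<SVM>("svm_model.json");  // reload it
    float errAfter = loaded->calcError(data, false, noArray());
    return errBefore == errAfter ? 0 : 1; // a correct round-trip should not change predictions
}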
51
3rdparty/opencv-4.5.4/modules/ml/test/test_precomp.hpp
vendored
Normal file
@@ -0,0 +1,51 @@
#ifndef __OPENCV_TEST_PRECOMP_HPP__
#define __OPENCV_TEST_PRECOMP_HPP__

#include "opencv2/ts.hpp"
#include <opencv2/ts/cuda_test.hpp> // EXPECT_MAT_NEAR
#include "opencv2/ml.hpp"
#include "opencv2/core/core_c.h"

#include <fstream>
using std::ifstream;

namespace opencv_test {

using namespace cv::ml;

#define CV_NBAYES "nbayes"
#define CV_KNEAREST "knearest"
#define CV_SVM "svm"
#define CV_EM "em"
#define CV_ANN "ann"
#define CV_DTREE "dtree"
#define CV_BOOST "boost"
#define CV_RTREES "rtrees"
#define CV_ERTREES "ertrees"
#define CV_SVMSGD "svmsgd"

using cv::Ptr;
using cv::ml::StatModel;
using cv::ml::TrainData;
using cv::ml::NormalBayesClassifier;
using cv::ml::SVM;
using cv::ml::KNearest;
using cv::ml::ParamGrid;
using cv::ml::ANN_MLP;
using cv::ml::DTrees;
using cv::ml::Boost;
using cv::ml::RTrees;
using cv::ml::SVMSGD;

void defaultDistribs( Mat& means, vector<Mat>& covs, int type=CV_32FC1 );
void generateData( Mat& data, Mat& labels, const vector<int>& sizes, const Mat& _means, const vector<Mat>& covs, int dataType, int labelType );
int maxIdx( const vector<int>& count );
bool getLabelsMap( const Mat& labels, const vector<int>& sizes, vector<int>& labelsMap, bool checkClusterUniq=true );
bool calcErr( const Mat& labels, const Mat& origLabels, const vector<int>& sizes, float& err, bool labelsEquivalent = true, bool checkClusterUniq=true );

// used in LR test
bool calculateError( const Mat& _p_labels, const Mat& _o_labels, float& error);

} // namespace

#endif
119
3rdparty/opencv-4.5.4/modules/ml/test/test_rtrees.cpp
vendored
Normal file
@@ -0,0 +1,119 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.

#include "test_precomp.hpp"

namespace opencv_test { namespace {

TEST(ML_RTrees, getVotes)
{
    int n = 12;
    int count, i;
    int label_size = 3;
    int predicted_class = 0;
    int max_votes = -1;
    int val;
    // RTrees for classification
    Ptr<ml::RTrees> rt = cv::ml::RTrees::create();

    //data
    Mat data(n, 4, CV_32F);
    randu(data, 0, 10);

    //labels
    Mat labels = (Mat_<int>(n,1) << 0,0,0,0, 1,1,1,1, 2,2,2,2);

    rt->train(data, ml::ROW_SAMPLE, labels);

    //run function
    Mat test(1, 4, CV_32F);
    Mat result;
    randu(test, 0, 10);
    rt->getVotes(test, result, 0);

    //count vote amount and find highest vote
    count = 0;
    const int* result_row = result.ptr<int>(1);
    for( i = 0; i < label_size; i++ )
    {
        val = result_row[i];
        //predicted_class = max_votes < val? i;
        if( max_votes < val )
        {
            max_votes = val;
            predicted_class = i;
        }
        count += val;
    }

    EXPECT_EQ(count, (int)rt->getRoots().size());
    EXPECT_EQ(result.at<float>(0, predicted_class), rt->predict(test));
}

TEST(ML_RTrees, 11142_sample_weights_regression)
{
    int n = 3;
    // RTrees for regression
    Ptr<ml::RTrees> rt = cv::ml::RTrees::create();
    //simple regression problem of x -> 2x
    Mat data = (Mat_<float>(n,1) << 1, 2, 3);
    Mat values = (Mat_<float>(n,1) << 2, 4, 6);
    Mat weights = (Mat_<float>(n, 1) << 10, 10, 10);

    Ptr<TrainData> trainData = TrainData::create(data, ml::ROW_SAMPLE, values);
    rt->train(trainData);
    double error_without_weights = round(rt->getOOBError());
    rt->clear();
    Ptr<TrainData> trainDataWithWeights = TrainData::create(data, ml::ROW_SAMPLE, values, Mat(), Mat(), weights );
    rt->train(trainDataWithWeights);
    double error_with_weights = round(rt->getOOBError());
    // error with weights should be larger than error without weights
    EXPECT_GE(error_with_weights, error_without_weights);
}

TEST(ML_RTrees, 11142_sample_weights_classification)
{
    int n = 12;
    // RTrees for classification
    Ptr<ml::RTrees> rt = cv::ml::RTrees::create();

    Mat data(n, 4, CV_32F);
    randu(data, 0, 10);
    Mat labels = (Mat_<int>(n,1) << 0,0,0,0, 1,1,1,1, 2,2,2,2);
    Mat weights = (Mat_<float>(n, 1) << 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10);

    rt->train(data, ml::ROW_SAMPLE, labels);
    rt->clear();
    double error_without_weights = round(rt->getOOBError());
    Ptr<TrainData> trainDataWithWeights = TrainData::create(data, ml::ROW_SAMPLE, labels, Mat(), Mat(), weights );
    rt->train(data, ml::ROW_SAMPLE, labels);
    double error_with_weights = round(rt->getOOBError());
    std::cout << error_without_weights << std::endl;
    std::cout << error_with_weights << std::endl;
    // error with weights should be larger than error without weights
    EXPECT_GE(error_with_weights, error_without_weights);
}

TEST(ML_RTrees, bug_12974_throw_exception_when_predict_different_feature_count)
{
    int numFeatures = 5;
    // create a 5 feature dataset and train the model
    cv::Ptr<RTrees> model = RTrees::create();
    Mat samples(10, numFeatures, CV_32F);
    randu(samples, 0, 10);
    Mat labels = (Mat_<int>(10,1) << 0,0,0,0,0,1,1,1,1,1);
    cv::Ptr<TrainData> trainData = TrainData::create(samples, cv::ml::ROW_SAMPLE, labels);
    model->train(trainData);
    // try to predict on data which have fewer features - this should throw an exception
    for(int i = 1; i < numFeatures - 1; ++i) {
        Mat test(1, i, CV_32FC1);
        ASSERT_THROW(model->predict(test), Exception);
    }
    // try to predict on data which have more features - this should also throw an exception
    Mat test(1, numFeatures + 1, CV_32FC1);
    ASSERT_THROW(model->predict(test), Exception);
}


}} // namespace
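Note (not part of the diff): the getVotes test above relies on the documented layout of RTrees::getVotes for classifiers (row 0 holds the class values, row 1+i holds the per-class vote counts for sample i). A minimal hedged sketch of reading that output is shown here; the random data and label values are illustrative assumptions only.

#include <opencv2/ml.hpp>
#include <cstdio>
using namespace cv;
using namespace cv::ml;

int main()
{
    Mat data(12, 4, CV_32F);
    Mat labels = (Mat_<int>(12, 1) << 0,0,0,0, 1,1,1,1, 2,2,2,2);
    randu(data, 0, 10);

    Ptr<RTrees> rt = RTrees::create();
    rt->train(data, ROW_SAMPLE, labels);

    Mat sample(1, 4, CV_32F), votes;
    randu(sample, 0, 10);
    rt->getVotes(sample, votes, 0);

    const int* classValues = votes.ptr<int>(0); // distinct class labels
    const int* voteCounts  = votes.ptr<int>(1); // votes cast for the single sample
    int best = 0;
    for (int c = 1; c < votes.cols; ++c)
        if (voteCounts[c] > voteCounts[best]) best = c;
    std::printf("predicted class %d with %d votes\n", classValues[best], voteCounts[best]);
    return 0;
}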
107
3rdparty/opencv-4.5.4/modules/ml/test/test_save_load.cpp
vendored
Normal file
@@ -0,0 +1,107 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.

#include "test_precomp.hpp"

namespace opencv_test { namespace {


void randomFillCategories(const string & filename, Mat & input)
{
    Mat catMap;
    Mat catCount;
    std::vector<uchar> varTypes;

    FileStorage fs(filename, FileStorage::READ);
    FileNode root = fs.getFirstTopLevelNode();
    root["cat_map"] >> catMap;
    root["cat_count"] >> catCount;
    root["var_type"] >> varTypes;

    int offset = 0;
    int countOffset = 0;
    uint var = 0, varCount = (uint)varTypes.size();
    for (; var < varCount; ++var)
    {
        if (varTypes[var] == ml::VAR_CATEGORICAL)
        {
            int size = catCount.at<int>(0, countOffset);
            for (int row = 0; row < input.rows; ++row)
            {
                int randomChosenIndex = offset + ((uint)cv::theRNG()) % size;
                int value = catMap.at<int>(0, randomChosenIndex);
                input.at<float>(row, var) = (float)value;
            }
            offset += size;
            ++countOffset;
        }
    }
}

//==================================================================================================

typedef tuple<string, string> ML_Legacy_Param;
typedef testing::TestWithParam< ML_Legacy_Param > ML_Legacy_Params;

TEST_P(ML_Legacy_Params, legacy_load)
{
    const string modelName = get<0>(GetParam());
    const string dataName = get<1>(GetParam());
    const string filename = findDataFile("legacy/" + modelName + "_" + dataName + ".xml");
    const bool isTree = modelName == CV_BOOST || modelName == CV_DTREE || modelName == CV_RTREES;

    Ptr<StatModel> model;
    if (modelName == CV_BOOST)
        model = Algorithm::load<Boost>(filename);
    else if (modelName == CV_ANN)
        model = Algorithm::load<ANN_MLP>(filename);
    else if (modelName == CV_DTREE)
        model = Algorithm::load<DTrees>(filename);
    else if (modelName == CV_NBAYES)
        model = Algorithm::load<NormalBayesClassifier>(filename);
    else if (modelName == CV_SVM)
        model = Algorithm::load<SVM>(filename);
    else if (modelName == CV_RTREES)
        model = Algorithm::load<RTrees>(filename);
    else if (modelName == CV_SVMSGD)
        model = Algorithm::load<SVMSGD>(filename);
    ASSERT_TRUE(model);

    Mat input = Mat(isTree ? 10 : 1, model->getVarCount(), CV_32F);
    cv::theRNG().fill(input, RNG::UNIFORM, 0, 40);

    if (isTree)
        randomFillCategories(filename, input);

    Mat output;
    EXPECT_NO_THROW(model->predict(input, output, StatModel::RAW_OUTPUT | (isTree ? DTrees::PREDICT_SUM : 0)));
    // just check if no internal assertions or errors thrown
}

ML_Legacy_Param param_list[] = {
    ML_Legacy_Param(CV_ANN, "waveform"),
    ML_Legacy_Param(CV_BOOST, "adult"),
    ML_Legacy_Param(CV_BOOST, "1"),
    ML_Legacy_Param(CV_BOOST, "2"),
    ML_Legacy_Param(CV_BOOST, "3"),
    ML_Legacy_Param(CV_DTREE, "abalone"),
    ML_Legacy_Param(CV_DTREE, "mushroom"),
    ML_Legacy_Param(CV_NBAYES, "waveform"),
    ML_Legacy_Param(CV_SVM, "poletelecomm"),
    ML_Legacy_Param(CV_SVM, "waveform"),
    ML_Legacy_Param(CV_RTREES, "waveform"),
    ML_Legacy_Param(CV_SVMSGD, "waveform"),
};

INSTANTIATE_TEST_CASE_P(/**/, ML_Legacy_Params, testing::ValuesIn(param_list));

/*TEST(ML_SVM, throw_exception_when_save_untrained_model)
{
    Ptr<cv::ml::SVM> svm;
    string filename = tempfile("svm.xml");
    ASSERT_THROW(svm.save(filename.c_str()), Exception);
    remove(filename.c_str());
}*/

}} // namespace
156
3rdparty/opencv-4.5.4/modules/ml/test/test_svmsgd.cpp
vendored
Normal file
@@ -0,0 +1,156 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.

#include "test_precomp.hpp"

namespace opencv_test { namespace {

static const int TEST_VALUE_LIMIT = 500;
enum
{
    UNIFORM_SAME_SCALE,
    UNIFORM_DIFFERENT_SCALES
};

CV_ENUM(SVMSGD_TYPE, UNIFORM_SAME_SCALE, UNIFORM_DIFFERENT_SCALES)

typedef std::vector< std::pair<float,float> > BorderList;

static void makeData(RNG &rng, int samplesCount, const Mat &weights, float shift, const BorderList & borders, Mat &samples, Mat & responses)
{
    int featureCount = weights.cols;
    samples.create(samplesCount, featureCount, CV_32FC1);
    for (int featureIndex = 0; featureIndex < featureCount; featureIndex++)
        rng.fill(samples.col(featureIndex), RNG::UNIFORM, borders[featureIndex].first, borders[featureIndex].second);
    responses.create(samplesCount, 1, CV_32FC1);
    for (int i = 0 ; i < samplesCount; i++)
    {
        double res = samples.row(i).dot(weights) + shift;
        responses.at<float>(i) = res > 0 ? 1.f : -1.f;
    }
}

//==================================================================================================

typedef tuple<SVMSGD_TYPE, int, double> ML_SVMSGD_Param;
typedef testing::TestWithParam<ML_SVMSGD_Param> ML_SVMSGD_Params;

TEST_P(ML_SVMSGD_Params, scale_and_features)
{
    const int type = get<0>(GetParam());
    const int featureCount = get<1>(GetParam());
    const double precision = get<2>(GetParam());

    RNG &rng = cv::theRNG();

    Mat_<float> weights(1, featureCount);
    rng.fill(weights, RNG::UNIFORM, -1, 1);
    const float shift = static_cast<float>(rng.uniform(-featureCount, featureCount));

    BorderList borders;
    float lowerLimit = -TEST_VALUE_LIMIT;
    float upperLimit = TEST_VALUE_LIMIT;
    if (type == UNIFORM_SAME_SCALE)
    {
        for (int featureIndex = 0; featureIndex < featureCount; featureIndex++)
            borders.push_back(std::pair<float,float>(lowerLimit, upperLimit));
    }
    else if (type == UNIFORM_DIFFERENT_SCALES)
    {
        for (int featureIndex = 0; featureIndex < featureCount; featureIndex++)
        {
            int crit = rng.uniform(0, 2);
            if (crit > 0)
                borders.push_back(std::pair<float,float>(lowerLimit, upperLimit));
            else
                borders.push_back(std::pair<float,float>(lowerLimit/1000, upperLimit/1000));
        }
    }
    ASSERT_FALSE(borders.empty());

    Mat trainSamples;
    Mat trainResponses;
    int trainSamplesCount = 10000;
    makeData(rng, trainSamplesCount, weights, shift, borders, trainSamples, trainResponses);
    ASSERT_EQ(trainResponses.type(), CV_32FC1);

    Mat testSamples;
    Mat testResponses;
    int testSamplesCount = 100000;
    makeData(rng, testSamplesCount, weights, shift, borders, testSamples, testResponses);
    ASSERT_EQ(testResponses.type(), CV_32FC1);

    Ptr<TrainData> data = TrainData::create(trainSamples, cv::ml::ROW_SAMPLE, trainResponses);
    ASSERT_TRUE(data);

    cv::Ptr<SVMSGD> svmsgd = SVMSGD::create();
    ASSERT_TRUE(svmsgd);

    svmsgd->train(data);

    Mat responses;
    svmsgd->predict(testSamples, responses);
    ASSERT_EQ(responses.type(), CV_32FC1);
    ASSERT_EQ(responses.rows, testSamplesCount);

    int errCount = 0;
    for (int i = 0; i < testSamplesCount; i++)
        if (responses.at<float>(i) * testResponses.at<float>(i) < 0)
            errCount++;
    float err = (float)errCount / testSamplesCount;
    EXPECT_LE(err, precision);
}

ML_SVMSGD_Param params_list[] = {
    ML_SVMSGD_Param(UNIFORM_SAME_SCALE, 2, 0.01),
    ML_SVMSGD_Param(UNIFORM_SAME_SCALE, 5, 0.01),
    ML_SVMSGD_Param(UNIFORM_SAME_SCALE, 100, 0.02),
    ML_SVMSGD_Param(UNIFORM_DIFFERENT_SCALES, 2, 0.01),
    ML_SVMSGD_Param(UNIFORM_DIFFERENT_SCALES, 5, 0.01),
    ML_SVMSGD_Param(UNIFORM_DIFFERENT_SCALES, 100, 0.01),
};

INSTANTIATE_TEST_CASE_P(/**/, ML_SVMSGD_Params, testing::ValuesIn(params_list));

//==================================================================================================

TEST(ML_SVMSGD, twoPoints)
{
    Mat samples(2, 2, CV_32FC1);
    samples.at<float>(0,0) = 0;
    samples.at<float>(0,1) = 0;
    samples.at<float>(1,0) = 1000;
    samples.at<float>(1,1) = 1;

    Mat responses(2, 1, CV_32FC1);
    responses.at<float>(0) = -1;
    responses.at<float>(1) = 1;

    cv::Ptr<TrainData> trainData = TrainData::create(samples, cv::ml::ROW_SAMPLE, responses);

    Mat realWeights(1, 2, CV_32FC1);
    realWeights.at<float>(0) = 1000;
    realWeights.at<float>(1) = 1;

    float realShift = -500000.5;

    float normRealWeights = static_cast<float>(cv::norm(realWeights)); // TODO cvtest
    realWeights /= normRealWeights;
    realShift /= normRealWeights;

    cv::Ptr<SVMSGD> svmsgd = SVMSGD::create();
    svmsgd->setOptimalParameters();
    svmsgd->train( trainData );

    Mat foundWeights = svmsgd->getWeights();
    float foundShift = svmsgd->getShift();

    float normFoundWeights = static_cast<float>(cv::norm(foundWeights)); // TODO cvtest
    foundWeights /= normFoundWeights;
    foundShift /= normFoundWeights;
    EXPECT_LE(cv::norm(Mat(foundWeights - realWeights)), 0.001); // TODO cvtest
    EXPECT_LE(std::abs((foundShift - realShift) / realShift), 0.05);
}

}} // namespace
164
3rdparty/opencv-4.5.4/modules/ml/test/test_svmtrainauto.cpp
vendored
Normal file
@@ -0,0 +1,164 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.

#include "test_precomp.hpp"

namespace opencv_test { namespace {

using cv::ml::SVM;
using cv::ml::TrainData;

static Ptr<TrainData> makeRandomData(int datasize)
{
    cv::Mat samples = cv::Mat::zeros( datasize, 2, CV_32FC1 );
    cv::Mat responses = cv::Mat::zeros( datasize, 1, CV_32S );
    RNG &rng = cv::theRNG();
    for (int i = 0; i < datasize; ++i)
    {
        int response = rng.uniform(0, 2); // Random from {0, 1}.
        samples.at<float>( i, 0 ) = rng.uniform(0.f, 0.5f) + response * 0.5f;
        samples.at<float>( i, 1 ) = rng.uniform(0.f, 0.5f) + response * 0.5f;
        responses.at<int>( i, 0 ) = response;
    }
    return TrainData::create( samples, cv::ml::ROW_SAMPLE, responses );
}

static Ptr<TrainData> makeCircleData(int datasize, float scale_factor, float radius)
{
    // Populate samples with data that can be split into two concentric circles
    cv::Mat samples = cv::Mat::zeros( datasize, 2, CV_32FC1 );
    cv::Mat responses = cv::Mat::zeros( datasize, 1, CV_32S );
    for (int i = 0; i < datasize; i+=2)
    {
        const float pi = 3.14159f;
        const float angle_rads = (i/datasize) * pi;
        const float x = radius * cos(angle_rads);
        const float y = radius * cos(angle_rads);

        // Larger circle
        samples.at<float>( i, 0 ) = x;
        samples.at<float>( i, 1 ) = y;
        responses.at<int>( i, 0 ) = 0;

        // Smaller circle
        samples.at<float>( i + 1, 0 ) = x * scale_factor;
        samples.at<float>( i + 1, 1 ) = y * scale_factor;
        responses.at<int>( i + 1, 0 ) = 1;
    }
    return TrainData::create( samples, cv::ml::ROW_SAMPLE, responses );
}

static Ptr<TrainData> makeRandomData2(int datasize)
{
    cv::Mat samples = cv::Mat::zeros( datasize, 2, CV_32FC1 );
    cv::Mat responses = cv::Mat::zeros( datasize, 1, CV_32S );
    RNG &rng = cv::theRNG();
    for (int i = 0; i < datasize; ++i)
    {
        int response = rng.uniform(0, 2); // Random from {0, 1}.
        samples.at<float>( i, 0 ) = 0;
        samples.at<float>( i, 1 ) = (0.5f - response) * rng.uniform(0.f, 1.2f) + response;
        responses.at<int>( i, 0 ) = response;
    }
    return TrainData::create( samples, cv::ml::ROW_SAMPLE, responses );
}

//==================================================================================================

TEST(ML_SVM, trainauto)
{
    const int datasize = 100;
    cv::Ptr<TrainData> data = makeRandomData(datasize);
    ASSERT_TRUE(data);
    cv::Ptr<SVM> svm = SVM::create();
    ASSERT_TRUE(svm);
    svm->trainAuto( data, 10 ); // 2-fold cross validation.

    float test_data0[2] = {0.25f, 0.25f};
    cv::Mat test_point0 = cv::Mat( 1, 2, CV_32FC1, test_data0 );
    float result0 = svm->predict( test_point0 );
    float test_data1[2] = {0.75f, 0.75f};
    cv::Mat test_point1 = cv::Mat( 1, 2, CV_32FC1, test_data1 );
    float result1 = svm->predict( test_point1 );

    EXPECT_NEAR(result0, 0, 0.001);
    EXPECT_NEAR(result1, 1, 0.001);
}

TEST(ML_SVM, trainauto_sigmoid)
{
    const int datasize = 100;
    const float scale_factor = 0.5;
    const float radius = 2.0;
    cv::Ptr<TrainData> data = makeCircleData(datasize, scale_factor, radius);
    ASSERT_TRUE(data);

    cv::Ptr<SVM> svm = SVM::create();
    ASSERT_TRUE(svm);
    svm->setKernel(SVM::SIGMOID);
    svm->setGamma(10.0);
    svm->setCoef0(-10.0);
    svm->trainAuto( data, 10 ); // 2-fold cross validation.

    float test_data0[2] = {radius, radius};
    cv::Mat test_point0 = cv::Mat( 1, 2, CV_32FC1, test_data0 );
    EXPECT_FLOAT_EQ(svm->predict( test_point0 ), 0);

    float test_data1[2] = {scale_factor * radius, scale_factor * radius};
    cv::Mat test_point1 = cv::Mat( 1, 2, CV_32FC1, test_data1 );
    EXPECT_FLOAT_EQ(svm->predict( test_point1 ), 1);
}

TEST(ML_SVM, trainAuto_regression_5369)
{
    const int datasize = 100;
    Ptr<TrainData> data = makeRandomData2(datasize);
    cv::Ptr<SVM> svm = SVM::create();
    svm->trainAuto( data, 10 ); // 2-fold cross validation.

    float test_data0[2] = {0.25f, 0.25f};
    cv::Mat test_point0 = cv::Mat( 1, 2, CV_32FC1, test_data0 );
    float result0 = svm->predict( test_point0 );
    float test_data1[2] = {0.75f, 0.75f};
    cv::Mat test_point1 = cv::Mat( 1, 2, CV_32FC1, test_data1 );
    float result1 = svm->predict( test_point1 );

    EXPECT_EQ(0., result0);
    EXPECT_EQ(1., result1);
}

TEST(ML_SVM, getSupportVectors)
{
    // Set up training data
    int labels[4] = {1, -1, -1, -1};
    float trainingData[4][2] = { {501, 10}, {255, 10}, {501, 255}, {10, 501} };
    Mat trainingDataMat(4, 2, CV_32FC1, trainingData);
    Mat labelsMat(4, 1, CV_32SC1, labels);

    Ptr<SVM> svm = SVM::create();
    ASSERT_TRUE(svm);
    svm->setType(SVM::C_SVC);
    svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER, 100, 1e-6));

    // Test retrieval of SVs and compressed SVs on linear SVM
    svm->setKernel(SVM::LINEAR);
    svm->train(trainingDataMat, cv::ml::ROW_SAMPLE, labelsMat);

    Mat sv = svm->getSupportVectors();
    EXPECT_EQ(1, sv.rows); // by default compressed SV returned
    sv = svm->getUncompressedSupportVectors();
    EXPECT_EQ(3, sv.rows);

    // Test retrieval of SVs and compressed SVs on non-linear SVM
    svm->setKernel(SVM::POLY);
    svm->setDegree(2);
    svm->train(trainingDataMat, cv::ml::ROW_SAMPLE, labelsMat);

    sv = svm->getSupportVectors();
    EXPECT_EQ(3, sv.rows);
    sv = svm->getUncompressedSupportVectors();
    EXPECT_EQ(0, sv.rows); // inapplicable for non-linear SVMs
}

}} // namespace
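Note (not part of the diff): the trainauto tests above let SVM::trainAuto search the built-in C/gamma/nu/coef0/degree grids by k-fold cross-validation. A minimal standalone sketch of that call is given below; the synthetic two-cluster data and the kFold value are illustrative assumptions, not part of this commit.

#include <opencv2/ml.hpp>
#include <cstdio>
using namespace cv;
using namespace cv::ml;

int main()
{
    // Two clusters around (0.25, 0.25) and (0.75, 0.75), similar to makeRandomData above.
    Mat samples(100, 2, CV_32F), responses(100, 1, CV_32S);
    RNG& rng = theRNG();
    for (int i = 0; i < samples.rows; ++i)
    {
        int cls = rng.uniform(0, 2);
        samples.at<float>(i, 0) = rng.uniform(0.f, 0.5f) + cls * 0.5f;
        samples.at<float>(i, 1) = rng.uniform(0.f, 0.5f) + cls * 0.5f;
        responses.at<int>(i, 0) = cls;
    }
    Ptr<TrainData> data = TrainData::create(samples, ROW_SAMPLE, responses);

    Ptr<SVM> svm = SVM::create();
    svm->trainAuto(data, 10); // 10-fold cross-validation over the default parameter grids
    std::printf("optimized C=%f gamma=%f\n", svm->getC(), svm->getGamma());
    return 0;
}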
189
3rdparty/opencv-4.5.4/modules/ml/test/test_utils.cpp
vendored
Normal file
@@ -0,0 +1,189 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "test_precomp.hpp"

namespace opencv_test {

void defaultDistribs( Mat& means, vector<Mat>& covs, int type)
{
    float mp0[] = {0.0f, 0.0f}, cp0[] = {0.67f, 0.0f, 0.0f, 0.67f};
    float mp1[] = {5.0f, 0.0f}, cp1[] = {1.0f, 0.0f, 0.0f, 1.0f};
    float mp2[] = {1.0f, 5.0f}, cp2[] = {1.0f, 0.0f, 0.0f, 1.0f};
    means.create(3, 2, type);
    Mat m0( 1, 2, CV_32FC1, mp0 ), c0( 2, 2, CV_32FC1, cp0 );
    Mat m1( 1, 2, CV_32FC1, mp1 ), c1( 2, 2, CV_32FC1, cp1 );
    Mat m2( 1, 2, CV_32FC1, mp2 ), c2( 2, 2, CV_32FC1, cp2 );
    means.resize(3), covs.resize(3);

    Mat mr0 = means.row(0);
    m0.convertTo(mr0, type);
    c0.convertTo(covs[0], type);

    Mat mr1 = means.row(1);
    m1.convertTo(mr1, type);
    c1.convertTo(covs[1], type);

    Mat mr2 = means.row(2);
    m2.convertTo(mr2, type);
    c2.convertTo(covs[2], type);
}

// generate points sets by normal distributions
void generateData( Mat& data, Mat& labels, const vector<int>& sizes, const Mat& _means, const vector<Mat>& covs, int dataType, int labelType )
{
    vector<int>::const_iterator sit = sizes.begin();
    int total = 0;
    for( ; sit != sizes.end(); ++sit )
        total += *sit;
    CV_Assert( _means.rows == (int)sizes.size() && covs.size() == sizes.size() );
    CV_Assert( !data.empty() && data.rows == total );
    CV_Assert( data.type() == dataType );

    labels.create( data.rows, 1, labelType );

    randn( data, Scalar::all(-1.0), Scalar::all(1.0) );
    vector<Mat> means(sizes.size());
    for(int i = 0; i < _means.rows; i++)
        means[i] = _means.row(i);
    vector<Mat>::const_iterator mit = means.begin(), cit = covs.begin();
    int bi, ei = 0;
    sit = sizes.begin();
    for( int p = 0, l = 0; sit != sizes.end(); ++sit, ++mit, ++cit, l++ )
    {
        bi = ei;
        ei = bi + *sit;
        CV_Assert( mit->rows == 1 && mit->cols == data.cols );
        CV_Assert( cit->rows == data.cols && cit->cols == data.cols );
        for( int i = bi; i < ei; i++, p++ )
        {
            Mat r = data.row(i);
            r = r * (*cit) + *mit;
            if( labelType == CV_32FC1 )
                labels.at<float>(p, 0) = (float)l;
            else if( labelType == CV_32SC1 )
                labels.at<int>(p, 0) = l;
            else
            {
                CV_DbgAssert(0);
            }
        }
    }
}

int maxIdx( const vector<int>& count )
{
    int idx = -1;
    int maxVal = -1;
    vector<int>::const_iterator it = count.begin();
    for( int i = 0; it != count.end(); ++it, i++ )
    {
        if( *it > maxVal)
        {
            maxVal = *it;
            idx = i;
        }
    }
    CV_Assert( idx >= 0);
    return idx;
}

bool getLabelsMap( const Mat& labels, const vector<int>& sizes, vector<int>& labelsMap, bool checkClusterUniq)
{
    size_t total = 0, nclusters = sizes.size();
    for(size_t i = 0; i < sizes.size(); i++)
        total += sizes[i];

    CV_Assert( !labels.empty() );
    CV_Assert( labels.total() == total && (labels.cols == 1 || labels.rows == 1));
    CV_Assert( labels.type() == CV_32SC1 || labels.type() == CV_32FC1 );

    bool isFlt = labels.type() == CV_32FC1;

    labelsMap.resize(nclusters);

    vector<bool> buzy(nclusters, false);
    int startIndex = 0;
    for( size_t clusterIndex = 0; clusterIndex < sizes.size(); clusterIndex++ )
    {
        vector<int> count( nclusters, 0 );
        for( int i = startIndex; i < startIndex + sizes[clusterIndex]; i++)
        {
            int lbl = isFlt ? (int)labels.at<float>(i) : labels.at<int>(i);
            CV_Assert(lbl < (int)nclusters);
            count[lbl]++;
            CV_Assert(count[lbl] < (int)total);
        }
        startIndex += sizes[clusterIndex];

        int cls = maxIdx( count );
        CV_Assert( !checkClusterUniq || !buzy[cls] );

        labelsMap[clusterIndex] = cls;

        buzy[cls] = true;
    }

    if(checkClusterUniq)
    {
        for(size_t i = 0; i < buzy.size(); i++)
            if(!buzy[i])
                return false;
    }

    return true;
}

bool calcErr( const Mat& labels, const Mat& origLabels, const vector<int>& sizes, float& err, bool labelsEquivalent, bool checkClusterUniq)
{
    err = 0;
    CV_Assert( !labels.empty() && !origLabels.empty() );
    CV_Assert( labels.rows == 1 || labels.cols == 1 );
    CV_Assert( origLabels.rows == 1 || origLabels.cols == 1 );
    CV_Assert( labels.total() == origLabels.total() );
    CV_Assert( labels.type() == CV_32SC1 || labels.type() == CV_32FC1 );
    CV_Assert( origLabels.type() == labels.type() );

    vector<int> labelsMap;
    bool isFlt = labels.type() == CV_32FC1;
    if( !labelsEquivalent )
    {
        if( !getLabelsMap( labels, sizes, labelsMap, checkClusterUniq ) )
            return false;

        for( int i = 0; i < labels.rows; i++ )
            if( isFlt )
                err += labels.at<float>(i) != labelsMap[(int)origLabels.at<float>(i)] ? 1.f : 0.f;
            else
                err += labels.at<int>(i) != labelsMap[origLabels.at<int>(i)] ? 1.f : 0.f;
    }
    else
    {
        for( int i = 0; i < labels.rows; i++ )
            if( isFlt )
                err += labels.at<float>(i) != origLabels.at<float>(i) ? 1.f : 0.f;
            else
                err += labels.at<int>(i) != origLabels.at<int>(i) ? 1.f : 0.f;
    }
    err /= (float)labels.rows;
    return true;
}

bool calculateError( const Mat& _p_labels, const Mat& _o_labels, float& error)
{
    error = 0.0f;
    float accuracy = 0.0f;
    Mat _p_labels_temp;
    Mat _o_labels_temp;
    _p_labels.convertTo(_p_labels_temp, CV_32S);
    _o_labels.convertTo(_o_labels_temp, CV_32S);

    CV_Assert(_p_labels_temp.total() == _o_labels_temp.total());
    CV_Assert(_p_labels_temp.rows == _o_labels_temp.rows);

    accuracy = (float)countNonZero(_p_labels_temp == _o_labels_temp)/_p_labels_temp.rows;
    error = 1 - accuracy;
    return true;
}

} // namespace