feat: switch backend to PaddleOCR-NCNN; switch project to CMake
1. The project backend has been fully migrated to the PaddleOCR-NCNN algorithm and has passed basic compatibility tests. 2. The project is now organized with CMake; to better accommodate third-party libraries, a QMake project will no longer be provided. 3. Reorganized the rights/notice files and the code tree to minimize infringement risk. Log: switch backend to PaddleOCR-NCNN; switch project to CMake Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
3rdparty/opencv-4.5.4/doc/js_tutorials/js_video/js_bg_subtraction/js_bg_subtraction.markdown (vendored, new file, 64 lines)
@@ -0,0 +1,64 @@
Background Subtraction {#tutorial_js_bg_subtraction}
======================

Goal
----

- We will familiarize ourselves with the background subtraction methods available in OpenCV.js.

Basics
------

Background subtraction is a major preprocessing step in many vision-based applications. For
example, consider a visitor counter, where a static camera counts the visitors entering or leaving
a room, or a traffic camera extracting information about vehicles. In all these cases, you first
need to extract the person or vehicles alone. Technically, you need to extract the moving
foreground from the static background.

If you have an image of the background alone, such as an image of the room without visitors or of
the road without vehicles, the job is easy: just subtract the new image from the background, and
you get the foreground objects alone. But in most cases you may not have such an image, so the
background must be extracted from whatever images you have. It becomes more complicated when there
are shadows of the vehicles: since shadows also move, simple subtraction marks them as
foreground too. That complicates things.
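A minimal sketch of this naive frame-differencing idea, assuming `frame` and `background` are
same-size RGBA cv.Mat objects that have already been loaded (for example via **cv.imread()**); the
threshold value of 30 is an arbitrary choice for illustration:

```js
// Naive background subtraction: difference the current frame against a
// fixed background image, then threshold to get a binary foreground mask.
let grayFrame = new cv.Mat();
let grayBg = new cv.Mat();
let diff = new cv.Mat();
let fgmask = new cv.Mat();
cv.cvtColor(frame, grayFrame, cv.COLOR_RGBA2GRAY);
cv.cvtColor(background, grayBg, cv.COLOR_RGBA2GRAY);
cv.absdiff(grayFrame, grayBg, diff);                   // |frame - background|
cv.threshold(diff, fgmask, 30, 255, cv.THRESH_BINARY); // keep strong changes only
// ... use fgmask ...
grayFrame.delete(); grayBg.delete(); diff.delete(); fgmask.delete();
```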
OpenCV.js implements one algorithm for this purpose, which is very easy to use.

BackgroundSubtractorMOG2
------------------------

It is a Gaussian mixture-based background/foreground segmentation algorithm. It is based on two
papers by Z. Zivkovic: "Improved adaptive Gaussian mixture model for background subtraction" (2004)
and "Efficient Adaptive Density Estimation per Image Pixel for the Task of Background Subtraction"
(2006). One important feature of this algorithm is that it selects the appropriate number of
Gaussian distributions for each pixel, which provides better adaptability to scenes that vary due
to illumination changes and the like.

While coding, we use the constructor: **cv.BackgroundSubtractorMOG2 (history = 500, varThreshold = 16,
detectShadows = true)**
@param history Length of the history.
@param varThreshold Threshold on the squared distance between the pixel and the sample to decide
whether a pixel is close to that sample. This parameter does not affect the background update.
@param detectShadows If true, the algorithm will detect shadows and mark them. It decreases the
speed a bit, so if you do not need this feature, set the parameter to false.
@return instance of cv.BackgroundSubtractorMOG2

Use the **apply (image, fgmask, learningRate = -1)** method to get the foreground mask.
@param image Next video frame. A floating-point frame is used without scaling and should be in the
range [0,255].
@param fgmask The output foreground mask as an 8-bit binary image.
@param learningRate A value between 0 and 1 that indicates how fast the background model is
learnt. A negative value makes the algorithm use an automatically chosen learning rate; 0 means
that the background model is not updated at all; 1 means that the background model is completely
reinitialized from the last frame.

@note The instance of cv.BackgroundSubtractorMOG2 should be deleted manually.
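A minimal usage sketch, assuming `frame` holds the current video frame as a cv.Mat (for example
read from a `<video>` element with cv.VideoCapture):

```js
let fgmask = new cv.Mat();
let fgbg = new cv.BackgroundSubtractorMOG2(500, 16, true);

// inside the per-frame loop:
fgbg.apply(frame, fgmask); // learningRate defaults to -1 (automatic)
// ... use fgmask ...

// as noted above, both objects must be freed manually when done:
fgmask.delete();
fgbg.delete();
```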

Try it
------

\htmlonly
<iframe src="../../js_bg_subtraction.html" width="100%"
    onload="this.style.height=this.contentDocument.body.scrollHeight +'px';">
</iframe>
\endhtmlonly

3rdparty/opencv-4.5.4/doc/js_tutorials/js_video/js_lucas_kanade/js_lucas_kanade.markdown (vendored, new file, 171 lines)
@@ -0,0 +1,171 @@
Optical Flow {#tutorial_js_lucas_kanade}
============

Goal
----

- We will understand the concept of optical flow and its estimation using the Lucas-Kanade
  method.
- We will use functions like **cv.calcOpticalFlowPyrLK()** to track feature points in a
  video.

Optical Flow
------------

Optical flow is the pattern of apparent motion of image objects between two consecutive frames
caused by the movement of the object or the camera. It is a 2D vector field where each vector is a
displacement vector showing the movement of points from the first frame to the second. Consider
the image below (Image Courtesy: [Wikipedia article on Optical
Flow](http://en.wikipedia.org/wiki/Optical_flow)).

![image](images/optical_flow_basic1.jpg)

It shows a ball moving in 5 consecutive frames. The arrow shows its displacement vector. Optical
flow has many applications in areas like:

- Structure from Motion
- Video Compression
- Video Stabilization ...

Optical flow works on several assumptions:

-# The pixel intensities of an object do not change between consecutive frames.
-# Neighbouring pixels have similar motion.

Consider a pixel \f$I(x,y,t)\f$ in the first frame (note that a new dimension, time, is added
here; earlier we were working with still images, so there was no need for time). It moves by
distance \f$(dx,dy)\f$ in the next frame, taken after time \f$dt\f$. Since those pixels are the
same and the intensity does not change, we can say,

\f[I(x,y,t) = I(x+dx, y+dy, t+dt)\f]

Then take the Taylor series approximation of the right-hand side, remove common terms, and divide
by \f$dt\f$ to get the following equation:

\f[f_x u + f_y v + f_t = 0 \;\f]

where:

\f[f_x = \frac{\partial f}{\partial x} \; ; \; f_y = \frac{\partial f}{\partial y}\f]\f[u = \frac{dx}{dt} \; ; \; v = \frac{dy}{dt}\f]

The above equation is called the optical flow equation. In it, we can find \f$f_x\f$ and
\f$f_y\f$: they are image gradients. Similarly, \f$f_t\f$ is the gradient along time. But
\f$(u,v)\f$ is unknown. We cannot solve this one equation with two unknown variables, so several
methods have been proposed to solve this problem, and one of them is Lucas-Kanade.

### Lucas-Kanade method

We have seen the assumption before that all the neighbouring pixels have similar motion. The
Lucas-Kanade method takes a 3x3 patch around the point, so all 9 points have the same motion. We
can find \f$(f_x, f_y, f_t)\f$ for these 9 points. Our problem now becomes solving 9 equations
with two unknown variables, which is over-determined. A better solution is obtained with the
least-squares fit. Below is the final solution, a two-equation, two-unknown problem:

\f[\begin{bmatrix} u \\ v \end{bmatrix} =
\begin{bmatrix}
    \sum_{i}{f_{x_i}}^2 & \sum_{i}{f_{x_i} f_{y_i} } \\
    \sum_{i}{f_{x_i} f_{y_i}} & \sum_{i}{f_{y_i}}^2
\end{bmatrix}^{-1}
\begin{bmatrix}
    - \sum_{i}{f_{x_i} f_{t_i}} \\
    - \sum_{i}{f_{y_i} f_{t_i}}
\end{bmatrix}\f]

(Note the similarity of the inverse matrix to the Harris corner detector. It indicates that
corners are better points to track.)

So from the user's point of view the idea is simple: we give some points to track, and we receive
the optical flow vectors of those points. But again there are some problems. Until now, we were
dealing with small motions, so the method fails when there is large motion. The answer is again
pyramids: as we go up in the pyramid, small motions are removed and large motions become small
motions. Applying Lucas-Kanade there, we get optical flow along with the scale.

Lucas-Kanade Optical Flow in OpenCV.js
--------------------------------------

We use the function: **cv.calcOpticalFlowPyrLK (prevImg, nextImg, prevPts, nextPts, status, err, winSize =
new cv.Size(21, 21), maxLevel = 3, criteria = new cv.TermCriteria(cv.TermCriteria_COUNT+
cv.TermCriteria_EPS, 30, 0.01), flags = 0, minEigThreshold = 1e-4)**.
@param prevImg first 8-bit input image or pyramid constructed by buildOpticalFlowPyramid.
@param nextImg second input image or pyramid of the same size and the same type as prevImg.
@param prevPts vector of 2D points for which the flow needs to be found; point coordinates must
be single-precision floating-point numbers.
@param nextPts output vector of 2D points (with single-precision floating-point coordinates)
containing the calculated new positions of the input features in the second image; when the
cv.OPTFLOW_USE_INITIAL_FLOW flag is passed, the vector must have the same size as the input.
@param status output status vector (of unsigned chars); each element of the vector is set to 1
if the flow for the corresponding feature has been found; otherwise, it is set to 0.
@param err output vector of errors; each element of the vector is set to an error for the
corresponding feature; the type of the error measure can be set in the flags parameter; if the
flow wasn't found then the error is not defined (use the status parameter to find such cases).
@param winSize size of the search window at each pyramid level.
@param maxLevel 0-based maximal pyramid level number; if set to 0, pyramids are not used (single
level), if set to 1, two levels are used, and so on; if pyramids are passed to input then the
algorithm will use as many levels as the pyramids have, but no more than maxLevel.
@param criteria parameter specifying the termination criteria of the iterative search algorithm
(after the specified maximum number of iterations criteria.maxCount or when the search window
moves by less than criteria.epsilon).
@param flags operation flags:
- cv.OPTFLOW_USE_INITIAL_FLOW uses initial estimations stored in nextPts; if the flag is not set,
then prevPts is copied to nextPts and is considered the initial estimate.
- cv.OPTFLOW_LK_GET_MIN_EIGENVALS uses minimum eigenvalues as an error measure (see minEigThreshold
description); if the flag is not set, then the L1 distance between patches around the original and
the moved point, divided by the number of pixels in the window, is used as the error measure.
@param minEigThreshold the algorithm calculates the minimum eigenvalue of a 2x2 normal matrix of
optical flow equations, divided by the number of pixels in the window; if this value is less than
minEigThreshold, the corresponding feature is filtered out and its flow is not processed, which
allows bad points to be removed, giving a performance boost.
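A sketch of one tracking step with the default-style parameters from the signature above, assuming
`oldGray` and `frameGray` are consecutive grayscale frames and `p0` is a cv.Mat of points detected
earlier (for example with **cv.goodFeaturesToTrack()**):

```js
let p1 = new cv.Mat();
let st = new cv.Mat();  // status: 1 where flow was found
let err = new cv.Mat();
let winSize = new cv.Size(21, 21);
let criteria = new cv.TermCriteria(cv.TermCriteria_COUNT + cv.TermCriteria_EPS, 30, 0.01);
cv.calcOpticalFlowPyrLK(oldGray, frameGray, p0, p1, st, err, winSize, 3, criteria);

// keep only the points whose flow was actually found
for (let i = 0; i < st.rows; i++) {
    if (st.data[i] === 1) {
        let x = p1.data32F[i * 2];
        let y = p1.data32F[i * 2 + 1];
        // ... draw or store the tracked point (x, y) ...
    }
}
p1.delete(); st.delete(); err.delete();
```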

### Try it

\htmlonly
<iframe src="../../js_optical_flow_lucas_kanade.html" width="100%"
    onload="this.style.height=this.contentDocument.body.scrollHeight +'px';">
</iframe>
\endhtmlonly

(This code does not check how correct the next keypoints are. So even if a feature point
disappears from the image, optical flow may find a next point that merely looks close to it. For
robust tracking, corner points should therefore be re-detected at regular intervals.)

Dense Optical Flow in OpenCV.js
-------------------------------

The Lucas-Kanade method computes optical flow for a sparse feature set (in our example, corners
detected using the Shi-Tomasi algorithm). OpenCV.js provides another algorithm to find dense
optical flow: it computes the optical flow for all points in the frame. It is based on Gunnar
Farnebäck's algorithm, which is explained in "Two-Frame Motion Estimation Based on Polynomial
Expansion" by Gunnar Farnebäck in 2003.

We use the function: **cv.calcOpticalFlowFarneback (prev, next, flow, pyrScale, levels, winsize,
iterations, polyN, polySigma, flags)**
@param prev first 8-bit single-channel input image.
@param next second input image of the same size and the same type as prev.
@param flow computed flow image that has the same size as prev and type CV_32FC2.
@param pyrScale parameter specifying the image scale (<1) to build pyramids for each image;
pyrScale=0.5 means a classical pyramid, where each next layer is half the size of the previous one.
@param levels number of pyramid layers including the initial image; levels=1 means that no extra
layers are created and only the original images are used.
@param winsize averaging window size; larger values increase the algorithm's robustness to image
noise and give better chances of fast-motion detection, but yield a more blurred motion field.
@param iterations number of iterations the algorithm does at each pyramid level.
@param polyN size of the pixel neighborhood used to find the polynomial expansion in each pixel;
larger values mean that the image will be approximated with smoother surfaces, yielding a more
robust algorithm and a more blurred motion field; typically polyN = 5 or 7.
@param polySigma standard deviation of the Gaussian that is used to smooth the derivatives used as
a basis for the polynomial expansion; for polyN=5, you can set polySigma=1.1; for polyN=7, a good
value would be polySigma=1.5.
@param flags operation flags that can be a combination of the following:
- cv.OPTFLOW_USE_INITIAL_FLOW uses the input flow as an initial flow approximation.
- cv.OPTFLOW_FARNEBACK_GAUSSIAN uses a Gaussian \f$winsize \times winsize\f$ filter instead of a
box filter of the same size for optical flow estimation; usually this option gives a more accurate
flow than a box filter, at the cost of lower speed; normally, winsize for a Gaussian window should
be set to a larger value to achieve the same level of robustness.
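A minimal sketch with commonly used parameter values, assuming `prvs` and `next` are two
consecutive 8-bit single-channel frames:

```js
let flow = new cv.Mat();
cv.calcOpticalFlowFarneback(prvs, next, flow, 0.5, 3, 15, 3, 5, 1.2, 0);
// flow is CV_32FC2: floatPtr(y, x) yields the [dx, dy] displacement at (x, y)
let d = flow.floatPtr(10, 10);
console.log('displacement at (10, 10):', d[0], d[1]);
flow.delete();
```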

### Try it

\htmlonly
<iframe src="../../js_optical_flow_dense.html" width="100%"
    onload="this.style.height=this.contentDocument.body.scrollHeight +'px';">
</iframe>
\endhtmlonly

3rdparty/opencv-4.5.4/doc/js_tutorials/js_video/js_meanshift/js_meanshift.markdown (vendored, new file, 98 lines)
@@ -0,0 +1,98 @@
Meanshift and Camshift {#tutorial_js_meanshift}
======================

Goal
----

- We will learn about the Meanshift and Camshift algorithms to find and track objects in videos.

Meanshift
---------

The intuition behind meanshift is simple. Consider you have a set of points (it can be a pixel
distribution, such as a histogram backprojection). You are given a small window (perhaps a
circle), and you have to move that window to the area of maximum pixel density (or maximum number
of points). It is illustrated in the simple image below:

![image](images/meanshift_basics.jpg)

The initial window is shown as the blue circle named "C1". Its original center is marked by the
blue rectangle named "C1_o". But if you find the centroid of the points inside that window, you
get the point "C1_r" (marked by the small blue circle), which is the real centroid of the window.
Surely they don't match. So move your window such that the circle of the new window matches the
previous centroid. Again find the new centroid. Most probably, it won't match. So move it again,
and continue the iterations until the center of the window and its centroid fall at the same
location (or within a small desired error). What you finally obtain is a window of maximum pixel
distribution. It is marked with the green circle, named "C2". As you can see in the image, it has
the maximum number of points. The whole process is demonstrated on a static image below:

![image](images/meanshift_face.gif)

So we normally pass the histogram backprojected image and the initial target location. When the
object moves, the movement is obviously reflected in the histogram backprojected image. As a
result, the meanshift algorithm moves our window to the new location with maximum density.

### Meanshift in OpenCV.js

To use meanshift in OpenCV.js, we first need to set up the target and find its histogram so that
we can backproject the target onto each frame to calculate meanshift. We also need to provide the
initial location of the window. For the histogram, only hue is considered here. Also, to avoid
false values due to low light, low-light values are discarded using the **cv.inRange()** function.

We use the function: **cv.meanShift (probImage, window, criteria)**
@param probImage Back projection of the object histogram. See cv.calcBackProject for details.
@param window Initial search window.
@param criteria Stop criteria for the iterative search algorithm.
@return number of iterations meanShift took to converge and the new location
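A per-frame sketch, assuming `dst` already holds the hue back-projection of the current frame
(via **cv.calcBackProject()**), `trackWindow` is a cv.Rect with the previous location, and `frame`
is the frame being annotated:

```js
let termCrit = new cv.TermCriteria(cv.TermCriteria_EPS + cv.TermCriteria_COUNT, 10, 1);
// meanShift returns the iteration count and the shifted window
let [iters, newWindow] = cv.meanShift(dst, trackWindow, termCrit);
trackWindow = newWindow;

// draw the tracked window on the frame
let p1 = new cv.Point(trackWindow.x, trackWindow.y);
let p2 = new cv.Point(trackWindow.x + trackWindow.width, trackWindow.y + trackWindow.height);
cv.rectangle(frame, p1, p2, [255, 0, 0, 255], 2);
```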

### Try it

\htmlonly
<iframe src="../../js_meanshift.html" width="100%"
    onload="this.style.height=this.contentDocument.body.scrollHeight +'px';">
</iframe>
\endhtmlonly

Camshift
--------

Did you closely watch the last result? There is a problem: our window always has the same size,
whether the object is very close to the camera or far away. That is not good. We need to adapt the
window size to the size and rotation of the target. Once again, the solution came from "OpenCV
Labs", and it is called CAMshift (Continuously Adaptive Meanshift), published by Gary Bradski in
his 1998 paper "Computer Vision Face Tracking for Use in a Perceptual User Interface".

It applies meanshift first. Once meanshift converges, it updates the size of the window as
\f$s = 2 \times \sqrt{\frac{M_{00}}{256}}\f$. It also calculates the orientation of the
best-fitting ellipse. It then applies meanshift again with the new scaled search window and the
previous window location. The process continues until the required accuracy is met.

![image](images/camshift_face.gif)

### Camshift in OpenCV.js

It is almost the same as meanshift, but it returns a rotated rectangle (that is our result) and
box parameters (to be passed as the search window in the next iteration).

We use the function: **cv.CamShift (probImage, window, criteria)**
@param probImage Back projection of the object histogram. See cv.calcBackProject for details.
@param window Initial search window.
@param criteria Stop criteria for the iterative search algorithm.
@return Rotated rectangle and the new search window
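The per-frame call mirrors the meanshift sketch above; here `dst`, `trackWindow`, `termCrit`, and
`frame` are assumed to be set up the same way:

```js
let [rotatedRect, newWindow] = cv.CamShift(dst, trackWindow, termCrit);
trackWindow = newWindow; // becomes the search window for the next frame

// draw the rotated rectangle returned by CamShift
let pts = cv.rotatedRectPoints(rotatedRect);
cv.line(frame, pts[0], pts[1], [255, 0, 0, 255], 3);
cv.line(frame, pts[1], pts[2], [255, 0, 0, 255], 3);
cv.line(frame, pts[2], pts[3], [255, 0, 0, 255], 3);
cv.line(frame, pts[3], pts[0], [255, 0, 0, 255], 3);
```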

### Try it

\htmlonly
<iframe src="../../js_camshift.html" width="100%"
    onload="this.style.height=this.contentDocument.body.scrollHeight +'px';">
</iframe>
\endhtmlonly

Additional Resources
--------------------

-# French Wikipedia page on [Camshift](http://fr.wikipedia.org/wiki/Camshift). (The two animations
are taken from there.)
-# Bradski, G.R., "Real time face and object tracking as a component of a perceptual user
interface," Applications of Computer Vision, 1998. WACV '98. Proceedings., Fourth IEEE Workshop
on, vol., no., pp. 214-219, 19-21 Oct 1998

3rdparty/opencv-4.5.4/doc/js_tutorials/js_video/js_table_of_contents_video.markdown (vendored, new file, 17 lines)
@@ -0,0 +1,17 @@
Video Analysis {#tutorial_js_table_of_contents_video}
==============

- @subpage tutorial_js_meanshift

    Here we will learn about tracking algorithms such as "Meanshift" and its upgraded version
    "Camshift", used to find and track objects in videos.

- @subpage tutorial_js_lucas_kanade

    Now let's discuss an important concept, "Optical Flow", which is related to videos and has
    many applications.

- @subpage tutorial_js_bg_subtraction

    In several applications we need to extract the foreground for further operations like object
    tracking. Background subtraction is a well-known method for those cases.