feat: switch backend to PaddleOCR-NCNN, switch project to CMake

1. Migrated the project backend to the PaddleOCR-NCNN algorithm; it has passed basic compatibility tests.
2. The project is now organized with CMake; to better accommodate third-party libraries, a QMake project will no longer be provided.
3. Reorganized the copyright/notice files and the code tree to minimize the risk of license infringement.

Log: switch backend to PaddleOCR-NCNN, switch project to CMake
Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/brief.jpg  (new file, 4.7 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/fast_icon.jpg  (new file, 3.1 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/features_icon.jpg  (new file, 4.7 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/harris_icon.jpg  (new file, 2.8 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/homography_icon.jpg  (new file, 4.6 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/matching.jpg  (new file, 5.4 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/orb.jpg  (new file, 7.1 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/shi_icon.jpg  (new file, 3.7 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/sift_icon.jpg  (new file, 3.4 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/images/surf_icon.jpg  (new file, 3.4 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_brief/py_brief.markdown  (new file, 92 lines)
@@ -0,0 +1,92 @@
BRIEF (Binary Robust Independent Elementary Features) {#tutorial_py_brief}
=====================================================

Goal
----

In this chapter
-   We will see the basics of the BRIEF algorithm

Theory
------

We know SIFT uses a 128-dimensional vector for descriptors. Since it uses floating point numbers, it takes
basically 512 bytes. Similarly, SURF takes a minimum of 256 bytes (for 64 dimensions). Creating such a
vector for thousands of features takes a lot of memory, which is not feasible for resource-constrained
applications, especially embedded systems. The larger the memory, the longer matching takes.

But all these dimensions may not be needed for actual matching. We can compress them using several
methods like PCA, LDA, etc. Even other methods, like hashing with LSH (Locality Sensitive Hashing), are
used to convert these SIFT descriptors from floating point numbers to binary strings. These binary
strings are used to match features using the Hamming distance. This provides a better speed-up because
computing the Hamming distance is just an XOR and a bit count, which are very fast on modern CPUs with
SSE instructions. But here we still need to find the descriptors first; only then can we apply hashing,
which doesn't solve our initial problem of memory.

BRIEF comes into the picture at this moment. It provides a shortcut to find the binary strings directly
without finding descriptors. It takes a smoothed image patch and selects a set of \f$n_d\f$ (x,y)
location pairs in a unique way (explained in the paper). Then some pixel intensity comparisons are done
on these location pairs. For example, let the first location pair be \f$p\f$ and \f$q\f$. If \f$I(p) < I(q)\f$, then the
result is 1, else it is 0. This is applied to all the \f$n_d\f$ location pairs to get an
\f$n_d\f$-dimensional bitstring.

This \f$n_d\f$ can be 128, 256 or 512. OpenCV supports all of these, but by default it is 256
(OpenCV represents it in bytes, so the values will be 16, 32 and 64). Once you have this, you can
use the Hamming distance to match these descriptors.
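As a toy illustration of the idea (a sketch only, not OpenCV's implementation; the random sampling pattern below is an assumption, whereas the paper selects the location pairs in a specific way):
@code{.py}
import numpy as np

rng = np.random.default_rng(0)

def toy_brief(patch, pairs):
    # one bit per (p, q) pair: 1 if I(p) < I(q), else 0
    return np.array([patch[p] < patch[q] for p, q in pairs], dtype=np.uint8)

# 256 random (p, q) location pairs inside a 31x31 smoothed patch
pairs = [((rng.integers(31), rng.integers(31)),
          (rng.integers(31), rng.integers(31))) for _ in range(256)]

patch1 = rng.integers(0, 256, (31, 31))
patch2 = rng.integers(0, 256, (31, 31))

d1, d2 = toy_brief(patch1, pairs), toy_brief(patch2, pairs)

# Hamming distance = XOR followed by a bit count
print( np.count_nonzero(d1 ^ d2) )
@endcode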
|
||||
|
||||
One important point is that BRIEF is a feature descriptor, it doesn't provide any method to find the
|
||||
features. So you will have to use any other feature detectors like SIFT, SURF etc. The paper
|
||||
recommends to use CenSurE which is a fast detector and BRIEF works even slightly better for CenSurE
|
||||
points than for SURF points.
|
||||
|
||||
In short, BRIEF is a faster method feature descriptor calculation and matching. It also provides
|
||||
high recognition rate unless there is large in-plane rotation.
|
||||
|
||||
STAR(CenSurE) in OpenCV
|
||||
------
|
||||
STAR is a feature detector derived from CenSurE.
|
||||
Unlike CenSurE however, which uses polygons like squares, hexagons and octagons to approach a circle,
|
||||
Star emulates a circle with 2 overlapping squares: 1 upright and 1 45-degree rotated. These polygons are bi-level.
|
||||
They can be seen as polygons with thick borders. The borders and the enclosed area have weights of opposing signs.
|
||||
This has better computational characteristics than other scale-space detectors and it is capable of real-time implementation.
|
||||
In contrast to SIFT and SURF, which find extrema at sub-sampled pixels that compromises accuracy at larger scales,
|
||||
CenSurE creates a feature vector using full spatial resolution at all scales in the pyramid.
|
||||
BRIEF in OpenCV
|
||||
---------------
|
||||
|
||||
Below code shows the computation of BRIEF descriptors with the help of CenSurE detector.
|
||||
|
||||
note, that you need [opencv contrib](https://github.com/opencv/opencv_contrib)) to use this.
|
||||
@code{.py}
|
||||
import numpy as np
|
||||
import cv2 as cv
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv.imread('simple.jpg',0)
|
||||
|
||||
# Initiate FAST detector
|
||||
star = cv.xfeatures2d.StarDetector_create()
|
||||
|
||||
# Initiate BRIEF extractor
|
||||
brief = cv.xfeatures2d.BriefDescriptorExtractor_create()
|
||||
|
||||
# find the keypoints with STAR
|
||||
kp = star.detect(img,None)
|
||||
|
||||
# compute the descriptors with BRIEF
|
||||
kp, des = brief.compute(img, kp)
|
||||
|
||||
print( brief.descriptorSize() )
|
||||
print( des.shape )
|
||||
@endcode
|
||||
The function brief.getDescriptorSize() gives the \f$n_d\f$ size used in bytes. By default it is 32. Next one
|
||||
is matching, which will be done in another chapter.
|
||||
|
||||
Additional Resources
|
||||
--------------------
|
||||
|
||||
-# Michael Calonder, Vincent Lepetit, Christoph Strecha, and Pascal Fua, "BRIEF: Binary Robust
|
||||
Independent Elementary Features", 11th European Conference on Computer Vision (ECCV), Heraklion,
|
||||
Crete. LNCS Springer, September 2010.
|
||||
2. [LSH (Locality Sensitive Hashing)](https://en.wikipedia.org/wiki/Locality-sensitive_hashing) at wikipedia.
|
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_fast/images/fast_eqns.jpg  (new file, 6.2 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_fast/images/fast_kp.jpg  (new file, 25 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_fast/images/fast_speedtest.jpg  (new file, 17 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_fast/py_fast.markdown  (new file, 143 lines)
@@ -0,0 +1,143 @@
FAST Algorithm for Corner Detection {#tutorial_py_fast}
===================================

Goal
----

In this chapter,
-   We will understand the basics of the FAST algorithm
-   We will find corners using OpenCV functionalities for the FAST algorithm.

Theory
------

We saw several feature detectors and many of them are really good. But when looking from a real-time
application point of view, they are not fast enough. One good example is a SLAM (Simultaneous
Localization and Mapping) mobile robot, which has limited computational resources.

As a solution to this, the FAST (Features from Accelerated Segment Test) algorithm was proposed by
Edward Rosten and Tom Drummond in their paper "Machine learning for high-speed corner detection" in
2006 (later revised in 2010). A basic summary of the algorithm is presented below. Refer to the original
paper for more details (all the images are taken from the original paper).

### Feature Detection using FAST

-#  Select a pixel \f$p\f$ in the image which is to be identified as an interest point or not. Let its
    intensity be \f$I_p\f$.
2.  Select an appropriate threshold value \f$t\f$.
3.  Consider a circle of 16 pixels around the pixel under test. (See the image below)

    

-#  Now the pixel \f$p\f$ is a corner if there exists a set of \f$n\f$ contiguous pixels in the circle (of
    16 pixels) which are all brighter than \f$I_p + t\f$, or all darker than \f$I_p - t\f$ (shown as white
    dashed lines in the above image). \f$n\f$ was chosen to be 12.
5.  A **high-speed test** was proposed to exclude a large number of non-corners. This test examines
    only the four pixels at 1, 9, 5 and 13 (first, 1 and 9 are tested to check whether they are too bright or too
    dark; if so, 5 and 13 are checked). If \f$p\f$ is a corner, then at least three of these must all
    be brighter than \f$I_p + t\f$ or darker than \f$I_p - t\f$. If neither of these is the case, then \f$p\f$
    cannot be a corner. The full segment test criterion can then be applied to the passed candidates
    by examining all pixels in the circle. This detector in itself exhibits high performance, but
    there are several weaknesses:

    -   It does not reject as many candidates for n \< 12.
    -   The choice of pixels is not optimal because its efficiency depends on the ordering of the
        questions and the distribution of corner appearances.
    -   Results of the high-speed tests are thrown away.
    -   Multiple features are detected adjacent to one another.

The first 3 points are addressed with a machine learning approach. The last one is addressed using
non-maximal suppression.

### Machine Learning a Corner Detector

-#  Select a set of images for training (preferably from the target application domain)
2.  Run the FAST algorithm on every image to find feature points.
3.  For every feature point, store the 16 pixels around it as a vector. Do it for all the images to
    get the feature vector \f$P\f$.
4.  Each pixel (say \f$x\f$) in these 16 pixels can have one of the following three states:

    

-#  Depending on these states, the feature vector \f$P\f$ is subdivided into 3 subsets, \f$P_d\f$, \f$P_s\f$,
    \f$P_b\f$.
6.  Define a new boolean variable, \f$K_p\f$, which is true if \f$p\f$ is a corner and false otherwise.
7.  Use the ID3 algorithm (decision tree classifier) to query each subset using the variable \f$K_p\f$
    for the knowledge about the true class. It selects the \f$x\f$ which yields the most information
    about whether the candidate pixel is a corner, measured by the entropy of \f$K_p\f$.
8.  This is recursively applied to all the subsets until the entropy is zero.
9.  The decision tree so created is used for fast detection in other images.

### Non-maximal Suppression

Detecting multiple interest points in adjacent locations is another problem. It is solved by using
non-maximum suppression.

-#  Compute a score function, \f$V\f$, for all the detected feature points. \f$V\f$ is the sum of the absolute
    differences between \f$p\f$ and the 16 surrounding pixel values (a small sketch of this score follows the list).
2.  Consider two adjacent keypoints and compute their \f$V\f$ values.
3.  Discard the one with the lower \f$V\f$ value.
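A minimal sketch of this score for a single pixel (for illustration only, not OpenCV's implementation; the 16 circle offsets below are an assumed radius-3 Bresenham circle):
@code{.py}
import numpy as np

# assumed 16 offsets on a radius-3 Bresenham circle, clockwise from the top
CIRCLE = [(0,3),(1,3),(2,2),(3,1),(3,0),(3,-1),(2,-2),(1,-3),
          (0,-3),(-1,-3),(-2,-2),(-3,-1),(-3,0),(-3,1),(-2,2),(-1,3)]

def fast_score(img, y, x):
    # V = sum of |I(p) - I(c)| over the 16 circle pixels c
    p = int(img[y, x])
    return sum(abs(p - int(img[y + dy, x + dx])) for dx, dy in CIRCLE)

img = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
print( fast_score(img, 16, 16) )
@endcode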
### Summary

It is several times faster than other existing corner detectors.

But it is not robust to high levels of noise, and it is dependent on a threshold.

FAST Feature Detector in OpenCV
-------------------------------

It is called like any other feature detector in OpenCV. If you want, you can specify the threshold,
whether non-maximum suppression is to be applied or not, the neighbourhood to be used, etc.

For the neighbourhood, three flags are defined: cv.FAST_FEATURE_DETECTOR_TYPE_5_8,
cv.FAST_FEATURE_DETECTOR_TYPE_7_12 and cv.FAST_FEATURE_DETECTOR_TYPE_9_16. Below is a
simple code example showing how to detect and draw the FAST feature points.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('blox.jpg',0) # `<opencv_root>/samples/data/blox.jpg`

# Initiate FAST object with default values
fast = cv.FastFeatureDetector_create()

# find and draw the keypoints
kp = fast.detect(img,None)
img2 = cv.drawKeypoints(img, kp, None, color=(255,0,0))

# Print all default params
print( "Threshold: {}".format(fast.getThreshold()) )
print( "nonmaxSuppression:{}".format(fast.getNonmaxSuppression()) )
print( "neighborhood: {}".format(fast.getType()) )
print( "Total Keypoints with nonmaxSuppression: {}".format(len(kp)) )

cv.imwrite('fast_true.png', img2)

# Disable nonmaxSuppression
fast.setNonmaxSuppression(0)
kp = fast.detect(img, None)

print( "Total Keypoints without nonmaxSuppression: {}".format(len(kp)) )

img3 = cv.drawKeypoints(img, kp, None, color=(255,0,0))

cv.imwrite('fast_false.png', img3)
@endcode
See the results. The first image shows FAST with nonmaxSuppression and the second one without
nonmaxSuppression:


Additional Resources
--------------------

-#  Edward Rosten and Tom Drummond, "Machine learning for high speed corner detection" in 9th
    European Conference on Computer Vision, vol. 1, 2006, pp. 430-443.
2.  Edward Rosten, Reid Porter, and Tom Drummond, "Faster and better: a machine learning approach to
    corner detection" in IEEE Trans. Pattern Analysis and Machine Intelligence, 2010, vol. 32, pp.
    105-119.

Exercises
---------
@@ -0,0 +1,110 @@
Feature Matching + Homography to find Objects {#tutorial_py_feature_homography}
=============================================

Goal
----

In this chapter,
-   We will mix up feature matching with findHomography from the calib3d module to find known
    objects in a complex image.

Basics
------

So what did we do in the last session? We used a queryImage, found some feature points in it, we took
another trainImage, found the features in that image too, and we found the best matches among them.
In short, we found the locations of some parts of an object in another cluttered image. This information
is sufficient to find the object exactly in the trainImage.

For that, we can use a function from the calib3d module, i.e. **cv.findHomography()**. If we pass the set
of points from both the images, it will find the perspective transformation of that object. Then we
can use **cv.perspectiveTransform()** to find the object. It needs at least four correct points to
find the transformation.

We have seen that there can be some possible errors while matching which may affect the result. To
solve this problem, the algorithm uses RANSAC or LEAST_MEDIAN (which can be decided by the flags).
Good matches which provide a correct estimation are called inliers and the remaining ones are called outliers.
**cv.findHomography()** returns a mask which specifies the inlier and outlier points.

So let's do it!

Code
----

First, as usual, let's find SIFT features in the images and apply the ratio test to find the best
matches.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

MIN_MATCH_COUNT = 10

img1 = cv.imread('box.png',0)          # queryImage
img2 = cv.imread('box_in_scene.png',0) # trainImage

# Initiate SIFT detector
sift = cv.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)

FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks = 50)

flann = cv.FlannBasedMatcher(index_params, search_params)

matches = flann.knnMatch(des1,des2,k=2)

# store all the good matches as per Lowe's ratio test.
good = []
for m,n in matches:
    if m.distance < 0.7*n.distance:
        good.append(m)
@endcode
Now we set a condition that at least 10 matches (defined by MIN_MATCH_COUNT) must be present to
find the object. Otherwise we simply show a message saying that not enough matches are present.

If enough matches are found, we extract the locations of the matched keypoints in both images. They
are passed to find the perspective transformation. Once we get this 3x3 transformation matrix, we use
it to transform the corners of the queryImage to the corresponding points in the trainImage. Then we draw it.
@code{.py}
if len(good)>MIN_MATCH_COUNT:
    src_pts = np.float32([ kp1[m.queryIdx].pt for m in good ]).reshape(-1,1,2)
    dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good ]).reshape(-1,1,2)

    M, mask = cv.findHomography(src_pts, dst_pts, cv.RANSAC,5.0)
    matchesMask = mask.ravel().tolist()

    h,w = img1.shape   # img1 is grayscale, so there is no channel dimension
    pts = np.float32([ [0,0],[0,h-1],[w-1,h-1],[w-1,0] ]).reshape(-1,1,2)
    dst = cv.perspectiveTransform(pts,M)

    img2 = cv.polylines(img2,[np.int32(dst)],True,255,3, cv.LINE_AA)

else:
    print( "Not enough matches are found - {}/{}".format(len(good), MIN_MATCH_COUNT) )
    matchesMask = None
@endcode
Finally we draw our inliers (if the object was successfully found) or the matching keypoints (if it failed).
@code{.py}
draw_params = dict(matchColor = (0,255,0), # draw matches in green color
                   singlePointColor = None,
                   matchesMask = matchesMask, # draw only inliers
                   flags = 2)

img3 = cv.drawMatches(img1,kp1,img2,kp2,good,None,**draw_params)

plt.imshow(img3, 'gray'),plt.show()
@endcode
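As a small follow-up (a sketch; it assumes the if-branch above ran, i.e. that `mask` returned by **cv.findHomography()** exists), you can also report how many of the good matches RANSAC accepted as inliers:
@code{.py}
if matchesMask is not None:
    # mask entries are 1 for inliers and 0 for outliers
    print( "RANSAC inliers: {}/{}".format(int(np.sum(mask)), len(good)) )
@endcode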
See the result below. The object is marked in white in the cluttered image:



Additional Resources
--------------------

Exercises
---------
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_features_harris/images/harris_region.jpg  (new file, 17 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_features_harris/images/harris_result.jpg  (new file, 34 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_features_harris/images/subpixel3.png  (new file, 16 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_features_harris/py_features_harris.markdown  (new file, 150 lines)
@@ -0,0 +1,150 @@
Harris Corner Detection {#tutorial_py_features_harris}
=======================

Goal
----

In this chapter,

-   We will understand the concepts behind Harris Corner Detection.
-   We will see the following functions: **cv.cornerHarris()**, **cv.cornerSubPix()**

Theory
------

In the last chapter, we saw that corners are regions in the image with large variation in intensity in
all directions. One early attempt to find these corners was made by **Chris Harris & Mike
Stephens** in their paper **A Combined Corner and Edge Detector** in 1988, so it is now called
the Harris Corner Detector. They took this simple idea to a mathematical form. It basically finds the
difference in intensity for a displacement of \f$(u,v)\f$ in all directions. This is expressed as below:

\f[E(u,v) = \sum_{x,y} \underbrace{w(x,y)}_\text{window function} \, [\underbrace{I(x+u,y+v)}_\text{shifted intensity}-\underbrace{I(x,y)}_\text{intensity}]^2\f]

The window function is either a rectangular window or a Gaussian window which gives weights to the pixels
underneath.

We have to maximize this function \f$E(u,v)\f$ for corner detection. That means we have to maximize the
second term. Applying a Taylor expansion to the above equation and using some mathematical steps (please
refer to any standard textbook you like for the full derivation), we get the final equation as:

\f[E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}\f]

where

\f[M = \sum_{x,y} w(x,y) \begin{bmatrix}I_x I_x & I_x I_y \\
                                        I_x I_y & I_y I_y \end{bmatrix}\f]

Here, \f$I_x\f$ and \f$I_y\f$ are the image derivatives in the x and y directions respectively. (They can easily be found
using **cv.Sobel()**).

Then comes the main part. After this, they created a score, basically an equation, which
determines if a window can contain a corner or not.

\f[R = \det(M) - k(\operatorname{trace}(M))^2\f]

where
-   \f$\det(M) = \lambda_1 \lambda_2\f$
-   \f$\operatorname{trace}(M) = \lambda_1 + \lambda_2\f$
-   \f$\lambda_1\f$ and \f$\lambda_2\f$ are the eigenvalues of \f$M\f$

So the magnitudes of these eigenvalues decide whether a region is a corner, an edge, or flat.

-   When \f$|R|\f$ is small, which happens when \f$\lambda_1\f$ and \f$\lambda_2\f$ are small, the region is
    flat.
-   When \f$R<0\f$, which happens when \f$\lambda_1 >> \lambda_2\f$ or vice versa, the region is an edge.
-   When \f$R\f$ is large, which happens when \f$\lambda_1\f$ and \f$\lambda_2\f$ are large and
    \f$\lambda_1 \sim \lambda_2\f$, the region is a corner.

It can be represented in a nice picture as follows:



So the result of Harris Corner Detection is a grayscale image with these scores. Thresholding for a
suitable score gives you the corners in the image. We will do it with a simple image.
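To make the formula concrete, here is a minimal sketch that computes the score map by hand with Sobel derivatives and a box window, using the same chessboard.png as the example below (for illustration only; **cv.cornerHarris()** is what you would actually use, and its implementation details differ):
@code{.py}
import numpy as np
import cv2 as cv

gray = cv.imread('chessboard.png', cv.IMREAD_GRAYSCALE).astype(np.float32)

# image derivatives I_x and I_y
Ix = cv.Sobel(gray, cv.CV_32F, 1, 0, ksize=3)
Iy = cv.Sobel(gray, cv.CV_32F, 0, 1, ksize=3)

# elements of M, summed over a 2x2 window (box filter as the window function w)
Ixx = cv.boxFilter(Ix*Ix, -1, (2,2), normalize=False)
Ixy = cv.boxFilter(Ix*Iy, -1, (2,2), normalize=False)
Iyy = cv.boxFilter(Iy*Iy, -1, (2,2), normalize=False)

# R = det(M) - k*trace(M)^2
k = 0.04
R = (Ixx*Iyy - Ixy*Ixy) - k*(Ixx + Iyy)**2

print( R.max() )
@endcode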
Harris Corner Detector in OpenCV
--------------------------------

OpenCV has the function **cv.cornerHarris()** for this purpose. Its arguments are:

-   **img** - Input image. It should be grayscale and of float32 type.
-   **blockSize** - The size of the neighbourhood considered for corner detection
-   **ksize** - Aperture parameter of the Sobel derivative used.
-   **k** - Harris detector free parameter in the equation.

See the example below:
@code{.py}
import numpy as np
import cv2 as cv

filename = 'chessboard.png'
img = cv.imread(filename)
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)

gray = np.float32(gray)
dst = cv.cornerHarris(gray,2,3,0.04)

#result is dilated for marking the corners, not important
dst = cv.dilate(dst,None)

# Threshold for an optimal value, it may vary depending on the image.
img[dst>0.01*dst.max()]=[0,0,255]

cv.imshow('dst',img)
if cv.waitKey(0) & 0xff == 27:
    cv.destroyAllWindows()
@endcode
Below are the three results:



Corner with SubPixel Accuracy
-----------------------------

Sometimes, you may need to find the corners with maximum accuracy. OpenCV comes with a function
**cv.cornerSubPix()** which further refines the detected corners with sub-pixel accuracy. Below is
an example. As usual, we need to find the Harris corners first. Then we pass the centroids of these
corners (there may be a bunch of pixels at a corner; we take their centroid) to refine them. Harris
corners are marked in red pixels and refined corners are marked in green pixels. For this function,
we have to define the criteria for when to stop the iteration. We stop it after a specified number of
iterations or when a certain accuracy is achieved, whichever occurs first. We also need to define the size
of the neighbourhood it searches for corners.
@code{.py}
import numpy as np
import cv2 as cv

filename = 'chessboard2.jpg'
img = cv.imread(filename)
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)

# find Harris corners
gray = np.float32(gray)
dst = cv.cornerHarris(gray,2,3,0.04)
dst = cv.dilate(dst,None)
ret, dst = cv.threshold(dst,0.01*dst.max(),255,0)
dst = np.uint8(dst)

# find centroids
ret, labels, stats, centroids = cv.connectedComponentsWithStats(dst)

# define the criteria to stop and refine the corners
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 100, 0.001)
corners = cv.cornerSubPix(gray,np.float32(centroids),(5,5),(-1,-1),criteria)

# Now draw them
res = np.hstack((centroids,corners))
res = np.int0(res)
img[res[:,1],res[:,0]]=[0,0,255]
img[res[:,3],res[:,2]] = [0,255,0]

cv.imwrite('subpixel5.png',img)
@endcode
Below is the result, where some important locations are shown in a zoomed window for visualization:



Additional Resources
--------------------

Exercises
---------
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_features_meaning/images/feature_building.jpg  (new file, 49 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_features_meaning/images/feature_simple.png  (new file, 1.0 KiB)
@@ -0,0 +1,89 @@
Understanding Features {#tutorial_py_features_meaning}
======================

Goal
----

In this chapter, we will just try to understand what features are, why they are important, why
corners are important, etc.

Explanation
-----------

Most of you will have played jigsaw puzzle games. You get a lot of small pieces of an image,
and you need to assemble them correctly to form a big real image. **The question is, how do you do
it?** What about projecting the same theory onto a computer program so that the computer can play
jigsaw puzzles? If the computer can play jigsaw puzzles, why can't we give it a lot of real-life images
of nice natural scenery and tell it to stitch all those images into one big image?
If the computer can stitch several natural images into one, what about giving it a lot of pictures of a
building or any structure and telling the computer to create a 3D model out of them?

Well, the questions and imaginations continue. But it all depends on the most basic question: how do
you play jigsaw puzzles? How do you arrange lots of scrambled image pieces into a big single image?
How can you stitch a lot of natural images into a single image?

The answer is that we are looking for specific patterns or specific features which are unique, can
be easily tracked and can be easily compared. If we go for a definition of such a feature, we may
find it difficult to express it in words, but we know what they are. If someone asks you to point
out one good feature which can be compared across several images, you can point one out. That is
why even small children can simply play these games. We search for these features in an image,
find them, look for the same features in other images and align them. That's it. (In a jigsaw puzzle,
we look more at the continuity of the different pieces.) All these abilities are present in us inherently.

So our one basic question expands into several, but they become more specific. **What are these
features?** (The answer should be understandable to a computer as well.)

It is difficult to say how humans find these features. This is already programmed into our brain.
But if we look deep into some pictures and search for different patterns, we will find something
interesting. For example, take the image below:



The image is very simple. At the top of the image, six small image patches are given. The question for you is to
find the exact location of these patches in the original image. How many correct results can you
find?

A and B are flat surfaces and they are spread over a lot of area. It is difficult to find the exact
location of these patches.

C and D are much simpler. They are edges of the building. You can find an approximate location,
but the exact location is still difficult. This is because the pattern is the same everywhere along the edge.
Across the edge, however, it is different. An edge is therefore a better feature compared to a flat area, but
not good enough (it is good in a jigsaw puzzle for comparing the continuity of edges).

Finally, E and F are some corners of the building. And they can be easily found, because at the
corners, wherever you move this patch, it will look different. So they can be considered good
features. So now we move to a simpler (and widely used) image for better understanding.



Just like above, the blue patch is a flat area and difficult to find and track. Wherever you move the blue
patch, it looks the same. The black patch has an edge. If you move it in the vertical direction (i.e.
along the gradient) it changes. Moved along the edge (parallel to the edge), it looks the same. And the
red patch is a corner. Wherever you move the patch, it looks different, which means it is unique. So
basically, corners are considered to be good features in an image. (Not just corners; in some cases
blobs are considered good features.)

So now we have answered our question, "what are these features?". But the next question arises: how do we
find them? Or how do we find the corners? We answered that in an intuitive way, i.e. look for
the regions in images which have maximum variation when moved (by a small amount) in all regions
around them. This will be projected into computer language in the coming chapters. So finding these image
features is called **Feature Detection**.

We found the features in the images. Once you have found them, you should be able to find the same ones
in other images. How is this done? We take a region around the feature, we describe it in our own
words, like "the upper part is blue sky, the lower part is a region from a building, on that building there is
glass, etc." and you search for the same area in the other images. Basically, you are describing the
feature. Similarly, a computer should also describe the region around the feature so that it can
find it in other images. This description is called **Feature Description**. Once you have the
features and their descriptions, you can find the same features in all images and align them, stitch them together
or do whatever you want.

So in this module, we are looking at different algorithms in OpenCV to find features, describe them,
match them, etc.

Additional Resources
--------------------

Exercises
---------
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_matcher/images/matcher_flann.jpg  (new file, 34 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_matcher/images/matcher_result1.jpg  (new file, 31 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_matcher/images/matcher_result2.jpg  (new file, 22 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_matcher/py_matcher.markdown  (new file, 217 lines)
@@ -0,0 +1,217 @@
Feature Matching {#tutorial_py_matcher}
================

Goal
----

In this chapter
-   We will see how to match features in one image with others.
-   We will use the Brute-Force matcher and the FLANN matcher in OpenCV

Basics of Brute-Force Matcher
-----------------------------

The Brute-Force matcher is simple. It takes the descriptor of one feature in the first set and matches it
with all the other features in the second set using some distance calculation. The closest one is
returned.

For the BF matcher, first we have to create the BFMatcher object using **cv.BFMatcher()**. It takes two
optional params. The first one is normType. It specifies the distance measurement to be used. By
default, it is cv.NORM_L2. It is good for SIFT, SURF, etc. (cv.NORM_L1 is also there). For binary
string based descriptors like ORB, BRIEF, BRISK, etc., cv.NORM_HAMMING should be used, which uses
the Hamming distance as the measurement. If ORB is using WTA_K == 3 or 4, cv.NORM_HAMMING2 should be
used.

The second param is a boolean variable, crossCheck, which is false by default. If it is true, the Matcher
returns only those matches with value (i,j) such that the i-th descriptor in set A has the j-th descriptor
in set B as the best match and vice-versa. That is, the two features in both sets should match each
other. It provides consistent results, and is a good alternative to the ratio test proposed by D. Lowe in
the SIFT paper.

Once it is created, two important methods are *BFMatcher.match()* and *BFMatcher.knnMatch()*. The first
one returns the best match. The second method returns the k best matches, where k is specified by the user.
It may be useful when we need to do additional work on them.

Like we used cv.drawKeypoints() to draw keypoints, **cv.drawMatches()** helps us draw the
matches. It stacks two images horizontally and draws lines from the first image to the second image showing
the best matches. There is also **cv.drawMatchesKnn** which draws all the k best matches. If k=2, it
will draw two match-lines for each keypoint. So we have to pass a mask if we want to draw them
selectively.

Let's see one example for each of SIFT and ORB (both use different distance measurements).

### Brute-Force Matching with ORB Descriptors

Here, we will see a simple example of how to match features between two images. In this case, I have
a queryImage and a trainImage. We will try to find the queryImage in the trainImage using feature
matching. (The images are /samples/data/box.png and /samples/data/box_in_scene.png)

We are using ORB descriptors to match features. So let's start with loading the images, finding
descriptors, etc.
@code{.py}
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

img1 = cv.imread('box.png',cv.IMREAD_GRAYSCALE)          # queryImage
img2 = cv.imread('box_in_scene.png',cv.IMREAD_GRAYSCALE) # trainImage

# Initiate ORB detector
orb = cv.ORB_create()

# find the keypoints and descriptors with ORB
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
@endcode
Next we create a BFMatcher object with the distance measurement cv.NORM_HAMMING (since we are using
ORB) and crossCheck switched on for better results. Then we use the Matcher.match() method to get the
best matches between the two images. We sort them in ascending order of their distances so that the best matches
(with low distance) come to the front. Then we draw only the first 10 matches (just for the sake of visibility;
you can increase it as you like).
@code{.py}
# create BFMatcher object
bf = cv.BFMatcher(cv.NORM_HAMMING, crossCheck=True)

# Match descriptors.
matches = bf.match(des1,des2)

# Sort them in the order of their distance.
matches = sorted(matches, key = lambda x:x.distance)

# Draw first 10 matches.
img3 = cv.drawMatches(img1,kp1,img2,kp2,matches[:10],None,flags=cv.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)

plt.imshow(img3),plt.show()
@endcode
Below is the result I got:



### What is this Matcher Object?

The result of the matches = bf.match(des1,des2) line is a list of DMatch objects. A DMatch object has the
following attributes (a small example follows the list):

-   DMatch.distance - Distance between descriptors. The lower, the better it is.
-   DMatch.trainIdx - Index of the descriptor in the train descriptors
-   DMatch.queryIdx - Index of the descriptor in the query descriptors
-   DMatch.imgIdx - Index of the train image.
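For instance, continuing from the ORB example above (a small sketch; it assumes `matches` is the sorted list returned by bf.match()), the best match can be inspected like this:
@code{.py}
best = matches[0]
print( "distance: {}".format(best.distance) )
print( "queryIdx: {} (index into kp1/des1)".format(best.queryIdx) )
print( "trainIdx: {} (index into kp2/des2)".format(best.trainIdx) )
print( "imgIdx:   {}".format(best.imgIdx) )
@endcode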
### Brute-Force Matching with SIFT Descriptors and Ratio Test

This time, we will use BFMatcher.knnMatch() to get the k best matches. In this example, we will take k=2
so that we can apply the ratio test explained by D. Lowe in his paper.
@code{.py}
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

img1 = cv.imread('box.png',cv.IMREAD_GRAYSCALE)          # queryImage
img2 = cv.imread('box_in_scene.png',cv.IMREAD_GRAYSCALE) # trainImage

# Initiate SIFT detector
sift = cv.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)

# BFMatcher with default params
bf = cv.BFMatcher()
matches = bf.knnMatch(des1,des2,k=2)

# Apply ratio test
good = []
for m,n in matches:
    if m.distance < 0.75*n.distance:
        good.append([m])

# cv.drawMatchesKnn expects list of lists as matches.
img3 = cv.drawMatchesKnn(img1,kp1,img2,kp2,good,None,flags=cv.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)

plt.imshow(img3),plt.show()
@endcode
See the result below:



FLANN based Matcher
-------------------

FLANN stands for Fast Library for Approximate Nearest Neighbors. It contains a collection of
algorithms optimized for fast nearest neighbor search in large datasets and for high dimensional
features. It works faster than BFMatcher for large datasets. We will see the second example
with the FLANN based matcher.

For the FLANN based matcher, we need to pass two dictionaries which specify the algorithm to be used,
its related parameters, etc. The first one is IndexParams. For various algorithms, the information to be
passed is explained in the FLANN docs. As a summary, for algorithms like SIFT, SURF etc. you can pass the
following:
@code{.py}
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
@endcode
While using ORB, you can pass the following. The commented values are recommended as per the docs,
but they didn't provide the required results in some cases. Other values worked fine:
@code{.py}
FLANN_INDEX_LSH = 6
index_params= dict(algorithm = FLANN_INDEX_LSH,
                   table_number = 6, # 12
                   key_size = 12,     # 20
                   multi_probe_level = 1) #2
@endcode
The second dictionary is the SearchParams. It specifies the number of times the trees in the index
should be recursively traversed. Higher values give better precision, but also take more time. If
you want to change the value, pass search_params = dict(checks=100).

With this information, we are good to go.
@code{.py}
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

img1 = cv.imread('box.png',cv.IMREAD_GRAYSCALE)          # queryImage
img2 = cv.imread('box_in_scene.png',cv.IMREAD_GRAYSCALE) # trainImage

# Initiate SIFT detector
sift = cv.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)

# FLANN parameters
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks=50)   # or pass empty dictionary

flann = cv.FlannBasedMatcher(index_params,search_params)

matches = flann.knnMatch(des1,des2,k=2)

# Need to draw only good matches, so create a mask
matchesMask = [[0,0] for i in range(len(matches))]

# ratio test as per Lowe's paper
for i,(m,n) in enumerate(matches):
    if m.distance < 0.7*n.distance:
        matchesMask[i]=[1,0]

draw_params = dict(matchColor = (0,255,0),
                   singlePointColor = (255,0,0),
                   matchesMask = matchesMask,
                   flags = cv.DrawMatchesFlags_DEFAULT)

img3 = cv.drawMatchesKnn(img1,kp1,img2,kp2,matches,None,**draw_params)

plt.imshow(img3,),plt.show()
@endcode
See the result below:



Additional Resources
--------------------

Exercises
---------
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_orb/images/orb_kp.jpg  (new file, 23 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_orb/py_orb.markdown  (new file, 98 lines)
@@ -0,0 +1,98 @@
ORB (Oriented FAST and Rotated BRIEF) {#tutorial_py_orb}
=====================================

Goal
----

In this chapter,
-   We will see the basics of ORB

Theory
------

As an OpenCV enthusiast, the most important thing about ORB is that it came from "OpenCV Labs".
This algorithm was brought up by Ethan Rublee, Vincent Rabaud, Kurt Konolige and Gary R. Bradski in
their paper **ORB: An efficient alternative to SIFT or SURF** in 2011. As the title says, it is a
good alternative to SIFT and SURF in computation cost, matching performance and, mainly, the patents.
Yes, SIFT and SURF are patented and you are supposed to pay for their use. But ORB is not!

ORB is basically a fusion of the FAST keypoint detector and the BRIEF descriptor with many modifications to
enhance the performance. First it uses FAST to find keypoints, then applies the Harris corner measure to
find the top N points among them. It also uses a pyramid to produce multiscale features. But one problem is
that FAST doesn't compute the orientation. So what about rotation invariance? The authors came up with the
following modification.

It computes the intensity-weighted centroid of the patch with the located corner at the center. The
direction of the vector from this corner point to the centroid gives the orientation. To improve the
rotation invariance, the moments are computed with x and y restricted to a circular region of
radius \f$r\f$, where \f$r\f$ is the size of the patch.

Now for descriptors, ORB uses BRIEF descriptors. But we have already seen that BRIEF performs poorly
under rotation. So what ORB does is to "steer" BRIEF according to the orientation of the keypoints. For
any feature set of \f$n\f$ binary tests at locations \f$(x_i, y_i)\f$, define a \f$2 \times n\f$ matrix \f$S\f$
which contains the coordinates of these pixels. Then, using the orientation of the patch, \f$\theta\f$, its
rotation matrix is found and used to rotate \f$S\f$ to get the steered (rotated) version \f$S_\theta\f$.

ORB discretizes the angle in increments of \f$2 \pi /30\f$ (12 degrees), and constructs a lookup table of
precomputed BRIEF patterns. As long as the keypoint orientation \f$\theta\f$ is consistent across views,
the correct set of points \f$S_\theta\f$ will be used to compute its descriptor.

BRIEF has an important property: each bit feature has a large variance and a mean near 0.5. But
once it is oriented along the keypoint direction, it loses this property and becomes more distributed.
High variance makes a feature more discriminative, since it responds differentially to inputs.
Another desirable property is to have the tests uncorrelated, since then each test will contribute
to the result. To resolve all of this, ORB runs a greedy search among all possible binary tests to
find the ones that have both high variance and means close to 0.5, as well as being uncorrelated.
The result is called **rBRIEF**.

For descriptor matching, multi-probe LSH, which improves on the traditional LSH, is used. The paper
says ORB is much faster than SURF and SIFT, and the ORB descriptor works better than SURF. ORB is a good
choice on low-power devices for panorama stitching etc.

ORB in OpenCV
-------------

As usual, we have to create an ORB object with the function **cv.ORB_create()** or using the feature2d common
interface. It has a number of optional parameters. The most useful ones are nfeatures, which denotes the
maximum number of features to be retained (by default 500), and scoreType, which denotes whether the Harris
score or the FAST score is used to rank the features (by default, the Harris score). Another parameter, WTA_K,
decides the number of points that produce each element of the oriented BRIEF descriptor. By default it
is two, i.e. it selects two points at a time. In that case, for matching, the NORM_HAMMING distance is used.
If WTA_K is 3 or 4, which takes 3 or 4 points to produce the BRIEF descriptor, then the matching distance
is defined by NORM_HAMMING2.
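For example (a sketch, assuming the Python binding's keyword names nfeatures, scoreType and WTA_K; the values are arbitrary):
@code{.py}
import cv2 as cv

# keep up to 1000 keypoints, rank them by the FAST score instead of the Harris score,
# and keep WTA_K=2 so that NORM_HAMMING is the right matching distance
orb = cv.ORB_create(nfeatures=1000, scoreType=cv.ORB_FAST_SCORE, WTA_K=2)
@endcode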
Below is a simple code example which shows the use of ORB.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('simple.jpg',0)

# Initiate ORB detector
orb = cv.ORB_create()

# find the keypoints with ORB
kp = orb.detect(img,None)

# compute the descriptors with ORB
kp, des = orb.compute(img, kp)

# draw only keypoints location,not size and orientation
img2 = cv.drawKeypoints(img, kp, None, color=(0,255,0), flags=0)
plt.imshow(img2), plt.show()
@endcode
See the result below:



ORB feature matching will be done in another chapter.

Additional Resources
--------------------

-#  Ethan Rublee, Vincent Rabaud, Kurt Konolige, Gary R. Bradski: ORB: An efficient alternative to
    SIFT or SURF. ICCV 2011: 2564-2571.

Exercises
---------
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_shi_tomasi/images/shitomasi_block1.jpg  (new file, 14 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_shi_tomasi/images/shitomasi_space.png  (new file, 4.5 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_shi_tomasi/py_shi_tomasi.markdown  (new file, 75 lines)
@@ -0,0 +1,75 @@
Shi-Tomasi Corner Detector & Good Features to Track {#tutorial_py_shi_tomasi}
===================================================

Goal
----

In this chapter,

-   We will learn about another corner detector: the Shi-Tomasi Corner Detector
-   We will see the function: **cv.goodFeaturesToTrack()**

Theory
------

In the last chapter, we saw the Harris Corner Detector. Later, in 1994, J. Shi and C. Tomasi made a small
modification to it in their paper **Good Features to Track**, which shows better results compared to
the Harris Corner Detector. The scoring function in the Harris Corner Detector was given by:

\f[R = \lambda_1 \lambda_2 - k(\lambda_1+\lambda_2)^2\f]

Instead of this, Shi-Tomasi proposed:

\f[R = \min(\lambda_1, \lambda_2)\f]

If it is greater than a threshold value, it is considered as a corner. If we plot it in
\f$\lambda_1 - \lambda_2\f$ space as we did for the Harris Corner Detector, we get an image as below:



From the figure, you can see that only when \f$\lambda_1\f$ and \f$\lambda_2\f$ are above a minimum value,
\f$\lambda_{\min}\f$, is it considered as a corner (green region).

Code
----

OpenCV has a function, **cv.goodFeaturesToTrack()**. It finds the N strongest corners in the image by
the Shi-Tomasi method (or Harris Corner Detection, if you specify it). As usual, the image should be a
grayscale image. Then you specify the number of corners you want to find. Then you specify the quality
level, which is a value between 0 and 1 and denotes the minimum quality of corner below which
every corner is rejected. Then we provide the minimum Euclidean distance between detected corners.

With all this information, the function finds corners in the image. All corners below the quality
level are rejected. Then it sorts the remaining corners by quality in descending order.
The function then takes the first strongest corner, throws away all the nearby corners within the range of the
minimum distance, and returns the N strongest corners.

In the example below, we will try to find the 25 best corners:
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('blox.jpg')
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)

corners = cv.goodFeaturesToTrack(gray,25,0.01,10)
corners = np.int0(corners)

for i in corners:
    x,y = i.ravel()
    cv.circle(img,(x,y),3,255,-1)

plt.imshow(img),plt.show()
@endcode
See the result below:



This function is more appropriate for tracking. We will see that when the time comes.
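As a side note (a sketch, assuming the Python binding's keyword names useHarrisDetector and k), the same function can also score the corners with the Harris measure instead of the Shi-Tomasi one:
@code{.py}
# assumed keyword names; k is the Harris free parameter from the previous chapter
corners = cv.goodFeaturesToTrack(gray, 25, 0.01, 10,
                                 useHarrisDetector=True, k=0.04)
@endcode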
Additional Resources
--------------------

Exercises
---------
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_sift_intro/images/sift_dog.jpg  (new file, 30 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_sift_intro/images/sift_keypoints.jpg  (new file, 33 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_sift_intro/images/sift_local_extrema.jpg  (new file, 15 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_sift_intro/images/sift_scale_invariant.jpg  (new file, 3.3 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_sift_intro/py_sift_intro.markdown  (new file, 168 lines)
@@ -0,0 +1,168 @@
|
||||
Introduction to SIFT (Scale-Invariant Feature Transform) {#tutorial_py_sift_intro}
|
||||
========================================================
|
||||
|
||||
Goal
|
||||
----
|
||||
|
||||
In this chapter,
|
||||
- We will learn about the concepts of SIFT algorithm
|
||||
- We will learn to find SIFT Keypoints and Descriptors.
|
||||
|
||||
Theory
|
||||
------
|
||||
|
||||
In last couple of chapters, we saw some corner detectors like Harris etc. They are
|
||||
rotation-invariant, which means, even if the image is rotated, we can find the same corners. It is
|
||||
obvious because corners remain corners in rotated image also. But what about scaling? A corner may
|
||||
not be a corner if the image is scaled. For example, check a simple image below. A corner in a small
|
||||
image within a small window is flat when it is zoomed in the same window. So Harris corner is not
|
||||
scale invariant.
|
||||
|
||||

|
||||
|
||||
In 2004, **D.Lowe**, University of British Columbia, came up with a new algorithm, Scale
|
||||
Invariant Feature Transform (SIFT) in his paper, **Distinctive Image Features from Scale-Invariant
|
||||
Keypoints**, which extract keypoints and compute its descriptors. *(This paper is easy to understand
|
||||
and considered to be best material available on SIFT. This explanation is just a short summary of
|
||||
this paper)*.
|
||||
|
||||
There are mainly four steps involved in SIFT algorithm. We will see them one-by-one.
|
||||
|
||||
### 1. Scale-space Extrema Detection
|
||||
|
||||
From the image above, it is obvious that we can't use the same window to detect keypoints with
|
||||
different scale. It is OK with small corner. But to detect larger corners we need larger windows.
|
||||
For this, scale-space filtering is used. In it, Laplacian of Gaussian is found for the image with
|
||||
various \f$\sigma\f$ values. LoG acts as a blob detector which detects blobs in various sizes due to
|
||||
change in \f$\sigma\f$. In short, \f$\sigma\f$ acts as a scaling parameter. For eg, in the above image,
|
||||
gaussian kernel with low \f$\sigma\f$ gives high value for small corner while gaussian kernel with high
|
||||
\f$\sigma\f$ fits well for larger corner. So, we can find the local maxima across the scale and space
|
||||
which gives us a list of \f$(x,y,\sigma)\f$ values which means there is a potential keypoint at (x,y) at
|
||||
\f$\sigma\f$ scale.
|
||||
|
||||
But this LoG is a little costly, so SIFT algorithm uses Difference of Gaussians which is an
|
||||
approximation of LoG. Difference of Gaussian is obtained as the difference of Gaussian blurring of
|
||||
an image with two different \f$\sigma\f$, let it be \f$\sigma\f$ and \f$k\sigma\f$. This process is done for
|
||||
different octaves of the image in Gaussian Pyramid. It is represented in below image:
|
||||
|
||||

|
||||
|
||||
Once this DoG are found, images are searched for local extrema over scale and space. For eg, one
|
||||
pixel in an image is compared with its 8 neighbours as well as 9 pixels in next scale and 9 pixels
|
||||
in previous scales. If it is a local extrema, it is a potential keypoint. It basically means that
|
||||
keypoint is best represented in that scale. It is shown in below image:
|
||||
|
||||

|
||||
|
||||
Regarding different parameters, the paper gives some empirical data which can be summarized as,
|
||||
number of octaves = 4, number of scale levels = 5, initial \f$\sigma=1.6\f$, \f$k=\sqrt{2}\f$ etc as optimal
|
||||
values.
|
||||
|
||||
### 2. Keypoint Localization
|
||||
|
||||
Once potential keypoints locations are found, they have to be refined to get more accurate results.
|
||||
They used Taylor series expansion of scale space to get more accurate location of extrema, and if
|
||||
the intensity at this extrema is less than a threshold value (0.03 as per the paper), it is
|
||||
rejected. This threshold is called **contrastThreshold** in OpenCV
|
||||
|
||||
DoG has higher response for edges, so edges also need to be removed. For this, a concept similar to
|
||||
Harris corner detector is used. They used a 2x2 Hessian matrix (H) to compute the principal
|
||||
curvature. We know from Harris corner detector that for edges, one eigen value is larger than the
|
||||
other. So here they used a simple function,
|
||||
|
||||
If this ratio is greater than a threshold, called **edgeThreshold** in OpenCV, that keypoint is
|
||||
discarded. It is given as 10 in paper.
|
||||
|
||||
So it eliminates any low-contrast keypoints and edge keypoints and what remains is strong interest
|
||||
points.
|
||||
|
||||
### 3. Orientation Assignment

Now an orientation is assigned to each keypoint to achieve invariance to image rotation. A
neighbourhood is taken around the keypoint location depending on the scale, and the gradient
magnitude and direction are calculated in that region. An orientation histogram with 36 bins covering
360 degrees is created (it is weighted by the gradient magnitude and by a Gaussian-weighted circular
window with \f$\sigma\f$ equal to 1.5 times the scale of the keypoint). The highest peak in the histogram
is taken, and any peak above 80% of it is also considered to calculate the orientation. This creates
keypoints with the same location and scale, but different directions. It contributes to the stability of
matching.

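As a rough illustration (not the actual SIFT implementation, and with the Gaussian weighting omitted),
a magnitude-weighted 36-bin orientation histogram for a patch could be sketched like this; the image
name and patch location are placeholders:
@code{.py}
import cv2 as cv
import numpy as np

# 'patch' stands in for the scale-dependent neighbourhood around a keypoint.
img = cv.imread('home.jpg', cv.IMREAD_GRAYSCALE).astype(np.float32)
patch = img[100:116, 100:116]

gx = cv.Sobel(patch, cv.CV_32F, 1, 0)
gy = cv.Sobel(patch, cv.CV_32F, 0, 1)
mag, ang = cv.cartToPolar(gx, gy, angleInDegrees=True)

# 36 bins of 10 degrees each, weighted by gradient magnitude
hist, _ = np.histogram(ang, bins=36, range=(0, 360), weights=mag)
dominant_bin = np.argmax(hist)
print('dominant orientation ~', dominant_bin * 10, 'degrees')
@endcode
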
### 4. Keypoint Descriptor

Now the keypoint descriptor is created. A 16x16 neighbourhood around the keypoint is taken. It is
divided into 16 sub-blocks of 4x4 size. For each sub-block, an 8-bin orientation histogram is created.
So a total of 128 bin values are available. They are represented as a vector to form the keypoint
descriptor. In addition to this, several measures are taken to achieve robustness against
illumination changes, rotation etc.

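The 4x4 sub-blocks times 8 bins layout is easy to verify with a toy sketch; the random gradient arrays
below merely stand in for a real 16x16 patch, and SIFT's interpolation, Gaussian weighting and
normalization are omitted:
@code{.py}
import numpy as np

# Layout sketch only: 16 sub-blocks of 4x4 pixels, 8 orientation bins each,
# concatenated into a 128-vector.
rng = np.random.default_rng(0)
mag = rng.random((16, 16)).astype(np.float32)          # placeholder gradient magnitudes
ang = rng.random((16, 16)).astype(np.float32) * 360    # placeholder gradient angles

descriptor = []
for by in range(4):
    for bx in range(4):
        m = mag[4 * by:4 * by + 4, 4 * bx:4 * bx + 4]
        a = ang[4 * by:4 * by + 4, 4 * bx:4 * bx + 4]
        hist, _ = np.histogram(a, bins=8, range=(0, 360), weights=m)
        descriptor.extend(hist)
print(np.array(descriptor).shape)   # (128,)
@endcode
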
### 5. Keypoint Matching

Keypoints between two images are matched by identifying their nearest neighbours. But in some cases,
the second-closest match may be very near to the first. This may happen due to noise or some other
reason. In that case, the ratio of the closest distance to the second-closest distance is taken. If it is
greater than 0.8, the match is rejected. This eliminates around 90% of false matches while discarding
only 5% of correct matches, as per the paper.

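Although matching is covered in a later chapter, here is a minimal sketch of this ratio test using
OpenCV's brute-force matcher; the two image names are placeholders:
@code{.py}
import cv2 as cv

img1 = cv.imread('box.png', cv.IMREAD_GRAYSCALE)           # placeholder images
img2 = cv.imread('box_in_scene.png', cv.IMREAD_GRAYSCALE)

sift = cv.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

bf = cv.BFMatcher()                       # default L2 norm suits SIFT descriptors
matches = bf.knnMatch(des1, des2, k=2)    # two nearest neighbours per descriptor

# Keep a match only if it is clearly better than the second-best candidate
good = [m for m, n in matches if m.distance < 0.8 * n.distance]
print(len(good), 'matches survive the ratio test')
@endcode
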
This is a summary of the SIFT algorithm. For more details and a deeper understanding, reading the
original paper is highly recommended.

SIFT in OpenCV
--------------

Now let's see the SIFT functionality available in OpenCV. Note that it was previously only
available in [the opencv contrib repo](https://github.com/opencv/opencv_contrib), but the patent
expired in the year 2020, so it is now included in the main repo. Let's start with keypoint
detection and draw the keypoints. First we have to construct a SIFT object. We can pass different
optional parameters to it; they are well explained in the docs.
@code{.py}
import numpy as np
import cv2 as cv

img = cv.imread('home.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

sift = cv.SIFT_create()
kp = sift.detect(gray, None)

img = cv.drawKeypoints(gray, kp, img)

cv.imwrite('sift_keypoints.jpg', img)
@endcode
The **sift.detect()** function finds keypoints in the image. You can pass a mask if you want to
search only a part of the image. Each keypoint is a special structure which has many attributes, like its
(x,y) coordinates, the size of the meaningful neighbourhood, the angle which specifies its orientation,
the response that specifies the strength of the keypoint, etc.

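You can inspect these attributes directly on the keypoints found above, for example:
@code{.py}
# Inspect the attributes of the first keypoint found above
p = kp[0]
print(p.pt)        # (x, y) coordinates
print(p.size)      # diameter of the meaningful neighbourhood
print(p.angle)     # orientation in degrees (-1 if it has not been computed)
print(p.response)  # strength of the keypoint
print(p.octave)    # pyramid octave in which the keypoint was detected
@endcode
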
OpenCV also provides the **cv.drawKeypoints()** function which draws small circles at the locations
of the keypoints. If you pass the flag **cv.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS** to it, it will
draw a circle with the size of the keypoint and it will even show its orientation. See the example below.
@code{.py}
img = cv.drawKeypoints(gray, kp, img, flags=cv.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv.imwrite('sift_keypoints.jpg', img)
@endcode
See the two results below:

Now to calculate the descriptor, OpenCV provides two methods.

-#  Since you already found keypoints, you can call **sift.compute()** which computes the
    descriptors from the keypoints we have found. Eg: kp,des = sift.compute(gray,kp)
-#  If you didn't find keypoints, directly find keypoints and descriptors in a single step with the
    function **sift.detectAndCompute()**.

We will see the second method:
@code{.py}
sift = cv.SIFT_create()
kp, des = sift.detectAndCompute(gray, None)
@endcode
Here kp will be a list of keypoints and des is a numpy array of shape
\f$\text{(Number of Keypoints)} \times 128\f$.

So we got keypoints, descriptors, etc. Now we want to see how to match keypoints in different images.
That we will learn in the coming chapters.

Additional Resources
--------------------

Exercises
---------
BIN
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_surf_intro/images/surf_boxfilter.jpg
vendored
Normal file
After Width: | Height: | Size: 13 KiB |
BIN
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_surf_intro/images/surf_kp1.jpg
vendored
Normal file
After Width: | Height: | Size: 26 KiB |
BIN
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_surf_intro/images/surf_kp2.jpg
vendored
Normal file
After Width: | Height: | Size: 28 KiB |
BIN
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_surf_intro/images/surf_matching.jpg
vendored
Normal file
After Width: | Height: | Size: 12 KiB |
BIN
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_surf_intro/images/surf_orientation.jpg
vendored
Normal file
After Width: | Height: | Size: 7.7 KiB |
163
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_surf_intro/py_surf_intro.markdown
vendored
Normal file
@ -0,0 +1,163 @@
Introduction to SURF (Speeded-Up Robust Features) {#tutorial_py_surf_intro}
=================================================

Goal
----

In this chapter,
    - We will see the basics of SURF
    - We will see SURF functionalities in OpenCV

Theory
------

In the last chapter, we saw SIFT for keypoint detection and description. But it was comparatively slow
and people needed a more speeded-up version. In 2006, three people, Bay, H., Tuytelaars, T. and Van
Gool, L., published another paper, "SURF: Speeded Up Robust Features", which introduced a new
algorithm called SURF. As the name suggests, it is a speeded-up version of SIFT.

In SIFT, Lowe approximated the Laplacian of Gaussian with the Difference of Gaussian for finding the
scale-space. SURF goes a little further and approximates LoG with a Box Filter. The image below shows a
demonstration of such an approximation. One big advantage of this approximation is that convolution
with a box filter can be easily calculated with the help of integral images, and it can be done in
parallel for different scales. Also, SURF relies on the determinant of the Hessian matrix for both scale
and location.

![image](images/surf_boxfilter.jpg)

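The reason box filters are so cheap is that, with an integral image, the sum over any rectangle takes
only four array lookups regardless of the rectangle's size. A minimal sketch of that idea (the image
name is just a placeholder):
@code{.py}
import cv2 as cv

img = cv.imread('fly.png', cv.IMREAD_GRAYSCALE)   # placeholder image
ii = cv.integral(img)                             # integral image, shape (h+1, w+1)

# Sum of the box with top-left corner (x, y) and size w x h via four lookups,
# independent of the box size; this is what makes box filtering fast.
x, y, w, h = 30, 40, 15, 15
box_sum = ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]
print(box_sum, img[y:y + h, x:x + w].sum())       # the two values agree
@endcode
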
For orientation assignment, SURF uses wavelet responses in the horizontal and vertical directions for a
neighbourhood of size 6s. Adequate Gaussian weights are also applied to them. Then they are plotted in
a space as shown in the image below. The dominant orientation is estimated by calculating the sum of all
responses within a sliding orientation window of angle 60 degrees. The interesting thing is that the
wavelet response can be found very easily at any scale using integral images. For many
applications, rotation invariance is not required, so there is no need to find this orientation, which
speeds up the process. SURF provides such a functionality, called Upright-SURF or U-SURF. It improves
speed and is robust up to \f$\pm 15^{\circ}\f$. OpenCV supports both, depending upon the flag
**upright**. If it is 0, the orientation is calculated. If it is 1, the orientation is not calculated
and it is faster.

![image](images/surf_orientation.jpg)

For feature description, SURF uses wavelet responses in the horizontal and vertical directions (again,
the use of integral images makes things easier). A neighbourhood of size 20sX20s is taken around the
keypoint, where s is the size. It is divided into 4x4 subregions. For each subregion, horizontal and
vertical wavelet responses are taken and a vector is formed like this:
\f$v=( \sum{d_x}, \sum{d_y}, \sum{|d_x|}, \sum{|d_y|})\f$. This, when represented as a vector, gives the
SURF feature descriptor with a total of 64 dimensions. The lower the dimension, the higher the speed of
computation and matching, although higher-dimensional descriptors provide better distinctiveness of
features.

For more distinctiveness, the SURF feature descriptor has an extended 128-dimension version. The sums of
\f$d_x\f$ and \f$|d_x|\f$ are computed separately for \f$d_y < 0\f$ and \f$d_y \geq 0\f$. Similarly, the sums of
\f$d_y\f$ and \f$|d_y|\f$ are split up according to the sign of \f$d_x\f$, thereby doubling the number of
features. It doesn't add much computation complexity. OpenCV supports both by setting the value of the
flag **extended** to 0 or 1 for 64-dim and 128-dim respectively (the default is 64-dim, as the demo
below shows).

Another important improvement is the use of the sign of the Laplacian (trace of the Hessian matrix) for
the underlying interest point. It adds no computation cost since it is already computed during
detection. The sign of the Laplacian distinguishes bright blobs on dark backgrounds from the reverse
situation. In the matching stage, we only compare features if they have the same type of contrast
(as shown in the image below). This minimal information allows for faster matching, without reducing the
descriptor's performance.

![image](images/surf_matching.jpg)

In short, SURF adds a lot of features to improve the speed at every step. Analysis shows it is 3
times faster than SIFT while performance is comparable to SIFT. SURF is good at handling images with
blurring and rotation, but not good at handling viewpoint change and illumination change.

SURF in OpenCV
--------------

OpenCV provides SURF functionalities just like SIFT. You initiate a SURF object with some optional
conditions like 64/128-dim descriptors, Upright/Normal SURF, etc. All the details are well explained
in the docs. Then, as we did with SIFT, we can use SURF.detect(), SURF.compute() etc. for finding
keypoints and descriptors.

First we will see a simple demo on how to find SURF keypoints and descriptors and draw them. All
examples are shown in a Python terminal, since it is just the same as for SIFT.
@code{.py}
>>> import cv2 as cv
>>> from matplotlib import pyplot as plt

>>> img = cv.imread('fly.png',0)

# Create SURF object. You can specify params here or later.
# Here I set Hessian Threshold to 400
>>> surf = cv.xfeatures2d.SURF_create(400)

# Find keypoints and descriptors directly
>>> kp, des = surf.detectAndCompute(img,None)

>>> len(kp)
699
@endcode
699 keypoints are too many to show in a picture. We reduce them to about 50 to draw on the image.
While matching, we may need all those features, but not now. So we increase the Hessian Threshold.
@code{.py}
# Check the present Hessian threshold
>>> print( surf.getHessianThreshold() )
400.0

# We set it to some 50000. Remember, it is just for representation in the picture.
# In actual cases, it is better to have a value of 300-500
>>> surf.setHessianThreshold(50000)

# Again compute keypoints and check their number.
>>> kp, des = surf.detectAndCompute(img,None)

>>> print( len(kp) )
47
@endcode
It is less than 50. Let's draw the keypoints on the image.
@code{.py}
>>> img2 = cv.drawKeypoints(img,kp,None,(255,0,0),4)

>>> plt.imshow(img2),plt.show()
@endcode
See the result below. You can see that SURF is more like a blob detector. It detects the white blobs
on the wings of the butterfly. You can test it with other images.

![image](images/surf_kp1.jpg)

Now I want to apply U-SURF, so that it won't find the orientation.
@code{.py}
# Check the upright flag; if it is False, set it to True
>>> print( surf.getUpright() )
False

>>> surf.setUpright(True)

# Recompute the feature points and draw them
>>> kp = surf.detect(img,None)
>>> img2 = cv.drawKeypoints(img,kp,None,(255,0,0),4)

>>> plt.imshow(img2),plt.show()
@endcode
See the results below. All the orientations are shown in the same direction. It is faster than
before. If you are working on cases where orientation is not a problem (like panorama stitching),
this is better.

![image](images/surf_kp2.jpg)

Finally, we check the descriptor size and change it to 128 if it is only 64-dim.
@code{.py}
# Find the size of the descriptor
>>> print( surf.descriptorSize() )
64

# That means the flag "extended" is False.
>>> surf.getExtended()
False

# So we set it to True to get 128-dim descriptors.
>>> surf.setExtended(True)
>>> kp, des = surf.detectAndCompute(img,None)
>>> print( surf.descriptorSize() )
128
>>> print( des.shape )
(47, 128)
@endcode
The remaining part is matching, which we will do in another chapter.

Additional Resources
--------------------

Exercises
---------
54
3rdparty/opencv-4.5.4/doc/py_tutorials/py_feature2d/py_table_of_contents_feature2d.markdown
vendored
Normal file
@ -0,0 +1,54 @@
Feature Detection and Description {#tutorial_py_table_of_contents_feature2d}
=================================

- @subpage tutorial_py_features_meaning

    What are the main
    features in an image? How can finding those features be useful to us?

- @subpage tutorial_py_features_harris

    Okay, corners are good
    features. But how do we find them?

- @subpage tutorial_py_shi_tomasi

    We will look into
    Shi-Tomasi corner detection.

- @subpage tutorial_py_sift_intro

    The Harris corner detector
    is not good enough when the scale of the image changes. Lowe developed a breakthrough method to find
    scale-invariant features, and it is called SIFT.

- @subpage tutorial_py_surf_intro

    SIFT is really good,
    but not fast enough, so people came up with a speeded-up version called SURF.

- @subpage tutorial_py_fast

    All the above feature
    detection methods are good in some way. But they are not fast enough to work in real-time
    applications like SLAM. There comes the FAST algorithm, which is really "FAST".

- @subpage tutorial_py_brief

    SIFT uses a feature
    descriptor with 128 floating point numbers. Consider thousands of such features. It takes lots of
    memory and more time for matching. We can compress it to make it faster. But still we have to
    calculate it first. There comes BRIEF, which gives a shortcut to find binary descriptors with
    less memory, faster matching, and still a higher recognition rate.

- @subpage tutorial_py_orb

    SIFT and SURF are good at what they do, but what if you have to pay a few dollars every year to use them in your applications? Yeah, they are patented!!! To solve that problem, OpenCV devs came up with a new "FREE" alternative to SIFT & SURF, and that is ORB.

- @subpage tutorial_py_matcher

    We know a great deal about feature detectors and descriptors. It is time to learn how to match different descriptors. OpenCV provides two techniques, the Brute-Force matcher and the FLANN-based matcher.

- @subpage tutorial_py_feature_homography

    Now we know about feature matching. Let's mix it up with the calib3d module to find objects in a complex image.