feat: switch backend to PaddleOCR-NCNN, switch project to CMake
1. The project backend has been fully migrated to the PaddleOCR-NCNN algorithm and has passed basic compatibility tests. 2. The project is now organized with CMake; to better support third-party libraries, a QMake project will no longer be provided. 3. Reworked the rights/license notice files and the code tree to minimize infringement risk. Log: switch backend to PaddleOCR-NCNN, switch project to CMake Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/images/calibration_icon.jpg  (vendored, new file, 3.6 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/images/depthmap_icon.jpg  (vendored, new file, 3.7 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/images/epipolar_icon.jpg  (vendored, new file, 3.6 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/images/pose_icon.jpg  (vendored, new file, 3.5 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_calibration/images/calib_pattern.jpg  (vendored, new file, 45 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_calibration/images/calib_radial.jpg  (vendored, new file, 33 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_calibration/images/calib_result.jpg  (vendored, new file, 22 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_calibration/py_calibration.markdown  (vendored, new file, 225 lines added)
@@ -0,0 +1,225 @@
Camera Calibration {#tutorial_py_calibration}
==================

Goal
----

In this section, we will learn about

* the types of distortion caused by cameras,
* how to find the intrinsic and extrinsic properties of a camera,
* how to undistort images based on these properties.

Basics
------

Some pinhole cameras introduce significant distortion to images. Two major kinds of distortion are
radial distortion and tangential distortion.

Radial distortion causes straight lines to appear curved. It becomes larger the farther points are
from the center of the image. For example, one image is shown below in which two edges of a chess
board are marked with red lines. You can see that the border of the chess board is not a straight
line and doesn't match the red line: all the expected straight lines are bulged out. Visit
[Distortion (optics)](http://en.wikipedia.org/wiki/Distortion_%28optics%29) for more details.



Radial distortion can be represented as follows:

\f[x_{distorted} = x( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \\
y_{distorted} = y( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6)\f]

Similarly, tangential distortion occurs because the lens is not aligned perfectly parallel to the
imaging plane, so some areas in the image may look nearer than expected. The amount of tangential
distortion can be represented as below:

\f[x_{distorted} = x + [ 2p_1xy + p_2(r^2+2x^2)] \\
y_{distorted} = y + [ p_1(r^2+ 2y^2)+ 2p_2xy]\f]

In short, we need to find five parameters, known as the distortion coefficients, given by:

\f[Distortion \; coefficients=(k_1 \hspace{10pt} k_2 \hspace{10pt} p_1 \hspace{10pt} p_2 \hspace{10pt} k_3)\f]

In addition to this, we need some other information, like the intrinsic and extrinsic parameters of
the camera. Intrinsic parameters are specific to a camera. They include information like the focal
length (\f$f_x,f_y\f$) and the optical centers (\f$c_x, c_y\f$). The focal length and optical
centers can be used to create a camera matrix, which can be used to remove distortion due to the
lenses of a specific camera. The camera matrix is unique to a specific camera, so once calculated,
it can be reused on other images taken by the same camera. It is expressed as a 3x3 matrix:

\f[camera \; matrix = \left [ \begin{matrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{matrix} \right ]\f]

Extrinsic parameters correspond to rotation and translation vectors which translate the coordinates
of a 3D point into the camera's coordinate system.

For stereo applications, these distortions need to be corrected first. To find these parameters, we
must provide some sample images of a well defined pattern (e.g. a chess board). We find some
specific points whose relative positions we already know (e.g. square corners on the chess board).
We know the coordinates of these points in real world space and we know their coordinates in the
image, so we can solve for the distortion coefficients. For better results, we need at least 10
test patterns.

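As a rough illustration (not part of the original tutorial), the distortion model above can be
applied directly in NumPy. The coefficient values below are made up purely for demonstration, and
the input is assumed to be a point in normalized image coordinates:
@code{.py}
import numpy as np

# hypothetical distortion coefficients (k1, k2, p1, p2, k3) -- illustrative values only
k1, k2, p1, p2, k3 = -0.25, 0.12, 0.001, -0.0005, 0.0

def distort(x, y):
    # x, y: normalized image coordinates (pinhole projection before the camera matrix is applied)
    r2 = x*x + y*y
    radial = 1 + k1*r2 + k2*r2**2 + k3*r2**3
    x_d = x*radial + 2*p1*x*y + p2*(r2 + 2*x*x)
    y_d = y*radial + p1*(r2 + 2*y*y) + 2*p2*x*y
    return x_d, y_d

print(distort(0.3, 0.2))  # where the ideal point (0.3, 0.2) lands after distortion
@endcode
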
Code
----

As mentioned above, we need at least 10 test patterns for camera calibration. OpenCV comes with
some images of a chess board (see samples/data/left01.jpg -- left14.jpg), so we will utilize these.
Consider an image of a chess board. The important input data needed for calibration of the camera
is the set of 3D real world points and the corresponding 2D coordinates of these points in the
image. The 2D image points are easy: we can find them directly from the image. (These image points
are locations where two black squares touch each other on the chess board.)

What about the 3D points from real world space? Those images were taken with a static camera while
the chess board was placed at different locations and orientations, so we need to know the
\f$(X,Y,Z)\f$ values. For simplicity, we can say the chess board was kept stationary on the XY
plane (so Z=0 always) and the camera was moved accordingly. This consideration lets us find only
the X,Y values. Now for the X,Y values, we can simply pass the points as (0,0), (1,0), (2,0), ...,
which denotes the location of the points. In this case, the results we get will be in the scale of
the chess board square size. But if we know the square size (say 30 mm), we can pass the values as
(0,0), (30,0), (60,0), ..., and we get the results in mm. (In this case, we don't know the square
size since we didn't take those images, so we pass the values in terms of square size.)

3D points are called **object points** and 2D image points are called **image points.**

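As a hedged aside (not in the original tutorial), this is what scaling the object points by a known
square size would look like; the 30 mm value is only an example and must be replaced by the
measured size of your printed pattern:
@code{.py}
import numpy as np

square_size = 30.0  # assumed square size in mm -- use the measured value for your board
objp = np.zeros((6*7, 3), np.float32)
objp[:, :2] = np.mgrid[0:7, 0:6].T.reshape(-1, 2) * square_size
# with this scaling, the translation vectors returned by cv.calibrateCamera() are in mm
@endcode
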
### Setup

So, to find the pattern in the chess board, we can use the function **cv.findChessboardCorners()**.
We also need to pass what kind of pattern we are looking for, like an 8x8 grid, 5x5 grid etc. In
this example, we use a 7x6 grid. (Normally a chess board has 8x8 squares and 7x7 internal corners.)
It returns the corner points and retval, which will be True if the pattern is found. These corners
will be placed in order (from left-to-right, top-to-bottom).

@note This function may not be able to find the required pattern in all the images. So, one good
option is to write the code such that it starts the camera and checks each frame for the required
pattern. Once the pattern is found, find the corners and store them in a list. Also, provide some
interval before reading the next frame so that we can adjust the chess board in a different
direction. Continue this process until the required number of good patterns is obtained. Even in
the example provided here, we are not sure how many of the 14 given images are good. Thus, we must
read all the images and take only the good ones.

@note Instead of a chess board, we can alternatively use a circular grid. In this case, we must use
the function **cv.findCirclesGrid()** to find the pattern. Fewer images are sufficient to perform
camera calibration using a circular grid.

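A minimal sketch of the circular-grid variant (not from the original code): the 4x11 asymmetric
grid size and the flag are assumptions that must match your printed pattern, and `gray` /
`imgpoints` refer to the detection loop shown in the code block below:
@code{.py}
ret, centers = cv.findCirclesGrid(gray, (4, 11), flags=cv.CALIB_CB_ASYMMETRIC_GRID)
if ret:
    imgpoints.append(centers)  # circle centers play the same role as the chessboard corners
@endcode
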
Once we find the corners, we can increase their accuracy using **cv.cornerSubPix()**. We can also
draw the pattern using **cv.drawChessboardCorners()**. All these steps are included in the code
below:

@code{.py}
import numpy as np
import cv2 as cv
import glob

# termination criteria
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
objp = np.zeros((6*7,3), np.float32)
objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)

# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.

images = glob.glob('*.jpg')

for fname in images:
    img = cv.imread(fname)
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # Find the chess board corners
    ret, corners = cv.findChessboardCorners(gray, (7,6), None)

    # If found, add object points, image points (after refining them)
    if ret == True:
        objpoints.append(objp)

        # refine the corner locations and store the refined points
        corners2 = cv.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)
        imgpoints.append(corners2)

        # Draw and display the corners
        cv.drawChessboardCorners(img, (7,6), corners2, ret)
        cv.imshow('img', img)
        cv.waitKey(500)

cv.destroyAllWindows()
@endcode

One image with the pattern drawn on it is shown below:



### Calibration

Now that we have our object points and image points, we are ready to go for calibration. We can use
the function **cv.calibrateCamera()**, which returns the camera matrix, distortion coefficients,
rotation and translation vectors etc.
@code{.py}
ret, mtx, dist, rvecs, tvecs = cv.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
@endcode

### Undistortion

Now, we can take an image and undistort it. OpenCV comes with two methods for doing this. First,
however, we can refine the camera matrix based on a free scaling parameter using
**cv.getOptimalNewCameraMatrix()**. If the scaling parameter alpha=0, it returns an undistorted
image with the minimum number of unwanted pixels, so it may even remove some pixels at the image
corners. If alpha=1, all pixels are retained, along with some extra black regions. This function
also returns an image ROI which can be used to crop the result.

So, we take a new image (left12.jpg in this case; that is the first image in this chapter).
@code{.py}
img = cv.imread('left12.jpg')
h, w = img.shape[:2]
newcameramtx, roi = cv.getOptimalNewCameraMatrix(mtx, dist, (w,h), 1, (w,h))
@endcode

#### 1. Using **cv.undistort()**

This is the easiest way. Just call the function and use the ROI obtained above to crop the result.
@code{.py}
# undistort
dst = cv.undistort(img, mtx, dist, None, newcameramtx)

# crop the image
x, y, w, h = roi
dst = dst[y:y+h, x:x+w]
cv.imwrite('calibresult.png', dst)
@endcode

#### 2. Using **remapping**

This way is a little bit more difficult. First, find a mapping function from the distorted image to
the undistorted image. Then use the remap function.
@code{.py}
# undistort
mapx, mapy = cv.initUndistortRectifyMap(mtx, dist, None, newcameramtx, (w,h), 5)
dst = cv.remap(img, mapx, mapy, cv.INTER_LINEAR)

# crop the image
x, y, w, h = roi
dst = dst[y:y+h, x:x+w]
cv.imwrite('calibresult.png', dst)
@endcode

Both methods give the same result. See the result below:



You can see in the result that all the edges are straight.

Now you can store the camera matrix and distortion coefficients using write functions in NumPy
(np.savez, np.savetxt etc) for future use.

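As a small sketch (not part of the original tutorial), the results can be saved and reloaded like
this; the file name 'B.npz' and the key names are chosen to match what the pose estimation tutorial
later loads:
@code{.py}
np.savez('B.npz', mtx=mtx, dist=dist, rvecs=rvecs, tvecs=tvecs)

# later, e.g. in another script
with np.load('B.npz') as data:
    mtx, dist = data['mtx'], data['dist']
@endcode
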
Re-projection Error
-------------------

Re-projection error gives a good estimate of just how exact the found parameters are. The closer
the re-projection error is to zero, the more accurate the parameters we found are. Given the
intrinsic, distortion, rotation and translation matrices, we must first transform the object points
to image points using **cv.projectPoints()**. Then, we can calculate the absolute norm between what
we got with our transformation and what the corner finding algorithm gave us. To find the average
error, we calculate the arithmetic mean of the errors calculated for all the calibration images.
@code{.py}
mean_error = 0
for i in range(len(objpoints)):
    imgpoints2, _ = cv.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist)
    error = cv.norm(imgpoints[i], imgpoints2, cv.NORM_L2)/len(imgpoints2)
    mean_error += error

print( "total error: {}".format(mean_error/len(objpoints)) )
@endcode
Additional Resources
--------------------

Exercises
---------

-# Try camera calibration with a circular grid.
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_depthmap/images/disparity_map.jpg  (vendored, new file, 18 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_depthmap/images/stereo_depth.jpg  (vendored, new file, 13 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_depthmap/py_depthmap.markdown  (vendored, new file, 75 lines added)
@@ -0,0 +1,75 @@
Depth Map from Stereo Images {#tutorial_py_depthmap}
============================

Goal
----

In this session,

- We will learn to create a depth map from stereo images.

Basics
------

In the last session, we saw basic concepts like epipolar constraints and other related terms. We
also saw that if we have two images of the same scene, we can get depth information from them in an
intuitive way. Below is an image and some simple mathematical formulas which prove that intuition.
(Image Courtesy :



The above diagram contains similar triangles. Writing their corresponding equations yields the
following result:

\f[disparity = x - x' = \frac{Bf}{Z}\f]

\f$x\f$ and \f$x'\f$ are the distances between the points in the image plane corresponding to the
3D scene point and their camera centers. \f$B\f$ is the distance between the two cameras (which we
know) and \f$f\f$ is the focal length of the camera (already known). So in short, the above
equation says that the depth of a point in a scene is inversely proportional to the difference in
distance of the corresponding image points and their camera centers. With this information, we can
derive the depth of all pixels in an image.

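As a rough numeric sketch (not in the original tutorial), depth is recovered from the formula above
as \f$Z = Bf / disparity\f$; the baseline, focal length and disparity values below are made up:
@code{.py}
baseline_m = 0.1      # assumed distance between the two cameras, in metres
focal_px = 700.0      # assumed focal length, in pixels
disparity_px = 35.0   # disparity measured for some pixel, in pixels

Z = baseline_m * focal_px / disparity_px
print("depth = {:.2f} m".format(Z))  # larger disparity -> smaller depth
@endcode
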
So the algorithm first finds corresponding matches between the two images. We have already seen how
the epipolar constraint makes this operation faster and more accurate. Once it finds the matches,
it finds the disparity. Let's see how we can do it with OpenCV.

Code
----

The code snippet below shows a simple procedure to create a disparity map.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

imgL = cv.imread('tsukuba_l.png',0)
imgR = cv.imread('tsukuba_r.png',0)

stereo = cv.StereoBM_create(numDisparities=16, blockSize=15)
disparity = stereo.compute(imgL,imgR)
plt.imshow(disparity,'gray')
plt.show()
@endcode
The image below contains the original image (left) and its disparity map (right). As you can see,
the result is contaminated with a high degree of noise. By adjusting the values of numDisparities
and blockSize, you can get a better result.



Once you are familiar with StereoBM, there are some parameters you may need to fine tune to get
better and smoother results:
- texture_threshold: filters out areas that don't have enough texture for reliable matching.
- Speckle range and size: Block-based matchers often produce "speckles" near the boundaries of objects, where the matching window catches the foreground on one side and the background on the other. In this scene it appears that the matcher is also finding small spurious matches in the projected texture on the table. To get rid of these artifacts we post-process the disparity image with a speckle filter controlled by the speckle_size and speckle_range parameters. speckle_size is the number of pixels below which a disparity blob is dismissed as "speckle." speckle_range controls how close in value disparities must be to be considered part of the same blob.
- Number of disparities: How many pixels to slide the window over. The larger it is, the larger the range of visible depths, but more computation is required.
- min_disparity: the offset from the x-position of the left pixel at which to begin searching.
- uniqueness_ratio: Another post-filtering step. If the best matching disparity is not sufficiently better than every other disparity in the search range, the pixel is filtered out. You can try tweaking this if texture_threshold and the speckle filtering are still letting through spurious matches.
- prefilter_size and prefilter_cap: The pre-filtering phase, which normalizes image brightness and enhances texture in preparation for block matching. Normally you should not need to adjust these.

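A hedged sketch (not from the original tutorial) of how these parameters map onto the StereoBM
setters; the numeric values are only starting points and need tuning per scene:
@code{.py}
stereo = cv.StereoBM_create(numDisparities=64, blockSize=15)
stereo.setMinDisparity(0)            # min_disparity
stereo.setTextureThreshold(10)       # texture_threshold
stereo.setUniquenessRatio(10)        # uniqueness_ratio
stereo.setSpeckleWindowSize(100)     # speckle_size: smaller blobs are dismissed as speckle
stereo.setSpeckleRange(32)           # speckle_range: max disparity variation within a blob
stereo.setPreFilterSize(9)           # prefilter_size (must be odd)
stereo.setPreFilterCap(31)           # prefilter_cap
disparity = stereo.compute(imgL, imgR)
@endcode
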
Additional Resources
--------------------
- [ROS stereo image processing wiki page](http://wiki.ros.org/stereo_image_proc/Tutorials/ChoosingGoodStereoParameters)

Exercises
---------

-# OpenCV samples contain an example of generating a disparity map and its 3D reconstruction. Check
   stereo_match.py in OpenCV-Python samples.
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_epipolar_geometry/images/epipolar.jpg  (vendored, new file, 11 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_epipolar_geometry/images/epiresult.jpg  (vendored, new file, 78 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_epipolar_geometry/images/essential_matrix.jpg  (vendored, new file, 15 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_epipolar_geometry/py_epipolar_geometry.markdown  (vendored, new file, 172 lines added)
@@ -0,0 +1,172 @@
Epipolar Geometry {#tutorial_py_epipolar_geometry}
=================

Goal
----

In this section,

- We will learn about the basics of multiview geometry
- We will see what the epipole, epipolar lines, epipolar constraint etc. are

Basic Concepts
--------------

When we take an image using a pin-hole camera, we lose an important piece of information: the depth
of the image, i.e. how far each point in the image is from the camera, because it is a 3D-to-2D
conversion. So it is an important question whether we can recover the depth information using these
cameras. And the answer is to use more than one camera. Our eyes work in a similar way: we use two
cameras (two eyes), which is called stereo vision. So let's see what OpenCV provides in this field.

(*Learning OpenCV* by Gary Bradsky has a lot of information in this field.)

Before going to depth images, let's first understand some basic concepts in multiview geometry. In
this section we will deal with epipolar geometry. See the image below, which shows a basic setup
with two cameras taking images of the same scene.



If we are using only the left camera, we can't find the 3D point corresponding to the point \f$x\f$
in the image because every point on the line \f$OX\f$ projects to the same point on the image
plane. But consider the right image as well. Now different points on the line \f$OX\f$ project to
different points (\f$x'\f$) in the right plane. So with these two images, we can triangulate the
correct 3D point. This is the whole idea.

The projections of the different points on \f$OX\f$ form a line on the right plane (line \f$l'\f$).
We call it the **epiline** corresponding to the point \f$x\f$. It means that to find the match of
the point \f$x\f$ in the right image, we search along this epiline: it should be somewhere on this
line. (Think of it this way: to find the matching point in the other image, you need not search the
whole image, just search along the epiline. So it provides better performance and accuracy.) This
is called the **Epipolar Constraint**. Similarly, all points have their corresponding epilines in
the other image. The plane \f$XOO'\f$ is called the **Epipolar Plane**.

\f$O\f$ and \f$O'\f$ are the camera centers. From the setup given above, you can see that the
projection of the right camera \f$O'\f$ is seen on the left image at the point \f$e\f$. It is
called the **epipole**. The epipole is the point of intersection of the line through the camera
centers and the image planes. Similarly, \f$e'\f$ is the epipole of the left camera. In some cases,
you won't be able to locate the epipole in the image; it may lie outside the image (which means one
camera doesn't see the other).

All the epilines pass through the epipole. So to find the location of the epipole, we can find many
epilines and find their intersection point.

So in this session, we focus on finding epipolar lines and epipoles. But to find them, we need two
more ingredients, the **Fundamental Matrix (F)** and the **Essential Matrix (E)**. The Essential
Matrix contains the information about translation and rotation, which describe the location of the
second camera relative to the first in global coordinates. See the image below (Image courtesy:
Learning OpenCV by Gary Bradsky):



But we prefer measurements to be done in pixel coordinates, right? The Fundamental Matrix contains
the same information as the Essential Matrix, in addition to information about the intrinsics of
both cameras, so that we can relate the two cameras in pixel coordinates. (If we are using
rectified images and normalize the points by dividing by the focal lengths, \f$F=E\f$.) In simple
words, the Fundamental Matrix F maps a point in one image to a line (epiline) in the other image.
It is calculated from matching points between the two images. A minimum of 8 such points is
required to find the fundamental matrix (when using the 8-point algorithm). More points are
preferred, and RANSAC is used to get a more robust result.

Code
----

So first we need to find as many matches as possible between the two images to find the fundamental
matrix. For this, we use SIFT descriptors with a FLANN based matcher and ratio test.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img1 = cv.imread('myleft.jpg',0)  #queryimage # left image
img2 = cv.imread('myright.jpg',0) #trainimage # right image

sift = cv.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)

# FLANN parameters
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks=50)

flann = cv.FlannBasedMatcher(index_params,search_params)
matches = flann.knnMatch(des1,des2,k=2)

pts1 = []
pts2 = []

# ratio test as per Lowe's paper
for i,(m,n) in enumerate(matches):
    if m.distance < 0.8*n.distance:
        pts2.append(kp2[m.trainIdx].pt)
        pts1.append(kp1[m.queryIdx].pt)
@endcode
Now we have the list of best matches from both the images. Let's find the Fundamental Matrix.
@code{.py}
pts1 = np.int32(pts1)
pts2 = np.int32(pts2)
F, mask = cv.findFundamentalMat(pts1,pts2,cv.FM_LMEDS)

# We select only inlier points
pts1 = pts1[mask.ravel()==1]
pts2 = pts2[mask.ravel()==1]
@endcode
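As a hedged alternative (not in the original tutorial), when there are many matches and outliers,
RANSAC is a common choice in place of LMedS; the reprojection threshold (3.0 px) and confidence
(0.99) below are illustrative values:
@code{.py}
F, mask = cv.findFundamentalMat(pts1, pts2, cv.FM_RANSAC, 3.0, 0.99)
@endcode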
Next we find the epilines. Epilines corresponding to the points in the first image are drawn on the
second image, so passing the correct images is important here. We get an array of lines, so we
define a new function to draw these lines on the images.
@code{.py}
def drawlines(img1,img2,lines,pts1,pts2):
    ''' img1 - image on which we draw the epilines for the points in img2
        lines - corresponding epilines '''
    r,c = img1.shape
    img1 = cv.cvtColor(img1,cv.COLOR_GRAY2BGR)
    img2 = cv.cvtColor(img2,cv.COLOR_GRAY2BGR)
    for r,pt1,pt2 in zip(lines,pts1,pts2):
        color = tuple(np.random.randint(0,255,3).tolist())
        x0,y0 = map(int, [0, -r[2]/r[1] ])
        x1,y1 = map(int, [c, -(r[2]+r[0]*c)/r[1] ])
        img1 = cv.line(img1, (x0,y0), (x1,y1), color,1)
        img1 = cv.circle(img1,tuple(pt1),5,color,-1)
        img2 = cv.circle(img2,tuple(pt2),5,color,-1)
    return img1,img2
@endcode
Now we find the epilines in both the images and draw them.
@code{.py}
# Find epilines corresponding to points in right image (second image) and
# drawing its lines on left image
lines1 = cv.computeCorrespondEpilines(pts2.reshape(-1,1,2), 2,F)
lines1 = lines1.reshape(-1,3)
img5,img6 = drawlines(img1,img2,lines1,pts1,pts2)

# Find epilines corresponding to points in left image (first image) and
# drawing its lines on right image
lines2 = cv.computeCorrespondEpilines(pts1.reshape(-1,1,2), 1,F)
lines2 = lines2.reshape(-1,3)
img3,img4 = drawlines(img2,img1,lines2,pts2,pts1)

plt.subplot(121),plt.imshow(img5)
plt.subplot(122),plt.imshow(img3)
plt.show()
@endcode
Below is the result we get:



You can see in the left image that all the epilines are converging at a point outside the image on
the right side. That meeting point is the epipole.

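As a hedged aside (not part of the tutorial), the epipole can also be read directly off the
fundamental matrix: the epipole in the first image is the right null vector of F (F e = 0), which
the SVD gives us.
@code{.py}
# sketch: locate the epipole in the first image from the F computed above
_, _, Vt = np.linalg.svd(F)
e1 = Vt[-1]          # right null vector of F -> epipole of the first image (homogeneous coords)
e1 = e1 / e1[2]      # normalize to pixel coordinates (x, y, 1)
print(e1[:2])
# the epipole in the second image is the null vector of F.T, obtained the same way
@endcode
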
For better results, images with good resolution and many non-planar points should be used.

Additional Resources
--------------------

Exercises
---------

-# One important topic is the forward movement of the camera. Then the epipoles will be seen at the
   same locations in both images, with epilines emerging from a fixed point. [See this
   discussion](http://answers.opencv.org/question/17912/location-of-epipole/).
-# Fundamental Matrix estimation is sensitive to the quality of matches, outliers etc. It becomes
   worse when all the selected matches lie on the same plane. [Check this
   discussion](http://answers.opencv.org/question/18125/epilines-not-correct/).
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_pose/images/pose_1.jpg  (vendored, new file, 44 KiB)
BIN  3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_pose/images/pose_2.jpg  (vendored, new file, 26 KiB)
3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_pose/py_pose.markdown  (vendored, new file, 127 lines added)
@@ -0,0 +1,127 @@
Pose Estimation {#tutorial_py_pose}
===============

Goal
----

In this section,
- We will learn to exploit the calib3d module to create some 3D effects in images.

Basics
------

This is going to be a small section. During the last session on camera calibration, you found the
camera matrix, distortion coefficients etc. Given a pattern image, we can utilize this information
to calculate its pose, i.e. how the object is situated in space: how it is rotated, how it is
displaced etc. For a planar object, we can assume Z=0, so the problem now becomes how the camera is
placed in space to see our pattern image. So, if we know how the object lies in space, we can draw
some 2D diagrams on it to simulate the 3D effect. Let's see how to do it.

Our problem is: we want to draw our 3D coordinate axes (X, Y, Z axes) on our chessboard's first
corner, with the X axis in blue, the Y axis in green and the Z axis in red. So, in effect, the
Z axis should feel like it is perpendicular to our chessboard plane.

First, let's load the camera matrix and distortion coefficients from the previous calibration
result.
@code{.py}
import numpy as np
import cv2 as cv
import glob

# Load previously saved data
with np.load('B.npz') as X:
    mtx, dist, _, _ = [X[i] for i in ('mtx','dist','rvecs','tvecs')]
@endcode
Now let's create a function, draw, which takes the corners in the chessboard (obtained using
**cv.findChessboardCorners()**) and **axis points** to draw a 3D axis.
@code{.py}
def draw(img, corners, imgpts):
    # cv.line expects integer pixel coordinates, so cast the float corner values
    corner = tuple(corners[0].ravel().astype(int))
    img = cv.line(img, corner, tuple(imgpts[0].ravel().astype(int)), (255,0,0), 5)
    img = cv.line(img, corner, tuple(imgpts[1].ravel().astype(int)), (0,255,0), 5)
    img = cv.line(img, corner, tuple(imgpts[2].ravel().astype(int)), (0,0,255), 5)
    return img
@endcode
Then, as in the previous case, we create the termination criteria, object points (3D points of the
corners in the chessboard) and axis points. Axis points are points in 3D space for drawing the
axis. We draw axes of length 3 (the units will be in terms of chess square size since we calibrated
based on that size). So our X axis is drawn from (0,0,0) to (3,0,0), and likewise for the Y axis.
The Z axis is drawn from (0,0,0) to (0,0,-3); the negative sign denotes that it is drawn towards
the camera.
@code{.py}
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)
objp = np.zeros((6*7,3), np.float32)
objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)

axis = np.float32([[3,0,0], [0,3,0], [0,0,-3]]).reshape(-1,3)
@endcode
Now, as usual, we load each image and search for the 7x6 grid. If found, we refine the corners to
sub-pixel accuracy. Then to calculate the rotation and translation, we use the function
**cv.solvePnP()** (its RANSAC variant, **cv.solvePnPRansac()**, can also be used). Once we have
those rotation and translation vectors, we use them to project our **axis points** onto the image
plane. In simple words, we find the points on the image plane corresponding to each of
(3,0,0),(0,3,0),(0,0,-3) in 3D space. Once we get them, we draw lines from the first corner to each
of these points using our draw() function. Done!
@code{.py}
for fname in glob.glob('left*.jpg'):
    img = cv.imread(fname)
    gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
    ret, corners = cv.findChessboardCorners(gray, (7,6),None)

    if ret == True:
        corners2 = cv.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)

        # Find the rotation and translation vectors.
        ret,rvecs, tvecs = cv.solvePnP(objp, corners2, mtx, dist)

        # project 3D points to image plane
        imgpts, jac = cv.projectPoints(axis, rvecs, tvecs, mtx, dist)

        img = draw(img,corners2,imgpts)
        cv.imshow('img',img)
        k = cv.waitKey(0) & 0xFF
        if k == ord('s'):
            cv.imwrite(fname[:6]+'.png', img)

cv.destroyAllWindows()
@endcode
See some results below. Notice that each axis is 3 squares long.



### Render a Cube

If you want to draw a cube, modify the draw() function and the axis points as follows.

Modified draw() function:
@code{.py}
def draw(img, corners, imgpts):
    imgpts = np.int32(imgpts).reshape(-1,2)

    # draw ground floor in green
    img = cv.drawContours(img, [imgpts[:4]],-1,(0,255,0),-3)

    # draw pillars in blue color
    for i,j in zip(range(4),range(4,8)):
        img = cv.line(img, tuple(imgpts[i]), tuple(imgpts[j]),(255),3)

    # draw top layer in red color
    img = cv.drawContours(img, [imgpts[4:]],-1,(0,0,255),3)

    return img
@endcode
Modified axis points. They are the 8 corners of a cube in 3D space:
@code{.py}
axis = np.float32([[0,0,0], [0,3,0], [3,3,0], [3,0,0],
                   [0,0,-3],[0,3,-3],[3,3,-3],[3,0,-3] ])
@endcode
And look at the result below:



If you are interested in graphics, augmented reality etc., you can use OpenGL to render more
complicated figures.

Additional Resources
--------------------

Exercises
---------
3rdparty/opencv-4.5.4/doc/py_tutorials/py_calib3d/py_table_of_contents_calib3d.markdown  (vendored, new file, 22 lines added)
@@ -0,0 +1,22 @@
Camera Calibration and 3D Reconstruction {#tutorial_py_table_of_contents_calib3d}
========================================

- @subpage tutorial_py_calibration

    Let's find out how good our camera is. Is there any distortion in the images taken with it? If
    so, how do we correct it?

- @subpage tutorial_py_pose

    This is a small section which will help you create some cool 3D effects with the calib3d
    module.

- @subpage tutorial_py_epipolar_geometry

    Let's understand epipolar geometry and the epipolar constraint.

- @subpage tutorial_py_depthmap

    Extract depth information from 2D images.