feat: switch backend to PaddleOCR-NCNN; switch project to CMake

1. Migrated the project backend to the PaddleOCR-NCNN algorithm; it has passed basic compatibility tests.
2. The project is now organized with CMake; to better accommodate third-party libraries, a QMake project will no longer be provided.
3. Reorganized the rights/license statement files and the code tree to minimize the risk of infringement.

Log: switch backend to PaddleOCR-NCNN; switch project to CMake
Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
wangzhengyang
2022-05-10 09:54:44 +08:00
parent ecdd171c6f
commit 718c41634f
10018 changed files with 3593797 additions and 186748 deletions

Camera Calibration {#tutorial_py_calibration}
==================
Goal
----
In this section, we will learn about

- types of distortion caused by cameras
- how to find the intrinsic and extrinsic properties of a camera
- how to undistort images based on these properties
Basics
------
Some pinhole cameras introduce significant distortion to images. Two major kinds of distortion are
radial distortion and tangential distortion.
Radial distortion causes straight lines to appear curved. Radial distortion becomes larger the farther points are from
the center of the image. For example, one image is shown below in which two edges of a chess board are
marked with red lines. But, you can see that the border of the chess board is not a straight line and doesn't match with the
red line. All the expected straight lines are bulged out. Visit [Distortion
(optics)](http://en.wikipedia.org/wiki/Distortion_%28optics%29) for more details.
![image](images/calib_radial.jpg)
Radial distortion can be represented as follows:
\f[x_{distorted} = x( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \\
y_{distorted} = y( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6)\f]
Similarly, tangential distortion occurs because the image-taking lens
is not aligned perfectly parallel to the imaging plane. So, some areas in the image may look nearer than
expected. The amount of tangential distortion can be represented as below:
\f[x_{distorted} = x + [ 2p_1xy + p_2(r^2+2x^2)] \\
y_{distorted} = y + [ p_1(r^2+ 2y^2)+ 2p_2xy]\f]
In short, we need to find five parameters, known as distortion coefficients given by:
\f[Distortion \; coefficients=(k_1 \hspace{10pt} k_2 \hspace{10pt} p_1 \hspace{10pt} p_2 \hspace{10pt} k_3)\f]
In addition to this, we need some other information, like the intrinsic and extrinsic parameters
of the camera. Intrinsic parameters are specific to a camera. They include information like focal
length (\f$f_x,f_y\f$) and optical centers (\f$c_x, c_y\f$). The focal length and optical centers can be used to create a camera matrix, which can be used to remove distortion due to the lenses of a specific camera. The camera matrix is unique to a specific camera, so once calculated, it can be reused on other images taken by the same camera. It is expressed as a 3x3
matrix:
\f[camera \; matrix = \left [ \begin{matrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{matrix} \right ]\f]
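As an illustration of their shapes, the camera matrix is 3x3 and the distortion coefficients are ordered \f$(k_1, k_2, p_1, p_2, k_3)\f$. The numeric values below are assumed examples only; real values come out of **cv.calibrateCamera()** later in this tutorial.
@code{.py}
import numpy as np
# Assumed example values -- for illustration of the shapes only.
fx, fy = 800.0, 800.0   # focal lengths in pixels
cx, cy = 320.0, 240.0   # optical center in pixels
camera_matrix = np.array([[fx, 0.0, cx],
                          [0.0, fy, cy],
                          [0.0, 0.0, 1.0]])
# Distortion coefficients in OpenCV order (k1, k2, p1, p2, k3)
dist_coeffs = np.array([-0.29, 0.11, 0.001, 0.0002, 0.0])
@endcode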
Extrinsic parameters correspond to the rotation and translation vectors which translate the coordinates of
a 3D point into the camera's coordinate system.
For stereo applications, these distortions need to be corrected first. To find these parameters,
we must provide some sample images of a well defined pattern (e.g. a chess board). We
find some specific points of which we already know the relative positions (e.g. square corners in the chess board). We know the coordinates of these points in real world space and we know the coordinates in the image, so we can solve for the distortion coefficients. For better results, we need at least 10 test patterns.
Code
----
As mentioned above, we need at least 10 test patterns for camera calibration. OpenCV comes with some
images of a chess board (see samples/data/left01.jpg -- left14.jpg), so we will utilize these. Consider an image of a chess board. The important input data needed for calibration of the camera
is the set of 3D real world points and the corresponding 2D coordinates of these points in the image. The 2D
image points are easy to find from the image. (These image points are the locations where two black
squares touch each other on the chess board.)
What about the 3D points from real world space? Those images are taken from a static camera and
chess boards are placed at different locations and orientations. So we need to know \f$(X,Y,Z)\f$
values. But for simplicity, we can say the chess board was kept stationary on the XY plane (so Z=0 always)
and the camera was moved accordingly. This consideration helps us to find only the X,Y values. Now for X,Y
values, we can simply pass the points as (0,0), (1,0), (2,0), ... which denotes the location of
points. In this case, the results we get will be in the scale of size of chess board square. But if
we know the square size, (say 30 mm), we can pass the values as (0,0), (30,0), (60,0), ... . Thus, we get
the results in mm. (In this case, we don't know the square size since we didn't take those images ourselves,
so we pass the points in terms of square size.)
3D points are called **object points** and 2D image points are called **image points.**
### Setup
So to find the pattern in the chess board, we can use the function **cv.findChessboardCorners()**. We also
need to pass what kind of pattern we are looking for, like an 8x8 grid, 5x5 grid etc. In this example, we
use a 7x6 grid. (Normally a chess board has 8x8 squares and 7x7 internal corners.) It returns the
corner points and retval, which will be True if the pattern is found. These corners will be placed in
order (from left-to-right, top-to-bottom).
@note This function may not be able to find the required pattern in all the images. So, one good option
is to write the code such that it starts the camera and checks each frame for the required pattern. Once
the pattern is found, find the corners and store them in a list. Also, provide some interval before
reading the next frame so that we can adjust our chess board in a different direction. Continue this
process until the required number of good patterns is obtained. Even in the example provided here, we
are not sure how many of the 14 given images are good. Thus, we must read all the images and take only the good
ones.
@note Instead of chess board, we can alternatively use a circular grid. In this case, we must use the function
**cv.findCirclesGrid()** to find the pattern. Fewer images are sufficient to perform camera calibration using a circular grid.
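For illustration only, a minimal call might look like the sketch below; the 4x11 asymmetric grid size and the flag are assumed example values, and `gray` stands for a grayscale input image:
@code{.py}
# A hedged sketch, not part of the tutorial pipeline below.
ret, centers = cv.findCirclesGrid(gray, (4, 11), flags=cv.CALIB_CB_ASYMMETRIC_GRID)
# When ret is True, 'centers' plays the same role as the chessboard corners.
@endcode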
Once we find the corners, we can increase their accuracy using **cv.cornerSubPix()**. We can also
draw the pattern using **cv.drawChessboardCorners()**. All these steps are included in below code:
@code{.py}
import numpy as np
import cv2 as cv
import glob
# termination criteria
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
objp = np.zeros((6*7,3), np.float32)
objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.
images = glob.glob('*.jpg')
for fname in images:
    img = cv.imread(fname)
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    # Find the chess board corners
    ret, corners = cv.findChessboardCorners(gray, (7,6), None)
    # If found, add object points, image points (after refining them)
    if ret == True:
        objpoints.append(objp)
        corners2 = cv.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)
        imgpoints.append(corners2)
        # Draw and display the corners
        cv.drawChessboardCorners(img, (7,6), corners2, ret)
        cv.imshow('img', img)
        cv.waitKey(500)
cv.destroyAllWindows()
@endcode
One image with pattern drawn on it is shown below:
![image](images/calib_pattern.jpg)
### Calibration
Now that we have our object points and image points, we are ready to go for calibration. We can
use the function, **cv.calibrateCamera()** which returns the camera matrix, distortion coefficients,
rotation and translation vectors etc.
@code{.py}
ret, mtx, dist, rvecs, tvecs = cv.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
@endcode
### Undistortion
Now, we can take an image and undistort it. OpenCV comes with two
methods for doing this. However first, we can refine the camera matrix based on a free scaling
parameter using **cv.getOptimalNewCameraMatrix()**. If the scaling parameter alpha=0, it returns
undistorted image with minimum unwanted pixels. So it may even remove some pixels at image corners.
If alpha=1, all pixels are retained, possibly with some extra black pixels. This function also returns an image ROI which
can be used to crop the result.
So, we take a new image (left12.jpg in this case; that is the first image in this chapter):
@code{.py}
img = cv.imread('left12.jpg')
h, w = img.shape[:2]
newcameramtx, roi = cv.getOptimalNewCameraMatrix(mtx, dist, (w,h), 1, (w,h))
@endcode
#### 1. Using **cv.undistort()**
This is the easiest way. Just call the function and use the ROI obtained above to crop the result.
@code{.py}
# undistort
dst = cv.undistort(img, mtx, dist, None, newcameramtx)
# crop the image
x, y, w, h = roi
dst = dst[y:y+h, x:x+w]
cv.imwrite('calibresult.png', dst)
@endcode
#### 2. Using **remapping**
This way is a little bit more difficult. First, find a mapping function from the distorted image to the undistorted image. Then
use the remap function.
@code{.py}
# undistort
mapx, mapy = cv.initUndistortRectifyMap(mtx, dist, None, newcameramtx, (w,h), 5)
dst = cv.remap(img, mapx, mapy, cv.INTER_LINEAR)
# crop the image
x, y, w, h = roi
dst = dst[y:y+h, x:x+w]
cv.imwrite('calibresult.png', dst)
@endcode
Both methods give the same result. See the result below:
![image](images/calib_result.jpg)
You can see in the result that all the edges are straight.
Now you can store the camera matrix and distortion coefficients using write functions in NumPy
(np.savez, np.savetxt etc) for future use.
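A minimal sketch of saving and reloading the results (the file name calib_data.npz is just an example):
@code{.py}
# Save the calibration results (example file name)
np.savez('calib_data.npz', mtx=mtx, dist=dist, rvecs=rvecs, tvecs=tvecs)
# Load them back later
with np.load('calib_data.npz') as data:
    mtx, dist = data['mtx'], data['dist']
@endcode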
Re-projection Error
-------------------
Re-projection error gives a good estimation of just how exact the found parameters are. The closer the re-projection error is to zero, the more accurate the parameters we found are. Given the intrinsic, distortion, rotation and translation matrices,
we must first transform the object point to image point using **cv.projectPoints()**. Then, we can calculate
the absolute norm between what we got with our transformation and the corner finding algorithm. To
find the average error, we calculate the arithmetical mean of the errors calculated for all the
calibration images.
@code{.py}
mean_error = 0
for i in range(len(objpoints)):
    imgpoints2, _ = cv.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist)
    error = cv.norm(imgpoints[i], imgpoints2, cv.NORM_L2)/len(imgpoints2)
    mean_error += error
print( "total error: {}".format(mean_error/len(objpoints)) )
@endcode
Additional Resources
--------------------
Exercises
---------
-# Try camera calibration with circular grid.

Depth Map from Stereo Images {#tutorial_py_depthmap}
============================
Goal
----
In this session,
- We will learn to create a depth map from stereo images.
Basics
------
In the last session, we saw basic concepts like epipolar constraints and other related terms. We also
saw that if we have two images of the same scene, we can get depth information from them in an intuitive
way. Below is an image and some simple mathematical formulas which prove that intuition. (Image
Courtesy :
![image](images/stereo_depth.jpg)
The above diagram contains similar triangles. Writing their equations yields the
following result:
\f[disparity = x - x' = \frac{Bf}{Z}\f]
\f$x\f$ and \f$x'\f$ are the distances between the points in the image plane corresponding to the 3D scene point and
their camera centers. \f$B\f$ is the distance between the two cameras (which we know) and \f$f\f$ is the focal
length of the camera (already known). So in short, the above equation says that the depth of a point in a
scene is inversely proportional to the difference in distance of the corresponding image points from
their camera centers. So with this information, we can derive the depth of all pixels in an image.
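As a small illustration of this relationship, depth can be recovered as \f$Z = Bf / disparity\f$. The baseline, focal length and disparity values below are assumed example numbers, not taken from the tutorial images:
@code{.py}
import numpy as np
B = 0.1    # assumed baseline between the cameras, in metres
f = 700.0  # assumed focal length, in pixels
# a tiny synthetic disparity map, in pixels (example values)
disparity = np.array([[10.0, 20.0],
                      [40.0, 80.0]], dtype=np.float32)
# Z = B*f / disparity; guard against invalid (zero) disparities
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = (B * f) / disparity[valid]
print(depth)   # larger disparity -> smaller depth
@endcode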
To compute the disparity, we need corresponding matches between the two images. We have already seen how the epipolar constraint
makes this operation faster and more accurate. Once matches are found, the disparity is computed. Let's see
how we can do it with OpenCV.
Code
----
Below code snippet shows a simple procedure to create a disparity map.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
imgL = cv.imread('tsukuba_l.png',0)
imgR = cv.imread('tsukuba_r.png',0)
stereo = cv.StereoBM_create(numDisparities=16, blockSize=15)
disparity = stereo.compute(imgL,imgR)
plt.imshow(disparity,'gray')
plt.show()
@endcode
The image below contains the original image (left) and its disparity map (right). As you can see, the result
is contaminated with a high degree of noise. By adjusting the values of numDisparities and blockSize,
you can get a better result.
![image](images/disparity_map.jpg)
As you become familiar with StereoBM, you may need to fine-tune some of its parameters to get better and smoother results. The main parameters are listed below, and a short tuning sketch follows the list:
- texture_threshold: filters out areas that don't have enough texture for reliable matching
- Speckle range and size: Block-based matchers often produce "speckles" near the boundaries of objects, where the matching window catches the foreground on one side and the background on the other. In this scene it appears that the matcher is also finding small spurious matches in the projected texture on the table. To get rid of these artifacts we post-process the disparity image with a speckle filter controlled by the speckle_size and speckle_range parameters. speckle_size is the number of pixels below which a disparity blob is dismissed as "speckle." speckle_range controls how close in value disparities must be to be considered part of the same blob.
- Number of disparities: How many pixels to slide the window over. The larger it is, the larger the range of visible depths, but more computation is required.
- min_disparity: the offset from the x-position of the left pixel at which to begin searching.
- uniqueness_ratio: Another post-filtering step. If the best matching disparity is not sufficiently better than every other disparity in the search range, the pixel is filtered out. You can try tweaking this if texture_threshold and the speckle filtering are still letting through spurious matches.
- prefilter_size and prefilter_cap: The pre-filtering phase, which normalizes image brightness and enhances texture in preparation for block matching. Normally you should not need to adjust these.
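The values below are just example starting points, not recommendations; `imgL` and `imgR` are the images loaded in the code above:
@code{.py}
# Example starting values only -- tune them for your own image pair.
stereo = cv.StereoBM_create(numDisparities=64, blockSize=15)
stereo.setMinDisparity(0)            # min_disparity
stereo.setTextureThreshold(10)       # texture_threshold
stereo.setUniquenessRatio(10)        # uniqueness_ratio
stereo.setSpeckleWindowSize(100)     # speckle_size
stereo.setSpeckleRange(32)           # speckle_range
stereo.setPreFilterSize(9)           # prefilter_size (odd value)
stereo.setPreFilterCap(31)           # prefilter_cap
disparity = stereo.compute(imgL, imgR)
@endcode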
Additional Resources
--------------------
- [Ros stereo img processing wiki page](http://wiki.ros.org/stereo_image_proc/Tutorials/ChoosingGoodStereoParameters)
Exercises
---------
-# OpenCV samples contain an example of generating disparity map and its 3D reconstruction. Check
stereo_match.py in OpenCV-Python samples.

Epipolar Geometry {#tutorial_py_epipolar_geometry}
=================
Goal
----
In this section,
- We will learn about the basics of multiview geometry
- We will see what epipoles, epipolar lines, the epipolar constraint etc. are
Basic Concepts
--------------
When we take an image using a pin-hole camera, we lose an important piece of information, i.e. the depth of the
image, or how far each point in the image is from the camera, because it is a 3D-to-2D conversion. So
it is an important question whether we can recover the depth information using these cameras. And the
answer is to use more than one camera. Our eyes work in a similar way: we use two cameras (two
eyes), which is called stereo vision. So let's see what OpenCV provides in this field.
(*Learning OpenCV* by Gary Bradsky has a lot of information in this field.)
Before going to depth images, let's first understand some basic concepts in multiview geometry. In
this section we will deal with epipolar geometry. See the image below which shows a basic setup with
two cameras taking the image of same scene.
![image](images/epipolar.jpg)
If we are using only the left camera, we can't find the 3D point corresponding to the point \f$x\f$ in the
image because every point on the line \f$OX\f$ projects to the same point on the image plane. But
consider the right image also. Now different points on the line \f$OX\f$ project to different points
(\f$x'\f$) in the right plane. So with these two images, we can triangulate the correct 3D point. This is
the whole idea.
The projections of the different points on \f$OX\f$ form a line on the right plane (line \f$l'\f$). We call it the
**epiline** corresponding to the point \f$x\f$. It means, to find the point \f$x\f$ on the right image,
search along this epiline. It should be somewhere on this line. (Think of it this way: to find the
matching point in the other image, you need not search the whole image, just search along the epiline.
So it provides better performance and accuracy.) This is called the **Epipolar Constraint**. Similarly,
all points will have their corresponding epilines in the other image. The plane \f$XOO'\f$ is called the
**Epipolar Plane**.
\f$O\f$ and \f$O'\f$ are the camera centers. From the setup given above, you can see that the projection of the
right camera \f$O'\f$ is seen on the left image at the point \f$e\f$. It is called the **epipole**. The epipole
is the point of intersection of the line through the camera centers with the image plane. Similarly, \f$e'\f$ is
the epipole of the left camera. In some cases, you won't be able to locate the epipole in the image;
it may be outside the image (which means one camera doesn't see the other).
All the epilines pass through the epipole. So to find the location of the epipole, we can find many
epilines and find their intersection point.
So in this session, we focus on finding epipolar lines and epipoles. But to find them, we need two
more ingredients, **Fundamental Matrix (F)** and **Essential Matrix (E)**. Essential Matrix contains
the information about translation and rotation, which describe the location of the second camera
relative to the first in global coordinates. See the image below (Image courtesy: Learning OpenCV by
Gary Bradsky):
![image](images/essential_matrix.jpg)
But we prefer measurements to be done in pixel coordinates, right? Fundamental Matrix contains the
same information as Essential Matrix in addition to the information about the intrinsics of both
cameras so that we can relate the two cameras in pixel coordinates. (If we are using rectified
images and normalize the point by dividing by the focal lengths, \f$F=E\f$). In simple words,
Fundamental Matrix F, maps a point in one image to a line (epiline) in the other image. This is
calculated from matching points from both the images. A minimum of 8 such points are required to
find the fundamental matrix (when using the 8-point algorithm). More points are preferred, and RANSAC is
used to get a more robust result.
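For reference, the two matrices mentioned above are related through the camera intrinsics: with \f$K\f$ and \f$K'\f$ denoting the intrinsic matrices of the first and second camera, \f$E = K'^T F K\f$ (up to scale).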
Code
----
So first we need to find as many matches as possible between the two images to find the fundamental matrix.
For this, we use SIFT descriptors with a FLANN-based matcher and the ratio test.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img1 = cv.imread('myleft.jpg',0) #queryimage # left image
img2 = cv.imread('myright.jpg',0) #trainimage # right image
sift = cv.SIFT_create()
# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)
# FLANN parameters
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks=50)
flann = cv.FlannBasedMatcher(index_params,search_params)
matches = flann.knnMatch(des1,des2,k=2)
pts1 = []
pts2 = []
# ratio test as per Lowe's paper
for i,(m,n) in enumerate(matches):
    if m.distance < 0.8*n.distance:
        pts2.append(kp2[m.trainIdx].pt)
        pts1.append(kp1[m.queryIdx].pt)
@endcode
Now we have the list of best matches from both the images. Let's find the Fundamental Matrix.
@code{.py}
pts1 = np.int32(pts1)
pts2 = np.int32(pts2)
F, mask = cv.findFundamentalMat(pts1,pts2,cv.FM_LMEDS)
# We select only inlier points
pts1 = pts1[mask.ravel()==1]
pts2 = pts2[mask.ravel()==1]
@endcode
Next we find the epilines. Epilines corresponding to the points in the first image are drawn on the second
image, so passing the correct images matters here. We get an array of lines, so we define a
new function to draw these lines on the images.
@code{.py}
def drawlines(img1,img2,lines,pts1,pts2):
    ''' img1 - image on which we draw the epilines for the points in img2
        lines - corresponding epilines '''
    r,c = img1.shape
    img1 = cv.cvtColor(img1,cv.COLOR_GRAY2BGR)
    img2 = cv.cvtColor(img2,cv.COLOR_GRAY2BGR)
    for r,pt1,pt2 in zip(lines,pts1,pts2):
        color = tuple(np.random.randint(0,255,3).tolist())
        x0,y0 = map(int, [0, -r[2]/r[1] ])
        x1,y1 = map(int, [c, -(r[2]+r[0]*c)/r[1] ])
        img1 = cv.line(img1, (x0,y0), (x1,y1), color,1)
        img1 = cv.circle(img1,tuple(pt1),5,color,-1)
        img2 = cv.circle(img2,tuple(pt2),5,color,-1)
    return img1,img2
@endcode
Now we find the epilines in both the images and draw them.
@code{.py}
# Find epilines corresponding to points in right image (second image) and
# drawing its lines on left image
lines1 = cv.computeCorrespondEpilines(pts2.reshape(-1,1,2), 2,F)
lines1 = lines1.reshape(-1,3)
img5,img6 = drawlines(img1,img2,lines1,pts1,pts2)
# Find epilines corresponding to points in left image (first image) and
# drawing its lines on right image
lines2 = cv.computeCorrespondEpilines(pts1.reshape(-1,1,2), 1,F)
lines2 = lines2.reshape(-1,3)
img3,img4 = drawlines(img2,img1,lines2,pts2,pts1)
plt.subplot(121),plt.imshow(img5)
plt.subplot(122),plt.imshow(img3)
plt.show()
@endcode
Below is the result we get:
![image](images/epiresult.jpg)
You can see in the left image that all epilines converge at a point outside the image on the right
side. That meeting point is the epipole.
For better results, images with good resolution and many non-planar points should be used.
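The epipole can also be computed directly from the fundamental matrix found above, since it is the null vector of F. A minimal sketch using NumPy's SVD:
@code{.py}
# The left-image epipole e satisfies F @ e = 0, so it is the right singular
# vector of F associated with the smallest singular value.
U, S, Vt = np.linalg.svd(F)
e = Vt[-1]
e = e / e[2]   # normalize homogeneous coordinates (assumes the epipole is not at infinity)
print("epipole in the left image:", e[:2])
@endcode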
Additional Resources
--------------------
Exercises
---------
-# One important topic is the forward movement of the camera. In that case, epipoles will be seen at the same
locations in both images, with epilines emerging from a fixed point. [See this
discussion](http://answers.opencv.org/question/17912/location-of-epipole/).
-# Fundamental Matrix estimation is sensitive to the quality of matches, outliers etc. It becomes worse
when all selected matches lie on the same plane. [Check this
discussion](http://answers.opencv.org/question/18125/epilines-not-correct/).

Pose Estimation {#tutorial_py_pose}
===============
Goal
----
In this section,
- We will learn to exploit the calib3d module to create some 3D effects in images.
Basics
------
This is going to be a small section. During the last session on camera calibration, you have found
the camera matrix, distortion coefficients etc. Given a pattern image, we can utilize the above
information to calculate its pose, or how the object is situated in space, like how it is rotated,
how it is displaced etc. For a planar object, we can assume Z=0, so that the problem now becomes
how the camera is placed in space to see our pattern image. So, if we know how the object lies in
space, we can draw some 2D diagrams on it to simulate the 3D effect. Let's see how to do it.
Our problem is, we want to draw our 3D coordinate axis (X, Y, Z axes) on our chessboard's first
corner. X axis in blue color, Y axis in green color and Z axis in red color. So in-effect, Z axis
should feel like it is perpendicular to our chessboard plane.
First, let's load the camera matrix and distortion coefficients from the previous calibration
result.
@code{.py}
import numpy as np
import cv2 as cv
import glob
# Load previously saved data
with np.load('B.npz') as X:
    mtx, dist, _, _ = [X[i] for i in ('mtx','dist','rvecs','tvecs')]
@endcode
Now let's create a function, draw(), which takes the corners in the chessboard (obtained using
**cv.findChessboardCorners()**) and **axis points** to draw a 3D axis.
@code{.py}
def draw(img, corners, imgpts):
    corner = tuple(corners[0].ravel())
    img = cv.line(img, corner, tuple(imgpts[0].ravel()), (255,0,0), 5)
    img = cv.line(img, corner, tuple(imgpts[1].ravel()), (0,255,0), 5)
    img = cv.line(img, corner, tuple(imgpts[2].ravel()), (0,0,255), 5)
    return img
@endcode
Then as in previous case, we create termination criteria, object points (3D points of corners in
chessboard) and axis points. Axis points are points in 3D space for drawing the axis. We draw axis
of length 3 (units will be in terms of chess square size since we calibrated based on that size). So
our X axis is drawn from (0,0,0) to (3,0,0), and similarly for the Y axis. The Z axis is drawn from (0,0,0) to
(0,0,-3). The negative sign denotes that it is drawn towards the camera.
@code{.py}
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)
objp = np.zeros((6*7,3), np.float32)
objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)
axis = np.float32([[3,0,0], [0,3,0], [0,0,-3]]).reshape(-1,3)
@endcode
Now, as usual, we load each image, search for the 7x6 grid, and if found, refine the corners with
sub-pixel accuracy. Then, to calculate the rotation and translation, we use the function
**cv.solvePnP()** (or its robust variant, **cv.solvePnPRansac()**). Once we have those transformation vectors, we use them to project our **axis
points** onto the image plane. In simple words, we find the points on the image plane corresponding to
each of (3,0,0), (0,3,0), (0,0,-3) in 3D space. Once we get them, we draw lines from the first corner
to each of these points using our draw() function. Done!
@code{.py}
for fname in glob.glob('left*.jpg'):
    img = cv.imread(fname)
    gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
    ret, corners = cv.findChessboardCorners(gray, (7,6),None)
    if ret == True:
        corners2 = cv.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)
        # Find the rotation and translation vectors.
        ret,rvecs, tvecs = cv.solvePnP(objp, corners2, mtx, dist)
        # project 3D points to image plane
        imgpts, jac = cv.projectPoints(axis, rvecs, tvecs, mtx, dist)
        img = draw(img,corners2,imgpts)
        cv.imshow('img',img)
        k = cv.waitKey(0) & 0xFF
        if k == ord('s'):
            cv.imwrite(fname[:6]+'.png', img)
cv.destroyAllWindows()
@endcode
See some results below. Notice that each axis is 3 squares long:
![image](images/pose_1.jpg)
### Render a Cube
If you want to draw a cube, modify the draw() function and axis points as follows.
Modified draw() function:
@code{.py}
def draw(img, corners, imgpts):
    imgpts = np.int32(imgpts).reshape(-1,2)
    # draw ground floor in green
    img = cv.drawContours(img, [imgpts[:4]],-1,(0,255,0),-3)
    # draw pillars in blue color
    for i,j in zip(range(4),range(4,8)):
        img = cv.line(img, tuple(imgpts[i]), tuple(imgpts[j]),(255),3)
    # draw top layer in red color
    img = cv.drawContours(img, [imgpts[4:]],-1,(0,0,255),3)
    return img
@endcode
Modified axis points. They are the 8 corners of a cube in 3D space:
@code{.py}
axis = np.float32([[0,0,0], [0,3,0], [3,3,0], [3,0,0],
                   [0,0,-3],[0,3,-3],[3,3,-3],[3,0,-3] ])
@endcode
And look at the result below:
![image](images/pose_2.jpg)
If you are interested in graphics, augmented reality etc, you can use OpenGL to render more
complicated figures.
Additional Resources
--------------------
Exercises
---------

Camera Calibration and 3D Reconstruction {#tutorial_py_table_of_contents_calib3d}
========================================
- @subpage tutorial_py_calibration

    Let's find how good is our camera. Is there any distortion in images taken with it? If so, how to correct it?

- @subpage tutorial_py_pose

    This is a small section which will help you to create some cool 3D effects with the calib module.

- @subpage tutorial_py_epipolar_geometry

    Let's understand epipolar geometry and the epipolar constraint.

- @subpage tutorial_py_depthmap

    Extract depth information from 2D images.