feat: switch backend to PaddleOCR-NCNN, switch project to CMake

1. Migrate the project backend to the PaddleOCR-NCNN algorithm; it has passed basic compatibility testing.
2. Reorganize the project with CMake; to better accommodate third-party libraries, a QMake project will no longer be provided.
3. Overhaul the rights/licensing files and the code tree to minimize the risk of infringement.

Log: switch backend to PaddleOCR-NCNN, switch project to CMake
Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
wangzhengyang
2022-05-10 09:54:44 +08:00
parent ecdd171c6f
commit 718c41634f
10018 changed files with 3593797 additions and 186748 deletions

@@ -0,0 +1,175 @@
High Dynamic Range (HDR) {#tutorial_py_hdr}
========================
Goal
----
In this chapter, we will
- Learn how to generate and display an HDR image from an exposure sequence.
- Use exposure fusion to merge an exposure sequence.
Theory
------
High-dynamic-range imaging (HDRI or HDR) is a technique used in imaging and photography to reproduce
a greater dynamic range of luminosity than is possible with standard digital imaging or photographic
techniques. While the human eye can adjust to a wide range of light conditions, most imaging devices use 8 bits
per channel, so we are limited to only 256 levels. When we take photographs of a real-world
scene, bright regions may be overexposed, while the dark ones may be underexposed, so we
can't capture all details using a single exposure. HDR imaging works with images that use more
than 8 bits per channel (usually 32-bit float values), allowing a much wider dynamic range.
There are different ways to obtain HDR images, but the most common one is to use photographs of
the scene taken with different exposure values. To combine these exposures it is useful to know your
camera's response function, and there are algorithms to estimate it. After the HDR image has been
merged, it has to be converted back to 8-bit to view it on usual displays. This process is called
tonemapping. Additional complexities arise when objects of the scene or camera move between shots,
since images with different exposures should be registered and aligned.
In this tutorial we show 2 algorithms (Debevec, Robertson) to generate and display an HDR image from an
exposure sequence, and demonstrate an alternative approach called exposure fusion (Mertens), which
produces a low dynamic range image and does not need the exposure time data.
Furthermore, we estimate the camera response function (CRF) which is of great value for many computer
vision algorithms.
Each step of the HDR pipeline can be implemented using different algorithms and parameters, so take a
look at the reference manual to see them all.
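For example, if the shots are taken hand-held, they should be registered before merging, as noted
above. A minimal sketch of one way to do this, using OpenCV's Median Threshold Bitmap alignment (the
filenames are the same placeholders used in the loading step below):
@code{.py}
import cv2 as cv

# Load the bracketed exposures (placeholder filenames)
img_fn = ["img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg"]
img_list = [cv.imread(fn) for fn in img_fn]

# MTB alignment shifts the images so that their median-threshold
# bitmaps line up; the list is modified in place
align_mtb = cv.createAlignMTB()
align_mtb.process(img_list, img_list)
@endcode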
Exposure sequence HDR
---------------------
In this tutorial we will look at the following scene, where we have 4 exposure
images, with exposure times of: 15, 2.5, 1/4 and 1/30 seconds. (You can download
the images from [Wikipedia](https://en.wikipedia.org/wiki/High-dynamic-range_imaging))
![image](images/exposures.jpg)
### 1. Loading exposure images into a list
The first stage is simply loading all images into a list.
In addition, we will need the exposure times for the regular HDR algorithms.
Pay attention to the data types: the images should be 1-channel or 3-channel
8-bit (np.uint8), and the exposure times need to be float32, in seconds.
@code{.py}
import cv2 as cv
import numpy as np
# Loading exposure images into a list
img_fn = ["img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg"]
img_list = [cv.imread(fn) for fn in img_fn]
exposure_times = np.array([15.0, 2.5, 0.25, 0.0333], dtype=np.float32)
@endcode
### 2. Merge exposures into HDR image
In this stage we merge the exposure sequence into one HDR image, showing 2 possibilities
which we have in OpenCV. The first method is Debevec and the second one is Robertson.
Notice that the HDR image is of type float32, and not uint8, as it contains the
full dynamic range of all exposure images.
@code{.py}
# Merge exposures to HDR image
merge_debevec = cv.createMergeDebevec()
hdr_debevec = merge_debevec.process(img_list, times=exposure_times.copy())
merge_robertson = cv.createMergeRobertson()
hdr_robertson = merge_robertson.process(img_list, times=exposure_times.copy())
@endcode
### 3. Tonemap HDR image
We map the 32-bit float HDR data into the range [0..1].
Actually, in some cases the values can be larger than 1 or lower than 0, so notice
we will later have to clip the data in order to avoid overflow.
@code{.py}
# Tonemap the HDR images into the [0..1] range
tonemap1 = cv.createTonemap(gamma=2.2)
res_debevec = tonemap1.process(hdr_debevec.copy())
res_robertson = tonemap1.process(hdr_robertson.copy())
@endcode
### 4. Merge exposures using Mertens fusion
Here we show an alternative algorithm to merge the exposure images, where
we do not need the exposure times. We also do not need to use any tonemap
algorithm because the Mertens algorithm already gives us the result in the
range of [0..1].
@code{.py}
# Exposure fusion using Mertens
merge_mertens = cv.createMergeMertens()
res_mertens = merge_mertens.process(img_list)
@endcode
### 5. Convert to 8-bit and save
In order to save or display the results, we need to convert the data into 8-bit
integers in the range of [0..255].
@code{.py}
# Convert datatype to 8-bit and save
res_debevec_8bit = np.clip(res_debevec*255, 0, 255).astype('uint8')
res_robertson_8bit = np.clip(res_robertson*255, 0, 255).astype('uint8')
res_mertens_8bit = np.clip(res_mertens*255, 0, 255).astype('uint8')
cv.imwrite("ldr_debevec.jpg", res_debevec_8bit)
cv.imwrite("ldr_robertson.jpg", res_robertson_8bit)
cv.imwrite("fusion_mertens.jpg", res_mertens_8bit)
@endcode
Results
-------
You can see the different results, but consider that each algorithm has additional
parameters that you should tune to get your desired outcome. The best practice is
to try the different methods and see which one performs best for your scene.
### Debevec:
![image](images/ldr_debevec.jpg)
### Robertson:
![image](images/ldr_robertson.jpg)
### Mertens Fusion:
![image](images/fusion_mertens.jpg)
Estimating Camera Response Function
-----------------------------------
The camera response function (CRF) gives us the connection between the scene radiance
and the measured intensity values. The CRF is of great importance in some computer vision
algorithms, including HDR algorithms. Here we estimate the inverse camera response
function and use it for the HDR merge.
@code{.py}
# Estimate camera response function (CRF)
cal_debevec = cv.createCalibrateDebevec()
crf_debevec = cal_debevec.process(img_list, times=exposure_times)
hdr_debevec = merge_debevec.process(img_list, times=exposure_times.copy(), response=crf_debevec.copy())
cal_robertson = cv.createCalibrateRobertson()
crf_robertson = cal_robertson.process(img_list, times=exposure_times)
hdr_robertson = merge_robertson.process(img_list, times=exposure_times.copy(), response=crf_robertson.copy())
@endcode
The camera response function is represented by a 256-length vector for each color channel.
For this sequence we got the following estimation:
![image](images/crf.jpg)
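As a quick sanity check, here is a sketch that plots the estimated response, assuming crf_debevec
from the snippet above (CalibrateDebevec returns a 256x1x3 float32 array, one curve per BGR channel):
@code{.py}
from matplotlib import pyplot as plt

# Plot one response curve per BGR channel
for ch, color in enumerate(('b', 'g', 'r')):
    plt.plot(crf_debevec[:, 0, ch], color, label=color.upper())
plt.xlabel('measured intensity')
plt.ylabel('calibrated intensity')
plt.legend()
plt.show()
@endcode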
Additional Resources
--------------------
1. Paul E Debevec and Jitendra Malik. Recovering high dynamic range radiance maps from photographs. In ACM SIGGRAPH 2008 classes, page 31. ACM, 2008. @cite DM97
2. Mark A Robertson, Sean Borman, and Robert L Stevenson. Dynamic range improvement through multiple exposures. In Image Processing, 1999. ICIP 99. Proceedings. 1999 International Conference on, volume 3, pages 159–163. IEEE, 1999. @cite RB99
3. Tom Mertens, Jan Kautz, and Frank Van Reeth. Exposure fusion. In Computer Graphics and Applications, 2007. PG'07. 15th Pacific Conference on, pages 382–390. IEEE, 2007. @cite MK07
4. Images from [Wikipedia-HDR](https://en.wikipedia.org/wiki/High-dynamic-range_imaging)
Exercises
---------
1. Try all tonemap algorithms: cv::TonemapDrago, cv::TonemapMantiuk and cv::TonemapReinhard (a starting sketch is shown below).
2. Try changing the parameters in the HDR calibration and tonemap methods.
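
One possible starting point for exercise 1, assuming hdr_debevec from the merge step above is still
in scope; the parameter values here are arbitrary starting points, not recommendations:
@code{.py}
# Try the other tonemap operators on the same HDR image;
# each maps the float32 HDR data into [0..1] with its own curve
tonemaps = [("drago", cv.createTonemapDrago(gamma=1.0, saturation=0.7)),
            ("reinhard", cv.createTonemapReinhard(gamma=1.5)),
            ("mantiuk", cv.createTonemapMantiuk(gamma=2.2, saturation=1.2))]
for name, tm in tonemaps:
    ldr = tm.process(hdr_debevec.copy())
    cv.imwrite("ldr_%s.jpg" % name, np.clip(ldr*255, 0, 255).astype('uint8'))
@endcode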

@@ -0,0 +1,89 @@
Image Inpainting {#tutorial_py_inpainting}
================
Goal
----
In this chapter,
- We will learn how to remove small noise, strokes, etc. in old photographs by a method called
inpainting.
- We will see inpainting functionalities in OpenCV.
Basics
------
Most of you will have some old degraded photos at home with some black spots, strokes, etc. on
them. Have you ever thought of restoring them? We can't simply erase the marks in a paint tool,
because that will simply replace black structures with white structures, which is of no use. In
these cases, a technique called image inpainting is used. The basic idea is simple: replace those
bad marks with their neighbouring pixels so that the patch blends into the neighbourhood. Consider
the image shown below (taken from [Wikipedia](http://en.wikipedia.org/wiki/Inpainting)):
![image](images/inpaint_basics.jpg)
Several algorithms were designed for this purpose and OpenCV provides two of them. Both can be
accessed by the same function, **cv.inpaint()**
The first algorithm is based on the paper **"An Image Inpainting Technique Based on the Fast Marching
Method"** by Alexandru Telea in 2004. It is based on the Fast Marching Method. Consider a region in the
image to be inpainted. The algorithm starts from the boundary of this region and moves inside,
gradually filling in everything on the boundary first. It takes a small neighbourhood around the pixel
to be inpainted and replaces the pixel with a normalized weighted sum of all the known pixels in the
neighbourhood. Selection of the weights is an important matter: more weight is given to pixels lying
near the point, near the normal of the boundary, and on the boundary contours. Once a pixel is
inpainted, the algorithm moves to the next nearest pixel using the Fast Marching Method, which ensures
that pixels near the known pixels are inpainted first, so the process behaves like a manual heuristic
operation. This algorithm is enabled by using the flag, cv.INPAINT_TELEA.
The second algorithm is based on the paper **"Navier-Stokes, Fluid Dynamics, and Image and Video
Inpainting"** by Bertalmio, Marcelo, Andrea L. Bertozzi, and Guillermo Sapiro in 2001. This
algorithm is based on fluid dynamics and utilizes partial differential equations. Its basic principle
is heuristic. It first travels along the edges from known regions to unknown regions (because edges
are meant to be continuous), continuing isophotes (lines joining points with the same intensity, just
as contours join points with the same elevation) while matching gradient vectors at the boundary of
the inpainting region. For this, some methods from fluid dynamics are used. Once the isophotes are
obtained, color is filled in to minimize variance in that area. This algorithm is enabled by using the
flag, cv.INPAINT_NS.
Code
----
We need to create a mask of the same size as the input image, where non-zero pixels correspond to
the area which is to be inpainted. Everything else is simple. My image is degraded with some black
strokes (added manually), and I created a corresponding mask with the Paint tool.
@code{.py}
import numpy as np
import cv2 as cv

img = cv.imread('messi_2.jpg')
# Read the mask as grayscale: non-zero pixels mark the area to inpaint
mask = cv.imread('mask2.png', 0)
# 3 is the inpaint radius: the neighbourhood considered around each point
dst = cv.inpaint(img, mask, 3, cv.INPAINT_TELEA)
cv.imshow('dst', dst)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode
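To compare the two algorithms on the same input, the Navier-Stokes based variant only needs a
different flag; a small sketch reusing img and mask from above:
@code{.py}
# Same call, but using the Navier-Stokes based algorithm
dst_ns = cv.inpaint(img, mask, 3, cv.INPAINT_NS)
cv.imshow('dst_ns', dst_ns)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode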
See the result below. The first image shows the degraded input, the second is the mask, the third is
the result of the first algorithm, and the last is the result of the second algorithm.
![image](images/inpaint_result.jpg)
Additional Resources
--------------------
-# Bertalmio, Marcelo, Andrea L. Bertozzi, and Guillermo Sapiro. "Navier-Stokes, fluid dynamics,
and image and video inpainting." In Computer Vision and Pattern Recognition, 2001. CVPR 2001.
Proceedings of the 2001 IEEE Computer Society Conference on, vol. 1, pp. I-355. IEEE, 2001.
-# Telea, Alexandru. "An image inpainting technique based on the fast marching method." Journal of
graphics tools 9.1 (2004): 23-34.
Exercises
---------
-# OpenCV comes with an interactive sample on inpainting, samples/python/inpaint.py; try it.
-# A few months ago, I watched a video on [Content-Aware
Fill](http://www.youtube.com/watch?v=ZtoUiplKa2A), an advanced inpainting technique used in
Adobe Photoshop. On further search, I was able to find that the same technique is already available
in GIMP under a different name, "Resynthesizer" (you need to install a separate plugin). I am sure
you will enjoy the technique.

@@ -0,0 +1,152 @@
Image Denoising {#tutorial_py_non_local_means}
===============
Goal
----
In this chapter,
- You will learn about the Non-local Means Denoising algorithm to remove noise in an image.
- You will see different functions like **cv.fastNlMeansDenoising()**,
**cv.fastNlMeansDenoisingColored()** etc.
Theory
------
In earlier chapters, we have seen many image smoothing techniques like Gaussian blurring, median
blurring, etc., and they were good to some extent in removing small quantities of noise. In those
techniques, we took a small neighbourhood around a pixel and did some operation, like a Gaussian
weighted average or the median of the values, to replace the central element. In short, noise removal
at a pixel was local to its neighbourhood.
Noise has a useful property: it is generally considered to be a random variable with zero mean.
Consider a noisy pixel, \f$p = p_0 + n\f$, where \f$p_0\f$ is the true value of the pixel and \f$n\f$ is the noise in
that pixel. You can take a large number of observations of the same pixel (say \f$N\f$) from different
images and compute their average. Ideally, you should get \f$p = p_0\f$, since the mean of the noise is zero.
You can verify this yourself with a simple setup. Hold a static camera at a certain location for a
couple of seconds. This will give you plenty of frames, i.e. a lot of images of the same scene. Then
write a piece of code to find the average of all the frames in the video (this should be simple for
you by now); a minimal sketch follows below. Compare the final result with the first frame, and you
will see a reduction in noise.
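A minimal sketch of that experiment, assuming a short clip of a static scene (the filename vtest.avi
is a placeholder):
@code{.py}
import numpy as np
import cv2 as cv

cap = cv.VideoCapture('vtest.avi')
frames = []
while True:
    ret, frame = cap.read()
    if not ret:
        break
    frames.append(np.float64(frame))
cap.release()

# Per-pixel mean over all frames; averaging cancels zero-mean noise
avg = np.uint8(np.clip(np.mean(frames, axis=0), 0, 255))
cv.imwrite('averaged.png', avg)
@endcode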
Unfortunately this simple method is not robust to camera and scene motions. Also often there is only
one noisy image available.
So the idea is simple: we need a set of similar images to average out the noise. Consider a small
window (say a 5x5 window) in the image. There is a good chance that the same patch appears somewhere
else in the image, sometimes in a small neighbourhood around it. What about using these similar
patches together and finding their average? For that particular window, that is fine. See an example image below:
![image](images/nlm_patch.jpg)
The blue patches in the image look similar to each other, and so do the green patches. So we take a
pixel, take a small window around it, search for similar windows in the image, average all the
windows, and replace the pixel with the result. This method is Non-Local Means Denoising. It takes
more time compared to the blurring techniques we saw earlier, but its result is very good. More
details and an online demo can be found at the first link in the additional resources.
For color images, the image is converted to the CIELAB colorspace, and the L and AB components are
then denoised separately.
Image Denoising in OpenCV
-------------------------
OpenCV provides four variations of this technique.
-# **cv.fastNlMeansDenoising()** - works with a single grayscale image.
-# **cv.fastNlMeansDenoisingColored()** - works with a color image.
-# **cv.fastNlMeansDenoisingMulti()** - works with an image sequence captured in a short period of time
(grayscale images).
-# **cv.fastNlMeansDenoisingColoredMulti()** - same as above, but for color images.
Common arguments are:
- h : parameter deciding filter strength. A higher h value removes noise better, but also removes
image detail. (10 is ok)
- hForColorComponents : same as h, but for color images only. (normally the same as h)
- templateWindowSize : should be odd. (recommended 7)
- searchWindowSize : should be odd. (recommended 21)
Please visit first link in additional resources for more details on these parameters.
We will demonstrate 2 and 3 here; a quick sketch of 1 is shown below, and the rest is left for you.
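As a starting point, a minimal sketch of variant 1 on a single grayscale image, using the recommended
parameter values listed above (the filename is a placeholder):
@code{.py}
import cv2 as cv

# Variant 1: denoise a single grayscale image
gray = cv.imread('noisy_gray.png', 0)
dst = cv.fastNlMeansDenoising(gray, None, 10, 7, 21)
cv.imwrite('denoised_gray.png', dst)
@endcode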
### 1. cv.fastNlMeansDenoisingColored()
As mentioned above, it is used to remove noise from color images. (The noise is expected to be Gaussian.)
See the example below:
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('die.png')
dst = cv.fastNlMeansDenoisingColored(img,None,10,10,7,21)
plt.subplot(121),plt.imshow(img)
plt.subplot(122),plt.imshow(dst)
plt.show()
@endcode
Below is a zoomed version of result. My input image has a gaussian noise of \f$\sigma = 25\f$. See the
result:
![image](images/nlm_result1.jpg)
### 2. cv.fastNlMeansDenoisingMulti()
Now we will apply the same method to a video. The first argument is the list of noisy frames. The
second argument, imgToDenoiseIndex, specifies which frame we need to denoise; for that we pass the
index of the frame in our input list. The third is temporalWindowSize, which specifies the number of
nearby frames to be used for denoising; it should be odd. In that case, a total of temporalWindowSize
frames are used, where the central frame is the frame to be denoised. For example, say you passed a
list of 5 frames as input, with imgToDenoiseIndex = 2 and temporalWindowSize = 3. Then frames 1, 2
and 3 are used to denoise frame 2. Let's see an example.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
cap = cv.VideoCapture('vtest.avi')
# create a list of first 5 frames
img = [cap.read()[1] for i in range(5)]
# convert all to grayscale
gray = [cv.cvtColor(i, cv.COLOR_BGR2GRAY) for i in img]
# convert all to float64
gray = [np.float64(i) for i in gray]
# create gaussian noise with standard deviation 10
noise = np.random.randn(*gray[1].shape)*10
# Add this noise to images
noisy = [i+noise for i in gray]
# Convert back to uint8
noisy = [np.uint8(np.clip(i,0,255)) for i in noisy]
# Denoise 3rd frame considering all the 5 frames
dst = cv.fastNlMeansDenoisingMulti(noisy, 2, 5, None, 4, 7, 35)
plt.subplot(131),plt.imshow(gray[2],'gray')
plt.subplot(132),plt.imshow(noisy[2],'gray')
plt.subplot(133),plt.imshow(dst,'gray')
plt.show()
@endcode
Below image shows a zoomed version of the result we got:
![image](images/nlm_multi.jpg)
It takes a considerable amount of time to compute. In the result, the first image is the original
frame, the second is the noisy one, and the third is the denoised image.
Additional Resources
--------------------
-# <http://www.ipol.im/pub/art/2011/bcm_nlm/> (It has the details, an online demo, etc. Highly
recommended to visit. Our test image is generated from this link.)
-# [Online course at coursera](https://www.coursera.org/course/images) (First image taken from
here.)
Exercises
---------

@@ -0,0 +1,20 @@
Computational Photography {#tutorial_py_table_of_contents_photo}
=========================
Here you will learn about different OpenCV functionalities related to Computational Photography, like
image denoising, image inpainting, and high dynamic range imaging.
- @subpage tutorial_py_non_local_means
See a good technique
to remove noise in images, called Non-Local Means Denoising.
- @subpage tutorial_py_inpainting
Do you have an old
degraded photo with many black spots and strokes on it? Take it. Let's try to restore it with a
technique called image inpainting.
- @subpage tutorial_py_hdr
Learn how to merge an exposure sequence and process high dynamic range images.