feat: switch backend to PaddleOCR-NCNN, switch project to CMake

1. Migrated the project backend to the PaddleOCR-NCNN algorithm; it has passed basic compatibility tests
2. Switched the project to CMake; to better accommodate third-party libraries, a QMake project will no longer be provided
3. Reorganized the license/notice files and the code tree to minimize the risk of infringement

Log: switch backend to PaddleOCR-NCNN, switch project to CMake
Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
wangzhengyang
2022-05-10 09:54:44 +08:00
parent ecdd171c6f
commit 718c41634f
10018 changed files with 3593797 additions and 186748 deletions


@ -0,0 +1,111 @@
Canny Edge Detection {#tutorial_py_canny}
====================
Goal
----
In this chapter, we will learn about
- The concept of Canny edge detection
- The OpenCV function for it: **cv.Canny()**
Theory
------
Canny Edge Detection is a popular edge detection algorithm. It was developed by John F. Canny in
1986. It is a multi-stage algorithm, and we will go through each stage.
-# **Noise Reduction**
Since edge detection is susceptible to noise in the image, the first step is to remove the noise in the
image with a 5x5 Gaussian filter. We have already seen this in previous chapters.
-# **Finding Intensity Gradient of the Image**
The smoothed image is then filtered with a Sobel kernel in both the horizontal and vertical directions to
get the first derivative in the horizontal direction (\f$G_x\f$) and the vertical direction (\f$G_y\f$). From these two
images, we can find the edge gradient and direction for each pixel as follows:
\f[
Edge\_Gradient \; (G) = \sqrt{G_x^2 + G_y^2} \\
Angle \; (\theta) = \tan^{-1} \bigg(\frac{G_y}{G_x}\bigg)
\f]
The gradient direction is always perpendicular to edges. It is rounded to one of four angles
representing the vertical, horizontal, and two diagonal directions.
-# **Non-maximum Suppression**
After getting the gradient magnitude and direction, a full scan of the image is done to remove any unwanted
pixels which may not constitute an edge. For this, every pixel is checked to see if it is a
local maximum in its neighborhood in the direction of the gradient. Check the image below:
![image](images/nms.jpg)
Point A is on the edge (in the vertical direction). The gradient direction is normal to the edge. Points B
and C are in the gradient direction. So point A is checked against points B and C to see if it forms a
local maximum. If so, it is considered for the next stage; otherwise, it is suppressed (set to zero).
In short, the result you get is a binary image with "thin edges".
-# **Hysteresis Thresholding**
This stage decides which edges are really edges and which are not. For this, we need two
threshold values, minVal and maxVal. Any edges with an intensity gradient greater than maxVal are sure to
be edges, and those below minVal are sure to be non-edges, so they are discarded. Edges that lie between these
two thresholds are classified as edges or non-edges based on their connectivity. If they are connected
to "sure-edge" pixels, they are considered to be part of edges. Otherwise, they are also discarded.
See the image below:
![image](images/hysteresis.jpg)
Edge A is above maxVal, so it is considered a "sure edge". Although edge C is below maxVal, it is
connected to edge A, so it is also considered a valid edge and we get the full curve. But edge B,
although it is above minVal and in the same region as edge C, is not connected to any
"sure edge", so it is discarded. It is therefore very important to select minVal and maxVal
accordingly to get the correct result.
This stage also removes small pixel noise on the assumption that edges are long lines.
So what we finally get is strong edges in the image.
Canny Edge Detection in OpenCV
------------------------------
OpenCV puts all of the above in a single function, **cv.Canny()**. We will see how to use it. The first
argument is our input image. The second and third arguments are our minVal and maxVal respectively.
The fourth argument is aperture_size, the size of the Sobel kernel used to find image gradients. By
default it is 3. The last argument is L2gradient, which specifies the equation for finding the gradient
magnitude. If it is True, it uses the equation mentioned above, which is more accurate; otherwise it
uses this equation: \f$Edge\_Gradient \; (G) = |G_x| + |G_y|\f$. By default, it is False.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('messi5.jpg',0)
edges = cv.Canny(img,100,200)
plt.subplot(121),plt.imshow(img,cmap = 'gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(edges,cmap = 'gray')
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])
plt.show()
@endcode
See the result below:
![image](images/canny1.jpg)
Additional Resources
--------------------
-# Canny edge detector at [Wikipedia](http://en.wikipedia.org/wiki/Canny_edge_detector)
-# [Canny Edge Detection Tutorial](http://dasl.unlv.edu/daslDrexel/alumni/bGreen/www.pages.drexel.edu/_weg22/can_tut.html) by
Bill Green, 2002.
Exercises
---------
-# Write a small application that performs Canny edge detection and whose threshold values can be varied
    using two trackbars. This way, you can understand the effect of the threshold values. One possible
    solution is sketched below.
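Here is a minimal sketch of such an application, assuming the same messi5.jpg test image used above
(the window and trackbar names are arbitrary choices):
@code{.py}
import cv2 as cv
def nothing(x):
    pass
img = cv.imread('messi5.jpg', 0)               # load as grayscale
cv.namedWindow('edges')
cv.createTrackbar('minVal', 'edges', 100, 500, nothing)
cv.createTrackbar('maxVal', 'edges', 200, 500, nothing)
while True:
    lo = cv.getTrackbarPos('minVal', 'edges')
    hi = cv.getTrackbarPos('maxVal', 'edges')
    cv.imshow('edges', cv.Canny(img, lo, hi))
    if cv.waitKey(30) & 0xFF == 27:            # Esc quits
        break
cv.destroyAllWindows()
@endcode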


@ -0,0 +1,113 @@
Changing Colorspaces {#tutorial_py_colorspaces}
====================
Goal
----
- In this tutorial, you will learn how to convert images from one color-space to another, like
BGR \f$\leftrightarrow\f$ Gray, BGR \f$\leftrightarrow\f$ HSV, etc.
- In addition to that, we will create an application to extract a colored object in a video
- You will learn the following functions: **cv.cvtColor()**, **cv.inRange()**, etc.
Changing Color-space
--------------------
There are more than 150 color-space conversion methods available in OpenCV. But we will look into
only the two most widely used ones: BGR \f$\leftrightarrow\f$ Gray and BGR \f$\leftrightarrow\f$ HSV.
For color conversion, we use the function cv.cvtColor(input_image, flag) where flag determines the
type of conversion.
For BGR \f$\rightarrow\f$ Gray conversion, we use the flag cv.COLOR_BGR2GRAY. Similarly for BGR
\f$\rightarrow\f$ HSV, we use the flag cv.COLOR_BGR2HSV. To get other flags, just run following
commands in your Python terminal:
@code{.py}
>>> import cv2 as cv
>>> flags = [i for i in dir(cv) if i.startswith('COLOR_')]
>>> print( flags )
@endcode
@note For HSV, the hue range is [0,179], the saturation range is [0,255], and the value range is [0,255].
Different software uses different scales, so if you are comparing OpenCV values with them, you need
to normalize these ranges.
Object Tracking
---------------
Now that we know how to convert a BGR image to HSV, we can use this to extract a colored object. In HSV, it
is easier to represent a color than in the BGR color-space. In our application, we will try to extract
a blue colored object. So here is the method:
- Take each frame of the video
- Convert from the BGR to the HSV color-space
- Threshold the HSV image for a range of blue colors
- Extract the blue object alone; then we can do whatever we want with that image.
Below is the code which is commented in detail:
@code{.py}
import cv2 as cv
import numpy as np
cap = cv.VideoCapture(0)
while(1):
    # Take each frame
    _, frame = cap.read()
    # Convert BGR to HSV
    hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
    # define range of blue color in HSV
    lower_blue = np.array([110,50,50])
    upper_blue = np.array([130,255,255])
    # Threshold the HSV image to get only blue colors
    mask = cv.inRange(hsv, lower_blue, upper_blue)
    # Bitwise-AND mask and original image
    res = cv.bitwise_and(frame,frame, mask= mask)
    cv.imshow('frame',frame)
    cv.imshow('mask',mask)
    cv.imshow('res',res)
    k = cv.waitKey(5) & 0xFF
    if k == 27:
        break
cv.destroyAllWindows()
@endcode
The image below shows tracking of the blue object:
![image](images/frame.jpg)
@note There is some noise in the image. We will see how to remove it in later chapters.
@note This is the simplest method in object tracking. Once you learn the functions for contours, you can
do plenty of things, like finding the centroid of an object and using it to track the object, drawing diagrams
just by moving your hand in front of a camera, and other fun stuff.
How to find HSV values to track?
--------------------------------
This is a common question found on [stackoverflow.com](http://www.stackoverflow.com). It is very simple, and
you can use the same function, cv.cvtColor(). Instead of passing an image, you just pass the BGR
values you want. For example, to find the HSV value of green, try the following commands in a Python
terminal:
@code{.py}
>>> green = np.uint8([[[0,255,0 ]]])
>>> hsv_green = cv.cvtColor(green,cv.COLOR_BGR2HSV)
>>> print( hsv_green )
[[[ 60 255 255]]]
@endcode
Now you take [H-10, 100, 100] and [H+10, 255, 255] as the lower bound and upper bound respectively. Apart
from this method, you can use any image editing tool like GIMP, or any online converter, to find
these values, but don't forget to adjust the HSV ranges.
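As a small sketch continuing from hsv_green above, the bounds can also be derived programmatically,
clamping to OpenCV's [0,179] hue range:
@code{.py}
hue = int(hsv_green[0][0][0])
lower = np.array([max(hue - 10, 0), 100, 100])
upper = np.array([min(hue + 10, 179), 255, 255])
@endcode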
Additional Resources
--------------------
Exercises
---------
-# Try to find a way to extract more than one colored object, for example, extract red, blue, and green
    objects simultaneously. One possible sketch follows.
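A minimal sketch of one approach: build one mask per color and OR them together. The HSV ranges below
are rough assumptions and will need tuning for your lighting; note that red wraps around hue 0, so it
takes two ranges:
@code{.py}
blue  = cv.inRange(hsv, np.array([110, 50, 50]), np.array([130, 255, 255]))
green = cv.inRange(hsv, np.array([ 50, 50, 50]), np.array([ 70, 255, 255]))
# red wraps around hue 0, so combine two ranges
red   = cv.inRange(hsv, np.array([  0, 50, 50]), np.array([ 10, 255, 255])) | \
        cv.inRange(hsv, np.array([170, 50, 50]), np.array([179, 255, 255]))
mask = blue | green | red
res = cv.bitwise_and(frame, frame, mask=mask)
@endcode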


@ -0,0 +1,206 @@
Contour Features {#tutorial_py_contour_features}
================
@prev_tutorial{tutorial_py_contours_begin}
@next_tutorial{tutorial_py_contour_properties}
Goal
----
In this article, we will learn
- To find the different features of contours, like area, perimeter, centroid, bounding box etc
- You will see plenty of functions related to contours.
1. Moments
----------
Image moments help you to calculate some features like center of mass of the object, area of the
object etc. Check out the wikipedia page on [Image
Moments](http://en.wikipedia.org/wiki/Image_moment)
The function **cv.moments()** gives a dictionary of all moment values calculated. See below:
@code{.py}
import numpy as np
import cv2 as cv
img = cv.imread('star.jpg',0)
ret,thresh = cv.threshold(img,127,255,0)
contours,hierarchy = cv.findContours(thresh, 1, 2)
cnt = contours[0]
M = cv.moments(cnt)
print( M )
@endcode
From these moments, you can extract useful data like the area, centroid, etc. The centroid is given by the
relations \f$C_x = \frac{M_{10}}{M_{00}}\f$ and \f$C_y = \frac{M_{01}}{M_{00}}\f$. This can be done as
follows:
@code{.py}
cx = int(M['m10']/M['m00'])
cy = int(M['m01']/M['m00'])
@endcode
2. Contour Area
---------------
Contour area is given by the function **cv.contourArea()** or from moments, **M['m00']**.
@code{.py}
area = cv.contourArea(cnt)
@endcode
3. Contour Perimeter
--------------------
It is also called arc length. It can be found using the **cv.arcLength()** function. The second
argument specifies whether the shape is a closed contour (if passed True) or just a curve.
@code{.py}
perimeter = cv.arcLength(cnt,True)
@endcode
4. Contour Approximation
------------------------
It approximates a contour shape to another shape with a smaller number of vertices, depending upon the
precision we specify. It is an implementation of the [Douglas-Peucker
algorithm](http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm). Check the wikipedia page
for the algorithm and a demonstration.
To understand this, suppose you are trying to find a square in an image, but due to some problems in
the image, you didn't get a perfect square, but a "bad shape" (as shown in the first image below). Now
you can use this function to approximate the shape. Here, the second argument is called epsilon,
which is the maximum distance from the contour to the approximated contour. It is an accuracy parameter. A wise
selection of epsilon is needed to get the correct output.
@code{.py}
epsilon = 0.1*cv.arcLength(cnt,True)
approx = cv.approxPolyDP(cnt,epsilon,True)
@endcode
Below, in the second image, the green line shows the approximated curve for epsilon = 10% of the arc length.
The third image shows the same for epsilon = 1% of the arc length. The third argument specifies whether the
curve is closed or not.
![image](images/approx.jpg)
5. Convex Hull
--------------
Convex Hull looks similar to contour approximation, but it is not (both may provide the same results
in some cases). Here, the **cv.convexHull()** function checks a curve for convexity defects and
corrects it. Generally speaking, convex curves are curves which always bulge out, or are at
least flat. Where a curve bulges inward, it is called a convexity defect. For example, check the
image of a hand below. The red line shows the convex hull of the hand. The double-sided arrow marks show the
convexity defects, which are the local maximum deviations of the hull from the contour.
![image](images/convexitydefects.jpg)
There are a few things to discuss about its syntax:
@code{.py}
hull = cv.convexHull(points[, hull[, clockwise[, returnPoints]]])
@endcode
Argument details:
- **points** is the contour we pass in.
- **hull** is the output; normally we omit it.
- **clockwise** : Orientation flag. If it is True, the output convex hull is oriented clockwise.
    Otherwise, it is oriented counter-clockwise.
- **returnPoints** : By default, True. Then it returns the coordinates of the hull points. If
    False, it returns the indices of the contour points corresponding to the hull points.
So to get a convex hull as in the above image, the following is sufficient:
@code{.py}
hull = cv.convexHull(cnt)
@endcode
But if you want to find convexity defects, you need to pass returnPoints = False. To understand this,
we will take the rectangle image above. First I found its contour as cnt. Then I found its convex
hull with returnPoints = True and got the following values:
[[[234 202]], [[ 51 202]], [[ 51 79]], [[234 79]]], which are the four corner points of the rectangle.
Now if I do the same with returnPoints = False, I get the following result: [[129],[ 67],[ 0],[142]].
These are the indices of the corresponding points in the contour. For example, check the first value:
cnt[129] = [[234, 202]], which is the same as the first result (and so on for the others).
You will see this again when we discuss convexity defects; a quick sanity check is sketched below.
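Here is that check, a minimal sketch assuming the rectangle contour cnt from above (the two calls
return the hull in the same order, once as coordinates and once as indices):
@code{.py}
hull_pts = cv.convexHull(cnt)                        # hull as (x,y) coordinates
hull_idx = cv.convexHull(cnt, returnPoints=False)    # hull as indices into cnt
print(cnt[hull_idx[0][0]], hull_pts[0])              # both print the same point
@endcode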
6. Checking Convexity
---------------------
There is a function to check if a curve is convex or not, **cv.isContourConvex()**. It just returns
True or False. Not a big deal.
@code{.py}
k = cv.isContourConvex(cnt)
@endcode
7. Bounding Rectangle
---------------------
There are two types of bounding rectangles.
### 7.a. Straight Bounding Rectangle
It is a straight rectangle that doesn't consider the rotation of the object, so the area of the bounding
rectangle won't be minimal. It is found by the function **cv.boundingRect()**.
Let (x,y) be the top-left coordinate of the rectangle and (w,h) be its width and height.
@code{.py}
x,y,w,h = cv.boundingRect(cnt)
cv.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
@endcode
### 7.b. Rotated Rectangle
Here, the bounding rectangle is drawn with minimum area, so it considers the rotation as well. The function
used is **cv.minAreaRect()**. It returns a Box2D structure which contains the following details: (
center (x,y), (width, height), angle of rotation ). But to draw this rectangle, we need the 4 corners of
the rectangle. They are obtained by the function **cv.boxPoints()**.
@code{.py}
rect = cv.minAreaRect(cnt)
box = cv.boxPoints(rect)
box = np.intp(box)  # np.int0 in older tutorials; np.intp is the portable integer type
cv.drawContours(img,[box],0,(0,0,255),2)
@endcode
Both rectangles are shown in a single image. The green rectangle shows the normal bounding rect, and the red
rectangle is the rotated rect.
![image](images/boundingrect.png)
8. Minimum Enclosing Circle
---------------------------
Next we find the circumcircle of an object using the function **cv.minEnclosingCircle()**. It is the
circle that completely covers the object with minimum area.
@code{.py}
(x,y),radius = cv.minEnclosingCircle(cnt)
center = (int(x),int(y))
radius = int(radius)
cv.circle(img,center,radius,(0,255,0),2)
@endcode
![image](images/circumcircle.png)
9. Fitting an Ellipse
---------------------
Next is fitting an ellipse to an object. It returns the rotated rectangle in which the ellipse is
inscribed.
@code{.py}
ellipse = cv.fitEllipse(cnt)
cv.ellipse(img,ellipse,(0,255,0),2)
@endcode
![image](images/fitellipse.png)
10. Fitting a Line
------------------
Similarly, we can fit a line to a set of points. The image below contains a set of white points, and we
can fit a straight line to it.
@code{.py}
rows,cols = img.shape[:2]
[vx,vy,x,y] = cv.fitLine(cnt, cv.DIST_L2,0,0.01,0.01)
lefty = int((-x*vy/vx) + y)
righty = int(((cols-x)*vy/vx)+y)
cv.line(img,(cols-1,righty),(0,lefty),(0,255,0),2)
@endcode
![image](images/fitline.jpg)
Additional Resources
--------------------
Exercises
---------


@ -0,0 +1,123 @@
Contour Properties {#tutorial_py_contour_properties}
==================
@prev_tutorial{tutorial_py_contour_features}
@next_tutorial{tutorial_py_contours_more_functions}
Here we will learn to extract some frequently used properties of objects, like Solidity, Equivalent
Diameter, Mask image, Mean Intensity, etc. More features can be found in the [Matlab regionprops
documentation](http://www.mathworks.in/help/images/ref/regionprops.html).
*(NB: Centroid, Area, Perimeter, etc. also belong to this category, but we covered them in the last
chapter.)*
1. Aspect Ratio
---------------
It is the ratio of width to height of the bounding rect of the object.
\f[Aspect \; Ratio = \frac{Width}{Height}\f]
@code{.py}
x,y,w,h = cv.boundingRect(cnt)
aspect_ratio = float(w)/h
@endcode
2. Extent
---------
Extent is the ratio of contour area to bounding rectangle area.
\f[Extent = \frac{Object \; Area}{Bounding \; Rectangle \; Area}\f]
@code{.py}
area = cv.contourArea(cnt)
x,y,w,h = cv.boundingRect(cnt)
rect_area = w*h
extent = float(area)/rect_area
@endcode
3. Solidity
-----------
Solidity is the ratio of contour area to its convex hull area.
\f[Solidity = \frac{Contour \; Area}{Convex \; Hull \; Area}\f]
@code{.py}
area = cv.contourArea(cnt)
hull = cv.convexHull(cnt)
hull_area = cv.contourArea(hull)
solidity = float(area)/hull_area
@endcode
4. Equivalent Diameter
----------------------
Equivalent Diameter is the diameter of the circle whose area is same as the contour area.
\f[Equivalent \; Diameter = \sqrt{\frac{4 \times Contour \; Area}{\pi}}\f]
@code{.py}
area = cv.contourArea(cnt)
equi_diameter = np.sqrt(4*area/np.pi)
@endcode
5. Orientation
--------------
Orientation is the angle at which the object is directed. The following method also gives the major axis and
minor axis lengths.
@code{.py}
(x,y),(MA,ma),angle = cv.fitEllipse(cnt)
@endcode
6. Mask and Pixel Points
------------------------
In some cases, we may need all the points which comprise an object. This can be done as follows:
@code{.py}
mask = np.zeros(imgray.shape,np.uint8)
cv.drawContours(mask,[cnt],0,255,-1)
pixelpoints = np.transpose(np.nonzero(mask))
#pixelpoints = cv.findNonZero(mask)
@endcode
Here, two methods are given to do the same thing: one using Numpy functions, the other using an OpenCV
function (the last commented line). The results are the same, with a slight difference: Numpy gives
coordinates in **(row, column)** format, while OpenCV gives coordinates in **(x,y)** format, so
the answers are effectively interchanged. Note that **row = y** and **column = x**.
7. Maximum Value, Minimum Value and their locations
---------------------------------------------------
We can find these parameters using a mask image.
@code{.py}
min_val, max_val, min_loc, max_loc = cv.minMaxLoc(imgray,mask = mask)
@endcode
8. Mean Color or Mean Intensity
-------------------------------
Here, we can find the average color of an object, or the average intensity of the object in
grayscale mode. We again use the same mask to do it.
@code{.py}
mean_val = cv.mean(im,mask = mask)
@endcode
9. Extreme Points
-----------------
Extreme Points means the topmost, bottommost, rightmost, and leftmost points of the object.
@code{.py}
leftmost = tuple(cnt[cnt[:,:,0].argmin()][0])
rightmost = tuple(cnt[cnt[:,:,0].argmax()][0])
topmost = tuple(cnt[cnt[:,:,1].argmin()][0])
bottommost = tuple(cnt[cnt[:,:,1].argmax()][0])
@endcode
For example, if I apply it to a map of India, I get the following result:
![image](images/extremepoints.jpg)
Additional Resources
--------------------
Exercises
---------
-# There are still some features left in the Matlab regionprops doc. Try to implement them; a sketch
    of one of them (eccentricity) follows.
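As a starting point, a sketch of eccentricity, one of the regionprops features, derived from the fitted
ellipse axes (assuming cnt from the earlier examples):
@code{.py}
(x, y), (d1, d2), angle = cv.fitEllipse(cnt)        # axis lengths are full diameters
major, minor = max(d1, d2), min(d1, d2)
eccentricity = np.sqrt(1 - (minor / major) ** 2)    # 0 = circle, near 1 = elongated
@endcode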


@ -0,0 +1,95 @@
Contours : Getting Started {#tutorial_py_contours_begin}
==========================
@next_tutorial{tutorial_py_contour_features}
Goal
----
- Understand what contours are.
- Learn to find contours, draw contours, etc.
- You will see these functions: **cv.findContours()**, **cv.drawContours()**
What are contours?
------------------
Contours can be explained simply as a curve joining all the continuous points (along the boundary)
having the same color or intensity. Contours are a useful tool for shape analysis and object
detection and recognition.
- For better accuracy, use binary images. So before finding contours, apply threshold or canny
    edge detection.
- Since OpenCV 3.2, findContours() no longer modifies the source image.
- In OpenCV, finding contours is like finding a white object on a black background. So remember:
    the object to be found should be white and the background should be black.
Let's see how to find contours of a binary image:
@code{.py}
import numpy as np
import cv2 as cv
im = cv.imread('test.jpg')
imgray = cv.cvtColor(im, cv.COLOR_BGR2GRAY)
ret, thresh = cv.threshold(imgray, 127, 255, 0)
contours, hierarchy = cv.findContours(thresh, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
@endcode
See, there are three arguments in the **cv.findContours()** function: the first is the source image, the second
is the contour retrieval mode, and the third is the contour approximation method. It outputs the contours and hierarchy.
contours is a Python list of all the contours in the image. Each individual contour is a
Numpy array of the (x,y) coordinates of the boundary points of the object.
@note We will discuss the second and third arguments and the hierarchy in detail later. Until then,
the values given to them in the code sample will work fine for all images.
How to draw the contours?
-------------------------
To draw the contours, the cv.drawContours function is used. It can also be used to draw any shape,
provided you have its boundary points. Its first argument is the source image, the second argument is the
contours, which should be passed as a Python list, the third argument is the index of the contour (useful when
drawing an individual contour; to draw all contours, pass -1), and the remaining arguments are color,
thickness, etc.
* To draw all the contours in an image:
@code{.py}
cv.drawContours(img, contours, -1, (0,255,0), 3)
@endcode
* To draw an individual contour, say 4th contour:
@code{.py}
cv.drawContours(img, contours, 3, (0,255,0), 3)
@endcode
* But most of the time, below method will be useful:
@code{.py}
cnt = contours[4]
cv.drawContours(img, [cnt], 0, (0,255,0), 3)
@endcode
@note The last two methods are the same, but as you go forward, you will see that the last one is more useful.
Contour Approximation Method
============================
This is the third argument in the cv.findContours function. What does it actually denote?
Above, we said that contours are the boundaries of a shape with the same intensity. The function stores the (x,y)
coordinates of the boundary of a shape. But does it store all the coordinates? That is specified by
this contour approximation method.
If you pass cv.CHAIN_APPROX_NONE, all the boundary points are stored. But do we actually need all
the points? For example, say you found the contour of a straight line. Do you need all the points on the line
to represent that line? No, we need just the two end points of that line. This is what
cv.CHAIN_APPROX_SIMPLE does. It removes all redundant points and compresses the contour, thereby
saving memory.
The image of a rectangle below demonstrates this technique. Just draw a circle at every coordinate in
the contour array (drawn in blue). The first image shows the points I got with cv.CHAIN_APPROX_NONE
(734 points) and the second image shows the ones with cv.CHAIN_APPROX_SIMPLE (only 4 points). See how
much memory it saves! A sketch to reproduce the comparison follows the image.
![image](images/none.jpg)
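Here is that sketch, reusing thresh from the code above:
@code{.py}
contours, _ = cv.findContours(thresh, cv.RETR_TREE, cv.CHAIN_APPROX_NONE)
print(len(contours[0]))   # every boundary point is stored
contours, _ = cv.findContours(thresh, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
print(len(contours[0]))   # only the end points of straight segments remain
@endcode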
Additional Resources
--------------------
Exercises
---------


@ -0,0 +1,220 @@
Contours Hierarchy {#tutorial_py_contours_hierarchy}
==================
@prev_tutorial{tutorial_py_contours_more_functions}
Goal
----
This time, we learn about the hierarchy of contours, i.e. the parent-child relationship in Contours.
Theory
------
In the last few articles on contours, we have worked with several functions related to contours
provided by OpenCV. But when we found the contours in an image using the **cv.findContours()** function,
we passed an argument, the **Contour Retrieval Mode**. We usually passed **cv.RETR_LIST** or
**cv.RETR_TREE**, and it worked fine. But what does it actually mean?
Also, in the output, we got two arrays: one is our contours, and one more
output which we named **hierarchy** (please check out the code in the previous articles). But we
never used this hierarchy anywhere. So what is this hierarchy, and what is it for? What is its
relationship with the previously mentioned function argument?
That is what we are going to deal with in this article.
### What is Hierarchy?
Normally we use the **cv.findContours()** function to detect objects in an image, right? Sometimes
objects are in different locations. But in some cases, some shapes are inside other shapes, just
like nested figures. In this case, we call the outer one the **parent** and the inner one the **child**. This
way, contours in an image have some relationship to each other, and we can specify how contours are
connected to each other: is it a child of some other contour, is it a parent, etc.
The representation of this relationship is called the **Hierarchy**.
Consider an example image below :
![image](images/hierarchy.png)
In this image, there are a few shapes which I have numbered from **0-5**. *2 and 2a* denote the
external and internal contours of the outermost box.
Here, contours 0, 1, and 2 are **external or outermost**. We can say they are in **hierarchy-0**, or
simply that they are at the **same hierarchy level**.
Next comes **contour-2a**. It can be considered a **child of contour-2** (or, the other way around,
contour-2 is the parent of contour-2a). So let it be in **hierarchy-1**. Similarly, contour-3 is a child of
contour-2 and it comes in the next hierarchy. Finally, contours 4 and 5 are the children of contour-3a, and
they come in the last hierarchy level. From the way I numbered the boxes, I would say contour-4 is
the first child of contour-3a (it could be contour-5 as well).
I mentioned these things to explain terms like **same hierarchy level**, **external contour**,
**child contour**, **parent contour**, **first child**, etc. Now let's get into OpenCV.
### Hierarchy Representation in OpenCV
So each contour has its own information regarding which hierarchy level it is in, who its child is, who its
parent is, etc. OpenCV represents it as an array of four values: **[Next, Previous, First_Child,
Parent]**
<center>*"Next denotes next contour at the same hierarchical level."*</center>
For example, take contour-0 in our picture. Which is the next contour at its level? It is contour-1. So
simply Next = 1. Similarly for contour-1, the next is contour-2, so Next = 2.
What about contour-2? There is no next contour at the same level, so Next = -1. What
about contour-4? It is at the same level as contour-5, so its next contour is contour-5 and Next = 5.
<center>*"Previous denotes previous contour at the same hierarchical level."*</center>
It is the same idea as above. The previous contour of contour-1 is contour-0 at the same level. Similarly for
contour-2, it is contour-1. And for contour-0, there is no previous, so put it as -1.
<center>*"First_Child denotes its first child contour."*</center>
There is no need for any explanation. For contour-2, the child is contour-2a, so it gets the
corresponding index value of contour-2a. What about contour-3a? It has two children, but we take
only the first child, which is contour-4. So First_Child = 4 for contour-3a.
<center>*"Parent denotes index of its parent contour."*</center>
It is just the opposite of **First_Child**. For both contour-4 and contour-5, the parent contour is
contour-3a. For contour-3a, it is contour-3, and so on.
@note If there is no child or parent, that field is taken as -1
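As an illustration of this layout (a hypothetical helper, not an OpenCV API), the four values let us
walk all direct children of a contour:
@code{.py}
def direct_children(hierarchy, idx):
    # hierarchy has shape (1, N, 4): [Next, Previous, First_Child, Parent]
    children = []
    child = hierarchy[0][idx][2]          # First_Child
    while child != -1:
        children.append(child)
        child = hierarchy[0][child][0]    # Next sibling at the same level
    return children
@endcode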
Now that we know the hierarchy style used in OpenCV, we can look at the contour retrieval modes
in OpenCV with the help of the same image given above, i.e. what do flags like cv.RETR_LIST,
cv.RETR_TREE, cv.RETR_CCOMP, and cv.RETR_EXTERNAL mean?
Contour Retrieval Mode
----------------------
### 1. RETR_LIST
This is the simplest of the four flags (from an explanation point of view). It simply retrieves all the
contours, but doesn't create any parent-child relationships. **Parents and kids are equal under this
rule, and they are just contours**, i.e. they all belong to the same hierarchy level.
So here, the 3rd and 4th terms in the hierarchy array are always -1. But obviously, the Next and Previous terms
will have their corresponding values. Just check it yourself and verify.
Below is the result I got; each row is the hierarchy detail of the corresponding contour. For example, the first
row corresponds to contour 0. The next contour is contour 1, so Next = 1. There is no previous contour,
so Previous = -1. And the remaining two, as stated before, are -1.
@code{.py}
>>> hierarchy
array([[[ 1, -1, -1, -1],
        [ 2,  0, -1, -1],
        [ 3,  1, -1, -1],
        [ 4,  2, -1, -1],
        [ 5,  3, -1, -1],
        [ 6,  4, -1, -1],
        [ 7,  5, -1, -1],
        [-1,  6, -1, -1]]])
@endcode
This is a good choice to use in your code if you are not using any hierarchy features.
### 2. RETR_EXTERNAL
If you use this flag, it returns only the extreme outer contours. All child contours are left behind. **We
can say, under this law, only the eldest in every family is taken care of. It doesn't care about
other members of the family :)**.
So, in our image, how many extreme outer contours are there, i.e. at the hierarchy-0 level? Only 3, i.e.
contours 0, 1, and 2, right? Now try to find the contours using this flag. Here also, the values given to each
element are the same as above. Compare it with the result above. Below is what I got:
@code{.py}
>>> hierarchy
array([[[ 1, -1, -1, -1],
        [ 2,  0, -1, -1],
        [-1,  1, -1, -1]]])
@endcode
You can use this flag if you want to extract only the outer contours. It might be useful in some
cases.
### 3. RETR_CCOMP
This flag retrieves all the contours and arranges them into a 2-level hierarchy, i.e. the external contours
of an object (i.e. its boundary) are placed in hierarchy-1, and the contours of holes inside the object
(if any) are placed in hierarchy-2. If there is any object inside a hole, its contour is placed again in
hierarchy-1 only, its own holes in hierarchy-2, and so on.
Just consider the image of a "big white zero" on a black background. The outer circle of the zero belongs to
the first hierarchy, and the inner circle of the zero belongs to the second hierarchy.
We can explain it with a simple image. Here I have labelled the order of the contours in red and
the hierarchy they belong to in green (either 1 or 2). The order is the same as the order in which OpenCV
detects contours.
![image](images/ccomp_hierarchy.png)
So consider the first contour, i.e. contour-0. It is in hierarchy-1. It has two holes, contours 1 and 2, and they
belong to hierarchy-2. So for contour-0, the next contour at the same hierarchy level is contour-3, and
there is no previous one. Its first child is contour-1 in hierarchy-2. It has no parent,
because it is in hierarchy-1. So its hierarchy array is [3,-1,1,-1].
Now take contour-1. It is in hierarchy-2. The next one in the same hierarchy (under the parenthood of
contour-0) is contour-2. There is no previous one and no child, but the parent is contour-0. So the array is
[2,-1,-1,0].
Similarly contour-2: it is in hierarchy-2. There is no next contour in the same hierarchy under
contour-0, so Next = -1. Previous is contour-1. There is no child, and the parent is contour-0. So the array is
[-1,1,-1,0].
Contour-3: the next in hierarchy-1 is contour-5. Previous is contour-0. The child is contour-4 and there is no
parent. So the array is [5,0,4,-1].
Contour-4: it is in hierarchy-2 under contour-3 and it has no sibling. So there is no next, no previous,
no child, and the parent is contour-3. So the array is [-1,-1,-1,3].
The rest you can fill in yourself. This is the final answer I got:
@code{.py}
>>> hierarchy
array([[[ 3, -1,  1, -1],
        [ 2, -1, -1,  0],
        [-1,  1, -1,  0],
        [ 5,  0,  4, -1],
        [-1, -1, -1,  3],
        [ 7,  3,  6, -1],
        [-1, -1, -1,  5],
        [ 8,  5, -1, -1],
        [-1,  7, -1, -1]]])
@endcode
### 4. RETR_TREE
And this is the final guy, Mr. Perfect. It retrieves all the contours and creates a full family
hierarchy list. **It even tells you who the grandpa, father, son, and grandson are, and even beyond... :)**.
For example, I took the image above, rewrote the code for cv.RETR_TREE, reordered the contours as per the
result given by OpenCV, and analyzed it. Again, the red letters give the contour number and the green letters
give the hierarchy order.
![image](images/tree_hierarchy.png)
Take contour-0: it is in hierarchy-0. The next contour in the same hierarchy is contour-7. There are no previous
contours. The child is contour-1, and there is no parent. So the array is [7,-1,1,-1].
Take contour-2: it is in hierarchy-1. There is no contour at the same level and no previous one. The child is
contour-3, and the parent is contour-1. So the array is [-1,-1,3,1].
For the rest, try it yourself. Below is the full answer:
@code{.py}
>>> hierarchy
array([[[ 7, -1,  1, -1],
        [-1, -1,  2,  0],
        [-1, -1,  3,  1],
        [-1, -1,  4,  2],
        [-1, -1,  5,  3],
        [ 6, -1, -1,  4],
        [-1,  5, -1,  4],
        [ 8,  0, -1, -1],
        [-1,  7, -1, -1]]])
@endcode
Additional Resources
--------------------
Exercises
---------


@ -0,0 +1,136 @@
Contours : More Functions {#tutorial_py_contours_more_functions}
=========================
@prev_tutorial{tutorial_py_contour_properties}
@next_tutorial{tutorial_py_contours_hierarchy}
Goal
----
In this chapter, we will learn about
- Convexity defects and how to find them.
- Finding shortest distance from a point to a polygon
- Matching different shapes
Theory and Code
---------------
### 1. Convexity Defects
We saw what a convex hull is in the second chapter about contours. Any deviation of the object from this
hull can be considered a convexity defect.
OpenCV comes with a ready-made function to find them, **cv.convexityDefects()**. A basic function
call looks like this:
@code{.py}
hull = cv.convexHull(cnt,returnPoints = False)
defects = cv.convexityDefects(cnt,hull)
@endcode
@note Remember, we have to pass returnPoints = False while finding the convex hull in order to find
convexity defects.
It returns an array where each row contains these values: **[ start point, end point, farthest
point, approximate distance to farthest point ]**. We can visualize it using an image: we draw a
line joining the start point and end point, then draw a circle at the farthest point. Remember, the first
three values returned are indices into cnt, so we have to fetch the corresponding points from cnt.
@code{.py}
import cv2 as cv
import numpy as np
img = cv.imread('star.jpg')
img_gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
ret,thresh = cv.threshold(img_gray, 127, 255,0)
contours,hierarchy = cv.findContours(thresh,2,1)
cnt = contours[0]
hull = cv.convexHull(cnt,returnPoints = False)
defects = cv.convexityDefects(cnt,hull)
for i in range(defects.shape[0]):
    s,e,f,d = defects[i,0]
    start = tuple(cnt[s][0])
    end = tuple(cnt[e][0])
    far = tuple(cnt[f][0])
    cv.line(img,start,end,[0,255,0],2)
    cv.circle(img,far,5,[0,0,255],-1)
cv.imshow('img',img)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode
And see the result:
![image](images/defects.jpg)
### 2. Point Polygon Test
This function finds the shortest distance between a point in the image and a contour. It returns a
distance which is negative when the point is outside the contour, positive when the point is inside, and zero
if the point is on the contour.
For example, we can check the point (50,50) as follows:
@code{.py}
dist = cv.pointPolygonTest(cnt,(50,50),True)
@endcode
In the function, the third argument is measureDist. If it is True, it finds the signed distance. If
False, it finds only whether the point is inside, outside, or on the contour (returning +1, -1, or 0
respectively).
@note If you don't need the distance, make sure the third argument is False, because computing the signed
distance is a time-consuming process; making it False gives about a 2-3x speedup.
### 3. Match Shapes
OpenCV comes with a function **cv.matchShapes()** which enables us to compare two shapes, or two
contours, and returns a metric showing the similarity: the lower the result, the better the match.
It is calculated based on the Hu moment values. The different measurement methods are explained in the
docs.
@code{.py}
import cv2 as cv
import numpy as np
img1 = cv.imread('star.jpg',0)
img2 = cv.imread('star2.jpg',0)
ret, thresh = cv.threshold(img1, 127, 255,0)
ret, thresh2 = cv.threshold(img2, 127, 255,0)
contours,hierarchy = cv.findContours(thresh,2,1)
cnt1 = contours[0]
contours,hierarchy = cv.findContours(thresh2,2,1)
cnt2 = contours[0]
ret = cv.matchShapes(cnt1,cnt2,1,0.0)
print( ret )
@endcode
I tried matching shapes with the different shapes given below:
![image](images/matchshapes.jpg)
I got the following results:
- Matching Image A with itself = 0.0
- Matching Image A with Image B = 0.001946
- Matching Image A with Image C = 0.326911
See, even image rotation doesn't affect this comparison much.
@note [Hu-Moments](http://en.wikipedia.org/wiki/Image_moment#Rotation_invariant_moments) are seven
moments invariant to translation, rotation, and scale. The seventh one is skew-invariant. These values
can be found using the **cv.HuMoments()** function.
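For reference, a small sketch computing them directly, reusing cnt1 from above:
@code{.py}
M = cv.moments(cnt1)
hu = cv.HuMoments(M)
print(hu.ravel())   # the seven invariant values
@endcode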
Additional Resources
--------------------
Exercises
---------
-# Check the documentation for **cv.pointPolygonTest()**; you can find a nice image in red and
    blue. It represents the distance from all pixels to the white curve on it. All pixels
    inside the curve are blue, depending on the distance. Similarly, outside points are red. The contour
    edges are marked in white. So the problem is simple: write code to create such a representation of
    distance. A brute-force sketch follows.
-# Compare images of digits or letters using **cv.matchShapes()**. (That would be a simple step
    towards OCR.)
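Here is a brute-force sketch of the first exercise, assuming the star.jpg image used earlier (it calls
cv.pointPolygonTest once per pixel, so it is slow on large images):
@code{.py}
import numpy as np
import cv2 as cv
img_gray = cv.imread('star.jpg', 0)
ret, thresh = cv.threshold(img_gray, 127, 255, 0)
contours, hierarchy = cv.findContours(thresh, 2, 1)
cnt = contours[0]
h, w = img_gray.shape
vis = np.zeros((h, w, 3), np.uint8)
for y in range(h):
    for x in range(w):
        d = cv.pointPolygonTest(cnt, (x, y), True)
        v = min(int(abs(d)) * 3, 255)       # brightness grows with distance
        if d > 0:
            vis[y, x] = (v, 0, 0)           # inside: blue (BGR)
        elif d < 0:
            vis[y, x] = (0, 0, v)           # outside: red
        else:
            vis[y, x] = (255, 255, 255)     # on the contour: white
cv.imshow('distance', vis)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode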


@ -0,0 +1,26 @@
Contours in OpenCV {#tutorial_py_table_of_contents_contours}
==================
- @subpage tutorial_py_contours_begin
    Learn to find and draw Contours
- @subpage tutorial_py_contour_features
    Learn to find different features of contours like area, perimeter, bounding rectangle etc.
- @subpage tutorial_py_contour_properties
    Learn to find different properties of contours like Solidity, Mean Intensity etc.
- @subpage tutorial_py_contours_more_functions
    Learn to find convexity defects, pointPolygonTest, match different shapes etc.
- @subpage tutorial_py_contours_hierarchy
    Learn about Contour Hierarchy


@ -0,0 +1,153 @@
Smoothing Images {#tutorial_py_filtering}
================
Goals
-----
Learn to:
- Blur images with various low pass filters
- Apply custom-made filters to images (2D convolution)
2D Convolution ( Image Filtering )
----------------------------------
As in one-dimensional signals, images also can be filtered with various low-pass filters (LPF),
high-pass filters (HPF), etc. An LPF helps in removing noise and blurring images, while an HPF helps
in finding edges in images.
OpenCV provides a function **cv.filter2D()** to convolve a kernel with an image. As an example, we
will try an averaging filter on an image. A 5x5 averaging filter kernel will look like the below:
\f[K = \frac{1}{25} \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix}\f]
The operation works like this: keep this kernel above a pixel, add all the 25 pixels below this kernel,
take the average, and replace the central pixel with the new average value. This operation is continued
for all the pixels in the image. Try this code and check the result:
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('opencv_logo.png')
kernel = np.ones((5,5),np.float32)/25
dst = cv.filter2D(img,-1,kernel)
plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(dst),plt.title('Averaging')
plt.xticks([]), plt.yticks([])
plt.show()
@endcode
Result:
![image](images/filter.jpg)
Image Blurring (Image Smoothing)
--------------------------------
Image blurring is achieved by convolving the image with a low-pass filter kernel. It is useful for
removing noise. It actually removes high frequency content (eg: noise, edges) from the image. So
edges are blurred a little bit in this operation (there are also blurring techniques which don't
blur the edges). OpenCV provides four main types of blurring techniques.
### 1. Averaging
This is done by convolving an image with a normalized box filter. It simply takes the average of all
the pixels under the kernel area and replaces the central element. This is done by the function
**cv.blur()** or **cv.boxFilter()**. Check the docs for more details about the kernel. We should
specify the width and height of the kernel. A 3x3 normalized box filter would look like the below:
\f[K = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}\f]
@note If you don't want to use a normalized box filter, use **cv.boxFilter()**. Pass an argument
normalize=False to the function.
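For instance, a sketch of an unnormalized box filter, which sums rather than averages the pixels under
the kernel:
@code{.py}
# with ddepth=-1 the uint8 result saturates at 255; pass e.g. cv.CV_32S to keep full sums
summed = cv.boxFilter(img, -1, (5,5), normalize=False)
@endcode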
Check a sample demo below with a kernel of 5x5 size:
@code{.py}
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt
img = cv.imread('opencv-logo-white.png')
blur = cv.blur(img,(5,5))
plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(blur),plt.title('Blurred')
plt.xticks([]), plt.yticks([])
plt.show()
@endcode
Result:
![image](images/blur.jpg)
### 2. Gaussian Blurring
In this method, instead of a box filter, a Gaussian kernel is used. It is done with the function,
**cv.GaussianBlur()**. We should specify the width and height of the kernel which should be positive
and odd. We also should specify the standard deviation in the X and Y directions, sigmaX and sigmaY
respectively. If only sigmaX is specified, sigmaY is taken as the same as sigmaX. If both are given as
zeros, they are calculated from the kernel size. Gaussian blurring is highly effective in removing
Gaussian noise from an image.
If you want, you can create a Gaussian kernel with the function, **cv.getGaussianKernel()**.
The above code can be modified for Gaussian blurring:
@code{.py}
blur = cv.GaussianBlur(img,(5,5),0)
@endcode
Result:
![image](images/gaussian.jpg)
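As a sketch of what cv.getGaussianKernel() gives you, the equivalent 2D kernel can be built from two
1D Gaussians and applied with cv.filter2D(), which should closely match cv.GaussianBlur(img,(5,5),0):
@code{.py}
k1d = cv.getGaussianKernel(5, 0)   # sigma=0 derives sigma from the kernel size
k2d = k1d @ k1d.T                  # outer product -> separable 2D kernel
dst = cv.filter2D(img, -1, k2d)
@endcode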
### 3. Median Blurring
Here, the function **cv.medianBlur()** takes the median of all the pixels under the kernel area and the central
element is replaced with this median value. This is highly effective against salt-and-pepper noise
in an image. Interestingly, in the above filters, the central element is a newly
calculated value which may be a pixel value in the image or a new value. But in median blurring,
the central element is always replaced by some pixel value in the image. It reduces the noise
effectively. Its kernel size should be a positive odd integer.
In this demo, I added 50% noise to our original image and applied median blurring. Check the result:
@code{.py}
median = cv.medianBlur(img,5)
@endcode
Result:
![image](images/median.jpg)
### 4. Bilateral Filtering
**cv.bilateralFilter()** is highly effective in noise removal while keeping edges sharp. But the
operation is slower compared to other filters. We already saw that a Gaussian filter takes the
neighbourhood around the pixel and finds its Gaussian weighted average. This Gaussian filter is a
function of space alone, that is, nearby pixels are considered while filtering. It doesn't consider
whether pixels have almost the same intensity. It doesn't consider whether a pixel is an edge pixel or
not. So it blurs the edges also, which we don't want to do.
Bilateral filtering also takes a Gaussian filter in space, but one more Gaussian filter which is a
function of pixel difference. The Gaussian function of space makes sure that only nearby pixels are considered
for blurring, while the Gaussian function of intensity difference makes sure that only those pixels with
similar intensities to the central pixel are considered for blurring. So it preserves the edges since
pixels at edges will have large intensity variation.
The below sample shows use of a bilateral filter (For details on arguments, visit docs).
@code{.py}
blur = cv.bilateralFilter(img,9,75,75)
@endcode
Result:
![image](images/bilateral.jpg)
See, the texture on the surface is gone, but the edges are still preserved.
Additional Resources
--------------------
-# Details about the [bilateral filtering](http://people.csail.mit.edu/sparis/bf_course/)
Exercises
---------


@ -0,0 +1,163 @@
Geometric Transformations of Images {#tutorial_py_geometric_transformations}
===================================
Goals
-----
- Learn to apply different geometric transformations to images, like translation, rotation, affine
transformation etc.
- You will see these functions: **cv.getPerspectiveTransform**
Transformations
---------------
OpenCV provides two transformation functions, **cv.warpAffine** and **cv.warpPerspective**, with
which you can perform all kinds of transformations. **cv.warpAffine** takes a 2x3 transformation
matrix while **cv.warpPerspective** takes a 3x3 transformation matrix as input.
### Scaling
Scaling is just resizing of the image. OpenCV comes with a function **cv.resize()** for this
purpose. The size of the image can be specified manually, or you can specify the scaling factor.
Different interpolation methods are used. Preferable interpolation methods are **cv.INTER_AREA**
for shrinking and **cv.INTER_CUBIC** (slow) & **cv.INTER_LINEAR** for zooming. By default,
the interpolation method **cv.INTER_LINEAR** is used for all resizing purposes. You can resize an
input image with either of the following methods:
@code{.py}
import numpy as np
import cv2 as cv
img = cv.imread('messi5.jpg')
res = cv.resize(img,None,fx=2, fy=2, interpolation = cv.INTER_CUBIC)
#OR
height, width = img.shape[:2]
res = cv.resize(img,(2*width, 2*height), interpolation = cv.INTER_CUBIC)
@endcode
### Translation
Translation is the shifting of an object's location. If you know the shift in the (x,y) direction and let it
be \f$(t_x,t_y)\f$, you can create the transformation matrix \f$\textbf{M}\f$ as follows:
\f[M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}\f]
You can make it into a Numpy array of type np.float32 and pass it into the **cv.warpAffine()**
function. See the example below for a shift of (100,50):
@code{.py}
import numpy as np
import cv2 as cv
img = cv.imread('messi5.jpg',0)
rows,cols = img.shape
M = np.float32([[1,0,100],[0,1,50]])
dst = cv.warpAffine(img,M,(cols,rows))
cv.imshow('img',dst)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode
@warning The third argument of the **cv.warpAffine()** function is the size of the output image, which should
be in the form of **(width, height)**. Remember width = number of columns, and height = number of
rows.
See the result below:
![image](images/translation.jpg)
### Rotation
Rotation of an image for an angle \f$\theta\f$ is achieved by the transformation matrix of the form
\f[M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\f]
But OpenCV provides scaled rotation with adjustable center of rotation so that you can rotate at any
location you prefer. The modified transformation matrix is given by
\f[\begin{bmatrix} \alpha & \beta & (1- \alpha ) \cdot center.x - \beta \cdot center.y \\ - \beta & \alpha & \beta \cdot center.x + (1- \alpha ) \cdot center.y \end{bmatrix}\f]
where:
\f[\begin{array}{l} \alpha = scale \cdot \cos \theta , \\ \beta = scale \cdot \sin \theta \end{array}\f]
To find this transformation matrix, OpenCV provides a function, **cv.getRotationMatrix2D**. Check out the
example below, which rotates the image by 90 degrees with respect to the center without any scaling.
@code{.py}
img = cv.imread('messi5.jpg',0)
rows,cols = img.shape
# cols-1 and rows-1 are the coordinate limits.
M = cv.getRotationMatrix2D(((cols-1)/2.0,(rows-1)/2.0),90,1)
dst = cv.warpAffine(img,M,(cols,rows))
@endcode
See the result:
![image](images/rotation.jpg)
### Affine Transformation
In affine transformation, all parallel lines in the original image will still be parallel in the
output image. To find the transformation matrix, we need three points from the input image and their
corresponding locations in the output image. Then **cv.getAffineTransform** will create a 2x3 matrix
which is to be passed to **cv.warpAffine**.
Check the below example, and also look at the points I selected (which are marked in green color):
@code{.py}
img = cv.imread('drawing.png')
rows,cols,ch = img.shape
pts1 = np.float32([[50,50],[200,50],[50,200]])
pts2 = np.float32([[10,100],[200,50],[100,250]])
M = cv.getAffineTransform(pts1,pts2)
dst = cv.warpAffine(img,M,(cols,rows))
plt.subplot(121),plt.imshow(img),plt.title('Input')
plt.subplot(122),plt.imshow(dst),plt.title('Output')
plt.show()
@endcode
See the result:
![image](images/affine.jpg)
### Perspective Transformation
For perspective transformation, you need a 3x3 transformation matrix. Straight lines will remain
straight even after the transformation. To find this transformation matrix, you need 4 points on the
input image and the corresponding points on the output image. Among these 4 points, no 3 should
be collinear. The transformation matrix can then be found by the function
**cv.getPerspectiveTransform**. Then apply **cv.warpPerspective** with this 3x3 transformation
matrix.
See the code below:
@code{.py}
img = cv.imread('sudoku.png')
rows,cols,ch = img.shape
pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[0,0],[300,0],[0,300],[300,300]])
M = cv.getPerspectiveTransform(pts1,pts2)
dst = cv.warpPerspective(img,M,(300,300))
plt.subplot(121),plt.imshow(img),plt.title('Input')
plt.subplot(122),plt.imshow(dst),plt.title('Output')
plt.show()
@endcode
Result:
![image](images/perspective.jpg)
Additional Resources
--------------------
-# "Computer Vision: Algorithms and Applications", Richard Szeliski
Exercises
---------


@ -0,0 +1,156 @@
Interactive Foreground Extraction using GrabCut Algorithm {#tutorial_py_grabcut}
=========================================================
Goal
----
In this chapter
- We will see GrabCut algorithm to extract foreground in images
- We will create an interactive application for this.
Theory
------
The GrabCut algorithm was designed by Carsten Rother, Vladimir Kolmogorov & Andrew Blake from Microsoft
Research Cambridge, UK, in their paper ["GrabCut": interactive foreground extraction using iterated
graph cuts](http://dl.acm.org/citation.cfm?id=1015720). An algorithm was needed for foreground
extraction with minimal user interaction, and the result was GrabCut.
How does it work from the user's point of view? Initially the user draws a rectangle around the foreground region
(the foreground region should be completely inside the rectangle). Then the algorithm segments it
iteratively to get the best result. Done. But in some cases, the segmentation won't be fine; it
may have marked some foreground region as background and vice versa. In that case, the user needs to
do fine touch-ups: just give some strokes on the image where the faulty results are. A stroke
basically says *"Hey, this region should be foreground, you marked it background, correct it in the next
iteration"*, or its opposite for background. Then in the next iteration, you get better results.
See the image below. First, the player and football are enclosed in a blue rectangle. Then some final
touch-ups with white strokes (denoting foreground) and black strokes (denoting background) are made,
and we get a nice result.
![image](images/grabcut_output1.jpg)
So what happens in the background?
- The user inputs the rectangle. Everything outside this rectangle will be taken as sure background
    (that is the reason it was mentioned before that your rectangle should include all the
    objects). Everything inside the rectangle is unknown. Similarly, any user input specifying
    foreground and background is considered hard-labelling, which means it won't change in
    the process.
- The computer does an initial labelling depending on the data we gave. It labels the foreground and
    background pixels (or it hard-labels them).
- Now a Gaussian Mixture Model (GMM) is used to model the foreground and background.
- Depending on the data we gave, the GMM learns and creates a new pixel distribution. That is, the
    unknown pixels are labelled either probable foreground or probable background depending on their
    relation to the other hard-labelled pixels in terms of color statistics (it is just like
    clustering).
- A graph is built from this pixel distribution. Nodes in the graph are pixels. Two additional
    nodes are added, a **Source node** and a **Sink node**. Every foreground pixel is connected to the
    Source node and every background pixel is connected to the Sink node.
- The weights of the edges connecting pixels to the source node/sink node are defined by the probability
    of a pixel being foreground/background. The weights between the pixels are defined by the edge
    information or pixel similarity. If there is a large difference in pixel color, the edge
    between them will get a low weight.
- Then a mincut algorithm is used to segment the graph. It cuts the graph into two, separating the
    source node and the sink node, with a minimum cost function. The cost function is the sum of all
    the weights of the edges that are cut. After the cut, all the pixels connected to the Source node
    become foreground and those connected to the Sink node become background.
- The process is continued until the classification converges.
It is illustrated in the image below (Image Courtesy: <http://www.cs.ru.ac.za/research/g02m1682/>)
![image](images/grabcut_scheme.jpg)
Demo
----
Now we go for the grabcut algorithm with OpenCV. OpenCV has the function **cv.grabCut()** for this. We
will see its arguments first:
- *img* - Input image
- *mask* - A mask image where we specify which areas are background, foreground or
    probable background/foreground etc. It is done by the following flags, **cv.GC_BGD,
    cv.GC_FGD, cv.GC_PR_BGD, cv.GC_PR_FGD**, or simply pass 0,1,2,3 to the image.
- *rect* - The coordinates of a rectangle which includes the foreground object, in the
    format (x,y,w,h)
- *bgdModel*, *fgdModel* - These are arrays used by the algorithm internally. You just create
    two np.float64 type zero arrays of size (1,65).
- *iterCount* - Number of iterations the algorithm should run.
- *mode* - It should be **cv.GC_INIT_WITH_RECT** or **cv.GC_INIT_WITH_MASK** or combined,
    which decides whether we are drawing the rectangle or the final touch-up strokes.
First let's see with rectangular mode. We load the image and create a similar mask image. We create
*fgdModel* and *bgdModel*. We give the rectangle parameters. It's all straightforward. Let the
algorithm run for 5 iterations. Mode should be *cv.GC_INIT_WITH_RECT* since we are using a
rectangle. Then run the grabcut. It modifies the mask image. In the new mask image, pixels will be
marked with the four flags denoting background/foreground as specified above. So we modify the mask such
that all 0-pixels and 2-pixels are put to 0 (ie background) and all 1-pixels and 3-pixels are put to
1 (ie foreground pixels). Now our final mask is ready. Just multiply it with the input image to get the
segmented image.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('messi5.jpg')
mask = np.zeros(img.shape[:2],np.uint8)

# internal arrays used by the algorithm
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)

rect = (50,50,450,290)
cv.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv.GC_INIT_WITH_RECT)

# 0 and 2 are background flags, 1 and 3 are foreground flags
mask2 = np.where((mask==2)|(mask==0),0,1).astype('uint8')
img = img*mask2[:,:,np.newaxis]
plt.imshow(img),plt.colorbar(),plt.show()
@endcode
See the results below:
![image](images/grabcut_rect.jpg)
Oops, Messi's hair is gone. *Who likes Messi without his hair?* We need to bring it back. So we will
give it a fine touch-up with 1-pixels (sure foreground). At the same time, some part of the ground has
come into the picture which we don't want, and also some of the logo. We need to remove them. There we give some
0-pixel touch-ups (sure background). So we modify our resulting mask from the previous case as just described.

*What I actually did is that I opened the input image in a paint application and added another layer to
the image. Using the brush tool, I marked the missed foreground (hair, shoes, ball etc) with
white and the unwanted background (like the logo, ground etc) with black on this new layer. Then I filled the
remaining background with gray. Then I loaded that mask image in OpenCV and edited the original mask image we
got, with the corresponding values from the newly added mask image. Check the code below:*
@code{.py}
# newmask is the mask image I manually labelled
newmask = cv.imread('newmask.png',0)
# wherever it is marked white (sure foreground), change mask=1
# wherever it is marked black (sure background), change mask=0
mask[newmask == 0] = 0
mask[newmask == 255] = 1
mask, bgdModel, fgdModel = cv.grabCut(img,mask,None,bgdModel,fgdModel,5,cv.GC_INIT_WITH_MASK)
mask = np.where((mask==2)|(mask==0),0,1).astype('uint8')
img = img*mask[:,:,np.newaxis]
plt.imshow(img),plt.colorbar(),plt.show()
@endcode
See the result below:
![image](images/grabcut_mask.jpg)
So that's it. Here, instead of initializing in rect mode, you can directly go into mask mode. Just
mark the rectangle area in the mask image with 2-pixels or 3-pixels (probable background/foreground). Then
mark the sure foreground with 1-pixels as we did in the second example, and directly apply the grabCut
function with mask mode, as in the sketch below.
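A minimal sketch of that, assuming the same hand-drawn 'newmask.png' strokes image as above, and an assumed probable-foreground region roughly matching the earlier rectangle:
@code{.py}
import numpy as np
import cv2 as cv

img = cv.imread('messi5.jpg')
newmask = cv.imread('newmask.png',0)

mask = np.zeros(img.shape[:2], np.uint8)  # everything starts as sure background (cv.GC_BGD = 0)
mask[50:340, 50:500] = cv.GC_PR_FGD       # assumed object region, marked probable foreground (3)
mask[newmask == 255] = cv.GC_FGD          # white strokes: sure foreground (1)
mask[newmask == 0] = cv.GC_BGD            # black strokes: sure background (0)

bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)

# no rect needed; the mask itself carries the initialization
cv.grabCut(img,mask,None,bgdModel,fgdModel,5,cv.GC_INIT_WITH_MASK)

mask2 = np.where((mask==cv.GC_BGD)|(mask==cv.GC_PR_BGD),0,1).astype('uint8')
result = img*mask2[:,:,np.newaxis]
@endcode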
Additional Resources
--------------------
Exercises
---------
-# OpenCV samples contain a sample grabcut.py which is an interactive tool using grabcut. Check it.
Also watch this [youtube video](http://www.youtube.com/watch?v=kAwxLTDDAwU) on how to use it.
-# Here, you can make this into an interactive sample by drawing the rectangle and strokes with the mouse,
creating a trackbar to adjust stroke width, etc.


View File

@ -0,0 +1,109 @@
Image Gradients {#tutorial_py_gradients}
===============
Goal
----
In this chapter, we will learn to:
- Find Image gradients, edges etc
- We will see following functions : **cv.Sobel()**, **cv.Scharr()**, **cv.Laplacian()** etc
Theory
------
OpenCV provides three types of gradient filters or High-pass filters, Sobel, Scharr and Laplacian.
We will see each one of them.
### 1. Sobel and Scharr Derivatives
The Sobel operator is a joint Gaussian smoothing plus differentiation operation, so it is more
resistant to noise. You can specify the direction of the derivative to be taken, vertical or horizontal
(by the arguments yorder and xorder respectively). You can also specify the size of the kernel by the
argument ksize. If ksize = -1, a 3x3 Scharr filter is used, which gives better results than a 3x3 Sobel
filter. Please see the docs for the kernels used; a quick comparison sketch follows.
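As a quick check, here is a minimal sketch comparing the two (it reuses the 'dave.jpg' image from the code further below; the point being illustrated is that cv.Scharr() and ksize = -1 are equivalent):
@code{.py}
import numpy as np
import cv2 as cv

img = cv.imread('dave.jpg',0)

sobelx3 = cv.Sobel(img, cv.CV_64F, 1, 0, ksize=3)   # 3x3 Sobel, x-derivative

# The two calls below are equivalent: ksize = -1 selects the 3x3 Scharr kernel
scharrx = cv.Scharr(img, cv.CV_64F, 1, 0)
scharrx2 = cv.Sobel(img, cv.CV_64F, 1, 0, ksize=-1)
assert np.allclose(scharrx, scharrx2)
@endcode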
### 2. Laplacian Derivatives
It calculates the Laplacian of the image given by the relation
\f$\Delta src = \frac{\partial ^2{src}}{\partial x^2} + \frac{\partial ^2{src}}{\partial y^2}\f$, where
each derivative is found using Sobel derivatives. If ksize = 1, then the following kernel is used for
filtering:
\f[kernel = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}\f]
Code
----
The code below shows all the operators in a single diagram. All kernels are of 5x5 size. The depth of the
output image is set to cv.CV_64F here (if -1 is passed, the result keeps the source depth, np.uint8; see the note further below on why that can be a problem).
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('dave.jpg',0)
laplacian = cv.Laplacian(img,cv.CV_64F)
sobelx = cv.Sobel(img,cv.CV_64F,1,0,ksize=5)
sobely = cv.Sobel(img,cv.CV_64F,0,1,ksize=5)
plt.subplot(2,2,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,2),plt.imshow(laplacian,cmap = 'gray')
plt.title('Laplacian'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,3),plt.imshow(sobelx,cmap = 'gray')
plt.title('Sobel X'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,4),plt.imshow(sobely,cmap = 'gray')
plt.title('Sobel Y'), plt.xticks([]), plt.yticks([])
plt.show()
@endcode
Result:
![image](images/gradients.jpg)
One Important Matter!
---------------------
In our last example, the output datatype is cv.CV_8U or np.uint8. But there is a slight problem with
that. A Black-to-White transition is taken as a positive slope (it has a positive value) while a
White-to-Black transition is taken as a negative slope (it has a negative value). So when you convert
data to np.uint8, all negative slopes are made zero. In simple words, you miss that edge.

If you want to detect both edges, a better option is to keep the output datatype in some higher form,
like cv.CV_16S, cv.CV_64F etc, take its absolute value and then convert back to cv.CV_8U.
The code below demonstrates this procedure for a horizontal Sobel filter and the difference in results.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('box.png',0)
# Output dtype = cv.CV_8U
sobelx8u = cv.Sobel(img,cv.CV_8U,1,0,ksize=5)
# Output dtype = cv.CV_64F. Then take its absolute and convert to cv.CV_8U
sobelx64f = cv.Sobel(img,cv.CV_64F,1,0,ksize=5)
abs_sobel64f = np.absolute(sobelx64f)
sobel_8u = np.uint8(abs_sobel64f)
plt.subplot(1,3,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(1,3,2),plt.imshow(sobelx8u,cmap = 'gray')
plt.title('Sobel CV_8U'), plt.xticks([]), plt.yticks([])
plt.subplot(1,3,3),plt.imshow(sobel_8u,cmap = 'gray')
plt.title('Sobel abs(CV_64F)'), plt.xticks([]), plt.yticks([])
plt.show()
@endcode
Check the result below:
![image](images/double_edge.jpg)
Additional Resources
--------------------
Exercises
---------


View File

@ -0,0 +1,130 @@
Histograms - 3 : 2D Histograms {#tutorial_py_2d_histogram}
==============================
Goal
----
In this chapter, we will learn to find and plot 2D histograms. It will be helpful in coming
chapters.
Introduction
------------
In the first article, we calculated and plotted a one-dimensional histogram. It is called
one-dimensional because we are taking only one feature into consideration, ie the grayscale
intensity value of the pixel. But in two-dimensional histograms, you consider two features. Normally
it is used for finding color histograms, where the two features are the Hue & Saturation values of every
pixel.

There is already a python sample (samples/python/color_histogram.py)
for finding color histograms. We will try to understand how to create such a color histogram, and it
will be useful in understanding further topics like Histogram Back-Projection.
2D Histogram in OpenCV
----------------------
It is quite simple and calculated using the same function, **cv.calcHist()**. For color histograms,
we need to convert the image from BGR to HSV. (Remember, for 1D histogram, we converted from BGR to
Grayscale). For 2D histograms, its parameters will be modified as follows:
- **channels = [0,1]** *because we need to process both H and S plane.*
- **bins = [180,256]** *180 for H plane and 256 for S plane.*
- **range = [0,180,0,256]** *Hue value lies between 0 and 180 & Saturation lies between 0 and
256.*
Now check the code below:
@code{.py}
import numpy as np
import cv2 as cv
img = cv.imread('home.jpg')
hsv = cv.cvtColor(img,cv.COLOR_BGR2HSV)
hist = cv.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
@endcode
That's it.
2D Histogram in Numpy
---------------------
Numpy also provides a specific function for this : **np.histogram2d()**. (Remember, for 1D histogram
we used **np.histogram()** ).
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('home.jpg')
hsv = cv.cvtColor(img,cv.COLOR_BGR2HSV)
h, s = hsv[:,:,0], hsv[:,:,1]   # split out the H and S planes
hist, xbins, ybins = np.histogram2d(h.ravel(),s.ravel(),[180,256],[[0,180],[0,256]])
@endcode
The first argument is the H plane, the second one is the S plane, the third is the number of bins for each and the fourth is
their range.
Now we can check how to plot this color histogram.
Plotting 2D Histograms
----------------------
### Method - 1 : Using cv.imshow()
The result we get is a two-dimensional array of size 180x256. So we can show it as we normally do,
using the cv.imshow() function. It will be a grayscale image and it won't give much idea of what colors
are there, unless you know the Hue values of the different colors.
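A minimal sketch of that (scaling the raw counts into the displayable 0-255 range is an extra step I've assumed here, since bin counts usually exceed 255):
@code{.py}
import numpy as np
import cv2 as cv

img = cv.imread('home.jpg')
hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
hist = cv.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])

# scale counts to 0-255 so the 180x256 array is viewable as a grayscale image
vis = cv.normalize(hist, None, 0, 255, cv.NORM_MINMAX).astype(np.uint8)
cv.imshow('2D histogram', vis)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode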
### Method - 2 : Using Matplotlib
We can use the **matplotlib.pyplot.imshow()** function to plot the 2D histogram with different color maps.
It gives us a much better idea about the different pixel densities. But this also doesn't give us
an idea of what color is there at first glance, unless you know the Hue values of the different colors. Still,
I prefer this method. It is simple and better.
@note While using this function, remember that the interpolation flag should be nearest for better results.

Consider the code:
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('home.jpg')
hsv = cv.cvtColor(img,cv.COLOR_BGR2HSV)
hist = cv.calcHist( [hsv], [0, 1], None, [180, 256], [0, 180, 0, 256] )
plt.imshow(hist,interpolation = 'nearest')
plt.show()
@endcode
Below is the input image and its color histogram plot. X axis shows S values and Y axis shows Hue.
![image](images/2dhist_matplotlib.jpg)
In the histogram, you can see some high values near H = 100 and S = 200. It corresponds to the blue of the sky.
Similarly, another peak can be seen near H = 25 and S = 100. It corresponds to the yellow of the palace.
You can verify it with any image editing tool like GIMP.
### Method 3 : OpenCV sample style !!

There is sample code for color histograms in the OpenCV-Python2 samples
(samples/python/color_histogram.py).
If you run the code, you can see that the histogram shows the corresponding color as well;
simply put, it outputs a color-coded histogram.
Its result is very good (although you need to add an extra bunch of lines).

In that code, the author created a color map in HSV, then converted it into BGR. The resulting
histogram image is multiplied with this color map. He also uses some preprocessing steps to remove
small isolated pixels, resulting in a good histogram.

I leave it to the readers to run the code, analyze it and come up with their own hacks. Below is the
output of that code for the same image as above:
![image](images/2dhist_opencv.jpg)
You can clearly see in the histogram what colors are present: blue is there, yellow is there, and
some white due to the chessboard. Nice !!!
Additional Resources
--------------------
Exercises
---------


View File

@ -0,0 +1,124 @@
Histogram - 4 : Histogram Backprojection {#tutorial_py_histogram_backprojection}
========================================
Goal
----
In this chapter, we will learn about histogram backprojection.
Theory
------
It was proposed by **Michael J. Swain and Dana H. Ballard** in their paper **Indexing via color
histograms**.

**What is it actually, in simple words?** It is used for image segmentation or finding objects of
interest in an image. In simple words, it creates an image of the same size (but single channel) as
that of our input image, where each pixel corresponds to the probability of that pixel belonging to
our object. In simpler words, the output image will have our object of interest in more white
compared to the remaining part. Well, that is an intuitive explanation. (I can't make it any simpler).
Histogram Backprojection is used with the camshift algorithm etc.
**How do we do it?** We create a histogram of an image containing our object of interest (in our
case the ground, leaving out the player and other things). The object should fill the image as far as
possible for better results. And a color histogram is preferred over a grayscale histogram, because
the color of the object is a better way to define the object than its grayscale intensity. We then
"back-project" this histogram over our test image where we need to find the object, ie in other
words, we calculate the probability of every pixel belonging to the ground and show it. The
resulting output, on proper thresholding, gives us the ground alone.
Algorithm in Numpy
------------------
-# First we need to calculate the color histogram of both the object we need to find (let it be
'M') and the image where we are going to search (let it be 'I').
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

#roi is the object or region of object we need to find
roi = cv.imread('rose_red.png')
hsv = cv.cvtColor(roi,cv.COLOR_BGR2HSV)

#target is the image we search in
target = cv.imread('rose.png')
hsvt = cv.cvtColor(target,cv.COLOR_BGR2HSV)

# Find the histograms using calcHist. Can be done with np.histogram2d also
M = cv.calcHist([hsv],[0, 1], None, [180, 256], [0, 180, 0, 256] )
I = cv.calcHist([hsvt],[0, 1], None, [180, 256], [0, 180, 0, 256] )
@endcode
2. Find the ratio \f$R = \frac{M}{I}\f$. Then backproject R, ie use R as a palette and create a new image
    with every pixel as its corresponding probability of being the target, ie \f$B(x,y) = R[h(x,y),s(x,y)]\f$,
    where h is the hue and s is the saturation of the pixel at (x,y). After that, apply the condition
    \f$B(x,y) = min[B(x,y), 1]\f$.
@code{.py}
R = M/I   # the ratio histogram defined above
h,s,v = cv.split(hsvt)
B = R[h.ravel(),s.ravel()]
B = np.minimum(B,1)
B = B.reshape(hsvt.shape[:2])
@endcode
3. Now apply a convolution with a circular disc, \f$B = D \ast B\f$, where D is the disc kernel.
@code{.py}
disc = cv.getStructuringElement(cv.MORPH_ELLIPSE,(5,5))
cv.filter2D(B,-1,disc,B)
B = np.uint8(B)
cv.normalize(B,B,0,255,cv.NORM_MINMAX)
@endcode
4. Now the location of maximum intensity gives us the location of the object. If we are expecting a
region in the image, thresholding for a suitable value gives a nice result.
@code{.py}
ret,thresh = cv.threshold(B,50,255,0)
@endcode
That's it !!
Backprojection in OpenCV
------------------------
OpenCV provides an inbuilt function, **cv.calcBackProject()**. Its parameters are almost the same as those of the
**cv.calcHist()** function. One of its parameters is the histogram of the object, which we have to find. Also, the object histogram should be normalized before being passed to the
backproject function. It returns the probability image. Then we convolve the image with a disc
kernel and apply a threshold. Below is my code and output:
@code{.py}
import numpy as np
import cv2 as cv
roi = cv.imread('rose_red.png')
hsv = cv.cvtColor(roi,cv.COLOR_BGR2HSV)
target = cv.imread('rose.png')
hsvt = cv.cvtColor(target,cv.COLOR_BGR2HSV)
# calculating object histogram
roihist = cv.calcHist([hsv],[0, 1], None, [180, 256], [0, 180, 0, 256] )
# normalize histogram and apply backprojection
cv.normalize(roihist,roihist,0,255,cv.NORM_MINMAX)
dst = cv.calcBackProject([hsvt],[0,1],roihist,[0,180,0,256],1)
# Now convolute with circular disc
disc = cv.getStructuringElement(cv.MORPH_ELLIPSE,(5,5))
cv.filter2D(dst,-1,disc,dst)
# threshold and binary AND
ret,thresh = cv.threshold(dst,50,255,0)
thresh = cv.merge((thresh,thresh,thresh))
res = cv.bitwise_and(target,thresh)
res = np.vstack((target,thresh,res))
cv.imwrite('res.jpg',res)
@endcode
Below is one example I worked with. I used the region inside the blue rectangle as the sample object, and I
wanted to extract the full ground.
![image](images/backproject_opencv.jpg)
Additional Resources
--------------------
-# "Indexing via color histograms", Swain, Michael J. , Third international conference on computer
vision,1990.
Exercises
---------


View File

@ -0,0 +1,198 @@
Histograms - 1 : Find, Plot, Analyze !!! {#tutorial_py_histogram_begins}
========================================
Goal
----
Learn to
- Find histograms, using both OpenCV and Numpy functions
- Plot histograms, using OpenCV and Matplotlib functions
- You will see these functions : **cv.calcHist()**, **np.histogram()** etc.
Theory
------
So what is a histogram? You can consider a histogram as a graph or plot which gives you an overall
idea about the intensity distribution of an image. It is a plot with pixel values (ranging from 0 to
255, though not always) on the X-axis and the corresponding number of pixels in the image on the Y-axis.
It is just another way of understanding the image. By looking at the histogram of an image, you get
intuition about the contrast, brightness, intensity distribution etc of that image. Almost all image
processing tools today provide features on histograms. Below is an image from the [Cambridge in Color
website](http://www.cambridgeincolour.com/tutorials/histograms1.htm), and I recommend you visit
the site for more details.
![image](images/histogram_sample.jpg)
You can see the image and its histogram. (Remember, this histogram is drawn for a grayscale image, not a
color image). The left region of the histogram shows the amount of darker pixels in the image and the right region
shows the amount of brighter pixels. From the histogram, you can see that the dark region is larger than the
brighter region, and the amount of midtones (pixel values in the mid-range, say around 127) is much smaller.
Find Histogram
--------------
Now that we have an idea of what a histogram is, we can look into how to find one. Both OpenCV and Numpy
come with built-in functions for this. Before using those functions, we need to understand some
terminology related to histograms.
**BINS** : The above histogram shows the number of pixels for every pixel value, ie from 0 to 255, so
you need 256 values to show the above histogram. But consider: what if you need not find the number
of pixels for all pixel values separately, but the number of pixels in an interval of pixel values? Say,
for example, you need to find the number of pixels lying between 0 and 15, then 16 and 31, ..., 240 and 255.
You will need only 16 values to represent the histogram. And that is what is shown in the example
given in @ref tutorial_histogram_calculation "OpenCV Tutorials on histograms".

So what you do is simply split the whole histogram into 16 sub-parts, and the value of each sub-part is the
sum of all the pixel counts in it. Each sub-part is called a "BIN". In the first case, the number of bins
was 256 (one for each pixel) while in the second case, it is only 16. BINS is represented by the term
**histSize** in the OpenCV docs.

**DIMS** : It is the number of parameters for which we collect the data. In this case, we collect
data regarding only one thing, the intensity value. So here it is 1.

**RANGE** : It is the range of intensity values you want to measure. Normally, it is [0,256], ie all
intensity values.
### 1. Histogram Calculation in OpenCV
So now we use the **cv.calcHist()** function to find the histogram. Let's familiarize ourselves with the function
and its parameters :
<center><em>cv.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])</em></center>
-# images : it is the source image of type uint8 or float32. It should be given in square brackets,
    ie, "[img]".
-# channels : it is also given in square brackets. It is the index of the channel for which we
    calculate the histogram. For example, if the input is a grayscale image, its value is [0]. For a color
    image, you can pass [0], [1] or [2] to calculate the histogram of the blue, green or red channel
    respectively.
-# mask : mask image. To find the histogram of the full image, it is given as "None". But if you want to
    find the histogram of a particular region of the image, you have to create a mask image for that and give
    it as the mask. (I will show an example later.)
-# histSize : this represents our BIN count. It needs to be given in square brackets. For full scale,
    we pass [256].
-# ranges : this is our RANGE. Normally, it is [0,256].
So let's start with a sample image. Simply load an image in grayscale mode and find its full
histogram.
@code{.py}
img = cv.imread('home.jpg',0)
hist = cv.calcHist([img],[0],None,[256],[0,256])
@endcode
hist is a 256x1 array; each value corresponds to the number of pixels in that image with the
corresponding pixel value.
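Tying this back to the BINS discussion above, the 16-bin version is just a change of histSize (a quick sketch, reusing the same 'home.jpg'):
@code{.py}
import cv2 as cv

img = cv.imread('home.jpg',0)
# 16 bins, each covering 16 consecutive intensity values
hist16 = cv.calcHist([img],[0],None,[16],[0,256])
print(hist16.shape)   # (16, 1)
@endcode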
### 2. Histogram Calculation in Numpy
Numpy also provides you a function, **np.histogram()**. So instead of calcHist() function, you can
try below line :
@code{.py}
hist,bins = np.histogram(img.ravel(),256,[0,256])
@endcode
hist is the same as we calculated before. But bins will have 257 elements, because Numpy calculates bins
as 0-0.99, 1-1.99, 2-2.99 etc. So the final range would be 255-255.99. To represent that, they also add
256 at the end of bins. But we don't need that 256; up to 255 is sufficient.

@note Numpy has another function, **np.bincount()**, which is much faster (around 10X) than
np.histogram(). So for one-dimensional histograms, you can better try that. Don't forget to set
minlength = 256 in np.bincount. For example, hist = np.bincount(img.ravel(),minlength=256)

@note The OpenCV function is faster (around 40X) than np.histogram(). So stick with the OpenCV
function; a quick timing sketch follows.
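A quick way to sanity-check those speed claims on your own machine (just a sketch; the exact ratios will vary):
@code{.py}
import timeit
import numpy as np
import cv2 as cv

img = cv.imread('home.jpg',0)

# time 100 runs of each approach
t_cv = timeit.timeit(lambda: cv.calcHist([img],[0],None,[256],[0,256]), number=100)
t_np = timeit.timeit(lambda: np.histogram(img.ravel(),256,[0,256]), number=100)
t_bc = timeit.timeit(lambda: np.bincount(img.ravel(),minlength=256), number=100)
print('calcHist:', t_cv, 'np.histogram:', t_np, 'np.bincount:', t_bc)
@endcode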
Now we should plot histograms, but how?
Plotting Histograms
-------------------
There are two ways for this,
-# Short Way : use Matplotlib plotting functions
-# Long Way : use OpenCV drawing functions
### 1. Using Matplotlib
Matplotlib comes with a histogram plotting function : matplotlib.pyplot.hist()
It directly finds the histogram and plots it. You need not use the calcHist() or np.histogram() function
to find the histogram. See the code below:
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('home.jpg',0)
plt.hist(img.ravel(),256,[0,256]); plt.show()
@endcode
You will get a plot as below :
![image](images/histogram_matplotlib.jpg)
Or you can use the normal plot of matplotlib, which would be good for a BGR plot. For that, you need to
find the histogram data first. Try the code below:
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('home.jpg')
color = ('b','g','r')
for i,col in enumerate(color):
histr = cv.calcHist([img],[i],None,[256],[0,256])
plt.plot(histr,color = col)
plt.xlim([0,256])
plt.show()
@endcode
Result:
![image](images/histogram_rgb_plot.jpg)
You can deduce from the above graph that blue has some high-value areas in the image (obviously it
should be due to the sky).
### 2. Using OpenCV
Well, here you adjust the values of the histogram along with its bin values to look like x,y
coordinates, so that you can draw it using the cv.line() or cv.polylines() functions to generate the same
image as above. This is already available with the OpenCV-Python2 official samples; check the
code at samples/python/hist.py. A minimal sketch of the idea is below.
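This is not the sample's exact code; the 256x100 canvas size and in-place normalization are my own choices for a compact illustration:
@code{.py}
import numpy as np
import cv2 as cv

img = cv.imread('home.jpg',0)
hist = cv.calcHist([img],[0],None,[256],[0,256])

h, w = 100, 256
canvas = np.zeros((h, w), np.uint8)

# scale the bin counts so the tallest bin spans the canvas height
cv.normalize(hist, hist, 0, h, cv.NORM_MINMAX)
for x, v in enumerate(hist.ravel()):
    cv.line(canvas, (x, h), (x, h - int(v)), 255)  # one vertical bar per bin

cv.imshow('histogram', canvas)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode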
Application of Mask
-------------------
We used cv.calcHist() to find the histogram of the full image. What if you want to find histograms
of some regions of an image? Just create a mask image with white color on the region where you want to
find the histogram and black otherwise, then pass this as the mask.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('home.jpg',0)
# create a mask
mask = np.zeros(img.shape[:2], np.uint8)
mask[100:300, 100:400] = 255
masked_img = cv.bitwise_and(img,img,mask = mask)
# Calculate histogram with mask and without mask
# Check third argument for mask
hist_full = cv.calcHist([img],[0],None,[256],[0,256])
hist_mask = cv.calcHist([img],[0],mask,[256],[0,256])
plt.subplot(221), plt.imshow(img, 'gray')
plt.subplot(222), plt.imshow(mask,'gray')
plt.subplot(223), plt.imshow(masked_img, 'gray')
plt.subplot(224), plt.plot(hist_full), plt.plot(hist_mask)
plt.xlim([0,256])
plt.show()
@endcode
See the result. In the histogram plot, the blue line shows the histogram of the full image while the green line
shows the histogram of the masked region.
![image](images/histogram_masking.jpg)
Additional Resources
--------------------
-# [Cambridge in Color website](http://www.cambridgeincolour.com/tutorials/histograms1.htm)
Exercises
---------


View File

@ -0,0 +1,153 @@
Histograms - 2: Histogram Equalization {#tutorial_py_histogram_equalization}
======================================
Goal
----
In this section,
- We will learn the concepts of histogram equalization and use it to improve the contrast of our
images.
Theory
------
Consider an image whose pixel values are confined to some specific range of values only. For example, a
brighter image will have all pixels confined to high values. But a good image will have pixels from
all regions of the image. So you need to stretch this histogram to either end (as given in the image below,
from Wikipedia), and that is what Histogram Equalization does (in simple words). This normally
improves the contrast of the image.
![image](images/histogram_equalization.png)
I would recommend you read the Wikipedia page on [Histogram
Equalization](http://en.wikipedia.org/wiki/Histogram_equalization) for more details about it. It has
a very good explanation with worked-out examples, so that you will understand almost everything
after reading it. Here, instead, we will see its Numpy implementation. After that, we will see the
OpenCV function.
@code{.py}
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('wiki.jpg',0)
hist,bins = np.histogram(img.flatten(),256,[0,256])
cdf = hist.cumsum()
cdf_normalized = cdf * float(hist.max()) / cdf.max()
plt.plot(cdf_normalized, color = 'b')
plt.hist(img.flatten(),256,[0,256], color = 'r')
plt.xlim([0,256])
plt.legend(('cdf','histogram'), loc = 'upper left')
plt.show()
@endcode
![image](images/histeq_numpy1.jpg)
You can see the histogram lies in the brighter region. We need the full spectrum. For that, we need a
transformation function which maps the input pixels in the brighter region to output pixels in the full
region. That is what histogram equalization does.
Now we find the minimum histogram value (excluding 0) and apply the histogram equalization equation
as given in the wiki page. But here I have used the masked array concept from Numpy. For a masked
array, all operations are performed on non-masked elements. You can read more about it in the Numpy
docs on masked arrays; a tiny illustration follows the snippet below.
@code{.py}
cdf_m = np.ma.masked_equal(cdf,0)
cdf_m = (cdf_m - cdf_m.min())*255/(cdf_m.max()-cdf_m.min())
cdf = np.ma.filled(cdf_m,0).astype('uint8')
@endcode
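To see what the masking buys us here, a tiny standalone illustration of masked-array behaviour (plain Numpy, independent of the image):
@code{.py}
import numpy as np

a = np.array([0., 2., 0., 5.])
m = np.ma.masked_equal(a, 0)    # hide the zeros
print(m.min())                  # 2.0 -- the masked zeros are ignored
print(np.ma.filled(m * 10, 0))  # [ 0. 20.  0. 50.] -- masked entries refilled with 0
@endcode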
Now we have the look-up table that tells us what the output pixel value is for
every input pixel value. So we just apply the transform.
@code{.py}
img2 = cdf[img]
@endcode
Now we calculate its histogram and cdf as before (you do it!) and the result looks like this:
![image](images/histeq_numpy2.jpg)
Another important feature is that, even if the image was a darker image (instead of the brighter one
we used), after equalization we will get almost the same image. As a result, this is used
as a "reference tool" to make all images have the same lighting conditions. This is useful in many
cases. For example, in face recognition, before training on the face data, the images of faces are
histogram equalized to make them all have the same lighting conditions.
Histograms Equalization in OpenCV
---------------------------------
OpenCV has a function to do this, **cv.equalizeHist()**. Its input is just a grayscale image and its
output is our histogram-equalized image.

Below is a simple code snippet showing its usage for the same image we used:
@code{.py}
import numpy as np
import cv2 as cv

img = cv.imread('wiki.jpg',0)
equ = cv.equalizeHist(img)
res = np.hstack((img,equ)) #stacking images side-by-side
cv.imwrite('res.png',res)
@endcode
![image](images/equalization_opencv.jpg)
So now you can take different images with different light conditions, equalize them and check the
results.

Histogram equalization is good when the histogram of the image is confined to a particular region. It
won't work well in places where there are large intensity variations, ie where the histogram covers a large
region and both bright and dark pixels are present. Please check the SOF links in Additional
Resources.
CLAHE (Contrast Limited Adaptive Histogram Equalization)
--------------------------------------------------------
The first histogram equalization we just saw considers the global contrast of the image. In many
cases, it is not a good idea. For example, the image below shows an input image and its result after
global histogram equalization.

![image](images/clahe_1.jpg)

It is true that the background contrast has improved after histogram equalization. But compare the
face of the statue in both images. We lost most of the information there due to over-brightness. This is
because its histogram is not confined to a particular region as it was in the previous cases (try to
plot the histogram of the input image; you will get more intuition).
So to solve this problem, **adaptive histogram equalization** is used. In this, the image is divided
into small blocks called "tiles" (tileSize is 8x8 by default in OpenCV). Then each of these blocks
is histogram equalized as usual. So in a small area, the histogram would be confined to a small region
(unless there is noise). If noise is there, it will be amplified. To avoid this, **contrast
limiting** is applied. If any histogram bin is above the specified contrast limit (by default 40 in
OpenCV), those pixels are clipped and distributed uniformly to the other bins before applying histogram
equalization. After equalization, to remove artifacts at the tile borders, bilinear interpolation is
applied.

The code snippet below shows how to apply CLAHE in OpenCV:
@code{.py}
import numpy as np
import cv2 as cv
img = cv.imread('tsukuba_l.png',0)
# create a CLAHE object (Arguments are optional).
clahe = cv.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
cl1 = clahe.apply(img)
cv.imwrite('clahe_2.jpg',cl1)
@endcode
See the result below and compare it with results above, especially the statue region:
![image](images/clahe_2.jpg)
Additional Resources
--------------------
-# Wikipedia page on [Histogram Equalization](http://en.wikipedia.org/wiki/Histogram_equalization)
-# [Masked Arrays in Numpy](http://docs.scipy.org/doc/numpy/reference/maskedarray.html)

Also check these SOF questions regarding contrast adjustment:

-# [How can I adjust contrast in OpenCV in
C?](http://stackoverflow.com/questions/10549245/how-can-i-adjust-contrast-in-opencv-in-c)
-# [How do I equalize contrast & brightness of images using
opencv?](http://stackoverflow.com/questions/10561222/how-do-i-equalize-contrast-brightness-of-images-using-opencv)
Exercises
---------

View File

@ -0,0 +1,18 @@
Histograms in OpenCV {#tutorial_py_table_of_contents_histograms}
====================
- @subpage tutorial_py_histogram_begins
Learn the basics of histograms
- @subpage tutorial_py_histogram_equalization
Learn to Equalize Histograms to get better contrast for images
- @subpage tutorial_py_2d_histogram
Learn to find and plot 2D Histograms
- @subpage tutorial_py_histogram_backprojection
Learn histogram backprojection to segment colored objects


View File

@ -0,0 +1,52 @@
Hough Circle Transform {#tutorial_py_houghcircles}
======================
Goal
----
In this chapter,
- We will learn to use Hough Transform to find circles in an image.
- We will see these functions: **cv.HoughCircles()**
Theory
------
A circle is represented mathematically as \f$(x-x_{center})^2 + (y - y_{center})^2 = r^2\f$, where
\f$(x_{center},y_{center})\f$ is the center of the circle, and \f$r\f$ is the radius of the circle. From the
equation, we can see we have 3 parameters, so we need a 3D accumulator for the hough transform, which
would be highly inefficient. So OpenCV uses a trickier method, the **Hough Gradient Method**, which
uses the gradient information of edges.
The function we use here is **cv.HoughCircles()**. It has plenty of arguments which are well
explained in the documentation. So we directly go to the code.
@code{.py}
import numpy as np
import cv2 as cv
img = cv.imread('opencv-logo-white.png',0)
img = cv.medianBlur(img,5)
cimg = cv.cvtColor(img,cv.COLOR_GRAY2BGR)
# dp = 1: the accumulator has the same resolution as the image; 20 is the minimum distance between centers.
# param1 is the higher threshold of the internal Canny detector; param2 is the accumulator threshold for centers.
circles = cv.HoughCircles(img,cv.HOUGH_GRADIENT,1,20,
                          param1=50,param2=30,minRadius=0,maxRadius=0)
circles = np.uint16(np.around(circles))
for i in circles[0,:]:
# draw the outer circle
cv.circle(cimg,(i[0],i[1]),i[2],(0,255,0),2)
# draw the center of the circle
cv.circle(cimg,(i[0],i[1]),2,(0,0,255),3)
cv.imshow('detected circles',cimg)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode
Result is shown below:
![image](images/houghcircles2.jpg)
Additional Resources
--------------------
Exercises
---------

View File

@ -0,0 +1,234 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
xmlns:osb="http://www.openswatchbook.org/uri/2009/osb"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="177.16013"
height="162.30061"
id="svg2"
version="1.1"
inkscape:version="0.48.4 r9939"
sodipodi:docname="houghline1.svg">
<defs
id="defs4">
<inkscape:path-effect
effect="skeletal"
id="path-effect8912"
is_visible="true"
pattern="M 0,0 1,0"
copytype="single_stretched"
prop_scale="1"
scale_y_rel="false"
spacing="0"
normal_offset="0"
tang_offset="0"
prop_units="false"
vertical_pattern="false"
fuse_tolerance="0" />
<marker
inkscape:stockid="Arrow1Send"
orient="auto"
refY="0"
refX="0"
id="Arrow1Send"
style="overflow:visible">
<path
id="path3774"
d="M 0,0 5,-5 -12.5,0 5,5 0,0 z"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt"
transform="matrix(-0.2,0,0,-0.2,-1.2,0)"
inkscape:connector-curvature="0" />
</marker>
<inkscape:path-effect
effect="skeletal"
id="path-effect7919"
is_visible="true"
pattern="m -1.0101525,-0.75761441 0.99999995,0"
copytype="single_stretched"
prop_scale="1"
scale_y_rel="false"
spacing="0"
normal_offset="0"
tang_offset="0"
prop_units="false"
vertical_pattern="false"
fuse_tolerance="0" />
<linearGradient
id="linearGradient7535"
osb:paint="solid">
<stop
style="stop-color:#000000;stop-opacity:1;"
offset="0"
id="stop7537" />
</linearGradient>
<inkscape:path-effect
effect="skeletal"
id="path-effect6087"
is_visible="true"
pattern="m -18.94036,-2.5253814 1,0"
copytype="single_stretched"
prop_scale="1"
scale_y_rel="false"
spacing="0"
normal_offset="0"
tang_offset="0"
prop_units="false"
vertical_pattern="false"
fuse_tolerance="0" />
<marker
inkscape:stockid="Arrow1Mstart"
orient="auto"
refY="0"
refX="0"
id="Arrow1Mstart"
style="overflow:visible">
<path
id="path3765"
d="M 0,0 5,-5 -12.5,0 5,5 0,0 z"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt"
transform="matrix(0.4,0,0,0.4,4,0)"
inkscape:connector-curvature="0" />
</marker>
<marker
inkscape:stockid="Arrow1Mend"
orient="auto"
refY="0"
refX="0"
id="Arrow1Mend"
style="overflow:visible">
<path
id="path3768"
d="M 0,0 5,-5 -12.5,0 5,5 0,0 z"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt"
transform="matrix(-0.4,0,0,-0.4,-4,0)"
inkscape:connector-curvature="0" />
</marker>
<marker
inkscape:stockid="Arrow1Lend"
orient="auto"
refY="0"
refX="0"
id="Arrow1Lend"
style="overflow:visible">
<path
id="path3762"
d="M 0,0 5,-5 -12.5,0 5,5 0,0 z"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt"
transform="matrix(-0.8,0,0,-0.8,-10,0)"
inkscape:connector-curvature="0" />
</marker>
<marker
inkscape:stockid="Arrow2Lstart"
orient="auto"
refY="0"
refX="0"
id="Arrow2Lstart"
style="overflow:visible">
<path
id="path3777"
style="fill-rule:evenodd;stroke-width:0.625;stroke-linejoin:round"
d="M 8.7185878,4.0337352 -2.2072895,0.01601326 8.7185884,-4.0017078 c -1.7454984,2.3720609 -1.7354408,5.6174519 -6e-7,8.035443 z"
transform="matrix(1.1,0,0,1.1,1.1,0)"
inkscape:connector-curvature="0" />
</marker>
</defs>
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="11.2"
inkscape:cx="52.657337"
inkscape:cy="137.18191"
inkscape:document-units="px"
inkscape:current-layer="layer1"
showgrid="false"
fit-margin-top="10"
fit-margin-left="10"
fit-margin-right="5"
fit-margin-bottom="5"
inkscape:window-width="1366"
inkscape:window-height="709"
inkscape:window-x="0"
inkscape:window-y="27"
inkscape:window-maximized="1" />
<metadata
id="metadata7">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title></dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-179.63266,-391.69653)">
<path
style="fill:none;stroke:#000000;stroke-width:0.7592535px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-start:url(#Arrow1Mstart);marker-end:url(#Arrow1Mend)"
d="m 350.84372,403.40485 -159.50274,0 0,144.64323"
id="path2985"
inkscape:connector-curvature="0" />
<path
style="fill:none;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
d="M 313.14729,416.97623 207.33381,522.78971"
id="path5321"
inkscape:connector-curvature="0" />
<path
style="fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:3, 1;stroke-dashoffset:0;marker-start:none;marker-end:none"
d="M 191.17137,403.33917 259.2304,471.3982"
id="path5323"
inkscape:connector-curvature="0" />
<path
style="fill:none;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
d="m 264.65997,465.46355 -7.95495,-7.95495 -5.93465,5.93465"
id="path5701"
inkscape:connector-curvature="0" />
<text
xml:space="preserve"
style="font-size:17.18746948px;font-style:normal;font-weight:normal;line-height:125%;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;font-family:Sans"
x="222.84058"
y="431.17072"
id="text8491"
sodipodi:linespacing="125%"
transform="scale(0.95341781,1.0488581)"><tspan
sodipodi:role="line"
id="tspan8493"
x="222.84058"
y="431.17072">ρ</tspan></text>
<text
xml:space="preserve"
style="font-size:11.58907604px;font-style:normal;font-weight:normal;line-height:125%;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;font-family:Sans"
x="241.60213"
y="356.61414"
id="text8495"
sodipodi:linespacing="125%"
transform="scale(0.8575127,1.1661635)"><tspan
sodipodi:role="line"
id="tspan8497"
x="241.60213"
y="356.61414">θ</tspan></text>
<path
style="fill:none;stroke:#000000;stroke-width:0.46696162;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;marker-end:url(#Arrow1Send)"
d="m 205.18489,403.44491 c -0.0814,1.73816 0.44467,3.57132 -0.24208,5.17013 -0.64958,1.51228 -2.24971,2.34519 -3.3891,3.53293"
id="path8910"
inkscape:path-effect="#path-effect8912"
inkscape:original-d="m 205.18489,403.44491 c -0.0807,1.72338 0.43883,3.58492 -0.24208,5.17013 -0.64405,1.49942 -2.2594,2.35529 -3.3891,3.53293"
inkscape:connector-curvature="0"
sodipodi:nodetypes="cac" />
</g>
</svg>



View File

@ -0,0 +1,108 @@
Hough Line Transform {#tutorial_py_houghlines}
====================
Goal
----
In this chapter,
- We will understand the concept of the Hough Transform.
- We will see how to use it to detect lines in an image.
- We will see the following functions: **cv.HoughLines()**, **cv.HoughLinesP()**
Theory
------
The Hough Transform is a popular technique to detect any shape, if you can represent that shape in a
mathematical form. It can detect the shape even if it is broken or distorted a little bit. We will
see how it works for a line.
A line can be represented as \f$y = mx+c\f$ or in a parametric form, as
\f$\rho = x \cos \theta + y \sin \theta\f$, where \f$\rho\f$ is the perpendicular distance from the origin to the
line, and \f$\theta\f$ is the angle formed by this perpendicular line and the horizontal axis, measured
counter-clockwise (the direction varies with how you represent the coordinate system; this
representation is used in OpenCV). Check the image below:
![image](images/houghlines1.svg)
So if the line is passing below the origin, it will have a positive rho and an angle less than 180. If it
is going above the origin, instead of taking an angle greater than 180, the angle is taken less than 180,
and rho is taken negative. Any vertical line will have 0 degrees and horizontal lines will have 90
degrees.
Now let's see how the Hough Transform works for lines. Any line can be represented in these two terms,
\f$(\rho, \theta)\f$. So first it creates a 2D array or accumulator (to hold the values of the two parameters),
and it is set to 0 initially. Let rows denote the \f$\rho\f$ and columns denote the \f$\theta\f$. The size of the
array depends on the accuracy you need. Suppose you want the accuracy of angles to be 1 degree: you will
need 180 columns. For \f$\rho\f$, the maximum distance possible is the diagonal length of the image. So,
taking one-pixel accuracy, the number of rows can be the diagonal length of the image.
Consider a 100x100 image with a horizontal line at the middle. Take the first point of the line. You
know its (x,y) values. Now in the line equation, put the values \f$\theta = 0,1,2,....,180\f$ and check
the \f$\rho\f$ you get. For every \f$(\rho, \theta)\f$ pair, you increment the value by one in the accumulator
in its corresponding \f$(\rho, \theta)\f$ cell. So now the cell (50,90) = 1 in the accumulator, along with
some other cells.

Now take the second point on the line. Do the same as above. Increment the values in the cells
corresponding to the \f$(\rho, \theta)\f$ you got. This time, the cell (50,90) = 2. What you actually
do is vote up the \f$(\rho, \theta)\f$ values. You continue this process for every point on the line. At
each point, the cell (50,90) will be incremented or voted up, while other cells may or may not be
voted up. This way, at the end, the cell (50,90) will have the maximum votes. So if you search the
accumulator for the maximum votes, you get the value (50,90), which says there is a line in this image
at a distance of 50 from the origin and at an angle of 90 degrees. It is well shown in the animation below (Image
Courtesy: [Amos Storkey](http://homepages.inf.ed.ac.uk/amos/hough.html) )
![](images/houghlinesdemo.gif)
This is how the hough transform works for lines. It is simple, and maybe you can implement it using
Numpy on your own; a sketch is given after the accumulator image below. Below is an image which shows the accumulator. Bright spots at some locations
denote they are the parameters of possible lines in the image. (Image courtesy: [Wikipedia](http://en.wikipedia.org/wiki/Hough_transform) )
![](images/houghlines2.jpg)
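If you do want to implement it yourself, here is a minimal (and deliberately unoptimized) Numpy sketch of the voting scheme described above; `edges` is assumed to be a binary edge image:
@code{.py}
import numpy as np

def hough_lines_acc(edges, theta_res=1):
    # rho can range over [-diag, diag]; indices are shifted by +diag so they are non-negative
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(0, 180, theta_res))
    acc = np.zeros((2*diag + 1, len(thetas)), np.uint64)

    ys, xs = np.nonzero(edges)                  # coordinates of edge pixels
    for x, y in zip(xs, ys):
        for j, t in enumerate(thetas):
            rho = int(round(x*np.cos(t) + y*np.sin(t)))
            acc[rho + diag, j] += 1             # one vote per (rho, theta) cell
    return acc, thetas

# The brightest cell gives the (rho, theta) of the strongest line:
# i, j = np.unravel_index(np.argmax(acc), acc.shape)
@endcode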
Hough Transform in OpenCV
=========================
Everything explained above is encapsulated in the OpenCV function **cv.HoughLines()**. It simply returns an array of
\f$(\rho, \theta)\f$ values, where \f$\rho\f$ is measured in pixels and \f$\theta\f$ is measured in radians. The first parameter,
the input image, should be a binary image, so apply thresholding or use canny edge detection before
applying the hough transform. The second and third parameters are the \f$\rho\f$ and \f$\theta\f$ accuracies
respectively. The fourth argument is the threshold, which means the minimum vote it should get to be
considered as a line. Remember, the number of votes depends upon the number of points on the line. So it
represents the minimum length of line that should be detected.
@include hough_line_transform.py
Check the results below:
![image](images/houghlines3.jpg)
Probabilistic Hough Transform
-----------------------------
In the hough transform, you can see that even for a line with two arguments, it takes a lot of
computation. The Probabilistic Hough Transform is an optimization of the Hough Transform we saw. It doesn't
take all the points into consideration. Instead, it takes only a random subset of points, which is
sufficient for line detection. We just have to decrease the threshold. See the image below, which compares
the Hough Transform and the Probabilistic Hough Transform in Hough space. (Image Courtesy :
[Franck Bettinger's home page](http://phdfb1.free.fr/robot/mscthesis/node14.html) )
![image](images/houghlines4.png)
The OpenCV implementation is based on Robust Detection of Lines Using the Progressive Probabilistic
Hough Transform by Matas, J. and Galambos, C. and Kittler, J.V. @cite Matas00. The function used is
**cv.HoughLinesP()**. It has two new arguments.
- **minLineLength** - Minimum length of line. Line segments shorter than this are rejected.
- **maxLineGap** - Maximum allowed gap between line segments to treat them as a single line.
The best thing is that it directly returns the two endpoints of the lines. In the previous case, you got only
the parameters of lines, and you had to find all the points. Here, everything is direct and simple;
a minimal sketch of the call is below, followed by the full sample.
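The file name 'sudoku.png' and the parameter values here are assumptions for illustration; the official sample is included right after:
@code{.py}
import numpy as np
import cv2 as cv

img = cv.imread('sudoku.png')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
edges = cv.Canny(gray, 50, 150, apertureSize=3)

lines = cv.HoughLinesP(edges, 1, np.pi/180, threshold=100,
                       minLineLength=100, maxLineGap=10)
for line in lines:
    x1, y1, x2, y2 = line[0]   # each entry already holds the two endpoints
    cv.line(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
@endcode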
@include probabilistic_hough_line_transform.py
See the results below:
![image](images/houghlines5.jpg)
Additional Resources
--------------------
-# [Hough Transform on Wikipedia](http://en.wikipedia.org/wiki/Hough_transform)
Exercises
---------

