feat: switch backend to PaddleOCR-NCNN, switch project to CMake

1. Migrated the entire project backend to the PaddleOCR-NCNN algorithm; basic compatibility tests have passed.
2. The project is now organized with CMake; to better accommodate third-party libraries, a QMake project will no longer be provided.
3. Reorganized the rights/license declaration files and the code project to minimize the risk of infringement.

Log: switch backend to PaddleOCR-NCNN, switch project to CMake
Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
wangzhengyang
2022-05-10 09:54:44 +08:00
parent ecdd171c6f
commit 718c41634f
10018 changed files with 3593797 additions and 186748 deletions

@@ -0,0 +1,79 @@
This is a brief description of the training process that was used to obtain res10_300x300_ssd_iter_140000.caffemodel.
The model was created with the SSD framework using a ResNet-10-like architecture as the backbone. The channel count in the ResNet-10 convolution layers was reduced significantly (2x or 4x fewer channels).
The model was trained in the Caffe framework on a large dataset available online.
1. Prepare training tools
You need the "ssd" branch from this repository: https://github.com/weiliu89/caffe/tree/ssd . Check out this branch and build it (see the instructions in the repo's README).
2. Prepare training data.
The data preparation pipeline can be represented as:
(a) Download an original face detection dataset -> (b) Convert the annotations to the PASCAL VOC format -> (c) Create an LMDB database with images + annotations for training
a) Find datasets with face bounding-box annotations. For certain reasons I can't provide links here, but you can easily find them on your own. Also study the data: it may contain small or low-quality faces that can spoil the training process. Annotations often include special flags about object quality. Remove such faces from the annotation (smaller than 16 pixels along at least one side, blurred, highly occluded, and so on); a rough filtering sketch is shown below.
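The following is only an illustrative sketch of the kind of rule described above (the real filtering depends on your dataset's annotation format; the box dictionary keys are assumptions):

def keep_box(box, min_side=16):
    # box is assumed to look like {"xmin": ..., "ymin": ..., "xmax": ..., "ymax": ..., "blur": 0, "occlusion": 0}
    w = box["xmax"] - box["xmin"]
    h = box["ymax"] - box["ymin"]
    if w < min_side or h < min_side:   # too small
        return False
    if box.get("blur", 0) > 0:         # marked as blurred
        return False
    if box.get("occlusion", 0) > 1:    # marked as heavily occluded
        return False
    return True

def filter_boxes(boxes):
    # keep only the boxes that pass the quality checks above
    return [b for b in boxes if keep_box(b)]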
b) The downloaded dataset will come with annotations in some format: a single file for all images, a separate file per image, or something else. To train SSD in Caffe you need to convert the annotations to the PASCAL VOC format.
PASCAL VOC annotation consists of one .xml file per image. In this XML file all face bounding boxes should be listed as:
<annotation>
<size>
<width>300</width>
<height>300</height>
</size>
<object>
<name>face</name>
<difficult>0</difficult>
<bndbox>
<xmin>100</xmin>
<ymin>100</ymin>
<xmax>200</xmax>
<ymax>200</ymax>
</bndbox>
</object>
<object>
<name>face</name>
<difficult>0</difficult>
<bndbox>
<xmin>0</xmin>
<ymin>0</ymin>
<xmax>100</xmax>
<ymax>100</ymax>
</bndbox>
</object>
</annotation>
So, convert your dataset's annotation to the format above.
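One possible way to produce such files from Python is sketched below (this is not the original conversion tool; the in-memory box representation and the example paths are assumptions):

import xml.etree.ElementTree as ET

def write_voc_xml(xml_path, width, height, boxes):
    # boxes: list of (xmin, ymin, xmax, ymax) tuples for one image
    ann = ET.Element("annotation")
    size = ET.SubElement(ann, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    for xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(ann, "object")
        ET.SubElement(obj, "name").text = "face"
        ET.SubElement(obj, "difficult").text = "0"
        bnd = ET.SubElement(obj, "bndbox")
        ET.SubElement(bnd, "xmin").text = str(xmin)
        ET.SubElement(bnd, "ymin").text = str(ymin)
        ET.SubElement(bnd, "xmax").text = str(xmax)
        ET.SubElement(bnd, "ymax").text = str(ymax)
    ET.ElementTree(ann).write(xml_path)

# example: write_voc_xml("annotations_val/0.jpg.xml", 300, 300, [(100, 100, 200, 200), (0, 0, 100, 100)])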
Also, you should create a labelmap.prototxt file with the following content:
item {
name: "none_of_the_above"
label: 0
display_name: "background"
}
item {
name: "face"
label: 1
display_name: "face"
}
You need this file to establish the correspondence between class names and numeric class labels.
For the next step we also need a file in which all image/annotation file name pairs are listed. This file should contain lines like:
images_val/0.jpg annotations_val/0.jpg.xml
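A minimal sketch for generating such a list file (it assumes the images live in images_val/ and the per-image XML files in annotations_val/, as in the example line above):

import os

def write_list_file(image_dir, ann_dir, out_path):
    # writes "<image path> <annotation path>" pairs, one per line
    with open(out_path, "w") as f:
        for name in sorted(os.listdir(image_dir)):
            if name.endswith(".jpg"):
                f.write("%s/%s %s/%s.xml\n" % (image_dir, name, ann_dir, name))

# example: write_list_file("images_val", "annotations_val", "val.txt")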
c) To create the LMDB databases, use the create_data.sh tool from the caffe/data/VOC0712 directory of Caffe's source code.
This script calls create_annoset.py internally, so check what you need to pass as the script's arguments.
You need to prepare two LMDB databases: one for the training images and one for the validation images.
3. Train your detector
For training you need three files: train.prototxt, test.prototxt and solver.prototxt. You can find these files in the same directory as this readme.
You also need to edit train.prototxt and test.prototxt to replace the LMDB database paths with the actual databases you created in step 2.
Now everything is ready to launch the training process.
Execute the following lines in a terminal:
mkdir -p snapshot
mkdir -p log
/path_for_caffe_build_dir/tools/caffe train -solver="solver.prototxt" -gpu 0 2>&1 | tee -a log/log.log
And wait. It will take about 8 hours to finish the process.
Afterwards you can use the .caffemodel from the snapshot/ subdirectory in the resnet_face_ssd_python.py sample.
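For reference, here is a minimal OpenCV DNN sketch for running the trained model (this is not the resnet_face_ssd_python.py sample itself; the deploy.prototxt path, the test image name and the 0.5 confidence threshold are placeholder assumptions):

import cv2

net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "snapshot/res10_300x300_ssd_iter_140000.caffemodel")
img = cv2.imread("test.jpg")
h, w = img.shape[:2]
# mean values (104, 177, 123) are the ones commonly used with this face detector
blob = cv2.dnn.blobFromImage(cv2.resize(img, (300, 300)), 1.0, (300, 300), (104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()  # shape [1, 1, N, 7]: (_, class_id, confidence, x1, y1, x2, y2), coordinates normalized to [0, 1]
for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.5:
        x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * [w, h, w, h]).astype(int).tolist()
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.imwrite("out.jpg", img)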

@@ -0,0 +1,28 @@
train_net: "train.prototxt"
test_net: "test.prototxt"
test_iter: 2312
test_interval: 5000
test_initialization: true
base_lr: 0.01
display: 10
lr_policy: "multistep"
max_iter: 140000
stepvalue: 80000
stepvalue: 120000
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
average_loss: 500
iter_size: 1
type: "SGD"
solver_mode: GPU
random_seed: 0
debug_info: false
snapshot: 1000
snapshot_prefix: "snapshot/res10_300x300_ssd"
eval_type: "detection"
ap_version: "11point"

@@ -0,0 +1,13 @@
<?xml version="1.0" encoding="UTF-8"?>
<metalink xmlns="urn:ietf:params:xml:ns:metalink">
<file name="res10_300x300_ssd_iter_140000_fp16.caffemodel">
<identity>opencv_face_detector_fp16</identity>
<hash type="sha-1">31fc22bfdd907567a04bb45b7cfad29966caddc1</hash>
<url>https://raw.githubusercontent.com/opencv/opencv_3rdparty/dnn_samples_face_detector_20180205_fp16/res10_300x300_ssd_iter_140000_fp16.caffemodel</url>
</file>
<file name="opencv_face_detector_uint8.pb">
<identity>opencv_face_detector_uint8</identity>
<hash type="sha-1">4f2fdf6f231d759d7bbdb94353c5a68690f3d2ae</hash>
<url>https://raw.githubusercontent.com/opencv/opencv_3rdparty/dnn_samples_face_detector_20180220_uint8/opencv_face_detector_uint8.pb</url>
</file>
</metalink>