deepin-ocr/3rdparty/ncnn/docs/developer-guide/how-to-write-a-neon-optimized-op-kernel.md
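wangzhengyang 718c41634f feat: switch the backend to PaddleOCR-NCNN and the project to CMake
1. The project backend has been fully migrated to the PaddleOCR-NCNN algorithm and has passed basic compatibility testing
2. The project is now organized with CMake; to stay compatible with third-party libraries, a QMake project is no longer provided
3. Reorganized the rights notice files and the code project to minimize infringement risk

Log: switch the backend to PaddleOCR-NCNN and the project to CMake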
Change-Id: I4d5d2c5d37505a4a24b389b1a4c5d12f17bfa38c
2022-05-10 10:22:11 +08:00


benchmark

op

naive C with openmp

for for for
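
The "for for for" note can be read as a plain triple loop over channels, rows, and columns. A minimal sketch follows, assuming a hypothetical per-channel scale-and-bias op on a planar `c x h x w` float buffer. The op, the function name `scale_bias_naive`, and the memory layout are illustrative assumptions and are not taken from the ncnn sources.

```cpp
#include <cassert>

// Hypothetical op: out[q][y][x] = in[q][y][x] * scale[q] + bias[q].
// Assumed planar layout: channel q starts at offset q * h * w.
void scale_bias_naive(const float* in, float* out, int c, int h, int w,
                      const float* scale, const float* bias)
{
    assert(c >= 0 && h >= 0 && w >= 0);

    // Channels are independent, so the outermost loop is the parallel one.
    // Without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int q = 0; q < c; q++)
    {
        const float* ptr = in + q * h * w;
        float* outptr = out + q * h * w;
        const float s = scale[q];
        const float b = bias[q];

        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                outptr[y * w + x] = ptr[y * w + x] * s + b;
            }
        }
    }
}
```

This version is mainly useful as a correctness reference: every later variant can be checked against its output.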

unroll, first try

h
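
Reading the `h` note as "unroll along the height axis" (an interpretation, since the outline does not spell it out), a first unrolling attempt might process two rows per iteration. The sketch below reuses the hypothetical scale-and-bias op on a single `h x w` plane; the function name `scale_bias_unroll_h` is an assumption.

```cpp
#include <cassert>

// Hypothetical single-channel kernel: out = in * s + b over an h x w plane.
void scale_bias_unroll_h(const float* in, float* out, int h, int w, float s, float b)
{
    assert(h >= 0 && w >= 0);

    int y = 0;
    // Main loop: two rows per iteration, each with its own pointer pair.
    for (; y + 1 < h; y += 2)
    {
        const float* r0 = in + y * w;
        const float* r1 = r0 + w;
        float* o0 = out + y * w;
        float* o1 = o0 + w;

        for (int x = 0; x < w; x++)
        {
            o0[x] = r0[x] * s + b;
            o1[x] = r1[x] * s + b;
        }
    }
    // Leftover row when h is odd.
    for (; y < h; y++)
    {
        const float* r0 = in + y * w;
        float* o0 = out + y * w;
        for (int x = 0; x < w; x++)
        {
            o0[x] = r0[x] * s + b;
        }
    }
}
```

For a purely elementwise op the gain from this is probably small. Row unrolling tends to pay off when the rows share loaded data, for example kernel weights in a convolution.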

register allocation

kernels

unroll, second try

simd
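
One way to read this step is unrolling the innermost loop by four, so that each iteration matches the four float lanes of a 128-bit NEON register. A scalar sketch under that assumption, again using the hypothetical scale-and-bias op (the name `scale_bias_unroll4` is illustrative):

```cpp
#include <cassert>

// Hypothetical kernel: out[i] = in[i] * s + b, four elements per iteration.
void scale_bias_unroll4(const float* in, float* out, int size, float s, float b)
{
    assert(size >= 0);

    int nn = size >> 2;            // number of full groups of four
    int remain = size - (nn << 2); // 0..3 leftover elements

    for (; nn > 0; nn--)
    {
        // Four independent statements: one per future SIMD lane.
        out[0] = in[0] * s + b;
        out[1] = in[1] * s + b;
        out[2] = in[2] * s + b;
        out[3] = in[3] * s + b;
        in += 4;
        out += 4;
    }

    for (; remain > 0; remain--)
    {
        *out++ = *in++ * s + b;
    }
}
```

The `nn` / `remain` split keeps the main loop free of bounds checks and leaves at most three elements for the scalar tail.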

neon intrinsics

optional
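
As a sketch of what an intrinsics version could look like, the block below maps the four-wide loop onto `vld1q_f32`, `vmlaq_f32`, and `vst1q_f32` from `<arm_neon.h>`. The op and the name `scale_bias_neon` are assumptions carried over for illustration. The NEON path is guarded by `__ARM_NEON`, so the same file still builds with the scalar loop on other targets.

```cpp
#include <cassert>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical kernel: out[i] = in[i] * s + b.
void scale_bias_neon(const float* in, float* out, int size, float s, float b)
{
    assert(size >= 0);

    int i = 0;
#if defined(__ARM_NEON)
    const float32x4_t _s = vdupq_n_f32(s); // broadcast scale to all 4 lanes
    const float32x4_t _b = vdupq_n_f32(b); // broadcast bias to all 4 lanes
    for (; i + 3 < size; i += 4)
    {
        float32x4_t _p = vld1q_f32(in + i);
        _p = vmlaq_f32(_b, _p, _s); // _b + _p * _s, lane by lane
        vst1q_f32(out + i, _p);
    }
#endif
    // Scalar tail; also the whole loop when NEON is unavailable.
    for (; i < size; i++)
    {
        out[i] = in[i] * s + b;
    }
}
```

Intrinsics leave register allocation and instruction scheduling to the compiler, which may be why the outline marks this step as optional before moving on to hand-written assembly.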

naive neon assembly with pld

asm
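
A naive inline-assembly loop might look like the sketch below. It assumes an AArch64 target with GCC or Clang extended asm, where the prefetch hint is spelled `prfm` rather than the ARMv7 `pld` named in the heading. The op and the name `scale_bias_asm` are hypothetical; on any other target the guarded block is skipped and the scalar loop handles everything.

```cpp
#include <cassert>
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical kernel: out[i] = in[i] * s + b.
void scale_bias_asm(const float* in, float* out, int size, float s, float b)
{
    assert(size >= 0);

    int remain = size;
#if defined(__aarch64__) && defined(__ARM_NEON)
    int nn = size >> 2;        // full groups of four floats
    remain = size - (nn << 2); // elements left for the scalar tail
    if (nn > 0)                // the asm loop body runs at least once
    {
        const float32x4_t _s = vdupq_n_f32(s);
        const float32x4_t _b = vdupq_n_f32(b);
        asm volatile(
            "0:                             \n"
            "prfm   pldl1keep, [%1, #128]   \n" // prefetch hint ahead of the load
            "ld1    {v0.4s}, [%1], #16      \n" // load 4 floats, advance in
            "mov    v1.16b, %4.16b          \n" // accumulator = bias
            "fmla   v1.4s, v0.4s, %3.4s     \n" // accumulator += in * scale
            "st1    {v1.4s}, [%2], #16      \n" // store 4 floats, advance out
            "subs   %w0, %w0, #1            \n" // nn--
            "bne    0b                      \n" // loop while nn != 0
            : "+r"(nn), "+r"(in), "+r"(out)
            : "w"(_s), "w"(_b)
            : "cc", "memory", "v0", "v1");
    }
#endif
    // Scalar tail; also the whole loop on non-AArch64 targets.
    for (; remain > 0; remain--)
    {
        *out++ = *in++ * s + b;
    }
}
```

Each instruction here depends on the one before it, so the loop is a starting point rather than a tuned kernel.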

pipeline optimize, first try

more register load mla

pipeline optimize, second try

interleave load mla

pipeline optimize, third try

loop tail

usual practice, load/save

233

usual practice, unroll

233

usual practice, save register

233