39 lines
445 B
Markdown
39 lines
445 B
Markdown
|
# benchmark
|
||
|
op
|
||
|
|
||
|
# naive C with openmp
|
||
|
for for for
|
||
|
|
||
|
# unroll, first try
|
||
|
h
|
||
|
|
||
|
# register allocation
|
||
|
kernels
|
||
|
|
||
|
# unroll, second try
|
||
|
simd
|
||
|
|
||
|
# neon intrinsics
|
||
|
optional
|
||
|
|
||
|
# naive neon assembly with pld
|
||
|
asm
|
||
|
|
||
|
# pipeline optimize, first try
|
||
|
more register load mla
|
||
|
|
||
|
# pipeline optimize, second try
|
||
|
interleave load mla
|
||
|
|
||
|
# pipeline optimize, third try
|
||
|
loop tail
|
||
|
|
||
|
# usual practice, load/save
|
||
|
233
|
||
|
|
||
|
# usual practice, unroll
|
||
|
233
|
||
|
|
||
|
# usual practice, save register
|
||
|
233
|