120 lines
2.9 KiB
Markdown
120 lines
2.9 KiB
Markdown
|
### what is packing and why
|
||
|
|
||
|
packing is the form of storing multiple short-sized values as one long-sized value.
|
||
|
|
||
|
element packing is well mapped with the underlying simd register, which usually use one very wide register to store different types of values.
|
||
|
|
||
|
|C|elemsize|elempack|
|
||
|
|---|---|---|
|
||
|
|double|8|1|
|
||
|
|float|4|1|
|
||
|
|int|4|1|
|
||
|
|short|2|1|
|
||
|
|signed char|1|1|
|
||
|
|
||
|
|arm neon|elemsize|elempack|
|
||
|
|---|---|---|
|
||
|
|float64x2_t|16|2|
|
||
|
|float32x4_t|16|4|
|
||
|
|int32x4_t|16|4|
|
||
|
|float16x4_t|8|4|
|
||
|
|int8x8_t|8|8|
|
||
|
|
||
|
Though the real count of values doubles when elempack is two, the wide-sized value is still treated as one value in the view of Mat structure. For example, we want to store 40 float values in Mat object, if elempack 1 is used, Mat width is then 40, while 10 if elempack 4 is used.
|
||
|
|
||
|
|dims|w|h|c|cstep|elemsize|elempack|
|
||
|
|---|---|---|---|---|---|---|
|
||
|
|1|40|1|1|40|4|1|
|
||
|
|1|10|1|1|10|16|4|
|
||
|
|
||
|
### packing style convention
|
||
|
|
||
|
In practice, elempack 1, 4, 8 are the most common cases. It is possible to use any other packing style in theory.
|
||
|
|
||
|
The following table show the packing axis used in ncnn for different dimension.
|
||
|
|
||
|
|dims|packing axis|shape before packing|shape after packing|
|
||
|
|---|---|---|---|
|
||
|
|1|w|w|w/elempack|
|
||
|
|2|h|w, h|w, h/elempack|
|
||
|
|3|c|w, h, c|w, h, c/elempack|
|
||
|
|
||
|
If the packing axis dim is not evenly divisible by elempack, zero padding may be used.
|
||
|
|
||
|
```
|
||
|
outw = (w + elempack - 1) / elempack;
|
||
|
```
|
||
|
|
||
|
The following snippet shows the memory layout after elempack=4 on 3-dim Mat
|
||
|
|
||
|
```
|
||
|
// w=2 h=3 c=4 elempack=1
|
||
|
0 1
|
||
|
2 3
|
||
|
4 5
|
||
|
|
||
|
6 7
|
||
|
8 9
|
||
|
10 11
|
||
|
|
||
|
12 13
|
||
|
14 15
|
||
|
16 17
|
||
|
|
||
|
18 19
|
||
|
20 21
|
||
|
22 23
|
||
|
|
||
|
// w=2 h=3 c=1 elempack=4
|
||
|
(0,6,12,18) (1,7,13,19)
|
||
|
(2,8,14,20) (3,9,15,21)
|
||
|
(4,10,16,22) (5,11,17,23)
|
||
|
```
|
||
|
|
||
|
### how to convert elempack
|
||
|
|
||
|
There is a convenient wrapper function provided
|
||
|
```
|
||
|
// convert to elempack 4 if packing axis dim is evenly divisible by elempack
|
||
|
// return the identity Mat otherwise
|
||
|
ncnn::Mat a;
|
||
|
ncnn::Mat a_packed;
|
||
|
ncnn::convert_packing(a, a_packed, 4);
|
||
|
if (a_packed.elempack == 4)
|
||
|
{
|
||
|
// check if packing is successful
|
||
|
}
|
||
|
|
||
|
// convert to packing 1, aka unpacking, shall be always successful
|
||
|
ncnn::Mat b;
|
||
|
ncnn::Mat b_unpacked;
|
||
|
ncnn::convert_packing(b, b_unpacked, 1);
|
||
|
```
|
||
|
|
||
|
### handle general interleaved data
|
||
|
|
||
|
Here is an example of using convert packing to convert RGB interleaved data to planar
|
||
|
|
||
|
**NOTE:** The following code is just presented to explain what packing is and the conversion process. Do not use it in production due to its poor performance. Do use ncnn::Mat::from_pixels()
|
||
|
|
||
|
```cpp
|
||
|
// rgb_interleaved_u8 is RGB RGB RGB ...
|
||
|
// rgb_interleaved_u8.w = w;
|
||
|
// rgb_interleaved_u8.h = h;
|
||
|
// rgb_interleaved_u8.c = 1;
|
||
|
// rgb_interleaved_u8.elemsize = 3;
|
||
|
// rgb_interleaved_u8.elempack = 3;
|
||
|
|
||
|
ncnn::Mat rgb_interleaved_u8(w, h, 1, 3, 3);
|
||
|
ncnn::Mat rgb_planar_u8;
|
||
|
|
||
|
ncnn::convert_packing(rgb_interleaved_u8, rgb_planar_u8, 1);
|
||
|
|
||
|
// rgb_planar_u8 is now RRR ... GGG ... BBB ...
|
||
|
// rgb_planar_u8.w = w;
|
||
|
// rgb_planar_u8.h = h;
|
||
|
// rgb_planar_u8.c = 3;
|
||
|
// rgb_planar_u8.elemsize = 1;
|
||
|
// rgb_planar_u8.elempack = 1;
|
||
|
```
|