The pooling primitive performs a forward or backward max or average pooling operation on 1D, 2D, or 3D spatial data.
The pooling operation is defined by the following formulas. The formulas are shown only for 2D spatial data; they generalize straightforwardly to higher and lower dimensions. Variable names follow the standard Naming Conventions.
Max pooling:
\[ \dst(n, c, oh, ow) = \max\limits_{kh, kw} \left( \src(n, c, oh \cdot SH + kh - PH_L, ow \cdot SW +kw - PW_L) \right) \]
Average pooling:
\[ \dst(n, c, oh, ow) = \frac{1}{DENOM} \sum\limits_{kh, kw} \src(n, c, oh \cdot SH + kh - PH_L, ow \cdot SW +kw - PW_L) \]
Here output spatial dimensions are calculated similarly to how they are done in Convolution.
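As a rough illustration of the output-size rule, the sketch below assumes the non-dilated convolution formula \(OH = \left\lfloor (IH - KH + PH_L + PH_R) / SH \right\rfloor + 1\). The right-side padding \(PH_R\) does not appear in the formulas above and is assumed here by analogy with \(PH_L\); the helper name `pool_out_dim` is hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cstdint>

// Output extent along one spatial axis for a pooling window without dilation.
// in:     input extent (for example IH)
// kernel: window extent (KH)
// stride: step between windows (SH)
// pad_l / pad_r: padding before and after the axis (PH_L and an assumed PH_R)
inline std::int64_t pool_out_dim(std::int64_t in, std::int64_t kernel,
        std::int64_t stride, std::int64_t pad_l, std::int64_t pad_r) {
    assert(kernel > 0 && stride > 0);
    const std::int64_t padded = in + pad_l + pad_r;
    assert(padded >= kernel); // the window must fit at least once
    // Both operands are non-negative, so integer division is a floor.
    return (padded - kernel) / stride + 1;
}
```

The same function applies independently to each spatial axis (W for 1D, H and W for 2D, D, H, and W for 3D).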
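Average pooling supports two algorithms:
- dnnl_pooling_avg_include_padding, in which case \(DENOM = KH \cdot KW\),
- dnnl_pooling_avg_exclude_padding, in which case \(DENOM\) equals the size of the overlap between the averaging window and the image.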
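The max and average formulas can be sketched as a naive reference loop over a single image plane (fixed \(n\) and \(c\)). This is an illustration only, not the library's implementation. The `Pool2D` struct, the `PoolAlg` enum, and `pool2d_forward` are hypothetical names. The sketch assumes that \(DENOM = KH \cdot KW\) when padding is included and that \(DENOM\) is the number of window elements inside the source when padding is excluded. It also assumes every window overlaps the source in at least one element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical parameter bundle for one image plane (fixed n and c).
struct Pool2D {
    int IH, IW;     // source height and width
    int KH, KW;     // kernel height and width
    int SH, SW;     // strides
    int PH_L, PW_L; // top and left padding
    int OH, OW;     // destination height and width
};

enum class PoolAlg { max, avg_include_padding, avg_exclude_padding };

// Forward pooling over a dense row-major IH x IW plane.
// Padded positions are skipped: they never win the max and add zero to the sum.
inline std::vector<float> pool2d_forward(
        const std::vector<float> &src, const Pool2D &p, PoolAlg alg) {
    assert(src.size() == static_cast<std::size_t>(p.IH) * p.IW);
    std::vector<float> dst(static_cast<std::size_t>(p.OH) * p.OW);
    for (int oh = 0; oh < p.OH; ++oh) {
        for (int ow = 0; ow < p.OW; ++ow) {
            float best = std::numeric_limits<float>::lowest();
            float sum = 0.f;
            int overlap = 0; // window elements that land inside the source
            for (int kh = 0; kh < p.KH; ++kh) {
                for (int kw = 0; kw < p.KW; ++kw) {
                    const int ih = oh * p.SH + kh - p.PH_L;
                    const int iw = ow * p.SW + kw - p.PW_L;
                    if (ih < 0 || ih >= p.IH || iw < 0 || iw >= p.IW) continue;
                    const float v
                            = src[static_cast<std::size_t>(ih) * p.IW + iw];
                    best = std::max(best, v);
                    sum += v;
                    ++overlap;
                }
            }
            assert(overlap > 0); // assumption stated in the lead-in
            float out = best;
            if (alg == PoolAlg::avg_include_padding)
                out = sum / static_cast<float>(p.KH * p.KW);
            else if (alg == PoolAlg::avg_exclude_padding)
                out = sum / static_cast<float>(overlap);
            dst[static_cast<std::size_t>(oh) * p.OW + ow] = out;
        }
    }
    return dst;
}
```

The two averaging variants differ only at windows that touch the padded border; for interior windows `overlap` equals `KH * KW` and both produce the same value.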
The backward propagation computes \(\diffsrc(n, c, h, w)\), based on \(\diffdst(n, c, h, w)\) and (in case of max pooling) workspace.

Execution Arguments

When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.

| Primitive input/output | Execution argument index |
|---|---|
| \(\src\) | DNNL_ARG_SRC |
| \(\dst\) | DNNL_ARG_DST |
| workspace | DNNL_ARG_WORKSPACE |
| \(\diffsrc\) | DNNL_ARG_DIFF_SRC |
| \(\diffdst\) | DNNL_ARG_DIFF_DST |
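To make the role of the workspace in backward max pooling more concrete, here is a minimal sketch for a single image plane. The actual workspace layout is an implementation detail of the library; this sketch assumes, purely for illustration, that the workspace stores for each output element the flat index of the source element that produced the maximum. The function name `maxpool_backward` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Backward max pooling for one plane.
// diff_dst:  gradient with respect to each output element (OH * OW values)
// workspace: for each output element, the flat index into the source plane
//            of the element that won the max (assumed layout, see lead-in)
// src_size:  number of elements in the source plane (IH * IW)
inline std::vector<float> maxpool_backward(const std::vector<float> &diff_dst,
        const std::vector<std::size_t> &workspace, std::size_t src_size) {
    assert(diff_dst.size() == workspace.size());
    std::vector<float> diff_src(src_size, 0.f);
    for (std::size_t o = 0; o < diff_dst.size(); ++o) {
        assert(workspace[o] < src_size);
        // Overlapping windows may select the same source element,
        // so gradients are accumulated rather than overwritten.
        diff_src[workspace[o]] += diff_dst[o];
    }
    return diff_src;
}
```

Source elements that never won a window keep a zero gradient, which is why the forward pass has to record the winning positions for later use.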
The pooling primitive supports the following combinations of data types:
| Propagation | Source / Destination | Accumulation data type (used for average pooling only) |
|---|---|---|
| forward / backward | f32, bf16 | f32 |
| forward | f16 | f16 |
| forward | s8, u8, s32 | s32 |
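The accumulation column matters because a sum over a window can exceed the range of the source type. The sketch below averages s8 values with an s32 accumulator, mirroring the last table row. The helper `avg_s8` is hypothetical, and its rounding (C++ integer division, truncating toward zero) is chosen for simplicity rather than taken from the library.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Average of s8 values using an s32 accumulator.
// A sum such as 4 * 127 = 508 does not fit in s8 but fits easily in s32.
inline std::int8_t avg_s8(const std::vector<std::int8_t> &window) {
    assert(!window.empty());
    std::int32_t acc = 0;
    for (std::int8_t v : window)
        acc += v; // widened to s32 before the addition
    // The mean of s8 values always lies within the s8 range,
    // so the narrowing cast back to s8 cannot overflow.
    return static_cast<std::int8_t>(
            acc / static_cast<std::int32_t>(window.size()));
}
```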
Like other CNN primitives, the pooling primitive expects data to be an \(N \times C \times W\) tensor for the 1D spatial case, an \(N \times C \times H \times W\) tensor for the 2D spatial case, and an \(N \times C \times D \times H \times W\) tensor for the 3D spatial case.
The pooling primitive is optimized for the following memory formats:
| Spatial | Logical tensor | Data type | Implementations optimized for memory formats |
|---|---|---|---|
| 1D | NCW | f32 | dnnl_ncw (dnnl_abc), dnnl_nwc (dnnl_acb), optimized^ |
| 1D | NCW | s32, s8, u8 | dnnl_nwc (dnnl_acb), optimized^ |
| 2D | NCHW | f32 | dnnl_nchw (dnnl_abcd), dnnl_nhwc (dnnl_acdb), optimized^ |
| 2D | NCHW | s32, s8, u8 | dnnl_nhwc (dnnl_acdb), optimized^ |
| 3D | NCDHW | f32 | dnnl_ncdhw (dnnl_abcde), dnnl_ndhwc (dnnl_acdeb), optimized^ |
| 3D | NCDHW | s32, s8, u8 | dnnl_ndhwc (dnnl_acdeb), optimized^ |
Here optimized^ means the format that comes out of any preceding compute-intensive primitive.
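The difference between the plain formats in the 2D rows can be illustrated by computing where a logical element \((n, c, h, w)\) lands in a dense buffer. This is a sketch assuming dense, unpadded buffers; the struct `Dims4` and both helper functions are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Logical dimensions of a 2D-spatial activation tensor.
struct Dims4 {
    std::size_t N, C, H, W;
};

// Offset of logical element (n, c, h, w) in a dense nchw (abcd) buffer:
// width is the innermost, fastest-varying dimension.
inline std::size_t offset_nchw(const Dims4 &d, std::size_t n, std::size_t c,
        std::size_t h, std::size_t w) {
    assert(n < d.N && c < d.C && h < d.H && w < d.W);
    return ((n * d.C + c) * d.H + h) * d.W + w;
}

// Offset of the same logical element in a dense nhwc (acdb) buffer:
// channels are the innermost, fastest-varying dimension.
inline std::size_t offset_nhwc(const Dims4 &d, std::size_t n, std::size_t c,
        std::size_t h, std::size_t w) {
    assert(n < d.N && c < d.C && h < d.H && w < d.W);
    return ((n * d.H + h) * d.W + w) * d.C + c;
}
```

The logical index order stays \((n, c, h, w)\) in both cases; only the physical placement in memory changes.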
The pooling primitive does not support any post-ops or attributes.
| Engine | Name | Comments |
|---|---|---|
| CPU/GPU | Pooling Primitive Example | This C++ API example demonstrates how to create and execute a Pooling primitive in forward training propagation mode. |