PyTorch: Converting AdaptivePooling Operations to Regular Pooling Operations

Published by JDD菜老師 on 2022-03-03


Most forward-inference frameworks do not support the AdaptivePooling operation, so it has to be converted into a regular Pooling operation. The article "AdaptivePooling與Max/AvgPooling相互轉換" ("Converting between AdaptivePooling and Max/AvgPooling") offers one conversion method, but in my tests on PyTorch 1.6 it produced incorrect results. By reading the PyTorch source (pytorch-master\aten\src\ATen\native\AdaptiveAveragePooling.cpp) I worked out the correct conversion.

  inline int start_index(int a, int b, int c) {
    return (int)std::floor((float)(a * c) / b);
  }

  inline int end_index(int a, int b, int c) {
    return (int)std::ceil((float)((a + 1) * c) / b);
  }

  template <typename scalar_t>
  static void adaptive_avg_pool2d_single_out_frame(
            scalar_t *input_p,
            scalar_t *output_p,
            int64_t sizeD,
            int64_t isizeH,
            int64_t isizeW,
            int64_t osizeH,
            int64_t osizeW,
            int64_t istrideD,
            int64_t istrideH,
            int64_t istrideW)
  {
    at::parallel_for(0, sizeD, 0, [&](int64_t start, int64_t end) {
      for (auto d = start; d < end; d++)
      {
        /* loop over output */
        int64_t oh, ow;
        for(oh = 0; oh < osizeH; oh++)
        {
          int istartH = start_index(oh, osizeH, isizeH);
          int iendH   = end_index(oh, osizeH, isizeH);
          int kH = iendH - istartH;

          for(ow = 0; ow < osizeW; ow++)
          {
            int istartW = start_index(ow, osizeW, isizeW);
            int iendW   = end_index(ow, osizeW, isizeW);
            int kW = iendW - istartW;

            /* local pointers */
            scalar_t *ip = input_p   + d*istrideD + istartH*istrideH + istartW*istrideW;
            scalar_t *op = output_p  + d*osizeH*osizeW + oh*osizeW + ow;

            /* compute local average: */
            scalar_t sum = 0;
            int ih, iw;
            for(ih = 0; ih < kH; ih++)
            {
              for(iw = 0; iw < kW; iw++)
              {
                scalar_t val = *(ip + ih*istrideH + iw*istrideW);
                sum += val;
              }
            }

            /* set output to local average */
            *op = sum / kW / kH;
          }
        }
      }
    });
  }

In the code above, isizeH and isizeW are the height and width of the input tensor, while osizeH and osizeW are the output height and width. Focus on the body of the output loop for(oh = 0; oh < osizeH; oh++){...}. Suppose the input height and width are both 223 (isizeH = isizeW = 223) and the output height and width are both 7 (osizeH = osizeW = 7), and consider the cases oh = 0, 1, 2:

  • oh=0, istartH = 0, iendH = ceil(223/7)=32, kH = 32
  • oh=1, istartH = floor(223/7) = 31, iendH = ceil(223*2/7)=64, kH = 33
  • oh=2, istartH = floor(223*2/7) = 63, iendH = ceil(223*3/7)=96, kH = 33
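The window indices above can be reproduced with a short Python sketch of the start_index/end_index helpers (my own re-implementation for illustration, not a PyTorch API):

```python
import math

def start_index(oh, osize, isize):
    # floor(oh * isize / osize), as in the C++ helper above
    return (oh * isize) // osize

def end_index(oh, osize, isize):
    # ceil((oh + 1) * isize / osize), as in the C++ helper above
    return math.ceil((oh + 1) * isize / osize)

isize, osize = 223, 7
for oh in range(3):
    istart, iend = start_index(oh, osize, isize), end_index(oh, osize, isize)
    print(f"oh={oh}, istartH={istart}, iendH={iend}, kH={iend - istart}")
```

Only the first window is narrower (kH = 32); every later window has kH = 33.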

Here kH is exactly the kernel_size. At oh=0 the kernel_size is smaller than in the other cases, so padding must be added to the input to make the oh=0 window the same size as the rest. The padding needed is 1, which is equivalent to letting istartH start at -1, so that kH = 32-(-1) = 33. The next parameter to recover is the stride: stride = istartH[oh=i]-istartH[oh=i-1], which is 32 in the example above (63-31 at oh=2; the first difference is 31 only because the padding shifts the initial window). Repeating this analysis for an input of height/width 224 shows padding=0, so padding is also a parameter that must be computed per input size. The conversion formulas for the three parameters are:

  • stride = ceil(input_size / output_size)
  • kernel_size = ceil(2 * input_size / output_size) - floor(input_size / output_size)
  • padding = ceil(input_size / output_size) - floor(input_size / output_size)
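As a quick sanity check, the three formulas can be evaluated for both input sizes discussed above (a minimal sketch; the function name is mine):

```python
import math

def pool_params(input_size, output_size):
    # conversion formulas: stride, kernel_size, padding
    stride = math.ceil(input_size / output_size)
    kernel_size = math.ceil(2 * input_size / output_size) - math.floor(input_size / output_size)
    padding = math.ceil(input_size / output_size) - math.floor(input_size / output_size)
    return stride, kernel_size, padding

print(pool_params(223, 7))  # (32, 33, 1): uneven division needs padding
print(pool_params(224, 7))  # (32, 32, 0): 224 divides evenly by 7
```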

In the last part of the code above, the average is computed as *op = sum / kW / kH. This shows that padding is not counted when averaging at the borders, so count_include_pad in the corresponding AvgPool must be set to False. Here is my test code:

def test(size):
    import numpy as np
    import torch
    import torch.nn as nn

    x = torch.randn(1, 1, size, size)

    input_size = np.array(x.shape[2:])
    output_size = np.array([7, 7])

    # stride = ceil(input_size / output_size)
    # kernel_size = ceil(2 * input_size / output_size) - floor(input_size / output_size)
    # padding = ceil(input_size / output_size) - floor(input_size / output_size)

    stride = np.ceil(input_size / output_size).astype(int)
    kernel_size = (np.ceil(2 * input_size / output_size) - np.floor(input_size / output_size)).astype(int)
    padding = (np.ceil(input_size / output_size) - np.floor(input_size / output_size)).astype(int)
    print(stride)
    print(kernel_size)
    print(padding)

    avg1 = nn.AdaptiveAvgPool2d(list(output_size))
    avg2 = nn.AvgPool2d(kernel_size=kernel_size.tolist(), stride=stride.tolist(), padding=padding.tolist(), ceil_mode=False, count_include_pad=False)
    max1 = nn.AdaptiveMaxPool2d(list(output_size))
    max2 = nn.MaxPool2d(kernel_size=kernel_size.tolist(), stride=stride.tolist(), padding=padding.tolist(), ceil_mode=False)

    avg1_out = avg1(x)
    avg2_out = avg2(x)
    max1_out = max1(x)
    max2_out = max2(x)
    print(avg1_out - avg2_out)
    print(max1_out - max2_out)
    print(torch.__version__)

  • Output when inH = inW = 224

(image: test output)

  • Output when inH = inW = 223

(image: test output)
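To close, the effect of count_include_pad can be seen in isolation with a minimal sketch (the tensor values are chosen only to make the border averages obvious):

```python
import torch
import torch.nn as nn

x = torch.ones(1, 1, 4, 4)
pool_incl = nn.AvgPool2d(kernel_size=3, stride=2, padding=1, count_include_pad=True)
pool_excl = nn.AvgPool2d(kernel_size=3, stride=2, padding=1, count_include_pad=False)

# The top-left window covers 2x2 real elements plus 5 padded zeros.
print(pool_incl(x)[0, 0, 0, 0].item())  # 4/9: padding zeros are counted in the divisor
print(pool_excl(x)[0, 0, 0, 0].item())  # 1.0: only the 4 real elements are counted
```

With count_include_pad=False the divisor matches the sum / kW / kH computation in the PyTorch source, which is why the converted AvgPool can reproduce AdaptiveAvgPool at the borders.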
