H.265 CTU, CU, PU, and TU Partitioning: Characteristics and Constraints

程序员文章站 2022-03-17 14:53:09

In what follows, "size" always refers to the luma block size; min_cu_size is 8 by default.

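Sizes and partition modes

CTU size: 16x16, 32x32, 64x64
CU size: 8x8, 16x16, 32x32, 64x64
PU size
Let the CU size be MxM.
Intra PU: MxM, or M/2 x M/2 (only when the CU size has reached min_cu_size)
Inter PU: eight partition modes are supported
[Figure: the eight inter PU partition modes; image missing from the source]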

Inter PU restrictions

  • The splitting into four PBs is allowed only when the CB size is equal
    to the minimum allowed CB size. (So only a minimum-size CB can be
    split into four PBs; as the part_mode restrictions quoted later show,
    the CB must also be larger than 8x8.)
  • The lower four partition types are only allowed when M is 16 or
    larger for luma. (spec: If log2CbSize is greater than MinCbLog2SizeY
    and amp_enabled_flag is equal to 1)
  • PBs of luma size 4×4 are not allowed for interpicture prediction, and
    PBs of luma sizes 4×8 and 8×4 are restricted to unipredictive
    coding.
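
The third restriction can be written as two small predicates. This is only a sketch of the rule as stated above; the function names are hypothetical and do not come from the spec, HM, or FFmpeg.

```c
#include <assert.h>
#include <stdbool.h>

/* True if an inter luma PB of this size is permitted at all
 * (4x4 is the only size ruled out here). */
bool inter_pb_size_allowed(int width, int height)
{
    assert(width >= 4 && height >= 4);
    return !(width == 4 && height == 4);
}

/* True if a permitted inter luma PB of this size may use bi-prediction
 * (4x8 and 8x4 are restricted to uni-prediction). */
bool inter_pb_bipred_allowed(int width, int height)
{
    assert(inter_pb_size_allowed(width, height));
    bool is_4x8_or_8x4 = (width == 4 && height == 8) ||
                         (width == 8 && height == 4);
    return !is_4x8_or_8x4;
}
```

For example, an 8x4 PB passes the first check but fails the second, while an 8x8 PB passes both.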

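TU size
4x4, 8x8, 16x16, 32x32
When a CB is larger than the maximum TB size, a further split is implied. When splitting further would produce a TB smaller than the minimum TB size, no split is implied. Within those bounds, and while the current depth is below the maximum split depth, the encoder is free to split or not. The maximum split depth is written by the encoder in the SPS, and its allowed range is [0, CtbLog2SizeY − MinTbLog2SizeY].
In addition, depending on how the inter/intra PUs are partitioned, the TU has a default (inferred) split, as recorded in the notes below. A TU must be equal to or smaller than an intra PU, but it may straddle inter PU boundaries.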

Frequently asked questions

1. Where does the spec specify the CTU size?

In "Annex A: Profiles, tiers and levels", for example:
Main Profile
CtbLog2SizeY derived according to active SPSs for the base layer shall be in the range of 4 to 6, inclusive.

2. Where does the spec specify the TU size?

7.4.3.2.1 General sequence parameter set RBSP semantics
log2_diff_max_min_luma_transform_block_size
The variable MaxTbLog2SizeY is set equal to log2_min_luma_transform_block_size_minus2 + 2 + log2_diff_max_min_luma_transform_block_size.
The CVS shall not contain data that result in MaxTbLog2SizeY greater than Min( CtbLog2SizeY, 5 ).
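
The derivation and the two range constraints (the MaxTbLog2SizeY bound quoted here and the [0, CtbLog2SizeY − MinTbLog2SizeY] range for the maximum split depth mentioned earlier) can be sketched as follows. The struct and function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the few SPS-derived values used here. */
typedef struct {
    int ctb_log2_size_y;     /* CtbLog2SizeY */
    int min_tb_log2_size_y;  /* log2_min_luma_transform_block_size_minus2 + 2 */
    int max_tb_log2_size_y;  /* MaxTbLog2SizeY */
} TbLimits;

/* Derive the TB size limits from the two SPS syntax elements. */
TbLimits derive_tb_limits(int ctb_log2_size_y,
                          int log2_min_tb_minus2,
                          int log2_diff_max_min_tb)
{
    assert(log2_min_tb_minus2 >= 0 && log2_diff_max_min_tb >= 0);
    TbLimits l;
    l.ctb_log2_size_y    = ctb_log2_size_y;
    l.min_tb_log2_size_y = log2_min_tb_minus2 + 2;
    l.max_tb_log2_size_y = l.min_tb_log2_size_y + log2_diff_max_min_tb;
    return l;
}

/* MaxTbLog2SizeY must not exceed Min( CtbLog2SizeY, 5 ). */
bool max_tb_size_is_legal(const TbLimits *l)
{
    int bound = l->ctb_log2_size_y < 5 ? l->ctb_log2_size_y : 5;
    return l->max_tb_log2_size_y <= bound;
}

/* The maximum split depth must lie in [0, CtbLog2SizeY - MinTbLog2SizeY]. */
bool max_depth_is_legal(const TbLimits *l, int max_depth)
{
    return max_depth >= 0 &&
           max_depth <= l->ctb_log2_size_y - l->min_tb_log2_size_y;
}
```

With a 64x64 CTB and a 4x4 minimum TB, for instance, the legal depth range comes out as [0, 4], and a 32x32 maximum TB satisfies the bound.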

3. Where does the spec specify the M/2 x M/2 partitioning?

The syntax parsing process includes:
[Figure: spec syntax excerpt; image missing from the source]
When part_mode is not present, the variables PartMode and IntraSplitFlag are derived as follows:
– PartMode is set equal to PART_2Nx2N.
– IntraSplitFlag is set equal to 0.
So for an intra CU that is not a minimum-size CB, part_mode is not present and the CU is left unsplit by default. For an inter CU, the spec restricts its value:
The value of part_mode is restricted as follows:
– If CuPredMode[ x0 ][ y0 ] is equal to MODE_INTRA, part_mode shall be equal to 0 or 1.
– Otherwise (CuPredMode[ x0 ][ y0 ] is equal to MODE_INTER), the following applies:
  – If log2CbSize is greater than MinCbLog2SizeY and amp_enabled_flag is equal to 1, part_mode shall be in the range of 0 to 2, inclusive, or in the range of 4 to 7, inclusive.
  – Otherwise, if log2CbSize is greater than MinCbLog2SizeY and amp_enabled_flag is equal to 0, or log2CbSize is equal to 3, part_mode shall be in the range of 0 to 2, inclusive.
  – Otherwise (log2CbSize is greater than 3 and equal to MinCbLog2SizeY), the value of part_mode shall be in the range of 0 to 3, inclusive.
The table below gives the meaning of part_mode and IntraSplitFlag:
[Figure: table of part_mode and IntraSplitFlag values; image missing from the source]
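
The quoted part_mode restrictions translate fairly directly into a predicate. The sketch below is one possible reading, not code taken from any decoder. The function name is hypothetical, and the enum numbering for inter CUs is assumed to match the order of the spec table (it is the same order HM and FFmpeg use for their partition enums).

```c
#include <assert.h>
#include <stdbool.h>

/* part_mode values for an inter CU (assumed numbering, see lead-in). */
enum InterPartMode {
    PART_2Nx2N = 0, PART_2NxN = 1, PART_Nx2N = 2, PART_NxN = 3,
    PART_2NxnU = 4, PART_2NxnD = 5, PART_nLx2N = 6, PART_nRx2N = 7
};

/* Returns true if part_mode satisfies the quoted restrictions. */
bool part_mode_is_allowed(bool is_intra, int part_mode, int log2_cb_size,
                          int min_cb_log2_size_y, bool amp_enabled_flag)
{
    assert(log2_cb_size >= min_cb_log2_size_y && min_cb_log2_size_y >= 3);
    if (part_mode < 0 || part_mode > 7)
        return false;
    if (is_intra)
        return part_mode == 0 || part_mode == 1;
    if (log2_cb_size > min_cb_log2_size_y) {
        /* Above the minimum CB size: AMP modes only if enabled, never NxN. */
        if (amp_enabled_flag)
            return part_mode <= 2 || part_mode >= 4;
        return part_mode <= 2;
    }
    /* The CB is at the minimum size. */
    if (log2_cb_size == 3)
        return part_mode <= 2;   /* 8x8 CB: NxN would give 4x4 inter PBs */
    return part_mode <= 3;       /* larger minimum CB: NxN is allowed */
}
```

This also shows why inter PART_NxN needs a minimum-size CB larger than 8x8: it is rejected both above the minimum size and at log2CbSize equal to 3.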

4. Where is the rule that 8x4 and 4x8 PBs cannot use bi-prediction?

[Figure: spec excerpt; image missing from the source]

5. How does PU partitioning affect TU partitioning? Are TUs and PUs the same size?

rqt_root_cbf equal to 1 specifies that the transform_tree( ) syntax structure is present for the current coding unit.
rqt_root_cbf equal to 0 specifies that the transform_tree( ) syntax structure is not present for the current coding unit.
When rqt_root_cbf is not present, its value is inferred to be equal to 1.
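[Figure: spec syntax excerpt; image missing from the source]

IntraSplitFlag can be 1 only when an intra CU is a minimum-size CB. In that case MaxTrafoDepth = max_transform_hierarchy_depth_intra + 1, which allows the TU to be split one level further.

[Figure: spec syntax excerpt; image missing from the source]

split_transform_flag specifies whether a block is split into four equal sub-blocks. When split_transform_flag is not present, its value is derived as follows: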
When split_transform_flag[ x0 ][ y0 ][ trafoDepth ] is not present, it is inferred as follows:
– If one or more of the following conditions are true, the value of split_transform_flag[ x0 ][ y0 ][ trafoDepth ] is inferred to be equal to 1:
  – log2TrafoSize is greater than MaxTbLog2SizeY.
  – IntraSplitFlag is equal to 1 and trafoDepth is equal to 0.
  – interSplitFlag is equal to 1.
– Otherwise, the value of split_transform_flag[ x0 ][ y0 ][ trafoDepth ] is inferred to be equal to 0.

The variable interSplitFlag is derived as follows:
– If max_transform_hierarchy_depth_inter is equal to 0 and CuPredMode[ x0 ][ y0 ] is equal to MODE_INTER and PartMode is not equal to PART_2Nx2N and trafoDepth is equal to 0, interSplitFlag is set equal to 1.
– Otherwise, interSplitFlag is set equal to 0.
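
The two quoted derivations can be sketched as plain functions. The SplitCtx struct and the function names are hypothetical; only the conditions themselves come from the spec text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bundle of the values the inference rule reads. */
typedef struct {
    bool is_inter;            /* CuPredMode == MODE_INTER */
    bool part_is_2Nx2N;       /* PartMode == PART_2Nx2N */
    bool intra_split_flag;    /* IntraSplitFlag */
    int  max_depth_inter;     /* max_transform_hierarchy_depth_inter */
    int  max_tb_log2_size_y;  /* MaxTbLog2SizeY */
} SplitCtx;

/* interSplitFlag as derived in the quoted text. */
bool inter_split_flag(const SplitCtx *c, int trafo_depth)
{
    return c->max_depth_inter == 0 && c->is_inter &&
           !c->part_is_2Nx2N && trafo_depth == 0;
}

/* Value of split_transform_flag when it is NOT present in the bitstream. */
bool infer_split_transform_flag(const SplitCtx *c,
                                int log2_trafo_size, int trafo_depth)
{
    assert(log2_trafo_size >= 2 && trafo_depth >= 0);
    return log2_trafo_size > c->max_tb_log2_size_y ||
           (c->intra_split_flag && trafo_depth == 0) ||
           inter_split_flag(c, trafo_depth);
}
```

For an inter CU that is not PART_2Nx2N with max_transform_hierarchy_depth_inter equal to 0, this yields 1 at trafoDepth 0 and 0 at trafoDepth 1, that is, exactly one inferred split.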

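So, leaving aside the effect of MaxTbLog2SizeY and MinTbLog2SizeY:
For an inter CU, if max_transform_hierarchy_depth_inter = 0, split_transform_flag is not present and its value is determined as follows:
With inter PART_NxN, the TU is split once automatically. At the next level split_transform_flag is still not present and can only be inferred as 0, so in this case the TU and PU partitions coincide exactly.
With inter PART_2Nx2N, the TU is not split, and the TU and PU partitions again coincide exactly.
With any other inter PU partition mode, the TU is split once automatically, but the TU and PU sizes differ.
If max_transform_hierarchy_depth_inter != 0, however, this restriction does not apply.
For intra, if the PU is split, IntraSplitFlag = 1. At the first level split_transform_flag is not present and is inferred as 1, so the TU is split automatically. At the next level, though, it seems that split_transform_flag may be present.
If the PU is not split, IntraSplitFlag = 0, and it seems that split_transform_flag may be present.
So for intra, a TU can only be equal to or smaller than the PU, and the two partitions do not necessarily coincide. Why can't a TU span intra PUs? Because intra prediction uses the reconstructed samples of neighbouring blocks as reference, and that is impossible if a neighbouring block is transformed together with the current one.
Since the inter case leaves max_transform_hierarchy_depth_inter as it is, why does the intra case need IntraSplitFlag added to max_transform_hierarchy_depth_intra? The motivations differ.
For inter, a TU may straddle PU boundaries. To keep discontinuities at PU boundaries from hurting compression, the design prefers to split the TU as well whenever the CU is split into PUs. This only happens when split_transform_flag is not present, and then the TU is split by at most one level. The encoder can of course use split_transform_flag to signal explicitly that the TU is not split, or that it is split many levels deeper.
For intra, a TU must not be larger than its PU, so a TU never spans several intra PUs. Presumably to allow finer partitioning in some special cases, an intra TU is allowed one more level of splitting than max_transform_hierarchy_depth_intra.
IntraSplitFlag is 1 only when CU = min_cb_size. With the default min_cb_size = 8, a single TU split already reaches 4x4, so at most one level is split.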

6. In the cases above, can the MinTbLog2SizeY limit be violated? (unresolved)

If MinTbLog2SizeY = MinCbLog2SizeY and IntraSplitFlag or interSplitFlag is 1, then by the logic above split_transform_flag is inferred as 1.
But the HEVC overview paper [1] says: Not splitting is implicit when splitting would result in a luma TB size smaller than the indicated minimum.
So it seems a TU should never be smaller than MinTbLog2SizeY.
Where does the reasoning go wrong?
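
The concern can be made concrete with a little arithmetic. The sketch below uses two hypothetical helpers (the names are not from the spec, HM, or FFmpeg) to compute the luma TB size left after one inferred split at trafoDepth 0 and to report whether it would fall below MinTbLog2SizeY.

```c
#include <assert.h>
#include <stdbool.h>

/* log2 of the luma TB size after one inferred split at trafoDepth 0. */
int log2_tb_after_inferred_split(int log2_cb_size)
{
    assert(log2_cb_size >= 3);
    return log2_cb_size - 1;
}

/* True if that inferred split would land below MinTbLog2SizeY. */
bool inferred_split_breaks_min_tb(int log2_cb_size, int min_tb_log2_size_y)
{
    return log2_tb_after_inferred_split(log2_cb_size) < min_tb_log2_size_y;
}
```

With MinCbLog2SizeY = MinTbLog2SizeY = 3 the check fires (an 8x8 CB would yield 4x4 TBs below an 8x8 minimum); with MinTbLog2SizeY one smaller it does not. One thing worth checking against the SPS semantics: if they require MinTbLog2SizeY to be strictly smaller than MinCbLog2SizeY (an assumption here, not something verified in this note), the problematic case could not occur in a conforming bitstream.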

Several references gave no answer, so let's analyze the HM code, specifically the decoder:

Relevant parameters:
getQuadtreeTUMaxDepthInter
getQuadtreeTUMaxDepthIntra
getQuadtreeTULog2MinSize
getQuadtreeTULog2MaxSize

Code
getQuadtreeTULog2MinSizeInCU()

  if (log2CbSize < (m_pcSlice->getSPS()->getQuadtreeTULog2MinSize() + quadtreeTUMaxDepth - 1 + interSplitFlag + intraSplitFlag) )
  {
    // when fully making use of signaled TUMaxDepth + inter/intraSplitFlag, resulting luma TB size is < QuadtreeTULog2MinSize
    log2MinTUSizeInCU = m_pcSlice->getSPS()->getQuadtreeTULog2MinSize();
  }

TDecEntropy::xDecodeTransform ()

  if( pcCU->isIntra(uiAbsPartIdx) && pcCU->getPartitionSize(uiAbsPartIdx) == SIZE_NxN && uiDepth == pcCU->getDepth(uiAbsPartIdx) )
  {
    uiSubdiv = 1;
  }
  else if( (pcCU->getSlice()->getSPS()->getQuadtreeTUMaxDepthInter() == 1) && (pcCU->isInter(uiAbsPartIdx)) && ( pcCU->getPartitionSize(uiAbsPartIdx) != SIZE_2Nx2N ) && (uiDepth == pcCU->getDepth(uiAbsPartIdx)) )
  {
    uiSubdiv = (uiLog2TrafoSize >quadtreeTULog2MinSizeInCU);
  }
  else if( uiLog2TrafoSize > pcCU->getSlice()->getSPS()->getQuadtreeTULog2MaxSize() )
  {
    uiSubdiv = 1;
  }
  else if( uiLog2TrafoSize == pcCU->getSlice()->getSPS()->getQuadtreeTULog2MinSize() )
  {
    uiSubdiv = 0;
  }
  else if( uiLog2TrafoSize == quadtreeTULog2MinSizeInCU )
  {
    uiSubdiv = 0;
  }
  else
  {
    assert( uiLog2TrafoSize > quadtreeTULog2MinSizeInCU );
    m_pcEntropyDecoderIf->parseTransformSubdivFlag( uiSubdiv, 5 - uiLog2TrafoSize );
  }

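This seems to mean that an intra TU can be smaller than MinTbLog2SizeY, whereas an inter TU cannot.
HM rewrites much of the spec logic, and the meaning of some parameters changes as well. For example, the value returned here by pcCU->getSlice()->getSPS()->getQuadtreeTUMaxDepthInter() had 1 added when it was stored, so the check for the default inter split compares it against 1 rather than 0, which is not easy to follow.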

  READ_UVLC_CHK( uiCode, "max_transform_hierarchy_depth_inter", 0, ctbLog2SizeY - minTbLog2SizeY);    pcSPS->setQuadtreeTUMaxDepthInter( uiCode+1 );
  READ_UVLC_CHK( uiCode, "max_transform_hierarchy_depth_intra", 0, ctbLog2SizeY - minTbLog2SizeY);    pcSPS->setQuadtreeTUMaxDepthIntra( uiCode+1 );

FFmpeg, by contrast, follows the logic in the spec exactly:

    if (log2_trafo_size <= s->ps.sps->log2_max_trafo_size &&
        log2_trafo_size >  s->ps.sps->log2_min_tb_size    &&
        trafo_depth     < lc->cu.max_trafo_depth       &&
        !(lc->cu.intra_split_flag && trafo_depth == 0)) {
        split_transform_flag = ff_hevc_split_transform_flag_decode(s, log2_trafo_size);
    } else {
        int inter_split = s->ps.sps->max_transform_hierarchy_depth_inter == 0 &&
                          lc->cu.pred_mode == MODE_INTER &&
                          lc->cu.part_mode != PART_2Nx2N &&
                          trafo_depth == 0;

        split_transform_flag = log2_trafo_size > s->ps.sps->log2_max_trafo_size ||
                               (lc->cu.intra_split_flag && trafo_depth == 0) ||
                               inter_split;
    }

Later on, however, it uses log2_min_tu_size to index an array, which would go wrong if a TU were smaller than the minimum TB:

        int min_tu_size      = 1 << s->ps.sps->log2_min_tb_size;
        int log2_min_tu_size = s->ps.sps->log2_min_tb_size;// TODO: store cbf_luma somewhere else
        if (cbf_luma) {
            int i, j;
            for (i = 0; i < (1 << log2_trafo_size); i += min_tu_size)
                for (j = 0; j < (1 << log2_trafo_size); j += min_tu_size) {
                    int x_tu = (x0 + j) >> log2_min_tu_size;
                    int y_tu = (y0 + i) >> log2_min_tu_size;
                    s->cbf_luma[y_tu * min_tu_width + x_tu] = 1;
                }
        }
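
To see what going wrong would look like, the index arithmetic from that loop can be isolated. This is a minimal sketch: cbf_luma_cell is a hypothetical helper, not an FFmpeg function, and min_tu_width is assumed to be the picture width measured in minimum-size TBs.

```c
#include <assert.h>

/* Index of the cbf_luma cell that covers luma position (x0, y0),
 * using the same shift-and-multiply arithmetic as the loop above. */
int cbf_luma_cell(int x0, int y0, int log2_min_tu_size, int min_tu_width)
{
    assert(x0 >= 0 && y0 >= 0 && log2_min_tu_size >= 2 && min_tu_width > 0);
    int x_tu = x0 >> log2_min_tu_size;
    int y_tu = y0 >> log2_min_tu_size;
    return y_tu * min_tu_width + x_tu;
}
```

The map holds one cell per minimum-size TB. If the minimum TB were 8x8 and a CB were nevertheless split into four 4x4 TUs at (0,0), (4,0), (0,4), and (4,4), all four would map to the same cell, so their cbf_luma values could not be told apart.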

libhevc, the HEVC decoder used on Android, also handles this in essentially the same way as the spec.
Why is that?


[1] Sullivan, G. J., et al., Overview of the High Efficiency Video Coding (HEVC) standard. IEEE Transactions on Circuits and Systems for Video Technology, 2012. 22(12): p. 1649-1668.