
An Introduction to Classic Convolutional Neural Network Architectures


Preface

This post introduces several classic convolutional neural network architectures:

  • LeNet
  • AlexNet
  • VGG
  • NiN
  • GoogLeNet
  • ResNet
  • DenseNet

I. LeNet

The LeNet network consists of 2 convolutional layers, 2 pooling layers, and 3 fully connected layers (including the output layer).

The architecture is illustrated below:

[Figure: LeNet architecture]

1.
    from mxnet import nd
    from mxnet.gluon import nn

    # LeNet: two convolution + pooling stages followed by three dense layers
    net = nn.Sequential()
    net.add(nn.Conv2D(channels = 6,kernel_size = 5,activation = 'relu'),
            nn.MaxPool2D(pool_size = 2,strides = 2),
            nn.Conv2D(channels = 16,kernel_size = 5,activation = 'relu'),
            nn.MaxPool2D(pool_size = 2,strides = 2),
            nn.Dense(120,activation = 'sigmoid'),
            nn.Dense(84,activation = 'sigmoid'),
            nn.Dense(10))

2.
    net
    
    Sequential(
      (0): Conv2D(1 -> 6, kernel_size=(5, 5), stride=(1, 1), Activation(relu))
      (1): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      (2): Conv2D(6 -> 16, kernel_size=(5, 5), stride=(1, 1), Activation(relu))
      (3): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      (4): Dense(256 -> 120, Activation(sigmoid))
      (5): Dense(120 -> 84, Activation(sigmoid))
      (6): Dense(84 -> 10, linear)
    )

3.
    X = nd.random.uniform(shape=(1,1,28,28))
    net.initialize()
    for layer in net:
        X = layer(X)
        print(layer.name,'output shape:',X.shape)
    
    
    conv0 output shape: (1, 6, 24, 24)
    pool0 output shape: (1, 6, 12, 12)
    conv1 output shape: (1, 16, 8, 8)
    pool1 output shape: (1, 16, 4, 4)
    dense0 output shape: (1, 120)
    dense1 output shape: (1, 84)
    dense2 output shape: (1, 10)
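
Note that the repr above shows Dense(256 -> 120) for the first dense layer: Gluon's Dense layer flattens the (1, 16, 4, 4) pooling output automatically, and 16 × 4 × 4 = 256. The inferred parameter shapes can be inspected, for example, with:

    for name, param in net.collect_params().items():
        print(name, param.shape)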

II. AlexNet

The AlexNet network consists of 5 convolutional layers, 3 pooling layers, and 3 fully connected layers.

[Figure: AlexNet architecture]

1.
    net_Alex = nn.Sequential()
    net_Alex.add(nn.Conv2D(96,kernel_size = 11,strides = 4,activation = 'relu'),
                 nn.MaxPool2D(pool_size = 3,strides = 2),
                 nn.Conv2D(256,kernel_size = 5,padding = 2,activation = 'relu'),
                 nn.MaxPool2D(pool_size = 3,strides = 2),
                 nn.Conv2D(384,kernel_size =3,padding = 1,activation = 'relu' ),
                 nn.Conv2D(384,kernel_size =3,padding = 1,activation = 'relu' ),
                 nn.Conv2D(256,kernel_size =3,padding = 1,activation = 'relu' ),
                 nn.MaxPool2D(pool_size = 3,strides = 2),
                 nn.Dense(4096,activation = 'relu'),nn.Dropout(0.5),
                 nn.Dense(4096,activation = 'relu'),nn.Dropout(0.5),
                 # the dataset used is MNIST, so there are 10 classes
                 nn.Dense(10)
                )
                
2.
    net_Alex
    
    Sequential(
      (0): Conv2D(1 -> 96, kernel_size=(11, 11), stride=(4, 4), Activation(relu))
      (1): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      (2): Conv2D(96 -> 256, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
      (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      (4): Conv2D(256 -> 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (5): Conv2D(384 -> 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (6): Conv2D(384 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (7): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      (8): Dense(6400 -> 4096, Activation(relu))
      (9): Dropout(p = 0.5, axes=())
      (10): Dense(4096 -> 4096, Activation(relu))
      (11): Dropout(p = 0.5, axes=())
      (12): Dense(4096 -> 10, linear)
    )
3.
    X = nd.random.uniform(shape=(1,1,224,224))
    net_Alex.initialize()
    for layer_A in net_Alex:
        X=layer_A(X)
        print(layer_A.name,'output shape:',X.shape)
        
    conv0 output shape: (1, 96, 54, 54)
    pool0 output shape: (1, 96, 26, 26)
    conv1 output shape: (1, 256, 26, 26)
    pool1 output shape: (1, 256, 12, 12)
    conv2 output shape: (1, 384, 12, 12)
    conv3 output shape: (1, 384, 12, 12)
    conv4 output shape: (1, 256, 12, 12)
    pool2 output shape: (1, 256, 5, 5)
    dense0 output shape: (1, 4096)
    dropout0 output shape: (1, 4096)
    dense1 output shape: (1, 4096)
    dropout1 output shape: (1, 4096)
    dense2 output shape: (1, 10)
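
AlexNet expects roughly 224×224 inputs, while (Fashion-)MNIST images are only 28×28, so the images are usually resized when the data is loaded. A minimal sketch (FashionMNIST and the batch size here are assumptions for illustration; they are not specified in the code above):

    from mxnet.gluon.data import DataLoader
    from mxnet.gluon.data.vision import FashionMNIST, transforms

    # resize the 28x28 images to 224x224 and convert them to (C, H, W) float tensors
    transformer = transforms.Compose([transforms.Resize(224), transforms.ToTensor()])
    train_iter = DataLoader(FashionMNIST(train=True).transform_first(transformer),
                            batch_size=128, shuffle=True)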

III. VGG

The VGG network has 13 convolutional layers, 3 fully connected layers, and 5 pooling layers.

(VGG-16: 16 = 13 convolutional layers + 3 fully connected layers; the 16 counts only the layers that carry parameters.)


[Figure: VGG architecture and configurations]

1.
    def vgg_block(num_convs, num_channels):
        blk = nn.Sequential()
        for _ in range(num_convs):
            # padding = 1 keeps the height and width unchanged for 3x3 kernels
            blk.add(nn.Conv2D(num_channels,kernel_size = 3,padding = 1,activation = 'relu'))
        blk.add(nn.MaxPool2D(pool_size = 2,strides = 2))
        return blk
    
    def vgg(conv_arch):
        net = nn.Sequential()
        for (num_conv,num_channels) in conv_arch:
            net.add(vgg_block(num_conv,num_channels))
        net.add(nn.Dense(4096,activation = 'relu'),nn.Dropout(0.5),
                nn.Dense(4096,activation = 'relu'),nn.Dropout(0.5),
                nn.Dense(10))
        return net
    
    conv_arch =  ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))
    net_vgg = vgg(conv_arch)
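
The conv_arch above specifies 1 + 1 + 2 + 2 + 2 = 8 convolutional layers, so together with the 3 dense layers this builds a smaller, VGG-11-style network. The full VGG-16 described earlier (13 convolutional + 3 fully connected layers) can be obtained from the same vgg function by passing a larger block configuration, for example:

    # 2 + 2 + 3 + 3 + 3 = 13 convolutional layers, plus the 3 dense layers = VGG-16
    conv_arch_vgg16 = ((2, 64), (2, 128), (3, 256), (3, 512), (3, 512))
    net_vgg16 = vgg(conv_arch_vgg16)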

2.
    net_vgg
    
    Sequential(
      (0): Sequential(
        (0): Conv2D(1 -> 64, kernel_size=(3, 3), stride=(1, 1), Activation(relu))
        (1): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (1): Sequential(
        (0): Conv2D(64 -> 128, kernel_size=(3, 3), stride=(1, 1), Activation(relu))
        (1): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (2): Sequential(
        (0): Conv2D(128 -> 256, kernel_size=(3, 3), stride=(1, 1), Activation(relu))
        (1): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), Activation(relu))
        (2): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (3): Sequential(
        (0): Conv2D(256 -> 512, kernel_size=(3, 3), stride=(1, 1), Activation(relu))
        (1): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), Activation(relu))
        (2): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (4): Sequential(
        (0): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), Activation(relu))
        (1): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), Activation(relu))
        (2): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (5): Dense(4608 -> 4096, Activation(relu))
      (6): Dropout(p = 0.5, axes=())
      (7): Dense(4096 -> 4096, Activation(relu))
      (8): Dropout(p = 0.5, axes=())
      (9): Dense(4096 -> 10, linear)
    )

3.
    net_vgg.initialize()
    X = nd.random.uniform(shape=(1, 1, 224, 224))
    for blk in net_vgg:
        X = blk(X)
        print(blk.name, 'output shape:\t', X.shape)
    
    sequential1 output shape: (1, 64, 112, 112)
    sequential2 output shape: (1, 128, 56, 56)
    sequential3 output shape: (1, 256, 28, 28)
    sequential4 output shape: (1, 512, 14, 14)
    sequential5 output shape: (1, 512, 7, 7)
    dense0 output shape: (1, 4096)
    dropout0 output shape: (1, 4096)
    dense1 output shape: (1, 4096)
    dropout1 output shape: (1, 4096)
    dense2 output shape: (1, 10)

IV. NiN

NiN stands for Network in Network. Its structure is shown below:

[Figure: NiN architecture]

The NiN block is the basic building block of NiN. It consists of one convolutional layer followed by two 1×1 convolutional layers that act as fully connected layers (a 1×1 convolution can be viewed as a special fully connected layer applied independently at every spatial position).
1.
    def nin_block(num_channels, kernel_size, strides, padding):
        blk = nn.Sequential()
        blk.add(nn.Conv2D(num_channels, kernel_size,strides, padding, activation='relu'),
            nn.Conv2D(num_channels, kernel_size=1, activation='relu'),
            nn.Conv2D(num_channels, kernel_size=1, activation='relu'))
        return blk
        
    net_nin = nn.Sequential()
    net_nin.add(nin_block(96, kernel_size=11, strides=4, padding=0),
                nn.MaxPool2D(pool_size=3, strides=2),
                nin_block(256, kernel_size=5, strides=1, padding=2),
                nn.MaxPool2D(pool_size=3, strides=2),
                nin_block(384, kernel_size=3, strides=1, padding=1),
                nn.MaxPool2D(pool_size=3, strides=2), nn.Dropout(0.5),
                # there are 10 label classes
                nin_block(10, kernel_size=3, strides=1, padding=1),
                # the global average pooling layer automatically sets its window size to the input's height and width
                nn.GlobalAvgPool2D(),
                # convert the 4-D output into a 2-D output of shape (batch size, 10)
                nn.Flatten())
2.
    net_nin
    
    Sequential(
      (0): Sequential(
        (0): Conv2D(None -> 96, kernel_size=(11, 11), stride=(4, 4), Activation(relu))
        (1): Conv2D(None -> 96, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        (2): Conv2D(None -> 96, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
      )
      (1): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      (2): Sequential(
        (0): Conv2D(None -> 256, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
        (1): Conv2D(None -> 256, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        (2): Conv2D(None -> 256, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
      )
      (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      (4): Sequential(
        (0): Conv2D(None -> 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
        (1): Conv2D(None -> 384, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        (2): Conv2D(None -> 384, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
      )
      (5): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      (6): Dropout(p = 0.5, axes=())
      (7): Sequential(
        (0): Conv2D(None -> 10, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
        (1): Conv2D(None -> 10, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        (2): Conv2D(None -> 10, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
      )
      (8): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
      (9): Flatten
    )
    
3.
    X = nd.random.uniform(shape=(1, 1, 224, 224))
    net_nin.initialize()
    for layer in net_nin:
        X = layer(X)
        print(layer.name, 'output shape:\t', X.shape)
    
    sequential1 output shape: (1, 96, 54, 54)
    pool0 output shape: (1, 96, 26, 26)
    sequential2 output shape: (1, 256, 26, 26)
    pool1 output shape: (1, 256, 12, 12)
    sequential3 output shape: (1, 384, 12, 12)
    pool2 output shape: (1, 384, 5, 5)
    dropout0 output shape: (1, 384, 5, 5)
    sequential4 output shape: (1, 10, 5, 5)
    pool3 output shape: (1, 10, 1, 1)

  • Note: one element here that differs from the previous networks is Global Average Pooling.
In a traditional CNN, the feature maps from the last convolutional layer are flattened and fed through fully connected layers, and softmax regression is then used for classification;
with global average pooling, the network produces exactly as many final feature maps as there are classes, and the average value of each feature map is taken as the confidence score of the corresponding class, playing the same role as the feature vector produced by the FC layers, before being passed to softmax for classification.
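
As a minimal sketch of the difference (the variable names below are illustrative), global average pooling collapses each of the 10 class feature maps to a single average value without any extra parameters, whereas the traditional approach flattens the maps and feeds them through a parameterized Dense layer:

    feat = nd.random.uniform(shape=(1, 10, 5, 5))   # e.g. the (batch, class, H, W) maps before pooling

    gap = nn.GlobalAvgPool2D()
    print(gap(feat).shape)    # (1, 10, 1, 1): one confidence value per class, no learnable parameters

    fc = nn.Dense(10)         # the traditional alternative: flatten, then fully connected
    fc.initialize()
    print(fc(feat).shape)     # (1, 10), but with 10 * (10 * 5 * 5) + 10 learnable parameters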

The difference between FC and global average pooling is illustrated below:

[Figure: fully connected layers vs. global average pooling]

V. GoogLeNet

The basic convolutional block in GoogLeNet is called the Inception block, shown below:
[Figure: Inception block]

As the figure shows, an Inception block has 4 parallel branches.
The first three branches use 1×1, 3×3, and 5×5 convolutional layers to capture information at different spatial scales;
the middle two of these branches first apply a 1×1 convolutional layer to reduce the number of channels and thus the model complexity, and only then apply the larger convolution.
The fourth branch first applies a 3×3 max pooling layer and then a 1×1 convolutional layer to change the number of channels.
All four branches use appropriate padding so that the input and output have the same height and width.

Some concrete parameter settings are shown below:

[Figure: Inception block parameter settings]

The calculation showing how 1×1 convolutions reduce model complexity:

[Figure: complexity comparison with and without 1×1 convolutions]
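
As an illustrative calculation (the channel numbers are taken from the 5×5 branch of the first Inception block defined below: 192 input channels, reduced to 16 by the 1×1 layer, then expanded to 32 by the 5×5 layer), counting only the convolution weights:

    in_ch, mid_ch, out_ch = 192, 16, 32

    direct = 5 * 5 * in_ch * out_ch                                   # 5x5 conv applied directly: 153,600 weights
    bottleneck = 1 * 1 * in_ch * mid_ch + 5 * 5 * mid_ch * out_ch     # 1x1 reduction first: 3,072 + 12,800 = 15,872 weights
    print(direct, bottleneck, direct / bottleneck)                    # roughly a 10x reduction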

1.
    class Inception(nn.Block):
        # c1 - c4 are the output channel counts of the layers in each branch
        def __init__(self, c1, c2, c3, c4, **kwargs):
            super(Inception, self).__init__(**kwargs)
            # branch 1: a single 1 x 1 convolutional layer
            self.p1_1 = nn.Conv2D(c1, kernel_size=1, activation='relu')
            # branch 2: a 1 x 1 convolutional layer followed by a 3 x 3 convolutional layer
            self.p2_1 = nn.Conv2D(c2[0], kernel_size=1, activation='relu')
            self.p2_2 = nn.Conv2D(c2[1], kernel_size=3, padding=1,activation='relu')
            # branch 3: a 1 x 1 convolutional layer followed by a 5 x 5 convolutional layer
            self.p3_1 = nn.Conv2D(c3[0], kernel_size=1, activation='relu')
            self.p3_2 = nn.Conv2D(c3[1], kernel_size=5, padding=2, activation='relu')
            # branch 4: a 3 x 3 max pooling layer followed by a 1 x 1 convolutional layer
            self.p4_1 = nn.MaxPool2D(pool_size=3, strides=1, padding=1)
            self.p4_2 = nn.Conv2D(c4, kernel_size=1, activation='relu')
        def forward(self, x):
            p1 = self.p1_1(x)
            p2 = self.p2_2(self.p2_1(x))
            p3 = self.p3_2(self.p3_1(x))
            p4 = self.p4_2(self.p4_1(x))
            return nd.concat(p1, p2, p3, p4, dim=1)  # concatenate the outputs along the channel dimension
            
    b1 = nn.Sequential()
    b1.add(nn.Conv2D(64,kernel_size = 7,padding = 3 ,strides = 2,activation = 'relu'),
           nn.MaxPool2D(pool_size = 3,strides = 2,padding = 1))
    b2 = nn.Sequential()
    b2.add(nn.Conv2D(64,kernel_size = 1,activation = 'relu'),
           nn.Conv2D(192,kernel_size = 3,padding = 1,activation = 'relu'),
           nn.MaxPool2D(pool_size = 3,strides = 2,padding = 1))
    b3 = nn.Sequential()
    b3.add(Inception(64, (96, 128), (16, 32), 32),
           Inception(128, (128, 192), (32, 96), 64),
           nn.MaxPool2D(pool_size=3, strides=2, padding=1))
    b4 = nn.Sequential()
    b4.add(Inception(192, (96, 208), (16, 48), 64),
           Inception(160, (112, 224), (24, 64), 64),
           Inception(128, (128, 256), (24, 64), 64),
           Inception(112, (144, 288), (32, 64), 64),
           Inception(256, (160, 320), (32, 128), 128),
           nn.MaxPool2D(pool_size=3, strides=2, padding=1))
    b5 = nn.Sequential()
    b5.add(Inception(256, (160, 320), (32, 128), 128),
           Inception(384, (192, 384), (48, 128), 128),
           nn.GlobalAvgPool2D())
    
    net_google = nn.Sequential()
    net_google.add(b1, b2, b3, b4, b5, nn.Dense(10))

2.
    net_google
    
    Sequential(
      (0): Sequential(
        (0): Conv2D(None -> 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), Activation(relu))
        (1): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (1): Sequential(
        (0): Conv2D(None -> 64, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        (1): Conv2D(None -> 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
        (2): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (2): Sequential(
        (0): Inception(
          (p1_1): Conv2D(None -> 64, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 96, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 16, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 32, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (1): Inception(
          (p1_1): Conv2D(None -> 128, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 128, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 32, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 96, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 64, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (2): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (3): Sequential(
        (0): Inception(
          (p1_1): Conv2D(None -> 192, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 96, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 208, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 16, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 48, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 64, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (1): Inception(
          (p1_1): Conv2D(None -> 160, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 112, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 224, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 24, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 64, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (2): Inception(
          (p1_1): Conv2D(None -> 128, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 128, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 24, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 64, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (3): Inception(
          (p1_1): Conv2D(None -> 112, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 144, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 288, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 32, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 64, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (4): Inception(
          (p1_1): Conv2D(None -> 256, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 160, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 32, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 128, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 128, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (5): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
      )
      (4): Sequential(
        (0): Inception(
          (p1_1): Conv2D(None -> 256, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 160, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 32, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 128, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 128, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (1): Inception(
          (p1_1): Conv2D(None -> 384, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_1): Conv2D(None -> 192, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p2_2): Conv2D(None -> 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
          (p3_1): Conv2D(None -> 48, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
          (p3_2): Conv2D(None -> 128, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), Activation(relu))
          (p4_1): MaxPool2D(size=(3, 3), stride=(1, 1), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
          (p4_2): Conv2D(None -> 128, kernel_size=(1, 1), stride=(1, 1), Activation(relu))
        )
        (2): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
      )
      (5): Dense(None -> 10, linear)
    )
3.
    X = nd.random.uniform(shape=(1, 1, 96, 96))
    net_google.initialize()
    for layer in net_google:
        X = layer(X)
        print(layer.name, 'output shape:\t', X.shape)
        
    sequential0 output shape:   (1, 64, 24, 24)
    sequential1 output shape:   (1, 192, 12, 12)
    sequential2 output shape:   (1, 480, 6, 6)
    sequential3 output shape:   (1, 832, 3, 3)
    sequential4 output shape:   (1, 1024, 1, 1)
    dense0 output shape:     (1, 10)

VI. ResNet

ResNet network configurations at different depths:
[Table: ResNet configurations at different depths]
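
A minimal Gluon sketch of the basic residual block that these configurations are built from (assuming the common design of two 3×3 convolutions with batch normalization, and an optional 1×1 convolution on the shortcut path when the number of channels or the spatial size changes):

    class Residual(nn.Block):
        def __init__(self, num_channels, use_1x1conv=False, strides=1, **kwargs):
            super(Residual, self).__init__(**kwargs)
            self.conv1 = nn.Conv2D(num_channels, kernel_size=3, padding=1, strides=strides)
            self.conv2 = nn.Conv2D(num_channels, kernel_size=3, padding=1)
            # 1x1 convolution on the shortcut to match channels / spatial size when needed
            self.conv3 = nn.Conv2D(num_channels, kernel_size=1, strides=strides) if use_1x1conv else None
            self.bn1 = nn.BatchNorm()
            self.bn2 = nn.BatchNorm()

        def forward(self, x):
            y = nd.relu(self.bn1(self.conv1(x)))
            y = self.bn2(self.conv2(y))
            if self.conv3:
                x = self.conv3(x)
            return nd.relu(y + x)   # add the shortcut before the final activation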

VII. DenseNet

To be completed later.
