[PaddlePaddle] Image Segmentation: From FCN to UNet in One Article


PaddlePaddle Official Image Segmentation Course

https://aistudio.baidu.com/aistudio/course/introduce/1767

Image Segmentation

Image segmentation is the technique and process of partitioning an image into a number of specific regions with distinct properties and extracting the objects of interest. It is the key step leading from image processing to image analysis. Existing segmentation methods fall mainly into the following categories: threshold-based, region-based, edge-based, and methods built on specific theories. Mathematically, image segmentation divides a digital image into mutually disjoint regions. It is also a labeling process: pixels belonging to the same region are assigned the same label.
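As a minimal illustration of this labeling view (plain NumPy, with made-up values), a segmentation result can be stored as a label map the same size as the image, where pixels sharing a label belong to the same region:

import numpy as np

# A hypothetical 4x4 image segmented into three disjoint regions.
# Pixels with the same integer label belong to the same region.
label_map = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 1, 1],
    [2, 2, 2, 2],
])

# Binary mask of region 1 (the "object of interest").
mask_region1 = (label_map == 1)
print(mask_region1.astype(int))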

Fully Convolutional Networks (FCN)

What Is FCN?

FCN stands for Fully Convolutional Networks: in short, the network is built entirely from convolutions, with no fully connected (FC) layers.
At its core, semantic segmentation is pixel-level classification.
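As a rough sketch of what pixel-level classification means (NumPy only; the class count and spatial size below are made up), the network outputs a score for every class at every pixel, and the predicted label map is the per-pixel argmax over the classes:

import numpy as np

num_classes, H, W = 59, 4, 4
# Hypothetical per-pixel class scores, shaped [num_classes, H, W],
# as a segmentation head would produce for a single image.
scores = np.random.rand(num_classes, H, W)

# Pixel-level classification: pick the highest-scoring class at every pixel.
pred = scores.argmax(axis=0)  # shape [H, W], one class id per pixel
print(pred.shape)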

Relationship to VGG

The classic VGG16 convolutional neural network, whose convolutional layers are arranged in a 2-2-3-3-3 pattern across five stages, performs very well at image recognition.
FCN builds on the traditional VGG network by replacing its FC layers with 1x1 convolutions, so the output keeps the spatial layout of the input instead of being flattened into a vector, and the resulting feature map can be upsampled and classified. Because every layer is convolutional, the score map can be upsampled back to the input resolution, which lets FCN produce an output whose spatial shape matches the input (input.shape == output.shape).
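To see why a 1x1 convolution can stand in for an FC layer while keeping the spatial layout, note that it applies one and the same linear map independently at every spatial position. A minimal NumPy sketch (all shapes are hypothetical):

import numpy as np

C_in, C_out, H, W = 512, 59, 7, 7
feature = np.random.rand(C_in, H, W)   # backbone feature map
weight = np.random.rand(C_out, C_in)   # weights of a 1x1 convolution (no bias)

# A 1x1 convolution is the same linear map applied at every pixel.
out = np.einsum('oc,chw->ohw', weight, feature)
print(out.shape)  # (59, 7, 7): per-pixel class scores, spatial size preserved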

The input image is passed through the VGG16 network to obtain a feature map, which is then upsampled back to the input resolution. The prediction is compared with the ground truth pixel by pixel, i.e. classification is performed at the pixel level. In other words, the segmentation problem is turned into a classification problem, which is exactly where deep learning is strong. Simply upsampling (or deconvolving) the final feature map on its own would clearly lose a lot of information, so FCN fuses the final feature map with the pool4 and pool3 features. Since pool3, pool4, and the final feature map have different sizes, they must first be upsampled to a common size before fusion. In the original FCN this fusion is an element-wise addition of the score maps; it is UNet, discussed below, that fuses by concatenation.
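The fusion described above can be sketched as follows (NumPy only; the shapes are made up, and nearest-neighbor 2x upsampling stands in for FCN's learned transposed convolutions):

import numpy as np

def upsample2x(x):
    # Nearest-neighbor 2x upsampling along the last two axes.
    return x.repeat(2, axis=-2).repeat(2, axis=-1)

num_classes = 59
score_pool3 = np.random.rand(num_classes, 28, 28)  # scores predicted from pool3
score_pool4 = np.random.rand(num_classes, 14, 14)  # scores predicted from pool4
score_final = np.random.rand(num_classes, 7, 7)    # scores from the final feature map

# Fuse by element-wise addition after matching resolutions (FCN-8s style).
fused = upsample2x(score_final) + score_pool4  # 14x14
fused = upsample2x(fused) + score_pool3        # 28x28
print(fused.shape)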

The UNet Network

Relationship to FCN

UNet follows the same idea as FCN: the input is downsampled and then upsampled again, so UNet is likewise a fully convolutional network. To address FCN's weakness of producing insufficiently fine segmentations, however, UNet relies on a well-designed architecture to improve accuracy.

The UNet architecture is shaped like the letter "U", which is where the name comes from. Intuitively it splits into two parts: a feature-extraction (contracting) path, similar to VGG, and an upsampling (expanding) path.
At every upsampling step, the decoder feature map is fused with the encoder feature map of the same scale through a skip connection. Compared with FCN, its segmentations are much finer; UNet also scales well to large images and performs particularly strongly on medical images.

Implementing a UNet Model in Paddle

import numpy as np
import paddle
import paddle.fluid as fluid
from paddle.fluid.dygraph import to_variable
from paddle.fluid.dygraph import Layer
from paddle.fluid.dygraph import Conv2D
from paddle.fluid.dygraph import BatchNorm
from paddle.fluid.dygraph import Pool2D
from paddle.fluid.dygraph import Conv2DTranspose


class Encoder(Layer):
    def __init__(self, num_channels, num_filters):
        super(Encoder, self).__init__()
        # Encoder block:
        #   3x3 conv + BN + ReLU
        #   3x3 conv + BN + ReLU
        #   2x2 max pool
        # Returns the features both before pooling (kept for the skip connection)
        # and after pooling (passed on to the next encoder).
        self.conv1 = Conv2D(num_channels,
                            num_filters,
                            filter_size=3,
                            stride=1,
                            padding=1)
        self.bn1 = BatchNorm(num_filters, act='relu')

        self.conv2 = Conv2D(num_filters,
                            num_filters,
                            filter_size=3,
                            stride=1,
                            padding=1)
        self.bn2 = BatchNorm(num_filters, act='relu')

        self.pool = Pool2D(pool_size=2, pool_stride=2, pool_type='max', ceil_mode=True)

    def forward(self, inputs):
        # Two 3x3 conv + BN + ReLU blocks, then 2x2 max pooling; return features before and after pooling.
        x = self.conv1(inputs)
        x = self.bn1(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x_pooled = self.pool(x)

        return x, x_pooled


class Decoder(Layer):
    def __init__(self, num_channels, num_filters):
        super(Decoder, self).__init__()
        # Decoder block:
        #   2x2 transposed conv (doubles the spatial size)
        #   3x3 conv + BN + ReLU
        #   3x3 conv + BN + ReLU

        self.up = Conv2DTranspose(num_channels=num_channels,
                                  num_filters=num_filters,
                                  filter_size=2,
                                  stride=2)

        self.conv1 = Conv2D(num_channels,
                            num_filters,
                            filter_size=3,
                            stride=1,
                            padding=1)
        self.bn1 = BatchNorm(num_filters, act='relu')

        self.conv2 = Conv2D(num_filters,
                            num_filters,
                            filter_size=3,
                            stride=1,
                            padding=1)
        self.bn2 = BatchNorm(num_filters, act='relu')

    def forward(self, inputs_prev, inputs):
        # Upsample, pad to match the skip feature map, concatenate, then run the convs.
        x = self.up(inputs)
        # Pad the upsampled feature map so its spatial size matches the skip connection.
        h_diff = (inputs_prev.shape[2] - x.shape[2])
        w_diff = (inputs_prev.shape[3] - x.shape[3])
        x = fluid.layers.pad2d(x, paddings=[h_diff // 2, h_diff - h_diff // 2, w_diff // 2, w_diff - w_diff // 2])
        # Concatenate with the encoder features along the channel axis (UNet skip connection).
        x = fluid.layers.concat([inputs_prev, x], axis=1)
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.conv2(x)
        x = self.bn2(x)

        return x


class UNet(Layer):
    def __init__(self, num_classes=59):
        super(UNet, self).__init__()
        # encoder: 3->64->128->256->512
        # mid: 512->1024->1024

        # 4 encoders, 4 decoders, and a middle block of two 1x1 conv + BN + ReLU layers
        self.down1 = Encoder(num_channels=3, num_filters=64)
        self.down2 = Encoder(num_channels=64, num_filters=128)
        self.down3 = Encoder(num_channels=128, num_filters=256)
        self.down4 = Encoder(num_channels=256, num_filters=512)

        self.mid_conv1 = Conv2D(512, 1024, filter_size=1, padding=0, stride=1)
        self.mid_bn1 = BatchNorm(1024, act='relu')
        self.mid_conv2 = Conv2D(1024, 1024, filter_size=1, padding=0, stride=1)
        self.mid_bn2 = BatchNorm(1024, act='relu')

        self.up4 = Decoder(1024, 512)
        self.up3 = Decoder(512, 256)
        self.up2 = Decoder(256, 128)
        self.up1 = Decoder(128, 64)

        self.last_conv = Conv2D(num_channels=64, num_filters=num_classes, filter_size=1)

    def forward(self, inputs):
        # Encoder path; the print() calls below show feature-map shapes for debugging.
        x1, x = self.down1(inputs)
        print(x1.shape, x.shape)
        x2, x = self.down2(x)
        print(x2.shape, x.shape)
        x3, x = self.down3(x)
        print(x3.shape, x.shape)
        x4, x = self.down4(x)
        print(x4.shape, x.shape)

        # middle layers
        x = self.mid_conv1(x)
        x = self.mid_bn1(x)
        x = self.mid_conv2(x)
        x = self.mid_bn2(x)

        print(x4.shape, x.shape)
        # Decoder path: each step fuses the upsampled features with the matching encoder output.
        x = self.up4(x4, x)
        print(x3.shape, x.shape)
        x = self.up3(x3, x)
        print(x2.shape, x.shape)
        x = self.up2(x2, x)
        print(x1.shape, x.shape)
        x = self.up1(x1, x)
        print(x.shape)

        x = self.last_conv(x)

        return x


def main():
    # Run the model in dygraph mode with a random input to check the output shape.
    with fluid.dygraph.guard(fluid.CUDAPlace(0)):
        model = UNet(num_classes=59)
        x_data = np.random.rand(1, 3, 123, 123).astype(np.float32)
        inputs = to_variable(x_data)
        pred = model(inputs)

        print(pred.shape)


if __name__ == "__main__":
    main()

References

PaddlePaddle official course: https://aistudio.baidu.com/aistudio/course/introduce/1767
Notes compiled by other community users

