PaddlePaddle Official Image Segmentation Course
https://aistudio.baidu.com/aistudio/course/introduce/1767
Image Segmentation
Image segmentation is the technique and process of dividing an image into a number of specific regions with distinctive properties and extracting the targets of interest. It is the key step from image processing to image analysis. Existing segmentation methods fall mainly into the following categories: threshold-based, region-based, edge-based, and methods based on specific theories. From a mathematical point of view, image segmentation partitions a digital image into mutually disjoint regions. It is also a labeling process: pixels belonging to the same region are assigned the same label.
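As a tiny concrete illustration of this labeling view (not from the course; the toy image and the threshold of 128 are made up for the example), threshold-based segmentation can be written in a few lines of NumPy:

import numpy as np

# A toy 4x4 grayscale image
img = np.array([[ 10,  20, 200, 210],
                [ 15,  25, 220, 230],
                [ 12, 180, 190, 240],
                [ 11, 170, 185, 250]], dtype=np.uint8)

# Threshold-based segmentation: pixels in the same region get the same label
labels = (img > 128).astype(np.int32)   # 0 = background region, 1 = foreground region
print(labels)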
Fully Convolutional Networks
What is FCN?
FCN stands for Fully Convolutional Networks: in short, the network is built entirely from convolutions and contains no FC (fully connected) layers.
At its core, semantic segmentation is pixel-level classification.
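To make "pixel-level classification" concrete, here is a minimal sketch (an illustration, not the course's code) using paddle.fluid, the same 1.x dygraph API as the UNet implementation below. The network output is an N x C x H x W score map, and the loss is an ordinary softmax cross-entropy computed independently at every pixel; the batch size, the 8x8 spatial size, and the 59 classes are assumptions chosen to match the class count used later in this article.

import numpy as np
import paddle.fluid as fluid
from paddle.fluid.dygraph import to_variable

with fluid.dygraph.guard():
    num_classes = 59
    # Pretend network output: one score per class for every pixel (N x C x H x W)
    logits = to_variable(np.random.rand(1, num_classes, 8, 8).astype(np.float32))
    # Ground-truth label map: one class index per pixel (N x H x W)
    labels = to_variable(np.random.randint(0, num_classes, (1, 8, 8)).astype(np.int64))

    # Flatten so that every pixel becomes one ordinary classification sample
    logits_flat = fluid.layers.reshape(
        fluid.layers.transpose(logits, [0, 2, 3, 1]), [-1, num_classes])
    labels_flat = fluid.layers.reshape(labels, [-1, 1])

    loss = fluid.layers.softmax_with_cross_entropy(logits_flat, labels_flat)
    loss = fluid.layers.reduce_mean(loss)   # average over all pixels
    print(loss.numpy())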
Relationship with VGG
The classic VGG16 network, with its 2-2-3-3-3 arrangement of convolutional blocks, performs very well on image classification.
FCN builds on this VGG backbone but replaces the FC layers with convolutional layers (1x1 convolutions for the class-scoring layers), so the output remains a spatial feature map rather than a flat vector. Combined with upsampling, this lets the network produce a dense prediction whose spatial size matches the input, i.e. input.shape == output.shape.
The input image is passed through the VGG16 backbone to obtain a feature map, which is then upsampled back to the input resolution. Each pixel of the prediction is compared with the corresponding pixel of the ground truth, so the segmentation problem becomes a per-pixel classification problem, and classification is exactly what deep learning is good at. If the final feature map were simply upsampled (or deconvolved) in one step, a great deal of spatial detail would be lost. FCN's solution is to fuse the final feature map with the pool4 and pool3 features. Since pool3, pool4, and the final feature map have different sizes, they must first be upsampled to a common size before fusion. In FCN this fusion is an element-wise addition of the score maps (UNet, introduced below, concatenates instead).
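The two ingredients above can be sketched in a few lines of paddle.fluid (again an illustration under assumed layer sizes, not the FCN reference implementation): a 1x1 convolution plays the role of the former FC classifier so that spatial structure is preserved, and a transposed convolution upsamples the coarse score map so it can be added element-wise to a score map from an earlier pooling stage.

import numpy as np
import paddle.fluid as fluid
from paddle.fluid.dygraph import to_variable, Conv2D, Conv2DTranspose

with fluid.dygraph.guard():
    num_classes = 59
    # "FC as conv": a 1x1 convolution scores every spatial position,
    # so a 512 x 16 x 16 feature map becomes a 59 x 16 x 16 score map
    score_head = Conv2D(num_channels=512, num_filters=num_classes, filter_size=1)
    feat = to_variable(np.random.rand(1, 512, 16, 16).astype(np.float32))
    score = score_head(feat)                      # [1, 59, 16, 16]

    # FCN-style fusion: upsample the coarse score map 2x with a transposed
    # convolution, then add it element-wise to a score map from an earlier pool layer
    up2x = Conv2DTranspose(num_channels=num_classes, num_filters=num_classes,
                           filter_size=2, stride=2)
    pool4_score = to_variable(np.random.rand(1, num_classes, 32, 32).astype(np.float32))
    fused = fluid.layers.elementwise_add(up2x(score), pool4_score)   # [1, 59, 32, 32]
    print(fused.shape)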
The UNet Network
Relationship with FCN
UNet builds on the same idea as FCN: the input is downsampled and then upsampled back, so UNet is also a fully convolutional network. To address FCN's relatively coarse segmentation results, however, UNet uses a carefully designed architecture that improves accuracy.
The network structure is shaped like the letter "U", which is where the name UNet comes from, and it divides naturally into two parts: feature extraction (an encoder similar to VGG) and upsampling (a decoder).
At every upsampling step, the decoder features are fused with the encoder feature map of the corresponding scale, which has the same number of channels. Compared with FCN, the resulting segmentation is much finer; UNet also handles large images well and performs strongly in medical imaging.
Implementing a UNet Model in Paddle (1.x fluid dygraph API)
import numpy as np
import paddle
import paddle.fluid as fluid
from paddle.fluid.dygraph import to_variable
from paddle.fluid.dygraph import Layer
from paddle.fluid.dygraph import Conv2D
from paddle.fluid.dygraph import BatchNorm
from paddle.fluid.dygraph import Pool2D
from paddle.fluid.dygraph import Conv2DTranspose
class Encoder(Layer):
    def __init__(self, num_channels, num_filters):
        super(Encoder, self).__init__()
        # Encoder block: (3x3 conv + BN + ReLU) x 2, then a 2x2 max pool.
        # forward() returns the features both before and after pooling,
        # because the pre-pool features are reused as the skip connection.
        self.conv1 = Conv2D(num_channels,
                            num_filters,
                            filter_size=3,
                            stride=1,
                            padding=1)
        self.bn1 = BatchNorm(num_filters, act='relu')
        self.conv2 = Conv2D(num_filters,
                            num_filters,
                            filter_size=3,
                            stride=1,
                            padding=1)
        self.bn2 = BatchNorm(num_filters, act='relu')
        self.pool = Pool2D(pool_size=2, pool_stride=2, pool_type='max', ceil_mode=True)

    def forward(self, inputs):
        x = self.conv1(inputs)
        x = self.bn1(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x_pooled = self.pool(x)
        return x, x_pooled
class Decoder(Layer):
    def __init__(self, num_channels, num_filters):
        super(Decoder, self).__init__()
        # Decoder block: a 2x2 transposed conv doubles the spatial size,
        # then (3x3 conv + BN + ReLU) x 2 after concatenation with the skip features.
        self.up = Conv2DTranspose(num_channels=num_channels,
                                  num_filters=num_filters,
                                  filter_size=2,
                                  stride=2)
        # After concatenation the channel count is num_filters (upsampled)
        # + num_filters (skip) = num_channels, so conv1 takes num_channels inputs.
        self.conv1 = Conv2D(num_channels,
                            num_filters,
                            filter_size=3,
                            stride=1,
                            padding=1)
        self.bn1 = BatchNorm(num_filters, act='relu')
        self.conv2 = Conv2D(num_filters,
                            num_filters,
                            filter_size=3,
                            stride=1,
                            padding=1)
        self.bn2 = BatchNorm(num_filters, act='relu')

    def forward(self, inputs_prev, inputs):
        # Upsample, pad to align with the skip features, concatenate along the
        # channel axis, then apply the two conv + BN + ReLU blocks.
        x = self.up(inputs)
        h_diff = (inputs_prev.shape[2] - x.shape[2])
        w_diff = (inputs_prev.shape[3] - x.shape[3])
        x = fluid.layers.pad2d(x, paddings=[h_diff // 2, h_diff - h_diff // 2,
                                            w_diff // 2, w_diff - w_diff // 2])
        x = fluid.layers.concat([inputs_prev, x], axis=1)
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.conv2(x)
        x = self.bn2(x)
        return x
class UNet(Layer):
    def __init__(self, num_classes=59):
        super(UNet, self).__init__()
        # Encoder channels: 3 -> 64 -> 128 -> 256 -> 512
        # Middle layers: 512 -> 1024 -> 1024 (two 1x1 conv + BN + ReLU)
        # Decoders mirror the encoders; a final 1x1 conv maps to num_classes scores.
        self.down1 = Encoder(num_channels=3, num_filters=64)
        self.down2 = Encoder(num_channels=64, num_filters=128)
        self.down3 = Encoder(num_channels=128, num_filters=256)
        self.down4 = Encoder(num_channels=256, num_filters=512)

        self.mid_conv1 = Conv2D(512, 1024, filter_size=1, padding=0, stride=1)
        self.mid_bn1 = BatchNorm(1024, act='relu')
        self.mid_conv2 = Conv2D(1024, 1024, filter_size=1, padding=0, stride=1)
        self.mid_bn2 = BatchNorm(1024, act='relu')

        self.up4 = Decoder(1024, 512)
        self.up3 = Decoder(512, 256)
        self.up2 = Decoder(256, 128)
        self.up1 = Decoder(128, 64)
        self.last_conv = Conv2D(num_channels=64, num_filters=num_classes, filter_size=1)

    def forward(self, inputs):
        # The print calls show intermediate shapes and are kept for teaching purposes.
        x1, x = self.down1(inputs)
        print(x1.shape, x.shape)
        x2, x = self.down2(x)
        print(x2.shape, x.shape)
        x3, x = self.down3(x)
        print(x3.shape, x.shape)
        x4, x = self.down4(x)
        print(x4.shape, x.shape)

        # middle layers
        x = self.mid_conv1(x)
        x = self.mid_bn1(x)
        x = self.mid_conv2(x)
        x = self.mid_bn2(x)

        print(x4.shape, x.shape)
        x = self.up4(x4, x)
        print(x3.shape, x.shape)
        x = self.up3(x3, x)
        print(x2.shape, x.shape)
        x = self.up2(x2, x)
        print(x1.shape, x.shape)
        x = self.up1(x1, x)
        print(x.shape)
        x = self.last_conv(x)
        return x
def main():
    with fluid.dygraph.guard(fluid.CUDAPlace(0)):
        model = UNet(num_classes=59)
        x_data = np.random.rand(1, 3, 123, 123).astype(np.float32)
        inputs = to_variable(x_data)
        pred = model(inputs)
        print(pred.shape)


if __name__ == "__main__":
    main()
References
PaddlePaddle official course
Community-compiled notes