posted on 2021-05-23, 18:43authored byAdam W. Harley
This thesis introduces a method to both obtain segmentation information and integrate it uniformly within a convolutional neural network (CNN). This counter-acts the tendency of CNNs to produce smooth predictions, which is undesirable for pixel-wise prediction tasks, such as semantic segmentation. The segmentation information is obtained by a form of metric learning, where a CNN learns to compute pixel embeddings that reflect whether any pair of pixels is likely to belong to the same region. This information is then used within a larger network, to replace all convolutions with foreground-focused convolutions, where the foreground is determined adaptively at each image point by local embeddings. The resulting network is called a segmentation-aware CNN, because the network can change its behaviour at each image location according to local segmentation cues. The proposed method yields systematic improvements on a standard semantic segmentation benchmark when compared to a strong baseline.