I mainly focus on semantic segmentation and potentially related papers.
semantic segmentation
A Dataset for Lane Instance Segmentation in Urban Environments
a dataset paper
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
uses a spatial path and a context path for real-time segmentation
ICNet for Real-Time Semantic Segmentation on High-Resolution Images
targets real-time segmentation
Multi-Scale Context Intertwining for Semantic Segmentation
a novel scheme for aggregating features from different scales, referred to as Multi-Scale Context Intertwining (MSCI).
merges pairs of feature maps in a bidirectional and recurrent fashion, via connections between two LSTM chains. By training the parameters of the LSTM units on the segmentation task, the approach learns how to extract effective features for pixel-level semantic segmentation, which are then combined hierarchically.
Efficient Semantic Scene Completion Network with Spatial Group Convolution
Spatial Group Convolution (SGC) for accelerating the computation of 3D dense prediction tasks. SGC is orthogonal to group convolution: it works on the spatial dimensions rather than the feature channel dimension. It divides input voxels into different groups, then conducts 3D sparse convolution on these separate groups.
Adaptive Affinity Fields for Semantic Segmentation
proposes the concept of Adaptive Affinity Fields (AAF) to capture and match the semantic relations between neighbouring pixels in the label space
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
Spatial pyramid pooling module or encoder-decoder structure
Encoder-Decoder with Atrous Convolution
Depthwise separable convolution:
DeepLabv3 as encoder
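
The two building blocks named above can be illustrated with a small sketch. It is not the DeepLabv3+ implementation; the function names are hypothetical, bias terms are ignored, and the atrous example is 1D for brevity. It shows why a depthwise separable convolution needs fewer weights than a standard one, and how the atrous rate widens the span a kernel covers without adding weights.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """One k x k filter per input channel, then a 1 x 1 pointwise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def atrous_effective_kernel(k, rate):
    """Span covered by a k-tap kernel with (rate - 1) holes between taps."""
    return k + (k - 1) * (rate - 1)


def atrous_conv1d(signal, kernel, rate):
    """Valid-mode 1D atrous (dilated) correlation with the given rate."""
    span = atrous_effective_kernel(len(kernel), rate)
    return [
        sum(w * signal[start + i * rate] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


if __name__ == "__main__":
    # Illustrative sizes only (assumed, not taken from the paper).
    print(standard_conv_params(256, 256, 3))
    print(depthwise_separable_params(256, 256, 3))
    print(atrous_conv1d([1, 2, 3, 4, 5, 6], [1, 0, -1], rate=2))
```

With rate 1 the atrous function reduces to an ordinary valid-mode correlation, which is a quick way to sanity-check it.
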
implementation
MVTec D2S: Densely Segmented Supermarket Dataset
a dataset paper
a novel benchmark for instance-aware semantic segmentation in an industrial domain. It contains 21 000 high-resolution images with pixel-wise labels of all object instances. The objects comprise groceries and everyday products from 60 categories.
Predicting Future Instance Segmentation by Forecasting Convolutional Features
ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
for semantic segmentation of high-resolution images under resource constraints. ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power.
Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation
a novel loss function, i.e., Conservative Loss, which penalizes the extremely good and bad cases while encouraging the moderate examples. More specifically, it enables the network to learn features that are discriminative via gradient descent and invariant to the change of domains via gradient ascent.
Affinity Derivation and Graph Merge for Instance Segmentation
In our scheme, we use two neural networks with similar structures. One predicts the pixel-level semantic score and the other is designed to derive pixel affinities. Regarding pixels as the vertices and affinities as edges, we then propose a simple yet effective graph merge algorithm to cluster pixels into instances.
ExFuse: Enhancing Feature Fusion for Semantic Segmentation
In this paper, we first point out that a simple fusion of low-level and high-level features could be less effective because of the gap in semantic levels and spatial resolution. We find that introducing semantic information into low-level features and high-resolution details into high-level features is more effective for the later fusion. Based on this observation, we propose a new framework, named ExFuse, to bridge the gap between low-level and high-level features and thus significantly improve the segmentation quality, by 4.0% in total.
Deep Clustering for Unsupervised Learning of Visual Features
a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network.
Bidirectional Feature Pyramid Network with Recurrent Attention Residual Modules for Shadow Detection
First, a recurrent attention residual (RAR) module combines the contexts in two adjacent CNN layers and learns an attention map to select a residual and then refine the context features. Second, we develop a bidirectional feature pyramid network (BFPN) to aggregate shadow contexts spanned across different CNN layers by deploying two series of RAR modules in the network to iteratively combine and refine context features: one series to refine context features from deep to shallow layers, and another series from shallow to deep layers.
This better suppresses false detections and enhances shadow details at the same time.
Multi-scale Residual Network for Image Super-Resolution
a novel multi-scale residual network (MSRN) to fully exploit the image features, which outperforms most of the state-of-the-art methods. Based on the residual block, we introduce convolution kernels of different sizes to adaptively detect the image features at different scales. Meanwhile, we let these features interact with each other to get the most efficacious image information; we call this structure the Multi-scale Residual Block (MSRB). Furthermore, the outputs of each MSRB are used as the hierarchical features for global feature fusion.
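
The idea of running kernels of different sizes in parallel inside a residual block can be sketched in a toy 1D, single-channel form. This is a simplification under stated assumptions, not the MSRN code: the function names, the odd-length kernels, and the fusion weights are all hypothetical, and the cross-branch interaction mentioned in the note is reduced to a single weighted fusion step.

```python
def conv1d_same(signal, kernel):
    """1D correlation with zero padding; assumes an odd-length kernel."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values):
    """Element-wise rectifier."""
    return [max(0.0, v) for v in values]


def multi_scale_residual_block(x, small_kernel, large_kernel, fuse_weights):
    """Toy block: two kernel sizes in parallel, a weighted fusion, then a skip."""
    small = relu(conv1d_same(x, small_kernel))  # fine-scale branch
    large = relu(conv1d_same(x, large_kernel))  # coarse-scale branch
    w_small, w_large = fuse_weights
    # Plays the role of a 1 x 1 convolution over the two branch outputs.
    fused = [w_small * s + w_large * l for s, l in zip(small, large)]
    # Residual connection: add the block input back.
    return [f + xi for f, xi in zip(fused, x)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    out = multi_scale_residual_block(
        x, [0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0], (0.5, 0.5)
    )
    print(out)
```

Setting both fusion weights to zero turns the block into an identity mapping, which is the property the residual connection is meant to preserve.
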
- Predicting Future Instance Segmentation by Forecasting Convolutional Features
For video next frame prediction