Paper / Key Ideas
[2019] Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks
  • produce a mask focused on interpretability
  • find the smallest image region that must be retained to preserve (or deleted to change) the model output
  • fine-grained visual explanations, without smoothing or regularisation
[2019] Learning Instance Activation Maps for Weakly Supervised Instance Segmentation
  • image and class labels only -> segmentation mask
  • use PRM to collect pseudo ground-truth masks, then learn to predict a class-agnostic activation map for each instance in the image
  • instance extent filling module + Instance Activation Map, to extract spatial layout
  • convolutional CRF post-processing to extract finely detailed boundaries
[2019] Weakly Supervised Semantic Image Segmentation with Self-correcting Networks
  • image, bounding box labels and small set of segmentation labels -> segmentation mask
  • there are 3 models:
    • primary (DeepLabv3+): image->segmentation
    • ancillary (DeepLabv3+): image+box->segmentation
    • self-correct: refine segmentation for both models, concat both results and feed into 3x3 conv
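A minimal sketch of the fusion step noted above (concatenate both predictions, then a 3x3 convolution), assuming per-class logit maps of shape (classes, H, W). The function names `conv3x3` and `self_correct` are hypothetical, and the weights are hand-set to a plain average for illustration; in a trained model they would be learned.

```python
import numpy as np

def conv3x3(x, weight, bias):
    """3x3 convolution with zero padding. x: (C_in, H, W); weight: (C_out, C_in, 3, 3)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for di in range(3):
        for dj in range(3):
            # mix input channels for one kernel tap, accumulated over all 9 taps
            out += np.einsum("oc,chw->ohw",
                             weight[:, :, di, dj],
                             padded[:, di:di + h, dj:dj + w])
    return out + bias[:, None, None]

def self_correct(primary_logits, ancillary_logits, weight, bias):
    """Concatenate both predictions along the channel axis and refine with a 3x3 conv."""
    fused = np.concatenate([primary_logits, ancillary_logits], axis=0)
    return conv3x3(fused, weight, bias)

# Toy example: weights chosen (not learned) so the output is the mean of both inputs
num_classes, h, w = 2, 3, 3
primary = np.ones((num_classes, h, w))
ancillary = 3.0 * np.ones((num_classes, h, w))
weight = np.zeros((num_classes, 2 * num_classes, 3, 3))
for k in range(num_classes):
    weight[k, k, 1, 1] = 0.5                # centre tap, primary channel k
    weight[k, k + num_classes, 1, 1] = 0.5  # centre tap, ancillary channel k
bias = np.zeros(num_classes)
refined = self_correct(primary, ancillary, weight, bias)
```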
[2019] Object Counting and Instance Segmentation with Image-level Supervision
  • image and per-class object counts -> instance segmentation mask and per-class object counts
  • image-level lower-count (ILC) density map estimation approach for object counting
  • performs better than PRM, which is unable to separate adjacent objects of the same class
  • loss function for predicting the global count and spatial distribution of objects
  • ResNet feature extraction followed by an image classification branch and a density branch
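As a rough illustration of how a density branch relates to counting: in density-based counting, the predicted count for a class is the sum of that class's density map. The array layout (classes, H, W) and the helper name `counts_from_density` are assumptions for this sketch, not taken from the paper.

```python
import numpy as np

def counts_from_density(density):
    """density: (num_classes, H, W) -> per-class counts by summing over space."""
    return np.asarray(density, dtype=float).sum(axis=(1, 2))

# Toy density maps (hand-made, not model output)
density = np.zeros((2, 4, 4))
density[0, 1, 1] = 1.0   # one object of class 0
density[1, 0, 0] = 0.5   # one object of class 1 spread over two cells
density[1, 0, 1] = 0.5
density[1, 3, 3] = 1.0   # a second object of class 1
counts = counts_from_density(density)
```

The spatial layout of the mass in each map is what allows adjacent objects of the same class to be told apart, while the sum gives the global count.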
[2018] A weakly-supervised framework for interpretable diabetic retinopathy detection on retinal images
  • multi-lesion detection with weakly supervised learning with just class label (presence or absence of disease)
  • Bag of Visual Words (a multiple instance learning framework) to extract weak image features
[2018] Visual Feature Attribution using Wasserstein GANs [code]
  • Wasserstein generative adversarial networks, a generative model in which the additive map is learned as a function of the images, predict subject-specific disease effect maps
  • if multiple locations in an image are the cause of a classification, some of them may not influence the classification output and are thus not highlighted
  • Wasserstein GAN to generate feature maps
  • U-Net to learn map generator's features
[2018] Learning to Segment Every Thing
  • partially supervised segmentation model to segment new categories (providing bounding box), and a small set of existing categories (providing bounding box+segmentations)
  • segmentation transfer learning with Mask R-CNN, to train segmentation for new categories
  • first train with existing category, bounding box to mask, then predict a category's mask parameters from its bounding box parameters using a generic, category-agnostic weight transfer function
[2018] An Unsupervised Approach for Extraction of Blood Vessels from Fundus Images
  • unsupervised iterative method is proposed for extraction of blood vessels
  • select green channel to reveal maximum contrast
  • perform gamma correction for brightness
  • some variant of adaptive histogram equalization to remove noise and reveal hidden features
  • local adaptive minmax threshold to separate foreground and background
  • a pixel is marked as vessel when at least 6 of its neighbours are marked as vessel
  • unable to handle connectivity, leading to inaccurate segmentation
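The steps above can be sketched roughly as follows. This is an illustration under several assumptions, not the paper's implementation: the gamma value, the 3x3 window, the midpoint rule for the min-max threshold, the assumption that vessels are darker than the background in the green channel, and the reading of the 6-neighbour rule are all hypothetical choices. The histogram equalization step is omitted.

```python
import numpy as np

def green_channel(rgb):
    """Take the green channel of an (H, W, 3) uint8 image, scaled to [0, 1]."""
    return np.asarray(rgb, dtype=float)[..., 1] / 255.0

def gamma_correct(img, gamma=0.8):
    """Power-law brightness adjustment; gamma < 1 brightens (value is hypothetical)."""
    return np.clip(img, 0.0, 1.0) ** gamma

def local_minmax_threshold(img, window=3):
    """Flag pixels darker than the midpoint of the local min and max (assumed rule)."""
    r = window // 2
    padded = np.pad(img, r, mode="edge")
    out = np.zeros(img.shape, dtype=bool)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            patch = padded[i:i + window, j:j + window]
            out[i, j] = img[i, j] < 0.5 * (patch.min() + patch.max())
    return out

def neighbour_filter(mask, min_neighbours=6):
    """Mark a pixel as vessel when at least `min_neighbours` of its 8 neighbours are."""
    h, w = mask.shape
    m = np.pad(mask.astype(int), 1)  # zero border so edge pixels see fewer neighbours
    counts = np.zeros((h, w), dtype=int)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) != (0, 0):
                counts += m[1 + di:1 + di + h, 1 + dj:1 + dj + w]
    return counts >= min_neighbours

# Toy 3x3 patch: one dark pixel on a bright background
toy = np.array([[0.9, 0.9, 0.9],
                [0.9, 0.1, 0.9],
                [0.9, 0.9, 0.9]])
candidates = local_minmax_threshold(gamma_correct(toy))
```

Because every decision here is purely local, nothing in the sketch enforces that vessel pixels form connected structures, which is consistent with the connectivity limitation noted above.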
[2018] Weakly Supervised Instance Segmentation using Class Peak Response
  • image and class labels only -> segmentation mask
  • enables classification networks to exploit class peak responses for instance mask extraction
  • Peak Response Maps (PRMs), stimulate peaks to emerge from a class response map
  • peaks are effectively mapped to highly informative regions of each object instance and its boundaries
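A small sketch of the peak localisation idea only: peaks are taken here as local maxima of a 2-D class response map. The window size, the `min_value` cut-off, and the function name `find_peaks` are assumptions for illustration; the step that maps each peak back to image regions is not shown.

```python
import numpy as np

def find_peaks(response, window=3, min_value=0.0):
    """Return (row, col) positions of local maxima in a 2-D class response map."""
    response = np.asarray(response, dtype=float)
    r = window // 2
    # pad with -inf so border pixels are only compared against real neighbours
    padded = np.pad(response, r, mode="constant", constant_values=-np.inf)
    peaks = []
    h, w = response.shape
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + window, j:j + window]
            value = response[i, j]
            if value > min_value and value == patch.max():
                peaks.append((i, j))
    return peaks

# Toy class response map with two separated responses
crm = np.array([[0.1, 0.2, 0.1, 0.0],
                [0.2, 0.9, 0.2, 0.1],
                [0.1, 0.2, 0.1, 0.7],
                [0.0, 0.1, 0.3, 0.2]])
peaks = find_peaks(crm)
```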
[2017] Rethinking Atrous Convolution for Semantic Image Segmentation
  • removed CRF post-processing
  • atrous convolution modules explored in cascade as well as in parallel; the parallel (ASPP) module uses one 1x1 and three 3x3 convolutions, which are concatenated and passed through a 1x1 convolution
  • include batch normalisation
  • include data augmentation, random scaling and flipping
  • keep the ground truth intact (previously downsampled) and instead upsample the final logits
[2017] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
  • image+segment mask -> improve segment
  • build on DeepLabv1, add ASPP (atrous spatial pyramid pooling), to extract objects at multiple scales
[2017] Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images
  • fine-tune class activation maps (CAM) to extract fine-grained segments
  • heatmaps from CAM are normalised to the range 0-1, and binary segments are selected using a threshold value
  • unable to segment very small regions (a few pixels)
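The normalise-then-threshold step can be sketched as below. The min-max normalisation, the threshold of 0.5, and the helper name `cam_to_binary_mask` are assumptions for illustration; the paper's actual threshold is not recorded in these notes.

```python
import numpy as np

def cam_to_binary_mask(cam, threshold=0.5):
    """Min-max normalise a CAM heatmap to [0, 1], then threshold it."""
    cam = np.asarray(cam, dtype=float)
    lo, hi = cam.min(), cam.max()
    if hi == lo:
        # flat heatmap: nothing stands out, return an empty mask
        return np.zeros(cam.shape, dtype=bool)
    norm = (cam - lo) / (hi - lo)
    return norm >= threshold

# Toy heatmap (hand-made values)
heatmap = np.array([[0.0, 2.0],
                    [4.0, 8.0]])
mask = cam_to_binary_mask(heatmap, threshold=0.5)
```

A single global threshold on a coarse heatmap tends to drop weak, small responses, which fits the limitation noted above.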
[2014] Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
  • image+segment mask -> improve segment
  • employ atrous convolution to solve signal resolution downsampling (max-pool and striding), to extract dense features and enlarge the field-of-view
  • employ conditional random field to solve spatial insensitivity, to capture object boundaries
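To illustrate how atrous convolution enlarges the field-of-view without extra weights, here is a 1-D sketch. It computes a cross-correlation (no kernel flip, as is usual in deep learning) with no padding; the function name `atrous_conv1d` and the toy inputs are hypothetical.

```python
import numpy as np

def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution: kernel taps are `rate` samples apart."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    span = (len(kernel) - 1) * rate + 1  # field-of-view covered by the dilated kernel
    out_len = len(signal) - span + 1
    out = np.zeros(out_len)
    for i in range(out_len):
        # sample the input every `rate` steps; the number of weights is unchanged
        out[i] = np.dot(signal[i:i + span:rate], kernel)
    return out

x = np.arange(10.0)
k = np.array([1.0, 1.0, 1.0])
dense = atrous_conv1d(x, k, rate=1)    # field-of-view of 3 samples
dilated = atrous_conv1d(x, k, rate=2)  # field-of-view of 5 samples, same 3 weights
```

With `rate=2` each output sees 5 input samples instead of 3, so a wider context is covered at full resolution instead of relying on max-pooling or striding.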