Segmentation Results: VOC2012 BETA

Competition "comp5" (train on VOC2012 data)

This leaderboard shows only those submissions that have been marked as public, and so the displayed rankings should not be considered as definitive.

Average Precision (AP %)

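method | mean | aeroplane | bicycle | bird | boat | bottle | bus | car | cat | chair | cow | dining table | dog | horse | motorbike | person | potted plant | sheep | sofa | train | tv/monitor | submission date
XC-FLATTENET | 84.3 | 94.0 | 73.2 | 91.5 | 74.4 | 83.2 | 95.5 | 90.5 | 96.7 | 38.1 | 94.5 | 76.2 | 92.9 | 95.6 | 88.9 | 90.4 | 76.0 | 93.8 | 63.6 | 86.8 | 78.6 | 12-Jan-2020
aux_0.4_finetuning_train_checkpoints | 81.1 | 94.3 | 43.6 | 95.3 | 73.7 | 83.7 | 95.3 | 90.7 | 94.7 | 39.6 | 93.6 | 54.2 | 90.2 | 93.6 | 90.2 | 87.4 | 75.9 | 92.6 | 56.6 | 88.0 | 75.5 | 16-Jun-2020
AAA_HEM | 82.9 | 94.5 | 73.6 | 92.4 | 66.5 | 81.7 | 96.3 | 90.0 | 95.3 | 36.2 | 92.1 | 68.1 | 92.4 | 93.2 | 90.5 | 88.9 | 74.1 | 93.9 | 58.5 | 89.7 | 76.6 | 11-Apr-2020
acm | 79.0 | 90.9 | 41.6 | 91.1 | 65.2 | 80.7 | 93.6 | 90.0 | 92.7 | 39.4 | 88.6 | 60.8 | 88.6 | 89.7 | 88.8 | 85.6 | 72.1 | 89.4 | 58.2 | 84.9 | 72.2 | 13-Jun-2020
FDNet_16s | 84.0 | 95.4 | 77.9 | 95.9 | 69.1 | 80.6 | 96.4 | 92.6 | 95.5 | 40.5 | 92.6 | 70.6 | 93.8 | 93.1 | 90.4 | 89.9 | 71.2 | 92.7 | 63.1 | 88.5 | 77.7 | 22-Mar-2018
Puzzle-CAM | 72.3 | 87.2 | 37.4 | 86.8 | 61.5 | 71.3 | 92.2 | 86.3 | 91.8 | 28.6 | 85.1 | 64.2 | 91.9 | 82.1 | 82.6 | 70.7 | 69.4 | 87.7 | 45.5 | 67.0 | 37.8 | 02-Feb-2021
WASPNet+CRF | 79.6 | 90.7 | 60.9 | 85.3 | 68.9 | 80.7 | 93.8 | 84.7 | 94.5 | 38.5 | 86.0 | 69.7 | 90.7 | 86.9 | 85.4 | 86.8 | 67.6 | 88.8 | 57.4 | 85.4 | 74.2 | 19-Nov-2019
WASPNet | 79.4 | 89.5 | 63.8 | 87.6 | 68.8 | 79.1 | 93.5 | 84.7 | 93.7 | 37.6 | 84.8 | 69.4 | 90.8 | 87.9 | 85.7 | 86.5 | 67.4 | 87.0 | 57.2 | 85.2 | 72.7 | 25-Jul-2019
Unet_Attention | 78.0 | 93.0 | 67.9 | 83.6 | 65.9 | 78.6 | 92.4 | 87.3 | 90.0 | 30.5 | 82.5 | 65.8 | 86.2 | 83.9 | 86.7 | 85.3 | 66.2 | 86.8 | 55.7 | 82.1 | 73.1 | 17-Jan-2022
FLATTENET-001 | 83.1 | 95.4 | 69.8 | 86.4 | 73.3 | 78.7 | 94.6 | 92.1 | 95.5 | 41.0 | 92.4 | 75.8 | 93.1 | 95.2 | 89.8 | 90.0 | 65.8 | 94.0 | 60.0 | 86.6 | 79.6 | 29-Dec-2019
refinenet_HPM | 74.2 | 87.9 | 62.2 | 76.0 | 55.3 | 76.0 | 86.7 | 82.6 | 85.4 | 28.9 | 79.6 | 64.2 | 81.2 | 79.4 | 85.1 | 85.0 | 65.6 | 83.2 | 51.3 | 79.2 | 68.6 | 01-Mar-2019
FS+WSSS | 81.6 | 96.3 | 48.8 | 92.8 | 83.9 | 82.1 | 93.1 | 94.4 | 94.5 | 48.6 | 96.9 | 45.9 | 94.1 | 96.1 | 93.7 | 88.7 | 63.5 | 93.9 | 63.9 | 92.8 | 54.8 | 26-Jun-2023
DCONV_SSD_FCN | 72.9 | 88.6 | 38.1 | 85.2 | 57.8 | 71.4 | 90.8 | 84.2 | 86.0 | 32.1 | 83.4 | 53.7 | 80.4 | 80.8 | 81.6 | 81.2 | 61.4 | 84.1 | 51.9 | 77.5 | 67.0 | 17-Mar-2018
CDL_new | 73.1 | 89.4 | 38.9 | 87.4 | 63.6 | 75.1 | 90.7 | 83.6 | 89.2 | 35.8 | 83.4 | 42.7 | 85.0 | 87.6 | 82.2 | 81.5 | 57.4 | 85.9 | 48.2 | 82.1 | 53.2 | 29-Jun-2022
weight+RS | 64.5 | 85.0 | 31.9 | 85.4 | 19.1 | 65.3 | 88.6 | 72.9 | 88.5 | 24.8 | 75.6 | 50.1 | 83.3 | 82.0 | 81.6 | 66.8 | 56.6 | 80.3 | 45.8 | 44.6 | 39.7 | 24-Mar-2019
GDM | 74.5 | 85.8 | 40.2 | 81.6 | 61.8 | 67.3 | 90.8 | 84.2 | 91.8 | 37.3 | 80.2 | 62.7 | 85.2 | 86.4 | 85.9 | 81.7 | 56.5 | 86.6 | 56.7 | 79.7 | 69.4 | 04-Mar-2020
DeeplabV3+_Exploring_Missing_Parts | 68.8 | 88.7 | 37.1 | 87.6 | 57.9 | 64.3 | 85.9 | 73.2 | 85.4 | 24.7 | 84.2 | 29.4 | 81.8 | 90.2 | 81.8 | 77.3 | 55.5 | 88.0 | 34.9 | 74.6 | 51.3 | 15-Jul-2021
TCnet | 68.4 | 72.6 | 32.6 | 74.2 | 59.5 | 68.9 | 86.7 | 77.1 | 78.6 | 34.4 | 68.6 | 63.0 | 74.4 | 75.8 | 76.3 | 77.0 | 54.9 | 76.6 | 55.5 | 76.9 | 61.7 | 02-May-2018
Deeplab-clims-r38 | 71.2 | 82.1 | 34.0 | 85.7 | 58.2 | 70.5 | 87.0 | 82.3 | 86.3 | 30.8 | 80.9 | 59.4 | 83.1 | 80.7 | 80.6 | 78.4 | 54.5 | 85.3 | 59.5 | 69.8 | 53.6 | 25-Oct-2022
Progressive Framework | 69.2 | 83.4 | 35.7 | 82.5 | 46.0 | 66.0 | 88.4 | 79.1 | 90.9 | 28.6 | 82.4 | 44.3 | 85.9 | 84.7 | 78.5 | 79.1 | 53.4 | 83.3 | 44.5 | 61.3 | 63.4 | 06-Jul-2021
DSRG_ATTNBN | 63.0 | 76.9 | 32.3 | 72.9 | 49.0 | 59.2 | 77.7 | 75.4 | 76.7 | 19.5 | 71.5 | 27.9 | 74.3 | 73.6 | 77.0 | 72.8 | 52.7 | 76.4 | 40.3 | 73.5 | 52.7 | 28-Feb-2020
AD_CLIMS | 69.8 | 81.0 | 33.0 | 88.1 | 60.6 | 69.0 | 87.9 | 81.7 | 88.9 | 27.5 | 82.5 | 60.2 | 85.8 | 83.5 | 78.9 | 34.2 | 52.1 | 80.5 | 58.5 | 79.1 | 61.9 | 03-Mar-2023
AttnBN | 63.0 | 75.7 | 32.9 | 73.5 | 49.9 | 60.4 | 78.1 | 76.5 | 77.4 | 19.9 | 72.0 | 27.4 | 73.8 | 72.7 | 77.2 | 72.3 | 51.2 | 77.3 | 37.9 | 73.5 | 53.6 | 14-Aug-2019
CGPT | 70.8 | 77.0 | 32.1 | 84.1 | 56.4 | 66.5 | 87.6 | 83.1 | 87.8 | 29.8 | 81.5 | 59.6 | 84.9 | 84.0 | 80.2 | 77.3 | 50.4 | 81.4 | 61.2 | 72.8 | 57.6 | 22-May-2023
CGPT_IMN | 70.4 | 79.0 | 32.1 | 85.5 | 58.9 | 65.5 | 86.4 | 81.2 | 87.1 | 27.1 | 82.4 | 57.0 | 85.0 | 82.6 | 79.7 | 76.1 | 49.3 | 83.5 | 58.4 | 73.2 | 57.5 | 22-May-2023
UMICH_TCS | 65.5 | 73.6 | 32.4 | 81.6 | 50.4 | 68.5 | 86.2 | 79.4 | 81.8 | 28.2 | 75.5 | 55.6 | 79.0 | 75.5 | 77.7 | 74.3 | 48.2 | 79.0 | 52.3 | 44.4 | 42.7 | 28-Nov-2019
UMICH_EG-ConvCRF_Iter_Res101 | 67.9 | 76.4 | 32.1 | 82.0 | 52.9 | 69.5 | 89.5 | 81.1 | 87.5 | 29.6 | 74.7 | 54.2 | 83.4 | 79.2 | 78.7 | 76.1 | 48.1 | 81.0 | 59.4 | 51.3 | 49.3 | 13-Dec-2019
bothweight th0.4 | 65.3 | 79.1 | 33.4 | 88.2 | 20.1 | 65.3 | 88.0 | 76.2 | 90.0 | 24.7 | 80.7 | 43.7 | 85.1 | 85.8 | 82.3 | 69.8 | 47.7 | 84.9 | 43.8 | 41.4 | 54.5 | 15-Apr-2019
UMICH_TCS_101 | 66.7 | 74.6 | 31.5 | 81.5 | 50.3 | 67.2 | 88.4 | 80.6 | 82.3 | 29.9 | 72.3 | 53.4 | 76.1 | 76.0 | 78.4 | 73.6 | 47.5 | 79.2 | 52.0 | 57.9 | 56.6 | 01-Dec-2019
attention RRM | 69.2 | 85.2 | 31.2 | 86.6 | 49.3 | 71.9 | 85.5 | 76.1 | 88.8 | 32.6 | 76.8 | 64.0 | 84.5 | 82.6 | 76.9 | 74.3 | 47.5 | 82.7 | 57.8 | 42.4 | 65.6 | 12-Mar-2021
Extended | 59.3 | 77.9 | 28.9 | 75.1 | 42.6 | 55.2 | 70.4 | 58.9 | 53.0 | 24.3 | 66.7 | 51.9 | 73.1 | 71.3 | 72.5 | 63.9 | 45.2 | 59.2 | 43.9 | 65.2 | 58.6 | 29-Aug-2018
deeplabv3_plus_reproduction | 69.5 | 81.7 | 38.1 | 83.1 | 60.1 | 62.7 | 89.8 | 81.4 | 87.6 | 30.0 | 72.8 | 60.9 | 78.1 | 77.7 | 78.6 | 78.1 | 44.9 | 76.7 | 50.8 | 75.4 | 58.6 | 11-May-2019
UMICH_EG-ConvCRF_Iter_Res50 | 66.4 | 75.8 | 32.2 | 85.5 | 50.9 | 69.3 | 86.4 | 79.6 | 85.1 | 29.1 | 73.7 | 55.7 | 79.6 | 74.7 | 76.3 | 75.8 | 44.8 | 79.6 | 51.0 | 50.5 | 48.5 | 10-Dec-2019
CDL_VWL | 71.7 | 87.6 | 33.7 | 89.8 | 60.1 | 68.3 | 91.4 | 83.1 | 89.9 | 33.6 | 78.3 | 62.0 | 84.4 | 82.5 | 82.5 | 78.7 | 42.7 | 81.4 | 60.0 | 58.6 | 65.3 | 12-Jul-2022
weakly_seg_validation_test | 57.7 | 67.6 | 31.1 | 66.4 | 41.9 | 60.1 | 70.6 | 65.4 | 71.8 | 25.3 | 63.6 | 24.7 | 72.2 | 68.7 | 68.3 | 68.8 | 41.6 | 67.5 | 33.6 | 65.0 | 49.2 | 08-Sep-2019
Progressive Framework | 60.0 | 72.2 | 32.4 | 71.9 | 42.4 | 63.2 | 70.2 | 70.3 | 77.8 | 23.4 | 60.5 | 33.2 | 72.4 | 71.0 | 75.4 | 69.8 | 39.9 | 70.2 | 37.7 | 64.0 | 53.2 | 06-Jul-2021
Progressive Framework | 55.0 | 64.9 | 30.8 | 68.4 | 31.6 | 52.6 | 70.5 | 64.8 | 73.4 | 22.6 | 48.6 | 33.4 | 68.6 | 59.6 | 71.1 | 68.9 | 38.5 | 60.7 | 40.8 | 47.7 | 51.7 | 03-Jul-2021
CDL_new_rib | 69.9 | 85.0 | 34.8 | 87.9 | 59.9 | 75.5 | 90.6 | 81.6 | 89.1 | 33.2 | 81.1 | 39.6 | 83.9 | 82.9 | 81.7 | 74.9 | 37.8 | 82.7 | 57.5 | 58.3 | 58.5 | 29-Jun-2022
NUS_DET_SPR_GC_SP | 47.3 | 52.9 | 31.0 | 39.8 | 44.5 | 58.9 | 60.8 | 52.5 | 49.0 | 22.6 | 38.1 | 27.5 | 47.4 | 52.4 | 46.8 | 51.9 | 35.7 | 55.3 | 40.8 | 54.2 | 47.8 | 23-Sep-2012
O2P_SVRSEGM_CPMC_CSI | 47.5 | 64.0 | 32.2 | 45.9 | 34.7 | 46.3 | 59.5 | 61.7 | 49.4 | 14.8 | 47.9 | 31.2 | 42.5 | 51.3 | 58.8 | 54.6 | 34.9 | 54.6 | 34.7 | 50.6 | 42.2 | 15-Nov-2012
fcn | 51.0 | 57.0 | 6.2 | 55.0 | 34.9 | 51.0 | 69.3 | 67.8 | 66.7 | 13.7 | 46.5 | 47.0 | 54.9 | 52.4 | 59.8 | 64.8 | 34.4 | 58.9 | 35.6 | 58.7 | 47.5 | 26-Apr-2023
BONN_CMBR_O2P_CPMC_LIN | 44.8 | 60.0 | 27.3 | 46.4 | 40.0 | 41.7 | 57.6 | 59.0 | 50.4 | 10.0 | 41.6 | 22.3 | 43.0 | 51.7 | 56.8 | 50.1 | 33.7 | 43.7 | 29.5 | 47.5 | 44.7 | 23-Sep-2012
BONN_O2PCPMC_FGT_SEGM | 47.0 | 65.4 | 29.3 | 51.3 | 33.4 | 44.2 | 59.8 | 60.3 | 52.5 | 13.6 | 53.6 | 32.6 | 40.3 | 57.6 | 57.3 | 49.0 | 33.5 | 53.5 | 29.2 | 47.6 | 37.6 | 23-Sep-2012
BONNGC_O2P_CPMC_CSI | 45.4 | 59.3 | 27.9 | 43.9 | 39.8 | 41.4 | 52.2 | 61.5 | 56.4 | 13.6 | 44.5 | 26.1 | 42.8 | 51.7 | 57.9 | 51.3 | 29.8 | 45.7 | 28.8 | 49.9 | 43.3 | 23-Sep-2012
vgg_unet | 45.8 | 56.3 | 44.6 | 54.6 | 37.0 | 44.1 | 55.8 | 55.2 | 55.8 | 13.1 | 30.7 | 32.8 | 45.4 | 39.9 | 56.9 | 65.9 | 27.3 | 51.1 | 23.3 | 44.4 | 38.7 | 13-Aug-2023
unet_resnet50 | - | 62.9 | 36.3 | 57.0 | 38.9 | 57.0 | 67.6 | 70.9 | 65.9 | - | 24.6 | 39.5 | 61.6 | 39.3 | 59.3 | 65.7 | 26.1 | 49.1 | 29.4 | 56.4 | 51.0 | 13-Aug-2023
comp6_test_cls | 37.7 | 36.6 | 10.8 | 38.9 | 25.9 | 30.8 | 56.0 | 53.8 | 57.8 | 4.9 | 24.6 | 22.1 | 48.1 | 33.1 | 32.6 | 56.1 | 23.5 | 29.7 | 31.8 | 42.7 | 45.6 | 10-May-2018
fcn | 38.4 | 51.6 | - | 11.8 | 23.9 | 36.8 | 65.7 | 55.7 | 57.8 | 3.8 | 33.5 | 41.2 | 40.8 | 39.4 | 42.6 | 52.7 | 12.7 | 26.1 | 26.5 | 56.5 | 42.4 | 26-Apr-2023
OptNBNN-CRF | 11.3 | 10.5 | 2.3 | 3.0 | 3.0 | 1.0 | 30.2 | 14.9 | 15.0 | 0.2 | 6.1 | 2.3 | 5.1 | 12.1 | 15.3 | 23.4 | 0.5 | 8.9 | 3.5 | 10.7 | 5.3 | 23-Sep-2012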
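
For readers who want to see how per-class scores and a mean of this kind can be derived, here is a minimal sketch. It assumes the scores follow the intersection-over-union measure commonly used for VOC segmentation (the page itself labels them "Average Precision"); the function names and the toy confusion matrix are hypothetical and are not taken from the evaluation server. Note that the mean column need not equal the average of the 20 visible class columns: for OptNBNN-CRF the visible values average to roughly 8.7 while the mean column reads 11.3, which would be consistent with an additional, undisplayed class (such as background) entering the mean.

```python
import math


def per_class_scores(confusion):
    """Return one percentage score per class from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j. The score is the assumed
    intersection-over-union: TP / (TP + FP + FN).
    """
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c pixels missed
        fp = sum(confusion[r][c] for r in range(n)) - tp  # other pixels labelled c
        denom = tp + fp + fn
        # A class that never occurs and is never predicted has no defined score.
        scores.append(100.0 * tp / denom if denom else float("nan"))
    return scores


def mean_score(scores):
    """Average the defined scores, skipping undefined (NaN) entries."""
    valid = [s for s in scores if not math.isnan(s)]
    return sum(valid) / len(valid)


if __name__ == "__main__":
    toy = [[50, 10],
           [5, 35]]
    scores = per_class_scores(toy)
    print([round(s, 1) for s in scores], round(mean_score(scores), 1))
```

Skipping undefined entries is only one possible convention; how the server treats the "-" entries in the table is not stated on the page.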

Abbreviations

Title | Method | Affiliation | Contributors | Description | Date
test | AAA_HEM | xiongdeng@stu.xmu.edu.cn | 111 | test | 2020-04-11 03:21:43
Clims with adapter | AD_CLIMS | ETS, Montreal | BM, RH, RB | CLIMS in WSS settings | 2023-03-03 17:57:45
AttnBN | AttnBN | AttnBN | AttnBN | AttnBN | 2019-08-14 23:23:24
O2P Regressor + Composite Statistical Inference | BONNGC_O2P_CPMC_CSI | (1) University of Bonn, (2) Georgia Institute of Technology, (3) University of Coimbra | Joao Carreira (1,3), Fuxin Li (2), Guy Lebanon (2), Cristian Sminchisescu (1) | We utilize a novel probabilistic inference procedure (not yet published), Composite Statistical Inference (CSI), on semantic segmentation using predictions on overlapping figure-ground hypotheses. Regressor predictions on segment overlaps to the ground truth object are modelled as generated by the true overlap with the ground truth segment plus noise. A model of ground truth overlap is defined by parametrizing on the unknown percentage of each superpixel that belongs to the unknown ground truth. A joint optimization on all the superpixels and all the categories is then performed in order to maximize the likelihood of the SVR predictions. The optimization has a tight convex relaxation so solutions can be expected to be close to the global optimum. A fast and optimal search algorithm is then applied to retrieve each object. CSI takes the intuition from the SVRSEGM inference algorithm that multiple predictions on similar segments can be combined to better consolidate the segment mask. But it fully develops the idea by constructing a probabilistic framework and performing composite MLE jointly on all segments and categories. Therefore it is able to consolidate better object boundaries and handle hard cases when objects interact closely and heavily occlude each other. For each image, we use 150 overlapping figure-ground hypotheses generated by the CPMC algorithm (Carreira and Sminchisescu, PAMI 2012), and linear SVR predictions on them with the novel second order O2P features (Carreira, Caseiro, Batista, Sminchisescu, ECCV2012; see VOC12 entry BONN_CMBR_O2P_CPMC_LIN) as the input to the inference algorithm. | 2012-09-23 23:49:02
Linear SVR with second-order pooling. | BONN_CMBR_O2P_CPMC_LIN | (1) University of Bonn, (2) University of Coimbra | Joao Carreira (2,1), Rui Caseiro (2), Jorge Batista (2), Cristian Sminchisescu (1) | We present a novel effective local feature aggregation method that we use in conjunction with an existing figure-ground segmentation sampling mechanism. This submission is described in detail in [1]. We sample multiple figure-ground segmentation candidates per image using the Constrained Parametric Min-Cuts (CPMC) algorithm. SIFT, masked SIFT and LBP features are extracted on the whole image, then pooled over each object segmentation candidate to generate global region descriptors. We employ a novel second-order pooling procedure, O2P, with two non-linearities: a tangent space mapping and power normalization. The global region descriptors are passed through linear regressors for each category, then labeled segments in each image having scores above some threshold are pasted onto the image in the order of these scores. Learning is performed using an epsilon-insensitive loss function on overlap with ground truth, similar to [2], but within a linear formulation (using LIBLINEAR). comp6: learning uses all images in the segmentation+detection trainval sets, and external ground truth annotations provided by courtesy of the Berkeley vision group. comp5: one model is trained for each category using the available ground truth segmentations from the 2012 trainval set. Then, on each image having no associated ground truth segmentations, the learned models are used together with bounding box constraints, low-level cues and region competition to generate predicted object segmentations inside all bounding boxes. Afterwards, learning proceeds similarly to the fully annotated case. [1] "Semantic Segmentation with Second-Order Pooling", Carreira, Caseiro, Batista, Sminchisescu. ECCV 2012. [2] "Object Recognition by Ranking Figure-Ground Hypotheses", Li, Carreira, Sminchisescu. CVPR 2010. | 2012-09-23 19:11:47
BONN_O2PCPMC_FGT_SEGM | BONN_O2PCPMC_FGT_SEGM | (1) University of Bonn, (2) University of Coimbra, (3) Georgia Institute of Technology, (4) Vienna University of Technology | Joao Carreira (1,2), Adrian Ion (4), Fuxin Li (3), Cristian Sminchisescu (1) | We present a joint image segmentation and labeling model which, given a bag of figure-ground segment hypotheses extracted at multiple image locations and scales using CPMC (Carreira and Sminchisescu, PAMI 2012), constructs a joint probability distribution over both the compatible image interpretations (tilings or image segmentations) composed from those segments, and over their labeling into categories. The process of drawing samples from the joint distribution can be interpreted as first sampling tilings, modeled as maximal cliques, from a graph connecting spatially non-overlapping segments in the bag (Ion, Carreira, Sminchisescu, ICCV2011), followed by sampling labels for those segments, conditioned on the choice of a particular tiling. We learn the segmentation and labeling parameters jointly, based on Maximum Likelihood with a novel Incremental Saddle Point estimation procedure (Ion, Carreira, Sminchisescu, NIPS2011). As meta-features we combine outputs from linear SVRs using novel second order O2P features to predict the overlap between segments and ground-truth objects of each class (Carreira, Caseiro, Batista, Sminchisescu, ECCV2012; see VOC12 entry BONNCMBR_O2PCPMC_LINEAR), bounding box object detectors, and kernel SVR outputs trained to predict the overlap between segments and ground-truth objects of each class (Carreira, Li, Sminchisescu, IJCV 2012). comp6: the O2P SVR learning uses all images in the segmentation+detection trainval sets, and external ground truth annotations provided by courtesy of the Berkeley vision group. | 2012-09-23 21:39:35
CDL_VWL | CDL_VWL | XJTLU;UoL | Bingfeng Zhang, Jimin Xiao | VML IJCV22 WSSS COCO-PRE | 2022-07-12 02:46:57
CDL_NEW | CDL_new | XJTLU | Bingfeng Zhang, Jimin Xiao | wsss EPS | 2022-06-29 03:31:34
CDL_NEW_rib | CDL_new_rib | XJTLU | Bingfeng Zhang, Jimin Xiao | WSSS RIB | 2022-06-29 03:47:40
WSS Segmentor | CGPT | ETS, Montreal | RH, RB, BM, JD | Segmentation using WSS | 2023-05-22 07:14:39
WSS Seg | CGPT_IMN | ETS, Montreal | RB, RH, BM, JD | WSS Segmentation | 2023-05-22 13:55:35
dssd style arch | DCONV_SSD_FCN | shanghai university | li junhao (jxlijunhao@163.com) | combine object detection and semantic segmentation in one forward pass | 2018-03-17 02:58:20
DSRG_ATTNBN | DSRG_ATTNBN | DSRG_ATTNBN | DSRG_ATTNBN | DSRG_ATTNBN | 2020-02-28 08:27:04
Deeplab-clims-r38 | Deeplab-clims-r38 | Ecole de technologie superieure | Balamurali Murugesan | Deeplab-clims-r38 | 2022-10-25 21:52:00
Exploring the missing parts for WSSS | DeeplabV3+_Exploring_Missing_Parts | Northeastern University | Dali Chen | Exploring the missing parts in the multi-category saliency map and improving the accuracy of all WSSS methods based on saliency maps. | 2021-07-15 06:59:42
Weakly-supervised model | Extended | SYSU | Wenfeng Luo | DCNN trained under image labels | 2018-08-29 12:54:29
FDNet_16s | FDNet_16s | HongKong University of Science and Technology, altizure.com | Mingmin Zhen, Jinglu Wang, Siyu Zhu, Runze Zhang, Shiwei Li, Tian Fang, Long Quan | A fully dense neural network with encoder-decoder structure is proposed that we abbreviate as FDNet. For each stage in the decoder module, feature maps of all the previous blocks are adaptively aggregated and fed forward as input. | 2018-03-22 08:52:44
Fully Convolutional Network | FLATTENET-001 | Sichuan University, China | Xin Cai | In contrast to the commonly-used strategies, such as dilated convolution and encoder-decoder structure, we introduce the Flattening Module to produce high-resolution predictions without either removing any subsampling operations or building a complicated decoder module. https://arxiv.org/abs/1909.09961 | 2019-12-29 07:29:05
Foundation Model Assisted Weakly Supervised Semant | FS+WSSS | Zhejiang University | Xiaobo Yang | Foundation Model Assisted Weakly Supervised Semant | 2023-06-26 15:53:48
Global Distinguishing Module | GDM | Jinan University | Runkai Zheng | Uncertainty weighted loss for extracting globally distinguishable spatial features. | 2020-03-04 10:42:09
DM2: Detection, Mask transfer, MRF pruning | NUS_DET_SPR_GC_SP | National University of Singapore (NUS), Panasonic Singapore Laboratories (PSL) | (NUS) Wei XIA, Csaba DOMOKOS, Jian DONG, Shuicheng YAN, Loong Fah CHEONG, (PSL) Zhongyang HUANG, Shengmei SHEN | We propose a three-step coarse-to-fine framework for general object segmentation. Given a test image, the object bounding boxes are first predicted by object detectors, and then the coarse masks within the corresponding bounding boxes are transferred from the training data based on the optimization framework of coupled global and local sparse representations in [1]. Then, based on the coarse masks as well as the original detection information (bounding boxes and confidence maps), we build a super-pixel based MRF model for each bounding box and perform foreground-background inference. Both the L-a-b color histogram and the detection confidence map are used for characterizing the unary terms, while the PB edge contrast is used as the smoothness term. Finally, the segmentation results are further refined by post-processing of multi-scale super-pixel segmentation. [1] Wei Xia, Zheng Song, Jiashi Feng, Loong Fah Cheong and Shuicheng Yan. Segmentation over Detection by Coupled Global and Local Sparse Representations, ECCV 2012. | 2012-09-23 20:01:56
O2P+SVRSEGM Regressor + Composite Statistical Inference | O2P_SVRSEGM_CPMC_CSI | (1) Georgia Institute of Technology (2) University of California - Berkeley (3) Amazon Inc. (4) Lund University | Fuxin Li (1), Joao Carreira (2), Guy Lebanon (3), Cristian Sminchisescu (4) | We utilize a novel probabilistic inference procedure, Composite Statistical Inference (CSI) [1], on semantic segmentation using predictions on overlapping figure-ground hypotheses. Regressor predictions on segment overlaps to the ground truth object are modelled as generated by the true overlap with the ground truth segment plus noise, parametrized on the unknown percentage of each superpixel that belongs to the unknown ground truth. A joint optimization on all the superpixels and all the categories is then performed in order to maximize the likelihood of the SVR predictions. The optimization has a tight convex relaxation so solutions can be expected to be close to the global optimum. A fast and optimal search algorithm is then applied to retrieve each object. CSI takes the intuition from the SVRSEGM inference algorithm that multiple predictions on similar segments can be combined to better consolidate the segment mask. But it fully develops the idea by constructing a probabilistic framework and performing maximum composite likelihood jointly on all segments and categories. Therefore it is able to consolidate better object boundaries and handle hard cases when objects interact closely and heavily occlude each other. For each image, we use 150 overlapping figure-ground hypotheses generated by the CPMC algorithm (Carreira and Sminchisescu, PAMI 2012), SVRSEGM results, and linear SVR predictions on them with the novel second order O2P features (Carreira, Caseiro, Batista, Sminchisescu, ECCV2012; see VOC12 entry BONN_CMBR_O2P_CPMC_LIN) as the input to the inference algorithm. [1] Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu. Composite Statistical Inference for Semantic Segmentation. CVPR 2013. | 2012-11-15 22:50:41
CRF with NBNN features and simple smoothing | OptNBNN-CRF | University of Amsterdam (UvA) | Carsten van Weelden, Maarten van der Velden, Jan van Gemert | Naive Bayes nearest neighbor (NBNN) [Boiman et al, CVPR 2008] performs well in image classification because it avoids quantization of image features and estimates image-to-class distance. In the context of my MSc thesis we applied the NBNN method to segmentation by estimating image-to-class distances for superpixels, which we use as unary potentials in a simple conditional random field (CRF). To get the NBNN estimates we extract dense SIFT features from the training set and store these in a FLANN index [Muja and Lowe, VISAPP'09] for efficient nearest neighbor search. To deal with the unbalanced class frequency we learn a linear correction for each class as in [Behmo et al, ECCV 2010]. We segment each test image into 500 SLIC superpixels [Achanta et al, TPAMI 2012] and take each superpixel as a vertex in the CRF. We use the corrected NBNN estimates as unary potentials and a Potts potential as pairwise potentials, and infer the MAP labeling using alpha-expansion [Boykov et al, TPAMI 2001]. We tune the weighting between the unary and pairwise potentials by exhaustive search. | 2012-09-23 12:48:10
Progressive Framework using Weak Autoencoder (SEC) | Progressive Framework | Lakehead University [1,2], Vector Institute [2], The University of British Columbia (Okanagan) [3], Nanyang Technological University [4] | Terence Chow [1] (ychow@lakeheadu.ca), Yimin Yang [2] (yyang48@lakeheadu.ca), Shan Du [3] (shan.du@ubc.ca), Zhiping Linc [4] (EZPLin@ntu.edu.sg) | http://www.yiminyang.com/weakly_supervised.html | 2021-07-03 15:40:47
Progressive Framework using Weak Autoencoder (DSRG) | Progressive Framework | Lakehead University [1,2], Vector Institute [2], The University of British Columbia (Okanagan) [3], Nanyang Technological University [4] | Terence Chow [1] (ychow@lakeheadu.ca), Yimin Yang [2] (yyang48@lakeheadu.ca), Shan Du [3] (shan.du@ubc.ca), Zhiping Linc [4] (EZPLin@ntu.edu.sg) | http://www.yiminyang.com/weakly_supervised.html | 2021-07-06 02:51:33
Progressive Framework using Weak Autoencoder (CIAN) | Progressive Framework | Lakehead University [1,2], Vector Institute [2], The University of British Columbia (Okanagan) [3], Nanyang Technological University [4] | Terence Chow [1] (ychow@lakeheadu.ca), Yimin Yang [2] (yyang48@lakeheadu.ca), Shan Du [3] (shan.du@ubc.ca), Zhiping Linc [4] (EZPLin@ntu.edu.sg) | http://www.yiminyang.com/weakly_supervised.html | 2021-07-06 02:52:28
Puzzle-CAM with ResNeSt-269 | Puzzle-CAM | GYNetworks | Sanghyun Jo, In-Jae Yu | Weakly-supervised semantic segmentation (WSSS) is introduced to narrow the gap for semantic segmentation performance from pixel-level supervision to image-level supervision. Most advanced approaches are based on class activation maps (CAMs) to generate pseudo-labels to train the segmentation network. The main limitation of WSSS is that the process of generating pseudo-labels from CAMs that use an image classifier is mainly focused on the most discriminative parts of the objects. To address this issue, we propose Puzzle-CAM, a process that minimizes differences between the features from separate patches and the whole image. Our method consists of a puzzle module and two regularization terms to discover the most integrated region in an object. Puzzle-CAM can activate the overall region of an object using image-level supervision without requiring extra parameters. In experiments, Puzzle-CAM outperformed previous state-of-the-art methods using the same labels for supervision on the PASCAL VOC 2012 dataset. | 2021-02-02 05:25:31
TCnet | TCnet | Tsinghua University | Liu Yulin | TCnet | 2018-05-02 08:02:45
Iterative method with Entropy-gated ConvCRF | UMICH_EG-ConvCRF_Iter_Res101 | University of Michigan Deep Learning Research Group | Chuan Cen, supervisor: Prof. Honglak Lee | Train segmentation network and relation models iteratively. Infer pseudo-labels for segmentation network with the novel Entropy-gated ConvCRF, which is proved to be superior to random walk under the same conditions. Seg net: Deeplabv2. Seg net backbone: Res101. Relation model backbone: Res101. | 2019-12-13 00:37:25
Iterative method with Entropy-gated ConvCRF | UMICH_EG-ConvCRF_Iter_Res50 | University of Michigan Deep Learning Research Group | Chuan Cen, supervisor: Prof. Honglak Lee | Train segmentation network and relation models iteratively. Infer pseudo-labels for segmentation network with the novel Entropy-gated ConvCRF, which is proved to be superior to random walk under the same conditions. Seg net: Deeplabv2. Seg net backbone: Res50. Relation model backbone: Res50. | 2019-12-10 22:02:49
Transductive semi-sup, co-train, self-train | UMICH_TCS | University of Michigan Deep Learning Research Group | Chuan Cen | A method for weakly supervised semantic segmentation with image-level labels only. The problem is viewed as a semi-supervised learning task; graph semi-supervised learning, co-training and self-training are then applied together, achieving SOTA performance. | 2019-11-28 19:09:22
Transductive semi-sup, co-train, self-train | UMICH_TCS_101 | University of Michigan Deep Learning Research Group | Chuan Cen | A method for weakly supervised semantic segmentation with image-level labels only. The problem is viewed as a semi-supervised learning task; graph semi-supervised learning, co-training and self-training are then applied together, achieving SOTA performance. | 2019-12-01 02:40:47
PRETRAINED resnetv2_50x1_bit_distilled (384x384 si | Unet_Attention | HSE | Ivan Vassilenko | PRETRAINED resnetv2_50x1_bit_distilled (384x384 size) | 2022-01-17 07:57:57
WASP for Effective Semantic Segmentation | WASPNet | Rochester Institute of Technology | Bruno Artacho and Andreas Savakis, Rochester Institute of Technology | We propose an efficient architecture for semantic segmentation based on an improvement of Atrous Spatial Pyramid Pooling that achieves a considerable accuracy increase while decreasing the number of parameters and amount of memory necessary. Current semantic segmentation methods rely either on deconvolutional stages that inherently require a large number of parameters, or on cascade methods that forgo the larger fields-of-view obtained through parallelization. The proposed Waterfall architecture leverages the progressive information abstraction from cascade architecture while obtaining multi-scale fields-of-view from spatial pyramid configurations. We demonstrate that the Waterfall approach is a robust and efficient architecture for semantic segmentation using ResNet type networks and obtaining state-of-the-art results with over 20% reduction in the number of parameters and improved performance. | 2019-07-25 20:28:04
Waterfall Atrous Spatial Pooling Arch. for Sem Seg | WASPNet+CRF | Rochester Institute of Technology | Rochester Institute of Technology, Bruno Artacho, Andreas Savakis | We propose a new efficient architecture for semantic segmentation based on a "Waterfall" Atrous Spatial Pooling architecture that achieves a considerable accuracy increase while decreasing the number of network parameters and memory footprint. The proposed Waterfall architecture leverages the efficiency of progressive filtering in the cascade architecture, while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Additionally, our method does not rely on a post-processing stage with Conditional Random Fields, which further reduces complexity and required training time. We demonstrate that the Waterfall approach with a ResNet backbone is a robust and efficient architecture for semantic segmentation obtaining state-of-the-art results with significant reduction in the number of parameters for the Pascal VOC dataset and the Cityscapes dataset. | 2019-11-19 15:19:18
FLATTENET | XC-FLATTENET | Sichuan University, China | Xin Cai | It is well-known that the reduced feature resolution due to repeated subsampling operations poses a serious challenge to Fully Convolutional Network (FCN) based models. In contrast to the commonly-used strategies, such as dilated convolution and encoder-decoder structure, we introduce a novel Flattening Module to produce high-resolution predictions without either removing any subsampling operations or building a complicated decoder module. https://ieeexplore.ieee.org/document/8932465/metrics#metrics | 2020-01-12 02:43:17
Improved deeplabv3 for semantic segmentation | acm | Tianjin University of Technology and Education | ******* | | 2020-06-13 17:49:57
attention RRM | attention RRM | XJTLU | Bingfeng Zhang, Jimin Xiao | RFAM RRM weakly supervised semantic segmentation image level | 2021-03-12 11:49:44
ConvNet for voc 2012 | aux_0.4_finetuning_train_checkpoints | HoHai University | wuYsda | | 2020-06-16 01:45:47
bothweight th0.4 | bothweight th0.4 | Northwestern Polytechnical University | Peng Wang, Chunhua Shen | bothweight th0.4 27082 | 2019-04-15 10:46:51
comp6_test_cls | comp6_test_cls | comp6_test_cls | comp6_test_cls | comp6_test_cls | 2018-05-10 15:54:47
pretrained resnet_101 and ASPP module | deeplabv3_plus_reproduction | Institute of Computing Technology | Zhu Lifa | Reproduction of deeplabv3plus with Tensorflow. | 2019-05-11 09:45:53
fcn | fcn | golangboy | golangboy | fcn | 2023-04-26 02:19:00
fcn | fcn | golangboy | golangboy | fcn | 2023-04-26 02:00:55
refinenet_HPM | refinenet_HPM | SJTU | gzx | refinenet_HPM | 2019-03-01 09:22:09
unet_resnet50 | unet_resnet50 | hainnu | nothing | unet_resnet50 | 2023-08-13 16:08:21
vgg_unet | vgg_unet | hainnu | golangboy | nothing | 2023-08-13 12:13:17
weakly_seg_validation_test | weakly_seg_validation_test | NEU | Smile Lab | weakly_seg_validation_test | 2019-09-08 01:18:19
weight+RS | weight+RS | Northwestern Polytechnical University | Peng Wang, Shunhua Shen | weight+RS | 2019-03-24 08:13:07
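
The OptNBNN-CRF entry above describes using NBNN image-to-class distances of superpixels as CRF unary potentials. A minimal sketch of that distance computation follows, under stated assumptions: brute-force nearest-neighbour search stands in for the FLANN index, descriptors are plain tuples rather than dense SIFT features, and the names `image_to_class_distance`, `nbnn_unaries` and `correction` are hypothetical. The per-class linear correction is modelled as a (scale, offset) pair, which is only one plausible reading of the description.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def image_to_class_distance(descriptors, class_descriptors):
    """Sum, over the query descriptors, of the distance to the nearest
    training descriptor of one class (exhaustive search, not FLANN)."""
    return sum(
        min(squared_distance(d, t) for t in class_descriptors)
        for d in descriptors
    )


def nbnn_unaries(descriptors, training, correction=None):
    """Return {label: cost} for one superpixel; lower cost means a better fit.

    training maps each label to its list of training descriptors.
    correction optionally maps a label to an assumed (scale, offset) pair.
    """
    correction = correction or {}
    unaries = {}
    for label, class_descriptors in training.items():
        dist = image_to_class_distance(descriptors, class_descriptors)
        scale, offset = correction.get(label, (1.0, 0.0))
        unaries[label] = scale * dist + offset
    return unaries


if __name__ == "__main__":
    # Toy 2-D descriptors; a real system would use dense SIFT features.
    training = {"cat": [(0.0, 0.0), (1.0, 0.0)], "dog": [(5.0, 5.0)]}
    superpixel = [(0.1, 0.0), (0.9, 0.1)]
    unaries = nbnn_unaries(superpixel, training)
    print(min(unaries, key=unaries.get))
```

In the entry's pipeline these costs would then be combined with a Potts pairwise term and minimized by alpha-expansion; that inference step is not shown here.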