Segmentation Results: VOC2012 BETA

Competition "comp5" (train on VOC2012 data)

This leaderboard shows only those submissions that have been marked as public, so the displayed rankings should not be considered definitive.

Average Precision (AP %)

Method | mean | aeroplane | bicycle | bird | boat | bottle | bus | car | cat | chair | cow | diningtable | dog | horse | motorbike | person | pottedplant | sheep | sofa | train | tv/monitor | submission date
XC-FLATTENET | 84.3 | 94.0 | 73.2 | 91.5 | 74.4 | 83.2 | 95.5 | 90.5 | 96.7 | 38.1 | 94.5 | 76.2 | 92.9 | 95.6 | 88.9 | 90.4 | 76.0 | 93.8 | 63.6 | 86.8 | 78.6 | 12-Jan-2020
FDNet_16s | 84.0 | 95.4 | 77.9 | 95.9 | 69.1 | 80.6 | 96.4 | 92.6 | 95.5 | 40.5 | 92.6 | 70.6 | 93.8 | 93.1 | 90.4 | 89.9 | 71.2 | 92.7 | 63.1 | 88.5 | 77.7 | 22-Mar-2018
FLATTENET-001 | 83.1 | 95.4 | 69.8 | 86.4 | 73.3 | 78.7 | 94.6 | 92.1 | 95.5 | 41.0 | 92.4 | 75.8 | 93.1 | 95.2 | 89.8 | 90.0 | 65.8 | 94.0 | 60.0 | 86.6 | 79.6 | 29-Dec-2019
WASPNet+CRF | 79.6 | 90.7 | 60.9 | 85.3 | 68.9 | 80.7 | 93.8 | 84.7 | 94.5 | 38.5 | 86.0 | 69.7 | 90.7 | 86.9 | 85.4 | 86.8 | 67.6 | 88.8 | 57.4 | 85.4 | 74.2 | 19-Nov-2019
WASPNet | 79.4 | 89.5 | 63.8 | 87.6 | 68.8 | 79.1 | 93.5 | 84.7 | 93.7 | 37.6 | 84.8 | 69.4 | 90.8 | 87.9 | 85.7 | 86.5 | 67.4 | 87.0 | 57.2 | 85.2 | 72.7 | 25-Jul-2019
refinenet_HPM | 74.2 | 87.9 | 62.2 | 76.0 | 55.3 | 76.0 | 86.7 | 82.6 | 85.4 | 28.9 | 79.6 | 64.2 | 81.2 | 79.4 | 85.1 | 85.0 | 65.6 | 83.2 | 51.3 | 79.2 | 68.6 | 01-Mar-2019
DCONV_SSD_FCN | 72.9 | 88.6 | 38.1 | 85.2 | 57.8 | 71.4 | 90.8 | 84.2 | 86.0 | 32.1 | 83.4 | 53.7 | 80.4 | 80.8 | 81.6 | 81.2 | 61.4 | 84.1 | 51.9 | 77.5 | 67.0 | 17-Mar-2018
deeplabv3_plus_reproduction | 69.5 | 81.7 | 38.1 | 83.1 | 60.1 | 62.7 | 89.8 | 81.4 | 87.6 | 30.0 | 72.8 | 60.9 | 78.1 | 77.7 | 78.6 | 78.1 | 44.9 | 76.7 | 50.8 | 75.4 | 58.6 | 11-May-2019
TCnet | 68.4 | 72.6 | 32.6 | 74.2 | 59.5 | 68.9 | 86.7 | 77.1 | 78.6 | 34.4 | 68.6 | 63.0 | 74.4 | 75.8 | 76.3 | 77.0 | 54.9 | 76.6 | 55.5 | 76.9 | 61.7 | 02-May-2018
UMICH_EG-ConvCRF_Iter_Res101 | 67.9 | 76.4 | 32.1 | 82.0 | 52.9 | 69.5 | 89.5 | 81.1 | 87.5 | 29.6 | 74.7 | 54.2 | 83.4 | 79.2 | 78.7 | 76.1 | 48.1 | 81.0 | 59.4 | 51.3 | 49.3 | 13-Dec-2019
UMICH_TCS_101 | 66.7 | 74.6 | 31.5 | 81.5 | 50.3 | 67.2 | 88.4 | 80.6 | 82.3 | 29.9 | 72.3 | 53.4 | 76.1 | 76.0 | 78.4 | 73.6 | 47.5 | 79.2 | 52.0 | 57.9 | 56.6 | 01-Dec-2019
UMICH_EG-ConvCRF_Iter_Res50 | 66.4 | 75.8 | 32.2 | 85.5 | 50.9 | 69.3 | 86.4 | 79.6 | 85.1 | 29.1 | 73.7 | 55.7 | 79.6 | 74.7 | 76.3 | 75.8 | 44.8 | 79.6 | 51.0 | 50.5 | 48.5 | 10-Dec-2019
UMICH_TCS | 65.5 | 73.6 | 32.4 | 81.6 | 50.4 | 68.5 | 86.2 | 79.4 | 81.8 | 28.2 | 75.5 | 55.6 | 79.0 | 75.5 | 77.7 | 74.3 | 48.2 | 79.0 | 52.3 | 44.4 | 42.7 | 28-Nov-2019
bothweight th0.4 | 65.3 | 79.1 | 33.4 | 88.2 | 20.1 | 65.3 | 88.0 | 76.2 | 90.0 | 24.7 | 80.7 | 43.7 | 85.1 | 85.8 | 82.3 | 69.8 | 47.7 | 84.9 | 43.8 | 41.4 | 54.5 | 15-Apr-2019
weight+RS | 64.5 | 85.0 | 31.9 | 85.4 | 19.1 | 65.3 | 88.6 | 72.9 | 88.5 | 24.8 | 75.6 | 50.1 | 83.3 | 82.0 | 81.6 | 66.8 | 56.6 | 80.3 | 45.8 | 44.6 | 39.7 | 24-Mar-2019
AttnBN | 63.0 | 75.7 | 32.9 | 73.5 | 49.9 | 60.4 | 78.1 | 76.5 | 77.4 | 19.9 | 72.0 | 27.4 | 73.8 | 72.7 | 77.2 | 72.3 | 51.2 | 77.3 | 37.9 | 73.5 | 53.6 | 14-Aug-2019
Extended | 59.3 | 77.9 | 28.9 | 75.1 | 42.6 | 55.2 | 70.4 | 58.9 | 53.0 | 24.3 | 66.7 | 51.9 | 73.1 | 71.3 | 72.5 | 63.9 | 45.2 | 59.2 | 43.9 | 65.2 | 58.6 | 29-Aug-2018
weakly_seg_validation_test | 57.7 | 67.6 | 31.1 | 66.4 | 41.9 | 60.1 | 70.6 | 65.4 | 71.8 | 25.3 | 63.6 | 24.7 | 72.2 | 68.7 | 68.3 | 68.8 | 41.6 | 67.5 | 33.6 | 65.0 | 49.2 | 08-Sep-2019
O2P_SVRSEGM_CPMC_CSI | 47.5 | 64.0 | 32.2 | 45.9 | 34.7 | 46.3 | 59.5 | 61.7 | 49.4 | 14.8 | 47.9 | 31.2 | 42.5 | 51.3 | 58.8 | 54.6 | 34.9 | 54.6 | 34.7 | 50.6 | 42.2 | 15-Nov-2012
NUS_DET_SPR_GC_SP | 47.3 | 52.9 | 31.0 | 39.8 | 44.5 | 58.9 | 60.8 | 52.5 | 49.0 | 22.6 | 38.1 | 27.5 | 47.4 | 52.4 | 46.8 | 51.9 | 35.7 | 55.3 | 40.8 | 54.2 | 47.8 | 23-Sep-2012
BONN_O2PCPMC_FGT_SEGM | 47.0 | 65.4 | 29.3 | 51.3 | 33.4 | 44.2 | 59.8 | 60.3 | 52.5 | 13.6 | 53.6 | 32.6 | 40.3 | 57.6 | 57.3 | 49.0 | 33.5 | 53.5 | 29.2 | 47.6 | 37.6 | 23-Sep-2012
BONNGC_O2P_CPMC_CSI | 45.4 | 59.3 | 27.9 | 43.9 | 39.8 | 41.4 | 52.2 | 61.5 | 56.4 | 13.6 | 44.5 | 26.1 | 42.8 | 51.7 | 57.9 | 51.3 | 29.8 | 45.7 | 28.8 | 49.9 | 43.3 | 23-Sep-2012
BONN_CMBR_O2P_CPMC_LIN | 44.8 | 60.0 | 27.3 | 46.4 | 40.0 | 41.7 | 57.6 | 59.0 | 50.4 | 10.0 | 41.6 | 22.3 | 43.0 | 51.7 | 56.8 | 50.1 | 33.7 | 43.7 | 29.5 | 47.5 | 44.7 | 23-Sep-2012
comp6_test_cls | 37.7 | 36.6 | 10.8 | 38.9 | 25.9 | 30.8 | 56.0 | 53.8 | 57.8 | 4.9 | 24.6 | 22.1 | 48.1 | 33.1 | 32.6 | 56.1 | 23.5 | 29.7 | 31.8 | 42.7 | 45.6 | 10-May-2018
OptNBNN-CRF | 11.3 | 10.5 | 2.3 | 3.0 | 3.0 | 1.0 | 30.2 | 14.9 | 15.0 | 0.2 | 6.1 | 2.3 | 5.1 | 12.1 | 15.3 | 23.4 | 0.5 | 8.9 | 3.5 | 10.7 | 5.3 | 23-Sep-2012
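The VOC development kit scores segmentation per class as intersection-over-union, TP / (TP + FP + FN), accumulated over the whole test set. A minimal sketch of that computation (handling of void pixels, which the devkit excludes, is omitted here):

```python
import numpy as np

def voc_iou_scores(preds, gts, num_classes):
    """Per-class segmentation score as intersection-over-union,
    TP / (TP + FP + FN), accumulated over all images; returned in %."""
    # Confusion matrix accumulated over the dataset.
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for pred, gt in zip(preds, gts):
        idx = gt.ravel() * num_classes + pred.ravel()
        conf += np.bincount(idx, minlength=num_classes ** 2).reshape(
            num_classes, num_classes)
    tp = np.diag(conf).astype(np.float64)
    fp = conf.sum(axis=0) - tp   # predicted as c but labeled otherwise
    fn = conf.sum(axis=1) - tp   # labeled c but predicted otherwise
    return 100.0 * tp / np.maximum(tp + fp + fn, 1)

# Toy 2x2 "images": background = class 0, one foreground class = 1.
pred = [np.array([[0, 1], [1, 1]])]
gt   = [np.array([[0, 1], [0, 1]])]
scores = voc_iou_scores(pred, gt, num_classes=2)
```

The "mean" column above is then the average of the 20 object-class scores together with background.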

Abbreviations

Each entry below lists the submission's title, method identifier, affiliation, contributors, description, and submission date.
Title: AttnBN
Method: AttnBN
Affiliation: AttnBN
Contributors: AttnBN
Description: AttnBN
Date: 2019-08-14 23:23:24

Title: O2P Regressor + Composite Statistical Inference
Method: BONNGC_O2P_CPMC_CSI
Affiliation: (1) University of Bonn, (2) Georgia Institute of Technology, (3) University of Coimbra
Contributors: Joao Carreira (1,3), Fuxin Li (2), Guy Lebanon (2), Cristian Sminchisescu (1)
Description: We utilize a novel probabilistic inference procedure (unpublished yet), Composite Statistical Inference (CSI), on semantic segmentation using predictions on overlapping figure-ground hypotheses. Regressor predictions on segment overlaps to the ground truth object are modelled as generated by the true overlap with the ground truth segment plus noise. A model of ground truth overlap is defined by parametrizing on the unknown percentage of each superpixel that belongs to the unknown ground truth. A joint optimization over all the superpixels and all the categories is then performed in order to maximize the likelihood of the SVR predictions. The optimization has a tight convex relaxation, so solutions can be expected to be close to the global optimum. A fast and optimal search algorithm is then applied to retrieve each object. CSI takes from the SVRSEGM inference algorithm the intuition that multiple predictions on similar segments can be combined to better consolidate the segment mask, but fully develops the idea by constructing a probabilistic framework and performing composite MLE jointly on all segments and categories. Therefore it is able to consolidate object boundaries better and handle hard cases where objects interact closely and heavily occlude each other. For each image, we use 150 overlapping figure-ground hypotheses generated by the CPMC algorithm (Carreira and Sminchisescu, PAMI 2012), and linear SVR predictions on them with the novel second-order O2P features (Carreira, Caseiro, Batista, Sminchisescu, ECCV 2012; see VOC12 entry BONN_CMBR_O2P_CPMC_LIN) as the input to the inference algorithm.
Date: 2012-09-23 23:49:02

Title: Linear SVR with second-order pooling
Method: BONN_CMBR_O2P_CPMC_LIN
Affiliation: (1) University of Bonn, (2) University of Coimbra
Contributors: Joao Carreira (2,1), Rui Caseiro (2), Jorge Batista (2), Cristian Sminchisescu (1)
Description: We present a novel, effective local feature aggregation method that we use in conjunction with an existing figure-ground segmentation sampling mechanism; this submission is described in detail in [1]. We sample multiple figure-ground segmentation candidates per image using the Constrained Parametric Min-Cuts (CPMC) algorithm. SIFT, masked SIFT and LBP features are extracted on the whole image, then pooled over each object segmentation candidate to generate global region descriptors. We employ a novel second-order pooling procedure, O2P, with two non-linearities: a tangent space mapping and power normalization. The global region descriptors are passed through linear regressors for each category; labeled segments in each image having scores above some threshold are then pasted onto the image in the order of these scores. Learning is performed using an epsilon-insensitive loss function on overlap with ground truth, similar to [2], but within a linear formulation (using LIBLINEAR). comp6: learning uses all images in the segmentation+detection trainval sets, and external ground truth annotations provided by courtesy of the Berkeley vision group. comp5: one model is trained for each category using the available ground truth segmentations from the 2012 trainval set. Then, on each image having no associated ground truth segmentations, the learned models are used together with bounding box constraints, low-level cues and region competition to generate predicted object segmentations inside all bounding boxes. Afterwards, learning proceeds as in the fully annotated case. [1] "Semantic Segmentation with Second-Order Pooling", Carreira, Caseiro, Batista, Sminchisescu. ECCV 2012. [2] "Object Recognition by Ranking Figure-Ground Hypotheses", Li, Carreira, Sminchisescu. CVPR 2010.
Date: 2012-09-23 19:11:47

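The second-order pooling step described above can be sketched in a few lines: average the outer products of the local descriptors pooled over a region, then apply the two non-linearities mentioned (a tangent-space/log-Euclidean mapping and power normalization). The regularizer and power exponent below are illustrative choices, not the paper's exact values:

```python
import numpy as np

def o2p_descriptor(features, eps=1e-6, power=0.75):
    """Second-order average pooling (O2P) over local descriptors.

    features: (n, d) array of local descriptors pooled over one region.
    Returns a flattened, power-normalized log-Euclidean descriptor.
    """
    n, d = features.shape
    # Second-order average pooling: mean of outer products.
    G = features.T @ features / n
    # Tangent-space (log-Euclidean) mapping via eigendecomposition;
    # eps regularizes zero eigenvalues so the matrix log is defined.
    w, V = np.linalg.eigh(G + eps * np.eye(d))
    log_G = (V * np.log(w)) @ V.T
    # Power normalization: sign(x) * |x|^power, elementwise.
    x = log_G.ravel()
    return np.sign(x) * np.abs(x) ** power

rng = np.random.default_rng(0)
desc = o2p_descriptor(rng.standard_normal((100, 8)))
```

The resulting d*d vector is what gets fed to the per-category linear regressors.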
Title: BONN_O2PCPMC_FGT_SEGM
Method: BONN_O2PCPMC_FGT_SEGM
Affiliation: (1) University of Bonn, (2) University of Coimbra, (3) Georgia Institute of Technology, (4) Vienna University of Technology
Contributors: Joao Carreira (1,2), Adrian Ion (4), Fuxin Li (3), Cristian Sminchisescu (1)
Description: We present a joint image segmentation and labeling model which, given a bag of figure-ground segment hypotheses extracted at multiple image locations and scales using CPMC (Carreira and Sminchisescu, PAMI 2012), constructs a joint probability distribution over both the compatible image interpretations (tilings or image segmentations) composed from those segments, and over their labeling into categories. The process of drawing samples from the joint distribution can be interpreted as first sampling tilings, modeled as maximal cliques, from a graph connecting spatially non-overlapping segments in the bag (Ion, Carreira, Sminchisescu, ICCV 2011), followed by sampling labels for those segments, conditioned on the choice of a particular tiling. We learn the segmentation and labeling parameters jointly, based on Maximum Likelihood with a novel Incremental Saddle Point estimation procedure (Ion, Carreira, Sminchisescu, NIPS 2011). As meta-features we combine outputs from linear SVRs using novel second-order O2P features to predict the overlap between segments and ground-truth objects of each class (Carreira, Caseiro, Batista, Sminchisescu, ECCV 2012; see VOC12 entry BONN_CMBR_O2P_CPMC_LIN), bounding box object detectors, and kernel SVR outputs trained to predict the overlap between segments and ground-truth objects of each class (Carreira, Li, Sminchisescu, IJCV 2012). comp6: the O2P SVR learning uses all images in the segmentation+detection trainval sets, and external ground truth annotations provided by courtesy of the Berkeley vision group.
Date: 2012-09-23 21:39:35

Title: dssd style arch
Method: DCONV_SSD_FCN
Affiliation: Shanghai University
Contributors: Li Junhao (jxlijunhao@163.com)
Description: Combine object detection and semantic segmentation in one forward pass.
Date: 2018-03-17 02:58:20

Title: Weakly-supervised model
Method: Extended
Affiliation: SYSU
Contributors: Wenfeng Luo
Description: DCNN trained under image labels.
Date: 2018-08-29 12:54:29

Title: FDNet_16s
Method: FDNet_16s
Affiliation: Hong Kong University of Science and Technology, altizure.com
Contributors: Mingmin Zhen, Jinglu Wang, Siyu Zhu, Runze Zhang, Shiwei Li, Tian Fang, Long Quan
Description: A fully dense neural network with an encoder-decoder structure is proposed, which we abbreviate as FDNet. For each stage in the decoder module, feature maps of all the previous blocks are adaptively aggregated and fed forward as input.
Date: 2018-03-22 08:52:44

Title: Fully Convolutional Network
Method: FLATTENET-001
Affiliation: Sichuan University, China
Contributors: Xin Cai
Description: In contrast to the commonly-used strategies, such as dilated convolution and encoder-decoder structure, we introduce the Flattening Module to produce high-resolution predictions without either removing any subsampling operations or building a complicated decoder module. https://arxiv.org/abs/1909.09961
Date: 2019-12-29 07:29:05

Title: DM2: Detection, Mask transfer, MRF pruning
Method: NUS_DET_SPR_GC_SP
Affiliation: National University of Singapore (NUS), Panasonic Singapore Laboratories (PSL)
Contributors: (NUS) Wei Xia, Csaba Domokos, Jian Dong, Shuicheng Yan, Loong Fah Cheong; (PSL) Zhongyang Huang, Shengmei Shen
Description: We propose a three-step coarse-to-fine framework for general object segmentation. Given a test image, the object bounding boxes are first predicted by object detectors, and the coarse masks within the corresponding bounding boxes are transferred from the training data based on the optimization framework of coupled global and local sparse representations in [1]. Based on the coarse masks as well as the original detection information (bounding boxes and confidence maps), we build a superpixel-based MRF model for each bounding box and perform foreground-background inference. Both the L-a-b color histogram and the detection confidence map are used for the unary terms, while the Pb edge contrast is used as the smoothness term. Finally, the segmentation results are refined by post-processing with multi-scale superpixel segmentation. [1] Wei Xia, Zheng Song, Jiashi Feng, Loong Fah Cheong and Shuicheng Yan. Segmentation over Detection by Coupled Global and Local Sparse Representations, ECCV 2012.
Date: 2012-09-23 20:01:56

Title: O2P+SVRSEGM Regressor + Composite Statistical Inference
Method: O2P_SVRSEGM_CPMC_CSI
Affiliation: (1) Georgia Institute of Technology, (2) University of California - Berkeley, (3) Amazon Inc., (4) Lund University
Contributors: Fuxin Li (1), Joao Carreira (2), Guy Lebanon (3), Cristian Sminchisescu (4)
Description: We utilize a novel probabilistic inference procedure, Composite Statistical Inference (CSI) [1], on semantic segmentation using predictions on overlapping figure-ground hypotheses. Regressor predictions on segment overlaps to the ground truth object are modelled as generated by the true overlap with the ground truth segment plus noise, parametrized on the unknown percentage of each superpixel that belongs to the unknown ground truth. A joint optimization over all the superpixels and all the categories is then performed in order to maximize the likelihood of the SVR predictions. The optimization has a tight convex relaxation, so solutions can be expected to be close to the global optimum. A fast and optimal search algorithm is then applied to retrieve each object. CSI takes from the SVRSEGM inference algorithm the intuition that multiple predictions on similar segments can be combined to better consolidate the segment mask, but fully develops the idea by constructing a probabilistic framework and performing maximum composite likelihood jointly on all segments and categories. Therefore it is able to consolidate object boundaries better and handle hard cases where objects interact closely and heavily occlude each other. For each image, we use 150 overlapping figure-ground hypotheses generated by the CPMC algorithm (Carreira and Sminchisescu, PAMI 2012), SVRSEGM results, and linear SVR predictions on them with the novel second-order O2P features (Carreira, Caseiro, Batista, Sminchisescu, ECCV 2012; see VOC12 entry BONN_CMBR_O2P_CPMC_LIN) as the input to the inference algorithm. [1] Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu. Composite Statistical Inference for Semantic Segmentation. CVPR 2013.
Date: 2012-11-15 22:50:41

Title: CRF with NBNN features and simple smoothing
Method: OptNBNN-CRF
Affiliation: University of Amsterdam (UvA)
Contributors: Carsten van Weelden, Maarten van der Velden, Jan van Gemert
Description: Naive Bayes nearest neighbor (NBNN) [Boiman et al., CVPR 2008] performs well in image classification because it avoids quantization of image features and estimates image-to-class distance. In the context of my MSc thesis, we applied the NBNN method to segmentation by estimating image-to-class distances for superpixels, which we use as unary potentials in a simple conditional random field (CRF). To get the NBNN estimates, we extract dense SIFT features from the training set and store these in a FLANN index [Muja and Lowe, VISSAPP'09] for efficient nearest neighbor search. To deal with the unbalanced class frequencies, we learn a linear correction for each class as in [Behmo et al., ECCV 2010]. We segment each test image into 500 SLIC superpixels [Achanta et al., TPAMI 2012] and take each superpixel as a vertex in the CRF. We use the corrected NBNN estimates as unary potentials and a Potts potential as the pairwise potential, and infer the MAP labeling using alpha-expansion [Boykov et al., TPAMI 2001]. We tune the weighting between the unary and pairwise potentials by exhaustive search.
Date: 2012-09-23 12:48:10

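The superpixel CRF in this entry (NBNN distances as unary costs, Potts smoothing, MAP inference) can be illustrated with a toy sketch. Simple iterated conditional modes (ICM) stands in for the alpha-expansion solver the entry actually uses, and the unary costs and adjacency graph below are made up:

```python
import numpy as np

def icm_potts(unary, edges, lam, iters=10):
    """MAP labeling of a superpixel CRF by iterated conditional modes.

    unary: (n, k) cost of assigning each of n superpixels to each of
           k labels (e.g. corrected NBNN image-to-class distances).
    edges: list of (i, j) pairs of adjacent superpixels.
    lam:   Potts weight; each disagreeing edge costs lam.
    """
    n, k = unary.shape
    labels = unary.argmin(axis=1)          # unary-only initialization
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    for _ in range(iters):
        changed = False
        for i in range(n):
            cost = unary[i].copy()
            for j in nbrs[i]:              # Potts penalty for disagreeing
                cost += lam * (np.arange(k) != labels[j])
            best = cost.argmin()
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
    return labels

# Three superpixels in a chain; the middle one has a noisy unary term
# that slightly prefers label 1, but smoothing flips it to label 0.
unary = np.array([[0.0, 2.0], [1.0, 0.9], [0.0, 2.0]])
labels = icm_potts(unary, edges=[(0, 1), (1, 2)], lam=0.5)
```

Unlike alpha-expansion, ICM only reaches a local minimum, but it shows how the pairwise term overrides a weak unary preference.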
Title: TCnet
Method: TCnet
Affiliation: Tsinghua University
Contributors: Liu Yulin
Description: TCnet
Date: 2018-05-02 08:02:45

Title: Iterative method with Entropy-gated ConvCRF
Method: UMICH_EG-ConvCRF_Iter_Res101
Affiliation: University of Michigan Deep Learning Research Group
Contributors: Chuan Cen; supervisor: Prof. Honglak Lee
Description: Train the segmentation network and relation models iteratively. Pseudo-labels for the segmentation network are inferred with the novel Entropy-gated ConvCRF, which is shown to be superior to random walk under the same conditions. Seg net: Deeplabv2; seg net backbone: Res101; relation model backbone: Res101.
Date: 2019-12-13 00:37:25

Title: Iterative method with Entropy-gated ConvCRF
Method: UMICH_EG-ConvCRF_Iter_Res50
Affiliation: University of Michigan Deep Learning Research Group
Contributors: Chuan Cen; supervisor: Prof. Honglak Lee
Description: Train the segmentation network and relation models iteratively. Pseudo-labels for the segmentation network are inferred with the novel Entropy-gated ConvCRF, which is shown to be superior to random walk under the same conditions. Seg net: Deeplabv2; seg net backbone: Res50; relation model backbone: Res50.
Date: 2019-12-10 22:02:49

Title: Transductive semi-sup, co-train, self-train
Method: UMICH_TCS
Affiliation: University of Michigan Deep Learning Research Group
Contributors: Chuan Cen
Description: A method for the weakly supervised semantic segmentation problem with image-level labels only. The problem is viewed as a semi-supervised learning task; graph-based semi-supervised learning, co-training and self-training are then applied together, achieving state-of-the-art performance.
Date: 2019-11-28 19:09:22

Title: Transductive semi-sup, co-train, self-train
Method: UMICH_TCS_101
Affiliation: University of Michigan Deep Learning Research Group
Contributors: Chuan Cen
Description: A method for the weakly supervised semantic segmentation problem with image-level labels only. The problem is viewed as a semi-supervised learning task; graph-based semi-supervised learning, co-training and self-training are then applied together, achieving state-of-the-art performance.
Date: 2019-12-01 02:40:47

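The iterative train/pseudo-label/retrain scheme common to these UMICH entries follows the generic self-training pattern: fit on the labeled set, add confidently pseudo-labeled unlabeled points, and refit. A toy sketch with a nearest-centroid classifier standing in for the segmentation network (the model, confidence threshold, and data are purely illustrative):

```python
import numpy as np

def self_train(X_lab, y_lab, X_unlab, rounds=5, thresh=0.8):
    """Generic self-training loop: fit a nearest-centroid classifier,
    absorb confidently pseudo-labeled unlabeled points, refit."""
    X, y = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        classes = np.unique(y)
        centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
        # Distance to each centroid; softmax over negative distance
        # serves as a stand-in confidence score.
        d = np.linalg.norm(X_unlab[:, None, :] - centroids[None], axis=2)
        p = np.exp(-d) / np.exp(-d).sum(axis=1, keepdims=True)
        conf, pseudo = p.max(axis=1), classes[p.argmax(axis=1)]
        keep = conf > thresh
        if not keep.any():
            break
        X = np.vstack([X, X_unlab[keep]])
        y = np.concatenate([y, pseudo[keep]])
        X_unlab = X_unlab[~keep]   # ambiguous points stay unlabeled
    return X, y

X_lab = np.array([[0.0], [10.0]])
y_lab = np.array([0, 1])
X_unlab = np.array([[0.5], [9.5], [5.2]])   # last point is ambiguous
X_new, y_new = self_train(X_lab, y_lab, X_unlab)
```

The two points near the centroids are absorbed with their pseudo-labels, while the ambiguous mid-point never clears the confidence threshold.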
Title: WASP for Effective Semantic Segmentation
Method: WASPNet
Affiliation: Rochester Institute of Technology
Contributors: Bruno Artacho and Andreas Savakis, Rochester Institute of Technology
Description: We propose an efficient architecture for semantic segmentation based on an improvement of Atrous Spatial Pyramid Pooling that achieves a considerable accuracy increase while decreasing the number of parameters and the amount of memory required. Current semantic segmentation methods rely either on deconvolutional stages, which inherently require a large number of parameters, or on cascade methods, which give up the larger fields-of-view obtained by parallelization. The proposed Waterfall architecture leverages the progressive information abstraction of cascade architectures while obtaining multi-scale fields-of-view from spatial pyramid configurations. We demonstrate that the Waterfall approach is a robust and efficient architecture for semantic segmentation using ResNet-type networks, obtaining state-of-the-art results with over 20% reduction in the number of parameters and improved performance.
Date: 2019-07-25 20:28:04

Title: Waterfall Atrous Spatial Pooling Arch. for Sem Seg
Method: WASPNet+CRF
Affiliation: Rochester Institute of Technology
Contributors: Bruno Artacho, Andreas Savakis
Description: We propose a new efficient architecture for semantic segmentation based on a "Waterfall" Atrous Spatial Pooling architecture that achieves a considerable accuracy increase while decreasing the number of network parameters and the memory footprint. The proposed Waterfall architecture leverages the efficiency of progressive filtering in the cascade architecture, while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Additionally, our method does not rely on a post-processing stage with Conditional Random Fields, which further reduces complexity and required training time. We demonstrate that the Waterfall approach with a ResNet backbone is a robust and efficient architecture for semantic segmentation, obtaining state-of-the-art results with a significant reduction in the number of parameters on the Pascal VOC and Cityscapes datasets.
Date: 2019-11-19 15:19:18

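The cascaded ("waterfall") use of atrous convolutions described by the WASPNet entries can be illustrated in one dimension: each atrous branch filters the previous branch's output, so fields-of-view compound, unlike a parallel spatial pyramid where every branch sees the same input. This sketch sums branch outputs instead of concatenating feature channels, and the kernel and dilation rates are arbitrary:

```python
import numpy as np

def dilated_conv1d(x, w, rate):
    """'Same'-padded 1-D atrous (dilated) convolution."""
    k = len(w)
    span = rate * (k - 1)          # kernel extent minus one
    xp = np.pad(x, (span // 2, span - span // 2))
    return np.array([sum(w[t] * xp[i + t * rate] for t in range(k))
                     for i in range(len(x))])

def waterfall_block(x, w, rates=(1, 2, 4)):
    """Waterfall-style atrous pooling: each branch filters the previous
    branch's output, compounding the receptive field; branch outputs
    are summed here in place of channel concatenation."""
    branches = []
    h = x
    for r in rates:
        h = dilated_conv1d(h, w, rate=r)
        branches.append(h)
    return np.sum(branches, axis=0)

x = np.zeros(16)
x[8] = 1.0                         # unit impulse probes the field-of-view
y = waterfall_block(x, w=np.array([1.0, 1.0, 1.0]))
```

Probing with an impulse shows the response spreading from 3 samples after the first branch to 15 samples after the third, even though each branch uses the same 3-tap filter.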
Title: FLATTENET
Method: XC-FLATTENET
Affiliation: Sichuan University, China
Contributors: Xin Cai
Description: It is well known that the reduced feature resolution due to repeated subsampling operations poses a serious challenge to Fully Convolutional Network (FCN) based models. In contrast to the commonly-used strategies, such as dilated convolution and encoder-decoder structure, we introduce a novel Flattening Module to produce high-resolution predictions without either removing any subsampling operations or building a complicated decoder module. https://ieeexplore.ieee.org/document/8932465/metrics#metrics
Date: 2020-01-12 02:43:17

Title: bothweight th0.4
Method: bothweight th0.4
Affiliation: Northwestern Polytechnical University
Contributors: Peng Wang, Chunhua Shen
Description: bothweight th0.4 27082
Date: 2019-04-15 10:46:51

Title: comp6_test_cls
Method: comp6_test_cls
Affiliation: comp6_test_cls
Contributors: comp6_test_cls
Description: comp6_test_cls
Date: 2018-05-10 15:54:47

Title: pretrained resnet_101 and ASPP module
Method: deeplabv3_plus_reproduction
Affiliation: Institute of Computing Technology
Contributors: Zhu Lifa
Description: Reproduction of deeplabv3plus with Tensorflow.
Date: 2019-05-11 09:45:53

Title: refinenet_HPM
Method: refinenet_HPM
Affiliation: SJTU
Contributors: gzx
Description: refinenet_HPM
Date: 2019-03-01 09:22:09

Title: weakly_seg_validation_test
Method: weakly_seg_validation_test
Affiliation: NEU
Contributors: Smile Lab
Description: weakly_seg_validation_test
Date: 2019-09-08 01:18:19

Title: weight+RS
Method: weight+RS
Affiliation: Northwestern Polytechnical University
Contributors: Peng Wang, Shunhua Shen
Description: weight+RS
Date: 2019-03-24 08:13:07