VOC2010 PRELIMINARY RESULTS

Key to abbreviations

Classification Results: VOC2010 data

Competition "comp1" (train on VOC2010 data)

Average Precision (AP %)

  aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BONN_FGT_SEGM 88.0 61.6 53.1 63.3 34.8 77.5 72.3 71.1 41.1 56.0 39.6 64.3 68.9 75.4 87.5 32.5 59.3 40.8 78.7 61.4
BUPT_LPBETA_MULTFEAT 82.1 38.6 39.5 46.5 15.5 55.0 46.4 46.5 39.9 21.3 31.2 37.6 45.8 41.4 75.5 15.6 41.7 25.0 62.5 44.3
BUPT_SPM_SC_HOG 79.6 47.0 42.9 52.3 21.3 66.6 50.1 58.7 44.3 21.8 32.7 46.0 49.7 51.7 72.4 13.2 44.1 28.1 61.5 48.8
BUPT_SVM_MULTFEAT 81.1 45.3 47.3 46.3 20.1 42.3 36.4 49.1 37.5 20.6 38.5 43.8 44.9 54.4 68.6 18.0 48.2 26.0 57.7 40.3
BUT_FU_SVM_SIFT 89.7 63.9 64.5 68.3 36.8 77.9 68.5 72.0 57.2 47.2 56.7 63.5 66.8 74.2 85.0 32.8 54.3 49.1 82.6 66.8
CVC_FLAT 89.4 57.6 63.0 68.5 32.0 76.7 64.7 66.9 51.5 48.4 50.0 54.8 63.1 69.9 83.5 33.6 54.8 46.1 82.2 65.9
CVC_PLUS 91.0 61.8 66.7 71.1 37.7 78.9 67.8 72.2 55.8 51.0 55.8 59.4 65.3 73.0 84.0 39.9 56.9 48.5 83.9 68.1
CVC_PLUSDET 91.7 70.0 66.8 71.3 49.0 81.4 77.5 71.2 60.0 52.6 55.7 61.0 70.9 76.7 88.4 43.2 59.7 53.8 84.7 71.3
HIT_PROTOLEARN_2 60.7 22.1 22.7 29.0 15.0 34.9 27.8 31.6 31.9 14.1 17.4 28.9 24.0 20.6 55.8 9.2 22.0 16.8 30.9 24.6
LIG_MSVM_FUSE_CONCEPT 74.4 43.0 37.5 50.4 22.0 60.7 47.1 46.8 47.5 22.2 35.0 42.1 42.9 48.4 73.8 15.6 31.8 28.9 63.8 46.6
LIP6UPMC_KSVM_BASELINE 78.4 54.1 49.9 61.1 24.6 68.3 58.0 59.9 50.7 35.7 42.5 55.0 60.8 63.1 71.1 25.9 51.5 39.9 74.1 59.6
LIP6UPMC_MKL_L1 78.5 55.9 54.6 62.5 25.0 69.3 59.5 60.0 51.3 37.9 46.7 54.0 60.5 64.0 72.8 32.8 52.6 38.5 72.7 61.1
LIP6UPMC_RANKING 78.8 51.3 46.1 58.2 19.5 68.6 55.6 59.4 46.8 30.7 36.0 49.3 52.3 60.0 76.3 17.8 49.1 35.3 66.3 56.6
LIRIS_MKL_TRAINVAL 87.5 57.0 61.7 68.2 29.9 76.6 61.9 67.5 56.9 35.1 50.6 55.1 62.2 69.3 83.6 35.9 52.9 42.7 79.8 66.3
NEC_V1_HOGLBP_NONLIN_SVM 93.3 71.7 69.9 76.9 42.0 85.3 77.4 79.3 60.0 55.8 60.6 71.1 75.7 77.7 86.8 33.5 61.5 55.8 87.5 69.9
NEC_V1_HOGLBP_NONLIN_SVMDET 93.3 72.9 69.9 77.2 47.9 85.6 79.7 79.4 61.7 56.6 61.1 71.1 76.7 79.3 86.8 38.1 63.9 55.8 87.5 72.9
NII_SVMSIFT 69.3 40.3 27.3 44.1 19.5 54.1 23.9 44.4 42.9 20.3 31.1 37.5 36.6 40.5 68.8 9.3 24.6 20.2 55.6 43.9
NLPR_VSTAR_CLS_DICTLEARN 90.3 77.0 65.3 75.0 53.7 85.9 80.4 74.6 62.9 66.2 54.1 66.8 76.1 81.7 89.9 41.6 66.3 57.0 85.0 74.3
NTHU_LINSPARSE_2 77.9 44.0 37.4 48.5 19.0 63.6 49.0 51.0 45.5 27.6 32.1 41.7 46.9 49.7 68.5 13.2 40.3 30.1 61.7 46.3
NUDT_SVM_LDP_SIFT_PMK_SPMK 86.1 59.3 60.2 68.7 28.7 74.8 63.5 68.0 52.5 41.4 47.1 57.5 60.9 68.2 81.5 29.4 52.1 44.5 79.1 4.7
NUDT_SVM_WHGO_SIFT_CENTRIST_LLM 83.5 54.2 55.2 66.8 28.5 72.1 65.4 64.2 51.9 36.1 49.3 55.6 58.0 66.5 82.1 25.3 48.1 41.7 78.4 59.5
NUSPSL_EXCLASSIFIER 91.3 77.0 70.0 75.6 50.7 83.2 77.1 75.4 62.5 62.6 62.7 64.6 77.9 81.8 91.1 44.8 64.2 53.2 86.3 77.1
NUSPSL_KERNELREGFUSING 93.0 79.0 71.6 77.8 54.3 85.2 78.6 78.8 64.5 64.0 62.7 69.6 82.0 84.4 91.6 48.6 64.9 59.6 89.4 76.4
NUSPSL_MFDETSVM 91.9 77.1 69.5 74.7 52.5 84.3 77.3 76.2 63.0 63.5 62.9 65.0 79.5 83.2 91.2 45.5 65.4 55.0 87.0 77.2
RITSU_CBVR_WKF 85.6 57.2 54.9 64.5 29.2 71.2 57.1 63.2 53.9 37.6 49.6 54.7 58.7 67.9 80.1 29.2 52.1 43.5 76.4 60.9
SURREY_MK_KDA 90.6 66.1 67.2 70.6 36.0 79.7 69.8 73.4 58.4 50.7 60.1 65.2 69.8 76.9 87.0 42.5 59.6 49.9 85.2 71.3
TIT_SIFT_GMM_MKL 87.2 56.6 59.6 66.0 32.6 72.7 63.1 64.8 54.6 41.2 49.3 58.8 59.1 68.2 82.9 31.2 49.2 43.2 75.0 63.4
UC3M_GENDISC 85.5 51.6 55.4 64.8 25.9 74.4 60.6 66.0 51.0 45.9 43.9 55.0 59.0 65.2 80.3 24.0 51.4 47.0 76.4 58.6
UVA_BW_NEWCOLOURSIFT 91.5 71.0 67.3 69.9 43.9 80.6 75.3 73.4 59.3 57.8 60.8 64.0 70.6 80.0 88.6 50.8 65.6 56.1 83.0 76.2
UVA_BW_NEWCOLOURSIFT_SRKDA 90.6 66.9 63.4 70.2 49.4 81.8 76.7 70.9 60.0 57.1 60.5 64.5 67.4 79.1 90.2 53.3 63.5 58.0 81.9 74.4
WLU_SPM_EMDIST 75.8 48.9 36.8 44.3 21.2 65.8 52.1 52.1 45.4 28.2 35.0 45.3 47.8 54.2 71.0 14.7 39.8 32.7 62.2 48.0
XRCE_IFV 87.1 59.6 59.9 69.7 31.3 76.4 62.9 64.3 52.5 42.4 55.1 59.7 64.3 70.4 83.9 32.6 53.3 50.4 80.0 67.6
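
For readers who want to see how a single figure in an AP column can be obtained from a ranked list of classifier outputs, the following is a minimal Python sketch. It is not the challenge's evaluation code. It assumes AP is the area under the precision/recall curve after precision has been made monotonically non-increasing in recall, which is our understanding of the VOC2010 definition; the function name `average_precision` and the toy inputs are hypothetical.

```python
def average_precision(scores, labels):
    """Return AP in percent for one class.

    scores: classifier confidences, one per image (higher = more confident).
    labels: 1 if the image truly contains the class, else 0.
    Assumption: AP is the area under the monotone precision/recall curve.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0

    # Rank images by decreasing confidence.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])

    tp = fp = 0
    recall, precision = [], []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        recall.append(tp / n_pos)
        precision.append(tp / (tp + fp))

    # Replace each precision by the maximum precision at any higher rank,
    # which makes the curve monotonically non-increasing in recall.
    for k in range(len(precision) - 2, -1, -1):
        precision[k] = max(precision[k], precision[k + 1])

    # Sum the area of the rectangles under the step curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_recall) * p
        prev_recall = r
    return 100.0 * ap


if __name__ == "__main__":
    # Hypothetical toy ranking: positive, negative, positive, negative.
    print(round(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]), 1))
```

A perfect ranking (all positives ahead of all negatives) gives 100 under this definition, so the values in the table can be read as "percentage of the ideal area".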

Precision/Recall Curves

Classification Results: VOC2010 data

Competition "comp2" (train on own data)

Average Precision (AP %)

  aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BIT_LINSVM_PHOW 59.7 28.8 17.4 29.7 12.5 25.3 28.3 32.2 34.3 15.7 24.5 26.3 31.2 21.5 43.8 7.4 15.6 18.2 37.9 27.3
UCI_LSVM_MDPM_10X - 65.1 - - - 78.1 - - - 43.8 16.9 - 64.0 60.4 - - 53.1 25.0 - 58.7
XRCE_IFV_1M 92.7 68.0 69.0 79.9 29.3 81.4 60.0 78.0 45.0 62.9 31.6 69.2 71.2 78.6 78.0 34.0 67.3 - 82.7 -
XRCE_IFV_FUSE_OPT 92.7 68.4 68.5 80.4 38.2 81.8 66.9 77.8 55.0 62.1 56.5 70.1 71.4 79.4 85.0 40.0 67.2 51.8 84.6 67.6

Precision/Recall Curves

Detection Results: VOC2010 data

Competition "comp3" (train on VOC2010 data)

Average Precision (AP %)

  aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BONN_FGT_SEGM 52.7 33.7 13.2 11.0 14.2 43.1 31.9 35.6 5.7 25.4 14.4 20.6 38.1 41.7 25.0 5.8 26.3 18.1 37.6 28.1
BONN_SVR_SEGM 50.5 24.4 17.1 13.3 10.9 39.5 32.9 36.5 5.6 16.0 6.6 22.3 24.9 29.0 29.8 6.7 28.4 13.3 32.1 27.2
CMIC_SYNTHTRAIN - 28.9 - - - 30.2 13.3 - - - - - 26.2 28.1 13.2 - - - 18.8 25.7
CMIC_VARPARTS - 28.2 - - - 26.9 13.7 - - - - - 23.5 24.7 16.1 - - - 18.8 24.5
CMU_RANDPARTS 23.8 31.7 1.2 3.4 11.1 29.7 19.5 14.2 0.8 11.1 7.0 4.7 16.4 31.5 16.0 1.1 15.6 10.2 14.7 21.0
CMU_RANDPARTS_MAXSCORE - - 2.7 - - - - 16.2 - 10.6 8.5 - - - 17.9 - - - 15.7 -
LJKINPG_HOG_LBP_LTP_PLS2ROOTS 32.7 29.7 0.8 1.1 19.8 39.4 27.5 8.6 4.5 8.1 6.3 11.0 22.9 34.1 24.6 3.1 24.0 2.0 23.5 27.0
MITUCLA_HIERARCHY 54.2 48.5 15.7 19.2 29.2 55.5 43.5 41.7 16.9 28.5 26.7 30.9 48.3 55.0 41.7 9.7 35.8 30.8 47.2 40.8
NLPR_HOGLBP_MC_LCEGCHLC 53.3 55.3 19.2 21.0 30.0 54.4 46.7 41.2 20.0 31.5 20.7 30.3 48.6 55.3 46.5 10.2 34.4 26.5 50.3 40.3
NUS_HOGLBP_CTX_CLS_RESCORE_V2 49.1 52.4 17.8 12.0 30.6 53.5 32.8 37.3 17.7 30.6 27.7 29.5 51.9 56.3 44.2 9.6 14.8 27.9 49.5 38.4
TIT_SIFT_GMM_MKL 10.5 1.6 1.2 0.9 0.1 2.8 1.6 6.7 0.1 2.0 0.4 3.0 2.0 4.4 2.0 0.3 1.1 1.2 2.1 1.9
TIT_SIFT_GMM_MKL2 20.0 14.5 3.8 1.2 0.5 17.6 8.1 28.5 0.1 2.9 3.1 17.5 7.2 18.8 3.3 0.8 2.9 6.3 7.6 1.1
UC3M_GENDISC 15.8 5.5 5.6 2.3 0.3 10.2 5.4 12.6 0.5 5.6 4.5 7.7 11.3 12.6 5.3 1.5 2.0 5.9 9.1 3.2
UCI_DPM_SP 46.1 52.6 13.8 15.5 28.3 53.2 44.5 26.6 17.6 - 16.1 20.4 45.5 51.2 43.5 11.6 30.9 20.3 47.6 -
UMNECUIUC_HOGLBP_DHOGBOW_SVM 40.4 34.7 2.7 8.4 26.0 43.1 33.8 17.2 11.2 14.3 14.4 14.9 31.8 37.3 30.0 6.4 25.2 11.6 30.0 35.7
UMNECUIUC_HOGLBP_LINSVM 37.9 33.7 2.7 6.5 25.3 37.5 33.1 15.5 10.9 12.3 12.5 13.7 29.6 34.5 33.8 7.2 22.9 9.9 28.9 34.1
UOCTTI_LSVM_MDPM 52.4 54.3 13.0 15.6 35.1 54.2 49.1 31.8 15.5 26.2 13.5 21.5 45.4 51.6 47.5 9.1 35.1 19.4 46.6 38.0
UVA_DETMONKEY 56.7 39.8 16.8 12.2 13.8 44.9 36.9 47.7 12.1 26.9 26.5 37.2 42.1 51.9 25.7 12.1 37.8 33.0 41.5 41.7
UVA_GROUPLOC 58.4 39.6 18.0 13.3 11.1 46.4 37.8 43.9 10.3 27.5 20.8 36.0 39.4 48.5 22.9 13.0 36.8 30.5 41.2 41.9
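
Detection AP depends on first deciding which predicted boxes count as correct. The sketch below illustrates the commonly cited PASCAL VOC criterion, under which a predicted box matches a ground-truth box when the area of their intersection divided by the area of their union exceeds 50%. This is an assumption about the evaluation protocol, not something stated on this page; the helper names `box_overlap` and `is_correct_detection` are hypothetical, and boxes are taken to be (xmin, ymin, xmax, ymax) in inclusive pixel coordinates.

```python
def box_overlap(a, b):
    """Intersection over union of two boxes (xmin, ymin, xmax, ymax).

    Coordinates are treated as inclusive pixel indices, hence the +1 terms.
    """
    inter_w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    inter_h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not intersect

    inter = inter_w * inter_h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def is_correct_detection(predicted, ground_truth, threshold=0.5):
    """Assumed criterion: overlap must strictly exceed the threshold."""
    return box_overlap(predicted, ground_truth) > threshold


if __name__ == "__main__":
    truth = (1, 1, 10, 10)
    print(is_correct_detection((2, 1, 10, 10), truth))   # large overlap
    print(is_correct_detection((6, 1, 15, 10), truth))   # half-shifted box
```

Under this assumed protocol, correct and incorrect detections are then ranked by confidence and summarised by AP in the same way as the classification outputs.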

Precision/Recall Curves

Detection Results: VOC2010 data

Competition "comp4" (train on own data)

Average Precision (AP %)

  aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BERKELEY_POSELETS 33.2 51.9 8.5 8.2 34.8 39.0 48.8 22.2 - 20.6 - 18.5 48.2 44.1 48.5 9.1 28.0 13.0 22.5 33.0
CVITVGG_HEADDETSEG - - - - - - - 41.7 - - - - - - - - - - - -
UCI_LSVM_MDPM_10X - 48.1 - - - 54.7 - - - 25.1 6.0 - 46.6 41.1 - - 31.2 17.7 - 32.3

Precision/Recall Curves

Segmentation Results (VOC2010 data)

Competition "comp5" (train on VOC2010 data)

Accuracy (%)

- Entries in parentheses are synthesized from detection results.

  [mean] background aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BONN_FGT_SEGM 36.5 82.5 54.6 22.5 25.1 27.6 40.0 60.2 48.3 39.4 7.3 30.8 21.3 25.3 34.9 54.1 36.6 22.5 45.0 17.6 33.5 37.0
BONN_SVR_SEGM 39.7 84.2 52.5 27.4 32.3 34.5 47.4 60.6 54.8 42.6 9.0 32.9 25.2 27.1 32.4 47.1 38.3 36.8 50.3 21.9 35.2 40.9
BROOKES_AHCRF 30.3 70.1 31.0 18.8 19.5 23.9 31.3 53.5 45.3 24.4 8.2 31.0 16.4 15.8 27.3 48.1 31.1 31.0 27.5 19.8 34.8 26.4
CVC_HARMONY 35.4 80.8 56.7 20.6 31.0 33.9 20.8 57.6 51.4 35.8 7.1 28.1 22.6 24.3 29.3 49.4 37.8 23.3 37.6 18.1 45.6 30.7
CVC_HARMONY_DET 40.1 81.1 58.3 23.1 39.0 37.8 36.4 63.2 62.4 31.9 9.1 36.8 24.6 29.4 37.5 60.6 44.9 30.1 36.8 19.4 44.1 35.9
STANFORD_REGLABEL 29.1 80.0 38.8 21.5 13.6 9.2 31.1 51.8 44.4 25.7 6.7 26.0 12.5 12.8 31.0 41.9 44.4 5.7 37.5 10.0 33.2 32.3
UC3M_GENDISC 27.8 73.4 45.9 12.3 14.5 22.3 9.3 46.8 38.3 41.7 0.0 35.9 20.7 34.1 34.8 33.5 24.6 4.7 25.6 13.0 26.8 26.1
UOCTTI_LSVM_MDPM 31.8 80.0 36.7 23.9 20.9 18.8 41.0 62.7 49.0 21.5 8.3 21.1 7.0 16.4 28.2 42.5 40.5 19.6 33.6 13.3 34.1 48.5
(CMU_RANDPARTS) 12.2 4.3 7.9 10.1 4.7 5.4 20.5 38.9 12.1 11.9 1.7 15.1 3.8 5.7 12.5 25.4 7.6 2.7 19.7 12.1 11.8 23.1
(LJKINPG_HOG_LBP_LTP_PLS2ROOTS) 6.7 0.5 18.6 10.2 0.0 0.0 12.5 26.1 10.2 1.5 0.5 0.0 0.0 3.9 2.9 7.7 5.6 0.0 2.9 0.8 8.5 28.0
(MITUCLA_HIERARCHY) 15.4 0.5 14.2 6.6 10.5 5.9 39.9 30.9 26.8 21.8 4.2 12.0 15.5 10.8 18.1 24.1 11.7 8.0 21.0 11.5 16.0 13.5
(NLPR_HOGLBP_MC_LCEGCHLC) 14.4 2.2 9.6 7.3 11.5 5.8 10.8 39.8 24.4 18.5 5.3 12.4 10.5 15.0 15.5 20.2 21.8 3.2 22.0 7.7 20.0 17.9
(NUS_HOGLBP_CTX_CLS_RESCORE_V2) 9.5 1.4 7.9 12.0 4.2 7.0 4.3 42.5 27.6 2.9 0.3 19.5 5.9 2.4 10.5 16.3 1.5 2.0 0.1 6.0 14.2 11.2
(TIT_SIFT_GMM_MKL) 12.3 13.7 12.7 7.5 6.8 6.9 18.7 29.3 14.5 16.9 6.2 7.4 11.0 11.6 10.4 17.0 9.7 6.4 11.6 8.1 14.3 18.6
(TIT_SIFT_GMM_MKL2) 14.9 21.7 16.6 10.5 9.3 13.8 17.9 42.2 14.9 17.6 6.3 9.9 2.5 11.1 12.5 20.7 8.5 6.2 21.8 7.6 19.9 20.5
(UMNECUIUC_HOGLBP_DHOGBOW_SVM) 11.1 4.8 6.9 6.7 3.2 4.7 20.8 37.3 13.6 10.4 3.4 12.6 8.4 5.8 8.4 14.8 12.0 5.1 14.5 7.3 13.8 19.1
(UMNECUIUC_HOGLBP_LINSVM) 9.7 2.8 4.3 9.5 1.3 0.5 21.9 33.1 17.0 6.8 3.2 1.7 4.6 5.1 8.6 14.5 8.1 1.0 15.6 5.6 14.0 25.3
(UVA_DETMONKEY) 14.7 2.4 13.8 7.8 3.8 6.3 18.5 48.8 36.2 19.2 3.8 11.0 4.5 14.5 10.7 23.6 11.1 6.4 22.7 10.1 14.8 19.9
(UVA_GROUPLOC) 13.8 3.1 15.9 7.3 6.6 5.5 14.4 43.9 24.0 19.0 3.1 11.8 6.2 11.4 11.3 25.0 13.2 3.3 17.9 9.0 15.7 21.8
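
The per-class accuracy figures in the segmentation tables can be illustrated with a short Python sketch. It assumes the intersection-over-union definition used by the PASCAL VOC segmentation task, i.e. true positives divided by the sum of true positives, false positives and false negatives, counted over pixels, and it assumes that pixels marked with a "void" label are ignored. The function name, the flat-list image representation and the void value 255 are hypothetical choices for illustration, not the challenge's evaluation code.

```python
def segmentation_accuracy(truth, pred, num_classes, void_label=255):
    """Return (per_class, mean) accuracies in percent.

    truth, pred: flat sequences of integer class labels, one per pixel.
    Assumed measure per class: TP / (TP + FP + FN).
    Pixels whose ground-truth label equals void_label are skipped.
    """
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes

    for t, p in zip(truth, pred):
        if t == void_label:
            continue
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # pixel of class t was missed
            fp[p] += 1  # pixel was wrongly claimed by class p

    per_class = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        per_class.append(100.0 * tp[c] / denom if denom else 0.0)

    return per_class, sum(per_class) / num_classes


if __name__ == "__main__":
    # Hypothetical five-pixel image with two classes and one void pixel.
    per_class, mean = segmentation_accuracy(
        truth=[0, 0, 1, 1, 255], pred=[0, 1, 1, 1, 0], num_classes=2
    )
    print([round(v, 1) for v in per_class], round(mean, 1))
```

Because false positives enter the denominator, a method that labels large regions with one class is penalised for that class even if it finds every true pixel.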

Segmentation Results (VOC2010 data)

Competition "comp6" (train on own data)

Accuracy (%)

- Entries in parentheses are synthesized from detection results.

  [mean] background aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BERKELEY_POSELETS_ALIGN_PB 34.7 82.0 49.7 23.3 20.6 19.0 47.1 58.1 53.6 32.5 0.0 31.1 0.0 29.5 42.9 41.9 43.8 16.6 39.0 18.4 38.0 41.5

Person Layout Results: VOC2010 data

Competition "comp7" (train on VOC2010 data)

Average Precision (AP %)

  Head Hand Foot

Precision/Recall Curves

Person Layout Results: VOC2010 data

Competition "comp8" (train on own data)

Average Precision (AP %)

  Head Hand Foot
BCNPCL_HumanLayout 74.4 3.3 1.2
OXFORD_SBD 52.7 10.4 0.0

Precision/Recall Curves

Action Classification Results: VOC2010 data

Competition "comp9" (train on VOC2010 data)

Average Precision (AP %)

  phoning playinginstrument reading ridingbike ridinghorse running takingphoto usingcomputer walking
BONN_ACTION 47.5 51.1 31.9 64.5 69.1 78.5 32.4 53.9 61.1
CVC_BASE 56.2 56.5 34.7 75.1 83.6 86.5 25.4 60.0 69.2
CVC_SEL 49.8 52.8 34.3 74.2 85.5 85.1 24.9 64.1 72.5
INRIA_SPM_HT 53.2 53.6 30.2 78.2 88.4 84.6 30.4 60.9 61.8
NUDT_SVM_WHGO_SIFT_CENTRIST_LLM 47.2 47.9 24.5 74.2 81.0 79.5 24.9 58.6 71.5
SURREY_MK_KDA 52.6 53.5 35.9 81.0 89.3 86.5 32.8 59.2 68.6
UCLEAR_SVM_DOSP_MULTFEATS 47.0 57.8 26.9 78.8 89.7 87.3 32.5 60.0 70.1
UMCO_DHOG_KSVM 53.5 43.0 32.0 67.9 68.8 83.0 34.1 45.9 60.4
WILLOW_A_SVMSIFT_1-A_LSVM 49.2 37.7 22.2 73.2 77.1 81.7 24.3 53.7 56.9
WILLOW_LSVM 40.4 29.9 32.2 53.5 62.2 73.6 17.6 45.8 41.5
WILLOW_SVMSIFT 47.9 29.1 21.7 53.5 76.7 78.3 26.0 42.9 56.4

Precision/Recall Curves

Action Classification Results: VOC2010 data

Competition "comp10" (train on own data)

Average Precision (AP %)

  phoning playinginstrument reading ridingbike ridinghorse running takingphoto usingcomputer walking
BERKELEY_POSELETS_ACTION 45.9 45.8 23.7 79.9 87.6 83.1 26.2 44.9 66.6

Precision/Recall Curves

Classification Results: VOC2009 data

Competition "comp1" (train on VOC2010 data)

Average Precision (AP %)

  aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BONN_FGT_SEGM 84.4 61.2 56.7 61.0 38.5 73.6 68.6 69.4 44.6 55.9 39.9 58.5 70.1 73.0 86.9 35.3 53.2 42.3 78.9 61.6
BUPT_LPBETA_MULTFEAT 79.1 41.5 45.5 45.5 22.3 52.7 44.2 44.1 41.3 26.4 34.2 33.4 47.4 42.7 77.0 16.6 37.5 30.0 65.5 43.8
BUPT_SPM_SC_HOG 77.3 47.4 48.0 51.7 24.6 62.6 46.6 55.3 46.1 27.7 37.0 43.0 52.8 51.8 73.8 19.0 39.4 33.7 64.1 49.8
BUPT_SVM_MULTFEAT 78.4 47.4 51.9 46.1 20.4 39.4 35.1 49.3 38.5 24.7 41.4 39.3 48.3 53.2 71.9 18.9 42.5 29.7 63.4 41.2
BUT_FU_SVM_SIFT 85.2 63.0 67.0 66.2 37.6 73.0 65.1 68.4 57.0 48.8 55.5 56.7 67.7 71.4 84.3 33.4 50.0 51.1 82.6 65.8
CVC_FLAT 85.5 57.4 65.8 66.2 34.5 71.9 61.2 64.3 51.8 48.5 50.1 48.2 64.6 66.7 83.2 34.4 49.5 48.6 82.8 65.6
CVC_PLUS 87.3 61.1 68.7 69.2 39.5 74.3 64.2 69.1 55.8 52.1 55.3 52.4 67.8 70.1 83.8 39.2 51.3 50.3 84.0 67.5
CVC_PLUSDET 88.6 68.7 68.6 68.3 50.2 76.1 74.0 68.6 60.3 53.2 56.2 54.6 72.1 73.9 87.2 42.9 54.1 53.6 84.7 70.5
HIT_PROTOLEARN_2 60.0 23.1 29.1 26.5 17.0 31.9 28.7 27.1 31.5 15.0 16.8 23.1 27.0 21.5 60.8 9.5 17.3 18.9 32.6 24.6
LIG_MSVM_FUSE_CONCEPT 71.9 44.0 42.9 50.2 28.1 59.5 44.1 45.4 48.3 23.8 35.2 38.5 46.6 47.8 75.1 20.2 29.7 33.5 66.0 47.9
LIP6UPMC_KSVM_BASELINE 76.4 52.8 54.1 59.8 26.0 64.1 53.9 56.2 52.0 39.6 43.5 49.9 62.7 61.7 72.9 27.4 44.3 43.2 75.7 59.9
LIP6UPMC_MKL_L1 75.7 55.6 57.6 60.5 29.5 64.3 55.6 57.6 51.7 42.3 46.5 47.4 62.6 62.3 74.5 32.3 46.2 42.4 74.3 61.6
LIP6UPMC_RANKING 76.3 50.7 48.8 56.4 24.9 64.8 51.3 55.7 48.7 34.0 37.9 44.5 54.8 58.5 77.4 17.1 42.1 38.7 68.8 56.6
LIRIS_MKL_TRAINVAL 83.2 56.9 64.1 65.6 33.2 70.6 58.0 64.4 56.9 37.8 50.1 48.6 64.3 66.1 83.3 36.5 46.4 47.1 80.5 65.5
NEC_V1_HOGLBP_NONLIN_SVM 89.6 71.0 70.8 74.1 42.7 80.4 73.5 75.8 60.0 59.0 60.5 64.4 75.4 74.0 86.2 35.6 55.5 57.2 86.7 68.1
NEC_V1_HOGLBP_NONLIN_SVMDET 89.6 71.7 70.8 74.3 48.9 80.5 75.7 76.0 61.2 59.5 61.1 64.5 76.3 75.7 86.2 39.9 57.6 57.2 86.7 71.1
NII_SVMSIFT 66.8 42.9 32.8 43.8 20.6 52.4 26.6 41.9 44.2 21.3 30.2 34.1 40.7 41.0 71.5 17.3 18.4 24.6 57.8 45.6
NLPR_VSTAR_CLS_DICTLEARN 86.6 74.2 67.7 71.9 54.4 81.1 76.3 71.7 62.4 65.8 55.8 60.9 76.1 77.7 88.3 43.5 59.8 57.7 85.4 72.0
NTHU_LINSPARSE_2 75.8 43.9 42.1 47.4 19.8 59.2 46.8 47.5 46.1 34.1 35.3 39.4 50.3 49.9 70.9 16.1 34.1 34.0 62.7 47.9
NUDT_SVM_LDP_SIFT_PMK_SPMK 83.0 59.3 62.9 66.7 31.7 70.2 58.8 65.6 54.0 44.7 47.1 50.4 63.7 65.2 81.6 31.1 45.4 47.1 79.3 5.0
NUDT_SVM_WHGO_SIFT_CENTRIST_LLM 80.8 54.7 59.0 65.0 31.7 67.6 61.5 61.8 51.9 40.0 50.2 49.0 60.4 63.6 81.9 27.5 43.5 44.4 79.0 59.7
NUSPSL_EXCLASSIFIER 86.8 74.8 71.1 72.4 53.2 77.8 73.8 73.1 62.0 61.0 62.0 59.7 77.5 78.3 89.5 45.6 60.7 53.5 86.9 74.2
NUSPSL_KERNELREGFUSING 88.1 77.1 73.2 74.8 55.3 80.4 74.6 76.3 63.3 64.0 62.4 64.2 81.3 79.9 89.8 47.7 60.2 59.2 87.6 73.8
NUSPSL_MFDETSVM 88.3 74.8 71.1 71.6 54.1 79.2 74.2 73.2 62.5 62.7 61.7 59.4 78.7 79.0 89.5 45.6 60.1 55.4 86.2 74.6
RITSU_CBVR_WKF 82.0 57.0 58.6 62.6 33.2 66.2 53.3 60.4 54.3 40.3 47.5 48.4 61.5 65.6 80.6 30.9 47.3 45.4 77.7 59.5
SURREY_MK_KDA 86.4 65.3 68.3 68.3 38.1 74.6 66.3 70.7 58.2 53.3 57.6 58.0 70.7 73.3 86.0 41.5 52.8 53.2 85.5 69.9
TIT_SIFT_GMM_MKL 83.6 56.8 62.2 64.2 36.0 69.6 59.0 61.9 54.3 45.7 49.5 53.5 61.7 66.8 83.0 32.6 43.1 45.2 76.6 61.9
UC3M_GENDISC 81.7 52.4 58.3 63.5 29.5 69.5 56.6 63.5 50.6 47.5 44.9 50.4 61.9 62.1 80.1 24.6 47.8 47.6 78.1 58.5
UVA_BW_NEWCOLOURSIFT 88.1 69.4 68.5 67.5 45.3 75.0 71.7 69.4 58.8 58.5 58.9 57.6 72.1 75.1 87.3 49.7 59.2 56.6 83.2 75.2
UVA_BW_NEWCOLOURSIFT_SRKDA 86.7 65.7 64.6 67.0 50.2 77.2 72.9 67.6 60.0 57.6 59.1 57.6 70.2 73.8 88.5 52.9 57.1 58.2 83.4 72.8
WLU_SPM_EMDIST 73.2 49.2 42.6 44.7 26.5 61.9 48.9 49.1 46.7 31.7 39.1 42.5 51.9 53.4 73.0 16.9 34.7 37.4 64.3 48.8
XRCE_IFV 83.2 59.4 62.2 67.1 34.1 70.7 58.4 61.1 52.4 45.8 52.5 53.5 66.3 66.9 83.5 32.7 46.2 53.2 80.4 66.3

Precision/Recall Curves

Classification Results: VOC2009 data

Competition "comp2" (train on own data)

Average Precision (AP %)

  aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BIT_LINSVM_PHOW 64.7 31.6 22.9 33.2 13.9 28.2 27.7 34.3 38.4 20.0 30.0 28.9 38.3 24.9 51.0 7.4 15.5 25.7 42.5 29.1
UCI_LSVM_MDPM_10X - 64.3 - - - 73.8 - - - 44.8 21.6 - 66.1 59.8 - - 47.3 29.3 - 58.0
XRCE_IFV_1M 88.0 68.2 69.6 76.0 30.7 75.4 55.6 74.8 46.7 62.9 35.7 62.2 72.7 74.2 79.1 34.6 61.9 - 82.6 -
XRCE_IFV_FUSE_OPT 88.0 68.6 69.6 76.8 40.4 76.8 62.7 74.6 54.8 62.5 54.4 63.2 72.3 75.1 84.6 38.4 61.8 54.5 84.2 66.3

Precision/Recall Curves

Detection Results: VOC2009 data

Competition "comp3" (train on VOC2010 data)

Average Precision (AP %)

  aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BONN_FGT_SEGM 52.7 35.7 17.0 15.4 17.4 40.9 30.5 37.9 11.5 26.6 19.6 22.5 41.2 40.5 28.0 12.2 26.8 21.4 42.5 29.4
BONN_SVR_SEGM 50.3 25.2 21.0 18.0 13.1 39.3 31.1 39.3 10.3 17.4 11.8 23.2 26.0 26.2 31.4 11.8 28.8 16.7 36.7 28.9
CMIC_SYNTHTRAIN - 30.9 - - - 28.8 17.2 - - - - - 30.0 30.1 18.6 - - - 25.0 27.0
CMIC_VARPARTS - 29.8 - - - 28.2 16.7 - - - - - 27.7 28.1 20.7 - - - 24.5 25.8
CMU_RANDPARTS 27.4 33.0 2.4 11.0 15.1 30.1 21.4 16.0 0.9 14.8 9.3 4.3 23.7 32.6 21.1 2.6 18.2 14.2 19.3 22.7
CMU_RANDPARTS_MAXSCORE - - 7.2 - - - - 20.0 - 16.6 15.0 - - - 22.2 - - - 22.5 -
LJKINPG_HOG_LBP_LTP_PLS2ROOTS 34.0 31.6 4.8 5.8 22.8 36.8 26.7 13.6 10.6 14.8 13.7 13.8 27.2 35.6 27.3 10.0 23.9 9.4 28.8 28.3
MITUCLA_HIERARCHY 54.3 48.1 18.4 21.0 31.7 50.8 40.7 41.6 19.8 29.0 32.2 29.3 50.8 52.3 43.0 13.9 35.4 31.1 51.5 39.4
NLPR_HOGLBP_MC_LCEGCHLC 52.2 54.5 21.9 23.6 32.3 48.8 44.4 41.1 23.6 33.1 25.7 29.6 50.7 52.3 47.0 15.8 33.4 30.6 53.2 39.2
NUS_HOGLBP_CTX_CLS_RESCORE_V2 49.1 52.2 21.2 16.3 33.5 49.0 32.4 37.7 21.5 28.9 32.2 27.9 53.7 54.1 45.3 14.2 20.0 29.9 53.6 37.8
TIT_SIFT_GMM_MKL 17.0 1.9 1.4 0.9 0.2 4.6 1.3 7.2 0.4 3.6 0.5 2.5 3.3 4.6 10.1 9.1 2.8 9.6 10.5 2.5
TIT_SIFT_GMM_MKL2 24.4 18.0 6.3 2.9 4.5 19.7 11.5 28.4 0.3 6.0 5.6 18.0 10.3 18.1 10.7 9.1 9.1 10.1 9.1 4.5
UC3M_GENDISC 24.2 10.3 11.6 9.1 9.1 17.2 11.1 26.1 9.1 11.5 11.7 17.6 18.0 19.4 12.6 9.1 9.4 12.1 16.7 10.1
UCI_DPM_SP 46.2 52.4 16.8 17.8 31.6 49.0 42.3 27.7 20.9 - 21.5 20.6 48.2 49.4 44.5 15.2 29.7 20.7 50.8 -
UMNECUIUC_HOGLBP_DHOGBOW_SVM 41.8 37.4 9.8 12.6 28.5 39.4 33.3 20.3 15.3 19.0 19.7 16.9 34.9 38.1 34.0 10.3 24.7 15.7 33.4 34.5
UMNECUIUC_HOGLBP_LINSVM 39.2 35.9 9.8 9.4 27.6 34.6 32.1 17.9 14.9 17.5 18.2 16.0 33.6 35.8 35.8 10.9 22.6 14.9 33.5 34.2
UOCTTI_LSVM_MDPM 49.9 52.7 16.6 18.5 37.1 50.0 47.3 31.8 19.3 28.9 18.8 20.9 48.2 49.9 47.9 13.6 32.4 20.9 49.4 37.3
UVA_DETMONKEY 56.3 42.2 20.1 17.4 17.4 42.3 36.1 46.3 17.6 29.6 31.5 34.8 44.4 50.0 28.3 16.9 35.7 36.0 45.5 40.9
UVA_GROUPLOC 57.6 40.0 22.1 17.9 15.0 43.7 36.5 45.4 15.7 27.5 25.3 33.3 42.6 46.3 26.1 17.1 35.6 32.7 46.8 40.5

Precision/Recall Curves

Detection Results: VOC2009 data

Competition "comp4" (train on own data)

Average Precision (AP %)

  aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BERKELEY_POSELETS 33.0 51.2 12.0 12.3 37.5 37.3 46.9 26.7 - 25.2 - 21.1 50.9 44.6 48.6 14.6 25.9 17.2 27.3 32.5
CVITVGG_HEADDETSEG - - - - - - - 42.8 - - - - - - - - - - - -
UCI_LSVM_MDPM_10X - 47.6 - - - 49.8 - - - 26.8 11.8 - 49.2 40.4 - - 30.0 21.6 - 32.9

Precision/Recall Curves

Segmentation Results (VOC2009 data)

Competition "comp5" (train on VOC2010 data)

Accuracy (%)

- Entries in parentheses are synthesized from detection results.

  [mean] background aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BONN_FGT_SEGM 36.3 82.6 61.1 21.1 22.9 30.2 36.8 50.0 50.9 33.9 7.9 27.9 22.4 21.5 38.3 52.0 37.8 25.0 44.3 20.4 35.6 39.0
BONN_SVR_SEGM 38.9 84.4 52.8 25.3 31.5 39.5 45.7 50.7 55.5 35.5 9.8 31.7 26.4 19.0 34.7 40.2 39.7 39.6 48.1 26.9 37.4 42.9
BROOKES_AHCRF 31.0 70.2 30.7 19.0 19.5 30.2 36.9 50.0 43.9 21.9 7.5 31.7 16.5 17.3 30.6 45.6 32.1 32.9 29.2 21.4 35.5 29.5
CVC_HARMONY 34.5 80.6 56.5 21.1 30.1 38.5 22.5 50.0 49.1 30.6 6.2 30.3 22.8 18.9 30.0 44.0 37.6 20.8 35.0 21.2 46.4 32.9
CVC_HARMONY_DET 39.7 80.9 62.1 22.8 37.0 42.1 37.9 58.1 62.3 25.9 7.9 39.5 25.4 23.2 39.0 56.6 44.8 29.9 34.5 20.1 45.5 37.9
STANFORD_REGLABEL 29.0 80.1 39.4 21.5 16.0 9.8 32.4 44.5 43.7 21.2 6.8 24.1 13.8 12.0 34.3 38.6 45.9 3.9 40.7 12.2 33.6 34.4
UC3M_GENDISC 26.9 73.5 48.3 11.3 11.8 26.5 12.6 40.3 41.4 37.4 0.0 31.9 20.7 28.6 31.8 29.1 23.3 1.4 25.7 15.9 25.8 26.8
UOCTTI_LSVM_MDPM 32.0 80.8 42.1 23.6 22.5 21.6 43.6 59.7 52.0 12.7 6.6 19.6 6.1 15.7 29.6 43.3 41.4 24.0 27.3 16.6 34.2 49.6
(CMU_RANDPARTS) 12.4 4.1 7.6 10.2 5.2 5.7 19.0 33.4 10.5 8.9 1.4 17.3 4.1 5.2 13.5 27.7 7.8 3.4 22.3 14.5 13.2 25.4
(LJKINPG_HOG_LBP_LTP_PLS2ROOTS) 6.8 0.6 20.7 10.8 0.0 0.0 14.8 23.8 10.0 1.1 0.5 0.0 0.0 2.7 3.3 7.1 5.5 0.0 3.3 1.0 8.3 29.9
(MITUCLA_HIERARCHY) 14.9 0.6 12.5 6.6 10.6 7.0 43.1 25.6 24.0 18.0 3.2 11.7 15.8 9.7 20.3 20.7 11.6 8.1 19.2 11.9 16.1 16.2
(NLPR_HOGLBP_MC_LCEGCHLC) 13.8 2.3 7.8 7.4 11.9 6.8 11.4 34.6 21.0 15.1 4.9 14.5 11.0 12.9 16.2 17.0 22.3 3.2 20.6 8.3 20.1 21.8
(NUS_HOGLBP_CTX_CLS_RESCORE_V2) 9.6 1.4 7.5 12.9 5.1 8.1 4.4 37.4 28.6 2.2 0.4 18.0 3.8 0.6 10.8 19.9 1.6 1.9 0.2 7.5 16.5 12.9
(TIT_SIFT_GMM_MKL) 11.4 13.3 10.6 6.8 6.5 8.4 18.1 25.4 12.7 13.1 5.2 7.1 11.9 8.3 12.2 10.5 9.1 6.0 11.4 9.6 14.8 19.3
(TIT_SIFT_GMM_MKL2) 13.8 21.2 14.3 9.9 8.1 14.7 18.4 40.1 12.8 11.9 5.3 8.1 2.7 8.8 12.4 12.8 8.1 7.4 21.1 8.7 20.8 21.5
(UMNECUIUC_HOGLBP_DHOGBOW_SVM) 10.8 4.9 5.9 6.5 3.5 5.8 20.5 34.0 12.8 7.0 2.8 12.6 9.2 3.6 9.8 15.5 11.6 5.6 13.0 7.4 14.5 21.6
(UMNECUIUC_HOGLBP_LINSVM) 9.8 3.0 3.6 9.6 1.7 0.4 21.2 34.7 16.6 3.2 2.4 1.9 5.0 3.7 9.6 16.5 8.1 1.2 14.6 6.3 14.9 27.3
(UVA_DETMONKEY) 14.0 2.4 11.3 7.4 3.9 7.4 19.8 44.3 33.8 13.9 3.4 11.5 5.3 10.8 11.2 21.1 10.7 6.7 21.6 11.2 15.3 21.7
(UVA_GROUPLOC) 12.9 3.1 14.5 6.8 6.8 6.3 13.5 36.9 20.9 14.6 2.3 12.0 7.0 9.7 13.1 22.5 13.1 2.9 15.8 9.8 15.8 24.1

Segmentation Results (VOC2009 data)

Competition "comp6" (train on own data)

Accuracy (%)

- Entries in parentheses are synthesized from detection results.

  [mean] background aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tv/monitor
BERKELEY_POSELETS_ALIGN_PB 36.0 82.9 61.2 24.2 22.9 19.9 47.6 52.8 54.0 30.5 0.0 30.9 0.0 27.8 46.0 50.2 45.8 16.7 38.8 22.5 39.8 41.8

Key to Abbreviations

Abbreviation | Title | Method | Affiliation | Contributors | Description
BCNPCL_HumanLayout | BCNPCL_Human_Layout | Combining detectors for human layout analysis | Dept. de Matemàtica Aplicada i Anàlisi, Facultat de Matemàtiques, Computer Vision Center Barcelona and Universitat Oberta de Catalunya | M. Drozdzal, A. Hernández, S. Seguí, X. Baró, S. Escalera, A. Lapedriza, D. Masip, P. Radeva, J. Vitrià | Combination of several detectors for a complex human layout detector. For details see BCNPCL_PASCAL2010.pptx
BERKELEY_POSELETS | BERKELEY POSELETS | Multiclass poselets | UC Berkeley / Adobe | Lubomir Bourdev, Subhransu Maji, Thomas Brox, Jitendra Malik | Poselets based on Bourdev et al ECCV 2010, extended for multiple categories.
BERKELEY_POSELETS_ACTION | BERKELEY_POSELETS_ACTION | Poselets trained on action categories | University of California, Berkeley | Subhransu Maji, Lubomir Bourdev, Jitendra Malik | Discriminatively selected poselets for action classification + context from object detections and other actions in the image.
BERKELEY_POSELETS_ALIGN_PB | Berkeley-poselets-align-pb | Segmentation on poselet detections | UC Berkeley | Thomas Brox, Lubomir Bourdev, Subhransu Maji, Jitendra Malik | Non-rigid alignment of poselet activations to UCM edges, filling of area between object edges by variational smoothing, competition among objects, and competitive refinement of masks
BIT_LINSVM_PHOW | LinearSVM-PHOW | LinearSVM-PHOW | Beijing Institute of Technology | Chunliang Lv, Lu Tian, Yuan Zhou, Xiumin Shi | Linear SVM classifier using spatial pyramid matching kernel.
BONN_ACTION | Bonn_action | action | University of Bonn | João Carreira, Adrian Ion, Fuxin Li, Cristian Sminchisescu | Segmentation-based recognition method, where multiple figure-ground image partitions constrained by the bounding box are extracted using the Constrained Parametric Min Cuts algorithm (CPMC), and classified using a regression-based framework.
BONN_FGT_SEGM | Bonn_FGT_Segm | FG Detection, FG Tiling | University of Bonn | João Carreira, Adrian Ion, Fuxin Li, Cristian Sminchisescu | For each image, multiple full image segmentations (tilings) are generated using a maximum clique formulation on a graph that connects all non-overlapping figure-ground segments obtained using the Constrained Parametric Min Cuts Algorithm (CPMC, CVPR10, http://sminchisescu.ins.uni-bonn.de/papers/cs-cvpr10.pdf). A unary + pairwise scoring is then learned using relevance optimization, by alternating between estimating labels for tiles and learning their scoring parameters against the VOC criteria.
BONN_SVR_SEGM | Svr-Segm | Svr-Segm | University of Bonn | Joao Carreira, Fuxin Li, Adrian Ion, Cristian Sminchisescu | Support vector regression on multiple descriptors extracted from figure-ground segmentations obtained using the Constrained Parametric Min Cuts Algorithm (CPMC, CVPR10, http://sminchisescu.ins.uni-bonn.de/papers/cs-cvpr10.pdf). Descriptors include SIFT, color SIFT and HOG on foreground and background. Sequential segment aggregation strategy to handle multiple objects and rank multiple figure-ground hypotheses (CVPR10, http://sminchisescu.ins.uni-bonn.de/papers/cls-cvpr10.pdf). The winning method of the 2009 segmentation challenge.
BROOKES_AHCRF | AHCRF | Associative hierarchical CRF | Oxford Brookes University | Lubor Ladicky, Christopher Russell, Philip Torr | Associative hierarchical CRF with detectors and co-occurrence. One pixel layer with dense feature boost potential, 6 segmentations with potentials based on histograms of features, detector potential based on part-based model sliding window detector, generatively trained co-occurrence potential
BUPT_LPBETA_MULTFEAT | LPbeta-Multi features | LPbeta with multi features | Beijing University of Posts and Telecommunications | Cheng Lin, Qi Xianbiao, Li Chunguang, Guo Jun, Zhang Honggang, Chen Guang | LPbeta with multi features including SIFT-gray, SIFT-color and SSIM. Trained on full train+val set with default parameters.
BUPT_SPM_SC_HOG | SPM-SC-HOG | Linear SVM Classifier with dense HOG features | Beijing University of Posts and Telecommunications | Qi Xianbiao, Cheng Lin, Li Chunguang, Guo Jun, Zhang Honggang, Chen Guang | Liblinear classifier with dense HOG features. Trained on full train+val set with default parameters.
BUPT_SVM_MULTFEAT | SVM-Multi features | Svm classifier with multi features | Beijing University of Posts and Telecommunications | Cheng Lin, Qi Xianbiao, Li Chunguang, Guo Jun, Zhang Honggang, Chen Guang | Libsvm classifier with multi features. Re-trained the final result with default cross-validation and tuned parameters.
BUT_FU_SVM_SIFT | FU-SVM-SIFT | SVM kernel fusion with several SIFT | Brno University of Technology | Michal Hradiš, Ivo Řezníček, David Bařina, Adam Vlček | Features: grayscale SIFT and color SIFT; Sampling: dense, Harris-Laplace; Codebook: k-means 4k; BOW trans.: codeword uncertainty - whole image, three horizontal stripes, 2x2 grid; Kernel: exp() of weighted X2 distances (fusion of 30 distances); SVM: libsvm
CMIC_SYNTHTRAIN | CMIC_SynthTrain | Synthetic Training of Deformable Part Models | Cairo Microsoft Innovation Lab, Microsoft Research | Osama Khalil, Yasmine Badr, Motaz El-Saban | This submission applies synthetic training for Deformable Part Models. Using the segmentation mask of the objects, we synthesized new training examples by relocating the objects to different backgrounds. The idea was applied on top of the deformable models approach [1]. [1] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan; Object detection with discriminatively trained part based models; PAMI 2009.
CMIC_VARPARTS | CMIC_VarParts | Deformable part models with variable sized parts | Cairo Microsoft Innovation Lab, Microsoft Research | Osama Khalil, Yasmine Badr, Motaz El-Saban | Our submission is based on the Deformable Part Models approach [1]. We allowed the model parts to have variable sizes accommodating for affine distortion that renders part sizes non-proportional. [1] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan; Object detection with discriminatively trained part based models; PAMI 2009.
CMU_RANDPARTS | RandomParts | Unsupervised Parts-based Attributes | Carnegie Mellon University | Santosh Divvala (CMU), Larry Zitnick (MSR), Ashish Kapoor (MSR), Simon Baker (MSR) | http://www.cs.cmu.edu/~santosh/finalReport.pdf (unpublished work)
CMU_RANDPARTS_MAXSCORE | RandomParts_maxScore | Unsupervised Parts-based Attributes (max score) | Carnegie Mellon University | Santosh Divvala (CMU), Larry Zitnick (MSR), Ashish Kapoor (MSR), Simon Baker (MSR) | Updated version of earlier submission (http://www.cs.cmu.edu/~santosh/finalReport.pdf). Main update: inclusion of max score feature and one round of iterative training
CVC_BASE | CVC-BASE | SVM classifier with multiple features | Computer Vision Center, Universitat Autonoma de Barcelona, Spain | Nataliya Shapovalova, Wenjuan Gong, Fahad Shahbaz Khan, Josep M. Gonfaus, Marco Pedersoli, Andrew D. Bagdanov, Joost van de Weijer, Jordi González | Baseline CVC submission for action recognition. Standard BoW model over multiple features including PHOG, grayscale SIFT and (various) color SIFT descriptors. Foreground/background modeled separately, spatial pyramid over several features for foreground representation. Late fusion of feature-specific SVM outputs for final action score.
CVC_FLAT | CVC_Flat | Bag-of-words with Non-linear SVM | Computer Vision Center Barcelona | Fahad Shahbaz Khan, Joost van de Weijer, Andrew D. Bagdanov, Noha Elfiky, David Rojas, Pep Gonfaus, Jordi Gonzalez, Maria Vanrell | We followed the standard bag-of-words pipeline with multiple detectors along with SIFT, ColorNames and HUE descriptors. To combine color and shape, we use our own Color Attention algorithm. The GIST descriptor is used to obtain the holistic representation of an image. Finally, we use a standard non-linear SVM for learning.
CVC_HARMONY | CVC_Harmony | Harmony Potentials | Computer Vision Center - Universitat Autònoma de Barcelona | Josep Maria Gonfaus, Xavier Boix, Fahad Kahn, Joost van de Weijer, Andrew Bagdanov, Marco Pedersoli, Jordi Gonzàlez, Joan Serrat | Our submission is based on [1]. We use the CVC_flat classification submission as the observations for the global node. A new Absolute Position Prior is added to the Superpixels Probabilities. [1] Josep M. Gonfaus, Xavier Boix, Joost Van de Weijer, Andrew D. Bagdanov, Joan Serrat, and Jordi Gonzàlez, "Harmony Potentials for Joint Classification and Segmentation", in Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, 2010.
CVC_HARMONY_DET | CVC_Harmony+Det | CVC_Harmony plus Detection Priors | Computer Vision Center - Universitat Autònoma de Barcelona | Josep Maria Gonfaus, Xavier Boix, Fahad Kahn, Joost van de Weijer, Andrew Bagdanov, Marco Pedersoli, Joan Serrat, Xavier Roca, Jordi Gonzàlez | Our submission is based on [1] and the previous submission "CVC_Harmony". Here we add the detection scores of [2] as location prior in the image. [1] J.M. Gonfaus, X. Boix, J. Van de Weijer, A. D. Bagdanov, J. Serrat, and J. Gonzàlez, "Harmony Potentials for Joint Classification and Segmentation", in CVPR 2010. [2] Felzenszwalb, Girshick, McAllester, Ramanan, "Object Detection with Discriminatively Trained Part Based Models", PAMI 2010
CVC_PLUS | CVC_Plus | Bag-of-words with Non-linear SVM | Computer Vision Center Barcelona | Fahad Shahbaz Khan, Joost van de Weijer, Andrew D. Bagdanov, Noha Elfiky, David Rojas, Pep Gonfaus, Jordi Gonzalez, Maria Vanrell | All of CVC_Flat with additional color features combined through averaging the kernel combinations.
CVC_PLUSDET | CVC_Plus_Det | CVC_Plus submission combined with Detection result | Computer Vision Center Barcelona | Fahad Shahbaz Khan, Joost van de Weijer, Andrew D. Bagdanov, Noha Elfiky, David Rojas, Pep Gonfaus, Jordi Gonzalez, Maria Vanrell | Same as our CVC_Plus submission combined with object localization scores.
CVC_SEL | CVC-SEL | SVM classifier with per-class feature selection | Computer Vision Center, Universitat Autonoma de Barcelona, Spain | Nataliya Shapovalova, Wenjuan Gong, Fahad Shahbaz Khan, Josep M. Gonfaus, Marco Pedersoli, Andrew D. Bagdanov, Joost van de Weijer, Jordi González | Enhanced CVC submission built upon CVC-BASE for action recognition. Standard BoW model over multiple features from CVC-BASE plus contextual object descriptors. Cross-validation procedure for action-specific feature and kernel selection. Foreground/background/neighborhood modeled separately, spatial pyramid over several features for foreground representation. Object detection based on deformable part-based detector incorporated. Late fusion of feature-specific SVM outputs for final action score.
CVITVGG_HEADDETSEG Head-Detect-SegmentCat-CutCVIT-IIIT,Hyderabad, VGG University of OxfordOmkar M Parkhi, Andrea Vedaldi, C.V.Jawahar, Andrew ZissermanDetector is trained to detect cat heads. The detections returned are used to initialize seeds for GrabCut which segments the cat. Bounding box is then inferred from these segmentations.
HIT_PROTOLEARN_2 ProtoLearnLearning the prototype of image categoriesHarbin Institute of TechnologyDeyuan Zhang Bingquan Liu Chengjie Sun Xiaolong WangDense SIFT features, learning the prototype of images using large margin framework. Trains the classifier using 200 positive and 300 negative images selected randomly. The parameters are fixed.
INRIA_SPM_HT SPM+HTSpatial Pyramids and Hough TransformINRIANorberto Adrián Goussies, Arnau Ramisa, Cordelia SchmidSpatial Pyramids on the bounding box and on the image, plus a Hough transform to take into account the object-person interactions for bicycle, horse and tvmonitor. Trained on trainval with 5-fold cross-validation.
LIG_MSVM_FUSE_CONCEPT LIG_msvm_fuse_conceptFusion of MSVMs with several features, concept optLaboratoire d'Informatique de GrenobleBahjat Safadi Georges QuénotLate fusion of multiple SVMs with multiple features. Features include dense and Harris-Laplace filtered opponent SIFT, color histograms and Gabor transforms. Fusion is optimized by concept.
LIP6UPMC_KSVM_BASELINE LIP6_KSVM_BaselineBaseline with BOF, SPM and gaussian SVMLIP6 UPMCDavid Picard, Nicolas Thome, Matthieu CordBaseline with Bag Of Feature scheme, Spatial Pyramid and gaussian SVM
LIP6UPMC_MKL_L1 LIP6_MKL_L1l1-MKL with sift, texture and color featuresLIP6 UPMCDavid Picard, Nicolas Thome, Matthieu Cordl1-MKL with sift, texture (gabor) and color features.
LIP6UPMC_RANKING LIP6_rankingBOF scheme with ranking classifierLIP6 UPMCDavid Picard, Nicolas Thome, Matthieu CordSame as baseline, but with ranking classifier
LIRIS_MKL_TRAINVAL LIRIS_Multi-Feature_MKL_trainvalMKL classifier with multiple featuresLIRIS, Ecole Centrale Lyon, CNRS, UMR5205, FranceChao ZHU, Huanzhang FU, Charles-Edmond BICHOT, Emmanuel Dellandrea, Liming CHENMultiple Kernel Learning (MKL) classifier with multiple features: colorSIFT (dense+harris-laplace), colorLBP (see our paper in ICPR2010), PHOG, Self-Similarity, and Color Histogram. A vocabulary of 4000 codewords is created for SIFT. Spatial pyramid information is used. Trained on 'train + val' set.
LJKINPG_HOG_LBP_LTP_PLS2ROOTS HOG+LBP+LTP+PLS2ROOTSHOG+LBP+LTP+PLS2ROOTSLJK,INPGSibt ul Hussain, Bill TriggsThis method consists of a two-root object detector. The detector is trained using HOG+LBP+LTP as feature sets, with a PLS-based linear SVM as the learning algorithm.
MITUCLA_HIERARCHY MITUCLA_HierarchyLatent hierarchical structural learningMIT and UCLALong Zhu, Yuanhao Chen, William Freeman, Alan Yuille, Antonio TorralbaLatent hierarchical structural learning with dense HOG and HOW(SIFT) features.
NEC_V1_HOGLBP_NONLIN_SVM hog, lbp, nonlinear coding, svmv1_classdependent_nodectionNEC Labs, AmericaNEC: Yuanqing Lin, Fengjun Lv, Shenghuo Zhu, Ming Yang, Timothee Cour, Kai Yu UIUC: LiangLiang Cao, Thomas Huang Rutgers Univ.: Tong Zhang Univ. Missouri: Xiaoyu Wang, Tony Xu HanDense DHOG and LBP features are coded by both local linear and nonlinear methods. The resulting high-dimension features were then fed to linear SVMs. Class dependent cross-validation.
NEC_V1_HOGLBP_NONLIN_SVMDET hog, lbp, nonlinear coding, svm, detectionv1_classdependent_withdetectionNEC Labs, AmericaNEC: Yuanqing Lin, Fengjun Lv, Shenghuo Zhu, Ming Yang, Timothee Cour, Kai Yu UIUC: LiangLiang Cao, Thomas Huang Rutgers Univ.: Tong Zhang Univ. Missouri: Xiaoyu Wang, Tony Xu HanDense DHOG and LBP features are coded by both local linear and nonlinear methods. The resulting high-dimension features were then fed to linear SVMs. Detection results were taken into account. Class dependent cross-validation.
NII_SVMSIFT SVM-SIFTSVM classifier with color sift featuresNII JapanXiao Zhou, Cai-Zhi Zhulibsvm classifier with color sift features. Trained using 5-fold cross-validation. Re-trained on val set with fixed parameters.
NLPR_HOGLBP_MC_LCEGCHLC Boosted HOG-LBP and multi-context (LC, EGC, HLC)NLPR_VSTAR_DET_4National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of SciencesYinan Yu, Junge Zhang, Yongzhen Huang, Shuai Zheng, Weiqiang Ren, Chong Wang, Kaiqi Huang, Tieniu TanDeformable model with Boosted HOG-LBP and multi-context information: location context, enhanced global context, and HOG and LBP inter-class context.
NLPR_VSTAR_CLS_DICTLEARN NLPR_VSTAR_CLS_DICTLEARNSaliency coding and dictionary learningNational Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of SciencesYongzhen Huang, Shuai Zheng, Weiqiang Ren, Chong Wang, Yinan Yu, Junge Zhang, Kaiqi Huang, Tieniu TanLib-SVM classifier with dense SIFT features, saliency coding, dictionary learning and detection information.
NTHU_LINSPARSE_2 LINEAR-SPARSELinear SVM with spatial max pooling features.NTHUTao Yen Tang, Jyun Yi Lin, Cheng Hao Kung, Meng Hua Wu, Chun Han Chien, Jia Yu Kuo, Hwann Tzong ChenLIBLINEAR with spatial max pooling of sparse features. Sparse features are determined by ScSPM and color descriptor. Sparse coding dictionary is learned by SPAMS.
NUDT_SVM_LDP_SIFT_PMK_SPMK SVM_LDP_SIFT_PMK_SPMKSVM classifier on PMK and SPMK approaches with linNational University of Defense TechnologyHongping Cai, Krystian Mikolajczyk, Dewen HuLocal features are extracted on regular grids at multiple scales, then described by SIFT and 'SIFT+hue histogram'. To reduce the memory and computational cost, linear discriminant projection (LDP) is applied, leading to significant feature dimensionality reduction and performance boosting. With the 30-dim LDP-projected features, the submitted results are obtained by fusing pyramid match kernel (PMK) and spatial pyramid match kernel (SPMK), with an SVM classifier for the final stage.
NUDT_SVM_WHGO_SIFT_CENTRIST_LLM SVM_WHGO_SIFT_CENTRIST_low-level modelingSVM classifier on low-level modeling based approacDepartment of Automatic Control, College of Mechatronics and Automation, National University of Defense TechnologyLi Zhou, Zongtan Zhou, Dewen HuThis method is based on a low-level modeling strategy. The approach works by creating multiple resolution images and partitioning them into sub-regions at different scales. We represent each sub-region with WHGO, CENTRIST and SIFT descriptors and combine the features of different descriptor and resolution channels through an SVM classifier to form the final decision function.
NUSPSL_EXCLASSIFIER Exclusive-Classifierclassifier based on exclusive dense graphNational University of Singapore; Panasonic Singapore Laboratories;NUS: Xiangyu Chen, Qiang Chen, Xiaotong Yuan, Zheng Song, Si Liu, Tat-Seng Chua, Shuicheng Yan; PSL: Yang Hua, Zhongyang Huang, Shengmei ShenExclusive classifier with both visual features and exclusive contextual information. Trained on full train+val set using both visual and context information.
NUSPSL_KERNELREGFUSING KernelRegFusingkernel regression for all methodsNational University of Singapore; Panasonic Singapore Laboratories;NUS: Qiang Chen, Zheng Song, Si Liu, Xiangyu Chen, Tat-Seng Chua, Shuicheng Yan; PSL: Yang Hua, Zhongyang Huang, Shengmei Shenkernel regression as a combination method to fuse all other submissions.
NUSPSL_MFDETSVM MFDETSVMSVM with multifeature and detection kernelNational University of Singapore; Panasonic Singapore Laboratories;NUS: Qiang Chen, Zheng Song, Si Liu, Shuicheng Yan; PSL: Yang Hua, Zhongyang Huang, Shengmei ShenSVM classifier with multiple feature and detection kernel.
NUS_HOGLBP_CTX_CLS_RESCORE_V2 HOGLBP_context_classification_rescore_v2results refined by context and classificationNational University of SingaporeZheng Song, Qiang Chen, Shuicheng YanUse HOG+LBP trained part-based detector. The detection results are further reranked via the context information of other detection windows and image classification scores.
OXFORD_SBD Oxford_SBDSkin based layout detectionUniversity of OxfordArpit Mittal, Andrew Zisserman, Philip H. S. Torr, Manuel J. MarinHead localization is performed using a part-based upper body detector. A local color model of skin pixels is learned using face pixels obtained from the head bounding box. Using this color model, we perform skin detection on the image. Hand positions are then hypothesized from the skin regions so obtained. These hypotheses are verified using the RBF kernel SVM classifier and our own articulated model of the human upper body. It is to be noted that we do not localize feet in the image.
RITSU_CBVR_WKF Ritsu_CBVR_WKFSVM Classifier with dense and Harris BOFRitsumeikan UniversityXian-Hua Han, Yen-Wei Chen, Xiang RuanWe extract gray and color (Opponent and C_sift) BOF features with dense and Harris sampling, and use an SVM with normalized kernel fusion for classification.
STANFORD_REGLABEL REGION-LABELOptimizing regions and their labelsStanford UniversityM. Pawan Kumar, Stephen Gould, Haithem Turki, Dan Preston, Daphne KollerThe method groups pixels into regions and assigns the regions a unique semantic label simultaneously by minimizing a global energy function. Features used include those obtained from an object detector (deformable parts-based model) as well as a bag of SIFT words computed over the region. Inference as described in CVPR 2010, learning as described in ICCV 2009.
SURREY_MK_KDA Multikernel+KDAMultikernel fusion with KDAThe University of SurreyPiotr Koniusz, Muhammad Atif Tahir, Mark Barnard, Fei Yan, Krystian MikolajczykKernel-level fusion with Spatial Pyramid Grids, Soft Assignment and Kernel Discriminant Analysis using spectral regression. 18 kernels have been generated from 18 variants of SIFT.
TIT_SIFT_GMM_MKL SIFT-GMM-MKLMultiple kernel learning with SIFT GMMsTokyo Institute of TechnologyNakamasa Inoue, Yusuke Kamishima, Koichi ShinodaWe use multiple kernel learning and GMM supervector kernels with SIFT features.
TIT_SIFT_GMM_MKL2 SIFT-GMM-MKL2Multiple kernel learning with SIFT GMMsTokyo Institute of TechnologyNakamasa Inoue, Yusuke Kamishima, Koichi ShinodaSame as the SIFT-GMM-MKL run, but GrabCut is applied for detection.
UC3M_GENDISC UC3M_Generative_DiscriminativeCombination of Generative Discriminative MethodsUniversidad Carlos III de MadridIván González-Díaz, Fernando Díaz de MaríaCombination of Supervised Topic Models with SVM-based discriminative methods for concurrent image recognition and segmentation
UCI_DPM_SP DPM-SPparts based model and spatial pyramid featuresUniversity of California, IrvineRagib Morshed, Yi Yang, Charless FowlkesParts model results rescored by combining with spatial pyramid based scene (global) classification results. Scene trained using svm with hist intersect kernel, rescoring trained using SVM on train+val+some_from_segmentations with parameters obtained via search.
UCI_LSVM_MDPM_10X UCI_LSVM-MDPM-10X10x train set for LSVM, mixtures, deformable partsUniversity of California, IrvineXiangxin Zhu, Carl Vondrick, Deva Ramanan, Charless FowlkesWe downloaded additional images from Flickr that match the distribution of the testing set. We used Amazon's Mechanical Turk to annotate these training sets, which are 10 times larger than the standard trainval set. We used our larger training set to train models with the detector from Felzenszwalb et al.
UCLEAR_SVM_DOSP_MULTFEATS SVM-DOSP-MULTFEATSSVM & dense spatial pyramid w/ multiple featuresUniversity of Caen GREYC and INRIA LEARGaurav Sharma, Frederic Jurie, Cordelia SchmidMultiple chi squared kernels are computed: spatial pyramid (SP) w/ dense SIFT, dense overlapping SP w/ HOG, texture filter, LAB values (bag-of-words w/ the above features) and edge dir hists. They are computed on full images, person bounding boxes (BB) and BB of the lower part (simple stretch-scale of person BB) expected to contain horse, bike etc. They are combined with class specific binary weights based on their perf on val set. Finally, class specific SVMs trained on train+val.
UMCO_DHOG_KSVM dhog-ksvmkernel svm classifier with dhog featureUniversity of Missouri - ColumbiaXutao Lv, Xiaoyu Wang, Xi Zhou, Tony X. Hantrain SVM model with different kernels on dhog feature.
UMNECUIUC_HOGLBP_DHOGBOW_SVM HOG-LBP + DHOG bag of words, SVMLinear svm classifier with bag of words methodThe University of Missouri, NEC Labs America, The University of Illinois at Urbana-ChampaignXiaoyu Wang, Xi Zhou, Tony X. Han, Shuai Tang, Guang Chen, Kai Yu, Thomas S. HuangLiblinear SVM with HOG-LBP feature and DHOG bag of words approach
UMNECUIUC_HOGLBP_LINSVM HOG-LBP Linear SVMsvm classifier with HOG LBP featuresThe University of Missouri, NEC Labs America, The University of Illinois at Urbana-ChampaignXiaoyu Wang, Xi Zhou, Tony X. Han, Shuai Tang, Guang Chen, Kai Yu, Thomas S. HuangLiblinear SVM with HOG-LBP features. All classes use the same default training parameters.
UOCTTI_LSVM_MDPM LSVM-MDPMLSVM Mixtures of deformable part modelsUniversity of Chicago and TTI-CPedro Felzenszwalb (UChicago), Ross Girshick (UChicago), David McAllester (TTI-C)Deformable part models with HOG features. Based on [1,2]. Each model has 6 components with 8 parts. We associate a binary mask with each component to generate segmentations. The object detection models were trained from bounding boxes using LSVM. The segmentation masks were trained from segmentations. [1] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. Object Detection with Discriminatively Trained Part Based Models. PAMI 32(9), Sept 2010. [2] http://people.cs.uchicago.edu/~pff/latent-release4/
UVA_BW_NEWCOLOURSIFT BW+New Colour SIFTBest Window + New Colour SIFTUvAJasper Uijlings Koen van de Sande Theo Gevers Arnold SmeuldersBest Window approach with new Colour SIFT trained with Multiple Kernel Learning SVM.
UVA_BW_NEWCOLOURSIFT_SRKDA BW+New Colour SIFT-SRKDABW+New Colour SIFT-SRKDAUniversity of AmsterdamJasper Uijlings Koen van de Sande Theo Gevers Arnold Smeulders Remko SchaBest Window Approach plus new color sift. Classification by SRKDA
UVA_DETMONKEY Detection MonkeyDetection MonkeyUniversity of AmsterdamKoen van de Sande Jasper Uijlings Theo Gevers Arnold SmeuldersThe detection monkey is trained with SVM, dense Color SIFT, spatial pyramid and multiple iterations.
UVA_GROUPLOC GroupLocLocalisation with grouping window selectionUniversity of AmsterdamJasper Uijlings Koen van de Sande Theo Gevers Arnold Smeulders Remko SchaCandidate windows are selected using hierarchical grouping. Classification is with SIFT, SVM-Histogram Intersection, Spatial Pyramid
WILLOW_A_SVMSIFT_1-A_LSVM a * SVM-SIFT + (1-a) * LSVMCombination of SVM and DPM with learned weights.France, INRIA - Willow ProjectVincent Delaitre, Ivan Laptev, Josef SivicCombination of an SVM classifier (with dense SIFT features, trained using 5-fold cross-validation and re-trained on full train+val set with fixed parameters) and of Felzenszwalb's deformable part-based model (trained on full train+val set with default parameters). The classification score is obtained by a linear combination of the scores of the two classifiers: the two classifiers are also trained on the train set and the weights of this combination are determined by optimizing over the val set.
WILLOW_LSVM LSVMFelzenszwalb's part-based model.France, INRIA - Willow ProjectVincent Delaitre, Ivan Laptev, Josef SivicFelzenszwalb's part-based model trained on full train+val set with default parameters.
WILLOW_SVMSIFT SVM-SIFTSVM classifier with dense SIFT features.France, INRIA - Willow ProjectVincent Delaitre, Ivan Laptev, Josef SivicSVM-Light classifier with dense SIFT features. Trained using 5-fold cross-validation. Re-trained on full train+val set with fixed parameters
WLU_SPM_EMDIST WLU-SPM-EMDISTSPM/lin.SVM, codebook using Earth Mover dist.Washington and Lee UniversityChen Zhong, William Richardson, Joshua StoughSpatial Pyramid Match after Lazebnik. Linear SVM on LLC-coded dense SIFT features after Yang/Wang/Yu. Annotation-based codebook training. SPM levels trained separately and combined. Codebook generated using Earth Mover's distance.
XRCE_IFV Improved Fisher VectorLinear SVM on Improved Fisher vectorXRCEFlorent Perronnin Jorge Sanchez Thomas MensinkBased on [PSM10]: F. Perronnin, J. Sanchez and T. Mensink, "Improving the Fisher kernel for Large-Scale Image Classification", ECCV, 2010.
XRCE_IFV_1M Improved Fisher VectorLinear SVM on Improved Fisher vectorXRCEFlorent Perronnin Jorge Sanchez Thomas MensinkBased on [PSM10]: F. Perronnin, J. Sanchez and T. Mensink, "Improving the Fisher kernel for Large-Scale Image Classification", ECCV, 2010. Trained on close to 1M mono-tagged Flickr group images (non-overlapping with test set).
XRCE_IFV_FUSE_OPT Improved Fisher VectorLinear SVM on Improved Fisher vectorXRCEFlorent Perronnin, Jorge Sanchez, Thomas MensinkBased on [PSM10]: F. Perronnin, J. Sanchez and T. Mensink, "Improving the Fisher kernel for Large-Scale Image Classification", ECCV, 2010. Late fusion of two systems trained respectively on i)voc10 trainval and ii) close to 1M mono-tagged Flickr group images. Optimal weights are learned per-class through cross-validation.
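
Several of the entries above (for example WILLOW_A_SVMSIFT_1-A_LSVM and XRCE_IFV_FUSE_OPT) describe late fusion: the scores of two classifiers are combined linearly, with the weight chosen on held-out data. The following is a minimal sketch of that idea for a single class, not any submitter's actual code. The function names (average_precision, fuse, best_weight), the grid search over the weight, and the simple non-interpolated AP are all assumptions made for illustration; the official VOC evaluation computes AP from the precision/recall curve and may give different values.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive.

    Illustrative stand-in only; not the official VOC AP computation.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def fuse(a, scores_1, scores_2):
    """Linear late fusion: a * classifier 1 + (1 - a) * classifier 2."""
    return [a * x + (1 - a) * y for x, y in zip(scores_1, scores_2)]


def best_weight(scores_1, scores_2, labels, steps=10):
    """Pick the fusion weight in [0, 1] that maximizes AP on validation data.

    The grid search is an assumption; the entries above do not say how the
    weights were optimized.
    """
    candidates = [i / steps for i in range(steps + 1)]
    return max(
        candidates,
        key=lambda a: average_precision(fuse(a, scores_1, scores_2), labels),
    )


# Hypothetical validation scores for one class from two classifiers.
val_scores_1 = [0.9, 0.8, 0.2, 0.1]
val_scores_2 = [0.1, 0.2, 0.8, 0.9]
val_labels = [1, 1, 0, 0]

a = best_weight(val_scores_1, val_scores_2, val_labels)
fused_ap = average_precision(fuse(a, val_scores_1, val_scores_2), val_labels)
```

For per-class weights, as described for XRCE_IFV_FUSE_OPT, best_weight would be called once per class on that class's validation scores and labels.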