A dataset to train object classification or detection algorithms on RoboCup@Work objects.

What is it?

  • A dataset containing more than 35000 RGBD images taken with an Intel SR300 in a RoboCup@Work setting
  • About 33500 images are labeled as one of the 13 RoboCup@Work object types; the remaining images show multiple objects or non-RoboCup decoy objects.
  • 14000 images are manually labeled with X/Y pickup coordinates.


Source code: GitHub Repo

Download the dataset as a .zip archive from one of these mirrors:
Mirror 1 Mirror 2
Filesize: 9.5 GB
md5sum: 8d5e2ee49a504a47f482a5ab8f7b2a3b
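
Because the archive is large, it is worth checking the download against the md5sum above before unpacking. The following is a minimal sketch using only the Python standard library; the archive file name is a hypothetical placeholder, since the actual name depends on the mirror.

```python
import hashlib
from pathlib import Path

# Checksum published alongside the download links.
EXPECTED_MD5 = "8d5e2ee49a504a47f482a5ab8f7b2a3b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical file name; replace with the name of the downloaded archive.
    archive = Path("robocup_at_work_dataset.zip")
    if archive.exists():
        actual = md5_of_file(archive)
        print("OK" if actual == EXPECTED_MD5 else f"Mismatch: {actual}")
    else:
        print(f"{archive} not found")
```

On Linux, running `md5sum` on the archive and comparing the output by eye gives the same check.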


  • While the training set consists mostly of objects placed on a (small, non-RoboCup) rotating table, the evaluation set consists entirely of objects placed on appropriate platforms.
  • The training and evaluation set folders are organized by object type: Axis, Bearing, Bearing_Box, Distance_Tube (aluminium spacer ring), F20_20_B (small black aluminium profile), F20_20_G (small grey aluminium profile), M20 (nut), M20_100 (bolt), M30 (nut), Motor, R20 (plastic tube), S40_40_B (large black aluminium profile), S_40_40_G (large grey aluminium profile), MultipleObjects (multiple, different objects shown per image) and MultipleDuplicateObjects (multiple, possibly duplicate objects shown per image).
  • Color and depth images are stored in separate folders. Both are saved in PNG format.
  • Color images have 3 or 4 channels. If the fourth (alpha) channel is present, it marks the position of the object's pickup point.
  • If available, the alpha channel uses the following mapping of value to object type:
    0: Axis
    1: Bearing
    2: Bearing Box
    3: Distance Tube
    4: F20_20_B
    5: F20_20_G
    6: M20
    7: M20_100
    8: M30
    9: Motor
    10: R20
    11: S40_40_B
    12: S_40_40_G
    255: No Object/Background
    Note that each object is marked with only a single pixel.
  • Depth images are 8-bit with a single channel. Values range from 0 (0 cm) to 255 (60 cm).
  • The average pixel value of the training set, useful for normalization, is (in R, G, B, D order) [120.37184024, 121.55885514, 118.12432115, 131.96968385]
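
The alpha-channel encoding described above can be decoded with a few lines of NumPy. This is a sketch under stated assumptions, not the authors' loader: it expects the color image as an H x W x 4 array (for example from Pillow, or from OpenCV with `cv2.IMREAD_UNCHANGED`; the alpha channel is at index 3 in both cases), and the function name `find_pickup_points` is made up for illustration. The type names follow the folder names listed above.

```python
import numpy as np

# Alpha value -> object type, following the mapping listed above.
OBJECT_TYPES = {
    0: "Axis", 1: "Bearing", 2: "Bearing_Box", 3: "Distance_Tube",
    4: "F20_20_B", 5: "F20_20_G", 6: "M20", 7: "M20_100", 8: "M30",
    9: "Motor", 10: "R20", 11: "S40_40_B", 12: "S_40_40_G",
}
BACKGROUND = 255  # alpha value for "No Object/Background"


def find_pickup_points(image):
    """Return a list of (x, y, object_type) for every non-background alpha pixel.

    Images without a fourth channel carry no pickup annotation and yield [].
    """
    if image.ndim != 3 or image.shape[2] < 4:
        return []
    alpha = image[:, :, 3]
    rows, cols = np.nonzero(alpha != BACKGROUND)
    # Values outside the documented mapping are reported as "Unknown" (an assumption).
    return [
        (int(x), int(y), OBJECT_TYPES.get(int(alpha[y, x]), "Unknown"))
        for y, x in zip(rows, cols)
    ]
```

Since each object is marked with a single pixel, a single-object image should return at most one entry, while the multi-object folders may return several.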
Color Channel on a sample image
Depth Channel on a sample image
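
The depth encoding and the training-set means given above can be combined into a simple preprocessing step. The sketch below assumes the 8-bit depth scale is linear between the two stated endpoints (0 is 0 cm, 255 is 60 cm), which the description implies but does not state outright, and that the color array is in R, G, B order (images loaded with OpenCV are B, G, R and would need reordering first). The function names are hypothetical.

```python
import numpy as np

# Per-channel training-set means in R, G, B, D order, as given above.
TRAIN_MEAN_RGBD = np.array(
    [120.37184024, 121.55885514, 118.12432115, 131.96968385], dtype=np.float32
)
MAX_DEPTH_CM = 60.0  # distance encoded by the depth value 255


def depth_to_cm(depth_u8):
    """Convert 8-bit depth values to centimetres, assuming a linear scale."""
    return depth_u8.astype(np.float32) * (MAX_DEPTH_CM / 255.0)


def make_rgbd_input(rgb_u8, depth_u8):
    """Stack color (H x W x 3 or 4) and depth (H x W) into a mean-subtracted H x W x 4 array.

    Any alpha channel in the color image is dropped before stacking.
    """
    rgbd = np.dstack([rgb_u8[:, :, :3], depth_u8]).astype(np.float32)
    return rgbd - TRAIN_MEAN_RGBD
```

Subtracting the mean is only one possible normalization; whether to also scale by a standard deviation or to the range [0, 1] depends on the model being trained.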