ALEGORIA Image Dataset

Samples of the ALEGORIA dataset

The ALEGORIA benchmark is an image dataset of heterogeneous cultural, historical and geographical images depicting various objects of interest in urban and natural scenes, over a period ranging from the 1920s to the present. Its content is strongly characterized by multi-date, multi-source and multi-view images. It is designed mainly for content-based image retrieval (CBIR), but can also be used for related tasks:

  • Image-based geolocalization
  • Cross-view image matching
  • Invariant representation learning
  • Few-shot landmark recognition
  • Multi-temporal image matching


ALEGORIA consists of a total of 13175 high-resolution images (800 px × variable), broken down as follows:

  • 1859 query images distributed across 58 classes, where each class is defined around an object or a location in an urban or natural scene
  • 11316 relevant image distractors
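
With a query/distractor split like this, retrieval quality is commonly summarized with mean average precision (mAP). The sketch below is only an illustration under stated assumptions, not the official ALEGORIA evaluation protocol: it assumes each database image carries an integer class label, that distractors carry the hypothetical label -1, and that each ranked list covers the whole database. The function names are invented for this example.

```python
from typing import Sequence, Tuple

DISTRACTOR = -1  # hypothetical label for distractor images (assumption)


def average_precision(query_label: int, ranked_labels: Sequence[int]) -> float:
    """Average precision for one query.

    ranked_labels holds the class labels of the database images, ordered
    from most to least similar to the query. Images sharing the query's
    class are positives; distractors and other classes are negatives.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label != DISTRACTOR and label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    results: Sequence[Tuple[int, Sequence[int]]]
) -> float:
    """Mean of the per-query average precisions."""
    if not results:
        return 0.0
    return sum(average_precision(q, r) for q, r in results) / len(results)
```

For instance, `average_precision(3, [3, -1, 3, 5])` finds positives at ranks 1 and 3, giving (1/1 + 2/3) / 2. Distractors never count as positives, so they can only push true matches further down the ranking.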


Annotated variations: a distinctive feature of the dataset is that each query image was manually annotated with several quantized attributes describing image variations, which makes it suitable for evaluating how well approaches cope with these variations:

  • Scale: what portion of the image does the object occupy?
  • Illumination: is the object under- or over-illuminated?
  • Vertical orientation: street-view, oblique, vertical
  • Level of occlusion: is the object hidden behind other objects?
  • Alterations: is the image degraded?
  • Color: color image, grayscale, or monochrome (e.g. sepia)
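
One way to use these annotations is to break a per-query score down by attribute value, for example to compare retrieval quality on street-view versus vertical queries. The following is a minimal sketch under assumptions: the annotation file format is not described here, so the record layout (`"attributes"` and `"score"` keys) and the attribute names and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Mapping


def scores_by_attribute(
    records: Iterable[Mapping], attribute: str
) -> Dict[str, float]:
    """Average a per-query score over each value of one annotated attribute.

    Each record is assumed (hypothetically) to look like:
        {"attributes": {"orientation": "oblique", ...}, "score": 0.42}
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["attributes"][attribute]].append(rec["score"])
    return {value: mean(scores) for value, scores in groups.items()}


# Toy records with made-up scores, for illustration only.
records = [
    {"attributes": {"orientation": "street-view"}, "score": 0.5},
    {"attributes": {"orientation": "street-view"}, "score": 1.0},
    {"attributes": {"orientation": "vertical"}, "score": 0.25},
]
breakdown = scores_by_attribute(records, "orientation")
```

The same function can be reused for any of the annotated attributes by changing the `attribute` argument.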


Details and use rights

These data are made available for research purposes only. For more details on the data and their conditions of use, please consult the following documents:

  • Statistics on the dataset: Soon!
  • Usage and rights: Soon!



Date of publication: -