PhD defense of Dimitri Gominski: DONE!

On 9/11/2021, Dimitri Gominski defended his PhD thesis on "Generalizable features and image search for multi-source interconnection and analysis". Congratulations, Dimitri!


Jury:

  • Peter BELL, Professor, Friedrich-Alexander Universität Erlangen-Nürnberg, Germany (Reviewer)
  • Philippe JOLY, Professor, Université Paul Sabatier, Toulouse, France (Reviewer)
  • Jantien STOTER, Professor, Delft University of Technology, Netherlands (President)
  • Dimitris SAMARAS, Professor, Stony Brook University, USA (Examiner)
  • Valérie GOUET-BRUNET, Research Director, Université Gustave Eiffel, France (Co-supervisor)
  • Liming CHEN, Professor, École Centrale Lyon, France (Co-supervisor)



With an ever-increasing volume of digitally accessible images, establishing connections to organize and analyse data is all the more important. A typical formulation for connecting images without using metadata is content-based image retrieval (CBIR). Like other applications in computer vision, CBIR has benefited from the expressivity of convolutional neural networks (CNNs) and achieved unprecedented results on standard benchmarks. However, it is hard to say whether this performance is explained by increasingly sophisticated architectures and models, or simply by the availability of a training dataset that matches the use case, i.e. one that has similar visual and semantic characteristics. Indeed, the usual paradigm of pairing a model with a training dataset shows its limits as soon as one leaves the case characterized by the training data: performance drops when the model is tested on different data, or on data with too much variability.

This thesis addresses this issue with a critical look at deep learning methods and their real application potential. In the context of multi-source territorial imagery, a benchmark is proposed to characterize a new research problem: heterogeneous, "low-data" (without training data) image retrieval, in a use case where defining a training dataset and a baseline method is not easy. With this benchmark, new measures are proposed to qualify the generalization ability of a model in a CBIR context, followed by technical solutions that remove the need for the uncertain definition of the above-mentioned "similar visual and semantic characteristics". The discussion of the results suggests that the architecture of neural networks is probably given too much importance, and highlights promising ideas in CBIR that provide model-agnostic tools and make it possible to exploit the comparative advantages of different models trained on different datasets. Finally, the value of this generalist approach is confirmed by applying it to a case where methods and data are abundant but confined to a set of small datasets, which leaves their application potential unclear: land-use classification with satellite imagery.