Federico Landi

I am currently working as a Deep Learning Engineer at Huawei Technologies, in the Amsterdam Research Center.

I got my Ph.D. at AimageLab at the University of Modena and Reggio Emilia, Italy, under the supervision of Prof. Rita Cucchiara. During my Ph.D., my research focused on the fascinating topic of Embodied AI, at the intersection of Computer Vision, Deep Learning, and Robotics.

As part of my Master's thesis, I was a visiting student at the University of Amsterdam (UVA), where I worked under the supervision of Prof. Cees Snoek.

Email  /  CV  /  Google Scholar  /  Github  /  LinkedIn


In the first part of my Ph.D., I tackled the recent task of Vision-and-Language Navigation. More recently, I developed a strong interest in Embodied exploration and navigation, as well as in Recurrent Neural Networks.

Dress Code: High-Resolution Multi-Category Virtual Try-On
Davide Morelli, Matteo Fincato, Marcella Cornia, Federico Landi, Fabio Cesari, Rita Cucchiara
ECCV, 2022
arXiv  /  bibtex  /  github page  /  dataset (request form)  /  online demo

We collected a new dataset for image-based virtual try-on composed of image pairs coming from different catalogs of YOOX NET-A-PORTER.

Perception, Reasoning, Action: the New Frontier of Embodied AI
Federico Landi
Ph.D. Thesis
pdf version  /  slides (pptx)  /  talk (PhD-Day 2022)

In my Ph.D. Thesis, I present the new challenges and opportunities offered by recent advances in the research field of Embodied Artificial Intelligence. While doing so, I outline my contributions to the field and the main results of the research I carried out during my Ph.D. Feel free to contact me if you would like a physical copy of the Thesis.

Spot the Difference: A Novel Task for Embodied Agents in Changing Environments
Federico Landi, Roberto Bigazzi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara
ICPR, 2022
arXiv  /  bibtex  /  code

We propose Spot the Difference: a novel task for Embodied AI.

Focus on Impact: Indoor Exploration with Intrinsic Motivation
Roberto Bigazzi, Federico Landi, Silvia Cascianelli, Lorenzo Baraldi, Marcella Cornia, Rita Cucchiara
RAL/ICRA, 2022
arXiv  /  bibtex  /  code

We devise an impact-based intrinsic reward to train embodied exploration agents in photorealistic indoor environments.

Embodied Navigation at the Art Gallery
Roberto Bigazzi, Federico Landi, Silvia Cascianelli, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ICIAP, 2021
arXiv  /  bibtex  /  code

We build and release a new 3D space displaying a complete art museum, called ArtGallery3D (AG3D).

Working Memory Connections for LSTM
Federico Landi, Lorenzo Baraldi, Marcella Cornia, Rita Cucchiara
NEUNET, 2021
arXiv  /  bibtex

A simple heuristic improvement boosts LSTM performance on a variety of tasks. Our approach shows more stable training dynamics and faster convergence time when compared to vanilla LSTM and peephole LSTM.

Multimodal Attention Networks for Low-Level Vision-and-Language Navigation
Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
CVIU, 2021
arXiv  /  bibtex  /  code

The first fully-attentive approach to Vision-and-Language Navigation (VLN). We achieve state-of-the-art performance on low-level VLN, a setting that is rarely considered in the literature given its additional difficulty.

Out of the Box: Embodied Navigation in the Real World
Roberto Bigazzi, Federico Landi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara
CAIP, 2021
arXiv  /  bibtex  /  code  /  slides

We detail how to deploy a navigation policy trained on the Habitat simulator on a LoCoBot. Additionally, we study the performance on five different PointGoal navigation episodes in a real-world, challenging setting.

Transform, Warp, and Dress: A New Transformation-Guided Model for Virtual Try-On
Matteo Fincato, Marcella Cornia, Federico Landi, Fabio Cesari, Rita Cucchiara
TOMM, 2021

We present a new dataset of upper-body clothes for virtual try-on with high-resolution images. We also propose a new model for virtual try-on that can generate high-quality images using a three-stage pipeline.

VITON-GT: An Image-based Virtual Try-On Model with Geometric Transformations
Matteo Fincato, Federico Landi, Marcella Cornia, Fabio Cesari, Rita Cucchiara
ICPR, 2020
paper  /  bibtex  /  poster  /  slides  /  presentation (video)

We propose a new image-based virtual try-on model that can generate high-quality images. We exploit learnable affine and thin-plate spline transformations, combined with a generative network, to create new images of a person wearing different clothes.

Explore and Explain: Self-supervised Navigation and Recounting
Roberto Bigazzi, Federico Landi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara
ICPR, 2020 (Oral presentation)
arXiv  /  bibtex  /  poster  /  slides  /  presentation (video)

A novel setting for Embodied AI in which an agent needs to explore a previously unknown environment while describing what it sees. The proposed model employs a self-supervised exploration module with penalty, and a fully-attentive captioning model for explanation.

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
Federico Landi, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara
BMVC, 2019 (Oral presentation)
arXiv  /  bibtex  /  code  /  poster  /  slides  /  talk (video)

In Vision-and-Language Navigation, an agent needs to reach a target destination guided only by a natural language instruction. We exploit dynamic convolutional filters to ground the linguistic description into the visual observation in an elegant and efficient way.

Anomaly locality in video surveillance
Federico Landi, Cees Snoek, Rita Cucchiara
ArXiv, 2019
arXiv  /  bibtex  /  dataset

We explore the impact of considering spatiotemporal tubes instead of whole-frame video segments for anomaly detection in video surveillance. We create UCFCrime2Local: the first dataset for anomaly detection with bounding box supervision in both its training and test sets.

Reviewing Service


  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  • IEEE Robotics and Automation Letters (RAL)

  • ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)

  • Pattern Recognition Letters (PRL)


  • IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

  • IEEE International Conference on Computer Vision (ICCV)

  • ACM International Conference on Multimedia (ACMMM)

  • IEEE International Conference on Robotics and Automation (ICRA)

  • IEEE International Conference on Pattern Recognition (ICPR)

Teaching Activities
  • Computer Architecture - Prof. Rita Cucchiara, Prof. Simone Calderara, 2020-2021

  • Machine Learning and Deep Learning - IFOA, 2020

  • Deep Learning - Nuova Didactica, 2020, 2022

Courses and Summer Schools
  • Advanced Course on Data Science and Machine Learning - ACDL 2020, Remote (certificate)

  • International Computer Vision Summer School - ICVSS 2019, Scicli (RG), Italy (certificate)

  • In 2019, I carried out the 3D acquisition of the museum in Galleria Estense, in Modena (see below). One year later, the virtual spaces created for research purposes made it possible to offer free guided tours to schools and young students during the Covid-19 lockdown in Italy.

  • I like practicing Shuai Jiao (摔跤) and lifting weights.
