About me

I have been an Assistant Professor at the University of Udine (Italy) since January 2017. I received my Ph.D. in Computer Engineering, Multimedia and Telecommunications in 2010 from the University of Florence, Italy. From 2014 to 2016 I was a senior postdoctoral researcher at the University of Modena and Reggio Emilia, Italy. I was a visiting researcher at Carnegie Mellon University, Pittsburgh, USA, and at Telecom ParisTech/ENST, Paris, in 2006 and 2010 respectively. My research interests include Machine Learning, Wearable Computing, Natural Language Processing, and Image and Video Analysis. I teach Machine Learning at the University of Udine.

I have published more than 50 papers in prestigious journals and conferences in Multimedia. I received the best paper award at the IEEE International Conference on Intelligent Technologies for Interactive Entertainment in 2015 for the paper “Wearable Vision for Retrieving Architectural Details in Augmented Tourist Experiences”. I was the lead organizer of the “International Workshop on Egocentric Perception, Interaction and Computing (EPIC)” in 2016 and 2017 (ECCV’16 - ICCV’17) and I gave tutorials at two international conferences (ICPR’12, CAIP’13). I also served as a Guest Editor of the Special Issue on “Wearable and Ego-vision Systems for Augmented Experience” in IEEE Transactions on Human-Machine Systems. I was a Technical Program Committee member of several conferences and workshops, and I regularly serve as a reviewer for international conferences and journals such as ACM Multimedia, ECCV, CVPR, IEEE Transactions on Multimedia, IEEE Transactions on Information Forensics and Security, and Pattern Recognition.

Prospective Ph.D. Students:

If you are a prospective student interested in Machine Learning and Deep Learning research at the University of Udine, please read about our Ph.D. admissions process and contact me. If you are applying to our Ph.D. course in Computer Science and are interested in my research, please state this in your statement of purpose.



Bidirectional LSTM Recurrent Neural Network for Keyphrase Extraction

Our paper “Bidirectional LSTM Recurrent Neural Network for Keyphrase Extraction” by M. Basaldella, E. Antolli, G. Serra and C. Tasso has been accepted for publication at the Italian Research Conference on Digital Libraries 2018.

To achieve state-of-the-art performance, keyphrase extraction systems rely on domain-specific knowledge and sophisticated features. In this paper, we propose a neural network architecture based on a Bidirectional Long Short-Term Memory Recurrent Neural Network that is able to detect the main topics of the input documents without the need to define new hand-crafted features. A preliminary experimental evaluation on the well-known INSPEC dataset confirms the effectiveness of the proposed solution.
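In this kind of architecture, keyphrase extraction is typically cast as sequence labeling: each token in the document receives a B (begin), I (inside) or O (outside) tag, and the recurrent network learns to predict the tag sequence. As a minimal illustration of how training labels can be derived from a list of gold keyphrases (the function name and details are hypothetical, not the paper's code):

```python
def bio_tags(tokens, keyphrases):
    """Label each token B (begin), I (inside) or O (outside)
    with respect to a set of keyphrases, producing the target
    sequence for a sequence-labelling network such as a BiLSTM."""
    tags = ["O"] * len(tokens)
    for kp in keyphrases:
        kp_toks = [t.lower() for t in kp.split()]
        n = len(kp_toks)
        # scan the document for every occurrence of the keyphrase
        for i in range(len(tokens) - n + 1):
            if [t.lower() for t in tokens[i:i + n]] == kp_toks:
                tags[i] = "B"
                for j in range(i + 1, i + n):
                    tags[j] = "I"
    return tags
```

For example, `bio_tags("we study neural keyphrase extraction".split(), ["keyphrase extraction"])` yields `["O", "O", "O", "B", "I"]`.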


Special Issue on: Wearable and Ego-vision Systems for Augmented Experience

The rapid progress in the development of low-level component technologies such as wearable cameras, wearable sensors, wearable displays and wearable computers is making it possible to augment everyday living. Wearable and egocentric vision systems can be exploited to analyze multi-modal data (e.g. video, audio, motion) and to support the understanding of human interactions with the world (including gesture recognition, action recognition and social interaction recognition). Based on the processing of such data, wearable systems can be used to enhance our capabilities and augment our perception. State-of-the-art techniques for wearable sensing can support assistive technologies and advanced perception. This special issue intends to highlight research supporting human performance through egocentric sensing.

Best paper at INTETAIN 2015

The paper “Wearable Vision for Retrieving Architectural Details in Augmented Tourist Experiences” by Stefano Alletto, Davide Abati, Giuseppe Serra and Rita Cucchiara received the best paper award at INTETAIN in Turin. In this paper we propose an egocentric vision system to enhance tourists’ cultural heritage experience. Exploiting a wearable board and a glass-mounted camera, the visitor can retrieve architectural details of the historical building he is observing and receive related multimedia contents. To obtain an effective retrieval procedure we propose a visual descriptor based on the covariance of local features. Unlike common Bag of Words approaches, our feature vector does not rely on a generated visual vocabulary, removing the dependence on a specific dataset and reducing the computational cost. 3D modeling is used to achieve a precise visitor localization that allows browsing visible relevant details that the user might otherwise miss. Experimental results conducted on a publicly available cultural heritage dataset show that the proposed feature descriptor outperforms Bag of Words techniques.
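The core idea of a covariance descriptor is to summarize a patch by the covariance matrix of its local features, so no visual vocabulary has to be learned from a training set. A minimal sketch of this idea (illustrative only; the paper's actual feature choice and any further mapping of the covariance matrix are described in the paper itself):

```python
import numpy as np

def covariance_descriptor(features):
    """Compute a covariance descriptor from a set of local features.

    `features` is an (n_observations, d) array, one local feature
    vector per row. Unlike Bag of Words, no visual vocabulary is
    needed: the descriptor is the covariance of the features,
    vectorized over its upper triangle (the matrix is symmetric,
    so the lower triangle is redundant)."""
    C = np.cov(features, rowvar=False)
    iu = np.triu_indices(C.shape[0])
    return C[iu]
```

For d-dimensional features the descriptor has length d(d+1)/2, e.g. 15 values for d = 5, independent of how many observations the patch contains.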

Gesture Recognition Using Wearable Vision Sensors to Enhance Visitors' Museum Experiences

Our paper “Gesture Recognition Using Wearable Vision Sensors to Enhance Visitors' Museum Experiences” by L. Baraldi, F. Paci, G. Serra, L. Benini, R. Cucchiara has been accepted for publication by the IEEE Sensors Journal.

We introduce a novel approach to cultural heritage experience: by means of ego-vision embedded devices we develop a system which offers a more natural and entertaining way of accessing museum knowledge. Our method is based on distributed self-gesture and artwork recognition, and does not need fixed cameras or radio-frequency identification sensors. We propose the use of dense trajectories sampled around the hand region to perform self-gesture recognition, understanding the way a user naturally interacts with an artwork, and demonstrate that our approach can benefit from distributed training. We test our algorithms on publicly available datasets and extend our experiments to both virtual and real museum scenarios, where our method shows robustness when challenged with real-world data. Furthermore, we run an extensive performance analysis on our ARM-based wearable device.

International Workshop on Wearable and Ego-vision Systems for Augmented Experience (WEsAX)

I am a co-organizer, with Rita Cucchiara, Kris M. Kitani and Javier Civera, of the first International Workshop on Wearable and Ego-vision Systems for Augmented Experience (WEsAX), which will be held on July 3, 2015, in conjunction with the IEEE International Conference on Multimedia and Expo (ICME), Turin, Italy.

The rapid progress in the development of systems based on wearable cameras, multi-sensor wearable devices and embedded computers has created the conditions to use multimedia and computer vision technologies to augment human experience in everyday activities such as sport, education, social interactions, cultural heritage visits, etc.

Wearable systems can be exploited for collecting and analyzing multimedia data in real time (e.g. video, audio and multi-sensor responses); wearable cameras can be enriched with egocentric vision (or ego-vision) to automatically understand gestures, actions, social interactions, objects and events in the surrounding world. These systems can enhance our capabilities and augment our perception to create a highly customizable and personal way of seeing the world. We believe that we are only at the beginning, and that these technologies and their applications can have a great impact on future creative industries and can improve the quality of life.

The goal of the first International Workshop on Wearable and Ego-vision Systems for Augmented Experience (WEsAX) is to give an overview of recent technologies and system solutions, and to create a forum to exchange ideas and address the challenges emerging in this field.

Tutorial at ICPR 2014

Costantino Grana and I gave a tutorial at the International Conference on Pattern Recognition in Stockholm on "Recent advancements on the Bag of Visual Words model for image classification and concept detection". Tutorial slides


ACVR2014 - Second Workshop on Assistive Computer Vision and Robotics

Assistive technologies provide a set of advanced tools that can improve the quality of life not only for the disabled, patients and the elderly, but also for healthy people struggling with everyday actions. After a period of slow but steady scientific progress, this area seems ready for new research and application breakthroughs. The rapid progress in the development of integrated micro-mechatronic and computer vision tools has boosted this process.

The goal of the workshop is to give an overview of the state of the art of the perception and interaction methodologies involved in this area, with special attention to aspects related to computer vision and robotics. Call for papers: http://www.ino.it/ACVR2014/

Local Pyramidal Descriptors for Image Recognition

Our paper “Local Pyramidal Descriptors for Image Recognition” by L. Seidenari, G. Serra, A. D. Bagdanov, A. Del Bimbo has been accepted for publication by IEEE Transactions on Pattern Analysis and Machine Intelligence.

In this paper we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail, and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one’s bets on which levels will be most relevant for matching during learning and recognition. We introduce the Pyramid SIFT (P-SIFT) descriptor and show that its use in four state-of-the-art image recognition pipelines improves accuracy and yields state-of-the-art results. Our technique is applicable independently of spatial pyramid matching, and we show that spatial pyramids can be combined with local pyramids to obtain further improvement. We achieve state-of-the-art results on Caltech-101 (80.1%) and Caltech-256 (52.6%) when compared to other approaches based on SIFT features over intensity images. Our technique is efficient and is extremely easy to integrate into image recognition pipelines.
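The key idea of pooling "at multiple levels of descriptor detail" can be sketched as follows: inside a single patch, orientation histograms are pooled over grids of increasing spatial resolution (e.g. 1x1, 2x2, 4x4), and the level descriptors are concatenated. This is an illustrative simplification, not the paper's actual P-SIFT implementation; all names and parameters here are hypothetical:

```python
import numpy as np

def pyramid_descriptor(patch_orient, patch_mag, levels=(1, 2, 4), bins=8):
    """Pool gradient-orientation histograms at several spatial
    resolutions within one patch and concatenate the results.

    `patch_orient` holds per-pixel gradient orientations in
    [0, 2*pi), `patch_mag` the corresponding gradient magnitudes
    used as histogram weights."""
    h, w = patch_orient.shape
    desc = []
    for g in levels:  # one pooling grid per pyramid level
        for by in range(g):
            for bx in range(g):
                ys = slice(by * h // g, (by + 1) * h // g)
                xs = slice(bx * w // g, (bx + 1) * w // g)
                hist, _ = np.histogram(patch_orient[ys, xs], bins=bins,
                                       range=(0.0, 2.0 * np.pi),
                                       weights=patch_mag[ys, xs])
                desc.append(hist)
    v = np.concatenate(desc).astype(float)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

With levels (1, 2, 4) and 8 orientation bins, the descriptor has (1 + 4 + 16) x 8 = 168 dimensions; a matcher can then weight the coarse and fine levels differently instead of committing to a single pooling resolution up front.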


3rd International Conference on Informatics, Electronics & Vision (ICIEV), 23-24 May, 2014, Dhaka, Bangladesh

ICIEV13 was technically co-sponsored by the IEEE Computer Society, the IEEE Technical Committee on Pattern Analysis and Machine Intelligence (TCPAMI), the Society of Instrument and Control Engineers (SICE), and the Institute of Control, Robotics and Systems (ICROS), and sponsored by The Optical Society (OSA). ICIEV12 was sponsored by The Optical Society (OSA), technically co-sponsored by IEEE ComSoc BD, and endorsed by the International Association for Pattern Recognition (IAPR). The conference provides vibrant opportunities for researchers, industry practitioners, and students of CSE/EEE/IT/Physics/Statistics/… to share their research results on computing, IT, informatics, electronics, computer vision and related fields. Through presentations of peer-reviewed accepted papers, invited talks, exhibitions and networking, ICIEV provides an avenue to share knowledge, build networks, and develop a community of researchers, drawing on the experience of experts. ICIEV also hosts organized sessions, tutorials and mini-workshops on challenging future research areas. At ICIEV13, 210 papers were accepted from 468 submissions; at ICIEV12, 226 papers were accepted from 393 submissions. ICIEV2014 welcomes you to take part! As with ICIEV12 and ICIEV13, we are working to publish the papers in the IEEE Xplore Digital Library. Call for papers: http://cennser.org/ICIEV/cfpICIEV14.pdf

