Attention-Guided Robust Gaze Tracking for Enhanced Human-Robot Interactions

Project: Research


Collaboration between robots and humans is becoming increasingly important in many robotic applications. When interacting with a human, a robot needs to know which objects the human is referring to and where they are physically located. Gaze tracking can help communicate human intentions to the robot in such cases. However, tracking an object by estimating the 3D point of gaze (POG) in space is very difficult to achieve with a gaze tracker alone. To solve this problem, we propose to explore the human visual attention mechanism for use with a gaze tracker to achieve robust object tracking for human-robot interaction.

The tracking must address several challenging issues, including background clutter, occlusions, and rapid changes in object pose. Quickly focusing on the most relevant parts of the scene and automatically learning useful image features will address these problems. This project aims to develop an attention-driven object tracking method with deep feature learning. Here, visual attention controls where to look, whereas deep feature learning conveys what is seen. To find the focused region in which the target object may lie, we propose to use bottom-up visual attention with a gaze tracker in the initialization stage. In the tracking stage, by contrast, a top-down visual attention model will be adopted. We propose to adapt and fine-tune a pre-trained deep model for tracking, in which some layers will extract view-invariant features for semantic segmentation, while other layers will extract view-variant representations for recovering the object pose. The visual attention module will output a small set of potentially relevant candidate regions to be evaluated using the learnt image features. In this way, robust object tracking can be achieved.

In this project, gaze estimation is responsible for providing the initial location of the object of interest in the 2D image via the gaze direction extracted from the gaze tracker. The tracker, based on visual attention and deep feature learning, is then responsible for locating and tracking the target online during interaction with a robot. The output of the project will be a robust gaze-guided tracking system for communicating human intentions to a robot. This will enable robots to assist humans collaboratively by following human intentions, and the new type of human-robot interface enhanced by the gaze-guided tracking system will benefit robotic applications in general.


Project number: 9042868
Grant type: GRF
Effective start/end date: 1/01/20 to 26/06/24