JCSE, vol. 17, no. 2, pp. 41-50, 2023
DOI: http://dx.doi.org/10.5626/JCSE.2023.17.2.41
Looking to Personalize Gaze Estimation Using Transformers
Seung Hoon Choi, Donghyun Son, Yunjong Ha, Yonggyu Kim, Seonghun Hong, and Taejung Park
VisualCamp, Seoul, South Korea
Department of Cybersecurity, Duksung Women's University, Seoul, South Korea
Abstract: Anatomical differences between people limit the accuracy of appearance-based gaze estimation. Few-shot approaches
can account for these differences through further optimization. However, these approaches incur additional
computational cost and are vulnerable to corrupt input data, which restricts the use of accurate gaze
estimation in real-world scenarios. To solve this problem, we introduce a novel and robust gaze estimation
calibration framework called personal transformer-based gaze estimation (PTGE), which uses a deep learning network
separate from the gaze estimation model to adapt to new users. This network learns to model and estimate person-specific
differences in gaze estimation as a low-dimensional latent vector from image features, head pose information, and
gaze point labels. Through this separate network, PTGE removes the expensive optimization process
required by few-shot approaches. The network is composed of transformers, allowing self-attention to weigh the
quality of calibration samples and mitigate the negative effects of corrupt inputs. PTGE achieves near-state-of-the-art
performance of 1.49 cm on GazeCapture with a small number of calibration samples (≤16) and no optimization when
adapting to a new user, only a 2% decrease from the state of the art, achieved without the hour-long optimization process.
Keywords: Gaze estimation; Transformer; Artificial intelligence; Human computer interaction; Personal calibration
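As a rough illustration of the calibration network described in the abstract, the sketch below shows a transformer encoder that maps a set of calibration samples (image features, head pose, and gaze point labels) to a single person-specific latent vector, with self-attention free to downweight low-quality samples. This is a minimal sketch in PyTorch: every name, dimension, and layer choice here (CalibrationTransformer, feat_dim, d_model, the mean pooling) is an assumption for illustration, not the paper's actual architecture.

```python
# Hypothetical sketch of a PTGE-style calibration network: a transformer
# that encodes up to 16 calibration samples into a low-dimensional
# person-specific latent vector. Dimensions and layers are assumptions.
import torch
import torch.nn as nn

class CalibrationTransformer(nn.Module):
    def __init__(self, feat_dim=128, head_dim=3, gaze_dim=2,
                 d_model=64, n_heads=4, n_layers=2, latent_dim=16):
        super().__init__()
        # Each calibration sample = image features + head pose + gaze label.
        self.embed = nn.Linear(feat_dim + head_dim + gaze_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.to_latent = nn.Linear(d_model, latent_dim)

    def forward(self, img_feats, head_pose, gaze_labels):
        # img_feats: (B, K, feat_dim), head_pose: (B, K, 3),
        # gaze_labels: (B, K, 2), with K <= 16 calibration samples.
        tokens = self.embed(
            torch.cat([img_feats, head_pose, gaze_labels], dim=-1))
        # Self-attention across samples lets the network weigh sample
        # quality, so corrupt calibration inputs can be downweighted.
        encoded = self.encoder(tokens)
        # Pool over samples into one person-specific latent vector.
        return self.to_latent(encoded.mean(dim=1))

# Example: 16 calibration samples for one new user, no optimization needed.
net = CalibrationTransformer()
z = net(torch.randn(1, 16, 128), torch.randn(1, 16, 3), torch.randn(1, 16, 2))
print(z.shape)  # torch.Size([1, 16]): the person-specific latent vector
```

Because adaptation is a single forward pass through this network rather than a gradient-based fine-tuning loop, the per-user optimization cost of few-shot approaches disappears; the latent vector would then condition the (separate) gaze estimation model.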