Automated emotion recognition in the wild from facial images remains a challenging problem. Although recent advances in deep learning have represented a significant breakthrough in this topic, strong changes in pose, orientation, and point of view severely harm current approaches. In addition, the acquisition of labeled datasets is costly, and the current state-of-the-art deep learning algorithms cannot model all the aforementioned difficulties. In this article, we propose applying a multitask learning loss function to share a common feature representation with other related tasks. In particular, we show that emotion recognition benefits from jointly learning a model with a detector of facial action units (collective muscle movements). The proposed loss function addresses the problem of learning multiple tasks with heterogeneously labeled data, improving on previous multitask approaches. We validate the proposal using three datasets acquired in noncontrolled environments, and an application to predict compound facial emotion expressions.
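
To make the idea of learning several tasks from heterogeneously labeled data concrete, the following is a minimal sketch of one common way to do it: each task's loss is averaged only over the samples that actually carry a label for that task. It is not the loss function proposed in the article, whose exact form is not given in this abstract. The function names, the dictionary keys, the `au_weight` parameter, and the use of `None` to mark a missing label are all assumptions made for illustration.

```python
import math

def softmax_cross_entropy(logits, target_index):
    """Cross-entropy for one single-label sample (e.g. one emotion class)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target_index]

def sigmoid_binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over independent labels (e.g. action units)."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Stable form of -[t*log(s) + (1-t)*log(1-s)] with s = sigmoid(z)
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def masked_multitask_loss(batch, au_weight=1.0):
    """Sum of per-task mean losses, skipping samples that lack a task's label.

    Hypothetical sample layout (an assumption, not taken from the article):
      emotion_logits: list of floats, emotion: class index or None
      au_logits: list of floats, aus: list of 0/1 values or None
    """
    emotion_losses, au_losses = [], []
    for sample in batch:
        if sample["emotion"] is not None:
            emotion_losses.append(
                softmax_cross_entropy(sample["emotion_logits"], sample["emotion"]))
        if sample["aus"] is not None:
            au_losses.append(
                sigmoid_binary_cross_entropy(sample["au_logits"], sample["aus"]))
    # A task with no labeled samples in the batch contributes nothing.
    emotion_term = sum(emotion_losses) / len(emotion_losses) if emotion_losses else 0.0
    au_term = sum(au_losses) / len(au_losses) if au_losses else 0.0
    return emotion_term + au_weight * au_term

# One sample labeled only with an emotion, one labeled only with action units.
example_batch = [
    {"emotion_logits": [0.0, 0.0], "emotion": 0, "au_logits": [0.0, 0.0], "aus": None},
    {"emotion_logits": [0.0, 0.0], "emotion": None, "au_logits": [0.0, 0.0], "aus": [1, 0]},
]
example_loss = masked_multitask_loss(example_batch)
```

In this sketch a sample from an emotion-only dataset still passes through the shared network, but it contributes nothing to the action unit term, so datasets with different annotations can be mixed in a single batch.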

doi.org/10.1109/TCYB.2020.3036935
IEEE Transactions on Cybernetics
Distributed and Interactive Systems

Pons Rodriguez, G., & Masip, D. (2020). Multitask, multilabel, and multidomain learning with convolutional networks for emotion recognition. IEEE Transactions on Cybernetics, 52(6), 4764–4771. doi:10.1109/TCYB.2020.3036935