Humans can quickly recognize objects in a dynamically changing world. This ability is showcased by the fact that observers succeed at recognizing objects in rapidly changing image sequences, at up to 13 ms/image. To date, the mechanisms that govern dynamic object recognition remain poorly understood. Here, we developed deep learning models for dynamic recognition and compared different computational mechanisms, contrasting feedforward and recurrent, single-image and sequential processing as well as different forms of adaptation. We found that only models that integrate images sequentially via lateral recurrence mirrored human performance (N = 36) and were predictive of trial-by-trial responses across image durations (13-80 ms/image). Importantly, models with sequential lateral-recurrent integration also captured how human performance changes as a function of image presentation durations, with models processing images for a few time steps capturing human object recognition at shorter presentation durations and models processing images for more time steps capturing human object recognition at longer presentation durations. Furthermore, augmenting such a recurrent model with adaptation markedly improved dynamic recognition performance and accelerated its representational dynamics, thereby predicting human trial-by-trial responses using fewer processing resources. Together, these findings provide new insights into the mechanisms rendering object recognition so fast and effective in a dynamic visual world.
PLoS Computational Biology

Sörensen, L., Bohte, S., de Jong, D., Slagter, H., & Scholte, S. (2023). Mechanisms of human dynamic object recognition revealed by sequential deep neural networks. PLoS Computational Biology, 19(6). doi:10.1371/journal.pcbi.1011169