Page Manager: Webmaster
Last update: 9/11/2012 3:13 PM


Visual grounding of spatial relations in recurrent neural language models

Conference contribution
Authors Mehdi Ghanimifard
Simon Dobnik
Published in Workshop on Models and Representations in Spatial Cognition (MRSC-3) at 11th International Conference on Spatial Cognition 2018, 5 September 2018, Tübingen, Germany
Publication year 2018
Published at Department of Philosophy, Linguistics and Theory of Science
Language en
Keywords spatial recognition, object recognition, image description, neural language model, grounded language model
Subject categories Computational linguistics


The task of automatically describing an image in natural language requires techniques for associating linguistic units with their corresponding visual representations. In state-of-the-art systems, most commonly, a pre-trained convolutional neural network extracts visual features from the image, and a neural language model with an attention mechanism is then trained as a decoder to generate descriptions. In this project, we explore the possibility of using the locations of objects as explicit features for detecting spatial relations between them in a recurrent neural language model.
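The idea of conditioning a recurrent language model on explicit object locations can be sketched as follows. This is a purely illustrative toy in numpy, not the authors' actual model: the dimensions, parameters, and the encoding of a bounding box as centre-plus-size features are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def location_features(box):
    # box = (x, y, w, h), normalised to [0, 1]; encode centre and size
    # as explicit spatial features (one possible choice, assumed here)
    x, y, w, h = box
    return np.array([x + w / 2.0, y + h / 2.0, w, h])

def rnn_step(h, x, Wh, Wx, b):
    # one vanilla recurrent step: h' = tanh(Wh h + Wx x + b)
    return np.tanh(Wh @ h + Wx @ x + b)

# toy dimensions (hypothetical)
embed_dim, loc_dim, hidden_dim, vocab = 8, 8, 16, 5

# randomly initialised parameters, standing in for trained weights
Wh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
Wx = rng.normal(size=(hidden_dim, embed_dim + loc_dim)) * 0.1
b = np.zeros(hidden_dim)
Wout = rng.normal(size=(vocab, hidden_dim)) * 0.1

word_embedding = rng.normal(size=embed_dim)  # embedding of the previous word
target_box = (0.1, 0.2, 0.3, 0.3)            # located object, e.g. "the cup"
landmark_box = (0.5, 0.6, 0.4, 0.3)          # reference object, e.g. "the table"

# concatenate the word embedding with the explicit location features
# of both objects, so the recurrent step sees where the objects are
loc = np.concatenate([location_features(target_box),
                      location_features(landmark_box)])
x = np.concatenate([word_embedding, loc])

h = rnn_step(np.zeros(hidden_dim), x, Wh, Wx, b)
logits = Wout @ h                            # scores over a toy vocabulary
probs = np.exp(logits) / np.exp(logits).sum()
print(probs.shape)
```

In a trained model the output distribution would be over description words, so the location features give the decoder direct geometric evidence (relative position of the two boxes) for choosing relation terms such as "above" or "left of", rather than relying only on pooled convolutional features.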
