Abstract:
Summarization tasks are widely used to condense multiple documents into a short description or to find representative information. Text summarization condenses textual information into a short description, whereas image collection summarization, also known as album summarization, aims to find visually representative information. Scene graph generation is well suited to describing the visual context of an image, and in recent years incorporating external knowledge into scene graph generation has proven effective. We therefore propose a novel scene graph summarization method that incorporates external knowledge. The key idea of the proposed method is to strengthen the relation predictor on the ambiguous relationships found in an image collection. Because annotated data for this task is limited, we first train and evaluate the model on the Visual Genome dataset, a single-image scene graph dataset. We then introduce an extended, annotated MS-COCO dataset to evaluate the model on image collection scene graph summarization. A preliminary experiment indicates that this approach is a promising direction.
Type: Poster at MIRU Symposium (画像の認識・理解シンポジウム)
Publication date: July 2023