Marc Aurel Kastner

Ph.D. (Informatics)

Estimating the imageability of words by mining visual characteristics from crawled image data


Authors: Marc A. Kastner, Ichiro Ide, Frank Nack, Yasutomo Kawanishi, Takatsugu Hirayama, Daisuke Deguchi, Hiroshi Murase

Abstract:

Natural language processing and multi-modal analyses are key elements in many applications. However, the semantic gap is a persistent problem, leading to unnatural results disconnected from the users' perception. To understand semantics in multimedia applications, human perception needs to be taken into consideration. Imageability is a concept originating from psycholinguistics that quantifies the human perception of words. Research shows a relationship between language usage and the imageability of words, making it useful for multi-modal applications. However, the creation of imageability datasets is often manual and labor-intensive. In this paper, we propose a method that mines a variety of visual features from image data to estimate the imageability of words. The main assumption is a relationship between the imageability of concepts, human perception, and the contents of Web-crawled images. Using a set of low- and high-level visual features from Web-crawled images, a model is trained to predict imageability. The evaluations show that imageability can be predicted with both a sufficiently low error and a high correlation to the ground-truth annotations. The proposed method can be used to expand the corpus of imageability dictionaries.
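The evaluation described in the abstract compares predicted imageability scores with ground-truth annotations in terms of error and correlation. The following is a minimal sketch of such a comparison, not the paper's implementation: the abstract does not name the exact metrics, so mean absolute error and Pearson correlation are assumed here, and the function names and all scores are hypothetical.

```python
import math


def mean_absolute_error(truth, pred):
    """Average absolute difference between annotated and predicted scores."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def pearson_correlation(truth, pred):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(truth)
    mean_t = sum(truth) / n
    mean_p = sum(pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, pred))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical imageability scores for four words (made-up values,
# not taken from the paper or from any real dictionary).
ground_truth = [6.1, 2.3, 5.4, 3.0]
predicted = [5.8, 2.9, 5.0, 3.5]

print("MAE:", mean_absolute_error(ground_truth, predicted))
print("Pearson r:", pearson_correlation(ground_truth, predicted))
```

A low error together with a correlation close to 1 would correspond to the outcome the abstract reports; in practice the predicted scores would come from the trained model rather than from a hand-written list.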

Type: Journal paper in Multimedia Tools and Applications (MTAP), 79(25), 18167-18199

Date: February 2020

DOI: 10.1007/s11042-019-08571-4

External links: [ github ]



