Discovery

Research collaborator: DoCoMo Euro-Labs (Corporate). Number of joint works: 2

Article　2012　JSTAGE

Leveraging features from background and salient regions for automatic image annotation（Last author）

Supheakmungkol Sarin, Michael Fahrmair, Matthias Wagner, Wataru Kameyama
Journal of Information Processing
[Abstract] In this era of information explosion, automating the annotation process of digital images is a crucial step towards efficient and effective management of this increasingly high volume of content. However, this still is a highly challenging task for the research community. One of the main bottlenecks is the lack of integrity and diversity of features. We propose to solve this problem by utilizing 43 image features that cover the holistic content of the image from global to subject, background and scene. In our approach, salient regions and the background are separated without prior knowledge. Each of them together with the whole image are treated independently for feature extraction. Extensive experiments were designed to show the efficiency and the effectiveness of our approach. We chose two publicly available datasets manually annotated with diverse nature of images for our experiments, namely, the Corel5K and ESP Game datasets. We confirm the superior performance of our approach over the use of a single whole image using sign test with p-value < 0.05. Furthermore, our combined feature set gives satisfactory performance compared to recently proposed approaches especially in terms of generalization even with just a simple combination. We also obtain a better performance with the same feature set versus the grid-based approach. More importantly, when using our features with the state-of-the-art technique, our results show higher performance in a variety of standard metrics. © 2012 Information Processing Society of Japan.
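The abstract reports a sign test with p-value < 0.05 when comparing the region-based features against the whole image alone. As a rough illustration of that kind of comparison, here is a minimal sketch of a two-sided exact sign test on paired scores. The function name, the choice of a two-sided test, and the sample scores are assumptions made for illustration; the abstract does not say which scores were paired or how ties were handled.

```python
from math import comb


def sign_test_p_value(scores_a, scores_b):
    """Two-sided exact sign test on paired scores (ties are dropped).

    Hypothetical helper: under the null hypothesis each non-tied pair is
    equally likely to favour either method, so the number of wins follows
    a Binomial(n, 0.5) distribution.
    """
    wins = sum(1 for a, b in zip(scores_a, scores_b) if a > b)
    losses = sum(1 for a, b in zip(scores_a, scores_b) if a < b)
    n = wins + losses
    if n == 0:
        return 1.0  # every pair tied: no evidence either way
    k = min(wins, losses)
    # Probability of seeing k or fewer of the rarer sign, doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up per-keyword scores, NOT taken from the paper.
region_based = [0.41, 0.38, 0.52, 0.47, 0.33, 0.60, 0.45, 0.39, 0.50, 0.44]
whole_image = [0.35, 0.36, 0.49, 0.40, 0.34, 0.51, 0.41, 0.30, 0.46, 0.44]

p = sign_test_p_value(region_based, whole_image)
print(f"p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

The sign test only uses the direction of each paired difference, which makes it a conservative choice when the size of the differences cannot be assumed to follow a particular distribution.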

Conference Paper　2011　IEEE : Institute of Electrical and Electronics Engineers

Holistic feature extraction for automatic image annotation（Last author）

Supheakmungkol Sarin, Michael Fahrmair, Matthias Wagner, Wataru Kameyama
[Abstract] Automating the annotation process of digital images is a crucial step towards efficient and effective management of this increasingly high volume of content. It is, nevertheless, an extremely challenging task for the research community. One of the main bottlenecks is the lack of integrity and diversity of features. We solve this problem by proposing to utilize 43 image features that cover the holistic content of the image from global to subject, background, and scene. In our approach, salient regions and background are separated without prior knowledge. Each of them, together with the whole image, is treated independently for feature extraction. Extensive experiments were designed to show the efficiency and effectiveness of our approach. We chose two publicly available, manually annotated datasets with images of a diverse nature for our experiments, namely, the Corel5k and ESP Game datasets. They contain 5,000 images with 260 keywords and 20,770 images with 268 keywords, respectively. Through empirical experiments, it is confirmed that by using our features with the state-of-the-art technique, we achieve superior performance in many metrics, particularly in auto-annotation. © 2011 IEEE.
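The abstract describes treating the salient region, the background, and the whole image independently for feature extraction. The following toy sketch shows that general idea under strong simplifying assumptions: `toy_saliency_mask` is a hypothetical stand-in that thresholds at the mean intensity (the abstract does not name the saliency method), and a 4-bin grey-level histogram stands in for the 43 features, which the abstract does not enumerate.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalised intensity histogram of a flat list of grey levels."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // max_value, bins - 1)] += 1
    total = len(pixels)
    return [c / total if total else 0.0 for c in counts]


def toy_saliency_mask(image):
    """Stand-in saliency: mark pixels brighter than the image mean.

    This is an assumption for illustration only, not the paper's method.
    """
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    return [[p > mean for p in row] for row in image]


def holistic_features(image, mask, bins=4):
    """Concatenate features of the whole image, salient region, and background."""
    whole = [p for row in image for p in row]
    salient = [p for row, mrow in zip(image, mask)
               for p, m in zip(row, mrow) if m]
    background = [p for row, mrow in zip(image, mask)
                  for p, m in zip(row, mrow) if not m]
    # Each region is described independently, then the vectors are joined.
    return (histogram(whole, bins)
            + histogram(salient, bins)
            + histogram(background, bins))


# A tiny 2x3 greyscale "image" with a bright region on the right.
image = [[0, 0, 255],
         [0, 255, 255]]
features = holistic_features(image, toy_saliency_mask(image))
print(len(features), features)
```

Concatenating per-region vectors keeps the descriptor of the subject from being diluted by the background, which is the intuition the abstract gives for covering the image "from global to subject, background, and scene".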