Attention-Based Image Captioning Using DenseNet Features

Md Zakir Hossain*, Ferdous Sohel, Mohd Fairuz Shiratuddin, Hamid Laga, Mohammed Bennamoun

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We present an attention-based image captioning method using DenseNet features. Conventional image captioning methods depend on visual information of the whole scene to generate image captions. Such a mechanism often fails to get the information of salient objects and cannot generate semantically correct captions. We consider an attention mechanism that can focus on relevant parts of the image to generate fine-grained description of that image. We use image features from DenseNet. We conduct our experiments on the MSCOCO dataset. Our proposed method achieved 53.6, 39.8, and 29.5 on BLEU-2, 3, and 4 metrics, respectively, which are superior to the state-of-the-art methods.

Original languageEnglish
Title of host publicationNeural Information Processing - 26th International Conference, ICONIP 2019, Proceedings
EditorsTom Gedeon, Kok Wai Wong, Minho Lee
PublisherSpringer
Pages109-117
Number of pages9
ISBN (Print)9783030368012
DOIs
Publication statusPublished - Jan 1 2019
Event26th International Conference on Neural Information Processing, ICONIP 2019 - Sydney, Australia
Duration: Dec 12 2019Dec 15 2019

Publication series

NameCommunications in Computer and Information Science
Volume1143 CCIS
ISSN (Print)1865-0929
ISSN (Electronic)1865-0937

Conference

Conference26th International Conference on Neural Information Processing, ICONIP 2019
CountryAustralia
CitySydney
Period12/12/1912/15/19

Keywords

  • Attention
  • DenseNet
  • Image captioning

ASJC Scopus subject areas

  • Computer Science(all)
  • Mathematics(all)

Fingerprint Dive into the research topics of 'Attention-Based Image Captioning Using DenseNet Features'. Together they form a unique fingerprint.

Cite this