TY - GEN
T1 - Attention-Based Image Captioning Using DenseNet Features
AU - Hossain, Md Zakir
AU - Sohel, Ferdous
AU - Shiratuddin, Mohd Fairuz
AU - Laga, Hamid
AU - Bennamoun, Mohammed
N1 - Publisher Copyright:
© Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
AB - We present an attention-based image captioning method using DenseNet features. Conventional image captioning methods depend on visual information from the whole scene to generate image captions. Such a mechanism often fails to capture information about salient objects and cannot generate semantically correct captions. We consider an attention mechanism that can focus on relevant parts of the image to generate a fine-grained description of that image. We use image features from DenseNet. We conduct our experiments on the MSCOCO dataset. Our proposed method achieved 53.6, 39.8, and 29.5 on the BLEU-2, BLEU-3, and BLEU-4 metrics, respectively, which are superior to those of the state-of-the-art methods.
KW - Attention
KW - DenseNet
KW - Image captioning
UR - http://www.scopus.com/inward/record.url?scp=85078455156&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078455156&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-36802-9_13
DO - 10.1007/978-3-030-36802-9_13
M3 - Conference contribution
AN - SCOPUS:85078455156
SN - 9783030368012
T3 - Communications in Computer and Information Science
SP - 109
EP - 117
BT - Neural Information Processing - 26th International Conference, ICONIP 2019, Proceedings
A2 - Gedeon, Tom
A2 - Wong, Kok Wai
A2 - Lee, Minho
PB - Springer
T2 - 26th International Conference on Neural Information Processing, ICONIP 2019
Y2 - 12 December 2019 through 15 December 2019
ER -