A Transfer Learning Approach for Arabic Image Captions
DOI: https://doi.org/10.23851/mjs.v35i3.1485
Keywords: CNN, Computer Vision, LSTM, GRU, NLP
Abstract
Background: Arabic image captioning (AIC) is the automatic generation of Arabic-language text descriptions for images. It applies a transfer learning approach in deep learning that combines computer vision and natural language processing. Many datasets exist for English, in contrast to other languages; Arab researchers broadly agree that publicly available Arabic databases in this field are scarce. Objective: This paper presents the improvement and processing of the available Arabic textual database, using Google spreadsheets for translation, and the creation of the AR.Flicker8k2023 dataset, an extension of the existing Arabic Flicker8k dataset; it has been uploaded to GitHub and made public for researchers. Methods: An efficient model is proposed using deep learning techniques, incorporating two pre-trained models (VGG16 and VGG19) to extract features from the images and LSTM and GRU models to process the textual prediction sequence, in addition to examining the effect of Arabic text pre-processing. Results: The adopted model outperforms the previous study, improving BLEU-1 from 33 to 40. Conclusions: This paper concludes that the biggest problem is the limited availability of Arabic-language databases. This work increased the size of the text database from 24,276 to 32,364 captions, with four captions per image.
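The following is a minimal sketch of the kind of encoder-decoder captioning pipeline described in the Methods: a pre-trained VGG16 encoder for image features feeding an LSTM (or GRU) decoder over the Arabic caption sequence. It assumes a Keras/TensorFlow setup; vocab_size and max_length are placeholder values, and the layer sizes are illustrative rather than the exact configuration used in the paper.

```python
# Illustrative encoder-decoder sketch (not the paper's exact configuration).
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add

# 1) Image encoder: VGG16 without its classification head -> 4096-d features.
base = VGG16(weights="imagenet")
encoder = Model(inputs=base.inputs, outputs=base.layers[-2].output)

def extract_features(img_batch):
    """img_batch: array of shape (n, 224, 224, 3), RGB images."""
    return encoder.predict(preprocess_input(img_batch.astype("float32")))

# 2) Caption decoder: image features + partial Arabic caption -> next word.
vocab_size = 8000   # placeholder: Arabic vocabulary size after pre-processing
max_length = 35     # placeholder: longest caption length in tokens

img_in = Input(shape=(4096,))
img_feat = Dense(256, activation="relu")(Dropout(0.5)(img_in))

seq_in = Input(shape=(max_length,))
seq_emb = Embedding(vocab_size, 256, mask_zero=True)(seq_in)
seq_feat = LSTM(256)(Dropout(0.5)(seq_emb))  # replace LSTM with GRU for the GRU variant

decoder = Dense(256, activation="relu")(add([img_feat, seq_feat]))
out = Dense(vocab_size, activation="softmax")(decoder)

caption_model = Model(inputs=[img_in, seq_in], outputs=out)
caption_model.compile(loss="categorical_crossentropy", optimizer="adam")
caption_model.summary()
```

At inference time, captions would be generated word by word: start from a begin-of-sequence token, repeatedly predict the next word given the image features and the caption so far, and stop at an end token or at max_length. Generated captions can then be scored against the reference captions with BLEU-1.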
Data Availability Statement
Data is available in the article.
License
Copyright (c) 2024 Haneen Siraj Ibrahim, Narjis Mezaal Shati, AbdulRahman A. Alsewari
This work is licensed under a Creative Commons Attribution 4.0 International License.
(Starting May 5, 2024) Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.