Text Summarizing and Clustering Using Data Mining Technique
DOI:
https://doi.org/10.23851/mjs.v34i1.1195
Keywords:
Information Systems, Texts Summary, Large Data, Learning Machine, K-, TF-IDF
Abstract
Text summarization is an important research topic in information technology because of the large volume of text and data found on the Internet and social media. The task has gained importance because extracting knowledge in various fields requires highly efficient methods; thus, methods are needed for summarizing both single documents and multiple documents. Summarization methods aim to capture the main content of a set of documents while reducing redundant information. In this paper, an efficient text summarization method is proposed that relies on a word association algorithm to separate and merge sentences after summarizing them. Data mining is also used: the information is redistributed according to the K-means algorithm, and Term Frequency-Inverse Document Frequency (TF-IDF) is used to measure the properties of the summarized texts. The experimental results show that good summarization ratios are achieved by deleting unimportant words. In addition, the feature extraction method proved useful for grouping similar texts into clusters, which makes it possible to combine this method with other artificial intelligence methods, such as fuzzy logic or evolutionary algorithms, to increase summarization rates and accelerate clustering operations.
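The clustering stage described in the abstract (TF-IDF features followed by K-means) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the word association step is not shown, and the whitespace tokenizer, the smoothed IDF variant, the fixed initial centroids, and the names `tfidf_vectors`, `kmeans`, and `init_indices` are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors): one L2-normalised TF-IDF vector per document."""
    # Assumption: naive lowercase whitespace tokenisation.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Assumption: smoothed IDF; the abstract does not state which variant is used.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vec = [counts[t] / total * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in vec)) or 1.0
        vectors.append([w / norm for w in vec])
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, max_iter=100):
    """Plain K-means; initial centroids are the vectors at init_indices.

    Assumption: fixed initial centroids keep the example deterministic;
    practical implementations usually choose them randomly.
    """
    k = len(init_indices)
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus, standing in for already summarized texts.
docs = [
    "text summarization reduces redundant text",
    "summarization of text documents",
    "clustering groups similar vectors",
    "kmeans clustering of vectors",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, init_indices=[0, 2])
print(labels)
```

On this toy corpus the two summarization sentences fall into one cluster and the two clustering sentences into the other, which is the kind of grouping of similar texts the abstract refers to.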
License
Copyright (c) 2023 Al-Mustansiriyah Journal of Science
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
(Starting May 5, 2024) Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution (CC-BY) 4.0 License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.