Text Summarizing and Clustering Using Data Mining Technique

Authors

  • Zainab Abdul-Wahid Salman, Department of Information & Knowledge Management, Mustansiriyah University, 10052 Baghdad, Iraq.

DOI:

https://doi.org/10.23851/mjs.v34i1.1195

Keywords:

Information Systems, Texts Summary, Large Data, Learning Machine, K-, TF-IDF

Abstract

Text summarization is an important research topic in information technology because of the large volume of text and data available on the Internet and social media. The task has gained great importance and calls for highly efficient ways of extracting knowledge in various fields; thus, methods are needed for summarizing both single documents and multiple documents. Summarization methods aim to capture the main content of a set of documents while reducing redundant information. In this paper, an efficient text summarization method is proposed that relies on a word association algorithm to separate and merge sentences after summarizing them. Data mining is also used to redistribute information with the K-Means algorithm, and Term Frequency-Inverse Document Frequency (TF-IDF) is used to measure the properties of the summarized texts. The experimental results show that good summarization ratios are obtained by deleting unimportant words. In addition, the feature extraction method proved useful for grouping similar texts into clusters, which makes it possible to combine this method with other artificial intelligence methods, such as fuzzy logic or evolutionary algorithms, to increase summarization rates and accelerate clustering operations.
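
As an illustration of the two techniques named in the abstract, the following is a minimal sketch of TF-IDF weighting followed by K-Means clustering, written with the Python standard library only. It is not the paper's implementation: the word association step is not reproduced, and the sample documents, the tokenizer, the smoothed IDF formula, the fixed initial centroids, and all function names are assumptions made for this example.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and keep alphabetic runs only (a simplifying assumption).
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for a list of texts."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(w for toks in tokenized for w in set(toks))
    # Smoothed IDF (one common variant, assumed here).
    idf = {w: math.log((1 + n_docs) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vec = [counts[w] / total * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, init_indices, max_iterations=20):
    """Plain K-Means; centroids start at the vectors listed in init_indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    k = len(centroids)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical sample texts: two about cats, two about stock markets.
docs = [
    "The cat sat on the mat with another cat.",
    "A cat and a kitten played on the mat.",
    "Stock markets fell as investors sold shares.",
    "Investors bought shares when stock markets rose.",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, init_indices=(0, 2))
print(labels)  # [0, 0, 1, 1]
```

Fixing the initial centroids keeps the example deterministic; a practical system would more likely use random or k-means++ initialization with several restarts.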

Key Dates

Published

30-03-2023

Section

Original Article

How to Cite

[1]
Z. A.-W. Salman, “Text Summarizing and Clustering Using Data Mining Technique”, Al-Mustansiriyah Journal of Science, vol. 34, no. 1, pp. 58–64, Mar. 2023, doi: 10.23851/mjs.v34i1.1195.
