Best Approximate of Vector Space Model by Using SVD

Raghad M. Hadi

doi:10.23851/mjs.v28i2.509

Authors

Raghad M. Hadi, Department of Computer Science, College of Science, Mustansiriyah University, Iraq.


Keywords:

High Dimensional Datasets, Dimensionality Reduction, SVD, Vector Space Model.

Abstract

The rapid growth of internet technology makes it easy to assemble huge volumes of data as text documents, e.g., journals, blogs, web pages, articles, and emails. In text mining applications, the growing text space of such datasets makes it hard to pre-process documents efficiently for tasks such as document clustering. The proposed system focuses on document pre-processing and on a document-space reduction technique that prepares documents for clustering. The common approach to text mining problems is the vector space model (VSM), in which each term represents a feature. The proposed system therefore builds a vector space model after a pre-processing step that removes trivial data from the dataset, and the huge dimensionality of the VSM is resolved by using a low-rank SVD. Experimental results show that the proposed system gives a document representation about 10% better than the previous approach when preparing documents for clustering.
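The pipeline described in the abstract (pre-process, build a VSM, reduce it with a low-rank SVD) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the four-document corpus, the `STOP_WORDS` list, the helper names `tokenize`, `build_vsm`, and `low_rank_svd`, the use of raw term counts rather than any particular weighting, the absence of stemming, and the choice of rank `k = 2`.

```python
import re
from collections import Counter

import numpy as np

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "in", "to", "for", "on"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_vsm(docs):
    """Return (vocabulary, term-document count matrix of shape terms x docs)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for term, count in Counter(toks).items():
            matrix[index[term], j] = count
    return vocab, matrix


def low_rank_svd(matrix, k):
    """Keep only the k largest singular values and their singular vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


# Hypothetical toy corpus: each document becomes one column of the VSM.
docs = [
    "Clustering of text documents",
    "Text mining and document clustering",
    "Singular value decomposition of a matrix",
    "Matrix decomposition for dimensionality reduction",
]

vocab, A = build_vsm(docs)
U_k, s_k, Vt_k = low_rank_svd(A, k=2)

A_k = U_k @ np.diag(s_k) @ Vt_k        # rank-k approximation of A
doc_vectors = (np.diag(s_k) @ Vt_k).T  # one k-dimensional row per document

print("terms x docs:", A.shape)
print("reduced document vectors:", doc_vectors.shape)
```

The rows of `doc_vectors` are the reduced document representations that a clustering algorithm could consume in place of the full term-count columns. By the Eckart-Young theorem, `A_k` is the closest rank-`k` matrix to `A` in the Frobenius norm, which is the sense in which a truncated SVD gives a best approximation of the VSM.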

