
Research Article

Stability Of Indexed Microarray And Text Data


Author(s): T. Velmurugan, S. Deepa Lakshmi
Affiliation: Associate Professor, PG and Research Department of Computer Science, D.G.Vaishnav College, Chennai, India
Year of Publication: 2017
Source: International Journal of Computing Algorithm
     

Citation: T. Velmurugan and S. Deepa Lakshmi. "Stability Of Indexed Microarray And Text Data." International Journal of Computing Algorithm 6.2 (2017): 64-68.

Abstract:
The common challenge for machine learning and data mining tasks is the curse of high dimensionality. Feature selection reduces the dimensionality by selecting the relevant and optimal features from a huge dataset. In this research work, a clustering- and genetic-algorithm-based feature selection method, CLUST-GA-FS, is proposed that has three stages, namely irrelevant feature removal, redundant feature removal, and optimal feature generation. The performance of the feature selection algorithms is analyzed using parameters such as classification accuracy, precision, recall, and error rate.
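The abstract only names the three stages of CLUST-GA-FS, so the following Python sketch is an illustrative reconstruction rather than the paper's implementation: relevance scoring for irrelevant-feature removal, feature clustering for redundant-feature removal, and a small genetic search for optimal feature generation. All function names, thresholds, and library choices (NumPy, scikit-learn) are assumptions.

# Illustrative three-stage pipeline matching the stages named in the abstract.
# The paper's actual CLUST-GA-FS procedure may differ in every detail.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def remove_irrelevant(X, y, keep_ratio=0.5):
    """Stage 1: drop the features least relevant to the class label."""
    relevance = mutual_info_classif(X, y, random_state=0)
    n_keep = max(1, int(keep_ratio * X.shape[1]))
    return np.sort(np.argsort(relevance)[::-1][:n_keep])

def remove_redundant(X, candidate_idx, n_clusters=20):
    """Stage 2: cluster correlated features, keep one representative per cluster."""
    n_clusters = min(n_clusters, len(candidate_idx))
    corr = np.nan_to_num(np.corrcoef(X[:, candidate_idx], rowvar=False))
    labels = AgglomerativeClustering(n_clusters=n_clusters, metric="precomputed",
                                     linkage="average").fit_predict(1.0 - np.abs(corr))
    reps = [candidate_idx[np.where(labels == c)[0][0]] for c in range(n_clusters)]
    return np.array(sorted(reps))

def genetic_search(X, y, candidate_idx, pop_size=20, generations=10, seed=0):
    """Stage 3: a small genetic algorithm over bit-masks of the remaining features."""
    rng = np.random.default_rng(seed)
    n = len(candidate_idx)
    pop = rng.integers(0, 2, size=(pop_size, n))

    def fitness(mask):
        if mask.sum() == 0:
            return 0.0
        cols = candidate_idx[mask.astype(bool)]
        return cross_val_score(LogisticRegression(max_iter=1000), X[:, cols], y, cv=3).mean()

    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]   # elitist selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]     # one-point crossover
            cut = rng.integers(1, n)
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n) < 0.05                            # bit-flip mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = np.vstack([parents, children])

    best = pop[np.argmax([fitness(ind) for ind in pop])]
    return candidate_idx[best.astype(bool)]

# Usage sketch (X: samples-by-features array, y: class labels):
#   idx = remove_irrelevant(X, y)
#   idx = remove_redundant(X, idx)
#   selected = genetic_search(X, y, idx)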


Keywords: Stability measurements, Jaccard Index, Kuncheva Index, Tanimoto Distance, Dice Coefficient.
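For reference, the stability measures named in the keywords compare feature subsets selected on different samples of the data. The snippet below gives their standard textbook definitions, not necessarily the exact variants used in the paper; the helper names and example subsets are illustrative only.

# Standard set-based definitions of the stability measures listed in the keywords,
# applied to two feature subsets a and b selected on different data samples.
def jaccard_index(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

def dice_coefficient(a, b):
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 1.0

def tanimoto_distance(a, b):
    # For sets, the Tanimoto similarity coincides with the Jaccard index.
    return 1.0 - jaccard_index(a, b)

def kuncheva_index(a, b, n_features):
    # Kuncheva's consistency index for two subsets of equal size k drawn from
    # n_features features: (r*n - k^2) / (k*(n - k)); undefined for k = 0 or k = n.
    a, b = set(a), set(b)
    k, r = len(a), len(a & b)
    assert len(b) == k, "the Kuncheva index assumes equal-sized subsets"
    return (r * n_features - k * k) / (k * (n_features - k))

# Example: two selections of 3 features out of 10 candidates.
print(jaccard_index({1, 4, 7}, {1, 4, 9}))       # 0.5
print(kuncheva_index({1, 4, 7}, {1, 4, 9}, 10))  # 11/21, about 0.524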


  • BibTeX
  • Reference
  • XML
  • JSON
  • Dublin Core
  • CSL

@article{Sta1792366,
  author    = {T. Velmurugan and S. Deepa Lakshmi},
  title     = {Stability Of Indexed Microarray And Text Data},
  journal   = {International Journal of Computing Algorithm},
  volume    = {6},
  issue     = {2},
  pages     = {64-68},
  issn      = {2278-2397},
  year      = {2017},
  publisher = {Scholarly Citation Index Analytics-SCIA}
}

  • [1] V. Kumar and S. Minz, "Feature Selection," SmartCR, vol. 4, no. 3, pp. 211-229, 2014.
  • [2] I. Guyon and A. Elisseeff, "An introduction to feature extraction," in Feature Extraction, pp. 1-25, 2006.
  • [3] G. John, R. Kohavi, and K. Pfleger, "Irrelevant features and the subset selection problem," in Machine Learning: Proceedings of the Eleventh International Conference, pp. 121-129, 1994.
  • [4] Q. Song, J. Ni, and G. Wang, "A Fast Clustering-Based Feature Subset Selection Algorithm for High Dimensional Data," IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 1, pp. 1-14, 2013.
  • [5] "Stability of Feature Selection Algorithms," Doctoral dissertation, Department of Computer Science, Binghamton University, State University of New York, 2010.
  • [6] K. Dunne, P. Cunningham, and F. Azuaje, "Solutions to Instability Problems with Sequential Wrapper-based Approaches to Feature Selection," Journal of Machine Learning, pp. 1-22, 2002.
  • [7] P. Somol and J. Novovicova, "Evaluating stability and comparing output of feature selectors that optimize feature subset cardinality," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 11, pp. 1921-1939, 2010.
  • [8] B. Schowe, "Feature selection for high-dimensional data with RapidMiner," in Proceedings of the 2nd RapidMiner Community Meeting and Conference (RCOMM 2011), Aachen, 2011.
  • [9] S. Niwattanakul, J. Singthongchai, E. Naenudorn, and S. Wanapu, "Using of Jaccard Coefficient for Keywords Similarity," in Proceedings of the International MultiConference of Engineers and Computer Scientists, vol. 1, no. 6, pp. 380-384, 2013.
  • [10] Y. Saeys, T. Abeel, and Y. Van de Peer, "Robust Feature Selection Using Ensemble Feature Selection Techniques," in Machine Learning and Knowledge Discovery in Databases, pp. 313-32, 2008.
  • [11] G. Roffo and S. Melzi, "Feature Selection via Eigenvector Centrality," in Proceedings of New Frontiers in Mining Complex Patterns (NFMCP 2016), Oct. 2016.
  • [12] I. Kamkar, S. Gupta, C. Li, D. Phung, and S. Venkatesh, "Stable clinical prediction using graph support vector machines," in 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 3332-3337, 2016.
  • [13] D. Dernoncourt, B. Hanczar, and J. D. Zucker, "Analysis of feature selection stability on high dimension and small sample data," Computational Statistics & Data Analysis, vol. 71, pp. 681-693, 2014.
  • [14] S. Nogueira and G. Brown, "Measuring the stability of feature selection," in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 442-457, Springer International Publishing, 2016.
  • [15] K. Gao, T. M. Khoshgoftaar, and A. Napolitano, "Impact of Data Sampling on Stability of Feature Selection for Software Measurement Data," in IEEE 23rd International Conference on Tools with Artificial Intelligence, pp. 1004-1011, 2011.
<?xml version='1.0' encoding='UTF-8'?>
<record>
  <language>eng</language>
  <journalTitle>International Journal of Computing Algorithm</journalTitle>
  <eissn>2278-2397</eissn>
  <publicationDate>2017</publicationDate>
  <volume>6</volume>
  <issue>2</issue>
  <startPage>64</startPage>
  <endPage>68</endPage>
  <documentType>article</documentType>
  <title language='eng'>Stability Of Indexed Microarray And Text Data</title>
  <authors>
    <author>
      <name>T.Velmurugan</name>
    </author>
  </authors>
  <abstract language='eng'>The common challenge for machine learning and data mining tasks is the curse of High Dimensionality. Feature selection reduces the dimensionality by selecting the relevant and optimal features from the huge dataset. In this research work, a clustering and genetic algorithm based feature selection CLUST-GA-FS is proposed that has three stages namely irrelevant feature removal, redundant feature removal, and optimal feature generation. The performance of the feature selection algorithms are analyzed using the parameters like classification accuracy, precision, recall and error rate.</abstract>
  <fullTextUrl format='pdf'>http://www.hindex.org/2017/p923.pdf</fullTextUrl>
  <keywords language='eng'>
    <keyword>Stability measurements, Jaccard Index, Kuncheva Index, Tanimoto Distance, Dice Coefficient.</keyword>
  </keywords>
</record>

    { "@context":"http://schema.org", "@type":"publication-article","identifier":"http://www.hindex.org/2017/article.php?page=923", "name":"Stability Of Indexed Microarray And Text Data", "author":[{"name":"T.Velmurugan "}], "datePublished":"2017", "description":"The common challenge for machine learning and data mining tasks is the curse of High Dimensionality. Feature selection reduces the dimensionality by selecting the relevant and optimal features from the huge dataset. In this research work, a clustering and genetic algorithm based feature selection CLUST-GA-FS is proposed that has three stages namely irrelevant feature removal, redundant feature removal, and optimal feature generation. The performance of the feature selection algorithms are analyzed using the parameters like classification accuracy, precision, recall and error rate.", "keywords":["Stability measurements, Jaccard Index, Kuncheva Index, Tanimoto Distance, Dice Coefficient."], "schemaVersion":"https://schema.org/version/3.3", "includedInDataCatalog":{ "@type":"DataCatalog", "name":"Scholarly Citation Index Analytics-SCIA", "url":"http://hindex.org"}, "publisher":{"@type":"Organization", "name":"Scientific Communications Research Academy" } }

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:contributor>S.Deepa Lakshmi</dc:contributor>
  <dc:creator>T.Velmurugan</dc:creator>
  <dc:date>2017</dc:date>
  <dc:description>The common challenge for machine learning and data mining tasks is the curse of High Dimensionality. Feature selection reduces the dimensionality by selecting the relevant and optimal features from the huge dataset. In this research work, a clustering and genetic algorithm based feature selection CLUST-GA-FS is proposed that has three stages namely irrelevant feature removal, redundant feature removal, and optimal feature generation. The performance of the feature selection algorithms are analyzed using the parameters like classification accuracy, precision, recall and error rate.</dc:description>
  <dc:identifier>2017SCIA316F0923</dc:identifier>
  <dc:language>eng</dc:language>
  <dc:title>Stability Of Indexed Microarray And Text Data</dc:title>
  <dc:type>publication-article</dc:type>
</oai_dc:dc>

    { "identifier": "2017SCIA316F0923", "abstract": "The common challenge for machine learning and data mining tasks is the curse of High Dimensionality. Feature selection reduces the dimensionality by selecting the relevant and optimal features from the huge dataset. In this research work, a clustering and genetic algorithm based feature selection CLUST-GA-FS is proposed that has three stages namely irrelevant feature removal, redundant feature removal, and optimal feature generation. The performance of the feature selection algorithms are analyzed using the parameters like classification accuracy, precision, recall and error rate.", "author": [ { "family": "T.Velmurugan,S.Deepa Lakshmi" } ], "id": "923", "issued": { "date-parts": [ [ 2017 ] ] }, "language": "eng", "publisher": "Scholarly Citation Index Analytics-SCIA", "title": " Stability Of Indexed Microarray And Text Data", "type": "publication-article", "version": "3" }