BCBimax Biclustering Algorithm with Mixed-Type Data

Hanifa Izzati, Indahwati Indahwati, Anik Djuraidah

Abstract


The application of biclustering analysis to mixed-type data is still relatively new. Biclustering was initially used mainly on gene expression data, which are measured on an interval scale. In this research, ordinal categorical variables are transformed to an interval scale using the Method of Successive Interval (MSI). The BCBimax algorithm is then applied with several binarization experiments, and the binarization that produces the smallest Mean Square Residual (MSR) at predetermined row and column thresholds is selected. Next, a range of row and column thresholds is tested to find the optimal bicluster thresholds. The variables that describe international market potential carry different interests, and Indonesia has many export destination countries, so the destination countries need to be mapped according to their international trade potential. With the median of all data as the binarization threshold, the optimal MSR is found at a row threshold of 7 and a column threshold of 2. Nine biclusters are formed, which together cover 74.7% of the countries. Most of the countries in the biclusters are in Europe, and a few are in Africa.
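As an illustration of two steps named in the abstract, the sketch below binarizes a matrix at the median of all its entries and computes the MSR of a candidate bicluster using the Cheng and Church residue definition. It is a minimal sketch rather than the authors' implementation: the function names, the toy matrix, and the choice of a strict `>` comparison at the median are assumptions, and the Bimax search itself is not shown.

```python
import numpy as np


def binarize_at_median(data):
    """Return 1 where an entry exceeds the median of all entries, else 0.

    Assumption: a strict '>' comparison; the abstract does not say how
    values equal to the median are treated.
    """
    return (data > np.median(data)).astype(int)


def mean_square_residual(data, rows, cols):
    """Mean Square Residual of the submatrix given by rows and cols.

    Each residue is a_ij - (row mean) - (column mean) + (overall mean),
    and the MSR is the mean of the squared residues.
    """
    sub = data[np.ix_(rows, cols)]
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    overall_mean = sub.mean()
    residue = sub - row_means - col_means + overall_mean
    return float((residue ** 2).mean())


# Hypothetical toy data: the first two rows differ by a constant shift.
toy = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 1.0, 0.0],
])

binary = binarize_at_median(toy)
msr_additive = mean_square_residual(toy, [0, 1], [0, 1, 2])
msr_all = mean_square_residual(toy, [0, 1, 2], [0, 1, 2])
```

A lower MSR indicates a more coherent bicluster. In the toy matrix the first two rows follow the same pattern up to a constant shift, so their MSR is zero, while including the third row raises it.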

Keywords


BCBimax, Biclustering, MSR, market potential



DOI: 10.30595/juita.v12i1.21519



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 2579-8901