Evaluation of Bicluster Analysis Results in Capture Fisheries Using the BCBimax Algorithm

Cynthia Wulandari, I Made Sumertajaya, Muhammad Nur Aidi

Abstract


Biclustering is a simultaneous clustering technique that finds sub-matrices whose rows and columns share similar values. One biclustering algorithm that is relatively fast, and that can serve as a reference when comparing several algorithms, is the BCBimax algorithm. BCBimax works by finding sub-matrices that consist entirely of 1s in a binary matrix derived from the data. The choice of threshold in the binarization process and the row and column size limits are both essential to finding the optimal biclusters. Capture fisheries play an important role in supporting sustainable growth in Indonesia, so information on which fish species have similar potential across several provinces is needed to map that potential optimally. Applied to capture fisheries data, the BCBimax algorithm found 11 optimal biclusters. The median of each variable was used as the binarization threshold, and the optimal result was obtained with the minimum number of rows set to 2 and the maximum number of columns set to 2. The optimal average Mean Square Residue of the biclusters is 0.405403, and the similarity of the bicluster results (Liu and Wang index) differs for each bicluster combination produced. All of the resulting biclusters simultaneously grouped provinces and fish types that had the same potential.
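The two quantities named in the abstract, median-based binarization and the mean squared residue, can be illustrated with a minimal Python sketch. It is not the authors' code and does not implement the Bimax search itself. The toy matrix, the function names, and the choice to code a value as 1 when it is strictly above its column median are all assumptions made for illustration; the residue follows the standard Cheng and Church definition.

```python
from statistics import median


def binarize_by_column_median(data):
    """Return a 0/1 matrix: 1 where a value is above its column's median.

    Assumption: "above" means strictly greater than the median; the
    abstract does not state how ties are handled.
    """
    n_cols = len(data[0])
    thresholds = [median(row[j] for row in data) for j in range(n_cols)]
    return [[1 if row[j] > thresholds[j] else 0 for j in range(n_cols)]
            for row in data]


def mean_squared_residue(data, rows, cols):
    """Cheng and Church mean squared residue of the sub-matrix rows x cols."""
    sub = [[data[i][j] for j in cols] for i in rows]
    n, m = len(rows), len(cols)
    row_means = [sum(r) / m for r in sub]
    col_means = [sum(sub[i][j] for i in range(n)) / n for j in range(m)]
    overall = sum(row_means) / n
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n)
        for j in range(m)
    )
    return total / (n * m)


# Hypothetical toy data: 4 provinces (rows) x 3 fish types (columns).
data = [
    [10, 2, 7],
    [12, 3, 9],
    [1, 8, 2],
    [2, 9, 1],
]

binary = binarize_by_column_median(data)

# Rows 0-1 and columns 0 and 2 form an all-ones block in `binary`,
# which is the kind of sub-matrix a Bimax-style search looks for.
msr = mean_squared_residue(data, rows=[0, 1], cols=[0, 2])
```

In this toy example the selected block follows a purely additive pattern, so its residue is zero; a lower mean squared residue indicates a more coherent bicluster.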

Keywords


Biclustering; BCBimax Algorithm; Mean Square Residue; ASR; Liu and Wang index


DOI: 10.30595/juita.v11i1.15457


This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 2579-8901