DOI: 10.36922/mi.5077
PERSPECTIVE ARTICLE

Mx. BIOME: A bioinformatics and big data analysis platform in a data explosion era

Patrick C. Y. Woo1,2*, Yu-Hsi Lin3, Shao-Yu Huang3, Mei-Hui Chen1, Ming-Hon Hou4,5,6, Chieh-Chen Huang3,7,8*
1 Doctoral Program in Translational Medicine and Department of Life Sciences, National Chung Hsing University, Taichung 402, Taiwan
2 The iEGG and Animal Biotechnology Research Center, National Chung Hsing University, Taichung 402, Taiwan
3 Department of Life Sciences, National Chung Hsing University, Taichung 402, Taiwan
4 Institute of Genomics and Bioinformatics and Department of Life Sciences, National Chung Hsing University, Taichung 402, Taiwan
5 Doctoral Program in Medical Biotechnology, National Chung Hsing University, Taichung 402, Taiwan
6 Biotechnology Center, National Chung Hsing University, Taichung 402, Taiwan
7 Innovation and Development Center of Sustainable Agriculture, National Chung Hsing University, Taichung 402, Taiwan
8 Advanced Plant and Food Crop Biotechnology Center, National Chung Hsing University, Taichung 402, Taiwan
Submitted: 8 October 2024 | Revised: 4 December 2024 | Accepted: 2 January 2025 | Published: 3 February 2025
© 2025 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
Abstract

With the development of next-generation sequencing and other technologies, the number of DNA, RNA, and protein sequences and protein structures has increased exponentially over the last two decades, paralleled by the advancement of user-friendly bioinformatics tools. Furthermore, the enormous improvement in computational power and the accumulation of massive amounts of data have given rise to big data analysis, which can unveil novel patterns, associations, and trends. In view of this unprecedented biological data explosion, we have recently developed a platform known as Mx. BIOME, which provides a collection of some of the most popular and cutting-edge tools in the fields of bioinformatics and big data analysis. Such a collection would help end-users identify suitable tools for analyzing their specific datasets. Further studies and reviews of various in silico tools are necessary to compare their advantages and limitations.

Keywords
Bioinformatics
DNA
Sequencing
Big data
Analysis
SARS-CoV-2
Funding
This work was partly supported by the National Science and Technology Council (NSTC 112-2311-B-005-006-MY3) and the Feature Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE-113-S-0023-A) in Taiwan.
Conflict of interest
Patrick C. Y. Woo is an Editorial Board Member of this journal but was not involved, directly or indirectly, in the editorial and peer-review process conducted for this paper. The other authors declare that they have no known competing financial interests or personal relationships that could have influenced the work reported in this paper.
Microbes & Immunity, Electronic ISSN: 3029-2883 Print ISSN: 3041-0886, Published by AccScience Publishing