Computing is essential to DNA sequencing. It is also the cornerstone of genetics-based discoveries ranging from personalized medicine to novel cancer drugs. As impressive as these accomplishments are, they are only a taste of seemingly endless possibilities. A specialized computing discipline called bioinformatics has emerged to push human knowledge toward innovations that were previously beyond the reach of traditional research tools and processes.
While the definition of bioinformatics is still evolving, there is general consensus around the fundamentals.
For a good overview of what bioinformatics is from a first-hand user perspective, check out this video.
Now for a higher-level view of what bioinformatics is and what it entails.
The building blocks of bioinformatics
At its essence, bioinformatics resembles the computing setup familiar to big data practitioners. There is typically a huge database for storage, a range of software to gather and organize data sets, and an array of advanced analytics, often including machine learning, to sort, label, mine, and analyze the data faster and more efficiently. Algorithms and models are maintained, revised, and accumulated along the way as researchers and data scientists learn and collaborate on refining the processes.
The main difference between traditional big data computing and bioinformatics is the nature of the underlying data: bioinformatics works with biological data. According to a proposed definition published by the US National Library of Medicine, National Institutes of Health, that data most commonly comes from molecular biology, i.e. macromolecular structures, genome sequences, and the results of genomics experiments, e.g. expression data or epigenetics.
But other types of data can also be ingested to give complex algorithms additional meaningful inputs. In other words, adding more information to the equation makes the computation more complete and more nuanced, and places the results in better context.
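As a rough illustration of what "genome sequences" look like as data, the Python sketch below treats a DNA sequence as a plain string and computes two simple properties. The function names `gc_content` and `reverse_complement` are illustrative choices, not part of any library mentioned in this article, and the sketch assumes the input contains only the letters A, C, G, and T.

```python
# Minimal sketch: a DNA sequence handled as an ordinary string.
# Assumption: input uses only the unambiguous bases A, C, G, T.
from collections import Counter

# Translation table mapping each base to its complementary base.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    counts = Counter(seq)
    return (counts["G"] + counts["C"]) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read in reverse."""
    return seq.upper().translate(COMPLEMENT)[::-1]
```

Calling `gc_content("ATGC")` or `reverse_complement("ATGC")` on a short string is enough to try it out. Real sequencing data adds quality scores, ambiguous bases, and very large file sizes, which is where the storage and analytics layers described above come in.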
“Additional information includes the text of scientific papers and ‘relationship data’ from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering,” write researchers from the Department of Molecular Biophysics and Biochemistry at Yale University, New Haven, USA, in a publication available through the National Center for Biotechnology Information (NCBI), U.S. National Library of Medicine.
“The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses.”
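To make one of the techniques named in the quote more concrete, here is a minimal Python sketch of sequence alignment scoring using the classic Needleman-Wunsch dynamic programming approach for global alignment. It is an illustration rather than the method used by the quoted researchers. The function name `global_alignment_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for simplicity; production tools use more elaborate scoring schemes.

```python
# Minimal sketch of Needleman-Wunsch global alignment scoring.
# Assumption: simple scoring of +1 match, -1 mismatch, -1 per gap position.

def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the best global alignment score between sequences a and b."""
    # prev[j] holds the best score for aligning the prefix of a processed
    # so far against the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Aligning i characters of a against nothing costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Option 1: align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Option 2: align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Option 3: align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

A higher score means the two sequences are more similar under the chosen scoring scheme. Searching for homologues, one of the applications mentioned in the quote, builds on comparisons of this kind applied at database scale.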
Bioinformatics and quantum computing
Big data is not defined by a precise size measurement but by a relative one: it is, quite simply, data too big for current computing systems to process. Biological data, like every other category of data, is growing at an unprecedented rate. Because of this, scientists are using supercomputers and high-performance computing (HPC) to answer some of the biggest and most complex scientific questions of the day, including questions in bioinformatics. While computing muscle rapidly gets stronger in an attempt to keep pace, there is a limit to how much work traditional computing can ultimately perform.
Even supercomputers are not powerful enough to solve many of the most complex and pressing scientific problems. To tackle the biggest of the big problems, a new computing model is needed: quantum computing.
The following short video offers a good explanation of how quantum computing works and why it is important to bioinformatics and healthcare.
While sequencing DNA was in itself a major breakthrough, it is now the foundation for many more innovations, each of which will come about through bioinformatics and major advances in computing. We are just beginning to glimpse the possibilities of bioinformatics and advanced computing. Our understanding of our world, our bodies, and the human condition will vastly improve alongside advances in our tools. This is goodbye to the Information Age and a hearty hello to the Knowledge Era.