The technologies used today to analyse genes and proteins (e.g. microarrays and gene sequencing) generate enormous quantities of data
The data collected ranges from genome sequences and the timing of gene expression during an organism’s life to the structure (amino acid sequence) and functions of proteins
To analyse all of this data, scientists use bioinformatics
Bioinformatics is an interdisciplinary science (incorporating biology with computer technology and statistics) where biological data is collected, organised, manipulated, analysed and stored
Large databases are created containing information ranging from gene sequences to the amino acid sequences of proteins. The databases are available online and provide tools for analysing the selected data. As this data needs to be accessed and searched, software developers play an important role. Some of the databases and search tools that exist are:
The European Molecular Biology Laboratory – Nucleotide sequence database
ArrayExpress – a microarray database with the level and types of mRNA expressed in different cells
Protein Data Bank in Europe (PDBe) – protein structure data that can be searched by sequence
BLAST (Basic Local Alignment Search Tool) – used by researchers to find similarities between the sequences they are studying and those already in a database
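The idea behind a local alignment search can be illustrated with a minimal sketch. The code below is not how BLAST itself works (BLAST uses a faster heuristic); it uses the exact Smith-Waterman scoring method to find the best-matching region between a query and each sequence in a tiny database. The sequences, the names `seq_A` to `seq_C` and the scoring values (match, mismatch, gap) are all made up for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score between a and b."""
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical toy "database" of short DNA sequences (not real genes)
database = {
    "seq_A": "ACGTTGCA",
    "seq_B": "TTTTGGGG",
    "seq_C": "GGACGATG",
}
query = "ACGTT"

# Score the query against every entry and rank the entries, best match first
scores = {name: local_alignment_score(query, seq) for name, seq in database.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(name, scores[name])
```

A higher score means a longer or more exact shared region, so the top-ranked entry is the database sequence most similar to the query.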
Once a genome is sequenced, bioinformatics allows scientists to compare it with the genomes of other organisms using the many databases available. The degree of similarity found indicates how closely related the organisms are and whether an organism could be used in experiments as a model for humans (e.g. the fruit fly Drosophila)
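One simple way to express the degree of similarity is percent identity: the percentage of aligned positions at which two sequences carry the same base. The sketch below assumes the sequences have already been aligned to the same length, with `-` marking gaps. The sequences and species names are invented for illustration and are not real data.

```python
def percent_identity(seq1, seq2):
    """Percentage of identical positions between two pre-aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must be the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq1, seq2):
        if x == "-" or y == "-":
            continue  # skip gap positions
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments of the same gene from three organisms
human = "ATGGCCATTG"
species_x = "ATGGCCATTA"
species_y = "ATGACGATAA"

identity_x = percent_identity(human, species_x)
identity_y = percent_identity(human, species_y)

print("species_x:", identity_x)
print("species_y:", identity_y)
```

In this toy case the fragment from `species_x` is more similar to the human fragment than the one from `species_y`, which would suggest a closer relationship. Real comparisons use whole genes or genomes and more sophisticated statistical models.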
The nematode Caenorhabditis elegans is an animal that has been used as a model organism for studying the genetics of organ development, neurone development and cell death. It was the first multicellular organism to have its genome fully sequenced, and because it has few cells (fewer than 1000) and is transparent, it has been a useful model
One application of bioinformatics is using databases containing the genome of Plasmodium to determine which genes and/or proteins could be altered or targeted to control the parasite (e.g. in developing a vaccine for malaria)