Bioinformatics or computational biology is the use of techniques from applied mathematics, informatics, statistics, and computer science to solve biological problems. Research in computational biology often overlaps with systems biology. Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling of evolution. The terms bioinformatics and computational biology are often used interchangeably, although the latter typically focuses on algorithm development and specific computational methods. (In the biology-mathematics-computer science triangle, bioinformatics would intimately involve all three components, while computational biology would focus on biology and mathematics.) A common thread in projects in bioinformatics and computational biology is the use of mathematical tools to extract useful information from noisy data produced by high-throughput biological techniques. (The field of data mining overlaps with computational biology in this regard.) Representative problems in computational biology include the assembly of high-quality DNA sequences from fragmentary "shotgun" DNA sequencing, and the prediction of gene regulation with data from mRNA microarrays or mass spectrometry.

Making sense of the huge amounts of DNA data produced by gene sequencing projects is just one of the tasks faced by bioinformatics.

Major research areas

Sequence analysis
Main articles: Sequence alignment, Sequence database

Since the phage Φ-X174 was sequenced in 1977, the DNA sequences of more and more organisms have been decoded and stored in electronic databases. These data are analyzed to determine genes that code for proteins, as well as regulatory sequences. A comparison of genes within a species or between different species can show similarities between protein functions, or relations between species (the use of molecular systematics to construct phylogenetic trees). With the growing amount of data, it long ago became impractical to analyze DNA sequences manually. Today, computer programs are used to search the genomes of thousands of organisms, containing billions of nucleotides. These programs can compensate for mutations (exchanged, deleted or inserted bases) in the DNA sequence, in order to identify sequences that are related but not identical. A variant of this sequence alignment is used in the sequencing process itself. The so-called shotgun sequencing technique (which was used, for example, by The Institute for Genomic Research to sequence the first bacterial genome, Haemophilus influenzae) does not give a sequential list of nucleotides, but instead the sequences of hundreds to thousands of small DNA fragments (each about 600-800 nucleotides long). The ends of these fragments overlap and, when aligned in the right way, make up the complete genome. Shotgun sequencing yields sequence data quickly, but the task of assembling the fragments can be quite complicated for larger genomes. In the case of the Human Genome Project, it took several months of CPU time (on a circa-2000 vintage DEC Alpha computer) to assemble the fragments. Shotgun sequencing is the method of choice for most genomes sequenced today, and genome assembly algorithms are a critical area of bioinformatics research.
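The idea that overlapping fragment ends can be stitched into a longer sequence can be illustrated with a small sketch. The Python below is a toy greedy assembler written for this article, not the algorithm used by any real assembly program; the function names, the minimum overlap length and the reads are all hypothetical, and it assumes error-free reads from a single strand.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge the best-overlapping pair of contigs until no overlaps remain."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing overlaps: the remaining contigs stay separate
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical reads, far shorter than the 600-800 nucleotides of real fragments.
reads = ["ATGGCGTAC", "GTACGTTAG", "TTAGCCA"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGCCA']
```

The pairwise comparison in every round is what makes the approach expensive: the work grows quickly with the number of fragments, which hints at why assembling large genomes takes so much computing time. Real data also contain sequencing errors and repeated regions, which a greedy exact-match strategy like this one handles poorly.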

Another aspect of bioinformatics in sequence analysis is the automatic search for genes and regulatory sequences within a genome. Not all of the nucleotides within a genome are genes. Within the genomes of higher organisms, large parts of the DNA do not serve any obvious purpose. This so-called junk DNA may, however, contain unrecognized functional elements. Bioinformatics helps to bridge the gap between genome and proteome projects, for example in the use of DNA sequence for protein identification.

See also: sequence analysis, sequence profiling tool, sequence motif.

Genome annotation
Main article: Gene finding

In the context of genomics, annotation is the process of marking the genes and other biological features in a DNA sequence. The first genome annotation software system was designed in 1995 by Owen White, who was part of the team that sequenced and analyzed the first genome of a free-living organism to be decoded, the bacterium Haemophilus influenzae. Dr. White built a software system to find the genes (places in the DNA sequence that encode a protein), the transfer RNA, and other features, and to make initial assignments of function to those genes. Most current genome annotation systems work similarly, but the programs available for analysis of genomic DNA are constantly changing and improving. The Ensembl system contains a genome annotation pipeline for the human genome (as well as others), originally developed by Ewan Birney while at the Wellcome Trust Sanger Institute near Cambridge, England.
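One of the simplest ways to mark candidate protein-coding regions is to look for open reading frames (ORFs): stretches that begin with a start codon and run, codon by codon, to a stop codon. The following Python sketch is only an illustration of that idea and is not the method used by the 1995 system or by Ensembl; it scans the forward strand only, assumes the standard start codon ATG and stop codons TAA, TAG and TGA, and uses a hypothetical `min_codons` cutoff.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each ORF on the forward strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts
    the codons before the stop codon, including the initial ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                       # three forward reading frames
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos                      # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, dna[start:pos + 3]))
                start = None                     # close the ORF, keep scanning
    return orfs


print(find_orfs("CCATGAAATTTGGGTAGCC"))  # [(2, 17, 'ATGAAATTTGGGTAG')]
```

A practical gene finder would also scan the reverse complement and, for organisms with introns, would need a statistical model rather than a plain codon scan.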

Computational evolutionary biology

Evolutionary biology is the study of the origin and descent of species, as well as their change over time. Recent developments in genome sequencing and the ubiquity of fast computers enable researchers to trace the evolution of species by tracing changes in their DNA. Computational evolutionary biology (CEB) research from the pre-genome era involved building computational models of populations and watching their behavior over time.
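A population model of the kind described above can be very small. The sketch below simulates random genetic drift under the Wright-Fisher model, which is an assumption chosen here for illustration and is not named in the text; the function name and parameters are hypothetical. It tracks the frequency of one allele in a diploid population of constant size with no selection or mutation.

```python
import random


def wright_fisher(pop_size, freq, generations, seed=None):
    """Track an allele frequency over time under pure random drift."""
    rng = random.Random(seed)
    history = [freq]
    for _ in range(generations):
        # Each of the 2N gene copies in the next generation is drawn
        # independently from the current generation's allele pool.
        copies = sum(1 for _ in range(2 * pop_size) if rng.random() < freq)
        freq = copies / (2 * pop_size)
        history.append(freq)
    return history


trajectory = wright_fisher(pop_size=50, freq=0.5, generations=100, seed=1)
print(trajectory[:5])
```

Running the simulation with different seeds shows the characteristic behavior of drift: the frequency wanders and, given enough generations, the allele is either lost (frequency 0) or fixed (frequency 1), after which it no longer changes.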

The field of genetic algorithms could be described as a rough opposite of CEB: rather than investigating evolution through computer programs, it aims to improve programs through evolutionary principles.
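As an illustration of what "evolutionary principles" means in this setting, the following minimal genetic algorithm evolves bit strings toward a higher fitness value using selection, crossover and mutation. It is a sketch under simple assumptions (a fixed population size, single-point crossover, and a toy fitness function that counts the 1 bits); all names and parameter values are hypothetical.

```python
import random


def evolve(fitness, length=20, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Return the fittest bit string found after the given number of generations."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]        # selection: keep the fitter half
        children = [pop[0][:]]                # elitism: the best survives unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)    # single-point crossover
            child = mum[:cut] + dad[cut:]
            # mutation: flip each bit with a small probability
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)


best = evolve(sum)  # toy fitness: the number of 1 bits in the string
print(sum(best), best)
```

Because the best individual is always copied into the next generation, the best fitness in the population can never decrease from one generation to the next.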

Gene expression analysis

The expression of many genes can be determined by measuring mRNA levels with multiple techniques including microarrays, expressed cDNA sequence tag (EST) sequencing, serial analysis of gene expression (SAGE) tag sequencing, and massively parallel signature sequencing (MPSS), or by measuring protein concentrations with high-throughput mass spectrometry. All of these techniques are highly noise-prone and/or subject to bias in the biological measurement, and a major research area in computational biology involves developing statistical tools to separate signal from noise in high-throughput (HT) gene expression studies. HT studies are often used to determine the genes implicated in a disorder: one might compare microarray data from cancerous epithelial cells to data from non-cancerous cells to determine which genes are up-regulated and down-regulated in cancer.
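The cancerous versus non-cancerous comparison can be sketched as a log2 fold change per gene. The example below is a deliberately naive illustration: the gene names and intensity values are invented, the pseudocount and threshold are arbitrary assumptions, and a real analysis would add normalization and a statistical test to deal with the noise discussed above.

```python
import math
from statistics import mean


def log2_fold_changes(tumour, normal, pseudocount=1.0):
    """Log2 ratio of mean expression (tumour / normal) for genes present in both."""
    result = {}
    for gene in tumour.keys() & normal.keys():
        t = mean(tumour[gene]) + pseudocount   # pseudocount avoids division by zero
        n = mean(normal[gene]) + pseudocount
        result[gene] = math.log2(t / n)
    return result


def classify(fold_changes, threshold=1.0):
    """Label each gene as 'up', 'down' or 'unchanged' using a log2 threshold."""
    labels = {}
    for gene, lfc in fold_changes.items():
        if lfc >= threshold:
            labels[gene] = "up"
        elif lfc <= -threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical replicate intensities for three made-up genes.
tumour = {"geneA": [7, 7], "geneB": [1, 1], "geneC": [3, 3]}
normal = {"geneA": [1, 1], "geneB": [7, 7], "geneC": [3, 3]}
print(classify(log2_fold_changes(tumour, normal)))
```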

Expression data are also used to infer gene regulation: one might compare microarray data from a wide variety of states of an organism to form hypotheses about the genes involved in each state. In a single-cell organism, one might compare stages of the cell cycle, along with various stress conditions (heat shock, starvation, etc.). One can then apply clustering algorithms to that expression data to determine which genes are co-expressed. Further analysis could take a variety of directions: one 2004 study analyzed the promoter sequences of co-expressed (clustered together) genes to find common regulatory elements and used machine learning techniques to identify the promoter elements involved in regulating each cluster.
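A simple way to see what "clustering co-expressed genes" involves is to group genes whose expression profiles are strongly correlated across conditions. The sketch below uses Pearson correlation with a fixed cutoff and merges any two genes that exceed it (a single-linkage rule); this is one of many possible clustering schemes and is not the method of the 2004 study. The gene names, profiles and threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_clusters(profiles, threshold=0.9):
    """Group genes linked by a correlation of at least `threshold`."""
    genes = list(profiles)
    cluster_of = {g: {g} for g in genes}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            linked = pearson(profiles[a], profiles[b]) >= threshold
            if linked and cluster_of[a] is not cluster_of[b]:
                merged = cluster_of[a] | cluster_of[b]
                for g in merged:              # point every member at the merged set
                    cluster_of[g] = merged
    unique = {id(c): c for c in cluster_of.values()}
    return sorted(sorted(c) for c in unique.values())


# Hypothetical expression levels across four conditions.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 4, 2],
}
print(coexpression_clusters(profiles))  # [['g1', 'g2'], ['g3', 'g4']]
```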

Protein expression analysis
Protein microarrays and high-throughput (HT) mass spectrometry (MS) can provide a snapshot of the proteins present in a biological sample. Bioinformatics is heavily involved in making sense of protein microarray and HT MS data; the former involves many of the same problems encountered in examining microarrays targeted at mRNA, while the latter involves the bioinformatics problem of matching MS data against protein sequence databases.

Analysis of mutations in cancer
Massive sequencing efforts are currently underway to identify point mutations in a variety of genes in cancer. The sheer volume of data produced requires automated systems to read sequence data, and to compare the sequencing results to the known sequence of the human genome, including known germline polymorphisms.
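The comparison step can be sketched as follows, under strong simplifying assumptions: the sample sequence is already aligned to the reference and has the same length, only single-base substitutions are considered, and known germline polymorphisms are supplied as a set of (position, base) pairs. The function name and the example sequences are hypothetical.

```python
def call_point_mutations(reference, sample, known_polymorphisms=frozenset()):
    """List (position, reference_base, sample_base) for novel substitutions.

    Differences that match a known polymorphism are skipped, since they
    are expected germline variation rather than candidate mutations.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned and of equal length")
    calls = []
    for pos, (ref, alt) in enumerate(zip(reference, sample)):
        if ref != alt and (pos, alt) not in known_polymorphisms:
            calls.append((pos, ref, alt))
    return calls


reference = "ACGTACGT"
tumour = "ACCTACAT"
known = {(6, "A")}  # hypothetical germline polymorphism at position 6
print(call_point_mutations(reference, tumour, known))  # [(2, 'G', 'C')]
```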

Oligonucleotide microarrays, including comparative genomic hybridization and single nucleotide polymorphism arrays, are able to probe several hundred thousand sites throughout the genome simultaneously and are being used to identify chromosomal gains and losses in cancer. Hidden Markov model and change-point analysis methods are being developed to infer real copy number changes from often noisy data. Further informatics approaches are being developed to understand the implications of lesions found to be recurrent across many tumors.
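The simplest form of change-point analysis looks for the single position along a chromosome at which splitting the probe signals into two segments, each summarized by its mean, gives the smallest squared error. The sketch below shows only that basic idea, not any published copy-number method; the log-ratio values are invented, and real methods must handle multiple change points and decide whether a change is statistically significant.

```python
def best_change_point(values):
    """Return (index, left_mean, right_mean) for the best two-segment split.

    `index` is the position of the first value in the right-hand segment.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    best = None
    for k in range(1, n):
        left, right = values[:k], values[k:]
        ml, mr = sum(left) / k, sum(right) / (n - k)
        sse = (sum((v - ml) ** 2 for v in left)
               + sum((v - mr) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, k, ml, mr)
    return best[1:]


# Hypothetical log2 ratios along a chromosome: normal copy number, then a gain.
signal = [0.1, -0.1, 0.0, 0.1, -0.1, 1.0, 0.9, 1.1, 1.0]
print(best_change_point(signal))
```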

Structure prediction

Main article: Protein structure prediction

Protein structure prediction is another important application of bioinformatics. The amino acid sequence of a protein, the so-called primary structure, can be easily determined from the sequence of the gene that codes for it. In the vast majority of cases, this primary structure uniquely determines a structure in its native environment. (Of course, there are exceptions, such as the prion responsible for bovine spongiform encephalopathy, also known as mad cow disease.) Knowledge of this structure is vital in understanding the function of the protein. For lack of better terms, structural information is usually classified as one of secondary, tertiary and quaternary structure. A viable general solution to such predictions remains an open problem. As of now, most efforts have been directed towards heuristics that work most of the time.

One of the key ideas in bioinformatics research is the notion of homology. In the genomic branch of bioinformatics, homology is used to predict the function of a gene: if the sequence of gene A, whose function is known, is homologous to the sequence of gene B, whose function is unknown, one could infer that B may share A's function. In the structural branch of bioinformatics, homology is used to determine which parts of a protein are important in structure formation and interaction with other proteins. In a technique called homology modelling, this information is used to predict the structure of a protein once the structure of a homologous protein is known. This currently remains the only way to predict protein structures reliably.
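In practice, homology between two genes is usually inferred from sequence similarity. The sketch below scores similarity with a basic global alignment (the Needleman-Wunsch dynamic programming scheme) and transfers the annotation of the best-scoring known gene to the query. The scoring values, the score cutoff, the gene names and the sequences are hypothetical, and a high score is only evidence for homology, not proof of shared function.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score, keeping one row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def infer_function(query, annotated, min_score=0):
    """Return (name, function, score) of the most similar annotated gene.

    Returns None when no annotated gene reaches `min_score`.
    """
    best = None
    for name, (sequence, function) in annotated.items():
        score = global_alignment_score(query, sequence)
        if score >= min_score and (best is None or score > best[2]):
            best = (name, function, score)
    return best


# Hypothetical annotated genes and an unannotated query.
annotated = {
    "geneA": ("ATGGCCATTG", "oxygen transport"),
    "geneC": ("TTTTTTTTTT", "structural repeat"),
}
print(infer_function("ATGGCCATTC", annotated))
# ('geneA', 'oxygen transport', 8)
```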

One example of this is the homology between hemoglobin in humans and the hemoglobin in legumes (leghemoglobin). Both serve the same purpose of transporting oxygen in their respective organisms. Though these two proteins have completely different amino acid sequences, their protein structures are virtually identical, which reflects their nearly identical purposes.

Other techniques for predicting protein structure include protein threading and de novo (from scratch) physics-based modeling.

See also: structural motif and structural domain.

Preserving biodiversity
Bioinformatics is often applied to preserving biodiversity. The first information collected is the species names, descriptions, distributions, status and size of populations, habitat needs, and how each organism interacts with other species. This information is compiled into computer databases, accessed with software programs that find, visualize, and analyze the information automatically, and, most importantly, communicated to other people, especially over the internet. DNA sequences of endangered species can be preserved, and names and descriptions of specimens living in captivity are stored in order to allow as much access as possible to the information needed to preserve biodiversity.

An example of this application is the Species 2000 project. It is an internet-based global research project which aims to provide information about every known species of plant, animal, fungus, and microbe in existence as a foundation for studies of global biodiversity. Anyone in the world is free to find basic information about any known species from an array of participating databases.

Modeling biological systems
Main article: Systems biology

Systems biology involves the use of computer simulations of cellular subsystems (such as the networks of metabolites and enzymes which comprise metabolism, signal transduction pathways and gene regulatory networks) to both analyze and visualize the complex connections of these cellular processes. Artificial life or virtual evolution attempts to understand evolutionary processes via the computer simulation of simple (artificial) life forms.
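A minimal example of such a simulation is a two-variable model of gene expression, in which mRNA is produced at a constant rate and degraded, and protein is produced from mRNA and degraded. The model, its rate constants and the function name below are assumptions chosen purely for illustration; the equations are integrated with the explicit Euler method, which is adequate for a sketch but not for stiff or large systems.

```python
def simulate_expression(k_tx=2.0, d_m=0.2, k_tl=1.0, d_p=0.1,
                        dt=0.01, t_end=200.0):
    """Integrate dm/dt = k_tx - d_m*m and dp/dt = k_tl*m - d_p*p.

    Returns a list of (time, mRNA, protein) tuples, starting from zero.
    """
    m, p = 0.0, 0.0
    trajectory = [(0.0, m, p)]
    steps = int(round(t_end / dt))
    for step in range(1, steps + 1):
        dm = k_tx - d_m * m          # transcription minus mRNA decay
        dp = k_tl * m - d_p * p      # translation minus protein decay
        m += dm * dt
        p += dp * dt
        trajectory.append((step * dt, m, p))
    return trajectory


t, m, p = simulate_expression()[-1]
print(round(m, 3), round(p, 3))  # approaches m = k_tx/d_m and p = k_tl*m/d_p
```

With the default values the system settles at the analytical steady state of 10 mRNA and 100 protein units, which gives a quick check that the integration behaves as expected.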

Other applications

Morphometrics is used to analyze pictures of embryos to track and predict the fate of cell clusters during morphogenesis.

Software tools

The computational biology tool best known among biologists is probably BLAST, an algorithm for searching large sequence (protein, DNA) databases. NCBI provides a popular implementation that searches their massive sequence databases.
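Database search tools of this kind avoid aligning the query against every sequence in full by first looking for short exact word matches ("seeds"). The sketch below illustrates only that seeding idea with a k-mer index; it is not BLAST and omits the extension and statistical scoring steps that a real search tool performs. The database contents, word length and function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_kmer_index(database, k=4):
    """Map every k-letter word to the names of the sequences containing it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def seed_hits(query, index, k=4):
    """Count, per database sequence, how many query words it shares."""
    hits = Counter()
    for i in range(len(query) - k + 1):
        for name in index.get(query[i:i + k], ()):
            hits[name] += 1
    return hits.most_common()   # best candidates first


database = {"seq1": "ACGTACGTGA", "seq2": "TTTTGGGGCC"}
index = build_kmer_index(database)
print(seed_hits("CGTACG", index))  # [('seq1', 3)]
```

Only the sequences returned by the seed search would then need a full, more expensive alignment.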

Computer scripting languages such as Perl and Python are often used to interface with biological databases and parse output from bioinformatics programs.
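As a small example of this kind of scripting, the Python function below parses sequence records in FASTA format, a common plain-text format in which each record starts with a ">" header line followed by one or more lines of sequence. The format is an assumption made for this illustration, and the function name and sample records are hypothetical; maintained parsers are available in the toolkits listed in this section.

```python
def parse_fasta(text):
    """Return {identifier: sequence} for the FASTA records in `text`.

    The identifier is the first word after '>' on the header line;
    sequence lines are joined with whitespace removed.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


example = """>seq1 hypothetical record
ACGTAC
GTGA
>seq2
TTTTGG
"""
print(parse_fasta(example))  # {'seq1': 'ACGTACGTGA', 'seq2': 'TTTTGG'}
```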

Bioinformatic meta search engines (Entrez, Bioinformatic Harvester) help in finding relevant information from several databases.

Communities of bioinformatics programmers have set up free/open source projects such as EMBOSS, Bioconductor, BioPerl, BioLinux, BioPython, BioRuby, and BioJava, which develop and distribute shared programming tools and objects (as program modules) that make bioinformatics easier.

Visualisation for Bioinformatics
DNA microarray Visualisation Resources, Papers, Articles, Posters and Talks.

The Swiss Institute of Bioinformatics Homepage (SIB)
SIB operates the ExPASy proteomics server and the Swiss node of EMBnet. Teaching activities include a series of post-graduate courses given at the Universities of Geneva and Lausanne, as well as at the EPFL, and a Masters Degree in bioinformatics. Major research areas include the development of integrated databases and software resources in the field of proteomics.

The Ensembl Project
Ensembl is a joint project between EMBL-EBI and the Sanger Centre to develop a software system which produces and maintains automatic annotation on eukaryotic genomes.

The Open Lab
A community focused on the freedom of information as it pertains to the biosciences.

The International Society for Computational Biology
The International Society for Computational Biology is dedicated to advancing the scientific understanding of living systems through computation; the emphasis is on the role of computing and informatics in advancing molecular biology.

European Molecular Biology Network
EMBnet is the only organisation world-wide bringing bioinformatics professionals to work together to serve the expanding fields of genetics and molecular biology.

Biodatabase Mining
Whitepaper on database mining in the Human Genome Initiative.

Society for Bioinformatics in the Nordic countries
SocBiN is a non-profit organisation for people working with and interested in bioinformatics. One task of the society is to arrange annual conferences on Bioinformatics, of which the first took place April 1999 in Lund.

DNA Structural Atlas
Easy-to-use summary of the genomic information currently available for all organisms, from the Technical University of Denmark.

USGS Center for Biological Informatics
Facilitates access to and application of biological information.






© 2005 GeneralAnswers.org