[RC5] Gene sequencing next d.net project?

David Hemmings david.hemmings at virgin.net
Wed Dec 8 23:34:15 EST 1999

Problems not yet mentioned: you cannot necessarily derive a protein's
real-life structure from its primary sequence by global minimisation.  Many
proteins start folding before the whole molecule has been synthesised; this
will NOT lead to a global-minimum structure.  Disulphide bridges form and
physically restrain the protein; this will NOT lead to a global-minimum
structure either.  Concentration of solvents / solutes: adding different
proportions of water and salts radically changes a protein's shape
(hydrophobic / hydrophilic base inversions).  Quaternary interactions.
Active forms are not static, i.e. they can 'breathe' or change shape
(proteins rarely have *one* structure).  Proteins need not be monomeric (see
quaternary interactions).  Even with small molecules it is hard enough to
kick the model out of a local minimum and into a lower one; imagine this
made thousands of times more complex.  Interaction with target molecules,
agonist / antagonist molecules and inhibitors (both reversible and
irreversible) can change structure radically, e.g. activation of an enzyme
by a co-factor.  Temperature changes lead to conformation changes, too high
leads to

I am quite sure there are some more that are missing from this list.
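
To make the local-minima point concrete, here is a toy sketch. It is not a
molecular model: the one-dimensional "energy" function, the step size and
the starting points are all made up for illustration. A plain downhill
minimiser ends up in whichever well it starts nearest to, even when a
deeper one exists elsewhere.

```python
# Toy 1-D "energy landscape" with two wells (hypothetical, not a force field).
def energy(x):
    return x**4 - 3 * x**2 + x


def gradient(x):
    # Analytical derivative of energy(x).
    return 4 * x**3 - 6 * x + 1


def minimise(x, step=0.01, iters=10_000):
    """Plain gradient descent: only ever walks downhill from the start."""
    for _ in range(iters):
        x -= step * gradient(x)
    return x


left = minimise(-2.0)   # settles in the deeper well (the global minimum)
right = minimise(2.0)   # settles in the shallower well (a local minimum)

print(f"start -2.0 -> x = {left:.3f}, E = {energy(left):.3f}")
print(f"start +2.0 -> x = {right:.3f}, E = {energy(right):.3f}")
```

The second run never finds the deeper well because it would have to climb
over the barrier between them first. A real protein has thousands of
coupled coordinates rather than one.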

On the plus side, not all torsional angles are viable or allowed.

What if you started de novo with haemoglobin?  With just the primary
structure, could modelling software *ever* predict its active form(s)?


David Hemmings

Date: Mon, 06 Dec 1999 00:03:27 +0000
From: Mustafa Unlu <gen280 at abdn.ac.uk>
Subject: Re: [RC5] Gene sequencing next d.net project?

As a biochemist, I can tell you that gene sequencing is a lot of hard
"wet" work on a bench, not at all related to code cracking.

Essam's idea of working on predicting 3d protein structures is closer to
a project which *might* be valid for d.net, but (wild guess mode) I'd
venture that 1000 giga d.nets would barely be able to make a dent in the
problem (end wild guess mode).

A *very* brief description of what we mean by 3d protein structure
follows for the non-biochemically oriented:

Proteins are *the* essential molecules in a cell. Forget about what
Hollywood tells you about DNA and genes - DNA only exists to provide the
blueprints for the proteins. It's the proteins that do all the work -
they transport things in and out, they are enzymes, structural
molecules, signalling molecules, they replicate DNA and form the backbone
of the immune system (antibodies are proteins, for example). A DNA sequence
means nothing if it never gets translated into (made into) a protein.

Most importantly, for proteins, structure equals function. The way a
protein molecule folds in 3d determines what it can do. Given that there
are 20 main amino acids which can occupy each position in a protein, and
the average size of a protein is in the region of 500 amino acids (aa),
the diversity which can be generated simply by the primary structure
(i.e. the simple list of amino acids in order) is enormous. Things are
further complicated when aa's interact with each other and generate
secondary (local folds) and tertiary structures (the whole molecule's
folding in space). However, if you unfold a protein by denaturing it, it
will eventually refold (with a little help from other proteins
sometimes) to its *one* correct position in 3d space 99 times out of
100. Theoretically speaking, it *should* be possible to take quantum
mechanical equations about every *atom* in a protein and minimise these
functions to find an energetically minimal solution but currently we
are unable to solve these equations for large numbers of atoms. The
"holy grail" of protein biochemistry is to be able to predict the 3d
structure of a protein from its primary structure (i.e. list of which
amino acid occupies which position). Then we can start predicting
functions without having to do arduous and time-consuming "wet"
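
For a sense of scale, the figures quoted above (20 amino acids, about 500
positions) can be plugged into a few lines of Python. This is only a count
of possible primary sequences under those two assumptions; it says nothing
about how many of them fold or occur in nature.

```python
import math

AMINO_ACIDS = 20  # "20 main amino acids" from the text
LENGTH = 500      # "in the region of 500 amino acids"

# Python integers are arbitrary precision, so this is an exact count.
sequences = AMINO_ACIDS ** LENGTH
digits = len(str(sequences))

print(f"20^500 has {digits} decimal digits")
print(f"log10(20^500) = {LENGTH * math.log10(AMINO_ACIDS):.1f}")
```

That is a number with 651 digits, before any question of how each sequence
folds is even considered.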

This is not my speciality so I am speaking from my general knowledge and
off the top of my head but this explanation should be enough to describe
the complexity and magnitude of the problem... I can hook people up to
the experts in this field if anyone is interested or maybe there is one
on the list?

- --
To unsubscribe, send 'unsubscribe rc5' to majordomo at lists.distributed.net
rc5-digest subscribers replace rc5 with rc5-digest
