[RC5] Gene sequencing next d.net project?

Mustafa Unlu gen280 at abdn.ac.uk
Mon Dec 6 00:03:27 EST 1999

As a biochemist, I can tell you that gene sequencing is a lot of hard
"wet" work on a bench, not at all related to code cracking.

Essam's idea of working on predicting 3d protein structures is closer to
a project which *might* be valid for d.net, but (wild guess mode) I'd
venture that 1000 giga d.nets would barely be able to make a dent in the
problem (end wild guess mode).

A *very* brief description of what we mean by 3d protein structure
follows for the non-biochemically oriented:

Proteins are *the* essential molecules in a cell. Forget about what
Hollywood tells you about DNA and genes - DNA only exists to provide the
blueprints for the proteins. It's the proteins that do all the work -
they transport things in and out, they are enzymes, structural
molecules, signaling molecules, they replicate DNA and form the backbone
of the immune system (antibodies are proteins, for example). A DNA
sequence means nothing if it never gets translated into (made into) a
protein.

Most importantly, for proteins structure equals function. The way a
protein molecule folds in 3d determines what it can do. Given that there
are 20 main amino acids which can occupy each position in a protein, and
the average size of a protein is in the region of 500 amino acids (aa),
the diversity which can be generated simply by the primary structure
(i.e. the simple list of amino acids in order) is enormous. Things are
further complicated when aa's interact with each other and generate
secondary (local folds) and tertiary structures (the whole molecule's
folding in space). However, if you unfold a protein by denaturing it, it
will eventually refold (with a little help from other proteins
sometimes) to its *one* correct position in 3d space 99 times out of
100. Theoretically speaking, it *should* be possible to take quantum
mechanical equations about every *atom* in a protein and minimize these
functions to find an energetically minimal solution, but currently we
are unable to solve these equations for large numbers of atoms. The
"holy grail" of protein biochemistry is to be able to predict the 3d
structure of a protein from its primary structure (i.e. list of which
amino acid occupies which position). Then we can start predicting
functions without having to do arduous and time-consuming "wet" work.
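
To put a rough number on "enormous": the sketch below just counts the possible primary structures, taking the figures above (20 amino acids, about 500 positions) at face value. The function names are made up for illustration, and this says nothing about folding itself, only about how many different sequences there are to fold.

```python
# Back-of-the-envelope count of possible primary structures.
# Assumptions (taken from the text above, not measured values):
#   20 main amino acids, ~500 positions in an average protein.

AMINO_ACIDS = 20
AVERAGE_LENGTH = 500


def sequence_count(length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct sequences of `length` positions over `alphabet` letters."""
    return alphabet ** length


def order_of_magnitude(n: int) -> int:
    """Power of ten of a positive integer, i.e. its digit count minus one."""
    return len(str(n)) - 1


if __name__ == "__main__":
    count = sequence_count(AVERAGE_LENGTH)
    print(f"roughly 10^{order_of_magnitude(count)} possible sequences")
```

Python integers are arbitrary precision, so 20**500 is computed exactly rather than overflowing a float. Only a tiny fraction of those sequences exist in nature, but it gives a feel for the size of the search space before any folding is considered.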

This is not my specialty so I am speaking from my general knowledge and
off the top of my head but this explanation should be enough to describe
the complexity and magnitude of the problem... I can hook people up to
the experts in this field if anyone is interested or maybe there is one
on the list?

