[RC5] Tree search algorithms and the next d.net
AZilinskas at SolutionsIQ.com
Tue Apr 20 15:55:22 EDT 1999
David Cantrell wrote:
> This suffers from the same problem as we found when talking about
> distributed graphics rendering. The customers DEMAND a particular
> turn-around time; they DEMAND reliability.
I guess that I am thinking of a co-op type system.
There are times you want to crunch a big problem
but can't afford to own a Cray-equivalent. You can
afford a $10,000 system (or pick a reasonable price).
You can get Cray-like performance by teaming up with
a thousand other machines like yours. The payback to
the group is that when you are finished, your machine
sits idle so that others can make use of it, in exchange
for your use of their CPU cycles.
Now I guess it's like using city streets. It's communal
property and works well when each person needs it only
for a period of time. There will be occasional traffic
jams when everyone wants to use it at the same time, but
if planned right we generally get good access to the streets.
Now suppose there is a hog: someone who needs many CPU-years
from many machines to work their problem. They will have to
either build their own system or compensate the others for
the resources the hog uses and does not replace. The street
equivalent is putting a mining company on your street, so
that the street is always full of dump trucks entering and
leaving the mine and no one else can make use of it.
The best thing would be for the mining company to make
their own private road and stay off the communal one, or
pay to expand this communal street to make more room.
Back to the $10,000 example: solving a problem on it
alone may take thousands of CPU-minutes. We could buy
a $10,000,000 system and solve the problem in a few
minutes. Using the co-op network, we spread the
thousands of CPU-minutes of compute time over
thousands of machines (given the problem can be split
that way). We get the effect of the $10,000,000
machine while only spending $10,000.
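To make that arithmetic concrete, here is a back-of-the-envelope
sketch in Python (the numbers are made up to match the example,
not measurements):

    # All figures are illustrative, matching the example above.
    cpu_minutes_needed = 10_000   # total work in one big problem
    machines_in_coop = 1_000      # peers donating idle cycles

    # If the work splits cleanly, wall-clock time shrinks linearly.
    wall_clock_minutes = cpu_minutes_needed / machines_in_coop
    print(wall_clock_minutes)     # 10.0 -- big-machine turnaround for $10,000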
Now suppose we become a hog: we have such a big problem,
or submit so many problems a day, that we take a lot
of CPU cycles but never sit idle long enough to give
any back. We should compensate the others for all the
cycles we are consuming (pay them like a utility), or
break down and buy enough machines to build our own
network dedicated to our own uses and demands.
If everyone understood that they are getting something
for very little and, like rush-hour traffic, accepted that
sometimes they will have to wait for traffic to clear,
they might accept being part of the co-op.
Now about distributed work units. (Greg Hewgill)
This is where a lot of design work would be needed.
I am thinking of an idea like the Java sandbox, where
work unit cores will, by design, protect the data
within the unit as well as prevent the core code from
violating the host.
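To sketch what I mean (the names here are invented for
illustration, and I use Python rather than Java), the sandbox
contract could be: a core is handed only the block's data and
one way to hand back a result, with no file or network access
at all.

    # Hypothetical sandbox contract: a core sees only this interface.
    class WorkBlock:
        def __init__(self, data: bytes):
            self._data = data
            self.result = None

        def read(self) -> bytes:
            # The only input a core may consume.
            return self._data

        def submit(self, result: bytes) -> None:
            # The only output a core may produce.
            self.result = result

    # A core is just a function from block to result; it never
    # touches the host's files, network, or other processes.
    def sample_core(block: WorkBlock) -> None:
        block.submit(block.read()[::-1])   # stand-in for real crunching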
Somehow, work units come in some sort of consistent
blocks that are checked out of hubs. The core code
for that work unit will also be downloaded if it
is not already cached. The core code works on the
data and checks the block back in. Security would
come from some sort of mild encryption of the blocks,
as well as from each block containing only a tiny part
of the problem. Now for redundancy, the network
will have a scheme that if a checked out block
does not come back within a reasonable time, it
is assumed to have been lost and the block is
unlocked to be tried by another client. This might
rule out offline clients that check out blocks
and post them back 20 days later.
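A sketch of how a hub might do that bookkeeping (the one-day
timeout and the data structures are my assumptions, not anything
distributed.net actually does):

    import time

    CHECKOUT_TIMEOUT = 24 * 60 * 60   # assumed: after a day, a block is presumed lost

    class Hub:
        def __init__(self, block_ids):
            self.pending = dict.fromkeys(block_ids)  # id -> checkout time, None = free
            self.results = {}                        # id -> returned result

        def check_out(self):
            now = time.time()
            for block_id, t in self.pending.items():
                # Hand out free blocks, or ones whose client went silent.
                if t is None or now - t > CHECKOUT_TIMEOUT:
                    self.pending[block_id] = now
                    return block_id
            return None   # everything is in flight or finished

        def check_in(self, block_id, result):
            # Late duplicates of already-finished blocks are simply ignored.
            if block_id in self.pending:
                del self.pending[block_id]
                self.results[block_id] = result

Under a rule like that, a client that disappears for 20 days just
wastes its own cycles; its block gets reissued long before it
comes back.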
I can also see a development kit for core code.
It helps people write their tasks to fit the
co-op network scheme. The key piece of this
development kit would be an analyzer that reviews
the code to ensure that it stays within the
work-block sandbox (that is, an automated review
confirming the code is allowable to run on other
members' machines).
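As a sketch of what the analyzer might do (assuming cores were
written in Python; a deny-list like this is far too weak against
truly hostile code, but it shows the shape of the review):

    import ast

    FORBIDDEN = {"os", "socket", "subprocess", "ctypes"}   # assumed deny-list

    def review_core(source: str) -> list:
        # Walk the syntax tree and flag imports that touch the host.
        violations = []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                names = [node.module or ""]
            else:
                continue
            for name in names:
                if name.split(".")[0] in FORBIDDEN:
                    violations.append("line %d: import of %s" % (node.lineno, name))
        return violations

    print(review_core("import socket\nx = 1\n"))   # ['line 1: import of socket']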
azilinskas at solutionsiq.com
To unsubscribe, send 'unsubscribe rc5' to majordomo at lists.distributed.net
rc5-digest subscribers replace rc5 with rc5-digest