[RC5] newbie question

Slawek sgp at telsatgp.com.pl
Thu Aug 28 20:57:02 EDT 2003


> | After residuals are joined there is no reason
> | to split them again. After all, they must have
> | originated from the same participant or we would
> | not be able to catch who's cheating anyway.
> | So if they're from the same participant then
> | there's little sense in checking which of the
> | blocks is flawed. Again - no splitting.
> | 
> | 
> | Of course I could have missed something, so
> | please tell me: why do you want to split them?
>
> You caught the essence of my message.  Blocks are split
> on their way downstream, and combined (when possible
> without losing information) when travelling upstream.

OK, so no matter what method of generating residuals
we choose, we must always keep track of which blocks
came from which client.

As far as I can see we can quite easily limit the amount of
information required for a residual to 32 bits per block
from a given client, and it doesn't grow as long as we only
combine blocks from the same client.
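
Just to sketch what I mean (simplified C; the struct, the field
names and the XOR-combining rule are only an illustration, not
the real dnetc format):

#include <stdint.h>

/* Hypothetical per-client accumulator: a single 32-bit residual,
 * regardless of how many blocks from that client are folded in. */
typedef struct {
    uint32_t client_id;
    uint32_t residual;   /* stays 32 bits no matter how many blocks */
} client_residual;

/* Fold one block's 32-bit residual into the client's accumulator.
 * XOR combining is assumed here purely for illustration. */
static void combine_block(client_residual *cr, uint32_t block_residual)
{
    cr->residual ^= block_residual;
}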


The only problem I can see is in the following example:
- assume 4 smallest possible blocks of data are being
  checked
- in the first pass, blocks 1 and 2 are returned by client #1,
  and blocks 3 and 4 by client #2
- in the second (verify) pass, block 1 is returned by client #3,
  and blocks 2, 3 and 4 by client #4

Let's say that we XORed the residuals from pass 1 and
(separately) from pass 2, and it happened that they
are different. Who cheated?

Method 1:
Check the problematic blocks on trusted computers.

Method 2:
Never select multiple successive blocks for checking.
(We would need to reissue blocks a third time anyway
to check which of the two cheated.)
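
A quick C illustration of why the example above is ambiguous
(the residual values and the XOR combining are made up for the
example):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Made-up 32-bit residuals for blocks 1..4 as reported in each pass;
     * in pass 2 client #4 returns a wrong residual for block 4. */
    uint32_t pass1[4] = { 0x1111, 0x2222, 0x3333, 0x4444 };
    uint32_t pass2[4] = { 0x1111, 0x2222, 0x3333, 0x4440 };

    /* Each pass only keeps per-client combined residuals. */
    uint32_t p1_client1 = pass1[0] ^ pass1[1];             /* blocks 1,2   */
    uint32_t p1_client2 = pass1[2] ^ pass1[3];             /* blocks 3,4   */
    uint32_t p2_client3 = pass2[0];                        /* block 1      */
    uint32_t p2_client4 = pass2[1] ^ pass2[2] ^ pass2[3];  /* blocks 2,3,4 */

    /* The pass totals disagree, so somebody cheated... */
    printf("totals differ: %d\n",
           (p1_client1 ^ p1_client2) != (p2_client3 ^ p2_client4));

    /* ...but because the per-client groupings overlap ({1,2} and {3,4}
     * vs {1} and {2,3,4}), no comparison isolates a single block, so we
     * can't tell whether it was #1/#2 or #4 without rechecking. */
    return 0;
}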


> | > Another factor is that tracking the running
> | > XOR/CRC/whatever may have a greater
> | > impact on performance on register-starved
> | > architectures (X86).
> | 
> | It can be held in memory and cached.
> | 
> | Of course using cache may or may not be desired.
>
> Hmm... Not exactly.  Reading the CRC out of RAM
> into CPU registers, and writing it back to RAM after
> modification is what would cause the slowdown.

Writing to the L1 cache and reading from it, at least on a P4-M,
take 1 clock cycle each. An XOR is 1 cycle as well
(or was it even 0.5 cycle? I can't remember).

Let's say we've got a 3-pipe loop - we need 2 memory
accesses and 3 XORs, which is 5 cycles per 3 keys.

This gives only 1 2/3 clock cycles per key.
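
In C it's basically something like this (very simplified;
decrypt_one_key is just a placeholder for the real RC5 core,
not something from the dnetc source):

#include <stdint.h>

/* Placeholder for the per-key RC5 work; this is the expensive part. */
extern uint32_t decrypt_one_key(uint64_t key);

/* Keep a running 32-bit residual while walking a block of keys.
 * The update costs roughly a load, an XOR and a store per key,
 * and the value stays hot in L1 / a register, so it is tiny
 * compared to the per-key decryption cost. */
uint32_t check_block(uint64_t first_key, uint32_t count)
{
    uint32_t residual = 0;
    for (uint32_t i = 0; i < count; i++) {
        residual ^= decrypt_one_key(first_key + i);
    }
    return residual;
}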


Now... how many cycles does it take to decrypt one key?


On the P4 it's _very_ slow because of the lack of a barrel shifter,
so I should probably check on a Celeron or a P III.

As far as I remember it was somewhere around 200
clock cycles per key (somebody correct me if I'm wrong,
I don't have any P III handy here).

All of that would mean a slowdown smaller than 1%
(roughly 1.67 / 200, i.e. about 0.8%).

My assumption of a 2% slowdown for the project was based on
the slowdown of the clients plus the reissuing of the same blocks.

I think that assumption is still correct.



> | These words about cache bring up another question:
> | where does the d.net client slow down the system most,
> | and is it possible to reduce that effect?
>
> The primary bottleneck on a system running dnetc
> is the CPU.  This is understandable, since the entire
> purpose of dnetc is to put as much CPU power as
> possible towards the problem.  Since dnetc runs at
> the lowest possible priority, any other task on the
> same computer with a higher priority should draw
> CPU away from dnetc, as it should.

Does Hyper-Threading respect priorities?

You know: the case of two processes of different priorities
running simultaneously. They are executed simultaneously,
but aren't they treated equally when competing for the
processor's internal resources?


> There are some problems encountered occasionally
> between dnetc and other idle-priority tasks,

Yeah. Like with a garbage collector...



-- 
Slawek



