[Hardware] RC5 algorithm .... Re: "success"

Martin Klingensmith martin at nnytech.net
Thu Oct 19 09:24:01 EDT 2006


No, John, I got the 40,000 number from when I tried to synthesize it for 
a Spartan 3 - 200: Synplify said it wanted about 44,000 LUTs. Since it 
obviously won't fit in the Spartan 3 - 200, I've been trying to 
synthesize it for a few different chips on boards I can get 
inexpensively. And since it fits in different chips differently, I 
didn't want to quote exactly what Synplify told me. I'm at home and I 
have Xilinx ISE going (it took 20 minutes to synthesize). Xilinx said 
"Total equivalent gate design: 239,938" and 25,557 LUTs, so as you can 
see it's drastically different from what Synplify said. It is also 
worth noting that the Synplify synthesis will probably work a lot better 
than the Xilinx one: Synplify duplicated registers because some of 
them have fairly high loads.
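
For anyone following the numbers, here is a rough sketch of the fit check
behind them. It compares the two reported LUT counts against device
capacity at 8 LUTs per CLB. The CLB counts are assumptions recalled from
the Spartan 3 data sheet and should be checked there. The function names
are made up for illustration. LUT count alone says nothing about routing
or timing, so a "fits" here is optimistic.

```python
# Rough fit check: reported LUT counts versus device LUT capacity.
# Ignores routing, block RAM, and packing efficiency, so "fits" is optimistic.

LUTS_PER_CLB = 8  # Spartan 3: 4 slices per CLB, 2 LUTs per slice

# CLB counts as recalled from the Spartan 3 data sheet (assumption: verify
# against the current data sheet before relying on them).
DEVICE_CLBS = {
    "XC3S200": 480,
    "XC3S2000": 5120,
}

# LUT counts quoted in this message.
REPORTED_LUTS = {
    "Synplify (approx.)": 44_000,
    "Xilinx ISE": 25_557,
}


def lut_capacity(device: str) -> int:
    """Total LUTs in the named device."""
    return DEVICE_CLBS[device] * LUTS_PER_CLB


def fits(luts: int, device: str) -> bool:
    """True if the LUT count alone does not exceed the device capacity."""
    return luts <= lut_capacity(device)


if __name__ == "__main__":
    for tool, luts in REPORTED_LUTS.items():
        for device in DEVICE_CLBS:
            verdict = "fits" if fits(luts, device) else "does not fit"
            print(f"{tool}: {luts} LUTs {verdict} in {device} "
                  f"({lut_capacity(device)} LUTs)")
```

Under these assumptions neither count fits the Spartan 3 - 200, and only
the ISE count fits within the LUT capacity of an XC3S2000.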

Yes, I would like everyone to remember that John Bass helped me with 
this a lot, since I'm obviously not the expert. I'm not trying to take 
credit that is due to him. See the archives if you would like more info 
on that. I'm having a hard time keeping up with all these emails since I 
have other things to do as well. I didn't expect such a response here.
I also made no claim that a Verilog implementation was better than 
John's FpgaC implementations, or that I should get some sort of credit 
for doing something that many other people, e.g. John Bass, have done before.
Yes, I am doing my graduate thesis on something related to this, but this 
is not the only part, and I'm not going to talk about that right now.
--
Martin K

John L. Bass wrote:
> 	david fleischer <cilantro_il at yahoo.com> on Thu, 19 Oct 2006 00:31:36 wrote:
> 	> Subject: Re: [Hardware] "success"
> 	> To: "John L. Bass" <jbass at dmsd.com>
> 	> 
> 	> Martin,
> 	> What is YOUR estimate?
> 	> 
> 	> Regards,
> 	> David
>
> I suspect this is where the 40K LUT number came from in Martin's comments:
>
> From jbass Fri Dec 30 16:39:34 2005
>   
>> To: jbass at dmsd.com, martin at nnytech.net
>> Subject: Re: FPGA C code
>>
>>         I'm sure I can get ISE 4.2 from the Xilinx website. I forgot you told me
>>         previously that I needed to convert the XNF to EDF. I've used EDF before
>>         to go from Synplicity to ISE. You mention the Virtex parts. Is it not
>>         possible for some reason to put this on a Spartan 3? The reason is that
>>         I have a Spartan 3 - 200 board right now.
>>         I took a VLSI class last semester, so that is my interest for the RSA
>>         project. It is mainly personal.
>>         I could write a converter for xnf2edf if it isn't too difficult. I don't
>>         know the specifics of either format but I have experience in perl. I
>>         have two weeks before classes start up, at which point my time will be
>>         severely limited.
>>
>> As long as the Spartan 3 has enough LUTs and can route the resulting net, it
>> should not be a problem. A Spartan 3 has 8 LUTs per CLB, so you will need
>> a device with about 10% more than 36,000/8 = 4500 CLBs. From the data sheet,
>> the smallest device that fits is an XC3S2000 with 5120 CLBs, which should be
>> more than enough headroom to get the device routed. There may not be enough
>> room to pipeline it, as the additional LUT RAMs may run you over budget.
>>
>> It took HOURS to route it on my 533MHz PII last year, before it reached timing
>> closure. So expect it might be slow going.
>>
>> A fully pipelined RSA network requires 102 barrel shifters that are 32 bits
>> wide. With LUT-based hand design this results in LUT muxes 5 deep in
>> a log2 pattern: 102*32*5 = 16,320 LUTs for barrel shifters. This can be
>> reduced for some FPGAs by using H 3-LUTs in XC4Ks, and F5 muxes in Virtex
>> devices. There are another 4-5 32-bit adders per stage, which require another
>> 16K LUTs. Using 32x1 LUT RAMs for pipeline retiming of S and L adds four
>> additional LUTs per stage. Framework logic, and additional redundant
>> logic to reduce fanout, cost another few percent in space.
>>
>> If you order the mux right, you can combine it with the first adder stage and
>> save a few LUTs. The best hand design I've done, which didn't have the best
>> timing, was only 15-20% smaller than this.
>>
>> That's why I commented that it takes a pretty fair-sized FPGA. I was using
>> XCV2000Es, XCV2600Es, and XC2V6000s.
>>
>> The xnf to edf project is a feature request on the fpgac site:
>>
>> http://sourceforge.net/tracker/index.php?func=detail&aid=1365849&group_id=152034&atid=782959
>>
>> Which has an example of both netlist formats. I'm working on a generalized
>> netlist format as the std backend for FpgaC, that I'm calling cnf. It's not
>> nearly as verbose. If you need to write a converter, I'll send you a working
>> copy of fpgac that has cnf output.
>>
>> This is a fun project, and good experience fitting real world type applications
>> to hardware.
>>
>> Have Fun!
>> John
>>
>>     
> _______________________________________________
> Hardware mailing list
> Hardware at lists.distributed.net
> http://lists.distributed.net/mailman/listinfo/hardware
>   
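
The LUT budget in John's quoted message can be reproduced with a short
sketch. The 102 rotates, 32-bit width, 5 mux levels, 4-5 adders per stage,
four retiming LUTs per stage, 8 LUTs per CLB, and 10% headroom all come
from his message. Treating the design as 102 stages (one per rotate) and
costing each adder at one LUT per bit are my assumptions, chosen because
they land on his "another 16K LUTs" figure. The function names are
illustrative only.

```python
# Back-of-the-envelope LUT budget using the figures in John's message.

WORD_BITS = 32         # barrel shifter and adder width from the message
ROTATES = 102          # 32-bit barrel shifters in the fully pipelined design
ADDERS_PER_STAGE = 5   # upper end of the "4-5" quoted
RETIMING_LUTS = 4      # additional LUTs per stage for S and L retiming
LUTS_PER_CLB = 8       # Spartan 3


def barrel_shifter_luts(rotates=ROTATES, bits=WORD_BITS):
    """One mux LUT per bit per level, log2(bits) levels deep."""
    levels = bits.bit_length() - 1  # 5 for 32 bits (bits must be a power of two)
    return rotates * bits * levels


def adder_luts(stages=ROTATES, adders=ADDERS_PER_STAGE, bits=WORD_BITS):
    """Assumption: one LUT per adder bit, and one stage per rotate."""
    return stages * adders * bits


def retiming_luts(stages=ROTATES, per_stage=RETIMING_LUTS):
    """LUT RAMs used for pipeline retiming."""
    return stages * per_stage


def clbs_needed(luts, headroom_pct=10):
    """CLBs for `luts` LUTs plus headroom, rounded up using integer math."""
    scaled = luts * (100 + headroom_pct)
    return -(-scaled // (100 * LUTS_PER_CLB))


if __name__ == "__main__":
    subtotal = barrel_shifter_luts() + adder_luts() + retiming_luts()
    print("barrel shifters:", barrel_shifter_luts())
    print("adders:         ", adder_luts())
    print("retiming:       ", retiming_luts())
    print("subtotal:       ", subtotal)
    print("CLBs for 36,000 LUTs + 10%:", clbs_needed(36_000))
```

The subtotal leaves out the framework and fanout-reduction logic, which is
why it comes in under the 36,000 LUTs John uses for sizing. With his
figure and 10% headroom the sketch asks for 4950 CLBs, under the 5120 he
quotes for the XC3S2000.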

