[Hardware] "success"

John L. Bass jbass at dmsd.com
Thu Oct 19 04:47:22 EDT 2006


Hi Martin,

Are you SURE that this simulates correctly? I don't see
any pipeline retiming that is necessary to correct for the
unrolled variables ... one of the things I purposefully
left out of the example code I sent you, but warned you about.

Or is there some magic external to this function, like a fixup
script on the netlist?

John

> From jbass Fri Dec 30 05:17:54 2005
> To: jbass at dmsd.com, martin at nnytech.net
> Subject: Re: FPGA C code
> 
> I suppose I oughta be nice and tell you the ugly part about pipelining
> the RC5 algorithm, and that is that the S and L boxes need FIFOs to
> retime them. When you are calculating the later S terms, each one needs
> the original S term from 26 stages earlier, for the same solution.
> 
> Your code will pick up the earlier S terms from the wrong solution, that
> is, the one 26 clocks later.
> 
> Arrays are partially implemented in FpgaC because of a hack to build
> a 26-stage FIFO for each S box term. Implementing each one as a dual-port
> RAM32d1, all sharing a 26-state index counter, made it easy to access the
> previous S box entry just as the next was being written, by reassigning the
> FFin terms to the array RAMout port. I did the fixups with a rather ugly perl
> script that is long gone. Since RAMs are clocked, I simply substituted
> RAMs for FFs. The dual-port RAMs are much slower, but take much less logic
> than FF-based 26-stage delay lines for every term.
> 
> Declare an array in FpgaC, long s1[32], and look at the netlist. You will
> see what the fixup script needs to do. It's not that bad, but a rather
> ugly hack.
> 
> John
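
The retiming problem described in the quoted message can be modeled in a few lines of C. This is only a minimal sketch, not FpgaC output and not the lost perl fixup: the names delay_line, delay_clock and count_mismatches are hypothetical, and each S term is replaced by a tag recording the clock on which its solution entered the pipe. It compares reading the plain register against reading a 26-deep delay line built from a small RAM and a wrapping index counter.

```c
#include <stdint.h>

#define DEPTH 26   /* clocks between writing an S term and needing it again */

/* Hypothetical model of one retiming FIFO: a 32-word RAM (as in "long s1[32]")
 * addressed by a wrapping index counter. Only DEPTH words are used. */
typedef struct {
    uint32_t ram[32];
    unsigned idx;
} delay_line;

/* One clock: read the word stored DEPTH clocks ago, then overwrite it. */
uint32_t delay_clock(delay_line *d, uint32_t in)
{
    uint32_t out = d->ram[d->idx];
    d->ram[d->idx] = in;
    d->idx = (d->idx + 1) % DEPTH;
    return out;
}

/* Each clock a new solution enters the pipe and writes an S term, tagged here
 * with its entry clock. A stage DEPTH clocks downstream needs the term tagged
 * (clk - DEPTH). Returns how many reads came from the wrong solution. */
unsigned count_mismatches(int use_fifo, unsigned clocks)
{
    delay_line d = { {0}, 0 };
    unsigned bad = 0;

    for (unsigned clk = 0; clk < clocks; clk++) {
        uint32_t s_reg = clk;                     /* what the plain FF holds now */
        uint32_t s_old = delay_clock(&d, s_reg);  /* what the delay line returns */

        if (clk < DEPTH)
            continue;                             /* pipe still filling */
        if ((use_fifo ? s_old : s_reg) != clk - DEPTH)
            bad++;
    }
    return bad;
}
```

In this model count_mismatches(0, 100) reports a mismatch on every clock once the pipe is full, while count_mismatches(1, 100) reports none. In hardware the index counter would be shared by all the delay lines rather than kept per FIFO as it is here.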




	Date: Wed, 18 Oct 2006 22:48:30 -0400
	From: Martin Klingensmith <martin at nnytech.net>
	To: Hardware <hardware at lists.distributed.net>
	Subject: Re: [Hardware] "success"

	Okay, here it is. I can't think of a reason to keep it secret. I'm no
	expert and it only took me a few hours to make it.
	Don't laugh, I'm no HDL expert.

	/////////////
	// Martin Klingensmith 2006
	// Verilog RC5-32/12/9 brute force key checker
	// simulates correctly
	// actually just encrypts "The unkn" with whatever key you give it.
	// probably takes about 45,000 LUTs

	module pipe(A,B,key0,key1,key2,clock,clr);
	    output [31:0]A,B;
	    input [31:0]key0,key1,key2;
	    input clock,clr;

	    reg [31:0]S20,S21,S22,S23,S24,S25,S10,S11,S12,S13,S14,S15,S16,S17,S18,S19,
	        L220,L221,L222,L223,L224,L225,L226,
	        L210,L211,L212,L213,L214,L215,L216,L217,L218,L219,
	        L201,L202,L203,L204,L205,L206,L207,L208,L209,
	        L120,L121,L122,L123,L124,L125,
	        L110,L111,L112,L113,L114,L115,L116,L117,L118,L119,
	        L100,L101,L102,L103,L104,L105,L106,L107,L108,L109,
	        L020,L021,L022,L023,L024,L025,
	        L010,L011,L012,L013,L014,L015,L016,L017,L018,L019,
	        L00,L01,L02,L03,L04,L05,L06,L07,L08,L09,
	        B0,B1,B2,B3,B4,B5,B6,B7,B8,B9,B10,B,
	        A0,A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A,
	        S1000,S1001,S1101,S1200,S1201,
	        S2300,S2301,S2400,S2401,S2500,S2501,
	        S1800,S1801,S1900,S1901,S2000,S2001,S2100,S2101,S2200,S2201,
	        S1300,S1301,S1400,S1401,S1500,S1501,S1600,S1601,S1700,S1701,
	        S901,S00,S1102,S400,S401,S500,S501,S600,S601,S700,S701,S800,S801,S900,
	        S100,S101,S200,S201,S300,S301,
	        S0,S1,S2,S3,S4,S5,S6,S7,S8,S9;
	   
	   
	    function [31:0]ROTL;
	        input [31:0]x;
	        input [31:0]n;
	        begin
	            ROTL = (((x) << (n[4:0])) | ((x) >> (32-(n[4:0]))));

	        end
	    endfunction
	   
	    function [31:0]ROTL3;
	        input [31:0]x;
	        ROTL3 = (((x) << 3) | ((x) >> (32-3)));
	    endfunction
	   
	   
	   
	 

	   
	    always @(posedge clock)
	    begin
	        if(clr==1)
	        begin
	               S100 =0;S101=0;S200=0;S201=0;
	          S300=0;S301=0;S0=0;S1=0;
	          S2=0;S3=0;S4=0;S5=0;
	          S6=0;          S7=0;          S8=0;          S9=0;
	          S901=0;          S00=0;          S1102=0;          S400=0;
	          S401=0;          S500=0;          S501=0;          S600=0;
	          S601=0;          S700=0;          S701=0;          S800=0;
	          S801=0;          S900=0;          S20=0;          S21=0;
	          S22=0;          S23=0;          S24=0;          S25=0;
	          S10=0;          S11=0;          S12=0;          S13=0;
	          S14=0;          S15=0;          S16=0;          S17=0;
	          S18=0;          S19=0;          L220=0;        L221=0;
	          L222=0;        L223=0;          L224=0;        L225=0;
	          L226=0;        L210=0;          L211=0;        L212=0;
	          L213=0;        L214=0;          L215=0;        L216=0;
	          L217=0;        L218=0;          L219=0;        L201=0;
	          L202=0;        L203=0;          L204=0;       L205=0;
	          L206=0;        L207=0;          L208=0;        L209=0;
	          L120=0;        L121=0;          L122=0;        L123=0;
	          L124=0;        L125=0;          L110=0;        L111=0;
	          L112=0;        L113=0;          L114=0;        L115=0;
	          L116=0;        L117=0;          L118=0;        L119=0;
	          L100=0;        L101=0;          L102=0;        L103=0;
	          L104=0;        L105=0;          L106=0;        L107=0;
	          L108=0;        L109=0;          L020=0;        L021=0;
	          L022=0;        L023=0;          L024=0;        L025=0;
	          L010=0;        L011=0;          L012=0;        L013=0;
	          L014=0;        L015=0;          L016=0;        L017=0;
	          L018=0;        L019=0;          L00=0;        L01=0;
	          L02=0;        L03=0;          L04=0;        L05=0;
	          L06=0;        L07=0;          L08=0;        L09=0;
	          B0=0;        B1=0;          B2=0;        B3=0;
	          B4=0;        B5=0;          B6=0;        B7=0;
	          B8=0;        B9=0;          B10=0;        B=0;
	          A0=0;          A1=0;        A2=0;        A3=0;
	        A4=0;        A5=0;        A6=0;        A7=0;
	        A8=0;        A9=0;        A10=0;        A=0;
	        S1000=0;        S1001=0;        S1101=0;        S1200=0;
	        S1201=0;        S2300=0;        S2301=0;        S2400=0;
	        S2401=0;        S2500=0;        S2501=0;        S1800=0;
	        S1801=0;        S1900=0;        S1901=0;        S2000=0;
	        S2001=0;        S2100=0;        S2101=0;        S2200=0;
	        S2201=0;        S1300=0;        S1301=0;        S1400=0;
	        S1401=0;        S1500=0;        S1501=0;        S1600=0;
	        S1601=0;        S1700=0;        S1701=0;      
	           
	        end else
	        begin
	    B = ROTL(B10 ^ A, A) + S25;
	    A = ROTL(A10 ^ B10, B10) + S24;
	    B10 = ROTL(B9 ^ A10, A10) + S23;
	    A10 = ROTL(A9 ^ B9, B9) + S22;
	    B9 = ROTL(B8 ^ A9, A9) + S21;
	    A9 = ROTL(A8 ^ B8, B8) + S20;
	    B8 = ROTL(B7 ^ A8, A8) + S19;
	    A8 = ROTL(A7 ^ B7, B7) + S18;
	    B7 = ROTL(B6 ^ A7, A7) + S17;
	    A7 = ROTL(A6 ^ B6, B6) + S16;
	    B6 = ROTL(B5 ^ A6, A6) + S15;
	    A6 = ROTL(A5 ^ B5, B5) + S14;
	    B5 = ROTL(B4 ^ A5, A5) + S13;
	    A5 = ROTL(A4 ^ B4, B4) + S12;
	    B4 = ROTL(B3 ^ A4, A4) + S11;
	    A4 = ROTL(A3 ^ B3, B3) + S10;
	    B3 = ROTL(B2 ^ A3, A3) + S9;
	    A3 = ROTL(A2 ^ B2, B2) + S8;
	    B2 = ROTL(B1 ^ A2, A2) + S7;
	    A2 = ROTL(A1 ^ B1, B1) + S6;
	    B1 = ROTL(B0 ^ A1, A1) + S5;
	    A1 = ROTL(A0 ^ B0, B0) + S4;
	    B0 = ROTL((32'h2ff17af3+S1) ^ A0, A0) + S3;
	    A0 = ROTL((32'h3f3ca653+S0) ^ (32'h2ff17af3+S1), (32'h2ff17af3+S1))
	        + S2; //A: 3f3ca653 B: 2ff17af3

	// enc
	//    B = L226;
	//    A = S25;
	    L226  = ROTL(L225 + L125+S25, L125+S25);
	    S25 = ROTL3(S2501 + S24+L125);
	    L125  = ROTL(L124 + L025+S24, L025+S24);
	    S24 = ROTL3(S2401 + S23+L025);
	    L025  = ROTL(L024 + L225+S23, L225+S23);
	    S23 = ROTL3(S2301 + S22+L225);
	    L225  = ROTL(L224 + L124+S22, L124+S22);
	    S22 = ROTL3(S2201 + S21+L124);
	    L124  = ROTL(L123 + L024+S21, L024+S21);
	    S21 = ROTL3(S2101 + S20+L024);
	    L024  = ROTL(L023 + L224+S20, L224+S20);
	    S20 = ROTL3(S2001 + S19+L224);
	    L224  = ROTL(L223 + S19+L123, S19+L123);
	    S19 = ROTL3(S1901 + S18 + L123);
	    L123  = ROTL(L122 + S18+L023, S18+L023);
	    S18 = ROTL3(S1801 + L023+S17);
	    L023  = ROTL(L022 + S17+L223, S17+L223);
	    S17 = ROTL3(S1701 + S16+L223);
	    L223  = ROTL(L222 + L122+S16, L122+S16);
	    S16 = ROTL3(S1601 + L122 + S15);
	    L122  = ROTL(L121 + S15+L022, S15+L022);
	    S15 = ROTL3(S1501 + S14+L022);
	    L022  = ROTL(L021 + S14+L222, S14+L222);
	    S14 = ROTL3(S1401 + L222+S13);
	    L222  = ROTL(L221 + L121+S13, L121+S13);
	    S13 = ROTL3(S1301 + L121+S12);
	    L121  = ROTL(L120 + S12+L021, S12+L021);
	    S12 = ROTL3(S1201 + L021+S11);
	    L021  = ROTL(L020 + S11+L221, S11+L221);
	    S11 = ROTL3(S1102 + L221+S10);
	    L221  = ROTL(L220 + S10+L120, S10+L120);
	    S10 = ROTL3(S1001 + L120+S9);
	    L120  = ROTL(L119 + S9+L020, S9+L020);
	    S9  = ROTL3(S901 + L020+S8);
	    L020  = ROTL(L019 + S8+L220, S8+L220);
	    S8  = ROTL3(S801 + S7+L220);
	    L220  = ROTL(L219 + S7+L119, S7+L119);
	    S7  = ROTL3(S701 + L119+S6);
	    L119  = ROTL(L118 + S6+L019, S6+L019);
	    S6  = ROTL3(S601 + L019+S5);
	    L019  = ROTL(L018 + S5+L219, S5+L219);
	    S5  = ROTL3(S501 + L219+S4);
	    L219  = ROTL(L218 + S4+L118, S4+L118);
	    S4  = ROTL3(S401 + L118+S3);
	    L118  = ROTL(L117 + L018+S3, L018+S3);
	    S3  = ROTL3(S301 + L018+S2);
	    L018  = ROTL(L017 + S2+L218, S2+L218);
	    S2  = ROTL3(S201 + L218+S1);
	    L218  = ROTL(L217 + L117+S1, L117+S1);
	    S1  = ROTL3(S100 + L117+S0);
	    L117  = ROTL(L116 + S0+L017, S0+L017);
	    S0  = ROTL3(S00 + L017+S2501);

	    L017  = ROTL(L016 + S2501+L217, L217+S2501);
	    S2501 = ROTL3(S2500 + L217+S2401);
	    L217  = ROTL(L216 + S2401+L116, S2401+L116);
	    S2401 = ROTL3(S2400 + L116+S2301);
	    L116  = ROTL(L115 + S2301+L016, S2301+L016);
	    S2301 = ROTL3(S2300 + L016+S2201);
	    L016  = ROTL(L015 + S2201+L216, S2201+L216);
	    S2201 = ROTL3(S2200 + L216+S2101);
	    L216  = ROTL(L215 + S2101+L115, S2101+L115);
	    S2101 = ROTL3(S2100 + L115+S2001);
	    L115  = ROTL(L114 + S2001+L015, S2001+L015);
	    S2001 = ROTL3(S2000 + L015+S1901);
	    L015  = ROTL(L014 + S1901+L215, S1901+L215);
	    S1901 = ROTL3(S1900 + L215+S1801);
	    L215  = ROTL(L214 + S1801+L114, S1801+L114);
	    S1801 = ROTL3(S1800 + L114+S1701);
	    L114  = ROTL(L113 + S1701+L014, S1701+L014);
	    S1701 = ROTL3(S1700 + L014+S1601);
	    L014  = ROTL(L013 + S1601+L214, S1601+L214);
	    S1601 = ROTL3(S1600 + L214+S1501);
	    L214  = ROTL(L213 + S1501+L113, S1501+L113);
	    S1501 = ROTL3(S1500 + L113+S1401);
	    L113  = ROTL(L112 + S1401+L013, S1401+L013);
	    S1401 = ROTL3(S1400 + L013+S1301);
	    L013  = ROTL(L012 + S1301+L213, S1301+L213);
	    S1301 = ROTL3(S1300 + L213+S1201);
	    L213  = ROTL(L212 + S1201+L112, S1201+L112);
	    S1201 = ROTL3(S1200 + L112+S1102);
	    L112  = ROTL(L111 + S1102+L012, S1102+L012);
	    S1102 = ROTL3(S1101 + L012+S1001);
	    L012  = ROTL(L011 + S1001+L212, S1001+L212);
	    S1001 = ROTL3(S1000 + L212+S901);
	    L212  = ROTL(L211 + S901+L111, S901+L111);
	    S901  = ROTL3(S900 + L111+S801);
	    L111  = ROTL(L110 + S801+L011, S801+L011);
	    S801  = ROTL3(S800 + L011+S701);
	    L011  = ROTL(L010 + L211+S701, L211+S701);
	    S701  = ROTL3(S700 + L211+S601);
	    L211  = ROTL(L210 + S601+L110, S601+L110);
	    S601  = ROTL3(S600 + L110+S501);
	    L110  = ROTL(L109 + S501+L010, S501+L010);
	    S501  = ROTL3(S500 + L010+S401);
	    L010  = ROTL(L09 + S401+L210, L210+S401);
	    S401  = ROTL3(S400+ L210+S301);
	    L210  = ROTL(L209 + S301+L109, S301+L109);
	    S301  = ROTL3(S300 + L109+S201);
	    L109  = ROTL(L108 + S201+L09, S201+L09);
	    S201  = ROTL3(S200 + L09+S100);
	    L09  = ROTL(L08 + S100+L209, S100+L209);
	    S100  = ROTL3(S101 + L209+S00);
	    L209  = ROTL(L208 + S00+L108, S00+L108);
	    S00  = ROTL3(32'hbf0a8b1d + L108+S2500);

	    L108  = ROTL(L107 + S2500+L08, S2500+L08);
	    S2500 = ROTL3(32'h2b4c3474 + L08+S2400);
	    L08  = ROTL(L07 + S2400+L208, S2400+L208);
	    S2400 = ROTL3(32'h8d14babb + L208+S2300);
	    L208  = ROTL(L207 + L107+S2300, S2300+L107);
	    S2300 = ROTL3(32'heedd4102 + L107+S2200);
	    L107  = ROTL(L106 + S2200+L07, S2200+L07);
	    S2200 = ROTL3(32'h50a5c749 + L07+S2100);
	    L07  = ROTL(L06 + S2100+L207, S2100+L207);
	    S2100 = ROTL3(32'hb26e4d90 + L207+S2000);
	    L207  = ROTL(L206 + S2000+L106, S2000+L106);
	    S2000 = ROTL3(32'h1436d3d7 + S1900+L106);
	    L106  = ROTL(L105 + S1900+L06, S1900+L06);
	    S1900 = ROTL3(32'h75ff5a1e + S1800+L06);
	    L06  = ROTL(L05 + S1800+L206, S1800+L206);
	    S1800 = ROTL3(32'hd7c7e065 + L206+S1700);
	    L206  = ROTL(L205 + S1700+L105, S1700+L105);
	    S1700 = ROTL3(32'h399066ac + L105+S1600);
	    L105  = ROTL(L104 + S1600+L05, S1600+L05);
	    S1600 = ROTL3(32'h9b58ecf3 + S1500+L05);
	    L05  = ROTL(L04 + S1500+L205, S1500+L205);
	    S1500 = ROTL3(32'hfd21733a + L205+S1400);
	    L205  = ROTL(L204 + S1400+L104, S1400+L104);
	    S1400 = ROTL3(32'h5ee9f981 + L104+S1300);
	    L104  = ROTL(L103 + S1300+L04, S1300+L04);
	    S1300 = ROTL3(32'hc0b27fc8 + L04+S1200);
	    L04  = ROTL(L03 + S1200+L204, S1200+L204);
	    S1200 = ROTL3(32'h227b060f + L204+S1101);
	    L204  = ROTL(L203 + S1101+L103, S1101+L103);
	    S1101 = ROTL3(32'h84438c56 + L103+S1000);
	    L103  = ROTL(L102 + S1000 + L03, S1000 + L03);
	    S1000 = ROTL3(32'he60c129d + L03 + S900);
	    L03 = ROTL(L02 + S900 + L203, S900 + L203);
	    S900 = ROTL3(32'h47d498e4 + S800 + L203);
	    L203 = ROTL(L202 + S800 + L102, S800 + L102);
	    S800 = ROTL3(32'ha99d1f2b + S700 + L102);
	    L102 = ROTL(L101 + S700 + L02, S700 + L02);
	    S700 = ROTL3(32'h0b65a572 + L02 + S600);
	    L02 = ROTL(L01 + L202 + S600, L202 + S600);
	    S600 = ROTL3(32'h6d2e2bb9 + S500 + L202);
	    L202 = ROTL(L201 + S500 + L101, S500 + L101);
	    S500 = ROTL3(32'hcef6b200 + S400 + L101);
	    L101 = ROTL(L100 + S400 + L01, S400 + L01);
	    S400 = ROTL3(32'h30bf3847 + S300 + L01);
	    L01 = ROTL(L00 + S300 + L201, S300 + L201);
	    S300 = ROTL3(32'h9287be8e + S200 + L201);
	    L201 = ROTL(key2 + S200 + L100, S200 + L100);
	    S200 = ROTL3(32'hf45044d5 + L100 + S101);
	    L100 = ROTL(key1 + S101 + L00, S101 + L00);
	    S101 = ROTL3(32'h5618cb1c + 32'hbf0a8b1d + L00);
	    L00 = ROTL(key0 + 32'hbf0a8b1d, 32'hbf0a8b1d);
	    end
	end
	endmodule
	   
	///////////////
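
One way to check the "simulates correctly" claim is to compare the module's A and B outputs against a plain software model of RC5-32/12/9, feeding a different key on every clock rather than holding one key constant. The following is a minimal sketch in C, assuming key0, key1 and key2 map directly onto the three key words L[0..2] (which is what the first pipeline stages suggest). The function names are hypothetical, and the byte ordering of a distributed.net key is not addressed. The constants 32'hbf0a8b1d and 32'h5618cb1c in the Verilog correspond to ROTL3(P32) and P32+Q32 below.

```c
#include <stdint.h>

#define RC5_T      26            /* expanded key words: 2 * (12 rounds + 1) */
#define RC5_C      3             /* key words: 9 bytes packed into 3 x 32 bits */
#define RC5_ROUNDS 12
#define RC5_P32    0xb7e15163u   /* standard RC5 magic constants for w = 32 */
#define RC5_Q32    0x9e3779b9u

uint32_t rc5_rotl(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

uint32_t rc5_rotr(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Key expansion: three passes over the 26 S words, mixing in the key words. */
void rc5_setup(const uint32_t key[RC5_C], uint32_t S[RC5_T])
{
    uint32_t L[RC5_C], A = 0, B = 0;
    int i, j, k;

    for (i = 0; i < RC5_C; i++)
        L[i] = key[i];
    S[0] = RC5_P32;
    for (i = 1; i < RC5_T; i++)
        S[i] = S[i - 1] + RC5_Q32;

    for (k = 0, i = 0, j = 0; k < 3 * RC5_T; k++) {
        A = S[i] = rc5_rotl(S[i] + A + B, 3);
        B = L[j] = rc5_rotl(L[j] + A + B, A + B);
        i = (i + 1) % RC5_T;
        j = (j + 1) % RC5_C;
    }
}

/* Encrypt one 64-bit block held as two 32-bit words, in place. */
void rc5_encrypt(const uint32_t S[RC5_T], uint32_t *a, uint32_t *b)
{
    uint32_t A = *a + S[0], B = *b + S[1];
    int r;

    for (r = 1; r <= RC5_ROUNDS; r++) {
        A = rc5_rotl(A ^ B, B) + S[2 * r];
        B = rc5_rotl(B ^ A, A) + S[2 * r + 1];
    }
    *a = A;
    *b = B;
}

/* Inverse of rc5_encrypt, used here only as a self-check of the model. */
void rc5_decrypt(const uint32_t S[RC5_T], uint32_t *a, uint32_t *b)
{
    uint32_t A = *a, B = *b;
    int r;

    for (r = RC5_ROUNDS; r >= 1; r--) {
        B = rc5_rotr(B - S[2 * r + 1], A) ^ A;
        A = rc5_rotr(A - S[2 * r], B) ^ B;
    }
    *a = A - S[0];
    *b = B - S[1];
}
```

To use it, call rc5_setup with the same three key words driven onto key0..key2, then rc5_encrypt with the plaintext words 0x3f3ca653 and 0x2ff17af3 from the Verilog, and compare the result with the A and B outputs on the clock when that key leaves the pipe.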

