Messages from 160525

Article: 160525
Subject: Re: How to handle a data packet while calculating CRC.
From: gtwrek@sonic.net (gtwrek)
Date: Fri, 16 Mar 2018 00:29:52 -0000 (UTC)

In article <almarsoft.7623430815815314146@news.eternal-september.org>,
Bart Fox  <bartfox@gmx.net> wrote:
>On Tue, 13 Mar 2018 19:28:17 -0000 (UTC), gtwrek@sonic.net (gtwrek) 
>wrote:
>> Table methods are very likely NOT the correct solution for FPGA
>> implementations.  Those methods are tuned for SW solutions.
>
>Sorry for respeak.Of course is the usage of a lookup-table a valid 
>design technic for FPGAs.
>Maybe not in the range of megabytes, but e.g. for 8 bit input and 8 
>bit output it's very acceptable.
>One example: a low fidelity DDS usually use a small ROM with a sine 
>table...

Table lookups, in general, are a very valid tool for some hardware
solutions.  Just not for CRCs.  A "brute force" table lookup for an
8-bit input, 8-bit CRC would require 2**16 entries by 8 bits.  So a 
64 KByte table.  Not efficient.  So one looks up the many
"software" table-generation techniques to reduce the requirements.
The problem is that those techniques are tuned to optimize SW, not HW.

... All to replace fewer than 100 LUT6s and 8 FFs.

CRCs in hardware are very efficient when simply coded brute force.

Regards,

Mark

Article: 160526
Subject: Re: How to handle a data packet while calculating CRC.
From: lasselangwadtchristensen@gmail.com
Date: Thu, 15 Mar 2018 17:57:04 -0700 (PDT)

Den fredag den 16. marts 2018 kl. 01.29.54 UTC+1 skrev gtwrek:
> In article <almarsoft.7623430815815314146@news.eternal-september.org>,
> Bart Fox  <bartfox@gmx.net> wrote:
> >On Tue, 13 Mar 2018 19:28:17 -0000 (UTC), gtwrek@sonic.net (gtwrek) 
> >wrote:
> >> Table methods are very likely NOT the correct solution for FPGA
> >> implementations.  Those methods are tuned for SW solutions.
> >
> >Sorry for respeak.Of course is the usage of a lookup-table a valid 
> >design technic for FPGAs.
> >Maybe not in the range of megabytes, but e.g. for 8 bit input and 8 
> >bit output it's very acceptable.
> >One example: a low fidelity DDS usually use a small ROM with a sine 
> >table...
> 
> Table lookups, in general, are a very valid tool for some hardware
> solutions.  Just not for CRCs.  A "brute force" table lookup for an
> 8-bit input, 8-bit CRC would require 2**16 entries by 8 bits.  So a 
> 64 KByte table.  Not efficient.  So one looks up many of the 
> "Software" table generated techniques to reduce the requirements.  
> Problem is those techniques are tuned to optimizes SW, not HW.  
> 
> ... All to replace something less than a 100 LUT6s, and 8 FFs.  
> 
> CRCs in hardware are very efficient just coded brute force.
> 

http://www.sunshine2k.de/articles/coding/crc/understanding_crc.html#ch6

doesn't look too bad

Article: 160527
Subject: Re: How to handle a data packet while calculating CRC.
From: gtwrek@sonic.net (gtwrek)
Date: Fri, 16 Mar 2018 01:09:35 -0000 (UTC)

In article <b12f68ce-d635-4556-8fdc-ece405559d98@googlegroups.com>,
 <lasselangwadtchristensen@gmail.com> wrote:
>Den fredag den 16. marts 2018 kl. 01.29.54 UTC+1 skrev gtwrek:
>> In article <almarsoft.7623430815815314146@news.eternal-september.org>,
>> Bart Fox  <bartfox@gmx.net> wrote:
>> >On Tue, 13 Mar 2018 19:28:17 -0000 (UTC), gtwrek@sonic.net (gtwrek) 
>> >wrote:
>> >> Table methods are very likely NOT the correct solution for FPGA
>> >> implementations.  Those methods are tuned for SW solutions.
>> >
>> >Sorry for respeak.Of course is the usage of a lookup-table a valid 
>> >design technic for FPGAs.
>> >Maybe not in the range of megabytes, but e.g. for 8 bit input and 8 
>> >bit output it's very acceptable.
>> >One example: a low fidelity DDS usually use a small ROM with a sine 
>> >table...
>> 
>> Table lookups, in general, are a very valid tool for some hardware
>> solutions.  Just not for CRCs.  A "brute force" table lookup for an
>> 8-bit input, 8-bit CRC would require 2**16 entries by 8 bits.  So a 
>> 64 KByte table.  Not efficient.  So one looks up many of the 
>> "Software" table generated techniques to reduce the requirements.  
>> Problem is those techniques are tuned to optimizes SW, not HW.  
>> 
>> ... All to replace something less than a 100 LUT6s, and 8 FFs.  
>> 
>> CRCs in hardware are very efficient just coded brute force.
>> 
>
>http://www.sunshine2k.de/articles/coding/crc/understanding_crc.html#ch6
>
>doesn't look too bad

My favorite reference is:
http://www.ross.net/crc/download/crc_v3.txt

Hardware engineers should STOP reading after section 8.  That's all you
need to know as a hardware designer.  Everything after section 8 is
dedicated to software optimizations where one doesn't have free access
to any bit - like we do in hardware.

Regards,

Mark

Article: 160528
Subject: Re: How to handle a data packet while calculating CRC.
From: David Brown <david.brown@hesbynett.no>
Date: Fri, 16 Mar 2018 09:15:38 +0100

On 16/03/18 01:29, gtwrek wrote:
> In article <almarsoft.7623430815815314146@news.eternal-september.org>,
> Bart Fox  <bartfox@gmx.net> wrote:
>> On Tue, 13 Mar 2018 19:28:17 -0000 (UTC), gtwrek@sonic.net (gtwrek)
>> wrote:
>>> Table methods are very likely NOT the correct solution for FPGA
>>> implementations.  Those methods are tuned for SW solutions.
>>
>> Sorry for respeak.Of course is the usage of a lookup-table a valid
>> design technic for FPGAs.
>> Maybe not in the range of megabytes, but e.g. for 8 bit input and 8
>> bit output it's very acceptable.
>> One example: a low fidelity DDS usually use a small ROM with a sine
>> table...
> 
> Table lookups, in general, are a very valid tool for some hardware
> solutions.  Just not for CRCs.  A "brute force" table lookup for an
> 8-bit input, 8-bit CRC would require 2**16 entries by 8 bits.  So a
> 64 KByte table.  Not efficient.  

No, it is 8-bit in (the address), 32-bit out for a 32-bit CRC - 1024 bytes.

> So one looks up many of the
> "Software" table generated techniques to reduce the requirements.
> Problem is those techniques are tuned to optimizes SW, not HW.
> 
> ... All to replace something less than a 100 LUT6s, and 8 FFs.
> 
> CRCs in hardware are very efficient just coded brute force.
> 

CRCs in hardware, using a shift register, are very efficient if your 
data is coming in a bit at a time.  If you have data in memory or 
arriving in a wider path, table lookup can be very useful.  You can 
choose your sizes to give a balance between speed and size.  For 
example, you could use 4-bit in, 32-bit out tables and run 4 bits at a 
time.  (Such a solution can be pipelined for greater throughput.)
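
For reference, the byte-wide table described above can be modelled in a few lines of Python. This is only a software sketch, assuming the usual reflected Ethernet/zlib CRC-32 convention (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF); the function and table names are made up for illustration. It shows the 256-entry, 32-bit-wide table (1024 bytes) next to the 16-entry nibble variant.

```python
import zlib

POLY_REFLECTED = 0xEDB88320  # CRC-32 polynomial in bit-reversed (LSB-first) form


def make_table(index_bits):
    """Build a table of 2**index_bits 32-bit words, one per input pattern."""
    table = []
    for value in range(1 << index_bits):
        crc = value
        for _ in range(index_bits):
            crc = (crc >> 1) ^ (POLY_REFLECTED if crc & 1 else 0)
        table.append(crc)
    return table


TABLE8 = make_table(8)  # 256 entries x 4 bytes = 1024 bytes
TABLE4 = make_table(4)  # 16 entries x 4 bytes = 64 bytes


def crc32_bytewise(data):
    """Consume 8 bits per table lookup."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE8[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


def crc32_nibblewise(data):
    """Consume 4 bits per table lookup, low nibble of each byte first."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 4) ^ TABLE4[(crc ^ byte) & 0xF]
        crc = (crc >> 4) ^ TABLE4[(crc ^ (byte >> 4)) & 0xF]
    return crc ^ 0xFFFFFFFF


message = b"123456789"
print(crc32_bytewise(message) == zlib.crc32(message))
print(crc32_nibblewise(message) == zlib.crc32(message))
```

The nibble version trades two lookups per byte for a table sixteen times smaller, which is the speed/size balance mentioned above.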

Article: 160529
Subject: Re: How to handle a data packet while calculating CRC.
From: gtwrek@sonic.net (gtwrek)
Date: Fri, 16 Mar 2018 14:52:19 -0000 (UTC)

In article <p8fufb$8es$1@dont-email.me>,
David Brown  <david.brown@hesbynett.no> wrote:
>On 16/03/18 01:29, gtwrek wrote:
>> In article <almarsoft.7623430815815314146@news.eternal-september.org>,
>> Bart Fox  <bartfox@gmx.net> wrote:
>>> On Tue, 13 Mar 2018 19:28:17 -0000 (UTC), gtwrek@sonic.net (gtwrek)
>>> wrote:
>>>> Table methods are very likely NOT the correct solution for FPGA
>>>> implementations.  Those methods are tuned for SW solutions.
>>>
>>> Sorry for respeak.Of course is the usage of a lookup-table a valid
>>> design technic for FPGAs.
>>> Maybe not in the range of megabytes, but e.g. for 8 bit input and 8
>>> bit output it's very acceptable.
>>> One example: a low fidelity DDS usually use a small ROM with a sine
>>> table...
>> 
>> Table lookups, in general, are a very valid tool for some hardware
>> solutions.  Just not for CRCs.  A "brute force" table lookup for an
>> 8-bit input, 8-bit CRC would require 2**16 entries by 8 bits.  So a
>> 64 KByte table.  Not efficient.  
>
>No, it is 8-bit in (the address), 32-bit out for a 32-bit CRC - 1024 bytes.

So the "brute force" table is even larger.  
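
The two table sizes in this exchange appear to count different things. A rough sketch of the arithmetic, assuming the "brute force" table is addressed by the current CRC state together with the new data, while the software-style table is addressed by the data width only (the helper names are illustrative):

```python
def brute_force_table_bytes(data_bits, crc_bits):
    """Table addressed by {current CRC, new data}; each entry is the next CRC."""
    entries = 1 << (data_bits + crc_bits)
    return entries * crc_bits // 8


def software_table_bytes(data_bits, crc_bits):
    """Table addressed by the data width only, as in the usual software method."""
    entries = 1 << data_bits
    return entries * crc_bits // 8


print(brute_force_table_bytes(8, 8))   # 2**16 entries of 1 byte: 65536 bytes
print(software_table_bytes(8, 32))     # 256 entries of 4 bytes: 1024 bytes
print(brute_force_table_bytes(8, 32))  # 2**40 entries of 4 bytes
```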

>> So one looks up many of the
>> "Software" table generated techniques to reduce the requirements.
>> Problem is those techniques are tuned to optimizes SW, not HW.
>> 
>> ... All to replace something less than a 100 LUT6s, and 8 FFs.
>> 
>> CRCs in hardware are very efficient just coded brute force.
>> 
>
>CRC's in hardware, using a shift register, are very efficient if your 
>data is coming in a bit at a time.  If you have data in memory or 
>arriving in a wider path, table lookup can be very useful.  You can 

I don't agree. Those table methods described in all those papers you
find are tuned for software solutions, not hardware.   It's quite easy 
to handle one-bit, or N-bits at a time in hardware.

Code up a small amount of logic to shift one bit at a time - in whatever
language.

next_crc = shift_crc( current_crc, new_data_bit_in );

The shift_crc function is very simple to implement in Verilog/VHDL.

Now for N bits at a time, stick a 'for' loop around that function
call... Done.  The for loop (as always in hardware) describes parallel
hardware, not sequential operation.  And it works to do exactly what is
desired. You get a clear description of what's happening in shockingly
few lines of code.  It reproduces what all those online CRC generators
spit out as a glob of random XOR code.... 
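
A bit-accurate Python model of that idea, offered as a sketch rather than as the poster's actual code: `shift_crc` advances the register by one bit, and `next_crc_byte` wraps it in a loop over eight bits. It assumes the MSB-first CRC-32 polynomial 0x04C11DB7 and that data bit 7 is shifted in first, which is the convention stated in the generated nextCRC32_D8 function quoted later in this thread. In an HDL the loop would unroll into parallel XOR logic; here it simply runs eight times. The helper `matches_generated_equations` is hypothetical and spot-checks four output bits against that generated code.

```python
POLY = 0x04C11DB7  # x^32 + x^26 + ... + x + 1, with the x^32 term implied
MASK = 0xFFFFFFFF


def shift_crc(crc, data_bit):
    """Advance a 32-bit CRC register by one input bit (MSB-first form)."""
    feedback = ((crc >> 31) & 1) ^ (data_bit & 1)
    crc = (crc << 1) & MASK
    return crc ^ POLY if feedback else crc


def next_crc_byte(crc, data):
    """Apply shift_crc to the eight bits of data, bit 7 first."""
    for i in range(7, -1, -1):
        crc = shift_crc(crc, (data >> i) & 1)
    return crc


def bit(value, index):
    """Return bit number index of value."""
    return (value >> index) & 1


def matches_generated_equations(crc, data):
    """Compare bits 0, 20, 22 and 31 with the generated XOR equations."""
    new = next_crc_byte(crc, data)
    return (
        bit(new, 0) == bit(data, 6) ^ bit(data, 0) ^ bit(crc, 24) ^ bit(crc, 30)
        and bit(new, 20) == bit(data, 4) ^ bit(crc, 12) ^ bit(crc, 28)
        and bit(new, 22) == bit(data, 0) ^ bit(crc, 14) ^ bit(crc, 24)
        and bit(new, 31) == bit(data, 5) ^ bit(crc, 23) ^ bit(crc, 29)
    )
```

Widening the data path only changes the loop bound, so the same few lines describe a 1-bit, 8-bit or 64-bit CRC step.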

If that's not fast enough, then look at pipelining it.  Table methods
are hardly ever the right solution for CRCs in hardware.  

Burning even one block memory for a CRC calculation in hardware just seems
absurdly excessive to me.  But maybe your designs have a lot of spare
block memories, (and few LUTs available).

Regards,

Mark

Article: 160530
Subject: Re: How to handle a data packet while calculating CRC.
From: Adam Górski <gorskiamalpawpkropkapeel_@xx>
Date: Fri, 16 Mar 2018 15:59:19 +0100

On 2018-03-15 18:35, Adam Górski wrote:
> 
>>>> Hi,
>>>>
>>>> I'm trying to process a Ethernet type package. Suppose if i have 
>>>> detected SFD and now have a  <1600Byte  data.
>>>>
>>>> I'm extracting different package element(ds_addr,src_addr,etc) 
>>>> concatenating them in a long shift register and at same time passing 
>>>> it to a fifo to buffer and calculating crc32 which will take some 
>>>> clock cycles(xoring and shifting). Now if calculated CRC matched 
>>>> what is received, pass data to nxt stage else rst fifo.
>>>>
>>>> Is there a better technique for it?
>>>>
>>>> Thank-You in advance.
>>>>
>>>
>>> Hi
>>>
>>> Calculate CRC on-the-fly together with incoming data.
>>>
>>> Adam
>>
>> Hi Adam,
>>
>>   "Calculate CRC on-the-fly together with incoming data." , can you 
>> elaborate it a bit more.
>> I'm getting a 8bit data in one clock cycle from the decoder. Now for 
>> crc i need serial shift register.
>>
> 
> So look for CRC implementation able to process 8bits ( byte ) in single 
> clock and store data in same time to fifo and to CRC unit.
> 
> Hint: Online CRC VHDL generator. There is many.
> 
> Best regards
> 
> Adam Górski

Example:

Ethernet CRC generator code (from http://www.easics.com/webtools/crctool):

--------------------------------------------------------------------------------
-- Copyright (C) 1999-2008 Easics NV.
-- This source file may be used and distributed without restriction
-- provided that this copyright statement is not removed from the file
-- and that any derivative work contains the original copyright notice
-- and the associated disclaimer.
--
-- THIS SOURCE FILE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS
-- OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
-- WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
--
-- Purpose : synthesizable CRC function
--   * polynomial: x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10
--                 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + 1
--   * data width: 8
--
-- Info : tools@easics.be
--        http://www.easics.com
--------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

package PCK_CRC32_D8 is
   -- polynomial: x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10
   --             + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + 1
   -- data width: 8
   -- convention: the first serial bit is D[7]
   function nextCRC32_D8
     (Data: std_logic_vector(7 downto 0);
      crc:  std_logic_vector(31 downto 0))
     return std_logic_vector;
end PCK_CRC32_D8;


package body PCK_CRC32_D8 is

   -- polynomial: x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10
   --             + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + 1
   -- data width: 8
   -- convention: the first serial bit is D[7]
   function nextCRC32_D8
     (Data: std_logic_vector(7 downto 0);
      crc:  std_logic_vector(31 downto 0))
     return std_logic_vector is

     variable d:      std_logic_vector(7 downto 0);
     variable c:      std_logic_vector(31 downto 0);
     variable newcrc: std_logic_vector(31 downto 0);

   begin
     d := Data;
     c := crc;

     newcrc(0) := d(6) xor d(0) xor c(24) xor c(30);
     newcrc(1) := d(7) xor d(6) xor d(1) xor d(0) xor c(24) xor c(25)
         xor c(30) xor c(31);
     newcrc(2) := d(7) xor d(6) xor d(2) xor d(1) xor d(0) xor c(24)
         xor c(25) xor c(26) xor c(30) xor c(31);
     newcrc(3) := d(7) xor d(3) xor d(2) xor d(1) xor c(25) xor c(26)
         xor c(27) xor c(31);
     newcrc(4) := d(6) xor d(4) xor d(3) xor d(2) xor d(0) xor c(24)
         xor c(26) xor c(27) xor c(28) xor c(30);
     newcrc(5) := d(7) xor d(6) xor d(5) xor d(4) xor d(3) xor d(1)
         xor d(0) xor c(24) xor c(25) xor c(27) xor c(28) xor c(29)
         xor c(30) xor c(31);
     newcrc(6) := d(7) xor d(6) xor d(5) xor d(4) xor d(2) xor d(1)
         xor c(25) xor c(26) xor c(28) xor c(29) xor c(30) xor c(31);
     newcrc(7) := d(7) xor d(5) xor d(3) xor d(2) xor d(0) xor c(24)
         xor c(26) xor c(27) xor c(29) xor c(31);
     newcrc(8) := d(4) xor d(3) xor d(1) xor d(0) xor c(0) xor c(24)
         xor c(25) xor c(27) xor c(28);
     newcrc(9) := d(5) xor d(4) xor d(2) xor d(1) xor c(1) xor c(25)
         xor c(26) xor c(28) xor c(29);
     newcrc(10) := d(5) xor d(3) xor d(2) xor d(0) xor c(2) xor c(24)
         xor c(26) xor c(27) xor c(29);
     newcrc(11) := d(4) xor d(3) xor d(1) xor d(0) xor c(3) xor c(24)
         xor c(25) xor c(27) xor c(28);
     newcrc(12) := d(6) xor d(5) xor d(4) xor d(2) xor d(1) xor d(0)
         xor c(4) xor c(24) xor c(25) xor c(26) xor c(28) xor c(29)
         xor c(30);
     newcrc(13) := d(7) xor d(6) xor d(5) xor d(3) xor d(2) xor d(1)
         xor c(5) xor c(25) xor c(26) xor c(27) xor c(29) xor c(30)
         xor c(31);
     newcrc(14) := d(7) xor d(6) xor d(4) xor d(3) xor d(2) xor c(6)
         xor c(26) xor c(27) xor c(28) xor c(30) xor c(31);
     newcrc(15) := d(7) xor d(5) xor d(4) xor d(3) xor c(7) xor c(27)
         xor c(28) xor c(29) xor c(31);
     newcrc(16) := d(5) xor d(4) xor d(0) xor c(8) xor c(24) xor c(28)
         xor c(29);
     newcrc(17) := d(6) xor d(5) xor d(1) xor c(9) xor c(25) xor c(29)
         xor c(30);
     newcrc(18) := d(7) xor d(6) xor d(2) xor c(10) xor c(26) xor c(30)
         xor c(31);
     newcrc(19) := d(7) xor d(3) xor c(11) xor c(27) xor c(31);
     newcrc(20) := d(4) xor c(12) xor c(28);
     newcrc(21) := d(5) xor c(13) xor c(29);
     newcrc(22) := d(0) xor c(14) xor c(24);
     newcrc(23) := d(6) xor d(1) xor d(0) xor c(15) xor c(24) xor c(25)
         xor c(30);
     newcrc(24) := d(7) xor d(2) xor d(1) xor c(16) xor c(25) xor c(26)
         xor c(31);
     newcrc(25) := d(3) xor d(2) xor c(17) xor c(26) xor c(27);
     newcrc(26) := d(6) xor d(4) xor d(3) xor d(0) xor c(18) xor c(24)
         xor c(27) xor c(28) xor c(30);
     newcrc(27) := d(7) xor d(5) xor d(4) xor d(1) xor c(19) xor c(25)
         xor c(28) xor c(29) xor c(31);
     newcrc(28) := d(6) xor d(5) xor d(2) xor c(20) xor c(26) xor c(29)
         xor c(30);
     newcrc(29) := d(7) xor d(6) xor d(3) xor c(21) xor c(27) xor c(30)
         xor c(31);
     newcrc(30) := d(7) xor d(4) xor c(22) xor c(28) xor c(31);
     newcrc(31) := d(5) xor c(23) xor c(29);
     return newcrc;
   end nextCRC32_D8;

end PCK_CRC32_D8;
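
To tie this back to the original question, here is a small behavioural model in Python of the suggested flow: each received byte is written to the FIFO and fed to the CRC unit in the same step, and the frame is either committed or discarded at end of frame. It is a sketch under assumptions not stated in the thread: `zlib.crc32` stands in for the hardware CRC unit, the FCS is taken to be the last four bytes of the frame in little-endian order, and the `FrameFifo` class with its write-pointer rollback is hypothetical.

```python
import zlib

# zlib.crc32 over an error-free frame, FCS included, yields this constant.
GOOD_RESIDUE = 0x2144DF1C


class FrameFifo:
    """Byte FIFO whose write side can be rolled back to the last good frame."""

    def __init__(self):
        self.storage = []    # committed bytes plus the frame in progress
        self.committed = 0   # number of bytes visible to the next stage
        self.crc = 0         # running CRC of the frame in progress

    def push(self, byte):
        """Store one byte and update the CRC in the same step."""
        self.storage.append(byte)
        self.crc = zlib.crc32(bytes([byte]), self.crc)

    def end_of_frame(self):
        """Commit the frame if the CRC residue is good, otherwise drop it."""
        ok = self.crc == GOOD_RESIDUE
        if ok:
            self.committed = len(self.storage)
        else:
            del self.storage[self.committed:]
        self.crc = 0
        return ok


payload = b"hello, ethernet"
frame = payload + zlib.crc32(payload).to_bytes(4, "little")

fifo = FrameFifo()
for b in frame:
    fifo.push(b)
print(fifo.end_of_frame())  # True for an undamaged frame
```

Because the check runs over the FCS bytes as well, no extra clock cycles are spent after the last byte arrives; the only decision left at end of frame is whether to advance or rewind the write pointer.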

Article: 160531
Subject: the FPGA one-shot
From: John Larkin <jjlarkin@highland_snip_technology.com>
Date: Fri, 16 Mar 2018 11:18:25 -0700

I finally got a test case for my FPGA async one-shot idea, hacked into
a build for something else.

I got 17 different one-shots, with various pin locations and
speed/drive strength settings. 

https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1


Most of the outputs look like this, with remarkably consistent timing,
edges within a few hundred ps. This is typical:

https://www.dropbox.com/s/f6a3a66kxjfm776/DIV_RESET.JPG?raw=1

This one has minimum pin speed and drive strength, and was driving
another chip on the board:

https://www.dropbox.com/s/8sdm8dz36um7b1p/GD4.JPG?raw=1

So, it looks like it will be safe to do this. I need to reset some ECL
counters when an async event happens, and don't want to spin up a 500
MHz clock to do it.

The Xilinx tools didn't approve of us doing this.


-- 

John Larkin         Highland Technology, Inc
picosecond timing   precision measurement 

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com

Article: 160532
Subject: Re: the FPGA one-shot
From: gtwrek@sonic.net (gtwrek)
Date: Fri, 16 Mar 2018 18:30:50 -0000 (UTC)

In article <7d1oadljinht0pjm5scgih4n5eou1uqv9u@4ax.com>,
John Larkin  <jjlarkin@highland_snip_technology.com> wrote:
>I finally got a test case for my FPGA async one-shot idea, hacked into
>a build for something else.
>
>I got 17 different one-shots, with various pin locations and
>speed/drive strength settings. 
>
>https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1
>
>
>Most of the outputs look like this, with remarkably consistent timing,
>edges within a few hundred ps. This is typical:
>
>https://www.dropbox.com/s/f6a3a66kxjfm776/DIV_RESET.JPG?raw=1
>
>This one has minimum pin speed and drive strength, and was driving
>another chip on the board:
>
>https://www.dropbox.com/s/8sdm8dz36um7b1p/GD4.JPG?raw=1
>
>So, it looks like it will be safe to do this. I need to reset some ECL
>counters when an async event happens, and don't want to spin up a 500
>MHz clock to do it.
>
>The Xilinx tools didn't approve of us doing this.

John - did you force any special LOCs or other physical constraints
within the implementation?  Since the FF has an async clr, I don't think
it can go in the IOB (neither the input trigger pin nor the output
one-shot).  So, the FF is within the fabric.  Any special constraints on
the CLR signal route?

Regards,

Mark

Article: 160533
Subject: Re: the FPGA one-shot
From: Joerg <news@analogconsultants.com>
Date: Fri, 16 Mar 2018 11:43:54 -0700

On 2018-03-16 11:18, John Larkin wrote:
> I finally got a test case for my FPGA async one-shot idea, hacked into
> a build for something else.
>
> I got 17 different one-shots, with various pin locations and
> speed/drive strength settings.
>
> https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1
>
>
> Most of the outputs look like this, with remarkably consistent timing,
> edges within a few hundred ps. This is typical:
>
> https://www.dropbox.com/s/f6a3a66kxjfm776/DIV_RESET.JPG?raw=1
>
> This one has minimum pin speed and drive strength, and was driving
> another chip on the board:
>
> https://www.dropbox.com/s/8sdm8dz36um7b1p/GD4.JPG?raw=1
>
> So, it looks like it will be safe to do this. I need to reset some ECL
> counters when an async event happens, and don't want to spin up a 500
> MHz clock to do it.
>
> The Xilinx tools didn't approve of us doing this.
>

OK, the last one, which is driving across the board, I wouldn't consider 
reliable. The others look good; why don't they allow this? Don't they 
allow ring oscillators?

Unorthodox tricks are the most fun in electronics design.

-- 
Regards, Joerg

http://www.analogconsultants.com/

Article: 160534
Subject: Re: the FPGA one-shot
From: Jon Elson <jmelson@wustl.edu>
Date: Fri, 16 Mar 2018 13:50:28 -0500

John Larkin wrote:


> 
> The Xilinx tools didn't approve of us doing this.
> 
> 
That's no surprise.  My guess is that any tweak of the process could throw 
this off, too, but maybe not enough to cause you grief.

Jon

Article: 160535
Subject: Re: the FPGA one-shot
From: John Larkin <jjlarkin@highland_snip_technology.com>
Date: Fri, 16 Mar 2018 12:11:16 -0700

On Fri, 16 Mar 2018 18:30:50 -0000 (UTC), gtwrek@sonic.net (gtwrek)
wrote:

>In article <7d1oadljinht0pjm5scgih4n5eou1uqv9u@4ax.com>,
>John Larkin  <jjlarkin@highland_snip_technology.com> wrote:
>>I finally got a test case for my FPGA async one-shot idea, hacked into
>>a build for something else.
>>
>>I got 17 different one-shots, with various pin locations and
>>speed/drive strength settings. 
>>
>>https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1
>>
>>
>>Most of the outputs look like this, with remarkably consistent timing,
>>edges within a few hundred ps. This is typical:
>>
>>https://www.dropbox.com/s/f6a3a66kxjfm776/DIV_RESET.JPG?raw=1
>>
>>This one has minimum pin speed and drive strength, and was driving
>>another chip on the board:
>>
>>https://www.dropbox.com/s/8sdm8dz36um7b1p/GD4.JPG?raw=1
>>
>>So, it looks like it will be safe to do this. I need to reset some ECL
>>counters when an async event happens, and don't want to spin up a 500
>>MHz clock to do it.
>>
>>The Xilinx tools didn't approve of us doing this.
>
>John - did you force any special LOCs or other physical contraints
>within the implementation?  Since the FF has an async clr, I don't think
>it can go in the IOB (the input trigger pin nor the output one shot).  
>So, the FF is within the fabric.  Any special constraints on the CLR 
>signal route?
>
>Regards,
>
>Mark
>

I didn't code this, but I'll ask. The flop is in the i/o block, but
the clear path had to run through a nearby switchbox.

The trigger is from a regular global clock net.


-- 

John Larkin         Highland Technology, Inc
picosecond timing   precision measurement 

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com

Article: 160536
Subject: Re: the FPGA one-shot
From: John Larkin <jjlarkin@highland_snip_technology.com>
Date: Fri, 16 Mar 2018 12:17:01 -0700

On Fri, 16 Mar 2018 11:43:54 -0700, Joerg <news@analogconsultants.com>
wrote:

>On 2018-03-16 11:18, John Larkin wrote:
>> I finally got a test case for my FPGA async one-shot idea, hacked into
>> a build for something else.
>>
>> I got 17 different one-shots, with various pin locations and
>> speed/drive strength settings.
>>
>> https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1
>>
>>
>> Most of the outputs look like this, with remarkably consistent timing,
>> edges within a few hundred ps. This is typical:
>>
>> https://www.dropbox.com/s/f6a3a66kxjfm776/DIV_RESET.JPG?raw=1
>>
>> This one has minimum pin speed and drive strength, and was driving
>> another chip on the board:
>>
>> https://www.dropbox.com/s/8sdm8dz36um7b1p/GD4.JPG?raw=1
>>
>> So, it looks like it will be safe to do this. I need to reset some ECL
>> counters when an async event happens, and don't want to spin up a 500
>> MHz clock to do it.
>>
>> The Xilinx tools didn't approve of us doing this.
>>
>
>Ok, the last one that is driving across the board I wouldn't consider 
>reliable. The other looks good, why don't they allow this? Don't they 
>allow ring oscillators?

The last one was slowest i/o cell speed and 4 mA drive strength. It's
grunting to drive the trace capacitance and a chip pin.

10 layer boards can have a lot of trace capacitance.

>
>Unorthodox tricks are the most fun in electronics design.

The tools pitch hissy-fits when you do async stuff, like ring
oscillators. We are offending The Church Of Synchronous Design.


-- 

John Larkin         Highland Technology, Inc
picosecond timing   precision measurement 

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com

Article: 160537
Subject: Re: the FPGA one-shot
From: Joerg <news@analogconsultants.com>
Date: Fri, 16 Mar 2018 13:00:14 -0700

On 2018-03-16 12:17, John Larkin wrote:
> On Fri, 16 Mar 2018 11:43:54 -0700, Joerg <news@analogconsultants.com>
> wrote:
>
>> On 2018-03-16 11:18, John Larkin wrote:
>>> I finally got a test case for my FPGA async one-shot idea, hacked into
>>> a build for something else.
>>>
>>> I got 17 different one-shots, with various pin locations and
>>> speed/drive strength settings.
>>>
>>> https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1
>>>
>>>
>>> Most of the outputs look like this, with remarkably consistent timing,
>>> edges within a few hundred ps. This is typical:
>>>
>>> https://www.dropbox.com/s/f6a3a66kxjfm776/DIV_RESET.JPG?raw=1
>>>
>>> This one has minimum pin speed and drive strength, and was driving
>>> another chip on the board:
>>>
>>> https://www.dropbox.com/s/8sdm8dz36um7b1p/GD4.JPG?raw=1
>>>
>>> So, it looks like it will be safe to do this. I need to reset some ECL
>>> counters when an async event happens, and don't want to spin up a 500
>>> MHz clock to do it.
>>>
>>> The Xilinx tools didn't approve of us doing this.
>>>
>>
>> Ok, the last one that is driving across the board I wouldn't consider
>> reliable. The other looks good, why don't they allow this? Don't they
>> allow ring oscillators?
>
> The last one was slowest i/o cell speed and 4 mA drive strength. It's
> grunting to drive the trace capacitance and a chip pin.
>
> 10 layer boards can have a lot of trace capacitance.
>
>>
>> Unorthodox tricks are the most fun in electronics design.
>
> The tools pitch hissy-fits when you do async stuff, like ring
> oscillators. We are offending The Church Of Synchronous Design.
>

I have that a lot with RF parts. "Can you guys furnish SPICE data?" ... 
"No, only S-parameters" ... "I want to use it to pulse something" ... 
"It's an RF part, you aren't supposed to do that".

Like when I use my mountain bike to rush a package to the Fedex depot 
because I can let some air out of the rear shock and then it glides like 
a Lincoln, over a very dilapidated stretch of the old Lincoln Highway. I 
even mounted a cargo platform to it. A Lycra-clad pro biker looked at me 
in disgust "You do WHAT with it?"

-- 
Regards, Joerg

http://www.analogconsultants.com/

Article: 160538
Subject: Re: the FPGA one-shot
From: John Larkin <jjlarkin@highland_snip_technology.com>
Date: Fri, 16 Mar 2018 14:01:45 -0700

On Fri, 16 Mar 2018 13:00:14 -0700, Joerg <news@analogconsultants.com>
wrote:

>On 2018-03-16 12:17, John Larkin wrote:
>> On Fri, 16 Mar 2018 11:43:54 -0700, Joerg <news@analogconsultants.com>
>> wrote:
>>
>>> On 2018-03-16 11:18, John Larkin wrote:
>>>> I finally got a test case for my FPGA async one-shot idea, hacked into
>>>> a build for something else.
>>>>
>>>> I got 17 different one-shots, with various pin locations and
>>>> speed/drive strength settings.
>>>>
>>>> https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1
>>>>
>>>>
>>>> Most of the outputs look like this, with remarkably consistent timing,
>>>> edges within a few hundred ps. This is typical:
>>>>
>>>> https://www.dropbox.com/s/f6a3a66kxjfm776/DIV_RESET.JPG?raw=1
>>>>
>>>> This one has minimum pin speed and drive strength, and was driving
>>>> another chip on the board:
>>>>
>>>> https://www.dropbox.com/s/8sdm8dz36um7b1p/GD4.JPG?raw=1
>>>>
>>>> So, it looks like it will be safe to do this. I need to reset some ECL
>>>> counters when an async event happens, and don't want to spin up a 500
>>>> MHz clock to do it.
>>>>
>>>> The Xilinx tools didn't approve of us doing this.
>>>>
>>>
>>> Ok, the last one that is driving across the board I wouldn't consider
>>> reliable. The other looks good, why don't they allow this? Don't they
>>> allow ring oscillators?
>>
>> The last one was slowest i/o cell speed and 4 mA drive strength. It's
>> grunting to drive the trace capacitance and a chip pin.
>>
>> 10 layer boards can have a lot of trace capacitance.
>>
>>>
>>> Unorthodox tricks are the most fun in electronics design.
>>
>> The tools pitch hissy-fits when you do async stuff, like ring
>> oscillators. We are offending The Church Of Synchronous Design.
>>
>
>I have that a lot with RF parts. "Can you guys furnish SPICE data?" ... 
>"No, only S-parameters" ... "I want to used it to pulse something" ... 
>"It's an RF part, you aren't supposed to do that".

Exactly. I asked Mini-Circuits "Does that MMIC invert the signal?" and
they didn't know. The RF boys just slosh stuff around by the bucket
full.

PHEMT DC transfer function? Rds-on? Leakage? C-V curve? Ha!

I know more about a lot of "RF" parts than the makers do.


>
>Like when I use my mountain bike to rush a package to the Fedex depot 
>because I can let some air out of the rear shock and then it glides like 
>a Lincoln, over a very dilapidated stretch of the old Lincoln Highway. I 
>even mounted a cargo platform to it. A Lycra-clad pro biker looked at me 
>in disgust "You do WHAT with it?"


The Lincoln Highway was cool. In Truckee it is now Donner Pass Road.
It was the first coast-to-coast auto highway, starting in Times Square
in Manhattan and ending in Lincoln Park (now a golf course) in San
Francisco. 3389 miles long.

I drive it to get to Sugar Bowl. It's spectacular.

https://www.dropbox.com/s/5hsohvy2ogacmbf/CW_Donner_Lake.jpg?raw=1

https://www.dropbox.com/s/5x685s2vb5xtxvd/Rainbow_Bridge.jpg?raw=1

https://en.wikipedia.org/wiki/Lincoln_Highway



-- 

John Larkin         Highland Technology, Inc
picosecond timing   precision measurement 

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com

Article: 160539
Subject: Re: the FPGA one-shot
From: Joerg <news@analogconsultants.com>
Date: Fri, 16 Mar 2018 14:51:47 -0700

On 2018-03-16 14:01, John Larkin wrote:
> On Fri, 16 Mar 2018 13:00:14 -0700, Joerg <news@analogconsultants.com>
> wrote:
>

[...]

>> Like when I use my mountain bike to rush a package to the Fedex depot
>> because I can let some air out of the rear shock and then it glides like
>> a Lincoln, over a very dilapidated stretch of the old Lincoln Highway. I
>> even mounted a cargo platform to it. A Lycra-clad pro biker looked at me
>> in disgust "You do WHAT with it?"
>
>
> The Lincoln Highway was cool. In Truckee it is now Donner Pass Road.
> It was the first coast-to-coast auto highway, starting in Times Square
> in Manhattan and ending in Lincoln Park (now a golf course) in San
> Francisco. 3389 miles long.
>
> I drive it to get to Sugar Bowl. It's spectacular.
>
> https://www.dropbox.com/s/5hsohvy2ogacmbf/CW_Donner_Lake.jpg?raw=1
>
> https://www.dropbox.com/s/5x685s2vb5xtxvd/Rainbow_Bridge.jpg?raw=1
>
> https://en.wikipedia.org/wiki/Lincoln_Highway
>

In our area it looks more like this and that's the smoother part:

http://www.edhhistory.org/images/photo-gallery/lincoln-hwy-cleanup-day-9-22-16/thumbs/IMG_2519.jpg

On a road bike it can be bone-jarring but the mountain bike rides very 
smoothly.

-- 
Regards, Joerg

http://www.analogconsultants.com/

Article: 160540
Subject: Re: the FPGA one-shot
From: Gerhard Hoffmann <gerhard@hoffmann-hochfrequenz.de>
Date: Sat, 17 Mar 2018 00:50:33 +0100

Am 16.03.2018 um 22:01 schrieb John Larkin:

>>
>> I have that a lot with RF parts. "Can you guys furnish SPICE data?" ...
>> "No, only S-parameters" ... "I want to used it to pulse something" ...
>> "It's an RF part, you aren't supposed to do that".
> 
> Exactly. I asked Mini-Circuits "Does that MMIC invert the signal?" and
> they didn't know. 

  The s-parameters answer that question better than yes or no.  :-)

> The RF boys just slosh stuff around by the bucket
> full.
> 
> PHEMT DC transfer function? Rds-on? Leakage? C-V curve? Ha!
> 
> I know more about a lot of "RF" parts than the makers do.
>

Article: 160541
Subject: Re: the FPGA one-shot
From: John Larkin <jjlarkin@highland_snip_technology.com>
Date: Fri, 16 Mar 2018 17:28:15 -0700
Links: << >> << T >> << A >>

On Sat, 17 Mar 2018 00:50:33 +0100, Gerhard Hoffmann
<gerhard@hoffmann-hochfrequenz.de> wrote:

>Am 16.03.2018 um 22:01 schrieb John Larkin:
>
>>>
>>> I have that a lot with RF parts. "Can you guys furnish SPICE data?" ...
>>> "No, only S-parameters" ... "I want to use it to pulse something" ...
>>> "It's an RF part, you aren't supposed to do that".
>> 
>> Exactly. I asked Mini-Circuits "Does that MMIC invert the signal?" and
>> they didn't know. 
>
>  The s-parameters answer that question better than yes or no.  :-)

Almost 50% of the time!


-- 

John Larkin         Highland Technology, Inc
picosecond timing   precision measurement 

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com

Article: 160542
Subject: Re: How to handle a data packet while calculating CRC.
From: David Brown <david.brown@hesbynett.no>
Date: Mon, 19 Mar 2018 10:50:13 +0100
Links: << >> << T >> << A >>

On 16/03/18 15:52, gtwrek wrote:
> In article <p8fufb$8es$1@dont-email.me>,
> David Brown  <david.brown@hesbynett.no> wrote:
>> On 16/03/18 01:29, gtwrek wrote:
>>> In article <almarsoft.7623430815815314146@news.eternal-september.org>,
>>> Bart Fox  <bartfox@gmx.net> wrote:
>>>> On Tue, 13 Mar 2018 19:28:17 -0000 (UTC), gtwrek@sonic.net (gtwrek)
>>>> wrote:
>>>>> Table methods are very likely NOT the correct solution for FPGA
>>>>> implementations.  Those methods are tuned for SW solutions.
>>>>
>>>> Sorry for respeak.Of course is the usage of a lookup-table a valid
>>>> design technic for FPGAs.
>>>> Maybe not in the range of megabytes, but e.g. for 8 bit input and 8
>>>> bit output it's very acceptable.
>>>> One example: a low fidelity DDS usually use a small ROM with a sine
>>>> table...
>>>
>>> Table lookups, in general, are a very valid tool for some hardware
>>> solutions.  Just not for CRCs.  A "brute force" table lookup for an
>>> 8-bit input, 8-bit CRC would require 2**16 entries by 8 bits.  So a
>>> 64 KByte table.  Not efficient.  
>>
>> No, it is 8-bit in (the address), 32-bit out for a 32-bit CRC - 1024 bytes.
> 
> So the "brute force" table is even larger.  

No, it is not.

Let's take the simple case of 8-bit CRC, with data coming in 8-bit
lumps.  This is the C code for it:

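#include <stddef.h>
#include <stdint.h>

/* 256-entry table: entry i is the CRC of the single byte i */
static const uint8_t crcTable[256] = {
	0x00, 0x8d, 0x97, 0x1a, 0xa3, 0x2e, 0x34, 0xb9,
	// ...
};

/* Fold n bytes at p into the running 8-bit crc, one lookup per byte */
uint8_t calcCrc(uint8_t crc, const uint8_t * p, size_t n)
{
	while (n--) {
		crc = crcTable[crc ^ *p++];
	}
	return crc;
}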

In hardware terms, that means you take your incoming 8-bit data, xor it
with your 8-bit current crc, then use that as the address for the lookup
in your 256-entry (8-bit address, 8-bit data) table.  The value from the
table is the new crc to use for the next round of 8-bit data.

The table is 256 bytes - 2K bits, if you prefer.  Not 2^16.

I suspect that what you are missing in your understanding here is that
you do not need a table that combines the current crc and the incoming
byte independently - /that/ would need a 2^16 entry table.  You take
advantage of the way the CRC is defined to combine the parts with xor
(using plain logic) first.
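
As a sketch of where such a table comes from: the first entries of
crcTable shown above are consistent with an MSB-first CRC-8 using the
polynomial 0x8d.  That polynomial is inferred from the listed values and
is an assumption, as are the helper names below.  Each entry is simply
the bit-at-a-time CRC of its own index, so the table and the shift
register compute the same thing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: polynomial inferred from crcTable[1] == 0x8d above. */
#define CRC8_POLY 0x8dU

/* Reference: fold one byte into the crc a bit at a time, MSB first. */
static uint8_t crc8_bitwise(uint8_t crc, uint8_t data)
{
	crc ^= data;
	for (int i = 0; i < 8; i++) {
		crc = (crc & 0x80U) ? (uint8_t)((crc << 1) ^ CRC8_POLY)
		                    : (uint8_t)(crc << 1);
	}
	return crc;
}

/* Fill the 256-entry table: entry i is the CRC of the single byte i. */
static void crc8_make_table(uint8_t table[256])
{
	for (unsigned i = 0; i < 256; i++) {
		table[i] = crc8_bitwise(0, (uint8_t)i);
	}
	/* For an MSB-first CRC, entry 1 is the polynomial itself. */
	assert(table[1] == CRC8_POLY);
}

/* Table-driven update, same shape as calcCrc() above. */
static uint8_t crc8_table(const uint8_t table[256], uint8_t crc,
                          const uint8_t *p, size_t n)
{
	while (n--) {
		crc = table[crc ^ *p++];
	}
	return crc;
}
```

Running crc8_table() and a loop of crc8_bitwise() over the same bytes
gives the same result, which is the xor-then-lookup property described
above.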

If you have a wider CRC and data coming in as 8-bit lumps, you need more
than one 256-entry table and you combine the steps with xor's.  So for a
32-bit CRC, you can use 4 256-entry, 8-bit wide lookup tables.  The four
lookups and the xors between them can easily be pipelined.
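
One possible reading of this, as a sketch only: four 8-bit-wide tables
share a single 8-bit address and each supplies one byte lane of the
32-bit remainder.  The MSB-first polynomial 0x04C11DB7, the names lane,
crc32_make_lanes and crc32_update, and this particular arrangement are
assumptions; other ways of splitting the lookups are possible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY 0x04C11DB7u  /* assumed MSB-first CRC-32 polynomial */

/* Four 256-entry, 8-bit-wide tables: lane[k][i] holds bits 8k+7..8k of
 * the 32-bit remainder for index byte i. */
static uint8_t lane[4][256];

static void crc32_make_lanes(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t r = i << 24;
		for (int b = 0; b < 8; b++) {
			r = (r & 0x80000000u) ? (r << 1) ^ CRC32_POLY : r << 1;
		}
		for (int k = 0; k < 4; k++) {
			lane[k][i] = (uint8_t)(r >> (8 * k));
		}
	}
	/* Entry 1 is the polynomial, split across the lanes. */
	assert(lane[0][1] == 0xB7 && lane[3][1] == 0x04);
}

static uint32_t crc32_update(uint32_t crc, const uint8_t *p, size_t n)
{
	while (n--) {
		uint8_t idx = (uint8_t)((crc >> 24) ^ *p++);
		/* All four lookups use the same address; in hardware they
		 * would be four small ROMs read in parallel. */
		uint32_t t = ((uint32_t)lane[3][idx] << 24) |
		             ((uint32_t)lane[2][idx] << 16) |
		             ((uint32_t)lane[1][idx] << 8) |
		              (uint32_t)lane[0][idx];
		crc = (crc << 8) ^ t;
	}
	return crc;
}
```

Call crc32_make_lanes() once, then crc32_update() with the running crc
for each block of bytes.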

You can reduce the depth of the pipelines by having bigger tables - 2^16
entry tables would halve the pipeline depth, but are unlikely to be worth
it due to their size.  You can also use smaller tables and
more steps, but 8-bit lookups often work well.  It is also possible to
take advantage of wider tables if your incoming data is in wider batches.

> 
>>> So one looks up many of the
>>> "Software" table generated techniques to reduce the requirements.
>>> Problem is those techniques are tuned to optimizes SW, not HW.
>>>
>>> ... All to replace something less than a 100 LUT6s, and 8 FFs.
>>>
>>> CRCs in hardware are very efficient just coded brute force.
>>>
>>
>> CRC's in hardware, using a shift register, are very efficient if your 
>> data is coming in a bit at a time.  If you have data in memory or 
>> arriving in a wider path, table lookup can be very useful.  You can 
> 
> I don't agree. Those table methods described in all those papers you
> find are tuned for software solutions, not hardware.   It's quite easy 
> to handle one-bit, or N-bits at a time in hardware.

I haven't looked at the particular papers here.  One-bit CRC hardware
is, as you say, quite easy to make and it is fine if your data is coming
in 1 bit at a time.  But it is much slower if the data is coming in
parallel bunches as is often the case for high-speed serial lines.  If
you have 1 GBit Ethernet, it is far easier to handle data that is 8-bit
wide at 125 MHz than data that is 1-bit wide at 1 GHz.

> 
> Code up a simple amount of logic to shift one bit at a time - in whatever
> language.
> 
> next_crc = shift_crc( current_crc, new_data_bit_in );
> 
> The shift_crc function is very simple to implement in verilog/vhdl.
> 
> Now for N bits at a time, stick a 'for' loop around that function
> call... Done.  The for loop (as always in hardware) describes parallel
> hardware, not sequential operation.  And it works to do exactly what is
> desired. You get a clear description of what's happening in shockingly
> few lines of code.  It reproduces what all those online CRC generators
> spit out as a glob of random XOR code.... 
> 
> If that's not fast enough, then look at pipelining it.  Table methods
> are hardly ever the right solution for CRCs in hardware.  
> 
> Burning even one block memory for a CRC calculation in hardware just seems
> absurdly excessive to me.  But maybe your designs have a lot of spare
> block memories, (and few LUTs available).
> 
> Regards,
> 
> Mark
>

Article: 160543
Subject: Re: How to handle a data packet while calculating CRC.
From: mac <acolvin@efunct.com>
Date: Tue, 20 Mar 2018 15:27:17 -0000 (UTC)
Links: << >> << T >> << A >>

> If that's not fast enough, then look at pipelining it. Table methods are
> hardly ever the right solution for CRCs in hardware.

In FPGA, almost all methods are table methods.
But not always the same ones as software.

Article: 160544
Subject: Re: How to handle a data packet while calculating CRC.
From: David Brown <david.brown@hesbynett.no>
Date: Tue, 20 Mar 2018 16:52:21 +0100
Links: << >> << T >> << A >>

On 20/03/18 16:27, mac wrote:
>> If that's not fast enough, then look at pipelining it. Table methods are
>> hardly ever the right solution for CRCs in hardware.
> 
> In FPGA, almost all methods are table methods.
> But not always the same ones as software.
> 

(Your quoting is jumbled.  You posted a follow-up to my post, but quoted
from gtwrek's post.  And you failed to give proper attributions.  Please
try to get this right - it is not hard, it is common courtesy, and it
makes newsgroup threads much easier to follow.  Thanks.)

Article: 160545
Subject: Re: How to handle a data packet while calculating CRC.
From: gtwrek@sonic.net (gtwrek)
Date: Tue, 20 Mar 2018 19:47:52 -0000 (UTC)
Links: << >> << T >> << A >>

In article <p8o14m$1iu$1@dont-email.me>,
David Brown  <david.brown@hesbynett.no> wrote:
>>description of CRC tables snipped

Sometimes email/nntp/forums make these technical discussions difficult.
As some background - I've written many CRC designs in hardware for 
both ASICs and FPGAs.  Every one of them uses the "plain-vanilla" 
method as described in many of the CRC papers.  The only twist, as
I've tried to describe, is putting the single-shift implementation in
a for loop, to allow 'N' bits to be calculated in a single clock cycle.  

This "twist" is shockingly simple to write in HDLs.  The core kernel of
a (generic) CRC - shifting through N bits at a time - is really just about
10 (simple) lines of code.  No need for online generators that spit out
gobs of (unreadable) XOR gates.
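
A software model of that kernel, as a minimal sketch rather than the
actual HDL module: shift_crc() follows the name used earlier in the
thread, shift_crc_n() is a made-up name, and the code assumes the
MPEG-2 polynomial with the data MSB shifted in first.  In C the loop
runs sequentially; in an HDL the same loop would unroll into one
combinational XOR network.

```c
#include <assert.h>
#include <stdint.h>

#define CRC32_POLY 0x04C11DB7u  /* MPEG-2 CRC-32 polynomial (assumed) */

/* Single-bit shift: feedback is the crc MSB xor the incoming bit. */
static uint32_t shift_crc(uint32_t crc, unsigned bit)
{
	unsigned fb = ((crc >> 31) ^ bit) & 1u;
	crc <<= 1;
	return fb ? crc ^ CRC32_POLY : crc;
}

/* N bits per call, MSB of `data` first: the for-loop "twist". */
static uint32_t shift_crc_n(uint32_t crc, uint32_t data, unsigned nbits)
{
	assert(nbits >= 1 && nbits <= 32);
	for (unsigned i = nbits; i-- > 0; ) {
		crc = shift_crc(crc, (data >> i) & 1u);
	}
	return crc;
}
```

Feeding the same bit stream 8 bits at a time or 32 bits at a time gives
the same crc; only the width of the unrolled logic changes.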

The "table" methods we've been discussing are all solutions aimed at
software, where it's difficult to do (many) parallel XORs of arbitrary
bits.  But that's something that hardware excels at.  

Taking those software solutions and retargeting them back at hardware
is, at best, overcomplicating something very simple.  At worst, it is
quite costly.

I've never written a table method for calculating CRCs (because as I've
asserted it's pointless for hardware), but I quite understand the underlying 
concepts and what they are doing.

Some numbers to back things up.
I have a generic crc module that's used all over the place.  I 
synthesized various versions of a 32-bit CRC in Vivado.  (The one used in 
the MPEG-2 spec, with POLY=0x4C11DB7.)

I targeted something high-speed to make sure the tools are
actually trying to do at least some optimizations: 2ns clocks
(500 MHz) in a Virtex7 device.

Test 1:
1 bit at a time: 
CLB LUTs*               |  151 
CLB Registers           |   64 
Worst case timing slack: 0.754ns
 
You'll first note that I use twice as many registers as seems
necessary.  This is because the way it's been coded it stores both the
"augmented" and "non-augmented" CRC in FFs.  

Test 2:
8 bits at a time:
CLB LUTs*               |  172
CLB Registers           |   64  
Worst case timing slack: 0.270ns

Notice that there is not much delta in LUT usage.  One thing to note:
the baseline may seem high.  After all, for a single-bit shift you only
need one XOR gate per set bit of POLY.  The augmentation adds more
complexity too.  But in the end the overhead "housekeeping", if you
will, is probably what dominates the logic here: things like loading the
INIT value, holding the current value when no new data is available,
etc.  Plus, a LUT6 is under-utilized when it implements just a 2-input XOR.

Test 3:
32-bits at a time:
CLB LUTs*               |  259
CLB Registers           |   64
Worst case timing slack: 0.332ns

A few more LUTs.  But paradoxically, the timing is better than in the
8-bit case.  I decided to go for broke and really up the DATA WIDTH:

Test 4:
512-bits at a time
CLB LUTs*               | 1734
CLB Registers           |   64
Worst case timing slack: 0.188ns

So even taking a very large datapath, things are fine.

Adding a "software" LUT method, at its very baseline, adds 1 BRAM block
to the solution - 36 Kbits - a very large resource, IMHO.  I would argue
that such a software solution aimed at the "Test 3" case above (32 bits
at a time) would likely still consume a significant number of CLB LUTs,
just to handle the "housekeeping" and the further logic behind the
software LUT.  I'd also argue that it would very likely run slower 
too, as BRAMs are in general slower than logic.  

Well, don't use a BRAM, you might say - use the CLB logic.  I'd argue
that this solution is, to the tools, the same thing as my 10-line
"plain-vanilla" version, just overly complicated and unclear, and it
would likely take the tools longer to reach the same result (if at all).

Those software implementations are quite interesting methods, and I
quite like reading up on the clever tricks being performed.  But none
of that is necessary in hardware, where you have free access to any bit
and can freely perform many things in parallel.

Regards,

Mark

Article: 160546
Subject: Re: How to handle a data packet while calculating CRC.
From: yogesh tripathi <yogitripathi47@gmail.com>
Date: Wed, 21 Mar 2018 01:51:21 -0700 (PDT)
Links: << >> << T >> << A >>

On Monday, March 12, 2018 at 4:32:17 PM UTC+5:30, yogesh tripathi wrote:
> Hi,
>
> I'm trying to process a Ethernet type package. Suppose if i have detected
> SFD and now have a  <1600Byte  data.
>
> I'm extracting different package element(ds_addr,src_addr,etc)
> concatenating them in a long shift register and at same time passing it
> to a fifo to buffer and calculating crc32 which will take some clock
> cycles(xoring and shifting). Now if calculated CRC matched what is
> received, pass data to nxt stage else rst fifo.
>
> Is there a better technique for it?
>
> Thank-You in advance.

--=========================================
Thank you all for your responses.
I tried an online CRC calculator to generate a parallel LFSR:
"https://leventozturk.com/engineering/crc/".
It works; I have verified the results. How the algorithm generates the
XOR circuitry still eludes me, though; I will look into it further.

The following is the generated RTL code for CRC-32 with 8-bit input, data
width progression d(7)->d(0):

ca(0)  <= ca(24) xor ca(30) xor d(1) xor d(7);
ca(1)  <= ca(24) xor ca(25) xor ca(30) xor ca(31) xor d(0) xor d(1) xor d(6) xor d(7);
ca(2)  <= ca(24) xor ca(25) xor ca(26) xor ca(30) xor ca(31) xor d(0) xor d(1) xor d(5) xor d(6) xor d(7);
ca(3)  <= ca(25) xor ca(26) xor ca(27) xor ca(31) xor d(0) xor d(4) xor d(5) xor d(6);
ca(4)  <= ca(24) xor ca(26) xor ca(27) xor ca(28) xor ca(30) xor d(1) xor d(3) xor d(4) xor d(5) xor d(7);
ca(5)  <= ca(24) xor ca(25) xor ca(27) xor ca(28) xor ca(29) xor ca(30) xor ca(31) xor d(0) xor d(1) xor d(2) xor d(3) xor d(4) xor d(6) xor d(7);
ca(6)  <= ca(25) xor ca(26) xor ca(28) xor ca(29) xor ca(30) xor ca(31) xor d(0) xor d(1) xor d(2) xor d(3) xor d(5) xor d(6);
ca(7)  <= ca(24) xor ca(26) xor ca(27) xor ca(29) xor ca(31) xor d(0) xor d(2) xor d(4) xor d(5) xor d(7);
ca(8)  <= ca(0) xor ca(24) xor ca(25) xor ca(27) xor ca(28) xor d(3) xor d(4) xor d(6) xor d(7);
ca(9)  <= ca(1) xor ca(25) xor ca(26) xor ca(28) xor ca(29) xor d(2) xor d(3) xor d(5) xor d(6);
ca(10) <= ca(2) xor ca(24) xor ca(26) xor ca(27) xor ca(29) xor d(2) xor d(4) xor d(5) xor d(7);
ca(11) <= ca(3) xor ca(24) xor ca(25) xor ca(27) xor ca(28) xor d(3) xor d(4) xor d(6) xor d(7);
ca(12) <= ca(4) xor ca(24) xor ca(25) xor ca(26) xor ca(28) xor ca(29) xor ca(30) xor d(1) xor d(2) xor d(3) xor d(5) xor d(6) xor d(7);
ca(13) <= ca(5) xor ca(25) xor ca(26) xor ca(27) xor ca(29) xor ca(30) xor ca(31) xor d(0) xor d(1) xor d(2) xor d(4) xor d(5) xor d(6);
ca(14) <= ca(6) xor ca(26) xor ca(27) xor ca(28) xor ca(30) xor ca(31) xor d(0) xor d(1) xor d(3) xor d(4) xor d(5);
ca(15) <= ca(7) xor ca(27) xor ca(28) xor ca(29) xor ca(31) xor d(0) xor d(2) xor d(3) xor d(4);
ca(16) <= ca(8) xor ca(24) xor ca(28) xor ca(29) xor d(2) xor d(3) xor d(7);
ca(17) <= ca(9) xor ca(25) xor ca(29) xor ca(30) xor d(1) xor d(2) xor d(6);
ca(18) <= ca(10) xor ca(26) xor ca(30) xor ca(31) xor d(0) xor d(1) xor d(5);
ca(19) <= ca(11) xor ca(27) xor ca(31) xor d(0) xor d(4);
ca(20) <= ca(12) xor ca(28) xor d(3);
ca(21) <= ca(13) xor ca(29) xor d(2);
ca(22) <= ca(14) xor ca(24) xor d(7);
ca(23) <= ca(15) xor ca(24) xor ca(25) xor ca(30) xor d(1) xor d(6) xor d(7);
ca(24) <= ca(16) xor ca(25) xor ca(26) xor ca(31) xor d(0) xor d(5) xor d(6);
ca(25) <= ca(17) xor ca(26) xor ca(27) xor d(4) xor d(5);
ca(26) <= ca(18) xor ca(24) xor ca(27) xor ca(28) xor ca(30) xor d(1) xor d(3) xor d(4) xor d(7);
ca(27) <= ca(19) xor ca(25) xor ca(28) xor ca(29) xor ca(31) xor d(0) xor d(2) xor d(3) xor d(6);
ca(28) <= ca(20) xor ca(26) xor ca(29) xor ca(30) xor d(1) xor d(2) xor d(5);
ca(29) <= ca(21) xor ca(27) xor ca(30) xor ca(31) xor d(0) xor d(1) xor d(4);
ca(30) <= ca(22) xor ca(28) xor ca(31) xor d(0) xor d(3);
ca(31) <= ca(23) xor ca(29) xor d(2);
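
One way equations like these can be derived, as a sketch only (I have
not looked at how the linked generator works internally): run the
single-bit LFSR step eight times symbolically, keeping for each register
bit the set of ca(..) and d(..) terms it is the XOR of.  The polynomial
0x04C11DB7, the assumption that d(0) is shifted in first, and the names
derive_crc32_d8 and print_eq are mine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CRC32_POLY 0x04C11DB7u

/* Each next-state bit is a set of XOR terms, stored as a 64-bit mask:
 * bits 0..31 stand for ca(0)..ca(31), bits 32..39 for d(0)..d(7). */
static void derive_crc32_d8(uint64_t eq[32])
{
	for (int i = 0; i < 32; i++) {
		eq[i] = 1ULL << i;      /* initially ca(i) is just itself */
	}
	/* Assumption: d(0) enters first, d(7) last. */
	for (int k = 0; k < 8; k++) {
		uint64_t fb = eq[31] ^ (1ULL << (32 + k));
		for (int i = 31; i > 0; i--) {
			eq[i] = eq[i - 1] ^ (((CRC32_POLY >> i) & 1u) ? fb : 0);
		}
		eq[0] = fb;             /* polynomial bit 0 is set */
	}
	/* Sanity check against the last line of the listing above. */
	assert(eq[31] == ((1ULL << 23) | (1ULL << 29) | (1ULL << 34)));
}

/* Print one equation in the style of the listing above. */
static void print_eq(int i, uint64_t mask)
{
	const char *sep = "";
	printf("ca(%d) <=", i);
	for (int b = 0; b < 40; b++) {
		if (mask & (1ULL << b)) {
			printf("%s %s(%d)", sep, b < 32 ? "ca" : "d",
			       b < 32 ? b : b - 32);
			sep = " xor";
		}
	}
	printf(";\n");
}
```

XOR of two masks cancels any term that appears twice, which is why the
final equations are much shorter than eight nested feedback expressions.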

Article: 160547
Subject: Re: the FPGA one-shot
From: Paul Urbanus <urb@urbonix.com>
Date: Thu, 22 Mar 2018 06:00:38 -0500
Links: << >> << T >> << A >>

On 3/16/2018 1:18 PM, John Larkin wrote:
 > I finally got a test case for my FPGA async one-shot idea, hacked into
 > a build for something else.
 >
 > I got 17 different one-shots, with various pin locations and
 > speed/drive strength settings.
 >
 > https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1
 > <snipped>
 >
 > The Xilinx tools didn't approve of us doing this.
 >
  This was the circuit I used to generate 'synchronous' write enables 
for the LUT RAMs in the XC4000 family. This was the first Xilinx FPGA 
family that allowed the LUTs to be used as RAMs. The LUT RAM operation 
was all async, including the write enable, and I needed the RAM writes 
to occur in a single clock cycle.

This was for a proof-of-concept (non-production) system and it worked 
flawlessly.

Article: 160548
Subject: Re: the FPGA one-shot
From: John Larkin <jjlarkin@highlandtechnology.com>
Date: Thu, 22 Mar 2018 06:52:53 -0700
Links: << >> << T >> << A >>

On Thu, 22 Mar 2018 06:00:38 -0500, Paul Urbanus <urb@urbonix.com>
wrote:

>On 3/16/2018 1:18 PM, John Larkin wrote:
> > I finally got a test case for my FPGA async one-shot idea, hacked into
> > a build for something else.
> >
> > I got 17 different one-shots, with various pin locations and
> > speed/drive strength settings.
> >
> > https://www.dropbox.com/s/4hxena27mpbpg54/FPGA_OS_1.JPG?raw=1
> > <snipped>
> >
> > The Xilinx tools didn't approve of us doing this.
> >
>  This was the circuit I used to generate 'synchronous' write enables 
>for the LUT RAMs in the XC4000 family. This was the first Xilinx FPGA 
>family that allowed the LUTs to be used as RAMs. The LUT RAM operation 
>was all async, including the write enable, and I needed the RAM writes 
>to occur in a single clock cycle.
>
>This was for a proof-of-concept (non-production) system and it worked 
>flawlessly.
>

Was that all internal to the FPGA, or did you loop through a pin?


-- 

John Larkin         Highland Technology, Inc

lunatic fringe electronics

Article: 160549
Subject: Re: Microsemi now Microchip
From: Francesco <francescopoderico@googlemail.com>
Date: Sat, 24 Mar 2018 07:31:27 -0700 (PDT)
Links: << >> << T >> << A >>

On Saturday, 3 March 2018 09:28:59 UTC, HT-Lab  wrote:
> In case anybody missed it:
> 
> https://www10.edacafe.com/nbc/articles/1/1569384/Microchip-Technology-Acquire-Microsemi
> 
> Hans
> www.ht-lab.com

True, Microchip is expanding at an exponential rate.
It will be interesting to see the market share against others like Cypress for example.

cheers,
Francesco

-- www.neutronix.co.uk
