On 02/07/12 00:06, KJ wrote:
> On Sunday, July 1, 2012 6:33:05 PM UTC-4, Alan Fitch wrote:
>> On 01/07/12 21:03, KJ wrote:
>>>
>>> Here's another:
>>>
>>> process(sel, a, b)
>>> begin
>>>   o <= a when sel = '0' else b;
>>> end process;
>>>
>>> Is the above description that of a 2->1 mux (as suggested by the signal names) or a transparent latch? If it's a mux, one would think that this is simple combinatorial logic with no memory; if it's a transparent latch then it certainly does have memory. Without context, one cannot say whether the above process is a mux or a latch.
>>>
>>
>> It's up to the synthesis tool. But it looks like a mux to me. But I
>> guess I'll wait for STA to tell me if there's some outside feedback I
>> can't see.
>>
>
> It's not up to the synthesis tool, it's up to the designer who wrote the code that is to be synthesized. To see the transparent latch, make the following connection:
>
> b <= o; -- or a <= o;
>

Sorry, I left out too many steps. What I intended to say was that I tend to assume that people design latches by mistake. Therefore they will find out when STA (Static Timing Analysis) tells them there is combinational feedback.

regards
Alan

> Where 'sel' plays the role of the latch enable.
>
> Kevin Jennings
>

--
Alan Fitch

Article: 153951
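KJ's feedback connection can be written out as a small self-contained sketch. The entity name `mux_or_latch` and the ports `d` and `q` are made up for illustration; the selection statement is the one from the thread, written as a concurrent assignment so that it is also legal before VHDL-2008.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper: the same 2->1 selection as in the thread,
-- but with the output fed back to the 'b' input.
entity mux_or_latch is
  port (sel : in  std_logic;
        d   : in  std_logic;
        q   : out std_logic);
end entity mux_or_latch;

architecture rtl of mux_or_latch is
  signal a, b, o : std_logic;
begin
  a <= d;
  o <= a when sel = '0' else b;  -- in isolation this reads as a plain mux
  b <= o;                        -- feedback: 'o' holds its value while sel /= '0'
  q <= o;
end architecture rtl;
```

With the `b <= o` line present, `o` follows `d` while `sel` is '0' and holds otherwise, which is the transparent latch KJ describes. Without that line (and with `b` driven from outside) the same selection statement is just a mux.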
On Monday, July 2, 2012 3:32:21 PM UTC-4, Gabor wrote:
> KJ wrote:
> > On Monday, July 2, 2012 5:26:51 AM UTC-4, nba83 wrote:
> >> hi
> >> is it a good practice to generate a pulse strobe in Verilog with this code?
> >> or do i need to write a fsm for such applications?
> >> BoardConfigReceivedStrobe is set elsewhere in the process. for resetting
> >> this signal after some delay, I used this code, I'm skeptical that this
> >> code is not efficient
> >> if(BoardConfigReceivedStrobe)
> >> begin
> >>   DelayCounterBoardConfig<=DelayCounterBoardConfig+1;
> >>   if(DelayCounterBoardConfig>100)
> >>   begin
> >>     DelayCounterBoardConfig<=0;
> >>     BoardConfigReceivedStrobe<=0;
> >>   end
> >> end
> >>
> >> tnx in advance for any comments
> >> Neda
> >>
> >> ---------------------------------------
> >> Posted through http://www.FPGARelated.com
> >
> > For a fixed count value like you have a more efficient way of counting is usually an LFSR. Since a binary counter to 100 as you have now is not really that large to begin with it may not save you all that much.
> >
> > Kevin Jennings
>
> The only thing that looks unnecessarily complex is the magnitude
> comparison "> 100" instead of an equality comparison like "== 101"
> which should give the same results unless your counter increments
> outside of the posted part of the logic.
>

Actually using > can result in less logic than =. Take for example >96. You would only need to compare to "11-----" (two bits) rather than the full seven bits. Not always less, but sometimes.

Kevin Jennings

Article: 153952
Dear All,

I'm not an expert in VHDL, I'm just a curious person trying to solve a research problem with an FPGA.

I'm using a 32 bit accumulator in an IP, as part of a SoC project with a microblaze, implemented in a Digilent Spartan-3 SKB (the FPGA is a Xilinx XC3S200). The code is included at the end of this message. The input is a 32 bit signed integer coded in two's complement and the output also a 32 bit signed integer. What I would like the accumulator to do is to accumulate synchronously with the rising edge of clk when enb=1 and maintain the result stable at the output when enb=0 (enb is an asynchronous signal generated elsewhere in the system).

But it does not work in this way, it behaves in a strange manner...

Sometimes I get the expected results but often I get strange values (large when they should be small, often negative instead of positive, etc.). If I look at the binary representation of the output, it looks as if the output didn't have time to sum and propagate to the output again. In fact, the post place and route simulation shows that when the enb signal goes to 0, the output stays in an undetermined condition (you know, red line with XXXX).

I'm guessing I'm making a very basic mistake that has something to do with the timing of the enb signal, but after 3 days banging my head against the wall, all I have is a monumental headache.

Can some kind soul help me with this?

jmariano

================

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity int_accum is
  port (clk: in  std_logic;
        clr: in  std_logic;
        enb: in  std_logic;
        d:   in  std_logic_vector(31 downto 0);
        ovf: out std_logic;      -- overflow
        q:   out std_logic_vector(31 downto 0));
end int_accum;

architecture archi of int_accum is

  signal tmp : signed(32 downto 0);

begin

  process(clk, clr)
  begin
    if (clr = '1') then
      tmp <= (others => '0');
    elsif (rising_edge (clk)) then
      if (enb = '1') then
        -- The result of the adder will be on 33 bits to keep the carry
        tmp <= tmp + signed ('0'& d);
      end if;
    end if;
  end process;

  -- The carry is extracted from the most significant bit of the result
  ovf <= tmp(32);

  -- The q output is the 32 least significant bits of sum
  q <= std_logic_vector (tmp(31 downto 0));

end archi;

Article: 153953
On Jul 2, 4:20 pm, jmariano <jmarian...@gmail.com> wrote:
> What I would like the accumulator
> to do is to accumulate synchronously with the rising edge of clk when
> enb=1 and maintain the result stable at the output when enb=0 ( enb is
> an asynchronous signal generated elsewhere in the system)
>
(snip rest of quoted post and code)

This is the key to your problem:

> enb is an asynchronous signal generated elsewhere in the system

You can't expect to take an asynchronous signal into multiple (32 in this case) registers in a synchronous domain and expect that it will work reliably. You need to first synchronize the asynchronous input to the synchronous clock domain before you can use it.

Ed McGettigan
--
Xilinx Inc.

Article: 153954
>> That's why Weng's fundamental premise of how to define a
>> 'combinatorial process' was inherently flawed. Whether or not a
>> given process implements memory or not cannot be determined simply
>> from the process so any attempt to do so will ultimately fail.
>> Kevin

> This line of reasoning would apply to any type of logic
> description, not just VHDL or Verilog. It would imply that
> the use of the word "combinatorial" is meaningless, which
> is rather absurd.
> Jan

> A different approach has been taken in SystemVerilog,
> where keywords always_comb, always_latch, and always_ff
> have been introduced to allow the designer to describe
> design intent in the simulation model.
> Alan

That is why I say SystemVerilog does a correct and better thing than VHDL: it introduces 3 types of basic circuits in the digital world: always_comb, always_latch, and always_ff.

That eliminates all the confusion introduced by VHDL's ambiguous definition, where one process can imply more than 3 different things, a confusion for which I don't see any light at the end of this discussion.

In the DNA world, there are only 4 gene bases: the four bases found in DNA are adenine (abbreviated A), cytosine (C), guanine (G) and thymine (T).

Over there, people in the DNA world always talk about A, C, G and T.

In the digital electronics world there are only 3 basic circuits that people are interested in: ff, latch and combinatorial. A combinatorial signal means one that is neither a ff nor a latch.

After realizing that "combinatorial" is fully excluded from VHDL-2008, I abandoned any attempt to use it accurately. So I am finally at ease with the process(all) specification.

I really recommend that VHDL-201x use 3 process type definitions to replace the one ambiguous process definition: process_com, process_latch and process_ff. Each type generates only one type of logic. When a process_com misses an assignment, it would generate an error. So no confusion would be introduced.

Weng

Article: 153955
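For comparison with the proposed process_com, here is a minimal sketch of how a purely combinatorial process is usually kept latch-free in VHDL-2008 as it stands. The entity name `comb_default` and its ports are hypothetical; the technique is simply a default assignment at the top of a `process (all)`, so that every path through the process assigns the output. Nothing in the language enforces this, which is the gap Weng is pointing at.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity, not taken from the thread.
entity comb_default is
  port (sel  : in  std_logic_vector(1 downto 0);
        a, b : in  std_logic;
        y    : out std_logic);
end entity comb_default;

architecture rtl of comb_default is
begin
  process (all)              -- VHDL-2008 sensitivity list
  begin
    y <= '0';                -- default: every path now assigns y
    case sel is
      when "00"   => y <= a;
      when "01"   => y <= b;
      when others => null;   -- without the default above, y would have to
                             -- hold its value here, i.e. a latch is implied
    end case;
  end process;
end architecture rtl;
```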
On Mon, 02 Jul 2012 17:19:59 -0700, Ed McGettigan wrote:

> On Jul 2, 4:20 pm, jmariano <jmarian...@gmail.com> wrote:
(snip quoted post and code)
>
> This is the key to your problem:
>
>> enb is an asynchronous signal generated elsewhere in the system
>
> You can't expect to take an asynchronous signal into multiple (32 in
> this case) registers in a synchronous domain and expect that it will
> work reliably. You need to first synchronize the asynchronous input to
> the synchronous clock domain before you can use it.

Which means that you should latch enb in a register, with the same clock that you're using to twiddle your accumulator, and use the output of that register as your enable signal.

Paranoid logic designers will have a string of two or three registers to avoid metastability, but I've been told that's not necessary. (I'm not much of a logic designer).

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com

Article: 153956
On Mon, 02 Jul 2012 14:02:06 -0700, KJ wrote:

> On Monday, July 2, 2012 3:32:21 PM UTC-4, Gabor wrote:
>> KJ wrote:
>> > On Monday, July 2, 2012 5:26:51 AM UTC-4, nba83 wrote:
(snip quoted question and code)
>> >
>> > For a fixed count value like you have a more efficient way of
>> > counting is usually an LFSR. Since a binary counter to 100 as you
>> > have now is not really that large to begin with it may not save you
>> > all that much.
>> >
>> > Kevin Jennings
>>
>> The only thing that looks unnecessarily complex is the magnitude
>> comparison "> 100" instead of an equality comparison like "== 101"
>> which should give the same results unless your counter increments
>> outside of the posted part of the logic.
>>
>
> Actually using > can result in less logic than =. Take for example >96.
> You would only need to compare to "11-----" (two bits) rather than the
> full seven bits. Not always less, but sometimes.
>
> Kevin Jennings

You mean >=96, of course.

I'm wondering if it wouldn't be faster to compare on zero, load with 100, and count down.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com

Article: 153957
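Tim's correction is easy to check exhaustively. The following is a small self-checking testbench sketch (the entity name `cmp_check` is made up) that walks through all 7-bit values and asserts that ">= 96" is the same as "both top bits set". Since 96 is binary 1100000, the five low bits never matter for ">=", whereas ">96" would additionally need at least one low bit set.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench, simulation only.
entity cmp_check is
end entity cmp_check;

architecture sim of cmp_check is
begin
  process
    variable cnt : unsigned(6 downto 0);
  begin
    for i in 0 to 127 loop
      cnt := to_unsigned(i, 7);
      -- ">= 96" on a 7-bit value depends only on bits 6 and 5
      assert (cnt >= 96) = (cnt(6) = '1' and cnt(5) = '1')
        report "mismatch at " & integer'image(i)
        severity failure;
    end loop;
    report "two-bit decode matches >= 96 for all 7-bit values";
    wait;  -- stop the process
  end process;
end architecture sim;
```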
On 7/1/2012 12:42 AM, glen herrmannsfeldt wrote:
> Rob Doyle <radioengr@gmail.com> wrote:
>
> (snip, I wrote)
>>> Normally you can't make a traditional edge triggered FF out of
>>> ordinary gates. You can make a transparent latch, and some other
>>> devices with state.
>
>> Why do you say that? With two transparent latches (a master and
>> a slave) you can make an edge triggered flip-flop, right?
>
> As it was explained to me many years ago, you want a different
> threshold for the two. Maybe that isn't always necessary, but
> only an optimization, or maybe technology dependent.
>
>> Even wikipedia has some examples.
>
>> http://en.wikipedia.org/wiki/Flip-flop_%28electronics%29
>
>> I hacked a D flip-flop in a 16L8 once... as a fix.
>> I don't recall being too grossed out by it.
>
> I will have to look at those.
>
> -- glen
>

I don't think different thresholds are necessary. The master and slave latches work on alternate phases of the clock.

Way back in the day, we designed latches, flip-flops, shift registers, and memory with just transmission gates and inverters using a two-phase non-overlapping clock... I guess that was in the day when transistors were expensive and logic simulation was done with SPICE.

Anyway, the following set of latches /does/ simulate (Xilinx ISE) as a positive edge-triggered toggle flip-flop:

---- begin ----

architecture Beh of test is
  signal input  : std_logic;
  signal master : std_logic := '0';
  signal slave  : std_logic := '0';
  signal clk    : std_logic := '1';
begin

  -- Clock
  process
  begin
    wait for 10 ns;
    clk <= not(clk);
  end process;

  -- Master Latch
  master <= input when clk = '0' else master;

  -- Slave Latch
  slave <= master when clk = '1' else slave;

  -- Make flip-flop toggle
  input <= not(slave);

end Beh;

---- end ----

It sure looks combinatorial ... Hmm....

Rob.

Article: 153958
On 07/03/2012 03:43 AM, Weng Tianxiang wrote:

> That is why I say SystemVerilog does a correct and better thing than
> VHDL: It introduces 3 types of basic circuits in digital world:
> always_comb, always_latch, and always_ff.
>
> That eliminates all confusions introduced by VHDL ambiguous
> definition of one process implying more than 3 different things that
> I don't see any end-light in the discussion.

> In digital electric world there are only 3 basic circuits that people
> are interested in: ff, latch and combinatorial. By combinatorial
> signal it means it is neither a ff nor a latch.

> I really recommend that VHDL-201x use 3 process type definitions to
> replace one ambiguous process definition: process_com, process_latch
> and process_ff.

You should think differently about VHDL/Verilog. Their purpose is absolutely not limited to describing circuits according to your classification.

First and foremost, you use them for high-level modeling and verification. Your classification doesn't apply to such modeling.

Secondly, in RTL modeling a clocked process is a very effective modeling abstraction from which both sequential devices and combinatorial logic are inferred.

For the latter reason, I believe always_ff is problematic. If it restricts assignments to FF inference only, it is a step backwards compared to a classic clocked process. Otherwise it is confusing. (I don't know which path vendors choose.)

For those reasons, the VHDL process and Verilog initial/always will never go away and will continue to be used intensively.

--
Jan Decaluwe - Resources bvba - http://www.jandecaluwe.com
Python as a HDL: http://www.myhdl.org
VHDL development, the modern way: http://www.sigasi.com
World-class digital design: http://www.easics.com

Article: 153959
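Jan's second point, that one clocked process yields both sequential and combinatorial hardware, can be illustrated with a minimal sketch. The entity name `comb_then_ff` and its ports are hypothetical and chosen only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity, not taken from the thread.
entity comb_then_ff is
  port (clk     : in  std_logic;
        a, b, c : in  std_logic;
        q       : out std_logic);
end entity comb_then_ff;

architecture rtl of comb_then_ff is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- The XOR/AND network on the right-hand side is combinatorial
      -- logic; the assignment target 'q' is inferred as a flip-flop.
      q <= (a xor b) and c;
    end if;
  end process;
end architecture rtl;
```

A label such as "ff process" would describe only the storage element here, not the gates in front of it, which appears to be the ambiguity Jan is worried about with always_ff.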
Tim Wescott wrote:
> On Mon, 02 Jul 2012 14:02:06 -0700, KJ wrote:
>
>> On Monday, July 2, 2012 3:32:21 PM UTC-4, Gabor wrote:
(snip quoted question, code and earlier replies)
>>> The only thing that looks unnecessarily complex is the magnitude
>>> comparison "> 100" instead of an equality comparison like "== 101"
>>> which should give the same results unless your counter increments
>>> outside of the posted part of the logic.
>>>
>> Actually using > can result in less logic than =. Take for example >96.
>> You would only need to compare to "11-----" (two bits) rather than the
>> full seven bits. Not always less, but sometimes.
>>
>> Kevin Jennings
>
> You mean >=96, of course.
>
> I'm wondering if it wouldn't be faster to compare on zero, load with 100,
> and count down.
>

Loading with a value and using the carry chain for compare (up or down) often works faster, but this is quite dependent on the architecture. In a Xilinx FPGA, the carry chain logic is quite fast compared to LUTs, but most CPLDs only care about how many product terms the equation needs (equality comparison generally only needs one), and that affects the size of the logic more than speed (you hit a speed bump if you exceed the product terms you can have locally for a single macrocell).

--
Gabor

Article: 153960
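The "load with 100, count down, compare on zero" idea might look like the sketch below. It is written in VHDL to match the other code in this archive, although the original question was in Verilog. The entity name `strobe_timer`, the generic `HOLD_CYCLES` and the port names are all assumptions, and the exact number of cycles the strobe stays high has not been matched to the poster's code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical strobe timer: 'start' sets the strobe and loads a
-- down-counter; the strobe is cleared once the counter has reached zero.
entity strobe_timer is
  generic (HOLD_CYCLES : natural := 100);  -- must fit in 7 bits here
  port (clk    : in  std_logic;
        start  : in  std_logic;
        strobe : out std_logic);
end entity strobe_timer;

architecture rtl of strobe_timer is
  signal count  : unsigned(6 downto 0) := (others => '0');
  signal active : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if start = '1' then
        count  <= to_unsigned(HOLD_CYCLES, count'length);  -- load
        active <= '1';
      elsif active = '1' then
        if count = 0 then        -- the only comparison is against zero
          active <= '0';
        else
          count <= count - 1;    -- count down
        end if;
      end if;
    end if;
  end process;

  strobe <= active;
end architecture rtl;
```

Whether this is actually smaller or faster than the up-counter with a magnitude compare depends on the target architecture, as Gabor says.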
On Mon, 02 Jul 2012 16:20:52 -0700, jmariano wrote:

> Dear All,
>
> I'm not an expert in VHDL, i'm just a curious trying to solve a research
> problem with an FPGA.
>
> I'm using a 32 bit accumulator in a IP, ... The
> input is a 32 bit signed integer coded in two's complement and the
> output also a 32 bit signed integer.

> But it does not work in this way, it behaves in a strange manner...

You have one likely answer from Ed and Tim: unless you KNOW that the input signals "enb" and "d" are already synchronous with "clk" you MUST synchronise them.

But there is another problem:

  tmp <= tmp + signed ('0'& d);

This is NOT how to add a leading bit to d. It will convert a small negative d to a very large positive value! Instead you must replicate d's sign bit (MSB) into the leading bit.

  tmp <= tmp + signed (d(d'high) & d);

(Or look for "resize" functions in numeric_std to do this for you).

This is far more likely to be the problem, especially if you are detecting these errors at behavioural simulation (as you should be).

Incidentally, unless this is the top level of your design, I would consider making the D and Q ports signed. Apart from keeping the type conversions to a minimum, this means the external view of the design (the entity specification) better reflects (or documents) what the design does; preventing surprises when someone re-uses it with unsigned data...

- Brian

Article: 153961
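Brian's two suggestions (sign extension via `resize`, and signed ports) could be combined as in the following sketch. This is a reworking of the posted entity under stated assumptions, not the original poster's final code: `enb` is assumed to be already synchronous to `clk`, and the `ovf` output is left out to keep the example short. To see why the original line fails, take d = -1 (x"FFFFFFFF"): `signed('0' & d)` is +4294967295, whereas sign extension keeps it at -1.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch of the accumulator with signed ports; 'ovf' omitted.
entity int_accum is
  port (clk : in  std_logic;
        clr : in  std_logic;
        enb : in  std_logic;              -- assumed synchronous to clk
        d   : in  signed(31 downto 0);
        q   : out signed(31 downto 0));
end entity int_accum;

architecture rtl of int_accum is
  signal tmp : signed(32 downto 0) := (others => '0');
begin
  process (clk, clr)
  begin
    if clr = '1' then
      tmp <= (others => '0');
    elsif rising_edge(clk) then
      if enb = '1' then
        -- resize on a signed operand replicates the sign bit,
        -- equivalent to signed(d(d'high) & d) in the post above
        tmp <= tmp + resize(d, tmp'length);
      end if;
    end if;
  end process;

  q <= tmp(31 downto 0);
end architecture rtl;
```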
On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote:
> On Mon, 02 Jul 2012 17:19:59 -0700, Ed McGettigan wrote:
>
> > On Jul 2, 4:20 pm, jmariano <jmarian...@gmail.com> wrote:
(snip quoted post and code)
> >
> > This is the key to your problem:
> >
> >> enb is an asynchronous signal generated elsewhere in the system
> >
> > You can't expect to take an asynchronous signal into multiple (32 in
> > this case) registers in a synchronous domain and expect that it will
> > work reliably. You need to first synchronize the asynchronous input to
> > the synchronous clock domain before you can use it.
>
> Which means that you should latch enb in a register, with the same clock
> that you're using to twiddle your accumulator, and use the output of that
> register as your enable signal.
>
> Paranoid logic designers will have a string of two or three registers to
> avoid metastability, but I've been told that's not necessary. (I'm not
> much of a logic designer).
>
> --
> Tim Wescott
> Control system and signal processing consulting
> www.wescottdesign.com

It isn't just the paranoid logic designer, it should be every logic designer.

A single register only partially solves the problem of an asynchronous input with multiple register destinations, but it does not solve the very real metastability problem. At least two registers should be used to ensure that the metastability condition has resolved and with increasing clock frequency and finer process nodes using three or more stages may be necessary.

Ed McGettigan
--
Xilinx Inc.

Article: 153962
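A minimal sketch of the two-register synchronizer Ed describes is shown below. The entity name `sync_2ff` and the port and signal names are assumptions made for this example; the output `sync_out` is what would drive the accumulator's `enb` in place of the raw asynchronous signal.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-stage synchronizer for a single-bit asynchronous input.
entity sync_2ff is
  port (clk      : in  std_logic;
        async_in : in  std_logic;   -- asynchronous to clk
        sync_out : out std_logic);  -- usable inside the clk domain
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal meta, stable : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta   <= async_in;  -- first stage: may go metastable
      stable <= meta;      -- second stage: samples after the first
                           -- has had about one clock period to settle
    end if;
  end process;

  sync_out <= stable;
end architecture rtl;
```

This only covers the single-bit `enb`. Brian's remark that `d` must also be synchronous still applies, and a multi-bit bus generally cannot be made safe by putting a synchronizer like this on each bit independently.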
On Jul 3, 5:45=A0pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote: > On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote: > > On Mon, 02 Jul 2012 17:19:59 -0700, Ed McGettigan wrote: > > > > On Jul 2, 4:20=A0pm, jmariano <jmarian...@gmail.com> wrote: > > >> Dear All, > > > >> I'm not an expert in VHDL, i'm just a curious trying to solve a > > >> research problem with an FPGA. > > > >> I'm using a 32 bit accumulator in a IP, as part of a SoC project wit= h a > > >> microblaze, implemented in a Digilent Spartan-3 SKB ( the FPGA is a > > >> Xilinx XC3S200). The code is included at the end of this message. = =A0The > > >> input is a 32 bit signed integer coded in two's complement and the > > >> output also a 32 bit signed integer. What I would like the accumulat= or > > >> to do is to accumulate synchronously with the rising edge of clk whe= n > > >> enb=3D1 and maintain the result stable at the output when enb=3D0 ( = enb is > > >> a asynchronous signal generated elsewhere in the system) > > > >> But it does not work in this way, it behaves in a strange manner... > > > >> Some times I get the expected results but often I get strange values > > >> (large when they should be small, often negative instead of positive= , > > >> etc.). If I look at the binary representation of the output, it look= s > > >> like if the output din't had time to sum and propagate to the output > > >> again. In fact, the post place and route simulation shows that when = the > > >> enb signal goes to 0, the output stays in a undetermined condition (= you > > >> know, red line with XXXX). > > > >> I'm guessing I'm doing a very basic mistake that as something to do > > >> with the timing of the enb signal, but after 3 days banging my had t= o > > >> the wall, all I have is a a monumental headache. > > > >> Can some kind soul help me with this? 
> > > >> jmariano > > > >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > > >> library ieee; > > >> use ieee.std_logic_1164.all; > > >> use ieee.numeric_std.all; > > > >> entity int_accum is > > >> =A0 port =A0(clk:in =A0std_logic; > > >> =A0 =A0 =A0 =A0 =A0clr:in =A0std_logic; > > >> =A0 =A0 =A0 =A0 =A0enb:in =A0std_logic; > > >> =A0 =A0 =A0 =A0 =A0d: =A0in =A0std_logic_vector(31 downto 0); > > >> =A0 =A0 =A0 =A0 =A0ovf:out std_logic; =A0 =A0 =A0-- overflow q: =A0o= ut > > >> =A0 =A0 =A0 =A0 =A0std_logic_vector(31 downto 0)); > > >> end int_accum; > > > >> architecture archi of int_accum is > > > >> =A0 signal tmp : signed(32 downto 0); > > > >> =A0 begin > > > >> =A0 process(clk, clr) > > >> =A0 begin > > >> =A0 =A0 =A0 =A0 if (clr =3D '1') then > > >> =A0 =A0 =A0 =A0 =A0 =A0tmp <=3D (others =3D> '0'); > > >> =A0 =A0elsif (rising_edge (clk)) then > > >> =A0 =A0 =A0 =A0 if (enb =3D '1') then > > >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 -- The result of the adder will be o= n 33 bits > > >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 to keep the carry tmp <=3D tmp + sig= ned ('0'& d); > > >> =A0 =A0 end if; > > >> =A0 =A0end if; > > >> =A0 end process; > > > >> =A0 -- The carry is extracted from the most significant bit of the > > >> result > > >> =A0 ovf <=3D tmp(32); > > > >> =A0 -- The q output is the 32 least significant bits of sum q <=3D > > >> =A0 std_logic_vector (tmp(31 downto 0)); > > > >> end archi; > > > > This is the key to your problem: > > > >> =A0enb is a asynchronous signal generated elsewhere in the system > > > > You can't expect to take an asynchronous signal into multiple (32 in > > > this case) registers in a synchronous domain and expect that it will > > > work reliably. =A0You need to first synchronize the asynchronous inpu= t to > > > the synchronous clock domain before you can use it. 
> > > Which means that you should latch enb in a register, with the same cloc= k > > that you're using to twiddle your accumulator, and use the output of th= at > > register as your enable signal. > > > Paranoid logic designers will have a string of two or three registers t= o > > avoid metastability, but I've been told that's not necessary. =A0(I'm n= ot > > much of a logic designer). > > > -- > > Tim Wescott > > Control system and signal processing consulting > >www.wescottdesign.com > > It isn't just the paranoid logic designer, it should be every logic desig= ner. > > A single register only partially solves the problem of an asynchronous in= put with multiple register destinations, but it does not solve the very rea= l metastability problem. =A0At least two registers should be used to ensure= that the metastability condition has resolved and with increasing clock fr= equency and finer process nodes using three or more stages may be necessary= . > > Ed McGettigan > -- > Xilinx Inc. Hi Ed. They way it was explained to me, I believe from Peter Alfke, is that what really resolves metastability is the slack time in a register to register path. Over the years FPGA process has resulted in FFs which only need a couple of ns to resolve metastability to 1 in a million operation years or something like that (I don't remember the metric, but it was good enough for anything I do). It doesn't matter that you have logic in that path, you just need those few ns in every part of the path. In theory, even if you use multiple registers with no logic, what really matters is the slack time in the path and that is not guaranteed even with no logic. So the design protocol should be to assure the slack time from the input register to all subsequent registers have sufficient slack time. Do you remember how much time that needs to be? I want to say 2 ns, but it might be more like 5 ns, I just can't recall. 
Of course it depends on your clock rates, but I believe Peter picked some more aggressive speeds like 100 MHz for his example.

Rick

Article: 153963
> That eliminates all confusions introduced by VHDL ambiguous definition of one process implying more than 3 different things that I don't see any end-light in the discussion.

You seem to be the only one that has stated any confusion; everybody else seems OK that a process can be used to describe different hardware. The languages are meant to describe hardware, and both do exactly that. The source of the confusion that you seem to think exists is only in the lack of ability to slap a particular label on the process. Why you see some utility in such a label is of course your concern, but the vast majority of those who use the language are interested in one of the following two areas:

- Describing a function that needs to meet a performance goal (designers) or testing such a design (verifiers)
- Interpreting the language and performing a completely computer-based function with no direct connection to any specific hardware (simulation and synthesis tool writers)

What you describe as a confusion is of no consequence to either of these classes of users of the language. These groups probably account for nearly all users of the languages. Outside of those groups of people, it probably doesn't much matter what someone might think.

> In DNA world, there are only 4 gene bases: The four bases found in DNA are adenine (abbreviated A), cytosine (C), guanine (G) and thymine (T).
>
> Over there people in DNA world always talk about A, C, G and T.

Is this relevant? In the English-speaking world, they talk about 26 letters... that would be just as relevant.

> In digital electric world there are only 3 basic circuits that people are interested in: ff, latch and combinatorial. By combinatorial signal it means it is neither a ff nor a latch.

There is another one that is important and widely used that you have not considered (or at least you haven't mentioned).

> I really recommend that VHDL-201x use 3 process type definitions to replace one ambiguous process definition: process_com, process_latch and process_ff.
>
> Each type generates only one type of logic. When process_com misses an assignment, it would generate an error. So no any confusion will be introduced.

Assuming the 'process_xxx' to be optional, I suppose there may be some value in having an error declared if somebody describes a hunk of logic but declares it to be something else. Maybe it would help those folks who today write combinatorial processes and ignore the warnings that are already being reported about latches being created and sensitivity lists being expanded. For those of us that use clocked processes nearly exclusively, it would not be of any value, so as long as you don't burden us with requiring a label, OK.

Kevin Jennings

Article: 153964
On Tuesday, 3 July 2012 03:43:47 UTC+2, Weng Tianxiang wrote:
> >> That's why Weng's fundamental premise of how to define a
> >> 'combinatorial process' was inherently flawed. Whether or not a
> >> given process implements memory or not cannot be determined simply
> >> from the process so any attempt to do so will ultimately fail.
> >> Kevin
>
> >This line of reasoning would apply to any type of logic
> >description, not just VHDL or Verilog. It would imply that
> >the use of the word "combinatorial" is meaningless, which
> >is rather absurd.
>
> > Jan
>
> >A different approach has been taken in SystemVerilog,
> > where keywords always_comb, always_latch, and always_ff
> > have been introduced to allow the designer to describe
> > design intent in the simulation model.
>
> > Alan
>
> That is why I say SystemVerilog does a correct and better thing than VHDL: It introduces 3 types of basic circuits in digital world: always_comb, always_latch, and always_ff.
>
> That eliminates all confusions introduced by VHDL ambiguous definition of one process implying more than 3 different things that I don't see any end-light in the discussion.
>
> In DNA world, there are only 4 gene bases: The four bases found in DNA are adenine (abbreviated A), cytosine (C), guanine (G) and thymine (T).
>
> Over there people in DNA world always talk about A, C, G and T.
>
> In digital electric world there are only 3 basic circuits that people are interested in: ff, latch and combinatorial. By combinatorial signal it means it is neither a ff nor a latch.
>
> After realizing that combinatorial is fully excluded in VHDL-2008, I abandon any attempts to use it accurately. So I am finally easy with process(all) specification.
>
> I really recommend that VHDL-201x use 3 process type definitions to replace one ambiguous process definition: process_com, process_latch and process_ff.
>
> Each type generates only one type of logic. When process_com misses an assignment, it would generate an error. So no any confusion will be introduced.
>
> Weng

Hi Weng,
I think it's a matter of how one thinks about it. What you call "ambiguous" is seen by others (including me) as multivalent.
Also, limiting processes to specialized types would be a step back (as already mentioned) and would just make things unnecessarily complicated.
When a process is intended to generate only logic, only latches, or only FFs, excluding everything else, this is like it was in PLD times. There you had languages like ABEL or Log/IC which had barely just two kinds of assignments, registered and unregistered. At least you could write some boolean equation behind a registered assignment. But as I understand you, that would be forbidden in a process_ff, because no logic is allowed there, or is it?

I'm missing one person in this discussion, whose name just doesn't come to my mind at the moment, but I think everyone will know who I'm talking about. That guy preferred a design style that had everything in one process per architecture. He makes heavy use of variables, functions and procedures to keep his code structured.
He too would probably be asking: "What's the gain of your proposal?"
Just having the opportunity to get some dedicated error messages instead of just warnings when latches occur unintentionally?

If this is really needed by designers and the toolmakers see a market for this feature, some attributes/pragmas would achieve the same, and they have no impact on the grammar.
e.g. (done with concurrent assignments to save lines)

-- pragma combinatorical begin
y <= a nand b; -- would be accepted
q <= a when rising_edge(clk); -- a simple D-FF would be rejected here and give an error
-- pragma combinatorical end

Similar things could be done for latch and FFs.
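For readers who have not run into the "latch by accident" case that the proposed error would catch, here is a minimal sketch. The entity and signal names are made up for illustration and do not come from anyone's design in this thread; it only assumes standard VHDL-93 and typical synthesis behaviour.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, for illustration only.
entity comb_vs_latch is
  port (sel, a, b : in  std_logic;
        y_latch   : out std_logic;
        y_comb    : out std_logic);
end entity comb_vs_latch;

architecture sketch of comb_vs_latch is
begin
  -- No assignment when sel = '0': y_latch has to hold its old value,
  -- so synthesis tools typically infer a transparent latch (with a warning).
  p_latch : process (sel, a)
  begin
    if sel = '1' then
      y_latch <= a;
    end if;
  end process p_latch;

  -- Default assignment first: y_comb is driven on every path, so this
  -- is a plain 2:1 mux, provided nothing outside feeds y_comb back to a or b.
  p_comb : process (sel, a, b)
  begin
    y_comb <= b;
    if sel = '1' then
      y_comb <= a;
    end if;
  end process p_comb;
end architecture sketch;
```

Both processes are legal VHDL today; the difference shows up only as a synthesis warning, which is the situation the pragma idea is meant to turn into an error.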
The tools would need the same code as for your proposal to identify the forbidden situations, but the basic language definition for a process could still be pure and ... multivalent. ;-)

By the way, when I saw the SystemVerilog always_comb etc. keywords for the first time it made me grin and shake my head. Later on I thought "OK, this might be useful for Verilog, where you have stuff like always @(...posedge reset) which could be misinterpreted as an edge-sensitive input. VHDL does not have this ambiguity, so luckily it's not needed there".

Have a nice synthesis
Eilert

Article: 153965
Dear All,

Thank you very much for your input and sorry for the late reply. It is really great to be able to get the opinion of such experts, especially since, at my current location and in a radius of some 200 km, I must be the only person working with FPGA and VHDL! I'm also glad that the discussion has evolved to levels of complexity far beyond my knowledge.

I was hoping that by now I would be able to say that the thing was working as expected but, unfortunately, no.

I've synchronized the enable signal, as suggested by Ed and Tim, using 3 FFs (I'm not paranoid, I just have room). Also, following Brian's suggestions, I've cleaned up the code regarding type conversions. All this has allowed me to isolate the remaining source of error, thank you very much.

Here's the full story: I'm implementing a gated integrator, as a part of a boxcar averager. This is the standard noise reduction technique used in nuclear magnetic resonance (NMR). This is research, not a commercial product! The module gets its data from four 8-bit ADCs at 5 MHz (adc0, adc90, adc180, adc270) and accumulates while enb=1. enb is generated in a different module. The module does this:
1 - generates the acquisition clock (adc_clk) by division by 10 of the S3-SKB 50 MHz main clock
2 - generates the accumulation clock (acc_clk) by inverting adc_clk. In this way, there is a delay of 100 ns from the moment the ADCs receive the rising edge of the clock to the moment when the data gets registered at the output.
3 - converts the data from the ADCs to excess 128 (bipolar ADC) and extends to 32 bit signed
4 - calculates u = adc0-adc180 and v = adc90-adc270. u and v go through a switch and emerge as r and i, to be delivered to two identical accumulators.
Of course, 3 and 4 must occur in less than 100 ns.

The switch unit is very simple: it has a control signal, s[1:0], that comes from a different module, and the following table: 00 -> r=u, i=v; 01 -> r=v, i=-u; 10 -> r=-v, i=u; 11 -> r=-v, i=-u. The s signal is generated in a different clock domain and is stable 500 us before the enb. enb has a typical duration of 10 us. The code is at the end of this message.

I continue to get errors, especially when the input values are close to zero, which means that the result is changing from, say, FFFFFFFF to 00000001, so lots of bits to change.

I have (I think!) traced the source of error to the switch_unit because, if I tie the s signal to a fixed value, 11 for example, the unit works well, but if I connect it to a real s signal, I get errors. So I thought, this must be because the real s is noisy and r and i change during the acquisition period (1mm ns) so I have synchronized s with acc_clk, but the problem persists. What is more strange is that, if I do s <= "01" inside the synchronization process, I also get the same type of errors.

Really, I don't know what to do next.

jmariano

=================
architecture archi of int_su is
begin
  process(u, v, s)
  begin
    case s is
      when "00" =>
        r <= u;
        i <= v;
      when "01" =>
        r <= v;
        i <= -u;
      when "10" =>
        r <= -u;
        i <= -v;
      when "11" =>
        r <= -v;
        i <= u;
      when others =>
        r <= (others => 'X');
        i <= (others => 'X');
    end case;
  end process;
end archi;
============

Article: 153966
> I'm wondering if it wouldn't be faster to compare on zero, load with 100,
> and count down.

Don't compare at all. Use a signed count value with one extra bit, start at N-2, count down, and stop if the highest bit is set (or reload in your case).

> or do i need to write a fsm for such applications?

Of course you do (and you did). Anything involving a delay must have state (continuous or discrete) and if you can build it, it also is finite.

Kolja Sulimma
www.cronologic.de

Article: 153967
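The count-down-to-sign-bit idea from the post above can be sketched roughly as follows. The original question was in Verilog; this is a VHDL transcription, and the entity name, port names and the default value of N are assumptions made for the example, not taken from the poster's design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical strobe stretcher: strobe goes high on start and drops
-- again about N clock cycles later, with no magnitude comparator.
entity strobe_timer is
  generic (N : integer range 2 to 129 := 101);  -- illustrative delay in clocks
  port (clk    : in  std_logic;
        start  : in  std_logic;   -- one-cycle request to raise the strobe
        strobe : out std_logic);
end entity strobe_timer;

architecture sketch of strobe_timer is
  -- 7 magnitude bits (0 to 127) plus one extra bit used as the sign.
  signal cnt  : signed(7 downto 0) := (others => '0');
  signal busy : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if start = '1' then
        busy <= '1';
        cnt  <= to_signed(N - 2, cnt'length);  -- load N-2 as suggested
      elsif busy = '1' then
        if cnt(cnt'high) = '1' then  -- count went negative: time is up
          busy <= '0';
        else
          cnt <= cnt - 1;
        end if;
      end if;
    end if;
  end process;

  strobe <= busy;
end architecture sketch;
```

The only "comparison" left is a test of a single bit, which is the point of the trick; the price is one extra register bit.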
On Jul 1, 6:35 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> rickman <gnu...@gmail.com> wrote:
>
> (snip, I wrote)
>
> >> Normally you can't make a traditional edge triggered FF out of
> >> ordinary gates. You can make a transparent latch, and some other
> >> devices with state.
>
> (snip)
>
> > But an edge triggered ff is just two sequential latches with
> > opposite level sensitivities, no?
>
> I don't believe that is quite enough. If you read
>
> http://en.wikipedia.org/wiki/Flip-flop_%28electronics%29#Master.E2.80...
>
> It says:
>
> "Nearly simultaneously, the twice inverted "enable"
> of the second or "slave" D latch transitions from
> low to high (0 to 1) with the clock signal."
>
> Now, how do you implement "nearly simultaneously"?
>
> It depends in a complicated way on the timing through the second
> inverter, and the timing through the master section. Even more,
> it depends on the timing being different for rising and falling
> edges of the clock.
>
> In my first digital electronics (lecture) class, (CS/EE4)
> all the class discussion and demonstrations were done
> with a two-phase clock. (Must not be so bad, many microprocessors
> use a one, and some even a four-phase clock.)
>
> With two phases, you use one for the master and one for the
> slave, and be sure that there is no overlap between them.
>
> As we were going to have to work with TTL, where FFs only have
> a single clock input, the lecturer explained that they work
> with two different thresholds on the clock input.
>
> I never tried to understand that detail since.
>
> In any case, with an FPGA I don't believe that you can get the
> timing close enough to separate the master and slave clocks.
> Maybe if built from discrete gates or transistors.
>
> -- glen

I'm not sure what point you are trying to make. The issue at hand is whether the tools should be able to infer an edge triggered ff from an HDL combinatorial description of the gates that make up such a device. To make a gate design work correctly requires careful analysis of delays to preclude the two enables from overlapping. However, if I describe two latches with a proper logical description, why can't the tool infer a ff in an FPGA? Logically it is correct and it will give the correct behavior in simulation. Isn't that what really matters, not whether the tool can design an edge sensitive ff from LUTs but if the description is adequate to infer a ff?

Rick

Article: 153968
On Jul 5, 7:44 am, jmariano <jmarian...@gmail.com> wrote:
> Dear All,
>
> Thank you very much for your input and sorry for the late reply. It is really great to be able to get the opinion of such experts, especially since, at my current location and in a radius of some 200 km, I must be the only person working with FPGA and VHDL! I'm also glad that the discussion has evolved to levels of complexity far beyond my knowledge.
>
> I was hoping that by now I would be able to say that the thing was working as expected but, unfortunately, no.
>
> I've synchronized the enable signal, as suggested by Ed and Tim, using 3 FFs (I'm not paranoid, I just have room). Also, following Brian's suggestions, I've cleaned up the code regarding type conversions. All this has allowed me to isolate the remaining source of error, thank you very much.
>
> Here's the full story: I'm implementing a gated integrator, as a part of a boxcar averager. This is the standard noise reduction technique used in nuclear magnetic resonance (NMR). This is research, not a commercial product! The module gets its data from four 8-bit ADCs at 5 MHz (adc0, adc90, adc180, adc270) and accumulates while enb=1. enb is generated in a different module. The module does this:
> 1 - generates the acquisition clock (adc_clk) by division by 10 of the S3-SKB 50 MHz main clock
> 2 - generates the accumulation clock (acc_clk) by inverting adc_clk. In this way, there is a delay of 100 ns from the moment the ADCs receive the rising edge of the clock to the moment when the data gets registered at the output.
> 3 - converts the data from the ADCs to excess 128 (bipolar ADC) and extends to 32 bit signed
> 4 - calculates u = adc0-adc180 and v = adc90-adc270. u and v go through a switch and emerge as r and i, to be delivered to two identical accumulators.
> Of course, 3 and 4 must occur in less than 100 ns.
>
> The switch unit is very simple: it has a control signal, s[1:0], that comes from a different module, and the following table: 00 -> r=u, i=v; 01 -> r=v, i=-u; 10 -> r=-v, i=u; 11 -> r=-v, i=-u. The s signal is generated in a different clock domain and is stable 500 us before the enb. enb has a typical duration of 10 us. The code is at the end of this message.
>
> I continue to get errors, especially when the input values are close to zero, which means that the result is changing from, say, FFFFFFFF to 00000001, so lots of bits to change.
>
> I have (I think!) traced the source of error to the switch_unit because, if I tie the s signal to a fixed value, 11 for example, the unit works well, but if I connect it to a real s signal, I get errors. So I thought, this must be because the real s is noisy and r and i change during the acquisition period (1mm ns) so I have synchronized s with acc_clk, but the problem persists. What is more strange is that, if I do s <= "01" inside the synchronization process, I also get the same type of errors.
>
> Really, I don't know what to do next.
>
> jmariano
>
> =================
> architecture archi of int_su is
> begin
>   process(u, v, s)
>   begin
>     case s is
>       when "00" =>
>         r <= u;
>         i <= v;
>       when "01" =>
>         r <= v;
>         i <= -u;
>       when "10" =>
>         r <= -u;
>         i <= -v;
>       when "11" =>
>         r <= -v;
>         i <= u;
>       when others =>
>         r <= (others => 'X');
>         i <= (others => 'X');
>     end case;
>   end process;
> end archi;
> ============

I'm not real clear on your description of your design, but if you are really generating clocks from the 50 MHz, I recommend that inside the FPGA you instead use a single clock and generate clock enables for the various functions. When you use multiple clocks in a circuit you have to do extra work for every signal that crosses a clock domain. Could that be your problem?

I don't see anything in your original post about simulation. Do you simulate your modules? I highly recommend that you write a test bench for each and every module you code. You may think this takes too much time, but I believe it pays off in the end with shorter integration time.

Rick

Article: 153969
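As an illustration of the "single clock plus clock enables" recommendation in Rick's post above, here is a minimal sketch of a divide-by-10 enable generator. The entity and signal names (ce_div10, clk50, ce5) are invented for the example; nothing here is taken from the actual design under discussion.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical enable generator: instead of creating a second 5 MHz
-- clock, pulse ce5 high for one 50 MHz cycle out of every ten.
entity ce_div10 is
  port (clk50 : in  std_logic;   -- 50 MHz system clock
        ce5   : out std_logic);  -- one-cycle-wide enable at a 5 MHz rate
end entity ce_div10;

architecture sketch of ce_div10 is
  signal cnt : integer range 0 to 9 := 0;
begin
  process (clk50)
  begin
    if rising_edge(clk50) then
      if cnt = 9 then
        cnt <= 0;
        ce5 <= '1';
      else
        cnt <= cnt + 1;
        ce5 <= '0';
      end if;
    end if;
  end process;
end architecture sketch;
```

Internal logic such as the accumulator would then be clocked by clk50 and qualified with something like "if ce5 = '1' then ...", so every register stays in one clock domain. A device outside the FPGA that needs a real clock pin is a separate matter that this sketch does not address.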
On Wednesday, July 4, 2012 12:49:07 PM UTC-7, rickman wrote:
> On Jul 3, 5:45 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> > On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote:
> > > On Mon, 02 Jul 2012 17:19:59 -0700, Ed McGettigan wrote:
> > >
> > > > On Jul 2, 4:20 pm, jmariano <jmarian...@gmail.com> wrote:
> > > >> Dear All,
> > > >>
> > > >> I'm not an expert in VHDL, I'm just curious, trying to solve a research problem with an FPGA.
> > > >>
> > > >> I'm using a 32 bit accumulator in an IP, as part of a SoC project with a microblaze, implemented in a Digilent Spartan-3 SKB (the FPGA is a Xilinx XC3S200). The code is included at the end of this message. The input is a 32 bit signed integer coded in two's complement and the output also a 32 bit signed integer. What I would like the accumulator to do is to accumulate synchronously with the rising edge of clk when enb=1 and maintain the result stable at the output when enb=0 (enb is an asynchronous signal generated elsewhere in the system)
> > > >>
> > > >> But it does not work in this way, it behaves in a strange manner...
> > > >>
> > > >> Sometimes I get the expected results but often I get strange values (large when they should be small, often negative instead of positive, etc.). If I look at the binary representation of the output, it looks as if the output didn't have time to sum and propagate to the output again. In fact, the post place and route simulation shows that when the enb signal goes to 0, the output stays in an undetermined condition (you know, red line with XXXX).
> > > >>
> > > >> I'm guessing I'm making a very basic mistake that has something to do with the timing of the enb signal, but after 3 days banging my head against the wall, all I have is a monumental headache.
> > > >>
> > > >> Can some kind soul help me with this?
> > > >>
> > > >> jmariano
> > > >>
> > > >> ================
> > > >> library ieee;
> > > >> use ieee.std_logic_1164.all;
> > > >> use ieee.numeric_std.all;
> > > >>
> > > >> entity int_accum is
> > > >>   port (clk : in  std_logic;
> > > >>         clr : in  std_logic;
> > > >>         enb : in  std_logic;
> > > >>         d   : in  std_logic_vector(31 downto 0);
> > > >>         ovf : out std_logic;  -- overflow
> > > >>         q   : out std_logic_vector(31 downto 0));
> > > >> end int_accum;
> > > >>
> > > >> architecture archi of int_accum is
> > > >>
> > > >>   signal tmp : signed(32 downto 0);
> > > >>
> > > >> begin
> > > >>
> > > >>   process(clk, clr)
> > > >>   begin
> > > >>     if (clr = '1') then
> > > >>       tmp <= (others => '0');
> > > >>     elsif (rising_edge (clk)) then
> > > >>       if (enb = '1') then
> > > >>         -- The result of the adder will be on 33 bits to keep the carry
> > > >>         tmp <= tmp + signed ('0' & d);
> > > >>       end if;
> > > >>     end if;
> > > >>   end process;
> > > >>
> > > >>   -- The carry is extracted from the most significant bit of the result
> > > >>   ovf <= tmp(32);
> > > >>
> > > >>   -- The q output is the 32 least significant bits of sum
> > > >>   q <= std_logic_vector (tmp(31 downto 0));
> > > >>
> > > >> end archi;
> > > >
> > > > This is the key to your problem:
> > > >
> > > >> enb is an asynchronous signal generated elsewhere in the system
> > > >
> > > > You can't expect to take an asynchronous signal into multiple (32 in this case) registers in a synchronous domain and expect that it will work reliably. You need to first synchronize the asynchronous input to the synchronous clock domain before you can use it.
> > >
> > > Which means that you should latch enb in a register, with the same clock that you're using to twiddle your accumulator, and use the output of that register as your enable signal.
> > >
> > > Paranoid logic designers will have a string of two or three registers to avoid metastability, but I've been told that's not necessary. (I'm not much of a logic designer).
> > >
> > > --
> > > Tim Wescott
> > > Control system and signal processing consulting
> > > www.wescottdesign.com
> >
> > It isn't just the paranoid logic designer, it should be every logic designer.
> >
> > A single register only partially solves the problem of an asynchronous input with multiple register destinations, but it does not solve the very real metastability problem. At least two registers should be used to ensure that the metastability condition has resolved, and with increasing clock frequency and finer process nodes using three or more stages may be necessary.
> >
> > Ed McGettigan
> > --
> > Xilinx Inc.
>
> Hi Ed. The way it was explained to me, I believe by Peter Alfke, is that what really resolves metastability is the slack time in a register to register path. Over the years FPGA process has resulted in FFs which only need a couple of ns to resolve metastability to 1 in a million operation years or something like that (I don't remember the metric, but it was good enough for anything I do). It doesn't matter that you have logic in that path, you just need those few ns in every part of the path. In theory, even if you use multiple registers with no logic, what really matters is the slack time in the path and that is not guaranteed even with no logic. So the design protocol should be to ensure that the paths from the input register to all subsequent registers have sufficient slack time.
>
> Do you remember how much time that needs to be? I want to say 2 ns, but it might be more like 5 ns, I just can't recall. Of course it depends on your clock rates, but I believe Peter picked some more aggressive speeds like 100 MHz for his example.
>
> Rick

I'm glad to see that one of my 5-6 attempts to post was finally accepted by Google. I have got to switch to something else.

Peter Alfke's publications on metastability definitely fall into the seminal category, but you must be careful when extrapolating the original data to the latest technology nodes, circuits and design requirements. There are two major factors that impact the metastability equations: the tau, or metastability decay rate, and the settling time.

The tau value is an inherent characteristic of the circuit and technology node, and for a long time the expectation was that it would decrease with each generation, but this has stopped being true.

The settling time, Ts, is dependent on the design and is under the user's control. Ts is a function of the destination clock frequency and the timing slack between registers. If you have a 100 MHz clock frequency, but you use up 9.5 ns to get to the destination, your slack is only 500 ps. Adding register stages allows for maximum use of the clock period, increasing the settling time, and for each stage it increases again.

Ed McGettigan
--
Xilinx Inc.

Article: 153970
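The multi-stage synchronizer discussed in the post above might look roughly like the following. This is a generic sketch, not code from the thread: the entity name, port names and the STAGES generic are assumptions, and vendor-specific attributes that keep the flip-flops close together are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical single-bit synchronizer with a configurable number of
-- back-to-back registers and no logic between them.
entity sync_bit is
  generic (STAGES : integer range 2 to 8 := 2);  -- assumed default of two
  port (clk      : in  std_logic;   -- destination-domain clock
        async_in : in  std_logic;   -- e.g. the asynchronous enb
        sync_out : out std_logic);  -- safe to fan out to many registers
end entity sync_bit;

architecture sketch of sync_bit is
  signal chain : std_logic_vector(STAGES - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Shift the input along the chain; only the last stage is used,
      -- so each earlier stage gets nearly a full clock period to settle.
      chain <= chain(STAGES - 2 downto 0) & async_in;
    end if;
  end process;

  sync_out <= chain(STAGES - 1);
end architecture sketch;
```

With no logic between the stages, almost the whole clock period is available as settling time for each register, which is the effect described above. The accumulator's enable would then be driven by sync_out rather than by the raw asynchronous signal.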
Hi Rick, thanks for your help.

> I'm not real clear on your description of your design, but if you are
> really generating clocks from the 50 MHz, I recommend that inside the
> FPGA you instead use a single clock and generate clock enables for the
> various functions.

Yes, I generate a 5 MHz clock inside the module from the main 50 MHz clock by simple division by 10 because I need a 5 MHz ADC clock. I can't use a clock enable because the AD9058 ADC does not have an enable input, just a clock.

> When you use multiple clocks in a circuit you have
> to do extra work for every signal that crosses a clock domain. Could
> that be your problem?

What is the extra work? I have no idea! Synchronization?

> I don't see anything in your original post about simulation. Do you
> simulate your modules? I highly recommend that you write a test
> bench for each and every module you code. You may think this takes
> too much time, but I believe it pays off in the end with shorter
> integration time.

Sorry about that, I did, in fact, simulate each module and the top entity. The behavioral simulation gives the expected results; the post place and route simulation gives the same errors that I could not understand, but I'll run the simulations again and post the results here.

jmariano

Article: 153971
On Jul 5, 11:04 am, jmariano <jmarian...@gmail.com> wrote:
> Hi Rick, thanks for your help.
>
> > I'm not real clear on your description of your design, but if you are
> > really generating clocks from the 50 MHz, I recommend that inside the
> > FPGA you instead use a single clock and generate clock enables for the
> > various functions.
>
> Yes, I generate a 5 MHz clock inside the module from the main 50 MHz clock by simple division by 10 because I need a 5 MHz ADC clock. I can't use a clock enable because the AD9058 ADC does not have an enable input, just a clock.
>
> > When you use multiple clocks in a circuit you have
> > to do extra work for every signal that crosses a clock domain. Could
> > that be your problem?
>
> What is the extra work? I have no idea! Synchronization?
>
> > I don't see anything in your original post about simulation. Do you
> > simulate your modules? I highly recommend that you write a test
> > bench for each and every module you code. You may think this takes
> > too much time, but I believe it pays off in the end with shorter
> > integration time.
>
> Sorry about that, I did, in fact, simulate each module and the top entity. The behavioral simulation gives the expected results; the post place and route simulation gives the same errors that I could not understand, but I'll run the simulations again and post the results here.
>
> jmariano

The good news here is that you have a simulation that shows the same behavior as the hardware. Looking at these simulation runs should tell you exactly what the problem is. I don't think that anyone here will be able to do the same with the full source code for the design.

Ed McGettigan
--
Xilinx Inc.

Article: 153972
On Jul 5, 2:03 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> On Wednesday, July 4, 2012 12:49:07 PM UTC-7, rickman wrote:
> > On Jul 3, 5:45 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> > > On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote:
> > > >
> > > > Paranoid logic designers will have a string of two or three registers to avoid metastability, but I've been told that's not necessary. (I'm not much of a logic designer).
> > > >
> > > > --
> > > > Tim Wescott
> > > > Control system and signal processing consulting
> > > > www.wescottdesign.com
> > >
> > > It isn't just the paranoid logic designer, it should be every logic designer.
> > >
> > > A single register only partially solves the problem of an asynchronous input with multiple register destinations, but it does not solve the very real metastability problem. At least two registers should be used to ensure that the metastability condition has resolved, and with increasing clock frequency and finer process nodes using three or more stages may be necessary.
> > >
> > > Ed McGettigan
> > > --
> > > Xilinx Inc.
> >
> > Hi Ed. The way it was explained to me, I believe by Peter Alfke, is that what really resolves metastability is the slack time in a register to register path. Over the years FPGA process has resulted in FFs which only need a couple of ns to resolve metastability to 1 in a million operation years or something like that (I don't remember the metric, but it was good enough for anything I do). It doesn't matter that you have logic in that path, you just need those few ns in every part of the path. In theory, even if you use multiple registers with no logic, what really matters is the slack time in the path and that is not guaranteed even with no logic. So the design protocol should be to ensure that the paths from the input register to all subsequent registers have sufficient slack time.
> >
> > Do you remember how much time that needs to be? I want to say 2 ns, but it might be more like 5 ns, I just can't recall. Of course it depends on your clock rates, but I believe Peter picked some more aggressive speeds like 100 MHz for his example.
> >
> > Rick
>
> I'm glad to see that one of my 5-6 attempts to post was finally accepted by Google. I have got to switch to something else.
>
> Peter Alfke's publications on metastability definitely fall into the seminal category, but you must be careful when extrapolating the original data to the latest technology nodes, circuits and design requirements. There are two major factors that impact the metastability equations: the tau, or metastability decay rate, and the settling time.
>
> The tau value is an inherent characteristic of the circuit and technology node, and for a long time the expectation was that it would decrease with each generation, but this has stopped being true.
>
> The settling time, Ts, is dependent on the design and is under the user's control. Ts is a function of the destination clock frequency and the timing slack between registers. If you have a 100 MHz clock frequency, but you use up 9.5 ns to get to the destination, your slack is only 500 ps. Adding register stages allows for maximum use of the clock period, increasing the settling time, and for each stage it increases again.
>
> Ed McGettigan
> --
> Xilinx Inc.

The info I am referring to is from posts that were made here and pertained to the "current" generation of some six or eight years ago. At that time Peter made the point that the "tau", as you call it, had gotten so fast that the impact was negligible for all but the most stringent designs and only a small amount of slack time is needed. A quick search found these two posts about V2Pro devices. I assume your newer devices are at least as good as 10 year old technology.
Note that Peter makes a point that the capture window T0, which is a
product in the formula, is not an important parameter. Tau is an
exponent (in ratio with Tslack) in the formula and so makes a much larger
contribution to the result. The same is true for the two clock
frequencies: they are just products in the formula and so don't make
huge changes to the MTBF.

So it seems like not much would have changed in 10 years in how a
designer should deal with metastability. Leaving 2 ns of slack time
in the first register to register path should make literally all
designs extremely robust regardless of how many registers are
receiving the first register output or if there is logic in the path.
Just make sure there is 2 ns slack time and your designs should be
good for many, many years!

Rick

====================================
Peter Alfke
comp.arch.fpga
Oct 10 2002, 8:40 pm

You mentioned metastability, and that caught my attention.
Metastability is a reality, but it (and the fear of it) is highly
overrated.
We recently tested Virtex-IIPro flip-flops, made on 130 nm technology.
You might call that cutting edge technology, but not exotic.
When a 330 MHz clock synchronized a ~50 MHz input, there was a 200 ps
extra metastable delay (causing a clock-to-out + short routing + set-up
total of 1.5 ns) once every second.
That translates into a metastable capture window that has a width of
3 ns divided by 100 million (since we looked at both edges of the
50 MHz signal). So the window for a 200 ps extra delay is 0.03
femtoseconds.
If you can tolerate 500 ps more, the MTBF increases 100 000 times, and
the capture window gets that much smaller.

Metastability is a real, but highly overrated problem.
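A rough check of the capture-window arithmetic in the post above, as a minimal Python sketch that is not part of the original thread. It assumes each input transition lands uniformly at random within one clock period, so that the failure rate equals (transitions per second) x (window / clock period). The function and parameter names are invented for illustration.

```python
def capture_window_seconds(clock_hz, data_hz, failures_per_second):
    """Effective metastable capture window implied by a measured failure rate.

    Assumption (not stated in the post): every input edge lands uniformly
    within one clock period, and both edges of the data signal count.
    """
    clock_period = 1.0 / clock_hz
    transitions_per_second = 2.0 * data_hz  # rising and falling edges
    return failures_per_second * clock_period / transitions_per_second


# Figures quoted in the post: 330 MHz clock, ~50 MHz input, one event per second.
window = capture_window_seconds(330e6, 50e6, 1.0)
window_fs = window * 1e15  # seconds to femtoseconds
print(f"capture window = {window_fs:.3f} fs")
```

Under these assumptions the result comes out at roughly 0.03 fs, in line with the figure quoted in the post.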
Peter Alfke, Xilinx Applications
====================================

====================================
Peter Alfke
comp.arch.fpga
Oct 15 2002, 1:11 pm

Here are the K2 values for Virtex-IIPro:

CLB @1.50V: K2 = 27.2,  i.e. 1/K2 = tau = 36.8 picoseconds
CLB @1.35V: K2 = 23.3,  i.e. 1/K2 = tau = 42.9 picoseconds
CLB @1.65V: K2 = 35.7,  i.e. 1/K2 = tau = 28.0 picoseconds

IOB @1.50V: K2 = 24.4,  i.e. 1/K2 = tau = 41.0 picoseconds
IOB @1.35V: K2 = 19.24, i.e. 1/K2 = tau = 52.0 picoseconds
IOB @1.65V: K2 = 44.05, i.e. 1/K2 = tau = 22.7 picoseconds

For each extra 100 ps of acceptable metastable delay, the MTBF
increases by a factor 10.3 for CLB @ 1.35 V, or a factor 6.85 for IOB
@ 1.35 V. Much better values, of course, at nominal or high Vcc.
Click on
http://support.xilinx.com/support/techxclusives/techX-home.htm
in early November.

Here is the worst-case data point:
50 MHz asynchronous data rate, 330 MHz clock, single-stage
synchronizer in IOB, Vcc = 1.35 V:
clock-to-Q + short routing + set-up time + metastable delay exceeds
clock period once per 30,000 years.
At nominal Vcc: once per 100 million years.
At a 250 MHz clock rate, delay exceeds clock period less often than
once per billion years.

Peter Alfke, Xilinx Applications
====================================
Article: 153973
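The K2 figures in Peter Alfke's Oct 15 2002 post quoted above can be turned into "MTBF gain per extra settling time" with a few lines of Python. This is only a sketch and not something posted in the thread: it assumes the usual exponential model, MTBF proportional to exp(t / tau), with K2 expressed in 1/ns. The helper names are hypothetical.

```python
import math


def tau_ps(k2_per_ns):
    """Resolution time constant in picoseconds, from K2 given in 1/ns."""
    return 1000.0 / k2_per_ns


def mtbf_gain(extra_delay_ps, k2_per_ns):
    """Factor by which MTBF grows when extra_delay_ps more settling time is allowed.

    Assumes the exponential model MTBF ~ exp(t / tau); this is an assumption
    of the sketch, consistent with the factors quoted in the post.
    """
    return math.exp(extra_delay_ps / tau_ps(k2_per_ns))


# K2 values quoted for Virtex-IIPro at the low-voltage corner (1.35 V).
K2_CLB_1V35 = 23.3
K2_IOB_1V35 = 19.24

for name, k2 in (("CLB", K2_CLB_1V35), ("IOB", K2_IOB_1V35)):
    print(f"{name} @1.35V: tau = {tau_ps(k2):.1f} ps, "
          f"x{mtbf_gain(100, k2):.2f} per 100 ps, "
          f"x{mtbf_gain(500, k2):.3g} per 500 ps")
```

With tau around 43 ps, 500 ps of extra settling time gives a gain on the order of 10^5, which agrees with the "100 000 times" remark in the Oct 10 post. It also shows why the exponent dominates the product terms.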
On Jul 5, 8:04 pm, jmariano <jmarian...@gmail.com> wrote:
> Hi Rick, thanks for your help.
>
> > I'm not real clear on your description of your design, but if you are
> > really generating clocks from the 50 MHz, I recommend that inside the
> > FPGA you instead use a single clock and generate clock enables for the
> > various functions.
>
> Yes, I generate a 5 MHz clock inside the module from the main 50 MHz clock
> by simple division by 10 because I need a 5 MHz adc clock. I can't use
> clock enable because the AD9058 adc does not have an enable input, just clock.
>

You could just have a state machine running at 50 MHz that grabs data and
sets/clears the clock, which I guess is partly what you have in your
divide by 10.

-Lasse
Article: 153974
On 05.07.2012 17:37, Kolja Sulimma wrote:
>
>> I'm wondering if it wouldn't be faster to compare on zero, load with 100,
>> and count down.
>
> Don't compare at all.
> Use a signed count value with one extra bit, start at N-2, count down,
> and stop if highest bit is set (or reload in your case).

In principle, this should not make a difference. The counter must compare
the lower bits anyway to decide whether to toggle the higher bits.

Thomas
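The sign-bit scheme quoted above can be checked with a small behavioural model. This is a Python sketch only, not HDL and not from the thread. It models a two's-complement register with one extra bit, loaded with N-2 and decremented until the top bit is set. The function name and the choice of register width are assumptions made for illustration, and the model says nothing about timing or logic usage, which is the point Thomas raises.

```python
def sign_bit_counter(n, cycles):
    """Yield True on each cycle where the counter's top (sign) bit is set.

    Models a down-counter loaded with n - 2 that reloads when it wraps
    below zero, so the terminal test is a single bit, not a comparison.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    width = (n - 2).bit_length() + 1  # magnitude bits plus one sign bit
    mask = (1 << width) - 1
    msb = 1 << (width - 1)
    count = n - 2
    for _ in range(cycles):
        done = bool(count & msb)  # only the top bit is examined
        yield done
        # Reload on the terminal cycle, otherwise decrement with wrap-around.
        count = (n - 2) if done else (count - 1) & mask


# The terminal condition should recur once every n cycles.
strobes = [i for i, done in enumerate(sign_bit_counter(100, 300)) if done]
print(strobes)
```

In this model the register steps through N-2, N-3, ..., 0, -1, which is N states in total, so the top bit is seen set once every N cycles.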