Hi guys,

At the moment I'm waiting to find out whether I will be using Xilinx or
Actel for my project, and so I'm putting it together for both just in case.

In the Actel IP cores, there is an array adder which allows a good number
of inputs, and there's some optional pipelining. I figure it's sufficient
to just drop this in and wire up as many inputs as I need.

Xilinx IP cores seem to have only 2-input adders, and I guess these are
probably inferred by XST with the + operator anyway, so I don't want to
bother with the IP core generator unless there's some reason why I should.
Supposing I want:

Result <= A + B + C + D + E;

Note, I used only five inputs in my example for brevity; I will have more
like 25 in my actual system.

(Looking in the XST manual, I can either pad the inputs with leading zeros
or convert to integer and back to std_logic_vector to get carry bits to
fill my wider result.)

At the end of the day, when I synthesize this, would there be any
difference between coding it in stages (adding pairs of two together, then
adding their sums together, and so on until all are added up) and just
putting A+B+C+D+E in one statement? All I can think of is that (depending
how well conversions to/from integer are optimized in XST) I might save a
few bits of space in the first stages. Using the bit-padding method, I
suppose that all of the adders in the first stages would wind up
unnecessarily being the same width as the result.

Anyway, I'm just curious how this will end up working... any insight
appreciated!

Steve
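For illustration, here is a minimal sketch (not Steve's actual code) of the
single-statement form, assuming unsigned 8-bit operands. numeric_std's
resize() does the zero-extension, so no manual padding or integer
round-trip is needed; the 13-bit result width is wider than five operands
strictly require (11 bits would do) but matches the 25-input case discussed
later in the thread.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity add5 is
   port (
      a, b, c, d, e : in  std_logic_vector(7 downto 0);
      result        : out std_logic_vector(12 downto 0)
   );
end entity;

architecture rtl of add5 is
begin
   -- resize() zero-extends each unsigned operand to the result width,
   -- so every partial sum is already carried out at full width.
   result <= std_logic_vector( resize(unsigned(a), 13)
                             + resize(unsigned(b), 13)
                             + resize(unsigned(c), 13)
                             + resize(unsigned(d), 13)
                             + resize(unsigned(e), 13) );
end architecture;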
<sbattazz@yahoo.co.jp> wrote in message
news:2508079e-f147-4e15-b6bd-ac96f220afbd@s1g2000prd.googlegroups.com...

> [original question quoted in full - snipped]

How fast do you need to clock it? How many bits wide is your result?
On May 25, 6:43 pm, "Andrew Holme" <a...@nospam.co.uk> wrote:

> [quoted text snipped]
>
> How fast do you need to clock it? How many bits wide is your result?

Assuming 25 8-bit inputs, the maximum result is 25*255 = 6375, meaning a
13-bit output.

Serial data comes in at 57.6 kilobits/second = 7200 bytes/second, and the
sum of my array is checked once per byte, so there will be a little over
1ms between clock pulses (I can't imagine that being anywhere near playing
with timing issues). For the project I won't need anything any faster than
that.

I'm just wondering how XST would handle such an addition statement with
multiple operands (my synthesis report doesn't say anything about adders).
Is it smart enough to automatically do some kind of tree algorithm, or
would it do a "dumb" array of one adder feeding into the next for each
extra operand?

Thanks for the quick response!

Steve
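For the 25-input case, one compact way to describe the sum is a loop over
the operands, which is equivalent to one long "+" chain in the source and
leaves the adder structure (tree or chain) up to the synthesizer - which is
exactly the open question here. A sketch, assuming the 25 bytes are
presented as a single flattened vector (the entity, generic and signal
names are invented for illustration):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sum25 is
   generic (N : positive := 25);
   port (
      clk    : in  std_logic;
      inputs : in  std_logic_vector(N*8-1 downto 0);  -- N bytes, flattened
      sum    : out std_logic_vector(12 downto 0)      -- 13 bits covers N <= 32
   );
end entity;

architecture rtl of sum25 is
begin
   process (clk)
      variable acc : unsigned(12 downto 0);
   begin
      if rising_edge(clk) then
         acc := (others => '0');
         -- sum all N bytes; the synthesizer decides how to arrange the adders
         for i in 0 to N-1 loop
            acc := acc + unsigned(inputs(i*8+7 downto i*8));
         end loop;
         sum <= std_logic_vector(acc);
      end if;
   end process;
end architecture;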
On May 25, 12:50 am, Kim Enkovaara <kim.enkova...@iki.fi> wrote:
> luudee wrote:
>
> > In my case the MCP is applied to data that travels from one
> > clock domain to another and is properly "latched" by synchronized
> > control logic. There are many good and bad uses for MCP ...
>
> Isn't clock domain crossing usually specified as false path not
> multicycle path.
>
> --Kim

Most static timing tools will not attempt to verify timing on paths between
two unrelated (i.e. asynchronous) clocks. No false path constraint is
required.

Andy
On May 25, 4:56 am, sbatt...@yahoo.co.jp wrote:

> [quoted text snipped]

Do you really need to recompute the entire array sum every time, or can you
compute a running sum (accumulator) as the data comes in? You can also
subtract the last discarded term from your running sum if you are looking
for a continuous N-term running sum (as is used in a boxcar filter, etc.).

As long as integers will handle your data size, you are much better off
using them than padding vectors. Simulations will run much faster, and
there is no hardware associated with conversion from/to
SLV/signed/unsigned to/from integer.

Andy
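A sketch of the running-sum idea, assuming the 25 values are held in a
small shift-register window so each update needs only one adder and one
subtractor instead of a 25-input sum (the port names and the per-byte
strobe are invented for illustration):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity running_sum is
   generic (N : positive := 25);
   port (
      clk      : in  std_logic;
      strobe   : in  std_logic;                      -- one pulse per new byte
      new_term : in  std_logic_vector(7 downto 0);
      sum      : out std_logic_vector(12 downto 0)
   );
end entity;

architecture rtl of running_sum is
   type window_t is array (0 to N-1) of unsigned(7 downto 0);
   signal window : window_t := (others => (others => '0'));
   signal acc    : unsigned(12 downto 0) := (others => '0');
begin
   process (clk)
   begin
      if rising_edge(clk) then
         if strobe = '1' then
            -- add the incoming term and subtract the one leaving the window
            acc    <= acc + unsigned(new_term) - window(N-1);
            window <= unsigned(new_term) & window(0 to N-2);
         end if;
      end if;
   end process;
   sum <= std_logic_vector(acc);
end architecture;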
> 3. Why DSP and Memory are rectangular in shape ?

Do you mean why they are not round?

/Mikhail
I have a problem with a MicroBlaze-based multiprocessor SoC. I have two
MicroBlaze cores joined by FSL links. This design works and the cores can
communicate with each other. But now I am trying to make these two
MicroBlaze cores run from external memory. The linker script associated
with my software applications presents three possible memories: BRAM (that
is, internal RAM), DDR_MEM_0 and DDR_MEM_1. So, is it possible to load each
software application into its own part of the external memory (that is,
microblaze_0_app.elf in DDR_MEM_0, and microblaze_1_app.elf in DDR_MEM_1)?

My best regards

Pablo
Thanks for posting the very useful list of links.

One thing in your code doesn't look right to me. You update TDO on the
rising edge of TCK. This will work in many situations, but the correct
behaviour is to delay the update until the falling edge of TCK.

Best regards,
Marc
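For reference, a minimal sketch (not the posted core) of the convention
Marc describes: capture and shift on the rising edge of TCK, but register
TDO on the falling edge so it is stable when the other end samples it on
the next rising edge. The entity, register width and enable are invented
for illustration.

library ieee;
use ieee.std_logic_1164.all;

entity tap_shift is
   port (
      tck, tdi, shift_en : in  std_logic;
      tdo                : out std_logic
   );
end entity;

architecture rtl of tap_shift is
   signal sr : std_logic_vector(7 downto 0) := (others => '0');
begin
   -- shift register clocked on the rising edge of TCK
   process (tck)
   begin
      if rising_edge(tck) then
         if shift_en = '1' then
            sr <= tdi & sr(7 downto 1);
         end if;
      end if;
   end process;

   -- TDO is updated only on the falling edge of TCK
   process (tck)
   begin
      if falling_edge(tck) then
         tdo <= sr(0);
      end if;
   end process;
end architecture;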
On May 25, 2:11 am, sbatt...@yahoo.co.jp wrote:

> [quoted text snipped]

If I understand you right, you have 25 parallel inputs, each sending you
bit-serial data. You need to convert the 25 inputs into one 6-bit binary
word, and then accumulate these words with increasing (or decreasing)
binary weight.

Conversion of 25 lines to 6 bits can be done in many ways, including
sequential scanning or shifting, which requires a faster clock of >1.5 MHz.
But here is an unconventional and simpler way:

Use 13 inputs as address to one port of a BlockRAM with 4 parallel outputs
(8K x 4). Use the remaining 12 inputs as address to the other port of the
same BlockRAM. Store the conversion of (# of active inputs to a binary
value) in the BlockRAM.

Add the two 4-bit binary words together to form a 5-bit word that always
represents the number of active inputs. Then feed this 5-bit value into a
13-bit accumulator, where you shift the content after each clock tick.

This costs you one BlockRAM plus three or four CLBs in Xilinx nomenclature,
a tiny portion of the smallest Spartan or Virtex device, and it could be
run a few thousand times faster than you need. If you have more than 26
inputs, just add another BlockRAM for a total of up to 52 inputs, and
extend the adder and accumulator by one bit. (Yes, I know in Spartan you
are limited to 12 address inputs (4K x 4), but you can add the remaining
bit outside...)

Peter Alfke, from home.
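A behavioural sketch of the same count-and-shift idea. It deliberately
replaces Peter's dual-port BlockRAM lookup with a plain popcount function
(which synthesis would build from LUTs), so it is not his exact circuit,
but the weighted accumulator stage is the same. It assumes the serial lines
send MSB first and uses invented port names.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bitwise_accum is
   port (
      clk    : in  std_logic;
      start  : in  std_logic;                      -- clears the accumulator
      bit_en : in  std_logic;                      -- one clk tick per serial bit
      inputs : in  std_logic_vector(24 downto 0);  -- the 25 serial data lines
      total  : out std_logic_vector(12 downto 0)
   );
end entity;

architecture rtl of bitwise_accum is
   signal acc : unsigned(12 downto 0) := (others => '0');

   -- number of '1' bits on the 25 input lines (0 .. 25 fits in 5 bits)
   function popcount(v : std_logic_vector) return unsigned is
      variable c : unsigned(4 downto 0) := (others => '0');
   begin
      for i in v'range loop
         if v(i) = '1' then
            c := c + 1;
         end if;
      end loop;
      return c;
   end function;
begin
   process (clk)
   begin
      if rising_edge(clk) then
         if start = '1' then
            acc <= (others => '0');
         elsif bit_en = '1' then
            -- shift the running total and add the count of active inputs,
            -- so earlier (more significant) bit slots carry higher weight
            acc <= shift_left(acc, 1) + popcount(inputs);
         end if;
      end if;
   end process;
   total <= std_logic_vector(acc);
end architecture;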
On May 25, 8:28 am, pant...@gmail.com wrote:

> [quoted text snipped]

I use the MPMC to map memory ports to DDR2 external memory. This mechanism
has been successfully tested with up to 7 MicroBlazes.

/Per
Hi,

How is it possible to use N/S routing if the GTX clock is connected to a
tile other than the one where the GTP is used?

The wizard does have radio buttons to route the clock, but those selections
do not actually do anything.

So is using DRP to change the input clock mux REALLY the only way?

Antti
On May 25, 8:06 pm, Antti <Antti.Luk...@googlemail.com> wrote:

> [quoted text snipped]

I answer myself: it seems that DRP is really the only way :(
I have two processes, one for sampling data with a high-speed clock
(simplified code):

if rising_edge(fastClock) then
    shiftRegister <= shiftRegister(0) & dataIn;
    if shiftRegister = "10" then
        counter <= 0;
    end if;
    counter <= counter + 1;
    if counter = 2 then
        outBit <= shiftRegister(0);
        dataValid <= '1';
    end if;
    if dataRead = '1' then
        dataValid <= '0';
    end if;
end if;

and a slow process for evaluating and feeding to other entities:

if rising_edge(slowClock) then
    dataValidLatch <= dataValid;
    dataInLatch <= dataIn;
    if dataValidLatch = '1' then
        outShift <= outShift(x downto 0) & dataInLatch;
        dataRead <= '1';
    else
        dataRead <= '0';
    end if;
end if;

But the classic timing analyzer in Quartus thinks there are some paths,
with outShift in them, which need to be much faster than slowClock, and
because of more complex processes the design can't be synthesized for the
fast clock. My hope was that latching removes all dependencies on
fastClock. How do I do this right? Is there maybe some other trick to
sample biphase signals, which can have a wide range of input frequencies,
with a high-speed clock?

fastClock is regenerated from the biphase signal with external chips and
looks nice, but a 4x PLL doesn't work, so I want to try 8x, which results
in about 200 MHz maximum frequency. I hope that should work with the simple
sampling process, but it doesn't work for all the other processes.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
On May 25, 11:40 am, Frank Buss <f...@frank-buss.de> wrote:
> But the classic timing analyzer in Quartus thinks there are some paths,
> with outShift in them, which need to be much faster than slowClock, and
> because of more complex processes the design can't be synthesized for
> the fast clock.

I suspect your issue is going to be with dataInLatch and dataValidLatch.
I'm assuming there is no deterministic phase relationship between the
fastClock and the slowClock. This is almost definitely the case unless
slowClock is derived from fastClock. If that is the case, there's no way to
reliably do what you're doing, because eventually dataIn/dataValid
(generated in the fastClock domain) will violate the setup and hold
requirements in the slowClock domain.

> My hope was that latching removes all dependencies on fastClock. How do
> I do this right?

An asynchronous FIFO is probably the easiest way to do this. Otherwise,
there needs to be some set phase relationship between fastClock and
slowClock.

> Is there maybe some other trick to sample biphase signals, which can
> have a wide range of input frequencies, with a high-speed clock?

What is a biphase signal?

> fastClock is regenerated from the biphase signal with external chips and
> looks nice, but a 4x PLL doesn't work

Why not?

> so I want to try 8x, which results in about 200 MHz maximum frequency. I
> hope that should work with the simple sampling process, but it doesn't
> work for all the other processes.

Regardless of how fast your slowClock or fastClock clocks are, transferring
data between the two of them needs to be done in such a way that ensures
setup/hold requirements are always met. The issue (in terms of data
integrity and preventing metastability) is the phase relationship between
the two clocks. After that, you just need to ensure the ratio of the two
clocks is sufficient to ensure the aggregate data rate generated by the
fastClock can be consumed by the slowClock without overflow or underflow.
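Not the asynchronous FIFO suggested above, but a sketch of a simpler
alternative that works when, as here, new data arrives far less often than
either clock period: the fast domain holds the byte and flips a toggle, and
the slow domain double-registers the toggle and accepts the byte when it
sees the toggle change. All names and widths are invented for illustration.

library ieee;
use ieee.std_logic_1164.all;

entity cdc_handshake is
   port (
      fast_clk   : in  std_logic;
      slow_clk   : in  std_logic;
      fast_valid : in  std_logic;                     -- one fast_clk pulse per byte
      fast_data  : in  std_logic_vector(7 downto 0);
      slow_valid : out std_logic;                     -- one slow_clk pulse per byte
      slow_data  : out std_logic_vector(7 downto 0)
   );
end entity;

architecture rtl of cdc_handshake is
   signal toggle              : std_logic := '0';
   signal held_data           : std_logic_vector(7 downto 0) := (others => '0');
   signal sync0, sync1, sync2 : std_logic := '0';
begin
   -- fast domain: hold the byte steady and flip the toggle
   process (fast_clk)
   begin
      if rising_edge(fast_clk) then
         if fast_valid = '1' then
            held_data <= fast_data;
            toggle    <= not toggle;
         end if;
      end if;
   end process;

   -- slow domain: two synchronizing flops plus one flop for edge detection
   process (slow_clk)
   begin
      if rising_edge(slow_clk) then
         sync0      <= toggle;
         sync1      <= sync0;
         sync2      <= sync1;
         slow_valid <= sync1 xor sync2;
         if (sync1 xor sync2) = '1' then
            -- held_data has been stable for at least two slow_clk periods here
            slow_data <= held_data;
         end if;
      end if;
   end process;
end architecture;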
On May 25, 9:16 am, Peter Alfke <al...@sbcglobal.net> wrote:

> [quoted text snipped]

Hi Steve,

1. Set up a 16*8 FIFO;
2. Each of the 25 data sources is first registered in its own 8-bit
   register, with a valid bit set when all data bits have arrived from its
   serial data source;
3. When valid = '1', push the data into the FIFO and clear the valid bit;
4. Set up a 13-bit register, initialized to 0 when a new calculation
   starts;
5. When the FIFO is not empty, add the FIFO output to the 13-bit register,
   with the high 5 bits being '0' and the low 8 bits coming from the FIFO.

There is no need for a 25-input adder.

Weng
Nathan Bialke wrote:

> I suspect your issue is going to be with dataInLatch and dataValidLatch.
> I'm assuming there is no deterministic phase relationship between the
> fastClock and the slowClock. This is almost definitely the case unless
> slowClock is derived from fastClock.

There is a simple phase relationship: I use one PLL with two outputs,
fastClock = 8 x the input clock and slowClock = 4 x the input clock. Maybe
then I can simplify the FIFO?

> An asynchronous FIFO is probably the easiest way to do this. Otherwise,
> there needs to be some set phase relationship between fastClock and
> slowClock.

Do you have some VHDL code for it? I think I could use a BRAM for it, but
that would be overkill for such a simple case.

> What is a biphase signal?

http://en.wikipedia.org/wiki/Biphase_mark_code

For my application it is AES3.

>> fastClock is regenerated from the biphase signal with external chips
>> and looks nice, but a 4x PLL doesn't work
>
> Why not?

It looks like there is too much jitter, because sometimes there are bits
missing, which depends on the frequency of the signal. Internally generated
signals are sampled fine. So my idea was to try a higher sampling rate.

> Regardless of how fast your slowClock or fastClock clocks are,
> transferring data between the two of them needs to be done in such a
> way that ensures setup/hold requirements are always met. [...]

Overflow should be no problem, because of the fixed relationship and fixed
bit rate, 4 times slower than slowClock.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
Hi,

Through the discussion in my previous thread, "Are all these claims in VHDL
correct?", I understand how to recognize a transparent latch versus a
register.

Here is an example that shows what puzzles me.

State1_A : process(CLK)
begin
   if CLK'event and CLK = '1' then
      if SINI = '1' then
         State1 <= Idle_S;
      else
         State1 <= State1_NS;
      end if;
   end if;
end process;

State1_B : process(State1, A1, A2)
begin
   case State1 is
      when Idle_S =>
         if A1 = '1' then
            State1_NS <= X_S;
         else
            State1_NS <= Idle_S;
         end if;

      when X_S =>
         if A2 = '1' then
            State1_NS <= Idle_S;
         else
            State1_NS <= X_S;
         end if;
   end case;
end process;

State2_A : process(SINI, CLK)
begin
   if CLK'event and CLK = '1' then
      if SINI = '1' then
         State2 <= Idle_S;
      else
         State2 <= State2_NS;
      end if;
   end if;
end process;

State2_B : process(State2, A1, A2)
begin
   case State2 is
      when Idle_S =>
         if A1 = '1' then
            State2_NS <= X_S;
         -- else                      <-- key difference
         --    State2_NS <= Idle_S;
         end if;

      when X_S =>
         if A2 = '1' then
            State2_NS <= Idle_S;
         else
            State2_NS <= X_S;
         end if;
   end case;
end process;

From my experience with state machines, the VHDL compiler generates a
warning for State2: "state machine state2 will be implemented as latches".

It once took me a week to find a situation similar to State2 above in a
long state machine of mine.

I don't know why the VHDL compiler generates latches for State2.

Thank you.

Weng
> I use the MPMC to map memory ports to DDR2 external memory. This
> mechanism has been successfully tested with up to 7 MicroBlazes.
>
> /Per

Firstly, thanks a lot. Secondly, I would be grateful if you could tell me
whether you used the XUP board, and which version of Xilinx Platform
Studio. Do you know anything else about this type of design?

Again, my best regards
On May 25, 3:52 pm, Weng Tianxiang <wtx...@gmail.com> wrote:

> [quoted text snipped]

Are you sure the latch isn't being created for State2_NS? You may want to
put a "when others =>" clause at the end of the case statement to make sure
State2_NS gets assigned something in every case. A default assignment at
the top of the process would give similar effects.

Also, SINI doesn't need to be in the sensitivity list for the State2
process, but it shouldn't hurt anything other than simulation time.

Dave
Weng,

You've told the synthesizer that state2_ns (the combinatorial signal, not
the register) has to remember its previous value under certain
circumstances, so it generates a latch to remember the value.

Your choices to avoid the latch include a) avoiding combinatorial
processes, b) including a default assignment (perhaps from the output of
the associated register) in combinatorial processes, or c) making sure
every possible execution path through the process results in all driven
signals being assigned a value (and not just to themselves).

I always choose (a). If you just have to use a combinatorial process, then
(b) is much easier to read/write/verify/review than (c).

Andy
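A sketch of option (b) applied to Weng's State2_B process, reusing his
signal and state names: the default assignment at the top means every path
through the process drives State2_NS, so no latch is inferred even though
the "else" branch is missing.

State2_B : process(State2, A1, A2)
begin
   State2_NS <= State2;            -- default: hold the registered state
   case State2 is
      when Idle_S =>
         if A1 = '1' then
            State2_NS <= X_S;
         end if;

      when X_S =>
         if A2 = '1' then
            State2_NS <= Idle_S;
         end if;
   end case;
end process;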
MM wrote:
>> 3. Why DSP and Memory are rectangular in shape ?
>
> Do you mean why they are not round?

No, why are they rectangular? Do you know? The person who set the
assignment does.
On May 21, 11:33 pm, Brian Drummond <brian_drumm...@btconnect.com> wrote:
> Try running Translate from the command line with the exact command line
> given in the ".cmdlog" file - including the "-intstyle ISE" flag - and
> see if that gives the same result as the GUI flow.

Thanks for your reply, Brian. Unfortunately, the cmdlog indicates that the
GUI flow uses a different executable - unwrapped/ngdbuild.exe - which has
the problem I described with loading constraints when run from the command
line. So one issue right at the start is that the GUI uses different
binaries.

> Indeed an apparently successful run through the GUI tools produces a
> non-functional bitfile! I have not had time to explore deeply enough to
> find out why.

I don't think it's safe to assume the GUI and command-line flows have the
same results. From what I've seen, the two have diverged.

Fortunately, by manually including the UCF file (and therefore deliberately
disregarding the -i flag that cmdlog said we should use) we are able to
produce an identical final bit file, excluding the timestamp in the header,
even though the intermediate files are all different. That's with our
current inputs, at least. Who knows whether this will be the case tomorrow,
or next month?

--
David.
On May 23, 5:07 am, LittleAlex <alex.lo...@email.com> wrote:
> I have been successful converting a project from GUI to command line
> with identical bit files.

Hi LittleAlex, thank you for your comments. We too get identical bitfiles
(excluding the header), but my concern is that the intermediate files are
all different. All this does is give me confidence that *this* build is
identical, but who can say if future builds will be? We have no way to test
this for every build either. For serious use of the Xilinx tools in an
engineering environment, this sort of behaviour is ridiculous.

> Take a very close look at the log files left behind by the GUI build.
> There are options that you will probably not recognize - the GUI knows
> better than you what you want it to do :)

In fact the GUI got it wrong - it said to use the -i flag, but what was
actually required was removal of the -i flag and use of the -uc flag
instead.

> You can get rid of the -intstyle ISE flag. I use a .prj file format
> and that works just fine for me.

What flow do you use with that? -intstyle xflow?

> Most options can be set in a number of places. I'm not 100% sure
> which location for the option has priority; I had some weird results
> which went away when I made them all match.

Is the .prj format similar to Synplify's TCL .prj format? I'll have to look
up the .prj format - could be useful.

> Another thing to look out for: The GUI scatters work directories all
> over the place. Weirdness in these directories can cause run-to-run
> inconsistencies; audit the location and cleanup of these directories
> carefully.

Yes, our Makefile cleans up pretty well. A 'git clean -fdx' also does the
trick during a build.

Regards,
--
David.
On May 22, 2:30 am, phil hays <philh...@dont.spam> wrote:
> As the .ise working file changes every run, and is binary to boot, it
> cannot be an input into a stable and maintainable build process. So the
> solution I've used when using gnu make under Cygwin is to delete the
> whole result directory (bld) at the start of the build. There are other
> files in the result directory (and sub-directories under it) that can
> influence the build, and the only way that I'm aware of to get a
> consistent result is to start with a fresh directory.

Hi phil, thank you for your reply.

> One option for doing this would be to have the make file call a Project
> Navigator Tcl script (using xtclsh). This script would create a
> fresh .ise file every run, and could also be used to run from the GUI. I
> posted a script for this some time ago, and will update it if desired.

This sounds useful - can you direct me towards a recent version of this
please?

> This is because the ISE flow seems to read the UCF file into a database
> first, and then applies the constraints later.

However the ISE flow is using the '-i' flag, which is supposed to ignore
constraints... I only get the same behaviour from the command line if I
ditch the -i flag and use -uc instead, to include constraints.

> The .ise file has lots of date and time information. The solution to
> this is to think of the .ise file as a working file, rather than a
> project file, and to delete it at the start of any build script.

In conjunction with your earlier comment, this makes sense. However, I
understand that if -intstyle ise is *not* used, there's no dependency on
the .ise file whatsoever. I'd prefer this at build time, although we would
like to automatically create an .ise file for local use inside the GUI.

> To difference the .bit files, the header needs to be ignored. To make
> this automatic, I've written a little difference utility using Tcl.
> Would this be of interest?

Yes, this would be useful please.

Regards,
--
David.
Symon <symon_brewer@hotmail.com> wrote:
> MM wrote:
>>> 3. Why DSP and Memory are rectangular in shape ?
>>
>> Do you mean why they are not round?
>
> No, why are they rectangular? Do you know? The person who set the
> assignment does.

There you go - they actually don't. That's why they're asking the question.
Then they have a whole classload of students to try and find the answer for
them. Thankfully some of them are smart enough to come and ask here rather
than doing their own homework assignments.

Nobby