Kolja Sulimma:

> The problem here is that users tend to evaluate the
> capabilities of an FPGA mainly as logic, while really
> you pay mostly for routing. Logic is a very small
> portion of the silicon area. Of course the vendors
> don't publish the numbers, but university research
> suggests the area of LUT and LUT configuration is
> only a few percent of total area.

That's what I expected. This becomes pretty obvious if you imagine a LUT2 FPGA, where everyone should intuitively understand that the entire silicon would be filled up with routing resources. And LUT4 can't be far off.

> Therefore when going from 4-LUT to 6-LUT you don't
> get a 4x area increase (16 entries to 64 entries)
> but more like a 60% increase (going from 4 inputs
> that must be routed to 6 inputs that must be routed
> in a somewhat worse than linear routing area).

So let's compare Spartans:

Spartan6 LUT6:  about  7 ins, about 3 outs = 10 ports
Spartan3 Slice: about 10 ins, about 6 outs = 16 ports

The port count for the Spartan3 Slice doesn't include the FXMUX path, but it does include the full XB/YB (I doubt this path has/needs full routing caps, anyway).

So from what you said about area with taking routing resources into account, the Spartan3 Slice might very well consume a little more area, although it has only about half the SRAM bits.

What do we get for that?

For SLICEL, I think of:

2*any 4-inp func:                 LUT4: yes, LUT6: no
2*any 4-inp func, paired invert:  LUT4: yes, LUT6: no
any 5-inp func:                   both
any 6-inp func:                   LUT4: no, LUT6: yes
MUX4:                             both
half/partial populated Carry:     LUT4: yes, LUT6: no
2 Bit full Adder:                 both
2 Bits of long Adder:             LUT4: yes, LUT6: one, but 2?
2 Bits of long MulAdder:          LUT4: yes, LUT6: one, but 2?
1 Bit ALU (fast Carry):           maybe both
 --with dual Ext-feedin:          LUT4: yes (paired with DPram), LUT6: no
Large Chain Logic:                LUT4: 8Bit/Slice, LUT6: 6Bit/LUT
DblLUTed Chain Logic:             LUT4: no, BX only, LUT6: yes

For SLICEM, I also think of:

64x1 RAM:        LUT4: no,  LUT6: yes
32x2 RAM:        LUT4: no,  LUT6: yes
32x1 RAM:        LUT4: yes, LUT6: yes
16x2 RAM:        LUT4: yes, LUT6: yes
16x1 RAM+Adder:  LUT4: yes, LUT6: no

Well, for the SLICEM part, the LUT6 might be a better choice, but for SLICEL, I'd still prefer the LUT4, given 50% area overhead, although I do miss having a few more static MUXes and FF paths (independent clock inverters, or something).

Regards

Jan Bruns

-- 
A few photos: http://abnuto.de/gal/

Article: 153401
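The "about half the SRAM bits" remark and the port totals in the post above can be checked with a little arithmetic. This is a minimal sketch, assuming that a k-input LUT stores 2**k configuration bits and that a Spartan-3 slice holds two LUT4s; the pin counts are the approximate figures quoted in the post, and configuration bits for carry, muxes, and flip-flops are ignored.

```python
def lut_config_bits(k):
    """A k-input LUT needs one SRAM bit per truth-table entry."""
    return 2 ** k

# Approximate figures from the post above; "luts" lists the LUT input widths.
spartan6_lut6 = {"luts": [6], "ins": 7, "outs": 3}
spartan3_slice = {"luts": [4, 4], "ins": 10, "outs": 6}

def summarize(block):
    """Return (LUT configuration bits, routed ports) for one block."""
    bits = sum(lut_config_bits(k) for k in block["luts"])
    ports = block["ins"] + block["outs"]
    return bits, ports

s6_bits, s6_ports = summarize(spartan6_lut6)
s3_bits, s3_ports = summarize(spartan3_slice)

print("Spartan6 LUT6 :", s6_bits, "LUT bits,", s6_ports, "ports")
print("Spartan3 slice:", s3_bits, "LUT bits,", s3_ports, "ports")
print("LUT bit ratio (slice / LUT6):", s3_bits / s6_bits)
print("Port ratio    (slice / LUT6):", s3_ports / s6_ports)
```

Under these assumptions the slice carries half the LUT bits but 1.6 times the ports, which is the trade-off the post is weighing.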
On Feb 16, 7:07 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Martin Thompson <martin.j.thomp...@trw.com> wrote:
>
> (snip)
>
> > Don't ask me - I'm not making the decisions. Ultimately, Xilinx
> > presumably decided it was a "win" in business terms: "We'll make the
> > most money doing it this way."
>
> Well, they do have some competition. If they don't design
> and build what works for their customers, they will lose out.
>
> >> I don't believe there's no market for LUT4 FPGAs using current
> >> silicon process.
> > No-one is saying there is not a market. Just that it's not
> > big enough for Xilinx to be targeting it.
>
> As I understand it, 6LUT is better for larger chips.
>
> For smaller ones, it likely doesn't make so much difference.
> There is some advantage as far as synthesis software of
> keeping a minimum number of different architectures.
>
> Still, 4LUT chips should be around for a while.
>
> -- glen

I believe that is what it comes down to. Given the fact that routing is a huge percentage of the chip area (and so cost), this becomes a more important factor as the chips get larger. After all, routing does go up at a faster rate than linear. So minimizing routing is more important in larger chips. The tradeoff provides for lower costs with LUT6 in larger devices.

The other side of the coin is more "wasted" logic when larger LUTs are underutilized. So it would seem that we have reached the point where the LUT6 is optimal for many if not the vast majority of designs.

I don't know that there is a performance penalty in using LUT6. I would expect that it is minimal, since the muxes in the LUTs are done with transmission gates with very little delay, but I don't really know. If so, the only issue then becomes cost. So if your design is one of the minority designs that can indeed be done more efficiently in a LUT4 architecture, then you will pay a bit more for a LUT6 based part... but given the advantages of smaller feature size you will likely get lower costs with the newer parts than by sticking with an old generation.

As to design reworks required to optimize a design for a newer part, I expect that would be done for speed and/or cost. My experience is that Xilinx is more than willing to help you with that, especially if it means a design win over a competitor. But would anyone really expect much lost ground from a LUT4 design to a current LUT6 design? Software changes can greatly impact results, but I can't see needing to touch a design from a Spartan 3 to get it to run well in a newer device, given the large improvements in the hardware from using a much smaller process. I suppose if you have used hard constraints you may have to remove them. But you knew the risk when you used those features, no?

Rick

Article: 153402
rickman <gnuarm@gmail.com> wrote:

(snip, I wrote)

>> For smaller ones, it likely doesn't make so much difference.
>> There is some advantage as far as synthesis software of
>> keeping a minimum number of different architectures.

(snip)

> I believe that is what it comes down to. Given the fact that routing
> is a huge percentage of the chip area (and so cost) this becomes a
> more important factor as the chips get larger. After all, routing
> does go up at a faster rate than linear. So minimizing routing is
> more important in larger chips. The tradeoff provides for lower
> costs with LUT6 in larger devices.

> The other side of the coin is more "wasted" logic when larger LUTs are
> underutilized. So it would seem that we have reached the point where
> the LUT6 is optimal for many if not the vast majority of designs.

One that I am interested in, though, is that 6LUT should be much better for building the MUX needed for barrel shifters. A 4LUT makes a two input MUX, but a 6LUT can make a 4 input (and two select line) MUX. Other than that, I haven't thought much about how useful different sizes are. The less logic between FFs, the less advantage to larger ones.

> I don't know that there is a performance penalty in using LUT6. I
> would expect that is minimal since the muxes in the LUTs are done with
> transmission gates with very little delay, but I don't really know.
> If so, the only issue then becomes cost. So if your design is one of
> the minority designs that can indeed be done more efficiently in a
> LUT4 architecture, then you will pay a bit more for a LUT6 based
> part... but given the advantages of smaller feature size you will
> likely get lower costs with the newer parts than sticking with an old
> generation.

Well, they have to be designed not to glitch when switching between entries with the same output value. That doesn't naturally happen with an SRAM. Also, with transmission gates you can't go through too many without a buffer, but presumably that is part of optimizing the cell.

-- glen

Article: 153403
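The barrel shifter point in the post above can be made concrete. This is a rough sketch, assuming a plain logarithmic shifter in which every output bit needs one mux per stage, and an assumed pin order for the LUT6 (address bits 0-3 are data, bits 4-5 are select); fill and sign-extension logic is ignored, and the function names are made up for illustration.

```python
def mux4_lut6_init():
    """Truth table of a 4:1 mux packed into one 6-input LUT.

    Assumed pin order: address bits 0-3 are the data inputs,
    address bits 4-5 are the two select lines.
    """
    init = 0
    for addr in range(64):
        select = (addr >> 4) & 0b11
        out = (addr >> select) & 1  # pick data input number `select`
        init |= out << addr
    return init

def barrel_shifter_luts(width, select_bits_per_lut):
    """LUT count for a width-bit logarithmic shifter built from mux stages."""
    shift_bits = (width - 1).bit_length()             # e.g. 5 for 32 bits
    stages = -(-shift_bits // select_bits_per_lut)    # ceiling division
    return stages * width

print("LUT6 contents for a 4:1 mux:", hex(mux4_lut6_init()))
print("32-bit shifter, 2:1 mux per LUT4:", barrel_shifter_luts(32, 1), "LUTs")
print("32-bit shifter, 4:1 mux per LUT6:", barrel_shifter_luts(32, 2), "LUTs")
```

With these assumptions a 32-bit shifter drops from five mux levels to three, which also shortens the logic path between flip-flops.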
On Feb 16, 4:50 pm, Jan Bruns <jansacco...@arcor.de> wrote:
> Spartan6 LUT6:  about  7 ins, about 3 outs = 10 ports
> Spartan3 Slice: about 10 ins, about 6 outs = 16 ports
>
> So from what you said about area with taking routing
> resources into account, the Spartan3 Slice might very
> well consume a little more area, although it has only
> about half the SRAM bits.

No. It will consume a lot more area if you include routing. Routing grows faster than linear (look up "Rent exponent"). Of course it can cover more flexible circuit areas, because you can choose many more combinations of input signals with two 4-LUTs compared to one 6-LUT (except if you have high fan-in random logic). But the area is much larger.

The point is: it does not matter if a LUT-6 on average has lower utilization, as LUT area is virtually free. What matters is routing utilization.

There is research that clearly shows that, from an efficiency standpoint, the best FPGAs are those that can't achieve 100% LUT utilization because they have sparse routing.

The reasons why vendors choose to provide lots of routing anyway are:

a) Customers don't understand this and tend to start whining when they don't get 100% LUT utilization instead of being happy that they get better wire utilization. (Remember: wires are the expensive part.)

b) It gets hard to predict what can be implemented and what can't.

c) Software gets harder to do and slower with worse routing resources.

So you pay a premium to be able to reliably plan your design and to simplify marketing.

Back to LUT size: have a look at figure 3.3 in this:
http://www.eecg.utoronto.ca/~jayar/pubs/theses/Ahmed/EliasAhmed.pdf
Area is virtually constant in that analysis for LUT sizes from 4 to 6. But with LUT size 6 you get much better software runtimes.

Kolja

Article: 153404
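For readers who look up the Rent exponent mentioned in the post above, here is a small numeric sketch. It assumes Rent's rule in its usual form, T = t * G**p, with hypothetical values for t and p, plus a commonly cited Donath-style estimate that average wire length in a 2-D layout grows roughly as G**(p - 0.5) when p > 0.5. The constants are illustrative only and are not taken from any vendor data.

```python
def rent_terminals(blocks, t=4.0, p=0.7):
    """Rent's rule: external terminals T = t * G**p (t and p are hypothetical)."""
    return t * blocks ** p

def relative_total_wirelength(blocks, p=0.7):
    """Assumed scaling: G blocks times an average wire length ~ G**(p - 0.5)."""
    return blocks * blocks ** (p - 0.5)

small, large = 1_000, 4_000
growth = relative_total_wirelength(large) / relative_total_wirelength(small)

print(f"terminals at {small} blocks: {rent_terminals(small):.0f}")
print(f"terminals at {large} blocks: {rent_terminals(large):.0f}")
print(f"{large // small}x the logic needs about {growth:.1f}x the wiring")
```

Under this assumed scaling, quadrupling the logic needs more than four times the wiring, which is the "faster than linear" growth the post refers to.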
Kolja Sulimma:

>> Spartan6 LUT6:  about  7 ins, about 3 outs = 10 ports
>> Spartan3 Slice: about 10 ins, about 6 outs = 16 ports

>> So from what you said about area with taking routing resources into
>> account, the Spartan3 Slice might very well consume a little more area,
>> although it has only about half the SRAM bits.

> No. It will consume a lot more area if you include routing. Routing
> grows faster than linear (look up "Rent exponent"). Of course it can
> cover more flexible circuit areas because you can choose many more
> combinations of input signals with two 4-LUTs compared to one 6-LUT
> (except if you have high fan-in random logic). But the area is much
> larger.

Take some area A of silicon and put n_1 blocks of type T_1 into it. Take another area A of silicon and put n_2 blocks of a similar type T_2 into it. If

n_1*portcount(T_1) = n_2*portcount(T_2)

then portcount(A) won't depend on which block type was implemented, and I don't see any reason why one or the other should consume more routing overhead.

> The point is: It does not matter if a LUT-6 on average has lower
> utilization, as LUT area is virtually free. What matters is routing
> utilization.

If the utilization of a given LUT goes low, the routing will on average become less "localized", so that wires become longer.

> There is research that clearly shows that from an efficiency standpoint
> FPGAs are best that can't achieve 100% LUT utilization because they have
> sparse routing.
>
> The reasons why vendors choose to provide lots of routing anyway is: a)
> customers don't understand this and tend to start whining when they
> don't get 100% LUT utilization instead of being happy that they get
> better wire utilization. (Remember: Wires are the expensive part)
>
> b) It gets hard to predict what can be implemented and what can't.
>
> c) software gets harder to do and slower with worse routing resources.
>
> So you pay a premium to be able to reliably plan your design and to
> simplify marketing.
>
> Back to LUT size: Have a look at figure 3.3 in this:
> http://www.eecg.utoronto.ca/~jayar/pubs/theses/Ahmed/EliasAhmed.pdf

Thanks for sharing that link. However, my understanding from that presentation is that LUT4..6 give the same overall area utilization, LUT>6 would give the shortest delays, and LUT4..6 all give the same best area*delay product.

> area is virtually constant in that analysis for LUT sizes from 4 to 6.
> But with LUT size 6 you get much better software runtimes.

Overall area (including routing) doesn't significantly change from LUT4 to LUT6, and even the delay was similar from LUT4 to LUT6.

But these results don't reflect the fact that the Xilinx LUT4 design is an enormously good fit for many practically relevant problems (for example, adders and bus muxes are very frequently used). Even the software-generated technology mapping makes heavy use of these additional LUT4 features, which are almost free compared to the theoretical, simple LUT4 design.

The technology mapping might become easier for synthesis software if the CLB design comes nearer to the bare LUT (with LUT6, the carry seems to become the only additional specialized circuit), but the Xilinx software is already able to make good use of their LUT4 specials; it's only that it doesn't always notice the ideal, obvious solution.

Regards

Jan Bruns

-- 
A few photos: http://abnuto.de/gal/

Article: 153405
Kolja Sulimma <ksulimma@googlemail.com> wrote:

(snip)

> There is research that clearly shows that from an efficiency
> standpoint FPGAs are best that can't achieve 100% LUT utilization
> because they have sparse routing.

I have done place and route on pipelined arrays with different numbers of cells per chip, and found that speed goes fairly close to inversely proportional to the number of cells, over a fairly wide range.

-- glen

Article: 153406
glen herrmannsfeldt:

> (snip)
>> There is research that clearly shows that from an efficiency
>> standpoint FPGAs are best that can't achieve 100% LUT utilization
>> because they have sparse routing.

> I have done place and route on pipelined arrays with different numbers
> of cells per chip, and found that speed goes fairly close to inversely
> proportional to the number of cells, over a fairly wide range.

Some pipeline control signals crossing the data-path and getting slower with wider fanouts?

Regards

Jan Bruns

-- 
A few photos: http://abnuto.de/gal/

Article: 153407
Jan Bruns <jansaccount@arcor.de> wrote:

(snip, I wrote)

>> I have done place and route on pipelined arrays with different numbers
>> of cells per chip, and found that speed goes fairly close to inversely
>> proportional to the number of cells, over a fairly wide range.

> Some pipeline control signals crossing the data-path and getting slower
> with wider fanouts?

It is a linear array of fairly simple cells. I believe it is that the routes get longer and slower as things get more tightly packed together.

-- glen

Article: 153408
glen herrmannsfeldt:

> Jan Bruns <jansaccount@arcor.de> wrote:
> (snip, I wrote)
>>> I have done place and route on pipelined arrays with different numbers
>>> of cells per chip, and found that speed goes fairly close to inversely
>>> proportional to the number of cells, over a fairly wide range.
>
>> Some pipeline control signals crossing the data-path and getting slower
>> with wider fanouts?
>
> It is a linear array of fairly simple cells. I believe it is that the
> routes get longer and slower as things get more tightly packed together.

Just some days ago, I had a similar problem.

There was a horizontal data flow, with the parallel data lines vertically aligned. The bottleneck was one CLB column using a couple of "control signals" sourced elsewhere. Timing degraded heavily with bus size, and timing analysis showed a couple of ns of routing delay just for the control signals.

Luckily, the critical CLB row had some unused regs, so I used them to replicate the most critical controls.

At first, this didn't work out as expected. It even got worse than without the replication. This was caused by the way I had arranged the replicates, which needed more vertical direct lines than were available. So the router came up with solutions like routing a critical, local CLB signal once around that CLB (a lot of hops through a handful of neighbor switch matrices).

A simple rearrangement of the replicate usage, however, fully solved that further problem (by halving the direct neighbor route consumption).

Although there are now some more signals on the switches (remember, the original signals still need to go to the replicate regs), all the replicates now have direct neighbor connections (or better) to the LUTs. So timing no longer scales with bus width.

Regards

Jan Bruns

-- 
A few photos: http://abnuto.de/gal/

Article: 153409
>nba83 wrote:
>> hi
>> i am trying to detect falling edge of a 200ns pulse(WriteStrobe)
>> synchronously with this code. GlobalClk is 100MHz(10ns) oscillator clk
>> attached to global clk pin of Xilinx Spartan 3 XC3s400-5I. the problem i am
>> facing is that out of about 1000 falling edges 100 of them are missed. i used
>> IBUFG at the input clk but the output is the same. but if I connect the
>> oscillator to a normal io pin with the constraint CLOCK_DEDICATED_ROUTE =
>> FALSE; i can detect all the falling edges without error. i don't know
>> what's the problem. any help would be appreciated :)
>>
>> always @(posedge GlobalClk)
>> begin
>>     pre_WriteStrobe <= WriteStrobe;
>>     if (pre_WriteStrobe & ~WriteStrobe)
>>     begin
>>         StartWritingMemory <= 1;
>>         WriteNibble <= 0;
>>         Write_Address <= 4095;
>>     end
>> end
>>
>> ---------------------------------------
>> Posted through http://www.FPGARelated.com
>
>You can't use the asynchronous input in your "if" clause. To
>properly sense the falling edge of an asynchronous input you
>need two flops and then compare the outputs of those flops.
>If instead you compare the output of the first flop to the
>input signal (yes I know this would have less latency) you
>have the possibility of a vanishingly small time when the
>two signals are different. Then you don't meet the setup
>time to the registers inside your if statement.
>
>Try this:
>
>reg [1:0] pre_WriteStrobe;
>
>always @(posedge GlobalClk)
>begin
>    pre_WriteStrobe <= {pre_WriteStrobe[0], WriteStrobe};
>    if (pre_WriteStrobe[1] & ~pre_WriteStrobe[0])
>    begin
>        StartWritingMemory <= 1;
>        WriteNibble <= 0;
>        Write_Address <= 4095;
>    end
>end
>
>-- Gabor
>

Hi, it seems that was the problem, because this solution fixed it :). But I have a question: isn't this solution equivalent to lowering the GlobalClk frequency? I decreased the GlobalClk frequency to 10 MHz, but the previous code still had errors detecting edges, whereas the code you suggested works fine with GlobalClk at 50 MHz. So what may be the issue? Thanks in advance for any help.

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 153410
On Feb 17, 11:51 pm, "nba83" <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote:
> this solution is equal to lowering
> GlobalClk frequency,

No it is not.

As Gabor explained, the problem is that the input WriteStrobe is asynchronous to your clock. This means that it can change state at essentially any point within the clock cycle. Now ask yourself the question: if the flip-flop requires that its input be stable for some period of time prior to the clock in order to work correctly, how are you going to meet that requirement for the flip-flop that receives logic derived from WriteStrobe?

Resynchronizing WriteStrobe to the clock before using it to control any real logic means that the 'real logic' will have a delayed WriteStrobe that only changes immediately after the clock and therefore should meet the timing requirements.

Kevin Jennings

Article: 153411
On Feb 18, 2:51 pm, KJ <kkjenni...@sbcglobal.net> wrote:
> On Feb 17, 11:51 pm, "nba83"
>
> <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote:
> > this solution is equal to lowering
> > GlobalClk frequency,
>
> No it is not.
>
> As Gabor explained, the problem is that the input WriteStrobe is
> asynchronous to your clock. This means that it can change states
> likely anywhere within the clock cycle. Now ask yourself the
> question, if the flip flop has a requirement that the input be stable
> for some period of time prior to the clock in order to work correctly,
> how are you going to meet that requirement for the flip flop that is
> the receiver of logic derived from WriteStrobe?
>
> Resynchronizing WriteStrobe to the clock before using it to control
> any real logic means that the 'real logic' will have a delayed
> WriteStrobe that only changes immediately after the clock and
> therefore should meet the timing requirements.
>
> Kevin Jennings

To belabor the point, in your original post you said you missed about 10% of the incoming edges at 100 MHz. Now imagine that your flip-flops need 1 ns setup time, and you use the original equation "pre_WriteStrobe & ~WriteStrobe" where WriteStrobe can change state anywhere within the clock cycle. If it changes about 1 ns before the clock edge, then the pulse created by "pre_WriteStrobe & ~WriteStrobe" will only be about 1 ns long. This would happen about 10% of the time when the clock is 100 MHz. It would still happen, but only about 1% of the time, at 10 MHz. Regardless of the clock frequency you would never get down to 0% missed edges.

When you use two flip-flops, the real trick is that the first flop is the only one that receives an asynchronous input. Thus all flops down the road rely on it to make the decision as to which clock cycle the input changed. So when WriteStrobe changes 1 ns from a clock edge, either the first flop will change state at that edge or it will change state only at the following clock edge, but in all cases its output will change just after a clock edge, and every flop that uses its output will agree on which clock edge the change occurred.

Any design where more than one flop is involved in deciding the edge where an asynchronous change happens will be subject to errors like the ones you have observed.

-- Gabor

Article: 153412
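The 10% and 1% figures in the post above follow from a simple ratio. This is a minimal sketch, assuming the strobe edge is uniformly distributed over the clock period and that any edge landing inside a fixed vulnerable window (the 1 ns used in the post) can be missed; the function name is made up for illustration.

```python
def miss_fraction(window_ns, clock_mhz):
    """Fraction of uniformly distributed edges that land in the bad window."""
    period_ns = 1000.0 / clock_mhz
    return min(1.0, window_ns / period_ns)

for mhz in (100, 50, 10):
    share = miss_fraction(1.0, mhz)
    print(f"{mhz:>3} MHz clock: {share:.1%} of edges fall in a 1 ns window")
```

Slowing the clock shrinks the fraction but never brings it to zero, which is why the single-flop version still failed at 10 MHz.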
Hi, has your problem been solved? I'm facing the same problem with a Realtek RTL8201 chip in the receive section. I connected RXD to TXD and RXDV to TXE to test the chip in loopback through a Xilinx Spartan 3 XC3S400 FPGA. In this test I got 4 out of 50 packets with an FCS error at the PC. What's the issue? Thanks for any help.

>Hi,
>I am using xilinx spartan3 xc3s4000 in my design. It is interfaced with 2
>national Gigabit PHYs. So i receive a packet from phy A and transmit it to
>PHY B and vice versa. Now the problem i am facing is that one of the bytes
>in the packet randomly gets corrupt after a while..
>
>First the packet drop was very frequent at high speeds, then i checked the
>power requirements of my PHYs and got to know that my regulator couldn't
>source that much current. Then i changed the regulator and now the problem
>occurs very rarely or it doesnt occur at all.
>
>I have some checks in the RTL to identify if the error is FCS or buffer
>overflow. So every time the packet drops, my fcs flag is raised. So i viewed
>the incoming packet and saw that it always had some random corrupt byte.
>Like i was sending packets with known pattern, so after a while some random
>byte is getting corrupt. I don't know what to look for from now onwards.
>I thought maybe it was the heat issue so used heat gun but nah it wasn't
>the heat problem.
>My ground noise is 80mv peak-to-peak.
>
>Need some pointers..
>
>Regards
>
>---------------------------------------
>Posted through http://www.FPGARelated.com
>

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 153413
On 22/09/2011 20:21, salimbaba wrote:
> Hi,
> I am using xilinx spartan3 xc3s4000 in my design. It is interfaced with 2
> national Gigabit PHYs. So i receive a packet from phy A and transmit it to
> PHY B and vice versa. Now the problem i am facing is that one of the bytes
> in the packet randomly gets corrupt after a while..
>
> First the packet drop was very frequent at high speeds, then i checked the
> power requirements of my PHYs and got to know that my regulator couldn't
> source that much current. Then i changed the regulator and now the problem
> occurs very rarely or it doesnt occur at all.
>
> I have some checks in the RTL to identify if the error is FCS or buffer
> overflow. So every time the packet drops, my fcs flag is raised. So i viewed
> the incoming packet and saw that it always had some random corrupt byte.
> Like i was sending packets with known pattern, so after a while some random
> byte is getting corrupt. I don't know what to look for from now onwards.
> I thought maybe it was the heat issue so used heat gun but nah it wasn't
> the heat problem.
> My ground noise is 80mv peak-to-peak.
>
> Need some pointers..
>
> Regards
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com

Can you please describe the hardware setup in more detail: is it your own board or a known good board?

Has this hardware setup ever worked (i.e. been error free)?

Is there any pattern in the 'random' corruption (e.g. is it always bit 0, or a 1 seen as 0, or a 0 seen as 1, etc.)? Is it always the nth byte in a packet?

Michael Kellett

Article: 153414
"nba83" <nba_baheri@n_o_s_p_a_m.yahoo.com> wrote in message news:RaSdnQf1bZqJJ6bSnZ2dnUVZ_uudnZ2d@giganews.com...
> hi
> i am trying to detect falling edge of a 200ns pulse(WriteStrobe)
> synchronously with this code. GlobalClk is 100MHz(10ns) oscillator clk
> attached to global clk pin of Xilinx Spartan 3 XC3s400-5I. the problem i
> am facing is that out of about 1000 falling edges 100 of them are missed.
> i used IBUFG at the input clk but the output is the same. but if I connect
> the oscillator to a normal io pin with the constraint
> CLOCK_DEDICATED_ROUTE = FALSE; i can detect all the falling edges without
> error. i don't know what's the problem. any help would be appreciated :)
>
> always @(posedge GlobalClk)
> begin
>     pre_WriteStrobe <= WriteStrobe;
>     if (pre_WriteStrobe & ~WriteStrobe)
>     begin
>         StartWritingMemory <= 1;
>         WriteNibble <= 0;
>         Write_Address <= 4095;
>     end
> end
>
> ---------------------------------------

In theory this will never work perfectly, but statistically you will get an acceptable result by reclocking your signal. Make sure you read up on and understand metastability.

The thing is that if your first FF tries to grab a signal that may change at an invalid phase (setup and hold are violated for the FF), the output of this FF can get into a state where the output voltage level is somewhere between 0 and 1. So even the next FF may have problems seeing whether this is a 0 or a 1. In the worst case it can oscillate.

The chance of this happening is reduced for every FF you pass through, so after 2-3 FFs you should most likely have an acceptable error rate.

Article: 153415
On Feb 16, 4:06 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> rickman <gnu...@gmail.com> wrote:
>
> (snip, I wrote)
>
> >> For smaller ones, it likely doesn't make so much difference.
> >> There is some advantage as far as synthesis software of
> >> keeping a minimum number of different architectures.
>
> (snip)
>
> > I believe that is what it comes down to. Given the fact that routing
> > is a huge percentage of the chip area (and so cost) this becomes a
> > more important factor as the chips get larger. After all, routing
> > does go up at a faster rate than linear. So minimizing routing is
> > more important in larger chips. The tradeoff provides for lower
> > costs with LUT6 in larger devices.
> > The other side of the coin is more "wasted" logic when larger LUTs are
> > underutilized. So it would seem that we have reached the point where
> > the LUT6 is optimal for many if not the vast majority of designs.
>
> One that I am interested in, though, is that 6LUT should be
> much better for building the MUX needed for barrel shifters.
> A 4LUT makes a two input MUX, but 6LUT can make a 4 input
> (and two select line) MUX. Other than that, I haven't thought
> much about how useful different sizes are. The less logic
> between FF's, the less advantage to larger ones.

Yes, the 4LUT can be finagled by using the fourth input as an enable, which is in essence the AND gate of the next mux stage; then you can use all four inputs of a LUT as the OR gate to combine 8 inputs in two levels. So the 4LUT is more like 1.5 two-input muxes.

> > I don't know that there is a performance penalty in using LUT6. I
> > would expect that is minimal since the muxes in the LUTs are done with
> > transmission gates with very little delay, but I don't really know.
> > If so, the only issue then becomes cost. So if your design is one of
> > the minority designs that can indeed be done more efficiently in a
> > LUT4 architecture, then you will pay a bit more for a LUT6 based
> > part... but given the advantages of smaller feature size you will
> > likely get lower costs with the newer parts than sticking with an old
> > generation.
>
> Well, they have to be designed not to glitch when switching
> between entries with the same output value. That doesn't
> naturally happen with an SRAM. Also, with transmission gates
> you can't go through too many without a buffer, but presumably
> that is part of optimizing the cell.
>
> -- glen

The glitching is from logic race conditions. Using transmission gates pretty much eliminates that as long as you use break-before-make connections. Then the capacitance of the line retains the last value until the new value comes up.

Rick

Article: 153416
I designed and routed the PCB myself. I have one RTL8201BL as the LAN PHY and a Xilinx Spartan 3 XC3S400 as the controller. I transmit raw packets. I don't get any errors when sending packets; I tested the transmit section at 95% bandwidth without a single packet loss. On receive, however, there are random errors. The corrupted bytes are also random and show no pattern. Only when the packet length becomes large (about 1400 bytes) do the errors occur more frequently (about 3-4 out of 20 packets). I don't know how to debug the problem. Thanks in advance for any help :)

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 153417
On 21/02/2012 05:04, nba83 wrote:
> i designed and rout the pcb board,i have one RTL8201BL as lan phy layer and
> Xilinx Spartan 3 XC3s400 as controller. i transmit raw packets. i don't
> have any error in sending packets, i tested transmit section at 95%
> bandwidth without a packet loss, but at receive there are random error
> receiving packets. the bytes that are corrupt are also random and does not
> have a pattern. only when the packet length become large(about 1400 Bytes),
> the error occur more frequent about (3-4 out of 20 packets),
> i don't know how i can debug the problem,
> tnx in advanced for help :)
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com

I can only make the most general suggestions.

How are you generating the test packets, and are you sure they are good?
Can you check the signal integrity on either side of the PHY?
Are the errors similar to those you got when the power supply was poor?
Is the position of the errored byte at the start or end of the packet, or just anywhere?
Do you ever get more than one error per packet?
Does it get worse or better with any boundary cases, i.e. does it never fail with a minimum length packet, or fail a lot more often with maximum length?
Is there any sensitivity to packet contents?
Is there any sensitivity to the rate of packets?

MK

Article: 153418
On Feb 20, 9:04 pm, "nba83" <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote:
> i designed and rout the pcb board,i have one RTL8201BL as lan phy layer and
> Xilinx Spartan 3 XC3s400 as controller. i transmit raw packets. i don't
> have any error in sending packets, i tested transmit section at 95%
> bandwidth without a packet loss, but at receive there are random error
> receiving packets. the bytes that are corrupt are also random and does not
> have a pattern. only when the packet length become large(about 1400 Bytes),
> the error occur more frequent about (3-4 out of 20 packets),
> i don't know how i can debug the problem,
> tnx in advanced for help :)
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com

Have you checked all your timing? Making setup/hold times for the Rx side of GigE can be tough with a Spartan. Been there, done that, you need to be very careful.

John P

Article: 153419
Hi,

How do I set up synopsys_sim.setup for simulating both Verilog and VHDL using VCS for a Xilinx FPGA?

For instance, I need SIMPRIM to point to both the VHDL and the Verilog compiled library paths. I tried simply appending them with a ":" separator, but it failed.

/michael

Article: 153420
>On Feb 20, 9:04 pm, "nba83"
><nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote:
>> i designed and rout the pcb board,i have one RTL8201BL as lan phy layer and
>> Xilinx Spartan 3 XC3s400 as controller. i transmit raw packets. i don't
>> have any error in sending packets, i tested transmit section at 95%
>> bandwidth without a packet loss, but at receive there are random error
>> receiving packets. the bytes that are corrupt are also random and does not
>> have a pattern. only when the packet length become large(about 1400 Bytes),
>> the error occur more frequent about (3-4 out of 20 packets),
>> i don't know how i can debug the problem,
>> tnx in advanced for help :)
>>
>> ---------------------------------------
>> Posted through http://www.FPGARelated.com
>
>Have you checked all your timing? Making setup/hold times for the Rx
>side of GigE can be tough with a Spartan. Been there, done that, you
>need to be very careful.
>
>John P
>

No, I don't know how to do that. I don't know what the setup/hold times for Rx are, and my PHY is 100 Mbit/s, not Gig. I used the RTL8201BL and wrote a simple loopback program in which I connected RXDV to TXE and RXD to TXD, each registered on its corresponding clock (RXCLK and TXCLK). Do I need to do some timing for this simple program?

Here is my code:

    always @ (posedge RTL_RXCLK)
    begin
        data <= RTL_RXD;
        en   <= RTL_RXDV;
    end

    always @ (posedge RTL_TXCLK)
    begin
        RTL_TXD_I <= data;
        RTL_TXE_I <= en;
    end

---------------------------------------
Posted through http://www.FPGARelated.com
Article: 153421
"nba83" <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote in message news:H5qdnQ4pDqAm-tnSnZ2dnUVZ_q-dnZ2d@giganews.com...
> here is my code:
>
> always @ (posedge RTL_RXCLK)
> begin
> data <= RTL_RXD;
> en <= RTL_RXDV;
> end
>
> always @ (posedge RTL_TXCLK)
> begin
> RTL_TXD_I <= data;
> RTL_TXE_I <= en;
> end

Huh? I haven't been reading this thread in detail, but how are these clocks synchronized? Maybe they aren't, hence your problem.
Article: 153422
>"nba83" <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote in message
>news:H5qdnQ4pDqAm-tnSnZ2dnUVZ_q-dnZ2d@giganews.com...
>> here is my code:
>>
>> always @ (posedge RTL_RXCLK)
>> begin
>> data <= RTL_RXD;
>> en <= RTL_RXDV;
>> end
>>
>> always @ (posedge RTL_TXCLK)
>> begin
>> RTL_TXD_I <= data;
>> RTL_TXE_I <= en;
>> end
>
>Huh? I havent been reading this thread in details, but how are these clocks
>syncronized? Maybe they arent, hence your problem.
>

I wrote it as two processes so that, if the two clocks are not synchronized, the data won't be read and transmitted incorrectly.

---------------------------------------
Posted through http://www.FPGARelated.com
Article: 153423
"nba83" <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote in message news:dtKdneQMFslBK9nSnZ2dnUVZ_hudnZ2d@giganews.com...
> >"nba83" <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote in message
>>news:H5qdnQ4pDqAm-tnSnZ2dnUVZ_q-dnZ2d@giganews.com...
>>> here is my code:
>>>
>>> always @ (posedge RTL_RXCLK)
>>> begin
>>> data <= RTL_RXD;
>>> en <= RTL_RXDV;
>>> end
>>>
>>> always @ (posedge RTL_TXCLK)
>>> begin
>>> RTL_TXD_I <= data;
>>> RTL_TXE_I <= en;
>>> end
>>
>>Huh? I havent been reading this thread in details, but how are these clocks
>>syncronized? Maybe they arent, hence your problem.
>>
> i write it in two processes to ensure if the two clk are not syncronized
> the data won't read and transmited bad.

So you are trying to send data grabbed in one clock domain into a different one? You know that will fail?
If the clocks have nearly the same frequency, it will work for periods when the clock phases are close.
If txclk is derived from rxclk, you may get it to work, but first make a common domain.
Article: 153424
>"nba83" <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote in message
>news:dtKdneQMFslBK9nSnZ2dnUVZ_hudnZ2d@giganews.com...
>> >"nba83" <nba_baheri@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.com> wrote in message
>>>news:H5qdnQ4pDqAm-tnSnZ2dnUVZ_q-dnZ2d@giganews.com...
>>>> here is my code:
>>>>
>>>> always @ (posedge RTL_RXCLK)
>>>> begin
>>>> data <= RTL_RXD;
>>>> en <= RTL_RXDV;
>>>> end
>>>>
>>>> always @ (posedge RTL_TXCLK)
>>>> begin
>>>> RTL_TXD_I <= data;
>>>> RTL_TXE_I <= en;
>>>> end
>>>
>>>Huh? I havent been reading this thread in details, but how are these clocks
>>>syncronized? Maybe they arent, hence your problem.
>>>
>> i write it in two processes to ensure if the two clk are not syncronized
>> the data won't read and transmited bad.
>
>So you are trying to send some data grabbed in one clock domain into a
>different? You know that will fail?
>If the clocks are almost similar, it will work for periods when the clock
>phases are close.
>If txclk is derived from rxclk, you may get it to work, but first make a
>common domain.
>

Thanks for your comment. So what should I do? I have tested the following code too, but I still get errors in the received packets:

    assign RTL_TXE=RTL_RXDV;
    assign RTL_TXD=RTL_RXD;

Thanks in advance for any help :)

Neda Baheri

---------------------------------------
Posted through http://www.FPGARelated.com
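One common way to follow the "make a common domain" advice in this thread is to pass the received nibbles through a small dual-clock FIFO, so that nothing sampled on RTL_RXCLK is ever read directly on RTL_TXCLK. The block below is only a minimal sketch under stated assumptions, not code from any poster: the module name cdc_fifo, its ports, and the parameters DW and AW are hypothetical, the register initializers assume the synthesis tool honours power-up values, and there is no reset. It has not been simulated against the RTL8201BL.

```verilog
// Sketch of a dual-clock FIFO with Gray-coded pointers.
// All names here are hypothetical and chosen for illustration only.
module cdc_fifo #(
    parameter DW = 4,             // data width (one MII nibble)
    parameter AW = 4              // depth = 2**AW entries, requires AW >= 2
) (
    input  wire          wr_clk,  // e.g. RTL_RXCLK
    input  wire          wr_en,   // e.g. RTL_RXDV
    input  wire [DW-1:0] wr_data, // e.g. RTL_RXD
    output wire          full,
    input  wire          rd_clk,  // e.g. RTL_TXCLK
    input  wire          rd_en,
    output reg  [DW-1:0] rd_data, // valid one rd_clk cycle after an accepted read
    output wire          empty
);
    reg [DW-1:0] mem [0:(1<<AW)-1];

    // Pointers carry one extra bit so "full" and "empty" can be told apart.
    reg [AW:0] wr_bin = 0, wr_gray = 0;
    reg [AW:0] rd_bin = 0, rd_gray = 0;

    // Two-flop synchronizers: each Gray pointer is sampled in the other domain.
    reg [AW:0] rd_gray_s1 = 0, rd_gray_s2 = 0;  // clocked by wr_clk
    reg [AW:0] wr_gray_s1 = 0, wr_gray_s2 = 0;  // clocked by rd_clk

    wire [AW:0] wr_bin_next = wr_bin + 1'b1;
    wire [AW:0] rd_bin_next = rd_bin + 1'b1;

    // In Gray code, "full" means the pointers match except for the top two bits.
    assign full  = (wr_gray == {~rd_gray_s2[AW:AW-1], rd_gray_s2[AW-2:0]});
    assign empty = (rd_gray == wr_gray_s2);

    always @(posedge wr_clk) begin
        rd_gray_s1 <= rd_gray;
        rd_gray_s2 <= rd_gray_s1;
        if (wr_en && !full) begin
            mem[wr_bin[AW-1:0]] <= wr_data;
            wr_bin  <= wr_bin_next;
            wr_gray <= wr_bin_next ^ (wr_bin_next >> 1);  // binary to Gray
        end
    end

    always @(posedge rd_clk) begin
        wr_gray_s1 <= wr_gray;
        wr_gray_s2 <= wr_gray_s1;
        if (rd_en && !empty) begin
            rd_data <= mem[rd_bin[AW-1:0]];
            rd_bin  <= rd_bin_next;
            rd_gray <= rd_bin_next ^ (rd_bin_next >> 1);  // binary to Gray
        end
    end
endmodule
```

For a loopback, the read side would probably wait until a few nibbles are stored before asserting the transmit enable, then read one nibble per RTL_TXCLK cycle until the frame is drained. Assuming each clock is within about ±100 ppm of 25 MHz, the two domains drift apart by less than one clock period over a maximum-length frame of roughly 3000 nibbles, so a handful of entries of slack should be enough. That drift would also be consistent with longer packets failing more often in the direct register-to-register version.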