Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Allan Herriman <allanherriman@hotmail.com> wrote:
> On Thu, 29 Nov 2012 16:35:43 -0800, mmihai wrote:

(snip)
>> My problem:
>> - V6 design
>> - clocking structure with a IBUF to BUFR which drives a
>>   BUFG, so both BUFR/BUFG are on the same clock domain
>> - the BUFR also clocks few flops
>> - BUFG clocks main logic
>> - par finishes w/o hold errs
>> - I can detect data transfer errors between the flops clocked by BUFR
>>   and the flops clocked by BUFG (direction is data from BUFR flops ->
>>   BUFG flops, no logic, just data transfer).
>> - timingan reports no hold errs on those paths
>> - different runs (different placement) will produce a full working design
>> [- ISE 13.4... but it should not matter]

Seems like according to

http://www.xilinx.com/support/documentation/user_guides/ug362.pdf

especially in the summary near the end, that BUFR --> BUFG
is allowed, though it doesn't say anything about the timing.

>> Anyone seen this? Any feedback about this structure?

>> Goal is to be able to produce predictable results... Now I have no way
>> to do that unless I try it on HW ... but my confidence level is low
>> (i.e. if it works on one device will it work on //all//?).

> I understand that the circuit looks like this:

> Pin--->IBUF--->BUFR--+--->BUFG---+-->
>                      |           |
>                      |           |
>                      |           |
>                      |           |
>                   +-----+     +-----+
>                   | FF Q|---->|D FF |
>                   +-----+     +-----+
>                               ^
>                               |
>                           hold time
>                          errors here

I wonder if MMCMs would help?

> I ran into that exact same problem a couple of years ago. I was given
> the task of fixing (someone else's) design that featured a similar misuse
> of clock buffers in a Virtex 4. I think the tools might have been ISE
> 8.2.

As well as I understand it, for FF's clocked off the same clock,
(and clock edge) you should never have hold problems. The minimum
logic between Q and the next D is long enough that, even with
maximum clock skew, a D can never change that fast.
(Usually described by saying that the hold time is 0.)

There is much discussion on using MMCMs to generate zero delay
clocks. That is, the MMCM provides enough delay such that, for
a clock of constant frequency, it can match the given edge.

> PAR and Trace said it was fine. Actual tests on the chip over
> temperature showed otherwise.

> Moral: BUFGs have a large delay. Don't expect PAR to be able to make up
> for that amount of hold time using routing.

As well as I understand it (which might not be all that well)
it never tries to make up hold time.

> You need to avoid going from your BUFR domain into the BUFG domain on the
> same clock edge.

> One solution might be to insert FFs clocked from the other edge of the
> BUFG clock.

> Another solution might be to connect the BUFG input to the IBUF output
> (not via the BUFR).

That is what I would have thought one would do.

-- glen

Article: 154601
On Friday, November 30, 2012 4:46:51 AM UTC-8, Allan Herriman wrote:

Thanks for your comments.

Most interesting ... a different chip (V4) had the same issues....

> Moral: BUFGs have a large delay. Don't expect PAR to be able to make up
> for that amount of hold time using routing.

I don't think it is the BUFG delay; my guess is that it is more related to routing.
Based on the datasheet, BUFG delay is 0.10ns .... reported "Clock Path Skew" is 1.851ns... whatever that includes.

> You need to avoid going from your BUFR domain into the BUFG domain on the
> same clock edge.
> One solution might be to insert FFs clocked from the other edge of the
> BUFG clock.

I thought about that .... can't do it, the clock is fast and it won't meet setup for half a clock cycle.

> Another solution might be to connect the BUFG input to the IBUF output
> (not via the BUFR).

Can't do that either :-(
a) the pin is not BUFG capable
b) even if it was capable, it adds too much delay .... the flops clocked by BUFR sample the input; having a BUFG clock those won't meet hold time on the IOs because the clock is delayed too much.

Some more notes:
- I don't constrain the placement of BUFR/BUFG.
- out of 27 signals, the ones failing are always those with the smallest hold slack (less than .250ns?). Depending on placement, the number of failing signals is anywhere between 0 and 5.

I find it strange that the tools cannot handle the clock tree..... I do not think my structure is that exotic. What is the use of regional clocks if one cannot transfer data to a global clock?

Any way to constrain the hold target >0.0 only for some specific paths?

-- mmihai

Article: 154602
On Friday, November 30, 2012 8:12:49 AM UTC-8, glen herrmannsfeldt wrote:

> http://www.xilinx.com/support/documentation/user_guides/ug362.pdf
>
> especially in the summary near the end, that BUFR --> BUFG
> is allowed, though it doesn't say anything about the timing.

I did open a webcase with Xilinx ... I've sent them my routed .ncd. Nobody said my clocking scheme is not allowed or not supported.
I don't know if I've moved past 1st tier support though ..... so far I have not received any meaningful help to solve my problem :-(

> I wonder if MMCMs would help?

Thought about this too .... it won't work: the input freq can change ... I don't think the PLL/DLL would like that.

-- mmihai

Article: 154603
On 11/30/2012 12:55 AM, Bart Fox wrote: > rickman wrote: >> I think for FPGAs it is very common to specify an async reset to assign >> the configuration value of each FF, so I have come to expect async >> resets. > Dream on. It ist *not* common to use asnchronous resets on every flipflop. I didn't say it was common to specify an async reset on *every* FF. I said it is common to specify an async reset so that the configuration value is assigned. You can do that on all of the FFs in the design or you can do that one the twelve FFs that control your design or you can do it on none. But if you want set a configuration value it is very common to use an async reset to do that. I think some or perhaps most vendors now provide a non-portable method of setting the configuration value and some even will use an initialization value in the declaration to do that. But I don't believe this is very portable as of yet. > This is your opinion or the opinion of the academic VHDL book you read. This is based on my experience with the tools and looking at other's code. > In synchronous designs an asynchronous reset has no right. > Make an synchronous reset from your asynchronous reset input on one > place, and all will work. I have no idea why you say an async reset won't work in a sync design. Do I misunderstand your statement? I am talking about FPGAs where every chip has an async reset during configuration. You can choose to use this in your design or not, but it is there and it works no matter what you do. I supposed I should have qualified my statement to FPGAs that use RAM configuration and have to be configured. There aren't a lot of true flash based device that come up instantly without a configuration process. RickArticle: 154604
On 11/30/2012 4:05 AM, Thomas Stanka wrote: > > You are right, that clock gateing might destory a bit the story, but > in general clock skew is handled during layout and checked by STA. > During simulation of rtl code I have to assume, that layout does not > destroy my functionality, otherwise I could stop simualtion anyway. > > regards Thomas My understanding is that logic delays in FPGAs are always longer than clock delays on the clock trees so that you can't have a hold time violation. If the clock is routed on the signal routing then all bets are off. I don't know how well the timing analysis does with verifying clock delays on the signal routing because I have never needed to use it that I can remember. RickArticle: 154605
>On 11/30/2012 12:55 AM, Bart Fox wrote: >> rickman wrote: > >I have no idea why you say an async reset won't work in a sync design. >Do I misunderstand your statement? I am talking about FPGAs where every >chip has an async reset during configuration. You can choose to use >this in your design or not, but it is there and it works no matter what >you do. I supposed I should have qualified my statement to FPGAs that >use RAM configuration and have to be configured. There aren't a lot of >true flash based device that come up instantly without a configuration >process. > >Rick > Indeed. Altera recommends using the reset async port but with sync signal pre-synchronised. Xilinx, I believe, recommends using synchronous sync. But again we need to be careful about our wording. Synchronous sync is actually applied to input D through logic and does not mean necessarily it is pre-synchronised. Whether you name it async or sync, the signal be default is not pre-synchronised and for sake of timing at release, they have to be generated from flip's clock domain before applying it. With today's large designs I prefer not to apply reset unless absolutely needed. I know many of us will apply it as routine to masses of buses at every node but I believe it puts massive burden on fitter to meet removal/recovery timing when such effort better be directed somewhere more critical. Kaz --------------------------------------- Posted through http://www.FPGARelated.comArticle: 154606
>>On 11/30/2012 12:55 AM, Bart Fox wrote: >>> rickman wrote: >> >>I have no idea why you say an async reset won't work in a sync design. >>Do I misunderstand your statement? I am talking about FPGAs where every >>chip has an async reset during configuration. You can choose to use >>this in your design or not, but it is there and it works no matter what >>you do. I supposed I should have qualified my statement to FPGAs that >>use RAM configuration and have to be configured. There aren't a lot of >>true flash based device that come up instantly without a configuration >>process. >> >>Rick >> > Indeed. Altera recommends using the reset async port but with sync signal pre-synchronised. Xilinx, I believe, recommends using synchronous sync. But again we need to be careful about our wording. Synchronous sync is actually applied to input D through logic and does not mean necessarily it is pre-synchronised. Whether you name it async or sync, the signal be default is not pre-synchronised and for sake of timing at release, they have to be generated from flip's clock domain before applying it. With today's large designs I prefer not to apply reset unless absolutely needed. I know many of us will apply it as routine to masses of buses at every node but I believe it puts massive burden on fitter to meet removal/recovery timing when such effort better be directed somewhere more critical. Note also the resource difference between the case of using async port(just routing) and sync reset(logic). Kaz --------------------------------------- Posted through http://www.FPGARelated.comArticle: 154607
On Fri, 30 Nov 2012 16:12:49 +0000, glen herrmannsfeldt wrote:

> Allan Herriman <allanherriman@hotmail.com> wrote:
>> On Thu, 29 Nov 2012 16:35:43 -0800, mmihai wrote:
>
> (snip)
>>> My problem:
>>> - V6 design
>>> - clocking structure with a IBUF to BUFR which drives a
>>>   BUFG, so both BUFR/BUFG are on the same clock domain
>>> - the BUFR also clocks few flops
>>> - BUFG clocks main logic
>>> - par finishes w/o hold errs
>>> - I can detect data transfer errors between the flops clocked by
>>>   BUFR and the flops clocked by BUFG (direction is data from BUFR flops
>>>   -> BUFG flops, no logic, just data transfer).
>>> - timingan reports no hold errs on those paths
>>> - different runs (different placement) will produce a full working design
>>> [- ISE 13.4... but it should not matter]
>
> Seems like according to
>
> http://www.xilinx.com/support/documentation/user_guides/ug362.pdf
>
> especially in the summary near the end, that BUFR --> BUFG is allowed,
> though it doesn't say anything about the timing.
>
>>> Anyone seen this? Any feedback about this structure?
>
>>> Goal is to be able to produce predictable results... Now I have no way
>>> to do that unless I try it on HW ... but my confidence level is low
>>> (i.e. if it works on one device will it work on //all//?).
>
>> I understand that the circuit looks like this:
>
>> Pin--->IBUF--->BUFR--+--->BUFG---+-->
>>                      |           |
>>                      |           |
>>                      |           |
>>                      |           |
>>                   +-----+     +-----+
>>                   | FF Q|---->|D FF |
>>                   +-----+     +-----+
>>                               ^
>>                               |
>>                     hold time errors here
>
> I wonder if MMCMs would help?

MMCMs would not help for the problem I saw - the "clock" was bursty.

>> I ran into that exact same problem a couple of years ago. I was given
>> the task of fixing (someone else's) design that featured a similar
>> misuse of clock buffers in a Virtex 4. I think the tools might have
>> been ISE 8.2.
>
> As well as I understand it, for FF's clocked off the same clock,
> (and clock edge) you should never have hold problems. The minimum logic
> between Q and the next D is long enough that, even with maximum clock
> skew, a D can never change that fast.
> (Usually described by saying that the hold time is 0.)

I fully agree that it *should* work. However, for at least V4 (and now it seems V6 as well) Xilinx's model of min / max delays on their chips isn't too good, and PAR will fail to compensate for clock skew if that skew is as large as a BUFG delay.

Please note I said BUFG delay not BUFG skew. BUFG delay is > 1ns. BUFG skew is usually < 0.3ns.

>> Moral: BUFGs have a large delay. Don't expect PAR to be able to make
>> up for that amount of hold time using routing.
>
> As well as I understand it (which might not be all that well) it never
> tries to make up hold time.

PAR always tries to make up hold time. There is an entire pass in PAR dedicated to that process.

I don't have any PAR log files handy, but I believe it's easy to spot: look for the timing score. It will have a score for setup and another score for hold. The initial passes in PAR will reduce the setup score, with the hold time score remaining constant. Then towards the end (when it's finished working on the setup times) the hold time score will drop, usually to zero.

Regards,
Allan

Article: 154608
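As a rough illustration of the hold argument in the post above, the sketch below evaluates the usual simplified same-edge hold check, hold slack = Tco(min) + routing(min) - clock skew - Thold. The function name and the 0.3 ns Tco and routing values are made up for illustration; only the 1.851 ns "Clock Path Skew" figure comes from this thread. It is not a model of what PAR or Trace actually compute.

```python
def hold_slack_ns(t_co_min, t_route_min, clock_skew, t_hold=0.0):
    """Simplified same-edge hold slack in ns.

    clock_skew is how much later the clock reaches the destination flop
    (BUFG side) than the source flop (BUFR side). Negative slack means the
    new data can arrive before the destination has finished capturing the
    old data.
    """
    return t_co_min + t_route_min - clock_skew - t_hold


CLOCK_PATH_SKEW_NS = 1.851  # figure reported in the thread
T_CO_MIN_NS = 0.3           # hypothetical minimum clock-to-out

if __name__ == "__main__":
    # Sweep a few hypothetical minimum routing delays.
    for route in (0.3, 1.0, 1.6, 2.0):
        slack = hold_slack_ns(T_CO_MIN_NS, route, CLOCK_PATH_SKEW_NS)
        verdict = "ok" if slack >= 0 else "HOLD VIOLATION"
        print(f"routing {route:.1f} ns -> hold slack {slack:+.3f} ns ({verdict})")
```

With these assumed numbers the router has to add well over 1 ns of data-path delay before the slack turns positive, and the margin is then only as good as the minimum-delay model behind it.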
On Fri, 30 Nov 2012 11:22:35 -0800, mmihai wrote:

> Any way to constrain the hold target >0.0 only for some specific paths?

Simply by specifying clocked logic you have constrained the hold time to be > 0 ns.

I don't know of a way of adding extra margin though. I believe the best approach is to avoid the need for extra margin.

This is a tool bug. You have zero chance of fixing the tool, however you do have a good chance of being able to step around the bug.

Some other suggestions:

- Lock the placement of the BUFG and BUFR. You might find there is some magic combination of placements that just works. Earlier you said that some runs of PAR would produce designs that worked. Copy the placement from those runs as a starting point.

- Constrain the logic in the BUFG domain to be physically apart from the BUFR region. This forces longer routes on the chip that will improve your hold time margin.

- Finally the brute force approach: treat the BUFG and BUFR clocks as if they were different clock domains. Use some sort of FIFO that is designed to handle different clock domains to pass data from the BUFR domain to the BUFG domain.

Regards,
Allan

Article: 154609
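For the FIFO suggestion above: a common way to build a FIFO that crosses clock domains is to pass the read and write pointers between the domains in Gray code, so that only one bit changes per increment and a pointer sampled at a bad moment is off by at most one count. The sketch below shows just that pointer encoding. It is an assumption about how such a FIFO might be built, not a description of any particular Xilinx FIFO, and the function names are hypothetical.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a Gray-coded value back to binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions that differ between a and b."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    depth = 16  # hypothetical FIFO depth (power of two)
    for i in range(depth):
        cur = bin_to_gray(i)
        nxt = bin_to_gray((i + 1) % depth)
        print(f"{i:2d}: {cur:04b} -> {nxt:04b} ({bits_changed(cur, nxt)} bit changes)")
```

The single-bit-change property also holds across the wrap from the last count back to zero, which is why power-of-two depths are the usual choice.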
>> On 11/30/2012 4:05 AM, Thomas Stanka wrote:
> > You are right, that clock gateing might destory a bit the
> > story, but in general clock skew is handled during layout and
> > checked by STA. During simulation of rtl code I have to assume,
> > that layout does not destroy my functionality, otherwise I
> > could stop simualtion anyway.

Short of required hand placement, there is no way to 'handle' clock skew in layout in an FPGA/CPLD. Logic induced skew between any two clocks creates two clock domains. If the design is such that the two clocks are treated as unrelated (for example if data only moves from one domain to the other through a dual clock fifo) then the design will work. You might get lucky, you might not if you think that any place and route tool will help you out here. Most likely you will encounter the 'bad' sort of luck, not the 'good luck'.

> On Friday, November 30, 2012 3:07:17 PM UTC-5, rickman wrote:
> My understanding is that logic delays in FPGAs are always longer
> than clock delays on the clock trees so that you can't have a
> hold time violation. If the clock is routed on the signal routing
> then all bets are off. I don't know how well the timing analysis
> does with verifying clock delays on the signal routing because I
> have never needed to use it that I can remember

It doesn't need to depend on routing. Thomas' example introduced a logic delay between the two clocks. The implementation of that logic will create a race condition.

Kevin Jennings

Article: 154610
On 11/30/2012 3:36 PM, kaz wrote: >> On 11/30/2012 12:55 AM, Bart Fox wrote: >>> rickman wrote: >> >> I have no idea why you say an async reset won't work in a sync design. >> Do I misunderstand your statement? I am talking about FPGAs where every >> chip has an async reset during configuration. You can choose to use >> this in your design or not, but it is there and it works no matter what >> you do. I supposed I should have qualified my statement to FPGAs that >> use RAM configuration and have to be configured. There aren't a lot of >> true flash based device that come up instantly without a configuration >> process. >> >> Rick >> > > Indeed. Altera recommends using the reset async port but with sync signal > pre-synchronised. That only makes sense if the delay in the async reset path is short enough and properly analyzed by the tools. There have been any number of discussions in these groups about how to properly reset a design and there is no consensus on the best way to do it. > Xilinx, I believe, recommends using synchronous sync. But > again we need to be careful about our wording. Synchronous sync is actually > applied to input D through logic and does not mean necessarily it is > pre-synchronised. Whether you name it async or sync, the signal be default > is not pre-synchronised and for sake of timing at release, they have to be > generated from flip's clock domain before applying it. The best way to do a reset is design specific. As I said above, there are many ways and no agreement on which is "best". > With today's large designs I prefer not to apply reset unless absolutely > needed. I know many of us will apply it as routine to masses of buses at > every node but I believe it puts massive burden on fitter to meet > removal/recovery timing when such effort better be directed somewhere more > critical. That depends on why you are using the reset. 
If you are only using it to establish the configuration values you can apply an async reset and not be concerned with routing since it will use only the dedicated async reset network. RickArticle: 154611
On 11/30/2012 3:49 PM, kaz wrote: > > Note also the resource difference between the case of using async port(just > > routing) and sync reset(logic). I'm not clear on this. I believe some architectures provide sync reset inputs separate from the D input and so use no logic. RickArticle: 154612
On 11/30/2012 9:25 PM, KJ wrote: >> On Friday, November 30, 2012 3:07:17 PM UTC-5, rickman wrote: >> My understanding is that logic delays in FPGAs are always longer >> than clock delays on the clock trees so that you can't have a >> hold time violation. If the clock is routed on the signal routing >> then all bets are off. I don't know how well the timing analysis >> does with verifying clock delays on the signal routing because I >> have never needed to use it that I can remember > > It doesn't need to depend on routing. Thomas' example introduced a logic delay between the two clocks. The implementation of that logic will create a race condition. I didn't see anything in Thomas' example that required "logic". Here is what I read. Did I read the wrong post? > To demonstrate your point though you simply need to generate the new clock as this: > > clk1 <= clk; This is not logic. In VHDL it inserts a delta delay which is a zero amount of time but treated as a delay in the simulator. By adding a delta delay it will disrupt signals from the earlier clk domain that drive FFs in the later clk1 domain. In a real chip there will be no delay. RickArticle: 154613
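The delta-delay effect described in the post above can be imitated with a toy model. The sketch below is not a VHDL simulator; it only mimics the ordering rule that signal assignments scheduled in one delta cycle take effect in the next. The function and argument names are hypothetical. Flop a is clocked by clk, and flop b captures a, clocked either by clk directly or by clk1 (from clk1 <= clk), which rises one delta later.

```python
def clock_edge(a, b, d, delayed_clock):
    """Return (a, b) after one rising edge of clk in the toy model.

    a: flop clocked by clk, with input d.
    b: flop with input a, clocked by clk (delayed_clock=False)
       or by clk1 <= clk (delayed_clock=True).
    """
    # Delta 0: clk rises. Every process sensitive to clk reads the
    # current signal values and schedules its update for the next delta.
    next_a = d
    if not delayed_clock:
        next_b = a  # b samples the OLD a, as a two-stage shift register should
        return next_a, next_b

    # Delta 1: the scheduled updates apply, so a takes its new value and
    # clk1 rises in the same delta.
    a = next_a
    # The process sensitive to clk1 runs now and reads the NEW a.
    next_b = a
    return a, next_b


if __name__ == "__main__":
    print("b on clk :", clock_edge(a=0, b=0, d=1, delayed_clock=False))
    print("b on clk1:", clock_edge(a=0, b=0, d=1, delayed_clock=True))
```

In the second case the data passes through both flops on a single edge, which is the simulation-side counterpart of a hold violation.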
rickman <gnuarm@gmail.com> wrote: > On 11/30/2012 3:49 PM, kaz wrote: >> Note also the resource difference between the case of using async port(just >> routing) and sync reset(logic). > I'm not clear on this. I believe some architectures provide > sync reset inputs separate from the D input and so use no logic. Well, it will still take some resources, but maybe not much. -- glenArticle: 154614
>rickman <gnuarm@gmail.com> wrote: >> On 11/30/2012 3:49 PM, kaz wrote: > >>> Note also the resource difference between the case of using async port(just >>> routing) and sync reset(logic). > >> I'm not clear on this. I believe some architectures provide >> sync reset inputs separate from the D input and so use no logic. > >Well, it will still take some resources, but maybe not much. > >-- glen > I am not aware of this type of architecture. I assume it synchronises reset per flip and thus seems a waste of silicon compared to the case of user pre-syncing it once for all relevant flips. Kaz --------------------------------------- Posted through http://www.FPGARelated.comArticle: 154615
On Thu, 29 Nov 2012 16:35:43 -0800, mmihai wrote: > Hi! > > I have a Xilinx webcase for about 2mo about this that goes nowhere ... > may be better luck here. > > My problem: > - V6 design - clocking structure with a IBUF to BUFR which drives a > BUFG, so both BUFR/BUFG are on the same clock domain - the BUFR also > clocks few flops - BUFG clocks main logic - par finishes w/o hold errs > - I can detect data transfer errors between the flops clocked by BUFR > and the flops clocked by BUFG > Anyone seen this? Any feedback about this structure? > > Goal is to be able to produce predictable results... Now I have no way > to do that unless I try it on HW ... but my confidence level is low > (i.e. if it works on one device will it work on //all//?). Is this a new design for the V6 or a port from another FPGA or a previous ISE release? Are you directly instantating these primitives and checking that they are still there in the RTL view? I had a problem some years ago when moving a design from ISE7 to ISE10 and the tools silently changed what I asked into something completely different; in my case it moved a DCM from the BUFG where it generated a nicely aligned x2 clock, to the BUFG input signal - considerably increasing skew between these clocks! So bugs in this area are not particularly new... - BrianArticle: 154616
On 12/1/2012 2:05 AM, kaz wrote: >> rickman<gnuarm@gmail.com> wrote: >>> On 11/30/2012 3:49 PM, kaz wrote: >> >>>> Note also the resource difference between the case of using async > port(just >>>> routing) and sync reset(logic). >> >>> I'm not clear on this. I believe some architectures provide >>> sync reset inputs separate from the D input and so use no logic. >> >> Well, it will still take some resources, but maybe not much. >> >> -- glen >> > > I am not aware of this type of architecture. I assume it synchronises > reset > per flip and thus seems a waste of silicon compared to the case of user > pre-syncing it once for all relevant flips. I don't know the Altera devices as intimately as others, but I learned the Xilinx stuff pretty well once. They have (had) two inputs to each FF, one for reset and one for set, which were configurable between sync and async. My point is that if this is built in, which is not uncommon I think, there are no "used" resources other than dedicated resources for logic in the case sync reset and nothing for async. I prefer to design my resets in a customized way where each section of logic is reset and released from reset asynchronously and each logic section separately takes care of the problems of cleanly starting up. That way there is no global reset competing for either routing or logic. Often nothing special needs to be done for a finite state machine (FSM) because it starts by waiting for some trigger signal anyway. Counters often use an enable which is disabled by default, etc. I pay attention to how my circuits operate from reset and so far this has not bitten me. RickArticle: 154617
On Saturday, December 1, 2012 1:47:06 AM UTC-5, rickman wrote:
> On 11/30/2012 9:25 PM, KJ wrote:
>> It doesn't need to depend on routing. Thomas' example introduced a
>> logic delay between the two clocks. The implementation of that logic
>> will create a race condition.

> I didn't see anything in Thomas' example that required "logic". Here is
> what I read. Did I read the wrong post?

The post that I was referring to has the following...

    Clk1 <= Clk when Selected else Other_Clk';
    [..]
    Clk2 <= Clk1 when Enabled else '0';
    [..]
    process (Clk)
    if rising_edge(Clk) then
    A <= B;
    [..]
    process (Clk2)
    if rising_edge(Clk2)
    B <= A;

I believe the point he was trying to make was that because of the simulation delta delay between Clk1 and Clk2 the two processes would not be clocked at the same time. While it is true that the processes would clock at different times, it's not really because of the simulation delta delay. An actual implementation of the above would have the same problem because it would need to synthesize the logic to create the gated clock. The propagation delay and additional routing delay in creating the additional clock would create a race condition for signals generated in the 'clk' domain and captured in the 'clk1' or 'clk2' domains.

> > clk1 <= clk;

> This is not logic. In VHDL it inserts a delta delay which is a zero
> amount of time but treated as a delay in the simulator. By adding a
> delta delay it will disrupt signals from the earlier clk domain that
> drive FFs in the later clk1 domain. In a real chip there will be no
> delay.

And that was my point.

Kevin Jennings

Article: 154618
On Friday, November 30, 2012 4:23:51 PM UTC-8, Allan Herriman wrote:
> On Fri, 30 Nov 2012 11:22:35 -0800, mmihai wrote:
>
> > Any way to constrain the hold target >0.0 only for some specific paths?
>
> Simply by specifying clocked logic you have constrained the hold time to
> be > 0 ns.
>
> I don't know of a way of adding extra margin though. I believe the best
> approach is to avoid the need for extra margin.

Yes, I meant extra margin. It seems the hold target is 0.0ns (I would guess the numbers include some padding).

> This is a tool bug. You have zero chance of fixing the tool, however you
> do have a good chance of being able to step around the bug.

It looks like a tool bug. It is very disturbing that it is not related to a particular version and that it shows up on multiple [virtex] families... I would expect things to work if STA has good numbers. My confidence in the tools took a hit ...

> Some other suggestions:
>
> - Lock the placement of the BUFG and BUFR. You might find there is some
> magic combination of placements that just works. Earlier you said that
> some runs of PAR would produce designs that worked. Copy the placement
> from those runs as a starting point.
>
> - constrain the logic in the BUFG domain to be physically apart from the
> BUFR region. This forces longer routes on the chip that will improve
> your hold time margin.

a) I did not see any correlation between passing/failing and a particular BUFR/BUFG placement.
b) I think it is a risky approach; if I can get a particular map/par run to work on some systems ... I have no guarantee it will be fine on //all// systems, over PVT.

> - Finally the brute force approach: treat the BUFG and BUFR clocks as if
> they were different clock domains. Use some sort of FIFO that is
> designed to handle different clock domains to pass data from the BUFR
> domain to the BUFG domain.

Something like this could be the best solution, if doable ... but it's a pity to add logic because the Xilinx tools can't handle the clock tree properly....

My logic looks very much like Figure 1-24/Page 28 from UG362, except I don't use BUFIO, so it is not that exotic.

-- mmihai

Article: 154619
On Saturday, December 1, 2012 2:28:43 AM UTC-8, Brian Drummond wrote: > Is this a new design for the V6 or a port from another FPGA or a previous > ISE release? Are you directly instantating these primitives and checking > that they are still there in the RTL view? New design ... and BUFR/BUFG instantiated by hand. I've looked on fpga_editor and the buffers are there. -- mmihaiArticle: 154620
On Sun, 02 Dec 2012 09:55:52 -0800, mmihai wrote: > On Saturday, December 1, 2012 2:28:43 AM UTC-8, Brian Drummond wrote: > >> Is this a new design for the V6 or a port from another FPGA or a >> previous ISE release? Are you directly instantating these primitives >> and checking that they are still there in the RTL view? > > New design ... and BUFR/BUFG instantiated by hand. > I've looked on fpga_editor and the buffers are there. Then it's pretty unlikely to be a regression to what I was seeing. - BrianArticle: 154621
mmihai <iiahim@yahoo.com> wrote: (snip, someone wrote) >> This is a tool bug. You have zero chance of fixing the tool, >> however you do have a good chance of being able to step >> around the bug. > It looks like a tool bug. > It is very disturbing that it is not related to a particular > version and it's on multiple [virtex] families... > I would expect the things to work if STA has good numbers. > My confidence in the tools took a hit ... (snip) > Something like this could be the best solution, if doable ... > but it's a pity to add logic because Xilinx tools can't > handle the clock tree properly.... It seems to me that they do pretty well. Well, the effects of voltage and temperature should be pretty much the same for all transistors on a chip. But process variations could be very different. They verify that the usual paths have delay variations that they can account for, and compute delays based on those. If there are some that they can't account for the delays, at least not to the accuracy required, then they don't guarantee those. As far as I understand, though mostly in general, the idea is to make clock skew in a clock tree small enough, relative to the minimum delay through routing, that two FFs clocked off the same clock can't violate hold time. The skew also must be added to the delay when verifying setup time. But that only works within one clock tree. Computing the variation between two clock trees is different. Now, it would be nice to say that some delay is not characterized enough to use, and so far I haven't seen that they do say that, but it isn't the tools' fault if the data isn't available. -- glenArticle: 154622
On 12/1/2012 7:33 PM, KJ wrote: > On Saturday, December 1, 2012 1:47:06 AM UTC-5, rickman wrote: >> On 11/30/2012 9:25 PM, KJ wrote: >>> It doesn't need to depend on routing. Thomas' example introduced a >>> logic delay between the two clocks. The implementation of that logic >>> will create a race condition. >> I didn't see anything in Thomas' example that required "logic". Here is >> what I read. Did I read the wrong post? > > The post that I referring to has the following... > Clk1<= Clk when Selected else Other_Clk'; > [..] > Clk2<= Clk1 when Enabled else '0'; > [..] > process (Clk) > if rising_edge(Clk) then > A<= B; > [..] > process (Clk2) > if rising_edge(Clk2) > B<= A; > > I believe the point he was trying to make was that because of the simulation delta delay between Clk1 and Clk2 the two processess would not be clocked at the same time. While it is true that the processes would clock at different times, it's not really because of the simulation delta delay. An actual implementation of the above would have the same problem because it would need to synthesize the logic to create the gated clock. The propogation delay and additional routing delay in creating the additional clock would create a race condition for signals generated in the 'clk' domain and captured in the 'clk1' or 'clk2' domains. > >>> clk1<= clk; > >> This is not logic. In VHDL it inserts a delta delay which is a zero >> amount of time but treated as a delay in the simulator. By adding a >> delta delay it will disrupt signals from the earlier clk domain that >> drive FFs in the later clk1 domain. In a real chip there will be no >> delay. > > And that was my point. > > Kevin Jennings I don't know where that code came from, but yes, I think you are accurately analyzing it. Your first post was replying to the OP's post containing the link to the blog. I didn't see anything like this in the blog code, there was no muxing of the clock. Where did you get the code shown above? RickArticle: 154623
On Fri, 30 Nov 2012 11:22:35 -0800, mmihai wrote:

> On Friday, November 30, 2012 4:46:51 AM UTC-8, Allan Herriman wrote:
>
> Thanks for your comments.
>
> Most interesting ... different chip(V4) had same issues....
>
>> Moral: BUFGs have a large delay. Don't expect PAR to be able to make
>> up for that amount of hold time using routing.
>
> I don't think is the BUFGs delay; my guess is more related to routing.
> Based on datasheet BUFG delay is 0.10ns .... reported "Clock Path Skew"
> is 1.851ns... whatever that includes.

Sorry I missed that earlier. You seem to be mixing up skew and delay.

> datasheet BUFG delay is 0.10ns

That figure is the BUFG skew, not the BUFG delay. It represents the worst case timing difference between outputs on the same BUFG. It isn't relevant to your problem.

The "Clock Path Skew" is the important figure. It is the difference between the time of arrival of the clock at the source (clocked from BUFR) flip flops and destination (clocked from BUFG) flip flops. In this case it is mostly made up of the BUFG delay. PAR has to include a routing delay to compensate for that skew.

An earlier comment:

>> One solution might be to insert FFs clocked from the other edge of the
>> BUFG clock.
>
> I thought about that .... can't do it, the clock is fast and it
> won't meet setup for half clock cycle.

It might not meet setup for half a clock cycle, but it doesn't have to! The skew works in your favour when using opposite edges and the requirement for setup time is half a clock cycle + 1.851ns. Unless you have a GHz clock that doesn't sound too hard.

Regards,
Allan

Article: 154624
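To put numbers on the opposite-edge argument above, here is a small sketch comparing a same-edge and an opposite-edge transfer from the early (BUFR) clock to the late (BUFG) clock. It uses a deliberately simplified model that ignores Tco, Tsu, Thold and duty-cycle error. The 4 ns clock period and the function names are hypothetical; the 1.851 ns skew is the figure quoted in the thread.

```python
SKEW_NS = 1.851  # "Clock Path Skew" quoted in the thread


def same_edge(period_ns, skew_ns=SKEW_NS):
    """Launch and capture on the same edge; the capture clock arrives skew_ns late."""
    return {
        # Time available to the data path before the next capture edge.
        "setup_budget_ns": period_ns + skew_ns,
        # The data must not change before the late capture edge has passed.
        "min_data_delay_for_hold_ns": skew_ns,
    }


def opposite_edge(period_ns, skew_ns=SKEW_NS):
    """Launch on one edge, capture on the opposite edge of the late clock."""
    return {
        "setup_budget_ns": period_ns / 2 + skew_ns,
        # The previous capture edge was half a period earlier, so the hold
        # requirement only bites when the skew exceeds half a period.
        "min_data_delay_for_hold_ns": max(0.0, skew_ns - period_ns / 2),
    }


if __name__ == "__main__":
    period = 4.0  # hypothetical 250 MHz clock
    print("same edge    :", same_edge(period))
    print("opposite edge:", opposite_edge(period))
```

Under these assumptions the opposite-edge transfer has about 3.85 ns for setup and needs no minimum data delay for hold, whereas the same-edge transfer needs the router to guarantee at least 1.851 ns of data-path delay.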
Glen,

All three (voltage, temperature and process) vary over a single die, but not by much. The trick is always "by how much?" Are we willing to live with slower guaranteed performance in order to simplify the analysis, or is it worth it to invest more in the analysis (NRE) to "speed up" the parts (recurring profit)?

Managing hold time is a lot more complicated than it used to be. In the past, the clock skew could always be less than Tco plus minimum routing by design, so they did not even spec hold time for the registers. Over time, the raw speed of the devices has out-stripped the skew of the clock tree, and hold time is a real problem that has to be taken care of in placement and routing. We users just don't have control over the clock tree itself to deal with the problem, like in other domains.

Andy