Messages from 154600

Article: 154600
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Fri, 30 Nov 2012 16:12:49 +0000 (UTC)

Allan Herriman <allanherriman@hotmail.com> wrote:
> On Thu, 29 Nov 2012 16:35:43 -0800, mmihai wrote:
 
(snip)
>> My problem:
>>  - V6 design
>>  - clocking structure with a IBUF to BUFR which drives a BUFG, so
>>    both BUFR/BUFG are on the same clock domain
>>  - the BUFR also clocks few flops
>>  - BUFG clocks main logic
>>  - par finishes w/o hold errs
>>  - I can detect data transfer errors between the flops clocked by BUFR
>>    and the flops clocked by BUFG (direction is data from BUFR flops ->
>>    BUFG flops, no logic, just data transfer).
>>  - timingan reports no hold errs on those paths
>>  - different runs (different placement) will produce a full working
>>    design
>> [- ISE 13.4... but it should not matter]

Seems like according to 

http://www.xilinx.com/support/documentation/user_guides/ug362.pdf

especially in the summary near the end, that BUFR --> BUFG
is allowed, though it doesn't say anything about the timing.
 
>> Anyone seen this? Any feedback about this structure?
 
>> Goal is to be able to produce predictable results... Now I have no way
>> to do that unless I try it on HW ... but my confidence level is low
>> (i.e. if it works on one device will it work on //all//?).
 
> I understand that the circuit looks like this:
 
> Pin--->IBUF--->BUFR--+--->BUFG---+-->
>                     |           |
>                     |           |
>                     |           |
>                     |           |
>                  +-----+     +-----+
>                  | FF Q|---->|D FF |
>                  +-----+     +-----+
>                            ^
>                            |
>                        hold time
>                        errors here

I wonder if MMCMs would help?
 
> I ran into that exact same problem a couple of years ago.  I was given 
> the task of fixing (someone else's) design that featured a similar misuse 
> of clock buffers in a Virtex 4.  I think the tools might have been ISE 
> 8.2.

As well as I understand it, for FF's clocked off the same clock,
(and clock edge) you should never have hold problems. The minimum
logic between Q and the next D is long enough that, even with
maximum clock skew, a D can never change that fast.
(Usually described by saying that the hold time is 0.)

There is much discussion on using MMCMs to generate zero delay clocks.

That is, the MMCM provides enough delay such that, for a clock of 
constant frequency, it can match the given edge.
 
> PAR and Trace said it was fine.  Actual tests on the chip over 
> temperature showed otherwise.
 
> Moral: BUFGs have a large delay.  Don't expect PAR to be able to make up 
> for that amount of hold time using routing.

As well as I understand it (which might not be all that well) it
never tries to make up hold time. 

> You need to avoid going from your BUFR domain into the BUFG domain on the 
> same clock edge.
> One solution might be to insert FFs clocked from the other edge of the 
> BUFG clock.
> Another solution might be to connect the BUFG input to the IBUF output 
> (not via the BUFR).

That is what I would have thought one would do. 

-- glen

Article: 154601
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: mmihai <iiahim@yahoo.com>
Date: Fri, 30 Nov 2012 11:22:35 -0800 (PST)

On Friday, November 30, 2012 4:46:51 AM UTC-8, Allan Herriman wrote:

Thanks for your comments.

Most interesting ... different chip(V4) had same issues....

> Moral: BUFGs have a large delay.  Don't expect PAR to be able to make up 
> for that amount of hold time using routing.

I don't think it is the BUFG delay; my guess is that it is more related to routing.
Based on the datasheet, BUFG delay is 0.10ns .... reported "Clock Path Skew" is 1.851ns... whatever that includes.

> You need to avoid going from your BUFR domain into the BUFG domain on the 
> same clock edge.
> One solution might be to insert FFs clocked from the other edge of the 
> BUFG clock.

I thought about that .... can't do it, the clock is fast and it won't meet setup for a half clock cycle.

> Another solution might be to connect the BUFG input to the IBUF output 
> (not via the BUFR).

Can't do that either :-(
 a) pin is not BUFG capable
 b) even if it were capable it adds too much delay .... the flops clocked by BUFR sample the input; having a BUFG clocking those won't meet hold time on the IOs because the clock is delayed too much.

Some more notes:
 - I don't constrain the placement of BUFR/BUFG.
 - out of 27 signals always the ones failing have the smallest hold slack
   (less than .250ns?). Depending on placement the number of failing signals
   is anywhere between 0 and 5
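
The hold check on a BUFR -> BUFG transfer can be sketched numerically. This is a minimal Python sketch, not output from the Xilinx tools: only the 1.851 ns clock path skew and the 0.250 ns slack figure come from the notes above; the clock-to-Q, routing and hold values are made-up assumptions chosen to give a small positive slack.

```python
def hold_slack_ns(clk_to_q_min, data_route_min, clock_skew, hold_time):
    """Hold slack: earliest data arrival minus latest required-stable time.

    clock_skew is how much later the capturing (BUFG) flop sees the edge
    than the launching (BUFR) flop. Negative slack means a hold violation.
    """
    return (clk_to_q_min + data_route_min) - (clock_skew + hold_time)


REPORTED_SKEW_NS = 1.851   # "Clock Path Skew" quoted in the post
CLK_TO_Q_MIN_NS = 0.30     # assumption, not a datasheet value
DATA_ROUTE_MIN_NS = 1.70   # assumption: routing delay on the data path
HOLD_TIME_NS = 0.0         # assumption

# What a timing report would show with these numbers.
modelled = hold_slack_ns(CLK_TO_Q_MIN_NS, DATA_ROUTE_MIN_NS,
                         REPORTED_SKEW_NS, HOLD_TIME_NS)

# Same path if the real skew on silicon were 0.25 ns larger than modelled.
on_silicon = hold_slack_ns(CLK_TO_Q_MIN_NS, DATA_ROUTE_MIN_NS,
                           REPORTED_SKEW_NS + 0.25, HOLD_TIME_NS)

print(f"modelled slack:   {modelled:+.3f} ns")
print(f"on-silicon slack: {on_silicon:+.3f} ns")
```

With these assumed numbers the modelled slack is positive but below 0.250 ns, and a skew error of 0.25 ns is enough to turn it negative, which would match paths that pass timing analysis yet fail on hardware.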

I find it strange the tools can not handle the clock tree..... I do not think my structure is that exotic. What is the use of regional clocks if one can not transfer data to a global clock?

Any way to constrain the hold target >0.0 only for some specific paths?

--
mmihai

Article: 154602
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: mmihai <iiahim@yahoo.com>
Date: Fri, 30 Nov 2012 11:32:43 -0800 (PST)

On Friday, November 30, 2012 8:12:49 AM UTC-8, glen herrmannsfeldt wrote:

> http://www.xilinx.com/support/documentation/user_guides/ug362.pdf
> 
> especially in the summary near the end, that BUFR --> BUFG
> is allowed, though it doesn't say anything about the timing.

I did open a webcase with Xilinx ... I've sent them my routed .ncd.
Nobody said my clocking scheme is not allowed or not supported. I don't know if I've moved beyond 1st tier support though ..... so far I have not received any meaningful help to solve my problem :-(

> I wonder if MMCMs would help?

Thought about this too .... it won't work:
input freq can change ... I don't think the PLL/DLL would like that

--
mmihai

Article: 154603
Subject: Re: VHDL expert puzzle
From: rickman <gnuarm@gmail.com>
Date: Fri, 30 Nov 2012 15:03:43 -0500

On 11/30/2012 12:55 AM, Bart Fox wrote:
> rickman wrote:
>> I think for FPGAs it is very common to specify an async reset to assign
>> the configuration value of each FF, so I have come to expect async
>> resets.
> Dream on. It is *not* common to use asynchronous resets on every flipflop.

I didn't say it was common to specify an async reset on *every* FF.  I
said it is common to specify an async reset so that the configuration
value is assigned.  You can do that on all of the FFs in the design or
you can do that on the twelve FFs that control your design or you can
do it on none.  But if you want to set a configuration value it is very
common to use an async reset to do that.

I think some or perhaps most vendors now provide a non-portable method 
of setting the configuration value and some even will use an 
initialization value in the declaration to do that.  But I don't believe 
this is very portable as of yet.

> This is your opinion or the opinion of the academic VHDL book you read.

This is based on my experience with the tools and looking at others' code.

> In synchronous designs an asynchronous reset has no right.
> Make an synchronous reset from your asynchronous reset input on one
> place, and all will work.

I have no idea why you say an async reset won't work in a sync design. 
Do I misunderstand your statement?  I am talking about FPGAs where every 
chip has an async reset during configuration.  You can choose to use 
this in your design or not, but it is there and it works no matter what 
you do.  I suppose I should have qualified my statement to FPGAs that
use RAM configuration and have to be configured.  There aren't a lot of
true flash based devices that come up instantly without a configuration
process.

Rick

Article: 154604
Subject: Re: VHDL expert puzzle
From: rickman <gnuarm@gmail.com>
Date: Fri, 30 Nov 2012 15:07:17 -0500

On 11/30/2012 4:05 AM, Thomas Stanka wrote:
>
> You are right, that clock gating might destroy the story a bit, but
> in general clock skew is handled during layout and checked by STA.
> During simulation of rtl code I have to assume, that layout does not
> destroy my functionality, otherwise I could stop simulation anyway.
>
> regards Thomas

My understanding is that logic delays in FPGAs are always longer than 
clock delays on the clock trees so that you can't have a hold time 
violation.  If the clock is routed on the signal routing then all bets 
are off.  I don't know how well the timing analysis does with verifying 
clock delays on the signal routing because I have never needed to use it 
that I can remember.

Rick

Article: 154605
Subject: Re: VHDL expert puzzle
From: "kaz" <3619@embeddedrelated>
Date: Fri, 30 Nov 2012 14:36:21 -0600

>On 11/30/2012 12:55 AM, Bart Fox wrote:
>> rickman wrote:
>
>I have no idea why you say an async reset won't work in a sync design. 
>Do I misunderstand your statement?  I am talking about FPGAs where every 
>chip has an async reset during configuration.  You can choose to use 
>this in your design or not, but it is there and it works no matter what 
>you do.  I supposed I should have qualified my statement to FPGAs that 
>use RAM configuration and have to be configured.  There aren't a lot of 
>true flash based device that come up instantly without a configuration 
>process.
>
>Rick
>

Indeed. Altera recommends using the reset async port but with sync signal
pre-synchronised. Xilinx, I believe, recommends using synchronous reset. But
again we need to be careful about our wording. Synchronous reset is actually
applied to input D through logic and does not necessarily mean it is
pre-synchronised. Whether you name it async or sync, the signal by default
is not pre-synchronised and, for the sake of timing at release, it has to be
generated from the flip's clock domain before applying it.
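
The "async port, pre-synchronised signal" arrangement is usually built as a reset synchronizer: assertion is asynchronous, release is aligned to the clock. A minimal behavioural sketch in Python, assuming a two-flop chain with the first D input tied high; the class and method names are hypothetical, and a real design would express this in HDL.

```python
class ResetSynchronizer:
    """Two flip-flops with asynchronous clear; first D input tied to 1.

    The output is 0 while downstream logic should be held in reset and
    becomes 1 only after two clock edges with the raw reset released.
    """

    def __init__(self):
        self.ff1 = 0
        self.ff2 = 0

    def step(self, rst_async, clock_edge):
        if rst_async:
            # Asynchronous assertion: takes effect without a clock edge.
            self.ff1 = 0
            self.ff2 = 0
        elif clock_edge:
            # Synchronous release: the 1 shifts through both flops.
            self.ff2, self.ff1 = self.ff1, 1
        return self.ff2


sync = ResetSynchronizer()
outputs = [
    sync.step(rst_async=True, clock_edge=False),   # reset asserted
    sync.step(rst_async=False, clock_edge=False),  # released, no edge yet
    sync.step(rst_async=False, clock_edge=True),   # first edge after release
    sync.step(rst_async=False, clock_edge=True),   # second edge: released
    sync.step(rst_async=True, clock_edge=False),   # asserted again at once
]
print(outputs)
```

The point of the structure is that every flop driven by the synchronizer output leaves reset relative to its own clock, so removal/recovery timing can be analyzed like any other synchronous path.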

With today's large designs I prefer not to apply reset unless absolutely
needed. I know many of us will apply it as routine to masses of buses at
every node but I believe it puts massive burden on fitter to meet
removal/recovery timing when such effort better be directed somewhere more
critical.

Kaz 	   

---------------------------------------		
Posted through http://www.FPGARelated.com

Article: 154606
Subject: Re: VHDL expert puzzle
From: "kaz" <3619@embeddedrelated>
Date: Fri, 30 Nov 2012 14:49:21 -0600

>>On 11/30/2012 12:55 AM, Bart Fox wrote:
>>> rickman wrote:
>>
>>I have no idea why you say an async reset won't work in a sync design. 
>>Do I misunderstand your statement?  I am talking about FPGAs where every
>>chip has an async reset during configuration.  You can choose to use
>>this in your design or not, but it is there and it works no matter what 
>>you do.  I supposed I should have qualified my statement to FPGAs that 
>>use RAM configuration and have to be configured.  There aren't a lot of 
>>true flash based device that come up instantly without a configuration 
>>process.
>>
>>Rick
>>
>
Indeed. Altera recommends using the reset async port but with sync signal
pre-synchronised. Xilinx, I believe, recommends using synchronous reset. But
again we need to be careful about our wording. Synchronous reset is actually
applied to input D through logic and does not necessarily mean it is
pre-synchronised. Whether you name it async or sync, the signal by default
is not pre-synchronised and, for the sake of timing at release, it has to be
generated from the flip's clock domain before applying it.

With today's large designs I prefer not to apply reset unless absolutely
needed. I know many of us will apply it as routine to masses of buses at
every node but I believe it puts massive burden on fitter to meet
removal/recovery timing when such effort better be directed somewhere more
critical.

Note also the resource difference between the case of using async port (just
routing) and sync reset (logic).

Kaz

Article: 154607
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: Allan Herriman <allanherriman@hotmail.com>
Date: 01 Dec 2012 00:08:42 GMT

On Fri, 30 Nov 2012 16:12:49 +0000, glen herrmannsfeldt wrote:

> Allan Herriman <allanherriman@hotmail.com> wrote:
>> On Thu, 29 Nov 2012 16:35:43 -0800, mmihai wrote:
>  
> (snip)
>>> My problem:
>>>  - V6 design
>>>  - clocking structure with a IBUF to BUFR which drives a BUFG, so
>>>    both BUFR/BUFG are on the same clock domain
>>>  - the BUFR also clocks few flops
>>>  - BUFG clocks main logic
>>>  - par finishes w/o hold errs
>>>  - I can detect data transfer errors between the flops clocked by
>>>    BUFR and the flops clocked by BUFG (direction is data from BUFR
>>>    flops -> BUFG flops, no logic, just data transfer).
>>>  - timingan reports no hold errs on those paths
>>>  - different runs (different placement) will produce a full working
>>>    design
>>> [- ISE 13.4... but it should not matter]
> 
> Seems like according to
> 
> http://www.xilinx.com/support/documentation/user_guides/ug362.pdf
> 
> especially in the summary near the end, that BUFR --> BUFG is allowed,
> though it doesn't say anything about the timing.
>  
>>> Anyone seen this? Any feedback about this structure?
>  
>>> Goal is to be able to produce predictable results... Now I have no way
>>> to do that unless I try it on HW ... but my confidence level is low
>>> (i.e. if it works on one device will it work on //all//?).
>  
>> I understand that the circuit looks like this:
>  
>> Pin--->IBUF--->BUFR--+--->BUFG---+-->
>>                     |           |
>>                     |           |
>>                     |           |
>>                     |           |
>>                  +-----+     +-----+
>>                  | FF Q|---->|D FF |
>>                  +-----+     +-----+
>>                            ^
>>                            |
>>                        hold time errors here
> 
> I wonder if MMCMs would help?


MMCMs would not help for the problem I saw - the "clock" was bursty.

  
>> I ran into that exact same problem a couple of years ago.  I was given
>> the task of fixing (someone else's) design that featured a similar
>> misuse of clock buffers in a Virtex 4.  I think the tools might have
>> been ISE 8.2.
> 
> As well as I understand it, for FF's clocked off the same clock,
> (and clock edge) you should never have hold problems. The minimum logic
> between Q and the next D is long enough that, even with maximum clock
> skew, a D can never change that fast.
> (Usually described by saying that the hold time is 0.)


I fully agree that it *should* work.  However, for at least V4 (and now 
it seems V6 as well) Xilinx's model of min / max delays on their chips 
isn't too good, and PAR will fail to compensate for clock skew if that 
skew is as large as a BUFG delay.

Please note I said BUFG delay not BUFG skew.  BUFG delay is > 1ns.  BUFG 
skew is usually < 0.3ns.


>> Moral: BUFGs have a large delay.  Don't expect PAR to be able to make
>> up for that amount of hold time using routing.
> 
> As well as I understand it (which might not be all that well) it never
> tries to make up hold time.


PAR always tries to make up hold time.  There is an entire pass in PAR 
dedicated to that process.  I don't have any PAR log files handy, but I 
believe it's easy to spot: look for the timing score.  It will have a 
score for setup and another score for hold.  The initial passes in PAR 
will reduce the setup score, with the hold time score remaining 
constant.  Then towards the end (when it's finished working on the setup 
times) the hold time score will drop, usually to zero.


Regards,
Allan

Article: 154608
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: Allan Herriman <allanherriman@hotmail.com>
Date: 01 Dec 2012 00:23:51 GMT

On Fri, 30 Nov 2012 11:22:35 -0800, mmihai wrote:

> Any way to constrain the hold target >0.0 only for some specific paths?

Simply by specifying clocked logic you have constrained the hold time to 
be > 0 ns.

I don't know of a way of adding extra margin though.  I believe the best 
approach is to avoid the need for extra margin.

This is a tool bug.  You have zero chance of fixing the tool, however you 
do have a good chance of being able to step around the bug.

Some other suggestions:

- Lock the placement of the BUFG and BUFR.  You might find there is some 
magic combination of placements that just works.  Earlier you said that 
some runs of PAR would produce designs that worked.  Copy the placement 
from those runs as a starting point.

- constrain the logic in the BUFG domain to be physically apart from the 
BUFR region.  This forces longer routes on the chip that will improve 
your hold time margin.

- Finally the brute force approach: treat the BUFG and BUFR clocks as if 
they were different clock domains.  Use some sort of FIFO that is 
designed to handle different clock domains to pass data from the BUFR 
domain to the BUFG domain.
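
For the FIFO option, one common way to build a dual-clock FIFO is to pass the read and write pointers between domains in Gray code, so that only one bit changes per increment and a mistimed sample is off by at most one position. The post does not say which FIFO to use, so this is only an illustration of that one idea, sketched in Python with hypothetical helper names.

```python
def to_gray(n):
    """Binary to Gray code: adjacent values differ in exactly one bit."""
    return n ^ (n >> 1)


def from_gray(g):
    """Gray code back to binary by XOR-ing in every right shift of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


POINTER_BITS = 4                      # assumption: 16-entry FIFO pointer
DEPTH = 1 << POINTER_BITS

# Count the bits that change on every pointer increment, including wrap.
bit_changes = [
    bin(to_gray(i) ^ to_gray((i + 1) % DEPTH)).count("1")
    for i in range(DEPTH)
]
print(bit_changes)
```

Because only a single bit toggles per step, a pointer sampled in the other clock domain during a transition resolves to either the old or the new value, never to an unrelated one.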

Regards,
Allan

Article: 154609
Subject: Re: VHDL expert puzzle
From: KJ <kkjennings@sbcglobal.net>
Date: Fri, 30 Nov 2012 18:25:17 -0800 (PST)

>> On 11/30/2012 4:05 AM, Thomas Stanka wrote:
> > You are right, that clock gating might destroy the story a bit,
> > but in general clock skew is handled during layout and
> > checked by STA. During simulation of rtl code I have to assume,
> > that layout does not destroy my functionality, otherwise I
> > could stop simulation anyway.

Short of required hand placement, there is no way to 'handle' clock skew
in layout in an FPGA/CPLD.  Logic induced skew between any two clocks
creates two clock domains.  If the design is such that the two clocks are
treated as unrelated (for example if data only moves from one domain to
the other through a dual clock fifo) then the design will work.  You might
get lucky, you might not if you think that any place and route tool will
help you out here.  Most likely you will encounter the 'bad' sort of luck,
not the 'good luck'.

> On Friday, November 30, 2012 3:07:17 PM UTC-5, rickman wrote:
> My understanding is that logic delays in FPGAs are always longer
> than clock delays on the clock trees so that you can't have a
> hold time violation. If the clock is routed on the signal routing
> then all bets are off. I don't know how well the timing analysis
> does with verifying clock delays on the signal routing because I
> have never needed to use it that I can remember

It doesn't need to depend on routing.  Thomas' example introduced a logic
delay between the two clocks.  The implementation of that logic will
create a race condition.

Kevin Jennings

Article: 154610
Subject: Re: VHDL expert puzzle
From: rickman <gnuarm@gmail.com>
Date: Sat, 01 Dec 2012 01:36:40 -0500

On 11/30/2012 3:36 PM, kaz wrote:
>> On 11/30/2012 12:55 AM, Bart Fox wrote:
>>> rickman wrote:
>>
>> I have no idea why you say an async reset won't work in a sync design.
>> Do I misunderstand your statement?  I am talking about FPGAs where every
>> chip has an async reset during configuration.  You can choose to use
>> this in your design or not, but it is there and it works no matter what
>> you do.  I supposed I should have qualified my statement to FPGAs that
>> use RAM configuration and have to be configured.  There aren't a lot of
>> true flash based device that come up instantly without a configuration
>> process.
>>
>> Rick
>>
>
> Indeed. Altera recommends using the reset async port but with sync signal
> pre-synchronised.

That only makes sense if the delay in the async reset path is short 
enough and properly analyzed by the tools.  There have been any number 
of discussions in these groups about how to properly reset a design and 
there is no consensus on the best way to do it.


> Xilinx, I believe, recommends using synchronous reset. But
> again we need to be careful about our wording. Synchronous reset is actually
> applied to input D through logic and does not necessarily mean it is
> pre-synchronised. Whether you name it async or sync, the signal by default
> is not pre-synchronised and, for the sake of timing at release, it has to be
> generated from the flip's clock domain before applying it.

The best way to do a reset is design specific.  As I said above, there 
are many ways and no agreement on which is "best".


> With today's large designs I prefer not to apply reset unless absolutely
> needed. I know many of us will apply it as routine to masses of buses at
> every node but I believe it puts massive burden on fitter to meet
> removal/recovery timing when such effort better be directed somewhere more
> critical.

That depends on why you are using the reset.  If you are only using it 
to establish the configuration values you can apply an async reset and 
not be concerned with routing since it will use only the dedicated async 
reset network.

Rick

Article: 154611
Subject: Re: VHDL expert puzzle
From: rickman <gnuarm@gmail.com>
Date: Sat, 01 Dec 2012 01:41:01 -0500

On 11/30/2012 3:49 PM, kaz wrote:
>
> Note also the resource difference between the case of using async port(just
>
> routing) and sync reset(logic).

I'm not clear on this.  I believe some architectures provide sync reset 
inputs separate from the D input and so use no logic.

Rick

Article: 154612
Subject: Re: VHDL expert puzzle
From: rickman <gnuarm@gmail.com>
Date: Sat, 01 Dec 2012 01:47:06 -0500

On 11/30/2012 9:25 PM, KJ wrote:
>> On Friday, November 30, 2012 3:07:17 PM UTC-5, rickman wrote:
>> My understanding is that logic delays in FPGAs are always longer
>> than clock delays on the clock trees so that you can't have a
>> hold time violation. If the clock is routed on the signal routing
>> then all bets are off. I don't know how well the timing analysis
>> does with verifying clock delays on the signal routing because I
>> have never needed to use it that I can remember
>
> It doesn't need to depend on routing.  Thomas' example introduced a logic delay between the two clocks.  The implementation of that logic will create a race condition.

I didn't see anything in Thomas' example that required "logic".  Here is 
what I read.  Did I read the wrong post?

> To demonstrate your point though you simply need to generate the new clock as this:
>
> clk1 <= clk;

This is not logic.  In VHDL it inserts a delta delay which is a zero 
amount of time but treated as a delay in the simulator.  By adding a 
delta delay it will disrupt signals from the earlier clk domain that 
drive FFs in the later clk1 domain.  In a real chip there will be no delay.
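
The delta-delay effect can be reproduced with a toy event loop. This is a rough Python model of VHDL delta cycles, not a real simulator; the signal names A, B and D and the function run are assumptions made for the illustration. A is registered from D on clk, and B is registered from A on either clk or the delayed copy clk1.

```python
def run(b_clock, d_values):
    """Clock a two-stage register chain once per value in d_values.

    A <= D on rising clk; B <= A on rising b_clock ("clk" or "clk1").
    clk1 <= clk is a concurrent assignment, so clk1 changes one delta
    cycle after clk. Returns (A, B) after each clock edge has settled.
    """
    sig = {"clk": 0, "clk1": 0, "D": 0, "A": 0, "B": 0}
    trace = []
    for d in d_values:
        sig.update(clk=0, clk1=0, D=d)      # low phase, everything settled
        pending = {"clk": 1}                # rising edge scheduled on clk
        while pending:                      # one iteration per delta cycle
            rising = {name for name, value in pending.items()
                      if name in ("clk", "clk1")
                      and sig[name] == 0 and value == 1}
            sig.update(pending)             # all updates land together
            pending = {}
            if "clk" in rising:
                pending["A"] = sig["D"]     # process(clk): A <= D
                pending["clk1"] = 1         # clk1 <= clk
            if b_clock in rising:
                pending["B"] = sig["A"]     # B <= A, reads current A
        trace.append((sig["A"], sig["B"]))
    return trace


same_clock = run("clk", [1, 2, 3])       # B clocked by clk itself
delayed_clock = run("clk1", [1, 2, 3])   # B clocked one delta later
print(same_clock)
print(delayed_clock)
```

In this model the shared-clock case leaves B one cycle behind A, while the clk1 case gives B the value A has just taken, so the pipeline stage disappears in simulation even though no time has elapsed.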

Rick

Article: 154613
Subject: Re: VHDL expert puzzle
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Sat, 1 Dec 2012 06:57:03 +0000 (UTC)

rickman <gnuarm@gmail.com> wrote:
> On 11/30/2012 3:49 PM, kaz wrote:

>> Note also the resource difference between the case of using async port(just
>> routing) and sync reset(logic).
 
> I'm not clear on this.  I believe some architectures provide 
> sync reset inputs separate from the D input and so use no logic.

Well, it will still take some resources, but maybe not much.

-- glen

Article: 154614
Subject: Re: VHDL expert puzzle
From: "kaz" <3619@embeddedrelated>
Date: Sat, 01 Dec 2012 01:05:40 -0600

>rickman <gnuarm@gmail.com> wrote:
>> On 11/30/2012 3:49 PM, kaz wrote:
>
>>> Note also the resource difference between the case of using async
>>> port(just routing) and sync reset(logic).
> 
>> I'm not clear on this.  I believe some architectures provide 
>> sync reset inputs separate from the D input and so use no logic.
>
>Well, it will still take some resources, but maybe not much.
>
>-- glen
>

I am not aware of this type of architecture. I assume it synchronises
reset per flip and thus seems a waste of silicon compared to the case of
user pre-syncing it once for all relevant flips.

Kaz

Article: 154615
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: Brian Drummond <brian@shapes.demon.co.uk>
Date: Sat, 1 Dec 2012 10:28:43 +0000 (UTC)

On Thu, 29 Nov 2012 16:35:43 -0800, mmihai wrote:

> Hi!
> 
> I have a Xilinx webcase for about 2mo about this that goes nowhere ...
> may be better luck here.
> 
> My problem:
>  - V6 design
>  - clocking structure with a IBUF to BUFR which drives a BUFG, so both
>    BUFR/BUFG are on the same clock domain
>  - the BUFR also clocks few flops
>  - BUFG clocks main logic
>  - par finishes w/o hold errs
>  - I can detect data transfer errors between the flops clocked by BUFR
>    and the flops clocked by BUFG

> Anyone seen this? Any feedback about this structure?
> 
> Goal is to be able to produce predictable results... Now I have no way
> to do that unless I try it on HW ... but my confidence level is low
> (i.e. if it works on one device will it work on //all//?).

Is this a new design for the V6 or a port from another FPGA or a previous 
ISE release? Are you directly instantiating these primitives and checking 
that they are still there in the RTL view?

I had a problem some years ago when moving a design from ISE7 to ISE10 
and the tools silently changed what I asked into something completely 
different; in my case it moved a DCM from the BUFG where it generated a 
nicely aligned x2 clock, to the BUFG input signal - considerably 
increasing skew between these clocks!

So bugs in this area are not particularly new...

- Brian

Article: 154616
Subject: Re: VHDL expert puzzle
From: rickman <gnuarm@gmail.com>
Date: Sat, 01 Dec 2012 14:32:54 -0500

On 12/1/2012 2:05 AM, kaz wrote:
>> rickman<gnuarm@gmail.com>  wrote:
>>> On 11/30/2012 3:49 PM, kaz wrote:
>>
>>>> Note also the resource difference between the case of using async
>>>> port(just routing) and sync reset(logic).
>>
>>> I'm not clear on this.  I believe some architectures provide
>>> sync reset inputs separate from the D input and so use no logic.
>>
>> Well, it will still take some resources, but maybe not much.
>>
>> -- glen
>>
>
> I am not aware of this type of architecture. I assume it synchronises
> reset per flip and thus seems a waste of silicon compared to the case
> of user pre-syncing it once for all relevant flips.

I don't know the Altera devices as intimately as others, but I learned 
the Xilinx stuff pretty well once.  They have (had) two inputs to each 
FF, one for reset and one for set, which were configurable between sync 
and async.

My point is that if this is built in, which is not uncommon I think,
there are no "used" resources other than dedicated resources for logic
in the case of sync reset and nothing for async.

I prefer to design my resets in a customized way where each section of 
logic is reset and released from reset asynchronously and each logic 
section separately takes care of the problems of cleanly starting up. 
That way there is no global reset competing for either routing or logic. 
  Often nothing special needs to be done for a finite state machine 
(FSM) because it starts by waiting for some trigger signal anyway. 
Counters often use an enable which is disabled by default, etc.  I pay 
attention to how my circuits operate from reset and so far this has not 
bitten me.

Rick

Article: 154617
Subject: Re: VHDL expert puzzle
From: KJ <kkjennings@sbcglobal.net>
Date: Sat, 1 Dec 2012 16:33:50 -0800 (PST)

On Saturday, December 1, 2012 1:47:06 AM UTC-5, rickman wrote:
> On 11/30/2012 9:25 PM, KJ wrote:
> > It doesn't need to depend on routing. Thomas' example introduced a
> > logic delay between the two clocks. The implementation of that logic
> > will create a race condition.
> I didn't see anything in Thomas' example that required "logic". Here is
> what I read. Did I read the wrong post?

The post that I am referring to has the following...
    Clk1 <= Clk when Selected else Other_Clk;
    [..]
    Clk2 <= Clk1 when Enabled else '0';
    [..]
    process (Clk)
      if rising_edge(Clk) then
         A <= B;
    [..]
    process (Clk2)
      if rising_edge(Clk2) then
         B <= A;

I believe the point he was trying to make was that because of the
simulation delta delay between Clk1 and Clk2 the two processes would not
be clocked at the same time.  While it is true that the processes would
clock at different times, it's not really because of the simulation delta
delay.  An actual implementation of the above would have the same problem
because it would need to synthesize the logic to create the gated clock.
The propagation delay and additional routing delay in creating the
additional clock would create a race condition for signals generated in
the 'clk' domain and captured in the 'clk1' or 'clk2' domains.

> > clk1 <= clk;

> This is not logic. In VHDL it inserts a delta delay which is a zero
> amount of time but treated as a delay in the simulator. By adding a
> delta delay it will disrupt signals from the earlier clk domain that
> drive FFs in the later clk1 domain. In a real chip there will be no
> delay.

And that was my point.

Kevin Jennings

Article: 154618
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: mmihai <iiahim@yahoo.com>
Date: Sun, 2 Dec 2012 09:53:27 -0800 (PST)

On Friday, November 30, 2012 4:23:51 PM UTC-8, Allan Herriman wrote:
> On Fri, 30 Nov 2012 11:22:35 -0800, mmihai wrote:
> 
> > Any way to constrain the hold target >0.0 only for some specific paths?
> 
> Simply by specifying clocked logic you have constrained the hold time to 
> be > 0 ns.
> 
> I don't know of a way of adding extra margin though.  I believe the best 
> approach is to avoid the need for extra margin.

Yes, I meant extra margin. It seems the hold target is 0.0ns (I would guess the numbers include some padding).

> This is a tool bug.  You have zero chance of fixing the tool, however you 
> do have a good chance of being able to step around the bug.

It looks like a tool bug.
It is very disturbing that it is not related to a particular version and it's on multiple [virtex] families...

I would expect things to work if STA has good numbers.
My confidence in the tools took a hit ...

> Some other suggestions:
> 
> - Lock the placement of the BUFG and BUFR.  You might find there is some 
> magic combination of placements that just works.  Earlier you said that 
> some runs of PAR would produce designs that worked.  Copy the placement 
> from those runs as a starting point.
> 
> - constrain the logic in the BUFG domain to be physically apart from the 
> BUFR region.  This forces longer routes on the chip that will improve 
> your hold time margin.

a) I did not see any correlation between passing/failing and particular BUFR/BUFG placement
b) I think it is a risky approach; if I can get a particular map/par run to work on some systems ... I have no guarantee it will be fine on //all// systems, over PVT.

> - Finally the brute force approach: treat the BUFG and BUFR clocks as if 
> they were different clock domains.  Use some sort of FIFO that is 
> designed to handle different clock domains to pass data from the BUFR 
> domain to the BUFG domain.

Something like this could be the best solution, if doable ... but it's a pity to add logic because Xilinx tools can't handle the clock tree properly....

My logic looks very much like Figure 1-24/Page 28 from UG362, except I don't use BUFIO, so it is not that exotic. 

--
mmihai

Article: 154619
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: mmihai <iiahim@yahoo.com>
Date: Sun, 2 Dec 2012 09:55:52 -0800 (PST)

On Saturday, December 1, 2012 2:28:43 AM UTC-8, Brian Drummond wrote:

> Is this a new design for the V6 or a port from another FPGA or a previous 
> ISE release? Are you directly instantiating these primitives and checking 
> that they are still there in the RTL view?

New design ... and BUFR/BUFG instantiated by hand.
I've looked on fpga_editor and the buffers are there.

--
mmihai

Article: 154620
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: Brian Drummond <brian@shapes.demon.co.uk>
Date: Sun, 2 Dec 2012 19:46:04 +0000 (UTC)

On Sun, 02 Dec 2012 09:55:52 -0800, mmihai wrote:

> On Saturday, December 1, 2012 2:28:43 AM UTC-8, Brian Drummond wrote:
> 
>> Is this a new design for the V6 or a port from another FPGA or a
>> previous ISE release? Are you directly instantiating these primitives
>> and checking that they are still there in the RTL view?
> 
> New design ... and BUFR/BUFG instantiated by hand.
> I've looked on fpga_editor and the buffers are there.

Then it's pretty unlikely to be a regression to what I was seeing.

- Brian

Article: 154621
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Sun, 2 Dec 2012 22:22:37 +0000 (UTC)

mmihai <iiahim@yahoo.com> wrote:

(snip, someone wrote)

>> This is a tool bug.  You have zero chance of fixing the tool, 
>> however you do have a good chance of being able to step 
>> around the bug.

> It looks like a tool bug.
> It is very disturbing that it is not related to a particular 
> version and it's on multiple [virtex] families...

> I would expect the things to work if STA has good numbers.
> My confidence in the tools took a hit ...

(snip)

> Something like this could be the best solution, if doable ... 
> but it's a pity to add logic because Xilinx tools can't 
> handle the clock tree properly....

It seems to me that they do pretty well.

Well, the effects of voltage and temperature should be pretty
much the same for all transistors on a chip. But process variations
could be very different.

They verify that the usual paths have delay variations that they
can account for, and compute delays based on those. If there are
some paths whose delays they can't account for, at least not to the
accuracy required, then they don't guarantee those.

As far as I understand, though mostly in general, the idea is
to make clock skew in a clock tree small enough, relative to
the minimum delay through routing, that two FFs clocked off
the same clock can't violate hold time. The skew also must
be added to the delay when verifying setup time.

But that only works within one clock tree. Computing the
variation between two clock trees is different. 

Now, it would be nice if they said that some delay is not characterized
well enough to use, and so far I haven't seen that they do say that,
but it isn't the tools' fault if the data isn't available.

-- glen
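
The same-clock-tree argument above can be written out as a simple hold
check. This is only a sketch: the function name and every number except
the 1.851 ns "Clock Path Skew" reported earlier in the thread are made-up
illustrations, not datasheet values.

```python
def hold_slack_ns(tco_min, route_min, skew, t_hold):
    """Hold slack for a register-to-register path, in ns.

    skew is (destination clock arrival) - (source clock arrival).
    Positive slack means new data arrives after the hold window closes.
    """
    return (tco_min + route_min) - (skew + t_hold)


# Hypothetical numbers (ns), chosen only for illustration.
TCO_MIN, ROUTE_MIN, T_HOLD = 0.3, 0.2, 0.1

# Within one clock tree the skew is small, so the check passes.
same_tree = hold_slack_ns(TCO_MIN, ROUTE_MIN, skew=0.1, t_hold=T_HOLD)

# BUFR -> BUFG: the destination clock is late by the reported 1.851 ns.
cross_tree = hold_slack_ns(TCO_MIN, ROUTE_MIN, skew=1.851, t_hold=T_HOLD)

print(f"same tree:  {same_tree:+.3f} ns")
print(f"cross tree: {cross_tree:+.3f} ns")
```

With these assumed numbers the cross-tree path has negative hold slack, so
the missing time has to come from extra data-path routing delay.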

Article: 154622
Subject: Re: VHDL expert puzzle
From: rickman <gnuarm@gmail.com>
Date: Sun, 02 Dec 2012 22:43:04 -0500

On 12/1/2012 7:33 PM, KJ wrote:
> On Saturday, December 1, 2012 1:47:06 AM UTC-5, rickman wrote:
>> On 11/30/2012 9:25 PM, KJ wrote:
>>> It doesn't need to depend on routing. Thomas' example introduced a
>>> logic delay between the two clocks. The implementation of that logic
>>> will create a race condition.
>> I didn't see anything in Thomas' example that required "logic". Here is
>> what I read. Did I read the wrong post?
>
> The post that I am referring to has the following...
>      Clk1<= Clk when Selected else Other_Clk;
>      [..]
>      Clk2<= Clk1 when Enabled else '0';
>      [..]
>      process (Clk)
>        if rising_edge(Clk) then
>           A<= B;
>      [..]
>      process (Clk2)
>        if rising_edge(Clk2) then
>           B<= A;
>
> I believe the point he was trying to make was that because of the simulation delta delay between Clk1 and Clk2 the two processes would not be clocked at the same time.  While it is true that the processes would clock at different times, it's not really because of the simulation delta delay.  An actual implementation of the above would have the same problem because it would need to synthesize the logic to create the gated clock.  The propagation delay and additional routing delay in creating the additional clock would create a race condition for signals generated in the 'clk' domain and captured in the 'clk1' or 'clk2' domains.
>
>>> clk1<= clk;
>
>> This is not logic. In VHDL it inserts a delta delay which is a zero
>> amount of time but treated as a delay in the simulator. By adding a
>> delta delay it will disrupt signals from the earlier clk domain that
>> drive FFs in the later clk1 domain. In a real chip there will be no
>> delay.
>
> And that was my point.
>
> Kevin Jennings

I don't know where that code came from, but yes, I think you are 
accurately analyzing it.  Your first post was replying to the OP's post 
containing the link to the blog.  I didn't see anything like this in the 
blog code, there was no muxing of the clock.  Where did you get the code 
shown above?

Rick
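
The simulation-side effect discussed above can be illustrated with a toy
delta-cycle model. This is a rough Python sketch, not a VHDL simulator: the
function name, the signal names and the initial values are invented for
illustration, and only the two clocked processes and the chain of clock
assignments from the quoted code are modelled.

```python
def run_rising_edge(extra_clock_deltas):
    """Simulate one rising edge of clk with A <= B on clk and B <= A on a
    clock that lags clk by `extra_clock_deltas` delta cycles."""
    sig = {"clk": 0, "a": 0, "b": 1}
    # chain[i+1] <= chain[i] models assignments such as clk1 <= clk.
    chain = ["clk"] + [f"clk{i}" for i in range(1, extra_clock_deltas + 1)]
    for name in chain[1:]:
        sig[name] = 0
    b_clock = chain[-1]

    sig["clk"] = 1          # the rising edge
    changed = {"clk"}
    while changed:          # one loop iteration per delta cycle
        scheduled = {}
        for src, dst in zip(chain, chain[1:]):
            if src in changed:
                scheduled[dst] = sig[src]
        if "clk" in changed and sig["clk"] == 1:
            scheduled["a"] = sig["b"]     # process (clk)
        if b_clock in changed and sig[b_clock] == 1:
            scheduled["b"] = sig["a"]     # process on the delayed clock
        # Signals take their new values only at the end of the delta.
        changed = {n for n, v in scheduled.items() if sig[n] != v}
        sig.update(scheduled)
    return sig["a"], sig["b"]


print("same clock:      ", run_rising_edge(0))
print("one delta later: ", run_rising_edge(1))
print("two deltas later:", run_rising_edge(2))
```

In this model the two registers swap only when both processes see the same
clock signal; with any delta delay on the second clock, the B register
samples the already-updated A.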

Article: 154623
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: Allan Herriman <allanherriman@hotmail.com>
Date: 03 Dec 2012 12:55:56 GMT

On Fri, 30 Nov 2012 11:22:35 -0800, mmihai wrote:

> On Friday, November 30, 2012 4:46:51 AM UTC-8, Allan Herriman wrote:
> 
> Thanks for your comments.
> 
> Most interesting ... different chip(V4) had same issues....
> 
>> Moral: BUFGs have a large delay.  Don't expect PAR to be able to make
>> up for that amount of hold time using routing.
> 
> I don't think it is the BUFG's delay; my guess is more related to routing.
> Based on datasheet BUFG delay is 0.10ns .... reported "Clock Path Skew"
> is 1.851ns... whatever that includes.

Sorry I missed that earlier.  You seem to be mixing up skew and delay.

> datasheet BUFG delay is 0.10ns

That figure is the BUFG skew, not the BUFG delay.
It represents the worst case timing difference between outputs on the 
same BUFG.  It isn't relevant to your problem.

The "Clock Path Skew" is the important figure.  It is the difference 
between the time of arrival of the clock at the source (clocked from 
BUFR) flip flops and destination (clocked from BUFG) flip flops.  In this 
case it is mostly made up of the BUFG delay.

PAR has to include a routing delay to compensate for that skew.

An earlier comment:

>> One solution might be to insert FFs clocked from the other edge of the
>> BUFG clock.
>
> I thought about that .... can't do it, the clock is fast and it
> won't meet setup for half clock cycle.

It might not meet setup for half a clock cycle, but it doesn't have to!  
The skew works in your favour when using opposite edges and the 
requirement for setup time is half a clock cycle + 1.851ns.  Unless you 
have a GHz clock that doesn't sound too hard.

Regards,
Allan
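
The opposite-edge arithmetic above can be checked with a few lines of
Python. This is a sketch under assumptions: the function name and the clock
rates are hypothetical, and only the 1.851 ns skew comes from the thread.

```python
def opposite_edge_setup_budget_ns(period_ns, skew_ns):
    """Time from the launching (BUFR) rising edge to the capturing (BUFG)
    falling edge, where skew_ns is how much later the BUFG clock arrives."""
    return period_ns / 2 + skew_ns


SKEW_NS = 1.851  # "Clock Path Skew" reported in the thread

for freq_mhz in (200, 400, 1000):  # hypothetical clock rates
    period = 1000.0 / freq_mhz
    budget = opposite_edge_setup_budget_ns(period, SKEW_NS)
    print(f"{freq_mhz:5d} MHz: period {period:.3f} ns, "
          f"setup budget {budget:.3f} ns")
```

The budget covers only the first hop into the falling-edge flops. The next
hop, from those flops to the rising-edge BUFG flops, stays inside one clock
tree and nominally has half a period.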

Article: 154624
Subject: Re: V6 BUFR -> BUFG clocking structure (hold issue?)
From: jonesandy@comcast.net
Date: Mon, 3 Dec 2012 06:57:11 -0800 (PST)

Glen,

All three (voltage, temperature and process) vary over a single die, but
not by much. The trick is always "by how much?" Are we willing to live with
slower guaranteed performance in order to simplify the analysis, or is it
worth it to invest more in the analysis (NRE) to "speed up" the parts
(recurring profit)?

Managing hold time is a lot more complicated than it used to be. In the
past, the clock skew could always be less than Tco plus minimum routing by
design, so they did not even spec hold time for the registers. Over time,
the raw speed of the devices has out-stripped the skew of the clock tree,
and hold time is a real problem that has to be taken care of in placement
and routing. We users just don't have control over the clock tree itself to
deal with the problem, like in other domains.

Andy
