Messages from 158875

Article: 158875
Subject: Re: Problem if compilation order in OOC compilations in Xilinx Vivado
From: wzab01@gmail.com
Date: Fri, 13 May 2016 07:52:22 -0700 (PDT)

Of course the subject should be "problem WITH compilation order in OOC compilations in Xilinx Vivado".

Article: 158876
Subject: Re: FPGA boards in egypt
From: GaborSzakacs <gabor@alacron.com>
Date: Fri, 13 May 2016 11:35:20 -0400

aymanmimomimomimo@gmail.com wrote:
> Please, I want to buy FPGA boards for my graduation project and I don't know where I can get them in Egypt (FPGA Spartan-6 SP601 or SP605)

According to the Xilinx Authorized distributor page,
your best bet is to contact EMPA Elektronik in Turkey.

http://www.xilinx.com/about/contact/authorized-distributors.html

As noted, your University should be able to get these products
through the XUP:

http://www.xilinx.com/support/university.html

Your professor would need to contact Xilinx for this.

-- 
Gabor

Article: 158877
Subject: Re: Problem if compilation order in OOC compilations in Xilinx Vivado
From: rickman <gnuarm@gmail.com>
Date: Fri, 13 May 2016 12:24:58 -0400

On 5/13/2016 9:47 AM, wzab01@gmail.com wrote:
> Hi,
>
> Has anybody in this group faced the problem of incorrect compilation order of blocks selected for Out-of-context (OOC) compilation?
> It works correctly in case of blocks converted into packaged IP-cores.
> However, very often in the huge projects I'm dealing with (e.g. using 70% of a Virtex-7 xc7vx690tffg1927), I want to split the design into blocks synthesized separately without packaging them.
>
> This avoids resynthesizing the whole design (which takes from 5 to 8 hours) when only one of the blocks is modified.
>
> The reason not to package them is that the design is implemented in high-level VHDL. The ports of the components are usually implemented as records describing the data structures. The current packaging standards based on IP-XACT do not support such ports.
>
> Anyway, I'm quite happy with using OOC synthesis for my components (yes, you can do it even for blocks with ports of user-defined types, as long as you provide your own stubs instead of letting Vivado generate them automatically).
> However, there is one HUGE problem.
>
> The "OOC Module Runs" are performed perfectly, and the separate modules are synthesized correctly, but as soon as I start the main (usually "synth_1") run, all OOC runs are set "out-of-date" and restarted. This time they are synthesized using incorrect compilation order which results in compilation errors (typically entities using types and constants defined in the packages are compiled before those packages).
>
> After some research I've found a workaround, which can even be implemented in a Tcl script.
> I've described the whole problem, together with a "bug reproducer" and the workaround, on the Xilinx forum - https://forums.xilinx.com/t5/Synthesis/Vivado-incorrect-automatic-compilation-order-in-OOC-synthesis/m-p/698402#M18253 - but up to now I haven't received any answer from Xilinx.
>
> I'm quite surprised that no one else has reported this rather serious bug before.
>
> AFAIK it was present at least in all 2015.x and 2016.1 versions. (I don't know whether it was in earlier versions, as it was at the beginning of 2015 that I had to split my design into OOC blocks.)

I haven't used the Xilinx tools in years.  But the Lattice tools I have 
used allow you to specify the order of compilation for files.  I 
routinely have to specify which files are libraries and which are to be 
compiled later.  Does Xilinx give you a way to flag libraries?  I can't 
imagine it couldn't figure out the file dependencies of libraries.

I suspect you are working in a different way where you can group files 
into blocks at a higher level.
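
To make the ordering requirement concrete, here is a rough Python sketch (not how Vivado or the Lattice tools work internally): if each file lists the files it depends on, a topological sort gives an order in which packages are always compiled before the entities that use them. The file names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def compile_order(deps):
    """Return files ordered so that every file follows the files it depends on.

    deps maps each file name to the set of files that must be compiled first.
    """
    return list(TopologicalSorter(deps).static_order())


# Hypothetical project: two blocks use types from a package,
# and the top level instantiates both blocks.
deps = {
    "types_pkg.vhd": set(),
    "block_a.vhd": {"types_pkg.vhd"},
    "block_b.vhd": {"types_pkg.vhd"},
    "top.vhd": {"block_a.vhd", "block_b.vhd"},
}

order = compile_order(deps)
print(order)
```

The package comes out first and the top level last; the relative order of the two blocks does not matter.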

-- 

Rick C

Article: 158878
Subject: Re: Problem with AXI4 Lite in Cyclone V
From: wzab01@gmail.com
Date: Fri, 13 May 2016 14:39:43 -0700 (PDT)

Hi,

Thanks to Rob Gaddi for the suggestion. Even though my code was correct, his suggestion that I might have implemented the AXI4-Lite protocol incorrectly made me doubt whether Altera's implementation is correct.

I suspected that I might be asserting RVALID too early (even though the Qsys Interconnect keeps RREADY high).

I've enforced that RVALID may not be asserted in the same cycle in which my slave receives ARVALID (even though it is able to produce valid RDATA in the same cycle).

Indeed after that small modification (which costs one additional clock cycle in each transaction) the read is performed correctly.
As I have already mentioned, the Xilinx Zynq has no such problems.

So it seems that the implementation of the AXI4-Lite protocol in the Altera Qsys Interconnect is buggy. Even though it keeps RREADY asserted in the cycle in which it receives ARREADY, it is not able to accept RVALID in that cycle.
RVALID may be asserted in the next cycle at the earliest.
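
The two timings can be compared with a toy cycle-level model in Python. This is only a sketch of the read-channel timing described above, not the actual bridge code and not a full AXI4-Lite model; the class name `ReadSlave` and the flag `delay_rvalid` are made up for the illustration.

```python
class ReadSlave:
    """Toy model of the slave side of an AXI4-Lite read channel."""

    def __init__(self, memory, delay_rvalid=True):
        self.memory = memory              # dict: address -> data
        self.delay_rvalid = delay_rvalid  # True: answer one cycle after the address
        self.pending = None               # address accepted earlier, not yet answered

    def step(self, arvalid, araddr, rready):
        """Advance one clock; return (arready, rvalid, rdata) for this cycle."""
        if self.pending is not None:
            # Answer an address accepted in an earlier cycle; hold RVALID
            # until the master takes the data with RREADY.
            rdata = self.memory[self.pending]
            if rready:
                self.pending = None
            return (False, True, rdata)
        if arvalid:
            if self.delay_rvalid:
                self.pending = araddr     # accept now, answer in the next cycle
                return (True, False, None)
            # Same-cycle answer: RVALID together with the address handshake.
            return (True, True, self.memory[araddr])
        return (False, False, None)


mem = {0x10: 0xABCD}

fast = ReadSlave(mem, delay_rvalid=False)
print(fast.step(True, 0x10, True))    # address and data in one cycle

slow = ReadSlave(mem, delay_rvalid=True)
print(slow.step(True, 0x10, True))    # address accepted, no data yet
print(slow.step(False, None, True))   # data one cycle later
```

With `delay_rvalid=True` the model behaves like the modified slave: one extra clock per read.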

I hope that this post may help others who have faced a similar problem. I have seen a few similar posts on different mailing lists, but have never seen a solution...

With best regards,
Wojtek

PS. I have sent the waveforms and sources of the corrected (OK, it was correct, so just modified) design to the Altera forum, but it is waiting for moderator's acceptance.

Article: 158879
Subject: Re: Problem with AXI4 Lite in Cyclone V
From: wzab01@gmail.com
Date: Fri, 13 May 2016 16:05:31 -0700 (PDT)

On Friday, 13 May 2016 at 23:39:47 UTC+2, wza...@gmail.com wrote:

> PS. I have sent the waveforms and sources of the corrected (OK, it was correct, so just modified) design to the Altera forum, but it is waiting for moderator's acceptance.

As the post with waveforms and corrected bridge is still not accepted, I've published the sources on the IPbus website: https://svnweb.cern.ch/trac/cactus/ticket/1876

Best regards,
Wojtek

Article: 158880
Subject: Constraining data to out-of-phase clocks
From: Rob Gaddi <rgaddi@highlandtechnology.invalid>
Date: Fri, 13 May 2016 23:18:56 -0000 (UTC)

I've got a project coming up in which one of the things I'm going to
need to do is take in the 1 PPS output from a GPS receiver and align it 
to the 100 MHz frequency reference clock.

The problem here is that the phase relationship is static but undefined.
 There's plenty of time somewhere to not violate the setup time, but I
have to find where it is.

My thought had been to use one of the FPGA PLLs to spin up 8 phases
(well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
all of them. Then resynchronize those to the 0° clock, read the
thermometer code, and figure out what the data phase is. Then I can
use the value opposite that and know I've got enough margin to park a
semitrailer in.
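
The selection step can be sketched in Python. This is a sketch under assumptions: the eight captures have already been resynchronized so that `samples[i]` is the value seen by the clock at phase i * 45° within the one period where the PPS edge arrived, and the function name is made up for the illustration.

```python
def pick_sampling_phase(samples):
    """Pick the clock phase farthest from the PPS rising edge.

    samples[i] is the PPS value captured on phase i (0 = 0 degrees) in the
    first clock period where any phase saw the pulse high. A clean capture
    is a thermometer code such as [0, 0, 0, 1, 1, 1, 1, 1].
    """
    n = len(samples)
    if 1 not in samples:
        raise ValueError("no edge captured in this period")
    edge = samples.index(1)              # first phase that saw the pulse high
    if any(s != 1 for s in samples[edge:]):
        raise ValueError("not a thermometer code (bubble or glitch)")
    return (edge + n // 2) % n           # half a period away from the edge


print(pick_sampling_phase([0, 0, 0, 1, 1, 1, 1, 1]))
```

In this example the edge falls between phases 2 and 3, so the function picks phase 7 (315°), roughly half a period away from it.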

All well and good, except it dawns on me that I don't know how to
convince the tools to make timing for that.  Somehow my SDC file has to
call for running the input signal from the input pin to 8 flops such
that the clock/data skew is less than 1 ns.

Anyone know how I'd do such a thing?  Altera seems to support
set_max_skew, which might do it if I can disentangle the arcana of the
syntax from
http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
But I'm not necessarily wedded to Altera on this project.

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 158881
Subject: Re: Constraining data to out-of-phase clocks
From: rickman <gnuarm@gmail.com>
Date: Fri, 13 May 2016 23:59:11 -0400

On 5/13/2016 7:18 PM, Rob Gaddi wrote:
> I've got a project coming up in which one of the things I'm going to
> need to do is take in the 1 PPS output from a GPS receiver and align it
> to the 100 MHz frequency reference clock.
>
> The problem here is that the phase relationship is static but undefined.
>   There's plenty of time somewhere to not violate the setup time, but I
> have to find where it is.
>
> My thought had been to use one of the FPGA PLLs to spin up 8 phases
> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
> all of them. Then resynchronize those to the 0° clock, read the
> thermometer code, and figure out what the data phase is. Then I can
> use the value opposite that and know I've got enough margin to park a
> semitrailer in.
>
> All well and good, except it dawns on me that I don't know how to
> convince the tools to make timing for that.  Somehow my SDC file has to
> call for running the input signal from the input pin to 8 flops such
> that the clock/data skew is less than 1 ns.
>
> Anyone know how I'd do such a thing?  Altera seems to support
> set_max_skew, which might do it if I can disentangle the arcana of the
> syntax from
> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
> But I'm not necessarily wedded to Altera on this project.

To make sure I understand your issue, you want to *measure* the phase of 
the clock the 1 pps is detected on to 1/8th the resolution of the 100 
MHz clock itself?  If you are not trying to measure that phase 
relationship, I'm not sure why you would want to use 8 phases.

You can use the FFs in the I/O blocks to capture a true and inverted 
clocked version of the signal on that pin.  Each of four pins can 
receive the same signal and be clocked by differently phased clocks.  I 
don't know how much differential timing you will see between the four 
clocks as well as the true/inverted clocked FFs.

Another way to do this is to use a SERDES which PLLs the clock up in 
frequency.  But I don't think this lets you correlate the data received 
to the main clock in an absolute way.

Otherwise I think you may need to do the sampling in external logic and 
read the thermometer code in parallel.

A friend did something like this where he used a digital delay line that 
I think he delay locked to the 1 PPS so he was on the edge of detecting 
the 1 PPS on the same clock each cycle.  I don't recall this for sure, 
but it was something like this.  His circuit was generating a 1 PPS for 
a new atomic clock at the Naval Observatory.  It sync'ed up to the 
existing 1 PPS for a while to find the right phase, then locked and ran. 
  Sounds like what you are doing.

-- 

Rick C

Article: 158882
Subject: Re: Constraining data to out-of-phase clocks
From: Tim Wescott <tim@seemywebsite.com>
Date: Fri, 13 May 2016 23:48:52 -0500

On Fri, 13 May 2016 23:18:56 +0000, Rob Gaddi wrote:

> I've got a project coming up in which one of the things I'm going to
> need to do is take in the 1 PPS output from a GPS receiver and align it
> to the 100 MHz frequency reference clock.
> 
> The problem here is that the phase relationship is static but undefined.
>  There's plenty of time somewhere to not violate the setup time, but I
> have to find where it is.
> 
> My thought had been to use one of the FPGA PLLs to spin up 8 phases
> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
> all of them. Then resynchronize those to the 0° clock, read the
> thermometer code, and figure out what the data phase is. Then I can use
> the value opposite that and know I've got enough margin to park a
> semitrailer in.
> 
> All well and good, except it dawns on me that I don't know how to
> convince the tools to make timing for that.  Somehow my SDC file has to
> call for running the input signal from the input pin to 8 flops such
> that the clock/data skew is less than 1 ns.
> 
> Anyone know how I'd do such a thing?  Altera seems to support
> set_max_skew, which might do it if I can disentangle the arcana of the
> syntax from
> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
> But I'm not necessarily wedded to Altera on this project.

There's not enough foreshadowing for me to figure out where the 
thermometer comes in.

But I have lots of questions!

How precisely do you really need to know the 1PPS?

How can you know that the phase alignment is static, yet not know what it 
is?  The only way I can think of that you'd have a static phase alignment 
is if there's a PLL locking to the 1PPS, and that implies a known phase 
alignment.

Why not use a 100MHz clock that _is_ of known phase alignment?  Seems to 
me that a VCXO plus some logic (possibly inside the FPGA) plus an op-amp 
with an integrator would add up to phase lock.

Certainly if you're using a free-running oscillator, even a crystal one, 
your phase will, at best, be drifting all over the place, and it'll 
probably have enough of an offset to it that it'll be running away rather 
than exhibiting a random walk.

-- 
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work!  See my website if you're interested
http://www.wescottdesign.com

Article: 158883
Subject: Re: Constraining data to out-of-phase clocks
From: rickman <gnuarm@gmail.com>
Date: Sat, 14 May 2016 01:17:37 -0400

On 5/14/2016 12:48 AM, Tim Wescott wrote:
> On Fri, 13 May 2016 23:18:56 +0000, Rob Gaddi wrote:
>
>> I've got a project coming up in which one of the things I'm going to
>> need to do is take in the 1 PPS output from a GPS receiver and align it
>> to the 100 MHz frequency reference clock.
>>
>> The problem here is that the phase relationship is static but undefined.
>>   There's plenty of time somewhere to not violate the setup time, but I
>> have to find where it is.
>>
>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
>> all of them. Then resynchronize those to the 0° clock, read the
>> thermometer code, and figure out what the data phase is. Then I can use
>> the value opposite that and know I've got enough margin to park a
>> semitrailer in.
>>
>> All well and good, except it dawns on me that I don't know how to
>> convince the tools to make timing for that.  Somehow my SDC file has to
>> call for running the input signal from the input pin to 8 flops such
>> that the clock/data skew is less than 1 ns.
>>
>> Anyone know how I'd do such a thing?  Altera seems to support
>> set_max_skew, which might do it if I can disentangle the arcana of the
>> syntax from
>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>> But I'm not necessarily wedded to Altera on this project.
>
> There's not enough foreshadowing for me to figure out where the
> thermometer comes in.
>
> But I have lots of questions!
>
> How precisely do you really need to know the 1PPS?
>
> How can you know that the phase alignment is static, yet not know what it
> is?  The only way I can think of that you'd have a static phase alignment
> is if there's a PLL locking to the 1PPS, and that implies a known phase
> alignment.

I can answer that, the 100 MHz and 1 PPS are from two different, but 
both very stable sources... most likely.  I expect this is like the 
circuit my friend designed, a 1 PPS generator, or at least an internal 
alignment to the external 1 PPS, but clocked from the local 100 MHz 
clock.  You would figure at any given time the phase relationship is 
stable, but not known a priori.  With a clock stability of 10^-... pick a 
number, the phase would not change very fast.


> Why not use a 100MHz clock that _is_ of known phase alignment?  Seems to
> me that a VCXO plus some logic (possibly inside the FPGA) plus an op-amp
> with an integrator would add up to phase lock.

1 PPS from GPS, I expect clocked into the FPGA by a local atomic clock.


> Certainly if you're using a free-running oscillator, even a crystal one,
> your phase will, at best, be drifting all over the place, and it'll
> probably have enough of an offset to it that it'll be running away rather
> than exhibiting a random walk.

How about rubidium or cesium?  I think they get better than 10^-12, 
maybe 10^-15.
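
To put rough numbers on how slowly the phase moves: if the two sources differ by a fractional frequency offset y, the 1 PPS edge slides by y seconds per second relative to the 100 MHz clock. The short Python sketch below works out how long it takes to slide by one full 10 ns period. The offsets are illustrative assumptions, not measurements of any particular oscillator.

```python
def seconds_to_drift_one_period(fractional_offset, clock_hz=100e6):
    """Time for the PPS edge to slide by one clock period.

    fractional_offset is the relative frequency difference between the
    two sources (e.g. 1e-12); clock_hz is the reference clock frequency.
    """
    period = 1.0 / clock_hz
    return period / fractional_offset


for label, y in [("1e-6 (assumed crystal)", 1e-6),
                 ("1e-9", 1e-9),
                 ("1e-12", 1e-12)]:
    print(f"offset {label}: {seconds_to_drift_one_period(y):g} s per 10 ns of drift")
```

At an offset of 1e-6 the edge crosses a whole clock period in about 10 ms; at 1e-12 it takes about 10,000 s, a bit under three hours.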

-- 

Rick C

Article: 158884
Subject: Recoding openCV C++ project in pure verilog
From: Marvin L <user123random@gmail.com>
Date: Sat, 14 May 2016 06:16:57 -0700 (PDT)

I have a 6-month project to hand-recode an OpenCV C++ project into pure RTL for FPGA use.

I have a Xilinx Zynq FPGA and I have Vivado.

The code I am using is at https://github.com/Itseez/opencv/blob/master/modules/features2d/src/orb.cpp , https://github.com/Itseez/opencv/blob/master/modules/features2d/src/feature2d.cpp , and http://pastebin.com/N69WE89j

Can anyone advise on how to start working on this? I have been told to scrutinise the code for any maths operations (especially floating-point), which I am doing now. I also need to use the AXI interface and LogiCORE IP for this, right?

Thanks !

Article: 158885
Subject: Re: Constraining data to out-of-phase clocks
From: BobH <wanderingmetalhead.nospam.please@yahoo.com>
Date: Sat, 14 May 2016 06:21:09 -0700

On 05/13/2016 04:18 PM, Rob Gaddi wrote:
> I've got a project coming up in which one of the things I'm going to
> need to do is take in the 1 PPS output from a GPS receiver and align it
> to the 100 MHz frequency reference clock.
>
> The problem here is that the phase relationship is static but undefined.
>  There's plenty of time somewhere to not violate the setup time, but I
> have to find where it is.
>
> My thought had been to use one of the FPGA PLLs to spin up 8 phases
> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
> all of them. Then resynchronize those to the 0° clock, read the
> thermometer code, and figure out what the data phase is. Then I can
> use the value opposite that and know I've got enough margin to park a
> semitrailer in.
>
> All well and good, except it dawns on me that I don't know how to
> convince the tools to make timing for that.  Somehow my SDC file has to
> call for running the input signal from the input pin to 8 flops such
> that the clock/data skew is less than 1 ns.
>
> Anyone know how I'd do such a thing?  Altera seems to support
> set_max_skew, which might do it if I can disentangle the arcana of the
> syntax from
> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
> But I'm not necessarily wedded to Altera on this project.
>

The 1PPS signals out of the GPS receivers that I have seen advertise 
50 ns accuracy. Your 100 MHz clock has a period of 10 ns. From what I 
understand of your project, I would just put the 1PPS signal in through 
a couple of sync FFs and then compensate for an assumed delay of 2 
clock cycles in the application of the sync'ed 1PPS signal downstream.
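
A quick Python sketch of that arithmetic, assuming an ideal two-flop synchronizer (metastability ignored) and the figures quoted above; the function name is made up for the illustration.

```python
import math

GPS_ACCURACY_NS = 50.0   # advertised 1PPS accuracy quoted above
PERIOD_NS = 10.0         # 100 MHz clock


def synced_cycle(edge_time_ns, period_ns=PERIOD_NS, sync_stages=2):
    """Index of the clock edge on which the last sync flop captures the PPS.

    The first flop captures on the first clock edge at or after the input
    edge; every further stage adds one clock period.
    """
    first_capture = math.ceil(edge_time_ns / period_ns)
    return first_capture + (sync_stages - 1)


# The advertised GPS accuracy spans several clock periods.
print(GPS_ACCURACY_NS / PERIOD_NS)

# Two input edges less than one period apart can land one cycle apart.
print(synced_cycle(9.9), synced_cycle(10.1))
```

The synchronizer adds at most one clock period of uncertainty, which is small next to the five periods covered by the advertised 1PPS accuracy.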

The phase relationship between the 1PPS input and the local 100MHz clock 
won't be all that static unless you phase lock the 100MHz clock to the 
1PPS. Even an exotic local oscillator is going to drift around with time 
and temperature compared to the 1PPS.

Bobh

Article: 158886
Subject: Re: Recoding openCV C++ project in pure verilog
From: BobH <wanderingmetalhead.nospam.please@yahoo.com>
Date: Sat, 14 May 2016 17:39:01 -0700

On 05/14/2016 06:16 AM, Marvin L wrote:
> I have a 6-month project to hand-recode an OpenCV C++ project into pure RTL for FPGA use.
>
> I have a Xilinx Zynq FPGA and I have Vivado.
>
> The code I am using is at https://github.com/Itseez/opencv/blob/master/modules/features2d/src/orb.cpp
> https://github.com/Itseez/opencv/blob/master/modules/features2d/src/feature2d.cpp
> and http://pastebin.com/N69WE89j
>
> Can anyone advise on how to start working on this? I have been told to scrutinise the
> code for any maths operations (especially floating-point), which I am doing now. I also need to
> use the AXI interface and LogiCORE IP for this, right?
>
> Thanks !
>
Can you get the algorithm descriptions rather than the C++ code? 
Optimizing an implementation of an algorithm for software execution 
and optimizing it for hardware execution are fairly different beasts.

If you cannot get the algorithm description, your best path will be to 
create your own from the C++. Things like floating point operations are 
important, but figuring out how to take advantage of the parallel nature 
of hardware is crucial.
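
As one example of what scrutinising the floating-point operations can lead to: a common step (an assumption here, nothing in the OpenCV code mandates it) is to replace a float by a fixed-point integer and check how much precision is lost. A minimal Python sketch, with a made-up coefficient and bit width:

```python
def to_fixed(x, frac_bits):
    """Quantize x to a signed fixed-point integer with frac_bits fractional bits."""
    return round(x * (1 << frac_bits))


def from_fixed(q, frac_bits):
    """Convert a fixed-point integer back to a float for comparison."""
    return q / (1 << frac_bits)


coeff = 0.7071          # hypothetical coefficient from the C++ code
frac_bits = 12          # hypothetical choice of fractional bits
q = to_fixed(coeff, frac_bits)
error = abs(from_fixed(q, frac_bits) - coeff)
print(q, error)
```

The rounding error stays within half of one least significant bit, i.e. 2^-13 for 12 fractional bits; whether that is good enough depends on the algorithm.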

Good Luck,
BobH

Article: 158887
Subject: Re: Recoding openCV C++ project in pure verilog
From: Tim Wescott <tim@seemywebsite.com>
Date: Sat, 14 May 2016 20:49:58 -0500

On Sat, 14 May 2016 17:39:01 -0700, BobH wrote:

> On 05/14/2016 06:16 AM, Marvin L wrote:
>> I have a 6-month project to hand-recode an OpenCV C++ project into
>> pure RTL for FPGA use.
>>
>> I have a Xilinx Zynq FPGA and I have Vivado.
>>
>> The code I am using is at
>> https://github.com/Itseez/opencv/blob/master/modules/features2d/src/orb.cpp
>> https://github.com/Itseez/opencv/blob/master/modules/features2d/src/feature2d.cpp
>> and http://pastebin.com/N69WE89j
>>
>> Can anyone advise on how to start working on this? I have been told to
>> scrutinise the code for any maths operations (especially floating-point),
>> which I am doing now. I also need to use the AXI interface and LogiCORE
>> IP for this, right?
>>
>> Thanks !
>>
> Can you get the algorithm descriptions rather than the C++ code?
> Optimizing an implementation of an algorithm for software execution
> and optimizing it for hardware execution are fairly different beasts.
> 
> If you cannot get the algorithm description, your best path will be to
> create your own from the C++. Things like floating point operations are
> important, but figuring out how to take advantage of the parallel nature
> of hardware is crucial.

+1.  Sometimes you even want to go as far as rearranging the algorithm to 
take advantage of the underlying hardware -- processors and FPGAs just 
do things differently, and that can bubble up pretty far in the design 
chain.

-- 
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work!  See my website if you're interested
http://www.wescottdesign.com

Article: 158888
Subject: Re: Constraining data to out-of-phase clocks
From: rickman <gnuarm@gmail.com>
Date: Sun, 15 May 2016 00:55:58 -0400

On 5/14/2016 1:17 AM, rickman wrote:
> On 5/14/2016 12:48 AM, Tim Wescott wrote:
>> On Fri, 13 May 2016 23:18:56 +0000, Rob Gaddi wrote:
>>
>>> I've got a project coming up in which one of the things I'm going to
>>> need to do is take in the 1 PPS output from a GPS receiver and align it
>>> to the 100 MHz frequency reference clock.
>>>
>>> The problem here is that the phase relationship is static but undefined.
>>>   There's plenty of time somewhere to not violate the setup time, but I
>>> have to find where it is.
>>>
>>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
>>> all of them. Then resynchronize those to the 0° clock, read the
>>> thermometer code, and figure out what the data phase is. Then I can use
>>> the value opposite that and know I've got enough margin to park a
>>> semitrailer in.
>>>
>>> All well and good, except it dawns on me that I don't know how to
>>> convince the tools to make timing for that.  Somehow my SDC file has to
>>> call for running the input signal from the input pin to 8 flops such
>>> that the clock/data skew is less than 1 ns.
>>>
>>> Anyone know how I'd do such a thing?  Altera seems to support
>>> set_max_skew, which might do it if I can disentangle the arcana of the
>>> syntax from
>>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>>> But I'm not necessarily wedded to Altera on this project.
>>
>> There's not enough foreshadowing for me to figure out where the
>> thermometer comes in.
>>
>> But I have lots of questions!
>>
>> How precisely do you really need to know the 1PPS?
>>
>> How can you know that the phase alignment is static, yet not know what it
>> is?  The only way I can think of that you'd have a static phase alignment
>> is if there's a PLL locking to the 1PPS, and that implies a known phase
>> alignment.
>
> I can answer that, the 100 MHz and 1 PPS are from two different, but
> both very stable sources... most likely.  I expect this is like the
> circuit my friend designed, a 1 PPS generator, or at least an internal
> alignment to the external 1 PPS, but clocked from the local 100 MHz
> clock.  You would figure at any given time the phase relationship is
> stable, but not known a priori.  With a clock stability of 10^-... pick a
> number, the phase would not change very fast.
>
>
>> Why not use a 100MHz clock that _is_ of known phase alignment?  Seems to
>> me that a VCXO plus some logic (possibly inside the FPGA) plus an op-amp
>> with an integrator would add up to phase lock.
>
> 1 PPS from GPS, I expect clocked into the FPGA by a local atomic clock.
>
>
>> Certainly if you're using a free-running oscillator, even a crystal one,
>> your phase will, at best, be drifting all over the place, and it'll
>> probably have enough of an offset to it that it'll be running away rather
>> than exhibiting a random walk.
>
> How about rubidium or cesium?  I think they get better than 10^-12,
> maybe 10^-15.

Now that I have read JL's post, I understand.  You have two signals 
provided by your customer and not enough information to know how they 
are related or how to use them.  This will all depend on what you need 
to use the signals for.  In other words, you have not provided enough 
info for us to know how to design your system.

So how are you going to use these two signals?  What are the 
requirements you need to meet?

-- 

Rick C

Article: 158889
Subject: Re: Problem with AXI4 Lite in Cyclone V
From: Theo Markettos <theom+news@chiark.greenend.org.uk>
Date: 15 May 2016 17:01:22 +0100 (BST)

wzab01@gmail.com wrote:
> Hi,
> 
> Thanks to Rob Gaddi for the suggestion. Even though my code was correct, his
> suggestion that I might have implemented the AXI4-Lite protocol incorrectly
> made me doubt whether Altera's implementation is correct.
> 
> I suspected that I might be asserting RVALID too early (even though the Qsys
> Interconnect keeps RREADY high).

FWIW, a bug I recently spent a long time chasing was an Avalon
Clock-Crossing Bridge that, when fed with the same clock on both sides (to
rule out any asynchrony) will generate a spurious avs_readdatavalid cycle to
its master.  Eventually this trickles back into AXI land and generates an
unexpected rvalid message with the same ID as the previous one.  That turns
into a cache response which wedges the CPU (not an ARM).

The bug is rather hard to provoke, but replacing with a single-clock Avalon
Pipeline Bridge fixed it.

Theo

Article: 158890
Subject: Re: Using an FPGA to drive the 80386 CPU on a real motherboard
From: Aleksandar Kuktin <akuktin@gmail.com>
Date: Mon, 16 May 2016 13:18:00 +0000 (UTC)

On Sun, 01 May 2016 23:13:44 -0400, rickman wrote:

> On 5/1/2016 2:24 PM, Aleksandar Kuktin wrote:

>> [snip]

>> Right now, with what I'm currently working on, I have a blessing in
>> that I don't have to run the circuit very fast so I can get away with
>> three-gate-deep logic, maybe even four-gate-deep. But if I were going
>> for break-neck speeds, I would be constrained to logic two LUT4 gates
>> deep. Only so many features can be crammed into a design made with
>> that. :)
> 
> Maybe I don't know what you mean by fast and not fast.  Got some
> numbers?

I consider 100MHz to be fast, 10MHz to be slow. On a clock-independent 
level, I consider logic 2 LUT deep to be fast and logic more than 4 LUTs 
deep to be slow.

>>>> If you add a bit to the word or address size, you are not just
>>>> doubling the CPU's capabilities, you are also doubling the number,
>>>> size and scope of problems you have to deal with.
>>>
>>> ???  My CPU design did not specify the data size, only the instruction
>>> size.  I didn't have a problem adjusting the data size to suit my
>>> application.
>>
>> I suppose you can parameterize the data size, and later change the
>> definition of the parameter to suit.

Bad wording - I meant "parameterize" in the sense of "use language 
features of Verilog", assuming one uses Verilog. :)

> Not just parameters, but the instruction format doesn't care.  Literals
> are built up in 7 bit chunks from an 8 bit instruction or 8 bit chunks
> with a 9 bit instruction since many FPGAs have memory and multipliers 9
> bit wide multiples.  The data path has no restrictions on width.
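
A small Python sketch of building a literal from fixed-size chunks, as described in the quote. The actual instruction encoding is not given in this thread, so the most-significant-chunk-first order and the function name are assumptions made for the illustration.

```python
def build_literal(chunks, chunk_bits=7):
    """Assemble a literal from successive chunks, most significant first.

    Each chunk contributes chunk_bits bits: 7 for an 8-bit instruction,
    8 for a 9-bit instruction. The result width is not limited by the
    instruction width.
    """
    mask = (1 << chunk_bits) - 1
    value = 0
    for chunk in chunks:
        value = (value << chunk_bits) | (chunk & mask)
    return value


print(hex(build_literal([0x01, 0x7F])))                  # two 7-bit chunks
print(hex(build_literal([0x12, 0x34], chunk_bits=8)))    # two 8-bit chunks
```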

For the most part, I will acknowledge and defer - you have more 
experience than I do. But I'd say that "width" does relate to "speed" and 
"complexity". The thesis I'm pushing is that wide enough bit chunks 
require extra gates side by side, and at some point require additional 
gates in depth to integrate (consume) the wider output.

Illustration: let's assume we're making a comparator, but using only LUTs 
(so no carry chain/mux magic). If we compare two 2-bit numbers, we can 
use a single LUT4 to produce an output, with the whole construct being 1 
gate deep. If we increase the width to 3 or 4 bits, we now need two LUT4s 
to compare the numbers. But those two gates have two outputs. To further 
"compress" it to a single output, we need a third LUT4 *in series* with 
the other two. So the whole thing is now 2x slower and 3x bigger than it 
used to be. Now, taking into account the users of the comparator's output, 
we can optimize. If the output used to go into a LUT with one spare 
input, we don't have to add the extra series gate, we can route the extra 
side gate to the spare input. But that increases the complexity, may 
require a (buggy) optimizing synthesizer/place&route and may tie you down 
if you ever need to change something that would affect the "spare" input 
that got pressed into optimization.
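
The counting in that illustration can be written down as a short Python sketch. It assumes an equality comparator built only from LUT4s, with no carry chain and none of the sharing with downstream logic mentioned above; the function name is made up.

```python
def lut4_comparator(width):
    """Return (lut_count, depth) for comparing two width-bit numbers for
    equality using only 4-input LUTs."""
    # First level: each LUT4 compares 2 bits of A against 2 bits of B.
    level = -(-width // 2)            # ceil(width / 2)
    count, depth = level, 1
    # Reduction levels: each LUT4 combines up to 4 partial results.
    while level > 1:
        level = -(-level // 4)        # ceil(level / 4)
        count += level
        depth += 1
    return count, depth


for width in (2, 4, 8, 16, 32):
    print(width, lut4_comparator(width))
```

For 2 bits this gives one LUT at depth 1, and for 3 or 4 bits three LUTs at depth 2, matching the "2x slower and 3x bigger" figure.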

This would be much easier if I knew for sure what the standard 
terminology was. Self-taught and all that..

> [snip]

>> Adding a single bit to a round number can throw the synthesis results
>> way out of optimum. Adding one more can make the gate chain too long to
>> fit into a clock cycle. Changing the clock period can be impossible
>> because the design could have several other interlocking clocks. And on
>> and on.
> 
> Now you are way outside the issues of CPU design.  Now you are in the
> design of your application.

The way I usually do things, I have batches of Verilog that get 
synthesized, placed and routed in one go. I don't floorplan. With that 
setup, a change in one module can have nasty consequences somewhere 
unrelated. I had this happen when I added a larger (16-bit) counter in 
one part, which made the whole thing run too slowly. The solution was 
chopping the counter into a series of smaller counters (four 4-bit ones) and 
manually carrying the carry between them.
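
A behavioural Python sketch of the chopped-up counter, to show that four 4-bit stages with explicit carries count the same way as one 16-bit counter. Whether the carries were registered in the real design is not stated; this models the simplest case, where the carry ripples through all stages within one clock. The class names are made up.

```python
class Counter4:
    """One 4-bit counter stage with an enable input and a carry output."""

    def __init__(self):
        self.value = 0

    def step(self, enable):
        """Advance one clock; return True when wrapping from 15 to 0."""
        if not enable:
            return False
        carry = self.value == 15
        self.value = (self.value + 1) & 0xF
        return carry


class Counter16:
    """16-bit counter built from four 4-bit stages, least significant first."""

    def __init__(self):
        self.stages = [Counter4() for _ in range(4)]

    def step(self):
        enable = True
        for stage in self.stages:
            enable = stage.step(enable)   # carry out enables the next stage

    @property
    def value(self):
        return sum(s.value << (4 * i) for i, s in enumerate(self.stages))


counter = Counter16()
for _ in range(0x1234):
    counter.step()
print(hex(counter.value))
```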

I infer the problem was that the big counter was taking up prime real 
estate in the chip that the (big and clunky, home-grown) CPU needed. 
Chopping it up allowed the p&r tool to spread the counter over a wider 
area, allowing the CPU bits to be closer together and run fast. The chip 
was over 95% full at the time.

>> For example - I discovered that the synthesis tool I used (Lattice's
>> synthesizer) would produce a sub-optimal result if a unit - say a
>> module - had even a single odd-sized register. Changing the register
>> sizes to even numbers makes synthesis much better, even if it does
>> throw away a bit.
> 
> What was suboptimal about a register?  What sort of unit?

I remember that increasing a register from 2 bits to 3 bits killed the maximum 
frequency the circuit could run at. It became at least 2x slower. If I 
remember correctly, further increasing it to 4 bits fixed most of the 
problems. It was bizarre.

If I look hard enough, I may even be able to find the code. I seem to 
remember it was in a routing module of some kind. Or some other glue 
module that would connect the CPU core to the SRAM. That was on a MachXO2.

>>> Exactly what is your dream computer?
>>
>> A device whose design can fit in my head, that is transparent and
>> serviceable on all levels, free(-as-in-freedom), secure and usable for
>> real-world tasks. I should probably put "usable" at the top of the
>> list. :)
>>
>> Right now, that would mean a FPGA-implemented FOSH SoC that is self-
>> contained. That means, which can regenerate the images and binaries of
>> itself, by itself (so you don't need a second computer for that).
> 
> I thought there was already a CPU design like that.  RISC-V  Does that
> not fit your description?  Check out this page for ideas...
> 
> http://www.lowrisc.org/

I am aware of lowRISC, and of Milkymist, of some x86 (soft) chip, and of 
Ben Nanonote and of Novena (which doesn't fully fit the bill).

I was originally going to use Milkymist, with maybe some peripherals
swapped but I didn't like the way they implemented the DDR controller
(there wasn't a lot of it). Furthermore, lm32 - the system's CPU - had
problems. Its load/store module would block the instruction pipeline and
didn't look like it could be easily converted into a non-blocking
implementation. Also, on a cache miss, the CPU's cache would block
everything, fill the entire cache line (which could be several words
long) and release the block. I felt like it would be easier to take a
simpler CPU and extend it with the requisite functionality.

A warning: the following paragraph lists the *exact* causes of my 
frustration with lowRISC, and may cause you to get frustrated in turn, 
especially if you have a stake in the project. :)

I took a look (back in October or November 2015) at lowRISC but two 
(well, three) things put me off. First, the code is hard to find. Just 
now, as I was fact-checking my post, I had to click dozens of times on 
the web-site to see a link to GitHub. And, once there, it still took 
another dozen clicks and some fudging to find at least some of the actual 
verilog. I still don't know where to find the code for the CPU, for 
example. Milkymist had no such problems. Neither did the repository for 
Nyuzi CPU, or the AEMB/AEMB2 repo. The second problem was that the web 
site mostly lists a lot of ... well ... fluff, but is remarkably short on
meaty details. How about a list of the peripherals? It doesn't even have 
a page on Wikipedia! The wiki magic could have saved it.

The third problem, which it shares with OpenRISC, is that the benchmarks
show it running at about 40MHz or less in the rough class of FPGAs I was
likely to implement it in. Meanwhile, Lattice was promising 85MHz for its
lm32. The final nail in the coffin for me and OpenRISC was this document:
http://iosrjournals.org/iosr-jvlsi/papers/vol2-issue4/G0244346.pdf
I am aware they list OpenRISC as running at 185MHz. It is also the place 
where I discovered AEMB.

Article: 158891
Subject: Re: Using an FPGA to drive the 80386 CPU on a real motherboard
From: rickman <gnuarm@gmail.com>
Date: Mon, 16 May 2016 11:53:48 -0400
Links: << >> << T >> << A >>

On 5/16/2016 9:18 AM, Aleksandar Kuktin wrote:
> On Sun, 01 May 2016 23:13:44 -0400, rickman wrote:
>
>> On 5/1/2016 2:24 PM, Aleksandar Kuktin wrote:
>
>>> [snip]
>
>>> Right now, with what I'm currently working on, I have a blessing in
>>> that I don't have to run the circuit very fast so I can get away with
>>> three-gate deep logic, maybe even four gate deep. But if I were going
>>> for break-neck speeds, I would be constrained to logic two LUT4 gates
>>> deep. Only so many features can be crammed into a design made with
>>> that. :)
>>
>> Maybe I don't know what you mean by fast and not fast.  Got some
>> numbers?
>
> I consider 100MHz to be fast, 10MHz to be slow. On a clock-independent
> level, I consider logic 2 LUT deep to be fast and logic more than 4 LUTs
> deep to be slow.
>
>>>>> If you add a bit to the word or address size, you are not just
>>>>> doubling the CPUs capabilities, you are also doubling the number,
>>>>> size and scope of problems you have to deal with.
>>>>
>>>> ???  My CPU design did not specify the data size, only the instruction
>>>> size.  I didn't have a problem adjusting the data size to suit my
>>>> application.
>>>
>>> I suppose you can parameterize the data size, and later change the
>>> definition of the parameter to suit.
>
> Bad wording - I meant "parameterize" in the sense of "use language
> features of verilog", assuming one uses verilog. :)
>
>> Not just parameters, but the instruction format doesn't care.  Literals
>> are built up in 7 bit chunks from an 8 bit instruction or 8 bit chunks
>> with a 9 bit instruction since many FPGAs have memory and multipliers 9
>> bit wide multiples.  The data path has no restrictions on width.
>
> For the most part, I will acknowledge and defer - you have more
> experience than I do. But I'd say that "width" does relate to "speed" and
> "complexity" - the thesis I'm pushing because wide enough ... bit
> chunks ... require addition of extra gates to the sides and at some point
> require additional gates in the depth to ... integrate/consume ... the
> wider output.
>
> Illustration: let's assume we're making a comparator, but using only LUTs
> (so no carry chain/mux magic). If we compare two 2-bit numbers, we can
> use a single LUT4 to produce an output, with the whole construct being 1
> gate deep. If we increase the width to 3 or 4 bits, we now need two LUT4s
> to compare the numbers. But those two gates have two outputs. To further
> "compress" it to a single output, we need a third LUT4 *in series* with
> the other two. So the whole thing is now 2x slower and 3x bigger than it
> used to be. Now, taking into account the users of the comparators output,
> we can optimize. If the output used to go into a LUT with one spare
> input, we don't have to add the extra series gate, we can route the extra
> side gate to the spare input. But that increases the complexity, may
> require a (buggy) optimizing synthesizer/place&route and may tie you down
> if you ever need to change something that would affect the "spare" input
> that got pressed into optimization.
>
> This would be much easier if I knew for sure what the standard
> terminology was. Self-taught and all that..

It's not so much an issue of terminology, but of technology.  A 
comparator in an FPGA would use a carry chain.  Yes, this results in a 
delay that increases linearly with data width, but in general the delay 
is so short that for any data size up to 64 bits it won't significantly 
impact the speed.  So unless you are going for very large data paths, 
this is not a major factor in your CPU speed.
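
For comparison, the LUT-only arithmetic in the quoted illustration can be
checked with a short sketch. It assumes an equality comparator (a magnitude
comparator would need more outputs per stage), 4-input LUTs, and a plain
reduction tree with no carry chain; the function name is hypothetical.

```python
import math


def lut_equality_cost(width, lut_inputs=4):
    """Return (lut_count, depth) for an equality comparator built only from LUTs.

    The first level compares lut_inputs // 2 bit pairs per LUT; each later
    level ANDs up to lut_inputs partial results together.
    """
    bits_per_lut = lut_inputs // 2
    signals = math.ceil(width / bits_per_lut)  # outputs of the first level
    luts = signals
    depth = 1
    while signals > 1:
        signals = math.ceil(signals / lut_inputs)  # one more reduction level
        luts += signals
        depth += 1
    return luts, depth
```

Under these assumptions a 2-bit compare costs one LUT at depth 1 and a 3-bit
or 4-bit compare costs three LUTs at depth 2, which matches the "2x slower
and 3x bigger" figure in the quoted text. Depth grows only logarithmically
after that, and a real design would normally use the carry chain anyway.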


>>> Adding a single bit to a round number can throw the synthesis results
>>> way out of optimum. Adding one more can make the gate chain too long to
>>> fit into a clock cycle. Changing the clock period can be impossible
>>> because the design could have several other interlocking clocks. And on
>>> and on.
>>
>> Now you are way outside the issues of CPU design.  Now you are in the
>> design of your application.
>
> The way I usually do things, I have batches of verilog that gets
> synthesized, placed and routed in one go. I don't floorplan. With that
> setting, a change in one module can have nasty consequences somewhere
> unrelated. I had this happen when I added a larger (16 bits) counter in
> one part that made the whole thing run too slow. The solution was
> chopping the counter into a series of smaller (4x4-bits) counters and
> manually carrying the carry between them.
>
> I infer the problem was that the big counter was taking up prime real
> estate in the chip that the (big and clunky home-grown CPU) needed.
> Chopping it up allowed the p&r tool to spread the counter over a wider
> area, allowing the CPU bits to be closer together and work fast. The chip
> was over 95% full that time.
>
>>> For example - I discovered that the synthesis tool I used (Lattices
>>> synthesizer) would produce a sub-optimal result if a unit - say a
>>> module - had even a single odd-sized register. Changing the register
>>> sizes to even numbers makes synthesis much better, even if it does
>>> throw away a bit.
>>
>> What was suboptimal about a register?  What sort of unit?
>
> I remember increasing a register from 2 bits to 3 bits killed the maximum
> frequency the circuit can run. It became at least 2x slower. If I
> remember correctly, further increasing it to 4 bits fixed most of the
> problems. It was bizarre.

All the bits in a register run in parallel, so length doesn't directly 
impact the speed.  The only factor of register length that would impact 
speed is the length of the routing that connected the registers.  You 
would need to look at the timing report to see what was causing your 
routing delays.  Trying to analyze it a priori really isn't practical.


> If I look hard enough, I may even be able to find the code. I seem to
> remember it was in a routing module of some kind. Or some other glue
> module that would connect the CPU core to the SRAM. That was on a MachXO2.

In any given design there can always be issues where a small change in 
design causes a huge change in results.  This is due to the chaotic 
behavior of the tools when a design starts to push the density or speed 
of the device.

-- 

Rick C

Article: 158892
Subject: Re: Constraining data to out-of-phase clocks
From: Rob Gaddi <rgaddi@highlandtechnology.invalid>
Date: Mon, 16 May 2016 15:58:05 -0000 (UTC)
Links: << >> << T >> << A >>

BobH wrote:

> On 05/13/2016 04:18 PM, Rob Gaddi wrote:
>> I've got a project coming up in which one of the things I'm going to
>> need to do is take in the 1 PPS output from a GPS receiver and align it
>> to the 100 MHz frequency reference clock.
>>
>> The problem here is that the phase relationship is static but undefined.
>>  There's plenty of time somewhere to not violate the setup time, but I
>> have to find where it is.
>>
>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
>> all of them. Then resynchronize those to the 0° clock, read the
>> thermometer code, and figure out what the data phase is. Then I can
>> use the value opposite that and know I've got enough margin to park a
>> semitrailer in.
>>
>> All well and good, except it dawns on me that I don't know how to
>> convince the tools to make timing for that.  Somehow my SDC file has to
>> call for running the input signal from the input pin to 8 flops such
>> that the clock/data skew is less than 1 ns.
>>
>> Anyone know how I'd do such a thing?  Altera seems to support
>> set_max_skew, which might do it if I can disentangle the arcana of the
>> syntax from
>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>> But I'm not necessarily wedded to Altera on this project.
>>
>
> The 1PPS signals out of the GPS receivers that I have seen, advertise 
> 50nS accuracy. Your 100MHz clock has a period of 10nS. From what I 
> understand of your project, I would just put the 1PPS signal in through 
> a couple of sync FF's and then compensate for an assumed delay of 2 
> clock cycles in the application of the sync'ed 1PPS signal downstream.
>

You know how sometimes when you get started on a project, and you get
some assumptions so firmly in your head that you don't even realize that
they're just assumptions?  I had a substantially better opinion of the
jitter on the 1 PPS than is actually warranted.  There are a _COUPLE_ of
receivers out there with 1 PPS jitter down to 10 ns (RMS), but for the
most part you're entirely correct.  And even at 10, there's no
static relationship against a PLL-locked 100 MHz; it's just every pulse
for itself.

You'd think with as many times as I've learned that lesson, one of these
times it would stick.

> The phase relationship between the 1PPS input and the local 100MHz clock 
> won't be all that static unless you phase lock the 100MHz clock to the 
> 1PPS. Even an exotic local oscillator is going to drift around with time 
> and temperature compared to the 1PPS.
>

Now that's a question I have no idea how to resolve.  Is the exotic
LO (let's assume cesium for fun) drifting and the PPS stable, or is it
(I'd guess) that no matter how stable the LO, the PPS is walking
around?

Making claims of objective truth on these timescales is hard.
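
One way to put rough numbers on the drift side of that question, as a sketch
only: a constant fractional frequency offset y makes the phase slide by y
seconds per second, so slipping one full clock period takes period / y. The
10^-12 and 10^-15 values are the stability figures floated elsewhere in this
thread, not measurements, and the function name is made up.

```python
def seconds_to_slip_one_period(freq_hz, fractional_offset):
    """Time for a constant fractional frequency offset to accumulate one clock period."""
    period_s = 1.0 / freq_hz
    return period_s / fractional_offset


# Illustrative values: 100 MHz reference, offsets taken from the thread.
for y in (1e-12, 1e-15):
    t = seconds_to_slip_one_period(100e6, y)
    print(f"offset {y:g}: about {t:.3g} s ({t / 3600:.3g} h) per 10 ns slip")
```

At 10^-12 that is on the order of 10^4 seconds per 10 ns of slip, so the LO
term moves slowly compared with pulse-to-pulse PPS jitter of tens of
nanoseconds.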

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 158893
Subject: Re: Constraining data to out-of-phase clocks
From: Rob Gaddi <rgaddi@highlandtechnology.invalid>
Date: Mon, 16 May 2016 16:12:21 -0000 (UTC)
Links: << >> << T >> << A >>

rickman wrote:

> On 5/14/2016 1:17 AM, rickman wrote:
>> On 5/14/2016 12:48 AM, Tim Wescott wrote:
>>> On Fri, 13 May 2016 23:18:56 +0000, Rob Gaddi wrote:
>>>
>>>> I've got a project coming up in which one of the things I'm going to
>>>> need to do is take in the 1 PPS output from a GPS receiver and align it
>>>> to the 100 MHz frequency reference clock.
>>>>
>>>> The problem here is that the phase relationship is static but undefined.
>>>>   There's plenty of time somewhere to not violate the setup time, but I
>>>> have to find where it is.
>>>>
>>>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>>>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
>>>> all of them. Then resynchronize those to the 0° clock, read the
>>>> thermometer code, and figure out what the data phase is. Then I can use
>>>> the value opposite that and know I've got enough margin to park a
>>>> semitrailer in.
>>>>
>>>> All well and good, except it dawns on me that I don't know how to
>>>> convince the tools to make timing for that.  Somehow my SDC file has to
>>>> call for running the input signal from the input pin to 8 flops such
>>>> that the clock/data skew is less than 1 ns.
>>>>
>>>> Anyone know how I'd do such a thing?  Altera seems to support
>>>> set_max_skew, which might do it if I can disentangle the arcana of the
>>>> syntax from
>>>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>>>> But I'm not necessarily wedded to Altera on this project.
>>>
>>> There's not enough foreshadowing for me to figure out where the
>>> thermometer comes in.
>>>
>>> But I have lots of questions!
>>>
>>> How precisely do you really need to know the 1PPS?
>>>
>>> How can you know that the phase alignment is static, yet not know what it
>>> is?  The only way I can think of that you'd have a static phase alignment
>>> is if there's a PLL locking to the 1PPS, and that implies a known phase
>>> alignment.
>>
>> I can answer that, the 100 MHz and 1 PPS are from two different, but
>> both very stable sources... most likely.  I expect this is like the
>> circuit my friend designed, a 1 PPS generator, or at least an internal
>> alignment to the external 1 PPS, but clocked from the local 100 MHz
>> clock.  You would figure at any given time the phase relationship is
>> stable, but not known apriori.  With a clock stability of 10^-... pick a
>> number, the phase would not change very fast.
>>
>>
>>> Why not use a 100MHz clock that _is_ of known phase alignment?  Seems to
>>> me that an VCXO plus some logic (possibly inside the FPGA) plus an op-amp
>>> with an integrator would add up to phase lock.
>>
>> 1 PPS from GPS, I expect clocked into the FPGA by a local atomic clock.
>>
>>
>>> Certainly if you're using a free-running oscillator, even a crystal one,
>>> your phase will, at best, be drifting all over the place, and it'll
>>> probably have enough of an offset to it that it'll be running away rather
>>> than exhibiting a random walk.
>>
>> How about rubidium or cesium?  I think they get better than 10^-12,
>> maybe 10^-15.
>
> Now that I have read JL's post, I understand.  You have two signals 
> provided by your customer and not enough information to know how they 
> are related or how to use them.  This will all depend on what you need 
> to use the signals for.  In other words, you have not provided enough 
> info for us to know how to design your system.
>
> So how are you going to use these two signals?  What are the 
> requirements you need to meet?
>

Exactly, and so it's not, as both you and Tim mentioned, about
"measuring" the phase of the PPS signal per se.  It's about being able
to clock it in, either by delaying the PPS or rotating the clock, in
such a way that you've guaranteed you're not sampling the PPS during the
maybe/maybenot interval.  If you did you'd get a one clock ambiguity in
the sampling which would become period jitter.  Always be early, or
always be late, but don't keep changing your mind.
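
The phase-picking idea from the original question can be sketched as a small
idealized model. It assumes a perfectly static PPS edge, 8 evenly spaced
phases of a 10 ns clock, and no metastability or resynchronization effects;
all names are hypothetical and nothing here is tool or vendor specific.

```python
PERIOD_NS = 10.0
N_PHASES = 8
STEP_NS = PERIOD_NS / N_PHASES  # 1.25 ns between adjacent phases


def thermometer(edge_ns):
    """Bit k is 1 if phase k samples at or after the PPS edge within the period."""
    offset = edge_ns % PERIOD_NS
    return [1 if k * STEP_NS >= offset else 0 for k in range(N_PHASES)]


def safe_phase(code):
    """Pick the phase half a period away from the first phase that saw the edge."""
    first = code.index(1) if 1 in code else 0  # all zeros: edge just before phase 0
    return (first + N_PHASES // 2) % N_PHASES


def margin_ns(edge_ns, phase):
    """Circular distance between the PPS edge and the chosen sampling phase."""
    d = (phase * STEP_NS - edge_ns) % PERIOD_NS
    return min(d, PERIOD_NS - d)
```

In this model the chosen phase always lands at least 3.75 ns from the edge.
That only holds while the edge really is static, which is the assumption the
next paragraph gives up on.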

Though as BobH pointed out, it's not quite so simple as that, either,
since the PPS reference from any purchasable source has tens of
nanoseconds of peak-to-peak period jitter.  So my assumptions of phase
stability are blown, and it's back to the customer, bucket o' questions
in hand, to find out whether I'm supposed to faithfully replicate the
PPS or jitter clean it.  So this question, while still interesting,
becomes moot for my actual application.

I hadn't even realized JL put it out to s.e.d.  Generally I stay off his
groups and he off of mine, thus allowing both of us to maintain a
careful mystique of omniscience by sneaking off and asking folks on the
Internet.

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 158894
Subject: Re: Constraining data to out-of-phase clocks
From: rickman <gnuarm@gmail.com>
Date: Mon, 16 May 2016 12:17:32 -0400
Links: << >> << T >> << A >>

On 5/16/2016 11:58 AM, Rob Gaddi wrote:
> BobH wrote:
>
>> On 05/13/2016 04:18 PM, Rob Gaddi wrote:
>>> I've got a project coming up in which one of the things I'm going to
>>> need to do is take in the 1 PPS output from a GPS receiver and align it
>>> to the 100 MHz frequency reference clock.
>>>
>>> The problem here is that the phase relationship is static but undefined.
>>>   There's plenty of time somewhere to not violate the setup time, but I
>>> have to find where it is.
>>>
>>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
>>> all of them. Then resynchronize those to the 0° clock, read the
>>> thermometer code, and figure out what the data phase is. Then I can
>>> use the value opposite that and know I've got enough margin to park a
>>> semitrailer in.
>>>
>>> All well and good, except it dawns on me that I don't know how to
>>> convince the tools to make timing for that.  Somehow my SDC file has to
>>> call for running the input signal from the input pin to 8 flops such
>>> that the clock/data skew is less than 1 ns.
>>>
>>> Anyone know how I'd do such a thing?  Altera seems to support
>>> set_max_skew, which might do it if I can disentangle the arcana of the
>>> syntax from
>>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>>> But I'm not necessarily wedded to Altera on this project.
>>>
>>
>> The 1PPS signals out of the GPS receivers that I have seen, advertise
>> 50nS accuracy. Your 100MHz clock has a period of 10nS. From what I
>> understand of your project, I would just put the 1PPS signal in through
>> a couple of sync FF's and then compensate for an assumed delay of 2
>> clock cycles in the application of the sync'ed 1PPS signal downstream.
>>
>
> You know how sometimes when you get started on a project, and you get
> some assumptions so firmly in your head that you don't even realize that
> they're just assumptions?  I had a substantially better opinion of the
> jitter on the 1 PPS than is actually warranted.  There are a _COUPLE_ of
> receivers out there with 1 PPS jitter down to 10 ns (RMS), but for the
> most part you're entirely correct.  And even at 10, there's no
> static relationship against a PLL-locked 100 MHz; it's just every pulse
> for itself.
>
> You'd think with as many times as I've learned that lesson, one of these
> times it would stick.
>
>> The phase relationship between the 1PPS input and the local 100MHz clock
>> won't be all that static unless you phase lock the 100MHz clock to the
>> 1PPS. Even an exotic local oscillator is going to drift around with time
>> and temperature compared to the 1PPS.
>>
>
> Now that's a question I have no idea how to resolve.  Is the exotic
> LO (let's assume cesium for fun) drifting and the PPS stable, or is it
> (I'd guess) that no matter how stable the LO that the PPS is walking
> around?
>
> Making claims of objective truth on these timescales is hard.

It is still unclear to me what you wish to use the 1 PPS for exactly. 
But if you require a 1 PPS which is locked to a 100 MHz clock, both of which
can be connected to the GPS signals, you would need to generate both 
yourself.  In another thread John has indicated he doesn't want to wait 
for the synthesized 1 PPS to sync up to the GPS 1 PPS, but I don't see 
where you have a choice if you really need the two signals locked in the 
way you seem to want.

The issue is not a design issue really.  It is fundamental to the fact 
that the 1 PPS from the GPS has significant jitter.

As I've said before, I know someone who designed exactly this circuit 
for one of the Naval Observatory's newer atomic clocks.  I can put you 
in touch if you would like.  His design also has a time of day clock 
display in case you need the time.  lol

-- 

Rick C

Article: 158895
Subject: Re: Constraining data to out-of-phase clocks
From: Tim Wescott <tim@seemywebsite.com>
Date: Mon, 16 May 2016 11:32:29 -0500
Links: << >> << T >> << A >>

On Mon, 16 May 2016 15:58:05 +0000, Rob Gaddi wrote:

> BobH wrote:
> 
>> On 05/13/2016 04:18 PM, Rob Gaddi wrote:
>>> I've got a project coming up in which one of the things I'm going to
>>> need to do is take in the 1 PPS output from a GPS receiver and align
>>> it to the 100 MHz frequency reference clock.
>>>
>>> The problem here is that the phase relationship is static but
>>> undefined.
>>>  There's plenty of time somewhere to not violate the setup time, but I
>>> have to find where it is.
>>>
>>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
>>> all of them. Then resynchronize those to the 0° clock, read the
>>> thermometer code, and figure out what the data phase is. Then I can
>>> use the value opposite that and know I've got enough margin to park a
>>> semitrailer in.
>>>
>>> All well and good, except it dawns on me that I don't know how to
>>> convince the tools to make timing for that.  Somehow my SDC file has
>>> to call for running the input signal from the input pin to 8 flops
>>> such that the clock/data skew is less than 1 ns.
>>>
>>> Anyone know how I'd do such a thing?  Altera seems to support
>>> set_max_skew, which might do it if I can disentangle the arcana of the
>>> syntax from
>>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>>> But I'm not necessarily wedded to Altera on this project.
>>>
>>>
>> The 1PPS signals out of the GPS receivers that I have seen, advertise
>> 50nS accuracy. Your 100MHz clock has a period of 10nS. From what I
>> understand of your project, I would just put the 1PPS signal in through
>> a couple of sync FF's and then compensate for an assumed delay of 2
>> clock cycles in the application of the sync'ed 1PPS signal downstream.
>>
>>
> You know how sometimes when you get started on a project, and you get
> some assumptions so firmly in your head that you don't even realize that
> they're just assumptions?  I had a substantially better opinion of the
> jitter on the 1 PPS than is actually warranted.  There are a _COUPLE_ of
> receivers out there with 1 PPS jitter down to 10 ns (RMS), but for the
> most part you're entirely correct.  And even at 10, there's no static
> relationship against a PLL-locked 100 MHz; it's just every pulse for
> itself.
> 
> You'd think with as many times as I've learned that lesson, one of these
> times it would stick.
> 
>> The phase relationship between the 1PPS input and the local 100MHz
>> clock won't be all that static unless you phase lock the 100MHz clock
>> to the 1PPS. Even an exotic local oscillator is going to drift around
>> with time and temperature compared to the 1PPS.
>>
>>
> Now that's a question I have no idea how to resolve.  Is the exotic LO
> (let's assume cesium for fun) drifting and the PPS stable, or is it (I'd
> guess) that no matter how stable the LO that the PPS is walking around?
> 
> Making claims of objective truth on these timescales is hard.

I'm pretty sure that it would be the 1PPS walking around, but walking 
around with an average error of zero, or at least zero somewhere inside 
the receiver, with a fixed offset to the spigot on the box, and another 
fixed offset from that spigot to your chip.  So your "objective truth" is 
that the average time error is zero, or at least some knowable, 
calibrateable constant.

GPS gets a message from each of a bunch of satellites that basically add 
up to "this is what time it is and this is where I am".  If the GPS 
receiver knew what time it was, it could take three such messages and 
determine where it is (because there are three spatial dimensions, so it 
needs three equations to solve for three unknowns).  But the GPS receiver 
doesn't know what time it is, so it has to solve for time, too.

So a GPS receiver is either going to smooth the 1PPS itself, thereby 
running the chance of making it worse, and requiring it to have a pretty 
good time base, or it's going to not smooth the 1PPS, and instead just 
use the most recent GPS time/position solution to decide when to pop off 
the 1PPS pulse.  If it doesn't smooth the 1PPS, then you can expect a 
time error that's commensurate with the usual GPS position error of 10 
meters or so (100 meters if the defense department decides to turn 
selective availability on, and no guarantees that in time of war they 
won't).  10 meters works out to 30ns, so...
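
The distance-to-time conversion is just division by the speed of light. A
minimal sketch, with a made-up function name:

```python
C_M_PER_S = 299_792_458  # speed of light in vacuum, metres per second


def range_error_to_ns(meters):
    """Convert a ranging error in metres to the equivalent timing error in ns."""
    return meters / C_M_PER_S * 1e9


for metres in (10, 100):
    print(f"{metres} m is about {range_error_to_ns(metres):.1f} ns")
```

Ten metres comes out a little over 33 ns, which is the "30ns" above in round
numbers, and 100 metres is about ten times that.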

-- 
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work!  See my website if you're interested
http://www.wescottdesign.com

Article: 158896
Subject: Re: Constraining data to out-of-phase clocks
From: Tim Wescott <tim@seemywebsite.com>
Date: Mon, 16 May 2016 11:32:50 -0500
Links: << >> << T >> << A >>

On Mon, 16 May 2016 16:12:21 +0000, Rob Gaddi wrote:

> rickman wrote:
> 
>> On 5/14/2016 1:17 AM, rickman wrote:
>>> On 5/14/2016 12:48 AM, Tim Wescott wrote:
>>>> On Fri, 13 May 2016 23:18:56 +0000, Rob Gaddi wrote:
>>>>
>>>>> I've got a project coming up in which one of the things I'm going to
>>>>> need to do is take in the 1 PPS output from a GPS receiver and align
>>>>> it to the 100 MHz frequency reference clock.
>>>>>
>>>>> The problem here is that the phase relationship is static but
>>>>> undefined.
>>>>>   There's plenty of time somewhere to not violate the setup time,
>>>>>   but I
>>>>> have to find where it is.
>>>>>
>>>>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>>>>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal
>>>>> on all of them. Then resynchronize those to the 0° clock, read the
>>>>> thermometer code, and figure out what the data phase is. Then I can
>>>>> use the value opposite that and know I've got enough margin to park
>>>>> a semitrailer in.
>>>>>
>>>>> All well and good, except it dawns on me that I don't know how to
>>>>> convince the tools to make timing for that.  Somehow my SDC file has
>>>>> to call for running the input signal from the input pin to 8 flops
>>>>> such that the clock/data skew is less than 1 ns.
>>>>>
>>>>> Anyone know how I'd do such a thing?  Altera seems to support
>>>>> set_max_skew, which might do it if I can disentangle the arcana of
>>>>> the syntax from
>>>>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>>>>> But I'm not necessarily wedded to Altera on this project.
>>>>
>>>> There's not enough foreshadowing for me to figure out where the
>>>> thermometer comes in.
>>>>
>>>> But I have lots of questions!
>>>>
>>>> How precisely do you really need to know the 1PPS?
>>>>
>>>> How can you know that the phase alignment is static, yet not know
>>>> what it is?  The only way I can think of that you'd have a static
>>>> phase alignment is if there's a PLL locking to the 1PPS, and that
>>>> implies a known phase alignment.
>>>
>>> I can answer that, the 100 MHz and 1 PPS are from two different, but
>>> both very stable sources... most likely.  I expect this is like the
>>> circuit my friend designed, a 1 PPS generator, or at least an internal
>>> alignment to the external 1 PPS, but clocked from the local 100 MHz
>>> clock.  You would figure at any given time the phase relationship is
>>> stable, but not known apriori.  With a clock stability of 10^-... pick
>>> a number, the phase would not change very fast.
>>>
>>>
>>>> Why not use a 100MHz clock that _is_ of known phase alignment?  Seems
>>>> to me that an VCXO plus some logic (possibly inside the FPGA) plus an
>>>> op-amp with an integrator would add up to phase lock.
>>>
>>> 1 PPS from GPS, I expect clocked into the FPGA by a local atomic
>>> clock.
>>>
>>>
>>>> Certainly if you're using a free-running oscillator, even a crystal
>>>> one, your phase will, at best, be drifting all over the place, and
>>>> it'll probably have enough of an offset to it that it'll be running
>>>> away rather than exhibiting a random walk.
>>>
>>> How about rubidium or cesium?  I think they get better than 10^-12,
>>> maybe 10^-15.
>>
>> Now that I have read JL's post, I understand.  You have two signals
>> provided by your customer and not enough information to know how they
>> are related or how to use them.  This will all depend on what you need
>> to use the signals for.  In other words, you have not provided enough
>> info for us to know how to design your system.
>>
>> So how are you going to use these two signals?  What are the
>> requirements you need to meet?
>>
>>
> Exactly, and so it's not, as both you and Tim mentioned, about
> "measuring" the phase of the PPS signal per se.  It's about being able
> to clock it in, either by delaying the PPS or rotating the clock, in
> such a way that you've guaranteed you're not sampling the PPS during the
> maybe/maybenot interval.  If you did you'd get a one clock ambiguity in
> the sampling which would become period jitter.  Always be early, or
> always be late, but don't keep changing your mind.
> 
> Though as BobH pointed out, it's not quite so simple as that, either,
> since the PPS reference from any purchasable source has tens of
> nanoseconds of peak-to-peak period jitter.  So my assumptions of phase
> stability are blown, and it's back to the customer, bucket o' questions
> in hand, to find out whether I'm supposed to faithfully replicate the
> PPS or jitter clean it.  So this question, while still interesting,
> becomes moot for my actual application.
> 
> I hadn't even realized JL put it out to s.e.d.  Generally I stay off his
> groups and he off of mine, thus allowing both of us to maintain a
> careful mystique of omniscience by sneaking off and asking folks on the
> Internet.

And we ratted you out.  Sorry.

-- 
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work!  See my website if you're interested
http://www.wescottdesign.com

Article: 158897
Subject: Re: Constraining data to out-of-phase clocks
From: Rob Gaddi <rgaddi@highlandtechnology.invalid>
Date: Mon, 16 May 2016 16:47:08 -0000 (UTC)
Links: << >> << T >> << A >>

Tim Wescott wrote:

> On Mon, 16 May 2016 16:12:21 +0000, Rob Gaddi wrote:
>
>> rickman wrote:
>> 
>>> On 5/14/2016 1:17 AM, rickman wrote:
>>>
>>> Now that I have read JL's post, I understand.  You have two signals
>>> provided by your customer and not enough information to know how they
>>> are related or how to use them.  This will all depend on what you need
>>> to use the signals for.  In other words, you have not provided enough
>>> info for us to know how to design your system.
>>>
>>> So how are you going to use these two signals?  What are the
>>> requirements you need to meet?
>>>
>>>
>> Exactly, and so it's not, as both you and Tim mentioned, about
>> "measuring" the phase of the PPS signal per se.  It's about being able
>> to clock it in, either by delaying the PPS or rotating the clock, in
>> such a way that you've guaranteed you're not sampling the PPS during the
>> maybe/maybenot interval.  If you did you'd get a one clock ambiguity in
>> the sampling which would become period jitter.  Always be early, or
>> always be late, but don't keep changing your mind.
>> 
>> Though as BobH pointed out, it's not quite so simple as that, either,
>> since the PPS reference from any purchasable source has tens of
>> nanoseconds of peak-to-peak period jitter.  So my assumptions of phase
>> stability are blown, and it's back to the customer, bucket o' questions
>> in hand, to find out whether I'm supposed to faithfully replicate the
>> PPS or jitter clean it.  So this question, while still interesting,
>> becomes moot for my actual application.
>> 
>> I hadn't even realized JL put it out to s.e.d.  Generally I stay off his
>> groups and he off of mine, thus allowing both of us to maintain a
>> careful mystique of omniscience by sneaking off and asking folks on the
>> Internet.
>
> And we ratted you out.  Sorry.
>

C'est la vie.  Honestly, once you've been seen to lose track of 4U
rackmount equipment in your own office, the mystique fades a bit.

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 158898
Subject: Re: Constraining data to out-of-phase clocks
From: rickman <gnuarm@gmail.com>
Date: Mon, 16 May 2016 13:14:12 -0400
Links: << >> << T >> << A >>

On 5/16/2016 12:12 PM, Rob Gaddi wrote:
> rickman wrote:
>
>> On 5/14/2016 1:17 AM, rickman wrote:
>>> On 5/14/2016 12:48 AM, Tim Wescott wrote:
>>>> On Fri, 13 May 2016 23:18:56 +0000, Rob Gaddi wrote:
>>>>
>>>>> I've got a project coming up in which one of the things I'm going to
>>>>> need to do is take in the 1 PPS output from a GPS receiver and align it
>>>>> to the 100 MHz frequency reference clock.
>>>>>
>>>>> The problem here is that the phase relationship is static but undefined.
>>>>> There's plenty of time somewhere to not violate the setup time, but I
>>>>> have to find where it is.
>>>>>
>>>>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>>>>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
>>>>> all of them. Then resynchronize those to the 0° clock, read the
>>>>> thermometer code, and figure out what the data phase is. Then I can use
>>>>> the value opposite that and know I've got enough margin to park a
>>>>> semitrailer in.
>>>>>
>>>>> All well and good, except it dawns on me that I don't know how to
>>>>> convince the tools to make timing for that.  Somehow my SDC file has to
>>>>> call for running the input signal from the input pin to 8 flops such
>>>>> that the clock/data skew is less than 1 ns.
>>>>>
>>>>> Anyone know how I'd do such a thing?  Altera seems to support
>>>>> set_max_skew, which might do it if I can disentangle the arcana of the
>>>>> syntax from
>>>>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>>>>> But I'm not necessarily wedded to Altera on this project.
>>>>
>>>> There's not enough foreshadowing for me to figure out where the
>>>> thermometer comes in.
>>>>
>>>> But I have lots of questions!
>>>>
>>>> How precisely do you really need to know the 1PPS?
>>>>
>>>> How can you know that the phase alignment is static, yet not know what it
>>>> is?  The only way I can think of that you'd have a static phase alignment
>>>> is if there's a PLL locking to the 1PPS, and that implies a known phase
>>>> alignment.
>>>
>>> I can answer that, the 100 MHz and 1 PPS are from two different, but
>>> both very stable sources... most likely.  I expect this is like the
>>> circuit my friend designed, a 1 PPS generator, or at least an internal
>>> alignment to the external 1 PPS, but clocked from the local 100 MHz
>>> clock.  You would figure at any given time the phase relationship is
>>> stable, but not known a priori.  With a clock stability of 10^-... pick a
>>> number, the phase would not change very fast.
>>>
>>>
>>>> Why not use a 100MHz clock that _is_ of known phase alignment?  Seems to
>>>> me that a VCXO plus some logic (possibly inside the FPGA) plus an op-amp
>>>> with an integrator would add up to phase lock.
>>>
>>> 1 PPS from GPS, I expect clocked into the FPGA by a local atomic clock.
>>>
>>>
>>>> Certainly if you're using a free-running oscillator, even a crystal one,
>>>> your phase will, at best, be drifting all over the place, and it'll
>>>> probably have enough of an offset to it that it'll be running away rather
>>>> than exhibiting a random walk.
>>>
>>> How about rubidium or cesium?  I think they get better than 10^-12,
>>> maybe 10^-15.
>>
>> Now that I have read JL's post, I understand.  You have two signals
>> provided by your customer and not enough information to know how they
>> are related or how to use them.  This will all depend on what you need
>> to use the signals for.  In other words, you have not provided enough
>> info for us to know how to design your system.
>>
>> So how are you going to use these two signals?  What are the
>> requirements you need to meet?
>>
>
> Exactly, and so it's not, as both you and Tim mentioned, about
> "measuring" the phase of the PPS signal per se.  It's about being able
> to clock it in, either by delaying the PPS or rotating the clock, in
> such a way that you've guaranteed you're not sampling the PPS during the
> maybe/maybenot interval.  If you did you'd get a one clock ambiguity in
> the sampling which would become period jitter.  Always be early, or
> always be late, but don't keep changing your mind.
>
> Though as BobH pointed out, it's not quite so simple as that, either,
> since the PPS reference from any purchasable source has tens of
> nanoseconds of peak-to-peak period jitter.  So my assumptions of phase
> stability are blown, and it's back to the customer, bucket o' questions
> in hand, to find out whether I'm supposed to faithfully replicate the
> PPS or jitter clean it.  So this question, while still interesting,
> becomes moot for my actual application.
>
> I hadn't even realized JL put it out to s.e.d.  Generally I stay off his
> groups and he off of mine, thus allowing both of us to maintain a
> careful mystique of omniscience by sneaking off and asking folks on the
> Internet.

Certainly you are a lot more pleasant to discuss things with.

-- 

Rick C

Article: 158899
Subject: Re: Constraining data to out-of-phase clocks
From: rickman <gnuarm@gmail.com>
Date: Mon, 16 May 2016 13:23:12 -0400
Links: << >> << T >> << A >>

On 5/16/2016 12:32 PM, Tim Wescott wrote:
> On Mon, 16 May 2016 15:58:05 +0000, Rob Gaddi wrote:
>
>> BobH wrote:
>>
>>> On 05/13/2016 04:18 PM, Rob Gaddi wrote:
>>>> I've got a project coming up in which one of the things I'm going to
>>>> need to do is take in the 1 PPS output from a GPS receiver and align
>>>> it to the 100 MHz frequency reference clock.
>>>>
>>>> The problem here is that the phase relationship is static but
>>>> undefined.  There's plenty of time somewhere to not violate the setup
>>>> time, but I have to find where it is.
>>>>
>>>> My thought had been to use one of the FPGA PLLs to spin up 8 phases
>>>> (well, 4 and 4 bar) of the 100 MHz clock and capture the PPS signal on
>>>> all of them. Then resynchronize those to the 0° clock, read the
>>>> thermometer code, and figure out what the data phase is. Then I can
>>>> use the value opposite that and know I've got enough margin to park a
>>>> semitrailer in.
>>>>
>>>> All well and good, except it dawns on me that I don't know how to
>>>> convince the tools to make timing for that.  Somehow my SDC file has
>>>> to call for running the input signal from the input pin to 8 flops
>>>> such that the clock/data skew is less than 1 ns.
>>>>
>>>> Anyone know how I'd do such a thing?  Altera seems to support
>>>> set_max_skew, which might do it if I can disentangle the arcana of the
>>>> syntax from
>>>> http://quartushelp.altera.com/14.1/mergedProjects/tafs/tafs/tcl_pkg_sdc_ext_ver_1.0_cmd_set_max_skew.htm
>>>> But I'm not necessarily wedded to Altera on this project.
>>>>
>>>>
>>> The 1PPS signals out of the GPS receivers that I have seen advertise
>>> 50 ns accuracy. Your 100 MHz clock has a period of 10 ns. From what I
>>> understand of your project, I would just put the 1PPS signal in through
>>> a couple of sync FFs and then compensate for an assumed delay of 2
>>> clock cycles in the application of the sync'ed 1PPS signal downstream.
>>>
>>>
>> You know how sometimes when you get started on a project, and you get
>> some assumptions so firmly in your head that you don't even realize that
>> they're just assumptions?  I had a substantially better opinion of the
>> jitter on the 1 PPS than is actually warranted.  There are a _COUPLE_ of
>> receivers out there with 1 PPS jitter down to 10 ns (RMS), but for the
>> most part you're entirely correct.  And even at 10, there's no static
>> relationship against a PLL-locked 100 MHz; it's just every pulse for
>> itself.
>>
>> You'd think with as many times as I've learned that lesson, one of these
>> times it would stick.
>>
>>> The phase relationship between the 1PPS input and the local 100MHz
>>> clock won't be all that static unless you phase lock the 100MHz clock
>>> to the 1PPS. Even an exotic local oscillator is going to drift around
>>> with time and temperature compared to the 1PPS.
>>>
>>>
>> Now that's a question I have no idea how to resolve.  Is the exotic LO
>> (let's assume cesium for fun) drifting and the PPS stable, or is it (I'd
>> guess) that no matter how stable the LO, the PPS is walking around?
>>
>> Making claims of objective truth on these timescales is hard.
>
> I'm pretty sure that it would be the 1PPS walking around, but walking
> around with an average error of zero, or at least zero somewhere inside
> the receiver, with a fixed offset to the spigot on the box, and another
> fixed offset from that spigot to your chip.  So your "objective truth" is
> that the average time error is zero, or at least some knowable,
> calibratable constant.
>
> GPS gets a message from each of a bunch of satellites that basically add
> up to "this is what time it is and this is where I am".  If the GPS
> receiver knew what time it was, it could take three such messages and
> determine where it is (because there are three spatial dimensions, so it
> needs three equations to solve for three unknowns).  But the GPS receiver
> doesn't know what time it is, so it has to solve for time, too.
>
> So a GPS receiver is either going to smooth the 1PPS itself, thereby
> running the chance of making it worse, and requiring it to have a pretty
> good time base, or it's going to not smooth the 1PPS, and instead just
> use the most recent GPS time/position solution to decide when to pop off
> the 1PPS pulse.  If it doesn't smooth the 1PPS, then you can expect a
> time error that's commensurate with the usual GPS position error of 10
> meters or so (100 meters if the defense department decides to turn
> selective availability on, and no guarantees that in time of war they
> won't).  10 meters works out to 30ns, so...

I don't know for sure, but I find it hard to believe they would output
the 1 PPS based solely on one measurement each.  While the 10 meter
figure may often bound the accuracy of any one calculation, it is
probabilistic, and the actual error can be much larger for a short
amount of time.  While they may not be doing something equivalent to a
PLL for the time, I expect they are at least using some form of weighted
average for calculating the 1 PPS signal.
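
To illustrate what "some form of weighted average" could buy, here is a
minimal sketch, not a description of what any real receiver does.  It
assumes each second's time solution has an independent Gaussian error of
30 ns (taken from the roughly 10 meter figure quoted above) and smooths
it with an exponentially weighted moving average.  The function name
ewma, the weight alpha = 0.1, and the noise model are all assumptions
made for illustration.

```python
import random
import statistics


def ewma(samples, alpha):
    """Exponentially weighted moving average of a sequence.

    alpha is the weight given to the newest sample, in (0, 1].
    The estimate is seeded with the first sample.
    """
    smoothed = []
    estimate = samples[0]
    for x in samples:
        estimate = alpha * x + (1.0 - alpha) * estimate
        smoothed.append(estimate)
    return smoothed


# Assumed noise model: one independent time-error sample per second,
# zero mean, 30 ns standard deviation (hypothetical figure).
random.seed(1)
raw_ns = [random.gauss(0.0, 30.0) for _ in range(2000)]

smooth_ns = ewma(raw_ns, alpha=0.1)

raw_sd = statistics.pstdev(raw_ns)
# Skip the first 100 samples so the filter start-up does not count.
smooth_sd = statistics.pstdev(smooth_ns[100:])

print(f"raw jitter      : {raw_sd:.1f} ns (std dev)")
print(f"smoothed jitter : {smooth_sd:.1f} ns (std dev)")
```

The catch, as noted in the quoted text, is that smoothing only helps if
the receiver's own time base is good enough to hold the estimate steady
between updates; the sketch ignores that entirely.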

I had a GPS that would receive the signals OK in my house, and I
connected it to a utility that would map the location over time on an XY
display.  I saw it walk off the property maybe once per day and come
back within a minute or less.  That would be more like 50 meters.  I've
heard of shipwrecks due to a loss of accuracy caused by poor
constellations.  That has to be a lot more than 50 meters.
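
For scale, the distances mentioned in this thread can be turned into
equivalent time errors by dividing by the speed of light.  This is a
rough sketch that treats a position error as if it were a pure range
error and ignores satellite geometry, so take the numbers as order of
magnitude only.  The helper name range_error_to_ns is made up for the
example.

```python
# Speed of light in vacuum, metres per second (exact by definition).
C_M_PER_S = 299_792_458.0


def range_error_to_ns(meters):
    """Time light takes to cover the given distance, in nanoseconds."""
    return meters / C_M_PER_S * 1e9


# Distances mentioned in the thread: 10 m, 50 m, 100 m.
for meters in (10.0, 50.0, 100.0):
    ns = range_error_to_ns(meters)
    # A 100 MHz clock has a 10 ns period.
    print(f"{meters:6.1f} m -> {ns:6.1f} ns ({ns / 10.0:.1f} cycles at 100 MHz)")
```

So 10 meters is a little over 33 ns, in line with the "30ns" above, and
a 50 meter excursion would be several times that.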

-- 

Rick C
