On 7/29/2013 5:09 AM, alb wrote:
> Rick was suggesting a phase jitter with a high and a low frequency
> component. This can be an even more realistic case since it models slow
> drifts due to temperature variations... I do not know how critical it would
> be to simulate *all* jitter components of a clock (they may depend on
> temperature, power noise, ground noise, ...).

Just to be clear, my suggestion to simulate with both fast and slow clock frequency variations is not intended to match any particular real-world conditions; it is simply meant to exercise the circuit in two ways that I would expect to expose failures. If the clock samples the data right on an edge, it is random which level is measured; a fast jitter on the clock simulates that case. A slow noise component in the clock frequency then simulates mismatched clock frequencies in both the positive and negative directions.

Another way of implementing the slow drift is to just simulate once at a very slightly higher frequency and once at a very slightly lower frequency. That might show errors faster and more deterministically.

-- Rick

Article: 155601
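Below is a minimal simulation-only sketch of the kind of jittered stimulus described above: a bit-period generator whose period carries both a fast random component and a slow drift. The entity name, generics and jitter model are illustrative assumptions rather than anything from this thread, and the output here just toggles once per bit; a real test bench would drive NRZI-encoded frame data instead.

library ieee;
use ieee.std_logic_1164.all;
use ieee.math_real.all;  -- uniform() and sin() for the jitter model

entity jittery_bit_source is
  generic (
    BIT_PERIOD  : time := 50 ns;   -- 20 Mbps nominal
    FAST_JITTER : time := 2 ns;    -- peak fast (edge-to-edge) jitter
    SLOW_DRIFT  : real := 0.001    -- +/-0.1% slow frequency drift
  );
  port (
    serial_out : out std_logic
  );
end entity jittery_bit_source;

architecture sim of jittery_bit_source is
  signal line_level : std_logic := '0';
begin
  serial_out <= line_level;

  stim : process
    variable s1, s2   : positive := 42;  -- seeds for uniform()
    variable r, drift : real;
    variable phase    : real := 0.0;
    variable this_bit : time;
  begin
    loop
      -- slow, roughly sinusoidal drift of the bit period (temperature, aging, ...)
      phase    := phase + 0.0005;
      drift    := 1.0 + SLOW_DRIFT * sin(phase);
      this_bit := BIT_PERIOD * drift;
      -- fast, random jitter added on top of the drifted period
      uniform(s1, s2, r);
      this_bit := this_bit + (r - 0.5) * 2.0 * FAST_JITTER;
      -- placeholder stimulus: one transition per bit time
      line_level <= not line_level;
      wait for this_bit;
    end loop;
  end process;
end architecture sim;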
rickman <gnuarm@gmail.com> wrote:
> On 7/28/2013 2:32 PM, alb wrote:

(snip)

>>>>> For the communication with Frontend asynchronous LVDS
>>>>> connection is used.

(snip)

>>> Async, eh? At 2x clock to data? Not sure I would want to
>>> design this.
>>> I assume you have to phase lock to the data stream somehow?
>>> I think that is the part I would worry about.
>>> two samples per bit. I don't know how this can be expected to work myself.

(snip)

>> Since modules are likely to have different temperatures being far apart,
>> I would certainly expect a phase problem. Your idea to have a slow and a
>> high frequency variation in the phase generation might bring out some
>> additional info.

(snip)

> I was assuming that perhaps you were doing something I didn't quite
> understand, but I'm pretty sure I am on target with this.
> You *must* up your sample rate by a sufficient amount so that
> you can guarantee you get a minimum of two samples per bit.
> Otherwise you have no way to distinguish a slipped sample due
> to clock mismatch. Clock frequency mismatch is guaranteed,
> unless you are using the same clock somehow.

Everyone's old favorite asynchronous serial RS232 usually uses a clock at 16x, though I have seen 64x. From the beginning of the start bit, it counts half a bit time (in clock cycles), verifies the start bit is real (and not random noise), then counts whole bit times and decodes at those points. So the actual decoding is done with a 1X clock, but with 16 (or 64) possible phase values. It resynchronizes at the beginning of each character, so it can't get too far off.

> It is not just a matter of phase, but of frequency. With a 2x clock,
> seeing a transition 3 clocks later doesn't distinguish one bit
> time from two bit times.

For 10Mbit ethernet, on the other hand, as far as I understand it the receiver locks (PLL) to the transmitter. Manchester coding is wasteful of bandwidth, but allows for a simpler receiver. I believe it is usual to feed the transmit clock to the PLL to keep it close to the right frequency until a signal comes in. That speeds up the lock time.

> I'm having trouble expressing myself I think, but I'm trying to say the
> basic premise of this design is flawed because the sample clock is only
> 2x the data rate. I say you need 3x and I strongly encourage 4x. At 4x
> the samples have four states, expected timing, fast timing, slow timing
> and "error" timing meaning the loop control isn't working.

Seems to me that it should depend on how far off you can get. For async RS232, you have to stay within about a quarter bit time over 10 bits, so even if the clock is 2% off, it still works. But as above, that depends on having a clock of the appropriate phase.

-- glen

Article: 155602
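For reference, here is a minimal VHDL sketch of the 16x-oversampled receive scheme glen describes: wait half a bit time from the start edge to confirm the start bit, then take one sample per bit time at the approximate bit centre. The entity, port names and 8-data-bit framing are assumptions for illustration, and the input is assumed to be already synchronized to the 16x clock.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity rx_16x is
  port (
    clk16   : in  std_logic;   -- 16x the nominal bit rate
    rxd     : in  std_logic;   -- serial input, already synchronized to clk16
    bit_out : out std_logic;   -- decoded data bit
    bit_stb : out std_logic    -- one-cycle strobe per decoded bit
  );
end entity rx_16x;

architecture rtl of rx_16x is
  type state_t is (IDLE, CHECK_START, SAMPLE);
  signal state : state_t := IDLE;
  signal cnt   : unsigned(3 downto 0) := (others => '0');
  signal nbits : unsigned(3 downto 0) := (others => '0');
begin
  process (clk16)
  begin
    if rising_edge(clk16) then
      bit_stb <= '0';
      case state is

        when IDLE =>
          if rxd = '0' then                -- leading edge of a start bit
            cnt   <= to_unsigned(7, 4);    -- half a bit time at 16x
            state <= CHECK_START;
          end if;

        when CHECK_START =>
          if cnt = 0 then
            if rxd = '0' then              -- still low: a real start bit
              cnt   <= to_unsigned(15, 4); -- one whole bit time to the next centre
              nbits <= (others => '0');
              state <= SAMPLE;
            else
              state <= IDLE;               -- noise: go back to hunting
            end if;
          else
            cnt <= cnt - 1;
          end if;

        when SAMPLE =>                     -- one sample every 16 clocks from here on
          if cnt = 0 then
            bit_out <= rxd;
            bit_stb <= '1';
            cnt     <= to_unsigned(15, 4);
            nbits   <= nbits + 1;
            if nbits = 7 then              -- 8 data bits taken; stop bit not checked here
              state <= IDLE;
            end if;
          else
            cnt <= cnt - 1;
          end if;

      end case;
    end if;
  end process;
end architecture rtl;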
On 7/29/2013 1:40 PM, glen herrmannsfeldt wrote: > rickman<gnuarm@gmail.com> wrote: >> On 7/28/2013 2:32 PM, alb wrote: > > (snip) > >>>>>> For the communication with Frontend asynchronous LVDS >>>>>> connection is used. > > (snip) >>>> Async, eh? At 2x clock to data? Not sure I would want to >>>> design this. >>>> I assume you have to phase lock to the data stream somehow? >>>> I think that is the part I would worry about. >>>> two samples per bit. I don't know how this can be expected to work myself. > > (snip) >>> Since modules are likely to have different temperatures being far apart, >>> I would certainly expect a phase problem. Your idea to have a slow and a >>> high frequency variation in the phase generation might bring out some >>> additional info. > > (snip) >> I was assuming that perhaps you were doing something I didn't quite >> understand, but I'm pretty sure I am on target with this. >> You *must* up your sample rate by a sufficient amount so that >> you can guarantee you get a minimum of two samples per bit. >> Otherwise you have no way to distinguish a slipped sample due >> to clock mismatch. Clock frequency mismatch is guaranteed, >> unless you are using the same clock somehow. > > Everyone's old favorite asynchronous serial RS232 usually uses a > clock at 16x, though I have seen 64x. From the beginning of the > start bit, it counts half a bit time (in clock cycles), verifies > the start bit (and not random noise) then counts whole bits and > decodes at that point. So, the actual decoding is done with a 1X > clock, but with 16 (or 64) possible phase values. It resynchronizes > at the beginning of each character, so it can't get too far off. Yes, that protocol requires a clock matched to the senders clock to at least 2.5% IIRC. The protocol the OP describes has much longer char sequences which implies much tighter clock precision at each end and I'm expecting it to use a clock recovery circuit... but maybe not. I think he said they don't use one but get "frequent" errors. >> It is not just a matter of phase, but of frequency. With a 2x clock, >> seeing a transition 3 clocks later doesn't distinguish one bit >> time from two bit times. > > For 10Mbit ethernet, on the other hand, as well as I understand it > the receiver locks (PLL) to the transmitter. Manchester coding is > wasteful of bandwidth, but allows for a simpler receiver. > I believe it is usual to feed the transmit clock to the PLL to keep > it close to the right frequency until a signal comes in. Speeds up > the lock time. > >> I'm having trouble expressing myself I think, but I'm trying to say the >> basic premise of this design is flawed because the sample clock is only >> 2x the data rate. I say you need 3x and I strongly encourage 4x. At 4x >> the samples have four states, expected timing, fast timing, slow timing >> and "error" timing meaning the loop control isn't working. > > Seems to me that it should depend on how far of you can get. > For async RS232, you have to stay within about a quarter bit time > over 10 bits, so even if the clock is 2% off, it still works. > But as above, that depends on having a clock of the appropriate > phase. Not sure why you mention phase. In 232 type character async you have *no* phase relationship between clocks. There is no PLL so you aren't phase locked to the data either. I guess you mean a clock with enough precision? 
I've never analyzed an async design with longer data streams, so I don't know how much precision would be required, but I'm sure you can't do reliable data recovery with a 2x clock (without a PLL). I think this would contradict the Nyquist criterion.

In my earlier comments when I'm talking about a PLL I am referring to a digital PLL. I guess I should have said a DPLL.

-- Rick

Article: 155603
rickman <gnuarm@gmail.com> wrote: (snip, I wrote) >> Everyone's old favorite asynchronous serial RS232 usually uses a >> clock at 16x, though I have seen 64x. From the beginning of the >> start bit, it counts half a bit time (in clock cycles), verifies >> the start bit (and not random noise) then counts whole bits and >> decodes at that point. So, the actual decoding is done with a 1X >> clock, but with 16 (or 64) possible phase values. It resynchronizes >> at the beginning of each character, so it can't get too far off. > Yes, that protocol requires a clock matched to the senders clock to at > least 2.5% IIRC. The protocol the OP describes has much longer char > sequences which implies much tighter clock precision at each end and I'm > expecting it to use a clock recovery circuit... but maybe not. I think > he said they don't use one but get "frequent" errors. (snip) >> Seems to me that it should depend on how far of you can get. >> For async RS232, you have to stay within about a quarter bit time >> over 10 bits, so even if the clock is 2% off, it still works. >> But as above, that depends on having a clock of the appropriate >> phase. > Not sure why you mention phase. In 232 type character async you have > *no* phase relationship between clocks. There is no PLL so you aren't > phase locked to the data either. I guess you mean a clock with enough > precision? The reason for the 16x clock is that it can then clock the bits in one at a time with any of 16 different phases. That is, the actual bits are only looked at once (usually). > I've never analyzed an async design with longer data streams > so I don't know how much precision would be required, but I"m > sure you can't do reliable data recovery with a 2x clock (without > a pll). I think this would contradict the Nyquist criterion. If you start from the leading edge of the start bit, choose which cycle of the 2x clock is closest to the center, and count from there, seems to me you do pretty well if the clocks are close enough. Also, the bit times should be pretty close to correct. > In my earlier comments when I'm talking about a PLL I am > referring to a digital PLL. I guess I should have said a DPLL. I was thinking of an analog one. I still remember when analog (PLL based) data separators were better for floppy disk reading. Most likely by now, digital ones are better, possibly because of a higher clock frequency. -- glenArticle: 155604
On Saturday, July 27, 2013 1:59:46 AM UTC+2, rickman wrote: > On 7/26/2013 11:22 AM, alb wrote: > > > Hi all, > > > > > > I have the following specs for the physical level of a serial protocol: > > > > > >> For the communication with Frontend asynchronous LVDS connection is used. > > >> The bitrate is set to 20 Mbps. > > >> Data encoding on the LVDS line is NRZI: > > >> - bit '1' is represented by a transition of the physical level, > > >> - bit '0' is represented by no transition of the physical level, > > >> - insertion of an additional bit '1' after 6 consecutive bits '0'. > > > > > > Isn't there a missing requirement on reset condition of the line? > > > System clock is implicitly defined on a different section of the specs > > > and is set at 40MHz. > > > > > > At the next layer there's a definition of a 'frame' as a sequence of 16 > > > bit words preceded by a 3 bit sync pattern (111) and a header of 16 bits > > > defining the type of the packet and the length of the packet (in words). > > > > > > I'm writing a test bench for it and I was wondering whether there's any > > > recommendation you would suggest. Should I take care about randomly > > > select the phase between the system clock and the data? > > > > Async, eh? At 2x clock to data? Not sure I would want to design this. > > I assume you have to phase lock to the data stream somehow? I think > > that is the part I would worry about. > > > > In simulation I would recommend that you both jitter the data clock at a > > high bandwidth and also with something fairly slow. The slow variation > > will test the operation of your data extraction with a variable phase > > and the high bandwidth jitter will check for problems from only having > > two samples per bit. I don't know how this can be expected to work myself. > > > > I did something similar where I had to run a digital phase locked loop > > on standard NRZ data (no encoding) and used a 4x clock, but I think I > > proved to myself I could do it with a 3x clock, it just becomes > > impossible to detect when you have a sample error... lol. > Doesn't sound so different from usb (full speed) usually done by sampling the 12mbit/s using a 48MHz clk or rising and falling edge on 24MHz clock -LasseArticle: 155605
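Since the quoted spec ('1' = transition, '0' = no transition, an extra '1' stuffed after six consecutive '0's) keeps coming up, here is a hedged sketch of the decode layer it implies, written as if a separate clock-recovery block already delivers one line sample per bit together with a strobe. The entity and signal names are made up for illustration.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- NRZI decode plus removal of the stuffed '1' that follows six consecutive '0's.
entity nrzi_destuff is
  port (
    clk       : in  std_logic;
    sample    : in  std_logic;   -- recovered line level, one value per bit time
    sample_en : in  std_logic;   -- strobe from the clock-recovery block
    data_out  : out std_logic;
    data_en   : out std_logic    -- stays low while a stuffed bit is discarded
  );
end entity nrzi_destuff;

architecture rtl of nrzi_destuff is
  signal prev_level : std_logic := '0';
  signal zero_run   : unsigned(2 downto 0) := (others => '0');
begin
  process (clk)
    variable b : std_logic;
  begin
    if rising_edge(clk) then
      data_en <= '0';
      if sample_en = '1' then
        b          := sample xor prev_level;  -- transition = '1', no transition = '0'
        prev_level <= sample;
        if b = '0' then
          zero_run <= zero_run + 1;
          data_out <= '0';
          data_en  <= '1';
        else
          if zero_run = 6 then
            data_en <= '0';                   -- stuffed bit: drop it
          else
            data_out <= '1';
            data_en  <= '1';
          end if;
          zero_run <= (others => '0');
        end if;
      end if;
    end if;
  end process;
end architecture rtl;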
On 7/29/2013 4:36 PM, glen herrmannsfeldt wrote: > rickman<gnuarm@gmail.com> wrote: > > (snip, I wrote) > >>> Everyone's old favorite asynchronous serial RS232 usually uses a >>> clock at 16x, though I have seen 64x. From the beginning of the >>> start bit, it counts half a bit time (in clock cycles), verifies >>> the start bit (and not random noise) then counts whole bits and >>> decodes at that point. So, the actual decoding is done with a 1X >>> clock, but with 16 (or 64) possible phase values. It resynchronizes >>> at the beginning of each character, so it can't get too far off. > >> Yes, that protocol requires a clock matched to the senders clock to at >> least 2.5% IIRC. The protocol the OP describes has much longer char >> sequences which implies much tighter clock precision at each end and I'm >> expecting it to use a clock recovery circuit... but maybe not. I think >> he said they don't use one but get "frequent" errors. > > (snip) >>> Seems to me that it should depend on how far of you can get. >>> For async RS232, you have to stay within about a quarter bit time >>> over 10 bits, so even if the clock is 2% off, it still works. >>> But as above, that depends on having a clock of the appropriate >>> phase. > >> Not sure why you mention phase. In 232 type character async you have >> *no* phase relationship between clocks. There is no PLL so you aren't >> phase locked to the data either. I guess you mean a clock with enough >> precision? > > The reason for the 16x clock is that it can then clock the bits > in one at a time with any of 16 different phases. That is, the actual > bits are only looked at once (usually). > >> I've never analyzed an async design with longer data streams >> so I don't know how much precision would be required, but I"m >> sure you can't do reliable data recovery with a 2x clock (without >> a pll). I think this would contradict the Nyquist criterion. > > If you start from the leading edge of the start bit, choose which > cycle of the 2x clock is closest to the center, and count from there, > seems to me you do pretty well if the clocks are close enough. Also, > the bit times should be pretty close to correct. That is the point. With a 2x clock there isn't enough resolution to "pick" an edge. The clock that detects the edge is somewhere in the first *half* of the start bit and the following clock is somewhere in the second half of the start bit... which do you use? Doesn't matter, if the clock detecting the start bit is close enough to the wrong point, one or the other will be far too close to the next transition to guarantee that you are sampling data from the correct bit. >> In my earlier comments when I'm talking about a PLL I am >> referring to a digital PLL. I guess I should have said a DPLL. > > I was thinking of an analog one. I still remember when analog (PLL > based) data separators were better for floppy disk reading. > Most likely by now, digital ones are better, possibly because > of a higher clock frequency. If you have an analog PLL then you just need to make sure your sample clock is *faster* than 2x the bit rate. Then you can be certain of how many bits are between adjacent transitions. But if at any time due to frequency error or jitter you sample on the wrong side of a transition you will get an unrecoverable error. When it comes to analog media like disk drives where the position of the bit pulse can jitter significantly I would expect a significantly higher clock rate would be very useful. 
It all comes down to distinguishing which half of the bit time the transition falls into. With a run of six zeros (no transition) between 1 bits (transition) it becomes more important to sample with adequate resolution with a DPLL or to use an analog PLL. I did a DPLL design for a data input to an IP circuit to packet card. It worked well in simulation and in product test and verification. I'm not sure they have used this feature in the field though. It was added to the product "just in case" and that depends on the customer needing the feature. -- RickArticle: 155606
rickman <gnuarm@gmail.com> wrote: (snip, I wrote) >> If you start from the leading edge of the start bit, choose which >> cycle of the 2x clock is closest to the center, and count from there, >> seems to me you do pretty well if the clocks are close enough. Also, >> the bit times should be pretty close to correct. > That is the point. With a 2x clock there isn't enough resolution to > "pick" an edge. The clock that detects the edge is somewhere in the > first *half* of the start bit and the following clock is somewhere in > the second half of the start bit... which do you use? The easy way is to use the opposite edge of the clock. I suppose that really means that the clock is 4x, though, so maybe that doesn't count. Say you clock on the falling edge. If the clock is currently high, the next falling edge will be less than half a cycle away. If it is currently low, then it will be more. Using that, you can find the falling edge closest to the center. The hard way is to have the receive clock slightly faster or slightly slower. That is, the speed such that if the first edge is in the first half, later edges will be later in the bit time, and not past the 3/4 mark. Now, having different receive and transmit clocks is inconvenient, but not impossible. > Doesn't matter, if the clock detecting the start bit is close > enough to the wrong point, one or the other will be far too > close to the next transition to guarantee that you are sampling > data from the correct bit. (snip) >> I was thinking of an analog one. I still remember when analog (PLL >> based) data separators were better for floppy disk reading. >> Most likely by now, digital ones are better, possibly because >> of a higher clock frequency. > If you have an analog PLL then you just need to make sure your sample > clock is *faster* than 2x the bit rate. Then you can be certain of how > many bits are between adjacent transitions. But if at any time due to > frequency error or jitter you sample on the wrong side of a transition > you will get an unrecoverable error. It is interesting in the case of magnetic media. The read head reads changes in the recorded magnetic field. For single density (FM) there is a flux transition at the edge of the bit cell (clock bit), and either is or isn't one in the center (data bit). So, including jitter, the data bit is +/- one quarter bit time from the center, and the clock bits are +/- one quarter from the cell boundary. The data rate is half the maximum flux transition rate. The time between transitions is either 1/2 or 1 bit time. For the usual IBM double density (MFM), the data bits are again in the center of the bit cell, but clock bits only occur on bit cell boundaries between two zero (no transition) bits. The data rate is then equal to the maximum flux transition rate. The time between transitions is then either one or 1.5 bit times. The result, though, as you noted, is that it is more sensitive to jitter. In the case of magnetic media response, though, there is a predictable component to the transition times. As the field doesn't transition infinitely fast, the result is that as two transitions get closer together, when read back they come slightly farther apart than you might expect. Precompensation is then used to correct for this. Transitions are moved slightly earlier or slightly later, depending on the expected movement of the read pulse. 
> When it comes to analog media like disk drives where the position > of the bit pulse can jitter significantly I would expect a > significantly higher clock rate would be very useful. One way to do the precompensation is to run a clock fast enough such that you can move the transition one cycle early or late. The other way is with an analog delay line. > It all comes down to distinguishing which half of the bit time > the transition falls into. With a run of six zeros (no transition) > between 1 bits (transition) it becomes more important to sample > with adequate resolution with a DPLL or to use an analog PLL. The early magnetic tape used NRZI coding, flux transition for one, no transition for zero. Odd parity means at least one bit will change for every character written to tape. Even parity means at least two will change, but you can't write the character with all bits zero. Both were used for 7 track (six bit characters) and odd parity was used for 800 BPI 9 track tapes. There can be long runs of zero (no transition) for any individual track, but taken together there is at least one. For 1600 BPI tapes, IBM changed to PE, which is pretty similar to that used for single density floppies. The flux transition rate can be twice the bit rate (3200/inch) but each track has its own clock pulse. It is fairly insensitive to head azimuth, unlike 800 BPI NRZI. There are no long periods without a transition on any track. Reading tapes is much more reliable, especially on a different drive than the data was written on. IBM 6250 tapes use GCR, with more complicated patterns of bit transitions, and more variation in time between transitions. Again, much more reliable than its predecessor. > I did a DPLL design for a data input to an IP circuit to packet card. > It worked well in simulation and in product test and verification. I'm > not sure they have used this feature in the field though. It was added > to the product "just in case" and that depends on the customer needing > the feature. -- glenArticle: 155607
Hi everyone. I'm trying to find out if, at high speeds, it is necessary to clock alternate registers on opposite clock edges. For instance, clocking every other register in a shift register on the positive clock transition while the rest use the negative clock transition. This VHDL may help explain:

I know this works at lower speeds:

if(clk'event and clk='1')then
D <= C;
C <= B;
B <= A;
A <= input;
end if;

But I wonder if at higher speeds this sort of coding is required:

if(clk'event and clk='1')then
D <= C;
B <= A;
end if;
if(clk'event and clk='0')then
C <= B;
A <= input;
end if;

That way, in my second example, "B" captures its data in the middle of "A's" data eye. Is this coding style required above some speed? If so, does anyone know how to find out what that speed is, or can someone give me a rough idea?

Article: 155608
On 7/29/13 5:09 AM, alb wrote:
> On 29/07/2013 03:05, Richard Damon wrote:
>> On 7/26/13 11:22 AM, alb wrote:
>>> Hi all,
>>>
>>> I have the following specs for the physical level of a serial protocol:
>>>
>>>> For the communication with Frontend asynchronous LVDS connection is used.
>>>> The bitrate is set to 20 Mbps.
>>>> Data encoding on the LVDS line is NRZI:
>>>> - bit '1' is represented by a transition of the physical level,
>>>> - bit '0' is represented by no transition of the physical level,
>>>> - insertion of an additional bit '1' after 6 consecutive bits '0'.
>>>
>>> Isn't there a missing requirement on the reset condition of the line?
>>> System clock is implicitly defined in a different section of the specs
>>> and is set at 40MHz.
> []
>> You don't need to specify a reset state, as either level will work. At
>> reset the line will be toggling every 7 bit times due to the automatic
>> insertion of a 1 after 6 0s.
>
> Uhm, since there's a sync pattern of '111' I have to assume that no
> frame is transmitted when only zeros are flowing (with the '1' stuffed
> every 6 zeros).

My assumption for the protocol would be that between frames an "all zero" pattern is sent. (Note that this is on the layer above the raw transport level, where every time 6 zeros are sent, a 1 is added.) Thus all frames will begin with three 1s in a row as a start-of-frame signal, which also gives a lot of transitions to help lock the clock if using a PLL.

>> I would be hard pressed to use 40 MHz as a system clock, unless I was
>> allowed to use both edges of the clock (so I could really sample at a 4x
>> rate).
>
> I'm thinking about having a system clock multiplied internally via PLL
> and then go for a x4 or x8 in order to center the bit properly.

I would think that sampling at 4x the data rate is a minimum; faster will give you better margin for frequency errors. So with a 20 MHz data rate you need to sample the data at 80 MHz. Faster can help, and will cause less jitter in your recovered data clock. Note that the first level of processing will perform data detection and clock recovery, and this might be where the 40 MHz came from: a 40 MHz processing system can take data every other clock cycle most of the time, but still has the bandwidth to take data on two consecutive clocks when the incoming data is running slightly fast. You don't want to make this clock much faster than that, as it then becomes harder to design for no benefit. Any higher-speed bit-detection clock needs to have its results translated into this domain for further processing. (You could also generate a recovered clock, but that starts you down the road to an async design, as the recovered clock isn't well related to your existing clock, being a combinatorial result of registers clocked on your sampling clock.)

>> For a test bench, I would build something that could be set to work
>> slightly "off frequency" and maybe even with some phase jitter in the
>> data clock.
>
> Rick was suggesting a phase jitter with a high and a low frequency
> component. This can be an even more realistic case since it models slow
> drifts due to temperature variations... I do not know how critical it would
> be to simulate *all* jitter components of a clock (they may depend on
> temperature, power noise, ground noise, ...).
>
>> I am assuming that system clock does NOT travel between
>> devices, or there wouldn't be as much need for the auto 1 bit, unless
>> this is just a bias leveling, but it isn't real great for that.
>
> Your assumption is correct. No clock distribution between devices.

Article: 155609
sketyro@gmail.com wrote: > Hi everyone. I'm trying to find out if, at high speeds, it is > necessary to clock every other register using every other clock > transition. For instance, clocking every other register in a > shift register using the positive clock transition and the rest > use the negative clock transition. This VHDL may help explain: How fast are you thinking about? I have done FPGA designs that had at most two LUTs between FFs. That, and optimal routing, leads to a fairly fast design. I believe that using one clock edge works best in this case. For most FPGA families, there is a well optimized clock tree to minimize the clock skew. If you use different clock edges, there must either be an inverter on some clock inputs, or two separate clock trees. Either seems likely to add clock skew, and limit the speed. Also, the timing tools might have a harder time figuring out the appropriate timing. Probably it doesn't cause so much of a problem, but it is your problem to get the clock timing right. The only advantage I see is that the clock signal runs at a lower frequency. OK, in the olden days there were advantages. We now have nice, well designed master-slave flip-flops. Before TTL, as well as I know it, much logic was done using only latches. The Earle latch allows one to generate efficient pipelines, merging two levels of logic with the latch logic. Without the advantage of a master-slave FF, using either two clock edges or, more usual, two separate clock phases, allows for nice pipelines. If I remember, the TMS9900 microprocessor uses a four phase clock. The 8088 and 8086 use a single clock input with 33% duty cycle, and dynamic logic. (There is a minimum clock frequency of about one or two MHz. It has been some time since I thought of the exact value.) The 33% is optimal for the different path lengths on the two clock edges. But as far as I know, there are no advantages for current FPGA families. Now, there are DDR DRAMs which clock on both edges. The FPGA logic required to do that likely has FFs clocked on both edges. It might be that for signals going into or out of the FPGA, that you can do it faster using both edges. -- glenArticle: 155610
On Monday, July 29, 2013 11:01:28 PM UTC-4, ske...@gmail.com wrote:
> Hi everyone. I'm trying to find out if, at high speeds, it is necessary to clock every
> other register using every other clock transition. For instance, clocking every other
> register in a shift register using the positive clock transition and the rest use
> the negative clock transition.
>
> I know this works at lower speeds:
>
> if(clk'event and clk='1')then
> D <= C;
> C <= B;
> B <= A;
> A <= input;
> end if;

It works at any speed.

> But I wonder if at higher speeds this sort of coding is required:
>
> if(clk'event and clk='1')then
> D <= C;
> B <= A;
> end if;
> if(clk'event and clk='0')then
> C <= B;
> A <= input;
> end if;

About the only plausible situation where you would benefit here is if you can't double the clock frequency, either because it would exceed the device limits or because the other logic clocked by that same clock can't run that fast without a massive redesign. You'll still have to deal with trying to receive data with only half a clock period of setup if those negative-edge-triggered flip-flop outputs fan out to anything other than your shift register (i.e. it's a small design niche where it may be useful).

> That way, in my second example, "B" for instance captures its data in the middle of
> "A's" data eye.

Being in the middle of a data eye that you can do nothing about doesn't help. You have to meet the setup and hold time of the flip-flops; there is no extra credit for placing the sampling clock edge in what you think may be the middle. Devices are designed to distribute free-running clocks that originate at an input pin or an internal PLL output with zero skew from the perspective of the designer.

> Is this coding style required above some speed? If so, does anyone know how to find
> out what that speed is, or just tell me some general approximation?

The speed would be device specific, since it would be the maximum clocking speed of that device, which you can find in the datasheet. However, that maximum speed is typically only applicable to a simple shift register; stick in any logic and the clock speed will drop.

Kevin Jennings

Article: 155611
Hi Rick,

On 29/07/2013 17:19, rickman wrote:
[]
>>> what do you mean by saying 'it becomes impossible to detect when you
>>> have a sample error'?
>>
>> I was assuming that perhaps you were doing something I didn't quite
>> understand, but I'm pretty sure I am on target with this. You *must* up
>> your sample rate by a sufficient amount so that you can guarantee you
>> get a minimum of two samples per bit. Otherwise you have no way to
>> distinguish a slipped sample due to clock mismatch. Clock frequency
>> mismatch is guaranteed, unless you are using the same clock somehow. Is
>> that the case? If so, the sampling would just be synchronous and I
>> don't follow where the problem is.

There's no clock distribution, therefore each end has its own clock on-board. We are certainly talking about the same nominal oscillator frequency, but how well they match is certainly something we *do not* want to rely on.

> It is not just a matter of phase, but of frequency. With a 2x clock,
> seeing a transition 3 clocks later doesn't distinguish one bit time from
> two bit times.

I agree with you, the 2x clock is not fine enough to adjust for phase shifts and/or frequency mismatch.

> I'm having trouble expressing myself I think, but I'm trying to say the
> basic premise of this design is flawed because the sample clock is only
> 2x the data rate. I say you need 3x and I strongly encourage 4x. At 4x
> the samples have four states, expected timing, fast timing, slow timing
> and "error" timing meaning the loop control isn't working.

Uhm, I didn't quite follow what you mean by 'fast timing' and 'slow timing'. With perfect frequency matching I would expect a bit to have a transition on cycle #2 (see graph). If the bit is slightly shifted I would notice the transition in either cycle 2 or cycle 3, depending on whether it comes slightly earlier or slightly later than the clock edge.

                     bit
                    center
                      ^
                      |
cycles   2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0
Data     ________--------________--------________--------_____
SmplClk  -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
SmplData __________--------________--------________--------___

On a perfect frequency match SmplData will be 1 clock delayed.

> Data     ____----____----____----____----____----____----____
> SmplClk  --__--__--__--__--__--__--__--__--__--__--__--__--__
> SmplData -----____----____----____----____----____----____----
>
> This is how you expect it to work. But if the data is sampled slightly
> off it looks like this.

Uhm, this graphic shows a clock frequency which is 1x the clock frequency of the data... Am I missing something??? This will never work of course...

> The sample clock does not need to be any particular ratio to the data
> stream if you use an NCO to control the sample rate. Then the phase
> detection will bump the rate up and down to suit.

I might use the internal PLL to multiply the clock frequency to x4 the data frequency (=80 MHz) and then phase lock on the data just by looking at the transitions. If for some reason I see a transition earlier or later I would adjust my recovered clock accordingly. I'm sure this stuff has been implemented a gazillion times.

> Do you follow what I am saying? Or have I mistaken what you are doing?

I follow partially... I guess you understood what I'm saying, but I'm losing you somewhere in the middle of the explanation (especially with the graph representing a 1x clock rate...).

Article: 155612
On 29/07/2013 19:40, glen herrmannsfeldt wrote:
[]
> Everyone's old favorite asynchronous serial RS232 usually uses a
> clock at 16x, though I have seen 64x. From the beginning of the
> start bit, it counts half a bit time (in clock cycles), verifies
> the start bit (and not random noise) then counts whole bits and
> decodes at that point. So, the actual decoding is done with a 1X
> clock, but with 16 (or 64) possible phase values. It resynchronizes
> at the beginning of each character, so it can't get too far off.

I believe that with 4x or 8x you could easily resync at the bit level. The first transition goes into a shift register (4 FFs or 8 FFs); when the shift register holds half of the bit set and half reset, you generate a clock to sample the data. The second transition comes in and the same mechanism happens. The recovered clock is adjusted so that the transition lands in the middle of the shift register. Since the protocol is bit stuffed, it won't get too far off.

[]
>> I'm having trouble expressing myself I think, but I'm trying to say the
>> basic premise of this design is flawed because the sample clock is only
>> 2x the data rate. I say you need 3x and I strongly encourage 4x. At 4x
>> the samples have four states, expected timing, fast timing, slow timing
>> and "error" timing meaning the loop control isn't working.
>
> Seems to me that it should depend on how far off you can get.
> For async RS232, you have to stay within about a quarter bit time
> over 10 bits, so even if the clock is 2% off, it still works.
> But as above, that depends on having a clock of the appropriate
> phase.

IMO a phase shift does not matter too much, while a frequency mismatch will accumulate timing differences and lead the transmitter and receiver to have different timings. But if you lock on the phase shift it means you lock on frequency as well.

Article: 155613
On 7/30/2013 1:01 PM, alb wrote:
> Hi Rick,
>
> On 29/07/2013 17:19, rickman wrote:
> []
>>> what do you mean by saying 'it becomes impossible to detect when you
>>> have a sample error'?
>>
>> I was assuming that perhaps you were doing something I didn't quite
>> understand, but I'm pretty sure I am on target with this. You *must* up
>> your sample rate by a sufficient amount so that you can guarantee you
>> get a minimum of two samples per bit. Otherwise you have no way to
>> distinguish a slipped sample due to clock mismatch. Clock frequency
>> mismatch is guaranteed, unless you are using the same clock somehow. Is
>> that the case? If so, the sampling would just be synchronous and I
>> don't follow where the problem is.
>
> There's no clock distribution, therefore each end has its own clock
> on-board. We are certainly talking about the same nominal oscillator
> frequency, but how well they match is certainly something we *do not*
> want to rely on.
>
>> It is not just a matter of phase, but of frequency. With a 2x clock,
>> seeing a transition 3 clocks later doesn't distinguish one bit time from
>> two bit times.
>
> I agree with you, the 2x clock is not fine enough to adjust for phase
> shifts and/or frequency mismatch.

Ok, we are on the same page then.

>> I'm having trouble expressing myself I think, but I'm trying to say the
>> basic premise of this design is flawed because the sample clock is only
>> 2x the data rate. I say you need 3x and I strongly encourage 4x. At 4x
>> the samples have four states, expected timing, fast timing, slow timing
>> and "error" timing meaning the loop control isn't working.
>
> Uhm, I didn't quite follow what you mean by 'fast timing' and 'slow
> timing'. With perfect frequency matching I would expect a bit to have a
> transition on cycle #2 (see graph). If the bit is slightly shifted I
> would notice the transition in either cycle 2 or cycle 3, depending on
> whether it comes slightly earlier or slightly later than the clock edge.
>
>                      bit
>                     center
>                       ^
>                       |
> cycles   2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0
> Data     ________--------________--------________--------_____
> SmplClk  -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
> SmplData __________--------________--------________--------___
>
> On a perfect frequency match SmplData will be 1 clock delayed.

No point in even discussing the "perfect" frequency match.

>> Data     ____----____----____----____----____----____----____
>> SmplClk  --__--__--__--__--__--__--__--__--__--__--__--__--__
>> SmplData -----____----____----____----____----____----____----
>>
>> This is how you expect it to work. But if the data is sampled slightly
>> off it looks like this.
>
> Uhm, this graphic shows a clock frequency which is 1x the clock
> frequency of the data... Am I missing something??? This will never work
> of course...

Yes, you are right, still your diagram above shows a 4x clock. That will work all day long. It is the 2x clock that doesn't work well. A 3x clock will work but can't provide any info on whether it is sync'd or not. A 4x clock can tell if the data has slipped, giving an error.

What I meant further up by the timing is that your circuit will detect the data transitions and try to sample near the middle of the stable portion. So with a 4x clock, if it sees a transition where it expects one, it is "on time". If it sees a transition one clock early it knows it is "slow"; if it sees a transition one clock late it knows it is "fast". When it sees a transition in the fourth phase, it should assume that it is out of sync and needs to go into hunt mode. Or you can get fancier and use some hysteresis for the transitions between "hunt" and "locked" modes.

I designed this with an NCO controlled PLL. With your async protocol you should be able to receive a packet based on the close frequency matching of the two ends. This would really just be correcting for the phase of the incoming data and not worrying about the frequency mismatch... like a conventional UART. This circuit can realign every 7 bits max. That would work I think. I was making this a bit more complicated because in my case I didn't have matched frequency clocks; the frequency was specified in the software to maybe 1-2% and the NCO had to PLL to the incoming data to get a frequency lock. I also didn't have bit stuffing, so a long enough string without transitions would cause a lock slip.

>> The sample clock does not need to be any particular ratio to the data
>> stream if you use an NCO to control the sample rate. Then the phase
>> detection will bump the rate up and down to suit.
>
> I might use the internal PLL to multiply the clock frequency to x4 the data
> frequency (=80 MHz) and then phase lock on the data just by looking at the
> transitions. If for some reason I see a transition earlier or later I
> would adjust my recovered clock accordingly.

Yes, that is it exactly. The bit stuffing will give you enough transitions that you should never lose lock. It is trying to do this at 2x that won't work well, because you can't distinguish early from late.

> I'm sure this stuff has been implemented a gazillion times.
>
>> Do you follow what I am saying? Or have I mistaken what you are doing?
>
> I follow partially... I guess you understood what I'm saying, but I'm
> losing you somewhere in the middle of the explanation (especially with
> the graph representing a 1x clock rate...).

Sorry. If this is not clear now, I'll try the diagram again... lol

I would give you my code, but in theory it is proprietary to someone else. Just think of a state machine that outputs a clock enable every four states, then either adds a state or skips a state to stay in alignment, and does so only when it sees data transitions. If it sees a transition in the fourth state, it is not in alignment. If there is no transition the FSM just counts...

A timing diagram is worth a thousand words.

-- Rick

Article: 155614
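To make the scheme above concrete, here is a hedged VHDL sketch of the kind of 4x digital PLL rickman describes: a free-running two-bit phase counter that issues a sample enable once per bit, stalls or skips one state when a data transition arrives late or early, and flags the "error" phase as out of lock. This is not rickman's proprietary code; the entity, the phase numbering and the resync choice are assumptions for illustration.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dpll_4x is
  port (
    clk4        : in  std_logic;   -- 4x the nominal bit rate (e.g. 80 MHz for 20 Mbps)
    rxd         : in  std_logic;   -- serial input, already synchronized to clk4
    sample_en   : out std_logic;   -- asserted once per bit, near the bit centre
    out_of_lock : out std_logic    -- transition seen in the "error" phase
  );
end entity dpll_4x;

architecture rtl of dpll_4x is
  signal rxd_d : std_logic := '0';
  signal phase : unsigned(1 downto 0) := (others => '0');
begin
  process (clk4)
    variable edge : std_logic;
  begin
    if rising_edge(clk4) then
      rxd_d <= rxd;
      edge  := rxd xor rxd_d;          -- any transition on the line
      sample_en   <= '0';
      out_of_lock <= '0';

      -- Phase 0 is where transitions are expected, phase 2 is the bit centre.
      if edge = '1' then
        case to_integer(phase) is
          when 0      => phase <= phase + 1;  -- on time: count as usual
          when 1      => phase <= phase;      -- data a little late: stall one state
          when 3      => phase <= phase + 2;  -- data a little early: skip one state
          when others =>                      -- transition at the sample point: lost
            phase       <= to_unsigned(1, 2); -- crude "hunt": re-centre on this edge
            out_of_lock <= '1';
        end case;
      else
        phase <= phase + 1;                   -- free-run between transitions
      end if;

      if phase = 2 then
        sample_en <= '1';   -- registered, so it trails the centre by one clk4 cycle
      end if;
    end if;
  end process;
end architecture rtl;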
This is likely not a big deal to most, but it hurts me a lot. I have one product in production and it uses an XP device. They are only giving until November to get your last time buy orders in. I think Lattice is doing a disservice to themselves as well as the rest of us. I am very accustomed to extended longevity in FPGAs. This act on the part of Lattice puts them in a separate camp I think. I have been looking at the alternatives. The three distinguishing issues are package, capacity and the need for external configuration memory. The XP I was using is in a 100 pin QFP which is perfect for the board, easy to assemble and works with 6/6 design rules and 12 mil hole diameter. It has 3000 LUTs which are around 80% used and the internal configuration Flash saves space on the tiny, cramped board. Mostly the alternatives are other Lattice devices, but none are a perfect fit. XP2, XO2 and the iCE40 line. The ones that come in the same package don't have as many LUTs, only 2100 which would require using a soft CPU to implement the slow functions in fewer LUTs. The larger parts are in harder to use packages like 0.5 mm BGAs which need very fine pitch design rules and small drills. The Xilinx parts are interesting. Spartan 3 devices come in 100 QFPs and have enough of the "right stuff" inside including multipliers which I can use. But that external flash needs a spot on the board and I have to use a 1.2 volt regulator for the core. The XP parts use an internal regulator and run from 3.3 volts only. Xilinx has a rep for keeping parts in production for a long, long time, but the S3 line came out in 2005, same as the XP line. Spartan 6 parts give a *lot* more functionality, but I'd have to use a 256 pin 1.0 mm BGA *and* external flash *and* the 1.2 volt supply *and* they are twice the price. Maybe I'll talk to the disties. Maybe they can do something about the price at least. I have yet to check out the Altera line. I don't remember them having anything I liked in a nice package. But that will be something to do later today. I guess I should check out the Micro-Semi line as well. It's been a while since I looked hard at their parts and, oh yeah, there is the PSOC from Cypress. I don't think that was an option at the time I did this design. An interesting note is that the Spartan 3 parts came out the same year as the XP line. Xilinx still sells multiple generations of parts older than the Spartan 3, so I think it will be a while before they can get around to obsoleting that line. -- RickArticle: 155615
On Tue, 30 Jul 2013 14:37:21 -0400 rickman <gnuarm@gmail.com> wrote: > > An interesting note is that the Spartan 3 parts came out the same year > as the XP line. Xilinx still sells multiple generations of parts older > than the Spartan 3, so I think it will be a while before they can get > around to obsoleting that line. Actually just had this conversation with my Xilinx people. They're not recommending Spartan 3 for new designs, and are talking (speculatively) about obsoleting it in 2018. -- Rob Gaddi, Highland Technology -- www.highlandtechnology.com Email address domain is currently out of order. See above to fix.Article: 155616
On Monday, July 29, 2013 8:01:28 PM UTC-7, ske...@gmail.com wrote:
> But I wonder if at higher speeds this sort of coding is required:
> if(clk'event and clk='1')then
> D <= C;
> B <= A;
> end if;
> if(clk'event and clk='0')then
> C <= B;
> A <= input;
> end if;

This is overall not a very good idea. Even with 50% duty cycle clocks, the path A.Q to B.D has only T/2 available, so you are cutting the time available for B to register by half. To make a design run faster you need to increase the source-clock-edge to destination-clock-edge time, not decrease it as you are doing here.

Your options are to add multicycle paths or useful skew to increase the time available between clock edges. The former is difficult to constrain and the latter is strictly a physical-design solution which doesn't apply to FPGAs.

Article: 155617
Rob Gaddi wrote:
> On Tue, 30 Jul 2013 14:37:21 -0400
> rickman <gnuarm@gmail.com> wrote:
>
>> An interesting note is that the Spartan 3 parts came out the same year
>> as the XP line. Xilinx still sells multiple generations of parts older
>> than the Spartan 3, so I think it will be a while before they can get
>> around to obsoleting that line.
>
> Actually just had this conversation with my Xilinx people. They're not
> recommending Spartan 3 for new designs, and are talking
> (speculatively) about obsoleting it in 2018.

And yet, Xilinx recently updated the Spartan 3 data sheet to remove the "not recommended for new designs" banner and indicated that they in fact *are* recommended for new designs.

Still, all of these manufacturers are at the mercy of their foundries and have to pull the plug on devices that can no longer be manufactured due to the process going away at UMC, TSMC, ...

At this point it's hard to say whether the FPGA manufacturers' previous track record on supporting old devices is any indication of future performance.

Another point on Xilinx parts in small packages - I seem to remember that Lattice gave you more usable IO in the same package / pin count than Xilinx. So the fact that you could get a Spartan 3 in a TQ100 doesn't necessarily mean it will have enough IO to replace the Lattice XP device.

The other obvious options are:

1) Try to estimate your future usage of this part and schedule that LTB.

2) Stick your head in the sand and deal with the grey market for parts until you can't get any more, then redesign. (This seems to be the approved method here.)

-- Gabor

Article: 155618
On Tue, 30 Jul 2013 15:03:54 -0400 GaborSzakacs <gabor@alacron.com> wrote: > Rob Gaddi wrote: > > On Tue, 30 Jul 2013 14:37:21 -0400 > > rickman <gnuarm@gmail.com> wrote: > > > >> An interesting note is that the Spartan 3 parts came out the same year > >> as the XP line. Xilinx still sells multiple generations of parts older > >> than the Spartan 3, so I think it will be a while before they can get > >> around to obsoleting that line. > > > > Actually just had this conversation with my Xilinx people. They're not > > recommending Spartan 3 for new designs, and are talking > > (speculatively) about obsoleting it in 2018. > > > And yet, Xilinx recently updated the Spartan 3 data sheet to remove > the "not recommended for new designs" banner and indicated that they > in fact *are* recommended for new designs. > > Still all of these manufacturers are at the mercy of their foundries > and have to pull the plug on devices that can no longer be manufactured > due to the process going away at UMC, TSMC, ... > > At this point it's hard to say whether the FPGA manufacturer's previous > track record on supporting old devices is any indication of future > performance. > > Another point on Xilinx parts in small packages - I seem to remember > that Lattice gave you more usable IO in the same package / pin count > than Xilinx. So the fact that you could get a Spartan 3 in a TQ100 > doesn't necessarily mean it will have enough IO to replace the Lattice > XP device. > > The other obvious options are: > > 1) Try to estimate your future usage of this part and schedule that LTB. > > 2) Stick you head in the sand and deal with the grey market for parts > until you can't get any more, then redesign. (This seems to be the > approved method here) > > -- > Gabor The thing I'm finding really concerning with Xilinx at the moment is that they've got this big investment in yet another entirely new toolchain (Vivado), and they're saying it's the way of the future. And it doesn't even support Spartan 6, let alone anything older. I switched years ago from X to A when my continuing problems with ISE finally became too much to deal with. I applauded the decision to scrap ISE's dodgy old codebase and take a new crack at it. But if the software they're pushing going forward doesn't support a given chip, then I can't possibly consider that chip to be going forward with them. -- Rob Gaddi, Highland Technology -- www.highlandtechnology.com Email address domain is currently out of order. See above to fix.Article: 155619
On Tuesday, July 30, 2013 9:37:21 PM UTC+3, rickman wrote: > This is likely not a big deal to most, but it hurts me a lot. I have > > one product in production and it uses an XP device. They are only > > giving until November to get your last time buy orders in. I think > > Lattice is doing a disservice to themselves as well as the rest of us. > > I am very accustomed to extended longevity in FPGAs. This act on the > > part of Lattice puts them in a separate camp I think. > > > > I have been looking at the alternatives. The three distinguishing > > issues are package, capacity and the need for external configuration > > memory. The XP I was using is in a 100 pin QFP which is perfect for the > > board, easy to assemble and works with 6/6 design rules and 12 mil hole > > diameter. It has 3000 LUTs which are around 80% used and the internal > > configuration Flash saves space on the tiny, cramped board. > > > > Mostly the alternatives are other Lattice devices, but none are a > > perfect fit. XP2, XO2 and the iCE40 line. The ones that come in the > > same package don't have as many LUTs, only 2100 which would require > > using a soft CPU to implement the slow functions in fewer LUTs. The > > larger parts are in harder to use packages like 0.5 mm BGAs which need > > very fine pitch design rules and small drills. > > > > The Xilinx parts are interesting. Spartan 3 devices come in 100 QFPs > > and have enough of the "right stuff" inside including multipliers which > > I can use. But that external flash needs a spot on the board and I have > > to use a 1.2 volt regulator for the core. The XP parts use an internal > > regulator and run from 3.3 volts only. Xilinx has a rep for keeping > > parts in production for a long, long time, but the S3 line came out in > > 2005, same as the XP line. Spartan 6 parts give a *lot* more > > functionality, but I'd have to use a 256 pin 1.0 mm BGA *and* external > > flash *and* the 1.2 volt supply *and* they are twice the price. Maybe > > I'll talk to the disties. Maybe they can do something about the price > > at least. > > > > I have yet to check out the Altera line. I don't remember them having > anything I liked in a nice package. If "nice" = 100 pin QFP, then yes, except for ancient Cyclone-I, Altera does not have anything nice. But if 144 pin QFP is also o.k. then there are relatively modern Cyclone III devices. Voltage and the rest is more or less the same as Xilinx. MAX2/MAX5 are not for you - too few LUTs. > But that will be something to do > > later today. I guess I should check out the Micro-Semi line as well. > > It's been a while since I looked hard at their parts and, oh yeah, there > > is the PSOC from Cypress. I don't think that was an option at the time > > I did this design. > > > > An interesting note is that the Spartan 3 parts came out the same year > > as the XP line. Xilinx still sells multiple generations of parts older > > than the Spartan 3, so I think it will be a while before they can get > > around to obsoleting that line. > > > > -- > > > > RickArticle: 155620
rickman wrote: > The Xilinx parts are interesting. Spartan 3 devices come in 100 QFPs > and have enough of the "right stuff" inside including multipliers which > I can use. But that external flash needs a spot on the board and I have > to use a 1.2 volt regulator for the core. The XP parts use an internal > regulator and run from 3.3 volts only. Xilinx has a rep for keeping > parts in production for a long, long time, but the S3 line came out in > 2005, same as the XP line. Spartan 6 parts give a *lot* more > functionality, but I'd have to use a 256 pin 1.0 mm BGA *and* external > flash *and* the 1.2 volt supply *and* they are twice the price. Maybe > I'll talk to the disties. Maybe they can do something about the price > at least. > Spartan 3AN has internal flash. I don't recall if there is a 100-pin version, I am using the 144-pin version in a couple products. I refuse to go to BGAs until there are no leaded parts remaining available. JonArticle: 155621
Jon Elson wrote: > rickman wrote: > > >> The Xilinx parts are interesting. Spartan 3 devices come in 100 QFPs >> and have enough of the "right stuff" inside including multipliers which >> I can use. But that external flash needs a spot on the board and I have >> to use a 1.2 volt regulator for the core. The XP parts use an internal >> regulator and run from 3.3 volts only. Xilinx has a rep for keeping >> parts in production for a long, long time, but the S3 line came out in >> 2005, same as the XP line. Spartan 6 parts give a *lot* more >> functionality, but I'd have to use a 256 pin 1.0 mm BGA *and* external >> flash *and* the 1.2 volt supply *and* they are twice the price. Maybe >> I'll talk to the disties. Maybe they can do something about the price >> at least. >> > Spartan 3AN has internal flash. I don't recall if there is a 100-pin > version, I am using the 144-pin version in a couple products. > I refuse to go to BGAs until there are no leaded parts remaining > available. > > Jon I'm pretty sure that the 144-pin package is the smallest with flash. In any case it's not a big win over an external SPI flash part. The difference in footprint between 100 TQFP and 144 TQFP is more than the flash footprint. Not to mention there's a price premium for that multi-die package. -- GaborArticle: 155622
already5chosen@yahoo.com wrote: > If "nice" = 100 pin QFP, then yes, except for ancient Cyclone-I, Altera > does not have anything nice. > > But if 144 pin QFP is also o.k. then there are relatively modern Cyclone > III devices. Voltage and the rest is more or less the same as Xilinx. There's some Cyclone IVs in 144ish QFP too. TheoArticle: 155623
GaborSzakacs wrote: > I'm pretty sure that the 144-pin package is the smallest with flash. > In any case it's not a big win over an external SPI flash part. The > difference in footprint between 100 TQFP and 144 TQFP is more than > the flash footprint. Not to mention there's a price premium for that > multi-die package. > Right, unless I have a pretty strong reason to use the 3AN, I use the Spartan 3, and the SST flash chips, which are insanely cheap. I wrote my own programmer code for those. Spartan 2E needed some interface fooling around to command the memory to start dumping at location zero, but the 3A knows how to do it by setting some config pins. JonArticle: 155624
On 7/30/2013 3:03 PM, GaborSzakacs wrote:
> Rob Gaddi wrote:
>> On Tue, 30 Jul 2013 14:37:21 -0400
>> rickman <gnuarm@gmail.com> wrote:
>>
>>> An interesting note is that the Spartan 3 parts came out the same
>>> year as the XP line. Xilinx still sells multiple generations of parts
>>> older than the Spartan 3, so I think it will be a while before they
>>> can get around to obsoleting that line.
>>
>> Actually just had this conversation with my Xilinx people. They're not
>> recommending Spartan 3 for new designs, and are talking
>> (speculatively) about obsoleting it in 2018.
>>
> And yet, Xilinx recently updated the Spartan 3 data sheet to remove
> the "not recommended for new designs" banner and indicated that they
> in fact *are* recommended for new designs.
>
> Still all of these manufacturers are at the mercy of their foundries
> and have to pull the plug on devices that can no longer be manufactured
> due to the process going away at UMC, TSMC, ...
>
> At this point it's hard to say whether the FPGA manufacturer's previous
> track record on supporting old devices is any indication of future
> performance.

Before making any decisions I will do my due diligence as well as have any decision approved by my customer. They will be designing my board into their new product, so they are free to make the decision for me. Actually, 2018 might work for me if not for my customer. I expect I'll be fully retired in 5 more years.

> Another point on Xilinx parts in small packages - I seem to remember
> that Lattice gave you more usable IO in the same package / pin count
> than Xilinx. So the fact that you could get a Spartan 3 in a TQ100
> doesn't necessarily mean it will have enough IO to replace the Lattice
> XP device.

In this case the count is higher in the Spartan 3 part and is a *lot* higher in nearly any other part, since most won't be in the same package.

> The other obvious options are:
>
> 1) Try to estimate your future usage of this part and schedule that LTB.

Not mine to estimate. I tried buying just 10 boards ahead and ended up with 10 rev 1.1 boards after we did the 2.0 design. The demand is *very* lumpy, as my customer puts it. We got orders this year for more units than we have sold in the last five...

> 2) Stick your head in the sand and deal with the grey market for parts
> until you can't get any more, then redesign. (This seems to be the
> approved method here.)

No, this is too important to me and my customer. We will work it out one way or the other.

Thanks for your comments.

-- Rick