Messages from 154675

Article: 154675
Subject: Re: DC fifo behaviour at underflow/overflow
From: "kaz" <3619@embeddedrelated>
Date: Sat, 15 Dec 2012 12:14:53 -0600

>
>My crucial point is:
>Is there any way this Altera FIFO will break up the stream into another
>stream with even samples ahead of its odd half by 8 samples?
>
>Kaz	   
>					
>---------------------------------------		
>Posted through http://www.FPGARelated.com
>

Let me rephrase the problem.
The presumed FIFO problem may not be a case of underflow/overflow; it may
instead be a timing problem, or a mix of both.

DC FIFOs protect against metastability to some degree, but a failure can
still occur. The cross-domain paths are, understandably, treated as false
paths by default. So isn't this an intermittent loss of functionality that
simply has to be accepted? The error stays for several tens of milliseconds
and then disappears. Shouldn't we expect a FIFO to recover more quickly
(its internal sync pipeline is set to 3)?

Kaz
	   
					
---------------------------------------		
Posted through http://www.FPGARelated.com

Article: 154676
Subject: Re: DC fifo behaviour at underflow/overflow
From: Allan Herriman <allanherriman@hotmail.com>
Date: 16 Dec 2012 05:03:14 GMT

On Sat, 15 Dec 2012 09:47:35 -0600, kaz wrote:

>>"kaz" <3619@embeddedrelated> wrote in message
>>news:AaydnUTdO7FfE1HNnZ2dnUVZ_iydnZ2d@giganews.com...
>>>>I would never let a FIFO over or under flow. You should always stop
>>>>writing to the FIFO if the full flag is set and discard the input data
>>>>stream. If the empty flag is set you should not read from the FIFO -
>>>>instead output known dummy data (invariably I output all zeros).
>>>>
>>>>Following this rule the behaviour of the FIFO is  totally predictable.
>>>>
>>>>Andy
>>>>
>>>>
>>> Thanks Andy. No question that a FIFO is meant to work away from
>>> underflow or overflow. What I am asking is whether there are any known
>>> patterns that could emerge, after all, within this unpredictability.
>>> Here I am really asking about known symptoms of wrong behaviour.
>>>
>>> Kaz
>>>
>>>
>>Depends how the FIFO is constructed.
>>
>>If it is as a dual port RAM with an incrementable write pointer on the
>>input port, and an incrementable read pointer on the output port then if
>>you fill it to full - stop writing - then keep pulling data from the read
>>port it will act as a circular buffer with data that will repeat over a
>>number of cycles which will equal the FIFO length.
>>
>>You can work out other scenarios for this architecture yourself, for sure.
>>
>>Andy
>>
>>
>>
>>
> My crucial point is:
> Is there any way this Altera FIFO will break up the stream into another
> stream with even samples ahead of its odd half by 8 samples?

I saw a DC (dual clock) FIFO do something like that once.  It was in a 
Xilinx part, but the design error would apply equally well to an Altera 
part or an ASIC.

It was part of an IP core that a client had bought.  To make a 64 bit 
wide FIFO, the IP developer had used two 32 bit wide FIFOs in parallel. 
The two FIFOs had independent control circuits.

Of course, with a dual clock FIFO, one can't really make any guarantees 
about the depth immediately after an asynchronous reset when the clocks 
are running, and indeed the two halves of the FIFO would sometimes start 
with different depths.  There was no circuit to check for this state and 
get them back into sync, and the end result was that, until the next 
reset, 32 bit chunks of data were swapped around.
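
The failure described above can be illustrated with a small behavioural
sketch. It is not the vendor IP: the function name, the `low_half_skew`
parameter and the use of `None` for stale post-reset data are all
assumptions made for illustration. Two independent 32-bit queues carry
the halves of each 64-bit word, and one queue starts one entry deeper
than the other.

```python
from collections import deque

MASK32 = 0xFFFFFFFF

def read_wide_fifo(words, low_half_skew=0):
    """Model a 64-bit FIFO built from two independent 32-bit FIFOs.

    low_half_skew is a hypothetical knob: the number of stale entries the
    low-half FIFO already holds after reset, i.e. how much deeper it starts.
    """
    high = deque()
    low = deque([None] * low_half_skew)  # None marks stale post-reset data
    out = []
    for word in words:
        # Both halves are written and read together on every cycle.
        high.append(word >> 32)
        low.append(word & MASK32)
        out.append((high.popleft(), low.popleft()))
    return out

# High and low halves of word n both carry the value n.
words = [(n << 32) | n for n in range(4)]
aligned = read_wide_fifo(words)
skewed = read_wide_fifo(words, low_half_skew=1)
print(aligned)  # halves stay paired
print(skewed)   # low half lags the high half by one word
```

With a skew of one entry, every output word pairs the high half of one
sample with the low half of the previous one, and nothing in the model
ever realigns them, which matches the "until the next reset" behaviour.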

Regards,
Allan

Article: 154677
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Sun, 16 Dec 2012 01:31:26 -0800 (PST)

On Dec 15, 8:14 pm, "kaz" <3619@embeddedrelated> wrote:
> >My crucial point is:
> >Is there any way this Altera FIFO will break up the stream into another
> >stream with even samples ahead of its odd half by 8 samples?
>
> >Kaz
>
>
> Let me rephrase the problem.
> It may not be that the presumed fifo problem is a case of
> underflow/overflow
> but rather it is a timing problem or both mixed up.

The symptoms look exactly like underflow/overflow.

>
> dc fifos protect against metastability to some degree but a failure could
> occur.
> The cross-domain paths are made false by default understandably. So isn't
> it a case of loss of functionality for some time intermittently that has
> to be accepted. The error stays for several tens of msec then disappears.
> Don't we expect fifos to recover more quickly (its internal sync pipeline
> is set to 3).
>

According to the dcfifo help, a value of 3 is internally translated to
1, which is almost certainly insufficient for the very high clock rates
that you are using. Try 4.

> Kaz
>

Did you pay attention to the DELAY_RDUSEDW/DELAY_WRUSEDW parameters?
Altera's default value (1) is unintuitive and, in my experience, tends
to cause problems. If you rely on exact values of the rdusedw or wrusedw
ports for anything non-trivial, I'd recommend setting the respective
DELAY_xxUSEDW to 0.
I'd also set OVERFLOW_CHECKING/UNDERFLOW_CHECKING to "OFF" and do
underflow/overflow prevention in my own logic.

BTW, personally, I wouldn't use Altera's 8-deep FIFOs; they don't
appear to be as well tested as their deeper relatives. Or maybe
it's just me.

Article: 154678
Subject: Re: DC fifo behaviour at underflow/overflow
From: "kaz" <3619@embeddedrelated>
Date: Sun, 16 Dec 2012 04:39:46 -0600

Many thanks for your contributions.

The FIFO I am using is very basic: 32 bits wide, 8 words deep, no reset,
3-stage synchroniser, write and read requests connected directly
(combinatorially) to the full/empty flags, word count not used,
clocks (wr/rd 368/245).

I am trying to get my head deeper into how a FIFO might work internally.
Assuming the simplest case, I understand that the write pointer is clocked
by the write clock and increments on a write request (counting in binary
or Gray). The read pointer mirrors that on the read side.

The signals that cross the clock domain are the empty/full flags (in my
case, as I am not using the word counts).

What now mystifies me is this: if anything went wrong, be it a flow issue
or timing, wouldn't these counters just keep incrementing from wherever
they landed, implying self-recovery? The exception would be the case where
the read pointer is ahead of the write pointer (as I assume in my case,
because each sample is read out correctly but the samples are misaligned).

I mean, to get an 8-sample odd/even misalignment I can only think of the
pointers going crazy, or the address arriving crazy but regular.


Kaz



	   
					

Article: 154679
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Sun, 16 Dec 2012 04:37:58 -0800 (PST)

On Dec 16, 12:39 pm, "kaz" <3619@embeddedrelated> wrote:
> Many thanks for your contributions.
>
> The fifo I am using is very basic: 32 bits wide, 8 words deep, no reset,
> 3 stage synchroniser, write and read connected directly(combinatorially)
> to
> full/empty flags, word count not used, clocks(wr/rd 368/245).
>
> I am trying to put my head deeper into how a fifo might work internally.
> Assuming a simplest case, I understand the write pointer is clocked by
> write
> clock and it increments on write request (counting is binary or Gray).

Gray.

> The read pointer mirrors that on the read side.
>
> The signals that cross the clock domain are the empty/full flag (in my case
> as I am not using the word counts).

No, it does not work like that.
The signals that cross clock domains are:
* write pointer - resynchronized variant of it is used in the rdclk
clock domain to generate rdempty and rdusedw
* read pointer - resynchronized variant of it is used in the wrclk
clock domain to generate wrempty and wrusedw
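
Why Gray-coded pointers suit this kind of crossing can be sketched in a
few lines. This is a generic illustration, not the dcfifo source: the
helper names and the 4-bit pointer width are assumptions. The point is
that successive Gray values differ in exactly one bit, so a pointer
sampled by the other clock mid-transition resolves to either the old or
the new value, never to an unrelated one.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)

def gray_to_bin(g: int) -> int:
    """Convert a reflected Gray code back to a binary count."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

POINTER_VALUES = 16  # assumed 4-bit pointer, for illustration only

codes = [bin_to_gray(i) for i in range(POINTER_VALUES)]
for i, code in enumerate(codes):
    following = codes[(i + 1) % POINTER_VALUES]
    # Exactly one bit changes per increment, including at wrap-around.
    assert bin(code ^ following).count("1") == 1
    assert gray_to_bin(code) == i

print([format(c, "04b") for c in codes[:8]])
```

A binary counter, by contrast, can change several bits at once (for
example 0111 to 1000), so a synchronizer could capture a mixture of old
and new bits.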

>
> What now mystifies me is that if anything went wrong be it flow issue or
> timing then wouldn't these counters just increment from where they might
> have
> landed implying self recovery, excluding the case that read pointer is
> ahead
> of write pointer (as assumed in my case because samples are read out
> correctly each but misaligned).

Your write clock is faster than your read clock, so, supposedly, your
wrreq has <100% duty cycle, right?
The exact effect of underflow/overflow will depend on the specific pattern
applied to wrreq.

You wrote above that wrreq is connected directly to wrfull. Does that
mean that wrreq depends *only* on wrfull, or does the wrreq logic equation
have some additional terms?

>
> I mean to get 8 samples odd/even misalignment I can only think of pointers
> going crazy or address arriving crazy but regular.
>
> Kaz
>

Article: 154680
Subject: Re: DC fifo behaviour at underflow/overflow
From: "kaz" <3619@embeddedrelated>
Date: Sun, 16 Dec 2012 06:53:14 -0600

>On Dec 16, 12:39 pm, "kaz" <3619@embeddedrelated> wrote:
>> Many thanks for your contributions.
>>
>> The fifo I am using is very basic: 32 bits wide, 8 words deep, no reset,
>> 3 stage synchroniser, write and read connected directly(combinatorially)
>> to
>> full/empty flags, word count not used, clocks(wr/rd 368/245).
>>
>> I am trying to put my head deeper into how a fifo might work internally.
>> Assuming a simplest case, I understand the write pointer is clocked by
>> write
>> clock and it increments on write request (counting is binary or Gray).
>
>Gray.
>
>> The read pointer mirrors that on the read side.
>>
>> The signals that cross the clock domain are the empty/full flag (in my case
>> as I am not using the word counts).
>
>No, it does not work like that.
>The signals that cross clock domains are:
>* write pointer - resynchronized variant of it is used in the rdclk
>clock domain to generate rdempty and rdusedw
>* read pointer - resynchronized variant of it is used in the wrclk
>clock domain to generate wrempty and wrusedw
>
>>

I agree that a resynchronised variant of the write pointer will be used to
generate rdempty and rdusedw in the other domain, but not for the read
pointer itself, i.e. each side has its own pointer.

>> What now mystifies me is that if anything went wrong be it flow issue or
>> timing then wouldn't these counters just increment from where they might
>> have
>> landed implying self recovery, excluding the case that read pointer is
>> ahead
>> of write pointer (as assumed in my case because samples are read out
>> correctly each but misaligned).
>
>Your write clock is faster than your read clock, so, supposedly, your
>wrreq has <100% duty cicle, right?
>The exact effect of underflow/overflow will depend on specific pattern
>applied to wrreq.
>
>You wrote above that wrreq is connected directly to wrfull. Does it
>mean that wrreq depends *only* on wrfull or does wrreq logic equation
>has some additional terms?
>

Yes, the input rate is controlled by a valid signal that is regularly
active in a 2/3 ratio.
The read side is always active if the FIFO is not empty.

Kaz	   
					

Article: 154681
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Sun, 16 Dec 2012 05:16:47 -0800 (PST)

On Dec 16, 2:53 pm, "kaz" <3619@embeddedrelated> wrote:
> >On Dec 16, 12:39 pm, "kaz" <3619@embeddedrelated> wrote:
> >> Many thanks for your contributions.
>
> >> The fifo I am using is very basic: 32 bits wide, 8 words deep, no reset,
> >> 3 stage synchroniser, write and read connected directly(combinatorially)
> >> to
> >> full/empty flags, word count not used, clocks(wr/rd 368/245).
>
> >> I am trying to put my head deeper into how a fifo might work internally.
> >> Assuming a simplest case, I understand the write pointer is clocked by
> >> write
> >> clock and it increments on write request (counting is binary or Gray).
>
> >Gray.
>
> >> The read pointer mirrors that on the read side.
>
> >> The signals that cross the clock domain are the empty/full flag (in my case
> >> as I am not using the word counts).
>
> >No, it does not work like that.
> >The signals that cross clock domains are:
> >* write pointer - resynchronized variant of it is used in the rdclk
> >clock domain to generate rdempty and rdusedw
> >* read pointer - resynchronized variant of it is used in the wrclk
> >clock domain to generate wrempty and wrusedw
>
> I agree that a resynchronised variant of write pointer will be used to
> generate rdempty and rdusew in other domain but not for read pointer itself
> i.e. each side has its own pointer.

Of course.
Each side has its own pointer plus a resynchronised copy of the other
side's pointer.

> >> What now mystifies me is that if anything went wrong be it flow issue or
> >> timing then wouldn't these counters just increment from where they might
> >> have
> >> landed implying self recovery, excluding the case that read pointer is
> >> ahead
> >> of write pointer (as assumed in my case because samples are read out
> >> correctly each but misaligned).
>
> >Your write clock is faster than your read clock, so, supposedly, your
> >wrreq has <100% duty cicle, right?
> >The exact effect of underflow/overflow will depend on specific pattern
> >applied to wrreq.
>
> >You wrote above that wrreq is connected directly to wrfull. Does it
> >mean that wrreq depends *only* on wrfull or does wrreq logic equation
> >has some additional terms?
>
> yes the input rate is controlled by valid being active in 2/3 ratio
> regularly.
> The read side is always active if fifo is not empty.
>
> Kaz
>

yes = *only* wrfull or yes = additional terms?
If the former, where is wrdata coming from?

Can you post a representative excerpt from your design here?

Article: 154682
Subject: Re: DC fifo behaviour at underflow/overflow
From: "kaz" <3619@embeddedrelated>
Date: Sun, 16 Dec 2012 07:57:43 -0600

Yes, there is an extra term.
Here is an excerpt:

TX_SRX_FIFO_inst : TX_SRX_FIFO 
  PORT MAP (
    data     => TX_SRX_FIFO_DATA,
    rdclk    => iCLK245,
    rdreq    => TX_SRX_FIFO_rdreq,
    wrclk    => iCLK368,
    wrreq    => TX_SRX_FIFO_wrreq,
    q        => TX_SRX_FIFO_q,
    rdempty  => TX_SRX_FIFO_empty,
    wrfull   => TX_SRX_FIFO_full
    );


    -- Write when the 2-in-3 clock enable is active and the FIFO is not full.
    TX_SRX_FIFO_wrreq <= Sync_23_1b(1) AND (NOT TX_SRX_FIFO_full);
    -- Read whenever the FIFO is not empty.
    TX_SRX_FIFO_rdreq <= NOT TX_SRX_FIFO_empty;


The clock ratio is 368.64 to 245.76, to be exact.

Kaz


	   
					

Article: 154683
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Sun, 16 Dec 2012 08:00:09 -0800 (PST)

On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
> Yes there is extra term.
> Here is some excerpt:
>
> TX_SRX_FIFO_inst : TX_SRX_FIFO
>   PORT MAP (
>     data     => TX_SRX_FIFO_DATA,
>     rdclk    => iCLK245,
>     rdreq    => TX_SRX_FIFO_rdreq,
>     wrclk    => iCLK368,
>     wrreq    => TX_SRX_FIFO_wrreq,
>     q        => TX_SRX_FIFO_q,
>     rdempty  => TX_SRX_FIFO_empty,
>     wrfull   => TX_SRX_FIFO_full
>     );
>
>     -- 2 in 3 clock enables is used
>     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
>     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>
> the clock ratio is 368.64 to 245.76 to be exact.
>
> Kaz
>

WOW, I reproduced the behavior that you describe (non-recovery after
overflow) in functional simulation with Altera's internal simulator!
I never imagined that anything like that was possible.
Sounds like a bug in the implementation of dcfifo. Of course Altera will
call this bug a feature and will say that once there has been an
overflow nothing can be guaranteed. Or similar bullsheet.
I am writing a sequential counter and see a pattern like (64, 57, 66,
59, 68, 61, 70, 63...) on the read side.
To be continued...

Article: 154684
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Sun, 16 Dec 2012 09:44:17 -0800 (PST)

On Dec 16, 6:00 pm, Michael S <already5cho...@yahoo.com> wrote:
> On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
>
> > Yes there is extra term.
> > Here is some excerpt:
>
> > TX_SRX_FIFO_inst : TX_SRX_FIFO
> >   PORT MAP (
> >     data     => TX_SRX_FIFO_DATA,
> >     rdclk    => iCLK245,
> >     rdreq    => TX_SRX_FIFO_rdreq,
> >     wrclk    => iCLK368,
> >     wrreq    => TX_SRX_FIFO_wrreq,
> >     q        => TX_SRX_FIFO_q,
> >     rdempty  => TX_SRX_FIFO_empty,
> >     wrfull   => TX_SRX_FIFO_full
> >     );
>
> >     -- 2 in 3 clock enables is used
> >     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
> >     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>
> > the clock ratio is 368.64 to 245.76 to be exact.
>
> > Kaz
>
> > ---------------------------------------
> > Posted through http://www.FPGARelated.com
>
> WOW, I reproduced the behavior that you describe (non-recovery after
> overflow) in functional simulation with Altera's internal simulator!
> I never imagined that anything like that is possible.
> Sounds like bug in implementation of dcfifo. Of course Altera will
> call this bug a feature and will say that as long as there was
> overflow nothing could be guaranteed. Or similar bullsheet.
> I am writing sequential counter and see the pattern like (64, 57, 66,
> 59, 68, 61, 70, 63...) on the read side.
> To be continued...

A few more observations:
1. The problem is not limited to the 8-deep DCFIFO. A 16-deep DCFIFO can
be easily forced into the same "mad" state.
2. A single write into a full FIFO is not enough to trigger the problem.
You have to write to a full FIFO 3 times in a row, which generally
should never happen even in the presence of poorly prevented
metastability.
3. So, in order to force the FIFO into the "mad" state you have to apply a
stupid sequence on the write side. But once the FIFO is mad, it's the read
side that keeps it there. Somehow, it stops correctly detecting the
rdempty condition.

What would I do?
1. I'd increase RDSYNC_DELAYPIPE/WRSYNC_DELAYPIPE to 4. It's very
unlikely that the problem is here, but for such high clock
frequencies the value of 3 is still wrong.
2. I'd start looking for a race-condition type of bug, like feeding one
clock domain with a vector generated in the other clock domain. If you
don't know all parts of the design then try looking at the TimeQuest
"clock transfers" display. It could be helpful.
3. In the longer run, I'd redesign the whole synchronization block.
IMHO, a design whose maximal FIFO read throughput is exactly equal to
the *nominal* write throughput is not sufficiently robust. I'd very much
prefer the maximal read throughput to be at least 1% higher. Then your
FIFO will be close to empty most of the time and the block as a whole
will be "self-curing". As an additional benefit, you will have more
predictable latency through the FIFO. Even if latency is not important for
your main functionality, it's good for easier debugging.
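
The throughput argument in point 3 can be checked with exact arithmetic.
The sketch below uses only the figures quoted in the thread (368.64 and
245.76, with wrreq enabled on 2 of every 3 write clocks); the variable
names are illustrative, and the 1% figure is simply the margin suggested
above, not a derived requirement.

```python
from fractions import Fraction

WR_CLK = Fraction("368.64")   # write clock, same units as in the thread
RD_CLK = Fraction("245.76")   # read clock
WR_DUTY = Fraction(2, 3)      # wrreq enabled on 2 of every 3 write clocks

nominal_write_rate = WR_CLK * WR_DUTY
max_read_rate = RD_CLK
margin = max_read_rate / nominal_write_rate - 1

# Read clock that would give the suggested 1% of headroom.
rd_clk_with_headroom = RD_CLK * Fraction(101, 100)

print("rates equal:", nominal_write_rate == max_read_rate)
print("read margin:", float(margin))
print("read clock with 1% headroom:", float(rd_clk_with_headroom))
```

With the quoted numbers the margin comes out as exactly zero, which is
why the FIFO has no spare read capacity with which to drain back toward
empty after a disturbance.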

Article: 154685
Subject: Re: DC fifo behaviour at underflow/overflow
From: "kaz" <3619@embeddedrelated>
Date: Sun, 16 Dec 2012 12:02:58 -0600

>On Dec 16, 6:00 pm, Michael S <already5cho...@yahoo.com> wrote:
>> On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
>>
>> > Yes there is extra term.
>> > Here is some excerpt:
>>
>> > TX_SRX_FIFO_inst : TX_SRX_FIFO
>> >   PORT MAP (
>> >     data     => TX_SRX_FIFO_DATA,
>> >     rdclk    => iCLK245,
>> >     rdreq    => TX_SRX_FIFO_rdreq,
>> >     wrclk    => iCLK368,
>> >     wrreq    => TX_SRX_FIFO_wrreq,
>> >     q        => TX_SRX_FIFO_q,
>> >     rdempty  => TX_SRX_FIFO_empty,
>> >     wrfull   => TX_SRX_FIFO_full
>> >     );
>>
>> >     -- 2 in 3 clock enables is used
>> >     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
>> >     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>>
>> > the clock ratio is 368.64 to 245.76 to be exact.
>>
>> > Kaz
>>
>> > ---------------------------------------
>> > Posted through http://www.FPGARelated.com
>>
>> WOW, I reproduced the behavior that you describe (non-recovery after
>> overflow) in functional simulation with Altera's internal simulator!
>> I never imagined that anything like that is possible.
>> Sounds like bug in implementation of dcfifo. Of course Altera will
>> call this bug a feature and will say that as long as there was
>> overflow nothing could be guaranteed. Or similar bullsheet.
>> I am writing sequential counter and see the pattern like (64, 57, 66,
>> 59, 68, 61, 70, 63...) on the read side.
>> To be continued...
>
>Few more observations:
>1. The problem is not limited to 8-deep DCFIFO. 16-deep DCFIFO could
>be easily forced into the same "mad" state.
>2. A single write into full FIFO is not enough to trigger the problem.
>You have to write to full FIFO 3 times in a row. Which, generally
>should never happen even in presence of poorly prevented
>metastability.
>3. So, in order to force FIFO into "mad" state you have to do stupid
>sequence on the write side. But when FIFO is already mad, it's a read
>side that is keeping it here. Somehow, it stops correctly detecting
>rdempty condition.
>
>What would I do?
>1. I'd increase RDSYNC_DELAYPIPE/WRSYNC_DELAYPIPE to 4. It's very
>unlikely that the problem is here, but  for such high clock
>frequencies the value of 3 is still wrong.
>2. I'd start looking for race condition type of bug. Like feeding one
>clock domain with vector, generated in other clock domain. If you
>don't know all parts of design then try to look at Timequest "clock
>transfers" display. It could be helpful.
>3. In the longer run, I'd redesign the whole synchronization block.
>IMHO, a design that has maximal FIFO read throughput exactly equal to
>*nominal* write throughput is not sufficiently robust. I'd very much
>prefer maximal read throughput to be at least 1% higher. Then your
>FIFO will be most of the time close to empty and the block as whole
>will be "self-curing". As additional benefit, you will have more
>predictable latency through FIFO. Even if latency is not important for
>your main functionality, it's good for easier debugging.
>
>
>
>

Thanks so much Michael. It is great that you thought of simulating the FIFO
in this mad state. I will try to reproduce that. I assume you are doing
functional simulation.
The interesting thing is that we have many FIFOs in our system but only one
is misbehaving.

Kaz	   
					

Article: 154686
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Sun, 16 Dec 2012 14:35:01 -0800 (PST)

On Dec 16, 8:02 pm, "kaz" <3619@embeddedrelated> wrote:
> >On Dec 16, 6:00 pm, Michael S <already5cho...@yahoo.com> wrote:
> >> On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
>
> >> > Yes there is extra term.
> >> > Here is some excerpt:
>
> >> > TX_SRX_FIFO_inst : TX_SRX_FIFO
> >> >   PORT MAP (
> >> >     data     => TX_SRX_FIFO_DATA,
> >> >     rdclk    => iCLK245,
> >> >     rdreq    => TX_SRX_FIFO_rdreq,
> >> >     wrclk    => iCLK368,
> >> >     wrreq    => TX_SRX_FIFO_wrreq,
> >> >     q        => TX_SRX_FIFO_q,
> >> >     rdempty  => TX_SRX_FIFO_empty,
> >> >     wrfull   => TX_SRX_FIFO_full
> >> >     );
>
> >> >     -- 2 in 3 clock enables is used
> >> >     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
> >> >     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>
> >> > the clock ratio is 368.64 to 245.76 to be exact.
>
> >> > Kaz
>
> >> > ---------------------------------------
> >> > Posted through http://www.FPGARelated.com
>
> >> WOW, I reproduced the behavior that you describe (non-recovery after
> >> overflow) in functional simulation with Altera's internal simulator!
> >> I never imagined that anything like that is possible.
> >> Sounds like bug in implementation of dcfifo. Of course Altera will
> >> call this bug a feature and will say that as long as there was
> >> overflow nothing could be guaranteed. Or similar bullsheet.
> >> I am writing sequential counter and see the pattern like (64, 57, 66,
> >> 59, 68, 61, 70, 63...) on the read side.
> >> To be continued...
>
> >Few more observations:
> >1. The problem is not limited to 8-deep DCFIFO. 16-deep DCFIFO could
> >be easily forced into the same "mad" state.
> >2. A single write into full FIFO is not enough to trigger the problem.
> >You have to write to full FIFO 3 times in a row. Which, generally
> >should never happen even in presence of poorly prevented
> >metastability.
> >3. So, in order to force FIFO into "mad" state you have to do stupid
> >sequence on the write side. But when FIFO is already mad, it's a read
> >side that is keeping it here. Somehow, it stops correctly detecting
> >rdempty condition.
>
> >What would I do?
> >1. I'd increase RDSYNC_DELAYPIPE/WRSYNC_DELAYPIPE to 4. It's very
> >unlikely that the problem is here, but for such high clock
> >frequencies the value of 3 is still wrong.
> >2. I'd start looking for race condition type of bug. Like feeding one
> >clock domain with vector, generated in other clock domain. If you
> >don't know all parts of design then try to look at Timequest "clock
> >transfers" display. It could be helpful.
> >3. In the longer run, I'd redesign the whole synchronization block.
> >IMHO, a design that has maximal FIFO read throughput exactly equal to
> >*nominal* write throughput is not sufficiently robust. I'd very much
> >prefer maximal read throughput to be at least 1% higher. Then your
> >FIFO will be most of the time close to empty and the block as whole
> >will be "self-curing". As additional benefit, you will have more
> >predictable latency through FIFO. Even if latency is not important for
> >your main functionality, it's good for easier debugging.
>
> Thanks so much Michael. It is great that you thought of simulating fifo in
> this mad state. I will try reproduce that. I assume you are doing
> functional simulation.
> The interesting thing is that we got many fifos in our system but only one
> is misbehaving.
>
> Kaz
>

I thought a bit more about it. As a result, I am taking back
everything I said about Altera in post #13.
Altera's dcfifo is o.k. The access pattern is just too troublesome for
overflow recovery; it would cause problems for any reasonable FIFO
implementation.
Sorry, Altera, I was wrong.

Now I'll try to explain the problem:
Immediately after overflow the read pointer and write pointer are very
close to each other - on one cycle the read pointer pulls ahead of the
write pointer and reads a sample from 9 writes ago, on the next cycle it
falls behind the write pointer and reads the very last written sample,
and then it pulls ahead again, and so on.
It happens because the read machine sees a delayed version of the write
pointer, trailing the read pointer by one or two, and so thinks that the
FIFO is almost full. And continues to read. The write machine, on the
other hand, sees a delayed version of the read pointer, equal to the write
pointer or trailing it by one, and so thinks that the FIFO is either empty
or almost empty. And continues to write.
Since the average rate of writing is exactly equal to the rate of reading,
recovery from this situation can take a lot of time; with a common
clock source, recovery might never happen.
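
The alternating read pattern can be reproduced with a toy model. It is a
deliberately simplified sketch, not a model of dcfifo: it ignores the
flags and synchronizers entirely and assumes that, in the "mad" state,
accesses interleave as two writes followed by two reads while the
pointers overlap in an 8-deep memory. The function name and the starting
counts (63 and 56) are chosen for illustration so that the output lines
up with the sequence reported earlier in the thread.

```python
DEPTH = 8  # FIFO depth from the thread

def mad_state_reads(next_write, next_read, n_reads):
    """Reads seen when the pointers overlap and neither side honours a flag.

    Samples are numbered in write order; sample s is stored in slot s % DEPTH.
    Assumed schedule (not taken from dcfifo): two writes, then two reads.
    """
    # Memory contents left behind by the DEPTH writes before next_write.
    mem = [0] * DEPTH
    for s in range(next_write - DEPTH, next_write):
        mem[s % DEPTH] = s
    reads = []
    while len(reads) < n_reads:
        for _ in range(2):
            mem[next_write % DEPTH] = next_write  # overwrites unread data
            next_write += 1
        for _ in range(2):
            reads.append(mem[next_read % DEPTH])
            next_read += 1
    return reads

print(mad_state_reads(next_write=63, next_read=56, n_reads=8))
# → [64, 57, 66, 59, 68, 61, 70, 63]
```

In this model the first read of each pair lands on a slot that has just
been overwritten (the newest sample), and the second lands on a slot that
has not yet been overwritten (an old sample), so even and odd outputs
come from streams offset by 8 samples.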

The solution? Ensure that overflow/underflow never happens.
If you can't, at least increase the frequency of the read clock, as
suggested in my previous post. A 1% increase is enough.
If that is too hard as well, then slightly modify the write pattern. Instead of
"++-++-++-++-" do "+++++++++---+++++++++---". That pattern will
guarantee instant overflow/underflow recovery. If, for some reason,
such a modification of the write pattern is impossible, then make a smaller
modification: "++++--++++--". This pattern is not safe, but
probabilistically it should recover from overflow much faster than yours.

Good luck.

Article: 154687
Subject: Re: DC fifo behaviour at underflow/overflow
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Mon, 17 Dec 2012 00:01:26 +0000 (UTC)

Michael S <already5chosen@yahoo.com> wrote:

(snip)

>> >> WOW, I reproduced the behavior that you describe (non-recovery after
>> >> overflow) in functional simulation with Altera's internal simulator!
>> >> I never imagined that anything like that is possible.

(snip)

> I thought a bit more about it. As a result, I am taking back
> everything I said about Altera in the post #13.
> Altera's dcfifo is o.k. The access pattern is just too troublesome for
> overflow recovery, it will cause problems to any reasonable FIFO
> implementation.
> Sorry, Altera, I was wrong.
 
> Now I'd try to explain the problem:
> Immediately after overflow read pointer and write pointer are very
> close to each other - on one cycle read pointer pulls ahead of write
> pointer and reads sample from 9 writes ago, on the next cycle it fails
> behind write pointer and reads the very last write sample, and then
> again pulls ahead and so on.

Last I knew, FIFOs were supposed to have an almost full and almost
empty signal to avoid that problem. Maybe at 7/8 and 1/8.

> It happens because read machine sees delayed version of write pointer,
> trailing read pointer by one or two and then thinks that the FIFO is
> almost full.  And continues to read. Write machine, on the other hand,
> sees delayed version of read pointer, equal to write pointer or
> trailing it by one and then thinks that FIFO is either empty or almost
> empty. And continues to write.

If you use the almost full and almost empty, that should leave plenty
of margin for such delays. Even more is needed if the signals are
processed through software.

> Since average rate of writing is exactly equal to rate of reading the
> recovery from this situation can take a lot of time, in case of common
> clock source recovery could never happen.

Then, only after writing is finished, flush out all the data with the
actual empty flag.
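
As a rough sketch of the thresholds mentioned above, the flags could be
derived from the word count as follows. The function name and the
dictionary it returns are hypothetical; the 7/8 and 1/8 fractions are the
ones floated in this post, not values taken from any vendor FIFO.

```python
def threshold_flags(used_words: int, depth: int = 8) -> dict:
    """Derive almost-full / almost-empty flags from the FIFO word count."""
    almost_full_at = (7 * depth) // 8    # assumed threshold: 7/8 of depth
    almost_empty_at = depth // 8         # assumed threshold: 1/8 of depth
    return {
        "almost_full": used_words >= almost_full_at,
        "almost_empty": used_words <= almost_empty_at,
        # Words still free when almost_full first asserts: the slack that
        # has to cover the delay of the synchronized pointer.
        "full_margin": depth - almost_full_at,
    }

print(threshold_flags(4, depth=8))
print(threshold_flags(56, depth=64))
```

For an 8-deep FIFO this leaves only one word of slack above the
almost-full threshold, so a deeper FIFO may be needed for the margin to
cover the pointer synchronization delay.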

-- glen

Article: 154688
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Sun, 16 Dec 2012 16:51:45 -0800 (PST)

On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
> Yes there is extra term.
> Here is some excerpt:
>
> TX_SRX_FIFO_inst : TX_SRX_FIFO
>   PORT MAP (
>     data     => TX_SRX_FIFO_DATA,
>     rdclk    => iCLK245,
>     rdreq    => TX_SRX_FIFO_rdreq,
>     wrclk    => iCLK368,
>     wrreq    => TX_SRX_FIFO_wrreq,
>     q        => TX_SRX_FIFO_q,
>     rdempty  => TX_SRX_FIFO_empty,
>     wrfull   => TX_SRX_FIFO_full
>     );
>
>     -- 2 in 3 clock enables is used
>     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
>     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>
> the clock ratio is 368.64 to 245.76 to be exact.
>
> Kaz
>
> ---------------------------------------
> Posted throughhttp://www.FPGARelated.com

I'd also like to see a definition of  TX_SRX_FIFO.

Article: 154689
Subject: Re: Where to move for an embedded software engineer.
From: David Brown <david@westcontrol.removethisbit.com>
Date: Mon, 17 Dec 2012 09:27:31 +0100
Links: << >> << T >> << A >>

On 14/12/2012 15:38, hamilton wrote:
> On 12/14/2012 2:16 AM, scrts wrote:
>> "Philipp Klaus Krause"  wrote in message news:kacoae$25e$1@solani.org...
>> On 10.12.2012 02:04, no one wrote:
>>> I assume the bay area is number one for embedded software engineers,
>>> but where else are the big markets,
>>
>> Southern Germany. Lots of companies seem to be hiring around here,
>> including some bigger ones, such as Bosch and Sick, and a huge number of
>> smaller ones with at most a few hundred employees.
>>
>>
>>
>> Remember to mention that you will probably have to speak German...
>>
> Are the EU countries more welcoming to foreign workers than the USA ?

That question is impossible to answer - the EU is not like different 
states in the US.  Different countries will have different attitudes to 
immigrants from different parts of the world, even though the official 
rules are probably fairly similar.  And there are countries in Europe 
that are not part of the EU (such as Norway).

>
> What is the equivalent to the H1-B visa ?
>
> Does each EU country have a different visa requirement ?

You are going to get more useful answers by looking at a more official 
source of information.  I would recommend looking at the website of the 
embassy for the countries you are interested in - they will give you the 
exact answer.  In general, if you are a well-qualified US citizen (and 
no complications like criminal records), and there is a company in the 
European country that can offer you a job, then the paperwork should not 
be a hindrance.

More importantly, you have to consider if you can live in the country 
you are looking at.  Language is a big issue.  In many technical 
departments of companies in Scandinavia, much of the work is in English. 
  But towards the south of Europe, it is mostly in native languages 
(except in the bigger and more international companies).  Outside work, 
you can get by with English more easily in Northern Europe and smaller 
countries, but less so further south and in bigger countries.  A 
particular point here is that in countries like Norway or the 
Netherlands, much of the TV is foreign and in English with local 
subtitles - while in bigger countries like Spain and Italy, far more of 
the foreign TV is dubbed.  Similarly with books and translations.  This 
makes a big difference to how familiar people are with English in 
everyday life.

Other things like climate, food, culture and general way of life are 
important, and vary enormously across Europe.

>
> thanks for any info
>
> hamilton
>
>

Article: 154690
Subject: Re: DC fifo behaviour at underflow/overflow
From: "kaz" <3619@embeddedrelated>
Date: Mon, 17 Dec 2012 03:17:38 -0600
Links: << >> << T >> << A >>

>On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
>> Yes there is extra term.
>> Here is some excerpt:
>>
>> TX_SRX_FIFO_inst : TX_SRX_FIFO
>>   PORT MAP (
>>     data    => TX_SRX_FIFO_DATA,
>>     rdclk   => iCLK245,
>>     rdreq   => TX_SRX_FIFO_rdreq,
>>     wrclk   => iCLK368,
>>     wrreq   => TX_SRX_FIFO_wrreq,
>>     q       => TX_SRX_FIFO_q,
>>     rdempty => TX_SRX_FIFO_empty,
>>     wrfull  => TX_SRX_FIFO_full
>>     );
>>
>>     -- 2 in 3 clock enables is used
>>     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
>>     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>>
>>

Hi Michael,

Below is the definition of the fifo.
What troubles me is that write/read are tied to full/empty respectively,
so I don't see why flow problems should occur. Moreover, write/read is 
protected internally as well.

Could you also please let me know whether the simulation you did was a
timing simulation?

Thanks


LIBRARY ieee;
USE ieee.std_logic_1164.all;

LIBRARY altera_mf;
USE altera_mf.all;

ENTITY TX_SRX_FIFO IS
	PORT	(
		data		: IN STD_LOGIC_VECTOR (31 DOWNTO 0);
		rdclk		: IN STD_LOGIC ;
		rdreq		: IN STD_LOGIC ;
		wrclk		: IN STD_LOGIC ;
		wrreq		: IN STD_LOGIC ;
		q		: OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
		rdempty		: OUT STD_LOGIC ;
		wrfull		: OUT STD_LOGIC 
	);
END TX_SRX_FIFO;


ARCHITECTURE SYN OF tx_srx_fifo IS

	SIGNAL sub_wire0	: STD_LOGIC ;
	SIGNAL sub_wire1	: STD_LOGIC ;
	SIGNAL sub_wire2	: STD_LOGIC_VECTOR (31 DOWNTO 0);

	COMPONENT dcfifo
	GENERIC (
		intended_device_family	: STRING;
		lpm_numwords		: NATURAL;
		lpm_showahead		: STRING;
		lpm_type		: STRING;
		lpm_width		: NATURAL;
		lpm_widthu		: NATURAL;
		overflow_checking	: STRING;
		rdsync_delaypipe	: NATURAL;
		underflow_checking	: STRING;
		use_eab		        : STRING;
		wrsync_delaypipe	: NATURAL
	);
	PORT (
			wrclk	: IN STD_LOGIC ;
			rdempty	: OUT STD_LOGIC ;
			rdreq	: IN STD_LOGIC ;
			wrfull	: OUT STD_LOGIC ;
			rdclk	: IN STD_LOGIC ;
			q	: OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
			wrreq	: IN STD_LOGIC ;
			data	: IN STD_LOGIC_VECTOR (31 DOWNTO 0)
	);
	END COMPONENT;

BEGIN
	rdempty    <= sub_wire0;
	wrfull    <= sub_wire1;
	q    <= sub_wire2(31 DOWNTO 0);

	dcfifo_component : dcfifo
	GENERIC MAP (
		intended_device_family => "Stratix IV",
		lpm_numwords => 8,
		lpm_showahead => "OFF",
		lpm_type => "dcfifo",
		lpm_width => 32,
		lpm_widthu => 3,
		overflow_checking => "ON",
		rdsync_delaypipe => 5,
		underflow_checking => "ON",
		use_eab => "ON",
		wrsync_delaypipe => 5
	)
	PORT MAP (
		wrclk => wrclk,
		rdreq => rdreq,
		rdclk => rdclk,
		wrreq => wrreq,
		data => data,
		rdempty => sub_wire0,
		wrfull => sub_wire1,
		q => sub_wire2
	);

END SYN;

Kaz

	   
					
---------------------------------------		
Posted through http://www.FPGARelated.com

Article: 154691
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Mon, 17 Dec 2012 02:28:12 -0800 (PST)
Links: << >> << T >> << A >>

On Dec 17, 11:17 am, "kaz" <3619@embeddedrelated> wrote:
> >On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
> >> Yes there is extra term.
> >> Here is some excerpt:
>
> >> TX_SRX_FIFO_inst : TX_SRX_FIFO
> >>   PORT MAP (
> >>     data    => TX_SRX_FIFO_DATA,
> >>     rdclk   => iCLK245,
> >>     rdreq   => TX_SRX_FIFO_rdreq,
> >>     wrclk   => iCLK368,
> >>     wrreq   => TX_SRX_FIFO_wrreq,
> >>     q       => TX_SRX_FIFO_q,
> >>     rdempty => TX_SRX_FIFO_empty,
> >>     wrfull  => TX_SRX_FIFO_full
> >>     );
>
> >>     -- 2 in 3 clock enables is used
> >>     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
> >>     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>
> Hi Michael,
>
> below is definition of fifo.
> What troubles me is that write/read are tied up to full/empty respectively
> so I don't see why flow problems should occur. Moreover the write/read is
> protected internally as well.
>
> Could you also please let me know was it timing simulation that you did?

I disabled the built-in protections and forcibly overrode the protections
in user logic.

  rdreq <= (not rdempty or force1_rdreq) and not force0_rdreq;
  wrreq <= ((not wrfull and (not xx(1))) or force1_wrreq) and not force0_wrreq;

>
> Thanks
>
> LIBRARY ieee;
> USE ieee.std_logic_1164.all;
>
> LIBRARY altera_mf;
> USE altera_mf.all;
>
> ENTITY TX_SRX_FIFO IS
>         PORT    (
>                 data            : IN STD_LOGIC_VECTOR (31 DOWNTO 0);
>                 rdclk           : IN STD_LOGIC ;
>                 rdreq           : IN STD_LOGIC ;
>                 wrclk           : IN STD_LOGIC ;
>                 wrreq           : IN STD_LOGIC ;
>                 q               : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
>                 rdempty         : OUT STD_LOGIC ;
>                 wrfull          : OUT STD_LOGIC
>         );
> END TX_SRX_FIFO;
>
> ARCHITECTURE SYN OF tx_srx_fifo IS
>
>         SIGNAL sub_wire0        : STD_LOGIC ;
>         SIGNAL sub_wire1        : STD_LOGIC ;
>         SIGNAL sub_wire2        : STD_LOGIC_VECTOR (31 DOWNTO 0);
>
>         COMPONENT dcfifo
>         GENERIC (
>                 intended_device_family  : STRING;
>                 lpm_numwords            : NATURAL;
>                 lpm_showahead           : STRING;
>                 lpm_type                : STRING;
>                 lpm_width               : NATURAL;
>                 lpm_widthu              : NATURAL;
>                 overflow_checking       : STRING;
>                 rdsync_delaypipe        : NATURAL;
>                 underflow_checking      : STRING;
>                 use_eab                 : STRING;
>                 wrsync_delaypipe        : NATURAL
>         );
>         PORT (
>                         wrclk   : IN STD_LOGIC ;
>                         rdempty : OUT STD_LOGIC ;
>                         rdreq   : IN STD_LOGIC ;
>                         wrfull  : OUT STD_LOGIC ;
>                         rdclk   : IN STD_LOGIC ;
>                         q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
>                         wrreq   : IN STD_LOGIC ;
>                         data    : IN STD_LOGIC_VECTOR (31 DOWNTO 0)
>         );
>         END COMPONENT;
>
> BEGIN
>         rdempty    <= sub_wire0;
>         wrfull    <= sub_wire1;
>         q    <= sub_wire2(31 DOWNTO 0);
>
>         dcfifo_component : dcfifo
>         GENERIC MAP (
>                 intended_device_family => "Stratix IV",
>                 lpm_numwords => 8,
>                 lpm_showahead => "OFF",
>                 lpm_type => "dcfifo",
>                 lpm_width => 32,
>                 lpm_widthu => 3,
>                 overflow_checking => "ON",
>                 rdsync_delaypipe => 5,
>                 underflow_checking => "ON",
>                 use_eab => "ON",
>                 wrsync_delaypipe => 5
>         )
>         PORT MAP (
>                 wrclk => wrclk,
>                 rdreq => rdreq,
>                 rdclk => rdclk,
>                 wrreq => wrreq,
>                 data => data,
>                 rdempty => sub_wire0,
>                 wrfull => sub_wire1,
>                 q => sub_wire2
>         );
>
> END SYN;
>
> Kaz
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com

There is a known bug in Quartus 11.1 that applies to use_eab => "ON".
According to the knowledge base, it was fixed in 11.1SP1.
http://www.altera.com/support/kdb/solutions/rd11182011_10.html

There is another known bug that is not fixed in 11.1SP1/11.1SP2, but
it only applies to use_eab => "OFF", so you shouldn't care about it.

Looking at your parameters, rdsync_delaypipe => 5 and wrsync_delaypipe
=> 5 sound like overkill.
It's unlikely that a value of 5 really improves anything over a value of 4.
And if overflow happens nevertheless (don't ask me how, I don't know)
then longer sync pipelines certainly make self-recovery more
difficult.

Article: 154692
Subject: Re: DC fifo behaviour at underflow/overflow
From: Michael S <already5chosen@yahoo.com>
Date: Mon, 17 Dec 2012 02:33:03 -0800 (PST)
Links: << >> << T >> << A >>

On Dec 17, 11:17 am, "kaz" <3619@embeddedrelated> wrote:
>
> Could you also please let me know was it timing simulation that you did?
>

No, I did a functional simulation in the Quartus 9.1 internal simulator.
The Quartus 9.1 ISIM can't simulate Stratix IV, so I told it that the
device is a Cyclone III. For a functional simulation it should make no
difference.

Article: 154693
Subject: Re: MII SFD Detection with Shematics
From: jonesandy@comcast.net
Date: Tue, 18 Dec 2012 15:15:08 -0800 (PST)
Links: << >> << T >> << A >>

Sorry, but bubble diagrams of state machines drawn in HDL Designer are not schematics! 

I would like a tool that converts HDL into a re-arrangeable graphical bubble diagram (they never seem to arrange the diagram the way I intended it) for documentation. 

HDL source is portable and maintainable long after HDL Designer is obsolete, license expired, etc. And the machine-generated HDL source code is rarely very human readable.

Andy

Article: 154694
Subject: FPGA DSP basics: clock enable / new clock
From: o pere o <me@somewhere.net>
Date: Wed, 19 Dec 2012 09:32:19 +0100
Links: << >> << T >> << A >>

My current goal is to implement some digital signal processing (filters) 
on an FPGA. I am currently using Terasic's DE0-Nano board. This board has 
an ADC128S022 ADC. I have started as follows:

From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL. 
From this clock I generate the signals required to drive the ADC, 
essentially a clock at 3.2 MHz. Every 16 ADC clock cycles, the ADC gives a 
12 bit sample. This translates into 200 ksps. I generate a signal 
"smpl_rdy" at the appropriate position which allows me to latch the 8 
most significant bits.

The main question is how should I do the signal processing:

a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
b) Deriving a new 200 kHz clock from the PLL

I have done some projects on FPGAs but they were quite simple, so I 
consider myself only a little more than a beginner. I can think of some 
problems with both approaches, but I may have overlooked many others:

For instance, if I followed a), I guess that Quartus II would think that 
the processing happens at 25.6 MHz: if there is a long combinational 
path between registers, the timing analyzer will not be able to figure 
out that the data and the enable signal are stable during 16 clock 
cycles. Is there a way to provide this info to Quartus II? OTOH, using 
the same signal as an enable for everything further down does not seem 
sound enough, thinking of fanouts. So what?

If I tried to follow b), how would I ensure that there is the proper 
phase relationship between both clocks? Is there a way to achieve this?

Thanks for any advice.

Pere

Article: 154695
Subject: Re: FPGA DSP basics: clock enable / new clock
From: Thomas Stanka <usenet_nospam_valid@stanka-web.de>
Date: Wed, 19 Dec 2012 01:19:04 -0800 (PST)
Links: << >> << T >> << A >>

On 19 Dez., 09:32, o pere o <m...@somewhere.net> wrote:
> My current goal is to implement some digital signal processing (filters)
> on a FPGA. I am currently using Terasics DE0 nano board. This board has
> an ADC128S022 ADC. I have started as follows:
>
>  From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL.
>  From this clock I generate the signals required to drive the ADC,
> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a
> 12 bit sample. This translates into 200 ksps. I generate a signal
> "smpl_rdy" at the appropriate position which allows me to latch the 8
> most significant bits.
>
> The main question is how should I do the signal processing:
>
> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
> b) Deriving a new 200 kHz clock from the PLL

If you don't have much experience, try to use only _one_ clock for
everything in the FPGA; this gives a synchronous design. As this clock
is faster than the ADC clock, you can easily treat the signals from the
ADC as asynchronous and still capture them properly (oversampling the
ready signal lets you determine when the data is stable).
The easiest design is completely synchronous, with only one clock and
all inputs treated as asynchronous. The frequencies you mention
give no reason why that should not be possible in your case.

25 MHz should not be a big deal for most operations in modern
FPGAs if you use some pipelining. If you would like to use an
enable and let the operation take several clock cycles,
use a "multicycle path" constraint. I have no experience with Quartus,
but in general every tool should allow setting multicycle
constraints in some way.

best regards Thomas

Article: 154696
Subject: Re: FPGA DSP basics: clock enable / new clock
From: o pere o <me@somewhere.net>
Date: Wed, 19 Dec 2012 11:17:52 +0100
Links: << >> << T >> << A >>

On 12/19/2012 10:19 AM, Thomas Stanka wrote:
> On 19 Dez., 09:32, o pere o <m...@somewhere.net> wrote:
>> My current goal is to implement some digital signal processing (filters)
>> on a FPGA. I am currently using Terasics DE0 nano board. This board has
>> an ADC128S022 ADC. I have started as follows:
>>
>>   From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL.
>>   From this clock I generate the signals required to drive the ADC,
>> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a
>> 12 bit sample. This translates into 200 ksps. I generate a signal
>> "smpl_rdy" at the appropriate position which allows me to latch the 8
>> most significant bits.
>>
>> The main question is how should I do the signal processing:
>>
>> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
>> b) Deriving a new 200 kHz clock from the PLL
>
> If you have not that much experience try to use only _one_ clock for
> everything in the FPGA, this gives a synchronous design. As this clock
> is higher than the ADC clock you can easily treat the signals from ADC
> as asynchronous, and get them still proper (oversampling of ready,
> allows you to determine, when the data is stable).
> The easiest design is complete synchronous with only one clock and
> considering all inputs as asynchronous. The frequencies you mention
> indicate no reason, why that should not be possible in your case.

Well, this is just to get started. Once I'm running I will try to speed 
everything up as much as possible, just to learn something from it :) 
So, I'd also like to know "the" way to do it right.

BTW, I don't understand what you mean saying that I can treat the 
signals as asynchronous: At a given time point in the ADC serial stream, 
I generate a 1-clock-wide signal that indicates that the data is ready. 
In the approach a) I plan to use this signal as an enable for all the 
registers in the processing path.

main clock TTTTTTTTTTTTTTTTTTTTTT....TTTTTTTTTTTTTTT
                _____     _____            _____
ADC  clock ____     _____     ___...._____     ___...
                                             _
smpl_rdy   _________________________________ _____...

 From all this, I would say that I am doing a fully synchronous design.
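
For what it's worth, approach a) can be sketched in VHDL roughly as follows. This is only an illustration under assumptions: the entity name ce_stage, the port names other than smpl_rdy, and the trivial two-sample average standing in for "the processing" are all invented for the example. Every register is clocked by the 25.6 MHz clock and only updates when smpl_rdy is high, i.e. once per sample (25.6 MHz / 200 kHz = 128 clock cycles apart).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical processing stage: one system clock, smpl_rdy used as clock enable.
entity ce_stage is
  port (
    clk      : in  std_logic;                     -- 25.6 MHz system clock
    smpl_rdy : in  std_logic;                     -- one-clock-wide pulse per ADC sample
    smpl_in  : in  std_logic_vector(7 downto 0);  -- 8 MSBs latched from the ADC
    smpl_out : out std_logic_vector(7 downto 0)   -- average of the last two samples
  );
end entity ce_stage;

architecture rtl of ce_stage is
  signal prev : unsigned(7 downto 0) := (others => '0');
  signal avg  : unsigned(7 downto 0) := (others => '0');
begin
  process (clk)
    variable sum : unsigned(8 downto 0);
  begin
    if rising_edge(clk) then
      -- Registers change only when the enable is high, so they hold
      -- their values for the cycles in between.
      if smpl_rdy = '1' then
        sum  := resize(unsigned(smpl_in), 9) + resize(prev, 9);
        avg  <= sum(8 downto 1);      -- divide by two
        prev <= unsigned(smpl_in);
      end if;
    end if;
  end process;

  smpl_out <= std_logic_vector(avg);
end architecture rtl;
```

Written this way the design stays in a single clock domain; whether the timing analyzer needs to be told about the relaxed timing depends on how long the combinational paths between enabled registers turn out to be.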

> 25 MHz should not be the big deal for a most operations in modern
> FPGAs if you use some pipelining. In case you would like to use some
> enable and have the operation processing within several clock cycles
> use "multicycle path" constraint. I have no experience with quartus
> but in general every tool should be able to allow setting multi cycle
> constraints in a certain way.

Aha! This is exactly what I was looking for. I would rather not have to 
pipeline something just because the fitter is trying to meet the 25.6 
MHz timing when the true clock is just 200 kHz. With this keyword I will 
(hopefully) find the way to do it in Quartus.

> best regards Thomas

Thanks!
Pere

Article: 154697
Subject: Re: FPGA DSP basics: clock enable / new clock
From: "langwadt@fonz.dk" <langwadt@fonz.dk>
Date: Wed, 19 Dec 2012 14:55:58 -0800 (PST)
Links: << >> << T >> << A >>

On Dec 19, 9:32 am, o pere o <m...@somewhere.net> wrote:
> My current goal is to implement some digital signal processing (filters)
> on a FPGA. I am currently using Terasics DE0 nano board. This board has
> an ADC128S022 ADC. I have started as follows:
>
>  From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL.
>  From this clock I generate the signals required to drive the ADC,
> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a
> 12 bit sample. This translates into 200 ksps. I generate a signal
> "smpl_rdy" at the appropriate position which allows me to latch the 8
> most significant bits.
>
> The main question is how should I do the signal processing:
>
> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
> b) Deriving a new 200 kHz clock from the PLL
>
> I have done some projects on FPGAs but they were quite simple, so I
> consider myself only a little more than a beginner. I can think of some
> problems with both approaches, but I may have overlooked may others:
>
> For instance, if I followed a), I guess that Quartus II would think that
> the processing happens at 25.6 MHz: if there is a long combinational
> path between registers, the timing analyzer will not be able to figure
> out that the data and the enable signal are stable during 16 clock
> cycles. Is there a way to provide this info to Quartus II? OTOH, using
> the same signal as an enable for everything further down does not seem
> sound enough, thinking of fanouts. So what?
>
> If I tried to follow b), how would I ensure that there is the proper
> phase relationship between both clocks? Is there a way to achieve this?
>
> Thanks for any advice.
>
> Pere

Do you need the 25.6 MHz clock, or could you do everything synchronously
on a 3.2 MHz clock?
Then, even without telling the tool that you have multicycle paths,
timing should be easy.

But also consider that at a lower clock rate you might need more
resources,

i.e. a filter running at 25.6 MHz might need only one mul-acc, where a
filter running at 3.2 MHz needs 8.
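
To make the resource trade-off a bit more concrete, here is a hedged sketch of a filter that reuses a single multiply-accumulate over several fast clock cycles per sample. Everything specific is an assumption for illustration: the entity name serial_fir8, the 8 taps, the placeholder coefficients, and the samples being presented as signed 8-bit values. It relies on there being more than 8 clock cycles between smpl_rdy pulses.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 8-tap FIR that time-shares one multiplier.
entity serial_fir8 is
  port (
    clk      : in  std_logic;             -- fast system clock
    smpl_rdy : in  std_logic;             -- one-clock-wide pulse per new sample
    smpl_in  : in  signed(7 downto 0);    -- assumed signed sample format
    fir_out  : out signed(18 downto 0);   -- 16-bit products, 8 of them: 19 bits
    fir_vld  : out std_logic              -- high for one clock when fir_out updates
  );
end entity serial_fir8;

architecture rtl of serial_fir8 is
  type word_array is array (0 to 7) of signed(7 downto 0);
  -- Placeholder coefficients (all equal); a real filter would use designed values.
  constant COEFF : word_array := (others => to_signed(16, 8));
  signal taps : word_array := (others => (others => '0'));
  signal acc  : signed(18 downto 0) := (others => '0');
  signal idx  : natural range 0 to 7 := 0;
  signal busy : std_logic := '0';
begin
  process (clk)
    variable nxt : signed(18 downto 0);
  begin
    if rising_edge(clk) then
      fir_vld <= '0';
      if smpl_rdy = '1' then
        -- Shift the new sample into the delay line and restart the sum.
        for i in 7 downto 1 loop
          taps(i) <= taps(i - 1);
        end loop;
        taps(0) <= smpl_in;
        acc     <= (others => '0');
        idx     <= 0;
        busy    <= '1';
      elsif busy = '1' then
        -- One multiply-accumulate per clock, reused for all 8 taps.
        nxt := acc + resize(taps(idx) * COEFF(idx), acc'length);
        acc <= nxt;
        if idx = 7 then
          fir_out <= nxt;
          fir_vld <= '1';
          busy    <= '0';
        else
          idx <= idx + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

A fully parallel version would instead instantiate one multiplier per tap and produce the result in a single cycle, which is the extra resource cost referred to above.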

-Lasse

Article: 154698
Subject: Re: FPGA DSP basics: clock enable / new clock
From: rickman <gnuarm@gmail.com>
Date: Thu, 20 Dec 2012 00:19:38 -0500
Links: << >> << T >> << A >>

On 12/19/2012 5:17 AM, o pere o wrote:
> On 12/19/2012 10:19 AM, Thomas Stanka wrote:
>> On 19 Dez., 09:32, o pere o <m...@somewhere.net> wrote:
>>> My current goal is to implement some digital signal processing (filters)
>>> on a FPGA. I am currently using Terasics DE0 nano board. This board has
>>> an ADC128S022 ADC. I have started as follows:
>>>
>>> From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL.
>>> From this clock I generate the signals required to drive the ADC,
>>> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a
>>> 12 bit sample. This translates into 200 ksps. I generate a signal
>>> "smpl_rdy" at the appropriate position which allows me to latch the 8
>>> most significant bits.
>>>
>>> The main question is how should I do the signal processing:
>>>
>>> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
>>> b) Deriving a new 200 kHz clock from the PLL
>>
>> If you have not that much experience try to use only _one_ clock for
>> everything in the FPGA, this gives a synchronous design. As this clock
>> is higher than the ADC clock you can easily treat the signals from ADC
>> as asynchronous, and get them still proper (oversampling of ready,
>> allows you to determine, when the data is stable).
>> The easiest design is complete synchronous with only one clock and
>> considering all inputs as asynchronous. The frequencies you mention
>> indicate no reason, why that should not be possible in your case.
>
> Well, this is just to get started. Once I'm running I will try to speed
> everything up as much as possible, just to learn something from it :)
> So, I'd also like to know "the" way to do it right.

I don't know that there is a single "right" way to do this.  Certainly 
there are many "wrong" ways.  In general it is easy to use a single 
clock, but as you say, you then have concerns about how to provide the 
appropriate timing constraint to Quartus.  I haven't used Quartus in 
years, but I am sure this is possible.

Some folks call this "multi-cycle" timing.  The way constraints are 
handled under the Xilinx tools it is just a different timing constraint 
than the clock period specification and so has priority for the paths a 
specific timing constraint is specified for.

> BTW, I don't understand what you mean saying that I can treat the
> signals as asynchronous: At a given time point in the ADC serial stream,
> I generate a 1-clock-wide signal that indicates that the data is ready.
> In the approach a) I plan to use this signal as an enable for all the
> registers in the processing path.
>
> main clock TTTTTTTTTTTTTTTTTTTTTT....TTTTTTTTTTTTTTT
>                 _____     _____            _____
> ADC  clock ____     _____     ___...._____     ___...
>                                              _
> smpl_rdy   _________________________________ _____...
>
>  From all this, I would say that I am doing a fully synchronous design.

I think Thomas is saying you can treat the ADC interface as if it were 
async to the main clock.  But your ADC is driven synchronously with your 
25.6 MHz clock so this is not really needed and will likely use extra 
logic and work.

>> 25 MHz should not be the big deal for a most operations in modern
>> FPGAs if you use some pipelining. In case you would like to use some
>> enable and have the operation processing within several clock cycles
>> use "multicycle path" constraint. I have no experience with quartus
>> but in general every tool should be able to allow setting multi cycle
>> constraints in a certain way.
>
> Aha! This is exactly what I was looking for. I would rather not have to
> pipeline something just because the fitter is trying to meet the 25.6
> MHz timing when the true clock is just 200 kHz. With this keyword I will
> (hopefully) find the way to do in Quartus.
>
>> best regards Thomas
>
> Thanks!
> Pere

To use a 200 kHz clock should not be any real problem.  You just need to 
consider the timing of the 200 kHz clock when you design your circuit. 
An easy way to cross the clock domain boundary is to register the data 
from the ADC using the 25.6 MHz clock and a 200 kHz enable.  Make sure 
the rising edge of the 200 kHz clock is delayed at least one cycle from 
this register enable.  Then register the data a second time in the 200 
kHz clock domain.  This will help save some power as well as minimizing 
your timing analysis issues.
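
A minimal sketch of that two-stage capture, with assumptions spelled out: the entity and signal names are invented, both clocks are taken to come from the same source, and the rising edge of clk_slow is assumed to arrive at least one clk_fast cycle after smpl_rdy, as described above. The code does not enforce that relationship; it has to be guaranteed by how the clocks are generated and constrained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical hand-over from the 25.6 MHz domain to a 200 kHz domain.
entity adc_to_slow_domain is
  port (
    clk_fast  : in  std_logic;                     -- 25.6 MHz clock
    clk_slow  : in  std_logic;                     -- 200 kHz clock from the same source
    smpl_rdy  : in  std_logic;                     -- 200 kHz enable in the fast domain
    adc_data  : in  std_logic_vector(7 downto 0);  -- sample assembled from the ADC
    data_slow : out std_logic_vector(7 downto 0)   -- sample in the slow domain
  );
end entity adc_to_slow_domain;

architecture rtl of adc_to_slow_domain is
  signal hold_fast : std_logic_vector(7 downto 0) := (others => '0');
begin
  -- First register: fast clock, updated only when the enable is high,
  -- so its output stays stable for a whole sample period.
  capture_fast : process (clk_fast)
  begin
    if rising_edge(clk_fast) then
      if smpl_rdy = '1' then
        hold_fast <= adc_data;
      end if;
    end if;
  end process capture_fast;

  -- Second register: slow clock, sampling the stable held value.
  capture_slow : process (clk_slow)
  begin
    if rising_edge(clk_slow) then
      data_slow <= hold_fast;
    end if;
  end process capture_slow;
end architecture rtl;
```

Everything downstream of data_slow can then be clocked and constrained at 200 kHz only.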

Rick

Article: 154699
Subject: Re: FPGA DSP basics: clock enable / new clock
From: o pere o <me@somewhere.net>
Date: Thu, 20 Dec 2012 10:58:23 +0100
Links: << >> << T >> << A >>

On 12/19/2012 11:55 PM, langwadt@fonz.dk wrote:
> On Dec 19, 9:32 am, o pere o <m...@somewhere.net> wrote:
>> My current goal is to implement some digital signal processing (filters)
>> on a FPGA. I am currently using Terasics DE0 nano board. This board has
>> an ADC128S022 ADC. I have started as follows:
>>
>>   From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL.
>>   From this clock I generate the signals required to drive the ADC,
>> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a
>> 12 bit sample. This translates into 200 ksps. I generate a signal
>> "smpl_rdy" at the appropriate position which allows me to latch the 8
>> most significant bits.
>>
>> The main question is how should I do the signal processing:
>>
>> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
>> b) Deriving a new 200 kHz clock from the PLL
>>
>> I have done some projects on FPGAs but they were quite simple, so I
>> consider myself only a little more than a beginner. I can think of some
>> problems with both approaches, but I may have overlooked may others:
>>
>> For instance, if I followed a), I guess that Quartus II would think that
>> the processing happens at 25.6 MHz: if there is a long combinational
>> path between registers, the timing analyzer will not be able to figure
>> out that the data and the enable signal are stable during 16 clock
>> cycles. Is there a way to provide this info to Quartus II? OTOH, using
>> the same signal as an enable for everything further down does not seem
>> sound enough, thinking of fanouts. So what?
>>
>> If I tried to follow b), how would I ensure that there is the proper
>> phase relationship between both clocks? Is there a way to achieve this?
>>
>> Thanks for any advice.
>>
>> Pere
>
> Do you need the 25.6MHz or could you do everything synchronous on a
> 3.2MHz ?
> Then, even without telling it that you have multi cycles paths, timing
> should
> be easy

In this case, I could do everything at a much lower frequency. As I have
to generate the signals to control the ADC, my approach has been to 
start with a system frequency at least 2x the ADC clock frequency that I 
have to generate. So, I could work with 6.4MHz and timing would be much 
easier. However, the main point of my question was to learn the proper 
way to do this.

> But also consider that for a lower clock rate you might need more
> resources
>
> i.e. filter running at 25.6MHz might only need one mul-acc, where a
> filter
> running at 3.2MHz needs 8

That's certainly true!

BTW, any inputs on whether using my smpl_rdy as an enable for each 
register is a good/bad idea?

> -Lasse
>
Thanks for your inputs!

Pere
