> >My crucial point is:
> >Is there any way this Altera fifo will break up the stream into another
> >stream with even samples ahead of its odd half by 8 samples?
> >
> >Kaz

Let me rephrase the problem.
It may be that the presumed fifo problem is not a case of underflow/overflow but rather a timing problem, or both mixed up.

DC fifos protect against metastability to some degree, but a failure could still occur. The cross-domain paths are made false paths by default, understandably. So isn't it a case of intermittent loss of functionality for some time that has to be accepted? The error stays for several tens of msec, then disappears. Wouldn't we expect fifos to recover more quickly (its internal sync pipeline is set to 3)?

Kaz

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 154676
On Sat, 15 Dec 2012 09:47:35 -0600, kaz wrote:

>>"kaz" <3619@embeddedrelated> wrote in message
>>news:AaydnUTdO7FfE1HNnZ2dnUVZ_iydnZ2d@giganews.com...
>>>>I would never let a FIFO over or under flow. You should always stop
>>>>writing to the FIFO if the full flag is set and discard the input data
>>>>stream. If the empty flag is set you should not read from the FIFO -
>>>>instead output known dummy data (invariably I output all zeros).
>>>>
>>>>Following this rule the behaviour of the FIFO is totally predictable.
>>>>
>>>>Andy
>>>
>>> Thanks Andy. No question that fifo is meant to be working away from
>>> underflow or overflow. What I am asking is there any known patterns
>>> that could emerge - after all - within this unpredictability. Here I
>>> am asking about known symptoms of wrong behaviour really.
>>>
>>> Kaz
>>
>>Depends how the FIFO is constructed.
>>
>>If it is as a dual port RAM with an incrementable write pointer on the
>>input port, and an incrementable read pointer on the output port then if
>>you fill it to full - stop writing - then keep pulling data from the read
>>port it will act as a circular buffer with data that will repeat over a
>>number of cycles which will equal the FIFO length.
>>
>>You can work out other scenarios for this architecture yourself, for sure.
>>
>>Andy
>
> My crucial point is:
> Is there anyway this altera fifo will break up the stream into another
> stream with even samples ahead of its odd half by 8 samples?

I saw a DC (dual clock) FIFO do something like that once. It was in a Xilinx part, but the design error would apply equally well to an Altera part or an ASIC.

It was part of an IP core that a client had bought. To make a 64 bit wide FIFO, the IP developer had used two 32 bit wide FIFOs in parallel. The two FIFOs had independent control circuits.
Of course, with a dual clock FIFO, one can't really make any guarantees about the depth immediately after an asynchronous reset when the clocks are running, and indeed the two halves of the FIFO would sometimes start with different depths. There was no circuit to check for this state and get them back into sync, and the end result was that, until the next reset, 32 bit chunks of data were swapped around.

Regards,
Allan

Article: 154677
On Dec 15, 8:14 pm, "kaz" <3619@embeddedrelated> wrote:
> >My crucial point is:
> >Is there anyway this altera fifo will break up the stream into another
> >stream with even samples ahead of its odd half by 8 samples?
> >
> >Kaz
>
> Let me rephrase the problem.
> It may not be that the presumed fifo problem is a case of
> underflow/overflow but rather it is a timing problem or both mixed up.

The symptoms look exactly like underflow/overflow.

> dc fifos protect against metastability to some degree but a failure could
> occur. The cross-domain paths are made false by default understandably.
> So is't a case of loss of functionality for some time intermittently that
> has to be accepted. The error stays for several tens of msec then
> disappears. Don't we expect fifos to recover more quickly(its internal
> sync pipeline is set to 3).

According to the dcfifo help, a value of 3 is internally translated to 1, which, for the very high clock rates that you are using, is almost certainly insufficient. Try 4.

> Kaz

Did you pay attention to the DELAY_RDUSEDW/DELAY_WRUSEDW parameters? Altera's default value (1) is unintuitive and, in my experience, tends to cause problems. If you rely on exact values of the rdusedw or wrusedw ports for anything non-trivial, I'd recommend setting the respective DELAY_xxUSEDW to 0.

I'd also set OVERFLOW_CHECKING/UNDERFLOW_CHECKING to "OFF" and do the underflow/overflow prevention in my own logic.

BTW, personally, I wouldn't use Altera's 8-deep FIFOs; they don't appear to be as well tested as their deeper relatives. Or maybe it's just me.

Article: 154678
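As a rough sketch of the settings recommended in the preceding post, the wrapper below instantiates dcfifo with both DELAY_xxUSEDW generics at 0, the internal overflow/underflow checking switched off, and the protection done in user logic instead. The entity name dcfifo_guarded, the 16-word depth and the SYNC_STAGES constant are assumptions made for illustration only; how the megafunction maps the delaypipe number onto real synchroniser flip-flops should be checked against the dcfifo documentation for the Quartus version in use.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library altera_mf;
use altera_mf.all;

-- Hypothetical wrapper: name, depth and port list are illustrative only.
entity dcfifo_guarded is
  port (
    wrclk   : in  std_logic;
    wrreq   : in  std_logic;
    data    : in  std_logic_vector(31 downto 0);
    wrfull  : out std_logic;
    wrusedw : out std_logic_vector(3 downto 0);
    rdclk   : in  std_logic;
    rdreq   : in  std_logic;
    q       : out std_logic_vector(31 downto 0);
    rdempty : out std_logic;
    rdusedw : out std_logic_vector(3 downto 0)
  );
end entity dcfifo_guarded;

architecture rtl of dcfifo_guarded is
  -- Assumed value: the post suggests trying 4 instead of 3.
  constant SYNC_STAGES : natural := 4;

  component dcfifo
    generic (
      delay_rdusedw          : natural;
      delay_wrusedw          : natural;
      intended_device_family : string;
      lpm_numwords           : natural;
      lpm_showahead          : string;
      lpm_type               : string;
      lpm_width              : natural;
      lpm_widthu             : natural;
      overflow_checking      : string;
      rdsync_delaypipe       : natural;
      underflow_checking     : string;
      use_eab                : string;
      wrsync_delaypipe       : natural
    );
    port (
      wrclk   : in  std_logic;
      wrreq   : in  std_logic;
      data    : in  std_logic_vector(31 downto 0);
      wrfull  : out std_logic;
      wrusedw : out std_logic_vector(3 downto 0);
      rdclk   : in  std_logic;
      rdreq   : in  std_logic;
      q       : out std_logic_vector(31 downto 0);
      rdempty : out std_logic;
      rdusedw : out std_logic_vector(3 downto 0)
    );
  end component;

  signal full_i, empty_i  : std_logic;
  signal wrreq_i, rdreq_i : std_logic;
begin
  -- With the built-in checking off, the user logic must block
  -- writes into a full FIFO and reads from an empty one.
  wrreq_i <= wrreq and not full_i;
  rdreq_i <= rdreq and not empty_i;
  wrfull  <= full_i;
  rdempty <= empty_i;

  fifo_i : dcfifo
    generic map (
      delay_rdusedw          => 0,
      delay_wrusedw          => 0,
      intended_device_family => "Stratix IV",
      lpm_numwords           => 16,   -- arbitrary depth for this sketch
      lpm_showahead          => "OFF",
      lpm_type               => "dcfifo",
      lpm_width              => 32,
      lpm_widthu             => 4,
      overflow_checking      => "OFF",
      rdsync_delaypipe       => SYNC_STAGES,
      underflow_checking     => "OFF",
      use_eab                => "ON",
      wrsync_delaypipe       => SYNC_STAGES
    )
    port map (
      wrclk => wrclk, wrreq => wrreq_i, data => data,
      wrfull => full_i, wrusedw => wrusedw,
      rdclk => rdclk, rdreq => rdreq_i, q => q,
      rdempty => empty_i, rdusedw => rdusedw
    );
end architecture rtl;
```

The gating is done in the same clock domain as the flag it uses (wrfull on the write side, rdempty on the read side), so no extra synchronisation is introduced by the wrapper itself.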
Many thanks for your contributions.

The fifo I am using is very basic: 32 bits wide, 8 words deep, no reset, 3 stage synchroniser, write and read connected directly (combinatorially) to full/empty flags, word count not used, clocks (wr/rd) 368/245.

I am trying to get my head deeper into how a fifo might work internally. Assuming the simplest case, I understand the write pointer is clocked by the write clock and increments on write request (counting is binary or Gray). The read pointer mirrors that on the read side.

The signals that cross the clock domain are the empty/full flags (in my case, as I am not using the word counts).

What now mystifies me is this: if anything went wrong, be it a flow issue or timing, wouldn't these counters just increment from wherever they might have landed, implying self recovery? The exception is the case where the read pointer is ahead of the write pointer (as I assume in my case, because each sample is read out correctly but misaligned).

I mean, to get an 8 sample odd/even misalignment I can only think of pointers going crazy, or addresses arriving crazy but regular.

Kaz

Article: 154679
On Dec 16, 12:39 pm, "kaz" <3619@embeddedrelated> wrote:
> Many thanks for your contributions.
>
> The fifo I am using is very basic: 32 bits wide, 8 words deep, no reset,
> 3 stage synchroniser, write and read connected directly(combinatorially)
> to full/empty flags, word count not used, clocks(wr/rd 368/245).
>
> I am trying to put my head deeper into how a fifo might work internally.
> Assuming a simplest case, I understand the write pointer is clocked by
> write clock and it increments on write request (counting is binary or
> Gray).

Gray.

> The read pointer mirrors that on the read side.
>
> The signals that cross the clock domain are the empty/full flag (in my
> case as I am not using the word counts).

No, it does not work like that.
The signals that cross clock domains are:
* write pointer - a resynchronized variant of it is used in the rdclk clock domain to generate rdempty and rdusedw
* read pointer - a resynchronized variant of it is used in the wrclk clock domain to generate wrempty and wrusedw

> What now mystifies me is that if anything went wrong be it flow issue or
> timing then wouldn't these counters just increment from where they might
> have landed implying self recovery, excluding the case that read pointer
> is ahead of write pointer (as assumed in my case because samples are read
> out correctly each but misaligned).

Your write clock is faster than your read clock, so, presumably, your wrreq has a duty cycle below 100%, right?
The exact effect of underflow/overflow will depend on the specific pattern applied to wrreq.

You wrote above that wrreq is connected directly to wrfull. Does that mean that wrreq depends *only* on wrfull, or does the wrreq logic equation have some additional terms?

> I mean to get 8 samples odd/even misalignment I can only think of
> pointers going crazy or address arriving crazy but regular.
>
> Kaz

Article: 154680
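To make the pointer crossing described in the preceding post concrete, here is a minimal sketch of the read-side half only: the write pointer is kept in Gray code, resynchronised into rdclk through two flip-flops, and compared with the local read pointer to produce rdempty. This is a generic textbook arrangement, not Altera's actual dcfifo implementation; the entity name wrptr_cdc_sketch, the two-stage synchroniser and the missing full-flag logic are all simplifying assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch, not the dcfifo internals.
entity wrptr_cdc_sketch is
  generic (AW : positive := 3);  -- address width: 2**AW words
  port (
    wrclk   : in  std_logic;
    wrreq   : in  std_logic;
    rdclk   : in  std_logic;
    rdreq   : in  std_logic;
    rdempty : out std_logic
  );
end entity wrptr_cdc_sketch;

architecture rtl of wrptr_cdc_sketch is
  -- Binary to Gray: adjacent counts differ in exactly one bit,
  -- so a sample taken in the other domain is off by at most one.
  function bin2gray (b : unsigned) return std_logic_vector is
  begin
    return std_logic_vector(b xor shift_right(b, 1));
  end function bin2gray;

  -- Pointers carry one extra wrap bit above the address bits.
  signal wr_bin     : unsigned(AW downto 0) := (others => '0');
  signal rd_bin     : unsigned(AW downto 0) := (others => '0');
  signal wr_gray    : std_logic_vector(AW downto 0) := (others => '0');
  signal wr_gray_s1 : std_logic_vector(AW downto 0) := (others => '0');
  signal wr_gray_s2 : std_logic_vector(AW downto 0) := (others => '0');
  signal empty_i    : std_logic;
begin
  -- Write domain: the pointer advances on every write request.
  -- There is no full flag here, so nothing stops an overflow.
  wr_side : process (wrclk)
  begin
    if rising_edge(wrclk) then
      if wrreq = '1' then
        wr_bin  <= wr_bin + 1;
        wr_gray <= bin2gray(wr_bin + 1);
      end if;
    end if;
  end process wr_side;

  -- Read domain: resynchronise the Gray write pointer, then
  -- advance the local read pointer only while not empty.
  rd_side : process (rdclk)
  begin
    if rising_edge(rdclk) then
      wr_gray_s1 <= wr_gray;
      wr_gray_s2 <= wr_gray_s1;
      if rdreq = '1' and empty_i = '0' then
        rd_bin <= rd_bin + 1;
      end if;
    end if;
  end process rd_side;

  -- Empty means "my pointer equals the delayed copy of yours".
  empty_i <= '1' when bin2gray(rd_bin) = wr_gray_s2 else '0';
  rdempty <= empty_i;
end architecture rtl;
```

Note that rdempty is an equality test against a delayed copy of the write pointer. Once the write pointer has overtaken the read pointer, the two are simply unequal and the read side has no way of telling that the data between them is stale.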
>On Dec 16, 12:39 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
>No, it does not work like that.
>The signals that cross clock domains are:
>* write pointer - resynchronized variant of it is used in the rdclk
>clock domain to generate rdempty and rdusedw
>* read pointer - resynchronized variant of it is used in the wrclk
>clock domain to generate wrempty and wrusedw

I agree that a resynchronised variant of the write pointer will be used to generate rdempty and rdusedw in the other domain, but not for the read pointer itself, i.e. each side has its own pointer.

(snip)
>Your write clock is faster than your read clock, so, supposedly, your
>wrreq has <100% duty cycle, right?
>The exact effect of underflow/overflow will depend on specific pattern
>applied to wrreq.
>
>You wrote above that wrreq is connected directly to wrfull. Does it
>mean that wrreq depends *only* on wrfull or does wrreq logic equation
>has some additional terms?

Yes, the input rate is controlled by valid being active in a regular 2/3 ratio. The read side is always active if the fifo is not empty.

Kaz

Article: 154681
On Dec 16, 2:53 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
> I agree that a resynchronised variant of write pointer will be used to
> generate rdempty and rdusew in other domain but not for read pointer
> itself i.e. each side has its own pointer.

Of course. Each side has its own pointer plus a resynchronised copy of the other side's pointer.

(snip)
> >You wrote above that wrreq is connected directly to wrfull. Does it
> >mean that wrreq depends *only* on wrfull or does wrreq logic equation
> >has some additional terms?
>
> yes the input rate is controlled by valid being active in 2/3 ratio
> regularly.
> The read side is always active if fifo is not empty.
>
> Kaz

yes = *only* wrfull, or yes = additional terms?
If the former, where is wrdata coming from?
Can you post a representative excerpt from your design here?

Article: 154682
Yes, there is an extra term.
Here is an excerpt:

TX_SRX_FIFO_inst : TX_SRX_FIFO
  PORT MAP (
    data    => TX_SRX_FIFO_DATA,
    rdclk   => iCLK245,
    rdreq   => TX_SRX_FIFO_rdreq,
    wrclk   => iCLK368,
    wrreq   => TX_SRX_FIFO_wrreq,
    q       => TX_SRX_FIFO_q,
    rdempty => TX_SRX_FIFO_empty,
    wrfull  => TX_SRX_FIFO_full
    );

    -- 2 in 3 clock enables is used
    TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
    TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;

The clock ratio is 368.64 to 245.76, to be exact.

Kaz

Article: 154683
On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
> Yes there is extra term.
> Here is some excerpt:
(snip)
>     -- 2 in 3 clock enables is used
>     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
>     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>
> the clock ratio is 368.64 to 245.76 to be exact.
>
> Kaz

WOW, I reproduced the behavior that you describe (non-recovery after overflow) in functional simulation with Altera's internal simulator!
I never imagined that anything like that was possible.
Sounds like a bug in the implementation of dcfifo. Of course Altera will call this bug a feature and will say that once there has been an overflow nothing can be guaranteed. Or similar bullsheet.
I am writing a sequential counter and see a pattern like (64, 57, 66, 59, 68, 61, 70, 63...) on the read side.
To be continued...

Article: 154684
On Dec 16, 6:00 pm, Michael S <already5cho...@yahoo.com> wrote:
> On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
> WOW, I reproduced the behavior that you describe (non-recovery after
> overflow) in functional simulation with Altera's internal simulator!
> I never imagined that anything like that is possible.
> Sounds like bug in implementation of dcfifo. Of course Altera will
> call this bug a feature and will say that as long as there was
> overflow nothing could be guaranteed. Or similar bullsheet.
> I am writing sequential counter and see the pattern like (64, 57, 66,
> 59, 68, 61, 70, 63...) on the read side.
> To be continued...

A few more observations:
1. The problem is not limited to the 8-deep DCFIFO. A 16-deep DCFIFO can easily be forced into the same "mad" state.
2. A single write into a full FIFO is not enough to trigger the problem. You have to write to a full FIFO 3 times in a row, which generally should never happen, even in the presence of poorly prevented metastability.
3. So, in order to force the FIFO into the "mad" state you have to apply a stupid sequence on the write side. But once the FIFO is mad, it is the read side that keeps it there. Somehow, it stops correctly detecting the rdempty condition.

What would I do?
1. I'd increase RDSYNC_DELAYPIPE/WRSYNC_DELAYPIPE to 4. It's very unlikely that the problem is here, but for such high clock frequencies the value of 3 is still wrong.
2. I'd start looking for a race condition type of bug, like feeding one clock domain with a vector generated in another clock domain. If you don't know all parts of the design, try looking at the TimeQuest "clock transfers" display. It could be helpful.
3. In the longer run, I'd redesign the whole synchronization block. IMHO, a design whose maximal FIFO read throughput is exactly equal to the *nominal* write throughput is not sufficiently robust. I'd very much prefer the maximal read throughput to be at least 1% higher. Then your FIFO will be close to empty most of the time and the block as a whole will be "self-curing". As an additional benefit, you will have more predictable latency through the FIFO. Even if latency is not important for your main functionality, it makes debugging easier.

Article: 154685
>On Dec 16, 6:00 pm, Michael S <already5cho...@yahoo.com> wrote:
(snip)
>Few more observations:
>1. The problem is not limited to 8-deep DCFIFO. 16-deep DCFIFO could
>be easily forced into the same "mad" state.
>2. A single write into full FIFO is not enough to trigger the problem.
>You have to write to full FIFO 3 times in a row. Which, generally
>should never happen even in presence of poorly prevented
>metastability.
>3. So, in order to force FIFO into "mad" state you have to do stupid
>sequence on the write side. But when FIFO is already mad, it's a read
>side that is keeping it here. Somehow, it stops correctly detecting
>rdempty condition.
(snip)

Thanks so much, Michael. It is great that you thought of simulating the fifo in this mad state. I will try to reproduce that. I assume you are doing functional simulation.
The interesting thing is that we have many fifos in our system but only one is misbehaving.

Kaz

Article: 154686
On Dec 16, 8:02 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
> Thanks so much Michael. It is great that you thought of simulating fifo
> in this mad state. I will try reproduce that. I assume you are doing
> functional simulation.
> The interesting thing is that we got many fifos in our system but only
> one is misbehaving.
>
> Kaz

I thought a bit more about it. As a result, I am taking back everything I said about Altera in the post #13.
Altera's dcfifo is o.k. The access pattern is just too troublesome for overflow recovery; it will cause problems for any reasonable FIFO implementation.
Sorry, Altera, I was wrong.

Now I'll try to explain the problem:
Immediately after overflow the read pointer and write pointer are very close to each other - on one cycle the read pointer pulls ahead of the write pointer and reads a sample from 9 writes ago, on the next cycle it falls behind the write pointer and reads the very last written sample, and then it pulls ahead again, and so on.
It happens because the read machine sees a delayed version of the write pointer, trailing the read pointer by one or two, and so thinks that the FIFO is almost full. And continues to read. The write machine, on the other hand, sees a delayed version of the read pointer, equal to the write pointer or trailing it by one, and so thinks that the FIFO is either empty or almost empty. And continues to write.
Since the average rate of writing is exactly equal to the rate of reading, recovery from this situation can take a lot of time; with a common clock source, recovery might never happen.

The solution?
Ensure that overflow/underflow never happens.
If you can't, at least increase the frequency of the read clock, as suggested in my previous post. A 1% increase is enough.
If that is too hard as well, then slightly modify the write pattern. Instead of "++-++-++-++-" do "+++++++++---+++++++++---". That pattern will guarantee instant overflow/underflow recovery.
If, for some reason, such a modification of the write pattern is impossible, then do a smaller modification, "++++--++++--". This pattern is not safe, but probabilistically it should recover from overflow much faster than yours.

Good luck.

Article: 154687
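The three write patterns named in the preceding post ("++-", "+++++++++---" and "++++--") all keep the same 2/3 duty cycle and differ only in burst length. A minimal sketch of a generator for such a clock enable follows; the entity name wr_enable_pattern and its generics are hypothetical, and the sketch assumes the upstream data source can actually deliver samples in the chosen burst pattern.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical clock-enable generator for the write side.
--   ACTIVE = 2, PERIOD = 3   gives "++-"          (original pattern)
--   ACTIVE = 4, PERIOD = 6   gives "++++--"
--   ACTIVE = 9, PERIOD = 12  gives "+++++++++---"
-- ACTIVE must not exceed PERIOD.
entity wr_enable_pattern is
  generic (
    ACTIVE : positive := 9;   -- consecutive enabled wrclk cycles
    PERIOD : positive := 12   -- total pattern length in wrclk cycles
  );
  port (
    wrclk  : in  std_logic;
    enable : out std_logic
  );
end entity wr_enable_pattern;

architecture rtl of wr_enable_pattern is
  -- Position inside the repeating pattern.
  signal phase : natural range 0 to PERIOD - 1 := 0;
begin
  count : process (wrclk)
  begin
    if rising_edge(wrclk) then
      if phase = PERIOD - 1 then
        phase <= 0;
      else
        phase <= phase + 1;
      end if;
    end if;
  end process count;

  -- Enabled for the first ACTIVE cycles of each period.
  enable <= '1' when phase < ACTIVE else '0';
end architecture rtl;
```

In the design discussed in this thread, such an enable would take the place of the Sync_23_1b(1) term in the wrreq equation and would still be ANDed with "not wrfull".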
Michael S <already5chosen@yahoo.com> wrote:

(snip)
>> >> WOW, I reproduced the behavior that you describe (non-recovery after
>> >> overflow) in functional simulation with Altera's internal simulator!
>> >> I never imagined that anything like that is possible.

(snip)
> I thought a bit more about it. As a result, I am taking back
> everything I said about Altera in the post #13.
> Altera's dcfifo is o.k. The access pattern is just too troublesome for
> overflow recovery, it will cause problems to any reasonable FIFO
> implementation.
> Sorry, Altera, I was wrong.

> Now I'd try to explain the problem:
> Immediately after overflow read pointer and write pointer are very
> close to each other - on one cycle read pointer pulls ahead of write
> pointer and reads sample from 9 writes ago, on the next cycle it fails
> behind write pointer and reads the very last write sample, and then
> again pulls ahead and so on.

Last I knew, FIFOs were supposed to have an almost full and almost empty signal to avoid that problem. Maybe at 7/8 and 1/8.

> It happens because read machine sees delayed version of write pointer,
> trailing read pointer by one or two and then thinks that the FIFO is
> almost full. And continues to read. Write machine, on the other hand,
> sees delayed version of read pointer, equal to write pointer or
> trailing it by one and then thinks that FIFO is either empty or almost
> empty. And continues to write.

If you use the almost full and almost empty, that should leave plenty of margin for such delays. Even more is needed if the signals are processed through software.

> Since average rate of writing is exactly equal to rate of reading the
> recovery from this situation can take a lot of time, in case of common
> clock source recovery could never happen.

Then, only after writing is finished, flush out all the data with the actual empty flag.

-- glen

Article: 154688
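One possible way to obtain the almost full and almost empty margins suggested in the preceding post is to threshold the used-word counts that dcfifo exposes on its wrusedw and rdusedw ports. The sketch below assumes a 16-word FIFO with thresholds at 7/8 and 1/8; the entity name fifo_margin_flags and the generic names are hypothetical, and how the count behaves when the FIFO is completely full should be checked in the dcfifo documentation before relying on it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper: derives margin flags from used-word counts.
entity fifo_margin_flags is
  generic (
    WIDTHU          : positive := 4;   -- width of the usedw ports
    ALMOST_FULL_AT  : natural  := 14;  -- 7/8 of 16 words
    ALMOST_EMPTY_AT : natural  := 2    -- 1/8 of 16 words
  );
  port (
    wrusedw      : in  std_logic_vector(WIDTHU - 1 downto 0);
    rdusedw      : in  std_logic_vector(WIDTHU - 1 downto 0);
    almost_full  : out std_logic;  -- valid in the wrclk domain
    almost_empty : out std_logic   -- valid in the rdclk domain
  );
end entity fifo_margin_flags;

architecture rtl of fifo_margin_flags is
begin
  -- Each flag uses only the count from its own clock domain,
  -- so no additional clock domain crossing is created here.
  almost_full  <= '1' when unsigned(wrusedw) >= ALMOST_FULL_AT  else '0';
  almost_empty <= '1' when unsigned(rdusedw) <= ALMOST_EMPTY_AT else '0';
end architecture rtl;
```

The writer would then stop on almost_full rather than wrfull, and the reader would pause on almost_empty until writing has finished, as described above.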
On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
> Yes there is extra term.
> Here is some excerpt:
>
> TX_SRX_FIFO_inst : TX_SRX_FIFO
>   PORT MAP (
(snip)
>     );
>
>     -- 2 in 3 clock enables is used
>     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
>     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>
> the clock ratio is 368.64 to 245.76 to be exact.
>
> Kaz

I'd also like to see a definition of TX_SRX_FIFO.

Article: 154689
On 14/12/2012 15:38, hamilton wrote:
> On 12/14/2012 2:16 AM, scrts wrote:
>> "Philipp Klaus Krause" wrote in message news:kacoae$25e$1@solani.org...
>> On 10.12.2012 02:04, no one wrote:
>>> I assume the bay area is number one for embedded software engineers,
>>> but where else are the big markets,
>>
>> Southern Germany. Lots of companies seem to be hiring around here,
>> including some bigger ones, such as Bosch and Sick, and a huge number of
>> smaller ones with at most a few hundred employees.
>>
>> Remember to tell that You will probably must speak German...
>>
> Are the EU countries more welcoming to foreign workers than the USA ?

That question is impossible to answer - the EU is not like different states in the US. Different countries will have different attitudes to immigrants from different parts of the world, even though the official rules are probably fairly similar. And there are countries in Europe that are not part of the EU (such as Norway).

> What is the equivalent to the H1-B visa ?
>
> Does each EU country have a different visa requirement ?

You are going to get more useful answers by looking at a more official source of information. I would recommend looking at the website of the embassy for the countries you are interested in - they will give you the exact answer.

In general, if you are a well-qualified US citizen (with no complications like a criminal record), and there is a company in the European country that can offer you a job, then the paperwork should not be a hindrance.

More importantly, you have to consider whether you can live in the country you are looking at. Language is a big issue. In many technical departments of companies in Scandinavia, much of the work is in English. But towards the south of Europe, it is mostly in native languages (except in the bigger and more international companies). Outside work, you can get by with English more easily in Northern Europe and smaller countries, but less so further south and in bigger countries.

A particular point here is that in countries like Norway or the Netherlands, much of the TV is foreign and in English with local subtitles - while in bigger countries like Spain and Italy, far more of the foreign TV is dubbed. Similarly with books and translations. This makes a big difference to how familiar people are with English in everyday life.

Other things like climate, food, culture and general way of life are important, and vary enormously across Europe.

> thanks for any info
>
> hamilton

Article: 154690
>On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
>> Yes there is extra term.
>> Here is some excerpt:
>>
>> TX_SRX_FIFO_inst : TX_SRX_FIFO
>>   PORT MAP (
>>     data    => TX_SRX_FIFO_DATA,
>>     rdclk   => iCLK245,
>>     rdreq   => TX_SRX_FIFO_rdreq,
>>     wrclk   => iCLK368,
>>     wrreq   => TX_SRX_FIFO_wrreq,
>>     q       => TX_SRX_FIFO_q,
>>     rdempty => TX_SRX_FIFO_empty,
>>     wrfull  => TX_SRX_FIFO_full
>>     );
>>
>>     -- 2 in 3 clock enables is used
>>     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
>>     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>>

Hi Michael,

below is definition of fifo. What troubles me is that write/read are tied up to full/empty respectively so I don't see why flow problems should occur. Moreover the write/read is protected internally as well.

Could you also please let me know was it timing simulation that you did?
Thanks

LIBRARY ieee;
USE ieee.std_logic_1164.all;

LIBRARY altera_mf;
USE altera_mf.all;

ENTITY TX_SRX_FIFO IS
    PORT (
        data    : IN STD_LOGIC_VECTOR (31 DOWNTO 0);
        rdclk   : IN STD_LOGIC ;
        rdreq   : IN STD_LOGIC ;
        wrclk   : IN STD_LOGIC ;
        wrreq   : IN STD_LOGIC ;
        q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
        rdempty : OUT STD_LOGIC ;
        wrfull  : OUT STD_LOGIC
    );
END TX_SRX_FIFO;

ARCHITECTURE SYN OF tx_srx_fifo IS

    SIGNAL sub_wire0 : STD_LOGIC ;
    SIGNAL sub_wire1 : STD_LOGIC ;
    SIGNAL sub_wire2 : STD_LOGIC_VECTOR (31 DOWNTO 0);

    COMPONENT dcfifo
    GENERIC (
        intended_device_family : STRING;
        lpm_numwords           : NATURAL;
        lpm_showahead          : STRING;
        lpm_type               : STRING;
        lpm_width              : NATURAL;
        lpm_widthu             : NATURAL;
        overflow_checking      : STRING;
        rdsync_delaypipe       : NATURAL;
        underflow_checking     : STRING;
        use_eab                : STRING;
        wrsync_delaypipe       : NATURAL
    );
    PORT (
        wrclk   : IN STD_LOGIC ;
        rdempty : OUT STD_LOGIC ;
        rdreq   : IN STD_LOGIC ;
        wrfull  : OUT STD_LOGIC ;
        rdclk   : IN STD_LOGIC ;
        q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
        wrreq   : IN STD_LOGIC ;
        data    : IN STD_LOGIC_VECTOR (31 DOWNTO 0)
    );
    END COMPONENT;

BEGIN
    rdempty <= sub_wire0;
    wrfull  <= sub_wire1;
    q       <= sub_wire2(31 DOWNTO 0);

    dcfifo_component : dcfifo
    GENERIC MAP (
        intended_device_family => "Stratix IV",
        lpm_numwords => 8,
        lpm_showahead => "OFF",
        lpm_type => "dcfifo",
        lpm_width => 32,
        lpm_widthu => 3,
        overflow_checking => "ON",
        rdsync_delaypipe => 5,
        underflow_checking => "ON",
        use_eab => "ON",
        wrsync_delaypipe => 5
    )
    PORT MAP (
        wrclk => wrclk,
        rdreq => rdreq,
        rdclk => rdclk,
        wrreq => wrreq,
        data => data,
        rdempty => sub_wire0,
        wrfull => sub_wire1,
        q => sub_wire2
    );

END SYN;

Kaz

---------------------------------------
Posted through http://www.FPGARelated.com
Article: 154691
On Dec 17, 11:17 am, "kaz" <3619@embeddedrelated> wrote:
> >On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
> >> Yes there is extra term.
> >> Here is some excerpt:
> >
> >> TX_SRX_FIFO_inst : TX_SRX_FIFO
> >>   PORT MAP (
> >>     data    => TX_SRX_FIFO_DATA,
> >>     rdclk   => iCLK245,
> >>     rdreq   => TX_SRX_FIFO_rdreq,
> >>     wrclk   => iCLK368,
> >>     wrreq   => TX_SRX_FIFO_wrreq,
> >>     q       => TX_SRX_FIFO_q,
> >>     rdempty => TX_SRX_FIFO_empty,
> >>     wrfull  => TX_SRX_FIFO_full
> >>     );
> >
> >>     -- 2 in 3 clock enables is used
> >>     TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
> >>     TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
>
> Hi Michael,
>
> below is definition of fifo.
> What troubles me is that write/read are tied up to full/empty respectively
> so I don't see why flow problems should occur. Moreover the write/read is
> protected internally as well.
>
> Could you also please let me know was it timing simulation that you did?

I disabled the built-in protections and forcefully overrode the protections in user logic.

rdreq <= (not rdempty or force1_rdreq) and not force0_rdreq;
wrreq <= ((not wrfull and (not xx(1))) or force1_wrreq) and not force0_wrreq;

>
> Thanks
>
> LIBRARY ieee;
> USE ieee.std_logic_1164.all;
>
> LIBRARY altera_mf;
> USE altera_mf.all;
>
> ENTITY TX_SRX_FIFO IS
>     PORT (
>         data    : IN STD_LOGIC_VECTOR (31 DOWNTO 0);
>         rdclk   : IN STD_LOGIC ;
>         rdreq   : IN STD_LOGIC ;
>         wrclk   : IN STD_LOGIC ;
>         wrreq   : IN STD_LOGIC ;
>         q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
>         rdempty : OUT STD_LOGIC ;
>         wrfull  : OUT STD_LOGIC
>     );
> END TX_SRX_FIFO;
>
> ARCHITECTURE SYN OF tx_srx_fifo IS
>
>     SIGNAL sub_wire0 : STD_LOGIC ;
>     SIGNAL sub_wire1 : STD_LOGIC ;
>     SIGNAL sub_wire2 : STD_LOGIC_VECTOR (31 DOWNTO 0);
>
>     COMPONENT dcfifo
>     GENERIC (
>         intended_device_family : STRING;
>         lpm_numwords           : NATURAL;
>         lpm_showahead          : STRING;
>         lpm_type               : STRING;
>         lpm_width              : NATURAL;
>         lpm_widthu             : NATURAL;
>         overflow_checking      : STRING;
>         rdsync_delaypipe       : NATURAL;
>         underflow_checking     : STRING;
>         use_eab                : STRING;
>         wrsync_delaypipe       : NATURAL
>     );
>     PORT (
>         wrclk   : IN STD_LOGIC ;
>         rdempty : OUT STD_LOGIC ;
>         rdreq   : IN STD_LOGIC ;
>         wrfull  : OUT STD_LOGIC ;
>         rdclk   : IN STD_LOGIC ;
>         q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
>         wrreq   : IN STD_LOGIC ;
>         data    : IN STD_LOGIC_VECTOR (31 DOWNTO 0)
>     );
>     END COMPONENT;
>
> BEGIN
>     rdempty <= sub_wire0;
>     wrfull  <= sub_wire1;
>     q       <= sub_wire2(31 DOWNTO 0);
>
>     dcfifo_component : dcfifo
>     GENERIC MAP (
>         intended_device_family => "Stratix IV",
>         lpm_numwords => 8,
>         lpm_showahead => "OFF",
>         lpm_type => "dcfifo",
>         lpm_width => 32,
>         lpm_widthu => 3,
>         overflow_checking => "ON",
>         rdsync_delaypipe => 5,
>         underflow_checking => "ON",
>         use_eab => "ON",
>         wrsync_delaypipe => 5
>     )
>     PORT MAP (
>         wrclk => wrclk,
>         rdreq => rdreq,
>         rdclk => rdclk,
>         wrreq => wrreq,
>         data => data,
>         rdempty => sub_wire0,
>         wrfull => sub_wire1,
>         q => sub_wire2
>     );
>
> END SYN;
>
> Kaz
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com

There is a known bug in Quartus 11.1 that applies to use_eab => "ON". According to the knowledge base, it was fixed in 11.1SP1.
http://www.altera.com/support/kdb/solutions/rd11182011_10.html

There is another known bug that is not fixed in 11.1SP1/11.1SP2, but it only applies to use_eab => "OFF", so you shouldn't care about it.

Looking at your parameters, rdsync_delaypipe => 5 and wrsync_delaypipe => 5 sound like overkill. It's unlikely that a value of 5 really improves anything over a value of 4. And if overflow happens nevertheless (don't ask me how, I don't know), then longer sync pipelines certainly make self-recovery more difficult.
Article: 154692
On Dec 17, 11:17 am, "kaz" <3619@embeddedrelated> wrote:
>
> Could you also please let me know was it timing simulation that you did?
>

No, I did a functional simulation in the Quartus 9.1 internal simulator. The Quartus 9.1 internal simulator can't simulate Stratix IV, so I told it that the device is a Cyclone III. For a functional simulation it should make no difference.
Article: 154693
Sorry, but bubble diagrams of state machines drawn in HDL Designer are not schematics! I would like a tool that converts HDL into a re-arrangeable graphical bubble diagram (they never seem to arrange the diagram the way I intended it) for documentation. HDL source is portable and maintainable long after HDL Designer is obsolete, license expired, etc. And the machine-generated HDL source code is rarely very human readable. AndyArticle: 154694
My current goal is to implement some digital signal processing (filters) on an FPGA. I am currently using Terasic's DE0-Nano board. This board has an ADC128S022 ADC. I have started as follows:

From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL. From this clock I generate the signals required to drive the ADC, essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a 12 bit sample. This translates into 200 ksps. I generate a signal "smpl_rdy" at the appropriate position which allows me to latch the 8 most significant bits.

The main question is how should I do the signal processing:

a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
b) Deriving a new 200 kHz clock from the PLL

I have done some projects on FPGAs but they were quite simple, so I consider myself only a little more than a beginner. I can think of some problems with both approaches, but I may have overlooked many others:

For instance, if I followed a), I guess that Quartus II would think that the processing happens at 25.6 MHz: if there is a long combinational path between registers, the timing analyzer will not be able to figure out that the data and the enable signal are stable during 16 clock cycles. Is there a way to provide this info to Quartus II? OTOH, using the same signal as an enable for everything further down does not seem sound enough, thinking of fanouts. So what?

If I tried to follow b), how would I ensure that there is the proper phase relationship between both clocks? Is there a way to achieve this?

Thanks for any advice.

Pere
Article: 154695
On 19 Dez., 09:32, o pere o <m...@somewhere.net> wrote:
> My current goal is to implement some digital signal processing (filters)
> on a FPGA. I am currently using Terasics DE0 nano board. This board has
> an ADC128S022 ADC. I have started as follows:
>
> From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL.
> From this clock I generate the signals required to drive the ADC,
> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a
> 12 bit sample. This translates into 200 ksps. I generate a signal
> "smpl_rdy" at the appropriate position which allows me to latch the 8
> most significant bits.
>
> The main question is how should I do the signal processing:
>
> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
> b) Deriving a new 200 kHz clock from the PLL

If you do not have that much experience, try to use only _one_ clock for everything in the FPGA; this gives a synchronous design. As this clock is faster than the ADC clock, you can easily treat the signals from the ADC as asynchronous and still capture them properly (oversampling the ready signal lets you determine when the data is stable).
The easiest design is completely synchronous, with only one clock and all inputs treated as asynchronous. The frequencies you mention give no reason why that should not be possible in your case.

25 MHz should not be a big deal for most operations in modern FPGAs if you use some pipelining. If you would like to use an enable and let the operation take several clock cycles, use a "multicycle path" constraint. I have no experience with Quartus, but in general every tool should allow setting multicycle constraints in some way.

best regards Thomas
Article: 154696
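The single-clock approach with an enable, as suggested in the reply above, might look roughly like the following. This is only a minimal sketch: the entity name ce_stage, the port names other than smpl_rdy, and the trivial two-sample sum standing in for the real filter are all assumptions made for illustration, not anything posted in the thread. The multicycle relaxation itself is not expressed in the VHDL; it would be a separate timing constraint (in Quartus's TimeQuest analyzer, the SDC command for this is set_multicycle_path).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: every register runs on the 25.6 MHz clock,
-- but only updates when smpl_rdy is high (one pulse per ADC sample).
entity ce_stage is
  port (
    clk      : in  std_logic;                     -- 25.6 MHz system clock
    smpl_rdy : in  std_logic;                     -- one-clock-wide enable at the 200 kHz sample rate
    sample   : in  std_logic_vector(7 downto 0);  -- 8 MSBs latched from the ADC
    result   : out std_logic_vector(8 downto 0)   -- sum of the current and previous sample
  );
end entity ce_stage;

architecture rtl of ce_stage is
  signal prev : unsigned(7 downto 0) := (others => '0');
  signal sum  : unsigned(8 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- The enable gates the register update, not the clock itself,
      -- so the whole design stays in one clock domain.
      if smpl_rdy = '1' then
        sum  <= resize(unsigned(sample), 9) + resize(prev, 9);
        prev <= unsigned(sample);
      end if;
    end if;
  end process;

  result <= std_logic_vector(sum);
end architecture rtl;
```

Because both the source and destination registers change only on smpl_rdy, the combinational path between them has many clock periods to settle, which is the situation a multicycle constraint is meant to describe to the timing analyzer.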
On 12/19/2012 10:19 AM, Thomas Stanka wrote: > On 19 Dez., 09:32, o pere o <m...@somewhere.net> wrote: >> My current goal is to implement some digital signal processing (filters) >> on a FPGA. I am currently using Terasics DE0 nano board. This board has >> an ADC128S022 ADC. I have started as follows: >> >> From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL. >> From this clock I generate the signals required to drive the ADC, >> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a >> 12 bit sample. This translates into 200 ksps. I generate a signal >> "smpl_rdy" at the appropriate position which allows me to latch the 8 >> most significant bits. >> >> The main question is how should I do the signal processing: >> >> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable >> b) Deriving a new 200 kHz clock from the PLL > > If you have not that much experience try to use only _one_ clock for > everything in the FPGA, this gives a synchronous design. As this clock > is higher than the ADC clock you can easily treat the signals from ADC > as asynchronous, and get them still proper (oversampling of ready, > allows you to determine, when the data is stable). > The easiest design is complete synchronous with only one clock and > considering all inputs as asynchronous. The frequencies you mention > indicate no reason, why that should not be possible in your case. Well, this is just to get started. Once I'm running I will try to speed everything up as much as possible, just to learn something from it :) So, I'd also like to know "the" way to do it right. BTW, I don't understand what you mean saying that I can treat the signals as asynchronous: At a given time point in the ADC serial stream, I generate a 1-clock-wide signal that indicates that the data is ready. In the approach a) I plan to use this signal as an enable for all the registers in the processing path. 
main clock  TTTTTTTTTTTTTTTTTTTTTT....TTTTTTTTTTTTTTT
                _____     _____            _____
ADC clock   ____     _____     ___...._____     ___...
                                             _
smpl_rdy    _________________________________ _____...

From all this, I would say that I am doing a fully synchronous design.

> 25 MHz should not be the big deal for a most operations in modern
> FPGAs if you use some pipelining. In case you would like to use some
> enable and have the operation processing within several clock cycles
> use "multicycle path" constraint. I have no experience with quartus
> but in general every tool should be able to allow setting multi cycle
> constraints in a certain way.

Aha! This is exactly what I was looking for. I would rather not have to pipeline something just because the fitter is trying to meet the 25.6 MHz timing when the true clock is just 200 kHz. With this keyword I will (hopefully) find the way to do it in Quartus.

> best regards Thomas

Thanks!
Pere
Article: 154697
On Dec 19, 9:32 am, o pere o <m...@somewhere.net> wrote:
> My current goal is to implement some digital signal processing (filters)
> on a FPGA. I am currently using Terasics DE0 nano board. This board has
> an ADC128S022 ADC. I have started as follows:
>
> From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL.
> From this clock I generate the signals required to drive the ADC,
> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a
> 12 bit sample. This translates into 200 ksps. I generate a signal
> "smpl_rdy" at the appropriate position which allows me to latch the 8
> most significant bits.
>
> The main question is how should I do the signal processing:
>
> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable
> b) Deriving a new 200 kHz clock from the PLL
>
> I have done some projects on FPGAs but they were quite simple, so I
> consider myself only a little more than a beginner. I can think of some
> problems with both approaches, but I may have overlooked may others:
>
> For instance, if I followed a), I guess that Quartus II would think that
> the processing happens at 25.6 MHz: if there is a long combinational
> path between registers, the timing analyzer will not be able to figure
> out that the data and the enable signal are stable during 16 clock
> cycles. Is there a way to provide this info to Quartus II? OTOH, using
> the same signal as an enable for everything further down does not seem
> sound enough, thinking of fanouts. So what?
>
> If I tried to follow b), how would I ensure that there is the proper
> phase relationship between both clocks? Is there a way to achieve this?
>
> Thanks for any advice.
>
> Pere

Do you need the 25.6MHz or could you do everything synchronous on a 3.2MHz ?
Then, even without telling it that you have multi cycles paths, timing should be easy

But also consider that for a lower clock rate you might need more resources

i.e. filter running at 25.6MHz might only need one mul-acc, where a filter running at 3.2MHz needs 8

-Lasse
Article: 154698
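The resource trade-off in the post above can be put in numbers. As a rough sketch only, assume a direct-form FIR filter with N taps, one multiply-accumulate per tap per output sample, and each MAC unit doing one operation per clock cycle. The tap count N = 128 is a hypothetical value chosen for illustration; it is not given in the thread. The number of MAC units M then scales with the ratio of sample rate to clock rate:

```latex
\[
  M = \left\lceil \frac{N \, f_s}{f_{\mathrm{clk}}} \right\rceil,
  \qquad f_s = 200\ \mathrm{kHz}, \quad N = 128 \ \text{(assumed)}
\]
\[
  f_{\mathrm{clk}} = 25.6\ \mathrm{MHz}: \quad
  \frac{f_{\mathrm{clk}}}{f_s} = 128 \ \text{cycles per sample}
  \;\Rightarrow\; M = \left\lceil \tfrac{128}{128} \right\rceil = 1
\]
\[
  f_{\mathrm{clk}} = 3.2\ \mathrm{MHz}: \quad
  \frac{f_{\mathrm{clk}}}{f_s} = 16 \ \text{cycles per sample}
  \;\Rightarrow\; M = \left\lceil \tfrac{128}{16} \right\rceil = 8
\]
```

Under these assumptions the factor of 8 between the two clock rates shows up directly as a factor of 8 in multipliers, which matches the "one mul-acc versus 8" figure quoted above.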
On 12/19/2012 5:17 AM, o pere o wrote: > On 12/19/2012 10:19 AM, Thomas Stanka wrote: >> On 19 Dez., 09:32, o pere o <m...@somewhere.net> wrote: >>> My current goal is to implement some digital signal processing (filters) >>> on a FPGA. I am currently using Terasics DE0 nano board. This board has >>> an ADC128S022 ADC. I have started as follows: >>> >>> From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL. >>> From this clock I generate the signals required to drive the ADC, >>> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a >>> 12 bit sample. This translates into 200 ksps. I generate a signal >>> "smpl_rdy" at the appropriate position which allows me to latch the 8 >>> most significant bits. >>> >>> The main question is how should I do the signal processing: >>> >>> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable >>> b) Deriving a new 200 kHz clock from the PLL >> >> If you have not that much experience try to use only _one_ clock for >> everything in the FPGA, this gives a synchronous design. As this clock >> is higher than the ADC clock you can easily treat the signals from ADC >> as asynchronous, and get them still proper (oversampling of ready, >> allows you to determine, when the data is stable). >> The easiest design is complete synchronous with only one clock and >> considering all inputs as asynchronous. The frequencies you mention >> indicate no reason, why that should not be possible in your case. > > Well, this is just to get started. Once I'm running I will try to speed > everything up as much as possible, just to learn something from it :) > So, I'd also like to know "the" way to do it right. I don't know that there is a single "right" way to do this. Certainly there are many "wrong" ways. In general it is easy to use a single clock, but as you say, you then have concerns about how to provide the appropriate timing constraint to Quartus. 
I haven't used Quartus in years, but I am sure this is possible. Some folks call this "multi-cycle" timing. Under the Xilinx tools it is simply a separate timing constraint from the clock period specification, and it takes priority on the paths it is specified for.

> BTW, I don't understand what you mean saying that I can treat the
> signals as asynchronous: At a given time point in the ADC serial stream,
> I generate a 1-clock-wide signal that indicates that the data is ready.
> In the approach a) I plan to use this signal as an enable for all the
> registers in the processing path.
>
> main clock  TTTTTTTTTTTTTTTTTTTTTT....TTTTTTTTTTTTTTT
>                 _____     _____            _____
> ADC clock   ____     _____     ___...._____     ___...
>                                              _
> smpl_rdy    _________________________________ _____...
>
> From all this, I would say that I am doing a fully synchronous design.

I think Thomas is saying you can treat the ADC interface as if it were async to the main clock. But your ADC is driven synchronously with your 25.6 MHz clock, so this is not really needed and will likely cost extra logic and work.

>> 25 MHz should not be the big deal for a most operations in modern
>> FPGAs if you use some pipelining. In case you would like to use some
>> enable and have the operation processing within several clock cycles
>> use "multicycle path" constraint. I have no experience with quartus
>> but in general every tool should be able to allow setting multi cycle
>> constraints in a certain way.
>
> Aha! This is exactly what I was looking for. I would rather not have to
> pipeline something just because the fitter is trying to meet the 25.6
> MHz timing when the true clock is just 200 kHz. With this keyword I will
> (hopefully) find the way to do in Quartus.
>
>> best regards Thomas
>
> Thanks!
> Pere

Using a 200 kHz clock should not be any real problem. You just need to consider the timing of the 200 kHz clock when you design your circuit.
An easy way to cross the clock domain boundary is to register the data from the ADC using the 25.6 MHz clock and a 200 kHz enable. Make sure the rising edge of the 200 kHz clock is delayed at least one cycle from this register enable. Then register the data a second time in the 200 kHz clock domain. This will help save some power as well as minimizing your timing analysis issues. RickArticle: 154699
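The two-register crossing described above could be sketched as follows. This is an illustration under stated assumptions, not the poster's code: the entity name adc_to_slow and all port names except smpl_rdy are hypothetical, and the sketch simply assumes that clk_slow is derived from clk_fast (for example by a divider or the PLL) so that its rising edge arrives at least one clk_fast cycle after smpl_rdy, as the post requires.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example of the crossing described above.
entity adc_to_slow is
  port (
    clk_fast  : in  std_logic;                     -- 25.6 MHz clock
    clk_slow  : in  std_logic;                     -- 200 kHz clock; assumed to rise at least
                                                   -- one clk_fast cycle after smpl_rdy
    smpl_rdy  : in  std_logic;                     -- 200 kHz-rate enable in the fast domain
    adc_data  : in  std_logic_vector(7 downto 0);  -- sample assembled in the fast domain
    data_slow : out std_logic_vector(7 downto 0)   -- sample as seen by the 200 kHz logic
  );
end entity adc_to_slow;

architecture rtl of adc_to_slow is
  signal hold_reg : std_logic_vector(7 downto 0) := (others => '0');
begin
  -- First register: fast clock, updated only when a new sample is ready.
  fast_capture : process (clk_fast)
  begin
    if rising_edge(clk_fast) then
      if smpl_rdy = '1' then
        hold_reg <= adc_data;
      end if;
    end if;
  end process fast_capture;

  -- Second register: re-register the held sample in the 200 kHz domain.
  -- hold_reg is stable here only if the phase assumption above holds.
  slow_capture : process (clk_slow)
  begin
    if rising_edge(clk_slow) then
      data_slow <= hold_reg;
    end if;
  end process slow_capture;
end architecture rtl;
```

Everything downstream of data_slow would then be clocked by clk_slow only, so the timing analyzer sees those paths at 200 kHz without any multicycle constraint. Whether the phase assumption really holds depends on how the 200 kHz clock is generated, which the thread leaves open.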
On 12/19/2012 11:55 PM, langwadt@fonz.dk wrote: > On Dec 19, 9:32 am, o pere o <m...@somewhere.net> wrote: >> My current goal is to implement some digital signal processing (filters) >> on a FPGA. I am currently using Terasics DE0 nano board. This board has >> an ADC128S022 ADC. I have started as follows: >> >> From the 50 MHz board reference I derive a 25.6 MHz signal with a PLL. >> From this clock I generate the signals required to drive the ADC, >> essentially a clock at 3.2 MHz. Every 16 clock cycles, the ADC gives a >> 12 bit sample. This translates into 200 ksps. I generate a signal >> "smpl_rdy" at the appropriate position which allows me to latch the 8 >> most significant bits. >> >> The main question is how should I do the signal processing: >> >> a) Using the 25.6 MHz clock and using smpl_rdy as a clock enable >> b) Deriving a new 200 kHz clock from the PLL >> >> I have done some projects on FPGAs but they were quite simple, so I >> consider myself only a little more than a beginner. I can think of some >> problems with both approaches, but I may have overlooked may others: >> >> For instance, if I followed a), I guess that Quartus II would think that >> the processing happens at 25.6 MHz: if there is a long combinational >> path between registers, the timing analyzer will not be able to figure >> out that the data and the enable signal are stable during 16 clock >> cycles. Is there a way to provide this info to Quartus II? OTOH, using >> the same signal as an enable for everything further down does not seem >> sound enough, thinking of fanouts. So what? >> >> If I tried to follow b), how would I ensure that there is the proper >> phase relationship between both clocks? Is there a way to achieve this? >> >> Thanks for any advice. >> >> Pere > > Do you need the 25.6MHz or could you do everything synchronous on a > 3.2MHz ? 
> Then, even without telling it that you have multi cycles paths, timing > should > be easy In this case, I could do everything at a much lower frequency. As I have to generate the signals to control the ADC, my approach has been to start with a system frequency at least 2x the ADC clock frequency that I have to generate. So, I could work with 6.4MHz and timing would be much easier. However, the main point of my question was to learn the proper way to do this. > But also consider that for a lower clock rate you might need more > resources > > i.e. filter running at 25.6MHz might only need one mul-acc, where a > filter > running at 3.2MHz needs 8 That's certainly true! BTW, any inputs on whether using my smpl_rdy as an enable for each register is a good/bad idea? > -Lasse > Thanks for your inputs! Pere