b2508:
>>On 10/22/2015 4:50 PM, b2508 wrote:
>>>> In article <2Z-dnUTCaKvCqLTLnZ2dnUU7-VOdnZ2d@giganews.com>,
>>>> b2508 <108118@FPGARelated> wrote:
>>>>> Hm.. I thought that multiplication cannot be implemented without delay.
>>>>>
>>>>> This could cause timing issues to my knowledge.
>>>>>
>>>>> Moreover, full formula is
>>>>>
>>>>> y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]}
>>>>> e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n]
>>>>>
>>>>> Error is difference between output before and after quantization.
>>>>> I asked for the initial one because even that i don't know how to
>>>>> implement.
>>>>>
>>>>> if x1 appears at t1, corresponding y1 is ready at earliest at t2=t1+1. If I
>>>>> register subtracting operation as well, e1 is available at t3=t1+1.
>>>>> However x2 arrives at t2 and neither y1 (corresponding y[n-1]) or e1 are
>>>>> ready at that time.
>>>>
>>>> Without really looking at your required function in detail (just noting
>>>> that it has feedback terms) - I'll just note in general.
>>>>
>>>> The statement "multiplication cannot be implemented without delay" is
>>>> false, in many ways. It all depends on your processing requirements.
>>>> What is your sample rate? What are your bit widths?
>>>>
>>>> Your processing clock does NOT need to be the same as your sample clock.
>>>> If you wish them to be the same - it may be easier for new FPGA users to
>>>> design - then you MAY be able to run the multiplier full combinational -
>>>> if your sample rate is low enough.
>>>>
>>>> The alternative (at a high level) is to buffer an input and output, and
>>>> process with a faster processing clock. Modern FPGAs these days can run
>>>> DSP functions upwards to around 400-500 MHz. This is likely much faster
>>>> than your sample rate.
>>>>
>>>> Regards,
>>>> Mark
>>>
>>> OK, I was taught that it is always safer to put registers wherever you
>>> can. I have no choice in my project but to have same sampling and
>>> processing rate.
>>>
>>> My rate is 100 MHz.
>>> Input data or x[n] has data format - unsigned, 16 bit, 1 bit for integer.
>>>
>>> Also, I am not sure how to select data widths after each of these
>>> operations.
>>>
>>> If x[n] and x[n-1] are 16/1 and their subtraction is 17 bit unsigned with
>>> 2 bit integers, how do I proceed with data width selection? Feedback loop
>>> part is unclear to me.
>>> Also, should I use DSP48 for the multiplication with P or should I make it
>>> somehow power of two and do it by shifting?
>>>
>>> Q is quantization, or reducing number of samples after all these
>>> operations.
>>
>>Do you know the value of P? Multiplies are done by shifting and adding.
>>I don't know which chip you are planning to use, but all the
>>multipliers I know of require pipelining, the only option is how many
>>stages, 1, 2, etc... Since P is a constant (it *is* a constant, right?)
>>you only need to use adders for the 1s, or if there are long runs of 1s
>>or 0s, you can subtract at the lsb of the run and add in at the bit just
>>past the msb of the run. The point is you may not need to use a built
>>in multiplier.
>>
>>Your filter seems very complex for a feedback filter. Is there some
>>special need driving this? Can you use a simpler filter?
>>
>>--
>>
>>Rick
>
> I do not really know the value of P or how to determine it. I was thinking
> to use 0.99 because I tried it out in software simulation and it seems to
> do what I wanted it to do. The idea for this filter came from this article
> / second filter on Figure 2.
>
> http://www.digitalsignallabs.com/dcblock.pdf
>
> Someone said to forget equations and do as it is drawn in figure, but
> these figures never account for potential latency of the
> add/subtract/multiply blocks or if I do not add registers, then I may have
> timing issues.
>
> Anyway, I will try to do add and multiply in one clock cycle and see where
> this gets me.
>
> Thank you all very much anyhow.
>
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com

Here's a neat way to do it (if using Xilinx parts) and quite a lot of
helpful discussion:

www.xilinx.com/support/documentation/white_papers/wp279.pdf

MK

Article: 158351
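On the data-width question in the quoted post above, a small worked check may help. This is only a sketch: it treats the 16-bit unsigned samples as raw integers (the position of the binary point does not change the bit count), and the helper name `signed_bits_for_range` is made up for illustration. It suggests the difference x[n] - x[n-1] needs 17 bits as a *signed* value rather than unsigned, since the result can be negative.

```python
def signed_bits_for_range(lo, hi):
    """Smallest two's-complement width that holds every integer in [lo, hi]."""
    bits = 1
    while not (-(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits


X_BITS = 16                    # unsigned input width quoted in the post
x_max = (1 << X_BITS) - 1      # 65535 as a raw integer

# x[n] - x[n-1] spans -65535 .. +65535, so a sign bit is required.
diff_bits = signed_bits_for_range(-x_max, x_max)
print(diff_bits)  # 17
```

The same helper can be applied to each later node in the feedback loop once the worst-case range at that node is known.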
On Fri, 23 Oct 2015 02:51:42 -0500, b2508 wrote:

[snip]

> I do not really know the value of P or how to determine it. I was
> thinking to use 0.99 because I tried it out in software simulation and
> it seems to do what I wanted it to do. The idea for this filter came
> from this article / second filter on Figure 2.
>
> http://www.digitalsignallabs.com/dcblock.pdf
>
> Someone said to forget equations and do as it is drawn in figure, but
> these figures never account for potential latency of the
> add/subtract/multiply blocks or if I do not add registers, then I may
> have timing issues.
>
> Anyway, I will try to do add and multiply in one clock cycle and see
> where this gets me.
>
> Thank you all very much anyhow.

I have successfully implemented a DC blocker in VHDL based on the
information in that paper. It was less than a page of VHDL.

Instead of multiplying by something like 0.99, multiply by (1-1/(2**N))
(for some fixed integer N, e.g. 6 or 7). This can be done with just a
shift and subtract, and the low pass filter can be done all in one clock
cycle.

This is a DC blocker, after all, and the position of the pole probably
isn't all that critical.

Regards,
Allan

Article: 158352
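As a rough integer model of the shift-and-subtract idea in Allan's post above (not his VHDL, which is not shown in the thread), the multiply by (1 - 2**-N) reduces to `y - (y >> N)`. The function names and the choice N = 6 are assumptions for illustration. This is also an instance of the run-of-1s trick quoted earlier in the thread: 1 - 2**-6 is 0.111111 in binary.

```python
def times_pole(y, n):
    """Multiply y by (1 - 2**-n) using one shift and one subtract."""
    return y - (y >> n)   # >> on Python ints is an arithmetic (floor) shift


def dc_block(samples, n=6):
    """y[k] = x[k] - x[k-1] + (1 - 2**-n) * y[k-1], all in integers."""
    prev_x = 0
    y = 0
    out = []
    for x in samples:
        y = x - prev_x + times_pole(y, n)
        prev_x = x
        out.append(y)
    return out


print(dc_block([6400, 6400, 6400]))  # [6400, 6300, 6202]
```

Because the shift truncates, a positive residue smaller than 2**N stops decaying in this simple form (`times_pole(63, 6)` is still 63). That is a property of this sketch; the error term e[n] in the quoted formula appears intended to deal with this kind of quantization effect.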
On Fri, 23 Oct 2015 02:51:42 -0500, b2508 wrote:

[snip]

> I do not really know the value of P or how to determine it. I was
> thinking to use 0.99 because I tried it out in software simulation and
> it seems to do what I wanted it to do. The idea for this filter came
> from this article / second filter on Figure 2.
>
> http://www.digitalsignallabs.com/dcblock.pdf
>
> Someone said to forget equations and do as it is drawn in figure, but
> these figures never account for potential latency of the
> add/subtract/multiply blocks or if I do not add registers, then I may
> have timing issues.
>
> Anyway, I will try to do add and multiply in one clock cycle and see
> where this gets me.

So, the set of equations that I suggested elsewhere pretty much implement
what you claim to want, you can use d = 2^-N, which implies a shift, and
-- as I stated -- you can use some delay in the "servo to average value"
step. So what's your problem again?

Equations reiterated:

u: input
y: output
x: state variable

y[n] = u[n] - x[n-1]
x[n] = d * y[n]

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Article: 158353
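A minimal integer sketch of the "servo to average value" form in Tim's post above, with d = 2**-N. One loud assumption: the state update is modelled here as an accumulation, x[n] = x[n-1] + d*y[n], because that is what makes x track the average of the input; the equations as posted show x[n] = d * y[n], so the extra x[n-1] term is this sketch's reading, not a quote. The accumulator `acc` holds x scaled by 2**N so that no fraction bits are thrown away.

```python
def dc_servo(samples, n=6):
    """y[k] = u[k] - x[k-1];  x[k] = x[k-1] + 2**-n * y[k]  (accumulation assumed)."""
    acc = 0                    # x scaled by 2**n
    out = []
    for u in samples:
        y = u - (acc >> n)     # subtract the current estimate of the average
        acc += y               # x += y * 2**-n, expressed in the scaled domain
        out.append(y)
    return out


settled = dc_servo([1000] * 2000)
print(settled[0], settled[-1])  # 1000 0
```

With the state kept at full precision, a constant input is driven to exactly zero at the output, and the only arithmetic per sample is one subtract, one add and a fixed shift.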
>
>Apologies but can I ask if this reply is from an ASIC mindset or is it
>spam. Almost every statement doesn't make any sense whatsoever for FPGAs.
>
>Zak
>---------------------------------------
>Posted through http://www.FPGARelated.com

Valid question. It is from an ASIC mindset and has nothing to do with
FPGAs. This entire thread has nothing to do with FPGAs.

Xilinx recommends that you never use an asynchronous reset in your logic.
It's expensive (a sync reset flop takes 3 LUTs while an async one takes
4). It's slow (Xilinx has an app note on RTL programming styles that says
async resets can slow you down by a factor as high as 4). All those really
fast multiplier macros use sync reset, so if you ask for async then it must
build them out of LUTs. It is also redundant. Your programming input is
asynchronous and leaves every flop in the design in a known state. That is
your power-on asynchronous reset system. Adding a second one in logic
makes no sense at all.

I haven't read the Altera docs but I suspect the same will apply.

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 158354
>zak wrote:
>
>I just found a condition where you could have an un-synchronized input on a
>product that has been in manufacture for a decade. Normally, this signal IS
>synchronized on the FPGA, but there was a particular configuration involving
>multiple boards where it would be synched from the OTHER board, only.
>OOPS!
>
>Jon

That goes to show that once you ship your product then it probably will
not see many resets during its lifetime. I've done embedded systems that
are always on and I suspect that few of them would ever see more than a
few dozen resets before they are trashed. Designers worry about resets all
the time but if you screw it up then it is very unlikely that the customer
will trigger a failure and extremely unlikely that they will do so twice
in a row.

Turning it back off and on again can hide a lot of mistakes.

John Eaton

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 158355
In article <K-6dnfZewNs-8LfLnZ2dnUU7-IudnZ2d@giganews.com>,
jt_eaton <84408@FPGARelated> wrote:
>
>>
>>Apologies but can I ask if this reply is from an ASIC mindset or is it
>>spam. Almost every statement doesn't make any sense whatsoever for FPGAs.
>>
>>Zak
>>---------------------------------------
>>Posted through http://www.FPGARelated.com
>
>Valid question. It is from an ASIC mindset and has nothing to do with
>FPGAs. This entire thread has nothing to do with FPGAs.
>
>Xilinx recommends that you never use an asynchronous reset in your logic.
>It's expensive (a sync reset flop takes 3 LUTs while an async one takes
>4). It's slow (Xilinx has an app note on RTL programming styles that says
>async resets can slow you down by a factor as high as 4). All those really
>fast multiplier macros use sync reset, so if you ask for async then it must
>build them out of LUTs. It is also redundant. Your programming input is
>asynchronous and leaves every flop in the design in a known state. That is
>your power-on asynchronous reset system. Adding a second one in logic
>makes no sense at all.
>
>I haven't read the Altera docs but I suspect the same will apply.

A few nits, but the general gist is all good. Async resets should be
MINIMIZED in Xilinx logic - but "never" isn't usually attainable. Even in
some Xilinx app notes they're used. Why? Well, using an async reset on a
flop removes the requirement of a good clock. That flop will be reset
irrespective of the clock state. So, many times in your clock and reset
generation blocks, you'll likely need the async resets on flops.

I also warn to be careful with assuming FPGA Config = Reset. They're
often NOT the same, and just relying on the former can lead to some
rather nasty and long debug sessions. My point here is that specifically
including a SYNCHRONOUS reset in MOST of your logic is a wise design
practice. (Something Xilinx discourages for optimal FPGA use, but IMHO,
they're just dead wrong.)

Regards,
Mark

Article: 158356
In article <-Jadnfhk6Ivf6bfLnZ2dnUU7-fednZ2d@giganews.com>,
jt_eaton <84408@FPGARelated> wrote:
>>zak wrote:
>>
>>I just found a condition where you could have an un-synchronized input on a
>>product that has been in manufacture for a decade. Normally, this signal IS
>>synchronized on the FPGA, but there was a particular configuration involving
>>multiple boards where it would be synched from the OTHER board, only.
>>OOPS!
>>
>>Jon
>
>That goes to show that once you ship your product then it probably will
>not see many resets during its lifetime. I've done embedded systems that
>are always on and I suspect that few of them would ever see more than a
>few dozen resets before they are trashed. Designers worry about resets all
>the time but if you screw it up then it is very unlikely that the customer
>will trigger a failure and extremely unlikely that they will do so twice
>in a row.
>
>Turning it back off and on again can hide a lot of mistakes.

John, I agree with your assertion, but not necessarily your conclusions.
Yes, resets may not occur very often in your devices. But that, IMHO, is
all the more reason to get it right. Murphy's law says that that one
customer will hit the troublesome case more often. And since it's so
fiddly and hardly used, you don't have many datapoints on failures.
Meaning debugging's a bitch.

I pay close attention to reset and initialization. Been bitten by too
many low probability errors that occur much too late in the design cycle.
They're the devil to debug.

Regards,
Mark

Article: 158357
David,

I like your explanation of mapping the field elements to something
abstract, like 0, 1, a, b, c, d, e, f. I've only seen one textbook that
mentioned something like this (it used Greek letters) and I found it very
helpful to realize that the field elements are only arbitrarily *mapped*
to integers and are not equivalent to them.

I don't use ROMs for field multiplication. A GF(2048) multiplier takes
about 64 6-input LUTs, as I recall, with about 3 levels of logic. A fixed
multiplier (multiplying by a constant) is more like 20 LUTs. I use ROMs
for division (find the reciprocal), cubing, etc. If you use ROMs for
multiplication it induces a lot of latency, because you have 2-3 cycles
for each blockRAM, and you have to do a log and antilog lookup, so you
can end up with 5+ cycles of latency.

Kevin

Article: 158358
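To make the two multiplier styles in Kevin's post above concrete, here is a small software sketch of GF(2^11) multiplication done both ways: as shift-and-XOR logic (the kind of thing that maps to LUTs) and via log/antilog tables (the ROM approach). The field polynomial is not given in the thread, so x^11 + x^2 + 1 (`0x805`) is an assumption, and all function names are made up for illustration.

```python
def gf_mul(a, b, m=11, poly=0x805):
    """Shift-and-XOR multiply in GF(2^m); poly includes the x^m term."""
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a          # add (XOR) the current partial product
        b >>= 1
        a <<= 1                  # multiply a by x ...
        if a & (1 << m):
            a ^= poly            # ... and reduce modulo the field polynomial
    return result


def build_tables(m=11, poly=0x805):
    """Log and antilog tables for the generator x (the element 2)."""
    size = (1 << m) - 1
    antilog = [0] * size
    log = [0] * (size + 1)       # log[0] is unused: zero has no logarithm
    value = 1
    for i in range(size):
        antilog[i] = value
        log[value] = i
        value = gf_mul(value, 2, m, poly)
    return log, antilog


def gf_mul_tables(a, b, log, antilog):
    """Multiply via two log lookups, an add mod (2^m - 1), and one antilog lookup."""
    if a == 0 or b == 0:
        return 0
    return antilog[(log[a] + log[b]) % len(antilog)]


LOG, ANTILOG = build_tables()
print(gf_mul(2, 1 << 10))  # 5, since x * x^10 = x^11 = x^2 + 1
```

In hardware, each table lookup in `gf_mul_tables` would be a registered memory read, which is where the extra cycles of latency come from; `gf_mul` corresponds to purely combinational XOR logic.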
On 10/23/2015 1:25 PM, jt_eaton wrote:
>>
>> Apologies but can I ask if this reply is from an ASIC mindset or is it
>> spam. Almost every statement doesn't make any sense whatsoever for FPGAs.
>>
>> Zak
>> ---------------------------------------
>> Posted through http://www.FPGARelated.com
>
> Valid question. It is from an ASIC mindset and has nothing to do with
> FPGAs. This entire thread has nothing to do with FPGAs.
>
> Xilinx recommends that you never use an asynchronous reset in your logic.
> It's expensive (a sync reset flop takes 3 LUTs while an async one takes
> 4). It's slow (Xilinx has an app note on RTL programming styles that says
> async resets can slow you down by a factor as high as 4). All those really
> fast multiplier macros use sync reset, so if you ask for async then it must
> build them out of LUTs. It is also redundant. Your programming input is
> asynchronous and leaves every flop in the design in a known state. That is
> your power-on asynchronous reset system. Adding a second one in logic
> makes no sense at all.
>
> I haven't read the Altera docs but I suspect the same will apply.

I think you are a bit confused about the async reset. Every FF in the
fabric is connected to an async reset signal that spans the entire chip.
This is called (oddly enough) the Global Set/Reset or GSR in Xilinx
devices. It is always asserted during configuration and can also be
driven by a user signal.

An async reset can be critical to restore a defined state to the design
in the event the clock stops. This signal is usually pretty slow and
may be hard to get to meet system synchronous timing requirements. The
trick to proper use is to make sure there are other controls that
maintain a reset state for the design when the async signal is removed.
These synchronous resets can apply to local sections of logic which
must start up synchronized while usually the entire design does not.
Then timing is more easily met.

--

Rick

Article: 158359
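The "other controls that maintain a reset state ... when the async signal is removed" in Rick's post above can be illustrated with a behavioural sketch of one common arrangement: a short chain of flops that is forced asynchronously and released synchronously. This is an assumption about what such a control might look like, not something spelled out in the post; the class name and the two-stage default are illustrative, and the model ignores metastability and real timing.

```python
class ResetSynchronizer:
    """Async-assert, sync-deassert reset modelled as a chain of flops."""

    def __init__(self, stages=2):
        self.flops = [1] * stages        # come up with reset asserted

    def async_assert(self):
        # The async reset forces every stage at once; no clock edge is needed.
        self.flops = [1] * len(self.flops)

    def clock(self, async_reset_active):
        if async_reset_active:
            self.async_assert()          # still held in reset
        else:
            # After release, a 0 shifts in one stage per clock edge.
            self.flops = [0] + self.flops[:-1]

    @property
    def sync_reset(self):
        # Local reset seen by the downstream logic.
        return self.flops[-1]


rs = ResetSynchronizer()
rs.async_assert()
rs.clock(False)
print(rs.sync_reset)  # 1: still in reset one clock after release
rs.clock(False)
print(rs.sync_reset)  # 0: released on a clock edge
```

The point of the arrangement is that the downstream logic leaves reset on a clock edge of its own clock, regardless of when the slow async signal was actually removed.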
I think I need a quad-port blockRAM in a Xilinx V7. Having multiple read
ports is no problem, but I need two read ports and two write ports. The
two write ports are the problem. I can't double the clock speed. To be
clear, I need to be able to do two reads and two writes per cycle. (Not
writes to the same address.)

The only idea I could come up with is to have four dual-port BRAMs and a
semaphore array. Let's call the BRAMs AC, AD, BC, and BD. Writer A writes
the same value to address x in AC and AD and simultaneously sets the
semaphore of address x to point to 'A'. Now when reader C wants to read
address x, it reads AC and BC and the semaphore, sees that the semaphore
points toward the A side, and uses the value from AC and discards BC. If
writer B writes to address x, it writes the value to both BC and BD and
sets the semaphore x to point to side B. Reader D reads AD and BD and
picks one based on the semaphore bit.

The semaphore itself is complicated. I think it would consist of 2
quad-port RAMs, one bit wide and the depth of AC, each one having 1 write
and 3 read ports. This could be distributed RAM. Writer A would read the
side B semaphore bit and set its own to the same, and writer B would read
the side A bit and set its own to the opposite. Now when reader C or D
read their two copies (A/B) of the semaphore bits using their read ports,
they check if they are the same (use side A) or opposite (use side B).

It's a big mess and uses 4x the BRAMs as a dual-port. Maybe I need a
different solution.

Article: 158360
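The four-BRAM scheme with two semaphore arrays described in the quad-port post above can be sketched as a behavioural model. This is only a software illustration under stated assumptions: reads are modelled with zero latency, the class and method names are invented here, and each `cycle()` call stands for one clock in which writers A and B may each write once, to different addresses.

```python
class QuadPortRam:
    """Behavioural model: write ports A and B, read ports C and D."""

    def __init__(self, depth):
        # One data bank per (writer, reader) pair: AC, AD, BC, BD.
        self.bank = {name: [0] * depth for name in ("AC", "AD", "BC", "BD")}
        # One semaphore bit per address for each writer.
        self.sem_a = [0] * depth
        self.sem_b = [0] * depth

    def cycle(self, write_a=None, write_b=None):
        """Apply up to two (addr, value) writes; the addresses must differ."""
        if write_a is not None and write_b is not None and write_a[0] == write_b[0]:
            raise ValueError("both writers hit the same address")
        if write_a is not None:
            addr, value = write_a
            self.bank["AC"][addr] = self.bank["AD"][addr] = value
            self.sem_a[addr] = self.sem_b[addr]       # equal bits: side A is newest
        if write_b is not None:
            addr, value = write_b
            self.bank["BC"][addr] = self.bank["BD"][addr] = value
            self.sem_b[addr] = self.sem_a[addr] ^ 1   # differing bits: side B is newest

    def _read(self, reader, addr):
        side = "B" if self.sem_a[addr] ^ self.sem_b[addr] else "A"
        return self.bank[side + reader][addr]

    def read_c(self, addr):
        return self._read("C", addr)

    def read_d(self, addr):
        return self._read("D", addr)


ram = QuadPortRam(2048)
ram.cycle(write_a=(5, 111), write_b=(9, 222))
print(ram.read_c(5), ram.read_d(9))  # 111 222
ram.cycle(write_b=(5, 333))
print(ram.read_c(5), ram.read_d(5))  # 333 333
```

Each writer only ever modifies its own semaphore array, which is why each array needs a single write port, plus read ports for the other writer and for the two readers.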
Update: I found a solution in the "Altera Synthesis Cookbook" and it
seems to be the scheme I described above, but implementing the semaphore
bits as FFs instead of distributed RAM. I'd need about 2048 semaphore
bits, so implementing that in a distributed RAM would probably be
advantageous. You can do a 64-bit quad port (1 wr, 3 rd) in a 4-LUT
slice, so I'd need 2048/64*4*2 = 256 LUTs to do 2 2048-bit quad-port
distributed RAMs. (Add in ~10 slices for 32->1 muxes.)

Article: 158361
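The LUT estimate in the update above is simple enough to spell out. The constants below are taken from the post; the variable names are just labels for this worked check.

```python
SEMAPHORE_BITS = 2048   # one semaphore bit per address
BITS_PER_RAM = 64       # depth of one quad-port distributed RAM
LUTS_PER_RAM = 4        # LUTs per 64-bit quad-port RAM (1 wr, 3 rd), per the post
COPIES = 2              # one semaphore array per writer

rams_per_copy = SEMAPHORE_BITS // BITS_PER_RAM        # also the mux width per read port
total_luts = rams_per_copy * LUTS_PER_RAM * COPIES
print(rams_per_copy, total_luts)  # 32 256
```

The 32 RAMs per copy are also why each read port needs a 32->1 mux to select the addressed bit.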
In article <n0dvqo$m3p$1@dont-email.me>, rickman <gnuarm@gmail.com> wrote:
>On 10/23/2015 1:25 PM, jt_eaton wrote:
>> [snip]
>
>I think you are a bit confused about the async reset. Every FF in the
>fabric is connected to an async reset signal that spans the entire chip.
>This is called (oddly enough) the Global Set/Reset or GSR in Xilinx
>devices. It is always asserted during configuration and can also be
>driven by a user signal.
>
>An async reset can be critical to restore a defined state to the design
>in the event the clock stops. This signal is usually pretty slow and
>may be hard to get to meet system synchronous timing requirements. The
>trick to proper use is to make sure there are other controls that
>maintain a reset state for the design when the async signal is removed.
>These synchronous resets can apply to local sections of logic which
>must start up synchronized while usually the entire design does not.
>Then timing is more easily met.

I've got trouble thinking how the GSR could be used for any useful thing
by the designer. The trouble with the GSR is that its removal edge is
slow, and asynchronous to everything. So flops may reset fine with this
signal, but coming out of reset, the state's still unknown. The parts of
a design that can make reliable use of this are a rare exception, not the
rule, IMHO.

And back to the OP - I believe he was NOT referring to this GSR at all,
but to the normal async reset on a FF, i.e. (verilog)

    always @( posedge clk or posedge reset )
      if( reset )
        q <= 0;
      else
        q <= d;

This is the async flip flop, where clock recovery/removal timing checks
must be done, as was asked in the start of the thread.

Remove the "posedge reset" from the sensitivity list, and it's a
synchronous reset, with just normal setup/hold checks. Synchronous
resets are preferred and much more efficient in today's FPGAs.

Regards,
Mark

Article: 158362
On 10/23/2015 4:32 PM, Mark Curry wrote:
> In article <n0dvqo$m3p$1@dont-email.me>, rickman <gnuarm@gmail.com> wrote:
>> [snip]
>
> I've got trouble thinking how the GSR could be used for any useful thing
> by the designer. The trouble with the GSR is that its removal edge
> is slow, and asynchronous to everything. So flops may reset fine with
> this signal, but coming out of reset, the state's still unknown. The
> parts of a design that can make reliable use of this are a rare exception,
> not the rule, IMHO.
>
> And back to the OP - I believe he was NOT referring to this GSR at all, but
> to the normal async reset on a FF, i.e. (verilog)
>
>     always @( posedge clk or posedge reset )
>       if( reset )
>         q <= 0;
>       else
>         q <= d;
>
> This is the async flip flop, where clock recovery/removal timing checks
> must be done, as was asked in the start of the thread.
>
> Remove the "posedge reset" from the sensitivity list, and it's a
> synchronous reset, with just normal setup/hold checks. Synchronous
> resets are preferred and much more efficient in today's FPGAs.

If the OP is talking about using the async reset of individual FFs, I
can offer no advice. I have never encountered a situation where I
needed that.

--

Rick

Article: 158363
It has been mentioned here a few times that there are no FPGAs with
internal tristates. However, surfing around somewhat aimlessly, I
stumbled upon a device series which has them and obviously had to share
my find:

The Atmel AT40K and AT40KAL series.

OTOH, their gate delay is pretty big: around 1.2 ns, and their SRAM read
times aren't much better at about 10 ns. I didn't dig much deeper into
the datasheets so far.

Article: 158364
On 10/24/2015 10:38 AM, Aleksandar Kuktin wrote:
> It has been mentioned here a few times that there are no FPGAs with
> internal tristates. However, surfing around somewhat aimlessly, I
> stumbled upon a device series which has them and obviously had to share
> my find:
>
> The Atmel AT40K and AT40KAL series.
>
> OTOH, their gate delay is pretty big: around 1.2 ns, and their SRAM read
> times aren't much better at about 10 ns. I didn't dig much deeper into
> the datasheets so far.

This is a truly ancient FPGA family and you would use them at your own
risk. I'd be very surprised if they didn't come with a caution: Not for
New Designs. I think they are at least 15 years old, no? I haven't
looked them up, but I bet they are pretty expensive too.

--

Rick

Article: 158365
On 10/24/2015 10:55 AM, rickman wrote:
> On 10/24/2015 10:38 AM, Aleksandar Kuktin wrote:
>> It has been mentioned here a few times that there are no FPGAs with
>> internal tristates. However, surfing around somewhat aimlessly, I
>> stumbled upon a device series which has them and obviously had to share
>> my find:
>>
>> The Atmel AT40K and AT40KAL series.
>>
>> OTOH, their gate delay is pretty big: around 1.2 ns, and their SRAM read
>> times aren't much better at about 10 ns. I didn't dig much deeper into
>> the datasheets so far.
>
> This is a truly ancient FPGA family and you would use them at your own
> risk. I'd be very surprised if they didn't come with a caution: Not for
> New Designs. I think they are at least 15 years old, no? I haven't
> looked them up, but I bet they are pretty expensive too.
>

If you want another old FPGA with internal tristate that is still in
production, check out Xilinx Spartan 2. It's also 5V tolerant...

--
Gabor

Article: 158366
On 10/24/2015 04:38 PM, Aleksandar Kuktin wrote:
> It has been mentioned here a few times that there are no FPGAs with
> internal tristates. However, surfing around somewhat aimlessly, I
> stumbled upon a device series which has them and obviously had to share
> my find:
>
> The Atmel AT40K and AT40KAL series.
>
> OTOH, their gate delay is pretty big: around 1.2 ns, and their SRAM read
> times aren't much better at about 10 ns. I didn't dig much deeper into
> the datasheets so far.
>

A while back I tried verifying the available documentation against
the bitstreams their IDS software produces. I got most of the details
figured out. Here it is:
<https://github.com/klammerj/iverilog/tree/master/tgt-slick/util/wigglepuppy>

Article: 158367
All of us learn our craft starting off with small designs that we take
through each design stage all the way to a working product. But real life
is different. Real designs are far too large for any single engineer to do
the entire thing, so we use several engineers, each one assigned to do
a different task.

Those tasks are:

Component Designers who create the leaf cell designs

Architects who select the leaf cells, configure and interconnect them

Board designers who design the printed circuit assembly

When you work by yourself it is tempting to look ahead and try to take
care of issues that will arise in the next phase of the design process. DO
NOT DO THIS. While I am sure that you are a competent engineer and that
the engineer who was assigned that task will be happy that you decided to
butt in and do his job for him, you should remember that some people are
sensitive about this.

If it's not your job then you don't get to make those decisions.

>I've got trouble thinking how the GSR could be used for any useful
>thing by the designer. The trouble with the GSR is that its
>removal edge is slow, and asynchronous to everything. So flops may
>reset fine with this signal, but coming out of reset, the state's
>still unknown. The parts of a design that can make reliable use of
>this are a rare exception, not the rule, IMHO.
>

The way it works is that you need to do an asynchronous assert and a
synchronous deassert. Your assert resets all user flops but sets all the
flops in your synchronous reset system into their assert state. Your user
flop will clear to 0 and it will have a 0 on its D input from the
synchronous reset system. Nothing will happen when you deassert your
asynchronous reset because you will still have the synchronous one active.
That one will deassert over time and will be synchronous to the flop's
clock. You design it by finding out the longest path from your
asynchronous input pin to every flop in your chip.
Double that time (we like to sandbag our designs) and divide it by your
reference clock period. That gives you the number of clock cycles that you
have to maintain the synchronous reset after you deassert your
asynchronous reset. That takes care of your slow removal because each flop
will still be controlled by the synchronous reset.

>
>always @( posedge clk or posedge reset )
>  if( reset )
>    q <= 0;
>  else
>    q <= d;
>

You see this a lot in designs. Component designers are obsessed with
making sure that their designs get reset, so they build every flop with an
asynchronous reset port and connect them all directly to the board's
power-on reset signal.

There's just one little problem. Component designers do not own the reset
system design. It now belongs to other engineers and you are doing their
job for them. They are not happy. With a top-down design they tell you
what to do; you do not get to tell them what to do.

What should you do instead?

always @( posedge clk or negedge reset_n )
  if ( !reset_n )
    q <= 0;
  else if ( reset )
    q <= 0;
  else
    q <= d;

If you do this then the architects are no longer locked into your reset
design. If they want it then they can tie the reset inactive and they have
your design. They could also tie the reset_n inactive and have a
completely synchronous design, or they could, as I have suggested, use
both. If you use both then the design is DFT clean and you are not messing
up your timing the way your design does.

So who does own the reset system design?

The board designer

The board designer is responsible for creating the board's power-on reset
signal. They look at all the power supplies and debounce the reset button
to create a single active-low open-drain signal. They then route that
signal to every chip on the board. If a chip doesn't have a reset input
then they route it through the chip controlling it. They stitch together
the product's reset system from the reset systems of all the chips on the
board.
Finally they add the ESD filtering and the design is complete. The board
designer has the complete engineering responsibility for the product's
power-on reset system.

Now here's where the problem comes in. Component designers are obsessed
with making sure that each flop is reset while the reset button is
depressed. But we surveyed every board designer who had ever done that job
and none of them were worried about that at all. Sure, it gets screwed up
every now and then, but our digital tools are good enough that you will
always find the problem and fix it before release. It's not a problem.

What is a problem, and what keeps the board designers up at night, is ESD.
You can't even begin to test for that until you have real silicon, PCBs
and case parts. By the time you get those you are well past the point
where you should be making changes to your design.

So your boss is obsessed with making sure that the product does not reset
while the reset button is NOT pressed, and you are designing to the exact
opposite requirement.

That's why I get so annoyed with engineers who insist that a complete
asynchronous reset system is the only way to go. For one thing, there are
other equivalent options that they have not considered, and second of all,
they do not get to decide how to do it. It's not their job.

John Eaton
---------------------------------------
Posted through http://www.FPGARelated.com

Article: 158368
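To put numbers on the sizing rule in the reset post above (double the longest asynchronous path, then divide by the reference clock period), here is a minimal Python sketch used purely as a calculator. The function name `sync_reset_hold_cycles`, the `margin` parameter and the example figures are illustrative assumptions, not values from the post. Rounding up to a whole clock cycle is also an assumption, since the post does not say how to round.

```python
import math


def sync_reset_hold_cycles(longest_async_path_ns, clk_period_ns, margin=2.0):
    """Clock cycles to hold the synchronous reset after async deassert.

    Follows the rule of thumb from the post: take the longest path from
    the asynchronous reset pin to any flop, multiply by a safety margin
    (the post doubles it), and divide by the reference clock period.
    Rounding up, and holding for at least one cycle, are assumptions.
    """
    if longest_async_path_ns < 0 or clk_period_ns <= 0 or margin <= 0:
        raise ValueError("path must be >= 0; period and margin must be > 0")
    cycles = math.ceil(longest_async_path_ns * margin / clk_period_ns)
    return max(1, cycles)


# Hypothetical example: 8 ns worst-case reset path, 100 MHz (10 ns) clock.
hold = sync_reset_hold_cycles(8.0, 10.0)
```

With these made-up numbers the doubled path is 16 ns, which is 1.6 clock periods, so the synchronous reset would be held for 2 cycles.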
On 10/23/15 1:11 PM, Tim Wescott wrote:
>
> Equations reiterated:
>
> u: input
> y: output
> x: state variable
>
> y[n] = u[n] - x[n-1]
> x[n] = d * y[n]
>

Something can't be right with these equations. Let us say we have had a
signal for a while with a DC component, and u[n] now goes to that value.
By definition, y[n] should be 0, and thus so would x[n]. If u[n+1] has
that same value again, then y[n+1] = u[n+1], and not 0 as required.

The equations I tend to use for something like this are:

y[n] = u[n] - x[n-1]
x[n] = x[n-1] + k*y[n]

(k being some power of 0.5 so the multiply is a shift.)

x often needs additional fractional bits, especially if k is very small.

Article: 158369
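The two recurrences in the DC-blocker post above can be compared numerically. The following is a minimal floating-point Python sketch, not fixed-point hardware. The function names and the test values d = 0.5 and k = 1/16 are illustrative assumptions. It feeds a constant (pure DC) input to both forms and looks at where the output settles.

```python
def blocker_gain_state(u, d):
    """Quoted form: y[n] = u[n] - x[n-1], x[n] = d * y[n]."""
    x = 0.0
    out = []
    for sample in u:
        y = sample - x
        x = d * y
        out.append(y)
    return out


def blocker_integrator(u, k):
    """Reply's form: y[n] = u[n] - x[n-1], x[n] = x[n-1] + k * y[n]."""
    x = 0.0
    out = []
    for sample in u:
        y = sample - x
        x = x + k * y  # x accumulates, so it can track the DC level
        out.append(y)
    return out


dc_input = [1.0] * 500  # constant input, i.e. nothing but DC
y_gain = blocker_gain_state(dc_input, d=0.5)
y_integ = blocker_integrator(dc_input, k=1.0 / 16.0)
```

For a constant input the first form settles where y = u - d*y, that is y = u/(1 + d), so the DC is only attenuated. In the second form x integrates the output until it equals the input, and y decays towards zero as (1 - k)^n.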
Dear all,

First of all I want to thank everyone who belongs to this forum and gives
help to each other.

I checked the ML405 schematic and the pins do exist. I used the advice of
Aurelian Lazarut (cleaning the project files) but with no success. The
design cannot be mapped. Could someone help me with this problem?

Here are the errors:

ERROR:MapLib:30 - LOC constraint M19 on SW1 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P16 on SW2 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint M11 on SW3 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N11 on SW4 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint M10 on SW5 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint M9 on SW6 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N8 on SW7 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N7 on SW8 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P10 on vga_b_out<0> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P11 on vga_b_out<1> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint R8 on vga_b_out<2> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N9 on vga_g_out<0> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N6 on vga_g_out<1> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P6 on vga_g_out<2> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint R6 on vga_r_out<0> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint R7 on vga_r_out<1> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P9 on vga_r_out<2> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.

Article: 158370
Dear all,

First of all I want to thank everyone who belongs to this forum and gives
help to each other.

I use Xilinx ISE 14.7 and an ML405 board.

I checked the ML405 schematic and the pins do exist. I used the advice of
Aurelian Lazarut (cleaning the project files) but with no success. The
design cannot be mapped. Could someone help me with this problem?

Here are the errors:

ERROR:MapLib:30 - LOC constraint M19 on SW1 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P16 on SW2 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint M11 on SW3 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N11 on SW4 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint M10 on SW5 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint M9 on SW6 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N8 on SW7 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N7 on SW8 is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P10 on vga_b_out<0> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P11 on vga_b_out<1> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint R8 on vga_b_out<2> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N9 on vga_g_out<0> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint N6 on vga_g_out<1> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P6 on vga_g_out<2> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint R6 on vga_r_out<0> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint R7 on vga_r_out<1> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.
ERROR:MapLib:30 - LOC constraint P9 on vga_r_out<2> is invalid: No such site on the device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.

Article: 158371
>All of us learn our craft starting off with small designs that we take
>through each design stage all the way to a working product. But real life
>is different. Real designs are far too large for any single engineer to do
>the entire thing, so we use several engineers, each one assigned to do
>a different task.
>
>Those tasks are:
>
>Component Designers who create the leaf cell designs
>
>Architects who select the leaf cells, configure and interconnect them
>
>Board designers who design the printed circuit assembly
>
>When you work by yourself it is tempting to look ahead and try to take
>care of issues that will arise in the next phase of the design process. DO
>NOT DO THIS. While I am sure that you are a competent engineer and that
>the engineer who was assigned that task will be happy that you decided to
>butt in and do his job for him, you should remember that some people are
>sensitive about this.
>
>If it's not your job then you don't get to make those decisions.
>
>>I've got trouble thinking how the GSR could be used for any useful
>>thing by the designer. The trouble with the GSR is that its
>>removal edge is slow, and asynchronous to everything. So flops may
>>reset fine with this signal, but coming out of reset, the state's
>>still unknown. The parts of a design that can make reliable use of
>
>That's why I get so annoyed with engineers who insist that a complete
>asynchronous reset system is the only way to go. For one thing, there are
>other equivalent options that they have not considered, and second of all,
>they do not get to decide how to do it. It's not their job.
>
>John Eaton

Inside the FPGA there may be several clock domains, and the FPGA engineer
is responsible for synchronising the reset to all of them. So it is an
internal responsibility, and even with a single clock domain it is much
easier to do that sort of work inside the FPGA.
You certainly need a reset, otherwise you will have power-up variation, or
variation at each reset release if the reset is not applied properly. But
some registers may not need it. It is certainly vital for state machines
and control registers, but not for data paths if the data value is
irrelevant to any control logic.

One thing I find is often misunderstood is that a reset, whether applied
through the D input (misnamed synchronous) or through the async port, must
be viewed as asynchronous at its origin if it can arrive independently of
the clock.

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 158372
The OP links to this doc:
http://www.digitalsignallabs.com/dcblock.pdf

Looking into the above document I got a bit curious: to get rid of the
truncation DC bias, why add a fraction of the truncation when you can just
use rounding to the nearest integer or to the nearest even?

Kaz
---------------------------------------
Posted through http://www.FPGARelated.com

Article: 158373
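As background to the rounding question in the post above, here is a minimal Python sketch that measures the average error of three ways of dropping low-order bits: plain truncation, round-half-up and round-half-to-even. It says nothing about what the linked document actually does. The helper names, the 4-bit shift and the input range are illustrative assumptions.

```python
def truncate(value, shift):
    """Drop the low bits; an arithmetic shift rounds toward minus infinity."""
    return value >> shift


def round_half_up(value, shift):
    """Add half an LSB before dropping the low bits."""
    return (value + (1 << (shift - 1))) >> shift


def round_half_even(value, shift):
    """Round to nearest; exact ties go to the even result."""
    quotient = value >> shift
    remainder = value & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if remainder > half or (remainder == half and (quotient & 1)):
        quotient += 1
    return quotient


def mean_error(quantize, shift, values):
    """Average of (quantized - exact) in output LSBs, i.e. the DC bias."""
    scale = 1 << shift
    return sum(quantize(v, shift) - v / scale for v in values) / len(values)


samples = range(-1024, 1024)  # every input hit equally often
bias_trunc = mean_error(truncate, 4, samples)
bias_half_up = mean_error(round_half_up, 4, samples)
bias_half_even = mean_error(round_half_even, 4, samples)
```

Over this uniform set of inputs, truncation leaves a bias of about -0.47 LSB, round-half-up leaves a small positive bias of about 0.03 LSB from the ties, and round-half-to-even averages to zero. Real signals are not uniformly distributed, so the figures are only indicative.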
Aleksandar Kuktin <akuktin@gmail.com> wrote:
> It has been mentioned here a few times that there are no FPGAs with
> internal tristates. However, surfing around somewhat aimlessly, I
> stumbled upon a device series which has them and obviously had to share
> my find:

> The Atmel AT40K and AT40KAL series.

> OTOH, their gate delay is pretty big: around 1.2 ns, and their SRAM read
> times aren't much better at about 10 ns. I didn't dig much deeper into
> the datasheets so far.

Internal tristates don't scale properly. As the chips get bigger,
and wires get smaller, the delay increases too much. For unidirectional
wires, there are buffers along the way, but you can't buffer lines
with tristates.

But you don't need to worry. The usual synthesis tools will generate
wire-AND (or maybe wire-OR) logic if you write tri-state logic.

If you are implementing existing logic designs, it is easy enough
to allow the synthesis tools to process them. If not, it is easy
enough to write wire-AND logic. (At least it is in Verilog, I haven't
tried in VHDL yet.)

-- glen

Article: 158374
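The idea in the post above, replacing an internal tristate bus with ordinary gates, can be modelled behaviourally. This is a minimal Python sketch of one common gated form, where each driver is ANDed with its enable and the results are ORed together. It is an assumption for illustration and not a statement of what any particular synthesis tool emits. The function names and the 8-bit width are hypothetical.

```python
def tristate_bus(drivers):
    """Model a real tristate net: drivers is a list of (data, enable).

    Returns None for a floating bus (no driver enabled) and raises if
    enabled drivers disagree, which would be contention in hardware.
    """
    active = [data for data, enable in drivers if enable]
    if not active:
        return None
    if len(set(active)) > 1:
        raise ValueError("bus contention: enabled drivers disagree")
    return active[0]


def and_or_bus(drivers, width=8):
    """Gate-only replacement: OR together (data AND enable) per driver."""
    mask = (1 << width) - 1
    bus = 0
    for data, enable in drivers:
        bus |= (data & mask) if enable else 0
    return bus


# Hypothetical bus with three drivers, only the second one enabled.
drivers = [(0xA5, False), (0x3C, True), (0xFF, False)]
tri_value = tristate_bus(drivers)
gate_value = and_or_bus(drivers)
```

With exactly one driver enabled both models give the same value. They differ only in the cases a tristate design should avoid anyway: with no driver enabled the gated version reads as 0 instead of floating, and with several enabled it ORs the data instead of fighting.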
abirov@gmail.com wrote:
> Dear all, first of all I want to thank everyone who belongs to this
> forum and gives help to each other.

> I checked the ML405 schematic and the pins do exist.
> I used the advice of Aurelian Lazarut (cleaning the project files)
> but with no success. The design cannot be mapped.
> Could someone help me with this problem?

> Here are the errors:

> ERROR:MapLib:30 - LOC constraint M19 on SW1 is invalid: No such site on the
> device. To bypass this error set the environment variable 'XIL_MAP_LOCWARN'.

(snip of more similar messages)

My guess is that you have the wrong package set, and its pins
are numbered differently from those of the right package.

-- glen