David Wade <dave.g4ugm@gmail.com> wrote:
> On 20/10/2015 22:52, jim.brakefield@ieee.org wrote:
>> On Tuesday, October 20, 2015 at 6:40:35 AM UTC-5, b2508 wrote:
>>> How do I most efficiently add 8 numbers in FPGA?
>>> What is the best way to save LUTs?
>>> How is data width affecting LUT consumption?

> Why not try it out. Run one of the tool chains and see what happens when
> you build adder in different ways and then if its not what you expect
> come and ask on here. The tool chains will show you what the LUT usage
> is. I was a tad suprised to find that when I coded:-

> byteout <= byte1+byte2+byte3+byte4+byte5+byte6+byte7+byte8 ;

> and compared it with

> temp1 <= byte1 + byte2 + byte3 + byte4 ;
> temp2 <= byte5 + byte6 + byte7 + byte8 ;
> byteout <= temp1 + temp2 ;

> I got the same number of LUTs and Slices used....

Yes, the optimizers can likely figure that one out.

Some years ago, I needed a 36-bit population count, that is, how many '1' bits there are in a 36-bit word.

The usual way to make one is with carry-save adders, so I built one up, I think first 8 bits, and then combined those.

It was a little unusual, since I only needed to know 0, 1, 2, 3, or more than 3.

It wasn't hard to make, but it turns out that if you just say:

p=x[0]+x[1]+x[2]+x[3]+ ... x[35];

it works just about as well. It might be that I had to pipeline it also, but it still would have been easier to write.

(snip)

>> The latest parts from Xilinx and Altera will add three numbers
>> at a time using a single carry chain.

> In this modern world of optimising tool chains why not just put them all
> in one expression and let the tool chain work out what is best for the chip.

You mean ones with 6 input LUTs? I haven't looked at those much yet.

(snip)

My favorite test of the optimizer is when I make a tiny mistake, which turns out to cause some signal to never change, and the optimizer optimizes out all the logic! Nothing at all left!

-- glen

Article: 158326
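The population-count remark in the post above can be modelled in a few lines. This is only a behavioural sketch in Python, not the design glen actually built: the names `full_adder`, `popcount_csa` and `popcount_sat`, and the use of 4 to stand for "more than 3", are assumptions made for illustration. It reduces the 36 input bits with 3:2 compressors (carry-save style) and checks that the result matches the plain sum-of-bits expression.

```python
def full_adder(a, b, c):
    """3:2 compressor: three 1-bit inputs -> (sum bit, carry bit)."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def popcount_csa(x, width=36):
    """Count set bits by compressing columns of equal binary weight."""
    columns = {0: [(x >> i) & 1 for i in range(width)]}
    w = 0
    while w in columns:
        bits = columns[w]
        while len(bits) > 1:
            a = bits.pop()
            b = bits.pop()
            c = bits.pop() if bits else 0      # acts as a half adder for two bits
            s, carry = full_adder(a, b, c)
            bits.append(s)                     # sum stays at weight w
            columns.setdefault(w + 1, []).append(carry)  # carry moves up
        w += 1
    return sum(col[0] << weight for weight, col in columns.items() if col)


def popcount_sat(x, width=36, limit=4):
    """Return 0, 1, 2, 3, or `limit` meaning 'more than 3'."""
    return min(popcount_csa(x, width), limit)
```

In an HDL the single-expression form leaves the choice of adder structure to the synthesis tool, which is the point being made in the post.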
On Wednesday, October 21, 2015 at 8:08:59 PM UTC-5, glen herrmannsfeldt wrote: > David Wade <dave...@gmail.com> wrote: > > On 20/10/2015 22:52, jim...@ieee.org wrote: > >> On Tuesday, October 20, 2015 at 6:40:35 AM UTC-5, b2508 wrote: > >>> How do I most efficiently add 8 numbers in FPGA? > >>> What is the best way to save LUTs? > >>> How is data width affecting LUT consumption? > > > Why not try it out. Run one of the tool chains and see what happens when > > you build adder in different ways and then if its not what you expect > > come and ask on here. The tool chains will show you what the LUT usage > > is. I was a tad suprised to find that when I coded:- > > > byteout <= byte1+byte2+byte3+byte4+byte5+byte6+byte7+byte8 ; > > > and compared it with > > > temp1 <= byte1 + byte2 + byte3 + byte4 ; > > temp2 <= byte5 + byte6 + byte7 + byte8 ; > > byteout <= temp1 + temp2 ; > > > I got the same number of LUTs and Slices used.... > > Yes, the optimizers can likely figure that one out. > > Some years ago, I needed a 36 bit population count. > That is, how many '1' bits there are in a 36 bit word. > > The usual way to make one is with carry save adders, so > I build one up, I think first 8 bits, and then combined those. > > It was a little unusual, since I needed to know 0, 1, 2, 3, more than 3. > > It wasn't hard to make, but it turns out that if you just say: > > p=x[0]+x[1]+x[2]+x[3]+ ... x[35]; > > it works just about as well. It might be that I had to pipeline > it also, but it still would have been easier to write. > > (snip) > > >> The latest parts from Xilinx and Altera will add three numbers > >> at a time using a single carry chain. > > > In this modern world of optimising tool chains why not just put them all > > in one expression and let the tool chain work out what is best for the chip. > > You mean ones with 6 input LUTs? I haven't looked at those much yet. 
>
> (snip)
>
> My favorite test of the optimizer is when I make a tiny mistake, which
> turns out to cause some signal to never change, and the optimizer
> optimizes out all the logic! Nothing at all left!
>
> -- glen

> You mean ones with 6 input LUTs? I haven't looked at those much yet.

6LUTs are a favorite of mine:
  One 4-to-1 mux or two 2-to-1 muxes
  2-to-1 mux and an add/subtract

IMHO their reason for being is that they reduce the number of logic levels. Routing delay is now larger than logic delay, so reducing logic levels is a big speed win, more so than the greater logic capability.

The ALUT/ALM is somewhat different and more complicated. I am not currently using it, but it does appear to have overall characteristics similar to the 6LUT.

Jim

Article: 158327
I am receiving this error in Xilinx:

ERROR:Xst:899 - "../../rtl/dff.v" line 7: The logic for <qout> does not match a known FF or Latch template.

The code being synthesized is:

module dff(output reg qout, input clok, rst,d,enf);
always @(posedge clok or negedge rst)
begin
  if(enf)
  begin
    if(rst)
      qout<=0;
    else
      qout<=d;
  end
end

endmodule

Can anyone guide me where I am going wrong?

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 158328
Hi all, I want to implement an ADC Interface for an ADC - ADS 7230 (TI) in VHDL. I am not very familiar with ADCs to implement it in VHDL. I already have an ADC Interface for a 10 bit ADC (MAX 1030) and a 12-bit ADC (LTC1407). Unfortunately these are in AHDL. Is it possible to use any of the existing ADC interfaces and adapt it to suit ADS 7230 in AHDL itself? If yes, what are the necessary details I should look into from the data sheet to change the existing ADC interface available in AHDL? Or do you have any other suggestions to implement an ADC interface in the quickest possible way? Is there any link where I can get a reference of a 12-bit ADC in VHDL similar to ADS 7230? Thank you! --------------------------------------- Posted through http://www.FPGARelated.comArticle: 158329
AnamDar wrote:
> I am receiving this error in xilinx
> ERROR:Xst:899 - "../../rtl/dff.v" line 7: The logic for <qout> does not
> match a known FF or Latch template.
>
> The code being synthesizd is:
>
> module dff(output reg qout, input clok, rst,d,enf);
> always @(posedge clok or negedge rst)
> begin
> if(enf)
> begin
> if(rst)
> qout<=0;
> else
> qout<=d;
> end
> end
>
> endmodule
>
> can anyone guide me where i am going wrong?

What are you trying to do with signal "enf"? The way you coded it, enf overrides the asynchronous reset as well as the clock. Also, since you coded reset as a negedge signal, it should be tested for being low rather than high. If "enf" is supposed to be a clock enable, it should be part of the else clause, like:

always @ (posedge clok or negedge rst)
begin
  if (!rst)       // active low async reset
    qout <= 0;
  else if (enf)   // enf used as a clock enable
    qout <= d;
end

-- Gabor

Article: 158330
Hi all,

I need to implement a DC blocker in an FPGA. Data samples arrive on every clock cycle.

My original idea was to implement a high-pass filter as in the formula below:

y[n] = x[n] - x[n-1] + p*y[n-1]

However, it seems to me that I cannot achieve this at the given data rate: I am unable to calculate the output by the time I need it in the feedback loop for the next sample.

Is there some way to do this that I don't see?

If not, I was thinking of finding the mean value of the signal and subtracting it from the signal in order to remove the DC. However, I do not know how to determine an appropriate number of samples for this. Would I do this by FIR filtering with all coefficients equal to 1/N?

Thank you in advance.

Article: 158331
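For reference, the recurrence in the question above can be modelled in floating point before any fixed-point or timing decisions are made. This is a minimal sketch, assuming zero initial conditions (x[-1] = y[-1] = 0); the function name `dc_blocker` and the default p = 0.995 are illustrative choices, not values from the thread.

```python
def dc_blocker(samples, p=0.995):
    """Model of y[n] = x[n] - x[n-1] + p*y[n-1] with zero initial state."""
    x_prev = 0.0
    y_prev = 0.0
    out = []
    for x in samples:
        # The p*y_prev product and the add form the feedback path that must
        # complete within one sample period in hardware.
        y = x - x_prev + p * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out
```

A constant input decays towards zero at a rate set by p, while a signal alternating at half the sample rate passes with a gain of 2/(1+p), which is what makes this a DC blocker rather than a plain differentiator.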
>Hi all,
>
>I need to implement DC blocker in FPGA. Data samples are coming at every
>clock cycle.
>
>My original idea was to implement high pass filter as in formula below:
>
>y[n] = x[n] - x[n-1] + p*y[n-1]
>
>However it seems to me that I cannot achieve this with the given data
>rate. I am unable to calculate output by the time when I need it in
>feedback loop for the next sample.
>

You can. Just put one delay stage (register) on the input to get x[n-1] and one on the output to get y[n-1], multiply by p, and the circuit will do the job at the data rate. The multiplier output should not be registered, and this may be the speed bottleneck. Moreover, the above subtraction/addition cannot be pipelined, i.e. the result must arrive at the same clock edge. What is your data rate (system clock) and device?

>Is there some way to do this that I don't see?
>If not, I was thinking of finding mean value of signal and subtracting it
>from signal in order to clear DC.
>However, I do not know how to determine appropriate number of samples for
>this and do i do this by FIR filtering with all coefficients equal to
>1/N?
>

This is an alternative, but you may need a long delay line to filter off DC only. For n stages, design n stages of delay, subtract current input from last stage and accumulate/scale.

Kaz

Article: 158332
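The delay-line alternative described in the reply above can be sketched as a running sum. This is a hedged Python model, assuming integer samples, a window length that is a power of two (so the 1/N scaling is a shift), and a zero-filled delay line at start-up; the name `mean_subtract` and the parameter `log2_n` are hypothetical.

```python
from collections import deque


def mean_subtract(samples, log2_n=4):
    """Subtract a running mean over the last N = 2**log2_n samples."""
    n = 1 << log2_n
    delay = deque([0] * n, maxlen=n)   # N-stage delay line, initially zero
    acc = 0                            # running sum of the last N samples
    out = []
    for x in samples:
        oldest = delay[0]              # sample leaving the delay line
        delay.append(x)                # shift the new sample in
        acc += x - oldest              # update the sum with one add and one subtract
        out.append(x - (acc >> log2_n))  # scale by 1/N with a shift, then remove the mean
    return out
```

The only feedback here is the accumulator itself, with no multiplier in the loop, which is why this form is easier to run at the full sample rate. How long the window needs to be depends on the lowest signal frequency that has to be preserved.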
On 10/22/2015 11:04 AM, AlexKrish wrote: > Hi all, > > I want to implement an ADC Interface for an ADC - ADS 7230 (TI) in VHDL. > I am not very familiar with ADCs to implement it in VHDL. I already have > an ADC Interface for a 10 bit ADC (MAX 1030) and a 12-bit ADC (LTC1407). > Unfortunately these are in AHDL. > > Is it possible to use any of the existing ADC interfaces and adapt it to > suit ADS 7230 in AHDL itself? If yes, what are the necessary details I > should look into from the data sheet to change the existing ADC > interface available in AHDL? > > Or do you have any other suggestions to implement an ADC interface in > the quickest possible way? > > Is there any link where I can get a reference of a 12-bit ADC in VHDL > similar to ADS 7230? It looks like the ADS7230 uses an SPI interface for data and control. This is not a complex interface, but the ADS7230 looks like it is intended to be controlled by software in a processor. To control this from an FPGA you will need to design a state machine to initialize the appropriate registers before reading data samples. The LTC1407 is much simpler with no control registers, just read the data. The MAX1030 has control registers, but they are not the same as the ADS7230, so I think you are going to need to design this interface yourself using available SPI code perhaps. Just think of the SPI like a UART, a vehicle for getting the data in and out of the ADC. You need to read about the registers and figure out how they need to be programmed for your application. -- RickArticle: 158333
Hm.. I thought that multiplication cannot be implemented without delay.

This could cause timing issues, to my knowledge.

Moreover, the full formula is

y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]}
e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n]

The error is the difference between the output before and after quantization. I asked about the initial one because I don't know how to implement even that.

If x1 appears at t1, the corresponding y1 is ready at the earliest at t2=t1+1. If I register the subtracting operation as well, e1 is available at t3=t1+1. However, x2 arrives at t2, and neither y1 (the corresponding y[n-1]) nor e1 is ready at that time.

Article: 158334
>Hm.. I tought that multiplication cannot be implemented without delay. > >This could cause timing issues to my knowledge. > >Moreover, full formula is > >y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]} >e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n] > >Error is difference between output before and after quantization. >I asked for the initial one because even that i don't know how to >implement. > >if x1 appears at t1, corresponding y1 is ready at earlies at t2=t1+1. If I >register subtracting operation as well, e1 is available at t3=t1+1. >However x2 arrives at t2 and neither y1 (corresponding y[n-1]) or e1 are >ready at that time. > > > >--------------------------------------- >Posted through http://www.FPGARelated.com As such you got very long combinatorial paths running from mult input right through adders/subtractors. Unless your speed is low enough you can't do that in practice. The fir subtraction is certainly doable but you need a long delay line e.g. n = 1024 or more but depends on signal Kaz --------------------------------------- Posted through http://www.FPGARelated.comArticle: 158335
On 10/22/2015 3:13 PM, b2508 wrote: > Hm.. I tought that multiplication cannot be implemented without delay. > > This could cause timing issues to my knowledge. > > Moreover, full formula is > > y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]} > e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n] > > Error is difference between output before and after quantization. > I asked for the initial one because even that i don't know how to > implement. > > if x1 appears at t1, corresponding y1 is ready at earlies at t2=t1+1. If I > register subtracting operation as well, e1 is available at t3=t1+1. > However x2 arrives at t2 and neither y1 (corresponding y[n-1]) or e1 are > ready at that time. What is Q? Y1 is ready at t1+delta which is a logic delay, not a clock cycle. So don't sweat that. If you need to pipeline this to meet timing constraints, you are in trouble, lol. What clock rate are you shooting for? -- RickArticle: 158336
In article <2Z-dnUTCaKvCqLTLnZ2dnUU7-VOdnZ2d@giganews.com>,
b2508 <108118@FPGARelated> wrote:
>Hm.. I tought that multiplication cannot be implemented without delay.
>
>This could cause timing issues to my knowledge.
>
>Moreover, full formula is
>
>y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]}
>e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n]
>
>Error is difference between output before and after quantization.
>I asked for the initial one because even that i don't know how to
>implement.
>
>if x1 appears at t1, corresponding y1 is ready at earlies at t2=t1+1. If I
>register subtracting operation as well, e1 is available at t3=t1+1.
>However x2 arrives at t2 and neither y1 (corresponding y[n-1]) or e1 are
>ready at that time.

Without really looking at your required function in detail (just noting that it has feedback terms), I'll just comment in general.

The statement "multiplication cannot be implemented without delay" is false, in many ways. It all depends on your processing requirements. What is your sample rate? What are your bit widths?

Your processing clock does NOT need to be the same as your sample clock. If you wish them to be the same (it may be easier for new FPGA users to design that way), then you MAY be able to run the multiplier fully combinational, if your sample rate is low enough.

The alternative (at a high level) is to buffer an input and output, and process with a faster processing clock. Modern FPGAs can run DSP functions at up to around 400-500 MHz. This is likely much faster than your sample rate.

Regards,
Mark

Article: 158337
>In article <2Z-dnUTCaKvCqLTLnZ2dnUU7-VOdnZ2d@giganews.com>, >b2508 <108118@FPGARelated> wrote: >>Hm.. I tought that multiplication cannot be implemented without delay. >> >>This could cause timing issues to my knowledge. >> >>Moreover, full formula is >> >>y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]} >>e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n] >> >>Error is difference between output before and after quantization. >>I asked for the initial one because even that i don't know how to >>implement. >> >>if x1 appears at t1, corresponding y1 is ready at earlies at t2=t1+1. If I >>register subtracting operation as well, e1 is available at t3=t1+1. >>However x2 arrives at t2 and neither y1 (corresponding y[n-1]) or e1 are >>ready at that time. > >Without really looking at your required function in detail (just noting >that it has feedback terms) - I'll just note in general. > >The statement "multiplication cannot be implemented without delay" is >false, in many ways. It all depends on your processing requirements. >What is your sample rate? What are your bit widths? > >You're processing clock does NOT need to be the same as your sample clock. >If you wish them to be the same - it may be easier for new FPGA users to >design - then you MAY be able to run the multiplier full combinational - >If you're sample rate is low enough. > >The alternative (at a high level) is to buffer an input and output, and >process with a faster processing clock. Modern FPGA's these days can run >DSP functions upwards to around 400-500 MHz. This is likely much faster >than your sample rate. > >Regards, >Mark OK, I was taught that it is always safer to put registers wherever you can. I have no choice in my project but to have same sampling and processing rate. My rate is 100 MHz. Input data or x[n] has data format - unsigned, 16 bit, 1 bit for integer. Also, I am not sure how to select data widths after each of these operations. 
If x[n] and x[n-1] are 16/1 and their subtraction is 17 bit unsigned with 2 integer bits, how do I proceed with data width selection? The feedback loop part is unclear to me.

Also, should I use a DSP48 for the multiplication by P, or should I somehow make it a power of two and do it by shifting?

Q is quantization, i.e. reducing the number of bits after all these operations.

Article: 158338
zak wrote:
> True but the tool takes care of that and tells you if it passed
> recovery/removal at every register based on clock period check. In case it
> doesn't you can then assist by reducing reset fanout e.g. cascading
> stages.

If the reset signal is produced on-chip by a FF clocked off the same clock as the system that is getting the reset, then you are right, the tools have all the info they need to make sure this is right. It would be insane to have an externally-generated reset without reclocking it in the FPGA off the same clock, but you certainly could do this by mistake.

I just found a condition where you could have an un-synchronized input on a product that has been in manufacture for a decade. Normally, this signal IS synchronized on the FPGA, but there was a particular configuration involving multiple boards where it would be synched from the OTHER board only. OOPS!

Jon

Article: 158339
Mark Curry <gtwrek@sonic.net> wrote: (snip) > Without really looking at your required function in detail (just noting > that it has feedback terms) - I'll just note in general. > The statement "multiplication cannot be implemented without delay" is > false, in many ways. It all depends on your processing requirements. > What is your sample rate? What are your bit widths? I would say that it is right, but not very useful. Addition can't be implemented without delay, and for that matter no filter can be. Even wires have delay. If you are lucky, you can do all processing within one sample period, so one sample delay. You have to include any delay from the previous register, so you have less than one sample period. But more often, you can live with a few cycles delay, and pipeline the whole system. > You're processing clock does NOT need to be the same as your sample clock. > If you wish them to be the same - it may be easier for new FPGA users to > design - then you MAY be able to run the multiplier full combinational - > If you're sample rate is low enough. > The alternative (at a high level) is to buffer an input and output, and > process with a faster processing clock. Modern FPGA's these days can run > DSP functions upwards to around 400-500 MHz. This is likely much faster > than your sample rate. -- glenArticle: 158340
On Thu, 22 Oct 2015 13:18:14 -0500, b2508 wrote:

> Hi all,
>
> I need to implement DC blocker in FPGA. Data samples are coming at every
> clock cycle.
>
> My original idea was to implement high pass filter as in formula below:
>
> y[n] = x[n] - x[n-1] + p*y[n-1]
>
> However it seems to me that I cannot achieve this with the given data
> rate. I am unable to calculate output by the time when I need it in
> feedback loop for the next sample.
>
> Is there some way to do this that I don't see?
> If not, I was thinking of finding mean value of signal and subtracting
> it from signal in order to clear DC.
> However, I do not know how to determine appropriate number of samples
> for this and do i do this by FIR filtering with all coefficients equal
> to 1/N?

There are other ways to implement high-pass filters. I'm not much of an FPGA guy, but this one may help. I'm going to rearrange your nomenclature:

u: input
y: output
x: state variable

y[n] = u[n] - x[n-1]
x[n] = d * y[n]

For d << 1 this should be pretty robust even if you have to toss in extra delays (i.e., x[n] = d * y[n - m], for some integer value of m).

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Article: 158341
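A note on the state-variable form in the post above. Read literally, x[n] = d*y[n] gives y[n] = u[n] - d*y[n-1], which settles to u/(1+d) for a constant input and so does not remove DC. The sketch below assumes the state was meant to accumulate, x[n] = x[n-1] + d*y[n-m]; that reading is an assumption, not something confirmed in the thread. With it the transfer function becomes (1 - z^-1)/(1 - (1-d) z^-1) for m = 0, i.e. the original poster's filter with p = 1 - d. The function names are hypothetical.

```python
def hp_as_written(u, d):
    """Literal reading: y[n] = u[n] - x[n-1], x[n] = d*y[n]."""
    x = 0.0
    out = []
    for u_n in u:
        y = u_n - x
        x = d * y
        out.append(y)
    return out


def hp_accumulating(u, d, m=0):
    """Assumed intent: y[n] = u[n] - x[n-1], x[n] = x[n-1] + d*y[n-m]."""
    x = 0.0
    out = []
    for n, u_n in enumerate(u):
        y = u_n - x
        out.append(y)
        if n - m >= 0:
            x += d * out[n - m]   # m extra samples of delay in the feedback path
    return out
```

With d much smaller than 1 the accumulating form still converges when a few samples of extra delay (m > 0) are inserted, which matches the robustness remark in the post and is what makes it attractive when the feedback path has to be pipelined.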
On 10/22/2015 4:50 PM, b2508 wrote: >> In article <2Z-dnUTCaKvCqLTLnZ2dnUU7-VOdnZ2d@giganews.com>, >> b2508 <108118@FPGARelated> wrote: >>> Hm.. I tought that multiplication cannot be implemented without delay. >>> >>> This could cause timing issues to my knowledge. >>> >>> Moreover, full formula is >>> >>> y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]} >>> e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n] >>> >>> Error is difference between output before and after quantization. >>> I asked for the initial one because even that i don't know how to >>> implement. >>> >>> if x1 appears at t1, corresponding y1 is ready at earlies at t2=t1+1. If > I >>> register subtracting operation as well, e1 is available at t3=t1+1. >>> However x2 arrives at t2 and neither y1 (corresponding y[n-1]) or e1 > are >>> ready at that time. >> >> Without really looking at your required function in detail (just noting >> that it has feedback terms) - I'll just note in general. >> >> The statement "multiplication cannot be implemented without delay" is >> false, in many ways. It all depends on your processing requirements. >> What is your sample rate? What are your bit widths? >> >> You're processing clock does NOT need to be the same as your sample > clock. >> If you wish them to be the same - it may be easier for new FPGA users to > >> design - then you MAY be able to run the multiplier full combinational - > >> If you're sample rate is low enough. >> >> The alternative (at a high level) is to buffer an input and output, and >> process with a faster processing clock. Modern FPGA's these days can run > >> DSP functions upwards to around 400-500 MHz. This is likely much faster > >> than your sample rate. >> >> Regards, >> Mark > > OK, I was taught that it is always safer to put registers wherever you > can. I have no choice in my project but to have same sampling and > processing rate. > > My rate is 100 MHz. > Input data or x[n] has data format - unsigned, 16 bit, 1 bit for integer. 
>
> Also, I am not sure how to select data widths after each of these
> operations.
>
> If x[n] and x[n-1] are 16/1 and their subtraction is 17 bit unsigned with
> 2 bit integers, how do I proceed with data width selection? Feedback loop
> part is unclear to me.
> Also, should I use DSP48 for the multiplication with P or should I make it
> somehow power of two and do it by shifting?
>
> Q is quantization, or reducing number of samples after all these
> operations.

Do you know the value of P? Multiplies are done by shifting and adding. I don't know which chip you are planning to use, but all the multipliers I know of require pipelining; the only option is how many stages, 1, 2, etc.

Since P is a constant (it *is* a constant, right?) you only need to use adders for the 1s, or if there are long runs of 1s or 0s, you can subtract at the lsb of the run and add in at the bit just past the msb of the run. The point is you may not need to use a built-in multiplier.

Your filter seems very complex for a feedback filter. Is there some special need driving this? Can you use a simpler filter?

-- Rick

Article: 158342
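The "subtract at the lsb of the run, add just past the msb" idea in the reply above amounts to a signed-digit recoding of the constant. A minimal sketch, assuming a non-negative integer constant and using the non-adjacent form as one possible recoding (the thread does not name a specific one); `signed_digits` and `const_multiply` are hypothetical names.

```python
def signed_digits(c):
    """Recode a non-negative constant as a list of (shift, +1 or -1) terms."""
    terms = []
    shift = 0
    while c:
        if c & 1:
            digit = 2 - (c & 3)   # +1 if c % 4 == 1, -1 if c % 4 == 3 (start of a run of 1s)
            terms.append((shift, digit))
            c -= digit
        c >>= 1
        shift += 1
    return terms


def const_multiply(x, c):
    """Multiply x by constant c using only shifts, adds and subtracts."""
    return sum(digit * (x << shift) for shift, digit in signed_digits(c))
```

For example, the constant 120 (binary 1111000) has a run of four 1s starting at bit 3, and `signed_digits(120)` yields a subtract at bit 3 and an add at bit 7, so the product needs two shifted terms instead of four.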
In article <nqOdnUF-N6G90bTLnZ2dnUU7-VmdnZ2d@giganews.com>,
b2508 <108118@FPGARelated> wrote:
>>In article <2Z-dnUTCaKvCqLTLnZ2dnUU7-VOdnZ2d@giganews.com>,
>>b2508 <108118@FPGARelated> wrote:
>>>Hm.. I tought that multiplication cannot be implemented without delay.
>>>
>>>This could cause timing issues to my knowledge.
>>>
>>>Moreover, full formula is
>>>
>>>y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]}
>>>e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n]
>>>
>
>OK, I was taught that it is always safer to put registers wherever you
>can. I have no choice in my project but to have same sampling and
>processing rate.

This is a complete non sequitur. Yes, register often in an FPGA; that's a good rule of thumb. But it has NOTHING to do with "having same sampling and processing rate".

Think of it by analogy: if you implemented this in software, would you force (if you could) the processor to operate on the sample clock? Of course not. You buffer a few input samples, do your processing at the higher-speed clock, then buffer your output. Your requirement is that the total processing must complete in one sample time.

When designing at the higher-rate clock, each register is NOT necessarily a z^-1 sample delay of your function. The register is just a retiming step (i.e. a pipeline stage).

>My rate is 100 MHz.
>Input data or x[n] has data format - unsigned, 16 bit, 1 bit for integer.

A 100 MHz fully combinational multiply is doable in a modern DSP48. Now, whether the rest of the algorithm would fit, I dunno. I'd not design it this way. I'd use the faster processing clock.

>Also, I am not sure how to select data widths after each of these
>operations.
>
>If x[n] and x[n-1] are 16/1 and their subtraction is 17 bit unsigned with
>2 bit integers, how do I proceed with data width selection?

I'm confused by your notation: x[n] and x[n-1] should be the same format. But in any event, in cases like these you just need to make sure the scaling of each variable is the same (i.e. align the "decimal" points), and appropriately sign-extend each input to the required size.

>Feedback loop part is unclear to me.
>Also, should I use DSP48 for the multiplication with P or should I make it
>somehow power of two and do it by shifting?

This is an implementation trade-off you must decide.

Regards,
Mark

Article: 158343
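The alignment rule in the reply above can be illustrated with raw integers that carry a count of fractional bits. This is a sketch under stated assumptions: `align` and `fx_sub` are hypothetical helpers, and Python integers are unbounded, so the sign extension that hardware needs is implicit here. Note that the difference of two unsigned 16-bit values ranges from -65535 to +65535, so it needs 17 bits as a signed quantity.

```python
def align(a, a_frac, b, b_frac):
    """Put two raw fixed-point integers on the same binary point."""
    frac = max(a_frac, b_frac)
    # Shift left whichever operand has fewer fractional bits.
    return a << (frac - a_frac), b << (frac - b_frac), frac


def fx_sub(a, a_frac, b, b_frac):
    """Subtract b from a; returns (raw result, fractional bits of the result)."""
    a2, b2, frac = align(a, a_frac, b, b_frac)
    return a2 - b2, frac
```

As a usage example, 1.5 with 15 fractional bits (raw 49152) minus 0.25 with 8 fractional bits (raw 64) comes out as raw 40960 with 15 fractional bits, which is 1.25.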
In article <n0bjhg$fr2$1@speranza.aioe.org>,
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
>Mark Curry <gtwrek@sonic.net> wrote:
>
>(snip)
>
>> Without really looking at your required function in detail (just noting
>> that it has feedback terms) - I'll just note in general.
>
>> The statement "multiplication cannot be implemented without delay" is
>> false, in many ways. It all depends on your processing requirements.
>> What is your sample rate? What are your bit widths?
>
>I would say that it is right, but not very useful.
>
>Addition can't be implemented without delay, and for that matter
>no filter can be. Even wires have delay.

OK, picking nits: my terminology wasn't clear. But multiplies and adds can be done on an FPGA in 0 clock cycles, i.e. in pure combinational logic. My notation (which I think is common) counts pipeline cycles. Both add and multiply (and quite often both together) can be done within 1 cycle, such that you can register just the final output and use that final output as a new input on the next iteration.

>
>If you are lucky, you can do all processing within one sample
>period, so one sample delay. You have to include any delay from
>the previous register, so you have less than one sample period.
>
>But more often, you can live with a few cycles delay, and pipeline
>the whole system.

Which is what I think the OP is (correctly) worried about. His feedback term is needed for the next calculation, so he can't fully pipeline. His requirement is "processing must be complete in one sample time", whereas in general the requirement for a fully pipelined design is just "can accept another input in one cycle time"; the output may appear some (reasonable) number of clock cycles later.

Regards,
Mark

Article: 158344
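One standard workaround for the feedback constraint described above, not proposed anywhere in this thread, is a look-ahead transformation: substituting the recurrence into itself gives y[n] = x[n] - x[n-1] + p*(x[n-1] - x[n-2]) + p^2*y[n-2], so the feedback multiply has two sample periods to complete and the feed-forward terms can be pipelined freely. The Python sketch below only checks that the two forms agree in floating point with zero initial conditions; `direct` and `look_ahead` are hypothetical names.

```python
def direct(samples, p):
    """y[n] = x[n] - x[n-1] + p*y[n-1]."""
    x1 = y1 = 0.0
    out = []
    for x in samples:
        y = x - x1 + p * y1
        out.append(y)
        x1, y1 = x, y
    return out


def look_ahead(samples, p):
    """y[n] = x[n] - x[n-1] + p*(x[n-1] - x[n-2]) + p*p*y[n-2]."""
    p2 = p * p
    x1 = x2 = 0.0
    y1 = y2 = 0.0
    out = []
    for x in samples:
        y = (x - x1) + p * (x1 - x2) + p2 * y2   # feedback now reaches back two samples
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

The cost is an extra feed-forward multiply and a coefficient p^2 that needs more bits to represent; in fixed point the two forms will also round differently, so this is a starting point to simulate rather than a drop-in replacement.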
Tim Wescott wrote: > On Thu, 22 Oct 2015 13:18:14 -0500, b2508 wrote: > >> Hi all, >> >> I need to implement DC blocker in FPGA. Data samples are coming at every >> clock cycle. >> >> My original idea was to implement high pass filter as in formula below: >> >> y[n] = x[n] - x[n-1] + p*y[n-1] >> >> However it seems to me that I cannot achieve this with the given data >> rate. I am unable to calculate output by the time when I need it in >> feedback loop for the next sample. >> >> Is there some way to do this that I don't see? >> If not, I was thinking of finding mean value of signal and subtracting >> it from signal in order to clear DC. >> However, I do not know how to determine appropriate number of samples >> for this and do i do this by FIR filtering with all coefficients equal >> to 1/N? > > There are other ways to implement high-pass filters. I'm not much of an > FPGA guy, but this one may help. I'm going to rearrange your > nomenclature: > > u: input > y: output > x: state variable > > y[n] = u[n] - x[n-1] > x[n] = d * y[n] > x need not be a vector, does it? I think it works out to be a single value. That may matter for an FPGA implementation. > For d << 1 this should be pretty robust even if you have to toss in extra > delays (i.e., x[n] = d * y[n - m], for some integer value of m). > -- Les CargillArticle: 158345
The OP appeared on dsprelated.com first, where the DSP guys know everything about FPGAs, but then thankfully migrated here.

He posted a link there to a doc written by the same DSP guys who control that forum. It is a leaky-integrator-based DC filter, followed by a modification where the quantisation error is added back into the loop. The double equations given here are misleading.

My suggestion is to just implement it as per filter2 in the diagram and forget about the equations. It is there, ready for you. And if you use P as a power of 2, it might be enough for your resolution, and so no multiplier is needed.

Your input is 16 bits unsigned?? That means a DC offset; I believe the design is meant for signed data.

Regarding bit growth: 16 bits after addition/subtraction => 17 bits. For the feedback use 16 bits. Truncation error is meant to help with that.

If you get into fmax issues then I hope the DSP guys will come to help!!

Kaz

Article: 158346
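The multiplier-free suggestion above can be tried in an integer model. This sketch assumes p = 1 - 2^-k, so that p*y[n-1] becomes y[n-1] - (y[n-1] >> k); that choice of p, the name `dc_blocker_shift` and the default k = 6 are assumptions for illustration and are not taken from the referenced document or its filter2 diagram. Word-length limits are not modelled, since Python integers do not overflow.

```python
def dc_blocker_shift(samples, k=6):
    """Integer DC blocker with p = 1 - 2**-k and zero initial state."""
    x_prev = 0
    y_prev = 0
    out = []
    for x in samples:
        # p*y[n-1] done as a subtract and an arithmetic right shift
        # (Python's >> rounds towards minus infinity, like a signed shift).
        y = x - x_prev + y_prev - (y_prev >> k)
        out.append(y)
        x_prev, y_prev = x, y
    return out
```

With a constant positive input the output stops decaying once it falls below 2^k, because the shifted term truncates to zero, whereas a constant negative input decays all the way to zero. A stuck residue of this kind is presumably the sort of truncation effect that the error-feedback term in the full formula is aimed at.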
On Thu, 22 Oct 2015 18:18:51 -0500, Les Cargill wrote: > Tim Wescott wrote: >> On Thu, 22 Oct 2015 13:18:14 -0500, b2508 wrote: >> >>> Hi all, >>> >>> I need to implement DC blocker in FPGA. Data samples are coming at >>> every clock cycle. >>> >>> My original idea was to implement high pass filter as in formula >>> below: >>> >>> y[n] = x[n] - x[n-1] + p*y[n-1] >>> >>> However it seems to me that I cannot achieve this with the given data >>> rate. I am unable to calculate output by the time when I need it in >>> feedback loop for the next sample. >>> >>> Is there some way to do this that I don't see? >>> If not, I was thinking of finding mean value of signal and subtracting >>> it from signal in order to clear DC. >>> However, I do not know how to determine appropriate number of samples >>> for this and do i do this by FIR filtering with all coefficients equal >>> to 1/N? >> >> There are other ways to implement high-pass filters. I'm not much of >> an FPGA guy, but this one may help. I'm going to rearrange your >> nomenclature: >> >> u: input y: output x: state variable >> >> y[n] = u[n] - x[n-1] >> x[n] = d * y[n] >> >> > x need not be a vector, does it? I think it works out to be a single > value. That may matter for an FPGA implementation. > >> For d << 1 this should be pretty robust even if you have to toss in >> extra delays (i.e., x[n] = d * y[n - m], for some integer value of m). >> In this case the x[n] notation means "x at sample time n", where "n" means "today". Basically the notation that the OP used. -- Tim Wescott Wescott Design Services http://www.wescottdesign.comArticle: 158347
>In article <nqOdnUF-N6G90bTLnZ2dnUU7-VmdnZ2d@giganews.com>,
>b2508 <108118@FPGARelated> wrote:
>>>In article <2Z-dnUTCaKvCqLTLnZ2dnUU7-VOdnZ2d@giganews.com>,
>>>b2508 <108118@FPGARelated> wrote:
>>OK, I was taught that it is always safer to put registers wherever you
>>can. I have no choice in my project but to have same sampling and
>>processing rate.
>
>This is a complete non-sequitur. Yes, register often in an FPGA. That's
>a good rule of thumb. NOTHING to do with "having same sampling and
>processing rate".
>
>
>Regards,
>
>Mark

Hey, these are just two sentences that happen to sit next to each other. I
didn't mean that I add registers because the sample and processing rates are
the same :-)
---------------------------------------
Posted through http://www.FPGARelated.com
Article: 158348
>On 10/22/2015 4:50 PM, b2508 wrote:
>>> In article <2Z-dnUTCaKvCqLTLnZ2dnUU7-VOdnZ2d@giganews.com>,
>>> b2508 <108118@FPGARelated> wrote:
>>>> Hm.. I thought that multiplication cannot be implemented without delay.
>>>>
>>>> This could cause timing issues to my knowledge.
>>>>
>>>> Moreover, full formula is
>>>>
>>>> y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]}
>>>> e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n]
>>>>
>>>> Error is difference between output before and after quantization.
>>>> I asked for the initial one because even that I don't know how to
>>>> implement.
>>>>
>>>> If x1 appears at t1, corresponding y1 is ready at earliest at t2=t1+1.
>>>> If I register subtracting operation as well, e1 is available at t3=t1+2.
>>>> However x2 arrives at t2 and neither y1 (corresponding y[n-1]) nor e1
>>>> are ready at that time.
>>>
>>> Without really looking at your required function in detail (just noting
>>> that it has feedback terms) - I'll just note in general.
>>>
>>> The statement "multiplication cannot be implemented without delay" is
>>> false, in many ways. It all depends on your processing requirements.
>>> What is your sample rate? What are your bit widths?
>>>
>>> Your processing clock does NOT need to be the same as your sample clock.
>>> If you wish them to be the same - it may be easier for new FPGA users to
>>> design - then you MAY be able to run the multiplier full combinational -
>>> if your sample rate is low enough.
>>>
>>> The alternative (at a high level) is to buffer an input and output, and
>>> process with a faster processing clock. Modern FPGAs these days can run
>>> DSP functions upwards to around 400-500 MHz. This is likely much faster
>>> than your sample rate.
>>>
>>> Regards,
>>> Mark
>>
>> OK, I was taught that it is always safer to put registers wherever you
>> can. I have no choice in my project but to have same sampling and
>> processing rate.
>>
>> My rate is 100 MHz.
>> Input data or x[n] has data format - unsigned, 16 bit, 1 bit for integer.
>>
>> Also, I am not sure how to select data widths after each of these
>> operations.
>>
>> If x[n] and x[n-1] are 16/1 and their subtraction is 17 bit unsigned with
>> 2 bit integers, how do I proceed with data width selection? Feedback loop
>> part is unclear to me.
>> Also, should I use DSP48 for the multiplication with P or should I make it
>> somehow power of two and do it by shifting?
>>
>> Q is quantization, or reducing number of bits after all these
>> operations.
>
>Do you know the value of P? Multiplies are done by shifting and adding.
>I don't know which chip you are planning to use, but all the
>multipliers I know of require pipelining, the only option is how many
>stages, 1, 2, etc... Since P is a constant (it *is* a constant, right?)
>you only need to use adders for the 1s, or if there are long runs of 1s
>or 0s, you can subtract at the lsb of the run and add in at the bit just
>past the msb of the run. The point is you may not need to use a built
>in multiplier.
>
>Your filter seems very complex for a feedback filter. Is there some
>special need driving this? Can you use a simpler filter?
>
>--
>
>Rick

I do not really know the value of P or how to determine it. I was thinking
of using 0.99 because I tried it out in a software simulation and it seems to
do what I wanted it to do. The idea for this filter came from this article
(the second filter in Figure 2):

http://www.digitalsignallabs.com/dcblock.pdf

Someone said to forget the equations and do it as drawn in the figure, but
these figures never account for the potential latency of the
add/subtract/multiply blocks, and if I do not add registers I may have
timing issues.

Anyway, I will try to do the add and multiply in one clock cycle and see where
this gets me.

Thank you all very much anyhow.
---------------------------------------
Posted through http://www.FPGARelated.com
Article: 158349
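Rick's point about multiplying by a constant with shifts and adds, replacing each long run of 1s by a subtract at its lsb and an add just past its msb, can be sketched in Python as follows. This is only an illustrative model: the helper names `csd_terms` and `mul_const`, and the choice of scaling 0.99 by 2^15 (which rounds to 32440), are assumptions and do not come from the thread.

```python
def csd_terms(c):
    """Recode a positive integer constant as signed powers of two.

    Returns a list of (sign, shift) pairs with c == sum(sign << shift).
    A run of 1s becomes -1 at the run's lsb and +1 just past its msb.
    """
    terms = []
    shift = 0
    while c:
        if c & 1:
            # Low bits ...01 -> digit +1; low bits ...11 (start of a run) -> -1
            digit = 2 - (c & 3)
            terms.append((digit, shift))
            c -= digit
        c >>= 1
        shift += 1
    return terms


def mul_const(y, terms):
    """Multiply y by the recoded constant using only shifts, adds, subtracts."""
    return sum(sign * (y << shift) for sign, shift in terms)


# Assumed scaling: 0.99 * 2**15 rounds to 32440, which has ten 1 bits.
P_Q15 = round(0.99 * 2**15)
TERMS = csd_terms(P_Q15)   # 32440 == 2**15 - 2**8 - 2**6 - 2**3
```

Under that assumed scaling the constant needs four shifted terms, so three adder/subtractor stages, instead of one adder input per set bit. Whether this beats a DSP48 at 100 MHz depends on the device and on how the adders are pipelined.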
On 10/23/2015 3:51 AM, b2508 wrote:
>> On 10/22/2015 4:50 PM, b2508 wrote:
>>>> In article <2Z-dnUTCaKvCqLTLnZ2dnUU7-VOdnZ2d@giganews.com>,
>>>> b2508 <108118@FPGARelated> wrote:
>>>>> Hm.. I thought that multiplication cannot be implemented without delay.
>>>>>
>>>>> This could cause timing issues to my knowledge.
>>>>>
>>>>> Moreover, full formula is
>>>>>
>>>>> y[n] = Q {x[n] - x[n-1] + p*y[n-1] - e[n-1]}
>>>>> e[n] = x[n] - x[n-1] + p*y[n-1] - e[n-1] - y[n]
>>>>>
>>>>> Error is difference between output before and after quantization.
>>>>> I asked for the initial one because even that I don't know how to
>>>>> implement.
>>>>>
>>>>> If x1 appears at t1, corresponding y1 is ready at earliest at t2=t1+1.
>>>>> If I register subtracting operation as well, e1 is available at
>>>>> t3=t1+2. However x2 arrives at t2 and neither y1 (corresponding
>>>>> y[n-1]) nor e1 are ready at that time.
>>>>
>>>> Without really looking at your required function in detail (just noting
>>>> that it has feedback terms) - I'll just note in general.
>>>>
>>>> The statement "multiplication cannot be implemented without delay" is
>>>> false, in many ways. It all depends on your processing requirements.
>>>> What is your sample rate? What are your bit widths?
>>>>
>>>> Your processing clock does NOT need to be the same as your sample clock.
>>>> If you wish them to be the same - it may be easier for new FPGA users to
>>>> design - then you MAY be able to run the multiplier full combinational -
>>>> if your sample rate is low enough.
>>>>
>>>> The alternative (at a high level) is to buffer an input and output, and
>>>> process with a faster processing clock. Modern FPGAs these days can run
>>>> DSP functions upwards to around 400-500 MHz. This is likely much faster
>>>> than your sample rate.
>>>>
>>>> Regards,
>>>> Mark
>>>
>>> OK, I was taught that it is always safer to put registers wherever you
>>> can. I have no choice in my project but to have same sampling and
>>> processing rate.
>>>
>>> My rate is 100 MHz.
>>> Input data or x[n] has data format - unsigned, 16 bit, 1 bit for integer.
>>>
>>> Also, I am not sure how to select data widths after each of these
>>> operations.
>>>
>>> If x[n] and x[n-1] are 16/1 and their subtraction is 17 bit unsigned with
>>> 2 bit integers, how do I proceed with data width selection? Feedback loop
>>> part is unclear to me.
>>> Also, should I use DSP48 for the multiplication with P or should I make
>>> it somehow power of two and do it by shifting?
>>>
>>> Q is quantization, or reducing number of bits after all these
>>> operations.
>>
>> Do you know the value of P? Multiplies are done by shifting and adding.
>> I don't know which chip you are planning to use, but all the
>> multipliers I know of require pipelining, the only option is how many
>> stages, 1, 2, etc... Since P is a constant (it *is* a constant, right?)
>> you only need to use adders for the 1s, or if there are long runs of 1s
>> or 0s, you can subtract at the lsb of the run and add in at the bit just
>> past the msb of the run. The point is you may not need to use a built
>> in multiplier.
>>
>> Your filter seems very complex for a feedback filter. Is there some
>> special need driving this? Can you use a simpler filter?
>>
>> --
>>
>> Rick
>
> I do not really know the value of P or how to determine it. I was thinking
> of using 0.99 because I tried it out in a software simulation and it seems
> to do what I wanted it to do. The idea for this filter came from this
> article (the second filter in Figure 2):
>
> http://www.digitalsignallabs.com/dcblock.pdf
>
> Someone said to forget the equations and do it as drawn in the figure, but
> these figures never account for the potential latency of the
> add/subtract/multiply blocks, and if I do not add registers I may have
> timing issues.
>
> Anyway, I will try to do the add and multiply in one clock cycle and see
> where this gets me.
>
> Thank you all very much anyhow.

About the timing issues: try it without extra registers first. Then, if you
have problems, you will need to find ways to address them. Your calculation
cannot work if you add more register delays.

--

Rick
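One way to read Rick's last remark is that every extra register in the feedback path turns y[n-1] into y[n-2] or later, which gives a different filter from the one in the equation. The floating-point Python sketch below models that under stated assumptions: the function name `dc_block_delayed`, the parameter `m` (feedback delay in samples), and the test signal at half the sample rate are all hypothetical choices for illustration.

```python
def dc_block_delayed(samples, p=0.99, m=1):
    """Model y[n] = x[n] - x[n-1] + p*y[n-m].

    m=1 is the filter from the thread; m=2 mimics one extra pipeline
    register in the feedback path (an assumed scenario).
    """
    hist = [0.0] * m      # y[n-m] .. y[n-1], oldest first
    x_prev = 0.0
    out = []
    for x in samples:
        y = x - x_prev + p * hist[0]
        x_prev = x
        hist = hist[1:] + [y]
        out.append(y)
    return out


def settled_peak(m, p=0.99, n=4000):
    """Peak output, after settling, for an input alternating +1, -1, ..."""
    tone = [(-1.0) ** i for i in range(n)]
    return max(abs(v) for v in dc_block_delayed(tone, p=p, m=m)[-100:])
```

From the transfer functions, the gain at half the sample rate is 2/(1+p) with m = 1 and 2/(1-p) with m = 2, so for p = 0.99 it goes from about 1.005 to 200. Both versions still have a zero at DC, but the delayed loop is no longer the intended high-pass response.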