morpheus wrote:
> Hi Folks,
> I've been active on this forum asking questions on digital receiver.
> Earlier I was doing most of the work in the FPGA (that's how I
> architected it). Due to space constraints, this radio is going to be
> used as a Search and Rescue radio for the coast guard, I am
> re-architecting and planning to use a multi-channel digital receiver
> chip instead. This helps us to get rid of most of our analog front end
> and do minimal processing in the FPGA, and thus use a smaller FPGA
> also.
> The chips that come to mind are AD6654 and AD6624A. The AD6654A is
> special as it has the in-built ADC.
> Now some specs on my design requirements
> a) 4 channels,
> b) 121.5MHz - 245MHz band of interest.
> c) AM, FM, FSK demodulation.
> I have also looked at some Grayscale chips. I was wondering if you
> have any more suggestions.
> Thanks
> Morpheus
>

I'm surprised it doesn't also have to receive 406 MHz to be compatible with the new ELTs and EPIRBs.

The FPGA can be made to work, but you'd need a fast ADC to do it without an analog mix to some convenient IF. There are several gigasample ADCs out there, but they are no more than 8-10 bits, which may not give you the dynamic range you need.

If done on an FPGA, much of the receiver can be shared by all 4 channels. Really only the NCO, mixer and first stages of your DDC need to be dedicated to single channels, and that's assuming sampling at hundreds of MHz. The lower your sample rate coming in, the easier it is to share hardware among channels. The nice thing with this application is that the bandwidth is pretty much just audio, so there is room for lots of decimation. With careful design, you could get a 4 channel receiver that samples at 500 MHz on the input into a smaller V4SX device.

If you look at my website gallery at http://www.andraka.com/DSA.htm there is a floorplan of a 10 channel digital receiver that samples at 500 MHz, and includes the phase and amplitude twiddles for beamforming. The device is a Virtex4 SX55, which is being run at a fairly comfortable 250 MHz. The ADC presents two samples in parallel per clock. In this case, the output from each channel is programmable between 5 kHz and 40 MHz, which is a much wider range than you'd need. In that image, the logic that is dedicated to each channel is clearly visible as 10 discrete blocks. The logic at the middle bottom is I and Q filters that are shared among all 10 channels on a time multiplexed basis, and the loose stuff up the middle is extra stuff for interfacing and control.

Article: 117601
morpheus wrote:
> Hi There!
> I did start a post earlier with the title " Digital AM/FM Receiver",
> and I'm starting one again in continuation to that (the reason for
> starting another post is selfish....get more visibility....I know...I
> know...I should be a better person, but I'll wait for next year to
> make that resolution!!)
> Anyways, the responses to my earlier post were very informative and
> with them I've been able to get to a point where I am downconverting
> from IF (thanks for the suggestion Ray), doing decimation (CIC +
> invSinc filter) and basic lpf to sharpen the band of interest.
> Now I plan to use CORDIC to demodulate FM.
> I was reading up on the CORDIC core and it seems that for ATan(y/x) it
> takes inputs with 1QN and generates phase outputs as 2QN.
> The output of my LPF is 42 bits wide 2's complement data and that
> needs to be fed as the X & Y inputs (obviously 1 lpf for I and one for
> Q), I want to operate the CORDIC with 16-20 bits of input, I was
> wondering how I should truncate/round. Do I have to convert since the
> data is already in 2's complement format and if I just
> truncated....can I just use the upper bits of the lpf output as the
> input to the CORDIC?
>
> Ideas...a*@ kicking much appreciated!!
> cheers
> -M
>

Depends. Simple truncation will introduce a DC offset on your signal. You can use symmetric rounding instead (round x.5 toward or away from 0) to eliminate the bias, or in some cases you can subtract out the bias later. For CORDIC, you can keep extra LSBs through the CORDIC processor and then round them off when done to greatly reduce the bias rather than going through the expense of doing a symmetric round at each stage.

Article: 117602
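A small numeric experiment can make the truncation bias in the CORDIC rounding reply above concrete. This is only a sketch under stated assumptions: the 42-bit input width and 20-bit output width come from the question, while the uniformly random test data and the helper names (truncate, round_half_away, mean_error) are made up for illustration.

```python
import random

def truncate(value: int, drop_bits: int) -> int:
    """Drop the low bits of a two's complement value (arithmetic shift, i.e. floor)."""
    return value >> drop_bits

def round_half_away(value: int, drop_bits: int) -> int:
    """Symmetric rounding: exact ties (x.5) are rounded away from zero."""
    half = 1 << (drop_bits - 1)
    if value >= 0:
        return (value + half) >> drop_bits
    return -((-value + half) >> drop_bits)

def mean_error(quantizer, samples, drop_bits):
    """Average quantization error, in units of the output LSB."""
    scale = 1 << drop_bits
    total = sum(quantizer(s, drop_bits) - s / scale for s in samples)
    return total / len(samples)

random.seed(1)
DROP = 22  # assumed: 42-bit filter output reduced to a 20-bit CORDIC input
samples = [random.randrange(-(1 << 41), 1 << 41) for _ in range(100_000)]

trunc_bias = mean_error(truncate, samples, DROP)
round_bias = mean_error(round_half_away, samples, DROP)
print(f"truncation bias:         {trunc_bias:+.4f} LSB")
print(f"symmetric rounding bias: {round_bias:+.4f} LSB")
```

On data like this the truncation error should average close to half an LSB below zero, which is the DC offset mentioned in the reply, while the symmetric round should average close to zero.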
"Hrishi" <sankpalhrishi@gmail.com> wrote: >Hello, > I m building a Function generator using an FPGA!! I m almost >done wid the Coding part in VHDL. I'd just like some help regarding >the interfacing of DAC0808 with the FPGA. Also how th output >Amplitude variation can be done.!!! > You can do this using a seperate DAC and use it as a reference and/or a programmable gain amplifier. This will help to maintain the resolution of the signal generating DAC. Programmable gain amplifiers with bandwidths up to 60MHz are available for reasonable prices. -- Reply to nico@nctdevpuntnl (punt=.) Bedrijven en winkels vindt U op www.adresboekje.nlArticle: 117603
Symon wrote:
> <ryan_usenet@yahoo.com> wrote in message
> news:1175696938.323383.273760@o5g2000hsb.googlegroups.com...
>
>>Although your suggestions have already been useful, I'd appreciate
>>more hints maybe concerning exotic manufacturers that I have never
>>heard of. Or are Xilinx and Altera definitely the only choices?
>>
>>Best regards,
>>Ryan
>>
>
> Hi Ryan,
> Don't forget FPGAs have other resources apart from 'hard' multipliers. Check
> this page on Mr. Andraka's website about distributed arithmetic. You can
> make a _lot_ of multipliers out of the ordinary fabric of FPGAs.
> HTH, Syms.
>
> http://www.andraka.com/distribu.htm
>

You can also time-share the multipliers. The DSP48 elements in Virtex 4 can be clocked at 400 MHz in the slow speed grade part. With careful design, the fabric can also support 400 MHz (well, except for the carry logic, which is hard pressed at 400 MHz for anything but simple counters). My gigasample floating point FFT design runs on a 400 MHz clock in a V4SX55. That design is highlighted in this month's Xilinx DSP magazine:
http://www.xilinx.com/publications/magazines/dsp_03/xc_pdf/p42-44-3dsp-andraka.pdf
and I have the floorplan for that design on my website at http://www.andraka.com/V4_FP_fft.htm

Article: 117604
On Apr 4, 2:57 pm, "anand" <writean...@gmail.com> wrote:
> Why do we use Gray code in asynchronous FIFO? (I am aware that only
> ONE bit changes in Gray code at a time, but what is the advantage of
> that?)

This sounds suspiciously like a homework problem, but I'll give you a clue. Suppose you wanted to sample the write address pointer using the read clock. What would happen if the address were changing just as you sampled it?

Article: 117605
ryan_usenet@yahoo.com wrote:
> Hello,
> I need the highest possible number of multiplication operations per
> second at low cost. I know that several factors affect the overall
> performance, but since I have no idea which FPGA chips might be worth
> considering, I'd like to ask what you think is the chip with the
> lowest ratio
>
> R=(price of chip)* (delay time)/(number of multipliers)
>
> 18x18bit multipliers seem to be quite common, so lets assume this
> design for the estimate.
>
> For example for the Spartan XC3S1000 (~60$, 24 multipliers, 4ns delay)
> I have
> R= 10$ per (Billion multiplications/s). The Cyclone EP2C70 (~230$, 150
> multipliers, 4ns delay) has R=6.13$ per (Billion multiplications/s).
>
> Do other FPGAs exist that are maybe specialized for multiplication-
> intensive tasks and which therefore are much cheaper?
>
> Best regards,
> Ryan
>

What is your application? There is usually more than one way to approach a problem. For example, distributed arithmetic is an elegant solution to handling a sum of products with constant or nearly constant coefficients that does a great job at compacting the area required for fabric multipliers. The precipitation radar design on my website gallery ( http://www.andraka.com/precip_radar.htm ) does about 82 multiply-accumulates per clock cycle at 133 MHz, which works out to over 10 billion multiplies/sec in an FPGA that has zero hardware multipliers (XCV1000) and is rather small and slow compared to current FPGA offerings. In order to find a possible alternative approach, you need to dissect your algorithm to see if there are other constructs that might get you the performance you want in a reasonable amount of hardware.

Article: 117606
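The figures quoted in the cost-per-multiplication exchange above can be checked with a few lines of arithmetic. This is a sketch only: the prices, delays and multiplier counts are the ones given in the posts, and the function name dollars_per_gmult is made up here.

```python
def dollars_per_gmult(price_usd: float, delay_ns: float, multipliers: int) -> float:
    """Ryan's ratio R = price * delay / multipliers.

    With the delay in nanoseconds, the result is in dollars per
    (billion multiplications per second).
    """
    return price_usd * delay_ns / multipliers

# Numbers taken from the question.
xc3s1000 = dollars_per_gmult(60, 4, 24)
ep2c70 = dollars_per_gmult(230, 4, 150)

# Numbers taken from the reply: 82 multiply-accumulates per clock at 133 MHz.
radar_gmacs = 82 * 133e6 / 1e9

print(f"XC3S1000: {xc3s1000:.2f} $ per Gmult/s")
print(f"EP2C70:   {ep2c70:.2f} $ per Gmult/s")
print(f"Radar design: {radar_gmacs:.1f} billion MACs/s")
```

The ratio only counts hard multipliers, so a design built from distributed arithmetic in the fabric, like the radar example, does not show up in it at all. That is the point of the reply.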
ryan_usenet@yahoo.com wrote:
>>It would be interesting to know what you think "a multiplier" is - i.e. what
>>word-length, and are you talking about fixed-point or floating-point
>>operations?
>
> Simulations show that 36bit fixed-point are sufficient. However for
> later improvements we might change to floating-point calculation. Do
> FPGAs with floating point multipliers/adders exist, or how can one
> estimate the (emulated) floating-point performance given the fixed-
> point performance?
>
> Ryan
>

No, there are currently no FPGAs with built-in floating point support. However, with care, floating point can be done at speeds similar to fixed point at the price of additional logic. See my article in this month's Xilinx DSP magazine on a floating point FFT ( http://www.xilinx.com/publications/magazines/dsp_03/xc_pdf/p42-44-3dsp-andraka.pdf ).

Floating point does not have to be done on every elemental operation. Instead, it can be done after a series of operations or a more complex fixed point operation without losing any precision, provided the fixed point kernel that is surrounded by the floating point extensions has enough room for growth to prevent overflow. Taking that approach can greatly reduce the amount of hardware needed to support floating point. The expensive parts of floating point are the normalize and denormalize functions. Those can be done at DSP48 speeds by using the DSP48 multipliers for the shifters.

Article: 117607
Hi,

first of all, thanks for your answer.

> Your bandwidth requirement is less than 800kB/s. How about compact
> flash or an SD-card? These are well known and there are loads of
> resources available on the internet.

Flash is not an option, because I write and read the data streams, and this should also be possible 24/7, where Flash would die fast. It will be a kind of mixer application (with the possibility of a delay on each channel), where incoming and outgoing audio streams use a rather complex bus system. I want the Microblaze to handle the higher level bus protocols, store the data in RAM, load the mixed stream after the delay and put it back on the bus system. Because the Microblaze should also run out of this RAM (I will have SPI Flash attached for FPGA configuration and Microblaze program storage, but I think it's too slow to run the program directly out of it), the required memory bandwidth is higher than 800kB/s. But I have no experience of how fast the memory should be for good Microblaze performance.

> There are no free _usefull_ cores to control DDR memory.

So the Xilinx OPB DDR interface is not useful? At the moment SDRAM is my favourite. But I'm still not sure if the Xilinx SDRAM interfaces work with Spartan 3E :(

Article: 117608
> If they have enough bandwidth for your application, pseudo srams are
> very simple to use. I have used this part:
>
> http://www.micron.com/products/partdetail?part=MT45W4MW16BFB-856%20WT

Hi John,

I had a look at the CellularRAM too. It really looks easy to use and the price is OK. However, I'm not sure if the bandwidth is sufficient for me. The 85 ns access time looks slow in comparison to the various DRAMs. As I understand it, it is faster in burst mode, but the EMC controller handles it only as normal SRAM, without using the extended features.

Article: 117609
On Apr 4, 12:19 pm, "Gabor" <g...@alacron.com> wrote:
> On Apr 4, 2:57 pm, "anand" <writean...@gmail.com> wrote:
>
> > Why do we use Gray code in asynchronous FIFO? (I am aware that only
> > ONE bit changes in Gray code at a time, but what is the advantage of
> > that?)
>
> This sounds suspiciously like a homework problem, but I'll give you a
> clue. Suppose you wanted to sample the write address pointer using
> the read clock? What would happen if the address were changing just
> as you sampled it?

Well the setup time would not be met, therefore, metastability will occur?

No, this is not a HW problem, rather, I am trying to prepare for an interview ;-)

Article: 117610
"anand" <writeanand@gmail.com> wrote in message news:1175720120.262865.103380@q75g2000hsh.googlegroups.com... > On Apr 4, 12:19 pm, "Gabor" <g...@alacron.com> wrote: >> On Apr 4, 2:57 pm, "anand" <writean...@gmail.com> wrote: >> >> > Why do we use Gray code in asynchronous FIFO? (I am aware that only >> > ONE bit changes in Gray code at a time, but what is the advantage of >> > that?) >> >> This sounds suspiciously like a homework problem, but I'll give you a >> clue. Suppose you wanted to sample the write address pointer using >> the read clock? What would happen if the address were changing just >> as you sampled it? > > Well the setup time would not be met, therefore, metastability will > occur? > No, this is not a HW problem, rather, I am trying to prepare for an > interview ;-) I almost think it's unfair to give you information if the new comany wants you for your knowledge and ability. But perhaps the ability to ask the professionals in a newsgroup is underrated. Keep in mind that propagation delays aren't the same for all sources and all destinations. When you sample the write address from the read domain, there will be an area of uncertainty where the bits are transitioning between the previous and next values. The metastability windows in modern devices are extremely slow; I think Peter Alfke's numbers from one of the most recent high-end families was sub-femtosecond. Darn, that's small! So Metastability isn't an issue, the uncertainty region between values is the issue. If you're only uncertain about one bit, it doesn't matter if you have the previous or next value - both are valid for your "on the edge" calculation. What isn't valid is getting a pointer that's off by 2, 4, or 8 from either of the two valid values you're trying to sample. Now, do you know how to properly pipeline? How about state machines: can you design a soda dispenser state machine to count change? 
And how many gas stations would you estimate are in the US (please explain your reasoning)? If you had a belt tight around the equator of the earth, how much length do you figure you'd need to add to raise the belt to an altitude of 1 foot across the entire distance and why? Good luck with your interviewing.Article: 117611
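The Gray code argument in the asynchronous FIFO thread above can be illustrated with a short enumeration. This is a toy model, not an HDL implementation: it assumes that every bit which changes between the old and new pointer value can independently be caught in either state by the sampling clock, which is the "uncertainty region" described in the reply. The function names and the 4-bit pointer width are made up for the example.

```python
def to_gray(n: int) -> int:
    """Binary to Gray code."""
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    """Gray code back to binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def possible_samples(old: int, new: int, width: int) -> set:
    """Every value a sampler could capture if each changing bit is seen
    independently in its old or its new state (unequal propagation delays)."""
    changing = [b for b in range(width) if ((old ^ new) >> b) & 1]
    seen = set()
    for mask in range(1 << len(changing)):
        value = old
        for i, bit in enumerate(changing):
            if (mask >> i) & 1:
                value ^= 1 << bit
        seen.add(value)
    return seen

WIDTH = 4
old_ptr, new_ptr = 7, 8  # binary 0111 -> 1000: all four bits change at once

binary_seen = possible_samples(old_ptr, new_ptr, WIDTH)
gray_seen = {from_gray(g)
             for g in possible_samples(to_gray(old_ptr), to_gray(new_ptr), WIDTH)}

print("binary pointer, possible samples:", sorted(binary_seen))
print("Gray pointer, possible samples:  ", sorted(gray_seen))
```

With a plain binary pointer the sampled value can land anywhere in the address range, while the Gray coded pointer can only be read as the previous or the next value, both of which are safe for a full or empty comparison.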
Quick typo clarification. I meant small, not slow.

"The metastability windows in modern devices are extremely small; I think Peter Alfke's numbers from one of the most recent high-end families was sub-femtosecond."

I hate it when that happens.

"John_H" <newsgroup@johnhandwork.com> wrote in message
news:13185gkp47lpq2c@corp.supernews.com...
> "anand" <writeanand@gmail.com> wrote in message
> news:1175720120.262865.103380@q75g2000hsh.googlegroups.com...
>> On Apr 4, 12:19 pm, "Gabor" <g...@alacron.com> wrote:
>>> On Apr 4, 2:57 pm, "anand" <writean...@gmail.com> wrote:
>>>
>>> > Why do we use Gray code in asynchronous FIFO? (I am aware that only
>>> > ONE bit changes in Gray code at a time, but what is the advantage of
>>> > that?)
>>>
>>> This sounds suspiciously like a homework problem, but I'll give you a
>>> clue. Suppose you wanted to sample the write address pointer using
>>> the read clock? What would happen if the address were changing just
>>> as you sampled it?
>>
>> Well the setup time would not be met, therefore, metastability will
>> occur?
>> No, this is not a HW problem, rather, I am trying to prepare for an
>> interview ;-)
>
> I almost think it's unfair to give you information if the new comany wants
> you for your knowledge and ability. But perhaps the ability to ask the
> professionals in a newsgroup is underrated.
>
> Keep in mind that propagation delays aren't the same for all sources and
> all destinations. When you sample the write address from the read domain,
> there will be an area of uncertainty where the bits are transitioning
> between the previous and next values. The metastability windows in modern
> devices are extremely slow; I think Peter Alfke's numbers from one of the
> most recent high-end families was sub-femtosecond. Darn, that's small!
> So Metastability isn't an issue, the uncertainty region between values is
> the issue. If you're only uncertain about one bit, it doesn't matter if
> you have the previous or next value - both are valid for your "on the
> edge" calculation. What isn't valid is getting a pointer that's off by 2,
> 4, or 8 from either of the two valid values you're trying to sample.
>
> Now, do you know how to properly pipeline? How about state machines: can
> you design a soda dispenser state machine to count change? And how many
> gas stations would you estimate are in the US (please explain your
> reasoning)? If you had a belt tight around the equator of the earth, how
> much length do you figure you'd need to add to raise the belt to an
> altitude of 1 foot across the entire distance and why?
>
> Good luck with your interviewing.
>

Article: 117612
Hi, thanks for your response(s).

"I almost think it's unfair to give you information if the new comany wants you for your knowledge and ability. But perhaps the ability to ask the professionals in a newsgroup is underrated"

Well, I would disagree with the above. The reason is, it is not that I CANNOT design - I DO have a design background and have done a lot of it in my last role (barring the async fifo, that is!). It is just that for the last few years I happen to have been doing something else, and all I am doing is brushing up. And for doing so, anyone will certainly use ANY and ALL resources at their disposal, including groups. Isn't that logical? Unless you use something on a daily basis, I don't think it is realistic to show up at a tech company and be serious about landing an offer.

Anyway, you have answered my question and that is what counts. Hopefully, my "read_ptr" is not even off by 1 bit :-)

Thanks!!!

Article: 117613
Thanks to all for your very informative replies. Since I only have experience with one rather small FPGA application, every little piece of information, like the Lattice family, the DSP48 slices, and Ray's publications, promises to be very useful, and I'll evaluate your suggestions in greater detail than I could at a first glance.

I hope you understand that I am reluctant to describe the project more precisely. I'll start with a downscaled solution just for curiosity and as a proof of concept, as a kind of low-budget hobby project, but I don't want to give away the chance to have a profitable project later. Of course, since I share only very little information, I can't expect you to tell me the perfect solution. But I am already very happy with your answers.

Thank you
Ryan

Article: 117614
FYI, I found that by gating the fifo write request with the fifo full signal I could prevent this problem. Apparently the fifo does not like to be written to once it is already full.

-Clark

"cpope" <cepope@nc.rr.com> wrote in message
news:4613bd5c$0$18872$4c368faf@roadrunner.com...
> Note: using EDK 8.2.02 with peripheral generated with the wizard with FIFO
> enabled and one user interrupt that is set to the fifo_almostfull line.
>
> I'm set up to generate interrupts on the fifo almost full signal but I get a
> couple interrupts and then it stops. If I go read the occupancy register I
> get a number(0x737) greater than the size of the fifo (0x400). If I then go
> and manually read from the fifo data register(from gdb) I can get the
> occupancy to go back to 0x400 and the next interrupt occurs when I run but
> they stop shortly after.
>
> So,
> How could the occupancy be greater than the size of the fifo?
> Have you seen anything like this before?
>
> Thanks,
> Clark
>

Article: 117615
On Apr 3, 5:22 pm, "Andre Renee" <trauben...@arcor.de> wrote:
> Hi Nick,
>
> thanks for your quick reply. The problem is: with my TCP/IP implementation I
> have only one port available. As far as I know with FTP I need at least
> two...!?

Yes, an FTP client, FTP server and TFTP server require at least two ports. But a TFTP client could live with one local port that communicates with two remote ports.

By the way, did you really implement the full TCP transport protocol in hardware, with all the related RFCs, including adaptive retransmission timers, window size adjustment, flow control and congestion avoidance? If you did, implementing application protocols should be a piece of cake - most of them are an order of magnitude simpler than TCP.

BTW, what do you have against Microblaze? It could save you months (years?) of work and plenty of logic cells at the cost of some embedded memory.

Article: 117616
That's the way controllers are designed. When a FULL signal is generated, it is up to the data source to stop writing (and when EMPTY, it's up to the destination to stop reading). That's what these status or handshake signals are for. One could design these controllers to be idiot-proof, but that usually sacrifices performance or versatility. In a FIFO controller, you want to be "lean and mean", to maintain max performance. Nothing stops you as the user from adding "child-proof" circuitry, as long as the loss of performance is acceptable.

Peter Alfke, Xilinx

On Apr 4, 4:09 pm, "cpope" <cep...@nc.rr.com> wrote:
> FYI, I found that by gating the fifo write request with the fifo full signal
> I could prevent this problem. Apparently the fifo does not like to be
> written to once it is already full.
>
> -Clark
>
> "cpope" <cep...@nc.rr.com> wrote in message
>
> news:4613bd5c$0$18872$4c368faf@roadrunner.com...
>
> > Note: using EDk 8.2.02 with peripheral generated with the wizard with FIFO
> > enabled and one user interrupt that is set to the fifo_almostfull line.
>
> > I'm set up to generate interrupts on the fifo almost full signal but I get a
> > couple interrupts and then it stops. If I go read the occupancy register I
> > get a number(0x737) greater than the size of the fifo (0x400). If I then go
> > and manually read from the fifo data register(from gdb) I can get the
> > occupancy to go back to 0x400 and the next interrupt occurs when I run but
> > they stop shortly after.
>
> > So,
> > How could the occupancy be greater than the size of the fifo?
> > Have you seen anything like this before?
>
> > Thanks,
> > Clark

Article: 117617
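The FULL handshake described in the FIFO thread above can be sketched with a toy behavioral model. This is an assumption-heavy illustration, not the EDK-generated FIFO: the class ToyFifo and its unprotected occupancy counter are hypothetical, and only the depth (0x400) and the observed occupancy (0x737) are taken from the posts. It shows one plausible way an occupancy value can exceed the depth when writes are not gated by FULL.

```python
class ToyFifo:
    """Hypothetical "lean and mean" FIFO model with no internal write protection."""

    def __init__(self, depth: int):
        self.depth = depth
        self.occupancy = 0  # counter advances on every accepted write strobe

    @property
    def full(self) -> bool:
        return self.occupancy >= self.depth

    def write(self) -> None:
        self.occupancy += 1  # no check: the data source is trusted to obey FULL

    def read(self) -> None:
        if self.occupancy > 0:
            self.occupancy -= 1


def run(gate_writes: bool, depth: int = 0x400, write_requests: int = 0x737) -> int:
    """Issue a burst of write requests and return the final occupancy."""
    fifo = ToyFifo(depth)
    for _ in range(write_requests):
        wr_req = True
        # The fix from the thread: write enable = request AND NOT full.
        wr_en = wr_req and not (gate_writes and fifo.full)
        if wr_en:
            fifo.write()
    return fifo.occupancy


ungated = run(gate_writes=False)
gated = run(gate_writes=True)
print(f"ungated occupancy: {ungated:#x}")
print(f"gated occupancy:   {gated:#x}")
```

In HDL the same gating is a single AND of the write request with the inverted full flag, placed in the user logic outside the FIFO so that the controller itself stays unchanged.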
On Apr 5, 1:16 am, "Daniel S." <digitalmastrmind_no_s...@hotmail.com> wrote:
> Gordon Freeman wrote:
> >> Watch the values change in the accumulators as you add and subtract the
> >> input values.
>
> > Hi everyone!
> > Thank you for your reply!
> > I used ModelSim to simulate. The result is the same as when I calculate
> > by calculator.
> > But when I implement on FPGA, it doesn't work.
> > I design IIR filter with 10 orders. I use Matlab to generate
> > coefficients for filter (b(k) and a(k)). For coefficient "b", I
> > multiply with 2^14 and multiply with 2^5 for coefficient "a". After
> > that I round them. These coefficients are stored in LUT. I use SDA for
> > filter. Because I think IIR filter includes two FIR filters. One filter
> > with coefficient "b" and one with coefficient "a". Is it right?
>
> Check your synthesis log to make sure your design did not get 'optimized'
> away. If you forgot some control signals somewhere, it is possible that
> synthesis deduced that some of your design had static elements, removed
> them to optimize, then deduced that everything else is now unconnected and
> removed that as well.
>
> BTW, also make sure your VHDL agrees with your board's reset polarity.

Hi.
Thank you for your reply.
I designed it myself, coded the RTL in Verilog and implemented it on a Xilinx FPGA (XC3S400).
I use Matlab to generate the coefficients. This is the code:

    % All frequency values are in Hz.
    Fs = 48000;      % Sampling Frequency

    N      = 10;     % Order
    Fpass1 = 3000;   % First Passband Frequency
    Fpass2 = 6000;   % Second Passband Frequency
    Apass  = 1;      % Passband Ripple (dB)
    Astop  = 80;     % Stopband Attenuation (dB)

    % Construct an FDESIGN object and call its ELLIP method.
    h  = fdesign.bandpass('N,Fp1,Fp2,Ast1,Ap,Ast2', N, Fpass1, Fpass2, ...
                          Astop, Apass, Astop, Fs);
    Hd = ellip(h);

    % Get the transfer function values.
    [b, a] = tf(Hd);

The result:

    a(k) = -8.0088 30.2376 -70.5488 112.3786 -127.5439 104.4117 -60.9016 24.2544 -5.9702 0.6931
    b(k) = 0.0004 -0.0019 0.0040 -0.0050 0.0036 0.0000 -0.0036 0.0050 -0.0040 0.0019 -0.0004

After that:

    round(a(k)*2^5)  = 32 -256 968 -2258 3596 -4081 3341 -1949 776 -191 22
    round(b(k)*2^14) = 7 -31 65 -82 59 0 -59 82 -65 31 -7

The filter output will be divided by 2^5 * 2^14. The filter structure is Direct Form I and it is built from biquads.

I think it runs into overflow errors, too. But I don't know how to modify it.

Can you help me, please?

Article: 117618
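To see how coarse the 2^5 scaling of the denominator in the IIR filter post above is, the quantization step can be replayed on the a(k) values as printed there. This is a rough sketch: the list A uses the four-decimal values from the post (with the leading 1 added), the 16-bit data width is an assumption, and the helper names are made up. The accumulator estimate is a crude bound that treats every term in the sum as a 16-bit value, which recursive feedback does not guarantee.

```python
# Denominator coefficients as printed in the post, with the leading 1.0 added.
A = [1.0, -8.0088, 30.2376, -70.5488, 112.3786, -127.5439,
     104.4117, -60.9016, 24.2544, -5.9702, 0.6931]

def quantize(coeffs, frac_bits):
    """Scale by 2**frac_bits and round to the nearest integer."""
    scale = 1 << frac_bits
    return [round(c * scale) for c in coeffs]

def accumulator_bits(int_coeffs, data_bits):
    """Crude signed width needed for sum(coeff * sample), assuming every
    sample (inputs and fed-back outputs alike) fits in data_bits."""
    max_sample = 1 << (data_bits - 1)
    worst = sum(abs(c) for c in int_coeffs) * max_sample
    return worst.bit_length() + 1  # +1 for the sign bit

a_q = quantize(A, 5)
errors = [abs(q / 32 - c) for q, c in zip(a_q, A)]

print("quantized a(k):", a_q)
print("largest coefficient error:", max(errors))
print("accumulator bits for 16-bit data (assumed):", accumulator_bits(a_q, 16))
```

Coefficient errors of this size matter because a 10th order direct-form denominator is very sensitive to rounding, which is one reason filters of this order are normally split into cascaded second-order sections with each section quantized on its own.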
Hi,

is having a state machine in a datapath element a bad design practice?

CMOS

Article: 117619
CMOS wrote:
> hi,
>
> having a state machine in a datapath element a bad design practice?
>
> CMOS

It depends. Give the name of your tutor, so replies can be sent straight to him/her.

-jg

Article: 117620
The need is to reduce the analog front end and also the risk in the FPGA. By risk, I mean that such a high frequency design requires tighter control of the design. I am trying to get a healthy balance.

I agree with you that it's going to be hard to digitize the RF directly. I should work on the analog downconversion to IF. The more I think about it, the more redundant the chip (AD6654) gets. I think that, by carefully considering undersampling techniques on the IF, the FPGA design can be made easier. The kicker is that I'll have to have 4 ADCs to do the job on the front end.

I do appreciate your input. I did get Matlab (finally...my manager cringed at paying $4500 for the license re-activation) so the modeling should be insightful.

Article: 117621
> > And how many gas stations would you estimate are in the US (please explain your
> > reasoning)?

This is interesting. I don't know the answer, but the reasoning can be tied to the number of gas stations per square mile or to the population density.

> > If you had a belt tight around the equator of the earth, how
> > much length do you figure you'd need to add to raise the belt to an
> > altitude of 1 foot across the entire distance and why?

The earth's equatorial radius is 3,963.189 mi, so if you raise the radius by 1 ft, from the top view it would be like drawing an outer circle with a 1 ft border. From this we can calculate the new radius, and the new radius gives us the new circumference and thus the added length.....what do you think...am I way off???

Article: 117622
> The DRAM control/data/etc. signals is one thing, making the DRAM work
> exactly the way you want it to is quite another - you need to familiarize
> yourself with DRAMs' internals... learn what row activation and precharge
> do, why these are necessary, how they can affect your design and how you
> can work around these delays by doing pipelined burst transfers. There are
> a bunch of other quirks that can be exploited or have to be avoided, the
> ones enumerated are simply the more fundamental ones IMO... and do not
> forget those auto-refresh cycles.

Yeah, I learnt a little about DRAM refresh in school and it was difficult. Hmm, I seriously need some very basic material on how to use DRAM. Any such books or websites or whatever?

> These things are automatic only to the extent where the HDL coder follows
> some limitations. For SRAMs/ROMs/registers to get mapped onto BRAMs, the
> synthesis tools must be able to reduce the access/data logic down to
> something supported by the hardware. For a BRAM, this means the logic must
> be reducible to no more than two read+write+address+clock sets. Depending
> on the target device, there may be additional restrictions such as
> read-write policies - write first, read before write or no change.

Meaning the HDL code has to be structured such that synthesis can infer what kind of device to use. (Amazing..write first or write later also has an effect on the device being used...) Oh, the XST guide offered a lot of help on this.

> Look at your synthesis reports pay attention to each BRAM's inference data
> and port mappings. Look for memories that are under 8kbits and are not
> using both read and write ports - these may be mergeable if they use the
> same clocks.

Wow, the synthesis report is so cool. It tells me a lot of information, like which modules contain which warnings, the devices inferred from each module (like adders, subtractors, etc.) and an analysis of different values for the data types. Thumbs up! Anyway, I got this information on the BRAMs:

    57 rams
    RAMB16_S2_S2   : 2
    RAMB16_S36_S36 : 18
    RAMB16_S4_S4   : 25
    RAMB16_S9_S9   : 12

All are 16kbit RAMs with different port widths for A and B (as indicated by 16_Sx_Sy).

> > Any common techniques that you would know that people have used when
> > they wish to reduce BRAMs?
>
> 1) FIR filters are often symmetric: the nth tap (n=0..N) has the same
> coefficient as the (N-n)th one... it is unlikely that this optimization
> has not already been done if applicable to your filter but
> double-checking is cheap.

The nth tap has the same coefficient as the (N-n)th one. Hmm, I have only an H tap and a V tap. I don't see a series of taps being used though. Just for info's sake, I have already tried reducing the H taps to 4 and keeping the V taps at the default 4 to fit the multipliers and also reduce the number of BRAMs being used. But I can't find anything about the coefficients of the taps.

> 2) If your coefficient tables (ROMs?) use under half a BRAM and only one
> port, you should be able to merge two tables into one BRAM by using both
> ports for reading: map one address to "'0' & addrA" and the other to
> "'1' & addrB".

Nah, no chance. From my final synthesis report, all of them are using dual port RAMs. Anyway, the code was already written for dual port BRAM, so there shouldn't be any reason that my coefficient tables would be using under half a BRAM, ya?

> 3) Examine the tables to find redundancies and equivalences, it may be
> possible to multiplex accesses to the coefficient tables.

Hmm....erm....I'm not sure how to go about this method. Anyway, regarding the tables, I guess the tables of coefficients have not yet been used as input in the design (thus I felt kind of puzzled by your 1st and 3rd methods). Right now, I'm just synthesizing designs, some of which will process the tables later on. So what the tables have inside is the issue here. All the code in each of those sequential lookup and coefficient tables uses the same dp_bram, and the wrapper instantiating them ports different values into them.

> How much data do you need to put in the DRAMs? 60 BRAMs x 2KB each = 120KB
> max., assuming you intended to put everything on the DRAMs. The question
> you really need to ask yourself is: can you afford the glue-logic?
>
> Try ripping the memory controller off any project with DRAM controller you
> may have handy, do an unconstrained implementation run (use a slow clock
> like 50MHz and the memory controller's top as your synthesis project's top)
> and see how resource-hungry the memory controller you have is.

Oh yeah, well, I'm not sure how much data is needed, but going by how many BRAMs are used, with each BRAM being 16kbit: 57 x 16k = 912k. But anyway, the board I will be using has four DDR SRAM memory chips (512Mbit). And since they provide so much storage, surely my chip should be able to handle or afford the glue-logic.

Hmm, oh yeah, I have used 4570 out of 14752 slices. Now I read that I can use distributed RAMs from the remaining slices. I think it is not enough to reduce the BRAMs until there is no overloading, but it will certainly reduce them to an extent. I went to check out the language templates for using distributed RAMs, but... they all support only 1 bit of data storage with many address bits...(what use can this be..). Hmm, but then I read that these distributed RAMs can be combined to form wider data storage (some sort of combination). But then again, I don't know how to go about doing it.

I tried to find out more about it and got these 2 methods:
1) Use the "HDL Coding Techniques" in your XST User Guide.
2) Specifying constraints (see your Constraints Guide), or by "instantiating" library parts that force the logic to be created a certain way (see your Libraries Guide). For example, the RAM_STYLE constraint lets you force either Block RAM or Distributed RAM.

Whoa, that RAM_STYLE constraint looks simple to use :) Tried that and not even a single change happened. :( So I thought of looking at 1) and found out that in my code this constraint was already being used:
<<attribute ram_style of mem_array : signal is "block">>;
Thus, I changed it to
<<attribute ram_style of mem_array : signal is "pipe_distributed">>
However, my synthesis now seems to run forever.

Currently, I'm using 57 out of 36 BRAMs (after lowering the H taps):
57 - 36 = 21 => 378000 bits of ram...

Article: 117623
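Suggestion 2) quoted in the BRAM post above, merging two small read-only tables into one dual-port block RAM by prefixing a select bit to each address, can be modelled in a few lines. This is a software sketch of the addressing idea only: the table contents, the 9-bit address width and the function name merged_address are all hypothetical, and in hardware each port of the BRAM would drive one of the two prefixed addresses.

```python
def merged_address(table_select: int, addr: int, addr_bits: int) -> int:
    """Form '0' & addrA or '1' & addrB: the select bit becomes the new MSB."""
    assert 0 <= addr < (1 << addr_bits)
    return (table_select << addr_bits) | addr

ADDR_BITS = 9  # assumed: each table has 512 entries, i.e. half of the shared memory

# Hypothetical coefficient tables, stand-ins for two small ROMs.
table_a = [i * 3 for i in range(1 << ADDR_BITS)]
table_b = [i * 5 + 1 for i in range(1 << ADDR_BITS)]

# One shared memory holds both tables: A in the lower half, B in the upper half.
shared = [0] * (2 << ADDR_BITS)
for i, value in enumerate(table_a):
    shared[merged_address(0, i, ADDR_BITS)] = value
for i, value in enumerate(table_b):
    shared[merged_address(1, i, ADDR_BITS)] = value

# Port A reads table A and port B reads table B in the same cycle.
port_a_data = shared[merged_address(0, 7, ADDR_BITS)]
port_b_data = shared[merged_address(1, 7, ADDR_BITS)]
print(port_a_data, port_b_data)
```

The trick only helps for tables that are read-only, use a single port each and occupy no more than half a block, which is why the synthesis report is the place to check before trying it.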
>>> If you had a belt tight around the equator of the earth, how
>>> much length do you figure you'd need to add to raise the belt to an
>>> altitude of 1 foot across the entire distance and why?

> The earth's equatorial radius is 3,963.189mi, so elevate the radius by
> 1 ft, from the top view, it would be like drawing an outer circle with
> a 1 ft border. From this we can calculate the new radius and hence the
> addition in the radius gives us the new circumference and thus the
> addition in length.....what do u think...am I way off???

You don't need the earth radius. Let's say the original radius is r, expressed in feet. The original belt would be 2*pi*r. By adding 1 foot, you now have a belt of 2*pi*(r+1). So you added 2*pi feet to it no matter what.

Sylvain

Article: 117624
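The belt answer above, 2*pi*(r+1) - 2*pi*r = 2*pi feet regardless of r, is easy to confirm numerically. A minimal check, using the equatorial radius quoted in the thread and 5280 feet per mile; the function name is made up for the example.

```python
import math

def extra_belt_length(radius_ft: float, lift_ft: float = 1.0) -> float:
    """Extra circumference needed to raise a circular belt by lift_ft."""
    return 2 * math.pi * (radius_ft + lift_ft) - 2 * math.pi * radius_ft

earth_radius_ft = 3963.189 * 5280  # equatorial radius from the thread, in feet

for radius in (1.0, 1000.0, earth_radius_ft):
    print(f"r = {radius:>14.1f} ft -> extra length {extra_belt_length(radius):.6f} ft")
```

Every radius should give the same answer of about 6.28 feet, up to floating point rounding.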
On 4 Apr 2007 19:52:40 -0700, "Gordon Freeman" <gordonfreeman1983@gmail.com> wrote:
>On Apr 5, 1:16 am, "Daniel S." <digitalmastrmind_no_s...@hotmail.com>
>wrote:
>> Gordon Freeman wrote:
>> >> Watch the values change in the accumulators as you add and subtract the
>> >> input values.
>>
>> > Hi everyone!
>> > Thank you for your reply!
>> > I used ModelSim to simulate. The result is the same when I canculate
>> > by calculator.
>> > But when I implement on FPGA, it don't work too.
>> > I design IIR filter with 10 orders.

>And filter structure is Direct Form I and this is biquads.
>
>I think it run into overflow errors, too. But I don't know how can I
>modify it.
>
>Can you help me, please?

If it is a cascade of five 2nd-order filters, as you suggest, then start by implementing a SINGLE biquad. Simply change N from 10 to 2 to generate an optimal coefficient set for a 2nd order filter.

Once you have a single biquad working, then you can start to cascade them. I'd go 4th order, then 10th order if you have enough information about what's gone wrong here.

- Brian
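Following the advice above to get one biquad working first, a bit-accurate software model of a single section is a cheap way to look for overflow before going back to the FPGA. The sketch below is one possible Direct Form I model, not the poster's design: the coefficients, the 14 fractional bits, the 16-bit data width, the saturation behaviour and the class name BiquadDF1 are all assumptions chosen for illustration.

```python
class BiquadDF1:
    """Fixed-point Direct Form I biquad with an overflow counter (illustrative model)."""

    def __init__(self, b, a, frac_bits=14, data_bits=16):
        scale = 1 << frac_bits
        self.b = [round(c * scale) for c in b]  # b0, b1, b2
        self.a = [round(c * scale) for c in a]  # a0 (assumed 1.0), a1, a2
        self.frac_bits = frac_bits
        self.max = (1 << (data_bits - 1)) - 1
        self.min = -(1 << (data_bits - 1))
        self.x1 = self.x2 = self.y1 = self.y2 = 0
        self.overflows = 0

    def step(self, x: int) -> int:
        # Full-precision accumulator: feed-forward terms minus feedback terms.
        acc = (self.b[0] * x + self.b[1] * self.x1 + self.b[2] * self.x2
               - self.a[1] * self.y1 - self.a[2] * self.y2)
        # Round, then scale back down to the data format.
        y = (acc + (1 << (self.frac_bits - 1))) >> self.frac_bits
        if y > self.max or y < self.min:
            self.overflows += 1                 # record it, then saturate
            y = max(self.min, min(self.max, y))
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


# Hypothetical low-pass section with a DC gain of 4/3 (poles at radius 0.5).
B, A = [0.25, 0.5, 0.25], [1.0, -0.5, 0.25]

small = BiquadDF1(B, A)
small_out = [small.step(3000) for _ in range(200)]   # settles without overflow

large = BiquadDF1(B, A)
large_out = [large.step(30000) for _ in range(200)]  # DC gain pushes it past 16 bits

print("small step: final output", small_out[-1], "overflows", small.overflows)
print("large step: final output", large_out[-1], "overflows", large.overflows)
```

Comparing a model like this sample by sample against the ModelSim output and against the hardware narrows down whether the problem is arithmetic (overflow, scaling) or something lost in synthesis.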