Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
OK guys, what constitutes an instance? If the syntax has to be

INST "instance_name" IOB=FALSE;

what is the instance_name?

I have a module declaration (Verilog here) called flops that contains all the DataOut flops to my external SRAM. In my top-level FPGA Verilog I have an instance of flops called uflops, the outputs of which feed the output pads.

Now, does the instance_name in the UCF syntax above refer to the instantiation of flops or to the registers inside? Also, what is the hierarchical syntax for an instance at the top level? I currently have:

INST "uflops" IOB=FALSE;

which did not work. Should I have:

INST "/uflops" IOB=FALSE;

or

INST "\uflops" IOB=FALSE;

or, if the instance_name is meant to refer to the flops themselves:

INST "uflops.*" IOB=FALSE;

with or without some form of slash at the front?

Thanks again.
Article: 49501
"Falk Brunner" <Falk.Brunner@gmx.de> wrote in message news:aqrpaj$af1of$1@ID-84877.news.dfncis.de...
> "Sanjay Patil" <sanjay@cg-coreel.com> wrote in message
> news:aqqko0$bvdvr$1@ID-164436.news.dfncis.de...
> > Hi,
> > It will generate .xnf file which is same as EDIF.
> > Only thing is it is Xilinx Proprietary.
> > It will also give .edn file at the project directory location.
>
> NO. EDIF output has been removed in version 5.1. EDIF was never an
> officially supported output format.

True, but ISE5.1 comes with two command-line tools, ngc2edif.exe and ngd2edif.exe. These tools convert Xilinx netlists to EDIF.

--
Michael
Article: 49502
I'm looking for resources, either free or IP, I can use to make a simple kind of switching fabric in an FPGA. I have a bunch of 8-bit busses (actually 5 sets of input/output busses) I would like to funnel into a single FIFO, or connect arbitrarily to each other. I've heard horror stories about trying to do this muxing inside an FPGA and still meet timing (I'll be running these busses at 54 MHz), so I'm interested in what people have experience with.

thanks in advance,
brandon
Article: 49503
Hi,

I am using Xilinx ISE5.1i. I have a design which has some IOBUF pins requiring the LVDCI_25 type. I was told I can name the IOBUF type in the UCF file, but I do not know the syntax. Could you point me to the correct syntax for that? Is it

NET "sdr_dq<0>" LOC ="T2", attribute = LVDCI_25;

?

Thanks,
Article: 49504
Hey all,

Xilinx has a new flow for Incremental Design in 5.1i. The application note is available at:
http://support.xilinx.com/xapp/xapp418.pdf

In this flow, the design needs to be divided into Logic Groups (typically the modules/entities instantiated in the top level of the design). Logic Groups need to follow the hierarchy of the design. Each logic group needs to be synthesized separately so a change in one logic group doesn't affect a different logic group.

Each logic group is assigned to an area group using the Floorplanner or PACE. Run the design through the tools to create a guide file. You might need to do several iterations of the floorplan to get the design to meet timing. Once the design meets timing, use these ncd files as the map and par guide files.

Try to keep changes in the code limited to one Logic Group and synthesize it separately. This way nothing else will change. Use guide mode Incremental and a guide file in both the map and par processes. These can be set in the map and par properties in ISE, or use -gf for the guide file and -gm for the guide mode. PAR will 100% guide the unchanged logic groups but re-implement the changed logic group. We have seen 2x to 3x faster runtimes using this flow. Please read the app note for more detailed information.

Any synthesis tool can be used. Synplify's Multipoint is the easiest way in the Synplicity tools. If you don't have that, then create separate projects for each logic group. Instantiate each logic group as a black box in the top level of the design. When you synthesize the logic group, make sure to turn off I/O insertion. This is all described in the app note.

Kate Kelley

Richard Iachetta wrote:
> In article <aq5di8$me2$1@mail.cn99.com>, pc_dragon@sohu.com says...
> > Hello, every one.
> > I'm working on a FPGA design project, it includes many module.
> > It'll take so long a time to run full implement one time when I only
> > make a little change in one module.
> > So I wonder how I can do incremental implement. I use Synplify Pro
> > and ISE 5.1i.
> > I've read the Xilinx document on how to do Incremental designing, but
> > not really understand, it seems that I need do it in Synplify Pro GUI mode.
> > Is it impossible to do it in ISE? And where can I find any example code?
> >
> > Thanks for any advence
> >
>
> That was my biggest complaint back when I used to use FPGAs. One change
> and it was another 12 plus hour respin. I likened it to building a
> multi-story building and then realizing you installed the wrong lightbulbs
> in the hallways and then tearing down the whole building including the
> foundation and starting over, but this time putting the right bulbs in when
> you get to that point.
>
> --
> Rich Iachetta
> iachetta@us.ibm.com
> I do not speak for IBM.
Article: 49505
Hi, maybe you can help me:

I've been following the developments on SystemC and other high-level languages for hardware modeling:

http://www.systemc.org
http://www.cadence.com/company/pr/052002_SFV_Verification.html
http://www.celoxica.com

*****

What is SystemC? How is it different from other verification languages?

Is SystemC compatible with SystemVerilog, Superlog, or the like?

Are assertion-based languages competing in the same space as SystemC?

What are the alternatives to SystemC when architecting/simulating hardware systems?

Why would anyone bother to learn and use SystemC? Or is it better to wait for "wide acceptance" before spending time and effort on yet another modeling language?

Thanks in advance for your feedback,
Alfredo.
Article: 49506
David Binnie wrote:
> Apart from DOS based PALASM does anyone know of any freeware which can
> generate a JEDEC file from Boolean equations or other descriptor ?

Xilinx's iSE or Webpack tools generate .jed files for our CPLDs. Input language can be Abel, VHDL, Verilog, or schematic. Just run the "Generate Programming File" step in iSE Project Navigator... Webpack can be downloaded for free from our web site.

-Dennis McCrohan
Xilinx CPLD S/W Development
Article: 49507
David Binnie wrote:
> Apart from DOS based PALASM does anyone know of any freeware which can
> generate a JEDEC file from Boolean equations or other descriptor ?
>

Well, Icarus PAL can write JEDEC files for some common types of PLDs. The web page is here: <http://www.icarus.com/eda/ipal/>. The project has languished for lack of user interest (those FPGA devices seem to be catching on) but it's a start.

--
Steve Williams                "The woods are lovely, dark and deep.
steve at icarus.com           But I have promises to keep,
steve at picturel.com         and lines to code before I sleep,
http://www.picturel.com       And lines to code before I sleep."

abuse@xo.com
uce@ftc.gov
Article: 49508
Ray Andraka wrote:
> Putting registers on the I/O of the design alleviates the
> need for the synthesis tools to try to optimize across module
> boundaries. As we move into larger devices and get into more
> of a modular design flow, you'll generally want to at least
> register the module outputs, just to maintain consistency in
> timing.

If you register both the inputs and outputs of modules, and connect a variety of such together, do you end up with double registering at module boundaries, and the associated cycle latency? Or can the tools (be told to) remove redundant registers?

Basically, what are the "best practice" guidelines in this regard? Register on inputs, register on outputs, both, ...???

John
Article: 49509
John Williams wrote:
> Basically, what are the "best practice" guidelines in this regard?
> Register on inputs, register on outputs, both, ...???

I'll give you "a practice" guideline.

I use a synchronous template for all processes. This gives all entities registered outputs but assumes that all inputs are synchronous to the same clock. This gives you one register per module interface.

For inputs from device pins, I interpose one input register for synchronous inputs or two registers for asynchronous.

-- Mike Treseler
Article: 49510
Mike Treseler wrote:
>
> John Williams wrote:
>
> > Basically, what are the "best practice" guidelines in this regard?
> > Register on inputs, register on outputs, both, ...???
>
> I'll give you "a practice" guideline.

No, I want "the" answer, not "an" answer! :)

> I use a synchronous template for all processes.
> This gives all entities registered outputs but assumes
> that all inputs are synchronous to the same clock.
> This gives you one register per module interface.
>
> For inputs from device pins, I interpose one
> input register for synchronous inputs or
> two registers for asynchronous.

Yup - that makes sense. Now if you must meet tight timing, you want to pack registers into the IOBs - which means there must be no logic between the pin and the input register, and similarly none between the output register and the output pin.

This approach is directly and automagically supported by your practice described above, is it not?

Rgds,
John
Article: 49511
Hi folks,

I have to program a Spartan2 with a USB microcontroller in JTAG mode. (I know that programming in serial mode would be easier.) The data stream should come from the host directly to the FPGA and also to a large I2C memory. The I2C memory is for booting without a host.

Now I need some information about the bit file format or any other file format that fits my needs. I would also be very happy about any suggestions, tips and hints on possible traps.

Thanks in advance,
Jens
Article: 49512
You'll need an FPGA and an external memory to store the frames. Presumably it needs to be properly timed on the input as well as on the 9 outputs. It isn't clear, but I am guessing that you intend to slice the input image into 9 regions, each region going to one of the outputs so that you can make a mosaic display. To do that, you will need to write the data into a frame buffer, then read in turn one pixel from each region in raster order. You may also need to deal with interlace on one or both sides, depending on the monitors you are driving. So far, this is easily handled with a couple of counters and a state machine to take care of the memory access for both read and write. Make sure you have sufficient width on the memory to handle the bandwidth of the input plus 9x output.

Once you have reordered the data, you need to convert it to YUV with a color space converter, add in the sync timing and composite blanking (I just realized the output is going to have to be NTSC for UHF broadcast, which means that you do need to interlace the outputs), then run that out to a D to A, then modulate it using an off-the-shelf UHF modulator (if you have monitors with direct NTSC inputs you could skip the modulator). NTSC color encoding is a pain in the derriere. It would probably be cheaper, and certainly less painful, to use an NTSC encoder such as the one made by Philips (SAA 7111?) to take your digital YUV, YCrCb, or RGB data directly (it also generates the NTSC timing), especially considering you need 9 of these encoders. I think there may be some dual models out there as well.

You can cut down on the memory by converting to a YUV color space in 4:2:2 format before writing the input image to memory. That conversion will need a color space converter plus a decimating filter for the UV channel, but it will get your image into 16 bits per pixel instead of 24.
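The reordering step described above (write the frame in raster order, then read one pixel from each region in turn) can be sketched as a read-address generator. This is only a rough model, assuming a 3x3 grid of 640x480 tiles cut from a 1920x1440 frame held in a linear, raster-ordered buffer; the function name and the addressing scheme are illustrative assumptions, not something taken from the post.

```python
# Hypothetical read-address generator for a 3x3 mosaic split (assumed layout).
SRC_W, SRC_H = 1920, 1440      # source frame, stored in raster order
TILE_W, TILE_H = 640, 480      # one output image
COLS, ROWS = SRC_W // TILE_W, SRC_H // TILE_H   # 3 x 3 tiles


def read_addresses(x, y):
    """Frame-buffer addresses of output pixel (x, y) for all 9 tiles.

    Returns one linear address per tile, in tile order 0..8
    (left to right, top to bottom).
    """
    addrs = []
    for tile in range(COLS * ROWS):
        tile_row, tile_col = divmod(tile, COLS)
        src_x = tile_col * TILE_W + x
        src_y = tile_row * TILE_H + y
        addrs.append(src_y * SRC_W + src_x)
    return addrs


first = read_addresses(0, 0)                      # top-left pixel of each tile
last = read_addresses(TILE_W - 1, TILE_H - 1)     # bottom-right pixel of each tile
```

In hardware the same thing would likely be a pair of x/y counters plus a small tile counter that adds a per-tile base offset, which matches the "couple of counters and a state machine" remark.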
At NTSC rates you'll need to read and write 9 pixels in about 96 ns (62.5 us per scan line / 640 pixels per scan line, for each of the 9 images), assuming the input image is at the same frame rate. That is around 5 ns per 16-bit memory transaction, not counting overhead, which is doable with an SDRAM SIMM module or equivalent SDRAMs. There are several FPGA boards available with 32-bit or wider SDRAM on them. You can use a line buffer in the FPGA memory to compress the output rate to the active part of the image.

For skills, at a bare minimum you need some half-decent digital logic design skills, preferably an understanding of video standards, and a good measure of debug skills to pull it off. You'll need to be comfortable pushing the FPGA and memory fairly fast in order to keep the cost of both down.

You didn't mention the scan rate of the input signal. Depending on which standard it is, you may also need to do scan rate conversion, which makes it considerably harder. Unless you are already an accomplished digital designer, this is probably a bit ambitious for a science project.

dan wrote:
> I want to take a 1920x1440 rgb output, Break it up into 9 640x480
> pieces, and broadcast each piece to a separate uhf channel.
>
> How much would it cost to build something like that.
> What kinds of skills would a person need to have?
>
> It's for a science project
>
> Daniel Savage
> lakeside school district

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 49513
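The memory budget in the mosaic post above (about 96 ns per output pixel period, about 5 ns per transaction) can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch using the figures quoted in the post; the assumption that each pixel period needs 9 reads plus 9 writes is one reading of the text, not a stated fact.

```python
LINE_TIME_S = 62.5e-6       # scan line time used in the post
PIXELS_PER_LINE = 640       # active pixels per output line
OUTPUTS = 9                 # one 640x480 image per output

# Time available for one output pixel position.
pixel_time_ns = LINE_TIME_S / PIXELS_PER_LINE * 1e9

# Assumption: per pixel period the memory serves 9 reads (one per output)
# plus 9 writes (input arriving at the same aggregate pixel rate).
transactions = 2 * OUTPUTS
ns_per_transaction = pixel_time_ns / transactions

print(f"{pixel_time_ns:.1f} ns per pixel period")
print(f"{ns_per_transaction:.2f} ns per 16-bit transaction")
```

A 32-bit or wider memory moves two or more 16-bit pixels per access, which is why the wider SDRAM boards mentioned in the post relax this budget.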
Buy, beg, borrow or steal (well, no, don't steal) a copy of P.P. Vaidyanathan's multirate filtering book. (You can buy it through the bookstore on my website, which will give me a kickback which in turn helps to support the website.) That will give you the information about the architecture you need to design the hardware. You should have a good visualization of the desired hardware before you ever start thinking about the VHDL. That will produce a better design, and it also makes it a lot easier to figure out how to code it in VHDL.

chankc wrote:
> "Caillet" <regis.caillet@dspfactory.ch> wrote in message news:<aq88k1$258$1@rex.ip-plus.net>...
> > "chankc" <chankwanchien@yahoo.com> wrote in message
> > news:954ab655.0210302354.2aedc62a@posting.google.com...
> > > I am currently working on this two multirate signal processing. Anyone
> > > has VHDL code for decimator and interpolater for my reference?
> > that depend of which method is use to interpolate or decimate ?!
>
> Would you give me some example using two stages... CIC follow by FIR...

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 49514
The Xilinx Virtex-IIPro demo at DAC had an LED bank where the intensity of each LED rose smoothly from zero to 100%, then faded back to zero. They may have done this with the on-board PowerPC, but I have implemented it as standard VHDL. The code is in the 'free stuff' section of www.rockylogic.com

The basic idea is
1. generate a 150Hz (approx) Tick signal from the master clock
2. generate a saw-tooth incrementing Delta register which wraps round to 0.
3. generate LedOn via a phase accumulator

Here is the central code:

    -- Accum is the phase accumulator for the PWM
    constant B: integer := 5;
    signal Accum : unsigned(B downto 0);
    signal LedOn : boolean;

    process (clk)
      variable Acc, Delt: unsigned(Accum'range);
    begin
      if rising_edge(clk) then
        if Tick then
          Accum <= (others=>'0');
        else
          Acc  := '0' & Accum(B-1 downto 0); -- clear overflow to zero
          Delt := '0' & Delta;               -- bit-extend with zero
          Accum <= Acc + Delt;
        end if;
        LedOn <= (Accum(B) = '1');           -- overflow drives LED
      end if;
    end process;

Enjoy
/Tim
Article: 49515
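For the phase-accumulator PWM in the LED post above, a small software model may help show why the overflow bit gives a duty cycle proportional to Delta. This is a minimal sketch, assuming Delta is a B-bit unsigned value (B = 5 as in the VHDL) and ignoring the Tick reset; the function name is made up for illustration.

```python
def pwm_on_cycles(delta, bits=5, cycles=1024):
    """Count clock cycles in which the phase accumulator overflows.

    Mirrors the VHDL update: keep the low `bits` bits of the accumulator,
    add the zero-extended delta, and treat bit `bits` as the LED drive.
    """
    mask = (1 << bits) - 1
    accum = 0
    on = 0
    for _ in range(cycles):
        accum = (accum & mask) + delta   # clear old overflow, then add
        on += accum >> bits              # overflow bit drives the LED
    return on


duty_quarter = pwm_on_cycles(8) / 1024    # delta = 8 of 32
duty_half = pwm_on_cycles(16) / 1024      # delta = 16 of 32
```

Over a long run the LED is on for roughly delta / 2**bits of the cycles, so sweeping Delta up and down with the saw-tooth from step 2 produces the smooth fade.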
Austin Franklin wrote:

> I don't understand. What tools are you talking about? I simply run my
> testbench, written in HDL, same as I would for any design even if the design
> is in HDL, and get the same output waveforms...or better yet, it displays
> the actual signals on the schematics.

My viewlogic license is only enabled for viewsim and viewdraw. It doesn't support the viewlogic VHDL; that was extra. Frankly, the Viewlogic VHDL is quite poor compared to Modelsim or Aldec.

> What about placement? Problems I've had were the tools didn't allow the use
> of consistent names, either when a change was made with to either the
> design, or the toolset.

I haven't had many problems with placement, at least not with stuff instantiated in the code (that is one of the major reasons I do as much structural instantiation as I do). Only inferred logic changes its names; instantiated logic generally does not. Inferred flip-flops generally take on the name of the output net, so there is no problem floorplanning using inferred flip-flops. The LUTs do tend to get random names, so they are not as easy to deal with in floorplanning, but then if you do your design with one level of logic, the mapper packs them with the flip-flops anyway.

> And what precludes you from doing that with schematics? Did you ever see
> Philip's tool for generating schematic elements?

Yes, I did, and it is a very nice tool too. I never did get my own copy because just as he came out with it I was in the middle of a transition to HDLs. HDLs give you that capability without having to obtain an add-on tool.

> > The advantage is if I make a change to the macro, it only gets changed
> > in one place, which is not necessarily true with schematics (using 2 bit
> > slices for arithmetic, it is almost true, but you still have the special
> > cases at the start and end of a carry chain). The parameterization
> > includes options for layout, assignment to different device families
> > (RLOC format for example), automatic signed/unsigned extension, automatic
> > selection of reset vector values with the proper FDRE/FDSE etc. These are
> > things that were a little awkward with schematics, and are very easy to
> > do with the HDL generates.
>
> Hum, I don't find them awkward at all with schematics, but do with HDLs...

For example, say you have (this is from a design that I did with schematics) a bank of 128 129-bit LFSRs. Each is identical except it has a different reset value. With an HDL, you can construct one parameterized module that generates the proper combination of sets and resets without ever having to look inside the module. Then you can instantiate those 128 modules in a generate statement that indexes a constant array (probably in a package in another file, so that you never have to modify the source even if you change the constants) to parameterize the initial values of each module. If I want to change the initial values, I just edit the list of initial values, which by the way can be expressed as binary, decimal, hex, octal or any mix of those you like. With schematics, you need to generate each module using FDREs and FDSEs. As I recall, Philip's tool didn't do this readily, and it certainly didn't read the init values out of a common file.

Similarly, I can set up filter coefficients for distributed arithmetic filters as a naturally ordered list of coefficients in a separate package. My VHDL filter is parameterized to read the coefficients from a file, process them (with a procedure) to create the init values for the DA LUTs, and then build the filter including placement. The code is parameterized for the coefficient width, filter add tree width, bits per clock, length of the filter etc., and I never have to go inside that module to modify anything.
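The step of turning a coefficient list into DA LUT init values can be illustrated in a few lines. This is only a sketch of the general distributed-arithmetic idea, not the VHDL procedure mentioned above: it assumes one 4-input LUT bank covering four taps, with address bit k selecting coefficient k, and the coefficient values are placeholders.

```python
def da_lut(coefs):
    """Table of partial sums for one distributed-arithmetic LUT.

    Entry `addr` holds the sum of the coefficients whose tap bit is set
    in `addr` (bit k of addr selects coefs[k]).
    """
    return [
        sum(c for k, c in enumerate(coefs) if (addr >> k) & 1)
        for addr in range(1 << len(coefs))
    ]


coefs = [3, -5, 7, 2]      # placeholder coefficients, not from the post
lut = da_lut(coefs)        # 16 entries for a 4-input LUT bank
```

Each output bit of the table would then become the INIT string of one physical LUT, which is the kind of bookkeeping that is easy to automate from a coefficient file and tedious to redo by hand.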
It took a while to build the library to where I was as productive with VHDL as I was with schematics, but I am now well past there.

> Mainstream? Not really. Synplify may be the "tool de Jour", but I don't
> see that as being any better than schematics, though you are locked to a
> single vendor with schematics, no doubt. Also, as you know, every damn
> revision of these HDL compilers generates different code...which wreaks havoc
> on some designs.

Depends on the definition of mainstream, I suppose. Where I come from, mainstream means that which most people are doing, not that which is 'best'. When all of my customers are asking for an HDL design flow, I think that can be described as the mainstream. Schematic entry for FPGA design, like it or not, can't really be considered mainstream anymore, at least by the definition of mainstream I am familiar with. You can get around the variations between compilers by using structural generation for the critical parts of your design. We do it in a large percentage of each of our designs. As an indicator, I spend far more time tweaking things for PAR than I do getting the synthesis to turn out what I want.

> I agree. Designs I do for my own projects, I do in schematics...simply
> because it keeps the parts cost down, ups the speed significantly...and I
> don't have to wrestle with the tools. I do mostly HDL work for clients now,
> as for misbegotten reasons, they believe it saves them time and money...when
> in every instance, it absolutely, unquestionably does not.

I'm not sure whether an HDL will save money or not. Because my library is far more parameterized than I was able to achieve with schematics, and because I have the option of structural construction where it matters or RTL-level coding where things are not as critical, my design capture is perhaps a little bit shorter than it was with schematics. I have seen tremendous gains in the simulation, however, because the sophistication of the testbenches is much higher.

> > So will I be seeing you in San Jose tomorrow? If so, we can discuss this
> > in person.
>
> No, sigh...I am unable to make it, but I was assured by Philip that you
> would defend the fort better than either of us would ;-)

Depends which fort. I don't defend the schematic fort anymore. I got out of there right before it burned down around me. As for using a mix of schematics and HDLs, I find that more awkward than using either...it means maintaining two libraries, proficiency on additional tools, and customers griping louder because of more tools needed to support a design.

You missed a good meeting. They were very receptive and have been following up this week (which is something we didn't see before).

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 49516
This particular one demodulated 160 channels from a high IF. IIRC, the channels were around 1 MHz, individually tuned. The final demod filter was shared by all channels. I think it was 48 taps. It was a DA filter, and it was clocked at something like 192 MHz. VirtexE-6, 12 to 16 bits in the coefs, 10 or 12 bit data.

Ken Mac wrote:
> Ray,
>
> That is a lot of channels!
>
> Are you able to elaborate a little more on the specs of the system?
> (full-parallel/N clocks per sample, filter type (singlerate/rate-changing)
> input sample widths, coefficient widths, clock rate/sampling rate, device
> you put it on etc.) - I am just interested to know what sorts of things
> people do DSP-wise in the real world (I am in academia just now).
>
> Thanks for your time,
>
> Ken
>
> > I bid one job that was to have 160 channels in one filter last year.
> >
> > Ken Mac wrote:
> >
> > > Hello folks,
> > >
> > > Xilinx coregens DA filter supports up to 8 channels for some of the FIR
> > > filter types.
> > >
> > > Could you please let me know what is the largest number of channels you
> > > have used/seen used through a single FIR filter of any type (including
> > > rate-changing) on an FPGA?
> > >
> > > Thanks for your time,
> > >
> > > Ken

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 49517
In his case, where one input is fewer bits than the other, the upper part is simply an incrementer (assuming no sign extension of the shorter input). In Virtex*, as well as all the Altera devices, the carry chain needs to use the LUTs, so you are stuck with those LUTs. If you look closely at the slice structure, you'll note that the carry mux select input is driven by the LUT output, so even if the LUT is a constant, you still need it to control the carry chain.

In the 4000/Spartan series parts you basically needed the LUT to get at the bit outputs from the carry chain, so unless you weren't using particular bits, you still had to use the LUTs there. The advantage the 4K carry chain had was that if you were only using the output at the end of the carry chain, perhaps as a subtractive comparator, then the LUTs were available for other things, with the restriction that a few of the inputs were shared with the carry chain. That made certain functions more compact than can be done in Virtex, but not your case.

Peter Alfke wrote:
> The answer is negative:
> Adding one bit to a 24 bit value is effectively a 24-bit incrementer or counter
> ( flip-flops are free). That has always taken 24 LUTs.
>
> Peter Alfke
> =========================
> Sanjay Patil wrote:
>
> > Hi,
> > We are using Virtex-2 device for one of the applications and We have
> > observed that the LUT utilization for a 16 + 16 bit adder is 16LUTs and also
> > 16 + 1bit adder is also 16LUTs. The above is because the Carry chains are
> > routed through LUTs. Is there any possibility of reducing the LUTs in case
> > where unequal number of bits are added to less than the Max number of input
> > vector. i.e. if say 24bits are added with 1bit, then can the logic can be
> > such that it utilizes less than 24 LUTs
> >
> > Can anyone help me.
> > Regards,
> > Sanjay

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 49518
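The point in the adder post above, that the LUT output drives the carry mux select on every bit, can be seen in a small bit-level model. This is a simplified sketch of a Virtex-style carry chain (one LUT, one carry mux and one XOR per bit); the function is illustrative and does not model any vendor primitive exactly.

```python
def carry_chain_add(a, b, width):
    """Add two unsigned values with a per-bit LUT + carry mux + XOR model.

    Returns (sum modulo 2**width, carry out).
    """
    carry = 0
    result = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        lut = a_i ^ b_i                  # LUT output, also the mux select
        sum_i = lut ^ carry              # XOR of LUT output and carry in
        carry = carry if lut else a_i    # mux: propagate carry, or generate/kill
        result |= sum_i << i
    return result, carry


# 24-bit value plus a 1-bit value: the upper 23 bits have b_i = 0, so the
# LUT degenerates to a pass-through of a_i, but it is still needed to
# steer the carry mux on every bit.
total, cout = carry_chain_add(0xFFFFFF, 1, 24)
```

With `b_i` fixed at 0 the per-bit logic is just an incrementer, which is the 24-LUT result described in the quoted reply.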
We generally register just the outputs and then just watch the levels of logic on the input. The reasoning is that this mirrors the structure of the FPGA, which has a LUT preceding a flip-flop. On critical stuff where you don't know what it is interfacing to, it is sometimes prudent to add an input register.

John Williams wrote:
> Ray Andraka wrote:
>
> > Putting registers on the I/O of the design alleviates the
> > need for the synthesis tools to try to optimize across module
> > boundaries. As we move into larger devices and get into more
> > of a modular design flow, you'll generally want to at least
> > register the module outputs, just to maintain consistency in
> > timing.
>
> If you register both the inputs and outputs of modules, and connect a
> variety of such together, do you end up with double registering at
> module boundaries, and the associated cycle latency? Or can the tools
> (be told to) remove redundant registers?
>
> Basically, what are the "best practice" guidelines in this regard?
> Register on inputs, register on outputs, both, ...???
>
> John

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 49519
If you can stand pipelining latency, I think you'll be able to get that speed without too much trouble. It's such a simple design functionally, why don't you give it a try?

Regards

brandbrow98@yahoo.com (bbrown) wrote in message news:<179756b1.0211130947.fed7116@posting.google.com>...
> I'm looking for resources, either free or IP, I can use to make a
> simple kind of switching fabric in an FPGA. I have a bunch of 8-bit
> busses (actually 5 sets of input/output busses) I would like to funnel
> into a single FIFO, or connect arbitrarily to each other. I've heard
> horror stories of trying to do this muxing inside an FPGA and meet
> timing (I'll be running these busses at 54 MHz) so I'm interested in
> what people have experience with.
>
> thanks in advance,
> brandon
Article: 49520
Hi,
Just find the IEEE paper on "Multiplexer Based Array Multiplier".

--Sanjay

"mehmeto" <mehmetozcelebi@turk.net> wrote in message news:e6ae63f0.0211130755.4ed0ef88@posting.google.com...
> Dear Computer Arithmetic Gurus,
> I am currently working on the implementation of an unsigned Parallel
> Multiplier. After reading some articles I found the modified Booth-2
> algorithm suitable.
> It was described in Al_Twaijry's thesis "Area and Performance
> optimized CMOS multipliers" page 11 ,1997.
> I wonder if the figure shown in the thesis page 11 is still the state
> of the art way to produce partial products? are more advanced
> techniques discovered since 1997?
>
> Thanks
Article: 49521
"Amy Mitby" <amyks@sgi.com> wrote in message news:2d2a8f5d.0211121059.5eb0a76d@posting.google.com...
> Are there any major benefits or disadvantages to optimization
> with a synthesis design flow where module boundaries are
> registered either at inputs or outputs? In other words, do the
> tools' optimizations across module boundaries sometimes work
> better than self-imposed sequential boundaries for reaching
> better performance, or is it best to put those register boundaries
> in yourself and floorplan the location of those registers?

If you're going to do floorplanning, you'll have to tell the synthesis tool to retain your hierarchy, and that kills the inter-module optimizations. However, there's No Way I'd bet my project on those kinds of optimizations anyway, so I'd say it doesn't make any difference from a tools point of view if your inputs are your regs, your outputs are your regs, or your regs are buried inside your module.

From a technology point of view, however, the Xilinx primitives do optimize the case of an equation followed by a flop, so for that reason alone it might be a good idea to not have input flops, only output or buried.

The guideline "register your outputs" is lore that came from old ASIC tools that used the hierarchy to define the placement starting point. In that case, intra-module delays would be fast, but inter-module delays could (and would) be slow, because you didn't know what block would be placed next to what. The Xilinx tools do not do that: they flatten the design, throw away your hierarchy, then place. The only way to enforce your hierarchy in the physical domain is to floorplan, and in that case you do the floorplan yourself, so you will know which inter-module paths are slow and which are fast, based on your placement.

-Stan
Article: 49522
> I'm looking for resources, either free or IP, I can use to make a
> simple kind of switching fabric in an FPGA. I have a bunch of 8-bit
> busses (actually 5 sets of input/output busses) I would like to funnel
> into a single FIFO, or connect arbitrarily to each other. I've heard
> horror stories of trying to do this muxing inside an FPGA and meet
> timing (I'll be running these busses at 54 MHz) so I'm interested in
> what people have experience with.

With that number of busses and that speed, go ahead and code anything you see fit. Unless you use a tiny or slow FPGA, you should have minimal problems.

-Stan
Article: 49523
Nothing says you have to use full IEEE floating point. A numerical analysis of your problem will tell you how much dynamic range and accuracy you need at each step. From there you can determine how many bits of exponent and significand are really needed. Small-format floating point, allowed to denormalize over several operands, can get you an implementation that is more efficient than fixed point.

Goran Bilski wrote:
> Hi Stan,
>
> You can do IEEE 754 but the efficiency will be low.
> A fully compliant single precision add/subtract unit would require around 800
> LUTs and take 6 clock cycles to perform running around 100 MHz
> How efficient is that when you compare against 32-bit integer operations which
> takes 32 LUTS and 1 clock cycle at 250 MHz?
> Quantitative (Number of operations per seconds/ needed area)
>
> Floating point : (100_000_000/6)/800 = 20833
> Integer : (250_000_000/1)/32 = 7812500
>
> Integer operations are roughly 400 times more efficient than floating point.
>
> However floating point has some benefits of larger ranges and easier handling of
> different sizes of values.
>
> But if you know the exact algorithm and can translate it to integer operations,
> you will gain 400 times more efficiency.
> Not bad.
>
> Göran
>
> Stan wrote:

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
Article: 49524
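The operations-per-second-per-LUT comparison quoted in the floating-point post above is easy to reproduce. The figures below are the ones given in the quoted text (800 LUTs, 6 cycles, 100 MHz versus 32 LUTs, 1 cycle, 250 MHz); the variable names are just for illustration.

```python
# Throughput per LUT, using the numbers quoted in the post.
fp_ops_per_lut = (100_000_000 / 6) / 800      # single-precision add/subtract
int_ops_per_lut = (250_000_000 / 1) / 32      # 32-bit integer operation

ratio = int_ops_per_lut / fp_ops_per_lut

print(f"floating point: {fp_ops_per_lut:.0f} ops/s per LUT")
print(f"integer:        {int_ops_per_lut:.0f} ops/s per LUT")
print(f"ratio:          {ratio:.0f}x")
```

The ratio works out to 375, which the quoted text rounds to "roughly 400 times". A reduced-width floating-point format, as suggested in the reply, would shrink both the LUT count and the latency on the floating-point side of this comparison.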
To add to what I wrote earlier: I think you'll find that the Block RAMs are not quite as slow as they were for Virtex and VirtexE. The enables still have longer setup times. You can keep them out of the critical path (and slightly alleviate congestion around the RAMs) by tying the enables high and then using the address to steer unwanted writes to locations that will not matter (the next location for a linear buffer or FIFO).

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759