You might want to look at Xilinx's CORE Generator (coregen). It's a relatively new tool that will give you the RPM, a VHDL fragment to cut and paste into your code, and a behavior model. Use the behavior model for simulation, then treat the entity as a "black box" for synthesis. The coregen online docs do a reasonable job of explaining the flow. Good Luck Carl Christensen Jacob W Janovetz wrote: > Hi... > > I'm interested in using something like the 16-tap 8-bit FIR > described in Xilinx's app note. I'm using Leonardo for > VHDL synthesis and, perhaps I'm missing something obvious, > but it seems that I should be able to just invoke this RPM > somehow and not do a whole lotta work. Is my intuition > wrong? How do I go about doing this? A pointer to a document > would be fine if there is one out there. > > Cheers, > Jake > > -- > janovetz@uiuc.edu | Once you have flown, you will walk the earth with > University of Illinois | your eyes turned skyward, for there you have been, > | there you long to return. -- da Vinci > PP-ASEL | http://www.ews.uiuc.edu/~janovetz/index.htmlArticle: 10001
Hello, I was wondering if there are any 'good' high-level-language-to-hardware compilers in existence. Bill J.Article: 10002
I'd like to design an arbiter which processes two, four, or more interrupt requests, where the controller acknowledges them with no priority -- just "first come, first served". How do I implement it with the fewest cells? Any help would be greatly appreciated. Regards, Channing Wen -----== Posted via Deja News, The Leader in Internet Discussion ==----- http://www.dejanews.com/ Now offering spam-free web-based newsreadingArticle: 10003
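[Editor's note: one common reading of "no priority" in the question above is rotating (round-robin) priority, where the search for the next grant starts just past the requester served last. The small behavioral C sketch below illustrates that policy only; the interface is made up for illustration and is not the poster's design, and a strict first-come-first-served arbiter would additionally have to record the order in which requests arrived.]

/* Behavioral sketch of a rotating-priority ("round-robin") grant function.
   The interface and the round-robin interpretation are assumptions for
   illustration, not the poster's design. */
#include <stdio.h>

/* req:  bit i set means requester i is asking.
   last: index of the most recently granted requester (-1 if none yet).
   n:    number of requesters.  Returns the grant index, or -1 if idle. */
int rr_grant(unsigned req, int last, int n)
{
    for (int k = 1; k <= n; k++) {
        int i = (last + k) % n;        /* start searching just past the last grant */
        if (req & (1u << i))
            return i;
    }
    return -1;
}

int main(void)
{
    int last = -1;
    unsigned req = 0x5;                /* requesters 0 and 2 asking */
    for (int t = 0; t < 4; t++) {
        int g = rr_grant(req, last, 4);
        printf("cycle %d: grant %d\n", t, g);
        if (g >= 0) last = g;
    }
    return 0;                          /* grants alternate 0, 2, 0, 2 */
}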
Billy Bagshaw wrote: > Is there a good way to compress the data for a 10K20. I have a board > wich does not have JTAG wired up so I can not use JAM. Any ideas ? a simple run-length encoding should work fine and cheaply. FPGA programming data tends to have lots of zeros or ones in a row. you simply count up the number of zeros, for instance, then output a special code that says "N zeros are here". the tricky part is those small areas where there are many short strings of zeros and ones. saying "1 zero here, 2 ones here, 1 zero here, 1 one here, ..." then becomes inefficient. more details follow.... an efficient way to handle this is to use a bit-stuffing algorithm and/or an escape code. at this point i must also ask: are you using software or hardware for the decompression? software works nicer with bytes and hardware works better with bits. the exact details of the scheme you choose will depend on this. let's say you are using software... it's easier to talk about. what you do is choose a data size to work with (say 8 bits) and then an escape code which you expect to be uncommon in the data stream, say 10000001. to compress, you read the FPGA data one byte at a time, looking ahead at the next byte. if you notice that the input byte is not repeated as the next byte, then output the input byte. EXCEPTION: if the input byte is the same as your escape code, you must output the escape code twice followed by a count value of 1 (see the next step in case the escape code is repeated multiple times in the input). if the input byte is repeated in the next byte, then count how many more times it is repeated. then you output the escape code, the byte value, and the count (also an 8-bit value). this will compress up to 256 bytes with the same value down to just three bytes. notice that this increases the size of your data if the escape code occurs frequently in the input. that's the drawback, so you must choose an escape code that is uncommon. decompression should be obvious. read the compressed data one byte at a time. if the byte equals the escape code, then read the next two bytes as the value to output and the count, otherwise just output the read byte. this compression scheme can also be implemented in hardware but you have to work on byte data, which may be awkward. you can do a nice bit-serial version of this in hardware, which may be more efficient. but it is more complex to explain, so bear with me as i give it a try: i'll assume you are getting your input data in a bit-serial fashion. choose a data size for your escape code (can be 5 bits, 9 bits, anything!). larger escape codes are good because they are less common in the input data, however they also increase the size of the hardware FSMs and increase the length of a bit-run you need to get an advantage out of compression. during compression, you need to do bit-stuffing to make sure your escape code never occurs in the output stream except when you want it to. suppose your escape code is 0001. now you should never allow 00010 or 00011 to appear in the output stream unless you do so intentionally! whenever you see 0001z in the input stream, you output the value 00010z instead (where the z holds the same bit value between input and output). now to do compression, you read the input one bit-at-a-time, and you output almost every bit you input (but slightly delayed). you have an FSM that is watching the history of bit values and does the bit stuffing to make sure the escape codes are treated as discussed above. 
as well, you are looking for long strings of ones or zeros to compress. when you find a sufficiently long string, you output the special code 00011 followed by the string value (1 or 0), followed by the count (how many bits do you want to make the counter? choose some appropriate value, say 6 bits). this will compress a bit string of up to 2^6=64 bits down to 5+1+6=12 bits. if you notice that strings of zeros or ones tend to be longer than 64, increase the count size accordingly. notice your bit stream will increase in size if 0001z occurs often. assuming it occurs 1/16 of the time, then your overall increase will be limited to one bit per 16*4=64 bits, a modest amount. decompression is easier. just read the values bit-at-a-time. whenever you detect 00010z, output 0001z instead. if you detect 00011, then get the next set of bits (zCCCCCC) and output the value z repeated the number of times specified by the count in CCCCCC. most of this stuff is taken from my (aging) memory. i may have overlooked something in the hardware bit-stuffing, but i think it will work. the design of the FSMs to detect the escape codes can be a bit tricky (it's a common question given on undergrad tests!), but you can probably manage. guy lemieuxArticle: 10004
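[Editor's note: the byte-oriented, software flavor of the scheme above can be sketched in a few lines of C. This is only a sketch under the assumptions stated in the post: the 10000001 (0x81) escape value and the 8-bit count come from the example there, and runs longer than 255 bytes are simply split into several records. It is not Altera's own bitstream compression format.]

/* Sketch of the byte-oriented escape-code run-length scheme described above. */
#include <stddef.h>

#define ESC 0x81u   /* escape code 10000001, assumed to be rare in the data */

/* Compress src[0..n-1] into dst (sized for the worst case of 3 bytes per
   input byte); returns the number of bytes written. */
size_t rle_compress(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t i = 0, o = 0;
    while (i < n) {
        unsigned char v = src[i];
        size_t run = 1;
        while (i + run < n && src[i + run] == v && run < 255)
            run++;
        if (run > 1 || v == ESC) {         /* repeated byte, or a literal escape byte */
            dst[o++] = ESC;
            dst[o++] = v;
            dst[o++] = (unsigned char)run; /* ESC ESC 1 encodes a single escape byte */
        } else {
            dst[o++] = v;                  /* unrepeated, non-escape byte goes out as-is */
        }
        i += run;
    }
    return o;
}

/* Decompress; dst must be large enough for the expanded data. */
size_t rle_decompress(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t i = 0, o = 0;
    while (i < n) {
        if (src[i] == ESC) {               /* escape: next two bytes are value and count */
            unsigned char v = src[i + 1];
            size_t run = src[i + 2];
            while (run--)
                dst[o++] = v;
            i += 3;
        } else {
            dst[o++] = src[i++];
        }
    }
    return o;
}

As the post notes, the scheme only pays off if the escape value really is rare in the bitstream; a quick histogram of the configuration data is a cheap way to pick one.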
G. Herrmannsfeldt wrote: > The Altera FPGA uses a lookup table for both sum and carry, and may > be able to do what you want. the problem with most architectures (including Altera) is the carry output does not go to the general interconnect, only to the next adjacent LUT. why not? because an extra output would require a rather largish output driver and a handful of programming bits and connecting elements (eg, SRAM bits and pass transistors). this would increase the size of the layout for the FPGA. this increase in layout area must be *balanced* against the increase in the # of CLBs for the (few?) subcircuits which would benefit from the extra output. designing an area-efficient FPGA is a tricky task because there are many demands to consider and balance. as spock said, "the needs of the many outweigh the needs of the few". guy lemieuxArticle: 10005
Vitit Kantabutra wrote: > Thanks a lot for your extensive reply. The reason I'm struggling with > carry-save adders is because I'm trying to do the X & Y iterations for my > radix-4 CORDIC algorithm. A Xilinx application engineer responded by > suggesting reducing 4 operands to 3, instead of 3 to 2. (He's done carry-save > adders himself.) But it seems to me that this would take 3 CLB's for every 2 > bit positions. Is it possible to do better? yes, there are a few other possibilities. it seems your ultimate goal is to get circuit speed. have you considered pipelining your design instead? you can even pipeline the carry chains if you must (ie, delay the upper bits by an extra cycle if the carry-propagate is on the critical path). also, you are concentrating on having 17 3:2 carry-save FAs. why not group the bits into threes (or larger) and use a ripple carry in the middle? one CLB can implement 2 bits of FAs using ripple carry to another CLB, which can implement the 3rd bit and a final carry out. this uses 2 CLBs to produce 3 sum bits and 1 carry-out. the final 17-bit hybrid/carry-save adder would use 11.5 CLBs. since you will have only one carry-save bit per 3 input bits, the design will probably route easier. your 3:2 FA design used 17 CLBs, or 34 CLBs to do a 4:2. the FAE's suggestion sounds like it will use 17/2*3 = 27 CLBs to do a 4:3. my design should use 11.5+11.5 = 23 CLBs to do a 4:2. only 15% larger than your desired 10 CLB 3:2 :-) guy lemieuxArticle: 10006
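[Editor's note: for readers following the carry-save discussion, a 3:2 carry-save stage is easy to model behaviorally: each bit position produces a sum bit and a carry bit, and no carry ripples across positions until a final adder combines the two vectors. The small C model below is only an illustration of that idea; the word width and names are arbitrary.]

/* Behavioral model of a 3:2 carry-save stage: three operands in, a sum
   vector and a carry vector out, with no carry propagation between bit
   positions.  Width and names are illustrative only. */
#include <stdio.h>
#include <stdint.h>

void csa_3to2(uint32_t a, uint32_t b, uint32_t c,
              uint32_t *sum, uint32_t *carry)
{
    *sum   = a ^ b ^ c;                           /* per-bit sum, no ripple            */
    *carry = ((a & b) | (a & c) | (b & c)) << 1;  /* carries, shifted up one position  */
}

int main(void)
{
    uint32_t s, cy;
    csa_3to2(11, 7, 5, &s, &cy);
    printf("sum=%u carry=%u total=%u\n", s, cy, s + cy);  /* prints total=23 */
    return 0;
}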
Rickman wrote in article <353D2639.EF0614AC@yahoo.com>... >Prof. Vitit Kantabutra wrote ...snip.. > >-- >Another alternative would be to use an Atmel part. I believe they can >implement a full adder in a single cell (with a rather minimalist >architechture). The big advantage is that you get so many more cells >than you do in a comparable sized Xilinx part. I forget the numbers, but >I believe there are about 6 or 8 times as many cells for a part of a >given size rating. The limitation in the Atmel FPGA is the limited >interconnect. It is based on adjacent cell connections with few and >rather limited interconnect otherwise. But they have a new line coming >out that sounds like it gets around this somewhat. We have two architectures that are both well suited to this problem - you are talking about AT6K above and also mention the new AT40K devices. The AT40K architecture has a full adder in a core cell, two outputs and enough routing resources for the most demanding designs. The FREE AT40K development software is available at (http://www.atmel.com/fpga_software.html) and includes our parameterised macro generators and our **OPEN** Macro Generator Language (MGL) - if you don't like what we have done on a macro it is very easy for you to change it to what you want - or just create your own from scratch - all the documentation and examples are on the CD. Martin Mason Atmel Corp. > >But then I use Xilinx for everything I do. ;-) > > > >Rick Collins > >rickman@XYwriteme.com > >remove the X and Y to email me.Article: 10007
Hello, I am a library science student at San Jose State University who is conducting a study on the development of romantic relationships that began (and perhaps ended) through online communication. If you or someone you know has had experience in this area and would be willing to spend a few minutes responding to a confidential survey, please email me at lchow@wahoo.sjsu.edu. Thanks, Lillian ChowArticle: 10008
On Sat, 18 Apr 1998 19:06:22 -0400, Rickman <spamgoeshere1@yahoo.com> wrote: >I have a partial copy of class handouts labled "Timing & Constraints". I >have 24 pages, starting with slide 3 and runing to slide 59, missing >many slides here and there. thanks - i got this from xilinx tech support. there are 64 pages, dated jan 98, containing some very useful stuff that i haven't seen anywhere else (and i've spent a lot of time looking!) i've uploaded the whole presentation to: http://www.riverside-machines.com/pub/timingcsts.htm for anyone else who's interested. evan (ems@nospam.riverside-machines.com)Article: 10009
In fact this idea arose when I answered the e-mail from the person below. As I have come to a new company doing ASIC & FPGA design, I would like to seek advice from different people on which new tool the company should buy for synthesis, simulation and implementation. Xilinx - very popular vendor. Good FPGA tools. Excellent for moderate designs like joysticks or guns for playing TV games (PS, .....). But it seems quite difficult for troubleshooting a large design of about 30,000 gates in a PC environment. I have used it to design a video ASIC chip with a DRAM & SDRAM controller. Another problem is the cost. It is quite expensive, probably because its name is commonly known. Epson - Auklet. SLAx000 series. This tool is for gate-level design. You can draw schematic diagrams and simulate. Its cost is low for small quantities. Besides, it has some embedded chips for FPGAs. Exemplar Leonardo - ModelSim (V-system) - Fast simulation even for large designs Viewlogic - ViewLogic consists of ViewDraw, ViewSim and ViewTrace, which allow us to draw schematic diagrams with connections of module blocks such that each block can be formulated from the library or based on VHDL code (compiled by Designer Series etc.). We would then check whether the VHDL code or connections are correct by simulation through ViewSim. One can execute the command file which contains all the input signals (to test all the blocks). Or you may compile & execute the Testbench file. Actually I use ViewLogic for small designs, as it takes a long time to compile. Someone may like the schematic generation, perhaps for easy troubleshooting. ---------- From: Yip Yiu Man Sent: Wednesday, April 22, 1998 8:39 AM To: 'engp6396@leonis.nus.edu.sg' Subject: RE: Exemplar Leonardo Experiences; Many bugs in ViewLogic & MAX Siew Kuok, Actually I use ViewLogic for small design, as it takes long time to compile. Someone may like the schematic generating, perhaps for easy trouble shooting. In the past, I simulate the result like Xilinx command tool, ViewSim, executing the command file. It seems to me that this one costs a lot of time to do synthesis and sometimes some unexpected results occur. Perhaps these are the most important factors that people concern. Best regards, YM Yip Yip Yiu Man, Leslie 5/F ES Phone: +852 26912 835 FAX: +852 26192 159 E-mail: leslie.yip@asmpt.com ---------- From: engp6396@leonis.nus.edu.sg[SMTP:engp6396@leonis.nus.edu.sg] Sent: Monday, April 20, 1998 5:14PM To: leslie.yip@asmpt.com Subject: Re: Exemplar Leonardo Experiences; Many bugs in ViewLogic & MAX Hi Leslie, While reading the newsgroup, I came across your message posting and I would like to seek advice from you on this : Software programs like ViewLogic consists of ViewDraw,ViewSim and ViewTrace which allows us to draw schematic diagrams with connections of module blocks such that each blocks can be formulated from the library or based on VHDL codes (compiled by Designer Series etc.). We would then checked out the whether if the VHDL codes or connections are correct by simulation through ViewSim. I suppose by executing the command file which contains all the input signals (to test all the blocks), it is like compiling & executing the Testbench file ? It seems that with the ViewDraw & ViewSim, we can more or less elminate the need for writing test benches, why would people still stick to using test benches sort of thing ? Is the ViewSynthesis, the application to compile & link Test Benches & programs of VHDL codes ? 
Hope you don't mind these simple questions as I am still a novice in VHDL & ViewLogic. Thanks, Siew Kuok -----== Posted via Deja News, The Leader in Internet Discussion ==----- http://www.dejanews.com/ Now offering spam-free web-based newsreadingArticle: 10010
William Jones <wbjones@msn.com> writes: > > Hello, > I was wondering if there are any 'good' high-level language to > hardware compiler's in existence. > > > > > Bill J. > You may want to have a look at our web site (http://www.comlab.ox.ac.uk/oucl/hwcomp.html) and the Handel-C pages (http://www.comlab.ox.ac.uk/oucl/users/ian.page/handel/handel.html) A quote from this page: "Handel-C is a programming language designed for compiling programs into hardware implementations. It is a subset of the C language extended with a few powerful extra features. However, Handel-C is definitely not a hardware description language. Indeed it is our aim to develop a single programming language which can be used effectively for creating complete systems which have both hardware and software components. We believe that Handel-C is a significant step along this road." We have so far been quite successful in compiling to FPGAs and people seem to like the programming approach to hardware design (some endorsements can be found on: http://www.comlab.ox.ac.uk/oucl/users/ian.page/hwcomp/endorsements.html) For more info you can contact Ian Page (Ian.Page@comlab.ox.ac.uk) who leads the Hardware Compilation Group here Cheers Matthias -- Matthias Sauer Tel +44-1865-283549 Oxford University Computing Laboratory Fax +44-1865-273839 Wolfson Bldg., Parks Road email masa@comlab.ox.ac.uk Oxford OX1 3QD, U.K. URL: http://www.comlab.ox.ac.uk/oucl/users/matthias.sauer/Article: 10012
I am involved in a design using Xilinx FPGAs, using the Foundation 1.4 package. I am configuring the part from a Xilinx 17128D serial PROM. To make the PROM work, I must change the polarity of the RESET/OE pin by writing data to 4 locations in the part. For the life of me, I can't seem to find a way to automate this using the Foundation toolset. Currently I must edit the MCS file or manually edit the data on the programmer. Surely this has been addressed many times and an easy solution is available, or the solution is within the toolset and I'm just missing it. Any help in solving this problem will be greatly appreciated. -- Michael Ellis first initial last name at pesa dot you know whatArticle: 10013
Mike Kelly wrote: > > Hey all, > > I am getting strange problem with my Minc fitter for Synario. I have > a single clock pin assigned for a MACH466. I am using about 180 > macrocells out of 256. When the Minc fitter runs, it assigns another > non-clock pin for my clock! I don't use the clock as an input in any > equations, just for clock. Why the heck would the fitter decide I > needed this clock as an input signal? When I run the static timing > analyzer it shows several of my output signals as being asyncronous, > which indicates that the clock for that output is coming from the I/O > pin, not the clock pin! I have tried Synario support, and they just > pointed the finger at Minc. I am inclined to believe them since the > pre-fit equations (and the post fit for that matter) show this clock > input as being used as a clock only, not as a signal. I left a > message for Minc, but I don't know when they will get back to me. > > Any ideas would be greatly appreciated. > > Mike Thanks for the responses. It turns out that for the MACH4 if you use a reset signal in a block that does not go to all the flops in that block, the ones that do get the reset will be put into async mode. When this happens, the flops in async mode can only use global clocks 0 and 1. Clocks 2 and 3 must be brought in through a product term. So that's why the fitter assigns an external i/o pin, assuming that you will know enough to tie the clock signal to that pin. It would be real nice if a fitter gave you a message that it had to do something outside of your original assignments in order to accommodate a need of the underlying architecture. Something like what I went through is very subtle and not easy to spot when you are new to the chip and/or the design tool. Mike P.S. I must say I received excellent support from MINC. The guy there tracked down this problem for me and came back with an answer in less than 4 hours. How often does that happen! -- ******************************************************* * Michael J. Kelly tel: (508) 278-9400 * * Cogent Computer Systems, Inc. fax: (508) 278-9500 * * 10 River Rd., Suite 205 web: www.cogcomp.com * * Uxbridge, MA 01569 email: mike@cogcomp.com * * * * CMA - Universal Target Platform for 32/64-Bit RISC * *******************************************************Article: 10014
I have just started to use Synopsys FPGA compiler and I am slightly confused regarding the difference in optimization between DC and FPGA-compiler libraries. I am trying to port a standard cell design to FPGA (Xilinx) and have (not surprisingly) run into some performance problems. If I use the design compiler libraries for Xilinx the compilation looks like when I'm using an ordinary standard cell library. If I instead use the libraries for the FPGA compiler the optimization looks like this:
===============================
compile -map_effort high
-cut-
                                OPTIMIZATION   DESIGN RULE
   TRIALS    AREA   DELTA DELAY     COST           COST
  --------  ------  -----------    ------         ------
      0    --------                   0
Optimization complete
---------------------
===============================
No optimization at all here ! Is this the correct behavior ? I can imagine that it is pretty difficult to do optimization on CLBs, but the performance I get is not very impressive. Is this step skipped on purpose or could this be Synopsys' way of saying "it's no use trying, I give up..." ? /Per Fremrot Per.Fremrot@tde.lth.seArticle: 10015
Mike Kelly wrote: > > Hey all, > > I am getting strange problem with my Minc fitter for Synario. I have > a single clock pin assigned for a MACH466. I am using about 180 > macrocells out of 256. When the Minc fitter runs, it assigns another > non-clock pin for my clock! I don't use the clock as an input in any > equations, just for clock. Why the heck would the fitter decide I > needed this clock as an input signal? When I run the static timing > analyzer it shows several of my output signals as being asyncronous, > which indicates that the clock for that output is coming from the I/O > pin, not the clock pin! I have tried Synario support, and they just > pointed the finger at Minc. I am inclined to believe them since the Of course, they are now the same company - Minc bought Synario. Have you checked the Vantis and Synario web sites to make sure you have the latest version of the fitter? Otherwise, try Vantis tech support (techsupport@vantis.com). On one occasion, they solved a fitter problem I was having by providing me with beta code for the next release. -- Barry A. Brown Microwave Instruments Division Hewlett-Packard Company **** Remove the "nospam" from my email address *****Article: 10016
leslie.yip@asmpt.com wrote in message <6hkboc$k6o$1@nnrp1.dejanews.com>... [snip] >Xilinx - very popular vendor. Good FPGA tool. Excellent for moderate design >like joystick, Gun for playing TV games (PS, .....). But it seems quite >difficult for trouble shooting on a large design about 30,000 gates on PC >environment. I have used to design a video ASIC chip with DRAM & SDRAM >controller. Another problem is the cost. It is quite expensive, probablity >because its name is commonly known. [snip] Various programmable logic vendors and related software vendors provide their tools at little or no cost for evaluation purposes. The Programmable Logic Jump Station maintains a list of available options at http://www.optimagic.com/lowcost.html. Also, as most of the programmable logic vendors are using advanced processing technology (0.35u, three-layer metal CMOS and beyond), FPGA prices will be coming down dramatically. There will always be some expensive FPGA, but at the upper end of performance or density. In some ways, the FPGA market is like the PC market. You can buy a high-end PC today, but the same machine will be much cheaper next year. Or, for the same amount of money as today's high-end PC, you could buy an even higher-performance machine next year. ----------------------------------------------------------- Steven K. Knapp OptiMagic, Inc. -- "Great Designs Happen 'OptiMagic'-ally" E-mail: sknapp@optimagic.com Web: http://www.optimagic.com -----------------------------------------------------------Article: 10017
In terms of using other companies' FPGA's, the only suggestion I got was to try Altera's ....... (I think it was the 8K series). But if I want to stick to Xilinx 4000's, then Lemieux's post contains an excellent idea. One of Andraka's posts also contains the same idea, perhaps in a little less detail. Xilinx does appear to think they are the immutable gods of this kind of design. If your design doesn't fit well in a Xilinx, then it's your problem, not theirs. I think they should learn from IBM about what could happen years down the road when they don't listen to suggestions! > I would be curious to hear what other people tell you about > other companies FPGA architectures that are more suited for > carry-save work. Maybe you could post a summary? > > - Larry Doolittle <ldoolitt@jlab.org>Article: 10018
Guy Lemieux wrote: > why not group the bits into threes (or larger) and use a ripple carry > in the middle? one CLB can implement 2 bits of FAs using ripple > carry to another CLB, which can implement the 3rd bit and a final > carry out. this uses 2 CLBs to produce 3 sum bits and 1 carry-out. > the final 17-bit hybrid/carry-save adder would use 11.5 CLBs. > since you will have only one carry-save bit per 3 input bits, the > design will probably route easier. > Actually, your scheme doesn't reduce the number of operands, so I'm not sure how to use it. That is, from what I understand, both the inputs and the outputs are two numbers. Granted, one of your output numbers (the carries) has fewer bits than the input operands, so the output will route more easily.Article: 10019
So why not have several "flavors" of CLB's? There are certainly enough models of FPGA's out there to support that. Guy Lemieux wrote: > designing an area-efficient FPGA is a tricky task because there > are many demands to consider and balance. as spock said, "the > needs of the many outweigh the needs of the few".Article: 10021
Great! How large a circuit can the software handle? Martin Mason wrote: We have two architectures that are both well suited to this problem....Article: 10021
Someone from Atmel also suggested their AT6K and AT40K devices. (mtmason@ix.netcom.com) Vitit Kantabutra wrote: > In terms of using other companies' FPGA's, the only suggestion I got was to > try Altera's ....... (I think it was the 8K series). But if I want to stick > to Xilinx 4000's, then Lemieux post contains an excellent idea. One of > Andraka's posts also contains the same idea, perhaps in a little less detail. > > Xilinx does appear to think they are the immutable gods of this kind of > design. If your design doesn't fit well in a Xilinx, then it's your problem, > not theirs. I think they should learn from IBM about what could happen years > down the road when they don't listen to suggestions! > > > I would be curious to hear what other people tell you about > > other companies FPGA architectures that are more suited for > > carry-save work. Maybe you could post a summary? > > > > - Larry Doolittle <ldoolitt@jlab.org>Article: 10022
Vitit, Even the smallest FPGA has many CLBs. Why is it so important to use 10 instead of 17 when you have hundreds or thousands? In article <353CC1BC.3FB6@isu.edu>, kantviti@isu.edu wrote: > > Frank Gilbert wrote: > > As far as I remember, a full-adder-cell has three inputs (a, b, > > carry-in) and two outputs (sum, carry-out). To generate two outputs, you > > will always need both function generators of one CLB (XC4K). This will > > lead to n CLB's for n full-adders. > > > > The only reason why the n-bit ripple-carry-adders generated by LogiBlox > > use less than n CLB's is by using dedicated carry-logic included in > > every CLB. It is difficult or nearly impossible to use this carry-logic > > for your own purpose. The dedicated carry-logic is explained in > > http://www.xilinx.com/xapp/xapp013.pdf . > > Thanks. That makes sense. But now I wish CLB's were designed in a more > carry-free-operations-friendly manner. Aren't there FPGA's out there > that are better suited to carry-free arithmetic? > -----== Posted via Deja News, The Leader in Internet Discussion ==----- http://www.dejanews.com/ Now offering spam-free web-based newsreadingArticle: 10023
I had to do this manually too, on my Data I/O Chiplab programmer. One could develop a little awk program which would run, in e.g. a batch file, and edit the .mcs file as required. >I am involved in a design using Xilinx FPGAs, using the Foundation 1.4 >package. I am configuring the part from a Xilinx 17128D serial PROM. To >make the PROM work, I must change the polarity of the RESET/OE pin by >writing data to 4 locations in the part. For the life of me, I can't seem >to find a way to automate this using the Foundation toolset. Currently I >must edit the MCS file or manually edit the data on the programmer. Surely >this has been addressed many times and an easy solution is available, or the >solution is within the toolset and I'm just missing it. Any help in solving >this problem will be greatly appreciated. Peter. Return address is invalid to help stop junk mail. E-mail replies to zX80@digiYserve.com but remove the X and the Y.Article: 10024
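[Editor's note: along the lines of the awk suggestion above, here is a rough C sketch of such a patch utility: it rewrites a few data bytes in an Intel-HEX (.mcs) file and recomputes each modified record's checksum. The addresses and values in the patches table are placeholders only; the real RESET/OE polarity locations come from the 17128D documentation, not from this sketch. Patch addresses are matched against the 16-bit record addresses only, which is fine for a PROM this small.]

/* Read an .mcs file on stdin, patch a few bytes, write the result to stdout,
   e.g.  patch_mcs < design.mcs > patched.mcs  (program name is hypothetical). */
#include <stdio.h>

struct patch { unsigned addr; unsigned char val; };
static const struct patch patches[] = { {0x0000, 0xFF} };   /* placeholders */

static int hexval(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

int main(void)
{
    char line[600];
    while (fgets(line, sizeof line, stdin)) {
        unsigned char rec[300];
        int n = 0;
        if (line[0] != ':') { fputs(line, stdout); continue; }
        /* convert the hex record (length, address, type, data, checksum) to bytes */
        for (const char *p = line + 1; hexval(p[0]) >= 0 && hexval(p[1]) >= 0; p += 2)
            rec[n++] = (unsigned char)(hexval(p[0]) * 16 + hexval(p[1]));
        if (n >= 5 && rec[3] == 0x00) {            /* type 00 = data record */
            unsigned len = rec[0], addr = (rec[1] << 8) | rec[2];
            for (size_t i = 0; i < sizeof patches / sizeof patches[0]; i++)
                if (patches[i].addr >= addr && patches[i].addr < addr + len)
                    rec[4 + (patches[i].addr - addr)] = patches[i].val;
            unsigned sum = 0;
            for (int k = 0; k < n - 1; k++) sum += rec[k];
            rec[n - 1] = (unsigned char)(-sum);    /* two's-complement checksum */
        }
        fputc(':', stdout);
        for (int k = 0; k < n; k++) printf("%02X", rec[k]);
        fputc('\n', stdout);
    }
    return 0;
}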
> Guy Lemieux wrote: > > why not group the bits into threes (or larger) and use a ripple carry > > in the middle? one CLB can implement 2 bits of FAs using ripple > > carry to another CLB, which can implement the 3rd bit and a final > > carry out. this uses 2 CLBs to produce 3 sum bits and 1 carry-out. > > the final 17-bit hybrid/carry-save adder would use 11.5 CLBs. > > since you will have only one carry-save bit per 3 input bits, the > > design will probably route easier. > > > > Actually, your scheme doesn't reduce the number of operands, so I'm > not sure how to > use it. That is, from what I understand, both the inputs and the > outputs are two > numbers. Granted, one of your output numbers (the carries) has fewer > bits than the > input operands, so the output will route more easily. hmmm... i think what i was suggesting was really a pipelined carry, with the carry being propagated forward in time/space along with the data... sort of a lazy evaluation of the carry. this is not exactly a carry-save adder, it is more like a hybrid carry-save / ripple-carry adder. in my mind i saw it reducing roughly 2.5 operands down to 1.5 operands. i realize now that the area savings may not be as significant as i stated before, because there may be some up-front penalty (see the discussion below). also, my xilinx knowledge is weak (i use Altera mostly), so please forgive my mistakes in estimating CLB usage. i imagined using 2 CLBs as follows:
          ____
  a0 --  |    |
  b0 --  | FA |---> S0
  c0 --  |    |
          ----
            |
            |carry
          __v_
  a1 --  |    |
  b1 --  | FA |---> S1
     --  |    |
          ----
            |
   ---CLB_boundary---
            |carry
          __v_
  a2 --  |    |
  b2 --  | FA |---> S2
     --  |    |
          ----
            |
            |carry
          __v_
     --  |    |
     --  |    |---> C, weight(C)==weight(S2)+1
     --  |    |
          ----
the input numbers are a0,b0. the outputs (S0,S1,S2,C) can all be registered for pipelining. the C can be forwarded to the next stage in the pipeline and used as the c0 input of the next stage. this is the 'saved carry' and represents the .5 of an operand in my 2.5:1.5 reduction estimate above. my estimate was that 5 of the above CLB-pairs are needed to add 15 (of 17) bits, and the final 2 bits can be done in 1.5 CLBs (?). thus, the # of CLBs needed would be 11.5 * (#_of_operands - 1). we can probably round this to 12 * (#_of_operands - 1). how exactly are you using these adders? if they are being used as accumulators, then you can feed back the S outputs to the b inputs, and the carries can get wrapped accordingly too. you get immediate area savings (17 CLBs for the 3:2, vs 12 CLBs for this method). if you have a bunch of numbers to add, and you are trying to pipeline the addition without worrying about carry chain prop delay, then this structure is not as area efficient for small numbers of operands (break-even point is about 4). if you only need to add 3 operands, your 3:2 compressor (plus a final adder) will be smaller than two of these compressors (with the same final adder) (17 CLBs versus 2*12=24). but if you need to add four or more operands, this structure starts to become more area efficient. with four operands, your 3:2 compressors require 34 CLBs and my suggestion requires 3*12=36 CLBs. with five operands, it's 51 vs 48 CLBs. with eleven operands, it's 170 vs 120 CLBs. guy
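[Editor's note: one pipeline stage of the hybrid scheme above can be modeled behaviorally in a few lines of C: ripple inside the 3-bit group, with the group's carry-out registered and re-injected as the c0 of the next stage instead of rippling onward. The function below only illustrates that data flow; the names follow the diagram and everything around them is assumed.]

/* Behavioral sketch of one 3-bit group of the hybrid carry-save / ripple
   adder sketched above.  The framing around the function is assumed. */
#include <stdio.h>

/* a, b: 3-bit group operands; cin: saved carry from the previous stage.
   *s receives S2..S0, *cout receives C (weight == weight(S2)+1). */
void group3_add(unsigned a, unsigned b, unsigned cin,
                unsigned *s, unsigned *cout)
{
    unsigned t = (a & 7u) + (b & 7u) + (cin & 1u);  /* ripple inside the group */
    *s    = t & 7u;        /* sum bits, registered in a real pipeline          */
    *cout = (t >> 3) & 1u; /* saved carry, becomes c0 of the next stage        */
}

int main(void)
{
    unsigned s, c;
    group3_add(5u, 6u, 1u, &s, &c);     /* 5 + 6 + 1 = 12 -> s = 4, c = 1 */
    printf("s=%u c=%u\n", s, c);
    return 0;
}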