Floating point -- that's one worth trying to sort out...

Do you really need Floating Point?

I guess that's the question which always comes to mind first. In practice, I see floating point wielded in many situations where it isn't necessary, and I see people demanding it who don't really need it. Often I see people who think they need floating point simply because they want a decimal place. "Modern" languages like C, which lack support for non-integer fixed-point numbers, have a tendency to force you into that mindset.

My impression is that the main case for really needing floating point is when the dynamic range of the variables is very large and the magnitudes at any point in a program (or stored value) are unpredictable -- and, hence, must be explicitly represented along with the values. I believe the only other justification for floating point is "ease of programming" -- the programmer isn't burdened with thinking about the dynamic range or magnitudes of the intermediate values (though he does still need to think about the number of bits of significance kept in these intermediates). I don't want to under-rate this issue, but I do want to sort it out as separate from the cases where floating point is fundamentally required. [BTW -- I'd be glad to be educated about other uses and virtues of floating point in case I'm simply ignorant on the matter.]

Can FPGAs do floating point?

Of course -- see, for example, [Shirazi, Walters and Athanas, FCCM'95] (though I think I heard some concern about the stability of their detailed, short floating point implementation).

How well can FPGAs do Floating Point?

Admittedly, poorly. Unfortunately, there are no direct implementations to point at (I don't feel comfortable directly extrapolating from the ~16-bit floating point implementation mentioned above to something which is comparable with the more traditional 32- or 64-bit formats). From my review of multiply implementations (the multiplier takes up about half of the space in the few floating point implementations I've seen), I see a 200x space-time disadvantage for an FPGA implementation versus a hardwired implementation. I wouldn't be surprised to see the full floating point implementation being worse -- perhaps 500x.

Is this important?

That's a good one to discuss. How often is floating point *really* needed? -- that's a question for application and algorithm people to think hard about. I, for one, would like to hear people's thoughts. Are people so attached to the convenience of floating point that they can't think of doing things any other way (even if floating point isn't strictly necessary)? This will never be something we can quantify -- but it would be nice to get a feel for whether or not floating point (IEEE floating point with all of its little quirks and details) is a piece of standardization baggage we're going to be stuck with for all time.

Is Floating Point a reconfigurable killer?

Certainly not. If it is important for one of the reasons mentioned above, then it may be time to think about adapting reconfigurable architectures to deal with it better. ...and, of course, if it isn't, then it doesn't matter...

What can we do if floating point is so important?

* Range analysis and compile to fixed point -- I appreciate the desire to take the burden of keeping track of the location of the decimal point and value ranges away from the programmer. There are many cases where we can give that responsibility to the *compiler* rather than to the hardware.
Certainly, such a scheme could take care of all the cases where floating point was not needed for dealing with wide, unpredictable dynamic ranges, but was useful in easing the task of programming. I could see a compiler being told the basic ranges of source inputs at some point of a program and propagating range/significance analysis through the code to infer the ranges needed at each point in the program. In the process it could decide whether the decimal point could be fixed statically (and where it occurred in the number) or whether it needed to be handled dynamically. --> sounds like a good compiler/synthesis project for someone to look into.

* Do what processors do -- General-purpose processors/ALUs were (are) pretty bad at floating point themselves. That's why modern microprocessors (and older high-performance processors) include hardwired floating point units to handle floating point operations. Modern "general-purpose" processors dedicate about 15% of their die area to a floating-point unit. If this were important, one could do the same with reconfigurable devices -- add a few FPUs in the corner of the die. E.g. the Altera 10K now has banks of specialized memory blocks; one could replace those blocks with a specialized FPU if it were deemed that critical to all applications.

* Keep floating-point datapaths in mind while designing reconfigurable elements -- If you look at most FPGAs, you'll see that adder/subtractor/comparison datapaths were carefully considered during design. This leads to architectures which aren't purely "a bunch of homogeneous bit-level processing elements" and which deal with arithmetic moderately robustly. With similar care and attention to the requirements of floating point, I suspect one could architect more "floating point friendly" architectures. I'd start by looking at what some of the SIMD processors have done along this vein to improve their ability to handle floating point code (particularly, I'm thinking of the MasPar MP-2 and not the CM, which took the approach mentioned above of adding a specialized FPU).

Andre' DeHon
Reinventing Computing
MIT AI Lab <http://www.ai.mit.edu/projects/transit/rc_home_page.html>
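As a rough illustration of the range/significance propagation idea in the post above -- this is only a sketch in Python; the interval rules and the bit-width formula are assumptions for illustration, not anything DeHon specifies -- a compiler pass might carry an interval per value and derive a static fixed-point format from it:

    import math

    class Range:
        """Interval [lo, hi] tracked for a program value."""
        def __init__(self, lo, hi):
            self.lo, self.hi = lo, hi

        def __add__(self, other):
            return Range(self.lo + other.lo, self.hi + other.hi)

        def __mul__(self, other):
            p = [self.lo * other.lo, self.lo * other.hi,
                 self.hi * other.lo, self.hi * other.hi]
            return Range(min(p), max(p))

        def int_bits(self):
            # Bits needed left of the binary point (sign included)
            # to hold any value in the interval without overflow.
            m = max(abs(self.lo), abs(self.hi))
            return 1 + max(0, math.ceil(math.log2(m + 1)))

    # Source inputs with programmer-declared ranges...
    a = Range(-10.0, 10.0)
    b = Range(0.0, 200.0)
    c = a * b + a                       # ranges propagate through the expression
    print(c.lo, c.hi, c.int_bits())     # -2010.0 2010.0 12

If every interval stays bounded like this, the binary point can be fixed statically; an unbounded or data-dependent interval is where true floating point earns its keep.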
Article: 2876

>
>
> Floating point -- that's one worth trying to sort out...
>
> Do you really need Floating Point?
>

Because floating point is available, it is used. In other words, programs and algorithms exist that take full advantage of floating point range. Many real-world apps depend on the range of floats; it would take lifetimes to redo the codes and algorithms involved. I'd like to know what length word one would have to use in fixed point to recreate the full dynamic range of single precision floating point. More to the point, what computer system around today doesn't do floating point? How can we say "Reconfigurable Computing" if we can't do what other computing systems do? It is a tough one and we have to tackle it head on to get ANY respect from the rest of the computing world.

> How well can FPGAs do Floating Point?
>
> Admittedly, poorly.

Most floating point work I've seen uses VHDL, which is not efficient. While it may take some time, I'm going to be working on this soon, which is why I'm writing my little Xilinx assembly language.

> Is this important?
>

Absolutely!!!! I can not tell you how many times I've heard "can it do floating point?" If all I can say is "if you make it, it can", I don't get the sale. For me it is very important.

>
> Is Floating Point a reconfigurable killer?
>
> adapting reconfigurable architectures to deal with it better.
>

This is what has to be done. It is tough when one or two companies hold the patents that lock the field up. Reconfigurable computing is a small market because the right devices don't exist, and the right devices don't exist because it is a small market (ARRRRGGGGG!!!!).

> What can we do if floating point is so important?
>
> * Range analysis and compile to fixed point -- I appreciate the desire
> to take the burden of keeping track of the location of the decimal

I appreciate what you're saying here, but wouldn't it be nice to just have the right structures to do the right thing?

>
> * Do what processors do -- General-purpose processors/ALUs were (are)
> pretty bad at floating point, themselves. That's why modern
> microprocessors (and older high-performance processors) include hardwired
> floating point units to handle floating point operations.

This is an IMPORTANT thought. Look what they had to do in the 50's, then look at what they did in the early 70's, then look what they did in the 80's. "Those who do not know history are condemned to repeat it."

> * Keep Floating-Point datapaths in mind while designing reconfigurable
> elements -- If you look at most FPGAs, you'll see that
> adder/subtractor/comparison datapaths were carefully considered during

...and barrel shifters -- don't forget barrel shifters; in floating point it is the adds that really get you!

Steve Casselman
Virtual Computer
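For what it's worth, the word length Steve asks about can be estimated straight from the IEEE single-precision format parameters (this is just arithmetic on the format itself, not a statement about any particular implementation): finite values span bit positions from just under 2^128 down to the smallest denormal at 2^-149, so a fixed-point word covering the whole range at full precision needs roughly

    # IEEE-754 single precision: 8-bit exponent (bias 127), 24-bit significand
    top_bit    = 127        # most significant bit of the largest finite value
    bottom_bit = -149       # bit position of the smallest denormal, 2**-149
    width = (top_bit - bottom_bit + 1) + 1    # span of bit positions, plus a sign bit
    print(width)            # 278

i.e. on the order of 280 bits -- which is really the argument for the float format: it trades that enormous word for an 8-bit exponent field.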
Article: 2877

In the same vein as Steve and Andre have been saying...

15 years ago the Forth community struggled with the same issues about floating point. Chuck Moore always maintained that if you really knew your application and its numerical behavior, like any good programmer should, then FP was unnecessary baggage. He and other Forthers quite successfully developed numerical applications in fixed-point Forth, ranging when necessary, etc. That was in a time when FP was rare and expensive, there were many different formats, etc.

After not too many more years, the cost of FP dropped to the point where the issue started seeming silly, and everybody started using FP anyway. It's easier to just throw FP at the problem than do rigorous analysis, and after all programmers are at least as lazy as anyone else. I think computers are supposed to make problem solving easier, so I think that is a reasonable position to take.

I wasn't paying attention, but I believe the DSP community is now going through a similar process. FP DSPs are increasingly common, powerful and affordable.

I think FCCMs are in the early stage of a similar path. Now FP is terribly expensive, especially if implemented in LUTs and programmable interconnect. FP operators are so atomic and well understood and universal that they are bound to be hard-wired into programmable arrays at some point in the future. How to architect arrays to mix LUTs with other hard function units is an interesting problem, that Andre and his cohorts and others in the research community, and Altera and probably others in the commercial world, are usefully exploring.

The real point of FCCMs is to program the hardware that is *usefully* reprogrammed. The more cases we find where we can use hard-wired hardware as elements in our programmable soup, the better off we will be in speed, area and power. Arithmetic elements are good cases like that, and will get better and better as silicon shrinks further.

For now it's very important to find gate-efficient ways to get arithmetic done in the FPGAs we have. It's useful to learn from those who traveled this path before us in GP computing, and later in DSP, and use the same tricks. A good example is the resurrection of CORDIC arithmetic to replace multiplies with adds, reported by Chris Dick of La Trobe U in Melbourne at FPGA '96 ("Computing the Discrete Fourier Transform on FPGA Based Systolic Arrays", pp 129-135). He gets a 1000-point DFT on one XC4010 in 51.45 milliseconds, 5 ms on 10 XC4010s.

  --Mike
--
Mike Butts, Portland, Oregon    mbutts@netcom.com
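For readers who haven't met CORDIC: the trick Mike refers to replaces each rotation in the DFT with a short series of shift-and-add steps. A minimal floating-point model of rotation-mode CORDIC (Python here purely for illustration; a real FPGA version would use fixed-point registers, with the arctangent table in a small ROM, and would fold the gain correction into the input data):

    import math

    def cordic_rotate(angle, iterations=16):
        """Return (cos(angle), sin(angle)) for |angle| < ~1.74 rad,
        using only shifts, adds and a small arctangent table."""
        atans = [math.atan(2.0 ** -i) for i in range(iterations)]
        gain = 1.0                        # constant scale factor of the iteration
        for i in range(iterations):
            gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
        x, y, z = 1.0, 0.0, angle
        for i in range(iterations):
            d = 1.0 if z >= 0.0 else -1.0
            x, y, z = (x - d * y * 2.0 ** -i,     # shift-and-add/subtract
                       y + d * x * 2.0 ** -i,
                       z - d * atans[i])
        return x / gain, y / gain

    print(cordic_rotate(math.pi / 6))     # ~ (0.866, 0.5)

Each iteration is one add/subtract pair plus a shift, which is exactly the kind of operation LUT-and-carry-chain fabrics do cheaply; that is why Dick's DFT avoids explicit multipliers.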
Article: 2878

In article <4fqs7k$2tn@hacgate2.hac.com>, Lance Gin <c43lyg@dso.hac.com> wrote:
> >Just for the record, we're using the A3F release and XACT 5.1.1 on
> >Sun Solaris 2.5. Generally, once bashed into shape by my mate vi ;-)
> >the system works well.
>

I was told by Xilinx they are not yet ported to Solaris, and yet you casually tell us you are working on Solaris 2.5!!! Amazing. How stable is it? Share with us, Lance.

thx
Zeev
--
Zeev Yelin, Cadence Design Systems (Israel)   "no such thing as a Finish Line"
My personal opinions, use it or lose it, always at your own risk.
Article: 2879

> --------------------------------------------------------------------
>
> Re: Xilinx FPGA's with Mentor Tools?
>
> From: vanbeek@students.uiuc.edu (Christopher VanBeek)
> Date: 1996/02/12
>
> MessageID: 4fo8ae$j4h@vixen.cso.uiuc.edu#1/1
> --------------------------------------------------------------------
>
> distribution: inet
> references: <4f3l48$c6o@hacgate2.hac.com> <4fnu3l$eo3@gcsin3.geccs.gecm.com>
> organization: University of Illinois at Urbana
> newsgroups: comp.arch.fpga,comp.cad.synthesis,comp.lang.vhdl,comp.lang.verilog,comp.lsi.cad,comp.sys.mentor
>
> Hi,
>
> I'm a college student at the University of Illinois and am just
> learning to use VHDL. I have compiled and synthesized some simple
> designs using Mentor's tools and Xilinx libraries. Here was
> the design flow I used:
>
> First, I had to create a work directory using Mentor's "qvlib"
> program. Then compile the VHDL using "qvcom". Simulation can be
> performed using "qvsim", or I think QuickSim (I have not tried
> QuickSim). I found qvsim to be faster than QuickSim ever was,
> and it allows step tracing and breakpoints in the source as well
> as waveforms. To synthesize, I compiled using "qvcom" with the
> "-synthesis" option. This executes Mentor's System-1076 Compiler
> after compiling the VHDL. That creates a symbol in the work
> directory and the Autologic viewport for the design. Then I ran
> Mentor's Autologic program, read in the viewport, and set the
> destination technology. The Autologic libraries for the Xilinx
> FPGAs are on supportnet.mentorg.com in the
> /pub/mentortech/tdd/libraries/fpga directory. Then I set a couple
> of constraints and hit the synthesis button. It created a bunch of
> sheets with Xilinx gates and flip flops on them.
>
> Christopher Van Beek
>
> --------------------------------------------------------------------

chris, we haven't gotten into the details yet, but our flow will need to support an XC4025E device in a pga-299 pkg using mixed schematic/VHDL entry as follows:

DA -> sys1076(ugh!)/quickHDL/qsim2 -> autologic2 -> XACT

we'll also be using model tech's v-system VHDL simulator on PC for blocks. our local mentor FAE thinks this flow will work (with some caveats) and has given me a xilinx doc which describes autologic2 synthesis guidelines. he also made sure the latest autologic library for XC4KE was installed at the supportnet ftp site.

i've got a URL for some xilinx/mentor tutorials which you might be interested in:

http://www.mentorug.org/sigs/univ_sig/index.html

you might also like to subscribe to a few of the mentor e-mail exploders like asic_fpga, falcon, or univ_sig where i've got a more up-to-date version of this thread going. call mentor tech support at (800) 547-4303 for details.

thanks for your flow info chris. when the time comes, i'll compare our flow with yours, and possibly contact you again with comments.

regards,
--
____________________________________________________________________________
Lance Gin                                "off the keyboard
Delco Systems - GM Hughes Electronics     over the bridge,
OFC: 805.961.7737  FAX: 805.961.7329      through the gateway,
C43LYG@dso.hac.com                        nothing but NET!"
____________________________________________________________________________
Article: 2880

Andre' DeHon wrote:
>
> Floating point ...
>

I suspect the most efficient way to do floating point multiply-accumulates with an FPGA is to connect it to an i860 or a C40 and then throw away the FPGA. If, however, one needs floating point transcendental functions, divides, atan2 or other complex functions, they can run more efficiently on an FPGA by using the CORDIC algorithm. This produces 1 bit per clock and can be pipelined to create a complete result every clock. I'd like to see a DSP that could do 50 million atan()s per second.

Another reason to do floating point in an FPGA might simply be because you have an FPGA available, not an FPU.

I believe the reason that FPUs have become so common is first that they are affordable, and second because using floats eliminates a lot of bugs. I certainly use floats in C. To properly do math with ints requires a lot of expertise in a field that very few are expert in, which is precision analysis. It must be noted, however, that real-world problems can usually be implemented without floats. The economic advantage of a 10-100x performance gain might impel one to implement a solution as integer math and to do the precision analysis.

It seems to me that the value of FPGAs is not in emulating IEEE floating point operations, but in solving real-world problems, which they tend to do very well. The precision analysis issue is one that won't go away and will probably only be solved when high-level tools are used to transform floating point algorithms into FPGA configurations.

By the way, the carry chain in a 4K FPGA can be used to implement the find-first-one function. When the FPGA manufacturers start to support wide fan-in muxes, the barrel shift might be a little less painful. Also I suspect that floats using bit-serial math are relatively efficient. Has anyone checked this out?

Brad Taylor
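The "find first one" Brad mentions is the normalization step of a floating-point add: locate the leading 1 of the result, then shift it up to the top of the word. Behaviourally it is just a priority encoder feeding a shifter, sketched below in Python; the carry-chain and wide-mux tricks are about how cheaply this and the companion barrel shift map onto the fabric, which the sketch says nothing about.

    def find_first_one(x, width=32):
        """Index of the most significant set bit, counted from the top.
        Returns width if x is zero."""
        for i in range(width):
            if x & (1 << (width - 1 - i)):
                return i
        return width

    def normalize(mantissa, exponent, width=32):
        """Shift the leading 1 up to the top bit, adjusting the exponent."""
        shift = find_first_one(mantissa, width)
        if shift == width:
            return 0, 0                       # result underflows to zero
        return ((mantissa << shift) & ((1 << width) - 1), exponent - shift)

    print(normalize(0x00001234, 100))         # leading 1 moved to bit 31, exponent 81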
Article: 2881

In article <1996Feb21.190049.3248@super.org> sc@vcc.com (Steve Casselman) writes:

   I think the Java thread is about reconfigurable computing in that Java
   talks about virtual machines and these could be implemented as FPGA
   based hardware objects. Below is a little Sun blurb about Java
   processors coming out. I read that the picoJava will be 2mm square --
   this would fit in a 10,000 gate FPGA -- and the microJava might fit in
   a 40,000 gate FPGA. A good reconfigurable computing project would be a
   Java to FPGA compiler that could take the Java language and decide
   what could go into hardware and what would be run in software. Then of
   course design a machine that would speed up such a program :)

Java has a 32-bit datapath, floating point, stacks (probably requiring stack->register mapping or stack caching), a reasonably complicated calling convention, user-level exception handlers, and a requirement for garbage collection. I can't imagine how this beast could ever fit in a 40k FPGA, let alone a 10k job. The execution path is fairly simple, so you won't get a speed boost accelerating a specific set of virtual machine instructions. I guess you could shoot to compile the high-level code to hardware, but for that task Java is perhaps the *worst* high-level language you could choose, given the heavy weight of its calls and memory model (read: requisite reliance on objects for anything complex).

Java is kind of interesting, and it *appears* that Gosling was trying to target dynamic compilation when he defined the virtual machine. However, it's a long way from anything that I can imagine ever being implemented in reconfigurable logic, and I personally consider myself a believer in that concept.

There are a lot of people who think that Sun is just blowing smoke with their content-free announcement. If you read it closely, they don't even claim that the processor will actually execute virtual machine instructions directly, leaving open the possibility of something simple like a SPARC core with dynamic compilation (hah! now I'm calling dynamic compilation simple...). However, a recent posting from a Sun architect strongly suggests that in fact they do intend to use this approach, even if the news release doesn't claim it.

Bill
Article: 2882

I do agree totally; in software most people use floating point thoughtlessly. Once I saw a linker which needed FP just to show what percentage of memory was used.

I guess that most of the widespread programming languages are unfit for automatic data-range compilation. Languages like Haskell, with a very clever type system, might fill the gap. Even VHDL would not free the engineer from explicitly controlling precision.

Another option might be the use of fractions. In good old D. E. Knuth's books there is a discussion about that. It would only require a fast gcd-unit, which should be doable in hardware with moderate effort.

Just my 2%
A.
-----------------------------------------------------------------
Andreas Doering
Medizinische Universitaet zu Luebeck
Institut fuer Technische Informatik
Germany
-----------------------------------------------------------------
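The fraction idea amounts to exact rational arithmetic: keep every value as a numerator/denominator pair and reduce after each operation with Euclid's algorithm -- that reduction step is what the "fast gcd-unit" would do in hardware. A toy version (Python, illustrative only; the tuple representation is just for the sketch):

    from math import gcd

    def frac_add(a, b):
        """(n1/d1) + (n2/d2), reduced to lowest terms."""
        (n1, d1), (n2, d2) = a, b
        n, d = n1 * d2 + n2 * d1, d1 * d2
        g = gcd(n, d)
        return n // g, d // g

    def frac_mul(a, b):
        (n1, d1), (n2, d2) = a, b
        n, d = n1 * n2, d1 * d2
        g = gcd(n, d)
        return n // g, d // g

    print(frac_add((1, 3), (1, 6)))     # (1, 2)

The catch, as Knuth discusses, is that numerators and denominators can grow without bound, so the word-length problem doesn't vanish -- it just moves.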
Article: 2883

Peter Alfke (peter@xilinx.com) wrote:

: In article <Dn3uCy.25v@icon.rose.hp.com>, tak@core.rose.hp.com (Tom
: Keaveny) wrote:

: > Note, that there are a number of "synchronous" bus spec's that mandate
: > a non-zero hold time.

: Please let me suggest a more careful nomenclature:
: "Hold time" is not an OUTPUT characteristic,
: "Hold time" is always an INPUT requirement. A positive hold time means
: that input data is required to be "held" valid until after the active
: clock edge.
: "Propagation delay" or "clock-to-out" is the relevant OUTPUT specification.

: If two devices are directly interconnected, and share a common clock
: without any skew, then a positive hold-time requirement at the input can
: only be satisfied by a guaranteed minimum clock-to-out delay on the output
: that drives that input.

: That's why a positive hold-time requirement on a data input is so bad, and
: that's why Xilinx has added internal delay to increase the data set-up
: time so much that the pin-to-pin hold-time requirement on all inputs is
: never positive.

: As a result, you can take data away simultaneously with the clock, and
: still be sure that the old data is being clocked in.

: A minimum clock-to-out delay specification is then not needed. The actual,
: physically unavoidable shortest output delay acts as additional
: protection against clock-skew problems.

: Peter Alfke, Xilinx Applications.

Come on Peter; surely you don't mean to suggest that nobody design a XILINX part to interface to something besides other XILINX parts. There are certain facts of life which require consideration of at least some system clock skew.

If a bus spec REQUIRES that data not change for 3ns (for example) after the clock, I would call that a hold time. Why, the bus spec may even refer to it as t(DH)! So, how would you suggest one design an interface to such a bus using a XILINX device?

If you would suggest that XILINX not be considered for these designs, let us know. At least we won't waste time evaluating them.

Mike Budwey
budwey@sequoia.com
Article: 2884

In many scientific problems, floating point is required. The dynamic range is large, and the precision is relatively small. But outside of scientific work, I don't believe this.

In text processing, I believe that floating point should mostly not be used. In Knuth's TeX and Metafont, floating point is never used for anything that will affect the output. It is used for things in the log file, though.

PostScript uses floating point, and I know of a number of cases where the results are wrong because of it. Even a 180 degree rotation can't be done in PostScript, at least not with the rotate operator. Rounding in the sin/cos means that the zero terms in the rotation matrix are not zero. In high resolution, this is easily visible. Metafont does fixed-point sin, cos, sqrt, etc. just to do this right.

Oh well, just a chance to say something that I don't see said very often.

-- glen
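Glen's rotation example is easy to reproduce. Even in double precision the off-diagonal terms of a "180 degree" rotation matrix come out as tiny non-zero values rather than zero (the snippet below shows what a typical IEEE-double sin/cos returns; exact digits can vary slightly by math library):

    import math

    theta = math.pi                  # "180 degrees"
    c, s = math.cos(theta), math.sin(theta)
    print(c)                         # -1.0
    print(s)                         # ~1.2246e-16, not 0.0
    # The rotation matrix [[c, -s], [s, c]] therefore nudges points
    # slightly off-axis.  PostScript reals are typically only single
    # precision, so its rotate operator sees a much larger error of
    # the same kind -- the effect glen describes.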
Article: 2885

I would like to hear from people who have synthesized PCI models to FPGAs.

========================================================================
David Emrich                           Exemplar Logic, Inc.
emrich@exemplar.com                    815 Atlantic Ave., Suite 105
                                       Alameda, CA 94501-2274 USA
========================================================================
Article: 2886

mbutts@netcom.com (Mike Butts) writes:

>For now it's very important to find gate-efficient ways to
>get arithmetic done in the FPGAs we have. It's useful to
>learn from those who traveled this path before us in
>GP computing, and later in DSP, and use the same tricks.
> --Mike
>--
>Mike Butts, Portland, Oregon   mbutts@netcom.com

This is what we also think, so we have implemented FP addition and multiplication on the Altera FLEX 8000 using the IEEE single precision format. Various methods were investigated in order to get the best combination of time and space. We finally used a pipelined design for the adder and a digit-serial design for the multiplier.

The adder takes about 50% of the chip and it has a peak rate of about 7 MFlops. The multiplier takes 344 logic cells (34% of the chip) and it is clocked at 15.9 MHz. Since we are using a digit size of 4, 12 clock cycles are needed for the complete result to be available. This translates to a rate of 1.3 MFlops.

If you would like to get more information on these designs please send email to louca@ece.rutgers.edu

Loucas
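"Digit-serial" means the multiplier consumes one 4-bit digit of an operand per clock instead of the whole word at once, trading cycles for area. The underlying arithmetic is just the following (a behavioural Python sketch of generic digit-serial multiplication; the Rutgers design's actual pipelining, exponent handling and rounding are not shown and the cycle count differs):

    def digit_serial_mult(a, b, width=24, digit=4):
        """Multiply two 'width'-bit significands, consuming 'digit' bits
        of b per step; each step is one small multiply-accumulate."""
        acc = 0
        for i in range(0, width, digit):
            d = (b >> i) & ((1 << digit) - 1)
            acc += (a * d) << i            # one clock in the hardware analogy
        return acc

    a, b = 0xC90FDB, 0xA2F983              # two 24-bit significands
    assert digit_serial_mult(a, b) == a * b

With a 24-bit significand and 4-bit digits that is six accumulation steps; the 12 cycles quoted above presumably cover the rest of the FP multiply (exponent add, normalization, rounding) as well.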
Article: 2887

Brad Taylor <blt@emf.net> writes:

>Also I suspect that floats using bit-serial math are relatively efficient.
>Has anyone checked this out?
>Brad Taylor

We did design a 32-bit IEEE single precision FP multiplier using digit-serial arithmetic on the Altera FLEX 8000. For details on area requirements and speed see my previous posting. For more info email louca@ece.rutgers.edu.

Loucas
Article: 2888

Is there anyone in the Austin, TX area who has had experience developing for the Xilinx 7336 EPLD device? What are good tools for Boolean equation entry AND functional simulation? Any help would be appreciated.

Regards,
William A. Gordon, Jr.
dsp@io.com
Article: 2889

In article <mbuttsDn6yKC.FoB@netcom.com>, Mike Butts <mbutts@netcom.com> wrote:
>I wasn't paying attention, but I believe the DSP community is
>now going through a similar process. FP DSPs are increasingly
>common, powerful and affordable.

But DSPs have a powerful motivation for staying as silicon-efficient as possible -- the descent into low-end, high-volume embedded application domains. A fixed-point multiplier will always take less die area than a floating-point multiplier, and that difference in area can be the difference between profit and loss when selling DSPs below the $1 price point. And this is the price point that will need to be breached for voice I/O to become truly ubiquitous, to choose just one example...

--
-------------------------------------------------------------------------------
John Lazzaro               My Home Page: http://http.cs.berkeley.edu/~lazzaro
lazzaro@cs.berkeley.edu    Chipmunk CAD: http://www.pcmp.caltech.edu/chipmunk/
-------------------------------------------------------------------------------
Article: 2890

Numbers are great!

> This is what we also think, so we have implemented FP addition
> and multiplication on the Altera FLEX 8000 using the IEEE
> single precision format. Various methods were investigated
> in order to get the best combination of time-space. We
> finally used a pipeline design for the adder and a
> digit-serial design for the multiplier.
>
> The adder takes about 50% of the chip and it has a peak rate
> of about 7 MFlops. The multiplier takes 344 logic cells
> (34% of the chip) and it is clocked at 15.9 MHz. Since we
> are using a digit-size of 4, 12 clock cycles are needed for
> the complete result to be available. This translates to a
> rate of 1.3 MFlops.
>
> If you would like to get more information on these designs
> please send email to louca@ece.rutgers.edu

Let's do some back-of-the-envelope calculations of FP capacity based on these numbers.

FPGA
------------------------------------
I'm going to start by assuming that Altera LEs are about the same size as Xilinx 4-LUTs (half a CLB in the 3/4K family) -- I think this is about ballpark right, but it could be off by 20% one way or the other.

Assume: 1 LE ~= 600K lambda^2 (lambda = 1/2 minimum feature size, a technology normalizer)

You must be using an 81188 -- 1008 LEs -- since you say 344 logic cells is 34% of the chip. I'll call 50% of the chip 500 cells to keep things easy.

On addition this gives a throughput of 1 FP add per:
  500 * 600K lambda^2 * 63ns
or 0.053 FP adds / lambda^2*s

On multiply, a throughput of 1 FP-mpy per:
  344 * 600K lambda^2 * 63ns * 12
or 0.0064 FP-mpy / lambda^2*s

Custom
------------------------------------
I happen to have a custom 32-bit FPU paper in my files (one data point -- not enough to necessarily know whether or not this is a particularly good or bad custom implementation). Matsushita has a 32-bit CMOS FP-mpy, JSSC vol 19, #5, p.697ff.
  5.75x5.67mm die in lambda=1um --> 32.5M lambda^2 area
  78.7ns multiply
  --> 0.39 FP-mpy/lambda^2*s   [0.39 / 0.0064 ~= 61x]

It fares better than I had predicted in my first message. (At some point it would be worthwhile to look a bit more broadly to see if this custom implementation is typical.) Also note, a composite FP unit which does more than just FP-mpy would be less dense when just considering the FP-mpy operation.

DSP/proc extrapolate
------------------------------------
Now, if you buy a processor/DSP to do FP-mpys for you, the FP multiplier usually consumes about 10-20% of the area on the die. So, the ratio for a DSP etc. which integrates such a multiplier might be more like 1/5th this (12x) rather than ~60x.

Summary
------------------------------------
  0.0064 FP-mpy/lambda^2*s   FPGA
  0.08   FP-mpy/lambda^2*s   DSP estimate  [12x FPGA density]
  0.39   FP-mpy/lambda^2*s   custom        [60x FPGA density]

To round this out, does anyone have good references/numbers on the number of ALU cycles a "typical" (or some particular) processor w/out FP hardware requires to implement a 32-bit FP-add and FP-mpy? (My intuition is that the processor sans FPU will have a lower computational density than the FPGA.)

Andre' DeHon                               andre@mit.edu
Reinventing Computing                      MIT AI Lab
<http://www.ai.mit.edu/projects/transit/rc_home_page.html>
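The arithmetic above is easy to re-run. The few lines below (Python, using exactly the figures and assumptions stated in the post, including the 600K lambda^2-per-LE guess) reproduce the summary numbers:

    LE_AREA = 600e3                                  # lambda^2 per LE (assumption above)

    fpga_add = 1 / (500 * LE_AREA * 63e-9)           # ~0.053 FP adds / (lambda^2 * s)
    fpga_mpy = 1 / (344 * LE_AREA * 63e-9 * 12)      # ~0.0064 FP mpys / (lambda^2 * s)

    custom_area = 5.75e3 * 5.67e3                    # die in um^2 = lambda^2 at lambda = 1 um
    custom_mpy = 1 / (custom_area * 78.7e-9)         # ~0.39 FP mpys / (lambda^2 * s)

    print(fpga_add, fpga_mpy, custom_mpy)
    print(custom_mpy / fpga_mpy)                     # ~61x density gap on multiply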
Article: 2891

In article <31294A97.31FC@microplex.com>,
Fred Fierling <fff@microplex.com> wrote:
>No doubt this would make a FAQ if this newsgroup had one:
>
>I'm trying to decide between Verilog and VHDL for FPGA design now and
>possible ASIC design in the future. Does anyone know of any articles
>or reports that compare the two? Preferably something less biased than
>an article by a company with an interest in either...
>

Yes. Try to find a pointer to John Colley's ESNUG entry:
/www.chronologic.com/misc/reviews/verilog.review.html

Alex Koegel
DSPC Israel
Article: 2892

I just completed a project targeting AT&T Orca 2C parts. The timing required for PCI is VERY tight, and FPGAs are still relatively slow, especially for wire delays.

The real trick to PCI is not the PCI side of the world, but in how you interface to the local bus side. We had more problems there than on the PCI side of the world. I have to commend the PCI spec writers for creating a relatively easy-to-read specification that doesn't leave too many things to interpretation.

I tried to have a fully synthesizable core but could not meet the timing. Not even close, really -- although much of the blame lies on the shoulders of the AT&T place and route tool. If I had more control over where certain functions were to be placed, the timing probably could have made it. Instead, I had to create several "hardmacros", which were still synthesized to create them, but the hierarchy gave me something to "latch onto" and I could tell the P&R tool where to put these macros. Even with all of this we are still just barely making 25 MHz.

Fortunately the FPGA is really just a prototype for an ASIC and we will definitely meet the timing there for the full 33 MHz. Since this is just a prototype, we determined it was not worthwhile spending a lot of time with manual place and route and additional hardmacros to achieve the full 33 MHz in the FPGA. Another option is that there are faster speed Orca parts coming out by the end of the year which should be fast enough to meet the PCI timing.

The core we developed was both a Target and an Initiator. A Target-only core should be pretty easy to implement in an FPGA even at 33 MHz. It's the Initiator that is the real tough part.

We won't be offering the PCI core as a "standard" core. Instead it will be built into other cores. I find it very difficult to believe that there can be such a thing as a "standard" PCI core. PCI by its very nature is a configurable core. The local bus side will always have its own, unique requirements. Thus, each implementation of a PCI core is really a custom development.

--
Eric Ryherd               eric@vautomation.com
VAutomation Inc.          Synthesizable HDL Cores
20 Trafalgar Square       http://www.vautomation.com
Suite 443
Nashua NH 03063           (603) 882-2282  FAX: 882-1587
Article: 2893

Edward Leventhal <ed.leventhal@omitron.gsfc.nasa.gov> wrote:
>Hello,
>
> Could someone please tell me the status of the Xilinx 8100
>"Sea Of Gates" FPGAs? Are these parts available? Will the current
>XACT software be used for these parts or is there another software
>package which must be used - If so, can XACT be "upgraded" ??
>
> I have read that these FPGAs yield excellent routing when
>used with logic synthesis (e.g. VHDL), and I am interested in any
>feedback / information.

I was a Beta site for these parts. THEY'RE GREAT! The design flow is very much like an ASIC: HDL -> Synthesis -> EDIF -> P&R. The gates you get are also very much like an ASIC -- various flavors of AND/OR and DFFs and stuff, none of these crazy LUTs into DFFs. The routing is excellent, with lots of available tracks. You still won't get 100% utilization, depending on your design, but 90+% wasn't a problem for me.

The best part is that you don't have a fixed number of random-logic and sequential-logic elements. Each CLC can be configured as either random logic or as 1/2 of a DFF. Our designs are generally random-logic intensive, which makes them terrible in regular LUT-based FPGAs (which have tons of DFFs and never enough LUTs).

The parts are available, but the package choices are limited. PQ84s are easy to get and should be on virtually any distributor's shelves.

The biggest bummer is of course that the P&R SW is not in the "standard" XACT release. You need XACT8000. You'll have to check with Xilinx on the cost. On the other hand, XACT8000 is about 8000X better than plain old XACT...

--
Eric Ryherd               eric@vautomation.com
VAutomation Inc.          Synthesizable HDL Cores
20 Trafalgar Square       http://www.vautomation.com
Suite 443
Nashua NH 03063           (603) 882-2282  FAX: 882-1587
Article: 2894

Alex Koegel wrote:
> In article <31294A97.31FC@microplex.com>,
> Fred Fierling <fff@microplex.com> wrote:
> >I'm trying to decide between Verilog and VHDL for FPGA design now and
> >possible ASIC design in the future. Does anyone know of any articles
> Yes. Try to find a pointer to John Colley's ESNUG entry:
> /www.chronologic.com/misc/reviews/verilog.review.html

After reading this article, it occurs to me that I've been naive. The question isn't which one is technically best, the question is which one will win the marketing war?

--
Fred Fierling   fff@microplex.com        Tel: +1 604 444-4232
Microplex Systems Ltd   http://microplex.com/   Tel: +1 800 665-7798
8525 Commerce Court                      Fax: +1 604 444-4239
Burnaby, BC  V5A 4N3
Article: 2895

Edward Leventhal <ed.leventhal@omitron.gsfc.nasa.gov> wrote:
>Hello,
>
> Could someone please tell me the status of the Xilinx 8100
>"Sea Of Gates" FPGAs? Are these parts available? Will the current
>XACT software be used for these parts or is there another software
>package which must be used - If so, can XACT be "upgraded" ??
>

The XC8100 FPGAs are now in production. They use an addition to the XACT software called XACT8000. Please contact your local Xilinx representative for pricing information.

Thank you for your interest in Xilinx programmable logic.

--
=====================================================================
  _  /
 /\/    Steven K. Knapp                  E-mail: stevek@xilinx.com
 \ \    Corporate Applications Mgr.      Tel:    1-408-879-5172
 / /    Xilinx, Inc.                     Fax:    1-408-879-4442
 \_\/\  2100 Logic Drive                 Web:    http://www.xilinx.com
        San Jose, CA 95124
=====================================================================
Article: 2896

We have recently purchased AT17C128 Serial PROMs to replace Xilinx configuration PROMs, but do not have any apparent means to program these new devices. I have downloaded Atmel's CONFIGURATOR application note describing the programming spec, which now prompts me to ask the question: "Is there not a piece of programming software already out there somewhere that'll save me some time?"

I imagine the serial bus protocol could be implemented relatively simply using a couple of pins on a PC's parallel port interface... Is there such a program at an FTP site somewhere?

Thanks in advance.

Regards
PETER FENN
****************************************************
*                ____                              *
* E L E C T R O |    \       ELECTROSOLV cc        *
* ==============|     \                            *
*               |      )=======  ELECTRONICS       *
* ==============|     /  S O L V  & SOFTWARE       *
*               |___ /           DESIGN GROUP      *
*                                                  *
****************************************************
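A bit-banged programmer along those lines is mostly a matter of wiggling two parallel-port pins. The skeleton below is only a sketch in Python: the pin assignments, the write_port routine and, above all, the actual AT17C128 command sequence are placeholders that must be filled in from Atmel's CONFIGURATOR application note, not taken from here.

    DATA_BIT, CLOCK_BIT = 0x01, 0x02     # hypothetical wiring to port data bits

    def write_port(value):
        """Placeholder: write one byte to the PC parallel-port data register
        (via an I/O-port access library or a small DOS helper of your choice)."""
        raise NotImplementedError

    def send_bit(bit):
        level = DATA_BIT if bit else 0
        write_port(level)                # set data with clock low
        write_port(level | CLOCK_BIT)    # rising clock edge latches the bit
        write_port(level)                # return clock low

    def send_byte(value, msb_first=True):
        order = range(7, -1, -1) if msb_first else range(8)
        for i in order:
            send_bit((value >> i) & 1)

    # The device-specific part -- addressing, page writes, programming
    # delays, verify reads -- comes straight from the Atmel app note.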
Article: 2897

In article <peter-2102961234010001@appsmac-1.xilinx.com>, peter@xilinx.com (Peter Alfke) writes:

[snip]

> If two devices are directly interconnected, and share a common clock
> without any skew, then a positive hold-time requirement at the input can
> only be satisfied by a guaranteed minimum clock-to-out delay on the output
> that drives that input.
>
> That's why a positive hold-time requirement on a data input is so bad, and
> that's why Xilinx has added internal delay to increase the data set-up
> time so much that the pin-to-pin hold-time requirement on all inputs is
> never positive.
>
> As a result, you can take data away simultaneously with the clock, and
> still be sure that the old data is being clocked in.
> A minimum clock-to-out delay specification is then not needed. The actual,
> physically unavoidable shortest output delay acts as additional
> protection against clock-skew problems.

I've worked on several 3000/3100 designs. I like your chips (or I'd use something else) and I think I understand your reasoning for the no-min-delay philosophy. But it sure makes my job difficult. Let me put that another way: any official help in that area would increase my productivity and make your chips more attractive/valuable.

Here is my view on (part of) the design process... I would like to be able to convince myself that a board/system I am about to build will meet all the timing requirements before I push the button to make the PCB. I'm willing to do a lot of work to do that -- for example, correcting for trace lengths and taking advantage of your 70% across-chip rule.

If you are hard-nosed about no-min-delay, then I have a lot of trouble proving that a design will work. Here are a few examples:

Suppose I want to clock data from one 3100 to another. The specs say 0 hold time but no min output delay. So I have to have 0 clock skew. That's unrealistic. But how much skew can I get away with? Should I work hard making it small or put the effort into something else?

Suppose I have 0 clock skew between the chips. When I look at the fine print in the data book that corresponds to your description above, it only applies if I'm using the CMOS clock input. What do I do if I'm using the TTL input? [A lot of modern CMOS chips have output pads that drive TTL levels.]

PCI has a min clock-to-output spec of 2 ns. How can I convince myself that a sensible design will work? Will it still work after the next few speed upgrades? [I assume it will work because Xilinx is pushing PCI. You might have added some extra delay in your design, but I doubt it because that makes other things harder and they are already hard enough.]
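The budget the poster is trying to close is the standard min-delay inequality: the earliest the driving output can change after its clock edge must still be later than the receiver's hold requirement plus whatever clock skew exists between the two parts. A two-line check (Python; only the 2 ns PCI minimum clock-to-out comes from the spec quoted above -- the other numbers are made-up examples):

    t_co_min = 2.0    # ns, guaranteed minimum clock-to-out of the driver (PCI spec)
    t_hold   = 0.0    # ns, hold requirement at the receiving input (example)
    t_skew   = 1.5    # ns, clock reaches the receiver this much later (example)

    slack = t_co_min + 0.0 - (t_hold + t_skew)   # board min prop delay taken as 0
    print("hold slack = %.1f ns" % slack)        # positive means no hold violation

With no guaranteed minimum clock-to-out at all (t_co_min effectively 0), the inequality can only be closed with zero skew -- which is exactly the poster's complaint.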
Article: 2898

HI ALL!

Excuse me if you have just discussed this topic -- I'm new in this group!

A friend of mine asked me to find some information about FPGAs and testing or testability of them (I don't know much about it). I have found some WWW sites but I still can't get to the FPGA and testability information. Can you help me? If you do, send me an answer to my e-mail (I don't really read this group very often).

BTW: How many participants of this group are from POLAND???

--
Marcin Piaskowski       http://kpbm.pb.bielsko.pl/~zwirek
Sorry, only in Polish, so far ;-)
mailto:zwirek@kpbm.pb.bielsko.pl
Zw
Article: 2899

So now we see we are only 60x away from floating point dominance of the world (not bad when most companies were not even trying :). We have to think about what languages to use to program. VHDL and Verilog are out, since they are only 2D (you can specify timing but you end up with a time-wise flat design) and cannot handle things like MIT's multicontext FPGA -- which is where we have to go, IMHO.

High-level languages like C and Fortran are naturally "time aware." By that I mean that if you look at the ALU as a reconfigurable unit (it reconfigures in one clock), HLLs describe an algorithm's execution over time. On the other hand, most HLLs cannot describe the fine-grained parallelism that FPGAs can execute. It would be good if I could take existing programs and have the compiler understand that I have this great resource available to me. The hangup now is that there are not many compiler writers out there thinking about this problem.

Something reconfigurable to think about (instead of how to port my FPGA design to ASICs or whether my antifuse part routes better :)

Steve Casselman
Virtual Computer Corporation