In article <3ca509$gd5@atlantic.merl.com>, Doug Hahn <hahn@ca.merl.com> wrote:
>I was wondering if anyone has any experience driving a PCI
>bus from an XC4000 device. PCI has specific I/V characteristics
>which need to be met. Does anybody have any experience meeting
>these criteria (what output driver configuration is needed?).

I haven't tried it, but the I/V curve for the 3K and 2K devices is just in the accepted area for PCI. I assume 4K is too. The only problem I can think of is that having a large number of pins driving a high-capacitance bus all on the same clock edge probably exceeds the power capability of the chip. I think there's a capacitance per power pin limit in the 4K specs somewhere, so be sure to check it.

I'd like to hear if you have any success with this.
--
/* jhallen@world.std.com (192.74.137.5) */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
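A quick sanity check on the power concern above is the usual P = N * C * V^2 * f switching-power estimate. The short C program below is an illustration added here, not part of the thread; the pin count and the 50 pF per-pin bus load are made-up values, so substitute numbers from your own load model and the Xilinx data sheet limits.

  /* Back-of-the-envelope estimate (illustration only, not from the post):
   * dynamic power drawn by N outputs switching a capacitive bus,
   * P = N * C * Vcc^2 * f.  Load capacitance and pin count are assumed. */
  #include <stdio.h>

  int main(void)
  {
      int    n_pins = 32;       /* e.g. AD[31:0] driven on one clock edge */
      double c_load = 50e-12;   /* assumed bus capacitance per pin, F     */
      double vcc    = 5.0;      /* supply voltage, V                      */
      double f_clk  = 33e6;     /* PCI clock, Hz                          */

      double power = n_pins * c_load * vcc * vcc * f_clk;
      printf("worst-case switching power: %.2f W\n", power);
      return 0;
  }

With these assumed numbers the 32 address/data lines alone dissipate on the order of a watt in the output drivers, which is why the per-power-pin capacitance limit is worth checking.
Article: 501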
(Hope the crosspost to comp.arch.fpga is OK, the topic is amateur processor implementations using FPGAs.)

In <3c6is4$d7k@gordon.enea.se> pefo@enea.se (Per Fogelstrom) writes:
>
>PDP11 Hacker ..... (ard@siva.bris.ac.uk) wrote:
>
>: My main interest is in designing a CPU _from scratch_. OK, I know I'll get
>: poor performance from all those FPGAs wire-wrapped together (all that
>: capacitive loading for one thing), but with a good underlying design it should
>: be useable (heck.. the PERQ 1a had a CPU built from 250 TTL chips, PALs and
>: PROMs, clocked at 5MHz, and still beats this 386DX33 for graphics performance
>: :-)). And there's the joy when a prompt appears on a machine that you even
>: designed the instruction set for.
>
>I did a few bitslice designs many years ago. One was for my own amusement
>and was based on AMD2903 slices (32 bits, 8 chips). It was fun but very time-
>consuming. It was clocked at 5MHz and executed reg-reg instructions in
>two clocks. I later redesigned it to fetch and decode in the same cycle as
>the previous execute. It never ran any serious software.
>
>Per

On homebrew computers: start simple and learn as you go. When they work they are *very* satisfying. I was encouraged by helpful U.Waterloo hardware hacker friends (thanks Ashok and Mike and co., wherever you are) into building my first homebrew 6809 system -- the "Gray-1", in 12th grade about 14 years ago. It started with ROM, SRAM, and LEDs, and gradually acquired serial ports, video, and a Votrax speech synthesizer. Eight-bit micros and 1 MHz clock rates are easy to do: easy to wire wrap, and easy to program. Start with one of those; PICs look like a good choice today.

On homebrew processors: I went into the software biz but my love for hardware and computer architecture remains. I've always been envious of the engineers in industry and academia who get to design and build new processors. For a hobbyist, custom VLSI, gate arrays, or standard cell has hugely expensive barriers to entry. And only the most determined hobbyist would build a useful 32-bit CPU using bitslice parts.

In the years since, the programmable logic industry has arrived! These days you can buy, quantity one, 5,000 gate field programmable gate arrays (FPGAs) for ~$100, and 10,000 gate parts for ~$200. The beauty of these parts is that they are adequately dense for implementing processors, and they abstract away a lot of the high speed circuit stuff for you. For instance, clock skew is of little concern. If you stick to fully synchronous designs (no async preset/clear, no gated clocks, etc.), carefully floorplan your functional units, and stay on chip :-), your designs have a good chance of working at 20-25 MHz.

In my copious spare time I am experimenting with homebrew RISC CPUs. Right now I have a partially finished, partially functional 16-bit RISC CPU and ambitions for a dual-issue 32-bit CPU. The former ("jr16") is compiled for a Xilinx XC4005PC84C-5; the latter ("NVLIW1" -- "not very long instruction word #1") will be for an XC4010PC84C-5.

jr16 is a pipelined, 3-operand, load/store RISC with sixteen 16-bit registers. The basic instruction formats are:

  { 0,  op: 3, rd: 4, ra: 4, rb: 4  /* add/logic operations       */ },
  { 10, op: 2, rd: 4, ra: 4, imm: 4 /* load/store, EA = ra + imm4 */ },
  { 11, op: 2, rd: 4, imm: 8        /* load immediate, branch     */ }.

The instruction pipeline is the classic IF (insn fetch), RF (write back previous result and reg fetch), and EX (execute add/logic/effective address computation).
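A hypothetical C decoder for the three formats above may make the encoding concrete. The post gives only the field widths; the MSB-first packing order, the struct, and the names below are guesses added for illustration, not the documented jr16 encoding.

  /* Illustrative decoder for the three jr16 formats; MSB-first field
   * packing is an assumption, since only the widths are given above. */
  #include <stdint.h>
  #include <stdio.h>

  typedef struct {
      unsigned fmt;   /* 0 = reg-reg, 1 = load/store, 2 = imm/branch */
      unsigned op, rd, ra, rb, imm;
  } jr16_insn;

  static jr16_insn jr16_decode(uint16_t w)
  {
      jr16_insn d = {0};
      if ((w >> 15) == 0) {            /* 0 op:3 rd:4 ra:4 rb:4   */
          d.fmt = 0;
          d.op = (w >> 12) & 0x7;
          d.rd = (w >> 8) & 0xF;
          d.ra = (w >> 4) & 0xF;
          d.rb = w & 0xF;
      } else if ((w >> 14) == 2) {     /* 10 op:2 rd:4 ra:4 imm:4 */
          d.fmt = 1;
          d.op  = (w >> 12) & 0x3;
          d.rd  = (w >> 8) & 0xF;
          d.ra  = (w >> 4) & 0xF;
          d.imm = w & 0xF;             /* EA = ra + imm4          */
      } else {                         /* 11 op:2 rd:4 imm:8      */
          d.fmt = 2;
          d.op  = (w >> 12) & 0x3;
          d.rd  = (w >> 8) & 0xF;
          d.imm = w & 0xFF;            /* load immediate, branch  */
      }
      return d;
  }

  int main(void)
  {
      jr16_insn d = jr16_decode(0x0123);  /* fmt 0: op=0 rd=1 ra=2 rb=3 */
      printf("fmt=%u op=%u rd=%u ra=%u rb=%u\n", d.fmt, d.op, d.rd, d.ra, d.rb);
      return 0;
  }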
If there's a load/store, the pipeline stops until it completes.

The 16-bit datapath is 8 rows by 5 columns of CLBs (Xilinx Configurable Logic Blocks), only ~20% of an XC4005, which has an array of 14x14 CLBs. The columns are: rfa (reg file read port A), rfb (reg file read port B), mux (multiplex B or immediate data), adder, logic unit (and, or, xor, xnor). Results (add/logic/load data) are multiplexed into a write-back register on long lines (LLs) using the XC4000's dedicated LL tristate drivers.

For this first design I avoided a separate PC incrementor and associated multiplexors and instead use r15 for a PC. Thus the clock phases are:

  phase  register file          exec. unit           load/store
  1      write back result reg  add 2 to PC          latch insn, read another
  2      read next A, B regs    add 2 to PC
  3      write back PC          user insn add/logic
  4      read PC                user insn add/logic

(The execution unit takes two clocks to add/mux the result at an (unproven) 40 MHz.) A nice aspect of this design is that the alternating inc-PC and user-insn cycles mean the previous user insn finishes and any results are written back to the reg file before the next user insn operands are read, thus eliminating any need for bypass multiplexors in the operand busses or ugly operation latencies in the programming model.

To date I have this design running using the 11 MHz Xilinx XChecker circuit probe, incrementing PC, fetching instructions from an on-chip 16-word boot ROM, and performing ALU operations, but I haven't yet implemented condition codes, branch, or load/store circuitry. Soon! (I know it works as far as it does because I can verify internal state: the XChecker probe allows you to examine the state of every function generator and flip flop on the part.)

As for top speed, XDelay static timing analysis (I don't have the simulator software) indicates I should be able to clock this at 40 MHz (25 ns). (I do have a critical path or two to better pipeline yet.) Thus it should do 10 peak MIPS, not too shabby for a first design.

One neat thing about the Xilinx XC4000 architecture (I haven't seriously looked at the other FPGA vendors' architectures to know if this is unique, inferior, or superior) is that there are enough flip flops mixed in with the function generators that you can make a RISC datapath in as few as three columns of CLBs: one register file (that takes two clocks to read two operands), one adder, one logic unit, with result multiplexing done on the LLs using tristate drivers. And using the dedicated carry paths you can do 16-bit adders in 9 CLBs, delay about 25 ns, and 32-bit adders in 17 CLBs, delay about 35 ns.

As for the dual-issue 32-bit NVLIW1, my current plans are for a two-unit implementation of a simple VLIW architecture. Each "unit" has a separate file of sixteen 32-bit registers, and 3-operand instructions (rdest = ra op rb); rdest and rb are local to the unit, specified using a 4-bit reg no., but ra can be read from either unit, and is spec'd using 1+4 bits. Thus a 2-unit machine has a basic 34-bit insn word:

  { op0: 4, rd0: 4, ra0: 5, rb0: 4, op1: 4, rd1: 4, ra1: 5, rb1: 4 }.

(I'd obviously like to get that 34-bit word down to 32 bits but there isn't much fluff left. Any ideas out there? 32 - 2*(4+5+4) = 6, and six bits doesn't encode two operations very well...)

Using the above "modestly decoupled" architecture, a separate PC incrementer, bypass result multiplexing, and VLIW-like limited access between register files/functional units, it should do a peak of two instructions in two clocks at 25 ns, or 40 MIPS.
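To make the 34-bit budget concrete, here is a small C sketch that packs the eight fields in the order quoted above. The helper name and the packing direction are invented for illustration; only the field order and widths come from the post (4+4+5+4 bits per unit, twice, giving 34).

  /* Pack one NVLIW1 insn word (34 bits) into a uint64_t.  Field order
   * and widths follow the post; the packing direction is assumed. */
  #include <stdint.h>
  #include <stdio.h>

  static uint64_t nvliw1_pack(unsigned op0, unsigned rd0, unsigned ra0, unsigned rb0,
                              unsigned op1, unsigned rd1, unsigned ra1, unsigned rb1)
  {
      uint64_t w = 0;
      w = (w << 4) | (op0 & 0xF);
      w = (w << 4) | (rd0 & 0xF);
      w = (w << 5) | (ra0 & 0x1F);   /* 1+4 bits: either unit's reg file */
      w = (w << 4) | (rb0 & 0xF);
      w = (w << 4) | (op1 & 0xF);
      w = (w << 4) | (rd1 & 0xF);
      w = (w << 5) | (ra1 & 0x1F);
      w = (w << 4) | (rb1 & 0xF);
      return w;                      /* 4+4+5+4 + 4+4+5+4 = 34 bits used */
  }

  int main(void)
  {
      uint64_t w = nvliw1_pack(1, 2, 3, 4, 5, 6, 7, 8);
      printf("insn word = 0x%09llx (34 bits)\n", (unsigned long long)w);
      return 0;
  }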
Here, the columns of functional units in the datapath floor plan will be something like

  LAMMRRR RRRMMAL

(L=logic unit, A=adder, MM=4-way A-bus source mux, RRR=3-read 2-write register file), with the two halves placed such that splitting the LL bus lets me mux the adder or logic unit results of each concurrently. Thus the datapath of this 32-bit dual-issue machine should fit nicely in 14 columns x 17 rows of a 20x20 XC4010. On a 4013 (24x24) I would add a 16-entry 256-byte direct mapped cache (16 16-byte lines) whose cache and data SRAMs would burn another 5 rows by 16 columns. On a 4025 (32x32) ...

It is amazing what you can squeeze onto these parts if you design the machine architecture carefully to exploit FPGA resources. In contrast, there was a very interesting article in a recent EE Times by a fellow from VAutomation doing virtual 6502's in VHDL, then synthesizing them down into arbitrary FPGA architectures. Although the 6502 design used only about 4000 "ASIC gates" it didn't quite fit in an XC4010, a so-called "10,000 gate" FPGA. That a dual-issue 32-bit RISC should fit, and a 4 MHz 6502 does not, says a great deal about VHDL synthesis vs. manual placement, about legacy architectures vs. custom ones, and maybe even something about CISC vs. RISC...

Well, that serves as a kind of brain dump of work (play) in progress. Please drop me a line if you have questions, advice, etc.

Jan Gray
Redmond, WA
jsgray@ix.netcom.com (home: hacking processors)
jangr@microsoft.com (work: hacking Microsoft Visual C++)
Article: 502
jhallen@world.std.com (Joseph H Allen) writes:
>In article <3ca509$gd5@atlantic.merl.com>, Doug Hahn <hahn@ca.merl.com> wrote:
>>I was wondering if anyone has any experience driving a PCI
>>bus from an XC4000 device. PCI has specific I/V characteristics
>>which need to be met. Does anybody have any experience meeting
>>these criteria (what output driver configuration is needed?).
>I haven't tried it, but the I/V curve for the 3K and 2K devices is just in
>the accepted area for PCI. I assume 4K is too. The only problem I can
>think of is that having a large number of pins driving a high-capacitance bus
>all on the same clock edge probably exceeds the power capability of the
>chip. I think there's a capacitance per power pin limit in the 4K specs
>somewhere; so be sure to check it.
>I'd like to hear if you have any success with this.
>--
>/* jhallen@world.std.com (192.74.137.5) */ /* Joseph H. Allen */

I too am interested in this... I have contacted Xilinx and they have a PCI compatibility package. Just send email to pci@xilinx.com and they'll send it out to you. From what I gather, their XC3100 series is fully PCI compliant. They even have a handy VHDL file that implements a PCI bridge.

-Joel
glickj@rpi.edu
Article: 503
If you have a Xilinx FPGA design that you would like completed, Optimum Solutions can help you out!

Optimum Solutions offers Viewlogic schematic entry, simulation, and Xilinx FPGA design services over the Internet. All you need to do is present a rough block diagram and/or a rough specification of what you want. You will receive detailed Viewlogic schematics, Viewsim simulation vectors, timing and performance specifications, and a fully routed LCA file. All this for a reasonable fee!

If you're interested, please E-Mail Phillip Roberts at Optimum Solutions.

proberts@rmii.com
Article: 504
In article <3c80k0$9j7@timp.ee.byu.edu> hutch@timp.ee.byu.edu (Brad Hutchings) writes:

  From: hutch@timp.ee.byu.edu (Brad Hutchings)
  Newsgroups: comp.arch.fpga
  Date: 8 Dec 1994 15:16:32 -0700
  Organization: ECEN Department, Brigham Young University
  References: <1994Dec7.210319.3344@super.org>

|> I think the format for any benchmarks should be either
|>
|> 1) pen and paper or
|> 2) C code
|>
|> The idea here is to specify the benchmark at the highest level.
|> In the pen and paper approach the benchmark is pure algorithm,
|> consisting of a mathematical description of the input data, the
|> algorithm, and output data. The C code approach gives the implementor
|> a real example of the behavior of an algorithm.

Why C-code? C-code seems like a poor choice as it only supports sequential semantics. Thus any algorithm implemented in C will demonstrate *sequential* behavior. However, the goal is to implement hardware that is highly concurrent. I don't see where C will be helpful in this case.

C is useful for specification of the input and output of the algorithm. It is a much more well-defined specification language than English, for example. There are many IEEE specs that use either C or Pascal (e.g. the Ethernet spec) to augment text descriptions. The original author's first sentence says explicitly "The idea here is to specify the benchmark at the highest level". This says nothing about proposing that your benchmark implementation must follow this behavior, only that it solve the same problem. Maybe I'm just reading what makes sense to me and attributing it to the original author.

|> A C benchmark allows
|> the designer to make hardware/software tradeoffs -- if they're using
|> a big board they can put the whole algorithm in hardware; if they
|> are using a small board they can cut up the program and divide and
|> conquer.

I think that C falls flat when it comes to hardware/software tradeoffs. Again, a C implementation of an algorithm and a hardware implementation of an algorithm will likely be quite different for the reasons that I expressed above. Using C will only complicate the design-space search for mixed hardware-software solutions.

Only if you use C as input to your system, as opposed to the definition of a computational problem which you are to demonstrate your system's ability to solve using its own specification paradigm. Just another view.

--
Brad L. Hutchings (801) 378-2667
Assistant Professor
Brigham Young University - Electrical Eng. Dept. - 459 CB - Provo, UT 84602
Reconfigurable Logic Laboratory

--
Jack Greenbaum        | Ricoh California Research Center
jackg@crc.ricoh.com   | 2882 Sand Hill Rd. Suite 115
(415) 496-5711 voice  | Menlo Park, CA 94025-7002
(415) 854-8740 fax    |
Article: 505
In article <JACKG.94Dec12152537@downhaul.crc.ricoh.com>, jackg@downhaul.crc.ricoh.com (Jack Greenbaum) writes:
|> In article <3c80k0$9j7@timp.ee.byu.edu> hutch@timp.ee.byu.edu (Brad Hutchings) writes:
|>
|> Why C-code? C-code seems like a poor choice as it only supports
|> sequential semantics. Thus any algorithm implemented in C will
|> demonstrate *sequential* behavior. However, the goal is to implement
|> hardware that is highly concurrent. I don't see where C will be
|> helpful in this case.
|>
|> C is useful for specification of the input and output of the algorithm.
|> It is a much more well-defined specification language than English, for
|> example. There are many IEEE specs that use either C or Pascal (e.g. the
|> Ethernet spec) to augment text descriptions.

Sure. But why would C be better than VHDL, for example (which the author was arguing against)? VHDL allows a much broader range of abstractions that make sense for both hardware and software. Concurrent and sequential semantics are supported. If all that matters is the specification of the benchmark and not its implementation, then just about any *executable* specification will do. However, if the eventual goal is to compare different systems and approaches, it would be useful to have a specification language that can get closer to hardware so that specific approaches and implementation strategies can be directly compared.

|> The original author's first sentence says explicitly "The idea here is to
|> specify the benchmark at the highest level". This says nothing about
|> proposing that your benchmark implementation must follow this behavior,
|> only that it solve the same problem. Maybe I'm just reading what makes
|> sense to me and attributing it to the original author.
|>
|> |> A C benchmark allows
|> |> the designer to make hardware/software tradeoffs -- if they're using
|> |> a big board they can put the whole algorithm in hardware; if they
|> |> are using a small board they can cut up the program and divide and
|> |> conquer.
|>
|> I think that C falls flat when it comes to hardware/software tradeoffs.
|> Again, a C implementation of an algorithm and a hardware implementation
|> of an algorithm will likely be quite different for the reasons that
|> I expressed above. Using C will only complicate the design-space
|> search for mixed hardware-software solutions.
|>
|> Only if you use C as input to your system, as opposed to the
|> definition of a computational problem which you are to demonstrate your
|> system's ability to solve using its own specification paradigm. Just
|> another view.

I missed your point. I was commenting on how C is of very little use for doing software/hardware tradeoffs. How does C help here? C can help to *define* the problem, but it seems like a poor choice if the real goal is to experiment with different hardware/software tradeoffs.

--
Brad L. Hutchings (801) 378-2667
Assistant Professor
Brigham Young University - Electrical Eng. Dept. - 459 CB - Provo, UT 84602
Reconfigurable Logic Laboratory
Article: 506
Ryan Raz (morph@io.org) wrote:
: We are currently working on a large design including digital filters,
: shift registers, SRAM's, VRAM's, etc. Also we will be using FPGA's
: to handle timing, control and ALU functions.
: We are looking for CAD tools for overall design description and simulation
: and FPGA synthesis. So far we have looked at Data I/O's Synario,
: Viewlogic's Pro Series, Exemplar Logic and the Xilinx development tools.
: Are there any comments on these systems or on alternatives?

I've used Viewlogic WORKVIEW Plus on a PC, over the Xilinx compilers (the new 5.0 on a Sparc, as well as the older versions on a PC), with pretty good results. The PC platform crashes a lot, I believe due to the overabundance of M*cr*S*ft products on it, but it's fairly useable. A better solution, IMO, would be a real workstation.

The Altera tools for the PC work pretty well too, although I am not as familiar with how well their compiler works. I do know that it is quite easy to code up some AHDL (Altera's hardware language) and get it compiled. They have some nice design and debug tools. Their waveform simulator could use some additions, like accepting macro-command file input, but it has its edges rounded better than most.

If you are truly looking for HDL and *NOT* schematic entry, your low-cost choice should probably be the Altera tools, since the Xilinx tools make it somewhat difficult to do purely HDL designs. I am not familiar with other design environments, but many do exist, for other FPGAs not mentioned.
Article: 507
[... use of C, VHDL as benchmarking language ...]

A different approach, and one that is gaining popularity, is to specify the algorithm in a general way, then permit any sort of implementation. This allows true testing of the architecture, rather than the cleverness of the optimizer (not that testing optimizers is necessarily a bad thing). I believe the SLALOM benchmark takes this approach. This is probably more practical, considering the wide range (or maybe lack of range :^) of programming language support for these machines.

I personally would favor a high level language which permits the explicit expression of parallelism. There has been some interesting work done along these lines with data parallel C and Occam. But I know at least one person who believes FORTRAN will be necessary if these machines are to be accepted by the high performance computing community (but I'd rather not even think about it ...).

-- Steve -- 12/13/94
Article: 508
FCCM '95

C A L L   F O R   P A P E R S

THE THIRD ANNUAL IEEE SYMPOSIUM ON FPGAs FOR CUSTOM COMPUTING MACHINES
Napa, California
April 19 - 21, 1995

For more information, refer to the WWW URL page: http://www.super.org:8000/FPGA/comp.arch.fpga

PURPOSE: To bring together researchers to present recent work in the use of Field Programmable Gate Arrays or other means for obtaining reconfigurable computing elements. This symposium will focus primarily on the current opportunities and problems in this new and evolving technology for computing.

SOLICITATIONS: Papers are solicited on all aspects of the use or applications of FPGAs or other means for obtaining reconfigurable computing elements in attached or special-purpose processors or co-processors, especially including but not limited to:
) Coprocessor boards for augmenting the instruction set of general-purpose computers.
) Attached processors for specific purposes (e.g. signal processing).
) Languages, compilation techniques, tools, and environments for programming.
) Application domains.
) Architecture prototyping for emulation and instruction.
A special session will be organized in which vendors of hardware and software can present new or upcoming products involving FPGAs for computing.

SUBMISSIONS: Authors should send submissions (4 copies, 10 pages double-spaced maximum) before January 16, 1995, to Peter Athanas. A Proceedings will be published by the IEEE Computer Society. Specific questions about the conference should be directed to Kenneth Pocek.

SPONSORSHIP: The IEEE Computer Society and the TC on Computer Architecture.

CO-CHAIRS:
Kenneth L. Pocek
Intel, Mail Stop RN6-18
2200 Mission College Boulevard
Santa Clara, California 95052
(408)765-6705 voice
(408)765-5165 fax
kpocek@sc.intel.com

Peter M. Athanas
Virginia Polytechnic Institute and State University
Bradley Department of Electrical Engineering
340 Whittemore Hall
Blacksburg, Virginia 24061-0111
(703)231-7010 voice
(703)231-3362 fax
athanas@vt.edu

ORGANIZING COMMITTEE:
Jeffrey Arnold, Supercomputing Research Center
Brad Hutchings, Brigham Young Univ.
Duncan Buell, Supercomputing Research Center
Tom Kean, Xilinx, Inc. (U.K.)
Pak Chan, Univ. California, Santa Cruz
Wayne Luk, Oxford Univ.
Apostolos Dollas, Technical Univ. of Crete
Article: 509
> |> I think the format for any benchmarks should be either
> |>
> |> 1) pen and paper or
> |> 2) C code
> |>
> |> The idea here is to specify the benchmark at the highest level.
> |> In the pen and paper approach the benchmark is pure algorithm,
> |> consisting of a mathematical description of the input data, the
> |> algorithm and output data. The C code approach gives the implementor
> |> a real example of the behavior of an algorithm.
>
> Why C-code? C-code seems like a poor choice as it only supports
> sequential semantics. Thus any algorithm implemented in C will
> demonstrate *sequential* behavior.

A C-coded (or any sequential language) *emulation* of an algorithm is a form of specification. It does not have to give an exact form of implementation, nor does it have to specify hardware structure. A C program can, however, give one a measure of performance when comparing throughput of present day CPUs against new reconfigurable architectures.

> However, the goal is to implement
> hardware that is highly concurrent. I don't see where C will be
> helpful in this case.

In my opinion the goal is to implement algorithms in hardware to accelerate a computation. If there is one thing C has going for it, it is that there are lots of algorithms (and benchmarks) implemented in that language.

> |> A C benchmark allows
> |> the designer to make hardware/software tradeoffs -- if they're using
> |> a big board they can put the whole algorithm in hardware; if they
> |> are using a small board they can cut up the program and divide and
> |> conquer.
>
> I think that C falls flat when it comes to hardware/software tradeoffs.

It seems to me that we need to start with software to have a hw/sw tradeoff; otherwise we will have to rewrite the billion lines of code all over again.

> Again, a C-implementation of an algorithm and a hardware implementation
> of an algorithm will likely be quite different for the reasons that
> I expressed above. Using C will only complicate the design-space
> search for mixed hardware-software solutions.

I agree that if we use C as the implementation specification it puts all the burden on the compiler writer. I think it will come to that some day, however. For now I think we should just specify (give an example of) the algorithm to be implemented. For example:

  void subroutine1(void) { /* ..... */ }
  void subroutine2(void) { /* ..... */ }
  void subroutine3(void) { /* ..... */ }

  int main(void)
  {
      subroutine1();
      subroutine2();
      subroutine3();
      return 0;
  }

A large system might be able to place all the functions in hardware at one time. A smaller system might implement the three hardware designs sequentially. A very small system might have to parse each subroutine into more than one hardware object. Exactly how these subroutines are implemented (VHDL, Verilog, schematic) would be up to the designer. But at least we would know everyone is getting the same results because we all have the same C program benchmark to refer to.

Steve Casselman
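One way to read that partitioning idea as code: keep a table of the benchmark's subroutines and mark which ones the board can absorb. The sketch below is illustrative only; run_in_hardware() is a placeholder for a board-specific configuration call, not a real EVC or vendor API.

  /* Hypothetical hw/sw partitioning harness; nothing here is a real API. */
  #include <stdio.h>

  static void subroutine1(void) { /* ... software fallback ... */ }
  static void subroutine2(void) { /* ... */ }
  static void subroutine3(void) { /* ... */ }

  typedef struct {
      const char *name;
      void (*sw)(void);   /* software implementation           */
      int  in_hw;         /* 1 if this piece fits on the board */
  } stage;

  static void run_in_hardware(const char *name)
  {
      printf("configuring FPGA for %s\n", name);   /* placeholder */
  }

  int main(void)
  {
      stage stages[] = {
          { "subroutine1", subroutine1, 1 },
          { "subroutine2", subroutine2, 1 },
          { "subroutine3", subroutine3, 0 },  /* small board: stays in C */
      };
      for (int i = 0; i < 3; i++) {
          if (stages[i].in_hw)
              run_in_hardware(stages[i].name);
          else
              stages[i].sw();
      }
      return 0;
  }

The point of the table is that the same benchmark source serves every system size; only the in_hw flags change.
Article: 510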
hi all:

as title, any easy way to make the XNF file download!!
--
Article: 511
I'm looking for some shareware (preferably in C) code that does logic minimization - e.g. Presto, Espresso II, boozer, MINI, etc. I would really like to get the ESPRESSO II algorithm so I can implement it in some code I'm writing.

Kirk
weedk@pogo.wv.tek.com
Article: 512
The RNG I implemented was like a Fibonacci number generator, except I add one every time I access the EVC over the SBus. I take two random integers, stick them together, format them, and subtract 1. The results are double-precision random numbers. I then use this RNG in the NAS embar supercomputer benchmark. The results are below. It should be noted that at whatever speed this benchmark runs on a Sparc 20, it will run around twice as fast using an EVC.

Steve Casselman

Run 1
numbers > 0.000000 and < 0.100000     99979
numbers > 0.100000 and < 0.200000    100238
numbers > 0.200000 and < 0.300000    100190
numbers > 0.300000 and < 0.400000     99991
numbers > 0.400000 and < 0.500000    100290
numbers > 0.500000 and < 0.600000     99362
numbers > 0.600000 and < 0.700000    100268
numbers > 0.700000 and < 0.800000     99627
numbers > 0.800000 and < 0.900000    100384
numbers > 0.900000 and < 1.000000     99671
Total random numbers = 1000000
max = 0.999998027735670
min = 0.000000566747618

Run 2
numbers > 0.000000 and < 0.100000    100082
numbers > 0.100000 and < 0.200000    100478
numbers > 0.200000 and < 0.300000    100391
numbers > 0.300000 and < 0.400000    100174
numbers > 0.400000 and < 0.500000     99763
numbers > 0.500000 and < 0.600000     99686
numbers > 0.600000 and < 0.700000     99779
numbers > 0.700000 and < 0.800000    100166
numbers > 0.800000 and < 0.900000     99272
numbers > 0.900000 and < 1.000000    100209
Total random numbers = 1000000
max = 0.999998301519109
min = 0.000000095615990

Using Sparc Station 2 for everything
CPU TIME = 632.6700  (684.9300 when compiled with -p)
N = 2^24
NO. GAUSSIAN PAIRS = 13176389.
COUNTS:
 0  6140517.
 1  5865300.
 2  1100361.
 3    68546.
 4     1648.
 5       17.
 6        0.
 7        0.
 8        0.
 9        0.

Using EVC just for random numbers
CPU TIME = 353.880004
N = 2^24
NO. GAUSSIAN PAIRS = 13177271.
COUNTS:
 0  6138931.
 1  5865486.
 2  1101640.
 3    69558.
 4     1634.
 5       22.
 6        0.
 7        0.
 8        0.
 9        0.

The top two functions were replaced by hardware: 26.3 + 23.9 = 50.2%. It's a little more if you take out mcount, which is the profiler itself.

%time  cumsecs      #call   ms/call  name
 26.3   179.74  100663620      0.00  _aint
 23.9   343.38        257    636.73  _vranlc_
 15.1   446.78   13176389      0.01  _sqrt
 14.2   543.68          1  96900.00  _MAIN_
 13.7   637.73                       mcount
  6.8   684.58   13176389      0.00  _log
  0.0   684.59          3      3.33  _cfree
  0.0   684.60          5      2.00  _ioctl
  0.0   684.61         16      0.62  _write
  0.0   684.61         11      0.00  .div
  0.0   684.61         11      0.00  .mul
  0.0   684.61         64      0.00  .rem
  0.0   684.61         16      0.00  .udiv
  0.0   684.61          3      0.00  .umul
... other junk
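For readers who want to play with the idea in software, below is a rough C model of such a generator: an additive lagged-Fibonacci sequence, with two 32-bit draws combined into the mantissa of a double in [0,1). The lags (24, 55), the seeding, and the combining step are assumptions added here; the post doesn't give the actual hardware parameters.

  /* Rough software model of a lagged-Fibonacci RNG producing doubles.
   * Lags, seed, and the "stick two integers together" step are guessed. */
  #include <stdint.h>
  #include <stdio.h>

  #define LAGA 24
  #define LAGB 55

  static uint32_t state[LAGB];
  static int pos = 0;

  static uint32_t fib_next(void)
  {
      /* s[n] = s[n-LAGA] + s[n-LAGB] mod 2^32, circular buffer */
      uint32_t x = state[(pos + LAGB - LAGA) % LAGB] + state[pos];
      state[pos] = x;
      pos = (pos + 1) % LAGB;
      return x;
  }

  static double fib_double(void)
  {
      /* stick two 32-bit integers together, scale to [1,2), subtract 1 */
      uint64_t hi = fib_next(), lo = fib_next();
      uint64_t mant = ((hi << 20) ^ lo) & 0xFFFFFFFFFFFFFull;  /* 52 bits */
      return (1.0 + mant / 4503599627370496.0) - 1.0;          /* / 2^52  */
  }

  int main(void)
  {
      for (int i = 0; i < LAGB; i++)
          state[i] = 2654435761u * (i + 1);   /* arbitrary nonzero seed */
      for (int i = 0; i < 5; i++)
          printf("%.15f\n", fib_double());
      return 0;
  }

A histogram over the ten [0,1) deciles, as in the runs above, is the natural first check on any such generator.
Article: 513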
I still do not see how a design spread over 256 FPGAs would rate favourably on a benchmark... other than being an academic exercise!

wilfred
Article: 514
In article <3cp0ep$914@news.csie.nctu.edu.tw> dyliu@dyliu.dorm2.nctu.edu.tw (¤GªÙªk§J) writes:
>
> hi all:
>
> as title, any easy way to make the XNF file download!!
>
>--

YES: (assuming you are using an XC4000 style device, and version 5 SW)

1) xnfmerge filename.xnf   <<< this is your xnf file
2) xnfprep filename        <<< merge generated an .xff file, and this turns it into a .xtf file
3) ppr filename            <<< ppr turns the .xtf into .lca
4) makebits filename       <<< this turns the .lca file into a .bit file
5) xchecker filename       <<< this takes the .bit file and downloads it to a chip.

ALL THE BEST

Philip Freidin

:-) :-) :-) I couldn't help my self.
Article: 515
I am preparing a presentation for an IEEE conference on FPGA applications and would like to hear from people in industry. (I have the FCCM proceedings, but *most* of those papers are from academics). I would also be very interested in hearing from FPGA vendors as to how customers are using their parts. (Great opportunity to showcase your product!) :-)

--
Jim Frenzel, Asst. Prof.          Electrical Engineering, BEL 213
208-885-7532                      University of Idaho
jfrenzel@uidaho.edu               Moscow, ID 83844-1023 USA
Article: 516
In article <3cp0ep$914@news.csie.nctu.edu.tw>, GªÙªk§J <dyliu@dyliu.dorm2.nctu.edu.tw> wrote:
>
> hi all:
>
> as title, any easy way to make the XNF file download!!
>
>--

Nope. You absolutely must place/route the XNF file and run Makebits.

--Monty
Article: 517
A few days back I saw a posting regarding 'analog FPGAs' in this newsgroup. Somebody gave a reference to some articles in Electronic Design(?). Can someone repost the information regarding that?

Thanks in advance,
Swapnajit
Article: 518
-------------------------------------------------------------------
ASIC '95

Eighth Annual APPLICATION SPECIFIC INTEGRATED CIRCUIT Conference and Exhibit, 1995
"Implementing the Information Superhighway with Emerging Technologies"
Stouffer Renaissance Hotel, Austin, Texas
September 18-22

CALL FOR PAPERS, TUTORIALS, & WORKSHOPS

The IEEE International ASIC Conference and Exhibit provides a forum for examining current issues related to ASIC applications and system implementation, design, test, and design automation. The conference offers a balance of emphasis on industry standard techniques and evolving research topics. Information is exchanged through workshops, tutorials, and paper presentations. These promote an understanding of the current technical challenges and issues of system integration using programmable logic devices, gate arrays, cell based ICs, and full custom ICs in both digital and analog domains.

____________________________________________________________________

Technical Papers, Tutorials, and Workshop Proposals are solicited in the following areas:

ASIC Applications: Wireless Communications, PC/WS and Peripherals, Multimedia, Networking, Image Processing, Data Communications, Storage Technologies, Graphics, Digital Signal Processing
Technologies: Digital, Analog, Mixed Signal, CMOS, BiCMOS, ECL, GaAs
CAD Tools: Design Capture, Layout, Test, Synthesis, Modeling, Simulation
Architectures: PLDs, Gate Arrays, Cell Based ICs, Full Custom ICs
Evolving Research: Research in Methodologies, Tools, Technologies & Architectures
Design Methodologies: System Design, Top-down, Graphical, HDLs
Manufacturing: Process, Testability, Packaging
Workshops: Four or eight hour technical workshops covering ASIC design knowledge and skills. Proposals to form these workshops for either introductory or advanced levels are invited. ASIC industry as well as universities are encouraged to submit proposals. Contact the Workshop Chair.

______________________________________________________________________

INSTRUCTIONS TO AUTHORS

Authors of papers, tutorials, and workshops are asked to submit 15 copies of a review package that consists of a 500 word summary and a title page. The title page should include the technical area from above, the title, a 50 word abstract, and the authors' names, as well as an indication of the primary contact author with a COMPLETE mailing address, telephone number and TELEX/FAX/Email. The summary should clearly state: 1) title of the paper; 2) the purpose of the work; 3) the major contributions to the art; and 4) the specific results and their significance.

IMPORTANT DATES
Summaries and Proposals due:        March 3, 1995
Notification of Acceptance:         April 14, 1995
Final Camera Ready Manuscript due:  June 2, 1995

SEND REVIEW PACKAGE TO
Lynne M. Engelbrecht
ASIC Conference Coordinator
1806 Lyell Avenue
Rochester, NY 14606
Phone: (716) 254-2350
Fax: (716) 254-2237

CONFERENCE INFORMATION
http://asic.union.edu
Proceedings, the Advance Program, Airline Discounts, Exhibits, Technical Sessions, Schedule, Registration, Hotel Sites

CONFERENCE CHAIR
William A. Cook
Eastman Kodak Co.
Rochester, NY 14650
Phone: (716) 477-5119
Fax: (716) 477-4947
bcook@kodak.com

TECHNICAL CHAIR
Richard A. Hull
Xerox Corp.
Webster, NY 14580
Phone: (716) 422-0281
Fax: (716) 422-9237
rah.wbst102a@xerox.com

WORKSHOP CHAIR
P. R. Mukund
RIT
Rochester, NY 14623
Phone: (716) 475-2174
Fax: (716) 475-5845
mukund@cs.rit.edu

EXHIBIT CO-CHAIRS
Kerry Van Iseghem
LSI Logic Corporation
Victor, NY 14564
Phone: (716) 233-8820
Fax: (716) 233-8822
kerryv@lsil.com

Kenneth W. Hsu
RIT
Rochester, NY 14623
Phone: (716) 475-2655
Fax: (716) 475-5041
kwheec@ritvax.isc.rit.edu

Sponsored by the IEEE Rochester Section in cooperation with the Solid State Circuits Council and the IEEE Austin Section
-------------------------------------------------------------------
Article: 519
guccione@sparcplug.mcc.com (Steve Guccione) writes:
>But I know at least one person who believes FORTRAN will be necessary
>if these machines are to be accepted by the high performance computing
>community (but I'd rather not even think about it ...).

At least FORTRAN doesn't have pointers...

--Mike
--
Mike Butts, Portland, Oregon   mbutts@netcom.com
Article: 520
In article <D0rM2x.MJr@mcc.com> Steve Guccione, guccione@sparcplug.mcc.com writes:
> [... use of C, VHDL as benchmarking language ...]
>
> A different approach, and one that is gaining popularity, is to
> specify the algorithm in a general way, then permit any sort of
> implementation. This allows true testing of the architecture, rather
> than the cleverness of the optimizer (not that testing optimizers is
> necessarily a bad thing). I believe the SLALOM benchmark takes this
> approach.

Well, it all depends what one is trying to benchmark:

- Use of a particular language: this is useful if one wants to see how good a particular language "translation" system is. There are major problems in comparing one language with another; take C and VHDL. C is a real programming language with which one can get good effective code (although C++ is better), but it lacks parallel constructs, timing info, and detailed type sizes; while VHDL is more of a hardware language, but one which has major flaws such as poor timing specification (at least in terms of what is wanted rather than what has been achieved), very limited concurrent/parallel operations, and which is not very good for writing software (if one wants to look at hw/sw tradeoffs). [I ought at this point to indicate that my research area is using FPGAs to accelerate software by mapping key functions, but the implementations we produce only make sense when suitably coupled with the microprocessor and original program.]

- Language independent algorithms: useful if that is what one wants. The problem here is that I personally doubt one can be completely independent of the underlying semantics of one's implementation system. The biggest problems will be in the nature of I/O, how the protocols are defined, and knowledge about data sizes. This will help some systems and not others. Another problem is that the amount of concurrency available has major implications on effective algorithmic complexity.

- General problem description: this at least allows people to choose the best algorithm for their technique - one then compares "fastest sorters", say. The problems should be self evident.

Much of this goes back to the problem with benchmarks. They are not, nor are they supposed to be, real world examples. The problem is that most people will really be interested in the effects on real examples, rather than benchmarks. The worst case scenario is that the benchmarks will constrain the tools, and prove as usual not to be typical.

_____________________________________________________________
Dr John Forrest            Tel: +44-161-200-3315
Dept of Computation        Fax: +44-161-200-3321
UMIST                      E-mail: jf@ap.co.umist.ac.uk
MANCHESTER M60 1QD UK
Article: 521
Requires past success with high-volume, low-cost boards with no jumper wires. Call for more information. EMail smaki@teleport.com for background/contact information. Full-time work for the right person. Stock options.
Article: 522
hello.

In response to the letter before, I would like to know, for example with a Xilinx 4000 or 3000, to change the XNF and load it into the Xilinx part, do I need any software to help me? What is its name, and where can I get it?

Regards,
David.
Article: 523
(Chiu See Ming <EEE3>) writes:
> In response to the letter before, I would like to know for example
>Xilinx 4000 or 3000, to change XNF and load it into Xilinx, do I need any
>software to help me? What is its name, where can I get?

To make this perhaps a bit more plain, here is an illustration:

Schematics are the FPGA equivalent of source code
.XNF files are the equivalent of source code that has been pre-compiled to assembly language source code
.LCA files are post-compilation linked assembly language code
.BIT files are equivalent to compiled binary object code

The analogy isn't bulletproof, but good enough for this discussion. To get from .XNF files to .BIT files, one needs the "compiler" (at least). For Xilinx FPGAs, both NeoCad and Xilinx *sell* compilers. There are no freeware/shareware Xilinx FPGA compilers. If there were one, I would run from it, since anyone trying to "offer" one would obviously be irrational. The "compiler" is the place and route tool, by and large.

If you want to play this game, you will need/want to buy competent and well-supported (that means *commercial*, in this case) software.

Bob Elkind, Tektronix TV
bobe@tv.tv.tek.com (speaking for myself)
Article: 524
A couple of weeks ago I asked for info on ASIC emulation from several groups and promised to provide a summary. Since all the responses didn't make it to the different groups, here is a summary of the posting and email activity, starting with my original post. The material is given in the order I saw it. Thanks for all of your replies.

Dan

===========================================================================

We're looking into ASIC emulation of the Quickturn variety and are interested in experiences with Quickturn or any other FPGA-based emulation systems.

Can you just drop a design on it and start running test vectors through, or do you still have to do some FPGA-like hardware design?

Does it scale well to large emulations, say of a complete CPU or even multiple chips?

We hear that the emulation runs 100 times slower than actual hardware (which seems a little slow). Are the FPGAs really that much slower individually, or is it a problem with their combination into a larger system?

Any insights, experiences and/or references to articles describing experiences would be greatly appreciated. I will summarize responses for the net. Thanks again.

**********************************************************************
NSF Engineering Research Center for Computational Field Simulation
**********************************************************************
Daniel H. Linder                          linder@erc.msstate.edu
NSF Engineering Research Center for CFS   (601) 325-2057
P.O. Box 6176                             fax: (601) 325-7692
Miss. State, MS 39762
**********************************************************************

===========================================================================

Date: Wed, 30 Nov 94 11:52:33 PST
From: mbutts!mbutts%mbutts@uunet.uu.net ( Mike Butts)
To: linder@erc.msstate.edu
Subject: Re: ASIC emulation (Quickturn, etc.)
Cc: mbutts@uunet.uu.net

Hi, Dan, glad you asked. Here's the reply which I posted to the newsgroups. You might want to contact one of our local people, such as Ken Mason, who's our AE in your part of the world, in our Cary, NC office (919-380-7178).

> We're looking into ASIC emulation of the Quickturn variety and are
> interested in experiences with Quickturn or any other FPGA-based emulation
> systems.
>
> Can you just drop a design on it and start running test vectors
> through, or do you still have to do some FPGA-like hardware design?

Yes you can. All the FPGA-specific details are completely contained within the design compiler, which does the technology mapping from the source netlist and libraries into the FPGAs. The emulation user sees the design elements, netnames, etc. in the design's terms. After running your vectors, or even without vectors, you can go directly in-circuit.

> Does it scale well to large emulations, say of a complete CPU or even
> multiple chips?

It scales very well to large emulations. Intel, Sun, and many other major developers are emulating entire CPU designs, at 2 million gates and more. Quickturn's System Emulator M3000 has a 3 million gate capacity, with provisions for multi-M3000 systems that allow over 10 million gate emulations off the shelf. Most CPU developers and many ASSP and ASIC projects now use Quickturn emulators to run OSs and applications before tapeout.

> We hear that the emulation runs 100 times slower than actual hardware
> (which seems a little slow). Are the FPGAs really that much slower
> individually or is it a problem with their combination into a larger system?
The programmable interconnect inside and between FPGAs does take more time than real metal and wires, because of RC delays in pass transistors and many more chip-crossings. 100X slowdown is an upper bound in our experience. Most emulations run from 1 to 8 MHz. That's 3 to 5 orders of magnitude faster than cycle-based simulators, which is the difference between running lots of real code and just doing vectors or one OS boot. Multi-million-gate CPU emulations are slower than 200K gate ASIC emulations, but the CPU projects find the speed is plenty for what they do, so it all works out. ASICs typically run at multi-MHz in current-generation emulators, and there are many techniques for successfully matching the target system's speed to the emulator.

> Any insights, experiences and/or references to articles describing
> experiences would be greatly appreciated. I will summarize responses for
> the net. Thanks again.

A detailed and quantitative article written by a user is called "Logic Design Aids Design Process", by Jim Gateley of Sun, in the July 1994 issue of ASIC & EDA. It's an account of the MicroSPARC II project's experiences with the Quickturn Enterprise (previous generation) logic emulator on a 200K gate 32-bit SPARC CPU. "During the 25 days prior to tapeout, the emulated processor and testbed system successfully executed power-on self tests and open boot PROM, booted single- and multi-user Solaris, Open Windows, and Open Windows applications. Altogether, emulation logged 15 bugs and enhancements against MicroSPARC II, PROM, and the kernel before tapeout. First silicon was very clean. MicroSPARC II shipped three months early."

--Mike Butts, Emulation Architect, Quickturn Design Systems (mbutts@qcktrn.com)

===========================================================================

Date: Wed, 30 Nov 94 14:14:47 PST
From: John.Sullivan@Eng.Sun.COM (John J. Sullivan)
To: mbutts@netcom.com, linder@erc.msstate.edu
Subject: Re: ASIC emulation (Quickturn, etc.)
Cc: John.Sullivan@Eng.Sun.COM

In article Fpy@netcom.com, mbutts@netcom.com (Mike Butts) writes:
> linder@ERC.MsState.Edu (Dan Linder) writes:
>> We're looking into ASIC emulation of the Quickturn variety and are
>> interested in experiences with Quickturn or any other FPGA-based emulation
>> systems.
>>
>> Can you just drop a design on it and start running test vectors
>> through, or do you still have to do some FPGA-like hardware design?
> Yes you can. All the FPGA-specific details are completely contained
> within the design compiler, which does the technology mapping from the
> source netlist and libraries into the FPGAs. The emulation user sees
> the design elements, netnames, etc. in the design's terms. After running
> your vectors, or even without vectors, you can go directly in-circuit.

Not to put down Quickturn products, but just an FYI: Quickturn emulation may indeed be very simple if you are doing a medium to large ASIC design based purely on a gate library. However, you're greatly understating the problem for any type of semi-custom or full-custom design. Any kind of memory structure in your design such as RAMs or register files can be very problematic to model, especially multi-ported memories. And all of your custom circuits will have to be modeled at the primitive gate level (behavioral Verilog or VHDL will have to be completely re-written). For a large design, it can take > 1 day to compile the design to be loaded into the emulation system.

MicroSparc-II was a great experience for Sun in terms of both emulation and design.
The chip and its predecessor MicroSparc-I were both attempts to have highly automated design flows (synthesis, layout, chip assembly). I believe this gave them an advantage in getting to emulation quickly because it forced them to avoid complex structures that would be hard to map. (They also sacrificed speed and density, but MicroSparc-II still gained quite an advantage by being quickly ported to a 0.5um technology.)

Our experiences with two other processors, SuperSparc-II and UltraSparc-I, were that it took 3-4 engineers plus a full-time Quickturn FAE on site approximately 8-9 months to bring up the system to run vectors or do ICE. Quickturn has been very helpful and responsive to our problems, and their systems have allowed us to go a long distance toward bug-free silicon. But I just want to point out that this does not necessarily come for free, without substantial investment of time and resources on the user's end.

----------------------------------------------------------------------------
John Sullivan, SparcTech VLSI               | email: sullivan@eng.sun.com
Sun Microsystems, Sunnyvale CA, Bldg. SUN02 | phone: 408-774-8097

===========================================================================

(anonymous)

In general, you are looking at working with a state of the art CAE tool, one which has no more than a few hundred installations worldwide. (If I'm wrong, their sales rep can correct that quickly.) In addition, you are talking about a very, very complex simulation job. My past experience with CAE tools is that:

(1) It takes a certain amount of expertise (measured in full time people) to get the thing running at all. This can sometimes be avoided by having the vendor set it up for you (clone a working site).

(2) At that point you can "play" with trivial cases -- things well within the envelope of what has been stressed by a lot of different users. If you're not pushing the state of the art in size of simulation, number of test cases, performance, etc., real work can get done at this point. You probably will stumble over a bug or two, and if you bypassed (1) above you will be clueless how to troubleshoot the bug and completely dependent on your vendor, who will have their money already and thus be a bit less responsive than they were before they were paid.

(3) Then you load the real problem on the system, stretching some limit no one has before, or doing something some way the tool's designers never thought of, and (1) you will uncover defects in the tool, or (2) the problem no longer fits on the tool you bought, or (3) performance falls apart and you have to tune it back into usefulness. This is the point at which those full time, really talented people I mentioned above bail out your project by figuring out how to work around the tool's bugs and limitations.

BTW, the industry puts up with this because without the tools, the design or simulation problems simply couldn't be done in our lifetimes. I once knew an engineer who owned a Porsche 930 Turbo. He didn't have the money to pay $5000 or so to have the engine rebuilt every year or so, so he rebuilt it himself in his garage. That was the right tradeoff for him. Most of the rest of us own simpler, less aggressive, easier to drive, simpler to maintain cars from GM or Ford.

===========================================================================

From: Kenny Chen - MPG SLV <kenchen@pcocd2.intel.com>
Date: Wed, 30 Nov 1994 15:36:05 -0800
To: linder@ERC.MsState.Edu (Dan Linder)
Subject: Re: ASIC emulation (Quickturn, etc.)
Newsgroups: comp.arch
X-Newsreader: TIN [version 1.2 PL2]

> (My original post was included here.)

Dan,

1. You just need to synthesize your design into Quickturn's library, which maps into the LCAs on the Xilinx FPGA. You don't need to do FPGA-like hardware design. (Unless you want to.) Although there's a mode where you can use it as a tester for vectors, in general it can do better than that. AMD used QT emulation for the K5 and booted DOS/Windows on a PC.

2. Depends on what you mean by "scalable".

3. Yes, you can put the whole CPU into it, as long as it fits. :) If your design gets too big, their SW can partition it into several boxes ($$$!). For ASICs you should be able to fit several chips into one box. Don't trust their sales quote.

4. It's slow, and be prepared for that. But it's hundreds of times faster than simulation. It's due to the interconnect and backplane routing.

--
-Kenny Chen

===========================================================================

Newsgroups: comp.arch.fpga
From: dej@eecg.toronto.edu (David Jones)
Subject: Re: ASIC emulation (Quickturn, etc.)
Nntp-Posting-Host: ziffs.eecg.toronto.edu
Organization: Department of Computer Engineering, University of Toronto
Date: 30 Nov 94 22:03:31 GMT

In article <mbuttsD03IM1.G8C@netcom.com>, Mike Butts <mbutts@netcom.com> wrote:
>real code and just doing vectors or one OS boot. Multi-million-gate CPU
>emulations are slower than 200K gate ASIC emulations, but the CPU projects find
>the speed is plenty for what they do so it all works out. ASICs typically run

Interesting question: What is the slowest "acceptable" speed for logic emulation of a CPU? In particular, would 1/256 full-speed be acceptable?

===========================================================================

From: rwieler@ee.umanitoba.ca (wieler)
Newsgroups: comp.arch.fpga
Subject: Re: ASIC emulation (Quickturn, etc.)
Date: 30 Nov 1994 23:38:21 GMT
Organization: Elect & Comp Engineering, U of Manitoba, Winnipeg, Manitoba, Canada
Distribution: world
Reply-To: rwieler@ee.umanitoba.ca
NNTP-Posting-Host: wine.ee.umanitoba.ca

> (My original post was included here.)

No experiences except on our homegrown system; however, you should not be surprised at system-level or large emulations running 100 times slower. Remember that CPU bus speeds (internal) are now running at 100+ MHz; think of the wire length busses have to run through on a board that will fit such a design for emulation. There is no way you will get anywhere near that speed. However, a drop in speed by only 100 is small potatoes when you think of the drop in speed when simulating. Good luck.

Richard
Dept of Electrical and Computer Eng.
University of Manitoba

===========================================================================

Date: Thu, 1 Dec 94 11:27:52 PST
From: mbutts!mbutts%mbutts@uunet.uu.net ( Mike Butts)
To: John.Sullivan@Eng.Sun.COM, linder@erc.msstate.edu
Subject: Re: ASIC emulation (Quickturn, etc.)
Cc: mbutts@uunet.uu.net

Thanks for your comments, John. No question that a big full-custom design emulation can be a big effort. ASIC designs like Dan was asking about are going quite a bit easier, especially if their clocking isn't too exotic.

We've gotten a lot more capable in our memory modeling lately. The System Realizer includes a menu-driven memory compiler which can generate memories for XC4013 CLB implementation in the Logic Modules, or in the bigger or more heavily multiported cases, in the Core Memory Module hardware.
No question that full-custom designs can raise modeling issues which don't come up in the ASIC world because we've already done the libraries. I'm very glad that we've been able to help with Sun projects. The System Realizers reflect much of the experience we gained in working with you folks and everyone else, and I believe they are another big step forward towards our ultimate goal of making emulation as easy to use as simulation. Certainly it remains a complex and evolving technology. Thanks!

--Mike (mbutts@qcktrn.com)

----- Begin Included Message -----
(John Sullivan's post above was included here.)
----- End Included Message -----

===========================================================================

Date: Thu, 1 Dec 94 18:24:25 PST
From: weedk@pogo.WV.TEK.COM (Kirk A Weedman)
To: linder@ERC.MsState.Edu
Subject: Re: ASIC emulation (Quickturn, etc.)
Newsgroups: comp.arch.fpga
In-Reply-To: <LINDER.94Nov30111210@gemini.ERC.MsState.Edu>
Organization: Tektronix, Inc., Wilsonville, OR.

> (My original post was included here.)

I too am curious about their tools. What tools are you currently using? I've been using Cadence Concept for schematic capture, or a tool called CIRGEN that automatically generates Concept schematics from equations (can use just about any vendor library). Next I create an EDIF netlist and feed that into Altera tools along with a mapping file. So far I like the Altera tools and parts - one design for use in a 33MHz processor application - but am looking at other vendors too. Anyway, let me know what you hear about their tools.

Kirk
weedk@pogo.wv.tek.com

===========================================================================

From: Paul Micheletti <pm1@sparc.SanDiegoCA.NCR.COM>
Subject: Quickturn (fwd)
To: linder@ERC.MsState.Edu
Date: Mon, 5 Dec 1994 12:00:53 -0800 (PST)
Cc: pm1@sparc.SanDiegoCA.NCR.COM
X-Mailer: ELM [version 2.4 PL20]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 3200

> (My original post was included here.)

We just finished evaluating our newly purchased Quickturn MARS system, and found the task of emulating an ASIC to be non-trivial, but worthwhile. We took an already designed and tested ASIC off of a known good PC board and replaced it with a Quickturn emulation model of this ASIC. This ASIC was approximately 70K gates, so it easily fit into a single Logic Block Module (LBM).

The biggest problems we encountered were:
1) timing problems induced by using their RAM macros for our RAM blocks.
2) the learning curve for operating new software.

I never had to know what parts of the design were placed into which FPGA, because the software adequately hides the need for this info from the user. The design is automatically partitioned into multiple FPGAs, and the place and route for these FPGAs was performed using a neat tool that spawns jobs off to multiple machines, which is needed when performing >200 FPGA compiles.

Our ASIC test vectors ran against this model at 1MHz, which is 1/20th of the 20 MHz ASIC clock rate. When we performed the actual emulation on our PC board, we were able to run the clock at 1.75MHz. This is just under 10% of the real ASIC worst case clock rate. If we used a larger ASIC for this test, our observed clock frequency would have been lower because of the added time required when connecting multiple LBM modules. I don't know the exact degradation for this since we haven't tried using multiple LBMs yet.
-- Paul Micheletti -- AT&T Global Information Solutions -- email: paul.micheletti@sandiegoca.ncr.com