Does anybody have any info/datasheet on these parts? It seems a strange situation that I can go all the way from synthesis through to place&route, but I don't know what's actually in these devices! I'm sort of assuming that they relate to Virtex in the same kind of way that Spartan1 related to the smaller XC4K series parts.

Article: 19676
At Siemens Switzerland's R&D department, I have developed and implemented an adaptive digital filter based on the computationally efficient Lapped Orthogonal Transform (LOT). The filter bank is implemented in a Xilinx Virtex FPGA using VHDL as the description language. The design runs at 40 MHz and computes approximately 400M multiplications and 350M additions per second. This LOT filter bank is adaptively controlled by a control algorithm in a microprocessor. Including the FPGA and the microprocessor's algorithm, the system acts as a notch filter by adaptively suppressing interferences on the actual signal. The specifications of my application required attenuating interferences by at least 30 dB in the frequency band from DC to 10 MHz. Along with the implementation, I have also written MATLAB simulation models. Thanks to the flexible architecture of the filter bank, the filter's characteristics and performance can be adapted to a variety of applications with only minor changes. If you've got any questions regarding my work, feel free to contact me: christoph.cronimund@siemens.ch

Article: 19677
In article <38761dca.521735891@mindmeld.idcomm.com>, Richard Erlacher <edick@hotmail.com> wrote:

>It's their sales volume they want to maximize, and in the here and
>now, so it doesn't matter to them one bit if you go under next quarter
>because of counterfeiting this quarter.
>
>That's why we don't have the technology that would solve this problem.
>The FPGA makers would be investing lots of money now in order to
>reduce their sales volume later. To them it's shooting themselves in
>the foot.

Richard just accused all non-volatile FPGA manufacturers of conspiring to limit technology choices for the end designer in order to sell a few more chips to counterfeit board makers.

This of course means that the counterfeiter has to reproduce the entire board, from components to PCB traces and stackups to product docs to labels to software, all of which is "non-secure". And they have to do all of this cheaper than the original manufacturer to make money.

Can anyone out there provide any documented proof that this has ever happened? Or is this just a bunch of worrying about nothing?

Ed

Article: 19678
edick@hotmail.com (Richard Erlacher) writes:

> The whole problem of theft risk would go away if the FPGA makers would
> put the config EEROM inside the FPGA. That would save space,
> uncomplicate the board layout process, and let more of us sleep at
> night.
>
> Everything the FPGA vendors do, however, is to benefit them. THEY
> sell more devices when there is competition from counterfeiters, at
> least in the first quarter . . . and, like most businesses, they don't
> care about the second quarter.

Wait and see half a year or so... I've heard rumours...

Homann
--
Magnus Homann, M.Sc. CS & E
d0asta@dtek.chalmers.se

Article: 19679
On Fri, 07 Jan 2000 16:20:19 +0100, Kai Troester <troester@imms.de> wrote:

>Hi
>
>I'm using a Xilinx Virtex, synthesizing with Synplify, and I have the
>following problem.
>
>My design contains 2 clocks: sclk (50 MHz) and clk (25 MHz). The clock
>clk is derived from the sclk clock with a flip-flop, and sclk is the
>clock input of the design. The design uses both of the clocks. To
>minimize the delay between both clocks I want the clock input of the
>toggle flip-flop to be driven from the clock pin directly, but the other
>sclk-clocked flip-flops should be driven through a global clock buffer.
>And the output of the toggle flip-flop (clk) should drive a global
>buffer too. Here is a little schematic to illustrate the problem:
>
>                +------+     +---------+
>  +------+      |      |     | sclk    |
>  | PAD  +--+---+ BUFG +---->| clocked |
>  | sclk |  |   |      |     | logic   |
>  +------+  |   +------+     +---------+
>            |   +---------+    +------+    +--------+
>            |   | clock   |    |      |    | clk    |
>            +-->| divider +----+ BUFG +--->| clocked|
>                | FF      |    |      |    | logic  |
>                +---------+    +------+    +--------+
>
>How can I constrain this in Synplify? Is it possible to assign the
>syn_noclockbuf attribute to the pin of a single FF?
>I already managed this in a Xilinx XC4000XL, but the way I used there
>seems not to work with a Virtex. The synthesis and the place&route
>always come out with a structure like this:
>
>  +------+      +------+     +---------+
>  | PAD  +------+ BUFG +---->| sclk    |
>  | sclk |      |      |     | clocked |
>  +------+      +--+---+     | logic   |
>                   |         +---------+
>                   |   +---------+    +------+    +--------+
>                   |   | clock   |    |      |    | clk    |
>                   +-->| divider +----+ BUFG +--->| clocked|
>                       | FF      |    |      |    | logic  |
>                       +---------+    +------+    +--------+

if you really want low skew, i'd go for ray's suggestion and use a DLL. however, this may give you problems if you want to transfer data between the 2 clock domains, since there will be a finite non-zero skew between the 2 clocks.
if you really want to implement the diagrams above, i'd ignore the constraints and just wire it structurally in your HDL - this will be much easier. obviously the 2nd diagram is better if you want to transfer data between the 2 domains. this works fine, but the problem is that it's very difficult to put timing constraints on the signals which have to pass between the 2 domains - you need 2 constraints on each signal, and you have to arbitrarily divide the available time between the two. it's much easier to use the lower-frequency clock as a clock enable; you can then put a single constraint on the clock enable.

evan

Article: 19680
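The clock-enable approach suggested above can be sketched in VHDL. This is a minimal illustration, not code from the thread; the entity and signal names are invented:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ce_div2 is
  port (
    sclk  : in  std_logic;  -- 50 MHz board clock
    rst   : in  std_logic;
    ce_25 : out std_logic   -- asserted every other sclk cycle
  );
end entity;

architecture rtl of ce_div2 is
  signal toggle : std_logic := '0';
begin
  process (sclk)
  begin
    if rising_edge(sclk) then
      if rst = '1' then
        toggle <= '0';
      else
        toggle <= not toggle;
      end if;
    end if;
  end process;

  ce_25 <= toggle;

  -- The "25 MHz" registers elsewhere stay in the sclk domain:
  --   if rising_edge(sclk) then
  --     if ce_25 = '1' then ... end if;
  --   end if;
end architecture;
```

Every register then lives in the single sclk domain, so one period constraint on sclk plus one multicycle constraint on the enable covers the whole design.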
The spec is probably referring to the CPU bus interface. Many CPUs spec bus timing relative to the bus control signals (rd, wr, ale, etc.), but do not specify anything relative to the CPU external clock. This is an asynchronous CPU. Many other CPUs spec all bus timing relative to the external clock; thus they are synchronous.

RJS

"Xanatos" <deletemeaoe_londonfog@hotmail.com> wrote in message news:wzpd4.43675$Ke.28717@news21.bellglobal.com...
> Hi all,
>
> I'm new to the logic design game, and I have a question... In a design spec,
> they note that we could select an asynchronous or synchronous CPU. Can someone
> take a second and explain what the difference is? I know asynchronous
> generally means "no clock", but I can't quite grasp how this works for a
> processor/processor interface.
>
> Thanks,
> Xanatos

Article: 19681
Xanatos <deletemeaoe_londonfog@hotmail.com> wrote in message news:wzpd4.43675$Ke.28717@news21.bellglobal.com...
> Hi all,
>
> I'm new to the logic design game, and I have a question... In a design spec,
> they note that we could select an asynchronous or synchronous CPU. Can someone
> take a second and explain what the difference is? I know asynchronous
> generally means "no clock", but I can't quite grasp how this works for a
> processor/processor interface.

Synchronous/asynchronous CPU doesn't ring any bells with me. The closest I come to this is synchronous/asynchronous bus cycles. Some processors use asynchronous bus cycles, where the end of the cycle is determined by the assertion of certain control signals (e.g. Mot 68020). Holding off this "termination" signal inserts wait states into the bus cycle. Other processors use synchronous bus cycles, where each cycle is a fixed number of clock cycles and there is no native mechanism for inserting wait states (e.g. Mot 68HC11, Rockwell 6502).

You mention a processor/processor interface. A common method to communicate between processors is through a serial port (aka COM port for the PC literate). Serial ports may be synchronous (a clock is provided along with the data) or asynchronous (framing information is provided within the data stream). Many processors contain either or both types of serial ports on-board.

--
Michael Ellis
first initial last name at pesa commercial domain

Article: 19682
On Fri, 07 Jan 2000 17:38:16 GMT, edick@hotmail.com (Richard Erlacher) wrote:
...
>The counterfeit boards don't even have to work, because you'll have to
>fix them under the terms of your warranty. The counterfeiter gets his
>money when the products are sold into distribution, and not even YOU
>will know where they actually came from.

If the boards don't have to work, why copy the EEPROM?

P.S. (To everyone:) What about putting the EEPROM die in the same package as the FPGA die? Having to cut into the package and connect to a tiny bond wire would discourage some people; maybe even most of the people without unstoppable resources.

--
Michael Lee      work: leem@generalstandards.com
Mike             play: leem@hiwaay.net

Article: 19683
> >I'm using a Xilinx Virtex, synthesizing with Synplify, and I have the
> >following problem.
> >
> >My design contains 2 clocks: sclk (50 MHz) and clk (25 MHz). The clock
> >clk is derived from the sclk clock with a flip-flop, and sclk is the
> >clock input of the design.

Since the frequencies are so low (50 and 25 MHz), I would clock the divide-by-two flip-flop on the "wrong" edge of the 50 MHz clock. Instead of an undefined set-up or hold time and the resulting potentially unpredictable behavior, you now have most of 10 ns to sort things out. If you cannot solve the uncertainty, move things about so they become certain.

Peter Alfke

Article: 19684
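Peter's opposite-edge trick can be sketched in VHDL; the names below are invented, not from the thread:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity div2_falling is
  port (
    sclk    : in  std_logic;   -- 50 MHz input clock
    clk_div : out std_logic    -- 25 MHz, toggled on sclk falling edges
  );
end entity;

architecture rtl of div2_falling is
  signal q : std_logic := '0';
begin
  process (sclk)
  begin
    if falling_edge(sclk) then
      -- each clk_div edge now lands half a period (about 10 ns at 50 MHz)
      -- away from the rising sclk edges clocking the rest of the design
      q <= not q;
    end if;
  end process;

  clk_div <= q;
end architecture;
```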
The clock skew between the DLL 1x and 2x outputs is smaller than the clock-to-Q and set-up of the flip-flops, so you won't run into problems crossing the clock domain boundary for two clocks generated by the _SAME_ DLL. The only disadvantage I see to using the DLL is the finite start-up time required for the DLL to stabilize.

There is also another valid argument for not using clock enables in high speed designs. The clock enable signal, if it fans out to many FFs, becomes a slow net, and distributing the clock enable in a register tree gets expensive in terms of resources and power.

In the absence of on-chip DLLs, you can still go between clock domains where the clocks are related in frequency (one is derived from the other through division), but you need to be very careful about how it is done. It takes a fair amount of logic to do it safely - basically, you make a clock-enabled version of the slow clock in the fast clock domain, and use some extra FFs to determine what the relationship between the domains is and adjust the sample point so that it falls away from the transition edges. I've used this in 4K parts where the 1x and 2x clocks were generated by an off-chip PLL, so the skew between them was not known.

eml@riverside-machines.com.NOSPAM wrote:

> On Fri, 07 Jan 2000 16:20:19 +0100, Kai Troester <troester@imms.de>
> wrote:
>
> [question and schematics snipped]
>
> if you really want low skew, i'd go for ray's suggestion and use a
> DLL. however, this may give you problems if you want to transfer data
> between the 2 clock domains, since there will be a finite non-zero
> skew between the 2 clocks.
>
> if you really want to implement the diagrams above, i'd ignore the
> constraints and just wire it structurally in your HDL - this will be
> much easier. obviously the 2nd diagram is better if you want to
> transfer data between the 2 domains. this works fine, but the problem
> is that it's very difficult to put timing constraints on the signals
> which have to pass between the 2 domains - you need 2 constraints on
> each signal, and you have to arbitrarily divide the available time
> between the two. it's much easier to use the lower-frequency clock as
> a clock enable; you can then put a single constraint on the clock
> enable.
>
> evan

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930  Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 19685
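The first half of Ray's scheme above - a clock-enabled image of the slow clock in the fast clock domain - might look like this sketch. The names are invented, and the sample-point adjustment logic he describes is omitted:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity slow_clk_enable is
  port (
    fast_clk : in  std_logic;
    slow_clk : in  std_logic;   -- derived, lower-frequency clock
    slow_en  : out std_logic    -- one fast_clk cycle per slow_clk rising edge
  );
end entity;

architecture rtl of slow_clk_enable is
  signal q1, q2 : std_logic := '0';
begin
  process (fast_clk)
  begin
    if rising_edge(fast_clk) then
      q1 <= slow_clk;            -- resample the slow clock in the fast domain
      q2 <= q1;
      slow_en <= q1 and not q2;  -- rising-edge detect
    end if;
  end process;
end architecture;
```

In a real design with unknown skew between the clocks, the extra FFs Ray mentions would shift where slow_clk is sampled so the sample point stays away from its transitions.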
Ray Andraka wrote in message <38767C28.5F60BCBA@ids.net>...
>There is also another valid argument for not using the clock enables in high
>speed designs. The clock enable signal, if it fans out to many ff-s becomes
>a slow net, and distributing the clock enable in a register tree gets
>expensive in terms of resources and power.

Why not run the clock enable into a global low-skew buffer and have that drive the clock enables?

Or are the low-skew buffers only for clock nets?

Where is that data book when I need it....

-- a
-----------------------------------------
Andy Peters
Sr Electrical Engineer
National Optical Astronomy Observatories
950 N Cherry Ave
Tucson, AZ 85719
apeters (at) noao \dot\ edu

Spelling Counts! You don't loose your money - you lose it.

Article: 19686
If you need clock enables elsewhere in the design, which is usually the case, then it means gating the clock enable before the FF (which has to be routed on the regular routing), or adding more logic (2 more inputs) to the logic feeding the D input. In both cases, this extra level of logic can break a high speed design. In the case of Virtex, you only have 4 global buffers to begin with, at least one of which is for the clock.

Andy Peters wrote:

> Ray Andraka wrote in message <38767C28.5F60BCBA@ids.net>...
>
> >There is also another valid argument for not using the clock enables in high
> >speed designs. The clock enable signal, if it fans out to many ff-s becomes
> >a slow net, and distributing the clock enable in a register tree gets
> >expensive in terms of resources and power.
>
> Why not run the clock enable into a global low-skew buffer and have that
> drive the clock enables?
>
> Or are the low-skew buffers only for clock nets?
>
> Where is that data book when I need it....
>
> [signature snipped]

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930  Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 19687
Actually, when you compare an SRAM-based FPGA with an ASIC, you have tons more observability thanks to the ability to reconfigure as needed. Granted, if you are not floorplanning, then the internal timing can change drastically from compile to compile. Proper use of timing constraints and adherence to static timing analysis will eliminate timing issues from the chip debug, though.

For large designs, you can put in partial function with the means to capture the output, or you can disable parts of the design to gain visibility into circuits that would be hard or expensive (in terms of test logic added to the design) to implement in an ASIC. For board level development, you can go as far as implementing a special test circuit in the FPGA for exercising the external interfaces to the point of breaking (such as high speed memory tests with aggressive most-bits-switching alternating read/write cycles) that provides a much more thorough test of the board and the surrounding circuits.

When it comes to testing the guts of the FPGA design, a hierarchical design methodology makes testing of sub-components possible. It is much easier, both in simulation and actual test, to fully exercise a smaller building block than a larger one. Prove all the small things work, especially in the special corner-of-the-envelope cases, before testing the big design. Do that, and the problems in the big design will usually be the easy-to-find stuff.

You might look at a paper I presented at MAPLD that discusses a radar environment simulator in FPGAs. While that was multiple FPGAs, it was tested using exactly this methodology. We had it up and running in under a week of lab time, and no scope ever touched the board (we used the FPGAs to provide and capture the test vectors and return them to the host computer). The paper is available in the publications section of my website.

So the bottom line is:

1. Separate board test from FPGA test to the extent it is possible, using special board test configurations.
2. Use a hierarchical design style, and then use that hierarchy in the design verification.
3. Use reconfiguration to test the FPGA's environment to the breaking point.
4. Use reconfiguration to isolate and test individual sub-designs.

Debug problems generally fall into three broad categories for synchronous designs (and your FPGA design should be synchronous): functional problems (the logic doesn't do what you intended it to do), timing problems (the function is OK but it doesn't work at speed), and signal integrity/faulty device problems. Proper time constraints and static timing analysis will let you avoid the timing problems in the lab. A thorough functional simulation will get most of the functional problems before you get to the lab, so the lab should just be confirming the function is what the simulator said, and doing a board level check-out. Separating the FPGA function from the board checkout simplifies both (board checkout is easier because the FPGA function is simple; FPGA check-out is done with a known good board). If you have signal integrity/faulty device problems, those will show up in board check-out, so again you make your job easier.

Matt Billenstein wrote:

> Some of my colleagues at work are worried about designing a new product
> around a XCV1000 Virtex E FPGA (one with 512 pins max user io) ... Their
> fear lies in the fact that there is a perceived "lack of observability" with
> such a part and that the debugging of such a large design will be next to
> impossible even with a thorough simulation written... I've read just a
> little about the ability to read out the state and configuration of the
> Virtex series, but certainly you can't see everything that is going on in
> real time.. ? I know we'll have spare pins for debug connectors and that we
> can route any internal net out to them, but one of our pin estimates put the
> number of such spare pins in the mid twenties. I don't know if this is near
> enough depending upon what we might run up against.
>
> Has anyone here had such problems? Or can recommend better tools that may
> exist for bringing up 1000K gate FPGAs. What, if any, is the standard
> way for probing up a large BGA device? Can one pin up the FPGA and put a
> socket on the board with logic analyzer connectors for debugging purposes?
>
> thx
>
> m
>
> Matt Billenstein
> http://w3.one.net/~mbillens/
> REMOVEmbillens@one.net

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930  Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 19688
Hi folks,

Is the Xilinx 4000 configuration bitstream structure open to the public? In other words, is it possible for us to edit the bitstream? If the structure of the circuit is known in advance (placement and routing), why should people go through the lengthy process of the Xilinx tools? (A lot of DSP architectures are quite regular.) How about Virtex?

Cheers.

Article: 19689
Richard Erlacher wrote:
>
> The whole problem of theft risk would go away if the FPGA makers would
> put the config EEROM inside the FPGA. That would save space,
> uncomplicate the board layout process, and let more of us sleep at
> night.

You won't (ever?) see EEPROM on the same die, but with BGA devices, it's not a great problem to integrate some EE and an SRAM FPGA. As a real example, Atmel do an ARM core and 2 MBytes of FLASH for $13 like this. A slight tweak would be needed to the EEPROMs of today, to separate the Read/Write ports and add Secure - then the READ channel would be buried, and the Verify channel SECURED.

-jg

Article: 19690
On 7 Jan 2000 11:44:00 -0800, mcgett@xilinx.com (Ed Mcgettigan) wrote:

>In article <38761dca.521735891@mindmeld.idcomm.com>,
>Richard Erlacher <edick@hotmail.com> wrote:
>
> [quote snipped]
>
>Richard just accused all non-volatile FPGA manufacturers of conspiring
>to limit technology choices to the end designer to sell a few more
>chips to counterfeit board makers.
>
>This of course means that the counterfeiter has to reproduce the entire
>board from components to PCB traces and stackups to product docs to
>labels to software, all of which is "non-secure". And they have to do
>all of this cheaper than the original manufacturer to make money.
>
>Can anyone out there provide any documented proof that this has ever
>happened? Or is this just a bunch of worrying about nothing?

It happens all the time. I only have personal experience of it happening in Asian countries, though.

Not documented proof, but a technician who worked at a company I used to work for once went to a customer's site as part of a service contract. We had sold the customer two widgets. They had a room full of them, and weren't shy about letting us know.

Allan.

Article: 19691
On Fri, 07 Jan 2000 19:17:01 +0000, Rick Filipkiewicz <rick@algor.co.uk> wrote:

>Does anybody have any info/datasheet on these parts? It seems a strange
>situation that I can go all the way from synthesis through to
>place&route but I don't know what's actually in these devices!
>I'm sort of assuming that they relate to Virtex in the same kind of way
>that Spartan1 related to the smaller XC4K series parts.

That's what I've been told by the local X reps. The range of packages is somewhat limited, and IIRC there are no "thermally enhanced" packages, to keep costs down. They appear to be bitstream compatible with Virtex parts. I'll even guess that the first ones that come out actually use Virtex die.

Regards,
Allan.

Article: 19692
Hi group!

I'm looking into a new design, consisting of 4 pcs. of 32-bit 100 MHz asynchronous counters. When stopped, the counters are emptied into a FIFO (common to all counters - 32 kbyte size total). The FIFOs will be read through an ordinary 8 MHz CPU interface.

My idea is to have all counters in one FPGA, and an external FIFO. Or should I use an FPGA big enough to contain the FIFO as well? Give me your opinion on which gives the best price/performance, please.

I'm totally open to which family and tools to use. Has anybody good or bad experiences with similar designs?

Kresten

Article: 19693
It doesn't sound like your FIFO needs to be really high performance, and besides the FIFO there isn't much that needs to be put in the FPGA. You could use a part with 32K of embedded RAM and roll your own FIFO inside the FPGA for the most highly integrated solution. You'll wind up with a lot of unused FPGA resource with this approach, so you are probably buying much more FPGA than you really need. You could, of course, use the extra FPGA resource for other stuff in your design.

As you point out, you could use an external FIFO plus counters implemented in a small FPGA (those could easily go in a CPLD such as a Lattice 1016 too). I think you'll find the cost of the FIFO a major part of the cost of this approach. I think a cheaper solution is to use a Xilinx Spartan part and a 32Kx8 SRAM. Use some of the logic in the FPGA to control the SRAM to use it as a FIFO. With care, you should be able to get the whole thing in an XCS-05, which can be had in quantity for around $3.

"Kresten Nørgaard" wrote:

> [question snipped]

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930  Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 19694
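The SRAM-as-FIFO idea above amounts to keeping only the pointers and flags in the FPGA. A minimal sketch, with invented names, assuming wr and rd are never asserted in the same cycle (there is only one SRAM port) and omitting the SRAM data bus and CPU handshake:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sram_fifo_ctrl is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;
    wr     : in  std_logic;                      -- push one byte
    rd     : in  std_logic;                      -- pop one byte
    sram_a : out std_logic_vector(14 downto 0);  -- 32K addresses
    sram_we: out std_logic;
    full   : out std_logic;
    empty  : out std_logic
  );
end entity;

architecture rtl of sram_fifo_ctrl is
  signal wptr, rptr : unsigned(14 downto 0) := (others => '0');
  signal count      : unsigned(15 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        wptr  <= (others => '0');
        rptr  <= (others => '0');
        count <= (others => '0');
      elsif wr = '1' and count /= 32768 then
        wptr  <= wptr + 1;        -- pointers wrap naturally at 32K
        count <= count + 1;
      elsif rd = '1' and count /= 0 then
        rptr  <= rptr + 1;
        count <= count - 1;
      end if;
    end if;
  end process;

  -- the single SRAM address port is steered to whichever pointer is active
  sram_a  <= std_logic_vector(wptr) when wr = '1' else std_logic_vector(rptr);
  sram_we <= wr;
  full    <= '1' when count = 32768 else '0';
  empty   <= '1' when count = 0 else '0';
end architecture;
```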
On Sat, 08 Jan 2000 00:00:50 GMT, Ray Andraka <randraka@ids.net> wrote:

>The clock skew between the DLL 1x and 2x outputs is smaller than the clock to
>Q and set up of the flip-flops, so you won't run into problems crossing the
>clock domain boundary for two clocks generated by the _SAME_ DLL. The only
>disadvantage I see to using the DLL is the finite start up time required for
>the DLL to stabilize.

you're not taking into account routing delays. the fpga manufacturer guarantees (i hope) that this will work, where 'a' is a short segment, 'b' is a long maximally-loaded segment, and 'c' is minimally loaded:

     x--x                 x--x
   --|  |--------c--------|  |--
  |>--a->|         ---b--->|  |
     |   |         |       |  |
     x--x          |       x--x
       |           |
       -------------

this is a nasty problem involving minimum delays, and can only be fixed by custom creation of clock trees. if you now split 'a' and 'b' into different tracks, and introduce a 150ps uncertainty in their source timing, then you have a much more difficult problem. granted, it'll probably work, but i won't be doing it until i see it guaranteed in the datasheet.

if you do use a DLL and split the two clocks, then you can arrange to clock the 2 F/Fs at different times, as peter suggested. this would be a good fix, apart from (1) any signals passing between the 2 domains now only have approximately half of the period of the high-frequency clock available to them, rather than the whole period, and (2) this is - i think - difficult to constrain.

i agree that, in general, it often makes more sense to route multiple clocks than to use a CE. this is the way that large SOC designs are going, and it will become more relevant to us as FPGA die sizes increase. however, there's one critical argument that generally makes this a non-starter. the most important part of any FPGA or ASIC development is verification. xilinx's timing constraint mechanism is good (much better than others i've seen), but it's still next to impossible to constrain a design where a significant number of signals pass between clock domains. on the other hand, it's trivially easy to constrain a design which uses clock enables, which makes the choice a no-brainer. you can verify your multi-clock design with timing simulation, but there's always the possibility that you'll miss the path that makes your design fail.

one more point: if your clock-enabled design fails due to a routing and fanout problem, then it's generally easy to fix, which certainly isn't true of a multi-clock design.

evan

Article: 19695
On Fri, 7 Jan 2000 18:57:51 -0700, "Andy Peters" <apeters.Nospam@nospam.noao.edu.nospam> wrote:

>Why not run the clock enable into a global low-skew buffer and have that
>drive the clock enables?
>
>Or are the low-skew buffers only for clock nets?
>
>Where is that data book when I need it....
>
>-- a

the documentation on this is a bit sparse. as far as i can make out, the BUFG clock lines have had more and more restricted routing since the 4K, running through spartan, to virtex. on virtex, it seems that you can now only connect a BUFG to a clock pin. however, you can now connect the secondary low-skew networks to the SR, CE, and CLK pins on a CLB, so you have to use the secondary networks (or general routing) for CEs. you can connect to all of CLK, SR, and CE using the general routing. i got this info from fpga editor - it would be interesting to hear from anyone with more specific information.

synthesisers, and the router, seem to be pretty intelligent about this. if you have a high-fanout CE, the synth will duplicate it into a number of low-fanout nets, and some are routed on the low-skew lines (at least they were in a case i checked recently).

evan

Article: 19696
Most (if not all) commercially available CPUs are synchronous. This means that the internal storage elements latch data in relation to a clock signal. All data moves and is computed on in time slices relative to a repetitive (clocked) signal. Asynchronous circuits (including CPUs) use a signaling protocol to move or compute on data that is independent of any clocked signal. The most common system is the two-rail system, which latches data when both signals are valid and true. What asynchronous circuits bring to the game is that they can be made to be delay insensitive (no race conditions - they just run as fast as they can).

Steve Casselman
Virtual Computer Corporation

Xanatos <deletemeaoe_londonfog@hotmail.com> wrote in message news:wzpd4.43675$Ke.28717@news21.bellglobal.com...
> [question snipped]

Article: 19697
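A common way to make the two-rail encoding concrete is completion detection: each bit travels on a (true, false) rail pair, a bit has arrived when exactly one rail is high, and the word is done when every bit has arrived. A sketch with invented names, not taken from the post:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dual_rail_done is
  generic (n : positive := 8);
  port (
    rail_t : in  std_logic_vector(n-1 downto 0);  -- "bit is 1" rails
    rail_f : in  std_logic_vector(n-1 downto 0);  -- "bit is 0" rails
    done   : out std_logic                        -- all n bits have arrived
  );
end entity;

architecture rtl of dual_rail_done is
begin
  process (rail_t, rail_f)
    variable all_valid : std_logic;
  begin
    all_valid := '1';
    for i in rail_t'range loop
      -- exactly one rail high means this bit is valid
      all_valid := all_valid and (rail_t(i) xor rail_f(i));
    end loop;
    done <= all_valid;
  end process;
end architecture;
```

The done signal then drives the handshake that lets the next stage latch the word, with no clock involved.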
I am working on a VHDL design to be synthesized on an Altera 10k100 for debugging and on an ASIC standard cell library for production. The current code base does not fit into an Altera 10k100 (Synplicity claims 120% device usage). The tools used are Synplicity and Max PlusII for the FPGA and Synopsys for the ASIC. Speed is less than 10 MHz.

The code implements a multi-channel DSP sensor interface and signal processing, involving among other things many adders. There are also some 40 byte-wide configuration registers accessed by a serial interface. All algorithms are RTL style using unsigned() bit vectors.

I'm not very familiar with Altera's architecture. Can you suggest any "tricks"? Are there good and bad ways (for Altera's architecture) to describe e.g. a 12-bit adder? Or do you have any hints on where I can find useful information for trying to reduce the resource usage in the Altera FPGA, keeping in mind that the same code must also be compiled efficiently by Synopsys on a standard cell library?

All suggestions are very welcome. Thank you in advance,

Berni

Article: 19698
Steve Casselman <sc@vcc.com> wrote in message news:s7f3sk7qoj824@corp.supernews.com... > data when both signals are valid and true. What asynchronous circuits bring > to the game is that they can be made to be delay insensitive (no race > conditions - > they just run as fast as they can). They can also consume significantly less power than the equivalent synchronous circuit operating at the same effective throughput. ---Joel KolstadArticle: 19699
Synplicity does a pretty good job with adders already. The adders will get implemented using the Altera carry chain, which is the way you want it to work. Altera's array is not very strong for arithmetic applications, which is probably the root of your not fitting. When the carry chain is utilized, the Altera LEs are put in "arithmetic" mode, which breaks each 4 input Look-Up Table (4-LUT) into a pair of 3-LUTs, one for the carry to the next bit and one for the sum function. One input to each of those LUTs is the carry from the previous bit, so any arithmetic function with more than two inputs is forced to two levels of logic, and thus two LEs per bit. If you use the register clock enable, the clock enable input to the LE steals one of the LUT inputs too, so for arithmetic you are down to a 1 input arithmetic function if you want to stay in one level of logic. The bottom line here is that if you are doing clock enables, accumulators with load or sync clear, adder/subtractors, or any other 3 or more input arithmetic, you'll end up using 2 LEs per bit per function. Xilinx FPGAs have a different carry chain structure that permits 2 bits of 3 input arithmetic per CLB for the 4000 and Spartan series, or 4 input arithmetic in Virtex and Spartan II.

Also, in most DSP algorithms you wind up with a need for sample delays (taps in a filter, etc.). In Altera devices, the delays are implemented in the LE registers, which means you use up an LE for each bit of each clock of delay. Xilinx devices allow you to turn the 4-LUTs into 16x1 memories, which can be used as 16 bit shift registers. Synplicity will infer those shift registers from your RTL code for Virtex, but not for the 4000 series (the 4000 series requires a counter for the memory). You might be able to revisit your algorithm to find a better way to do it. Doing so will decrease the size of the design both in the Altera device and in your ASIC. 
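A rough behavioral model (mine, not Ray's) of the LUT-as-shift-register trick he mentions: a 4-LUT reconfigured as a 16x1 memory shifts one bit in per clock, and its 4-bit address selects any of the 16 stages, so a single LUT can replace up to 16 flip-flops of sample delay.

```python
# Behavioral model of a 4-LUT used as a 16-bit shift register
# (the SRL16 configuration in Xilinx parts): one new bit per clock,
# with a 4-bit address tapping any of the 16 delay stages.

class Srl16:
    def __init__(self):
        self.stages = [0] * 16   # the 16x1 memory acting as shift stages

    def clock(self, din):
        """Shift one bit in; stage 0 holds the newest sample."""
        self.stages = [din] + self.stages[:15]

    def tap(self, addr):
        """Read the sample delayed by addr+1 clocks (addr in 0..15)."""
        return self.stages[addr]

srl = Srl16()
for s in [1, 0, 0, 1, 1]:        # shift in five samples, oldest first
    srl.clock(s)
print(srl.tap(4))   # 1: the first sample, now delayed by 5 clocks
print(srl.tap(0))   # 1: the most recent sample
```

In an Altera 10K part, that same five-clock, one-bit delay line would consume five LE registers, which is why long tap delays hurt so much more there.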
As far as architectural optimizations go, you will do better in Altera if you design keeping in mind the structure of the LEs, but where your target is an ASIC, I don't think you'll get real far that way. You can also use the EABs as function look-ups if you are not already using them for something else. Here, the idea is to map stuff that takes up a lot (several levels) of logic into the EAB. Of course, that won't map well to the ASIC either. Can you run at a higher clock rate and time multiplex the multiple channels through the same set of hardware? Alternatively, at 10 MHz, you should be able to do some of the processing bit-serially, again at a higher system clock. Berni Joss wrote: > I am working on a VHDL design to be synthesized on an Altera 10k100 for > debugging and on an ASIC standard cell library for production. > > The current code base does not fit into an Altera 10k100 (Synplicity claims > 120% device usage). > The tools used are Synplicity and Max PlusII for the FPGA and Synopsys for > the ASIC. > Speed is less than 10MHz. > The code implements a multi channel DSP sensor interface and signal > processing, involving among others many adders. There are also some 40 byte > wide configuration registers accessed by a serial interface. > All algorithms are RTL style using unsigned() bit vectors. > > I'm not very familiar with Altera's architecture. > > Can you suggest any "tricks" ? > Are there good and bad ways (for Altera's architecture) to describe e.g. a 12 > bit adder? > Or do you have any hints on where I can find useful information for trying > to reduce the resource usage in an Altera FPGA while keeping in mind that the > same code must also be compiled efficiently by Synopsys on a standard cell > library. > > All suggestions are very welcome, Thank you in advance, > Berni -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randraka
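The bit-serial suggestion above can be sketched in Python (my illustration, with made-up helper names): a single full adder plus one carry register processes one bit per clock, LSB first, so a 12-bit add takes 12 clocks but only a few gates of hardware, trading clock cycles for area.

```python
# Bit-serial adder: one full adder and a single carry flip-flop,
# clocked once per bit, LSB first. An n-bit add takes n clocks
# but uses a tiny fraction of the logic of a parallel adder.

def serial_add(a_bits, b_bits):
    """Add two equal-length LSB-first bit lists, one bit per 'clock'."""
    carry = 0                                # the one carry register
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s = a ^ b ^ carry                    # full-adder sum output
        carry = (a & b) | (carry & (a ^ b))  # full-adder carry-out
        sum_bits.append(s)
    return sum_bits + [carry]                # final carry becomes the MSB

def to_lsb_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]

def from_lsb_bits(bits):
    return sum(b << i for i, b in enumerate(bits))

# A 12-bit add done serially over 12 clocks:
result = serial_add(to_lsb_bits(1234, 12), to_lsb_bits(567, 12))
print(from_lsb_bits(result))   # 1801
```

At the 10 MHz sample rate in question, clocking such a serial unit at 120 MHz would complete one 12-bit add per sample period from a single LE-sized adder, which is the area/speed trade Ray is pointing at.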