I have a design that I'm trying to implement in an Altera Flex10k100A part and it requires:

Type A: 3 blocks of 16 x 64 bits RAM = 3 x 1024 = 3072 bits
Type B: 16 blocks of 8 x 8 bits RAM = 16 x 64 = 1024 bits
Total RAM = 4096 bits

Now the 10k100A has 12 memory blocks (EABs), each 2048 bits, for a total of 24k bits. Each EAB can be configured in any of the following: 256x8, 512x4, 1024x2, or 2048x1.

When I place and route (P&R) my design with MaxPlusII, it fails to implement the design in a single 10K100, and if I allow it to fit to multiple devices, it ends up routing it in 5 devices. From the results, it looks like a single type A RAM consumes 8 EABs (out of a total of 12), since it needs to cascade that many to create a 'x64' row. This seems terribly wasteful and inefficient: this is only 4 kbits out of a total capacity of 24 kbits. As a possible solution, I've tried to 'clique' (group) each RAM block together, but that causes more severe problems (even more devices).

Does anyone know of a better way to implement these RAMs in this type of Altera part?

Thanks in advance,
Edwin Grigorian
JPL

Article: 12226
You didn't mention the speed at which you need to access the RAM. Is it low enough to read successive bytes out of an EAB and assemble them in your logic? If you need all 64 bits at a high enough rate that you can't afford multiple EAB accesses to get the data, then you are stuck using 8 EABs for each type A memory.

How about using a Xilinx part? The memory you describe would occupy 32 CLBs for each of the 3 "type A" blocks and 4 CLBs for each of the 16 "type B" blocks, for a total of 160 CLBs for the memory. An XCS40 (Spartan equivalent of an XC4020) is 28x28 = 784 CLBs, so the memory only occupies a tad over 20% of the device. XCS40s are quite a bit cheaper than a 10K100A too. This is yet another reason I prefer the Xilinx 4K architecture over Altera for DSP and data-flow applications. (Try doing a couple of delay queues in Altera... you'll either use up your EABs quickly or waste an awful lot of LEs to mimic memory.)

Edwin Grigorian wrote:
> I have a design that I'm trying to implement in an Altera Flex10k100A part
> and it requires:
>
> Type A: 3 blocks of 16 x 64 bits RAM = 3 x 1024 = 3072 bits
> Type B: 16 blocks of 8 x 8 bits RAM = 16 x 64 = 1024 bits
> Total RAM = 4096 bits
<snip rest of original post>

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 12227
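[A quick sanity check of Ray's CLB arithmetic, as a sketch. It assumes the XC4000/Spartan convention that each CLB holds two 4-input LUTs, each usable as a 16x1 distributed RAM; the function name is made up for illustration.]

```python
import math

def clbs_for_ram(depth, width):
    """CLBs needed for a depth x width RAM built from 16x1 LUT RAMs
    (two LUTs, i.e. two bits of 16-deep RAM, per CLB)."""
    luts_deep = math.ceil(depth / 16)        # 16 words per LUT
    return math.ceil(luts_deep * width / 2)  # 2 LUTs per CLB

type_a = 3 * clbs_for_ram(16, 64)   # 3 x 32 = 96 CLBs
type_b = 16 * clbs_for_ram(8, 8)    # 16 x 4 = 64 CLBs
total = type_a + type_b             # 160 CLBs, a tad over 20% of a 784-CLB XCS40
```

The 8x8 blocks half-fill their 16-deep LUTs, which is where the "wasted" capacity goes on the Xilinx side; it is far less waste than a 2048-bit EAB holding 64 bits.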
Andreas Doering wrote:
> I plan to use an XV (propably XC40150XVBG600) for a research
> project. I want to plan the power supply.
> I only found the estimation of I/O-Power.

How accurate an estimate do you need?

Power supply current for an FPGA design is really hard to estimate. Power use will vary from one placement and route to the next, as different routing resources will be used for signals, each of which switches at a different rate with a different load. Unless you maintain control over the design at a very low level (perhaps reasonable with a very regular design with fixed placement and very repeatable routing), you just don't know enough about the circuit to hope to do an accurate estimate.

--
Phil Hays
"Irritatingly, science claims to set limits on what we can do, even in principle." Carl Sagan

Article: 12228
It would be useful to have some idea what is likely:

1. You clock every possible gate at the highest possible speed and the worst possible temperature for a millisecond or two: what is the current? (This worst case won't last long in real life, due to thermal effects.)
2. Package-thermal-limited continuous current use, at the temperature limits and at 25 C.

Simon
========================================================================
Phil Hays <spampostmaster@sprynet.com> wrote:
>Andreas Doering wrote:
>> I plan to use an XV (propably XC40150XVBG600) for a research
>> project. I want to plan the power supply.
>> I only found the estimation of I/O-Power.
<snip rest of quoted post>

Design Your Own MicroProcessor(tm)
http://www/tefbbs.com/spacetime/index.htm

Article: 12229
An EAB in a 10KA-series part has a maximum of 8 output bits, so:

3 blocks of 64 bits wide: 3 x 8 = 24 EABs
16 blocks of 8 bits wide: 16 x 1 = 16 EABs

Thus in total you need 40 EABs, and therefore at least four 10K100As. You are right that Altera is not so well suited for wide RAMs; they are better suited for deep RAMs (a high number of addresses). An alternative that I used was to implement some of the RAMs (dual-port RAMs) in FFs, but of course that could also consume a lot in your case.

Edwin Grigorian wrote:
> I have a design that I'm trying to implement in an Altera Flex10k100A part
> and it requires:
>
> Type A: 3 blocks of 16 x 64 bits RAM = 3 x 1024 = 3072 bits
> Type B: 16 blocks of 8 x 8 bits RAM = 16 x 64 = 1024 bits
> Total RAM = 4096 bits
<snip rest of original post>

--
Koenraad SCHELFHOUT
Alcatel Telecom, Switching Systems Division
Microelectronics Department - VA21
Phone: (32/3) 240 89 93  Fax: (32/3) 240 99 88
mailto:koenraad.schelfhout@alcatel.be
Francis Wellesplein, 1, B-2018 Antwerpen, Belgium
http://www.alcatel.com/

Article: 12230
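[The EAB count above comes straight from the 8-bit width limit; a minimal sketch, with a hypothetical helper name, makes the arithmetic explicit. It assumes each EAB serves exactly one RAM block and that these shallow RAMs never cascade on depth.]

```python
import math

def eabs_for_ram(depth, width, max_width=8, eab_bits=2048):
    """EABs needed for one depth x width block on a FLEX 10K:
    width alone drives the count, since each EAB is at most 8 bits wide."""
    assert depth <= eab_bits // max_width, "deeper RAMs would also cascade on depth"
    return math.ceil(width / max_width)

total_eabs = 3 * eabs_for_ram(16, 64) + 16 * eabs_for_ram(8, 8)  # 24 + 16 = 40
devices = math.ceil(total_eabs / 12)   # 12 EABs per 10K100A, so at least 4 parts
```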
On 05 Oct 1998 12:23:49 -0400, Scott Bilik <sbilik@nospam.tiac.net> wrote:

Thanks for pointing that out, Scott. We're a bit behind the times here with our Leonardo releases - can't have had one for at least a month ;-)

Dave
--
REPLACE "NOJUNK" in address with "david.storrar" to reply
Development Engineer | Marconi Electronic Systems | Tel: +44 (0)131 343 4484
RCS | Fax: +44 (0)131 343 4091

Article: 12231
On Wed, 30 Sep 1998 23:14:58 -0400, Ray Andraka <no_spam_randraka@ids.net> wrote:

>Huh??
>An FIR filter implemented in an FPGA outperforms one implemented in a DSP by a
>wide margin. The more taps, the higher the performance gain. The FPGA
>implementations perform the multiplications for all the taps in parallel, where
>a DSP microprocessor computes the taps sequentially.

The above statement bothers me a little. If you need to perform the multiplications for all taps in parallel, don't you have to have that many multipliers in hardware? I am just off of an SOC design where I designed many filters in hardware, and one thing stood out: high-precision signed multipliers are not cheap, in gate count and in levels of logic. I eventually implemented filters with one multiplier doing one tap per clock cycle, and that was no different from the sequential operation of the DSP.

I would go for the FPGA over the DSP when the DSP looks like overkill for an embedded application. Otherwise, I do not see the FPGA as very attractive, at least not yet, for I have seen people spending more time tackling the problems of the FPGA than the actual design itself.

Just my 2 cents :)

Kartheepan, M

Article: 12232
On 05 Oct 1998 12:23:49 -0400, Scott Bilik <sbilik@nospam.tiac.net> wrote:

>So TO SUMMARIZE:
>
>It does have great scripting capabilities. And its scripting language
>is not limited to merely setting variables (ala Synplify). No GUI
>ever need be invoked. So your worries (and flame) are for naught, at
>least this time. :)

Do you also get the scripting features on level 1? And any news on level 0?

evan

Article: 12233
David R Brooks wrote:
<snip count sequence for 6-bit twisted ring counter>
> This has of course, used 12 of the 64 states possible in 6 bits. All
> the other states are illegal. If inadvertently entered, the counter
> will usually continue cycling through illegal states. A clean reset
> will start it off right, but if your application must be sure to
> recover from faults, you'll need to trap those illegal states.

Exactly so. To do this you need to
a) determine all the possible cycles of illegal states
   (don't forget there may be multiple non-intersecting cycles)
b) invent some combinatorial gubbins that will identify
   at least one state in EACH of these cycles, and NO states
   in the wanted cycle
c) use that logic to force the counter into one of the legal
   states and/or raise a fault flag

The same argument applies to one-hot state machines and indeed any state machine with illegal states, i.e. any (sub)system with N flipflops but fewer than 2^N legal states.

This is a very interesting problem, which most textbooks and teachers miserably fail to address. The good reasons for using one-hot, well rehearsed here and elsewhere, are the small amounts of next-state logic and hence the fast inter-flipflop logic paths they yield. Conventional wisdom says that one-hot SMs are therefore likely to be smaller and faster than encoded SMs, at least in flipflop-rich FPGAs. Similarly, Johnson counters have the benefit of very high speed, tiny amounts of next-state logic, and (irrelevant to this post, but very useful) freedom from output decode spikes when used to implement one-of-N sequencers.

BUT, BUT, BUT: if you are trying to design a _robust_ system, in which illegal states are detected and dealt with as they should be, you will likely end up with unwieldy amounts of illegal-state detection logic, which will be just as expensive in speed and real estate as the next-state logic for a fully encoded system!

I am very distrustful of authors who assert that an effective start-up reset will solve this problem. As FPGA design rules fall to 0.35u and below, surely we will see occasional soft errors in FPGAs? In any case, in high-reliability applications you have to accept that metastability will come up from behind and get you one day. Sure, I have designed plenty of systems with a few uncaught illegal states - but I'm not proud of it.

Anyone else as worried about this as I am?

Jonathan Bromley
--

Article: 12234
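[Steps (a) through (c) above start with enumerating the illegal cycles, which is easy to do by brute force for a small counter. This sketch does that for the 6-bit twisted-ring counter under discussion; the shift direction chosen here is one convention (a mirrored implementation gives the same counts). Because the next-state map of a shift register with complement feedback is invertible, every illegal state sits on a purely illegal cycle and can never fall back into the legal sequence on its own.]

```python
N = 6

def step(s):
    """One clock of the twisted-ring counter: shift left, feed back ~MSB."""
    return ((s << 1) & ((1 << N) - 1)) | (1 ^ (s >> (N - 1)))

# The legal cycle: start from the reset state 0 and follow the counter.
legal = set()
s = 0
while s not in legal:
    legal.add(s)
    s = step(s)

# Every remaining state is illegal; group them into their closed cycles.
illegal = set(range(1 << N)) - legal
illegal_cycles = set()
for s in illegal:
    seen = []
    while s not in seen:
        seen.append(s)
        s = step(s)
    illegal_cycles.add(frozenset(seen[seen.index(s):]))
```

For N = 6 this finds the 12 legal states and 52 illegal ones, and the illegal states form five separate cycles (four of length 12 and one of length 4), which is exactly the multiple-non-intersecting-cycles situation step (a) warns about.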
Since I am a MINC employee I'll try to keep this as spam-free as possible, but... try MINC's PLSynthesizer. We were reviewed in the 9/12 issue of EDN along with FPGAXpress and Accolade, with good performance results; Synplicity refused to take part. Here in Boston, we have had some strong design wins against the Big 2 at a couple of "high profile" accounts. PLS *does* support scripting.

www.minc.com

Jay Doherty
MINC/Synario Design Automation
508-893-7944
jdoherty@synario.com

Article: 12235
M Kartheepan wrote:
> The above statement bothers me a little. If you need to perform the
> multiplications for all taps in parallel, dont you have to have that
> many multipliers in hardware ?
<snip rest of quoted post>

I agree with your analysis, but I disagree with your conclusions. I think a DSP is most useful any time you can fit the processing into the available DSP MIPS. If your embedded application is so light that you don't need a DSP, then I would use a standard embedded micro for the task. But I would only use FPGAs for DSP when I needed MORE MIPS than the DSP can provide.

The design time for a micro is the least, since they have good support and are easy to program in a high-level language. You might have to program a DSP function in assembler to meet your performance goals, however. A DSP is a little harder to develop on, since its architecture makes it somewhat harder to use advanced development tools. But there are a lot of DSP routines already written, which makes some of the work easy.

But when you need every ounce of performance, you often need to use unusual variations or even unusual algorithms which must be hand coded. Then the FPGA can give you the most performance, with the highest requirement for development effort. While processors can be debugged using many available tools, the FPGA must be debugged in a slow simulator or, with more difficulty, in a real system.

Of course all of this is a generalization, and any given problem may be an exception. But I think most projects will find this ranking to be true.

--
Rick Collins
redsp@XYusa.net
remove the XY to email me.

Article: 12236
Hello!

Firstly, my English is not very good. Sorry! I'm a Brazilian chemical engineer and I'm a beginner in on-line data acquisition and computer interfacing. I have a board in my PC which contains: an AD converter (12-bit input), a DA converter (8-bit input), a multiplexer, etc. I'm not sure, but I think that the board also has an 8253 programmable timer. I'm confused because I know that a PC has an 8253 timer. I don't know if the board has its own 8253 chip or if it is using the 8253 of the PC. I guess the first option, because in the board's manual (the manual is old, concise and badly written) it is written that counter 0 and counter 1 are linked together (cascaded, chained).

I'd like to measure signals from four pressure transducers. I'm using a software loop to measure time, but I think it is a poor method. I'm trying to use the 8253 timer. My questions:

1) does every ADC or DAC board have a timer chip?
2) why use an 8253 timer? Could I use a loop to simulate the time intervals?
3) how do I program an 8253 timer if I have:
base+12 = counter 0 port
base+13 = counter 1 port
base+14 = counter 2 port
base+15 = counter 3 port

Thanks in advance and, if possible, send me an e-mail too.

Arlan

Article: 12237
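[Regarding question 3 above, a sketch of the byte values one would compute before writing them to the 8253's ports. Assumptions are flagged in the comments: the standard PC input clock of 1.193182 MHz (the board's clock may differ), mode 2 (rate generator), and that base+15 is actually the control-word register (the 8253 has only counters 0-2 plus a control register, so "counter 3 port" in the manual is probably that). All function names here are made up for illustration; the actual port writes would be done with your language's out/outportb equivalent.]

```python
PIT_CLOCK_HZ = 1_193_182  # standard PC 8253 input clock (assumption)

def control_word(counter, mode, rw=0b11, bcd=0):
    """Build the 8253 control word: SC1 SC0 | RW1 RW0 | M2 M1 M0 | BCD.
    rw=0b11 means the count is written as low byte then high byte."""
    return (counter << 6) | (rw << 4) | (mode << 1) | bcd

def divisor(rate_hz, clock_hz=PIT_CLOCK_HZ):
    """16-bit reload count for a desired output rate (mode 2)."""
    n = round(clock_hz / rate_hz)
    if not 2 <= n <= 65536:
        raise ValueError("rate needs cascaded counters or is too fast")
    return n & 0xFFFF  # a full count of 65536 is written as 0

def program_counter0(rate_hz):
    """(port, byte) pairs to write, in order: control word, then low/high count."""
    n = divisor(rate_hz)
    cw = control_word(counter=0, mode=2)   # counter 0, lo/hi bytes, mode 2, binary
    return [("base+15", cw), ("base+12", n & 0xFF), ("base+12", (n >> 8) & 0xFF)]
```

For a 1 kHz sample tick this gives a control word of 0x34 and a count of 1193; rates slower than about 18.2 Hz need the cascaded counter 0 / counter 1 pair the manual mentions.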
There are tricks in hardware that can condense the design. One of the most useful, at least for filters, is distributed arithmetic, which provides a technique to hide the multiplications by rearranging the multiply-accumulates at the bit level. Distributed arithmetic is a bit awkward to describe in a few lines, but I'll do my best.

If we look at an FIR filter, the input is delayed in a tapped delay line. At each tap, a delayed version of the input is multiplied by a possibly unique coefficient. The products from all the taps are then summed to obtain the filter output. Now imagine for a moment that the input is only one bit wide. In that case, the multiply-accumulate can be represented by a look-up table with one input for each tap, and enough output bits to encompass any combination of the coefficients without overflow. This is illustrated by the table below, where the coefficients are labeled A, B, C & D:

inputs  output
0000    0
0001    A
0010    B
0011    A+B
0100    C
0101    A+C
...
1101    A+C+D
1110    B+C+D
1111    A+B+C+D

Note that the multiply-accumulate in this case is accomplished by a 4-input look-up, and is valid regardless of the width of the coefficients (the width of the table needs to be sufficient to hold the sum of any combination of coefficients). Now, in most cases we need inputs that are wider than one bit. In that case, we use an identical instance of the table for each bit of the input to generate a partial sum of products corresponding to each bit. Those partials are then combined in an adder tree, with the inputs to the adder tree shifted to match the bit weights. What we have done is apply the distributive property of addition and multiplication at the bit level to reduce the multiply-accumulate to a look-up table and adder tree.

If speed is not as much of an issue, the input can be presented one bit at a time to the same look-up table, to save logic resources. The table output is then shifted and accumulated until all the bits of the input are accounted for. Even with the input serialized, the result of the whole filter is produced in a number of clock cycles equal to the number of bits in the input, regardless of the number of taps (though the table size grows exponentially with the number of taps). This is a considerable improvement over having to multiply for each tap, as in the method described by M Kartheepan. The size of the table can also be contained by combining the results of smaller tables in an adder. For instance, an eight-input MAC would require a 256-entry table. Alternatively, it can be constructed from two identical 4-input tables (one table addressed by the first 4 inputs, the other by the remaining 4) if the table outputs are summed.

For more detailed info, you might take a look at the Xilinx application note "The Role of Distributed Arithmetic in FPGA-based Signal Processing", which can be found at http://www.xilinx.com/appnotes/theory1.pdf

As far as the time to develop an algorithm in an FPGA goes, a heavily data-path design can be developed reasonably quickly if you are familiar with the FPGA architecture, the tools, and hardware implementation of algorithms, and if your library is reasonably complete. It also helps to do the data-path design using schematics rather than synthesizing it, so that you have control over the design implementation and placement to obtain good performance. This said, I have done fairly well packed (75% and greater utilization) high-performance (40+ MHz) XC4028 data-path designs in under a week (33 FPGA designs completed in the past year).

Another, not so obvious advantage of FPGAs over DSPs for medical applications is that the Food and Drug Administration treats FPGA programs as hardware rather than software. That can lead to a shorter product-approval cycle compared to a similar product using a DSP microprocessor.

M Kartheepan wrote:
> The above statement bothers me a little. If you need to perform the
> multiplications for all taps in parallel, dont you have to have that
> many multipliers in hardware ?
<snip rest of quoted post>

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 12238
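[The bit-serial form of the distributed-arithmetic MAC described above can be modeled in a few lines. The coefficient values and the unsigned 8-bit input width are made-up for the example; signed inputs need a subtraction on the sign-bit pass, which is omitted here for brevity.]

```python
B = 8                    # input sample width in bits (unsigned, for simplicity)
coeffs = [3, -1, 4, 2]   # the A, B, C, D of the table above (values made up)

# The 16-entry look-up table from the post: entry i holds the sum of the
# coefficients whose tap bit is set in address i.
lut = [sum(c for k, c in enumerate(coeffs) if (i >> k) & 1) for i in range(16)]

def fir_da(samples):
    """One 4-tap MAC output, computed bit-serially: B table look-ups,
    each shifted to its bit weight and accumulated. No multipliers."""
    acc = 0
    for b in range(B):
        addr = sum(((samples[k] >> b) & 1) << k for k in range(4))
        acc += lut[addr] << b
    return acc

def fir_direct(samples):
    """Reference multiply-accumulate for comparison."""
    return sum(c * x for c, x in zip(coeffs, samples))
```

Because addition and multiplication distribute over the binary weighting of the input, `fir_da` matches `fir_direct` exactly, and the cycle count (B look-ups) is independent of the number of taps, just as the post says.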
Hi everyone. I was wondering if anyone has tried Xilinx's FROM:TO user constraint. I have a VHDL design that I would like to run at 40 MHz (a period of 25 ns). I read on Xilinx's homepage (in one of their online slide presentations) that I can use the following user constraint to ensure that the design will run at 40 MHz from flip-flop to flip-flop:

TIMESPEC TS04 = FROM FFS TO FFS 25ns;

I have a few questions about this constraint.

1) From flip-flop to flip-flop assumes flip-flops connected via the same path?

      25ns        25ns
FF1-------->FF2---------->FF3

2) Does this mean that if the design meets this 25 ns constraint, then data from one flip-flop to another arranged as in 1) will only have 25 ns of delay between them? I'm assuming this is only true for the internal flip-flops, not those connected to IO pads (IPAD to FF, or FF to OPAD).

I tried compiling a design using this constraint, and I got some numbers that I was not sure about. The part I used was an XC4028EX-3 and I got the following results:

Pre place-and-route timing under this constraint: 181.3 MHz
Post place-and-route timing under this constraint: 41.3 MHz

I had also used a 25 ns PERIOD constraint on my main clock net.

Please post a reply to this newsgroup or email me if anyone knows more about this constraint. Many thanks in advance.

--
Nestor Caouras
nestor@ece.concordia.ca
http://www.ece.concordia.ca/~nestor/addr.html
Dept. of Electrical and Computer Eng., Concordia University
1455 de Maisonneuve Blvd (West), Montreal, Quebec, Canada H3G 1M8
Tel: (514) 848-8784  Fax: (514) 848-2802

Article: 12239
Welcome to the wonderful and wacky world of Xilinx constraints!

Nestor Caouras wrote:
>I was wondering if anyone has tried Xilinx's FROM:TO user constraint. I
>have a VHDL design that I would like to run at 40MHz (period of 25ns).
>
>TIMESPEC TS04 = FROM FFS to FFS 25ns;
>
>1) From flip flop to flip flip assumes flip flops connected via the same
>path?

By the same "path," do you mean that FFs 1, 2, and 3 share the same clock? Actually, what the constraint means is that ALL flipflops in your circuit use this constraint (unless you tell it otherwise with different constraints).

>2) Does this mean that if the design meets this 25ns constraint then
>that data from one flip to another arranged as in 1) will only have 25ns
>delay between them?

What it means is that you want to use a 40 MHz clock, and that the delay through the logic between the flops will be constrained to be 25 ns or less. If the place-and-route tools can't make that logic fast enough to meet the constraint, they'll flag it as a constraint not met.

>Pre place-and-route timing under this constraint : 181.3MHz
>Post place-and-route timing under this constraint : 41.3 MHz

The pre-PPR number is simply the clock speed you could run your chip at if there were zero routing delays. It's interesting but not necessarily useful. The post-PPR number is what the actual chip can do - it's the fastest clock you can actually use. You told the tools that you wanted to use a 40 MHz clock, and the tools chugged away and were able to not only meet your constraint but give you a little slack as well. So, if for some reason you wanted to use a 41.3 MHz clock, you could. (If you dig through the timing reports, you'll see that this number comes from the slowest path in your design.)

-andy
-------------------------
Andy Peters
Sr Electrical Engineer
National Optical Astronomy Observatories
950 N Cherry Ave
Tucson, AZ 85719
520-318-8191
apeters@noao.edu

Article: 12240
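[The relationship Andy describes between the reported 41.3 MHz and the 25 ns constraint is just reciprocal arithmetic; a tiny sketch of the slack calculation, with made-up variable names:]

```python
constraint_ns = 25.0                          # the FROM:TO / PERIOD constraint
reported_mhz = 41.3                           # post-route timing report

slowest_path_ns = 1000.0 / reported_mhz       # ~24.2 ns, the slowest FF-to-FF path
slack_ns = constraint_ns - slowest_path_ns    # ~0.8 ns of margin at 40 MHz
```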
Uh, I'm more worried. We've been looking at the output of synthesizers for various encoding schemes and have been finding different levels of robustness. Regardless of feature size, transients of various sorts can happen, say from ESD, and the critical parts of a state machine should continue to function and not lock up or cycle through illegal states. Somehow, I don't see the vendors talk about this when discussing "quality of results."

rk

Jonathan Bromley wrote:
> BUT, BUT, BUT: If you are trying to design a _robust_ system,
> in which illegal states are detected and dealt-with as they
> should be, you will likely end up with unwieldy amounts of
> illegal-state detection logic which will be just as expensive
> of speed and real-estate as the next-state logic for a fully
> encoded system!
<snip rest of quoted post>

Article: 12241
In comp.lang.vhdl Andy Peters <apeters@noao.edu.NOSPAM> wrote:
:>       25ns        25ns
:> FF1-------->FF2---------->FF3
: By same "path," do you mean that FFs 1, 2, and 3 share the same clock?
: Actually, what the constraint means is that ALL flipflops in your circuit
: use this constraint (unless you tell it otherwise with different
: constraints).

Thanks for replying, Andy. Yes, all flip-flops in my design use the same clock, and the data transfer from one flip-flop to another should not exceed 25 ns. My design is heavily pipelined because I wanted to avoid a slow design.

Nestor

Article: 12242
"Edwin Grigorian" <edwin.grigorian@jpl.nasa.gov> wrote:
>I have a design that I'm trying to implement in an Altera Flex10k100A part
>and it requires:
>
> Type A: 3 blocks of 16 x 64 bits RAM = 3 x 1024 = 3072 bits
> Type B: 16 blocks of 8 x 8 bits RAM = 16 x 64 = 1024 bits
> Total RAM = 4096 bits
>
>Now the 10k100A has 12 memory blocks (EABs), each 2048 bits for a total of
>24k bits. Each EAB can be configured in any of the following: 256x8, 512x4,
>1024x2, or 2048x1.
<snip rest of quoted post>

Counting the bits of RAM is meaningless unless you can time-multiplex the access. Each EAB can implement at most one RAM block, and then only 8 bits wide at that. Yes, you are leaving a lot of unused RAM in each EAB since your memories are so shallow. Thus your EAB usage is:

Type A: 3 blocks of 16 x 64 bits RAM = 3 x 8 EAB = 24 EAB (1/16 of each EAB used)
Type B: 16 blocks of 8 x 8 bits RAM = 16 x 1 EAB = 16 EAB (1/32 of each EAB used)
Total RAM = 40 EAB

Possible solutions:
1) Multiplex your memories to get more out of them, especially the A blocks.
2) Use the new 10KE family parts; I believe each EAB in these can do 256x16 of dual-ported memory. I don't have their release schedule at hand - they are very new.
3) Build some RAM out of LEs. These are probably slower and will take a lot of device resources, and you will need a bigger part.
4) Use some other part which is better at building small RAMs.

--
richard_damon@iname.com (Redirector to my current best Mailbox)
rdamon@beltronicsInspection.com (Work Address)
Richad_Damon@msn.com (Just for Fun)

Article: 12243
I work in the field of neural network hardware implementation. I have used the Lattice isp (in-system programmable) FPGAs with excellent results. I am now looking for an FPGA that supports in-system programmability and is supported by software tools that allow me to automate the design of the FPGA architecture with minimal lay (i.e., non-technical) user intervention. Perhaps the design software will allow complex scripting, and/or linking to a high-level language.

The goal is to design a neural hardware system that is general-purpose, multi-configurable and user-friendly. I believe that FPGAs will allow me to do this when linked with a special neural processor. Any ideas, pointers, or feedback will be appreciated.

Ali El-Mousa

Article: 12244
Jonathan Bromley wrote:
. . .
> BUT, BUT, BUT: If you are trying to design a _robust_ system,
> in which illegal states are detected and dealt with as they
> should be, you will likely end up with unwieldy amounts of
> illegal-state detection logic which will be just as expensive
> of speed and real-estate as the next-state logic for a fully
> encoded system!
. . .

I think you are right on the money. When you consider the whole problem, encoding states saves time when it fits and runs fast enough.

-Mike Treseler
Sr. Staff Engineer, Fluke Networks Division
tres@tc.fluke.com

Article: 12245
Jonathan Bromley wrote:
> > will start it off right, but if your application must be sure to
> > recover from faults, you'll need to trap those illegal states.
>
> Exactly so. To do this you need to
> a) determine all the possible cycles of illegal states
>    (don't forget there may be multiple non-intersecting cycles)
> b) invent some combinatorial gubbins that will identify
>    at least one state in EACH of these cycles, and NO states
>    in the wanted cycle
> c) use that logic to force the counter into one of the legal
>    states and/or raise a fault flag
>
> The same argument applies to one-hot state machines and
> indeed any state machine with any illegal states, i.e. any
> (sub)system with N flipflops but fewer than 2^N legal states.

I don't completely agree with your premise that illegal states MUST be trapped, nor with the method of handling them.

My question is: how useful is illegal-state detection when your machine is already screwed up if you are IN an illegal state? If soft errors are a concern, using a fully encoded machine does not solve the error problem. It only increases the chance of jumping to a legal, but incorrect, state. Either way, you have an error in the machine's operation.

If you really need soft-error resistance in your design, you must use redundant circuitry with error detection and correction. I have never explored this type of circuit, but you might be able to use one of the many ECC schemes with a fully encoded state.

            Next      Cur               Corrected
 Inputs     State     State    ECC      Cur       Output
            Logic     FFs      Logic    State     Logic
          +----+    +----+   +----+             +----+
 -------->|    |--->|D  Q|-->|    |---+-------->|    |---->
          |    |    |    |   |    |   |         |    |
     +--->|    |    |>   |   |    |   |         |    |
     |    +----+    +----+   +----+   |         +----+
     +--------------------------------+

The output of the Next State logic will have N bits of state plus M bits of ECC. If you have an error in the logic rather than just the FFs, then I don't know if the ECC circuit can correct for the error.

--
Rick Collins
redsp@XYusa.net
remove the XY to email me.

Article: 12246
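[Rick's ECC-protected state register above can be sketched in software. This is a toy behavioral model in Python, not synthesizable logic; the choice of a Hamming(7,4) code is my assumption, not something from the post. It encodes a 4-bit state with 3 parity bits, injects one flipped "flip-flop", and corrects it before the output logic would see the state.]

```python
# Toy model of an ECC-protected state register: state bits are
# Hamming(7,4)-encoded so any single flipped flip-flop is corrected
# before the output logic sees the current state.
def hamming74_encode(d):
    """Encode a 4-bit state value into a 7-bit codeword (positions 1..7,
    parity bits at positions 1, 2, 4; data at positions 3, 5, 6, 7)."""
    b = [(d >> i) & 1 for i in range(4)]
    p1 = b[0] ^ b[1] ^ b[3]          # covers positions 1,3,5,7
    p2 = b[0] ^ b[2] ^ b[3]          # covers positions 2,3,6,7
    p4 = b[1] ^ b[2] ^ b[3]          # covers positions 4,5,6,7
    return [p1, p2, b[0], p4, b[1], b[2], b[3]]

def hamming74_correct(c):
    """Correct at most one flipped bit in codeword c, return the state."""
    s = 0
    for pos in range(1, 8):
        if c[pos - 1]:
            s ^= pos                 # syndrome = XOR of set bit positions
    if s:                            # nonzero syndrome points at the bad bit
        c[s - 1] ^= 1
    return (c[2] << 0) | (c[4] << 1) | (c[5] << 2) | (c[6] << 3)

state = 0b1011
word = hamming74_encode(state)
word[4] ^= 1                         # single soft-error upset in one state FF
assert hamming74_correct(word) == state
```

As Rick notes, this only covers upsets in the flip-flops themselves; an error inside the next-state logic produces a wrong but valid codeword that no ECC stage can catch.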
elmousa@my-dejanews.com wrote:
>
> I work in the field of neural network hardware implementation. I used the
> Lattice isp (in-system programmable) FPGAs with excellent results.
>
> I am now looking for an FPGA that supports in-system programmability and is
> supported by software tools that can allow me to automate the design
> architecture of the FPGA with minimal lay (i.e., non-technical) user
> intervention. Perhaps the design software will allow complex scripting,
> and/or linking to a high-level language.
>
> The goal is to design a neural hardware system that is general-purpose,
> multiconfigurable and user-friendly. I believe that FPGAs will allow me to do
> this when linked with a special neural processor.
>
> Any ideas, pointers, feedback will be appreciated.

I think I understand what you are trying to do, and it may be possible. The most likely method would be to use your input program to analyze your user requirements and produce VHDL as output. The VHDL can then be compiled to a chip by means of any of the many VHDL compilers available.

The problem with this approach is that you have to work within the subset of VHDL supported for synthesis by your tool. But if you can produce C code from your inputs, you should be able to produce VHDL code. The trick will be producing GOOD VHDL code.

--
Rick Collins
redsp@XYusa.net
remove the XY to email me.

Article: 12247
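[One way to picture the "analyze the spec, emit VHDL" flow suggested above is a small text generator. This Python sketch is purely illustrative: the helper name, entity shape and weight format are invented here, not taken from any real tool.]

```python
# Hypothetical sketch of generating HDL from a user-level description:
# emit a VHDL constant table for a layer of fixed 8-bit signed weights.
def emit_weight_rom(name, weights):
    """Return VHDL text declaring a constant weight table for one layer."""
    lines = [
        f"entity {name} is end {name};",
        f"architecture rtl of {name} is",
        "  type weight_t is array (natural range <>) of integer range -128 to 127;",
        "  constant WEIGHTS : weight_t := (",
        "    " + ", ".join(str(w) for w in weights),
        "  );",
        "begin",
        "end rtl;",
    ]
    return "\n".join(lines)

print(emit_weight_rom("layer0", [12, -7, 33]))
```

A real flow would emit synthesizable arithmetic as well, and, as noted above, would have to stay inside the synthesis subset the downstream VHDL compiler accepts.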
hi rick,

we went through this a number of years ago, and again more recently. you do need to have transitions from all of the states ... they really must be handled. the state info + extra bit approach is not sufficient for all cases. for example, suppose that you have an "upset" on the clock line, giving a runt pulse, tossing you into an illegal state or sequence of states. the edac schemes typically work based on the hamming distance of the code implemented. other examples that can cause multiple upsets are dropouts on the power bus, esd, certain man-initiated events, etc., etc.

more recently i've been looking at what vhdl synthesizers do to state machines and am feeding in simple examples of sequencers. for one-hot state machines, one compiler makes a structure that can either lose its one-hot state or have two hot states chasing each other around. another one makes quite a robust machine which always rights itself; it is not truly one-hot but will right itself no matter what you do to it, at little overhead (and some extra delay for larger numbers of states). for an encoded machine, i have found one compiler eliminate 'unreachable' states by default, along with their transitions, even if you go to the trouble of coding them in. gotta figure out what each version of the synthesizers and optimizers does and how to disable certain parts.

anyways, interesting topic, we're looking into it deeper, open for ideas (email if you wish, remove no-spam).

rk
______________________________________________________

Rickman wrote:
> Jonathan Bromley wrote:
> > > will start it off right, but if your application must be sure to
> > > recover from faults, you'll need to trap those illegal states.
> >
> > Exactly so. To do this you need to
> > a) determine all the possible cycles of illegal states
> >    (don't forget there may be multiple non-intersecting cycles)
> > b) invent some combinatorial gubbins that will identify
> >    at least one state in EACH of these cycles, and NO states
> >    in the wanted cycle
> > c) use that logic to force the counter into one of the legal
> >    states and/or raise a fault flag
> >
> > The same argument applies to one-hot state machines and
> > indeed any state machine with any illegal states, i.e. any
> > (sub)system with N flipflops but fewer than 2^N legal states.
>
> I don't completely agree with your premise that illegal states MUST be
> trapped nor the method of handling them.
>
> My question is, how useful is illegal state detection when your machine
> is already screwed up if you are IN an illegal state? If soft errors are
> a concern, using a fully encoded machine does not solve the error
> problem. It only increases the chance of jumping to a legal, but
> incorrect state. Either way, you have an error in the machine operation.
>
> If you really need soft error resistance in your design, you must use
> redundant circuitry with error detection and correction. I have never
> explored this type of circuit, but you might be able to use one of the
> many ECC schemes with a fully encoded state.
>
>             Next      Cur               Corrected
>  Inputs     State     State    ECC      Cur       Output
>             Logic     FFs      Logic    State     Logic
>           +----+    +----+   +----+             +----+
>  -------->|    |--->|D  Q|-->|    |---+-------->|    |---->
>           |    |    |    |   |    |   |         |    |
>      +--->|    |    |>   |   |    |   |         |    |
>      |    +----+    +----+   +----+   |         +----+
>      +--------------------------------+
>
> The output of the Next State logic will have N bits of state plus M bits
> of ECC. If you have an error in the logic rather than just the FFs, then
> I don't know if the ECC circuit can correct for the error.
>
> --
> Rick Collins
> redsp@XYusa.net
> remove the XY to email me.

Article: 12248
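[The "rights itself" behavior rk describes for the more robust one-hot implementation can be modeled in a few lines. This is a Python behavioral sketch for illustration only, not what any synthesizer actually emits: whenever an upset leaves zero hot bits, or two hot states chasing each other, the register is forced back to a known legal state.]

```python
# Behavioral model of a self-righting one-hot ring: any cycle that finds
# an illegal state (no hot bit, or more than one) recovers to state 0.
def one_hot_next(state, n=4):
    """Advance an n-bit one-hot ring; force a legal state after an upset."""
    if bin(state).count("1") != 1:           # illegal: 0 or 2+ hot bits
        return 1                             # recover to the start state
    return ((state << 1) | (state >> (n - 1))) & ((1 << n) - 1)

s = 0b0010
s = one_hot_next(s)                          # normal advance
assert s == 0b0100
s ^= 0b0001                                  # soft error sets a second hot bit
assert one_hot_next(s) == 0b0001             # machine rights itself next cycle
```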
You can't go wrong with Model Technology's ModelSim (www.model.com) or Simucad Silos III (www.simucad.com). Model Tech will give you a free 30-day eval license, and Simucad has a free version good for 100-200 lines of code. Download them and try them out. Silos III is probably a little cheaper.

Martin Meserve <meserve@my-dejanews.com> wrote in article <3618CD06.41C67EA6@my-dejanews.com>...
> Rick Filipkiewicz wrote:
> >
> > I'm looking for a reasonably priced Verilog simulator to add to our
> > Xilinx Foundation+Express package.
> > So far I can see VeriWell, Chronologic, QuickTurn. Anybody have any
> > comments on these or others. We could go to $5000 which I assume
> > writes off Cadence.
> >
> > Also looking for a Verilog PCI testbench suite.
>
> If that's your budget, yes, you can write off Cadence's Verilog-XL.
> However, you should also write off Chronologic. It's a toss-up
> which one is better, Chronologic or Cadence. They both have their
> good points and their bad points. The last time we received a quote
> from them, Chronologic's was more expensive. By the way, Chronologic's
> VCS is now owned by Synopsys. They have had a tight hold on VHDL
> development tools and now they want to expand to Verilog.
>
> We have both VCS and Verilog-XL. We use VCS for our large ASIC
> development and Verilog-XL for board development, but they can
> be used for either.
>
> One thing that you didn't mention is a waveform viewing tool.
> I have, in the past, developed a medium-sized ASIC with only
> text output (reams and reams of zeros and ones), but it's
> not an easy task. We have a viewer from Summit Design that we use
> with VCS, and we use Cadence's built-in viewer with Verilog-XL.
> They both work well.
>
> If your budget is as tight as you say, take the suggestions from
> the other posters. Otherwise you are going to have to get
> someone to stretch the budget. And don't forget the 15% a year
> for support.
>
> Martin
>
> --
> Martin E. Meserve                                  martin.e.meserve
> Engineer, Program/Project Specialist                     AT
> Lockheed Martin Tactical Defense Systems - AZ          lmco.com

Article: 12249
Botond Kardos wrote:
> .....
> This paper claims that one needs to make an electron microscope shot
> to get every simple antifuse, and these photos are destructive and quite
> expensive, so breaking an Actel antifuse FPGA which contains about 50,000
> antifuses might cost $50 million.
> Is this true? Aren't there other ways of reprogramming or
> eliminating the read-out protection (it also may be a single or more
> antifuses), for example with an ion beam?

Here's my 2p worth: the Actel paper starts out as a reasonable, if superficial, introduction to design security. Then it gradually gets into the realms of the ridiculous as the marketing department takes over. The figure on Actel's slide 33 is actually $500M to crack an antifuse FPGA, based on $1000 per picture and 500,000 antifuses. I bet you anything you like that if you ask for 500,000 pictures the lab will give you a price break :-)

More seriously, I don't think anybody would ever try to take pictures of every antifuse to figure out the configuration of an FPGA. Using an ion-beam machine to get at the programming circuitry would be a good first step: once the programming circuitry is active, the antifuse is no safer than an SRAM.

Tom.
Tom Kean, Director, Algotronix Ltd.
tom@algotronix.com (www.algotronix.com)