Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Hi all, I have been struggling with a problem for nearly a month now. I have a small piece of code which gives hold time violations when it's synthesized in Quartus II. But if I synthesize and complete the fitting of the design, there are no hold time violations. Post-fitting simulations don't show any violations either. Has anyone seen such behavior in their designs? Is this a problem and if so, what needs to be done to fix it? Thanks, PrashantArticle: 50476
Heiko Kalte wrote: > Hi, > I am trying to estimate the partial bitstream size of a 201 slices > design part in a Virtex-II 500. > > The problem is that I did not found out the frames per CLB in a Virtex2, > there must be more than 48 because there are 4 slices in a CLB. > Additionally I nead the bit per frame for a Virtex-II 500. > Please Help me. Here are some basic numbers: The XC2V500 has 32 rows and 24 columns of CLBs, and needs a total of ~2.56 million configuration bits. Each column of CLBs is composed of 22 frames. Do the math: There are roughly 107 000 config bits per CLB column, with about 4800 bits in each of the 22 frames. Now come the constraints: You can only reconfigure integer frames, and you have to be really clever to reconfigure only portions of the 22 frames making up a column. So, most likely you will reconfigure whole columns of 32 CLBs = 128 slices = 256 LUTs. In your case, you will reconfigure two columns, and you have to floorplan your partial design such that it all fits into two vertical CLB columns. So, count on roughly 1/12 of the bitstream, or about 215,000 config bits. Peter Alfke, Xilinx ApplicationsArticle: 50477
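Peter's arithmetic is easy to check. A quick sketch (Python here purely as a calculator; the device figures -- 24 CLB columns, 22 frames per column, ~2.56 million total config bits -- are the ones from his post):

```python
# Rough partial-bitstream sizing for an XC2V500, using the numbers
# quoted in the post above.
TOTAL_CONFIG_BITS = 2_560_000   # ~2.56 million config bits for the device
CLB_COLUMNS = 24                # 24 columns of CLBs
FRAMES_PER_COLUMN = 22          # each CLB column is 22 frames

bits_per_column = TOTAL_CONFIG_BITS / CLB_COLUMNS      # roughly 107,000
bits_per_frame = bits_per_column / FRAMES_PER_COLUMN   # roughly 4,800
two_column_reconfig = 2 * bits_per_column              # roughly 215,000

# Two columns out of 24 is 1/12 of the full bitstream, as stated.
print(round(bits_per_column), round(bits_per_frame), round(two_column_reconfig))
```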
On 11 Dec 2002 09:17:18 -0800, prashantj@usa.net (Prashant) wrote: >Hi all, >I have been struggling with a problem for nearly a month now. I have a >small piece of code which gives hold time violations when its >synthesized in Quartus II. But if I synthesize and complete the >fitting of the design, there are no hold time violations. Post fitting >simulations dont show any violations either. Has anyone see such >behavior in their designs ? Is this a problem and if so, what needs to >be done to fix it ? > >Thanks, >Prashant Hold violations happen when clk->Q + logic delay (assuming zero skew) is smaller than the hold constraint of the accepting flop. If you have little or no logic and a fast flop, the synthesizer may underestimate the wire delay and generate a hold violation. When you do P&R, the actual wire delay gets added and that may resolve the hold violation. So if you're not seeing hold violations after P&R, you're OK. In the future if you see hold violations after P&R the solution might be to do an ECO and add a slow buffer between the two flops to fix it. Muzaffer Kal http://www.dspia.com ASIC/FPGA design/verification consulting specializing in DSP algorithm implementationsArticle: 50478
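Muzaffer's explanation reduces to a single inequality: capture fails when clk->Q plus the data-path delay is less than the destination flop's hold requirement. A minimal sketch of that check, with invented delay values purely for illustration:

```python
def hold_slack(clk_to_q, logic_delay, wire_delay, hold_time, skew=0.0):
    """Hold slack in ns; negative slack means a hold violation.
    Zero clock skew assumed by default, as in the post above."""
    return (clk_to_q + logic_delay + wire_delay) - (hold_time + skew)

# Pre-layout: the synthesizer underestimates wire delay (here, zero),
# so a fast flop with little logic reports an apparent violation.
assert hold_slack(0.2, 0.0, 0.0, 0.3) < 0

# Post-P&R: the actual routing delay gets added and the violation
# disappears, matching the behavior described in the question.
assert hold_slack(0.2, 0.0, 0.4, 0.3) > 0
```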
I'm designing a soft CPU and I have a question regarding power consumption. I know that in an ASIC, the power consumption is roughly proportional to the clock rate. I think driving the capacitance of wires is mainly what consumes power, so I'd like to keep them short, and try to keep long wires from changing state often. My question is, do registers draw much power if you clock them but their outputs don't change state?Article: 50479
"Jay" <kayrock66@yahoo.com> wrote in message news:d049f91b.0212101735.6d7ba6f6@posting.google.com... > I'm going to second Paul's suggestion about just computing the > correlation sequentially, it's just too easy not to do. And also, I > don't think a single correlation is going to give you your time delay, > I think you're going to have to slide the 2 data sets across each > other and keep computing that correlation until it peaks. > Do you mean convolution? I think correlation already does the sliding stuff.Article: 50480
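What Jay describes -- sliding the two data sets across each other and watching for the correlation peak -- is a cross-correlation search over candidate lags. A minimal pure-Python sketch (the function name and test signals are mine, not from the thread):

```python
def xcorr_lag(x, y, max_lag):
    """Return the lag (in samples) by which y is delayed relative to x,
    found as the shift maximizing sum(x[n] * y[n + lag])."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = sum(x[n] * y[n + lag]
                  for n in range(len(x))
                  if 0 <= n + lag < len(y))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

x = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0]   # a pulse...
y = [0, 0, 0, 0, 0, 1, 2, 3, 2, 1]   # ...the same pulse, delayed 3 samples
print(xcorr_lag(x, y, 5))            # -> 3
```

A single correlation value only tells you how alike the two sequences are at one alignment; the peak over all lags is what gives the time delay.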
Hello Peter, thanks for your help. Can you give me the amount of additional bits (besides the frame data) for initialisation and all that stuff? I heard about 384 bits + frame data + dummy frame for the Virtex and VirtexE series. Heiko PS: It's me, from cold and rainy Paderborn, NRW. I wish you a merry Christmas, in case we don't talk again. Peter Alfke wrote: > > Heiko Kalte wrote: > > > Hi, > > I am trying to estimate the partial bitstream size of a 201 slices > > design part in a Virtex-II 500. > > > > The problem is that I did not found out the frames per CLB in a Virtex2, > > there must be more than 48 because there are 4 slices in a CLB. > > Additionally I nead the bit per frame for a Virtex-II 500. > > Please Help me. > > Here are some basic numbers: > The XC2V500 has 32 rows and 24 columns of CLBs, and needs a total of ~2.56 > million configuration bits. Each column of CLBs is composed of 22 frames. > > Do the math: > There are roughly 107 000 config bits per CLB column, with about 4800 bits > in each of the 22 frames. > > Now come the constraints: > You can only reconfigure integer frames, and you have to be really clever > to reconfigure only portions of the 22 frames making up a column. > So, most likely you will reconfigure whole columns of 32 CLBs = 128 slices > = 256 LUTs. > In your case, you will reconfigure two columns, and you have to floorplan > your partial design such that it all fits into two vertical CLB columns. > > So, count on roughly 1/12 of the bitstream, or about 215,000 config bits. > > Peter Alfke, Xilinx Applications -- --------------------------------------------------------------- Dipl. Ing. H.
Kalte | HEINZ NIXDORF INSTITUTE | Office: F1.213 System and Circuit Technology | Fon: +49 (0)5251 60-6459 Fürstenallee 11 | Fax: +49 (0)5251 60-6351 33102 Paderborn, Germany | --------------------------------------------------------------- mailto:kalte@hni.uni-paderborn.de http://wwwhni.uni-paderborn.de/sct/ --------------------------------------------------------------- Home of the RAPTOR Rapid Prototyping Systems http://www.RAPTOR2000.de/ ---------------------------------------------------------------Article: 50481
Martin Schoeberl wrote: > <snip> > I was really disappointed when starting with my Java processor. I played > around a little bit with hand optimization of JVM code. In one example > program execution time dropped on JOP and in Suns JVM in interpreting mode > for about 25%. But the execution time > of the JIT compiled program was LONGER!. In JDK 1.1 about 10% and in JDK 1.3 > it was about 15 times slower! Wow - a real warning to those who would use Java for embedded/real time work. Just be real careful to never change versions. Was that 15x slower because the interpreting mode got faster? It's not a trivial amount of work to make something run 15x slower :) > > Here is the complete example: http://www.jopdesign.com/perf.html > > I think javac does no optimization to make it easier for JIT to compile JVM > stack code to a register machine. And the JIT assumes this simple minded > stack code for it's optimization. > > I was looking arund to find some byte code optimizer, but have only found > this 'obfuscators'. No optimizer for the stack code. > > > Further you might say that the stack access must be slower than a register > > file, since you require to access to the stack plus write-back. But in > > No. I assumed the same single cycle access time for 2 operands load and one > write on the stack as for two register load and one write. > > I think this thread gets a little bit OT, perhaps we should move to > c.l.forth or c.l.j.machine I find the HW/SW layer interaction interesting, and not OT. It is also good to hear real experiences in deploying different designs into FPGAs. Did you ever compare the .NET byte code, or consider a .NET FPGA engine? I noted this in an AMD press release: #AMD Athlon MP 2400+ dual point-to-point 266MHz processor buses, providing up to #2.1Gbytes/s bandwidth. 0.13-micron processing #.. has a pipelined superscalar floating point engine and a Level 2 #cache translation look-aside buffer.
#AMD also has added 51 new instructions to its 3DNow instruction set. #The processor has a list price of $228 in 1,000-unit quantities. The addition of 51 new opcodes seemed quite significant, and we should expect stack-machine opcodes to appear in future mainstream CPUs. -jgArticle: 50482
"Ralph Mason" wrote: > "William Tanksley Google" <wtanksley@bigfoot.com> wrote: > > Why would that be? The simulation I built as a college project was faster > > than any of my classmates', so I'm a bit sceptical. > A stack machine will require more cycles per instruction. So by the cycles > per interction metric they are slower. My machine was single-cycle, and I computed the clock rate as 90MHz, better than any of the other machines in that class for which people computed a rate. Chuck Moore's chips follow a similar design, and he fabs them; they're single cycle per instruction, limited by memory bandwidth (except for the ADD instruction, which will return bad results if you don't let the numbers sit on the stack long enough for the carry to propagate). They run at peak 500MIPS, so roughly 100MHz (they're a bit asynchronous, but only between word loads -- there are 4 instructions per instruction word load). Stack machines should require less time per instruction, not more. They have no need for a register access stage, for example; the ALU can be directly connected to the TOS and nOS, and compute all results directly. Memory accesses can also be prefetched (to read from memory, first load the A register, then do other things until you think it's fetched, then read from the D register). But anyhow, both of these are single-cycle machines. >But perhaps a stack machine is simple enough that it can have a faster cycle >time, I would guess so -- it eliminates the register selection phase, thus allowing ALU results to be computed in parallel with instruction decoding. > or perhaps the instructions give a higher computation yield than risc >instructions. I doubt it -- most of them I've seen are fairly RISC. Although for my final assignment in that design class I had to implement a sort -- I added a perfect shuffle circuit onto the stack, and added an instruction to operate it. 
Four instructions to sort 16 elements (plus overhead to load and unload the elements, of course). Then, to add insult to injury, one of my fellow students complained to me that the TA graded him down for excessive program length in that assignment, and asked me what my program length was. He walked away dejected and convinced that the TA was right after all (his program was 800 bytes, mine was 90). Heh. -BillyArticle: 50483
"Martin Schoeberl" <martin.schoeberl@chello.at> wrote in message news:<VtuJ9.115181$A9.1351706@news.chello.at>... > > > Thats perfectly rigth. Things are a lot easier with a stack architecture > > > (but also slower since you need more cycles for basic operations than on > a > > > RISC CPU). > > > > Why would that be? The simulation I built as a college project was faster > > than any of my classmates', so I'm a bit sceptical. > > It's slower in terms of cycles/instructions for the same statement: > > A simple example: a = b + c; > > On a RISC CPU the chance is high, that local variables are allocated to > registers. So this statement compiles to somthing like: > > add r0, r1, r3 > > On a stack machine (like the JVM) this will compile to: > > load b > load c > add > store a Hi You seem to have missed the point of a stack machine. One very rarely uses local variables. One maintains the stack in such a way as to avoid using locals. Most stack machine instructions would just be: add In some cases, the stack order might not be right or you might need to fetch something else. Things like: over add In this case, it looks like more cycles but often the over operation can be optimized into the instruction itself, with no cycle penalty. Your thinking in terms of locals is the result of using languages that depend on them. When one programs a stack machine, one thinks more about optimizing data flow. The 'just in time' concept applies. It is just another way of thinking. Dwight > > > The advantage of Forth is that it's well-suited as-is for running > > hardware, and once you have it running Java can be implemented on top. I > would > > rather use Forth as a machine language than Java bytecodes. > > Isn't Forth a 16 bit system? Building a 32 bit JVM on top of this would be > not very efficient. > > MartinArticle: 50484
hmurray@suespammers.org (Hal Murray) wrote: > > A stack machine will require more cycles per instruction. So by the cycles > > per interction metric they are slower. > That's not obvious to me. What's an instruction? Who cares? Why > not measure time to execute a line of code? The old MIPS problem (Meaningless Indication of Processor Speed). Yes, instruction time is hard to judge -- but on a theoretical basis we can talk about different processors with equivalent instruction sets, and the comparison becomes meaningful. > > That's a point I like about stack machines. See the stack as a better, more > > efficient cach. And it's better predictable when it comes to real time > > systems. > Huh? How are you measuring efficiency? More predictable than registers? Yes, more predictable than registers; but more importantly, more predictable than random access memory. The top of the stack contains the things which will be needed soon, so store them in fast memory or registers; the middle of the stack will be needed eventually, so store it in medium-speed storage like RAM; and the rest of the stack will take a while to be needed, so allow it to be paged out if needed. > > And my thinking was, perhaps a stack machine meets these requirements better > > than a risc type machine. > How are you measuring goodness? Meeting requirements. -BillyArticle: 50485
The output of any register has to drive the input of the next block in your design. If the output is low then there is a sink current drawn from the next input, and if it is high the register is sourcing current to the next input block. Now, when the register is switching at a high frequency there are additional currents for charging and discharging the transmission lines (connection wires). And so on..... "Brad Eckert" <brad@tinyboot.com> wrote in message news:4da09e32.0212111000.599de814@posting.google.com... > I'm designing a soft CPU and I have a question regarding power > consumption. I know that in an ASIC, the power consumption is roughly > proportional to the clock rate. > > I think driving the capacitance of wires is mainly what consumes > power, so I'd like to keep them short, and try to keep long wires from > changing state often. > > My question is, do registers draw much power if you clock them but > their outputs don't change state?Article: 50486
"Martin Schoeberl" <martin.schoeberl@chello.at> wrote in message news:<BnGJ9.123098$A9.1424459@news.chello.at>... > Sorry, but my first statement was to simple and a mix of theory and praxis. > I'll try (as I can) to explain it in more detail: > > > Acutally I'm currently looking at implemeting a stack based architecture > > along with a RISC architecture. I think you miss the point a bit with > > your example. You say that local varaiables are allocated to registers > > for the RISC example, but then force the stack based design to load to > > the stack. Giving the stack based architecture the same advantage, > > that both operands are available at the top of the stack surely the > > optimised code would be :- > > > > add > > As I know (perhaps I'm wrong) in theory every computing problem can be > solved with a stack architecture without local variables. But for procedural > languages you need two stacks: one for the operands and one for the return > addresses. When you mix them in one stack you have to load the function > parameters on the current operand stack. > > And how is following example solved? > > f(a) { > b = 1-a; > return b > } > > assume paramters and return values on stack: > > push 1 > swap -- one 'extra' stack manipulation is necessery > sub > return Why wouldn't you include an instruction that did swap/sub as a single step? This is what is typically done in many of the Forth processors I've seen. > > Now to the practical thing: > > The JVM is a little bit inconsistent on usage of the stack. For function > calls the parameters must be pushed on the stack. But in the called function > they are accessed as 'locals'. I think this point comes from (perhaps wrong) > anticipation of the language designers to use one stack for data > (parameters) and the return addresses.
> > So a simple function like: > > int f(int a, int b) { > return a+b; > } > > translates to: > > Method int f(int, int) > 0 iload_1 > 1 iload_2 > 2 iadd > 3 ireturn ---snip--- Again, your mindset is setting your expectations. You are forcing the machine to do what you believe is the best way to represent the problem. Maybe Java is not the best example of a stack language to look at. In the above example, you've assumed that the two values need to be loaded. In a typical stack implementation, they are already there and are just consumed by the add and replaced by the result. Forth typically has two stacks but both input and result data are kept on the same stack. In Forth, the two stacks are used to keep program flow separate from data flow. DwightArticle: 50487
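Dwight's point -- that a real stack machine consumes its operands in place instead of shuttling them through locals -- is easy to see in a toy interpreter. This is a sketch of my own (Python standing in for hardware), not anything from the thread; `swap`, `over`, `sub`, and `add` follow their usual Forth meanings, with `sub` computing NOS minus TOS as assumed in the f(a) = 1 - a example above:

```python
def run(program, stack):
    """Tiny stack-machine interpreter: ops pop operands off `stack`
    and push results; ('push', n) pushes a literal."""
    for op in program:
        if isinstance(op, tuple) and op[0] == "push":
            stack.append(op[1])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "sub":                    # NOS - TOS
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "over":                   # copy NOS onto the top
            stack.append(stack[-2])
        else:
            raise ValueError(f"unknown op {op!r}")
    return stack

# f(a, b) = a + b: both arguments already live on the stack,
# so the whole function body is one instruction, as Dwight says.
print(run(["add"], [2, 3]))                    # -> [5]

# f(a) = 1 - a: the push 1 / swap / sub sequence from the thread.
print(run([("push", 1), "swap", "sub"], [4]))  # -> [-3]
```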
Is there a place where I can get HDL for the Hough transform? I would like to implement the algorithm for both circles and ellipses. ThanksArticle: 50488
"Ralph Mason" wrote: > But in a stack machine, one would hope that the operands you were working on > would be right on the top of the stack, thus the instruction is only > add > All the registers are implicit, thereby giving a more compact instruction > size. True. > The slowness comes in because of the work the stack machine must do to > perform that add > Fetch op one from stack (dec sp ) > Fetch op two from stack (dec sp) > Add ops > Push result to stack ( inc sp) Nope -- the mistake here is in assuming that the stack processor is only pretending to be a stack processor. Nope, it's a real stack processor; the ALU is gated directly to the two top-of-stack elements. If the stack's entirely on-chip, there are no fetch operations; if only the top few elements are on-chip, the fetch operations can occur in parallel with the add. > By my count this is 4 cycles, although I suppose you could use a dual port > stack and pop the arguments in one cycle. Although this increases the area > of the design, which doesn't fit with what I want to acheieve. It doesn't > look like it lends itself well to any kind of > pipelining either. Does the design I explained make sense? > Basically I want to make something that is > 1. very small (in area) > 2. As useful as possible (eg can fit lots of code) > 3. Powerful enough (how's that for scientific) > 4.Is easy to make a good development tool See the MuP21 and later chips for concrete examples. > Ralph -BillyArticle: 50489
Heiko, Use the following option to get the difference between two bitstreams: bitgen {all options used for design1.bit} -g ActiveReconfig:Yes -r design1.bit design2.ncd This creates the difference bitfile from the two ncd files. In this way you can see exactly what a partial bitstream size is for reconfiguration. Austin Heiko Kalte wrote: > Hi, > I am trying to estimate the partial bitstream size of a 201 slices > design part in a Virtex-II 500. I did a rough estimation for a > Virtex600E. Therefor I divied the Slices by 2 to get the Number of CLBs. > Afterward I divied this by the number of CLBs in a Column for a > Virtex600E. This leads to at least 2 columns (or more depends on the > floorplan). Each CLB column consists of 48 frames and each frame of > 960bit. Adding some initialization leads to 93504 config bit. > > The problem is that I did not found out the frames per CLB in a Virtex2, > there must be more than 48 because there are 4 slices in a CLB. > Additionally I nead the bit per frame for a Virtex-II 500. > Please Help me. > Heiko > > By the way is this calculation correct? > > > -- > --------------------------------------------------------------- > Dipl. Ing. H. Kalte | > HEINZ NIXDORF INSTITUTE | Office: F1.213 > System and Circuit Technology | Fon: +49 (0)5251 60-6459 > Fürstenallee 11 | Fax: +49 (0)5251 60-6351 > 33102 Paderborn, Germany | > --------------------------------------------------------------- > mailto:kalte@hni.uni-paderborn.de > http://wwwhni.uni-paderborn.de/sct/ > --------------------------------------------------------------- > > Home of the RAPTOR Rapid Prototyping Systems > http://www.RAPTOR2000.de/ > > ---------------------------------------------------------------Article: 50490
If it is just 8 bit video, you could just use a single BRAM as an 8 in to 8 out LUT and be done with it, that is if you don't need the BRAM for your line buffers. "Normand Bélanger" wrote: > "Open mouth, insert foot" > > I went back to look at the previous messages on this thread and saw > that you are right. I was replying to Philip Freidin message in which he > talked about LNS and floating point so I assumed that FP like precision > was needed; I should have taken a look a the OP first. > > Sorry for the confusion, > > Normand > > "Ray Andraka" <ray@andraka.com> a écrit dans le message de news: > 3DF7699F.D0D3A7C8@andraka.com... > > When I read imaging application, I was assuming 8 or 10 bit video, in > which case > > a 4 or 5 bit lookup is plenty. If it is for medical or surveillance, it > might > > have more bits per pixel, in which case a higher precision log might be > > desired. Could use block ram as a LUT for 8 bit look-up if you have the > block > > ram to spare, or you could go to either a divider-like structure similar > to the > > one Isreal Koren presents in his book, or to a two tiered LUT approach. > > > > "Normand Bélanger" wrote: > > > > > Agreed. I was under the impression that precision was needed in this > > > case so I suggested this. > > > > > > Normand > > > > > > "Ray Andraka" <ray@andraka.com> a écrit dans le message de news: > > > 3DF6BC7D.B9D0BB5A@andraka.com... > > > > Depends on the accuracy you desire and the resources at hand. The > quick > > > and > > > > dirty log I mentioned before is both faster and smaller than the > restoring > > > > arithmetic method described by Israel Koren in his book. > > > > > > > > "Normand Bélanger" wrote: > > > > > > > > > I'm currently working on an "FPU" like this (i.e. LNS computations). > > > > > The best way I know of computing a LOG is described in Prof. Koren > > > > > Computer arithmetic book in chapter 9 (if I recall correctly). 
It is > > > also > > > > > fairly easy to implement if you don't mind a significant latency. > > > > > > > > > > Good luck, > > > > > > > > > > Normand > > > > > > > > -- > > > > --Ray Andraka, P.E. > > > > President, the Andraka Consulting Group, Inc. > > > > 401/884-7930 Fax 401/884-7950 > > > > email ray@andraka.com > > > > http://www.andraka.com > > > > > > > > "They that give up essential liberty to obtain a little > > > > temporary safety deserve neither liberty nor safety." > > > > -Benjamin Franklin, 1759 > > > > > > > > > > > > -- > > --Ray Andraka, P.E. > > President, the Andraka Consulting Group, Inc. > > 401/884-7930 Fax 401/884-7950 > > email ray@andraka.com > > http://www.andraka.com > > > > "They that give up essential liberty to obtain a little > > temporary safety deserve neither liberty nor safety." > > -Benjamin Franklin, 1759 > > > > -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 50492
I don't believe so. It's been a while since I've cogitated at the transistor level, but my recollection is that power consumption arises as transistors move through their active region (from saturation to cutoff or from cutoff to saturation). If the bits of the register don't change then the transistor states don't change and you shouldn't have to pay the power piper. Sound right to the rest of you? Kip Ingram -- Get daily news and analysis from the FPGA market for pennies a day. Subscribe to The FPGA Roundup today: http://www.KipIngram.com/FPGARoundup.html -- "Brad Eckert" <brad@tinyboot.com> wrote in message news:4da09e32.0212111000.599de814@posting.google.com... > I'm designing a soft CPU and I have a question regarding power > consumption. I know that in an ASIC, the power consumption is roughly > proportional to the clock rate. > > I think driving the capacitance of wires is mainly what consumes > power, so I'd like to keep them short, and try to keep long wires from > changing state often. > > My question is, do registers draw much power if you clock them but > their outputs don't change state?Article: 50493
> As I know (perhaps I'm wrong) in theory every computing problem can be > solved with a stack architecture without local variables. But for procedural > languages you need two stacks: one for the operands and one for the return > addresses. When you mix them in one stack you have to load the function > parameters on the current operand stack. > > And how is following example solved? > > f(a) { > b = 1-a; > return b > } > > assume paramters and return values on stack: > > push 1 > swap -- one 'extra' stack manipulation is necessery > sub > return > There are some excellent points by others here as well, but I should also say that, with two stacks in your stack-based processor, the return instruction can actually be encoded into your sub, so that the actual instruction count is only: push 1 swap sub + return So your return is basically free, as it can be computed in parallel with the subtraction. -- Tim Simpson Ph.D Design Engineer (reply to address is not valid remove _xx_ to reply)Article: 50494
Hi Hal, > The PCI connector has power pins for 5V, and 3.3V, and also a > few more pins for IO power. No, there is no guarantee you will get 3.3V power, unless you are in a 3.3V slot! > They are either 3 or 5, depending > upon the signaling voltage, No, as well. The VIO pins are 3 or 5 depending on the signaling voltage, but the 5V pins are still 5V, and the 3.3V pins are only guaranteed to be 3.3V on a 3.3V slot. > the idea being that you can wire > them to the supply rail for your IO pads and make a board that > supports either 3V signaling or 5V, depending upon the power the > motherboard supplies on those pins. The power pins and VIO pins are separate. > The PCI connector has a plug that matches with a cutout on the > board. The plug goes in either of two positions (turn the connector > around), one for 5V signaling, the other for 3V. So in theory, > you can make three types of cards. The normal card in wide use > is 5V signaling, though they may only drive the outputs with a > 3V CMOS driver. You can also make 3V only card by putting the > cutout on the other end of the card. You can also make 3V/5V > cards by cutting out both slots and maybe wiring the IO pad > rail on your chip to the IO supply from the PCI connector. Yes, that is correct. > I've never seen any 3V or dual cards. I have done quite a few of them. > The main question I was trying to ask was if anybody had seen > any 3V or dual signaling level cards. If so, I might think more > about taking advantage of that. Since I didn't see many > encouraging responses I'll probably but this on the back burner. It depends on what your goal is. There is no need to do a 3.3V card for use in a standard PC, as they are all still 5V today. The only reason really for going to 3.3V is to go 66MHz. > Some early systems didn't actually supply any 3.3V power. You > can dance around that with an on-board regulator. I plan to > ignore that.
(But I'll check my systems first, just in case, > and listen for tales of troubles with not-so-early boards.) I would not ignore that. Most systems don't have the 3.3V power, not just "early" ones. > The 3V signaling rules overlap the 5V rules enough so that a > card that drives high to 3V will work in a 5V system. Well, not really. The important issue isn't voltage but the VI curve. > The Spartan-II is 5V tolerant but doesn't have DLLs. The -IIE > has DLLs, but doesn't tolerate 5V signaling. What do you want to use the DLL for? > Since 3V systems don't seem to be very popular, I probably won't > build a card expecting to find a 3V only slot. > > Several years ago, I put a scope on a system that had the connector > pegs set for 5V. I never saw anything go over 3V. Obviously that > depends upon what cards are plugged in. Somebody could add an > old/evil card that really does drive to 5V. > > For hack/research systems it might make sense to use a FPGA that > wasn't 5V tolerant on a card that could be plugged into a 5V system. > You would have to remember to get out the scope before adding a card > that hadn't been tested yet. I'm probably not desperate enough > to get the DLLs that I will do this. (But I'm still scheming.) Checking a voltage level has somewhat little to do with the actual signaling. It's far more than just the voltage level with PCI. I'm not quite sure what you're talking about above...but you really should read the signaling part of the PCI spec. > Thanks for the PLX suggestions. Their web site expects me to > register before they give me data sheets so I'll put that on the > back burner. Why is registering a problem? > Thanks for the heads-up about using DLLs on PCI clocks. Is > that a clear don't-do-that, or just another worm for the list? 
To be PCI spec compliant, don't do that...but I personally have NEVER seen a system that changes the PCI clock, except when switching from 33MHz to 66MHz (they all start out at 33MHz to check which cards are 66MHz compatible...or at least they're supposed to ;-). Sometimes they base the PCI clock on the FSB speed...100MHz FSB = 25MHz PCI clock...but it's static, and doesn't change once up and running. Regards, AustinArticle: 50495
I've been playing with the PCI development kit for a few months using Altera's Quartus II software. I keep running into problems trying to program the board using either JTAG or on-board flash with designs configured in Quartus with Altera's MT64 interface IP. The design simulates fine for target transactions (that's all I'm looking at right now). Altera's support people alternate between telling me to reinstall the drivers and rebooting my computer. If anyone is using the kit I would like to hear if it really works or not. Thanks Jason HannulaArticle: 50496
Stephen, > Normal consumer PC's most certainly have 3.3V power rails on their > PCI slots, You can't guarantee ALL PCI systems do that, consumer or not...and is something to be conscious of if you want to make a board that will not give you customer service issues. AustinArticle: 50497
"Martin Schoeberl" <martin.schoeberl@chello.at> wrote: >>> Thats perfectly rigth. Things are a lot easier with a stack architecture >>> (but also slower since you need more cycles for basic operations than on >>> a RISC CPU). >> Why would that be? The simulation I built as a college project was faster >> than any of my classmates', so I'm a bit sceptical. > It's slower in terms of cycles/instructions for the same statement: > A simple example: a = b + c; > On a RISC CPU the chance is high, that local variables are allocated to > registers. So this statement compiles to somthing like: > add r0, r1, r3 > On a stack machine (like the JVM) this will compile to: > load b > load c > add > store a That's not a stack machine -- that's a register machine with a temporary stack, probably the most wasteful decision possible. >>The advantage of Forth is that it's well-suited as-is for running >>hardware, and once you have it running Java can be implemented on top. I >>would rather use Forth as a machine language than Java bytecodes. > Isn't Forth a 16 bit system? Building a 32 bit JVM on top of this would be > not very efficient. No, it's whatever bittedness you want it to be. Most modern Forths match the processors they run on. There's a 20.5 bit processor whose instruction set is Forth (21 bit stack, 20 bit memory). > Martin -BillyArticle: 50498
Mathew Orman wrote: > The output of any register has to drive the input of the next block in your > design. If the output is low than the is a sink current drown from the next > input and if it is high > the register is sousering the current to the next input block. Not true. Nowadays we are all using CMOS technology, where the static current is essentially zero ( femtoamps inside the chip). The original question was whether a flip-flop draws dynamic current when it is clocked, but does not change state. And the answer is: a little power is spent by the clock line plus the clock input to the flip-flop wiggling, but obviously less than when Q also wiggles. BTW, Xilinx FPGAs automatically prune the permanently unused branches from the clock-distribution tree, in order to save power.. Peter Alfke, Xilinx ApplicationsArticle: 50499
Kip Ingram wrote: > > I don't believe so. It's been a while since I've cogitated at the > transistor level, but my recollection is that power consumption arises as > transistors move through their active region (from saturation to cutoff or > from cutoff to saturation). If the bits of the register don't change then > the transistor states don't change and you shouldn't have to pay the power > piper. > > Sound right to the rest of you? Yes, but you do have a clock budget to pay, to actually get the clock distributed to the (not changing) registers. So there are two elements in the power equation: clock routing, and output transitions. Note also with the latter, that glitches from combinatorial delay deltas will add power for no net logic - see some earlier posts about the % of power change from reducing glitches by better pipelining/floorplanning. The clock structure inside the FPGA should also be considered, and just how granular the routes are - typically there will be coarser steps, as a new row or column buffer gets enabled. -jg
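The two budget items Jim lists -- clock distribution and output toggling -- both follow the usual CMOS dynamic-power relation P = alpha * C * V^2 * f, where alpha is the activity (toggle) rate. A back-of-envelope sketch with invented numbers, showing how the clock-tree term persists even when register outputs stay static:

```python
def dynamic_power(cap_farads, vdd, freq_hz, activity):
    """Classic CMOS switching power: P = alpha * C * V^2 * f (watts)."""
    return activity * cap_farads * vdd**2 * freq_hz

VDD, FCLK = 1.5, 100e6            # invented example values, not device data

clock_tree = dynamic_power(200e-12, VDD, FCLK, 1.0)   # clock always toggles
outputs    = dynamic_power(500e-12, VDD, FCLK, 0.0)   # outputs held static

# Clocking idle registers still burns the clock-tree term; the data-output
# term only appears once the activity rate rises above zero.
print(f"{clock_tree*1e3:.1f} mW clock, {outputs*1e3:.1f} mW outputs")
```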