Goran Bilski wrote: > > > Please do. > > If you double all the registers in the data pipeline, hasn't you doubled the > pipeline? > Or is all functionality between the pipestages shared? Yes, you've doubled the pipeline by doubling the registers. In a non-pipelined design, you presumably have several layers of LUTs between registers. Adding pipeline registers is nearly free in the FPGA, since they are there but unused in slices used for combinatorial LUTs. By adding the pipeline stages you can crank up the clock considerably if you've balanced the delays between the registers. That in turn lets you have more than one thread in process at any given moment. All the functionality goes through the same path, it is just that the path has more registers in it, so that the next sample can be put into the pipeline before the previous one comes out. -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 48351
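A minimal Python sketch of Ray's point (the three-stage split and the function being computed are invented for illustration, not taken from any real design): the same combinational path, once cut by pipeline registers, accepts a new sample on every clock while earlier samples are still in flight, which is what leaves room for a second thread to share the hardware.

# Sketch: one combinational function split into three pipelined LUT layers.
# On every clock all registers update at once, so a new sample enters the
# path before the previous ones have come out the far end.

def stage1(x):          # first layer of LUTs
    return x * 3

def stage2(x):          # second layer of LUTs
    return x + 7

def stage3(x):          # third layer of LUTs
    return x ^ 0x55

def run_pipeline(samples):
    r1 = r2 = r3 = None                 # the pipeline registers (reset empty)
    results = []
    stream = iter(samples)
    for _ in range(len(samples) + 3):   # enough clocks to flush the pipe
        nxt = next(stream, None)
        # Next-state values, computed only from what the registers held
        # before this clock edge.
        n1 = stage1(nxt) if nxt is not None else None
        n2 = stage2(r1) if r1 is not None else None
        n3 = stage3(r2) if r2 is not None else None
        if r3 is not None:
            results.append(r3)          # a finished sample leaving the pipeline
        r1, r2, r3 = n1, n2, n3         # the clock edge: all registers load together
    return results

print(run_pipeline([1, 2, 3, 4]))       # -> [95, 88, 69, 70], one per clock once the pipe fills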
Hi, I'm developing an application using a Xilinx FPGA (XC4005) and the ISE Foundation software from Xilinx. I've got a little question: is there a way to delay a signal (by just a few nanoseconds) using the schematic editor? I tried to use a little "chain" of inverters, but this works only before implementation (because the tools perform logic minimization...). Thanks.Article: 48352
You can't just double the number of pipestages in a processor without major impacts. For a streaming pipeline (which is what hardware pipelines are) I agree, but for a processor that can't be done. Göran Ray Andraka wrote: > Goran Bilski wrote: > > > > > > > Please do. > > > > If you double all the registers in the data pipeline, hasn't you doubled the > > pipeline? > > Or is all functionality between the pipestages shared? > > Yes, you've doubled the pipeline by doubling the registers. In a non-pipelined > design, you presumably have several layers of LUTs between registers. Adding pipeline > registers is nearly free in the FPGA, since they are there but unused in slices used > for combinatorial LUTs. By adding the pipeline stages you can crank up the clock > considerably if you've balanced the delays between the registers. That in turn lets > you have more than one thread in process at any given moment. All the functionality > goes through the same path, it is just that the path has more registers in it, so that > the next sample can be put into the pipeline before the previous one comes out. > > -- > --Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. > 401/884-7930 Fax 401/884-7950 > email ray@andraka.com > http://www.andraka.com > > "They that give up essential liberty to obtain a little > temporary safety deserve neither liberty nor safety." > -Benjamin Franklin, 1759Article: 48353
Neil Franklin wrote: > > I care for whatever chip solves my problem. The 2 year old ones do so. That is my point. You are focusing on tools for "out of date" chips. I am talking about the fact that open source tools will always be out of date. If that is all you need, then great. But don't expect the design community to welcome open source with open arms. :) > > point all the *major* new designs that are going to be done with it > > *have* been done and every thing else is maintenance. > > Makes one wonder why Xilinx re-warmed their over 2 years old Virtex in > form of Spartan-II. After already doing the same trick with Spartan, > and have since repeated it with Spartan-IIE. > > Seem to sell, so there seem to be quite a few people who do not need > then newest possible. You clearly don't understand the FPGA market. The Spartan II parts are low cost versions of the Virtex parts. They are a major improvement over the previous Spartan parts. In that sense they are the "latest" parts for that market segment. > > > You can now see an actual project, started, that has an planned out > > > path leading to tools. > > > > I did check out the web site and I am not clear about where it is > > going. Can you explain in terms that an FPGA engineer can understand? > > I can only say stuff in whatever language I know. Which contains 74xx > logic, 8051s, Unix programming in C, shell/perl programming, and doing > FPGA in JBits. Sorry if your particular jargon is not supported. > > > > For such features many (including myself) are willing to sacrifice > > > using the latest chip family for the time needed to reverse-engineer, > > > dito also not wringing the last drop of power out of an chip. > > > > Many does not include the majority of FPGA engineers, IMHO. > > Even if that is majority, there still is the rest. And therefore the open source tools will be an also-ran, poor step child. > > > Screws from one manufacturer once did not fit nuts from an other, > > > about 120 years ago, due to everyone using their own threads. Once the > > > basic issues and design space in thread design were explored, standards > > > came. > > > > Bad analogy. It was in the best interest of the dozens or hundreds of > > manufacturer's interests to be compatible. > > Nope. It was in their best interst to lock in customers with incompatible > designs. It was users professional societies that forced the change in style. That is not correct. Even today electronics vendors know that they are better off making "standardized" parts because they are much more widely accepted and therefore sell better. > > > It is horses for courses. > > > > Not sure what that means, > > It means that what is best for one, is not automatically best for > another. And that the anther may be best served with something that > the first would never want. Yes, you are not an FPGA designer. You are on the fringe and you can use whatever you want. But this discussion was about the viability of open source tools and I think you will still find that they will not be well received by the FPGA design community. In the end they will be like Charles Moore and his small, but loyal following of Forth programmers. Not to say there is nothing good about Forth, but it is in very, very limited use even today. > > but I have seen $1000 FPGAs go on a board. If > > they had known that they would need a $2000 chip, the board would have > > never been designed. > > Absolutely not relevant for me, not for my project sizes. Yes, but that is the FPGA world. 
You are working in a different one where you can afford to spend manyears to develop your own tools to do a project that would otherwise be done in a few weeks. > > > > tools (unless you are counting the pennies). The expensive tools are > > > > the synthesis and simulation tools. > > > > > > I was talking of those. > > > > And you can live a rich full life without the $10,000+ tools. Right? > > Could you point out, where I am supposed to have claimed contrary to that? I believe you snipped your own statements and I have lost track of this. > I was pointing out, that your "less users, A and X will not bei able > to afford development" stuff was nonsense. That there exists people > prepared to pay large sums for tools proves that A and X can keep on > making tools, by simply rising their price a bit. My point is that X and A won't cooperate with open source tools so that the users keep buying their tools since it will be a large impact on them to lose that revenue. Maybe I am wrong. Maybe they don't care about the revenue. But they still care very much about having control of their in house tools and will not support open source efforts. They have said that they feel they have to provide support to anyone using their chips regardless of the tools they are using. So if open source tools are being used for their chips it creates problems for them in support. So they will try to stop it since they feel they can provide better tools anyway. > > I am not talking about Synthesis and Simulation. I am talking about the > > back end tools. If Xilinx loses > 50% of its tool market to open source > > or third party tools, I expect they will drop their inhouse tools. > > Or rise their price to 2 times what they are today. Still a lot less > than what some people are prepared to pay. > > > tools in the future or start charging more for the chips to make up the > > difference which would make it harder for them to compete. > > And so what? Their competitors (ASIC vendors) also have to pay for > their tool development. Both can then chose how to distribute the > costs over software licenses (less larger ones) or chip costs > (possibly more sales due to the open source software), or use some of > the open source stuff to reduce their costs. > > May the best win. > > > sucessful open source tools will drive FPGA vendors out of the tool > > market. Then they would have to compete on just the chips and be a > > CPU manufacturers seem to be living ok, on doing just that. You like to make this comparison. But there is very little to compare. > > > You are forgetting the most important part of open source: motivation. > > > It takes a lot of it. To accept sacrifice of the time to make software > > > without pay. That motivation must come from somewhere. Look for the > > > "somewheres" if you want to know where to look for the first appearances. > > > > Why would anyone be more motivated to develop backend tools? What is > > their value without the front end tools? > > As foundation to build front end tools on? > > The bitstream is the target. Back end makes that. From there on to > the front is ever increasing comfort. What will you feed into the backend? Output from the X or A front end? > > > > they are much more like the gcc target. But the back end tools are very > > > > different. *That* is why there are no third party back end tool > > > > vendors. > > > > > > They are software, like all other. I.e. designs to be specced, lines > > > of implementing program code to be written, work time to be spent. 
> > > That process is well understood. > > > > I am glad that you think *all* software is the same. > > All software in the end boils down to the same: time. Learning, > researching, coding, testing are all in the end time. Yes, but the time required to solve a problem depends greatly on the problem. Why do you think there are no third party back end tool vendors? > > matter of solving problems, not writing code. If you don't have good > > agorithms, you code will be lousy. Writing code is the *easy* part. > > Developing the algorithm is the hard part. > > It is in the end just time. > > > > Huh? I never said "all" CPUs, nor did I say all FPGAs. I clearly > > > stated 2 markets, "mass" and "specialist". > > > > What is your point? NO ONE can make a Xilinx compatible FPGA except > > Xilinx. NO ONE can make an Altera FPGA except Altera... > > I have my doubts. Where there is a will (enough money) competitors > will appear. You don't understand patent law, copyright and the economics of chip design and manufacture. How could anyone make a bitstream compatible FPGA? Can you explain how they would get around the legal IP issues? > > Just ask > > Clearlogic. :) > > AFAIK, they did not make FPGAs. They made ASICs that were layouted > automatically from Altera bitstreams. And it was not their ASICs that > got them into trouble, but rather that every single use of their > technology being helping Altera software licensees break the license. You misunderstand. The did nothing to "help break the license" other than to use the bitstream that came from an Altera tool. Likewise the innards of an FPGA are patented and otherwise protected IP. If you try to make an FPGA that is bitstream compatible you will either violate patents or end up with a very unworkable chip design or both. > So they are not particularly a good example for "cloning not possible". > > In fact in an hypothetical world with open source (no-Altera) tools, > user using them could develop on Altera and then manufacture on > Clearlogic, and those would not be violation licences, and so there > would be an non-infringing use for Clearlogic -> Altera loses case. Yes, but that is not an FPGA is it? The point is that Altera felt a threat and used their IP to shut them down. End of story. Do you know about the student who wrote an HDL version of an ARM processor? I don't remember the name, but he pulled it from the web after the ARM people had a chat with him. Same issues and he never once put it in silicon. > > > You claimed that binary incompatible is neccessary, and as such will > > > break tools, I pointed out that binary compatibility is possible in this > > > market, just as it turned out to be in CPUs, and sketched what its > > > result could look like. > > > > How is compatibility necessary in CPUs or FPGAs? It only exists in the > > x86 world because the cat is out of the bag. > > And who says it will not get out of the bag in the FPGA world? I do as well as X and A. How do you expect that to happen? > > > It is interesting that Altera did not chose the direct course of > > > attacking them with their patents on the actual chip technology. That > > > they needed to use such an indirect method of helping users breach the > > > the devel tools software license, which is less likely to succeed in > > > court, is telling us a lot. > > > > You are talking about making chips, now you are talking about the > > tools. 
> > Because the Clearlogic case was about tool missuse, or rather about > helping people missuse an tool, and no non-infringing use. > > > > > AMD could make parts that fit the Pentium socket because they had a > > > > license for that. After Socket 7 (IIRC) they no longer had that license > > > > and they now have to make their own interfaces. > > > > > > Socket 7 did not require any license. It is only with Slot 1 and later > > > Socket 370 that Intel introduced an patented signalling protocol (not > > > pinout) which required an license that they then refused to AMD. The > > > pinout is copyable, but useless it one can implement the signalling. > > > > No, you are confused. Socket 7 required a license, but AMD and several > > other companies already had that license due to manufacturing agreements > > that Intel had set up previously. They were later interpreted to > > include the pinout, the instruction set and even the microcode for > > processors up to the 386. > > AMDs license agreements only went up to 486. Even there they were not > complete (the ICE code was not covered). Socket 7 is Pentium, and so > not covered by any 486 stuff. > > Sockets can not be copyrighted, can not be patented, can not be > trademarked, so no protection. Signaling protocols can be patented, > that is what Intel then did on PentiumII. > > Same issue that they had with numbers not being trademarkable, so AMD > copied the 486 name with impunity. So Intel renamed the becomeing 586 > into Pentium, to prevent AMD being able to copy it. So you *do* understand that companies will protect their IP!!! Glad you could grasp this concept. > > > Also known as: do the most/first needed part first, show an actually > > > usable result, and accept that obsolence will happen and require an > > > "chase the moving target" attitude. gcc did/does this (different CPUs), > > > Linux did/does this (different computer architectures). Sure. > > > > I disagree that the backend is needed most or first. But then it is not > > my decision to make. > > Exactly. That is my descision. And my knowledge of the open source > community that runs into it. > > > > > Once you > > > > have built all the parts of the intended toolchain, what will the flow > > > > be? > > > > > > Tentatively (subject to changes while implementing): > > > > > > Users chosen language -> compiler (3rd party, multiple) > > > -> design reduced to LUT-sized elements, relative placed, their connection > s > > > reduced design -> vas (from my toolset) > > > -> design fitted to LUTs/F5s/etc, absolute placed, connections to PIP list > s > > > placed/routed design -> vm and libvirtex it calls (from my toolset) > > > -> .bit file to be used or displayed/debugged (using existing vd and vv) > > > > I don't understand any of this. What are you planning to do? > > If we do not have that much common language to base our discourse on, > I might as well give up. Good bye. -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 48354
Hi, "Nicholas C. Weaver" wrote: > In article <3DAD92AF.187507B4@Xilinx.com>, > Goran Bilski <Goran.Bilski@Xilinx.com> wrote: > >> I can send you a paper submission and a thesis chapter draft on the > >> subject if you want. > >> > > > >Please do. > > Done. > Thanks > > >If you double all the registers in the data pipeline, hasn't you doubled the > >pipeline? > >Or is all functionality between the pipestages shared? > > The functions between the pipeline stages remain unchanged, so one is > pipelining the computation on a finer grain. This can work fairly > well for FPGAs as the ratio of LUTs to FFs is 1/1, but that is usually > not the case in logic except for highly agressive designs. > The problem to share functionality is that you need muxes to select which thread to use. These muxes will be in most cases larger than the function itself and will definitely decrease the clock frequency of the processor. I still believe that multi-threading is useful for custom/ASIC implementation but multi-processor works better in FPGA: Göran > -- > Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 48355
Don't use delay elements. You will regret it. "Stevenson" <NOSPAMstevenson@infinito.it> wrote in message news:GEhr9.39661$RO.975695@twister1.libero.it... > Hi, > I'm developing an application using Xilinx FPGA (XC4005) and the ISE > foundation software from Xilinx. > I've got a little question: it exists a way to delay some signal (just a few > nanosec) using the schematic editor??? > I tried to use a little "chain" of inverters, but this work only before > implementation! (because it performs a logic minimization...). > > Thanks. > >Article: 48356
Hi, can we talk you out of this unhealthy endeavor? "Just a few ns" achieved by cascaded LUTs or routing delays will become a nightmare in a production design. Two years from now, you will complain bitterly when your design has become unreliable when you stuff it with newer parts. Try to find other ways. Using both edges of the clock is ok, and newer devices have phase-controlled clocks ( Virtex-II has DCMs that are very good at this). And BTW, why are you designing with technically obsolete parts? Newer families ( Spartan-II, Virtex, Virtex-II...) are better and cheaper, and have better software support... Peter Alfke, Xilinx Applications Stevenson wrote: > Hi, > I'm developing an application using Xilinx FPGA (XC4005) and the ISE > foundation software from Xilinx. > I've got a little question: it exists a way to delay some signal (just a few > nanosec) using the schematic editor??? > I tried to use a little "chain" of inverters, but this work only before > implementation! (because it performs a logic minimization...). > > Thanks.Article: 48357
Neil Franklin wrote: > > nweaver@ribbit.CS.Berkeley.EDU (Nicholas C. Weaver) writes: > > > rickman <spamgoeshere4@yahoo.com> wrote: > > > > >So? Few if any significant companies can't afford the low end tools > > >that will get the job done. Not all tools are 5 figure. In fact very > > >few are. What is your point? > > > > WHen you are paying the engineers $50k/year, so they cost you > > $100k/year, you aren't going to blanch at the ~$1000 for the back-end > > xilinx tools. > > IF you are in that price range. Sure. I am not. > > > >I am glad that you think *all* software is the same. Technology is a > > >matter of solving problems, not writing code. If you don't have good > > >agorithms, you code will be lousy. Writing code is the *easy* part. > > >Developing the algorithm is the hard part. > > > > Ohh, strongly disagree. Developing the algorthms and proving the > > concepts in code is the easy part (its prototyping), its turning or > > recreatingthat code into something robust an widely usable thats hard, > > IMO. > > So the main 2 critics disagree totally with each other. Not really. I am saying that good routing algorithms are not easy. If they were we would have better ones already. FPGA is not the only world that uses them. There is tons of money spent on making them better. Nicholas is saying that even when you pick an algorithm and get it coded, it is a much larger job to make it commercially acceptable. > > {flow munched} > > > > >I don't understand any of this. What are you planning to do? > > > > Back end tools. Starting with bitgen and working backwards. > > > > Oh, where was static timing analysis? > > Not mentioned in the post, as not part of the question I was answering. -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 48358
"Nicholas C. Weaver" wrote: > When you pay $2500 for foundation or $1100 for Alliance, you are > paying for the ability to call up and get (hopefully) clued people on > the other end of the line. GOOD LUCK!!! The best support I have gotten is in this newsgroup. When I call the support lines, I get people who are dying to get me off the phone so they can get credit for closing the case. When I mention some of the problems I have had with tools or parts I get emails not posted here. Too bad that is usually well after I need help. -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 48359
Neil Franklin wrote: > > nweaver@ribbit.CS.Berkeley.EDU (Nicholas C. Weaver) writes: > > > Neil Franklin <neil@franklin.ch.remove> wrote: > > > > > >Makes one wonder why Xilinx re-warmed their over 2 years old Virtex in > > >form of Spartan-II. After already doing the same trick with Spartan, > > >and have since repeated it with Spartan-IIE. > > > > > >Seem to sell, so there seem to be quite a few people who do not need > > >then newest possible. > > > > The latest Spartan xx is "What was the top part 1.5 process > > generations ago" in the sweet spot die sizes. > > And it (SPartan-IIE) is also very similar to the Virtex-E and as such > an easy target to expand to. But by the time open source is ready for the SpartanII parts, there will be SpartanIII or SpartanIV parts out with *no* open source tools. > > So why redesign the logic block? > > No need to. And that is why we have quite a few near bit compatible > families. > > > >> And you can live a rich full life without the $10,000+ tools. Right? > > > > > >Could you point out, where I am supposed to have claimed contrary to that? > > > > > >I was pointing out, that your "less users, A and X will not bei able > > >to afford development" stuff was nonsense. That there exists people > > > > Xilinx and Altera do NOT NOT NOT make money from their tools. > > And that makes rickmans "open source is a threat" argument totally moot. I have already made the point that even if the $$$ are not an issue, FPGA vendors *have* to have control over their tools and will have major hearburn over trying to support open source tools for their chip customers. So they will not cooperate with open source tools. Heck, it has only been the last year or so that they will support *running* the tools under Linux!!! > > >> tools in the future or start charging more for the chips to make up the > > >> difference which would make it harder for them to compete. > > > > > >And so what? Their competitors (ASIC vendors) also have to pay for > > >their tool development. Both can then chose how to distribute the > > >costs over software licenses (less larger ones) or chip costs > > >(possibly more sales due to the open source software), or use some of > > >the open source stuff to reduce their costs. > > > > > >May the best win. > > > > Actually, ASIC venders generally don't provide tools. TSMC etc just > > takes designs and pops out chips. Cadence etc provide the tools, > > which the customers directly pay for. > > Which in the end, for the customer still is just a total-cost > comparison. Whether they pay Cadence direct or via TSMC is not really > relevant. Just as it is irrelevant whether they pay Xilinx via > software costs or chip costs. That is totally not true. NRE and chip costs are totally different. Anyone using large quantities of chips will switch for a few cents difference in price. Not so for thousands of dollars of tools costs. > > >> What is your point? NO ONE can make a Xilinx compatible FPGA except > > >> Xilinx. NO ONE can make an Altera FPGA except Altera... > > > > > >I have my doubts. Where there is a will (enough money) competitors > > >will appear. > > > > Wait another decade, and THEN you may see Xilinx 4000 compatable > > parts. Patents are an enforced monopoly. > > I know enough about patent law. It is just a matter of money to get > around it (even if that means buying strategic patents to trip Xilinx > up and make it cheaper for them to license their stuff). That is BS. 
A large company can spend their way around a small company, but two large companies play games to the death. Where is the money to start a new FPGA going to come from... ? > > >> How is compatibility necessary in CPUs or FPGAs? It only exists in the > > >> x86 world because the cat is out of the bag. > > > > > >And who says it will not get out of the bag in the FPGA world? > > > > It doesn't even exist across parts, a V300 is NOT bitfile compatable > > with a V400. > > Depends on your definition of compatible. I regard them as identical, > modulo some size parameters (1 line in an table each). Even XC2S30 and > XCV300 are just table differences. Only from the E types on do we have > bit meanings change (seems to be limited to DLLs and IOBs). And XCVxxxE > to XC2SxxxE seems to also be just table lines. -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 48360
Nicholas C. Weaver wrote: > > In article <3DAD80F2.DC5AD4C4@Xilinx.com>, > Goran Bilski <Goran.Bilski@Xilinx.com> wrote: > >So with two threads in MicroBlaze, to double the pipeline is to > >double the size of MicroBlaze. You also have to double the > >instruction fetching data throughput in order to get the two streams > >busy. That would put a big burden on the bus infrastructure and > >external memory interface which suddenly has to double it's > >performance. The doubling of the pipeline and added control handling > >WILL also lower the maximum clock frequency of MicroBlaze. > > You don't need to double the exteral memory interface if you share the > cache, this is especially true on workloads where the threads are > related. The external memory interfare is now 2x the CLOCK, but you > could slow it down from there and arbitrate beween the two streams of > execution. Suppose the memory interface was optimised for synchronous/burst FLASH - doesn't this multi-threading fight a little with that interface? Or do you always need a multi-layer memory interface? > You also probably want to make the feeding of interrupts a little > different, so you can designate one thread as receiving the > interrupts. That sounds like a good idea, plus it implies SW access to thread steering, so small routines can also be tagged for a 'fast thread'. Is this the same scheme Intel has coming, where they claim 25% higher performance ( if the SW supports it :) -jgArticle: 48361
In article <3DADAAB7.E96D53F2@Xilinx.com>, Goran Bilski <Goran.Bilski@Xilinx.com> wrote: >You can't just double the number of pipestage for a processor without >major impacts. For streaming pipeline which hardware pipelines are I >agree but for processor that can't be done. Uhh, yes it can. Double all the pipeline stages, double the register file, rebalance the delays now that you have more pipelining, and out drops a 2-thread multithreaded architecture. Each single thread now runs slower, but aggregate throughput (sum of the two threads) is increased. It is so obvious yet unintuitive that nobody has actually DONE it before. :) -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 48362
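A rough behavioral sketch in Python of what Nicholas describes (the toy instruction set, register count and programs are invented, not MicroBlaze's): only the per-thread state, program counter and register file, is duplicated; decode and execute are shared; and the two threads simply own alternating clock cycles. The real hardware payoff, which a behavioral model cannot show, is that the doubled registers let the shared logic be pipelined more finely and clocked faster.

# Behavioral sketch of two instruction streams time-sharing one datapath.
# Per-thread state (PC, registers) is duplicated; decode and the "ALU" are shared.

THREADS = 2

# Two made-up programs: (opcode, dest, src_a, src_b_or_immediate)
programs = [
    [("li", 0, 5, None), ("addi", 1, 0, 3), ("add", 2, 1, 0)],   # thread 0
    [("li", 0, 10, None), ("addi", 1, 0, 1), ("add", 2, 1, 0)],  # thread 1
]

pc   = [0] * THREADS                       # per-thread program counters
regs = [[0] * 4 for _ in range(THREADS)]   # per-thread register files

def execute(tid, instr):
    """The shared execution logic: it serves whichever thread owns this cycle."""
    op, rd, a, b = instr
    if op == "li":
        regs[tid][rd] = a
    elif op == "addi":
        regs[tid][rd] = regs[tid][a] + b
    elif op == "add":
        regs[tid][rd] = regs[tid][a] + regs[tid][b]

cycle = 0
while any(pc[t] < len(programs[t]) for t in range(THREADS)):
    tid = cycle % THREADS                  # even cycles: thread 0, odd cycles: thread 1
    if pc[tid] < len(programs[tid]):
        execute(tid, programs[tid][pc[tid]])
        pc[tid] += 1
    cycle += 1

print(regs[0])   # [5, 8, 13, 0]
print(regs[1])   # [10, 11, 21, 0]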
"Theron Hicks (Terry)" <hicksthe@egr.msu.edu> wrote > I will keep the rework machine idea in mind if we ever need to go to the virtex2 > or other BGA parts. How does one inspect the solder joints to determine whether the > joints all have flowed correctly? How steep is the learning curve to mount the chip > consistently? I have one of these: http://www.ersa.de/ersaenglisch/additional/IR500/IRR.html Not too expensive. I have the IR reflow only, not the whole system) Too look under the BGA just by a small (5mm or less) optical right angle prism and use your binocular. MarcArticle: 48363
On big advantage of multi threading is that the pipeline interlocks can be eliminated if the number of threads is larger than the longest feedback path in the pipeline. For example, a branch instruction does not have to stall waiting for conditions from the preceeding comparison. This yields some boost in the total performance. Permanent state such as conditions codes, register files have to be expanded into larger memories with part of the index being the current thread id, but other registers mostly do not have to be modified. Given the distributed ram capabilities in Xilinx parts, this is pretty cheap. The first place I saw this was the CDC 6600 IO processors, which I belive ran 16 threads. - Ken Goran Bilski wrote: > Hi, > > "Nicholas C. Weaver" wrote: > > >>In article <3DAD80F2.DC5AD4C4@Xilinx.com>, >>Goran Bilski <Goran.Bilski@Xilinx.com> wrote: >> >>>Hi, >>> >>>Sort of. >>> >>>The complete decoding and the ALU is around 10-13% of the design. >>>The actual instruction decoding is less than 5%. >>> >>>Make it multithreading as I understand is to have more than 1 instructions >>>streams in the pipeline. >>>What is the benefit unless you double the pipeline and have two data pipelines? >>>Almost nothing >>> >>Uhh, you don't double the pipelines, you take the single pipeline, >>double up the registers IN them, and then move the regsters to >>rebalance all the pipeline stages, as now you have 2x the registers >>through any fedback loop, allowing you to up the clock frequncy alot. >> >>If you do this to every register in the core (and tweak the RF), a >>multithreaded design just sort of "dros out" automatically. >> >>You can even write a tool to do that automatically. >> >>What happens in the end is is you take adantage of the two threads to >>up the clock substantially. Each individual thread is now a little >>slower, but the throughput for the 2 threads is now substantiall >>higher. You use more pipelining and more power, and you may or may >>not end up thrashing the caches, but itdoes work. >> >>I can send you a paper submission and a thesis chapter draft on the >>subject if you want. >> >> > > Please do. > > If you double all the registers in the data pipeline, hasn't you doubled the > pipeline? > Or is all functionality between the pipestages shared? > > >>>So with two threads in MicroBlaze, to double the pipeline is to >>>double the size of MicroBlaze. You also have to double the >>>instruction fetching data throughput in order to get the two streams >>>busy. That would put a big burden on the bus infrastructure and >>>external memory interface which suddenly has to double it's >>>performance. The doubling of the pipeline and added control handling >>>WILL also lower the maximum clock frequency of MicroBlaze. >>> >>You don't need to double the exteral memory interface if you share the >>cache, this is especially true on workloads where the threads are >>related. The external memory interfare is now 2x the CLOCK, but you >>could slow it down from there and arbitrate beween the two streams of >>execution. >> > >>You also probably want to make the feeding of interrupts a little >>different, so you can designate one thread as receiving the >>interrupts. >> >> >>>Say you suddenly would like to have 5 threads instead of 2. That is a major >>>change of the multithreading MicroBlaze and almost impossible to get the >>>instruction fetching to keep up. With multiprocessing, just add another 3 >>>MicroBlazes and you're done. 
>>> >>What you do is you have a 1 thread and a 2 thread version (going >>beyond 2 threads seems to be less effective, maby 3 depending on the >>architecture). From the exterior, however, they still look normal. >>You can still tile that like any other core to create a multiprocessor >>machine. >> >> >>>BUT there is always a catch and that is how you write programs for these >>>systems. >>> >>"one thread for I/O, one thread for processing" does come up in some >>cases. >> >> >>>Göran >>> >>>Hal Murray wrote: >>> >>> >>>>>Another approach is to add multi-threading capabilities but I think that >>>>>multi-processing is better for FPGA than multi-threading. >>>>> >>>>Why? >>>> >>>>If I understand what multi-threading means, the idea is to interleave >>>>alternate cycles of two execution streams in order to reduce the >>>>losses due to stalls. >>>> >>>>It looks like it "just" requires an extra address bit (odd/even cycle) >>>>to the register file and the same bit selects between pairs of special >>>>registers like the PC. >>>> >>>>Are you telling me that the ALU and instruction decoding is small enough >>>>so that I might just as well build two copies of the whole CPU? >>>> >>>>-- >>>>The suespammers.org mail server is located in California. So are all my >>>>other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited >>>>commercial e-mail to my suespammers.org address or any of my other addresses. >>>>These are my opinions, not necessarily my employer's. I hate spam. >>>> >>-- >>Nicholas C. Weaver nweaver@cs.berkeley.edu >> >Article: 48364
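A small Python sketch of the register-file arrangement Ken describes (the thread and register counts are arbitrary, not from the CDC 6600 or MicroBlaze): all threads' architectural registers live in one memory, with the thread id forming the upper bits of the address, which maps naturally onto the distributed (LUT) RAM in Xilinx parts.

# One physical register file shared by all threads, addressed by
# {thread_id, register_number}.  Nothing else about the registers changes.

N_THREADS = 4      # e.g. as many threads as the longest feedback path is deep
N_REGS    = 32
REG_BITS  = 5      # log2(N_REGS)

regfile = [0] * (N_THREADS * N_REGS)      # a single small RAM

def reg_addr(thread_id, reg_num):
    # thread id concatenated above the register number
    return (thread_id << REG_BITS) | reg_num

def reg_read(thread_id, reg_num):
    return regfile[reg_addr(thread_id, reg_num)]

def reg_write(thread_id, reg_num, value):
    regfile[reg_addr(thread_id, reg_num)] = value

# Thread 2's r5 and thread 0's r5 are different words of the same RAM:
reg_write(2, 5, 0xABCD)
print(hex(reg_read(2, 5)), reg_read(0, 5))   # 0xabcd 0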
Hi, I agree that you can, if you also double the clock frequency of the pipeline, effectively creating sub-phases of the original clock. What I meant was keeping the same clock and just adding more pipestages. I have finally got the idea of multithreading, but it is not as easy to implement, since you need to find a good middle point in each pipestage that can divide the pipestage into equal parts. The control path also needs to be split into sub-parts, and you also need to find good points to break it up. The processor also definitely needs a cache that you can run at double speed, or with more ports, in order to get the data for each thread. I think that would be the largest obstacle for multithreading MicroBlaze: the number of ports on the BRAM is finite (2), and in my implementation the BRAM is almost already in the critical path. Göran Bilski "Nicholas C. Weaver" wrote: > In article <3DADAAB7.E96D53F2@Xilinx.com>, > Goran Bilski <Goran.Bilski@Xilinx.com> wrote: > > >You can't just double the number of pipestage for a processor without > >major impacts. For streaming pipeline which hardware pipelines are I > >agree but for processor that can't be done. > > Uhh, yes it can. > > Double all the pipeline stages, double the register file, rebalance > the delays now that you have more pipelining, and out drops a 2-thread > multithreaded architecture. Each single thread now runs slower, but > aggregate throughput (sum of the two threads) is increased. > > It is so obvious yet unintuitive that nobody has actually DONE it > before. :) > -- > Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 48365
In article <3DADB08B.6D8E@designtools.co.nz>, Jim Granville <jim.granville@designtools.co.nz> wrote: >Nicholas C. Weaver wrote: >> You don't need to double the exteral memory interface if you share the >> cache, this is especially true on workloads where the threads are >> related. The external memory interfare is now 2x the CLOCK, but you >> could slow it down from there and arbitrate beween the two streams of >> execution. > >Suppose the memory interface was optimised for synchronous/burst FLASH, >- doesn't this multi threading fight a little with that interface ? A bit, but not too bad. You want to redo the memory interface anyway to make the outside look like a normal, single-threaded part for convenience. >> You also probably want to make the feeding of interrupts a little >> different, so you can designate one thread as receiving the >> interrupts. > >That sounds a good idea, plus it infers SW access to thread steering, >so small routines can also be tagged for a 'fast thread'. > >Is this the same scheme Intel has comming, where they claim 25% higher >performance ( if the SW supports it :) It's not coming, it's present in the P4 Xeons, and present but untested and disabled in the desktop P4s. You want better software or at least scheduling to use it RIGHT, but it is there. Same concept (run 2 threads as 2 separate virtual processors on the same shared hardware), largely the same programmer implications (same working set good, different working sets BAAAAD, which is just the opposite of an SMP), totally different implementation: the Intel/UWashington (Hyperthreading/SMT) approach is to issue from 2 instruction streams into a superscalar core, as any one stream actually has pretty crappy utilization of the functional units. Intel doesn't even increase the physical registers (which are much more numerous than the architectural registers). C-slow multithreading is to double up the pipeline, issuing from two threads on an even/odd basis, so each thread now runs a little slower, but the finer pipelining allows the system to run faster. -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 48366
In article <3DADB64E.CE568067@Xilinx.com>, Goran Bilski <Goran.Bilski@Xilinx.com> wrote: >Hi, > >I agree that you can if you also double the clock frequency of the pipeline, >creating parts of the normal clock. >What I meant was the keeping the same clock and just adding more pipestages. Yeah, that's a loss generally. You need to add more stages around all feedback loops to really improve things, and that is what the multithreading allows. >I have finally got the idea of multithreading but it not as easy to >implement since you need to find a good middle point in each >pipestage that can divide the pipestage into equal parts. The >control path is also needed to split into subparts and you also need >to find good points to break it up. You can actually write a tool to do it all automatically, especially if you are willing to say "screw initial conditions". -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 48367
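A toy Python sketch of the kind of automatic transformation Nicholas is alluding to (the netlist format and names are invented, and the genuinely hard parts, retiming the added registers into the logic and preserving initial state, are exactly what is left out): replace every flip-flop with a chain of C flip-flops, after which C independent threads can occupy the design on successive clocks.

def c_slow(netlist, c):
    """Replace every flip-flop with a chain of `c` flip-flops.

    `netlist`: dict mapping a node name to ("lut", [inputs]) or ("ff", [input]).
    A later retiming pass (not shown) would spread the added registers into
    the combinational paths to actually raise the clock rate.
    """
    out = {}
    for name, (kind, inputs) in netlist.items():
        if kind != "ff":
            out[name] = (kind, inputs)
            continue
        prev = inputs[0]
        for i in range(c - 1):                 # insert c-1 extra registers
            stage = f"{name}_cslow{i}"
            out[stage] = ("ff", [prev])
            prev = stage
        out[name] = ("ff", [prev])             # the original output name stays last
    return out

# A one-register accumulator loop: acc <= lut(acc, din)
netlist = {
    "sum": ("lut", ["acc", "din"]),
    "acc": ("ff",  ["sum"]),
}
print(c_slow(netlist, 2))
# {'sum': ('lut', ['acc', 'din']),
#  'acc_cslow0': ('ff', ['sum']),
#  'acc': ('ff', ['acc_cslow0'])}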
>The problem to share functionality is that you need muxes to select which thread >to use. >These muxes will be in most cases larger than the function itself and will >definitely decrease the clock frequency of the processor. I don't think we are on the same wavelength yet. The idea is not to add those muxes, but to have two threads running through the same heavily pipelined structure on alternating cycles. The extra pipeline registers are free on most FPGAs. They let you run with a faster clock rate. But if you only have one thread you will waste many of those new cycles on stalls. You can get back those cycles if you let another thread use them. Only around the edges do you need new muxes - a wider register file and new mux/enables on registers like the PC. > I still believe that multi-threading is useful for custom/ASIC implementation but > multi-processor works better in FPGA: This isn't rocket science. All we need is to compare 2x the LUTs used by a single-threaded design with a multi-threaded design. (Scaled by clock speed if that changes.) -- The suespammers.org mail server is located in California. So are all my other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited commercial e-mail to my suespammers.org address or any of my other addresses. These are my opinions, not necessarily my employer's. I hate spam.Article: 48368
> BUT there is always a catch and that is how you write programs > for these systems. Standard programming problem. People are getting pretty good at it. Yes, there are lots of applications where it doesn't work. If you can't take advantage of multi-threading then you wouldn't be able to use multi-processing either. -- The suespammers.org mail server is located in California. So are all my other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited commercial e-mail to my suespammers.org address or any of my other addresses. These are my opinions, not necessarily my employer's. I hate spam.Article: 48369
In article <aoheo3$2q0$1@news.storm.ca>, "jakab tanko" <jtanko@ics-ltd.com> wrote: > Hello, > > Does anybody know how to (safely) drive Virtex2 > inputs with 5V TTL levels? > > Thanks, > > jakab A solution I know that works is to use QuickSwitches.
- QuickSwitches are CMOS transmission gates.
- They are available in buffer or multiplexer configurations.
- When enabled, they look like a 5 ohm series resistance between the A and B ports (the two "sides" of the device).
- When enabled, the propagation delay through the device is 250pS (0.25nS), so they have a minimal impact on timing.
- They provide no buffering, so the drive strength of your solution is determined solely by the devices on either side of the QuickSwitch (minus the extra 5 ohms of resistance).
- QuickSwitches draw no power (except for a brief transient when the device toggles between logic states). Hence, they can be powered from moderate impedance power supplies, as long as a bypass cap is connected directly to the VCC pin on the QuickSwitch.
An application note was written for one of these devices years ago, which introduced the concept of connecting 3.3v to 5.0v devices without worrying about the 5v tolerance properties of the 3.3v device. The trick is to connect the VCC on the QuickSwitch to a 4.3 volt power supply. The QuickSwitch can only make a voltage output roughly 1 volt below the VCC supply, so by powering with a 4.3 volt supply, signals passing through the device cannot exceed 3.3 volts. To make a 4.3 volt supply, connect a silicon diode from your 5V supply to a 470 ohm resistor to ground. This will draw 10ma to ground, and create a roughly 0.7volt drop across the diode. Bypass the voltage across the 470 ohm resistor with a 0.1uf capacitor. I tried this with a 74CBT3125H (an octal device). I bought a product the other day that used a 34X245. This would condition 32 signals at the same time. See this link for a high density solution. I have no idea what the price is on one of these buffers. http://www.idt.com/docs/QS34X245_DS_79760.pdf
The schematic, in words: a 1N4004-type diode from the 5.0 volt rail feeds a 470 ohm resistor to ground; the 4.3 volt junction is bypassed with a 0.1uf capacitor and connected to the VCC pin of the QuickSwitch; Device A (the 3.3 volt part) connects to the An port, Device B (the 5.0 volt part) connects to the Bn port, and the OE pin is tied to ground so the switch is permanently enabled. Note that, when the OE pin is enabled, the two sides are connected together. You are still responsible for ensuring that "Device A" and "Device B" do not enable their drivers at the same time. If you choose to drive the OE pin with logic, be aware that the turn-on time of the QuickSwitch is on the order of 7nS, so the zero-delay property of the device only happens if you leave it permanently turned on. HTH, Paul P.S. Any openings at ICS? I'm available and I live in town :-)Article: 48370
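The numbers in the bias network above can be checked with a few lines of arithmetic; a quick Python version follows (the 0.7 volt diode drop and the roughly 1 volt switch headroom are the nominal figures from the post, not measured data).

# Quick check of the numbers in the QuickSwitch level-translation trick.
V_SUPPLY      = 5.0    # volts, the 5 V rail
V_DIODE       = 0.7    # volts, nominal drop across the silicon diode
R_BIAS        = 470.0  # ohms, resistor from the diode cathode to ground
V_SWITCH_DROP = 1.0    # volts, roughly how far below VCC the switch output can reach

v_rail    = V_SUPPLY - V_DIODE       # VCC seen by the QuickSwitch
i_bias    = v_rail / R_BIAS          # current the bias network pulls from the 5 V supply
v_max_out = v_rail - V_SWITCH_DROP   # highest level that can pass through the switch

print(f"QuickSwitch VCC  : {v_rail:.1f} V")        # 4.3 V
print(f"Bias current     : {i_bias*1000:.1f} mA")  # about 9.1 mA (the post rounds this to 10 mA)
print(f"Max output level : {v_max_out:.1f} V")     # 3.3 V, safe for the Virtex-II input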
Hal Murray wrote: > > >Processors in FPGAs has to be handle more delicate than ASIC processor due to > >forwarding in pipeline could easy remove all benefits gain by more pipeline stages. In > >FPGA a mux cost as much as an ALU which is not the case for ASIC or custom design. > > > >Another approach is to rely on advanced compiler techniques for handling all the > >pipeline hazardous but it would make it almost impossible to program the processor in > >assembler since the user has to do the handling. > >I personally don't think that this approach would gain that much more performance than > >MicroBlaze and you have to spend a lot of resources on the compiler which could be > >used for other stuff. > > This seems like an interesting opportunity for an open source project. Aren't there already CPUs in FPGA open source projects? http://www.fpgacpu.org/ http://www.opencores.org/ The list is getting pretty long. -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 48371
rickman <spamgoeshere4@yahoo.com> writes: > Neil Franklin wrote: > > > > I care for whatever chip solves my problem. The 2 year old ones do so. > > date. If that is all you need, then great. But don't expect the design > community to welcome open source with open arms. :) Did I ever do thet? I only said that I was doing something. You then decided to jump on me. Perhaps they are not for you, perhaps you will never use them. I could not care. I dont't make them for everyone. I make them for those people who want to use them. And I know that such people exist. Some of them outside of todays FPGA market. FPGAs are growing, not just in chip size, but also into new user groups. Some of them are lookign for what I am making. > > Seem to sell, so there seem to be quite a few people who do not need > > then newest possible. > > You clearly don't understand the FPGA market. I expect that Xilinx does understand it. They launched Spartan-II. I will take their estimate over yours. > > > > It is horses for courses. > > > > > > Not sure what that means, > > > > It means that what is best for one, is not automatically best for > > another. And that the anther may be best served with something that > > the first would never want. > > Yes, you are not an FPGA designer. You are on the fringe and you can > use whatever you want. But this discussion was about the viability of > open source tools and I think you will still find that they will not be > well received by the FPGA design community. For me it is, and has allways been, about making them, and those people I know who intend to use them. It you were labouring under the impression that I intended to take over the market, then you were reading my stuff wrong. > > I was pointing out, that your "less users, A and X will not bei able > > to afford development" stuff was nonsense. That there exists people > > prepared to pay large sums for tools proves that A and X can keep on > > making tools, by simply rising their price a bit. > > My point is that X and A won't cooperate with open source tools so that > the users keep buying their tools since it will be a large impact on > them to lose that revenue. Maybe I am wrong. Maybe they don't care > about the revenue. Given that they put out Webpack (and Alteras counterpart) for free, I doubt they will feel any financial impact. > have said that they feel they have to provide support to anyone using > their chips regardless of the tools they are using. Simply point out, that it is not out tool - no support - come back if problem also happens with our official tool. Open source users understand this. > > The bitstream is the target. Back end makes that. From there on to > > the front is ever increasing comfort. > > What will you feed into the backend? Output from the X or A front end? At present just interest to feed in my own simple language. May add XDL if that is sufficiently interesting. > > > > Huh? I never said "all" CPUs, nor did I say all FPGAs. I clearly > > > > stated 2 markets, "mass" and "specialist". > > > > > > What is your point? NO ONE can make a Xilinx compatible FPGA except > > > Xilinx. NO ONE can make an Altera FPGA except Altera... > > > > I have my doubts. Where there is a will (enough money) competitors > > will appear. > > You don't understand patent law, copyright and the economics of chip > design and manufacture. You seem to be good at missestimating. I actually know law quite well. To the extent that I am usually the persone everyone around here ask for legal advise. 
Most likely due to me having actually read the relevant law texts. > How could anyone make a bitstream compatible > FPGA? Can you explain how they would get around the legal IP issues? As I said: worst case get an license, in absolute worst case by tripping up one of the existing players (if they blankly refuse). IP law can be very interesting in that respect. Also anyone with enough financial interest can actually take an patent to court and have it declared irrelevant on an whole range of issues. It takes time and cost. But if enough profit are waiting, things like that happen. Lesser case: You do know, that nearly all the fundamental patents in FPGAs appeared around/pre 1985 (XC2000) and are now nearing their 17 year, and so at end of life? Give a few years (needed for any hypothetical bit-compatible scenario that makes cloning interesting anyway), and quite a lot of them will be gone. Don't forget that then the only patents remaining are detail patents, i.e. on the actual implementation. And that can be varied, without losing bit compatibility. The situation is getting simpler the longer time goes. Also you may want to take into account, that Altera managed to survive Xilinxes patents, despite starting when they Xilinx had maximal protection, and with Altera an latecomer. Any new competitor has an easier situation. And an further scenario: assume bit compatible becomes important. Either X or A is the winner in becoming the standard. How long do you think will the other of the 2 look at declining sales, until they clone? And we already know that a patent battle between them 2 ends in stalemate. The short conclusion: IP law is in no way the "no chance" you seem to regard it as. In particular when one has got enough money to run through an dedicated battle. > > > Just ask > > > Clearlogic. :) > > > > AFAIK, they did not make FPGAs. They made ASICs that were layouted > > automatically from Altera bitstreams. And it was not their ASICs that > > got them into trouble, but rather that every single use of their > > technology being helping Altera software licensees break the license. > > You misunderstand. The did nothing to "help break the license" other > than to use the bitstream that came from an Altera tool. And that is exactly the entire meaning of "help break the license". Offering an service that is auxillary to an crime, without any other legal use for that service. Look up "contributary infringement" if you want an interesting read. I understand the legal concept here very well, having read quite a bit on the Napster/Kazaa/etc cases and the deCSS/2600/websites cases, and the argumentation against them. > Likewise the > innards of an FPGA are patented and otherwise protected IP. If you try > to make an FPGA that is bitstream compatible you will either violate > patents or end up with a very unworkable chip design or both. You can get around an patent. Altera survived Xilinxes ones. AMD has wrung patents off of Intel, by tripping them over other stuff. Via has stopped Intel attacks by tripping them up. Ask an good IP lawyer about all the possibilities. IP law is not the clear "you lose" that you believe it to be. In fact the very name IP is an error, they are no property, but rather privileges, granted for very specific terms. And many patents do not fit those terms, and only survive because being not challenged, because fighting them is not profitable. Add an good slice of potential profit and the overturning starts. 
> > So they are not particularly a good example for "cloning not possible". > > > > In fact in an hypothetical world with open source (no-Altera) tools, > > user using them could develop on Altera and then manufacture on > > Clearlogic, and those would not be violation licences, and so there > > would be an non-infringing use for Clearlogic -> Altera loses case. > > Yes, but that is not an FPGA is it? The point is that Altera felt a > threat and used their IP to shut them down. The point is that they only just managed. That IP is not the surefire "end of competition" you seem to regard it as. > o you know > about the student who wrote an HDL version of an ARM processor? I don't > remember the name, but he pulled it from the web after the ARM people > had a chat with him. Yes, I know the case. I also know that IP owners (and any other financially strong party) can push financially weak opponents aside, ever if the patent would not hold up. AFAIK ARMs patent claim was not that strong, they had luck that their opponent was weak or possibly simply not interested (better stuff to do than fight[1]) and caved in. MIPS had a stronger case against a cloner, but their patent is also near/in EOL. [1] A motive I know well, having also given up against a weak trumped up claim (of violating data protection laws), because the cost (in time, the case was classic multiple appeals type stuff, with an good chance of going right up to the supreme court) of defending was larger than the loss of giving in. And yes, that is an other reason why I know quite a lot about law. > > Sockets can not be copyrighted, can not be patented, can not be > > trademarked, so no protection. Signaling protocols can be patented, > > that is what Intel then did on PentiumII. > > > > Same issue that they had with numbers not being trademarkable, so AMD > > copied the 486 name with impunity. So Intel renamed the becomeing 586 > > into Pentium, to prevent AMD being able to copy it. > > So you *do* understand that companies will protect their IP!!! Glad you > could grasp this concept. I understand IP law very well. Glad that you have noticed it. -- Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/ Hacker, Unix Guru, El Eng HTL/BSc, Programmer, Archer, Roleplayer - hardware runs the world, software controls the hardware code generates the software, have you coded today?Article: 48372
No, you don't need muxes. The process is time division multiplexed in the same hardware. It is just a matter of keeping track where the pieces of each thread are in relation to one another. As long as all parts have the same depth through hardware loops you get that depth worth of multithreading. The only place muxes may be needed is for selecting outside inputs. Goran Bilski wrote: > Hi, > > "Nicholas C. Weaver" wrote: > > The problem to share functionality is that you need muxes to select which thread > to use. > These muxes will be in most cases larger than the function itself and will > definitely decrease the clock frequency of the processor. > > Göran > > > -- > > Nicholas C. Weaver nweaver@cs.berkeley.edu -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 48373
It can, if each phase of the pipeline is assigned to a different thread. Goran Bilski wrote: > You can't just double the number of pipestage for a processor without major impacts. > For streaming pipeline which hardware pipelines are I agree but for processor that can't > be done. > > Göran > > Ray Andraka wrote: > > > Goran Bilski wrote: > > > > > > > > > > > Please do. > > > > > > If you double all the registers in the data pipeline, hasn't you doubled the > > > pipeline? > > > Or is all functionality between the pipestages shared? > > > > Yes, you've doubled the pipeline by doubling the registers. In a non-pipelined > > design, you presumably have several layers of LUTs between registers. Adding pipeline > > registers is nearly free in the FPGA, since they are there but unused in slices used > > for combinatorial LUTs. By adding the pipeline stages you can crank up the clock > > considerably if you've balanced the delays between the registers. That in turn lets > > you have more than one thread in process at any given moment. All the functionality > > goes through the same path, it is just that the path has more registers in it, so that > > the next sample can be put into the pipeline before the previous one comes out. > > > > -- > > --Ray Andraka, P.E. > > President, the Andraka Consulting Group, Inc. > > 401/884-7930 Fax 401/884-7950 > > email ray@andraka.com > > http://www.andraka.com > > > > "They that give up essential liberty to obtain a little > > temporary safety deserve neither liberty nor safety." > > -Benjamin Franklin, 1759 -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 48374
We do it all the time in our DSP designs. Granted, they are not typically microprocessors in the traditional sense, but the fact is it is doable. Basically what happens is that if you have a single thread, you have extra clock cycles between each instruction to allow time for the pipeline propagation. Let's take a really simple case where you are just pulling data out of a register, conditionally adding 1 to it and putting it back. In a non-pipelined version you can increment a particular value on every clock. If you add pipelining to the data path, the result of the first increment is not available in the memory to be used again for N clocks, where N is the depth of the pipeline. The pipelining does allow you to run the clock faster, but it actually slows the processing of that one memory location since the process has to be stalled until the memory is updated. However, you can use the clock cycles in between to increment other locations in memory, so long as you don't use any updated values before they become available again. In a sense then, you can partition the memory into N areas, each of which is accessed only 1 in N cycles. Each of the cycles is then a different 'thread' running on the same processor. The result is that no one thread is any faster than the unpipelined processor (it is actually a bit slower because you add set-up and clock-Q times plus slack for each register you add to the pipeline), but because the path is pipelined you can have more than one thread being operated on at a time (each with a one clock skew in relation to the previous). This is extendable to a general purpose processor as long as the pipeline depth is consistent for all instructions. "Nicholas C. Weaver" wrote: > In article <3DADAAB7.E96D53F2@Xilinx.com>, > Goran Bilski <Goran.Bilski@Xilinx.com> wrote: > > >You can't just double the number of pipestage for a processor without > >major impacts. For streaming pipeline which hardware pipelines are I > >agree but for processor that can't be done. > > Uhh, yes it can. > > Double all the pipeline stages, double the register file, rebalance > the delays now that you have more pipelining, and out drops a 2-thread > multithreaded architecture. Each single thread now runs slower, but > aggregate throughput (sum of the two threads) is increased. > > It is so obvious yet unintuitive that nobody has actually DONE it > before. :) > -- > Nicholas C. Weaver nweaver@cs.berkeley.edu -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759
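Ray's increment example can be simulated directly; here is a rough Python model (the pipeline depth of 4 and the addresses are arbitrary choices). Hammering one location every clock loses updates, because each read happens before the previous write has emerged from the N-deep pipeline; rotating over N locations, one per clock slot, which is exactly N interleaved threads, never reads stale data.

# Simulation of the increment example: the read-increment-write path is N clocks deep.

N = 4                      # pipeline depth of the read-modify-write path

def run(addresses, cycles):
    """Issue one increment per clock to addresses[cycle % len(addresses)]."""
    mem = {a: 0 for a in addresses}
    in_flight = []                          # (finish_cycle, address, new_value)
    for cycle in range(cycles):
        # writes emerging from the far end of the pipeline this cycle
        for item in [x for x in in_flight if x[0] == cycle]:
            _, addr, val = item
            mem[addr] = val
            in_flight.remove(item)
        addr = addresses[cycle % len(addresses)]
        # read now; the incremented value is written back N clocks later
        in_flight.append((cycle + N, addr, mem[addr] + 1))
    for _, addr, val in in_flight:          # let the pipeline drain
        mem[addr] = val
    return mem

print(run([0], 16))            # {0: 4}  - one location: 3 of every 4 increments are lost
print(run([0, 1, 2, 3], 16))   # {0: 4, 1: 4, 2: 4, 3: 4} - 4 interleaved threads: none lost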