Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
"Austin Lesea" <austin@xilinx.com> wrote:
>
> Something I just couldn't find anywhere was the actual performance of
> the x86 co-processor for something like a floating point square root.

SQRT is generally done by software. Here is the number of clocks needed:
http://www.intel.com/software/products/mkl/data/vml/functions/sqrt.html

When you can compute several square roots at once (as a vector), the SIMD instructions (SSE) make things better. On a vector of 1000 single-precision floats, an Itanium 2 takes only 5.14 cycles/sqrt.

> We have clock cycles for each IEEE floating point operator, and the
> speed of the synthesized placed and routed core for various families,
> from Spartan 3 to Virtex4 in that pdf file.
>
> I suppose uP software people don't really care about performance in
> terms of cycles or ns or mops....its all about what game screen graphics
> are displayed in the coolest fashion....
>
> Does anyone have a link to such a site that has 'real' data of floating
> point op performance?

Look at the IA-32 Intel Architecture Optimization Reference Manual:
ftp://download.intel.com/design/Pentium4/manuals/24896612.pdf
In particular, Appendix C: IA-32 Instruction Latency and Throughput.

For instance, the latency of an FPMUL is 7 clocks and you can issue new operands every 2 clocks (well, 2 multiplies in parallel with SSE3).
There is an FPSQRT with a latency of 40 clocks and a throughput of one result every 40 clocks.

The MULT is much faster than on a V4, but there can be more than one FPU in a V4 ;-)

Marc

Article: 88501
Here we go again. Either one will do quite well with floating point, but the performance is entirely dependent on the implementation. There is nothing significant that is native to either one that makes one better than the other for floating point. That said, the 18 x 18 embedded multipliers in both families are a PITA for IEEE floats. It would be much easier with 24x24 multipliers. Once again, I could easily code an RTL design that is "portable" so that it favors either family, especially when the DSP elements are involved.

Austin Lesea wrote:

> Marc,
>
> IEEE floating point standard? You need to be more specific.
>
> Does it need to integrate with a processor?
>
> I believe the Xilinx IBM 405 Power PC using the APU interface in
> Virtex 4 with the floating point IP core provides the best and fastest
> performance.
>
> Especially since no other FPGA vendor has a hardened processor to
> compete with us.
>
> If all you want is the floating point processing, without a
> microprocessor, then I think you will find similar performance between
> Xilinx and our competition, with us (of course) claiming the superior
> performance edge.
>
> It would not surprise me at all to see them also post claiming they
> are superior.
>
> For a specific floating point core, with a given precision, for given
> features, it would be pretty easy to benchmark, so there is very
> little wiggle room here for marketing nonsense.
>
> I would be interested to hear from others (not competitors) about what
> floating point cores they use, and how well they perform (as you
> obviously are interested).
>
> Austin
>
>
> Marc Battyani wrote:
>
>> Hello,
>>
>> Has anybody already made a comparison of the high performance FPGAs
>> (Stratix
>> II, V4, ?) relative to double precision floating point performance (add,
>> mult, div, etc.) ?
>>
>> It's for an HPC application.
>>
>> Thanks
>>
>> Marc
>>
>> --

--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 88502
In article <11gcl358r608m9e@corp.supernews.com>,
Marc Battyani <Marc.Battyani@fractalconcept.com> wrote:
>I find it somewhat depressing to see that Cray can't come up with something
>much better than a bunch of FPGAs but at the same time it's very cool to
>have access to the same technology as Cray. Or even better as they seem to
>use Virtex II :)

Remember that Cray are not, in an absolute sense, a very large company: market cap about a hundred million, sales fifty million in a good quarter, and very little of that is profit. Xilinx could buy them with six weeks of their profit.

The Cray XD1 with the FPGAs in it is basically a product of OctigaBay, a smallish Canadian start-up that Cray bought outright eighteen months back when Cray's shares were worth rather more; there's one impressive ASIC in there, and that's the big system router.

I admit to daydreams of a fractal-generating board about the size of a postcard tiled with XC3S1500 chips; six 17x17 multipliers make a 3:48 x 3:48 multiplier by a Karatsuba-like process, which is enough for reasonably deep zooming into most complex-plane fractals. Four of those fit in one $75 chip with multipliers left over to assist the divider, six of those chips on a board ... but I never got the state machines to drive them working correctly in ModelSim, let alone on a physical chip, let alone on a card-full, and I wouldn't know where to start with the PCB layout.

[also, I doubt there's a market for $500 Newton-set generators, except possibly as a *shiny*maths* exhibit in museums, of which there are not all _that_ many]

Tom

Article: 88503
In article <WjtNe.1215$L03.637@newssvr27.news.prodigy.net>,
Austin Lesea <austin@xilinx.com> wrote:
>JJ,
>
>Something I just couldn't find anywhere was the actual performance of
>the x86 co-processor for something like a floating point square root.

Intel's manuals are on-line at
http://intel.com/design/pentium4/manuals/index_new.htm

In a locked filing cabinet concealed in a disused lavatory you will find Document 248966, which has, in appendix C, guarded by a leopard, performance figures.

For the vector unit, you have PD and PS versions; PD instructions do two independent double-precision operations, PS instructions do four independent single-precision ones.

        PD     PS
+       2/4    2/6
*       2/6    2/6
/       70     40
Sqrt    70     40

'2/4' and '2/6' mean that the machine can accept a new pair of operands every two ticks, but only produces the answer four or six ticks later, respectively; it's quite deeply pipelined. The multiply and add pipelines run simultaneously.

So a 3.6GHz Pentium 4, which costs about $600 (comparable to a V4LX40, I suppose, though the Pentium's in a vastly more convenient package) can do about a hundred million double-precision square roots per second, and about fourteen billion single-precision floating-point operations per second.

The figures in http://www.xilinx.com/bvdocs/ipcenter/data_sheet/floating_point.pdf suggest that the V4 isn't entirely competitive even in tasks parallel enough for the MFlops figure to be maximum frequency * slices-in-chip / slices-in-core. Not altogether surprising given that Intel devotes most of its R&D efforts to making Pentium 4s faster, and has an annual R&D budget equal to 250% of Xilinx's total revenue. If you replaced a few of those DSP48 units with full-custom FPUs, on the other hand ...

> Does anyone have a link to such a site that has 'real' data of
> floating point op performance?

Does this help? They're very much guaranteed-not-to-exceed figures (I have achieved them, but not in loops doing useful work), but only comparably fanciful to f_max values.

Tom

Article: 88504
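Tom's two headline Pentium 4 figures above (about a hundred million double-precision square roots and about fourteen billion single-precision operations per second) can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the single numbers quoted for divide and square root (70 and 40) are the interval in clocks between successive instructions, and that the add and multiply pipelines each accept one packed instruction every two clocks at the same time. The function name is made up for illustration.

```python
CLOCK_HZ = 3.6e9  # the 3.6 GHz Pentium 4 from the post


def ops_per_second(clock_hz, issue_interval_clocks, lanes):
    """Peak results per second for a unit that accepts one packed
    instruction every `issue_interval_clocks` clocks, each instruction
    working on `lanes` independent values."""
    return clock_hz / issue_interval_clocks * lanes


# Packed double-precision square root: 2 lanes, one instruction per 70 clocks.
sqrt_pd = ops_per_second(CLOCK_HZ, 70, 2)

# Packed single-precision add and multiply: 4 lanes, one instruction every
# 2 clocks each, with both pipelines assumed to run simultaneously.
add_ps = ops_per_second(CLOCK_HZ, 2, 4)
mul_ps = ops_per_second(CLOCK_HZ, 2, 4)
peak_sp_flops = add_ps + mul_ps

print(f"double-precision sqrt/s: {sqrt_pd:.3e}")
print(f"single-precision flop/s: {peak_sp_flops:.3e}")
```

As the post says, these are not-to-exceed numbers; real loops will rarely keep both pipelines full.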
Hi,

I am new here, and also new to FPGAs. Could you point me to some good forums or websites related to FPGAs?

Thanks!

Article: 88505
Austin,

First, thank you for the prompt response; now I understand the picture.

Well... I have a pretty similar personality to Antti's (not entirely similar :) !!!). So, statistically speaking, your explanation of your response, whatever it was (let's not call it an apology, I hate this word), should have made a difference, and I would let Antti spend a few days and finally chill down.

And if Antti still reads this post, I would advise following one of my "life guidelines", which is: "Try judging people not by the mistakes they make, but by whether they can realize && acknowledge && let you know that they understand they made one".

After all, we are people first, and only then engineers...

Best regards.

Sincerely,
Vladislav

"Austin Lesea" <austin@xilinx.com> wrote in message news:cy7Ne.3108$Z87.1391@newssvr14.news.prodigy.com...
> Vlad,
>
> No secret, it was I who offended Antti. I have apologized to him
> personally. I did not intend to slight him in any way. In fact, his
> comments have been, and continue to be, very valuable to Xilinx.
>
> I respect Antti, if and when he feels comfortable with posting again, or
> when he feels it may be useful to post, it will be up to him to decide. I
> can live with that. Everyone here has the right to post, or not to post,
> and if they post, to post what they will.
>
> It takes time and energy to post here, and some people get things done
> without the bother. For example, the largest source of email addresses
> for spam are harvested from newsgroups (unfortunately).
>
> I reserve my right to reply to postings as well. I also have the best
> email filtering imaginable (three levels), and our internet service
> provider is daily driven crazy by the number of spam emails they have to
> block for Peter and me.
>
> I never intended to insult or offend anyone.
>
> Austin
>
>
> Vladislav Muravin wrote:
>
>> Antti,
>>
>> Two things:
>>
>> (*) What's the story with 5 bucks?
>> (**) I could not find this reply && so can someone explain what the hell
>> is going on here. This "someone" should be (Antti || Xilinx).
>>
>> Thanks
>>
>> Vladislav
>>
>> "Antti Lukats" <antti@openchip.org> wrote in message
>> news:ddt6pp$rjq$01$1@news.t-online.com...
>>
>>>Hi all
>>>
>>>I regret to inform you all that this is the last time I either post or
>>>reply to comp.arch.fpga newsgroup. This decision was triggered by a
>>>reply from a Xilinx employee to one of my postings. If someone wants to
>>>look up that posting then it's around the sentence: "This may have been a
>>>mistake" - I do understand that I may have understood the original
>>>intentions of that posting and that sentence and the context wrong, but
>>>that doesn't make any difference to my decision which is final. I will not
>>>discuss this matter in public or make any comments on it. A small
>>>explanation about the reasoning and background of my decision is
>>>available but not for free and not for quoting or republishing by any
>>>media.
>>>
>>>http://shop.openchip.org/shop/product_info.php?cPath=28_29&products_id=36
>>>
>>>Antti Lukats, posted to comp.arch.fpga at 1900PM on 16 August 2005
>>>
>>>my final smile :) to all of you.
>>>
>>>
>>>
>>
>>

Article: 88506
http://www.fpga4fun.com/ gave me a good start on FPGAs. I would buy a Digilent (http://www.digilent.us) board though, not one of the fpga4fun boards.

-Arlen

Article: 88507
Also, I should have said:

Xilinx App Notes are really good:
http://www.xilinx.com/xlnx/xweb/xil_publications_index.jsp?category=Application+Notes

Altera's are fairly good (but in my experience Xilinx's App Notes are better):
http://www.altera.com/literature/lit-an.jsp

I also recommend the Lattice reference designs. They are quite good:
http://www.latticesemi.com/products/devtools/ip/refdesigns/index.cfm

These are probably all you need to get started. Also, these are useful for much more than FPGAs. Many of them are generally applicable to all digital design. I was sent to a Xilinx App Note during a job interview as an ASIC designer (not with Xilinx).

Good luck,
Arlen

Article: 88508
Tom,

Wow. Talk about buried. But thanks.

Looks like for double precision the QinetiQ core is on par (per my earlier post).

If we replaced the DSP48 with a more powerful block, then we would of course have that covered. Just stay tuned. The DSP48 is evolving as we hear back from customers about how to make it work even better (i.e. more features).

Austin

Article: 88509
Regards:

What is the differences between Lattice's FPGA and Xilinx's FPGA

Thank You.
Best Regards to you all.

Article: 88510
In article <1124260721.732953.159600@g14g2000cwa.googlegroups.com>, <apsolar@rediffmail.com> wrote:
>Hello everyone
>Does anyone know where I can find a simple VHDL code example based on
>evolutionary algorithms? I am doing a project on evolvable hardware.
>This will help me get a start on the implementation of Evolvable
>Hardware.
>Ankit Parikh
>Manukau Institute Of Technology
>

There are many different directions you can take this, as it is still a wide-open field. Even within the category of 'evolutionary algorithms' there are a lot of sub-categories; two big ones would be:

1) Genetic algorithms (using reproduction (crossover) and mutation operators). If you were to take this kind of approach in hardware, you would probably just be implementing the GA in hardware. The result would be a sort of GA accelerator (one could imagine it running faster than on a general-purpose CPU). However, you don't really end up with 'evolvable hardware' going this route (where 'evolvable hardware' would indicate that some structural changes have been made to the hardware to increase its fitness).

2) Genetic programming, where the actual code is mutated and adapted. Traditionally Lisp has been used for this because the syntax is very forgiving; neither VHDL nor Verilog seems very amenable, but the paper mentioned here earlier [http://www.ce.chalmers.se/~mekman/MasterThesis.pdf] seems to do something like this within the constraints imposed by the syntax of VHDL. In that paper the investigator actually seems to be mixing and matching candidate state machines from two sets of state machines to come up with a chain of resulting FSMs that produces the desired behavior. This 'evolution' doesn't actually take place on the target FPGA; instead, candidates are 'evolved' and then simulated in ModelSim. It seems like a valid approach, but it also seems very tedious, and the resulting hardware is static (it's not going to be changing once the FPGA has been programmed).
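As a software-only illustration of the crossover and mutation operators from point 1, here is a toy genetic algorithm. It is a sketch under loud assumptions: the bit-string genome, the population size, the 20% mutation rate, and the use of "count the ones" as the fitness in the usage line are all placeholders chosen for illustration, not anything taken from the thesis mentioned above or from a hardware design.

```python
import random


def evolve(fitness, genome_bits=16, population_size=20, generations=30, seed=0):
    """Toy GA over bit-string genomes; returns (best genome, best fitness per generation)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_bits)]
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: population_size // 2]   # keep the fitter half
        children = [population[0][:]]                  # elitism: best survives unchanged
        while len(children) < population_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_bits)        # single-point crossover
            child = mum[:cut] + dad[cut:]
            if rng.random() < 0.2:                     # occasional one-bit mutation
                child[rng.randrange(genome_bits)] ^= 1
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Usage: maximise the number of 1 bits (a stand-in for a real fitness function).
best, history = evolve(sum)
print(best, history[-1])
```

Because the best genome is always carried over, the recorded best fitness never decreases from one generation to the next. In an evolvable-hardware setting the fitness call would be replaced by a simulation or a measurement on the device, which is where most of the time goes.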
I wonder if perhaps a better approach would be adaptable hardware. This would be very useful. Imagine things like planetary probes (essentially robots) that need to adapt to various changes in their environment because no operators are close enough to control them (radio signals take several hours to arrive).

So how could we have this adaptation (or learning) take place inside the FPGA? Here's a crazy, off-the-top-of-my-head idea: one could imagine a state machine with many different inputs (from the environment) and many different potential states and resulting actions. If the FSM encountered a set of inputs for which it has no resulting state, perhaps a new state could be created in the machine; it would be akin to the robot encountering a new situation and making a note of it.

But then what action should be taken when this new state is encountered the first time (or in the future)? Well, that's the part you have to figure out. Maybe it iterates through some candidate actions until it finds one that is deemed suitable for the situation and then saves that one for future use. Or maybe it considers what actions are performed in other similar states (a-priori defined states which resulted from a similar set of inputs).

So let's say we have inputs A, B and C. We've got a next-state defined for inputs A=1, B=1, C=1, but nothing was defined for when inputs A=1, B=1 and C=0, so we add a new state for that condition and now we look at what actions are performed for the next closest state (A=1, B=1, C=1). Eventually, if it is determined that the two are equivalent, then they can be collapsed into one single state (A=1, B=1, C=*) - generalization?

In some ways the proposed system is similar to a Classifier System, so you might want to study those. You would probably want to use some sort of lookup-table-based state machine where the inputs are the address to your state table RAM. Oh, and maybe you make it so that currently mapped actions can be modified in the future.
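The lookup-table idea can be tried out in software before worrying about RAMs and state encoding. The sketch below is only one possible reading of the brainstorm: the class and method names are invented, "closest state" is assumed to mean the known input pattern with the smallest Hamming distance, and the action strings are placeholders.

```python
def hamming(a, b):
    """Number of input positions in which two patterns differ."""
    return sum(x != y for x, y in zip(a, b))


class AdaptiveFsm:
    """Lookup-table machine: input pattern -> action, learning new patterns on the fly."""

    def __init__(self, known_actions):
        self.actions = dict(known_actions)

    def step(self, inputs):
        if inputs not in self.actions:
            # Unknown situation: borrow the action of the closest known pattern
            # (ties broken by pattern order so the result is deterministic).
            nearest = min(self.actions,
                          key=lambda known: (hamming(known, inputs), known))
            self.actions[inputs] = self.actions[nearest]
        return self.actions[inputs]

    def generalise(self):
        """Wildcard patterns for entries that differ in one input and share an action."""
        merged = []
        keys = sorted(self.actions)
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                if hamming(a, b) == 1 and self.actions[a] == self.actions[b]:
                    merged.append(tuple('*' if x != y else x for x, y in zip(a, b)))
        return merged


# Usage with inputs (A, B, C): only 111 and 000 are defined up front.
fsm = AdaptiveFsm({(1, 1, 1): "advance", (0, 0, 0): "halt"})
print(fsm.step((1, 1, 0)))   # borrowed from the closest state, 111
print(fsm.generalise())      # 110 and 111 collapse to (1, 1, '*')
```

In hardware the dictionary would become the state table RAM addressed by the inputs, and the nearest-pattern search is the part that would need real design work.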
Also, instead of having one big state machine, maybe it would be best to have several smaller FSMs that can each contribute to generating an answer.

Anyway, this quick brainstorm idea might be totally unworkable. As I said, the field is wide open right now. You need to think of a lot of questions that you can use to create experiments.

Phil

Article: 88511
Hello,

What is the syntax for writing the following VHDL code in Verilog?

architecture IMP of user_logic is

  component clk1440
    port (
      async_reset  : in  std_logic;
      clock        : in  std_logic;
      clock_enable : in  std_logic;
      output_50    : out std_logic;
      output_pulse : out std_logic);
  end component;

begin

  --USER logic implementation added here
  CLK_1440_I : clk1440
    port map(
      async_reset  => BUS2IP_Reset,
      clock        => BUS2IP_Clk,
      clock_enable => '1',
      output_50    => lcd_cl_2_s,
      output_pulse => Clock_En_s);

Many thanks,
Marco

Article: 88512
Regards:

Why some firmware is made by Lattice's FPGA instead of C language?

Thank You.
Best Regards to you all.

Article: 88513
Hello...

Can somebody tell me who works behind Altera's mysupport?

I have been trying for a month now to get them to change my contact details, without success, and for the last 2 weeks there has been no feedback at all!!!

The next problem is coming in the form of a motherboard upgrade with built-in LAN... it will probably take half a year to switch the Quartus/NIOS license...

rick

Article: 88514
Hi rick,

The e-mail address mysupport_cs@altera.com (from your discussion about this in the past) has worked out fine for me. They have even corrected the wrong e-mail address on their web page now.

But for your original question: my bet is that they are located in India... (not sure if I should place a smiley here).

Regarding the LAN, I would contact your FAE; this should be much easier than going via their web page.

Regards,
Thomas

"jedi" <me@aol.com> wrote in message news:YAXNe.103$kc4.37@read3.inet.fi...
> Hello...
>
>
> Can somebody tell me what people work behind Altera's mysupport?
>
> Trying now since a month to explain them to change my contact details...
> without success...now just no feedback anymore since 2 weeks!!!
>
>
> Next problem coming in the form of a motherboard upgrade with
> built-in LAN...probably takes half a year to switch Quartus/NIOS
> license...
>
>
>
> rick
>

Article: 88515
Hello,

I need to use a large number (up to 12 774 182 400) in my design, bigger than a typical long.
The question is: how can I do it? And efficiently?

I could split the number in two parts and do the computations and the carry bits myself, but I'm afraid it would be too much of a pain, and I bet somebody has already done it...

Many thanks,
Nick

Article: 88516
Nick wrote:
> Hello,
>
> I need to use large number (up to 12 774 182 400) in my design, bigger
> than a typical long.
> The question is : how can I do it ? And efficiently ?
>
> I could split the number in two parts, do the computations and the
> carry-bits, but i'm afraid it would be to much of a pain and I bet
> somebody already did it...

Howdy Nick (although this reply would help mikelinyoho/frankgerlach22 as well),

The quality of the answers you get in comp.arch.fpga is usually proportional to the amount of detailed information included in the question. In your case, you have provided only two (mostly insignificant) pieces of info: you have a single largish number and you want to do something to it that might or might not involve some carry bits.

To get a good answer, you'll need to explain what you are trying to do, and if possible, why you are trying to do it. There are lots of very sharp people in this newsgroup, and for open-ended questions like these, they often identify solutions that the OP never even considered.

In short, when asking a question in this newsgroup, you need to include as much detail as possible:

1. What's the 10,000 foot (high-level) view of what you're trying to do?
2. What are the details (low-level view) of what you're trying to do?
3. What solution(s) have you already identified?

4. What function(s) surround this one?
5. What's the interface to the surrounding functions (speed/width/protocol)?

6. Which vendor/part number/speed grade are you trying to use?
7. What clock speed(s) are being fed to the FPGA pins?
8. Is there a limit to the amount of time it can take to perform the function?

Have fun,

Marc

Article: 88517
Nick wrote:

> I need to use large number (up to 12 774 182 400) in my design, bigger
> than a typical long.
> The question is : how can I do it ?

12774182400 is 0x2F9668E00, so use 34 bits.

> And efficiently ?

Use signed or unsigned types and functions from the ieee.numeric_std package and a synchronous process.

> I could split the number in two parts, do the computations and the
> carry-bits, but i'm afraid it would be to much of a pain and I bet
> somebody already did it...

Try a simple synthesis run first and check static timing for Fmax.

-- Mike Treseler

Article: 88518
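The 34-bit figure in Mike's reply to Nick is easy to double-check outside the HDL. This is just a quick sanity check in Python, whose integers are arbitrary precision; the extra sign bit assumes the value would be held in a two's-complement signed type.

```python
LIMIT = 12_774_182_400          # Nick's largest value

unsigned_bits = LIMIT.bit_length()   # width needed for an unsigned vector
signed_bits = unsigned_bits + 1      # one more bit if a signed type is used

print(hex(LIMIT), unsigned_bits, signed_bits)
```

So an unsigned vector declared as (33 downto 0) covers the range, and a signed one would need (34 downto 0).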
I would like to implement a design that shares the external SDRAM installed on the Stratix II DSP Development Kit board between the Nios II controller and custom circuitry that will occupy a portion of the remaining Stratix II LEs (ALMs).

Half of the SDRAM would be dedicated to collecting interleaved 8-bit data samples from the two on-board A/Ds. Four 8-bit samples would be accumulated from each A/D and stored, interleaved, in a custom-designed FIFO, forming a 64-bit word. These 64-bit words would then be written to the SDRAM by either the Nios II processor or a DMA engine at 12.5 MHz. Once the SDRAM has been filled, the Nios II processor could be notified (via a register?), and the data samples stored in the SDRAM could then be copied into a file on the on-board Compact Flash. The other half of the SDRAM would be used as the program/data memory for the Nios II processor.

Can the SDRAM be shared by using the Avalon switching fabric to allow the functionality described? Can SOPC Builder be used to ensure that only half of the available on-board SDRAM is used for the Nios II, leaving the rest available for storing the A/D samples?

Any feedback would be appreciated.

Article: 88519
I'm trying to use some real constants in synthesis. I expect them to be truncated to binary values, but it appears XST doesn't like them, and it also doesn't like the $realtobits() system function in Verilog. Here is a short example of what I'm trying to do:

`define CLKSPD 50             //MHz
`define WTIME 43.03*`CLKSPD   //Wait Time is 43.03us (approximately)

module wait(clk,rst,finished);
  input clk;
  input rst;
  output finished;

  reg finished;
  reg [11:0] accum;

  always @ (posedge clk or posedge rst) begin
    if(rst) begin
      accum <= 12'h0;
      finished <= 1'b0;
    end else begin
      if(accum >= `WTIME)
        finished <= 1'b1;
      else
        accum <= accum + 1;
    end
  end
endmodule

XST gives me:
ERROR:Xst:850 - "waiter.v" line 22: Unsupported real constant.

The XST user guide says that real constants are supported. Why doesn't this work? Is there some way I can make it work?

FYI: I'm using Webpack 7.1, synthesizing for Spartan IIE 300.

Thanks,
Arlen

Article: 88520
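One workaround for Arlen's real-constant problem, offered as a sketch rather than as a statement about what XST does or does not support, is to do the real arithmetic outside the synthesizable source and hand the tool a plain integer. The helper below is hypothetical; it works in nanoseconds so that everything stays in integer arithmetic, and it rounds up, which is an assumption (the original comparison might equally have been meant to truncate).

```python
def wait_cycles(wait_ns, clock_mhz):
    """Smallest whole number of clock cycles lasting at least wait_ns nanoseconds."""
    # cycles = wait_ns * clock_mhz / 1000, rounded up, using integers only
    return (wait_ns * clock_mhz + 999) // 1000


cycles = wait_cycles(43_030, 50)      # 43.03 us at 50 MHz
width = cycles.bit_length()           # counter width needed to hold the value

# Text that could be pasted in place of the real-valued macro.
print(f"`define WTIME {width}'d{cycles}")
```

The result fits the existing 12-bit accum register, so only the macro would need to change.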
My mind is boggling. Experimenting with Quartus 5.0 SP1 (targeting an EP1C20F400C7) I was quite puzzled to find that test1 can run at 189.83 MHz, while test2 maxes out at 156.25 MHz. Aren't the two fragments logically exactly the same?

Tommy

module test1(clk, index, value2);
   parameter N = 5; // Word size-1
   parameter M = 4; // Entries log2-1

   input wire clk;
   input wire [M:0] index;
   output reg [N:0] value2;

   reg [N:0] counters[(1 << (M + 1)) - 1:0];
   reg [M:0] index1, index2, index3;
   reg [N:0] value3;

   always @(posedge clk) begin
      index1 <= index;
      index2 <= index1;
      value2 <= counters[index1];
      index3 <= index2;
      value3 <= value2 + 1;
      counters[index3] <= value3;
   end
endmodule

module test2(clk, index, value2);
   parameter N = 5; // Word size-1
   parameter M = 4; // Entries log2-1

   input wire clk;
   input wire [M:0] index;
   output reg [N:0] value2;

   reg [N:0] counters[(1 << (M + 1)) - 1:0];
   reg [M:0] index1, index2, index3;
   reg [N:0] value3;

   always @(posedge clk) begin
      {index1} <= {index};
      {index2, value2} <= {index1, counters[index1]};
      {index3, value3} <= {index2, value2 + 1};
      {counters[index3]} <= {value3};
   end
endmodule

Article: 88521
On 20 Aug 2005 11:02:04 -0700, mikelinyoho@gmail.com wrote:
>Regards:
>
> What is the differences between lattice's FPGA and Xilinx's FPGA

Lattice FPGAs are made by Lattice at http://www.latticesemi.com/
Xilinx FPGAs are made by Xilinx at http://www.xilinx.com/

There are 7 letters in Lattice
There are 6 letters in Xilinx

If you want better answers, you need to ask better questions.

Let me help you:

Tell us what research you have already done. What specific things did you read that you didn't understand?

Tell us why you are asking the question.

Tell us what type of differences you are interested in.

Both companies have data sheets that totally describe their products. Have you read any of them?

When you write useless questions, the other readers of the newsgroup think you are lazy, and if you are lazy, then why bother trying to help?

(My answer here is my charity work for today)

Philip

Article: 88522
On 21 Aug 2005 00:13:33 -0700, "mikelinyoho" <mikelinyoho@gmail.com> wrote:
>
> Why some firmware is made by lattice's FPGA
> instead of C language?

This is even worse than your previous question. I have read it 5 times, and I have no idea what you are asking.

It seems likely that English is not your native language, so you have an additional problem. I am sure though that your English is far better than my ability to speak or write in your native language. Since almost all the discussion in this newsgroup is in English, you don't have a choice. Sorry.

My recommendation is that when you want to ask a question, you write it and give it to a friend and ask them if they understand what you are asking about. (No cheating and discussing it with them. Just give it to them.) If they can't figure out what you are asking, then neither can we.

>If you want better answers, you need to ask better questions.
>
>Let me help you:
>
>Tell us what research you have already done. What specific things
>did you read that you didn't understand.
>
>Tell us why you are asking the question.

Marc Randolph wrote this in another thread, but you may have missed it:

> The quality of the answers you get in comp.arch.fpga are usually
>proportional to the amount of detailed information included in the
>question.
>
>To get a good answer, you'll need to explain what you are trying to do,
>and if possible, why you are trying to do it. There are lots of very
>sharp people in this newsgroup, and for open-ended questions like
>these, they often identify solutions that the OP never even considered.
>
>In short, when asking a question in this newsgroup, you need to include
>as much detail as possible:
>
>1. What's the 10,000 foot (high-level) view of what you're trying to
>do?
>2. What are the details (low-level view) of what you're trying to do?
>3. What solution(s) have you already identified?
>
>4. What function(s) surround this one?
>5. What's the interface to the surrounding functions
>(speed/width/protocol)?
>
>6. Which vendor/part number/speed grade are you trying to use?
>7. What clock speed(s) are being fed to the FPGA pins?
>8. Is there a limit to the amount of time it can take to perform the
>function?
>
>Have fun,
>
> Marc

We are a really helpful bunch here in comp.arch.fpga, but to help you, you need to ask your questions better.

Cheers,
Philip

Article: 88523
Hi Antti,

At the moment, only engineering samples are available. Your favorite Xilinx distributor should be able to help you obtain XC3S100E or XC3S500E samples. These two part numbers are generally available. However, for a variety of reasons, distributors do not like to stock engineering samples so they generally will show some amount of lead time and place the order with Xilinx.

I'm working with internal groups to see how we can put Spartan-3E FPGAs on the Xilinx online store. I agree that this should be easier than it is currently.

---------------------------------
Steven K. Knapp
Applications Manager, Xilinx Inc.
General Products Division
Spartan-3/-3E FPGAs
http://www.xilinx.com/spartan3e
---------------------------------
The Spartan(tm)-3 Generation: The World's Lowest-Cost FPGAs.

Article: 88524
Yeah, they are the same, but Quartus isn't intelligent enough to work that out. I believe the bit/part selection confuses the tool.
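Before blaming the tool, it may be worth checking the expression widths in test2. The sketch below is a Python model, not Verilog, and it rests on an assumption about the Verilog sizing rules: that an unsized decimal literal such as 1 is at least 32 bits wide, and that each operand of a concatenation is sized on its own. If that holds, {index2, value2 + 1} is 37 bits wide, the 11-bit target {index3, value3} keeps only the low 11 bits, and index3 no longer receives index2, so the two modules would not be the same circuit. The function names and the literal_width parameter are made up for illustration.

```python
def concat(fields):
    """Pack (value, width) pairs MSB-first into one integer, like a Verilog {a, b}."""
    packed = 0
    for value, width in fields:
        packed = (packed << width) | (value & ((1 << width) - 1))
    return packed


def assign_index3_value3(index2, value2, literal_width):
    """Model {index3, value3} <= {index2, value2 + 1} with a 5-bit index and 6-bit value.

    literal_width is the assumed width of the constant 1 (32 when unsized,
    6 if it were written as 6'd1).
    """
    sum_width = max(6, literal_width)          # self-determined width of value2 + 1
    rhs = concat([(index2, 5), (value2 + 1, sum_width)])
    rhs &= (1 << 11) - 1                       # truncate to the 11-bit target
    return rhs >> 6, rhs & 0x3F                # (index3, value3)


print(assign_index3_value3(19, 7, 32))   # unsized literal: index2 is lost
print(assign_index3_value3(19, 7, 6))    # sized literal: index2 passes through
```

If this is what is happening, writing the constant with an explicit width inside the concatenation should make test2 match test1; a synthesis warning about a truncated 37-bit value would be one way to confirm it.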