Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
On a sunny day (4 Apr 2007 05:43:42 -0700) it happened "Patrick Dubois"
<prdubois@gmail.com> wrote in
<1175690622.594433.213100@d57g2000hsg.googlegroups.com>:

>On Apr 3, 2:00 pm, Jan Panteltje <pNaonStpealm...@yahoo.com> wrote:
>> On a sunny day (3 Apr 2007 08:21:24 -0700) it happened "PatrickDubois"
>> <prdub...@gmail.com> wrote in
>> <1175613684.443163.290...@d57g2000hsg.googlegroups.com>:
>>
>> >On Apr 3, 8:59 am, Jan Panteltje <pNaonStpealm...@yahoo.com> wrote:
>> >> PS3 has only 1 power processor and _6_ SPE cores.
>> >> http://en.wikipedia.org/wiki/PlayStation_3#Central_processing_unit
>>
>> >Nope, 1 central PPC core and 8 Synergistic Processor Unit:
>> >http://www.research.ibm.com/cell/heterogeneousCMP.html
>>
>> Nope, in the PS3 only 6 are available.
>
>Alright, I don't want to argue about this but I think we can fairly
>say that the info on the web is not clear...
>Just for fun, here's a link directly from Sony with the PS3 Cell
>specs :)
>http://cell.scei.co.jp/index_e.html

OK, all good and well, but here are some facts:
PS3 runs Linux in a 'hypervisor'.
The hypervisor limits access to whatever Sony pleases to allow access to.
One SPE is in use for the PS3 graphics, and there is no way you can touch
it from Linux.
The story goes that IBM had yield problems, so Sony settled for chips with
one working core less.
That leaves 6 available from Linux.
The wikipedia article is up to date and quite correct.
Version of Linux that runs on PS3: Yellow Dog Linux.
If you are in Europe, there is a special C'T magazine release out with
YD Linux for PS3 including some of the IBM development tools:
https://www.heise.de/kiosk/special/ct/07/01/

All I can tell you now.

Article: 117576
On Apr 4, 9:49 am, Jan Panteltje <pNaonStpealm...@yahoo.com> wrote:
> On a sunny day (4 Apr 2007 05:43:42 -0700) it happened "Patrick Dubois"
> <prdub...@gmail.com> wrote in
> <1175690622.594433.213...@d57g2000hsg.googlegroups.com>:
>
> >On Apr 3, 2:00 pm, Jan Panteltje <pNaonStpealm...@yahoo.com> wrote:
> >> On a sunny day (3 Apr 2007 08:21:24 -0700) it happened "PatrickDubois"
> >> <prdub...@gmail.com> wrote in
> >> <1175613684.443163.290...@d57g2000hsg.googlegroups.com>:
> >
> >> >On Apr 3, 8:59 am, Jan Panteltje <pNaonStpealm...@yahoo.com> wrote:
> >> >> PS3 has only 1 power processor and _6_ SPE cores.
> >> >> http://en.wikipedia.org/wiki/PlayStation_3#Central_processing_unit
> >
> >> >Nope, 1 central PPC core and 8 Synergistic Processor Unit:
> >> >http://www.research.ibm.com/cell/heterogeneousCMP.html
> >
> >> Nope, in the PS3 only 6 are available.
> >
> >Alright, I don't want to argue about this but I think we can fairly
> >say that the info on the web is not clear...
> >Just for fun, here's a link directly from Sony with the PS3 Cell
> >specs :)
> >http://cell.scei.co.jp/index_e.html
>
> OK, all good and well, but here are some facts:
> PS3 runs Linux in a 'hypervisor'.
> The hypervisor limits access to whatever Sony pleases to allow access to.
> One SPE is in use for the PS3 graphics, and there is no way you can touch
> it from Linux.
> The story goes that IBM had yield problems, so Sony settled for chips with
> one working core less.
> That leaves 6 available from Linux.
> The wikipedia article is up to date and quite correct.
> Version of Linux that runs on PS3: Yellow Dog Linux.
> If you are in Europe, there is a special C'T magazine release out with
> YD Linux for PS3 including some of the IBM development tools:
> https://www.heise.de/kiosk/special/ct/07/01/
>
> All I can tell you now.

Ok, thanks for the clarification.

Article: 117577
Thank you for your replies so far.

It is no problem if the chip is ~1000$, because the aim is some 10^12
multiplications/s. The simplistic estimate needs 3 chips V5 SX95T or
27 chips EP2C70. For easier design a lower number of chips is
preferred.

Is there some more comprehensive overview of FPGA prices, especially
for the larger devices, than what I find at www.digikey.com?

Although your suggestions have already been useful, I'd appreciate
more hints, maybe concerning exotic manufacturers that I have never
heard of. Or are Xilinx and Altera definitely the only choices?

Best regards,
Ryan

(
>Wow, FPGA marketing departments are going to be swarming all over you like
>it's Christmas!
>
>Wait, are you sure you're real...? ;-)
>
> -Ben-

Why do you doubt this? Because the question sounds stupid? I don't
pretend that I am close to production of the planned FPGA board, but I
need an idea of what will be possible at what costs.)

Article: 117578
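Ryan's chip counts for 10^12 multiplications/s can be reproduced with a rough throughput estimate. The following is only a sketch, assuming every multiplier delivers one product per clock. The 150 multipliers and 4 ns delay for the EP2C70 and the 640 DSP48 slices for the XC5VSX95T are figures quoted elsewhere in this thread; the 550 MHz DSP clock for the Virtex-5 part is an assumed best-case value and does not come from any of the posts.

```python
import math


def chips_needed(target_rate, multipliers, clock_hz):
    """Chips required if every multiplier delivers one product per clock."""
    per_chip_rate = multipliers * clock_hz
    return math.ceil(target_rate / per_chip_rate)


TARGET = 1e12  # multiplications per second

# EP2C70: 150 multipliers, 4 ns delay -> 250 MHz (figures from the thread)
ep2c70_chips = chips_needed(TARGET, 150, 250e6)

# XC5VSX95T: 640 DSP48 slices (from the thread); 550 MHz is an ASSUMPTION
sx95t_chips = chips_needed(TARGET, 640, 550e6)

print("EP2C70 chips:", ep2c70_chips)
print("XC5VSX95T chips:", sx95t_chips)
```

The estimate ignores accumulation, memory and I/O, so it is a lower bound on the chip count, not a design figure.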
Note: using EDK 8.2.02 with a peripheral generated with the wizard, with
FIFO enabled and one user interrupt that is set to the fifo_almostfull
line.

I'm set up to generate interrupts on the FIFO almost-full signal, but I
get a couple of interrupts and then it stops. If I go read the occupancy
register I get a number (0x737) greater than the size of the FIFO
(0x400). If I then go and manually read from the FIFO data register
(from gdb) I can get the occupancy to go back to 0x400, and the next
interrupt occurs when I run, but they stop shortly after.

So, how could the occupancy be greater than the size of the FIFO? Have
you seen anything like this before?

Thanks,
Clark

Article: 117579
<ryan_usenet@yahoo.com> wrote in message
news:1175696938.323383.273760@o5g2000hsb.googlegroups.com...
>
> Although your suggestions have already been useful, I'd appreciate
> more hints maybe concerning exotic manufacturers that I have never
> heard of. Or are Xilinx and Altera definitely the only choices?
>
> Best regards,
> Ryan
>
Hi Ryan,
Don't forget FPGAs have other resources apart from 'hard' multipliers.
Check this page on Mr. Andraka's website about distributed arithmetic. You
can make a _lot_ of multipliers out of the ordinary fabric of FPGAs.
HTH, Syms.

http://www.andraka.com/distribu.htm

Article: 117580
I am trying to get video through an FPGA to a Texas Instruments TMDS
transmitter chip (TFP410).
I would like to see video out at 1280x1024 at 60 FPS. The VESA spec calls
for this to be clocked at 108 MHz. But if you look at the actual amount
of data (1280x1024x60 = 78.6 Mpixels/s), 108 MHz is a lot of overhead for
the actual number of valid pixels that need to be pushed through the
system.

The FPGA I am using (a Stratix I) doesn't want to run this fast. I would
like to generate my video at a slower clock speed (with less blanking
time).

Does anyone know if the TFP410 will accept input video with less
blanking time than the VESA spec allows for?

I have been experimenting with slower clock rates but can't seem to get
them to work with the blanking intervals / timing parameters I have
chosen. I wondered if there was some formula that I needed to follow.

I have asked TI about this and they have been unhelpful.

Article: 117581
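The gap between the 78.6 Mpixels/s of active video and the 108 MHz dot clock in the video question above comes from the blanking intervals: the pixel clock is the total (active plus blanking) frame size times the refresh rate. A small sketch of that arithmetic follows; the totals of 1688 pixels per line and 1066 lines per frame are the values I recall for the standard VESA 1280x1024 at 60 Hz mode and should be treated as an assumption to be checked against the VESA timing tables.

```python
def pixel_clock_hz(h_total, v_total, refresh_hz):
    """Pixel clock = pixels per line * lines per frame * frames per second."""
    return h_total * v_total * refresh_hz


H_ACTIVE, V_ACTIVE, REFRESH = 1280, 1024, 60
H_TOTAL, V_TOTAL = 1688, 1066  # ASSUMED standard totals incl. blanking

active_rate = pixel_clock_hz(H_ACTIVE, V_ACTIVE, REFRESH)
full_rate = pixel_clock_hz(H_TOTAL, V_TOTAL, REFRESH)
blanking_fraction = 1 - active_rate / full_rate

print("active pixel rate (Hz):", active_rate)
print("pixel clock incl. blanking (Hz):", full_rate)
print("fraction of clocks spent in blanking:", round(blanking_fraction, 3))
```

Shrinking H_TOTAL and V_TOTAL lowers the required clock, but whether a given monitor accepts the reduced blanking is a separate question that this arithmetic cannot answer.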
Ryan,

Your metric is extremely simple. Perhaps too simple? What is it that
you wish to do? 1E12 multiplies per second is a bit too simplistic.
You need to consider the 'care and feeding' of this 'monster'.

What is the resolution required? 18X18? or 9X9? Or really 25 X 18?

Is it all multiplies, and no accumulates? Hard to imagine a problem
where no addition whatsoever is required. Multipliers alone may not
suffice. What about accumulators? What number of bits? The DSP48
blocks in the Xilinx architectures are intended for most common needs.
The DSP48 in V5 also has the traditional 4 bit control (16 function) ALU
as part of the DSP block. Very wide AND, OR, XOR, etc. are all provided.

You should also consider how much SRAM is on chip. If you have
insufficient RAM, you can not "feed your monster."

Similarly, IO is required to feed the RAM, and get the results. You
need to consider if the IO is a bottleneck (limits your performance).

Austin

Article: 117582
On 4 Apr., 17:22, "Symon" <symon_bre...@hotmail.com> wrote:
> Hi Ryan,
> Don't forget FPGAs have other resources apart from 'hard' multipliers.

Well, part of the other logic is of course needed for accumulation,
registers etc. If huge amounts of logic are left free, my plan is to use
them as additional 'software' multipliers. However, I have no idea how
efficient these multipliers built from logic blocks can be.

As far as I know you have the choice of implementing either parallel or
sequential 'software' multipliers. The parallel architecture has the
advantage that it can calculate 1 product per clock, but requires lots
of logic, so that not many of them could be used. The sequential
architecture needs about 1/nth of the logic space, but finishes after
about n clocks, so I don't expect either of these solutions to
contribute significantly to the dedicated 'hardware' multipliers'
performance.

If the sequential multiplier could be used in 'pipeline' operation
(meaning the result is delayed by n clocks, but then a new result is
produced in each cycle), this would be interesting, but I guess it
cannot?

Thanks
Ryan

Article: 117583
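The sequential 'software' multiplier Ryan describes, which reuses one adder and takes about n clocks per product, can be modelled behaviourally. This is a minimal sketch of a plain unsigned shift-and-add scheme, one multiplier bit per modelled clock; the function name and the 18-bit default width are choices made for this illustration, not anything specified in the thread.

```python
def shift_add_multiply(a, b, width=18):
    """Unsigned sequential multiply: examines one bit of b per 'clock'.

    Returns (product, cycles). Hardware doing this needs one adder and a
    shift register instead of a full array of partial-product adders.
    """
    acc = 0
    cycles = 0
    for bit in range(width):
        if (b >> bit) & 1:
            acc += a << bit  # add the shifted multiplicand
        cycles += 1          # one clock per multiplier bit
    return acc, cycles


product, cycles = shift_add_multiply(200000, 123456)
print(product, cycles)
```

Because the single adder is busy for all n cycles of one product, a new operand pair cannot enter until the previous one finishes, which is why this structure does not pipeline the way a parallel array does.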
"wallge" <wallge@gmail.com> wrote in message news:1175700538.999587.321180@d57g2000hsg.googlegroups.com... >I am trying to get video through an fpga to a texas instruments TMDS > transmitter chip (TFP410). > I would like see video out at 1280x1024 at 60 FPS. The VESA spec calls > for this to be clocked at > 108 Mhz. > The FPGA I am using (a stratix I) doesn't want to run this fast. The FPGA will run plenty fast, but you probably need to fix your design if it doesn't meet your timing requirements. > Does anyone know if the TFP410 will accept input video with less > blanking time than the VESA spec allows for? You cannae change the laws of physics. The DVI transmitter might be capable of transmitting with near-zero blanking but the monitor at the other end will have none of it. In order to compress the data stream coming out of your hypothetical slowed-down video source into a pixel stream at a 108MHz dot clock, something somewhere is going to have to buffer a whole lot of video data to effect the rate change. The best candidate for that "something" is really the FPGA you're designing. > I wondered if there was some formula that I needed to follow. Google for VESA GTF (General Timing Formula). -Ben-Article: 117584
<ryan_usenet@yahoo.com> wrote in message
news:1175686533.726030.62610@d57g2000hsg.googlegroups.com...
> Hello,
> I need the highest possible number of multiplication operations per
> second at low cost. I know that several factors affect the overall
> performance, but since I have no idea which FPGA chips might be worth
> considering, I'd like to ask what you think is the chip with the
> lowest ratio
>
> R=(price of chip)* (delay time)/(number of multipliers)
>
> 18x18bit multipliers seem to be quite common, so let's assume this
> design for the estimate.
>
> For example for the Spartan XC3S1000 (~60$, 24 multipliers, 4ns delay)
> I have
> R= 10$ per (Billion multiplications/s). The Cyclone EP2C70 (~230$, 150
> multipliers, 4ns delay) has R=6.13$ per (Billion multiplications/s).
>
> Do other FPGAs exist that are maybe specialized for multiplication-
> intensive tasks and which therefore are much cheaper?
>
> Best regards,
> Ryan

Another chip to consider: the Lattice ECP2(M) series.

The Lattice approach was to stuff the inexpensive chips with DSP
resources since many massively parallel algorithms can't afford the
super-performance chips. Their sysDSP blocks can be configured for
8x9-bit, 4x18-bit or 1x36-bit multipliers per block. A MAC structure is
included for FIR style applications. For 18-bit multipliers, an 88
multiplier solution has a "marketing" price around $35 (ECP2-70). I
think your 18-bit R value is about $1.60 but I have a little difficulty
figuring out the multiplier speeds.

The families with the most multipliers at reasonable costs - Altera
Cyclone III, Xilinx Spartan-3DSP, Lattice ECP2 - are all rather new. The
costs might be larger in the near term than marketing announcements
would suggest.

You really need to get conversations going with the sales reps from
these three companies so that they can work the numbers toward your
goal. Once they understand your pipeline needs, they should be able to
give you a true attainable frequency and a price or price range that
would allow you to compare to your economy of scale.

Something else that might interest you is the FPOA from Mathstar. Their
"Field Programmable Object Array" has MACs, ALUs, and Register Files as
the distributed elements capable of 1 GHz speeds according to their
literature. Since this product is far outside my technical needs, I
haven't delved too far into it, but your application might be one of the
few FPGA-style designs that can seriously leverage this
not-so-mainstream technology.

Article: 117585
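The cost ratio R from Ryan's original question, R = price * delay / number of multipliers, is easy to tabulate for the devices named in the reply above. This sketch uses the prices and multiplier counts given in the posts; the 4 ns delay for the ECP2-70 is an assumption carried over from the other two parts, since the poster says the real multiplier speed is unclear.

```python
def cost_ratio(price_usd, delay_ns, multipliers):
    """R in dollars per (billion multiplications per second).

    One multiplier with a delay of d ns delivers 1/d billion products/s,
    so the chip delivers multipliers/d and R = price * d / multipliers.
    """
    return price_usd * delay_ns / multipliers


r_xc3s1000 = cost_ratio(60, 4, 24)    # figures from the original question
r_ep2c70 = cost_ratio(230, 4, 150)    # figures from the original question
r_ecp2_70 = cost_ratio(35, 4, 88)     # 4 ns delay is an ASSUMPTION

for name, r in [("XC3S1000", r_xc3s1000),
                ("EP2C70", r_ep2c70),
                ("ECP2-70", r_ecp2_70)]:
    print(f"{name}: R = {r:.2f} $ per Gmult/s")
```

With the assumed 4 ns delay the ECP2-70 lands close to the "about $1.60" quoted in the reply; a slower real multiplier would raise R proportionally.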
Hi Folks,

I've been active on this forum asking questions on digital receivers.
Earlier I was doing most of the work in the FPGA (that's how I
architected it). Due to space constraints (this radio is going to be
used as a Search and Rescue radio for the coast guard), I am
re-architecting and planning to use a multi-channel digital receiver
chip instead. This helps us get rid of most of our analog front end and
do minimal processing in the FPGA, and thus use a smaller FPGA as well.

The chips that come to mind are the AD6654 and AD6624A. The AD6654 is
special as it has the in-built ADC.

Now some specs on my design requirements:
a) 4 channels,
b) 121.5MHz - 245MHz band of interest,
c) AM, FM, FSK demodulation.

I have also looked at some Graychip chips.

I was wondering if you have any more suggestions.

Thanks
Morpheus

Article: 117586
<ryan_usenet@yahoo.com> wrote in message
news:1175701234.081630.27560@d57g2000hsg.googlegroups.com...
> On 4 Apr., 17:22, "Symon" <symon_bre...@hotmail.com> wrote:
>
> As far as I know you have the choice of
> implementing either parallel or sequential 'software' multipliers. The
> parallel architecture has the advantage that it can calculate 1
> product per clock, but requires lots of logic, so that not many of
> them could be used.

You would have to define "lots". Also, you might calculate one product
per clock with a fabric-based multiplier, but it will be a pretty long
clock cycle.

> If the sequential multiplier could be used in 'pipeline' operation
> (meaning the result is delayed by n clocks, but then a new result is
> produce in each cycle), this would be interesting, but I guess, it
> cannot?

No, but the "parallel" multiplier architecture could. Indeed, it must
be, or your clock speed would be ridiculously slow.

It would be interesting to know what you think "a multiplier" is - i.e.
what word-length, and are you talking about fixed-point or
floating-point operations?

-Ben-

Article: 117587
On Apr 3, 8:12 am, "Paul" <pauljbenn...@gmail.com> wrote:
> Yea... for non-BGA packages - Altera seems to give you more logic.
>
> If you're really stuck on Xilinx, I'd pick up the phone and call Avnet
> (or whoever their equivalent is if you're not in the US) to find out
> what's actually available. Often, all the configurations listed aren't
> actually produced unless they have orders for them.
>
> If you're not stuck on Xilinx - look into Altera. And don't limit
> yourself, check out the smaller guys too - lattice semiconductor has a
> pretty decent non-BGA offering as well.
>
> On Apr 2, 6:26 pm, "John_H" <newsgr...@johnhandwork.com> wrote:
>
> > "radarman" <jsham...@gmail.com> wrote in message
> > news:1175547385.428898.143910@o5g2000hsb.googlegroups.com...
> > >I have an application that is very space limited, and would like to
> > > use a Spartan XC3S250E in the VQ100 package. Even this package has
> > > more I/O than I need. (I need a lot of logic, but communication is via
> > > SPI and RS232 - about 10 I/O total)
> >
> > > I can find the 3S100E's in this package from most distributors, but
> > > not the 3S250E.
> >
> > > I only need four total, and I could probably squeeze a TQ144 package
> > > in if I got creative with parts placement, but the PQ208 absolutely
> > > won't fit.
> >
> > > Also, since these are prototypes, I would like avoid BGA's. The cost
> > > to have them fitted is just too high.
> >
> > > BTW - if anyone from Xilinx is reading this, this is why it would be
> > > nice if your online store was actually a "store", and not just a link
> > > to Avnet and Nu Horizons.
> >
> > > Thanks!
> >
> > This is where a distributor/sales contact (distributor FAE, for instance)
> > can really help out. Even if you're "small potatoes" for their business,
> > they should be able to tell you about availability. The online store
> > *would* be a tremendous resource if it was in a usable state.

Without the VQ100 package, Altera does appear to be the better chip at
this end of the spectrum. Not only does it have more equivalent gates,
but if I'm reading the datasheet right, it only needs two supply
voltages - 3.3v and 1.2v. That will cut an extra switcher from the
board, which somewhat makes up for the larger package.

Plus, Digikey is carrying both the 2c5 and 2c8 in the TQ144 package in
stock.

Article: 117588
Matthias Einwag <matthias.einwag@web.de> wrote:

>Hi everyone,
>
>I'm working on a Board for Bus Interfacing and some Audio Processing,
>which will use Spartan 3E 1200 and the Microblaze.
>Now I'm not sure what kind of memory I should use. Size is not too
>important for me. 8Mbytes would be ok, but more is always better. But
>the speed should be high enough to get the Microblaze to run at a decent
>speed and to allow permanent storing of 4 stereo audio streams.
>Most important is that the layout and memory interface should run
>stable and from the beginning. There's not much time for error
>investigations and redesigns.

Your bandwidth requirement is less than 800kB/s. How about compact flash
or an SD-card? These are well known and there are loads of resources
available on the internet.

>Spartan 3E Starter Kit uses a 16bit Wide DDR SDRAM Memory. But I read
>here and in some other boards, that people have much trouble with this
>one. Is this because Spartan 3E and the given IP Cores have always
>Trouble with DDR, or because of the missing clock feedback path on the
>board?

There are no free _useful_ cores to control DDR memory.

--
Reply to nico@nctdevpuntnl (punt=.)
Find businesses and shops at www.adresboekje.nl

Article: 117589
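The "less than 800kB/s" figure in the memory reply above is consistent with four stereo streams of ordinary PCM audio. As a quick check, here is the arithmetic as a sketch; the 48 kHz sample rate and 16-bit samples are assumptions, since the original post does not state the audio format.

```python
def audio_bandwidth_bytes_per_s(streams, channels, sample_rate_hz,
                                bytes_per_sample):
    """Raw PCM bandwidth for a number of parallel audio streams."""
    return streams * channels * sample_rate_hz * bytes_per_sample


# 4 stereo streams; 48 kHz and 16-bit samples are ASSUMED, not from the post
bandwidth = audio_bandwidth_bytes_per_s(streams=4, channels=2,
                                        sample_rate_hz=48_000,
                                        bytes_per_sample=2)
print("bytes per second:", bandwidth)
```

With 24-bit samples or a higher sample rate the figure grows proportionally, so the storage choice should be rechecked against the actual format.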
On 4 Apr., 17:35, Austin Lesea <aus...@xilinx.com> wrote:
> Ryan,
>
> Your metric is extremely simple. Perhaps too simple?

Definitely too simple. But I need a first guess at what might be
possible, not the final answer about the ideal FPGA. Therefore I don't
want to discuss all details, but only the multiplication rate, which is
one of the limiting factors and will boil the choice down to <10 chips.
After this preselection I'll work out if additional logic, RAM or IO
will limit the performance.

> What is it that
> you wish to do? 1E12 multiplies per second is a bit too simplistic.

Yes? Then please tell me how to run 1E12 multiplies per second - nothing
else - for 100$, and we can continue to the next step.

> You need to consider the 'care and feeding' of this 'monster'.
>
> What is the resolution required? 18X18? or 9X9? Or really 25 X 18?

More like 36x36, which will increase the required number of FPGAs by a
factor of 4. I'd appreciate it if you could suggest FPGAs with
higher-resolution multipliers that are more cost-efficient than those
with 18x18 (combined to 36x36).

> Is it all multiplies, and no accumulates?

I do need accumulators.

> Hard to imagine a problem
> where no addition whatsoever is required.

I wasn't asking you to imagine the problem, but whether you know chips
that can do high-speed multiplication at low cost.

> Multipliers alone may not
> suffice. What about accumulators? What number of bits? The DSP48
> blocks in the Xilinx architectures are intended for most common needs.
> The DSP48 in V5 also has the traditional 4 bit control (16 function) ALU
> as part of the DSP block. Very wide AND, OR, XOR, etc. are all provided.

Yes, I guess the 640 DSP48-slices of the XC5VSX95T are the multipliers
that Sylvian mentioned. I don't mind if the FPGA is filled with
additional (then unused) logic unless it uses too much power. So you're
welcome to suggest xtreme DSP chips even if their multipliers are used
inside DSP slices.

> You should also consider how much SRAM is on chip. If you have
> insufficient RAM, you can not "feed your monster."

With 1000 18x18 multipliers I won't need more than 8x1000x18=144kbit RAM
space.

> Similarly, IO is required to feed the RAM, and get the results. You
> need to consider if the IO is a bottleneck (limits your performance).

It is not, because the computationally intensive part is localized on
each chip.

Ryan

Article: 117590
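Ryan's "factor of 4" for 36x36 products follows from splitting each operand into two 18-bit halves and forming all four partial products, and his RAM estimate is plain arithmetic. Both are sketched below for unsigned operands; note that this is an illustration only, and that the hard 18x18 multipliers in many FPGA families are signed, so a real design needs extra care with the sign bit that this sketch ignores.

```python
MASK18 = (1 << 18) - 1


def mul36_from_18(a, b):
    """Unsigned 36x36 product built from four 18x18 partial products."""
    a_hi, a_lo = a >> 18, a & MASK18
    b_hi, b_lo = b >> 18, b & MASK18
    return ((a_hi * b_hi) << 36) \
        + ((a_hi * b_lo + a_lo * b_hi) << 18) \
        + (a_lo * b_lo)


# RAM estimate from the post: 8 words of 18 bits for each of 1000 multipliers
ram_bits = 8 * 1000 * 18

a, b = (1 << 36) - 1, 0x123456789
print(mul36_from_18(a, b) == a * b, ram_bits)
```

Four multipliers per product is the schoolbook count; the adders needed to combine the partial products come out of the fabric or the DSP block's own adder.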
> It would be interesting to know what you think "a multiplier" is - i.e. what
> word-length, and are you talking about fixed-point or floating-point
> operations?

Simulations show that 36bit fixed-point is sufficient. However, for
later improvements we might change to floating-point calculation. Do
FPGAs with floating point multipliers/adders exist, or how can one
estimate the (emulated) floating-point performance given the
fixed-point performance?

Ryan

Article: 117591
<ryan_usenet@yahoo.com> wrote in message
news:1175704368.383458.84460@l77g2000hsb.googlegroups.com...
>> It would be interesting to know what you think "a multiplier" is - i.e.
>> what
>> word-length, and are you talking about fixed-point or floating-point
>> operations?
> Simulations show that 36bit fixed-point are sufficient. However for
> later improvements we might change to floating-point calculation. Do
> FPGAs with floating point multipliers/adders exist, or how can one
> estimate the (emulated) floating-point performance given the fixed-
> point performance?

Well, it's tricky and your mileage may vary a great deal, but you might
want to take a look at the Xilinx Floating-point operator core
datasheet. This will tell you how big and how fast the various FP
operations will be depending on your desired wordlength.

Multiplication in floating-point is not really too much of an overhead
relative to fixed-point, but addition certainly is much much bigger and
slower.

Good luck,

-Ben-

Article: 117592
Hello,

I'm building a function generator using an FPGA. I'm almost done with
the coding part in VHDL. I'd just like some help with interfacing the
DAC0808 to the FPGA. Also, how can the output amplitude be varied?

Article: 117593
On Apr 4, 12:44 pm, "Amal" <akhailt...@gmail.com> wrote:
> Has anyone used std8980 model from FMF (http://www.freemodelfoundry.com/)? We are having some problems with the
> model and I was wondering if anyone has successfully used this model.
>
> -- Amal

Oh, one more thing: does anyone have a pointer to a good TAP controller
model?

-- Amal

Article: 117594
radarman <jshamlet@gmail.com> wrote:
...
> but if I'm reading the datasheet right, it only needs two supply
> voltages - 3.3v and 1.2v. That will cut an extra switcher from the
> board, which somewhat makes up for the larger package.

If the 2.5 Volt is only needed for VCCAUX in the XC3S, there is no need
for a switcher. Quiescent current is well below 100 mA even for an
XC3S1600E and well suited for a low drop regulator from 2.5V.

--
Uwe Bonnes                bon@elektron.ikp.physik.tu-darmstadt.de

Institut fuer Kernphysik  Schlossgartenstrasse 9  64289 Darmstadt
--------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------

Article: 117595
On 3 Apr., 15:23, Ray Andraka <r...@andraka.com> wrote:
> You might consider one of the Matlab clones: Octave or I think the other
> is called Scilab. I don't have experience with either, but from what I
> understand both will run matlab M files and have pretty good coverage of
> the matlab function set.

Scilab is similar to Matlab but not compatible; Octave is said to have
very good compatibility with Matlab.
Scilab has a good Windows installer; Octave is better used under Linux.

Kolja

Article: 117596
Gordon Freeman wrote:
>> Watch the values change in the accumulators as you add and subtract the
>> input values.
>
> Hi everyone!
> Thank you for your reply!
> I used ModelSim to simulate. The result is the same as when I calculate
> it with a calculator.
> But when I implement it on the FPGA, it still doesn't work.
> I designed an IIR filter of order 10. I use Matlab to generate the
> coefficients for the filter (b(k) and a(k)). I multiply the "b"
> coefficients by 2^14 and the "a" coefficients by 2^5. After
> that I round them. These coefficients are stored in a LUT. I use SDA for
> the filter, because I think an IIR filter consists of two FIR filters: one
> with the "b" coefficients and one with the "a" coefficients. Is that right?

Check your synthesis log to make sure your design did not get
'optimized' away. If you forgot some control signals somewhere, it is
possible that synthesis deduced that some of your design had static
elements, removed them to optimize, then deduced that everything else is
now unconnected and removed that as well.

BTW, also make sure your VHDL agrees with your board's reset polarity.

Article: 117597
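The coefficient scaling described in the quoted IIR question (b times 2^14, a times 2^5, then rounding) and the view of the filter as a feed-forward sum plus a feedback sum can be checked in software before blaming the hardware. The sketch below is illustrative only: the first-order coefficients are hypothetical values picked for this example, not the poster's 10th-order design, and the filter is evaluated in floating point on the rounded coefficients rather than in true fixed-point arithmetic.

```python
def quantize(coeffs, frac_bits):
    """Scale by 2**frac_bits and round to the nearest integer."""
    return [round(c * (1 << frac_bits)) for c in coeffs]


def iir_direct_form_1(x, b, a):
    """y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k], with a[0] == 1.

    The first sum is an FIR filter on the input, the second an FIR
    filter on past outputs, which is the 'two FIR filters' view.
    """
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y


# HYPOTHETICAL first-order low-pass, chosen only to show the rounding effect
b = [0.0609, 0.0609]
a = [1.0, -0.8782]

b_q = [c / (1 << 14) for c in quantize(b, 14)]  # b scaled by 2^14
a_q = [c / (1 << 5) for c in quantize(a, 5)]    # a scaled by 2^5

step = [1.0] * 200
dc_gain_ideal = iir_direct_form_1(step, b, a)[-1]
dc_gain_quant = iir_direct_form_1(step, b_q, a_q)[-1]
print(a_q, dc_gain_ideal, dc_gain_quant)
```

Even in this tiny example the 5 fractional bits on the feedback coefficient shift the pole and change the DC gain by a few percent; a 10th-order direct-form filter is generally far more sensitive to coefficient rounding, so comparing the quantized response against the ideal one in simulation is a cheap sanity check.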
>> What is the resolution required? 18X18? or 9X9? Or really 25 X 18?
> More like 36x36 which will increase the required number of FPGAs by a
> factor of 4.

Google for karatsuba, you can do it with just 3 ;)

Article: 117598
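The Karatsuba trick mentioned above replaces the four 18x18 partial products of a 36x36 multiplication with three multiplications plus some extra additions and subtractions. A minimal sketch for unsigned operands follows; it is meant to show the identity, not a hardware mapping. One caveat worth stating: the middle product multiplies two sums of 18-bit halves, so its operands can be 19 bits wide and may not fit a single 18x18 hard multiplier without further tricks.

```python
MASK18 = (1 << 18) - 1


def karatsuba36(a, b):
    """Unsigned 36x36 product using three multiplications instead of four."""
    a_hi, a_lo = a >> 18, a & MASK18
    b_hi, b_lo = b >> 18, b & MASK18

    hi = a_hi * b_hi                       # multiplication 1
    lo = a_lo * b_lo                       # multiplication 2
    # multiplication 3: operands are up to 19 bits wide
    mid = (a_hi + a_lo) * (b_hi + b_lo) - hi - lo

    return (hi << 36) + (mid << 18) + lo


x, y = (1 << 36) - 1, 0xABCDEF123
print(karatsuba36(x, y) == x * y)
```

Whether the saved multiplier is worth the extra adders and the wider middle product depends on how scarce the hard multipliers are relative to fabric in the chosen device.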
ryan_usenet@yahoo.com wrote:
> On 4 Apr., 17:22, "Symon" <symon_bre...@hotmail.com> wrote:
>
>> Hi Ryan,
>> Don't forget FPGAs have other resources apart from 'hard' multipliers.
>
> If the sequential multiplier could be used in 'pipeline' operation
> (meaning the result is delayed by n clocks, but then a new result is
> produce in each cycle), this would be interesting, but I guess, it
> cannot?

If you want to infer pipelined multipliers with ISE, the basic coding
template is simple and is the same for both hardware and fabric
multipliers:

    process(clk)
    begin
      if rising_edge(clk) then
        -- multpipe0..2 are inA'LENGTH + inB'LENGTH bits wide
        multpipe0 <= inA * inB;
        multpipe1 <= multpipe0;
        multpipe2 <= multpipe1;
        -- You can truncate the result here if necessary
        multout   <= multpipe2;
      end if;
    end process;

This will yield one multiplier result each clock and have a latency of
four cycles from input to output. Depending on what FPGA family you are
using and your synthesis tools, you may need more or fewer pipeline
registers - these will be either absorbed by the hardware multipliers
(up to four for Spartan 2 and V2Pro, two for V4/V5) or distributed
within fabric multipliers to improve timings.

AFAIK, there is no way to infer a sequential multiplier... all you can
have is a "large combinational blob", which is really slow, or a
pipelined one, both using roughly the same amount of FPGA slices when
implemented in fabric. If you want sequential for whatever reason, you
will have to do it yourself.

Article: 117599
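The latency and throughput claim for the pipelined multiplier template in the post above (one result per clock, four cycles from input to output) can be illustrated with a small cycle-by-cycle software model. This is only a behavioural sketch of a four-register pipeline, with the list positions standing in for multpipe0, multpipe1, multpipe2 and multout; it says nothing about how a synthesis tool maps the registers.

```python
def pipelined_multiplier(pairs, stages=4):
    """Model of a registered multiplier pipeline.

    One operand pair enters per clock edge. Returns the value visible on
    the last register after each edge (None while the pipe is filling).
    """
    pipe = [None] * stages
    outputs = []
    for a, b in pairs:
        # On each clock edge every register takes its predecessor's old
        # value and stage 0 takes the new product.
        pipe = [a * b] + pipe[:-1]
        outputs.append(pipe[-1])
    return outputs


inputs = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12)]
results = pipelined_multiplier(inputs)
print(results)
```

The first product appears after the fourth clock edge, and from then on one product emerges per edge, which is the behaviour the post describes.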
Why do we use Gray code in asynchronous FIFO? (I am aware that only ONE bit changes in Gray code at a time, but what is the advantage of that?)
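The usual reasoning behind the Gray code question is that the FIFO's read and write pointers must each be sampled in the other clock domain. If a binary pointer changes several bits at once, the sampling flip-flops can catch some bits old and some new, giving a value that was never a valid pointer; with Gray code only one bit changes per increment, so the sampled value is either the old pointer or the new one. The sketch below, written for this illustration with a 4-bit pointer as an assumed width, shows the conversions and counts the changing bits.

```python
def bin_to_gray(n):
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Convert a Gray code value back to binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(x, y):
    """Number of bit positions in which x and y differ."""
    return bin(x ^ y).count("1")


WIDTH = 4  # ASSUMED pointer width for the illustration
SIZE = 1 << WIDTH

gray_steps = [bits_changed(bin_to_gray(i), bin_to_gray((i + 1) % SIZE))
              for i in range(SIZE)]
binary_steps = [bits_changed(i, (i + 1) % SIZE) for i in range(SIZE)]

print("Gray bits changed per increment:  ", gray_steps)
print("binary bits changed per increment:", binary_steps)
```

The property also holds across the wrap-around from the last count back to zero, which is why the pointer depth is normally a power of two in such designs.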