Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
["Followup-To:" header set to comp.lang.verilog.] * Colin Marquardt <colin@marquardt-home.de> wrote: > Mike Treseler <mike_treseler@comcast.net> writes: > >> If the aim were to document the >> design, I would do it as comments >> in the source and testbench code. Sounds like what doxygen does for C++ code. Has anyone written a verilog parser for doxygen? Regards Jahagirdar Vijayvithal S > > I document testcases as comments with a special prefix. These are > then filtered out and processed with LaTeX, giving a nice PDF > file. One could also embed diagrams in description languages like > pstricks, metapost, dot, pic etc. and handle them in the same way. > If somebody has already done this, I'd love to hear about it. > > To the original poster: take a look at emacs' vhdl-mode and its > integration into speedbar for a hierarchy parser. > > Cheers, > Colin > > -- > If God had not given us sticky tape, it would have been necessary to > invent it. [Pete Zakel] Regards Jahagirdar Vijayvithal S -- If a man does only what is required of him, he is a slave. If a man does more than is required of him, he is a free man. --Chinese Proverb Jahagirdar .V.S IC Design Engineer , Texas Instruments (India) Ltd. 91-80-25099129(O) 91-80-28540394(R)Article: 87301
Hi, thank you for your answers. With "NO time" I mean that I have no time to synchronize the input when using the external data clock in my FSM. Some more information on the interface: DATA[7..0] 8-bit bidirectional data bus. The FPGA has to drive the bus LOW by default. By sending a non-zero data pattern called TXCMD (transmit command) the FPGA initiates transfers. The direction of DATA[7..0] is controlled by DIR. Contents of the bus lines must be ignored for one clock cycle whenever DIR changes value (turnaround). DIR Controls direction of data bus. The external PHY drives DIR LOW by default so that it can listen to TXCMDs from FPGA. The PHY drives DIR HIGH when it has data for the FPGA. STP The FPGA drives STP HIGH for one clock cycle after the last byte of data was sent to the PHY. NXT The PHY drives NXT HIGH to throttle data. If DIR is LOW, the PHY asserts NXT to notify the FPGA to place the next data byte on DATA[7..0] in the following clock cycle. If DIR is HIGH the PHY asserts NXT HIGH to notify the FPGA a valid byte is on DATA[7..0]. I have tried to illustrate the timing in the following diagram: http://mitglied.lycos.de/vazquez78/ Some dynamic characteristics of the PHY which provides the 60MHz clock: timings with respect to positive edge of PHY clock tSETUP (input-only pins) max. 6.0 ns tHOLD (input-only pins) max. 0.0 ns tOUT (output-only pins) 2pF - ns 12pF - ns 30pF max. 9.0 ns As the plot shows I have to provide data on the next PHY clock cycle as soon as the PHY accepts my TXCMD. So not really much time to do a good synchronous job ... ? Rgds AndréArticle: 87302
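The timing figures above can be turned into a rough per-cycle budget. The sketch below assumes that tOUT is the PHY's clock-to-output delay at 30 pF, that tSETUP is the setup time the PHY needs on its inputs, and that board delay and clock skew are zero unless passed in. The function name and its parameters are illustrative only and do not come from the PHY data sheet.

```python
def io_budget_ns(clock_mhz, phy_tout_ns, phy_tsetup_ns, board_delay_ns=0.0):
    """Return (period, FPGA input window, max FPGA clock-to-out), all in ns.

    Assumes a single clock shared by PHY and FPGA, with data launched and
    captured on the same (rising) edge one period apart.
    """
    period = 1000.0 / clock_mhz
    # Time left after the PHY has driven its outputs: must cover the
    # FPGA's input delay plus input register setup time.
    fpga_input_window = period - phy_tout_ns - board_delay_ns
    # Latest the FPGA may drive its outputs and still meet the PHY's setup.
    fpga_clock_to_out_max = period - phy_tsetup_ns - board_delay_ns
    return period, fpga_input_window, fpga_clock_to_out_max


period, input_window, clock_to_out_max = io_budget_ns(60.0, 9.0, 6.0)
print(f"period {period:.2f} ns, input window {input_window:.2f} ns, "
      f"clock-to-out limit {clock_to_out_max:.2f} ns")
```

Under these assumptions the FPGA has a little under 8 ns to get PHY outputs into a register and a little under 11 ns to drive its own outputs, which is why registering the signals in the I/O cells is usually the first thing to try.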
The best option for you is to add IO buffers to inputs and outputs in synthesis stage. This can be done by specifying the "add IO Buffers" in the "Xilinx Specific Options" tab of XST properties. If your design is such that you do not want to add IO buffers (that is to not connect your signals externally), then you should unset the "Trim Unconnected Signal" option in Map properties. Love SinghalArticle: 87303
at Wed, 20 Jul 2005 18:48:51 GMT in <1121885331.639515.18250 @g44g2000cwa.googlegroups.com>, daniel.leu@gmail.com (Daniel Leu) wrote : >All the details are in the specification: >www.jedec.org/download/search/jesd71.pdf > Well, the spec says *what* the STAPL composer would do but gives no implementation thereof. (Nor is it the job of a specification to supply an implementation to go along with it) What I was hoping for was to find some sort of software tool to help generate custom STAPL files so that it doesn't become a DIY job. For simple stuff one can always simply create a STAPL file manually using a text editor but for anything more elaborate than the very most basic operations I think a tool would be much preferred. Either way, though, there's still the pesky CRC at the end and to do this with a manually-generated file you need something like a hex editor so you can calculate the CRC. -- Alex Rast ad.rast.7@nwnotlink.NOSPAM.com (remove d., .7, not, and .NOSPAM to reply)Article: 87304
Hello experts! I have to implement a receiver for AES/EBU digital audio to I2S-Bus on a Xilinx Spartan3 FPGA. Concerning my level of FPGA-knowledge, I'm through the basic and in-depth tutorials from Xilinx and the whole flashing-LED and stopwatch-stuff ;-) and have also implemented designs of this level on my own. But there's no experience in larger, more complex designs. During my search I came across the SPDIF Interface and I2S Cores from opencores.org, which seem to be way too high for understanding for me at the moment. Has anyone successfully implemented them on a FPGA and could give me a rough idea of the necessary knowledge and engineering effort? The other possibility would be using a commercial IP-core, like for example from coreworks.pt. I've already established contact, their offer seems quite promising. But for comparison reasons, does anybody know other providers for digital audio interface IP-cores? Any help will be greatly appreciated! HolgerArticle: 87305
methi wrote: > shift_register <= shift_register( 3453 downto 0 ) & shiftin; This alone would probably synthesize to 256 LUTs configured as SRL16. No problem there. > shiftout <= shift_register(right-1); But this is a 3453 to 1 multiplexer. Without any tricks you need about 2k LUTs to implement it and it will be reaallllyyyy slow. So you should use the BRAM as suggested by others. Kolja SulimmaArticle: 87306
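The BRAM approach recommended above replaces the wide output multiplexer with a memory and two addresses. The following is a behavioral sketch only, not HDL: the class name, the one-bit data width and the zero-latency read are assumptions made for illustration. A real block RAM has a registered read, so the programmed delay would have to be reduced by the read latency.

```python
class DelayLine:
    """Variable delay modeled as a circular buffer (the BRAM) plus a write pointer."""

    def __init__(self, max_delay):
        # One extra location so that a delay of max_delay never reads
        # the word that is being written in the same tick.
        self.depth = max_delay + 1
        self.mem = [0] * self.depth
        self.wr = 0

    def tick(self, bit_in, n):
        """Advance one clock: store bit_in, return the input from n ticks ago."""
        if not 0 <= n < self.depth:
            raise ValueError("delay out of range")
        self.mem[self.wr] = bit_in
        out = self.mem[(self.wr - n) % self.depth]  # read address trails write by n
        self.wr = (self.wr + 1) % self.depth
        return out


if __name__ == "__main__":
    line = DelayLine(max_delay=3454)
    outputs = [line.tick(1 if t == 5 else 0, 3454) for t in range(4000)]
    print("pulse in at tick 5, out at tick", outputs.index(1))
```

The read address is derived by subtraction, so changing N costs one subtractor instead of a 3454-to-1 multiplexer. Changing N while pulses are in flight can drop or repeat samples in this model.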
methi wrote: > I am trying to delay a pulse by N ticks where N takes a maximum value > of 3454...N is a variable here.... Yes, but a single pulse, or many pulses? That was Ray's question. Can you guarantee that there are at most M pulses within 3454 clock cycles? (with a small value of M) Kolja SulimmaArticle: 87307
Hi Marc, > In short, please explain which of the above comparison columns is > incorrect, and why. I see your point now. The problem is the unfortunate use of the term "Equivalent Four-Input LUTs". This value does *not* represent a simple 4-LUT. What that column really represents is "Equivalent Benchmarked Logic Units". In other words, it is a measurement of capacity based on benchmark results. In this case, we have normalized the result so that our arbitrary unit of measurement matches one Xilinx half-slice (which is labeled "Actual LUTs" in your table). These comparisons take into account all features of the respective logic elements, such as dedicated adders, little XOR and MUX widgets, etc. To obtain the Stratix II value of 186K, we benchmarked a large suite of designs by running Synplify + ISE vs. Quartus to figure out how many ALMs were needed vs. how many half-slices were needed. The result was that 2.6 half-slices were needed to implement the same functionality as 1 ALM (on average). Multiply by the number of ALMs in the 2S180 (71760) and you get 186576. So perhaps a more correct label would have been "Equivalent Virtex Half-Slices". We could have equally chosen to normalize the results to "Stratix Logic Elements". In this case, rather than 186K vs. 178K, the result would be ~179K vs. ~170K (ballpark guess). In that case, we would be scaling the LX200 capacity by our observed Virtex-4 vs. Stratix packing ratio, and we would be scaling the Stratix II ALM capacity by our benchmarked Stratix II vs. Stratix ratio of 2.5 LEs per ALM. Hope that clarifies things. Paul Leventis Altera Corp.Article: 87308
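The normalization described above is a single multiplication, reproduced here with the figures from the post. The function name is made up for illustration, and the two ratios are the benchmark averages quoted in the post rather than anything measured here.

```python
def equivalent_units(alm_count, units_per_alm):
    """Scale an ALM count by a benchmarked packing ratio."""
    return round(alm_count * units_per_alm)


ALMS_IN_2S180 = 71760  # ALM count quoted in the post

# Normalized to Virtex half-slices (2.6 half-slices per ALM, per the post).
print(equivalent_units(ALMS_IN_2S180, 2.6))
# Normalized to Stratix logic elements (2.5 LEs per ALM, per the post).
print(equivalent_units(ALMS_IN_2S180, 2.5))
```

Both results are only as meaningful as the benchmark suite behind the ratio, which is the point argued elsewhere in this thread.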
Hi Tim, I would highly recommend looking at Figure 2-6 of the Stratix II databook (http://www.altera.com/literature/hb/stx2/stx2_sii51002.pdf) to gain a better understanding of exactly what hardware there is in the ALM. This is a very detailed diagram that shows exactly how an ALM is constructed. As you will see, it comprises 2 4-LUTs plus 4 3-LUTs plus a whole bunch of muxing. If you are going to simplify it, it is closest to a 6-LUT with multiple outputs and some replicated internal nodes, not 2 4-LUTs as you characterize it. In a posting on this subject in the past (thread "Mine is bigger than yours..."), I've given a few links to Altera white papers and two published conference papers that better describe the ALM and the architectural choices involved in its development. See http://tinyurl.com/dx88v. As for the Xilinx "Variable LUT", there are big differences between a "composable" LUT architecture (what Xilinx has) and a "fracturable" LUT architecture (the ALM). I've posted a very long (technical) post previously on the subject under the thread "Stratix 2 ALUT architecture patented ?". See http://tinyurl.com/a2jo5. The short version is that in order to make a 5- or 6-input LUT out of Xilinx slices, you use a lot more silicon area. As a case in point, if you were to try to fill up an LX200 with 6-input LUTs in this manner, you would only be able to fit 44,544, while you could fit 71,760 in the 2S180. Regards, Paul Leventis Altera Corp.Article: 87309
Hi Ray, This benchmarking study looks at how the two architectures studied respond to HDL synthesis. This is the method of design used by the vast majority of our user base, who don't have the time or knowledge to hand synthesize large portions of their designs. I will concede that for the advanced designer such as yourself, this comparison method is insufficient to draw a conclusion on how well the architecture will work for you. And yes, if a design is ultra-pipelined, there are fewer opportunities to make use of larger LUTs, so it is conceivable that the advantage would be less in these cases. As for your comment on academic research on LUTs, we still agree with the body of research you point to. Simple LUTs larger than 4-LUTs are less area efficient. The problem with simple 6-LUTs is that you sometimes can't use the whole thing, so you waste a lot of area. With smaller LUTs, you tend to have less wasted LUT area. Of course, taken to an extreme, very small LUTs are not efficient either because there is overhead getting signals to and from LUTs. This is why the ALM is the complicated beast that it is. By allowing the 6-LUT to be fractured, we improve logic packing. We can efficiently combine LUTs of varying sizes into one ALM, greatly reducing the amount of wastage. From a speed perspective, academic research has shown that larger LUTs are better (as far as I can recall at least). There is little speed penalty to using a larger LUT, since the fastest inputs are still the same speed. So to first order, the bigger the LUT, the fewer levels of logic, and those levels still have roughly the same delay (for the critical path). The ALM was designed to give us the best of both worlds -- the speed of a 6-LUT, without the area penalty. Regards, Paul Leventis Altera Corp.Article: 87310
Hi Peter, > Call me biased, but I was thoroughly bored by that presentation. Alex will be crushed to hear that! > Today's FPGA are not just LUTs [snip] > It's like a car salesman bragging: "My trunk is bigger than your trunk, > if you measure it my way, with my set of boxes". But the majority of FPGA area is still consumed by logic. As you point out, this is merely one facet of the suitability of one architecture or another to a given design opportunity. I would say what we're claiming is that we can fit more or slightly larger passengers in our car. And like a good salesman, you are deflecting by steering people towards the seat heater you've got on the passenger side... Regards, Paul Leventis Altera Corp.Article: 87311
"Shanon Fernald" <sfernald@gmail.com> wrote in message news:1121925907.080093.42120@g47g2000cwa.googlegroups.com... > Ok, hi all, I'm new to fpgas and am having some fun with an Altera UP3 > kit. > > In the app I'm developing I have a component that I use 8 times in > parallel and the problem is that in the logic of this component are two > divide by 5 performed on integer variables. I can't use bit shifting > obviously, but is there cheap in terms of LEs way to do a divide by 5 > other than with a divide? This is killing me because the divides use > something like 1500 LEs which is almost 2X larger than the rest of the > logic. > > If there's no other reasonable way I think I can rearchitect it and > make it a divide by 4 and thus make available the use of bit shifting. > > Any help appreciated, thanks. > > Here's a dump of all the crap quartus seems to be adding for the > divide: > > Info: Found 1 design units, including 1 entities, in source file > ../../../../../../../altera/quartus50sp1/libraries/megafunctions/lpm_divide.tdf > Info: Found entity 1: lpm_divide > Info: Found 1 design units, including 1 entities, in source file > db/lpm_divide_smf.tdf > Info: Found entity 1: lpm_divide_smf > Info: Found 1 design units, including 1 entities, in source file > db/sign_div_unsign_uig.tdf > Info: Found entity 1: sign_div_unsign_uig > Info: Found 1 design units, including 1 entities, in source file > db/alt_u_div_1od.tdf > Info: Found entity 1: alt_u_div_1od > Info: Found 1 design units, including 1 entities, in source file > db/add_sub_ke8.tdf > Info: Found entity 1: add_sub_ke8 > Info: Found 1 design units, including 1 entities, in source file > db/add_sub_le8.tdf > Info: Found entity 1: add_sub_le8 > Info: Found 1 design units, including 1 entities, in source file > db/add_sub_me8.tdf > Info: Found entity 1: add_sub_me8 > Info: Found 1 design units, including 1 entities, in source file > db/add_sub_ne8.tdf > Info: Found entity 1: add_sub_ne8 > Info: Found 1 design 
units, including 1 entities, in source file > db/add_sub_la8.tdf > Info: Found entity 1: add_sub_la8 > Shannon : Divide by 5 can be expressed as (multiply by a constant, then divide by 2^N); obviously the divide by 2^N is a shift. So depending on your necessary accuracy you can use: * 3 / 16 (divides by 5.3333) * 13 / 64 (divides by 4.92) * 51 / 256 (divides by 5.02) etc. To implement this, you multiply by your chosen constant then discard N bits. A multiplier is much smaller than a divider. Actually an even smaller way would be to do * 3.25 / 16, i.e. Y = (X + X + X + (X >> 2)) >> 4 (this is a 4.92 divisor). I'm sure there are better ideas. GaryArticle: 87312
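The constants suggested above can be compared with a few lines of integer arithmetic. This is a software sketch of the idea, not a hardware description; the function names are invented for illustration.

```python
def approx_divide(x, multiplier, shift):
    """Multiply by a constant, then discard the low `shift` bits."""
    return (x * multiplier) >> shift


def effective_divisor(multiplier, shift):
    """The divisor that a given multiplier/shift pair really implements."""
    return (1 << shift) / multiplier


def shift_add_divide(x):
    """The adder-only variant from the post: x * 3.25 / 16."""
    return (x + x + x + (x >> 2)) >> 4


if __name__ == "__main__":
    for multiplier, shift in ((3, 4), (13, 6), (51, 8)):
        print(multiplier, shift, round(effective_divisor(multiplier, shift), 4),
              approx_divide(1000, multiplier, shift))
    print("shift-add:", shift_add_divide(1000))
```

Each extra pair of result bits roughly calls for a wider constant, so the choice comes down to how much error the application tolerates against how many adder bits it can spare.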
Holger Blum <usenet0705@kennsch.net> wrote in news:dbo09c$acv$1@online.de: > Hello experts! > > I have to implement a receiver for AES/EBU digital audio to I2S-Bus on a > Xilinx Spartan3 FPGA. > Concerning my level of FPGA-knowledge, I'm through the basic and > in-depth tutorials from Xilinx and the whole flashing-LED and > stopwatch-stuff ;-) and have also implemented designs of this level on > my own. But there's no experience in larger, more complex designs. > > During my search I came across the SPDIF Interface and I2S Cores from > opencores.org, which seem to be way too high for understanding for me at > the moment. Has anyone successfully implemented them on a FPGA and could > give me a rough idea of the necessary knowledge and engineering effort? > > The other possibility would be using a commercial IP-core, like for > example from coreworks.pt. I've already established contact, their offer > seems quite promising. But for comparison reasons, does anybody know > other providers for digital audio interface IP-cores? > > Any help will be greatly appreciated! > Holger > Altera has an application note for their devices: http://www.altera.com/literature/an/an369.pdf -- Al Clark Danville Signal Processing, Inc. -------------------------------------------------------------------- Purveyors of Fine DSP Hardware and other Cool Stuff Available at http://www.danvillesignal.comArticle: 87313
Dear Azam, I can only think of missing I/O buffers, so you need to turn on the "Add I/O Buffers" option in the XST synthesis options. But there is supposed to be a section in the report which says exactly what logic has been optimized away (I think it's called "trimmed"). Are you still having this issue? Regards, VladislavArticle: 87314
Hi, Is there anybody who has used a heat sink on a Stratix FPGA? I need a thermal resistance of less than 6°C/W because the power consumption of my design is 3.8 W, and without a heat sink the maximum power dissipation is 2.4 W. ThanksArticle: 87315
Paul, I was referring mostly to a higher level of tailoring than 'hand synthesis'. I am referring to design decisions that any good designer makes that are influenced by the architecture of the FPGA. As a basic example, there are significant differences in the memory structures between Xilinx and Altera devices. Any decent designer is going to look at what's in the FPGA, at least at the macro level, and tailor his design to that. Sure, the differences are a little more subtle when you are looking at the differences in the fabric rather than at the added features, and I'll concede that there are a lot of designers out there that push the button and hope for the best. The thing is, when pushing the button doesn't achieve acceptable results, those same designers look at the synthesis results and then go back and tweak the design in the areas that the synthesis did the worst with. I would argue that the design adjustments made there are in effect tailoring the design to the architecture, even though it is done somewhat blindly, and certainly not at the level of efficiency of one who writes his RTL in a way that gives the synthesis very strong hints on how to assemble the logic based on the designer's knowledge of the architecture. The point is, HDL synthesis is still only a translation. The tools can't think for the designer. Coding style can have a significant effect on the resulting design. Even at the RTL level, the design can and absolutely should be biased by the underlying structure. I would argue that most of your user base does at least consider the 10,000 mile bird's-eye view of the architecture when doing their designs, and it is this bias that I was referring to that can tip the scales toward any particular architecture. -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 
401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 87316
This is often easier to understand when expressed in binary. 1/5 equals 0.00110011001100110011 etc. 0.00110011 etc equals 11 multiplied by 0.000100010001 etc. 0.000100010001 etc equals 10001 multiplied by 0.0000000100000001 etc. This suggests the following result (in C syntax, with the shifts parenthesized because + binds tighter than << in C): Y1 = (X + (X<<1)) Y2 = (Y1 + (Y1<<4)) Y3 = (Y2 + (Y2<<8)) and we have a result accurate to 16 bits with only three adders (Y3 is X/5 scaled by 2^16, so the low 16 bits are discarded at the end). Y4 = (Y3 + (Y3<<16)) 32 bits with only four adders.Article: 87317
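The three-adder recipe above can be checked exhaustively for 16-bit inputs. In the sketch below the function names are invented, and the "+1" bias in the second function is an addition of this sketch, not part of the post: since 3 * 17 * 257 / 65536 is slightly less than 1/5, the plain version comes out one too low on exact multiples of 5.

```python
def div5_shift_add(x):
    """Approximate x/5 for 0 <= x < 2**16 using three shift-and-add stages."""
    y1 = x + (x << 1)      # x * 3
    y2 = y1 + (y1 << 4)    # x * 3 * 17   = x * 51
    y3 = y2 + (y2 << 8)    # x * 51 * 257 = x * 13107
    return y3 >> 16        # 13107 / 65536 is just under 1/5


def div5_biased(x):
    """Same stages applied to x + 1; matches x // 5 over the 16-bit range."""
    return div5_shift_add(x + 1)


if __name__ == "__main__":
    low = [x for x in range(1 << 16) if div5_shift_add(x) != x // 5]
    print("plain version differs from x // 5 on", len(low), "inputs")
    print("biased version exact:",
          all(div5_biased(x) == x // 5 for x in range(1 << 16)))
```

In hardware the bias costs an incrementer (or a carry-in on the first adder) and one extra input bit, which may or may not be worth it depending on whether an off-by-one on multiples of 5 matters.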
Hi Thank you for the pointer. That document is quite nice. I followed the tutorial (Inserter and Analyzer) and got (more or less --:) the same waveform as the tutorial. ISE/ChipScopePro 6.3 has the following differences (compared to version 4.2), among others. - In 'Inserter', there is no 'Extended matching' button in 'Match setting' -----> So I ignored it. - In 'Inserter', 'data depth' is minimally '512' -----> So I chose 512. - In 'Analyzer', there is no 'match length type' and 'match length value' in 'Trigger setting' -----> So I ignored it. - In 'Analyzer', there is no 'capture type' -----> So I ignored it. The waveform says that the counter logic is okay. The problem is that - The waveform starts with the counter value "0011 0110 0101 0010" (with setting depth=512, position=100) - At time '0', the counter value is "0011 0110 1011 0110", meaning first value + 100 Is it problematic? BTW, I have two things unclear for me about 'match value' and 'position' in 'Trigger setup'. Regarding the trigger condition : "00000001" < 2 match functions < "00000011" - Is it correct that we make this condition in order to consider the 2 cycles of 'ILA' latency ? Regarding 'position' - In my case, . But in 'position' 100 (as indicated in the tutorial), the counter value is not "0000 0000 0000 0010", which was expected from the tutorial. Anyway I wish to have "0000 0000 0000 0010" at time step '0'. Thank you again :) RegardsArticle: 87318
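The two counter values quoted above can be checked against the position setting. A small sketch, with a made-up helper name, that parses the two bit patterns and takes their difference; it assumes the counter increments once per captured sample.

```python
def bits_to_int(pattern):
    """Parse a space-separated bit string such as '0011 0110 0101 0010'."""
    return int(pattern.replace(" ", ""), 2)


first_sample = bits_to_int("0011 0110 0101 0010")  # value at the start of the window
at_time_zero = bits_to_int("0011 0110 1011 0110")  # value at time '0'

offset = at_time_zero - first_sample
print(hex(first_sample), hex(at_time_zero), offset)
```

The difference equals the position setting of 100, which is what one would expect if the capture window simply starts 100 samples before the trigger point on a free-running counter. That reading suggests the counter itself is fine and that the open question is why the trigger fires at this value rather than at "0000 0000 0000 0010".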
Philip Freidin wrote: >Actually, what this does is benchmark your "experts". And the results >at best are only valid if you then use that expert for your design :-) > > > Philip, good point. I think you see what I am trying to say. Basically, that this "mine's bigger" pissing contest is at best a demonstration of how well the parts compare on an undisclosed set of benchmarks that are biased towards the favorite device. My point in my previous posts is that */any/* design is going to be biased toward one of the devices. That bias can be toyed with by making changes in the benchmark designs (and I am talking about changes at the RTL level, not hand crafting). Even naive designs have this bias, but I suspect that the benchmarks were either designs done by the vendor's FAEs or by the vendor's customers, which would already introduce a distinct natural bias toward that vendor's devices. Even if he did use naive designs, marketing would have undoubtedly polished the numbers by either tweaking the designs or cherry picking the benchmarks to support his sales pitch. I'm not saying there is anything wrong with that, just trying to expose the nearly unavoidable bias that is naturally there. -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 87319
Our HW folks found the issue. There is a bug in Xilinx's genace.tcl script in its support for the -jprog option on a JTAG chain with multiple FPGAs. For a JTAG chain with multiple devices, we have to customize the -jprog code ourselves. -TonyArticle: 87320
Page 80 of http://direct.xilinx.com/bvdocs/userguides/ug002.pdf discusses the DCM and the fixed phase offset feature. To get 0, 90, 180, 270 use 0 offset. To get anything else, set the offset to the desired fractional value (8 bits describe the N/256 of a period shift). For example, a 45 degree offset would be 32. 32/256 = 1/8 period. 1/8 period = 45/360, or 45 degrees. Then CLK0, CLK2X, CLKDV are all 45 degrees offset, and CLK90 is 135 degrees, CLK180 is 225 degrees, and CLK270 is 315 degrees offset. You will end up using BUFGs for all clocks to make sure the skew between clocks does not cause you any concerns. Use two DCMs both driven by the same CLKIN source to get more than 4 separate phases with offsets (drive in parallel). It is suggested that you do not place them in tandem, as that increases the jitter. If jitter is not a concern, then use in tandem is OK for the DLL part of the DCM (however do not drive the second DCM input CLKIN from the CLKFX of a first DCM -- that is not guaranteed to work, but CLKIN of the second DCM from CLK0, CLK90, CLK180, CLK270, CLK2X, CLKDV is OK). AustinArticle: 87321
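The N/256 arithmetic above is easy to get wrong by hand, so here is a small sketch of it. The function names are illustrative and not part of any Xilinx tool; the only facts used are the ones in the post, namely that the offset is expressed in 1/256ths of the input period and that the quadrature outputs follow CLK0 by 90, 180 and 270 degrees.

```python
def phase_shift_value(degrees):
    """Nearest fixed-offset setting N such that N/256 of a period is `degrees`."""
    return round(degrees / 360.0 * 256)


def output_phases(n):
    """Resulting phase, in degrees, of each quadrature output for offset N."""
    base = n / 256.0 * 360.0
    taps = (("CLK0", 0), ("CLK90", 90), ("CLK180", 180), ("CLK270", 270))
    return {name: (base + tap) % 360.0 for name, tap in taps}


if __name__ == "__main__":
    n = phase_shift_value(45)
    print(n, output_phases(n))
```

Because the step is 360/256 degrees, most requested angles land on the nearest step rather than exactly; 45 degrees happens to be an exact multiple.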
My company is attempting to produce a secure switch based on the Altera Stratix II (EP2S60) board. Unfortunately, Altera no longer sells the Ethernet Development Kit. We're attempting to build a replica of the daughter card that came with the kit from Altera's designs, but it would be useful to have a working example of the card for comparison purposes. So if anyone has a daughter card based on the LAN91C111 chip that they'd be willing to part with, please e-mail me. The folks I've been talking to at Altera think they can locate the documentation, but not the part. --Brent KuceraArticle: 87322
ALuPin@web.de wrote: > I have tried to illustrate the timing in the following diagram: > http://mitglied.lycos.de/vazquez78/ Looks like a standard synchronous handshake to me. fpga sees NXT then registers DATA and drives STP on the next clock (or later if collecting a burst) Good luck. -- Mike TreselerArticle: 87323
brentkucera@gmail.com wrote: > So if anyone has a daughter card based on the LAN91C111 chip that > they'd be willing to part with, please e-mail me. The folks I've been > talking to at Altera think they can locate the documentation, but not > the part. Shouldn't be too hard to prototype one: http://www.google.com/search?q=LAN91C111+ethernet+reference+design -- Mike TreselerArticle: 87324
Andre, From what I see on the web (timing diagram), I think it's just a pipeline which needs to be enabled/disabled. Maybe I am wrong, but where do you need to do everything in one cycle? You always "prepare" the next data before opening the bus towards the external PHY. Vladislav <ALuPin@web.de> wrote in message news:1121932820.228642.199200@z14g2000cwz.googlegroups.com... Hi, thank you for your answers. With "NO time" I mean that I have no time to synchronize the input when using the external data clock in my FSM. Some more information on the interface: DATA[7..0] 8-bit bidirectional data bus. The FPGA has to drive the bus LOW by default. By sending a non-zero data pattern called TXCMD (transmit command) the FPGA initiates transfers. The direction of DATA[7..0] is controlled by DIR. Contents of the bus lines must be ignored for one clock cycle whenever DIR changes value (turnaround). DIR Controls direction of data bus. The external PHY drives DIR LOW by default so that it can listen to TXCMDs from FPGA. The PHY drives DIR HIGH when it has data for the FPGA. STP The FPGA drives STP HIGH for one clock cycle after the last byte of data was sent to the PHY. NXT The PHY drives NXT HIGH to throttle data. If DIR is LOW, the PHY asserts NXT to notify the FPGA to place the next data byte on DATA[7..0] in the following clock cycle. If DIR is HIGH the PHY asserts NXT HIGH to notify the FPGA a valid byte is on DATA[7..0]. I have tried to illustrate the timing in the following diagram: http://mitglied.lycos.de/vazquez78/ Some dynamic characteristics of the PHY which provides the 60MHz clock: timings with respect to positive edge of PHY clock tSETUP (input-only pins) max. 6.0 ns tHOLD (input-only pins) max. 0.0 ns tOUT (output-only pins) 2pF - ns 12pF - ns 30pF max. 9.0 ns As the plot shows I have to provide data on the next PHY clock cycle as soon as the PHY accepts my TXCMD. So not really much time to do a good synchronous job ... ? Rgds André