Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mehdi,

What exactly are you looking for, in terms of partial reconfiguration support? Generally, partial reconfiguration is already well supported, especially in Virtex-4 and Virtex-5 devices. The early XAPP290 flow works, although it's a bit clunky. However, the Early Access Partial Reconfiguration (access approval required) flow also works very well and is significantly more efficient. Although it's not a "hard" requirement, you probably need PlanAhead to really employ it. I could be wrong, but I don't think ISE 9.1i includes any "silver bullet" partial reconfiguration support, if you're looking for an automated process. But certainly look into the EAPR flow -- it's as good as it gets for the time being. I've personally experienced a high degree of success with it.

Julian Kain

El-Mehdi Taileb wrote:
> Hi all!
> ISE 9.1i is finally out!
> I read the marketing papers but didn't find any word about partial
> reconfiguration support.
>
> Mehdi

Article: 114551
Hi,
I need to implement a UHF transmitter in an FPGA (Actel). Please let me know what blocks the transmitter requires. Any help with the VHDL code would also be appreciated.
Thank you,
sindhu

Article: 114552
Antti wrote:
> I wonder if thats only from Europe or is that something at Xilinx
> server, all login attemps time-out today :(
>
> Antti

The same problem here today. There are also problems in the download section: it's impossible to choose ChipScope 8.2 SP4 for Linux from the menus.

Mehdi

Article: 114553
Thanks to all for sending their comments and useful links. I'm worried about the phase misalignment of the divided clock with respect to the source clock, caused by the combinational logic on the output. I would welcome your suggestions/comments on this.

Note: duty cycles of 33% or 66% will not work in our case, so I really need a clock with a 50% duty cycle.

Thanks a lot once again,
Sudheer.

Article: 114554
Antonio Di Bacco wrote:
> To fix the error you can also patch the xsim.h file under the Xilinx
> installation directory and use your brand new gcc 4.1 that ships with Suse
> 10.1
>
> Bye,
> Antonio.

Can you please give me more details?

Mehdi

Article: 114555
spectrallypure wrote:
> Hi all! I am experiencing a very strange and rather frustrating problem
> while trying to run the same backannotated simulation in two different
> versions of Modelsim. In both cases I am using exactly the very same
> files for everything, and also the same compilation and simulation
> commands. The old version (Modelsim 5.8b) simulates fine and gives the
> expected results, while a newer release (Modelsim 6.2e) gives erroneous
> results. The following caption explains the situation (copy and paste
> the following link in your browser):

One possibility is a bug in one of the Modelsim versions. SDF annotation, at least in huge chips, usually exposes some problems in the simulation engine. Timing-engine errors and differences are usually very hard to track down; you need at least the Verilog standard at hand to check the exact intended behavior. The differences are usually much bigger between different simulator vendors, even though all of them should behave the same way. I would suggest sending a testcase to Mentor; they have more tools to pinpoint what really goes wrong.

Also check the simulation resolution you are using; rounding errors might differ between tool versions. I did a signoff with 5.8b and 6.2a; both of them had some minor problems, and different signoff corners needed different simulator versions to behave the same way as static timing analysis (which is usually the golden model). If you have very complex tristate/opendrain structures, 5.8b might give different results than newer versions (the 5.8b results are the incorrect ones). And some early 6.2 versions had problems with conditional timing checks.

Also, if the design is small, have you tried comparing the nonoptimized, fast, and vopt flows? As far as I know the vopt timing engine is a new one, and if you use the fast flow the engine behaves differently (the old way).
Also, if the design is small, the easiest way to compare the results is to record all the signals during the simulation (add log -r /*) and then do a wave compare in the newer simulator version. Then search the comparison for the first difference. You might need to raise the comparison's default limits to much higher values for the comparison to complete (millions of differences allowed).

--Kim

Article: 114556
"motty" <mottoblatto@yahoo.com> wrote in message news:1169179994.331058.234730@a75g2000cwd.googlegroups.com...
> The hyperlinks in Xilinx' Timing Analyzer do not go to valid web pages.
> I am using 8.2 with the latest SP. I've done a search on individual
> delays and found some hits, but is there a definitive destination on
> the web site that I can find all the definitions?
>
%xilinx%\doc\usenglish\help\delay_types\html\web_ds_v4

Article: 114557
Hi,

spectrallypure schrieb:
> files for everything, and also the same compilation and simulation
> commands. The old version (Modelsim 5.8b) simulates fine and gives the
> expected results, while a newer release (Modelsim 6.2e) gives erroneous
> results. The following caption explains the situation (copy and paste

I think some of the preferences changed between the two versions; I would start by using the same preferences in both. Second, the compilation result may change even with the same switches, as the defaults changed. Check the release notes for 6.2e to see if any of this applies to your design.

bye Thomas

Article: 114558
Could someone please share a circuit for a phase detector? (I do not want to reinvent the wheel.) None of the circuits I have found on the Internet work.

Article: 114559
Eric Smith wrote:
> Phil Hays wrote about a one's complement adder, which uses
> end-around carry:
>> It looks to me like there is a chance of a pulse running around the carry
>> loop for a multiple times, perhaps even forever.
>
> There is no way that can happen if the inputs are stable. For an n-bit
> one's complement adder, there can't be a carry propagation wider than n
> bits.
>
> In other words, in a 16-bit adder, a carry propagation might start in
> position 5, but even if it wraps it cannot cause carry propagation
> past position 4.
>
> You can prove this by exhaustion for a short word width, and then by
> induction for longer word widths.

Eric, I was wondering if you could provide a reference to a good book or article that covers the proof you mention here?

Regards,
Koen

Article: 114560
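[Editor's note: the "proof by exhaustion for a short word width" that Eric mentions is easy to run in software. A minimal behavioral sketch (Python, not from the thread) of an n-bit one's complement adder with end-around carry, plus an exhaustive check that wrapping the carry once can never trigger a second wrap:]

```python
def ones_complement_add(a, b, n=16):
    """n-bit one's complement addition with end-around carry."""
    mask = (1 << n) - 1
    s = a + b
    if s > mask:              # carry out of bit n-1:
        s = (s & mask) + 1    # wrap it back into bit 0 (end-around carry)
    return s & mask

# Proof by exhaustion for a short word width: after one wrap, the sum
# never overflows again, so the carry cannot circulate forever.
n = 4
mask = (1 << n) - 1
for a in range(1 << n):
    for b in range(1 << n):
        s = a + b
        if s > mask:
            assert (s & mask) + 1 <= mask  # no second carry-out

# Example: 3 + (-2). In 4-bit one's complement, -2 is 0b1101 (13).
print(ones_complement_add(3, 13, 4))  # 1
```

The assertion inside the loop is exactly the bounded-propagation claim: a wrapped carry lands in a bit position that the original propagation has already passed, so it cannot overflow again.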
Hi,

Thanks for the info and your comments on the frustrations of using GDB and XMD. I'll try it and let you know. Thanks.

Regards,
R.Bal

Jhlw wrote:
> Hello rbal,
>
> What I did when creating a new application was to also
> copy the linker script from "TestApp_Memory" and then
> configure that as my linker script. That used the BRAM.
> Then I used "Generate Link Script" and the size of the
> boot section was 0x00000010. Then I just used the combo
> box to select SRAM. So make sure that you create
> TestApp_Memory when you use BSB to set up your
> system.
> Later, after you successfully load your code with GDB,
> it seems to make a difference between clicking on the
> Run icon (small pic of running man) and using the drop-
> down menu Run command. If you don't click on the icon,
> but click on the menu Run command, it seems to work
> and stop on the breakpoint at the beginning of main,
> otherwise the GDB window goes blank and the message
> at the bottom of the window frame says "stopped".
>
> Regards,
> -James
>
> rbal wrote:
> > Jhlw wrote:
> > > Hello All,
> > >
> > > I figured out how to run an application from external memory in Xilinx.
> > > I have EDK ver 8.2.01i and an ML403 board.
> > > What you do is "Mark to init BRAMs" on the default bootloop app in
> > > the Applications window pane, update your bitstream and then load
> > > it into the board using iMPACT.
> > > Change your linker script using "Generate Link Script" for your
> > > large application and set the .text section to SRAM. Rebuild your
> > > app.
> >
> > I have done up to this point on the ML402 board, but I am not able to run
> > the program from external memory.
> >
> > > You also have to set the .boot section (0x00000010) to SRAM.
> >
> > How to set the .boot section? Do I have to create a new section and
> > select the memory as SRAM? But it shows "0 bytes" and I couldn't select the
> > address - 0x00000010. And I am not able to run the application from external
> > memory. Please advise me. Thanks
> >
> > Regards,
> > R.Bal

Article: 114561
On 2007-01-19, John_H <newsgroup@johnhandwork.com> wrote:
> Austin,
>
> With the plethora of customer designs Xilinx has archived for software
> benchmarking, could the gate-to-LUT utilization be determined "on average?"
> There would be different classes of designs in the same sense there are
> different versions of the Virtex-5 family. Logic-only designs would have a
> very different equivalent gate count than the heavy signal processing
> solution which would also be different than light signal processing.

If someone is feeling bored, it should be possible to download all the functioning packages from OpenCores, synthesize them for both FPGA and ASIC targets, and look at the difference. That said, the number of gates an ASIC design will occupy will also differ depending on your optimization goal (area, power, or speed).

/Andreas

Article: 114562
Austin Lesea wrote:
> I have been asked, many times, how an ASIC "gate" compares to a FPGA "gate."
> Now don't just groan, and hit ignore, bear with me (if you have an
> opinion or feel like a comment).
> An ASIC "gate" (in my feeble mind) is 4 transistors, arranged as a NOR,
> or a NAND. From that basic element, you can make everything else, or at
> least represent the complexity of everything else.

As I understand it, for CMOS it is four transistors in any shape, but as you say, they can make a NAND or NOR gate.

> Now take a FPGA. Look at the LUT. Take the 4 LUT in Virtex 4. It is
> 16 memory cells. Is that 32 "gates"? What happens when you use it as a
> 16 bit LUTRAM, or SRL16? Isn't that closer to 64 "gates"?

A single LUT RAM I would probably count closer to 128 gates, including the address decoders, but when implementing a large RAM it would average closer to 64.

> If I use the LUT as a 2 input NAND gate, then it is one "gate" and I
> have to use some LUTs as small gates, so I obviously can't count all my
> LUTs as 64 gates!

I haven't written this for a while, so I can probably say it again. It probably makes more sense at the current sizes than it used to. What you need is a scalable design (or designs), yet one that is reasonably representative of real designs. (I used to design systolic arrays, which scale fairly well. Most designs probably don't.) Given a scalable design, you increase the scale to the biggest that will fit in a given device, compute a reasonably equivalent gate count (in terms of CMOS), and use that.

It might almost work in terms of a modern CPU. Set up the design such that the data path width is variable (I believe that is easy to do in Verilog). Not that anyone will ever want to use a 97 bit wide processor, but as a measuring tool it should work. (It might be that one should optimize for width divided by clock cycle time, to fairly penalize designs based on routing through slow pathways.)

-- glen

Article: 114563
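[Editor's note: the core of Austin's problem — that a LUT's "gate equivalent" depends on how the LUT is used — can be made concrete with a toy calculation. A sketch (Python; the per-use weights below are illustrative assumptions, not vendor figures) of an equivalent gate count as a usage-weighted sum:]

```python
# Hypothetical equivalent-gate weights per LUT, by how the LUT is used.
# These numbers are assumptions for illustration only, not vendor data.
GATE_WEIGHTS = {
    "small_gate": 1,    # LUT used as a 2-input NAND/NOR
    "full_logic": 8,    # LUT implementing a dense 4-input function
    "srl16": 64,        # LUT used as a 16-bit shift register
    "lutram": 64,       # LUT used as distributed RAM (averaged over a big array)
}

def equivalent_gates(lut_usage):
    """lut_usage: dict mapping usage class -> number of LUTs used that way."""
    return sum(GATE_WEIGHTS[use] * count for use, count in lut_usage.items())

# Two designs with the same total LUT count (1000) but wildly different
# "gate counts" -- which is why a single marketing number misleads:
logic_heavy = {"small_gate": 600, "full_logic": 400}
memory_heavy = {"srl16": 500, "lutram": 500}
print(equivalent_gates(logic_heavy))   # 600*1 + 400*8   = 3800
print(equivalent_gates(memory_heavy))  # 500*64 + 500*64 = 64000
```

The 3800-versus-64000 spread for the same LUT count is the whole argument: any single published gate count implicitly assumes a usage mix.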
Austin Lesea wrote:

(snip regarding 3.3v differential signals)

> No. The IO pins are tied to a supply through diodes, so applying
> any signal that exceeds that supply will forward bias the clamp diode.
> The common mode voltage also wants to be about 1/2 of the supply
> (typically, but not always, as we see below).

(I was about to jokingly suggest capacitive coupling, then I scrolled down.)

> It appears that V4 FX wants to set its own common mode, hence the reason
> why capacitive coupling is required. And since it has an internal
> termination, it makes the only specification required, the peak to peak
> swing (everything else is automatically taken care of).

I suppose for a clock, capacitive coupling works fine. It doesn't for a signal with an arbitrarily long time between transitions.

-- glen

Article: 114564
anand wrote:
> Thanks for the response. I will try to outline what I am doing with
> some specific concerns I have:
> Basically, I am working on behalf of a company that develops compute
> intensive algorithms for biological applications using a SW programming
> language like C/C++. That company is trying to get a performance boost
> by mapping the same algorithms onto a Hardware Platform like an FPGA or
> ASIC or anything in between. Main idea is to see if we can get a
> minimum of 10X-20X speedup versus a software implementation.

Systolic array processors. Especially if the problem is dynamic programming, but many others work well as systolic arrays, too.

> Here are some of my basic concerns:
> (1) Many of the end-customers now use laptops as their only computers,
> sometimes with a docking station and/or external keyboard and monitor
> when they're at their desks. How would a endcustomer implement such a
> hardware solution? Would it come as a plug-in card for a card slot
> (unusable on a laptop), or in some format that would enable using it on
> a laptop? How about a standalone box?

There are PCMCIA cards with FPGAs in them. For real accelerators you want a large box (maybe desktop PC size), but for 10x or 20x a PCMCIA card might do.

> (2) Assuming one is able to connect the HW implementation to a laptop,
> how would the end customer feed the input files. Note that in some
> apps, the input file is ASCII text, while in other apps, it may be
> binary files in a proprietary format. How does the output of the
> simulation be collected? Wd it be redirected to an ASCII text file?

This isn't really applicable to the discussion.

> (3) What happens when the algorithm needs to be updated? Is there a
> way to "update" the hardware (such as an FPGA), or does is it mean the
> hardware becomes obsolete and must be replace (if so, at what kind of
> cost to an end user)?

FPGA is a good choice. Not only can it be updated when the algorithm needs to change, but it can be updated while running. It might be that you make one pass through the data with one configuration, reprogram, and process it with another configuration.

> (4) Hardware/Software Partitioning: Can various "core" functions be
> programmed into the hardware while still allowing other functions to be
> in software in order to provide flexibility in the mathematical models?
> If so, is the potential speed advantage still high?

That would be the usual way, yes. Find the parts of the algorithm where the most intense computation is done. It might be a few lines of C inside nested for loops.

> (5) Can you shed some light on how one can translate existing code
> from C/C++ to a HW platform? What tools would be used, how would the
> design be verified, and how long does it take to get a working demo
> version?

The design process is completely different from serial C programming.

> (6) What about if the existing code is in a proprietary language, other
> than C/C++? Is it possible to translate into a HW mapping in that case?

You want to go from the description of the algorithm, not C or C++ code. In that case, a proprietary language is probably better.

> (7) Finally, to get a demo/working prototype, what do you recommend,
> FPGA, or ASIC or something in between and why? If you had to take a
> stab at guessing the cost for developing such a prototype, what would
> it be? Assume about 100,000 lines of existing code in C/C++.

It is the wrong way to think about it if you have 100,000 lines of C code. Consider an algorithm based on an FFT. The total may be 100,000 lines, but the FFT (or at least DFT) can be written in just a few lines. You want to find those few lines that are executed 1e12 times, program those into the FPGA, write code to do the I/O to the FPGA, and leave the rest of the 100,000 lines as they are.

-- glen

Article: 114565
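[Editor's note: the first step of the partitioning glen describes — finding the few lines executed enormous numbers of times — is usually done with a software profiler before any hardware work. A sketch (Python, with a made-up workload standing in for the real application) of locating the hot inner loop:]

```python
import cProfile
import io
import pstats

def hot_kernel(data):
    # Stand-in for the compute-intensive inner loop (e.g. a DP cell update).
    total = 0
    for x in data:
        total += (x * x) % 97
    return total

def application(n=200_000):
    # Stand-in for the other "100,000 lines": setup, I/O, glue code.
    data = list(range(n))
    return hot_kernel(data)

profiler = cProfile.Profile()
profiler.enable()
result = application()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())  # hot_kernel should dominate the cumulative time
```

Whatever dominates the profile is the candidate for the FPGA; the surrounding code stays in software and just feeds it data.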
Kim Enkovaara wrote:
> glen herrmannsfeldt wrote:
>> axr0284 wrote:
>>> When computing the packet for sending, it gives me the proper answer.
>>> It's when receiving that i am having an issue since I do not see the
>>> "magic number" as the output of the CRC module. I think i might be
>>> feeding in the original CRC at the end of the packet the wrong way or
>>> something.
>> That wouldn't make sense, because, as the bits come in you don't know
>> when the CRC starts until after it is done. Consider doing it with
>> an LFSR with the bit stream as input.
> The packet start and end are available from the physical layer (for
> example in 8b/10b coding the S and T sets). Add a few pipeline stages and
> the crc can be easily extracted from the packet.
> I think the single bit LFSR solution is the worst one with current
> technology. With gigabit Ethernet the LFSR must run at 1GHz frequency,
> that is hard even with ASICs and consumes power. With 32b parallel crc
> calculation the frequency is just ~30MHz which is easy to handle, and
> with 32 bits the parallel implementation does not have so many layers of
> logic.

Well, the OP didn't mention speed, but my answer was mostly meant in conceptual terms. One can go through and determine the result of the LFSR, and then work out how to implement that in parallel. I believe the solution is somewhat different than the usual parallel software (look-up table) solution, but I don't remember it right now. When Ethernet was new, logic was much more expensive and the LFSR solution worked well.

-- glen

Article: 114566
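[Editor's note: the "magic number" behavior axr0284 is chasing is easy to demonstrate in software: run the same bit-serial LFSR over the message *plus* its appended CRC, and the result is a fixed constant regardless of the message contents. A sketch (Python, standard Ethernet CRC-32; zlib is used only as a cross-check):]

```python
import struct
import zlib

def crc32_bitserial(data: bytes) -> int:
    """Bit-serial (LFSR-style) CRC-32, reflected polynomial 0xEDB88320."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):                  # one LFSR step per input bit
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

msg = b"hello fpga"
crc = crc32_bitserial(msg)
assert crc == zlib.crc32(msg)               # matches the library CRC

# Receiver side: feed the message *and* the appended CRC through the
# same LFSR. The result is the fixed constant 0x2144DF1C (after the
# final XOR) for any message -- the "magic number" check.
residue = crc32_bitserial(msg + struct.pack("<I", crc))
print(hex(residue))  # 0x2144df1c
```

If the receiver does not see the constant, the usual culprit is exactly what axr0284 suspects: the CRC bytes appended in the wrong byte or bit order.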
Hi Austin,

remembering the good old XC3000 or XC4000, which had only CLBs and IOBs, it was easy to give a rough estimate of a gate equivalent. The usefulness of that number could be questioned even then. Today's FPGAs are different: not only is there Block RAM and other large macro functions (CPUs, DSP cells, multipliers...), even the LUTs can be used in more than one way.

People tend to reduce complex measures to simple numbers, e.g. clock speed for microprocessors or pixel counts for digital cameras. Both are about as useful (or not) as a gate count for FPGAs. So why not calculate these numbers to feed the marketing people, and publish the calculation algorithm for the adult customers who want to look behind the bare numbers? The experienced customers don't need these numbers anymore and don't care anyway; they know how to find the right chip for their designs.

A "gate count range" would not be useful. The 100% increase in numbers (since you would have two numbers) would be too much for marketing and for simple-minded customers ("Two numbers? But which one is true for me???"). As you and mk already mentioned, there are some designs which prove any number wrong in both directions. But those designs are just peaks in the wide field of average applications, so they are not important to the average customer.

A simple solution for publishing gate counts might look like this:

The XYZ1234 FPGA has a mean equivalent gate count of xxxx k-gates.*
.
.
.
____
* Equivalent gate count depends strongly on the user's application. Gate count calculation facts can be downloaded from http://www.xyz-brand.com/fpgas/gatecount.html

Best regards
Eilert

Article: 114567
"James Wu" <jameswu@yahoo.com> writes:
> Say in an if-statement I assign a value by: "temp <= FPGA_input". This
> if statement is only ran once. Will "temp" always be mapped to FPGA
> input? Or do I have to set temp to the input every time I expect the
> input to change?

Just the once, when the "if" statement runs. You don't have to set it each time you expect it to change, though - just doing

  temp <= FPGA_input;

as a combinatorial line will set temp to FPGA_input each time it changes. And it will synthesise to a wire.

> Also, if I declare a pin an "inout", how do I switch between the two?
> Sometimes an external device will write to my FPGA, othertimes it will
> read.

It's always readable. If you want the external device to drive it, you must make sure you drive it to 'Z' in your FPGA. Then the external device will be able to "override" the Z. E.g.:

  input_signal <= io_pin;  -- this reads the io_pin
  io_pin <= output_signal when output_enable = '1' else 'Z';

will drive output_signal onto the io_pin when output_enable is high.

HTH,
Martin

--
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.net/electronics.html

Article: 114568
K. Sudheer Kumar schrieb:
> I need to generate a 70MHz clock from 210MHz. Is there any way to
> generate it rather than using a DCM.

You could use pseudo dual-edge flipflops.
http://www.ralf-hildebrandt.de/publication/pdf_dff/pde_dff.pdf

Ralf

Article: 114569
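[Editor's note: 210MHz to 70MHz is a divide-by-3 with 50% duty, which is why both clock edges are needed. One common scheme (a behavioral sketch in Python, not taken from Ralf's paper): a posedge-clocked signal that is high 2 of every 3 cycles, re-registered on the negedge, with the two ANDed. Simulating in half-periods of the input clock:]

```python
def div3_50_waveform(n_half_periods=24):
    """Simulate a divide-by-3, 50%-duty divider in input-clock half-periods.

    P is driven from a posedge counter (high for 2 of every 3 cycles);
    N re-registers P on the falling edge (half-cycle delay); out = P and N.
    """
    count = 0          # posedge counter, cycles 0..2
    P = False          # posedge-domain signal
    N = False          # P delayed by half an input period
    out = []
    for h in range(n_half_periods):
        if h % 2 == 0:             # rising edge of the input clock
            P = count != 2
            count = (count + 1) % 3
        else:                      # falling edge of the input clock
            N = P
        out.append(int(P and N))   # output level during this half-period
    return out

wave = div3_50_waveform()
print(wave[:6])                  # [0, 1, 1, 1, 0, 0] -- period of 3 input cycles
print(sum(wave) / len(wave))     # 0.5 -> 50% duty cycle
```

The output repeats every 6 half-periods (3 input cycles) and is high for exactly half of them, which is what a single-edge counter alone cannot achieve for an odd divisor.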
Hi,

I googled around a bit but could not find the answer.

I am using a Xilinx Spartan FPGA, with 2 DCMs in series to generate a 32MHz clock (50% duty cycle) from a 75MHz input. (The first DCM divides by 2.5; the second multiplies by 32 and divides by 30.)

The first DCM is connected to a system reset via a user pin, and the LOCKED signal of this DCM is used to reset the second DCM.

My question: what is the total lock time, i.e. when will the 32MHz clock be available?

Thanks for any insights,

Steven

Article: 114570
On Thu, 18 Jan 2007 11:24:56 -0800, "James Wu" <jameswu@yahoo.com> wrote:

>Say in an if-statement I assign a value by: "temp <= FPGA_input". This if statement is only ran once. Will "temp" always be mapped to FPGA input? Or do I have to set temp to the input every time I expect the input to change?

You only have to do this once. If you are coming from a pure software background, think of the above statement (outside a "process" block) as an independent parallel process, which schedules itself every time its input changes, performs its computation (a simple assignment in this case!) and suspends...

>Also, if I declare a pin an "inout", how do I switch between the two? Sometimes an external device will write to my FPGA, othertimes it will read.

Explicitly, if you want normal inout operation. The signal should be of a "resolved type", i.e. one in which the result of connecting several different signal sources together is explicitly defined in a "resolution function" (e.g. std_logic[_vector], or unsigned from the numeric_std library). Then, if you drive "1" on that line and external sources drive "0", the result will be "X", i.e. unknown, which is generally an indication that something's wrong. (*)

It is normal to drive "Z" on the IO pin when you want to read it (i.e. explicitly turn OFF your output drive) to allow the input signal to be read. When outputting, you have to ensure that the other sources drive "Z"; there has to be some arbitration to accomplish this. A good example would be the OEn output enable signal on a typical SRAM; you would normally drive this with '0' to read from the SRAM, and '1' when writing to it. (The SRAM model will normally drive its I/O pins with data when its OEn input is '0', and with "Z" when OEn is '1'.)

(*) There are alternative schemes, such as "wired OR" schemes, where resolution is accomplished differently, but this is the common case.

- Brian

Article: 114571
> I googled around a bit but could not find the answer.

The data sheets, user guides, and switching characteristics data sheets are your friends.

> I am using Xilinx Spartan FPGA, with 2 DCM's in series to generate a
> 32MHz clock (50% duty cycle) from a 75MHz input. (First DCM is divide
> 2.5, second multiply 32 and divide 30).
>
> The first DCM is connected to a system reset via a user pin, and the
> LOCKED signal of this DCM is used to reset the second DCM.
>
> My question : What is the total lock time i.e. When will the 32MHz
> clock be available ?
>
> Thanks for any insights,
>
> Steven

I'm sure the Spartan data sheets spec the time needed for the DLL to lock for a given output frequency. Find that number for the first DCM, then find the lock time for the DFS on the second DCM, and you should have a ballpark figure. You're talking about milliseconds (at least according to the Virtex-4 data, which should be close to the Spartan figures), so the delay from lock asserting on the first DCM to the second DCM being enabled should be negligible.

Article: 114572
<moogyd@yahoo.co.uk> wrote in message news:1169217267.704821.197420@m58g2000cwm.googlegroups.com...
> Hi,
>
> I googled around a bit but could not find the answer.
>
> I am using Xilinx Spartan FPGA, with 2 DCM's in series to generate a
> 32MHz clock (50% duty cycle) from a 75MHz input. (First DCM is divide
> 2.5, second multiply 32 and divide 30).
>
> The first DCM is connected to a system reset via a user pin, and the
> LOCKED signal of this DCM is used to reset the second DCM.
>
> My question : What is the total lock time i.e. When will the 32MHz
> clock be available ?
>
> Thanks for any insights,
>
> Steven
>
Hi Steven,
In the usenet spirit of answering a different question, here's my insight! So, I guess you're using Spartan-3, as the earlier ones don't have a DCM, IIRC. Why not just multiply by 32/15 and then divide by 5, using the DDR feature of the IOBs to get your 50% duty cycle? Anything internal wouldn't need a 50% duty cycle. Probably!
HTH, Syms.

Article: 114573
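[Editor's note: both frequency plans in this thread are easy to sanity-check with exact rational arithmetic. A quick sketch (Python) of the two DCM cascades discussed above:]

```python
from fractions import Fraction

f_in = Fraction(75)  # MHz input clock

# Steven's plan: divide by 2.5, then multiply by 32 and divide by 30.
plan_a = f_in / Fraction(5, 2) * Fraction(32, 30)
assert plan_a == 32  # 75 / 2.5 = 30 MHz, then 30 * 32/30 = 32 MHz

# Symon's plan: multiply by 32/15, then divide by 5
# (the DDR output flop is what makes the odd division 50% duty).
plan_b = f_in * Fraction(32, 15) / 5
assert plan_b == 32  # 75 * 32/15 = 160 MHz, then 160 / 5 = 32 MHz

print(plan_a, plan_b)  # 32 32
```

Using Fraction instead of floats avoids rounding surprises when checking whether a multiply/divide chain lands exactly on the target frequency.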
axalay wrote:
> Share please the circuit of the phase detector (I do not want to invent
> a bicycle). Any of the circuits laid out in an Internet does not work.

XOR? Short non-descriptive posts get you short non-descriptive answers. : )

Article: 114574
Symon wrote:
> "motty" <mottoblatto@yahoo.com> wrote in message
> news:1169179994.331058.234730@a75g2000cwd.googlegroups.com...
> > The hyperlinks in Xilinx' Timing Analyzer do not go to valid web pages.
> > I am using 8.2 with the latest SP. I've done a search on individual
> > delays and found some hits, but is there a definitive destination on
> > the web site that I can find all the definitions?
> >
> %xilinx%\doc\usenglish\help\delay_types\html\web_ds_v4

Thanks Symon. I should have looked there!