Giox wrote:
> I'm interested in the implementation of a fast adder for 32 bit data.
> The CLA is too expensive so I'm searching for something different, can
> you provide me some reference?
> I think that Ling adder can be a good choice, but I don't know..
> Thanks a lot

On an FPGA, going faster than the dedicated fast carry ripple chain for only 32-bit data might not be easy. What is your target speed and what is your current speed?

Sylvain

Article: 86826
Hi,

I'm using a Virtex 300E and after the synthesis step (not place and route), the frequency is estimated as 82.129 MHz. The performance is better than what I need, but the occupied area is considerable.

Gio

Article: 86827
Giox wrote:
> Hi, I'm using a virtex 300E and after the synthesis step (not place and
> route), the frequency is estimated as 82.129MHz.
> The performances are better than whose that I need, but the occupied
> area is considerable.
> Gio

Which software do you use to synthesize your project? Did you try pipelining? Can you show your source code?

des00

Article: 86828
Hi Ben,

thank you for your answer. I am using Lattice ispLEVER 5.0 together with ModelSim.

What do you mean by "that you just need to make your testbench wait for a few hundred nanoseconds before doing anything"? After deasserting my reset signal in the testbench, my FSM enters the state s_ini_a. After three clock cycles the first action occurs in the FSM. So how can I wait for a few hundred nanoseconds? With some kind of enable signal?

Rgds
André

Ben Jones wrote:
> > I have performed a functional simulation and static timing analysis
> > of my design. Both are OK.
> > But when trying to perform timing simulation my state machine
> > does not change from the first state.
> > What could go wrong?
>
> If you are using the Xilinx ISE tools, then the post-map and post-PAR
> simulation models will model the "power-on reset" of the FPGA for you
> automatically (this is completely separate from any user-defined reset
> signal you might have in your design).
>
> When you generate the simulation models, there is a setting called "Reset On
> Configuration Pulse width" which defines how long (in nanoseconds) the
> synchronous elements will wait in their "INIT" states before responding to
> clock edges. So, it could be that you just need to make your testbench wait
> for a few hundred nanoseconds before doing anything (or reduce the value of
> this pulse width when you create the model).
>
> Hope this helps,
>
> -Ben-

Article: 86829
Hi Brad,

> I am attempting to get data (column_in) from a fast
> clock domain(clk_wr) to a slow clock domain(clk_rd).

I've always liked this paper as a good introduction to inter-domain transfers:

http://www.sunburst-design.com/papers/CummingsSNUG2001SJ_AsyncClk.pdf

There are some other papers from the same author that are pretty good too.

Regards,
Paul Leventis
Altera Corp.

Article: 86830
All

Does anyone have a feel for how robust Aurora is?

I have worked out a packet-based protocol to put on top of Aurora which will re-send packets that are corrupt at the receiver, but what are the chances of the Aurora system itself falling over?

Katherine

Article: 86831
I'm using the Xilinx ISE pack, with its synthesis pack. I'm not able to show the code, but it is simply a 32-bit CLA built from 4 different 8-bit CLAs with group propagate and generate.

Gio

Article: 86832
I have just sent you the file that I was sent.

At about the same time I wanted to find a dev board where a Spartan could control a PCI connector, so that I could connect an FPGA to something that I had bought on a PCI card. I never found one so I designed it: Spartan-3, 32-bit PCI connector, RS232, some SDRAM, LEDs and switches.

I haven't used a core. If you are doing point-to-point PCI, which your post suggests, then you don't want most of the features it would give you. I haven't even done parity, as the chip I'm talking to can be told to ignore it. However, 25 MHz gets me all the bandwidth I need at the moment.

Colin

RobJ wrote:
> There was a thread with this subject line last September. Steve Knapp from
> Xilinx responded in the thread as follows:
>
> "I have a separate document that I will send you that includes all the
> relevant PCI 3.0 timing and which parameters that you can ignore if you're
> on the same PC board."
>
> I sent Steve an email last week but haven't gotten a response yet. If anyone
> has a copy of this document I'd much appreciate a copy. You can send it to
> bsefton@sandel.com.
>
> By the way, anybody running PCI in Spartan-3 at 66MHz? If so, which core are
> you using?
>
> Thanks,
> Bob S.

Article: 86833

"Andi" <00andi@web.de> wrote in message news:ee8f784.2@webx.sUN8CHnE...
> Hi,
>
> how is the interrupt signal specified? As low or high active edge/level?
>
> How does the mhs file look like?

It's level high. No Device ISC is used and the interrupt capture mode is INTR_PASS_THRU. Here is the relevant part of the MHS file.

    BEGIN ppc405
     PARAMETER INSTANCE = ppc405_0
     PARAMETER HW_VER = 2.00.c
     BUS_INTERFACE DPLB = plb
     BUS_INTERFACE IPLB = plb
     BUS_INTERFACE JTAGPPC = jtagppc_0_0
     PORT RSTC405RESETCHIP = RSTC405RESETCHIP
     PORT C405RSTSYSRESETREQ = C405RSTSYSRESETREQ
     PORT C405RSTCORERESETREQ = C405RSTCORERESETREQ
     PORT C405RSTCHIPRESETREQ = C405RSTCHIPRESETREQ
     PORT PLBCLK = sys_clk_s
     PORT RSTC405RESETCORE = RSTC405RESETCORE
     PORT RSTC405RESETSYS = RSTC405RESETSYS
     PORT CPMC405CLOCK = sys_clk_s
     PORT EICC405EXTINPUTIRQ = EICC405EXTINPUTIRQ
    END

    BEGIN plb_decoder
     PARAMETER INSTANCE = plb_decoder_0
     PARAMETER HW_VER = 1.00.a
     PARAMETER C_BASEADDR = 0x90000000
     PARAMETER C_HIGHADDR = 0x900003ff
     BUS_INTERFACE SPLB = plb
     PORT PLB_Clk = sys_clk_s
     PORT IP2INTC_Irpt = EICC405EXTINPUTIRQ
    END

Article: 86834
Have you tried a simple adder?

Verilog:

    module myadd (
        input             clk,
        input      [31:0] a, b,
        output reg [31:0] y
    );
        // Registered sum; non-blocking assignment in the clocked block
        always @(posedge clk)
            y <= a + b;
    endmodule

The dedicated adder circuitry is very fast silicon. Trying to beat the native performance of the adder is difficult. Most people have their performance hurt by having more than one (or two) levels of logic in the adder. If you go from registered inputs to registered outputs you should get significantly better performance than the CLA structure you're trying.

Let us know how your performance changes with a simple 32-bit adder.

Giox wrote:
> I'm using the Xilinx ISE pack, with it's synthesis pack.
> I'm not able to show the code, but it is simply a 32 bit CLA bit from 4
> different 8 bit CLA with group propagate and generate
> Gio

Article: 86835
Which speed grade? You may want to check the latest user guide and data sheet.

HTH,
Jim

"azam" <azamirfan@gmail.com> wrote in message news:1120689352.809458.282120@o13g2000cwo.googlegroups.com...
> I was trying to determine the max frequency that can be clocked thru
> the Virtex-4 FPGA pins. While referring the Virtex-4 User Guide the (DC
> and Switching characteristics) Section of the book has blank columns.
> Although the Virtex-4 FPGA Handbook mentions the capability to meet
> different I/O standards, I was wondering has anyone been able to
> exercise the I/Os of the Virtex Family above 300MHz.
>
> Thanks
> -Azam
>

Article: 86836
Specially for you, I did a simple test with this code:

    library ieee;
    use ieee.std_logic_arith.all;
    use ieee.std_logic_unsigned.all;
    use ieee.std_logic_1164.all;

    entity adder is
      port (
        in_clock   : in  std_logic;
        in_reset_b : in  std_logic;
        in_dataA   : in  std_logic_vector(31 downto 0);
        in_dataB   : in  std_logic_vector(31 downto 0);
        in_strobe  : in  std_logic;
        out_data   : out std_logic_vector(32 downto 0);
        out_strobe : out std_logic
      );
    end entity adder;

    architecture adder of adder is
    begin

      -- registered 33-bit sum, asynchronous active-low reset
      process (in_clock, in_reset_b) is
      begin
        if (in_reset_b = '0') then
          out_data <= (others => '0');
        elsif (rising_edge(in_clock)) then
          if (in_strobe = '1') then
            out_data <= ext(in_dataA, out_data'length) + ext(in_dataB, out_data'length);
          end if;
        end if;
      end process;

      -- strobe delayed by one clock to line up with out_data
      process (in_clock) is
      begin
        if (rising_edge(in_clock)) then
          out_strobe <= in_strobe;
        end if;
      end process;

    end architecture adder;

I synthesized it with Synplify 8.1 for xcv300efg256-6. The Synplify report is:

    Worst slack in design: 3.605
    ...................................
                      Requested   Estimated   Requested   Estimated            Clock      Clock
    Starting Clock    Frequency   Frequency   Period      Period      Slack    Type       Group
    ---------------------------------------------------------------------------------------------------------
    adder|in_clock    100.0 MHz   156.4 MHz   10.000      6.395       3.605    inferred   Inferred_clkgroup_0
    =========================================================================================================
    ...............................

Then I ran P&R in ISE 6.3 SP2 (full) and the report is:

    ...............
    Number of errors:   0
    Number of warnings: 0
    Logic Utilization:
      Number of Slice Flip Flops:          26 out of 6,144    1%
      Number of 4 input LUTs:              32 out of 6,144    1%
    Logic Distribution:
      Number of occupied Slices:           17 out of 3,072    1%
      Number of Slices containing only related logic:  17 out of 17  100%
      Number of Slices containing unrelated logic:      0 out of 17    0%
        *See NOTES below for an explanation of the effects of unrelated logic
    Total Number of 4 input LUTs:          32 out of 6,144    1%
    Number of bonded IOBs:                100 out of 176     56%
      IOB Flip Flops:                       8
    Number of GCLKs:                        1 out of 4       25%
    Number of GCLKIOBs:                     1 out of 4       25%
    .................

So where did things go wrong for you? The only constraints were:

    NET "in_clock" TNM_NET = "in_clock";
    TIMESPEC "TS_in_clock" = PERIOD "in_clock" 10.000 ns HIGH 50.00%;
    #End clock constraints

    # Output Constraints
    OFFSET = OUT : 10.000 : AFTER in_clock ;

    # Input Constraints
    OFFSET = IN : 10.000 : BEFORE in_clock ;

That's why I recommend you use Synplify :)

Good luck!

Article: 86837
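Returning to the Synplify report in des00's post above: the columns are related by simple arithmetic, which makes it easy to sanity-check a report or convert a target frequency into a period constraint. The following is a minimal sketch in Python using only the figures quoted in that post; the variable and function names are made up for illustration.

```python
# Figures copied from the Synplify report quoted in the post.
requested_period_ns = 10.000
estimated_period_ns = 6.395


def period_to_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


# Slack is how much of the requested period is left unused.
slack_ns = requested_period_ns - estimated_period_ns
requested_mhz = period_to_mhz(requested_period_ns)
estimated_mhz = period_to_mhz(estimated_period_ns)

print(f"requested: {requested_mhz:.1f} MHz")
print(f"estimated: {estimated_mhz:.1f} MHz")
print(f"slack:     {slack_ns:.3f} ns")
```

A positive slack means the estimate beats the constraint. These are pre-layout estimates, so the place-and-route timing report remains the number to trust.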
I have added an enable which becomes active after a few hundred nanoseconds, and the timing simulation is OK now :o)

One further question: are the synchronous elements in the real design immediately ready when leaving Global Reset (PLL is already locked)? Or is there some "wait" time, as in the timing simulation?

Thank you for your help.

Rgds
André

Article: 86838
Hi Andre,

> I am using Lattice ispLEVER 5.0 together with Modelsim.

Aha. I'm afraid I don't know anything about that! :-)

> What do you mean with "that you just need to make your testbench wait
> for a few hundred nanoseconds before doing anything". ?
> After deasserting my reset signal in the testbench, ...

Well, how does your testbench know how long to wait before deasserting that reset signal? If you're using VHDL then you might have a "wait for" statement somewhere, which you can modify appropriately. If you're using Verilog... well, I don't know anything about that either, I'm afraid.

Of course, as you're using Lattice's tools, there's a very good chance that nothing I've suggested can help you - sorry!

Cheers,

-Ben-

Article: 86839
Gulp, interesting.

I tested your code with my tools; it is faster with Synplify than with my tools. However, it seems that the biggest trouble is the use of the CLA: the synthesis process gives better results than the CLA that I implemented by hand. I'm not as experienced as you, but is it possible that a standard (read: from a standard university book) implementation of a CLA generates conflicts that prevent the use of specific features of the FPGA? It seems so, but I would like your advice.

Thanks again
Giovanni

Article: 86840
Hi Andre,

Interesting to know that the Lattice simulation models work just like the Xilinx ones in this respect.

> Are the synchronous elements in the real design immediately ready when
> leaving Global Reset (Pll is already locked) ? Or is there some "wait" time
> according to the timing simulation?

They are ready immediately. The "wait" time is really just an artefact of the simulation environment. It may be possible when generating your simulation model to pull the "configuration reset" signal out and turn it into a port on the model, so that you can control exactly when it gets de-asserted (relative to other devices in your simulation, for example). However, as I mentioned, I've not used the Lattice tools so I'm guessing a bit here!

Glad you got it all working.

-Ben-

Article: 86841
Giox wrote:
> Gulp, interesting.
> I tested your code with my tools, it is faster with simplify than with
> my tools. However it seems that the biggest trouble is the use of CLA,
> it seems that the synthesis process allows for better results than the
> CLA that I implemented by hand. I'm not as experienced as you but is it
> possible that a standard (read from standard university book)
> implementation of CLA generate conflicts that disable the use of
> specific feature of the FPGA?

You mean you didn't try the simple + first?

All modern FPGAs have a dedicated carry ripple chain that allows very quick propagation of the carry from one LogicCell to the adjacent one. So by using this, you only need n LogicCells for an n-bit adder, and the carry is handled by dedicated logic.

When you built your CLA, you used only generic logic, so you added supplementary delays. Using architectures other than the simple + for addition is only worthwhile for very big adders.

Sylvain

Article: 86842
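As an aside to Sylvain's reply above: a hand-built CLA and the plain + compute exactly the same function, so the difference on an FPGA is only in which resources carry the logic. The following is a bit-level Python sketch of the structure Gio describes (four 8-bit blocks combined through group generate and propagate). It is a behavioral model for checking the arithmetic, not synthesizable code, and the function names `cla8` and `cla32` are invented for this illustration.

```python
import random

MASK32 = (1 << 32) - 1


def cla8(a, b, cin):
    """One 8-bit lookahead block: returns (sum, group_generate, group_propagate)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(8)]      # bit generates a carry
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(8)]    # bit propagates a carry
    # c[i+1] = g[i] | (p[i] & c[i]); hardware flattens this recurrence
    # into wide AND/OR terms, here it is simply evaluated in order.
    c = [cin]
    for i in range(8):
        c.append(g[i] | (p[i] & c[i]))
    s = sum((p[i] ^ c[i]) << i for i in range(8))
    group_g = 0                      # carry produced by the block on its own
    for i in range(8):
        group_g = g[i] | (p[i] & group_g)
    group_p = int(all(p))            # block passes an incoming carry through
    return s, group_g, group_p


def cla32(a, b, cin=0):
    """32-bit adder from four cla8 blocks; returns (sum, carry_out)."""
    blocks = [((a >> (8 * k)) & 0xFF, (b >> (8 * k)) & 0xFF) for k in range(4)]
    # Group G/P do not depend on the carry-in, so they are computed first.
    gp = [cla8(x, y, 0)[1:] for x, y in blocks]
    carries = [cin]
    for group_g, group_p in gp:
        carries.append(group_g | (group_p & carries[-1]))
    total = 0
    for k, (x, y) in enumerate(blocks):
        total |= cla8(x, y, carries[k])[0] << (8 * k)
    return total, carries[4]


# Compare against the plain "+" on random operands.
rng = random.Random(1)
for _ in range(1000):
    a, b = rng.getrandbits(32), rng.getrandbits(32)
    assert cla32(a, b) == ((a + b) & MASK32, (a + b) >> 32)
```

Since the results are identical, the only question is cost: on these devices the generate/propagate terms land in general-purpose LUTs and routing, while the + operator maps onto the dedicated carry chain.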
Sorry, but I have no experience in this field, so I thought that the simple approach could not be productive and I skipped it.

Thanks for your help.
Giovanni

Article: 86843
Joel Kolstad wrote:
> -- Embedded processor usage. I never used them, but Xilinx and Altera's
> embedded "soft cores" (microblaze and NIOS) both seemed pretty neat, and
> Xilinx was offering ARM hard cores if you really wanted "big iron."

Actually, Altera was the one with the ARM9 core that came out in their Excalibur family. This wasn't really a "big iron" processor, and the family is no longer actively pushed. Xilinx has the PowerPC 405 hard core, which spans 3 families: Virtex-II Pro, Virtex-II Pro X and Virtex-4 FX. With 705 DMIPS this would be considered "big iron" for FPGA offerings.

> -- Debugging support. Xilinx had some "soft probe" thing that would let you
> poke around the internal nets of the FPGA as it was running, and I believe
> Altera had something like this even before Xilinx.
>

There are several levels of debugging support from Xilinx. The "soft probe" function allows you to use FPGA Editor to route any net to an output pin. This has been around for 10+ years. There are also the ChipScope Pro cores and software, which allow you to insert logic analyzer (ILA), processor bus analyzer (IBA), and virtual I/O (VIO) cores into your design for debug and control. And through a joint project with Agilent you can use an external logic analyzer with internal cores (ATC2) to provide easy control and deep trace analysis.

Ed

Article: 86844
Thanks fred ....

So I can go ahead with my own design based on a 7805 and a Zener. Are there any noise or regulation requirements in SELV?

Article: 86845
In article <1120748013.248876.239830@g44g2000cwa.googlegroups.com>, vssumesh <vssumesh_asic@yahoo.com> writes
>Thanks fred ....
>So i can go ahead with my own deisgn basd on 7805and zenar. is there
>any noise or regulation requirements in SELV.

"SELV only describes the user safety aspects of the supply"

--
fred

Article: 86846
katherine wrote:
> All
>
> Does anyone have a feel for how robust aurora is?
>
> I have worked out a packet based protocol to put on top of aurora which
> will re-send packets which are corrupt at the receiver but what is the
> chances of the aurora system itself falling over?
>

That is really a signal integrity issue. If you get errors, then you need to investigate where the signal integrity problems are (of course, after you verify that you are not using the Aurora interface incorrectly). I have run many GBytes through the links without any errors.

Article: 86847
You have obviously been reading some ASIC/VLSI oriented arithmetic texts, which don't really apply to FPGAs except for the general algorithms.

The ripple carry is, I'd guess, disproportionately faster in an FPGA relative to LUT logic (by about 3x) than it is in an ASIC, since it's given for free and highly optimized. Think about it: an adder might be placed in any set of LUTs lined up in a column, so only a ripple can be provided. All the clever logic schemes are highly irregular and must use LUT logic, which is about 3-5x slower than ASIC logic.

I also wanted a 32-bit add for a CPU. The ripple was too slow: IIRC about 170 MHz for a simple a+b, registered on inputs and output, using the fastest speed grade for V2Pro.

For a while I used a CSA (carry-select) array, i.e. 7 8-bit ripple add sections with a follow-on stage to combine the proper select carries. It did cycle faster, maybe near 300 MHz IIRC, but it used up about 3x the area and needed the extra pipeline stage. Another downside was that in order to use CSA, the addition is done twice, with a carry-in of both 1 and 0, for the cells 8-15, 16-23 and 24-31. This doubles the fanout on the registers driving the duplicate adders, limits the speed-up and complicates hand placement. CSA, CLA and Ling schemes are better used for ASICs and full custom. I believe some Altera devices have CSA adder logic built in.

In the end I flipped to an alternate approach: a 2-cycle design that is limited to the 16-bit critical path. This also happens to be very close to the block RAM cycle time, so now in 2 clocks I get a 4-port RAM and a 32-bit add at near 150 MHz (the actual clock is 2x that). It also uses far less hardware and had other simplifying effects elsewhere. Then, by placing a pair of 2-cycle CPUs on opposing clocks, it gets the equivalent performance of one 300 MHz CPU datapath.

This idea can usually be copied for DSP engines pretty well.

johnjakson at usa dot com

Article: 86848
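To make the carry-select arrangement in John's post above concrete, here is a small behavioral model in Python, written as a sketch under the assumptions stated in the post: one 8-bit section for bits 0-7, and for each upper byte two sections computed in parallel with carry-in 0 and 1, with the real carry choosing between them. The names `ripple8` and `carry_select32` are hypothetical, and the model says nothing about timing.

```python
import random


def ripple8(a, b, cin):
    """Stand-in for one 8-bit ripple (carry-chain) section: (sum, carry_out)."""
    total = a + b + cin
    return total & 0xFF, total >> 8


def carry_select32(a, b):
    """32-bit carry-select add; returns (sum, carry_out, sections_used)."""
    result, carry = ripple8(a & 0xFF, b & 0xFF, 0)   # bits 0-7, computed once
    sections_used = 1
    for k in range(1, 4):                            # bits 8-15, 16-23, 24-31
        x = (a >> (8 * k)) & 0xFF
        y = (b >> (8 * k)) & 0xFF
        sum0, cout0 = ripple8(x, y, 0)               # speculative: carry-in 0
        sum1, cout1 = ripple8(x, y, 1)               # speculative: carry-in 1
        sections_used += 2
        # The follow-on stage is just a mux driven by the incoming carry.
        s, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= s << (8 * k)
    return result, carry, sections_used


rng = random.Random(2)
for _ in range(1000):
    a, b = rng.getrandbits(32), rng.getrandbits(32)
    total, cout, used = carry_select32(a, b)
    assert total == (a + b) & 0xFFFFFFFF and cout == (a + b) >> 32
    assert used == 7
```

The section count comes out at 7, matching the figure in the post, and the duplicated upper sections are where the extra area and register fanout come from.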
Hello

I’m looking for a good book about bit-serial programming. I’m interested in learning about it so that I can implement functions bit-serially in VHDL. The only book I have found is “VLSI Signal Processing: A Bit-Serial Approach”, but I don’t think it’s available any more. I would also be glad to hear of any web page that has some info on the subject.

/Sweden

Article: 86849
On Wed, 06 Jul 2005 15:19:01 -0700, amko wrote:

> Hello!
>
> Currently I design FPGA design which will have pc104(ISA) bus. First at
> all I cannot find detailed ISA bus specification on Internet (I think
> for free).
> Also PCI to ISA bus bridge(national: CS5530) which is located on
> processor card (that I use) and generate PC104 timing does not contain
> any useful information in datasheet.
>
> I read on this forum a lot about ISA bus and I am little confused :)...
> if ISA bus signals are synchronized on ISA bus clock or are received and
> transmitted on (rising ?)edge of IOR# and IOW# signals. Also I am not
> sure If I need address latch signal (BALE) or I can latch address bus on
> IOR# and IOW# signals???
>

I would not depend on ISA clock synchronization with the control signals at all. Not only is the timing of the clock relative to the strobes not guaranteed, the signals are often too 'dirty' for use as a clock.

What we do for our PC/104 FPGA cards is feed a high-speed synchronous clock to the FPGA (50 or 100 MHz) and not use any PC/104 signal as a clock. We then use edge detection on the write strobes to determine when to latch inputs. The read path can be asynchronous (just enabling the output buffer with the read strobe).

>
> I see example: http://www.jacyltechnology.com/documents/AP002.pdf and it
> is all synchronized to PC104 clock signal...... Is it only special
> case??
>
> In my design FPGA must act as slave on ISA bus and should support 8-bit
> and 16-bit I/O mode and DMA.
>
> So my question is where I can get detailed ISA bus specification (free
> :)) and how much this specification are depend on devices.. I think
> setup time, hold time...
> How is better to implement PC104 bus design sync or async? (Here I also
> have in mind possible problems with async design if i change FPGA)
>
> Thank you and regards,
>
> AMIR

Peter Wallace
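
The edge-detection scheme Peter describes can be sketched as a cycle-by-cycle Python model. Several details here are assumptions rather than anything stated in the post: the strobe is first passed through two synchronizing flip-flops (a common precaution for asynchronous inputs), the edge of interest is taken to be the trailing (rising) edge of the active-low IOW#, and the name `write_pulses` is invented. Which edge to use, and how the data bus is captured, depends on the bus timing of the actual design.

```python
def write_pulses(iow_n_samples):
    """Model of sampling IOW# with a fast free-running clock.

    iow_n_samples: level of the active-low write strobe at each rising
    edge of the fast clock (1 = inactive, 0 = write in progress).
    Returns one value per clock: 1 for the single cycle in which the
    end of a write is detected, else 0.
    """
    sync1 = sync2 = sync3 = 1          # flops assumed reset to "inactive"
    pulses = []
    for level in iow_n_samples:
        # All three flops clock together, so shift using the old values.
        sync1, sync2, sync3 = level, sync1, sync2
        # Rising edge: synchronized strobe is high now, was low last cycle.
        pulses.append(1 if (sync2 == 1 and sync3 == 0) else 0)
    return pulses


# A write that is low for three fast-clock cycles yields exactly one pulse.
print(write_pulses([1, 1, 0, 0, 0, 1, 1, 1, 1]))
```

The pulse appears two fast clocks after the strobe is first sampled high, which is the latency of the synchronizer. Nothing in this arrangement uses a bus signal as a clock, which is the point of the approach.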