Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Hey,

I just updated to ISE 8.2 SP2, but the issue is still the same. I started debugging and found that, for some reason, it has a problem with the IFF1:\#FF\ statement. Yet the Xilinx tool still generates exactly the same attribute when I use the record-script option and make the change manually. I've mailed my FAE just in case.

Thanks for the feedback,

kind regards,

Tim

John_H wrote:
> I don't have an answer, but a suggestion: break the config into its parts.
> If you set the Config IFF1 in one line and Config INIT_Q4 in another - just
> one per line - you might find out what "really" fails.
>
> If I had to guess I'd think the line length in your command is a problem.
> Breaking it up eliminates this issue and can let you debug the problem to
> its core.
>
>
> "yttrium" <yttrium@telenet.be> wrote in message
> news:%RHGg.127097$YS7.119815@blueberry.telenet-ops.be...
> > Hey, when i run a script through FPGA editor i get the following error:
> >
> > setattr comp
> > c_DDR2_framebuffer_0/c_Ddr2Interface/i_DDR/i_DdrInputOutput_DataEvenFromSdr_d(2)
> > Config CE1INV:CE1\ CLKDIVINV:CLKDIV\ CLKINV:CLK\ IDELAYMUX:1\ IFF1:\#FF\
> > IFFDELMUX:0\ IFFMUX:1\ INIT_Q1:0\ INIT_Q2:0\ INIT_Q3:0\ INIT_Q4:0\
> > IOBDELAY_TYPE:VARIABLE\ IOBDELAY_VALUE:0\ Q1MUX:IFF3\ Q2MUX:IFF4
> > ERROR:FPGAEditor:25 - "CE1INV:CE1 CLKDIVINV:CLKDIV CLKINV:CLK IDELAYMUX:1
> > IFF1:\#FF IFFDELMUX:0 IFFMUX:1 INIT_Q1:0 INIT_Q2:0 INIT_Q3:0 INIT_Q4:0
> > IOBDELAY_TYPE:VARIABLE IOBDELAY_VALUE:0 Q1MUX:IFF3 Q2MUX:IFF4" is not a
> > valid value for the Config attribute. BRBCF ILOGIC Failure: INVALID_RB:
> > "IFF1::\#FF" CFG: "CE1INV:CE1 CLKDIVINV:CLKDIV CLKINV:CLK IDELAYMUX:1
> > IFF1:\#FF IFFDELMUX:0 IFFMUX:1 INIT_Q1:0 INIT_Q2:0 INIT_Q3:0 INIT_Q4:0
> > IOBDELAY_TYPE:VARIABLE IOBDELAY_VALUE:0 Q1MUX:IFF3 Q2MUX:IFF4"
> >
> > even when i make this change manually and then save and record this a
> > script and then run it, it gives me that error. So when FPGA editor
> > generates the following statement:
> >
> > setattr comp
> > c_DDR2_framebuffer_0/c_Ddr2Interface/i_DDR/i_DdrInputOutput_DataEvenFromSdr_d(2)
> > Config CE1INV:CE1\ CLKDIVINV:CLKDIV\ CLKINV:CLK\ IDELAYMUX:1\ IFF1:\#FF\
> > IFFDELMUX:0\ IFFMUX:1\ INIT_Q1:0\ INIT_Q2:0\ INIT_Q3:0\ INIT_Q4:0\
> > IOBDELAY_TYPE:VARIABLE\ IOBDELAY_VALUE:0\ Q1MUX:IFF3\ Q2MUX:IFF4
> >
> > and that one is exactly the same as the one in my script then what am i
> > doing wrong??? to get me that answer?
> >
> > Thanks in advance for your help,
> >
> > kind regards,
> >
> > Tim

Article: 106976
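John_H's suggestion in the FPGA Editor thread above (one Config setting per setattr line) can be scripted rather than done by hand. This is a minimal sketch in Python, assuming the recorded command has the form shown in the post, with settings separated by a backslash-escaped space. The helper name split_setattr and the component name my_comp are hypothetical, and whether FPGA Editor accepts a Config attribute set one setting at a time is exactly what the experiment would test; it is not confirmed here.

```python
def split_setattr(command: str) -> list[str]:
    """Turn one long 'setattr comp <name> Config a\\ b\\ c' command into
    one command per Config setting (hypothetical debugging helper)."""
    head, sep, config = command.partition(" Config ")
    if not sep:
        raise ValueError("no ' Config ' keyword found in command")
    # In the recorded script, settings are separated by backslash + space.
    settings = [s.strip() for s in config.split("\\ ") if s.strip()]
    return [f"{head} Config {s}" for s in settings]


if __name__ == "__main__":
    # Shortened version of the command from the post; 'my_comp' is a placeholder.
    cmd = r"setattr comp my_comp Config CE1INV:CE1\ IFF1:\#FF\ INIT_Q4:0"
    for line in split_setattr(cmd):
        print(line)
```

Running the generated lines one at a time should show which single setting, if any, triggers ERROR:FPGAEditor:25, and it also rules out line length as the cause.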
Martin Schoeberl wrote:
> From: "Tommy Thorn" <tommy.thorn@gmail.com>
> Newsgroups:
>
> > Martin Schoeberl wrote:
> >> JOP at 100MHz on the Altera DE2 using the 16-bit SRAM:
> >>
> >> Avalon: 11,322
> >> SimpCon: 14,760
> >>
> >> So for the SRAM interface SimpCon is a clear winner ;-)
> >> The 16-bit SRAM SimpCon solution is even faster than
> >> the 32-bit SRAM Avalon solution.
> >
> > I'm not sure what your point is. It's hardly surprising that a JOP
> > works better with the interface it was codesigned with, rather than
> > some other one crafted on top. It says nothing is the relative merits
> > of Avalon and SimpCon. I could code up a counter example quite easily.
>
> You're right from your point of view. I have only JOP to compare
> SimpCon and Avalon. JOP takes advantage of the early acknowledge of
> SimpCon. However, it's still simpler with SimpCon to implement a
> SRAM interface with the input and output registers at the IO
> cells in the FPGA without adding one cycle latency.
>

We talked a bit about this early on in the thread, but I think that is simply because of how you chose to implement the SRAM controller: using only SOPC Builder to create the Avalon slave device and the external signals to the SRAM.

> A small defense of the JOP/SimpCon version: SimpCon was added very
> late to JOP. Up to this time JOP used it's own proprietary memory
> interface that was not shared with the IO subsystem. The IO devices
> also used a proprietary interface. Than I changed JOP to use
> Wishbone for memory and IO, but had to add a non Wishbone compliant
> early ack signal to get the performance I wanted. This resulted in
> the definition of SimpCon and another change in JOPs memory/IO
> system.
>
> It would be interesting to take another CPU (not NIOS or JOP) and
> implement an Avalon and a SimpCon SRAM interface and compare the
> performance. However, who has time to do this...
>

Never the time ;) I did take a quick look, though, at jop\sopc\components\jop_avalon\class.ptf (which I think is the SOPC component file for your processor) and jop\quartus\sopcmin\jop_system.ptf (which looks to be the SOPC Builder 'system.ptf' file for the entire system) from what is on OpenCores.org.

Upon quick perusal (like you said, never the time) I didn't see any use of the Avalon bus 'readdatavalid' signal (in addition to the obligatory 'waitrequest'). This implies that the master devices are being controlled strictly via the Avalon 'waitrequest' signal.

The lack of use of Avalon's 'readdatavalid' could, I think, explain some or all of the performance differences that you're seeing between Avalon and SimpCon. To make use of it, both the master and the slave side need to implement the 'readdatavalid' signal. If either side does not have 'readdatavalid', then SOPC Builder builds the appropriate interconnect logic, but it basically ends up implementing the same thing as if only 'waitrequest' were used, as in your design.

My guess is that use of 'waitrequest' and 'readdatavalid' should end up being equivalent to the SimpCon implementation.

> >
> > Altera has an App note on "Using Nios II Tightly Coupled Memory
> > Tutorial"
> > (http://altera.com/literature/tt/tt_nios2_tightly_coupled_memory_tutorial.pdf),
> > but as far as I understand you, this is already how you use the memory.
>
> Very interesting, thanks for the link. No, this is not the way I
> used the on-chip memory with JOP - this looks NIOS specific. And it
> is stated there:
>
> 'The term tightly coupled memory interface refers to an
> *Avalon-like* interface...'
>
> That's interesting as it is an indication that there are issues for
> low latency connections with Avalon ;-)
>

I don't think it's so much an Avalon 'issue' as it is a statement that there are things you need to do properly to get high performance and low latency, things that go above and beyond the lower performance (but perhaps 'quicker' to implement) ways.

Wishbone goes beyond its 'normal' transaction to sort of define some tag signals that one can use. From an earlier post it sounded like you extended SimpCon in some fashion to handle this case also. Avalon has 'waitrequest' and 'readdatavalid' built right into its basic protocol for maximum performance, but it also allows components not to use them if they so desire (with SOPC Builder building the appropriate connection logic for you), in exchange for not getting the highest performance.

KJ

Article: 106977
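The difference KJ describes between a master throttled only by 'waitrequest' and one that also uses 'readdatavalid' can be illustrated with a toy cycle count. This is a rough sketch in Python, not a model of the Avalon fabric or of SOPC Builder's generated logic. It assumes one cycle to issue an address and a fixed slave latency of `latency` cycles; both function names and the latency value are illustrative assumptions.

```python
def blocking_read_cycles(n_reads: int, latency: int) -> int:
    """Master waits for each read to complete before issuing the next
    (the 'waitrequest'-only style described in the post)."""
    return n_reads * (1 + latency)


def pipelined_read_cycles(n_reads: int, latency: int) -> int:
    """Master issues one read per cycle; each datum returns 'latency'
    cycles after its address (the 'readdatavalid' style)."""
    if n_reads == 0:
        return 0
    return n_reads + latency


if __name__ == "__main__":
    # Assumed slave latency of 2 cycles, chosen only for illustration.
    for n in (1, 4, 16):
        print(n, blocking_read_cycles(n, 2), pipelined_read_cycles(n, 2))
```

In this model a single isolated read costs the same either way; the gain from pipelining only appears when the master has several reads to issue back to back.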
John McGrath wrote:
> One thing that looks a little suspect -
> the lines:
>
> TMPOUT = DIN >> SHIFTp;
> DOUT = TMPOUT[15:0];
>
> are inside the always statement - and as shiftP is on the right hand
> side, it should be in the sensitivity list.

But SHIFTp and TMPOUT are entirely generated inside the always statement, so I wouldn't think it is necessary. (Is my C programmer heritage showing yet?)

> I think moving these two lines outside, and making them assign
> statements will get rid of your synthesis warning - as to what logic is
> actually being produced - hard to say! but I'd imagne that this
> construct would make behavioural different from post synthesis in
> simulation.

Well, the problem is that the module seems to work :)
If I do a post-PAR simulation it gives me the result I expect, and the RTL schematic for that section is exactly what I'd expect, i.e.

SHIFT[3:0] ---> ROM ---> Right Shift ---> DOUT[15:0]
                              ^
                              |
DIN[31:0] --------------------+

Unfortunately they are the only synthesis errors I see (that aren't generated by the coregen code), so I guess I'll have to do more looking into timing issues :)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C

Article: 106978
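The datapath Daniel describes in the post above (SHIFT[3:0] through a ROM, then a right shift of DIN[31:0] truncated to DOUT[15:0]) is small enough to check against a software reference. This is a sketch in Python of such a reference model. The ROM contents are not given in the thread, so an identity mapping is assumed here, and the names SHIFT_ROM and dout are illustrative.

```python
# Hypothetical ROM: maps the 4-bit SHIFT input to a shift amount.
# The real ROM contents are not given in the post; identity is assumed.
SHIFT_ROM = list(range(16))


def dout(din: int, shift: int) -> int:
    """Reference for: TMPOUT = DIN >> SHIFTp; DOUT = TMPOUT[15:0];"""
    shift_p = SHIFT_ROM[shift & 0xF]          # SHIFT[3:0] -> ROM -> SHIFTp
    tmpout = (din & 0xFFFFFFFF) >> shift_p    # DIN[31:0] shifted right
    return tmpout & 0xFFFF                    # keep the low 16 bits


if __name__ == "__main__":
    for s in (0, 4, 8):
        print(s, hex(dout(0x12345678, s)))
```

Feeding the same DIN and SHIFT values to the behavioural and post-PAR simulations and comparing against this model is one way to confirm that both agree on the function, which would point the search back towards timing.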
Hi all,

I'm wondering whether anyone knows of a cheap Virtex-4 FX based card with at least two 10/100/1000 Ethernet PHYs on board.

Regards,
Sandro

P.S. By cheap I mean < 500-600 USD

Article: 106979
I don't know about PCI Express, but in normal PCI, reads are INEFFICIENT. It easily takes 10 retries (motherboard dependent) before the first dword gets into the device.

Article: 106980
Sandro wrote:
> Hi all,
>
> i'm wondering if someone knows if it
> does exist a cheap virtex4fx based card with
> at least 2 10/100/1000 ethernet phy onboard
>
> regards
> Sandro
>
> P.S. cheap I mean < 500-600 USD

NO, it doesn't.

Antti

Article: 106981
What is the best source for learning about the Xilinx floorplanner?

I see stuff in the constraints documentation, but the docs are a bit sketchy on the overall strategy of floorplanning.

What's the difference between the post-place and the post-map floorplanner, as shown in the process pane of ISE 7.1?

How do I add registers to allow a bus to traverse the chip without the synthesis tool packing the registers into an SRL16?

Brad Smallridge
aivision
dot com

Article: 106982
Rob,

Here goes,

See below:

Austin

-snip-
> Serious question: Does Altera's PLL's offer an advantage (veratility,
> jitter, etc) over Xilinx's DCM's?

Very good question. PLLs will filter out high frequency jitter above a certain frequency, but are also susceptible to creating jitter, as they share the same substrate with the FPGA logic and IOs. One study we did showed the PLL with much less jitter when nothing else was happening, but it had twice the jitter of the DCM when a moderate amount of logic and IOs were switching. There is also the concern for a PLL that it must have clean (filtered) power and ground connections (so pcb layout can be critical to success). That said, sometimes you need a PLL, and nothing else will suffice.

> I'm to understand that the DCM is not a
> PLL, correct? What is the working principal behind the DCM (any literature
> links?)

The DCM uses six tapped delay lines, with a DSP engine that creates all the phase shifts, frequency multiples, etc. It is 100% digital, so it is process/voltage/temperature independent. The phase resolution is ~20 ps (V4), so fine phase shifts, or even dynamic phase shifting, can be done.

>
> This question arises from an upcoming design where we have three serial LVDS
> interfaces that need to go into a V2PRO part. I implemented this interface
> within an Altera Stratix part without a problem; but I'm told by the group
> responsible for the V2PRO design that their FPGA doesn't have the resources
> to handle the aforementioned interface, which will necessitate putting
> deserializers on the board at an added cost.

What resources does it not have? This surprises me, as Stratix (generally speaking) is competitive with Virtex II. Of course, they do not have a hardened microprocessor in Stratix, which means that if you need one, VII Pro is your only choice (at that technology node). Other than that, I didn't think there were any really significant differences between the two (not being in marketing, I can say this).

>
> Here's some information on the LVDS interface:
> Each interface has its own clock and 3 serialized lanes, for a total of 3
> clocks (45MHz) and 9 x 7 deep (315MHz fast clock) serialized lanes.

As long as you stick with the DLL features, and do not have to use the DFS outputs (CLKFX, CLKFX_180), you will have less jitter (synthesis of a new frequency creates the most jitter). Use of the DLL and its fine phase shifting capabilities should be able to do what you need to have done.

I am confused, however: the 45 MHz ports should be trivial, and won't even require a DCM. It is the 315 MHz ports that will make use of the DCM. V2Pro does not have the ability to phase shift individual IOBs to allow for pcb trace lengths, and V4 does have this capability, so it is too bad that you will be using V2Pro, as V4 could do it cleaner, with more features (i.e. the pcb traces for the 315 MHz ports could be completely different in length, and each bit can be phase shifted so that the sample points all line up perfectly).

Or is the 315 = 7 X 45? In which case, you probably need the DFS CLKFX (to multiply the 45 MHz clock by 7, and keep it aligned), and you will have to carefully watch the resulting jitter, and be sure you still have margin for recovery of the data.

If this is the case, then the V4 is a much better solution, as it has built-in IO serdes functions on every pin.

Article: 106983
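The numbers in the LVDS discussion above can be put side by side with a little arithmetic. This is a sketch in Python that assumes the 315 MHz figure really is 7 x 45 MHz, as Austin suspects. It uses the ~20 ps phase resolution quoted for the V4 DCM; the step size for a V2Pro DCM is not given in the thread, so the last figure should not be read as applying to that part. The function names are illustrative.

```python
def bit_period_ps(bit_rate_hz: float) -> float:
    """Length of one serial bit in picoseconds."""
    return 1e12 / bit_rate_hz


def phase_steps_per_bit(bit_rate_hz: float, step_ps: float) -> int:
    """Whole phase-shift steps of size step_ps that fit in one bit period."""
    return int(bit_period_ps(bit_rate_hz) // step_ps)


if __name__ == "__main__":
    pixel_clock_hz = 45e6                   # per-interface clock from the post
    bit_rate_hz = pixel_clock_hz * 7        # 7 bits per clock -> 315 Mb/s per lane
    lanes = 9                               # 3 interfaces x 3 lanes
    print("bit rate per lane (Mb/s):", bit_rate_hz / 1e6)
    print("aggregate rate (Mb/s):", lanes * bit_rate_hz / 1e6)
    print("bit period (ps):", round(bit_period_ps(bit_rate_hz), 1))
    print("20 ps steps per bit:", phase_steps_per_bit(bit_rate_hz, 20.0))
```

A bit period of roughly 3.17 ns is the whole budget that clock jitter, lane-to-lane skew and setup/hold have to share, which is why the jitter of a synthesized x7 clock matters here.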
Antti wrote:
> ...
> NO. it doesnt.
> ...

Antti,

Negative answer :-( ...but very fast ;-)

Thanks,
Sandro

Article: 106984
Austin Lesea wrote:
> Rob,
>
> Here goes,
>
> See below:
>
> Austin
>
> -snip-
> > Serious question: Does Altera's PLL's offer an advantage (veratility,
[snip]
> I am confused, however, the 45 MHz ports should be trivial, and won't
> even require a DCM. It is the 315 MHz ports that will make use of the

Austin,

The OP's design looks very much like CameraLink, so the incoming clock would be multiplied by 7 to get the bit clock.

It is doable with Virtex and a DCM, but what I would suggest (if it is CameraLink) is still to use the dedicated deserializer.

If you look at Virtex boards with CameraLink support, then most of them (but not all) have dedicated serializers.

Of course, if it is not CameraLink (or those deserializers cannot be used), then it has to be checked whether it makes sense to use only Virtex LVDS + DCM or to have external circuitry too.

Antti
http://xilant.com

Article: 106985
Brad Smallridge wrote:
> What is the best source for learning about the Xilinx floorplanner?

Depends on what you want to learn. The mechanics of the floorplanner tool are described in the user's guide; however, it is fairly intuitive, so you may find it easier to just start using it (although you might not stumble across all the features that way).

>
> I see stuff in the constraints documentation but the docs are a bit
> sketchy on the overall strategy of floorplanning.

The strategy is more art than science. It is sort of like putting together a puzzle which has many possible solutions. You should start with a block diagram of your design, grouping the pieces together on paper to minimize the lengths of critical interconnect. Then you use that as a guide to placing the pieces. It helps tremendously to do the floorplanning hierarchically rather than attempting it on a flat design, as it is far easier to optimize small pieces and then place them in the larger design than it is to optimize the whole thing at once. Unfortunately, the floorplanner is not all that hierarchical. You can use the hierarchy browser to work on the design more or less hierarchically.

Basically you want critical connections to be short, preferably in the same row/column for source and destination. Carry chains are typically the long pole in the tent, so you want flip-flops outputting to carry chain logic located in close proximity to the carry chain. Best bet is to play with it with some small designs to get a feel for how it works and to start learning some placement strategies.

>
> What's the difference between the post place and the post map floorplanner
> as shown on the process pane of the ISE 7.1 ?

The post-map floorplanner only shows the stuff you manually placed. The rest of the design is not yet placed, so that information is not shown. The post-PAR floorplanner brings up an additional pane that shows the actual placement of all of the elements in the design. You can use the post-PAR floorplanner to tweak an automatic placement in order to improve the timing.

>
> How do I add registers to allow a bus to transverse across the chip and
> not have the synth tool pack the registers into an SRL16?

It has to be done in your RTL, of course. The easiest way to prevent SRL16 inference is to put a reset on the flip-flops. You can also do it by putting a syn_keep on the signal between each flip-flop, or a syn_preserve on the flip-flops, or with a synthesis directive. I've had varying success with the synthesis directive, often finding it doesn't work right. I've also seen syn_keep and syn_preserve behave differently from one synthesis tool version to the next in certain cases (the most notable is inputs to carry chains: Synplicity currently infers an additional LUT if you put a syn_keep on an instantiated carry chain bit input).

Hope that helps.

>
> Brad Smallridge
> aivision
> dot com
>
>

Article: 106986
Hi Daniel,

You are right - they are totally contained within that always block, so you are most likely getting away with it. But if you had used non-blocking <= statements, I wonder if the result may have been different ;) (at least for behavioural sim). There are some interesting subtleties in Verilog.

I often use the Verilog 2001 construct always@(*) .... to ensure that the sensitivity list is always complete, which should avoid that problem.

As for the issue, it seems from your description that the above is indeed not the cause, and that it is most likely timing related. Good luck with it!

Daniel O'Connor wrote:
> John McGrath wrote:
> > One thing that looks a little suspect -
> > the lines:
> >
> > TMPOUT = DIN >> SHIFTp;
> > DOUT = TMPOUT[15:0];
> >
> > are inside the always statement - and as shiftP is on the right hand
> > side, it should be in the sensitivity list.
>
> But SHIFTp and TMPOUT are entirely generated inside the always statement so
> I wouldn't think it is necessary.. (Is my C programmer heritage showing
> yet?)
>
> > I think moving these two lines outside, and making them assign
> > statements will get rid of your synthesis warning - as to what logic is
> > actually being produced - hard to say! but I'd imagne that this
> > construct would make behavioural different from post synthesis in
> > simulation.
>
> Well the problem is that module seems to work :)
> If I do a post-PAR simulation it gives me the result I expect, and the RTL
> schematic for that section is exactly what I'd expect, ie
>
> SHIFT[3:0] ---> ROM ---> Right Shift ---> DOUT[15:0]
>                               ^
>                               |
> DIN[31:0] --------------------+
>
> Unfortunately they are the only synthesis errors I see (that aren't
> generated by the coregen code), so I guess I'll have to do more looking
> into timing issues :)
>
> --
> Daniel O'Connor software and network engineer
> for Genesis Software - http://www.gsoft.com.au
> "The nice thing about standards is that there
> are so many of them to choose from."
> -- Andrew Tanenbaum
> GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C

Article: 106987
Antti,

Thank you. Yes, it does look like this is a X7 deserializer. In which case the cheapest, fastest and easiest may be to just buy the ASSP that was designed to do that job.

I still think that V4 could also do this without any need for the ASSP, as the SSIO features include X7 sampling, without having to multiply the clock.

http://direct.xilinx.com/bvdocs/userguides/ug070.pdf
page 355

Austin

Antti wrote:
> Austin Lesea schrieb:
>
>> Rob,
>>
>> Here goes,
>>
>> See below:
>>
>> Austin
>>
>> -snip-
>>> Serious question: Does Altera's PLL's offer an advantage (veratility,
> [snip]
>> I am confused, however, the 45 MHz ports should be trivial, and won't
>> even require a DCM. It is the 315 MHz ports that will make use of the
>
> Austin,
>
> the OP desing looks very much like CameraLink.
>
> so the incoming clock would be multiplied by x7 to get bit clock.
>
> it is doable with Virtex and DCM, but for what I would suggest
> (if it is CameraLink) is still to use the dedicated deserializer.
>
> if you look at Virtex boards with Cameralink support than most
> of them (but not all) have dedicated serializers.
>
> of course if it not Cameralink (or those deserializers can not
> be used) then it has to be checked if it makes sense to
> use only virtex LVDS + DCM or have external circutry too.
>
> Antti
> http://xilant.com
>

Article: 106988
Oops,

You still need to have the X7 clock; then the ISERDES does all the work.

Austin

Austin Lesea wrote:
> Antti,
>
> Thank you. Yes, it does look like this is a X7 deserializer. In which
> case the cheapest, fastest and easiest may be to just buy the ASSP that
> was designed to do that job.
>
> I still think that V4 could also do this without any need for the ASSP
> as the SSIO features include X7 sampling, without having to mutliply the
> clock.
>
> http://direct.xilinx.com/bvdocs/userguides/ug070.pdf
> page 355
>
> Austin
>
> Antti wrote:
>> Austin Lesea schrieb:
>>
>>> Rob,
>>>
>>> Here goes,
>>>
>>> See below:
>>>
>>> Austin
>>>
>>> -snip-
>>>> Serious question: Does Altera's PLL's offer an advantage (veratility,
>> [snip]
>>> I am confused, however, the 45 MHz ports should be trivial, and won't
>>> even require a DCM. It is the 315 MHz ports that will make use of the
>> Austin,
>>
>> the OP desing looks very much like CameraLink.
>>
>> so the incoming clock would be multiplied by x7 to get bit clock.
>>
>> it is doable with Virtex and DCM, but for what I would suggest
>> (if it is CameraLink) is still to use the dedicated deserializer.
>>
>> if you look at Virtex boards with Cameralink support than most
>> of them (but not all) have dedicated serializers.
>>
>> of course if it not Cameralink (or those deserializers can not
>> be used) then it has to be checked if it makes sense to
>> use only virtex LVDS + DCM or have external circutry too.
>>
>> Antti
>> http://xilant.com
>>

Article: 106989
xilprg supports multiple Xilinx devices, and supports both the Digilent USB and Xilinx Parallel III cables.

Zoltan

Daniel O'Connor wrote:
> fpgakid@gmail.com wrote:
> > I've released the first version of my Xilinx JTAG programmer for
> > Win32/Linux.
> >
> > Supports Parallel III Cable and Digilent USB. Check and of message for
> > supported devices!
> >
> > http://sourceforge.net/projects/xilprg
>
> Why not just port xc3sprog to Windows?
>
> --
> Daniel O'Connor software and network engineer
> for Genesis Software - http://www.gsoft.com.au
> "The nice thing about standards is that there
> are so many of them to choose from."
> -- Andrew Tanenbaum
> GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C

Article: 106990
Hi,

I have a design that I have verified using a post-route simulation, and I then ran PAR on the design with no timing errors. Do I need to do a post-PAR timing simulation, or can I be confident that the design will work OK in the FPGA?

Thanks,

Jon

Article: 106991
Pico Computing has a single gigabit PHY on a $1400 (I think) V4FX board.

Sandro wrote:
> Hi all,
>
> i'm wondering if someone knows if it
> does exist a cheap virtex4fx based card with
> at least 2 10/100/1000 ethernet phy onboard
>
> regards
> Sandro
>
> P.S. cheap I mean < 500-600 USD

Article: 106992
Brannon wrote:
> Pico Computing has a single gigabit phy on a $1400 (I think) v4fx
> board.
>
> Sandro wrote:
> > Hi all,
> >
> > i'm wondering if someone knows if it
> > does exist a cheap virtex4fx based card with
> > at least 2 10/100/1000 ethernet phy onboard
> >
> > regards
> > Sandro
> >
> > P.S. cheap I mean < 500-600 USD

Good gosh - the OP wanted 2 gigabit PHYs and a price below 600!!

The closest could be the Avnet LX25 board with 2 AvBus comm modules. I got the V4LX25 board for 249 USD, but I don't know how much the AvBus comm module is. I bet the package deal would be just above 600 USD though.

That's why I said NO, i.e. not available below 600 USD.

Antti

Article: 106993
maxascent wrote:
> Hi
>
> I have a design that I have verified using a post route simulation and
> then PAR the design with no timing errors. Do I need to do a post PAR
> timing simulation or can I be confident that the design will work ok in
> the fpga?
>
> Thanks
>
> Jon

Assuming all the important paths are constrained and you've handled any clock domain crossings properly, the static timing analysis is sufficient, and actually more comprehensive than doing a timing simulation. If you meet these conditions, you can be confident that it will behave the same as your functional simulation. I can count on one hand the number of times I've done a timing simulation, and I've done literally hundreds of high performance FPGA designs.

Article: 106994
>> What do you mean with 'very close to the hardware'? I try to
>> avoid vendor specific library elements as much as possible and
>> stay with plain VHDL. If you mean that the VHDL coding style
>> is more hardware oriented, than I agree.
>
> Yes, this was what I mean, e.g. figures 5.6 to 5.9 of your thesis, where
> you describe the processor pipeline with gates and which is implemented
> like this in VHDL. But maybe this is the normal case and I'm just to new to
> VHDL to write and interconnect components in this way.
>
> http://www.jopdesign.com/thesis/thesis.pdf

Nice that you read it ;-)

>
>> I started directly
>> in an FPGA implementation and did almost no simulation.
>
> Why not? When I was implementing my CRC32 check for my network core, I've
> tested the algorithm with a VHDL testbench (ethernet packet send and
> receive works at 10 Mbit and 100 Mbit on my Spartan 3E starter kit now).
> The turnaround times are faster with simulation and it is very easy to
> debug it, instead of debugging a synthesized core in hardware. The same was
> true for my DS2432 ROM id reader, where I've written the testbench, first
> and then implemented the reader.
> http://www.frank-buss.de/vhdl/spartan3e.html

OK, the main reason for not using simulation was just that I had no ModelSim, and the Quartus simulator was a pain (actually I started with MaxPlus II). However, I wrote my own kind of debugging device using the printer port on the PC: I clocked the design with the printer port and read back the interesting signals with a small state machine. Kind of crazy ;-)

Now, a lot has changed. E.g. ModelSim for Xilinx is free, so there is now a testbench for JOP available that you can use with ModelSim XE. For all FPGA specific parts (on-chip memories) I wrote plain VHDL models. So you can now debug with ModelSim XE and compile for Altera....

And I agree, simulation can save you a lot of time (and sometimes waste a lot of time - I still like to look at the code until I find the issue).

Martin

Article: 106995
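For the kind of testbench-driven check mentioned in the quoted CRC32 discussion above, a software reference is a common way to produce expected values. This is a sketch in Python of a bit-serial CRC-32 using the reflected IEEE 802.3 polynomial, cross-checked against the standard library's zlib.crc32. It is a generic golden model, not the VHDL from the thread, which is not shown there; how the resulting FCS is ordered on the wire is not addressed here.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """Bit-serial CRC-32 (reflected polynomial 0xEDB88320), processing one
    bit per inner-loop step, similar in spirit to a serial hardware CRC."""
    crc = 0xFFFFFFFF                      # initial value: all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF               # final inversion


if __name__ == "__main__":
    msg = b"123456789"
    print(hex(crc32_bitwise(msg)), hex(zlib.crc32(msg)))
```

Values generated this way can be written to a file and read by a testbench as expected results, so a mismatch shows up in simulation rather than on the board.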
Hello Joel,

The Cyclone II unit would be nice for me, I assume. Is it programmable with the Web version of Quartus? (This is a private project at the moment.)

Article: 106996
The idea with the delay should do for the moment. Thanks!

kayrock66@yahoo.com wrote:
> A couple of questions that will help answer your query:
> Are you using hard or soft multipliers? What are you doing that you
> need 40 bit factors?

a) In this particular case it is for audio at the moment; the data comes in with 48 bits.

b) Currently the embedded MULs are used by Quartus. Apparently, when very high frequencies are demanded, Quartus takes even more MULs than required, possibly as a result of duplicating hardware to meet the timing requirements. When I set softer timing constraints, however, fewer MULs are used. (?)

Anyway: I just need that to verify the logical correctness of the delay actions and the timing when rejoining the data paths.

Article: 106997
> We talked a bit about this early on in the thread, but I think that is
> simply because of how you chose to implement the SRAM controller
> simply by only using SOPC Builder to create that Avalon slave device
> and external signals to the SRAM.

Nope, I did both versions: one with a VHDL design and one with just SOPC Builder (non-VHDL).

>> It would be interesting to take another CPU (not NIOS or JOP) and
>> implement an Avalon and a SimpCon SRAM interface and compare the
>> performance. However, who has time to do this...
>>
> Never the time ;) I did take a quick look though at the
> jop\sopc\components\jop_avalon\class.ptf (which I think is the SOPC
> component file for your processor) and
> jop\quartus\sopcmin\jop_system.ptf (which looks to be the SOPC Builder
> 'system.ptf' file for the entire system) from what is on OpenCores.org.

Ok, cool ;-)

> Upon quick perusal (like you said, never the time) I didn't see any
> use of the Avalon bus 'readdatavalid' signal (in addition to the
> obligatory 'waitrequest'). This implies that the master devices are
> being controlled strictly via the Avalon 'waitrequest' signal.

Yes, that's true. In the general case the master (JOP) does not issue more than one read request (the exception is the cache load), and in this case 'readdatavalid' is of no additional use. 'readdatavalid' only helps with pipelined requests when:

a.) the master can issue a new request even when the former is outstanding
b.) the slave implements a FIFO to queue those requests
c.) AFAIK when the request goes to the same slave (???)

In a single plain read you don't get a single cycle less latency with 'readdatavalid'. The issue is that JOP relies on the early ack to reduce the latency, and this is also the case for the cache load. 'readdatavalid' does not save a single cycle in my case, sorry. I would need to redesign the cache load to issue more read commands. That's the only point where I could adapt JOP to the Avalon (Wishbone,...) way.

> The lack of use of Avalon's 'readdatavalid' I think could explain some
> or all of the performance differences that you're seeing between Avalon
> and SimpCon. To make use of it, both the master and slave side need to
> have the 'readdatavalid' signals implemented. If either side does not

and of course some FIFO pipelining in the SRAM slave, which adds latency for the 'normal' case.

> have 'readdatavalid', then SOPC Builder builds the appropriate
> interconnect logic but basically will end up implementing the same
> thing as if only 'waitrequest' was used as you have in your design.

Agree.

> My guess is that use of 'waitrequest' and 'readdatavalid' should end up
> being equivalent to the SimpCon implementation.
>
>> >
>> > Altera has an App note on "Using Nios II Tightly Coupled Memory
>> > Tutorial"
>> > (http://altera.com/literature/tt/tt_nios2_tightly_coupled_memory_tutorial.pdf),
>> > but as far as I understand you, this is already how you use the memory.
>>
>> Very interesting, thanks for the link. No, this is not the way I
>> used the on-chip memory with JOP - this looks NIOS specific. And it
>> is stated there:
>>
>> 'The term tightly coupled memory interface refers to an
>> *Avalon-like* interface...'
>>
>> That's interesting as it is an indication that there are issues for
>> low latency connections with Avalon ;-)
>>
> I don't think it's so much an Avalon 'issue' as it is a statement that
> there are things you need to do properly to get high performance and
> low latency that go above and beyond the lower performance (but perhaps
> 'quicker' to implement) ways.

If it were a 'real' Avalon interface, just without the arbitration part, then they should offer this additional Avalon communication link for others to attach the main memory to the NIOS.

> Wishbone goes beyond their 'normal' transaction to sort of define some
> tag signals that one can use. From an earlier post it sounded like you

Agree, pipelined transactions should be built into the specification and not be an add-on.

> extended SimpCon in some fashion to handle this case also. Avalon has

My pipeline approach is just this funny little busy counter instead of a single ack, plus the requirement that a slave declare its pipeline level (0 to 3). Level 1 is almost always possible; it's more or less free in a slave. Level 1 means that the master can issue the next read/write command in the same cycle in which the data is available (rdy_cnt=0). Level 2 means issuing the next command one cycle earlier (rdy_cnt=1). That is still not a big issue for a slave (especially for a memory slave, where you need a little state machine anyway).

Figure 4 in http://www.opencores.org/cvsweb.cgi/~checkout~/simpcon/doc/simpcon.pdf should explain it. However, it looks like I have to draw it more clearly...

> 'waitrequest' and 'readdatavalid' built right into their basic protocol
> for maximum performance but also allows components to not use them if
> they so desire (with SOPC Builder building the appropriate connection
> logic for you) in exchange for not getting the highest performance.

Enjoy this discussion :-)

Martin

Article: 106998
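Martin's description of the SimpCon busy counter and pipeline levels in the post above can be restated as a small rule. This is a sketch in Python that covers only what the post says: level 1 allows the next command when rdy_cnt is 0, and level 2 one cycle earlier. Level 3 is extrapolated the same way, level 0 is not described in the post and is rejected, and the simple per-cycle countdown of rdy_cnt is an assumption of this model rather than something taken from the SimpCon specification. Both function names are illustrative.

```python
def may_issue(rdy_cnt: int, pipeline_level: int) -> bool:
    """True if the master may issue the next command in this cycle.

    Level 1: rdy_cnt == 0; level 2: rdy_cnt <= 1 (from the post).
    Level 3 is extrapolated; level 0 is not modelled.
    """
    if pipeline_level not in (1, 2, 3):
        raise ValueError("only pipeline levels 1..3 are modelled here")
    return rdy_cnt <= pipeline_level - 1


def first_issue_cycle(latency: int, pipeline_level: int) -> int:
    """Cycles after a command until the next one may be issued, assuming
    rdy_cnt counts down by one per cycle from 'latency' to 0."""
    for cycle, rdy_cnt in enumerate(range(latency, -1, -1)):
        if may_issue(rdy_cnt, pipeline_level):
            return cycle
    raise AssertionError("unreachable: rdy_cnt always reaches 0")


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, first_issue_cycle(3, level))
```

Under these assumptions each extra pipeline level lets the master start the next access one cycle sooner, which is the overlap that the single-ack approach cannot express.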
Hi,

Could you tell me what DQPx is on the CY7C1386C (SRAM)? There are 36 data bits: 32 DQx and 4 DQPx.

Thanks,
zlotawy

Article: 106999
Check out the Avnet Virtex-4 FX12 Mini Module. It is tiny, has one gigabit PHY, and it's $250. Put two boards together and use one of them as a slave, to stream the PHY data to the master module. Only $500 :)

http://www.em.avnet.com/evk/home/0,1719,RID%253D0%2526CID%253D25726%2526CCD%253DUSA%2526SID%253D32214%2526DID%253DDF2%2526SRT%253D1%2526LID%253D32232%2526PRT%253D0%2526PVW%253D%2526BID%253DDF2%2526CTP%253DEVK,00.html