Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
On Mar 29, 9:59 pm, Antti <Antti.Luk...@googlemail.com> wrote:
> On 29 Mar., 20:34, Mike Treseler <mike_trese...@comcast.net> wrote:
>
> > Antti wrote:
> > > any ideas how to really clean the 4mhz clock?
> >
> > I would try soldering on one of those
> > little schmitt trigger packs. Sometimes
> > a low slew rate will clock both edges
> > once in a while.
> >
> > -- Mike Treseler
>
> yes schmit trigger input could be the cure..
> but i still cant understand the missing clock pulses!
>
> Antti

A Schmitt trigger will not help you. Please try registering all inputs and verify that the FFs are placed in the IOBs, and your troubles will go away.

The missing clock pulses are easy to understand: your 4 MHz signal, used as a strobe, controls many FFs in your design, but you do not control the propagation delay of the combinatorial logic from the pad to these FFs. When a rising edge of the 50 MHz clock falls very close to a rising edge of the 4 MHz signal, some FFs see the 4 MHz signal as logic 0 and others see it as logic 1... and YOU LOSE SOME EDGES ON SOME FFs.

Very common error!

Laurent
www.amontec.com

Article: 130701
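The structure Laurent recommends (register the asynchronous strobe first, then derive one edge pulse that every downstream FF shares) can be sketched as a cycle-level model. This is Python rather than HDL and is only an illustration: the function name `rising_edges` and the three-register chain are assumptions, not something taken from Antti's design, and metastability itself is not modelled.

```python
def rising_edges(samples):
    """Cycle-level model of a register chain plus rising-edge detector.

    `samples` is the asynchronous strobe as sampled at each 50 MHz clock
    edge (0 or 1). Returns a list holding a single-cycle 1 per rising edge.
    """
    ff1 = ff2 = ff3 = 0  # all registers assumed to reset to 0
    pulses = []
    for s in samples:
        # The pulse is computed from registered values only, so every
        # consumer of it sees the same decision in the same clock cycle.
        pulses.append(1 if (ff2 == 1 and ff3 == 0) else 0)
        # Clock edge: shift the strobe one stage down the chain.
        ff1, ff2, ff3 = s, ff1, ff2
    return pulses
```

For example, a strobe sampled as `[0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0]` contains two rising edges, and the model produces exactly two one-cycle pulses, each delayed by the length of the register chain.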
On 30 Mar., 22:42, j...@amontec.com wrote:
> On Mar 29, 9:59 pm, Antti <Antti.Luk...@googlemail.com> wrote:
>
> > On 29 Mar., 20:34, Mike Treseler <mike_trese...@comcast.net> wrote:
> >
> > > Antti wrote:
> > > > any ideas how to really clean the 4mhz clock?
> > >
> > > I would try soldering on one of those
> > > little schmitt trigger packs. Sometimes
> > > a low slew rate will clock both edges
> > > once in a while.
> > >
> > > -- Mike Treseler
> >
> > yes schmit trigger input could be the cure..
> > but i still cant understand the missing clock pulses!
> >
> > Antti
>
> Schmit trigger will not help you.
> Please try to register all inputs, verify the FFs are in the IOB pads,
> and your troubles will go.
> It is very easy to understand the missing clock pulses: your 4mhz
> signal used as strobe controls many FFs in your design. But actually,
> you do not control the timing propagation of the combinatorial logic
> from the pad to these FFs.
> When a rising of 50mhz is very close to the rising_edge 4mhz, some FFs
> see the 4Mhz signal as a 0 logic level and some others see the 4 Mhz
> signal as a 1 logic level... and YOU LOSE SOME EDGE ON SOME FFs
>
> Very common error !
>
> Laurent www.amontec.com

Larry,

the error happened when one single FF was clocked at 50 MHz with the 4 MHz pulses on its D input. The output of this FF missed one COMPLETE pulse (one rising edge) about once per 100M pulses counted. That can only happen if the pulse was not latched at the proper level for many 50 MHz clocks, something I really cannot understand.

Well, all those errors have happily disappeared, without a Schmitt trigger or any other magic.

Antti

Article: 130702
"Antti" <Antti.Lukats@googlemail.com> wrote in message news:ff1c62b5-f856-4740-9c12-5e303525500b@s13g2000prd.googlegroups.com... > On 30 Mrz., 22:42, j...@amontec.com wrote: >> On Mar 29, 9:59 pm, Antti <Antti.Luk...@googlemail.com> wrote: > that can only happen if the pulse was not latched proper level for > many clocks of 50mhz, something i can really not understand. > > well all those errors have disappeared happily, without schmit trigger > or any other magic > Just what was the change anyway? Getting rid of using the strobe to clock things or something else? KJArticle: 130703
On 30 Mrz., 23:34, "KJ" <kkjenni...@sbcglobal.net> wrote: > "Antti" <Antti.Luk...@googlemail.com> wrote in message > > news:ff1c62b5-f856-4740-9c12-5e303525500b@s13g2000prd.googlegroups.com... > > > On 30 Mrz., 22:42, j...@amontec.com wrote: > >> On Mar 29, 9:59 pm, Antti <Antti.Luk...@googlemail.com> wrote: > > that can only happen if the pulse was not latched proper level for > > many clocks of 50mhz, something i can really not understand. > > > well all those errors have disappeared happily, without schmit trigger > > or any other magic > > Just what was the change anyway? Getting rid of using the strobe to clock > things or something else? > > KJ well as soon as the strobe is passing one FF clocked at 50mhz there was measurable error rate for both extra and missing strobes. thats basically it. I had some of the design tested on Xilinx and bad- proto where I had to filter the signals, so at design change i removed some filters and in the strobe left 1 FF what looked like good solution until i did start to measure the actual error rate.. AnttiArticle: 130704
Antti wrote: > > Jim > > http://www.actel.com/documents/Clock_Skew_AN.pdf > > look as example figure 9 there how do you like if your FPGA vendor > suggest using this type of clock distribution? :) > i do not have any local clocks, not any more, but i have seen those > effects very well. I assumed the FGPA fitter tools to take care those > situations or issue warning at least or that it shows in post place > simulation, but no. those Actel FF that clock 100% false can pass > fitter and show no problems in post-place sims also. These days, very large reliance is placed on the models, these post-place-sims are still SIMULATIONS, and they rely on the numbers the vendor gives for their models. Those numbers often come from relatively simple bench tests, and are not derived from 'aggressive corner killer' designs. Sometimes it is a good idea to go looking for problems - we have provided (or helped provide) test cases to a few large US coporations, and I am bemused by their lack of what I would call 'test coverage'. > > my failure rate change may also be just different fitter run > differences. I have no almost all working, that is no double or > missing strobes, and the 50mhz domain part also working ok Can you lock the Test portion, so you know that does NOT change. You will of course get more jitter on the 4MHz clock chain, with more of the device 'otherwise active', and that may be enough to trigger these aperture effects. The models SHOULD safely margin all cases, and it sounds like they may be close, but not close enough ;) How easy is it to edit the models in the Actel devices ? What margin does it 'think' it has now ? -jgArticle: 130705
KJ wrote: > "Jim Granville" <no.spam@designtools.maps.co.nz> wrote in message > news:47eebe69@clear.net.nz... > >>KJ wrote: >> >>>Your post never mentioned anything about having measured a slow edge on >>>the 4MHz signal either. If the edge rate is within spec, adding a >>>Schmitt trigger will have no effect. >> >>Not entirely true. >>In the real analog world, there are other details that can >>trip you up. Ground level shifting and series inductances all >>conspire against clean digital operation... >> >>(The best schmitt is a non-inverting one.) >> > > > The usefulness of the trigger from an engineering perspective is to turn a > slow edge into a faster one in order to meet input characteristics of a part > that can not tolerate the slower edge. Yes, but it does more as well. The output of a schmitt now references against the local ground, and so you have increased the tolerance of inter-system ground bounce. > Use of a schmitt trigger to address > any of the issues that you mention would only be considered if there are > some other physical constraints that precludes the proper engineering > solution which would consist of > - Termination > - Proper grounding > - Differential signalling > > for the simple reason that the trigger would not be addressing the root > cause issues that you brought up. Of course, but we work in the real world and if (eg) the design has to work across multiple PCBs, then the 'proper grounding' may not be on the table. Differential signaling is great, but you may be forced to deploy to a i2c or SPI interface standard - or just be told that many wires are too expensive!, Schmitts also allow designers to deploy EMC measures, often late in the design flow. Pin Configurable schmitts are common in CPLDs, and I know one vendor who put them there, because we provided examples of when/how they were needed in real world applications. -jgArticle: 130706
Hi, is it possible to force a reinitialization from within the FPGA? If I have, for instance, transferred new configuration data to a flash and want to reinitialize the FPGA without using the external PROG_B pin?

Article: 130707
I have SUSE 10.3 x86_64 and the install worked fine for me. However, I cannot for the life of me find the executable to start up WebPACK. Do you guys know where it is? It is starting to drive me crazy.

Article: 130708
My question: How can I write to DDR SDRAM from a custom IP core on the PLB Bus? Background: I am developing a custom IP core for a Virtex II Pro based system (the Xilinx XUP board). This core captures video data using the Digilent VDEC1 board at a pixel rate of 13 MHz. I would like to write several video frames to memory and process them in the Power PC processor. Right now my core captures video and attempts to send it to the PPC using the PLB FIFO IPIF. However, there are several problems with this approach. It turns out that reading this data from the FIFO in software is extremely slow, so the FIFO fills up before I can read all of the pixels out of it (resulting in lost pixels and unusable frames). Also, there is only enough memory in the FIFO to store about one image frame (360x240 resolution), so it could not serve as a multi- frame buffer (and I need about 8 frames to do my processing) Therefore, I would like to write the image data directly to DDR SDRAM from my custom IP core. I have 512 MB installed on the board, and I'm able to access it in C using the plb_ddr core. However, I am not sure how I can get my custom core to write to the DDR using the PLB bus. If anyone could point me in the right direction, I would really appreciate it. I know that I will need to make my custom core a "Master" on the PLB bus and possibly use DMA, but I can't find a good example on how to read/write to memory. Thanks.Article: 130709
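Before choosing between PLB master/DMA and another path to DDR, it helps to put numbers on the buffer. A rough sizing sketch follows; the frame size, frame count, pixel rate and 512 MB figure come from the post, while `BYTES_PER_PIXEL` and `BUFFER_BASE` are assumptions made purely for illustration.

```python
WIDTH, HEIGHT = 360, 240         # frame resolution from the post
FRAMES = 8                       # frames needed for processing
PIXEL_RATE_HZ = 13_000_000       # VDEC1 pixel rate from the post
DDR_BYTES = 512 * 1024 * 1024    # DDR installed on the board
BYTES_PER_PIXEL = 2              # ASSUMPTION: 16 bits stored per pixel
BUFFER_BASE = 0x0100_0000        # HYPOTHETICAL start address in DDR


def frame_bytes(width=WIDTH, height=HEIGHT, bpp=BYTES_PER_PIXEL):
    """Bytes needed to hold one frame."""
    return width * height * bpp


def frame_base(index, base=BUFFER_BASE):
    """Start address of frame `index` in a contiguous multi-frame buffer."""
    if not 0 <= index < FRAMES:
        raise IndexError("frame index out of range")
    return base + index * frame_bytes()


def write_rate_bytes_per_s(pixel_rate=PIXEL_RATE_HZ, bpp=BYTES_PER_PIXEL):
    """Sustained write bandwidth the capture core must achieve."""
    return pixel_rate * bpp
```

Under these assumptions one frame is 172,800 bytes, eight frames need about 1.4 MB (a tiny fraction of the 512 MB), and the core has to sustain roughly 26 MB/s of writes while pixels are arriving.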
"Dave" <dhschetz@gmail.com> wrote in message news:940082fb-7675-4407-8c91-fd1300a0ed41@e60g2000hsh.googlegroups.com... > On Mar 26, 10:25 am, "BobW" <nimby_NEEDS...@roadrunner.com> wrote: >> <shakith.ferna...@gmail.com> wrote in message >> >> news:a01b5815-57f0-4f38-bade-fae9d7937826@d4g2000prg.googlegroups.com... >> >> >> >> > Hi, >> >> > We have developed a High Speed on FPGA using the MGT/RocketIO to >> > generate high speed signals. Also we receive the high speed signals >> > using the MGT. Due to the nature of the application, we require a pure >> > signal without encoding (We have switched off the 8B/10B encoding in >> > the MGT). The problem is that it effects the clock recovery accuracy >> > in the received signal as there might not be good DC balance in the >> > signal. Which in turn effects the data received. And MGT is a black >> > box to us. >> >> > Two options- >> > 1. Another encoding mechanism such that we have pure signal and good >> > clock recovery. >> > 2. Or is there a parameter in MGT to improve the clock recovery. >> >> > We are looking for some solution around this problem. >> >> > Our setup: >> > FPGA - Xilinx Virtex II Pro >> > Board - Xilinx XUP Virtex(tm)-II Pro Development System >> > Software - ISE 9.1i >> >> > Best Regards >> > Shakith >> >> The clock recovery circuitry requires that there be a minimum number of >> received edges per unit time. Otherwise, the recovered clock edges will >> drift with respect to the data stream. >> >> If you don't use some type of encoding (e.g. 8B/10B), or you don't insure >> that your raw data ALWAYS provides this minimum edge density requirement, >> then the clock/data recovery circuitry will not work. >> >> Bob > > Perhaps a scrambler/descrambler using LFSR's would help alleviate any > edge density or DC bias problems by randomizing the data a bit? I'm > not sure if it would guarantee correct reception, but it may help. You don't say why you need a "pure" signal (no coding). 
I could assume that you are trying to eliminate the overhead of coding (with 8b10b you only get an effective data bandwidth of 80%). You didn't say whether or not you MUST have a DC-balanced signal. Are your transmitters AC-coupled to the receivers? If yes, then you must have a DC-balanced signal and coding IS required.

You also don't mention your clocking requirements. Do you have a common (distributed) reference clock, or are the reference clocks independent? Typical clock recovery circuits assume plesiochronous operation. This means that the reference clocks can be independent but must be within +/- XYZ PPM (parts per million) of each other.

For a receiver to track the transmitter, the receiver must recover the clock (and data) from the transmitted bit stream. At high speeds this is done using PLLs. For the PLLs to perform well they require frequent edges in the transmitted bit stream. Without a coded bit stream you can't have DC balance and you can't guarantee frequent edges for the PLL. So if you really can't have a coded bit stream, then you must have a distributed (common) reference clock and you cannot use AC coupling.

If you are trying to minimize the overhead of coding, look at other coding methods like 64b/66b codes. These are more efficient but still have the benefits of DC-balanced transmission and frequent clock edges.

Note that another benefit of 8b10b is the ability to align your symbols. How are you going to know the boundaries of a byte, a packet, etc., without a coded stream and the use of non-data symbols (control symbols)? If you have independent reference clocks, how are you going to avoid underrun and overrun of the receiver without control symbols (i.e. insertion and removal of idle characters)? Do you need any form of closed-loop flow control, error signalling, etc.? If yes, how are you going to do this without control symbols?

TC

Article: 130710
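The coding-overhead comparison TC makes is simple arithmetic. A small sketch, where the 2.5 Gb/s line rate in the usage note is a hypothetical figure and not something stated in the thread:

```python
def coding_efficiency(payload_bits, coded_bits):
    """Fraction of the line rate that carries payload for an n/m block code."""
    return payload_bits / coded_bits


def payload_rate(line_rate_bps, payload_bits, coded_bits):
    """Usable data rate for a given line rate and block code."""
    return line_rate_bps * coding_efficiency(payload_bits, coded_bits)


EFF_8B10B = coding_efficiency(8, 10)     # 0.8, the 80% quoted in the post
EFF_64B66B = coding_efficiency(64, 66)   # about 0.97
```

At a hypothetical 2.5 Gb/s line rate, `payload_rate(2.5e9, 8, 10)` gives 2.0 Gb/s of payload, while `payload_rate(2.5e9, 64, 66)` gives roughly 2.42 Gb/s.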
On 31 Mar., 00:25, kislo <kisl...@student.sdu.dk> wrote:
> Hi, is it possible to force a re initialization from within the fpga ?
> if I have, for instance. transfered new configuration data to a flash,
> and want to reinitialize the fpga without using the external prog_b
> pin ?

The S3E supports MultiBoot, but only when configuring from parallel flash. Read the manuals; it is very easy to implement.

Antti

Article: 130711
On Sat, 29 Mar 2008 08:18:23 -0700 (PDT), emeb <ebrombaugh@gmail.com> wrote: >I've got a fairly large design that I've been working with in ISE >9.2.04 for a while - it takes about 90% of a V2P100 and runs to >completion in about 3.5 to 4 hours on my Linux x86_64 system (Athlon >dual core 3800 w/ 4GB) using home-made make scripts. I decided to take >10.1 out for a spin to see if it really helps speed things up. Here's >what I've seen on the first few runs: > >- XST seems to run about as fast as it used to. >- NGDBUILD seems faster and seems to find errors in timing exceptions >more quickly. >- MAP works about the same. >- PAR takes a lot longer to run. I'm seeing 8 hour runs that used to >take 2 - 2.5 hours in 9.2.04 with the same constraints. It appears to >be coming up with bad placements (Phase 12.27 seems to take _forever_) >that are impossible to route successfully. > >I'm in the process of adding more timing exceptions and this seems to >help, but I still haven't had a successful PAR run. Let me re-iterate: >9.2.04 didn't have any trouble with this design using the exact same >source and constraints. > >Summary: 10.1 isn't working as well as 9.2.04. I'll probably be >shelving it and waiting for the service pack. > Well, my designs have all compiled OK. A complex design (83% of an XC3S400, with a microblaze and a whole bunch of proprietary peripherals included, with minor differences in pinout and contents over two different boards), has compiled consistently in 20% less time (25 minutes brought down to 20 minutes), just adding only 7 slices to the final design. PC, WXP SP2 ES, dual athlon) The GCC compiler has some differences, as it will migrate some data from bss to text (which might be a problem for me, as the may be located on different RAM types, but I have not yet identified the differences). 4 different SW projects working on both different above mentioned boards working OK. So, for me, everything seem to work fine up to this hour. 
Of course, the work is being done on a branch, the main development remains on 9.2.4(ISE)2(EDK). Now, I'd really love receiving my license update to register my products (EDK and Chipscope), so I may begin working with it.... Best regards, ZaraArticle: 130712
On Mar 30, 9:41 am, move <liubeny...@gmail.com> wrote: > hi all: > > i am currently working on a "toy" design of my first big project (in > VHDL) on the Xilinx Spartan III starter kit. now facing a timer > problem and i could not properlly solve it using my limited design > experience, here is it: > > module A will prepare data for output (node : dout(7 downto 0)) to > module B when it receive a READY signal from B. in order to notify B > that the data is ready on the bus , A will ouput a signal DONE , but > the DONE will be '1' after 30 ms A received signal READY and will just > last 10 ms before going low. I wonder is there any standard or elegant > way of implement the timer in VHDL? > > PLZ give me some hint! > thank U all in advance ! :) When A receives the B's READY signal, A should place its data on the bus (dout), and then A should turn on a data-ready signal for B. When B has captured (latched) A's data, then B should turn off its READY signal. A will then turn off its data-ready signal. There was a paper on the Xilinx website that dealt with this situation, but I think that Xilinx's method was faster/simpler than my "off the top of my head" method. HTH -Dave PollumArticle: 130713
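For the timer itself, the usual approach is a free-running cycle counter compared against two terminal counts. The sketch below only works out those counts and models the DONE window in Python; it assumes a 50 MHz system clock (believed to be the oscillator on the Spartan-3 starter kit, but check your board), and the helper names are hypothetical. The resulting constants would then go into an ordinary clocked counter process in VHDL.

```python
CLK_HZ = 50_000_000  # ASSUMPTION: 50 MHz board oscillator


def ms_to_cycles(ms, clk_hz=CLK_HZ):
    """Number of clock cycles in `ms` milliseconds."""
    return ms * clk_hz // 1000


DELAY_CYCLES = ms_to_cycles(30)   # READY seen -> DONE rises
WIDTH_CYCLES = ms_to_cycles(10)   # DONE stays high this long
# Width of a counter able to count through the whole 40 ms sequence.
COUNTER_BITS = (DELAY_CYCLES + WIDTH_CYCLES - 1).bit_length()


def done(count):
    """DONE output for a counter that starts at 0 when READY is seen."""
    return 1 if DELAY_CYCLES <= count < DELAY_CYCLES + WIDTH_CYCLES else 0
```

With these assumptions the counter needs 21 bits, DONE rises at count 1,500,000 and falls again at count 2,000,000.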
On 31 mar, 05:13, admbarn...@gmail.com wrote: > My question: How can I write to DDR SDRAM from a custom IP core on the > PLB Bus? > > Background: > I am developing a custom IP core for a Virtex II Pro based system (the > Xilinx XUP board). This core captures video data using the Digilent > VDEC1 board at a pixel rate of 13 MHz. I would like to write several > video frames to memory and process them in the Power PC processor. > > Right now my core captures video and attempts to send it to the PPC > using the PLB FIFO IPIF. However, there are several problems with > this approach. It turns out that reading this data from the FIFO in > software is extremely slow, so the FIFO fills up before I can read all > of the pixels out of it (resulting in lost pixels and unusable > frames). Also, there is only enough memory in the FIFO to store about > one image frame (360x240 resolution), so it could not serve as a multi- > frame buffer (and I need about 8 frames to do my processing) > > Therefore, I would like to write the image data directly to DDR SDRAM > from my custom IP core. I have 512 MB installed on the board, and I'm > able to access it in C using the plb_ddr core. However, I am not sure > how I can get my custom core to write to the DDR using the PLB bus. > > If anyone could point me in the right direction, I would really > appreciate it. I know that I will need to make my custom core a > "Master" on the PLB bus and possibly use DMA, but I can't find a good > example on how to read/write to memory. > > Thanks. From my experience, I'll tell you that I have used open_ddr core from opencores to use with PowerPC. It is not easy but possible. In your case I will use plb_ddr core. I would do this: - First of all, I'd generate the design with the EDK including DDR memory. - Second I'd use ISE to edit the system internal connection, so I'd disconnect DDR from PPC and connect to my custom core. Only You have to study the signal provided by PLB_IPIF to know what to do. 
This is my opinion. I hope it helps.

Article: 130714
Hey *,

I'm trying to use USB programming cables from both Xilinx and Lattice on the same Windows machine. No luck so far: I can only get one of them working at a time, and only after completely uninstalling the other's software. The problem seems to be that both use Jungo's "windrv", and the tools are getting confused or something...

Has anyone managed to get this working? The only solution I can think of right now is running the tools inside a virtual machine, but before I try that I'd like to know if maybe there's a simpler solution...

cu,
Sean

--
My email address is only valid until the end of the month. Try figuring out what the address is going to be after that...

Article: 130715
I just wanted to mention...

We tried the exact same build on two different computers here (both running ISE 9.2.04i):
1. A Dell Precision, 2 GB memory, 3 GHz Xeon, running XP x32
2. An HP xw9300 workstation, 4 GB memory, 2.40 GHz dual-core AMD Opteron, running XP x64

Results:
1. 25 min
2. 1 h 25 min

We did not expect this difference. Maybe x64 is too slow? I suspect the run time depends not only on ISE itself but on several unknown factors. I wish ISE could run some diagnostics to help us find non-ideal conditions.

Article: 130716
> Bob, > I sure hope so, however you need to realize that what's funny for native English speakers is not > necessarily funny for Russians or French, etc., and vice versa. In any other case I would say > nothing, but there are certain subjects, which are a total taboo for making any fun of (or adding > twists to them as you say) in the culture I grew up in... It is considered very low taste and > insensitive. It might be a small world but this shows that we're all (slightly) different. I grew up in Belfast during 'the troubles' and almost anything could be (and was) subject to ridicule. This 'black' humour was used by many as a coping strategy. I smiled at the reply to the original post and I'd agree that Anne Frank's house is worth a look. NialArticle: 130717
Hi, is it possible to increase the available memory for a MicroBlaze after it has been created? I have created a MicroBlaze with 16 KB of block RAM using the Base System Builder, but now I want to use 32 KB. Can this be changed?

Article: 130718
There are a couple of choices you have when it comes to writing back to external memory from your core. Depending on which versions of the tools you are using you can do a pretty simple DMA from your core (across the PLB) or directly connect your core to external memory (say, via the MPMC). I tend to lean towards the MPMC as it is a much higher bandwidth solution than the PLB, but it is a little more complex. If you are using >=9.2i tools there is a good Answer Record on Xilinx's website about using the Native Port Interface (NPI) with the Multi-Ported Memory Controller.Article: 130719
Sean Durkin <news_mar08@tuxroot.de> wrote:
>Hey *,
>I'm trying to use USB programming cables from both Xilinx and Lattice on
>the same Windows machine. No luck so far, I can only get one of them
>working at a time, when I completely uninstall the other's software.
>Seems like the problem is that both use Jungo's "windrv", and the tools
>are getting confused or something...
>Has anyone of you managed to get this working? The only solution I can
>think of right now is runnig the tools inside a virtual machine, but
>before I try that I'd like to know if maybe there's a simpler solution...

Jungo's tools have a history of totally violating any sane programming rules. Maybe you could add another USB card to let them install on different hardware? Or install the Lattice programmer, and then use a Win32 port of the Linux driver for the Xilinx stuff? I don't know how sane the Lattice drivers are, but the Jungo drivers are likely to blame.

Article: 130720
On Mar 31, 8:23 am, kislo <kisl...@student.sdu.dk> wrote:
> hi, is it possible to increase the available memory for a microblaze
> after it has been created ? .. i have created a microblaze with 16kb
> blockram using the base system builder, but now i want to use 32kb ..
> can this be changed?

Yes, you can either edit the MHS file and change the HIGHADDR of the memory controller, or you can edit it from the Address view in XPS.

Article: 130721
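The HIGHADDR value is just base address plus size minus one. A small sketch of that arithmetic follows, assuming the controller's base address is 0x00000000 (not stated in the thread) and that the range has to be a power-of-two size aligned to its base; `high_addr` is a hypothetical helper, not part of any Xilinx tool.

```python
def high_addr(base_addr, size_bytes):
    """Return the inclusive high address of a memory range.

    Assumes the range must be a power of two in size and aligned to
    that size, which is the usual constraint for BRAM controllers.
    """
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a power of two")
    if base_addr % size_bytes:
        raise ValueError("base address must be aligned to the size")
    return base_addr + size_bytes - 1


OLD_HIGH = high_addr(0x00000000, 16 * 1024)   # 0x00003FFF
NEW_HIGH = high_addr(0x00000000, 32 * 1024)   # 0x00007FFF
```

So with a base of 0x00000000, going from 16 KB to 32 KB means changing the high address from 0x00003FFF to 0x00007FFF.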
On Mar 30, 9:15 am, louis <y...@ce.et.tudelft.nl> wrote:
> Hi, All,
>
> I built a PowerPC system on Xilinx ML410 board (Virtex4 fx60), the
> system contains 64K OCM instruction memory and 64K PLB data memory. I
> used XMD to debug the system via the jtagppc, when I reset the system,
> the PC register went back to a address like "0Xfffff048". But
> according to the PPC manual, after reset, the PC should be
> "0Xfffffffc". Does anyone met the similar problem before?
>
> regards,
> louis

How did you perform the reset? Do you have a software program that you're actively trying to debug and have already "downloaded" onto the FPGA via the JTAGPPC? If so, your debugger may be seeing the start of the application at 0xfffff048 instead of the reset vector at 0xfffffffc.

Article: 130722
On 31 Mar., 14:55, morphiend <morphi...@gmail.com> wrote:
> On Mar 31, 8:23 am, kislo <kisl...@student.sdu.dk> wrote:
>
> > hi, is it possible to increase the available memory for a microblaze
> > after it has been created ? .. i have created a microblaze with 16kb
> > blockram using the base system builder, but now i want to use 32kb ..
> > can this be changed?
>
> Yes, you can either edit the MHS file and change the HIGHADDR of the
> memory controller, or you can edit it from the Address view in XPS.

Will this also make XPS initialize more BRAM? If I, for instance, build a system with 16 KB of block RAM, it automatically assigns 32K to dlmb_cntlr, but that doesn't mean it has 32K of block RAM. If I use more than 16K for application code, the compiler complains.

Article: 130723
On 31 Mar., 10:46, "Morten Leikvoll" <mleik...@yahoo.nospam> wrote:
> I just wanted to mention..
>
> We tried exact same build on two different computers here.. (both running
> ISE 9.2.04i)
> 1. A Dell precision 2Gb mem, 3Ghz xeon running XPx32
> 2. A HP xw9300 workstation, 4Gb mem, 2.40Ghz dual core AMD Opteron running
> XPx64
>
> Results:
> 1. 25min
> 2. 1h25
>
> We did not expected this difference.

That one is easy. Depending somewhat on the exact model, the Xeon is likely to have a 4 MB unified cache, whereas the Opteron probably has 1 MB of cache per core. This means that for a compute-intensive application that can only use one core, the Xeon provides 4x the cache size. The x64 application makes things even worse because it has a larger memory footprint.

Kolja Sulimma

Article: 130724
On Mar 21, 3:20 pm, Antti <Antti.Luk...@googlemail.com> wrote: > On 21 Mrz., 16:13, rickman <gnu...@gmail.com> wrote: > > > > > On Mar 20, 3:19 am, Antti <Antti.Luk...@googlemail.com> wrote: > > > > "serial implementations of the past" - have worked with COP800 and I > > > had hard days optimizing DES for ST62T10 > > > - none of those is suitable directly, maybe there is something to look > > > and learn, but not much to direct use > > > > small FPGA soft-cores (existing ones) > > > - too large > > > Until you tell us what "too large" means in numbers, we can't consider > > this requirement. > > small enough means "virtually free" in most small FPGA's, from any > vendor > what may be small enough for Spartan3 may not be an option for Actel > as the fabric is too different and the "price of each type of resource > is different" If you were a customer, I would say nothing, but keep asking related questions until I got an answer I could use. But since you appear to be an engineer, I will point out that you answered my request for clarification with yet another *vague* and *undefined* answer! As long as you use terms like "too large" and "virtually free", I can't know what you mean. There was a Dilbert cartoon about this once. He kept asking his customer to clarify his requirement and the customer kept giving useless answers until finally he insisted that Dilbert should be able to read his mind! I can' read your mind, so I don't know how big "too large" is and I will never know what "virtually free" means. The cost of asking this question keeps the answer from ever being free in any sense. The extra effort required to pull a good answer from you has just raised my bid by 20%. ;^) > > > - not flexible in configuration options > > > What flexibility do you require? > > either with regisers in LUTRAM > or in BRAM > or even in external serial ram or ram buffer or serial rom So what about existing FPGA soft cores is not "flexible enough"? 
I am pretty sure they all use either block ram or LOT ram for register files. > > > - no real C compilers exist > > > That is not my understanding... What do you mean by "real"? > > NOT PLAY compilers. > > try C compiler for picoblaze, that is an example of what is not really > really useful Ok, now we are getting a definition of "real compiler"... it is any compiler that is not exactly like the picoblaze C compiler. > > > - can not address "flat word" 32bit addressing > > > That is an interesting requirement. If you have a bit serial > > processor that executes, at best, 2 MIPS, why do you "require" a 32 > > bit flat addressing model? What is the application that needs 4 GB of > > address space? > > yes, that is interesting and valid argument. if you did read my specs > you > maybe noticed that I mentioned possibility to use SD card as MAIN > memory > that is code execution directly while reading from sd card in place, > so > if the SD card interface is the CODE/DATA memory interface then it > would be nice to simply MAP ALL the card into the same flat space > sure the slow cpu would take ages executing a 4GB large program > but the ability to directly access the full sd card would still > simplify things But what happens in a couple of years when it is hard to find a 4 GB card and you need at least 26 bit addressing? > well the 32 bit is even too small as the card exeed 4GB (even microSD > are available in 8GB) Yes, and you won't be able to buy 4 GB cards in another 2 or 3 years. So why bother with the 32 bit requirement? Think big and just go with 64 bits like any "real" processor will do. > > > now some additional considerations: > > > > 32 bit ALU for serial implementation cost the same as 1 bit ALU > > > Not true. The overhead for control of a bit serial ALU is higher than > > the data processing and more complex than a parallel data path. There > > are happy mediums such as using an 8 bit core to perform 32 bit > > computations. 
> > sure it is not 1:1 but wide datapath is overhead that sometimes > isnt needed We are not talking about wide datapath. We are talking about the size of the registers and the length of time to do simple operations. In a parallel implementation wider data paths cost is in silicon. In a bit serial design a wider data path cost is in time. This process is already very slow and will have limited apps due to the slowness. So why cripple it by successive factors of 2 just because the ALU logic is "free"? An 8 bit design will be 4X faster than a 32 bit design and can be just as powerful since you can easily implement 32 bit instructions by chaining 4 8 bit instructions. Then you have the speed of 8 bits and the flexibility of 32 (or even 64) bits. > > > 32 bit registers if implemented in BRAM or data buffer of Atmel > > > I don't know what this means. > > it means that WHEN onchip RAM is at premium, then the spu would > use the DUAL RAM buffers in the SPI DATAFLASH ROM as main storage > for registers and stack, this would make i EXTREMLY slow but > if goal is not use any BRAM then it mayb only option > > > > Dataflash cost very little (0 FPGA fabric resource) > > > Yes, anytime you go off chip, you are not using the FPGA fabric. > > > > code data memory space cost very little, so opcode density is almost > > > least priority > > > Code memory may be inexpensive, but the time to fetch is not low. > > This ends up being a limiter on the speed, but then you have already > > given up any speed requirements in the initial set of constraints. > > I'm not so sure of data memory. Are you talking ram or flash? > > flash is cheap, serial ram in most cases doesnt makes sense > so the processor would in most cases have limited ram Ok, registers in the DataFlash buffers... aren't they used when writing to the flash? Maybe I am not familiar with the DataFlash you use. 
> > > number of cycles per instruction is also very low priority (at least > > > for some optimization options) > > > > lets look sone special targets > > > > 1) device S3AN-50 > > > ============= > > > > if we use picoblaze, we use 30% of BRAM and some small % of FPGA, but > > > we only get 1KW of code and 8 bit processor/ALU > > > You can always extend the code size by extending the processor to > > fetch from external dataflash. I don't think CPU cores are locked to > > the original implementation. Like I said above, an 8 bit processor > > can do 32 bit arithmetic very easily. > > sure, any processor could be made to fetch code from external serial > memory > but only processor that was designed to perform for this operation can > do it optimal Optimal in *what* sense??? You have already spec'd out speed and your registers in DataFlash buffers have reduced the speed to what can be supported by the external memory. So what could "optimal" possibly mean in any sense? > Think for NIOS this is already done, a avalon slave that reads from > serial flash > I would like that for Microblaze too, but havent had eneough reasons > todo it, > and it would be still rather large implementation Wait a year or two and the definition of "too large" and "virtually free" will change to allow not only Microblaze processor to fit the requirement, but likely any ARM implementation too! > > > if all of the above special cases would be support by the same C > > > compiler (with settings to adapt to config differences) ? > > > > single chip MCU's are hard to crack, but that isnt the goal, in many > > > cases there are "unused" resources present, so the SPU could really > > > come virtually free, besides an extra IC + extra 0.80 USD is still > > > extra cost for additional MCU in the system If this is your definition of "virtually free" then there is *no* solution. 
There are always designs that use all practical amount of some resource in an FPGA and there are always designs that have tons of free resources. So how can you talk about using "unused" resources as if this was some guaranteed amount? On the other hand, there are going to be an incredibly small number of designs that are using an FPGA and also can't afford an $0.80 CPU. If you want to consider a definition of "virtually free", then I suggest that you consider the increment between FPGA sizes. In the Spartan 3 the smallest increment is 2300 LUTs after accounting for the Xilinx "expanding universe" inflation factor. If your feature uses one fifth of this amount, then adding it is unlikely to cause the design to be bumped up to the next size of FPGA and so will be "free". This equates to 460 LUTs/FFs in the *smallest* device, the 3S50. As the starting chip size gets larger, the increment gets larger so that even in the lowly 3S400 "virtually free" means 1,638.4 LUTs. I think that for any but most strict applications, a standard, small soft core processor is "small enough" to be "virtually free".
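The "one fifth of the step to the next device" rule of thumb above is easy to reproduce. In the sketch below the LUT counts per device are assumptions taken from memory of the Spartan-3 data sheet (two LUTs per slice) and should be checked against the current data sheet; the one-fifth share is the figure used in the post.

```python
# ASSUMPTION: 4-input LUT counts per Spartan-3 device, to be verified
# against the data sheet before relying on them.
SPARTAN3_LUTS = {
    "XC3S50": 1536,
    "XC3S200": 3840,
    "XC3S400": 7168,
    "XC3S1000": 15360,
}


def free_budget(device, next_device, share=5):
    """LUTs a feature may use and still count as 'virtually free'.

    Defined here as 1/share of the LUT increment between `device`
    and the next larger part.
    """
    increment = SPARTAN3_LUTS[next_device] - SPARTAN3_LUTS[device]
    return increment / share
```

With these counts, `free_budget("XC3S50", "XC3S200")` gives about 460 LUTs and `free_budget("XC3S400", "XC3S1000")` gives 1638.4, matching the figures quoted in the post.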