Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
On Mar 11, 10:34 pm, "Antti.Luk...@googlemail.com" <Antti.Luk...@googlemail.com> wrote:
>
> MAXII is a bad FPGA, as Altera made design mistakes (no distributed
> ram!)
>
> Antti

Perhaps that explains why Altera markets it as a CPLD, not an FPGA?

AL
Article: 138851
On Mar 13, 8:12 am, Walter Banks <wal...@bytecraft.com> wrote:
> Jim
>
> You beat me to the comment on serial processors.
>
> A long time ago I designed a number of bit serial processors
> they can be very hardware efficient. There are a number of
> very clever math algorithms that take advantage of bit serial.

The obvious next question, is do you have compilers & maths libraries for such an animal ? ;)

-jg
Article: 138852
On Mar 12, 8:17 pm, LittleAlex <alex.lo...@email.com> wrote:
> On Mar 11, 10:34 pm, "Antti.Luk...@googlemail.com"
> <Antti.Luk...@googlemail.com> wrote:
>
> > MAXII is a bad FPGA, as Altera made design mistakes (no distributed
> > ram!)
> >
> > Antti
>
> Perhaps that explains why Altera markets it as a CPLD, not an FPGA?
>
> AL

It's bad when marketed as a CPLD as well ;) because it looks like a poor man's FPGA...

Antti
Article: 138853
Hi

12 bit (4K), 230 LEs, 28 pins + power, 20-30 MHz: nibz12-bit.vhd from the download at http://nibz.googlecode.com, if you only need a 4K address space of 12 bit memory locations. Full 12 bit datapath. There is really no benefit in going lower in the width generic, as an 8-bit version only has 256 addressable locations for program and data.

cheers
jacko

p.s. The main reason for the slowness is the C% grade measure and the use of the carry chain to reduce ALU area.
Article: 138854
On Mar 12, 12:31 pm, -jg <Jim.Granvi...@gmail.com> wrote:
> On Mar 12, 8:57 pm, rickman <gnu...@gmail.com> wrote:
>
> > I seem to recall that you were trying to find a bit serial CPU that
> > would be the smallest possible in an FPGA. Did you ever find one you
> > liked? Personally, I think that is a goal with a very low target
> > application size. But certainly there are some apps where this could
> > be useful.
>
> Bit-serial (like Cop8) only makes size-sense to simplify bus routing.
> - but that's almost free in a FPGA, so other needs better drive
> Bit-Serial.
>
> Bit-serial Multiply/divide can save resource, but that's less a core
> than an algorithm trade-off, and Mul/Div are rare in the smallest
> cores anyway.
>
> One plus that appeals to me, is Execute from Serial FLASH, (and now
> Serial RAM)
> [does a nibble fetch from 4 bit SPI still count as Bit-serial ? ]
> as resource space. Saves MANY pins, and PCB space, but I'm not sure
> the core will be _smaller_ as a result - more likely slightly larger ?
>
> -jg

Jim,

this is almost the old discussion ;)

Yes, a bit serial processor that executes in place, either from SPI flash or SD card, and uses, say, the dual 512 data buffers of Atmel DataFlash as RAM could be very low on resources.

Say, for the lowest cost Xilinx FPGA (S3A-50), resources are pretty tight, so if a soft core can execute from the same SPI flash that is used for config, using the flash as code memory and the flash buffers as RAM, it would retain almost all the rest of the FPGA resources for the user application.

Should be doable in <100 Xilinx S3A slices, I think.

Antti
Article: 138855
Hi,

Just do it yourself! It is not so complicated, and you can get some help from XAPP258. Then you can modify the init value of the BlockRAM (or replace this BlockRAM with a small distributed RAM).

Regards.
Article: 138856
Jim You beat me to the comment on serial processors. A long time ago I designed a number of bit serial processors they can be very hardware efficient. There are a number of very clever math algorithms that take advantage of bit serial. A second comment on this core is what thought has been made on parallel partitioning problems and how would that be handled. Regards, -- Walter Banks Byte Craft Limited http://www.bytecraft.com -jg wrote: > On Mar 12, 8:57 pm, rickman <gnu...@gmail.com> wrote: > > I seem to recall that you were trying to find a bit serial CPU that > > would be the smallest possible in an FPGA. Did you ever find one you > > liked? Personally, I think that is a goal with a very low target > > application size. But certainly there are some apps where this could > > be useful. > > Bit-serial (like Cop8) only makes size-sense to simplify bus routing. > - but that's almost free in a FPGA, so other needs better drive Bit- > Serial. > > Bit-serial Multiply/divide can save resource, but that's less a core > than > an algortihm trade-off, and Mul/Div are rare in the smallest cores > anyway. > > One plus that appeals to me, is Execute from Serial FLASH, (and now > Serial RAM) > [does a nibble fetch from 4 bit SPI still count as Bit-serial ? ] > as resource space. Saves MANY pins, and PCB space, but I'm not sure > the > core will be _smaller_ as a result - more likely slightly larger ? > > -jgArticle: 138857
On 12 Mar, 08:32, SUMAN <suman...@gmail.com> wrote:
> Hello!
>
> My team is doing real time machine vision project in Spartan3a dsp
> 1800 board. We took greyscale data from c3038 camera module and
> successfully performed sobel edge detection in hardware. Now we are
> detecting lines from the binary image fed to microblaze processor
> We have to perform iterations to determine the value of r from the
> following equation
> r = x*cos(t) + y*sin(t) for each (x,y) from t = -90 degree to 90
> degree
>
> We have thought some solutions:-
>
> I) USING FLOATING POINT UNIT OF MICROBLAZE 7 AND PERFORMING THE
> CALCULATION WITH SINE /COSINE LOOKUP TABLE KEPT IN MEMORY
>
> II) USING CORDIC/ SINECOSINE LUT CORE CONNECTED TO MICROBLAZE THROUGH
> FSL LINK
>
> CAN ANY BODY SUGGESTS ME ANY OTHER SOLUTION FOR MY PROBLEM
>
> THANK YOU

You could also use the CORDIC algorithm in software. It might be simpler. Check this link on how to implement an efficient version of the CORDIC algorithm in software.

http://www.embedded.com/design/embeddeddsp/210200583?_requestid=8108
Article: 138858
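For the r = x*cos(t) + y*sin(t) question, the software CORDIC suggestion might look roughly like the sketch below. It is not taken from the linked article; the names `hough_r`, `FRAC_BITS` and `ITERATIONS` are made up for illustration, and the 16.16 fixed-point format is an assumption. It uses CORDIC rotation mode with the angle -t, so the x output (after removing the CORDIC gain) is x*cos(t) + y*sin(t). Python's `math` module is used only to build the constant tables and to convert the angle; on a MicroBlaze these would be precomputed constants.

```python
import math

FRAC_BITS = 16    # fractional bits of the fixed-point format (assumed)
ITERATIONS = 16   # number of CORDIC micro-rotations (assumed)

# atan(2^-i) in fixed-point radians; a constant table on a real target.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * (1 << FRAC_BITS))
              for i in range(ITERATIONS)]

# Reciprocal of the CORDIC gain K = prod(sqrt(1 + 2^-2i)), in fixed point.
_gain = 1.0
for _i in range(ITERATIONS):
    _gain *= math.sqrt(1.0 + 2.0 ** (-2 * _i))
INV_GAIN = round((1 << FRAC_BITS) / _gain)


def hough_r(x, y, t_degrees):
    """Approximate r = x*cos(t) + y*sin(t) for integer x, y using
    shift-and-add CORDIC in rotation mode (rotating (x, y) by -t)."""
    xi = x << FRAC_BITS
    yi = y << FRAC_BITS
    z = round(math.radians(-t_degrees) * (1 << FRAC_BITS))
    for i in range(ITERATIONS):
        if z >= 0:
            xi, yi, z = xi - (yi >> i), yi + (xi >> i), z - ATAN_TABLE[i]
        else:
            xi, yi, z = xi + (yi >> i), yi - (xi >> i), z + ATAN_TABLE[i]
    # One multiply at the end removes the constant CORDIC gain.
    return ((xi * INV_GAIN) >> FRAC_BITS) / (1 << FRAC_BITS)
```

The loop body needs only shifts, adds and a table lookup, which is why it can be attractive on a processor without a fast multiplier. The angle range of -90 to 90 degrees lies inside the convergence range of rotation-mode CORDIC (roughly +/-99 degrees).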
In comp.arch.fpga,
Antti.Lukats@googlemail.com <Antti.Lukats@googlemail.com> wrote:
>
> and.. i do not like any wires direct to FPGA or MCU either (going off
> board/cable),
> have seen a Atmel to bulk erase itself because the reset line was
> in 2 meter long cable parallel to wire carrying 12V (reed relay
> switched)
> (well Atmel claimed such bulk erase is impossible...
> but it happened twice and second time i had another
> guy to witness it, so i wasnt seeing ghosts)

A little off-topic perhaps, but now I'm curious. What Atmel chip did you experience this erase with? We have just experienced a few spontaneous erases (over a couple of months) of the first sector of an Atmel DataFlash on one of our boards. Another board in the same system uses a DataFlash for MCU code and one for FPGA configuration, and we have not had trouble with those.

--
Stef (remove caps, dashes and .invalid from e-mail address to reply by mail)

New York's got the ways and means; Just won't let you be.
-- The Grateful Dead
Article: 138859
On Mar 12, 10:20 pm, Stef <stef...@yahooI-N-V-A-L-I-D.com.invalid> wrote:
> In comp.arch.fpga,
> Antti.Luk...@googlemail.com <Antti.Luk...@googlemail.com> wrote:
>
> > and.. i do not like any wires direct to FPGA or MCU either (going off
> > board/cable),
> > have seen a Atmel to bulk erase itself because the reset line was
> > in 2 meter long cable parallel to wire carrying 12V (reed relay
> > switched)
> > (well Atmel claimed such bulk erase is impossible...
> > but it happened twice and second time i had another
> > guy to witness it, so i wasnt seeing ghosts)
>
> A little off-topic perhaps, but now i'm curious. What atmel chip did
> you experience this erase with? We have just experienced a few
> spontanious erases (over a couple of months) of the first sector of an
> atmel dataflash on one of our boards. Another board in the same system
> uses a dataflash for MCU code and one for FPGA configuration and we have
> not had trouble with those.
>
> --
> Stef (remove caps, dashes and .invalid from e-mail address to reply by mail)
>
> New York's got the ways and means;
> Just won't let you be.
> -- The Grateful Dead

It was the VERY first samples of ATmega32.

Those samples had "m32" written in pencil on top.

No, I am lying: it was ATmega163, and we tried to program those m32.

The erasure did erase ALL memory, application sector and boot sector at the same time, and there is no IAP command to do that.. so it should never happen.

Antti
Article: 138860
In comp.arch.fpga, Antti.Lukats@googlemail.com <Antti.Lukats@googlemail.com> wrote: > On Mar 12, 10:20 pm, Stef <stef...@yahooI-N-V-A-L-I-D.com.invalid> > wrote: >> In comp.arch.fpga, >> >> Antti.Luk...@googlemail.com <Antti.Luk...@googlemail.com> wrote: >> >> > and.. i do not like any wires direct to FPGA or MCU either (going off >> > board/cable), >> > have seen a Atmel to bulk erase itself because the reset line was >> > in 2 meter long cable parallel to wire carrying 12V (reed relay >> > switched) >> > (well Atmel claimed such bulk erase is impossible... >> > but it happened twice and second time i had another >> > guy to witness it, so i wasnt seeing ghosts) >> >> A little off-topic perhaps, but now i'm curious. What atmel chip did >> you experience this erase with? We have just experienced a few >> spontanious erases (over a couple of months) of the first sector of an >> atmel dataflash on one of our boards. Another board in the same system >> uses a dataflash for MCU code and one for FPGA configuration and we have >> not had trouble with those. >> >> -- >> Stef (remove caps, dashes and .invalid from e-mail address to reply by mail) >> >> New York's got the ways and means; >> Just won't let you be. >> -- The Grateful Dead > > > it was the VERY first samples of ATmega32 > > those samples had m32 written with pencil on top > > no i am lying, it was atmega163 and we tried to program those m32 > > the erasure did erase ALL memory, application sector and boot sector > at same time, and there is no IAP command todo that.. so it should > never happen OK, thanks for the info. No obvious relation to our problem, we keep searching. -- Stef (remove caps, dashes and .invalid from e-mail address to reply by mail) A programming language is low level when its programs require attention to the irrelevant.Article: 138861
On Mar 13, 12:18 am, Walter Banks <wal...@bytecraft.com> wrote:
> -jg wrote:
> > On Mar 13, 8:12 am, Walter Banks <wal...@bytecraft.com> wrote:
> > > Jim
> > >
> > > You beat me to the comment on serial processors.
> > >
> > > A long time ago I designed a number of bit serial processors
> > > they can be very hardware efficient. There are a number of
> > > very clever math algorithms that take advantage of bit serial.
> >
> > The obvious next question, is do you have compilers
> > & maths libraries for such an animal ? ;)
>
> We did do a COP8 compiler where some of the features was
> a software / hardware solution.
>
> There were a lot of papers on this stuff at one point. Most I
> would expect if there are on the net will be available in image
> only.
>
> The COP8 is one of the well known bit serial processors. Many
> of the early 8051's were bit serial. Go back far enough and there is
> the PDP8S the S was jokingly referred to as slow. The IBM 1620
> was a serial nybble processor. The 1620 had one of the known
> serial processor advantages that of variable length numbers.
>
> All of which could be implemented with a SD card and some logic.
>
> Regards,
>
> --
> Walter Banks
> Byte Craft Limited
> http://www.bytecraft.com

For execute in place from SD, a 4 bit CPU would possibly be best, as SD fetches 4 bits per clock.

Eh, there is a 4 bit 8051!! Atom

I wonder if that would be very small in an FPGA or not. I didn't dig deep enough to see how much of the 8051 is there and what is made 4 bits wide.

Antti
Article: 138862
On Mar 12, 6:31 am, -jg <Jim.Granvi...@gmail.com> wrote:
> On Mar 12, 8:57 pm, rickman <gnu...@gmail.com> wrote:
>
> > I seem to recall that you were trying to find a bit serial CPU that
> > would be the smallest possible in an FPGA. Did you ever find one you
> > liked? Personally, I think that is a goal with a very low target
> > application size. But certainly there are some apps where this could
> > be useful.
>
> Bit-serial (like Cop8) only makes size-sense to simplify bus routing.
> - but that's almost free in a FPGA, so other needs better drive
> Bit-Serial.

I don't agree with that. An ALU can be serial, and a register file can be a single bit wide with random access to any bit of any register. A single LUT4 can hold two 8 bit registers, so a CPU with four 8 bit registers can hold them in two LUT4s. The main memory likewise can be implemented with a single bit data path. So the data path of a small CPU can be as little as half a dozen LUTs. Of course there is a trade-off in complexity of the control logic, but still the CPU can be made very small compared to even a PicoBlaze. With a little innovation, the data size can be arbitrarily wide as well as independent of the data path.

A stack machine can use a block RAM as both the data and return stacks with a bit of address work. I am doing that with a 16 bit wide machine, and I expect the same architecture can be used with a 1 bit data path to greatly reduce the amount of resources used.

> Bit-serial Multiply/divide can save resource, but that's less a core
> than an algorithm trade-off, and Mul/Div are rare in the smallest
> cores anyway.

You can call any of the data path sizings "algorithm" trade-offs, but they can still be very efficient in resource usage.

> One plus that appeals to me, is Execute from Serial FLASH, (and now
> Serial RAM)
> [does a nibble fetch from 4 bit SPI still count as Bit-serial ? ]
> as resource space. Saves MANY pins, and PCB space, but I'm not sure
> the core will be _smaller_ as a result - more likely slightly larger ?

The control logic is likely larger. If you can't figure out how to make the data path structure smaller by reducing the data path width, you need to go back to school!

Rick
Article: 138863
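To make the serial ALU argument concrete, here is a small behavioural model of a bit-serial adder. It is only a sketch, not anyone's actual core: the name `bit_serial_add` is invented for illustration, and it models one full adder plus one carry flip-flop that processes one bit per clock, LSB first. The operands would come from a 1-bit-wide register file (a 16x1 LUT RAM holds 16 bits, i.e. two 8 bit registers).

```python
def bit_serial_add(a, b, width=8):
    """Add two width-bit values one bit per 'clock', LSB first.

    Models a single full adder and a single carry flip-flop, which is
    all the arithmetic hardware a bit-serial data path needs for ADD.
    Returns (sum modulo 2**width, carry out).
    """
    carry = 0    # the one carry flip-flop
    result = 0
    for bit in range(width):            # one loop pass = one clock cycle
        a_bit = (a >> bit) & 1          # bit read from register A
        b_bit = (b >> bit) & 1          # bit read from register B
        sum_bit = a_bit ^ b_bit ^ carry
        carry = (a_bit & b_bit) | (carry & (a_bit ^ b_bit))
        result |= sum_bit << bit        # bit written back to the register
    return result, carry
```

The data path cost stays the same for any `width`; only the number of clocks and the bit counter in the control logic grow, which is the trade-off being argued about above.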
On Mar 12, 2:56 pm, "Antti.Luk...@googlemail.com" <Antti.Luk...@googlemail.com> wrote:
> On Mar 12, 12:31 pm, -jg <Jim.Granvi...@gmail.com> wrote:
>
> > On Mar 12, 8:57 pm, rickman <gnu...@gmail.com> wrote:
> >
> > > I seem to recall that you were trying to find a bit serial CPU that
> > > would be the smallest possible in an FPGA. Did you ever find one you
> > > liked? Personally, I think that is a goal with a very low target
> > > application size. But certainly there are some apps where this could
> > > be useful.
> >
> > Bit-serial (like Cop8) only makes size-sense to simplify bus routing.
> > - but that's almost free in a FPGA, so other needs better drive
> > Bit-Serial.
> >
> > Bit-serial Multiply/divide can save resource, but that's less a core
> > than an algorithm trade-off, and Mul/Div are rare in the smallest
> > cores anyway.
> >
> > One plus that appeals to me, is Execute from Serial FLASH, (and now
> > Serial RAM)
> > [does a nibble fetch from 4 bit SPI still count as Bit-serial ? ]
> > as resource space. Saves MANY pins, and PCB space, but I'm not sure
> > the core will be _smaller_ as a result - more likely slightly larger ?
> >
> > -jg
>
> Jim,
>
> this almost the old discussion ;)
>
> yes, a bit serial processor that executes in place either from spi
> flash or SD card
> and uses say the dual 512 data buffers of atmel dataflash as ram could
> be very low resources
>
> say for lowest cost Xilinx FPGA S3A-50 resources are pretty tight, so
> if a soft core can execute from the same spi flash that is used for
> config
> using the flash as code memory and flash buffers as ram, it would
> retain almost all the rest of the FPGA resources for user application
>
> should be doable in <100 Xilinx s3a slices i think
>
> Antti

If you really mean 100 ***slices*** then you are not beating the parallel processors. The pico blaze and the Micro8 are both about 200 LUTs, IIRC. I don't measure slices because only Xilinx and Lattice have slices. Most FPGAs have LUT4s (except for Actel and Atmel, but nobody uses Atmel and not many use Actel).

Rick
Article: 138864
On Mar 12, 1:30 pm, Jacko <jackokr...@gmail.com> wrote:
> hi
>
> 292 LEs fully stripped, no ROM, no RAM no IO pins, 16 bit address, 16
> bit data Bus. Expected 20-30MHz, (36 pins plus power), About 10 MIPS
> at 20MHz.
>
> cheers jacko

What's a MIPS? Native instructions? Or something that can be compared to other processors?

I have yet to find a good way to compare these small FPGA CPUs. The ZPU is pretty small, but the originator seems to still think in terms of Dhrystones. I can't begin to measure my processor in Dhrystones. The original implementation was about 600 LUTs, 50 MHz, 50 MIPS in an Altera ACEX 1K part (very old and pretty slow). I am working to update it for a more current FPGA.

Rick
Article: 138865
On Mar 12, 1:44 pm, "Antti.Luk...@googlemail.com" <Antti.Luk...@googlemail.com> wrote:
> On Mar 12, 7:30 pm, Jacko <jackokr...@gmail.com> wrote:
>
> > hi
> >
> > 292 LEs fully stripped, no ROM, no RAM no IO pins, 16 bit address, 16
> > bit data Bus. Expected 20-30MHz, (36 pins plus power), About 10 MIPS
> > at 20MHz.
> >
> > cheers jacko
>
> without any io/ram/rom its kinda useless?
> and such small soft cores can usually run 200mhz+ in decent FPGA's ;)
> (150mhz in low cost FPGA's)
>
> ok, 292 or <570LE, it's ir-relevant as long as there are no tools to
> program it,
> and your forth-xxx whatever isnt useable yet?
>
> there are zillions of stack soft-cpu's but i fail to see nice and easy
> tools
> todo anything with them.. no compilers
> some forth xxx things that requires some xxx to be installed on your
> PC
> and then do something very awkward and some more to get some code
> actually executing...
>
> and yes I have programmed in Forth many decades ago, think used
> something called GraphForth for msdos
>
> ==
>
> compile_my_forth_to_bin.exe hello.forth > hello.bin
>
> if that creates a ready to use bin file to run with your nibz
> and you have tons of tested libraries.. someone may get interested..
>
> if you have something totally untested, not ready no demos
> no reference design ?
>
> ==
>
> ok ZPU (a stack machine also) has GCC toolchain kind of, but it isnt
> that small anymore the core despite being advertized as smallest 32
> bit core with GCC support
>
> Antti

The author claims one incarnation is around 400 LUTs. I have not seen any of the four versions actually in a form that can be compiled to run a program of your choice without some work. I did a block diagram of the ZPU small and estimated around 600 LUTs. They are all saying it is a bit slow with a clock speed of under 50 MHz, sometimes very far below 50 MHz, IIRC and under 10, maybe under 1 DMIPS, I can't remember exactly.

The effort is poorly organized and I found it hard to contribute anything useful other than my block diagram drawing which I'm not sure anyone cared about. Most of the participants are hard core software guys who don't seem to understand how to optimize an FPGA CPU for resources, speed and code density. Code density is a primary consideration which is one I share. But the instruction set is designed for "efficient" C coding which means the author doesn't want to put too much effort into the compiler to produce instructions that are easier to put in the FPGA. He optimized the compiler back end and now that tail is wagging the ZPU dog.

Still, it is a very interesting effort and I am watching the mailing list and occasionally make a post.

Rick
Article: 138866
On Mar 12, 11:38 pm, rickman <gnu...@gmail.com> wrote:
> On Mar 12, 2:56 pm, "Antti.Luk...@googlemail.com"
> <Antti.Luk...@googlemail.com> wrote:
> > On Mar 12, 12:31 pm, -jg <Jim.Granvi...@gmail.com> wrote:
> >
> > > On Mar 12, 8:57 pm, rickman <gnu...@gmail.com> wrote:
> > >
> > > > I seem to recall that you were trying to find a bit serial CPU that
> > > > would be the smallest possible in an FPGA. Did you ever find one you
> > > > liked? Personally, I think that is a goal with a very low target
> > > > application size. But certainly there are some apps where this could
> > > > be useful.
> > >
> > > Bit-serial (like Cop8) only makes size-sense to simplify bus routing.
> > > - but that's almost free in a FPGA, so other needs better drive
> > > Bit-Serial.
> > >
> > > Bit-serial Multiply/divide can save resource, but that's less a core
> > > than an algorithm trade-off, and Mul/Div are rare in the smallest
> > > cores anyway.
> > >
> > > One plus that appeals to me, is Execute from Serial FLASH, (and now
> > > Serial RAM)
> > > [does a nibble fetch from 4 bit SPI still count as Bit-serial ? ]
> > > as resource space. Saves MANY pins, and PCB space, but I'm not sure
> > > the core will be _smaller_ as a result - more likely slightly larger ?
> > >
> > > -jg
> >
> > Jim,
> >
> > this almost the old discussion ;)
> >
> > yes, a bit serial processor that executes in place either from spi
> > flash or SD card
> > and uses say the dual 512 data buffers of atmel dataflash as ram could
> > be very low resources
> >
> > say for lowest cost Xilinx FPGA S3A-50 resources are pretty tight, so
> > if a soft core can execute from the same spi flash that is used for
> > config
> > using the flash as code memory and flash buffers as ram, it would
> > retain almost all the rest of the FPGA resources for user application
> >
> > should be doable in <100 Xilinx s3a slices i think
> >
> > Antti
>
> If you really mean 100 ***slices*** then you are not beating the
> parallel processors. The pico blaze and the Micro8 are both about 200
> LUTs, IIRC. I don't measure slices because only Xilinx and Lattice
> have slices. Most FPGAs have LUT4s (except for Actel and Atmel, but
> nobody uses Atmel and not many use Actel).
>
> Rick

Hm, I did mean <200 LUTs, and yes, it is about the size of PicoBlaze, I am aware of that. But if it works with a LARGE memory space (>=8GB) and does not use FPGA RAM (but the SPI flash RAM buffers), then I would say that if it fits in <200 LUTs it would be very nice already. I did assume the LUT number for some minimal but fully functional system, not the bare CPU only.

Antti

PS I do have the Atmel FPSLIC board+dongle and some silicon samples, also Actel ProASIC/ProASIC3/Fusion boards and programmers, but you are right, I cannot say I have used the Atmel.. it's just too damn expensive compared to the features it has.
Article: 138867
> >The author claims one incarnation is around 400 LUTs. I have not seen >any of the four versions actually in a form that can be compiled to >run a program of your choice without some work. I did a block diagram >of the ZPU small and estimated around 600 LUTs. They are all saying >it is a bit slow with a clock speed of under 50 MHz, sometimes very >far below 50 MHz, IIRC and under 10, maybe under 1 DMIPS, I can't >remember exactly. The effort is poorly organized and I found it hard >to contribute anything useful other than my block diagram drawing >which I'm not sure anyone cared about. Most of the participants are >hard core software guys who don't seem to understand how to optimize >an FPGA CPU for resources, speed and code density. Code density is a >primary consideration which is one I share. But the instruction set >is designed for "efficient" C coding which means the author doesn't >want to put too much effort into the compiler to produce instructions >that are easier to put in the FPGA. He optimized the compiler back >end and now that tail is wagging the ZPU dog. > >Still, it is a very interesting effort and I am watching the mailing >list and occasionally make a post. If the clock speed gets that slow, I'd consider making a simple clean CPU and emulating the ZPU instruction set. It would need ROM for the microcode and somebody would have to write the microcode. -- These are my opinions, not necessarily my employer's. I hate spam.Article: 138868
Perhaps this can help someone. I've spent a day trying to understand why suddenly I couldn't build an archived ISE/EDK 8.2 project anymore. The ISE would just stop for no reason, producing no reports. I still don't know why this happened, but at least I found how to work around the problem.

I found in the Xilinx knowledge base that there is an environment variable XIL_PROJNAV_FLOW_DEBUG_LEVEL one can set to 99 to enable debug prints. I did it and it actually pointed me to a problem with attaching two IP cores to the design. There is still no explanation of why it worked before, but as I said at least I was able to work around the issue.

/Mikhail
Article: 138869
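As a small illustration of the XIL_PROJNAV_FLOW_DEBUG_LEVEL tip, the variable has to be present in the environment of the process that starts ISE. The sketch below only builds such an environment; the helper name `ise_debug_env` is hypothetical, and the launcher command shown in the comment is an assumption that depends on the local installation.

```python
import os


def ise_debug_env(level=99, base_env=None):
    """Return a copy of the environment with the ISE Project Navigator
    flow debug level set (99 enables the debug prints described above)."""
    env = dict(os.environ if base_env is None else base_env)
    env["XIL_PROJNAV_FLOW_DEBUG_LEVEL"] = str(level)
    return env


# Hypothetical usage; the executable name "ise" is an assumption:
#   import subprocess
#   subprocess.run(["ise"], env=ise_debug_env())
```

Setting the variable in the shell or in the system environment settings before starting the tools has the same effect.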
-jg wrote:
> On Mar 13, 8:12 am, Walter Banks <wal...@bytecraft.com> wrote:
> > Jim
> >
> > You beat me to the comment on serial processors.
> >
> > A long time ago I designed a number of bit serial processors
> > they can be very hardware efficient. There are a number of
> > very clever math algorithms that take advantage of bit serial.
>
> The obvious next question, is do you have compilers
> & maths libraries for such an animal ? ;)

We did do a COP8 compiler where some of the features was a software / hardware solution.

There were a lot of papers on this stuff at one point. Most I would expect if there are on the net will be available in image only.

The COP8 is one of the well known bit serial processors. Many of the early 8051's were bit serial. Go back far enough and there is the PDP8S the S was jokingly referred to as slow. The IBM 1620 was a serial nybble processor. The 1620 had one of the known serial processor advantages that of variable length numbers.

All of which could be implemented with a SD card and some logic.

Regards,

--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com
Article: 138870
>was just thinking of extending SPI like comms using cheapest
>and ready made cabling
>
>so one pair in each direction only. was hoping to get 50mbit/s?

How far do you want to go?

Ethernet gets a gigabit in each direction over 4 pairs. That takes a lot of DSP magic and 5 level signaling. Part of the complication is to reduce EMI.

Ethernet uses transformers rather than capacitors. I'm not sure why.

You should also consider USB.

If you are using caps (or transformers) you have to do something to make sure there are no long strings of 0s or 1s. Manchester encoding does that at the cost of 2x in bandwidth (which may not be a problem for short distances). 4b/5b and 8b/10b type encoders get back most of the bandwidth at the cost of some complexity at each end.

Manchester is trivial to encode and pretty easy to decode with a small state machine if you have an 8x clock at the receiver.

--
These are my opinions, not necessarily my employer's. I hate spam.
Article: 138871
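The Manchester point can be illustrated with a short sketch. The function names are invented for this example, and the polarity chosen here (1 sent as low then high, 0 as high then low) is just one of the two conventions in use. The decoder assumes it is already aligned to the half-bit boundaries; a real receiver would first recover that alignment, for example with the 8x-clocked state machine mentioned above.

```python
def manchester_encode(bits):
    """Encode each data bit as two half-bit line levels.

    Convention assumed here: 1 -> (0, 1), 0 -> (1, 0). Every bit has a
    mid-bit transition, so the line never stays at one level for more
    than two half-bit periods.
    """
    line = []
    for b in bits:
        line.extend((0, 1) if b else (1, 0))
    return line


def manchester_decode(line):
    """Decode aligned half-bit samples back into data bits."""
    if len(line) % 2:
        raise ValueError("need an even number of half-bit samples")
    bits = []
    for first, second in zip(line[0::2], line[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if (first, second) == (0, 1) else 0)
    return bits
```

The doubling of the symbol rate is visible directly: `manchester_encode` returns twice as many line levels as data bits, which is the 2x bandwidth cost.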
On Mar 13, 10:32 am, "Antti.Luk...@googlemail.com" wrote:
> for execute in place for SD possible 4 bit cpu would be best
> as sd fetches 4 bit per clock
>
> eh, there is 4 bit 8051 !!
> Atom
>
> i wonder if that would be very small in FPGA or not
> didnt deep enough to see how much of 8051 is there
> and what is made 4 bit wide

Atom (4 bit '80C51')
Data is here
http://www.coreriver.co.kr/data/manual/BM-ATOM1.1-V1.0.pdf

Atom moves to 4 bits, drops all registers,
Opcodes are 1 byte, 2 for calls
Direct memory index of 4 bits is supported. (no offset index?)
An 8 bit memory-index pointer exists
Calls are 12 bits
Some Boolean opcodes remain
Does have PUSH A, POP A, and an 8 bit Stack Pointer
No MUL/DIV, and no RETI

So not that well suited to FPGA morph, & too large for CPLDs - FPGAs better suit 18 bit opcodes, and the dual-port RAM means register-cores map better.

Width is almost free inside a FPGA, & a 36 bit fetch (9 clocks) from nibble SPI would fit two 18b opcodes - and move you into SoftCPU space.

-jg
Article: 138872
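The 36 bit fetch arithmetic (9 clocks x 4 bits = 36 bits = two 18 bit opcodes) can be sketched as follows. The function name and the nibble order (most significant nibble first, first opcode in the upper half of the word) are assumptions made for illustration only.

```python
def fetch_two_opcodes(nibbles):
    """Assemble nine 4-bit values, as clocked in from a 4-bit-wide
    serial flash, into one 36-bit word and split it into two 18-bit
    opcodes. Assumes the most significant nibble arrives first."""
    if len(nibbles) != 9:
        raise ValueError("a 36-bit fetch needs exactly 9 nibbles")
    word = 0
    for n in nibbles:                  # one nibble per clock
        if not 0 <= n <= 0xF:
            raise ValueError("each nibble must be in 0..15")
        word = (word << 4) | n
    first = (word >> 18) & 0x3FFFF     # upper 18 bits
    second = word & 0x3FFFF            # lower 18 bits
    return first, second
```

Note that the opcode boundary falls in the middle of the fifth nibble, so the split only becomes possible after the whole 36-bit word (or at least five nibbles) has been shifted in.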
On 12 Mar., 13:32, SUMAN <suman...@gmail.com> wrote:
> Now we are
> detecting lines form the binary image fed to microblaze processor
> We have to perform iterations to determine the value of r from the
> following equation
> r = x*cos(t) + y*sin(t) for each (x,y) from t = -90 degree to 90
> degree

I do not understand that specification. Are x, y and t independent, so that you are iterating over a three dimensional parameter space? Do you loop over all x and y, or do you have a sparse set of x, y points and only iterate over all angles for each of these?

It is strange that you call the results "r", as the equation corresponds to the y coordinate of a rotation. And this seems to me the clue to solving this problem efficiently: if you compute both the X and Y result of the rotation in each step, you can obtain all your results by incremental rotations of 1 degree at a cost of 4 multiplications per result. You should be able to do somewhere around 60M iterations in a Virtex-4, more if you do C-slow retiming (e.g. pipelining the iteration and working on multiple X,Y pairs alternatingly).

You only need the constants sin(1°) and cos(1°).

Y' = X*sin(1°) + Y*cos(1°)
X' = X*cos(1°) - Y*sin(1°)

To match your alignment above and to start with -90° you need to swap and/or negate X and Y accordingly.

Have fun,

Kolja Sulimma
Article: 138873
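The incremental rotation idea can be sketched in software as below. This is an illustration in floating point, not a hardware design; the function name `hough_r_sweep` and the variable names are made up. It uses the standard rotation form X' = X*cos(1°) - Y*sin(1°), Y' = X*sin(1°) + Y*cos(1°), with Y holding r. At t = -90° the start values need no trigonometry at all: r = -y, and the companion value is x, which is the swap-and-negate step.

```python
import math

COS_1DEG = math.cos(math.radians(1.0))   # the only two constants needed
SIN_1DEG = math.sin(math.radians(1.0))


def hough_r_sweep(x, y):
    """Return [r(-90), r(-89), ..., r(90)] with r(t) = x*cos(t) + y*sin(t),
    using one 1-degree rotation (4 multiplies) per result."""
    rot_x = float(x)     # equals -x*sin(t) + y*cos(t) at t = -90 degrees
    rot_y = -float(y)    # equals  x*cos(t) + y*sin(t) at t = -90 degrees
    results = []
    for _ in range(181):
        results.append(rot_y)
        rot_x, rot_y = (rot_x * COS_1DEG - rot_y * SIN_1DEG,
                        rot_x * SIN_1DEG + rot_y * COS_1DEG)
    return results
```

In fixed-point hardware the rounding error accumulates over the 180 steps, so the word length of the constants and of the two state registers would need some margin; that error budget is not analysed here.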
On Mar 12, 6:08 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal Murray) wrote: > >The author claims one incarnation is around 400 LUTs. I have not seen > >any of the four versions actually in a form that can be compiled to > >run a program of your choice without some work. I did a block diagram > >of the ZPU small and estimated around 600 LUTs. They are all saying > >it is a bit slow with a clock speed of under 50 MHz, sometimes very > >far below 50 MHz, IIRC and under 10, maybe under 1 DMIPS, I can't > >remember exactly. The effort is poorly organized and I found it hard > >to contribute anything useful other than my block diagram drawing > >which I'm not sure anyone cared about. Most of the participants are > >hard core software guys who don't seem to understand how to optimize > >an FPGA CPU for resources, speed and code density. Code density is a > >primary consideration which is one I share. But the instruction set > >is designed for "efficient" C coding which means the author doesn't > >want to put too much effort into the compiler to produce instructions > >that are easier to put in the FPGA. He optimized the compiler back > >end and now that tail is wagging the ZPU dog. > > >Still, it is a very interesting effort and I am watching the mailing > >list and occasionally make a post. > > If the clock speed gets that slow, I'd consider making a > simple clean CPU and emulating the ZPU instruction set. It > would need ROM for the microcode and somebody would have to > write the microcode. I don't think the clock speed is all that slow. It just doesn't do a lot in each clock cycle I believe. It is stack based, but the stack is in memory, not inside the CPU. I don't know all the details. I looked at the code a bit, but that is a poor way to learn the architecture... or at least a painful way to learn it. I did draw a diagram of the data paths. 
It is surprisingly straightforward, but each part of the CPU process is a separate clock cycle, Fetch, Decode, and multiple Execute steps. As a hardware designer it is not anything like what I would have designed, but they did keep it fairly small at 600 LUTs ballpark. The idea is to have other implementations that run identical code but much faster. Like I said, it is interesting and I'll keep watching it. RickArticle: 138874
On 12 Mar, 21:44, rickman <gnu...@gmail.com> wrote:
> On Mar 12, 1:30 pm, Jacko <jackokr...@gmail.com> wrote:
>
> > hi
> >
> > 292 LEs fully stripped, no ROM, no RAM no IO pins, 16 bit address, 16
> > bit data Bus. Expected 20-30MHz, (36 pins plus power), About 10 MIPS
> > at 20MHz.
> >
> > cheers jacko
>
> What's a MIPS? Native instructions? Or something that can be
> compared to other processors?

Native instructions. Fetch/execute, and Fetch/execute/execute (SUm instruction).

> I have yet found a good way to compare these small, FPGA CPUs. The
> ZPU is pretty small, but the originator seems to still think in terms
> of Dhrystones. I can't begin to measure my processor in Dhrystones.

I do not yet have any C compiler, so Dhrystones are not measurable.

> The original imnplementation was about 600 LUTs, 50 MHz, 50 MIPS in an
> Altera ACEX 1K part (very old and pretty slow). I am working to
> update it for a more current FPGA.

A re-implementation of the ALU is possible to improve the clock rate, but the area does increase, as it is a 16 bit ALU with 4 operations. So 4 ops * 4-way multiplex. The reason for using a carry chain is that the ALU shrinks, as some of the multiplex is merged with the ALU operations (not much more complicated than just an add in LUTs). I understand from Altera support that the two-LUT3 arithmetic mode is not really used, and so fast carry propagation is not done, and it is the critical path, hence the extra cycle inserted.

A Harvard architecture is not used. Code density is reasonably high using threaded subroutines, as no jump opcode has to prefix the jump address. In full ASIC custom logic such carry propagation issues are not as dominant. So considering each instruction is 16 bits of data in width, the processor is quite impressively small. If only the carry chain was used effectively in LUT4 mode. Of course all optimization was for area, and not speed, with no retiming. Just the register duplication (2 LEs) for increased routability.

Yes, the lack of high level software tools is a real pain for product design. But I'm slowly working on that. I do not have a major development team. I am just me, and this is not paid work. I think my offer of 1 free core per chip, for just a logo print, documentation copyright recognition and a URL, is very good. Especially as this offer, unlike the still standing BSD offer, allows derived products without revealing the derived source.

Cheers
jacko