Please, can anyone give me an idea of how I can generate prime numbers in Verilog without using a for loop? My approach so far is that a prime number is not divisible by any of the primes before it.

Article: 157601
On Fri, 26 Dec 2014 08:35:25 -0800, mawais2011 wrote:

> Please, can anyone give me an idea of how I can generate prime numbers
> in Verilog without using a for loop? My approach so far is that a prime
> number is not divisible by any of the primes before it.

To what end? Are you looking to synthesize something, or do you need prime numbers for a test bench?

If you want to end up with hardware that coughs up prime numbers -- I think you'll either need to settle for a look-up table with stored numbers, or some sort of state machine that does a sieve or whatever it is that people do these days that's more efficient than a sieve.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Article: 157602
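A minimal sketch of the look-up-table option mentioned above: the first sixteen primes stored in a small ROM, one read per clock. The module and signal names are illustrative assumptions, not anything from the thread.

// Look-up table of stored primes, registered output.
module prime_rom (
    input  wire       clk,
    input  wire [3:0] addr,    // which prime to fetch (0 -> 2, 1 -> 3, ...)
    output reg  [5:0] prime
);
    always @(posedge clk) begin
        case (addr)
            4'd0:  prime <= 6'd2;
            4'd1:  prime <= 6'd3;
            4'd2:  prime <= 6'd5;
            4'd3:  prime <= 6'd7;
            4'd4:  prime <= 6'd11;
            4'd5:  prime <= 6'd13;
            4'd6:  prime <= 6'd17;
            4'd7:  prime <= 6'd19;
            4'd8:  prime <= 6'd23;
            4'd9:  prime <= 6'd29;
            4'd10: prime <= 6'd31;
            4'd11: prime <= 6'd37;
            4'd12: prime <= 6'd41;
            4'd13: prime <= 6'd43;
            4'd14: prime <= 6'd47;
            4'd15: prime <= 6'd53;
        endcase
    end
endmodule

A bigger table is just a wider address and a deeper case statement (or an initialised block RAM); the trade-off against the state-machine approaches discussed below is purely storage versus computation time.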
On Fri, 26 Dec 2014 08:35:25 -0800, mawais2011 wrote:

> Please, can anyone give me an idea of how I can generate prime numbers
> in Verilog without using a for loop? My approach so far is that a prime
> number is not divisible by any of the primes before it.

Assuming you want this to be synthesisable, may I suggest the Sieve of Eratosthenes <http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes>.

In common with other sieve algorithms, it requires that you know the upper limit in advance and have enough memory to hold a flag for each number up to that upper limit. As such, it is not suited to a design that has to produce a continuous stream of prime numbers without bound.

Allan

Article: 157603
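A rough sketch of what a synthesisable sieve can look like: a state machine that loops in time over a flag array, crossing off multiples, and afterwards answers "is N prime?" queries. LIMIT, the module name and all signal names are illustrative assumptions. The flag array is written here as registers with a global clear for brevity; a block-RAM version would instead step through addresses to clear it.

// Sieve of Eratosthenes as an FSM: one flag per number, multiples crossed
// off one per clock.  After 'done', any unmarked number > 1 is prime.
module sieve #(
    parameter LIMIT = 255,          // largest number covered by the sieve
    parameter W     = 8             // enough bits to hold LIMIT
) (
    input  wire         clk,
    input  wire         rst,
    input  wire [W-1:0] query,      // valid once 'done' is high
    output wire         is_prime,
    output reg          done
);
    reg         composite [0:LIMIT];    // flag array
    reg [W-1:0] p;                      // prime currently being sieved
    reg [W:0]   m;                      // multiple being crossed off
    reg [1:0]   state;
    localparam FIND = 2'd0, MARK = 2'd1, IDLE = 2'd2;
    integer i;

    assign is_prime = done && (query > 1) && !composite[query];

    always @(posedge clk) begin
        if (rst) begin
            for (i = 0; i <= LIMIT; i = i + 1)   // this for loop only unrolls
                composite[i] <= 1'b0;            // into reset logic, not a datapath
            p     <= 2;
            state <= FIND;
            done  <= 1'b0;
        end else begin
            case (state)
                FIND: begin                      // next un-crossed number >= p
                    if (p * p > LIMIT) begin
                        done  <= 1'b1;           // everything still unmarked is prime
                        state <= IDLE;
                    end else if (composite[p])
                        p <= p + 1'b1;
                    else begin
                        m     <= p * p;          // start crossing off at p^2
                        state <= MARK;
                    end
                end
                MARK: begin                      // cross off p^2, p^2+p, p^2+2p, ...
                    if (m > LIMIT) begin
                        p     <= p + 1'b1;
                        state <= FIND;
                    end else begin
                        composite[m[W-1:0]] <= 1'b1;
                        m <= m + p;
                    end
                end
                IDLE: ;                          // sieve finished; serve queries
            endcase
        end
    end
endmodule

As Allan says, the memory cost is a flag per number up to the limit, which is why this suits a bounded problem but not an open-ended stream of primes.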
On Sat, 27 Dec 2014 04:59:51 +0000, Allan Herriman wrote:

> On Fri, 26 Dec 2014 08:35:25 -0800, mawais2011 wrote:
>
>> Please, can anyone give me an idea of how I can generate prime numbers
>> in Verilog without using a for loop? My approach so far is that a prime
>> number is not divisible by any of the primes before it.
>
> Assuming you want this to be synthesisable, may I suggest the Sieve of
> Eratosthenes <http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes>.
>
> In common with other sieve algorithms, it requires that you know the
> upper limit in advance and have enough memory to hold a flag for each
> number up to that upper limit. As such, it is not suited to a design
> that has to produce a continuous stream of prime numbers without bound.
>
> Allan

Well, a flag for each number up to the square root of the upper limit, if you really want to be stingy.

Is there _any_ prime-finding algorithm that doesn't require you to store more and more history as you go, or to repeat calculations that you wouldn't otherwise have to repeat?

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Article: 157604
On Sat, 27 Dec 2014 19:00:48 -0600, Tim Wescott wrote:

> On Sat, 27 Dec 2014 04:59:51 +0000, Allan Herriman wrote:
>
>> On Fri, 26 Dec 2014 08:35:25 -0800, mawais2011 wrote:
>>
>>> Please, can anyone give me an idea of how I can generate prime numbers
>>> in Verilog without using a for loop? My approach so far is that a
>>> prime number is not divisible by any of the primes before it.
>>
>> Assuming you want this to be synthesisable, may I suggest the Sieve of
>> Eratosthenes <http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes>.
>>
>> In common with other sieve algorithms, it requires that you know the
>> upper limit in advance and have enough memory to hold a flag for each
>> number up to that upper limit. As such, it is not suited to a design
>> that has to produce a continuous stream of prime numbers without bound.
>
> Well, a flag for each number up to the square root of the upper limit,
> if you really want to be stingy.
>
> Is there _any_ prime-finding algorithm that doesn't require you to store
> more and more history as you go, or to repeat calculations that you
> wouldn't otherwise have to repeat?

Yes, they're called "incremental sieves". I've never tried one.

The crypto folk (well, those still using RSA) generate their big primes by generating random numbers then testing them. Generating big random numbers is quick, and the primality test is pretty quick too (particularly if one can trade off accuracy and let a pseudoprime through occasionally).

Regards,
Allan

Article: 157605
Tim Wescott <seemywebsite@myfooter.really> wrote:
> On Sat, 27 Dec 2014 04:59:51 +0000, Allan Herriman wrote:
>> On Fri, 26 Dec 2014 08:35:25 -0800, mawais2011 wrote:
>>> Please, can anyone give me an idea of how I can generate prime numbers
>>> in Verilog without using a for loop? My approach so far is that a
>>> prime number is not divisible by any of the primes before it.

Unlike the for loop in C or Java, the Verilog for loop increases the amount of hardware used to do the computation. Very often, that is not what you want.

>> Assuming you want this to be synthesisable, may I suggest the Sieve of
>> Eratosthenes <http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes>

(snip)

> Well, a flag for each number up to the square root of the upper limit,
> if you really want to be stingy.
>
> Is there _any_ prime-finding algorithm that doesn't require you to store
> more and more history as you go, or to repeat calculations that you
> wouldn't otherwise have to repeat?

I would guess not, but you can make it interesting: find a compromise between storing and calculating.

My first prime number program was on an HP 9810A calculator. It stored the values computed so far in addressable registers, and stopped when it ran out of them. (I didn't yet know about the square root rule.)

Not so much later, I started learning Fortran, and the sample program in the OS/360 Fortran manual prints a table of primes. That one first prints out 2 and 3, then loops over odd numbers, testing each by dividing by values up to sqrt(float(n)). The program was much smaller and simpler than what I had done on the 9810A.

Note that the Fortran program makes one optimization: after printing 2, it doesn't even try other even numbers. You could go one better, as only two of every six numbers aren't divisible by either 2 or 3: write a loop incrementing by six, with two trial loops inside. Note that this trades off program size and complexity for time. You could also extend it using 5, 7, and 11 as previously known primes, and reduce the computation accordingly.

In the Verilog case, you likely don't want hardware with size proportional to, or maybe even proportional to the square of, the largest prime value that you want to compute. Most often, that means a state machine. You want to loop in time, not space (logic), generating potential primes and testing them.

Seems to me that either the sieve, or looping in a state machine through potential primes with another loop in time testing them, isn't so complicated to generate, though not the easiest. You might do it with a state-machine generating tool.

It seems likely that the hardware has to be designed based on the largest possible value, as enough bits will have to be supplied to hold that value. But bits are pretty cheap these days. You could loop at 50 MHz, trying primes, maybe with an iterative divider. Then stop for 1 s when you find one, with its value on the built-in (to your FPGA board) display.

-- glen

Article: 157606
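The "loop in time, not space" idea glen describes might look something like the sketch below: one state machine steps through odd candidates and odd trial divisors, one test per clock. Module and signal names are illustrative assumptions, not from the thread; the '%' infers a large combinational divider, which in a real design would itself be the iterative-divider FSM glen mentions, taking many clocks per test.

// Finds the first prime at or above an odd-forced 'seed' by trial division.
// (Assumes seed >= 2; handling of 0/1 omitted for brevity.)
module next_prime #(
    parameter W = 16
) (
    input  wire         clk,
    input  wire         rst,
    input  wire         start,      // pulse: search upward from 'seed'
    input  wire [W-1:0] seed,
    output reg  [W-1:0] prime,
    output reg          done
);
    localparam IDLE = 2'd0, TEST = 2'd1, STEP = 2'd2;

    reg  [1:0]     state;
    reg  [W-1:0]   cand;            // odd candidate under test
    reg  [W-1:0]   divisor;         // current trial divisor
    wire [2*W-1:0] div_sq = divisor * divisor;

    always @(posedge clk) begin
        if (rst) begin
            state <= IDLE;
            done  <= 1'b0;
        end else begin
            case (state)
                IDLE: if (start) begin
                    cand    <= seed | 1'b1;     // force the candidate odd
                    divisor <= 3;
                    done    <= 1'b0;
                    state   <= TEST;
                end
                TEST: begin
                    if (div_sq > cand) begin    // no divisor up to sqrt(cand)
                        prime <= cand;
                        done  <= 1'b1;
                        state <= IDLE;
                    end else if (cand % divisor == 0) begin
                        state <= STEP;          // composite: try next candidate
                    end else begin
                        divisor <= divisor + 2'd2;  // odd divisors only
                    end
                end
                STEP: begin
                    cand    <= cand + 2'd2;     // next odd candidate
                    divisor <= 3;
                    state   <= TEST;
                end
            endcase
        end
    end
endmodule

The wheel trick glen describes (stepping by six and only trying two residues) would drop in here by changing how 'cand' advances.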
On 12/28/2014 03:24 AM, Allan Herriman wrote:
> The crypto folk (well, those still using RSA) generate their big primes
> by generating random numbers then testing them.
> Generating big random numbers is quick, and the primality test is pretty
> quick too (particularly if one can trade off accuracy and let a
> pseudoprime through occasionally).

That's indeed how the crypto folks do it. The Miller-Rabin test uses little memory, is fast and not too hard to implement.

https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test

As you said, the algorithm will indeed let a pseudoprime through, but you have control over how often that happens on average.

I just looked up the numbers in the book Cryptography Engineering by Bruce Schneier et al. He states that just 64 random Miller-Rabin tests are enough to lower the chance of a pseudoprime down to 2^-128. He also states that the chance of getting killed by a meteor while reading this sentence is far larger than that. :-)

/Nils

Article: 157607
The link has expired. Can you share it again? I am also doing research on BCH encoders and decoders.

Thank you.

Article: 157608
wrote in message
news:961f3a7b-97e1-42f2-81d3-1b6092248b54@googlegroups.com...

> The link has expired. Can you share it again? I am also doing research
> on BCH encoders and decoders.
>
> Thank you.

https://www.google.com/search?as_q=bch+encoder&as_epq=&as_oq=&as_eq=&as_nlo=&as_nhi=&lr=&cr=&as_qdr=all&as_sitesearch=&as_occt=any&safe=images&tbs=&as_filetype=&as_rights=&gws_rd=ssl

Article: 157609
On Friday, January 2, 2015 8:19:22 PM UTC-8, manoj...@gmail.com wrote:

> The link has expired. Can you share it again? I am also doing research
> on BCH encoders and decoders.
>
> Thank you.

Which link is expired? The only link in the post is to github, which is fine.

Article: 157610
On Saturday, November 29, 2014 11:27:47 AM UTC-6, Antti wrote:
> On Saturday, 29 November 2014 16:58:04 UTC+1, jim.bra...@ieee.org wrote:
>> On Wednesday, November 26, 2014 1:01:18 PM UTC-6, Theo Markettos wrote:
>>> Anyone know if there's a standard(ish) for simple mezzanine cards for
>>> FPGA boards?
>>>
>>> (snip)
>>
>> If one wants something beyond Arduino and Pmod connections, one could
>> use USB 3.0 extension cabling as the basis for hackable interconnect:
>> use the USB 2.0 wires for power, ground and some simple standard
>> interface (I2C, SPI, UART, CAN, etc). That leaves the two USB 3.0 pairs
>> for the serial interface of your own devising. With a microprocessor
>> and/or FPGA on each end it's your choice.
>
> USB connectors have been used and misused already by some projects.
>
> For the most hacking experience there is nothing but 100 mil headers.
>
> Antti
> http:/igg.me/at/zynq

Have nothing against 0.1" headers except their size and the inflexibility of where the GND and VCC pins are located. One could go to 2 mm headers, but they are less available and too small for my fingers.

However, it should be possible to make GND available for, say, any of the odd pins and VCC available for any of the even pins. This would allow maximum utilization of the 0.1"-spaced pins/holes, giving upwards of twice the density of Pmod connections. Can think of several ways to allow the GND and VCC connections; probably the cheapest and most compact is "solder bridges": small copper circles with an open area down the middle of the circle.

Article: 157611
On Wednesday, November 26, 2014 11:01:18 AM UTC-8, Theo Markettos wrote:

> Anyone know if there's a standard(ish) for simple mezzanine cards for
> FPGA boards?
>
> I know about things like FMC and HSMC which are very 'high end' - multi
> gigabit transceivers, expensive connectors. There's also Arduino, which
> is simple and low pin count, but everything is designed to talk to a
> dumb slow Atmega (which usually means putting another Atmega on the
> mezzanine card and talking via SPI). Or there's Raspberry Pi, but again
> it assumes you have slow I/O and things like Ethernet and USB already
> exist on the CPU board.
>
> Is there anything between the two? Something like an Arduino-scale
> system but with a $10 FPGA in mind rather than an 8 bit micro or a $1000
> FPGA. For instance, a 100M Ethernet PHY which is just the phy rather
> than a memory-mapped MAC, and so just presents an RMII or SMII
> interface. Or a USB2 ULPI PHY. Having a microcontroller on the board is
> OK (USB offload is a useful task), just drinking it through an SPI straw
> is not.
>
> I found:
> http://www.wvshare.com/column/Accessory_Boards.htm?1
> which seems to be cheap boards all over ebay that are rather
> Arduino-like while intended for FPGAs, but there doesn't seem to be much
> of a community around them (in other words, they might disappear
> tomorrow).
>
> Any other ideas?
>
> Thanks
> Theo

Check out the Bugblat devices that support some of the open hardware platforms based on Lattice FPGAs.

http://www.bugblat.com/products/pif/index.html
http://www.bugblat.com/products/fino/index.html

Article: 157612
// Testbench
rst=true;
wait(10, SC_MS);
rst=false;
in1=8; in2=2; in3=3; in4=6;
sel=2;
wait(50, SC_MS);
cout << "selected data:" << out << " " << endl;
rst=true;
sel = 0;
wait(50, SC_MS);
cout << "selected data:" << out << " " << endl;

I am used to VHDL, which runs in parallel, but I am finding it difficult to understand whether SystemC code runs in parallel like the example above.

Article: 157613
As of the recent update (version 1.2), the IDE comes with a huge content patch, bug fixes and overall quality improvements. The basic non-commercial license is still free of charge, and until 31.1.2015 the other prices carry discounts of 80% or more.

Regards,
MMSystems team

Article: 157614
On 06/01/2015 09:15, muyihwah@gmail.com wrote:

> // Testbench
> rst=true;
> wait(10, SC_MS);
> rst=false;
> in1=8; in2=2; in3=3; in4=6;
> sel=2;
> wait(50, SC_MS);
> cout << "selected data:" << out << " " << endl;
> rst=true;
> sel = 0;
> wait(50, SC_MS);
> cout << "selected data:" << out << " " << endl;
>
> I am used to VHDL, which runs in parallel, but I am finding it difficult
> to understand whether SystemC code runs in parallel like the example
> above.

SystemC's simulation model is very similar to VHDL's (incl. delta cycles etc).

I suspect you pasted the wrong code snippet? The code above is sequential.

Hans
www.ht-lab.com

Article: 157615
On Tuesday, 6 January 2015 13:30:59 UTC+1, HT-Lab wrote:
> On 06/01/2015 09:15, muyihwah@gmail.com wrote:
>
>> (snip: testbench code)
>>
>> I am used to VHDL, which runs in parallel, but I am finding it
>> difficult to understand whether SystemC code runs in parallel like the
>> example above.
>
> SystemC's simulation model is very similar to VHDL's (incl. delta cycles
> etc).
>
> I suspect you pasted the wrong code snippet? The code above is
> sequential.
>
> Hans
> www.ht-lab.com

Please could you explain what shows the code to be sequential.

Article: 157616
Antti <antti.lukats@gmail.com> wrote:
> yes and IDEA is born, will be public soon -
> [prelim spec deleted here, sorry]

To follow up...

In the end I went with putting a connector with the WaveShare pinout on my board. After all my other pins were done I had 10 pins spare, consisting of 8 I/Os and 2 inputs. That pretty much constrained the pinout, and the need to hardwire power pins to particular places sealed the rest. (I also had 8 analog inputs I didn't need - it's possible to abuse the ADC to make those ~100KHz digital inputs, but that's of limited use.)

To summarise the discussion:
* A number were for boards containing FPGAs. I have the FPGA; I was looking
  for a system for accessory modules to plug into the FPGA board. Perhaps
  this was unclear from the question.
* All the systems mentioned were tied to a particular board vendor.
* I think the system with the biggest following behind it is Pmod, which at
  least has non-Digilent modules. But at 4 I/Os it's quite limited - it's
  really in SPI/I2C territory, which means low speed 'sensors' rather than
  higher speed I/O.
* xkcd.com/927

I'm still most interested in an Arduino-like standard for FPGA boards:
* Something that scales to variable numbers of I/Os without pain (so you can
  attach an SPI thing or a PCI socket to the same connector)
* That copes with both high and low speed devices (eg a cheap slow connector
  with an optional more expensive addon for transceivers)
* That has a similar kind of following to it as Arduino does:
  - Multi-vendor (so more than one organisation sells boards to that form
    factor)
  - Multi-platform (so it will work with multiple FPGA vendors)
  - Has some degree of ecosystem around it (eg example code exists)
* That isn't tightly tied to the limitations of one platform (eg the need to
  put a microcontroller on the Arduino 'shield' because the main Atmega
  can't cope)

Perhaps we need FPGAs to be more mainstream for this to take off, I don't know...

Theo

Article: 157617
On 1/6/2015 2:17 PM, Theo Markettos wrote:
> Antti <antti.lukats@gmail.com> wrote:
>> yes and IDEA is born, will be public soon -
>> [prelim spec deleted here, sorry]
>
> To follow up...
>
> (snip)
>
> Perhaps we need FPGAs to be more mainstream for this to take off, I
> don't know...

Sounds to me like it is time to exercise option xkcd.com/927

I dare you to produce a use case document.

--
Rick

Article: 157618
muyihwah@gmail.com wrote:

(snip, someone wrote)

>>> I am used to VHDL, which runs in parallel, but I am finding it
>>> difficult to understand whether SystemC code runs in parallel like
>>> the example above.

>> SystemC's simulation model is very similar to VHDL (incl. delta cycles
>> etc).

(snip)

> Please could you explain what shows the code to be sequential.

Personally, I am against the idea of things like SystemC.

Well, I use structural Verilog mostly, with behavioral Verilog only for things that can't be done otherwise, like registers.

But if one is used to wiring up gates and flip-flops, it isn't hard to think about writing that down, and, as with TTL logic, everything happening in parallel.

SystemC pretends to look like serial C, and makes you believe that you can design logic with serial thinking. Even more, it might make you believe that you can port serial programs to parallel logic without change.

My favorite use for FPGAs is systolic array implementations of dynamic programming algorithms. They look very different from the serial (C) implementations. There is no useful porting of C code.

-- glen

Article: 157619
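One reading of glen's "behavioral only for registers" remark, as a minimal Verilog sketch (module and names are illustrative, not glen's code): the datapath is instantiated structurally, while the clocked storage is the one piece written behaviorally.

// A register with synchronous enable -- about the smallest behavioural
// block in an otherwise structural design.
module reg_en #(
    parameter W = 8
) (
    input  wire         clk,
    input  wire         en,
    input  wire [W-1:0] d,
    output reg  [W-1:0] q
);
    always @(posedge clk)
        if (en)
            q <= d;      // the storage that plain gate instances don't express
endmodule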
On 06/01/15 19:13, muyihwah@gmail.com wrote:
> On Tuesday, 6 January 2015 13:30:59 UTC+1, HT-Lab wrote:
>> On 06/01/2015 09:15, muyihwah@gmail.com wrote:
>>
>>> (snip: testbench code)
>>>
>>> I am used to VHDL, which runs in parallel, but I am finding it
>>> difficult to understand whether SystemC code runs in parallel like
>>> the example above.
>>
>> SystemC's simulation model is very similar to VHDL's (incl. delta
>> cycles etc).
>>
>> I suspect you pasted the wrong code snippet? The code above is
>> sequential.
>>
>> Hans
>> www.ht-lab.com
>
> Please could you explain what shows the code to be sequential.

They are obviously not declarations, therefore they are statements in a single C++ function, and hence are sequential.

regards
Alan

--
Alan Fitch

Article: 157620
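For readers coming from an HDL, the split Alan describes has a direct Verilog analogue (this is an analogy, not SystemC code): statements inside one initial block execute in order, while separate initial/always blocks run concurrently. SystemC draws the line in the same place -- sequential inside a process such as an SC_THREAD, concurrent between processes. Names below are illustrative.

// Process 1 runs its statements in order; process 2 runs in parallel with it.
module tb;
    reg       rst;
    reg       clk = 0;
    reg [3:0] sel;

    // Process 1: sequential stimulus, much like the SystemC snippet above.
    initial begin
        rst = 1;
        #10 rst = 0;
        sel = 2;
        #50 sel = 0;
        #50 $finish;
    end

    // Process 2: concurrent with process 1.
    always #5 clk = ~clk;
endmodule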
On 06/01/2015 23:43, glen herrmannsfeldt wrote:

Hi Glen,

> Personally, I am against the idea of things like SystemC.

Why? It is just another RTL language (library).

> Well, I use structural Verilog mostly, with behavioral Verilog only for
> things that can't be done otherwise, like registers.

Behavioural Verilog to model a register?

> But if one is used to wiring up gates and flip-flops, it isn't hard to
> think about writing that down, and, as with TTL logic, everything
> happening in parallel.
>
> SystemC pretends to look like serial C,

SystemC doesn't look at all like serial C. SystemC is processes, signals, ports, hierarchy, i.e. all the constructs you will find in Verilog/VHDL. If you want to do architectural exploration you need to use a block of C/C++ code and not SystemC. SystemC adds concurrency to C/C++.

> and makes you believe that you can design logic with serial thinking.
> Even more, it might make you believe that you can port serial programs
> to parallel logic without change.

I give you the "without a change", but other than that you should have a look at the capabilities of the latest ESL tools, like Catapult8 and Cynthesizer5 -- you will be impressed.

> My favorite use for FPGAs is systolic array implementations of dynamic
> programming algorithms. They look very different from the serial (C)
> implementations. There is no useful porting of C code.

I am sure you can model a systolic array in plain old C; in SystemC it should be as easy as in Verilog or VHDL.

Regards,
Hans.
www.ht-lab.com

Article: 157621
On Wednesday, January 7, 2015 12:48:41 AM UTC+1, Alan Fitch wrote:
> On 06/01/15 19:13, muyihwah@gmail.com wrote:
>
> (snip)
>
>> Please could you explain what shows the code to be sequential.
>
> They are obviously not declarations, therefore they are statements in a
> single C++ function, and hence are sequential.
>
> regards
> Alan
>
> --
> Alan Fitch

Thanks a lot for the response. Could you please point me to some online resources or books that might provide detailed explanations?

Article: 157622
Hello,

I've come up against a problem where I need to rotate an incoming video stream image by +/-5 degrees in 0.5 degree steps. The task now is to identify the most resource-saving approach, one which also uses the memory as efficiently as possible, because I need to design a new PCB and do the component selection. The FPGA will be an Altera Cyclone V with one hard memory controller (5CEFA2 device). I am trying to work out whether one DDR3 memory chip will be sufficient, or whether it's better to use two devices with a 32-bit memory bus, thus increasing the bandwidth.

The incoming video stream is from a camera, which has a separate clock, so a frame buffer is a requirement.

I've come across two options so far:

1) Image rotation by shearing:
https://www.ocf.berkeley.edu/~fricke/projects/israel/paeth/rotation_by_shearing.html
This seems like a fairly easy approach, but it will require at least three memory accesses. In combination with a regular three-frame frame buffer, I could end up doing five memory read/write cycles.

2) Image rotation using a lookup table for each pixel. If the lookup table is placed in memory, this will require one access to read the location and another access to read the pixels and write them to the moved location.

I am not sure which method is most commonly used in FPGA video processing. Maybe you experts have good resources to read about this?

Thank you.

Regards,
Tomas D.

Article: 157623
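For reference, option 1 rests on the standard three-shear (Paeth) decomposition of a rotation, which is what makes each pass a simple row- or column-wise shift and why it needs (at least) three passes over the image:

\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
=
\begin{pmatrix} 1 & -\tan\frac{\theta}{2} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ \sin\theta & 1 \end{pmatrix}
\begin{pmatrix} 1 & -\tan\frac{\theta}{2} \\ 0 & 1 \end{pmatrix}

Each factor moves whole rows (or whole columns) sideways by an offset that depends only on the row (or column) index, which maps well onto line-buffer addressing but costs the extra passes through the frame store.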
Tomas D. wrote:
> Hello,
>
> I've come up against a problem where I need to rotate an incoming video
> stream image by +/-5 degrees in 0.5 degree steps.
>
> (snip)
>
> I am not sure which method is most commonly used in FPGA video
> processing. Maybe you experts have good resources to read about this?

Some time ago I did image rotation for a check scanner that used a line-scan camera. My issue was mostly the general lack of BRAM in the small (XCV50) FPGAs I used, and I had to come up with an algorithm that only read small groups of pixels at a time.

My suggestion is to try to find a part that has as much internal RAM as you can reasonably afford. Then remember that when reading you want to keep as much of the data you actually read (full bursts) so you don't have to re-read during the same rotation pass. I would not think that going to a two-pass shearing algorithm will really save much in terms of logic. I didn't need to go that way, and I used relatively small parts with no internal hardware multipliers.

The algorithm I used simply started with the location of the first destination pixel of the first output line, which may be located at some point outside the actual input image. Remember that free rotation usually requires a larger "canvas" than the input image. In my case I didn't really need the whole input image, since the output image was calculated from the detected corners of the check.

Then the algorithm simply walked a pixel at a time by adding a delta to the starting location. You need to read pixels surrounding each computed X,Y location and interpolate. My interpolation was simply linear and used only the 4 nearest neighbors, but a more robust algorithm would either use more neighboring pixels or do some filtering on the input image before rotation. When you get to the end of the first output line, you go back to the original pixel location plus an orthogonal delta to get the first pixel of the second output line, and so on.

My algorithm used reads of 4 adjacent pixels in each of three adjacent rows to fill internal memory in 3 x 4 blocks. The starting point of these 3 x 4 blocks depended on the direction of rotation, but it allowed me to do +/- 14 degrees max.

I did not need to do sine / cosine in my design because there was a processor that looked at the incoming raw image to find the check corners and directly programmed the starting pixel location and X,Y deltas.

--
Gabor

Article: 157624
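The address-generation part of the approach Gabor describes might look roughly like the Verilog sketch below: fixed-point accumulators walk the output raster and produce, per output pixel, an integer source address plus fractional bilinear weights. The Q16.16 format, the widths and every name are illustrative assumptions; the per-pixel and per-line deltas (essentially the cosine and sine of the angle) would be programmed by software, as in Gabor's design.

// Q16.16 source-coordinate walker for inverse-mapped rotation.
module rot_addr_gen (
    input  wire        clk,
    input  wire        rst,
    input  wire        px_step,            // advance one output pixel
    input  wire        line_step,          // advance to the next output line
    input  wire [31:0] x0, y0,             // source coords of output pixel (0,0)
    input  wire [31:0] dx_px, dy_px,       // delta per output pixel  (~ cos, sin)
    input  wire [31:0] dx_ln, dy_ln,       // delta per output line   (~ -sin, cos)
    output wire [15:0] src_x, src_y,       // integer source pixel address
    output wire [15:0] frac_x, frac_y      // fractional parts -> bilinear weights
);
    reg [31:0] x_acc,  y_acc;              // coordinate of the current output pixel
    reg [31:0] x_line, y_line;             // coordinate of the start of this line

    always @(posedge clk) begin
        if (rst) begin
            x_acc  <= x0;   y_acc  <= y0;
            x_line <= x0;   y_line <= y0;
        end else if (line_step) begin      // next line starts one line-delta on
            x_line <= x_line + dx_ln;   x_acc <= x_line + dx_ln;
            y_line <= y_line + dy_ln;   y_acc <= y_line + dy_ln;
        end else if (px_step) begin        // next pixel along the output line
            x_acc <= x_acc + dx_px;
            y_acc <= y_acc + dy_px;
        end
    end

    assign src_x  = x_acc[31:16];          // integer part addresses the frame buffer
    assign src_y  = y_acc[31:16];
    assign frac_x = x_acc[15:0];           // fraction weights the 4 neighbours
    assign frac_y = y_acc[15:0];
endmodule

The four neighbouring pixels at (src_x, src_y) through (src_x+1, src_y+1) then get blended with weights derived from frac_x and frac_y, which is the 4-nearest-neighbour linear interpolation Gabor mentions.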
I'll shortly be starting a design in a Zynq FPGA using Vivado. I'm confident that I will be able to use VHDL to create the design, partly because it isn't outside my comfort zone, and partly because there are many app notes, tutorials, books etc. on that subject.

Timing constraints are a different issue, mainly because they seem to be relatively neglected and un-glamorous. What I would really like are some examples of "good practice", i.e. small self-contained examples using timing constraints, documenting what is necessary and what is sufficient, and why. I emphasise "small and self-contained" since at this stage I'm not interested in all the arcane possibilities of the constraint languages, merely the common cases with their boundaries. In the software community such things have been given the somewhat fancy name of "design patterns", but it is a valuable concept.

Background is that the logic design will be conventional, with patterns based around:
* blocks containing FSMs in the form of one VHDL process for the combinatorial
  logic plus another VHDL process for the registers
* two (or more!) clock domains, some with period X, some period 8X (i.e. a
  nice simple integer relationship)
* re-synchronisation across those domain boundaries, using predefined
  standard Xilinx FPGA primitives
* external I/O timing is, perhaps surprisingly, "don't care"; I'll
  resynchronise inputs myself, and will accept whatever the outputs provide
* and I'm presuming that the PL<->PS interface will be "dealt with" by Vivado
  without my intervention

So, I'd be grateful for pointers to references that you found useful when you were learning how to use constraints effectively.
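Not a reference, but a concrete instance of one of the "common cases" above -- the resynchroniser -- with the constraint commands it usually pairs with noted alongside. It is sketched in Verilog for brevity (the structure is identical in VHDL); module, instance and clock names are illustrative assumptions, and the exact XDC arguments need checking against your own netlist names and the Vivado constraints documentation (UG903 is the constraints user guide, UG949 the UltraFast methodology guide).

// Classic two-flop synchroniser for a single bit crossing into clk_dst.
module sync2 (
    input  wire clk_dst,    // destination-domain clock
    input  wire d_async,    // bit launched from the other clock domain
    output wire q_sync
);
    // ASYNC_REG asks the tools to keep these two flops together and treat
    // them as a synchroniser (standard Xilinx attribute syntax).
    (* ASYNC_REG = "TRUE" *) reg meta, stable;

    always @(posedge clk_dst) begin
        meta   <= d_async;   // may go metastable
        stable <= meta;      // resolved one destination clock later
    end

    assign q_sync = stable;

    // In the .xdc (not in the HDL), the matching constraint is typically
    // one of the following -- clock, instance and cell names here are
    // placeholders, not anything from the post:
    //   set_clock_groups -asynchronous -group {clk_a} -group {clk_b}
    //   set_false_path -to [get_cells path/to/sync2_inst/meta_reg]
    //   set_max_delay -datapath_only -to [get_cells .../meta_reg] <delay_ns>
endmodule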