Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
On 10/5/2015 1:12 PM, Kevin Neilson wrote:
>> I won't say I understand it, but I have seen some things about this. But
>> "true random number generator"??? My understanding is this is virtually
>> impossible. I haven't read about this. Is it based on noise from a
>> diode or something? I recall a researcher trying that and it was good,
>> but he could never find the source of a long-term DC bias.
>
> I don't know how these guys do it, but you can make a decent true random
> number generator with ring oscillators. I read one paper that described
> using several of these along with non-linear feedback shift registers to
> get a good random number.

Again, I don't know for sure, but I think ring oscillators make very
*poor* random number generators because they are easily linked to noise
sources such as clocks on the chip.

--
Rick

Article: 158301
On 10/5/2015 1:37 PM, Meshenger wrote:
> In terms of Ethernet on the KickStart eval/development board, one can
> add an Arduino shield for that. There is Bluetooth on-board but not
> Ethernet.

The only Arduino shields I have seen are general-purpose interfaces which
duplicate the MAC internal to the SF2. This means they would not be code
compatible with the SF2 Ethernet, so there is not much value in using the
KickStart board for software development of an SF2 project. I think the
SF2 only needs a PHY and the transformer. I would consider designing an
add-on board to use the internal Ethernet MAC and make the I/O capability
compatible with the Microsemi eval board. I might add a few bells and
whistles too.

--
Rick

Article: 158302
On 07/10/15 05:02, rickman wrote:
> On 10/5/2015 1:29 PM, Kevin Neilson wrote:
>>> We are talking modulo-2 multiplies at every bit, otherwise known as
>>> AND gates with no carry? I'm a bit fuzzy on this.
>>>
>>> Now I'm confused by Kevin's description. If the vector is
>>> multiplied by a scalar, what parts are common and what parts are
>>> unique? What parts of this are fixed vs. variable? The only parts
>>> a tool can optimize are the fixed operands. Or I am totally
>>> missing the concept.
>>
>> Say you're multiplying a by a vector [b c d]. Let's say we're using
>> the field GF(8), so a is 3 bits. Now a can be thought of as
>> (a0*alpha^0 + a1*alpha^1 + a2*alpha^2), where a0 is bit 0 of a, and
>> alpha is the primitive element of the field. Then a*b or a*c or a*d
>> is just a sum of some combination of those 3 values in the
>> parentheses, depending upon the locations of the 1s in b, c, or d.
>> So you can premultiply the three values in the parentheses (the
>> common part) and then take sums of subsets of those three (the
>> individual parts). It's all a bunch of XORs at the end. This is
>> just a complicated way of saying that by writing the HDL at a more
>> explicit level, the synthesizer is better able to find common factors
>> and use a lot fewer gates.
>
> Ok, I'm not at all familiar with GFs. I see now a bit of what you are
> saying. But to be honest, I don't know that the tools would have any
> trouble with the example you give. The tools are pretty durn good at
> optimizing... *but*... there are two things to optimize for, size and
> performance. They are sometimes mutually exclusive, sometimes not. If
> you ask the tool to give you the optimum size, I don't think you will do
> better if you code it differently, while describing *exactly* the same
> behavior.
>
> If you ask the tool to optimize for speed, the tool will feel free to
> duplicate logic if it allows higher performance, for example, by
> combining terms in different ways.
> Or less logic may require a longer chain of LUTs which will be slower.
> LUT sizes in FPGAs don't always match the logical breakdown, so that
> speed or size can vary a lot depending on the partitioning.

GF(8) is a Galois field with 8 elements. That means that there are 8
different elements. These can be written in different ways. Sometimes
people like to be completely abstract and write 0, 1, a, b, c, d, e, f.
<http://www.wolframalpha.com/input/?i=GF%288%29>

Others will want to use the polynomial forms (which are used in the
definition of GF(8)) and write: 0, 1, x, x+1, x², x²+1, x²+x, x²+x+1.
<http://www.math.uic.edu/~leon/mcs425-s08/handouts/field.pdf>
That form is easily written in binary: 000, 001, 010, 011, 100, etc.

One key point about a GF (or any field) is that you can combine elements
through addition, subtraction, multiplication, and division (except by 0)
and get another element of the field. To make this work, however,
addition and multiplication (and therefore their inverses) are defined
very differently from how you normally define them.

A GF is always of size p^n - in this case, p is 2 and n is 3. Your
elements are a set of n elements from Z/p (the integers modulo p).
Addition is just pairwise addition of elements, modulo p. For the case
p = 2 (which is commonly used for practical applications, because it
suits computing so neatly), this means you can hold your elements as a
simple binary string of n digits, and addition is just xor.

Multiplication is a little more complex. Treat your elements as
polynomials (as shown earlier), and multiply them. Any coefficients of
x^i can be reduced modulo p. Then the whole thing is reduced modulo the
field's defining irreducible degree-n polynomial (in this case,
x³ + x + 1). This always gets you back to another element in the field.

The real magic here is that the multiplication table (excluding 0) forms
a Latin square - every element appears exactly once in each row and
column. This means that division "works".
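The addition-is-xor and multiply-then-reduce rules described above are easy to check in software. Here is a quick Python model (not HDL, just a sanity check of the arithmetic; elements use the binary form given above, with bit i holding the coefficient of x^i):

```python
# Software model of GF(8) arithmetic as described above.
# Elements are 3-bit integers; bit i is the coefficient of x^i.

IRRED = 0b1011  # x^3 + x + 1, the field's defining irreducible polynomial

def gf8_add(a, b):
    # Addition is coefficient-wise mod 2, i.e. plain XOR.
    return a ^ b

def gf8_mul(a, b):
    # Carry-less polynomial multiply: shift-and-xor instead of shift-and-add.
    p = 0
    for i in range(3):
        if (b >> i) & 1:
            p ^= a << i
    # Reduce the degree-4 and degree-3 terms modulo x^3 + x + 1.
    for i in (4, 3):
        if (p >> i) & 1:
            p ^= IRRED << (i - 3)
    return p

# The non-zero multiplication table is a Latin square, so division works:
# every row is a permutation of the non-zero elements.
for a in range(1, 8):
    row = [gf8_mul(a, b) for b in range(1, 8)]
    assert sorted(row) == list(range(1, 8))
```

Running it confirms, for example, that x * x = x² (2 * 2 = 4) and that the Latin-square property holds, so every non-zero element has a unique inverse.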
It's easy to get a finite field like this for size p, where p is prime -
it's just the integers modulo p, and you can use normal addition and
multiplication modulo p. But the beauty of GFs is that they give you
fields of other sizes - in particular, size 2^n.

Back to the task in hand - implementing GF(8) on an FPGA. Addition is
just xor - the tools should not have trouble optimising that!
Multiplication in GF is usually done with lookup tables. For a field as
small as GF(8), you would have a table with all the elements. This could
be handled in a 6x3 bit "ROM" (6 address bits in, 3 bits out), or the
tools could generate logic and then reduce it to simple gates (each bit
of the product reduces to an XOR of a few AND terms of the operand
bits). For larger fields, such as the very useful GF(2^8), you will
probably use lookup tables for the logs and antilogs, and use them for
multiplication and division.

A key area of practical application for GFs is in error detection and
correction. A good example is RAID6 (two redundant disks in a RAID
array) - there is an excellent paper on the subject by the guy who
implemented RAID6 in Linux, using GF(2^8). It gives a good introduction
to Galois fields.
<https://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>

Article: 158303
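The log/antilog scheme mentioned above for GF(2^8) can be sketched in a few lines of Python. (The polynomial 0x11D used here is the one in the linked RAID6 paper; other applications pick other irreducible polynomials, and x is assumed to be a primitive element, which holds for 0x11D.)

```python
# Build log/antilog tables for GF(2^8) and multiply/divide via lookups -
# the usual scheme when a full 256x256 product table would be too large.

POLY = 0x11D   # x^8 + x^4 + x^3 + x^2 + 1, as in the RAID6 paper

exp = [0] * 512          # antilog table, doubled so sums of logs need no mod
log = [0] * 256
v = 1
for i in range(255):
    exp[i] = v
    log[v] = i
    v <<= 1              # multiply by x
    if v & 0x100:
        v ^= POLY        # reduce modulo the field polynomial
for i in range(255, 512):
    exp[i] = exp[i - 255]

def gf256_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return exp[log[a] + log[b]]        # log(a*b) = log(a) + log(b) mod 255

def gf256_div(a, b):
    if b == 0:
        raise ZeroDivisionError
    if a == 0:
        return 0
    return exp[log[a] - log[b] + 255]  # offset keeps the index non-negative
```

The doubled antilog table is a common trick: it lets the multiply skip an explicit mod-255 on the summed logs.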
1) Have an array of logic cells or programmable logic evenly spaced out.
Space needs to be reserved for the routing channels.

2) But each logic cell needs some SRAM configuration bits from the
configuration register.

The question I have is: if I don't want configuration bits and wires
jamming up the routing channel for the logic array, can I have them on
different metal layers? I want to have the simplest arrangement: 2
layers of metal. Metal 1 for the logic arrays and routing channels, and
Metal 2 for the SRAM configuration bits and wiring.

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 158304
On Tue, 06 Oct 2015 23:02:27 -0400, rickman wrote:

[snip]

> Ok, I'm not at all familiar with GFs. I see now a bit of what you are
> saying. But to be honest, I don't know that the tools would have any
> trouble with the example you give. The tools are pretty durn good at
> optimizing... *but*... there are two things to optimize for, size and
> performance. They are sometimes mutually exclusive, sometimes not. If
> you ask the tool to give you the optimum size, I don't think you will do
> better if you code it differently, while describing *exactly* the same
> behavior.

You seem to have misinterpreted my earlier post in this thread, in which
I had two implementations of *exactly* the same function that were
giving very different speed and area results after synthesis.

I think what you say (about the coding not mattering and the tools doing
a good job) is true for "small" functions. Once you get past a certain
level of complexity, the tools (at least the Xilinx ones) seem to do a
poor job if the function has been coded the canonical way. Recoding as a
tree of XORs can give better results. This is a fault in the synthesiser
and is independent of the speed vs area optimisation setting.

Regards,
Allan

Article: 158305
On 10/7/2015 7:18 AM, lilzz wrote:
> 1) Have an array of logic cells or programmable logic evenly spaced
> out. Need to have the space reserved for routing channel.
>
> 2) But each logic cell needs some SRAM configuration bits from the
> configuration register.
>
> The question I have is if I don't want configuration bits and wires
> jamming up the routing channel for the logic array, can I have them on
> different metal layers?

Are you asking about designing your own FPGA? If so, you can do anything
you want with that.

> I want to have the simplest: 2 layers of metal. Metal 1 for logic
> arrays and routing channels and Metal 2 for SRAM configuration bits
> and wiring.

I don't think you can divide up the signals in that way. Any set of
connections will require routing in both the X and the Y direction. It
is often best to use two layers, one for X and one for Y. Otherwise you
find a set of signals routing in one direction blocks routing in the
other direction. So if you want to partition your routing layers into
two classes, each class will need two layers, for a total of four layers
of metal.

--
Rick

Article: 158306
rickman <gnuarm@gmail.com> wrote:
> Again, I don't know for sure, but I think ring oscillators make very
> *poor* random number generators because they are easily linked to noise
> sources such as clocks on the chip.

Yes:
http://www.cl.cam.ac.uk/~atm26/papers/markettos-ches2009-inject-trng.pdf

That has been confirmed to me by various industrial security folks.

Theo

Article: 158307
rickman <gnuarm@gmail.com> wrote:

(snip on configuration registers in FPGAs)

> Are you asking about designing your own FPGA? If so, you can do
> anything you want with that.

>> I want to have the simplest: 2 layers of metal. Metal 1 for logic
>> arrays and routing channels and Metal 2 for SRAM configuration bits
>> and wiring.

> I don't think you can divide up the signals in that way. Any set of
> connections will require routing in both the X and the Y direction.
> It is often best to use two layers, one for X and one for Y.
> Otherwise you find a set of signals routing in one direction block
> routing in the other direction. So if you want to partition your
> routing layers into two classes, each class will need two layers for
> a total of four layers of metal.

I haven't thought about it this much since the XC4000 days, but as far
as I know there are configuration-bit shift registers along rows (or
columns) and a column (or row) on one side that steers bits as
appropriate.

Note also that configuration runs much slower than the actual logic, so
the transistors are different, and routing can be different. For logic,
one tries to keep signals in metal instead of silicon, as its lower
resistivity means it runs faster. That isn't so important for
configuration. It might be that you can share with wiring that has a
different use after configuration. If so, there is probably a patent,
though it might have expired.

Note also that the LUTs in some FPGAs can also be used as shift
registers, and so also need to be part of the configuration data. Maybe
they use the same shift logic in both cases.

I am sure by now there is much art on the optimal designs for FPGAs,
which should be well documented somewhere. Patents are a convenient
place to look.

-- glen

Article: 158308
Hi. I'm looking at universities that offer an online/distance (part- or
full-time) master's degree based on FPGAs, HDLs/HLS and/or digital
design. I wonder if someone knows of such a degree, or can point me in
the right direction.

Thanks

Article: 158309
Hi All,

In Altera devices (at least) it is recommended that reset be applied to
the async port of the flops. It is also recommended that such a reset
should be pre-synchronised before wiring it to these async ports. This
saves resources and helps recovery/removal timing.

What exactly is recovery/removal? I know it is defined in terms of reset
release, and that reset should not be de-asserted close to a clock edge.
Fair enough, but is this independent of the D input? I mean, if the D
input is stable (or passes setup/hold), does it still matter that reset
release near a clock edge will be a problem on its own? From TimeQuest
it certainly looks like it does matter, but why? How is reset actually
applied inside the flop?

Any help appreciated.

Zak

Article: 158310
In article <sdudnT94KJgxv4XLnZ2dnUU7-KOdnZ2d@giganews.com>,
zak <93737@FPGARelated> wrote:
> Hi All,
>
> In Altera devices (at least) it is recommended that reset be applied
> to the async port of flips. It is also recommended that such reset
> should be pre-synchronised before wiring it to these async ports.
> This saves resource and helps recovery/removal timing.
>
> What exactly is recovery/removal. I know it is defined in terms of
> reset release and that reset should not be de-asserted close to clock
> edge. Fair enough but is this independent of D input? I mean if D
> input is stable (or passes setup/hold) does it matter still that
> reset release near clock edge will be problem on its own. From
> timieQuest it looks certainly that it does matter but why? How is
> reset actually applied inside the flip?

Without any specific knowledge of the actual circuitry of a FF, ask
yourself this: at the inactive edge of reset, let's say the D pin is a
one, while the reset is causing a 0 at the FF output. If the clock
recovery timing is not met (basically a setup/hold check with respect to
the async reset pin), what value would you expect at Q? An unknown value
is the only reasonable assumption.

Now ask yourself: if D is a zero (matching the reset state), is it now
safe to assume that the recovery check doesn't matter? I wouldn't bet
on it. I can see hand-wavy arguments both ways.

Now someone with more intimate knowledge of the FF internals may argue
one way or another - but me, as a logic designer? No way I'm depending
on that.

Do as Altera advises and properly synchronize that inactive edge of
reset. Make sure your timing tools are checking that path.
Reset/initialization problems can be quite the devil to find and debug.

Regards,

Mark

Article: 158311
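For what it's worth, the usual circuit behind "synchronize that inactive edge of reset" is a two-flop "asynchronous assert, synchronous release" reset bridge. Here is a tiny behavioral model in Python, purely to illustrate the idea (in a real design this is two flops in HDL; the model only evaluates at clock edges, so it doesn't capture reset pulses between edges):

```python
# Behavioral sketch of an async-assert / sync-release reset bridge:
# two flip-flops whose asynchronous clear is driven by the raw reset
# and whose D chain feeds in a constant 1.  The output drops at once
# when reset asserts, but only rises on a clock edge, so downstream
# flops never see the deasserting edge asynchronously.

class ResetBridge:
    def __init__(self):
        self.ff1 = 0
        self.ff2 = 0   # synchronized, active-low reset output

    def tick(self, async_rst_n):
        if async_rst_n == 0:          # asynchronous assertion
            self.ff1 = self.ff2 = 0
        else:                         # synchronous release on the clock edge
            self.ff2 = self.ff1
            self.ff1 = 1
        return self.ff2

bridge = ResetBridge()
trace = [bridge.tick(rst_n) for rst_n in [0, 0, 1, 1, 1]]
# the output stays low for two clocks after release, then goes high
```

Downstream flops still use their async clear pins, but the deasserting edge they see is now aligned to the clock, which is exactly what the recovery/removal check wants.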
Thanks Mark. That is a good answer.

Zak

Article: 158312
> Hi. I'm looking at universities that offer online/distance (part or
> full time) master degree based in FPGAs, HDLs/HLS and/or digital
> design. I wonder if someone knows of such a degree, or can point in
> the right direction.
>
> Thanks

I don't think I've heard of any Masters program with a 100% focus on
HDLs/FPGAs. What you are likely to find, if FPGAs are of interest to
you, is a Master's in either EE with a DSP concentration or Computer
Science/Engineering. Studying either of these courses should give you
plenty of exposure to FPGAs. From my experience you sort of have to
build on your experience once you have been exposed to it. For clearer
information you can go talk to some technical institutions about this.
This is just my opinion. Hope this helps.

Article: 158313
Thanks. Do you have any program in particular that you would recommend?

Article: 158314
How do I most efficiently add 8 numbers in an FPGA?
What is the best way to save LUTs?
How does data width affect LUT consumption?
Thanks in advance.

Article: 158315
On Tuesday, October 20, 2015 at 7:40:35 AM UTC-4, b2508 wrote:
> How do I most efficiently add 8 numbers in FPGA?

With an adder. You haven't stated any requirements, so any answer here
would be OK. Consider:

- You didn't specify your latency or processing speed requirements
- You didn't specify your efficiency metric (i.e. power? LUTs?
  Something else?)

> What is the best way to save LUTs?

Using an accumulator and streaming the numbers in sequentially might use
fewer LUTs.

> How is data width affecting LUT consumption?

More LUTs will be used when you increase the data width.

Kevin

Article: 158316
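Kevin's accumulator suggestion, cycle by cycle, amounts to the following. This is a Python model of the hardware behavior, not HDL; the n+3 width is just the worst-case growth for eight unsigned n-bit operands:

```python
# Model of a streaming accumulator: one adder reused over 8 clock
# cycles instead of seven parallel adders.  In hardware this is a
# single registered adder of width n + 3 (three bits of growth
# because 8 * (2^n - 1) < 2^(n+3)).

def accumulate(samples, width):
    acc_width = width + 3                 # 8 n-bit values fit in n+3 bits
    mask = (1 << acc_width) - 1
    acc = 0
    for s in samples:                     # one operand per clock cycle
        acc = (acc + s) & mask            # registered add, fixed width
    return acc

values = [200, 250, 255, 100, 0, 37, 255, 191]   # eight 8-bit inputs
# the n+3 accumulator never overflows for 8 operands, so this matches sum()
```

The trade is seven extra clock cycles of latency for a single adder's worth of LUTs, which is exactly the serial-vs-parallel trade discussed in the rest of the thread.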
b2508 <108118@fpgarelated> wrote:
> How do I most efficiently add 8 numbers in FPGA?
> What is the best way to save LUTs?
> How is data width affecting LUT consumption?

The most efficient adder is the carry-save adder. But the actual
implementation depends on many other details, such as the timing of the
availability of the numbers, and also the bit width.

-- glen

Article: 158317
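A minimal software model of the structure glen is referring to: a 3:2 carry-save stage compresses three operands into two with no carry propagation, and only the final two-operand add needs a real carry chain.

```python
# A 3:2 carry-save stage: per bit it is one full adder, so there is
# no carry ripple across the word.  a + b + c == s + cy always holds.

def csa(a, b, c):
    s = a ^ b ^ c                             # bitwise sum, no carries
    cy = ((a & b) | (a & c) | (b & c)) << 1   # majority bits, shifted up
    return s, cy

# Reduce 8 operands with repeated 3:2 stages, then one final
# carry-propagate addition of the last two words.
def csa_sum(values):
    values = list(values)
    while len(values) > 2:
        s, cy = csa(values[0], values[1], values[2])
        values = [s, cy] + values[3:]
    return values[0] + values[1]              # the only "real" adder
```

On LUT fabric the win is less clear-cut than in an ASIC, since carry chains are cheap and a 6-input LUT can absorb much of the 3:2 logic, but this is the standard structure for fast multi-operand addition.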
On 10/20/2015 7:40 AM, b2508 wrote:
> How do I most efficiently add 8 numbers in FPGA?
> What is the best way to save LUTs?
> How is data width affecting LUT consumption?
> Thanks in advance.

This sounds like a homework problem. In an FPGA there aren't many ways
to save LUTs for adders. Unless you can process your data serially, the
only thing I can think of is to do the additions in a tree structure,
which saves you a very few LUTs from the bit growth of the result
compared to processing the additions serially; it's also faster.

(((a+b)+(c+d))+((e+f)+(g+h))) vs. ((((((a+b)+c)+d)+e)+f)+g)+h

--
Rick

Article: 158318
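A back-of-the-envelope way to see this point: approximate each adder's LUT cost by its result width (roughly one LUT per result bit on a carry-chain fabric) and count the adders on the longest path. The model below is only a rough sketch, not a synthesis result, since LUT packing and carry-chain details change the real numbers:

```python
from math import ceil, log2

def sum_width(n, k):
    # A sum of k unsigned n-bit values needs n + ceil(log2(k)) result bits.
    return n + ceil(log2(k))

def chain_cost(n):
    # ((((((a+b)+c)+d)+e)+f)+g)+h : seven adders in series, each sized
    # for its running partial sum of k operands; depth = 7 adders.
    widths = [sum_width(n, k) for k in range(2, 9)]
    return sum(widths), len(widths)

def tree_cost(n):
    # (((a+b)+(c+d))+((e+f)+(g+h))) : 4 + 2 + 1 adders; depth = 3 adders.
    widths = 4 * [sum_width(n, 2)] + 2 * [sum_width(n, 4)] + [sum_width(n, 8)]
    return sum(widths), 3

# For 8-bit inputs the chain costs about 73 result bits of adder versus
# about 67 for the tree, and the critical path is 3 adders instead of 7.
```

Consistent with the post above: the tree saves only a few LUTs, but it cuts the adder depth by more than half.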
> In article <sdudnT94KJgxv4XLnZ2dnUU7-KOdnZ2d@giganews.com>,
> zak <93737@FPGARelated> wrote:
>> Hi All,
>
> Now ask yourself if D is a zero (matching the reset state). Is it now
> safe to assume that the recovery check doesn't matter? I wouldn't bet
> on it. I can see hand wavy arguments both ways.
>
> Now someone with more intimate details of the FF internals may argue
> one way or another - but me as a logic designer? No way I'm depending
> on that.
>
> Do as Altera advises and properly synchronize that inactive edge of
> reset. Make sure your timing tools are checking that path.
> Reset/Initialization problems can be quite the devil to find and
> debug.
>
> Regards,
>
> Mark

You shouldn't have to hand-wave or guess; you should be able to look at
the timing requirements in the silicon vendor's data sheet, and it will
tell you what the reset-deassert to clock recovery time is. It should
also note whether that constraint is only valid when D=1 or not.

It's hard to imagine any flop going to 1 in that situation, but you can
get some really weird behavior from mux-based logic.

I like to do both an asynchronous and a synchronous reset on every flop
and connect the asynchronous reset directly to the input pad, so that
both edges are asynchronous. You delay the synchronous reset enough
clocks so that you never see a 1 on the D input until after the recovery
time has passed. This is great for large chips where the transport delay
across the die can be multiple clock cycles.

If you synchronize the asynchronous reset then you also create a mess
that the DFT engineer has to clean up.

You will also mess up the timing for any signal that passes between soft
reset domains.

John Eaton
z3qmtr45@gmail.com

Article: 158319
On Tuesday, October 20, 2015 at 6:40:35 AM UTC-5, b2508 wrote:
> How do I most efficiently add 8 numbers in FPGA?
> What is the best way to save LUTs?
> How is data width affecting LUT consumption?
> Thanks in advance.

At the risk of doing someone else's homework:

> How do I most efficiently add 8 numbers in FPGA?

The latest parts from Xilinx and Altera will add three numbers at a time
using a single carry chain.

> What is the best way to save LUTs?

A: Doing serial arithmetic using block RAM to hold inputs & outputs.
B: Using DSP adders in place of LUT carry chains.

Jim

Article: 158320
rickman <gnuarm@gmail.com> wrote:
> On 10/20/2015 7:40 AM, b2508 wrote:
>> How do I most efficiently add 8 numbers in FPGA?
>> What is the best way to save LUTs?
>> How is data width affecting LUT consumption?
>> Thanks in advance.
>
> This sounds like a homework problem.

Yes, but even so, leaving lots of unknowns.

> In an FPGA there aren't many ways to save LUTs for adders.

If you have 8 n-bit inputs and need the sum as fast as possible, there
aren't a huge number of choices. Though it does depend a little on n.

> Unless you can process your data serially, the

In this case, there are two choices. You can process the data bit
serial, or word serial. (Or, I suppose, somewhere in between.) Choosing
one of those would depend on how the data was supplied, and again, how
fast you need the result. In addition, only one set of eight, or many?

> only thing I can think of is to do the additions in a tree structure
> which saves you a very few LUTs from the bit growth of the result
> compared to processing the additions serially, it's also faster.
> (((a+b)+(c+d))+((e+f)+(g+h))) vs. ((((((a+b)+c)+d)+e)+f)+g)+h

If you just chain adders, the usual tools will optimize them. But you
might also want some registers in there, too.

Also, this could be a lab homework problem, where the student is
supposed to try things out and see what happens.

-- glen

Article: 158321
On 20/10/2015 22:52, jim.brakefield@ieee.org wrote:
> On Tuesday, October 20, 2015 at 6:40:35 AM UTC-5, b2508 wrote:
>> How do I most efficiently add 8 numbers in FPGA?
>> What is the best way to save LUTs?
>> How is data width affecting LUT consumption?

Why not try it out? Run one of the tool chains and see what happens when
you build the adder in different ways, and then if it's not what you
expect, come and ask on here. The tool chains will show you what the LUT
usage is.

I was a tad surprised to find that when I coded:

byteout <= byte1+byte2+byte3+byte4+byte5+byte6+byte7+byte8;

and compared it with

temp1 <= byte1 + byte2 + byte3 + byte4;
temp2 <= byte5 + byte6 + byte7 + byte8;
byteout <= temp1 + temp2;

I got the same number of LUTs and slices used....

> At the risk of doing someone else's homework:
>
>> How do I most efficiently add 8 numbers in FPGA?

Define efficiency. Almost all efficiency is a trade-off between space
and performance.

> The latest parts from Xilinx and Altera will add three numbers at a
> time using a single carry chain.

In this modern world of optimising tool chains, why not just put them
all in one expression and let the tool chain work out what is best for
the chip?

>> What is the best way to save LUTs?

> A: Doing serial arithmetic using block RAM to hold inputs & outputs.

A classical trade-off of speed, as it's now serial, for gates used. If
you do it serially then you may need to do 7 separate serial
additions... which will need more LUTs for the carry latches....

> B: Using DSP adders in place of LUT carry chains.

Assuming your chip has one?

Just my two cents/pence/yuan... And Jim, nothing personal; your comments
just seemed a suitable place to hang my hat....

Dave

Article: 158322
> You shouldn't have to hand wave or guess, you should be able to look
> at timing requirements from the silicon vendors data sheet and it
> will tell what the reset_deassert to clock recovery time is.

[snip]

> If you syncronize the asynchronous reset then you also create a mess
> that the dft engineer has to clean up.
>
> You will also mess up the timing for any signal that passes between
> soft reset domains.
>
> John Eaton

Apologies, but can I ask if this reply is from an ASIC mindset, or is it
spam? Almost every statement doesn't make any sense whatsoever for
FPGAs.

Zak

Article: 158323
jt_eaton wrote:
> You shouldn't have to hand wave or guess, you should be able to look
> at timing requirements from the silicon vendors data sheet and it
> will tell what the reset_deassert to clock recovery time is. It
> should also note if that constraint is only valid when D=1 or not.
>
> It's hard to imagine any flop going to 1 in that situation but you
> can get some really weird behavior from muxed based logic.

The real problem is if the reset is distributed on the general switch
fabric while the clock is on a low-skew clock net: some FFs could be
released from reset on one clock, others could be released on the next
clock, if the clock is fast. Think of some state machine register that
is supposed to start at zero and start going through states after reset
ends. It could end up in an odd state if some FFs are still being reset
while others have got out of reset.

Jon

Article: 158324
> The real problem is if the reset is distributed on general switch
> fabric while the clock is on a low-skew clock net, some FFs could be
> released from reset on one clock, others could be released on the
> next clock, if the clock is fast. Think of some state machine
> register that is supposed to start at zero, and start going through
> states after reset ends. It could end up in an odd state if some FFs
> are still being reset while others have got out of reset.
>
> Jon

True, but the tool takes care of that and tells you if it passed
recovery/removal at every register, based on a clock-period check. In
case it doesn't, you can then assist by reducing the reset fanout, e.g.
by cascading stages.

Zak