In article <9ae86fdc-dc6a-4d3f-b201-594fe2f6a3cd@googlegroups.com>,
Kevin Neilson <kevin.neilson@xilinx.com> wrote:
>> I'm not enough of an FPGA guy to make really deep comments, but this
>> looks like the state of C compilers about 20 or so years ago.  When I
>> started coding in C one had to write the code with an eye to the assembly
>> that the thing was spitting out.  Now, if you've got a good optimizer
>> (and the gnu C optimizer is better than I am on all but a very few of the
>> processors I've worked with recently), you just express your intent and
>> the compiler makes it happen most efficiently.
>>
>I know!  I often feel like I'm a software guy, but stuck in the 80s, poring
>over every line generated by the assembler to make sure it's optimized.

But, but "HLS", and "IP Integrator"... ;)

I actually came back a bit let down from a recent Xilinx user's meeting at just
how much focus Xilinx is putting on their 'high level' tools.  I'm of the
opinion that Xilinx is sinking a ton of resources into something that a small
minority will ever use.  (And will probably not last long either).  To Xilinx,
RTL design is dead...

--Mark

Article: 159476
On Mon, 21 Nov 2016 21:19:50 +0000, Mark Curry wrote: > In article <9ae86fdc-dc6a-4d3f-b201-594fe2f6a3cd@googlegroups.com>, > Kevin Neilson <kevin.neilson@xilinx.com> wrote: >>> I'm not enough of an FPGA guy to make really deep comments, but this >>> looks like the state of C compilers about 20 or so years ago. When I >>> started coding in C one had to write the code with an eye to the >>> assembly that the thing was spitting out. Now, if you've got a good >>> optimizer (and the gnu C optimizer is better than I am on all but a >>> very few of the processors I've worked with recently), you just >>> express your intent and the compiler makes it happen most efficiently. >>> >>I know! I often feel like I'm a software guy, but stuck in the 80s, >>poring over every line generated by the assembler to make sure it's >>optimized. > > But, but "HLS", and "IP Integrator"... ;) > > I actually came back a bit let down from a recent Xilinx user's meeting > at just how much focus Xilinx is putting on their 'high level' tools. > I'm of the opinion that Xilinx is sinking a ton of resources into > something that a small minority will ever use. (And will probably not > last long either). To Xilinx, RTL design is dead... > > --Mark If that small minority is the one with the most dollars behind it, then they win. Dunno if that's the case or not, but it seems like there's a lot of design of high-volume, cost-sensitive stuff that's done mostly by applications engineers these days. Or, Xilinx is wrong, and they'll spend a lot of money on uselessness. That's never happened before in the history of semiconductors, now has it? ;) -- Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!Article: 159477
> I actually came back a bit let down from a recent Xilinx user's meeting at just how
> much focus Xilinx is putting on their 'high level' tools.  I'm of the opinion that
> Xilinx is sinking a ton of resources into something that a small minority will
> ever use.  (And will probably not last long either).  To Xilinx, RTL design is
> dead...
>
> --Mark

I wish they would just focus all their effort on the synthesizer and placer.
The chips get better and better, but the software seems stuck.  I think the
high-level tools are not for serious users.  You can only use them if you don't
care about clock speed, and if you don't care about clock speed, you should be
using a processor or something.

Article: 159478
On Mon, 21 Nov 2016 14:51:13 -0800, Kevin Neilson wrote: >> I actually came back a bit let down from a recent Xilinx user's meeting >> at just how much focus Xilinx is putting on their 'high level' tools. >> I'm of the opinion that Xilinx is sinking a ton of resources into >> something that a small minority will ever use. (And will probably not >> last long either). To Xilinx, RTL design is dead... >> >> --Mark > > I wish they would just focus all their effort on the synthesizer and > placer. The chips get better and better, but the software seems stuck. > I think the high-level tools are not for serious users. You can only > use them if you don't care about clock speed, and if you don't care > about clock speed, you should be using a processor or something. Maybe if the synthesizer got better the demand for hugely fast chips would go down, and thus they'd shoot themselves in the foot -- at least from their perspective. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!Article: 159479
In article <c5206719-b91e-43e5-94ef-dfc84a49d62a@googlegroups.com>,
Kevin Neilson <kevin.neilson@xilinx.com> wrote:
>> I actually came back a bit let down from a recent Xilinx user's meeting at just how
>> much focus Xilinx is putting on their 'high level' tools.  I'm of the opinion that
>> Xilinx is sinking a ton of resources into something that a small minority will
>> ever use.  (And will probably not last long either).  To Xilinx, RTL design is
>> dead...
>>
>> --Mark
>
>I wish they would just focus all their effort on the synthesizer and placer.  The chips
>get better and better, but the software seems stuck.  I think the high-level tools are
>not for serious users.  You can only use them if you don't care about clock speed, and
>if you don't care about clock speed, you should be using a processor or something.

Agreement.  Add value where you add value - in your core competencies.

Xilinx adds value here - they design some kick ass technologies, in some very
tough geometries.  They have some excellent experts in a wide breadth of
technologies, who can help you design and debug some of the most advanced
designs.  They add value in their software back end tools which must map to
this technology.  They have great reference designs, and documentation.

They don't add value in the front end.  They're trying to solve a difficult
problem that's been around for 20 years, that's vexed an entire EDA software
industry.

Learn from the ASIC guys here.  ASIC companies punted on their "special sauce"
in-house SW 20 years ago, before they got wise and let the EDA industry do its
job.  FPGA needs to do the same now.

I'm actually of the opinion that they should punt on synthesis too.  Focus on
the back end.  I doubt it'll happen - folks are too used to the idea of "free"
EDA tools from the FPGA vendors.

Regards,

Mark

Article: 159480
On 21/11/16 20:19, Tim Wescott wrote: > On Mon, 21 Nov 2016 10:07:41 +0000, Tom Gardner wrote: > >> On 20/11/16 22:43, Tim Wescott wrote: >>> On Sat, 19 Nov 2016 14:15:18 -0800, Kevin Neilson wrote: >>> >>>> Here's an interesting synthesis result. I synthesized this with >>>> Vivado for Virtex-7: >>>> >>>> reg [68:0] x; >>>> reg x_neq_0; >>>> always@(posedge clk) x_neq_0 <= x!=0; // version 1 >>>> >>>> Then I rephrased the logic: >>>> >>>> reg [68:0] x; >>>> reg x_neq_0; >>>> always@(posedge clk) x_neq_0 <= |x; // version 2 >>>> >>>> These should be the same, right? >>>> >>>> Version 1 uses 23 3-input LUTs on the first level followed by a >>>> 23-long carry chain (6 CARRY4 blocks). This is twice as big as it >>>> should be. >>>> >>>> Version 2 is 3 levels of LUTs, 12 6-input LUTs on the first level, 15 >>>> total. >>>> >>>> Neither is optimal. What I really want is a combination, 12 6-input >>>> LUTs followed by 3 CARRY4s. >>>> >>>> This is supposed to be the era of high-level synthesis... >>> >>> I'm not enough of an FPGA guy to make really deep comments, but this >>> looks like the state of C compilers about 20 or so years ago. When I >>> started coding in C one had to write the code with an eye to the >>> assembly that the thing was spitting out. Now, if you've got a good >>> optimizer (and the gnu C optimizer is better than I am on all but a >>> very few of the processors I've worked with recently), you just express >>> your intent and the compiler makes it happen most efficiently. >>> >>> Clearly, that's not yet the case, at least for that particular >>> synthesis tool. It's a pity. >> >> Of course sometimes you don't want optimisation. Consider, for example, >> bridging terms in an asynchronous circuit. > > OK. I give up -- what do you mean by "bridging terms"? https://en.wikipedia.org/wiki/Karnaugh_map#Race_hazards It is called a bridging term since it is a logically redundant term that straddles two required minterms. 
Its purpose is to remove static hazards (glitches) that can occur when
inputs change, typically when there are unequal propagation delays inside
the implementation.

> In general, I would say that if this is an issue, then (as with the
> 'volatile' and 'mutable' keywords in C++), there should be a way in the
> language to express your intent to the synthesizer -- either a way to say
> "don't optimize this section", or a way to say "keep this signal no
> matter what", or a syntax that lets you lay down literal hardware, etc.

It only occurs in asynchronous circuits; the <ahem> workaround is to only
have synchronous designs and implementations.

Article: 159481
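The bridging term described in the post above can be illustrated with the textbook 2:1 multiplexer, f = a·s' + b·s, whose consensus term a·b is logically redundant. The following is only a minimal sketch: the function names are hypothetical, and the "late inverter" is modelled crudely by passing s and its complement as separate arguments, which is an assumption about how the hazard arises, not a timing simulation.

```python
from itertools import product


def mux(a, b, s):
    """Minimal sum-of-products 2:1 mux: f = a*s' + b*s."""
    return bool((a and not s) or (b and s))


def mux_with_bridge(a, b, s):
    """Same function with the redundant bridging (consensus) term a*b added."""
    return bool((a and not s) or (b and s) or (a and b))


def logically_equivalent():
    """Check all 8 input combinations: the bridging term changes nothing."""
    return all(mux(a, b, s) == mux_with_bridge(a, b, s)
               for a, b, s in product([False, True], repeat=3))


def sop_with_skew(a, b, s, s_n, bridge):
    """Evaluate the AND-OR network with s and s' supplied independently.

    Passing s=False, s_n=False models the instant after s falls but before
    the inverter output s' has risen (assumed unequal propagation delay).
    """
    out = (a and s_n) or (b and s)
    if bridge:
        out = out or (a and b)
    return bool(out)


if __name__ == "__main__":
    print("equivalent:", logically_equivalent())
    print("without bridge during skew:", sop_with_skew(True, True, False, False, False))
    print("with bridge during skew:   ", sop_with_skew(True, True, False, False, True))
```

With a = b = 1 the output should stay high while s changes; in this model the unbridged network drops low during the skew window, while the bridged one holds. An optimizer that removes logically redundant terms would remove exactly that protection.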
On 21/11/16 20:47, GaborSzakacs wrote: > Tim Wescott wrote: >> On Mon, 21 Nov 2016 10:07:41 +0000, Tom Gardner wrote: >> >>> On 20/11/16 22:43, Tim Wescott wrote: >>>> On Sat, 19 Nov 2016 14:15:18 -0800, Kevin Neilson wrote: >>>> >>>>> Here's an interesting synthesis result. I synthesized this with >>>>> Vivado for Virtex-7: >>>>> >>>>> reg [68:0] x; >>>>> reg x_neq_0; >>>>> always@(posedge clk) x_neq_0 <= x!=0; // version 1 >>>>> >>>>> Then I rephrased the logic: >>>>> >>>>> reg [68:0] x; >>>>> reg x_neq_0; >>>>> always@(posedge clk) x_neq_0 <= |x; // version 2 >>>>> >>>>> These should be the same, right? >>>>> >>>>> Version 1 uses 23 3-input LUTs on the first level followed by a >>>>> 23-long carry chain (6 CARRY4 blocks). This is twice as big as it >>>>> should be. >>>>> >>>>> Version 2 is 3 levels of LUTs, 12 6-input LUTs on the first level, 15 >>>>> total. >>>>> >>>>> Neither is optimal. What I really want is a combination, 12 6-input >>>>> LUTs followed by 3 CARRY4s. >>>>> >>>>> This is supposed to be the era of high-level synthesis... >>>> I'm not enough of an FPGA guy to make really deep comments, but this >>>> looks like the state of C compilers about 20 or so years ago. When I >>>> started coding in C one had to write the code with an eye to the >>>> assembly that the thing was spitting out. Now, if you've got a good >>>> optimizer (and the gnu C optimizer is better than I am on all but a >>>> very few of the processors I've worked with recently), you just express >>>> your intent and the compiler makes it happen most efficiently. >>>> >>>> Clearly, that's not yet the case, at least for that particular >>>> synthesis tool. It's a pity. >>> Of course sometimes you don't want optimisation. Consider, for example, >>> bridging terms in an asynchronous circuit. >> >> OK. I give up -- what do you mean by "bridging terms"? 
>> >> In general, I would say that if this is an issue, then (as with the 'volatile' >> and 'mutable' keywords in C++), there should be a way in the language to >> express your intent to the synthesizer -- either a way to say "don't optimize >> this section", or a way to say "keep this signal no matter what", or a syntax >> that lets you lay down literal hardware, etc. >> > > Bridging terms refers to terms that cover transitions in an asynchronous > sequential circuit. Xilinx tools specifically do not honor this sort of > logic and it really has no business in their FPGA's. However, if you > insist on generating asynchronous sequential logic in a Xilinx FPGA, you > will need to instantiate LUTs to get the coverage you're looking for. Agreed. You will probably also have to nail down the LUTs and the signal routing. I suspect that, since Xilinx has a very good range of I/O primitives, there really isn't any benefit to full async design in their FPGAs.Article: 159482
On 22/11/16 00:33, Tim Wescott wrote:
> On Mon, 21 Nov 2016 14:51:13 -0800, Kevin Neilson wrote:
>
>>> I actually came back a bit let down from a recent Xilinx user's meeting
>>> at just how much focus Xilinx is putting on their 'high level' tools.
>>> I'm of the opinion that Xilinx is sinking a ton of resources into
>>> something that a small minority will ever use.  (And will probably not
>>> last long either).  To Xilinx, RTL design is dead...
>>>
>>> --Mark
>>
>> I wish they would just focus all their effort on the synthesizer and
>> placer.  The chips get better and better, but the software seems stuck.
>> I think the high-level tools are not for serious users.  You can only
>> use them if you don't care about clock speed, and if you don't care
>> about clock speed, you should be using a processor or something.
>
> Maybe if the synthesizer got better the demand for hugely fast chips
> would go down, and thus they'd shoot themselves in the foot -- at least
> from their perspective.

Synthesis is easy. Place and route is hard. A big question is how to
either decouple or integrate them.

Particularly when you see the size of the big Xilinx chips and consider
the relative time taken to get across the chip and through a single LUT
(and then through the integrated ARM cores :) )

But I suspect I'm close to teaching you how to suck eggs :)

Article: 159483
On Tue, 22 Nov 2016 01:33:12 +0000, Tom Gardner wrote: > On 22/11/16 00:33, Tim Wescott wrote: >> On Mon, 21 Nov 2016 14:51:13 -0800, Kevin Neilson wrote: >> >>>> I actually came back a bit let down from a recent Xilinx user's >>>> meeting at just how much focus Xilinx is putting on their 'high >>>> level' tools. I'm of the opinion that Xilinx is sinking a ton of >>>> resources into something that a small minority will ever use. (And >>>> will probably not last long either). To Xilinx, RTL design is >>>> dead... >>>> >>>> --Mark >>> >>> I wish they would just focus all their effort on the synthesizer and >>> placer. The chips get better and better, but the software seems >>> stuck. >>> I think the high-level tools are not for serious users. You can only >>> use them if you don't care about clock speed, and if you don't care >>> about clock speed, you should be using a processor or something. >> >> Maybe if the synthesizer got better the demand for hugely fast chips >> would go down, and thus they'd shoot themselves in the foot -- at least >> from their perspective. > > Synthesis is easy. Place and route is hard. A big question is how to > either decouple or integrate the them. > > Particularly when you see the size of the big Xilinx chips and consider > the relative time taken to get across the chip and through a single LUT > (and then through the integrated ARM cores :) ) > > But I suspect I'm close to teaching you how to suck eggs :) Nah -- about the teaching me to suck eggs part, at least. I understand the principles involved, but it's not something I've ever done. Assuming that people know what the hell they're doing it can't be an easy problem, because it hasn't been fully solved. At least -- to my knowledge the process is still an iterative one that's at least partially based on some sort of a pseudo-random process (presumably simulated annealing). 
-- Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!Article: 159484
On 11/21/2016 3:47 PM, GaborSzakacs wrote: > Tim Wescott wrote: >> On Mon, 21 Nov 2016 10:07:41 +0000, Tom Gardner wrote: >> >>> On 20/11/16 22:43, Tim Wescott wrote: >>>> On Sat, 19 Nov 2016 14:15:18 -0800, Kevin Neilson wrote: >>>> >>>>> Here's an interesting synthesis result. I synthesized this with >>>>> Vivado for Virtex-7: >>>>> >>>>> reg [68:0] x; >>>>> reg x_neq_0; >>>>> always@(posedge clk) x_neq_0 <= x!=0; // version 1 >>>>> >>>>> Then I rephrased the logic: >>>>> >>>>> reg [68:0] x; >>>>> reg x_neq_0; >>>>> always@(posedge clk) x_neq_0 <= |x; // version 2 >>>>> >>>>> These should be the same, right? >>>>> >>>>> Version 1 uses 23 3-input LUTs on the first level followed by a >>>>> 23-long carry chain (6 CARRY4 blocks). This is twice as big as it >>>>> should be. >>>>> >>>>> Version 2 is 3 levels of LUTs, 12 6-input LUTs on the first level, 15 >>>>> total. >>>>> >>>>> Neither is optimal. What I really want is a combination, 12 6-input >>>>> LUTs followed by 3 CARRY4s. >>>>> >>>>> This is supposed to be the era of high-level synthesis... >>>> I'm not enough of an FPGA guy to make really deep comments, but this >>>> looks like the state of C compilers about 20 or so years ago. When I >>>> started coding in C one had to write the code with an eye to the >>>> assembly that the thing was spitting out. Now, if you've got a good >>>> optimizer (and the gnu C optimizer is better than I am on all but a >>>> very few of the processors I've worked with recently), you just express >>>> your intent and the compiler makes it happen most efficiently. >>>> >>>> Clearly, that's not yet the case, at least for that particular >>>> synthesis tool. It's a pity. >>> Of course sometimes you don't want optimisation. Consider, for example, >>> bridging terms in an asynchronous circuit. >> >> OK. I give up -- what do you mean by "bridging terms"? 
>> >> In general, I would say that if this is an issue, then (as with the >> 'volatile' and 'mutable' keywords in C++), there should be a way in >> the language to express your intent to the synthesizer -- either a way >> to say "don't optimize this section", or a way to say "keep this >> signal no matter what", or a syntax that lets you lay down literal >> hardware, etc. >> > > Bridging terms refers to terms that cover transitions in an asynchronous > sequential circuit. Xilinx tools specifically do not honor this sort of > logic and it really has no business in their FPGA's. However, if you > insist on generating asynchronous sequential logic in a Xilinx FPGA, you > will need to instantiate LUTs to get the coverage you're looking for. Xilinx parts do not require bridging terms. If two canonical terms, adjacent in the Karnaugh map, are set to the same value in the LUT there is no glitch if a single input transitions from one term to another. This is because they use transmission gates for the multiplexer and there is enough capacitance to hold a signal on the output if neither signals are driving the output as the switches transition. If you think about it just a bit, you will realize most FPGA LUTs only have canonical product terms and so can't have "cover terms" or "bridging terms". -- Rick CArticle: 159485
On 22/11/16 01:50, Tim Wescott wrote: > On Tue, 22 Nov 2016 01:33:12 +0000, Tom Gardner wrote: > >> On 22/11/16 00:33, Tim Wescott wrote: >>> On Mon, 21 Nov 2016 14:51:13 -0800, Kevin Neilson wrote: >>> >>>>> I actually came back a bit let down from a recent Xilinx user's >>>>> meeting at just how much focus Xilinx is putting on their 'high >>>>> level' tools. I'm of the opinion that Xilinx is sinking a ton of >>>>> resources into something that a small minority will ever use. (And >>>>> will probably not last long either). To Xilinx, RTL design is >>>>> dead... >>>>> >>>>> --Mark >>>> >>>> I wish they would just focus all their effort on the synthesizer and >>>> placer. The chips get better and better, but the software seems >>>> stuck. >>>> I think the high-level tools are not for serious users. You can only >>>> use them if you don't care about clock speed, and if you don't care >>>> about clock speed, you should be using a processor or something. >>> >>> Maybe if the synthesizer got better the demand for hugely fast chips >>> would go down, and thus they'd shoot themselves in the foot -- at least >>> from their perspective. >> >> Synthesis is easy. Place and route is hard. A big question is how to >> either decouple or integrate the them. >> >> Particularly when you see the size of the big Xilinx chips and consider >> the relative time taken to get across the chip and through a single LUT >> (and then through the integrated ARM cores :) ) >> >> But I suspect I'm close to teaching you how to suck eggs :) > > Nah -- about the teaching me to suck eggs part, at least. I understand > the principles involved, but it's not something I've ever done. > > Assuming that people know what the hell they're doing it can't be an easy > problem, because it hasn't been fully solved. At least -- to my > knowledge the process is still an iterative one that's at least partially > based on some sort of a pseudo-random process (presumably simulated > annealing). 
I'm sure heuristics are involved, of course, but even
they will only get you so far.

From memory, a CLB "gate" delay is of the order of 100ps
and it can take ~1ns for a logic signal to cross the chip
(clocks can be a bit faster due to dedicated drivers
and tracks). Even a "global reset" becomes a heretical
concept.

Now, what delay should you guess a particular gate+track
will have, and where should you place it? Ditto the
100,000 others - to maximise the clock rate of the
ensemble.

As you might guess, the workflow is
  1 design
  2 synthesise (from RTL/behavioural/system design)
  3 simulate, to get an idea of speed
  4 place and route
  5 simulate, with "actual" delays
  6 utter expletive deleteds
  7 goto 1

Yes, there are many means to constrain the designs
and help the place and route, from specifying which
timings matter to nailing down functions in individual
LUT/CLBs. But they only go so far.

Article: 159486
> I'm sure heuristics are involved, of course, but even
> they will only get you so far.
>
> From memory, a CLB "gate" delay is of the order of 100ps
> and it can take ~1ns for a logic signal to cross the chip
> (clocks can be a bit faster due to dedicated drivers
> and tracks). Even a "global reset" becomes a heretical
> concept.
>

In the part I'm using, LUT delays are 43 ps and net delays between them can
easily be 1 ns.  I'm looking at a net segment now that is 950ps and it looks
like it only goes about 3% the width of the die.  It's short.  (It does go
across an IOB column, which is probably part of the problem.)  The heuristics
in the synthesizer seem to dislike using MUXF7s and MUXCYs, even though they
have dedicated routing, because the LUT delay is only 43ps and that makes it
look good.  But when the route to it is >500ps, the advantage is lost.

These are nice chips, but the synthesizer is still weak.  And it seems odd that
a slight rephrasing resulting in an equivalent Boolean expression would yield
an entirely different synthesis result.

Article: 159487
Rob Gaddi <rgaddi@highlandtechnology.invalid> wrote:
> Xilinx ISE, Xilinx Vivado, and Altera Quartus all work under Linux.
> They can all be touchy about the exact Linux they're running under,
> which is why I've got them all walled off in VMs.  Vivado I run under
> Ubuntu, Quartus on CentOS.

Usually they'll work, but it needs some shared library package installing
and it takes a little bit of fiddling to work out what that is for your
distro.  Google is your friend here, or else ldd.

I have managed to run most EDA tools (Xilinx, Altera, Cadence, Mentor,
Synopsys) under Ubuntu with a little fiddling, even though vendors typically
only support RHEL.

A VM is the simplest way, though not always quickest (both in runtime and
installation).  Or run everything under CentOS, since vendors like RHEL.

The accretion of mine and others' notes from our local installations are here:
http://www.wiki.cl.cam.ac.uk/rowiki/CompArch/EDA
however they are long overdue for a tidy (and removal of decade-old stuff)

Theo

Article: 159488
On 11/21/16 5:07 AM, Tom Gardner wrote: > On 20/11/16 22:43, Tim Wescott wrote: >> On Sat, 19 Nov 2016 14:15:18 -0800, Kevin Neilson wrote: >> >>> Here's an interesting synthesis result. I synthesized this with Vivado >>> for Virtex-7: >>> >>> reg [68:0] x; >>> reg x_neq_0; >>> always@(posedge clk) x_neq_0 <= x!=0; // version 1 >>> >>> Then I rephrased the logic: >>> >>> reg [68:0] x; >>> reg x_neq_0; >>> always@(posedge clk) x_neq_0 <= |x; // version 2 >>> >>> These should be the same, right? >>> >>> Version 1 uses 23 3-input LUTs on the first level followed by a 23-long >>> carry chain (6 CARRY4 blocks). This is twice as big as it should be. >>> >>> Version 2 is 3 levels of LUTs, 12 6-input LUTs on the first level, 15 >>> total. >>> >>> Neither is optimal. What I really want is a combination, 12 6-input >>> LUTs followed by 3 CARRY4s. >>> >>> This is supposed to be the era of high-level synthesis... >> >> I'm not enough of an FPGA guy to make really deep comments, but this >> looks like the state of C compilers about 20 or so years ago. When I >> started coding in C one had to write the code with an eye to the assembly >> that the thing was spitting out. Now, if you've got a good optimizer >> (and the gnu C optimizer is better than I am on all but a very few of the >> processors I've worked with recently), you just express your intent and >> the compiler makes it happen most efficiently. >> >> Clearly, that's not yet the case, at least for that particular synthesis >> tool. It's a pity. > > Of course sometimes you don't want optimisation. > Consider, for example, bridging terms in an asynchronous > circuit. > If you are thinking in terms of an AND-OR tree for the typical LUT based FPGA, you aren't going to get it right. Most FPGA's now use the LUT, which, at least for a single LUT, are normally guaranteed to be glitch free for single line transitions (so no need for the bridging terms). 
If you need more inputs than a single LUT provides, and you need the glitch
free performance, then trying to force a massive AND-OR tree is normally going
to be very inefficient, and I find it worth building the exact structure I need
with the Low Level, vendor provided fundamental LUT/Carry primitives.

Article: 159489
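The LUT and CARRY4 counts quoted from Kevin Neilson's post in this thread (a 69-bit compare against zero) can be reproduced with simple ceiling arithmetic. This is a rough sketch only: the function names are hypothetical, and it assumes ideal packing, one carry-chain position per first-level LUT output, and four positions per CARRY4. It says nothing about what any particular synthesizer will actually emit.

```python
from math import ceil


def lut_tree_levels(width, lut_inputs=6):
    """LUTs needed at each level of a pure LUT tree that OR-reduces `width` bits."""
    levels = []
    signals = width
    while signals > 1:
        signals = ceil(signals / lut_inputs)
        levels.append(signals)
    return levels


def luts_plus_carry(width, lut_inputs=6, carry_width=4):
    """First-level LUTs feeding a carry chain, and the CARRY4 blocks needed.

    Assumes each first-level LUT output occupies one carry-chain position.
    """
    first_level = ceil(width / lut_inputs)
    carry_blocks = ceil(first_level / carry_width)
    return first_level, carry_blocks


if __name__ == "__main__":
    print("version 2, LUT tree:", lut_tree_levels(69), "total", sum(lut_tree_levels(69)))
    print("version 1, 3-input LUTs + carry:", luts_plus_carry(69, lut_inputs=3))
    print("wanted, 6-input LUTs + carry:   ", luts_plus_carry(69, lut_inputs=6))
```

Under those assumptions the numbers line up with the thread: 12 + 2 + 1 = 15 LUTs for the three-level tree, 23 LUTs plus 6 CARRY4s for version 1, and 12 LUTs plus 3 CARRY4s for the structure Kevin says he wants.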
On 23/11/16 16:33, Richard Damon wrote: > On 11/21/16 5:07 AM, Tom Gardner wrote: >> On 20/11/16 22:43, Tim Wescott wrote: >>> On Sat, 19 Nov 2016 14:15:18 -0800, Kevin Neilson wrote: >>> >>>> Here's an interesting synthesis result. I synthesized this with Vivado >>>> for Virtex-7: >>>> >>>> reg [68:0] x; >>>> reg x_neq_0; >>>> always@(posedge clk) x_neq_0 <= x!=0; // version 1 >>>> >>>> Then I rephrased the logic: >>>> >>>> reg [68:0] x; >>>> reg x_neq_0; >>>> always@(posedge clk) x_neq_0 <= |x; // version 2 >>>> >>>> These should be the same, right? >>>> >>>> Version 1 uses 23 3-input LUTs on the first level followed by a 23-long >>>> carry chain (6 CARRY4 blocks). This is twice as big as it should be. >>>> >>>> Version 2 is 3 levels of LUTs, 12 6-input LUTs on the first level, 15 >>>> total. >>>> >>>> Neither is optimal. What I really want is a combination, 12 6-input >>>> LUTs followed by 3 CARRY4s. >>>> >>>> This is supposed to be the era of high-level synthesis... >>> >>> I'm not enough of an FPGA guy to make really deep comments, but this >>> looks like the state of C compilers about 20 or so years ago. When I >>> started coding in C one had to write the code with an eye to the assembly >>> that the thing was spitting out. Now, if you've got a good optimizer >>> (and the gnu C optimizer is better than I am on all but a very few of the >>> processors I've worked with recently), you just express your intent and >>> the compiler makes it happen most efficiently. >>> >>> Clearly, that's not yet the case, at least for that particular synthesis >>> tool. It's a pity. >> >> Of course sometimes you don't want optimisation. >> Consider, for example, bridging terms in an asynchronous >> circuit. >> > > If you are thinking in terms of an AND-OR tree for the typical LUT based FPGA, > you aren't going to get it right. 
Most FPGA's now use the LUT, which, at least
> for a single LUT, are normally guaranteed to be glitch free for single line
> transitions (so no need for the bridging terms). If you need more inputs than a
> single LUT provides, and you need the glitch free performance, then trying
> to force a massive AND-OR tree is normally going to be very inefficient, and I
> find it worth building the exact structure I need with the Low Level, vendor
> provided fundamental LUT/Carry primitives.

Agreed.

Article: 159490
I use a Lattice LFXP3C-3TN100C on a production board that has been made
for several years with quantities in the thousands.  The HW-USBN-2A JTAG
programmer typically works without complaint, both of them.  One is a
Lattice unit and the other is a Chinese knockoff.

We are trying to finish a run of 600+ units and most have completed
testing, but we have around 60 that can not be programmed.  We get an
error that the device ID can't be verified.  The value returned is
0xFFFFFFFE while the expected value is 0x01255043.  I would say there is
a problem with a bad trace, but then I would expect *all* F's rather
than just one bit being a zero.

I've traced the signals and they are getting where they belong.  Signals
TDI and TMS are pulled up to 3.3 volts while TCK is not and only
reaches 2.0 volts.  I can see transitions on TDI with pulses in the
microsecond range.  This is the same with working boards.

These chips only need Vcc, PROG_N high and the four JTAG signals, TMS,
TCK, TDO and TDI to be connected in order to program them.  We seem to
have all that.

Any ideas on what to check?  The test fixture will program a good board
just fine.  But these 60 units can't even pass chip ID verification.  I
think I'm ready to replace the FPGA on one of them.

-- 

Rick C

Article: 159491
Den onsdag den 23. november 2016 kl. 19.51.31 UTC+1 skrev rickman: > I use a Lattice LFXP3C-3TN100C on a production board that has been made > for several years with quantities in the thousands. The HW-USBN-2A JTAG > programmer typically works without complaint, both of them. One is a > Lattice unit and the other is a Chinese knockoff. > > We are trying to finish a run of 600+ units and most have completed > testing, but we have around 60 that can not be programmed. We get an > error that the device ID can't be verified. The value returned is > 0xFFFFFFFE while the expected value is 0x01255043. I would say there is > a problem with a bad trace, but then I would expect *all* F's rather > than just one bit being a zero. > > I've traced the signals and they are getting where they belong. Signals > TDI and TMS are pulled up to 3.3 volts while TCK is not and only > reaches 2.0 volts. I can see transitions on TDI with pulses in the > microsecond range. This is the same with working boards. > > These chips only need Vcc, PROG_N high and the four JTAG signals, TMS, > TCK, TDO and TDI to be connected in order to program them. We seem to > have all that. > > Any ideas on what to check? The test fixture will program a good board > just fine. But these 60 units can't even pass chip ID verification. I > think I'm ready to replace the FPGA on one of them. > pull-down or series resistors of the wrong value?Article: 159492
On Wednesday, November 23, 2016 at 1:51:31 PM UTC-5, rickman wrote: > I use a Lattice LFXP3C-3TN100C on a production board that has been made > for several years with quantities in the thousands. The HW-USBN-2A JTAG > programmer typically works without complaint, both of them. One is a > Lattice unit and the other is a Chinese knockoff. > > We are trying to finish a run of 600+ units and most have completed > testing, but we have around 60 that can not be programmed. We get an > error that the device ID can't be verified. The value returned is > 0xFFFFFFFE while the expected value is 0x01255043. The value 0xfffffffe is -2 in two's complement. It might be indicating an error code you can look up. > I would say there is > a problem with a bad trace, but then I would expect *all* F's rather > than just one bit being a zero. > > I've traced the signals and they are getting where they belong. Signals > TDI and TMS are pulled up to 3.3 volts while TCK is not and only > reaches 2.0 volts. I can see transitions on TDI with pulses in the > microsecond range. This is the same with working boards. > > These chips only need Vcc, PROG_N high and the four JTAG signals, TMS, > TCK, TDO and TDI to be connected in order to program them. We seem to > have all that. > > Any ideas on what to check? The test fixture will program a good board > just fine. But these 60 units can't even pass chip ID verification. I > think I'm ready to replace the FPGA on one of them. Best regards, Rick C. HodginArticle: 159493
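The observation that 0xfffffffe reads as -2 is easy to check, as is the original poster's point that the value differs from all ones in exactly one bit. A small sketch follows; the helper names are hypothetical, and whether the programming software really uses -2 as an error code is not something the thread establishes.

```python
def as_signed32(value):
    """Interpret a 32-bit word as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def differing_bits(a, b):
    """Count the bit positions in which two 32-bit words differ."""
    return bin((a ^ b) & 0xFFFFFFFF).count("1")


if __name__ == "__main__":
    readback = 0xFFFFFFFE
    print(as_signed32(readback))                 # signed interpretation
    print(differing_bits(readback, 0xFFFFFFFF))  # distance from "all F's"
    print(differing_bits(readback, 0x01255043))  # distance from expected ID
```

Only the least significant bit separates the readback from all ones, so a signed error code and a nearly floating TDO line are both consistent with the number on its own.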
On 11/23/2016 1:58 PM, lasselangwadtchristensen@gmail.com wrote: > On Wednesday, 23 November 2016 at 19:51:31 UTC+1, rickman wrote: >> I use a Lattice LFXP3C-3TN100C on a production board that has been made >> for several years with quantities in the thousands. The HW-USBN-2A JTAG >> programmer typically works without complaint, both of them. One is a >> Lattice unit and the other is a Chinese knockoff. >> >> We are trying to finish a run of 600+ units and most have completed >> testing, but we have around 60 that can not be programmed. We get an >> error that the device ID can't be verified. The value returned is >> 0xFFFFFFFE while the expected value is 0x01255043. I would say there is >> a problem with a bad trace, but then I would expect *all* F's rather >> than just one bit being a zero. >> >> I've traced the signals and they are getting where they belong. Signals >> TDI and TMS are pulled up to 3.3 volts while TCK is not and only >> reaches 2.0 volts. I can see transitions on TDI with pulses in the >> microsecond range. This is the same with working boards. >> >> These chips only need Vcc, PROG_N high and the four JTAG signals, TMS, >> TCK, TDO and TDI to be connected in order to program them. We seem to >> have all that. >> >> Any ideas on what to check? The test fixture will program a good board >> just fine. But these 60 units can't even pass chip ID verification. I >> think I'm ready to replace the FPGA on one of them. >> > > pull-down or series resistors of the wrong value? We have checked everything we can think of. Even the one signal at 2.0 volts works with many other boards... Maybe I'll have them add a pullup to that one signal. Thanks for the idea. -- Rick CArticle: 159494
On Wed, 23 Nov 2016 13:51:30 -0500, rickman wrote: > I use a Lattice LFXP3C-3TN100C on a production board that has been made > for several years with quantities in the thousands. The HW-USBN-2A JTAG > programmer typically works without complaint, both of them. One is a > Lattice unit and the other is a Chinese knockoff. > > We are trying to finish a run of 600+ units and most have completed > testing, but we have around 60 that can not be programmed. We get an > error that the device ID can't be verified. The value returned is > 0xFFFFFFFE while the expected value is 0x01255043. I would say there is > a problem with a bad trace, but then I would expect *all* F's rather > than just one bit being a zero. > > I've traced the signals and they are getting where they belong. Signals > TDI and TMS are pulled up to 3.3 volts while TCK is not and only reaches > 2.0 volts. I can see transitions on TDI with pulses in the microsecond > range. This is the same with working boards. > > These chips only need Vcc, PROG_N high and the four JTAG signals, TMS, > TCK, TDO and TDI to be connected in order to program them. We seem to > have all that. > > Any ideas on what to check? The test fixture will program a good board > just fine. But these 60 units can't even pass chip ID verification. I > think I'm ready to replace the FPGA on one of them. Have you looked at the TDO line while the thing is ID-ing the chip? Are you getting something that looks like 0x01255043 in there, or all ones? If you can get your hands on one good and some bad boards you should learn something. An exponentially decaying high voltage might come across as 0xfffffffe, but I would expect that it would then sometimes come across as all 'f', or sometimes as 0xfffffffc, 0xfffffff8, etc. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!Article: 159495
On Wednesday, 23 November 2016 at 20:56:00 UTC+1, Tim Wescott wrote: > On Wed, 23 Nov 2016 13:51:30 -0500, rickman wrote: > > > I use a Lattice LFXP3C-3TN100C on a production board that has been made > > for several years with quantities in the thousands. The HW-USBN-2A JTAG > > programmer typically works without complaint, both of them. One is a > > Lattice unit and the other is a Chinese knockoff. > > > > We are trying to finish a run of 600+ units and most have completed > > testing, but we have around 60 that can not be programmed. We get an > > error that the device ID can't be verified. The value returned is > > 0xFFFFFFFE while the expected value is 0x01255043. I would say there is > > a problem with a bad trace, but then I would expect *all* F's rather > > than just one bit being a zero. > > > > I've traced the signals and they are getting where they belong. Signals > > TDI and TMS are pulled up to 3.3 volts while TCK is not and only reaches > > 2.0 volts. I can see transitions on TDI with pulses in the microsecond > > range. This is the same with working boards. > > > > These chips only need Vcc, PROG_N high and the four JTAG signals, TMS, > > TCK, TDO and TDI to be connected in order to program them. We seem to > > have all that. > > > > Any ideas on what to check? The test fixture will program a good board > > just fine. But these 60 units can't even pass chip ID verification. I > > think I'm ready to replace the FPGA on one of them. > > Have you looked at the TDO line while the thing is ID-ing the chip? Are > you getting something that looks like 0x01255043 in there, or all ones? > If you can get your hands on one good and some bad boards you should > learn something. > > An exponentially decaying high voltage might come across as 0xfffffffe, > but I would expect that it would then sometimes come across as all 'f', > or sometimes as 0xfffffffc, 0xfffffff8, etc. or rising voltage if the data is sent LSB firstArticle: 159496
> We have checked everything we can think of. Even the one signal at 2.0 > volts works with many other boards... Maybe I'll have them add a pullup > to that one signal. Thanks for the idea. > Have you got the parts from a reliable source? I had a similar issue ("checked everything we can think of", failure rate about 20%) once. After weeks of searching it turned out that we had bought counterfeit parts... (They weren't FPGAs, however, but application processors...) Thomas www.entner-electronics.com - Home of EEBlaster and JPEG-CodecArticle: 159497
On Wed, 23 Nov 2016 13:55:32 -0800, lasselangwadtchristensen wrote: > On Wednesday, 23 November 2016 at 20:56:00 UTC+1, Tim Wescott wrote: >> On Wed, 23 Nov 2016 13:51:30 -0500, rickman wrote: >> >> > I use a Lattice LFXP3C-3TN100C on a production board that has been >> > made for several years with quantities in the thousands. The >> > HW-USBN-2A JTAG programmer typically works without complaint, both of >> > them. One is a Lattice unit and the other is a Chinese knockoff. >> > >> > We are trying to finish a run of 600+ units and most have completed >> > testing, but we have around 60 that can not be programmed. We get an >> > error that the device ID can't be verified. The value returned is >> > 0xFFFFFFFE while the expected value is 0x01255043. I would say there >> > is a problem with a bad trace, but then I would expect *all* F's >> > rather than just one bit being a zero. >> > >> > I've traced the signals and they are getting where they belong. >> > Signals TDI and TMS are pulled up to 3.3 volts while TCK is not and >> > only reaches 2.0 volts. I can see transitions on TDI with pulses in >> > the microsecond range. This is the same with working boards. >> > >> > These chips only need Vcc, PROG_N high and the four JTAG signals, >> > TMS, TCK, TDO and TDI to be connected in order to program them. We >> > seem to have all that. >> > >> > Any ideas on what to check? The test fixture will program a good >> > board just fine. But these 60 units can't even pass chip ID >> > verification. I think I'm ready to replace the FPGA on one of them. >> >> Have you looked at the TDO line while the thing is ID-ing the chip? >> Are you getting something that looks like 0x01255043 in there, or all >> ones? If you can get your hands on one good and some bad boards you >> should learn something. 
>> >> An exponentially decaying high voltage might come across as 0xfffffffe, >> but I would expect that it would then sometimes come across as all 'f', >> or sometimes as 0xfffffffc, 0xfffffff8, etc. > > or rising voltage if the data is sent LSB first Then it would be much easier to believe a reliable 0 LSB with all ones following. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!Article: 159498
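Editorial note: the bit-order argument in this exchange can be made concrete. JTAG data registers are shifted out LSB first, so the first bit sampled on TDO ends up as bit 0 of the assembled word. The sketch below is only an illustration of that point; the function name `assemble` and the idealized "low for one clock, then high" and "high, then low on the last clock" waveforms are assumptions, not measurements from the boards discussed.

```python
def assemble(bits, lsb_first=True):
    """Pack a list of sampled bits (in arrival order) into an integer."""
    value = 0
    n = len(bits)
    for i, bit in enumerate(bits):
        position = i if lsb_first else (n - 1 - i)
        value |= (bit & 1) << position
    return value


# Line reads low on the first clock only, then high (a "rising" line).
rising = [0] + [1] * 31
# Line reads high, then low on the last clock only (a "decaying" line).
decaying = [1] * 31 + [0]

print(hex(assemble(rising, lsb_first=True)))     # 0xfffffffe
print(hex(assemble(rising, lsb_first=False)))    # 0x7fffffff
print(hex(assemble(decaying, lsb_first=False)))  # 0xfffffffe
print(hex(assemble(decaying, lsb_first=True)))   # 0x7fffffff
```

With LSB-first shifting, a single leading zero followed by ones gives exactly 0xFFFFFFFE, which is why a stable zero in the first bit position is the easier explanation to believe.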
On 11/23/2016 2:55 PM, Tim Wescott wrote: > On Wed, 23 Nov 2016 13:51:30 -0500, rickman wrote: > >> I use a Lattice LFXP3C-3TN100C on a production board that has been made >> for several years with quantities in the thousands. The HW-USBN-2A JTAG >> programmer typically works without complaint, both of them. One is a >> Lattice unit and the other is a Chinese knockoff. >> >> We are trying to finish a run of 600+ units and most have completed >> testing, but we have around 60 that can not be programmed. We get an >> error that the device ID can't be verified. The value returned is >> 0xFFFFFFFE while the expected value is 0x01255043. I would say there is >> a problem with a bad trace, but then I would expect *all* F's rather >> than just one bit being a zero. >> >> I've traced the signals and they are getting where they belong. Signals >> TDI and TMS are pulled up to 3.3 volts while TCK is not and only reaches >> 2.0 volts. I can see transitions on TDI with pulses in the microsecond >> range. This is the same with working boards. >> >> These chips only need Vcc, PROG_N high and the four JTAG signals, TMS, >> TCK, TDO and TDI to be connected in order to program them. We seem to >> have all that. >> >> Any ideas on what to check? The test fixture will program a good board >> just fine. But these 60 units can't even pass chip ID verification. I >> think I'm ready to replace the FPGA on one of them. > > Have you looked at the TDO line while the thing is ID-ing the chip? Are > you getting something that looks like 0x01255043 in there, or all ones? > If you can get your hands on one good and some bad boards you should > learn something. > > An exponentially decaying high voltage might come across as 0xfffffffe, > but I would expect that it would then sometimes come across as all 'f', > or sometimes as 0xfffffffc, 0xfffffff8, etc. I may have gotten to the bottom of this... or at least below the knees. 
The results of probing with the scope were a bit inconsistent which may be from not controlling all the variables. But the programmer has a debug mode where you can toggle or set any of the FPGA input lines and get a report of the one output line from the FPGA. After likely chasing my tail for a while I realized the TDI line was not toggling when I expected it to. When I pursued this I found that board has some sort of a short to Vcc on TDI of around 500 ohms. The driver couldn't pull it down at all. After all was said and done this seemed to be a problem on only a single board. I was able to program and test successfully three more boards. Everyone was going home (factory folk work early hours) so I called it a day and asked them to retest all the failed boards. I had been told some failed in programming and some failed the audio testing, so I expect we have some various issues which may or may not all be real. This all is exacerbated by the fact that I spend a large hunk of my time some four hours from the contract manufacturer. When in Maryland, I am only an hour away which isn't so bad. But I'm typically only there on weekends now. With any luck the testing will be completed without significant issues. To Thomas, yes, I thought of the counterfeit issue and asked if they were buying from a reliable source, but I already knew the answer. The FPGAs are EOLed and only available from Arrow now. I have used some 3000 parts since last year and watching the web site inventory numbers, it would appear I am the *only* remaining user of these parts, lol! So I have no doubt they bought them from Arrow who had put in a large order when the EOL announcement came out and are now stuck with 78,000 parts!!! Funny that they seem to be raising the price rather than lowering it. -- Rick CArticle: 159499
On 11/23/2016 6:18 PM, Tim Wescott wrote: > On Wed, 23 Nov 2016 13:55:32 -0800, lasselangwadtchristensen wrote: > >> On Wednesday, 23 November 2016 at 20:56:00 UTC+1, Tim Wescott wrote: >>> On Wed, 23 Nov 2016 13:51:30 -0500, rickman wrote: >>> >>>> I use a Lattice LFXP3C-3TN100C on a production board that has been >>>> made for several years with quantities in the thousands. The >>>> HW-USBN-2A JTAG programmer typically works without complaint, both of >>>> them. One is a Lattice unit and the other is a Chinese knockoff. >>>> >>>> We are trying to finish a run of 600+ units and most have completed >>>> testing, but we have around 60 that can not be programmed. We get an >>>> error that the device ID can't be verified. The value returned is >>>> 0xFFFFFFFE while the expected value is 0x01255043. I would say there >>>> is a problem with a bad trace, but then I would expect *all* F's >>>> rather than just one bit being a zero. >>>> >>>> I've traced the signals and they are getting where they belong. >>>> Signals TDI and TMS are pulled up to 3.3 volts while TCK is not and >>>> only reaches 2.0 volts. I can see transitions on TDI with pulses in >>>> the microsecond range. This is the same with working boards. >>>> >>>> These chips only need Vcc, PROG_N high and the four JTAG signals, >>>> TMS, TCK, TDO and TDI to be connected in order to program them. We >>>> seem to have all that. >>>> >>>> Any ideas on what to check? The test fixture will program a good >>>> board just fine. But these 60 units can't even pass chip ID >>>> verification. I think I'm ready to replace the FPGA on one of them. >>> >>> Have you looked at the TDO line while the thing is ID-ing the chip? >>> Are you getting something that looks like 0x01255043 in there, or all >>> ones? If you can get your hands on one good and some bad boards you >>> should learn something. 
>>> >>> An exponentially decaying high voltage might come across as 0xfffffffe, >>> but I would expect that it would then sometimes come across as all 'f', >>> or sometimes as 0xfffffffc, 0xfffffff8, etc. >> >> or rising voltage if the data is sent LSB first > > Then it would be much easier to believe a reliable 0 LSB with all ones > following. The one board that actually failed programming seems to have a low resistance between the TDI line and Vcc. I assume the TDI line being held high results in this pattern. I'd have to dig into the JTAG spec to see just what is happening, but as it is limited to this one board (so far) I think I can ignore it. It's a shame this product is now EOL itself. There are still plenty of FPGAs available (even though they are EOL'd) and the test process works pretty well most of the time. But then that is life in the engineering game. On to the next product! -- Rick C
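Editorial note: one plausible mechanism for the pattern, offered as a sketch rather than a diagnosis. In IEEE 1149.1 the all-ones instruction is BYPASS, and the one-bit bypass register captures a 0 before shifting. If the programmer shifts its ID-read instruction in through TDI while TDI is stuck high, the device would see all ones and select BYPASS instead; a following 32-bit data scan would then return a single 0 followed by the stuck-high TDI value. That the programmer behaves this way is an assumption (the thread does not confirm it), and the function names below are made up for illustration.

```python
def bypass_dr_scan(tdi_bits):
    """Model a data-register scan through the 1-bit BYPASS register.

    Capture-DR loads 0 into the register; on each shift clock the held bit
    appears on TDO and the current TDI bit is shifted in behind it.
    """
    held = 0                 # value captured in Capture-DR
    tdo_bits = []
    for tdi in tdi_bits:
        tdo_bits.append(held)
        held = tdi
    return tdo_bits


def assemble_lsb_first(bits):
    """Pack bits into an integer, first-arrived bit as bit 0."""
    value = 0
    for i, bit in enumerate(bits):
        value |= (bit & 1) << i
    return value


tdi_stuck_high = [1] * 32    # TDI shorted toward Vcc, as found on the one board
id_read = assemble_lsb_first(bypass_dr_scan(tdi_stuck_high))
print(hex(id_read))          # 0xfffffffe
```

Under these assumptions the model reproduces the reported value exactly, which would also explain why the result was a stable 0xFFFFFFFE rather than a varying pattern.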