I discovered one cause (but not all) of the coredumps I experience. If I have mismatched port widths in a VHDL instantiation, I'll often get coredumps. There is no indication of what is wrong, but now I know what to look for in some cases. I also suffer all kinds of problems when I try to use unconstrained outputs based on unconstrained inputs, to the point where I just have to avoid that feature of VHDL.

I think it'd be great to have a cloud service you could use if you didn't need to use it that often, but I don't know if that would be profitable for Mentor.

Article: 155251
One mistake that is not too hard to make is forgetting to put a synchronizer flop on the input of an edge detector, like you might have on a UART input (so that the edge detector has two flops, total). Depending on the routing delays, this can cause you to miss a sizable percentage of edges. (Not just delayed, but missed completely.) Using only a single flop is sometimes known as using the "greedy path".

(Actually, to mitigate metastability as well, an edge detector ought to have three flops and an AND gate. Using two is sometimes known as using the "sneaky path".)

Article: 155252
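The edge detector described in the post above can be modelled cycle by cycle. The following is a minimal behavioural sketch in Python, not HDL and not anyone's actual UART code; the class name `EdgeDetector` and the `sync_stages` parameter are made up for illustration. With `sync_stages=2` it corresponds to the "three flops and an AND gate" arrangement, with `sync_stages=1` to the two-flop version. It only models ideal registered behaviour, so it cannot show metastability or routing delay.

```python
class EdgeDetector:
    """Cycle-based model of a rising-edge detector fed through sync flops.

    flops[0 .. sync_stages-1] are the synchronizer flops, flops[-1] is the
    history flop. The pulse is (last sync flop) AND NOT (history flop).
    """

    def __init__(self, sync_stages=2):
        # All flops reset to 0 (assumption: input idles low in this model).
        self.flops = [0] * (sync_stages + 1)

    def clock(self, din):
        """Advance one clock edge and return the pulse seen after that edge."""
        # Shift: din enters the first flop, each flop takes its neighbour's old value.
        self.flops = [din] + self.flops[:-1]
        newest, previous = self.flops[-2], self.flops[-1]
        return newest & (1 - previous)


stimulus = [0, 0, 1, 1, 1, 0, 0, 1, 1]

three_flop = EdgeDetector(sync_stages=2)
pulses_three = [three_flop.clock(bit) for bit in stimulus]

two_flop = EdgeDetector(sync_stages=1)
pulses_two = [two_flop.clock(bit) for bit in stimulus]

print(pulses_three)
print(pulses_two)
```

Each rising edge in the stimulus gives exactly one single-cycle pulse; the extra synchronizer stage only delays that pulse by one clock.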
On 6/17/2013 8:14 PM, Rob Gaddi wrote:
> On Mon, 17 Jun 2013 20:00:01 -0400
> rickman <gnuarm@gmail.com> wrote:
>
>> So I finally got around to adding some debug signals which I would
>> monitor on an analyzer and guess what, the bug is gone! I *hate* when
>> that happens. I can change the code so the debug signals only appear
>> when a control register is set to enable them, but still, I don't like
>> this. I want to know what is causing this DURN THING!
>>
>> Anyone see this happen to them before?
>>
>> Oh yeah, someone in another thread (that I can't find, likely because I
>> don't recall the group I posted it in) suggested I add synchronizing FFs
>> to the serial data in. Sure enough I had forgotten to do that. Maybe
>> that was the fix... of course! It wasn't metastability, I bet it was
>> feeding multiple bits of the state machine! Durn, I never make that
>> sort of error. Thanks to whoever it was that suggested the obvious that
>> I had forgotten.
>>
>> --
>>
>> Rick
>
> Not metastability, a race condition. Asynchronous external input
> headed to multiple clocked elements, each of which it reaches via a
> different path with a different delay.
>
> When you added debugging signals you changed the netlist, which changed
> the place and route, making unpredictable changes to those delays.

No, when changing the debug output I added the synchronization FFs which fixed the problem. My point was that when the other poster suggested that I need to sync to the clock I mistook that for metastability, forgetting that the input went to multiple sections of logic. So actually I made the same mistake twice... lol

> In
> this case, it happened to push it into a place where _as far as you
> tested_, it seems happy. But it's still unsafe, because as you change
> other parts of the design, the P&R of that section will still change
> anyhow, and you start getting my favorite situation, the problem that
> comes and goes based on entirely unrelated factors.
>
> The fix you fixed fixes it. When you resynchronized it on the same
> clock as you're running around the rest of the logic, you forced that
> path to become timing constrained. As such, the P&R takes it upon
> itself to make sure that the timing of that route is irrelevant with
> respect to the clock period, and your problem goes away for good.

Just to make sure of what was what (it has been two years since I last worked with this design) I pulled the FFs out and added back just one. Sure enough the bug reappears with no FFs, but goes away with just one. The added debug info available allowed me to see exactly the error and sure enough, when a start bit comes in there is a chance that the two counters are not properly set and the error shows up in the center of the bit where the current contents of the shift register are moved into the holding register as a new char.

I guess what most likely happened is that when I wrote the UART code I assumed the sync FFs would be external and when I wrote the wrapper code I assumed the FFs were inside the UART. In other words, I didn't have a proper spec and never gave this problem proper consideration.

I will revisit this design and look at the other inputs. No reason to assume I didn't make the same mistake elsewhere.

--

Rick

Article: 155253
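The race condition discussed in the post above (one asynchronous input reaching several clocked elements through different routing delays) can be illustrated with a toy timing model. This is a sketch under stated assumptions, not a description of the actual UART: the function names and the delay figures (10 ns clock, 1.2 ns and 2.7 ns routing delays) are hypothetical, the input is an ideal 0-to-1 step, and setup/hold and metastability are ignored.

```python
def captured(t_change, path_delay, clock_edge):
    """Value a flop captures when a 0->1 step leaves the pin at t_change (ps)."""
    return 1 if t_change + path_delay <= clock_edge else 0


def count_disagreements(period, delay_a, delay_b):
    """Sweep the step across one clock period; count positions where two
    flops fed directly from the pin capture different values."""
    return sum(
        captured(t, delay_a, period) != captured(t, delay_b, period)
        for t in range(period)
    )


def count_disagreements_synced(period, sync_delay):
    """Same sweep, but a single sync flop samples the pin and feeds both
    consumers over a timing-constrained path."""
    disagreements = 0
    for t in range(period):
        synced = captured(t, sync_delay, period)
        # Both consumers read the same registered value on the next edge.
        seen_a, seen_b = synced, synced
        disagreements += seen_a != seen_b
    return disagreements


# Hypothetical numbers: 10 ns clock, 1.2 ns and 2.7 ns routing delays.
unsynced = count_disagreements(10_000, 1_200, 2_700)
synced = count_disagreements_synced(10_000, 1_200)
print(unsynced, synced)
```

In this model the two directly fed flops disagree for 1500 of the 10000 step positions, which is just the 1.5 ns difference in delay divided into the 10 ns period. Changing place and route changes the two delays and therefore the size of that window, which matches the "comes and goes" behaviour described in the thread. The synchronized case is zero by construction, since both consumers see one registered value.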
On 6/18/2013 3:16 PM, Kevin Neilson wrote:
> One mistake that is not too hard to make is forgetting to put a synchronizer flop on the input of an edge detector, like you might have on a UART input (so that the edge detector has two flops, total). Depending on the routing delays, this can cause you to miss a sizable percentage of edges. (Not just delayed, but missed completely.) Using only a single flop is sometimes known as using the "greedy path".
>
> (Actually, to mitigate metastability as well, an edge detector ought to have three flops and an AND gate. Using two is sometimes known as using the "sneaky path".)

Everyone is saying the same thing, so I guess I didn't explain clearly. Someone had already pointed out to me that I needed a synchronizer on the received data signal in another thread that I can't find now. I took them at their word, but was thinking they meant it was about metastability which I figured was not a problem at these speeds (yes, the speeds do make a difference for metastability since you never chase it away, you just minimize it). I wasn't thinking about the serial in signal feeding the state machine, just the shift register.

So when I made the changes, which included the synchronizer, it worked. Because I didn't expect the synchronizer to do anything, I had forgotten about it until I was typing the post here. I remembered at the end of the message and realized that was what fixed the problem...

Sorry for the confusion. Still, thanks to all who replied and especially the mystery person who suggested it in the other thread wherever that was.

--

Rick

Article: 155254
On 18/06/2013 23:45, rickman wrote:
> I guess what most likely happened is that when I wrote the UART code I
> assumed the sync FFs would be external and when I wrote the wrapper code
> I assumed the FFs were inside the UART. In other words, I didn't have a
> proper spec and never gave this problem proper consideration.

Several years ago a young engineer reused my long proven UART code and modified it, carelessly removing the synchronizing FF. He came to see me and complained that my UART didn't work; it hung after some unpredictable time. I thought for a few minutes, guessed he had probably removed the FF, and fixed his problem right away.

Nicolas

Article: 155255
That's the same thing that happened to me when I had the problem last. I had an edge detector connected to a big synchronizer module that was in turn connected to all the input pins. When I had problems I looked inside the synchronizer module and found that it didn't have a flop on that line; it was just wired straight through.

Article: 155256
rickman <gnuarm@gmail.com> wrote:
> Everyone is saying the same thing, so I guess I didn't explain clearly.
> Someone had already pointed out to me that I needed a synchronizer on
> the received data signal in another thread that I can't find now. I
> took them at their word, but was thinking they meant it was about
> metastability which I figured was not a problem at these speeds (yes,
> the speeds do make a difference for metastability since you never chase
> it away, you just minimize it). I wasn't thinking about the serial in
> signal feeding the state machine, just the shift register.

There are 3 things that could have gone wrong (and might still be going wrong):

You failed to synchronise between the clock domain of the input serial link and the clock of your system (sounds like you fixed this one)

You failed to constrain the clocks and other inputs so the synthesis tool knows what timing budget it has to meet

You failed timing analysis and didn't notice - in other words the synthesis tool says the design it produced doesn't meet your supplied timing constraints, despite its best efforts. If the failure is small it may still work in some voltage/temperature/silicon situations, but it isn't guaranteed in all cases.

Normally the last one will raise big red flags in the tool, assuming the timing analyser does get run as part of the build. However the first two are easy to overlook and you get no warning from the tools.

Theo

Article: 155257
On 6/18/2013 7:28 AM, Thomas Stanka wrote:
> On 18 Jun., 03:19, phanhuyich <khanhnguyent...@gmail.com> wrote:
>> I am starting to study VHDL. Now, I have to do an exercise with the following content:
>>
>> I have to define an array of 10 elements (8-bit range) ([3,4,2,8,9,0,1,5,7,6] for example). The 10 elements are loaded over 10 clock cycles. The task is to find the maximum and the second maximum in this array after 10 clock cycles.
>> Can anyone show me a method to solve it using VHDL?
>
> No problem. Just write down your solution to that problem in "not
> VHDL". Then ask what part of the algorithm is hard for you to transfer
> into VHDL and why, so we can help.
>
> Hint: it helps to think about the RTL and draw a picture of how the
> data flow might look; then it is easy to write it down in VHDL.
>
> regards Thomas
>

I often try to think of exercises like this in a completely different setting. For example, suppose you have ten ordinary playing cards from a standard deck of 52. You agree that there is a standard order of these cards where the lowest for this exercise is Ace of Clubs, then Ace of Diamonds, Ace of Hearts, Ace of Spades, Two of Clubs, Two of Diamonds, . . . with King of Spades being highest.

Now you're going to flip one card at a time (this is the same way you get your input, one item per clock cycle). With each flip you get to make one decision. For example if you only cared about the highest card of the ten, you could have a single stack where you place the new card if it is higher than the card already showing (or if the stack has no cards), and discard any card that is not higher.

Now my first thought was that you could use this same approach to find the two highest cards, but there's one case where it doesn't work - if the highest card comes out first. Then your stack will only have one card in it so you can't just dig one card down to find the second highest.

So you need to think how you'd arrange cards to be certain to know the top two at the end of the exercise. Then it's a simple matter of translating this procedure to a VHDL process.

Have fun!

--

Gabor

Article: 155258
Gabor wrote:
> On 6/18/2013 7:28 AM, Thomas Stanka wrote:
>> On 18 Jun., 03:19, phanhuyich <khanhnguyent...@gmail.com> wrote:
>>> I am starting to study VHDL. Now, I have to do an exercise with the
>>> following content:
>>>
>>> I have to define an array of 10 elements (8-bit range)
>>> ([3,4,2,8,9,0,1,5,7,6] for example). The 10 elements are loaded
>>> over 10 clock cycles. The task is to find the maximum and the
>>> second maximum in this array after 10 clock cycles.
>>> Can anyone show me a method to solve it using VHDL?
>>
>> No problem. Just write down your solution to that problem in "not
>> VHDL". Then ask what part of the algorithm is hard for you to transfer
>> into VHDL and why, so we can help.
>>
>> Hint: it helps to think about the RTL and draw a picture of how the
>> data flow might look; then it is easy to write it down in VHDL.
>>
>> regards Thomas
>>
> I often try to think of exercises like this in a completely different
> setting. For example, suppose you have ten ordinary playing cards
> from a standard deck of 52. You agree that there is a standard
> order of these cards where the lowest for this exercise is Ace
> of Clubs, then Ace of Diamonds, Ace of Hearts, Ace of Spades, Two
> of Clubs, Two of Diamonds, . . . with King of Spades being highest.
>
> Now you're going to flip one card at a time (this is the same
> way you get your input, one item per clock cycle). With each
> flip you get to make one decision. For example if you only cared
> about the highest card of the ten, you could have a single stack
> where you place the new card if it is higher than the card already
> showing (or if the stack has no cards), and discard any card that
> is not higher.
>
> Now my first thought was that you could use this same approach to
> find the two highest cards, but there's one case where it doesn't
> work - if the highest card comes out first. Then your stack will
> only have one card in it so you can't just dig one card down to
> find the second highest.
>
> So you need to think how you'd arrange cards to be certain to
> know the top two at the end of the exercise. Then it's a simple
> matter of translating this procedure to a VHDL process.
>
> Have fun!
>

I posted that last night when the brain was foggy. I should have said that the simple algorithm won't work for finding the second highest card if the highest card comes before the second highest. In any case you need to think of a good algorithm for finding both.

--

Gabor

Article: 155259
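The failure case in the correction above is easy to reproduce with a small model of the single-stack procedure. This is an illustrative Python sketch of the card analogy only (the function name `single_stack_top_two` is made up), using plain integers in place of cards and the example array from the exercise.

```python
def single_stack_top_two(cards):
    """Keep one stack; push a card only if it beats the card showing.

    Returns (top card, card directly beneath it or None). The card beneath
    the top is what "digging one card down" would give you.
    """
    stack = []
    for card in cards:
        if not stack or card > stack[-1]:
            stack.append(card)
    best = stack[-1]
    runner_up = stack[-2] if len(stack) > 1 else None
    return best, runner_up


# Order from the exercise: 8 arrives before 9, so the stack happens to work.
lucky = single_stack_top_two([3, 4, 2, 8, 9, 0, 1, 5, 7, 6])

# Same values, but 9 arrives before 8: the 8 is discarded.
unlucky = single_stack_top_two([3, 9, 8, 4, 2, 0, 1, 5, 7, 6])

# Highest card first: the stack never grows past one card.
highest_first = single_stack_top_two([9, 3, 4, 2, 8, 0, 1, 5, 7, 6])

print(lucky, unlucky, highest_first)
```

The maximum is always right, but the second value is only right when the second-highest card happens to arrive before the highest one.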
To borrow Gabor's card game analogy...

You have two stacks (highest and 2nd highest).

If the drawn card is same or higher than the highest stack, then

   move the top card from the highest stack to the 2nd highest stack,
   move the drawn card to the highest stack.

else if the drawn card is same or higher than the 2nd highest stack, then

   move the drawn card to the 2nd highest stack.

Draw another card and repeat.

Andy

Article: 155260
I have a project that needs about 300KLE. I want to choose a device between the Xilinx V7 and Altera Stratix5; please give some suggestions. In the low-end devices, what is the key difference between Spartan6 and Cyclone5? Which has the better route pass percentage? Which has the better resource usage? And which has the better power consumption?

Article: 155261
On Tuesday, June 18, 2013 10:06:32 AM UTC-7, Kevin Neilson wrote:
> I discovered one cause (but not all) of the coredumps I experience. If I had mismatched port widths in a VHDL instantiation, I'll often have coredumps. There is no indication of what is wrong, but now I know what to look for in some cases. I also suffer all kinds of problems when I try to use unconstrained outputs based on unconstrained inputs, to the point where I just have to avoid that feature of VHDL.
>
> I think it'd be great to have a cloud service you could use if you didn't need to use it that often, but I don't know if that would be profitable for Mentor.

In my case, the tool choked miserably whenever I misinterpreted the SystemVerilog spec and hooked up interfaces incorrectly.

In my opinion Mentor can use the cloud platform quite creatively and make a business out of the unmet need, which is allowing engineers to build myriad pieces of IP that serve niche areas without going through a vetting process to justify a big budget and therefore a big market.

And think of the community schools that generally offer programs in C programming, etc. Why not programs in verification, linting, scripting, simple designs, etc.? More side opportunities for consultants/senior engineers as trainers, more opportunities for the students to learn online. E.g. if coursera/udemy can offer software courses, why not hardware courses as well? And think of kickstarter/indiegogo, which can fund those hardware projects.

Enough said. I don't mean to say that cost of the tools is the only thing that is preventing massive innovation in hardware development. But I feel it is an important part as it limits the creative ability of the people who can make a difference.

Article: 155262
bjzhangwn@gmail.com wrote:
> I have a project that needs about 300KLE. I want to choose a device between the Xilinx V7 and Altera Stratix5; please give some suggestions. In the low-end devices, what is the key difference between Spartan6 and Cyclone5? Which has the better route pass percentage? Which has the better resource usage? And which has the better power consumption?

It's been a while since the last X vs. A wars here. But it seems you are asking two different questions. First, for 300KLE and either Virtex7 or Stratix5. I assume that the Spartan6 vs Cyclone5 question is separate because I don't think they go up to 300KLE (I know for a fact that Spartan6 only goes to 150KLE). And now there are newer "low-priced" Artix parts from Xilinx if you wanted to look at 7-series for comparison with the latest Altera low-cost parts. Artix goes up to 200KLE (right now I think the only two sizes available are 100K and 200K). Artix can also do x4 PCIe if you need that. Spartan 6 LXT only has 1x endpoint blocks.

Not sure what you mean by "route pass percentage." If you're talking about the amount of logic you can stuff into a part before it becomes unroutable, then Xilinx parts are pretty good. You usually get to a point where it's too hard to meet timing (due to slice packing and other placement constraints) before you get an unroutable design. I generally consider about 70% LUT usage to be "full" from this perspective. No experience on Altera parts.

--

Gabor

Article: 155263
On 6/15/2013 10:17 PM, Eric Wallin wrote:
> Thanks for your response rickman!
>
> On Saturday, June 15, 2013 8:40:27 PM UTC-4, rickman wrote:
>> That's the ground I have been plowing off and on for the last 10 years.
>
> Ooo, same here, and my condolences. I caught a break a couple of months ago and have been beavering away on it ever since, and I finally have something that doesn't cause me to vomit when I code for it. Multiple indexed simple stacks with explicit pointer control makes everything a lot easier than a bog standard stack machine. I think the auto-consumption of literally everything, particularly the data, indexes, and pointers you dearly want to use again is at the bottom of all the crazy people just accept with stack machines. This mechanism works great for manual data entry on HP calculators, but not so much for stack machines IMHO. Auto consumption also pretty much rules out conditional execution of single instructions.

I was looking at how to improve a stack design a few months ago and came to a similar conclusion. My first attempt at getting around the stack ops was to use registers. I was able to write code that was both smaller and faster since in my design all instructions are one clock cycle so executed instruction count equals number of machine cycles. Well, sort of. My original dual stack design was literally one clock per instruction. In order to work with clocked block ram the register machine would use either both phases of two clocks per machine cycle or four clock cycles.

While pushing ideas around on paper, the J1 design gave me an idea of adjusting the stack pointer as well as using an offset in each instruction. That gave a design that is even faster with fewer instructions. I'm not sure if it is practical in a small opcode. I have been working with 8 and 9 bit opcodes, the latest approach with stack pointer control can fit in 9 bits, but would be happier with a couple more bits.

>> I assume that you do understand that the point of MISC is that the
>> implementation can be minimized so that the instructions run faster. In
>> theory this makes up for the extra instructions needed to manipulate the
>> stack on occasion. But I understand your interest in minimizing the
>> inconvenience of stack ops. I spent a little time looking at
>> alternatives and am currently looking at a stack CPU design that allows
>> offsets into the stack to get around the extra stack ops. I'm not sure
>> how this compares to your ideas. It is still a dual stack design as I
>> have an interest in keeping the size of the implementation at a minimum.
>
> MISC is interesting, but you have to consider that all ops, including simple stack manipulations, will generally consume as much real time as a multiply, which suddenly makes all of those confusing stack gymnastics you have to perform to dig out your loop index or whatever from underneath your read/write pointer from underneath your data and such overly burdensome.

Programming to facilitate stack optimization is king on a stack machine.

I'm not sure how the multiply speed is relevant, but the real question is just how fast does an algorithm run which has to include all the instructions needed as well as the clock speed. Then it is also important to consider resources used. I think you said your design uses 1800 LEs which is a *lot* more than a simple two stack design. They aren't always available.

> Indexes into a moving stack - that way lies insanity. Ever hit the roll down button on an HP calculator and get instantly flummoxed? Maybe a compiler can keep track of that kind of stuff, but my weak brain isn't up to the task.

Then I don't know why you are designing CPUs, lol! I like RPN calculators and have trouble using anything else. I also program in Forth so this all works for me.

> Altera BRAM doesn't go as wide as Xilinx with true dual port.

When I was working in Xilinx I was able to use a single BRAM for both the data and return stacks (16 bit data). I expect Xilinx has some patent that Altera can't get around for a couple more years. Lattice seems to be pretty good though. I just would prefer to have an async read since that works in a one clock machine cycle better.

>> 1800 LEs won't even fit on the FPGAs I am targeting.
>
> I'm not sure anything less than the smallest Cyclone 2 is really worth developing in. A lot of the stuff below that is often more expensive due to the built-in configuration memory and such. There are quite inexpensive Cyclone dev boards on eBay from China.

I don't know about dev board cost, but I can get a 1280 LUT Lattice part for under $4 in reasonable quantity. That is the area I typically work in. My big problem is packages. I don't want to have to use extra fine pitch on PCBs to avoid the higher costs. BGAs require very fine via holes and fine pitch PCB traces and run the board costs up a bit. None of the FPGA makers support the parts I like very well. VQ100 is my favorite, small but enough pins for most projects.

>> I would like to hear about your innovations. As you seem to understand,
>> it is hard to be truly innovative finding new ideas that others have not
>> uncovered. But I think you are certainly in an area that is not
>> thoroughly explored.
>
> I haven't seen anything exactly like it, certainly not the way the stacks are implemented. And I deal with extended arithmetic results in an unusual way. In terms of scheduling and pipelining, the Parallax Propeller is probably the closest in architecture (you can infer from the specs and operational model what they don't explicitly tell you in the datasheet).
>
>> I won't argue with that. Even when I was an IEEE member, I never found
>> a document I didn't have to pay for.
>
> I was a member too right out of grad school. But, like Janet Jackson sang: "What have they done for me lately?"

My mistake was getting involved in the local chapters. Seems IEEE is just a good ol' boys network and is all about status and going along to get along. They don't believe in the written rules, more so the unwritten ones.

>> When can we expect to see your paper?
>
> It's all but done, just picking around the edges at this point. As soon as the code is verified to my satisfaction I'll release both and post here.

Ok, looking forward to it.

--

Rick

Article: 155264
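The "stack gymnastics" point in the exchange above (digging a loop index out from under a pointer and a data item) can be made concrete with a toy interpreter. This is a sketch only: it does not represent either poster's instruction set, the op names are borrowed from Forth, and the `("PICK", n)` tuple is a hypothetical stand-in for an instruction that reads at an offset into the stack.

```python
def run(program, data):
    """Execute a list of ops on a data stack (top at the end of the list),
    using a separate return stack for >R and R>."""
    data, ret = list(data), []
    for op in program:
        if op == "DUP":                      # ( a -- a a )
            data.append(data[-1])
        elif op == "ROT":                    # ( a b c -- b c a )
            data.append(data.pop(-3))
        elif op == ">R":                     # move top of data to return stack
            ret.append(data.pop())
        elif op == "R>":                     # move top of return stack to data
            data.append(ret.pop())
        elif isinstance(op, tuple) and op[0] == "PICK":
            # Hypothetical offset access: copy the item op[1] slots below the top.
            data.append(data[-1 - op[1]])
        else:
            raise ValueError(f"unknown op {op!r}")
    return data


start = ["index", "pointer", "datum"]

# Copy the third item to the top using only shuffling ops.
shuffled = ["ROT", "DUP", ">R", "ROT", "ROT", "R>"]
# The same effect with one offset-addressed read.
indexed = [("PICK", 2)]

result_shuffled = run(shuffled, start)
result_indexed = run(indexed, start)
print(result_shuffled, len(shuffled))
print(result_indexed, len(indexed))
```

Both programs leave `index pointer datum index` on the stack; the shuffling version takes six ops where the offset read takes one. How that trades against opcode width and clock rate is exactly what the posters disagree about, and this model says nothing on that.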
On 6/19/2013 4:29 PM, GaborSzakacs wrote:
> bjzhangwn@gmail.com wrote:
>> I have a project that needs about 300KLE. I want to choose a device
>> between the Xilinx V7 and Altera Stratix5; please give some
>> suggestions. In the low-end devices, what is the key difference between
>> Spartan6 and Cyclone5? Which has the better route pass percentage?
>> Which has the better resource usage? And which has the better power
>> consumption?
>
> It's been a while since the last X vs. A wars here. But it seems you
> are asking two different questions. First, for 300KLE and either
> Virtex7 or Stratix5. I assume that the Spartan6 vs Cyclone5 question
> is separate because I don't think they go up to 300KLE (I know for a
> fact that Spartan6 only goes to 150KLE). And now there are newer
> "low-priced" Artix parts from Xilinx if you wanted to look at 7-series
> for comparison with the latest Altera low-cost parts. Artix goes up
> to 200KLE (right now I think the only two sizes available are 100K
> and 200K). Artix can also do x4 PCIe if you need that. Spartan 6 LXT
> only has 1x endpoint blocks.
>
> Not sure what you mean by "route pass percentage." If you're talking
> about the amount of logic you can stuff into a part before it becomes
> unroutable, then Xilinx parts are pretty good. You usually get to
> a point where it's too hard to meet timing (due to slice packing and
> other placement constraints) before you get an unroutable design. I
> generally consider about 70% LUT usage to be "full" from this
> perspective. No experience on Altera parts.

I think this is one of those questions like, "how long is a piece of string"? The utilization percentage depends entirely on your design. I did a design on a Lattice part a while back and when the customer wanted an upgrade I warned that it would likely push the utilization up to 80% or more, which might make it hard to route and meet timing. Sure enough, the project got to about 80%, but we had no problem at all with routing or timing. Certainly the tools are better than they were back when I almost committed hara-kiri working on a design update at over 90% utilization.

So I don't think there is a good answer to the question. But there is a not-so-bad solution. Unless you plan to instantiate vendor specific components, you should be able to write your code without making the decision about the vendor. When the design is done or nearly so, generate bit files on both tools and see which fits best.

The most likely answer is, "it doesn't matter".

Power consumption can be measured very easily on most development boards. They usually cost less than a week's pay and sometimes less than a day's pay. Or you can ask the vendor...

--

Rick

Article: 155265
On 6/19/2013 11:40 AM, jonesandy@comcast.net wrote:
> To borrow Gabor's card game analogy...
>
> You have two stacks, (highest and 2nd highest)
>
> If the drawn card is same or higher than the highest stack, then
>
> move the top card from the highest stack to the 2nd highest stack,
> move the drawn card to the highest stack.
>
> else if the drawn card is same or higher than the 2nd highest stack, then
>
> move the drawn card to the 2nd highest stack.
>
> draw another card and repeat.

They don't need to be stacks. You just need to have two holding spots (registers) and initialize them to something less than anything you will have on the input. Then on each draw of a card (or sample on the input) you compare to both spots; if the input is higher than the "highest" spot you save it there and put the old highest on the "second highest" spot. If not, but it is higher than the "second highest", you put it there.

Gabor was using a stack because he thought it would get him both the highest and the second highest with one compare operation, but it didn't work. Two compares are needed for each input.

In your approach your compare is "higher or same"; why do you need to do anything if they are the same? Not that it is a big deal, but in some situations this could require extra work.

--

Rick

Article: 155266
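The two-register procedure above, and the question of "higher" versus "higher or same", can be checked with a short model. This is a Python sketch of the algorithm as described in the thread, not a VHDL solution to the exercise. The names `top_two`, `inclusive`, and `floor` are made up here, and the initial value of -1 is an assumption that relies on the inputs being unsigned 8-bit values (0 to 255); real hardware would need a wider register or a valid flag to get the same effect.

```python
def top_two(samples, inclusive=False, floor=-1):
    """Track the highest and second-highest sample using two registers.

    inclusive=False uses "higher than" (strict), inclusive=True uses
    "same or higher". floor must be below every possible input.
    """
    highest = second = floor
    for x in samples:
        beats_highest = x >= highest if inclusive else x > highest
        beats_second = x >= second if inclusive else x > second
        if beats_highest:
            # Old highest drops down to the second-highest register.
            highest, second = x, highest
        elif beats_second:
            second = x
    return highest, second


example = [3, 4, 2, 8, 9, 0, 1, 5, 7, 6]
strict_result = top_two(example)
inclusive_result = top_two(example, inclusive=True)

# A repeated maximum: both variants end up with the same register values.
ties_strict = top_two([5, 5, 3])
ties_inclusive = top_two([5, 5, 3], inclusive=True)

print(strict_result, inclusive_result, ties_strict, ties_inclusive)
```

In this model the strict and inclusive comparisons leave the same values in the registers; the inclusive form just performs moves that do not change anything when the input equals a stored value. Note that either way a repeated maximum is reported as the second highest too. Whether that is the intended answer is something the exercise statement leaves open.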
On Wednesday, June 19, 2013 11:13:52 AM UTC-7, Sanjay Parekh wrote:
> On Tuesday, June 18, 2013 10:06:32 AM UTC-7, Kevin Neilson wrote:
>> I discovered one cause (but not all) of the coredumps I experience. If I had mismatched port widths in a VHDL instantiation, I'll often have coredumps. There is no indication of what is wrong, but now I know what to look for in some cases. I also suffer all kinds of problems when I try to use unconstrained outputs based on unconstrained inputs, to the point where I just have to avoid that feature of VHDL.
>>
>> I think it'd be great to have a cloud service you could use if you didn't need to use it that often, but I don't know if that would be profitable for Mentor.
>
> In my case, the tool choked miserably whenever I misinterpreted the systemverilog spec and hooked up interfaces incorrectly.
>
> In my opinion Mentor can use the cloud platform quite creatively and make a business out of the unmet need which is allowing engineers to build myriad pieces of ip that serve niche areas without going through a vetting process to justify a big budget and therefore a big market.
>
> And think of the community schools that generally offer programs in c programming, etc. Why not programs in verification, linting, scripting, simple designs, etc.? More side opportunities for consultants/senior engineers as trainers, more opportunities for the students to learn online. E.g. If coursera/udemy can offer software courses, why not hardware courses as well? And think of kickstarter/indiegogo which can fund those hardware projects.
>
> Enough said. I don't mean to say that cost of the tools is the only thing that is preventing massive innovation in the hardware development. But I feel it is an important part as it limits the creative ability of the people who can make a difference.

Interesting read today if you can see as I do the opportunities for cloud based tools..

http://gigaom.com/2013/06/19/open-compute-is-bringing-the-maker-movement-to-the-enterprise/?utm_source=General+Users&utm_campaign=3472bd888e-c%3Atec%2Capl+d%3A06-20&utm_medium=email&utm_term=0_1dd83065c6-3472bd888e-98983131

Article: 155267
On Wednesday, June 19, 2013 5:13:04 PM UTC-4, rickman wrote: > While pushing ideas around on paper, the J1 design gave me an idea of=20 > adjusting the stack point as well as using an offset in each=20 > instruction. That gave a design that is even faster with fewer=20 > instructions. I'm not sure if it is practical in a small opcode. =20 Interesting. The J1 strongly influenced me as well. < I have been working with 8 and 9 bit opcodes, the latest approach with=20 > stack pointer control can fit in 9 bits, but would be happier with a=20 > couple more bits. I decided to stay away from non-powers of 2 widths for instructions and dat= a. Not efficient in standard storage. Having multiple instructions per wo= rd I see now as more of a bug than a feature because you have to index into= it to return from a subroutine and how / where do you store the index? > Programming to facilitate stack optimization is king on a stack machine.= =20 I feel that this is a fiddly activity that wastes the programmer's time and= creates code that is exceedingly difficult to figure out later. > I'm not sure how the multiply speed is relevant, but the real question= =20 > is just how fast does an algorithm run which has to include all the=20 > instructions needed as well as the clock speed. Multiply is relevant because in a 32 bit machine it will likely be THE spee= d bottleneck, pulling overall timing down. They include non-fabric registe= ring at the I/O of the FPGA multiply hardware to help pipeline it. Same wi= th BRAM - reads really speed up if you use the "free" output registering (i= n addition to the synchronous register you are generally forced to use). > > Indexes into a moving stack - that way lies insanity. Ever hit the rol= l down button on an HP calculator and get instantly flummoxed? Maybe a com= piler can keep track of that kind of stuff, but my weak brain isn't up to t= he task. >=20 > Then I don't know why you are designing CPUs, lol! 
I like RPN > calculators and have trouble using anything else. I also program in > Forth so this all works for me. Quite the contrary, I've used HP calculators religiously since I won one in a HS engineering contest almost 30 years ago. Too bad they don't make the "real" ones anymore (35S is the best they can do it seems, maybe they lost the plans along with those of the Saturn V). But when I hit the roll down button to find a value on the stack, I have to give up on the other stack items due to confusion. I really want to like Forth, but after reading the books and being repeatedly repelled by the syntax and programming model I gave up. My goal with CPU design was to make one simple enough to program without special tools, but complex enough to do real work and I think I've finally achieved that. > I expect Xilinx has some patent that Altera can't get around for a > couple more years. Lattice seems to be pretty good though. I just > would prefer to have an async read since that works in a one clock > machine cycle better. I like Lattice parts too, and used the original MachXO on many boards in lieu of a CPLD. But I gave up on single cycle along with two stacks and autoconsumption. Like you say async read BRAM is hard to come by. Single cycle is also slow and strands a bazillion FFs in the fabric. I wonder if you've read this article: http://spectrum.ieee.org/semiconductors/processors/25-microchips-that-shook-the-world Moore made a lot of money off of what seem like frivolous lawsuits, which brings him down several notches in my eyes.Article: 155268
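To make the "how / where do you store the index?" question in the post above concrete, here is a rough Python sketch of one possible answer. Everything in it is an assumption for illustration, not anyone's actual design from this thread: four 8-bit opcodes per 32-bit word, and a return address that keeps the slot index in its low two bits. The names (pack_word, fetch, make_return_address, split_return_address) are hypothetical.

```python
# Hypothetical layout: four 8-bit opcodes packed into one 32-bit word.
SLOTS_PER_WORD = 4
OPCODE_BITS = 8
OPCODE_MASK = (1 << OPCODE_BITS) - 1


def pack_word(opcodes):
    """Pack SLOTS_PER_WORD opcodes into one word, slot 0 in the low bits."""
    assert len(opcodes) == SLOTS_PER_WORD
    word = 0
    for slot, op in enumerate(opcodes):
        assert 0 <= op <= OPCODE_MASK
        word |= op << (slot * OPCODE_BITS)
    return word


def fetch(memory, word_addr, slot):
    """Extract the opcode at (word_addr, slot) from packed memory."""
    return (memory[word_addr] >> (slot * OPCODE_BITS)) & OPCODE_MASK


def make_return_address(word_addr, slot):
    """Return address for a call at (word_addr, slot): the NEXT slot,
    encoded as word_addr * SLOTS_PER_WORD + slot so the index rides
    along in the low bits of the value pushed on the return stack."""
    slot += 1
    if slot == SLOTS_PER_WORD:  # call was in the last slot of its word
        word_addr, slot = word_addr + 1, 0
    return word_addr * SLOTS_PER_WORD + slot


def split_return_address(ret):
    """Recover (word_addr, slot) from an encoded return address."""
    return divmod(ret, SLOTS_PER_WORD)


memory = [pack_word([0x11, 0x22, 0x33, 0x44]),
          pack_word([0x55, 0x66, 0x77, 0x88])]
ret = make_return_address(0, 3)            # call sits in the last slot of word 0
print(split_return_address(ret))           # resumes at word 1, slot 0
```

The cost shows up in exactly the place the post complains about: every return address carries extra slot bits, and the fetch path needs a shift and mask per instruction. One instruction per word avoids both.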
Eric Wallin wrote: > Quite the contrary, I've used HP calculators religiously since > I won one in a HS engineering contest almost 30 years ago. > Too bad they don't make the "real" ones anymore (35S is the > best they can do it seems, maybe they lost the plans along > with those of the Saturn V). I'm sure HP still has the plans for the Saturn, viz http://www.hpmuseum.org/saturn.htm Sorry, couldn't resist. > I really want to like Forth, but after reading the books > and being repeatedly repelled by the syntax and programming > model I gave up. Nobody /writes/ Forth. They write programs that emit Forth. The most mainstream example of that is printer drivers emitting PostScript.Article: 155269
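As a rough illustration of the "programs that emit" idea in the post above, here is a minimal Python sketch that writes out a few lines of PostScript rather than having a person type them. The function name emit_line_ps is hypothetical. The operators used (newpath, moveto, lineto, stroke, showpage) are standard PostScript, and the output shows the postfix, operands-first style the thread is arguing about. It says nothing about how real printer drivers are structured.

```python
def emit_line_ps(x0, y0, x1, y1):
    """Return PostScript source that strokes one line segment.

    Coordinates are in PostScript points. Operands come before each
    operator because PostScript is a stack-based, postfix language.
    """
    lines = [
        "%!PS",
        "newpath",
        f"{x0} {y0} moveto",
        f"{x1} {y1} lineto",
        "stroke",
        "showpage",
    ]
    return "\n".join(lines) + "\n"


print(emit_line_ps(72, 72, 144, 144), end="")
```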
On 20/06/2013 15:10, Sanjay Parekh wrote: > On Wednesday, June 19, 2013 11:13:52 AM UTC-7, Sanjay Parekh wrote: >> On Tuesday, June 18, 2013 10:06:32 AM UTC-7, Kevin Neilson wrote: .. >> >>> I think it'd be great to have a cloud service you could use if you didn't need to use it that often, but I don't know if that would be profitable for Mentor. .. > Interesting read today if you can see as I do the opportunities for cloud based tools.. http://gigaom.com/2013/06/19/open-compute-is-bringing-the-maker-movement-to-the-enterprise/?utm_source=General+Users&utm_campaign=3472bd888e-c%3Atec%2Capl+d%3A06-20&utm_medium=email&utm_term=0_1dd83065c6-3472bd888e-98983131 > I don't think cloud EDA services will happen soon for the simple reason that companies are generally not happy to splatter their highly valuable IP over the internet. You have an additional problem that the servers are normally not located in your country which means you have to fight a foreign court system if something goes wrong (server hacked, IP theft, etc). Hans www.ht-lab.comArticle: 155270
On Tuesday, April 23, 2013 4:13:42 PM UTC-4, Kevin Neilson wrote: > Why is Modelsim so expensive? It is a mature product and yet it segfaults on me all the time. Constantly. Often, when it ought to give me warnings or errors (such as when there is a port width mismatch) it just core dumps instead, leaving me to comment out lines one at a time until I figure out why it's crashing. That's my rant. It's still pretty decent, but ought to be cheaper if it's going to coredump like freeware. The simulator in Quartus is nice and has a "functional simulation" mode that makes the compile fairly trivial and quick. Altera unfortunately unbundled it from the main GUI after 9.2SP2 and turned it into an ugly, rickety, Tcl-based, unintegrated monstrosity. At the time the rep told me "no one uses it". Don't mind me, I'm just a nobody.Article: 155271
On 6/20/2013 10:51 AM, Tom Gardner wrote: > Eric Wallin wrote: > > I really want to like Forth, but after reading the books > > and being repeatedly repelled by the syntax and programming > > model I gave up. Quitter! If the syntax (or near total lack thereof) bothers you, then you must have a very thin skin. > Nobody /writes/ Forth. They write programs that emit Forth. > The most mainstream example of that is printer drivers > emitting PostScript. LOL! I guess I was documenting something this morning rather than writing code... -- RickArticle: 155272
I'd really like to make some cores in my spare time, but the revenues would be pretty small, and there is no way it would be worthwhile to buy Synplify and Modelsim licenses for such a small endeavor. I don't know exactly what that would cost, but I'm sure it's tens of thousands. It'd be great if I could use the tools online for a few hours here and there and just pay for that. Even if I couldn't use the GUI--if I could just get an EDIF and .srr file back--that would be useful. I guess I could use Icarus or something, but I'm sure it's not going to parse the nice SysVerilog / VHDL 2008 code I write, and who wants to buy a core that comes with an Icarus project file?Article: 155273
On 6/20/13 1:50 PM, rickman wrote: > On 6/20/2013 10:51 AM, Tom Gardner wrote: >> Eric Wallin wrote: >> > I really want to like Forth, but after reading the books >> > and being repeatedly repelled by the syntax and programming >> > model I gave up. > > Quitter! If the syntax (or near total lack thereof) bothers you, then > you must have a very thin skin. > >> Nobody /writes/ Forth. They write programs that emit Forth. >> The most mainstream example of that is printer drivers >> emitting PostScript. > > LOL! I guess I was documenting something this morning rather than > writing code... > Eric, I'm curious what books these were that you found so offensive? I'm also baffled about your comment about "programs that emit Forth." Although PostScript has many features in common with Forth, it is quite different, both in terms of command set and programming model. Modern Forths (e.g. since the release of ANS Forth 94) feature a variety of implementation strategies, ranging from fairly conventional compilers that generate optimized machine code to more traditional threaded code models. Cheers, Elizabeth -- ================================================== Elizabeth D. Rather (US & Canada) 800-55-FORTH FORTH Inc. +1 310.999.6784 5959 West Century Blvd. Suite 700 Los Angeles, CA 90045 http://www.forth.com "Forth-based products and Services for real-time applications since 1973." ==================================================Article: 155274
On Thursday, June 20, 2013 7:50:09 PM UTC-4, rickman wrote: > Quitter! If the syntax (or near total lack thereof) bothers you, then > you must have a very thin skin. Ha ha! And I see what you did there. On Thursday, June 20, 2013 9:31:53 PM UTC-4, Elizabeth D. Rather wrote: > Eric, I'm curious what books these were that you found so offensive? The books ("Starting Forth", "Thinking Forth", "Forth Programmer's Handbook") weren't themselves offensive, but they revealed Forth to be much lamer than I expected for all the stick-it-to-the-man ethos surrounding it. I was totally stoked for a stack-based language that would solve all my problems, but all I got was some books gathering dust. > I'm also baffled about your comment about "programs that emit Forth." > Although PostScript has many features in common with Forth, it is quite > different, both in terms of command set and programming model. You're looking for Tom Gardner, he's down the hall near the elevators using a little stamp at the bottom of his cane to make little chicken footprints on the floor..