Messages from 155250

Article: 155250
Subject: Re: Modelsim ought to be cheaper
From: Kevin Neilson <kevin.neilson@xilinx.com>
Date: Tue, 18 Jun 2013 10:06:32 -0700 (PDT)

I discovered one cause (but not all) of the coredumps I experience.  If I
have mismatched port widths in a VHDL instantiation, I'll often get
coredumps.  There is no indication of what is wrong, but now I know what to
look for in some cases.  I also suffer all kinds of problems when I try to
use unconstrained outputs based on unconstrained inputs, to the point where
I just have to avoid that feature of VHDL.

I think it'd be great to have a cloud service you could use if you didn't
need to use it that often, but I don't know if that would be profitable for
Mentor.

Article: 155251
Subject: Re: Chasing Bugs in the Fog
From: Kevin Neilson <kevin.neilson@xilinx.com>
Date: Tue, 18 Jun 2013 12:16:59 -0700 (PDT)

One mistake that is not too hard to make is forgetting to put a
synchronizer flop on the input of an edge detector, like you might have on
a UART input (so that the edge detector has two flops, total).  Depending
on the routing delays, this can cause you to miss a sizable percentage of
edges.  (Not just delayed, but missed completely.)  Using only a single
flop is sometimes known as using the "greedy path".

(Actually, to mitigate metastability as well, an edge detector ought to
have three flops and an AND gate.  Using two is sometimes known as using
the "sneaky path".)
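
The three-flop arrangement described above can be sketched as a cycle-based
Python model.  This is only a behavioural illustration, not synthesizable
HDL: the class and signal names (EdgeDetector, sync1, sync2, prev) are
made up for the example, and the model does not simulate metastability or
routing delay, only the register structure.

```python
class EdgeDetector:
    """Cycle-based model of a rising-edge detector: three flops and an AND gate."""

    def __init__(self):
        self.sync1 = 0  # first synchronizer flop (the one that may go metastable)
        self.sync2 = 0  # second synchronizer flop
        self.prev = 0   # previous synchronized value, used for the edge compare

    def clock(self, async_in):
        """Advance one clock edge and return the edge pulse (0 or 1)."""
        # All three flops load their pre-edge inputs at the same instant.
        self.sync1, self.sync2, self.prev = async_in, self.sync1, self.sync2
        # AND gate: synchronized value is high now and was low one cycle ago.
        return int(self.sync2 == 1 and self.prev == 0)


def detect(samples):
    """Feed one input sample per clock and collect the edge pulses."""
    det = EdgeDetector()
    return [det.clock(s) for s in samples]


# Two rising edges on the input give two single-cycle pulses, each
# delayed by the two synchronizer stages.
print(detect([0, 0, 1, 1, 1, 0, 0, 1, 1, 1]))
# -> [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
```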

Article: 155252
Subject: Re: Chasing Bugs in the Fog
From: rickman <gnuarm@gmail.com>
Date: Tue, 18 Jun 2013 17:45:18 -0400

On 6/17/2013 8:14 PM, Rob Gaddi wrote:
> On Mon, 17 Jun 2013 20:00:01 -0400
> rickman<gnuarm@gmail.com>  wrote:
>
>> So I finally got around to adding some debug signals which I would
>> monitor on an analyzer and guess what, the bug is gone!  I *hate* when
>> that happens.  I can change the code so the debug signals only appear
>> when a control register is set to enable them, but still, I don't like
>> this.  I want to know what is causing this DURN THING!
>>
>> Anyone see this happen to them before?
>>
>> Oh yeah, someone in another thread (that I can't find, likely because I
>> don't recall the group I posted it in) suggested I add synchronizing FFs
>> to the serial data in.  Sure enough I had forgotten to do that.  Maybe
>> that was the fix...  of course!  It wasn't metastability, I bet it was
>> feeding multiple bits of the state machine!  Durn, I never make that
>> sort of error.  Thanks to whoever it was that suggested the obvious that
>> I had forgotten.
>>
>> --
>>
>> Rick
>
> Not metastability, a race condition.  Asynchronous external input
> headed to multiple clocked elements, each of which it reaches via a
> different path with a different delay.
>
> When you added debugging signals you changed the netlist, which changed
> the place and route, making unpredictable changes to those delays.

No, when changing the debug output I added the synchronization FFs which 
fixed the problem.

My point was that when the other poster suggested that I need to sync to 
the clock I mistook that for metastability forgetting that the input 
went to multiple sections of logic.  So actually I made the same mistake 
twice... lol

> In
> this case, it happened to push it into a place where _as far as you
> tested_, it seems happy.  But it's still unsafe, because as you change
> other parts of the design, the P&R of that section will still change
> anyhow, and you start getting my favorite situation, the problem that
> comes and goes based on entirely unrelated factors.
>
> The fix you fixed fixes it.  When you resynchronized it on the same
> clock as you're running around the rest of the logic, you forced that
> path to become timing constrained.  As such, the P&R takes it upon
> itself to make sure that the timing of that route is irrelevant with
> respect to the clock period, and your problem goes away for good.

Just to make sure of what was what (it has been two years since I last 
worked with this design) I pulled the FFs out and added back just one. 
Sure enough the bug reappears with no FFs, but goes away with just one. 
  The added debug info available allowed me to see exactly the error and 
sure enough, when a start bit comes in there is a chance that the two 
counters are not properly set and the error shows up in the center of 
the bit where the current contents of the shift register are moved into 
the holding register as a new char.

I guess what most likely happened is that when I wrote the UART code I 
assumed the sync FFs would be external and when I wrote the wrapper code 
I assumed the FFs were inside the UART.  In other words, I didn't have a 
proper spec and never gave this problem proper consideration.

I will revisit this design and look at the other inputs.  No reason to 
assume I didn't make the same mistake elsewhere.

-- 

Rick
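
The race Rob Gaddi describes in the quoted text can be illustrated with a
small timing model.  It is a sketch under simplifying assumptions: the
function names, the picosecond figures and the two path delays are
invented for illustration, and metastability is ignored entirely.  The
only effect modelled is one asynchronous edge reaching two flops through
paths of different delay.

```python
def sample(transition_ps, delay_ps, edge_ps):
    """Value a flop captures at clock edge edge_ps for a 0->1 step that
    happens at transition_ps and needs delay_ps to reach the flop."""
    return 1 if transition_ps + delay_ps <= edge_ps else 0


def unsynchronized(transition_ps, delays_ps, edge_ps):
    """The async input fans out directly to several flops, one delay each."""
    return [sample(transition_ps, d, edge_ps) for d in delays_ps]


def synchronized(transition_ps, sync_delay_ps, edge_ps, n_loads):
    """One flop samples the async input.  Its output is a timing-constrained
    synchronous signal, so every load sees the same value on the next edge."""
    captured = sample(transition_ps, sync_delay_ps, edge_ps)
    return [captured] * n_loads


# Hypothetical numbers: clock edge at 20000 ps, input rises 500 ps earlier,
# and the two routes to the state-machine flops take 200 ps and 900 ps.
print(unsynchronized(19500, [200, 900], 20000))  # -> [1, 0]  flops disagree
print(synchronized(19500, 200, 20000, 2))        # -> [1, 1]  flops agree
```

With the direct fan-out, the two flops capture different values on the same
edge, which is how a state machine or a pair of counters can end up in an
inconsistent state.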

Article: 155253
Subject: Re: Chasing Bugs in the Fog
From: rickman <gnuarm@gmail.com>
Date: Tue, 18 Jun 2013 18:03:36 -0400

On 6/18/2013 3:16 PM, Kevin Neilson wrote:
> One mistake that is not too hard to make is forgetting to put a synchronizer flop on the input of an edge detector, like you might have on a UART input (so that the edge detector has two flops, total).  Depending on the routing delays, this can cause you to miss a sizable percentage of edges.  (Not just delayed, but missed completely.)  Using only a single flop is sometimes known as using the "greedy path".
>
> (Actually, to mitigate metastability as well, an edge detector ought to have three flops and an AND gate.  Using two is sometimes known as using the "sneaky path".)

Everyone is saying the same thing, so I guess I didn't explain clearly. 
  Someone had already pointed out to me that I needed a synchronizer on 
the received data signal in another thread that I can't find now.  I 
took them at their word, but was thinking they meant it was about 
metastability which I figured was not a problem at these speeds (yes, 
the speeds do make a difference for metastability since you never chase 
it away, you just minimize it).  I wasn't thinking about the serial in 
signal feeding the state machine, just the shift register.

So when I made the changes, which included the synchronizer, it worked. 
  Because I didn't expect the synchronizer to do anything, I had 
forgotten about it until I was typing the post here.  I remembered at 
the end of the message and realized that was what fixed the problem...

Sorry for the confusion.  Still, thanks to all who replied and 
especially the mystery person who suggested it in the other thread 
wherever that was.

-- 

Rick

Article: 155254
Subject: Re: Chasing Bugs in the Fog
From: Nicolas Matringe <nicolas.matringe@fre.fre>
Date: Wed, 19 Jun 2013 00:13:41 +0200

On 18/06/2013 23:45, rickman wrote:

> I guess what most likely happened is that when I wrote the UART code I
> assumed the sync FFs would be external and when I wrote the wrapper code
> I assumed the FFs were inside the UART.  In other words, I didn't have a
> proper spec and never gave this problem proper consideration.

Several years ago a young engineer reused my long proven UART code and 
modified it, carelessly removing the synchronizing FF. He came to see me 
and complained that my UART didn't work, it hung after some 
unpredictable time.
I thought for a few minutes, guessed he probably had removed the FF and 
fixed his problem right away.

Nicolas

Article: 155255
Subject: Re: Chasing Bugs in the Fog
From: Kevin Neilson <kevin.neilson@xilinx.com>
Date: Tue, 18 Jun 2013 15:26:00 -0700 (PDT)

That's the same thing that happened to me when I had the problem last.  I
had an edge detector connected to a big synchronizer module that was in
turn connected to all the input pins.  When I had problems I looked inside
the synchronizer module and found that it didn't have a flop on that line;
it was just wired straight through.

Article: 155256
Subject: Re: Chasing Bugs in the Fog
From: Theo Markettos <theom+news@chiark.greenend.org.uk>
Date: 19 Jun 2013 01:24:27 +0100 (BST)

rickman <gnuarm@gmail.com> wrote:
> Everyone is saying the same thing, so I guess I didn't explain clearly. 
>   Someone had already pointed out to me that I needed a synchronizer on 
> the received data signal in another thread that I can't find now.  I 
> took them at their word, but was thinking they meant it was about 
> metastability which I figured was not a problem at these speeds (yes, 
> the speeds do make a difference for metastability since you never chase 
> it away, you just minimize it).  I wasn't thinking about the serial in 
> signal feeding the state machine, just the shift register.

There are three things that could have gone wrong (and might still be
going wrong):

You failed to synchronise between the clock domain of the input serial link
and the clock of your system (sounds like you fixed this one)

You failed to constrain the clocks and other inputs so the synthesis tool
knows what timing budget it has to meet

You failed timing analysis and didn't notice - in other words the synthesis
tool says the design it produced doesn't meet your supplied timing
constraints, despite its best efforts.  If the failure is small it may still
work in some voltage/temperature/silicon situations, but it isn't guaranteed
in all cases.

Normally the last one will raise big red flags in the tool, assuming the
timing analyser does get run as part of the build.  However the first two
are easy to overlook and you get no warning from the tools.

Theo
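
The "timing budget" in the second and third points comes down to simple
arithmetic, sketched below.  The function name and every delay value are
hypothetical, and the model leaves out clock skew and uncertainty, which a
real timing analyser would include.

```python
def setup_slack_ps(period_ps, clk_to_q_ps, logic_ps, routing_ps, setup_ps):
    """Setup slack of a register-to-register path; negative means the path
    fails timing at this clock period."""
    return period_ps - (clk_to_q_ps + logic_ps + routing_ps + setup_ps)


# Hypothetical 100 MHz clock (10000 ps period).
print(setup_slack_ps(10000, 500, 6000, 3000, 300))  # -> 200   meets timing
print(setup_slack_ps(10000, 500, 6000, 4000, 300))  # -> -800  fails timing
```

Without a clock constraint the tool has no period to put into this
calculation, so it cannot report a failure at all.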

Article: 155257
Subject: Re: Ask about finding maximum and second's maximum number in array
From: Gabor <gabor@szakacs.org>
Date: Tue, 18 Jun 2013 22:17:06 -0400

On 6/18/2013 7:28 AM, Thomas Stanka wrote:
> On 18 Jun., 03:19, phanhuyich <khanhnguyent...@gmail.com> wrote:
>> I am starting to study VHDL. Now, I have to do an exercise with the following content:
>>
>>   I have to define an array of 10 elements (8-bit range) ([3,4,2,8,9,0,1,5,7,6] for example). The 10 elements are read in over 10 clock cycles. The task is to find the maximum and the second maximum number in this array after 10 clock cycles.
>>   Can anyone show me a method to solve it using VHDL?
>
> No problem. Just write down your solution to that problem in "not
> VHDL". Then ask what part of the algorithm is hard for you to transfer
> to VHDL and why, so we can help.
>
> Hint: it helps to think about the RTL and draw a picture of how the
> data flow might look; then it is easy to write it down in VHDL.
>
> regards Thomas
>
I often try to think of exercises like this in a completely different
setting.  For example, suppose you have ten ordinary playing cards
from a standard deck of 52.  You agree that there is a standard
order of these cards where the lowest for this exercise is Ace
of Clubs, then Ace of Diamonds, Ace of Hearts, Ace of Spades, Two
of Clubs, Two of Diamonds, . . . with King of Spades being highest.

Now you're going to flip one card at a time (this is the same
way you get your input, one item per clock cycle).  With each
flip you get to make one decision.  For example if you only cared
about the highest card of the ten, you could have a single stack
where you place the new card if it is higher than the card already
showing (or if the stack has no cards), and discard any card that
is not higher.

Now my first thought was that you could use this same approach to
find the two highest cards, but there's one case where it doesn't
work - if the highest card comes out first.  Then your stack will
only have one card in it so you can't just dig one card down to
find the second highest.

So you need to think how you'd arrange cards to be certain to
know the top two at the end of the exercise.  Then it's a simple
matter of translating this procedure to a VHDL process.

Have fun!

-- 
Gabor

Article: 155258
Subject: Re: Ask about finding maximum and second's maximum number in array
From: GaborSzakacs <gabor@alacron.com>
Date: Wed, 19 Jun 2013 10:55:50 -0400

Gabor wrote:
> On 6/18/2013 7:28 AM, Thomas Stanka wrote:
>> On 18 Jun., 03:19, phanhuyich <khanhnguyent...@gmail.com> wrote:
>>> I am starting to study VHDL. Now, I have to do an exercise with the
>>> following content:
>>>
>>>   I have to define an array of 10 elements (8-bit range)
>>> ([3,4,2,8,9,0,1,5,7,6] for example). The 10 elements are read in
>>> over 10 clock cycles. The task is to find the maximum and the
>>> second maximum number in this array after 10 clock cycles.
>>>   Can anyone show me a method to solve it using VHDL?
>>
>> No problem. Just write down your solution to that problem in "not
>> VHDL". Then ask what part of the algorithm is hard for you to transfer
>> to VHDL and why, so we can help.
>>
>> Hint: it helps to think about the RTL and draw a picture of how the
>> data flow might look; then it is easy to write it down in VHDL.
>>
>> regards Thomas
>>
> I often try to think of exercises like this in a completely different
> setting.  For example, suppose you have ten ordinary playing cards
> from a standard deck of 52.  You agree that there is a standard
> order of these cards where the lowest for this exercise is Ace
> of Clubs, then Ace of Diamonds, Ace of Hearts, Ace of Spades, Two
> of Clubs, Two of Diamonds, . . . with King of Spades being highest.
> 
> Now you're going to flip one card at a time (this is the same
> way you get your input, one item per clock cycle).  With each
> flip you get to make one decision.  For example if you only cared
> about the highest card of the ten, you could have a single stack
> where you place the new card if it is higher than the card already
> showing (or if the stack has no cards), and discard any card that
> is not higher.
> 
> Now my first thought was that you could use this same approach to
> find the two highest cards, but there's one case where it doesn't
> work - if the highest card comes out first.  Then your stack will
> only have one card in it so you can't just dig one card down to
> find the second highest.
> 
> So you need to think how you'd arrange cards to be certain to
> know the top two at the end of the exercise.  Then it's a simple
> matter of translating this procedure to a VHDL process.
> 
> Have fun!
> 
I posted that last night when the brain was foggy.  I should have
said that the simple algorithm won't work for finding the second
highest card if the highest card comes before the second highest.

In any case you need to think of a good algorithm for finding both.

-- 
Gabor
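
The failure case of the single-stack approach is easy to reproduce in a few
lines of Python.  This is only a model of the card analogy, with a
hypothetical function name; it keeps a card only when it beats the card on
top and then guesses that the second highest is the card just beneath.

```python
def single_stack_guess(cards):
    """Single-stack approach: keep a card only if it beats the top card.
    Returns (highest, guessed second highest); cards must not be empty."""
    stack = []
    for card in cards:
        if not stack or card > stack[-1]:
            stack.append(card)
    highest = stack[-1]
    second = stack[-2] if len(stack) > 1 else None
    return highest, second


# Works when the second highest happens to arrive before the highest...
print(single_stack_guess([3, 4, 2, 8, 9, 0, 1, 5, 7, 6]))  # -> (9, 8)
# ...but not when the highest arrives before the second highest.
print(single_stack_guess([3, 9, 8]))                       # -> (9, 3)
print(single_stack_guess([9, 3, 8]))                       # -> (9, None)
```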

Article: 155259
Subject: Re: Ask about finding maximum and second's maximum number in array is given.
From: jonesandy@comcast.net
Date: Wed, 19 Jun 2013 08:40:22 -0700 (PDT)

To borrow Gabor's card game analogy...

You have two stacks, (highest and 2nd highest)

If the drawn card is same or higher than the highest stack, then

  move the top card from the highest stack to the 2nd highest stack,
  move the drawn card to the highest stack.

else if the drawn card is same or higher than the 2nd highest stack, then

  move the drawn card to the 2nd highest stack.

draw another card and repeat.

Andy
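
Andy's procedure translates almost line for line into a per-clock update.
The Python model below is a sketch rather than VHDL: the class name TopTwo
and its clock() method are invented for the example, and None stands in
for an empty stack.

```python
class TopTwo:
    """Tracks the highest and second highest sample, one sample per clock."""

    def __init__(self):
        self.highest = None  # None models an empty "highest" stack
        self.second = None   # None models an empty "2nd highest" stack

    def clock(self, sample):
        if self.highest is None or sample >= self.highest:
            # Old highest moves down, drawn card becomes the new highest.
            self.second = self.highest
            self.highest = sample
        elif self.second is None or sample >= self.second:
            self.second = sample


def top_two(samples):
    tracker = TopTwo()
    for s in samples:
        tracker.clock(s)
    return tracker.highest, tracker.second


print(top_two([3, 4, 2, 8, 9, 0, 1, 5, 7, 6]))  # -> (9, 8)
print(top_two([9, 3, 8]))                       # -> (9, 8)
```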

Article: 155260
Subject: comparing between Xilinx and altera
From: "bjzhangwn@gmail.com" <bjzhangwn@gmail.com>
Date: Wed, 19 Jun 2013 08:47:24 -0700 (PDT)

I have a project that needs about 300KLE. I want to choose a device between
the Xilinx V7 and Altera Stratix5; please give some suggestions. In the
low-end devices, what is the key difference between Spartan6 and Cyclone5?
Which has the better route pass percentage? Which has the better resource
usage? And which has the better power consumption?

Article: 155261
Subject: Re: Modelsim ought to be cheaper
From: Sanjay Parekh <parekhsanjayh@gmail.com>
Date: Wed, 19 Jun 2013 11:13:52 -0700 (PDT)

On Tuesday, June 18, 2013 10:06:32 AM UTC-7, Kevin Neilson wrote:
> I discovered one cause (but not all) of the coredumps I experience.  If I
> have mismatched port widths in a VHDL instantiation, I'll often get
> coredumps.  There is no indication of what is wrong, but now I know what
> to look for in some cases.  I also suffer all kinds of problems when I try
> to use unconstrained outputs based on unconstrained inputs, to the point
> where I just have to avoid that feature of VHDL.
>
> I think it'd be great to have a cloud service you could use if you didn't
> need to use it that often, but I don't know if that would be profitable
> for Mentor.

In my case, the tool choked miserably whenever I misinterpreted the
SystemVerilog spec and hooked up interfaces incorrectly.

In my opinion Mentor can use the cloud platform quite creatively and make a
business out of the unmet need, which is allowing engineers to build myriad
pieces of IP that serve niche areas without going through a vetting process
to justify a big budget and therefore a big market.

And think of the community schools that generally offer programs in C
programming, etc.  Why not programs in verification, linting, scripting,
simple designs, etc.?  More side opportunities for consultants/senior
engineers as trainers, more opportunities for the students to learn online.
E.g. if Coursera/Udemy can offer software courses, why not hardware courses
as well?  And think of Kickstarter/Indiegogo, which can fund those hardware
projects.

Enough said.  I don't mean to say that the cost of the tools is the only
thing that is preventing massive innovation in hardware development.  But I
feel it is an important part, as it limits the creative ability of the
people who can make a difference.

Article: 155262
Subject: Re: comparing between Xilinx and altera
From: GaborSzakacs <gabor@alacron.com>
Date: Wed, 19 Jun 2013 16:29:30 -0400

bjzhangwn@gmail.com wrote:
> I have a project that needs about 300KLE. I want to choose a device between the Xilinx V7 and Altera Stratix5; please give some suggestions. In the low-end devices, what is the key difference between Spartan6 and Cyclone5? Which has the better route pass percentage? Which has the better resource usage? And which has the better power consumption?

It's been a while since the last X vs. A wars, here.  But it seems you
are asking two different questions.  First for 300KLE and either
Virtex7 or Stratix5.  I assume that the Spartan6 vs Cyclone5 question
is separate because I don't think they go up to 300KLE (I know for a
fact that Spartan6 only goes to 150KLE).  And now there are newer
"low-priced" Artix parts from Xilinx if you wanted to look at 7-series
for comparison with the latest Altera low-cost parts.  Artix goes up
to 200KLE (right now I think the only two sizes available are 100K
and 200K).  Artix can also do x4 PCIe if you need that.  Spartan 6 LXT
only has 1x endpoint blocks.

Not sure what you mean by "route pass percentage."  If you're talking
about the amount of logic you can stuff into a part before it becomes
unroutable, then Xilinx parts are pretty good.  You usually get to
a point where it's too hard to meet timing (due to slice packing and
other placement constraints) before you get an unroutable design.  I
generally consider about 70% LUT usage to be "full" from this
perspective.  No experience on altera parts.

-- 
Gabor
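
As a rough worked example of the 70% rule of thumb: if a design of a given
size is to land at or below a target utilization, the device has to be
correspondingly larger.  The helper below is hypothetical, and it glosses
over the fact that the rule was stated for LUT usage on Xilinx parts while
the question was posed in LEs, so treat the result as an order-of-magnitude
figure only.

```python
def min_device_size(design_size, max_utilization_pct):
    """Smallest device capacity (same units as design_size) that keeps the
    design at or below max_utilization_pct.  Integer ceiling division."""
    return -(-design_size * 100 // max_utilization_pct)


# A 300K design held to 70% utilization needs roughly a 429K device.
print(min_device_size(300_000, 70))  # -> 428572
```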

Article: 155263
Subject: Re: New soft processor core paper publisher?
From: rickman <gnuarm@gmail.com>
Date: Wed, 19 Jun 2013 17:13:04 -0400

On 6/15/2013 10:17 PM, Eric Wallin wrote:
> Thanks for your response rickman!
>
> On Saturday, June 15, 2013 8:40:27 PM UTC-4, rickman wrote:
>> That's the ground I have been plowing off and on for the last 10 years.
>
> Ooo, same here, and my condolences.  I caught a break a couple of months ago and have been beavering away on it ever since, and I finally have something that doesn't cause me to vomit when I code for it.  Multiple indexed simple stacks with explicit pointer control makes everything a lot easier than a bog standard stack machine.  I think the auto-consumption of literally everything, particularly the data, indexes, and pointers you dearly want to use again is at the bottom of all the crazy people just accept with stack machines.  This mechanism works great for manual data entry on HP calculators, but not so much for stack machines IMHO.  Auto consumption also pretty much rules out conditional execution of single instructions.

I was looking at how to improve a stack design a few months ago and came 
to a similar conclusion.  My first attempt at getting around the stack 
ops was to use registers.  I was able to write code that was both 
smaller and faster since in my design all instructions are one clock 
cycle so executed instruction count equals number of machine cycles. 
Well, sort of.  My original dual stack design was literally one clock 
per instruction.  In order to work with clocked block ram the register 
machine would use either both phases of two clocks per machine cycle or 
four clock cycles.

While pushing ideas around on paper, the J1 design gave me an idea of
adjusting the stack pointer as well as using an offset in each
instruction.  That gave a design that is even faster with fewer
instructions.  I'm not sure if it is practical in a small opcode.  I 
have been working with 8 and 9 bit opcodes, the latest approach with 
stack pointer control can fit in 9 bits, but would be happier with a 
couple more bits.

>> I assume that you do understand that the point of MISC is that the
>> implementation can be minimized so that the instructions run faster.  In
>> theory this makes up for the extra instructions needed to manipulate the
>> stack on occasion.  But I understand your interest in minimizing the
>> inconvenience of stack ops.  I spent a little time looking at
>> alternatives and am currently looking at a stack CPU design that allows
>> offsets into the stack to get around the extra stack ops.  I'm not sure
>> how this compares to your ideas.  It is still a dual stack design as I
>> have an interest in keeping the size of the implementation at a minimum.
>
> MISC is interesting, but you have to consider that all ops, including simple stack manipulations, will generally consume as much real time as a multiply, which suddenly makes all of those confusing stack gymnastics you have to perform to dig out your loop index or whatever from underneath your read/write pointer from underneath your data and such overly burdensome.

Programming to facilitate stack optimization is king on a stack machine. 
  I'm not sure how the multiply speed is relevant, but the real question 
is just how fast does an algorithm run which has to include all the 
instructions needed as well as the clock speed.  Then it is also 
important to consider resources used.  I think you said your design uses 
1800 LEs which is a *lot* more than a simple two stack design.  They 
aren't always available.

> Indexes into a moving stack - that way lies insanity.  Ever hit the roll down button on an HP calculator and get instantly flummoxed?  Maybe a compiler can keep track of that kind of stuff, but my weak brain isn't up to the task.

Then I don't know why you are designing CPUs, lol!  I like RPN 
calculators and have trouble using anything else.  I also program in 
Forth so this all works for me.

> Altera BRAM doesn't go as wide as Xilinx with true dual port.  When I was working in Xilinx I was able to use a single BRAM for both the data and return stacks (16 bit data).

I expect Xilinx has some patent that Altera can't get around for a 
couple more years.  Lattice seems to be pretty good though.  I just 
would prefer to have an async read since that works in a one clock 
machine cycle better.

>>    1800 LEs won't even fit on the FPGAs I am targeting.
>
> I'm not sure anything less than the smallest Cyclone 2 is really worth developing in.  A lot of the stuff below that is often more expensive due to the built-in configuration memory and such.  There are quite inexpensive Cyclone dev boards on eBay from China.

I don't know about dev board cost, but I can get a 1280 LUT Lattice part 
for under $4 in reasonable quantity.  That is the area I typically work 
in.  My big problem is packages.  I don't want to have to use extra fine 
pitch on PCBs to avoid the higher costs.  BGAs require very fine via 
holes and fine pitch PCB traces and run the board costs up a bit.  None 
of the FPGA makers support the parts I like very well.  VQ100 is my 
favorite, small but enough pins for most projects.

>> I would like to hear about your innovations.  As you seem to understand,
>> it is hard to be truly innovative finding new ideas that others have not
>> uncovered.  But I think you are certainly in an area that is not
>> thoroughly explored.
>
> I haven't seen anything exactly like it, certainly not the way the stacks are implemented.  And I deal with extended arithmetic results in an unusual way.  In terms of scheduling and pipelining, the Parallax Propeller is probably the closest in architecture (you can infer from the specs and operational model what they don't explicitly tell you in the datasheet).
>
>> I won't argue with that.  Even when I was an IEEE member, I never found
>> a document I didn't have to pay for.
>
> I was a member too right out of grad school.  But, like Janet Jackson sang: "What have they done for me lately?"

My mistake was getting involved in the local chapters.  Seems IEEE is 
just a good ol' boys network and is all about status and going along to 
get along.  They don't believe in the written rules, more so the 
unwritten ones.

>> When can we expect to see your paper?
>
> It's all but done, just picking around the edges at this point.  As soon as the code is verified to my satisfaction I'll release both and post here.

Ok, looking forward to it.

-- 

Rick

Article: 155264
Subject: Re: comparing between Xilinx and altera
From: rickman <gnuarm@gmail.com>
Date: Wed, 19 Jun 2013 21:28:33 -0400

On 6/19/2013 4:29 PM, GaborSzakacs wrote:
> bjzhangwn@gmail.com wrote:
>> I have a project that needs about 300KLE. I want to choose a device
>> between the Xilinx V7 and Altera Stratix5; please give some
>> suggestions. In the low-end devices, what is the key difference between
>> Spartan6 and Cyclone5? Which has the better route pass percentage?
>> Which has the better resource usage? And which has the better power
>> consumption?
>
> It's been a while since the last X vs. A wars, here. But it seems you
> are asking two different questions. First for 300KLE and either
> Virtex7 or Stratix5. I assume that the Spartan6 vs Cyclone5 question
> is separate because I don't think they go up to 300KLE (I know for a
> fact that Spartan6 only goes to 150KLE). And now there are newer
> "low-priced" Artix parts from Xilinx if you wanted to look at 7-series
> for comparison with the latest Altera low-cost parts. Artix goes up
> to 200KLE (right now I think the only two sizes available are 100K
> and 200K). Artix can also do x4 PCIe if you need that. Spartan 6 LXT
> only has 1x endpoint blocks.
>
> Not sure what you mean by "route pass percentage." If you're talking
> about the amount of logic you can stuff into a part before it becomes
> unroutable, then Xilinx parts are pretty good. You usually get to
> a point where it's too hard to meet timing (due to slice packing and
> other placement constraints) before you get an unroutable design. I
> generally consider about 70% LUT usage to be "full" from this
> perspective. No experience on altera parts.

I think this is one of those questions like, "how long is a piece of 
string"?  The utilization percentage depends entirely on your design.  I 
did a design on a Lattice part a while back and when the customer wanted 
an upgrade I warned that it would likely push the utilization up to 80% 
or more which might make it hard to route and meet timing.  Sure enough, 
the project got to about 80%, but we had no problem at all with routing 
or timing.  Certainly the tools are better than they were back when I
almost did hara-kiri working on a design update at over 90% utilization.

So I don't think there is a good answer to the question.  But there is a 
not-so-bad solution.  Unless you plan to instantiate vendor specific 
components, you should be able to write your code without making the 
decision about the vendor.  When the design is done or nearly so, 
generate bit files on both tools and see which fits best.  The most 
likely answer is, "it doesn't matter".

Power consumption can be measured very easily on most development
boards.  They usually cost less than a week's pay and sometimes less than
a day's pay.  Or you can ask the vendor...

-- 

Rick

Article: 155265
Subject: Re: Ask about finding maximum and second's maximum number in array
From: rickman <gnuarm@gmail.com>
Date: Wed, 19 Jun 2013 21:39:05 -0400

On 6/19/2013 11:40 AM, jonesandy@comcast.net wrote:
> To borrow Gabor's card game analogy...
>
> You have two stacks, (highest and 2nd highest)
>
> If the drawn card is same or higher than the highest stack, then
>
>    move the top card from the highest stack to the 2nd highest stack,
>    move the drawn card to the highest stack.
>
> else if the drawn card is same or higher than the 2nd highest stack, then
>
>    move the drawn card to the 2nd highest stack.
>
> draw another card and repeat.

They don't need to be stacks.  You just need to have two holding spots 
(registers) and initialize them to something less than anything you will 
have on the input.  Then on each draw of a card (or sample on the input) 
you compare to both spots, if the input is higher than the "highest" 
spot you save it there and put the old highest on the "second highest" 
spot.  If not, but it is higher than the "second highest" you put it 
there.

Gabor was using a stack because he thought it would get him both the 
highest and the second highest with one compare operation, but it didn't 
work.  Two compares are needed for each input.

In your approach your compare is "higher or same", why do you need to do 
anything if they are the same?  Not that it is a big deal, but in some 
situations this could require extra work.

-- 

Rick
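
The two-register version, and the question of "higher" versus "higher or
same", can be checked with a short Python model.  This is a sketch, not
VHDL: the function name, the updates counter and the initial value of -1
(chosen because it is below any 8-bit unsigned input) are assumptions made
for the example.

```python
import operator


def top_two(samples, is_higher, floor=-1):
    """Two holding registers, two compares per input sample.
    Returns (highest, second, number of samples that caused a write)."""
    highest = second = floor
    updates = 0
    for s in samples:
        if is_higher(s, highest):
            highest, second = s, highest  # old highest moves to second
            updates += 1
        elif is_higher(s, second):
            second = s
            updates += 1
    return highest, second, updates


print(top_two([3, 4, 2, 8, 9, 0, 1, 5, 7, 6], operator.gt))  # -> (9, 8, 4)
# With repeated values both compares give the same answer; "higher or
# same" just performs an extra, redundant write.
print(top_two([5, 5, 5], operator.gt))  # -> (5, 5, 2)
print(top_two([5, 5, 5], operator.ge))  # -> (5, 5, 3)
```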

Article: 155266
Subject: Re: Modelsim ought to be cheaper
From: Sanjay Parekh <parekhsanjayh@gmail.com>
Date: Thu, 20 Jun 2013 07:10:59 -0700 (PDT)

On Wednesday, June 19, 2013 11:13:52 AM UTC-7, Sanjay Parekh wrote:
> On Tuesday, June 18, 2013 10:06:32 AM UTC-7, Kevin Neilson wrote:
> > I discovered one cause (but not all) of the coredumps I experience.
> > If I have mismatched port widths in a VHDL instantiation, I'll often
> > get coredumps.  There is no indication of what is wrong, but now I
> > know what to look for in some cases.  I also suffer all kinds of
> > problems when I try to use unconstrained outputs based on
> > unconstrained inputs, to the point where I just have to avoid that
> > feature of VHDL.
> >
> > I think it'd be great to have a cloud service you could use if you
> > didn't need to use it that often, but I don't know if that would be
> > profitable for Mentor.
>
> In my case, the tool choked miserably whenever I misinterpreted the
> SystemVerilog spec and hooked up interfaces incorrectly.
>
> In my opinion Mentor can use the cloud platform quite creatively and
> make a business out of the unmet need, which is allowing engineers to
> build myriad pieces of IP that serve niche areas without going through
> a vetting process to justify a big budget and therefore a big market.
>
> And think of the community schools that generally offer programs in C
> programming, etc.  Why not programs in verification, linting,
> scripting, simple designs, etc.?  More side opportunities for
> consultants/senior engineers as trainers, more opportunities for the
> students to learn online.  E.g. if Coursera/Udemy can offer software
> courses, why not hardware courses as well?  And think of
> Kickstarter/Indiegogo, which can fund those hardware projects.
>
> Enough said.  I don't mean to say that the cost of the tools is the
> only thing that is preventing massive innovation in hardware
> development.  But I feel it is an important part, as it limits the
> creative ability of the people who can make a difference.

Interesting read today if you can see as I do the opportunities for
cloud based tools..
http://gigaom.com/2013/06/19/open-compute-is-bringing-the-maker-movement-to-the-enterprise/?utm_source=General+Users&utm_campaign=3472bd888e-c%3Atec%2Capl+d%3A06-20&utm_medium=email&utm_term=0_1dd83065c6-3472bd888e-98983131

Article: 155267
Subject: Re: New soft processor core paper publisher?
From: Eric Wallin <tammie.eric@gmail.com>
Date: Thu, 20 Jun 2013 07:35:30 -0700 (PDT)

On Wednesday, June 19, 2013 5:13:04 PM UTC-4, rickman wrote:

> While pushing ideas around on paper, the J1 design gave me an idea of
> adjusting the stack point as well as using an offset in each
> instruction.  That gave a design that is even faster with fewer
> instructions.  I'm not sure if it is practical in a small opcode.

Interesting.  The J1 strongly influenced me as well.

> I have been working with 8 and 9 bit opcodes, the latest approach with
> stack pointer control can fit in 9 bits, but would be happier with a
> couple more bits.

I decided to stay away from non-power-of-2 widths for instructions and
data.  They are not efficient in standard storage.  Having multiple
instructions per word I now see as more of a bug than a feature, because
you have to index into the word to return from a subroutine, and how /
where do you store the index?

> Programming to facilitate stack optimization is king on a stack machine.

I feel that this is a fiddly activity that wastes the programmer's time
and creates code that is exceedingly difficult to figure out later.

>   I'm not sure how the multiply speed is relevant, but the real question
> is just how fast does an algorithm run which has to include all the
> instructions needed as well as the clock speed.

Multiply is relevant because in a 32 bit machine it will likely be THE
speed bottleneck, pulling overall timing down.  The FPGA vendors include
non-fabric registering at the I/O of the multiply hardware to help
pipeline it.  Same with BRAM - reads really speed up if you use the
"free" output registering (in addition to the synchronous register you
are generally forced to use).

> > Indexes into a moving stack - that way lies insanity.  Ever hit the
> > roll down button on an HP calculator and get instantly flummoxed?
> > Maybe a compiler can keep track of that kind of stuff, but my weak
> > brain isn't up to the task.
>
> Then I don't know why you are designing CPUs, lol!  I like RPN
> calculators and have trouble using anything else.  I also program in
> Forth so this all works for me.

Quite the contrary, I've used HP calculators religiously since I won one
in a HS engineering contest almost 30 years ago.  Too bad they don't
make the "real" ones anymore (35S is the best they can do it seems,
maybe they lost the plans along with those of the Saturn V).  But when I
hit the roll down button to find a value on the stack, I have to give up
on the other stack items due to confusion.  I really want to like Forth,
but after reading the books and being repeatedly repelled by the syntax
and programming model I gave up.

My goal with CPU design was to make one simple enough to program without
special tools, but complex enough to do real work and I think I've
finally achieved that.

> I expect Xilinx has some patent that Altera can't get around for a
> couple more years.  Lattice seems to be pretty good though.  I just
> would prefer to have an async read since that works in a one clock
> machine cycle better.

I like Lattice parts too, and used the original MachXO on many boards in
lieu of a CPLD.

But I gave up on single cycle along with two stacks and autoconsumption.
Like you say async read BRAM is hard to come by.  Single cycle is also
slow and strands a bazillion FFs in the fabric.

I wonder if you've read this article:

http://spectrum.ieee.org/semiconductors/processors/25-microchips-that-shook-the-world

Moore made a lot of money off of what seem like frivolous lawsuits,
which brings him down several notches in my eyes.

Article: 155268
Subject: Re: New soft processor core paper publisher?
From: Tom Gardner <spamjunk@blueyonder.co.uk>
Date: Thu, 20 Jun 2013 15:51:15 +0100

Eric Wallin wrote:
> Quite the contrary, I've used HP calculators religiously since
 > I won one in a HS engineering contest almost 30 years ago.
 > Too bad they don't make the "real" ones anymore (35S is the
 > best they can do it seems, maybe they lost the plans along
 > with those of the Saturn V).

I'm sure HP still has the plans for the Saturn, viz
http://www.hpmuseum.org/saturn.htm

Sorry, couldn't resist.


 > I really want to like Forth, but after reading the books
 > and being repeatedly repelled by the syntax and programming
 > model I gave up.

Nobody /writes/ Forth. They write programs that emit Forth.
The most mainstream example of that is printer drivers
emitting PostScript.

Article: 155269
Subject: Re: Modelsim ought to be cheaper
From: HT-Lab <hans64@htminuslab.com>
Date: Thu, 20 Jun 2013 16:33:52 +0100

On 20/06/2013 15:10, Sanjay Parekh wrote:
> On Wednesday, June 19, 2013 11:13:52 AM UTC-7, Sanjay Parekh wrote:
>> On Tuesday, June 18, 2013 10:06:32 AM UTC-7, Kevin Neilson wrote:
..
>>
>>> I think it'd be great to have a cloud service you could use if you didn't need to use it that often, but I don't know if that would be profitable for Mentor.
..
> Interesting read today if you can see as I do the opportunities for cloud based tools.. http://gigaom.com/2013/06/19/open-compute-is-bringing-the-maker-movement-to-the-enterprise/?utm_source=General+Users&utm_campaign=3472bd888e-c%3Atec%2Capl+d%3A06-20&utm_medium=email&utm_term=0_1dd83065c6-3472bd888e-98983131
>

I don't think cloud EDA services will happen soon for the simple reason 
that companies are generally not happy to splatter their highly valuable 
IP over the internet.

You have an additional problem that the servers are normally not located 
in your country which means you have to fight a foreign court system if 
something goes wrong (server hacked, IP theft, etc).

Hans
www.ht-lab.com

Article: 155270
Subject: Re: Modelsim ought to be cheaper
From: Eric Wallin <tammie.eric@gmail.com>
Date: Thu, 20 Jun 2013 10:13:59 -0700 (PDT)

On Tuesday, April 23, 2013 4:13:42 PM UTC-4, Kevin Neilson wrote:
> Why is Modelsim so expensive?  It is a mature product and yet it
> segfaults on me all the time.  Constantly.  Often, when it ought to
> give me warnings or errors (such as when there is a port width
> mismatch) it just core dumps instead, leaving me to comment out lines
> one at a time until I figure out why it's crashing.  That's my rant.
> It's still pretty decent, but ought to be cheaper if it's going to
> coredump like freeware.

The simulator in Quartus is nice and has a "functional simulation" mode
that makes the compile fairly trivial and quick.  Altera unfortunately
unbundled it from the main GUI after 9.2SP2 and turned it into an ugly
rickety Tcl based unintegrated monstrosity.  At the time the rep told me
"no one uses it".

Don't mind me, I'm just a nobody.

Article: 155271
Subject: Re: New soft processor core paper publisher?
From: rickman <gnuarm@gmail.com>
Date: Thu, 20 Jun 2013 19:50:09 -0400

On 6/20/2013 10:51 AM, Tom Gardner wrote:
> Eric Wallin wrote:
>  > I really want to like Forth, but after reading the books
>  > and being repeatedly repelled by the syntax and programming
>  > model I gave up.

Quitter!  If the syntax (or near total lack thereof) bothers you, then 
you must have a very thin skin.

> Nobody /writes/ Forth. They write programs that emit Forth.
> The most mainstream example of that is printer drivers
> emitting PostScript.

LOL!  I guess I was documenting something this morning rather than 
writing code...

-- 

Rick

Article: 155272
Subject: Re: Modelsim ought to be cheaper
From: Kevin Neilson <kevin.neilson@xilinx.com>
Date: Thu, 20 Jun 2013 17:34:54 -0700 (PDT)

I'd really like to make some cores in my spare time, but the revenues
would be pretty small, and there is no way it would be worthwhile to buy
Synplify and Modelsim licenses for such a small endeavor.  I don't know
exactly what that would cost, but I'm sure it's tens of thousands.  It'd
be great if I could use the tools online for a few hours here and there
and just pay for that.  Even if I couldn't use the GUI--if I could just
get an EDIF and .srr file back--that would be useful.

I guess I could use Icarus or something, but I'm sure it's not going to
parse the nice SysVerilog / VHDL 2008 code I write, and who wants to buy
a core that comes with an Icarus project file?

Article: 155273
Subject: Re: New soft processor core paper publisher?
From: "Elizabeth D. Rather" <erather@forth.com>
Date: Thu, 20 Jun 2013 15:31:53 -1000

On 6/20/13 1:50 PM, rickman wrote:
> On 6/20/2013 10:51 AM, Tom Gardner wrote:
>> Eric Wallin wrote:
>>  > I really want to like Forth, but after reading the books
>>  > and being repeatedly repelled by the syntax and programming
>>  > model I gave up.
>
> Quitter!  If the syntax (or near total lack thereof) bothers you, then
> you must have a very thin skin.
>
>> Nobody /writes/ Forth. They write programs that emit Forth.
>> The most mainstream example of that is printer drivers
>> emitting PostScript.
>
> LOL!  I guess I was documenting something this morning rather than
> writing code...
>

Eric, I'm curious what books these were that you found so offensive?

I'm also baffled about your comment about "programs that emit Forth." 
Although PostScript has many features in common with Forth, it is quite 
different, both in terms of command set and programming model.

Modern Forths (e.g. since the release of ANS Forth 94) feature a variety 
of implementation strategies, ranging from fairly conventional compilers 
that generate optimized machine code to more traditional threaded code 
models.

Cheers,
Elizabeth

-- 
==================================================
Elizabeth D. Rather   (US & Canada)   800-55-FORTH
FORTH Inc.                         +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Article: 155274
Subject: Re: New soft processor core paper publisher?
From: Eric Wallin <tammie.eric@gmail.com>
Date: Thu, 20 Jun 2013 20:38:58 -0700 (PDT)

On Thursday, June 20, 2013 7:50:09 PM UTC-4, rickman wrote:
> Quitter!  If the syntax (or near total lack thereof) bothers you, then
> you must have a very thin skin.

Ha ha!  And I see what you did there.

On Thursday, June 20, 2013 9:31:53 PM UTC-4, Elizabeth D. Rather wrote:
> Eric, I'm curious what books these were that you found so offensive?

The books ("Starting Forth", "Thinking Forth", "Forth Programmer's
Handbook") weren't themselves offensive, but they revealed Forth to be
much lamer than I expected for all the stick-it-to-the-man ethos
surrounding it.  I was totally stoked for a stack-based language that
would solve all my problems, but all I got was some books gathering
dust.

> I'm also baffled about your comment about "programs that emit Forth."
> Although PostScript has many features in common with Forth, it is quite
> different, both in terms of command set and programming model.

You're looking for Tom Gardner, he's down the hall near the elevators
using a little stamp at the bottom of his cane to make little chicken
footprints on the floor..
