True, crying is easy. The design is running at 200 MHz DDR. As crappy as the DCMs are in the ES parts, I don't think I want to try running a faster clock in the part to sample with. And I am already using 96% of the XC2V3000 part, so I don't have much logic left to play tricks with. I am going to get down in the part and go for performance on this one. The hand-routing solution from Carl works, so I'm off to the races.

Bryan

"Falk Brunner" <Falk.Brunner@gmx.de> wrote in message news:9vtr7v$hs8un$3@ID-84877.news.dfncis.de...
> "Bryan" <bryan@srccomp.com> schrieb im Newsbeitrag
> news:3c20c0ef$0$25796$4c41069e@reader1.ash.ops.us.uu.net...
> > ncd to hand route my couple nets. What I am building into a macro is a 16
> > bit FIFO, I have 16 of these FIFOs in the design and each one contains IOB
>
> The question is, how fast are the clocks for these FIFOs? If it is not too
> fast, you can use some oversampling methods to run everything on ONE clock.
> Also, if the frequency of these FIFOs is considerably high, you should think
> about a very small, fast synchronizer circuit to synchronize the incoming
> data to one clock domain and, again, run everything on one clock.
>
> It's quite easy to run out of clock nets, but it's engineering to solve the
> problem without crying that much ;-)
>
> --
> MfG
> Falk

Article: 37851
"Ray Andraka" <ray@andraka.com> wrote in message news:3C22757C.8EA86695@andraka.com... > I think using both in the same design, at least for a consultant doing designs for > a client, is a sin among sins. It requires the client to obtain and maintain two > tool flows if he wishes to do anything with the design without having to come back > to you. Pick one and stick with it for the whole design. I entirely disagree. FPGAs have to go on a board. The board is drawn in schematics...I don't know of any HDL board tools...so typically, clients already have a schematic package. Technically, you are already mixing tools...you have the front end synthesis, and the back end Xilinx tools...and there are many of them. Being able to do mixed input designs can vastly increase productivity, as well as allow for the inclusion of external designs done in the "other" language. No one has ever had a complaint with my doing this. In fact, using Abel as a source for schematic modules was very popular some 8 or so years ago...before Verilog/VHDL synthesis tools became "usable" for FPGAs. Also, when synthesis tools are rev'd, there inevitably are changes in the way the tool creates its output...and as such, won't give you consistent results, so you technically should be archiving the tools with the design...BUT...the vendors (typically) won't give you a forever license for THAT revision of the synthesis tools so you can do that...as well as trying to get multiple revisions of tools running on the same computer...you're better off archiving the whole damn computer! To each his own, but for me, I found mixed designs work great, are easier to read/understand and give the client as well as the designers, more flexibility and a better end product. There is ONE tool that is missing from schematics, and that's something like "grep"...it would be great to be able to compare two schematics and find out the differences visually... I do like that about text files... AustinArticle: 37852
Bret,

I'm still waiting for an answer to this; would you (or anyone who knows the answer) please respond... This was what I asked:

Bret,

Where are you assigning these attributes? You said in the "front end tools", yet Synplicity has a "syn_useioff" attribute that doesn't appear to matter...you still need the "-pr b" in the mapper. According to the Synplicity docs, there is no "iob" attribute...

Are you talking about a constraint file? That's really got nothing to do with the synthesis front-end tools...

Austin

Article: 37853
Do you need your large count with a resolution of 1? Quite often a large count doesn't need to be adjustable to the nearest count, but instead say a multiple of 32. If so, you can use one counter as a prescaler, doing a /32, then the other to count the prescaler count.

Article: 37854
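For reference, Ian's prescaler suggestion above might look like the following minimal Verilog sketch. The module and signal names are made up for illustration; the split chosen here is a /32 prescaler followed by a /31250 counter, since 32 * 31250 = 1,000,000 exactly.

// Hypothetical ~1,000,000-clock delay: /32 prescaler, then /31250.
module prescaled_delay (
    input  wire clk,
    input  wire rst,
    output reg  done                        // one-cycle pulse every 1,000,000 clocks
);
    reg [4:0]  pre;                         // free-running /32 prescaler
    reg [14:0] cnt;                         // counts prescaler carries, 0..31249

    wire pre_tc = (pre == 5'd31);           // prescaler terminal count
    wire cnt_tc = pre_tc && (cnt == 15'd31249);

    always @(posedge clk) begin
        if (rst) begin
            pre  <= 5'd0;
            cnt  <= 15'd0;
            done <= 1'b0;
        end else begin
            pre <= pre + 5'd1;              // wraps naturally at 32
            if (pre_tc)
                cnt <= cnt_tc ? 15'd0 : cnt + 15'd1;
            done <= cnt_tc;                 // registered terminal-count pulse
        end
    end
endmodule

The attraction is not flip-flop count (it still uses 20 bits of state) but that the long counter only advances once every 32 clocks, so its carry path can be treated as a multicycle path if timing is tight.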
> Copy the component declaration wizard code into your architecture
> code, and instantiate it.
>
> Leonardo generates an .edf file with the component as a black box.
>
> Maxplus2 reads a wizard-generated vhdl file to fill in the
> black-box component. Make sure the wizard files are in the
> same directory as the .edf file.

That is a workable band-aid, but did Altera explain the original synthesis problem?

-- Mike Treseler

Article: 37855
I see your problem, but we will not make the read operation combinatorial. There are user advantages to being clocked, but there are also cases, like yours, where it is a drawback. But the circuit implications on our side are such that we will stick with clocked read for the foreseeable future.

There are "slightly dirty" tricks, like using the falling edge as the read clock. Most likely this involves some clock XOR-gating, which requires finesse...

Sorry, Santa cannot help you.

Peter Alfke

Rob Finch wrote:

> I wish block ram had an async read option like distributed ram. How hard
> would this be to do? The problem I have is using the same port to perform
> both read and write operations for a cpu. The cpu always generates an
> address that is a registered output, registered on the clock edge. So just
> after the clock edge, the address is available. This works great for writes
> because the next clock edge can be used to write the data to the block ram.
> However it doesn't work for reads, because we want the next clock edge to
> latch the data into a cpu register. Instead, the read data isn't available
> until after the next clock edge. So 1) a wait state could be inserted for
> read operations (cuts performance in half). 2) we can use the address from
> the cpu as it is just before it's registered and use a second port of the
> block ram - means we have two address busses and the block ram can't be
> shared with another device, or twice as many block rams are required.
>
> Rob

Article: 37856
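For illustration, the falling-edge trick Peter mentions can be sketched as a behavioral Verilog RAM with the write port on the rising edge and the read port on the falling edge; whether a given synthesis tool maps this onto block RAM, and whether the resulting half-cycle read path meets timing, would have to be checked for the real part. The names and sizes below are made up.

// Hypothetical sketch only: rising-edge write, falling-edge read, so read
// data is ready before the CPU's next rising edge.
module halfcycle_ram (
    input  wire        clk,
    input  wire        we,
    input  wire [9:0]  addr,     // registered address from the CPU
    input  wire [15:0] wdata,
    output reg  [15:0] rdata
);
    reg [15:0] mem [0:1023];

    always @(posedge clk)        // normal rising-edge write
        if (we)
            mem[addr] <= wdata;

    always @(negedge clk)        // still a clocked read, but half a cycle later
        rdata <= mem[addr];
endmodule

In a real Virtex block RAM this would presumably be done by feeding the read port an inverted or gated clock, which is where the XOR-gating Peter mentions comes in.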
I wish the same thing as you do, and for the same reason...

I think Altera has a solution for this, but the use of their block RAMs is not as clearly documented as Xilinx's.

"Rob Finch" <robfinch@sympatico.ca> schreef in bericht news:vpBU7.36868$x25.3709356@news20.bellglobal.com...
> I wish block ram had an async read option like distributed ram. How hard
> would this be to do? The problem I have is using the same port to perform
> both read and write operations for a cpu. The cpu always generates an
> address that is a registered output registered on the clock edge. So just
> after the clock edge, the address is available. This works great for writes
> because the next clock edge can be used to write the data to the block ram.
> However it doesn't work for reads, because we want the next clock edge to
> latch the data into a cpu register. Instead, the read data isn't available
> until after the next clock edge. So 1) a wait state could be inserted for
> read operations (cuts performance in half). 2) we can use the address from
> the cpu as it is just before it's registered and use a second port of the
> block ram - means we have two address busses and the block ram can't be
> shared with another device, or twice as many block rams are required.
>
> Rob

Article: 37857
Hi, Frank.

What you describe is the classical double synchronizer (sped up by using alternate clock edges). At any reasonable clock rate, this will stop the propagation of metastable signals. So, go ahead.

Frohes Fest!

Peter Alfke, Xilinx Applications
===============================

Frank Papenfuss wrote:

> Dear FPGA community,
>
> I have a design that must cope with asynchronous input
> signals. Basically I have a WE pulse that gates a data
> vector into the chip. The WE signal is sampled by two
> FFs to ensure proper pulse detection. One FF is clocked
> by the positive edge of the system clock and
> one by the negative edge (I do not want to go
> into too much detail about why I must do this). The FFs
> that sample the pulse connect to the CE (clock enable)
> of the following FF to prevent the metastable state from
> actually propagating into the design. Since I have only
> simulated this so far I cannot say if it will really work
> inside the chip (which will be a XILINX FPGA).
>
> My question is: Has anyone experience with using CE as
> a means to prevent a metastable state from propagating
> further?
>
> Tool Setup:
> -----------
> Simulation & Synthesis: SYNOPSIS Ver 1999.10
> Target Technology Mapping: XILINX Design Manager V3.3.08i
> Target Part: XILINX VirtexE XCV300E-8-PQ240
>
> I would also be grateful if you could point me to some
> electronically available article, technote or appnote
> about this topic, if available.
>
> Thanks in advance,
> FRANK

Article: 37858
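As a generic illustration of the double synchronizer being discussed (not Frank's exact circuit, which also uses the falling clock edge), a minimal Verilog sketch; all names below are invented:

// Two-stage synchronizer for an asynchronous WE pulse, with the settled
// result used as a clock enable for the data capture register.
module we_sync (
    input  wire       clk,
    input  wire       we_async,     // asynchronous write-enable pulse
    input  wire [7:0] d_async,      // data accompanying the pulse
    output reg  [7:0] d_captured
);
    reg we_meta, we_syn, we_syn_d;

    always @(posedge clk) begin
        we_meta  <= we_async;       // this stage may go metastable
        we_syn   <= we_meta;        // settled with very high probability
        we_syn_d <= we_syn;
    end

    wire capture = we_syn & ~we_syn_d;   // rising-edge detect on the clean signal

    always @(posedge clk)
        if (capture)                // the CE only ever sees a synchronous level
            d_captured <= d_async;  // the data itself must still be held long enough
endmodule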
Rick Filipkiewicz wrote:

> Frank Papenfuss wrote:
>
> > Dear FPGA community,
> >
> > I have a design that must cope with asynchronous input
> > signals. Basically I have a WE pulse that gates a data
> > vector into the chip. The WE signal is sampled by two
> > FFs to ensure proper pulse detection. One FF is clocked
> > by the positive edge of the system clock and
> > one by the negative edge (I do not want to go
> > into too much detail about why I must do this). The FFs
> > that sample the pulse connect to the CE (clock enable)
> > of the following FF to prevent the metastable state from
> > actually propagating into the design. Since I have only
> > simulated this so far I cannot say if it will really work
> > inside the chip (which will be a XILINX FPGA).
> >
> > My question is: Has anyone experience with using CE as
> > a means to prevent a metastable state from propagating
> > further?
>
> Frank,
>
> It is an unfortunate fact that if a signal from a source async to a
> clock is sampled on that clock, then there is always a chance that a
> metastable state could propagate arbitrarily far into your system.
>
> Metastability is a statistical thing and so all you can do is reduce the
> probability of its affecting your system to some very small number (or
> the MTBF >> time between you changing jobs).

The first flip-flop will undoubtedly go metastable occasionally. For the second flip-flop to go metastable, the first Q must transition just at the sensitive moment of the second flip-flop. That is very unlikely (but the probability is not zero). If the settling time margin from the Q of the first flip-flop to the D of the second flip-flop is reasonably long (5 ns or more), then the probability of the second Q behaving strangely will border on zero. If human life depends on the proper operation of this circuit, add another stage.

Peter Alfke

Article: 37860
Neat idea, but is it really worth it?

A synchronous binary counter with a capacity of counting to one million takes just 20 flip-flops, using the carry structure in modern FPGAs. That's 5 CLBs in Virtex or Spartan-II. And this design runs well above 100 MHz and requires zero creativity or even thinking.

Peter Alfke
===========================

Carl Brannen wrote:

> Re very long counter design...
>
> > In my design I need to make a synchronous counter that counts, let's
> > say, till 1000000. (Actual aim for counter is to build in a delay). I
> > do this by the use of integer type signals and with each clock'event I
> > add 1 till I reach the wanted 1000000. When I try to implement this
> > in an FPGA it consumes a very high amount of CLBs and it seems very
> > disastrous for the maximum reachable clock freq.
>
> Assuming that you don't care about the intervening counts, you can use SRL16s
> and SRL16Es to create relatively efficient large counters. And you don't have
> to deal with decoding LFSR values either.
>
> An SRL16 with its Q output brought back to its D input can be initialized (with
> an INIT attribute) to have only a single bit high. The others (of up to 17
> bits) are initialized to zero. As it clocks around, it produces a pulse
> every 17th clock.
>
> This puts a counter with a length of up to a little over 4 bits (i.e. log2(17))
> into a single LUT. That's 4x as efficient as regular counters, and you get a
> free registered "done" bit. You can gang these up, either by using the
> enables, or by ANDing the outputs of counters whose periods have no common
> divisor.
>
> Example with 5 SRL16/SRL16Es, gets within 5% of 10^6 clocks, uses only 7 LUTs:
>
> First SRL16 goes high every 17th clock. Its output connects to the enable
> input of an SRL16E that also is set for 17 clocks. The result: two bits that,
> when ANDed, produce a pulse every 17^2 = 289 clocks.
>
> Third SRL16 goes high every 15th clock. Its output connects to the enable
> input of an SRL16E that also is set for 15 clocks. The result: two bits that,
> when ANDed, produce a pulse every 15^2 = 225 clocks.
>
> Fifth SRL16 goes high every 16th clock.
>
> Since 17^2, 15^2, and 16 have no common divisors, the outputs of the five SRL16
> / SRL16Es can be ANDed together to produce a counter that pulses once every
> 17^2 * 15^2 * 16 = 1040400 clocks. This is in excess of the 1000000 (as was
> asked for), and it only took 7 LUTs (<2 CLBs). In addition, there are no lines
> that have a loading of more than 3. The 5-input AND can be implemented with
> a registered 4-input AND (of the first four SRL16s), and a registered 2-input
> AND. That means that there are no paths that go through unregistered logic,
> and the design will clock at a very high rate.
>
> One downside is that the SRLs require so much GND and VCC routing, but
> you can create all that yourself and prevent the placer from going hog wild
> with it.
>
> Another downside is what happens to the SRL16s if you have glitches on your
> clock. Unlike most counters, this circuit will not "fix" itself. But let's try
> not to think too much about that.
>
> You can also play sneaky games with the first-layer SRL16s. When that first
> registered 4-input AND gate goes high, all the SRL16s will have just been in
> their high state. That means that if you replace those two SRL16s with two
> SRL16Es, you can hook the registered AND gate output back up to the (inverted
> logic) enables of those first two SRL16Es. The effect of that modification
> will be to change that registered AND gate from counting to (16^2 * 17^2)
> to one that counts to (16^2*17^2 + 1). Since this is relatively prime
> to the previous 16^2*17^2, that means that you can build two such circuits
> and AND their outputs together to get a period of 73984 * 73985 with just
> 11 LUTs. This is getting a 32.35-bit binary count, with DONE pulse, and very
> high speed performance for only 11 LUTs, or 2.94 bits per LUT.
>
> I should mention that I've never implemented that last sneaky game, so if it
> doesn't work I'd not be completely surprised. Sure seems like it would,
> though, and my instincts for this sort of stuff are usually pretty good.
>
> Carl
>
> --
> Posted from firewall.terabeam.com [216.137.15.2]
> via Mailgate.ORG Server - http://www.Mailgate.ORG

Article: 37861
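To make Carl's basic building block concrete: one way to get the 17-clock period he describes is to put the slice flip-flop inside the feedback loop, so the ring is 16 SRL stages plus one register, and the register doubles as the free "done" bit. The sketch below assumes the Xilinx unisim SRL16 and FD primitives and shows only this single-stage divider, not the full 7-LUT chain.

// One stage of the SRL16 ring divider: 16 SRL stages plus the flip-flop.
module srl_ring_div17 (
    input  wire clk,
    output wire pulse                // high for one clock in every 17
);
    wire srl_q;

    // 16-deep shift-register LUT; INIT puts a single '1' in the ring.
    SRL16 #(.INIT(16'h0001)) u_srl (
        .Q   (srl_q),
        .A0  (1'b1), .A1 (1'b1), .A2 (1'b1), .A3 (1'b1),   // tap stage 16
        .CLK (clk),
        .D   (pulse)                 // feedback closes the ring
    );

    // The "free" registered done bit; it is also the 17th ring stage.
    FD u_fd (
        .Q (pulse),
        .C (clk),
        .D (srl_q)
    );
endmodule

The full divider would then gate further SRL16E stages off this pulse and AND the outputs together, as described in the quoted post.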
Ian Smith wrote:

> Do you need your large count with a resolution of 1? Quite often a large
> count doesn't need to be adjustable to the nearest count, but instead say a
> multiple of 32. If so, you can use one counter as a prescaler, doing a /32,
> then the other to count the prescaler count.

Yes, but what is the advantage? Prescalers and pulse-swallowers and also LFSR counters are great for speed, but they do nothing for area efficiency. To divide by a million, you need 20 flip-flops (yes, 16 might be hidden in an SRL16 look-up table).

Peter Alfke

Article: 37862
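For comparison, the plain counter Peter is arguing for, as a minimal Verilog sketch (names are illustrative); the synthesis tools map the 20-bit increment onto the dedicated carry chain.

// Straightforward divide-by-1,000,000: 20 flip-flops plus carry logic.
module div_1e6 (
    input  wire clk,
    input  wire rst,
    output reg  tick                 // one-cycle pulse every 1,000,000 clocks
);
    reg  [19:0] count;
    wire        tc = (count == 20'd999_999);   // terminal count

    always @(posedge clk) begin
        if (rst || tc)
            count <= 20'd0;
        else
            count <= count + 20'd1;
        tick <= tc && !rst;          // registered done pulse
    end
endmodule

This is the roughly-5-CLB Virtex/Spartan-II implementation quoted in Peter's earlier post.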
stefaan vanheesbeke wrote:

> I wish the same thing as you do, and for the same reason...
>
> I think Altera has a solution for this, but the use of their block RAMs is
> not as clearly documented as Xilinx's.

Don't trigger my allergic reaction by mentioning Altera's obfuscating documentation.

A data sheet should describe the functionality honestly. I do not want to have to be on constant guard against half-truths and three-quarter lies, as if I were visiting a used-car lot. When it's called "dual-port", it should be dual-port (or the limitation that one port is write-only and the other read-only should be stated). When "low-power" is claimed, it should not be the irrelevant power dissipated in the terminating resistor... That's marketing at its worst, and it does not belong in a data sheet or app note.

As long as I have been involved in Xilinx documentation (13 years), I have always tried to describe the features and limitations in a forthright way. A data sheet is first and foremost written for the design engineer. And most pages describe the device limitations (delays, set-up time requirements, max frequency, leakage currents, etc.). Marketing can create their own glossy brochures where everything is "great".

Peter Alfke

Article: 37863
Peter Alfke wrote:

> As long as I have been involved in Xilinx documentation (13 years), I have
> always tried to describe the features and limitations in a forthright way. A
> data sheet is first and foremost written for the design engineer. And most pages
> describe the device limitations (delays, set-up time requirements, max
> frequency, leakage currents, etc.).
> Marketing can create their own glossy brochures where everything is "great".
>
> Peter Alfke

There are really only 2 sorts of silicon vendors - those whose data sheets you can trust and whose manuals are written to be read by human beings, and the others. Xilinx is definitely category 1.

However, I've found that it's still usually advisable to read data sheets starting from the back to avoid the slight marketing leakage that sometimes contaminates the first page's bullet-point list :-)

Article: 37864
Rick Filipkiewicz wrote:

> There are really only 2 sorts of silicon vendors - those whose data sheets you can
> trust and whose manuals are written to be read by human beings, and the others.
> Xilinx is definitely category 1.

Thanks, nice to hear this. It requires constant vigilance.

> However, I've found that it's still usually advisable to read data sheets starting
> from the back to avoid the slight marketing leakage that sometimes contaminates the
> first page's bullet-point list :-)

Yes and no. The front page summarizes all the good things about the part, and that gives the reader (who is assumed to not know anything about this part) a feel for whether it is meaningful to bother reading the rest. By necessity, this front page is cryptic and upbeat. We think this is a neat part, otherwise we would not offer it for sale. But I am constantly toning down the fancy adjectives that marketing tries to sneak in...

The front page is the 1-minute first encounter: "Why would you like this?"

Peter Alfke

Article: 37865
Hi Stephen,

> Perhaps I'm missing something, but why can't you send the output of the
> F-LUT to output X, and the output of the F5-MUX to output F5?

As far as I can see, you can do that, and then use the same algorithm I gave in order to get the G-LUT output out of the slice with the F6-MUX. Unfortunately, my brain is just a few neurons short of quickly figuring out how efficient it would be. My guess is that it is an improvement...

Carl

--
Posted from firewall.terabeam.com [216.137.15.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG

Article: 37867
The F5 output only goes to the F6 mux in the neighboring slice, nowhere else. At the F6 you run into a similar problem, because it has to get out somewhere, and its other input has to be sourced by the F5 in that slice, which in turn is sourced by the LUTs.

Stephen Melnikoff wrote:

> > The problem with doing it is that it's hard to get the output of the "F" LUT
> > out of the slice. But it can be done by bringing it out the CARRY-OUT.
>
> Perhaps I'm missing something, but why can't you send the output of the
> F-LUT to output X, and the output of the F5-MUX to output F5?
>
> Stephen Melnikoff.
>
> --
> Stephen Melnikoff - s.j.melnikoff@iee.org
> Electronic, Electrical and Computer Engineering
> University of Birmingham, Birmingham, UK

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 37868
Kevin,

I was under the impression that Xilinx put the IRDY and TRDY hardware in there because without it they couldn't guarantee PCI compatibility.

> Regarding the "built-in PCI logic," I will assume what you mean
> is Xilinx's special IRDY and TRDY logic.
> Because the PCI IP core has to be portable across different platforms, I
> am not interested in using that special IRDY and TRDY logic, and I don't
> really know how it works.

I had a design once where the customer selected the pins himself, and I had to make them cut and jumper the prototypes in order to get the PCI IP installed right.

The Xilinx PCI logic takes an IRDY and a TRDY input, along with I1, I2, and I3, and produces an output called "PCI_CE". Its intended use is as a clock enable for when the Xilinx drives the CBE[3:0] and AD[31:0] outputs. That should give a clue about what the logic in it is. This should give another clue:

http://support.xilinx.com/xlnx/xil_ans_display.jsp?iLanguageID=1&iCountryID=1&getPagePath=10397

Since IRDY and TRDY are being brought in as inputs, I suppose this logic applies to the case when the Xilinx is a bus master, and it's used to extend cycles when the slave isn't ready. The idea would be to keep CBE constant (and AD too, if it was a master write cycle) if the slave responded with a not-ready response. But it's been a while since I looked at a PCI spec.

I'm pretty sure that if it were possible to make a Xilinx PCI IP core without the special logic, Xilinx would have done it. On the other hand, maybe their new parts are enough faster than before that the special logic isn't needed.

One thing I like about Xilinx is that their silicon has always been pretty much rock solid for me. I've never had a real complaint about their silicon, but I complain all the time about their software. If something acts silly it's always because I've got signal integrity issues (or whatever), but they're not Xilinx' fault.

Carl

I always try to register all my inputs and outputs in the IOBs because that makes it a lot easier to analyze timing for the system. It breaks the system timing calculations into two parts: (1) getting on and off of the Xilinx, but all those signals have pretty much the same timing, and (2) moving data around inside the Xilinx, but the tools handle that for me. I guess you can't simplify to that kind of system with a PCI interface.

--
Posted from firewall.terabeam.com [216.137.15.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG

Article: 37869
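Purely as an illustration of the usage pattern Carl describes above (and not the actual Xilinx PCI core, whose internals are not spelled out here), the PCI_CE output would be used roughly like a clock enable on the AD/CBE output registers, so the driven values hold whenever a transfer cannot advance. All names, widths, and the active polarity below are invented.

// Illustrative only: PCI master output registers frozen by a clock enable.
module pci_ad_out (
    input  wire        pclk,
    input  wire        pci_ce,       // from the IRDY/TRDY logic, or equivalent
    input  wire [31:0] ad_next,      // next address/data value to drive
    input  wire [3:0]  cbe_next,     // next command/byte-enable value
    output reg  [31:0] ad_q,
    output reg  [3:0]  cbe_q
);
    always @(posedge pclk)
        if (pci_ce) begin            // when not enabled, the bus value is held
            ad_q  <= ad_next;
            cbe_q <= cbe_next;
        end
endmodule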
Bryan, if you ask engineers how to solve a problem and they can't help you, it seems like they will tell you that you're going about it the wrong way. (g)

SRC Computers sounds like a cool place to work. I love pushing silicon to its limits. That's not always what the customer wants, though.

Carl

--
Posted from firewall.terabeam.com [216.137.15.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG

Article: 37870
Hi Andraka,

> In my experience, which numbers into the hundreds of FPGA designs, I haven't seen a
> single case where the design actually went to ASIC, although a decent percentage of
> the customers naively believe that theirs will and want the FPGA design done
> generically enough to go directly to ASIC. Of course they also want performance,
> or they probably wouldn't have called me in the first place.

I've only had one design go to ASIC, and it was in production with the XC2064 for years before they did it. It was coded straight from XACT and it was very tightly packed, which is probably why it took so long to put it into an ASIC. Volumes were huge, and by the time it was out of production, Xilinx was selling the XC2064s in huge volume at an amazingly low price. Xilinx must have some of the steepest price/volume curves in the industry. When you get to the high 10 thousands, those guys will cut you a deal.

The horrible thing is that it was only supposed to be a temporary remedy until we got an ASIC, so management got me to cut some very tight corners in the design. For years I was afraid that Xilinx was going to update their process and blow me away on my minimum path delays. They wanted it done too quick to go to an ASIC, and they didn't like the "risk" of ASICs. Then they kept putting off the ASIC conversion.

It ran off of a (max) 72 MHz clock by immediately dividing it to 36 MHz. Then the outputs were "DDRed" back to 72 MHz. In order to get that to work, I had to make the output bus "source synchronous", so I rebuilt a 72 MHz clock from something like an XOR with delay of the 36 MHz clock with itself. When I say the thing scared me, I'm not kidding.

For a while, parts were screened for compatibility by observing their behavior in a test circuit. The test circuit fed the chip a clock that was reduced in amplitude to 60% of normal, had slow rise and fall times, and varied its average voltage another 25% at audio frequencies. If the part could come up with a decent output clock under that kind of input clock, it got soldered into the spot. The 10% or so that failed would get soldered into another position. The time required to screen parts was around 10 seconds each. I was sure that the EE police were going to arrest me for the stunt, but I got away clean.

I suspect that if I hadn't built the screening device we'd have had problems in manufacturing, but that never happened. Eventually, Xilinx' process was sped up to the point where they always had 100% passing, and manufacturing quit screening for suitability. (Which just made me more scared.)

Carl

--
Posted from firewall.terabeam.com [216.137.15.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG

Article: 37871
Mike Treseler wrote:

> > Copy the component declaration wizard code into your architecture
> > code, and instantiate it.
> >
> > Leonardo generates an .edf file with the component as a black box.
> >
> > Maxplus2 reads a wizard-generated vhdl file to fill in the
> > black-box component. Make sure the wizard files are in the
> > same directory as the .edf file.
>
> That is a workable band-aid, but did Altera explain
> the original synthesis problem?

Not yet. It seems that Leonardo just doesn't do dual-port RAMs.

I found that to avoid a maxplus2-vhdl-licence problem, once the vhdl wizard code is put into your own code, you can then regenerate the wizard files in AHDL. Using that method, I've been using lpm_ram_dp. However, I found that I can run it at 2.5 MHz, but not 5 MHz (the DAC output goes crappy, or other parts of unrelated code stop working). I've got flip-flop pipelining everywhere, and if I bypass the RAM, I can get good waveforms.

How fast can EAB dual-port RAMs be run in an ACEX 1K30 speed -3 device?

Article: 37872
I vote for the schematic camp. I can look at a schematic and get the big picture quickly. Easy to spot bad connections too.

I have waded through other people's code and still been completely unable to figure out how they intended it to work. Often I have tried sketching block diagrams of such code and found the code had no real coherent structure. Personally, I find trying to understand non-trivial VHDL hardware designs from the source code is as awful as figuring out a circuit design from the netlist. If just writing netlists was adequate, people would not have invented schematic/PCB CAD packages. People buy pictorial maps of cities, not lists of street names and their junctions.

Arguing for text-based design on the grounds that text editors are commonly available seems to me like arguing we should use fingernails because not everyone has a screwdriver.

I will be compromising, by using a package like Orcad to design the top-level view and generate a VHDL skeleton (by selecting VHDL as the netlist output format), then fleshing out the "soft components".

K

Article: 37873
In article <3BC471C6.D745D1CE@xilinx.com>, Austin Lesea <austin.lesea@xilinx.com> wrote:

> Getting the data in and out is the next problem after the internal
> processing, and the LVDS IO allows for data buses of ~16 bits running at
> ~700 Mb/s DDR rates, and the 840 Mb/s rates with some careful placement.

FWIW, my 8:64 demux operates perfectly fine at 950 Mb/s on the input bits (475 MHz clock) as a result of hand placing all the high-speed stuff. It's a -5 part if I recall, but I've been on vacation for 3 days so it's already hazy. ;-)

My only problem, previously posted, is that TRCE reports a max clock of 170 MHz or so because multiclock designs seem to befuddle the poor thing.

Article: 37874