Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
After you learn the basics, i.e. read the books on logic design and on an HDL, it's time to have some fun. Get an FPGA development board and find a problem to solve or a cool project to work on. Start simple and build up to complex things. It may take a while until your projects become "interesting" but once you get going, "the world is your oyster." ---Matthew Hicks > On Jan 24, 5:13 pm, Matthew Hicks <mdhic...@uiuc.edu> wrote: > >> Those aren't digital logic books, those are computer architecture >> books. For my first course in logic design I used "Fundamentals of >> Logic Design" by Roth. For a mix of logic design and HDL, "Advanced >> Digital Design with the Verilog HDL" by Ciletti is quite good, but >> more advanced. I personally wanted to read "Digital Design: >> Principles and Practices" by Wakerly, which got good reviews but may >> be on the advanced side of things. >> > Mhh, the Ciletti book looks good. What I am struggling with is, once > done with simple building blocks, how to put them together in a whole > design. > > I found another book along that line: > > "Advanced Digital Logic Design Using Verilog, State Machines, and > Synthesis for FPGA's" > by Sunggu Lee > http://www.amazon.com/Advanced-Digital-Verilog-Machines-Synthesis/dp/0 > 534551610/sr=8-6/qid=1169760050/ref=sr_1_6/103-6992646-3335035?ie=UTF8 > &s=books > > http://www.engineering.thomsonlearning.com/products/productPage.aspx?i > sbn=0534551610 > > Will see whether I can find some more information about it. > > Guenter >Article: 114901
Orcad 10.5 and later has a new function "New Part from Spreadsheet" that makes this much easier. If you use ADEPT, you can export a csv file that can be directly copied and pasted to the spreadsheet in Orcad to generate a multi-part symbol. Check the link below for details: http://home.comcast.net/~jimwu88/tools/adept/gen_orcad_symbol.html ADEPT can be freely downloaded from http://home.comcast.net/~jimwu88/tools/adept/ Cheers, Jim On Jan 25, 5:03 pm, "Symon" <symon_bre...@hotmail.com> wrote: > <george_gran...@hotmail.com> wrote in messagenews:1169758971.659992.128940@j27g2000cwj.googlegroups.com...> Does anyone know where I can get the OrCAD symbol for the Xilinx V5LX50 > > FF676 device?http://www.fpga-faq.com/FAQ_Pages/0027_Creating_PCB_symbols_for_FPGAs... > HTH, SymsArticle: 114902
Does anyone know a decent - FREE - timing diagram cad tool? ThanksArticle: 114903
Schematicsymbol.com sells the Xilinx Virtex5 LX schematic symbol for OrCAD schematic capture tools. The offering of components is constantly expanding and should have all Virtex 4 and 5 parts, as well as the new Spartan3 symbols available soon. Large schematic symbols have multiple components with pins grouped by function and bank. Most symbols are pdf viewable before purchase by using Adobe reader and clicking on the 'VIEW' link. Secure credit card payment/ instant automated notification and download. Thanks!! www.schematicsymbol.com supportArticle: 114904
I guess I misunderstood your original ask. I was thinking that the data was coming in like this (assuming a 12-pixel line for discussion purposes):

clk1: pixels 1, 2
clk2: pixels 3, 4
clk3: pixels 5, 6
--------------------split
clk4: pixels 7, 8
clk5: pixels 9, 10
clk6: pixels 11, 12

And what you wanted to do was to split the line in half and ship out two data streams. The first line will be used to initially fill the memory buffers. So, to fill the memory you would:

clk1: write into address 0 of mem buf_a, pixels 1/2
clk2: write into address 1 of mem buf_a, pixels 3/4
clk3: write into address 2 of mem buf_a, pixels 5/6

Then reset the write pointer for mem buf_a back to address 0 while waiting for mem buf_b to fill.

clk4: write into address 0 of mem buf_b, pixels 7/8
clk5: write into address 1 of mem buf_b, pixels 9/10
clk6: write into address 2 of mem buf_b, pixels 11/12

Then reset the write pointer for mem buf_b back to address 0.

L2 clk1: read from address 0 of mem buf_a and mem buf_b -- pixels 1/2 and 7/8 go out; write into address 0 of mem buf_a, L2 pixels 1/2
L2 clk2: read from address 1 of mem buf_a and mem buf_b -- pixels 3/4 and 9/10 go out; write into address 1 of mem buf_a, L2 pixels 3/4
L2 clk3: read from address 2 of mem buf_a and mem buf_b -- pixels 5/6 and 11/12 go out; write into address 2 of mem buf_a, L2 pixels 5/6

Reset the write pointer for mem buf_a back to address 0 while waiting for the rest of L2 to be written to mem buf_b; reset the write pointer for mem buf_b back to address 0.

clk4: write into address 0 of mem buf_b, L2 pixels 7/8
clk5: write into address 1 of mem buf_b, L2 pixels 9/10
clk6: write into address 2 of mem buf_b, L2 pixels 11/12

Reset the write pointer for mem buf_b back to address 0. Then repeat for each additional line.

Rob

"Sylvain Munaut <SomeOne@SomeDomain.com>" <246tnt@gmail.com> wrote in message news:1169633530.314103.126550@l53g2000cwa.googlegroups.com...
> Hi Rob,
>
> Yes the bram are dual port but that's not the issue here.
>
> In your scheme, I'll be overwriting some data of the previous line
> before I reread them ... Look at the content of block a. I'll need 1
> full line time to reread it, but I'll fill it with new data in only
> half a line time ...
>
> Sylvain
>
> On Jan 23, 4:20 am, "Rob" <robns...@frontiernet.net> wrote:
>> Sylvain
>>
>> I'm not familiar with Xilinx's memory architecture; but if their memory
>> blocks have the option of being run in dual-port mode, it could make this
>> problem much easier to deal with.
>>
>> In the past I've taken advantage of other manufacturers' mixed-port
>> read-during-write mode. This mode is used when a RAM has one port
>> reading and the other port writing to the same address location with the
>> same clock. The memory block outputs the old data at the specified
>> address when there is a simultaneous read during write to the same port.
>> You then could set up two blocks (one for each half of the image), a line
>> deep.
>>
>> First fill the memory blocks with line 1:
>>
>> fill block a
>>
>> *reset wraddr_a back to addr 0 and wait for block b to fill
>>
>> fill block b
>>
>> reset wraddr_b back to addr 0
>>
>> Now you read through the two blocks simultaneously while writing to the
>> same address for block_a.
>>
>> reset wraddr_b back to addr 0
>>
>> repeat *
>>
>> You'll have two pointers for each memory block, one read and one write
>> pointer.
>>
>> I haven't done any work with DVI, so I may be missing something specific
>> to that interface. If so, my apologies.
>>
>> Take care,
>>
>> Rob
>>
>> "Sylvain Munaut <Some...@SomeDomain.com>" <246...@gmail.com> wrote in
>> message news:1169454808.249472.167130@a75g2000cwd.googlegroups.com...
>>
>> > Here's my problem :
>>
>> > I have a video module (that I can't really change) that outputs a
>> > 3840x2400 image by outputting two consecutive pixels at once (like
>> > dual-link DVI). The problem is that the screen to display that doesn't
>> > want dual-link DVI; it wants two independent DVI streams, one for the
>> > left part of the screen and another for the right part of the screen
>> > (two "stripes" of 1920x2400).
>>
>> > I'm trying to come up with a solution to "transform" one into the other,
>> > without using a frame buffer or storing more than 1 line of video.
>> > (At 3840, in color, that is already 6 Xilinx BRAMs, and I'm a little
>> > short of those ...)
>>
>> > According to my calculations, it should even be possible to only store
>> > half a line, but I prefer to have a 1-line delay rather than a half-line
>> > delay. My problem is that I can't find how to do it ... Storing in BRAM
>> > has proven to be an addressing nightmare to store and reread
>> > simultaneously without overwriting data I haven't re-read yet ...
>> > (since I don't read in the same order that I write).
>>
>> > Has anyone done something similar, or does anyone have a genius idea?
>> > Because I'm missing something here, that should be simple and I just
>> > don't see it ...
>>
>> > Sylvain
Article: 114905
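Rob's clock-by-clock schedule above can be checked with a small behavioral model. This is only an illustrative sketch in Python (the real design would be HDL with two dual-port BRAM line buffers); the function name, structure, and read-before-write behaviour are assumptions of this sketch, not code from the thread:

```python
# Behavioral model of the ping-pong scheme sketched above (Python used
# purely as an illustration -- the real design would be HDL with two
# BRAM line buffers; the function name and structure are my own).

def run(lines, width=12):
    """Feed in lines of `width` pixels, two pixels per clock, and
    return the two half-line output streams (left, right)."""
    depth = width // 4            # pair-addresses per half-line buffer
    buf_a = [None] * depth        # left-half line buffer
    buf_b = [None] * depth        # right-half line buffer
    out_left, out_right = [], []
    for n, line in enumerate(lines):
        pairs = [line[i:i + 2] for i in range(0, width, 2)]
        for clk in range(width // 2):
            if n > 0 and clk < depth:
                # read the previous line's data before overwriting it
                # (mixed-port read-during-write, old data comes out)
                out_left.extend(buf_a[clk])
                out_right.extend(buf_b[clk])
            if clk < depth:
                buf_a[clk] = pairs[clk]           # first half -> buf_a
            else:
                buf_b[clk - depth] = pairs[clk]   # second half -> buf_b
    return out_left, out_right

# line 1 comes out split into two parallel streams one line time later
left, right = run([list(range(1, 13)), list(range(101, 113))])
assert left == [1, 2, 3, 4, 5, 6]
assert right == [7, 8, 9, 10, 11, 12]
```

Note that the model never holds more than one line of pixels, matching Sylvain's storage budget; the last input line stays in the buffers until a following line (or a flush) drains it.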
On Thu, 25 Jan 2007 18:18:27 -0800, "any2letters" <any2willdo@letter.com> wrote: >Does anyone know a decent - FREE - timing diagram cad tool? > >Thanks > http://www.timingtool.com They have a free, web-based version of their tool. I have no idea if it's any good. But you capitalized "free," not "decent," so I figure it's worth a mention. Bob Perlman Cambrian Design Works http://www.cambriandesign.comArticle: 114906
Hi all,

I am reading "Coding Guidelines for Datapath Synthesis" from Synopsys. It says: "The most important technique to improve the performance of a datapath is to avoid expensive carry-propagations and to make use of redundant representations instead (like carry-save or partial-product) wherever possible."

1. Is there any article that talks about what "carry-propagation" is and how to avoid using it?
2. What does "redundant representations" mean?

Please recommend some readings about it. Thanks in advance!

Best regards,
Davy
Article: 114907
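The quoted guideline can be illustrated with a small sketch (the editor's own, not an example from the Synopsys document): in a carry-save representation a value is held as two words whose sum is the real value, so intermediate additions avoid the slow ripple of a carry-propagate chain, and only the final conversion back to ordinary binary needs a full adder.

```python
# Generic illustration of a redundant (carry-save) representation --
# my own sketch, not from the Synopsys guide. A 3:2 compressor adds
# three numbers into a (sum, carry) pair with no carry chain at all;
# only converting the pair back to plain binary needs a (slow)
# carry-propagate addition.

def carry_save_add(x, y, z, width=16):
    mask = (1 << width) - 1
    s = (x ^ y ^ z) & mask                            # per-bit sum
    c = (((x & y) | (x & z) | (y & z)) << 1) & mask   # per-bit carries
    return s, c  # redundant pair: s + c == x + y + z (mod 2**width)

s, c = carry_save_add(100, 200, 300)
assert (s + c) & 0xFFFF == 600   # one cheap CSA stage, no ripple yet
```

Chaining many such stages (as in a multiplier's partial-product tree) defers the expensive carry-propagation to a single final adder, which is exactly what the guideline is recommending.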
On Jan 24, 11:36 am, "wallge" <wal...@gmail.com> wrote: > I am doing some embedded video processing, where I store an incoming > frame of video, then based on some calculations in another part of the > system, I warp that buffered frame of video. Now when the frame goes > into the buffer > (an off-FPGA SDRAM chip), it is simply written in one pixel at a time > in row major ordering. > > The problem with this is that I will not be accessing it in this way. I > may want to do some arbitrary image rotation. This means > the first pixel I want to access is not the first one I put in the > buffer, It might actually be the last one in the buffer. If I am doing > full page reads, or even burst reads, I will get a bunch of pixels that > I will not need to determine the output pixel value. If i just do > single reads, this waists a bunch of clock cycles setting up the SDRAM, > telling it which row to activate and which column to read from. After > the read is done, you then have to issue the precharge command to close > the row. There is a high degree of inefficiency to this. It takes 5, > maybe 10 clock cycles just to retrieve one > pixel value. > > Does anyone know a good way to organize a frame buffer to be more > friendly (and more optimal) to nonsequential access (like the kind we > might need if we wanted to warp the input image via some > linear/nonlinear transformation)? A fairly simple technique, reasonably well-known among video system designers, is to use what is sometimes called tiling the image into the ((DDR)S)DRAM columns. E.g., assume a 1Kx1K image with vertical and horizontal address bits (V9..V0) and (H9..H0), and also DRAM with row and column address bits (R9..R0) and (C9..C0). 
Do _not_ use the straight mapping of:

    (V9..V0) <=> (R9..R0) and (H9..H0) <=> (C9..C0)

Instead, map the H/V LSBs into the DRAM column address, and the H/V MSBs to the DRAM row address:

    (V4..V0,H4..H0) <=> (C9..C0) and (V9..V5,H9..H5) <=> (R9..R0)

When warping, the image sample addresses are pipelined out to DRAM, with time designed into the pipeline to examine the addresses for DRAM row boundary crossings, and only stall when it's necessary to re-RAS. Stalls only occur when the sampling area overlaps the edge of a tile, instead of with every 2x2 or 3x3 fetch. (You posed your question so well, I'll bet this already occurred to you.)

Caching can also be used to bypass the external RAM access pipeline when the required pixels are already in the FPGA. There are lots of different caching techniques; I haven't looked at that in a while. Block processing is a kind of variant of caching: reading a tile from external DRAM into BRAM, warping from that BRAM into another BRAM, then sending the results back out, but border calculations get messy for completely arbitrary warps.

HTH

Just John
Article: 114908
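John's bit-field mapping can be sanity-checked with a quick model. The functions below are an illustrative sketch of the tiling idea under the stated 1Kx1K / 10-row-bit / 10-column-bit assumptions (the names are the editor's, not from the post):

```python
# Quick model of the tiled mapping above (my own sketch): a 1Kx1K
# image, 10 DRAM row bits and 10 column bits, so each 32x32-pixel tile
# fills exactly one 1024-entry DRAM row.

def linear_map(v, h):
    # straight mapping: DRAM row = (V9..V0), DRAM column = (H9..H0)
    return v, h

def tiled_map(v, h):
    row = ((v >> 5) << 5) | (h >> 5)    # (V9..V5, H9..H5) -> row
    col = ((v & 31) << 5) | (h & 31)    # (V4..V0, H4..H0) -> column
    return row, col

# A 3x3 sample neighborhood well inside a tile touches one DRAM row
# under the tiled mapping, but three rows (two extra re-RAS penalties)
# under the straight mapping.
patch = [(v, h) for v in range(100, 103) for h in range(200, 203)]
assert len({tiled_map(v, h)[0] for v, h in patch}) == 1
assert len({linear_map(v, h)[0] for v, h in patch}) == 3
```

The tile size (here 32x32) just has to make one tile fit one DRAM row; a real design would pick the split to match the actual device's row length.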
Hi all,

I am reading "Coding Guidelines for Datapath Synthesis" from Synopsys, and I am confused by the example below. Why split unsigned and signed + and *?

//=====Unintended behavior======
input signed [3:0] a;
input signed [7:0] b;
output [11:0] z;
// product width is 8 bits (not 12!)
assign z = $unsigned(a * b); // -> 4x8=8 bit multiply
//============================

//======Intended behavior======
input signed [3:0] a;
input signed [7:0] b;
output [11:0] z;
wire signed [11:0] z_sgn;
// product width is 12 bits
assign z_sgn = a * b;
assign z = $unsigned(z_sgn); // -> 4x8=12 bit multiply
//============================

Best regards,
Davy
Article: 114909
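Assuming the usual Verilog rule that the operand of $unsigned() is self-determined, the width behaviour in Davy's example can be mimicked with plain integers. This is only a model of the widths, not Verilog itself; `to_signed` is a helper invented for the sketch:

```python
# Model of the width rule at work (my own sketch of the arithmetic, not
# Verilog): the operand of $unsigned() is self-determined, so a 4-bit
# by 8-bit signed multiply inside it is evaluated at max(4, 8) = 8
# bits, then zero-extended into the 12-bit z. Sizing z_sgn to 12 bits
# first lets the multiply see the full 12-bit context.

def to_signed(val, width):
    # interpret the low `width` bits as two's complement (helper of mine)
    val &= (1 << width) - 1
    return val - (1 << width) if val >> (width - 1) else val

a, b = -8, 127            # 4-bit signed minimum, 8-bit signed maximum
product = a * b           # ideal product: -1016, needs 12 bits

unintended = product & 0xFF    # wrapped to 8 bits, then zero-extended
intended = product & 0xFFF     # full 12-bit product, viewed as unsigned

assert unintended == 8         # garbage: 0x008
assert intended == 0xC08       # correct 12-bit two's-complement pattern
assert to_signed(intended, 12) == -1016
```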
There is a 20-bit counter with two inputs: on the rising edge of one input the counter must increment, and on the rising edge of the other input the counter must decrement. This is for a 1MB FIFO buffer using single-port external SRAM. I'm using the Xilinx ISE tool; according to the Xilinx tool, you cannot have two rising_edge() statements in a single process.

How do you code it?
Article: 114910
Brad Smallridge schrieb:
> Hello group,
>
> I ran into an ugly ModelSim6 warning saying that
> I had too many "leaf instances" or lines of code,
> and that my performance was going to be severely
> affected. Sure was. A simulation that normally
> runs in about a minute now takes an hour.
>
> I traced the problem down to instantiated primitive
> FIFOs that I replaced with an inferred memory.
> Why does this work? Fewer instances to be sure, but
> not less hardware. This appears to be an arbitrary
> ceiling imposed by the marketing people at Mentor.
>
> Although I have a temporary fix, questions arise:
> 1) What is a "Leaf" instance?
> 2) How do I know when I'm running out of room?
> 3) What are the best practices to get the most out
> of the ModelSim XE starter?
>
> Thanks,
>
> Brad Smallridge
> AiVision

Hi Brad,

Leaf instances are the way ModelSim counts the amount of code you want to simulate. I'm not sure how it is done precisely, but the effect you have seen can be explained. The instantiated primitive FIFO consisted of either a netlist with many gate instances to simulate or a quite large (coregen?) simulation model. Your inferred memory probably just had quite few lines of code. ModelSim doesn't care how much hardware a synthesis tool will create. It's a simulator and therefore can only take the amount of source code to simulate into account.

Older versions of ModelSim XE Starter printed the maximum and/or the actual number of leaf instances in the transcript window. I can't find that anymore in the 6.2c version.

If your simulation time went up from minutes to about an hour, the difference in leaf instances must be massive. My own experience (even with large designs) never showed such a drastic loss of performance. Do you have a large number of FIFOs in your design? E.g. a basic design has 1000 leaf instances and uses 10 FIFOs. If each FIFO consists of 500 leaf instances itself, you end up with 6000 leaf instances. If the inferred memories use 20 leaf instances each, you come up with 1200 leaf instances in the end. 6000 compared to 1200 needs a lot more calculation anyway, and if these are also slowed down by some factor... well, let's just say it will take long.

What kind of computer do you have? Current "over 2GHz" and "more than 512MB RAM" machines should perform quite well, even above the XE Starter leaf instance limit.

Have a nice simulation,
Eilert
Article: 114911
<ammonton@cc.full.stop.helsinki.fi> wrote in message news:epb4ak$evj$1@oravannahka.helsinki.fi... > doug <doug@doug> wrote: > >> Memory leaks are due to sloppy or incompetent programmers. Not fixing >> them is due to poor management. > > Obviously I can't speak for Xilinx, but IME some leaks are due to > design errors that can require whole subsystems to be rewritten to fix > properly. Entering the conversation rather late, but as an (internal) user of the ISE tools and someone else who has seen their fair share of these memory issues, I thought I ought to chime in. I'm not going to leap to the defence of the ISE programmers and pretend that there are no problems, but I wanted to draw attention to what I see as an important distinction between two similar but separate issues: (1) Memory leaks, and (2) Excessive memory consumption. No-one will argue that problem (1) is acceptable in commercial production software. However, a lot of the time what looks like a memory leak is really just problem (2): an inefficient algorithm that consumes a massive amount of memory when it runs, but is in fact perfectly well-behaved in terms of returning memory to the system when it has finished with it. (Usually this is a result of someone writing the simplest possible algorithm to get the job done, not expecting it to become the critical section of code. I think it's fair to say most of us have been guilty of that at one time or another! :-)) Many times, I and other developers in my group have encountered excessive runtimes and memory consumption for synthesizing quite simple pieces of HDL code. Upon investigation, this has often turned out to be due to complexity of coding style - for example, nested loops in VHDL function calls are a common culprit. Usually there turns out to be a way to re-phrase the code in question so that XST behaves better. 
Often what seems to happen is that problems in class (2) are deemed "annoyances" rather than defects, and thus get a lower priority than "real" bugs that cause crashes or data corruption. Nevertheless, the bottom line is that the more people who report that they're having problems, the more likely something is to get fixed. So please, carry on complaining... :-\ -Ben-Article: 114912
aravind wrote: > There is a 20 bit counter,with two inputs ,on the rising edge of one > input the counter must increment and on the rising edge of the other > input the counter must decrement. > this is for a 1MB FIFO buffer using single port external SRAM,I'm using > Xilinx ISE tool.according to the xilinx tool you cannot have two > (rising_edge()) statements in a single process. > > How do you code it? > Are the two clocks synchronised? (I assume not, since you're building a FIFO). If not, such a circuit cannot be built. Consider what might happen when the two clock edges occur almost simultaneously: the circuit will certainly fail for some (small) delay between the clocks. The usual way to build a FIFO is to use a separate counter for each port. You can compare them to determine fullness. (Hint: use Gray-code counters, not straight binary). If the clocks are synchronised, it may be possible to re-work the circuit, so the counter sees a single clock, and an UP/HOLD/DOWN control input. That is feasible.Article: 114913
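The separate-counters approach can be sketched as a small behavioral model (illustration only; a real design also needs synchronizer flops when the Gray-coded pointer crosses into the other clock domain, and the names here are the editor's):

```python
# Sketch of the two-counter FIFO-pointer approach (illustration only).

def bin_to_gray(n):
    return n ^ (n >> 1)

# Adjacent Gray codes differ in exactly one bit, so a pointer sampled
# while it is changing is wrong by at most one count, never wildly off.
for i in range(1023):
    d = bin_to_gray(i) ^ bin_to_gray(i + 1)
    assert d != 0 and d & (d - 1) == 0   # exactly one bit set

# Fullness comes from comparing the two counters, modulo the 20-bit
# pointer width used for the 1MB SRAM FIFO; wraparound falls out of
# the modular subtraction.
wr_count, rd_count = 5, (1 << 20) - 3    # write pointer has wrapped
assert (wr_count - rd_count) % (1 << 20) == 8   # 8 words in the FIFO
```

Each counter lives entirely in its own clock domain; only the Gray-coded copy ever crosses over, which is why the one-bit-per-step property matters.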
any2letters wrote:
> Does anyone know a decent - FREE - timing diagram cad tool?
>
> Thanks

* Not free:
  - http://www.syncad.com/
  - If you are looking for something that helps you write documentation, you can check a shareware tool called TimeGen (http://www.xfusionsoftware.com/). Easy to use & not too expensive.

* Free tools:
  - http://drawtiming.sourceforge.net/
  - A LaTeX package to insert timing diagrams, called 'timing', also exists. (http://www.ctan.org/tex-archive/macros/latex/contrib/timing/)
  - Check the 'other timing diagram' section on http://tdv.sourceforge.net/ for a couple of interesting links.
Article: 114914
Hello all, does anyone have an idea about the related question? What kind of code needs to be written to infer the option mentioned?

Regards,
Anil
Article: 114915
On Thu, 25 Jan 2007 18:39:32 +0100, Ludwig Lenz <llenz@vlsi.informatik.tu-darmstadt.de> wrote: >Hi > >I am working with an xilinx xup board. 8.2 sp1 ise + 8.2 edk. >I am starting creating a design with ppc running from ddr ram, sysace and >uart for output. Starting the debugger with: >xmd -xmp system.xmp -opt etc/xmd_ppc405_0.opt >with example application works. > >Closing the project adding a new plb peripherial template with 4x32 >registers no interrupts. Reopening the project. Download bitstream. When downloading bitstream , are you using Impact and the JTAG chain? If so, you need the "Pulse PROG" option set to start the PPC properly (using other tools there may be an equivalent setting). Symptoms if you don't are just what you describe. Never did get a satisfactory explanation as to what this did or why, but it fixed the problem here. - BrianArticle: 114916
On 25 Jan 2007 23:14:35 -0800, "aravind" <aramosfet@gmail.com> wrote: > >There is a 20 bit counter,with two inputs ,on the rising edge of one >input the counter must increment and on the rising edge of the other >input the counter must decrement. >this is for a 1MB FIFO buffer using single port external SRAM,I'm using >Xilinx ISE tool.according to the xilinx tool you cannot have two >(rising_edge()) statements in a single process. > >How do you code it? You don't. But consider coding two separate counters and computing the difference between them. Since you have two separate clock domains, it can get complicated. If you only need the output (difference) in one of the clock domains, consider re-latching the second domain's counter output on the first clock, to keep the output computation synchronous. - BrianArticle: 114917
On Jan 26, 4:21 pm, David R Brooks <daveb...@iinet.net.au> wrote:
> aravind wrote:
> > There is a 20 bit counter, with two inputs: on the rising edge of one
> > input the counter must increment and on the rising edge of the other
> > input the counter must decrement.
> > This is for a 1MB FIFO buffer using single port external SRAM. I'm using
> > the Xilinx ISE tool; according to the Xilinx tool you cannot have two
> > (rising_edge()) statements in a single process.
> >
> > How do you code it?
>
> Are the two clocks synchronised? (I assume not, since you're building a
> FIFO).
> If not, such a circuit cannot be built. Consider what might happen when
> the two clock edges occur almost simultaneously: the circuit will
> certainly fail for some (small) delay between the clocks.
> The usual way to build a FIFO is to use a separate counter for each
> port. You can compare them to determine fullness. (Hint: use Gray-code
> counters, not straight binary).
>
> If the clocks are synchronised, it may be possible to re-work the
> circuit, so the counter sees a single clock, and an UP/HOLD/DOWN control
> input. That is feasible.

Well, the FIFO is not exactly asynchronous. Since I'm using a single-port SRAM that sits outside the FPGA, it cannot handle a simultaneous read and write. I'm designing the state machine such that only one of the signals is asserted to the 20-bit counter at once. If I redesign the circuit to use a single clock with UP/HOLD/DOWN control, I need to waste another clock cycle; I'm already using 2 clocks for a read and 2 clocks for a write operation to satisfy the timing requirements of the SRAM chip. The reason I chose to implement a counter instead of comparing the RD and WR addresses is that I can generate full and empty signals and also know the size of the buffer at any instant.
Article: 114918
Daniel O'Connor <darius@dons.net.au> writes: > Martin Thompson wrote: >> Simulation I use vhdl-mode for, it scans my entire set of files and >> builds me a complete makefile. Anyone who hasn't spent 2 or 3 weeks > > Sure that's VHDL mode? I can't see anything in the docs about simulation.. > Yep, definitely... It's in the compilation section as none of it is simulation specific, I just target the Modelsim compiler. >> attempting to get up the learning curve of emacs/vhdl-mode has missed >> out IMHO. Two of my colleagues (who are distinctly mouse and windows >> types) are converted to Emacs for vhdl writing (purely down to the >> power of vhdl-mode). I used to use Codewright, which I thought was >> pretty hot, but it's not a patch on emacs, especially for VHDL. > > Hmm, it looks very interesting but I use Verilog :) > Ahh, in that case, sorry, you're out of luck I think. Unless verilog-mode is as far on as vhdl-mode... >> The rest of my scripts are simple bat files in windows, I just make a >> "working directory", copy in the EDF, UCF and any other NGD files I >> might need, then run NGCbuild, map, par etc. I have a few other bits >> ont he end for our internal use which generate a C-file with the >> bitstream as an array, and embed the time of compile into the UserID >> register. > > I need to know the commands.. I am software guy and all the "compile" steps > are confusing :) > See my makefile I posted elsewhere on this thread... Cheers, Martin -- martin.j.thompson@trw.com TRW Conekt - Consultancy in Engineering, Knowledge and Technology http://www.conekt.net/electronics.htmlArticle: 114919
Hi Brad,

Please take a look at the MXE FAQ that Xilinx has on what the limits of the Starter version are:
http://www.xilinx.com/xlnx/xil_ans_display.jsp?BV_UseBVCookie=yes&getPagePath=24506

If you are running into the leaf limit with instantiated Xilinx FIFOs and not when inferring, ensure that you are using the correct libraries. Xilinx marks the libraries such that each component is seen as only one line. Realize that the Starter version is free and thus it does come with limits. One of the main limits is the number of lines and the number of non-Xilinx leaf instances. It should not tell you that you are reaching the maximum number of non-Xilinx leaf instances if you are using the Xilinx components, unless you are not using the pre-compiled libraries that are released by Xilinx.

Thanks
Duth

On Jan 26, 5:44 am, "Brad Smallridge" <bradsmallri...@dslextreme.com> wrote:
> Hello group,
>
> I ran into an ugly ModelSim6 warning saying that
> I had too many "leaf instances" or lines of code,
> and that my performance was going to be severely
> affected. Sure was. A simulation that normally
> runs in about a minute now takes an hour.
>
> I traced the problem down to instantiated primitive
> FIFOs that I replaced with an inferred memory.
> Why does this work? Fewer instances to be sure, but
> not less hardware. This appears to be an arbitrary
> ceiling imposed by the marketing people at Mentor.
>
> Although I have a temporary fix, questions arise:
> 1) What is a "Leaf" instance?
> 2) How do I know when I'm running out of room?
> 3) What are the best practices to get the most out
> of the ModelSim XE starter?
>
> Thanks,
>
> Brad Smallridge
> AiVision
Article: 114920
Hi,

I am designing with Virtex-4 and need DCM outputs clk0/90/180/270 at more than 300MHz. I use an IBUFG to connect the FPGA input clock, and connect the IBUFG output to the DCM input directly. Then I set this constraint with period 3.2ns. I output clk0/clk90/clk180/clk270 with BUFGs. Unfortunately I failed timing. I checked with Timing Analyzer and got this information:

================================================================================
Timing constraint: TS_clk_90_in = PERIOD TIMEGRP "clk_90_in" TS_clk_ibufg_out PHASE 0.8 ns HIGH 50%;
2 items analyzed, 1 timing error detected. (1 setup error, 0 hold errors)
Minimum period is 4.972ns.
--------------------------------------------------------------------------------
Slack:               -0.443ns (requirement - (data path - clock path skew + uncertainty))
  Source:            u_datarecovery_a6_i/FF0 (FF)
  Destination:       u_datarecovery_c5_i (FF)
  Requirement:       0.800ns
  Data Path Delay:   1.040ns (Levels of Logic = 0)
  Clock Path Skew:   -0.003ns
  Source Clock:      clk_180 rising at 0.000ns
  Destination Clock: clk_90 rising at 0.800ns
  Clock Uncertainty: 0.200ns

I am confused by the slack formula: (requirement - (data path - clock path skew + uncertainty)). Can anybody explain this to me? And how can I reduce this data path delay? It seems my design fails because of this.

For another path, I got:

--------------------------------------------------------------------------------
Slack:               1.399ns (requirement - (data path - clock path skew + uncertainty))
  Source:            u_datarecovery_d5_i (FF)
  Destination:       u_datarecovery_d4_i (FF)
  Requirement:       2.400ns
  Data Path Delay:   0.801ns (Levels of Logic = 0)
  Clock Path Skew:   0.000ns
  Source Clock:      clk_180 rising at 1.600ns
  Destination Clock: clk_90 rising at 4.000ns
  Clock Uncertainty: 0.200ns

Why is the requirement different between these two data paths? I need them both to work at the same frequency.

Thanks.
Article: 114921
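The slack figures in the report follow directly from the formula the report itself quotes; the two requirements differ only because they are the launch-to-capture edge separations (0.0 ns to 0.8 ns on the first path, 1.6 ns to 4.0 ns on the second) of the phase-shifted clocks. A quick numeric check (the editor's own sketch; function name is invented):

```python
# The report's own slack formula, plugged in for both paths:
#   slack = requirement - (data path - clock path skew + uncertainty)

def slack(requirement, data_path, skew, uncertainty):
    return requirement - (data_path - skew + uncertainty)

# clk_180 -> clk_90: capture edge is only 0.8 ns after launch
# (90 degrees of the 3.2 ns period), hence the tight requirement
assert abs(slack(0.800, 1.040, -0.003, 0.200) - (-0.443)) < 1e-9

# second path: launch at 1.6 ns, capture at 4.0 ns -> 2.4 ns available
assert abs(slack(2.400, 0.801, 0.000, 0.200) - 1.399) < 1e-9
```

Note the negative skew makes the first path slightly worse (0.8 - 1.040 - 0.003 - 0.200 = -0.443 ns), so even a zero-logic-level path fails when only a quarter period is available.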
"skyworld" <chenyong20000@gmail.com> wrote in message news:1169822495.611226.116970@k78g2000cwa.googlegroups.com... > -------------------------------------------------------------------------------- > Slack: -0.443ns (requirement - (data path - clock path > skew + uncertainty)) > Source: u_datarecovery_a6_i/FF0 (FF) > Destination: u_datarecovery_c5_i (FF) > Requirement: 0.800ns > Data Path Delay: 1.040ns (Levels of Logic = 0) > Clock Path Skew: -0.003ns > Source Clock: clk_180 rising at 0.000ns > Destination Clock: clk_90 rising at 0.800ns > Clock Uncertainty: 0.200ns > > > I am confused by the slack: (requirement - (data path - clock path skew > + uncertainty)), Can any body help me to explain this? And how to > reduce this data path dealy? Seems my design fails with this. > You've got a FF clocked by clk_180 feeding a FF clocked by clk_90. There's only 800ps between these two edges, which is your clock period, 3.2ns * (180 - 90)/360 degrees. That leaves you a slack of 0.800 - 1.040 - 0.003 - 0.200ns. Just like it says. HTH, Syms.Article: 114922
Hi,

I'm trying to create a design that uses a LUT to control routing on a Virtex-II Pro. It's pretty easy to create the LUT in VHDL and feed it into a MUX to select the appropriate output based on the values in the LUT. I'm trying to use this in a partial reconfiguration design so that I can change the values in the LUT with a partial bitstream to change the routing. My problem is that the design is optimized and broken up into multiple LUTs, making it hard to determine what needs to be changed.

Is there any way to force the LUT to be left as a primitive and implement the equations (or initial value) that I set? I would also like to be able to force the LUT to be in a known location so that I can find it easily in the NCD file. I've seen plenty of documentation saying this can be done, but I can't find any examples. I believe I can use an RLOC but I'm not sure where the RLOC constraint should be placed.

Thanks for your help,

David

Here's what I know so far:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
library UNISIM;
use UNISIM.VComponents.all;

entity lut_mod is
    Port ( SW_0  : in  STD_LOGIC;  -- just some simple inputs and
           SW_1  : in  STD_LOGIC;  -- outputs for testing
           SW_3  : in  STD_LOGIC;
           LED_0 : out STD_LOGIC;
           LED_2 : out STD_LOGIC;
           LED_3 : out STD_LOGIC);
end lut_mod;

architecture Behavioral of lut_mod is

    signal LUT_to_MUX : STD_LOGIC;

begin

    LED_3 <= SW_1;
    LED_2 <= SW_0;

    LUT4_L_inst : LUT4_L
    generic map (
        INIT => X"1010")
    port map (
        LO => LUT_to_MUX,  -- LUT local output
        I0 => SW_3,        -- LUT input
        I1 => SW_3,        -- LUT input
        I2 => SW_3,        -- LUT input
        I3 => SW_3         -- LUT input
    );

    MUXF5_inst : MUXF5
    port map (
        O  => LED_0,       -- Output of MUX to general routing
        I0 => SW_0,        -- Input
        I1 => SW_1,        -- Input
        S  => LUT_to_MUX   -- Input select to MUX
    );

end Behavioral;
Article: 114923
Hi Ben, I checked the OCP setup on the board and everything seemed to match the configuration to generate the bitstream. Another message that I got after it got stuck on "Freeing unused kernel memory" is "nfs: server 192.168.1.137 not responding, still trying". I checked the nfs status and it is running properly. dhcpd is also running. The next step should be: "serial console detected. Disabling virtual terminals. init started: "...etc. Any suggestions? Maya On Jan 25, 7:33 pm, Ben Jackson <b...@ben.com> wrote: > On 2007-01-25, sh3.m4y4 <sh3.m...@gmail.com> wrote: > > > Freeing unused kernel memory: 60k init. Does anybody know how to solve > > this problem? Thanks.Make sure your on-chip peripheral (OCP) setup exactly matches what > you have in your bitstream. There's no probing (and no way to probe) > for most of these devices, so your kernel must exactly match your .bit. > > -- > Ben Jackson AD7GD > <b...@ben.com>http://www.ben.com/Article: 114924
On Jan 26, 7:34 am, "David" <dpmontm...@gmail.com> wrote: > Hi, > > I'm trying to create a design that uses a LUT to control routing on a > Virtex-II Pro. It's pretty easy to create the LUT in VHDL and feed > it into a MUX to select the appropriate output based on the values in > the LUT. I'm trying to use this in a partial reconfiguration design > so that I can change the values in the LUT with a partial bitstream to > change the routing. My problem is that the design is optimized and > broken up in to multiple LUTs making it hard to determine what needs to > be changed. > > Is there any way to force the LUT to be left as a primitive and > implement the equations (or initial value) that I set? I would also > like to be able to force the LUT to be in known location so that I can > find it easily in the NCD file. I've seen plenty of documentation > staying this can be done, I can't find any exampled. I believe I can > use an RLOC but I'm not sure where the RLOC constraint should be > placed. > > Thanks for your help, > > David > > Here's what I know so far: > > library IEEE; > use IEEE.STD_LOGIC_1164.ALL; > use IEEE.STD_LOGIC_ARITH.ALL; > use IEEE.STD_LOGIC_UNSIGNED.ALL; > > ---- Uncomment the following library declaration if instantiating > ---- any Xilinx primitives in this code. 
> library UNISIM; > use UNISIM.VComponents.all; > > entity lut_mod is > Port ( SW_0 : in STD_LOGIC; --just some simple inputs and > SW_1 : in STD_LOGIC; -- outputs for testing > SW_3 : in STD_LOGIC; > LED_0 : out STD_LOGIC; > LED_2 : out STD_LOGIC; > LED_3 : out STD_LOGIC); > end lut_mod; > > architecture Behavioral of lut_mod is > > signal LUT_to_MUX : STD_LOGIC; > > begin > > LED_3 <= SW_1; > LED_2 <= SW_0; > > LUT4_L_inst : LUT4_L > generic map ( > INIT => X"1010") > port map ( > LO => LUT_to_MUX, -- LUT local output > I0 => SW_3, -- LUT input > I1 => SW_3, -- LUT input > I2 => SW_3, -- LUT input > I3 => SW_3 -- LUT input > ); > > MUXF5_inst : MUXF5 > port map ( > O => LED_0, -- Output of MUX to general routing > I0 => SW_0, -- Input > I1 => SW_1, -- Input > S => LUT_to_MUX -- Input select to MUX > ); > > end Behavioral; Read The Fine Manual. The Constraints Guide shows examples of RLOCs in both the architecture declaration area for single components and in the declaration area of for generate blocks for primitives created that way. Just John