Messages from 108400

Article: 108400
Subject: Re: Trying to get plb_temac working
From: Siva Velusamy <siva.velusamy@xilinx.com>
Date: Sun, 10 Sep 2006 15:34:06 -0700
> 
>>There will be an external PHY on your board.  It's likely that you
>>could hook your switch up to this without anything in the FPGA and
>>it would negotiate a link speed and blink the RX lights.  If that's
>>not happening, solve that problem before you worry about the FPGA.
> 
> This doesn't seem to be the case: When I use an example project using
> opb_ethernet which directly uses the PHY the switch recognizes that
> there is a connection. (As I only have a evaluation IP core of
> opb_ethernet I want to avoid using this one)
> 

Did you check if the example project has source code to program the PHY 
& the Ethernet core? I don't think it will work before doing that.

/Siva

Article: 108401
Subject: Re: Xilinx ISE ver 8.2.02i is optimizing away and removing "redundant" logic - help!
From: james7uw@yahoo.ca
Date: 10 Sep 2006 16:25:10 -0700
Thanks for your replies, KJ, Weng. It is clear to
me that disabling optimization is not the real way
to fix my problem, but that using the coding style
that Xilinx likes is the way. Once that is
followed, no more problems will occur. I am just
not looking forward to the painful process of
trial and error that I have read will be required
by first-time Xilinx users to get the right coding
style. I have looked over the xst.pdf manual for
coding style. Removing all the latches was very
helpful when I did that prior to posting here,
back at my translate stage.

I am using only "std_logic". I will check my
synthesis report for warnings. I have no timing
violations listed as of this stage: post-map. I am
not at post-PAR yet. Why should I take the time to
place and route when post-map simulation doesn't
even work? I think doing that is for experienced
users who don't have trouble with the earlier
stages.

I am doing a lot of simultaneous "xor"s of
different bit ranges of 128-bit "words" and using
a function that uses a function (i.e.,
combinatorial logic) and I'm doing that
simultaneously as input to signals that are then
"xor"-ed. These are done after each clock cycle,
when initial signals are updated. That is, when
these initial signals are updated in a process at
the rising edge of my clock, then I have
additional signals that should just be updated
because the data has changed. I'm not using any
sensitivity list or any clock cycle for them.
These assignments should cause signals to change,
which cause the next set of signals to change, in
about three steps, with ranges of bits being
processed in parallel (and mixed, which is why I
have to get into bit ranges). Finally, signals
named "_next" are updated, then the next clock
cycle is awaited at which time the original
signals are updated from the "_next" signals.
Based on my experience so far in which I got into
trouble at the synthesize and translate stage due
to not having a clock on my ROM, do you think
putting clocks on everything would be the thing to
try? This is the trial and error that I will have
to go through.

Do you have any samples of this kind of VHDL code
that Xilinx likes, that you could show me?

Best regards,
-James


Article: 108402
Subject: Re: Xilinx ISE ver 8.2.02i is optimizing away and removing "redundant" logic - help!
From: "ankyag" <ankur101@gmail.com>
Date: 10 Sep 2006 17:51:26 -0700

> Do you have any samples of this kind of VHDL code
> that Xilinx likes, that you could show me?
>

In my experience with Xilinx and other FPGA tools, the only kind of HDL
that these tools "don't like" is non-synthesizable constructs: for example,
"a <= a+1" where "a" is not a latched signal, or "initial" blocks
in Verilog. So the best thing to do would be to read up on synthesis in
any standard HDL reference book pretty quickly.  My suspicion is that
you haven't paid attention to this while writing the code.

The other possibility might be that you are setting your clock
frequency too high which causes some setup/hold time violations and
gives you all those "reds".

Best,
ankyag


Article: 108403
Subject: Re: Xilinx ISE ver 8.2.02i is optimizing away and removing "redundant" logic - help!
From: "Weng Tianxiang" <wtxwtx@gmail.com>
Date: 10 Sep 2006 19:25:04 -0700

ankyag wrote:
> > Do you have any samples of this kind of VHDL code
> > that Xilinx likes, that you could show me?
> >
>
> In my experience with Xilinx and/or other FPGAs, the only kind of HDL
> that these tools "don't like" are non-synthesizable constructs. For e.g
> "a<= a+1", where "a" is not a latched signal, or the "initial" blocks
> in verilog. So the best thing to do would be to read up on synthesis in
> any standard HDL reference book pretty quickly.  My suspicion is that
> you haven't paid attention to this while writing the code.
>
> The other possibility might be that you are setting your clock
> frequency too high which causes some setup/hold time violations and
> gives you all those "reds".
>
> Best,
> ankyag

Hi ankyag,
I widely use statements like:
a <= a + 1;

Usually a is an unsigned (or std_logic_vector) counter; it
doesn't matter whether the statement is in a process or in a concurrent
area.

No problem at all.

I don't see why VHDL would dislike it or why it could not be synthesized.

Weng


Article: 108404
Subject: Re: ddr with multiple users
From: "Daniel S." <digitalmastrmind_no_spam@hotmail.com>
Date: Sun, 10 Sep 2006 22:30:10 -0400
David Ashley wrote:
>>> Complications:
>>> 1) To support bursting, it needs to have some sort of fifo. An easy way
>>> would be the core stores up the whole burst, then transacts it to the
>>> DDR when all is known.
>> I'd suggest keeping along that train of thought as you go forward but
>> keep refining it.
> 
> I'm starting to like this approach. Each master could then just queue
> up an access, say
> WRITE = ADDRESS + Some number of 32 bit words of data to put there
> READ = ADDRESS + the number of words you want from there.
> 
> In either case data gets shoved into a fifo owned by the master. Once
> the transaction is queued up, the master just needs to wait until it's
> done.
> 
> Let's see what the masters are:
> 1) CPU doing cache line fills + flushes, no single beat reads/writes
> 2) Batches of audio data for read
> 3) Video data for read
> 4) Perhaps DMA channels initiated by the CPU, transfer from BRAM
> to memory, say for ethernet packets.
> 
> 2,3,4 latency isn't an issue. #1 latency can be minimized if the
> cpu uses BRAM as a cache, which is the intent.
> 
> Thanks for taking the time to write all that!
> Dave

Since routing multiple 32+ bit buses consumes a fair amount of routing 
and control logic which needs tweaking whenever the design changes, I 
have been considering ring buses for future designs. As long as latency 
is not a primary issue, the ring-bus can also be used for data 
streaming, with the memory controller simply being one more possible 
target/initiator node.

Using dual ring buses (clockwise + counter-clockwise) to link critical 
nodes can take care of most latency concerns by improving proximity. For 
large and extremely intensive applications like GPUs, the memory 
controller can have multiple ring bus taps to further increase bandwidth 
and reduce latency - look at ATI's X1600 GPUs.

Ring buses are great in ASICs since they have no a-priori routing 
constraints. I wonder how well this would apply to FPGAs, since these are 
optimized for linear left-to-right data paths, give or take a few 
rows/columns. (I did some preliminary work on this and the partial 
prototype reached 240MHz on V4LX25-10, limited mostly by routing and 4:1 
muxes IIRC.)
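
As an illustration of the proximity argument (not from the original post), here is a minimal Python sketch that counts hops between nodes on a one-way ring and on a dual ring. The node count and the node numbering are assumptions chosen only for the example.

```python
def hops_single_ring(src, dst, n):
    """Hops from src to dst on a one-way (clockwise) ring of n nodes."""
    return (dst - src) % n


def hops_dual_ring(src, dst, n):
    """Hops when the sender may pick the shorter of the two directions."""
    clockwise = (dst - src) % n
    counter_clockwise = (src - dst) % n
    return min(clockwise, counter_clockwise)


def worst_case(hop_fn, n):
    """Largest hop count over every (src, dst) pair on an n-node ring."""
    return max(hop_fn(s, d, n) for s in range(n) for d in range(n))


if __name__ == "__main__":
    n = 8  # hypothetical number of ring nodes
    print("single ring worst case:", worst_case(hops_single_ring, n))
    print("dual ring worst case:  ", worst_case(hops_dual_ring, n))
```

With two directions available, the worst-case distance drops to about half the ring, which is the latency improvement described above.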

-- 
Daniel Sauvageau
moc.xortam@egavuasd
Matrox Graphics Inc.
1155 St-Regis, Dorval, Qc, Canada
514-822-6000

Article: 108405
Subject: Re: HOLD violations in Xilinx fpga
From: "Weng Tianxiang" <wtxwtx@gmail.com>
Date: 10 Sep 2006 19:41:27 -0700

Peter Alfke wrote:
> zohair wrote:
> > Are there any approaches or constraints available to fix hold-time violations in the Xilinx ISE tool aimed at Virtex-2 devices? The P&R tool is supposed to automatically fix these, but if the timing results from P&R show hold violations, what is the recommended approach to eliminating them? In the ASIC world one would insert buffers in the failing path and do an incremental P&R compile - does this apply for Xilinx FPGAs as well?
>
> Answer from Peter Alfke:
> In a synchronous design, you clock all flip-flops with a common clock.
> The Q output of one flip-flop drives (through interconnect and perhaps
> also other logic) the D-input of the "downstream" flip-flop.
> Although all flip-flops are clocked together, the clock-to-Q plus other
> delays assure that the "old' data is held at the downstream flip-flop
> input until well after the clock edge. Perfect operation!
>
> If, however, the clock arrives at the downstream flip-flop very late,
> the old data may already have disappeared, and the "new" data will be
> clocked in, which is one form of a race condition.
> You never have this problem when you us the global clock distribution,
> for its skew is less than the propagation delay from one flip-flop to
> the other.
> But if you use local clock routing, or -heaven forbid- use clock gating
> or other unsavory methods, then you can (or will) create hold time
> problems.
> The solution is to "don't do that". Use an un-gated global clock,
> together with selective Clock Enable.
> Inserting extra delays in the data path is a dangerous Band-Aid method,
> only to be used in emergencies.
> Try to understand your problem first, before fixing it.
> Peter Alfke, Xilinx

Hi Peter,
My situation is different from what you described.

My design was a PCI core. In a PCI core there are two global PCI
signals, nIRDY and nTRDY, that drive a lot of loads and cannot be
latched before they are used.
In Altera chips, nIRDY/nTRDY are not treated as global signals;
they use normal routing for all connections in a design. That caused
the nTRDY signal to disappear earlier than the global clock signal at some
flip-flops in my design, causing hold time violations. After understanding
the reason, I added a constant logic One to the related equations, which
delayed the disappearance of the nTRDY signal. After adding the constant
One logic, the hold violations disappeared.

In Xilinx chips, nIRDY/nTRDY have global nets, so the hold
violations don't happen.
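
As an illustration (not from the original post), the hold check discussed in this thread can be written as a one-line slack calculation. This is a minimal sketch assuming a same-edge check between two flip-flops; the parameter names and the numbers are hypothetical and do not come from any datasheet.

```python
def hold_slack_ns(t_clk_to_q, t_data_min, t_hold, clock_skew):
    """Hold slack in ns for a launch/capture flip-flop pair.

    clock_skew is the clock arrival time at the capturing flip-flop minus
    the arrival time at the launching flip-flop. A positive result means
    the old data is still present when the (late) clock edge samples it.
    """
    return (t_clk_to_q + t_data_min) - (t_hold + clock_skew)


if __name__ == "__main__":
    # Small skew: the path meets hold.
    print(hold_slack_ns(0.5, 0.25, 0.125, 0.25))
    # Large skew: negative slack, i.e. a hold violation.
    print(hold_slack_ns(0.5, 0.25, 0.125, 1.0))
    # Extra delay in the data path (the "constant One" trick) restores slack.
    print(hold_slack_ns(0.5, 0.25 + 0.5, 0.125, 1.0))
```

The last call shows why adding logic to the data path removes the violation, and also why it is fragile: the margin depends on a routing delay the tools do not guarantee.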

Weng

So for individual registers, it is possible that nTRDY disappears before


Article: 108406
Subject: Re: ddr with multiple users
From: "Weng Tianxiang" <wtxwtx@gmail.com>
Date: 10 Sep 2006 20:08:29 -0700

Daniel S. wrote:
> David Ashley wrote:
> >>> Complications:
> >>> 1) To support bursting, it needs to have some sort of fifo. An easy way
> >>> would be the core stores up the whole burst, then transacts it to the
> >>> DDR when all is known.
> >> I'd suggest keeping along that train of thought as you go forward but
> >> keep refining it.
> >
> > I'm starting to like this approach. Each master could then just queue
> > up an access, say
> > WRITE = ADDRESS + Some number of 32 bit words of data to put there
> > READ = ADDRESS + the number of words you want from there.
> >
> > In either case data gets shoved into a fifo owned by the master. Once
> > the transaction is queued up, the master just needs to wait until it's
> > done.
> >
> > Let's see what the masters are:
> > 1) CPU doing cache line fills + flushes, no single beat reads/writes
> > 2) Batches of audio data for read
> > 3) Video data for read
> > 4) Perhaps DMA channels initiated by the CPU, transfer from BRAM
> > to memory, say for ethernet packets.
> >
> > 2,3,4 latency isn't an issue. #1 latency can be minimized if the
> > cpu uses BRAM as a cache, which is the intent.
> >
> > Thanks for taking the time to write all that!
> > Dave
>
> Since routing multiple 32+bits buses consumes a fair amount of routing
> and control logic which needs tweaking whenever the design changes, I
> have been considering ring buses for future designs. As long as latency
> is not a primary issue, the ring-bus can also be used for data
> streaming, with the memory controller simply being one more possible
> target/initiator node.
>
> Using dual ring buses (clockwise + counter-clockwise) to link critical
> nodes can take care of most latency concerns by improving proximity. For
> large and extremely intensive applications like GPUs, the memory
> controller can have multiple ring bus taps to further increase bandwidth
> and reduce latency - look at ATI's X1600 GPUs.
>
> Ring buses are great in ASICs since they have no a-priori routing
> constraints, I wonder how well this would apply to FPGAs since these are
> optimized for linear left-to-right data paths, give or take a few
> rows/columns. (I did some preliminary work on this and the partial
> prototype reached 240MHz on V4LX25-10, limited mostly by routing and 4:1
> muxes IIRC.)
>
> --
> Daniel Sauvageau
> moc.xortam@egavuasd
> Matrox Graphics Inc.
> 1155 St-Regis, Dorval, Qc, Canada
> 514-822-6000

Hi Daniel,
Here is my suggestion.
For example, suppose there are 5 components that have access to the DDR
controller module.
What I would like to do is:
1. Each of the 5 components has an output buffer shared with the DDR
controller module;
2. The DDR controller module has one output bus shared by all 5 components
as their input bus.

Each word carries an additional bit to indicate whether it is data or a
command.
If it is a command, it indicates which component the output bus is targeting.
If it is data, the data belongs to the targeted component.

Output data streams look like this:
Command;
data;
...
data;
Command;
data;
...
data;

In the command word, you may add any information you like.
The main benefit of this scheme is that it has no delays and no penalty in
performance, and it uses the minimum number of buses.

I don't see that a ring bus has any benefit over my scheme.

In the ring case, you must have (N+1)*2 buses for N >= 2. My
scheme needs N+1 buses, where N is the number of components,
excluding the DDR controller module.
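
As an illustration (not from the original post), the command/data stream can be modelled in a few lines of Python. This is a minimal sketch assuming each word is a pair `(is_command, payload)` and that a command's payload is simply the target component number; both are assumptions, since the post leaves the command format open. `bus_counts` only restates the bus totals claimed in the post.

```python
from collections import defaultdict


def route_stream(words):
    """Deliver data words to whichever component the last command named.

    words: iterable of (is_command, payload) pairs on the shared bus.
    Returns {component_id: [data words it accepted]}.
    """
    received = defaultdict(list)
    target = None
    for is_command, payload in words:
        if is_command:
            target = payload              # command word selects the target
        elif target is not None:
            received[target].append(payload)  # all other components ignore it
    return dict(received)


def bus_counts(n):
    """Bus totals as stated in the post for n components plus the controller."""
    return {"shared": n + 1, "ring": (n + 1) * 2}


if __name__ == "__main__":
    stream = [(True, 2), (False, 0xA), (False, 0xB), (True, 0), (False, 0xC)]
    print(route_stream(stream))
    print(bus_counts(5))
```

Each component only needs to compare the target in the most recent command word against its own number, which is the "ignore others' data" logic this scheme relies on.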

Weng


Article: 108407
Subject: RESET Signals
From: "Roger" <hpsham@gmail.com>
Date: 10 Sep 2006 20:34:07 -0700
Why are RESET signals always active low? I understand that active-low
resets are immune to noise, but could someone explain in detail?

Also,

how does power-on reset work?

Thanks


Article: 108408
Subject: Re: Trying to get plb_temac working
From: Ben Jackson <ben@ben.com>
Date: Sun, 10 Sep 2006 22:39:38 -0500
On 2006-09-10, Benedikt Wildenhain <benedikt@benedikt-wildenhain.de> wrote:
> This doesn't seem to be the case: When I use an example project using
> opb_ethernet which directly uses the PHY the switch recognizes that
> there is a connection. (As I only have a evaluation IP core of
> opb_ethernet I want to avoid using this one)

Well, that board says it has a National DP83847.  Looks like you need
to register to get schematics.  You should grab the datasheet and see
what's required to make it work.  Again, if the PHY is set up for
autonegotiation it probably doesn't care what's on the MII interface
when it comes to negotiating link.  It's possible that the configuration
and reset lines on the PHY are tied to the FPGA, and it's not the MAC
that's the issue, but some glue in the example project which is driving
those config lines.  It's also possible that the example project
configures registers with MDIO before the PHY will work, but I bet it
has sufficient hardware configuration as well.

-- 
Ben Jackson AD7GD
<ben@ben.com>
http://www.ben.com/

Article: 108409
Subject: Re: Xilinx ISE ver 8.2.02i is optimizing away and removing "redundant" logic - help!
From: "ankyag" <ankur101@gmail.com>
Date: 10 Sep 2006 21:04:05 -0700


> I widely use the equation like:
> a <= a +1;
>
> Usually a is an unsigned (or std_logic_vector) for a counter, it
> doesn't matter whether the equation is in a process or in a concurrent
> area.
>
> No any problem.
>
> I don't see why VHDL dislikes it or it cannot be synthesized.

Sorry if my previous comment confused you. All I meant was that it cannot be
used as a concurrent assignment (in VHDL) or in an "assign" statement
in Verilog. It is okay to use it within a "process" or "always" block.
In the latter case, a latch/flip-flop is inferred.
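
As an illustration (not from the original post), the difference can be modelled in Python rather than HDL. This is a minimal sketch assuming a 4-bit or 2-bit counter; the function names are hypothetical. In the clocked case the register takes `a + 1` once per clock edge. In the concurrent case the assignment re-evaluates whenever `a` changes, so there is no value at which it settles.

```python
def clocked_counter(width, cycles):
    """Model of 'a <= a + 1' inside a clocked process: one step per edge."""
    mask = (1 << width) - 1
    a = 0
    trace = []
    for _ in range(cycles):
        a = (a + 1) & mask  # the register samples a + 1 at the clock edge
        trace.append(a)
    return trace


def concurrent_settles(width, max_delta=1000):
    """Model of 'a <= a + 1' as a concurrent assignment (combinational loop).

    Re-evaluates until the value stops changing; returns False if it never
    does within max_delta re-evaluations.
    """
    mask = (1 << width) - 1
    a = 0
    for _ in range(max_delta):
        nxt = (a + 1) & mask
        if nxt == a:
            return True
        a = nxt
    return False


if __name__ == "__main__":
    print(clocked_counter(2, 5))
    print(concurrent_settles(4))
```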

Hope this clears the confusion,
Ankur


Article: 108410
Subject: Re: Forth-CPU design
From: "werty" <werty@swissinfo.org>
Date: 10 Sep 2006 23:08:51 -0700

 You miss the point .

  You never want to RETURN to a "controlling"/"Calling"
  Word  in NewForth , you can not justify it EVER .
 You NEVER need to RETURN EVER , you can always run
faster and yet the upper level word is still in control
 without ever RETURNING !!!!

  Got it ?   NEVER RETURN ..


 Example :  A hi level word starts the show and has a list
 of 13 Mid-Levels   it will "run' , at the end of the first  , instead
of returning to the main word , I.P. looks at the list of 13
in main word and JUMPS indirect thru that list , to the next
 subroutine .... But w/o cost !! The mechanism is fast ,
it does NOT return to main word .

 That is NOT returning , it is transparently jumping to
next subroutine W/O wasting time Returning ,

  It gets worse as you study the
ARM 4 cores ...The BRA is actually slower for the pipeline
is slow sending the address to the P.C.  !  it takes 3 clocks !

  The problem is people get stuck on a Return stack ,
they just can't see any other way of doin it clean ....
  sorry about that !

 Imagine a CPU that has an external list of addresses
 that is its "program"  .   alias     I.T.C. / Indirect Threaded Code
 Everything in outer RAM is an address .
  No executable code in outer RAM .
  But inside the CPU is more RAM that holds  Primitives .
   Those Primitives look like an extension of the  instruction set .
 Boot the CPU and it looks for the first address from FFFF  .
  After a few Primitives are run from that sequence from outer RAM
 there's a conditional and the Primitives change the list from
which they are executing .
   This sequence can of course create new lists in outer RAM
as it runs  , and create data .

 Do you see any STACKs ?   any RETURNs ?
 All CPUs today can do this and it's faster than their
 own CALL/RETURN .

 Return  STACKS  are slow , hard to address ,  unneeded ....

 NewForth for ARM 920T will be Free OpSys ....
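
As an illustration (not from the original post), a flat list of addresses dispatched through a table of primitives can be sketched in Python. This is a minimal model under stated assumptions: the primitive names, their numbering and the example thread are all hypothetical, and a Python list stands in for the data stack. No return stack appears because the thread is flat.

```python
def lit2(stack):
    stack.append(2)


def lit3(stack):
    stack.append(3)


def add(stack):
    stack.append(stack.pop() + stack.pop())


def dup(stack):
    stack.append(stack[-1])


def mul(stack):
    stack.append(stack.pop() * stack.pop())


# "Inner RAM": the primitives, indexed by address.
PRIMITIVES = [lit2, lit3, add, dup, mul]


def run_thread(thread, primitives, stack):
    """Walk a flat list of addresses; each entry selects one primitive."""
    ip = 0
    while ip < len(thread):
        address = thread[ip]
        ip += 1               # the instruction pointer just moves down the list
        primitives[address](stack)
    return stack


if __name__ == "__main__":
    # "Outer RAM" holds only addresses: 2 3 + dup *
    print(run_thread([0, 1, 2, 3, 4], PRIMITIVES, []))
```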


Article: 108411
Subject: Re: Xilinx ISE ver 8.2.02i is optimizing away and removing "redundant"
From: David Ashley <dash@nowhere.net.dont.email.me>
Date: 11 Sep 2006 08:25:22 +0200
Weng Tianxiang wrote:
> Never spend time doing post-map simulation;
> Never spend time using DOS command lines;
> Never spend time turning off Xilinx's optimization;

Weng,

Can you clarify the 2nd one about "DOS command lines"?
I'm using xilinx webpack tools under linux, operating
from the command line. Actually I've built up a Makefile
that invokes the commands. Is there some gotcha I need
to know about? I prefer command line tools operated by
"make" as opposed to IDE's.

Below are the important pieces of the Makefile. I got the commands
from the pacman source build script and converted them to unix
make syntax. Works fine.

-Dave


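XILINX=/Xilinx
NAME=main
# Environment passed to every tool: library path, XILINX root, tool PATH.
SETUP=LD_LIBRARY_PATH=$(XILINX)/bin/lin XILINX=$(XILINX) \
		PATH=$(PATH):$(XILINX)/bin/lin

# These targets are step names, not files.
.PHONY: bitfile step0 step1 step2 step3 step4 step5 hwtest

# The steps must run in this order, so do not use "make -j" here.
bitfile:  step0 step1 step2 step3 step4 step5

# Synthesis, driven by the XST script $(NAME).scr
step0:
	$(SETUP) xst -ifn $(NAME).scr -ofn $(NAME).srp
# Translate: combine the netlist with the UCF constraints
step1:
	$(SETUP) ngdbuild -nt on -uc $(NAME).ucf $(NAME).ngc $(NAME).ngd
# Map to device resources
step2:
	$(SETUP) map -pr b $(NAME).ngd -o $(NAME).ncd $(NAME).pcf
# Place and route
step3:
	$(SETUP) par -w -ol high $(NAME).ncd $(NAME).ncd $(NAME).pcf
# Static timing report
step4:
	$(SETUP) trce -v 10 -o $(NAME).twr $(NAME).ncd $(NAME).pcf
# Bitstream generation
step5:
	$(SETUP) bitgen $(NAME).ncd $(NAME).bit -w #-f $(NAME).ut
# Download the bitstream to the board
hwtest:
	sudo xc3sprog $(NAME).bit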

-----
main.scr contains this:

run
-ifn main.prj
-ifmt VHDL
-ofn main.ngc
-ofmt NGC -p XC3S500E-FG320-4
-opt_mode Area
-opt_level 2

------
main.prj just lists the vhd source files.





-- 
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture

Article: 108412
Subject: Re: ddr with multiple users
From: David Ashley <dash@nowhere.net.dont.email.me>
Date: 11 Sep 2006 08:38:51 +0200
Weng Tianxiang wrote:
> Daniel S. wrote:
> 
>>David Ashley wrote:
>>
>>>>>Complications:
>>>>>1) To support bursting, it needs to have some sort of fifo. An easy way
>>>>>would be the core stores up the whole burst, then transacts it to the
>>>>>DDR when all is known.
>>>>
>>>>I'd suggest keeping along that train of thought as you go forward but
>>>>keep refining it.
>>>
>>>I'm starting to like this approach. Each master could then just queue
>>>up an access, say
>>>WRITE = ADDRESS + Some number of 32 bit words of data to put there
>>>READ = ADDRESS + the number of words you want from there.
>>>
>>>In either case data gets shoved into a fifo owned by the master. Once
>>>the transaction is queued up, the master just needs to wait until it's
>>>done.
>>>
>>>Let's see what the masters are:
>>>1) CPU doing cache line fills + flushes, no single beat reads/writes
>>>2) Batches of audio data for read
>>>3) Video data for read
>>>4) Perhaps DMA channels initiated by the CPU, transfer from BRAM
>>>to memory, say for ethernet packets.
>>>
>>>2,3,4 latency isn't an issue. #1 latency can be minimized if the
>>>cpu uses BRAM as a cache, which is the intent.
>>>
>>>Thanks for taking the time to write all that!
>>>Dave
>>
>>Since routing multiple 32+bits buses consumes a fair amount of routing
>>and control logic which needs tweaking whenever the design changes, I
>>have been considering ring buses for future designs. As long as latency
>>is not a primary issue, the ring-bus can also be used for data
>>streaming, with the memory controller simply being one more possible
>>target/initiator node.
>>
>>Using dual ring buses (clockwise + counter-clockwise) to link critical
>>nodes can take care of most latency concerns by improving proximity. For
>>large and extremely intensive applications like GPUs, the memory
>>controller can have multiple ring bus taps to further increase bandwidth
>>and reduce latency - look at ATI's X1600 GPUs.
>>
>>Ring buses are great in ASICs since they have no a-priori routing
>>constraints, I wonder how well this would apply to FPGAs since these are
>>optimized for linear left-to-right data paths, give or take a few
>>rows/columns. (I did some preliminary work on this and the partial
>>prototype reached 240MHz on V4LX25-10, limited mostly by routing and 4:1
>>muxes IIRC.)
>>
>>--
>>Daniel Sauvageau
>>moc.xortam@egavuasd
>>Matrox Graphics Inc.
>>1155 St-Regis, Dorval, Qc, Canada
>>514-822-6000
> 
> 
> Hi Daniel,
> Here is my suggestion.
> For example, there are 5 components which have access to DDR controller
> module.
> What I would like to do is:
> 1. Each of 5 components has an output buffer shared by DDR controller
> module;
> 2. DDR controller module has an output bus shared by all 5 components
> as their input bus.
> 
> Each data has an additional bit to indicate if it is a data or a
> command.
> If it is a command, it indicates which the output bus is targeting at.
> If it is a data, the data belongs to the targeted component.
> 
> Output data streams look like this:
> Command;
> data;
> ...
> data;
> Command;
> data;
> ...
> data;
> 
> In the command data, you may add any information you like.
> The best benefit of this scheme is it has no delays and no penalty in
> performance, and it has minimum number of buses.
> 
> I don't see ring bus has any benefits over my scheme.
> 
> In ring situation, you must have (N+1)*2 buses for N >= 2. In my
> scheme, it must have N+1 buses, where N is the number of components,
> excluding DDR controller module.
> 
> Weng
> 

Weng,

Your strategy seems to make sense to me. I don't actually know what a
ring buffer is. Your design seems appropriate for the imbalance built
into the system -- that is, any of the 5 components can initiate a
command at any time, however the DDR controller can only respond
to one command at a time. So you don't need a unique link to each
component for data coming from the DDR.

However, thinking a little more on it, each of the 5 components must
have logic to ignore the data that isn't targeted at itself. Also,
in order to be able to deal with data returned from the DDR at a
later time, a component might store it in a fifo anyway.

The approach I had sort of been envisioning involved for each
component you have 2 fifos, one goes for commands and data
from the component to the ddr, and the other is for data coming
back from the ddr. The ddr controller just needs to decide which
component to pull commands from --  round robin would be fine
for my application. If it's a read command, it need only stuff the
returned data in the right fifo.
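
The two-fifo arrangement with round-robin selection can be sketched as a small Python model. This is only an illustration under stated assumptions: commands are tuples `("W", address, [words])` or `("R", address, count)`, the DDR is a plain dictionary, and all names are hypothetical.

```python
from collections import deque


def round_robin_service(cmd_fifos, memory):
    """Serve one queued command at a time, rotating between masters.

    cmd_fifos: one deque of commands per master.
    memory:    dict standing in for the DDR contents.
    Returns (read_fifos, order): per-master read-data fifos and the
    sequence of masters that were served.
    """
    n = len(cmd_fifos)
    read_fifos = [deque() for _ in range(n)]
    order = []
    start = 0
    while any(cmd_fifos):
        for offset in range(n):
            m = (start + offset) % n
            if not cmd_fifos[m]:
                continue
            op, address, arg = cmd_fifos[m].popleft()
            if op == "W":
                for i, word in enumerate(arg):
                    memory[address + i] = word
            else:
                # Read data goes only into the requesting master's fifo.
                read_fifos[m].extend(memory.get(address + i, 0)
                                     for i in range(arg))
            order.append(m)
            start = (m + 1) % n  # next search begins after the served master
            break
    return read_fifos, order


if __name__ == "__main__":
    fifos = [deque([("W", 0, [1, 2, 3]), ("R", 0, 2)]),
             deque([("R", 1, 2)]),
             deque()]
    reads, order = round_robin_service(fifos, {})
    print([list(f) for f in reads], order)
```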

I don't know, I think I like your approach. One can always add
a 2nd fifo for read data if desired, and I think the logic to ignore
others' data is probably trivial...

-Dave

-- 
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture

Article: 108413
Subject: Re: microblaze startup problem
From: =?ISO-8859-1?Q?G=F6ran_Bilski?= <goran.bilski@xilinx.com>
Date: Mon, 11 Sep 2006 08:45:45 +0200
sjulhes wrote:
>>- How do you detect that is doesn't start? :
> 
> Test program is a hello world print onto the UART, sometimes there is 
> nothing on the hyperterminal
> 
> 
>>- Are you using external memory?
> 
> NO
> 
> 
>>- Can the program fit totally in the internal BRAM?
> 
> YES, hello world is quite light when it's alone !!!
> 

Not if you use printf; that takes around 40 KB of code.
Just check the size of the program with the command "mb-size" and be 
sure that it fits within the defined LMB area.
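
As a rough illustration of that check (not a Xilinx tool), the arithmetic is just a sum compared against the BRAM size. This sketch assumes you read the text, data and bss section sizes off the size report by hand; the function name and the 16 KB and 64 KB LMB sizes are hypothetical.

```python
def fits_in_lmb(text_bytes, data_bytes, bss_bytes, lmb_bytes):
    """True if code, initialised data and zeroed data all fit in the LMB BRAM."""
    return text_bytes + data_bytes + bss_bytes <= lmb_bytes


if __name__ == "__main__":
    printf_code = 40 * 1024  # "around 40 KB" for printf, as noted above
    print(fits_in_lmb(printf_code, 1024, 1024, 16 * 1024))
    print(fits_in_lmb(printf_code, 1024, 1024, 64 * 1024))
```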

> 
>>- What download cable are you using?
> 
> JTAG DLC7
> 
> 
> "Göran Bilski" <goran.bilski@xilinx.com> a écrit dans le message de news: 
> edrjlu$9501@cliff.xsj.xilinx.com...
> 
>>sjulhes wrote:
>>
>>>Hello all,
>>>
>>>We are trying to design a small microblaze design in a spartan 3 and the 
>>>problem we have is that the microblaze does not always start when the 
>>>bitsream is downloaded with JTAG.
>>>
>>>But when we implement the debug module it always works.
>>>
>>>Does anyone has a clue ?
>>>Is it timing problems ? Is there specific timing constraints to add for 
>>>microblaze ?
>>>Software problem ( linker script is automatically generated by edk)?
>>>
>>>Any idea is welcome.
>>>
>>>Thank you.
>>>
>>>Stéphane.
>>
>>Hi,
>>
>>It's too little information to point out the what is wrong.
>>
>>- How do you detect that is doesn't start?
>>- Are you using external memory?
>>- Can the program fit totally in the internal BRAM?
>>- What download cable are you using?
>>
>>Göran Bilski 
> 
> 
> 

Article: 108414
Subject: Re: RESET Signals
From: "Thomas Stanka" <usenet_10@stanka-web.de>
Date: 11 Sep 2006 00:08:27 -0700

Roger schrieb:

> Why RESET signals are always active low? I understand that active low
> resets are immune to noise, but could someone explain in detail?

Reset signals are not _always_ active low.
But active low allows an easy power-on reset, as you need only an RC
network to delay VCC for a few us to ms. As long as VCC is low, you have
no problem, as no chip works without VCC. When VCC rises, you only need to
delay the rise of reset until your chip is fully powered and the clock has
started running.
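
The RC delay can be estimated with the standard capacitor charging formula, t = -R*C*ln(1 - Vth/VCC). The following is a minimal Python sketch, assuming VCC steps up instantly and the reset input has a single switching threshold; the component values and threshold are hypothetical, not taken from any datasheet.

```python
import math


def rc_release_time(r_ohms, c_farads, vcc, v_threshold):
    """Seconds until an RC-delayed reset pin rises past v_threshold.

    Assumes an ideal VCC step at t = 0 and a capacitor that starts empty.
    """
    return -r_ohms * c_farads * math.log(1.0 - v_threshold / vcc)


if __name__ == "__main__":
    # 10 kOhm and 1 uF with a threshold at half of a 3.3 V supply
    t = rc_release_time(10e3, 1e-6, 3.3, 1.65)
    print(f"reset released after about {t * 1e3:.2f} ms")
```

A real supply ramps up over a finite time, so the actual delay differs; the formula only gives the order of magnitude for picking R and C.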

bye Thomas


Article: 108415
Subject: Re: Forth-CPU design
From: David Ashley <dash@nowhere.net.dont.email.me>
Date: 11 Sep 2006 09:09:53 +0200
werty wrote:
>  NewForth for ARM 920T will be Free OpSys ....

You've said you're making a forth on an arm core, or
something like that. How does that work? Wouldn't a
forth cpu execute forth instructions natively? Where would
the ARM fit in?

-Dave


-- 
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture

Article: 108416
Subject: Re: Forth-CPU design
From: David Brown <david@westcontrol.removethisbit.com>
Date: Mon, 11 Sep 2006 09:40:24 +0200
werty wrote:
>  You miss the point .
> 
>   You never want to RETURN to a "controlling"/"Calling"
>   Word  in NewForth , you can not justify it EVER .
>  You NEVER need to RETURN EVER , you can always run
> faster and yet the upper level word is still in control
>  without ever RETURNING !!!!
> 
>   Got it ?   NEVER RETURN ..
> 
> 
>  Example :  A hi level word starts the show and has a list
>  of 13 Mid-Levels   it will "run' , at the end of the first  , instead
> of returning to the main word , I.P. looks at the list of 13
> in main word and JUMPS indirect thru that list , to the next
>  subroutine .... But w/o cost !! The mechanism is fast ,
> it does NOT return to main word .
> 
>  That is NOT returning , it is transparently jumping to
> next subroutine W/O wasting time Returning ,
> 
>   It gets worse as you study the
> ARM 4 cores ...The BRA is actually slower for the pipeline
> is slow sending the address to the P.C.  !  it takes 3 clocks !
> 
>   The problem is people get stuck on a Return stack ,
> they just can't see any other way of doin it clean ....
>   sorry about that !
> 
>  Imagine a CPU that has an external list of addresses
>  that is its "program"  .   alias     I.D.C./ IndirectThreadedCode
>  Everything in outer RAM is an address .
>   No executable code in outer RAM .
>   But inside CPU is More RAM that holds  Primatives .
>    Those Primatives look like an extension of the  instruction set .
>  Boot CPU and it looks for first address from FFFF  .
>   After a few Primatives are run from that sequence from outer RAM
>  there's a conditional and the Primatives , change the list from
> which they are executing .
>    This sequence can of course create new lists in outer RAM
> as it runs  , and create data .
> 
>  Do you see any STACKs ?   any RETURNs ?
>  All CPUs today can do this and its faster than their
>  own CALL/RETURN .
> 
>  Return  STACKS  are slow , hard to address ,  uneeded ....
> 
>  NewForth for ARM 920T will be Free OpSys ....
> 

As far as I can see, that all makes sense for a program structured like 
this:

(main routine) :
	do thing1
	do thing2
	do thing3

Avoiding the call and return makes sense - the end of "thing1" can just 
be a jump to the start of "thing2", or even better, "thing2" can simply 
follow on after "thing1".

But if the program is:
(main routine) :
	do thing1
	do thing2
	do thing1
	do thing3

Or if both "thing1" and "thing2" want to call a common routine "thing4", 
you are stuck.  Either you duplicate code (which might be worth doing if 
the code is short, but not if it is long), or you have some sort of 
stack - there is no other way (any other solutions are a stack in disguise).
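
The "stack in disguise" point can be made concrete with a toy threaded interpreter. This is only a sketch in Python, not any real Forth: the names `run`, `WORDS`, `linear` and `thing1`..`thing4` are made up for illustration. A straight-line list of primitives needs nothing saved, and the last item of a list can be entered with a plain jump, but as soon as a word has to resume after calling another word, the interpreter must remember where it was, and that memory is a return stack.

```python
# Toy threaded-code interpreter (hypothetical, for illustration only).
# A "word" is either a Python callable (a primitive) or a list of word names.

trace = []

WORDS = {
    "thing4": lambda: trace.append("thing4"),   # shared primitive
    "thing3": lambda: trace.append("thing3"),
    "thing1": ["thing4"],                       # thing1 calls thing4
    "thing2": ["thing4"],                       # thing2 calls thing4 too
    "linear": ["thing3", "thing4"],             # straight-line program
    "main":   ["thing1", "thing2", "thing1", "thing3"],
}

def run(name):
    """Execute a word; return the deepest return-stack depth reached."""
    return_stack = []                 # saved (word list, next index) pairs
    body, ip = [name], 0
    max_depth = 0
    while True:
        if ip == len(body):           # end of the current list
            if not return_stack:
                return max_depth
            body, ip = return_stack.pop()    # this is the "return"
            continue
        word = WORDS[body[ip]]
        ip += 1
        if callable(word):
            word()                    # primitive: just run it
        elif ip == len(body):
            body, ip = word, 0        # last item: plain jump, nothing saved
        else:
            return_stack.append((body, ip))  # must remember where to resume
            body, ip = word, 0
            max_depth = max(max_depth, len(return_stack))
```

Running `run("linear")` never touches the return stack, while `run("main")` has to push a resume point each time it enters "thing1" or "thing2", because the same word is reached from more than one place.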


An ideal compiler arrangement would figure this sort of thing out 
automatically and cut out all the calls and returns when they are not 
necessary, but include them when they are.  I don't know much about 
Forth implementations (it's over fifteen years since I looked at Forth), 
but I do know of C compilers that do this.

Article: 108417
Subject: Re: Performance Appraisals
From: "PARTICLEREDDY (STRAYDOG)" <particlereddy@gmail.com>
Date: 11 Sep 2006 01:02:31 -0700
DONT YOU THINK THIS IS WASTE OF TIME SPENDING TIME ON SOME NON CORE NON
TECHNOLOGY RELATED MATTER. END UP THIS DISCUSSION SORRY TO SAY THIS..
BUT I SEE THAT OUR GENIUS PEOPLE IN THIS GROUP ARE SPENDING MUCH AMOUNT
OF THEIR BRAINS IN THIS KIND OF DISCUSSION..PLEASE DO INVEST YOUR
EFFORTS IN MORE TECHNOLOGY RELATED THINGS..

I AM NOT PREACHING ANY..ITS THEIR INTEREST..BUT SEE MANY A QUESTIONS OF
TRUE TECHNOLOGIES ARE NOT ATTENDED.

REGARDS
PARTICLEREDDY.


Article: 108418
Subject: Re: RESET Signals
From: "PeteS" <PeterSmith1954@googlemail.com>
Date: 11 Sep 2006 01:10:22 -0700
Roger wrote:
> Why are RESET signals always active low? I understand that active low
> resets are immune to noise, but could someone explain in detail?
>
> Also
>
> How does Power-on Reset work?
>
> Thanks

Resets are _often_ low for the reason already stated - it makes forcing
a reset at power up very simple. That said, I have in a current design
2 devices that require active high resets.

Power-on reset (as found in many microcontrollers) can be achieved
quite simply with a very basic (two transistor) flip-flop where the
power up state is guaranteed by the simple artifice of delaying a
signal or a small amount of capacitance on one side of the device. This
usually does not require an external reset signal but the technique is
the same; hold one particular line [low | high] during power-up, where
power-up is defined as rising through some threshold. Brownout-detect
uses the same principle.

For FPGAs, the power-on state is guaranteed by the init state of the
loaded bitstream. Here we simply force an initialisation if the system
voltage[s] drop below some threshold, but the effect is the same.

Cheers

PeteS



Article: 108419
Subject: Re: Performance Appraisals
From: "Michael A. Terrell" <mike.terrell@earthlink.net>
Date: Mon, 11 Sep 2006 08:58:46 GMT
"PARTICLEREDDY (STRAYDOG)" wrote:
> 
> DONT YOU THINK THIS IS WASTE OF TIME SPENDING TIME ON SOME NON CORE NON
> TECHNOLOGY RELATED MATTER. END UP THIS DISCUSSION SORRY TO SAY THIS..
> BUT I SEE THAT OUR GENIUS PEOPLE IN THIS GROUP ARE SPENDING MUCH AMOUNT
> OF THEIR BRAINS IN THIS KIND OF DISCUSSION..PLEASE DO INVEST YOUR
> EFFORTS IN MORE TECHNOLOGY RELATED THINGS..
> 
> I AM NOT PREACHING ANY..ITS THEIR INTEREST..BUT SEE MANY A QUESTIONS OF
> TRUE TECHNOLOGIES ARE NOT ATTENDED.
> 
> REGARDS
> PARTICLEREDDY.


   No one is going to pay attention to any idiot who posts in all caps.


-- 
Service to my country? Been there, Done that, and I've got my DD214 to
prove it.
Member of DAV #85.

Michael A. Terrell
Central Florida

Article: 108420
Subject: Problem with adding DCM to Spartan-3
From: "Daveb" <dave.bryan@gmail.com>
Date: 11 Sep 2006 02:12:10 -0700
Hi,

I have a fairly simple Spartan-3 design which essentially implements an
interface to an external MCU and it works fine. I need to interface to
a 2nd data bus which is asynchronous to the main (MCU) clock. I want to
multiply the main clock by 4 to implement a 2-stage FF synchroniser &
data latching arrangement. I've incorporated the frequency synthesiser
part of a DCM using core generator and I now get the following error
(using ISE 8.2) :

ERROR:Place:207 - Due to SelectIO banking constraints, the IOBs in your
design cannot be automatically placed.

If I remove all pad constraints I no longer get this error. I've
tracked the problem down to adding 1 or more signals to pins in bank 6.
The summary report for the bank assignments is as follows:

Bank 0 has 10 pads, 9 (90%) are utilized.
Bank 1 has 9 pads, 9 (100%) are utilized.
Bank 2 has 14 pads, 14 (100%) are utilized.
Bank 3 has 15 pads, 12 (80%) are utilized.
Bank 4 has 11 pads, 2 (18%) are utilized.
Bank 5 has 9 pads, 7 (77%) are utilized.
Bank 6 has 14 pads, 1 (7%) are utilized.  <- causes the error

As the PCB is in manufacture changing pin assignments will be a bit
painful. Has anyone come across this or have any suggestions ?

Thanks
Dave


Article: 108421
Subject: Re: Forth-CPU design
From: Kolja Sulimma <news@sulimma.de>
Date: Mon, 11 Sep 2006 11:15:09 +0200
werty schrieb:

>  Return  STACKS  are slow , hard to address ,  unneeded ....
>  NewForth for ARM 920T will be Free OpSys ....

You are off topic.
This is a thread about a hardware implementation of a FORTH-CPU.
Hardware stacks are as fast as register files and not addressed at all.

With modern JITs you might get better performance for stack-based machine
models like Java or FORTH with static scheduling on a register file done
by a JIT compiler than you could get on a stack CPU.
But a stack based hardware implementation can get OK performance with
both software and hardware that is an order of magnitude simpler than
the JIT on RISC approach. And due to the lack of a JIT it will have
better real time properties.

Kolja Sulimma

Article: 108422
Subject: Re: Problem with adding DCM to Spartan-3
From: "Antti" <Antti.Lukats@xilant.com>
Date: 11 Sep 2006 02:24:19 -0700
Daveb schrieb:

> Hi,
>
> I have a fairly simple Spartan-3 design which essentially implements an
> interface to an external MCU and it works fine. I need to interface to
> a 2nd data bus which is asynchronous to the main (MCU) clock. I want to
> multiply the main clock by 4 to implement a 2-stage FF synchroniser &
> data latching arrangement. I've incorporated the frequency synthesiser
> part of a DCM using core generator and I now get the following error
> (using ISE 8.2) :
>
> ERROR:Place:207 - Due to SelectIO banking constraints, the IOBs in your
> design cannot be automatically placed.
>
> If I remove all pad constraints I no longer get this error. I've
> tracked the problem down to adding 1 or more signals to pins in bank 6.
> The summary report for the bank assignments is as follows:
>
> Bank 0 has 10 pads, 9 (90%) are utilized.
> Bank 1 has 9 pads, 9 (100%) are utilized.
> Bank 2 has 14 pads, 14 (100%) are utilized.
> Bank 3 has 15 pads, 12 (80%) are utilized.
> Bank 4 has 11 pads, 2 (18%) are utilized.
> Bank 5 has 9 pads, 7 (77%) are utilized.
> Bank 6 has 14 pads, 1 (7%) are utilized.  <- causes the error
>
> As the PCB is in manufacture changing pin assignments will be a bit
> painful. Has anyone come across this or have any suggestions ?
>
> Thanks
> Dave

Set IOSTANDARD for all pins used in the design in your UCF and try
again.

If you have an I/O pin with a non-default IOSTANDARD in a bank where ISE
automatically places other pins with the default IOSTANDARD, and the two
standards are not compatible, you get a similar error.
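
The rule behind that error can be sketched as a small check. This is a simplified model in Python, not the ISE placer's actual algorithm: it assumes a bank is legal when every pin in it needs the same VCCO (real rules also cover VREF and input-only cases), and that unconstrained pins fall back to LVCMOS25, which I believe is the Spartan-3 default. The pin names and bank numbers are made up.

```python
# Simplified SelectIO banking check (illustrative model, not what ISE runs).
VCCO = {"LVCMOS33": 3.3, "LVTTL": 3.3, "LVCMOS25": 2.5, "LVCMOS18": 1.8}
DEFAULT_STANDARD = "LVCMOS25"   # assumed tool default for unconstrained pins

def conflicting_banks(pins):
    """pins: iterable of (name, bank, iostandard or None).
    Returns {bank: sorted VCCO voltages} for banks needing more than one VCCO."""
    needed = {}
    for _name, bank, standard in pins:
        volts = VCCO[standard or DEFAULT_STANDARD]
        needed.setdefault(bank, set()).add(volts)
    return {bank: sorted(v) for bank, v in needed.items() if len(v) > 1}

# Hypothetical design: a constrained 3.3 V pin shares bank 6 with a pin
# that has no IOSTANDARD and therefore gets the assumed default.
pins = [
    ("mcu_data0", 0, "LVCMOS33"),
    ("bus2_clk",  6, "LVCMOS33"),
    ("bus2_data", 6, None),
]

# The same design with every pin constrained explicitly.
fixed_pins = [(name, bank, std or "LVCMOS33") for name, bank, std in pins]
```

In this model `conflicting_banks(pins)` flags bank 6, and the conflict disappears for `fixed_pins`, which is the effect of giving every pin an explicit IOSTANDARD in the UCF.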

Antti


Article: 108423
Subject: Re: simplyrisc-s1 free core
From: "Antti" <Antti.Lukats@xilant.com>
Date: 11 Sep 2006 02:30:55 -0700
Jan Panteltje schrieb:

> Free S1 core with Linux tools (gcc) emu runs on iverilog:
>  http://www.srisc.com/?s1
> This is a RISC with 64 bit address and data bus, and Wishbone interface.
>
> Have not tried it, may be of interest to some here.
>
> <quote>
> The OpenSPARC T1 microprocessor (codename Niagara) features 8 SPARC CPU
> Cores and several peripherals; the S1 Core takes only one 64-bit SPARC Core
> from that design and adds a Wishbone bridge, a reset controller and a basic
> interrupt controller, to make it easy for a system engineer to integrate the
> design.
> <end quote>

I am trying it right now. It seems like a lot of fun: when trying it with
Xilinx ISE it has already managed to cause 3 different kinds of fatal
crashes !!

So it would be a good test case for Xilinx to test their software
against.

Antti


Article: 108424
Subject: Re: Xilinx ISE ver 8.2.02i is optimizing away and removing "redundant" logic - help!
From: "KJ" <kkjennings@sbcglobal.net>
Date: Mon, 11 Sep 2006 09:58:57 GMT

"Weng Tianxiang" <wtxwtx@gmail.com> wrote in message 
news:1157941504.597171.318920@i42g2000cwa.googlegroups.com...
> I widely use the equation like:
> a <= a +1;
>
> Usually a is an unsigned (or std_logic_vector) for a counter, it
> doesn't matter whether the equation is in a process or in a concurrent
> area.

It matters very much whether it is in a clocked process or not.  If 'a<=a+1' 
is in an unclocked process or concurrent statement you've just created 
unclocked combinational feedback, not a counter.  As a general guideline, if 
you ever have a signal on both sides of the '<=' outside of a clocked process 
you've got a latch or a combinational loop.


> No any problem.

I doubt that.  a<= a+1 outside of a clocked process will (at best) produce a 
counter that increments by one at whatever uncontrolled propagation delay 
the device has.

>
> I don't see why VHDL dislikes it or it cannot be synthesized.

It can be synthesized....it just is highly unlikely to do what you want it 
to do.
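
A rough way to see the difference is to model both cases in software. The sketch below is plain Python, not VHDL, and the helper names and the 4-bit width are invented for the demo: `settle` re-evaluates the assignment until the value stops changing, roughly as a simulator does with delta cycles, while `clocked` applies the assignment once per clock edge.

```python
WIDTH = 4                      # assumed counter width for the demo
MASK = (1 << WIDTH) - 1

def increment(a):
    """The right-hand side of  a <= a + 1  for a WIDTH-bit unsigned value."""
    return (a + 1) & MASK

def settle(a, max_iterations=1000):
    """Unclocked case: keep re-evaluating until the value is stable.
    Returns the stable value, or None if it never settles."""
    for _ in range(max_iterations):
        nxt = increment(a)
        if nxt == a:
            return a
        a = nxt
    return None

def clocked(a, edges):
    """Clocked case: exactly one evaluation per rising clock edge."""
    history = []
    for _ in range(edges):
        a = increment(a)
        history.append(a)
    return history
```

`settle(0)` gives up without finding a stable value, because a + 1 never equals a, whereas `clocked(0, 5)` steps through one value per edge. In real hardware the unclocked version would free-run at whatever the loop delay happens to be.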

KJ 




