Messages from 148225

Article: 148225
Subject: Re: Automatic BUFG insertion on a non clock signal in ISE 12.1
From: Amish Rughoonundon <amishrughoonundon@gmail.com>
Date: Wed, 30 Jun 2010 11:28:33 -0700 (PDT)

On Jun 30, 12:06 pm, Rob Gaddi <rga...@technologyhighland.com> wrote:
> On 6/30/2010 7:37 AM, Amish Rughoonundon wrote:
>
> > Hi,
> >   I have a very simple design using a latch. I know latches should not
> > be used but it is a necessary evil in this case for speed reason. If
> > someone can find as fast of a way to do this with synchronous logic, I
> > would love to hear it.
>
> > My question is that ISE 12.1 is inserting a BUFG in between the input
> > ms3_n and the latch input during synthesis. I do not understand why.
> > I do not have any constraint on this code at all.
>
> > If anybody has a good explanation, I would really appreciate it
> > because I am at a loss. Thanks,
> > Amish
>
> > [CODE]
> > library IEEE;
> > use IEEE.STD_LOGIC_1164.all;
>
> > entity dspInterfaceTest is
> >     port(
> >         ms3_n : in STD_LOGIC;
> >         ack : out STD_LOGIC;
> >         s_done : in std_logic
> >         );
> > end dspInterfaceTest;
>
> > architecture dspInterfaceTest of dspInterfaceTest is
>
> >     signal s_doneLatch : std_logic;
> > begin
>
> >     p_doneLatch : process(s_done, ms3_n)
> >     begin
> >         if(s_done = '1') then
> >             s_doneLatch <= '1';
>
> >         elsif(ms3_n = '1') then
> >             s_doneLatch <= '0';
> >         end if;
>
> >     end process p_doneLatch;
>
> >     ack <= 'Z' when ms3_n = '1' else s_doneLatch;
>
> > end dspInterfaceTest;
> > [/CODE]
>
> No idea offhand, but your ack signal is going directly to a top-level
> pin, not internal logic, right?
>
> --
> Rob Gaddi, Highland Technology
> Email address is currently out of order

yes all ports are going to external FPGA pins
Amish

Article: 148226
Subject: Re: Automatic BUFG insertion on a non clock signal in ISE 12.1
From: Rob Gaddi <rgaddi@technologyhighland.com>
Date: Wed, 30 Jun 2010 11:39:21 -0700

On 6/30/2010 11:28 AM, Amish Rughoonundon wrote:
> On Jun 30, 12:06 pm, Rob Gaddi<rga...@technologyhighland.com>  wrote:
>> No idea offhand, but your ack signal is going directly to a top-level
>> pin, not internal logic, right?
>>
>> --
>> Rob Gaddi, Highland Technology
>> Email address is currently out of order
>
> yes all ports are going to external FPGA pins
> Amish

It looks like you're trying to infer a D-latch with a preset, tying 
D=>'0', G=>ms3_N, P=>s_done, Q=>s_doneLatch.  Then that feeds a tristate 
buffer where ms3_n is the tristate enable.

I haven't the foggiest idea why you're doing that, and if you take a 
step back from the problem there's probably a better, synchronous way to 
do it.  But that doesn't change the fact that there's no reason that XST 
shouldn't be letting you do it the way you're trying to.

The only thing I can think of is to try replacing your behavioral code 
with direct instantiation.  I know that for Spartan-6 they've crippled 
the libraries so that you can't actually do direct instantiation anymore 
from HDL, but maybe if you broke down and did a schematic design for 
that block?
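
The mapping described above (D=>'0', G=>ms3_n, P=>s_done) could be written
as a direct instantiation roughly as follows. This is only a sketch: it
assumes the unisim library provides a transparent latch with asynchronous
preset named LDP with ports Q, D, G and PRE, which should be checked against
the libraries guide for the target family. The architecture name is
hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;                  -- Xilinx primitive library
use unisim.vcomponents.all;

entity dspInterfaceTest is
    port(
        ms3_n  : in  std_logic;
        ack    : out std_logic;
        s_done : in  std_logic
    );
end dspInterfaceTest;

architecture structural of dspInterfaceTest is
    signal s_doneLatch : std_logic;
begin
    -- Assumed primitive: LDP, a D-latch with asynchronous preset.
    -- While G is high the latch passes D ('0'); PRE forces Q to '1'.
    u_doneLatch : LDP
        port map (
            Q   => s_doneLatch,
            D   => '0',
            G   => ms3_n,
            PRE => s_done
        );

    -- Tristate output buffer, released while ms3_n is high.
    ack <= 'Z' when ms3_n = '1' else s_doneLatch;
end structural;
```

Instantiating the primitive takes latch inference out of the picture, but it
does not by itself say anything about which buffer the tools place on the
gate input.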

-- 
Rob Gaddi, Highland Technology
Email address is currently out of order

Article: 148227
Subject: Re: Automatic BUFG insertion on a non clock signal in ISE 12.1
From: Gabor <gabor@alacron.com>
Date: Wed, 30 Jun 2010 11:52:15 -0700 (PDT)

On Jun 30, 10:37 am, Amish Rughoonundon <amishrughoonun...@gmail.com>
wrote:
> Hi,
>  I have a very simple design using a latch. I know latches should not
> be used but it is a necessary evil in this case for speed reason. If
> someone can find as fast of a way to do this with synchronous logic, I
> would love to hear it.
>
> My question is that ISE 12.1 is inserting a BUFG in between the input
> ms3_n and the latch input during synthesis. I do not understand why.
> I do not have any constraint on this code at all.
>
> If anybody has a good explanation, I would really appreciate it
> because I am at a loss. Thanks,
> Amish
>
> [CODE]
> library IEEE;
> use IEEE.STD_LOGIC_1164.all;
>
> entity dspInterfaceTest is
>     port(
>         ms3_n : in STD_LOGIC;
>         ack : out STD_LOGIC;
>         s_done : in std_logic
>         );
> end dspInterfaceTest;
>
> architecture dspInterfaceTest of dspInterfaceTest is
>
>     signal s_doneLatch : std_logic;
> begin
>
>     p_doneLatch : process(s_done, ms3_n)
>     begin
>         if(s_done = '1') then
>             s_doneLatch <= '1';
>
>         elsif(ms3_n = '1') then
>             s_doneLatch <= '0';
>         end if;
>
>     end process p_doneLatch;
>
>     ack <= 'Z' when ms3_n = '1' else s_doneLatch;
>
> end dspInterfaceTest;
> [/CODE]

What FPGA family is this going into?  For the Spartan 3 series you still
have latches in the fabric, but they use a clock routing connection for
the gate input.  That could explain the automatic BUFG insertion.  You
can shut this off by giving your ms3_n signal a BUFFER_TYPE of
"none".  See the XST manual for more on how to do this.

Regards,
Gabor
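
The BUFFER_TYPE suggestion above can be expressed as a VHDL attribute. A
minimal sketch, assuming the XST buffer_type constraint is attached to the
ms3_n port inside the entity declaration; the exact set of accepted values
depends on the ISE version, so the XST guide remains the reference.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dspInterfaceTest is
    port(
        ms3_n  : in  std_logic;
        ack    : out std_logic;
        s_done : in  std_logic
    );
    -- Synthesis constraint: ask XST not to insert a clock buffer on ms3_n.
    attribute buffer_type : string;
    attribute buffer_type of ms3_n : signal is "none";
end dspInterfaceTest;
```

Keeping the attribute next to the port declaration means the constraint
travels with the source file instead of living in a separate constraints
file.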

Article: 148228
Subject: Re: Binary integer to ASCII string in HDL?
From: "John Speth" <johnspeth@yahoo.com>
Date: Wed, 30 Jun 2010 13:45:09 -0700

>I need to do some fast (<5 usec) conversions of binary integer to ASCII 
>string in HDL (NIOS, FPGA, etc) - basically a fast substitute for 
>sprintf(s,"%d",n);

Thanks for suggesting binary_to_bcd.v from OpenCores.  It's exactly what I 
needed and it worked right out of the box.  Using it, I met my speed target.

JJS




Article: 148229
Subject: Re: Automatic BUFG insertion on a non clock signal in ISE 12.1
From: Amish Rughoonundon <amishrughoonundon@gmail.com>
Date: Wed, 30 Jun 2010 13:58:07 -0700 (PDT)

On Jun 30, 2:52 pm, Gabor <ga...@alacron.com> wrote:
> What FPGA family is this going into?  For the Spartan 3 series you still
> have latches in the fabric, but they use a clock routing connection for
> the gate input.  That could explain the automatic BUFG insertion.  You
> can shut this off by giving your ms3_n signal a BUFFER_TYPE of
> "none".  See the XST manual for more on how to do this.
>
> Regards,
> Gabor

Rob,
 actually there is a typo. It should be
elsif(dsp_ms_three_n = '0') then
            s_doneLatch <= '0';

 This is used to connect to the acknowledge line of a DSP. When the
DSP pulls the ms3_n line low, I need the FPGA to respond
immediately by bringing the ack line low. When the FPGA is done
processing the request, it sets ack high, and when the DSP is done
grabbing the data and sets ms3_n high, the ack output is immediately
tri-stated.
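
Putting the corrected polarity and the handshake description together, the
latch might look like the sketch below. The entity name ack_latch_sketch is
hypothetical, and the strobe is called ms3_n here as in the originally posted
code. It also assumes s_done has returned low before the next access starts.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ack_latch_sketch is
    port(
        ms3_n  : in  std_logic;  -- DSP strobe, active low
        s_done : in  std_logic;  -- high once the FPGA has finished the request
        ack    : out std_logic   -- acknowledge line to the DSP
    );
end ack_latch_sketch;

architecture rtl of ack_latch_sketch is
    signal s_doneLatch : std_logic;
begin
    p_doneLatch : process(s_done, ms3_n)
    begin
        if s_done = '1' then
            s_doneLatch <= '1';      -- request finished: drive ack high
        elsif ms3_n = '0' then
            s_doneLatch <= '0';      -- strobe low: pull ack low at once
        end if;                      -- no else branch: the value is held
    end process p_doneLatch;

    -- Release the line as soon as the DSP raises the strobe again.
    ack <= 'Z' when ms3_n = '1' else s_doneLatch;
end rtl;
```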

I wanted it to be really fast, less than 5 ns if possible, but from my
testing that does not seem possible because of routing delay.

Gabor,
I think you are right. Looking at the FPGA Editor, it seems the latch
uses a clock input connected to ms3_n. I guess in Spartan-3E there is
no other way to build a latch, so the synthesizer adds a BUFG to the line
automatically. Setting buffer_type seems to fix the problem. Thanks
for the input, that was a real help.
Amish

Article: 148230
Subject: Re: Automatic BUFG insertion on a non clock signal in ISE 12.1
From: Gabor <gabor@alacron.com>
Date: Wed, 30 Jun 2010 15:27:05 -0700 (PDT)

On Jun 30, 4:58 pm, Amish Rughoonundon <amishrughoonun...@gmail.com>
wrote:
> On Jun 30, 2:52 pm, Gabor <ga...@alacron.com> wrote:
> > What FPGA family is this going into?  For the Spartan 3 series you still
> > have latches in the fabric, but they use a clock routing connection for
> > the gate input.  That could explain the automatic BUFG insertion.  You
> > can shut this off by giving your ms3_n signal a BUFFER_TYPE of
> > "none".  See the XST manual for more on how to do this.
>
> > Regards,
> > Gabor
>
> Rob,
>  actually there is a typo. It should be
> elsif(dsp_ms_three_n = '0') then
>             s_doneLatch <= '0';
>
>  this is used to connect to the acknowledge line of a dsp. When the
> dsp toggle the ms3_n line low, I needed the FPGA to respond
> immediately bringing the ack line low. When the FPGA is done
> processing the request, it would set ack high and when the dsp is done
> grabbing the data and set ms3_n high, the ack output will immediately
> be tri-stated.
>
> I wanted it to be really fast, less than 5 ns if possible but from my
> testing it does not seem it will be possible due to routing delay.
>
> Gabor,
> I think you are right looking at the FPGA editor, it seems the LATCH
> uses a clock input connected to ms3_n. I guess in Spartan 3E there is
> no other way to do a latch so the synthesizer adds a BUFG to the line
> automatically. Setting buffer_type seems to fix the problem. Thanks
> for the input, that was a real help
> Amish

That's good.  Be aware that the more recent FPGA architectures
have gotten rid of the D-latch mode of the fabric flip-flops.  This
seems to be a trend based on most of the market changing from ASIC
prototyping to production delivery vehicles.  You can still make
asynchronous sequential logic using LUTs, but the performance
will not be as good.

Regards,
Gabor

Article: 148231
Subject: Re: Xilinx BULLSHITIX-8, when?
From: Sebastien Bourdeauducq <sebastien.bourdeauducq@gmail.com>
Date: Wed, 30 Jun 2010 15:35:30 -0700 (PDT)

On Jun 22, 4:16 am, Randy Yates <ya...@ieee.org> wrote:
> The linux version of the 12.1 ISE is a joke: custom build scripts,
> separate library directories, custom bashrc entries required, etc. (as
> opposed to an RPM and use of the system libraries).

As far as I can tell, all other versions of ISE also had this problem.
I think the nastiest new thing about 12.1 is that WebTalk "feature"
you cannot normally disable when using WebPack. I think you should not
_have_ to tell Xilinx how you use their devices.

Sébastien
PS. Disabling Webtalk on 12.x is as simple as deleting the cURL
library from the ISE directories.

Article: 148232
Subject: Re: Automatic BUFG insertion on a non clock signal in ISE 12.1
From: Brian Davis <brimdavis@aol.com>
Date: Wed, 30 Jun 2010 17:33:02 -0700 (PDT)

Rob Gaddi wrote:
>
> I know that for Spartan-6 they've crippled the libraries so that you
> can't actually do direct instantiation anymore from HDL, but maybe
> if you broke down and did a schematic design for that block?
>
 You and John Larkin have both posted that you can no longer
instantiate simple FD-something primitives with Spartan6.

 I don't have 12.x here at home to test, but LOC'd FDx's still
work fine in 10.x on V5 with a nearly identical IOB structure;
barring any new version tool bugs, they should also work for S6.

 See my post from last week's thread:
http://groups.google.com/group/comp.arch.fpga/msg/b12a845e6b89e340

 Note that the simplified buffer-only IOB pad structure in the newer
families means you need to LOC the FD primitive to the adjacent
OLOGIC site rather than the IOB pad location if you are using LOCs;
placing the IOB attribute on the primitive might still do this
automatically without needing an explicit LOC on the primitive.

 When Xilinx went from one libraries guide to family specific
guides several versions ago, they dropped the descriptions of the
non-native simple primitives from the family guides, but they are
all still available in the unisim library.

 These primitives in the unisim package are all black-boxed and
should be properly mapped to the appropriate device-specific
element for any FPGA target family.

 At least that is how it's intended to work, see the primitives
section of the S6/V6 XST guide:

http://www.xilinx.com/support/documentation/sw_manuals/xilinx12_1/xst_v6s6.pdf
XST User Guide for Virtex-6 and Spartan-6 Devices
UG687 (v 12.1)

Page 271 :
"
" XST provides dedicated VHDL and Verilog libraries to simplify
" instantiation of Xilinx® device primitives in the Hardware
" Description Language (HDL) source code.
"
<snip>
"
" VHDL Xilinx® Device Primitives Device Libraries
"
" In VHDL, declare library UNISIM with its package vcomponents
" in the HDL source code.
"
"  library unisim;
"  use unisim.vcomponents.all;
"
" The HDL source code for this package is located in the following
" file of the XST installation:
"    vhdl\src\unisims\unisims_vcomp.vhd
"

Brian
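
As an illustration of the approach described above, a simple FD primitive
from unisim with the IOB attribute might be instantiated as sketched here.
The entity, port and instance names are hypothetical, and whether the tools
pack the flip-flop into the I/O logic on a given family is not something this
sketch can guarantee.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;

entity fd_iob_sketch is
    port(
        clk   : in  std_logic;
        d_in  : in  std_logic;
        q_out : out std_logic
    );
end fd_iob_sketch;

architecture structural of fd_iob_sketch is
    -- Ask the tools to place the flip-flop in the I/O logic if possible.
    attribute iob : string;
    attribute iob of u_out_ff : label is "TRUE";
begin
    -- Plain D flip-flop primitive from the unisim library.
    u_out_ff : FD
        port map (
            Q => q_out,
            C => clk,
            D => d_in
        );
end structural;
```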

Article: 148233
Subject: Re: Xilinx BULLSHITIX-8, when?
From: Aaron Holtzman <aholtzma@gmail.com>
Date: Wed, 30 Jun 2010 17:53:47 -0700 (PDT)

On Jun 21, 2:45 pm, Rob Gaddi <rga...@technologyhighland.com> wrote:
> Right now I'm working on two S6 projects, both of which are absolute
> disasters due to problems with the toolchain.  My DRAM problem from a
> month ago, Xilinx ultimately told me was my problem and they washed
> their hands of it.

Why does Xilinx think it is your problem? I am about to start layout
on an S6 board and hope to avoid this problem (which seems to be
common!). Thanks.

cheers,
aaron

Article: 148234
Subject: Re: Xilinx BULLSHITIX-8, when?
From: Rob Gaddi <rgaddi@technologyhighland.com>
Date: Wed, 30 Jun 2010 18:14:32 -0700

On 6/30/2010 5:53 PM, Aaron Holtzman wrote:
> On Jun 21, 2:45 pm, Rob Gaddi<rga...@technologyhighland.com>  wrote:
>> Right now I'm working on two S6 projects, both of which are absolute
>> disasters due to problems with the toolchain.  My DRAM problem from a
>> month ago, Xilinx ultimately told me was my problem and they washed
>> their hands of it.
>
> Why does Xilinx think it is your problem? I am about to start layout
> on an S6 board and hope to avoid this problem (which seems to be
> common!). Thanks.
>
> cheers,
> aaron

My specific problem was that I wanted to use the PLLs in the chip to 
generate a memory bus clock frequency that wasn't equal to the 
oscillator frequency.  This turns out to be an absolute nightmare, and 
is what Xilinx left me hanging on.

If you're going to do an S6 design using the MIG, I would _strongly_ 
recommend that you bring in a clock at the bus frequency that you're 
looking for from an external pin (the only thing that the MIG supports).
The MIG will make available to you a CLK0 output that is a PLL-buffered
version of that external clock, which you can then use in the 
rest of your design.  Or, if you want it at a different frequency, just 
give the MIG its own damn oscillator.

Or see whether brands A or L might be any better.  Do keep in mind that, 
as near as can be told from the 7 series info, Xilinx is killing the 
hard MCB concept.  Looks like it didn't quite click.

-- 
Rob Gaddi, Highland Technology
Email address is currently out of order

Article: 148235
Subject: Re: Xilinx BULLSHITIX-8, when?
From: Bryan <bryan.fletcher@avnet.com>
Date: Wed, 30 Jun 2010 18:50:42 -0700 (PDT)

I work for Avnet, which seems not to be too popular with this crowd,
but I will share my experience anyway.  I have a project with
XC6SLX16-2CSG324 and LPDDR that seems to work well with MIG 3.3 in ISE
11.4.  Granted, we are only running at 200 MHz.  We do not provide a
200 MHz input to the chip.  We have a 66 MHz oscillator input to the
FPGA.  It is true that by default, MIG generates a design that assumes
the native system clock is the same as the memory clock.  The only
clocking customization that MIG allows is the choice between single-
ended or differential clock.  However, since MIG provides all the HDL
sources for the clock infrastructure, it is possible to modify the
clocking structure to generate the correct memory clock given any
system clock that meets the specifications of the PLL.

I have some instructions that explain step-by-step how to do this for
the Avnet board (www.em.avnet.com/spartan6lx-evl).  If you are
interested, please contact Avnet Technical Support (www.em.avnet.com/
techsupport).  In addition to this LPDDR example, Xilinx provides
working hardware examples for DDR2 on the SP601 and DDR3 on the
SP605.  Avnet has another board with DDR3 that has been proven out at
800 Mbps in hardware (www.em.avnet.com/spartan6lx150t-dev).

The other critical thing to do with these DDR designs is proper PCB
layout and termination, without which the design will fail.  Xilinx
provides some very specific layout guidelines in UG388 that need to be
followed if you want the full memory interface performance.

Xilinx recently published revised specifications for the MCB.  See
http://www.xilinx.com/support/answers/35818.htm
The Spartan-6 Memory Controller Block (MCB) has new data rate
specifications and performance modes for DDR2 and DDR3 interfaces as
specified in version 1.5 of the Spartan-6 FPGA Data Sheet (DS162):
http://www.xilinx.com/support/documentation/data_sheets/ds162.pdf

You should also be aware of the MIG Design Advisory Answer Record.
http://www.xilinx.com/support/answers/33566.htm

Bryan
Avnet

Article: 148236
Subject: Xilinx xapp175, empty + full flag really synchronous?
From: firefox3107 <firefox3107@gmail.com>
Date: Wed, 30 Jun 2010 21:15:06 -0700 (PDT)

Hey,

I found an amazing async FIFO concept on the Xilinx homepage.
It looks like the latency to share data between clock domains is
reduced to one cycle.
But I'm asking whether the setup + hold times are always met. I admit that
setting the empty flag is correct, but what about releasing it, which is
caused by the write clock and is thus asynchronous?

Article: 148237
Subject: Re: Xilinx xapp175, empty + full flag really synchronous?
From: firefox3107 <firefox3107@gmail.com>
Date: Wed, 30 Jun 2010 21:47:55 -0700 (PDT)

Sorry, I have forgotten to add the link:
http://www.xilinx.com/support/documentation/application_notes/xapp175.pdf

Article: 148238
Subject: Re: Xilinx xapp175, empty + full flag really synchronous?
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Thu, 1 Jul 2010 05:10:36 +0000 (UTC)

firefox3107 <firefox3107@gmail.com> wrote:

> I found an amazing async fifo concept on the xilinx homepage.
> It looks that the latency to share data between clock domains is
> reduced to one cycle.
> But I'm asking if the setup + hold times are always met. I admit that
> setting the empty flag is correct but what about releasing, which is
> caused by write clock, thus asynchronous?

Not being an expert in FIFO design, as I understand it in any
asynchronous system you have to assume that setup+hold won't always
be met.  In crossing clock domains, either the signal will come
before the clock (and the operation will be performed) or it will
come after (and not be performed).  If it comes after the clock,
then the operation is performed on the next clock cycle.

For FIFOs that is sometimes done by passing a Gray counter output
across the clock domain, in which case either the old or new value
is received on the other side.  Those are the only choices.
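
The Gray-counter idea can be sketched as below. This is a generic
illustration of the technique, not the circuit from the application note
under discussion: a write pointer is kept in binary, registered in Gray code,
and passed through two flip-flops in the read clock domain. All names and the
WIDTH generic are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity gray_ptr_sync is
    generic (WIDTH : positive := 4);
    port(
        wr_clk     : in  std_logic;
        wr_en      : in  std_logic;
        rd_clk     : in  std_logic;
        wr_gray_rd : out std_logic_vector(WIDTH-1 downto 0)
    );
end gray_ptr_sync;

architecture rtl of gray_ptr_sync is
    signal wr_bin  : unsigned(WIDTH-1 downto 0) := (others => '0');
    signal wr_gray : std_logic_vector(WIDTH-1 downto 0) := (others => '0');
    signal sync1   : std_logic_vector(WIDTH-1 downto 0) := (others => '0');
    signal sync2   : std_logic_vector(WIDTH-1 downto 0) := (others => '0');
begin
    -- Write domain: binary counter plus a registered Gray-coded copy.
    p_write : process(wr_clk)
        variable next_bin : unsigned(WIDTH-1 downto 0);
    begin
        if rising_edge(wr_clk) then
            if wr_en = '1' then
                next_bin := wr_bin + 1;
                wr_bin   <= next_bin;
                -- Gray code = bin xor (bin >> 1); one bit changes per step.
                wr_gray  <= std_logic_vector(next_bin xor shift_right(next_bin, 1));
            end if;
        end if;
    end process p_write;

    -- Read domain: two-stage synchronizer on the Gray-coded pointer.
    p_sync : process(rd_clk)
    begin
        if rising_edge(rd_clk) then
            sync1 <= wr_gray;
            sync2 <= sync1;
        end if;
    end process p_sync;

    wr_gray_rd <= sync2;
end rtl;
```

Because only one bit of the Gray value changes per increment, a sample taken
during a transition resolves to either the old or the new pointer, which is
the property the paragraph above relies on.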

-- glen

Article: 148239
Subject: Re: Xilinx xapp175, empty + full flag really synchronous?
From: firefox3107 <firefox3107@gmail.com>
Date: Wed, 30 Jun 2010 22:19:55 -0700 (PDT)

I'm concerned about metastability and oscillation and in this
application note the flags are not synchronized with 2 flipflops in
the new clock domain.
So I'm asking if this FIFO design is reliable?

Article: 148240
Subject: Re: Xilinx BULLSHITIX-8, when?
From: "RCIngham" <robert.ingham@n_o_s_p_a_m.n_o_s_p_a_m.gmail.com>
Date: Thu, 01 Jul 2010 05:29:33 -0500

[snip]
>
>If you're going to do an S6 design using the MIG, I would _strongly_ 
>recommend that you bring in a clock at the bus frequency that you're 
>looking for from an external pin (the only thing that the MIG supports).
>The MIG will make available to you a CLK0 output that is a PLL-buffered
>version of that external clock, which you can then use in the 
>rest of your design.  Or, if you want it at a different frequency, just 
>give the MIG its own damn oscillator.
>
>Or see whether brands A or L might be any better.  Do keep in mind that, 
>as near as can be told from the 7 series info, Xilinx is killing the 
>hard MCB concept.  Looks like it didn't quite click.
>
>-- 
>Rob Gaddi, Highland Technology
>Email address is currently out of order

May not be necessary for V4, if the memory speed isn't very high.

For a PCB-tester FPGA design, I derived the 200MHz DDR2 memory clock from
32MHz (using a tandem pair of DCMs), and all was well.

As mentioned, PCB layout and especially routing are critical for these
components. The company had to re-spin the boards to get the DDR2s to
work.
	   
					
---------------------------------------		
Posted through http://www.FPGARelated.com

Article: 148241
Subject: Re: altshift_taps for Xilinx?
From: Martin Thompson <martin.j.thompson@trw.com>
Date: Thu, 01 Jul 2010 12:05:19 +0100

Gladys <yuhui.b@gmail.com> writes:

> Thank you so much for help, I've successfully implemented the line
> buffer of about 3000 delay of data 12bits for my image processing.
> Now I have another question: the pixels are in bayer pattern, such as:
>
>           R1
>       B  G  B
> R2  G  R  G  R3
>       B  G  B
>           R4
>
> I want to correct the defective pixel 5x5 pixel surrounding, then
> replace it by the average value of the nearest same color neighbor
> pixels.
> For example, the R in the middle is a dead pixel, then R=(R1+R2+R3+R4)/
> 4
>
> In vertical, I've built 4 line buffer, but in horizontal, I still need
> to have 5 pixels available at the same time, do I need to use shift
> registers again to
> delay the data by 4, 3, 2 and 1 clock cycles?

Single tick delays are just flip-flops, so you'll need 4 sets of 12
bits => 48 flipflops.
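
A sketch of that horizontal tap chain, assuming 12-bit pixels arriving one
per clock. The entity and port names are hypothetical; tap0 is the newest
pixel and tap4 the one that arrived four clocks earlier.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pixel_taps is
    port(
        clk      : in  std_logic;
        pixel_in : in  std_logic_vector(11 downto 0);
        tap0     : out std_logic_vector(11 downto 0);
        tap1     : out std_logic_vector(11 downto 0);
        tap2     : out std_logic_vector(11 downto 0);
        tap3     : out std_logic_vector(11 downto 0);
        tap4     : out std_logic_vector(11 downto 0)
    );
end pixel_taps;

architecture rtl of pixel_taps is
    -- Four registers of 12 bits each: 48 flip-flops in total.
    type tap_array is array (1 to 4) of std_logic_vector(11 downto 0);
    signal taps : tap_array := (others => (others => '0'));
begin
    p_shift : process(clk)
    begin
        if rising_edge(clk) then
            taps(1) <= pixel_in;
            for i in 2 to 4 loop
                taps(i) <= taps(i-1);
            end loop;
        end if;
    end process p_shift;

    tap0 <= pixel_in;   -- undelayed pixel
    tap1 <= taps(1);
    tap2 <= taps(2);
    tap3 <= taps(3);
    tap4 <= taps(4);
end rtl;
```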

>
> I find it's memory consuming and I'm not sure if my solution is
> correct.  Could you please help me? Thanks again.

Which bit is memory consuming?  Not the horizontal buffers I hope!

The vertical line buffers will need as many elements as you have
pixels to store (4 complete lines).  There's not a lot you can do
about this - if you need to look back by 4 lines, you need to store 4
lines to do it.
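
Once the four same-colour neighbours are available from the line buffers and
the horizontal registers, the replacement value R = (R1+R2+R3+R4)/4 quoted
above is a small piece of combinatorial logic. A minimal sketch with
hypothetical names, truncating the division rather than rounding:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dead_pixel_avg is
    port(
        r1, r2, r3, r4 : in  std_logic_vector(11 downto 0);
        r_fixed        : out std_logic_vector(11 downto 0)
    );
end dead_pixel_avg;

architecture rtl of dead_pixel_avg is
    -- Four 12-bit values sum to at most 4 * 4095 = 16380, which fits in 14 bits.
    signal sum : unsigned(13 downto 0);
begin
    sum <= resize(unsigned(r1), 14) + resize(unsigned(r2), 14)
         + resize(unsigned(r3), 14) + resize(unsigned(r4), 14);

    -- Dividing by 4 is dropping the two least significant bits.
    r_fixed <= std_logic_vector(sum(13 downto 2));
end rtl;
```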

Cheers,
Martin

-- 
martin.j.thompson@trw.com 
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.net/electronics.html

Article: 148242
Subject: DMA operation to 64-bits PC platform
From: Frank van Eijkelenburg <fei.technolution@gmail.com>
Date: Thu, 1 Jul 2010 08:03:39 -0700 (PDT)

Hi,

I have a custom made PCIe board with a Virtex 5 FPGA on which I
implemented a DMA unit which uses the PCIe endpoint block plus v1.14.
I also implemented simple read/write operations from the PC to the
board (the board responds with completion TLPs). The read/write
operations are working, but DMA is not working.

The board is inserted in a PC running 64-bit Windows 7. An
application allocates virtual memory and passes the memory block to
the driver. The driver locks the memory and converts the virtual
addresses into physical addresses. These physical addresses are
written to the FPGA.

When I start a DMA operation, I can see in ChipScope the correct
physical addresses in the TLP header. However, I do not see the
correct values in the allocated memory. What can I do to check where
it is going wrong?

Another question is about the memory request TLPs. What should I use,
32 or 64 bit write requests? Or do I have to check runtime if the
physical memory address is below or above the 4 GB (and use
respectively 32 and 64 bit requests)?


Thanks in advance,

Frank

Article: 148243
Subject: carrier tracking over zero frequency point
From: "kadhiem_ayob" <kadhiem_ayob@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.co.uk>
Date: Thu, 01 Jul 2010 10:12:36 -0500

Hi All,

I am developing a carrier tracking module for 16QAM receiver based on
Costas loop on an fpga platform. Tracking works well on either side of zero
frequency and over zero if crossing rate is several seconds or minutes
apart. 
However, it loses lock and relocks if the zero crossings are only a few
seconds apart.
Is this acceptable by typical RF expectations, or do you think I can
improve on it?

Regards

kadhiem

Article: 148244
Subject: Re: DMA operation to 64-bits PC platform
From: "maxascent" <maxascent@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.co.uk>
Date: Thu, 01 Jul 2010 10:45:18 -0500

I have done a similar design myself but using Windows 7, 32-bit. I use
32-bit TLPs and have had no problems with the design. BTW I have used
Windriver to generate the device driver. 

Jon

Article: 148245
Subject: Re: Xilinx xapp175, empty + full flag really synchronous?
From: d_s_klein <d_s_klein@yahoo.com>
Date: Thu, 1 Jul 2010 13:57:23 -0700 (PDT)

On Jun 30, 10:19 pm, firefox3107 <firefox3...@gmail.com> wrote:
> I'm concerned about metastability and oscillation and in this
> application note the flags are not synchronized with 2 flipflops in
> the new clock domain.
> So I'm asking if this FIFO design is reliable?

I have been told by several people (Xilinx FAEs and Xilinx users) that
the only reliable 2-clock FIFOs are the ones created by coregen.

My understanding is that there is some magic (RPMs?) in the macro that
"just works".

My guess is that if you really choose to use this XAPP, you get to
verify it yourself (and re-verify it every time the placement
changes).

RK

Article: 148246
Subject: Re: DMA operation to 64-bits PC platform
From: Michael S <already5chosen@yahoo.com>
Date: Thu, 1 Jul 2010 17:15:40 -0700 (PDT)

On Jul 1, 5:03 pm, Frank van Eijkelenburg <fei.technolut...@gmail.com>
wrote:
>
> Another question is about the memory request TLPs. What should I use,
> 32 or 64 bit write requests? Or do I have to check runtime if the
> physical memory address is below or above the 4 GB (and use
> respectively 32 and 64 bit requests)?
>
> Thanks in advance,
>
> Frank

Memory accesses below 4GB have to use 3DW (= 32-bit) TLP headers.
4DW TLP headers addressing memory below 4GB are prohibited by PCIe
standard although they would occasionally work on some chipsets, e.g.
on Intel 5000P/5000X series.
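
In a DMA engine this rule becomes a run-time check on the upper half of the
physical address. A minimal sketch, assuming a 64-bit address input and
returning the two-bit Fmt field of a Memory Write header ("10" for a 3DW
header with data, "11" for a 4DW header with data). The package and function
names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package tlp_hdr_sketch is
    -- Returns the Fmt field for a Memory Write request to the given address.
    function mwr_fmt(addr : std_logic_vector(63 downto 0))
        return std_logic_vector;
end package tlp_hdr_sketch;

package body tlp_hdr_sketch is
    function mwr_fmt(addr : std_logic_vector(63 downto 0))
        return std_logic_vector is
    begin
        if unsigned(addr(63 downto 32)) = 0 then
            return "10";   -- below 4 GB: 3DW header, 32-bit address
        else
            return "11";   -- 4 GB and above: 4DW header, 64-bit address
        end if;
    end function mwr_fmt;
end package body tlp_hdr_sketch;
```

The same comparison would also select between sending one or two address
DWORDs when the header is assembled.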

Article: 148247
Subject: Re: DMA operation to 64-bits PC platform
From: Charles Gardiner <charles.gardiner@invalid.invalid>
Date: Fri, 02 Jul 2010 02:19:17 +0200

Frank van Eijkelenburg wrote:
> Hi,
> 
> I have a custom made PCIe board with a Virtex 5 FPGA on which I
> implemented a DMA unit which uses the PCIe endpoint block plus v1.14.
> I also implemented simple read/write operations from the PC to the
> board (the board responds with completion TLPs). The read/write
> operations are working, DMA is not working
> 
> The board is inserted in a pc with Windows 7 64 bits platform. An
> application allocates virtual memory and passes the memory block to
> the driver. The driver locks the memory and converts the virtual
> addresses into physical addresses. These physical addresses are
> written to the FPGA.

How are you doing this? Normally, an application requests a buffer using malloc()
or new() and gets a handle to the driver using CreateFile(). You then use
WriteFile(hDevice, Buffer,...), ReadFile(hDevice, Buffer,....) or
DeviceIoControl() to initiate a transfer to/from the device. That's the
application side.

On the driver (kernel) side, I would strongly recommend that you write a KMDF-based
driver. Download the Windows WDK; all it costs is your email address. (You have to log in
via Microsoft Connect, last time I looked.) There are lots of examples there,
including for PCI(e)-based DMA. To summarise (very quickly): your driver requests
the scatter/gather list describing the buffers above (see
WdfDmaTransactionInitializeUsingRequest() in the WDK API docs as a starting point)
and passes the entries to your hardware one by one, which then does the DMA in or out.
With a call to WdfRequestComplete() the buffers are released by the kernel and your
application can reuse them or free them as required. (This is of course all
considerably more than a day's work, by the way.)

You do not have to explicitly lock down the buffer yourself. Windows does this for
you while the I/O request is active (from ReadFile()/WriteFile() in your app up to
WdfRequestComplete() in the driver).

> 
> When I start an DMA operation, I can see in chipscope the correct
> physical addresses in the TLP header. However, I do not see the
> correct values in the allocated memory. What can I do to check where
> it is going wrong?
> 

In this case, I would first doubt whether the addresses are correct.

> Another question is about the memory request TLPs. What should I use,
> 32 or 64 bit write requests? Or do I have to check runtime if the
> physical memory address is below or above the 4 GB (and use
> respectively 32 and 64 bit requests)?
> 

The PCIe spec says: a transfer below 4 GB must use a 3 DWord header, a transfer
above 4 GB must use a 4 DWord header. I.e. a 4 DWord header with address[63:32]
set to zero is invalid.
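
As a rough illustration, here is how the header of a Memory Write request
could be assembled under that rule. This is a Python reference model, not
FPGA code. The function name and defaults are hypothetical, and the field
layout (Fmt/Type byte 0x40 for a 3 DWord and 0x60 for a 4 DWord write,
TC/Attr/TD/EP left at zero) is my reading of the spec, so check it against
the spec and your endpoint core's user guide.

```python
def mwr_header(address, length_dw, requester_id=0, tag=0,
               first_be=0xF, last_be=0xF):
    """Return a Memory Write TLP header as a list of 32-bit DWords."""
    if address % 4 or not 0 <= address < (1 << 64):
        raise ValueError("address must be DWord aligned and fit in 64 bits")
    if not 1 <= length_dw <= 1024:
        raise ValueError("length must be 1..1024 DWords")
    if length_dw == 1:
        last_be = 0  # a single-DWord request carries Last BE = 0000b
    use_4dw = address >= (1 << 32)          # the 4 GB rule
    fmt_type = 0x60 if use_4dw else 0x40    # MWr, 4 DWord / 3 DWord header
    # A length of 1024 DWords is encoded as 0 in the 10-bit Length field.
    dw0 = (fmt_type << 24) | (length_dw & 0x3FF)
    dw1 = ((requester_id & 0xFFFF) << 16) | ((tag & 0xFF) << 8) \
        | ((last_be & 0xF) << 4) | (first_be & 0xF)
    if use_4dw:
        return [dw0, dw1, address >> 32, address & 0xFFFF_FFFC]
    return [dw0, dw1, address & 0xFFFF_FFFC]


# Hypothetical addresses: one below and one above 4 GB.
print([f"{dw:08x}" for dw in mwr_header(0x0000_1000, 4)])
print([f"{dw:08x}" for dw in mwr_header(0x1_0000_2000, 4)])
```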

> 
> Thanks in advance,
> 
> Frank

Article: 148248
Subject: Re: Xilinx xapp175, empty + full flag really synchronous?
From: gtwrek@sonic.net (Mark Curry)
Date: 02 Jul 2010 00:37:11 GMT

In article <aef37803-9ae0-41e9-84ae-c50aaa4f40f9@7g2000prh.googlegroups.com>,
d_s_klein  <d_s_klein@yahoo.com> wrote:
>On Jun 30, 10:19 pm, firefox3107 <firefox3...@gmail.com> wrote:
>> I'm concerned about metastability and oscillation and in this
>> application note the flags are not synchronized with 2 flipflops in
>> the new clock domain.
>> So I'm asking if this FIFO design is reliable?
>
>I have been told by several people (Xilinx FAEs and Xilinx users) that
>the only reliable 2-clock FIFOs are the ones created by coregen.
>
>My understanding is that there is some magic (RPMs?) in the macro that
>"just works".
>
>My guess is that if you really choose to use this XAPP, you get to
>verify it yourself (and re-verify it every time the placement
>changes).

Not true at all.  If designers couldn't design clock-domain-crossing
logic on an FPGA, there'd be a lot fewer successful FPGAs :).
The principal elements of FIFO design are well known - there are just
some tricky cases to watch out for.

We've designed our own FIFOs for generations of FPGAs (from
multiple vendors).  You don't need "magic" RPMs or re-verification
of every placement (other than the normal STA checks).

Now to the OP - metastability is hardly an issue with these
technologies.  What matters is sound clock-domain-crossing design
technique.  Does Xilinx use good techniques in their FIFOs?
I can't answer that - I don't use their FIFOs, nor do I know the
design.  But they're said to be used in millions of parts....
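
One commonly cited technique for dual-clock FIFOs is to pass the read and
write pointers across the clock boundary in Gray code, so that only one
bit changes per increment. The sketch below is a Python model of that
idea only. It is an assumption for illustration, not a description of
the design above or of what Xilinx does, and ADDR_BITS is an arbitrary
choice.

```python
def bin_to_gray(value: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return value ^ (value >> 1)


def gray_to_bin(gray: int) -> int:
    """Convert a reflected Gray code back to the binary count."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


ADDR_BITS = 4              # assumed pointer width, illustration only
DEPTH = 1 << ADDR_BITS

for ptr in range(DEPTH):
    nxt = (ptr + 1) % DEPTH                      # includes the wrap-around
    changed = bin_to_gray(ptr) ^ bin_to_gray(nxt)
    # Exactly one bit differs between consecutive Gray-coded pointers.
    assert bin(changed).count("1") == 1
    assert gray_to_bin(bin_to_gray(ptr)) == ptr
```

Because only one bit changes at a time, a pointer sampled in the other
clock domain is either the old or the new value, never an unrelated one.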

Article: 148249
Subject: Re: DMA operation to 64-bits PC platform
From: Frank van Eijkelenburg <fei.technolution@gmail.com>
Date: Fri, 2 Jul 2010 00:08:35 -0700 (PDT)

On Jul 2, 2:19 am, Charles Gardiner <charles.gardi...@invalid.invalid>
wrote:
> Frank van Eijkelenburg wrote:
>
> > Hi,
>
> > I have a custom made PCIe board with a Virtex 5 FPGA on which I
> > implemented a DMA unit which uses the PCIe endpoint block plus v1.14.
> > I also implemented simple read/write operations from the PC to the
> > board (the board responds with completion TLPs). The read/write
> > operations are working, DMA is not working
>
> > The board is inserted in a pc with Windows 7 64 bits platform. An
> > application allocates virtual memory and passes the memory block to
> > the driver. The driver locks the memory and converts the virtual
> > addresses into physical addresses. These physical addresses are
> > written to the FPGA.
>
> How are you doing this? Normally, an application requests a buffer using malloc()
> or new() and gets a handle to the driver using CreateFile(). You then use
> WriteFile(hDevice, Buffer,...), ReadFile(hDevice, Buffer,....) or
> DeviceIoControl() to initiate a transfer to/from the device. That's the
> application side.
>
> On the driver (kernel) side, I would strongly recommend that you write a KMDF-based
> driver. Download the Windows WDK; all it costs is your email address. (You have to log in
> via Microsoft Connect, last time I looked.) There are lots of examples there,
> including for PCI(e)-based DMA. To summarise (very quickly): your driver requests
> the scatter/gather list describing the buffers above (see
> WdfDmaTransactionInitializeUsingRequest() in the WDK API docs as a starting point)
> and passes the entries to your hardware one by one, which then does the DMA in or out.
> With a call to WdfRequestComplete() the buffers are released by the kernel and your
> application can reuse them or free them as required. (This is of course all
> considerably more than a day's work, by the way.)
>
> You do not have to explicitly lock down the buffer yourself. Windows does this for
> you while the I/O request is active (from ReadFile()/WriteFile() in your app up to
> WdfRequestComplete() in the driver).
>
>
>
> > When I start an DMA operation, I can see in chipscope the correct
> > physical addresses in the TLP header. However, I do not see the
> > correct values in the allocated memory. What can I do to check where
> > it is going wrong?
>
> In this case, I would first doubt whether the addresses are correct.
>
> > Another question is about the memory request TLPs. What should I use,
> > 32 or 64 bit write requests? Or do I have to check runtime if the
> > physical memory address is below or above the 4 GB (and use
> > respectively 32 and 64 bit requests)?
>
> The PCIe spec says: a transfer below 4 GB must use a 3 DWord header, a transfer
> above 4 GB must use a 4 DWord header. I.e. a 4 DWord header with address[63:32]
> set to zero is invalid.
>
>
>
> > Thanks in advance,
>
> > Frank

The way it works is as follows:
- the application allocates the memory (malloc).
- a pointer to this memory is passed to the driver (a custom-made
driver).
- the driver creates a scatter-gather list by using the
GetScatterGatherList method from the DMA_ADAPTER object.
- the driver writes each entry of the scatter-gather list (which
contains a physical address and length) to the FPGA.
- the FPGA receives data (through another interface) and writes this
data to the memory of the PC by use of DMA (it just generates write
requests).
- after writing the data the FPGA generates an interrupt over PCIe (not
working yet, but we know when the FPGA has finished a transaction).
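
For cross-checking what the FPGA puts on the link, the last two steps can
be modelled roughly as below: each scatter-gather entry (physical address,
length) is cut into write requests. This Python sketch is only a
reference model. It assumes a Max_Payload_Size of 128 bytes (check what
your link actually negotiated), and it applies the PCIe rule that a
request must not cross a 4 KB address boundary. The names and the sample
entry are hypothetical.

```python
from typing import Iterator, Tuple

MAX_PAYLOAD = 128   # bytes; ASSUMED Max_Payload_Size, check your link
BOUNDARY = 4096     # a memory request must not cross a 4 KB boundary
FOUR_GB = 1 << 32


def split_entry(phys_addr: int, length: int,
                max_payload: int = MAX_PAYLOAD) -> Iterator[Tuple[int, int, int]]:
    """Yield (address, byte_count, header_dwords) per write request."""
    while length > 0:
        to_boundary = BOUNDARY - (phys_addr % BOUNDARY)
        chunk = min(length, max_payload, to_boundary)
        # 4 GB is a multiple of 4 KB, so no chunk straddles the 4 GB line
        # and the header size can be chosen from the start address alone.
        yield phys_addr, chunk, (3 if phys_addr < FOUR_GB else 4)
        phys_addr += chunk
        length -= chunk


# Hypothetical entry that starts just below 4 GB and ends just above it.
for addr, count, hdr in split_entry(0xFFFF_FFC0, 0x80):
    print(f"{addr:#012x}  {count:3d} bytes  {hdr}DW header")
```

Comparing such a list against the ChipScope capture shows quickly whether
a request crosses a boundary or uses the wrong header size.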

I now understand that I have to check at runtime whether the physical
address is below or above 4 GB and use a 3 DW or 4 DW TLP header
respectively. I will change that in the FPGA and give it a try.

As for the addresses, these are correct. We did the following test:
write to the virtual memory from the application and read the memory
back using the physical addresses in the driver. In the driver we read
what the application had written.

Any other suggestions?

Frank
