Messages from 144800

Article: 144800
Subject: Re: Cable autodetection failed
From: John Adair <g1@enterpoint.co.uk>
Date: Tue, 5 Jan 2010 00:00:49 -0800 (PST)

That's a very old board; around that time Digilent made a few like that,
with a direct connection to a parallel port. That complicates a
solution, because connecting directly to a laptop is going to need
an add-on parallel port, either via a plug-in card or a docking station.

However I think there is a chance of using a physical adaptor to make
this interface work with a modern cable. I would suggest googling
parallel cable III schematics to understand what goes in a parallel
port cable and how it relates to a parallel port pinout. I think if
you can wire to the relevant pins of the interface, enabling any
relevant buffer, you may be able to get the 4 JTAG signals you need.

Of course just buying a new board, with a modern interface, might be
simpler.

John Adair
Enterpoint Ltd.

On 4 Jan, 22:26, Angus <angusdun...@googlemail.com> wrote:
> i found at last the datasheet @ http://www.digilentinc.com/Data/Products/D2E/D2E-rm.PDF (please see p2
> onwards re. the configuration cable). can my problem still be
> resolved?

Article: 144801
Subject: Re: EPCS vs SPI Flash
From: "Nial Stewart" <nial*REMOVE_THIS*@nialstewartdevelopments.co.uk>
Date: Tue, 5 Jan 2010 10:53:00 -0000

"Omer Osman" <research@ottomaneng.com> wrote in message 
news:02ced69f-4e91-4902-8e05-7b0671f2174d@v25g2000yqk.googlegroups.com...
> Hello world,
> I'm looking at making my first FPGA board with a Cyclone III EP3C25
> and am researching my boot memory solution. Reading Altera's docs they
> reference their EPCS series serial programming chips but they run for
> an outlandish $16 - $32 on Newark.
> Googling got me to this page: 
> http://fpgaforum.blogspot.com/2006/03/any-replacement-for-altera-epcs_19.html
> that claims that their so called EPCS programming chips are nothing
> more than SPI flash. Can anyone confirm this? I am looking at getting
> the SST25VF032B 32Mb (4M x 8) SPI Flash memory.


I don't think it can be just any serial flash; Quartus looks for the correct
device ID (or similar) when programming. Check the data sheets for the
details.

I've successfully used the ST parts for a few years. I was told off the record
by an FAE that the Altera parts are actually these, re-branded.

It must cost a lot to write 'Altera' on them!


Nial

Article: 144802
Subject: Re: Video Processing
From: "Ghostboy" <Ghostboy@dommel.be>
Date: Tue, 05 Jan 2010 05:03:30 -0600

Hi,

The video file is not coming from another PCI card but from the pc. The
code to send and receive data is on top of the PCI driver in Linux. But I
don't know how to let the algorithm (PCI core) on the FPGA know that it can
start processing data and how it can give a signal to the pc that the
processing of a frame is finished. I also want to use the I/O space of the
PCI instead of the memory space but I don't know if that will be fast
enough to process more than 50 frames per second with a resolution of
320x240. 

	   
					
---------------------------------------		
This message was sent using the comp.arch.fpga web interface on
http://www.FPGARelated.com

Article: 144803
Subject: Re: Video Processing
From: John_H <newsgroup@johnhandwork.com>
Date: Tue, 5 Jan 2010 04:02:32 -0800 (PST)

On Jan 5, 6:03 am, "Ghostboy" <Ghost...@dommel.be> wrote:
> Hi,
>
> The video file is not coming from another PCI card but from the pc. The
> code to send and receive data is on top of the PCI driver in Linux. But I
> don't know how to let the algorithm (PCI core) on the FPGA know that it can
> start processing data and how it can give a signal to the pc that the
> processing of a frame is finished. I also want to use the I/O space of the
> PCI instead of the memory space but I don't know if that will be fast
> enough to process more than 50 frames per second with a resolution of
> 320x240.

Why would you want to use I/O space?  The most effective way to
transfer data over simple PCI interfaces (rather than PCI-X or PCIe)
is the Memory Read Multiple transaction to transfer a cache line at a
time rather than a 32-bit word for each transaction.  [I might be
assuming too much up front.  Are you using a master/target PCI core to
perform Direct Memory Accesses (DMAs) to transfer the FPGA-processed
data back to the PC?]

Are you streaming the data or transferring an entire file?

When you design the FPGA logic that hooks up to the PCI core, you have
access to the addresses being written and can maintain a byte count.
When a certain address is written (e.g. a frame of data is complete)
you can begin your processing.  Either your driver or the design of
the system should define the frame size and/or number of frames to
process as well as FPGA memory (or I/O) space that's appropriate for
the writes.  To transfer data back to the PC, the driver should have a
defined DMA space for the master mode of a master/target PCI core and
let the FPGA transfer the data with the memory read multiple
transactions to fill the space and interrupt the processor when
complete.  The interrupt handler on the PC side would then do what it
needs to with the frame and set up the next DMA memory band.
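
As a rough feasibility check on the 50 frames per second figure, the arithmetic can be sketched in Python. Only the 320x240 resolution and the frame rate come from the thread; the bytes-per-pixel values and the 32-bit, 33 MHz bus are assumptions, and the peak figure ignores arbitration, address phases and wait states.

```python
# Back-of-envelope throughput check; pixel depth and bus parameters are assumptions.
WIDTH, HEIGHT, FPS = 320, 240, 50      # figures quoted in the thread

def required_bytes_per_second(bytes_per_pixel):
    """Raw one-way payload rate for the video stream."""
    return WIDTH * HEIGHT * FPS * bytes_per_pixel

# Theoretical peak of plain 32-bit, 33 MHz PCI: one 4-byte word per clock.
PCI_PEAK = 4 * 33_000_000              # bytes per second (assumed bus)

for bpp in (1, 2, 3):                  # hypothetical 8-, 16- and 24-bit pixels
    one_way = required_bytes_per_second(bpp)
    both_ways = 2 * one_way            # frame in, processed frame out
    print(f"{bpp} B/pixel: {one_way / 1e6:.2f} MB/s one way, "
          f"{both_ways / 1e6:.2f} MB/s both ways, "
          f"{100 * both_ways / PCI_PEAK:.1f}% of peak")
```

Single-word target accesses typically reach only a fraction of the theoretical peak, which is why the choice between single accesses and burst transactions matters more than the raw payload rate suggests.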

I didn't look at your file in the post above because I don't have a
RAR decompressor on my home machine.

Article: 144804
Subject: Re: EPCS vs SPI Flash
From: Antti <antti.lukats@googlemail.com>
Date: Tue, 5 Jan 2010 04:10:08 -0800 (PST)

On Jan 5, 12:53 pm, "Nial Stewart"
<nial*REMOVE_TH...@nialstewartdevelopments.co.uk> wrote:
> "Omer Osman" <resea...@ottomaneng.com> wrote in message
>
> news:02ced69f-4e91-4902-8e05-7b0671f2174d@v25g2000yqk.googlegroups.com...
>
> > Hello world,
> > I'm looking at making my first FPGA board with a Cyclone III EP3C25
> > and am researching my boot memory solution. Reading Altera's docs they
> > reference their EPCS series serial programming chips but they run for
> > an outlandish $16 - $32 on Newark.
> > Googling got me to this page:
> >http://fpgaforum.blogspot.com/2006/03/any-replacement-for-altera-epcs...
> > that claims that their so called EPCS programming chips are nothing
> > more than SPI flash. Can anyone confirm this? I am looking at getting
> > the SST25VF032B 32Mb (4M x 8) SPI Flash memory.
>
> I don't think it can be just any serial flash, Quartus looks for the correct
> device ID (or similar) when programming. Check the data sheets for the
> details.
>
> I've successfully used the ST parts for a few years, I was told off the record
> by an FAE that the Altera parts are actually these re-branded.
>
> It must cost a lot to write 'Altera' on them!
>
> Nial

right

1) ST
2) the LOGO costs $$$ (actually it DOESN'T, it's just PURE profit to
Altera!)
3) different vendors may be "compatible" for basic read and maybe
write, but that doesn't make them work with native Quartus tools

Antti

Article: 144805
Subject: Re: Video Processing
From: "Ghostboy" <Ghostboy@dommel.be>
Date: Tue, 05 Jan 2010 07:06:25 -0600

Hi,

I want to use the I/O space first because I think it is the simplest
implementation.
This core does not offer the possibility to use DMA.
If I make a zip file can you open it then? It might be easier to see the
problem.

I'm streaming an entire file by the way.




>Why would you want to use I/O space?  The most effective way to
>transfer data over simple PCI interfaces (rather than PCI-X or PCIe)
>is the Memory Read Multiple transaction to transfer a cache line at a
>time rather than a 32-bit word for each transaction.  [I might be
>assuming too much up front.  Are you using a master/target PCI core to
>perform Direct Memory Accesses (DMAs) to transfer the FPGA-processed
>data back to the PC?]
>
>Are you streaming the data or transferring an entire file?
>
>When you design the FPGA logic that hooks up to the PCI core, you have
>access to the addresses being written and can maintain a byte count.
>When a certain address is written (e.g. a frame of data is complete)
>you can begin your processing.  Either your driver or the design of
>the system should define the frame size and/or number of frames to
>process as well as FPGA memory (or I/O) space that's appropriate for
>the writes.  To transfer data back to the PC, the driver should have a
>defined DMA space for the master mode of a master/target PCI core and
>let the FPGA transfer the data with the memory read multiple
>transactions to fill the space and interrupt the processor when
>complete.  The interrupt handler on the PC side would then do what it
>needs to with the frame and set up the next DMA memory band.
>
>I didn't look at your file in the post above because I don't have a
>RAR decompressor on my home machine.

Article: 144806
Subject: Re: Video Processing
From: John_H <newsgroup@johnhandwork.com>
Date: Tue, 5 Jan 2010 05:53:32 -0800 (PST)

On Jan 5, 8:06 am, "Ghostboy" <Ghost...@dommel.be> wrote:
> Hi,
>
> I first want to use the I/O space because I think the most simple
> implementation.
> The possibility to use DMA is not in this core.
> If I make a zip file can you open it then? It might be easier to see the
> problem.
>
> I'm streaming an entire file by the way.
>
What work have you done or *can* you do?  You've used system generator
and other high level tools.  Are you familiar with low level FPGA
design?  Are you looking to get someone to do this low level design
work for you?

Is your need on the PC driver for the XUPV2P such that you need
software help?

Having a zip file I can actually look at won't help me to help you if
I don't know what you need.

I've suggested on the FPGA side that you use the writes to the FPGA
(quantity and/or addresses) to know when a frame of data is available
and that you use a PCI interrupt to tell the PC when the FPGA is
done.  You specifically asked for those.

What more are you asking for?

Article: 144807
Subject: Re: EPCS vs SPI Flash
From: Mike Harrison <mike@whitewing.co.uk>
Date: Tue, 05 Jan 2010 16:14:56 +0000

On Tue, 5 Jan 2010 04:10:08 -0800 (PST), Antti <antti.lukats@googlemail.com> wrote:

>On Jan 5, 12:53 pm, "Nial Stewart"
><nial*REMOVE_TH...@nialstewartdevelopments.co.uk> wrote:
>> "Omer Osman" <resea...@ottomaneng.com> wrote in message
>>
>> news:02ced69f-4e91-4902-8e05-7b0671f2174d@v25g2000yqk.googlegroups.com...
>>
>> > Hello world,
>> > I'm looking at making my first FPGA board with a Cyclone III EP3C25
>> > and am researching my boot memory solution. Reading Altera's docs they
>> > reference their EPCS series serial programming chips but they run for
>> > an outlandish $16 - $32 on Newark.
>> > Googling got me to this page:
>> >http://fpgaforum.blogspot.com/2006/03/any-replacement-for-altera-epcs...
>> > that claims that their so called EPCS programming chips are nothing
>> > more than SPI flash. Can anyone confirm this? I am looking at getting
>> > the SST25VF032B 32Mb (4M x 8) SPI Flash memory.
>>
>> I don't think it can be just any serial flash, Quartus looks for the correct
>> device ID (or similar) when programming. Check the data sheets for the
>> details.
>>
>> I've successfully used the ST parts for a few years, I was told off the record
>> by an FAE that the Altera parts are actually these re-branded.

This should be pretty obvious from the datasheets.

>> It must cost a lot to write 'Altera' on them!
>>
>> Nial
>
>right
>
>1) ST
>2) the LOGO costs $$$ (actually it DOESN'T, it's just PURE profit to
>Altera!)
>3) different vendors may be "compatible" for basic read and maybe
>write, but that doesn't make them work with native Quartus tools
>
>Antti
>

There are a lot of differences between manufacturers' SPI flash in the area of write/erase
granularity, but they are generally compatible for read.
If Quartus is fussy about programming, use Altera chips for prototyping and someone else's for
production.

I've used SST, Atmel and ST (now Numonyx).
The SST ones I used suck as they don't have page write, so they are very slow to program.
I think Atmel currently have the biggest available (i.e. ex-stock DK/Mouser) capacity in 8-SO at
32 Mbit (note the 26 series has errata in chip erase), and decent write/erase times.

Article: 144808
Subject: Re: How to protect my Virtex5 design without battery?
From: untergangsprophet <filter001@desinformation.de>
Date: Tue, 5 Jan 2010 09:03:32 -0800 (PST)

On 29 Dec. 2009, 08:49, vcar <hi...@163.com> wrote:
> For certain reasons, I could not use battery on my board, so the
> Virtex5 bitstream encryption could not be used. In this situation, what
> could I do to protect my design on a reasonable level?

What about an Atmel crypto-memory?

It stores your key and is very difficult to clone.
The FPGA logic can then have random values encrypted by the
crypto-memory and verify them with its internal key.
You should spend some effort to make sure your "random values" are
sufficiently random.

You could go the same way using, for example, an LPC (NXP) flash
microcontroller, which can be read-protected.
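
The challenge-response idea can be sketched in Python as a behavioural model. This is only an illustration under stated assumptions: HMAC-SHA256 stands in for whatever algorithm the real crypto-memory implements, and `SecureMemoryModel` and `fpga_authenticate` are hypothetical names, not part of any vendor API.

```python
import hashlib
import hmac
import secrets

class SecureMemoryModel:
    """Stand-in for the external crypto device; the key never leaves it."""
    def __init__(self, key: bytes):
        self._key = key

    def respond(self, challenge: bytes) -> bytes:
        # HMAC-SHA256 is an assumed stand-in for the device's real algorithm.
        return hmac.new(self._key, challenge, hashlib.sha256).digest()

def fpga_authenticate(device, fpga_key: bytes) -> bool:
    """FPGA side: send a fresh random challenge and check the response."""
    challenge = secrets.token_bytes(16)          # must be unpredictable
    expected = hmac.new(fpga_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(device.respond(challenge), expected)

key = secrets.token_bytes(32)
genuine = SecureMemoryModel(key)
clone = SecureMemoryModel(secrets.token_bytes(32))   # copied board, wrong key
print(fpga_authenticate(genuine, key))   # True
print(fpga_authenticate(clone, key))     # False
```

A fresh challenge per power-up is what prevents a recorded response from being replayed, which is why the quality of the random values matters.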

Article: 144809
Subject: Why are my pins being removed? LIT:243 and MapLib:701 warnings
From: Griffin <captain.griffin@gmail.com>
Date: Tue, 5 Jan 2010 13:36:12 -0800 (PST)

I'm working on a ML402 (Virtex-4) EDK project which has one custom IP
("event_getter") that was created using the EDK Create Peripheral
Wizard and then I added my code (included below) to the generated
user_logic.vhdl file. I am able to generate a netlist without issue,
but when I compile the bitstream, I get the following warnings:

WARNING:LIT:243 - Logical network my_peripheral_pin_0_IBUF has no
   load.

and

WARNING:MapLib:701 - Signal my_peripheral_pin<0> connected to top
level port my_peripheral_pin<0> has been removed.

and it is repeated for pins my_peripheral_pin<1 - 6> (ie, all my
external pins). Synthesis continues, however, and a bitstream is
generated.

My peripheral code is:

  pin_in: std_logic_vector(0 to 6);

  [...]

  signal pixel_0 : std_logic_vector(0 to 15):= (others => '0');
  signal pixel_1 : std_logic_vector(0 to 15):= (others => '0');
  signal pixel_2 : std_logic_vector(0 to 15):= (others => '0');
  signal pixel_3 : std_logic_vector(0 to 15):= (others => '0');
  signal pixel_4 : std_logic_vector(0 to 15):= (others => '0');
  signal pixel_5 : std_logic_vector(0 to 15):= (others => '0');
  signal pixel_6 : std_logic_vector(0 to 15):= (others => '0');

  signal s0: std_logic_vector(0 to C_SLV_DWIDTH-1):= (others => '0');
  signal s1: std_logic_vector(0 to C_SLV_DWIDTH-1):= (others => '0');
  signal s2: std_logic_vector(0 to C_SLV_DWIDTH-1):= (others => '0');
  signal s3: std_logic_vector(0 to C_SLV_DWIDTH-1):= (others => '0');

[...]

  -- purpose: peek at pixels and increment corresponding counters
  -- type   : combinational
  -- inputs : pin_in(i)
  -- outputs: pixel_(i)

  check0: process (pin_in(0))  is
  begin
       if rising_edge(pin_in(0)) then
        pixel_0 <= pixel_0 + '1';
        s0(0 to 15) <= pixel_0;
       end if;
  end process;

  check1: process (pin_in(1))   is
  begin
       if rising_edge(pin_in(1)) then
        pixel_1 <= pixel_1 + '1';
        s0(16 to 31) <= pixel_1;
       end if;
  end process;

(the above two blocks are repeated for pixels 2-6)

  -- purpose: update registers to pixel count values
  -- inputs : Bus2IP_Clk
  -- outputs: slv_reg(i)
UPDATE_REGISTERS : process (Bus2IP_Clk) is
 begin  -- process
    slv_reg0 <= s0;
    slv_reg1 <= s1;
    slv_reg2 <= s2;
    slv_reg3 <= s3;
 end process;



slv_reg0..3 are read out by EDK's automatically generated ReadReg
functions so I can access them from the C-level part of my project.


Firstly, I'm not sure what the first message means: nowhere in the
code I wrote have I defined anything including '_IBUF'. I suspect it
is something EDK generates automatically; if so, can someone tell me
what it is?

Secondly, if I understand correctly, the pins of my custom peripheral
are being removed from the project. I've looked around the internet a
bit and suspect this might be done due to auto-optimization, but
considering that these pins are, indeed, being used in the user_logic
file, and the registers that they store their values in are being read
out by my application C code, I'm not sure why this would happen.
Does anyone have an idea why EDK could be removing my pins?

I assigned them to be external ports in EDK via System Assembly View ->
Ports -> My_peripheral -> (net column menu) Make External.

Thanks in advance,

Sean.

Article: 144810
Subject: Re: ASM hardware language definition file for Altera/Xilinx
From: grigio <crowned32@gmail.com>
Date: Wed, 6 Jan 2010 00:19:42 -0800 (PST)

On 2 Jan, 19:21, Weng Tianxiang <wtx...@gmail.com> wrote:
> Hi,
> I need to write ASM hardware language for circuits. I wrote a lot
> about 10 years ago for Altera chip. Now I couldn't find the ASM
> hardware language definition file from Altera/Xilinx.

Perhaps PALASM?

Article: 144811
Subject: university platform cable
From: David Fejes <fejesd@gmail.com>
Date: Wed, 6 Jan 2010 04:10:20 -0800 (PST)

Hello everybody,

can anyone tell me what the difference is between the University
Platform Cable (UW-USB-II-G) and the Platform Cable USB II
(HW-USB-II-G)?
There are almost no search results for the University Platform Cable,
yet several webshops list it at about half the price of the Platform
Cable USB II. Has anyone tried it? Will it work with iMPACT, and
will it support the Xilinx CPLDs?

thank you in advance

Article: 144812
Subject: Re: ASM hardware language definition file for Altera/Xilinx
From: Weng Tianxiang <wtxwtx@gmail.com>
Date: Wed, 6 Jan 2010 08:38:24 -0800 (PST)

On Jan 6, 12:19 am, grigio <crowne...@gmail.com> wrote:
> On 2 Jan, 19:21, Weng Tianxiang <wtx...@gmail.com> wrote:
>
> > Hi,
> > I need to write ASM hardware language for circuits. I wrote a lot
> > about 10 years ago for Altera chip. Now I couldn't find the ASM
> > hardware language definition file from Altera/Xilinx.
>
> Perhaps PALASM?

Yes, it is a similar language, but Altera has its own name for it, and
it is not called PALASM.

Weng

Article: 144813
Subject: A VHDL compiler error report in Xilinx ISE 10.1 and service pack 3
From: Weng Tianxiang <wtxwtx@gmail.com>
Date: Wed, 6 Jan 2010 08:51:01 -0800 (PST)

Hi Xilinx,
Here is an error report, with the code and the compilation result from
ISE 10.1 with service pack 3.
The code has been simplified to highlight the characteristics of the
VHDL compiler error.
A global signal array is declared in a package.
When one element of the global array is accessed at one level of the
hierarchy, all the other, non-accessed elements of the array become
inaccessible, and any access to them is reported as an error.
I list only one error case, in which elements of the global array are
accessed at one level.
A similar error case arises when elements of the global array are
accessed at two different levels in a module; the same compilation
error occurs again.

Error reason: each element of a global array must have its own
independent data source ID; the array as a whole cannot share a single
data source ID to determine whether an access violation occurs or
not.

LIBRARY ieee;
USE ieee.std_logic_1164.all;
use ieee.numeric_std.all;

Package XILINX_ERROR_REPORT is
  constant TOP    : integer := 1;
  constant BOTTOM : integer := 0;
  signal Error : unsigned(TOP downto BOTTOM);
end package;

-------------------------------------------------------------------------------

LIBRARY ieee;
USE ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.XILINX_ERROR_REPORT.all;

entity Xilinx_Error is
  generic(X : integer := 1);
  port(
    CLK  : in std_logic;
    SINI : in std_logic;

    Error_I : in std_logic;
    Error_O : out std_logic
  );
end Xilinx_Error;

architecture A of Xilinx_Error is
begin
  Error_O <= Error(X);

  A1 : process(CLK)
  begin
    if CLK'event and CLK = '1' then
      if SINI = '1' then
        Error(X) <= '0';
      else
        if Error_I = '1' then
          Error(X) <= '1';
        else
          Error(X) <= '0';
        end if;
      end if;
    end if;
  end process;
end A;

-------------------------------------------------------------------------------

LIBRARY ieee;
USE ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.XILINX_ERROR_REPORT.all;

entity Xilinx_Error_Detect is
  generic(X : integer := 1);
  port(
    CLK     : in std_logic;
    SINI    : in std_logic;
    Error_I : in std_logic_vector(TOP downto BOTTOM);
    Error_O : out std_logic_vector(TOP downto BOTTOM)
  );
end Xilinx_Error_Detect;

architecture A of Xilinx_Error_Detect is
  component Xilinx_Error is
    generic(X : integer := 1);
    port(
      CLK     : in std_logic;
      SINI    : in std_logic;
      Error_I : in std_logic;
      Error_O : out std_logic
    );
  end component;

begin
  Generate_A : for J in BOTTOM to TOP generate
    Xilinx_Error_A : Xilinx_Error
      generic map(X => J)
      port map (
        CLK     => CLK,
        SINI    => SINI,
        Error_I => Error_I(J),
        Error_O => Error_O(J)
      );
  end generate;
end A;

-------------------------------------------------------------------------------
=========================================================================
*                          HDL Compilation                              *
=========================================================================
Compiling vhdl file "C:/Xilinx-Error-Report/Xilinx_Error_Report.vhd" in Library work.
Package <xilinx_error_report> compiled.
Entity <xilinx_error> compiled.
Entity <xilinx_error> (Architecture <a>) compiled.
Entity <xilinx_error_detect> compiled.
Entity <Xilinx_Error_Detect> (Architecture <A>) compiled.

=========================================================================
*                     Design Hierarchy Analysis                         *
=========================================================================
Analyzing hierarchy for entity <Xilinx_Error_Detect> in library <work> (architecture <A>) with generics.
X = 1

Analyzing hierarchy for entity <Xilinx_Error> in library <work> (architecture <a>) with generics.
X = 0

Analyzing hierarchy for entity <Xilinx_Error> in library <work> (architecture <a>) with generics.
X = 1


=========================================================================
*                            HDL Analysis                               *
=========================================================================
Analyzing generic Entity <Xilinx_Error_Detect> in library <work> (Architecture <A>).
X = 1
Entity <Xilinx_Error_Detect> analyzed. Unit <Xilinx_Error_Detect> generated.

Analyzing generic Entity <Xilinx_Error.1> in library <work> (Architecture <a>).
X = 0
Entity <Xilinx_Error.1> analyzed. Unit <Xilinx_Error.1> generated.

Analyzing generic Entity <Xilinx_Error.2> in library <work> (Architecture <a>).
X = 1
ERROR:Xst:2548 - "C:/Xilinx-Error-Report/Xilinx_Error_Report.vhd" line 36: Signal 'Error' defined in a package is already used in entity <Xilinx_Error.1>.
-->

Total memory usage is 129212 kilobytes

Number of errors : 1 ( 0 filtered)
Number of warnings : 1 ( 0 filtered)
Number of infos : 0 ( 0 filtered)


Process "Synthesis" failed

In the above example, Error(1 downto 0) is declared in package
XILINX_ERROR_REPORT, and its elements are used in different modules:
Error(0) is used in module Xilinx_Error.1 and Error(1) is used in
module Xilinx_Error.2. The compiler fails to track the usage of each
element of Error(1 downto 0) separately and reports an error where
there is none.

Weng

Article: 144814
Subject: Re: ADC problem on spartan3E
From: "mlajevar" <mahsa_lajevardi@yahoo.com>
Date: Wed, 06 Jan 2010 13:18:49 -0600

Hello

I found that the values on the LEDs are half of the value I expect from the
formula, so I think the problem is with the 14th bit in my VHDL code: from
the 13th bit down to the LSB all bits are almost the same, and the big
difference, the value being half of the original, is caused by the 14th
bit.

I am trying to work with ChipScope. I haven't used it before, and I don't
know which ChipScope Pro core I have to work with (ATC2, ICON, ILA, VIO).
According to some material I read, I think I should work with ILA; is that
right? I want to check the values of all fourteen bits from the ADC.

thanks in advance for all your ideas and suggestions
					

Article: 144815
Subject: Databus crossing clock domains with data freeze
From: Nicholas Kinar <n.kinar@usask.ca>
Date: Wed, 06 Jan 2010 14:11:50 -0600

Hello--

What is the best standard practice to have a data bus cross a clock 
domain by implementing a data freeze?

There is an extremely brief description of the data freeze given here:
http://www.fpga4fun.com/CrossClockDomain4.html

What is the best way to "freeze" the data in the source clock domain?

I have a 108-bit bus which needs to cross between a high-speed clock 
domain (280MHz) and a clock domain operated at a lower speed (70MHz).

I am using Verilog as the HDL and my FPGA is a Cyclone II.

Nicholas

Article: 144816
Subject: Re: Databus crossing clock domains with data freeze
From: John_H <newsgroup@johnhandwork.com>
Date: Wed, 6 Jan 2010 12:34:36 -0800 (PST)

On Jan 6, 3:11 pm, Nicholas Kinar <n.ki...@usask.ca> wrote:
> Hello--
>
> What is the best standard practice to have a data bus cross a clock
> domain by implementing a data freeze?
>
> There is an extremely brief description of the data freeze given here:
> http://www.fpga4fun.com/CrossClockDomain4.html
>
> What is the best way to "freeze" the data in the source clock domain?
>
> I have a 108-bit bus which needs to cross between a high-speed clock
> domain (280MHz) and a clock domain operated at a lower speed (70MHz).
>
> I am using Verilog as the HDL and my FPGA is a Cyclone II.
>
> Nicholas

The "flag" is the important item in that description.  If you never
have more than one data value to transfer within any 8 (or so)
high-speed clock cycles you can get by with transferring one value at a
time.  If you have bursts of data, you need a FIFO but the average
speed cannot be greater than one in four high-speed clocks.  The FIFO
would need to be sized such that the longest burst could always be
drained.

When you have new data in the fast domain, toggle a single bit.  Read
(register) that single bit in the slow domain.  If the bit has
changed, load the data on the next cycle.  You keep track of whether
the bit has changed with a simple clock delay of that bit in the slow
domain.

Why not just load the data on the same clock the bit changes, using an
XOR of the fast and slow flag bits for an enable?  If the clocks
aren't synchronous with guaranteed setup and hold, the enable may get
to some bits on one side of the clock transition, other bits on the
opposite side resulting in "half" transferred data.

I mentioned earlier to "toggle" a single bit in the fast domain.  This
eliminates the need to have a reset handshake back from the slow
domain; it's only when the bit changes that a write occurs.  This
points out that you can't have the bit toggle twice within one slow-
domain clock cycle or the change won't be seen and data lost.  Also,
since there's a full slow clock cycle between registering the bit and
performing the data load, the data has to remain static for that
duration.
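
The toggle-bit handshake described above can be explored with a small cycle-based model in Python. It is a sketch under stated assumptions, not HDL: time is counted in fast clock ticks, the 4:1 clock ratio, the 8-tick write spacing and the `phase` offset are illustrative parameters, and the model is ideal (no metastability or routing skew). The function name `simulate` is hypothetical.

```python
def simulate(values, ratio=4, gap=8, phase=1):
    """Cycle-based model; time is counted in fast clock ticks.

    ratio: fast ticks per slow clock; gap: fast ticks between writes;
    phase: offset of the slow clock edge. All three are illustrative.
    """
    flag = 0            # toggled in the fast domain for each new word
    data_hold = None    # fast-domain register, held static after a write
    flag_q = 0          # flag registered in the slow domain
    flag_qq = 0         # one slow clock delay of flag_q
    received = []
    pending = list(values)

    for t in range(gap * len(values) + 3 * ratio):
        # Slow clock edge: evaluated first so it sees the pre-edge values.
        if t % ratio == phase:
            if flag_q != flag_qq:          # change seen on the previous edge
                received.append(data_hold)
            flag_qq, flag_q = flag_q, flag
        # Fast clock edge: publish a new word and toggle the flag.
        if pending and t % gap == 0:
            data_hold = pending.pop(0)
            flag ^= 1
    return received

words = [10, 20, 30, 40]
print(simulate(words))            # every word arrives when writes are 8 ticks apart
print(simulate(words, gap=2))     # writes too close together: words are lost
```

With writes spaced 8 fast ticks apart every word is captured for any slow clock phase; shrinking the spacing to 2 ticks lets the flag toggle twice within one slow cycle, so the change goes unseen and data is lost.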

If you need more help than descriptions, write again.  I love to see
people think through the issue and understand why they write the code
they do.

- John

Article: 144817
Subject: Re: ADC problem on spartan3E
From: John_H <newsgroup@johnhandwork.com>
Date: Wed, 6 Jan 2010 12:40:21 -0800 (PST)

On Jan 6, 2:18 pm, "mlajevar" <mahsa_lajeva...@yahoo.com> wrote:
>
> I got that the values on LEDs is the half of the value I expect from the
> formula,so I think the problem would be with the 14th bit in my vhdl
> code,because from 13th bit to LSB ,all bits are almost same,but the big
> difference comes from being half of the original value caused by 14th
> bit.

Keep in mind that on a serial interface being off by a factor of two
in your read value is the same as having the value off by one clock in
the serial domain.  Perhaps you're actually missing the least
significant bit!
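
A short Python sketch makes the point concrete. It assumes an MSB-first serial stream and treats the 14-bit sample as unsigned purely for illustration; `shift_in` and `to_bits` are hypothetical helper names, and the test value is arbitrary.

```python
def shift_in(bits):
    """Assemble an integer from bits received MSB first."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

def to_bits(value, width=14):
    """MSB-first bit list of a width-bit unsigned value."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]

sample = 0x2A5B                      # arbitrary 14-bit test value
stream = to_bits(sample)
full = shift_in(stream)              # all 14 bits captured
short = shift_in(stream[:-1])        # capture stopped one clock early
print(full, short, full // 2)
```

Stopping the shift register one clock early drops the least significant bit, so the captured value is the original divided by two (rounded down), which looks exactly like a "missing" top bit on the LEDs.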

Article: 144818
Subject: Re: Databus crossing clock domains with data freeze
From: Jonathan Bromley <jonathan.bromley@MYCOMPANY.com>
Date: Wed, 06 Jan 2010 22:07:49 +0100

On Wed, 06 Jan 2010 14:11:50 -0600, Nicholas Kinar wrote:

>What is the best way to "freeze" the data in the source clock domain?
>
>I have a 108-bit bus which needs to cross between a high-speed clock 
>domain (280MHz) and a clock domain operated at a lower speed (70MHz).

If you are certain that the source clock is more than 2x faster 
than your target clock, I think it's rather straightforward.

Create a divide-by-2 signal in the target domain.  No
reset is required, because the phase of the divide-by-2
is irrelevant; only its changes are of interest.  So
we use a Verilog model that doesn't need a reset in 
simulation either:

  always @(posedge slow_clock)
    if (slow_flag == 1'b1)
      slow_flag <= 0;
    else
      slow_flag <= 1;

In the source domain, put the new data in your freeze
register as soon as you detect a change on slow_flag,
taking care to resynchronize slow_flag to avoid the risk
of input hazards.  Again no reset is required; it'll
sort itself out within three clock cycles.

  always @(posedge fast_clock) begin
    resync_slow_flag <= slow_flag;
    old_slow_flag <= resync_slow_flag;
    if (old_slow_flag != resync_slow_flag) begin
      freeze_register <= source_data;
      // Do whatever it takes to indicate that
      // source_data has been consumed, and make
      // the next source_data available no more
      // than 2 fast clocks later
    end
  end

And finally, capture freeze_register on every slow_clock:

  always @(posedge slow_clock)
    useful_data <= freeze_register;

In this way you can get a new data value on every slow_clock.
You can carry "data valid" information along with the data
itself, if you don't have a new data item soon enough for
every slow clock.

Draw lots of timing diagrams, and do lots of worst-case
analysis, to convince yourself whether this very simple
approach is robust in your situation.  I believe that 
it works reliably provided the clock periods obey the
following relationship:

  slow_period >= (2*fast_period) + Tss + Tpf + Tsf + Tps

where Tss is the setup time of the useful_data register,
Tsf is the setup time of the resync_slow_flag register,
Tpf is the propagation delay (including routing) from
fast clock to the freeze_register data becoming available
at the useful_data register's input, and Tps is the 
propagation delay from slow clock to slow_flag becoming
available at the input to resync_slow_flag.  

Note that Tss+Tpf and Tsf+Tps are both pretty much equal
to the shortest clock period that the FPGA can usefully
use, since they are both simple register-to-register
paths with no intervening logic.  So, as a first 
estimate, you could say

  slow_period >= (2*fast_period) + (2/Fmax)

where Fmax is the FPGA's fastest useful clock speed.
But you'll need to apply timing constraints and check
the static timing results to be sure that you are safe.
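
Plugging the clock rates from the original question into that first estimate gives a feel for the margin. This is only the arithmetic of the inequality above, sketched in Python; `min_fmax_hz` is a hypothetical helper name, and real closure still depends on the static timing report.

```python
def min_fmax_hz(fast_hz, slow_hz):
    """Smallest Fmax satisfying slow_period >= 2*fast_period + 2/Fmax."""
    # Time left over for the two register-to-register crossing paths.
    slack = 1.0 / slow_hz - 2.0 / fast_hz
    if slack <= 0:
        raise ValueError("slow clock must be more than 2x slower than fast clock")
    return 2.0 / slack

fast, slow = 280e6, 70e6             # figures from the original question
print(f"required Fmax >= {min_fmax_hz(fast, slow) / 1e6:.0f} MHz")
```

For 280 MHz and 70 MHz the estimate asks for register-to-register paths that close at about 280 MHz, the same rate as the fast clock itself, so on this estimate there is little margin to spare.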

Whatever you do, please double-check my assumptions 
for yourself before doing anything upon which your 
life, fortune or good name depends.  Clock domain
crossings have been the undoing of many.

See also the "Flancter", and standard asynchronous FIFOs
in the FPGA macrocell library (although they will be much
more resource-hungry than the simple freeze, because they
must work for all combinations of source and target 
clock frequency).
-- 
Jonathan Bromley

Article: 144819
Subject: Re: Databus crossing clock domains with data freeze
From: Nicholas Kinar <n.kinar@usask.ca>
Date: Wed, 06 Jan 2010 15:18:49 -0600
Links: << >> << T >> << A >>

Hello John--

Thank you for your response!

> The "flag" is the important item in that description.  If you never
> have more than one data value to transfer within any 8 (or so) high-
> speed clock cycles you can get by with transferring one value at a
> time.  If you have bursts of data, you need a FIFO but the average
> speed cannot be greater than one in four high-speed clocks.  The FIFO
> would need to be sized such that the longest burst could always be
> drained.

Essentially what I have is a 108-bit register which holds samples from 
six 18-bit ADCs. Once the register is full of data, I bring high an 
"offload_flag" signal which is read in the lower-speed clock domain. 
Once the "offload_flag" signal goes high, the 108-bit register is copied 
into another register in the slow clock domain.

Then logic in the lower-speed clock domain brings high an 
"rs_offload_flag" signal, which is read in the high speed clock domain. 
When the "rs_offload_flag" signal goes high, logic in the high speed 
clock domain then brings low the "offload_flag" signal.

This code fails timing analysis. There is no more than one data value to 
transfer within 8 high speed clock cycles.

Perhaps my problem is that I need to use a synchronizer to bring the 
"offload_flag" signal and the "rs_offload_flag" signal between clock 
domains?

> 
> When you have new data in the fast domain, toggle a single bit.  Read
> (register) that single bit in the slow domain.  If the bit has
> changed, load the data on the next cycle.  You keep track of whether
> the bit has changed with a simple clock delay of that bit in the slow
> domain.

So I would have to keep track of the state of the single bit in the slow 
domain?  Would this involve having a register that holds the previous 
value of the bit?   Every clock cycle, the register would be monitored 
for a change.  If there is a transition in the bit, then the register is 
read.

Then my register would have 109 bits = 108 bits data + 1 bit for transfer.
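
A minimal sketch of the toggle-bit scheme quoted above, assuming the flag is a separate one-bit register kept next to the data rather than a 109th bit of the data word. All module, port and register names are illustrative and not taken from the original design. The first slow-domain register (sync_0) is an extra metastability guard added here as an assumption; the quoted description only calls for one registered copy and one delayed copy of the bit.

```verilog
// Sketch only: toggle-flag transfer of a 108-bit word from a fast clock
// domain to a slow one. Names are hypothetical.
module toggle_xfer (
  input  wire         fast_clock,
  input  wire         slow_clock,
  input  wire         new_data,    // one fast_clock pulse per new word
  input  wire [107:0] fast_data,   // word assembled in the fast domain
  output reg  [107:0] slow_data,   // copy held in the slow domain
  output reg          slow_valid   // one slow_clock pulse per transferred word
);
  reg [107:0] hold_data;           // must stay static until the slow side copies it
  reg         toggle = 1'b0;       // flips once per new word
  reg         sync_0 = 1'b0;       // extra guard register (assumption, see text)
  reg         sync_1 = 1'b0;       // registered copy of the toggle bit
  reg         sync_2 = 1'b0;       // the same bit one slow clock earlier

  // Fast domain: capture the word and flip the flag.
  always @(posedge fast_clock)
    if (new_data) begin
      hold_data <= fast_data;
      toggle    <= ~toggle;
    end

  // Slow domain: a difference between sync_1 and sync_2 means the flag
  // changed, so the held word is copied on this clock edge.
  always @(posedge slow_clock) begin
    sync_0     <= toggle;
    sync_1     <= sync_0;
    sync_2     <= sync_1;
    slow_valid <= (sync_1 != sync_2);
    if (sync_1 != sync_2)
      slow_data <= hold_data;
  end
endmodule
```

This only holds up if new_data pulses are spaced far enough apart that hold_data does not change, and the flag does not toggle twice, before the slow domain has made its copy.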


> 
> Why not just load the data on the same clock the bit changes, using an
> XOR of the fast and slow flag bits for an enable?  If the clocks
> aren't synchronous with guaranteed setup and hold, the enable may get
> to some bits on one side of the clock transition, other bits on the
> opposite side resulting in "half" transferred data.

What is the difference between the "fast" and "slow" flag bits?  Do you 
mean that there are two flag bits?

> 
> I mentioned earlier to "toggle" a single bit in the fast domain.  This
> eliminates the need to have a reset handshake back from the slow
> domain; it's only when the bit changes that a write occurs.  This
> points out that you can't have the bit toggle twice within one slow-
> domain clock cycle or the change won't be seen and data lost.  Also,
> since there's a full slow clock cycle between registering the bit and
> performing the data load, the data has to remain static for that
> duration.

I think that I understand how to do this.  The state of the single bit 
(say data[0] in a 108-bit data word) is examined for a change in the 
slow clock domain.  If the bit changes, then it is time to read the 
data.  The 108-bit register is simply copied into another register that 
is in the slow clock domain.

> 
> If you need more help than descriptions, write again.  I love to see
> people think through the issue and understand why they write the code
> they do.
> 

Yes - it's far easier to write the code yourself than struggle through 
understanding lines and lines of code that has been written by someone else.

Nicholas

Article: 144820
Subject: Re: Databus crossing clock domains with data freeze
From: Nicholas Kinar <n.kinar@usask.ca>
Date: Wed, 06 Jan 2010 15:30:34 -0600
Links: << >> << T >> << A >>

Hello Jonathon--


Thank you for your response!

> 
> If you are certain that the source clock is more than 2x faster 
> than your target clock, I think it's rather straightforward.
> 

Yes, the source clock and the target clock are produced by the same PLL.
Since 280 MHz is 4 times faster than 70 MHz, all that is being used
here is a clock multiplier.


> Create a divide-by-2 signal in the target domain.  No
> reset is required, because the phase of the divide-by-2
> is irrelevant; only its changes are of interest.  So
> we use a Verilog model that doesn't need a reset in 
> simulation either:
> 
>   always @(posedge slow_clock)
>     if (slow_flag == 1'b1)
>       slow_flag <= 0;
>     else
>       slow_flag <= 1;
> 

I see - all that is being done here is a simple clock divider.  Perhaps 
this could also be done with a shift register?
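
A small simulation-only sketch of why the quoted if/else form needs no reset, assuming a standard Verilog simulator in which unreset registers start as X. The names div2_demo, flag_if and flag_not are made up for the comparison. A one-stage shift register fed back through an inverter would be the same single flip-flop as flag_not below.

```verilog
// Sketch only: compares two ways of writing a divide-by-2 flag when no
// reset is applied. Names are hypothetical.
module div2_demo;
  reg slow_clock = 1'b0;
  reg flag_if;    // if/else form from the quoted post, starts as X
  reg flag_not;   // plain inversion, starts as X

  always #5 slow_clock = ~slow_clock;

  // (X == 1'b1) evaluates to X, which the if treats as false, so the
  // else branch loads a known 1 on the first clock edge.
  always @(posedge slow_clock)
    if (flag_if == 1'b1) flag_if <= 1'b0;
    else                 flag_if <= 1'b1;

  // ~X is still X, so without a reset this flag never leaves X in simulation.
  always @(posedge slow_clock)
    flag_not <= ~flag_not;

  initial begin
    $monitor("t=%0t flag_if=%b flag_not=%b", $time, flag_if, flag_not);
    #40 $finish;
  end
endmodule
```

In hardware both forms toggle, because a real flip-flop powers up as 0 or 1; the difference only shows in simulation.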

> In the source domain, put the new data in your freeze
> register as soon as you detect a change on slow_flag,
> taking care to resynchronize slow_flag to avoid the risk
> of input hazards.  Again no reset is required; it'll
> sort itself out within three clock cycles.
> 
>   always @(posedge fast_clock) begin
>     resync_slow_flag <= slow_flag;
>     old_slow_flag <= resync_slow_flag;
>     if (old_slow_flag != resync_slow_flag) begin
>       freeze_register <= source_data;
>       // Do whatever it takes to indicate that
>       // source_data has been consumed, and make
>       // the next source_data available no more
>       // than 2 fast clocks later
>     end
>   end
> 
> And finally, capture freeze_register on every slow_clock:
> 
>   always @(posedge slow_clock)
>     useful_data <= freeze_register;
> 

Very neat.  It seems like a clean way to do this.


> In this way you can get a new data value on every slow_clock.
> You can carry "data valid" information along with the data
> itself, if you don't have a new data item soon enough for
> every slow clock.
> 
> Draw lots of timing diagrams, and do lots of worst-case
> analysis, to convince yourself whether this very simple
> approach is robust in your situation.  I believe that 
> it works reliably provided the clock periods obey the
> following relationship:
> 
>   slow_period >= (2*fast_period) + Tss + Tpf + Tsf + Tps
> 
> where Tss is the setup time of the useful_data register,
> Tsf is the setup time of the resync_slow_flag register,
> Tpf is the propagation delay (including routing) from
> fast clock to the freeze_register data becoming available
> at the useful_data register's input, and Tps is the 
> propagation delay from slow clock to slow_flag becoming
> available at the input to resync_slow_flag.  
> 
> Note that Tss+Tpf and Tsf+Tps are both pretty much equal
> to the shortest clock period that the FPGA can usefully
> use, since they are both simple register-to-register
> paths with no intervening logic.  So, as a first 
> estimate, you could say
> 
>   slow_period >= (2*fast_period) + (2/Fmax)
> 
> where Fmax is the FPGA's fastest useful clock speed.
> But you'll need to apply timing constraints and check
> the static timing results to be sure that you are safe.
> 
> Whatever you do, please double-check my assumptions 
> for yourself before doing anything upon which your 
> life, fortune or good name depends.  Clock domain
> crossings have been the undoing of many.
> 
> See also the "Flancter", and standard asynchronous FIFOs
> in the FPGA macrocell library (although they will be much
> more resource-hungry than the simple freeze, because they
> must work for all combinations of source and target 
> clock frequency).

I will take a look.  Thanks for your help, Jonathon!


Nicholas

Article: 144821
Subject: Re: Databus crossing clock domains with data freeze
From: Nicholas Kinar <n.kinar@usask.ca>
Date: Wed, 06 Jan 2010 17:37:09 -0600
Links: << >> << T >> << A >>

Hello Jonathan--

>> In the source domain, put the new data in your freeze
>> register as soon as you detect a change on slow_flag,
>> taking care to resynchronize slow_flag to avoid the risk
>> of input hazards.  Again no reset is required; it'll
>> sort itself out within three clock cycles.
>>
>>   always @(posedge fast_clock) begin
>>     resync_slow_flag <= slow_flag;
>>     old_slow_flag <= resync_slow_flag;
>>     if (old_slow_flag != resync_slow_flag) begin
>>       freeze_register <= source_data;
>>       // Do whatever it takes to indicate that
>>       // source_data has been consumed, and make
>>       // the next source_data available no more
>>       // than 2 fast clocks later
>>     end
>>   end

What I don't understand is the assignment logic at the top of this 
always block.  Isn't "old_slow_flag" equal to "resync_slow_flag"?

What is the best way to detect a change on "slow_flag"?
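
A minimal sketch of the change detector in isolation, assuming the register names from the quoted code; the module name change_detect and the output changed are made up here. With non-blocking assignments (<=) every right-hand side is read before any register is updated, so after each clock edge old_slow_flag holds the value resync_slow_flag had one fast clock earlier. The two are equal except during the one cycle that follows a change.

```verilog
// Sketch only: two-register change detector on a flag coming from
// another clock domain. Module and output names are hypothetical.
module change_detect (
  input  wire fast_clock,
  input  wire slow_flag,   // generated in the slow clock domain
  output wire changed      // high for one fast clock after each change
);
  reg resync_slow_flag;    // slow_flag resynchronized to fast_clock
  reg old_slow_flag;       // resync_slow_flag delayed by one fast clock

  always @(posedge fast_clock) begin
    resync_slow_flag <= slow_flag;
    // Reads the value resync_slow_flag had BEFORE this edge, not the
    // value assigned on the line above.
    old_slow_flag    <= resync_slow_flag;
  end

  // The registers differ only in the cycle after resync_slow_flag changed.
  assign changed = (old_slow_flag != resync_slow_flag);
endmodule
```

Had the two assignments used blocking (=) in that order, old_slow_flag would always equal resync_slow_flag and no change would ever be seen.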

Once again, thank you so much for your help.

Nicholas

(and I am sorry for misspelling your name in previous posts...)

Article: 144822
Subject: Re: Databus crossing clock domains with data freeze
From: Nicholas Kinar <n.kinar@usask.ca>
Date: Wed, 06 Jan 2010 21:04:28 -0600
Links: << >> << T >> << A >>

> 
> What I don't understand is the assignment logic at the top of this 
> always block.  Isn't "old_slow_flag" equal to "resync_slow_flag"?
> 

To help me better understand this, I've re-written the code:

reg old_slow_flag;
reg slow_flag;

always @(posedge fast_clock) begin
  if (old_slow_flag != slow_flag) begin
    old_slow_flag <= slow_flag;
    freeze_register <= source_data;
    // Do whatever it takes to indicate that
    // source_data has been consumed, and make
    // the next source_data available no more
    // than 2 fast clocks later
  end
end
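
For comparison, a sketch that assembles the three fragments quoted earlier in this thread into one module, keeping the resynchronizing register that the rewrite above leaves out. In the rewrite, slow_flag is read directly in the fast domain and drives both old_slow_flag and the enable of every freeze_register bit. That may be acceptable when both clocks come from the same PLL and the path is timed, but it is not safe for unrelated clocks. The module name, the WIDTH parameter and the source_taken strobe are assumptions made for illustration.

```verilog
// Sketch only: the quoted slow_flag / freeze_register / useful_data
// fragments combined. Module name, WIDTH and source_taken are hypothetical.
module freeze_xfer #(parameter WIDTH = 108) (
  input  wire             fast_clock,
  input  wire             slow_clock,
  input  wire [WIDTH-1:0] source_data,   // fast-domain data
  output reg              source_taken,  // fast-domain "consumed" strobe
  output reg  [WIDTH-1:0] useful_data    // slow-domain result
);
  reg             slow_flag;
  reg             resync_slow_flag;
  reg             old_slow_flag;
  reg [WIDTH-1:0] freeze_register;

  // Slow domain: divide-by-2 flag, written so that no reset is needed.
  always @(posedge slow_clock)
    if (slow_flag == 1'b1) slow_flag <= 1'b0;
    else                   slow_flag <= 1'b1;

  // Fast domain: resynchronize the flag, then freeze the data whenever
  // the resynchronized flag differs from its previous value.
  always @(posedge fast_clock) begin
    resync_slow_flag <= slow_flag;
    old_slow_flag    <= resync_slow_flag;
    source_taken     <= 1'b0;
    if (old_slow_flag != resync_slow_flag) begin
      freeze_register <= source_data;
      source_taken    <= 1'b1;
    end
  end

  // Slow domain: capture the frozen data on every slow clock.
  always @(posedge slow_clock)
    useful_data <= freeze_register;
endmodule
```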

Article: 144823
Subject: Re: Databus crossing clock domains with data freeze
From: Nicholas Kinar <n.kinar@usask.ca>
Date: Wed, 06 Jan 2010 21:14:20 -0600
Links: << >> << T >> << A >>


> 
> If you are certain that the source clock is more than 2x faster 
> than your target clock, I think it's rather straightforward.
> 

The examples are very helpful.  Thank you for posting terse snippets of 
code (rather than an entire large example program with lines and lines of 
code).

Nicholas

Article: 144824
Subject: Re: Databus crossing clock domains with data freeze
From: Nicholas Kinar <n.kinar@usask.ca>
Date: Wed, 06 Jan 2010 21:23:26 -0600
Links: << >> << T >> << A >>


> 
> Draw lots of timing diagrams, and do lots of worst-case
> analysis, to convince yourself whether this very simple
> approach is robust in your situation.  I believe that 
> it works reliably provided the clock periods obey the
> following relationship:
> 
>   slow_period >= (2*fast_period) + Tss + Tpf + Tsf + Tps
> 

Yes - I just tried this using the Quartus II synthesis tools.  Many 
thanks for posting this procedure!  I've verified that the solution you 
propose passes timing analysis in Quartus II for my particular design.

> 
>   slow_period >= (2*fast_period) + (2/Fmax)
> 


> where Fmax is the FPGA's fastest useful clock speed.
> But you'll need to apply timing constraints and check
> the static timing results to be sure that you are safe.
> 
> Whatever you do, please double-check my assumptions 
> for yourself before doing anything upon which your 
> life, fortune or good name depends.  Clock domain
> crossings have been the undoing of many.
> 

To me, this equation seems to be very reasonable.  The procedure also 
works well.
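
As a rough cross-check of the quoted inequality against the 280 MHz and 70 MHz clocks mentioned earlier in this thread, here is a simulation-only sketch that does the arithmetic. The module and variable names are made up. The slow period is about 14.29 ns and two fast periods take about 7.14 ns, which leaves about 7.14 ns for Tss + Tpf + Tsf + Tps. Under the first estimate of 2/Fmax, that corresponds to an Fmax of at least 280 MHz, the same speed the fast-clock logic already has to meet.

```verilog
// Sketch only: evaluates the timing budget for the clock frequencies
// given in this thread. Names are hypothetical.
module period_check;
  real fast_period_ns;
  real slow_period_ns;
  real budget_ns;        // time left for Tss + Tpf + Tsf + Tps
  real fmax_needed_mhz;  // Fmax implied by the 2/Fmax first estimate

  initial begin
    fast_period_ns  = 1000.0 / 280.0;  // 280 MHz source clock
    slow_period_ns  = 1000.0 / 70.0;   // 70 MHz target clock
    budget_ns       = slow_period_ns - 2.0 * fast_period_ns;
    fmax_needed_mhz = 2000.0 / budget_ns;
    $display("budget for Tss+Tpf+Tsf+Tps = %f ns", budget_ns);
    $display("first-estimate Fmax needed = %f MHz", fmax_needed_mhz);
  end
endmodule
```

The static timing report remains the real check; this only shows that the 4:1 ratio leaves a budget of two fast-clock periods.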


> See also the "Flancter", and standard asynchronous FIFOs
> in the FPGA macrocell library (although they will be much
> more resource-hungry than the simple freeze, because they
> must work for all combinations of source and target 
> clock frequency).

Agreed.  Thanks again for your help, Jonathan!
