Messages from 141425

Article: 141425
Subject: Re: True dual-port RAM in VHDL: XST question
From: "Fredxx" <fredxx@spam.com>
Date: Wed, 24 Jun 2009 11:37:24 +0100


"Jonathan Bromley" <jonathan.bromley@MYCOMPANY.com> wrote in message 
news:prs345lln7bmpp71hrk9p33ehfkq8231gj@4ax.com...
> hi all,
>
> As promised many weeks ago, I'm building what I
> hope will be a comprehensive summary of how to do
> RAM inference from VHDL and Verilog code for all
> the common synthesis tools and FPGAs.  It will
> go on our website some time this summer (sorry,
> it's not a high-priority project).
>
> I've encountered what seems to me to be a bug
> in XST (all versions from 8 to 11 inclusive)
> and I would value your opinion before I start
> to give Xilinx a hard time about it.  By the
> way, exactly the same bug appears to be present
> in Quartus but I haven't yet done enough detailed
> investigation to comment on that properly.
>
> To create true (dual-clock) dual-port RAM,
> I need to create two clocked processes.  This
> requires me to use a shared variable for
> the memory itself (ugly but possible, works
> correctly in XST):
>
>  type t_mem is array (0 to 2**ABITS-1) of
>              std_logic_vector(DBITS-1 downto 0);
>  shared variable mem: t_mem;  -- the memory storage
> begin -- the architecture
>  process (clock0) -- manages port A
>  begin
>    if rising_edge (clock0) then
>      if we0 = '1' then  -- write to port A
>        mem(to_integer(unsigned(a0))) := wd0;
>        rd0 <= wd0;
>      else
>        rd0 <= mem(to_integer(unsigned(a0)));
>      end if;
>    end if;
>  end process;
>  --
>  process (clock1) -- manages port B
>  begin
>    if rising_edge (clock1) then
>      if we1 = '1' then
>        mem(to_integer(unsigned(a1))) := wd1;
>        rd1 <= wd1;
>      else
>        rd1 <= mem(to_integer(unsigned(a1)));
>      end if;
>    end if;
>  end process;
>
> That, I believe, is the right way to do it.
>
> However, both XST and Quartus give THE SAME SYNTHESIS
> RESULTS if I change "shared variable" to "signal", and
> make signal assignments instead of variable assignments
> to the mem() array.  This is just plain WRONG!  Writing
> to a signal from two processes represents two resolved
> drivers on the signal, and does not correctly model a
> dual-port memory in simulation.
>
> Given that the whole point of memory inference from
> HDL code is that you get a convenient, readable,
> accurate simulation model as part of your design
> code, this behaviour by the synthesis tools is
> incomprehensible to me.  Can anyone clarify?  Has
> anyone fallen foul of this problem?  Best of all,
> could Brian Philofsky, who has written so clearly
> and helpfully about XST in the past, please speak
> up and tell us what the blazes is going on here?
>

Your knowledge of VHDL is greater than mine, but I assumed that

>      if we1 = '1' then
>        mem(to_integer(unsigned(a1))) := wd1;
>    end if;

was equivalent to;
      if we1 = '1' then
        mem(to_integer(unsigned(a1))) := wd1;
      else
        mem(to_integer(unsigned(a1))) := mem(to_integer(unsigned(a1)));
      end if;

if you used something like;
      if we1 = '1' then
        mem(to_integer(unsigned(a1))) := wd1;
      else
        mem(to_integer(unsigned(a1))) := (others => 'Z');
      end if;

Would this then give more consistent results where both processes wouldn't 
be fighting against each other?

Happy to be told I'm wrong.

Article: 141426
Subject: Re: True dual-port RAM in VHDL: XST question
From: Jonathan Bromley <jonathan.bromley@MYCOMPANY.com>
Date: Wed, 24 Jun 2009 12:15:39 +0100

On Wed, 24 Jun 2009 11:37:24 +0100, "Fredxx" wrote:

>if you used something like;
>      if we1 = '1' then
>        mem(to_integer(unsigned(a1))) := wd1;
>      else
>        mem(to_integer(unsigned(a1))) := (others => 'Z');
>      end if;
>
>Would this then give more consistent results where both processes wouldn't 
>be fighting against each other?

Sadly, no.  I see what you're getting at, but I don't think you could
ever get the memory to have the correct contents if both ports are
doing that all the time.  Each process may overwrite locations it's
already correctly written, using Zs, for no good reason.

Suppose you could get it right somehow, and arrange that each process
is driving Z to all locations it's never written, but appropriate
values to locations it has written.  What then happens if the second
process writes to a location that previously was written by the other?
How can it tell the first process now to put Z on that location?

In truth the "correct" solution would be to write the whole thing
as a single process with two clocks:

  process (clock0, clock1)
    variable mem: t_mem;
  begin
    if rising_edge(clock0) then
      if we0 = '1' then
        mem(a0) := wd0;
      end if;
    end if;
    if rising_edge(clock1) then
      if we1 = '1' then
        mem(a1) := wd1;
      end if;
    end if;
    ...

But I suspect synthesis tools would chuck that overboard
without a second thought.
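
For simulation purposes, the single-process idea can be fleshed out roughly as follows. This is only a sketch: the entity name, the generics' default values and the read-back lines are assumptions filled in for illustration (the read behaviour mirrors the write-first ports in the two-process version), and nothing here has been tried in XST or Quartus.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name and generic defaults, for illustration only.
entity tdp_ram_one_process is
  generic (ABITS : positive := 4;
           DBITS : positive := 8);
  port (clock0, clock1 : in  std_logic;
        we0, we1       : in  std_logic;
        a0, a1         : in  std_logic_vector(ABITS-1 downto 0);
        wd0, wd1       : in  std_logic_vector(DBITS-1 downto 0);
        rd0, rd1       : out std_logic_vector(DBITS-1 downto 0));
end entity;

architecture sim of tdp_ram_one_process is
  type t_mem is array (0 to 2**ABITS-1) of
               std_logic_vector(DBITS-1 downto 0);
begin
  process (clock0, clock1)
    variable mem : t_mem;  -- ordinary variable: only one process touches it
  begin
    if rising_edge(clock0) then                 -- port A
      if we0 = '1' then
        mem(to_integer(unsigned(a0))) := wd0;
      end if;
      -- Reading after the variable update gives write-first behaviour.
      rd0 <= mem(to_integer(unsigned(a0)));
    end if;
    if rising_edge(clock1) then                 -- port B
      if we1 = '1' then
        mem(to_integer(unsigned(a1))) := wd1;
      end if;
      rd1 <= mem(to_integer(unsigned(a1)));
    end if;
  end process;
end architecture;
```

Because the storage is a plain process variable, there are no multiple drivers to resolve and no shared variable is needed; the open question remains whether any synthesis tool maps a two-clock process like this onto block RAM.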
-- 
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which 
are not the views of Doulos Ltd., unless specifically stated.

Article: 141427
Subject: Re: True dual-port RAM in VHDL: XST question
From: "Fredxx" <fredxx@spam.com>
Date: Wed, 24 Jun 2009 12:44:27 +0100


"Jonathan Bromley" <jonathan.bromley@MYCOMPANY.com> wrote in message 
news:bn0445lbemmf3qnl87te1hps2l3nc9novv@4ax.com...
> On Wed, 24 Jun 2009 11:37:24 +0100, "Fredxx" wrote:
>
>>if you used something like;
>>      if we1 = '1' then
>>        mem(to_integer(unsigned(a1))) := wd1;
>>      else
>>        mem(to_integer(unsigned(a1))) := (others => 'Z');
>>      end if;
>>
>>Would this then give more consistent results where both processes wouldn't
>>be fighting against each other?
>
> Sadly, no.  I see what you're getting at, but I don't think you could
> ever get the memory to have the correct contents if both ports are
> doing that all the time.  Each process may overwrite locations it's
> already correctly written, using Zs, for no good reason.
>
> Suppose you could get it right somehow, and arrange that each process
> is driving Z to all locations it's never written, but appropriate
> values to locations it has written.  What then happens if the second
> process writes to a location that previously was written by the other?
> How can it tell the first process now to put Z on that location?
>
> In truth the "correct" solution would be to write the whole thing
> as a single process with two clocks:
>
>  process (clock0, clock1)
>    variable mem: t_mem;
>  begin
>    if rising_edge(clock0) then
>      if we0 = '1' then
>        mem(a0) := wd0;
>      end if;
>    end if;
>    if rising_edge(clock1) then
>      if we1 = '1' then
>        mem(a1) := wd1;
>      end if;
>    end if;
>    ...
>
> But I suspect synthesis tools would chuck that overboard
> without a second thought.

I perhaps am making the (erroneous) assumption that two statements will be 
or'd together and the Z's will be overdriven by the signals.  But as you 
say, I would be replacing the RAM locations with Z's or something that the 
synthesiser concocts.

To be honest, I think it isn't good practice to have signals driven by 2 
clocks, and I'd probably use clock switching primitives instead so the 
memory would be written in one process with just one clock.

Article: 141428
Subject: Re: True dual-port RAM in VHDL: XST question
From: Jonathan Bromley <jonathan.bromley@MYCOMPANY.com>
Date: Wed, 24 Jun 2009 13:23:56 +0100

On Wed, 24 Jun 2009 12:44:27 +0100, "Fredxx" wrote:

>I perhaps am making the (erroneous) assumption that two statements will be 
>or'd together and the Z's will be overdriven by the signals. 

That's more-or-less correct.  Each process represents a driver
on any signal it writes.  If multiple processes write to a signal,
then the actual signal value is determined by resolving the
various driven values.  Of course, anything else overdrives Z.

The hard-to-solve problem:  suppose process A writes a value
to a memory location at some time; clearly, you want that
value to remain in the location and not to be overwritten
to Z on the next clock, so you can't allow process A to change
its mind about that value.  Some time later, suppose process B
writes to the same location.  Now you have two non-Z drivers
on the same set of bits.  How can process B tell process A 
that it's time for its driver to lapse back to Z?  Shared
variables, for all their ugliness, solve this problem 
neatly (which is why my problem simply doesn't exist in
Verilog, where all variables are shared).

>To be honest, I think it isn't good practice to have signals driven by 2 
>clocks, and I'd probably use clock switching primitives instead so the 
>memory would be written in one process with just one clock.

In normal logic I would 100% agree, but here I'm talking about
modeling and synthesizing the FPGAs' built-in RAM blocks, which
have the option of independent clocks on the two ports.  So
it is important to write VHDL corresponding to that behavior.
You could mux the clocks onto a single port, but that would
be a totally different design.
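
The driver problem can be seen in a few lines of simulation-only VHDL. The following is a minimal sketch with made-up names (two_drivers_demo, cell, proc_a, proc_b) standing in for one bit of one memory location; it assumes std_logic's standard resolution function and nothing else.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity two_drivers_demo is
end entity;

architecture sim of two_drivers_demo is
  signal cell : std_logic := 'Z';  -- one bit of one memory location
begin
  proc_a : process
  begin
    cell <= '1';   -- process A "writes" 1 at time 0
    wait;          -- its driver now holds '1' for the rest of the run
  end process;

  proc_b : process
  begin
    cell <= 'Z';   -- process B is idle at first, so A's '1' wins
    wait for 10 ns;
    assert cell = '1'
      report "expected A's value to be visible" severity error;
    cell <= '0';   -- process B "writes" 0 to the same location
    wait for 10 ns;
    -- A is still driving '1', so the resolved value is 'X', not '0'.
    assert cell = 'X'
      report "expected contention between the two drivers" severity error;
    wait;
  end process;
end architecture;
```

Neither assertion should fire: the first write reads back correctly, but the second leaves the location at 'X' because process B has no way to make process A release its driver.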

Article: 141429
Subject: Re: True dual-port RAM in VHDL: XST question
From: "Fredxx" <fredxx@spam.com>
Date: Wed, 24 Jun 2009 13:54:30 +0100


"Jonathan Bromley" <jonathan.bromley@MYCOMPANY.com> wrote in message 
news:e76445hlo55p6dki0oi764adcblv5oloul@4ax.com...
> On Wed, 24 Jun 2009 12:44:27 +0100, "Fredxx" wrote:
>
>
> In normal logic I would 100% agree, but here I'm talking about
> modeling and synthesizing the FPGAs' built-in RAM blocks, which
> have the option of independent clocks on the two ports.  So
> it is important to write VHDL corresponding to that behavior.
> You could mux the clocks onto a single port, but that would
> be a totally different design.

Ah - I see - that does sound rather tricky, and I can see where you're coming 
from.

Article: 141430
Subject: Re: Subtleties of Booth's Algorithm Implementation
From: rickman <gnuarm@gmail.com>
Date: Wed, 24 Jun 2009 07:09:51 -0700 (PDT)

On Jun 23, 2:56 pm, Weng Tianxiang <wtx...@gmail.com> wrote:
> On Jun 23, 7:23 am, rickman <gnu...@gmail.com> wrote:
>
>
>
> > On Jun 23, 12:51 am, rickman <gnu...@gmail.com> wrote:
>
> > > On Jun 22, 6:33 pm, Mike Treseler <mtrese...@gmail.com> wrote:
>
> > > > rickman wrote:
> > > > > I used that in an
> > > > > earlier design and it uses about 300 LUTs for the multiplier.  I have
> > > > > lots of clock cycles, so I can generate the partial products
> > > > > sequentially using much less logic.
>
> > > > Less than 200 LUTs?
>
> > > >       -- Mike Treseler
>
> > > Are you joking?  I expect it to be on the order of 40 LUTs for a 16 x
> > > 16 multiplier.  Maybe I am not making myself clear.  I am calculating
> > > one partial product at a time and adding them to the product
> > > sequentially using the same hardware, but multiple clock cycles.  Sort
> > > of like a bit serial adder, but this is a partial product serial
> > > multiplier or maybe you could call it a multiplier bit serial
> > > multiplier.  A truly bit serial multiplier could be done in a dozen
> > > to twenty LUTs I expect, but a 16x16 multiply would take 256 clock
> > > cycles.
>
> > > I actually considered this on an earlier design, but it turned out I
> > > didn't need to push on the LUTs used and the X*Y multiplier took some
> > > 300 LUTs.  So going down to 40 or 50 LUTs is a big improvement.  I
> > > should have this done later this week and I'll post the results.
>
> > > Rick
>
> > My estimate was off by a little as I forgot to account for the extra
> > rank of muxes required to load the initial values.  So the total for a
> > 16 x 16 add and shift sequential multiplier is 60 LUTs as reported by
> > the Lattice tools.  This drops by about 10 LUTs if the counter is
> > removed.  In my design the timing may be controlled by logic
> > elsewhere.  Yes, somehow Synplify is using 10 LUTs to build a four bit
> > counter!  It seems to want to duplicate three of the four FFs.
>
> > Rick
>
> Hi Rick,
> Can you show me where I can find the schematics of Lattice doing 16x16
> for 60 LUT?
>
> Do 60 LUTs include 32-bit flip-flops to store output data?
>
> Weng

I was able to get a schematic from Synplify Pro, but it did not want
to include the text except on part of the drawing.  If anyone can tell
me how to get the text to display, I will reprint it.  The PDF file
can be found at http://arius.com/stuff/FPGA/Multiply_16x16.pdf

This incarnation uses 62 LUTs and 42 FFs if I am reading the info
correctly.  This includes 32 FFs to hold the output product.  The
Multiplicand is not registered; it is assumed that it is held constant
on the input.  There are 7 FFs in the 4 bit counter (Synplify insists
on duplicating 3 bits as "fast" counter bits) one FF for the carry bit
on the Multiplier negate circuit and the lsb of the product register
is duplicated twice (it is used to control the adder).  If the bit
counter is not duplicated, it would also reduce the LUT count by 4
LUTs.

Just to make this clear, this multiplier uses as many clock cycles to
produce a product as you have bits in the multiplier.  This is not a
"fast" multiplier which can produce a result on each clock cycle and
can even be pipelined to run with as fast a clock as this circuit.
Also, I have not simulated it to be sure it is coded correctly, but I
am pretty confident it is working the way I intend or at least any
mistakes won't change the LUT count much.
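
For anyone who wants to see the general shape of an add-and-shift sequential multiplier, here is a minimal sketch. It is not the code behind the numbers above: it is unsigned only (no negate circuit), the entity and signal names are invented for illustration, and it has not been run through Synplify or the Lattice tools, so no LUT count is claimed for it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical unsigned 16 x 16 add-and-shift multiplier, 16 clocks per product.
entity seq_mult16 is
  port (clk   : in  std_logic;
        start : in  std_logic;             -- load operands and begin
        a     : in  unsigned(15 downto 0); -- multiplicand, held constant
        b     : in  unsigned(15 downto 0); -- multiplier
        p     : out unsigned(31 downto 0); -- product, valid when done = '1'
        done  : out std_logic);
end entity;

architecture rtl of seq_mult16 is
  -- High half accumulates partial products; low half holds the
  -- not-yet-used multiplier bits and fills with product bits.
  signal acc   : unsigned(31 downto 0) := (others => '0');
  signal count : unsigned(4 downto 0)  := (others => '0');
begin
  process (clk)
    variable sum : unsigned(16 downto 0);  -- 17 bits to keep the carry
  begin
    if rising_edge(clk) then
      if start = '1' then
        acc   <= resize(b, 32);            -- high half cleared, low half = b
        count <= to_unsigned(16, 5);
      elsif count /= 0 then
        if acc(0) = '1' then               -- current multiplier bit
          sum := resize(acc(31 downto 16), 17) + resize(a, 17);
        else
          sum := resize(acc(31 downto 16), 17);
        end if;
        acc   <= sum & acc(15 downto 1);   -- shift right by one bit
        count <= count - 1;
      end if;
    end if;
  end process;

  p    <= acc(31 downto 0);
  done <= '1' when count = 0 else '0';
end architecture;
```

The single 17-bit adder is reused on every cycle, which is where the saving over a fully combinational X*Y multiplier comes from.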

Rick

Article: 141431
Subject: Re: True dual-port RAM in VHDL: XST question
From: rickman <gnuarm@gmail.com>
Date: Wed, 24 Jun 2009 07:18:10 -0700 (PDT)

On Jun 24, 7:44 am, "Fredxx" <fre...@spam.com> wrote:
> "Jonathan Bromley" <jonathan.brom...@MYCOMPANY.com> wrote in message
>
> news:bn0445lbemmf3qnl87te1hps2l3nc9novv@4ax.com...
>
>
>
> > On Wed, 24 Jun 2009 11:37:24 +0100, "Fredxx" wrote:
>
> >>if you used something like;
> >>      if we1 = '1' then
> >>        mem(to_integer(unsigned(a1))) := wd1;
> >>      else
> >>        mem(to_integer(unsigned(a1))) := (others => 'Z');
> >>      end if;
>
> >>Would this then give more consistent results where both processes wouldn't
> >>be fighting against each other?
>
> > Sadly, no.  I see what you're getting at, but I don't think you could
> > ever get the memory to have the correct contents if both ports are
> > doing that all the time.  Each process may overwrite locations it's
> > already correctly written, using Zs, for no good reason.
>
> > Suppose you could get it right somehow, and arrange that each process
> > is driving Z to all locations it's never written, but appropriate
> > values to locations it has written.  What then happens if the second
> > process writes to a location that previously was written by the other?
> > How can it tell the first process now to put Z on that location?
>
> > In truth the "correct" solution would be to write the whole thing
> > as a single process with two clocks:
>
> >  process (clock0, clock1)
> >    variable mem: t_mem;
> >  begin
> >    if rising_edge(clock0) then
> >      if we0 = '1' then
> >        mem(a0) := wd0;
> >      end if;
> >    end if;
> >    if rising_edge(clock1) then
> >      if we1 = '1' then
> >        mem(a1) := wd1;
> >      end if;
> >    end if;
> >    ...
>
> > But I suspect synthesis tools would chuck that overboard
> > without a second thought.
>
> I perhaps am making the (erroneous) assumption that two statements will be
> or'd together and the Z's will be overdriven by the signals.  But as you
> say, I would be replacing the RAM locations with Z's or something that the
> synthesiser concocts.
>
> To be honest, I think it isn't good practice to have signals driven by 2
> clocks, and I'd probably use clock switching primitives instead so the
> memory would be written in one process with just one clock.


That would be a truly bizarre circuit design.  I don't know how they
actually construct memory to use separate clocks, but I expect it uses
an async memory with two independent synchronous interfaces.  FPGA
reps have posted here that there is a lot of "magic" in the logic
between the sync interfaces and the async memory inside the block
ram.  All of this would be very hard to describe using an HDL.  But
driving a signal with 'z' or switching clocks is not the way to go at
all...

Rick

Article: 141432
Subject: Re: True dual-port RAM in VHDL: XST question
From: "Fredxx" <fredxx@spam.com>
Date: Wed, 24 Jun 2009 15:45:37 +0100


"Jonathan Bromley" <jonathan.bromley@MYCOMPANY.com> wrote in message 
news:e76445hlo55p6dki0oi764adcblv5oloul@4ax.com...
> On Wed, 24 Jun 2009 12:44:27 +0100, "Fredxx" wrote:
>
>>I perhaps am making the (erroneous) assumption that two statements will be
>>or'd together and the Z's will be overdriven by the signals.
>
> That's more-or-less correct.  Each process represents a driver
> on any signal it writes.  If multiple processes write to a signal,
> then the actual signal value is determined by resolving the
> various driven values.  Of course, anything else overdrives Z.
>
> The hard-to-solve problem:  suppose process A writes a value
> to a memory location at some time; clearly, you want that
> value to remain in the location and not to be overwritten
> to Z on the next clock, so you can't allow process A to change
> its mind about that value.  Some time later, suppose process B
> writes to the same location.  Now you have two non-Z drivers
> on the same set of bits.  How can process B tell process A
> that it's time for its driver to lapse back to Z?  Shared
> variables, for all their ugliness, solve this problem
> neatly (which is why my problem simply doesn't exist in
> Verilog, where all variables are shared).
>
>>To be honest, I think it isn't good practice to have signals driven by 2
>>clocks, and I'd probably use clock switching primitives instead so the
>>memory would be written in one process with just one clock.
>
> In normal logic I would 100% agree, but here I'm talking about
> modeling and synthesizing the FPGAs' built-in RAM blocks, which
> have the option of independent clocks on the two ports.  So
> it is important to write VHDL corresponding to that behavior.
> You could mux the clocks onto a single port, but that would
> be a totally different design.

What's wrong with an asynchronous memory, where the appropriate clocks latch 
the control signals to create synchronous RAM?

Then we can do something like:

process (a0, we0, wd0, a1, we1, wd1)
begin
  if we0 = '1' then  -- write to port A
    mem(conv_integer(a0)) <= wd0;
  end if;
  if we1 = '1' then  -- write to port B
    mem(conv_integer(a1)) <= wd1;
  end if;
  rd0 <= mem(conv_integer(a0));
  rd1 <= mem(conv_integer(a1));
end process;

It works in simulation!!

Article: 141433
Subject: Re: True dual-port RAM in VHDL: XST question
From: Jonathan Bromley <jonathan.bromley@MYCOMPANY.com>
Date: Wed, 24 Jun 2009 15:57:41 +0100

On Wed, 24 Jun 2009 15:45:37 +0100, "Fredxx" wrote:

>What's wrong with an asynchronous memory[...]
>It works in simulation!! 

Nothing wrong with them, except that they don't exist
in real FPGAs.  By contrast, dual-ported dual-clock
synchronous RAMs most certainly do :-)

Article: 141434
Subject: Re: True dual-port RAM in VHDL: XST question
From: rickman <gnuarm@gmail.com>
Date: Wed, 24 Jun 2009 08:00:54 -0700 (PDT)

On Jun 24, 10:45 am, "Fredxx" <fre...@spam.com> wrote:
> "Jonathan Bromley" <jonathan.brom...@MYCOMPANY.com> wrote in message
>
> news:e76445hlo55p6dki0oi764adcblv5oloul@4ax.com...
>
>
>
> > On Wed, 24 Jun 2009 12:44:27 +0100, "Fredxx" wrote:
>
> >>I perhaps am making the (erroneous) assumption that two statements will be
> >>or'd together and the Z's will be overdriven by the signals.
>
> > That's more-or-less correct.  Each process represents a driver
> > on any signal it writes.  If multiple processes write to a signal,
> > then the actual signal value is determined by resolving the
> > various driven values.  Of course, anything else overdrives Z.
>
> > The hard-to-solve problem:  suppose process A writes a value
> > to a memory location at some time; clearly, you want that
> > value to remain in the location and not to be overwritten
> > to Z on the next clock, so you can't allow process A to change
> > its mind about that value.  Some time later, suppose process B
> > writes to the same location.  Now you have two non-Z drivers
> > on the same set of bits.  How can process B tell process A
> > that it's time for its driver to lapse back to Z?  Shared
> > variables, for all their ugliness, solve this problem
> > neatly (which is why my problem simply doesn't exist in
> > Verilog, where all variables are shared).
>
> >>To be honest, I think it isn't good practice to have signals driven by 2
> >>clocks, and I'd probably use clock switching primitives instead so the
> >>memory would be written in one process with just one clock.
>
> > In normal logic I would 100% agree, but here I'm talking about
> > modeling and synthesizing the FPGAs' built-in RAM blocks, which
> > have the option of independent clocks on the two ports.  So
> > it is important to write VHDL corresponding to that behavior.
> > You could mux the clocks onto a single port, but that would
> > be a totally different design.
>
> What's wrong with an asynchronous memory, where the appropriate clocks latch
> the control signals to create synchronous RAM.
>
> Then we can do something like:
>
> process (a0, we0, wd0, a1, we1, wd1)
> begin
>   if we0 = '1' then  -- write to port A
>     mem(conv_integer(a0)) <= wd0;
>   end if;
>   if we1 = '1' then  -- write to port B
>     mem(conv_integer(a1)) <= wd1;
>   end if;
>   rd0 <= mem(conv_integer(a0));
>   rd1 <= mem(conv_integer(a1));
> end process;
>
> It works in simulation!!

I doubt that it will synthesize.  Synthesis is largely a matter of
template matching.  You can describe a behavior any way you want in
simulation.  But if the synthesis tool does not recognize that form,
it won't synthesize to anything useful.  Often memory that is not
recognized as a block ram is synthesized as distributed memory using
much of the FFs on a chip.  Not only that, but it takes forever to
complete just to find out you don't have a workable design.

Rick

Article: 141435
Subject: Re: True dual-port RAM in VHDL: XST question
From: Ed McGettigan <ed.mcgettigan@xilinx.com>
Date: Wed, 24 Jun 2009 08:24:02 -0700

Fredxx wrote:
> "Jonathan Bromley" <jonathan.bromley@MYCOMPANY.com> wrote in message 
> news:e76445hlo55p6dki0oi764adcblv5oloul@4ax.com...
>> On Wed, 24 Jun 2009 12:44:27 +0100, "Fredxx" wrote:
>>
>>> I perhaps am making the (erroneous) assumption that two statements will be
>>> or'd together and the Z's will be overdriven by the signals.
>> That's more-or-less correct.  Each process represents a driver
>> on any signal it writes.  If multiple processes write to a signal,
>> then the actual signal value is determined by resolving the
>> various driven values.  Of course, anything else overdrives Z.
>>
>> The hard-to-solve problem:  suppose process A writes a value
>> to a memory location at some time; clearly, you want that
>> value to remain in the location and not to be overwritten
>> to Z on the next clock, so you can't allow process A to change
>> its mind about that value.  Some time later, suppose process B
>> writes to the same location.  Now you have two non-Z drivers
>> on the same set of bits.  How can process B tell process A
>> that it's time for its driver to lapse back to Z?  Shared
>> variables, for all their ugliness, solve this problem
>> neatly (which is why my problem simply doesn't exist in
>> Verilog, where all variables are shared).
>>
>>> To be honest, I think it isn't good practice to have signals driven by 2
>>> clocks, and I'd probably use clock switching primitives instead so the
>>> memory would be written in one process with just one clock.
>> In normal logic I would 100% agree, but here I'm talking about
>> modeling and synthesizing the FPGAs' built-in RAM blocks, which
>> have the option of independent clocks on the two ports.  So
>> it is important to write VHDL corresponding to that behavior.
>> You could mux the clocks onto a single port, but that would
>> be a totally different design.
> 
> What's wrong with an asynchronous memory, where the appropriate clocks latch 
> the control signals to create synchronous RAM.
> 
> Then we can do something like:
> 
> process (a0, we0, wd0, a1, we1, wd1)
> begin
>   if we0 = '1' then  -- write to port A
>     mem(conv_integer(a0)) <= wd0;
>   end if;
>   if we1 = '1' then  -- write to port B
>     mem(conv_integer(a1)) <= wd1;
>   end if;
>   rd0 <= mem(conv_integer(a0));
>   rd1 <= mem(conv_integer(a1));
> end process;
> 
> It works in simulation!! 
> 
> 

Asynchronous memories only work correctly if the input delays are well 
controlled and the internal self-timed circuits function correctly.  To 
achieve reliability, memory designers have to build in a lot of 
margin, so asynchronous memories operate much slower than 
synchronous memories, and even then there are tight specs on the address 
and data buses.

For example, take the code example that you have above and, instead of 
having a test bench that transitions a0, a1, wd0 and wd1 at the same 
time, add some delay to various bits in the address and data buses and 
observe the results.
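
A minimal sketch of such an experiment, cut down to one port and four locations, might look like the following. The names (async_skew_demo, mem, a, we, wd) and the 1 ns skew are assumptions chosen for illustration; the write process follows the level-sensitive style of the code quoted above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity async_skew_demo is
end entity;

architecture sim of async_skew_demo is
  type t_mem is array (0 to 3) of std_logic_vector(7 downto 0);
  signal mem : t_mem := (others => x"00");
  signal a   : std_logic_vector(1 downto 0) := "01";
  signal we  : std_logic := '0';
  signal wd  : std_logic_vector(7 downto 0) := x"00";
begin
  -- Level-sensitive (asynchronous) write port.
  process (a, we, wd)
  begin
    if we = '1' then
      mem(to_integer(unsigned(a))) <= wd;
    end if;
  end process;

  stimulus : process
  begin
    wd <= x"AA";
    we <= '1';                 -- write AA to location 1
    wait for 10 ns;
    wd <= x"55";               -- intended: write 55 to location 2
    a(1) <= '1';               -- address bit 1 changes on time...
    a(0) <= '0' after 1 ns;    -- ...but bit 0 arrives 1 ns late
    wait for 10 ns;
    we <= '0';
    wait for 1 ns;
    assert mem(1) = x"AA" report "location 1 lost its data" severity error;
    assert mem(2) = x"55" report "location 2 was not written" severity error;
    -- This check is expected to fail: the transient address "11"
    -- lets the write land in location 3 as well.
    assert mem(3) = x"00"
      report "transient address 11 corrupted location 3" severity warning;
    wait;
  end process;
end architecture;
```

With all address bits switching together, the same stimulus leaves location 3 untouched; the corruption comes purely from the skew.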

Ed McGettigan
--
Xilinx Inc.

Article: 141436
Subject: Re: True dual-port RAM in VHDL: XST question
From: "Fredxx" <fredxx@spam.com>
Date: Wed, 24 Jun 2009 16:27:30 +0100

"Jonathan Bromley" <jonathan.bromley@MYCOMPANY.com> wrote in message 
news:akf4459nrqg50jo7q29e4h3sbf5uf4gcp4@4ax.com...
> On Wed, 24 Jun 2009 15:45:37 +0100, "Fredxx" wrote:
>
>>What's wrong with an asynchronous memory[...]
>>It works in simulation!!
>
> Nothing wrong with them, except that they don't exist
> in real FPGAs.  By contrast, dual-ported dual-clock
> synchronous RAMs most certainly do :-)

I thought you were trying to simulate and synthesise dual port block RAM, 
without using the normal block RAM primitives.  In your first post you said 
"of how to do RAM inference from VHDL and Verilog code for all the common 
synthesis tools and FPGAs".

The asynchronous memory is an array of flip-flops rather than a memory, but 
that's a moot point.  It does both synthesise and simulate in Xilinx ISE 
tools.

Article: 141437
Subject: Re: True dual-port RAM in VHDL: XST question
From: "Fredxx" <fredxx@spam.com>
Date: Wed, 24 Jun 2009 16:31:10 +0100

rickman wrote:
> On Jun 24, 10:45 am, "Fredxx" <fre...@spam.com> wrote:
>> "Jonathan Bromley" <jonathan.brom...@MYCOMPANY.com> wrote in message
>>
>> news:e76445hlo55p6dki0oi764adcblv5oloul@4ax.com...
>>
>>
>>
>>> On Wed, 24 Jun 2009 12:44:27 +0100, "Fredxx" wrote:
>>
>>>> I perhaps am making the (erroneous) assumption that two statements
>>>> will be or'd together and the Z's will be overdriven by the
>>>> signals.
>>
>>> That's more-or-less correct. Each process represents a driver
>>> on any signal it writes. If multiple processes write to a signal,
>>> then the actual signal value is determined by resolving the
>>> various driven values. Of course, anything else overdrives Z.
>>
>>> The hard-to-solve problem: suppose process A writes a value
>>> to a memory location at some time; clearly, you want that
>>> value to remain in the location and not to be overwritten
>>> to Z on the next clock, so you can't allow process A to change
>>> its mind about that value. Some time later, suppose process B
>>> writes to the same location. Now you have two non-Z drivers
>>> on the same set of bits. How can process B tell process A
>>> that it's time for its driver to lapse back to Z? Shared
>>> variables, for all their ugliness, solve this problem
>>> neatly (which is why my problem simply doesn't exist in
>>> Verilog, where all variables are shared).
>>
>>>> To be honest, I think it isn't good practice to have signals
>>>> driven by 2 clocks, and I'd probably use clock switching
>>>> primitives instead so the memory would be written in one process
>>>> with just one clock.
>>
>>> In normal logic I would 100% agree, but here I'm talking about
>>> modeling and synthesizing the FPGAs' built-in RAM blocks, which
>>> have the option of independent clocks on the two ports. So
>>> it is important to write VHDL corresponding to that behavior.
>>> You could mux the clocks onto a single port, but that would
>>> be a totally different design.
>>
>> What's wrong with an asynchronous memory, where the appropriate
>> clocks latch the control signals to create synchronous RAM.
>>
>> Then we can do something like:
>>
>> process (a0, we0, wd0, a1, we1, wd1)
>> begin
>> if we0 = '1' then -- write to port A
>> mem(conv_integer(a0)) <= wd0;
>> end if;
>> if we1 = '1' then -- write to port B
>> mem(conv_integer(a1)) <= wd1;
>> end if;
>> rd0 <= mem(conv_integer(a0));
>> rd1 <= mem(conv_integer(a1));
>> end process;
>>
>> It works in simulation!!
>
> I doubt that it will synthesize.  Synthesis is largely a matter of
> template matching.  You can describe a behavior any way you want in
> simulation.  But if the synthesis tool does not recognize that form,
> it won't synthesize to anything useful.  Often memory that is not
> recognized as a block ram is synthesized as distributed memory using
> much of the FFs on a chip.  Not only that, but it takes forever to
> complete just to find out you don't have a workable design.
>

Perhaps I've been lucky with ISE tools, but I have found constructs like 
this to work.

I was of the opinion that Jonathan wanted to create a VHDL block which 
could replace the normal dual port block RAMs normally found in FPGAs. 
Therefore I don't see the problem in using std_logic_vector flipflops to 
create the memory.

Article: 141438
Subject: Re: True dual-port RAM in VHDL: XST question
From: Sandro <sdroamt@netscape.net>
Date: Wed, 24 Jun 2009 08:31:26 -0700 (PDT)

On Jun 24, 2:23 pm, Jonathan Bromley <jonathan.brom...@MYCOMPANY.com>
> ...
> In normal logic I would 100% agree, but here I'm talking about
> modeling and synthesizing the FPGAs' built-in RAM blocks, which
> have the option of independent clocks on the two ports.  So
> it is important to write VHDL corresponding to that behavior.
> You could mux the clocks onto a single port, but that would
> be a totally different design.
> ...

If you are curious, please take a look at the VHDL VITAL
simulation sources,
which you can find in

  <ISE INST PATH>/Xilinx/10.1/ISE/vhdl/src/unisims/unisim_VITAL.vhd

but be careful ;-) ... they are NOT 20 lines of code.

Regards
Sandro

Article: 141439
Subject: Re: True dual-port RAM in VHDL: XST question
From: Muzaffer Kal <kal@dspia.com>
Date: Wed, 24 Jun 2009 09:50:53 -0700

On Wed, 24 Jun 2009 16:27:30 +0100, "Fredxx" <fredxx@spam.com> wrote:
>The asynchronous memory is an array of flip-flops rather than a memory, but 
>that's a moot point.  It does both synthesise and simulate in Xilinx ISE 
>tools.
>
Flip-flops need a clock to function. How do you write to them without
a clock to implement asynchronous memory (which by definition doesn't
have one)? You can use an array of latches as opposed to flip-flops,
but timing latches is quite difficult, especially in an FPGA context
where tools are really not geared towards it. You may be able to
synthesize it in ISE, and the original code simulates for sure, but have
you tried a back-annotated gate-level simulation? It would be an
interesting challenge to get it to work fully unless your read/write
pulse widths and separations are extremely conservative.
One last thing to remember is that there are a lot fewer slice registers
(from which latches are made) than memory bits in an FPGA, so you're
quite limited in how much async memory of this type you can make.
---
Muzaffer Kal

DSPIA INC.
ASIC/FPGA Design Services
http://www.dspia.com
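
To make the flip-flop versus latch distinction above concrete, here is a
minimal sketch of a memory with a clocked write and a combinational
(asynchronous) read. This is the form that FPGA tools commonly map to
LUT-based distributed RAM rather than to latches. The entity name
async_read_ram and the single-port arrangement are assumptions made for
illustration; only the t_mem declaration style and the ABITS/DBITS names
follow the code at the start of the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: clocked write, asynchronous read.
entity async_read_ram is
  generic (
    ABITS : positive := 4;   -- address width (assumed default)
    DBITS : positive := 8    -- data width (assumed default)
  );
  port (
    clock : in  std_logic;
    we    : in  std_logic;
    a     : in  std_logic_vector(ABITS-1 downto 0);
    wd    : in  std_logic_vector(DBITS-1 downto 0);
    rd    : out std_logic_vector(DBITS-1 downto 0)
  );
end entity async_read_ram;

architecture rtl of async_read_ram is
  type t_mem is array (0 to 2**ABITS-1) of
              std_logic_vector(DBITS-1 downto 0);
  signal mem : t_mem;
begin
  -- The write is edge-triggered, so the storage is flip-flop or RAM
  -- style; no latches are implied.
  process (clock)
  begin
    if rising_edge(clock) then
      if we = '1' then
        mem(to_integer(unsigned(a))) <= wd;
      end if;
    end if;
  end process;

  -- The read is purely combinational: rd follows the address with no
  -- clock involved.
  rd <= mem(to_integer(unsigned(a)));
end architecture rtl;
```

Because only the read side is unclocked, static timing analysis still has
a clock to work with on the write path, which is the part that is hard to
constrain in a latch-based design.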

Article: 141440
Subject: Re: Subtleties of Booth's Algorithm Implementation
From: Mike Treseler <mtreseler@gmail.com>
Date: Wed, 24 Jun 2009 09:54:00 -0700

rickman wrote:

> I was able to get a schematic from Synplify Pro, but it did not want
> to include the text except on part of the drawing.  If anyone can tell
> me how to get the text to display, I will reprint it.  The PDF file
> can be found at http://arius.com/stuff/FPGA/Multiply_16x16.pdf

Don't know about synplify.
Does it have an RTL viewer also?

> This incarnation uses 62 LUTs and 42 FFs if I am reading the info
> correctly.  This includes 32 FFs to hold the output product.  The
> Multiplicand is not registered, it is assumed that it is held constant
> on the input.

That should be ok if the enable is synchronized.
Interesting. Thanks for the posting.

> Also, I have not simulated it to be sure it is coded correctly, but I
> am pretty confident it is working the way I intend or at least any
> mistakes won't change the LUT count much.

I prefer to start with an RTL sim,
but I know that many designers prefer
working on the bench. Good luck.

        -- Mike Treseler

Article: 141441
Subject: Re: True dual-port RAM in VHDL: XST question
From: Andy <jonesandy@comcast.net>
Date: Wed, 24 Jun 2009 10:53:49 -0700 (PDT)

On Jun 24, 6:15 am, Jonathan Bromley <jonathan.brom...@MYCOMPANY.com>
wrote:
> In truth the "correct" solution would be to write the whole thing
> as a single process with two clocks:
>
>   process (clock0, clock1)
>     variable mem: t_mem;
>   begin
>     if rising_edge(clock0) then
>       if we0 = '1' then
>         mem(a0) := wd0;
>       end if;
>     end if;
>     if rising_edge(clock1) then
>       if we1 = '1' then
>         mem(a1) := wd1;
>       end if;
>     end if;
>     ...
>
> But I suspect synthesis tools would chuck that overboard
> without a second thought.

Current synthesis tools would probably have an issue with this, but
there's no good reason for it. DDR synthesis (though not the same as
independent clock, dual port memories) needs it anyway. Some synthesis
tools support dual clock processes, just not writes to the same var/
sig on both clocks. The only time this example does not behave like a
true dual clock/port ram is when two writes are attempted to the same
address at exactly the same time, which is not even defined for the
real HW. Good system design makes that case meaningless anyway.

Andy
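
For readers who want to try the single-process, two-clock form in a
simulator, the following is a minimal, complete sketch of it. The entity
name tdp_ram_model, the generic defaults and the read statements are
assumptions added for illustration; the quoted fragment only shows the
writes. As the posts above note, current synthesis tools may well reject
it, so treat it as a simulation model only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical simulation model: one process, two independent clocks.
entity tdp_ram_model is
  generic (
    ABITS : positive := 4;   -- address width (assumed default)
    DBITS : positive := 8    -- data width (assumed default)
  );
  port (
    clock0, clock1 : in  std_logic;
    we0, we1       : in  std_logic;
    a0, a1         : in  std_logic_vector(ABITS-1 downto 0);
    wd0, wd1       : in  std_logic_vector(DBITS-1 downto 0);
    rd0, rd1       : out std_logic_vector(DBITS-1 downto 0)
  );
end entity tdp_ram_model;

architecture sim of tdp_ram_model is
  type t_mem is array (0 to 2**ABITS-1) of
              std_logic_vector(DBITS-1 downto 0);
begin
  process (clock0, clock1)
    -- An ordinary process variable: no shared variable is needed
    -- because both ports live in the same process.
    variable mem : t_mem;
  begin
    if rising_edge(clock0) then          -- port A
      if we0 = '1' then
        mem(to_integer(unsigned(a0))) := wd0;
      end if;
      -- Reading after the variable update gives write-first behaviour.
      rd0 <= mem(to_integer(unsigned(a0)));
    end if;
    if rising_edge(clock1) then          -- port B
      if we1 = '1' then
        mem(to_integer(unsigned(a1))) := wd1;
      end if;
      rd1 <= mem(to_integer(unsigned(a1)));
    end if;
  end process;
end architecture sim;
```

If both clocks rise in the same simulation cycle, this sketch always
handles port A before port B, so a simultaneous write to one address
resolves in favour of port B. That ordering is an artefact of the model,
not a statement about what the hardware does in that case.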

Article: 141442
Subject: Re: True dual-port RAM in VHDL: XST question
From: Jonathan Bromley <jonathan.bromley@MYCOMPANY.com>
Date: Wed, 24 Jun 2009 18:57:29 +0100

On Wed, 24 Jun 2009 08:31:26 -0700 (PDT), Sandro wrote:

>If you are curious please take a look to the vhdl VITAL
>simulations sources...

I know about the vendor-provided simulation models,
which are fine pieces of work that do their job well.
But they are completely irrelevant both to my original
problem and to the issue I asked about.  

I'm trying to assemble a complete and accurate list 
of the _synthesizable_ templates for all common types 
of FPGA memory, and I have discovered a template
that synthesizes to dual-clock RAM in two FPGA
vendors' tools but is a complete nonsense for
simulation.  I want to know why this has happened,
what we can do about it, and why the vendors haven't 
already been beaten to pulp over it by users.
-- 
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which 
are not the views of Doulos Ltd., unless specifically stated.

Article: 141443
Subject: Re: True dual-port RAM in VHDL: XST question
From: Jonathan Bromley <jonathan.bromley@MYCOMPANY.com>
Date: Wed, 24 Jun 2009 19:01:09 +0100

On Wed, 24 Jun 2009 10:53:49 -0700 (PDT), Andy wrote:

[of multi-clocked processes in VHDL]

>Current synthesis tools would probably have an issue with this, but
>there's no good reason for it. DDR synthesis (though not the same as
>independent clock, dual port memories) needs it anyway.

I completely agree.  One of the side-effects of the 
survey I'm doing will probably be that I'll log requests
for exactly this feature with all the synthesis vendors.
I don't hold out much hope, though.  Support welcomed ;-)
-- 
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which 
are not the views of Doulos Ltd., unless specifically stated.

Article: 141444
Subject: Re: True dual-port RAM in VHDL: XST question
From: Muzaffer Kal <kal@dspia.com>
Date: Wed, 24 Jun 2009 11:11:47 -0700

On Wed, 24 Jun 2009 18:57:29 +0100, Jonathan Bromley
<jonathan.bromley@MYCOMPANY.com> wrote:

>I'm trying to assemble a complete and accurate list 
>of the _synthesizable_ templates for all common types 
>of FPGA memory, and I have discovered a template
>that synthesizes to dual-clock RAM in two FPGA
>vendors' tools but is a complete nonsense for
>simulation.  I want to know why this has happened,
>what we can do about it, and why the vendors haven't 
>already been beaten to pulp over it by users.

Originally coming from the ASIC side I find this incredible, but it seems
that the majority of people doing FPGA design don't simulate. I was at an
FPGA infomercial the other day about two new device families coming
out from a vendor who shall stay nameless, and only 20% or so of people
raised their hands when asked whether they simulate. This might explain
how these templates survived as is for such a long time.
---
Muzaffer Kal

DSPIA INC.
ASIC/FPGA Design Services
http://www.dspia.com

Article: 141445
Subject: Re: Virtex-6 shipping?
From: Ed McGettigan <ed.mcgettigan@xilinx.com>
Date: Wed, 24 Jun 2009 11:32:30 -0700

Antti wrote:
> http://news.prnewswire.com/ViewContent.aspx?ACCT=109&STORY=/www/story/03-31-2009/0004998410&EDATE=
> 
> already 2 weeks?
> 
> but how come there can exist 1000 designs, when first shipments made
> 31 march?
> also the number of EA customers 700 seems unlikely, if so then
> logistic has made
> miracles shipping to 700 customers in NO time and those already used
> the parts
> in 1000 designs?
> 
> oh, well, it is only the news the Xilinx way; it is still to be seen when ISE
> support for
> S6-V6 comes
> 
> Antti

Software support, data sheets and user guides for Spartan-6 and Virtex-6 
are now available.

ISE - http://www.xilinx.com/tools/designtools.htm
V-6 - http://www.xilinx.com/products/virtex6/
S-6 - http://www.xilinx.com/products/spartan6/

Ed McGettigan
--
Xilinx Inc.

Article: 141446
Subject: Re: EPM7064 Altera PLD oe1\oe2\gclr1
From: "Andrew Holme" <ah@nospam.co.uk>
Date: Wed, 24 Jun 2009 19:50:28 +0100


"Aldorus" <him@hereonearth.com> wrote in message 
news:ZZc0m.272932$Tp1.216546@en-nntp-01.dc1.easynews.com...
> Hi all
>
>
>       Just curious about the EPM7064 Altera CPLD. I designed a small
> system around it and I am curious how necessary it is to tie down
> Oe1 and Oe2 to ground and Gclr to Vcc.
>
> Currently I only have Oe1 grounded the others left floating and Gclk1
> connected to my oscillator.
>
> My code is simple and clean but the outputs (after downloading to the
> device) appear to be floating ... which leads me to believe I may need
> to ground Oe2 and tie Gclr to a pullup
>
> Anyone here with knowledge/experience working with Altera CPLD's?
> Thanks in advance

I connected pins 1, 2 and 44 to ground as the pin report suggested and it 
worked for me.

Article: 141447
Subject: Re: Virtex-6 shipping?
From: "Antti.Lukats@googlemail.com" <Antti.Lukats@googlemail.com>
Date: Wed, 24 Jun 2009 11:50:31 -0700 (PDT)

On Jun 24, 9:32 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> Antti wrote:
> >http://news.prnewswire.com/ViewContent.aspx?ACCT=109&STORY=/www/story...
>
> > already 2 weeks?
>
> > but how come there can exist 1000 designs, when first shipments made
> > 31 march?
> > also the number of EA customers 700 seems unlikely, if so then
> > logistic has made
> > miracles shipping to 700 customers in NO time and those already used
> > the parts
> > in 1000 designs?
>
> > oh, well, it is only the news the Xilinx way; it is still to be seen when ISE
> > support for
> > S6-V6 comes
>
> > Antti
>
> Software support, data sheets and user guides for Spartan-6 and Virtex-6
> are now available.
>
> ISE - http://www.xilinx.com/tools/designtools.htm
> V-6 - http://www.xilinx.com/products/virtex6/
> S-6 - http://www.xilinx.com/products/spartan6/
>
> Ed McGettigan
> --
> Xilinx Inc.

Finally!
No need to worry about the NDA any more when talking about the S6/V6 :)

Antti

Article: 141448
Subject: Re: Subtleties of Booth's Algorithm Implementation
From: Weng Tianxiang <wtxwtx@gmail.com>
Date: Wed, 24 Jun 2009 11:57:11 -0700 (PDT)

On Jun 24, 9:54 am, Mike Treseler <mtrese...@gmail.com> wrote:
> rickman wrote:
> > I was able to get a schematic from Synplify Pro, but it did not want
> > to include the text except on part of the drawing.  If anyone can tell
> > me how to get the text to display, I will reprint it.  The PDF file
> > can be found at http://arius.com/stuff/FPGA/Multiply_16x16.pdf
>
> Don't know about synplify.
> Does it have an RTL viewer also?
>
> > This incarnation uses 62 LUTs and 42 FFs if I am reading the info
> > correctly.  This includes 32 FFs to hold the output product.  The
> > Multiplicand is not registered, it is assumed that it is held constant
> > on the input.
>
> That should be ok if the enable is synchronized.
> Interesting. Thanks for the posting.
>
> > Also, I have not simulated it to be sure it is coded correctly, but I
> > am pretty confident it is working the way I intend or at least any
> > mistakes won't change the LUT count much.
>
> I prefer to start with an RTL sim,
> but I know that many designers prefer
> working on the bench. Good luck.
>
>         -- Mike Treseler

Hi Rick,
The file published at http://arius.com/stuff/FPGA/Multiply_16x16.pdf
has a flaw:
at 400% magnification I cannot see the text after scrolling 4 clicks
across the width. The first 3 clicks show the drawing normally, but
after that only the schematic is shown, with no text.

Is that normal?

Weng

Article: 141449
Subject: Re: True dual-port RAM in VHDL: XST question
From: Mike Treseler <mtreseler@gmail.com>
Date: Wed, 24 Jun 2009 12:05:39 -0700

Jonathan Bromley wrote:

> I'm trying to assemble a complete and accurate list 
> of the _synthesizable_ templates for all common types 
> of FPGA memory, and I have discovered a template
> that synthesizes to dual-clock RAM in two FPGA
> vendors' tools but is a complete nonsense for
> simulation.  I want to know why this has happened,
> what we can do about it, and why the vendors haven't 
> already been beaten to pulp over it by users.

This has happened because the majority of FPGA
designers prefer to wire together blocks
by others, and verify on the bench using
trial and error synthesis.

The silly dual clock RAM model is ignored
not because it is silly, but because
a vendor netlist is preferred to RTL
to get at all the asynchronous black magic.

What to do? I stick with single clock RAMs
and arbitrate synchronously.

       -- Mike Treseler
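
As an illustration of the "single clock RAM, arbitrate synchronously"
approach, here is a minimal sketch in which two requesters share one RAM
port under a fixed-priority arbiter. Everything in it (the entity name
arbitrated_ram, the req/gnt handshake, the priority rule and the generic
defaults) is an assumption made for the example and is not taken from the
post above. Both requesters are assumed to be already synchronous to the
one clock.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: one single-clock RAM port shared by two
-- requesters; requester 0 always wins when both ask in the same cycle.
entity arbitrated_ram is
  generic (
    ABITS : positive := 4;   -- address width (assumed default)
    DBITS : positive := 8    -- data width (assumed default)
  );
  port (
    clock      : in  std_logic;
    req0, req1 : in  std_logic;
    we0, we1   : in  std_logic;
    a0, a1     : in  std_logic_vector(ABITS-1 downto 0);
    wd0, wd1   : in  std_logic_vector(DBITS-1 downto 0);
    gnt0, gnt1 : out std_logic;  -- registered, aligned with rd
    rd         : out std_logic_vector(DBITS-1 downto 0)
  );
end entity arbitrated_ram;

architecture rtl of arbitrated_ram is
  type t_mem is array (0 to 2**ABITS-1) of
              std_logic_vector(DBITS-1 downto 0);
  signal mem : t_mem;
begin
  process (clock)
    variable sel_we : std_logic;
    variable sel_a  : std_logic_vector(ABITS-1 downto 0);
    variable sel_wd : std_logic_vector(DBITS-1 downto 0);
  begin
    if rising_edge(clock) then
      -- Fixed-priority selection of the winning request.
      if req0 = '1' then
        sel_we := we0;          sel_a := a0; sel_wd := wd0;
      else
        sel_we := we1 and req1; sel_a := a1; sel_wd := wd1;
      end if;
      gnt0 <= req0;
      gnt1 <= req1 and not req0;

      -- Ordinary single-clock RAM access on the selected request.
      if sel_we = '1' then
        mem(to_integer(unsigned(sel_a))) <= sel_wd;
      end if;
      -- mem is a signal, so this returns the old contents on a write
      -- cycle (read-before-write).
      rd <= mem(to_integer(unsigned(sel_a)));
    end if;
  end process;
end architecture rtl;
```

A requester whose gnt output is low in a given cycle simply retries. The
cost of this style is throughput: only one access completes per clock,
which is the trade-off against a true dual-clock, dual-port RAM.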
