Messages from 154225

Article: 154225
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Tim Wescott <tim@seemywebsite.com>
Date: Tue, 11 Sep 2012 15:20:23 -0500

On Tue, 11 Sep 2012 11:47:01 -0700, robotron wrote:

> Dear colleagues,
> 
> I am sending you a proposal of binary counter, designed to minimize
> logic path length in between flip-flops to one gate (MUX) only, at the
> expense of not so straightforward binary counting. The reason for this
> design has emerged while using Actel (MicroSemi) ProASIC/IGLOO
> architecture, lacking any hardwired support for fast carry.
> 
> I have placed VHDL code, schematics, testbench and sample C code to
> OpenCores:
> 
> 	http://opencores.org/project,pcounter
> 
> for further review. If you have GHDL, you can run the test easily by
> issuing "make testrun" or "make testvcd" to examine traces.
> 
> Background:
> During our work on Actel FPGAs (basically, 3-LUT & DFF only), we were
> aware of following types of faster counters:
> - LFSR counter
> - Johnson counter
> - "RLA counter" (as tailored using Actel's SmartGen core generator)
> 
> Johnson due to its O(2^n) (n as number of bits) can not be used for
> longer counts; LFSR's are hard to invert (table lookup seems to be only
> known method), therefore also impractical for wider counters. RLA
> counter is still too slow and complex for wider counters and moderate
> speeds (e.g. >24bits @ >100MHz).
> 
> As a consequence, the proposed counter uses synchronous divide-by-two
> blocks, each using 1-bit pipeline and carry by single-clock pulse.
> Design is simple and fast, preliminary results from Synplify and Actel
> Designer shows 32bits @200MHz feasible.
> 
> However, output bit lines are non-proportionaly delayed by discrete
> number of clock periods. Therefore, to obtain linear bit word, an
> inversion formula needs to be applied. Fortunately, the inversion is
> simple (unlike LFSR's), in C (pcount.c):
> 
>   for (k = 1; k < n; k++)
>     if ((y & ((1<<k)-1)) < k)
>       y = y ^ (1<<k);
> 
> -- it may be implemented in VHDL core, or within CPU as shown, depending
> on application requirements.
> 
> I am attaching design files & C language decoder/encoder of counter bit
> words. If you have GHDL, you can run the test easily by issuing "make
> testrun" or "make testvcd" to examine traces.
> 
> ** My questions are: **
> - does this design exists, is it being used, and if so, what is its
>   name?
> - if not, do you find the design useful?
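
A small C sketch of the quoted inversion loop, for readers who want to
experiment with it. The decoder is the loop from the post wrapped in a
function. The encoder is not from the post: it is an assumed model in which
output bit k simply lags the true count by k clocks (one pipeline register
per stage). The names pcount_encode and pcount_decode are hypothetical and
need not match pcount.c on OpenCores.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED model of the raw counter word at time t: output bit k is
   bit k of (t - k), i.e. each stage lags the true count by k clocks. */
uint32_t pcount_encode(uint32_t t, unsigned n)
{
    assert(n >= 1 && n <= 32);
    uint32_t y = 0;
    for (unsigned k = 0; k < n; k++)
        y |= (((t - k) >> k) & 1u) << k;   /* unsigned wrap-around is fine */
    return y;
}

/* The inversion loop quoted above, wrapped in a function.
   Lower bits are corrected first, so the test on the low k bits
   always sees already-decoded bits. */
uint32_t pcount_decode(uint32_t y, unsigned n)
{
    assert(n >= 1 && n <= 32);
    for (unsigned k = 1; k < n; k++)
        if ((y & ((1u << k) - 1u)) < k)
            y ^= 1u << k;
    return y;
}
```

Under that assumed model, bit k of the raw word differs from bit k of the
true count exactly when the low k bits of the count are smaller than k (a
borrow out of the low bits), which is the condition the loop tests.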

So, you've reinvented the ripple counter, only with deterministic carry?

I'm not sure what you'd call it, but a quick Google on "ripple counter", 
possibly with "synchronous" tossed in there, may find you some prior art.

-- 
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com

Article: 154226
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: robotron <hefaistos@gmail.com>
Date: Tue, 11 Sep 2012 13:40:40 -0700 (PDT)

Hello,

On Tuesday, September 11, 2012 10:20:24 PM UTC+2, Tim Wescott wrote:
>
> So, you've reinvented the ripple counter, only with deterministic carry?

Yes, nothing more. (I am only unsure whether it really is a REinvention, i.e. whether there is easily accessible, documented prior art -- this is why I am bothering the newsgroup.)

> I'm not sure what you'd call it, but a quick Google on "ripple counter", 
> possibly with "synchronous" tossed in there, may find you some prior art.

Can you be more specific? I have found only asynchronous ripple counters.
For synchronous counters, my searches keep returning what the vendor tools also offer: counters without a pipeline, i.e. with a combinatorial network spanning all of the outputs. THIS is what I meant to avoid in the design.

Example: http://en.wikipedia.org/wiki/File:4-bit-jk-flip-flop_V1.1.svg

Here you can see an AND gate spanning three output bits. For a wider counter, that means an (n-1)-input AND. Of course, this may be solved by faster logic. However, do you know of any working solution for *slow* architectures that is at least as good as my proposal?

I know only about Johnson/one-hot counter (exponential floorplan consumption) and LFSR (exponential lookup-table consumption). Am I missing something?

Thank you for your response,
Marek

Article: 154227
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g. Actel) -- Request For Comment
From: Jon Elson <jmelson@wustl.edu>
Date: Tue, 11 Sep 2012 16:30:42 -0500

robotron wrote:


> I know only about Johnson/one-hot counter (exponential floorplan
> consumption) and LFSR (exponential lookup-table consumption). Am I missing
> something?
How about gray code counters?  Maybe their only advantage is when sampling
them, not the carry propagation, though.

Jon

Article: 154228
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Tim Wescott <tim@seemywebsite.com>
Date: Tue, 11 Sep 2012 16:37:49 -0500

On Tue, 11 Sep 2012 16:30:42 -0500, Jon Elson wrote:

> robotron wrote:
> 
> 
>> I know only about Johnson/one-hot counter (exponential floorplan
>> consumption) and LFSR (exponential lookup-table consumption). Am I
>> missing something?
> How about gray code counters?  Maybe their only advantage is when
> sampling them, not the carry propagation, though.
> 
> Jon

Gray code counters have the same carry issues as regular binary counters.

I think the idea of a synchronous ripple counter is a clever one.  I'd be 
surprised if it hasn't been thought of before, but it's still clever.


Article: 154229
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g. Actel) -- Request For Comment
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Tue, 11 Sep 2012 21:41:20 +0000 (UTC)

Tim Wescott <tim@seemywebsite.com> wrote:
> On Tue, 11 Sep 2012 11:47:01 -0700, robotron wrote:

(snip)
>> I am sending you a proposal of binary counter, designed to minimize
>> logic path length in between flip-flops to one gate (MUX) only, at the
>> expense of not so straightforward binary counting. The reason for this
>> design has emerged while using Actel (MicroSemi) ProASIC/IGLOO
>> architecture, lacking any hardwired support for fast carry.

(snip) 
>>       http://opencores.org/project,pcounter

I think it is pretty neat. I wouldn't be surprised if it had
been invented in the early days of computing, and then forgotten.

My first thought would have been a Gray code counter, and 
probably also my second thought. 

For now, I will call it carry anticipation.

In generating modulo-N counters where N isn't a power of two,
it often takes more than one cycle to do the comparison against
N-1 to generate a reset. One possible solution is to compare
against N-2, then delay the reset.

Well, more specifically, pipeline the comparator such that the reset
comes out at the right time. That requires a comparison against
a smaller number such that the result is right.
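
The compare-early, reset-late idea can be sketched as a cycle model in C.
This is only an illustration under stated assumptions: the comparator is
taken to cost exactly one register stage, so it compares against N-2, and
the names modn_t, modn_clock and modn_run are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t cnt;   /* counter register */
    int      hit;   /* registered comparator output, one clock late */
} modn_t;

/* One rising clock edge. Both registers are updated from the old state,
   mimicking non-blocking assignments. */
void modn_clock(modn_t *s, uint32_t n)
{
    assert(n >= 2);
    int      hit_next = (s->cnt == n - 2);           /* compare early  */
    uint32_t cnt_next = s->hit ? 0u : s->cnt + 1u;   /* reset one late */
    s->hit = hit_next;
    s->cnt = cnt_next;
}

/* Count value after `clocks` edges, starting from reset (cnt = 0). */
uint32_t modn_run(uint32_t n, unsigned clocks)
{
    modn_t s = { 0u, 0 };
    for (unsigned i = 0; i < clocks; i++)
        modn_clock(&s, n);
    return s.cnt;
}
```

When cnt reaches N-2 the flag is latched while cnt moves on to N-1, and the
flag then clears the counter on the following edge, so the sequence is
0 .. N-1 even though the comparison itself never looks at N-1.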

Also reminds me of Peter Alfke's divide by 2.5 circuit used to
build really fast dividers. 

(snip)

> So, you've reinvented the ripple counter, only with deterministic carry?

The usual ripple counter is asynchronous, but I almost see what
you mean.

> I'm not sure what you'd call it, but a quick Google on "ripple counter", 
> possibly with "synchronous" tossed in there, may find you some prior art.

Hmm. OK, my description above isn't quite right. If it did do
carry anticipation, then it would be an ordinary binary counter.
If instead you delay the carry at each stage by one cycle, 
does that describe it? 

I suppose I still think that it is possible to build a Gray code
counter with similar delays in it.

-- glen

Article: 154230
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Tim Wescott <tim@seemywebsite.com>
Date: Tue, 11 Sep 2012 16:41:48 -0500

On Tue, 11 Sep 2012 13:40:40 -0700, robotron wrote:

> Hello,
> 
> On Tuesday, September 11, 2012 10:20:24 PM UTC+2, Tim Wescott wrote:
>>
>> So, you've reinvented the ripple counter, only with deterministic
>> carry?
> 
> yes, nothing more. (Only still unsure about that REinvention by means of
> easily accessible, documented prior art -- this is why I am bothering
> the newsgroup.)
> 
>> I'm not sure what you'd call it, but a quick Google on "ripple
>> counter", possibly with "synchronous" tossed in there, may find you
>> some prior art.
> 
> Can you be more specific? I have found only asynchronous ripple
> counters. For synchronous counter searches, I constantly get, what also
> vendor tools offer: counters without pipeline, i.e. with combinatorial
> network, spanning over all of the outputs. THIS is what I meant to avoid
> within the design.
> 
> Example: http://en.wikipedia.org/wiki/File:4-bit-jk-flip-flop_V1.1.svg
> 
> Here you can see AND gate spanning three output bits. For wider counter,
> that means n-1 input wide AND. Of course, may be solved by faster logic.
> However, do you know about any working solution on *slow* architectures,
> at least as good as my proposal?
> 
> I know only about Johnson/one-hot counter (exponential floorplan
> consumption) and LFSR (exponential lookup-table consumption). Am I
> missing something?

Well, your scheme does have the drawback that while the counter is fast, 
and the sampling thereof is fast, the algorithm to get from the sampled 
value to a binary one takes some clock cycles.

Still, it'd work well as a capture for a microprocessor, or as a compare 
(for something like a PWM).  It'd be better yet if you could make it an 
up-down counter, but I bet that's harder.

If you have access to a university library, do a literature search -- I'd 
be astonished if this hasn't shown up before in something like the IEEE 
Circuit Society journal, or in a patent someplace.


Article: 154231
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g. Actel) -- Request For Comment
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Tue, 11 Sep 2012 21:49:42 +0000 (UTC)

Tim Wescott <tim@seemywebsite.com> wrote:
> On Tue, 11 Sep 2012 16:30:42 -0500, Jon Elson wrote:

(snip on fast counters)

>> How about gray code counters?  Maybe their only advantage is when
>> sampling them, not the carry propagation, though.

(snip)
> Gray code counters have the same carry issues as regular binary counters.

Are you sure? Having not designed one recently, I wouldn't be sure,
but I thought that they didn't. You do have to consider when each FF
will change, though, and so synchronous logic tools might not be
able to do the timing.

> I think the idea of a synchronous ripple counter is a clever one.  I'd be 
> surprised if it hasn't been thought of before, but it's still clever.

Yes, many strange things were invented in the early days.

One that seems to be forgotten is the Earle latch, which as best
I can describe it, combines one level of logic with latch circuitry.
But that was before edge triggered logic.

-- glen

Article: 154232
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Tim Wescott <tim@seemywebsite.com>
Date: Tue, 11 Sep 2012 17:14:11 -0500

On Tue, 11 Sep 2012 21:49:42 +0000, glen herrmannsfeldt wrote:

> Tim Wescott <tim@seemywebsite.com> wrote:
>> On Tue, 11 Sep 2012 16:30:42 -0500, Jon Elson wrote:
> 
> (snip on fast counters)
> 
>>> How about gray code counters?  Maybe their only advantage is when
>>> sampling them, not the carry propagation, though.
> 
> (snip)
>> Gray code counters have the same carry issues as regular binary
>> counters.
> 
> Are you sure? Having not designed one recently, I wouldn't be sure, but
> I thoght that they didn't. You do have to consider when each FF will
> change, though, and so synchronous logic tools might not be able to do
> the timing.

Well, yes and no.  Now that you mention it, it strikes me that the "only 
one thing changes at a time" may make it _easier_ to anticipate carries 
-- but _when_ any bit in a gray code counter changes is still dependent 
on _all_ the other bits; if that has to propagate through a bunch of 
logic, then you're back to slows-ville.
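
A short C sketch of that dependency, assuming the usual reflected Gray code
g = b ^ (b >> 1). The helper names gray_encode and gray_flip_index are
hypothetical, and this is a software illustration, not a circuit.

```c
#include <assert.h>
#include <stdint.h>

/* Reflected binary Gray code of the binary count b. */
uint32_t gray_encode(uint32_t b)
{
    return b ^ (b >> 1);
}

/* Index of the single Gray bit that changes when the count goes
   from b to b + 1. */
unsigned gray_flip_index(uint32_t b)
{
    uint32_t diff = gray_encode(b) ^ gray_encode(b + 1u);
    unsigned i = 0;
    assert(diff != 0);              /* consecutive codes always differ */
    while (!(diff & 1u)) {          /* locate the one set bit */
        diff >>= 1;
        i++;
    }
    return i;
}
```

In this sketch the index equals the number of trailing one bits in the
binary count, so deciding whether bit i changes involves every lower-order
bit, much like the carry condition of a plain binary counter.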


Article: 154233
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g. Actel) -- Request For Comment
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Tue, 11 Sep 2012 22:18:49 +0000 (UTC)

Tim Wescott <tim@seemywebsite.com> wrote:

(snip)
> If you have access to a university library, do a literature search -- I'd 
> be astonished if this hasn't shown up before in something like the IEEE 
> Circuit Society journal, or in a patent someplace.

If you don't have such access, and are interested in such things, I
recommend Earl Swartzlander's "Computer Arithmetic, vol. 1"
(and then vol. 2).

They are IEEE published books with reprints of various papers that
were important in the history of computer arithmetic. Volume 2
has more recent papers, many related to fast floating point.

I don't see a counter like this, so far, though.

-- glen

Article: 154234
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g. Actel) -- Request For Comment
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Tue, 11 Sep 2012 22:27:47 +0000 (UTC)

Tim Wescott <tim@seemywebsite.com> wrote:

(after Tim wrote)

>>> Gray code counters have the same carry issues as regular binary
>>> counters.

(then I wrote)
>> Are you sure? Having not designed one recently, I wouldn't be sure, but
>> I thoght that they didn't. You do have to consider when each FF will
>> change, though, and so synchronous logic tools might not be able to do
>> the timing.

> Well, yes an no.  Now that you mention it, it strikes me that the "only 
> one thing changes at a time" may make it _easier_ to anticipate carries 
> -- but _when_ any bit in a gray code counter changes is still dependent 
> on _all_ the other bits; if that has to propagate through a bunch of 
> logic, then you're back to slows-ville.

But if one input of an AND is zero, it doesn't matter if the
other one changes. When that input goes from zero to one, what
matters is that the other input isn't still changing. You might
be able to get only one gate delay on any signal that actually
changes, even though the gate chain is longer.

All that said without actually looking at any circuits, so
it might not apply here.

-- glen

Article: 154235
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Gabor <gabor@szakacs.org>
Date: Tue, 11 Sep 2012 22:01:18 -0400

On 9/11/2012 2:47 PM, robotron wrote:
> Dear colleagues,
>
> I am sending you a proposal of binary counter, designed to minimize logic
> path length in between flip-flops to one gate (MUX) only, at the expense of
> not so straightforward binary counting. The reason for this design has
> emerged while using Actel (MicroSemi) ProASIC/IGLOO architecture, lacking
> any hardwired support for fast carry.
>
> I have placed VHDL code, schematics, testbench and sample C code to OpenCores:
>
> 	http://opencores.org/project,pcounter
>
> for further review. If you have GHDL, you can run the test easily by issuing
> "make testrun" or "make testvcd" to examine traces.
>
> Background:
> During our work on Actel FPGAs (basically, 3-LUT & DFF only), we were aware
> of following types of faster counters:
> - LFSR counter
> - Johnson counter
> - "RLA counter" (as tailored using Actel's SmartGen core generator)
>
> Johnson due to its O(2^n) (n as number of bits) can not be used for longer
> counts; LFSR's are hard to invert (table lookup seems to be only known
> method), therefore also impractical for wider counters. RLA counter is still
> too slow and complex for wider counters and moderate speeds
> (e.g. >24bits @ >100MHz).
>
> As a consequence, the proposed counter uses synchronous divide-by-two
> blocks, each using 1-bit pipeline and carry by single-clock pulse. Design is
> simple and fast, preliminary results from Synplify and Actel Designer shows
> 32bits @200MHz feasible.
>
> However, output bit lines are non-proportionaly delayed by discrete number
> of clock periods. Therefore, to obtain linear bit word, an inversion formula
> needs to be applied. Fortunately, the inversion is simple (unlike LFSR's),
> in C (pcount.c):
>
>    for (k = 1; k < n; k++)
>      if ((y & ((1<<k)-1)) < k)
>        y = y ^ (1<<k);
>
> -- it may be implemented in VHDL core, or within CPU as shown, depending on
> application requirements.
>
> I am attaching design files & C language decoder/encoder of counter bit
> words. If you have GHDL, you can run the test easily by issuing
> "make testrun" or "make testvcd" to examine traces.
>
> ** My questions are: **
> - does this design exists, is it being used, and if so, what is its name?
> - if not, do you find the design useful?
>
>
> Best regards,
> Marek Peca
>
It's simple and elegant.  If you only wanted a divide by 2^n
count with a square wave output, you don't really need the inversion
formula.  It would also work with the initial 'en' input as a count
enable, but that would mess up your inversion formula, as the carry
propagates even when the input enable is false.  In fact, for an
n-bit version of this counter, if the input enable were off for
n cycles (or more) the output would eventually settle on a normal
binary pattern without the inversion.

-- Gabor

Article: 154236
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: robotron <hefaistos@gmail.com>
Date: Wed, 12 Sep 2012 02:46:20 -0700 (PDT)

On Sep 11, 11:41 pm, Tim Wescott <t...@seemywebsite.com> wrote:
> Well, your scheme does have the drawback that while the counter is fast,
> and the sampling thereof is fast, the algorithm to get from the sampled
> value to a binary one takes some clock cycles.
>
> Still, it'd work well as a capture for a microprocessor, or as a compare
> (for something like a PWM).  It'd be better yet if you could make it an
> up-down counter, but I bet that's harder.

Yes, the bitword inversion calculation is the price paid. I fully understand
that for demanding, high-throughput applications this is an obstacle.

However, as you said, for less intensive applications like PWM or timestamping
it may be useful. For now, we are going to use it in a time interval meter.

I have uploaded the design to OpenCores, so anybody is welcome to
send their improvements.

> If you have access to a university library, do a literature search -- I'd
> be astonished if this hasn't shown up before in something like the IEEE
> Circuit Society journal, or in a patent someplace.

Thank you for recommending the journal, I will take a look.

Best regards,
Marek

Article: 154237
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: robotron <hefaistos@gmail.com>
Date: Wed, 12 Sep 2012 02:56:20 -0700 (PDT)

On Sep 12, 4:01 am, Gabor <ga...@szakacs.org> wrote:
>
> It's simple and elegant.  If you only wanted a divide by 2^n
> count with a square wave output, you don't really need the inversion
> formula.

Right.

>  It would also work with the initial 'en' input as a count
> enable, but that would mess up your inversion formula, as the carry
> propagates even when the input enable is false.  In fact, for an
> n-bit version of this counter, it the input enable were off for
> n cycles (or more) the output would eventually settle on a normal
> binary pattern without the inversion.

Really? I do not see that at the moment.
(Maybe I should try it in a simulation, rather than force the brain to
boot.)


Marek

Article: 154238
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Gabor <gabor@szakacs.invalid>
Date: Wed, 12 Sep 2012 09:14:58 -0400

robotron wrote:
> On Sep 12, 4:01 am, Gabor <ga...@szakacs.org> wrote:
>> It's simple and elegant.  If you only wanted a divide by 2^n
>> count with a square wave output, you don't really need the inversion
>> formula.
> 
> Right.
> 
>>  It would also work with the initial 'en' input as a count
>> enable, but that would mess up your inversion formula, as the carry
>> propagates even when the input enable is false.  In fact, for an
>> n-bit version of this counter, it the input enable were off for
>> n cycles (or more) the output would eventually settle on a normal
>> binary pattern without the inversion.
> 
> Really? I do not see that at the moment.
> (Maybe I should try it in a simulation, rather than force the brain to
> boot.)
> 
> 
> Marek

It made sense to me because this is essentially a ripple carry counter
with pipeline stages in the carry chain, so unless the pipeline is
also dependent on the enable input, it should eventually settle
on a binary output.

I tried it myself, and it works as I expected.  Here's my version
(VHDL is not my first language so I converted to Verilog).  It holds
en true for 100 cycles at a time, and the output (eventually) settles
on a multiple of 100 (modulo 256 in this case as my pcounter is 8 bits).

`timescale 1 ns / 1 ps

// pdivtwo by Marek Peca
// http://opencores.org/project,pcounter

// Verilog adapted from the schematic diagram

`default_nettype none

module pdivtwo
(
   input wire    en,
   input wire    clk,
   input wire    rst,
   output reg    p,
   output reg    q
);

always @ (posedge clk or posedge rst)
if (rst)
   begin
     p <= 1'b0;
     q <= 1'b0;
   end
else
   begin
     if (en) q <= !q;
     p <= en & q;
   end

endmodule

// pcounter by Marek Peca
// http://opencores.org/project,pcounter

// Verilog adapted from the schematic diagram

module pcounter
#(
   parameter WIDTH = 8
)
(
   input wire              en,
   input wire              clk,
   input wire              rst,
   output wire             p,
   output wire [WIDTH-1:0] q
);

// Internal carry chain elements
wire [WIDTH-1:1] p_int;

pdivtwo pctr [WIDTH-1:0]
(
   .en     ({p_int,en}),
   .clk    (clk),
   .rst    (rst),
   .p      ({p,p_int}),
   .q      (q)
);

endmodule

module pcounter_tb;

// Simple test bench enables the counter for 100 clocks at a time.

// Inputs
reg en;
reg clk;
reg rst;

// Outputs
wire p;
wire [7:0] q;

// Unit Under Test (UUT)
pcounter uut
(
   .en   (en),
   .clk  (clk),
   .rst  (rst),
   .p    (p),
   .q    (q)
);

initial begin
   // Initialize Inputs
   en = 0;
   clk = 0;
   rst = 1;
   #101;
   rst = 0;
end

always clk = #5 !clk;

integer timer = 0;
always @ (posedge clk)
if (!rst)
   begin
     timer <= timer + 1;
     case (timer)
       10: en <= #1 1;
       110: en <= #1 0;
       210: en <= #1 1;
       310: en <= #1 0;
       500: timer <= 0;
     endcase
   end

endmodule

`default_nettype wire
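
For readers without a Verilog simulator, the same experiment can be
approximated with a cycle model in C. This is a sketch that mirrors the
pdivtwo/pcounter structure above as I read it (q toggles when en is high,
p registers en & q); the names pcounter_t, pcounter_clock and
pcounter_burst are hypothetical, and the timing of the enable is simplified
to "on for N clocks, then off for M clocks".

```c
#include <assert.h>
#include <stdint.h>

#define PC_MAX_WIDTH 32

typedef struct {
    unsigned width;
    uint8_t  p[PC_MAX_WIDTH];   /* registered carry pulse of each stage */
    uint8_t  q[PC_MAX_WIDTH];   /* divide-by-two output of each stage   */
} pcounter_t;

void pcounter_reset(pcounter_t *c, unsigned width)
{
    assert(width >= 1 && width <= PC_MAX_WIDTH);
    c->width = width;
    for (unsigned k = 0; k < width; k++)
        c->p[k] = c->q[k] = 0;
}

/* One rising clock edge. Stages are walked from the top down so that each
   stage still sees last cycle's p of the stage below it, which mimics the
   non-blocking assignments in the Verilog. */
void pcounter_clock(pcounter_t *c, int en)
{
    for (unsigned k = c->width; k-- > 0; ) {
        int en_k  = (k == 0) ? (en != 0) : c->p[k - 1];
        int q_old = c->q[k];
        if (en_k)
            c->q[k] = (uint8_t)!q_old;
        c->p[k] = (uint8_t)(en_k & q_old);
    }
}

/* Pack the q outputs into one word, bit 0 = first stage. */
uint32_t pcounter_word(const pcounter_t *c)
{
    uint32_t y = 0;
    for (unsigned k = 0; k < c->width; k++)
        y |= (uint32_t)c->q[k] << k;
    return y;
}

/* Enable for `on` clocks, idle for `off` clocks, return the raw word. */
uint32_t pcounter_burst(pcounter_t *c, unsigned on, unsigned off)
{
    for (unsigned i = 0; i < on; i++)
        pcounter_clock(c, 1);
    for (unsigned i = 0; i < off; i++)
        pcounter_clock(c, 0);
    return pcounter_word(c);
}
```

With a width of 8 and three bursts of 100 enabled clocks, each followed by
100 idle clocks, this model settles on 100, 200 and 44 (300 modulo 256),
which agrees with the behaviour described above.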

Regards,
Gabor

Article: 154239
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Gabor <gabor@szakacs.invalid>
Date: Wed, 12 Sep 2012 10:17:20 -0400

robotron wrote:
> Dear colleagues,
> 
> I am sending you a proposal of binary counter, designed to minimize logic
> path length in between flip-flops to one gate (MUX) only, at the expense of
> not so straightforward binary counting. The reason for this design has
> emerged while using Actel (MicroSemi) ProASIC/IGLOO architecture, lacking
> any hardwired support for fast carry.
> 
> I have placed VHDL code, schematics, testbench and sample C code to OpenCores:
> 
> 	http://opencores.org/project,pcounter
> 
> for further review. If you have GHDL, you can run the test easily by issuing
> "make testrun" or "make testvcd" to examine traces.
> 
> Background:
> During our work on Actel FPGAs (basically, 3-LUT & DFF only), we were aware
> of following types of faster counters:
> - LFSR counter
> - Johnson counter
> - "RLA counter" (as tailored using Actel's SmartGen core generator)
> 
> Johnson due to its O(2^n) (n as number of bits) can not be used for longer
> counts; LFSR's are hard to invert (table lookup seems to be only known
> method), therefore also impractical for wider counters. RLA counter is still
> too slow and complex for wider counters and moderate speeds
> (e.g. >24bits @ >100MHz).
> 
> As a consequence, the proposed counter uses synchronous divide-by-two
> blocks, each using 1-bit pipeline and carry by single-clock pulse. Design is
> simple and fast, preliminary results from Synplify and Actel Designer shows
> 32bits @200MHz feasible.
> 
> However, output bit lines are non-proportionaly delayed by discrete number
> of clock periods. Therefore, to obtain linear bit word, an inversion formula
> needs to be applied. Fortunately, the inversion is simple (unlike LFSR's),
> in C (pcount.c):
> 
>   for (k = 1; k < n; k++)
>     if ((y & ((1<<k)-1)) < k)
>       y = y ^ (1<<k);
> 
> -- it may be implemented in VHDL core, or within CPU as shown, depending on
> application requirements.
> 
> I am attaching design files & C language decoder/encoder of counter bit
> words. If you have GHDL, you can run the test easily by issuing
> "make testrun" or "make testvcd" to examine traces.
> 
> ** My questions are: **
> - does this design exists, is it being used, and if so, what is its name?
> - if not, do you find the design useful?
> 
> 
> Best regards,
> Marek Peca

Just to see if this has some application in Xilinx FPGA's I gave it
a whirl in a Spartan 6.  For a 32-bit counter with registered inputs
and only the final p and q going offchip (again with additional
registers) the best I could do in the -3 speed grade was 425 MHz.
The same size counter and architecture (including a carry out)
using the built-in carry chain logic for a normal binary counter
resulted in more than 470 MHz.  Looking through the timing numbers
it appears that routing delays for this counter negate any help
you might get by losing the carry chain (in Spartan 6).  I imagine
it would be a win in a CPLD (if you have the extra macrocells
for the 2x register count).  In the past I have used LFSR's for
long counters in CPLD's - partly for speed, but mostly because
of the reduced connectivity requirements.

Regards,
Gabor

Article: 154240
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: rickman <gnuarm@gmail.com>
Date: Wed, 12 Sep 2012 11:24:31 -0400

On 9/11/2012 10:01 PM, Gabor wrote:
> On 9/11/2012 2:47 PM, robotron wrote:
>> Dear colleagues,
>>
>> I am sending you a proposal of binary counter, designed to minimize logic
>> path length in between flip-flops to one gate (MUX) only, at the expense of
>> not so straightforward binary counting. The reason for this design has
>> emerged while using Actel (MicroSemi) ProASIC/IGLOO architecture, lacking
>> any hardwired support for fast carry.
>>
>> I have placed VHDL code, schematics, testbench and sample C code to
>> OpenCores:
>>
>> http://opencores.org/project,pcounter
>>
>> for further review. If you have GHDL, you can run the test easily by issuing
>> "make testrun" or "make testvcd" to examine traces.
>>
>> Background:
>> During our work on Actel FPGAs (basically, 3-LUT & DFF only), we were aware
>> of following types of faster counters:
>> - LFSR counter
>> - Johnson counter
>> - "RLA counter" (as tailored using Actel's SmartGen core generator)
>>
>> Johnson due to its O(2^n) (n as number of bits) can not be used for longer
>> counts; LFSR's are hard to invert (table lookup seems to be only known
>> method), therefore also impractical for wider counters. RLA counter is still
>> too slow and complex for wider counters and moderate speeds
>> (e.g. >24bits @ >100MHz).
>>
>> As a consequence, the proposed counter uses synchronous divide-by-two
>> blocks, each using 1-bit pipeline and carry by single-clock pulse. Design is
>> simple and fast, preliminary results from Synplify and Actel Designer shows
>> 32bits @200MHz feasible.
>>
>> However, output bit lines are non-proportionaly delayed by discrete number
>> of clock periods. Therefore, to obtain linear bit word, an inversion formula
>> needs to be applied. Fortunately, the inversion is simple (unlike LFSR's),
>> in C (pcount.c):
>>
>>   for (k = 1; k < n; k++)
>>     if ((y & ((1<<k)-1)) < k)
>>       y = y ^ (1<<k);
>>
>> -- it may be implemented in VHDL core, or within CPU as shown, depending on
>> application requirements.
>>
>> I am attaching design files & C language decoder/encoder of counter bit
>> words. If you have GHDL, you can run the test easily by issuing
>> "make testrun" or "make testvcd" to examine traces.
>>
>> ** My questions are: **
>> - does this design exists, is it being used, and if so, what is its name?
>> - if not, do you find the design useful?
>>
>>
>> Best regards,
>> Marek Peca
>>
> It's simple and elegant. If you only wanted a divide by 2^n
> count with a square wave output, you don't really need the inversion
> formula. It would also work with the initial 'en' input as a count
> enable, but that would mess up your inversion formula, as the carry
> propagates even when the input enable is false. In fact, for an
> n-bit version of this counter, it the input enable were off for
> n cycles (or more) the output would eventually settle on a normal
> binary pattern without the inversion.
>
> -- Gabor

I've seen this used before.  They added delay lines after the counter 
bits to produce a count output that is simple binary.  This was in a 
high speed network interface and the front end ran very fast relative to 
the now antiquated FPGA technology.  The actual circuit may not have 
been a counter, it may have been an adder, but it did have a carry chain.

In essence, this circuit is a pipelined, bit serial counter.  You still 
need to wait for all the bits to be counted or use the conversion formula.

Rick

Article: 154241
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: robotron <hefaistos@gmail.com>
Date: Wed, 12 Sep 2012 09:00:55 -0700 (PDT)

Dear Gabor,

On Wednesday, September 12, 2012 4:17:56 PM UTC+2, Gabor wrote:
> 
> Just to see if this has some application in Xilinx FPGA's I gave it
> a whirl in a Spartan 6.  For a 32-bit counter with registered inputs
> and only the final p and q going offchip (again with additional
> registers) the best I could do in the -3 speed grade was 425 MHz.
> The same size counter and architecture (including a carry out)
> using the built-in carry chain logic for a normal binary counter
> resulted in more than 470 MHz.  Looking through the timing numbers
> it appears that routing delays for this counter negate any help
> you might get by losing the carry chain (in Spartan 6).  I imagine
> it would be a win in a CPLD (if you have the extra macrocells
> for the 2x register count).  In the past I have used LFSR's for
> long counters in CPLD's - partly for speed, but mostly because
> of the reduced connectivity requirements.

It seems the dedicated carry logic is of real help there.
OK, it makes no sense for these FPGAs.

It is certainly *much* better on Actel/MicroSemi parts; we have already put it into our recent design.

Thank you very much for trying it.
Marek

Article: 154242
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: robotron <hefaistos@gmail.com>
Date: Wed, 12 Sep 2012 09:06:45 -0700 (PDT)

Hello,

On Wednesday, September 12, 2012 5:24:43 PM UTC+2, rickman wrote:
>
> I've seen this used before.  They added delay lines after the counter
> bits to produce a count output that is simple binary.  This was in a
> high speed network interface and the front end ran very fast relative to
> the now antiquated FPGA technology.  The actual circuit may not have
> been a counter, it may have been an adder, but it did have a carry chain.
>
> In essence, this circuit is a pipelined, bit serial counter.  You still
> need to wait for all the bits to be counted or use the conversion formula.

interesting!

1. *please*, could you find the original work?

2.
Actually, my initial design contained a bank of delay lines that
recovered plain binary counting (maybe unnecessary -- I must check
Gabor's post about desynchronizing bits to plain binary counting). The
drawback is that it needs O(n^2) DFFs if implemented using shift
registers. The k-th bit has to be delayed by (n-k) mod 2^k clock periods,
which is still O(n^2). In practice, this gives huge DFF counts for
32..64-bit counters.

The delays may also be implemented using embedded counters, since there
is no need to delay an arbitrary signal, only a pulse "p" (and then
divide by 2). Such a counter may be of ordinary architecture, because it
has only log(n) bits. However, all this seems too complicated to me:
resource usage may asymptotically drop to O(n log(n)), but in practice
it can hardly be less than with shift registers.

I must try Gabor's Verilog to see what happens.

Thank you very much,
Marek

Article: 154243
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: rickman <gnuarm@gmail.com>
Date: Wed, 12 Sep 2012 12:40:18 -0400

On 9/12/2012 12:06 PM, robotron wrote:
> Hello,
>
> On Wednesday, September 12, 2012 5:24:43 PM UTC+2, rickman wrote:
>>
>> I've seen this used before.  They added delay lines after the counter
>> bits to produce a count output that is simple binary.  This was in a
>> high speed network interface and the front end ran very fast relative to
>> the now antiquated FPGA technology.  The actual circuit may not have
>> been a counter, it may have been an adder, but it did have a carry chain.
>>
>> In essence, this circuit is a pipelined, bit serial counter.  You still
>> need to wait for all the bits to be counted or use the conversion formula.
>
> interesting!
>
> 1. *please*, could you find the original work?
>
> 2.
> Actually, my initial design contained a bank of delay lines that
> recovered plain binary counting (maybe unnecessary -- I must check
> Gabor's post about desynchronizing bits to plain binary counting). The
> drawback is that it needs O(n^2) DFFs if implemented using shift
> registers. The k-th bit has to be delayed by (n-k) mod 2^k clock periods,
> which is still O(n^2). In practice, this gives huge DFF counts for
> 32..64-bit counters.
>
> The delays may also be implemented using embedded counters, since there
> is no need to delay an arbitrary signal, only a pulse "p" (and then
> divide by 2). Such a counter may be of ordinary architecture, because it
> has only log(n) bits. However, all this seems too complicated to me:
> resource usage may asymptotically drop to O(n log(n)), but in practice
> it can hardly be less than with shift registers.
>
> I must try Gabor's Verilog to see what happens.
>
> Thank you very much,
> Marek

This would not have been published.  It was part of a design a colleague 
worked on and this just seemed like the obvious way to do it, but then 
once you have a solution it always seems obvious, no?

The delay lines can be large, but in this design the data was quickly 
culled and the data rates dropped significantly.

Have you thought about clumping bits by twos, threes or fours?  I'm not 
familiar with the logic family you are using, but if it has four input 
LUTs a grouping of three bits will give a carry out in one LUT with the 
same delay as one bit of your approach.  Then you can use a lot fewer 
delay line FFs even if it doesn't change the O(n^2) relationship.

I mean, it really comes down to the number of FFs needed.  If the 
constant is small enough and your n is not too large, all is good!

I'm not sure I understand how delay counters would work.  Each edge has 
to be delayed, but the edges will happen in less than the delay time. 
You would need to somehow pipeline the edges...

If the counter is free running, you really only need to phase each bit 
correctly.  The first bit is 0, 1, 0, 1... so there are only two phases, 
either one or no FFs to delay it rather than n-1.  The second bit has a 
pattern of four states so the delay is modulo 4 and can be 0, 1, 2 or 3 
rather than n-2.  Does that help?  It should take some of the sting out 
of a long counter.

I think even with an input enable if you route it to all FFs in parallel 
a modulo delay will still work ok.

Have you thought of switching to a device with a built in carry chain?

Rick
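
The effect of clumping bits on the delay-line cost can be estimated with a few lines of C. This is a rough sketch under assumptions that are not in the post: the carry is pipelined once per group of g bits, plain shift-register delays are used (no modulo trick), and every bit of stage j is delayed by the number of stages above it. The function name grouped_delay_ffs is hypothetical.

```c
#include <assert.h>

/* Delay-line flip-flops needed to realign an n-bit counter whose carry
   is pipelined once per group of g bits (assumed model, see above). */
static unsigned long grouped_delay_ffs(unsigned n, unsigned g)
{
    assert(n >= 1 && g >= 1);
    unsigned stages = (n + g - 1) / g;  /* the last stage may be partial */
    unsigned long total = 0;
    for (unsigned j = 0; j < stages; j++) {
        /* Bits held by stage j; only the top stage can be short. */
        unsigned bits = (j == stages - 1) ? n - j * g : g;
        /* Each bit waits one clock per stage above it. */
        total += (unsigned long)bits * (stages - 1 - j);
    }
    return total;
}
```

For n = 32 this gives 496 flip-flops with one bit per stage and 165 with three bits per stage: still quadratic in the number of stages, but with a much smaller constant.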

Article: 154244
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: robotron <hefaistos@gmail.com>
Date: Wed, 12 Sep 2012 09:46:15 -0700 (PDT)

On Wednesday, September 12, 2012 6:40:30 PM UTC+2, rickman wrote:
> (..)
> If the counter is free running, you really only need to phase each bit 
> correctly.  The first bit is 0, 1, 0, 1... so there are only two phases, 
> either one or no FFs to delay it rather than n-1.  The second bit has a 
> pattern of four states so the delay is modulo 4 and can be 0, 1, 2 or 3 
> rather than n-2.  Does that help?  It should take some of the sting out 
> of a long counter.

Yes, but not much. As I wrote, it's (n-k) mod 2^k.

> (..)
> Have you thought of switching to a device with a built in carry chain?

Yes and no. Currently, the Actel architecture suits our needs for
aerospace, radiation tolerant design quite well. Of course we will
port our design to other families, e.g. (much slower) Atmel and (more
recent) Spartan-6.

I only wanted to mention the idea, because it may be useful somewhere
again, in general.

Marek
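
The "(n-k) mod 2^k" figure can be checked numerically. The sketch below assumes k is 1-based (k = 1 is the LSB, with period 2^k) and that a full realignment delays bit k by n - k clocks; the function names bit_delay and total_delay_ffs are hypothetical.

```c
#include <assert.h>

/* Delay flip-flops for bit k (1-based) of an n-bit counter, n <= 64.
   A free-running bit with period 2^k only needs its delay modulo 2^k. */
static unsigned bit_delay(unsigned n, unsigned k, int free_running)
{
    assert(k >= 1 && k <= n && n <= 64);
    unsigned d = n - k;
    /* For k >= 32 the modulus 2^k exceeds any possible d, so skip it. */
    if (free_running && k < 32)
        d %= (1u << k);
    return d;
}

/* Sum of the per-bit delays over the whole counter. */
static unsigned total_delay_ffs(unsigned n, int free_running)
{
    unsigned total = 0;
    for (unsigned k = 1; k <= n; k++)
        total += bit_delay(n, k, free_running);
    return total;
}
```

For a 32-bit counter the total drops from 496 to 398 flip-flops, and for 64 bits from 2016 to 1758, which agrees with "Yes, but not much": only the lowest few bits benefit from the modulo.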

Article: 154245
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Gabor <gabor@szakacs.invalid>
Date: Wed, 12 Sep 2012 14:04:26 -0400

robotron wrote:
> Dear Gabor,
> 
> On Wednesday, September 12, 2012 4:17:56 PM UTC+2, Gabor wrote:
>> Just to see if this has some application in Xilinx FPGA's I gave it
>> a whirl in a Spartan 6.  For a 32-bit counter with registered inputs
>> and only the final p and q going offchip (again with additional
>> registers) the best I could do in the -3 speed grade was 425 MHz.
>> The same size counter and architecture (including a carry out)
>> using the built-in carry chain logic for a normal binary counter
>> resulted in more than 470 MHz.  Looking through the timing numbers
>> it appears that routing delays for this counter negate any help
>> you might get by losing the carry chain (in Spartan 6).  I imagine
>> it would be a win in a CPLD (if you have the extra macrocells
>> for the 2x register count).  In the past I have used LFSR's for
>> long counters in CPLD's - partly for speed, but mostly because
>> of the reduced connectivity requirements.
> 
> It seems the dedicated carry logic is of real help there.
> OK, it makes no sense for these FPGAs.
> 
> It is certainly *much* better on Actel/MicroSemi parts; we have already put it into our recent design.
> 
> Thank you very much for trying it.
> Marek

Yes, the Spartan 6 look-ahead carry logic is quite fast, and in addition
to the fast gates it has its own fast dedicated routing.  Still 425 MHz
is likely to be much faster than a typically achievable system speed
for all but the most carefully tuned designs.  And it turns out that
having two flip-flops plus a LUT matches the slice architecture of
the newer Xilinx parts, so the resource usage at the slice level is
not bad for the pcounter.  In fact if I get rid of the async reset
in my code, the pcounter and the standard carry-chain counter use
the same resources (8 slices for 32 bits).

Getting back to the inversion issue, it seems to me that if you
want to work back to a binary number and also use the enable input,
you would need to base the inversion on the p bits as well as
q bits, or else base it on the history of the en input as well
as the q bits.  Essentially, knowing the current p bit values
allows the software to finish the carry propagation.

-- Gabor
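
Finishing the carry propagation in software from both p and q might look like the C sketch below. It rests on an assumed stage model that is not taken from the posted VHDL or Verilog: q[k] toggles on its carry-in and p[k] registers q[k] AND carry-in. Under that assumption q + 2*p equals the number of enabled clocks (mod 2^n) after every clock, whatever the en pattern. The names pq_clock and pq_to_binary are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Low n bits set, for 1 <= n <= 64. */
static uint64_t pq_mask(unsigned n)
{
    assert(n >= 1 && n <= 64);
    return (n == 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/* One clock of the assumed model; en feeds the carry-in of stage 0. */
static void pq_clock(uint64_t *q, uint64_t *p, unsigned n, int en)
{
    uint64_t cin = ((*p << 1) | (en ? 1u : 0u)) & pq_mask(n);
    uint64_t q_old = *q;
    *q = q_old ^ cin;   /* toggle where a pulse arrives       */
    *p = q_old & cin;   /* pulse for the next stage, if any   */
}

/* Each pending pulse p[k] is worth one increment of bit k+1, so
   adding the shifted p word completes the carries in one step. */
static uint64_t pq_to_binary(uint64_t q, uint64_t p, unsigned n)
{
    return (q + (p << 1)) & pq_mask(n);
}
```

With this approach no history of the en input is needed; a single snapshot of p and q taken on the same clock edge is enough.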

Article: 154246
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Kolja Sulimma <ksulimma@googlemail.com>
Date: Wed, 12 Sep 2012 15:58:00 -0700 (PDT)

Am Dienstag, 11. September 2012 20:47:01 UTC+2 schrieb robotron:

>
> - does this design exists, is it being used, and if so, what is its name?

If I understand your description correctly it is a carry save accumulator:
http://en.wikipedia.org/wiki/Carry-save_adder

Quote: "To put it another way, we are taking a carry digit from the
position on our right, and passing a carry digit to the left, just as in
conventional addition; but the carry digit we pass to the left is the
result of the previous calculation and not the current one. In each clock
cycle, carries only have to move one step along, and not n steps as in
conventional addition."

Kolja Sulimma
www.cronologic.de
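
The quoted behavior can be sketched as a carry-save accumulator in C. This is a generic textbook-style model, not code from the pcounter project; the names csa_acc_t, csa_add and csa_value are made up here. Each clock applies one row of full adders, and the carries saved in one cycle are consumed one position to the left in the next.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t sum;    /* per-bit sum outputs              */
    uint32_t carry;  /* per-bit carry outputs, saved     */
} csa_acc_t;

/* One clock: add x without any carry rippling across bits. */
static void csa_add(csa_acc_t *a, uint32_t x)
{
    assert(a);
    uint32_t s = a->sum;
    uint32_t c = a->carry << 1;  /* saved carries move one step left */
    a->sum   = s ^ c ^ x;                     /* full-adder sum   */
    a->carry = (s & c) | (s & x) | (c & x);   /* full-adder carry */
}

/* Resolve the redundant form into a plain binary value (mod 2^32). */
static uint32_t csa_value(const csa_acc_t *a)
{
    assert(a);
    return a->sum + (a->carry << 1);
}
```

Feeding x = 1 on every clock turns this into an incrementer, which is the sense in which the proposed counter resembles a carry-save accumulator.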

Article: 154247
Subject: Re: New(?) fast binary counter for FPGAs without carry logic (e.g.
From: Brian Davis <brimdavis@aol.com>
Date: Wed, 12 Sep 2012 17:54:03 -0700 (PDT)

robotron wrote:
>
> > I've seen this used before.  They added delay lines after the
> > counter bits to produce a count output that is simple binary.
> <snip>
> 1. *please*, could you find the original work?
>
Similar sorts of carry pipelining were common in the
early Xilinx XC2000/3000 parts; I recall there being
some fast counter techniques in application notes and
Xcell journals of that era.

Pipelined carry chains at one or two bits per carry were
also commonly used for accumulators and counters in the
GHz clock rate GaAs standard cell designs that I
worked on in the early 90's.

The pictures from the following TriQuint patent show a
few variants of the input/output deskew trees that can
be implemented for delay equalization of a loadable
accumulator having carry pipelining:
 http://www.google.com/patents/US5140540

( disclaimer : I worked with some of the authors back
when I was doing a foundry design through TriQuint )

- Brian

Article: 154248
Subject: Re: Looking for an extremely cheap FPGA board (in quantity, academic use)
From: Jon Elson <jmelson@wustl.edu>
Date: Thu, 13 Sep 2012 14:11:26 -0500

rickman wrote:


> 
> If you build the Parallel Cable III into your product, what will you
> connect it to?  I don't think they have built a PC with a parallel port
> in a number of years and my understanding is that the drivers for USB
> parallel ports don't work properly with bit banging software like this.
>   Am I mistaken?  Is this a workable solution?
I have no idea about the USB-parport, but you CAN get computers with real
parallel ports, just not on laptops.  I use the parallel port for lots
of interconnect projects, and the Intel D525 chipset still has parport
support, if the motherboard maker chooses to bring it out.  For this
reason, we use a lot of Intel D525 (Atom) motherboards in projects.

Jon

Article: 154249
Subject: Re: Looking for an extremely cheap FPGA board (in quantity, academic use)
From: Mark Brehob <brehob@gmail.com>
Date: Sun, 16 Sep 2012 08:52:00 -0700 (PDT)

On Sep 10, 12:48 pm, rickman <gnu...@gmail.com> wrote:
> On 9/10/2012 9:56 AM, AMD...@gmail.com wrote:
>
> > Having been struggling with the MachXO2 for the past few months, I'd
> > suggest doing a thorough eval before jumping in with both feet.
>
> > While it has a lot of nifty advertised features, not all of them work
> > well, and some (i2c slave) are quite badly broken.  Documentation is
> > mostly okay, but very spread out and not indexed well, with some rough
> > spots.
>
> > While this is a fine introduction to the life of a working engineer, it
> > may get in the way of the concepts you're trying to teach :-).  The
> > basic IO, logic and ram functions seem to work as expected, these seem
> > to be mostly copies of previous generations.
>
> > Their no cost licenses are valid for one year, so they may also change
> > their policies on 'free' licensing or the included tool set at some
> > future point and leave you with no good way out.
>
> I've been down this road with Lattice before.  The license "expires" in
> a year but they can renew it indefinitely because there is no additional
> cost to them.  They can also license it to as many computers as you wish
> so don't worry about what happens when you replace that old laptop.  You
> won't get updates after the first year which can be a blessing actually.
>   If it ain't broke...
>
> I haven't tried the new Diamond software yet.
>
> Rick

Hmm, it's sounding like Lattice is going to be the way to go unless I
can get support from Altera or Xilinx (something I'll pursue if this
gets out of the "idea" stage).

Has anyone worked with Lattice Diamond?  How does it compare to the
software from Xilinx and Altera?  I've been burned by some "bad" FPGA
software in the past (not horrible, just too steep of a learning curve
for the classroom) and would _really_ like the ability to do schematic
capture in addition to Verilog...

Thanks again to everyone for their responses, this has been really
useful!
Mark
