Messages from 134325

Article: 134325
Subject: Re: Downsizing Verilog synthesization.
From: John McCaskill <jhmccaskill@gmail.com>
Date: Wed, 6 Aug 2008 07:21:59 -0700 (PDT)

On Aug 6, 8:40 am, eromlignod <eromlig...@aol.com> wrote:
> Hi guys:
>
> I'm prototyping an application using a Xilinx Spartan-3 development
> board.  I'm using this particular development kit because it is suited
> to the large amount of I/O I need.
>
> I'm new to FPGA, so I have written the code in Verilog using almost
> exclusively a high-level, behavioural style.  The program works, but
> synthesizes using 99% of the available slices.  So if I try to change
> or improve the code, it often synthesizes to over 100% and kicks out
> an error.
>
> I need to condense what I've got to give me some space to work with.
>
> The application is basically a large number of high-speed pulse
> inputs.  I count them all independently and average several readings
> over time for each to produce a 21-bit number.  Each of these 21-bit
> vectors (there are almost 100) is sent to a central processing module
> that evaluates and compares them using simple arithmetic.  Based on
> these comparisons, another set of vectors is sent on to a couple of
> modules that arrange them into a special synchronous serial output.
> That's all it does.
>
> Are there any standard tips or general guidelines that you might offer
> to condense my synthesis?  I have found, for example, that making the
> vectors smaller doesn't really change the overall slice count, yet
> commenting out a single line of the processing code can change it
> drastically.
>
> Any ideas or comments would be greatly appreciated.
>
> Don

Since you state that you run out of slices, I know that your design is
larger than the FPGA can hold, but I would still point out that
slice utilization is a pessimistic view of how much of the FPGA you
are using: by default, the mapping stage spreads the logic out instead
of packing it as tightly as possible.  The register and LUT
utilization is an optimistic measure of how much of the FPGA you have
left.  You need to watch all of them to get a good idea of how full
your design really is.

You mention both a high speed pulse counting section that counts and
averages over time, and then a processing section that sounds like it
is slower. How much slower is it?  If you can share resources over
time in this section you could save resources.

You can look in the reports to see how many adders, etc., the tools
inferred from your code.  Your goal is to reduce that number to the
minimum required to perform the comparisons.  You have a range of
options that depend on your constraints.  At one end of the spectrum,
just find any redundant calculations and rearrange your code to share
those calculations. At the other end, you could use a soft processor
such as a PicoBlaze to do the calculations in software.
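
A minimal sketch of sharing resources over time, assuming the
processing section can spend one clock per reading: the readings are
held in a small memory and a single comparator visits one of them on
each clock.  The module name shared_compare, its ports, and the single
threshold "limit" are hypothetical and are not taken from the design
discussed in this thread.

```verilog
// Sketch only: one comparator shared across N readings, one per clock.
// All names and parameters here are hypothetical.
module shared_compare #(
    parameter N  = 100,             // number of readings (thread says "almost 100")
    parameter W  = 21,              // width of each reading
    parameter AW = 7                // address width, must satisfy 2**AW >= N
) (
    input  wire          clk,
    input  wire          wr_en,     // write one reading into the table
    input  wire [AW-1:0] wr_addr,
    input  wire [W-1:0]  wr_data,
    input  wire [W-1:0]  limit,     // one threshold for all readings, for simplicity
    output reg  [N-1:0]  over       // over[i] is 1 when reading i exceeds limit
);
    // Table of readings; a synthesis tool may map this to distributed RAM.
    reg [W-1:0]  readings [0:N-1];
    reg [AW-1:0] idx = 0;           // reading examined on this clock

    always @(posedge clk) begin
        if (wr_en)
            readings[wr_addr] <= wr_data;
        // The only comparator in the module; it is reused on every clock.
        over[idx] <= (readings[idx] > limit);
        idx <= (idx == N-1) ? 0 : idx + 1'b1;
    end
endmodule
```

With N readings, each bit of "over" is refreshed once every N clocks,
so this only fits if the processing section can run that much slower
than the counters.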

Regards,

John McCaskill
www.FasterTechnology.com

Article: 134326
Subject: Re: Downsizing Verilog synthesization.
From: eromlignod <eromlignod@aol.com>
Date: Wed, 6 Aug 2008 07:28:06 -0700 (PDT)

On Aug 6, 8:56 am, Mike Treseler <mtrese...@gmail.com> wrote:
> eromlignod wrote:
> > The application is basically a large number of high-speed pulse
> > inputs.  I count them all independently and average several readings
> > over time for each to produce a 21-bit number.  Each of these 21-bit
> > vectors (there are almost 100) is sent to a central processing module
> > that evaluates and compares them using simple arithmetic.  Based on
> > these comparisons, another set of vectors is sent on to a couple of
> > modules that arrange them into a special synchronous serial output.
>
> Since the answer is shifted out in serial,
> maybe it could be constructed a bit at a time
> to save resources.
>
> > Are there any standard tips or general guidelines that you might offer
> > to condense my synthesis?
>
> A basic trade is time for gates.
> A serial CRC is slower, but requires fewer resources
> than the parallel version, for example.
>
>     -- Mike Treseler

Mike:

I'm intrigued by your answer, but don't fully understand what you
propose.  You say that I should construct my serial signal a bit at a
time, but how else can I?

My last serial generating module has a big 256-bit vector input that it
translates to a serial output that repeats the 256 bits over and
over.  The code is basically something like this:

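input shiftclock;
input [255:0] invector;
output reg serout;   // assigned in an always block, so it must be a reg
reg [7:0] x;         // bit index; wraps from 255 back to 0

always @(negedge shiftclock)
   begin
      // non-blocking assignments, as is usual for clocked logic
      x <= x + 1;
      serout <= invector[x];
   end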

I'll bet there's a better way.

Don

Article: 134327
Subject: Re: Downsizing Verilog synthesization.
From: eromlignod <eromlignod@aol.com>
Date: Wed, 6 Aug 2008 07:43:02 -0700 (PDT)

On Aug 6, 9:21 am, John McCaskill <jhmccask...@gmail.com> wrote:
> Since you state that you run out of slices, I know that your design is
> larger than the FPGA can hold, but I would still point out that the
> slice utilization is a pessimistic view of how much of the FPGA you
> are using, the mapping stage spreads the logic out by default instead
> of packing it as tightly as possible.  The Register and LUT
> utilization is an optimistic measure of how much of the FPGA you have
> left.  You need to watch all of them to get a good idea of how full
> your design really is.
>
> You mention both a high speed pulse counting section that counts and
> averages over time, and then a processing section that sounds like it
> is slower. How much slower is it?  If you can share resources over
> time in this section you could save resources.
>
> You can look in the reports to see how many adders, etc the tools
> inferred from your code.  Your goal is to reduce that number to the
> minimum required to perform the comparisons.  You have a range of
> options that depend on your constraints.  At one end of the spectrum,
> just find any redundant calculations and rearrange your code to share
> those calculations. At the other end, you could use a soft processor
> such as a PicoBlaze to do the calculations in software.
>
> Regards,
>
> John McCaskill
> www.FasterTechnology.com

What sorts of operations are the biggest gate-hogs?

I have a lot of comparison "if" operations, counters, and non-blocking
assignments to convert lots of inputs into usable arrays.  The
averagers each divide by 32 and I have another single divider toward
the end that divides by 256.  Other than that, I'm not doing anything
very fancy.  I have no multipliers (though I might like to add one),
no "for" loops, etc.

I do have a series of hard-coded standard values that I use for
comparison.  They are in the form of parameters that are fed to each
of the input counter modules when they are instantiated in the top
module.  I suppose these could be EPROM memories, but I haven't
figured out yet how to use the memory provided on the development
board.

Don

Article: 134328
Subject: processor clk and bus clk in edk
From: fmostafa <fatma.abouelella@ugent.be>
Date: Wed, 6 Aug 2008 08:13:23 -0700 (PDT)

Hi everybody,

What is the difference between the processor clock and the bus clock in
the EDK platform, and what is the relationship between these clocks and
the OPB clock?  Also, is the HWICAP clock the same as the OPB clock?

Thanks,
Fatma

Article: 134329
Subject: Re: Altera sues Zilog - signs of desperation from Programmable Vendor
From: KJ <kkjennings@sbcglobal.net>
Date: Wed, 6 Aug 2008 08:34:34 -0700 (PDT)

On Aug 6, 9:58 am, Jon Beniston <j...@beniston.com> wrote:
> > > 5V tolerant I/O with a 3.3V supply. Hardly the first company to do
> > > that. Maybe the actual circuit to implement it was novel.
> > A novel circuit (if that's the case) even to implement something that
> > is not functionally new is patentable...you don't agree?
>
> No. Being novel is not the only requirement.
>

True, there are other requirements.  Rather than saying 'is
patentable' I should have said 'could be patentable'.

> > > > I have not reviewed any of these patents so I'm not sure what the actual
> > > > patent claims are.
>
> > Usually people don't boast about their lack of knowledge due to their
> > own lack of effort to review the publicly available material...in a
> > public forum no less.
>
> You might want to check who you have quoted, especially given your
> next comment.
>

Sorry about that, I apologize.

Kevin Jennings

Article: 134330
Subject: Re: Downsizing Verilog synthesization.
From: Gabor <gabor@alacron.com>
Date: Wed, 6 Aug 2008 08:49:59 -0700 (PDT)

On Aug 6, 10:43 am, eromlignod <eromlig...@aol.com> wrote:
> What sorts of operations are the biggest gate-hogs?
>
> I have a lot of comparison "if" operations, counters, and non-blocking
> assignments to convert lots of inputs into usable arrays.  The
> averagers each divide by 32 and I have another single divider toward
> the end that divides by 256.  Other than that, I'm not doing anything
> very fancy.  I have no multipliers (though I might like to add one),
> no "for" loops, etc.
>
> I do have a series of hard-coded standard values that I use for
> comparison.  They are in the form of parameters that are fed to each
> of the input counter modules when they are instantiated in the top
> module.  I suppose these could be EPROM memories, but I haven't
> figured out yet how to use the memory provided on the development
> board.
>
> Don

What tools are you using for synthesis?  If it is ISE / XST (WebPACK or
Foundation from Xilinx), which version?

Things like divide by power of two should take no resources whatever
(i.e. shift operators are basically wires).  However a synthesis tool
may look at the division operator and think you need a divider, which
will take a lot of logic.
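
As a small illustration of the point above, assuming an unsigned
running sum of 32 readings: dividing by 32 can be written as a
bit-select, which leaves nothing for the synthesis tool to misread as
a divider.  The names avg32, sum, and avg are hypothetical.

```verilog
// Sketch only: average of 32 accumulated readings without a divider.
// Names are hypothetical; sum is assumed to be unsigned.
module avg32 #(
    parameter W = 21                // width of one reading
) (
    input  wire [W+4:0] sum,        // sum of 32 readings needs 5 extra bits
    output wire [W-1:0] avg
);
    // For an unsigned value, sum / 32 is just the upper W bits of sum
    // (the same as sum >> 5, rounding down): wiring only, no logic.
    assign avg = sum[W+4:5];
endmodule
```

The same idea applies to the divide by 256 mentioned earlier, using
the bits above bit 7 instead.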

Also since you seem to be register-heavy, see where you can
use serial shift registers or memory instead of loose flip-flops.
In Spartan 3 you get 16 stages of serial shift register or 16
bits of distributed RAM from a single LUT site.  Coding shift
registers without a reset term allows the synthesizer to place
them in these structures instead of flip-flops (which come
one to a LUT site).
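
A sketch of a shift register coded without a reset, in the form that
a synthesis tool can typically map onto the LUT-based shift register
rather than 16 separate flip-flops.  The name delay16 and its ports
are hypothetical, and whether the mapping actually happens depends on
the tool and its settings.

```verilog
// Sketch only: 16-stage, 1-bit delay line with clock enable and no reset.
// Names are hypothetical.
module delay16 (
    input  wire clk,
    input  wire ce,     // shift only when ce is high
    input  wire d,      // serial input
    output wire q       // d delayed by 16 enabled clocks
);
    // Power-up value only; there is deliberately no reset branch below.
    reg [15:0] sr = 16'b0;

    always @(posedge clk)
        if (ce)
            sr <= {sr[14:0], d};

    assign q = sr[15];
endmodule
```

Adding a synchronous or asynchronous reset to sr would normally push
the tool back to discrete flip-flops, since the LUT shift register
has no reset input.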

Did you look at your map report or "design summary"?  In
the latest version of ISE the design summary can show you
where your largest resource allocations come from.

Regards,
Gabor

Article: 134331
Subject: Re: Downsizing Verilog synthesization.
From: eromlignod <eromlignod@aol.com>
Date: Wed, 6 Aug 2008 09:24:48 -0700 (PDT)

On Aug 6, 10:49 am, Gabor <ga...@alacron.com> wrote:
> What tools are you using for synthesis?  If ISE / XST (webpack or
> foundation from Xilinx) which version?
>
> Things like divide by power of two should take no resources whatever
> (i.e. shift operators are basically wires).  However a synthesis tool
> may look at the division operator and think you need a divider, which
> will take a lot of logic.
>
> Also since you seem to be register-heavy, see where you can
> use serial shift registers or memory instead of loose flip-flops.
> In Spartan 3 you get 16 stages of serial shift register or 16
> bits of distributed RAM from a single LUT site.  Coding shift
> registers without a reset term allows the synthesizer to place
> them in these structures instead of flip-flops (which come
> one to a LUT site).
>
> Did you look at your map report or "design summary"?  In
> the latest version of ISE the design summary can show you
> where your largest resource allocations come from.
>
> Regards,
> Gabor


Interesting.

Thanks Gabor!  This may be very useful.  I have a large number of
8-bit vectors in my design.  I have about 220 of them passing from one
module to another.  They each begin as an "output reg [7:0]" in one
module and are all assigned to an array in the other module like this.

reg [7:0] array [219:0];
...
y[0] <= array[0];
y[1] <= array[1];
y[2] <= array[3];
...etc.

Is this bad form?

Don

Article: 134332
Subject: Re: Downsizing Verilog synthesization.
From: eromlignod <eromlignod@aol.com>
Date: Wed, 6 Aug 2008 09:31:19 -0700 (PDT)

On Aug 6, 11:24 am, eromlignod <eromlig...@aol.com> wrote:
> Interesting.
>
> Thanks Gabor!  This may be very useful.  I have a large number of
> 8-bit vectors in my design.  I have about 220 of them passing from one
> module to another.  They each begin as an "output reg [7:0]" in one
> module and are all assigned to an array in the other module like this.
>
> reg [7:0] array [219:0];
> ...
> y[0] <= array[0];
> y[1] <= array[1];
> y[2] <= array[3];
> ...etc.
>
> Is this bad form?
>
> Don

Oops.  I meant for that code to be:

input [7:0] y0;
input [7:0] y1;
...
reg [7:0] array [219:0];
...
array[0] <= y0;
array[1] <= y1;
...etc.


Don

Article: 134333
Subject: Re: Downsizing Verilog synthesization.
From: Mike Treseler <mtreseler@gmail.com>
Date: Wed, 06 Aug 2008 09:41:07 -0700

eromlignod wrote:

> Mike:
> 
> I'm intrigued by your answer, but don't fully understand what you
> propose.  You say that I should construct my serial signal a bit at a
> time, but how else can I?

I meant to suggest arranging some sort of pipeline
to work on the math while you are shifting the answer out.

        -- Mike Treseler

Article: 134334
Subject: Re: Downsizing Verilog synthesization.
From: Joseph Samson <user@not.my.company>
Date: Wed, 06 Aug 2008 12:43:17 -0400

eromlignod wrote:
> Hi guys:
> 
> I'm prototyping an application using a Xilinx Spartan-3 development
> board.  I'm using this particular development kit because it is suited
> to the large amount of I/O I need.
> 
> I'm new to FPGA, so I have written the code in Verilog using almost
> exclusively a high-level, behavioural style.  The program works, but
> synthesizes using 99% of the available slices.  So if I try to change
> or improve the code, it often synthesizes to over 100% and kicks out
> an error.

Are you talking about synthesis or place and route? It's not an error 
(at least through 9.1i) to synthesize to more logic than is available in 
a part. My current design synthesizes to 105% of device resources. 
Mapping optimizes the design and gets rid of redundant or unused logic.

It's also not unusual to have 99% of the slices occupied. The tools 
prefer to spread the design over as many slices as possible.

The map report will show how many resources are really used (LUTs, FF, 
BRAM, Clocks, MULT.....).

---
Joe Samson

Article: 134335
Subject: Re: Downsizing Verilog synthesization.
From: John McCaskill <jhmccaskill@gmail.com>
Date: Wed, 6 Aug 2008 09:50:22 -0700 (PDT)

On Aug 6, 11:31 am, eromlignod <eromlig...@aol.com> wrote:
> On Aug 6, 11:24=A0am, eromlignod <eromlig...@aol.com> wrote:
>
>
>
> > On Aug 6, 10:49=A0am, Gabor <ga...@alacron.com> wrote:
>
> > > On Aug 6, 10:43 am, eromlignod <eromlig...@aol.com> wrote:
>
> > > > On Aug 6, 9:21 am, John McCaskill <jhmccask...@gmail.com> wrote:
>
> > > > > On Aug 6, 8:40 am, eromlignod <eromlig...@aol.com> wrote:
>
> > > > > > Hi guys:
>
> > > > > > I'm prototyping an application using a Xilinx Spartan-3 develop=
ment
> > > > > > board. =A0I'm using this particular development kit because it =
is suited
> > > > > > to the large amount of I/O I need.
>
> > > > > > I'm new to FPGA, so I have written the code in Verilog using al=
most
> > > > > > exclusively a high-level, behavioural style. =A0The program wor=
ks, but
> > > > > > synthesizes using 99% of the available slices. =A0So if I try t=
o change
> > > > > > or improve the code, it often synthesizes to over 100% and kick=
s out
> > > > > > an error.
>
> > > > > > I need to condense what I've got to give me some space to work =
with.
>
> > > > > > The application is basically a large number of high-speed pulse
> > > > > > inputs. =A0I count them all independently and average several r=
eadings
> > > > > > over time for each to produce a 21-bit number. =A0Each of these=
 21-bit
> > > > > > vectors (there are almost 100) is sent to a central processing =
module
> > > > > > that evaluates and compares them using simple arithmetic. =A0Ba=
sed on
> > > > > > these comparisons, another set of vectors is sent on to a coupl=
e of
> > > > > > modules that arrange them into a special synchronous serial out=
put.
> > > > > > That's all it does.
>
> > > > > > Are there any standard tips or general guidelines that you migh=
t offer
> > > > > > to condense my synthesis? =A0I have found, for example, that ma=
king the
> > > > > > vectors smaller doesn't really change the overall slice count, =
yet
> > > > > > commenting out a single line of the processing code can change =
it
> > > > > > drastically.
>
> > > > > > Any ideas or comments would be greatly appreciated.
>
> > > > > > Don
>
> > > > > Since you state that you run out of slices, I know that your desi=
gn is
> > > > > larger than the FPGA can hold, but I would still point out that t=
he
> > > > > slice utilization is a pessimistic view of how much of the FPGA y=
ou
> > > > > are using, the mapping stage spreads the logic out by default ins=
tead
> > > > > of packing it as tightly as possible. =A0The Register and LUT
> > > > > utilization is an optimistic measure of how much of the FPGA you =
have
> > > > > left. =A0You need to watch all of them to get a good idea of how =
full
> > > > > your design really is.
>
> > > > > > You mention both a high speed pulse counting section that counts and
> > > > > > averages over time, and then a processing section that sounds like it
> > > > > > is slower. How much slower is it?  If you can share resources over
> > > > > > time in this section you could save resources.
>
> > > > > > You can look in the reports to see how many adders, etc the tools
> > > > > > inferred from your code.  Your goal is to reduce that number to the
> > > > > > minimum required to perform the comparisons.  You have a range of
> > > > > > options that depend on your constraints.  At one end of the spectrum,
> > > > > > just find any redundant calculations and rearrange your code to share
> > > > > > those calculations. At the other end, you could use a soft processor
> > > > > > such as a PicoBlaze to do the calculations in software.
>
> > > > > Regards,
>
> > > > > > John McCaskill
> > > > > > www.FasterTechnology.com
>
> > > > What sorts of operations are the biggest gate-hogs?
>
> > > > > I have a lot of comparison "if" operations, counters, and non-blocking
> > > > > assignments to convert lots of inputs into usable arrays.  The
> > > > > averagers each divide by 32 and I have another single divider toward
> > > > > the end that divides by 256.  Other than that, I'm not doing anything
> > > > > very fancy.  I have no multipliers (though I might like to add one),
> > > > > no "for" loops, etc.
>
> > > > > I do have a series of hard-coded standard values that I use for
> > > > > comparison.  They are in the form of parameters that are fed to each
> > > > > of the input counter modules when they are instantiated in the top
> > > > > module.  I suppose these could be EPROM memories, but I haven't
> > > > > figured out yet how to use the memory provided on the development
> > > > > board.
>
> > > > Don
>
> > > > What tools are you using for synthesis?  If ISE / XST (webpack or
> > > > foundation from Xilinx) which version?
>
> > > > Things like divide by power of two should take no resources whatever
> > > > (i.e. shift operators are basically wires).  However a synthesis tool
> > > > may look at the division operator and think you need a divider, which
> > > > will take a lot of logic.
>
> > > Also since you seem to be register-heavy, see where you can
> > > use serial shift registers or memory instead of loose flip-flops.
> > > > In Spartan 3 you get 16 stages of serial shift register or 16
> > > > bits of distributed RAM from a single LUT site.  Coding shift
> > > registers without a reset term allows the synthesizer to place
> > > them in these structures instead of flip-flops (which come
> > > one to a LUT site).
>
> > > > Did you look at your map report or "design summary"?  In
> > > the latest version of ISE the design summary can show you
> > > where your largest resource allocations come from.
>
> > > Regards,
> > > > Gabor
>
> > Interesting.
>
> > > Thanks Gabor!  This may be very useful.  I have a large number of 8-bit
> > > vectors in my design.  I have about 220 of them passing from one
> > > module to another.  They each begin as an "output reg [7:0]" in one
> > > module and are all assigned to an array in the other module like this.
>
> > reg [7:0] array [219:0];
> > ...
> > > y[0] <= array[0];
> > > y[1] <= array[1];
> > > y[2] <= array[3];
> > ...etc.
>
> > Is this bad form?
>
> > > Don
>
> Oops.  I meant for that code to be:
>
> input [7:0] y0;
> input [7:0] y1;
> ...
> reg [7:0] array [219:0];
> ...
> array[0] <= y0;
> array[1] <= y1;
> ...etc.
>
> Don

If you can map this onto a block ram, you will save quite a bit of
registers. Whether or not you can do this depends on if you can write
the vectors in one (or a few) at a time, and process them sequentially
in the time you have available.  How much time do you have to process
the vectors? Ns, us, ms ?
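
As a rough sketch of that idea (not code from this thread): if the 220
vectors can arrive one per clock over a shared 8-bit bus, a memory
written and read as below is the usual pattern that synthesis tools can
map onto a block RAM. The module name, the port names and the 256-entry
depth are all assumptions made for illustration.

```verilog
// Hypothetical module and port names; a sketch, not the poster's design.
module vec_store (
    input  wire       clk,
    input  wire       we,      // write strobe: one vector per clock
    input  wire [7:0] waddr,   // which of the 220 vectors to update
    input  wire [7:0] wdata,   // the 8-bit vector being written
    input  wire [7:0] raddr,   // address stepped by the processing logic
    output reg  [7:0] rdata    // read data, valid one clock after raddr
);
    // 256 x 8 memory; entries 220..255 are simply unused.
    reg [7:0] mem [0:255];

    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;
        // The registered (synchronous) read is what allows block RAM
        // inference; an asynchronous read would force LUT RAM or registers.
        rdata <= mem[raddr];
    end
endmodule
```

The processing module would then step raddr through the entries it needs
instead of seeing all 220 vectors at once, which is only workable if the
available processing time allows it.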

Regards,

John McCaskill
www.FasterTechnology.com

Article: 134336
Subject: Re: Downsizing Verilog synthesization.
From: eromlignod <eromlignod@aol.com>
Date: Wed, 6 Aug 2008 10:31:57 -0700 (PDT)
Links: << >> << T >> << A >>

On Aug 6, 11:50 am, John McCaskill <jhmccask...@gmail.com> wrote:
> If you can map this onto a block ram, you will save quite a bit of
> registers. Whether or not you can do this depends on if you can write
> the vectors in one (or a few) at a time, and process them sequentially
> in the time you have available.  How much time do you have to process
> the vectors? Ns, us, ms ?

Ah, I think I'm following along now.  Are you talking about sending
the numbers over a single 8-bit vector wire one-at-a-time?  Hmmm.

The vectors are actually independent from each other and refresh at
various random rates, so a few usec here or there shouldn't make a
difference.  I'll give it a try!

Don

Article: 134337
Subject: Re: processor clk and bus clk in edk
From: Jon Beniston <jon@beniston.com>
Date: Wed, 6 Aug 2008 10:34:06 -0700 (PDT)
Links: << >> << T >> << A >>

> what is the difference between the processor clk and the bus clock in
> edk platform,

It is up to you - they can be the same or different. The processor
clock determines the speed instructions execute at, the bus clock
determines the speed of bus transactions. The processor clock can
often be faster than the bus clock (E.g. CPU at 300MHz, bus at
100MHz), particularly if you are using a PowerPC.

Jon

Article: 134338
Subject: Re: Downsizing Verilog synthesization.
From: eromlignod <eromlignod@aol.com>
Date: Wed, 6 Aug 2008 10:36:03 -0700 (PDT)
Links: << >> << T >> << A >>

On Aug 6, 11:43 am, Joseph Samson <u...@not.my.company> wrote:
> Are you talking about synthesis or place and route? It's not an error
> (at least through 9.1i) to synthesize to more logic than is available in
> a part. My current design synthesizes to 105% of device resources.
> Mapping optimizes the design and gets rid of redundant or unused logic.
>
> It's also not unusual to have 99% of the slices occupied. The tools
> prefer to spread the design over as many slices as possible.
>
> The map report will show how many resources are really used (LUTs, FF,
> BRAM, Clocks, MULT.....).

Right, I meant when I process all the way to place & route, not just
synthesis.  It is funny that it is at 99%.  In fact, it shows that all
but two of the slices are used (!!).

It does list usage as 76% related logic and 24% unrelated logic.  I'm
not sure how to remedy that.

Don

Article: 134339
Subject: Re: Downsizing Verilog synthesization.
From: John_H <newsgroup@johnhandwork.com>
Date: Wed, 6 Aug 2008 12:06:56 -0700 (PDT)
Links: << >> << T >> << A >>

On Aug 6, 10:36 am, eromlignod <eromlig...@aol.com> wrote:
<snip>
> Right, I meant when I process all the way to place & route, not just
> synthesis.  It is funny that it is at 99%.  In fact, it shows that all
> but two of the slices are used (!!).
>
> It does list usage as 76% related logic and 24% unrelated logic.  I'm
> not sure how to remedy that.
>
> Don

The Xilinx mapper will spread logic out with one LUT per slice until
it fills to nearly 100% of the slices then it will backfill the 2nd
LUT in each slice where conditions permit.  There's usually a good
stretch between 99% and 101%.

If you explained better what you're trying to do (signal quantity,
what the counts represent, frequencies of the signals and the system
clock) you might get better suggestions on how to code things.  Right
now most of us are taking stabs in the dark.

Article: 134340
Subject: Re: cpu,fpga, clock, dac, initialize sequence
From: "MM" <mbmsv@yahoo.com>
Date: Wed, 6 Aug 2008 16:43:09 -0400
Links: << >> << T >> << A >>

> what's the point here?

The point is that the embedded programmer has to read the hardware manual 
first and/or talk to the hardware designer. Asking generic questions here 
won't help!


/Mikhail

Article: 134341
Subject: Re: Altera sues Zilog - signs of desperation from Programmable Vendor
From: Jim Granville <no.spam@designtools.maps.co.nz>
Date: Thu, 07 Aug 2008 11:06:01 +1200
Links: << >> << T >> << A >>

Jon Beniston wrote:
>>There is a fairly well argued rant against patents (or at least the USA's
>>interpretation of patents) at http://www.embeddedtechjournal.com/articles_2008/20080729_patent.htm
>>
>>I'm not sure that the idea of patents is completely broken, but it
>>certainly needs major reform so that innovation is reasonably protected and
>>rewarded. A copiers' free-for-all would also be bad, IMHO.
> 
> 
> It seems far too many companies seem to think that anything new they
> do is patentable. In the UK there is a requirement that the invention
> must also not be obvious to anyone with experience in the field. 

I thought that was fairly global ?

Of course, patent attorneys have no motivation (or skill)
to apply this test, (nor indeed of researching prior art
any further than google) they are far more interested in
chargeable work, like researching other claims, and writing
up the patent itself.
A patent is merely a license to litigate, and if
it IS contested, guess who gets income ?

> If the same problem that is solved by many inventions in recent patents
> were to be given to 10 other engineers, I'm fairly sure some of them
> would come up with similar or better solutions. If that is the case,
> then I don't think an invention should be patentable. If you solve
> something that others have struggled with, then maybe you have a case.

It is very rare to see a patent that does get above the 'obvious'
and 'prior art' thresholds.

-jg

Article: 134342
Subject: Re: Downsizing Verilog synthesization.
From: Jim Granville <no.spam@designtools.maps.co.nz>
Date: Thu, 07 Aug 2008 11:27:51 +1200
Links: << >> << T >> << A >>

eromlignod wrote:

> Hi guys:
> 
> I'm prototyping an application using a Xilinx Spartan-3 development
> board.  I'm using this particular development kit because it is suited
> to the large amount of I/O I need.
> 
> I'm new to FPGA, so I have written the code in Verilog using almost
> exclusively a high-level, behavioural style.  The program works, but
> synthesizes using 99% of the available slices.  So if I try to change
> or improve the code, it often synthesizes to over 100% and kicks out
> an error.
> 
> I need to condense what I've got to give me some space to work with.
> 
> The application is basically a large number of high-speed pulse
> inputs.

Define 'high speed', and what timebase reading rate ?
What are the pulses coming from ?

To some, microseconds is high speed, to others, femtoseconds is high 
speed.....

> I count them all independently and average several readings
> over time for each to produce a 21-bit number.  

If you mean several readings from the same channel, a longer
count time will do that for free.

Are you reading frequency? (fixed time readout of the counters)

> Each of these 21-bit
> vectors (there are almost 100) is sent to a central processing module
> that evaluates and compares them using simple arithmetic.  Based on
> these comparisons, another set of vectors is sent on to a couple of
> modules that arrange them into a special synchronous serial output.
> That's all it does.

What sort of comparison, and what decision rates are you talking ?
Is that processing software, or hardware ?

Do you need 21 bits of precision, or just 21 bits of dynamic range ?

A quasi-log counter bus would drop the fan-out
(so a 13 bit MSB field and a 3 bit exponent would mux on 16 bit
data paths - 76% of the mux logic right there).
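
One way such a quasi-log bus might be built is sketched below. This is
an illustration only, not code from the thread: the module and signal
names are made up, and the encoding (count is approximately
mant << expo) is an assumption. With a plain 13-bit field and a shift
of at most 7, the largest representable value is just under 2^20, so
this sketch saturates for counts that have bit 20 set.

```verilog
// Hypothetical packer: 21-bit count -> 13-bit mantissa + 3-bit exponent.
// Assumed encoding: count is approximately (mant << expo).
module qlog_pack (
    input  wire [20:0] count,
    output reg  [12:0] mant,
    output reg  [2:0]  expo
);
    always @* begin
        // Find the highest set bit and keep the 13 bits starting there.
        casez (count[20:13])
            8'b1???????: begin expo = 3'd7; mant = 13'h1FFF;    end // saturate
            8'b01??????: begin expo = 3'd7; mant = count[19:7]; end
            8'b001?????: begin expo = 3'd6; mant = count[18:6]; end
            8'b0001????: begin expo = 3'd5; mant = count[17:5]; end
            8'b00001???: begin expo = 3'd4; mant = count[16:4]; end
            8'b000001??: begin expo = 3'd3; mant = count[15:3]; end
            8'b0000001?: begin expo = 3'd2; mant = count[14:2]; end
            8'b00000001: begin expo = 3'd1; mant = count[13:1]; end
            default:     begin expo = 3'd0; mant = count[12:0]; end
        endcase
    end
endmodule
```

Whether the lost low-order bits matter depends on the answer to the
precision versus dynamic range question above.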

-jg

Article: 134343
Subject: Re: double precision floating point alignment issues with xilkernel
From: "Vasanth Asokan" <vasanth@xilinx.com>
Date: Wed, 6 Aug 2008 16:55:44 -0700
Links: << >> << T >> << A >>


"Matthias Alles" <REMOVEallesCAPITALS@NOeit.SPAMuni-kl.de> wrote in message 
news:g6rmt5$ef7$1@news.uni-kl.de...
> Thanks for the information. That's good to know! I didn't think about
> that at all..
>
> Are there any plans to include FPU support into xilkernel for a future
> release?

Yes we do plan to add the support. I don't have a hard date for it though :(

Article: 134344
Subject: Re: Downsizing Verilog synthesization.
From: John McCaskill <jhmccaskill@gmail.com>
Date: Wed, 6 Aug 2008 18:05:57 -0700 (PDT)
Links: << >> << T >> << A >>

On Aug 6, 12:31 pm, eromlignod <eromlig...@aol.com> wrote:
> On Aug 6, 11:50 am, John McCaskill <jhmccask...@gmail.com> wrote:
>
> > If you can map this onto a block ram, you will save quite a bit of
> > registers. Whether or not you can do this depends on if you can write
> > the vectors in one (or a few) at a time, and process them sequentially
> > in the time you have available.  How much time do you have to process
> > the vectors? Ns, us, ms ?
>
> Ah, I think I'm following along now.  Are you talking about sending
> the numbers over a single 8-bit vector wire one-at-a-time?  Hmmm.
>
> The vectors are actually independent from each other and refresh at
> various random rates, so a few usec here or there shouldn't make a
> difference.  I'll give it a try!
>
> Don

You are asking good questions, so there are multiple people here that
will be happy to help you out. However, you are asking for some low
level suggestions without giving enough high level detail.  The best
optimizations are the ones that you apply at the high level where you
have the most leverage.

If you can tell us more about what you are trying to do you will get
better responses. You said that you have almost 100 high speed
channels.

How many channels are there?
How fast are the pulses arriving on average?
Over what time is the average?
What is the air speed of an unladen swallow?
What is the minimum spacing between pulses?
How fast does your central processing module need to compare the
channels?

As Jim Granville pointed out, the various time bases of your problem
have a major impact on the potential solutions.

Regards,

John McCaskill
www.FasterTechnology.com

Article: 134345
Subject: Re: Schematic Capture tutorials/books?
From: laserbeak43 <laserbeak43@gmail.com>
Date: Wed, 6 Aug 2008 22:41:15 -0700 (PDT)
Links: << >> << T >> << A >>

Very cool looking tutorial! Thanks!
Will i need the files included or can i look at the images in the pdf
and complete it?

thanks!!

On Aug 6, 4:25 am, Herbert Kleebauer <k...@unibwm.de> wrote:
> laserbeak43 wrote:
>
> > thanks for the offer, but the links seem to be dead, and the only
> > version
> > of ISE that truly works for me, is 7.1i(i haven't tried anything
> > earlier than that, though)
> > although, i'm sure i could easily change things around to get it to
> > work in 7.
>
> Sorry:
>
> ftp://137.193.64.130/pub/mproz/mproz3_e.pdf
> ftp://137.193.64.130/pub/mproz/mproz3.zip
>
> Or here a http mirror:
>
> http://www.ikomi.de/pub/
>
> > On Aug 6, 4:07 am, Herbert Kleebauer <k...@unibwm.de> wrote:
> > > laserbeak43 wrote:
> > > > Hmm, so i've heard. Everyone says Xilinx stuff is bad for beginners
> > > > and i must admit,
> > > > I've been doing nothing but troubleshooting ever since i got this
> > > > board, what a headache.
>
> > > Take a look at:
>
> > > ftp://137.193.164.130/pub/mproz/mproz3_e.pdf
> > > ftp://137.193.164.130/pub...
>
> > > It's a step by step introduction for a simple design
> > > (ISE 9.2, Spartan3E) using schematic entry.

Article: 134346
Subject: ISE 8.1i sp3: map is not recognized as an internal or external
From: laserbeak43 <laserbeak43@gmail.com>
Date: Wed, 6 Aug 2008 23:46:32 -0700 (PDT)
Links: << >> << T >> << A >>

hi,
I keep getting this error in the implement and design phase with ISE:
'C:\MinGW\include\C__~1\342EBD~1.5\map' is not recognized as an
internal or external command,
operable program or batch file.
I have MinGW installed but that dir doesn't exist. although i do know
what dir it's looking for.
Is anyone familiar with this error? can someone please help with this?

Thanks
Malik

Article: 134347
Subject: Re: Schematic Capture tutorials/books?
From: Herbert Kleebauer <klee@unibwm.de>
Date: Thu, 07 Aug 2008 09:52:19 +0200
Links: << >> << T >> << A >>

laserbeak43 wrote:
> 
> Very cool looking tutorial! Thanks!
> Will i need the files included or can i look at the images in the pdf
> and complete it?

If you want to do it yourself, then don't even look at the schematics
in the pdf but only at the specification of the processor architecture
(page 1-3 of the pdf, not page 4 because this already shows the solution).

But I think you first want to get used to the Xilinx software (schematic
entry, simulation and implementation). Therefore I would suggest to
use the provided schematics and follow the step-by-step tutorial and 
simulate the provided simple assembly program. If all works, delete
all logic (gates and flip-flops) from the lower level schematics (that's 
the starting point for our students) and redesign the CPU yourself.

But if you want to do all yourself from scratch, you will need at
least the assembler from the "addon" directory. If you worry about
the executable, just compile the provided C sources yourself:

gcc -O3 -o xdela xdela.c

Article: 134348
Subject: how to change the system clk in EDK project
From: fmostafa <fatma.abouelella@ugent.be>
Date: Thu, 7 Aug 2008 01:40:06 -0700 (PDT)
Links: << >> << T >> << A >>

hi all;

I have a question about the system clk in an EDK project. My processor
clk is 100 MHz and my bus clk is 25 MHz. I tried to increase the bus
clk to 50 MHz by changing the DCM: I set the CLKDV divisor to 2
instead of 4, which I thought would give (100/2) instead of (100/4).
I then cleaned the hardware and regenerated the netlist and the
bitstream, but now the UART is not working properly, and of course I
couldn't test the rest of the system.
I don't know if what I did is right, if something is missing, or if
there is another way to change the frequency.


thanks
fatma

Newsgroups: comp.arch.fpga
Subject: RTL Schematic as EDIF
From: Johann Glaser <glaser@ict.tuwien.ac.at>
Date: Thu, 07 Aug 2008 11:29:43 +0200

Hi!

My PhD thesis deals with coarse-grained reconfigurable logic. Therefore
the RTL schematic synthesis result is one major input for my work.

I tried Xilinx ISE 10.1 as well as Synplicity Synplify Pro 9.2. Both
tools provide this RTL netlist (before implementing it to the technology
netlist), but both in encrypted file formats.

Xilinx ISE 10.1 saves the file as NGR file. Unfortunately there is no
ngr2edif tool provided (while an ngc2edif is available). 

Synplicity Synplify Pro 9.2 saves a SRS file and provides an edf2srs
tool, but no reverse.

Could you please point me to tools which can convert these file formats
to open formats (especially EDIF) or to synthesis tools (not necessarily
for FPGA, a tool from an ASIC flow is ok too), which save the RTL
schematic as open file formats.

Thanks
  Hansi

-- 
Johann Glaser                          <glaser@ict.tuwien.ac.at>
             Institute of Computer Technology, E384
Vienna University of Technology, Gusshausstr. 27-29, A-1040 Wien
Phone: ++43/1/58801-38444                Fax: ++43/1/58801-38499

Article: 134349
Subject: Re: Downsizing Verilog synthesization.
From: "Nial Stewart" <nial*REMOVE_THIS*@nialstewartdevelopments.co.uk>
Date: Thu, 7 Aug 2008 12:13:30 +0100
Links: << >> << T >> << A >>

> I'm intrigued by your answer, but don't fully understand what you
> propose.

You need to have a better understanding of what's generated by your code.
Remember you're describing hardware.

> My last serial generating module has a big 256 vector input that it is
> translating to a serial output that repeats the 256 bits over and
> over.  The code is basically something like this:
> input [255:0] invector;
> output serout;
> reg [7:0] x;
> always @(negedge shiftclock)
>    begin
>       x = x + 1;
>       serout = invector[x];
>    end

That (probably) creates a 256 bit vector and a massive mux to select
one of the bits.

In VHDL the following generates a big shift register which the tools
will find dead easy to place and route as each logical path is just
from one register to the next.....

if(rising_edge(clk)) then

    invector(254 downto 0) <= invector(255 downto 1);
    serout <= invector(0);

end if;

This should be easily translated to verilog.
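
A possible Verilog rendering of the same idea is sketched below. It is
an illustration, not tested code from the thread. Two things are
assumptions added here: the load input, which captures a fresh 256-bit
word, and the rotation (bit 0 is fed back into bit 255), which makes
the word repeat over and over as the original poster described. The
module and register names are made up.

```verilog
// Hypothetical serializer: shift register instead of a 256:1 mux.
module serializer (
    input  wire         clk,       // shift clock
    input  wire         load,      // assumed: capture a new 256-bit word
    input  wire [255:0] invector,
    output reg          serout
);
    reg [255:0] shreg;

    always @(posedge clk) begin
        if (load)
            shreg <= invector;                  // parallel load
        else
            shreg <= {shreg[0], shreg[255:1]};  // rotate right by one bit
        serout <= shreg[0];                     // bit 0 goes out first
    end
endmodule
```

Each register here is fed only by its neighbour (or by the load path),
so there is no wide multiplexer for the tools to build or route.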




Nial.
