Messages from 131825

Article: 131825
Subject: Re: Style for Highly-Pipelined State Machines
From: Aiken <aikenpang@gmail.com>
Date: Fri, 2 May 2008 13:56:49 -0700 (PDT)
Links: << >>  << T >>  << A >>
What I think you should do is put in your "pipelined" states.
The number of states you need to add == the number of cycles you need
for (a*b+c)*d. That is, you add the operation inside the
states, not outside the FSM.
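
Roughly, as a made-up sketch (the state names, widths and the extra
STATE2A state are all invented here, and whether both multiplies really
end up on one DSP48 still depends on the synthesis tool's resource
sharing):

module mul_sched (
  input             clk,
  input             condition,
  input      [17:0] a, b, c, d,
  output reg [35:0] y
);
  localparam STATE2 = 2'd0, STATE2A = 2'd1, STATE3 = 2'd2;
  reg [1:0]  state = STATE2;
  reg [35:0] t;

  always @(posedge clk)
    case (state)
      STATE2:  if (condition) begin
                 t     <= a*b + c;   // cycle 1: the shared multiplier does a*b+c
                 state <= STATE2A;
               end
      STATE2A: begin
                 y     <= t*d;       // cycle 2: the same multiplier is reused for *d
                 state <= STATE3;
               end
      default: state <= STATE2;      // STATE3 and anything else: start over
    endcase
endmodule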


On May 2, 2:20 pm, Kevin Neilson <kevin_neil...@removethiscomcast.net>
wrote:
> KJ wrote:
> > "Kevin Neilson" <kevin_neil...@removethiscomcast.net> wrote in message
> >news:fv7i38$69n6@cnn.xsj.xilinx.com...
> >> My question:  what is the cleanest way to describe an FSM requiring
> >> pipelining?
> ...
> > The other thing to consider is whether the latency being introduced by this
> > outsourced logic needs to be 'compensated for' in some fashion or is it OK
> > to simply wait for the acknowledge.  In some instances, it is fine for the
> > FSM to simply wait in a particular state until the acknowledge comes back.
> > In others you need to be feeding new data into the hunk-o-logic on every
> > clock cycle even though you haven't got the results from the first back.  In
> > that situation you still have the req/ack pair but now the ack is simply
> > saying that the request has been accepted for processing, the actual results
> > will be coming out later.  Now the hunk-o-logic needs an additional output
> > to flag when the output is actually valid.  This output data valid signal
> > would typically tend to feed into a separate FSM or some other logic (i.e.
> > 'usually' not the first FSM).  The first FSM controls feeding stuff in, the
> > second FSM or other processing logic is in charge of taking up the outputs
> > and doing something with it.
>
> ...
>
> > Kevin Jennings
>
> In this case I do indeed have to continue to keep the pipe full, so
> inserting wait states is not an option.  And the latency of the "hunk of
> logic", aka concurrent process, is actually significant because I have
> to get the result and feed it back into the FSM.  This example shows why:
>
> STATE2: begin
>    if (condition)
>      begin
>        state <= STATE3;
>        y     <= (a*b+c)*d;
>      end
> end
>
> I have to get the result (a*b+c) and then feed it back into the FSM so I
> can multiply by d.  Why not just let the concurrent process handle that?
>   Because I want to limit my resource usage to a single DSP48, so I have
> to schedule the multiplications inside the FSM.  But I'll have to check
> out the Wishbone thing you're talking about.
> -Kevin


Article: 131826
Subject: Re: quick question
From: austin <austin@xilinx.com>
Date: Fri, 02 May 2008 14:01:05 -0700
Links: << >>  << T >>  << A >>
Mike,

I was trying to see if he even knew the difference between simulation
and synthesis....

!

Austin

Mike Treseler wrote:
> austin wrote:
> 
>> OK, I thought I knew something and now I am confused by your answer.
>>
>> A simulator will "execute" the VHDL and provide you with a verification
>> (simulation) of whatever it is you are trying to do.
> 
> Sorry. I thought that mk was talking about synthesis.
> 
>> Targeting that VHDL to a synthesis tool will use a library of hardware
>> elements (in an FPGA: LUTs, CLBs, etc; or in an ASIC: gates, registers,
>> flip flops, etc) and result in a hardware design that performs the
>> function you desired.
> 
> That's the object.
> The source may be structural or sequential.
> 
>        -- Mike Treseler

Article: 131827
Subject: Re: Forking in One-Hot FSMs
From: Kevin Neilson <kevin_neilson@removethiscomcast.net>
Date: Fri, 02 May 2008 15:20:41 -0600
Links: << >>  << T >>  << A >>
Aiken wrote:
> But why not combine these two states into one state and let that
> state do the pipeline stuff?
> Your coding may make your design slower and may not be implemented as
> a state machine in the final design.
> 
The example might not show this well, but you may want to fork from 
several different states and the length of the fork, before it dies, 
could be several states.  So if I just have the single state, I would 
have to manually branch out through the state tree and figure out all 
states I could possibly be in while the "fork" would be operating and 
then add the fork logic to all those states.  If that makes sense. 
Anyway, it's completely unmaintainable, because when you add a new state 
to the machine you would have to figure out if the pipeline is supposed 
to be full at that time and remember to add in the logic for that.
-Kevin

Article: 131828
Subject: Re: Forking in One-Hot FSMs
From: "Brad Smallridge" <bradsmallridge@dslextreme.com>
Date: Fri, 2 May 2008 14:43:50 -0700
Links: << >>  << T >>  << A >>
Kevin,

> Having two bits hot in a one-hot FSM would normally be a bad thing.  But I 
> was wondering if anybody does this purposely, in order to fork, which 
> might be a syntactically nicer way to have a concurrent FSM.

I wonder about this too. I am currently doing a pipeline and
some code is shown below. I wrote out the states without
an array so that when ModelSim comes up I don't have to expand
the states to see them. I also group signals that I want to
see associated with the states in the declarative region so
I don't have to futz too much with ModelSim.

Some states stay on longer with the state_count condition.
I read one header record on state31 and follow it with three
of another type of data record with state32.

Now the "branching" happens because these states address
a NoBL SRAM and there is a two cycle lag between the
address and the data. Not show below, I also have clock
delays on these states, state32_1, state32_2, and so on
so when the address goes out on state32, I then have data
to process on state32_2.
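
In Verilog terms (just to illustrate the idea; none of these names are
from the real design), that alignment is little more than a couple of
extra flops on the state bit qualifying the returning data:

module sram_lag_align (
  input             clk,
  input             state32,        // address-issuing state bit
  input      [31:0] sram_rd_data,   // data arrives two clocks after the address
  output reg [31:0] data_q,
  output reg        data_q_vld
);
  reg state32_1, state32_2;         // delayed copies of the state bit
  always @(posedge clk) begin
    state32_1  <= state32;
    state32_2  <= state32_1;
    data_q     <= sram_rd_data;
    data_q_vld <= state32_2;        // high exactly when data_q holds the read data
  end
endmodule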

In my Zen thinking about this,
I always have a When state associated with every What.

It actually gets deeper than that because there are FIFOs
involved as well.  You'll need FIFOs in your design if
you are going to tackle a Sobel function. Here the trick
is to start thinking about your processes starting from
the READ data and figure out how many delays you need to
deliver an answer, then figure out where the WRITE data
should marry into the flow.  I now have states out to _7.

Perhaps someone could suggest a better term than state
machine "forking"? And are there any guidelines on how
to code and debug a pipelined architecture? I'm with Kevin:
it gets real messy, real soon.

Brad Smallridge
AiVision

 inner_cell_state_machine:process(clk)
 begin
 if(clk'event and clk='1')then
   inner_cell_restart <='0';
   if(reset='1')then
     state30<='0'; state31<='0'; state32<='0'; state33<='0';
     state34<='0'; state35<='0'; state36<='0'; state37<='0';
     state38<='0';state39<='0';
     inner_cell_rd_ad <= std_logic_vector(to_unsigned(inner_cell_start,18));
     inner_cell_wr_ad <= std_logic_vector(to_unsigned(inner_cell_start,18));
     state_count <= (others=>'0');
   elsif(state29='1')then -- state29 automatically turns off from 
init_state_machine
     state30<='1';
   --State30 Initial inner cell state
   elsif(state30='1')then
     state30<='0';
     state31<='1';
     state_count <= (others=>'0');
   -- State31 Read the inner cell
   elsif(state31='1')then
       state31 <='0';
       state32<='1';
       inner_cell_rd_ad <= inner_cell_rd_ad+1;
   -- State32 Read the inner cell connections
   elsif(state32='1')then
     if(state_count=2)then
       state32<='0';
       state33<='1';
       state_count <= (others=>'0');
     else
       state_count <= state_count+1;
     end if;
       inner_cell_rd_ad <= inner_cell_rd_ad+1;
   -- State33 Wait for SRAM to deliver first connection
   elsif(state33='1')then
       state33<='0';
       state34<='1';
       state_count <= (others=>'0');
   -- State34 Read  connection
   elsif(state34='1')then
     if(state_count=2)then
       state34<='0';
       state35<='1';
       state_count <= (others=>'0');
     else
       state_count <= state_count+1;
     end if;
  . . .



Article: 131829
Subject: Re: Style for Highly-Pipelined State Machines
From: Mike Treseler <mike_treseler@comcast.net>
Date: Fri, 02 May 2008 15:12:39 -0700
Links: << >>  << T >>  << A >>
Kevin Neilson wrote:

> I have to get the result (a*b+c) and then feed it back into the FSM so I
> can multiply by d.  Why not just let the concurrent process handle that?
>  Because I want to limit my resource usage to a single DSP48, so I have
> to schedule the multiplications inside the FSM.

I still like the idea of a step counter.

On tick one, I do x <= a * b + c;
On tick two, I do y <= x * d;
and so on ...
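
A rough Verilog sketch of that step-counter idea (the names and widths
are made up, and whether the two products really share one DSP48 is up
to the tool's resource sharing):

module step_mac (
  input             clk,
  input             start,
  input      [17:0] a, b, c, d,
  output reg [47:0] y,
  output reg        done
);
  reg [1:0]  step = 2'd0;
  reg [35:0] x;

  always @(posedge clk) begin
    done <= 1'b0;
    case (step)
      2'd0: if (start) step <= 2'd1;
      2'd1: begin x <= a*b + c;             step <= 2'd2; end  // tick one
      2'd2: begin y <= x*d;  done <= 1'b1;  step <= 2'd0; end  // tick two
      default: step <= 2'd0;
    endcase
  end
endmodule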

        -- Mike Treseler

Article: 131830
Subject: Re: Argh! Need help debugging Xilinx .xsvf Player (XAPP058)
From: Bob <rsg.uClinux@gmail.com>
Date: Fri, 2 May 2008 16:27:48 -0700 (PDT)
Links: << >>  << T >>  << A >>
On May 2, 1:32 pm, "MM" <mb...@yahoo.com> wrote:
> Bob,
>
> The startup clock problem didn't occur to me because I am actually
> programming a Platform Flash rather than FPGA, thus CCLK is the correct
> clock in my case....
>
> > Looking at the Properties dialog for the "Generate Programming File"
> > process, I see "FPGA Start-Up Clock" under "Startup Options" , which
> > is indeed set to CCLK - seems this is the default setting, is that
> > true?  So you are suggesting I change this to "JTAG Clock", right?
>
> That's what you have to do.

Double argh!  This still doesn't work.  I set the JTAG clock and it
still doesn't work!

Any _other_ ideas?

Thanks!
-Bob

Article: 131831
Subject: Re: Forking in One-Hot FSMs
From: Eric Smith <eric@brouhaha.com>
Date: Fri, 02 May 2008 16:36:06 -0700
Links: << >>  << T >>  << A >>
Kevin Neilson wrote:
> Having two bits hot in a one-hot FSM would normally be a bad thing.
> But I was wondering if anybody does this purposely, in order to fork,
> which might be a syntactically nicer way to have a concurrent FSM.

DEC used that style of design in the PDP-16 Register Transfer Modules.
Possibly also in the control units of some of their asynchronous
processors such as the PDP-6 and KA10.

Article: 131832
Subject: Re: Forking in One-Hot FSMs
From: "KJ" <kkjennings@sbcglobal.net>
Date: Fri, 2 May 2008 22:54:15 -0400
Links: << >>  << T >>  << A >>

"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message 
news:Y9MSj.9245$sd4.3805@fe109.usenetserver.com...
> Kevin,
>
>> Having two bits hot in a one-hot FSM would normally be a bad thing.

'One hot' is a particular implementation of an FSM, but from a logic
perspective (i.e. how you go about designing your state machine, the states
needed, the branching, etc.) it means absolutely nothing.

>> But I was wondering if anybody does this purposely, in order to fork, 
>> which might be a syntactically nicer way to have a concurrent FSM.

'Concurrent' state machines, though, is simply another way of saying state
machines that are either totally independent of one another or only loosely
connected (i.e. there is some signalling going on between the two, but you
can *usually* futz with one without breaking the other).

As I mentioned in more detail in my response on 'Style for Highly-Pipelined 
State Machines', I only really see two basic approaches:

- The first is some form of counting or adding states where you have a known
fixed number of states between 'doing this' and 'doing that'.  This method
works, but it quickly leads to rather complicated code that is
difficult to understand and (because of the complexity) probably has some
logic holes as well that may take some time to surface.  In certain designs,
though, this method is just fine and the results are easy to
maintain.  The problem, though, is when the realization sets in that the code
is getting out of control, and then how to manage it (which was the point of
the other thread).

- The second method uses request/acknowledge handshaking between the 
'concurrent' state machines.  This method scales very nicely from a design 
perspective and is just as efficient from an implementation perspective as 
well.

Bottom line here is to realize that a 'fork in an FSM' is really a call to 
think of it as two separate state machines that have a 
communication/signalling requirement and don't try to force your mental 
model as being 'one' state machine.  After all, the entire design can be 
considered to be a single state machine...but it is generally of no use to 
think of it that way from a design perspective.

<snip>
> Now the "branching" happens because these states address
> a NoBL SRAM and there is a two cycle lag between the
> address and the data. Not shown below, I also have clock
> delays on these states, state32_1, state32_2, and so on
> so when the address goes out on state32, I then have data
> to process on state32_2.
>
> In my Zen thinking about this,
> I always have a When state associated with every What.
>
> It actually gets deeper than that because there are FIFOs
> involved as well.  You'll need FIFOs in your design if
> you are going to tackle a Sobel function. Here the trick
> is to start thinking about your processes starting from
> the READ data and figure out how many delays you need to
> deliver an answer, then figure out where the WRITE data
> should marry into the flow.  I have now states out to _7.
>

Try looking at it now from a somewhat different perspective.  Let's say you
have one state machine whose sole purpose is to generate read and write
commands and addresses to the memory, but not to process the data at all.  In
addition, there is a second state machine whose sole purpose is to process
the data that gets read back from memory and produce some sort of result
that maybe goes to memory, maybe goes somewhere else; it doesn't matter.

If the 'address generator' state machine needs the results from some
computation in order to proceed, then it sends a read request to the
'data processor' state machine and waits until it gets the acknowledge.  It
may sound queer, but what that does is allow you to design with any sort of
latency, and it does not require any a priori knowledge of how many clock ticks
it will take to get that result back: the address generator state machine is
waiting for the acknowledge back from the data processor state machine
(which in turn is waiting for the data to come back from the memory).

Now you could argue that the data processor state machine can't just 
*process data*, it most likely needs to know what it is supposed to be doing 
with it, and that knowledge likely lives in the address generator state 
machine.  Fair enough, but all that means is that the address generator 
needs to be able to send commands over to the data processor.  This could be 
done in an ad hoc manner by setting some signal (or more likely multiple 
signals) that are inputs to the data processor.  This method works fine. 
You can also view this interface somewhat more abstractly as the data 
processor having a command port that is written to by the address 
generator....and again, a simple request/acknowledge handshake is all you 
need here as well.  In many cases though the simple signal(s) is sufficient, 
but in other cases, the two state machines interact a bit more closely and a 
more well defined communication channel between the two will clear up a lot 
from the perspective of getting to a functional design that is easy to 
maintain.

> Perhaps someone could suggest a better term than state
> machine "forking"?

Separate state machines that signal each other in some fashion.

> And if there is some guidelines on how
> to code and debug pipelined architecture. I'm with Kevin,
> it gets real messy, real soon.
>

Ponder a bit on breaking things up as I've suggested, starting with something
small, and see how it comes out.  It may take a bit to get used to, but the
end result is smaller, easier-to-understand and easier-to-debug state machines
(albeit more of them) that communicate over well-defined internal interfaces.
You'll also find that changes (like switching the NoBL SRAM to DRAM, as an
example) can be accommodated without having to change *everything*.
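
As a very rough Verilog skeleton of what that req/ack pairing can look
like (every name below is made up, and the 'work' itself is just a
placeholder for however many clocks the real processing takes):

module req_ack_demo (
  input            clk,
  input            reset,
  input            work_done,   // from whatever actually processes the data
  input      [7:0] next_cmd,
  output reg       req,
  output reg [7:0] req_cmd,
  output reg       ack
);
  localparam G_ISSUE = 1'b0, G_WAIT = 1'b1;
  localparam P_IDLE  = 1'b0, P_BUSY = 1'b1;
  reg gen_state, proc_state;
  reg [7:0] cmd_q;

  // Requester ('address generator' style): raise req, hold the command
  // steady, and wait for ack before moving on.  It never needs to know
  // how many clocks the other side takes.
  always @(posedge clk)
    if (reset) begin
      req <= 1'b0;  gen_state <= G_ISSUE;
    end else case (gen_state)
      G_ISSUE: begin req <= 1'b1; req_cmd <= next_cmd; gen_state <= G_WAIT; end
      G_WAIT:  if (ack) begin req <= 1'b0; gen_state <= G_ISSUE; end
    endcase

  // Responder ('data processor' style): accept the request when idle, do
  // the work for however long it takes, then pulse ack for one clock.
  always @(posedge clk)
    if (reset) begin
      ack <= 1'b0;  proc_state <= P_IDLE;
    end else begin
      ack <= 1'b0;
      case (proc_state)
        P_IDLE: if (req && !ack) begin   // !ack: don't re-grab the request just acknowledged
                  cmd_q      <= req_cmd;
                  proc_state <= P_BUSY;
                end
        P_BUSY: if (work_done) begin ack <= 1'b1; proc_state <= P_IDLE; end
      endcase
    end
endmodule

The point is only the shape: the requester never counts clock ticks, it
just watches ack.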

Kevin Jennings



Article: 131833
Subject: Re: Style for Highly-Pipelined State Machines
From: "KJ" <kkjennings@sbcglobal.net>
Date: Fri, 2 May 2008 23:26:29 -0400
Links: << >>  << T >>  << A >>

"Kevin Neilson" <kevin_neilson@removethiscomcast.net> wrote in message 
news:fvfm29$on81@cnn.xsj.xilinx.com...
> KJ wrote:
>> "Kevin Neilson" <kevin_neilson@removethiscomcast.net> wrote in message 
>> news:fv7i38$69n6@cnn.xsj.xilinx.com...
>>> My question:  what is the cleanest way to describe an FSM requiring 
>>> pipelining?
> ...
>> The other thing to consider is whether the latency being introduced by 
>> this outsourced logic needs to be 'compensated for' in some fashion or is 
>> it OK to simply wait for the acknowledge.  In some instances, it is fine 
>> for the FSM to simply wait in a particular state until the acknowledge 
>> comes back. In others you need to be feeding new data into the 
>> hunk-o-logic on every clock cycle even though you haven't got the results 
>> from the first back.  In that situation you still have the req/ack pair 
>> but now the ack is simply saying that the request has been accepted for 
>> processing, the actual results will be coming out later.  Now the 
>> hunk-o-logic needs an additional output to flag when the output is 
>> actually valid.  This output data valid signal would typically tend to 
>> feed into a separate FSM or some other logic (i.e. 'usually' not the 
>> first FSM).  The first FSM controls feeding stuff in, the second FSM or 
>> other processing logic is in charge of taking up the outputs and doing 
>> something with it.
>>
> ...
>>
>> Kevin Jennings
> In this case I do indeed have to continue to keep the pipe full, so 
> inserting wait states is not an option.  And the latency of the "hunk of 
> logic", aka concurrent process, is actually significant because I have to 
> get the result and feed it back into the FSM.  This example shows why:
>
> STATE2: begin
>   if (condition)
>     begin
>       state <= STATE3;
>       y     <= (a*b+c)*d;
>     end
> end
>
> I have to get the result (a*b+c) and then feed it back into the FSM so I 
> can multiply by d.  Why not just let the concurrent process handle that? 
> Because I want to limit my resource usage to a single DSP48, so I have to 
> schedule the multiplications inside the FSM.  But I'll have to check out 
> the Wishbone thing you're talking about.

Well, just the fact that you're time-sharing the DSP48 means that you're not
processing something new on every clock cycle, which just screams out to me
that you'd want to implement this with a request/acknowledge type of
framework.  Consider having a black box that has two logical interfaces 
called 'inp' and 'out'.  The 'inp' interface will be written to by some 
external thingy and provide 'a', 'b', 'c' and 'd' inputs.  The black box 
will compute "y     <= (a*b+c)*d;" and make it available on the 'out' 
interface via a read command.  Using Avalon naming nomenclature then this 
black box will have the following set of signals:

inp_write: input
inp_writedata: input vector
inp_waitrequest: output

out_read: input
out_readdata: output vector
out_waitrequest: output

What this black box would do is provide the output of the calculation "y <= 
..." on the signal 'out_readdata' in response to a read request on 
'out_read'.  If the calculation has not been completed (maybe 'a', 'b' and 
'c' aren't even available yet), the output signal 'out_waitrequest' would be 
set.  So from the perspective of someone trying to use this black box, they 
would simply set 'out_read' active and wait until 'out_waitrequest' is not 
active.  On that one particular clock cycle, 'out_readdata' has the data.

Now in order to perform the calculation the black box needs 'a', 'b', 'c' 
and 'd'.  For simplicity, I'll assume that they all become available at the 
same time.  At that time the external thingy talking to the black box will 
set 'inp_write' active and 'inp_writedata' to contain 'a', 'b', 'c' and 'd' 
(don't constrain yourself into thinking of this as being 'bytes' or 'words', 
etc.).  If the black box in turn sets the 'inp_waitrequest' output to a 1 
then it means that the external thingy needs to hold 'inp_write' and 
'inp_writedata' without changing them because the black box, for whatever 
reason, is not quite ready (maybe because the results of the previous 
calculation have not yet been read as an example).

The 'external thingy' that controls the black box is possibly your state
machine that is basically signalling the black box when it has new data to
process.  The interface between these two then consists of the above set of 
well defined signals.  The state machine doesn't need to know or care 
explicitly about what the actual latency is in getting through the black 
box.  In order for your system to operate properly, you may need to have 
some latency requirement, but the point here is that the controlling state 
machine can be oblivious to what that latency actually is.

Now turn to the black box.  Since you want to share use of the DSP48 so that
it gets reused, there will be a state machine inside it in
some fashion.  After receiving new data on the 'inp' interface, the black
box would set inp_waitrequest active on the next clock cycle to prevent
subsequent writes from occurring until the black box is ready to accept more
data.  Then you go through your calculation, and a couple of clock cycles later
you've finally computed the requested output... now what?  You've got the
output needed for 'out_readdata' to send out with the result.  Up until that
point, 'out_waitrequest' would be active to indicate that the output is not
available.  But once it is, the black box would drop 'out_waitrequest'.  If
'out_read' is active then the result has been passed along; if not, then you
need to hold on to the result until 'out_read' does get set.
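
A bare-bones Verilog sketch of such a black box (it borrows the signal
names above, but the internal sequencing, the widths, and the packing of
a, b, c and d into one write are all just assumptions for the sketch;
whether the two multiplies actually fold onto a single DSP48 is still up
to the tool):

module mac_black_box (
  input             clk,
  input             reset,
  // 'inp' interface: a, b, c, d arrive together in inp_writedata
  input             inp_write,
  input      [71:0] inp_writedata,    // {a,b,c,d}, 18 bits each (arbitrary)
  output            inp_waitrequest,
  // 'out' interface: result of (a*b+c)*d
  input             out_read,
  output reg [47:0] out_readdata,
  output            out_waitrequest
);
  localparam IDLE = 2'd0, MUL1 = 2'd1, MUL2 = 2'd2, HOLD = 2'd3;
  reg [1:0]  state;
  reg [17:0] a, b, c, d;
  reg [35:0] partial;

  assign inp_waitrequest = (state != IDLE);   // busy or holding a result: no new writes
  assign out_waitrequest = (state != HOLD);   // result only readable once we hold it

  always @(posedge clk)
    if (reset)
      state <= IDLE;
    else case (state)
      IDLE: if (inp_write) begin
              {a, b, c, d} <= inp_writedata;  // accepted because waitrequest was low
              state        <= MUL1;
            end
      MUL1: begin partial <= a*b + c;         state <= MUL2; end  // first pass through the multiplier
      MUL2: begin out_readdata <= partial*d;  state <= HOLD; end  // second pass
      HOLD: if (out_read) state <= IDLE;      // hold the result until it is read
    endcase
endmodule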

It might all sound complicated, and as if it will take up considerable logic
resources to implement, but in fact it doesn't.  When all the dust has
settled, the logic resources used up in the part will be pretty much
identical to any other scheme you can come up with (such as counters or
extra states in a state machine).  In exchange you'll get a much easier to
understand and maintain design, with fewer chances of having some logic hole
that occurs under only very infrequent conditions, which is very easy to get
when you have a convoluted, difficult-to-follow single state machine.

Lastly, I've used the Avalon naming convention, but Wishbone is logically
identical: instead of a 'waitrequest' it has 'acknowledge', which is
logically just the not() of the other.  Avalon is better when it comes to
dealing with read cycle latency; it defines a specific signal for this,
whereas Wishbone doesn't directly handle this at all but has some additional
signals that can be used for any purpose, one of which can be to handle read
cycle latency.

Kevin Jennings 



Article: 131834
Subject: Re: Argh! Need help debugging Xilinx .xsvf Player (XAPP058)
From: "MM" <mbmsv@yahoo.com>
Date: Fri, 2 May 2008 23:34:57 -0400
Links: << >>  << T >>  << A >>
"Bob" <rsg.uClinux@gmail.com> wrote in message 
news:87f1d4cf-1a31-4218-8c60-7da009716ee1@j22g2000hsf.googlegroups.com...
>
> Double argh!  This still doesn't work.  I set the JTAG clock and it
> still doesn't work!
>
> Any _other_ ideas?
>

I guess Antti is right... Anyway, which version of the tools are you using?
What is your microcontroller? Do you have any external pullups? Do you
disconnect the cable when running your player? There are a number of Answer
Records on the Xilinx site which might be relevant to your problem, e.g.
http://www.xilinx.com/support/answers/22255.htm

/Mikhail 



Article: 131835
Subject: Aldec Active-HDL 7.3 sp1 [stimulators]
From: 0xdeadbeef <Przemyslaw.Duda@gmail.com>
Date: Sat, 3 May 2008 01:47:33 -0700 (PDT)
Links: << >>  << T >>  << A >>
Hi all. I'm brand new to the FPGA subject, so please be patient :P
My problem is about using stimulators in a waveform.
Well, exactly: there is no such thing as a stimulator as there was in
Active-HDL 7.1.
How do I add one, then?
Please help me, because without it I won't be able to do my project.
Thanks

Article: 131836
Subject: Re: Aldec Active-HDL 7.3 sp1 [stimulators]
From: Mike Treseler <mike_treseler@comcast.net>
Date: Sat, 03 May 2008 03:32:38 -0700
Links: << >>  << T >>  << A >>
0xdeadbeef wrote:
> Hi all. I'm brand new to the FPGA subject, so please be patient :P
> My problem is about using stimulators in a waveform.
> Well, exactly: there is no such thing as a stimulator as there was in
> Active-HDL 7.1.
> How do I add one, then?

Choose a language, vhdl or verilog.
Aldec is a simulator that can use either.
Write some synthesis code.
Write a testbench.

         -- Mike Treseler


Article: 131837
Subject: Re: Forking in One-Hot FSMs
From: Mike Treseler <mtreseler@gmail.com>
Date: Sat, 03 May 2008 04:08:41 -0700
Links: << >>  << T >>  << A >>
Brad Smallridge wrote:

 > Perhaps someone could suggest a better term than state
 > machine "forking"? And if there is some guidelines on how
 > to code and debug pipelined architecture. I'm with Kevin,
 > it gets real messy, real soon.

There is no requirement that a process/block must update only
a single register named 'State'.

When I look at large, textbook style state machine
examples, like the ones in this thread, I imagine
a much simpler process that updates several smaller registers.
Maybe an input reg, output reg, a couple of counters
and a few well-named booleans.
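
For what it's worth, a tiny Verilog sketch of that style (all the names,
widths and the counter below are invented, just to show the shape):

module small_regs_style #(parameter BEATS = 4) (
  input            clk,
  input            reset,
  input            start,
  input      [7:0] din,
  output reg [7:0] dout,
  output reg       busy
);
  reg [7:0] in_reg;                  // input register
  reg [3:0] beat_cnt;                // a small counter instead of extra states

  always @(posedge clk)
    if (reset)
      busy <= 1'b0;                  // 'busy' is the well-named boolean
    else if (start && !busy) begin   // kick off: capture input, load the counter
      in_reg   <= din;
      beat_cnt <= BEATS;
      busy     <= 1'b1;
    end else if (busy) begin
      beat_cnt <= beat_cnt - 1'b1;
      if (beat_cnt == 4'd1) begin    // last beat: update the output register
        dout <= in_reg;
        busy <= 1'b0;
      end
    end
endmodule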

    -- Mike Treseler

Article: 131838
Subject: Using SRL16
From: Partha <partha.maji@gmail.com>
Date: Sat, 3 May 2008 06:14:49 -0700 (PDT)
Links: << >>  << T >>  << A >>
Hello,

I am trying to synthesize a simple 8-bit delay line (delay of 10) on
Virtex2 xc2v40-4cs144 FPGA. My objective is to use LUT instead of
available Flops inside each slice. After I synthesize I observe that
it uses 17-IOBs (16 for delay, 1 for Clock) and 8 slices (8-delay per
slice, i.e. total 64).
I am wondering why it does not use only 2 slices to implement 64-
delays. As each SRL16 can provide 16 delays and each slice has 2 SRL16
element, it can make 32-delay out of one slice.
I am guessing there should be some option to provide some constraints.
I do not want to use the Manual Floorplan Design option. Can anyone
tell me whether this is possible?

- Partha

Article: 131839
Subject: Re: Using SRL16
From: Mike Treseler <mtreseler@gmail.com>
Date: Sat, 03 May 2008 07:20:33 -0700
Links: << >>  << T >>  << A >>
Partha wrote:

> I am wondering why it does not use only 2 slices to implement 64-
> delays. As each SRL16 can provide 16 delays and each slice has 2 SRL16
> element, it can make 32-delay out of one slice.

Probably because you have 500 LUTs and flops to spare,
so synthesis optimizes speed/routing rather than resources.
A LUT shifter is slow enough as it is ;)

> I am guessing there should be some option to provide some constraints.
> I do not want to use Manual Floorplan Design option. Can any one
> suggest me if it is possible?

You could fill up the part or set optimization for area.

I wouldn't bother unless I was down to my last LUT.
In fact, I would make a real shifter unless I was
out of flops.

         -- Mike Treseler

Article: 131840
Subject: Re: Using SRL16
From: austin <austin@xilinx.com>
Date: Sat, 03 May 2008 07:58:27 -0700
Links: << >>  << T >>  << A >>
Partha,

If you have a reset, or set, then SRL can not be used, as SRL has no set 
or reset.  They can be set to an initial condition by the configuration, 
however.

http://www.xilinx.com/support/documentation/white_papers/wp231.pdf

Page 2.

Austin

Article: 131841
Subject: Re: Using SRL16
From: jprovidenza@yahoo.com
Date: Sat, 3 May 2008 07:59:33 -0700 (PDT)
Links: << >>  << T >>  << A >>
On May 3, 7:20 am, Mike Treseler <mtrese...@gmail.com> wrote:
> Partha wrote:
> > I am wondering why it does not use only 2 slices to implement 64-
> > delays. As each SRL16 can provide 16 delays and each slice has 2 SRL16
> > element, it can make 32-delay out of one slice.
>
> Probably because you have 500 LUTs and flops to spare,
> so synthesis optimizes speed/routing rather than resources.
> A LUT shifter is slow enough as it ;)
>
> > I am guessing there should be some option to provide some constraints.
> > I do not want to use Manual Floorplan Design option. Can any one
> > suggest me if it is possible?
>
> You could fill up the part or set optimization for area.
>
> I wouldn't bother unless I was down to my last LUT.
> In fact, I would make a real shifter unless I was
> out of flops.
>
>          -- Mike Treseler

It might be helpful if you post your code.  Using the SRL16 requires
that you don't use certain features - for example, you can't apply a
reset to the logic that is targeted to the SRL16.

I believe Xilinx has some papers on how to do this.  I've created
logic with SRL16 many times with Verilog, so you should be able to
figure this out.

Good luck!

John P

Article: 131842
Subject: Re: Using SRL16
From: Sean Durkin <news_may08@tuxroot.de>
Date: Sat, 03 May 2008 17:00:11 +0200
Links: << >>  << T >>  << A >>
Partha wrote:
> Hello,
> 
> I am trying to synthesize a simple 8-bit delay line (delay of 10) on
> Virtex2 xc2v40-4cs144 FPGA. My objective is to use LUT instead of
> available Flops inside each slice. After I synthesize I observe that
> it uses 17-IOBs (16 for delay, 1 for Clock) and 8 slices (8-delay per
> slice, i.e. total 64).
> I am wondering why it does not use only 2 slices to implement 64-
> delays. As each SRL16 can provide 16 delays and each slice has 2 SRL16
> element, it can make 32-delay out of one slice.
> I am guessing there should be some option to provide some constraints.
> I do not want to use Manual Floorplan Design option. Can any one
> suggest me if it is possible?

In this case, it probably depends on your HDL-code. A SRL16 cannot be
inferred by the synthesis tool if you describe a shift register with
reset, since you cannot set/reset the contents of a LUT (just like you
can't reset the contents of a BRAM, you can only reset its output
registers).

If you describe the shift register WITHOUT a reset, the synthesis tool
can infer a SRL16. Usually you don't need the reset anyway, since the
SRL16 is guaranteed to start up with all zeroes after loading the FPGA
(at least that's what Xilinx says), so you might as well just forget
about it.

If you need the SRL16 to start up with a value other than all zeroes,
you can do that without describing a reset as well: when using XST, it's
enough to assign an initial value in the declaration of the signal that
is later used for the shift register. AFAIK, XST does use that
initial value. Other synthesis tools I've tried ignore it (e.g.
Precision), so in that case you probably have to instantiate an SRL16
manually and attach an INIT attribute to it... YMMV.
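
As an illustration (in Verilog; the same idea applies to a VHDL
description), here is a shift register described with no reset, the
kind of thing XST will usually map onto SRL16s (the names, width and
depth are arbitrary, and you should check your own tool's inference
rules):

module srl_delay #(parameter WIDTH = 8, DEPTH = 10) (
  input                  clk,
  input                  ce,
  input      [WIDTH-1:0] din,
  output     [WIDTH-1:0] dout
);
  // No reset anywhere in this process: that is what lets the tool pack
  // the registers into SRL16 LUT shift registers instead of flip-flops.
  reg [WIDTH-1:0] sr [0:DEPTH-1];
  integer i;

  always @(posedge clk)
    if (ce) begin
      sr[0] <= din;
      for (i = 1; i < DEPTH; i = i + 1)
        sr[i] <= sr[i-1];
    end

  assign dout = sr[DEPTH-1];
endmodule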

cu,
Sean

Article: 131843
Subject: FPGA Processor for Signal Processing ?
From: HansWernerMarschke@web.de
Date: Sat, 3 May 2008 08:18:33 -0700 (PDT)
Links: << >>  << T >>  << A >>
You can find on the web and in books implementations of processors for
FPGAs, and also processors like PicoBlaze and MicroBlaze from firms like
Xilinx. Are there also implementations of processors specially designed
for signal processing that realize things like an FFT, for example?

Thanks for help

Article: 131844
Subject: Re: Using SRL16
From: John_H <newsgroup@johnhandwork.com>
Date: Sat, 03 May 2008 08:48:41 -0700
Links: << >>  << T >>  << A >>
Mike Treseler wrote:
> Partha wrote:
> 
>> I am wondering why it does not use only 2 slices to implement 64-
>> delays. As each SRL16 can provide 16 delays and each slice has 2 SRL16
>> element, it can make 32-delay out of one slice.
> 
> Probably because you have 500 LUTs and flops to spare,
> so synthesis optimizes speed/routing rather than resources.
> A LUT shifter is slow enough as it ;)
<snip>

Actually, the LUT shifters can work toward the full fabric speed of the 
device in SOME families.

I ended up filing a webcase a couple years ago over the Spartan3 or 3E 
timing results because the expected slower numbers weren't showing up. 
Despite my insistence that the speed would be much slower given the 
history with the SRLs, I was told that the high speeds I was getting 
were correct after the AE had some conversations with people closer to 
the silicon.

Check your datasheet or timing results for information that's not 4 
years old.

- John_H

Article: 131845
Subject: Re: Using SRL16
From: Alain <no_spa2005@yahoo.fr>
Date: Sat, 3 May 2008 10:57:13 -0700 (PDT)
Links: << >>  << T >>  << A >>
On 3 May, 16:58, austin <aus...@xilinx.com> wrote:
> Partha,
>
> If you have a reset, or set, then SRL can not be used, as SRL has no set
> or reset.  They can be set to an initial condition by the configuration,
> however.
>
> http://www.xilinx.com/support/documentation/white_papers/wp231.pdf
>
> Page 2.
>
> Austin

Austin,
With ISE 10.1, XST adds a new feature: "SRL inference for shift
register with single set or reset signal."
For me, it's only interesting with a vector (otherwise, this is a
waste of registers for implementing this reset).

Alain.

Article: 131846
Subject: Re: PLB Master Example
From: raghunandan85@gmail.com
Date: Sat, 3 May 2008 11:44:02 -0700 (PDT)
Links: << >>  << T >>  << A >>
I got the PLB IPIF working with the provided example. There are a
couple of signals that I didn't follow.

1) What is the need for IP2IP Address?
2) Does DMA do 1 data transaction per PLB clock? From what I
understand the new address and data are placed on the bus every
alternate clock cycle since 1 clock is wasted in going to
CHK_BURST_DONE state, where a check is made on whether the necessary
number of bytes have been written.

Raghu.

Article: 131847
Subject: Re: asic gate count
From: "vijayant.rutgers@gmail.com" <vijayant.rutgers@gmail.com>
Date: Sat, 3 May 2008 12:01:23 -0700 (PDT)
Links: << >>  << T >>  << A >>
On May 2, 1:24 am, Thomas Stanka <usenet_nospam_va...@stanka-web.de>
wrote:
> On 1 Mai, 21:54, "vijayant.rutg...@gmail.com"
>
> <vijayant.rutg...@gmail.com> wrote:
> > Ok. I have my design finalized. The fir length would be 64 operating
> > on 32 bit wide word. Now could you please hint me on estimating gate
> > count ?
>
> Is it serial or parallel? Using RAM or FF? Which ASIC technology?
> With tight timing constraints or relaxed timing?
>
> My guess would be 65x32 for storage of input and result and two adders
> size 32 bit.
>
> ASIC gate count is a value gained by a guess of numbers multiplied with
> e^n, with n being a marketing factor (technically oriented people assume
> n=random(unconstrained) as you can't understand the calculation of n if
> you're not a member of a marketing department)
>
> bye Thomas

it will be parallel implementation using RAM and relaxed timing. Any
help is appreciated.

Thanks,
Vijayant

Article: 131848
Subject: Re: Old FPGA question
From: whygee <whygee@yg.yg>
Date: Sat, 03 May 2008 22:09:29 +0200
Links: << >>  << T >>  << A >>
John Adair wrote:
> The XCV400 is on the supported list of devices for ISE Webpack
> http://www.xilinx.com/ise/products/webpack_config.htm so you should
> not have any costs for tools.

OK, thanks :-)

I have also downloaded some docs from xilinx.com
and it seems that it fits my needs. Now I have
to make a PCB...

> I would however wonder why this device
> was in the trash bin as undoubtedly some longer life products still
> use them.
Don't ask me that, I'm just looking in the trash bin :-)

Today I found there 50 or 60 reels of 1208 resistors.
That's a treasure for my future designs :-)))
These resistors probably come from the same company as the FPGA,
which probably closed. Often, the chairmen simply pay a recycling company
to "dump the company's things" for fear of the environmental laws,
and the recyclers are not electronics people or specialists.
All they can spot are the easy "valuable" things, because
those don't look like chairs, desks, lamps...

One of the guys who works at a recycling company (where I
scan the trash) used to repair CRTs (self-taught)
but is clueless about many things.
And his coworkers who "move" things are... well, just "movers".
They're paid to put things in a huge van, no matter whether something
is broken or valuable, because the CRT specialist will handle that
and sell the "cool stuff" on eBay. The rest is roughly sorted
and goes to several recycling specialists (CRTs, chemicals, PCBs so the
gold can be extracted...)

So in the end, they have a large supply of LCDs, and even more CRTs,
but most of them are in such a state that it's impossible
to know if they work. No suitable cables, or power supplies,
or accessories... But I won't complain: that's where
I got my Alphas, some SUNs, countless PCs and accessories...


> John Adair
> Enterpoint Ltd.

YG, who has to sort all those resistor reels...

Article: 131849
Subject: Re: Old FPGA question
From: Nicolas Matringe <nicolas.matringe@fre.fre>
Date: Sun, 04 May 2008 11:37:33 +0200
Links: << >>  << T >>  << A >>
whygee wrote:
> John Adair wrote:
>> I would however wonder why this device
>> was in the trash bin as undoubtedly some longer life products still
>> use them.
> Don't ask me that, I'm just looking in the trash bin :-)

Yann, you'll have to tell me where this is ;-)

Nicolas


