>On 18/01/2013 21:32, tupadre wrote:
>> Hi to everyone!
>[...]
>> And here is a fragment of my processor:
>>
>> ----PC process
>> process(inicio, simulated_clk)
>>
>>   variable pc_out  : std_logic_vector(15 downto 0);
>>   variable pc_high : std_logic_vector(15 downto 0);
>> begin
>>   -- If it is the first time it is loaded
>>   if inicio='0' then
>>     pc      <= "0000000000000000";
>>     reg_ret <= "0000000000000000";
>>     pc_out  := "0000000000000000";
>>   -- It has been loaded before
>>   elsif simulated='1' and simulated_clk'event then
>>     pc_out := pc;
>>
>>   ...
>> end process;
>>
>> I don't know why it doesn't work. As I said, I just want to simulate a clock
>> to show my professor how the processor works step by step (instruction
>> by instruction).
>>
>> Hope you can help. Thanks!
>
>Hello
>Your clock edge condition looks wrong, it should be:
>  elsif simulated_clk = '1' and simulated_clk'event then
>or you can use the rising_edge function instead:
>  elsif rising_edge(simulated_clk) then
>
>Nicolas

First of all, thanks for answering. Yeah, you are right, the code is wrong, but that is due to the translation from Spanish to English (my real code is fine). However, I will try rising_edge()...
---------------------------------------
Posted through http://www.FPGARelated.com

Article: 154851
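A complete, simplified version of the process with Nicolas's fix in place, using rising_edge() for the edge detection. The names pc, reg_ret, inicio and simulated_clk come from the fragment above; the entity name pc_step is invented, and the simple increment stands in for whatever next-PC logic the real design computes:

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

  entity pc_step is
    port ( inicio        : in  std_logic;                      -- load / reset input, as above
           simulated_clk : in  std_logic;                      -- the (debounced) button
           pc            : out std_logic_vector(15 downto 0);
           reg_ret       : out std_logic_vector(15 downto 0) );
  end entity;

  architecture rtl of pc_step is
    signal pc_i : std_logic_vector(15 downto 0);
  begin
    pc <= pc_i;

    process(inicio, simulated_clk)
    begin
      if inicio = '0' then                      -- asynchronous initialisation, as in the original
        pc_i    <= (others => '0');
        reg_ret <= (others => '0');
      elsif rising_edge(simulated_clk) then     -- advance one instruction per button press
        pc_i <= std_logic_vector(unsigned(pc_i) + 1);  -- stand-in for the real next-PC logic
      end if;
    end process;
  end architecture;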
rising_edge() doesn't work... I modified the .ucf file in order to test whether the "pulsador" switch is working, by adding the line NET "pulsador" LOC = "P11" ; so that when I switch on pulsador the LED on P11 should turn on, but it doesn't light up...

My new .ucf:

#PACE: Start of Constraints generated by PACE

NET "pulsador" CLOCK_DEDICATED_ROUTE = FALSE;
NET "act_reg3" CLOCK_DEDICATED_ROUTE = FALSE;

#PACE: Start of PACE I/O Pin Assignments
NET "an0" LOC = "E13" ;
NET "an3" LOC = "F14" ;
NET "an2" LOC = "G14" ;
NET "an1" LOC = "d14" ;

######################### 8 Leds ###########################################
NET "mult_lec" LOC = "K12" ;
NET "mult_esc" LOC = "P14" ;
NET "mult_ret" LOC = "L12" ;
NET "mult_pc" LOC = "N14" ;
NET "reg_w" LOC = "P13" ;
NET "call_w" LOC = "N12" ;
NET "sel_alu" LOC = "P12" ;
NET "pulsador" LOC = "P11" ;

NET "clk" LOC = "T9" ;  #50 MHz clock

###################################### Here goes the display ##########################################
NET "segmentos<6>" LOC = "E14" ;
NET "segmentos<5>" LOC = "G13" ;
NET "segmentos<4>" LOC = "N15" ;
NET "segmentos<3>" LOC = "P15" ;
NET "segmentos<2>" LOC = "R16" ;
NET "segmentos<1>" LOC = "F13" ;
NET "segmentos<0>" LOC = "N16" ;
NET "segmentos<7>" LOC = "P16" ;

####################################### Switches ###########################################
###### Indicate register 1 #####
NET "act_reg1" LOC = "F12" ;
###### Indicate register 2 #####
NET "act_reg2" LOC = "G12" ;
###### Indicate register 3 #####
NET "act_reg3" LOC = "H14" ;

######### My button clock ############
NET "pulsador" LOC = "H13" ;

#PACE: Start of PACE Area Constraints
#PACE: Start of PACE Prohibit Constraints
#PACE: End of Constraints generated by PACE

Hope you can help!
---------------------------------------
Posted through http://www.FPGARelated.com

Article: 154852
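One thing to note about the test above: "pulsador" already has a LOC (H13), and adding a second LOC on the same net does not route the switch through to the LED on P11; an input pin cannot light an LED by itself. The usual way to check the pin is to add a separate output port in the HDL, drive it from the button, and give each net its own LOC. A minimal sketch, with the "led_test" port invented and the pin numbers taken from the .ucf above:

  library ieee;
  use ieee.std_logic_1164.all;

  entity button_check is
    port ( pulsador : in  std_logic;    -- switch pin (H13 in the .ucf above)
           led_test : out std_logic );  -- LED pin (P11 in the .ucf above)
  end entity;

  architecture rtl of button_check is
  begin
    led_test <= pulsador;               -- LED simply follows the switch
  end architecture;

  # and in the .ucf, one LOC per net:
  # NET "pulsador" LOC = "H13" ;
  # NET "led_test" LOC = "P11" ;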
tupadre wrote:
> Hi to everyone!
>
> This is my first message here, so first of all thanks to everyone for letting me
> post here!
>
> I have a homework assignment: it consists of writing a processor and making it work on a
> Spartan 3. I did the first part but the second one is getting difficult. I just
> want to show my professor that the processor is working well, so I want to
> generate a clock event with a button or switch just like here:
> http://fpgawiki.dustingram.com/index.php?title=Button_Clock
>
> My UCF is this one:
>
> #PACE: Start of Constraints generated by PACE
>
> NET "act_reg3" CLOCK_DEDICATED_ROUTE = FALSE;
>
> #PACE: Start of PACE I/O Pin Assignments
> NET "an0" LOC = "E13" ;
> NET "an3" LOC = "F14" ;
> NET "an2" LOC = "G14" ;
> NET "an1" LOC = "d14" ;
>
> #This is a button to simulate a clock
> NET "simulated_clk" LOC = "M13" ;
>
> ######################### 8 Leds ###########################################
> NET "mult_lec" LOC = "K12" ;
> NET "mult_esc" LOC = "P14" ;
> NET "mult_ret" LOC = "L12" ;
> NET "mult_pc" LOC = "N14" ;
> NET "reg_w" LOC = "P13" ;
> NET "call_w" LOC = "N12" ;
> NET "sel_alu" LOC = "P12" ;
>
> #I use this for another thing
> NET "clk" LOC = "T9" ;
>
> ###################################### Segments ##########################################
> NET "segmentos<6>" LOC = "E14" ;
> NET "segmentos<5>" LOC = "G13" ;
> NET "segmentos<4>" LOC = "N15" ;
> NET "segmentos<3>" LOC = "P15" ;
> NET "segmentos<2>" LOC = "R16" ;
> NET "segmentos<1>" LOC = "F13" ;
> NET "segmentos<0>" LOC = "N16" ;
> NET "segmentos<7>" LOC = "P16" ;
>
> ####################################### Switches ###########################################
> NET "act_reg1" LOC = "F12" ;
> NET "act_reg2" LOC = "G12" ;
> NET "act_reg3" LOC = "H14" ;
>
> #PACE: Start of PACE Area Constraints
>
> #PACE: Start of PACE Prohibit Constraints
>
> #PACE: End of Constraints generated by PACE
>
> -------------------------------------------------------------------
>
> And here is a fragment of my processor:
>
> ----PC process
> process(inicio, simulated_clk)
>
>   variable pc_out  : std_logic_vector(15 downto 0);
>   variable pc_high : std_logic_vector(15 downto 0);
> begin
>   -- If it is the first time it is loaded
>   if inicio='0' then
>     pc      <= "0000000000000000";
>     reg_ret <= "0000000000000000";
>     pc_out  := "0000000000000000";
>   -- It has been loaded before
>   elsif simulated='1' and simulated_clk'event then
>     pc_out := pc;
>
>   ...
> end process;
>
> I don't know why it doesn't work. As I said, I just want to simulate a clock
> to show my professor how the processor works step by step (instruction
> by instruction).
>
> Hope you can help. Thanks!
>
> PD: Sorry for my poor English.
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com

You don't say exactly what doesn't work, but the obvious problem with
using a button for a clock is that if it is a typical mechanical switch
it will bounce, and then each button press will generate many clocks to
your circuit. You need to debounce the switch in order to use it as a
clock.

-- Gabor

Article: 154853
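A counter-based debouncer along the lines Gabor describes: sample the raw button with the 50 MHz board clock ("clk" in the UCF above) and only pass a new level through once the input has been stable for roughly 20 ms. This is a generic sketch, not code from the thread, and the entity and port names are invented:

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

  entity debounce is
    port ( clk         : in  std_logic;   -- 50 MHz board clock
           boton_raw   : in  std_logic;   -- bouncy pushbutton or switch
           boton_clean : out std_logic ); -- debounced level
  end entity;

  architecture rtl of debounce is
    signal sync0, sync1 : std_logic := '0';
    signal count        : unsigned(19 downto 0) := (others => '0');  -- 2**20 / 50 MHz is about 21 ms
    signal state        : std_logic := '0';
  begin
    process(clk)
    begin
      if rising_edge(clk) then
        sync0 <= boton_raw;              -- two-flop synchroniser
        sync1 <= sync0;
        if sync1 = state then
          count <= (others => '0');      -- input agrees with output: nothing to do
        else
          count <= count + 1;            -- input differs: measure how long it stays different
          if count = 2**20 - 1 then
            state <= sync1;              -- stable long enough: accept the new level
          end if;
        end if;
      end if;
    end process;

    boton_clean <= state;
  end architecture;

The clean output can then drive simulated_clk, or, often nicer, be edge-detected in the 50 MHz domain and used as a one-clock enable so the processor keeps a single real clock.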
On 1/17/2013 8:57 PM, glen herrmannsfeldt wrote: > Rob Doyle<radioengr@gmail.com> wrote: > > (snip) > >> I guess I'm using term ALU and am2901 interchangeably. >> I'll be more specific. > >> There is nothing wrong with the am2901 proper. It is what it is. > > I suppose, but it was designed way before the tools we use now. > > (snip) > >> The problems is that am2901 output goes to a bus that eventually routes >> back to the am2901 input for some unused (as best I can tell) >> configuration of the microcode. This all happens with no registers in >> the loop. > > (snip) > >> I guess that it is just a design from another day - a whole lot less >> synchronous than anything I've done in an FPGA before. > >> I have enjoyed going back through that all. I even found my "Mick and >> Brick" book. I'll probably do a VAX 11/780 next which also used >> bit-sliced parts. > > Years ago, maybe just about when it was new, I bought "Mick and Brick." > > Then, about 20 years ago, it got lost in a move. A few weeks ago I > bought a used one from half.com for a low price. (In case I decide > to do some 2901 designs in FPGAs.) > > The discussion on combinatorial loops reminds me of the wrap around > carry on ones complement adders. If done the obvious way, it is a > combinatorial loop, but hopefully one that, in actual use, resolves > itself. Mick and Brick was not just about the 2901, it covered the basic concepts of designing a processor. One of the things that stuck with me was the critical path they described, which I believe was in a conditional branch calculating the next address (I guess it finally got away from me again). I found that to be true on every processor design I looked at, including the MISC designs I did in FPGAs on my own. These guys had some pretty good insight into processor design. I had my own book too and will have to dig around for it. But I am pretty sure it is gone as I haven't seen it in other searches I've done for other books the last ten years or so. I think I got it free from AMD at one point. Now they are over $100 for one in good condition. I'm not sure what "adequate" means for a book condition. They say it is all legible, but I've seen some pretty rough books in "good" condition. RickArticle: 154854
On 1/17/2013 9:09 PM, glen herrmannsfeldt wrote: > rickman<gnuarm@gmail.com> wrote: > > (snip, I wrote) > >>> The BRAM on most FPGAs are synchronous (clocked). That might not >>> match what you need for some older designs. If it isn't too big, >>> and you really need asynchronous RAM, you have to make it out >>> of CLB logic. > >> Yes, not only are the block RAMs synchronous, the LUT RAMs (distributed) >> are also synchronous. That is why I say you have to make async RAM out >> of latches. > > The are now? They didn't used to be. I am somewhat behind in the > generations of FPGAs. Actually, I think Xilinx made their XC4000 series with clocks for writing the distributed RAM. They had too much trouble with poor designs trying to generate a write pulse with good timing and decided they were better off giving the user a clock. I used the ACEX from Altera in 2000 or so which had an async read block RAM. It made a processor easier to design saving a clock cycle on reads. Block RAMs have always been synchronous on the writes and now they are synchronous on reads as well... so many generations of FPGAs... > (snip, I also wrote) > >>> I believe that the KA-10 was done in asynchronous (non-clocked) logic. >>> That might make an interesting FPGA project. > >> Doing async logic in an FPGA is not so easy. You need timing info that >> is hard to get. > > The whole idea behind asynchronous logic is that you don't need to know > any timing information. Otherwise known as self-timed logic, there is > enough hand-shaking such that every signal changes when it is ready, no > sooner and no later. If you use dual-rail logic: > > http://en.wikipedia.org/wiki/Asynchronous_system#Asynchronous_datapaths > > then all timing just works. You need to read that section again... Nowhere does it say the timing "just works". It describes two ways to communicate a signal, one is to send one of two pulses for a 1 or a 0 and the other is to use a handshake signal which has a delay longer than the data it is clocking. In both cases you have to use timing to generate the control signal (or combined data and control in the first case). The advantage is that the timing issues are "localized" to the unit rather than being global. The problem with doing this in an FPGA is that the tools are all designed for fully synchronous systems. This sort of local timing with an emphasis on relative delays rather than simple maximum delays is difficult to do using the standard tools. > If you mix synchronous and asynchronous logic, then things get more > interesting. All real time systems are at some point synchronous. They have deadlines to meet and often there are recurring events that have to be synced to a clock such as an interface or an ADC. In the end an async processor buys you very little other than saving power in the clock tree. Even this is just a strawman as the real question is the power it takes to get the job done, not how much power is used to distribute the clock. The GA144 is an array of 144 fully async processors. I have looked at using the GA144 for real world designs twice. In each case the I/O had to be clocked which is awkwardly supported in the chip. In the one case the limitations made it very difficult to even analyze timing of a clocked interface, much less meet timing. In the other case low power was paramount and the GA144 could not match the power requirements while I am pretty sure I can do the job with a low power FPGA. 
Funny actually, the GA144 has an idle current of just 55 nA per processor or just 7 uA for the chip. The FPGA I am working with has an idle current of some 40 uA but including the processing the total should be under 100 uA. In the GA144 I calculated over 100 uA just driving the ADC not counting any real processing. Actually, most of the power is used in timing the ADC conversion. Without a high resolution clock the only way to time the ADC conversion is to put the processor in an idle loop... There is many a slip 'twixt cup and lip. RickArticle: 154855
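On the synchronous- versus asynchronous-read point above: in VHDL inference the difference usually comes down to whether the read is registered. A small illustrative sketch follows; the names are invented, both read styles are shown in one entity only for comparison, and how each maps is tool- and family-dependent:

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

  entity ram_sketch is
    port ( clk        : in  std_logic;
           we         : in  std_logic;
           waddr      : in  unsigned(5 downto 0);
           raddr      : in  unsigned(5 downto 0);
           din        : in  std_logic_vector(15 downto 0);
           dout_async : out std_logic_vector(15 downto 0);   -- combinational read
           dout_sync  : out std_logic_vector(15 downto 0) ); -- registered read
  end entity;

  architecture rtl of ram_sketch is
    type ram_t is array (0 to 63) of std_logic_vector(15 downto 0);
    signal ram : ram_t;
  begin
    process(clk)
    begin
      if rising_edge(clk) then
        if we = '1' then
          ram(to_integer(waddr)) <= din;       -- writes are clocked in either style
        end if;
        dout_sync <= ram(to_integer(raddr));   -- registered read: block-RAM friendly,
                                               -- but the data arrives one cycle later
      end if;
    end process;

    dout_async <= ram(to_integer(raddr));      -- unregistered read: needs LUT/distributed
                                               -- RAM (or latches on parts without it)
  end architecture;

The registered read is what costs the extra cycle in a processor's fetch path, which is the trade-off being discussed above.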
Hello,

I am searching for an FPGA-accelerated Ethernet card solution for handling TCP sessions before the OS. The solution should complete the 3-way handshake before the operating system/driver stage. This implies it should create SYN-ACK packets and wait for the 3rd-step ACK, and it should keep a connection/session table. Generally, I am expecting a high connection rate (1M connections per second on a 1 Gbps link) and a high number of live sessions.

I would be grateful for any redirection. Sorry for bothering you if this is the wrong community for the subject.

Regards,
Oguz

Article: 154856
oguzyilmazlist@gmail.com wrote: > Hello, > > I am searching for a fpga accelerated ethernet card solution > for facing tcp sessions before OS. The solution should complete > 3 way handshake before operating system/driver stage. > This implies it should create SYN-ACK packets and wait for 3rd > step ACK. This implies it should keep a connection/session table. > Generally, I am waiting high connection rate (1M conn per second > for 1 Gbps connection) and high number of live sessions. It might be that some NIC do that. I know there are some with special features to offload some of the processing from the server, such as the checksum calculation. > I would be grateful for any redirection. Sorry for bothering if > this is the wrong community for the subject. You might try comp.dcom.lans.ethernet, even though it isn't really an ethernet question. There is also a tcpip group. -- glenArticle: 154857
On Saturday, January 19, 2013 2:09:38 PM UTC+2, glen herrmannsfeldt wrote:
> oguzyilmazlist@gmail.com wrote:
> > I am searching for a fpga accelerated ethernet card solution
> > for handling tcp sessions before the OS. The solution should complete
> > the 3 way handshake before the operating system/driver stage.
> > This implies it should create SYN-ACK packets and wait for the 3rd
> > step ACK. This implies it should keep a connection/session table.
> > Generally, I am expecting a high connection rate (1M conn per second
> > for a 1 Gbps connection) and a high number of live sessions.
>
> It might be that some NICs do that. I know there are some with
> special features to offload some of the processing from the
> server, such as the checksum calculation.

I am searching for a different solution than ordinary TOE NIC solutions. The difference is the high rate of TCP session setup/teardown.

> > I would be grateful for any redirection. Sorry for bothering if
> > this is the wrong community for the subject.
>
> You might try comp.dcom.lans.ethernet, even though it isn't
> really an ethernet question. There is also a tcpip group.
>
> -- glen

Article: 154858
On Sat, 19 Jan 2013 03:54:49 -0800, oguzyilmazlist wrote: > Hello, > > I am searching for a fpga accelerated ethernet card solution for facing > tcp sessions before OS. The solution should complete 3 way handshake > before operating system/driver stage. This implies it should create > SYN-ACK packets and wait for 3rd step ACK. This implies it should keep a > connection/session table. Generally, I am waiting high connection rate > (1M conn per second for 1 Gbps connection) and high number of live > sessions. I hate to be a naysayer, but I believe 1M connections per second is not possible on a 1Gb/s link, regardless of how fast the processing is. Minimum frame size = 64 bytes + 8 bytes preamble and SFD + 12 bytes IFG. You can send up to about 1.488M packets per second in each direction. Can't do a 3 way handshake with 1.488 packets, unless you do some trick like putting multiple handshakes in the one packet. I've never implemented a TCP/IP stack, so I might be missing something. Regards, AllanArticle: 154859
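Spelling out the arithmetic behind those numbers (a quick check, not part of the original post):

  minimum frame on the wire = 64 + 8 (preamble/SFD) + 12 (IFG) = 84 bytes = 672 bit times
  1e9 bit/s / 672 bits  =  about 1.488 M frames/s in each direction

  a handshake needs SYN and ACK from the client (2 frames one way) and
  SYN-ACK from the server (1 frame the other way), so the client-to-server
  direction alone caps the rate at about 1.488M / 2 = roughly 744k
  handshakes/s, before any data, FIN/ACK teardown or retransmissions are
  counted.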
oguzyilmaz@gmail.com wrote: (snip) >> > I am searching for a fpga accelerated ethernet card solution >> > for facing tcp sessions before OS. The solution should complete >> > 3 way handshake before operating system/driver stage. >> > This implies it should create SYN-ACK packets and wait for 3rd >> > step ACK. This implies it should keep a connection/session table. >> > Generally, I am waiting high connection rate (1M conn per second >> > for 1 Gbps connection) and high number of live sessions. >> It might be that some NIC do that. I know there are some with >> special features to offload some of the processing from the >> server, such as the checksum calculation. > I am searching for a different solution then ordinary TOE NIC > solutions. The difference is high rate of tcp session setup/teardown. Yes, but someone else might have had this problem before. Though a high rate of setup/teardown implies only a small amount of data to each, and most use UDP in that case. Can you explain the actual problem that you are trying to solve? (Which specific protocol, or what kind of data?) I haven't thought about this for a while, but I believe, while it is usually not done, it is possible to include TCP data in some of the TCP handshaking packets. You might also be able to add FIN earlier than usual. Post to the tcp-ip newsgroup and ask about the minimum TCP session. You might be able to do: 1) SYN+data 2) SYN+ACK+data+FIN 3) ACK+data+FIN I know that there are NICs designed to offload some of the work, but I don't know much more than that. -- glenArticle: 154860
rickman <gnuarm@gmail.com> wrote: (snip) > Mick and Brick was not just about the 2901, it covered the basic > concepts of designing a processor. One of the things that stuck with me > was the critical path they described, which I believe was in a > conditional branch calculating the next address (I guess it finally got > away from me again). I found that to be true on every processor design > I looked at, including the MISC designs I did in FPGAs on my own. These > guys had some pretty good insight into processor design. Yes, but with 29xx for all the examples. I also have some books on microprogramming, independent of the processor. Well, maybe not completely independent. > I had my own book too and will have to dig around for it. But I am > pretty sure it is gone as I haven't seen it in other searches I've done > for other books the last ten years or so. I think I got it free from > AMD at one point. Now they are over $100 for one in good condition. > I'm not sure what "adequate" means for a book condition. They say it is > all legible, but I've seen some pretty rough books in "good" condition. half.com has $1.49 (plus shipping) for acceptable condition, $5.52 for good condition, and $8.88 for very good condition. The one I got has the dust jacket a little worn and torn, and the spine might be a little weak, but plenty usable. -- glenArticle: 154861
rickman <gnuarm@gmail.com> wrote: (snip) >>> Yes, not only are the block RAMs synchronous, the LUT RAMs (distributed) >>> are also synchronous. That is why I say you have to make async RAM out >>> of latches. >> The are now? They didn't used to be. I am somewhat behind in the >> generations of FPGAs. > Actually, I think Xilinx made their XC4000 series with clocks for > writing the distributed RAM. They had too much trouble with poor > designs trying to generate a write pulse with good timing and decided > they were better off giving the user a clock. I used the ACEX from > Altera in 2000 or so which had an async read block RAM. It made a > processor easier to design saving a clock cycle on reads. Block RAMs > have always been synchronous on the writes and now they are synchronous > on reads as well... so many generations of FPGAs... I am not sure about writes now. BRAMs are synchronous read, but LUT RAM better not be, as the LUTs are the same as used for logic. Some designs just won't work with a synchronous read RAM (which is sometimes a ROM). (snip on asynchronous logic) > You need to read that section again... Nowhere does it say the timing > "just works". It describes two ways to communicate a signal, one is to > send one of two pulses for a 1 or a 0 and the other is to use a > handshake signal which has a delay longer than the data it is clocking. > In both cases you have to use timing to generate the control signal > (or combined data and control in the first case). The advantage is that > the timing issues are "localized" to the unit rather than being global. I meant the one they call dual rail logic. There are two wires sending the signal, in one of three states, 0, 1, or none, and one coming back acknowledging the signal. The generate a signal, either the 0 or 1 goes active, until the ack comes back, at which time the output signal is removed until the ack goes away. Full handshake both ways. > The problem with doing this in an FPGA is that the tools are all > designed for fully synchronous systems. This sort of local timing with > an emphasis on relative delays rather than simple maximum delays is > difficult to do using the standard tools. Yes. Besides all those useless FF's in each cell. >> If you mix synchronous and asynchronous logic, then things get more >> interesting. > All real time systems are at some point synchronous. They have > deadlines to meet and often there are recurring events that have to be > synced to a clock such as an interface or an ADC. In the end an async > processor buys you very little other than saving power in the clock > tree. Even this is just a strawman as the real question is the power it > takes to get the job done, not how much power is used to distribute the > clock. As I understand it, there are some current processors with asynchronous logic blocks, such as a multiplier. The operands are clocked in and, an unknown (data dependent) number of cycles later the result comes out, is latched, and sent on. So, 0*0 might be very fast, were full width operands might be much slower. > The GA144 is an array of 144 fully async processors. I have looked at > using the GA144 for real world designs twice. In each case the I/O had > to be clocked which is awkwardly supported in the chip. In the one case > the limitations made it very difficult to even analyze timing of a > clocked interface, much less meet timing. 
In the other case low power > was paramount and the GA144 could not match the power requirements while > I am pretty sure I can do the job with a low power FPGA. Funny > actually, the GA144 has an idle current of just 55 nA per processor or > just 7 uA for the chip. The FPGA I am working with has an idle current > of some 40 uA but including the processing the total should be under 100 > uA. In the GA144 I calculated over 100 uA just driving the ADC not > counting any real processing. Actually, most of the power is used in > timing the ADC conversion. Without a high resolution clock the only way > to time the ADC conversion is to put the processor in an idle loop... Sounds like an interesting design. -- glenArticle: 154862
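To make the data-dependent-latency idea concrete, here is a small synchronous sketch (invented for illustration, not from the thread) of a shift-and-add multiplier that raises done as soon as the remaining multiplier bits are all zero, so small operands finish in a cycle or two while full-width ones take all sixteen shift steps:

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

  entity mult_early_out is
    port ( clk   : in  std_logic;
           start : in  std_logic;
           a, b  : in  unsigned(15 downto 0);
           p     : out unsigned(31 downto 0);
           done  : out std_logic );
  end entity;

  architecture rtl of mult_early_out is
    signal acc   : unsigned(31 downto 0);
    signal mcand : unsigned(31 downto 0);
    signal mplr  : unsigned(15 downto 0);
    signal busy  : std_logic := '0';
  begin
    process(clk)
    begin
      if rising_edge(clk) then
        done <= '0';
        if start = '1' then
          acc   <= (others => '0');
          mcand <= resize(a, 32);
          mplr  <= b;
          busy  <= '1';
        elsif busy = '1' then
          if mplr = 0 then                  -- nothing left to add: finish early
            p    <= acc;
            done <= '1';
            busy <= '0';
          else
            if mplr(0) = '1' then
              acc <= acc + mcand;           -- add the shifted multiplicand
            end if;
            mcand <= shift_left(mcand, 1);
            mplr  <= shift_right(mplr, 1);  -- consume one multiplier bit per cycle
          end if;
        end if;
      end if;
    end process;
  end architecture;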
On Saturday, January 19, 2013 8:52:55 PM UTC+2, glen herrmannsfeldt wrote:
> (snip)
> >> > I am searching for a fpga accelerated ethernet card solution
> >> > for handling tcp sessions before the OS. The solution should complete
> >> > the 3 way handshake before the operating system/driver stage.
> >> > This implies it should create SYN-ACK packets and wait for the 3rd
> >> > step ACK. This implies it should keep a connection/session table.
> >> > Generally, I am expecting a high connection rate (1M conn per second
> >> > for a 1 Gbps connection) and a high number of live sessions.
>
> >> It might be that some NICs do that. I know there are some with
> >> special features to offload some of the processing from the
> >> server, such as the checksum calculation.
>
> > I am searching for a different solution than ordinary TOE NIC
> > solutions. The difference is the high rate of tcp session setup/teardown.
>
> Yes, but someone else might have had this problem before.
>
> Though a high rate of setup/teardown implies only a small amount
> of data to each, and most use UDP in that case.
>
> Can you explain the actual problem that you are trying to solve?
> (Which specific protocol, or what kind of data?)

The actual problems are:

- For spoofed-IP TCP connection attempts, a full TOE NIC should receive the SYN, send the SYN-ACK, and wait for the ACK. This is the 3-way handshake. If it completes, we are sure the IP is not spoofed. Only then does the NIC forward the connection to the driver and operating system.

- Operating systems use hash-tree or radix-tree tables for keeping state entries. At high session setup/teardown rates, it can be slow to add, delete or modify entries in this table. Each state entry is maybe about 500 bytes. I am curious about the outcome of doing the state table operations on a TOE NIC.

> I haven't thought about this for a while, but I believe, while
> it is usually not done, it is possible to include TCP data in
> some of the TCP handshaking packets. You might also be able
> to add FIN earlier than usual.
>
> Post to the tcp-ip newsgroup and ask about the minimum TCP
> session. You might be able to do:
>
> 1) SYN+data
> 2) SYN+ACK+data+FIN
> 3) ACK+data+FIN
>
> I know that there are NICs designed to offload some of the work,
> but I don't know much more than that.
>
> -- glen

Article: 154863
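On the state-table side, the structure usually implied is a hash of the 4-tuple indexing on-chip (or external) RAM. A very rough sketch follows; every name and width here is invented, the hash is a toy XOR fold, and collision handling, timeouts and the actual TCP state machine are all left out:

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.numeric_std.all;

  entity conn_table is
    port ( clk       : in  std_logic;
           lookup    : in  std_logic;                        -- strobe: hash and read
           src_ip    : in  std_logic_vector(31 downto 0);
           dst_ip    : in  std_logic_vector(31 downto 0);
           src_port  : in  std_logic_vector(15 downto 0);
           dst_port  : in  std_logic_vector(15 downto 0);
           state_out : out std_logic_vector(7 downto 0) );   -- e.g. idle / SYN-ACK sent / established
  end entity;

  architecture rtl of conn_table is
    type table_t is array (0 to 16383) of std_logic_vector(7 downto 0);  -- 16K buckets in block RAM
    signal bucket : table_t := (others => (others => '0'));
    signal hash   : unsigned(13 downto 0);
  begin
    -- toy hash: XOR-fold the 4-tuple down to a 14-bit bucket index
    hash <= unsigned(src_ip(13 downto 0)) xor
            unsigned(dst_ip(13 downto 0)) xor
            resize(unsigned(src_port) xor unsigned(dst_port), 14);

    process(clk)
    begin
      if rising_edge(clk) then
        if lookup = '1' then
          state_out <= bucket(to_integer(hash));   -- one block-RAM read per lookup
        end if;
      end if;
    end process;
  end architecture;

A full entry of the roughly 500 bytes mentioned above (sequence numbers, timers, addresses) would not fit a narrow RAM word like this; it would be spread over wider or multiple RAMs, where the FPGA's fixed, predictable access time is the attraction compared with a software hash table.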
On 1/19/2013 5:35 PM, glen herrmannsfeldt wrote: > rickman<gnuarm@gmail.com> wrote: > > (snip) >>>> Yes, not only are the block RAMs synchronous, the LUT RAMs (distributed) >>>> are also synchronous. That is why I say you have to make async RAM out >>>> of latches. > >>> The are now? They didn't used to be. I am somewhat behind in the >>> generations of FPGAs. > >> Actually, I think Xilinx made their XC4000 series with clocks for >> writing the distributed RAM. They had too much trouble with poor >> designs trying to generate a write pulse with good timing and decided >> they were better off giving the user a clock. I used the ACEX from >> Altera in 2000 or so which had an async read block RAM. It made a >> processor easier to design saving a clock cycle on reads. Block RAMs >> have always been synchronous on the writes and now they are synchronous >> on reads as well... so many generations of FPGAs... > > I am not sure about writes now. BRAMs are synchronous read, but LUT > RAM better not be, as the LUTs are the same as used for logic. > > Some designs just won't work with a synchronous read RAM (which > is sometimes a ROM). That's right. My processor design on the ACEX with async reads had to be modified to work in nearly any other part with fully sync block RAM. I could possibly clock the RAM on the negative edge with the rest of the design clocked on the positive. Or I could do a read on every clock using the address precursor which is available on the prior clock cycle at the input to the address register. Both methods reduce timing margins along with other tradeoffs. >> You need to read that section again... Nowhere does it say the timing >> "just works". It describes two ways to communicate a signal, one is to >> send one of two pulses for a 1 or a 0 and the other is to use a >> handshake signal which has a delay longer than the data it is clocking. >> In both cases you have to use timing to generate the control signal >> (or combined data and control in the first case). The advantage is that >> the timing issues are "localized" to the unit rather than being global. > > I meant the one they call dual rail logic. There are two wires sending > the signal, in one of three states, 0, 1, or none, and one coming > back acknowledging the signal. The generate a signal, either the 0 > or 1 goes active, until the ack comes back, at which time the output > signal is removed until the ack goes away. Full handshake both ways. I haven't seen the logic, but how do they generate the timing for the handshakes? I don't think it "just works". My understanding is that the handshakes are generated by a delay line that is designed to have a longer delay than the logic. This is hard to do with the timing tools designed for synchronous systems. >> The problem with doing this in an FPGA is that the tools are all >> designed for fully synchronous systems. This sort of local timing with >> an emphasis on relative delays rather than simple maximum delays is >> difficult to do using the standard tools. > > Yes. Besides all those useless FF's in each cell. I don't follow. I think typical async logic still has FFs, they just don't use a global clock. I suppose if you have handshakes back and forth you are making latches in the combinatorial logic if nothing else. >>> If you mix synchronous and asynchronous logic, then things get more >>> interesting. > >> All real time systems are at some point synchronous. 
They have >> deadlines to meet and often there are recurring events that have to be >> synced to a clock such as an interface or an ADC. In the end an async >> processor buys you very little other than saving power in the clock >> tree. Even this is just a strawman as the real question is the power it >> takes to get the job done, not how much power is used to distribute the >> clock. > > As I understand it, there are some current processors with asynchronous > logic blocks, such as a multiplier. The operands are clocked in and, > an unknown (data dependent) number of cycles later the result comes > out, is latched, and sent on. So, 0*0 might be very fast, were > full width operands might be much slower. I haven't heard of that. How would that benefit a sync processor? I can only think that would be useful if the design were compared to one with a very slow multiplier which required the processor to wait for many clock cycles. A multiplier with a "ready" flag could shorten the wait. But that can also be done for a fully sync multiplier. In fact, in the async GA144 there is no multiply instruction. Instead there is a multiply step instruction which can be used to do multiplies in a loop. The loop can be terminated when the multiply has detected the rest of the bits are all zero (or all ones maybe?). I haven't seen the code that does this, but this is reported in some of their white papers. >> The GA144 is an array of 144 fully async processors. I have looked at >> using the GA144 for real world designs twice. In each case the I/O had >> to be clocked which is awkwardly supported in the chip. In the one case >> the limitations made it very difficult to even analyze timing of a >> clocked interface, much less meet timing. In the other case low power >> was paramount and the GA144 could not match the power requirements while >> I am pretty sure I can do the job with a low power FPGA. Funny >> actually, the GA144 has an idle current of just 55 nA per processor or >> just 7 uA for the chip. The FPGA I am working with has an idle current >> of some 40 uA but including the processing the total should be under 100 >> uA. In the GA144 I calculated over 100 uA just driving the ADC not >> counting any real processing. Actually, most of the power is used in >> timing the ADC conversion. Without a high resolution clock the only way >> to time the ADC conversion is to put the processor in an idle loop... > > Sounds like an interesting design. I still need to verify that the LVDS input will detect the still very low level signal from the antenna. Once I show that to work, I've got the rest covered. If it doesn't work, I'll either need to use a separate comparator or if that won't work I might be able to provide some feedback to keep the detector on it's sensitive edge. All the other parts have been analyzed well enough that I am very confident I'll meet my goal. BTW, I am thinking of using a cheap analog battery driven clock as an output device. So I bought one for $4 and took it apart. It has the tiny circuit board for the clock chip and crystal and a very simple coil driving a gear that turns 180° each tick. The rest of the clock is the same as any analog clock except it is *all* plastic. Plastic gears, plastic pivot, plastic box. I guess once you do the timing with electronics there is no longer a need for the fancy stuff in the mechanism. Checking on Aliexpress I found the mechanisms for only $2! Sometimes technology is amazing in just how cheaply it can be produced. RickArticle: 154864
TTA-based Co-design Environment (TCE) is a toolset for designing application-specific processors based on the Transport Triggered Architecture (TTA). The toolset provides a complete retargetable co-design flow from high-level language programs down to synthesizable processor RTL (VHDL and Verilog backends supported) and parallel program binaries. Processor customization points include the register files, function units, supported operations, and the interconnection network. This release adds support for LLVM 3.2, initial support for half-precision floats, improved vector support and OpenCL host-tta-device mode simulation with pocl's ttasim driver. See the CHANGES file for a more thorough change listing. Acknowledgements ---------------- Many thanks to Dr. Erno Salminen for his improvements on the TCE User Manual, and to Antti Hayrrinen for his first code contributions to this release. Keep them coming! We'd like to thank the Radio Implementation Research Team of Nokia Research Center and Academy of Finland (funding decision 253087) for financially supporting most of the work for developing this release. Much appreciated! Links ----- TCE home page: http://tce.cs.tut.fi This announcement: http://tce.cs.tut.fi/downloads/ANNOUNCEMENT Download: http://tce.cs.tut.fi/downloads Change log: http://tce.cs.tut.fi/downloads/CHANGESArticle: 154865
Article: 154866
Ok, it didn't work because the 'start' port was always at 0, so the processor couldn't start working :)
---------------------------------------
Posted through http://www.FPGARelated.com

Article: 154867
On Mon, 21 Jan 2013 11:14:48 -0800, deepak wrote: Absolutely nothing, because he mistook the title for the text. I'm not going to even attempt to read the post that you shoved into the title. Try writing a _short_, _descriptive_ title, like "8051 core problems", then try the new, different, and innovative technique of PUTTING YOUR QUESTION IN THE TEXT OF YOUR POST. I guarantee you, more people will be willing to answer. -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.comArticle: 154868
Hi Balaji, were you able to finish your project and communicate with the Nexys2 board via USB? It would be great to hear from you. At the moment I am struggling with the same problem: where to start, what to do, and so on. I want to transfer a file from the PC to the Nexys 2 BRAM via USB. Hope you or anybody can help me. Thank you in advance. Best regards, Vidya.

Article: 154869
On Monday, January 21, 2013 2:06:05 PM UTC-8, Tim Wescott wrote: > On Mon, 21 Jan 2013 11:14:48 -0800, deepak wrote: > > > > Absolutely nothing, because he mistook the title for the text. > > > > I'm not going to even attempt to read the post that you shoved into the > > title. Try writing a _short_, _descriptive_ title, like "8051 core > > problems", then try the new, different, and innovative technique of > > PUTTING YOUR QUESTION IN THE TEXT OF YOUR POST. > > > > I guarantee you, more people will be willing to answer. > > > > -- > > My liberal friends think I'm a conservative kook. > > My conservative friends think I'm a liberal kook. > > Why am I not happy that they have found common ground? > > > > Tim Wescott, Communications, Control, Circuits & Software > > http://www.wescottdesign.com thanks for ur guidance!!Article: 154870
i am getting these errors can anybody help!!

ERROR:HDLCompilers:27 - "../../Documents and Settings/deepak/Desktop/8051/trunk/rtl/verilog/oc8051_alu_test.v" line 62 Illegal redeclaration of 'oc8051_alu'

ERROR:HDLCompilers:26 - "../../Documents and Settings/deepak/Desktop/8051/trunk/rtl/verilog/oc8051_alu_test.v" line 322 Macro reference `OC8051_ALU_PCS is not defined

Article: 154871
On Tue, 22 Jan 2013 06:49:29 -0800, deepak wrote: > i am getting these errors can anybody help!! > > ERROR:HDLCompilers:27 - "../../Documents and > Settings/deepak/Desktop/8051/trunk/rtl/verilog/oc8051_alu_test.v" line > 62 Illegal redeclaration of 'oc8051_alu' You don't say what tool you're using, whether you're trying to do simulation or synthesis, and whether you've written oc8051_alu_test.v yourself, or if you got it someplace, where and whether you've modified it. That error message indicates that something called "oc8051_alu" is being declared twice. Have you looked through that file to see if it's happening within the file? > ERROR:HDLCompilers:26 - "../../Documents and > Settings/deepak/Desktop/8051/trunk/rtl/verilog/oc8051_alu_test.v" line > 322 Macro reference `OC8051_ALU_PCS is not defined You're attempting to use a macro that doesn't exist. If you wrote the code, then you probably misspelled the macro either where you define it or where you use it. If you didn't write the code, tell us where you got it. If you're trying to make an 8051 core work and this level of problem stymies you, you have a long row to hoe. -- Tim Wescott Control system and signal processing consulting www.wescottdesign.comArticle: 154872
deepak wrote:
> i am getting these errors can anybody help!!
>
> ERROR:HDLCompilers:27 - "../../Documents and Settings/deepak/Desktop/8051/trunk/rtl/verilog/oc8051_alu_test.v" line 62 Illegal redeclaration of 'oc8051_alu'
>
> ERROR:HDLCompilers:26 - "../../Documents and Settings/deepak/Desktop/8051/trunk/rtl/verilog/oc8051_alu_test.v" line 322 Macro reference `OC8051_ALU_PCS is not defined

First, if you're using Xilinx tools I would suggest moving your project to a directory that has no spaces in the path, like c:\projects.

Second, I suspect that although you're only seeing 2 errors, there must be piles of warnings that would probably shed light on the real issue. For example, your second error saying that a Verilog macro is not defined often means that some `include file was not found and therefore you missed out on all of the `defines in that included file. This often happens in the Xilinx tools if you don't tell it where to look for included files or put a more complete path in the `include statement.

-- Gabor

Article: 154873
On Tuesday, January 22, 2013 6:49:29 AM UTC-8, deepak wrote: > i am getting these errors can anybody help!! > > > > ERROR:HDLCompilers:27 - "../../Documents and Settings/deepak/Desktop/8051/trunk/rtl/verilog/oc8051_alu_test.v" line 62 Illegal redeclaration of 'oc8051_alu' > > > > > > ERROR:HDLCompilers:26 - "../../Documents and Settings/deepak/Desktop/8051/trunk/rtl/verilog/oc8051_alu_test.v" line 322 Macro reference `OC8051_ALU_PCS is not defined It seems oc8051_alu_test.v and oc8051_alu.v define the same module inside so you probably want only one of them. It seems like alu_test is a new module in development. Try running the scripts in the sim directory instead of trying to make your own tests.Article: 154874
I noticed this very old post about the "book". Is it to be published soon? On Wednesday, February 22, 2006 9:28:42 AM UTC-7, Ray Andraka wrote: > Stephen Craven wrote: > > Last summer mention was made of a DSP / FPGA book by Ray Andraka > > hitting the (online) shelves this fall: > > http://tinyurl.com/s4v5s > > > > Has this come to pass? > > > > Aside from Meyer-Baese's "Digital Signal Processing with Field > > Programmable Gate Arrays" book, which I found somewhat difficult to > > read, are there any other FPGA-specific DSP books out there? > > > > Thanks! > > Stephen > > > > No, unfortunately, I am still working on it, and my publisher (elsevier) > is sitting on my rather firmly to get it done. Only so many hours per > day. They have set a new deadline for me to have it to them by August > so that they can get it out in the fall.