Peter Alfke wrote:
> jacko wrote:
> >
> > Pin compatibility is just customer support. How about a 1 pin high
> > implies a self program from a small hardwired ROM, which gets enough of
> > the chip off the ground to work as a programmer for itself and others.
>
> We have had that since the beginning, 20 years ago.
> It is called "Master Mode Configuration".
>
> Peter Alfke, Xilinx

No - I think this is more like one of my past "ideas for Xilinx": the FPGA has a built-in hardware loader for a __small__ ROM. This ROM contains the logic to implement the actual loader, be it CompactFlash or NAND or whatever. Easily doable - just make a small part of the FPGA come alive first, allowing the rest of the FPGA to be configured from the 'bootstrap IP core'. Nobody is doing it, but without that, the RAM-based FPGA configuration solutions are still kind of a PITA. Sure, as Xilinx is now bringing back the parallel flash solutions from the XC2K into S3E and Virtex-5 it gets better, but the bootstrap idea would still be the kicker!

Antti

Article: 107201
OK,

The Virtex 4 family is the first family ever to be able to shift 1,0 through all resources (without tripping the power-on reset from the transient of everything switching on the same edge). Cooling it is another matter, as many have stated.

My comment on over-clocking was intended to say that we are completely unlike a microprocessor, and the traditional tricks that you read about to get a microprocessor to work faster are not likely to work, as we have far more complex timing paths in a customer design.

You appeared to live up to your name, that was all I was observing. Sounds like you do know something of what you speak. Sorry if I thought you were (totally) ignorant. Given the name, and the posting, it was hard to tell.

Getting back to the 2000E, I remember that we had quite a bit of difficulty with the speed files for that part. Something about them being unable to model some of the paths accurately. Generating the speed files is sometimes difficult, and the software trying to model the hardware can be just plain wrong. With Virtex 4, and now Virtex 5, we no longer allow the software to "invent" ways to model the hardware: the process instead forces the modeling to match the hardware. Tricky business.

So, even this perverse design (in V4, or V5) is now able to run, in the largest part. I still do not recommend it, as the THUMP from all that switching leads to not being able to control jitter, even with the best possible power distribution system. I believe that V4 and V5 will require some internal 'SSO' usage restrictions, as they are not like earlier devices, which would configure, DONE would go high, and then immediately reset and go back to configuring if you tried to instantiate and run a full-device shift register.

Austin

Article: 107202
Hello everyone,

First of all, big thanks to everyone who contributes to this newsgroup. Most of the time, I can find a solution to my problem just by googling this newsgroup. Not this time however... I know that this is a long post, but if you can help with the UltraController II + SystemACE or UC2 + xmd, please read on.

I'm currently trying to get a system with the UltraController II up and running. To make a long story short (hopefully), I started with the tools at v7.1, the UC2 reference design, and an eval board from Avnet. I was able to successfully build a small system using the UC2 and a Wishbone bus. I always ran the UC2 code using the debugger however (XMD + GDB). Confident that the UC2 was a good solution for me (as I don't have external RAM), I spent a few months incorporating the UC2 in my real design (3 FPGAs, two VP40 and one VP7). I used SystemC to simulate the UC2 and now I have reached the point of trying it out in hardware. That's when the problems started...

First of all, xmd can't even detect the PowerPCs in the JTAG chain in the big design. So I took a step back and went back to the Avnet eval board to play around more and get to understand the flow better. Right now I'd like to accomplish two things:

1- I'd like to be able to create a SystemACE file that loads both the bit config file and the elf program file at boot-up (for the Avnet board with a single VP7).
2- I'd like to understand how xmd can load and run my software without the GDB debugger.

For the SystemACE, I used the modified genace.tcl script that comes with the reference design, but it doesn't work. To be more precise, I can generate the ace file, and the SystemACE loads everything into the FPGA fine (the done LED goes up), but my code doesn't boot. The hardware part seems okay because one of my debug LEDs turns on (attached directly to VCC in the FPGA). Since the code doesn't boot, I tried to use xmd to connect to the PowerPC. However, xmd doesn't detect the PowerPC (ERROR: Unable to connect to PowerPC target. Invalid Processor Version No 0x00000000). That's strange because if I then proceed to load the bit file using iMPACT, xmd can detect the PowerPC fine.

Now for xmd, I'd like to know how to load and run my code with it. I followed the instructions in XAPP571 (to properly use the DEBUGHALT controller), but it doesn't work. I tried a few things but I'm really in the dark, as I don't have a deep understanding of the whole EDK flow (which is why I chose the UltraController in the first place).

I apologize for the long post, but I wanted to give as many details as possible to whoever could help me... By the way, I opened several webcases with Xilinx to help me along the way to where I am now, but I thought I'd give this newsgroup a shot for any UC2 experts out there.

Thanks.

Patrick Dubois

Article: 107203
Totally_Lost wrote:
> A while back I had a nice long chat on the phone with Ray about this
> problem ... I don't think "ANYONE" ... even Ray, can make it work.
>
> Unless there is some God called "Anyone" that you are hiding in the
> wings.

Ahh, this must be John Bass. I thought I recognized this particular rant. I guess he changed his screen name from fpga_toys or whatever it was.

Yes, you can make an extreme design that will dissipate around 100W, which would be a real challenge to power and keep cool. That is really a pathological case though; real-world high-density, high-clock-rate designs tend to have average toggle rates of 20% or less. Bit-serial designs have toggle rates that are a bit higher, but still usually well under 50%. I don't see dissipations of more than about 20-25W, which can be handled with proper thermal design on any of the large FPGAs. In most cases, I'd say the average dissipation I've been seeing on large aggressive designs (2V6000, V4SX55, 2P70) is between 10 and 13 Watts.

Article: 107204
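[Editor's aside] To put rough numbers on the toggle-rate argument in the post above, here is a back-of-envelope dynamic-power sketch, P = N * alpha * C * V^2 * f. Every number in it (node count, per-net capacitance, core voltage, clock rate) is an illustrative assumption, not data for any particular Xilinx device; the only point is how linearly the estimate scales with the average toggle rate alpha.

/* Back-of-envelope dynamic power: P = N * alpha * C * V^2 * f.
 * All values below are made-up round numbers for illustration only. */
#include <stdio.h>

int main(void)
{
    double n_nodes = 100e3;    /* assumed number of switching nets        */
    double c_node  = 1.0e-12;  /* assumed effective capacitance per net   */
    double v_core  = 1.5;      /* assumed core voltage, volts             */
    double f_clk   = 150e6;    /* assumed clock frequency, Hz             */

    for (double alpha = 0.1; alpha <= 1.01; alpha += 0.1) {
        double p = n_nodes * alpha * c_node * v_core * v_core * f_clk;
        printf("toggle rate %.0f%% -> ~%.1f W\n", alpha * 100.0, p);
    }
    return 0;
}

With these made-up values, a 20% average toggle rate lands in the handful-of-watts range while 100% toggling is several times that, which is roughly the shape of the argument being made above: the pathological full-toggle design and the typical 20%-toggle design differ by a large constant factor, everything else being equal.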
Is there a difference between Quartus "Web Edition" and "Web Edition Full"?

I bought the NIOS development kit, which came with Quartus 5.1 Web Edition Full. I now want to replace it with the standard Quartus 6 Web Edition, but the program won't allow me to enter a new (downloaded) license file, nor will it perform the automatic web license update.

Message: "Can't find a newer license file for this system at the Altera website, contact Altera Customer service"

Does anybody know what may be going on? I have raised a service request on the Altera website, but I suspect they may take a very long time to respond.

Any help appreciated. TIA

Article: 107205
["Followup-To:" header set to comp.lang.vhdl.]
On 2006-08-25, mikegurche@yahoo.com <mikegurche@yahoo.com> wrote:
[snip]
> Hi, Eilert,
>
> I generally use this style but with a different output segment. I have
> three output logic templates:
>
> Template 1: vanilla, unbuffered output
>
>   -- FSM with unbuffered output
>   -- Can be used for Mealy/Moore output
>   -- (include input in sensitivity list for Mealy)
>   FSM_unbuf_out : PROCESS(CurrentState)
>   BEGIN
>      Y <= '0'; -- Default value assignments
>      Z <= '0';
>      CASE CurrentState IS
>         WHEN Start  => NULL;
>         WHEN Middle => Y <= '1';
>                        Z <= '1';
>         WHEN Stop   => Z <= '1';
>         WHEN OTHERS => NULL;
>      END CASE;
>   END PROCESS FSM_unbuf_out;
>
> Template 2: add buffer for output (There are 4 processes now ;-)
>
>   -- FSM with buffered output
>   -- there is a 1-clock delay
>   -- can be used for Mealy/Moore output
>   FSM_unbuf_out : PROCESS(CurrentState)
>   BEGIN
>      Y_tmp <= '0'; -- Default value assignments
>      Z_tmp <= '0';
>      CASE CurrentState IS
>         WHEN Start  => NULL;
>         WHEN Middle => Y_tmp <= '1';
>                        Z_tmp <= '1';
>         WHEN Stop   => Z_tmp <= '1';
>         WHEN OTHERS => NULL;
>      END CASE;
>   END PROCESS FSM_unbuf_out;
>
>   -- buffer for output signal
>   FSM_out_buf : PROCESS(Clock, Reset)
>   BEGIN -- Output logic
>      IF Reset = '1' THEN
>         Y <= '0'; -- Default value assignments
>         Z <= '0';
>      ELSIF Clock'EVENT AND Clock = '1' THEN
>         Y <= Y_tmp;
>         Z <= Z_tmp;
>      END IF;
>   END PROCESS FSM_out_buf;
>
> Template 3: buffer with "look-ahead" output logic
>
>   -- FSM with look-ahead buffered output
>   -- no 1-clock delay
>   -- can be used for Moore output only
>   FSM_unbuf_out : PROCESS(NextState)
>   BEGIN
>      Y_tmp <= '0'; -- Default value assignments
>      Z_tmp <= '0';
>      CASE NextState IS
>         WHEN Start  => NULL;
>         WHEN Middle => Y_tmp <= '1';
>                        Z_tmp <= '1';
>         WHEN Stop   => Z_tmp <= '1';
>         WHEN OTHERS => NULL;
>      END CASE;
>   END PROCESS FSM_unbuf_out;
>
>   -- buffer for output signal
>   -- same as template 2
>   FSM_out_buf : PROCESS(Clock, Reset)
>   . . .
>
> The code is really lengthy. However, as you indicated earlier, its
> structure is regular, and it can serve as a template or even be
> autogenerated. I developed the template based on
> "http://academic.csuohio.edu/chu_p/rtl/chu_rtL_book/rtl_chap10_fsm.pdf"
> It is a very good article on FSMs (or very bad, if this is not your
> coding style).

Hi.. I've read this pdf and it looks very interesting... it shows many different types of state machine implementations, etc. But the way I code my state machines is different from all of them, and I don't know if it's good or which one mine is equivalent to.

My state machine is in one single process, but it is different from the single-process style shown in the rtl_chap10_fsm.pdf file (the one that is supposed to be bad). Here's one of my state machine examples:

====================================================================

type txgen_states_t is (
   st_idle,
   st_gotsync,
   st_tx_delay,
   st_tx_startcharge,
   st_tx_stopcharge,
   st_tx_fire,
   st_wait_min_period
);
...
constant zero32 : std_logic_vector(31 downto 0) := (others => '0');
signal prev_state_buf, cur_state_buf : txgen_states_t;
...

txgen_state_machine_proc : process(clk, reset_n)
begin
   if reset_n = '0' then
      prev_state_buf <= st_idle;
      cur_state_buf  <= st_idle;
   elsif rising_edge(clk) then
      prev_state_buf <= cur_state_buf;
      case cur_state_buf is
         when st_idle =>
            if sync = '1' then
               cur_state_buf <= st_gotsync;
            else
               cur_state_buf <= cur_state_buf;
            end if;
         when st_gotsync =>
            cur_state_buf <= st_tx_delay;
         when st_tx_delay =>
            if tx_delay_done = '1' then
               cur_state_buf <= st_tx_startcharge;
            else
               cur_state_buf <= cur_state_buf;
            end if;
         when st_tx_startcharge =>
            if tx_charge_done = '1' then
               cur_state_buf <= st_tx_stopcharge;
            else
               cur_state_buf <= cur_state_buf;
            end if;
         when st_tx_stopcharge =>
            if tx_transfer_done = '1' then
               cur_state_buf <= st_tx_fire;
            else
               cur_state_buf <= cur_state_buf;
            end if;
         when st_tx_fire =>
            cur_state_buf <= st_wait_min_period;
         when st_wait_min_period =>
            if tx_min_period_done = '1' then
               cur_state_buf <= st_idle;
            else
               cur_state_buf <= cur_state_buf;
            end if;
         when others =>
            cur_state_buf <= st_idle;
      end case;
   end if;
end process;

---
--- One of the input_signal processes
---
tx_charge_done_proc : process(clkin, reset_n)
begin
   if reset_n = '0' then
      tx_charge_cnt  <= (others => '0');
      tx_charge_done <= '0';
   elsif rising_edge(clkin) then
      case cur_txgen_st_buf is
         when st_tx_startcharge =>
            if tx_charge_cnt > zero32(tx_charge_cnt'range) then
               tx_charge_cnt  <= tx_charge_cnt - '1';
               tx_charge_done <= '0';
            else
               tx_charge_cnt  <= tx_charge_cnt;
               tx_charge_done <= '1';
            end if;
         when others =>
            tx_charge_cnt  <= tx_charge_time;
            tx_charge_done <= '0';
      end case;
   end if;
end process;

=================================================================

A lot of the code is missing... I show the state machine process and one of the inputs used by the state machine. I've never had a problem with this way of implementing my state machines. I do a lot of RTL design on a Xilinx Virtex-II with a 200 MHz clock, and everything works as I expect. What do you think about the way I do my state machines?

Thanks.

--
Martin Gagnon

Article: 107206
Hello,

I have to design an Ethernet link in a V4 to be interfaced to an external PHY with an MII interface. The Ethernet link has to work in two ways:
- Normal transmit / receive through the TEMAC using the MII interface
- Manual driving of the MII interface from the FPGA logic to access the PHY, with the TEMAC disabled
- Going back to normal transmit / receive through the EMAC using the MII interface
- and so on

Here are my questions:
- Is the TEMAC hardware usable alone without any Ethernet IP? How can this be done?
- Is the MII interface part of the TEMAC, or can the MII interface signals be accessed from the rest of the FPGA, in parallel with the TEMAC?
- Can the TEMAC logic be connected / disconnected to the MII interface quickly and without having to reinitialize it?
- Can an MII interface be accessed and driven with logic from inside the FPGA?

Thank you for your answers.

Stéphane.

Article: 107207
PeteS wrote:
> r...@mips.com wrote:
> > This post might not be directly on topic here but I'm hoping that
> > someone else out in FPGA-land might have come across a similar problem.
> >
> > The problem: I'm using an FPGA to do emulation of RTL that would
> > normally be built into an ASIC. As such, therefore, the RTL is not very
> > `FPGA friendly' and I can only get a speed of 33-40MHz, maybe 50 at a
> > push. The trouble is that the system incorporates a DDR DRAM controller
> > and DDR DRAMs have a min. frequency spec - 66MHz in the case of the
> > ones I'm using - related to the internal DLL. I *think* that this is
> > why I'm getting no response from the RAMs during read cycles & the data
> > bus seems to be floating.
> >
> > I've tried running with both DLL enabled and disabled to no avail.
> > [Maybe some manufacturers work in this mode and others don't].
> >
> > Any ideas ?
>
> The spec minimum for DDR is 83.33MHz. I have managed to make some run
> at 66MHz, but I don't think you'll get them to run lower.
>
> You are correct that this is related to the internal DLLs in the DDR.
>
> If you can't bump the speed, I can't see that you'll be able to make it
> work

Pete,

With a little bit of advice and some messing around with software, I've managed to get DDR working in DLL-disabled mode at 33MHz, at least as far as getting the PROM monitor up and running. The basic advice was that with the DLL off, the DDR DRAMs (may) ignore the CAS latency value programmed into the mode register and just kind of choose their own value. The more detailed advice that in this case CL is always 2 was not right, since for the Samsung-based DIMMs I'm using we seem to have CL = 1.

Rick

Article: 107208
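[Editor's aside] Since the fix above hinges on what gets written to the DDR mode and extended mode registers, here is a sketch of how those words are typically assembled for first-generation DDR SDRAM. The field positions follow my reading of the JEDEC DDR (JESD79) layout and should be checked against the actual DIMM datasheet before use; the helper names are made up for illustration and are not part of any vendor driver.

/* Sketch of DDR (DDR1) mode-register encoding, per my reading of JESD79.
 * Verify field positions against the DRAM datasheet before relying on this. */
#include <stdio.h>

/* Mode Register (selected with BA[1:0] = 00) */
static unsigned ddr_mr(unsigned burst_len_code, unsigned cas_lat_code, int dll_reset)
{
    return (burst_len_code & 0x7)          /* A2:A0  burst length (010b = BL4)          */
         | (0u             << 3)           /* A3     burst type, 0 = sequential         */
         | ((cas_lat_code & 0x7) << 4)     /* A6:A4  CAS latency (010b=2, 011b=3, 110b=2.5) */
         | ((dll_reset ? 1u : 0u) << 8);   /* A8     DLL reset                          */
}

/* Extended Mode Register (selected with BA[1:0] = 01) */
static unsigned ddr_emr(int dll_disable, int weak_drive)
{
    return (dll_disable ? 1u : 0u)         /* A0  1 = DLL disabled                      */
         | ((weak_drive ? 1u : 0u) << 1);  /* A1  output drive strength                 */
}

int main(void)
{
    printf("MR  (BL4, CL2, DLL reset): 0x%03x\n", ddr_mr(0x2, 0x2, 1));
    printf("EMR (DLL disabled)       : 0x%03x\n", ddr_emr(1, 0));
    return 0;
}

The point of writing it out is that "DLL disabled" lives in the extended mode register while CAS latency lives in the base mode register, so a controller can legitimately program one without the other, which is consistent with the behaviour Rick describes (the DRAM ignoring the programmed CL when its DLL is off).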
Patrick,

I don't have answers to all your questions, but here is some info:

> For the SystemAce, I used the modified genace.tcl script that comes
> with the reference design but it doesn't work. To be more precise, I
> can generate the ace file, the SystemAce loads everything in the FPGA
> fine (done led goes up), but my code doesn't boot. The hardware part
> seems okay because one of my debug led turns on (attached directly to
> VCC in the fpga). Since the code doesn't boot, I tried to use xmd to
> connect to the PowerPC. However, xmd doesn't detect the PowerPC (ERROR:
> Unable to connect to PowerPC target. Invalid Processor Version No
> 0x00000000). That's strange because if I then proceed to load the bit
> file using iMPACT, then xmd can detect the PowerPC fine.

I haven't worked with SystemACE, but as was discussed here multiple times in the past, the DONE pin going high doesn't really mean the configuration has finished; it rather says that it is about to finish. In other words, you might be missing a few more clock cycles to finalize the configuration.

> I tried a few things but I'm really
> in the dark as I don't have deep understanding of the whole EDK flow
> (which is why I choose the UltraController in the first place).

UltraController is a neat idea, but it might be much easier to do a design that would use BRAMs instead of trying to fit everything in cache. Since the PPC is a 32-bit processor, the code tends to grow pretty quickly, making it difficult to fit anything useful in cache alone. Also, if I understand correctly, the cache cannot be initialized from the bit file, which makes the initial boot cumbersome. I have designed a board where I originally planned to use UC2, but then switched to a regular PPC design with PLB_BRAM.

/Mikhail

Article: 107209
> Here are my questions :
>
> - Is the TEMAC hardware usable alone without any Ethernet IP ? How can this
> be done ?

What do you mean by Ethernet IP? The TEMAC is an Ethernet MAC.

> - Is the MII interface part of the TEMAC or can these MII interface signals
> be accessible from the rest of the FPGA, in parallel with the TEMAC ?
> - Can a MII interface be accessible and driven with logic from inside the
> FPGA ?

The MII IOs of the TEMAC connect to regular FPGA pins through regular routing, thus you are free to do with them whatever you want as long as you can meet timing constraints.

> - Does the TEMAC logic can be connected / disconnected to the MII interface
> quickly and without having to reinitialize it ?

You need to study the MII protocol to figure this one out.

/Mikhail

Article: 107210
I still use 6.3, and it doesn't exist there :( After installing 8.2 I could make CableServer work (with a lot of impact and CableServer crashes :)

Here is what I think. I can easily hook the WSOCK32 functions in CableServer and see what parameters they have in accept, connect, recv, send, ioctlsocket, etc. In that case it would be much easier to reverse engineer the communication and reproduce the whole communication in a new CableServer.

Your idea will work on both Linux and Win32 for sure, so next week I'll start working on a new CableServer. First I think I will implement the Parallel III support, which will be a good start to test and debug, and later implement other LPT-based cables, and then implement Digilent USB, which I think should be easy.

I hate to pick names for projects. Any ideas? :)

Zoltan

Article: 107211
I'm developing on an ML403 evaluation board with a Virtex-4 device. I'm calling Xilinx's Level 0 I2C driver routines (XIic_Send, _Recv) from a PPC405 program running under the QNX OS. I'm connecting to an external I2C device, a temp sensor/ADC, via the J3 header on the ML403.

When scoping the I2C SDA and SCL lines, I often notice a missing bit within the 8-bit address word. Obviously, when this happens, the addressed device does not ACK the transfer.

I believe that my physical I2C connection is correct because I can successfully and consistently use the GPIO-I2C bit-banging approach (as implemented in Xilinx's iic_eeprom test program) to communicate with my external device.

I'm not sure how my operating environment or the driver could cause this problem. The address is supplied by a single byte-write to the OPB_IIC core's Tx FIFO register; that seems atomic to me. My gut feeling is that there is a problem with the core.

Anyone seen this problem, or know what I might be doing wrong??

Article: 107212
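[Editor's aside] For readers who haven't seen it, the GPIO bit-banging approach mentioned above usually looks something like the sketch below: open-drain signalling where SDA and SCL are only ever driven low or released, so the external pull-ups set the high level. The gpio_sda()/gpio_scl()/gpio_sda_read()/delay_us() hooks are hypothetical placeholders for whatever GPIO core the board exposes, not Xilinx driver calls. The value of this style when debugging a dropped address bit is that each bit is visibly clocked out one at a time, so a missing bit points at the transmitter rather than the wiring.

/* Minimal open-drain I2C bit-bang sketch. The gpio_* and delay_us hooks are
 * hypothetical; map them to the GPIO controller on your own board. */

extern void gpio_sda(int release);  /* 1 = release (pull-up makes it high), 0 = drive low */
extern void gpio_scl(int release);
extern int  gpio_sda_read(void);    /* sample SDA while released                          */
extern void delay_us(int us);

static void i2c_start(void)
{
    gpio_sda(1); gpio_scl(1); delay_us(5);
    gpio_sda(0); delay_us(5);          /* SDA falls while SCL is high = START */
    gpio_scl(0); delay_us(5);
}

static void i2c_stop(void)
{
    gpio_sda(0); delay_us(5);
    gpio_scl(1); delay_us(5);
    gpio_sda(1); delay_us(5);          /* SDA rises while SCL is high = STOP */
}

/* Shift one byte out MSB first; returns 0 if the slave ACKed. */
static int i2c_write_byte(unsigned char b)
{
    int i, nack;
    for (i = 7; i >= 0; i--) {
        gpio_sda((b >> i) & 1);
        delay_us(5);
        gpio_scl(1); delay_us(5);      /* slave samples SDA while SCL is high */
        gpio_scl(0); delay_us(5);
    }
    gpio_sda(1);                       /* release SDA so the slave can drive the ACK */
    delay_us(5);
    gpio_scl(1); delay_us(5);
    nack = gpio_sda_read();            /* 0 = ACK pulled low by the slave */
    gpio_scl(0); delay_us(5);
    return nack;
}

A typical address phase is then i2c_start(), i2c_write_byte(addr << 1), check the ACK, and eventually i2c_stop(); comparing that scope trace against the OPB_IIC-driven one makes it easy to see exactly which bit the hardware core is dropping.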
Ray Andraka wrote:
> Ahh, This must be John Bass. I thought I recognized this particular
> rant. I guess he changed his screen name from fpga_toys or whatever it
> was. Yes, you can make an extreme design that will dissipate around
> 100W, which would be a real challenge to power and keep cool.

If you look back, Ray, you will find that I have used Totally_Lost as a handle far longer than discussions in just this forum ... as it's excellent flame bait to draw out the bigots that only think they know what they are talking about. In fact, I've used this handle in other forums going back over 20 years. It was probably Totally_Lost that first picked the specification, power limits, and cooling issue discussion with Austin and Peter, with this same objection based on the 2600E failures using heavy LUT shift registers.

Now, when there is an extreme design that can pull 100W, that means there are data patterns for which the design will have very high current on one or more clocks, and that has to be handled in the design. Designs based on a bet that you can handle the average case OK, and just hope the worst case doesn't occur, just defer problems to random in-field failures.

Article: 107213
zcsizmadia@gmail.com wrote:
> I still use 6.3, and it doesn't exist there :(
> After installing 8.2 I could make CableServer work (with a lot of
> impact and CableServer crashes :)
>
> Here is what I think. I can easily hook the WSOCK32 functions in
> CableServer and see what parameters they have in accept, connect, recv,
> send, ioctlsocket, etc. In that case it would be much easier to
> reverse engineer the communication and reproduce the whole
> communication in a new CableServer.
>
> Your idea will work on both Linux and Win32 for sure, so next week I
> start working on a new CableServer.
>
> First I think I will implement the Parallel III support, that will be a
> good start to test and debug, and later implement other LPT based
> cables, and then implement Digilent USB, which I think should be easy.
>
> I hate to pick names for projects. Any ideas? :)
>
> Zoltan

Eeesy! Names are easy. Don't worry yet.

It makes sense to make a "dummy" cable server first, then add cable drivers as loadable plugins to it. Or use the cable server as another bridge, like forwarding impact requests to the Altera jtagserver :) Or the cable server could recognize both Altera and impact protocols.

Antti

Attached is a VERY RAW something that was at least responding to some impact TCP packets. Maybe it's for an older version and it has all changed; anyway, all of the protocol is clearly visible, and lots of info comes from the CableServer command-line log, which is a good source to look at - there are, for example, commands to set and clear single bits in the parallel port, ...

----------

procedure TForm1.serverAccept(Sender: TObject; ClientSocket: TCustomIpClient);
var
  a : array[0..100] of byte;
  slen, i, j, l : integer;
  pNumber, Speed : integer;
  pName, cName, s : string;
begin
  l := ClientSocket.ReceiveBuf(a, 100);
  // memo1.Lines.Add('ACCEPT ' + inttostr(l));
  case a[0] of
    $00:
      begin
        memo1.Lines.Add('X 00');
        ClientSocket.Sendln(#$01, '');
      end;
    $01:
      begin
        memo1.Lines.Add('LOCK');
        ClientSocket.Sendln(#$01, '');
        repeat
          l := ClientSocket.ReceiveBuf(a, 100);
          if l > 0 then
          begin
            ClientSocket.Sendln(#$01, '');
            memo1.Lines.Add('cmd' + inttostr(l));
          end;
        until (a[0] = 1) or (l <= 0);
      end;
    $03:
      begin
        // memo1.Lines.Add('SET CABLE' + inttostr(l));
        if l = 2 then
        begin
          // ClientSocket.Sendln(#$01, '');
          // server.WaitForData;
          l := ClientSocket.ReceiveBuf(a, 100);
          l := ClientSocket.Sendln(#$01, '');
        end
        else
        begin
          ClientSocket.Sendln(#$01, '');
        end;
      end;
    $04:
      begin
        memo1.Lines.Add('Close Cable');
        ClientSocket.Sendln(#$01, '');
      end;
    $05:
      begin
        // memo1.Lines.Add('GET INFO');
        ClientSocket.Sendln(#$01, '');
        // memo1.Lines.Add('GET INFO2');
        ClientSocket.Sendln(
          #$03#$00#$00#$00 +
          'LPT' +
          #$00#$00#$00#$00 +
          #$40#$0D#$03#$00 +
          #$0C#$00#$00#$00 +
          'Parallel III' +
          #$00#$00#$00#$00, '');
        // memo1.Lines.Add('GET INFO3');
        (*
        ClientSocket.Sendln(
          #$04#$00#$00#$00 +
          'NONE' +
          #$FF#$FF#$FF#$FF +
          #$00#$00#$00#$00 +
          #$00#$00#$00#$00 +
          #$00#$00#$00#$00, '');
        *)
      end;
    $11:
      begin
        memo1.Lines.Add('X 11');
        ClientSocket.Sendln(#$01, '');
      end;
    $70:
      begin
        // memo1.Lines.Add('Check Server');
        ClientSocket.Sendln(#$01, '');
      end
    else
      begin
        for i := 0 to l - 1 do
          memo1.lines.add('REC:' + inttohex(ord(a[i]), 2));
        ClientSocket.Sendln(#$01, '');
      end;
  end;
end;

Article: 107214
Martin Thompson wrote:
> Of course, ideally, ditching Impact would be good ;-)

Well impact is now out of the picture with the spartan-3e starter board, using the xup
http://inisyn.org/src/xup/

So I've got all I need.
vhdl -> ghdl -> gtkwave for simulation
vhdl -> xilinx webpack tools -> bitfile for synthesis
xup -> spartan3e for test

Net cost? $149 for the spartan-3e starter board, plus some determination not to do this all under windows.

-Dave

--
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture

Article: 107215
David Ashley wrote:
> Martin Thompson wrote:
> > Of course, ideally, ditching Impact would be good ;-)
>
> Well impact is now out of the picture with the spartan-3e
> starter board, using the xup
> http://inisyn.org/src/xup/
>
> So I've got all I need.
> vhdl -> ghdl -> gtkwave for simulation
> vhdl -> xilinx webpack tools -> bitfile for synthesis
> xup -> spartan3e for test
>
> Net cost? $149 for the spartan-3e starter board, plus
> some determination not to do this all under windows.
>
> -Dave
>
> --
> David Ashley                http://www.xdr.com/dash
> Embedded linux, device drivers, system architecture

Without determination it all works brilliantly, without fuss, in WinXP ;)

Antti

Article: 107216
Suzie wrote:
> I'm developing on an ML403 evaluation board with a Virtex-4 device.
> I'm calling Xilinx's Level 0 I2C driver routines (XIic_Send, _Recv)
> from a PPC405 program running under the QNX OS. I'm connecting to an
> external I2C device, a temp sensor/ADC, via the J3 header on the ML403.
>
> When scoping the I2C SDA and SCL lines, I often notice a missing bit
> within the 8-bit address word. Obviously, when this happens, the
> addressed device does not ACK the transfer.
>
> I believe that my physical I2C connection is correct because I can
> successfully and consistently use the GPIO-I2C bit-banging approach (as
> implemented in Xilinx's iic_eeprom test program) to communicate with my
> external device.
>
> I'm not sure how my operating environment or the driver could cause
> this problem. The address is supplied by a single byte-write to the
> OPB_IIC core's Tx FIFO register; that seems atomic to me. My gut
> feeling is that there is a problem with the core.
>
> Anyone seen this problem, or know what I might be doing wrong??

No, but I see another problem with the OPB_IIC core: no matter how I set the clock scaler, etc., the OPB_IIC core just gives me a 650 KHz clock out on the SDA line.

I think I managed to get the OPB_IIC working too, a long time ago, but bit-bang is way EASIER and always works.

Antti

Article: 107217
Antti wrote:
> flo wrote:
> > Hi everyone,
> > I'm trying to deal with readback and scrubbing in a XC2V1500 FPGA.
> >
> > I've got a problem identifying the Major Address and the Minor Address
> > when I'm doing a readback.
> > I read documents (XAPP138 and XAPP151) but nothing works with Virtex-II.
> > I know the frame length and the number of frames because it is in the
> > bitstream, but nothing about the number of frames in each minor address
> > depending on the major address and the block type...
> >
> > Does anyone know how to determine it?
> >
> > Thanks a lot.
> >
> > florent
>
> this information is available in some files in the \xilinx\ dirs but
> you can't access them.
>
> the easiest is to run bitgen with the debug option and then look at the
> file; it will write out each frame separately so you can gather the
> information you need
>
> antti

Thanks Antti! I tried your solution and I can get the beginning and end of each MJA/MNA in the bitstream. It is a very interesting option for debug.

florent

Article: 107218
jbnote@gmail.com wrote:
> Hello Flo,
>
> > Hi everyone,
> > I'm trying to deal with readback and scrubbing in a XC2V1500 FPGA.
> >
> > I've got a problem identifying the Major Address and the Minor Address
> > when I'm doing a readback.
> >
> > Does anyone know how to determine it?
>
> This information is available in the Virtex-II User Guide (ug002.pdf)
> available on Xilinx's website. Page 314 and following for the current
> revision (configuration details). Table 4-17 coupled with FAR decoding
> is what you're looking for.

Excellent! Brilliant! That's it, at least, the info I needed.

> Based on this info I've already written working code for reading and
> dumping xc2v2000 *bitstreams* (crc computation, luts, brams...).
>
> I can release it under the GPL

It would be interesting for my personal knowledge, but the VHDL code I work on is for an electronics company. I'm not sure the GPL will be of much use for them.

> if you can help me to get it to work with readbacks (how far is that
> from pure bitstreams ?)

Sure, I can have a look. In fact, readback and error detection on the read data are the only things I'm sure are working.

> and the 1500 (small amount of work here).

Sure, I can complete your design with the 1500 info, let me know.

Cheers.
Florent

> JB

Article: 107219
Ray Andraka wrote:
> Totally_Lost wrote:
> > So ... you are claiming any valid design will run in any Xilinx FPGA
> > at max clock rate?
>
> I'm afraid you don't know what you are talking about as far as FPGAs go.

Sorry Ray ... but your explanation about NOT overclocking an FPGA ignored the reason for overclocking any VLSI part: to avoid the design margin for worst-case thermal/voltage/process in applications where it's not needed, and where thermal/voltage/process can be controlled more tightly with cooling, voltage selection, and hand selection of parts. Sure, a production design needs to adhere to the margins; a hand-tuned lab design doesn't, when it's specifically set up to optimize the operating envelope by controlling these factors to gain performance.

By the way, your bit-serial FPGA math page was part of the resources I used to set up the 2000E and 2600E designs that failed, after hand-packing the RC5 and heat-flow simulation designs, both of which were better than 50% LUT SRL schematic macros. In the speed/area optimization for highly replicated compute cores, bit-serial or digit-serial frequently beats out fully parallel designs. And unfortunately, that is the most dangerous operating area for Virtex parts using LUT shift register memories heavily. Your page is a great resource for newbies, but it probably should include some warning notes about power and LUT SRL designs.

After sorting that problem out, PAINFULLY, I did later rework both designs based on LUT RAMs, still bit/digit serial, which avoided the high heat/current of the LUT SRLs. Using Gray-code synchronized counters, locally replicated to avoid fanout and routing skew, I was finally able to get both designs functional, but lost some performance/density in the process.

Sorry you got sucked in by the Totally_Lost handle, as you are a wonderful resource, as is your web page ... but slamming posters based on name, origin, school/work place, and just plain ignorance is pretty poor form, reserved for shit head bigots.

Article: 107220
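[Editor's aside] On the Gray-code counter point above: the property being relied on is that successive Gray codes differ in exactly one bit, so each increment toggles a single flip-flop (and its routing), which keeps switching current low and makes the count value safe to sample across clock boundaries. The tiny sketch below only demonstrates that property; it is plain C for illustration and says nothing about the poster's actual FPGA implementation.

/* Successive Gray codes differ in exactly one bit, so a Gray-coded counter
 * toggles a single flop per increment instead of rippling many bits at once. */
#include <stdio.h>

static unsigned bin2gray(unsigned n) { return n ^ (n >> 1); }

static int popcount(unsigned x)
{
    int c = 0;
    while (x) { c += x & 1u; x >>= 1; }
    return c;
}

int main(void)
{
    unsigned prev = bin2gray(0);
    for (unsigned n = 1; n < 16; n++) {
        unsigned g = bin2gray(n);
        printf("%2u -> gray 0x%X  bits changed: %d\n", n, g, popcount(g ^ prev));
        prev = g;
    }
    return 0;
}

Every line it prints shows "bits changed: 1", which is exactly why a Gray-coded address counter draws far less simultaneous switching current than a binary one of the same width.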
MM wrote:
> I haven't worked with SystemACE, but as was discussed here multiple times in
> the past, the DONE pin going high doesn't really mean the configuration has
> been finished, it rather says that it is about to be finished. In other
> words you might be missing a few more clock cycles to finalize the
> configuration.

Thanks for the info, I wasn't aware of that. I also have an LED that is driven to '1' in the FPGA fabric and this LED turns on. It's still no guarantee that the whole chip is configured, I guess...

> Ultracontroller is a neat idea, but it might be much easier to do a design
> that would use BRAMs instead of trying to fit everything in cache. Since PPC
> is a 32-bit processor, the code tends to grow pretty quickly making it
> difficult to fit anything useful in cache alone. Also, if I understand
> correctly, the cache cannot be initialized from the bit file, which makes
> the initial boot cumbersome. I have designed a board where I originally
> planned to use UC2, but then switched to a regular PPC design with PLB_BRAM.

Yes, I'm starting to realize that the UC2 maybe wasn't such a great idea after all. You are correct, the cache cannot be initialized from the bit file, which is the source of most of my problems (for example, you can't use iMPACT to generate the SystemACE file, you need to use a tcl script).

I still wish that I could use the UC2, as I don't really want to deal with the OPB/PLB buses right now. Creating a full-blown PowerPC design in EDK doesn't seem like an easy task. 32 inputs and 32 outputs is just what I need (and my understanding is that there is no faster way to toggle such pins than with the UC2). As far as code size is concerned, I already have a pretty complete design and I'm using about 50% of the code cache (plenty of data cache left), with Size Optimization. I also would like to keep my BRAMs free, as I'm doing large FFTs which make heavy use of BRAMs.

Another point in favor of the UC2 is that I have a really simple SystemC model of the UC2 right now. I can do relatively fast co-simulations of my design. I heard that using the Swift models to simulate the PowerPC is really slow...

Thanks for your input MM.

Patrick

Article: 107221
Patrick Dubois wrote:
> MM wrote:
> > I haven't worked with SystemACE, but as was discussed here multiple times in
> > the past, the DONE pin going high doesn't really mean the configuration has
> > been finished, it rather says that it is about to be finished. In other
> > words you might be missing a few more clock cycles to finalize the
> > configuration.
>
> Thanks for the info, I wasn't aware of that. I also have an LED that is
> driven to '1' in the FPGA fabric and this LED turns on. It's still no
> guarantee that the whole chip is configured, I guess...
>
> > Ultracontroller is a neat idea, but it might be much easier to do a design
> > that would use BRAMs instead of trying to fit everything in cache. Since PPC
> > is a 32-bit processor, the code tends to grow pretty quickly making it
> > difficult to fit anything useful in cache alone. Also, if I understand
> > correctly, the cache cannot be initialized from the bit file, which makes
> > the initial boot cumbersome. I have designed a board where I originally
> > planned to use UC2, but then switched to a regular PPC design with PLB_BRAM.
>
> Yes, I'm starting to realize that the UC2 maybe wasn't such a great
> idea after all. You are correct, the cache cannot be initialized from
> the bit file, which is the source of most of my problems (for example,
> you can't use iMPACT to generate the SystemACE file, you need to use a
> tcl script).
>
> I still wish that I could use the UC2, as I don't really want to deal
> with the OPB/PLB buses right now. Creating a full-blown PowerPC design
> in EDK doesn't seem like an easy task. 32 inputs and 32 outputs is just
> what I need (and my understanding is that there is no faster way to
> toggle such pins than with the UC2). As far as code size is concerned,
> I already have a pretty complete design and I'm using about 50% of the
> code cache (plenty of data cache left), with Size Optimization. I also
> would like to keep my BRAMs free, as I'm doing large FFTs which make
> heavy use of BRAMs.
>
> Another point in favor of the UC2 is that I have a really simple
> SystemC model of the UC2 right now. I can do relatively fast
> co-simulations of my design. I heard that using the Swift models to
> simulate the PowerPC is really slow...
>
> Thanks for your input MM.
>
> Patrick

Patrick,

PPC caches __can__ be initialized:
1) from the BIT file using USR_ACCESS (and a bridge IP to JTAG master)
2) from ACE using the PPC ICE registers directly

but sure, the documentation and tools to do this are not the very best == possibly months of time wasted to get it all working properly. But it is possible. I looked at the USR_ACCESS to JTAG gateway IP core, and I also know enough about the undocumented PPC ICE registers that I am confident it's all doable.

Antti

Article: 107222
Austin Lesea wrote:
> OK,
>
> The Virtex 4 family is the first family to ever be able to shift 1,0
> through all resources (without tripping power on reset from the
> transient of all switching on the same edge).

On the largest parts, I think this is probably true, because the clock tree skew is wider than the clock period. I'm still not entirely convinced that slightly smaller parts, in the smallest packages, are actually worst-case stable, due to higher than average clock currents in other cases.

> My comment on over-clocking was intended to say that we are completely
> unlike a micro-processor, and the traditional tricks that you read about
> to get a microprocessor to work faster are not likely to work, as we
> have far more complex timing paths in a customer design.

Not true ... it's the same micro-optimization of temp/voltage/process, with maybe a slightly different sweet spot ... simply optimizing the environmentals and hand-screening parts, as overclockers do for CPUs and memory. For all the same reasons: to avoid the worst-case margin based on worst-case process, temp, and voltage.

> You appeared to live up to your name, that was all I was observing.
>
> Sounds like you do know something of what you speak. Sorry if I thought
> you were (totally) ignorant. Given the name, and the posting, it was
> hard to tell.

After using Totally_Lost for more than 20 years to draw out technical bigots, I'm surprised the bait still works. You would think that smart people would realize few technical people making strong statements are either clueless or offering to be rightfully skewered as idiots. Flaming posters based on name, origin, place of work/school, and other plain ignorance factors is pretty poor form ... and at minimum violates expected standards of civility and the code of conduct for this, and most, forums.

> I still do not recommend it, as the THUMP from all that switching leads
> to not being able to control jitter, even with the best possible power
> distribution system. I believe that V4 and V5 will require some
> internal 'SSO' usage restrictions, as they are not like earlier devices
> which would configure, DONE would go high, and then immediately reset
> and go back to configuring if you tried to instantiate and run a full
> device shift register.

As I noted, the largest parts have a high clock tree skew; slightly smaller parts do not, and I suspect the "THUMP" you are talking about is the same peak pin current problem I have repeatedly asked about, which is not specified by Xilinx, nor are tools available to calculate the resulting voltage transients at the die.

Article: 107223
Jan Panteltje wrote:
> On a sunny day (24 Aug 2006 23:20:57 -0700) it happened fpga_toys@yahoo.com
> wrote in <1156486857.335345.201150@i3g2000cwc.googlegroups.com>:
>
> > There is a reason the software world doesn't allow software engineers to
> > write production programs in native machine language ones and zeros, or
> > even use high level assembly language in most cases ... and increasingly
> > not even low level C code.
>
> There are many many cases where ASM on a micro controller is to be preferred.
> Not only for code-size, but also for speed, and _because of_ simplicity.

I just want to contribute my $.02 to this.

For years, perhaps 15, since I heard about FPGAs I've wanted to get involved in them. It's been absolutely clear to me, with 30 years of computer experience, that reconfigurable computing has unlimited potential for increasing computational power beyond the traditional single-thread general-purpose CPU model.

A recent article in Linux Journal talks about doing encryption, AES128 or something similar, 500X faster when done in an FPGA vs. having the CPU do it. 500X! That's 9 generations of Moore's law. Yes, for the very specific task of doing this one thing it speeds up that much, but for general stuff? That's not the point. With an ASIC you can do only one thing. But with an FPGA you can have it do whatever is needed at that moment.

There's a startup I just heard about, forgot the name, but since AMD just released complete specs on the HyperTransport protocol used in Opterons, this startup is producing FPGA devices that fit on a small PCB that itself is pin compatible with an Opteron. So you can take a standard multi-CPU server, install one of these instead of an Opteron, and because it "breathes" so well (tens of gigabytes/second of IO) you can really make use of the FPGA. PCI approaches are hampered by the bandwidth of the PCI bus, BTW. That is, a PCI card with an FPGA on it doesn't gain you a lot because the PCI bus itself is a choke point.

Finally, in my own experience, in the one project where I did some DSP coding, encoding live video to MPEG-2 I-frames, I was able to achieve a 60x performance improvement beyond compiled C code by writing hand-coded ASM plus making some compromises that no compiler could ever make, but which a human can easily. It's similar to lossy compression vs. perfect compression -- the compiler can only do perfect coding; however, if you can do a little math that is fast and "good enough" in accuracy, you can reap huge rewards in performance. To do *this* kind of performance improvement required a human. However, it's theoretically possible that a compiler could target an algorithm to an FPGA and achieve a perfect implementation of the algorithm, yet at 500X the performance, without having to have human intuition.

So the whole thesis is this:
1) With silicon speed we're running up against limits of how fast we can operate CPUs.
2) Special purpose CPUs like DSPs can achieve huge performance improvements, but getting them requires tradeoffs a compiler can't make -- they require human decision making processes.
3) With reconfigurable computing (FPGAs), in theory you can get massive performance improvements by just implementing the desired algorithm in the fabric of an FPGA.

So what has to happen is to make it painless for engineers to be able to use FPGAs as a resource, on the fly, within applications. If that means using high level 'c' like languages and smart compilers, and if it's in theory possible, then go for it.

Right now using FPGAs for stuff is very pain*ful* compared to doing the same thing in software. That's got to change.

NOTE: The CPU manufacturers seem to be heading down the path of multiple CPU cores, each with its own cache, on the same chip. Since they can't make 'em faster, start packing more of them on the die. Software wise they're backwards compatible, but this just leaves it up to the programming community to try to optimize their code better for multiple cores. You don't see a lot of that yet. Multi-thread programming is itself very complicated and a huge source of bugs/lockups/deadlocks/etc. So to take advantage of that effectively will probably require delegating thread distribution to the compiler -- which is similar to what has to happen with doing FPGA coding with a high level language. What's really necessary in both cases is for a programmer to just think in terms of single-thread execution, write his program, and the compiler does the rest...

Someone earlier noted the similarity between FPGA coding now and ASM programming in the past. I agree with this completely. FPGA use has to move out of the voodoo, priestly arena it's in and get into the hands of the common man.

-Dave

--
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture

Article: 107224
Ray Andraka wrote:
> Ahh, This must be John Bass. I thought I recognized this particular
> rant. I guess he changed his screen name from fpga_toys or whatever it

Totally_lost is fpga_toys? I just saw recent posts by both. Do people use sock puppets in this newsgroup?

-Dave

--
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture