On Sat, 18 Dec 1999 12:50:33 -0500, Ray Andraka <randraka@ids.net> wrote: > > >Dann Corbit wrote: > >> "Ray Andraka" <randraka@ids.net> wrote in message >> news:385B1DEE.7517AAC7@ids.net... >> > The chess processor as you describe would be sensible in an FPGA. Current >> > offerings have extraordinary logic densities, and some of the newer FPGAs >> have >> > over 500K of on-chip RAM which can be arranged as a very wide memory. >> Some of >> > the newest parts have several million 'marketing' gates available too. >> FPGAs >> > have long been used as prototyping platforms for custom silicon. >> >> I am curious about the memory. Chess programs need to access at least tens >> of megabytes of memory. This is used for the hash tables, since the same >> areas are repeatedly searched. Without a hash table, the calculations must >> be performed over and over. Some programs can even access gigabytes of ram >> when implemented on a mainframe architecture. Is very fast external ram >> access possible from FPGA's? > >This is conventional CPU thinking. With the high degree of parallelism in the No, this is algorithmic speedup design. The branching factor (the time multiplier for looking another move ahead) improves by a large margin with it. So BF improves in the formula: # operations in FPGA = C * (BF^n), where n is a positive integer (the search depth). >FPGA and the large amount of resources in some of the more recent devices, it >may very well be that it is more advantageous to recompute the values rather >than fetching them. There may even be a better approach to the algorithm that >just isn't practical on a conventional CPU. Early computer chess did not use >the huge memories. I suspect the large memory is more used to speed up the >processing rather than a necessity to solving the problem.
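The node-count formula above can be sketched numerically. A minimal C illustration (the constant C, the branching factors, and the depth used with it are made-up numbers, chosen only to show how sensitive the node count is to BF):

```c
/* Node count for a fixed-depth game-tree search: nodes = C * BF^n,
 * where BF is the effective branching factor and n the depth.
 * Any numbers used with this are illustrative, not measured. */
static double nodes(double c, double bf, int n)
{
    double total = c;
    while (n-- > 0)
        total *= bf;        /* one factor of BF per extra move of depth */
    return total;
}
```

With C = 1 and depth 10, a brute-force BF of 6 costs about 60.5 million nodes, while an algorithmically improved BF of 3 costs about 59 thousand (a factor of 1024), which is why the branching factor matters more than raw node rate.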
Though the number of operations used by Deep Blue was incredible compared to any program of today, at the 1999 world championship many programs searched positionally deeper (Deep Blue 5 to 6 moves ahead, some programs looking 6-7 moves ahead there). This is all because of these algorithmic improvements. It's like comparing bubble sort against merge sort. You need more memory for merge sort, as it is not in situ, but it's O(n log n). Take into account that in computer games the option to use an in-situ algorithm is not available. >> > If I were doing such I design in an FPGA however, I would look deeper to >> see >> > what algorithmic changes could be done to take advantage of the >> parallelism >> > offered by the FPGA architecture. Usually that means moving away from a >> > traditional GP CPU architecture which is limited by the inherently serial >> > instruction stream. If you are trying to mimic the behavior of a CPU, you >> would >> > possibly do better with a fast CPU, as you will be able to run those >> at a >> > higher clock rate. The FPGA gains an advantage over CPUs when you can >> take >> > advantage of parallelism to get much more done in a clock cycle than you >> can >> > with a CPU. >> >> The ability to do many things at once may be a huge advantage. I don't >> really know anything about FPGA's, but I do know that in chess, there are a >> large number of similar calculations that take place at the same time. The >> more things that can be done in parallel, the better. > >Think of it as a medium for creating a custom logic circuit. A conventional CPU >is specific hardware optimized to perform a wide variety of tasks, none >especially well. Instead we can build a circuit that specifically addresses the >chess algorithms at hand. Now, I don't really know much about the algorithms >used for chess. I suspect one would look ahead at all the possibilities for at >least a few moves ahead and assign some metric to each to determine the one with >the best likely cost/benefit ratio.
The FPGA might be used to search all the >possible paths in parallel. My program allows parallelism. I need heavy locking for this, in order to balance the parallel paths. What are the possibilities in an FPGA to place several copies of the same program on one chip, so that inside the FPGA there is a sense of parallelism? How about making something that enables locking within the FPGA? My parallelism is not possible without locking; that's the same bubble-sort-versus-merge-sort story: with 4 processors my program gets a 4.0x speedup, but without the locking 4 processors would be a lot slower than a single sequential processor. >> > That said, I wouldn't recommend that someone without a sound footing in >> > synchronous digital logic design take on such a project. Ideally the >> designer >> > for something like this is very familiar with the FPGA architecture and >> tools >> > (knows what does and doesn't map efficiently in the FPGA architecture), >> and is >> > conversant in computer architecture and design and possibly has some >> pipelined >> > signal processing background (for exposure to hardware efficient >> algorithms, >> > which are usually different than ones optimized for software). >> I am just curious about feasibility, since someone raised the question. I >> would not try such a thing by myself. >> >> Supposing that someone decided to do the project (however) what would a >> rough ball-park guestimate be for design costs, the costs of creating the >> actual masks, and production be for a part like that? > >The nice thing about FPGAs is that there is essentially no NRE or fabrication >costs. The parts are pretty much commodity items, purchased as generic >components. The user develops a program consisting of a compiled digital logic >design, which is then used to field customize the part. Some FPGAs are >programmed once during product manufacture (one time programmables include >Actel and Quicklogic).
Others, including the Xilinx line, have thousands of >registers that are loaded up by a bitstream each time the device is powered up. >The bitstream is typically stored in an external EPROM memory, or in some cases >supplied by an attached CPU. Part costs range from under $5 for small arrays to >well over $1000 for the newest largest fastest parts. How about a program that has thousands of chess rules with an incredible number of loops within them, and a huge search, so that the engine & eval alone equal 1.5 MB of C source code? How expensive would that be? Am I understanding here that I need to spend another $1000 for every few rules? >The design effort for the logic circuit you are looking at is not trivial. For >the project you describe, the bottom end would probably be anywhere from 12 >weeks to well over a year of effort depending on the actual complexity of the >design, the experience of the designer with the algorithms, FPGA devices and >tools. I needed years to write it in C already... Vincent Diepeveen diep@xs4all.nl >> -- >> C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html >> "The C-FAQ Book" ISBN 0-201-84519-9 >> C.A.P. Newsgroup http://www.dejanews.com/~c_a_p >> C.A.P. FAQ: ftp://38.168.214.175/pub/Chess%20Analysis%20Project%20FAQ.htm >-- >-Ray Andraka, P.E. >President, the Andraka Consulting Group, Inc. >401/884-7930 Fax 401/884-7950 >email randraka@ids.net >http://users.ids.net/~randrakaArticle: 19401
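Vincent's locking question maps, in software terms, to ordinary mutual exclusion around a shared transposition table. A minimal POSIX-threads sketch (the table layout, size, and function names here are invented for illustration; this is not quoting any real engine's code):

```c
#include <pthread.h>
#include <stdint.h>

/* A toy shared transposition table: each entry caches a score for a
 * position key.  The lock serializes probe/store so that concurrent
 * searcher threads never read a half-written entry. */
#define TABLE_SIZE 4096

struct tt_entry {
    uint64_t key;
    int      score;
    int      valid;
};

static struct tt_entry table[TABLE_SIZE];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Store a score under the given position key. */
void tt_store(uint64_t key, int score)
{
    pthread_mutex_lock(&table_lock);
    struct tt_entry *e = &table[key % TABLE_SIZE];
    e->key = key;
    e->score = score;
    e->valid = 1;
    pthread_mutex_unlock(&table_lock);
}

/* Probe the table; returns 1 and fills *score on a hit, 0 on a miss. */
int tt_probe(uint64_t key, int *score)
{
    int hit = 0;
    pthread_mutex_lock(&table_lock);
    struct tt_entry *e = &table[key % TABLE_SIZE];
    if (e->valid && e->key == key) {
        *score = e->score;
        hit = 1;
    }
    pthread_mutex_unlock(&table_lock);
    return hit;
}
```

In an FPGA the same effect can instead be built structurally, e.g. by giving each table bank a single port and one process per bank, so that accesses are serialized by construction and no explicit lock is needed.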
Hello Thomas If you haven't already seen it, Application Note AN088 on the Altera web site is helpful when programming FLEX10K with the Jam player. http://www.altera.com/document/an/an088.pdf It is important to get the DOS command line for jam right, in particular specifying the actions you require. See the -d command line options on Pages 10 and 11 of AN088 (especially the DO_CONFIGURE command). You may also like to see the Atlas Technical Solutions on the Altera Web Site, especially the following : http://www.altera.com/html/atlas/soln/rd09171998_2963.html "Problem : When I run the JamTM Player, why does it say programming or configuration was successful but nothing happens? " Regards, Michael Thomas Bornhaupt wrote: > Hi, > > i use MAX+plus II 9.3 7/23/1999 (1999.07). I can program the EPF10K10LC84-4 > direcly out of Max+plus. And all works correct! > > Then I try to program the EPF10K10LC84-4 with JAM.EXE (16Bit Dos) on the > same Hardware (Computer and Byteblaster). But it does not work! > > The following lines are from the dosbox > ----------- > Jam (Stapl) Player Version 2.12 > Copywrite (C) 1997-1999 Altera Corporation > Device #1 IDCODE is 010100DD > DONE > Exit code = 0 ... Success > ----------- > > If i take off the cable from the ByteBlaster then i get the following lines: > ----------- > Jam (Stapl) Player Version 2.12 > Copywrite (C) 1997-1999 Altera Corporation > Device #1 IDCODE is FFFFFFFF > DONE > Exit code = 0 ... Success > ---------- > > The ExitCode ist always 0. But the EPF10K10LC84-4 ist not working. > > This Error is so clear, that i think that must be an User Error. > > Which option is not correkt in MAX+plus or JAM.EXE?Article: 19402
Luigi Funes wrote: > Peter, > can you explain what the "dirty asynchronous tricks" to avoid are, please? > The manufacturers specify only the max. delays. Do I always have to assume > that the min. delay, theoretically, could be zero? > If several signals follow similar paths, like in a bus, what timing relationships > should I assume between these signals at the end of the path? Each > signal could have a different delay on the same device? > And generally, how realistic are the timing analysis and simulation? > I believe these are important misunderstood questions. Thank you. > > Luigi Sounds like you use sound design practices. The dirty tricks Peter refers to generally depend on a propagation delay being above some minimum amount for the circuit to work. I've seen too many of these to still believe that people wouldn't design that way. The asynchronous sets/resets on FPGAs have a way of letting some of these 'dirty tricks' sneak in on an otherwise conscientious designer. The timing analyzer in the Xilinx tools is pretty good. You do have to make sure you set up your constraints properly. -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 19403
Hi Michael, thank you for your tips, but it does not work. It seems to me that MAX+plus (9.3) generates wrong JAM or JBC files. I tested Jam.EXE 1.2 with -dDO_CONFIGURE, but the chip is not programmed. Inside the JAM file (Language 1.1) I found this line: BOOLEAN DO_CONFIGURE = 0; So I set it to: BOOLEAN DO_CONFIGURE = 1; Starting JAM.EXE I got a syntax error in line 440! I also tested JAM.EXE 2.2. Here you have the option -aCONFIGURE. This is the action out of the JAM file (STAPL format): ACTION CONFIGURE = PR_INIT_CONFIGURE, PR_EXECUTE; And now I got an exception: the DOS box went away immediately and a pure DOS machine hung up with an EMM386 error. regards Thomas Bornhaupt Michael Stanton <mikes@magtech.com.au> wrote in message 385D6A66.27FE91D4@magtech.com.au... > Hello Thomas > > If you haven't already seen it, Application Note AN088 on the Altera web site is > helpful when programming FLEX10K with the Jam player. > > http://www.altera.com/document/an/an088.pdf > > It is important to get the DOS command line for jam right, in particular > specifying the actions you require. See the -d command line options on Pages 10 > and 11 of AN088 (especially the DO_CONFIGURE command). > > You may also like to see the Atlas Technical Solutions on the Altera Web Site, > especially the following : > > http://www.altera.com/html/atlas/soln/rd09171998_2963.html > > "Problem : When I run the JamTM Player, why does it say programming or > configuration was successful but nothing happens? " > > Regards, Michael > > > Thomas Bornhaupt wrote: > > > Hi, > > > > i use MAX+plus II 9.3 7/23/1999 (1999.07). I can program the EPF10K10LC84-4 > > direcly out of Max+plus. And all works correct! > > > > Then I try to program the EPF10K10LC84-4 with JAM.EXE (16Bit Dos) on the > > same Hardware (Computer and Byteblaster). But it does not work!
> > > > The following lines are from the dosbox > > ----------- > > Jam (Stapl) Player Version 2.12 > > Copywrite (C) 1997-1999 Altera Corporation > > Device #1 IDCODE is 010100DD > > DONE > > Exit code = 0 ... Success > > ----------- > > > > If i take off the cable from the ByteBlaster then i get the following lines: > > ----------- > > Jam (Stapl) Player Version 2.12 > > Copywrite (C) 1997-1999 Altera Corporation > > Device #1 IDCODE is FFFFFFFF > > DONE > > Exit code = 0 ... Success > > ---------- > > > > The ExitCode ist always 0. But the EPF10K10LC84-4 ist not working. > > > > This Error is so clear, that i think that must be an User Error. > > > > Which option is not correkt in MAX+plus or JAM.EXE? > > > >Article: 19404
I pose this question on the assumption that an asynchronous clear signal (for example a power-on reset) might not reliably initialise all registers of an FSM if it were to release very close to a clock edge. Even if this is a correct assumption, maybe it's felt that a reset is such a rare event that the chances of this happening are 'slim and none' anyway? Any thoughts would be appreciated. regds Mike Sent via Deja.com http://www.deja.com/ Before you buy.Article: 19405
Søren Lambæk wrote in message ... >Hi > >I have designed a Xilinx SpartanXL project in OrCAD EXPRESS using the Xilinx >Alliance 2.1i fitter. >I have a microcontroller which will handle the loading of the FPGA. > >My question is: can someone point me in the right direction on how to include >the FPGA code in the C source code of >the microcontroller SW. > >Regards >Søren Lambæk >KK-Electronic a/s >Denmark >E-mail: sl@kk-electronic.dk > Hello, You can look up XAPP122 on the XILINX web page. You will have to download 2 files: PCONFIG and MAKESRC. To get MAKESRC to work you need to make a configuration file to instruct it to create an array, e.g. FPGA_DATA[], to copy from ROM to the FPGA. The only thing that you still have to do is to strip the first 7 lines from the .MCS file before running PCONFIG. We created a (simple and dirty) batch file to do all the work. If you are interested I can post or mail it. I hope this helps. Mark van de Belt ROAX BV (remove the NOSPAM from the E-mail address to mail me)Article: 19406
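Once the MAKESRC flow described above has produced a bitstream array in ROM, the microcontroller's job is essentially to clock those bytes into the FPGA's configuration port. A behavioral C sketch of the slave-serial bit ordering (the "pins" are stubbed out so the logic can be exercised on a PC; a real design must follow the Spartan slave-serial timing in the datasheet, and the names below are placeholders, not taken from XAPP122):

```c
#include <stdint.h>
#include <stddef.h>

/* Test stub standing in for the DIN/CCLK port pins: records what the
 * FPGA would sample on each CCLK rising edge.  On real hardware this
 * would be two port-register writes (drive DIN, then pulse CCLK). */
static uint8_t shifted_bits[64];
static size_t  bit_count;

static void set_din_and_clock(int level)
{
    shifted_bits[bit_count++] = (uint8_t)level;
}

/* Shift one configuration byte out MSB first, as slave-serial expects. */
static void shift_byte(uint8_t b)
{
    for (int i = 7; i >= 0; i--)
        set_din_and_clock((b >> i) & 1);
}

/* Stream an entire bitstream array, e.g. MAKESRC's FPGA_DATA[].
 * On hardware, PROG_B would be pulsed first and DONE checked after. */
static void fpga_configure(const uint8_t *data, size_t len)
{
    bit_count = 0;
    for (size_t i = 0; i < len; i++)
        shift_byte(data[i]);
}
```

The MSB-first ordering is the part worth testing on a PC before going near the board; getting it backwards produces a bitstream the FPGA silently rejects.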
>On Sat, 18 Dec 1999 01:24:39 -0700, "Simon Bacon" ... ... > >Why not ask around on some the chess news groups like: >rec.games.chess.computer >to see what part of computer programs use the most time, and then see >if the FPGA's power could help speed up one of those bottlenecks. For my program: evaluation : 90% system time search : 10% system time. That 10% of system time is basically: - locking (root, hashtable, searchblocks) - caches to prevent doing more evaluations and smart code to select moves, in order to look ahead as deep as possible with the minimal number of 'nodes', where a node can be seen as a position that's processed somehow. The number of evaluations per second in my program is 2000-4000 on a PII-450; 60% I get directly out of the 'evaluation cache' (so the 2000-4000 full evaluations are the other 40%). Then another couple of thousand per second are 'transpositions' (out of the huge transposition hash table, which stores whether a certain path was already seen previously). In total this leads to 12000 to 14000 nodes a second. 12000 to 14000 nodes a second is dead slow compared to many other programs that get way over 100000. The reason for this is obviously the slow evaluation function. I would be interested in a PCI card with hardware (an FPGA?) that does this evaluation: it would receive, say, 160 bytes of position information (which can be compressed into fewer bytes) and return to the software an integer which is the evaluation for that given position. To be interesting, this evaluation must be delivered to the software at least 100,000 times a second. The four big questions then are: a) what do I need to make this? b) the price of making prototypes? c) the price per piece when making a couple of hundred of those PCI cards (up to a couple of thousand)? d) the price of making a reprogrammable(?) prototype (many bugs in the eval get fixed weekly and the eval gets expanded a bit weekly)?
Thanks in advance, Vincent Diepeveen diep@xs4all.nl www.diepchess.com -- +----------------------------------------------------+ | Vincent Diepeveen email: vdiepeve@cs.ruu.nl | | http://www.students.cs.ruu.nl/~vdiepeve/ | +----------------------------------------------------+Article: 19407
Hi - On Mon, 20 Dec 1999 12:41:19 GMT, micheal_thompson@my-deja.com wrote: > > >I pose this question on the assumption that an asynchronous clear >signal (for example a power on reset) might not reliably initialise all >registers of an FSM if it were to release very close to a clock edge. >Even if this is a correct assumption, maybe its felt that a reset is >such a rare-event that the chances of this happening are 'slim and >none' anyway? >Any thoughts would be appreciated. > >regds >Mike My policy is to give every FSM an asynchronous reset and a synchronous reset. The asynchronous reset puts the FSM in the right state even in the absence of a clock, which is important if the FSM is controlling, say, internal or external TriStates that might otherwise contend. The synchronous reset works around the problem you mentioned (by the way, 'slim and none' is just another phrase for, 'sooner or later, for sure'). I do one-hot FSMs exclusively, and I apply the synchronous reset only to the initial state FF of the FSM; I use it to (a) hold that FF set and (b) gate off that FF's output to any other state FF. I create the synchronous reset with a pipeline of 3 or 4 FFs, all of which get a global reset. A HIGH is fed to the D of the first FF, and gets propagated to the end of the chain after reset is released. The output of the last FF is inverted to produce the active HIGH synchronous reset. For devices that support global sets, you can just set all the FFs, feed a LOW into the first FF, and dispense with the inverter at the end. It's important to clock this FF chain with the same clock used for the FSM, of course. There are other ways to work around this problem, such as adding extra do-nothing states after the initial states in a one-hot, or making sure that the FSM won't transition out of the initial state until a few cycles after the asynch reset has been released. These work, too. 
The method I've described is easy to do in either schematics or HDL and, if desired, allows you to easily synchronize the startup of multiple FSMs. Take care, Bob Perlman ----------------------------------------------------- Bob Perlman Cambrian Design Works Digital Design, Signal Integrity http://www.best.com/~bobperl/cdw.htm Send e-mail replies to best<dot>com, username bobperl -----------------------------------------------------Article: 19408
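Bob's reset-release pipeline is easy to model behaviorally to convince yourself of the timing. A C sketch (a software model only, not HDL; the three-stage depth follows his 3-or-4-FF description):

```c
#include <string.h>

/* Behavioral model of the reset-release synchronizer: a chain of
 * flip-flops that are all cleared by the asynchronous reset, with a
 * constant '1' fed into the first stage.  The inverted output of the
 * last stage is the active-high synchronous reset. */
#define STAGES 3

struct reset_sync {
    int ff[STAGES];
};

/* Asynchronous global reset: clears every stage immediately. */
void rs_async_reset(struct reset_sync *rs)
{
    memset(rs->ff, 0, sizeof rs->ff);
}

/* One rising clock edge after the async reset has been released.
 * Returns the synchronous reset seen by the FSM this cycle. */
int rs_clock(struct reset_sync *rs)
{
    for (int i = STAGES - 1; i > 0; i--)
        rs->ff[i] = rs->ff[i - 1];
    rs->ff[0] = 1;                 /* D of the first FF tied high */
    return !rs->ff[STAGES - 1];    /* inverted last stage */
}
```

The synchronous reset stays asserted for the first clock edges after the asynchronous reset is released and then deasserts on a clean clock edge, so the FSM's initial-state FF can be held set until the clock domain is known-good.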
Visit the industry's largest independent on-line information source for programmable logic, The Programmable Logic Jump Station. * FREE downloadable FPGA and CPLD design software * Information on devices, boards, books, consultants, etc. * FAQ plus tutorials on VHDL and Verilog http://www.optimagic.com/index.shtml Featuring: --------- --- FREE Development Software --- Free and Low-Cost Software - http://www.optimagic.com/lowcost.shtml Free, downloadable demos and evaluation versions from all the major suppliers. --- Frequently-Asked Questions (FAQ) --- Programmable Logic FAQ - http://www.optimagic.com/faq.html A great resource for designers new to programmable logic. --- FPGAs, CPLDs, FPICs, etc. --- Recent Developments - http://www.optimagic.com/index.shtml Find out the latest news about programmable logic. Device Vendors - http://www.optimagic.com/companies.html FPGA, CPLD, SPLD, and FPIC manufacturers. Device Summary - http://www.optimagic.com/summary.html Who makes what and where to find out more. Market Statistics - http://www.optimagic.com/market.html Total high-density programmable logic sales and market share. --- Development Software --- Design Software - http://www.optimagic.com/software.html Find the right tool for building your programmable logic design. Synthesis Tutorials - http://www.optimagic.com/tutorials.html How to use VHDL or Verilog. --- Related Topics --- FPGA Boards - http://www.optimagic.com/boards.html See the latest FPGA boards and reconfigurable computers. Design Consultants - http://www.optimagic.com/consultants.html Find a programmable logic expert in your area of the world. Research Groups - http://www.optimagic.com/research.html The latest developments from universities, industry, and government R&D facilities covering FPGA and CPLD devices, applications, and reconfigurable computing. News Groups - http://www.optimagic.com/newsgroups.html Information on useful newsgroups. 
Related Conferences - http://www.optimagic.com/conferences.html Conferences and seminars on programmable logic. Information Search - http://www.optimagic.com/search.html Pre-built queries for popular search engines plus other information resources. The Programmable Logic Bookstore - http://www.optimagic.com/books.html Books on programmable logic, VHDL, and Verilog. Most can be ordered on-line, in association with Amazon.com . . . and much, much more. Bookmark it today!Article: 19409
You can use the global async reset even with fast clocks as long as you include a mechanism to keep critical stuff from starting off until a clock or two after the async global reset is released. There is no issue if the D inputs of the asynchronously reset flip-flops are not at a '1' level when the reset is released. micheal_thompson@my-deja.com wrote: > I pose this question on the assumption that an asynchronous clear > signal (for example a power on reset) might not reliably initialise all > registers of an FSM if it were to release very close to a clock edge. > Even if this is a correct assumption, maybe its felt that a reset is > such a rare-event that the chances of this happening are 'slim and > none' anyway? > Any thoughts would be appreciated. > > regds > Mike > > Sent via Deja.com http://www.deja.com/ > Before you buy. -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 19410
Vincent Diepeveen wrote: > On Sat, 18 Dec 1999 12:50:33 -0500, Ray Andraka <randraka@ids.net> > wrote: > > > > > > >Dann Corbit wrote: > > > >> "Ray Andraka" <randraka@ids.net> wrote in message > >> news:385B1DEE.7517AAC7@ids.net... > >> > The chess processor as you describe would be sensible in an FPGA. Current > >> > offerings have extraordinary logic densities, and some of the newer FPGAs > >> have > >> > over 500K of on-chip RAM which can be arranged as a very wide memory. > >> Some of > >> > the newest parts have several million 'marketing' gates available too. > >> FPGAs > >> > have long been used as prototyping platforms for custom silicon. > >> > >> I am curious about the memory. Chess programs need to access at least tens > >> of megabytes of memory. This is used for the hash tables, since the same > >> areas are repeatedly searched. Without a hash table, the calculations must > >> be performed over and over. Some programs can even access gigabytes of ram > >> when implemented on a mainframe architecture. Is very fast external ram > >> access possible from FPGA's? > > > >This is conventional CPU thinking. With the high degree of parallelism in the > > No this is algorithmic speedup design. > What I meant by this is that just using the FPGA to accelerate the CPU algorithm isn't necessarily going to give you all the FPGA is capable of doing. You need to rethink some of the algorithm to optimize it to the resources you have available in the FPGA. The algorithm as it stands now is at least somewhat tailored to a CPU implementation. It appears your thinking is just using the FPGA to speed up the inner loop, where what I am proposing is to rearrange the algorithm so that the FPGA might for example look at the whole board state on the current and then the next move. In a CPU-based algorithm, the storage is cheap and the computation is expensive.
In an FPGA, you have an opportunity for very wide parallel processes (you can even send a lock signal laterally across process threads). Here the processing is generally cheaper than the storage of intermediate results. The limiting factor is often the I/O bandwidth, so you want to rearrange your algorithm to tailor it to the quite different limitations of the FPGA. > Branching factor (time multiplyer to see another move ahead) > gets better with it by a large margin. > > So BF in the next formula gets better > > # operations in FGPA = C * (BF^n) > where n is a positive integer. > > >FPGA and the large amount of resources in some of the more recent devices, it > >may very well be that it is more advantageous to recompute the values rather > >than fetching them. There may even be a better approach to the algorithm that > >just isn't practical on a conventional CPU. Early computer chess did not use > >the huge memories. I suspect the large memory is more used to speed up the > >processing rather than a necessity to solving the problem. > > Though #operations used by deep blue was incredible compared to > any program of today at world championship 1999 many programs searched > positionally deeper (deep blue 5 to 6 moves ahead some programs > looking there 6-7 moves ahead). > > This all because of these algorithmic improvements. > > It's like comparing bubblesort against merge sort. > You need more memory for merge sort as this is not in situ but > it's O (n log n). Take into account that in computergames the > option to use an in situ algorithm is not available. > > >> > If I were doing such I design in an FPGA however, I would look deeper to > >> see > >> > what algorithmic changes could be done to take advantage of the > >> parallelism > >> > offered by the FPGA architecture. Usually that means moving away from a > >> > traditional GP CPU architecture which is limited by the inherently serial > >> > instruction stream. 
If you are trying to mimic the behavior of a CPU, you > >> would > >> > possibly do better with a fast CPU, as you will get be able to run those > >> at a > >> > higher clock rate. The FPGA gains an advantage over CPUs when you can > >> take > >> > advantage of parallelism to get much more done in a clock cycle than you > >> can > >> > with a CPU. > >> > >> The ability to do many things at once may be a huge advantage. I don't > >> really know anything about FPGA's, but I do know that in chess, there are a > >> large number of similar calcutions that take place at the same time. The > >> more things that can be done in parallel, the better. > > > >Think of it as a medium for creating a custom logic circuit. A conventional CPU > >is specific hardware optimized to perform a wide variety of tasks, none > >especially well. Instead we can build a circuit the specifically addresses the > >chess algorithms at hand. Now, I don't really know much about the algorithms > >used for chess. I suspect one would look ahead at all the possibilities for at > >least a few moves ahead and assign some metric to each to determine the one with > >the best likely cost/benefit ratio. The FPGA might be used to search all the > >possible paths in parallel. > > My program allows parallellism. i need bigtime locking for this, in > order to balance the parallel paths. > > How are the possibilities in FPGA to press several of the same program > at one cpu, so that inside the FPGA there is a sense of parallellism? > > How about making something that enables to lock within the FPGA? > > It's not possible my parallellism without locking, as that's the same > bubblesort versus merge sort story, as 4 processors my program gets > 4.0 speedup, but without the locking 4 processors would be a > lot slower than a single sequential processor. > > >> > That said, I wouldn't recommend that someone without a sound footing in > >> > synchronous digital logic design take on such a project. 
Ideally the > >> designer > >> > for something like this is very familiar with the FPGA architecture and > >> tools > >> > (knows what does and doesn't map efficiently in the FPGA architecture), > >> and is > >> > conversant in computer architecture and design and possibly has some > >> pipelined > >> > signal processing background (for exposure to hardware efficient > >> algorithms, > >> > which are usually different than ones optimized for software). > >> I am just curious about feasibility, since someone raised the question. I > >> would not try such a thing by myself. > >> > >> Supposing that someone decided to do the project (however) what would a > >> rough ball-park guestimate be for design costs, the costs of creating the > >> actual masks, and production be for a part like that? > > > >The nice thing about FPGAs is that there is essentially no NRE or fabrication > >costs. The parts are pretty much commodity items, purchased as generic > >components. The user develops a program consisting of a compiled digital logic > >design, which is then used to field customize the part. Some FPGAs are > >programmed once during the product manufacturer (one time programmables include > >Actel and Quicklogic). Others, including the Xilinx line, have thousands of > >registers that are loaded up by a bitstream each time the device is powered up. > >The bitstream is typically stored in an external EPROM memory, or in some cases > >supplied by an attached CPU. Part costs range from under $5 for small arrays to > >well over $1000 for the newest largest fastest parts. > > How about a program that's having thousands of chessrules and > incredible amount of loops within them and a huge search, > > So the engine & eval only equalling 1.5mb of C source code. > > How expensive would that be, am i understaning here that > i need for every few rules to spent another $1000 ? It really depends on the implementation. 
The first step in finding a good FPGA implementation is repartitioning the algorithm. This groundwork is often the longest part of the FPGA design cycle, and it is a part that is not even really acknowledged in the literature or by the part vendors. Do the system work up front to optimize the architecture for the resources you have available, and in the end you will wind up with something much better, faster, and smaller than anything arrived at by simple translation. At one extreme, one could just use the FPGA to instantiate custom CPUs with a specialized instruction set for the chess program. That approach would likely net you less performance than an emulator for the custom CPU running on a modern machine. The reason for that is the modern CPUs are clocked at considerably higher clock rates than a typical FPGA design is capable of, so even if the emulation takes an average of 4 or 5 cycles for each custom instruction, it will still keep up with or outperform the FPGA. Where the FPGA gets its power is the ability to do lots of stuff at the same time. To take advantage of that, you usually need to get away from an instruction-based processor. > > > >The design effort for the logic circuit you are looking at is not trivial. For > >the project you describe, the bottom end would probably be anywhere from 12 > >weeks to well over a year of effort depending on the actual complexity of the > >design, the experience of the designer with the algorithms, FPGA devices and > >tools. > > I needed years to write it in C already... > > Vincent Diepeveen > diep@xs4all.nl > > >> -- > >> C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html > >> "The C-FAQ Book" ISBN 0-201-84519-9 > >> C.A.P. Newsgroup http://www.dejanews.com/~c_a_p > >> C.A.P. FAQ: ftp://38.168.214.175/pub/Chess%20Analysis%20Project%20FAQ.htm > > >-- > >-Ray Andraka, P.E. > >President, the Andraka Consulting Group, Inc.
> >401/884-7930 Fax 401/884-7950 > >email randraka@ids.net > >http://users.ids.net/~randraka -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randraka
Article: 19411
In article <38542614.91EB57DA@auckland.ac.nz>, Grant Sargent <g.sargent@auckland.ac.nz> wrote: [snip] > I'd recommend it. I did see that a previous poster mentioned it was a > 6-month time-limited version, but the version I've got is unlimited > (well, I'm fairly sure it's unlimited... I didn't see anything that > mentioned a time-limit.) > > Cheers, > Grant > ... the license code does expire after 6 months, but you can then just re-register through the Altera website, and you get a license file good for another 6 months. It's sort of a marketing thing, I suppose; keeps eyes on their site. I have to cast my vote for Altera also - as a cold-start novice, I got up the curve pretty quickly based on their help files and examples. Good luck- Joe Curren Sent via Deja.com http://www.deja.com/ Before you buy.
Article: 19412
>can you explain what are the "dirty asynchronous tricks" to avoid, please? Yep, I also would like to know what you shouldn't do. I have been taught to keep everything synchronous. Some stuff I've learned is: - Never source an asynch reset or clock from comb. logic - No comb. feedback loops - Synch your asynch inputs carefully (if needed) - Use one global clock (Be aware of clock skew and short paths) - Use one global reset, but take careful care of your FSM reset - Keep everything synchronous! Will this avoid "dirty asynchronous tricks" or what are those tricks? Merry X-Mas from an inexperienced engineer!
Article: 19413
John L. Smith <jsmith@visicom.com> wrote... > Seems to me that if chess is implementable in FPGA, place & route > ought to be accelerable too. And that place/route should have a > higher priority. I'd rather play chess against people. > > When do we get FPGA-accelerated place/route, to reduce our P/R times > from hours to minutes? It's only a factor of 60 acceleration we're > looking for. > > Take the discussion below, and replace the words "moves" or "board > positions" with the word "placements" or "routings" where appropriate. Some guys at Xerox PARC were looking at that a few years ago. In principle something like Lee's algorithm is a natural. The problem is that users are rarely looking for a simple route solution; the 'rules' set for a modern design can be huge - things like layer assignment, crosstalk, trace widths, and on and on. Sorry, I no longer have any references to the Xerox work.
Article: 19414
<micheal_thompson@my-deja.com> wrote... > > > I pose this question on the assumption that an asynchronous clear > signal (for example a power-on reset) might not reliably initialise all > registers of an FSM if it were to release very close to a clock edge. > Even if this is a correct assumption, maybe it's felt that a reset is > such a rare event that the chances of this happening are 'slim and > none' anyway? > Any thoughts would be appreciated. The chance of it happening on your bench is 'slim to none'. The chance of it happening in the customer's equipment is 100%. I have an existence proof of this :)
Article: 19415
Seems to me that if chess is implementable in FPGA, place & route ought to be accelerable too. And that place/route should have a higher priority. I'd rather play chess against people. When do we get FPGA-accelerated place/route, to reduce our P/R times from hours to minutes? It's only a factor of 60 acceleration we're looking for. Take the discussion below, and replace the words "moves" or "board positions" with the word "placements" or "routings" where appropriate. Dave Decker wrote: > "Simon Bacon" > <simon@tile.demon.co.uk.notreally> wrote: > >Could you post a few examples of the sort of primitives you > >would like to see a Chess Machine execute. > The partition between the work done by the micro or DSP and the work > done by the FPGA is usually best made giving the micro the more > complex algorithmic jobs and giving the FPGA the compute-intensive, > but algorithmically simple, repetitive, flow-through tasks. > > Chess programs have to: > Generate a tree of all possible moves from the current position for a > depth of a few generations, the more the better. > > Prune that tree so that stupid moves are not investigated, giving time > for more interesting moves to be probed to more generations. > > As each new possible future board position is postulated it must be > evaluated. > > It seems that one first task the FPGA could do is to evaluate a board > position and return its merit. > > If that's not enough work, perhaps the FPGA could also generate a list > of every possible next half move and return that list. > > The micro would be used for the more complex task of pruning the tree > and sending the next position to be evaluated to the FPGA(s). The > micro would need access to big memory. The FPGA would just run a > subroutine, without need to reference the history or progress of the > overall algorithm.
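Dave Decker's partition above hands the FPGA the "evaluate a board position and return its merit" task. As a rough illustration of why that task suits hardware (the board encoding and the idea of a plain material count are invented here, not taken from the thread), consider a minimal evaluator: a CPU must walk the 64 squares serially, while FPGA logic could compute every square's contribution simultaneously and sum them in an adder tree in a clock cycle or two.

```c
/* Toy material evaluator, sketched as software. Each square holds a
 * signed centipawn value (+100 = white pawn, -900 = black queen,
 * 0 = empty) -- a hypothetical encoding for illustration only.
 * A CPU iterates over the squares one at a time; equivalent FPGA
 * logic would form all 64 contributions in parallel and combine
 * them in an adder tree. */
int evaluate_material(const int board[64])
{
    int score = 0;
    for (int sq = 0; sq < 64; sq++)  /* serial on a CPU, parallel in logic */
        score += board[sq];
    return score;                    /* positive favors White */
}
```

Real evaluators weigh mobility, king safety, pawn structure, and so on, but each of those is similarly a fixed, repetitive, flow-through computation over the board — exactly the profile Dave describes for the FPGA side of the partition.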
Article: 19416
John L. Smith <jsmith@visicom.com> wrote in message news:385EC613.57BAFBF0@visicom.com... > Seems to me that if chess is implementable in FPGA, place & route > ought to be accelerable too. It is. Some models of those big ASIC emulation boxes made by Ikos and QuickTurn use Xilinx FPGAs, and they're running the same old PAR that you and I are... just on racks of PCs simultaneously to speed up the process. I don't know just how many are run in parallel, however. > When do we get FPGA-accelerated place/route, to reduce our P/R times > from hours to minutes? It's only a factor of 60 acceleration we're > looking for. 64 PCs (which wouldn't be 60x, but...) is certainly under $64K... how much money do you have? :-)
Article: 19417
I agree with Mark here that you should go after MakeSrc. It's a simple, clean little program that'll generate C source for you. On our last project, we used it until we ran out of memory and had to stuff the bitstream into Flash. In any case, it's configurable enough so that you can have it add lines before and after the bulk of the FPGA bitstream -- this is useful for adding a line that defines a variable telling your programming subroutine the size of the bitstream. (I.e., you don't want to have a header file that says #define BitStreamLength=xxxx -- make the compiler do the work for you, something like const int SizeOfFPGABitStream=sizeof(TheBigHonkingFPGABitStreamArray)) Another thing to check when you do this -- make sure the definition of the datastream itself has a 'const' qualifier in front of it! (E.g., const unsigned myBitstream[]={ ...} ). If you don't do this, and if your linker thinks it's targeting a ROM, it'll happily reserve the same amount of space out of your RAM and the C environment startup code will copy it from 'ROM' to RAM. Uggh. > We created a (simple and dirty) batch file to do all the work. ...or coerce your makefile into doing this. ---Joel Kolstad
Article: 19418
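The const-qualified bitstream table described above can be sketched as follows. The array contents here are placeholders, not a real bitstream; the point is the pattern — `const` keeps the data in ROM/Flash, and `sizeof` does the length bookkeeping instead of a hand-maintained `#define`.

```c
/* Embedded-bitstream pattern: declare the configuration data const
 * so the linker can leave it in ROM/Flash rather than reserving RAM
 * and copying it at startup, and let the compiler compute its
 * length. The bytes below are dummies standing in for the
 * tool-generated array (e.g. MakeSrc output). */
static const unsigned char fpga_bitstream[] = {
    0xFF, 0xFF, 0xAA, 0x99, 0x55, 0x66,  /* ...real data would follow... */
};

/* Length derived from the array itself -- never edited by hand. */
const unsigned int fpga_bitstream_len = sizeof(fpga_bitstream);
```

A download routine then just shifts `fpga_bitstream_len` bytes out of `fpga_bitstream`; regenerating the array never requires touching a separate length constant.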
Hi Thomas We have never had to alter any lines inside the Jam source file and have always been able to use the .jam file produced by Max+Plus II. The following is the DOS command line we use to program a FLEX 10K30A as part of a three-device JTAG chain: jam -v -dDO_CONFIGURE=1 -p378 cpld_top.jam We are using Jam.exe ver 1.2 and Max+Plus II 9.3 and have a ByteBlasterMV connected to a standard PC printer port (LPT1 at 378h) via a 2m long D25M-D25F extension cable. There are two versions of jam.exe: 16-bit DOS and Win95/WinNT. Have you tried each version? Can't think of anything else to try - hope it works out for you! Regards, Michael Thomas Bornhaupt wrote: > Hi Michael, > > thank you for your tips. But it does not work. > > It seems to me that MAX+plus (9.3) generates wrong JAM or JBC files. > > I tested Jam.EXE 1.2 with the -dDO_CONFIGURE. But the chip is not > programmed. > > Inside the JAM file (Language 1.1) I found this line > > BOOLEAN DO_CONFIGURE = 0; > > So I set it to > > BOOLEAN DO_CONFIGURE = 1; > > Starting JAM.EXE I got a syntax error in line 440! > > I also tested JAM.EXE 2.2. Here you have the option -aCONFIGURE. This is the > action out of the JAM file (STAPL format): > > ACTION CONFIGURE = PR_INIT_CONFIGURE, PR_EXECUTE; > > And now I got an exception. The DOS box went directly away and a pure > DOS machine hung up with an EMM386 error. > > Regards > Thomas Bornhaupt
Article: 19419
That pretty much covers it. Add don't use comb. logic for intentional delays. Jonas Thor wrote: > >can you explain what are the "dirty asynchronous tricks" to avoid, please? > > Yep, I also would like to know what you shouldn't do. I have been > taught to keep everything synchronous. Some stuff I've learned is: > > - Never source an asynch reset or clock from comb. logic > - No comb. feedback loops > - Synch your asynch inputs carefully (if needed) > - Use one global clock (Be aware of clock skew and short paths) > - Use one global reset, but take careful care of your FSM reset > - Keep everything synchronous! > > Will this avoid "dirty asynchronous tricks" or what are those tricks? > > Merry X-Mas from an inexperienced engineer! -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randraka
Article: 19420
I have heard of a limited amount of work using FPGAs to speed up PAR for FPGAs. For some reason, there doesn't seem to be much mainstream interest. Perhaps it is because they (whoever that is) assume the average FPGA user who only does a design or so a year isn't going to spring several grand for an accelerator board to save what amounts to a very small amount of time. John L. Smith wrote: > Seems to me that if chess is implementable in FPGA, place & route > ought to be accelerable too. And that place/route should have a > higher priority. I'd rather play chess against people. > > When do we get FPGA-accelerated place/route, to reduce our P/R times > from hours to minutes? It's only a factor of 60 acceleration we're > looking for. > > Take the discussion below, and replace the words "moves" or "board > positions" with the word "placements" or "routings" where appropriate. > > Dave Decker wrote: > > "Simon Bacon" > <simon@tile.demon.co.uk.notreally> wrote: > > >Could you post a few examples of the sort of primitives you > > >would like to see a Chess Machine execute. > > The partition between the work done by the micro or DSP and the work > > done by the FPGA is usually best made giving the micro the more > > complex algorithmic jobs and giving the FPGA the compute-intensive, > > but algorithmically simple, repetitive, flow-through tasks. > > > > Chess programs have to: > > Generate a tree of all possible moves from the current position for a > > depth of a few generations, the more the better. > > > > Prune that tree so that stupid moves are not investigated, giving time > > for more interesting moves to be probed to more generations. > > > > As each new possible future board position is postulated it must be > > evaluated. > > > > It seems that one first task the FPGA could do is to evaluate a board > > position and return its merit.
> > > > If that's not enough work, perhaps the FPGA could also generate a list > > of every possible next half move and return that list. > > > > The micro would be used for the more complex task of pruning the tree > > and sending the next position to be evaluated to the FPGA(s). The > > micro would need access to big memory. The FPGA would just run a > > subroutine, without need to reference the history or progress of the > > overall algorithm. > > John L. Smith <jsmith@visicom.com> > Principal Engineer > Visicom Imaging Products > http://www.visicom.com/products/Vigra/index.html -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randraka
Article: 19421
perhaps another one ... no self-clearing structures (f-f output to its clear). find another way to make that pulse! i've seen this too many times! ---------------------------------------------------------------------- rk The world of space holds vast promise stellar engineering, ltd. for the service of man, and it is a stellare@erols.com.NOSPAM world we have only begun to explore. Hi-Rel Digital Systems Design -- James E. Webb, 1968 Ray Andraka wrote: > That pretty much covers it. Add don't use comb. logic for intentional delays. > > Jonas Thor wrote: > > > >can you explain what are the "dirty asynchronous tricks" to avoid, please? > > > > Yep, I also would like to know what you shouldn't do. I have been > > taught to keep everything synchronous. Some stuff I've learned is: > > > > - Never source an asynch reset or clock from comb. logic > > - No comb. feedback loops > > - Synch your asynch inputs carefully (if needed) > > - Use one global clock (Be aware of clock skew and short paths) > > - Use one global reset, but take careful care of your FSM reset > > - Keep everything synchronous! > > > > Will this avoid "dirty asynchronous tricks" or what are those tricks? > > > > Merry X-Mas from an inexperienced engineer! > > -- > -Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. > 401/884-7930 Fax 401/884-7950 > email randraka@ids.net > http://users.ids.net/~randraka
Article: 19422
Hi, Can anyone share useful links or know-how on how to build an automated testbench? Generally, I am interested in what issues are involved and how to get around them. I am also interested in how to find a good text-handling package to read the stimulus vectors from a file, cycle n number of cycles, and output the result to a file for verification, beyond the limited packages std.textio.all and ieee.std_logic_textio.all. Thanks yeo
Article: 19423
> no self-clearing structures (f-f output to its clear). find another way to make > that pulse! i've seen this too many times! I'd go much farther than that. Any pulse that isn't a multiple of the clock period (made by clocking a FF in a FSM) is asking for trouble. How about a list of reasonable ways to make a shorter pulse? (and/or things to keep in mind when you do) I think one of the Xilinx app notes mentions at least one. It may be a very old one. -- These are my opinions, not necessarily my employer's.
Article: 19424
Hal Murray wrote: > > no self-clearing structures (f-f output to its clear). find another way to make > > that pulse! i've seen this too many times! > > I'd go much farther than that. Any pulse that isn't a multiple > of the clock period (made by clocking a FF in a FSM) is asking > for trouble. one circuit i ran into that comes to mind, one of the most memorable ones, was when there were two flip-flops, their outputs were NANDed, and the output of the NAND was hooked up to the clears of both flip-flops and the NAND was an output of that sub-circuit. it's nice to always be able to make a pulse an integral number of clock ticks ... however, system requirements and limitations do not always make that practical. the nice thing about making everything go off one edge of the clock, a frequent recommendation, is that it makes the static timing analysis trivial. same with the other rules ... nice to follow but can't all the time. for example, sometimes i just run out of low-skew clocks ... you can design reliably with high-skew clocks, there are a number of ways to do it, but they aren't pretty and take some care. another example is designing for very low power. ---------------------------------------------------------------------- rk The world of space holds vast promise stellar engineering, ltd. for the service of man, and it is a stellare@erols.com.NOSPAM world we have only begun to explore. Hi-Rel Digital Systems Design -- James E. Webb, 1968
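The thread above asks for reasonable synchronous ways to make a pulse but never names one. One common technique — offered here as an assumption, not as anything the posters endorsed — is a registered edge detector: one flip-flop delays the input by a clock, and an AND gate fires for exactly one clock period on a rising edge, replacing the self-clearing flip-flop hack. A C register-transfer model of that circuit, for illustration only (the real thing is a single FF plus one AND gate):

```c
/* Register-transfer model of a synchronous rising-edge detector:
 * each clock, pulse = in AND (NOT prev), then prev <- in.
 * The pulse output is high for exactly one clock period after a
 * 0 -> 1 transition on the (already synchronized) input. */
typedef struct {
    int prev;   /* state of the delay flip-flop */
} edge_det;

int edge_det_clock(edge_det *d, int in)
{
    int pulse = in && !d->prev;  /* high only on the cycle in first goes high */
    d->prev = in;                /* flip-flop captures the input each clock */
    return pulse;
}
```

Because the pulse is an integral number of clock ticks, static timing analysis covers it completely — no race through a clear pin, no dependence on gate delays.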