Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:

(previously snipped project suggestions)

> The trouble with all these projects is they're something a GPU could do with
> much less programming effort (at least to make it work non-optimally). So
> I'm not sure the advantage of using an FPGA. In an FPGA it's a lot harder
> to change the architecture if the problem changes (at least if you're
> writing in Verilog/VHDL it is).

If you can do them in fixed point, you can make really big arrays to
process really big data sets, though not so cheap. There are people who
need that, but not so many of them. Learning how to do it isn't bad, though.

(snip)

> One thing FPGAs are good at I/O. So a nice example is video processing -
> you take in video from a camera, do something clever to it, and output to a
> display. There's a lot of data so you have to process it fast, and it's a
> nice visual demo. It's also easy to debug - you can see what's going wrong
> on the screen.

Well, many filtering algorithms can be implemented as systolic arrays,
which allow for minimal I/O for the processing done. Implementing an FIR
filter in fixed point in an FPGA would be a reasonable sized project.
Again, learn about systolic arrays.

> Likewise other kinds of non-optical data (eg scan data from a 2D sensor of
> some kind). You can also use audio or other sensors, as long as you have a
> useful output.

--
glen

Article: 157126
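[Editorial note: the systolic-array FIR suggestion above can be prototyped in software before committing to HDL. Below is a minimal sketch, in Python rather than Verilog, of a transposed-form FIR modeled as a chain of multiply-accumulate cells; the function name and structure are illustrative only, not from the thread.]

```python
# Software model of a transposed (systolic-friendly) FIR filter.
# Each list element stands for one hardware cell's pipeline register;
# per "clock" every cell does one multiply-accumulate, so the critical
# path stays one MAC regardless of filter length.
def systolic_fir(samples, coeffs):
    regs = [0] * len(coeffs)
    out = []
    for x in samples:
        # Cell i: coeffs[i]*x plus the registered partial sum of cell i+1.
        regs = [coeffs[i] * x + (regs[i + 1] if i + 1 < len(regs) else 0)
                for i in range(len(regs))]
        out.append(regs[0])
    return out

# Impulse response recovers the coefficients, a handy sanity check
# before writing the HDL version:
# systolic_fir([1, 0, 0, 0], [1, 2, 3]) → [1, 2, 3, 0]
```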
On 10/14/2014 8:17 PM, Theo Markettos wrote:
> rickman <gnuarm@gmail.com> wrote:
>> Oceanic modeling is a huge area. You might want to narrow the focus on
>> that one a *lot* more before you try to narrow your list... or just
>> remove it.
>
> The trouble with all these projects is they're something a GPU could do with
> much less programming effort (at least to make it work non-optimally). So
> I'm not sure the advantage of using an FPGA. In an FPGA it's a lot harder
> to change the architecture if the problem changes (at least if you're
> writing in Verilog/VHDL it is).

Why is an HDL harder to change than any other code? I use the same editor
for both...

>> An area I find interesting is low power processing. You might consider
>> what it takes to do something with a minimum of power consumption using
>> off the shelf devices. There are a lot of potential applications there.
>
> One thing FPGAs are good at I/O. So a nice example is video processing -
> you take in video from a camera, do something clever to it, and output to a
> display. There's a lot of data so you have to process it fast, and it's a
> nice visual demo. It's also easy to debug - you can see what's going wrong
> on the screen.

That is very true.

> Likewise other kinds of non-optical data (eg scan data from a 2D sensor of
> some kind). You can also use audio or other sensors, as long as you have a
> useful output.

I/O is a big plus for an FPGA. But I think the OP wants something that
deals with some current major problem. I wonder what medical app might be
suitable for an FPGA. Something that uses an array of sensors to measure
body contour or pressure maybe, like a footstep?

--
Rick

Article: 157127
rickman <gnuarm@gmail.com> wrote:
> On 10/14/2014 8:17 PM, Theo Markettos wrote:
> > rickman <gnuarm@gmail.com> wrote:
> >> Oceanic modeling is a huge area. You might want to narrow the focus on
> >> that one a *lot* more before you try to narrow your list... or just
> >> remove it.
> >
> > The trouble with all these projects is they're something a GPU could do
> > with much less programming effort (at least to make it work
> > non-optimally). So I'm not sure the advantage of using an FPGA. In an
> > FPGA it's a lot harder to change the architecture if the problem changes
> > (at least if you're writing in Verilog/VHDL it is).
>
> Why is an HDL harder to change than any other code? I use the same
> editor for both...

Changing small-scale stuff is easy, in any language. Re-architecting the
problem is harder. In Verilog/VHDL it's hard because you have to rewrite
all the control logic as well as reorganise the datapath.

Let's say you built a single-issue CPU and you want to convert it to
superscalar. Not only do you need to rewrite the datapath (not trivial)
you have to manage all the enable signals on the pipeline stages and the
state machine about when each stage fires. If you get one of those
interlocks wrong you get subtle bugs. If you change something, you may get
a different set of subtle bugs. Rinse and repeat.

If you're writing code on a GPU you're writing in a much higher level
language: the API doesn't even know how many cores you have (it'll depend
what model of GPU your machine has) - you just give it the work to do and
it'll partition it up amongst the cores.

While there are many subtleties about writing efficient GPU code (you need
to know a lot about the underlying architecture to achieve good
performance), it's relatively simple to write bad GPU code that works, and
then you can refine it later. Bad HDL tends not to work. Not working means
staring at simulator traces, which is not a pleasant experience. Or it
works in the simulator but not on the board, because the language (I'm
looking at Verilog particularly) isn't sufficiently strict about what the
expected behaviour should be (and then you get to stare at
ChipScope/SignalTap traces, an even less pleasant experience).

> I/O is a big plus for an FPGA. But I think the OP wants something that
> deals with some current major problem.

The issue for compute problems is always going to be that the FPGA at, say,
200MHz and one stick of DDR3 RAM is up against the multi-GHz GPU with
thousands of threads, GDDR5 memory, and so on. There are applications that
don't suit GPUs, but unless you have a good architectural reason why it
won't work I'd say in most cases you're better off starting with a GPU.

However, this is putting the cart before the horse. If you stare at your
algorithm for long enough, with the FPGA or GPU architecture in mind, you
can probably make significant performance increase by refactoring the task
before writing a line of code.

I realise this is a student project so 'doing something with an FPGA' might
be more of a goal than 'making X go faster', but we tend to see a lot of
papers which go like this:

1. Built a Matlab/Java/Python simulator that ran at speed X
2. Built an FPGA system that runs at speed 100X
3. Profit!^H^H^H^H Publish!

When the intermediate steps might be

1b. Refactored algorithm (with caches, memory bandwidth, etc in mind)
1c. Built a multithreaded C/C++ simulator that runs at speed 70X on the
    same hardware as Matlab result
1d. Run that on a proper server, not their 5 year old laptop

at which point why bother with this FPGA stuff?

Theo

Article: 157128
Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:
> rickman <gnuarm@gmail.com> wrote:
>> On 10/14/2014 8:17 PM, Theo Markettos wrote:

(snip)

>> > The trouble with all these projects is they're something a GPU could do
>> > with much less programming effort (at least to make it work
>> > non-optimally). So I'm not sure the advantage of using an FPGA. In an
>> > FPGA it's a lot harder to change the architecture if the problem changes
>> > (at least if you're writing in Verilog/VHDL it is).

For those problems, you should use a GPU. There are some problems where an
FPGA is a good solution, though. First, they pretty much have to be able
to be done in fixed point. Next, they have to be done on a really huge
scale. If all the arithmetic operations are fixed point add, subtract, and
compare, you can do a really huge number of them in an array of FPGAs.

>> Why is an HDL harder to change than any other code? I use the same
>> editor for both...

> Changing small-scale stuff is easy, in any language. Re-architecting the
> problem is harder. In Verilog/VHDL it's hard because you have to rewrite
> all the control logic as well as reorganise the datapath.

Linear systolic arrays are pretty easy to change. It is a linear array of
relatively simple (but in any case, a module in the appropriate HDL) cells.
You can put more or less in a single FPGA, and make a linear array of such
FPGAs when needed.

> Let's say you built a single-issue CPU and you want to convert it to
> superscalar. Not only do you need to rewrite the datapath (not
> trivial) you have to manage all the enable signals on the pipeline
> stages and the state machine about when each stage fires.
> If you get one of those interlocks wrong you get subtle bugs.
> If you change something, you may get a different set of subtle bugs.
> Rinse and repeat.

In that case, no, don't use an FPGA.

> If you're writing code on a GPU you're writing in a much higher level
> language: the API doesn't even know how many cores you have
> (it'll depend what model of GPU your machine has) - you just give
> it the work to do and it'll partition it up amongst the cores.

I am not so sure what is now being done with large arrays of FPGAs (not
clusters of PCs with a few GPUs in them). If it needs floating point, and
not all problems that are commonly done in floating point should be, then a
GPU might be a better choice.

(snip)

>> I/O is a big plus for an FPGA. But I think the OP wants something that
>> deals with some current major problem.

> The issue for compute problems is always going to be that the FPGA at, say,
> 200MHz and one stick of DDR3 RAM is up against the multi-GHz GPU with
> thousands of threads, GDDR5 memory, and so on. There are applications that
> don't suit GPUs, but unless you have a good architectural reason why it
> won't work I'd say in most cases you're better off starting with a GPU.

I have written verilog that could do 1e19 operations, which are 5 bit
add/subtract/compare, per day. There is an actual problem that can use
that much computation. How many GPUs does it take to do 1e19 arithmetic
operations per day?

> However, this is putting the cart before the horse. If you stare at your
> algorithm for long enough, with the FPGA or GPU architecture in mind, you
> can probably make significant performance increase by refactoring the task
> before writing a line of code.

> I realise this is a student project so 'doing something with an FPGA' might
> be more of a goal than 'making X go faster', but we tend to see a lot of
> papers which go like this:

> 1. Built a Matlab/Java/Python simulator that ran at speed X
> 2. Built an FPGA system that runs at speed 100X
> 3. Profit!^H^H^H^H Publish!

Well, at this point he only needs to show that it could be done. That is,
proof of concept. Only when someone puts up the money does he have to show
that it can scale.

> When the intermediate steps might be
> 1b. Refactored algorithm (with caches, memory bandwidth, etc in mind)
> 1c. Built a multithreaded C/C++ simulator that runs at speed 70X
>     on the same hardware as Matlab result
> 1d. Run that on a proper server, not their 5 year old laptop

> at which point why bother with this FPGA stuff?

For some actual examples of FPGA based hardware processors see:

http://www.timelogic.com/catalog/775

--
glen

Article: 157129
On Tuesday, 14 October 2014 05:09:49 UTC+13, awais...@namal.edu.pk wrote:
> I am student of Bachelors and going to start my FYP in some days. I am
> going into the field of high computation in verilog.
>
> Any other projects you might suggest that may be beneficial for me.
>
> Thanks!

It isn't so much computation, and I know nothing about the bioinformatics
field, but some sort of DNA pattern matching algo always struck me as being
an interesting area to explore. The data objects are small, the data set
sizes are large, and the parallel nature of FPGAs and internal memory
bandwidth can be exploited. A processor can compare a couple symbols per
cycle, a GPU might do a few 100 or a thousand symbols per cycle. A low end
FPGA could do a few thousand per cycle.

Is it best to have 'n' tiny little state machines, each detecting one of
'n' patterns, or do you timeslice 'x' state machines, each looking for x/n
patterns? Is it best to look at data in big gulps, or one symbol at a time?

How is the best way to look for patterns? A giant grep-like FSM, or
multiple smaller FSMs? Do you spread the FSMs into a pipeline (each stage
feeding onto the next) or do you use local feedback? Can FSMs be
partitioned to maximise efficiency? Can you leverage the underlying FPGA
architecture to your advantage (e.g. cascades in DSP blocks, coupling
between BRAM blocks)?

I like this idea because the FPGA side FSMs are relatively simple, and most
of the technology is in how you generate the tables that allow you to
search quickly and efficiently.

It could also easily implement pattern matches that are tricky to do in
S/W.

Gosh Darn - looks like somebody has been there before (not that I've
actually read the papers...)

http://www.ipcsit.com/vol2/38-A313.pdf
http://ieee-hpec.org/2012/index_htm_files/Fernandez.pdf

I also thought that some sort of particle simulation (e.g. Photon Mapping)
would be interesting to explore, but never had the time.

Mike

Article: 157130
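[Editorial note: as a software warm-up for the FSM questions in the post above, the classic bit-parallel Shift-And matcher packs one pattern's entire state into a single word, which is close to how one of Mike's "'n' tiny little state machines" would look as a row of FPGA registers. A Python sketch follows; the function name is illustrative and this is the textbook algorithm, not code from the papers linked above.]

```python
# Bit-parallel Shift-And pattern matcher over the DNA alphabet.
# Bit i of `state` is set when pattern[0..i] matches the text ending at
# the current symbol -- effectively a one-hot FSM state vector that an
# FPGA implementation would hold in registers, one shift-AND per clock.
def shift_and_search(text, pattern):
    m = len(pattern)
    mask = {c: 0 for c in "ACGT"}
    for i, c in enumerate(pattern):
        mask[c] |= 1 << i          # precomputed per-symbol bit masks
    state, hits = 0, []
    for pos, c in enumerate(text):
        state = ((state << 1) | 1) & mask[c]
        if state & (1 << (m - 1)):
            hits.append(pos - m + 1)   # start index of each match
    return hits

# shift_and_search("ACGTACGT", "ACG") → [0, 4]
```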
On 10/15/2014 7:57 PM, Theo Markettos wrote:
> rickman <gnuarm@gmail.com> wrote:
>> On 10/14/2014 8:17 PM, Theo Markettos wrote:
>>> rickman <gnuarm@gmail.com> wrote:
>>>> Oceanic modeling is a huge area. You might want to narrow the focus on
>>>> that one a *lot* more before you try to narrow your list... or just
>>>> remove it.
>>>
>>> The trouble with all these projects is they're something a GPU could do
>>> with much less programming effort (at least to make it work
>>> non-optimally). So I'm not sure the advantage of using an FPGA. In an
>>> FPGA it's a lot harder to change the architecture if the problem changes
>>> (at least if you're writing in Verilog/VHDL it is).
>>
>> Why is an HDL harder to change than any other code? I use the same
>> editor for both...
>
> Changing small-scale stuff is easy, in any language. Re-architecting the
> problem is harder. In Verilog/VHDL it's hard because you have to rewrite
> all the control logic as well as reorganise the datapath.
>
> Let's say you built a single-issue CPU and you want to convert it to
> superscalar. Not only do you need to rewrite the datapath (not trivial) you
> have to manage all the enable signals on the pipeline stages and the state
> machine about when each stage fires. If you get one of those interlocks
> wrong you get subtle bugs. If you change something, you may get a different
> set of subtle bugs. Rinse and repeat.
>
> If you're writing code on a GPU you're writing in a much higher level
> language: the API doesn't even know how many cores you have (it'll depend
> what model of GPU your machine has) - you just give it the work to do and
> it'll partition it up amongst the cores.

I don't follow. How is rearchitecting a CPU in an FPGA anything like
changing GPU code? Your explanation makes no sense. I think you are
working from a huge lack of knowledge of HDLs.

> While there are many subtleties about writing efficient GPU code (you need
> to know a lot about the underlying architecture to achieve good
> performance), it's relatively simple to write bad GPU code that works, and
> then you can refine it later. Bad HDL tends not to work. Not working means
> staring at simulator traces, which is not a pleasant experience. Or it
> works in the simulator but not on the board, because the language (I'm
> looking at Verilog particularly) isn't sufficiently strict about what the
> expected behaviour should be (and then you get to stare at
> ChipScope/SignalTap traces, an even less pleasant experience).
>
>> I/O is a big plus for an FPGA. But I think the OP wants something that
>> deals with some current major problem.
>
> The issue for compute problems is always going to be that the FPGA at, say,
> 200MHz and one stick of DDR3 RAM is up against the multi-GHz GPU with
> thousands of threads, GDDR5 memory, and so on. There are applications that
> don't suit GPUs, but unless you have a good architectural reason why it
> won't work I'd say in most cases you're better off starting with a GPU.

The CPU has only a handful of ALUs to perform useful calculations on; the
FPGA is limited only by its size. The clock speed is swamped by the sheer
number of computations that can happen in parallel.

The GPU has lots of ALUs, but is limited in how they are used. It is
*nothing* like having 1000 separate processors. So it can only be useful
on certain types of problems.

The FPGA gets around all of these issues and can be configured on the fly.

> However, this is putting the cart before the horse. If you stare at your
> algorithm for long enough, with the FPGA or GPU architecture in mind, you
> can probably make significant performance increase by refactoring the task
> before writing a line of code.
>
> I realise this is a student project so 'doing something with an FPGA' might
> be more of a goal than 'making X go faster', but we tend to see a lot of
> papers which go like this:
>
> 1. Built a Matlab/Java/Python simulator that ran at speed X
> 2. Built an FPGA system that runs at speed 100X
> 3. Profit!^H^H^H^H Publish!
>
> When the intermediate steps might be
> 1b. Refactored algorithm (with caches, memory bandwidth, etc in mind)
> 1c. Built a multithreaded C/C++ simulator that runs at speed 70X on the
>     same hardware as Matlab result
> 1d. Run that on a proper server, not their 5 year old laptop
>
> at which point why bother with this FPGA stuff?

I gave one reason which you didn't respond to.

--
Rick

Article: 157131
Mike Field <mikefield1969@gmail.com> wrote:

(snip)

> It isn't so much computation, and I know nothing about the
> bioinformatics field, but some sort of DNA pattern matching algo
> always struck me as being an interesting area to explore.
> The data objects are small, the data set sizes are large,
> and the parallel nature of FPGAs and internal memory bandwidth
> can be exploited. A processor can compare a couple symbols per
> cycle, a GPU might do a few 100 or a thousand symbols per cycle.
> A low end FPGA could do a few thousand per cycle.

I think that is about right. And run at about 200MHz, maybe 300MHz.
FPGAs have the registers built in, so you just have to be sure
to use enough of them.

> Is it best to have 'n' tiny little state machines, each detecting
> one of 'n' patterns, or do you timeslice 'x' state machines,
> each looking for x/n patterns? Is it best to look at data in
> big gulps, or one symbol at a time?

https://en.wikipedia.org/wiki/Dynamic_programming#Sequence_alignment

> How is the best way to look for patterns? A giant grep-like FSM,
> or multiple smaller FSMs? Do you spread the FSMs into a pipeline
> (each stage feeding onto the next) or do you use local feedback?

The latter is probably the right description.

The idea of dynamic programming is that if you make the optimal
decision at each point, you find the globally optimal solution.
Conveniently, systolic arrays are convenient for evaluating dynamic
programming algorithms, and also nice and efficient to implement
in FPGAs. (Or ASICs, sometimes.)

> Can FSMs be partitioned to maximise efficiency?
> Can you leverage the underlying FPGA architecture to your
> advantage (e.g. cascades in DSP blocks, coupling
> between BRAM blocks).

--
glen

Article: 157132
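[Editorial note: the dynamic-programming recurrence glen points to can be written down compactly. Below is a hedged Python sketch of Smith-Waterman local alignment with unit scores, computed one row at a time; a linear systolic array would evaluate the same cells along anti-diagonals, one reference base per hardware cell. Function name and default scores are illustrative.]

```python
# Smith-Waterman local alignment, unit scoring (+1 match, -1 otherwise).
# prev/cur hold two rows of the DP matrix; column j corresponds to the
# systolic cell holding reference base ref[j-1].
def smith_waterman(ref, read, match=1, mismatch=-1, gap=-1):
    prev = [0] * (len(ref) + 1)   # previous row of the DP matrix
    best = 0
    for r in read:
        cur = [0]
        for j, c in enumerate(ref, start=1):
            score = max(0,        # local alignment: never go negative
                        prev[j - 1] + (match if r == c else mismatch),
                        prev[j] + gap,       # deletion in the read
                        cur[j - 1] + gap)    # insertion in the read
            cur.append(score)
            best = max(best, score)
        prev = cur
    # Highest-scoring local match; for 125 bp reads with +1 per match
    # this can never exceed 125, hence glen's 7-bit cells.
    return best
```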
On 10/16/2014 7:55 AM, glen herrmannsfeldt wrote:
> Mike Field <mikefield1969@gmail.com> wrote:
>
> (snip)
>
>> It isn't so much computation, and I know nothing about the
>> bioinformatics field, but some sort of DNA pattern matching algo
>> always struck me as being an interesting area to explore.
>> The data objects are small, the data set sizes are large,
>> and the parallel nature of FPGAs and internal memory bandwidth
>> can be exploited. A processor can compare a couple symbols per
>> cycle, a GPU might do a few 100 or a thousand symbols per cycle.
>> A low end FPGA could do a few thousand per cycle.
>
> I think that is about right. And run at about 200MHz, maybe 300MHz.
> FPGAs have the registers built in, so you just have to be sure
> to use enough of them.

I think they have dealt with that one pretty well. After all, they have
sequenced the human genome.

>> Is it best to have 'n' tiny little state machines, each detecting
>> one of 'n' patterns, or do you timeslice 'x' state machines,
>> each looking for x/n patterns? Is it best to look at data in
>> big gulps, or one symbol at a time?
>
> https://en.wikipedia.org/wiki/Dynamic_programming#Sequence_alignment
>
>> How is the best way to look for patterns? A giant grep-like FSM,
>> or multiple smaller FSMs? Do you spread the FSMs into a pipeline
>> (each stage feeding onto the next) or do you use local feedback?
>
> The latter is probably the right description.
>
> The idea of dynamic programming is that if you make the optimal
> decision at each point, you find the globally optimal solution.

This is not a global truth. It assumes the path to the optimal solution
is monotonic, which it may not be.

--
Rick

Article: 157133
On Friday, 17 October 2014 05:17:04 UTC+13, rickman wrote:
>
> I think they have dealt with that one pretty well. After all, they have
> sequenced the human genome.
>

Sure, but in this age of "big data" how quickly can you query 1000s of
genomes, each approximately 4 billion symbols in size, looking for somewhat
fuzzy matches?

Oldish (2006) papers talk about a 2GHz Xeon processing 32M characters per
second, and a 16 CPU Power system processing 1.2G symbols per second. This
is obviously bound by CPU cycles and not memory or I/O bandwidth.

FPGA hardware has come a long way in that time, as have memory capacity,
memory bandwidths and I/O subsystems. Standard CPUs haven't progressed at
such a dramatic pace, just adding more cores. I am pretty sure that
revisiting it with an FPGA board that could hold the entire genome in
memory, at full memory bandwidth speed (a 16x 333 MHz DDR memory can
deliver 1.2G symbols per second), should be able to get close to this on a
tiny power budget.

If the process isn't limited by I/O bandwidth, then it isn't running fast
enough :-)

Maybe it could be implemented on one of those ARM/FPGA hybrid chips, with
the fabric having a larger private memory to hold the genome data, and the
ARM just performing command and control... it would avoid a lot of the
complexity of high speed I/O.

Article: 157134
rickman <gnuarm@gmail.com> wrote:

(snip)

>> I think that is about right. And run at about 200MHz, maybe 300MHz.
>> FPGAs have the registers built in, so you just have to be sure
>> to use enough of them.

> I think they have dealt with that one pretty well. After all, they have
> sequenced the human genome.

For some actual numbers of what you can do today:

http://res.illumina.com/documents/products/datasheets/datasheet_hiseq2500.pdf

this machine can generate 4 billion reads (sequences) of 125 base pairs,
for a total of 500 Gbp in six days. You then want to compare that to the
reference (human) genome (if it is human data) of 3Gbp.

The dynamic programming algorithm gives you the score for each 125bp
fragment against the reference, including appropriate penalty (usually 1
each) for insertions, deletions, or substitutions. (The algorithm is the
same one as, or similar to, the one used by diff. The original diff got
the algorithm from one that was used for protein sequences in the 1970s.)

Since the reads are up to 125bp long, if you score +1 for a match, the
score can't go over 127, and so 7 bits is enough. It takes five to seven
add/subtract/compare operations, 7 bit fixed point, to compare each new
base against each base of the reference.

So, 5e11*3e9/6 days or 2.5e20 dynamic programming cells per day. Times 5,
so 1.25e21 7 bit add/subtract/compare per day. How fast is your GPU?

(If you want to sequence a new genome, it is done at about 10x coverage.
You randomly select 30Gbp of 125bp fragments, and hope that they cover
most of the genome to a depth of at least two. So, the above machine can
sequence about 12 humans in 6 days.)

The sequencers have gotten somewhat faster since the last time I did these
calculations. Note that for many years now, it isn't the chemistry that
limits it, but the data processing.
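[Editorial note: glen's throughput arithmetic can be reproduced directly.]

```python
# Back-of-envelope check of the numbers in the post above.
total_read_bases = 4e9 * 125      # 4 billion reads of 125 bp = 5e11 bp
ref_len = 3e9                     # human reference genome, ~3 Gbp
days = 6.0                        # one sequencer run
cells_per_day = total_read_bases * ref_len / days   # DP matrix cells/day
ops_per_day = 5 * cells_per_day   # ~5 add/subtract/compare per cell
print(f"{cells_per_day:.2e} cells/day, {ops_per_day:.2e} ops/day")
# → 2.50e+20 cells/day, 1.25e+21 ops/day
```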
>>> Is it best to have 'n' tiny little state machines, each detecting
>>> one of 'n' patterns, or do you timeslice 'x' state machines,
>>> each looking for x/n patterns? Is it best to look at data in
>>> big gulps, or one symbol at a time?

>> https://en.wikipedia.org/wiki/Dynamic_programming#Sequence_alignment

(snip)

>> The idea of dynamic programming is that if you make the optimal
>> decision at each point, you find the globally optimal solution.

> This is not a global truth. It assumes the path to the optimal
> solution is monotonic which it may not be.

People only think up algorithms that satisfy the restrictions for dynamic
programming. The one commonly used does local alignment, so finds the
highest scoring match between each input sequence and the reference,
including all combinations of insertion, deletion, or substitution.
(Think about spell checkers finding the words close to your misspelled
word.) The five operation algorithm scores pretty much the way you would
for words.

With a little more work, you can do affine gap scoring, where the penalty
for a gap has an open penalty and an extend penalty, such that longer gaps
are not proportionally penalized. You can make even more complicated gap
penalty functions.

The times are already long enough. No point in going to one that is
exponential in the length of the fragments.

--
glen

Article: 157135
On 10/16/2014 4:57 PM, Mike Field wrote:
> On Friday, 17 October 2014 05:17:04 UTC+13, rickman wrote:
>>
>> I think they have dealt with that one pretty well. After all, they have
>> sequenced the human genome.
>>
> Sure, but in this age of "big data" how quickly can you query 1000s of
> genomes, each approximately 4 billion symbols in size, looking for
> somewhat fuzzy matches?
>
> Oldish (2006) papers talk about a 2GHz Xeon processing 32M characters per
> second, and a 16 CPU Power system processing 1.2G symbols per second.
> This is obviously bound by CPU cycles and not memory or I/O bandwidth.
>
> FPGA hardware has come a long way in that time, as have memory capacity,
> memory bandwidths and I/O subsystems. Standard CPUs haven't progressed at
> such a dramatic pace, just adding more cores. I am pretty sure that
> revisiting it with an FPGA board that could hold the entire genome in
> memory, at full memory bandwidth speed (a 16x 333 MHz DDR memory can
> deliver 1.2G symbols per second), should be able to get close to this on
> a tiny power budget.
>
> If the process isn't limited by I/O bandwidth, then it isn't running fast
> enough :-)
>
> Maybe it could be implemented on one of those ARM/FPGA hybrid chips, with
> the fabric having a larger private memory to hold the genome data, and
> the ARM just performing command and control... it would avoid a lot of
> the complexity of high speed I/O.

Not sure you can hold 3 Gsymbols on chip in an FPGA. They may have memory,
but not GBs. So the ARM doing control isn't really all that helpful. It
can't even be remotely in the data path, so it doesn't need to be on chip
at all. Why waste space that can be used for more memory and logic?

The real advantage of the FPGA approach is that it can connect to multiple
memory chips and run them at max throughput. Multiple FPGAs can be used on
one board, potentially outpacing the density of PC CPUs and almost
certainly reducing the power budget. What was ALU bound in a PC will be
memory bound in an FPGA, so more memory means more processing.

--
Rick

Article: 157136
On Friday, 17 October 2014 11:08:48 UTC+13, rickman wrote:
> On 10/16/2014 4:57 PM, Mike Field wrote:
>
> The real advantage of the FPGA approach is that it can connect to
> multiple memory chips and run them at max throughput. Multiple FPGAs
> can be used on one board potentially outpacing the density of PC CPUs
> and almost certainly reducing the power budget. What was ALU bound in a
> PC will be memory bound in an FPGA, so more memory means more processing.

Fully agree, and assuming his board is something like a Digilent Atlys it
may already have perhaps 128MB of DDR on it, allowing the design to be
tested with 512 million DNA (2-bit) symbols - enough to hold a worm's
genome.

However, I guess I've dragged this discussion far away from the original
poster's question of what to do for his final year project...

Mike

Article: 157137
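[Editorial note: Mike's 128MB figure works out as follows, assuming the 2-bit symbol packing he describes.]

```python
# Quick check of the capacity claim in the post above.
ddr_bytes = 128 * 2**20            # 128 MB of DDR, as on a Digilent Atlys
symbols = ddr_bytes * 4            # four 2-bit DNA symbols per byte
# 536,870,912 = 512 * 2**20, i.e. the "512 million symbols" in the post.
print(symbols)
# → 536870912
```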
rickman <gnuarm@gmail.com> wrote:
> On 10/15/2014 7:57 PM, Theo Markettos wrote:
>
> I don't follow. How is rearchitecting a CPU in an FPGA anything like
> changing GPU code? Your explanation makes no sense. I think you are
> working from a huge lack of knowledge of HDLs.

It isn't, that's the point. /If/ you can write in a high-ish level language
like CUDA or OpenCL and achieve your goal with some commodity hardware you
can buy in every city, why would you want to use an FPGA? Why would you
want to worry about synthesis and meeting timing and state machines and
debugging with a logic analyser? Maybe your problem doesn't fit the GPU
model and is better suited to an FPGA, but you really ought to stop and
think first.

It's not the HDL as a language as such (though Verilog's lax syntax makes
it easy to introduce bugs), it's that the abstraction is not sufficiently
high to make any useful progress. If somebody wants to experiment with
architecture they should be able to do that without having to manage all
the underlying complexity, and then be able to go back later and refine
the code for performance. But this is awkward in (eg) verilog unless you
have a very good test suite - it's too easy to introduce control flow bugs.

I've read the code of a web browser written in assembler. That's an
example where the abstraction was not sufficiently high - hand management
of registers and bit twiddling of memory made it simply impossible to keep
control of the complexity. Development eventually ground to a halt because
it was simply impossible to develop. Likewise verilog gives you ultimate
control, and that's not always what you want when you're just evaluating
ideas.

All I'm saying is that verilog/VHDL are insufficiently high levels of
abstraction for architectural exploration. I'm not saying all HDLs/HLS are
bad, just that you need to pick the right language.

> The CPU has only a handful of ALUs to perform useful calculations on,
> the FPGA is limited only by its size. The clock speed is swamped by the
> sheer number of computations that can happen in parallel.
>
> The GPU has lots of ALUs, but is limited in how they are used. It is
> *nothing* like having 1000 separate processors. So it can only be
> useful on certain types of problems.
>
> The FPGA gets around all of these issues and can be configured on the fly.

Agreed. Some problems are about relatively simple heavily-parallel
compute, and especially if they can be easily pipelined then they fit an
FPGA nicely. Likewise if they have Gbps of external I/O an FPGA will leave
a GPU standing (or if the I/O is not in a PC-friendly format).

However if they need heavy floating point, like a lot of scientific
compute, this starts eating up area rapidly. If they're memory-bound, then
you're up against the limits of DDR3, which is a lot less bandwidth than
GDDR5. Or if you want to do iterative development: many-hours FPGA
synthesis times are not conducive. Horses for courses and all that.

My point is that you should do the work on your algorithm to see how it
best fits the technologies available to you (CPU, GPU, FPGA), and then
refactor it to suit. You may get substantially more performance by
refactoring the algorithm for a given technology, rather than simply
jumping in and implementing a naive algorithm. Once you've done this, only
then implement it. But then be prepared to (repeatedly) refactor your
architecture again in the light of that experience.

> I gave one reason which you didn't respond to.

I'm not saying 'FPGA bad, GPU good', I'm saying implementing an FPGA
design for scientific compute is a lot of work. So you need to have a
clear reasoning why you're doing it. Just doing it 'to make my Matlab go
faster' is not a good enough reason, because there's a lot less painful
ways to achieve that.

Theo

Article: 157138
On Tuesday, October 14, 2014 9:05:07 AM UTC-4, Petter Gustad wrote:
> Is it possible to run a Vivado simulation in non-project mode?
>
> I can't seem to find any documentation on how to do it. ug835 describes
> which Tcl commands are used for simulation, but not which to use for
> non-project mode.
>
> //Petter
> --
> .sig removed by request.

Yes, it's fairly easy using xvhdl, xvlog, xelab and xsim as described in
UG900.  For an example, I have a SPI Master module I've set this up for.
I created a spi_master.prj file with the following contents:

vhdl work ../src/spi_master_ae.vhd
vhdl work ../tb/stdtb_pb.vhd
vhdl work ../tb/models/spi_bfm.vhd
vhdl work ../tb/spi_testbench_pb.vhd
vhdl work ../tests/spi_testcase_e.vhd
vhdl work ../tests/spi_tc_a.vhd
vhdl work ../tb/harnesses/spi_harness_pb.vhd
vhdl work ../tb/harnesses/spi_harness_tb_e.vhd
vhdl work ../tb/harnesses/spi_harness_tb_a.vhd
vhdl work ../tb/spi_tb.vhd

I then run xvhdl using the following to compile everything:

# xvhdl -prj spi_master.prj

Next, I elaborated the design using:

# xelab work.spi_tb -prj spi_master.prj -debug all

Finally, I kick off the simulation using the GUI with:

# xsim -g work.spi_tb

UG900 provides much greater detail on the other options for each of those
steps too.
Article: 157139
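Since the .prj file is just a list of `vhdl work <file>` lines, generating it can be scripted. A minimal sketch, using placeholder source paths (not real files); the commented lines at the end simply repeat the xvhdl/xelab/xsim steps from the post above and are not executed here:

```shell
# Build an xvhdl project file from a list of VHDL sources.
# The source paths below are hypothetical placeholders.
sources="../src/spi_master_ae.vhd ../tb/spi_tb.vhd"

: > spi_master.prj                      # truncate/create the .prj file
for f in $sources; do
    printf 'vhdl work %s\n' "$f" >> spi_master.prj
done
cat spi_master.prj

# Then, with the Vivado tools on PATH (not run here):
#   xvhdl -prj spi_master.prj
#   xelab work.spi_tb -prj spi_master.prj -debug all
#   xsim -g work.spi_tb
```

The same loop extends naturally to a `vlog`/`vhdl` mixed list if some sources are Verilog (compiled with xvlog instead).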
When I generate VHDL from Handel-C, I always end up with an empty VHDL
file.  Did anyone face this problem, and how can I solve it?
Thanks
Article: 157140
On 13/10/14 18:35, Jim Lewis wrote:
> Hi Alan,
>>> I used to, has been a fun ride. Too bad I never got the chance to follow
>>> a course of yours. It seems now they are more and more pushing for
>>> SystemVerilog, while I lately was trying to fight my way with OSVVM
>>> which I find more appropriate for that type of community.
>>
>> I'm afraid it's "the dead hand of the market"...
>
> Perhaps a shrinking market for Doulos, however, it has been a growing
> market for us (including in UK).
>
> What I do see in SystemVerilog's favor is the vendors are pushing the
> user community heavily to it since they can make more money with
> SystemVerilog licenses.
>
> OTOH, recently we have had people switching from SystemVerilog to
> VHDL/OSVVM because their projects could not afford the pricing of a
> SystemVerilog simulator when OSVVM can do the same thing.
>
> Cheers,
> Jim

I just want to clarify that I left Doulos a year ago, and I have no
knowledge of their current market.  Also I didn't say that the VHDL market
is shrinking.  What I was trying to say is that there's more demand for
SystemVerilog training than VHDL - but that doesn't mean that VHDL is
necessarily shrinking.

regards

Alan
--
Alan Fitch
Article: 157141
I'm wondering what the correct way to handle the following situation is.
Sorry this is a bit long winded.  BTW, it's not homework, all that was
40+ years ago.

I have two clocks, clk which is the FPGA clock rate, and sclk which I
create using a simple divide by n counter.  Typically, sclk is 1024 times
slower than clk.

An event occurs that sets a reg, DR, for one clk cycle.

There is a register, calcreg [7:0] which is to be incremented slowly, but
reset to zero on DR.

There are two sections, one triggered by clk and one by sclk, ie:

// Fast section
always @ (posedge clk)
begin
  ....
end

// Slow section
always @ (posedge sclk)
begin
  if (DR) calcreg <= 8'h0 ;      // Reset calcreg on DR
  else calcreg <= calcreg + 1 ;  // Else increment
end

The problem of course is that the on state of DR will almost always be
missed, it will only appear if it happens to coincide with a sclk edge
(1 / 1024).  So the above doesn't work.

So I tried modifying the clk section as follows:

always @ (posedge clk)
begin
  ....
  if (DR) calcreg <= 8'h0 ;
end

This threw up build errors,

Error (10028): Can't resolve multiple constant drivers for net
"FlashCtr[3]" at tick.v(43)

I think I see the reason, it's like trying to wire two gate outputs to the
same point, something that's obviously verboten with active drive hardware.

If someone could help with the following specific questions it would help
a lot.....

Is using the two clocks simply bad practice, ie. should everything be done
in a single always block at clk rate?

Is there a standard way to latch the DR signal when it occurs on the fast
clock, so that it will be there on the next transition of sclk, which must
then clear the DR latch?  I've tried this, and come up with the same build
error with the latch.
Article: 157142
On 17/10/2014 23:48, Ahmed Ablak wrote:
> When I generate VHDL from Handel-C. I always end up with an empty VHDL
> file, did any one face this problem?
> and how to solve it?
> Thanks

Hi Ahmed,

There is not much info to go on.  I assume you are using Handel-C because
of some existing (Celoxica) hardware?  If you are just after C synthesis
then I would suggest you look at Hercules, Xilinx HLS, Synflow etc, or
swap to a more traditional RTL language.

Do any of the demo files work?  There is a simple led toggle example in
the ./examples/pal/led directory.

Good luck,
Hans
www.ht-lab.com
Article: 157143
On 18/10/2014 00:37, Alan Fitch wrote:

Hi Alan,

..
> I just want to clarify that I left Doulos a year ago, and I have no
> knowledge of their current market. Also I didn't say that the VHDL
> market is shrinking.

I guess using "the dead hand of the market" is not the most appropriate
phrase for the leading FPGA design language.

> What I was trying to say is that there's more
> demand for SystemVerilog training than VHDL -

I am not sure that is correct; from what I understand VHDL is still the
most popular Doulos language course, and if you look at the current
schedule there are more VHDL than SystemVerilog courses.

> but that doesn't mean that
> VHDL is necessarily shrinking.

I agree with Jim that the EDA industry seems to be doing its best to make
this happen ;-)

Hope you are enjoying your new job and are allowed to use VHDL and SystemC,

Regards,
Hans.
www.ht-lab.com
Article: 157144
Hi,

I would put it all into a single clk block.  I.e.

reg [9:0] count = 0;
always @(posedge clk) begin
   count <= count + 1;
   // fast "clock" here
   if (count == 0) begin
   end
end

Depending on what you want to achieve, you could also re-synchronize the

---------------------------------------
Posted through http://www.FPGARelated.com
Article: 157145
.. you could also resynchronize the slow "clock" on the event.

Maybe the difficulty is to define how exactly the circuit is going to
behave, not so much coding it in RTL.

Multiple clocks might be used on an ASIC where it allows use of smaller
cells for the slow part.  For a simple FPGA project, my main goal would be
to keep the code as readable as possible.

---------------------------------------
Posted through http://www.FPGARelated.com
Article: 157146
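To make the single-clock suggestion concrete, here is a minimal sketch applying it to the original poster's counter. DR, calcreg and the 1024:1 ratio come from the question; the module wrapper and the `tick` name are assumptions of mine, one possible way to write it:

```verilog
// Single clock domain: the divided sclk is replaced by a 1-in-1024
// clock enable, so the one-clk-cycle DR event is never missed.
module slow_count (
    input  wire       clk,
    input  wire       DR,        // one-clk-cycle event
    output reg  [7:0] calcreg
);
    reg [9:0] count = 0;
    wire tick = (count == 10'h3FF);  // true for one clk in every 1024

    always @(posedge clk) begin
        count <= count + 1;
        if (DR)
            calcreg <= 8'h0;         // reset takes effect immediately
        else if (tick)
            calcreg <= calcreg + 1;  // slow increment
    end
endmodule
```

Because calcreg now lives in a single always block on one clock, the "multiple constant drivers" error cannot arise, and DR is sampled every clk cycle rather than every 1024th.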
On Monday, October 13, 2014 11:48:37 PM UTC+5, mnentwig wrote:
> Hi,
>
> >> verilog.
> >> These are some projects which I might be doing:
> >> spartan 6 xc6slx45 kit
> >> 1.the n-body gravitational problem
> >> 2.Oceanic modeling
> >> 3.Cancer biology modeling
>
> this is not to discourage you. But please be warned that heavy-duty FPGA
> implementations as youre planning are
>
> *much*
>
> and I mean "much" harder than it looks from all those shiny webpages that
> make it look like Lego bricks because they want to sell you stuff.
>
> Here's my proposal: Why don't you implement "Hello world!" in Morse code.
> Which is ".... . .-.. .-.. --- .-- --- .-. .-.. -.. "
> Just a blinking LED.
> Expect that i'll take between a day and two weeks. This includes things
> that "should" be easy but are not, such as installing ISE 14.7 when you've
> never done it before, making the JTAG interface work etc.
>
> In my personal opinion, the xc6slx45 is an excellent choice to get
> started.  Because
> a) it does not require a Xilinx license to program it
> b) I can get one cheaply if you ever need one, i.e. "Numato Saturn" or
> "Pipistrello" boards, for ~$130..160.
> c) If it breaks, it's no big deal, compared to a $3000+ board.
> To learn Verilog, the smallest and cheapest FPGA will do, if you decide to
> buy one for yourself. The typical feedback from the board is "this doesn't
> work - go simulate some more".
>
> Note, you said "Verilog", not using some intermediate wizardry that
> generates the code. For the latter, a sxl45 is probably too small
> (guessing, haven't done it myself).
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com

I have Xilinx 14.5 fully licensed and a Spartan-6 xc6slx45 kit available in
my college lab.  And I also want a somewhat challenging project.
Article: 157147
On Monday, October 13, 2014 9:09:49 PM UTC+5, awais...@namal.edu.pk wrote:
> I am student of Bachelors and going to start my FYP in some days. I am
> going into the field of high computation in verilog. These are some
> projects which I might be doing:
>
> 1.the n-body gravitational problem
> 2.Oceanic modeling
> 3.Cancer biology modeling
>
> Any other projects you might suggest that may be beneficial for me.
> And also my main aim after Bachelors is to get admission in some US
> university.
> Thanks!

I also have this card available in my lab:
http://www.nallatech.com/PCI-Express-FPGA-Cards/pcie-385n-altera-stratix-v-fpga-computing-card.html
And I may also move to OpenCL.  Any good projects in OpenCL?
Article: 157148
On 10/18/2014 3:53 AM, "Bruce Varley" wrote:
> I'm wondering what the correct way to handle the following situation
> is. Sorry this is a bit long winded. BTW, it's not homework, all that
> was 40+ years ago.
>
> I have two clocks, clk which is the FPGA clock rate, and sclk which I
> create using a simple divide by n counter. Typically, sclk is 1024
> times slower than clk.
>
> An event occurs that sets a reg, DR, for one clk cycle.
>
> There is a register, calcreg [7:0] which is to be incremented slowly,
> but reset to zero on DR.
>
> There are two sections, one triggered by clk and one by sclk, ie:
>
> // Fast section
> always @ (posedge clk)
> begin
>   ....
> end
>
> // Slow section
> always @ (posedge sclk)
> begin
>   if (DR) calcreg <= 8'h0 ;      // Reset calcreg on DR
>   else calcreg <= calcreg + 1 ;  // Else increment
> end
>
> The problem of course is that the on state of DR will almost always be
> missed, it will only appear if it happens to coincide with a sclk edge
> (1 / 1024). So the above doesn't work.
>
> So I tried modifying the clk section as follows:
>
> always @ (posedge clk)
> begin
>   ....
>   if (DR) calcreg <= 8'h0 ;
> end
>
> This threw up build errors,
>
> Error (10028): Can't resolve multiple constant drivers for net
> "FlashCtr[3]" at tick.v(43)
>
> I think I see the reason, it's like trying to wire two gate outputs to
> the same point, something that's obviously verboten with active drive
> hardware.
>
> If someone could help with the following specific questions it would
> help a lot.....
>
> Is using the two clocks simply bad practice, ie. should everything be
> done in a single always block at clk rate?
>
> Is there a standard way to latch the DR signal when it occurs on the
> fast clock, so that it will be there on the next transition of sclk,
> which must then clear the DR latch? I've tried this, and come up with
> the same build error with the latch.

If you're really just dividing one clock to make another, then you're
probably better off using a single clock and generating a count enable
for your slow process.  On the other hand your problem of using a fast
signal to reset a slow process is also applicable to situations where
the two clocks are not related and are both necessary for the design.
In that case I would normally have an intermediate variable in the fast
clock domain that gets set by DR and cleared by a signal returned from
the slow process.  Something like:

reg DR_hold = 0;
reg DR_seen = 0;
always @ (posedge clk)
begin
  if (DR) DR_hold <= 1;
  else if (DR_resync) DR_hold <= 0;
end

always @ (posedge sclk)
begin
  DR_resync <= DR_hold;
end

Note that if you use DR_resync as the reset term, it will cause
additional latency from DR to the reset of the counter.  You could use
DR_hold instead, but then the problem is if the two clocks are really
unrelated you could miss a reset if DR_hold asserts very near the rising
edge of sclk and DR_resync catches the event but the counter (or some of
its bits) does not.

--
Gabor
Article: 157149
On 10/18/2014 10:53 AM, Gabor wrote:
> On 10/18/2014 3:53 AM, "Bruce Varley" wrote:
>> (snip question about resetting a slow-clock counter from a
>> fast-clock event)
>
> (snip count-enable suggestion and DR_hold / DR_resync example)
>
> Note that if you use DR_resync as the reset term, it will
> cause additional latency from DR to the reset of the counter.
> You could use DR_hold instead, but then the problem is if
> the two clocks are really unrelated you could miss a reset
> if DR_hold asserts very near the rising edge of sclk and
> DR_resync catches the event but the counter (or some of its
> bits) does not.

Oops, in the previous post I started with "DR_seen" but then went to
"DR_resync" for the same signal.  But you get the idea...

--
Gabor
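Putting Gabor's two posts together, a consolidated sketch of the hold-and-resynchronize scheme with the DR_seen/DR_resync naming slip fixed. Signal names are from the thread; the module wrapper is my own framing, and as noted above the reset arrives one sclk cycle after the event:

```verilog
// DR is held in the fast (clk) domain until the slow (sclk) domain
// has sampled it via DR_resync, which also serves as the acknowledge.
module dr_cross (
    input  wire       clk,   // fast clock
    input  wire       sclk,  // slow clock (e.g. clk / 1024)
    input  wire       DR,    // one-clk-cycle event
    output reg  [7:0] calcreg
);
    reg DR_hold   = 0;  // fast domain: remembers the event
    reg DR_resync = 0;  // slow domain: sampled copy, fed back as ack

    always @(posedge clk) begin
        if (DR)             DR_hold <= 1'b1;
        else if (DR_resync) DR_hold <= 1'b0;
    end

    always @(posedge sclk) begin
        DR_resync <= DR_hold;
        if (DR_resync) calcreg <= 8'h0;         // reset, one sclk late
        else           calcreg <= calcreg + 1;  // slow increment
    end
endmodule
```

Because calcreg is driven only from the sclk block and DR_hold only from the clk block, each register has a single driver, avoiding the multiple-driver build error from the original question.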