On Thursday, 8 January 2015 08:51:55 UTC+13, Tomas D. wrote:
> The FPGA will be Altera Cyclone V with one hard memory controller (5CEFA2
> device). I am trying to check if it will be sufficient to use one DDR3
> memory chip or it's better to use two devices with 32bit memory bus, thus
> increasing the bandwidth.
>

The most efficient method that uses a frame buffer external to the FPGA results in one write and one read per pixel. If you use a DDR module you will need a bit more than 2x the bandwidth of the video stream, so you will need a bit over 900 MB/s for 24-bit 1080p @ 60 Hz. You will need to carefully plan how memory will be accessed to maximise the available memory bandwidth.

For 1080p video, if you can hold 180 rows of pixel data inside your FPGA you don't actually need external memory to buffer the frames at all, and you can achieve lower latency too (approximately the time for 181 lines). The idea is to use a rolling buffer of 180 rows that you sample/extract your output pixels from. The cost of the larger FPGA might be offset by the savings in not requiring the external memory, a smaller PCB and so on.

There is a sweet spot for 720p video, where you can get away with holding just 128 rows for +/- 5 degrees of rotation, requiring only half a MB of block RAM. This assumes that you are not interpolating between pixels.

If you are performing interpolation, then you might need to be really cunning and use the extra cycles found in the blanking interval to provide the additional cycles required for the extra memory accesses needed when you walk through the pixels. Your access pattern might be something like

1234......
...5678...
......9ABC
..........

In this case it takes 12 cycles to access the data needed for interpolating 10 pixels (because of the additional cycle required for accesses 5 & 9 when it jumps lines). You will then need something like a FIFO to remove the gaps in the output pixel stream. For 1080p, you have about 280 cycles in the horizontal blanking interval, a little more than what you will need for a +/- 5 degree rotation, where you will have at most 167 changes between lines.

Mike
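As a rough illustration of the rolling-buffer idea (purely a sketch: the entity name, generics and the one-pixel-per-clock interface are assumptions, not anything from Mike's design), a 180-row ring of lines held in on-chip RAM could be described in VHDL roughly like this:

-- Hypothetical 180-row rolling line buffer; sizes and names are illustrative only.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity line_ring_buffer is
  generic (
    ROWS     : natural := 180;   -- rows held on chip
    COLS     : natural := 1920;  -- active pixels per line
    PIX_BITS : natural := 24
  );
  port (
    clk    : in  std_logic;
    -- write side: incoming video, one pixel per clock
    wr_en  : in  std_logic;
    wr_row : in  natural range 0 to ROWS-1;   -- input row number modulo ROWS
    wr_col : in  natural range 0 to COLS-1;
    wr_pix : in  std_logic_vector(PIX_BITS-1 downto 0);
    -- read side: sample point for the rotated output pixel
    rd_row : in  natural range 0 to ROWS-1;
    rd_col : in  natural range 0 to COLS-1;
    rd_pix : out std_logic_vector(PIX_BITS-1 downto 0)
  );
end entity;

architecture rtl of line_ring_buffer is
  type ram_t is array (0 to ROWS*COLS-1) of std_logic_vector(PIX_BITS-1 downto 0);
  signal ram : ram_t;   -- expected to infer (many) simple dual-port block RAMs
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if wr_en = '1' then
        ram(wr_row*COLS + wr_col) <= wr_pix;   -- overwrite the oldest row in place
      end if;
      rd_pix <= ram(rd_row*COLS + rd_col);     -- registered read, one cycle of latency
    end if;
  end process;
end architecture;

The write side just overwrites the oldest row as new video arrives, while the read side is free to fetch from any (row, column) inside the 180-line window, which is all the rotation needs. In practice the tools split this across many block RAMs and the address arithmetic would be pipelined.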
Article: 157626

On 1/7/15 11:23 PM, Mike Field wrote:
> On Thursday, 8 January 2015 08:51:55 UTC+13, Tomas D. wrote:
>> The FPGA will be Altera Cyclone V with one hard memory controller
>> (5CEFA2 device). I am trying to check if it will be sufficient to
>> use one DDR3 memory chip or it's better to use two devices with
>> 32bit memory bus, thus increasing the bandwidth.
>>
>
> The most efficient method which uses frame buffer that is external to
> the FPGA will result in one write and one read per pixel, if you use
> a DDR module you will need a bit more than 2x the bandwidth of the
> video stream, so you will need a bit over 900MB/s for 24-bit 1080p @
> 60 Hz. You will need to carefully plan how memory will be accessed to
> maximise available memory bandwidth.
>
> For 1080p video, if you can hold 180 rows of pixel data inside your
> FPGA you don't actually need external memory to buffer the frames at
> all, and you can achieve lower latency too (approx the time for 181
> lines). The idea being to use a rolling buffer of 180 rows that you
> sample/extract your output pixels from. The cost of the larger FPGA
> might be offset by the savings in not requiring the external memory,
> smaller PCB and so on.
>
> There is a sweet spot for 720p video, where you can get away with
> holding just 128 rows for +/- 5 degrees of rotation, requiring only
> half a MB of block RAM. This assumes that you are not interpolating
> between pixels.
>
> If you are performing interpolation, then you might need to be really
> cunning and use the extra cycles found in the blanking interval to
> give additional cycles required for the extra memory accesses needed
> when you walk through the pixels. Your access pattern might be
> something like
>
> 1234......
> ...5678...
> ......9ABC
> ..........
>
> In this case it takes 12 cycles to access the data needed for
> interpolating 10 pixels (because of the additional cycle required for
> access 5 & 9 when it jumps lines). You will then need something like
> a FIFO to remove the gaps in the output pixel stream. For 1080p, you
> have about 280 cycles in the horizontal blanking interval, a little
> more than what you will need for a +/- 5 degree rotation, where you
> will have at most 167 changes between lines.
>
> Mike
>

On the need for 12 cycles here: my experience is that FPGAs tend NOT to have giant blocks of memory, but rather a lot of smaller blocks (perhaps of differing sizes). This 1/2 MB of memory is likely made of smaller blocks and could be defined as two separate memories, one for even lines and one for odd, which means that your 4,5 and 8,9 accesses could be performed simultaneously. (Actually, if you are interpolating the pixels, you will almost always want two lines of data, the line above and below your fractional position, and the point before and after, and so may want to 4-way interleave your memory.)
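A sketch of that even/odd split (again only illustrative; the names, sizes and single-cycle interface are assumptions): pixels are written to one of two banks according to line parity, and both banks are read every cycle, so the vertically adjacent pair needed for interpolation arrives together.

-- Hypothetical even/odd line banking; not taken from any poster's design.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dual_bank_lines is
  generic (
    LINES_PER_BANK : natural := 64;    -- e.g. 128 rows total, split over two banks
    COLS           : natural := 1280;
    PIX_BITS       : natural := 24
  );
  port (
    clk       : in  std_logic;
    wr_en     : in  std_logic;
    wr_line   : in  natural range 0 to 2*LINES_PER_BANK-1;
    wr_col    : in  natural range 0 to COLS-1;
    wr_pix    : in  std_logic_vector(PIX_BITS-1 downto 0);
    rd_line   : in  natural range 0 to 2*LINES_PER_BANK-2;  -- upper line of the pair
    rd_col    : in  natural range 0 to COLS-1;
    pix_upper : out std_logic_vector(PIX_BITS-1 downto 0);
    pix_lower : out std_logic_vector(PIX_BITS-1 downto 0)
  );
end entity;

architecture rtl of dual_bank_lines is
  type bank_t is array (0 to LINES_PER_BANK*COLS-1)
       of std_logic_vector(PIX_BITS-1 downto 0);
  signal even_bank, odd_bank : bank_t;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- write to the bank selected by line parity
      if wr_en = '1' then
        if wr_line mod 2 = 0 then
          even_bank((wr_line/2)*COLS + wr_col) <= wr_pix;
        else
          odd_bank((wr_line/2)*COLS + wr_col) <= wr_pix;
        end if;
      end if;
      -- both banks are read every cycle: two vertically adjacent pixels at once
      if rd_line mod 2 = 0 then
        pix_upper <= even_bank((rd_line/2)*COLS + rd_col);
        pix_lower <= odd_bank((rd_line/2)*COLS + rd_col);
      else
        pix_upper <= odd_bank((rd_line/2)*COLS + rd_col);
        pix_lower <= even_bank(((rd_line/2)+1)*COLS + rd_col);
      end if;
    end if;
  end process;
end architecture;

Splitting each bank again by column parity (a 4-way interleave) would let all four neighbours of a bilinear sample come out in a single cycle, which is the suggestion in the last paragraph above.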
Article: 157627

See the Altera TimeQuest design centre.

---------------------------------------
Posted through http://www.FPGARelated.com
Article: 157628

"Mike Field" <mikefield1969@gmail.com> wrote in message news:8c1434a4-c435-463c-9783-0d35911a33b3@googlegroups.com...

On Thursday, 8 January 2015 08:51:55 UTC+13, Tomas D. wrote:
> The FPGA will be Altera Cyclone V with one hard memory controller (5CEFA2
> device). I am trying to check if it will be sufficient to use one DDR3
> memory chip or it's better to use two devices with 32bit memory bus, thus
> increasing the bandwidth.
>

The most efficient method that uses a frame buffer external to the FPGA results in one write and one read per pixel. If you use a DDR module you will need a bit more than 2x the bandwidth of the video stream, so you will need a bit over 900 MB/s for 24-bit 1080p @ 60 Hz. You will need to carefully plan how memory will be accessed to maximise the available memory bandwidth.

For 1080p video, if you can hold 180 rows of pixel data inside your FPGA you don't actually need external memory to buffer the frames at all, and you can achieve lower latency too (approximately the time for 181 lines). The idea is to use a rolling buffer of 180 rows that you sample/extract your output pixels from. The cost of the larger FPGA might be offset by the savings in not requiring the external memory, a smaller PCB and so on.

There is a sweet spot for 720p video, where you can get away with holding just 128 rows for +/- 5 degrees of rotation, requiring only half a MB of block RAM. This assumes that you are not interpolating between pixels.

If you are performing interpolation, then you might need to be really cunning and use the extra cycles found in the blanking interval to provide the additional cycles required for the extra memory accesses needed when you walk through the pixels. Your access pattern might be something like

1234......
...5678...
......9ABC
..........

In this case it takes 12 cycles to access the data needed for interpolating 10 pixels (because of the additional cycle required for accesses 5 & 9 when it jumps lines). You will then need something like a FIFO to remove the gaps in the output pixel stream. For 1080p, you have about 280 cycles in the horizontal blanking interval, a little more than what you will need for a +/- 5 degree rotation, where you will have at most 167 changes between lines.

---------------------------------------

Hi,
thank you for all the ideas.

The resolution I will be playing with is 1200x1080 - a little bit more than FullHD. Since I am going to use an Altera FPGA, there's no point for me not to use the Altera VIP suite. This means that I will get bursts of real image data and there will be no blanking periods - just spaces between bursts.

The memory will still be used for a frame buffer, because the input video stream will have a different clock source, so we could otherwise end up dropping frames.

However, I am interested in image rotation techniques and the literature at the moment. You're describing methods you have used, which, for someone who has never played with image processing, are kind of difficult to follow. Maybe you have resources to read about these rotation methods, first of all?

Thank you.

BR
Tomas D.
Article: 157629

Tom Gardner <spamjunk@blueyonder.co.uk> wrote:
> So, I'd be grateful for pointers to references that you found useful when you
> were learning how to use constraints effectively.

I don't speak Xilinx, but in Altera land it should be enough to specify the input clocks and let the tools figure out the rest: they know about PLLs and clock crossing components.

Though you may have to constrain your external I/O if you get problems - while the tools know what to do with existing interfaces (like memory PHYs), you may have to put in rules to say 'don't care' to wires you expose from your own VHDL, otherwise they may try harder to constrain things you don't want - or underconstrain them if there is an innate timing relationship (eg something like SPI, where the ordering of clock and data edges matters somewhat).

Theo
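For someone looking for a starting point, a minimal Quartus .sdc along those lines might look like the sketch below; every name, period and delay value is a made-up placeholder rather than something from a real project.

# Describe the external input clock; the tool derives the PLL output clocks
# and clock uncertainty from it.
create_clock -name clk_50m -period 20.000 [get_ports clk_50m]
derive_pll_clocks
derive_clock_uncertainty

# 'Don't care' rules for asynchronous pins exposed from your own HDL.
set_false_path -from [get_ports push_button*]
set_false_path -to   [get_ports status_led*]

# I/O with an innate timing relationship (e.g. SPI) gets explicit delays
# instead, with values taken from the external device's datasheet.
set_input_delay  -clock clk_50m -max 5.0 [get_ports spi_miso]
set_output_delay -clock clk_50m -max 4.0 [get_ports spi_mosi]

A real design would normally also carry -min values and, where needed, generated-clock or multicycle constraints.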
Article: 157630

On 07/01/15 10:06, muyihwah@gmail.com wrote:
<snip>
>
> Thanks a lot for the response. Could you please point me to some online
> resources or books that might provide detailed explanations.
>

http://www.doulos.com/knowhow/systemc

Alan

--
Alan Fitch
Article: 157631

On Tuesday, January 6, 2015 at 1:15:16 AM UTC-8, muyi...@gmail.com wrote:
> // Testbench
> rst=true; wait(10, SC_MS);
> rst=false; in1=8; in2=2; in3=3; in4=6; sel=2;
> wait(50, SC_MS);
> cout << "selected data:" << out << " " << endl; rst=true; sel = 0;
> wait(50, SC_MS);
> cout << "selected data:" << out << " " << endl;
>
> I am used to VHDL which runs in parallel, but I am finding it difficult to
> understand whether SystemC code runs in parallel like the example above.

What do you mean by running in parallel? Do you mean that separate threads run concurrently? The answer to that is yes. SystemC has a notion of threads which run concurrently. In fact, just like Verilog/VHDL, you can write code that is as low as gate level and can be synthesized. If you mean that the SystemC kernel runs on a multi-processor system leveraging parallelism, then to the best of my knowledge that is not true.

The code that you have here is sequential, meaning that "rst" first gets the value "true", then after 10 ms becomes "false", then after 50 ms becomes "true".

Check out sc_method vs. sc_thread in SystemC.
Article: 157632

Hi,

I vaguely recall reading papers that described an automated pipelining technique that could take an existing synchronous design and turn it into an N-way hyperthreaded design by replacing all the flip flops with N chained flip flops.

I'd like to look at those papers again, but I can't remember what it was called, and my google-fu is weak.

Any ideas?

Thanks,
Allan
Article: 157633

On January 9, Allan Herriman wrote:
> ...take an existing synchronous design and turn it
> into a N-way hyperthreaded design by replacing all
> the flip flops with N chained flip flops.
>
> I'd like to look at those papers again, but I can't
> remember what it was called, and my google-fu is weak.

I think "C-Slow retiming" is the academic term for the transformation.

The earliest moniker I've heard as applied to a processor is "barrel processor" [CDC], with something like "N-way sequential multithreaded" being the modern terminology.

IIRC, Tobias at edaptix had written some articles about his automated implementation of the technique on the late Programmable Planet website; see: http://www.edaptix.com/coremultiplier.htm

-Brian
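As a toy illustration of the transformation (an invented example, not from the articles mentioned above): C-slowing by N replaces each register with an N-deep chain, so N independent streams of state time-share the same combinational logic. For N = 2, an accumulator becomes:

-- Toy C-slow example (N = 2): the accumulator's single register becomes a
-- 2-stage chain, so two independent accumulations share the one adder.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity cslow_accumulator is
  generic (N : positive := 2; WIDTH : positive := 16);
  port (
    clk  : in  std_logic;
    din  : in  unsigned(WIDTH-1 downto 0);  -- operand for whichever thread is active this cycle
    dout : out unsigned(WIDTH-1 downto 0)
  );
end entity;

architecture rtl of cslow_accumulator is
  type chain_t is array (0 to N-1) of unsigned(WIDTH-1 downto 0);
  signal acc_chain : chain_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- the adder sees the oldest stage; its result re-enters at the head
      acc_chain(0) <= acc_chain(N-1) + din;
      for i in 1 to N-1 loop
        acc_chain(i) <= acc_chain(i-1);   -- plain pipeline stages, no logic between
      end loop;
    end if;
  end process;
  dout <= acc_chain(N-1);
end architecture;

On even cycles the adder works on thread 0's running sum and on odd cycles on thread 1's, which is the barrel-processor behaviour described above; retiming can then push the added registers into the combinational path to raise the clock rate.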
Article: 157634

On 10/01/2015 03:45, Allan Herriman wrote:
> Hi,
>
> I vaguely recall reading papers that described an automated pipelining
> technique that could take an existing synchronous design and turn it into
> a N-way hyperthreaded design by replacing all the flip flops with N
> chained flip flops.
>
> I'd like to look at those papers again, but I can't remember what it was
> called, and my google-fu is weak.
>
> Any ideas?
>
> Thanks,
> Allan
>

Perhaps PicoPIPE as used (patented) by Achronix?

Hans
www.ht-lab.com
Article: 157635

On Sat, 10 Jan 2015 07:57:59 -0800, brimdavis wrote:

> On January 9, Allan Herriman wrote:
>
>> ...take an existing synchronous design and turn it into a N-way
>> hyperthreaded design by replacing all the flip flops with N chained
>> flip flops.
>>
>> I'd like to look at those papers again, but I can't remember what it
>> was called, and my google-fu is weak.
>
> I think "C-Slow retiming" is the academic term for the transformation.
>
> The earliest moniker I've heard as applied to a processor is "barrel
> processor" [CDC], with something like "N-way sequential multithreaded"
> being the modern terminology.
>
> IIRC, Tobias at edaptix had written some articles about his automated
> implementation of the technique on the late Programmable Planet website,
> see: http://www.edaptix.com/coremultiplier.htm
>
> -Brian

C-Slow it was. Thanks very much.

Regards,
Allan
Article: 157636

On Saturday, 10 January 2015 16:45:13 UTC+13, Allan Herriman wrote:
> I'd like to look at those papers again, but I can't remember what it was
> called, and my google-fu is weak.
>
> Any ideas?

I've seen it termed 'System Hyper Pipelining'.
Article: 157637

Hi everyone,

We have ~128 Mbit of configuration data to be stored in a flash device, and for reasons related to qualification (HiRel application) we are more inclined to use NAND technology instead of NOR. Unfortunately NAND flash suffers from bad blocks, which may also develop during the lifetime of the component and have to be handled.

I've read something about bad block management and it looks like there are two essential strategies to cope with the issue of bad blocks:

1. skip block
2. reserved block

The first one will skip a block whenever it is bad and write to the first free one, also updating the logical block addressing (LBA). The second strategy reserves a dedicated area to which bad blocks are remapped; in this second case the LBA shall be kept updated as well.

I do not see much of a difference between the two strategies except the fact that in case 1 I need to 'search' for the first available free block, while in the second case I have reserved a special area for it. Am I missing any other major difference?

The second question I have is about 'management'. I do not have a software stack to perform the management of these bad blocks and I'm obliged to do it with my FPGA. Does anyone here see any potential risk in doing so? Would I be better off dedicating a small-footprint controller in the FPGA to handle the Flash Translation Layer with wear leveling and bad block management? Can anyone here point me to some IP cores readily available for doing this?

There's a high chance I will need to implement some sort of 'scrubbing' to avoid accumulation of errors. All these 'functions' to handle the flash seem to me very suited for software but not for hardware. Does anyone here have a different opinion?

Any comment/suggestion/pointer/rant is appreciated.

Cheers,

Al

--
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
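Not an answer to the qualification question, but as a hedged sketch of how small the 'reserved block' bookkeeping can be in plain HDL (all names and sizes are invented, and this ignores where the table itself is stored and how it is rebuilt at power-up):

-- Illustrative logical-to-physical block remap table; not a complete FTL,
-- no wear levelling, and the table contents must be made persistent somehow.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity block_remap is
  generic (
    LOGICAL_BLOCKS : natural := 1024;   -- user-visible blocks
    PHYS_ADDR_BITS : natural := 11      -- physical space includes the reserved area
  );
  port (
    clk       : in  std_logic;
    -- normal path: translate every logical block access
    lba       : in  natural range 0 to LOGICAL_BLOCKS-1;
    pba       : out unsigned(PHYS_ADDR_BITS-1 downto 0);
    -- management path: retire a bad block by pointing its entry at a spare
    remap_en  : in  std_logic;
    remap_lba : in  natural range 0 to LOGICAL_BLOCKS-1;
    remap_pba : in  unsigned(PHYS_ADDR_BITS-1 downto 0)
  );
end entity;

architecture rtl of block_remap is
  type map_t is array (0 to LOGICAL_BLOCKS-1) of unsigned(PHYS_ADDR_BITS-1 downto 0);

  function identity_map return map_t is
    variable m : map_t;
  begin
    for i in m'range loop
      m(i) := to_unsigned(i, PHYS_ADDR_BITS);   -- start with a 1:1 mapping
    end loop;
    return m;
  end function;

  signal remap_table : map_t := identity_map;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if remap_en = '1' then
        remap_table(remap_lba) <= remap_pba;   -- retire a block
      end if;
      pba <= remap_table(lba);                 -- registered translation
    end if;
  end process;
end architecture;

The 'skip block' strategy needs much the same table; the difference is mainly in how the replacement physical block is chosen when an entry has to change.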
Article: 157638

On Mon, 12 Jan 2015 10:23:24 +0100, alb <al.basili@gmail.com> wrote:
> Hi everyone,
>
> We have ~128Mbit of configuration to be stored in a Flash device and for
> reasons related to qualification (HiRel application) we are more
> inclined to the use of NAND technology instead of NOR. Unfortunately
> NAND flash suffers from bad blocks, which may also develop during the
> lifetime of the component and have to be handled.
>
> I've read something about bad block management and it looks like there
> are two essential strategies to cope with the issue of bad blocks:
>
> 1. skip block
> 2. reserved block
>
> The first one will skip a block whenever is bad and write on the first
> free one, updating also the logical block addressing (LBA). While the
> second strategy reserves a dedicated area to remap the bad blocks. In
> this second case the LBA shall be kept updated as well.
>
> I do not see much of a difference between the two strategies except the
> fact that in case 1. I need to 'search' for the first available free
> block, while in second case I reserved a special area for it. Am I
> missing any other major difference?

The second strategy is required when the total logical storage capacity must be constant. I can imagine the existence of 'bad sectors' degrading performance on some filesystems.

> The second question I have is about 'management'. I do not have a
> software stack to perform the management of these bad blocks and I'm
> obliged to do it with my FPGA. Does anyone here see any potential risk
> in doing so? Would I be better off dedicating a small footprint
> controller in the FPGA to handle the Flash Translation Layer with wear
> leveling and bad block management? Can anyone here point me to some
> IPcores readily available for doing this?

Sounds like you're re-inventing eMMC.

> There's a high chance I will need to implement some sort of 'scrubbing'
> to avoid accumulation of errors.

Indeed regular reading (and IIRC also writing) can increase the longevity of the device. But it is up to you whether that is needed at all.

> All these 'functions' to handle the
> Flash seem to me very suited for software but not for hardware. Does
> anyone here have a different opinion?

AFAIK, (e)MMC devices all have a small microcontroller inside.

--
(Remove the obvious prefix to reply privately.)
Created with Opera's e-mail client: http://www.opera.com/mail/
Article: 157639

Hi Boudewijn,

In comp.arch.embedded Boudewijn Dijkstra <sp4mtr4p.boudewijn@indes.com> wrote:
[]
>> I've read something about bad block management and it looks like there
>> are two essential strategies to cope with the issue of bad blocks:
>>
>> 1. skip block
>> 2. reserved block
>>
>> The first one will skip a block whenever is bad and write on the first
>> free one, updating also the logical block addressing (LBA). While the
>> second strategy reserves a dedicated area to remap the bad blocks. In
>> this second case the LBA shall be kept updated as well.
>>
>> I do not see much of a difference between the two strategies except the
>> fact that in case 1. I need to 'search' for the first available free
>> block, while in second case I reserved a special area for it. Am I
>> missing any other major difference?
>
> The second strategy is required when the total logical storage capacity
> must be constant. I can imagine the existence of 'bad sectors' degrading
> performance on some filesystems.

Ok, that's a valid point, meaning that since I declare the user space as only the total minus the reserved area, the user may rely on that information. But the total number of bad blocks for the quoted endurance will be exactly the same; neither of the strategies mentioned wears the device less.

>> The second question I have is about 'management'. I do not have a
>> software stack to perform the management of these bad blocks and I'm
>> obliged to do it with my FPGA. Does anyone here see any potential risk
>> in doing so? Would I be better off dedicating a small footprint
>> controller in the FPGA to handle the Flash Translation Layer with wear
>> leveling and bad block management? Can anyone here point me to some
>> IPcores readily available for doing this?
>
> Sounds like you're re-inventing eMMC.

I didn't know there was a name for that. Well, if that's so, yes, but it's not for storing your birthday pictures, rather for a space application. Even if there are several 'experiments' running in low orbit with NAND flash components, I do not know of any operational satellite (like for meteo or similar) that has anything like this.

>> There's a high chance I will need to implement some sort of 'scrubbing'
>> to avoid accumulation of errors.
>
> Indeed regular reading (and IIRC also writing) can increase the longevity
> of the device. But it is up to you whether that is needed at all.

I'm not aiming to increase longevity. I'm aiming to guarantee that the system will cope with the expected bit flips and still guarantee the mission objectives throughout the intended lifecycle (7.5 years on orbit). Scrubbing is not so complicated: you read, correct and write back. But doing so when you hit a bad block during the rewrite, while you have tons of other things to do in the meanwhile, may have some side effects... to be evaluated and handled.

>> All these 'functions' to handle the
>> Flash seem to me very suited for software but not for hardware. Does
>> anyone here have a different opinion?
>
> AFAIK, (e)MMC devices all have a small microcontroller inside.

It does not surprise me. I have the requirement not to include *any* software onboard! I may let an embedded microcontroller with a hardcoded list of instructions slip through, but I'm not so sure.

Al
Article: 157640

Hi Boudewijn,

On 1/12/2015 3:38 AM, Boudewijn Dijkstra wrote:
> On Mon, 12 Jan 2015 10:23:24 +0100, alb <al.basili@gmail.com> wrote:
>> The second question I have is about 'management'. I do not have a
>> software stack to perform the management of these bad blocks and I'm
>> obliged to do it with my FPGA. Does anyone here see any potential risk
>> in doing so? Would I be better off dedicating a small footprint
>> controller in the FPGA to handle the Flash Translation Layer with wear
>> leveling and bad block management? Can anyone here point me to some
>> IPcores readily available for doing this?
>
> Sounds like you're re-inventing eMMC.
>
>> There's a high chance I will need to implement some sort of 'scrubbing'
>> to avoid accumulation of errors.
>
> Indeed regular reading (and IIRC also writing) can increase the longevity
> of the device. But it is up to you whether that is needed at all.

Um, *reading* also causes fatigue in the array -- just not as quickly as *writing*/erase. In most implementations, this isn't a problem because you're reading the block *into* RAM and then accessing it from RAM. But, if you just keep reading blocks repeatedly, you'll discover your ECC becoming increasingly more active/aggressive in "fixing" the degrading NAND cells.

So, either KNOW that your access patterns (read and write) *won't* disturb the array. *Or*, actively manage it by "refreshing" content after "lots of" accesses (e.g., 100K-ish) PER PAGE/BANK.

>> All these 'functions' to handle the
>> Flash seem to me very suited for software but not for hardware. Does
>> anyone here have a different opinion?
>
> AFAIK, (e)MMC devices all have a small microcontroller inside.

I can't see an *economical* way of doing this (in anything less than huge volumes) with dedicated hardware (e.g., FPGA).
Article: 157641

On Tue, 13 Jan 2015 01:03:45 +0100, Don Y <this@is.not.me.com> wrote:
> On 1/12/2015 3:38 AM, Boudewijn Dijkstra wrote:
>> On Mon, 12 Jan 2015 10:23:24 +0100, alb <al.basili@gmail.com> wrote:
>
>>> There's a high chance I will need to implement some sort of 'scrubbing'
>>> to avoid accumulation of errors.
>>
>> Indeed regular reading (and IIRC also writing) can increase the
>> longevity of the device. But it is up to you whether that is needed at all.
>
> Um, *reading* also causes fatigue in the array -- just not as quickly as
> *writing*/erase.

Indeed; my apologies. Performing many reads before an erase will indeed cause bit errors that can be repaired by reprogramming. What I wanted to say, but misremembered, is that *not* reading over extended periods may also cause bit errors, due to charge leak. This can also be repaired by reprogramming. (ref: Micron TN2917)

>>> All these 'functions' to handle the
>>> Flash seem to me very suited for software but not for hardware. Does
>>> anyone here have a different opinion?
>>
>> AFAIK, (e)MMC devices all have a small microcontroller inside.
>
> I can't see an *economical* way of doing this (in anything less than
> huge volumes) with dedicated hardware (e.g., FPGA).

Space exploration is not economical (yet). ;)

--
(Remove the obvious prefix to reply privately.)
Created with Opera's e-mail client: http://www.opera.com/mail/
Article: 157642

Hi Don,

In comp.arch.embedded Don Y <this@is.not.me.com> wrote:
[]
>> Indeed regular reading (and IIRC also writing) can increase the
>> longevity of the device. But it is up to you whether that is needed
>> at all.
>
> Um, *reading* also causes fatigue in the array -- just not as quickly
> as *writing*/erase. In most implementations, this isn't a problem
> because you're reading the block *into* RAM and then accessing it from
> RAM. But, if you just keep reading blocks repeatedly, you'll discover
> your ECC becoming increasingly more active/aggressive in "fixing" the
> degrading NAND cells.

Reading does not cause *fatigue* in the sense that it does not wear the device. The effect is referred to as 'read disturb', and it may cause errors in pages other than the one read. With multiple readings of the same page you may end up inducing so many errors that your ECC would not be able to cope when you try to access the *other* pages. These sorts of problems, though, show up when we talk about a number of read cycles in the hundreds of thousands if not millions (google: The Inconvenient Truths of NAND Flash Memory).

> So, either KNOW that your access patterns (read and write) *won't*
> disturb the array. *Or*, actively manage it by "refreshing" content
> after "lots of" accesses (e.g., 100K-ish) PER PAGE/BANK.

We have to cope with bit flips anyway (low earth orbit), so we are obliged to scrub the memory. In order to avoid error accumulation we move the entire block, update the LBA and erase the affected one, so it becomes available again.

>>> All these 'functions' to handle the Flash seem to me very suited for
>>> software but not for hardware. Does anyone here have a different
>>> opinion?
>>
>> AFAIK, (e)MMC devices all have a small microcontroller inside.
>
> I can't see an *economical* way of doing this (in anything less than
> huge volumes) with dedicated hardware (e.g., FPGA).

Well, according to our latest estimates we are at about 30% cell usage on an AX2000 (2 Mgates), without including any scrubbing (yet), but including the bad block management.

Al
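The scrub flow described above (read, correct, move, remap, erase) is small enough to sketch as a state machine; the handshake signals below are invented placeholders for whatever the real flash and ECC interfaces provide.

-- Skeleton scrubbing controller: read a block through the ECC, write the
-- corrected copy to a fresh block, update the LBA table, erase the old block.
library ieee;
use ieee.std_logic_1164.all;

entity scrub_fsm is
  port (
    clk         : in  std_logic;
    scrub_start : in  std_logic;   -- kick off a scrub of the current block
    read_done   : in  std_logic;   -- block read out through the ECC decoder
    ecc_fatal   : in  std_logic;   -- uncorrectable errors seen during the read
    prog_done   : in  std_logic;   -- corrected copy programmed into a fresh block
    prog_fail   : in  std_logic;   -- program failed: the fresh block is bad too
    erase_done  : in  std_logic;
    busy        : out std_logic
  );
end entity;

architecture rtl of scrub_fsm is
  type state_t is (IDLE, READ_BLOCK, WRITE_COPY, UPDATE_LBA, ERASE_OLD, MARK_BAD);
  signal state : state_t := IDLE;
begin
  busy <= '0' when state = IDLE else '1';

  process (clk)
  begin
    if rising_edge(clk) then
      case state is
        when IDLE =>
          if scrub_start = '1' then state <= READ_BLOCK; end if;
        when READ_BLOCK =>
          if ecc_fatal = '1' then state <= MARK_BAD;        -- no clean copy to move
          elsif read_done = '1' then state <= WRITE_COPY; end if;
        when WRITE_COPY =>
          if prog_fail = '1' then state <= MARK_BAD;        -- spare went bad mid-scrub
          elsif prog_done = '1' then state <= UPDATE_LBA; end if;
        when UPDATE_LBA =>
          state <= ERASE_OLD;                               -- point the LBA at the copy
        when ERASE_OLD =>
          if erase_done = '1' then state <= IDLE; end if;
        when MARK_BAD =>
          state <= IDLE;    -- record the block in the bad-block table, pick a spare
      end case;
    end if;
  end process;
end architecture;

The awkward cases mentioned above (a bad block discovered during the rewrite, and contention with the rest of the traffic) show up as the extra exits from WRITE_COPY and as whatever arbitration sits in front of scrub_start.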
Article: 157643

Hi Boudewijn,

On 1/13/2015 2:17 AM, Boudewijn Dijkstra wrote:
>>>> There's a high chance I will need to implement some sort of 'scrubbing'
>>>> to avoid accumulation of errors.
>>>
>>> Indeed regular reading (and IIRC also writing) can increase the
>>> longevity of the device. But it is up to you whether that is needed at all.
>>
>> Um, *reading* also causes fatigue in the array -- just not as quickly as
>> *writing*/erase.
>
> Indeed; my apologies. Performing many reads before an erase will indeed
> cause bit errors that can be repaired by reprogramming. What I wanted to
> say, but misremembered, is that *not* reading over extended periods may
> also cause bit errors, due to charge leak. This can also be repaired by
> reprogramming. (ref: Micron TN2917)

Yes, it's amazing how many of the issues that were troublesome in OLD technologies have modern day equivalents! E.g., "print through" for tape; write-restore-after-read for core; etc.

>>>> All these 'functions' to handle the
>>>> Flash seem to me very suited for software but not for hardware. Does
>>>> anyone here have a different opinion?
>>>
>>> AFAIK, (e)MMC devices all have a small microcontroller inside.
>>
>> I can't see an *economical* way of doing this (in anything less than
>> huge volumes) with dedicated hardware (e.g., FPGA).
>
> Space exploration is not economical (yet). ;)

<frown> Wise ass! :>

Yes, I meant "economical" in terms of device complexity. The more complex the device required for a given functionality, the less reliable (in an environment where you don't get second chances).
Article: 157644

You are assigning pins, but on the BASYS and some other FPGA boards not all of the switches or buttons are brought out as I/O pins. Change the pins you have assigned and map them to other locations; then it should work.
Article: 157645

Hi,

I have an Altera Cyclone II design where I am looking for a good way to make a complete reset via HDL.

In Xilinx there is a STARTUP macro that can be used for reset; does Altera also have a similar macro or another way to make a complete reset via HDL?

cheers

Michael
Article: 157646

> Hi,
>
> I have a Altera Cyclone II design where I am looking for a good way to
> make a complete reset via HDL.
>
> In Xilinx there is a STARTUP macro that can be used for reset, does the
> Altera also have a similar macro or other way to via HDL make a complete
> reset?
>
> cheers
>
> Michael
>

Altera does not recommend an internal reset, if that is what you mean. Though you can design one simply by running a counter from zero to some value and stopping, then applying/releasing reset according to the counter value. This relies on the registers starting from zero, which can be guaranteed after the FPGA's chip-wide reset is released, but it may fail due to the possibility of timing violations, as the external clock could arrive at any time.

Kaz

---------------------------------------
Posted through http://www.FPGARelated.com
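A minimal sketch of that counter-based reset (the names and the hold time are arbitrary; it leans on the fact that registers come out of configuration with known power-up values):

-- Illustrative internal reset generator; the power-up values after
-- configuration provide the known starting state.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity por_reset is
  generic (HOLD_CYCLES : natural := 1023);   -- arbitrary hold time
  port (
    clk     : in  std_logic;
    reset_n : out std_logic    -- active-low reset for the rest of the design
  );
end entity;

architecture rtl of por_reset is
  signal count : unsigned(9 downto 0) := (others => '0');  -- powers up to zero
  signal rst_n : std_logic := '0';                         -- powers up asserted
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if count /= HOLD_CYCLES then
        count <= count + 1;    -- keep reset asserted while counting
      else
        rst_n <= '1';          -- release once and stay released
      end if;
    end if;
  end process;
  reset_n <= rst_n;
end architecture;

Because the release is synchronous to clk, reset removal is clean within that clock domain; kaz's caveat about the external clock still applies to the very first cycles after configuration, and any other clock domain would need its own synchronised release.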
Article: 157647

Hi,

On 01/15/15 07:26 PM, kaz wrote:
>> Hi,
>>
>> I have a Altera Cyclone II design where I am looking for a good way to
>> make a complete reset via HDL.
>>
>> In Xilinx there is a STARTUP macro that can be used for reset, does the
>> Altera also have a similar macro or other way to via HDL make a complete
>> reset?
>>
>> cheers
>>
>> Michael
>>
>
> Altera does not recommend internal reset if that is what you mean. Though
> you can design one simply by running a counter from zero to some value and
> stop then apply/release reset according to counter value. This requires
> zero startup which can be guaranteed after fpga chipwide reset release but
> may fail due to possibility of timing violation as external clock could
> arrive any time.
>
> Kaz
>
> ---------------------------------------
> Posted through http://www.FPGARelated.com
>

The Xilinx STARTUP macro connects to the FPGA's internal reset net, which means triggering the reset via the macro generates a power-up reset just like after configuration. This means that even FFs that do not have an HDL reset net will get back to their initial state. This is very handy.

I was hoping that I could generate a reset in the Cyclone II in the same way without adding a lot of code to every process.

Anyhow, thanks for the reply.

Michael
Article: 157648

On 2015-01-15 08:19, Michael wrote:
> Hi,
>
> I have a Altera Cyclone II design where I am looking for a good way to
> make a complete reset via HDL.
>
> In Xilinx there is a STARTUP macro that can be used for reset, does the
> Altera also have a similar macro or other way to via HDL make a complete
> reset?
>
> cheers
>
> Michael

http://www.alteraforum.com/forum/showthread.php?t=24658

Adam Górski
"Michael" <michael_laajanen@yahoo.com> wrote in message news:chp7vjFof2bU1@mid.individual.net... > Hi, > > I have a Altera Cyclone II design where I am looking for a good way to > make a complete reset via HDL. > > In Xilinx there is a STARTUP macro that can be used for reset, does the > Altera also have a similar macro or other way to via HDL make a complete > reset? > > cheers > > Michael Hello Michael, you have two options: 1) If you use PLL, then use LOCKED output as a reset to your logic. Create a counter, that will set PLL areset input to high when you need a reset. 2) A hardware reset by routing IO pin to nCONFIG input, which would have a pull-up resistor. When the FPGA is reseted, the pull-up would ensure the nCONFIG is high and the FPGA can enter user mode. When user-mode is reached, the output is kept high until the reset is needed. This will reset the FPGA completely. Regards Tomas D.