Messages from 157700

Article: 157700
Subject: Re: data memory mapping microblaze
From: Brian Drummond <brian@shapes.demon.co.uk>
Date: Sat, 7 Feb 2015 10:54:19 +0000 (UTC)

On Sat, 07 Feb 2015 07:17:29 +0100, alb wrote:

> INFO: this very same post was posted to comp.arch.embedded with no
>  follow up over few days, that's the reason why I decided to post it
>  here where I hope to get more feedback.

> I'm using a mb-gcc and I've looked to the ld reference, but how can you
> specify that a set of registers need to go to the data memory to a
> specific address? Or is it implicitly assumed that .data segments
> would go to a 'data memory'?
> 
> Anyone with any pointer?
> 
> Al

This is not MB-specific but applicable to gcc.
You can attach "section" pragmas (in C, attributes) to variables in the 
source code as part of their declaration. This directs them to be placed 
in the named section.

https://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Variable-Attributes.html

"Some file formats do not support arbitrary sections so the section 
attribute is not available on all platforms."

I seem to remember having to do this AND effectively repeat the same 
information in the .ld linker script (via the SECTIONS command).

http://www.scoberlin.de/content/media/http/informatik/gcc_docs/ld_3.html#SEC17

However I am now hazy on the details, and that may have been an artefact 
of the rather old version of gcc I had to use on that project.
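
A minimal sketch of the "section" attribute described above, assuming an ELF
target and GCC. The section name .fpga_regs, the variable names and the
address in the linker-script comment are made up for illustration; none of
them come from the MicroBlaze or mb-lite tools.

```c
#include <stdint.h>

/* Hypothetical section name: it has to match an output section that the
 * linker script places at the desired address, for example
 *
 *   SECTIONS { .fpga_regs 0x00010000 (NOLOAD) : { *(.fpga_regs) } }
 */
#define IN_FPGA_REGS __attribute__((section(".fpga_regs")))

/* Both variables are emitted into .fpga_regs instead of .bss. */
volatile uint32_t fpga_ctrl   IN_FPGA_REGS;
volatile uint32_t fpga_status IN_FPGA_REGS;

/* Ordinary C accesses; volatile keeps the compiler from caching them. */
static inline void fpga_start(void)
{
    fpga_ctrl = 0x1u;
}
```

Without a matching entry in the linker script the section is still created,
but the linker decides where it ends up, which is probably why the
information has to be repeated in both places.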

-- Brian

Article: 157701
Subject: Re: Topics for Projects on FPGA+Computer Archtecture
From: "mnentwig" <24789@embeddedrelated>
Date: Sat, 07 Feb 2015 07:47:39 -0600

TTA is one possibility, a random topic that I happen to know about.

A lot of work has been done already and is available. You can download a
full implementation with LLVM compiler here:
http://tce.cs.tut.fi/tta.html

	   
					
---------------------------------------		
Posted through http://www.FPGARelated.com

Article: 157702
Subject: Re: Topics for Projects on FPGA+Computer Archtecture
From: jim.brakefield@ieee.org
Date: Sat, 7 Feb 2015 16:05:42 -0800 (PST)

On Saturday, February 7, 2015 at 2:58:58 AM UTC-6, vrb...@gmail.com wrote:
> Hi,
> 
>  I have to work on a project related to FPGA (Altera DEI or Altera DEII) and computer architecture. Can anyone suggest good topics that I can work on individually (say for 3-4 months).
> Thank you in advance.

I've done a quick place & route on all the soft processor cores I can find:
http://opencores.org/project,up_core_list,summary

I would like to have better metrics for comparative performance, in particular a simple CoreMark-type metric, e.g. one that is easily implemented in assembler.  It should be straightforward to select some subset of cores and then perform the measurements and analysis.

Article: 157703
Subject: Re: Topics for Projects on FPGA+Computer Archtecture
From: vrbvasu@gmail.com
Date: Sat, 7 Feb 2015 16:50:56 -0800 (PST)

Thank you, both of you.
I just wanted to know whether the projects you suggested are graduate level, and whether you can suggest some more similar topics. In the meantime I am going through the links that you shared.
Thank you in advance.

Article: 157704
Subject: Re: data memory mapping microblaze
From: Tim Wescott <seemywebsite@myfooter.really>
Date: Sat, 07 Feb 2015 19:16:18 -0600

On Sat, 07 Feb 2015 07:17:29 +0100, alb wrote:

> INFO: this very same post was posted to comp.arch.embedded with no
>  follow up over few days, that's the reason why I decided to post it
>  here where I hope to get more feedback.
> 
> Hi everyone,
> 
> I'm dealing with an mb-lite which is clone of the microblaze
> architecture and I'm trying to understand how the memory mapping works.
> 
> We have memory mapped registers which are needed to exchange data
> between the uP and the FPGA and it should be pretty straightforward to
> map this memory into a segment in 'data memory', but unfortunately it
> seems the object-dump does not seem to show anything but a list of
> segments with no distinction between 'data memory' and 'instruction
> memory'.
> 
> IIRC on similar Harvard Architectures (like the ADSP21xx) you could
> write the linker script to store data and instructions.
> 
> I'm using a mb-gcc and I've looked to the ld reference, but how can you
> specify that a set of registers need to go to the data memory to a
> specific address? Or is it implicitly assumed that .data segments
> would go to a 'data memory'?
> 
> Anyone with any pointer?
> 
> Al

Hey Al:

Sorry I missed this on c.a.e.  In fact, I'm cross-posting 

Am I correct that all you need to do is to read/write from a register or 
set thereof, which is mapped to a specific location in the processor's 
memory space?

In that case, there are a number of things you can do.  Doing a Google 
search of "memory mapped registers in C" is a good start.  If you find 
something by Michael Barr or Jack Ganssle -- believe it, even if I 
contradict it below.

I'm going to assume that these memory-mapped registers are not part of the 
"real" data memory -- e.g., if the real data memory runs from 0x00000000 
to 0x00007ffff, then you won't find the registers there.

The thoroughly C way to do this, without really demanding much "gnu-ness", 
is to declare your registers something like:

#define REGISTER_0 (*((volatile int32_t *)(0x00010000)))
#define REGISTER_1 (*((volatile int32_t *)(0x00010004)))

Etc.  Then you use REGISTER_0, REGISTER_1 as if they were integers.
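
A small sketch of how such macros might be used, assuming 32-bit registers.
REG_BASE, REG32, fake_regs and start_and_flag are names invented here: on the
target REG_BASE would be the fixed bus address, while on a PC the array
stands in for the hardware so the same code can be exercised.

```c
#include <stdint.h>

/* On the target, define REG_BASE as the bus address of the register block
 * (0x00010000 in the example above). When it is not defined, a small array
 * stands in for the hardware. */
#ifndef REG_BASE
static volatile int32_t fake_regs[2];
#define REG_BASE ((uintptr_t)fake_regs)
#endif

#define REG32(offset) (*(volatile int32_t *)(REG_BASE + (offset)))
#define REGISTER_0 REG32(0x0)
#define REGISTER_1 REG32(0x4)

/* Hypothetical usage: REGISTER_0 as a command word, REGISTER_1 as flags. */
static inline void start_and_flag(void)
{
    REGISTER_0 = 1;       /* a single volatile store */
    REGISTER_1 |= 0x2;    /* volatile read-modify-write */
}
```

Because of the volatile qualifier, every access in the source turns into a
real load or store; the compiler may not merge or drop them.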

The way I do this, which is probably wrong on many levels, is to define 
the register addresses directly in a linker script:

TIM2      = 0x40000000;
TIM3      = 0x40000400;

(This is for an ST STM32F303)

Then in the code I have these long structure definitions that match the 
stackup of the registers in the pertinent register sets, and the header 
files end with something like:

extern volatile SGenTimer1Regs  TIM2;
extern volatile SGenTimer1Regs  TIM3;

This automatically makes everything line up, and then in the code I can 
reference a timer with code like this (different timer than above, but you 
get the idea):

TIM4.CCER.bits.CC3E   = 0;  // disable CC3
TIM4.DIER.all         = 0;
TIM4.DIER.bits.CC2IE  = 1;  // wait for falling edge
TIM4.SR.all           = 0;
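
For readers who have not seen this style, here is a cut-down sketch of what
such structure definitions could look like. The type names, the selection of
registers and the bit positions are illustrative only and are not taken from
the STM32 reference manual; bit-field ordering is implementation-defined, so
a real header has to be checked against the compiler in use.

```c
#include <stdint.h>

/* One register, viewed either as a whole word (.all) or as bits (.bits). */
typedef union {
    uint32_t all;
    struct {
        uint32_t UIE   : 1;
        uint32_t CC1IE : 1;
        uint32_t CC2IE : 1;
        uint32_t       : 29;   /* unused bits */
    } bits;
} TimerDier;

typedef union {
    uint32_t all;
    struct {
        uint32_t UIF   : 1;
        uint32_t CC1IF : 1;
        uint32_t CC2IF : 1;
        uint32_t       : 29;   /* unused bits */
    } bits;
} TimerSr;

/* Illustrative register set: members appear in bus-address order. */
typedef struct {
    TimerDier DIER;
    TimerSr   SR;
} TimerRegs;

/* With "TIM2 = 0x40000000;" in the linker script and
 * "extern volatile TimerRegs TIM2;" in a header, this would be called
 * as timer_arm_cc2(&TIM2). */
static inline void timer_arm_cc2(volatile TimerRegs *tim)
{
    tim->DIER.all        = 0;
    tim->DIER.bits.CC2IE = 1;   /* enable only the CC2 interrupt */
    tim->SR.all          = 0;   /* clear pending flags */
}
```

Taking a pointer instead of naming the global directly keeps the routine
testable against an ordinary variable.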

My way works happily for me, but isn't terribly standard.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Article: 157705
Subject: Re: data memory mapping microblaze
From: Tim Wescott <seemywebsite@myfooter.really>
Date: Sat, 07 Feb 2015 19:21:05 -0600

On Sat, 07 Feb 2015 19:16:18 -0600, Tim Wescott wrote:

> On Sat, 07 Feb 2015 07:17:29 +0100, alb wrote:
> 
>> INFO: this very same post was posted to comp.arch.embedded with no
>>  follow up over few days, that's the reason why I decided to post it
>>  here where I hope to get more feedback.
>> 
>> Hi everyone,
>> 
>> I'm dealing with an mb-lite which is clone of the microblaze
>> architecture and I'm trying to understand how the memory mapping works.
>> 
>> We have memory mapped registers which are needed to exchange data
>> between the uP and the FPGA and it should be pretty straightforward to
>> map this memory into a segment in 'data memory', but unfortunately it
>> seems the object-dump does not seem to show anything but a list of
>> segments with no distinction between 'data memory' and 'instruction
>> memory'.
>> 
>> IIRC on similar Harvard Architectures (like the ADSP21xx) you could
>> write the linker script to store data and instructions.
>> 
>> I'm using a mb-gcc and I've looked to the ld reference, but how can you
>> specify that a set of registers need to go to the data memory to a
>> specific address? Or is it implicitly assumed that .data segments
>> would go to a 'data memory'?
>> 
>> Anyone with any pointer?
>> 
>> Al
> 
> Hey Al:
> 
> Sorry I missed this on c.a.e.  In fact, I'm cross-posting

Whoops -- no, I'm not cross-posting.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Article: 157706
Subject: Re: data memory mapping microblaze
From: al.basili@gmail.com (alb)
Date: 8 Feb 2015 20:42:12 GMT

Hi Brian,

Brian Drummond <brian@shapes.demon.co.uk> wrote:
[]
>> I'm using a mb-gcc and I've looked to the ld reference, but how can you
>> specify that a set of registers need to go to the data memory to a
>> specific address? Or is it implicitly assumed that .data segments
>> would go to a 'data memory'?
[]
> This is not MB-specific but applicable to gcc.
> You can attach "section" pragmas (in C, attributes) to variables in the 
> source code as part of their declaration. This directs them to be placed 
> in the named section.
>
> https://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Variable-Attributes.html

Thanks for the pointer. I was aware of this possibility and I used to 
do that on an old DSP (ADSP2187L), especially for memory mapped 
registers, but I clearly remember the possibility to specify PM or DM in 
a 'memory map' file that unfortunately I'm not able to recover anymore!

> "Some file formats do not support arbitrary sections so the section 
> attribute is not available on all platforms."
> 
> I seem to remember having to do this AND effectively repeat the same 
> information in the .ld linker script, (via the SECTIONS command)
> 
> http://www.scoberlin.de/content/media/http/informatik/gcc_docs/
> ld_3.html#SEC17

I think I'm more concerned about the MEMORY command where I should be 
writing something along these lines:

MEMORY
  {
    ROM (r): ORIGIN = 0x00000000, LENGTH = whatever
    RAM (w): ORIGIN = 0x80000000, LENGTH = whatever
  }

Now, on a Harvard architecture the two might really both start from 
0x0, but I don't understand how you can specify that to the 
linker without running into a 'memory overlap' error message.

> However I am now hazy on the details, and that may have been an artefact 
> of the rather old version of gcc I had to use on that project.

Not sure it is just an artefact, I clearly remember the need to specify 
memory mapped registers in the linker script (to have those symbols 
located to the right place) and in the source code.

Al

Article: 157707
Subject: Re: data memory mapping microblaze
From: al.basili@gmail.com (alb)
Date: 8 Feb 2015 21:38:37 GMT

Hi Tim,

Tim Wescott <seemywebsite@myfooter.really> wrote:
[]
>> I'm using a mb-gcc and I've looked to the ld reference, but how can you
>> specify that a set of registers need to go to the data memory to a
>> specific address? Or is it implicitly assumed that .data segments
>> would go to a 'data memory'?
>> 
> Sorry I missed this on c.a.e.  In fact, I'm cross-posting 

I'm doing it for you, since I believe c.a.e. is more appropriate. It's 
kind of funny the thread picked up more momentum here than there.

> Am I correct that all you need to do is to read/write from a register or 
> set thereof, which is mapped to a specific location in the processor's 
> memory space?

Not really. Being able to map registers into a specific location is an 
important thing, but here the subject is a bit different.

On a Harvard architecture, data memory and program memory are simply two 
different worlds. They can both start at location 0x00000000, but I 
didn't find a way to tell that to the linker without getting a 
'memory/section overlap'.

> In that case, there are a number of things you can do.  Doing a Google 
> search of "memory mapped registers in C" is a good start.  If you find 
> something by Michael Barr or Jack Ganssle -- believe it, even if I 
> contradict it below.

I knew Ganssle from 'The Art of Designing Embedded Systems', a must, but I 
wasn't aware of Barr; I'll definitely have a look at his advice.

> I'm going to assume that these memory-mapped registers are not part of the 
> "real" data memory -- e.g., if the real data memory runs from 0x00000000 
> to 0x00007ffff, then you won't find the registers there.

Actually I need to map the 'real memory', memory mapped registers are 
just the second in line, but they should follow the same 'reasoning'.

I have a data memory space and I want to place it at address 0, but it 
ain't working (section overlap). One thing I thought about was tricking 
the linker into thinking that pm and dm are in different locations: 
since I do not need 32-bit addresses for my embedded system, I can say 
0x00000000 for pm and 0x80000000 for dm. Say only 20 bits of the dm 
address are used; then I can safely say that my dm really starts from 
physical address 0. But is that really orthodox? It sounds like a nasty 
workaround to me.
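
To make the aliasing idea concrete, here is a tiny sketch of the arithmetic,
assuming the data memory decodes only the low 20 address bits. The macro and
function names are invented for the illustration.

```c
#include <stdint.h>

#define DM_LINK_BASE 0x80000000u   /* origin given to the linker for dm */
#define DM_ADDR_BITS 20u           /* address lines the dm actually decodes */
#define DM_ADDR_MASK ((1u << DM_ADDR_BITS) - 1u)

/* Address seen by the data memory for a given linker-assigned address. */
static inline uint32_t dm_physical(uint32_t link_addr)
{
    return link_addr & DM_ADDR_MASK;
}
```

With these numbers a variable linked at 0x80000010 is accessed at dm offset
0x10, so the linker sees two disjoint regions while the hardware sees both
memories starting at 0.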

> The thoroughly C way to do this, without really demanding much "gnu-ness", 
> is to declare your registers something like:
> 
> #define REGISTER_0 (*((volatile int32_t *)(0x00010000)))
> #define REGISTER_1 (*((volatile int32_t *)(0x00010004)))
> 
> Etc.  Then you use REGISTER_0, REGISTER_1 as if they were integers.
> 
> The way I do this, which is probably wrong on many levels, is to define 
> the register addresses directly in a linker script:
> 
> TIM2      = 0x40000000;
> TIM3      = 0x40000400;
> 
> (This is for an ST STM32F303)
> 
> Then in the code I have these long structure definitions that match the 
> stackup of the registers in the pertinent register sets, and the header 
> files end with something like:
> 
> extern volatile SGenTimer1Regs  TIM2;
> extern volatile SGenTimer1Regs  TIM3;

Interesting, but how do you make sure you don't run into memory 
overlaps? I used to have some sort of BASEADDRESS and refer all 
addressing to it:

#define REGISTER_0 (*((volatile int32_t *)(BASEADDRESS + 0x00000000)))
#define REGISTER_1 (*((volatile int32_t *)(BASEADDRESS + 0x00000004)))

Still, there's a high risk that you may fall into the temptation to change 
the type without readjusting the address location.
An alternative might be doing something like:

#define REGISTER_0 (*((volatile int32_t *)(BASEADDRESS + 0*sizeof(int32_t))))
#define REGISTER_1 (*((volatile int32_t *)(BASEADDRESS + 1*sizeof(int32_t))))

but again you need to be rigorous, should a change in type occur, to 
change both the type and the address.
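
One way to let the compiler do that bookkeeping is a struct overlay with
compile-time checks. This is only a sketch, assuming a C11 compiler; the
register names, widths and offsets are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Register block described once; offsets follow from the member types. */
typedef struct {
    volatile int32_t register_0;   /* expected at offset 0x0 */
    volatile int32_t register_1;   /* expected at offset 0x4 */
    volatile int16_t register_2;   /* expected at offset 0x8 */
    int16_t          reserved;     /* explicit padding */
} fpga_regs_t;

/* The build fails if a type change silently moves a register. */
_Static_assert(offsetof(fpga_regs_t, register_1) == 0x4, "register_1 moved");
_Static_assert(offsetof(fpga_regs_t, register_2) == 0x8, "register_2 moved");
_Static_assert(sizeof(fpga_regs_t) == 0xC, "register block size changed");

/* On the target: #define FPGA_REGS ((fpga_regs_t *)BASEADDRESS) */
static inline int32_t read_register_1(const fpga_regs_t *regs)
{
    return regs->register_1;
}
```

Changing register_0 to a 16-bit type, for example, would trip the first
assertion instead of silently shifting every register after it.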

Anyway, this is off topic since I'm not able to specify the dm address 
for .data section and others.

Al

Article: 157708
Subject: processor core validation
From: al.basili@gmail.com (alb)
Date: 8 Feb 2015 22:02:31 GMT

Hi everyone,

I was wondering if anyone can point me to some formal method to validate 
a soft processor core. 

We have the source code (vhdl) and a simulation environment to load 
programs and execute them, but I'm not sure in this case code coverage 
will be sufficient. What about cases like interrupt handling?

I can run Dhrystone or CoreMark, but will it be sufficient?

Any idea/pointer/comment is appreciated,

Al

-- 
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Article: 157709
Subject: Re: processor core validation
From: HT-Lab <hans64@htminuslab.com>
Date: Mon, 09 Feb 2015 09:08:42 +0000

On 08/02/2015 22:02, alb wrote:
> Hi everyone,
>
Hi Al,

> I was wondering if anyone can point me to some formal method to validate
> a soft processor core.
>
> We have the source code (vhdl) and a simulation environment to load
> programs and execute them, but I'm not sure in this case code coverage
> will be sufficient. What about cases like interrupt handling?

I assume you know that code coverage has nothing to do with formal 
verification, nor does it tell you whether your processor is bug-free; it 
is just a measure of the quality of your testbench.

If you want to formally test your processor then AFAIK your only option 
is to start writing lots of assertions and spend a year's salary on a 
formal tool (solidify/Questa formal/etc). You can also run assertions in 
your simulator (like Modelsim DE/Riviera Pro) but this will not be 
exhaustive.

If your processor is small you might get away with writing top level 
assertions, but most likely (due to the cone of logic) you will have to 
formally prove sub-blocks and use good old testvectors for the 
interconnects.

>
> I can run Dhrystone or CoreMark, but will it be sufficient?

What do you think ;-)

If you have a golden reference model then I would suggest a regression 
suite with lots of constrained random generated instructions, interrupt 
test programs, OS load, etc and comparing the output to your reference 
model.
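
As a rough illustration of the lock-step idea, here is a toy sketch. The
four-instruction ISA, the register file and all names are invented; in a
real flow dut_step would be replaced by the RTL simulation and ref_step by
the golden model.

```c
#include <stdint.h>

enum { OP_ADD, OP_SUB, OP_AND, OP_XOR, OP_COUNT };

typedef struct { uint32_t reg[4]; } cpu_state_t;
typedef struct { uint8_t op, rd, ra, rb; } instr_t;

/* Golden reference model: the most obvious description of each opcode. */
static void ref_step(cpu_state_t *s, instr_t i)
{
    uint32_t a = s->reg[i.ra], b = s->reg[i.rb], r = 0;
    switch (i.op) {
    case OP_ADD: r = a + b; break;
    case OP_SUB: r = a - b; break;
    case OP_AND: r = a & b; break;
    case OP_XOR: r = a ^ b; break;
    }
    s->reg[i.rd] = r;
}

/* Stand-in for the device under test, written differently on purpose. */
static void dut_step(cpu_state_t *s, instr_t i)
{
    uint32_t a = s->reg[i.ra], b = s->reg[i.rb], r;
    if (i.op == OP_SUB)      r = a + (~b + 1u);  /* two's complement subtract */
    else if (i.op == OP_ADD) r = a + b;
    else if (i.op == OP_AND) r = a & b;
    else                     r = a ^ b;
    s->reg[i.rd] = r;
}

/* Small LCG so that a failing seed can be replayed exactly. */
static uint32_t next_random(uint32_t *seed)
{
    *seed = *seed * 1664525u + 1013904223u;
    return *seed >> 16;
}

/* Returns the index of the first mismatching instruction, or -1 if none. */
static long run_lockstep(long count, uint32_t seed)
{
    cpu_state_t ref = { { 1, 2, 3, 4 } }, dut = ref;
    for (long n = 0; n < count; n++) {
        uint32_t r = next_random(&seed);
        /* "Constrained" random: only valid opcodes and register numbers. */
        instr_t i = { (uint8_t)(r % OP_COUNT), (uint8_t)((r >> 4) & 3u),
                      (uint8_t)((r >> 6) & 3u), (uint8_t)((r >> 8) & 3u) };
        ref_step(&ref, i);
        dut_step(&dut, i);
        for (int k = 0; k < 4; k++)
            if (ref.reg[k] != dut.reg[k])
                return n;
    }
    return -1;
}
```

Logging the seed with every run makes a failure reproducible, which matters
once the regression suite grows to many thousands of random programs.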

I would also suggest you look up the (old) Viper processor which was a 
hot topic when I was working in the space industry.

Good luck,

Regards,
Hans.
www.ht-lab.com

> Any idea/pointer/comment is appreciated,
>
> Al
>

Article: 157710
Subject: Re: FPGA : Open core FFT
From: hess.gremio@gmail.com
Date: Tue, 10 Feb 2015 03:48:50 -0800 (PST)

Hi,

I'm trying to use cfft too and have some questions.
Maybe you can help me? If you remember something...
hahahaha

Thanks :D

Article: 157711
Subject: Re: Dynamic partial reconfiguration on Spartan 3 chips
From: nick291 <niksharma291@gmail.com>
Date: Tue, 10 Feb 2015 15:23:44 -0600

Yes, I am in need of DPR on Spartan 3. Can you help?

Article: 157712
Subject: Open Source GPGPU core
From: jeffbush001@gmail.com
Date: Wed, 11 Feb 2015 09:09:18 -0800 (PST)

I've been designing an open source Larrabee-esque GPGPU processor in SystemVerilog and I thought people might find it interesting. Full source code, documentation, tests, tools, etc. are available on github:

https://github.com/jbush001/NyuziProcessor

The processor supports a wide, predicated vector floating point pipeline with 16 lanes and multiple hardware threads to hide memory and computation latency. It also supports multiple cache coherent cores. I've created an LLVM backend for this, so C/C++ code can be compiled for it.  It includes support for first class vector types using the GCC vector extensions, as well as intrinsics to expose specialized instructions.
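
For readers unfamiliar with the GCC vector extensions mentioned above, here
is a generic sketch of a 16-lane float vector type. It uses only the
standard GCC/Clang extension and no Nyuzi-specific intrinsics; the type and
function names are made up for the example.

```c
/* 16 float lanes in one first-class vector value (GCC/Clang extension). */
typedef float vecf16_t __attribute__((vector_size(16 * sizeof(float))));

/* Fill every lane with the same scalar. */
static inline void vec_splat(vecf16_t *out, float x)
{
    for (int lane = 0; lane < 16; lane++)
        (*out)[lane] = x;
}

/* Element-wise multiply-add; the operators apply to all lanes at once. */
static inline void vec_madd(vecf16_t *out, const vecf16_t *a,
                            const vecf16_t *b, const vecf16_t *c)
{
    *out = (*a) * (*b) + (*c);
}
```

On a target with a native 16-lane unit the compiler can map such an
expression onto vector instructions; elsewhere it is lowered to narrower
operations with the same result.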

I've written a 3D engine (software/librender) that is optimized to take advantage both of the vector unit and multiple cores/hardware threads.  Here's a video of the standard teapot (with ~2300 triangles) rendering on a single core on FPGA running at 50 MHz:

http://youtu.be/DsvZorBu4Uk

This image is the emulator rendering Dabrovik's Sponza atrium, ~66k triangles. This took around 200 million instructions to render between 8 virtual cores and 32 hardware threads (at 1024x768):

http://i.imgur.com/sHAsAU5.png

My main purpose in designing this was to be able to experiment with processor architecture with real, empirical data. The neat thing about having all the source to a cycle accurate hardware design is that it is infinitely instrumentable. I've kept notes about some of my findings here:

http://latchup.blogspot.com/

Anyway, comments and suggestions are appreciated, and I'm happy to take contributions if people are interested in hacking on it.

Article: 157713
Subject: Re: processor core validation
From: al.basili@gmail.com (alb)
Date: 11 Feb 2015 22:12:27 GMT

Hi Tim,

Tim Wescott <seemywebsite@myfooter.really> wrote:
[]
> You could have a combination hardware/software test harness, where you 
> have a full test app (or perhaps an app that loops through the processor 
> states that are of most interest) while at the same time the processor is 
> getting interrupted by an asynchronous source that either has a different 
> period from the processor's, or which is intentionally random.

This test harness could effectively be running on a simulator, where I 
could use a golden model and crosscheck that throwing random interrupts 
does not cause any deviation from the golden model behavior.

The main difficulty I see would be the functional coverage model I need 
to put in place to tell me I'm done.
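
A very small sketch of what such a coverage model could look like in a
C-based test harness, assuming the bins of interest are "each opcode, with
and without an interrupt taken". The number of opcodes and all names are
hypothetical.

```c
#include <stdbool.h>

enum { NUM_OPCODES = 8 };   /* hypothetical size of the opcode space */

typedef struct {
    unsigned long hits[NUM_OPCODES][2];   /* [opcode][interrupt taken?] */
} coverage_t;

/* Record one executed instruction in its bin. */
static void coverage_sample(coverage_t *cov, unsigned opcode, bool irq_taken)
{
    if (opcode < NUM_OPCODES)
        cov->hits[opcode][irq_taken ? 1 : 0]++;
}

/* Number of bins that have been hit at least once. */
static unsigned coverage_bins_hit(const coverage_t *cov)
{
    unsigned n = 0;
    for (unsigned op = 0; op < NUM_OPCODES; op++)
        for (unsigned irq = 0; irq < 2; irq++)
            if (cov->hits[op][irq] != 0)
                n++;
    return n;
}

/* "Done" here simply means that every bin has been exercised. */
static bool coverage_done(const coverage_t *cov)
{
    return coverage_bins_hit(cov) == (unsigned)(NUM_OPCODES * 2);
}
```

The hard part is choosing the bins, not counting them; the counters only
answer the question once the bins have been derived from the spec.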

There might be corner cases where the only solution would be crafting a 
special purpose test program to generate stimuli (or sequences of 
stimuli) which might be very difficult to achieve otherwise (some sort 
of mix of directed and random testing).

> Just going through the thing step-by-step and, positing problem modes, and 
> examining the code to ask "how am I going to test for this?" will force 
> you to think in depth about everything -- one of the frequent ways that 
> TDD helps me to find bugs is by forcing me to discover them in the test 
> design, before I even get to the point of throwing software at the test.

I can simply get the spec of the processor and start to classify the 
requirements and then go one by one trying to build my functional 
coverage model. It is possible that going through the code would not be 
needed, even though it might help us discovering corner cases.

Assuming the golden model is provided (I've found one on OVP, yet not 
sure whether it is the right one!) I'll just verify compliance to the 
golden model and maybe present some usecases where some high level 
software is able to run and work without crashing.

The black box approach has some advantages:

1. no need to know the details (unless we need a fix) 
2. bfm for processor's interface are /available/ (write_mem, read_mem) 
3. heuristic approach (let's see if it breaks!)

OTOH we might face the need to open the box and go deeper with our 
verification, where interactions between internal elements have to be 
analyzed.

With the hypothesis that the processor 'state' does not depend on 
previous instructions (there's no cache), the order of the instructions 
does not necessarily matter. Should this be the case, I can probably 
live with simply throwing random instructions at it.

Al

Article: 157714
Subject: Re: Open Source GPGPU core
From: "jt_eaton" <84408@embeddedrelated>
Date: Thu, 12 Feb 2015 21:04:34 -0600

I tried building your toolchain on both a 32-bit and a 64-bit AMD Ubuntu 14.10
system and got:

Linking CXX shared library ../../../lib/liblldb.so
Python script sym-linking LLDB Python API
Program error: Invalid parameters entered, -h for help. 
You entered:
['--buildConfig=',
'--srcRoot=/home/johne/Desktop/Nyuzi/NyuziToolchain/tools/lldb',
'--targetDir=/home/johne/Desktop/Nyuzi/NyuziToolchain/build/tools/lldb/source/../scripts',
'--cfgBldDir=/home/johne/Desktop/Nyuzi/NyuziToolchain/build/tools/lldb/source/../scripts',
'--prefix=/home/johne/Desktop/Nyuzi/NyuziToolchain/build',
'--cmakeBuildConfiguration=.', '-m'] (-1)
tools/lldb/source/CMakeFiles/liblldb.dir/build.make:282: recipe for target
'lib/liblldb.so.3.7.0' failed
make[2]: *** [lib/liblldb.so.3.7.0] Error 255
CMakeFiles/Makefile2:12189: recipe for target
'tools/lldb/source/CMakeFiles/liblldb.dir/all' failed
make[1]: *** [tools/lldb/source/CMakeFiles/liblldb.dir/all] Error 2
Makefile:133: recipe for target 'all' failed
make: *** [all] Error 2
johne@ouabache:~/Desktop/Nyuzi/NyuziToolchain/build$ 


John Eaton
	   
					

Article: 157715
Subject: Re: Open Source GPGPU core
From: jeffbush001@gmail.com
Date: Fri, 13 Feb 2015 07:56:37 -0800 (PST)

On Thursday, February 12, 2015 at 7:04:38 PM UTC-8, jt_eaton wrote:
> I tried building your toolchain on both  a 32 and 64 bit amd Ubuntu 14.10
> system and get:
> 
> Linking CXX shared library ../../../lib/liblldb.so
> Python script sym-linking LLDB Python API
> Program error: Invalid parameters entered, -h for help. 
> You entered:

It looks like LLDB was not building correctly when the build type wasn't set (I normally build with Debug).  I pushed a change to the cmake files that should address this.  Let me know if that fixes it.

Thanks

--Jeff

Article: 157716
Subject: Re: Dynamic partial reconfiguration on Spartan 3 chips
From: rickman <gnuarm@gmail.com>
Date: Fri, 13 Feb 2015 13:21:13 -0500

On 2/10/2015 4:23 PM, nick291 wrote:
> yes i am in need of DPR on spartan 3,Can you help?

I have been wanting DPR for a long time.  I don't know if Xilinx 
supports DPR at all any more, but I remember quite some years back they 
dropped support for this on the Spartan devices.

From what I could tell the task is actually not an easy one for the 
software developers and the true demand for it was limited.  I believe 
the only real support Xilinx ever gave to this was for very large 
customers, to whom they would provide a Xilinx expert to help with 
their work.

Personally I think DPR has some real potential as it would open up 
markets that otherwise are not feasible.  But I have to assume Xilinx 
has a handle on their real market, and even perhaps to a lesser degree, 
their potential markets.

So instead of being able to craft modules which would fit into an FPGA 
in the same way that USB devices are plugged into a PC, every FPGA 
design has to be hand crafted with all the pieces molded at the same time.

-- 

Rick

Article: 157717
Subject: Re: Open Source GPGPU core
From: "jt_eaton" <84408@embeddedrelated>
Date: Fri, 13 Feb 2015 20:37:47 -0600

>
>It looks like LLDB was not building correctly when the build type wasn't
>set (I normally build with Debug).  I pushed a change to the cmake files
>that should address this.  Let me know if that fixes it.
>
>Thanks
>
>--Jeff
>

That fixed it. Ran all the tests and got the picture in the frame buffer. 

Do any of the tests run verilator to create a vcd dump file?


John Eaton
	   
					

Article: 157718
Subject: Re: Open Source GPGPU core
From: "jt_eaton" <84408@embeddedrelated>
Date: Fri, 13 Feb 2015 21:00:36 -0600


>>
>
>That fixed it. Ran all the tests and got the picture in the frame buffer.

>
>Do any of the tests run verilator to create a vcd dump file?
>
>
>John Eaton
>

OK, I found it. Are all of your tests using the same VCD dump file?

John Eaton
	   
					

Article: 157719
Subject: Re: Open Source GPGPU core
From: jeffbush001@gmail.com
Date: Fri, 13 Feb 2015 19:00:59 -0800 (PST)

On Friday, February 13, 2015 at 6:37:53 PM UTC-8, jt_eaton wrote:
> That fixed it. Ran all the tests and got the picture in the frame buffer.

Great!

> Do any of the tests run verilator to create a vcd dump file?

Yep.  All of the cosimulation tests run in Verilator.  The compiler tests can be made to run in Verilator by defining USE_VERILATOR=1 in the shell environment.  The render tests have a target 'verirun' that will run them in Verilator (there are READMEs in those directories with more details).

VCD dumps aren't produced by default, but can be enabled by modifying the makefile in the rtl/ directory, uncommenting the line:

VERILATOR_OPTIONS=--trace --trace-structs

And rebuilding. A file 'trace.vcd' will be written in the same directory. The output files get big fast for non-trivial tests. :)

Article: 157720
Subject: Re: Open Source GPGPU core
From: Aleksandar Kuktin <akuktin@gmail.com>
Date: Sun, 15 Feb 2015 14:17:09 +0000 (UTC)

On Wed, 11 Feb 2015 09:09:18 -0800, jeffbush001 wrote:

> I've been designing an open source Larrabee-esque GPGPU processor in
> SystemVerilog and I thought people might find it interesting. Full
> source code, documentation, tests, tools, etc. are available on github:
> 
> https://github.com/jbush001/NyuziProcessor
> 
> The processor supports a wide, predicated vector floating point pipeline
> with 16 lanes and multiple hardware threads to hide memory and
> computation latency. It also supports multiple cache coherent cores.
> I've created an LLVM backend for this, so C/C++ code can be compiled for
> it.  It includes support for first class vector types using the GCC
> vector extensions, as well as a intrinsics to expose specialized
> instructions.

How many gates does it take once synthesized? Are there any Altera-
specific constructs in code or is it portable?

Article: 157721
Subject: Re: Open Source GPGPU core
From: jeffbush001@gmail.com
Date: Sun, 15 Feb 2015 07:37:13 -0800 (PST)

On Sunday, February 15, 2015 at 6:17:13 AM UTC-8, Aleksandar Kuktin wrote:

> How many gates does it take once synthesized? Are there any Altera-
> specific constructs in code or is it portable?

The default configuration with 1 core takes around 70k LEs on Altera. Almost all of the design is generic behavioral RTL without custom megafunctions.  The exceptions are the SRAM and FIFO modules, which generally need to be tweaked for the specific target to infer properly.

Article: 157722
Subject: Re: Open Source GPGPU core
From: Aleksandar Kuktin <akuktin@gmail.com>
Date: Sun, 15 Feb 2015 15:40:15 +0000 (UTC)

On Sun, 15 Feb 2015 07:37:13 -0800, jeffbush001 wrote:

> On Sunday, February 15, 2015 at 6:17:13 AM UTC-8, Aleksandar Kuktin
> wrote:
> 
>> How many gates does it take once synthesized? Are there any Altera-
>> specific constructs in code or is it portable?
> 
> The default configuration with 1 core takes around 70k LEs on Altera.
> Almost all of the design is generic behavioral RTL without custom
> megafunctions.  The exception are SRAM and FIFO modules, which generally
> need to be tweaked for the specific target to infer properly.

Okay, so this sounds fun. Gonna clone it and see what's inside. :)

Article: 157723
Subject: Kintex UltraScale board with two DDR4 interfaces?
From: "Owenh" <103481@embeddedrelated>
Date: Mon, 16 Feb 2015 10:48:38 -0600

Hello,

I am looking for a Xilinx Kintex UltraScale FPGA board, where it would be
possible to run two separated DDR4 controllers (i.e., a board where there
is more than one address bus for the DDR4 chips).

In case someone knows about such a board, I would be happy to know.

Thank you :-)


	   
					

Article: 157724
Subject: Inferring F7 / F8 Mux in Xilinx
From: Kevin Neilson <kevin.neilson@xilinx.com>
Date: Wed, 18 Feb 2015 00:50:57 -0800 (PST)

I'm posting this here for my own future reference.

If you infer a mux with fewer than 2**n inputs, Vivado won't infer the F7 or F8 muxes.  Here is the trick to make sure you get the best synthesis.

Example, 5-input mux:

wire [7:0] mux_inputs[0:4]; // only 5 inputs
reg  [7:0] mux_out;         // reg, because it is assigned in an always block
wire [2:0] mux_sel;

always @(posedge clk)
  if (mux_sel < 5)
    mux_out <= mux_inputs[mux_sel];
  else
    mux_out <= 'bx;  // assign the remaining 2**3-5 inputs to don't-care


Keywords:  f7, f8, multiplexer, Vivado
