Messages from 106900

Article: 106900
Subject: ALTERA Automotive Graphics Controller Reference Design--drivers
From: "Keith Williams" <e_s_p_i_a_n_@insightbb.com>
Date: Tue, 22 Aug 2006 08:53:10 -0400

OK.  I know this is borderline for this group, but it seems the best place I
can think of to ask right now.

I am very interested in using the Automotive Graphics Controller reference
design that Altera has posted on their website, with a Nios II core running
uClinux.

The drivers that were posted seem to be 'raw' Nios code.  Are there any
Linux framebuffer drivers available for this device?

Are there any plans or on-going projects that are porting drivers over for
this device?

Thanks for any insight.

Keith

Article: 106901
Subject: Re: CPU design
From: "radarman" <jshamlet@gmail.com>
Date: 22 Aug 2006 05:57:26 -0700


Göran Bilski wrote:
> Frank Buss wrote:
> > Göran Bilski wrote:
> >
> >
> >>If the interesting part is to create this solution without any time
> >>limits than you should create most from scratch.
> >
> >
> > Yes, this is what I'm planning.
> >
> > I have another idea for a CPU, very RISC like. The bits of an instructions
> > are something like micro-instructions:
> >
> > There are two internal 16 bit registers, r1 and r2, on which the core can
> > perform operations and 6 "normal" 16 bit registers. The first 2 bits of an
> > instructions defines the meaning of the rest:
> >
> > 2 bits: operation:
> >   00 load internal register 1
> >   01 load internal register 2
> >   10 execute operation
> >   11 store internal register 1
> >
> > I think it is a good idea to use 8 bits for one instruction instead of
> > using non-byte-aligned instructions, so we have 6 bits for the operation.
> > Some useful operations:
> >
> > 6 bits: execute operation:
> >   r1 = r1 and r2
> >   r1 = r1 or r2
> >   r1 = r1 xor r2
> >   cmp(r1, r2)
> >   r1 = r1 + r2
> >   r1 = r1 - r2
> >   pc = r1
> >   pc = r1, if c=0
> >   pc = r1, if c=1
> >   pc = r1, if z=0
> >   pc = r1, if z=1
> >
> > For the load and store micro instructions, we have 6 bits for encoding the
> > place on which the load and store acts:
> >
> > 6 bits place:
> >   1 bit: transfer width (0=8, 1=16 bits)
> >   2 bits source/destination:
> >     00: register:
> >       3 bits: register index
> >     01: immediate:
> >       1 bit: width of immediate value (0=8, 1=16 bits)
> >         next 1 or 2 bytes: immediate number (8/16 bits)
> >     10: memory address in register
> >       3 bits: register index
> >     11: address
> >       1 bit: width of address (0=8, 1=16 bits)
> >         next 1 or 2 bytes: address (8/16 bits)
> >
> > The transfer width and the value need not to be the same. E.g. 1010xx
> > means, that the next byte is loaded into the internal register and the
> > upper 8 bits are set to 0.
> >
> > But for this reduced instruction set a compiler would be a good idea. Or
> > different layers of assembler. I'll try to translate my first CPU design,
> > which needed 40 bytes:
> >
> > ; swap 6 byte source and destination MACs
> > 	.base = 0x1000
> > p1:	.dw 0
> > p2:	.dw 0
> > tmp:	.db 0
> > 	move #5, p1
> > 	move #11, p2
> > loop:	move.b (p1), tmp
> > 	move.b (p2), (p1)
> > 	move.b tmp, (p2)
> > 	sub.b p2, #1
> > 	sub.b p1, #1
> > 	bcc.b loop
> >
> > With my new instruction set it could be written like this (the normal
> > registers 0 and 1 are constant 0 and 1) :
> >
> > 	load r1 immediate with 5
> > 	store r1 to register 2
> > 	load r1 immediate with 11
> > 	store r1 to register 3
> > loop:	load r1 from memory address in register 2
> > 	load r2 from memory address in register 3
> > 	store r1 to memory address in register 3
> > 	store r2 to memory address in register 2
> > 	load r1 from register 3
> > 	load r2 from register 1
> > 	operation r1 = r1 - r2
> > 	store r1 in register 3
> > 	load r1 in register 2
> > 	operation r1 = r1 - r2
> > 	store r1 in register 2
> > 	operation pc = loop if c=0
> >
> > This is 20 bytes long. As you can see, there are micro optimizations
> > possible, like for the last two register decrements, where the subtrahend
> > needs to be loaded only once.
> >
> > I think this instruction set could be implemented with very few gates,
> > compared to other instruction sets, and the memory usage is low, too.
> > Another advantage: 64 different instructions are possible and orthogonal
> > higher levels are easy to implement with it, because the load and store
> > operations work on all possible places. Speed would be not the fastest, but
> > this is no problem for my application.
> >
> > The only problem is that you need a C compiler or something like this,
> > because writing assembler with this reduced instruction set looks like it
> > will be no fun.
> >
> > Instead of 16 bits, 32 bits and more is easy to implement with generic
> > parameters for this core.
> >
>
> Things to keep in mind is to handle larger arithmetic than 16 bits.
> That will usually introduce some kind of carry bits (stored where?).
> You seems to have a c,z bits somewhere but you will need two versions of
> each instruction, one which uses the carry and one which doesn't
>
> Running more than just simple programs in real-time applications
> requires interrupt support which messes things up considerable in the
> control part of the processor.
>
> Do you consider using only absolute branching or also doing relative
> branching?
>
> If you really are wanting to have a processor which is code efficient,
> you might want to look at a stack machine.
> If I was to create a tiny tiny processor with little area and code
> efficient I would do a stack machine.
> But they are much nastier to program but they can be implemented very
> efficiently.
>
>
> Göran

Since you have control of the microcode, you can implement 16-bit math
in an 8-bitter by chaining other states. The v8 uRISC/Arclite has a
16-bit increment, which is implemented as {Rn+1,Rn}++. It takes two
clock cycles to execute because it issues two commands to the ALU. Yes,
you do have to keep a carry flag, but you would keep one anyway.

BTW - I just finished the interrupt controller for my processor core,
and it wasn't that difficult (once I got past the priority part). In
my case, I wait for the next instruction decode, and then enter the
interrupt states. Once it starts an interrupt, it's a simple matter of
storing off the flag register and current PC + 1, and then doing a JSR
to the location indicated in the service vector. I use a req/ack scheme
to let the microcode FSM indicate that it has entered an ISR.

Of course, my CPU doesn't have any cache, and a simple two-stage
pipeline - so that might have something to do with the simplicity of it.
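
The chained 16-bit increment described above can be sketched in a few lines of Python. This is only a behavioural model under stated assumptions, not the actual v8 uRISC/Arclite microcode: the names `alu_add8` and `inc16` are hypothetical, and the register file is modelled as a plain list of 8-bit values with the low byte at index n.

```python
def alu_add8(a, b, carry_in=0):
    """Hypothetical 8-bit ALU add: returns (8-bit result, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def inc16(regs, n):
    """Increment the 16-bit pair {R[n+1], R[n]} with two 8-bit ALU operations."""
    # First ALU command: add 1 to the low byte and keep the carry flag.
    regs[n], carry = alu_add8(regs[n], 1)
    # Second ALU command: add only the carry to the high byte.
    regs[n + 1], carry = alu_add8(regs[n + 1], 0, carry)
    return carry


regs = [0xFF, 0x00]          # R0 = low byte, R1 = high byte, value 0x00FF
carry = inc16(regs, 0)       # regs now holds 0x0100
```

The two calls to `alu_add8` correspond to the two clock cycles mentioned in the post, and the carry flag is the only state passed between them.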

Article: 106902
Subject: Re: CPU design
From: "radarman" <jshamlet@gmail.com>
Date: 22 Aug 2006 06:01:09 -0700


Göran Bilski wrote:
> Frank Buss wrote:
> > Göran Bilski wrote:
> >
> >
> >>You seems to have a c,z bits somewhere but you will need two versions of
> >>each instruction, one which uses the carry and one which doesn't
> >
> >
> > Yes, I have carry and zero flag. To make the implementation of the core
> > easier, I think I'll use one bit of the instruction set to determine if the
> > flags are updated or not.
> >
> >
> >>Running more than just simple programs in real-time applications
> >>requires interrupt support which messes things up considerable in the
> >>control part of the processor.
> >
> >
> > Why? I think I can implement a "call" instruction like in 68000:
> >
> > r2=pc
> > pc=r1
> >
> > In the sub routine I can save r2, if I need more call stack.
> >
> > Interrupts could be implemented by saving the PC register in a special
> > register and restoring it by calling a special return instruction.
> >
> >
> So will you have instructions that saves the C,Z values?
> Imagine doing a cmp instruction and after that you take an interrupt,
> the interrupt handler will also use these flags so when you return the
> interrupted program will use the wrong values.
>
>
> >>Do you consider using only absolute branching or also doing relative
> >>branching?
> >
> >
> > 64 instructions are possible, so relative branching is a good idea and I'll
> > use the same concept with one bit for deciding, if it is absolute or
> > relative.
> >
> >
> >>If you really are wanting to have a processor which is code efficient,
> >>you might want to look at a stack machine.
> >>If I was to create a tiny tiny processor with little area and code
> >>efficient I would do a stack machine.
> >>But they are much nastier to program but they can be implemented very
> >>efficiently.
> >
> >
> > I've implemented a simple Forth implementation for Java and it's just
> > different, not more difficult to program in Forth:
> >
> > http://www.frank-buss.de/forth/
> >
> > The MARC4 from Atmel uses qForth:
> >
> > http://www.atmel.com/journal/documents/issue5/pg46_48_Atmel_5_CodePatch_A.pdf
> >
> > Maybe you are right and the core and programs are smaller with Forth, I'll
> > think about it. Really useful is that it is simple to write an interactive
> > read-eval-print loop in Forth (like in Lisp), so that you can program and
> > debug a system over RS232.
> >

Simpler solution - have the microcode FSM push the flags to the stack.
It's a simple alteration, and saves a lot of heartache. I have
contemplated even pushing the entire context to the stack, since I can
burst write from the FSM a lot faster than I can with individual
PSH/POP instructions, but I figure that would be overkill.
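
To illustrate why pushing the flags on interrupt entry avoids the problem Göran raises, here is a minimal Python model. It is a sketch under assumptions: the `Cpu` class, its method names, and the way `cmp` sets the flags are all hypothetical, and `pc` is assumed to already hold the return address when the interrupt is taken.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    pc: int = 0
    c: int = 0
    z: int = 0
    stack: list = field(default_factory=list)

    def cmp(self, a, b):
        # Assumed flag convention: z set on equality, c set on borrow.
        self.z = int(a == b)
        self.c = int(a < b)

    def enter_interrupt(self, vector):
        # The "microcode" saves the flags and return address, then jumps.
        self.stack.append((self.c, self.z))
        self.stack.append(self.pc)
        self.pc = vector

    def return_from_interrupt(self):
        # Restore in reverse order of the pushes.
        self.pc = self.stack.pop()
        self.c, self.z = self.stack.pop()


cpu = Cpu(pc=0x10)
cpu.cmp(3, 3)                # main program: z=1, c=0
cpu.enter_interrupt(0x80)
cpu.cmp(1, 2)                # handler clobbers the flags: z=0, c=1
cpu.return_from_interrupt()  # main program sees z=1, c=0 again
```

The interrupted program's conditional branch after `cmp` therefore still sees the flags it computed, without the handler needing explicit save and restore instructions.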

Article: 106903
Subject: Davies-meyer in VHDL
From: "bs" <bs.addr@googlemail.com>
Date: 22 Aug 2006 06:06:52 -0700

Hi everybody,

I am new to VHDL and to crypto as well. I would like to implement the
Davies-Meyer hash construction ( Hi = Emi(Hi-1) + Hi-1 ) in VHDL. My problem
is this: the block cipher I have (Kasumi) has 64-bit input and output, while
the hash function (SHA-1) has a 160-bit output.
I don't know how to reconcile the two widths in order to
implement Davies-Meyer.
Can anyone help me arrange these functions, or
point me to literature or implementations on this?
Thanks all, and have a nice day.
Adam.
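
For reference, the Davies-Meyer recurrence quoted above can be modelled in software before committing to VHDL. The sketch below is an illustration under assumptions, not a vetted hash: `toy_encrypt` is a hypothetical 4-round Feistel stand-in and is NOT Kasumi, the "+" is taken to be XOR (the usual combining operation in this construction), the padding is a bare 0x80-then-zeros scheme without length strengthening, and the all-zero IV is arbitrary. Each 128-bit message block m_i is used as the cipher key, so the digest has the width of the cipher block (64 bits here).

```python
import hashlib

BLOCK_BYTES = 8   # 64-bit cipher block, the same width as Kasumi's
KEY_BYTES = 16    # 128-bit key: one message block per compression step


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical 4-round Feistel cipher; a stand-in only, NOT Kasumi."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, bytes(x ^ y for x, y in zip(left, f))
    return left + right


def davies_meyer(message: bytes, iv: bytes = bytes(BLOCK_BYTES)) -> bytes:
    """H_i = E_{m_i}(H_{i-1}) XOR H_{i-1}, with m_i used as the cipher key."""
    # Simplified padding (assumption): 0x80, then zeros up to a key multiple.
    padded = message + b"\x80"
    padded += bytes(-len(padded) % KEY_BYTES)
    h = iv
    for i in range(0, len(padded), KEY_BYTES):
        m_i = padded[i:i + KEY_BYTES]
        e = toy_encrypt(m_i, h)
        h = bytes(x ^ y for x, y in zip(e, h))   # feed-forward of H_{i-1}
    return h


digest = davies_meyer(b"abc")   # 8 bytes, i.e. the cipher's block width
```

Because the chaining value passes through the cipher's data path, the construction by itself yields a 64-bit digest with a 64-bit block cipher; it does not produce a 160-bit output.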

Article: 106904
Subject: Re: Modelsim SE Simulation
From: "krishna.janumanchi@gmail.com" <krishna.janumanchi@gmail.com>
Date: 22 Aug 2006 06:08:12 -0700

Thank you all for your prompt replies.

@ Hans - Thank you - the license command works - but MAKE always fails;
it gives weird errors.

@ Joseph -
You are right - I am using the Modelsim GUI for both compilation &
simulation.
Is compilation from the command prompt faster?
The Modelsim GUI uses a script file that contains the compilation order.
My project has both Verilog & VHDL files, i.e. it is mixed-language.
I am using Modelsim SE PLUS 6.0C on Linux.

Regards,
Krishna Janumanchi

>
> Hi there,
>
> I guess you are trying to compile your design within
> the Modelsim GUI (vsim), and then run the simulation.
> Maybe it is better to separate it into two stages?
>
> When you compile the verilog design, run vlog with -incr
> option ("incremental").
>
> When running vsim, if using VHDL, you can use -lic_vhdl,
> and for verilog you can use -lic_vlog.
>
> It might be useful to tell us
> - which OS you are using?
> - VHDL / verilog/ mix language
> - how do you compiling the design? Within Modelsim GUI
>    or inside C-shell / Windows CMD?
>
> In addition, is a lot of the design files you are compiling
> are the xilinx / Altera libraries? If yes, you maybe able
> to use library feature instead of compiling everything in
> work.  And the library will only need to be compiled once.
> 
> Joseph

Article: 106905
Subject: Re: CPU design
From: "jacko" <jackokring@gmail.com>
Date: 22 Aug 2006 06:31:05 -0700


Göran Bilski wrote:

> Frank Buss wrote:
> > Göran Bilski wrote:
> >
> >
> >>If the interesting part is to create this solution without any time
> >>limits than you should create most from scratch.
> >
> >
> > Yes, this is what I'm planning.
> >
> > I have another idea for a CPU, very RISC like. The bits of an instructions
> > are something like micro-instructions:
> >
> > There are two internal 16 bit registers, r1 and r2, on which the core can
> > perform operations and 6 "normal" 16 bit registers. The first 2 bits of an
> > instructions defines the meaning of the rest:
> >
> > 2 bits: operation:
> >   00 load internal register 1
> >   01 load internal register 2
> >   10 execute operation
> >   11 store internal register 1
> >
> > I think it is a good idea to use 8 bits for one instruction instead of
> > using non-byte-aligned instructions, so we have 6 bits for the operation.
> > Some useful operations:
> >
> > 6 bits: execute operation:
> >   r1 = r1 and r2
> >   r1 = r1 or r2
> >   r1 = r1 xor r2
> >   cmp(r1, r2)
> >   r1 = r1 + r2
> >   r1 = r1 - r2
> >   pc = r1
> >   pc = r1, if c=0
> >   pc = r1, if c=1
> >   pc = r1, if z=0
> >   pc = r1, if z=1
> >
> > For the load and store micro instructions, we have 6 bits for encoding the
> > place on which the load and store acts:
> >
> > 6 bits place:
> >   1 bit: transfer width (0=8, 1=16 bits)
> >   2 bits source/destination:
> >     00: register:
> >       3 bits: register index
> >     01: immediate:
> >       1 bit: width of immediate value (0=8, 1=16 bits)
> >         next 1 or 2 bytes: immediate number (8/16 bits)
> >     10: memory address in register
> >       3 bits: register index
> >     11: address
> >       1 bit: width of address (0=8, 1=16 bits)
> >         next 1 or 2 bytes: address (8/16 bits)
> >
> > The transfer width and the value need not to be the same. E.g. 1010xx
> > means, that the next byte is loaded into the internal register and the
> > upper 8 bits are set to 0.
> >
> > But for this reduced instruction set a compiler would be a good idea. Or
> > different layers of assembler. I'll try to translate my first CPU design,
> > which needed 40 bytes:
> >
> > ; swap 6 byte source and destination MACs
> > 	.base = 0x1000
> > p1:	.dw 0
> > p2:	.dw 0
> > tmp:	.db 0
> > 	move #5, p1
> > 	move #11, p2
> > loop:	move.b (p1), tmp
> > 	move.b (p2), (p1)
> > 	move.b tmp, (p2)
> > 	sub.b p2, #1
> > 	sub.b p1, #1
> > 	bcc.b loop
> >
> > With my new instruction set it could be written like this (the normal
> > registers 0 and 1 are constant 0 and 1) :
> >
> > 	load r1 immediate with 5
> > 	store r1 to register 2
> > 	load r1 immediate with 11
> > 	store r1 to register 3
> > loop:	load r1 from memory address in register 2
> > 	load r2 from memory address in register 3
> > 	store r1 to memory address in register 3
> > 	store r2 to memory address in register 2
> > 	load r1 from register 3
> > 	load r2 from register 1
> > 	operation r1 = r1 - r2
> > 	store r1 in register 3
> > 	load r1 in register 2
> > 	operation r1 = r1 - r2
> > 	store r1 in register 2
> > 	operation pc = loop if c=0
> >
> > This is 20 bytes long. As you can see, there are micro optimizations
> > possible, like for the last two register decrements, where the subtrahend
> > needs to be loaded only once.
> >
> > I think this instruction set could be implemented with very few gates,
> > compared to other instruction sets, and the memory usage is low, too.
> > Another advantage: 64 different instructions are possible and orthogonal
> > higher levels are easy to implement with it, because the load and store
> > operations work on all possible places. Speed would be not the fastest, but
> > this is no problem for my application.
> >
> > The only problem is that you need a C compiler or something like this,
> > because writing assembler with this reduced instruction set looks like it
> > will be no fun.
> >
> > Instead of 16 bits, 32 bits and more is easy to implement with generic
> > parameters for this core.
> >
>
> Things to keep in mind is to handle larger arithmetic than 16 bits.
> That will usually introduce some kind of carry bits (stored where?).
> You seems to have a c,z bits somewhere but you will need two versions of
> each instruction, one which uses the carry and one which doesn't

or you will have to clear the carry when you want to add without carry.

> Running more than just simple programs in real-time applications
> requires interrupt support which messes things up considerable in the
> control part of the processor.

a register swap for interrupt processing is the easiest.

> Do you consider using only absolute branching or also doing relative
> branching?

either would work, but relative has code size advantage, and absolute
has execution advantage.

> If you really are wanting to have a processor which is code efficient,
> you might want to look at a stack machine.
> If I was to create a tiny tiny processor with little area and code
> efficient I would do a stack machine.
> But they are much nastier to program but they can be implemented very
> efficiently.
>

search for MSL16 as a compact example of stack machine, i would use
slightly different ops, and things if i did it.

2/ ??? i'd have full bit reversal
get rid of the subtract.

umm??

cheers
jacko

Article: 106906
Subject: ModelSim SE PLUS 6.1B. Problem to simulate RocketIO in GT_CUSTOM mode
From: axalay@gmail.com
Date: 22 Aug 2006 06:35:43 -0700

Modelsim report is:

# Reading C:/Modeltech_6.1b/tcl/vsim/pref.tcl
# //  ModelSim SE 6.1b Sep  8 2005
# //
# //  Copyright Mentor Graphics Corporation 2005
# //              All Rights Reserved.
# //
# //  THIS WORK CONTAINS TRADE SECRET AND
# //  PROPRIETARY INFORMATION WHICH IS THE PROPERTY
# //  OF MENTOR GRAPHICS CORPORATION OR ITS LICENSORS
# //  AND IS SUBJECT TO LICENSE TERMS.
# //
# do {test_clock.fdo}
# ** Warning: (vlib-34) Library already exists at "work".
# Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
# -- Compiling module stm4ser
#
# Top level modules:
# 	stm4ser
# Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
# -- Compiling module dcm1
#
# Top level modules:
# 	dcm1
# Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
# -- Compiling module clock
#
# Top level modules:
# 	clock
# Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
# -- Compiling module test_clock
#
# Top level modules:
# 	test_clock
# Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
# -- Compiling module glbl
#
# Top level modules:
# 	glbl
# vsim -L xilinxcorelib_ver -L unisims_ver -lib work -t 1ps test_clock glbl
# Loading work.test_clock
# Loading work.clock
# Loading work.dcm1
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.BUFG
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.IBUFG
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.DCM
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.dcm_clock_divide_by_2
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.dcm_maximum_period_check
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.dcm_clock_lost
# Loading work.stm4ser
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.GT_CUSTOM
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.GT
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.GT_SWIFT
# Loading C:\Xilinx\verilog\mti_se\unisims_ver.GT_SWIFT_BIT
# Loading work.glbl
# ** Warning: (vsim-PLI-3003) C:/Xilinx/verilog/mti_se/unisims_ver/unisims_ver_SmartWrapper_source.v(18339): [TOFD] - System task or function '$lm_model' is not defined.
#         Region: /test_clock/UUT/module1/GT_CUSTOM_INST/gt_1/gt_swift_1/I1
# .main_pane.mdi.interior.cs.vm.paneset.cli_0.wf.clip.cs.pw.wf
# .main_pane.workspace
# .main_pane.signals.interior.cs
# No errors or warnings.
# Break at test_clock.tfw line 82
# Simulation Breakpoint: Break at test_clock.tfw line 82
# MACRO ./test_clock.fdo PAUSED at line 17

I don't understand the warning in this report. All of the signals from the
RocketIO module are in the X state, but all of the other modules simulate
successfully.

:) And sorry for my very bad English

Article: 106907
Subject: Re: Modelsim SE Simulation
From: Joseph <joseph.yiu@somewhere-in-arm.com>
Date: Tue, 22 Aug 2006 14:47:34 +0100

krishna.janumanchi@gmail.com wrote:
> Thank you all for your prompt replies.
> 
> @ Hans - Thank you - license command works - but MAKE always fail..
> gives weird errors.
> 
> @ Joseph -
> You are right - I am using Modelsim GUI for both compilation &
> simulation.
> Is it faster - compilation from command prompt??
> Modelsim GUI uses a script file which consists of compilation order.
> My project is having both Verilog & VHDL files - that is Mixed..
> I am uisng Modelsim SE PLUS 6.0C - Linux OS.
> 
> Regards,
> Krishna Janumanchi
> 
> 

Hi Krishna,

It won't be much faster, but by doing that you can avoid
your license problem.

Assuming your top level is "testbench", when you run (in a Linux shell)

$> vsim -gui testbench

Modelsim will by default wait if the required license is not available
and queue for one.  When a license becomes available, it will then
start to load the design.

Another advantage of separating the compile stage is that you can
easily redirect stdout messages to a log file and examine it if
things go wrong. Inside the GUI, the console window keeps only a
limited number of text lines, so older error or warning messages
could be lost.

Joseph

Article: 106908
Subject: OFFSET with DCM NET or derived NET?
From: "Brandon Jasionowski" <killerhertz@gmail.com>
Date: 22 Aug 2006 07:17:15 -0700

I have been trying to use an internal clock net in my OFFSET
constraint, sourced from a DCM instance. This appears to be legal,
according to the CGD: "OFFSET is used only for pad-related signals,
and cannot be used to extend the arrival time specification method to
the internal signals in a design. A clock that comes from an internal
signal is one generated from a synch element, like a FF. A clock that
comes from a PAD and goes through a DLL, DCM, clock buffer, or
combinatorial logic is supported."

But this doesn't work for me. If I use the original clock pad in the
constraint, it works. However, these OFFSETs are driven by a DCM FX
output, so the timing is different from the original. Will ISE infer
this timing difference? It is unclear to me...

Thanks,
-Brandon

Article: 106909
Subject: Re: Why is Spartan-3 more expensive than Cyclone?
From: "homoalteraiensis" <fpgaengineerfrankfurt@arcor.de>
Date: 22 Aug 2006 07:22:55 -0700



bart wrote:
> density. the Xilinx XC3S1000 has more LUTs (4-input Look Up Tables)
> than the Altera EP1C6.

... while the Cyclone II also has 4 inputs per LUT AFAIR

Article: 106910
Subject: Re: Alternative for Mentor's HDL Designer
From: "homoalteraiensis" <fpgaengineerfrankfurt@arcor.de>
Date: 22 Aug 2006 07:36:00 -0700

Documentation is the point! For certain applications, detailed
documentation has to be delivered too, no matter how fast text-based
coding was or might have been.

I once had a project with several hundred states. It was hard to keep an
overview when dealing only with text-based signal handling, and almost
impossible for 2-3 people to work on it at the same time.

Mike Treseler wrote:
> The quartus hdl/state machine viewer
> works the other way around.

Well, I know about this function but did not find it convenient. States
are placed in linear order, and apparently it is not possible to rearrange
them or use the diagram for further input.

Article: 106911
Subject: Re: Video - DSP Eval board with Altera
From: "alterauser" <fpgaengineerfrankfurt@arcor.de>
Date: 22 Aug 2006 07:44:26 -0700

(this was my post, I just changed user name)

No ideas? It does not necessarily need to be a video app, just a high
data rate with RAM and a DSP port.

Article: 106912
Subject: Re: ModelSim SE PLUS 6.1B. Problem to simulate RocketIO in GT_CUSTOM mode
From: "Jim Wu" <jimwu88NOOOSPAM@yahoo.com>
Date: 22 Aug 2006 07:59:41 -0700

Have you checked AR  22214?

http://www.xilinx.com/xlnx/xil_ans_display.jsp?BV_UseBVCookie=yes&getPagePath=22214


HTH,
Jim
http://home.comcast.net/~jimwu88/tools/

axalay@gmail.com wrote:
> Modelsim report is:
>
> # Reading C:/Modeltech_6.1b/tcl/vsim/pref.tcl
> # //  ModelSim SE 6.1b Sep  8 2005
> # //
> # //  Copyright Mentor Graphics Corporation 2005
> # //              All Rights Reserved.
> # //
> # //  THIS WORK CONTAINS TRADE SECRET AND
> # //  PROPRIETARY INFORMATION WHICH IS THE PROPERTY
> # //  OF MENTOR GRAPHICS CORPORATION OR ITS LICENSORS
> # //  AND IS SUBJECT TO LICENSE TERMS.
> # //
> # do {test_clock.fdo}
> # ** Warning: (vlib-34) Library already exists at "work".
> # Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
> # -- Compiling module stm4ser
> #
> # Top level modules:
> # 	stm4ser
> # Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
> # -- Compiling module dcm1
> #
> # Top level modules:
> # 	dcm1
> # Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
> # -- Compiling module clock
> #
> # Top level modules:
> # 	clock
> # Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
> # -- Compiling module test_clock
> #
> # Top level modules:
> # 	test_clock
> # Model Technology ModelSim SE vlog 6.1b Compiler 2005.09 Sep  8 2005
> # -- Compiling module glbl
> #
> # Top level modules:
> # 	glbl
> # vsim -L xilinxcorelib_ver -L unisims_ver -lib work -t 1ps test_clock glbl
> # Loading work.test_clock
> # Loading work.clock
> # Loading work.dcm1
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.BUFG
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.IBUFG
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.DCM
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.dcm_clock_divide_by_2
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.dcm_maximum_period_check
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.dcm_clock_lost
> # Loading work.stm4ser
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.GT_CUSTOM
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.GT
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.GT_SWIFT
> # Loading C:\Xilinx\verilog\mti_se\unisims_ver.GT_SWIFT_BIT
> # Loading work.glbl
> # ** Warning: (vsim-PLI-3003) C:/Xilinx/verilog/mti_se/unisims_ver/unisims_ver_SmartWrapper_source.v(18339): [TOFD] - System task or function '$lm_model' is not defined.
> #         Region: /test_clock/UUT/module1/GT_CUSTOM_INST/gt_1/gt_swift_1/I1
> # .main_pane.mdi.interior.cs.vm.paneset.cli_0.wf.clip.cs.pw.wf
> # .main_pane.workspace
> # .main_pane.signals.interior.cs
> # No errors or warnings.
> # Break at test_clock.tfw line 82
> # Simulation Breakpoint: Break at test_clock.tfw line 82
> # MACRO ./test_clock.fdo PAUSED at line 17
>
> I don't understand the warning in this report. All of the signals from the
> RocketIO module are in the X state, but all of the other modules simulate
> successfully.
>
> :) And sorry for my very bad English

Article: 106913
Subject: Detect failure in Berlekamp algorithm
From: patrick.melet@dmradiocom.fr
Date: 22 Aug 2006 08:00:02 -0700

Hi all,

I have implemented a complete Reed-Solomon decoder.

I used the Berlekamp-Massey algorithm, but I don't know how to detect a
failure, i.e. e > t, where e is the number of errors and t is the
correction capability.

I have t+1 coefficients of the error locator polynomial (ELP), so its
degree is t.

But if I have t+1 errors, the degree of the ELP is t+1, which needs t+2
coefficients, and I have only t+1 blocks to compute the ELP!

How can I detect the failure?

Thanks

Article: 106914
Subject: Re: CPU design
From: Ray Andraka <ray@andraka.com>
Date: Tue, 22 Aug 2006 11:02:03 -0400

Jim Granville wrote:
> Frank Buss wrote:
> <snip>
> 
>> The only problem is that you need a C compiler or something like this,
>> because writing assembler with this reduced instruction set looks like it
>> will be no fun.
> 
> 
>  Since this is a very specifica application, do you have a handle on the 
> code size yet ?
> 
>  Another angle to this, would be to choose the smallest CPU for which a 
> C compiler exists.
> 
> Here, Freescale's new RS08 could be a reasonable candidate ?
> 
>  Or chose another more complex core and then scan the compiled output,
> to check the Opcode usages, and subset that.
> 
> -jg
> 
> 

Quite a while back I designed a small microcontroller for a Xilinx
XC4000E series part that used approximately 80 LUTs and ran at, IIRC, 105
MHz; I think it was in a 4020XL.  It was a simple RISC machine that was
sort of a cross between a PIC microcontroller and an RCA1802.  It had a
register file with 16 registers like the 1802, and had a small
instruction set similar to a PIC. If I recall correctly, it was a
Harvard architecture. The ISA was specifically designed for the FPGA
architecture.

Anyway the difficult part about it was that it had no programming tools 
to support it.  We did write a crude assembler for it, but that was 
about as far as we took it.  The point is, the hardware and ISA design 
is only part of the job.  The tools development is as big a piece as the 
processor design itself.

Article: 106915
Subject: Using multi-cycle contraint and simulate it correctly
From: "alterauser" <fpgaengineerfrankfurt@arcor.de>
Date: 22 Aug 2006 08:15:21 -0700

I am speeding up a data-processing design in which many simple steps
cause a lot of overhead. I am therefore trying to increase the system
speed, e.g. by inserting some FFs in critical paths, but I ran into
fitting problems with the multipliers.

My solution was to parallelize some large (40x40) multiplications and
use multi-cycle constraints (two clocks) to make it run. Quartus says
it is fine. With the constraints, I obtain speeds above 150 MHz.

Problem: I cannot check this in Modelsim, because the results of the
multiplications show up immediately after one clock, which is not the
case in reality.

Article: 106916
Subject: Microblaze - Writing to instruction store
From: simpson.eric@gmail.com
Date: 22 Aug 2006 08:18:50 -0700

I would like to have a control application running out of an LMB BRAM
that is mapped to the low-order addresses, say 0x0000 - 0xFFFF.  I'm
wondering if there is any way to have the MicroBlaze write to the
higher-order addresses (0x10000-0x1FFFF) in the instruction store and
then begin executing these instructions?

I realize that on the LMB instruction interface the Byte_Enable,
Data_Write, and Write_Strobe signals are not used. I'm basically
wondering if there is a good way to utilize/attach logic to these
signals.

Article: 106917
Subject: Re: hex format 16 bit?
From: Ralf Hildebrandt <Ralf-Hildebrandt@gmx.de>
Date: Tue, 22 Aug 2006 17:18:51 +0200

jacko wrote:
> in quartus II loading a .hex file into 16 bit rom, would it be little
> endian, or is only low byte filled??

I can only speak for MaxPlus+, but there the reading of Intel hex is non-standard.

Intel hex addresses always point to 8-bit locations, while the
destination width in MaxPlus changes (16-bit destinations for 16-bit
wide RAM).

Intel hex is little endian, while MaxPlus reads data in big endian order.

Look at the simulation! You will easily see what happens.

I have written a converter, but I am not sure if I can release it as
open source. I have to ask my boss..

Ralf
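
The byte-order difference described above can be illustrated with a small Python sketch. It assumes the data bytes of a hex record have already been extracted and are packed two per 16-bit ROM word; the function name `pack_words` is hypothetical, and the sketch says nothing about what MaxPlus or Quartus actually does with a given file, which is best checked in simulation as suggested.

```python
def pack_words(data: bytes, byteorder: str) -> list[int]:
    """Pack a byte stream into 16-bit words, two consecutive bytes per word."""
    if len(data) % 2:
        raise ValueError("need an even number of bytes for 16-bit words")
    return [int.from_bytes(data[i:i + 2], byteorder)
            for i in range(0, len(data), 2)]


record_bytes = bytes([0x34, 0x12, 0x78, 0x56])

# Low byte first: the stream reads as the words 0x1234, 0x5678.
little = pack_words(record_bytes, "little")

# High byte first: the same stream reads as the words 0x3412, 0x7856.
big = pack_words(record_bytes, "big")
```

A converter between the two conventions only has to swap the bytes within each word (and, where the tool addresses words rather than bytes, halve the record addresses).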

Article: 106918
Subject: Re: Using multi-cycle contraint and simulate it correctly
From: "Gabor" <gabor@alacron.com>
Date: 22 Aug 2006 08:23:20 -0700

alterauser wrote:
> I am speeding up a design for data processing, where many simple steps
> are done causing much overhead. Therefore, I try to increase the system
> speed, by eg. inserting some FFs  for critical paths but found fitting
> problems with the multipliers.
>
> My solution, was to parallelize some large (40x40) multiplications and
> used multi cycle contraints (two clocks) to make it run. Quartus says,
> it is fine. After using the constraints, I obtain speeds above 150MHz.
>
> Problem: I cannot check this in Modelsim, because the result of the
> multiplications show up immediately after on clock, which is not the
> case in reality.

If you don't want to do a back-annotated, post place&route simulation,
you can add delays into the source code for Modelsim.  If you add a
fixed delay on your multiplier that is longer than one clock cycle but
short enough to meet the setup time on the second clock you can
test that the behavior of your multicycle design is correct.  One
caveat, Quartus probably doesn't give you a minimum clock to
output on the multiplier, but it's probably safe to assume that it
is LESS than one clock cycle.  So you really need to test that
the behavior is the same whether the clock to output is less than
one cycle (as you see in Modelsim with no delay) or more than
one cycle but less than two cycles.
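
The check described above can be sketched as a toy timing model in Python.
This is not Modelsim or HDL code, and every name and number in it
(captured_products, the 6 ns period, the 0.5 ns setup time, the operand
values) is an assumption chosen for illustration. It models a register that
samples the multiplier output every second clock and compares what gets
captured for different multiplier delays:

```python
def captured_products(operands, delay_ns, period_ns, launch_every=2, setup_ns=0.5):
    """Toy model of a multiplier feeding a capture register.

    Operand pair j is launched at clock edge j * launch_every. Its product
    becomes visible delay_ns after that edge. The capture register samples at
    edges launch_every, 2 * launch_every, ... and sees the most recent product
    that was stable at least setup_ns before the sampling edge (None if no
    product is valid yet).
    """
    captured = []
    for k in range(1, len(operands) + 1):
        capture_time = k * launch_every * period_ns
        visible = None
        for j, (a, b) in enumerate(operands):
            ready_time = j * launch_every * period_ns + delay_ns
            if ready_time <= capture_time - setup_ns:
                visible = a * b  # a later launch overwrites an earlier result
        captured.append(visible)
    return captured


operands = [(3, 5), (7, 11), (2, 9)]          # hypothetical test vectors
expected = [a * b for a, b in operands]

# Zero delay (plain RTL simulation), less than one cycle, between one and two.
for delay in (0.0, 3.0, 10.0):
    print(delay, captured_products(operands, delay, period_ns=6.0) == expected)

# Same slow multiplier, but sampled every clock: the results no longer match.
print(captured_products(operands, 10.0, period_ns=6.0, launch_every=1) == expected)
```

In this model the two-cycle capture gives the same products for all three
delays, while sampling the slow multiplier every clock does not, which is the
difference a zero-delay simulation hides.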

HTH,
Gabor

Article: 106919
Subject: Re: OFFSET with DCM NET or derived NET?
From: "Gabor" <gabor@alacron.com>
Date: 22 Aug 2006 08:44:07 -0700
Links: << >> << T >> << A >>

Brandon Jasionowski wrote:
> I have been trying to use an internal clock net in my offset
> constraint, sourced from a DCM instance. This appears to be legal,
> according to the CGD: "OFFSET is used only for pad-related
> signals, and cannot be used to extend the arrival time specification
> method to the internal signals in a design. A clock that comes from an
> internal signal is one generated from a synch element, like a FF. A
> clock that comes from a PAD and goes through a DLL, DCM, clock buffer,
> or combinatorial logic is supported."
>
> But, this doesn't work for me. If I use the original clock pad in the
> constraint it works. However, these OFFSETS are driven by a DCM FX
> output, so the timing is different from the original. Will ISE infer
> this timing difference? It appears to be unclear to me...
>
> Thanks,
> -Brandon

Normally ISE will infer the timing difference correctly if there is a
simple relationship (like 1:1) from the external pad frequency to
the internal FX clock frequency.  Since the OFFSET is used for
PAD signals, this would make sense, since there would have to
be an assumed phase relationship between the external pad signal
requiring the OFFSET constraint and the internal clock.  If the
signals requiring the OFFSET constraint are not externally
generated, it would normally make sense to use a PERIOD or
FROM : TO constraint instead.

HTH,
Gabor

Article: 106920
Subject: Re: Microblaze - Writing to instruction store
From: "Antti" <Antti.Lukats@xilant.com>
Date: 22 Aug 2006 08:50:09 -0700
Links: << >> << T >> << A >>

simpson.eric@gmail.com wrote:

> I would like to have a control application running out of a LMB BRAM,
> that is mapped to the low-order addresses, say 0x0000 - 0xFFFF.  I'm
> wondering if there is any way to have the microblaze write to the
> higher-order addresses (0x10000-0x1FFFF) in the instruction store and
> then begin executing these instructions?
>
> I realize that on the LMB instruction interface the Byte_Enable,
> Data_Write, and Write_Strobe signals are not used, I'm basically
> wondering if there is a good way to utilize/attach logic to these
> signals.

In most cases the ILMB and DLMB are connected to the A and B ports
of the same BRAM blocks, so the instruction memory is also
accessible for normal writes at the same addresses.

antti

Article: 106921
Subject: Re: Xilinx .002ns timing error
From: Ray Andraka <ray@andraka.com>
Date: Tue, 22 Aug 2006 12:11:33 -0400
Links: << >> << T >> << A >>

Brad Smallridge wrote:
>>Answer #20986 says it's fixed but maybe you've uncovered
>>a place where it's not ...
> 
> 
> Humph. You're right.
> It appears it can't handle 7/2.
> Is there a work around?
> 
> Brad Smallridge
> brad at
> aivision
> dot com
> 
> 
> 

Brad, the work-around is to specify the period rather than the clock 
frequency.  If there is a DCM involved make sure the period specified is 
divisible by the DCM input to output ratio too.
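
The arithmetic behind that can be sketched in Python. This assumes the 7/2
mentioned above is a DCM multiply/divide ratio and that periods are handled in
whole picoseconds; both are assumptions, as are the function names and the
100 MHz example clock:

```python
from fractions import Fraction


def dcm_output_period_ps(input_period_ps, mult, div):
    """Exact output period for an output clock running at f_in * mult / div."""
    return Fraction(input_period_ps) * div / mult


def exact_input_period_ps(target_period_ps, mult):
    """Largest whole-picosecond period <= target that divides evenly by mult."""
    return (target_period_ps // mult) * mult


target = 10000            # 100 MHz expressed as a period in ps (example value)
mult, div = 7, 2          # assumed ratio, taken from the "7/2" above

print(dcm_output_period_ps(target, mult, div))      # not a whole number of ps
adjusted = exact_input_period_ps(target, mult)
print(adjusted, dcm_output_period_ps(adjusted, mult, div))
```

Rounding the period down rather than up makes the constraint slightly
tighter than the real clock, so it errs on the safe side.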

Article: 106922
Subject: Re: Using multi-cycle contraint and simulate it correctly
From: kayrock66@yahoo.com
Date: 22 Aug 2006 09:17:00 -0700
Links: << >> << T >> << A >>

A couple of questions that will help answer your query:
Are you using hard or soft multipliers?  What are you doing that you
need 40 bit factors?

Thanks,
Jay

alterauser wrote:
> I am speeding up a design for data processing, where many simple steps
> are done causing much overhead. Therefore, I try to increase the system
> speed, by eg. inserting some FFs  for critical paths but found fitting
> problems with the multipliers.
>
> My solution, was to parallelize some large (40x40) multiplications and
> used multi cycle contraints (two clocks) to make it run. Quartus says,
> it is fine. After using the constraints, I obtain speeds above 150MHz.
>
> Problem: I cannot check this in Modelsim, because the result of the
> multiplications show up immediately after on clock, which is not the
> case in reality.

Article: 106923
Subject: Re: CPU design
From: "JJ" <johnjakson@gmail.com>
Date: 22 Aug 2006 09:24:46 -0700
Links: << >> << T >> << A >>

Frank Buss wrote:
> For implementing the higher level protocols for my Spartan 3E starter kit
> TCP/IP stack implementation, I plan to use a CPU, because I think this
> needs less gates than in pure VHDL. The instruction set could be limited,
> because more instructions and less gates is good, and it doesn't need to be
> fast, so I can design a very orthogonal CPU, which maybe needs even less
> gates. The first draft:
>
> http://www.frank-buss.de/vhdl/cpu.html
>
> It is some kind of a 68000 clone, but much easier. What do you think of it?
> Any ideas to reduce the instruction set even more, without the drawback to
> need more instructions for a given task?
>
> --
> Frank Buss, fb@frank-buss.de
> http://www.frank-buss.de, http://www.it4-systems.de

I did a Google search for <tiny tcp stack> and saw lots of results.

I was looking specifically for Adam Dunkels; he gets a lot of press on
OSNews and other sites for his various embedded OS projects.

His uIP stack claims to be the world's smallest stack, using 4-5KB of
code space and only a few hundred bytes of RAM. uIP has been ported to a
wide range of systems and used in many commercial projects. He mentions ABB,
Altera, BMW, Cisco Systems, Ericsson, GE, HP, Volvo Technology, Xilinx.
lwIP is a bigger, faster version of uIP.

http://www.sics.se/~adam/

Besides uIP he also has a tiny OS, Contiki, and a Protothreads package.

John Jakson
transputer_guy

Article: 106924
Subject: Re: ISE 8.1: Process "Map" failed
From: "Jeroen Tenbult" <j.tenbult95@chello.nl>
Date: Tue, 22 Aug 2006 18:46:12 +0200
Links: << >> << T >> << A >>

Hello Johan,

I had the same map error. The only thing to do is to create a completely new
project and start all over again.

Regards,

Jeroen


"Johan Bernspång" <xjohbex@xfoix.se> wrote in message
news:ece9jc$mok$1@mercur.foi.se...
> Hi all,
>
> I came back from vacation yesterday, and full of ideas I started to work 
> where I left before the holidays. The first thing that happens is that I 
> can't map my design anymore. The vhdl is exactly the same as earlier, and 
> the only difference in the map report is the lines regarding related and 
> unrelated logic below:
>
> Design Summary
> --------------
> Number of errors:      0
> Number of warnings:  130
> Logic Utilization:
>   Number of Slice Flip Flops:       7,351 out of  21,504   34%
>   Number of 4 input LUTs:           5,267 out of  21,504   24%
> Logic Distribution:
>   Number of occupied Slices:        5,505 out of  10,752   51%
>   Number of Slices containing only related logic:   5,505 out of 5,505  100%
>   Number of Slices containing unrelated logic:          0 out of 5,505    0%
>         *See NOTES below for an explanation of the effects of unrelated logic
> Total Number 4 input LUTs:          6,045 out of  21,504   28%
>   Number used as logic:             5,267
>   Number used as a route-thru:        310
>   Number used as Shift registers:     468
>
>   Number of bonded IOBs:              132 out of     456   28%
>     IOB Flip Flops:                     5
>     IOB Master Pads:                   55
>     IOB Slave Pads:                    55
>     IOB Dual-Data Rate Flops:          26
>   Number of Block RAMs:                42 out of      56   75%
>   Number of MULT18X18s:                40 out of      56   71%
>   Number of GCLKs:                      5 out of      16   31%
>   Number of DCMs:                       1 out of       8   12%
>   Number of BSCANs:                     1 out of       1  100%
>
>    Number of RPM macros:           26
> Total equivalent gate count for design:  3,057,923
> Additional JTAG gate count for IOBs:  6,336
> Peak Memory Usage:  266 MB
>
> Following the design summary is a note regarding related logic.
>
> In my old map report none of this related logic stuff is present. But I 
> can't really understand why the mapper fails due to this since none of the 
> logic is unrelated.
>
> I am pretty sure that my code hasn't changed during my vacation. Does ISE 
> know that there is a new version out and it wants me to upgrade? ;-)
>
> Has anyone experienced this before?
>
> Regards
> Johan
>
> -- 
> -----------------------------------------------
> Johan Bernspång, xjohbex@xfoix.se
> Research engineer
>
> Swedish Defence Research Agency - FOI
> Division of Command & Control Systems
> Department of Electronic Warfare Systems
>
> www.foi.se
>
> Please remove the x's in the email address if
> replying to me personally.
> -----------------------------------------------
