Göran Bilski wrote:

> If the interesting part is to create this solution without any time
> limits than you should create most from scratch.

Yes, this is what I'm planning.

I have another idea for a CPU, very RISC-like. The bits of an instruction
are something like micro-instructions:

There are two internal 16-bit registers, r1 and r2, on which the core can
perform operations, and 6 "normal" 16-bit registers. The first 2 bits of an
instruction define the meaning of the rest:

2 bits: operation:
  00 load internal register 1
  01 load internal register 2
  10 execute operation
  11 store internal register 1

I think it is a good idea to use 8 bits for one instruction instead of
using non-byte-aligned instructions, so we have 6 bits for the operation.
Some useful operations:

6 bits: execute operation:
  r1 = r1 and r2
  r1 = r1 or r2
  r1 = r1 xor r2
  cmp(r1, r2)
  r1 = r1 + r2
  r1 = r1 - r2
  pc = r1
  pc = r1, if c=0
  pc = r1, if c=1
  pc = r1, if z=0
  pc = r1, if z=1

For the load and store micro-instructions, we have 6 bits for encoding the
place on which the load and store act:

6 bits place:
  1 bit: transfer width (0=8, 1=16 bits)
  2 bits source/destination:
    00: register:
      3 bits: register index
    01: immediate:
      1 bit: width of immediate value (0=8, 1=16 bits)
      next 1 or 2 bytes: immediate number (8/16 bits)
    10: memory address in register
      3 bits: register index
    11: address
      1 bit: width of address (0=8, 1=16 bits)
      next 1 or 2 bytes: address (8/16 bits)

The transfer width and the width of the value need not be the same. E.g.
1010xx means that the next byte is loaded into the internal register and
the upper 8 bits are set to 0.

But for this reduced instruction set a compiler would be a good idea. Or
different layers of assembler. I'll try to translate my first CPU design,
which needed 40 bytes:

; swap 6 byte source and destination MACs
        .base = 0x1000
p1:     .dw 0
p2:     .dw 0
tmp:    .db 0
        move #5, p1
        move #11, p2
loop:   move.b (p1), tmp
        move.b (p2), (p1)
        move.b tmp, (p2)
        sub.b p2, #1
        sub.b p1, #1
        bcc.b loop

With my new instruction set it could be written like this (the normal
registers 0 and 1 are constant 0 and 1):

        load r1 immediate with 5
        store r1 to register 2
        load r1 immediate with 11
        store r1 to register 3
loop:   load r1 from memory address in register 2
        load r2 from memory address in register 3
        store r1 to memory address in register 3
        store r2 to memory address in register 2
        load r1 from register 3
        load r2 from register 1
        operation r1 = r1 - r2
        store r1 in register 3
        load r1 from register 2
        operation r1 = r1 - r2
        store r1 in register 2
        operation pc = loop if c=0

This is 20 bytes long. As you can see, there are micro-optimizations
possible, like for the last two register decrements, where the subtrahend
needs to be loaded only once.

I think this instruction set could be implemented with very few gates,
compared to other instruction sets, and the memory usage is low, too.
Another advantage: 64 different instructions are possible, and orthogonal
higher levels are easy to implement with it, because the load and store
operations work on all possible places. Speed would not be the fastest, but
this is no problem for my application.

The only problem is that you need a C compiler or something like this,
because writing assembler with this reduced instruction set looks like it
will be no fun.

Instead of 16 bits, 32 bits and more is easy to implement with generic
parameters for this core.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

Article: 106876
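A minimal sketch of how the instruction byte from Frank Buss's proposal might be decoded, written in Python only to make the field layout concrete. Two things are assumptions and not stated in the post: bits are taken MSB first (so "the first 2 bits" are bits 7..6), and the function and field names (`decode`, `extra_bytes`, and so on) are hypothetical. The bit positions inside the 6-bit "place" field follow the "1010xx" example.

```python
# Hypothetical decoder for the 8-bit micro-instruction format sketched above.
# Assumption: bits are numbered MSB first, so "the first 2 bits" are bits 7..6.

OPS = {0b00: "load r1", 0b01: "load r2", 0b10: "execute", 0b11: "store r1"}
MODES = {0b00: "register", 0b01: "immediate",
         0b10: "memory via register", 0b11: "absolute address"}


def decode(byte):
    """Split one instruction byte into its fields."""
    op = (byte >> 6) & 0b11
    if op == 0b10:
        # Execute: the remaining 6 bits select one of up to 64 operations.
        return {"op": OPS[op], "operation": byte & 0x3F}

    place = byte & 0x3F
    mode = (place >> 3) & 0b11
    fields = {
        "op": OPS[op],
        "transfer_bits": 16 if (place >> 5) & 1 else 8,
        "mode": MODES[mode],
    }
    if mode in (0b00, 0b10):
        # Register or register-indirect: the low 3 bits are the register index.
        fields["register"] = place & 0b111
        fields["extra_bytes"] = 0
    else:
        # Immediate or absolute address: one bit gives the operand width,
        # and the operand follows in the next 1 or 2 bytes.
        fields["extra_bytes"] = 2 if (place >> 2) & 1 else 1
    return fields


# The "1010xx" example from the post, as a load into r1 (xx = 00):
# a 16-bit transfer of an 8-bit immediate held in one following byte.
print(decode(0b00_101000))
```

The decoder only separates fields; it does not execute anything. How the 64 possible "execute" codes map to the listed operations is left open in the post, so the sketch returns the raw 6-bit number.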
Jim Granville wrote:
> Sylvain Munaut wrote:
> > Antti wrote:
> >
> >>Dave Pollum wrote:
> >>
> >>>Antti wrote:
> >>>
> >>>>Peter Alfke wrote:
> >>>>
> >>>>>Why not use PicoBlaze, which is freely available ?
> >>>>>Or MicroBlaze if you need more speed?
> >>>>>Peter Alfke, from home.
> >>
> >>[]
> >>
> >>>Then perhaps there should be a "NanoBlaze"? ;)
> >>>(micro, nano, pico)
> >>
> >>>-Dave Pollum
> >>
> >>Dave,
> >>
> >>NanoBlaze is already (R) registered trademark of Xilinx Inc.
> >>
> >>Antti
> >
> > Damn, I'm working on a 16 bits RISC cpu optimized for S3/V4
> > and I was thinking of nanoblaze ... gotta find some other name
> > now ...
>
> If it is going to be open source, how about NanoFire ?
>
> -jg

Sounds good. I do wonder if micron is a trademark too; any other good
names? Minon is a possibility, but it may end up being called indi.

Maybe a thread should just list unoccupied namespace :-)

cheers

Article: 106877
Frank Buss wrote:
<snip>
> The only problem is that you need a C compiler or something like this,
> because writing assembler with this reduced instruction set looks like it
> will be no fun.

Since this is a very specific application, do you have a handle on the
code size yet?

Another angle to this would be to choose the smallest CPU for which a C
compiler exists. Here, Freescale's new RS08 could be a reasonable
candidate?

Or choose another, more complex core and then scan the compiled output to
check the opcode usage, and subset that.

-jg

Article: 106878
rickman wrote:
> pmaupin@gmail.com wrote:
> > rickman wrote:
> > > Antti wrote:
> > > > rickman wrote:
> > > >
> > > > > Antti wrote:
> > > > > > bm wrote:
> > > > > >
> > > > > > > Interesting ...Any pointer ?
> > > > > >
> > > > > > you really learn how to goofle ! :)
> > > > > >
> > > > > > just enter "usb fpga ukp" as search term and there you, first hit!
> > > > >
> > > > > Was this a typo? I get a bunch of links to sites giving pricing in
> > > > > British pounds.
> > > > >
> > > > > Maybe you were referring to this...
> > > > >
> > > > > http://www.opencores.org/projects.cgi/web/usb_phy/overview
> > > > NO.
> > > > and NO typo.
> > > >
> > > > google search web search from my PC with keywords "usb fpga ukp"
> > > > returns as first hit the following URL (I just rechecked!)
> > > >
> > > > http://www.geocities.jp/kwhr0/hard/pc8001.html
> > >
> > > Well I guess you are just special then. I get
> > > http://lists.distributed.net/pipermail/hardware/1998-October/000325.html
> > > and I would have no idea why you would use "ukp" as part of the search.
> > > Care to explain or do you prefer to remain mysterious about it?
> > >
> > > BTW, the address you posted gives me a web page in an Asian language,
> > > possibly Japanese. I am not able to read any of it.
> >
> > Maybe you have your google search preferences set to return "English
> > pages only"
>
> Isn't that rather irrelevant? Even if the search had turned up a
> Japanese page, how would I have any clue as to what it was about?
>
> Antti seems to think that this was somehow an obviously useful page and
> should have been found by the OP. Bah!

Now I'm laughing nearly as hard as Antti. I was merely addressing how you
and Antti achieved different results on seemingly identical searches,
which seemed to be puzzling you.

In my original post, I _almost_ asked if you are as provincial an American
as I am (to set your google preference to English only), and it seems you
may have answered my unasked question :)

Unfortunately, I don't have any relevant pages for the OP either.

Regards,
Pat

Article: 106879
Andy Ray wrote:
> burn.sir@gmail.com wrote:
>> wow, that was one quick answer :)
>>
>>
>> Nop, sorry. LUTs wont do. Too much logic for linear-filtering and the
>> result is still crappy, IMO. There is no D/A, every thing is pure
>> digital. The LUT solution would make it "too" pure.
>>
>> I was thinking of something like a unstable feedback loop that
>> hopefully doesnt require a multiplier/divider to work.
>>
>
> That sounds a bit like a Goertzel filter (IIR) which does require a
> multiplier to work:
>
> (http://www.dattalo.com/technical/theory/sinewave.html).
>
> A CORDIC circuit could probably do the job, though requires
> approximately as many cycles per output as bits of precision you need.
> No multiplier though.
>
> Cheers,
>
> Andy.

But the CORDIC algorithms must either calculate many coefficients (which
requires both multiply and divide) or use a LUT. See the original
implementation of the Intel 8087; it used CORDIC algorithms.

CORDIC: _Co_ordinate _R_otation _Di_gital _C_omputer

--
JosephKK
Against stupidity the gods themselves contend in vain.
  --Schiller

Article: 106880
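To make the CORDIC trade-off discussed above concrete, here is a rough Python sketch of a rotation-mode CORDIC that produces a sine and cosine. It is not taken from the thread. It uses floating point for readability, where hardware would use fixed-point integers and arithmetic shifts; the names `ITERATIONS`, `ATAN_TABLE`, `GAIN` and `cordic_sin_cos` are hypothetical, and 16 iterations is an arbitrary choice. The point it illustrates is that the loop itself needs only shifts and adds, while the lookup table holds one arctangent constant per iteration.

```python
import math

# Assumption: 16 iterations and floating-point arithmetic for readability.
# A hardware version would use fixed-point integers and arithmetic shifts.
ITERATIONS = 16

# The small lookup table: atan(2^-i) for each iteration.
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Constant gain of the rotations, precomputed once.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sin_cos(angle):
    """Return (sin, cos) of an angle in radians, for |angle| <= pi/2."""
    x, y, z = 1.0 / GAIN, 0.0, angle
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0 else -1.0
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ATAN_TABLE[i]
    return y, x


sine, cosine = cordic_sin_cos(math.pi / 6)
print(sine, cosine)
```

One iteration per bit of precision matches Andy's remark about cycles per output. The table and the gain constant are computed offline, so no multiplier or divider is needed at run time in this formulation.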
Jim Granville wrote:
> Frank Buss wrote:
> <snip>
> > The only problem is that you need a C compiler or something like this,
> > because writing assembler with this reduced instruction set looks like it
> > will be no fun.
>

Just got Quartus II; after half an hour it seems OK, once the top level
was set. I wonder if the Avalon SOPC includes USB? Not sure if there is a
C compiler for it. Well, at least I have a VHDL compiler now, which looks
good. Must start on the micron design soon.

Article: 106881
In Quartus II, when loading a .hex file into a 16-bit ROM, is the data
little-endian, or is only the low byte filled?

Article: 106882
Hello group,

I am getting a strange timing error report
with numerous requirements of .002ns with
the same source and destination clock, and
which inevitably fail.

This clock is from a dcm clkfx, if that matters.

What is going on here? And what is this program
doing when it says rising at 29774.000ns? Is it
running some sort of mini testbench?

Brad Smallridge
brad at
aivision
dot com

Timing constraint: TS_top_layer_inst_cam2_dcm_inst_cam_dcm_inst_CLK0_BUF =
  PERIOD TIMEGRP "top_layer_inst_cam2_dcm_inst_cam_dcm_inst_CLK0_BUF"
  TSCAM2 HIGH 50%;

 1896 items analyzed, 37 timing errors detected. (37 setup errors, 0 hold errors)
 Minimum period is 44875.000ns.
--------------------------------------------------------------------------------
Slack:              -3.588ns (requirement - (data path - clock path skew + uncertainty))
  Source:             top_layer_inst/cam2_iserdes_inst/cam_lval (FF)
  Destination:        top_layer_inst/mid_layer_inst/cam2_col_5 (FF)
  Requirement:        0.002ns
  Data Path Delay:    3.530ns (Levels of Logic = 1)
  Clock Path Skew:    0.000ns
  Source Clock:       top_layer_inst/cam2_clkdiv rising at 29774.998ns
  Destination Clock:  top_layer_inst/cam2_clkdiv rising at 29775.000ns
  Clock Uncertainty:  0.060ns

  Data Path: top_layer_inst/cam2_iserdes_inst/cam_lval to
             top_layer_inst/mid_layer_inst/cam2_col_5

    Location            Delay type       Delay(ns)  Physical Resource
                                                    Logical Resource(s)
    ----------------------------------------------  -------------------
    SLICE_X48Y59.YQ     Tcko               0.360    top_layer_inst/cam2_iserdes_inst/cam_lval
                                                    top_layer_inst/cam2_iserdes_inst/cam_lval
    SLICE_X59Y62.F4     net (fanout=5)     0.914    top_layer_inst/cam2_iserdes_inst/cam_lval
    SLICE_X59Y62.X      Tilo               0.194    top_layer_inst/mid_layer_inst/_n0042
                                                    top_layer_inst/mid_layer_inst/_n00421
    SLICE_X66Y60.SR     net (fanout=6)     0.905    top_layer_inst/mid_layer_inst/_n0042
    SLICE_X66Y60.CLK    Tsrck              1.157    top_layer_inst/mid_layer_inst/cam2_col<4>
                                                    top_layer_inst/mid_layer_inst/cam2_col_5
    ----------------------------------------------  -------------------
    Total                                  3.530ns  (1.711ns logic, 1.819ns route)
                                                    (48.5% logic, 51.5% route)

Article: 106883
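The numbers in the report above can be checked by hand. The short Python sketch below (not part of the original post) reproduces the slack from the formula printed in the report. That the 0.002ns requirement is simply the gap between the two listed clock edges is an inference from the figures, not something the report states; the variable names are hypothetical.

```python
from decimal import Decimal

# Figures copied from the report above, in nanoseconds.
source_edge = Decimal("29774.998")
destination_edge = Decimal("29775.000")
data_path = Decimal("3.530")
clock_skew = Decimal("0.000")
uncertainty = Decimal("0.060")

# Inferred: the requirement is the gap between the launching and
# capturing clock edges listed in the report.
requirement = destination_edge - source_edge

# Slack = requirement - (data path - clock path skew + uncertainty)
slack = requirement - (data_path - clock_skew + uncertainty)

print(requirement)  # 0.002
print(slack)        # -3.588
```

So the path itself is an ordinary 3.5ns path; it fails only because the analysis believes the two edges of the same clock are 2ps apart.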
Brad Smallridge wrote:
> Hello group,
>
> I am getting a strange timing error report
> with numerous requirements of .002ns with
> the same source and destination clock, and
> which inevitably fail.
>
> This clock is from a dcm clkfx, if that matters.
>

It does matter ;)

The problem you're seeing may be related to a precision error when ISE
computes the period.

Answer #20986 says it's fixed, but maybe you've uncovered
a place where it's not ...

Sylvain

Article: 106884
Hi all,

I came back from vacation yesterday, and full of ideas I started to work
where I left before the holidays. The first thing that happens is that I
can't map my design anymore. The vhdl is exactly the same as earlier, and
the only difference in the map report is the lines regarding related and
unrelated logic below:

Design Summary
--------------
Number of errors:      0
Number of warnings:  130
Logic Utilization:
  Number of Slice Flip Flops:       7,351 out of  21,504   34%
  Number of 4 input LUTs:           5,267 out of  21,504   24%
Logic Distribution:
  Number of occupied Slices:        5,505 out of  10,752   51%
  Number of Slices containing only related logic:   5,505 out of   5,505  100%
  Number of Slices containing unrelated logic:          0 out of   5,505    0%
        *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:          6,045 out of  21,504   28%
  Number used as logic:             5,267
  Number used as a route-thru:        310
  Number used as Shift registers:     468

  Number of bonded IOBs:              132 out of     456   28%
    IOB Flip Flops:                     5
    IOB Master Pads:                   55
    IOB Slave Pads:                    55
    IOB Dual-Data Rate Flops:          26
  Number of Block RAMs:                42 out of      56   75%
  Number of MULT18X18s:                40 out of      56   71%
  Number of GCLKs:                      5 out of      16   31%
  Number of DCMs:                       1 out of       8   12%
  Number of BSCANs:                     1 out of       1  100%

  Number of RPM macros:                26
Total equivalent gate count for design:  3,057,923
Additional JTAG gate count for IOBs:  6,336
Peak Memory Usage:  266 MB

Following the design summary is a note regarding related logic.

In my old map report none of this related logic stuff is present. But I
can't really understand why the mapper fails due to this since none of the
logic is unrelated.

I am pretty sure that my code hasn't changed during my vacation. Does ISE
know that there is a new version out and it wants me to upgrade? ;-)

Has anyone experienced this before?

Regards
Johan

--
-----------------------------------------------
Johan Bernspång, xjohbex@xfoix.se
Research engineer

Swedish Defence Research Agency - FOI
Division of Command & Control Systems
Department of Electronic Warfare Systems

www.foi.se

Please remove the x's in the email address if
replying to me personally.
-----------------------------------------------

Article: 106885
Frank Buss wrote:
> Göran Bilski wrote:
>
>>If the interesting part is to create this solution without any time
>>limits than you should create most from scratch.
>
> Yes, this is what I'm planning.
>
> I have another idea for a CPU, very RISC like. The bits of an instructions
> are something like micro-instructions:
>
> There are two internal 16 bit registers, r1 and r2, on which the core can
> perform operations and 6 "normal" 16 bit registers. The first 2 bits of an
> instructions defines the meaning of the rest:
>
> 2 bits: operation:
>   00 load internal register 1
>   01 load internal register 2
>   10 execute operation
>   11 store internal register 1
>
> I think it is a good idea to use 8 bits for one instruction instead of
> using non-byte-aligned instructions, so we have 6 bits for the operation.
> Some useful operations:
>
> 6 bits: execute operation:
>   r1 = r1 and r2
>   r1 = r1 or r2
>   r1 = r1 xor r2
>   cmp(r1, r2)
>   r1 = r1 + r2
>   r1 = r1 - r2
>   pc = r1
>   pc = r1, if c=0
>   pc = r1, if c=1
>   pc = r1, if z=0
>   pc = r1, if z=1
>
> For the load and store micro instructions, we have 6 bits for encoding the
> place on which the load and store acts:
>
> 6 bits place:
>   1 bit: transfer width (0=8, 1=16 bits)
>   2 bits source/destination:
>     00: register:
>       3 bits: register index
>     01: immediate:
>       1 bit: width of immediate value (0=8, 1=16 bits)
>       next 1 or 2 bytes: immediate number (8/16 bits)
>     10: memory address in register
>       3 bits: register index
>     11: address
>       1 bit: width of address (0=8, 1=16 bits)
>       next 1 or 2 bytes: address (8/16 bits)
>
> The transfer width and the value need not to be the same. E.g. 1010xx
> means, that the next byte is loaded into the internal register and the
> upper 8 bits are set to 0.
>
> But for this reduced instruction set a compiler would be a good idea. Or
> different layers of assembler. I'll try to translate my first CPU design,
> which needed 40 bytes:
>
> ; swap 6 byte source and destination MACs
>         .base = 0x1000
> p1:     .dw 0
> p2:     .dw 0
> tmp:    .db 0
>         move #5, p1
>         move #11, p2
> loop:   move.b (p1), tmp
>         move.b (p2), (p1)
>         move.b tmp, (p2)
>         sub.b p2, #1
>         sub.b p1, #1
>         bcc.b loop
>
> With my new instruction set it could be written like this (the normal
> registers 0 and 1 are constant 0 and 1) :
>
>         load r1 immediate with 5
>         store r1 to register 2
>         load r1 immediate with 11
>         store r1 to register 3
> loop:   load r1 from memory address in register 2
>         load r2 from memory address in register 3
>         store r1 to memory address in register 3
>         store r2 to memory address in register 2
>         load r1 from register 3
>         load r2 from register 1
>         operation r1 = r1 - r2
>         store r1 in register 3
>         load r1 from register 2
>         operation r1 = r1 - r2
>         store r1 in register 2
>         operation pc = loop if c=0
>
> This is 20 bytes long. As you can see, there are micro optimizations
> possible, like for the last two register decrements, where the subtrahend
> needs to be loaded only once.
>
> I think this instruction set could be implemented with very few gates,
> compared to other instruction sets, and the memory usage is low, too.
> Another advantage: 64 different instructions are possible and orthogonal
> higher levels are easy to implement with it, because the load and store
> operations work on all possible places. Speed would be not the fastest, but
> this is no problem for my application.
>
> The only problem is that you need a C compiler or something like this,
> because writing assembler with this reduced instruction set looks like it
> will be no fun.
>
> Instead of 16 bits, 32 bits and more is easy to implement with generic
> parameters for this core.
>

One thing to keep in mind is handling arithmetic wider than 16 bits.
That will usually introduce some kind of carry bit (stored where?).
You seem to have c and z bits somewhere, but you will need two versions of
each instruction: one which uses the carry and one which doesn't.

Running more than just simple programs in real-time applications requires
interrupt support, which messes things up considerably in the control part
of the processor.

Do you consider using only absolute branching, or also relative
branching?

If you really want a processor which is code efficient, you might want to
look at a stack machine. If I were to create a tiny, tiny processor with
little area and efficient code, I would do a stack machine. They are much
nastier to program, but they can be implemented very efficiently.

Göran

Article: 106886
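Göran's point about needing both a plain add and an add-with-carry can be illustrated with a small Python model of a 32-bit addition built from 16-bit pieces. This is only a sketch of the arithmetic; the function names `add16` and `add32` are hypothetical and say nothing about how the proposed core would actually encode the two instruction variants.

```python
MASK16 = 0xFFFF


def add16(a, b, carry_in=0):
    """16-bit add; returns (result, carry_out)."""
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16


def add32(a, b):
    """32-bit add built from two 16-bit adds."""
    # Low halves: plain add, which produces the carry flag.
    low, carry = add16(a & MASK16, b & MASK16)
    # High halves: add with carry, which consumes the flag.
    high, _ = add16(a >> 16, b >> 16, carry_in=carry)
    return (high << 16) | low


print(hex(add32(0x0001FFFF, 0x00000001)))
```

The carry has to survive between the two adds, which is why the question "stored where?" matters: any instruction executed in between (a load that updates flags, for example) must leave it untouched.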
> Answer #20986 says it's fixed but maybe you've uncovered
> a place where it's not ...

Humph. You're right.
It appears it can't handle 7/2.
Is there a work around?

Brad Smallridge
brad at
aivision
dot com

Article: 106887
Brad Smallridge wrote:
> > Answer #20986 says it's fixed but maybe you've uncovered
> > a place where it's not ...
>
> Humph. You're right.
> It appears it can't handle 7/2.
> Is there a work around?
>
> Brad Smallridge
> brad at
> aivision
> dot com

How did you specify your period?

I usually choose to specify it in ps, trying to get a number that leads to
an integer number of ps for the FX period.

For example, if you had 133 MHz as a constraint, use 7497 ps as a
period ... 7497 / 7 * 2 = 2142 ps ...

Article: 106888
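The workaround described above amounts to rounding the input period to a value that divides evenly by the clock multiplier. A hedged Python sketch of that arithmetic follows; it assumes "7/2" means multiply by 7 and divide by 2, and that the nominal input period is 7500 ps, neither of which the thread spells out. The helper names `snap_period_ps` and `fx_period_ps` are hypothetical and have nothing to do with any ISE tool.

```python
from math import gcd


def snap_period_ps(period_ps, multiply, divide):
    """Largest period <= period_ps whose derived period is a whole number of ps."""
    # period * divide must be divisible by multiply.
    step = multiply // gcd(multiply, divide)
    return (period_ps // step) * step


def fx_period_ps(period_ps, multiply, divide):
    """Period of the synthesized clock, f_out = f_in * multiply / divide."""
    return period_ps * divide // multiply


period = snap_period_ps(7500, 7, 2)
print(period, fx_period_ps(period, 7, 2))  # 7497 2142
```

Rounding down makes the constraint slightly tighter than nominal, which errs on the safe side for timing analysis.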
krishna.janumanchi@gmail.com wrote:
> Hi,
>
> I use Modelsim SE 6.0 simulator for my projects.
> My Project is very big and it takes nearly 15 min for compilation.
> As the license is network one, after compilation, it says simulation
> license error - if license is not available. Is there a command
> available in modelsim to check license on network?
> Next, In my project only 3 to 4 files are changed frequently. Rest
> other files are not disturbed at all.
> But still I am recompiling all files. Are there any commands available
> so that I can skip compiling files which are not changed at all?
>
> Please help..
>
> Regards,
> JK
>

Hi there,

I guess you are trying to compile your design within the Modelsim GUI
(vsim), and then run the simulation. Maybe it is better to separate it
into two stages?

When you compile the Verilog design, run vlog with the -incr option
("incremental"). When running vsim, if using VHDL, you can use -lic_vhdl,
and for Verilog you can use -lic_vlog.

It might be useful to tell us:
- which OS are you using?
- VHDL / Verilog / mixed language?
- how do you compile the design? Within the Modelsim GUI, or inside
  C-shell / Windows CMD?

In addition, are many of the design files you are compiling the Xilinx /
Altera libraries? If yes, you may be able to use the library feature
instead of compiling everything into work. The library will then only
need to be compiled once.

Joseph

Article: 106889
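Independently of the simulator's own incremental option, skipping unchanged files can also be done make-style in a wrapper script. The Python sketch below shows one way to pick out only the sources modified since the last successful build. It does not call any ModelSim command; the function names and the idea of a stamp file are assumptions made for illustration.

```python
import os


def files_to_recompile(sources, stamp_path):
    """Return the sources modified after the stamp file was last touched."""
    if not os.path.exists(stamp_path):
        # No record of a previous build: compile everything.
        return list(sources)
    last_build = os.path.getmtime(stamp_path)
    return [path for path in sources if os.path.getmtime(path) > last_build]


def mark_build_done(stamp_path):
    """Create or touch the stamp file after a successful compile."""
    with open(stamp_path, "a"):
        pass
    os.utime(stamp_path, None)
```

A build script would pass the returned list to the compiler and call `mark_build_done` only when compilation succeeds. Note that this ignores dependencies between files (a changed VHDL package, for example, also requires recompiling its users), so it is a starting point and not a replacement for a proper makefile.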
Does it really fail? or have you misinterpreted what it said, because it
looks ok...
The related logic is not a concern in this case as far as I can see. ISE
always adds an asterisk and a note about related logic IIRC.
Ben

"Johan Bernspång" <xjohbex@xfoix.se> wrote in message
news:ece9jc$mok$1@mercur.foi.se...
> Hi all,
>
> I came back from vacation yesterday, and full of ideas I started to work
> where I left before the holidays. The first thing that happens is that I
> can't map my design anymore. The vhdl is exactly the same as earlier, and
> the only difference in the map report is the lines regarding related and
> unrelated logic below:
>
> Design Summary
> --------------
> Number of errors:      0
> Number of warnings:  130
> Logic Utilization:
>   Number of Slice Flip Flops:       7,351 out of  21,504   34%
>   Number of 4 input LUTs:           5,267 out of  21,504   24%
> Logic Distribution:
>   Number of occupied Slices:        5,505 out of  10,752   51%
>   Number of Slices containing only related logic:   5,505 out of   5,505  100%
>   Number of Slices containing unrelated logic:          0 out of   5,505    0%
>         *See NOTES below for an explanation of the effects of unrelated logic
> Total Number 4 input LUTs:          6,045 out of  21,504   28%
>   Number used as logic:             5,267
>   Number used as a route-thru:        310
>   Number used as Shift registers:     468
>
>   Number of bonded IOBs:              132 out of     456   28%
>     IOB Flip Flops:                     5
>     IOB Master Pads:                   55
>     IOB Slave Pads:                    55
>     IOB Dual-Data Rate Flops:          26
>   Number of Block RAMs:                42 out of      56   75%
>   Number of MULT18X18s:                40 out of      56   71%
>   Number of GCLKs:                      5 out of      16   31%
>   Number of DCMs:                       1 out of       8   12%
>   Number of BSCANs:                     1 out of       1  100%
>
>   Number of RPM macros:                26
> Total equivalent gate count for design:  3,057,923
> Additional JTAG gate count for IOBs:  6,336
> Peak Memory Usage:  266 MB
>
> Following the design summary is a note regarding related logic.
>
> In my old map report none of this related logic stuff is present. But I
> can't really understand why the mapper fails due to this since none of the
> logic is unrelated.
>
> I am pretty sure that my code hasn't changed during my vacation. Does ISE
> know that there is a new version out and it wants me to upgrade? ;-)
>
> Has anyone experienced this before?
>
> Regards
> Johan
>
> --
> -----------------------------------------------
> Johan Bernspång, xjohbex@xfoix.se
> Research engineer
>
> Swedish Defence Research Agency - FOI
> Division of Command & Control Systems
> Department of Electronic Warfare Systems
>
> www.foi.se
>
> Please remove the x's in the email address if
> replying to me personally.
> -----------------------------------------------

Article: 106890
Göran Bilski wrote:

> You seems to have a c,z bits somewhere but you will need two versions of
> each instruction, one which uses the carry and one which doesn't

Yes, I have a carry and a zero flag. To make the implementation of the
core easier, I think I'll use one bit of the instruction set to determine
whether the flags are updated or not.

> Running more than just simple programs in real-time applications
> requires interrupt support which messes things up considerable in the
> control part of the processor.

Why? I think I can implement a "call" instruction like in the 68000:

  r2=pc
  pc=r1

In the subroutine I can save r2 if I need more call stack.

Interrupts could be implemented by saving the PC register in a special
register and restoring it by calling a special return instruction.

> Do you consider using only absolute branching or also doing relative
> branching?

64 instructions are possible, so relative branching is a good idea, and
I'll use the same concept with one bit for deciding whether it is absolute
or relative.

> If you really are wanting to have a processor which is code efficient,
> you might want to look at a stack machine.
> If I was to create a tiny tiny processor with little area and code
> efficient I would do a stack machine.
> But they are much nastier to program but they can be implemented very
> efficiently.

I've implemented a simple Forth for Java, and it's just different, not
more difficult, to program in Forth:

http://www.frank-buss.de/forth/

The MARC4 from Atmel uses qForth:

http://www.atmel.com/journal/documents/issue5/pg46_48_Atmel_5_CodePatch_A.pdf

Maybe you are right and the core and programs are smaller with Forth; I'll
think about it. What is really useful is that it is simple to write an
interactive read-eval-print loop in Forth (like in Lisp), so that you can
program and debug a system over RS232.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

Article: 106891
"Tommy Thorn" <tommy.thorn@gmail.com> wrote:
> Martin Schoeberl wrote:
>> JOP at 100MHz on the Altera DE2 using the 16-bit SRAM:
>>
>> Avalon: 11,322
>> SimpCon: 14,760
>>
>> So for the SRAM interface SimpCon is a clear winner ;-)
>> The 16-bit SRAM SimpCon solution is even faster than
>> the 32-bit SRAM Avalon solution.
>
> I'm not sure what your point is. It's hardly surprising that a JOP
> works better with the interface it was codesigned with, rather than
> some other one crafted on top. It says nothing of the relative merits
> of Avalon and SimpCon. I could code up a counter example quite easily.

You're right from your point of view. I have only JOP to compare
SimpCon and Avalon. JOP takes advantage of the early acknowledge
of SimpCon. However, it's still simpler with SimpCon to implement
an SRAM interface with the input and output registers at the IO cells
in the FPGA without adding one cycle of latency.

A small defense of the JOP/SimpCon version: SimpCon was added
very late to JOP. Up to that time JOP used its own proprietary
memory interface that was not shared with the IO subsystem. The
IO devices also used a proprietary interface. Then I changed JOP
to use Wishbone for memory and IO, but had to add a non-Wishbone-compliant
early ack signal to get the performance I wanted. This
resulted in the definition of SimpCon and another change in JOP's
memory/IO system.

It would be interesting to take another CPU (not NIOS or JOP)
and implement an Avalon and a SimpCon SRAM interface and compare
the performance. However, who has time to do this...

>
> Altera has an App note on "Using Nios II Tightly Coupled Memory
> Tutorial"
> (http://altera.com/literature/tt/tt_nios2_tightly_coupled_memory_tutorial.pdf),
> but as far as I understand you, this is already how you use the memory.

Very interesting, thanks for the link. No, this is not the way I used
the on-chip memory with JOP - this looks NIOS specific. And it is stated
there: 'The term tightly coupled memory interface refers to an
*Avalon-like* interface...'

That's interesting, as it is an indication that there are issues for
low-latency connections with Avalon ;-)

> I noticed you didn't reply to how SimpCon doesn't scale. Does your
> silence mean that you see it now? :-)

It means I have not thought enough about it ;-)

Martin

Article: 106892
>
> Forth looks interesting, too: http://www.ultratechnology.com/f21cpu.html
>

and Java also: http://www.jopdesign.com/

could not resist ;-)

Martin

Article: 106893
Martin Schoeberl wrote:

> and Java also: http://www.jopdesign.com/

You have tested both: a "normal" instruction set and a stack machine. For
the stack machine you wrote that it is two times faster. What about code
size and the size of the core?

I've downloaded your code, and it looks like it is implemented very close
to the hardware instead of using arbitrary VHDL and letting the
synthesizer decide how to implement it. A good idea for my
implementation :-)

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

Article: 106894
>> and Java also: http://www.jopdesign.com/
>
> You have tested both: a "normal" instruction set and a stack machine. For
> the stack machine you wrote that it is two times faster. What about code
> size and the size of the core?

Mmh, this statement is from a very early version of JOP (about 2001).
It was a comparison of implementations of the Java virtual
machine (JVM) in two different types of microcode.

About code size: It's the code (bytecode) that the Java compiler
generates plus some class information. Bytecode is efficient, but
class information adds to the memory footprint. The size depends
on the support of Java libraries.

Core size is configurable, starting from about 1000 LCs. A well-balanced
version of JOP is about 2000 LCs.

>
> I've downloaded your code and looks like it is implemented very close to
> the hardware instead of using arbitrary VHDL and let the synthesizer decide
> how to implement it. A good idea for my implementation :-)

What do you mean by 'very close to the hardware'? I try to avoid
vendor-specific library elements as much as possible and stay with
plain VHDL. If you mean that the VHDL coding style is more
hardware oriented, then I agree. I started directly with an FPGA
implementation and did almost no simulation.

Martin

Article: 106895
Frank Buss wrote:
> Göran Bilski wrote:
>
>>You seems to have a c,z bits somewhere but you will need two versions of
>>each instruction, one which uses the carry and one which doesn't
>
> Yes, I have carry and zero flag. To make the implementation of the core
> easier, I think I'll use one bit of the instruction set to determine if the
> flags are updated or not.
>
>>Running more than just simple programs in real-time applications
>>requires interrupt support which messes things up considerable in the
>>control part of the processor.
>
> Why? I think I can implement a "call" instruction like in 68000:
>
> r2=pc
> pc=r1
>
> In the sub routine I can save r2, if I need more call stack.
>
> Interrupts could be implemented by saving the PC register in a special
> register and restoring it by calling a special return instruction.
>

So will you have instructions that save the C and Z values?
Imagine doing a cmp instruction and then taking an interrupt: the
interrupt handler will also use these flags, so when you return, the
interrupted program will use the wrong values.

>>Do you consider using only absolute branching or also doing relative
>>branching?
>
> 64 instructions are possible, so relative branching is a good idea and I'll
> use the same concept with one bit for deciding, if it is absolute or
> relative.
>
>>If you really are wanting to have a processor which is code efficient,
>>you might want to look at a stack machine.
>>If I was to create a tiny tiny processor with little area and code
>>efficient I would do a stack machine.
>>But they are much nastier to program but they can be implemented very
>>efficiently.
>
> I've implemented a simple Forth implementation for Java and it's just
> different, not more difficult to program in Forth:
>
> http://www.frank-buss.de/forth/
>
> The MARC4 from Atmel uses qForth:
>
> http://www.atmel.com/journal/documents/issue5/pg46_48_Atmel_5_CodePatch_A.pdf
>
> Maybe you are right and the core and programs are smaller with Forth, I'll
> think about it. Really useful is that it is simple to write an interactive
> read-eval-print loop in Forth (like in Lisp), so that you can program and
> debug a system over RS232.
>

Article: 106896
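The flag problem Göran describes can be shown with a toy Python model. It is purely illustrative: the `Core` class, its flag conventions (z set on equality, c set on borrow) and the `save_flags` switch are all assumptions, not features of the proposed CPU.

```python
class Core:
    """Toy model with only the carry and zero flags."""

    def __init__(self):
        self.c = 0
        self.z = 0

    def cmp(self, a, b):
        # Assumed convention: z is set on equality, c is set when a < b (borrow).
        self.z = int(a == b)
        self.c = int(a < b)

    def interrupt(self, save_flags):
        saved = (self.c, self.z)
        self.cmp(0, 1)  # the handler does its own compare and clobbers the flags
        if save_flags:
            self.c, self.z = saved  # restore on return from interrupt


core = Core()
core.cmp(5, 5)                     # main program: z=1, branch-if-zero expected
core.interrupt(save_flags=False)   # interrupt arrives before the branch
print(core.z)                      # 0: the pending branch now goes the wrong way
```

With `save_flags=True` the flags seen after the interrupt are the ones the main program set, which is why the special return instruction would need to restore c and z along with the PC, or the handler needs instructions to save and restore them itself.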
Yes, it really fails without any explanation at all. No errors nothing...
Anyway, I created a new project and copied the sources to that. Guess
what, it works again... For some reason the problem was project oriented,
and I can't explain that.

/Johan


Benjamin Todd wrote:
> Does it really fail? or have you misinterpreted what it said, because it
> looks ok...
> The related logic is not a concern in this case as far as I can see. ISE
> always adds an asterisk and a note about related logic IIRC.
> Ben
>
> "Johan Bernspång" <xjohbex@xfoix.se> wrote in message
> news:ece9jc$mok$1@mercur.foi.se...
>> Hi all,
>>
>> I came back from vacation yesterday, and full of ideas I started to work
>> where I left before the holidays. The first thing that happens is that I
>> can't map my design anymore. The vhdl is exactly the same as earlier, and
>> the only difference in the map report is the lines regarding related and
>> unrelated logic below:
>>
>> Design Summary
>> --------------
>> Number of errors:      0
>> Number of warnings:  130
>> Logic Utilization:
>>   Number of Slice Flip Flops:       7,351 out of  21,504   34%
>>   Number of 4 input LUTs:           5,267 out of  21,504   24%
>> Logic Distribution:
>>   Number of occupied Slices:        5,505 out of  10,752   51%
>>   Number of Slices containing only related logic:   5,505 out of   5,505  100%
>>   Number of Slices containing unrelated logic:          0 out of   5,505    0%
>>         *See NOTES below for an explanation of the effects of unrelated logic
>> Total Number 4 input LUTs:          6,045 out of  21,504   28%
>>   Number used as logic:             5,267
>>   Number used as a route-thru:        310
>>   Number used as Shift registers:     468
>>
>>   Number of bonded IOBs:              132 out of     456   28%
>>     IOB Flip Flops:                     5
>>     IOB Master Pads:                   55
>>     IOB Slave Pads:                    55
>>     IOB Dual-Data Rate Flops:          26
>>   Number of Block RAMs:                42 out of      56   75%
>>   Number of MULT18X18s:                40 out of      56   71%
>>   Number of GCLKs:                      5 out of      16   31%
>>   Number of DCMs:                       1 out of       8   12%
>>   Number of BSCANs:                     1 out of       1  100%
>>
>>   Number of RPM macros:                26
>> Total equivalent gate count for design:  3,057,923
>> Additional JTAG gate count for IOBs:  6,336
>> Peak Memory Usage:  266 MB
>>
>> Following the design summary is a note regarding related logic.
>>
>> In my old map report none of this related logic stuff is present. But I
>> can't really understand why the mapper fails due to this since none of the
>> logic is unrelated.
>>
>> I am pretty sure that my code hasn't changed during my vacation. Does ISE
>> know that there is a new version out and it wants me to upgrade? ;-)
>>
>> Has anyone experienced this before?
>>
>> Regards
>> Johan
>>
>> --
>> -----------------------------------------------
>> Johan Bernspång, xjohbex@xfoix.se
>> Research engineer
>>
>> Swedish Defence Research Agency - FOI
>> Division of Command & Control Systems
>> Department of Electronic Warfare Systems
>>
>> www.foi.se
>>
>> Please remove the x's in the email address if
>> replying to me personally.
>> -----------------------------------------------
>
>

--
-----------------------------------------------
Johan Bernspång, xjohbex@xfoix.se
Research engineer

Swedish Defence Research Agency - FOI
Division of Command & Control Systems
Department of Electronic Warfare Systems

www.foi.se

Please remove the x's in the email address if
replying to me personally.
-----------------------------------------------

Article: 106897
Hmm, ISE is fantastic for bizarre things like that. If it happens again, click on Project -> Clean Up Project Files... This usually removes all the junk from your project folder. Ben "Johan Bernspång" <xjohbex@xfoix.se> wrote in message news:ecepf4$mok$2@mercur.foi.se... > Yes, it really fails without any explanation at all. No errors nothing... > Anyway, I created a new project and copied the sources to that. Guess > what, it works again... For some reason the problem was project oriented, > and I can't explain that. > > /Johan > > > Benjamin Todd wrote: >> Does it really fail? or have you misinterpreted what it said, because it >> looks ok... >> The related logic is not a concern in this case as far as I can see. ISE >> always adds an asterisk and a note about related logic IIRC. >> Ben >> >> "Johan Bernspång" <xjohbex@xfoix.se> wrote in message >> news:ece9jc$mok$1@mercur.foi.se... >>> Hi all, >>> >>> I came back from vacation yesterday, and full of ideas I started to work >>> where I left before the holidays. The first thing that happens is that I >>> can't map my design anymore. 
The vhdl is exactly the same as earlier, >>> and the only difference in the map report is the lines regarding related >>> and unrelated logic below: >>> >>> Design Summary >>> -------------- >>> Number of errors: 0 >>> Number of warnings: 130 >>> Logic Utilization: >>> Number of Slice Flip Flops: 7,351 out of 21,504 34% >>> Number of 4 input LUTs: 5,267 out of 21,504 24% >>> Logic Distribution: >>> Number of occupied Slices: 5,505 out of 10,752 51% >>> Number of Slices containing only related logic: 5,505 out of 5,505 >>> 100% >>> Number of Slices containing unrelated logic: 0 out of 5,505 >>> 0% >>> *See NOTES below for an explanation of the effects of unrelated >>> logic >>> Total Number 4 input LUTs: 6,045 out of 21,504 28% >>> Number used as logic: 5,267 >>> Number used as a route-thru: 310 >>> Number used as Shift registers: 468 >>> >>> Number of bonded IOBs: 132 out of 456 28% >>> IOB Flip Flops: 5 >>> IOB Master Pads: 55 >>> IOB Slave Pads: 55 >>> IOB Dual-Data Rate Flops: 26 >>> Number of Block RAMs: 42 out of 56 75% >>> Number of MULT18X18s: 40 out of 56 71% >>> Number of GCLKs: 5 out of 16 31% >>> Number of DCMs: 1 out of 8 12% >>> Number of BSCANs: 1 out of 1 100% >>> >>> Number of RPM macros: 26 >>> Total equivalent gate count for design: 3,057,923 >>> Additional JTAG gate count for IOBs: 6,336 >>> Peak Memory Usage: 266 MB >>> >>> Following the design summary is a note regarding related logic. >>> >>> In my old map report none of this related logic stuff is present. But I >>> can't really understand why the mapper fails due to this since none of >>> the logic is unrelated. >>> >>> I am pretty sure that my code hasn't changed during my vacation. Does >>> ISE know that there is a new version out and it wants me to upgrade? ;-) >>> >>> Has anyone experienced this before? 
>>> >>> Regards >>> Johan >>> >>> -- >>> ----------------------------------------------- >>> Johan Bernspång, xjohbex@xfoix.se >>> Research engineer >>> >>> Swedish Defence Research Agency - FOI >>> Division of Command & Control Systems >>> Department of Electronic Warfare Systems >>> >>> www.foi.se >>> >>> Please remove the x's in the email address if >>> replying to me personally. >>> ----------------------------------------------- >> >> > > > -- > ----------------------------------------------- > Johan Bernspång, xjohbex@xfoix.se > Research engineer > > Swedish Defence Research Agency - FOI > Division of Command & Control Systems > Department of Electronic Warfare Systems > > www.foi.se > > Please remove the x's in the email address if > replying to me personally. > -----------------------------------------------Article: 106898
Göran Bilski wrote: > So will you have instructions that saves the C,Z values? > Imagine doing a cmp instruction and after that you take an interrupt, > the interrupt handler will also use these flags so when you return the > interrupted program will use the wrong values. Yes, I think two instructions, one copying r1 to the flags register and one copying the flags register to r1, will be sufficient, a little bit like the 6502's txs and tsx. -- Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.deArticle: 106899
Well, I tried that too, but it didn't help in this case. It seems like the problem was connected to the project file itself... Beats me... /Johan Benjamin Todd wrote: > Hmm, ISE is fantastic for bizzare things like that.. > If it happens again click on Project -> Clean Up Project Files... > This usually removes all the junk from your project folder. > Ben > > "Johan Bernspång" <xjohbex@xfoix.se> wrote in message > news:ecepf4$mok$2@mercur.foi.se... >> Yes, it really fails without any explanation at all. No errors nothing... >> Anyway, I created a new project and copied the sources to that. Guess >> what, it works again... For some reason the problem was project oriented, >> and I can't explain that. >> >> /Johan >> >> >> Benjamin Todd wrote: >>> Does it really fail? or have you misinterpreted what it said, because it >>> looks ok... >>> The related logic is not a concern in this case as far as I can see. ISE >>> always adds an asterisk and a note about related logic IIRC. >>> Ben >>> >>> "Johan Bernspång" <xjohbex@xfoix.se> wrote in message >>> news:ece9jc$mok$1@mercur.foi.se... >>>> Hi all, >>>> >>>> I came back from vacation yesterday, and full of ideas I started to work >>>> where I left before the holidays. The first thing that happens is that I >>>> can't map my design anymore. 
The vhdl is exactly the same as earlier, >>>> and the only difference in the map report is the lines regarding related >>>> and unrelated logic below: >>>> >>>> Design Summary >>>> -------------- >>>> Number of errors: 0 >>>> Number of warnings: 130 >>>> Logic Utilization: >>>> Number of Slice Flip Flops: 7,351 out of 21,504 34% >>>> Number of 4 input LUTs: 5,267 out of 21,504 24% >>>> Logic Distribution: >>>> Number of occupied Slices: 5,505 out of 10,752 51% >>>> Number of Slices containing only related logic: 5,505 out of 5,505 >>>> 100% >>>> Number of Slices containing unrelated logic: 0 out of 5,505 >>>> 0% >>>> *See NOTES below for an explanation of the effects of unrelated >>>> logic >>>> Total Number 4 input LUTs: 6,045 out of 21,504 28% >>>> Number used as logic: 5,267 >>>> Number used as a route-thru: 310 >>>> Number used as Shift registers: 468 >>>> >>>> Number of bonded IOBs: 132 out of 456 28% >>>> IOB Flip Flops: 5 >>>> IOB Master Pads: 55 >>>> IOB Slave Pads: 55 >>>> IOB Dual-Data Rate Flops: 26 >>>> Number of Block RAMs: 42 out of 56 75% >>>> Number of MULT18X18s: 40 out of 56 71% >>>> Number of GCLKs: 5 out of 16 31% >>>> Number of DCMs: 1 out of 8 12% >>>> Number of BSCANs: 1 out of 1 100% >>>> >>>> Number of RPM macros: 26 >>>> Total equivalent gate count for design: 3,057,923 >>>> Additional JTAG gate count for IOBs: 6,336 >>>> Peak Memory Usage: 266 MB >>>> >>>> Following the design summary is a note regarding related logic. >>>> >>>> In my old map report none of this related logic stuff is present. But I >>>> can't really understand why the mapper fails due to this since none of >>>> the logic is unrelated. >>>> >>>> I am pretty sure that my code hasn't changed during my vacation. Does >>>> ISE know that there is a new version out and it wants me to upgrade? ;-) >>>> >>>> Has anyone experienced this before? 
>>>> >>>> Regards >>>> Johan >>>> >>>> -- >>>> ----------------------------------------------- >>>> Johan Bernspång, xjohbex@xfoix.se >>>> Research engineer >>>> >>>> Swedish Defence Research Agency - FOI >>>> Division of Command & Control Systems >>>> Department of Electronic Warfare Systems >>>> >>>> www.foi.se >>>> >>>> Please remove the x's in the email address if >>>> replying to me personally. >>>> ----------------------------------------------- >>> >> >> -- >> ----------------------------------------------- >> Johan Bernspång, xjohbex@xfoix.se >> Research engineer >> >> Swedish Defence Research Agency - FOI >> Division of Command & Control Systems >> Department of Electronic Warfare Systems >> >> www.foi.se >> >> Please remove the x's in the email address if >> replying to me personally. >> ----------------------------------------------- > > -- ----------------------------------------------- Johan Bernspång, xjohbex@xfoix.se Research engineer Swedish Defence Research Agency - FOI Division of Command & Control Systems Department of Electronic Warfare Systems www.foi.se Please remove the x's in the email address if replying to me personally. -----------------------------------------------