On Fri, 20 Mar 2009 12:15:23 -0700 (PDT), jleslie48 wrote:

>but I cant seem to compare these two chips, the VIRTEX II vs the
>XCR3512XL-12-PQ208
>its apples to oranges, how does it work?

It really IS apples and oranges. The structures are so different that you can't expect design size considerations to map meaningfully from one to the other.

CPLDs excel at wide decoding functions. Each logic cell (typically, but not necessarily, including a flip-flop) computes the logical OR of a bunch of "product terms". Each product term is the logical AND of any selection of a very large number of signals - resource cost is driven by number of PTs, NOT by the number of inputs to a given PT. PT inputs are basically free.

FPGAs have a much higher ratio of flip-flops to logic. The basic logic function is a 4-input lookup table, i.e. any logical function you care - but only with 4 inputs. Wide AND functions and decodes are quite expensive of area. Inputs to a logic function are expensive as soon as you have more than 4 of them. (OK, it might be 5 or 6 in some newer devices.)

Of course, synthesis tools will cram your specified VHDL functionality into either, and will get reasonably good optimization in either. But to say "adding this function made my FPGA only 20% bigger, so why does it make my CPLD 40% bigger?" is a question with no sensible answer. The devil is in the detail.

--
Jonathan Bromley, Consultant
DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services
Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com
The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.

Article: 139101
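The cost difference described above can be made concrete with a small decode. This is only an illustrative sketch: the entity name `wide_decode` and the address value are made up, and the resource figures in the comments are back-of-envelope estimates, not tool output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: detect one specific 16-bit address.
entity wide_decode is
  port (
    addr : in  std_logic_vector(15 downto 0);
    hit  : out std_logic
  );
end entity wide_decode;

architecture rtl of wide_decode is
begin
  -- Logically this is a single 16-input AND with some inputs inverted.
  --   CPLD: one product term, provided all 16 signals reach the
  --         function block.
  --   FPGA with 4-input LUTs: at least five LUTs in two levels
  --         (four LUTs of 4 inputs each, plus one to combine them),
  --         unless the tool finds a cheaper mapping.
  hit <= '1' when addr = x"A5C3" else '0';
end architecture rtl;
```

Widening `addr` further leaves the CPLD cost at one product term until the function block runs out of inputs, while the LUT count keeps growing with the width.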
On 20 Mar, 03:08, Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
> In article <1478a4ab-62d3-4e9d-b1a3-769f84726...@j8g2000yql.googlegroups.com>,
> Jacko <jackokr...@gmail.com> wrote:
> >hi
> >
> >Chuck and the poe are in the design lab,
> >
> >Chuck sys to pope "Have you got a rubber, my designs are getting big?"
> >Pope says "That's a bit RISCy!"
> >
> >So if you had possiblly 4 instructions to do stack init pointers and
> >save both aswell, what would you use?
>
> Do you ever read over your posts before submission?
> Do you have a spelling checker?
>
> >cheers jacko
>
> --
> Albert van der Horst, UTRECHT,THE NETHERLANDS
> Economic growth -- like all pyramid schemes -- ultimately falters.
> albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Darwin, Mutation and the Death of a Lnguage via Stagnation
==========================================================

Oh yo evil looking mutated word, you is dead right, no sexy none. Me bee's full oxford smili life, gets me the beer token and enduf this wanton easo-speak.

cheers jacko

Article: 139102
jleslie48 wrote:

>> any idea on how to make it fit?

If it has to be that device, I would need two of them.

> how can I find out who is the piggy, and what can I due to trim things
> down?

Synthesis has already done the trimming. The device is too small.

> but I cant seem to compare these two chips, the VIRTEX II vs the
> XCR3512XL-12-PQ208
> its apples to oranges, how does it work?

Rerun synthesis and check the % utilization

-- Mike Treseler

Article: 139103
Jonathan Bromley <jonathan.bromley@mycompany.com> wrote:
(big snip)

> FPGAs have a much higher ratio of flip-flops to logic. The
> basic logic function is a 4-input lookup table, i.e. any
> logical function you care - but only with 4 inputs. Wide
> AND functions and decodes are quite expensive of area.
> Inputs to a logic function are expensive as soon as you
> have more than 4 of them. (OK, it might be 5 or 6 in
> some newer devices.)

I think some can use the carry chain logic to do wide AND (or NOR if inverted) logic. Maybe still not quite the same as CPLD (PAL) logic, but fast and not too expensive.

-- glen

Article: 139104
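One portable way to nudge a wide AND towards the carry chain is to write it as an addition: the carry out of `a + 1` is '1' exactly when every bit of `a` is '1'. The sketch below assumes nothing about a particular device. The entity name `wide_and_carry` is made up, and whether a given synthesis tool really keeps this on the carry chain, rather than flattening it back into LUTs, is tool-dependent.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: wide AND expressed as the carry out of an increment.
entity wide_and_carry is
  generic (WIDTH : positive := 32);
  port (
    a       : in  std_logic_vector(WIDTH-1 downto 0);
    all_one : out std_logic   -- '1' when every bit of a is '1'
  );
end entity wide_and_carry;

architecture rtl of wide_and_carry is
  signal sum : unsigned(WIDTH downto 0);  -- one extra bit holds the carry out
begin
  -- Zero-extend a by one bit, then add 1. The top bit of the result is
  -- set only when a = 2**WIDTH - 1, i.e. when all bits of a are '1'.
  sum     <= resize(unsigned(a), WIDTH + 1) + 1;
  all_one <= sum(WIDTH);
end architecture rtl;
```

Inverting `a` before the addition gives the wide NOR glen mentions.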
In article <1478a4ab-62d3-4e9d-b1a3-769f847260a2@j8g2000yql.googlegroups.com>,
Jacko <jackokring@gmail.com> wrote:
> So if you had possiblly 4 instructions to do stack init pointers and
> save both aswell, what would you use?
> ------------

Previously Jacko wrote:
> You will find the subroutine RI FI RO SO BA does the same as
> the forth word lit.
>
> == subroutine to do move #$222 to address $665 ==
> $lit ,
> $666 , <-- ? typo $665 <> $666 or is this for the auto-decr ?
> $lit ,
> $222 ,
> SI FA FO BA
=========

Does $lit , $666 , $lit , $222 mean:
"push $666, push $222" ?

Then SI FA FO BA ==
(S)->A ; (S)->Q ; A->(Q) ; (R)->P == ?

If it's a stack-machine, then which register is the TOS-pointer ?

Let's try to work backwards:-
(R)->P should move #$222 to address $665?6
==> P == address $665 [or $666]
and
R pointed to mem-containing #$222

So, how did $lit , $222 [ push $222] get $222 into mem pointed to by R ?
-------------
Perhaps this is all obvious for someone with a VHDL design background, but this forth-group has just degenerated into clowning, with this thread.
Can anybody contribute any knowledge/interpretation to how this nibz works ?
-------------

someone wrote:
> > How many bits are in an opcode, 4 or 5?
> > I would say it has to be five
> > or there is no way for the machine to distinguish between
> > an opcode and an address. In other words, there
> > *has* to be a CALL instruction,
> > even if it is just a one bit opcode with the rest being the address.

Jacko wrote:
> if(instructionRegister<16) doOpcode(instructionRegister) else
> doSubroutine(instructionRegister);

IMO this corresponds to the wiki:

> For this example we will take the most complex to understand instruction BO.
> Details
>
> BO ((S) AND A) mul 2 -> A
>
> This translates as BOth instruction: Load the indirect contents of
> memory indexed by S and 'and' it with A.
> Then shift this left one bit through the CarryRollBit and save this
> result into A.
> Also increase S by 1 to do the post increment indexing.

So, '16 basic instructions' [need 4 bits] and the BO instruction, would get the BasicInstr/Subroutine 1-bit-flag.
This would imply a 5bit wide word, which is obviously not the case ?
-------------
who can understand/explain this:--
>There are no conditional instructions, SU of the carry to (R)
>is the major branching method.
...
> * SU (S)+A->A - SUm (CarryRollBit added too)
---
Since of the 16 instructions, the only ALU types: +, xor, and; use 'A' as one would expect an accumulator to be used [as a source & destination for the binary op.], 'A' *is* the accumulator !!
Similarly 'S' is seen to be the TOS-pointer.
---
wiki says:
> All opcodes above 15 are subroutine call addresses.

OK, so for an 8 bit wide word, you get 256-16 possible subroutines ?
And the subroutines also use the basic 16 instructions, including possibly nested-subroutine/s ?
No, because the first-level of subroutines has already been allocated the 256-16 subroutine-pointers ?
Ok, you can only have 256-16 *different* subroutines, but they can be nested - limited only by RAM to hold a stacked-returns ?

== Chris Glur.

Article: 139105
On 20 Mar, 15:03, VAX9...@gmail.com wrote:
> Dear FPGA users,
>
> This news group is valuable to all of us. I found that Google Group
> now allows users to report spam. I suggest we all take a few seconds
> to report spams on this comp.misc.fpga group. The result is yet to
> see, but we can start now.
>
> Just click into the spam message, then click "options" in the subject
> line, then "more options" in author line. I simply describe the reason
> as "This message is spam". I hope this helps. Thank you!

I have been reporting spam on other groups and, after a long time, the senders' Google accounts were removed. I think it took about four to eight weeks.

I send this to reassure you that they do seem to take notice eventually, but patience is required.

James

Article: 139106
On 20 Mar, 21:54, AliB...@gmail.com wrote:
> In article <1478a4ab-62d3-4e9d-b1a3-769f84726...@j8g2000yql.googlegroups.com>,
> Jacko <jackokr...@gmail.com> wrote:
> > So if you had possiblly 4 instructions to do stack init pointers and
> > save both aswell, what would you use?
> > ------------
> Previously Jacko wrote:
> > You will find the subroutine RI FI RO SO BA does the same as
> > the forth word lit.
> >
> > == subroutine to do move #$222 to address $665 ==
> > $lit ,
> > $666 , <-- ? typo $665 <> $666 or is this for the auto-decr ?

Yes, it sure is.

> > $lit ,
> > $222 ,
> > SI FA FO BA
> > =========
> Does $lit , $666 , $lit , $222 mean:
> "push $666, push $222" ?

Yes, when lit is defined as the subroutine equivalent to lit.

> Then SI FA FO BA ==
> (S)->A ; (S)->Q ; A->(Q) ; (R)->P == ?

I would think so!! TOS->A, TOS(2)->Q, STORE TOS -> TOS(2) ADDRESS, GET RETURN TO PROGRAM COUNTER

> If it's a stack-machine, then which register
> is the TOS-pointer ?

S, but a top of stack optimization using A may be possible, for speed, but lower space efficiency.

> Let's try to work backwards:-
> (R)->P should move #$222 to address $665?6
> ==> P == address $665 [or $666]
> and
> R pointed to mem-containing #$222
>
> So, how did $lit , $222 [ push $222] get
> $222 into mem pointed to by R ?
> -------------

RI FI SO RO BA, where SO RO commutes to RO SO as a duplicate expression for the same function.

(R)->Q,(Q)->A,A->(S),Q->(R),(R)->P

Get the return address, get the indirect next address following the return address and save this on the stack, put the incremented return address back on the return stack, and get the return address (modified by +1) into the program counter to execute a return to the address following the literal value.

> Perhaps this is all obvious for someone with a
> VHDL design background, but this forth-group has
> just degenerated into clowning, with this thread.

;-)

> Can anybody contribute any knowledge/interpretation
> to how this nibz works ?
> -------------
>
> someone wrote:
> > > How many bits are in an opcode, 4 or 5?
> > > I would say it has to be five
> > > or there is no way for the machine to distinguish between
> > > an opcode and an address. In other words, there
> > > *has* to be a CALL instruction,
> > > even if it is just a one bit opcode with the rest being the address.
> Jacko wrote:
> > if(instructionRegister<16) doOpcode(instructionRegister) else
> > doSubroutine(instructionRegister);
>
> IMO this corresponds to the wiki:
>
> > For this example we will take the most complex to understand instruction BO.
> > Details
> >
> > BO ((S) AND A) mul 2 -> A
> >
> > This translates as BOth instruction: Load the indirect contents of
> > memory indexed by S and 'and' it with A.
> > Then shift this left one bit through the CarryRollBit and save this
> > result into A.
> > Also increase S by 1 to do the post increment indexing.
>
> So, '16 basic instructions' [need 4 bits] and the BO instruction,
> would get the BasicInstr/Subroutine 1-bit-flag.

No, a basic instruction is a number under 16; any number over 15 is an address. This does have the disadvantage of not being able to call a subroutine below address 16, but this is not a major fault, as boot code would be here, and it is possible to place a subroutine call instruction within these addresses.

> This would imply a 5bit wide word, which is obviously not
> the case ?

The implication of the extra bit 'needed' is not a true account of functioning.

> -------------
> who can understand/explain this:--
> >There are no conditional instructions, SU of the carry to (R)
> >is the major branching method.
> ...
> > * SU (S)+A->A - SUm (CarryRollBit added too)

(R)->Q,Q->(S),(S)->A,A+(S)->A,A->(S),(S)->Q,Q->(R),(R)->P

> ---
> Since of the 16 instructions, the only ALU types: +, xor, and;
> use 'A' as one would expect an accumulator to be used
> [as a source & destination for the binary op.], 'A' *is* the
> accumulator !! Similarly 'S' is seen to be the TOS-pointer.
> ---
> wiki says:
> > All opcodes above 15 are subroutine call addresses.
>
> OK, so for an 8 bit wide word, you get 256-16 possible subroutines ?
> And the subroutines also use the basic 16 instructions, including
> possibly nested-subroutine/s ?
> No, because the first-level of subroutines has already been allocated
> the 256-16 subroutine-pointers ?
> Ok, you can only have 256-16 *different* subroutines, but they can be
> nested - limited only by RAM to hold a stacked-returns ?
>
> == Chris Glur.

Hope this helps.

Cheers
jacko

Article: 139107
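The "under 16 is an opcode, anything else is a call address" rule can be written as a small piece of decode logic, which also shows why no fifth flag bit is needed. This is only a sketch of the rule as stated in the thread, not the actual nibz source: the entity name, the port names and the 16-bit word width are all assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical decode stage; the real design's names and word width may differ.
entity nibz_decode_sketch is
  generic (WORD_BITS : positive := 16);  -- assumed word width, must exceed 4
  port (
    ir        : in  std_logic_vector(WORD_BITS-1 downto 0);  -- fetched word
    is_opcode : out std_logic;                               -- '1' for a basic instruction
    opcode    : out std_logic_vector(3 downto 0);            -- valid when is_opcode = '1'
    call_addr : out std_logic_vector(WORD_BITS-1 downto 0)   -- valid when is_opcode = '0'
  );
end entity nibz_decode_sketch;

architecture rtl of nibz_decode_sketch is
begin
  -- A word below 16 has all of its upper bits at '0', so the test is just
  -- a wide NOR over ir(WORD_BITS-1 downto 4). No separate flag bit is stored.
  is_opcode <= '1' when unsigned(ir) < 16 else '0';
  opcode    <= ir(3 downto 0);
  call_addr <= ir;  -- the whole word is the subroutine address
end architecture rtl;
```

The price, as Jacko notes, is that addresses 0 to 15 cannot be called as subroutines, because those values are taken by the opcodes.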
Well this is a bummer. Here I think I'm being careful, working things out with test bench, re-building all the while and checking for growth, only to be blind sided when I hook up the pin to the generated signal.

Meantime I've got some more info and questions.

1)
>> any idea on how to make it fit?
> If it has to be that device, I would need two of them.

That chip we are getting for around $100, I don't even know where to buy them, and where do I get a Virtex II-PRO chip? Digikey says they are $1000?? Wouldn't I be better off getting the Virtex II-PRO? Meantime the old chip is mounted on a custom board layout, I guess my hardware guys are going to have to re-lay out the board with two of these chips? That won't work anyway as I added 600 macrocells to a chip that only had 512 to begin with... I think I was able to chip that down to under 512 though, but not by much.

2) What exactly is a macrocell anyway?

3) "Rerun synthesis and check the % utilization"

That's what I've been doing. Basically I added the equivalent of a soft uart and the data generator state machine that Jonathan so kindly set me up with. So I started backing out that code bit by bit to see where I pop the %s. When I took out the data generator I only freed up a few macrocells. I tried reducing the fixed buffers, and that again only freed up some of the small weeds. When I deleted the UART however, it cleared up the whole mess. But now I don't know if that's because I've effectively dead ended other parts... I'm still driving the output now, but I can't believe my little uart is that big a deal. I'm wondering if there is an expense in the separate modules, and instantiations, or maybe the 'reverse' function.

My next step is to start stubbing out sections and see what causes the growth. It would be nice if in all those reports that get generated they assigned macrocells/pterms etc back to the source code that generated them.
Directory of C:\jon\oats

03/20/2009  10:36 AM     19,460 data_gen_40.vhd
03/19/2009  03:48 PM      8,795 OATS_TOP.ucf
03/20/2009  04:56 PM    263,536 OATS_Top.vhd
03/09/2009  11:21 AM        941 mod_m_counter.vhd
03/04/2009  05:23 PM      3,486 fifo.vhd
03/19/2009  03:23 PM      5,635 oats_top_tb.vhd
03/20/2009  01:10 PM      3,851 uart_core40.vhd
03/12/2009  09:22 AM      2,756 uart_rx40.vhd
03/12/2009  12:31 PM      3,734 uart_tx40.vhd

Now clearly source size doesn't make much difference, the OATS_TOP.VHD program is ridiculously big, but as shown above used reasonable amounts of resources.

I added some clocking and counters to OATS_TOP:

------------------------------
FUNCTION to_slv (c: character) RETURN STD_LOGIC_VECTOR IS
BEGIN
   RETURN std_logic_vector(to_unsigned(character'pos(c), 8));
END;

FUNCTION reverse (a : IN STD_LOGIC_VECTOR) RETURN STD_LOGIC_VECTOR IS
   VARIABLE result : STD_LOGIC_VECTOR(a'RANGE);
   ALIAS aa : STD_LOGIC_VECTOR(a'REVERSE_RANGE) IS a;
BEGIN
   FOR i IN aa'RANGE LOOP
      result(i) := aa(i);
   END LOOP;
   RETURN result;
END;

...
clock_4hz: PROCESS ( system_clock_used )
BEGIN
   IF ( rising_edge(system_clock_used)) then
      IF ( clk_4hz_countdown = 0) THEN
         clk_4hz_countdown <= human_clock_count;
         clk_4hz <= NOT clk_4hz;
      else
         clk_4hz_countdown <= clk_4hz_countdown -1;
      End if; -- countdown ife
   END IF;
END PROCESS clock_4hz;

clock_2mhz: PROCESS ( system_clock_used )
BEGIN
   IF ( rising_edge(system_clock_used)) then
      IF ( clk_2mhz_countdown = 0) THEN
         clk_2mhz_countdown <= clk_2mhz_clock_count; -- base clock choice
         clk_2mhz <= NOT clk_2mhz;
      else
         clk_2mhz_countdown <= clk_2mhz_countdown -1;
      End if; -- countdown ife
   END IF;
END PROCESS clock_2mhz;

clock_2mhz_ctr: PROCESS ( clk_2mhz )
BEGIN
   IF ( rising_edge(clk_2mhz)) then
      time_cntr_500ns <= time_cntr_500ns +1;
   END IF;
END PROCESS clock_2mhz_ctr;

-- this clock is not uart related
clock_7812hz: PROCESS ( system_clock_used )
BEGIN
   IF ( rising_edge(system_clock_used)) then
      IF ( clk_7812hz_countdown = 0) THEN
         clk_7812hz_countdown <= clk_7812hz_clock_count; -- base clock choice
         clk_7812hz <= NOT clk_7812hz;
         if (clk_7812hz = '0') Then
            clk_7812hz_tick <= '1';
         end if;
      else
         clk_7812hz_tick <= '0';
         clk_7812hz_countdown <= clk_7812hz_countdown -1;
         if ( (initialize_done = '0' ) AND (clk_7812hz_countdown = 1276) ) Then
            initialize_done <= '1';
            initialize_data_gen <= '1';
         else
            initialize_data_gen <= '0';
         end if; -- initialize_done
      End if; -- countdown ife
   END IF;
END PROCESS clock_7812hz;

clock_7812_ctr: PROCESS ( clk_7812hz )
BEGIN
   IF ( rising_edge(clk_7812hz)) then
      time_cntr_128us <= time_cntr_128us +1;
      uptime_at_128us <= time_cntr_500ns;
      a2mhz_parity_plus(7) <=
         ( (time_cntr_500ns(39) xor time_cntr_500ns(38)) xor
           (time_cntr_500ns(37) xor time_cntr_500ns(36)) ) xor
         ( (time_cntr_500ns(35) xor time_cntr_500ns(34)) xor
           (time_cntr_500ns(33) xor time_cntr_500ns(32)) );
      a2mhz_parity_plus(6) <=
         ( (time_cntr_500ns(31) xor time_cntr_500ns(30)) xor
           (time_cntr_500ns(29) xor time_cntr_500ns(28)) ) xor
         ( (time_cntr_500ns(27) xor time_cntr_500ns(26)) xor
           (time_cntr_500ns(25) xor time_cntr_500ns(24)) );
      a2mhz_parity_plus(5) <=
         ( (time_cntr_500ns(23) xor time_cntr_500ns(22)) xor
           (time_cntr_500ns(21) xor time_cntr_500ns(20)) ) xor
         ( (time_cntr_500ns(19) xor time_cntr_500ns(18)) xor
           (time_cntr_500ns(17) xor time_cntr_500ns(16)) );
      a2mhz_parity_plus(4) <=
         ( (time_cntr_500ns(15) xor time_cntr_500ns(14)) xor
           (time_cntr_500ns(13) xor time_cntr_500ns(12)) ) xor
         ( (time_cntr_500ns(11) xor time_cntr_500ns(10)) xor
           (time_cntr_500ns(09) xor time_cntr_500ns(08)) );
      a2mhz_parity_plus(3) <=
         ( (time_cntr_500ns(07) xor time_cntr_500ns(06)) xor
           (time_cntr_500ns(05) xor time_cntr_500ns(04)) ) xor
         ( (time_cntr_500ns(03) xor time_cntr_500ns(02)) xor
           (time_cntr_500ns(01) xor time_cntr_500ns(00)) );
   END IF;
END PROCESS clock_7812_ctr;

data_message_handler: PROCESS ( system_clock_used)
BEGIN
   IF ( rising_edge(system_clock_used) ) THEN
      if ( (w40_wanted = '1' ) and (clk_7812hz_tick = '1') ) then
         w40_data_from_main <= reverse( a2mhz_optional_message &
                                        std_logic_vector(uptime_at_128us) &
                                        a2mhz_parity_plus &
                                        a2mhz_optional_message ) ; -- big endian for BAE.
         w40_ready <= '1';
      else
         w40_ready <= '0';
      end if; -- w40_wanted ite
   else
      --w40_ready <= '0';
   END IF; --clock edge
END PROCESS data_message_handler;

---------------------
--2mhz communications uart begin
-- instantiate uart
a2mhz_uart_unit: entity work.uart40(str_arch)
   generic map (
      dbit     => a2mhz_data_bit_count,
      sb_tick  => a2mhz_clock_tick_per_sampling_rate,
      dvsr     => a2mhz_baud_rate_divisor,
      dvsr_bit => 2, -- number of bits necessary to hold dvsr
      FIFO_W   => 2  -- 2**(value) is the number of chars that can be queued.
   ) -- generic map
   port map(
      clk            => system_clock_used,
      reset          => initialize_data_gen,
      rd_uart        => a2mhz_RX_READ_BUFFER_STB,
      wr_uart        => a2mhz_TX_WRITE_BUFFER_STB,
      rx             => a2mhz_HUART_RX_LINE,
      w_data         => a2mhz_TX_1CHAR_BUF,
      tx_full        => a2mhz_TX_BUFFER_FULL,
      rx_empty       => open,
      rx_not_empty   => a2mhz_RX_BUFFER_DATA_PRESENT,
      r_data         => a2mhz_RX_1CHAR_BUF,
      tx             => a2mhz_HUART_TX_LINE,
      baud_rate_tick => a2mhz_UART_EN_16_x_BAUD
   );

a2mhz_DATA_GENERATOR: entity work.data_gen_40
   generic map
      ( PC_bits => 5
      , dbit => a2mhz_data_bit_count
      , the_program =>
         -- Long startup delay
         op40_DELAY & 200 & --2 bytes long
         op40_LABEL & 04 &
         op40_WAIT_FOR_W40 &
         op40_GOTOL & 04 & -- spin on printing W40's from here on in.
         op40_HALT
      )
   port map
      ( clock      => system_clock_used
      , reset      => initialize_data_gen
      , timer      => a2mhz_UART_EN_16_x_BAUD
      , tx_data    => a2mhz_TX_1CHAR_BUF
      , tx_valid   => a2mhz_tx_valid
      , tx_ready   => a2mhz_tx_ready
      , rx_data    => a2mhz_RX_1CHAR_BUF
      , lbl_data   => a2mhz_lbl_data_from_main
      , rx_valid   => a2mhz_RX_BUFFER_DATA_PRESENT
      , rx_needed  => a2mhz_rx_wanted
      , reset_out  => open --UART_RESET_BUFFER
      , lbl_needed => a2mhz_lbl_wanted
      , halted     => a2mhz_halted
      , error_cond => a2mhz_error_cond_main
      , w40_data   => w40_data_from_main
      , w40_ready  => w40_ready
      , w40_needed => w40_wanted
      );

-- JSEB: Conditioning of interface signals between UART and data generator
a2mhz_tx_ready <= not a2mhz_TX_BUFFER_FULL;
a2mhz_TX_WRITE_BUFFER_STB <= a2mhz_tx_valid and a2mhz_tx_ready; -- Write only when it's safe
a2mhz_RX_READ_BUFFER_STB <= a2mhz_rx_wanted and a2mhz_RX_BUFFER_DATA_PRESENT;
a2mhz_HUART_TX_CK_LINE <= clk_2mhz;
--2mhz communications uart end
---------------------

Now by removing the a2mhz_uart_unit entity, I return to acceptable levels, the old levels. That leads me to believe I just put all the logic in the unused/don't bother pile and once I put the uart back in, Is the REVERSE function the problem? The uart is pretty simple:

----------Uart_core40.vhd-------------
-- Listing 7.4
--
-- jl 090226 First working. WATCH OUT!!! DVSR_BIT!!! for 19200 baud,
--    the DVSR was 325, and guess what? that means that you need
--    9 bits instead of 8 for the DVSR_BIT, this still synthed ok
--    but generated 175 warnings. changing it to 9 bits (or 115,200 baud means
--    the synth gets through with 5 warnings, and works.
-- 090312 copy and redo of uart_core.vhd. this one is to customize to the to the 40 bit
--    (probably expand to 64) comm port for BAE.
--
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity uart40 is
   generic(
      -- Default setting:
      -- xxx baud, 8 data bis, 1 stop its, 2^2 FIFO
      DBIT : integer:=8;     -- # data bits
      SB_TICK : integer:=16; -- # ticks for stop bits, 16/24/32
                             -- for 1/1.5/2 stop bits
      DVSR : integer:= 325;  -- baud rate divisor
                             -- DVSR = 50M/(16*baud rate(19200)) == 162.76
                             -- 100m/(16*19200) ==325.52
                             -- 100m/(16*115200) ==54.25
      DVSR_BIT : integer:=9; -- # bits of DVSR 325 needs 9 bits!!!!!
      FIFO_W : integer:=2    -- # addr bits of FIFO
                             -- # words in FIFO=2^FIFO_W
   );
   port(
      clk, reset: in std_logic;
      rd_uart, wr_uart: in std_logic;
      rx: in std_logic;
      w_data: in std_logic_vector((dbit-1) downto 0);
      tx_full, rx_empty: out std_logic;
      rx_not_empty: out std_logic;
      r_data: out std_logic_vector((dbit-1) downto 0);
      tx: out std_logic;
      baud_rate_tick: out std_logic
   );
end uart40;

architecture str_arch of uart40 is
   signal tick: std_logic;
   signal rx_done_tick: std_logic;
   signal tx_fifo_out: std_logic_vector((dbit-1) downto 0);
   signal rx_data_out: std_logic_vector((dbit-1) downto 0);
   signal tx_empty, tx_fifo_not_empty: std_logic;
   signal tx_done_tick: std_logic;
begin
   baud_gen_unit: entity work.mod_m_counter(arch)
      generic map(M=>DVSR, N=>DVSR_BIT)
      port map(clk =>clk, reset =>reset,
               q =>open, max_tick =>tick );
   uart40_rx_unit: entity work.uart40_rx(arch)
      generic map(DBIT=>DBIT, SB_TICK=>SB_TICK)
      port map(clk=>clk, reset=>reset, rx=>rx,
               s_tick=>tick, rx_done_tick=>rx_done_tick,
               dout=>rx_data_out);
   fifo_rx_unit: entity work.fifo(arch)
      generic map(B=>DBIT, W=>FIFO_W)
      port map(clk=>clk, reset=>reset, rd=>rd_uart,
               wr=>rx_done_tick, w_data=>rx_data_out,
               empty=>rx_empty, notempty=>rx_not_empty, full=>open,
               r_data=>r_data);
   fifo_tx_unit: entity work.fifo(arch)
      generic map(B=>DBIT, W=>FIFO_W)
      port map(clk=>clk, reset=>reset, rd=>tx_done_tick,
               wr=>wr_uart, w_data=>w_data, empty=>tx_empty,
               notempty=>open,
               full=>tx_full, r_data=>tx_fifo_out);
   uart40_tx_unit: entity work.uart40_tx(arch)
      generic map(DBIT=>DBIT, SB_TICK=>SB_TICK)
      port map(clk=>clk, reset=>reset,
               tx_start=>tx_fifo_not_empty,
               s_tick=>tick, din=>tx_fifo_out,
               tx_done_tick=> tx_done_tick, tx=>tx);
   tx_fifo_not_empty <= not tx_empty;
   baud_rate_tick <= tick;
end str_arch;

-----fifo.vhd-----
-- Listing 4.20
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity fifo is
   generic(
      B: natural:=8; -- number of bits
      W: natural:=4  -- number of address bits
   );
   port(
      clk, reset: in std_logic;
      rd, wr: in std_logic;
      w_data: in std_logic_vector (B-1 downto 0);
      empty, notempty, full: out std_logic;
      r_data: out std_logic_vector (B-1 downto 0)
   );
end fifo;

architecture arch of fifo is
   type reg_file_type is array (2**W-1 downto 0) of
        std_logic_vector(B-1 downto 0);
   signal array_reg: reg_file_type;
   signal w_ptr_reg, w_ptr_next, w_ptr_succ:
      std_logic_vector(W-1 downto 0);
   signal r_ptr_reg, r_ptr_next, r_ptr_succ:
      std_logic_vector(W-1 downto 0);
   signal full_reg, empty_reg, full_next, empty_next:
      std_logic;
   signal wr_op: std_logic_vector(1 downto 0);
   signal wr_en: std_logic;
begin
   --=================================================
   -- register file
   --=================================================
   process(clk,reset)
   begin
      if (reset='1') then
         array_reg <= (others=>(others=>'0'));
      elsif (clk'event and clk='1') then
         if wr_en='1' then
            array_reg(to_integer(unsigned(w_ptr_reg)))
               <= w_data;
         end if;
      end if;
   end process;
   -- read port
   r_data <= array_reg(to_integer(unsigned(r_ptr_reg)));
   -- write enabled only when FIFO is not full
   wr_en <= wr and (not full_reg);
--================================================= -- fifo control logic --================================================= -- register for read and write pointers process(clk,reset) begin if (reset='1') then w_ptr_reg <= (others=>'0'); r_ptr_reg <= (others=>'0'); full_reg <= '0'; empty_reg <= '1'; elsif (clk'event and clk='1') then w_ptr_reg <= w_ptr_next; r_ptr_reg <= r_ptr_next; full_reg <= full_next; empty_reg <= empty_next; end if; end process; -- successive pointer values w_ptr_succ <= std_logic_vector(unsigned(w_ptr_reg)+1); r_ptr_succ <= std_logic_vector(unsigned(r_ptr_reg)+1); -- next-state logic for read and write pointers wr_op <= wr & rd; process(w_ptr_reg,w_ptr_succ,r_ptr_reg,r_ptr_succ,wr_op, empty_reg,full_reg) begin w_ptr_next <= w_ptr_reg; r_ptr_next <= r_ptr_reg; full_next <= full_reg; empty_next <= empty_reg; case wr_op is when "00" => -- no op when "01" => -- read if (empty_reg /= '1') then -- not empty r_ptr_next <= r_ptr_succ; full_next <= '0'; if (r_ptr_succ=w_ptr_reg) then empty_next <='1'; end if; end if; when "10" => -- write if (full_reg /= '1') then -- not full w_ptr_next <= w_ptr_succ; empty_next <= '0'; if (w_ptr_succ=r_ptr_reg) then full_next <='1'; end if; end if; when others => -- write/read; w_ptr_next <= w_ptr_succ; r_ptr_next <= r_ptr_succ; end case; end process; -- output full <= full_reg; empty <= empty_reg; notempty <= not empty_reg; end arch; ------------------mod_m_counter.vhd -- Listing 4.11 library ieee; use ieee.std_logic_1164.all; use ieee.numeric_std.all; entity mod_m_counter is generic( N: integer := 4; -- number of bits M: integer := 10 -- mod-M ); port( clk, reset : in std_logic; max_tick : out std_logic; q : out std_logic_vector(N-1 downto 0) ); end mod_m_counter; architecture arch of mod_m_counter is signal r_reg: unsigned(N-1 downto 0); signal r_next: unsigned(N-1 downto 0); begin -- register process(clk,reset) begin if (reset='1') then r_reg <= (others=>'0'); elsif (clk'event and clk='1') then r_reg <= r_next; 
end if; end process; -- next-state logic r_next <= (others=>'0') when r_reg=(M-1) else r_reg + 1; -- output logic q <= std_logic_vector(r_reg); max_tick <= '1' when r_reg=(M-1) else '0'; end arch; ---------uart_tx40.vhd -- Listing 7.3 -- JL 090309 changing hard coded '15' to (sb_tick-1) for length of -- each bit. hard coded '7' for databits now (dbit-1) as well. -- JL 090312 custom version of uart_tx for the BAE comm link. library ieee; use ieee.std_logic_1164.all; use ieee.numeric_std.all; entity uart40_tx is generic( DBIT: integer:=8; -- # data bits SB_TICK: integer:=16 -- # ticks for stop bits ); port( clk, reset: in std_logic; tx_start: in std_logic; s_tick: in std_logic; din: in std_logic_vector((dbit-1) downto 0); tx_done_tick: out std_logic; tx: out std_logic ); end uart40_tx ; architecture arch of uart40_tx is type state_type is (idle, start, data, stop); constant go_high : std_logic := '1'; constant go_low : std_logic := '0'; signal state_reg, state_next: state_type; signal s_reg, s_next: unsigned(7 downto 0); signal n_reg, n_next: unsigned(7 downto 0); signal b_reg, b_next: std_logic_vector((dbit-1) downto 0); signal tx_reg, tx_next: std_logic; signal bit_length: std_logic := '0'; -- testbench watching only. use with din watch. 
begin
   -- FSMD state & data registers
   process(clk,reset)
   begin
      if reset='1' then
         state_reg <= idle;
         s_reg <= (others=>'0');
         n_reg <= (others=>'0');
         b_reg <= (others=>'0');
         tx_reg <= go_high;
      elsif (clk'event and clk='1') then
         state_reg <= state_next;
         s_reg <= s_next;
         n_reg <= n_next;
         b_reg <= b_next;
         tx_reg <= tx_next;
      end if;
   end process;

   -- next-state logic & data path functional units/routing
   process(state_reg,s_reg,n_reg,b_reg,s_tick,
           tx_reg,tx_start,din)
   begin
      state_next <= state_reg;
      s_next <= s_reg;
      n_next <= n_reg;
      b_next <= b_reg;
      tx_next <= tx_reg ;
      tx_done_tick <= '0';
      case state_reg is
         when idle =>
            tx_next <= go_low;
            if tx_start='1' then
               state_next <= start;
               s_next <= (others=>'0');
               b_next <= din;
            end if;
         when start =>
            tx_next <= go_high;
            if (s_tick = '1') then
               if s_reg=(sb_tick-1) then
                  state_next <= data;
                  s_next <= (others=>'0');
                  n_next <= (others=>'0');
               else
                  s_next <= s_reg + 1;
               end if;
            end if;
         when data =>
            tx_next <= b_reg(0);
            if (s_tick = '1') then
               if s_reg=(sb_tick-1) then
                  bit_length <= not bit_length; -- measure a bit.
                  s_next <= (others=>'0');
                  b_next <= '0' & b_reg((dbit-1) downto 1) ;
                  if n_reg=(DBIT-1) then
                     state_next <= idle; -- stop ; --lets skip the stop bit.
                     tx_done_tick <= '1'; -- moved in from stop
                  else
                     n_next <= n_reg + 1;
                  end if;
               else
                  s_next <= s_reg + 1;
               end if;
            end if;
         when stop =>
            tx_next <= go_high;
            if (s_tick = '1') then
               if s_reg=(SB_TICK*4-1) then -- lets make it stick out for now.
                  state_next <= idle;
                  tx_done_tick <= '1';
               else
                  s_next <= s_reg + 1;
               end if;
            end if;
      end case;
   end process;
   tx <= tx_reg;
end arch;

---------------uart_rx40.vhd ----
-- Listing 7.1
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity uart40_rx is
   generic(
      DBIT: integer:=8;     -- # data bits
      SB_TICK: integer:=16  -- # ticks for stop bits
   );
   port(
      clk, reset: in std_logic;
      rx: in std_logic;
      s_tick: in std_logic;
      rx_done_tick: out std_logic;
      dout: out std_logic_vector((dbit-1) downto 0)
   );
end uart40_rx ;

architecture arch of uart40_rx is
   type state_type is (idle, start, data, stop);
   signal state_reg, state_next: state_type;
   signal s_reg, s_next: unsigned(3 downto 0);
   signal n_reg, n_next: unsigned(2 downto 0);
   signal b_reg, b_next: std_logic_vector((dbit-1) downto 0);
begin
   -- FSMD state & data registers
   process(clk,reset)
   begin
      if reset='1' then
         state_reg <= idle;
         s_reg <= (others=>'0');
         n_reg <= (others=>'0');
         b_reg <= (others=>'0');
      elsif (clk'event and clk='1') then
         state_reg <= state_next;
         s_reg <= s_next;
         n_reg <= n_next;
         b_reg <= b_next;
      end if;
   end process;

   -- next-state logic & data path functional units/routing
   process(state_reg,s_reg,n_reg,b_reg,s_tick,rx)
   begin
      state_next <= state_reg;
      s_next <= s_reg;
      n_next <= n_reg;
      b_next <= b_reg;
      rx_done_tick <='0';
      case state_reg is
         when idle =>
            if rx='0' then
               state_next <= start;
               s_next <= (others=>'0');
            end if;
         when start =>
            if (s_tick = '1') then
               if s_reg=(sb_tick/2 -1) then
                  state_next <= data;
                  s_next <= (others=>'0');
                  n_next <= (others=>'0');
               else
                  s_next <= s_reg + 1;
               end if;
            end if;
         when data =>
            if (s_tick = '1') then
               if s_reg=(sb_tick-1) then
                  s_next <= (others=>'0');
                  b_next <= rx & b_reg((dbit-1) downto 1) ;
                  if n_reg=(DBIT-1) then
                     state_next <= stop ;
                  else
                     n_next <= n_reg + 1;
                  end if;
               else
                  s_next <= s_reg + 1;
               end if;
            end if;
         when stop =>
            if (s_tick = '1') then
               if s_reg=(SB_TICK-1) then
                  state_next <= idle;
                  rx_done_tick <='1';
               else
                  s_next <= s_reg + 1;
               end if;
            end if;
      end case;
   end process;
   dout <= b_reg;
end arch;
Article: 139108
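As a rough cross-check of the uart40_rx listing above, the flip-flops it implies can be tallied from the signals assigned in its clocked process. The sketch below is only an estimate under stated assumptions: the helper name ff_estimate is hypothetical, DBIT is taken at its default of 8, and the state register width depends on the encoding the synthesis tool picks (binary and one-hot are shown).

```python
from math import ceil, log2


def ff_estimate(num_states, register_widths, one_hot=False):
    """Estimate flip-flops as state-register bits plus data-register bits.

    Hypothetical helper: real tools may choose other state encodings
    or remove registers they can prove are unused.
    """
    state_bits = num_states if one_hot else ceil(log2(num_states))
    return state_bits + sum(register_widths)


# Widths read from the uart40_rx listing with DBIT = 8:
# s_reg is unsigned(3 downto 0), n_reg is unsigned(2 downto 0),
# b_reg is std_logic_vector(DBIT-1 downto 0).
RX_REGISTER_WIDTHS = [4, 3, 8]
RX_STATES = 4  # idle, start, data, stop

rx_binary = ff_estimate(RX_STATES, RX_REGISTER_WIDTHS)
rx_one_hot = ff_estimate(RX_STATES, RX_REGISTER_WIDTHS, one_hot=True)
print(rx_binary, rx_one_hot)
```

On this count the receiver alone accounts for only a couple of dozen registers, which suggests the bulk of a much larger flip-flop total would have to come from other parts of a design.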
On Mar 20, 3:15 pm, jleslie48 <j...@jonathanleslie.com> wrote:
> On Mar 20, 2:58 pm, jleslie48 <j...@jonathanleslie.com> wrote:
> > On Mar 20, 1:41 pm, Mike Treseler <mtrese...@gmail.com> wrote:
> > > jleslie48 wrote:
> > > > and when I added some digital outputs, the %'s went up, but then I
> > > > added a whole bunch of logic, and nothing changed,
> > >
> > > If the number of cells or Pterms used didn't change at all,
> > > I would expect that a "whole bunch" of logic does not make it
> > > out to a pin. I would run a sim to check.
> > >
> > > -- Mike Treseler
> >
> > ahhh, well that is a bummer. I just tied the output to a pin and now
> > I"m getting:
> >
> > Fitting...
> > .
> > ERROR:Cpld:1063 - Design requires at least 947 macrocells, exceeds
> > device limit 512.
> > ERROR:Cpld:1062 - Design contains 2004 unique product terms, exceeds
> > device limit 1536.
> > ERROR:Cpld:1064 - Design rules checking error. Fitting process
> > stopped.
> > ...o
> > ERROR:Cpld:868 - Cannot fit the design into any of the specified
> > devices with the selected implementation options.
> >
> > any idea on how to make it fit?
>
> ok, before I added my functionality, I had:
>
> Macrocells Used  Pterms Used     Registers Used  Pins Used      Function Block Inputs Used
> 379/512 (75%)    831/1536 (55%)  354/512 (70%)   118/176 (68%)  779/1280 (61%)
>
> so from my errors, I can see I added some 600 macrocells, and 1200
> pterms,
>
> how can I find out who is the piggy, and what can I due to trim things
> down?

I don't think you really got an answer to this question. To some
extent you can look at the code and estimate the number of macrocells
or other logic elements used. But to measure it, you need to break
the code into modules and let the tool tell you about each module
separately. In an FPGA the logic has a finer grain, so there are not
as many optimizations to affect these counts when you use the blocks
all together.
But a CPLD can put a lot of logic into each macrocell
and will be much more limited by the FF count. Your design counts
above indicate that your design uses 1300 FFs and your CPLD only has
512 FFs. Not a good fit!

You won't find much in the way of optimizations that will make this
fit. The best way to trim your logic is to change your algorithm.
If there are parts of your design that can run slowly compared to the
clock rate, you can let them run sequentially rather than in
parallel. But if your design has to run at the full rate of the clock
with everything in parallel, you just need a larger part. So take a
good, hard look at your design and see if there is anything you can do
to reduce it.

> also, what is a macrocell and pterm?

I think these got answered, but a little more detail... A macrocell
is the unit block of a CPLD. It typically includes one or two FFs, an
output (often to a pin), and some amount of logic. The logic in
a macrocell is made of p-terms and OR gates. P-terms are very wide
AND gates with inputs from all of the inputs to that block, all of the
FFs in that block as well as, in some devices, some inputs from other
macrocell p-terms. The p-terms of a given macrocell are OR'd together
to produce the input to the FF, or the result can be routed directly
to the output. There is also a p-term or two devoted to controlling the
tri-state driver on the output. The OR gate and FF outputs are connected
back to the logic matrix for use in other or the same macrocells.
Some devices have "buried" FFs which allow some of the logic in the
macrocell to be split off and used with this second FF, but the output
can only be routed back to the routing matrix, not an output pin.

That is a lot to absorb from a description. I am sure the data sheet
has a picture that is very clear and can portray the detail better.
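The macrocell description and the FF budget above can be sketched in a few lines of Python. This is only an illustrative model, not the XCR3512XL's actual architecture: the function names are hypothetical, each p-term is modelled as an AND over any number of signals (true or complemented), and one register per macrocell is assumed when checking the fit.

```python
def product_term(signals, literals):
    """Wide AND: every listed signal must have the required value.

    `literals` maps a signal name to the value it must take, so a
    complemented input is simply a required value of 0.
    """
    return all(signals[name] == want for name, want in literals.items())


def macrocell(signals, pterms):
    """OR of the macrocell's product terms (the FF input or output)."""
    return any(product_term(signals, literals) for literals in pterms)


def required_reduction(ffs_needed, macrocells):
    """Fraction of flip-flops that must go, assuming one FF per macrocell."""
    if ffs_needed <= macrocells:
        return 0.0
    return 1.0 - macrocells / ffs_needed


# An 8-input decode costs one p-term here, however wide it is.
address = {f"a{i}": bit for i, bit in enumerate([1, 0, 1, 0, 0, 1, 1, 0])}
decode_pterm = dict(address)  # match exactly this address pattern
print(macrocell(address, [decode_pterm]))

# Round figures from the post: about 1300 FFs against 512 macrocells.
print(round(required_reduction(1300, 512), 3))
```

The second figure is the point of the post: with these round numbers, roughly 60% of the registers would have to be removed before the logic could even be considered for fitting.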
The main thing to understand is that the p-term (AND) inputs are
unlimited (or more accurately only limited by the inputs to the block
routing) and an FPGA typically has much smaller LUTs, usually 1 LUT per
FF or sometimes 4 LUTs to 3 FFs. So a CPLD is often FF count limited
while an FPGA is mostly LUT count limited. Certainly there are things
you can change in your design to use more logic and fewer FFs to target
CPLDs. But I think it will be a major job to cut the design size by
more than half!

> I originally ran this program on a virtexII, and everthing looked
> liked it was pretty small and effecient:
>
> Device Utilization Summary
>
> Logic Utilization                                  Used   Available  Utilization  Note(s)
> Number of Slice Flip Flops                         1,282  27,392     4%
> Number of 4 input LUTs                             1,545  27,392     5%
> Logic Distribution
> Number of occupied Slices                          1,302  13,696     9%
>     Number of Slices containing only related logic 1,302  1,302      100%
>     Number of Slices containing unrelated logic    0      1,302      0%
> Total Number of 4 input LUTs                       1,589  27,392     5%
>     Number used as logic                           1,545
>     Number used as a route-thru                    44
> Number of bonded IOBs                              15     556        2%
>     IOB Flip Flops                                 1
> Number of RAMB16s                                  2      136        1%
> Number of BUFGMUXs                                 3      16         18%
>
> but I cant seem to compare these two chips, the VIRTEX II vs the
> XCR3512XL-12-PQ208
> its apples to oranges, how does it work?

The 3512 has 512 FFs in the macrocells. (I think they also have input
FFs.) The FPGA is using some 1300 out of 27,000! The FPGA is using
1500 LUTs for logic. To me, that looks like it could fit in
the logic of 512 macrocells. But the number of FFs has to be
reduced. Are they all necessary?

Rick
Article: 139109
On Mar 20, 7:39 pm, Jacko <jackokr...@gmail.com> wrote:
> > This would imply a 5bit wide word, which is obviously not
> > the case ?
>
> The implication of the extra bit 'needed' is not a true account of
> functioning.

Ah, but it is. In your specific implementation, you have not only a
fifth bit, but also a sixth, seventh all the way up to 16th, no? You
have a 16 bit instruction word and only 17 opcodes; 0 through 15 are
the ones you list, and 16 through 65535 is the LIT or CALL instruction
(I'm not sure which).
Article: 139110
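The opcode split described in the post above can be written out as a small decoder. This is a sketch of that reading of the instruction set, not a description of the actual processor: the names are hypothetical, and because the thread does not settle how LIT is encoded, everything from 16 upward is lumped together as "call_or_lit".

```python
WORD_BITS = 16
PRIMITIVE_COUNT = 16  # codes 0 through 15, as listed in the thread


def decode(word):
    """Classify a 16-bit instruction word under the reading above."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("instruction word must fit in 16 bits")
    if word < PRIMITIVE_COUNT:
        return ("primitive", word)
    # Assumption: every other value is a single CALL/LIT class whose
    # operand is the word itself (for a call, the subroutine address).
    return ("call_or_lit", word)


# Count how many of the 65536 encodings fall in each class.
non_primitive_words = (1 << WORD_BITS) - PRIMITIVE_COUNT
print(decode(7), decode(16), non_primitive_words)
```

Seen this way, the "17th opcode" is not a fifth opcode bit but the whole remainder of the 16-bit code space.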
On Mar 21, 12:12 am, rickman <gnu...@gmail.com> wrote: > On Mar 20, 3:15 pm, jleslie48 <j...@jonathanleslie.com> wrote: > > > > > On Mar 20, 2:58 pm, jleslie48 <j...@jonathanleslie.com> wrote: > > > > On Mar 20, 1:41 pm, Mike Treseler <mtrese...@gmail.com> wrote: > > > > > jleslie48 wrote: > > > > > and when I added some digital outputs, the %'s went up, but then I > > > > > added a whole bunch of logic, and nothing changed, > > > > > If the number of cells or Pterms used didn't change at all, > > > > I would expect that a "whole bunch" of logic does not make it > > > > out to a pin. I would run a sim to check. > > > > > -- Mike Treseler > > > > ahhh, well that is a bummer. I just tied the output to a pin and now > > > I"m getting: > > > > Fitting... > > > . > > > ERROR:Cpld:1063 - Design requires at least 947 macrocells, exceeds > > > device limit > > > 512. > > > ERROR:Cpld:1062 - Design contains 2004 unique product terms, exceeds > > > device > > > limit 1536. > > > ERROR:Cpld:1064 - Design rules checking error. Fitting process > > > stopped. > > > ...o > > > ERROR:Cpld:868 - Cannot fit the design into any of the specified > > > devices with > > > the selected implementation options. > > > > any idea on how to make it fit? > > > ok, before I added my functionality, I had: > > Macrocells Used Pterms Used Registers Used Pins Used Function > > Block Inputs Used > > 379/512 (75%) 831/1536 (55%) 354/512 (70%) 118/176 (68%) > > 779/1280 (61%) > > > so from my errors, I can see I added some 600 macrocells, and 1200 > > pterms, > > > how can I find out who is the piggy, and what can I due to trim things > > down? > > I don't think you really got an answer to this question. To some > extent you can look at the code and estimate the number of macrocells > or other logic elements used. But to measure it, you need to break > the code into modules and let the tool tell you about each module > separately. 
In an FPGA the logic has a finer grain, so there are not > as much optimizations to affect these counts when you use the block > all together. But a CPLD can put a lot of logic into each macrocell > and will be much more limited by the FF count. Your design counts > above indicate that your design uses 1300 FFs and your CPLD only has > 512 FFs. Not a good fit! > > You won't find much in the way of optimizations that will make this > fit. The best thing to trim your logic is to change your algorithm. > If there are parts of your design that can run slowly compared to the > clock rate, you can let them run sequentially rather than in > parallel. But if your design has to run at the full rate of the clock > with everything in parallel, you just need a larger part. So take a > good, hard at your design and see if there is anything you can do to > reduce it. > > > also, what is a macrocell and pterm? > > I think these got answered, but a little more detail... A macrocell > is the unit block of a CPLD. It typically include one or two FFs, an > output, often to a pin along with some amount of logic. The logic in > a macrocell is made of p-terms and OR gates. P-terms are very wide > AND gates with inputs from all of the inputs to that block, all of the > FFs in that block as well as, in some devices, some inputs from other > macrocell p-terms. The p-terms of a given macrocell are OR'd together > to produce the input to the FF or it can be routed directly to the > output. There is also a p-term or two devoted to controlling the tri- > state driver on the output. The OR gate and FF outputs are connected > back to the logic matrix for use in other or the same macrocells. > Some devices have "buried" FFs which allow some of the logic in the > macrocell to be split off and used with this second FF, but the output > can only be routed back to the routing matrix, not an output pin. > > That is a lot to absorb from a description. 
I am sure the data sheet > has a picture that is very clear and can portray the detail better. > The main thing to understand is that the p-term (and) are unlimited > (or more accurately only limited by the inputs to the block routing) > and an FPGA typically has much smaller LUTs, usually 1 LUT per FF or > sometimes 4 LUTs to 3 FFs. So a CPLD is often FF count limited while > an FPGA is mostly LUT count limited. Certainly there are things you > can change in your design to use more logic and fewer FF to target > CPLDs. But I think it will be a major job to cut the design size by > more than half! > > > > > I originally ran this program on a virtexII, and everthing looked > > liked it > > was pretty small and effecient: > > > Device Utilization Summary > > [-] > > Logic Utilization > > Used > > Available > > Utilization > > Note(s) > > Number of Slice Flip Flops > > 1,282 > > 27,392 > > 4% > > > Number of 4 input LUTs > > 1,545 > > 27,392 > > 5% > > > Logic Distribution > > > Number of occupied Slices > > 1,302 > > 13,696 > > 9% > > > Number of Slices containing only related logic > > 1,302 > > 1,302 > > 100% > > > Number of Slices containing unrelated logic > > 0 > > 1,302 > > 0% > > > Total Number of 4 input LUTs > > 1,589 > > 27,392 > > 5% > > > Number used as logic > > 1,545 > > > Number used as a route-thru > > 44 > > > Number of bonded IOBs > > Number of bonded > > 15 > > 556 > > 2% > > > IOB Flip Flops > > 1 > > > Number of RAMB16s > > 2 > > 136 > > 1% > > > Number of BUFGMUXs > > 3 > > 16 > > 18% > > > but I cant seem to compare these two chips, the VIRTEX II vs the > > XCR3512XL-12-PQ208 > > its apples to oranges, how does it work? > > The 3512 has 512 FFs in the macrocells. (I think they also have input > FFs) The FPGA is using some 1300 out of 27,000! The FPGA is using > 1500 LUTs for logic. It does not look to me like that couldn't fit in > the logic of 512 macrocells. But the number of FFs has to be > reduced. Are they all necessary? 
> > Rick

"But I think it will be a major job to cut the design size by more
than half!"

Well, this is what has me scratching my head. I only added one UART to
the 3512; the listing from the Virtex II has two separate UARTs, to
make up the 1300 slice flip-flops. I've only moved one of the UARTs to
the 3512 so far and it blew its top. I can't see how one UART can take
up the entire chip. Or is that the difference between the $90 3512 and
the $1200 Virtex II Pro?

I'm not sure what you are getting at with me reducing the number of
FFs. I'm just getting the hang of VHDL, and I'm not aware of what code
makes up the FFs. I included the code I put in up above; it seems very
straightforward: a state machine.

"The FPGA is using some 1300 out of 27,000!" I'm assuming you mean
the 1282/27,392 number. What I'm guessing is that for this design I
have to get these 1282 to fit into the 512 macrocells of the 3512, but
I can only put 1 in each macrocell, i.e. I've got to get down to under
512 slice FFs. That's not counting the problem I'm having with the
pterms.
Article: 139111
On 21 Mar, 04:30, rickman <gnu...@gmail.com> wrote:
> On Mar 20, 7:39 pm, Jacko <jackokr...@gmail.com> wrote:
> > > This would imply a 5bit wide word, which is obviously not
> > > the case ?
> >
> > The implication of the extra bit 'needed' is not a true account of
> > functioning.

That would be like saying you need the extra )s on the front of
numbers when you do arithmetic on paper.

> Ah, but it is. In your specific implementation, you have not only a
> fifth bit, but also a sixth, seventh all the way up to 16th, no? You
> have a 16 bit instruction word and only 17 opcodes; 0 through 15 are
> the ones you list, and 16 through 65535 is the LIT or CALL instruction
> (I'm not sure which).

(Depends on the subroutine start address) all subroutines are calls,
so they are all calls, just one is LIT.

Yes, you will find primitives use codes 0-15 and colon definitions use
0-65535. If you are crazy enough to have a massive primitive set, or
to implement such a set in full width memory, then you would be right.
On the 12 bit version you could use 16 bit memory, and have the high 4
as the primitive part of the address space.

As stated on the website (somewhere) this processor is not designed
for running monolith inlined code, and pay in space and cache slowdown
such things will, say yoda.

So in the example I gave for the store, it's likely the last line of
simple instructions would be a subroutine named store or +1!

You will find a large amount of primitive code can be optimized into a
small logic area, especially if the address space over which these
subroutines are spread is sparse, to allow combinational alignment of
product terms and boolean logic reduction.

To just generalize this code as something to slot into the threading
is missing the point that this is an occasional feature, not a best
practice.

cheers jacko

"speak unto my mobile I will, sometime it may be a programming tool."
Article: 139112
On Fri, 20 Mar 2009 17:56:39 -0700 (PDT), jleslie48 wrote: >re-building all the while and checking for growth, only to >be blind sided when I hook up the pin to the generated signal. So you need to identify each functional block in your design, and synthesise it - on its own - in an FPGA with every input and output of the block hooked to a pin (the synth tool will automatically put pads on the ports of your top-level VHDL entity, so that's no effort). That way you can quickly get a feel for the size of each block. If you synthesise with some pins not connected, the tool will surely strip away loads of unused logic and you will get an over-optimistic size estimate. Interconnect between blocks costs propagation delay, but only rather a little logic, so that's OK. Your reverse() function is pretty much free - it's just interconnect. -- Jonathan Bromley, Consultant DOULOS - Developing Design Know-how VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK jonathan.bromley@MYCOMPANY.com http://www.MYCOMPANY.com The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.Article: 139113
Hello,

I am trying to connect a 16 bit Intel Strataflash to the
xps_mch_emc_v2_00_a. My problem is that the flashwriter.tcl application
stops after a few percent (13% or later) and never reaches 100%.

In the microblaze-uclinux archive, I found two old posts:

http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/archive/2005/07/msg00046.html
and
http://osdir.com/ml/linux.uclinux.microblaze/2004-02/msg00046.html

Can anybody confirm that this also works with the new xps_mch_emc?

--
Kristian
Article: 139114
I forgot to say that I want the DATA_WIDTH_MATCHING option set to 1
because I want the emc to be fully transparent to the operating system
(32 bit writes).

Kristian

"Kristian Klaus" <kristian.klaus@gmx.de> wrote in message
news:gq2f5g$eph$1@hahn.informatik.hu-berlin.de...
> Hello,
>
> I am trying to connect a 16 bit Intel Strataflash to the
> xps_mch_emc_v2_00_a. My problem is, that the flashwriter.tcl application
> stops after some percents (13% or later) and never comes to 100%.
>
> In the microblaze-uclinux archive, I found two old posts:
>
> http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/archive/2005/07/msg00046.html
> and
> http://osdir.com/ml/linux.uclinux.microblaze/2004-02/msg00046.html
>
> Can anybody confirm, that this also works with the new xps_mch_emc?
>
> --
> Kristian
Article: 139115
On Fri, 20 Mar 2009 17:56:39 -0700 (PDT), jleslie48
<jon@jonathanleslie.com> wrote:

>Well this is a bummer. Here I think I'm being careful, working things
>out with test bench, re-building all the while and checking for growth,
>only to be blind sided when I hook up the pin to the generated signal.
>
>Meantime I've got some more info and questions.
>
>1)
>> any idea on how to make it fit?
>
>If it has to be that device, I would need two of them.
>
>that chip we are getting for around $100, I don't even know where to
>buy them, and where do I get a Virtex II-PRO chip? digikey says they
>are $1000?? Wouldn't I be better off getting the Virtex II-PRO?

Spartan-3 gives a sizeable resource for well under $100. (V2Pro with
the same capacity would be $500 up - ballpark numbers)

em.avnet.com lists the XC3S1500 for $74 (1 off) $55 (100 off) and you
can upgrade to larger versions if necessary.

http://www.enterpoint.co.uk/moelbryn/raggedstone1.html
Here's one example of a complete board with XC3S1500 for about $250.

>Meantime the old chip is mounted on a custom board layout, I guess my
>hardware guys are going to have to re-lay out the board with
>two of these chips?

Consider a Spartan-3 layout for room to grow.

>3) "Rerun synthesis and check the % utilization "
>that's what I've been doing. basically I added the equivalent of soft
>uart and the data generator state machine that Jonathan so kindly set
>me up with. So I started backing out that code bit by bit to see
>where I pop the %s.

Not the best way - as you discovered.

Find the "do not allocate I/O pin" synthesis option and synth each
major subsystem as a separate project. Crosscheck that a simple sum of
the results is approx (say within 10%) of the overall size. Note any
major surprises...

This option is used to build separate re-usable modules (black boxes)
so it is not allowed to optimize away anything not connected to a pin;
therefore it is a better way to determine resource usage.
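The crosscheck described above (sum of per-module results within about 10% of the overall size) is easy to script once the per-module reports are in hand. A minimal sketch follows; the module names and macrocell counts are made-up placeholders, not figures from this design, and crosscheck is a hypothetical helper.

```python
def crosscheck(module_sizes, overall, tolerance=0.10):
    """Compare the sum of per-module sizes with the whole-design size.

    Returns (total, relative deviation, within-tolerance flag).
    """
    total = sum(module_sizes.values())
    deviation = abs(total - overall) / overall
    return total, deviation, deviation <= tolerance


# Placeholder numbers for illustration only (e.g. macrocells per module).
modules = {"uart_rx": 40, "uart_tx": 45, "data_gen": 120}
overall_size = 215  # placeholder result of synthesising everything together

total, deviation, ok = crosscheck(modules, overall_size)
print(total, round(deviation, 3), ok)

# The largest module is the first place to look for savings.
biggest = max(modules, key=modules.get)
print(biggest)
```

A large deviation in either direction would be one of the "major surprises" worth noting, since it hints that logic is being shared, duplicated or stripped when the blocks are combined.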
(Just another way to achieve what Jonathan advised, but without the
labour of adding the pins)

- Brian
Article: 139116
In xapp460 the DVI/HDMI transmitter and receiver are implemented, but
the maximum throughput is limited to about 750 Mb/s, which can handle
up to 1080i or 720p resolution. 1080p, however, needs twice that.

The question is: how can we crank the throughput up to about 1.5 Gb/s?
Article: 139117
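For the bandwidth figures in the xapp460 question above, the per-channel serial rate follows from the pixel clock, because TMDS sends 10 bits per pixel on each data channel. The sketch below assumes the standard 60 Hz pixel clocks (74.25 MHz for 720p and 1080i, 148.5 MHz for 1080p); those clock values are not given in the post, and the function name is hypothetical.

```python
TMDS_BITS_PER_PIXEL = 10  # 8 data bits are encoded into 10 bits per channel


def tmds_bit_rate_mbps(pixel_clock_mhz):
    """Serial bit rate of one TMDS data channel, in Mb/s."""
    return pixel_clock_mhz * TMDS_BITS_PER_PIXEL


# Assumed standard 60 Hz pixel clocks, in MHz.
PIXEL_CLOCK_MHZ = {"720p60": 74.25, "1080i60": 74.25, "1080p60": 148.5}

rates = {mode: tmds_bit_rate_mbps(clk) for mode, clk in PIXEL_CLOCK_MHZ.items()}
for mode, rate in rates.items():
    print(f"{mode}: {rate} Mb/s per channel")
```

Under those assumptions 720p and 1080i need about 742.5 Mb/s per channel and 1080p about 1485 Mb/s, which matches the "750 Mb/s" and "1.5 Gb/s" figures in the question.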
On Mar 21, 2:38 am, Jacko <jackokr...@gmail.com> wrote:
> On 21 Mar, 04:30, rickman <gnu...@gmail.com> wrote:
> > On Mar 20, 7:39 pm, Jacko <jackokr...@gmail.com> wrote:
> > > > This would imply a 5bit wide word, which is obviously not
> > > > the case ?
> > >
> > > The implication of the extra bit 'needed' is not a true account of
> > > functioning.
>
> That would be like saying you need the extra )s on the front of
> numbers when you do arithmetic on paper.

Extra what??? Actually, you are quoting your own comment.

> > Ah, but it is. In your specific implementation, you have not only a
> > fifth bit, but also a sixth, seventh all the way up to 16th, no? You
> > have a 16 bit instruction word and only 17 opcodes; 0 through 15 are
> > the ones you list, and 16 through 65535 is the LIT or CALL instruction
> > (I'm not sure which).
>
> (Depends on the subroutine start address) all subroutines are calls,
> so they are all calls, just one is LIT.

I have no idea what you are talking about. How does this instruction
set specify literals? Your obfuscation is getting to be annoying.
You never explain what you mean, you speak in crypto language and you
seem intent on never really explaining the principles of your design.
Even your assembly language is some new symbolism that just serves to
isolate what you are doing and thinking rather than to be at all
useful for communication.

None of the rest of this is at all useful. You are presuming that I
am making some sort of statement or that I am looking at your
processor from a very different point of view. I am doing neither. I
am trying to understand your processor from the point of view of a
small, embeddable CPU for use in an FPGA and in particular, to be
programmable in Forth. That is the target of my CPU. I am hoping to
learn something about these processors that I don't know or that I
haven't thought to try.
What I am learning about this design is that it seems to have been
designed without regard to a lot of knowledge available, not that I
will ever know for sure because it will never really be explained.
Have you read Koopman's book on stack CPUs? He covers a lot of ground
with that.

> Yes, you will find primitives use codes 0-15 and colon definitions use
> 0-65535. If you are crazy enough to have a massive primitive set, or
> to implement such a set in full width memory, then you would be right.
> On the 12 bit version you could use 16 bit memory, and have the high 4
> as the primitive part of the address space.
>
> As stated on the website (somewhare) this processor is not designed
> for running monolith inlined code, and pay in space and cache slowdown
> such things will, say yoda.
>
> So in the example I gave for the store, it's likely the last line of
> simple instructions would be a subroutine named store or +1!
>
> You will find a large amount of primitive code can be optimized into a
> small logic area, especially if the address space over which these
> subroutines is spread is sparse to allow combinational alignment of
> product terms and boolean logic reduction.
>
> To just generalize this code as something to slot into the threading
> is missing the point that this is an ocassional feature, not a best
> practice.

Rick
Article: 139118
CPLDs are generally very small devices compared to FPGAs. They are
generally slightly easier for the novice to use, but I wouldn't let
that put you off going for an FPGA. Virtex-II Pro is a very old and
expensive family now. Xilinx offers 2 sets of families. The Virtex
range is big, very fast and expensive. Virtex-5 is readily available,
with Virtex-6 just announced. The Spartan families go from small to
medium size in comparison. Coolrunner etc. I would describe as tiny, to
give a reference.

If you have the ability to choose a part now, then the Spartan-3A or
Spartan-3AN are probably a good choice. The S3-A needs an external
Flash memory that is used to configure the device at power up. The
S3-AN has an internal Flash that is used for that purpose. The smallest
S3-AN is the XC3S50AN and it has about 1400 flip-flops, compared with
the Coolrunner's 512 macrocells, which have 512 flip-flops available.

It is very difficult to make a simple comparison between CPLD and
FPGA technologies, but I would suggest just trial building the design
in a XC3S50AN to get a better comparison. I presume you already have
ISE Webpack, and it will only take a few minutes to change the part
type and re-build.

If you do want a development board, we supply lots of choice, with some
more coming shortly in this market sector. You may find some of the
links on our Techitips page useful -
http://www.enterpoint.co.uk/techitips/techitips.html.

John Adair
Enterpoint Ltd.
Article: 139119
On Mar 21, 1:14 am, jleslie48 <j...@jonathanleslie.com> wrote:
> On Mar 21, 12:12 am, rickman <gnu...@gmail.com> wrote:
> > On Mar 20, 3:15 pm, jleslie48 <j...@jonathanleslie.com> wrote:
> > > On Mar 20, 2:58 pm, jleslie48 <j...@jonathanleslie.com> wrote:
> > > > On Mar 20, 1:41 pm, Mike Treseler <mtrese...@gmail.com> wrote:
> > > > > jleslie48 wrote:
> > > > > > and when I added some digital outputs, the %'s went up, but then I
> > > > > > added a whole bunch of logic, and nothing changed,
> > > > >
> > > > > If the number of cells or Pterms used didn't change at all,
> > > > > I would expect that a "whole bunch" of logic does not make it
> > > > > out to a pin. I would run a sim to check.
> > > > >
> > > > > -- Mike Treseler
> > > >
> > > > ahhh, well that is a bummer. I just tied the output to a pin and now
> > > > I'm getting:
> > > >
> > > > Fitting...
> > > > .
> > > > ERROR:Cpld:1063 - Design requires at least 947 macrocells, exceeds
> > > > device limit 512.
> > > > ERROR:Cpld:1062 - Design contains 2004 unique product terms, exceeds
> > > > device limit 1536.
> > > > ERROR:Cpld:1064 - Design rules checking error. Fitting process
> > > > stopped.
> > > > ...o
> > > > ERROR:Cpld:868 - Cannot fit the design into any of the specified
> > > > devices with the selected implementation options.
> > > >
> > > > any idea on how to make it fit?
> > >
> > > ok, before I added my functionality, I had:
> > >
> > > Macrocells Used              379/512   (75%)
> > > Pterms Used                  831/1536  (55%)
> > > Registers Used               354/512   (70%)
> > > Pins Used                    118/176   (68%)
> > > Function Block Inputs Used   779/1280  (61%)
> > >
> > > so from my errors, I can see I added some 600 macrocells, and 1200
> > > pterms,
> > >
> > > how can I find out who is the piggy, and what can I do to trim things
> > > down?
> >
> > I don't think you really got an answer to this question. To some
> > extent you can look at the code and estimate the number of macrocells
> > or other logic elements used. But to measure it, you need to break
> > the code into modules and let the tool tell you about each module
> > separately. In an FPGA the logic has a finer grain, so there are not
> > as many optimizations to affect these counts when you use the block
> > all together. But a CPLD can put a lot of logic into each macrocell
> > and will be much more limited by the FF count. Your design counts
> > above indicate that your design uses 1300 FFs and your CPLD only has
> > 512 FFs. Not a good fit!
> >
> > You won't find much in the way of optimizations that will make this
> > fit. The best thing to trim your logic is to change your algorithm.
> > If there are parts of your design that can run slowly compared to the
> > clock rate, you can let them run sequentially rather than in
> > parallel. But if your design has to run at the full rate of the clock
> > with everything in parallel, you just need a larger part. So take a
> > good, hard look at your design and see if there is anything you can do
> > to reduce it.
> >
> > > also, what is a macrocell and pterm?
> >
> > I think these got answered, but a little more detail... A macrocell
> > is the unit block of a CPLD. It typically includes one or two FFs, an
> > output, often to a pin, along with some amount of logic. The logic in
> > a macrocell is made of p-terms and OR gates. P-terms are very wide
> > AND gates with inputs from all of the inputs to that block, all of the
> > FFs in that block as well as, in some devices, some inputs from other
> > macrocell p-terms. The p-terms of a given macrocell are OR'd together
> > to produce the input to the FF or it can be routed directly to the
> > output. There is also a p-term or two devoted to controlling the
> > tri-state driver on the output. The OR gate and FF outputs are
> > connected back to the logic matrix for use in other or the same
> > macrocells.
> > Some devices have "buried" FFs which allow some of the logic in the
> > macrocell to be split off and used with this second FF, but the output
> > can only be routed back to the routing matrix, not an output pin.
> >
> > That is a lot to absorb from a description. I am sure the data sheet
> > has a picture that is very clear and can portray the detail better.
> > The main thing to understand is that the p-term (AND) inputs are
> > unlimited (or, more accurately, limited only by the inputs to the
> > block routing) and an FPGA typically has much smaller LUTs, usually
> > 1 LUT per FF or sometimes 4 LUTs to 3 FFs. So a CPLD is often FF count
> > limited while an FPGA is mostly LUT count limited. Certainly there are
> > things you can change in your design to use more logic and fewer FFs
> > to target CPLDs. But I think it will be a major job to cut the design
> > size by more than half!
> >
> > > I originally ran this program on a virtexII, and everything looked
> > > like it was pretty small and efficient:
> > >
> > > Device Utilization Summary
> > >
> > > Logic Utilization                                  Used  Available  Utilization  Note(s)
> > > Number of Slice Flip Flops                        1,282     27,392           4%
> > > Number of 4 input LUTs                            1,545     27,392           5%
> > >
> > > Logic Distribution
> > > Number of occupied Slices                         1,302     13,696           9%
> > >   Number of Slices containing only related logic  1,302      1,302         100%
> > >   Number of Slices containing unrelated logic         0      1,302           0%
> > > Total Number of 4 input LUTs                      1,589     27,392           5%
> > >   Number used as logic                            1,545
> > >   Number used as a route-thru                        44
> > > Number of bonded IOBs                                15        556           2%
> > >   IOB Flip Flops                                      1
> > > Number of RAMB16s                                     2        136           1%
> > > Number of BUFGMUXs                                    3         16          18%
> > >
> > > but I can't seem to compare these two chips, the VIRTEX II vs the
> > > XCR3512XL-12-PQ208
> > > it's apples to oranges, how does it work?
> >
> > The 3512 has 512 FFs in the macrocells. (I think they also have input
> > FFs) The FPGA is using some 1300 out of 27,000! The FPGA is using
> > 1500 LUTs for logic. It does not look to me like that couldn't fit in
> > the logic of 512 macrocells. But the number of FFs has to be
> > reduced. Are they all necessary?
> >
> > Rick
>
> "But I think it will be a major job to cut the design size by more
> than half!"
>
> Well this is what has me scratching my head. I only added one UART to
> the 3512; the listing from the Virtex II has two separate UARTs, to
> make up the 1300 slice flip flops. I've only moved one of the UARTs to
> the 3512 so far and it blew its top. I can't see how one UART can take
> up the entire chip. Or is that the difference between the $90 3512 and
> the $1200 Virtex II Pro?

Somewhere we did not communicate. The VIIP part has some 27,000 FFs.
Yes, that was 27 *thousand* FFs. The 3512 has 512 FFs for logic. So
there is no way that you can expect the CPLD to hold anywhere near the
same number of UARTs as the FPGA you are using. The two UARTs in the
VIIP are using 1300 FFs. Divide that by two (assuming they don't
share any logic like the baud rate generator) and you get 650 FFs per
UART. Will that fit into 512 FFs in the CPLD?

It is very likely that the UART you are using is very much more
complex than you really need. I expect you could fit some 10 or more
UARTs into this CPLD if they are streamlined a bit. A UART is nothing
but a pair of shift registers with some control logic and should fit
into a couple of dozen FFs if coded minimally. To do that requires
that you understand how to design hardware, so that you know what you
want from the HDL code, and then code the HDL to produce that
hardware.
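As a rough sanity check on the flip-flop arithmetic above, here is a minimal Python sketch. The figures are the ones quoted in this thread (1,282 slice FFs for the two-UART FPGA build, 512 FFs in the 3512). The even split between the two UARTs and the variable names are assumptions made purely for illustration; the real report also counts FFs used outside the UARTs.

```python
# Back-of-envelope FF budget, using only figures quoted in this thread.
FPGA_FFS_USED = 1282   # "Number of Slice Flip Flops" from the Virtex-II report
UARTS_IN_FPGA = 2      # the FPGA build contains two UARTs
CPLD_FFS = 512         # FFs available for logic in the 3512

# Assumption: the two UARTs split the FFs evenly and share no logic.
ffs_per_uart = FPGA_FFS_USED / UARTS_IN_FPGA

# Does one UART, as currently coded, fit within the CPLD's FF count?
fits_in_cpld = ffs_per_uart <= CPLD_FFS

# FF budget per UART if ten streamlined UARTs were to share the CPLD.
budget_for_ten = CPLD_FFS // 10

print(f"FFs per UART as coded: {ffs_per_uart:.0f}")
print(f"Fits in {CPLD_FFS} FFs: {fits_in_cpld}")
print(f"Budget per UART for ten UARTs: {budget_for_ten} FFs")
```

The point of the last figure is only that "ten or more UARTs" implies a budget of about fifty FFs each, which is in line with a minimal shift-register design and far below what the current code uses.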
> I'm not sure what you are getting at with me reducing the number of
> FFs. I'm just getting the hang of VHDL, but I'm not aware of what code
> makes up the FFs. I included the code I put in up above; it seems a
> very straightforward state machine.
>
> "The FPGA is using some 1300 out of 27,000!" I'm assuming you mean
> the 1282/27,392 number. What I'm guessing is that for this design I
> have to get these 1282 to fit into the 512 macrocells of the 3512, but
> I can only put 1 in each macrocell, i.e. I've got to get down to under
> 512 slice FFs. That's not counting the problem I'm having with the
> pterms,

Yes, that is what you need to do. I'm not sure why the p-term count
is a problem.

> > > > ERROR:Cpld:1062 - Design contains 2004 unique product terms, exceeds
> > > > device limit 1536.

If you get your FF count down, the p-term count will likely decrease
as well. But I'm not clear on why there are so few p-terms in this
device (1536 over 512 macrocells is only 3 each). A typical CPLD will
have macrocells with a range of p-terms per macrocell of 4 to 12 or
more. I would have to look at the data sheet of the 3512 to see how
they are organized.

Ahhh... I found your culprit.

    architecture arch of fifo is
       type reg_file_type is array (2**W-1 downto 0) of
            std_logic_vector(B-1 downto 0);
       signal array_reg: reg_file_type;

This FIFO is implemented using memory resources in the FPGA. In the
CPLD there are no memory resources... at least in Xilinx CPLDs. Other
brands have memory. With 8 bits and 16 words, each FIFO uses 128
FFs. A UART has two FIFOs using 256 FFs. That's half the CPLD right
there! If you want a UART in the CPLD you need to take out the
FIFOs. If you are still running the same code for the Hello World
program, you don't need the FIFOs anyway. Instead of letting the data
generator push chars into the FIFO, let the data generator be
throttled by the UART handshake directly.

The UART clearly has other complexities that are eating up FFs.
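The FIFO figures above follow directly from the array declaration, and can be sketched in a few lines of Python. This is only an estimate of the storage bits: it assumes W = 4 and B = 8 (inferred from "8 bits and 16 words"), counts one FF per stored bit, and ignores the read/write pointers and status flags. The function name is hypothetical.

```python
def fifo_storage_ffs(addr_bits: int, data_bits: int) -> int:
    """FFs needed when a 2**addr_bits x data_bits register file is
    built from discrete flip-flops instead of block RAM."""
    return (2 ** addr_bits) * data_bits

# Assumed generics: W = 4 address bits (16 words), B = 8 data bits.
W, B = 4, 8
CPLD_FFS = 512                       # FFs available in the 3512

per_fifo = fifo_storage_ffs(W, B)    # storage for one FIFO
per_uart = 2 * per_fifo              # one TX FIFO plus one RX FIFO
share_of_cpld = per_uart / CPLD_FFS  # fraction of the CPLD's FFs

print(f"{per_fifo} FFs per FIFO, {per_uart} per UART, "
      f"{share_of_cpld:.0%} of the CPLD")
```

Halving the depth to 8 words (W = 3) would halve the storage cost, but removing the FIFOs altogether, as suggested, saves far more.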
You need to find or code a simpler UART to suit your requirements.
Think of the CPLD as an MCU with only 2 kB of program space. You
wouldn't pull the UART driver out of Linux and try to use it in that
device, would you? In essence, that is what you are doing.

Rick
Article: 139120
If you search as a UK customer some parts don't appear. This might
explain why you didn't find it.

John Adair
Enterpoint Ltd.

On 20 Mar, 13:39, Mike Harrison <m...@whitewing.co.uk> wrote:
> On Fri, 20 Mar 2009 14:25:10 +0100, "StoneThrower" <digi_64-public[removethis]@yahoo.com> wrote:
> >> Digikey only stock the straight version (-DSA)
> >?!
> >Digikey p/n H10644-ND, FX2-100S-1.27DS, right-angled (~as nicely shown on
> >pics):
> >http://parts.digikey.com/1/parts/287861-conn-recept-r-a-100pos-1-27mm...
>
> Thanks - I did several searches & failed to find it - seems like they
> forgot to put the pin count in the item data so when I filtered to
> 100 pin, it didn't find it!
> Their parametric data is usually very good so I don't tend to look
> further if the search doesn't find something - have emailed them.
Article: 139121
On Mar 21, 8:42 am, Brian Drummond <brian_drumm...@btconnect.com> wrote:
> On Fri, 20 Mar 2009 17:56:39 -0700 (PDT), jleslie48 <j...@jonathanleslie.com>
> wrote:
>
> >Well this is a bummer. Here I think I'm being careful, working things
> >out with test bench, re-building all the while and checking for growth,
> >only to be blindsided when I hook up the pin to the generated signal.
> >
> >Meantime I've got some more info and questions.
> >
> >1) >> any idea on how to make it fit?
> >
> >If it has to be that device, I would need two of them.
> >
> >that chip we are getting for around $100, I don't even know where to
> >buy them, and where do I get a Virtex II-PRO chip? digikey says they
> >are $1000?? Wouldn't I be better off getting the Virtex II-PRO?
>
> Spartan-3 gives a sizeable resource for well under $100. (V2Pro with the same
> capacity would be $500 up - ballpark numbers)
> em.avnet.com lists the XC3S1500 for $74 (1 off) $55 (100 off) and you can
> upgrade to larger versions if necessary.
>
> http://www.enterpoint.co.uk/moelbryn/raggedstone1.html
> Here's one example of a complete board with XC3S1500 for about $250.
>
> >Meantime the old chip is mounted on a custom board layout, I guess my
> >hardware guys are going to have to re-lay out the board with two of
> >these chips?
>
> Consider a Spartan-3 layout for room to grow.
>
> >3) "Rerun synthesis and check the % utilization "
> >that's what I've been doing. basically I added the equivalent of soft
> >uart and the data generator state machine that Jonathan so kindly set
> >me up with. So I started backing out that code bit by bit to see
> >where I pop the %s.
>
> Not the best way - as you discovered.
>
> Find the "do not allocate I/O pin" synthesis option and synth each major
> subsystem as a separate project. Crosscheck that a simple sum of the results is
> approx (say within 10%) of the overall size. Note any major surprises...
>
> This option is used to build separate re-usable modules (black boxes) so it is
> not allowed to optimize away anything not connected to a pin; therefore it is a
> better way to determine resource usage.
>
> (Just another way to achieve what Jonathan advised, but without the labour of
> adding the pins)
>
> - Brian

Hey everybody, thanks for all the good suggestions.

1) So it is reasonable to conclude that the CoolRunner 3512 is way
too small to even run a UART, yes?

2) Is the board that Brian suggested, the raggedstone1 with the
Spartan XC3S1500, big enough?

2A) I really want to put 3 or 4 UARTs onto the chip. Is it big enough
for that?

2B) The specs for the XC3S1500 are:

Xilinx XC3S1500-4FG676C
FPGA Spartan®-3 Family 1.5M Gates 29952 Cells 630MHz Commercial 90nm
Technology 1.2V 676-Pin FCBGA

Feature Description                 Feature Value
Package                             676FCBGA
Family Name                         Spartan®-3
Device Logic Cells                  29952
Device Logic Units                  3328
Device System Gates                 1500000
Number of Registers                 N/A
Maximum Internal Frequency          630 MHz
Typical Operating Supply Voltage    1.2 V
Maximum Number of User I/Os         487
RAM Bits                            589824
Re-programmability Support          Yes

What's the deal with the "PACKAGE" (676FCBGA)?

I see from AVNET that the XC3S1500 comes in lots of flavors:

http://avnetexpress.avnet.com/store/em/EMController?langId=-1&storeId=500201&catalogId=500201&term=XC3S1500&x=0&y=0&N=0&action=products

XC3S1500-4FG320C
XC3S1500-4FG456C
XC3S1500-4FG676C
XC3S1500-4FGG456C
XC3S1500-5FG456C
XC3S1500-5FGG456C
XC3S1500-4FGG320C
XC3S1500-4FGG320I
XC3S1500-5FGG676C
XC3S1500-4FGG676C
XC3S1500-4FG320I
XC3S1500-4FG456I
XC3S1500-4FG676I
XC3S1500-4FGG456I
XC3S1500-4FGG676I
XC3S1500-5FG320C
XC3S1500-5FG676C
XC3S1500-5FGG320C
XC3S1500-5FGG320

How interchangeable are these parts? The ds0099.pdf data sheet is not
geared towards just the XC3S1500, and I'm getting confused within its
219 pages...

3) The raggedstone1 has an added feature that I was told to consider:
mounting into a PC. Initially I want to put it in a stand-alone box, so
I will need to order the board, the PCI I/O header, and the oscillator,
and then I'm good to go, yes?
Article: 139122
On Mar 21, 3:47 pm, Mawafugo <cco...@netscape.net> wrote:
> In xapp460 the DVI/HDMI transmitter & receiver is implemented but the
> max throughput is limited to about 750 Mb/s, which can handle up to
> 1080i or 720p resolution. 1080p, however, needs twice that.
>
> The question is how can we crank up the throughput to about 1.5 Gb/s?

The answer is: it is not doable with S3A.

Antti
Article: 139123
On Mar 21, 11:10 am, jleslie48 <j...@jonathanleslie.com> wrote:
>
> 1) So it is reasonable to conclude that the CoolRunner 3512 is way
> too small to even run a UART, yes?

I would not say that. A UART can be done in a small number of FFs, or
in your case, a small number of macrocells. I expect it to be easy to
get ten UARTs into the 3512. But the code you have for a UART is very
large and overly complex if you just want to do serial transmission
and reception of data. If you just want to *send* data the size can
be reduced further.

> 2) Is the board that Brian suggested, the raggedstone1 with the
> Spartan XC3S1500, big enough?

Certainly an XC3S1500 is plenty large enough for four UARTs. In that
size part you could have not only the UARTs, but also the CPU!

> 2A) I really want to put 3 or 4 UARTs onto the chip. Is it big enough
> for that?

Yes. If four UARTs is all you want, the XC3S1500 is very much
overkill.

> 2B) The specs for the XC3S1500 are:
>
> Xilinx XC3S1500-4FG676C
> FPGA Spartan®-3 Family 1.5M Gates 29952 Cells 630MHz Commercial 90nm
> Technology 1.2V 676-Pin FCBGA
>
> Feature Description                 Feature Value
> Package                             676FCBGA
> Family Name                         Spartan®-3
> Device Logic Cells                  29952
> Device Logic Units                  3328
> Device System Gates                 1500000
> Number of Registers                 N/A
> Maximum Internal Frequency          630 MHz
> Typical Operating Supply Voltage    1.2 V
> Maximum Number of User I/Os         487
> RAM Bits                            589824
> Re-programmability Support          Yes
>
> What's the deal with the "PACKAGE" (676FCBGA)?
>
> I see from AVNET that the XC3S1500 comes in lots of flavors:
>
> http://avnetexpress.avnet.com/store/em/EMController?langId=-1&storeId...
>
> XC3S1500-4FG320C
> XC3S1500-4FG456C
> XC3S1500-4FG676C
> XC3S1500-4FGG456C
> XC3S1500-5FG456C
> XC3S1500-5FGG456C
> XC3S1500-4FGG320C
> XC3S1500-4FGG320I
> XC3S1500-5FGG676C
> XC3S1500-4FGG676C
> XC3S1500-4FG320I
> XC3S1500-4FG456I
> XC3S1500-4FG676I
> XC3S1500-4FGG456I
> XC3S1500-4FGG676I
> XC3S1500-5FG320C
> XC3S1500-5FG676C
> XC3S1500-5FGG320C
> XC3S1500-5FGG320
>
> How interchangeable are these parts? The ds0099.pdf data sheet is not
> geared towards just the XC3S1500, and I'm getting confused within its
> 219 pages...

They are all the same die and will likely all run the same bitstream.
You really only need to worry about the package you are using unless
you want to target different boards.

The first digit after the dash, -4 or -5, is the speed grade of the
part. The parts are the same, but they are tested to different
speeds. The letter at the end is the temperature rating: C for
commercial (normally 0 to 70 C ambient, but I think Xilinx specs a
higher number and says this has to be the die temperature) and I for
industrial (normally -40 to +85 C, with the same issue as commercial).
The rest of the suffix is the package.

Mostly all the different sized parts have similar timing. But some
timing numbers are different. Anything that is widely distributed
across the chip has further to go in the larger chips, so it runs
slower. If you want to design for a range of packages you mainly need
to limit your design to the I/O pins that are available on the
smallest package. So make sure every package you want to use supports
all of those pins. Otherwise you should have no problems.

> 3) The raggedstone1 has an added feature that I was told to consider:
> mounting into a PC. Initially I want to put it in a stand-alone box, so
> I will need to order the board, the PCI I/O header, and the oscillator,
> and then I'm good to go, yes?

The board will need power. Other than that, you need to consult the
data sheet for the board. They should provide specs on how to use the
board stand alone.
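The suffix breakdown described above can be illustrated with a small Python sketch that splits an ordering code into device, speed grade, package and temperature grade. The function, class and field names are hypothetical, and the pattern only covers the XC3S codes listed in this thread; it is not a general Xilinx part-number parser. The number inside the package code is read as the pin count, matching the "676-Pin FCBGA" entry quoted above.

```python
import re
from typing import NamedTuple, Optional

class PartNumber(NamedTuple):
    device: str                  # e.g. "XC3S1500"
    speed_grade: int             # digit after the dash (4 or 5 in the list)
    package: str                 # e.g. "FG676" or "FGG676"
    pins: int                    # number embedded in the package code
    temperature: Optional[str]   # None when the trailing letter is absent

_TEMPERATURE = {"C": "commercial", "I": "industrial"}
# device - speed digit, package letters, pin count, optional C/I suffix
_PATTERN = re.compile(r"(XC3S\d+)-(\d)([A-Z]+)(\d+)([CI]?)")

def decode_part_number(text: str) -> PartNumber:
    """Split an ordering code such as 'XC3S1500-4FG676C' into its fields."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised part number: {text!r}")
    device, speed, pkg, pins, temp = match.groups()
    return PartNumber(device, int(speed), pkg + pins, int(pins),
                      _TEMPERATURE.get(temp))

for code in ("XC3S1500-4FG676C", "XC3S1500-4FGG320I", "XC3S1500-5FGG320"):
    print(code, "->", decode_part_number(code))
```

For board work the `package` and `pins` fields are the ones that matter; per the explanation above, speed and temperature grade do not change the die.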

Rick
Article: 139124
On Mar 21, 12:51 pm, "Kristian Klaus" <kristian.kl...@gmx.de> wrote:
> I forgot to say that I want the DATA_WIDTH_MATCHING option set to 1 because
> I want the EMC to be fully transparent to the operating system (32 bit
> writes).
>
> Kristian
>
> "Kristian Klaus" <kristian.kl...@gmx.de> wrote in message news:gq2f5g$eph$1@hahn.informatik.hu-berlin.de...
>
> > Hello,
> >
> > I am trying to connect a 16 bit Intel Strataflash to the
> > xps_mch_emc_v2_00_a. My problem is that the flashwriter.tcl application
> > stops after some percent (13% or later) and never gets to 100%.
> >
> > In the microblaze-uclinux archive, I found two old posts:
> >
> >http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/archive/2005/0...
> > and
> >http://osdir.com/ml/linux.uclinux.microblaze/2004-02/msg00046.html
> >
> > Can anybody confirm that this also works with the new xps_mch_emc?
> >
> > --
> > Kristian

Back when EDK was young I designed a special IP core FIX that
repaired the EMC and allowed flash writing when width matching is on.
That was the case for MANY years.

I assumed the problem had been fixed by Xilinx by now, but I haven't
worked with the EMC for a long time.

In the old days, the EMC fix was basically an AND gate that removed an
extra pulse that made the CFI interface go nuts.

Antti