On 12/1/2015 8:55 PM, BobH wrote:
> On 11/30/2015 5:34 PM, rickman wrote:
>> On 11/30/2015 6:44 PM, BobH wrote:
>>> A mistake that I have made, is to mis-spell the wire connection and then
>>> there is no user for the outputs. The easiest way to check that is to
>>> inspect the simulation at the inputs to the next stage that uses the
>>> data and make sure that they are wiggling as you expect and not showing
>>> undefined as they would for an undriven wire. The second easiest way to
>>> check that is to eyeball the naming for this problem.
>>
>> If you make a spelling error, won't that be flagged because that signal
>> hasn't been declared?
>>
> Often the auto-wire "feature" will generate a replacement. If you go
> through the logs, it is noted, and usually the auto-wire will be a
> single wide signal instead of a bus, so it shows up that way too.

That is why VHDL has strong typing, errors like this are made *very* clear.

--
Rick

Article: 158476
So this evening I implemented the PLA instruction, which reads from the stack (at the current location of the stack pointer) and stores the value there into A. Synthesis took about 3x as long, and at the end of it there's a whole bunch of Info messages about how it wasn't storing the stack in a block ram for this reason or that.

Looking at the registers, I jumped from ~260 to ~520, so it looks as though the variably-indexed (via SP) set of stack registers were incorporated into the design again :) Phew!

I guess I'll just get on with it and implement more instructions - I was just afraid that as the design got larger, it would be harder to debug. Looks like it might have been easier :)

Thanks again for all the help everyone, especially the verilog examples Bob :)

Simon

Article: 158477
On 12/1/2015 11:49 PM, Simon wrote: > So this evening I implemented the PLA instruction, which reads from the stack (at the current location of the stack pointer) and stores the value there into A. Synthesis took about 3x as long, and at the end of it there's a whole bunch of Info messages about how it wasn't storing the stack in a block ram for this reason or that. > > Looking at the registers, I jumped from ~260 to ~520, so it looks as though the variably-indexed (via SP) set of stack registers were incorporated into the design again :) Phew! > > I guess I'll just get on with it and implement more instructions - I was just afraid that as the design got larger, it would be harder to debug. Looks like it might have been easier :) > > Thanks again for all the help everyone, especially the verilog examples Bob :) You do have an issue if a block RAM is not being used. The code I've seen looks like you are writing from a functional perspective rather than structural. I would suggest you write a module for a block RAM using example code provided by your chip manufacturer. Then incorporate that RAM module into your code as appropriate. Block RAM must have a register delay in the RAM itself. There are other restrictions as well, the details depending on the vendor. If you code the module by the provider's example you should get a block RAM. This should also help you see the limitations of how you can use that RAM. I have had similar problems coding adders when I was trying to use the carry out. One small issue with how I was using the adder resulting in a second adder being used to generate the carry out. -- RickArticle: 158478
On Tuesday, December 1, 2015 at 9:02:26 PM UTC-8, rickman wrote:
>
> You do have an issue if a block RAM is not being used. The code I've
> seen looks like you are writing from a functional perspective rather
> than structural. I would suggest you write a module for a block RAM
> using example code provided by your chip manufacturer. Then incorporate
> that RAM module into your code as appropriate.

But I don't want a block-ram. I don't want to pay the penalty of a clock-cycle for access to the values. I want a block of 256 registers, which I can access with as-close-to-zero time cost as possible. Block-ram's are great, but in this case I really want just a whole bunch of registers.

I'm conscious that something is screwy. I don't understand why an array of registers declared as...

    ////////////////////////////////////////////////////////////////////////////
    // Set up zero-page as register-based for speed reasons
    ////////////////////////////////////////////////////////////////////////////
    reg [`NW:0] zp[0:255]; // Zero-page

... should exhibit a whole bunch of warnings along the lines of

INFO: [Synth 8-5545] ROM "zp_reg[255]" won't be mapped to RAM because address size (32) is larger than maximum supported(25)"

Um, que ? Address size == 32 ? Even if you treat it as a 1-bit array, that's only 11 bits of address (8 * 256 = 2048) to access any given bit. Hmm, now there's a thought. I wonder if declaring:

    reg [2047:0] zp;

.. and doing the bit-selections might be a way to do it. No array, just a freaking huge register. I wonder how efficient it is at ganging up LUTs to make a combined single register...

I actually might try implementing a module along the lines of BobH's code above - rather than just declaring the register array, and see how that works out. At the moment I'm busy writing unit tests :)

Cheers
Simon

Article: 158479
On 12/2/2015 11:08 AM, Simon wrote: > On Tuesday, December 1, 2015 at 9:02:26 PM UTC-8, rickman wrote: >> >> You do have an issue if a block RAM is not being used. The code >> I've seen looks like you are writing from a functional perspective >> rather than structural. I would suggest you write a module for a >> block RAM using example code provided by your chip manufacturer. >> Then incorporate that RAM module into your code as appropriate. > > But I don't want a block-ram. I don't want to pay the penalty of a > clock-cycle for access to the values. I want a block of 256 > registers, which I can access with as-close-to-zero time cost as > possible. Block-ram's are great, but in this case I really want just > a whole bunch of registers. Ok, I understand better now. > I'm conscious that something is screwy. I don't understand why an > array of registers declared as... > > //////////////////////////////////////////////////////////////////////////// > // Set up zero-page as register-based for speed reasons > //////////////////////////////////////////////////////////////////////////// > reg [`NW:0] zp[0:255]; // Zero-page > > ... should exhibit a whole bunch of warnings along the lines of > > INFO: [Synth 8-5545] ROM "zp_reg[255]" won't be mapped to RAM because > address size (32) is larger than maximum supported(25)" > > Um, que ? Address size == 32 ? Even if you treat it as a 1-bit array, > that's only 11 bits of address (8 * 256 = 2048) to access any given > bit. Hmm, now there's a thought. I wonder if declaring: > > reg [2047:0] zp; > > .. and doing the bit-selections might be a way to do it. No array, > just a freaking huge register. I wonder how efficient it is at > ganging up LUTs to make a combined single register... > > I actually might try implementing a module along the lines of BobH's > code above - rather than just declaring the register array, and see > how that works out. At the moment I'm busy writing unit tests :) Now I am lost again. 
Why are you trying to change the code that is giving you 256 registers? The only RAM in FPGAs these days is synchronous RAM. If you don't want the address register delay then your only choice is to use fabric FFs.

--
Rick

Article: 158480
rickman wrote: > On 12/1/2015 8:55 PM, BobH wrote: >> On 11/30/2015 5:34 PM, rickman wrote: >>> On 11/30/2015 6:44 PM, BobH wrote: >>>> A mistake that I have made, is to mis-spell the wire connection and >>>> then >>>> there is no user for the outputs. The easiest way to check that is to >>>> inspect the simulation at the inputs to the next stage that uses the >>>> data and make sure that they are wiggling as you expect and not showing >>>> undefined as they would for an undriven wire. The second easiest way to >>>> check that is to eyeball the naming for this problem. >>> >>> If you make a spelling error, won't that be flagged because that signal >>> hasn't been declared? >>> >> Often the auto-wire "feature" will generate a replacement. If you go >> through the logs, it is noted, and usually the auto-wire will be a >> single wide signal instead of a bus, so it shows up that way too. > > That is why VHDL has strong typing, errors like this are made *very* clear. > You don't need VHDL, just Verilog 2001 and use `default_nettype none to prevent auto-wire generation. -- GaborArticle: 158481
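Gabor's suggestion can be sketched as follows. This is a minimal illustration, not code from the thread: the module names `stage_a` and `top` and the net `stage_data` are hypothetical. With the directive in place, a misspelled net in a port connection is reported as an undeclared identifier instead of becoming an implicit 1-bit wire.

```verilog
`default_nettype none   // undeclared identifiers become errors, not implicit wires

// Hypothetical pipeline stage, used only to give the example something to wire up.
module stage_a (
    input  wire       clk,
    input  wire [7:0] d,
    output reg  [7:0] q
);
    always @(posedge clk)
        q <= d;
endmodule

module top (
    input  wire       clk,
    input  wire [7:0] din,
    output wire [7:0] dout
);
    wire [7:0] stage_data;

    stage_a u_a (.clk(clk), .d(din),        .q(stage_data));
    // Writing .d(stage_dta) here by mistake would now fail to compile,
    // rather than silently creating an undriven 1-bit net.
    stage_a u_b (.clk(clk), .d(stage_data), .q(dout));
endmodule

`default_nettype wire   // restore the default for files compiled afterwards
```

Restoring the default at the end of the file is a common habit so that third-party sources compiled later, which may rely on implicit nets, are not affected.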
In article <n3lkei021un@news3.nntpjunkie.com>, BobH <wanderingmetalhead.nospam.please@yahoo.com> wrote:
>
>The brute force might look like:
>
>module reg_ram
>(
>  input wire [1:0] address,
>  input wire [7:0] write_data,
>  input wire write_en,
>  input wire clk,
>  input wire rstn,
>  output reg [7:0] read_data
>);
>
>reg [7:0] cell0, cell1, cell2, cell3;
>
>always @(posedge clk or negedge rstn)
>if (~rstn)
>  cell0 <= 8'h0;
>else
>  if (write_en & (address == 2'h0))
>    cell0 <= write_data;
>
>always @(posedge clk or negedge rstn)
>if (~rstn)
>  cell1 <= 8'h0;
>else
>  if (write_en & (address == 2'h1))
>    cell1 <= write_data;
>
> <snip>
>  case (address)
>    2'h0: read_data = cell0;
>    2'h1: read_data = cell1;
>    2'h2: read_data = cell2;
>    2'h3: read_data = cell3;
>  endcase
>endmodule
>
>As rude as this looks, most of the other structures that I can think of
>result in something that looks like a huge barrel shifter and are larger
>to implement.

<snip>

Huh. I missed what led up to this, but explicitly coding up each case like this is entirely unnecessary in verilog.

reg [ 7 : 0] cell [ 3 : 0];
always @( posedge clk ) // NO ASYNC RESET - messes up optimization - no reset at all actually is preferred
  if( write_en )
    cell[ address ] <= write_data;

always @*
  read_data = cell[ address ];

Done. If resets are needed then it won't map to block RAM. Xilinx has examples in their docs for how to successfully infer block RAM.

Regards,
Mark

Article: 158482
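For reference, Mark's array-indexing version can be wrapped into a complete module along these lines. This is a sketch under assumptions: the parameter names `ADDR_W` and `DATA_W` and the array name `mem` are not from the thread. The array is called `mem` here because `cell` is a reserved word in Verilog-2001 and some tools may reject it as an identifier.

```verilog
// Small register file: synchronous write, asynchronous (combinational) read.
module reg_ram #(
    parameter ADDR_W = 2,   // assumed parameter names, for illustration only
    parameter DATA_W = 8
) (
    input  wire              clk,
    input  wire              write_en,
    input  wire [ADDR_W-1:0] address,
    input  wire [DATA_W-1:0] write_data,
    output wire [DATA_W-1:0] read_data
);
    reg [DATA_W-1:0] mem [0:(1<<ADDR_W)-1];

    // No reset on the storage, as recommended in the post above.
    always @(posedge clk)
        if (write_en)
            mem[address] <= write_data;

    // The read follows the address without waiting for a clock edge.
    assign read_data = mem[address];
endmodule
```

A continuous `assign` is used for the read instead of `always @*`; both describe the same combinational read, and the choice here is only a matter of style.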
On Wednesday, December 2, 2015 at 8:28:10 AM UTC-8, rickman wrote:
>
> > I actually might try implementing a module along the lines of BobH's
> > code above - rather than just declaring the register array, and see
> > how that works out. At the moment I'm busy writing unit tests :)
>
> Now I am lost again. Why are you trying to change the code that is
> giving you 256 registers? The only RAM in FPGAs these days is
> synchronous RAM. If you don't want the address register delay then your
> only choice is to use fabric FFs.

Maybe I'm reading/understanding it incorrectly - it looks to me that there's an always @ (posedge(clk)) dependency for writes - but I'm relatively fine with that - I won't need the data until the next clock anyway if I'm writing, because that's how the 6502 worked.

For reads, it looked to me as though it used always @ (*), and I (perhaps incorrectly) thought that would get me the results on the module's data bus as soon as the 'address' lines changed.

As for why to change it, I don't like it when I don't understand the error/info messages the tool is giving me. Given my (relatively limited) understanding of what the synthesis tool is actually *doing* under the hood, it probably means I'm not getting what I actually want, or if I am, it's in some highly-inefficient manner. Your comment about inferring extra adders unnecessarily is pretty relevant I feel :)

It does tie me to a single write/read per clock, whereas I could set N registers per clock (and thus "push" 3 elements onto the stack for the BRK instruction in a single clock for example), but I'm actually ok with that too, I think. The 6502 only had 1 databus, so *it* took multiple clocks to do multiple writes as well.

It's entirely possible my understanding of the module is flawed. I'm happy to be corrected :)

Cheers
Simon

Article: 158483
On 12/2/2015 10:55 AM, Mark Curry wrote: > In article <n3lkei021un@news3.nntpjunkie.com>, > <snip> > > Huh. I missed what led up to this, but explicity coding up each case > like this is entirely unneccesary in verilog. > > reg [ 7 : 0] cell [ 3 : 0]; > always @( posedge clk ) // NO ASYNC RESET - messes up optimization - no reset at all actually is prefered > if( write_en ) > cell[ address ] <= write_data; > > always @* > read_data = cell[ address ]; > > Done. If reset's are needed then it won't map to block RAM. > Xilinx has examples in their docs for how to successfully infer block RAM. > Thanks! I have always explicitly built a model RAM when I wanted RAM in an FPGA rather than inferring one. I just automatically include the reset when I do D flops because it makes the simulation cleaner. I think that the original poster wanted an array of flops thinking that they would be faster than block ram. This looks worth messing with when I get some breathing space. I am a little curious about the synthesizabilty of it. Regards, BobHArticle: 158484
On 12/2/2015 9:08 AM, Simon wrote: > On Tuesday, December 1, 2015 at 9:02:26 PM UTC-8, rickman wrote: >> >> You do have an issue if a block RAM is not being used. The code I've >> seen looks like you are writing from a functional perspective rather >> than structural. I would suggest you write a module for a block RAM >> using example code provided by your chip manufacturer. Then incorporate >> that RAM module into your code as appropriate. > > But I don't want a block-ram. I don't want to pay the penalty of a clock-cycle > for access to the values. I want a block of 256 registers, which I can access > with as-close-to-zero time cost as possible. Block-ram's are great, but in > this case I really want just a whole bunch of registers. > > I'm conscious that something is screwy. I don't understand why an array of > registers declared as... > > //////////////////////////////////////////////////////////////////////////// > // Set up zero-page as register-based for speed reasons > //////////////////////////////////////////////////////////////////////////// > reg [`NW:0] zp[0:255]; // Zero-page > > ... should exhibit a whole bunch of warnings along the lines of > > INFO: [Synth 8-5545] ROM "zp_reg[255]" won't be mapped to RAM because > address size (32) is larger than maximum supported(25)" Block ram tends to be smallish and often wierd sizes. > > Um, que ? Address size == 32 ? Even if you treat it as a 1-bit array, that's > only 11 bits of address (8 * 256 = 2048) to access any given bit. Hmm, now there's > a thought. I wonder if declaring: > > reg [2047:0] zp; > > .. and doing the bit-selections might be a way to do it. No array, just a freaking > huge register. I wonder how efficient it is at ganging up LUTs to make a combined single > register... This will result in a huge barrel shifter which will likely get slow. I don't know what your clock speeds are relative the the FPGA capability, but I don't like the big barrel shifter implementations. 
If your clock speeds are a few MHz and you are using a modern FPGA, you probably can afford to implement it that way.

> I actually might try implementing a module along the lines of BobH's code above - rather
> than just declaring the register array, and see how that works out. At the moment
> I'm busy writing unit tests :)

Try Mark Curry's suggested syntax. If it is synthesizable, it will be MUCH easier to implement! From Mark's comment, if you include the reset, it should prevent the replacement of FF's with block RAM.

Regards,
BobH

Article: 158485
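For comparison, the single 2048-bit register Simon floated could be written with a Verilog-2001 indexed part-select, roughly as below. This is only a sketch: the module name `zp_flat` and the signal `base` are made up for illustration, and nothing here changes BobH's point that the variable bit-select is likely to synthesize into a large multiplexer or shifter structure.

```verilog
// Zero page stored as one wide vector, addressed a byte at a time.
module zp_flat (
    input  wire       clk,
    input  wire       write_en,
    input  wire [7:0] address,
    input  wire [7:0] write_data,
    output wire [7:0] read_data
);
    reg [2047:0] zp;

    // Bit offset of the selected byte: address * 8, kept to 11 bits.
    wire [10:0] base = {address, 3'b000};

    always @(posedge clk)
        if (write_en)
            zp[base +: 8] <= write_data;   // write 8 bits starting at 'base'

    assign read_data = zp[base +: 8];      // combinational read of the same slice
endmodule
```

Functionally this is the same storage as `reg [7:0] zp [0:255]`, so the array form is usually the more readable way to express it.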
In article <n3nup505fn@news3.nntpjunkie.com>, BobH <wanderingmetalhead.nospam.please@yahoo.com> wrote:
>On 12/2/2015 10:55 AM, Mark Curry wrote:
>> In article <n3lkei021un@news3.nntpjunkie.com>,
>> <snip>
>>
>> Huh. I missed what led up to this, but explicitly coding up each case
>> like this is entirely unnecessary in verilog.
>>
>> reg [ 7 : 0] cell [ 3 : 0];
>> always @( posedge clk ) // NO ASYNC RESET - messes up optimization - no reset at all actually is preferred
>>   if( write_en )
>>     cell[ address ] <= write_data;
>>
>> always @*
>>   read_data = cell[ address ];
>>
>> Done. If resets are needed then it won't map to block RAM.
>> Xilinx has examples in their docs for how to successfully infer block RAM.
>>
>
>Thanks! I have always explicitly built a model RAM when I wanted RAM in
>an FPGA rather than inferring one. I just automatically include the
>reset when I do D flops because it makes the simulation cleaner.
> I think that the original poster wanted an array of flops thinking
>that they would be faster than block ram.
>
>This looks worth messing with when I get some breathing space. I am a
>little curious about the synthesizability of it.

Bob - it's all synthesizable for FPGAs just fine. The only trick is when you definitely want to infer Block RAMs. In that case, it's best to check the Xilinx docs, and use their templates, with little modification. You can modify the Xilinx template, for instance, to make the RAM width and depth a parameter. But stray too far, and it may trip up.

And when I say trip up - I mean it'll synthesize to something that matches your description - however it may mess up and build it out of FFs instead of Block RAMs.

(You may also optionally attach a pragma to FORCE it to map to FFs - in the case you mentioned above where you may want the faster access. Just don't make it a very big array!)

Play with it when you have time. It's an excellent tool in your toolbox.

Regards,
Mark

Article: 158486
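As an illustration of the pragma Mark mentions: in Xilinx Vivado the relevant synthesis attribute is `ram_style`, and a value of "registers" asks the tool to build the array from flip-flops. The sketch below assumes Vivado; other vendors use different attribute names, so the tool's synthesis guide is the authority. The module name `zero_page` is hypothetical.

```verilog
// 256 x 8 zero page that the tool is asked to keep in fabric flip-flops.
module zero_page (
    input  wire       clk,
    input  wire       write_en,
    input  wire [7:0] address,
    input  wire [7:0] write_data,
    output wire [7:0] read_data
);
    // Vendor-specific attribute (Vivado); an assumption about the toolchain.
    (* ram_style = "registers" *)
    reg [7:0] zp [0:255];

    always @(posedge clk)
        if (write_en)
            zp[address] <= write_data;

    assign read_data = zp[address];   // asynchronous read
endmodule
```

As Mark warns, this costs one flip-flop per bit plus the read multiplexer, so it is best reserved for small arrays.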
On 12/2/2015 6:27 PM, BobH wrote: > On 12/2/2015 10:55 AM, Mark Curry wrote: >> In article <n3lkei021un@news3.nntpjunkie.com>, >> <snip> >> >> Huh. I missed what led up to this, but explicity coding up each case >> like this is entirely unneccesary in verilog. >> >> reg [ 7 : 0] cell [ 3 : 0]; >> always @( posedge clk ) // NO ASYNC RESET - messes up optimization - >> no reset at all actually is prefered >> if( write_en ) >> cell[ address ] <= write_data; >> >> always @* >> read_data = cell[ address ]; >> >> Done. If reset's are needed then it won't map to block RAM. >> Xilinx has examples in their docs for how to successfully infer block >> RAM. >> > > Thanks! I have always explicitly built a model RAM when I wanted RAM in > an FPGA rather than inferring one. I just automatically include the > reset when I do D flops because it makes the simulation cleaner. > I think that the original poster wanted an array of flops thinking > that they would be faster than block ram. > > This looks worth messing with when I get some breathing space. I am a > little curious about the synthesizabilty of it. I'm not sure the above is a correct model for block RAMs in many devices. The ones I have used have a register delay even in the read path. There can be separate interfaces (address, controls and data) for reading and writing, but in all cases the read data is registered. What devices will this model work for? Or maybe I'm not so familiar with Verilog. The read path in the above description is async, no? -- RickArticle: 158487
On 12/2/2015 1:01 PM, Simon wrote: > On Wednesday, December 2, 2015 at 8:28:10 AM UTC-8, rickman wrote: >> >>> I actually might try implementing a module along the lines of >>> BobH's code above - rather than just declaring the register >>> array, and see how that works out. At the moment I'm busy writing >>> unit tests :) >> >> Now I am lost again. Why are you trying to change the code that >> is giving you 256 registers? The only RAM in FPGAs these days is >> synchronous RAM. If you don't want the address register delay then >> your only choice is to use fabric FFs. > > > Maybe I'm reading/understanding it incorrectly - it looks to me that > there's an always @ (posedge(clk)) dependency for writes - but I'm > relatively fine with that - I won't need the data until the next > clock anyway if I'm writing, because that's how the 6502 worked. My understanding is that all block RAM have a register in the read path, I've always considered there is a register in the input side of address, data in and control rather than worrying about any internal details. It all works the same. Looks like I had forgotten about the distributed RAM. It has async read and sync write. So your model will work just fine. > For reads, it looked to me as though it used always @ (*), and I > (perhaps incorrectly) thought that would get me the results on the > module's data bus as soon as the 'address' lines changed. > > As for why to change it, I don't like it when I don't understand the > error/info messages the tool is giving me. Given my (relatively > limited) understanding of what the synthesis tool is actually *doing* > under the hood, it probably means I'm not getting what I actually > want, or if I am, it's in some highly-inefficient manner. Your > comment about inferring extra adders unnecessarily is pretty relevant > I feel :) Now that my misunderstanding is straightened out I see what you are saying. I don't understand the error message either, but then I can't see the code. 
Try isolating the error to a smaller section of code. Obviously there is something else going on that it thinks an 8 bit address RAM is being indexed by a 32 bit value. I expect it has something to do with the way you are using the array rather than the way you are declaring it. > It does tie me to a single write/read per clock, whereas I could set > N registers per clock (and thus "push" 3 elements onto the stack for > the BRK instruction in a single clock for example), but I'm actually > ok with that too, I think. The 6502 only had 1 databus, so *it* took > multiple clocks to do multiple writes as well. > > Its entirely possible my understanding of the module is flawed. I'm > happy to be corrected :) -- RickArticle: 158488
On 12/3/2015 12:01 AM, rickman wrote: > On 12/2/2015 6:27 PM, BobH wrote: >> On 12/2/2015 10:55 AM, Mark Curry wrote: >>> In article <n3lkei021un@news3.nntpjunkie.com>, >>> <snip> >>> >>> Huh. I missed what led up to this, but explicity coding up each case >>> like this is entirely unneccesary in verilog. >>> >>> reg [ 7 : 0] cell [ 3 : 0]; >>> always @( posedge clk ) // NO ASYNC RESET - messes up optimization - >>> no reset at all actually is prefered >>> if( write_en ) >>> cell[ address ] <= write_data; >>> >>> always @* >>> read_data = cell[ address ]; >>> >>> Done. If reset's are needed then it won't map to block RAM. >>> Xilinx has examples in their docs for how to successfully infer block >>> RAM. >>> >> >> Thanks! I have always explicitly built a model RAM when I wanted RAM in >> an FPGA rather than inferring one. I just automatically include the >> reset when I do D flops because it makes the simulation cleaner. >> I think that the original poster wanted an array of flops thinking >> that they would be faster than block ram. >> >> This looks worth messing with when I get some breathing space. I am a >> little curious about the synthesizabilty of it. > > I'm not sure the above is a correct model for block RAMs in many > devices. The ones I have used have a register delay even in the read > path. There can be separate interfaces (address, controls and data) > for reading and writing, but in all cases the read data is registered. > > What devices will this model work for? Or maybe I'm not so familiar > with Verilog. The read path in the above description is async, no? I Googled and found the distributed RAM in the Xilinx parts support async reads. So I am clear on this now. I must have forgotten this. -- RickArticle: 158489
rickman <gnuarm@gmail.com> wrote: > On 12/2/2015 6:27 PM, BobH wrote: >> On 12/2/2015 10:55 AM, Mark Curry wrote: (snip) >>> reg [ 7 : 0] cell [ 3 : 0]; >>> always @( posedge clk ) // NO ASYNC RESET - messes up optimization - >>> no reset at all actually is prefered >>> if( write_en ) >>> cell[ address ] <= write_data; >>> always @* >>> read_data = cell[ address ]; (snip) >> Thanks! I have always explicitly built a model RAM when I wanted RAM in >> an FPGA rather than inferring one. I just automatically include the >> reset when I do D flops because it makes the simulation cleaner. >> I think that the original poster wanted an array of flops thinking >> that they would be faster than block ram. (snip) > I'm not sure the above is a correct model for block RAMs in many > devices. The ones I have used have a register delay even in the read > path. There can be separate interfaces (address, controls and data) > for reading and writing, but in all cases the read data is registered. > What devices will this model work for? Or maybe I'm not so familiar > with Verilog. The read path in the above description is async, no? Yes that has async. read and sync. write, and that doesn't work with the usual block RAM. I am not sure if it wants the register before, or after, or if it doesn't matter. -- glenArticle: 158490
rickman <gnuarm@gmail.com> wrote:

(snip)

> I Googled and found the distributed RAM in the Xilinx parts support
> async reads. So I am clear on this now. I must have forgotten this.

The distributed RAM is just the usual LUTs, so it supports asynchronous read the same way the LUTs do when they are used as gates. I think they also support asynchronous write, but that is less obvious.

-- glen

Article: 158491
On 12/3/2015 1:40 AM, glen herrmannsfeldt wrote: > rickman <gnuarm@gmail.com> wrote: > > (snip) > >> I Googled and found the distributed RAM in the Xilinx parts support >> async reads. So I am clear on this now. I must have forgotten this. > > The distributed RAM is just the usual LUTs, so support asynchronous > read the same way they do when they are gates. I think they also > support asynchronous write, but that is less obvious. No, they do not support async writes. I recall it was in the XC4000 series they got rid of async writes because they had so much trouble supporting it. Basically there were too many users who didn't know how to properly use async memory. There may have been some technical advantages to using a sync write for the FPGA designers, but I am pretty sure it was really an issue of complaints that it didn't work right which really meant they were not meeting the specs on the pulse width of the write strobe. Async RAM has a lot of timing details to meet compared to the sync version. With sync it is basically just setup and hold of the inputs. -- RickArticle: 158492
On 12/3/2015 1:34 AM, glen herrmannsfeldt wrote: > rickman <gnuarm@gmail.com> wrote: >> On 12/2/2015 6:27 PM, BobH wrote: >>> On 12/2/2015 10:55 AM, Mark Curry wrote: > > (snip) >>>> reg [ 7 : 0] cell [ 3 : 0]; >>>> always @( posedge clk ) // NO ASYNC RESET - messes up optimization - >>>> no reset at all actually is prefered >>>> if( write_en ) >>>> cell[ address ] <= write_data; > >>>> always @* >>>> read_data = cell[ address ]; > > (snip) >>> Thanks! I have always explicitly built a model RAM when I wanted RAM in >>> an FPGA rather than inferring one. I just automatically include the >>> reset when I do D flops because it makes the simulation cleaner. >>> I think that the original poster wanted an array of flops thinking >>> that they would be faster than block ram. > > (snip) >> I'm not sure the above is a correct model for block RAMs in many >> devices. The ones I have used have a register delay even in the read >> path. There can be separate interfaces (address, controls and data) >> for reading and writing, but in all cases the read data is registered. > >> What devices will this model work for? Or maybe I'm not so familiar >> with Verilog. The read path in the above description is async, no? > > Yes that has async. read and sync. write, and that doesn't > work with the usual block RAM. > > I am not sure if it wants the register before, or after, or > if it doesn't matter. I'm not sure what you mean. Before or after what exactly? -- RickArticle: 158493
>>
>> Yes that has async. read and sync. write, and that doesn't
>> work with the usual block RAM.
>>
>> I am not sure if it wants the register before, or after, or
>> if it doesn't matter.
>
>I'm not sure what you mean. Before or after what exactly?
>
>--
>
>Rick

Do you put a register before or after the RAM array?

You can register the addresses and then do an asynchronous read, or you can do an asynchronous read and then register the data.

The difference is known as writethru. If you have a dual-port SRAM and do both a read and a write operation to the same address in the same cycle, do you read the old data or the new?

In the first case you will get the new data, while the second case will give you the old data.

In the first case the write data is written through the SRAM to become the read data.

Selection depends on the circuit needs. If you are using SRAM in a FIFO, then writing to a completely full FIFO on exactly the same cycle that data is popped off will not work with writethru. You want to pop off the oldest and replace it with the newest.

If the SRAM is a CPU register bank and you store in register X followed by an instruction that uses register X, then pipelining will read the new data on the same cycle that it writes it to RAM. In that case you must have writethru.

John Eaton

---------------------------------------
Posted through http://www.FPGARelated.com

Article: 158494
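The two register placements John describes can be sketched side by side. The module and signal names below are made up for illustration, and the 16 x 8 size is arbitrary. In the first module the read data is registered after an asynchronous read, so a same-cycle collision returns the old contents; in the second the read address is registered first, so the new data appears (writethru).

```verilog
// Register AFTER the array: a colliding read returns the OLD data.
module ram_read_old (
    input  wire       clk,
    input  wire       we,
    input  wire [3:0] waddr,
    input  wire [3:0] raddr,
    input  wire [7:0] wdata,
    output reg  [7:0] rdata
);
    reg [7:0] mem [0:15];

    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;
        rdata <= mem[raddr];      // samples the array before this edge's write lands
    end
endmodule

// Register BEFORE the array (on the address): a colliding read returns the NEW data.
module ram_write_through (
    input  wire       clk,
    input  wire       we,
    input  wire [3:0] waddr,
    input  wire [3:0] raddr,
    input  wire [7:0] wdata,
    output wire [7:0] rdata
);
    reg [7:0] mem [0:15];
    reg [3:0] raddr_q;

    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;
        raddr_q <= raddr;         // only the address is registered
    end

    assign rdata = mem[raddr_q];  // combinational read sees the updated array
endmodule
```

Both have one clock of read latency; they differ only in what a simultaneous read and write to the same address returns.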
On 12/3/2015 11:27 AM, jt_eaton wrote: >>> >>> Yes that has async. read and sync. write, and that doesn't >>> work with the usual block RAM. >>> >>> I am not sure if it wants the register before, or after, or >>> if it doesn't matter. >> >> I'm not sure what you mean. Before or after what exactly? >> >> -- >> >> Rick > > Do you put a register before or after the ram array. > > You can register the addresses and then do an asynchronous read or you can > do an asynchronous read and then register the data. > > > The difference is known as writethru. If you have a dual port sram and do > both a read and write operation to the same address in the same cycle then > do you read the old data or the new? Yes. > In the first case you will get the new data while the second case will > give you the old data. > > In the first case the write data is written though the sram to become the > read data. > > Selection depends on the circuit needs. If you are using sram in a fifo > then writting to a completely full fifo on exactly the same cycle that > data is popped off will not work with writethru. You want > to pop off the oldest and replace it with the newest. > > If sram is a cpu register bank and you store in register X followed by an > instruction the uses register X then pipelining will read the new data on > the same cycle that it writes it to ram. In that case you must have > writethru. Different vendors give the modes different names, but essentially on block RAM writes the read data can be the old data, the new data or the read data port is held at the last value with no change. None of this is affected by where you put the registers in your HDL. This is typically controlled by attributes. -- RickArticle: 158495
> INFO: [Synth 8-5545] ROM "zp_reg[255]" won't be mapped to RAM because
> address size (32) is larger than maximum supported(25)"

The problem might be how you are indexing that small block of registers: is the address being used to index zp_reg also 8 bits?

Also, why 255 elements and not 256?

Mike

Article: 158496
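One way an 8-bit array can end up with a 32-bit index, offered here only as a possible cause since Simon's code is not shown in the thread: an index expression such as `stack[sp + 1]` is sized by the unsized literal `1`, which is at least 32 bits wide, so the whole expression becomes 32 bits. Keeping the arithmetic at 8 bits avoids that. The module and signal names below are hypothetical.

```verilog
// Stack page indexed by an explicitly 8-bit pointer expression.
module stack_read (
    input  wire       clk,
    input  wire       write_en,
    input  wire [7:0] sp,          // stack pointer, assumed to be 8 bits wide
    input  wire [7:0] write_data,
    output wire [7:0] pulled
);
    reg [7:0] stack [0:255];

    always @(posedge clk)
        if (write_en)
            stack[sp] <= write_data;

    // stack[sp + 1] would be evaluated at 32 bits because of the unsized 1.
    // A sized constant and an 8-bit net keep the index at 8 bits (wrapping at 256).
    wire [7:0] sp_next = sp + 8'd1;

    assign pulled = stack[sp_next];
endmodule
```

Declaring the index as an `integer`, or using a 32-bit loop variable, would have the same widening effect.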
rickman <gnuarm@gmail.com> wrote: (snip) >>>>> always @( posedge clk ) // NO ASYNC RESET - messes up optimization - >>>>> no reset at all actually is prefered >>>>> if( write_en ) >>>>> cell[ address ] <= write_data; >>>>> always @* >>>>> read_data = cell[ address ]; (snip) >> Yes that has async. read and sync. write, and that doesn't >> work with the usual block RAM. >> I am not sure if it wants the register before, or after, or >> if it doesn't matter. > I'm not sure what you mean. Before or after what exactly? For the case of reading, so consider a ROM, do you put the register on the address inputs, or the data outputs? Or, since the difference is only delay, can the synthesis tools move it from one to the other? -- glenArticle: 158497
jt_eaton <84408@fpgarelated> wrote: (snip, I wrote) >>> I am not sure if it wants the register before, or after, or >>> if it doesn't matter. >>I'm not sure what you mean. Before or after what exactly? (snip) > Do you put a register before or after the ram array. Yes, that is what I meant. > You can register the addresses and then do an asynchronous read or > you can do an asynchronous read and then register the data. > The difference is known as writethru. If you have a dual port sram and do > both a read and write operation to the same address in the same cycle then > do you read the old data or the new? > In the first case you will get the new data while the second case will > give you the old data. > In the first case the write data is written through the sram to > become the read data. > Selection depends on the circuit needs. If you are using sram in a fifo > then writing to a completely full fifo on exactly the same cycle that > data is popped off will not work with writethru. You want > to pop off the oldest and replace it with the newest. If the FIFO has the same clock for both, then I suppose you can do that. With asynchronous read and write, you can't really do that, as you can't prevent the read from coming just slightly after the write. Most FIFOs have an "almost full" flag that helps avoid that, and also allows for other delays in stopping incoming data. > If sram is a cpu register bank and you store in register X followed by an > instruction that uses register X then pipelining will read the new data on > the same cycle that it writes it to ram. In that case you must have > writethru. Or you add extra logic to bypass the RAM in that case. -- glenArticle: 158498
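The "extra logic to bypass the RAM" idea can be sketched roughly as follows. This is a minimal illustration, not code from the thread: it assumes a single clock, a memory whose registered read returns the old word on an address collision, and hypothetical names throughout.

```verilog
// Illustrative sketch; module, port and signal names are assumptions.
module ram_bypass #(
    parameter AW = 4,
    parameter DW = 8
) (
    input  wire          clk,
    input  wire          we,
    input  wire [AW-1:0] waddr,
    input  wire [DW-1:0] wdata,
    input  wire [AW-1:0] raddr,
    output wire [DW-1:0] rdata
);
    reg [DW-1:0] mem [0:(1<<AW)-1];
    reg [DW-1:0] mem_q;     // registered RAM output (old word on a collision)
    reg [DW-1:0] wdata_q;   // copy of the data written in the same cycle
    reg          hit_q;     // set when read and write addresses matched

    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;
        mem_q   <= mem[raddr];
        wdata_q <= wdata;
        hit_q   <= we && (waddr == raddr);
    end

    // On a same-cycle collision, forward the written data around the RAM.
    assign rdata = hit_q ? wdata_q : mem_q;
endmodule
```

The comparator and mux cost a little logic, but they make the result independent of whichever read-during-write mode the underlying RAM happens to provide.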
In article <n3oi6n$23b$1@dont-email.me>, rickman <gnuarm@gmail.com> wrote: >On 12/2/2015 6:27 PM, BobH wrote: >> On 12/2/2015 10:55 AM, Mark Curry wrote: >>> In article <n3lkei021un@news3.nntpjunkie.com>, >>> <snip> >>> >>> Huh. I missed what led up to this, but explicitly coding up each case >>> like this is entirely unnecessary in verilog. >>> >>> reg [ 7 : 0] cell [ 3 : 0]; >>> always @( posedge clk ) // NO ASYNC RESET - messes up optimization - >>> no reset at all actually is preferred >>> if( write_en ) >>> cell[ address ] <= write_data; >>> >>> always @* >>> read_data = cell[ address ]; >>> >>> Done. If resets are needed then it won't map to block RAM. >>> Xilinx has examples in their docs for how to successfully infer block >>> RAM. >>> >> >> Thanks! I have always explicitly built a model RAM when I wanted RAM in >> an FPGA rather than inferring one. I just automatically include the >> reset when I do D flops because it makes the simulation cleaner. >> I think that the original poster wanted an array of flops thinking >> that they would be faster than block ram. >> >> This looks worth messing with when I get some breathing space. I am a >> little curious about the synthesizability of it. > >I'm not sure the above is a correct model for block RAMs in many >devices. The ones I have used have a register delay even in the read >path. There can be separate interfaces (address, controls and data) >for reading and writing, but in all cases the read data is registered. > >What devices will this model work for? Or maybe I'm not so familiar >with Verilog. The read path in the above description is async, no? Rick, My only real point in the above code was showing it was possible to index into a multi-dimensional array in Verilog in synthesizable code. One doesn't need to explicitly code out each index. Synthesis WILL build SOMETHING for all of these variations. It's all synthesizable. 
Now, if you're intending to map specifically to BLOCK or Distributed memories, then I strongly suggest checking the vendor documentation and using their templates. It's easy to trip up the tools and have them not build what you intended. Your example is a simple one. If you want to generate a BRAM, then you must register your read data (as well as your write). Missing this, you'll get Distributed (or FFs!). Regards, MarkArticle: 158499
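For comparison with the asynchronous-read snippet quoted above, a registered-read version might look like the sketch below. It is a generic illustration, not a vendor template; the module name and parameters are assumptions, and the array is called mem because cell is a reserved word in Verilog-2001. Check the vendor's own inference templates before relying on it.

```verilog
// Generic sketch; not taken from any vendor's documentation.
module sync_read_ram #(
    parameter AW = 2,   // 4 entries, as in the quoted example
    parameter DW = 8
) (
    input  wire          clk,
    input  wire          write_en,
    input  wire [AW-1:0] address,
    input  wire [DW-1:0] write_data,
    output reg  [DW-1:0] read_data
);
    reg [DW-1:0] mem [0:(1<<AW)-1];

    // No reset on the array or on the read register.
    always @(posedge clk) begin
        if (write_en)
            mem[address] <= write_data;
        read_data <= mem[address];   // read data is registered
    end
endmodule
```

A memory this small may still end up in distributed RAM or flip-flops, since tools often apply a size threshold before using a block RAM.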
On 12/3/2015 12:59 PM, glen herrmannsfeldt wrote: > rickman <gnuarm@gmail.com> wrote: > (snip) >>>>>> always @( posedge clk ) // NO ASYNC RESET - messes up optimization - >>>>>> no reset at all actually is preferred >>>>>> if( write_en ) >>>>>> cell[ address ] <= write_data; > >>>>>> always @* >>>>>> read_data = cell[ address ]; > > (snip) >>> Yes that has async. read and sync. write, and that doesn't >>> work with the usual block RAM. > >>> I am not sure if it wants the register before, or after, or >>> if it doesn't matter. > >> I'm not sure what you mean. Before or after what exactly? > > For the case of reading, so consider a ROM, do you put the > register on the address inputs, or the data outputs? > > Or, since the difference is only delay, can the synthesis tools > move it from one to the other? Rather than try to guess what is happening, just read the vendor's documentation and copy their examples for inferring RAM. I know Xilinx gives this info. I looked at an 8-year-old document from Lattice, and it says there are enough subtle differences between vendors that there is little point in inferring block RAM, so just instantiate it (a newer document may have different recommendations). I don't like that advice and have never had any trouble with inference. I always put the registers at the inputs to the RAM, since in some families there is an optional additional register on the data output. Otherwise I expect there is no difference based on where you put it... -- Rick
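Putting the register on the RAM inputs, as described above, could be written along these lines. This is a hedged sketch with made-up names, not a vendor template; whether a particular tool maps it to block RAM has to be checked against that tool's documentation.

```verilog
// Sketch with the read address registered; names are illustrative.
module addr_reg_ram #(
    parameter AW = 4,
    parameter DW = 8
) (
    input  wire          clk,
    input  wire          write_en,
    input  wire [AW-1:0] address,
    input  wire [DW-1:0] write_data,
    output wire [DW-1:0] read_data
);
    reg [DW-1:0] mem [0:(1<<AW)-1];
    reg [AW-1:0] address_q;   // register on the address input

    always @(posedge clk) begin
        if (write_en)
            mem[address] <= write_data;
        address_q <= address;
    end

    // The array is read through the registered address, so a write to
    // the same address in that cycle shows up as new data on read_data.
    assign read_data = mem[address_q];
endmodule
```

In simulation this form returns the newly written word after a same-address write, which matches the "register the addresses" case discussed earlier in the thread.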