Or,

Use an inverted version of the clock to a BUFGMUX, and when it isn't right, switch the BUFGMUX to the other clock input.

Austin

Article: 117426
On Mar 28, 11:04 am, "Alan Nishioka" <a...@nishioka.com> wrote:
> On Mar 27, 11:14 am, "radarman" <jsham...@gmail.com> wrote:
> > The watchdog timer otherwise works fine. I can start/stop/reset the
> > timer with no problem. The trouble is that when the watchdog timer
> > DOES cause a reset, it won't release the reset line, which was
> > effectively locking up the system. It did NOT respond like the
> > datasheet.
>
> The source in /cygdrive/c/EDK81/hw/XilinxProcessorIPLib/pcores/
> opb_timebase_wdt_v1_00_a/hdl/vhdl/timebase_wdt_core.vhd says that the
> state machine will only leave the ExpiredTwice state on OPB_Rst.
>
> So you need an OPB_Rst to release the WDT_Reset line.
>
> Alan Nishioka

After I inserted the edge detector, the OPB reset was being released. The trouble is that the watchdog timer refused to release ITS reset output, even after OPB Reset cycled.

The reset controller is operating from the input clock, not the DCM clock. It has to, in order to bring the DCM out of reset. The clock on this board is a bit difficult, because it's not free-running at configuration time.

Ultimately, the problem was solved another way. We have an async reset line going to another FPGA that can handle the problem. If the communications link goes down, the controller will just hit the "reset button".

Article: 117427
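For reference, the kind of edge detector mentioned above - turning a long reset level into a single-cycle pulse - is a two-line VHDL idiom. A minimal sketch; the entity and signal names are illustrative, not from the poster's design:

    library ieee;
    use ieee.std_logic_1164.all;

    entity edge_det is
      port (
        clk   : in  std_logic;  -- free-running input clock
        level : in  std_logic;  -- long reset-request level (e.g. WDT_Reset)
        pulse : out std_logic   -- one-cycle pulse on the rising edge of level
      );
    end entity;

    architecture rtl of edge_det is
      signal level_d : std_logic := '0';
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          level_d <= level;  -- delayed copy of the input
        end if;
      end process;

      -- High for exactly one clk cycle when level goes 0 -> 1.
      pulse <= level and not level_d;
    end architecture;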
Dolphin wrote:
> Rob,
>
> If I take a look in the timing analysis then I see that not all my
> LVDS pins have the same setup time. I am worried that this causes the
> problem.
> 0.120 ns TA1
> ...
> 5.128 ns TE2
> 5.121 ns TE2

This definitely calls for a Tsu constraint. Cyclone II devices have programmable input delay chains (in order to make Tsu/Th tradeoffs), and it looks like these chains have been set to 0 on TA, and to 7 on the other ones.

You can find the actual Input Delay Chain values if, in the compilation report, you go to the fitter report, Resource section, and then expand the "Input pins" section. It's in one of the columns there.

I find that Quartus by default (i.e. unconstrained) sets only one bit in a bus to delay tap 0 and the rest to 7. It's not really a Quartus 'bug', according to someone at Altera - the mismatch is there in order to show up and remind people to constrain their input buses...

I suggest you set Tsu for these inputs to 0.4 ns and recompile. You may get some errors, but at least the delay chain will be set to 0. Then raise the Tsu to a little more than the largest Tsu you find after the initial compilation and you should be all set.

Hope this helps a bit.

Best regards,
Ben

Article: 117428
Ken Soon wrote:
> Well, the reference design data file did say that the purpose of the
> wrapper is to minimize the number of external interconnects required by
> the scaler so that it can fit into a realistic target device. My wrapper
> instantiates 2 sequential lookup tables, 6 horizontal coefficient tables
> and 3 vertical coefficient tables.

So your scaler uses an FIR filter for scaling with 3x6 grid sampling, and the coefficient sets for each available scaling factor are stored in BRAMs... yup, this can cost a handful of BRAMs - and you will have a hard time dumping these or your video line buffers in DRAMs.

You really need to look into exactly what is consuming how many BRAMs, how, and why, as I suggested in my previous message. Coefficient tables and video line buffers will be difficult to shift into DRAM: you will need some BRAMs to buffer data to/from the DRAMs, and you are not going to be any better off if you end up with as many FIFO BRAMs as you originally needed plain BRAMs for the initial design... actually, you will be worse off given the extra glue logic.

If coefficient tables are eating up those ~40 BRAMs, you may be in serious trouble, since DRAMs cannot be programmed by a bitfile - you will need some method of initializing the DRAM. Analyze the design carefully to see how the BRAMs are used; there may be a few reduction tricks that can be applied to spare a few.

> Hmm, I guess I will be looking at other designs which have an interface
> with the DDR SDRAM controller and from there, try to understand the
> interface, and I hope to be able to do the same for my video scaler design.

Even if you find a suitable DRAM controller to paste into your design overnight, reworking your scaler to work with it will require some significant effort.

> Thanks a lot for checking for me that the code is a BRAM inference wrapper
> for a dual port RAM.

... with independently clocked read and write ports. The coding style is obvious to anyone who has spent any significant amount of time getting 'fancy' BRAM inferences to work. The independent read and write port BRAM, independently clocked or not, is the easiest to get right, to the point of being nearly impossible to mess up.

> Yeah, and sadly the evaluation board doesn't come with a DRAM interface
> instruction guide or somesuch. It only comes with the pin numbers and a
> few brief descriptions, alas.

If you want to interface DRAMs, I suggest you start by looking up the datasheets for your board's DRAMs. At the very least, it will give you an idea of how kludgey a DRAM interface can be. The road to a fast and stable DRAM interface is full of bumpy quirks.

Article: 117429
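For readers wondering what the "BRAM inference wrapper for a dual port RAM with independently clocked read and write ports" looks like in practice, here is a minimal sketch of that coding style; the entity name, widths and depth are illustrative, not taken from the scaler design:

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity dp_bram is
      port (
        wclk  : in  std_logic;                     -- write-port clock
        rclk  : in  std_logic;                     -- independent read-port clock
        we    : in  std_logic;
        waddr : in  std_logic_vector(8 downto 0);
        raddr : in  std_logic_vector(8 downto 0);
        din   : in  std_logic_vector(15 downto 0);
        dout  : out std_logic_vector(15 downto 0)
      );
    end entity;

    architecture rtl of dp_bram is
      type ram_t is array (0 to 511) of std_logic_vector(15 downto 0);
      signal ram : ram_t;
    begin
      -- Write port, clocked by wclk.
      process (wclk)
      begin
        if rising_edge(wclk) then
          if we = '1' then
            ram(to_integer(unsigned(waddr))) <= din;
          end if;
        end if;
      end process;

      -- Registered read port, clocked by rclk. The synchronous read is
      -- what lets the synthesizer map the array onto a dual-port BRAM.
      process (rclk)
      begin
        if rising_edge(rclk) then
          dout <= ram(to_integer(unsigned(raddr)));
        end if;
      end process;
    end architecture;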
On Mar 30, 5:05 pm, Austin Lesea <aus...@xilinx.com> wrote:
> Or,
>
> Use an inverted version of the clock to a BUFGMUX, and when it isn't
> right, switch the BUFGMUX to the other clock input.
>
> Austin

I think this takes the BUFGMUX out of the feedback loop and you would then lose your best phase relationship. Normally a BUFGMUX (used as a BUFG) is connected from the CLK0 output of the DCM to the feedback pin to remove BUFG delay from the clock.

You could use a second BUFGMUX on the CLK0 and CLK180 outputs with a switch to select the one you want, but the routing delays to the two BUFGMUXes might have a small difference.

Article: 117430
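A sketch of that second suggestion - a BUFGMUX selecting between the DCM's CLK0 and CLK180 outputs at run time. This assumes the Xilinx UNISIM component library; the signal names are hypothetical:

    library ieee;
    use ieee.std_logic_1164.all;
    library unisim;
    use unisim.vcomponents.all;

    -- ... inside an architecture, with clk0_dcm and clk180_dcm driven by
    -- the CLK0 and CLK180 outputs of a DCM declared elsewhere:

    u_clkmux : BUFGMUX
      port map (
        I0 => clk0_dcm,    -- normal-phase clock
        I1 => clk180_dcm,  -- inverted-phase clock
        S  => sel_inv,     -- '1' switches to the inverted clock
        O  => clk_sys      -- buffered clock distributed to the design
      );

As the post notes, the original BUFG(MUX) in the DCM feedback path should stay where it is; this second mux sits outside the loop, so its two inputs will not be perfectly phase-matched to each other.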
John McCaskill wrote:
> The SRL16s have two outputs that can be used for serial to parallel
> shifters: the Q15 for cascading, and the selectable output. I finally
> found the article that describes how to use an SRL plus a FF to build
> a 6 bit per slice serial to parallel shifter. It is by Kris Chaplin,
> not Ken Chapman like I had been thinking.

This works nicely as long as you don't need speed. The direct outputs (i.e. not through the slice register) from the SRL16 are quite slow. Even just routing the SRL output directly to a flip-flop in another slice takes a huge bite out of the minimum clock period.

Article: 117431
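For context, the SRL16 under discussion is usually produced by inference rather than instantiation. A minimal sketch of a 16-deep shift register with a dynamic tap, which XST can map onto a single LUT-based SRL16E; entity and signal names are illustrative:

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity srl16_tap is
      port (
        clk  : in  std_logic;
        ce   : in  std_logic;
        din  : in  std_logic;
        addr : in  std_logic_vector(3 downto 0);  -- dynamic tap select
        q    : out std_logic                      -- the "slow" direct output
      );
    end entity;

    architecture rtl of srl16_tap is
      signal sr : std_logic_vector(15 downto 0) := (others => '0');
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          if ce = '1' then
            sr <= sr(14 downto 0) & din;  -- shift in at bit 0
          end if;
        end if;
      end process;

      q <= sr(to_integer(unsigned(addr)));  -- addr = 0 reads the newest bit
    end architecture;

Ray's point is that q above comes straight off the LUT; re-registering it in the same slice's flip-flop (the SRL-plus-FF trick the thread mentions) keeps that slow path as short as possible.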
On Mar 30, 4:11 pm, "Andy Peters" <goo...@latke.net> wrote:
> On Mar 30, 12:00 pm, "Peter Klemperer" <ftpe...@gmail.com> wrote:
>
> > Hi All,
> >
> > In one of my VHDL designs I have a section of code that I want
> > different versions for synthesis than for simulation. Currently I
> > just comment out one section and uncomment the other, but I had a
> > rather embarrassing incident yesterday where I forgot to change the
> > comments before beginning synthesis. Oops.
> >
> > I searched around a bit, but haven't found a definitive solution other
> > than using a preprocessor. I don't think that this is a satisfactory
> > solution. Eliminating the simulation-only code is simple; the
> > "-- synthesis translate_off/on" pragmas work great. Is there an
> > equivalent for ModelSim?
>
> Generate statements?
>
> -a

That's a solution that reduces the changes needed, but it doesn't fully automate them unless there's a way to detect which compiler is being used - ModelSim or synthesis. If you had a way to put your synthesis-specific code at the top of the file, you could use the -line option for vcom in ModelSim to start compilation after the synthesis-only section, but I don't see this as very practical.

If you were using Verilog I could think of a few ways to deal with this... In Verilog, for example, you can define a macro on the command line and then use `ifdef/`else to exclude code that is for synthesis only.

Article: 117432
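For reference, a minimal sketch of the generate-statement approach Andy suggests. A boolean constant steers two complementary if-generates; the constant still has to be edited per build, which is exactly the limitation Gabor notes. The entity name and placeholder bodies are illustrative:

    library ieee;
    use ieee.std_logic_1164.all;

    entity my_block is
      port (din : in std_logic; dout : out std_logic);
    end entity;

    architecture rtl of my_block is
      constant SIMULATION : boolean := false;  -- edit per build (the weak point)
    begin
      sim_gen : if SIMULATION generate
        dout <= din after 1 ns;  -- placeholder: simulation-only model
      end generate;

      syn_gen : if not SIMULATION generate
        dout <= din;             -- placeholder: synthesisable implementation
      end generate;
    end architecture;

(VHDL-93 has no else-generate, hence the two blocks with complementary conditions. Tim's constant trick later in this thread shows how to flip the constant automatically using translate_off/on.)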
"Slow" (and "fast") are very nebulous statements. Losing a few nanoseconds in an interconnect is o.k. when the clock runs at 100 MHz, or even a bit faster... Peter Alfke On Mar 30, 2:46 pm, Ray Andraka <r...@andraka.com> wrote: > John McCaskill wrote: > > The SLR16s have two outputs that can be used for serial to parallel > > shifters, the Q15 for cascading, and the selectable output. I finally > > found the article that describes how to use an SRL plus a FF to build > > a 6 bit per slice serial to parallel shifter. It is by Kris Chaplin, > > not Ken Chapman like I had been thinking. > > This works nicely as long as you don't need speed. The direct outputs > (ie not through the slice register) from the SRL16 are quite slow. Even > just routing the SRL output directly to a flip-flop in another slice > yields a huge minimum clock hit.Article: 117433
Patrick wrote:
> I am talking about the backend; the problem I have is quite simple.
> I have 2 execution units for arithmetic and logic, 1 memory access
> unit (Load/Store) and a multiply unit.
> Obviously all of these 4 units would like to write data at some point
> to the registers.
> So when I only have two write ports to the register file, then I have
> a problem if all 4 units want to write data into my regfile, right?
>
> So I wonder how this is actually realised in a RISC architecture when
> there are more units writing to the regfile than I have write ports
> available.

On an ASIC, there is no FPGA-esque restriction on how many write ports a register file / SRAM can have. If a given CPU's execution unit may produce up to six register results in any one cycle, the register file will have as many as six write ports.

For an FPGA-based implementation, there is a simple (but slow) way to simulate an ASIC-style register file / multi-ported SRAM: implement the register file as actual registers instead of dual-ported memories. This way you can read and write all registers at once if you want to, but you will need read and write muxes to route your M write ports to your K registers and out your N read ports... huge and rather slow.

If you want to keep your logic simple, you can simply freeze the execution pipeline on cycles where there are pending register writes.

My third suggestion is to implement a register write queue (as little as two entries per port could be enough) and freeze the pipeline only when the queue is full - if an instruction depends on queued registers, fetch the relevant values from the queues instead of the registers.

Article: 117434
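A minimal sketch of the first FPGA suggestion - a register file built from actual registers, here with two write ports and two read ports. The widths and the write-collision policy are illustrative assumptions, not part of the original post:

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity regfile_2w2r is
      port (
        clk            : in  std_logic;
        we0, we1       : in  std_logic;
        waddr0, waddr1 : in  std_logic_vector(4 downto 0);
        wdata0, wdata1 : in  std_logic_vector(31 downto 0);
        raddr0, raddr1 : in  std_logic_vector(4 downto 0);
        rdata0, rdata1 : out std_logic_vector(31 downto 0)
      );
    end entity;

    architecture rtl of regfile_2w2r is
      type reg_array_t is array (0 to 31) of std_logic_vector(31 downto 0);
      signal regs : reg_array_t := (others => (others => '0'));
    begin
      -- Two write ports into one bank of flip-flops. On a same-address
      -- collision, port 1 wins because its assignment comes last.
      process (clk)
      begin
        if rising_edge(clk) then
          if we0 = '1' then
            regs(to_integer(unsigned(waddr0))) <= wdata0;
          end if;
          if we1 = '1' then
            regs(to_integer(unsigned(waddr1))) <= wdata1;
          end if;
        end if;
      end process;

      -- Each read port becomes a wide 32-to-1 mux: the "huge and rather
      -- slow" part the post warns about.
      rdata0 <= regs(to_integer(unsigned(raddr0)));
      rdata1 <= regs(to_integer(unsigned(raddr1)));
    end architecture;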
Daniel S. & Weng,

> Since most of the new syntaxes from this thread come from VHDL 200X
> language extension drafts, chances are that none of these new extensions
> will be included/supported in/by standard Xilinx synthesis
> libraries/tools until the next ISE after (and probably only if) they
> become official.

Accellera VHDL 3.0 is an Accellera standard that vendors are currently implementing to. It was approved in July 2006. Hence, it is official, so kick your vendor to get it implemented. The sooner you kick Xilinx, the sooner they will update their tools.

Best Regards,
Jim

Article: 117435
Gabor wrote:
> On Mar 30, 8:47 am, "Brad Parker" <b...@heeltoe.com> wrote:
>> I have some questions about Xilinx ISE/EDK and ModelSim simulation.
>> I'm looking for advice/pointers.
>>
>> I'm a reformed software guy, and my perspective is from using a lot
>> of open source tools like Linux and gcc. I've done a lot of Linux
>> kernel work so I'm familiar with big project trees, makefiles, etc...
>>
>> I'm sort of surprised by the state of EDA tools. I've recently spent a
>> little time working with ISE & EDK and trying to get ModelSim to
>> simulate a Virtex-4.
>>
>> [To me, it seems like all the tools have their fingers in all the
>> files and things are spread all out, and it's *really* difficult
>> sometimes to debug when a flow does not work or ModelSim gives you a
>> cryptic message like "need to recompile blah because foo has been
>> updated".]
>>
>> I've spent a lot of time with iverilog (Icarus) and cver, using
>> gtkwave and makefiles. This is a simple flow which works very
>> reliably. But when I plug the files into ISE I end up with a large
>> tree of who knows what (well, I do know, but you get the point). And
>> when you add ModelSim, it just gets worse.
>>
>> Is there any place (a book, website, etc...) which describes what
>> happens when I "compile libraries" in ISE or EDK? What gets produced?
>> Are the files in a standard form? Why do they need to be compiled?
>> (I'm guessing that somehow magic private behavior models are secretly
>> passed in a machine readable form to ModelSim.)
>>
>> (And why on earth is the "workspace" library called "work" on the
>> disk, why not "workspace"? Is it just me?)
>>
>> -brad
>
> Brad,
>
> Welcome to our world. The short answer is that the software guys that
> write C compilers and operating systems actually use their software and
> to some extent understand how others use it, at least as that relates
> to developing software. The guys who write EDA software have no clue
> what hardware design is all about for the most part.
>
> As for what compilation does, this is all ModelSim, not Xilinx. I doubt
> you'll get very far trying to decipher the compiled code unless you
> want a job at Mentor :)
>
> In my opinion the ModelSim simulator runs remarkably fast considering
> what has to happen under the hood in your serial execution processor.
> There is no way an interpreted (as opposed to compiled) simulation
> engine could keep up with this.

One important difference between VHDL & C (for example) is that "entities", unlike C headers, are referenced in their compiled form, not just copied in as source. So they must have been compiled (strictly, analysed) before any other unit which uses them. There are (with ModelSim) two ways to go about this:

1. Use vmake - ModelSim's tool which writes a makefile for you, respecting the order of things.

2. Compile in 2 steps: first compile everything using the -skip ab option (i.e. skip architectures & bodies), then repeat using the -just ab option (i.e. just those). That way, all entities (& other stuff) are done before they are needed.

Also, remember that every entity can have more than one architecture: this is as though you could have several implementations of a single C module, all valid together. Of course, the system needs a rule to choose which it shall use. This rule is called "binding".

"work" is the default library name, as defined by VHDL. But it's just that: the default. Use vlib to set up new libraries to your liking, & override the defaults on the vcom command line.

Article: 117436
Gabor wrote:
> On Mar 30, 4:11 pm, "Andy Peters" <goo...@latke.net> wrote:
>> On Mar 30, 12:00 pm, "Peter Klemperer" <ftpe...@gmail.com> wrote:
>>
>>> Hi All,
>>> In one of my VHDL designs I have a section of code that I want
>>> different versions for synthesis than for simulation. Currently I
>>> just comment out one section and uncomment the other, but I had a
>>> rather embarrassing incident yesterday where I forgot to change the
>>> comments before beginning synthesis. Oops.
>>> I searched around a bit, but haven't found a definitive solution other
>>> than using a preprocessor. I don't think that this is a satisfactory
>>> solution. Eliminating the simulation-only code is simple; the
>>> "-- synthesis translate_off/on" pragmas work great. Is there an
>>> equivalent for ModelSim?
>>
>> Generate statements?
>>
>> -a
>
> That's a solution that reduces the changes needed, but it doesn't fully
> automate the changes unless there's a way to detect which compiler is
> being used - ModelSim or synthesis. If you had a way to put your
> synthesis-specific code at the top of the file, you could use the -line
> option for vcom in ModelSim to start compilation after the
> synthesis-only section, but I don't see this as very practical. If you
> were using Verilog I could think of a few ways to deal with this...
> In Verilog, for example, you can define a macro on the command line
> and then use `ifdef/`else to exclude code that is for synthesis only.

Depending on the extent of your optional code, can you use two different architectures, & a configuration file to choose between them?

Article: 117437
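A sketch of that two-architecture approach: the entity gets one synthesis architecture and one simulation architecture, and a configuration declaration binds the simulation one for the testbench run, while the synthesis project simply compiles only the synthesis architecture. All entity, architecture and instance names below are placeholders:

    library ieee;
    use ieee.std_logic_1164.all;

    entity widget is
      port (clk : in std_logic; q : out std_logic);
    end entity;

    architecture synth of widget is
    begin
      q <= clk;             -- placeholder: synthesisable implementation
    end architecture;

    architecture sim of widget is
    begin
      q <= clk after 1 ns;  -- placeholder: simulation-only model
    end architecture;

    library ieee;
    use ieee.std_logic_1164.all;

    entity widget_tb is
    end entity;

    architecture bench of widget_tb is
      component widget
        port (clk : in std_logic; q : out std_logic);
      end component;
      signal clk, q : std_logic := '0';
    begin
      dut : widget port map (clk => clk, q => q);
    end architecture;

    -- The configuration binds the simulation architecture into the
    -- testbench; no source edits needed to switch versions.
    configuration widget_tb_sim of widget_tb is
      for bench
        for dut : widget
          use entity work.widget(sim);
        end for;
      end for;
    end configuration;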
Thank you all for your responses. Unfortunately, I am sharing this code among people using different design flows and I don't want to change anybody's flow. For now, it seems the most reliable solution would be using generate statements, but I think we're going to just have to remember to hand-comment the code when it's time to synthesize.

Thanks again,
--Peter

> Depending on the extent of your optional code, can you use two different
> architectures, & a configuration file to choose between them?

Article: 117438
Hello,

I am trying to create a new compilation target for SysGen, and I found that the documentation is incomplete and the sample compilation target that is provided with SysGen is not working.

Has anybody tried to build a compilation target for SysGen? Any success?

Regards

Article: 117439
Peter Klemperer wrote:
> Hi All,
>
> In one of my VHDL designs I have a section of code that I want
> different versions for synthesis than for simulation. Currently I
> just comment out one section and uncomment the other, but I had a
> rather embarrassing incident yesterday where I forgot to change the
> comments before beginning synthesis. Oops.
>
> I searched around a bit, but haven't found a definitive solution other
> than using a preprocessor. I don't think that this is a satisfactory
> solution. Eliminating the simulation-only code is simple; the
> "-- synthesis translate_off/on" pragmas work great. Is there an
> equivalent for ModelSim?

You can use this old trick:

    constant MODELSIM : boolean := false
    -- synthesis translate_off
      or true
    -- synthesis translate_on
    ;

then

    if MODELSIM then foo; else bar; end if;

or similar.

Jonathan B will surely post a better solution in due course.

Article: 117440
Hi all,

I am a Computer Science student in my final year. My graduation project is to build a face recognition system (based on Principal Component Analysis and possibly Artificial Neural Networks) on an FPGA. Since software is the focus of our college study, hardware is not quite my domain of expertise. I have been doing some reading on the subject, but of course it's nothing in comparison to years of experience. So, I was wondering if you can provide me with some guidance on the following points:

- Which would be a better approach: implementing such a system in HDL, or using a soft microprocessor, which if I understand correctly will make it possible to implement the system in assembly or even C? What about mixing them, which I think is referred to as a hardware/software co-design approach; would that be too hard to accomplish? What would its advantage be over either of the two approaches?

- If the use of a microprocessor is suggested, what are the recommendations for the type of microprocessor or the specific implementation?

- I already went and bought an XESS XSA-200 prototyping board, which carries a Spartan-II XC2S200-5FG256 (200k gates) FPGA. After spending the last couple of months with it, I realize it might be a bit low-end. The question is, would it be possible to fit the probably-complex image processing system on it, or is it possible that I would reach a point where I can't fit my design on it no matter what?

I realize this is a relatively long post, so I'd be grateful for any answers to any part of it.

Thank you for reading this far, and thanks in advance for any replies.

Best Regards,
Islam Ossama
4th year, Computer Science Dept.
Faculty of Computer and Information,
Helwan University, Cairo, Egypt.

Article: 117441
On Mar 30, 7:07 pm, Tim <t...@nooospam.roockyloogic.com> wrote:
> Peter Klemperer wrote:
> > Hi All,
> >
> > In one of my VHDL designs I have a section of code that I want
> > different versions for synthesis than for simulation. Currently I
> > just comment out one section and uncomment the other, but I had a
> > rather embarrassing incident yesterday where I forgot to change the
> > comments before beginning synthesis. Oops.
> >
> > I searched around a bit, but haven't found a definitive solution other
> > than using a preprocessor. I don't think that this is a satisfactory
> > solution. Eliminating the simulation-only code is simple; the
> > "-- synthesis translate_off/on" pragmas work great. Is there an
> > equivalent for ModelSim?
>
> You can use this old trick:
>
>     constant MODELSIM : boolean := false
>     -- synthesis translate_off
>       or true
>     -- synthesis translate_on
>     ;
>
> then
>
>     if MODELSIM then foo; else bar; end if;
>
> or similar.
>
> Jonathan B will surely post a better solution in due course.

Thank you Tim, you just made my day.

I use Verilog, but work with EDK, which has a strong VHDL bias. I have been having to work around problems with passing std_logic_vector generics to integer parameters and keeping multiple versions of EDK and ModelSim happy at the same time. This is a much more elegant solution than any of the others that I have come up with or had recommended to me.

Regards,
John McCaskill
www.fstertechnology.com

Article: 117442
On Mar 30, 1:02 pm, "Daniel S." <digitalmastrmind_no_s...@hotmail.com> wrote:
> Weng Tianxiang wrote:
> >>> Or I think I'd prefer:
> >>> y_xor(0) <= xor (x and X"B4D1B4D14B2E4B2E");
> >>> Because it's much less typing.
> >>
> >> I like it. Just if it was more readable.
> >>
> >> Jim
> >
> > Hi Lewis,
> > Can you tell me which Xilinx ISE version starts to support the new
> > definition you described in the form of XOR((...))?
>
> Since most of the new syntaxes from this thread come from VHDL 200X
> language extension drafts, chances are that none of these new extensions
> will be included/supported in/by standard Xilinx synthesis
> libraries/tools until the next ISE after (and probably only if) they
> become official.
>
> Until then, you will most likely have to wait for the next(next(next(...)))
> ISE revision and hope for "early" adoption. In the meantime, you can write
> your own package and define your own reduction functions; it takes only a
> minute or two to code your own xor_reduce function... assuming you do not
> already have one in your vendor's libraries that does the same job. No big
> deal.

Hi Daniel,

What I mentioned can be coded without any trouble. The trouble is that each time someone needs such a function, he must write the code himself. A broadly useful function like this could be written once as part of the VHDL standard, and after that nobody would have to write it again and again.

Right now, the standard VHDL libraries contain too many functions that are nearly useless, for example XOR(a, b). More general functions that accept a std_logic_vector or unsigned of any size, without the trouble of introducing another signal definition for the temporary result, are easier to use and would be widely used.

Weng

Article: 117443
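For completeness, the roll-your-own reduction function Daniel describes really is short. A minimal sketch of an xor_reduce in its own package (the unary reduction operators discussed in this thread were later standardized in VHDL-2008):

    library ieee;
    use ieee.std_logic_1164.all;

    package reduce_pkg is
      function xor_reduce(v : std_logic_vector) return std_logic;
    end package;

    package body reduce_pkg is
      -- XOR together every bit of a vector of any size.
      function xor_reduce(v : std_logic_vector) return std_logic is
        variable r : std_logic := '0';
      begin
        for i in v'range loop
          r := r xor v(i);
        end loop;
        return r;
      end function;
    end package body;

With x a 64-bit std_logic_vector, the thread's example then becomes:

    y_xor(0) <= xor_reduce(x and X"B4D1B4D14B2E4B2E");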
On Mar 27, 8:44 pm, "Paul Leventis" <paul.leven...@gmail.com> wrote:
> Hi,
>
> Due to recent changes, the 2-to-4 year delay has been reduced
> somewhat. If you go to http://www.uspto.gov/patft/, you can now view
> published applications. I'm not sure how soon after filing that an
> application is made available to the public, but based on random
> searches of the patents of my colleagues it seems like it is in the
> neighborhood of 1 year.
>
> I agree with Peter -- rarely will you see what exactly was implemented
> in the chip in a patent; it is usually some combination of a bunch of
> patents mixed with good engineering and a dash of trade secrets. And
> it can be hard to tell what a patent was really about by the time it
> gets turned into legalese...
>
> - Paul

Hi Paul,

Can you help identify where I can find the patent application U.S. Appl. No. 11/151/796, filed Jun. 14, 2005?

It is said that a patent application will be published 180 days after its filing date, but I couldn't find it at the U.S. Patent Office. Why? Is there some other way to find it? Or might its publication be delayed longer than 180 days?

Weng

Article: 117444
Weng, I am glad you liked the paper. Here it is:

http://www.sunburst-design.com/papers/CummingsSNUG2002SJ_FIFO2.pdf

Both Cliff Cummings and I are deeply involved in the peculiarities of asynchronous FIFOs. When we agreed to co-author this paper, Cliff was very suspicious that my solution would not work properly, so I had to work very hard to (almost) convince him. That definitely improved the paper, which was then voted "best paper of the conference" (Synopsys Users Group, 2002).

Nice memories...

Thanks
Peter Alfke

On Mar 27, 10:03 pm, "Weng Tianxiang" <wtx...@gmail.com> wrote:
> On Mar 27, 7:15 pm, "Peter Alfke" <a...@sbcglobal.net> wrote:
> > Weng, you seem to believe that there is a one-to-one correspondence
> > between the content of a patent and the Xilinx implementation.
> > That is not necessarily so.
> > If you want to learn what a certain company is interested in, then
> > looking at patents is meaningful (but you still suffer from the
> > 2-to-4 year delay in patent issuing).
> > If you want to design an ASIC, intimate knowledge of the FPGA may be
> > more hindrance than help. The architecture and circuit trade-offs are
> > completely different.
> > Keep studying...
> > Peter Alfke
> >
> > On Mar 27, 6:10 pm, "Weng Tianxiang" <wtx...@gmail.com> wrote:
> > > On Mar 27, 3:25 pm, "John_H" <newsgr...@johnhandwork.com> wrote:
> > > > Is page 158 of the Virtex-5 User Guide
> > > >
> > > > http://direct.xilinx.com/bvdocs/userguides/ug190.pdf
> > > >
> > > > just too darned simple for you? Are you trying to understand the
> > > > operation of the part from the detailed silicon level tricks that
> > > > may or may not be applicable for this part of the device? I tried
> > > > looking at a DDR IOB cell patent once and found it to be
> > > > interestingly disconnected from my RTL and chip level design
> > > > experience. If you are into physical level design of CMOS chips on
> > > > advanced processes you have a chance of understanding how things
> > > > come together. If all you want to know is how that chip will work
> > > > for you, use the User's Guide!
> > > >
> > > > I don't have to know about the metal casting used for the
> > > > alternator in my car to understand how the alternator works. You
> > > > don't need patents to understand the SLICE_L.
> > > >
> > > > - John_H
> > > >
> > > > "Weng Tianxiang" <wtx...@gmail.com> wrote in message
> > > > news:1175036266.831589.180920@b75g2000hsg.googlegroups.com...
> > > >
> > > > > Hi,
> > > > > When I am turning to Xilinx Virtex-5 new chips from Virtex-II, I
> > > > > would like to know which patents filed by Xilinx disclose the
> > > > > contents of Slice L.
> > > > >
> > > > > Slice M is too complex for me to fully understand at the moment,
> > > > > and just knowledge of Slice L is good enough for me to start with
> > > > > Virtex-5, as basic knowledge for it.
> > > > >
> > > > > Thank you.
> > > > >
> > > > > Weng
> > >
> > > Hi John,
> > > Yes, I am interested in the ASIC design of Slice L and want someone's
> > > help to locate the patent filed by Xilinx that contains the contents
> > > of Slice L. I am not interested in Slice M, which is too complex for
> > > me now.
> > >
> > > I have already printed the user manual you indicated and carefully
> > > read it. But it doesn't satisfy my curiosity.
> > >
> > > Weng
>
> Hi Peter,
> Thank you for your advice.
>
> I like reading and learning. Your paper about asynchronous FIFOs,
> co-authored with another engineer, is the best article I have read in
> my life.
>
> Weng

Article: 117445
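The heart of the Cummings/Alfke FIFO paper is passing the read and write pointers between clock domains in Gray code, so that at most one bit changes per increment and a synchronizer can never capture a wildly wrong value. A minimal VHDL sketch of that one idea - not the paper's complete design; widths and names are illustrative:

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity gray_ptr_sync is
      port (
        wclk, rclk : in  std_logic;
        bin_next   : in  unsigned(3 downto 0);  -- next write pointer, binary
        wptr_rd    : out unsigned(3 downto 0)   -- Gray write pointer, read domain
      );
    end entity;

    architecture rtl of gray_ptr_sync is
      signal gray_reg : unsigned(3 downto 0) := (others => '0');
      signal meta     : unsigned(3 downto 0) := (others => '0');
    begin
      -- Register the Gray-coded pointer in the write domain, so the value
      -- that crosses domains changes exactly one bit per increment.
      process (wclk)
      begin
        if rising_edge(wclk) then
          gray_reg <= bin_next xor ('0' & bin_next(3 downto 1));
        end if;
      end process;

      -- Two-flop synchronizer into the read-clock domain.
      process (rclk)
      begin
        if rising_edge(rclk) then
          meta    <= gray_reg;
          wptr_rd <= meta;
        end if;
      end process;
    end architecture;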
Islam,

If I were you, I would:
first explore the best algorithm,
then see whether it can be implemented on a (any!) microprocessor, achieving reasonable performance.

If the microprocessor implementation is too slow, I would look at a way to speed it up with an FPGA, and I would use an existing board of reasonable performance. The Xilinx University Program has a better board, based on Virtex-II Pro, that is surprisingly inexpensive for universities. I would contact the Xilinx University Program about details.

Challenging project!
Peter Alfke

On Mar 30, 6:17 pm, "Islam Ossama" <islam.oss...@gmail.com> wrote:
> Hi all,
>
> I am a Computer Science student in my final year. My graduation
> project is to build a face recognition system (based on Principal
> Component Analysis and possibly Artificial Neural Networks) on an FPGA.
> Since software is the focus of our college study, hardware is not
> quite my domain of expertise. I have been doing some reading on the
> subject, but of course it's nothing in comparison to years of
> experience. So, I was wondering if you can provide me with some
> guidance on the following points:
>
> - Which would be a better approach: implementing such a system in HDL,
> or using a soft microprocessor, which if I understand correctly will
> make it possible to implement the system in assembly or even C? What
> about mixing them, which I think is referred to as a hardware/software
> co-design approach; would that be too hard to accomplish? What would
> its advantage be over either of the two approaches?
>
> - If the use of a microprocessor is suggested, what are the
> recommendations for the type of microprocessor or the specific
> implementation?
>
> - I already went and bought an XESS XSA-200 prototyping board, which
> carries a Spartan-II XC2S200-5FG256 (200k gates) FPGA. After spending
> the last couple of months with it, I realize it might be a bit
> low-end. The question is, would it be possible to fit the
> probably-complex image processing system on it, or is it possible that
> I would reach a point where I can't fit my design on it no matter what?
>
> I realize this is a relatively long post, so I'd be grateful for any
> answers to any part of it.
>
> Thank you for reading this far, and thanks in advance for any replies.
>
> Best Regards,
> Islam Ossama
> 4th year, Computer Science Dept.
> Faculty of Computer and Information,
> Helwan University, Cairo, Egypt.

Article: 117446
On Mar 26, 2:37 pm, djo...@btinternet.com wrote:
> Dear users,
>
> Thank you for your reply.
>
> I found that my problem lies in the state machine itself, so I
> basically removed the RAM from my project and ran the synthesizer
> using Quartus.
>
> The warning message has slightly changed, but maybe it's caused by the
> same mistake:
>
> Warning: Found 80 node(s) in clock paths which may be acting as ripple
> and/or gated clocks -- node(s) analyzed as buffer(s) resulting in
> clock skew
>
> Regarding the RAM's timing: what my state machine is doing is creating
> a vector throughout different stages of the process, and when the
> vector is ready to be saved, the state machine goes to a state that
> asserts the clock input to the RAM to '1'; this is asserted for one
> clock cycle of the state machine's clock before returning to '0'.
>
> The other message I get is:
>
> Warning: Timing Analysis is analyzing one or more combinational loops
> as latches
> Warning: Node "ramin[0]$latch" is a latch
> ...
>
> Am I structuring my VHDL code wrongly or something? I am using Mentor
> Graphics FPGA Advantage and have graphically created a state machine.
>
> From what I have understood is that

It's hard to be sure of what's going on without seeing your design, but those warning messages from Quartus are a strong indication you're not coding your VHDL in a clean way. Instead of gating the clock to your RAM, use a clock enable (safer and easier to timing-analyze).

Vaughn
Altera

Article: 117447
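A sketch of the clock-enable style Vaughn recommends: the RAM runs on the same free-running clock as the state machine, and the FSM asserts a write enable for one cycle instead of pulsing the RAM's clock. Entity and signal names are illustrative:

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity ram_ce is
      port (
        clk  : in  std_logic;                    -- same clock as the FSM
        we   : in  std_logic;                    -- one-cycle pulse from the FSM
        addr : in  std_logic_vector(7 downto 0);
        din  : in  std_logic_vector(15 downto 0);
        dout : out std_logic_vector(15 downto 0)
      );
    end entity;

    architecture rtl of ram_ce is
      type ram_t is array (0 to 255) of std_logic_vector(15 downto 0);
      signal ram : ram_t;
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          if we = '1' then                          -- enable, not a gated clock
            ram(to_integer(unsigned(addr))) <= din;
          end if;
          dout <= ram(to_integer(unsigned(addr)));  -- registered read
        end if;
      end process;
    end architecture;

Because every register and the RAM see the same continuous clock, the gated/ripple-clock and latch warnings disappear and static timing analysis becomes straightforward.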
Peter,

First of all, thanks for replying :-).

We (a team of 5) spent the last semester exploring and evaluating the different algorithms. We came to the conclusion that PCA would be the best compromise between complexity and accuracy, with the recommendation that Neural Networks be added to it if possible, to increase the accuracy even further.

We did evaluate different implementations of the algorithm, some on MATLAB and some running natively, and the performance was as we expected: not suitable for real-time, especially for large face databases.

This only confirmed what was suggested at the beginning of our research, which was the main reason we wanted to explore an FPGA implementation to make up for the performance shortcomings while maintaining (or better, increasing) the level of accuracy.

Right now we have in fact divided up the work between us: 3 of my team are working on the algorithm (it being the main focus of the project), one is working on the neural network, and myself on the FPGA.

What remains to be determined is the kind of design that the algorithm will be implemented with: whether it would be solely VHDL, a soft microprocessor and asm/C/C++ code, or some combination of the two. And, of course, whether the hardware currently at hand would support that design.

Also, after sending my earlier post, I stumbled upon a project on opencores.org called Java Optimized Processor (http://www.opencores.com/projects.cgi/web/jop/overview), which is essentially a soft processor for Java bytecode. I was wondering how good/robust/flexible it is, and whether someone here has actually used it in any way on actual hardware. I might try to implement it and download it to my FPGA, if I can get it to compile on Webpack 9.1i without trouble.

I did check out the Xilinx University Program. The "Virtex-II Pro Development System" seems like an extreme overkill in our case, since the use of FPGAs isn't standard curriculum in our faculty; we are mainly doing this as a unique, single-case approach. It seems like an excellent choice for an engineering faculty, though. Unfortunately, I don't think our faculty (Computer Science) would be willing to make such a purchase based on a single-case requirement, especially taking into account the relatively high currency exchange rate (1 USD ~= 5.7 EGP).

Thanks again for your response, and I hope I haven't bored you with my long reply...

Best Regards,
Islam Ossama

On Mar 31, 5:17 am, "Peter Alfke" <a...@sbcglobal.net> wrote:
> Islam,
> If I were you, I would:
> first explore the best algorithm,
> then see whether it can be implemented on a (any!) microprocessor,
> achieving reasonable performance.
> If the microprocessor implementation is too slow, I would look at a
> way to speed it up with an FPGA, and I would use an existing board of
> reasonable performance.
> The Xilinx University Program has a better board, based on Virtex-II
> Pro, that is surprisingly inexpensive for universities.
> I would contact the Xilinx University Program about details.
>
> Challenging project!
> Peter Alfke

Article: 117448
Islam Ossama wrote:
> Peter,
>
> First of all, thanks for replying :-).
>
> We (a team of 5) spent the last semester exploring and evaluating the
> different algorithms. We came to the conclusion that PCA would be the
> best compromise between complexity and accuracy, with the
> recommendation that Neural Networks be added to it if possible, to
> increase the accuracy even further.
>
> We did evaluate different implementations of the algorithm, some on
> MATLAB and some running natively, and the performance was as we
> expected: not suitable for real-time, especially for large face
> databases.

That would seem to be the key issue. An FPGA should be able to scan one face quite quickly, but trawling for a match becomes a data-scan problem.

Just how large are these databases?

-jg

Article: 117449
We (the Illiac 6 research group at the University of Illinois) are currently working on porting one of the more popular face recognition programs for an application that is going to run on our "Communications Supercomputer", which involves a Virtex-II Pro FPGA.

There is no benefit to implementing a processor in the FPGA and then running the existing C code on it, because there is no way it will come close to the performance of a processor in an ASIC. Your only option for utilizing the FPGA for speedup is going to the roots of the algorithm(s), finding parallelism, and writing HDL code to exploit that parallelism. That is the phase we are currently in.

Remember that an algorithm that may not be the best in a sequential environment may shine in a highly parallel environment, so it's best to look at all algorithms for possible parallel structure.

---Matthew Hicks

> Peter,
>
> First of all, thanks for replying :-).
>
> We (a team of 5) spent the last semester exploring and evaluating the
> different algorithms. We came to the conclusion that PCA would be the
> best compromise between complexity and accuracy, with the
> recommendation that Neural Networks be added to it if possible, to
> increase the accuracy even further.
>
> We did evaluate different implementations of the algorithm, some on
> MATLAB and some running natively, and the performance was as we
> expected: not suitable for real-time, especially for large face
> databases.
>
> This only confirmed what was suggested at the beginning of our
> research, which was the main reason we wanted to explore an FPGA
> implementation to make up for the performance shortcomings while
> maintaining (or better, increasing) the level of accuracy.
>
> Right now we have in fact divided up the work between us: 3 of my
> team are working on the algorithm (it being the main focus of the
> project), one is working on the neural network, and myself on the
> FPGA.
>
> What remains to be determined is the kind of design that the algorithm
> will be implemented with: whether it would be solely VHDL, a soft
> microprocessor and asm/C/C++ code, or some combination of the two.
> And, of course, whether the hardware currently at hand would support
> that design.
>
> Also, after sending my earlier post, I stumbled upon a project on
> opencores.org called Java Optimized Processor
> (http://www.opencores.com/projects.cgi/web/jop/overview), which is
> essentially a soft processor for Java bytecode. I was wondering how
> good/robust/flexible it is, and whether someone here has actually used
> it in any way on actual hardware. I might try to implement it and
> download it to my FPGA, if I can get it to compile on Webpack 9.1i
> without trouble.
>
> I did check out the Xilinx University Program. The "Virtex-II Pro
> Development System" seems like an extreme overkill in our case, since
> the use of FPGAs isn't standard curriculum in our faculty; we are
> mainly doing this as a unique, single-case approach. It seems like an
> excellent choice for an engineering faculty, though. Unfortunately, I
> don't think our faculty (Computer Science) would be willing to make
> such a purchase based on a single-case requirement, especially taking
> into account the relatively high currency exchange rate (1 USD ~= 5.7
> EGP).
>
> Thanks again for your response, and I hope I haven't bored you with my
> long reply...
>
> Best Regards,
> Islam Ossama
>
> On Mar 31, 5:17 am, "Peter Alfke" <a...@sbcglobal.net> wrote:
>> Islam,
>> If I were you, I would:
>> first explore the best algorithm,
>> then see whether it can be implemented on a (any!) microprocessor,
>> achieving reasonable performance.
>> If the microprocessor implementation is too slow, I would look at a
>> way to speed it up with an FPGA, and I would use an existing board of
>> reasonable performance.
>> The Xilinx University Program has a better board, based on Virtex-II
>> Pro, that is surprisingly inexpensive for universities.
>> I would contact the Xilinx University Program about details.
>> Challenging project!
>> Peter Alfke