On 9 Apr., 22:51, rickman <gnu...@gmail.com> wrote:
> On Apr 8, 5:30 pm, Jim Granville <no.s...@designtools.maps.co.nz>
> wrote:
>
> > Re Quad Serial memory devices, and execute in place :
> > I see SST have just released a Quad device as well.
> > $1.16/10K for 16MBit
> >
> > Good news, and bad news :
> >
> > Good news: 80MHz nibble rate. They also added 8 & 16 bit address modes,
> > to the default 24 bit.
> > Bad News: These short-jump modes _are_ signed relative,
> > BUT they do NOT cross page boundaries. (ie Just like the 8048...)
> >
> > So, the idea falls short of being usable in relocatable code.
> >
> > Pity, as a memory-opcode jump is a good way to save some bandwidth
> >
> > Why make something signed relative, but then have it page-wrap ?
> >
> > - was this a bug, maybe they intended it to work properly, but found
> > an oops, and changed the data to match the silicon ?
>
> I knew there was something fundamentally wrong with the idea of a very
> slow processor in an FPGA executing from an external memory chip. I
> figured out what it is. Using even *less* real estate inside the FPGA
> and by adding only two, very inexpensive chips, I can add an
> *external* processor and (optionally) a serial memory chip to
> interface with the FPGA through a serial port such as SPI, I2C or
> custom. Rather than try to implement a full processor in minimal
> gates, isn't it easier and more effective to use a serial control to
> allow an external processor to control the FPGA, while matching the
> size and cost to the job?
>
> Even 32 bit flash based ARM chips are available for less than the cost
> of the serial memory. ;)

I am doing exactly this myself:

A3P060      3.5
AT45DB161D  1.0
ATmega88    0.8
==========
;)

but there is still room and need for a serial engine that takes little FPGA real estate, at least I think so.

Antti
Article: 131076
rickman wrote:
> On Apr 8, 5:30 pm, Jim Granville <no.s...@designtools.maps.co.nz>
> wrote:
>
>>Re Quad Serial memory devices, and execute in place :
>>I see SST have just released a Quad device as well.
>>$1.16/10K for 16MBit
>>
>>Good news, and bad news :
>>
>>Good news: 80MHz nibble rate. They also added 8 & 16 bit address modes,
>>to the default 24 bit.
>>Bad News: These short-jump modes _are_ signed relative,
>>BUT they do NOT cross page boundaries. (ie Just like the 8048...)
>>
>>So, the idea falls short of being usable in relocatable code.
>>
>>Pity, as a memory-opcode jump is a good way to save some bandwidth
>>
>>Why make something signed relative, but then have it page-wrap ?
>>
>>- was this a bug, maybe they intended it to work properly, but found
>>an oops, and changed the data to match the silicon ?
>
>
> I knew there was something fundamentally wrong with the idea of a very
> slow processor in an FPGA executing from an external memory chip. I
> figured out what it is. Using even *less* real estate inside the FPGA
> and by adding only two, very inexpensive chips, I can add an
> *external* processor and (optionally) a serial memory chip to
> interface with the FPGA through a serial port such as SPI, I2C or
> custom. Rather than try to implement a full processor in minimal
> gates, isn't it easier and more effective to use a serial control to
> allow an external processor to control the FPGA, while matching the
> size and cost to the job?

Yes, of course; that's the litmus test ALL soft-CPU choices must face.
It is nearly always better to use a dedicated microcontroller IF one is available that fits (and mop the 'other stuff' up in the (now hopefully smaller) FPGA).

The range of dedicated controllers is also expanding all the time. Ethernet and USB are becoming small incremental adders.

I see the serial CPU as something more of a paper exercise, one that 'defines the numbers': How small can it be, and how slow? What chips are out there that support such a design?

Quad-SPI devices are relatively new, and give another pincount/bandwidth design point. FPGA vendors need to include them on their configuration memory support list, for faster config times.

SST claim they target Execute in place with their device. I've asked SST for some examples of that, and for examples of a design that actually uses their strange wrapping-relative-jumps :)

-jg
Article: 131077
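To make the relocation problem with page-wrapping jumps concrete, here is a minimal Python sketch that contrasts a true signed-relative jump with one whose target wraps inside the current page. The 256-byte page size and the function names are assumptions chosen for illustration; they are not taken from the SST datasheet.

```python
PAGE_SIZE = 256  # assumed page size, for illustration only


def to_signed8(offset):
    """Interpret an 8-bit value as a two's-complement offset."""
    offset &= 0xFF
    return offset - 0x100 if offset & 0x80 else offset


def jump_linear(pc, offset):
    """True relative jump: the target may land in a neighbouring page."""
    return pc + to_signed8(offset)


def jump_page_wrapped(pc, offset):
    """Relative jump whose target wraps around inside the current page."""
    page_base = pc - (pc % PAGE_SIZE)
    return page_base + (pc + to_signed8(offset)) % PAGE_SIZE


# The same code fragment loaded at two different base addresses.
for base in (0x1000, 0x10F8):
    pc = base + 4    # address of the jump
    offset = 0x10    # jump 16 bytes forward
    print(hex(pc), hex(jump_linear(pc, offset)), hex(jump_page_wrapped(pc, offset)))
```

With the first base address both jumps agree. With the second, the jump sits close to a page boundary, so the wrapped variant lands back at the start of the same page instead of 16 bytes ahead. That is why such code cannot be relocated freely.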
Hi All,

I need to specify the strict setup time for the group of signals.
It can be relatively high, but I need very low skew between the signals.
In Quartus for Altera FPGAs I can define it with the SDC constraints as follows:

set_max_delay -from [get_ports MY_BUS[*]] -to [get_registers *] 6.000
set_min_delay -from [get_ports MY_BUS[*]] -to [get_registers *] 5.000

With the constraints above the code compiles and operates correctly.

However, in Xilinx ISE:
TIMEGRP "MY_BUS_GRP" OFFSET = IN 6 ns BEFORE "SYS_CLK"
does not allow me to specify both the minimal and maximal setup time...

As a result, I get an implementation where the setup time is 5.99 ns for one signal, but only e.g. 3 ns for another.
Because the signals are then oversampled at high frequency, this results in unacceptable skew...

How can I solve this problem?
Using OFFSET IN AFTER is problematic, as I'd need to adjust all constraints if the period of the clock changes :-(.
--
TIA & Regards,
Wojtek
Article: 131078
On Apr 9, 3:36 pm, Wojciech Zabolotny <w...@ise.pw.edu.pl> wrote:
> Hi All,
>
> I need to specify the strict setup time for the group of signals.
> It can be relatively high, but I need very low skew between the signals.
> In Quartus for Altera FPGAs I can define it with the sdc constraints as
> follows:
>
> set_max_delay -from [get_ports MY_BUS[*]] -to [get_registers *] 6.000
> set_min_delay -from [get_ports MY_BUS[*]] -to [get_registers *] 5.000
>
> With the constraints above the code compiles and operates correctly.
>
> However in Xilinx ISE:
> TIMEGRP "MY_BUS_GRP" OFFSET = IN 6 ns BEFORE "SYS_CLK"
> does not allow me to specify both the minimal and maximal setup time...
>
> As the result, I get the implementation, where for one signal the setup
> time is equal to 5.99 ns, but for the other - e.g. only 3ns .
> Because the signals are then oversampled at high frequency it results in
> the unacceptable skew...
>
> How can I solve this problem?
> Using OFFSET IN AFTER is problematic, as I'd need to adjust all
> constraints if the period of the clock changes :-(.
> --
> TIA & Regards,
> Wojtek

There is no way any FPGA (or other IC) can guarantee a combinatorial plus routing delay to be 5 ns min, 6 ns max. Just normal ambient temperature and Vcc variations exceed these tolerances.
You may perhaps be able to make the delays track (which is what you seem to be after), but absolute delay precision is out of the question.
Now, if you register these values in a register driven by a global clock, the Qs may have very little delta delay.
BTW, in my book, "set-up time" is a timing requirement on the D-input of a register or flip-flop, never a physical delay. Maybe that's the difference between the two manufacturers...
Peter Alfke, Xilinx
Article: 131079
On Apr 10, 5:12 am, Peter Alfke <pe...@xilinx.com> wrote:
> On Apr 9, 10:05 am, shakith.ferna...@gmail.com wrote:
>
> > On Mar 31, 11:57 am, "TC" <no...@nowhere.com> wrote:
> >
> > > "Dave" <dhsch...@gmail.com> wrote in message
> > > news:940082fb-7675-4407-8c91-fd1300a0ed41@e60g2000hsh.googlegroups.com...
> > >
> > > > On Mar 26, 10:25 am, "BobW" <nimby_NEEDS...@roadrunner.com> wrote:
> > > >> <shakith.ferna...@gmail.com> wrote in message
> > > >> news:a01b5815-57f0-4f38-bade-fae9d7937826@d4g2000prg.googlegroups.com...
> > > >>
> > > >> > Hi,
> > > >> >
> > > >> > We have developed a High Speed on FPGA using the MGT/RocketIO to
> > > >> > generate high speed signals. Also we receive the high speed signals
> > > >> > using the MGT. Due to the nature of the application, we require a pure
> > > >> > signal without encoding (We have switched off the 8B/10B encoding in
> > > >> > the MGT). The problem is that it affects the clock recovery accuracy
> > > >> > in the received signal as there might not be good DC balance in the
> > > >> > signal. Which in turn affects the data received. And MGT is a black
> > > >> > box to us.
> > > >> >
> > > >> > Two options-
> > > >> > 1. Another encoding mechanism such that we have pure signal and good
> > > >> > clock recovery.
> > > >> > 2. Or is there a parameter in MGT to improve the clock recovery.
> > > >> >
> > > >> > We are looking for some solution around this problem.
> > > >> >
> > > >> > Our setup:
> > > >> > FPGA - Xilinx Virtex II Pro
> > > >> > Board - Xilinx XUP Virtex(tm)-II Pro Development System
> > > >> > Software - ISE 9.1i
> > > >> >
> > > >> > Best Regards
> > > >> > Shakith
> > > >>
> > > >> The clock recovery circuitry requires that there be a minimum number of
> > > >> received edges per unit time. Otherwise, the recovered clock edges will
> > > >> drift with respect to the data stream.
> > > >>
> > > >> If you don't use some type of encoding (e.g. 8B/10B), or you don't ensure
> > > >> that your raw data ALWAYS provides this minimum edge density requirement,
> > > >> then the clock/data recovery circuitry will not work.
> > > >>
> > > >> Bob
> > > >
> > > > Perhaps a scrambler/descrambler using LFSR's would help alleviate any
> > > > edge density or DC bias problems by randomizing the data a bit? I'm
> > > > not sure if it would guarantee correct reception, but it may help.
> > >
> > > You don't say why you need a "pure" signal (no coding). I could assume that
> > > you are trying to eliminate the overhead of coding (with 8b10b you only get
> > > an effective data bandwidth of 80%).
> > >
> > > You didn't say whether or not you MUST have a DC-balanced signal. Are your
> > > transmitters AC-coupled to the receivers? If yes, then you must have a
> > > DC-balanced signal and coding IS required.
> > >
> > > You also don't mention your clocking requirements. Do you have a common
> > > (distributed) reference clock, or are the reference clocks independent?
> > >
> > > Typical clock recovery circuits assume plesio-synchronous operation. This
> > > means the reference clocks can be independent but must be within +/- XYZ
> > > PPM (parts-per-million) of each other. For a receiver to track the
> > > transmitter the receiver must recover the clock (and data) from the
> > > transmitted bit stream. At high speeds this is done using PLLs. For the PLLs
> > > to perform well they require frequent edges in the transmitted bit stream.
> > >
> > > Without a coded bit-stream you can't have DC-balance and you can't guarantee
> > > frequent edges for the PLL. So if you really can't have a coded bit-stream
> > > then you must have a distributed (common) reference clock and you cannot use
> > > AC-coupling.
> > >
> > > If you are trying to minimize the overhead of coding look at other coding
> > > methods like 64b/66b codes. These are more efficient but still have the
> > > benefits of DC-balanced transmission and frequent clock edges.
> > >
> > > Note that another benefit of 8b10b is the ability to align your symbols. How
> > > are you going to know the boundaries of a byte, a packet, etc., without a
> > > coded stream and use of non-data symbols (control symbols)? If you have
> > > independent reference clocks how are you going to avoid underrun and overrun
> > > of the receiver without control symbols (i.e. insertion and removal of idle
> > > characters)? Do you need any form of closed-loop flow control, error
> > > signalling, etc.? If yes, how are you going to do this without control
> > > symbols?
> > >
> > > TC
> >
> > Thanks...we also noticed a new problem
> > When we run our implementation on FPGA, there is loss of data on the
> > non-active part of the signal. It's set to be 20180 bits, but the number
> > of bits received is less than that. The loss of data is not consistent
> > either. But this doesn't appear in simulation in Modelsim. One reason
> > could be the fact that the DC Balance is not there. But does it affect
> > the data recovery from serial to parallel?
> > Another option is to use an external modulator chip on the receiver
> > signal to achieve DC balance and then use the new modulated signal on
> > MGT and decode it inside the FPGA.
> >
> > Our setup:
> > MGT Speed - 2 Gbps
> > FPGA - Xilinx Virtex II Pro
> > Board - Xilinx XUP Virtex(tm)-II Pro Development System
> > Software - ISE 9.1i
>
> You are running into a fundamental limitation. What you receive is NRZ
> data, High for 1, Low for 0 (or something similar). You know that a
> High level of a certain duration means three ones in a row, but the
> receiver needs a clock to separate the bits. It gets this clock from
> its own PLL oscillator that is being kept alive and accurate by
> re-synchronizing itself from any transitions in the data stream. That's
> why you must have a transition quite often. Maybe you can run for 70
> bits without a transition, but after that, the PLL oscillator will
> drift and will not partition the incoming data stream properly.
> Sorry for this basic explanation, maybe everybody understands all this
> already.
> Peter Alfke

ahhhhh
Thanks..
Now I get it..
So in long streams, it's unable to partition the data to recover the exact bits..
hmm, I guess one solution is to use an external scrambler/encoder on the signal to achieve data balance, then do descrambling/decoding inside the FPGA to get the correct data..

Cheers
Shakith
Article: 131080
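The scrambler/descrambler idea raised in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the polynomial x^7 + x^6 + 1, the all-ones seed and the function names are choices made here, not something the posters specified, and it models an additive (frame-synchronous) scrambler, so transmitter and receiver must start their LFSRs in step.

```python
def lfsr_keystream(n, seed=0x7F):
    """Yield n bits from a 7-bit Fibonacci LFSR (assumed polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    for _ in range(n):
        out = (state >> 6) & 1                        # bit shifted out this cycle
        feedback = ((state >> 6) ^ (state >> 5)) & 1  # taps at stages 7 and 6
        state = ((state << 1) | feedback) & 0x7F
        yield out


def scramble(bits, seed=0x7F):
    """XOR the data with the keystream; applying it twice restores the data."""
    return [b ^ k for b, k in zip(bits, lfsr_keystream(len(bits), seed))]


def longest_run(bits):
    """Length of the longest run of identical consecutive bits."""
    best = run = 0
    prev = None
    for b in bits:
        run = run + 1 if b == prev else 1
        prev = b
        best = max(best, run)
    return best


idle = [0] * 200                 # a long stretch without any transitions
on_the_wire = scramble(idle)
print(longest_run(idle), longest_run(on_the_wire))
assert scramble(on_the_wire) == idle
```

For the all-zero input the line simply carries the LFSR sequence, whose runs are at most 7 bits long. Note the limitation already hinted at in the thread: a scrambler only makes long runs unlikely. Payload data that happens to match the keystream still produces a long run, so unlike 8B/10B it gives no hard guarantee on edge density or DC balance.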
Wojciech Zabolotny wrote:
> Hi All,
>
> I need to specify the strict setup time for the group of signals.
> It can be relatively high, but I need very low skew between the signals.
> In Quartus for Altera FPGAs I can define it with the sdc constraints as
> follows:
>
> set_max_delay -from [get_ports MY_BUS[*]] -to [get_registers *] 6.000
> set_min_delay -from [get_ports MY_BUS[*]] -to [get_registers *] 5.000
>
> With the constraints above the code compiles and operates correctly.
>
> However in Xilinx ISE:
> TIMEGRP "MY_BUS_GRP" OFFSET = IN 6 ns BEFORE "SYS_CLK"
> does not allow me to specify both the minimal and maximal setup time...
>
> As the result, I get the implementation, where for one signal the setup
> time is equal to 5.99 ns, but for the other - e.g. only 3ns .
> Because the signals are then oversampled at high frequency it results in
> the unacceptable skew...
>
> How can I solve this problem?
> Using OFFSET IN AFTER is problematic, as I'd need to adjust all
> constraints if the period of the clock changes :-(.

When it is absolutely critical that signals have low skew, as in oversampling or other very high speed designs, I and many others resort to manual placement and often constrained routes. If the routing is regular, one constrained route can often cover a multitude of signals.

Constrained routing is an advanced feature that requires some work inside FPGA Editor to generate the constraints. Solid knowledge of the physical workings around the IOBs and signal routing boxes helps significantly. Much of that knowledge comes from "poking around" inside the chip, trying different routes both automated and manual.

There are some application notes on wide, very high speed buses that use physical constraints to tie down how things behave coming into the chip. How precise you get depends on how extreme your design goals are.

- John_H
Article: 131081
"Pete" <petersen.curt@gmail.com> wrote in message news:c85cbe7c-8693-498b-96bb-093b8c244ae3@8g2000hsu.googlegroups.com... > I've noticed that the Xilinx FFT bit-accurate c simulation calls are > very very slow. Anyone else notice this? <snip> > The program is single-threaded right now, and I intend to write a > multi-threaded version soon. But first, I decided to use a profiler > on my single-threaded code to see what kind of speedup I should > expect. I wasn't very surprised to find that 99% of my execution time > was taking place inside the xilinx_ip_xfft_v5_0_bitacc_simulate() > function. I WAS surprised to find that 55.6% of that function's > execution time is being spent in malloc() and free() calls. I have > some nice call graphs and an excel spreadsheet I'd be willing to share > if someone from Xilinx would like to take a look. > Petersen Curt I don't have direct experience with the Xilinx simulation, but a couple observations: -- malloc() and free() don't belong in an FFT core; most optimized DSP library calls make the user pre-build the "twiddle factors" table (via an API call) and then the FFT routine can reuse that without needing to recreate it every time. Saves a lot of CPU-expensive sine() and cosine() calls as well. Xilinx could easily alter their API to provide that functionality. -- Bit-true emulation is expensive since you no longer can call the IEEE-754 functions built into your FPU in the microprocessor core but have to do everything the "hard way". I've done one simulation using the Matlab Fixed Point Toolbox. It worked, and was pretty close to the Blackfin DSP I was emulating, but it ran 10 times slower than the floating point implementation. One problem with the Fixed Point Toolbox is that they only provide an "example" code that implements a radix-two transform. If you don't do it exactly that way, you'll have to code the FFT routine yourself. I was using a radix-4 algorithm on the Blackfin, and some of the source code was in assembler. 
I truncated and rounded my results after the FFT and did the other critical calculations in the Toolbox. Maybe Xilinx could generate a reference FFT function for the Matlab toolbox, but that might give away some IP on their implementation.
Article: 131082
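The point about pre-building the twiddle-factor table can be illustrated with a small floating-point Python sketch. It is not the Xilinx bit-accurate model and makes no claim about how that model is written; `make_twiddles` and `fft` are hypothetical names used only to show the pattern of computing the table once and reusing it for every transform of the same size.

```python
import cmath


def make_twiddles(n):
    """Precompute the n/2 twiddle factors exp(-2*pi*i*k/n) once per FFT size."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]


def fft(x, twiddles):
    """Iterative radix-2 decimation-in-time FFT that reuses a twiddle table."""
    n = len(x)
    assert n > 0 and (n & (n - 1)) == 0, "length must be a power of two"
    assert len(twiddles) == n // 2, "table was built for a different size"
    out = list(x)
    # Bit-reversal permutation of the input order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            out[i], out[j] = out[j], out[i]
    # Butterfly stages; no sine or cosine calls happen in here.
    size = 2
    while size <= n:
        half = size // 2
        step = n // size
        for start in range(0, n, size):
            for k in range(half):
                a = out[start + k]
                b = out[start + k + half] * twiddles[k * step]
                out[start + k] = a + b
                out[start + k + half] = a - b
        size *= 2
    return out


table = make_twiddles(8)          # built once
for frame in ([1, 0, 0, 0, 0, 0, 0, 0], [1, 2, 3, 4, 0, -1, -2, 5]):
    spectrum = fft(frame, table)  # reused for every frame
    print([round(abs(v), 3) for v in spectrum])
```

The same idea applies to scratch buffers: allocating them once outside the transform call avoids repeated allocate/free traffic of the kind the profile above shows.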
Article: 131083
On Apr 9, 6:28 pm, shakith.ferna...@gmail.com wrote:
> On Apr 10, 5:12 am, Peter Alfke <pe...@xilinx.com> wrote:
> > On Apr 9, 10:05 am, shakith.ferna...@gmail.com wrote:
> >
> > > On Mar 31, 11:57 am, "TC" <no...@nowhere.com> wrote:
> > >
> > > > "Dave" <dhsch...@gmail.com> wrote in message
> > > > news:940082fb-7675-4407-8c91-fd1300a0ed41@e60g2000hsh.googlegroups.com...
> > > >
> > > > > On Mar 26, 10:25 am, "BobW" <nimby_NEEDS...@roadrunner.com> wrote:
> > > > >> <shakith.ferna...@gmail.com> wrote in message
> > > > >> news:a01b5815-57f0-4f38-bade-fae9d7937826@d4g2000prg.googlegroups.com...
> > > > >>
> > > > >> > Hi,
> > > > >> >
> > > > >> > We have developed a High Speed on FPGA using the MGT/RocketIO to
> > > > >> > generate high speed signals. Also we receive the high speed signals
> > > > >> > using the MGT. Due to the nature of the application, we require a pure
> > > > >> > signal without encoding (We have switched off the 8B/10B encoding in
> > > > >> > the MGT). The problem is that it affects the clock recovery accuracy
> > > > >> > in the received signal as there might not be good DC balance in the
> > > > >> > signal. Which in turn affects the data received. And MGT is a black
> > > > >> > box to us.
> > > > >> >
> > > > >> > Two options-
> > > > >> > 1. Another encoding mechanism such that we have pure signal and good
> > > > >> > clock recovery.
> > > > >> > 2. Or is there a parameter in MGT to improve the clock recovery.
> > > > >> >
> > > > >> > We are looking for some solution around this problem.
> > > > >> >
> > > > >> > Our setup:
> > > > >> > FPGA - Xilinx Virtex II Pro
> > > > >> > Board - Xilinx XUP Virtex(tm)-II Pro Development System
> > > > >> > Software - ISE 9.1i
> > > > >> >
> > > > >> > Best Regards
> > > > >> > Shakith
> > > > >>
> > > > >> The clock recovery circuitry requires that there be a minimum number of
> > > > >> received edges per unit time. Otherwise, the recovered clock edges will
> > > > >> drift with respect to the data stream.
> > > > >>
> > > > >> If you don't use some type of encoding (e.g. 8B/10B), or you don't ensure
> > > > >> that your raw data ALWAYS provides this minimum edge density requirement,
> > > > >> then the clock/data recovery circuitry will not work.
> > > > >>
> > > > >> Bob
> > > > >
> > > > > Perhaps a scrambler/descrambler using LFSR's would help alleviate any
> > > > > edge density or DC bias problems by randomizing the data a bit? I'm
> > > > > not sure if it would guarantee correct reception, but it may help.
> > > >
> > > > You don't say why you need a "pure" signal (no coding). I could assume that
> > > > you are trying to eliminate the overhead of coding (with 8b10b you only get
> > > > an effective data bandwidth of 80%).
> > > >
> > > > You didn't say whether or not you MUST have a DC-balanced signal. Are your
> > > > transmitters AC-coupled to the receivers? If yes, then you must have a
> > > > DC-balanced signal and coding IS required.
> > > >
> > > > You also don't mention your clocking requirements. Do you have a common
> > > > (distributed) reference clock, or are the reference clocks independent?
> > > >
> > > > Typical clock recovery circuits assume plesio-synchronous operation. This
> > > > means the reference clocks can be independent but must be within +/- XYZ
> > > > PPM (parts-per-million) of each other. For a receiver to track the
> > > > transmitter the receiver must recover the clock (and data) from the
> > > > transmitted bit stream. At high speeds this is done using PLLs. For the PLLs
> > > > to perform well they require frequent edges in the transmitted bit stream.
> > > >
> > > > Without a coded bit-stream you can't have DC-balance and you can't guarantee
> > > > frequent edges for the PLL. So if you really can't have a coded bit-stream
> > > > then you must have a distributed (common) reference clock and you cannot use
> > > > AC-coupling.
> > > >
> > > > If you are trying to minimize the overhead of coding look at other coding
> > > > methods like 64b/66b codes. These are more efficient but still have the
> > > > benefits of DC-balanced transmission and frequent clock edges.
> > > >
> > > > Note that another benefit of 8b10b is the ability to align your symbols. How
> > > > are you going to know the boundaries of a byte, a packet, etc., without a
> > > > coded stream and use of non-data symbols (control symbols)? If you have
> > > > independent reference clocks how are you going to avoid underrun and overrun
> > > > of the receiver without control symbols (i.e. insertion and removal of idle
> > > > characters)? Do you need any form of closed-loop flow control, error
> > > > signalling, etc.? If yes, how are you going to do this without control
> > > > symbols?
> > > >
> > > > TC
> > >
> > > Thanks...we also noticed a new problem
> > > When we run our implementation on FPGA, there is loss of data on the
> > > non-active part of the signal. It's set to be 20180 bits, but the number
> > > of bits received is less than that. The loss of data is not consistent
> > > either. But this doesn't appear in simulation in Modelsim. One reason
> > > could be the fact that the DC Balance is not there. But does it affect
> > > the data recovery from serial to parallel?
> > > Another option is to use an external modulator chip on the receiver
> > > signal to achieve DC balance and then use the new modulated signal on
> > > MGT and decode it inside the FPGA.
> > >
> > > Our setup:
> > > MGT Speed - 2 Gbps
> > > FPGA - Xilinx Virtex II Pro
> > > Board - Xilinx XUP Virtex(tm)-II Pro Development System
> > > Software - ISE 9.1i
> >
> > You are running into a fundamental limitation. What you receive is NRZ
> > data, High for 1, Low for 0 (or something similar). You know that a
> > High level of a certain duration means three ones in a row, but the
> > receiver needs a clock to separate the bits. It gets this clock from
> > its own PLL oscillator that is being kept alive and accurate by
> > re-synchronizing itself from any transitions in the data stream. That's
> > why you must have a transition quite often. Maybe you can run for 70
> > bits without a transition, but after that, the PLL oscillator will
> > drift and will not partition the incoming data stream properly.
> > Sorry for this basic explanation, maybe everybody understands all this
> > already.
> > Peter Alfke
>
> ahhhhh
> Thanks..
> Now I get it..
> So in long streams, its unable to partition the data to recover the
> exact bits..
> hmm, I guess one solution is use an external scrambler/encoder on the
> signal to achieve data balance. then do descrambling/decoding inside
> the fpga to get the correct data..
>
> Cheers
> Shakith

That's the way to go when you cannot use the simplest method, called 8B10B, where 10 bits are used to transmit the content of 8 data bits. Transceivers often include the necessary encoder/decoder, and DC balancing is automatically included. The penalty is a 20% loss in throughput (e.g. only 2.5 Gbps of payload from a 3.125 Gbps channel).
Scrambling is popular with the telephone people.
Peter Alfke, Xilinx
Article: 131084
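The cost of line coding discussed in this thread is simple arithmetic, sketched below in Python. The 3.125 Gbps line rate is the example figure from the post above; `payload_rate` is a hypothetical helper name. An 8b/10b link spends 2 of every 10 line bits on coding, which is 20% of the line rate (or, seen the other way round, 25% extra on top of the payload).

```python
def payload_rate(line_rate_gbps, data_bits, coded_bits):
    """Payload rate left after a block code that sends coded_bits per data_bits."""
    return line_rate_gbps * data_bits / coded_bits


LINE_RATE_GBPS = 3.125  # example channel rate from the thread

for name, data_bits, coded_bits in (("8b/10b", 8, 10), ("64b/66b", 64, 66)):
    payload = payload_rate(LINE_RATE_GBPS, data_bits, coded_bits)
    spent = 1 - data_bits / coded_bits
    print(f"{name}: {payload:.3f} Gbps payload, {spent:.1%} of the line rate spent on coding")
```

This is why 64b/66b is suggested earlier in the thread when coding overhead is the main concern: it keeps roughly 97% of the line rate for payload.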
VIPS wrote:
> Hi All
>
> This application I am looking at requires 17 tera bytes of
> multiplication per second. Which in an FPGA means 40K FPGAs. What I
> want to know is how many 32x32 Mults can you fit into an ASIC today
> Standard Cell or Custom ASIC. Also what kind of speeds can I get.

You could also do the calculations for:
http://www.mathstar.com/

and once you have the multiplier count, the data bandwidth to keep all these fed will be important as well.

-jg
Article: 131085
Any help on that, please?

Regards,
xenix
Article: 131086
On Wed, 09 Apr 2008 20:32:53 +0000, Uwe Bonnes wrote:
> Habib Bouaziz-Viallet <habib@rigel.systems> wrote:
>> On Wed, 09 Apr 2008 12:49:43 -0400, DJ Delorie wrote:
>
>> > Habib Bouaziz-Viallet <habib@rigel.systems> writes:
>> >> I'm wondering if a basic tool already exists to program Xilinx CPLDs
>> >> (XC95144 and so on) under Linux (preferably with the old Parallel Cable III)
>> >
>> Hi !
>> It's always a pleasure to speak with you M. Delorie
>> > Doesn't ISE itself have such a tool?
>
>> I'm working with WebPack ISE and I guess that the programming tool
>> (iMPACT) is a native MS-Windows tool and does not work under GNU/Linux.
>> I'm building a parallel cable of my own so I have not tested it yet.
>
> Use impact with
> http://rmdir.de/~michael/xilinx/
>
> Don't bother with the windriver kernel module...

Hi !
Please correct me if I'm wrong. Building this library allows us (Linux users) to program Xilinx devices with impact (on the command line) and a USB cable or Parallel Cable III.
I will try this next week (time to build my own Parallel Cable III).

PS :
Although the DJ Delorie solution (xapp058) requires more work, I will look into it more deeply later.

Many thanks, Habib.
--
HBV
Article: 131087
On 9 Apr., 00:52, Franck Y <franck...@gmail.com> wrote:
> I have a project where i have to implement a ring oscillator (3 not
> gates)
[...]
> which seems that
> quartus has optmised 'a little too much' since i want to see the
> delay.

You can't even really call that optimization.
You see, in an FPGA you do not have individual NOT and NOR gates. You
have 4-input lookup tables.

Even in an ASIC technology there is a technology mapping step involved
that tries to find a representation in the target technology for your
technology-independent specification.
While it might occasionally be possible to map a design one-to-one to
the FPGA technology, in general it is not, so the tool does not even
try. Instead it looks for netlist portions (not individual gates) that
can be represented by the FPGA's resources.
In your case it finds an obvious match: the whole circuit can be
represented by a single LUT. Of course it chooses that representation.
This is not an overly aggressive optimization, it is a necessary step
to find a realizable netlist at all.

The solution to your problem is a technology-dependent representation:
instantiate LUT primitives. You can even place them manually to
increase the delay.

Kolja Sulimma

Article: 131088
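Kolja Sulimma's point that three NOT gates collapse into one LUT can be illustrated with a truth table. This is a rough sketch, not a model of what Quartus actually does: it looks at the three gates open loop (without the feedback wire), and `inverter_chain` and `lut4_init` are hypothetical helper names. The 16-bit packing (bit i holds the output for input pattern i, with three inputs unused) is an assumed convention for a 4-input LUT.

```python
def inverter_chain(x, stages=3):
    """Push one bit through `stages` NOT gates in series (open loop)."""
    for _ in range(stages):
        x ^= 1
    return x


def lut4_init(func):
    """Pack a 1-input function into a 16-bit truth table for a 4-input LUT.

    Bit i of the result is the output for input pattern i. Only input
    bit 0 is used; the other three LUT inputs are don't-cares.
    """
    init = 0
    for pattern in range(16):
        init |= func(pattern & 1) << pattern
    return init


three_gates = lut4_init(inverter_chain)   # three NOT gates in series
one_gate = lut4_init(lambda x: x ^ 1)     # a single NOT gate

if __name__ == "__main__":
    print(f"three gates: 0x{three_gates:04X}")
    print(f"one gate   : 0x{one_gate:04X}")
    print("identical truth tables:", three_gates == one_gate)
```

Because both descriptions give the same truth table, a mapper that works on logic functions has no reason to keep three separate cells unless each one is instantiated as a primitive and protected from merging.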
Habib Bouaziz-Viallet <habib@rigel.systems> wrote:
...
> PS :
> Although the DJ Delorie solution (xapp058) requires more work, I would go
> more deeply later.

If you like to go the SVF way, urjtag (look on sourceforge) can play SVF on
many adapters.

Bye
--
Uwe Bonnes                bon@elektron.ikp.physik.tu-darmstadt.de

Institut fuer Kernphysik  Schlossgartenstrasse 9  64289 Darmstadt
--------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------

Article: 131089
Jon Elson posted:

|-------------------------------------------------------------------|
|"[..]                                                              |
|                                                                   |
|[..] Proving you can                                               |
|correct corruption from a hit anywhere on a chip, while running ANY|
|program, at any time, seems like fantasy."                         |
|-------------------------------------------------------------------|

Correct. Xilinx did (and probably still does) have an admission on its
website that a level of risk must always be accepted, no matter what
is done to combat single-event effects.

Regards,
Colin Paul Gloster,
unemployed and hungry

Article: 131090
Austin posted:

|------------------------------------------------------------------------|
|"[..]                                                                   |
|                                                                        |
|[..] they can be tested                                                 |
|by reconfiguring to flip bits while operating. One heck of a lot cheaper|
|than using a proton beam, or neutron beam .... and more complete (we    |
|have folks who flip each bit, one by one, and prove their system meets  |
|its requirements)."                                                     |
|------------------------------------------------------------------------|

Logical testing will not match checking whether real radiation
respects your model of the system. One transient can defeat the
outcome of clocked triply modularly redundant voters.

Sincerely,
Colin Paul Gloster,
unemployed and cold

Article: 131091
Karl posted:

|---------------------------------------------------------------------|
|"why all this fuss about the need for new system level languages and |
|higher abstraction...systems were also heterogeneous in the past but |
|only few experts did implement them...hardware and software designers|
|were working apart. [..]                                             |
|                                                                     |
|[..]"                                                                |
|---------------------------------------------------------------------|

Is anything new? I refer you to
J. Robert Heath, B. D. Carroll, Terry T. Cwik,
"CDL - A tool for concurrent hardware and software development",
"Proceedings of the 14th conference on Design Automation", 1977.

Regards,
Colin Paul Gloster,
unemployed and hungry

Article: 131092
On Apr 9, 10:24 pm, John_H <newsgr...@johnhandwork.com> wrote:
> Wojciech Zabolotny wrote:
> > Hi All,
> >
> > I need to specify the strict setup time for the group of signals.
> > It can be relatively high, but I need very low skew between the signals.
> > In Quartus for Altera FPGAs I can define it with the sdc constraints as
> > follows:
> >
> > set_max_delay -from [get_ports MY_BUS[*]] -to [get_registers *] 6.000
> > set_min_delay -from [get_ports MY_BUS[*]] -to [get_registers *] 5.000
> >
> > With the constraints above the code compiles and operates correctly.
> >
> > However in Xilinx ISE:
> > TIMEGRP "MY_BUS_GRP" OFFSET = IN 6 ns BEFORE "SYS_CLK"
> > does not allow me to specify both the minimal and maximal setup time...
> >
> > As a result, I get an implementation where for one signal the setup
> > time is equal to 5.99 ns, but for another it is e.g. only 3 ns.
> > Because the signals are then oversampled at high frequency it results in
> > unacceptable skew...
> >
> > How can I solve this problem?
> > Using OFFSET IN AFTER is problematic, as I'd need to adjust all
> > constraints if the period of the clock changes :-(.
>
> When signals are absolutely critical to be low skew as in oversampling
> or otherwise very high speed designs, I and many others resort to using
> manual placement and often constrained routes. If the routing is
> regular, often one constrained route can cover a multitude of signals.
>
> Constrained routing is an advanced feature that requires some work
> inside FPGA Editor to generate the constraints. Solid knowledge of the
> physical workings around the IOBs and signal routing boxes helps
> significantly. Much of that knowledge comes from "poking around" inside
> the chip trying different routes both automated and manual.
>
> There are some application notes on wide, very high speed buses that use
> the physical constraints to tie down how things behave coming into the
> chip. How precise you get depends on how extreme your design goals are.
>
> - John_H

If you can generate the required phase relationship with your clock
signal (to the sampling flip-flops) instead of playing with the input
delay to the flop D, it is fairly simple to get low-skew sampling
when you use the IOB flip-flops. Then the only routing that comes
into play is the clock, and this can be adjusted using DCM resources.
This assumes you don't insert the delay element in each IOB between
the pin and the D. Newer Virtex parts also have variable input delay
elements that can help to remove residual skew, but for a 1 ns window
this should not be necessary.

Article: 131093
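The numbers in Wojciech's question make the problem easy to check by hand. As a small sketch (the helper names are hypothetical, and the delays are just the two example values and the 5 ns to 6 ns window quoted from the post), the skew across a bus is the spread between its slowest and fastest input path, and a min/max pair of constraints bounds that spread where a single OFFSET IN BEFORE bound does not:

```python
def skew_ns(delays_ns):
    """Spread between the slowest and fastest input path, in ns."""
    return max(delays_ns) - min(delays_ns)


def within_window(delays_ns, min_ns, max_ns):
    """True if every path delay lies inside [min_ns, max_ns]."""
    return all(min_ns <= d <= max_ns for d in delays_ns)


if __name__ == "__main__":
    # Example values from the post: one path at 5.99 ns, another at 3 ns.
    ise_result = [5.99, 3.0]
    print(f"skew: {skew_ns(ise_result):.2f} ns")
    print("meets 6 ns maximum only:", max(ise_result) <= 6.0)
    print("meets 5..6 ns window   :", within_window(ise_result, 5.0, 6.0))
```

Both paths satisfy the 6 ns upper bound, which is why the single constraint is met even though the skew is far wider than the 1 ns window.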
On Thu, 10 Apr 2008 08:36:35 +0000, Uwe Bonnes wrote:

> If you like to go the SVF way, urjtag (look on sourceforge) can play SVF on
> many adapters.
Ok thanks Uwe !
>
> Bye
>
--
HBV

Article: 131094
On 9 Apr., 00:58, Peter Alfke <pe...@xilinx.com> wrote:
> Slightly off-topic: Here is a civilian research application at CERN,
> where 120 Virtex 4FX devices "digest" and pre-process a thousand data
> streams of 2.5 Gbps each.
> I was peripherally involved, and I helped write the press release...
> http://biz.yahoo.com/prnews/080404/aqf063.html?.v=35
> I even got to walk around in the tunnel.
> Peter Alfke, Xilinx

So that's why Volker Lindenstruth is hosting the FPL this year.

I worked on something like the predecessor of that beast in 1995.
Processing 100,000 data streams at 10 MHz.

Kolja Sulimma

Article: 131095
"John_H" <newsgroup@johnhandwork.com> wrote in message news:ef-dnWJb2Lr752DanZ2dnUVZ_remnZ2d@comcast.com... > > When signals are absolutely critical to be low skew as in oversampling or > otherwise very high speed designs, I and many others resort to using > manual placement and often constrained routes. If the routing is regular, > often one constrained route can cover a multitude of signals. > > Constrained routing is an advanced feature that requires some work inside > FPGA Editor to generate the constraints. Solid knowledge of the physical > workings around the IOBs and signal routing boxes helps significantly. > Much of that knowledge comes from "poking around" inside the chip trying > different routes both automated and manual. > Good advice. May I also recommend some reading for the OP? directed routing site:xilinx.com HTH., Syms. From webmaster@nillakaes.de Thu Apr 10 09:02:41 2008 Path: flpi142.ffdc.sbc.com!flpi104.ffdc.sbc.com!newsdst01.news.prodigy.net!prodigy.com!newscon04.news.prodigy.net!prodigy.net!goblin2!goblin.stu.neva.ru!feed.cnntp.org!news.cnntp.org!not-for-mail Message-Id: <47fe3a1e$0$583$6e1ede2f@read.cnntp.org> From: Thorsten Kiefer <webmaster@nillakaes.de> Subject: clock instanciation Newsgroups: comp.arch.fpga Date: Thu, 10 Apr 2008 18:02:41 +0200 User-Agent: KNode/0.10.4 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7Bit Lines: 28 Organization: CNNTP NNTP-Posting-Host: 2974a2e3.read.cnntp.org X-Trace: DXC=b2OSkjN=b3WA[F0hKihbFPWoT\PAgXa?QaW[_@WQ]BhS1P5d:S1^>6SdlT:^LKmYjZm?>\ITh57oXVJ@RQDmEb[W X-Complaints-To: abuse@cnntp.org Xref: prodigy.net comp.arch.fpga:143516 X-Received-Date: Mon, 21 Apr 2008 19:21:05 EDT (flpi142.ffdc.sbc.com) Hi, in my book it's written that the clock has to instactiated like this : architecture Behavioral of fftest is constant T : time := 20ns; ... begin ... --clock process begin clk <= '0'; wait for T/2; clk <= '1'; wait for T/2; end process ... 
end Behavioral; But I get the error messages : 2x "Wait for statement unsupported." Spartan3, Xilinx ISE 9.2.... Best Regards ThorstenArticle: 131096
"Thorsten Kiefer" <webmaster@nillakaes.de> wrote in message news:47fe3a1e$0$583$6e1ede2f@read.cnntp.org... > Hi, > in my book it's written that the clock > has to instactiated like this : > Hi Thorsten, Your book is talking about simulation. You can't synthesise that code, because the FPGA can't implement delays like that. You need an external clock for synthesis that connects to your design through an input port. HTH., Syms. From webmaster@nillakaes.de Thu Apr 10 09:21:39 2008 Path: flpi142.ffdc.sbc.com!flpi104.ffdc.sbc.com!newsdst01.news.prodigy.net!prodigy.com!newscon04.news.prodigy.net!prodigy.net!goblin1!goblin.stu.neva.ru!newsfeed.datemas.de!feed.cnntp.org!news.cnntp.org!not-for-mail Message-Id: <47fe3e96$0$583$6e1ede2f@read.cnntp.org> From: Thorsten Kiefer <webmaster@nillakaes.de> Subject: Re: clock instanciation Newsgroups: comp.arch.fpga Date: Thu, 10 Apr 2008 18:21:39 +0200 References: <47fe3a1e$0$583$6e1ede2f@read.cnntp.org> <ftldt6$2ra$1@aioe.org> User-Agent: KNode/0.10.4 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7Bit Lines: 31 Organization: CNNTP NNTP-Posting-Host: 2974a2e3.read.cnntp.org X-Trace: DXC=65NH5f3M3dQm5517_1kbUXWoT\PAgXa?QaW[_@WQ]BhS1P5d:S1^>6SdlT:^LKmYjZDY4c_O@`cXWSR5>dB75[i_ X-Complaints-To: abuse@cnntp.org Xref: prodigy.net comp.arch.fpga:143518 X-Received-Date: Mon, 21 Apr 2008 19:21:38 EDT (flpi142.ffdc.sbc.com) Symon wrote: > > "Thorsten Kiefer" <webmaster@nillakaes.de> wrote in message > news:47fe3a1e$0$583$6e1ede2f@read.cnntp.org... >> Hi, >> in my book it's written that the clock >> has to instactiated like this : >> > Hi Thorsten, > Your book is talking about simulation. You can't synthesise that code, > because the FPGA can't implement delays like that. You need an external > clock for synthesis that connects to your design through an input port. > HTH., Syms. OK, I found the pin, now another question. 
This is also for simulation only: reset <= '1', '0' after T/2; I tried to convert that like that : process begin reset <= '1'; wait until falling_edge(clk); reset <= '0'; end process; But that would run forever, right ? Is it possible to abort a process after 1 execution ? --TKArticle: 131097
Thorsten Kiefer wrote:

> OK,
> I found the pin, now another question. This is also for simulation only:
> reset <= '1', '0' after T/2;
>
> I tried to convert that like that :

No synthesis code is required for reset or clock
other than the input port declarations.
Some testbench process can wiggle it however you like.

   -- Mike Treseler

Article: 131098
"Thorsten Kiefer" <webmaster@nillakaes.de> wrote in message news:47fe3e96$0$583$6e1ede2f@read.cnntp.org... > > OK, > I found the pin, now another question. This is also for simulation only: > reset <= '1', '0' after T/2; > > I tried to convert that like that : > process > begin > reset <= '1'; > wait until falling_edge(clk); > reset <= '0'; > end process; > But that would run forever, right ? > Is it possible to abort a process after 1 execution ? > > --TK > generate_reset : process begin reset <= '1'; wait for 10 ns; reset <= '0'; wait ; -- ***YOU NEED THIS*** end process generate_reset;Article: 131099
Colin,

It is a question of completeness. Logically going through every bit is
100% functionally complete.

Sitting in a proton beam is "waiting for Godot" -- how long must you
wait to check enough bits to achieve the required coverage? It becomes
a matter of "too many dollars to keep the lights on." (Beam testing is
horribly power hungry, and very expensive, e.g. TSL is $250K for a
session, not including the airplane tickets, hotel rooms, people,
rental cars...).

Additional system testing in a beam is highly desired, but the goal
there is not functional completeness; it is to cover whatever might
have been missed by flipping 100%, one by one, every configuration bit.

XTMR Tool(tm) software can not be broken by a single radiative event,
nor by a single bit flip (as verified by NASA, JPL, CERN, etc....).

Our flow triplicates the voters, so that every feedback path gets a
full TMR. A failure in a voter is "voted" out by the other two voters.

That is why we have so many designers using this flow: it just works.

Austin

Colin Paul Gloster wrote:
> Austin posted:
> |------------------------------------------------------------------------|
> |"[..]                                                                   |
> |                                                                        |
> |[..] they can be tested                                                 |
> |by reconfiguring to flip bits while operating. One heck of a lot cheaper|
> |than using a proton beam, or neutron beam .... and more complete (we    |
> |have folks who flip each bit, one by one, and prove their system meets  |
> |its requirements)."                                                     |
> |------------------------------------------------------------------------|
>
> Logical testing will not match checking whether real radiation respects
> your model of the system. One transient can defeat the outcome of clocked
> triply modularly redundant voters.
>
> Sincerely,
> Colin Paul Gloster,
> unemployed and cold
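The "flip each bit, one by one" style of logical testing, applied to a triplicated design with triplicated voters, can be sketched in a few lines. This is a purely logical toy model and an assumption throughout: it is not the XTMR Tool flow, the function names are hypothetical, and the final `majority` call stands in for whatever downstream stage consumes the three voter outputs. It says nothing about analog transients on shared clocks or routing, which is the case Colin raises.

```python
def majority(a, b, c):
    """Two-out-of-three vote on single bits."""
    return (a & b) | (a & c) | (b & c)


def voted_outputs(module_outputs, upset_voter=None):
    """Three voters each see all three module outputs.

    If `upset_voter` is given, that voter's output bit is inverted to
    model a single upset inside the voter itself.
    """
    votes = [majority(*module_outputs) for _ in range(3)]
    if upset_voter is not None:
        votes[upset_voter] ^= 1
    return votes


def survives_every_single_upset(correct):
    """Flip one module output or one voter at a time and check the result."""
    for m in range(3):
        outputs = [correct] * 3
        outputs[m] ^= 1
        if majority(*voted_outputs(outputs)) != correct:
            return False
    for v in range(3):
        if majority(*voted_outputs([correct] * 3, upset_voter=v)) != correct:
            return False
    return True


def survives_double_module_upset(correct):
    """Two modules wrong at once: outside what TMR is meant to mask."""
    outputs = [correct ^ 1, correct ^ 1, correct]
    return majority(*voted_outputs(outputs)) == correct


if __name__ == "__main__":
    print("single upsets masked:", all(survives_every_single_upset(b) for b in (0, 1)))
    print("double upset masked :", any(survives_double_module_upset(b) for b in (0, 1)))
```

In this model every single logical upset is masked and a simultaneous double upset is not, which is the coverage boundary the two posters are arguing about from opposite sides.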