On Saturday, August 27, 2016 at 11:09:26 PM UTC-7, rickman wrote:
> > I am thinking about getting a board with Xilinx FPGA, probably one of
> > the older Virtex ones. Found this on ebay:
> > http://www.ebay.com/itm/XILINX-VIRTEX-4-XC4VFX100-FPGA-kit-Development-board-XKF4/181791876772
> > , any comments?
>
> Why an older FPGA? Is it price?
>

Yes. The newer Virtex boards cost a lot more (the ones using Virtex 6 or 7 run into thousands of dollars).
The only reason I am thinking of Xilinx is because it is probably the most used FPGA (along with maybe Altera), so I thought having it on my resume might provide some advantage.

> I have *no* idea what that could be. As I mentioned, my understanding
> is the TCP/IP protocol in particular is *very* complex and people don't
> want to implement their own software on a CPU, much less in hardware.
> I'm not sure what an "offload engine" would be other than a dedicated
> CPU, optimized for TCP/IP. But if you have a CM4 in the device, why not
> use that?
>

Because the main purpose of this exercise is to design some network protocol in real hardware (FPGA) so I can make myself marketable for jobs in that area. If the network protocol part is too complicated as a side project, then I might just have to write some other (simpler) design and at least become familiar with the FPGA aspects of design.
Article: 159176
On Sunday, August 28, 2016 at 12:11:47 AM UTC-7, Tom Gardner wrote:
> On 27/08/16 18:20, PM X wrote:
> > I meant some part of the TCP/IP stack implemented in actual hardware and running in the FPGA. For example, a TCP Offload Engine, which implements the whole TCP/IP stack in hardware. Now, I understand that is something enormous and do not want to work on something that big for just a side/hobby project. But something in that area with less complexity is what I am looking for. I just don't have a good idea of what that could be, so having some clear definition will help.
>
> Protocol stack offload engines have two fundamental "issues".
>
> Firstly the time and latency required to get the packets
> to/from the main CPU. That significantly affects performance.
>
> Secondly the point that TCP is an end-to-end protocol
> and will correct protocol errors between or in the
> endpoints. An offload engine becomes the endpoint, so
> errors between the offload engine and the main CPU will
> not be corrected.
>
> The processing required to reliably transfer packets
> to/from the main CPU bears a lot of similarity to TCP!
>
> TCP in FPGA makes sense where
> - the lowest latency is required, and
> - where simplifying assumptions can be made, and
> - where the end terminal logic is also done in the FPGA
> e.g. high frequency trading, where they put the business
> trading rules in hardware to minimise latency

As a matter of fact, I am doing this to prepare myself for interviews in FPGA design role in high frequency trading (in maybe 3-6 months time). Is there any simpler project that you could suggest that would be in the general area of network protocols (maybe like a stripped down version) that could be implemented in FPGA? Thanks!
Article: 159177
On 28/08/16 19:26, PM X wrote: > On Sunday, August 28, 2016 at 12:11:47 AM UTC-7, Tom Gardner wrote: >> On 27/08/16 18:20, PM X wrote: >>> I meant some part of the TCP/IP stack implemented in actual hardware and running in the FPGA. For example, a TCP Offload Engine, which implements the whole TCP/IP stack in hardware. Now, I understand that is something enormous and do not want to work on something that big for just a side/hobby project. But something in that area with less complexity is what I am looking for. I just don't have a good idea of what that could be, so having some clear definition will help. >> >> Protocol stack offload engines have two fundamental "issues". >> >> Firstly the time and latency required to get the packets >> to/from the main CPU. That significantly affects performance. >> >> Secondly the point that TCP is and end-to-end protocol >> and will correct protocol errors between or in the >> endpoints. An offload engine becomes the endpoint, so >> errors between the offload engine and the main CPU will >> not be corrected. >> >> The processing required to reliably transfer packets >> to/from the main CPU bears a lot of similarity to TCP! >> >> TCP in FPGA makes sense where >> - the lowest latency is required, and >> - where simplifying assumptions can be made, and >> - where the end terminal logic is also done in the FPGA >> e.g. high frequency trading, where they put the business >> trading rules in hardware to minimise latency > > As a matter of fact, I am doing this to prepare myself for interviews in FPGA design role in high frequency trading (in maybe 3-6 months time). Is there any simpler project that you could suggest that would be in the general area of network protocols (maybe like a stripped down version) that could be implemented in FPGA? Thanks! Based on precisely *zero* evidence and only 30s thought, I would guess the networking stack is known technology, whereas business/trading rules are newer. 
ISTR seeing horrendously expensive boards (worth about 1ms of HFT!) with large numbers of big FPGAs and networking. If that's correct then partitioning the trading rules across FPGAs might be interesting. Or not.
Article: 159178
On 8/28/2016 2:26 PM, PM X wrote: > On Sunday, August 28, 2016 at 12:11:47 AM UTC-7, Tom Gardner wrote: >> On 27/08/16 18:20, PM X wrote: >>> I meant some part of the TCP/IP stack implemented in actual hardware and running in the FPGA. For example, a TCP Offload Engine, which implements the whole TCP/IP stack in hardware. Now, I understand that is something enormous and do not want to work on something that big for just a side/hobby project. But something in that area with less complexity is what I am looking for. I just don't have a good idea of what that could be, so having some clear definition will help. >> >> Protocol stack offload engines have two fundamental "issues". >> >> Firstly the time and latency required to get the packets >> to/from the main CPU. That significantly affects performance. >> >> Secondly the point that TCP is and end-to-end protocol >> and will correct protocol errors between or in the >> endpoints. An offload engine becomes the endpoint, so >> errors between the offload engine and the main CPU will >> not be corrected. >> >> The processing required to reliably transfer packets >> to/from the main CPU bears a lot of similarity to TCP! >> >> TCP in FPGA makes sense where >> - the lowest latency is required, and >> - where simplifying assumptions can be made, and >> - where the end terminal logic is also done in the FPGA >> e.g. high frequency trading, where they put the business >> trading rules in hardware to minimise latency > > As a matter of fact, I am doing this to prepare myself for interviews in FPGA design role in high frequency trading (in maybe 3-6 months time). Is there any simpler project that you could suggest that would be in the general area of network protocols (maybe like a stripped down version) that could be implemented in FPGA? Thanks! Ok, that explains a lot. I would suggest that you start with learning IP stack software. Before you can implement it in the FPGA you have to understand it. 
So first learn IP stack software. Once you understand how that works you can decide how best to implement it in logic.

--
Rick C
Article: 159179
On 8/28/2016 1:51 PM, rickman wrote: > On 8/28/2016 2:26 PM, PM X wrote: >> On Sunday, August 28, 2016 at 12:11:47 AM UTC-7, Tom Gardner wrote: >>> On 27/08/16 18:20, PM X wrote: >>>> I meant some part of the TCP/IP stack implemented in actual hardware >>>> and running in the FPGA. For example, a TCP Offload Engine, which >>>> implements the whole TCP/IP stack in hardware. Now, I understand >>>> that is something enormous and do not want to work on something that >>>> big for just a side/hobby project. But something in that area with >>>> less complexity is what I am looking for. I just don't have a good >>>> idea of what that could be, so having some clear definition will help. >>> >>> Protocol stack offload engines have two fundamental "issues". >>> >>> Firstly the time and latency required to get the packets >>> to/from the main CPU. That significantly affects performance. >>> >>> Secondly the point that TCP is and end-to-end protocol >>> and will correct protocol errors between or in the >>> endpoints. An offload engine becomes the endpoint, so >>> errors between the offload engine and the main CPU will >>> not be corrected. >>> >>> The processing required to reliably transfer packets >>> to/from the main CPU bears a lot of similarity to TCP! >>> >>> TCP in FPGA makes sense where >>> - the lowest latency is required, and >>> - where simplifying assumptions can be made, and >>> - where the end terminal logic is also done in the FPGA >>> e.g. high frequency trading, where they put the business >>> trading rules in hardware to minimise latency >> >> As a matter of fact, I am doing this to prepare myself for interviews >> in FPGA design role in high frequency trading (in maybe 3-6 months >> time). Is there any simpler project that you could suggest that would >> be in the general area of network protocols (maybe like a stripped >> down version) that could be implemented in FPGA? Thanks! > > Ok, that explains a lot. 
I would suggest that you start with learning
> IP stack software. Before you can implement it in the FPGA you have to
> understand it. So first learn IP stack software. Once you understand
> how that works you can decide how best to implement it in logic.

UDP/IP is much simpler than TCP/IP. It is commonly done in FPGAs.

For example:

http://www.fpga4fun.com/10BASE-T.html

OK. It is only 10Base-T. But it's not that different than the 10GbE that we do.

You can get a crappy NIC and a Basys 3 Artix 7 board for less than $200.00

http://store.digilentinc.com/pmodnic100-network-interface-controller/

It won't be low latency (the NIC has an SPI serial interface) but it will teach concepts.

Rob.
Article: 159180
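To give a feel for why UDP is considered so much simpler than TCP, here is a minimal Python sketch of everything a UDP sender has to produce: an 8-byte header plus a checksum over a small pseudo-header. There is no connection state, no retransmission and no windowing. The function names, addresses and ports are illustrative assumptions, not taken from any design mentioned in this thread, and an FPGA implementation would compute the same sum as a streaming adder rather than in a loop.

```python
import struct

UDP_PROTOCOL = 17  # IPv4 protocol number for UDP


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit words, inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: bytes, dst_ip: bytes, udp_length: int) -> bytes:
    """IPv4 pseudo-header that is covered by the UDP checksum."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, UDP_PROTOCOL, udp_length)


def build_udp_datagram(src_ip: bytes, dst_ip: bytes,
                       src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Return UDP header + payload (hypothetical helper for illustration)."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    if csum == 0:
        csum = 0xFFFF  # an all-zero field means "no checksum" in UDP over IPv4
    return struct.pack("!HHHH", src_port, dst_port, length, csum) + payload


if __name__ == "__main__":
    src = bytes([192, 168, 1, 10])  # example addresses, chosen arbitrarily
    dst = bytes([192, 168, 1, 20])
    dgram = build_udp_datagram(src, dst, 5000, 6000, b"hello fpga")
    ok = internet_checksum(pseudo_header(src, dst, len(dgram)) + dgram) == 0
    print(dgram.hex(), "checksum ok" if ok else "checksum BAD")
```

A receiver checks the datagram by running the same sum over the pseudo-header and the received bytes; a result of zero means the checksum matches.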
>
> UDP/IP is much simpler than TCP/IP. It is commonly done in FPGAs.
>
> For example:
>
> http://www.fpga4fun.com/10BASE-T.html
>
> OK. It is only 10Base-T. But it's not that different than the 10GbE that
> we do.
>
> You can get a crappy NIC and Basys 3 Artix 7 board for less than $200.00
>
> http://store.digilentinc.com/pmodnic100-network-interface-controller/
>
> It won't be low latency (the NIC has an SPI serial interface) but it
> will teach concepts.
>
> Rob.

Thanks. Is this the board you are referring to?
http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/

If so, this board doesn't seem to have SPI (at least not listed in the description). Also, do you think this board has enough capacity (in terms of logic elements, etc.) to support a fairly complicated design like UDP/IP?
Article: 159181
PM X <pinaki2@gmail.com> wrote: > As a matter of fact, I am doing this to prepare myself for interviews in > FPGA design role in high frequency trading (in maybe 3-6 months time). Is > there any simpler project that you could suggest that would be in the > general area of network protocols (maybe like a stripped down version) > that could be implemented in FPGA? Thanks! I'd start bottom-up. First, get the PHY working. You probably want a single speed without any rate switching (eg 1G, 10G), because anything else gets messy with clock reconfiguration. The choice will likely depend on your board. (You can't really test it at this point). Hopefully your board vendor has example code for this. Then drop in a vendor MAC component. I've used the Altera ones and they're not too bad: there are pipes on the side for streams in and out which are easy to deal with. There is a memory mapped interface for configuration - you may need to implement that to configure it, or maybe the defaults are OK. (To configure, use a soft core - NIOS or Microblaze or whatever). On the subject of board choice, I'd suggest avoiding anything with an external MAC chip (external PHY is OK) because they expect to be driven from a processor, while on FPGA it's all about packet pipes. Likewise a MAC as part of a CPU subsystem (eg a Zynq PS, Altera SoC-FPGA HPS) should be avoided because they usually have the same problem, as well as being awkward poorly-documented third-party IP where you essentially have to use the Linux driver. Once you have the MAC working, you have layer 2 packets in and out (you can see MAC addresses etc). Then you can start building up the layers (eg do IP and ARP). The MAC will probably give you some help for layer 2 (eg compute checksums for you) but you're on your own after that. Building up the layers is something you can simulate, while doing the plumbing of the MAC you can only test in hardware. 
I'd suggest writing a testbench that simulates pushing layer 2 packets between two simulated endpoints, and you can replace that by MAC+PHY on hardware when necessary.

If you want experience of networking on FPGA, you don't need to do a full TCP stack to do that: many of the principles of pushing packets about and dealing with vendor IP cores are the same irrespective of the packet format. If it makes you think about latency and meeting timing, that's a useful thing.

If you want to learn about the vagaries of TCP, IPv6, etc I'd suggest starting from software first. Maybe find some NIC with a non-scary interface and program it directly (with no OS support). I'm thinking something old like a PCI NE2000 (eg RTL8139) that isn't as complex as a modern card - the Intel gigabit E1000(e) series (8254x, 8257x, I21x) are well documented but a bit more complex. While you can do this from FPGA, it's much harder to debug.

Theo
Article: 159182
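The idea of pushing layer 2 packets between two simulated endpoints can be tried out in software before any HDL is written. The following is a rough Python model, not an HDL testbench: the `Endpoint` class, the deque standing in for the MAC+PHY link, and the MAC addresses are all assumptions made for illustration. The frame check sequence uses the standard Ethernet CRC-32, which `zlib.crc32` computes, appended least significant byte first.

```python
import struct
import zlib
from collections import deque

MIN_PAYLOAD = 46  # Ethernet pads short payloads up to this many bytes


def build_frame(dst_mac: bytes, src_mac: bytes, ethertype: int, payload: bytes) -> bytes:
    """Build destination, source, EtherType, padded payload and FCS."""
    payload = payload.ljust(MIN_PAYLOAD, b"\x00")
    body = dst_mac + src_mac + struct.pack("!H", ethertype) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def parse_frame(frame: bytes):
    """Check the FCS and split a frame into (dst, src, ethertype, payload)."""
    body, fcs = frame[:-4], frame[-4:]
    if struct.pack("<I", zlib.crc32(body)) != fcs:
        raise ValueError("FCS mismatch")
    (ethertype,) = struct.unpack("!H", body[12:14])
    return body[:6], body[6:12], ethertype, body[14:]


class Endpoint:
    """One simulated station; the shared deque stands in for MAC+PHY."""

    def __init__(self, mac: bytes):
        self.mac = mac
        self.received = []

    def send(self, wire: deque, dst_mac: bytes, ethertype: int, payload: bytes) -> None:
        wire.append(build_frame(dst_mac, self.mac, ethertype, payload))

    def poll(self, wire: deque) -> None:
        while wire:
            dst, src, ethertype, payload = parse_frame(wire.popleft())
            if dst == self.mac:  # drop frames addressed to someone else
                self.received.append((src, ethertype, payload))


if __name__ == "__main__":
    a = Endpoint(bytes.fromhex("020000000001"))  # locally administered MACs
    b = Endpoint(bytes.fromhex("020000000002"))
    wire = deque()
    a.send(wire, b.mac, 0x0800, b"layer 2 test payload".ljust(MIN_PAYLOAD, b"."))
    b.poll(wire)
    print(b.received)
```

Once the upper layers work against a model like this, the same frames can serve as stimulus and expected output for the real simulation, with the vendor MAC and PHY replacing the deque.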
On 8/28/2016 11:49 AM, Tim Regeant wrote:
> Anyone know where I can find this vintage software?
>
> I am looking for version 6.10, the free one with no dongle required.
>
> I think Synario was the one to release the free version.
>
> Used to be at the ftp site ftp://ftp.synario.com but can't reach it now.
>
> Thanks for any help you can offer.

Software has been found, thanks.
Article: 159183
On 8/26/2016 5:47 PM, PM X wrote:
> Hi all,
> I have over a decade of experience in hardware design, but almost all of it is in ASIC. I had done some FPGA projects in school, but nothing after that. So after all these years, I want to work on some personal FPGA projects (mainly to prepare myself for future job interviews). I have the following two questions.
>
> 1. What is an FPGA board that I can buy for this purpose? I am probably looking to do something not too basic (since I already have a lot of experience in design), at the same time I do not want to make it a super complicated full time project either. I prefer Xilinx (but I am open) and something less than $250 will be good. Note that question #2 might also affect the choice of board.
>
> 2. Once I have the FPGA board, I would like to implement some design involving network protocol (like TCP/IP, UDP, etc.). However, I have not worked on these network layers before and don't have an extensive knowledge on them either (other than what I had read in school long ago), so I do not have a very clear picture of what to do. Is there any open source design available on this? Or any projects with specific definitions that I can understand and then start implementing?
>
> Thanks!
>

Right off the bat, I'm not much of an expert, so maybe some who are more experienced can comment.

Have you looked at the Digilent Arty Artix 7 FPGA board? It's inexpensive at $99 and has a built-in Ethernet PHY, and it mentions that it comes with a MAC IP, so you are off to a start. It has 35K cells; I'm not sure whether that would be enough to implement the rest of the needed logic. It has both Arduino and Pmod connectors, so if desired you can add an external Ethernet module; these are available in several chip versions.

< http://store.digilentinc.com/arty-board-artix-7-fpga-development-board-for-makers-and-hobbyists/ >

Here is one external Ethernet adapter; there are many others available from other places.

< http://store.digilentinc.com/pmodnic100-network-interface-controller/ >

--
Cecil - k5nwa
Article: 159184
On 8/29/2016 4:58 AM, PM X wrote: >> >> UDP/IP is much simpler the TCP/IP. It is commonly done in FPGAs. >> >> For example: >> >> http://www.fpga4fun.com/10BASE-T.html >> >> OK. It is only 10Base-T. But it's not that different than the 10GbE that >> we do. >> >> You can get a crappy NIC and Basys 3 Artix 7 board for less than $200.00 >> >> http://store.digilentinc.com/pmodnic100-network-interface-controller/ >> >> It won't be low latency (the NIC has an SPI serial interface) but it >> will teach concepts. >> >> Rob. > > Thanks. Is this the board you are referring to? > http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/ > > If so, this board doesn't seem to have SPI (at least no listed in the description). Also, do you think this board has enough capacity (in terms of logic elements, etc.) to support a fairly complicated design like UDP/IP? What do you mean it doesn't have SPI? SPI is a simple shift register interface which can *easily* be implemented in an FPGA (or MCU) using the GPIOs. Do you mean ISP, in system programming? If it doesn't have ISP how do you load your design? -- Rick CArticle: 159185
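The description of SPI as a simple shift register interface can be illustrated with a small bit-level model. The Python sketch below is only a behavioural illustration under stated assumptions: 8-bit transfers, most significant bit first, and one data bit exchanged per clock. The function name is made up for this example. In an FPGA the same thing is two shift registers clocked by SCK, with chip select gating the transfer.

```python
def spi_transfer_byte(master_out: int, slave_out: int):
    """Exchange one byte between two 8-bit shift registers, MSB first.

    Returns (byte received by master, byte received by slave).
    """
    master_sr = master_out & 0xFF
    slave_sr = slave_out & 0xFF
    for _ in range(8):  # one iteration per SCK cycle
        mosi = (master_sr >> 7) & 1  # master drives its top bit onto MOSI
        miso = (slave_sr >> 7) & 1   # slave drives its top bit onto MISO
        # Both sides shift left and capture the bit driven by the other side.
        master_sr = ((master_sr << 1) | miso) & 0xFF
        slave_sr = ((slave_sr << 1) | mosi) & 0xFF
    return master_sr, slave_sr


if __name__ == "__main__":
    rx_master, rx_slave = spi_transfer_byte(0xA5, 0x3C)
    print(hex(rx_master), hex(rx_slave))
```

After eight clocks the two registers have swapped contents, which is why four ordinary pins (SCK, MOSI, MISO and a chip select) are enough.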
On 8/29/2016 1:32 PM, rickman wrote:
> On 8/29/2016 4:58 AM, PM X wrote:
>>>
>>> UDP/IP is much simpler than TCP/IP. It is commonly done in FPGAs.
>>>
>>> For example:
>>>
>>> http://www.fpga4fun.com/10BASE-T.html
>>>
>>> OK. It is only 10Base-T. But it's not that different than the 10GbE that
>>> we do.
>>>
>>> You can get a crappy NIC and Basys 3 Artix 7 board for less than $200.00
>>>
>>> http://store.digilentinc.com/pmodnic100-network-interface-controller/
>>>
>>> It won't be low latency (the NIC has an SPI serial interface) but it
>>> will teach concepts.
>>>
>>> Rob.
>>
>> Thanks. Is this the board you are referring to?
>> http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/
>>
>>
>> If so, this board doesn't seem to have SPI (at least not listed in the
>> description). Also, do you think this board has enough capacity (in
>> terms of logic elements, etc.) to support a fairly complicated design
>> like UDP/IP?
>
> What do you mean it doesn't have SPI? SPI is a simple shift register
> interface which can *easily* be implemented in an FPGA (or MCU) using
> the GPIOs.
>
> Do you mean ISP, in system programming? If it doesn't have ISP how do
> you load your design?
>

It's an FPGA, you can add SPI easily, there are IPs for free to allow that to happen.

In my post I was also going to mention the BASYS-3 board, I left it out because the Arty Board has a ton of memory available that this one doesn't but this one has a lot of switches and LEDs which can be handy.

--
Cecil - k5nwa
Article: 159186
On Monday, August 29, 2016 at 11:32:23 AM UTC-7, rickman wrote: > On 8/29/2016 4:58 AM, PM X wrote: > >> > >> UDP/IP is much simpler the TCP/IP. It is commonly done in FPGAs. > >> > >> For example: > >> > >> http://www.fpga4fun.com/10BASE-T.html > >> > >> OK. It is only 10Base-T. But it's not that different than the 10GbE that > >> we do. > >> > >> You can get a crappy NIC and Basys 3 Artix 7 board for less than $200.00 > >> > >> http://store.digilentinc.com/pmodnic100-network-interface-controller/ > >> > >> It won't be low latency (the NIC has an SPI serial interface) but it > >> will teach concepts. > >> > >> Rob. > > > > Thanks. Is this the board you are referring to? > > http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/ > > > > If so, this board doesn't seem to have SPI (at least no listed in the description). Also, do you think this board has enough capacity (in terms of logic elements, etc.) to support a fairly complicated design like UDP/IP? > > What do you mean it doesn't have SPI? SPI is a simple shift register > interface which can *easily* be implemented in an FPGA (or MCU) using > the GPIOs. > > Do you mean ISP, in system programming? If it doesn't have ISP how do > you load your design? > > -- > > Rick C I meant I didn't see a (dedicated) SPI interface. But you're right. SPI protocol just needs 4 general purpose pins, so using GPIOs should do it.Article: 159187
On Monday, August 29, 2016 at 11:38:27 AM UTC-7, Cecil Bayona wrote:
> On 8/29/2016 1:32 PM, rickman wrote:
> > On 8/29/2016 4:58 AM, PM X wrote:
> >>>
> >>> UDP/IP is much simpler than TCP/IP. It is commonly done in FPGAs.
> >>>
> >>> For example:
> >>>
> >>> http://www.fpga4fun.com/10BASE-T.html
> >>>
> >>> OK. It is only 10Base-T. But it's not that different than the 10GbE that
> >>> we do.
> >>>
> >>> You can get a crappy NIC and Basys 3 Artix 7 board for less than $200.00
> >>>
> >>> http://store.digilentinc.com/pmodnic100-network-interface-controller/
> >>>
> >>> It won't be low latency (the NIC has an SPI serial interface) but it
> >>> will teach concepts.
> >>>
> >>> Rob.
> >>
> >> Thanks. Is this the board you are referring to?
> >> http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/
> >>
> >>
> >> If so, this board doesn't seem to have SPI (at least not listed in the
> >> description). Also, do you think this board has enough capacity (in
> >> terms of logic elements, etc.) to support a fairly complicated design
> >> like UDP/IP?
> >
> > What do you mean it doesn't have SPI? SPI is a simple shift register
> > interface which can *easily* be implemented in an FPGA (or MCU) using
> > the GPIOs.
> >
> > Do you mean ISP, in system programming? If it doesn't have ISP how do
> > you load your design?
> >
>
> It's an FPGA, you can add SPI easily, there are IPs for free to allow
> that to happen.
>
> In my post I was also going to mention the BASYS-3 board, I left it out
> because the Arty Board has a ton of memory available that this one
> doesn't but this one has a lot of switches and LEDs which can be handy.
> --
> Cecil - k5nwa

OK, thanks. I will check out both of them. What is the largest design you (or someone you know) have implemented on these boards? The Artix line seems to be lower end than Virtex line, so trying to get an idea if they can support somewhat complicated designs.
Article: 159188
On 8/29/2016 2:40 PM, PM X wrote: > On Monday, August 29, 2016 at 11:38:27 AM UTC-7, Cecil Bayona wrote: >> On 8/29/2016 1:32 PM, rickman wrote: >>> On 8/29/2016 4:58 AM, PM X wrote: >>>>> >>>>> UDP/IP is much simpler the TCP/IP. It is commonly done in FPGAs. >>>>> >>>>> For example: >>>>> >>>>> http://www.fpga4fun.com/10BASE-T.html >>>>> >>>>> OK. It is only 10Base-T. But it's not that different than the 10GbE that >>>>> we do. >>>>> >>>>> You can get a crappy NIC and Basys 3 Artix 7 board for less than $200.00 >>>>> >>>>> http://store.digilentinc.com/pmodnic100-network-interface-controller/ >>>>> >>>>> It won't be low latency (the NIC has an SPI serial interface) but it >>>>> will teach concepts. >>>>> >>>>> Rob. >>>> >>>> Thanks. Is this the board you are referring to? >>>> http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/ >>>> >>>> >>>> If so, this board doesn't seem to have SPI (at least no listed in the >>>> description). Also, do you think this board has enough capacity (in >>>> terms of logic elements, etc.) to support a fairly complicated design >>>> like UDP/IP? >>> >>> What do you mean it doesn't have SPI? SPI is a simple shift register >>> interface which can *easily* be implemented in an FPGA (or MCU) using >>> the GPIOs. >>> >>> Do you mean ISP, in system programming? If it doesn't have ISP how do >>> you load your design? >>> >> >> It's a FPGA, you can add SPI easily, there are IPs for free to allow >> that to happen. >> >> In my post I was also going to mention the BASYS-3 board, I left it out >> because the Arty Board has a ton of memory available that this one >> doesn't but this on has a lot of switches and LED which can be handy. >> -- >> Cecil - k5nwa > > OK, thanks. I will check out both of them. What is the largest design you (or someone you know) have implemented on these boards? 
The Artix line seems to be lower end than Virtex line, so trying to get an idea if they can support somewhat complicated designs.
>

Nothing fancy, which is why in my earlier post I mentioned that I don't have a lot of experience. I have been working on a 32-bit stack-based CPU, but it's a work in progress and I'm still sorting it out. It has taken less than 20% of the chip, but stack CPUs are rather simple compared to other CPUs. When finished it should be pretty nice: most instructions take one clock to execute, and it uses packed instructions, 5 instructions to a word fetch. Originally it was on a Lattice Brevia2; I am now converting it to an Artix-7 board, but there is software involved too, so it's going slowly and I'm learning as I go.

--
Cecil - k5nwa
Article: 159189
On 8/29/2016 3:40 PM, PM X wrote: > On Monday, August 29, 2016 at 11:38:27 AM UTC-7, Cecil Bayona wrote: >> On 8/29/2016 1:32 PM, rickman wrote: >>> On 8/29/2016 4:58 AM, PM X wrote: >>>>> >>>>> UDP/IP is much simpler the TCP/IP. It is commonly done in FPGAs. >>>>> >>>>> For example: >>>>> >>>>> http://www.fpga4fun.com/10BASE-T.html >>>>> >>>>> OK. It is only 10Base-T. But it's not that different than the 10GbE that >>>>> we do. >>>>> >>>>> You can get a crappy NIC and Basys 3 Artix 7 board for less than $200.00 >>>>> >>>>> http://store.digilentinc.com/pmodnic100-network-interface-controller/ >>>>> >>>>> It won't be low latency (the NIC has an SPI serial interface) but it >>>>> will teach concepts. >>>>> >>>>> Rob. >>>> >>>> Thanks. Is this the board you are referring to? >>>> http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/ >>>> >>>> >>>> If so, this board doesn't seem to have SPI (at least no listed in the >>>> description). Also, do you think this board has enough capacity (in >>>> terms of logic elements, etc.) to support a fairly complicated design >>>> like UDP/IP? >>> >>> What do you mean it doesn't have SPI? SPI is a simple shift register >>> interface which can *easily* be implemented in an FPGA (or MCU) using >>> the GPIOs. >>> >>> Do you mean ISP, in system programming? If it doesn't have ISP how do >>> you load your design? >>> >> >> It's a FPGA, you can add SPI easily, there are IPs for free to allow >> that to happen. >> >> In my post I was also going to mention the BASYS-3 board, I left it out >> because the Arty Board has a ton of memory available that this one >> doesn't but this on has a lot of switches and LED which can be handy. >> -- >> Cecil - k5nwa > > OK, thanks. I will check out both of them. What is the largest design you (or someone you know) have implemented on these boards? 
The Artix line seems to be lower end than Virtex line, so trying to get an idea if they can support somewhat complicated designs.

Not sure what "lower end" means in technical terms. I expect the Artix line of FPGAs will easily support somewhat complicated designs for reasonable values of "somewhat complicated". I suggest you not give much credence to marketing information and consider the chip specifications instead. For the most part the important issue is the LUT count. Otherwise the extra features are only useful if you need them.

--
Rick C
Article: 159190
On 8/29/2016 4:30 PM, Cecil Bayona wrote: > > > On 8/29/2016 2:40 PM, PM X wrote: >> On Monday, August 29, 2016 at 11:38:27 AM UTC-7, Cecil Bayona wrote: >>> On 8/29/2016 1:32 PM, rickman wrote: >>>> On 8/29/2016 4:58 AM, PM X wrote: >>>>>> >>>>>> UDP/IP is much simpler the TCP/IP. It is commonly done in FPGAs. >>>>>> >>>>>> For example: >>>>>> >>>>>> http://www.fpga4fun.com/10BASE-T.html >>>>>> >>>>>> OK. It is only 10Base-T. But it's not that different than the >>>>>> 10GbE that >>>>>> we do. >>>>>> >>>>>> You can get a crappy NIC and Basys 3 Artix 7 board for less than >>>>>> $200.00 >>>>>> >>>>>> http://store.digilentinc.com/pmodnic100-network-interface-controller/ >>>>>> >>>>>> It won't be low latency (the NIC has an SPI serial interface) but it >>>>>> will teach concepts. >>>>>> >>>>>> Rob. >>>>> >>>>> Thanks. Is this the board you are referring to? >>>>> http://store.digilentinc.com/basys-3-artix-7-fpga-trainer-board-recommended-for-introductory-users/ >>>>> >>>>> >>>>> >>>>> If so, this board doesn't seem to have SPI (at least no listed in the >>>>> description). Also, do you think this board has enough capacity (in >>>>> terms of logic elements, etc.) to support a fairly complicated design >>>>> like UDP/IP? >>>> >>>> What do you mean it doesn't have SPI? SPI is a simple shift register >>>> interface which can *easily* be implemented in an FPGA (or MCU) using >>>> the GPIOs. >>>> >>>> Do you mean ISP, in system programming? If it doesn't have ISP how do >>>> you load your design? >>>> >>> >>> It's a FPGA, you can add SPI easily, there are IPs for free to allow >>> that to happen. >>> >>> In my post I was also going to mention the BASYS-3 board, I left it out >>> because the Arty Board has a ton of memory available that this one >>> doesn't but this on has a lot of switches and LED which can be handy. >>> -- >>> Cecil - k5nwa >> >> OK, thanks. I will check out both of them. 
What is the largest design >> you (or someone you know) have implemented on these boards? The Artix >> line seems to be lower end than Virtex line, so trying to get an idea >> if they can support somewhat complicated designs. >> > Nothing Fancy, that is why in my earlier post I mentioned that I don't > have a lot of experience. I been working a 32 bit stack based CPU, but > it's a work in progress, I'm still sorting it out, it taken less than > 20% of the chip, but a stack CPU are rather simple compared to other > CPU's, when finished it should be pretty nice, most instructions take > one clock to execute, and it used packed instructions, 5 instructions to > a word fetch. Originally it was on a Lattice Brevia2, I am now > converting it to a Artix-7 board, but there is software involved too so > it's going slow and I'm learning as I go. Just a comment on your stack processor. I've done some design work with stack processors and read about a lot of designs. In my humble opinion, if you have multiple cycle instructions, you are doing it wrong. I don't want to steal the thread. If you care to discuss this we can start another thread. -- Rick CArticle: 159191
On 8/29/2016 7:55 PM, rickman wrote:
> On 8/29/2016 4:30 PM, Cecil Bayona wrote:
>> Nothing Fancy, that is why in my earlier post I mentioned that I don't
>> have a lot of experience. I been working a 32 bit stack based CPU, but
>> it's a work in progress, I'm still sorting it out, it taken less than
>> 20% of the chip, but a stack CPU are rather simple compared to other
>> CPU's, when finished it should be pretty nice, most instructions take
>> one clock to execute, and it used packed instructions, 5 instructions to
>> a word fetch. Originally it was on a Lattice Brevia2, I am now
>> converting it to a Artix-7 board, but there is software involved too so
>> it's going slow and I'm learning as I go.
>
> Just a comment on your stack processor. I've done some design work with
> stack processors and read about a lot of designs. In my humble opinion,
> if you have multiple cycle instructions, you are doing it wrong. I
> don't want to steal the thread. If you care to discuss this we can
> start another thread.
>

I'm not sure why you think that it uses multiple-clock instructions. I mentioned that most occur in one clock; the exception is load immediate, which takes the instruction fetch, then a second fetch for the 32-bit value to push on the stack. All others take one clock. Even that one can take place in one clock with extra hardware to fetch RAM with a buffer to hold two 32-bit words; an alternative is to use two clocks, one to execute and the other to fetch program instructions.

What it does have is multiple instructions in one program word. It's five instructions or one depending on what it is: jump and call take the whole 32-bit word and are not packed; everything else is 5 instructions to a 32-bit program word, so you have fewer memory fetches.

--
Cecil - k5nwa
Article: 159192
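The packing described above, five instructions per 32-bit program word, can be sketched in a few lines. The post does not give the opcode width or slot order, so the layout below is purely an assumption for illustration: five 6-bit slots (30 bits), first instruction in the highest slot, top two bits left unused. All names are hypothetical.

```python
SLOT_BITS = 6   # assumed opcode width; 5 slots * 6 bits = 30 of the 32 bits
SLOTS = 5       # instructions per program word, as described in the post
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_word(opcodes):
    """Pack exactly five opcodes into one 32-bit word, first opcode highest."""
    if len(opcodes) != SLOTS:
        raise ValueError("expected exactly %d opcodes" % SLOTS)
    word = 0
    for op in opcodes:
        if not 0 <= op <= SLOT_MASK:
            raise ValueError("opcode %r does not fit in %d bits" % (op, SLOT_BITS))
        word = (word << SLOT_BITS) | op
    return word


def unpack_word(word):
    """Split a 32-bit program word back into its five opcode slots."""
    return [(word >> (SLOT_BITS * (SLOTS - 1 - i))) & SLOT_MASK for i in range(SLOTS)]


if __name__ == "__main__":
    word = pack_word([1, 2, 3, 4, 5])
    print(hex(word), unpack_word(word))
```

With a layout like this, one memory fetch feeds five execution cycles, which is the reduction in memory fetches the post refers to. Selecting the active slot each cycle requires a 5-to-1 multiplexer on the instruction word.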
On 8/30/2016 12:03 AM, Cecil Bayona wrote: > On 8/29/2016 7:55 PM, rickman wrote: >> On 8/29/2016 4:30 PM, Cecil Bayona wrote: >>> Nothing Fancy, that is why in my earlier post I mentioned that I don't >>> have a lot of experience. I been working a 32 bit stack based CPU, but >>> it's a work in progress, I'm still sorting it out, it taken less than >>> 20% of the chip, but a stack CPU are rather simple compared to other >>> CPU's, when finished it should be pretty nice, most instructions take >>> one clock to execute, and it used packed instructions, 5 instructions to >>> a word fetch. Originally it was on a Lattice Brevia2, I am now >>> converting it to a Artix-7 board, but there is software involved too so >>> it's going slow and I'm learning as I go. >> >> Just a comment on your stack processor. I've done some design work with >> stack processors and read about a lot of designs. In my humble opinion, >> if you have multiple cycle instructions, you are doing it wrong. I >> don't want to steal the thread. If you care to discuss this we can >> start another thread. >> > I'm not sure why you think that it uses multiple cock instruction, I > mentioned that most occur in one clock, the exception is load immediate > it takes the instruction fetch, then a second fetch for the 32 bit value > to push on the stack, all others take one clock. Even that one can take > place in one clock with extra hardware to fetch RAM with a buffer to > hold two 32 bit words, an alternative is to use two clocks one to > execute the other to fetch program instructions. > > What is does have is multiple instructions on one program word, it's > five instructions or one depending on what it is, jump, and call take > the whole 32 bit word and is not packed, everything else is 5 > instructions to a 32 bit program word so you have fewer memory fetches. Are you using external memory? The CPUs I've designed were 100% internal to the FPGA so there was no advantage to fetching multiple instructions. 
In fact, the multiplexer needed was just more delay. I thought your design was not one clock per instruction because of what you said. Your design is using a single memory interface. It is common for stack machines to separate data and instruction space. But if you are working out of external memory that is not so easy. I assume you have read the various similar work that has been done? The J1 is very interesting in that it was so durn simple. A true MISC design and effective. If you meander over to comp.lang.forth there are folks there who have designed some MISC CPUs which have proven their worth. Bernd Paysan designed the B16 which seems like an effective design. I've never tried to work with it, but it sounds similar to yours in the way it combines multiple instructions into a single instruction word. -- Rick CArticle: 159193
On 8/30/2016 1:11 AM, rickman wrote: > On 8/30/2016 12:03 AM, Cecil Bayona wrote: >> On 8/29/2016 7:55 PM, rickman wrote: >>> On 8/29/2016 4:30 PM, Cecil Bayona wrote: >>>> Nothing Fancy, that is why in my earlier post I mentioned that I don't >>>> have a lot of experience. I been working a 32 bit stack based CPU, but >>>> it's a work in progress, I'm still sorting it out, it taken less than >>>> 20% of the chip, but a stack CPU are rather simple compared to other >>>> CPU's, when finished it should be pretty nice, most instructions take >>>> one clock to execute, and it used packed instructions, 5 >>>> instructions to >>>> a word fetch. Originally it was on a Lattice Brevia2, I am now >>>> converting it to a Artix-7 board, but there is software involved too so >>>> it's going slow and I'm learning as I go. >>> >>> Just a comment on your stack processor. I've done some design work with >>> stack processors and read about a lot of designs. In my humble opinion, >>> if you have multiple cycle instructions, you are doing it wrong. I >>> don't want to steal the thread. If you care to discuss this we can >>> start another thread. >>> >> I'm not sure why you think that it uses multiple cock instruction, I >> mentioned that most occur in one clock, the exception is load immediate >> it takes the instruction fetch, then a second fetch for the 32 bit value >> to push on the stack, all others take one clock. Even that one can take >> place in one clock with extra hardware to fetch RAM with a buffer to >> hold two 32 bit words, an alternative is to use two clocks one to >> execute the other to fetch program instructions. >> >> What is does have is multiple instructions on one program word, it's >> five instructions or one depending on what it is, jump, and call take >> the whole 32 bit word and is not packed, everything else is 5 >> instructions to a 32 bit program word so you have fewer memory fetches. > > Are you using external memory? 
The CPUs I've designed were 100% > internal to the FPGA so there was no advantage to fetching multiple > instructions. In fact, the multiplexer needed was just more delay. > > I thought you design was not one clock per instruction because of what > you said. Your design is using a single memory interface. It is common > for stack machines to separate data and instruction space. But if you > are working out of external memory that is not so easy. > > I assume you have read the various similar work that has been done? The > J1 is very interesting in that it was so durn simple. A true MISC > design and effective. If you meander over to comp.lang.forth there are > folks there who have designed some MISC CPUs which have proven their > worth. Bernd Paysan designed the B16 which seems like an effective > design. I've never tried to work with it, but it sounds similar to > yours in they way it combines multiple instructions into a single > instruction word. > I think the point of multiple instructions per word is to save on program space; 32 bits to an instruction will eat up the memory space way too quickly. It is a von Neumann machine with a single address space: code and data are all in the same space. It uses three kinds of instruction formats: short, long, and double word. A short instruction is 6 bits, so you can cram 5 instructions into a memory word. There are 64 possible instructions, but the bare machine uses 27 opcodes, so there is room for additional instructions. These are things that require no addresses, such as dup, swap, return, add, etc. 
Long instructions take 30 bits. They are things like jump and call, including conditional versions. They include a 24 bit address as part of the instruction, which can address 16MB of program space; the upper two bits are unused at present. There is a single two-word instruction, which pushes the second 32 bit word onto the stack. It is itself a 6 bit instruction, and there can be more than one "LIT" instruction in a word; each one has an additional word to use as a literal. Each 32 bit program word could have 6 LIT instructions, with each one having an additional 32 bit word following the program code word, so if you have six literal instructions packed into a single 32 bit program word, it is followed by six 32 bit words containing the literal values, and the program continues 7 words after the current 6 instruction word. Of course one can have fewer literals. The whole thing is like a packed instruction pipeline: it minimizes the number of fetches, but any control flow instruction cancels and dumps the current instruction queue and does a fetch at a different place, since the IP has changed. The job of packing the instructions into one word is handled by the Forth compiler, so the user does not see it; it just makes the program code smaller automatically. I did some tests and 5 is too long a queue. A 16 bit version packs 3 instructions to a word, and that version ends up with a shorter program area, so it's more efficient, in part because Forth changes the program counter often, so a longer instruction queue wastes instruction slots. On a 16 bit version, adding some extra instructions would be nice, such as add with carry and subtract with borrow, so it can handle 32 bit operations efficiently. Overall it's an efficient design, but it could be improved, as anything else could. One could add combined instructions where a return is combined with some regular instruction so it executes both in one clock. 
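For readers following the packing scheme described in this post, here is a rough Python sketch of how 6 bit short instructions might be packed into and unpacked from a 32 bit program word, and how the fetch address could skip the literal words that follow a word containing LIT. It assumes five 6 bit slots per word (5 x 6 = 30 bits, top two bits unused), as stated for short instructions. The slot order (slot 0 in the low bits), the opcode numbers, and the function names are all made up for illustration; the post does not give them.

```python
SLOT_BITS = 6
SLOTS_PER_WORD = 5                    # 5 * 6 = 30 bits; top 2 bits left unused
SLOT_MASK = (1 << SLOT_BITS) - 1      # 0x3F, i.e. 64 possible opcodes

# Hypothetical opcode numbers, chosen only for this illustration.
NOP, DUP, SWAP, ADD, LIT = 0, 1, 2, 3, 4


def pack(ops):
    """Pack up to five 6-bit opcodes into one word (assumed: slot 0 = low bits)."""
    if len(ops) > SLOTS_PER_WORD:
        raise ValueError("too many instructions for one word")
    word = 0
    for slot, op in enumerate(ops):
        if not 0 <= op <= SLOT_MASK:
            raise ValueError("opcode does not fit in 6 bits")
        word |= op << (slot * SLOT_BITS)
    return word                        # unused slots are left as NOP (0)


def unpack(word):
    """Split a program word back into its five 6-bit slots."""
    return [(word >> (slot * SLOT_BITS)) & SLOT_MASK
            for slot in range(SLOTS_PER_WORD)]


def next_word_address(memory, addr):
    """Address of the next instruction word: skip one literal word per LIT slot."""
    literal_count = unpack(memory[addr]).count(LIT)
    return addr + 1 + literal_count


# Usage: a word holding LIT LIT ADD is followed by its two literal values.
program = [pack([LIT, LIT, ADD]), 10, 20, pack([DUP])]
following = next_word_address(program, 0)   # index of the word holding DUP
```

A control flow instruction would simply discard whatever slots remain in the current word, which is why a longer queue tends to waste more slots in branch-heavy Forth code.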
Eventually I want to work with the J1. It's a simpler 16 bit machine, but since the instruction word contains multiple fields it can naturally have instructions that do multiple operations at the same time; with a more complex compiler packing multiple operations, it might save on program space. It does have a very limited address space because its instructions are limited to 16 bits. -- Cecil - k5nwaArticle: 159194
On 8/30/2016 4:49 AM, Cecil Bayona wrote: > > > On 8/30/2016 1:11 AM, rickman wrote: >> On 8/30/2016 12:03 AM, Cecil Bayona wrote: >>> On 8/29/2016 7:55 PM, rickman wrote: >>>> On 8/29/2016 4:30 PM, Cecil Bayona wrote: >>>>> Nothing Fancy, that is why in my earlier post I mentioned that I don't >>>>> have a lot of experience. I been working a 32 bit stack based CPU, >>>>> but >>>>> it's a work in progress, I'm still sorting it out, it taken less than >>>>> 20% of the chip, but a stack CPU are rather simple compared to other >>>>> CPU's, when finished it should be pretty nice, most instructions take >>>>> one clock to execute, and it used packed instructions, 5 >>>>> instructions to >>>>> a word fetch. Originally it was on a Lattice Brevia2, I am now >>>>> converting it to a Artix-7 board, but there is software involved >>>>> too so >>>>> it's going slow and I'm learning as I go. >>>> >>>> Just a comment on your stack processor. I've done some design work >>>> with >>>> stack processors and read about a lot of designs. In my humble >>>> opinion, >>>> if you have multiple cycle instructions, you are doing it wrong. I >>>> don't want to steal the thread. If you care to discuss this we can >>>> start another thread. >>>> >>> I'm not sure why you think that it uses multiple cock instruction, I >>> mentioned that most occur in one clock, the exception is load immediate >>> it takes the instruction fetch, then a second fetch for the 32 bit value >>> to push on the stack, all others take one clock. Even that one can take >>> place in one clock with extra hardware to fetch RAM with a buffer to >>> hold two 32 bit words, an alternative is to use two clocks one to >>> execute the other to fetch program instructions. 
>>> >>> What is does have is multiple instructions on one program word, it's >>> five instructions or one depending on what it is, jump, and call take >>> the whole 32 bit word and is not packed, everything else is 5 >>> instructions to a 32 bit program word so you have fewer memory fetches. >> >> Are you using external memory? The CPUs I've designed were 100% >> internal to the FPGA so there was no advantage to fetching multiple >> instructions. In fact, the multiplexer needed was just more delay. >> >> I thought you design was not one clock per instruction because of what >> you said. Your design is using a single memory interface. It is common >> for stack machines to separate data and instruction space. But if you >> are working out of external memory that is not so easy. >> >> I assume you have read the various similar work that has been done? The >> J1 is very interesting in that it was so durn simple. A true MISC >> design and effective. If you meander over to comp.lang.forth there are >> folks there who have designed some MISC CPUs which have proven their >> worth. Bernd Paysan designed the B16 which seems like an effective >> design. I've never tried to work with it, but it sounds similar to >> yours in they way it combines multiple instructions into a single >> instruction word. >> > I think the multiple instructions is to save on program space, 32 bits > to an instruction will eat up the memory space ways too quickly. It is a > Von Newman machine with a single address space, code and data is all in > the same space. Again, that is a reflection of having a common address space for instructions and data. My CPUs use 8 or 9 bits for instructions while being data size independent. The instruction format does not imply any particular data bus size. > It uses three kinds of instruction formats, short and long, and double > word. 
> > A short instruction is 6 bits so you can cram 5 instruction to a memory > word, so there are 64 possible instructions but the bare machine uses 27 > opcodes so there is room for additional instructions. The are things > that require no addresses such as dup, swap, return, add, etc. > > Long instructions take 30 bits, they are things like jump, call, > including conditional versions, they include a 24 bit address as part of > the instruction which can address 16MB of program space, the upper two > bits are unused at present > > There is a single two word instruction, which pushes the second 32 bit > word into the stack, it itself is a 6 bit instruction and there can be > more that one "LIT" instruction, each one has an additional word to use > as a literal. Each 32 bit program word could have 6 LIT instructions > with each one having an additional 32 bit word following the program > code word so if you have six literal words packed into single 32 bit > program word, it is followed by six 32 words containing the literal > values, the program continues 7 words after the current 6 instruction > word, of course one can have fewer literals. I've shied away from multicycle instructions because it means more bits (1 bit in this case) to indicate the cycle count which is more input(s) to the decoder. I wanted to try to keep the decoder as simple as possible. > The whole thing is like a packed instruction pipeline it minimizes the > number of fetches but any control flow instruction cancels and dumps the > current instruction queue and does a fetch at a different place since > the IP has changed. The F18A does that. Learning how to pack instructions into the word is a bit tricky. Even harder is learning how to time execution, but that's because it is async and does not use a fixed frequency clock. > The job of packing the instructions into one word is handled by the > Forth Compiler so the user does not see it, it just makes the program > code smaller automatically. 
I did some test and 5 is too long a queue, > when doing a 16 bit version it packs 3 instructions to a word and that > version ends up with a shorter program area so it's more efficient, in > part because Forth changes the Program Counter often so having a longer > instruction queue waste instruction slots. on a 16 bit version adding > some extra instructions would be nice so you end up with add with carry, > subtract with borrow etc so it can handle 32 operations efficiently. > > Overall its an efficient design but it could be improved as anything > else could. One could add combined instructions where a return is > combined with some regular instructions so it execute both in one clock. One of the things I have looked at briefly is breaking out the three "engines" so each one is separately controlled by fields in the instruction. This will require a larger instruction, but 16 bits should be enough. Then many types of instructions could be combined. I didn't pursue it because I didn't want to work on the assembler that would handle it. > Eventually I want to work with the J1, its a simpler 16 bit machine but > since the instruction word contains multiple fields it can have > instructions that do multiple operations at the same time naturally, > with a more complex compiler packing multiple instructions it might save > on program space. It does have very limited addressing space due to it's > instructions limited to 16 bits. I seem to recall discussing this with someone else not too long ago. I don't think there actually is much parallelism possible with the J1 design. You can combine a return instruction with arithmetic instructions, and there are fields to adjust the stack independently. So you might be able to combine say, 2DUP + in one instruction by using + with a DSTACK +1 instead of a -1. But the useful combos will be limited. The utility is also limited by the instruction frequency. 
Only 35% of the instructions in the app the J1 was designed for are ALU instructions which are the only ones that can be paralleled. In my CPU design I had separate "engines" for the data stack (ALU), the return stack (destination for literals, addresses for all operations and looping control) and the instruction fetch. If each of these had separate fields in the instruction word there might be more opportunity for parallelism. But as I said, I didn't pursue this because the software support would be messy. I'd like to get back to that, but it won't happen any time soon. -- Rick CArticle: 159195
Does anyone have the drivers for Windows 10 64 bit use of the STM32F1xx based USB Blaster clones? They are readily available, inexpensive and the STM32 Cortex CPU should have plenty of performance for these to work well. However, the firmware isn't fully Altera compatible so when connected via USB they crash the machine with a blue screen reboot. I just received mine and for the price, ordering a different one would be worth while but I would like to get this one working instead of waiting 1 to 2 weeks for another. And most vendors don't show which CPU theirs uses so I could end up with another that I can't use. Any and all pointers will be most appreciated! For reference, I'm running Quartus 16.0.2 in Windows 10 Enterprise, 64 bit. The clone's PC board has a row of 11 vias for a header which I suspect is used for programming it, if anyone has ideas about that as well. Best to you all -Article: 159196
On Tue, 30 Aug 2016 11:01:44 -0700, Jim Horn wrote: > Does anyone have the drivers for Windows 10 64 bit use of the STM32F1xx > based USB Blaster clones? They are readily available, inexpensive and > the STM32 Cortex CPU should have plenty of performance for these to work > well. However, the firmware isn't fully Altera compatible so when > connected via USB they crash the machine with a blue screen reboot. > > I just received mine and for the price, ordering a different one would > be worth while but I would like to get this one working instead of > waiting 1 to 2 weeks for another. And most vendors don't show which CPU > theirs uses so I could end up with another that I can't use. > > Any and all pointers will be most appreciated! > > For reference, I'm running Quartus 16.0.2 in Windows 10 Enterprise, 64 > bit. The clone's PC board has a row of 11 vias for a header which I > suspect is used for programming it, if anyone has ideas about that as > well. > > Best to you all - Not what you want to hear, but you could maybe get things working with a virtual machine (i.e. VirtualBox) running a 32-bit Windows version. You'll be able to do everything from the existing machine, just more clunkily. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!Article: 159197
I wonder if they ever gave it to him ? Would they do it now that these chips are no longer produced?Article: 159198
lolinka04@gmail.com wrote: > I wonder if they ever gave it to him ? > Would they do it now that these chips are no longer produced? > > If you're going to reply to an 18-year-old post, it would be nice to quote the thread for those who don't keep that many headers downloaded. In any case, AMD is long out of the SPLD business, but the 22V10 still lives on. Atmel is making them as the ATF22V10: http://www.atmel.com/Images/doc0735.pdf The data sheet mentions "flash" technology, so I assume the programming algorithm has changed since the EEPROM versions made by Lattice, TI, and AMD. Are you planning to build your own programmer? -- GaborArticle: 159199
On Thu, 1 Sep 2016 09:51:36 -0700 (PDT) lolinka04@gmail.com wrote: > I wonder if they ever gave it to him ? > Would they do it now that these chips are no longer produced? I did a project to program early Flash logic parts. This included a PC plugin board, borrowed equation compiler, and configuration stream generator. Why might this ancient history be of interest? Jan Coombs