31 Jan 2017
Problems bringing the Kestrel-3 up on the Nexys-2 board force me to try bringing it up on a new FPGA board instead, the icoBoard Gamma, based around the iCE40HX8K FPGA. However, the limitations of this FPGA seriously constrain the design of the computer, as the CPU just barely fits as it is. I’ve decided to brutally murder my darlings, shed all the unnecessary I/O features that basically defined the Kestrel-3 as a home computer, and focus instead on pure compute and aggregate I/O capability. Off-loading non-essential I/O to intelligent peripherals, via I/O channels, brings the design of the Kestrel-3 closer to that of an older IBM mainframe, a la System/360 or System/370.
Problems bringing the Kestrel-3 up on the Nexys-2 board force me to try bringing it up on a new FPGA board instead, the icoBoard Gamma, based around the iCE40HX8K FPGA. The Lattice FPGA contains a little over 7100 look-up tables (LUTs). The current Kestrel-3 design, targeting a Xilinx Spartan-3E device, consumes a little over 5500 LUTs. This suggests to me that, all other things being equal, the current Kestrel-3 design should synthesize to approximately the same number of LUTs in the Lattice part. This leaves somewhere around 1500 LUTs for other tasks.
However, as I cogitate over the Kestrel’s icoBoard implementation, the realities of the 8K’s LUT limitations truly start to sink in. First, not everything is equal. Some of those LUTs will be used as wires, since the iCE40 architecture doesn’t have the same routing abilities as the Spartan-3. It has fewer block RAMs, which all but mandates running from external RAM. Running internally on “debug RAM” is simply not an option with the icoBoard implementation. Upgrades to the CPU microarchitecture will consume LUTs as well (it’s impossible for me to predict how many, though). Pipelining will require new control logic to coordinate the different stages, while privilege levels, TLBs, and MMU page table walkers will likely consume a fair chunk of that 1500 LUT slop. I have no idea if a cache controller will even fit, so it’s best to not count on it for now. Finally, overall performance of the computer will almost certainly not meet my home-computer targets. The iCE40 chip is fabricated on a different process from the Spartan-3 chip, so it’s likely the CPU will need to be clocked slower than 25MHz when fully synthesized, at least until pipelining is put into place.
Obviously, these things collectively do not sit well with me. I want reasonable performance (eventually), interactivity, and an ability to upgrade pieces without too much hard labor. This calls for a design which is extremely modular. As indicated in my previous blog update, I’ve decided to tackle a design not too dissimilar from early, 1960s-1970s-era mainframes. Thus, the icoBoard Gamma will have sufficient resources synthesized to allow it to compute, but not much else.
This gives several benefits:
It also gives some disadvantages:
You might think, “You should use RS-232 for your interconnect,” or perhaps, “SPI would be perfect for this.” I used to think this was the case too. It’s not. I’ve decided to resurrect a dead technology instead: IEEE-1355. This is the standard underlying SpaceWire, currently a niche protocol used pretty much only in aerospace. Why would I use this seemingly out-dated technology?
Honestly, I’ve no idea why this technology died. It is far, far, far from out-dated. As one article puts it, it offers “ATM speed at RS-232 cost”. My particular reasons for adopting this standard are:
Despite its advantages, I will need to change some aspects of IEEE-1355 to better suit my needs. IEEE-1355 is specified with the assumption that both sides of a connection use a dedicated controller, which can maintain the link state in real time to a 2-microsecond resolution. That is, it can detect link errors and reset within two microseconds. Similarly, if you pull a cable, it can detect the link having gone dead within that time as well. This is infeasible with microcontrollers, especially the slower and cheaper ones. For this reason, I need to relax this constraint. The link reset protocol will likely remain the same, but specific timings will be relaxed to allow, say, a 10ms resolution. This is still fast enough that a human perceives the response as instantaneous, but should be slow enough to support bit-banged implementations in slower microcontrollers.
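To make the relaxed timing concrete, here is a minimal sketch of a link watchdog with a configurable dead-link timeout. It is only an illustration: the `LinkWatchdog` class and its method names are hypothetical, it is not taken from the IEEE-1355 specification, and the 10ms figure is just the ballpark mentioned above rather than a settled constant.

```python
from dataclasses import dataclass

# The 2 microsecond figure comes from the text above; 10 ms is the relaxed
# value floated there. Neither is a final protocol constant.
STRICT_TIMEOUT_S = 2e-6
RELAXED_TIMEOUT_S = 10e-3


@dataclass
class LinkWatchdog:
    """Hypothetical helper that tracks when the link last showed activity."""
    timeout_s: float
    last_activity_s: float = 0.0
    connected: bool = False

    def on_activity(self, now_s: float) -> None:
        # Any received token counts as proof that the far end is alive.
        self.last_activity_s = now_s
        self.connected = True

    def poll(self, now_s: float) -> bool:
        # Declare the link dead if nothing arrived within the timeout window.
        if self.connected and now_s - self.last_activity_s > self.timeout_s:
            self.connected = False
        return self.connected


# A bit-banged controller that only gets around to polling once per
# millisecond survives under the relaxed timeout but not the strict one.
relaxed = LinkWatchdog(RELAXED_TIMEOUT_S)
relaxed.on_activity(0.0)
strict = LinkWatchdog(STRICT_TIMEOUT_S)
strict.on_activity(0.0)
```

With a 1ms polling interval, `relaxed.poll(0.001)` still reports the link as up, while `strict.poll(0.001)` has already declared it dead. That is the practical reason a slow microcontroller cannot honor the original timing.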
To use a mainframe, you’ll need a terminal. I have two terminal ideas floating around in my head.
I first plan to rig an ESP8266-based microcontroller to serve as a bridge between IEEE-1355 and RS-232. This would allow the mainframe to be operated from a host PC, Raspberry Pi, or similar device. It would be a dumb peripheral, in the sense that it would perform no interpretation of the data sent by either the Kestrel-3 or the user. This represents the simplest possible I/O controller, and thus the easiest to get working. In essence, it’s a $3 replacement for a $15 USB/RS232 cable. Don’t worry; the economics of this baffle me too, but it seems to work.
I then happened upon the idea of using a Kestrel-2 as an 80x25 monochrome terminal with some bitmapped graphics abilities. This would require a more sophisticated protocol on the Kestrel-3/Terminal Controller side of the connection, since we are no longer dealing with just a dumb terminal. It is now closer in scope to a VNC client. As long as the protocol is compatible, anyone can build a microcontroller-based terminal around a Gameduino as well. Software running on the Kestrel-3 should be none the wiser, resolution and color depth-sensitive code notwithstanding.
I’ve also toyed with the idea of using a line-mode and block-mode terminal (a la 3270 terminals) as well. These user interfaces fascinate me, and I think they’re quite under-appreciated. Commodore used its line-oriented interface to great effect, particularly with its machine-language monitors responding like magic to screen-editing operations. Expect me to spend some time playing with these in the future.
A mainframe is worthless without storage. I continue to plan on using SD storage. However, instead of the FPGA driving the SD card directly, as I’ve done with the Kestrel-2 before, I now plan on off-loading SD card management to a microcontroller.
The reason for this is simple: abstract away the differences between the SD, SDSC, SDHC, SDXC, SD/UHS, etc. protocols. And that’s not even touching the MMC-derived protocols.
What I don’t yet know is whether or not I intend on off-loading filesystem operations to the microcontroller. It’s awfully tempting, especially since I have several existing models I could follow:
Embrace and extend Commodore DOS for use with the Kestrel. Start out with something on par with a Commodore 1541 or 8050 disk controller DOS, and later fork it to support subdirectories, partitions, filenames longer than 16 characters, etc. This has the benefit that it’s a relatively simple and well-defined protocol for storage devices which also supports direct access to the underlying storage media, allowing new filesystems to be implemented. Risks include a poor mapping from IEEE-488 semantics to IEEE-1355 semantics, and a potentially poor mapping of tracks and sectors to logical blocks. Maximum volume size is 16MB, due to limitations of 256 tracks, 256 sectors, and 256 bytes per sector.
Subset Commodore DOS. Focus only on the direct access aspects, leaving the filesystem aspects for the Kestrel-3 to handle. Same risks as above. In effect, GEOS for Commodore 64 and 128 uses this approach, particularly with VLIR files.
Support 9P. This is more modern than Commodore DOS, in that it natively supports subdirectories, supports multi-user environments, etc. However, it does not support direct access to the storage media, so you’re stuck using whatever filesystem the 9P driver on the controller implements. 9P was designed as a message-oriented protocol for use over network connections, and thus is a better fit than Commodore DOS for IEEE-1355 packet switching. It is also markedly more complicated. Another risk with 9P is how to handle removable media. Consider that, at any time, the user could pull an SD card out of its slot, or accidentally break the SPI connection with the SD slot. How does 9P handle this?
Support IBM mainframe-style raw read, raw write, and seek cylinder commands. This exposes the true nature of disk drives to the programmer. However, disk drives are a dying breed these days. We’d have to synthesize the concept of cylinders when using SD media, for example.
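The addressing questions raised by these options can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, numbering is zero-based throughout (real Commodore drives number tracks from 1), and the layout is a flat one with a fixed number of sectors per track. The synthetic cylinder geometry (`heads`, `sectors_per_track`) is likewise made up for the example.

```python
BYTES_PER_SECTOR = 256   # from the text: 256 bytes per sector
SECTORS_PER_TRACK = 256  # from the text: 256 sectors per track
TRACKS = 256             # from the text: 256 tracks


def max_volume_bytes() -> int:
    """Largest volume addressable with one-byte track and sector numbers."""
    return TRACKS * SECTORS_PER_TRACK * BYTES_PER_SECTOR


def track_sector_to_offset(track: int, sector: int) -> int:
    """Byte offset of a (track, sector) pair on a flat block device."""
    if not (0 <= track < TRACKS and 0 <= sector < SECTORS_PER_TRACK):
        raise ValueError("track or sector out of range")
    return (track * SECTORS_PER_TRACK + sector) * BYTES_PER_SECTOR


def offset_to_track_sector(offset: int) -> tuple[int, int]:
    """Inverse mapping: which (track, sector) holds this byte offset."""
    block = offset // BYTES_PER_SECTOR
    return divmod(block, SECTORS_PER_TRACK)


def lba_to_chs(lba: int, heads: int, sectors_per_track: int) -> tuple[int, int, int]:
    """Synthesize (cylinder, head, sector) for a logical block on SD media.

    The geometry is fictional; SD cards have no cylinders, so any
    consistent choice of heads and sectors_per_track will do.
    """
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector = divmod(remainder, sectors_per_track)
    return cylinder, head, sector
```

Here `max_volume_bytes()` works out to 16,777,216 bytes, which is the 16MB ceiling noted for the Commodore DOS option. The same divide-and-remainder trick is all it takes to fake cylinders on top of logical blocks for the mainframe-style option.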
Right now, it seems like 9P is the best option to go with for general purpose storage. I just wish I could find an implementation I can actually understand. I might start out with something simple, like a subset of Commodore DOS, and switch to 9P later on once I gain more experience.
All the problems I’ve been having with the Kestrel-3 of late have gotten me down, but in retrospect, it might be the best thing to happen to the computer design. This level of modularity might actually make the computer more appealing to a wider free/open source hardware community. Here’s hoping things come along nicely.
Samuel A. Falvo II
Twitter: @SamuelAFalvoII
Google+: +Samuel A. Falvo II
Software engineer by day. Amateur computer engineer by night. Founded the Kestrel Computer Project as a proof-of-concept back in 2007, with the Kestrel-1 computer built around the 65816 CPU. Since then, he's evolved the design to use a simple stack-architecture CPU with the Kestrel-2, and is now in the process of refining the design once more with a 64-bit RISC-V compatible engine in the Kestrel-3.
Samuel is or was:
Samuel seeks inspirations in many things, but is particularly moved by those things which moved or enabled him as a child. These include all things Commodore, Amiga, Atari, and all those old Radio-Electronics magazines he used to read as a kid.
Today, he lives in the San Francisco Bay Area with his beautiful wife, Steph, and four cats: 13, 6.5, Tabitha, and Panther.