The Question
It started at 3am with a textbook on computer architecture and a bad habit of asking inconvenient questions. The question was this: could a modern engineer — with modern tools, simulation software, and fabrication services — build a working CPU using only components that existed before integrated circuits?
Not as a thought experiment. As an actual build. Something you could probe with an oscilloscope, watch switch states on, hand to someone and say: this is what a computer is, at its most fundamental level.
The Discrete 8-Bit ALU is the first answer to that question. An Arithmetic Logic Unit built entirely from discrete MOSFETs — no microcontrollers, no FPGAs, no 74HC convenience ICs as a crutch. Every logic gate hand-designed, every schematic verified, every PCB laid out from scratch.
“The goal was never to do it the smart way. It was to do it the fundamental way — and understand every layer of the abstraction by building it by hand.”
— Trammell M.
What followed was six months of SPICE simulations, MOSFET threshold nightmares, PCB crosstalk debugging, and eventually: 1.24 million tests, 100% pass rate, and eight boards at the fab house.
MOSFET Gate Design
Before any schematic, every logic gate needed to exist as a transistor-level design. The tool of choice was Electric VLSI — an open-source chip design environment that lets you build circuits from the transistor up.
The fundamental gates (NAND, NOR, XOR, XNOR, buffer, inverter) were each designed as complementary CMOS structures: an N-channel pull-down network and a P-channel pull-up network, working in opposition. CMOS is why your phone doesn’t get hot just sitting idle: when a gate is in a stable state, almost no current flows through it beyond leakage.
Each gate design went through a validation checklist before being promoted to the schematic library: correct truth table, noise margin above 15%, switching time within simulation spec, and power consumption characterized at both 3.3V and 5V supply rails.
CMOS logic relies on the complementary relationship between N and P channel transistors. For a 2-input NAND: when both inputs are high, both N-channels conduct and both P-channels are off — output is pulled low. For any other input combination, at least one P-channel conducts — output is pulled high. No path from VDD to GND exists in a stable state, so static power draw is essentially zero.
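The NAND behavior described above can be sketched as a switch-level model. This is only an illustration, not the project's Electric VLSI design: it assumes the usual NAND topology (two P-channel devices in parallel to VDD, two N-channel devices in series to GND), and the function names are hypothetical.

```python
from itertools import product

def nand2_networks(a: int, b: int) -> tuple[bool, bool]:
    """Return (pull_up_conducts, pull_down_conducts) for a 2-input CMOS NAND."""
    # P-channel devices conduct when their gate is low; assumed in parallel to VDD.
    pull_up = (a == 0) or (b == 0)
    # N-channel devices conduct when their gate is high; assumed in series to GND.
    pull_down = (a == 1) and (b == 1)
    return pull_up, pull_down

def nand2(a: int, b: int) -> int:
    """Logic level at the output once the gate has settled."""
    pull_up, pull_down = nand2_networks(a, b)
    # Exactly one network may conduct: both would short VDD to GND,
    # neither would leave the output floating.
    assert pull_up != pull_down, "short circuit or floating output"
    return 1 if pull_up else 0

for a, b in product((0, 1), repeat=2):
    print(a, b, nand2(a, b))
```

The internal assertion is the "no path from VDD to GND" property: for every input combination, the pull-up and pull-down networks are never on at the same time.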
Schematic Capture
With a verified gate library established, the ALU schematic was assembled in KiCad 7. An 8-bit ALU consists of eight identical 1-bit slices, each capable of performing all eight operations, with carry signals chained between adjacent slices.
The eight operations — ADD, SUB, AND, OR, XOR, NOT, SHL, SHR — each have distinct gate-level implementations. Addition and subtraction use a full-adder/subtractor structure. Shifts are implemented as multiplexers routing adjacent bit outputs. Logic operations are the simplest: direct gate connections, no carry propagation needed.
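As a rough behavioral sketch of the slice structure described above, the following models the carry chain and the shift multiplexers one bit at a time. It is not taken from the KiCad schematic: the two's-complement subtraction (invert B, carry-in of 1), the zero fill on shifts, and all function names are assumptions made for illustration.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder from XOR/AND/OR gates; returns (sum, carry_out)."""
    p = a ^ b                      # propagate term
    s = p ^ cin                    # sum bit
    cout = (a & b) | (p & cin)     # carry generated or propagated
    return s, cout

def ripple_add_sub(a: int, b: int, subtract: bool = False) -> int:
    """8-bit add/subtract built from eight chained one-bit slices."""
    invert = 1 if subtract else 0
    carry = invert                 # assumed: A - B computed as A + ~B + 1
    result = 0
    for i in range(8):             # carry ripples from bit 0 up to bit 7
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ invert
        s, carry = full_adder(a_i, b_i, carry)
        result |= s << i
    return result                  # carry out of bit 7 is discarded

def shift(a: int, left: bool) -> int:
    """Each output bit selects a neighbouring input bit, as a 2:1 mux would."""
    result = 0
    for i in range(8):
        src = i - 1 if left else i + 1
        bit = (a >> src) & 1 if 0 <= src < 8 else 0   # zero fills the vacated end
        result |= bit << i
    return result

print(ripple_add_sub(200, 100), ripple_add_sub(5, 10, subtract=True))
print(shift(0b10000001, left=True), shift(0b10000001, left=False))
```

Logic operations need no such loop, since each output bit depends only on the two input bits at the same position.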
# Operation select lines: S2, S1, S0
# Maps 3-bit opcode to ALU function
ALU_OPS = {
    "ADD": 0b000,
    "SUB": 0b001,
    "AND": 0b010,
    "OR": 0b011,
    "XOR": 0b100,
    "NOT": 0b101,
    "SHL": 0b110,
    "SHR": 0b111,
}

def expected_result(op, a, b):
    # Reference result for 8-bit operands; match requires Python 3.10+.
    # NOT, SHL and SHR operate on a only; b is ignored.
    match op:
        case "ADD": return (a + b) & 0xFF
        case "SUB": return (a - b) & 0xFF
        case "AND": return a & b
        case "OR": return a | b
        case "XOR": return a ^ b
        case "NOT": return (~a) & 0xFF
        case "SHL": return (a << 1) & 0xFF
        case "SHR": return (a >> 1) & 0xFF
        case _: raise ValueError(f"unknown operation: {op}")

SPICE Simulation
Every gate design was verified in NGSpice before being promoted to the schematic. The simulation used realistic SPICE models for the 2N7000 (N-channel) and BS250 (P-channel) MOSFETs — the actual components in the BOM — including parasitic capacitance and temperature-dependent threshold voltage.
The critical measurement for each gate was propagation delay: the time from when the input crosses 50% of the supply voltage to when the output crosses 50%. Across the full ALU, the simulated worst-case path averaged roughly 42 nanoseconds (39.5 ns high-to-low, 44.2 ns low-to-high).
| Gate | tpHL (ns) | tpLH (ns) | Avg Delay (ns) | Power (µW) |
|---|---|---|---|---|
| Inverter | 2.1 | 2.4 | 2.25 | 0 |
| NAND (2-in) | 3.8 | 4.2 | 4.0 | 0 |
| NOR (2-in) | 4.1 | 3.9 | 4.0 | 0 |
| XOR | 8.3 | 8.7 | 8.5 | 0 |
| Full Adder (1-bit) | 14.2 | 15.1 | 14.65 | 0 |
| 8-bit ALU (worst case) | 39.5 | 44.2 | 41.85 | ~0 |
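The 50%-to-50% delay definition used for this table can be sketched in a few lines. This is a minimal illustration that assumes the waveforms are available as sampled (time, voltage) lists; the helper names and the synthetic waveform are hypothetical and are not NGSpice output.

```python
def crossing_time(times, volts, level, rising):
    """First time the sampled waveform crosses `level`, by linear interpolation."""
    for i in range(1, len(times)):
        v0, v1 = volts[i - 1], volts[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            t0, t1 = times[i - 1], times[i]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")

def propagation_delay(times, vin, vout, vdd, in_rising, out_rising):
    """Delay from the input's 50% crossing to the output's 50% crossing."""
    half = vdd / 2
    return (crossing_time(times, vout, half, out_rising)
            - crossing_time(times, vin, half, in_rising))

# Synthetic inverter-like waveforms (times in ns, volts in V); hypothetical data.
t    = [0, 1, 2, 3, 4, 5, 6]
vin  = [0, 0, 5, 5, 5, 5, 5]
vout = [5, 5, 5, 5, 2.5, 0, 0]
tphl = propagation_delay(t, vin, vout, vdd=5.0, in_rising=True, out_rising=False)
print(tphl)  # 2.5

# The "Avg Delay" column is the mean of the two edges, e.g. for the full ALU row:
avg_alu = (39.5 + 44.2) / 2
print(round(avg_alu, 2))  # 41.85
```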
Formal Verification
Simulation catches what you test for. Formal verification catches what you don’t think to test for. After SPICE validated the analog behavior of each gate, the full ALU netlist was exported to Verilog and run through SymbiYosys — an open-source formal verification framework.
Marcus built a custom KiCad-to-Verilog netlist converter in Python to bridge the gap. The formal verification suite ran 1,240,000 test cases covering the full 8-bit input space across all eight operations.
All 1,240,000 passed. The formal proof confirms that for any valid 8-bit input pair and any operation select code, the ALU produces the mathematically correct output. Provably correct across the entire input domain.
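For comparison, a brute-force sweep over every operand pair and opcode is easy to sketch. This is not the SymbiYosys flow and proves nothing formally; it only enumerates inputs against a reference model. The `sweep` function and the stand-in device under test are hypothetical. Two 8-bit operands and a 3-bit opcode give 256 × 256 × 8 = 524,288 combinations; how the 1,240,000 reported cases break down is not stated, so the count below is only that of the bare sweep.

```python
from itertools import product

# Reference behaviour for each operation, masked to 8 bits.
REFERENCE = {
    "ADD": lambda a, b: (a + b) & 0xFF,
    "SUB": lambda a, b: (a - b) & 0xFF,
    "AND": lambda a, b: a & b,
    "OR":  lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a, b: ~a & 0xFF,
    "SHL": lambda a, b: (a << 1) & 0xFF,
    "SHR": lambda a, b: a >> 1,
}

def sweep(dut):
    """Compare dut(op, a, b) with the reference for every input combination."""
    checked = 0
    failures = []
    for op, a, b in product(REFERENCE, range(256), range(256)):
        checked += 1
        if dut(op, a, b) != REFERENCE[op](a, b):
            failures.append((op, a, b))
    return checked, failures

# Stand-in device under test: the reference itself, so no failures are expected.
checked, failures = sweep(lambda op, a, b: REFERENCE[op](a, b))
print(checked, len(failures))  # 524288 0
```

In practice the `dut` callable would wrap a simulation of the exported netlist rather than the reference model.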
PCB Layout
Eight custom PCBs. Four-layer FR4 stackup. ENIG finish. All routed by hand in KiCad — no autorouter. The decision to use four layers was driven by signal integrity requirements: with hundreds of switching transistors, a solid internal ground plane is not optional.
The stackup from top to bottom: signal layer, ground plane, power plane, signal layer. Critical signal traces were impedance-matched to 50Ω. Decoupling capacitors placed within 1mm of every IC power pin. Ground stitching vias every 10mm.
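To give a feel for what a 50Ω target implies on an outer layer, here is a small sketch using the widely quoted IPC-2141 surface-microstrip approximation. Every number in it is an assumption for illustration (dielectric height, copper thickness, relative permittivity, trace width); none comes from the actual stackup, and a field solver or the fab house's calculator would be the real authority.

```python
import math

def microstrip_z0(w_mm: float, h_mm: float, t_mm: float, er: float) -> float:
    """Approximate microstrip impedance in ohms (IPC-2141 closed form).

    w_mm: trace width, h_mm: height above the reference plane,
    t_mm: copper thickness, er: relative permittivity of the dielectric.
    Commonly quoted as usable for roughly 0.1 < w/h < 2.0.
    """
    return 87.0 / math.sqrt(er + 1.41) * math.log(5.98 * h_mm / (0.8 * w_mm + t_mm))

# Assumed values, not the project's: 0.2 mm prepreg, 35 um copper, FR4 er of 4.3.
H_MM, T_MM, ER = 0.2, 0.035, 4.3

for width in (0.25, 0.30, 0.35, 0.40):
    print(f"{width:.2f} mm -> {microstrip_z0(width, H_MM, T_MM, ER):.1f} ohm")
```

With these assumed dimensions, a trace around 0.35 mm wide comes out a little under 50Ω, and narrower traces come out higher.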
What Went Wrong
It would be dishonest to present this as a smooth process. Three problems genuinely caused weeks of delay and required fundamental design changes.
1. MOSFET Threshold Spread
The 2N7000 MOSFETs from two different order batches had measurably different threshold voltages. Initial designs with tight switching margins failed at temperature extremes. The fix required widening noise margins by 15% and re-characterizing all gates at both 0°C and 70°C.
2. Carry Chain Crosstalk
The first PCB prototype showed signal crosstalk between adjacent carry chains — a consequence of running parallel signal traces too close on the inner copper layers. The errors were subtle: correct output for most input combinations, but wrong for specific edge cases involving carry propagation across multiple bits simultaneously.
3. Formal Verification Toolchain
Getting SymbiYosys to accept the gate-level netlist required building a custom export pipeline from scratch. Marcus wrote the KiCad-to-Verilog converter over a long weekend. It’s now a standalone open-source tool and arguably the most generally useful thing to come out of this project so far.
What’s Next
Boards are at the fab house. The BOM is complete and components are ordered. The next phase is hand-soldering 3,488 components across eight boards — a process that will take several weeks and will be documented here, step by step.
After assembly comes the most satisfying part: plugging in a power supply for the first time and watching a discrete transistor ALU produce correct output on an oscilloscope. If the simulation is right, it should work on first power-on. If it doesn’t, the debugging log will be even more interesting reading.
All design files — schematics, PCB layouts, SPICE netlists, the Verilog converter, the test harness — are on GitHub. Build updates will be posted here as they happen.