AI Submissions for Fri Jan 02 2026
TinyTinyTPU: 2×2 systolic-array TPU-style matrix-multiply unit deployed on FPGA
Submission URL | 122 points | by Xenograph | 50 comments
TinyTinyTPU: a bite-size, working TPU you can simulate and run on a Basys3 FPGA. It implements a full TPU-style pipeline around a 2×2 systolic array, making TPU internals tangible for learning and experimentation.
Highlights
- End-to-end design: 2×2 systolic array (4 PEs) plus accumulator, activation (ReLU), normalization, and quantization stages
- Works today on a low-cost Basys3 (Artix-7) board via a simple UART host interface and Python driver
- Multi-layer MLP inference with double-buffered activations; includes demos (e.g., a mouse-gesture classifier)
- Thorough test suite with cocotb + Verilator and optional waveforms; module and top-level coverage
- Open-source flow supported (Yosys + nextpnr) in addition to Xilinx Vivado
Architecture notes
- Systolic dataflow: activations move horizontally; partial sums vertically (see the sketch after this list)
- Diagonal wavefront weight loading to align systolic timing
- Weight FIFO → MMU → Accumulator → Activation → Normalization → Quantization pipeline
- UART protocol for commands/results; 115200 8N1
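To make the dataflow bullets concrete, here is a minimal cycle-by-cycle Python sketch of a 2×2 weight-stationary systolic array. It is an illustration written for this summary, not code from the repo: weights sit still in the PEs, activations enter from the left on a diagonal wavefront, and partial sums ripple downward out of each column.

```python
ROWS, COLS = 2, 2

def systolic_matmul(W, X):
    """Cycle-by-cycle model of X @ W on a ROWS x COLS weight-stationary array.
    W is ROWS x COLS (one weight pre-loaded per PE); each vector in X has ROWS entries."""
    n = len(X)
    a_reg = [[0] * COLS for _ in range(ROWS)]   # activation register inside each PE
    p_reg = [[0] * COLS for _ in range(ROWS)]   # partial-sum register inside each PE
    out = [[0] * COLS for _ in range(n)]

    for t in range(n + ROWS + COLS - 2):        # enough cycles to fill and drain the array
        new_a = [[0] * COLS for _ in range(ROWS)]
        new_p = [[0] * COLS for _ in range(ROWS)]
        for r in range(ROWS):
            for c in range(COLS):
                if c == 0:
                    v = t - r                   # diagonal wavefront: row r is skewed by r cycles
                    a_in = X[v][r] if 0 <= v < n else 0
                else:
                    a_in = a_reg[r][c - 1]      # activation arrives from the PE on the left
                p_in = 0 if r == 0 else p_reg[r - 1][c]   # partial sum arrives from above
                new_a[r][c] = a_in              # pass the activation to the right next cycle
                new_p[r][c] = p_in + a_in * W[r][c]       # MAC; pass the sum downward
        a_reg, p_reg = new_a, new_p
        for c in range(COLS):                   # results leave the bottom row, one column at a time
            v = t - (ROWS - 1) - c
            if 0 <= v < n:
                out[v][c] = p_reg[ROWS - 1][c]
    return out

W = [[1, 2], [3, 4]]
X = [[5, 6], [7, 8]]                            # two input vectors streamed back to back
print(systolic_matmul(W, X))                    # [[23, 34], [31, 46]] == X @ W
```

The input skew (v = t - r) is the "diagonal wavefront" idea: row r's activations lag row 0 by r cycles so that activations and partial sums meet in the right PE at the right time.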
Resource footprint on Basys3 (XC7A35T)
- ~1k LUTs (≈5%), ~1k FFs (≈3%), 8 DSP48E1 slices, 10–15 BRAMs, ~25k gates
- 100 MHz clock; reset on BTNC; RX/TX on B18/A18
Developer experience
- Sim: Verilator 5.x, cocotb, GTKWave/Surfer; make targets for unit and integration tests with waveforms
- FPGA: Vivado or Yosys/nextpnr build; Python host scripts for loading weights/activations and reading results (sketched after this list)
- Clear, modular repo with DEBUGGING_GUIDE and per-module tests (PE, MMU, accumulator, activation pipeline, UART, full system)
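For a sense of the host side, the sketch below shows what driving the board over 115200 8N1 UART could look like with pyserial. The device path, opcodes, and framing are invented placeholders; the repo's own Python driver defines the real protocol.

```python
# Hypothetical host-side sketch using pyserial. The opcode bytes and framing
# below are placeholders for illustration, not the TinyTinyTPU protocol.
import serial

# Basys3 UART: 115200 baud, 8 data bits, no parity, 1 stop bit (8N1)
port = serial.Serial("/dev/ttyUSB1", baudrate=115200,
                     bytesize=8, parity="N", stopbits=1, timeout=1.0)

def send_matrix(opcode, values):
    """Send a command byte followed by int8 payload values (framing assumed)."""
    payload = bytes([opcode]) + bytes(v & 0xFF for v in values)
    port.write(payload)

# Load a 2x2 weight tile, then a 2x2 activation tile, then read 4 result bytes.
send_matrix(0x01, [1, 2, 3, 4])     # placeholder "load weights" opcode
send_matrix(0x02, [5, 6, 7, 8])     # placeholder "load activations" opcode
port.write(bytes([0x03]))           # placeholder "run inference" opcode
result = port.read(4)               # quantized int8 outputs, one byte per element
print(list(result))
```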
Why it’s interesting
- A minimal yet complete TPU you can read, simulate, and tinker with—ideal for understanding systolic arrays, post-MAC pipelines, and hardware-software co-design
- Demonstrates how a TPU scales: this 2×2 version is educational; the same concepts underpin larger arrays like TPU v1’s 256×256
Try it
- Run all sims from sim/ with make test (WAVES=1 for traces)
- Flash to Basys3 and use the provided Python driver to push weights/activations and execute inference
- Optional gesture demo trains a 2-layer MLP and performs real-time classification on the FPGA (see the reference model below)
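To connect the "Try it" steps back to the pipeline description, here is a small NumPy reference model of the kind of computation the board performs for a 2-layer gesture MLP: int8 matmul, 32-bit accumulation, ReLU, and requantization back to int8. The power-of-two shift used for normalization/quantization is an assumption for illustration, not necessarily the repo's exact scheme.

```python
# Software reference for the FPGA pipeline: int8 matmul -> 32-bit accumulate ->
# ReLU -> shift-based requantization back to int8. The shift amounts are
# illustrative assumptions, not parameters taken from the repo.
import numpy as np

def quantized_layer(x_q, w_q, shift):
    """x_q: int8 activations (N, in); w_q: int8 weights (in, out)."""
    acc = x_q.astype(np.int32) @ w_q.astype(np.int32)   # systolic array + accumulator
    acc = np.maximum(acc, 0)                            # ReLU stage
    out = acc >> shift                                  # normalization/requantization (assumed power of two)
    return np.clip(out, -128, 127).astype(np.int8)      # back to int8 for the next layer

rng = np.random.default_rng(0)
x  = rng.integers(-128, 128, size=(1, 2), dtype=np.int8)   # one 2-element input
w1 = rng.integers(-128, 128, size=(2, 2), dtype=np.int8)
w2 = rng.integers(-128, 128, size=(2, 2), dtype=np.int8)

h = quantized_layer(x, w1, shift=7)       # hidden layer (double-buffered on the FPGA)
y = quantized_layer(h, w2, shift=7)       # output layer; argmax gives the predicted class
print(h, y, int(np.argmax(y)))
```

On the FPGA, the matmul line corresponds to the MMU and accumulator, the ReLU and shift to the activation and normalization/quantization stages, and the two calls map onto the double-buffered layers.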
While the submission focuses on an educational TPU implementation, the discussion broadens into a debate on the future of AI hardware, specifically comparing FPGAs, GPUs, and ASICs in the context of large-scale inference.
The Evolution of AI Hardware
- The Crypto Analogy: User mrntrwb likens the trajectory of AI inference to Bitcoin mining: moving from CPUs to GPUs, briefly to FPGAs, and finally to ASICs. They predict that GPU-based inference will soon become obsolete due to inefficiency compared to purpose-built chips (like Google's TPU or Groq).
- The Counter-Argument: Others, including fblstr and ssvrk, argue that modern Data Center GPUs are already effectively ASICs given the amount of die area dedicated to fixed-function matrix multiplication (Tensor Cores) rather than graphics. NitpickLawyer notes that high-end accelerators are much closer to ASICs than traditional video cards.
FPGAs vs. GPUs for Inference
- Performance Claims: A heated debate emerged regarding whether FPGAs can compete with top-tier GPUs (H200/B200). User dntcs claims to have worked on FPGA systems that outperform H200s on Llama3-class models, largely by bypassing memory bottlenecks.
- Skepticism: fblstr challenges this, noting that while memory bandwidth is the bottleneck, the sheer compute density (PetaOPS) of chips like the Blackwell B200 is difficult for general-purpose FPGA fabric to match.
- Bandwidth is King: Multiple users (tcnk, bee_rider) agree that the real constraint for inference is memory fabric and bandwidth. tcnk highlights modern platforms like the Alveo V80 with PCIe 5.0 and 200G NICs as the current state-of-the-art for programmable in-network compute.
Market Dynamics
- Hyperscaler Custom Silicon: The discussion notes that major tech companies (Google, Amazon, Meta, Microsoft) effectively already use custom silicon (TPUs, Inferentia, Maia) for their internal workloads, reducing reliance on Nvidia for inference.
- Edge Hardware: Narew and mffklst briefly discuss older "stick"-format edge accelerators (Google Coral, Intel's compute sticks), noting they are now dated and struggle to compete with low-power GPU/SoC options like Nvidia's Jetson.
Other Technical Notes
- 0-_-0 and hnkly drew parallels between neural networks and CPU branch predictors, discussing the potential for AI to handle heuristic tasks (like speculative execution) to skip expensive deterministic computations.
- zhm clarified that while TPUs are often associated with Transformers, architectures like Google's Ironwood TPU were designed for efficient large-scale LLM workloads in general, whereas other chips (like Etched's Sohu) are true Transformer-specific ASICs.
AB316: No AI Scapegoating Allowed
Submission URL | 36 points | by forthwall | 19 comments
California’s AB316, as described by the poster, adds Civil Code 1714.46 and bars “the AI did it” as a liability defense: if an AI system causes harm, developers or users can’t claim autonomy as a shield. The law broadly defines AI as systems that infer from inputs to generate outputs affecting physical or virtual environments.
The author (not a lawyer) thinks this is reasonable but vague, and raises thorny questions about who’s on the hook when things go wrong:
- Where does liability sit between model makers (e.g., OpenAI), app builders, and deployers?
- How does this play with open-source models used in critical contexts (e.g., an aircraft system)?
- Will claims hinge on marketing representations or integration choices?
Expected knock-on effects: more investment in guardrails and safety layers, tighter operational controls, stronger contracts and indemnities, and a budding market for AI liability insurance. The takeaway: unpredictability won’t excuse harm; if your system can cause damage—like a chatbot giving dangerous advice—you’re responsible for preventing it.
Discussion Summary:
Commenters grappled with the boundaries of liability, using analogies ranging from food safety to science fiction to explore whether unpredictability should absolve developers of blame.
- The "Eggshell Skull" Doctrine: The discussion opened with a grim hypothetical: if a chatbot encourages a user to commit suicide, is the developer liable? While some users felt a bot shouldn't be held to the same standard as a human, others cited the "eggshell skull" legal rule. This doctrine suggests a defendant is liable for the resulting harm even if the victim had a pre-existing vulnerability (like suicidal ideation), implying developers cannot use a user's mental state to shield themselves from the consequences of a bot's "persuasive" errors.
- The Zoo Analogy: One user reframed the AB316 logic using a zoo comparison. The law essentially states that "the AI is a wild animal" is not a valid defense. Just as a zoo is responsible for containment regardless of a tiger's natural instincts, an AI deployer is responsible for the system's output, regardless of its inherent unpredictability.
- Product Liability Parallels: Participants drew comparisons to the Jack in the Box E. coli outbreaks and faulty car parts. The consensus leaned toward treating AI as a commercial product: if a company sells "sausages made from unsanitary sources" (or a model trained on toxic data), they face strict liability for the outcomes.
- Redundancy vs. Clarity: A debate emerged over whether this law is redundant, given that product liability laws already exist. However, proponents argued the legislation is necessary to specifically close the "autonomy loophole," preventing defendants from claiming a system's "black box" nature puts its actions outside their legal control.
- The "Catbox" Sophistry: In a philosophical turn, a user cited the "Schrödinger's Catbox" from the novel Endymion—a device where a death is triggered by random radioactive decay, purportedly absolving the user of murder. The commenter argued that corporate reliance on AI stochasticity is a similar moral sophistry, attempting to use randomness to dilute ethical responsibility.
Everyone's Watching Stocks. The Real Bubble Is AI Debt
Submission URL | 48 points | by zerosizedweasle | 27 comments
Howard Marks flags rising leverage behind the AI boom as a late‑cycle warning sign
- The Oaktree co-founder says the AI trade has shifted from being funded by Big Tech’s cash piles to being increasingly financed with debt, a change he finds worrisome.
- He argues the AI rally looks further along than earlier in the year, with growing leverage a classic sign of maturing (and potentially bubbly) markets.
- Why it matters: Debt magnifies both gains and losses. If AI-driven revenues don’t arrive fast enough to cover swelling capex and financing costs, the pain could spread from equity to credit markets.
- What to watch: Hyperscalers’ capex and borrowing trends, off-balance-sheet commitments (long-term purchase and leasing deals), and credit spreads tied to the AI supply chain and data-center buildout.
- Context: Marks’ latest memo (“Is It a Bubble?”) doesn’t call a top outright but underscores that the risk profile of the AI trade has changed as leverage enters the picture.
Discussion Summary:
- Investment Strategy Amidst "Doom" Signals: The thread opened with users questioning where to allocate capital given the economic warnings (bubbles, debt, and inflation). Responses ranged from adhering to standard long-term strategies (such as Vanguard's 80/20 split) to fleeing to safety. While some advocated holding cash to ride out a potential market crash of 15%+, others argued that cash is a poor hedge during inflationary periods driven by potential government "money printing." There was also a brief, contentious suggestion to pivot toward specific foreign indices (like Spain) or gold as safety plays.
- Historical Parallels and Timing: Highlighting the difficulty of acting on macro warnings, one commenter pointed to the dot-com era: Alan Greenspan famously warned of "irrational exuberance" in 1996, yet the bubble did not burst for several more years. The consensus suggested that while valuations may be unsupported, timing the exact top remains notoriously difficult.
- Validating the Leverage Shift: Supporting the article's core thesis, a user shared their own analysis of Big Tech balance sheets (specifically Meta, Microsoft, and Amazon). They noted a distinct shift starting around the release of ChatGPT in late 2022: these previously cash-rich FAANG companies have significantly increased their debt loads to finance the AI buildout, a fundamental change in risk profile that led the user to exit a 10-year position in the sector.