AI Submissions for Wed Feb 18 2026
What years of production-grade concurrency teach us about building AI agents
Submission URL | 115 points | by ellieh | 36 comments
Title: Your Agent Framework Is Just a Bad Clone of Elixir: Concurrency Lessons from Telecom to AI
Author: George Guimarães
Summary: Guimarães argues that today’s Python/JS agent frameworks are reinventing Erlang/Elixir’s 40-year-old actor model—the same principles the BEAM VM was built on to run telecom switches. AI agents aren’t “web requests”; they’re long-lived, stateful, concurrent sessions that demand lightweight isolation, message passing, supervision, and fault tolerance. Those properties are native to the BEAM and only partially approximated by Node.js and Python frameworks. If you’re building AI agents at scale, Elixir isn’t a hipster pick; it’s the architecture the problem calls for.
Key points:
- The 30-second request problem: Agent sessions routinely hold open connections for 5–30s with multiple LLM calls, tools, and streaming—multiplied by 10,000+ users. Thread-per-request stacks struggle here.
- Why BEAM fits:
- Millions of lightweight processes (~2 KB each), each with its own heap, GC, and fault isolation.
- Preemptive scheduling (every ~4,000 reductions) prevents any single agent from hogging CPU.
- Per-process garbage collection avoids global pauses at high concurrency.
- Native distribution: processes talk across nodes transparently.
- Phoenix Channels/LiveView already handle 100k+ WebSockets per server; an agent chat is just another long-lived connection.
- Node.js comparison:
- Single-threaded event loop makes CPU-heavy work block unrelated sessions unless offloaded.
- Stop-the-world GC and process-wide crashes hurt tail latency and reliability.
- Python/JS agent frameworks are converging on actors:
- Langroid explicitly borrows the actor model.
- LangGraph models agents as state machines with reducers and conditional edges.
- CrewAI coordinates agents via shared memory and task passing.
- AutoGen 0.4 pivots to an “event-driven actor framework” with async message passing and managed lifecycles.
- Conclusion: they’re rediscovering what the BEAM has provided since 1986.
- LLMs + Elixir: José Valim highlighted a Tencent study where Claude Opus 4 achieved the highest code-completion rate on Elixir problems (80.3%), but the deeper point is runtime fit, not just codegen ergonomics.
Why it matters:
- Agentic workloads look like telecom, not classic web requests. Architectures tuned for short, stateless requests buckle under thousands of long-lived, stateful streams. BEAM’s actor model maps directly onto agent process-per-session designs with supervision and graceful failure; a minimal sketch of that per-session pattern follows.
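To make the convergence concrete, here is a minimal TypeScript sketch of the per-session actor pattern that the Python/JS frameworks above approximate: one mailbox and one private state per session, processed sequentially. The message shapes and the callLLM helper are illustrative assumptions, not any particular framework's API.

```typescript
// A per-session "actor": private state, a mailbox, sequential message handling.
// callLLM is a hypothetical stand-in for whatever LLM client the application uses.
declare function callLLM(history: string[]): Promise<string>;

type AgentMessage =
  | { kind: "user_input"; text: string }
  | { kind: "tool_result"; tool: string; output: string };

class AgentSession {
  private mailbox: AgentMessage[] = [];
  private history: string[] = []; // state isolated to this session
  private draining = false;

  send(msg: AgentMessage): void {
    this.mailbox.push(msg);
    if (!this.draining) void this.drain();
  }

  private async drain(): Promise<void> {
    this.draining = true;
    while (this.mailbox.length > 0) {
      const msg = this.mailbox.shift()!;
      try {
        await this.handle(msg);
      } catch (err) {
        // Poor man's supervision: log and keep the session alive.
        // A BEAM supervisor would restart the process with a clean state instead.
        console.error("agent step failed", err);
      }
    }
    this.draining = false;
  }

  private async handle(msg: AgentMessage): Promise<void> {
    if (msg.kind === "user_input") {
      this.history.push(`user: ${msg.text}`);
      const reply = await callLLM(this.history);
      this.history.push(`assistant: ${reply}`);
    } else {
      this.history.push(`tool ${msg.tool}: ${msg.output}`);
    }
  }
}
```

On the BEAM each session would be an actual process with its own heap, scheduler slot, and supervisor; in a JS runtime the isolation above is only by convention, which is the article's point.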
Practical takeaways for builders:
- Model each agent/session as an independent BEAM process; use supervision trees for fault recovery.
- Stream tokens over Phoenix Channels (client-side sketch after this list); scale horizontally with native distribution.
- Keep heavy CPU/ML off the schedulers (use ports, separate services, or NIFs with dirty schedulers).
- Use per-process state for isolation; leverage ETS when shared, fast in-memory tables are needed.
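On the consuming side, streaming tokens over a Phoenix Channel looks roughly like the TypeScript sketch below, using the phoenix npm client. The topic scheme, event names, and helper functions are assumptions for illustration; only the Socket/Channel API itself comes from Phoenix.

```typescript
import { Socket } from "phoenix";

// App-specific values and render hooks (assumed, not part of Phoenix):
declare const sessionToken: string;
declare const sessionId: string;
declare function renderToken(text: string): void;
declare function markResponseComplete(): void;

const socket = new Socket("/socket", { params: { token: sessionToken } });
socket.connect();

// One channel per agent session; "agent:<id>" is an assumed topic naming scheme.
const channel = socket.channel(`agent:${sessionId}`, {});

channel
  .join()
  .receive("ok", () => console.log("joined agent session"))
  .receive("error", (reason) => console.error("join failed", reason));

// "token" and "done" are hypothetical events the server-side agent would broadcast.
channel.on("token", (payload: { text: string }) => renderToken(payload.text));
channel.on("done", () => markResponseComplete());

// Push the user's message to the agent process that owns this session.
channel.push("user_message", { text: "Summarize the last tool run" });
```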
Caveats:
- Python still dominates ML tooling; you’ll often pair Elixir orchestration with Python/Rust for heavy inference.
- NIFs can stall schedulers if misused; prefer ports or dirty schedulers for safety.
- Team familiarity and hosting/tooling may influence stack choice despite the runtime fit.
Bottom line: AI agent frameworks in Python/JS are converging on the actor model because the problem demands it. If you want production-grade concurrency, fault isolation, and effortless real-time at scale, the BEAM/Elixir stack is the battle-tested blueprint rather than a pattern to reimplement piecemeal.
Here is a summary of the discussion:
Runtime Fit vs. Real-World Bottlenecks
A major thread of debate centered on whether BEAM’s concurrency advantages matter when AI workloads are heavily IO-bound. Users like rndmtst and mccyb argued that since agents spend 95% of their time waiting on external APIs (OpenAI/Anthropic), the scheduler's efficiency is less critical than it was for telecom switches. While they admitted hot code swapping is a genuine advantage for updating logic without dropping active sessions, they questioned if runtime benefits outweigh the massive ecosystem and hiring advantages of Python.
"Let It Crash" vs. Context Preservation Commenters wrestled with applying Erlang’s "let it crash" philosophy to LLM context. qdrpl and vns pointed out that restarting a process effectively wipes the in-memory conversation history—a critical failure in AI sessions. asa400 clarified that supervisors are intended for unknown/transient execution errors, not semantic logic failures; however, mckrss noted that BEAM’s fault tolerance doesn't solve "durable execution" (sustaining state across deployments or node restarts), often requiring hybrid architectures with standard databases anyway.
Concurrency Constructs
znnjdl shared an anecdote about switching from a complex Kubernetes setup to Elixir for long-running browser agents, noting that distributed problems that caused infrastructure "hell" elsewhere were solved by native language constructs. There was significant technical dispute between wqtwt, kbwn, and others over whether modern Linux OS threads are sufficient for these workloads versus the BEAM’s lightweight ~2 KB processes.
Frameworks vs. Primitives
The discussion compared building on Elixir primitives (OTP) versus Python frameworks. vns described tools like LangChain as "bloated" attempts to provide structure that Elixir offers natively, though d4rkp4ttern defended newer Python frameworks like Langroid. Finally, jsvlm (José Valim, creator of Elixir) chimed in to correct the historical record, noting that the creators of Erlang implemented the actor model independently to solve practical problems, rather than adopting it from academic theory.
AI adoption and Solow's productivity paradox
Submission URL | 780 points | by virgildotcodes | 734 comments
Headline: CEOs say AI hasn’t moved the needle—economists dust off the Solow paradox
- A new NBER survey of ~6,000 executives across the U.S., U.K., Germany, and Australia finds nearly 90% report no AI impact on employment or productivity over the past three years. About two-thirds say they use AI—but only ~1.5 hours per week on average—and a quarter don’t use it at all.
- Despite the muted present, leaders still expect near-term gains: +1.4% productivity and +0.8% output over the next three years. Firms forecast a small employment drop (-0.7%), while workers expect a slight rise (+0.5%).
- The disconnect revives Solow’s productivity paradox: technology is everywhere except in the macro data. Apollo’s Torsten Slok says AI isn’t yet visible in employment, productivity, inflation, or most profit margins outside the “Magnificent Seven.”
- Evidence is mixed: the St. Louis Fed sees a 1.9% excess cumulative productivity bump since late 2022; an MIT study projects a more modest 0.5% over a decade. Separately, ManpowerGroup reports AI use up 13% in 2025 but confidence down 18%. IBM says it’s boosting junior hiring to avoid hollowing out its management pipeline.
- Optimists see a turn: Erik Brynjolfsson points to stronger GDP and estimates U.S. productivity rose 2.7% last year, suggesting benefits may finally follow 2024’s >$250B corporate AI spend.
Why it matters: Echoes of the 1980s IT cycle—big investment first, measurable gains later. Light-touch adoption and workflow inertia may be masking what only shows up after reorganization, tooling maturity, and broader diffusion.
Here is a summary of the discussion:
The Solow Paradox & Historical Parallels
Commenters engaged deeply with the article's comparison to the 1970s/80s productivity paradox. While some agree that we are in the "DOS era" of AI, where expensive investment precedes the "Windows 95" era of utility, others argue the comparison is flawed. One user notes a key economic difference: modern AI (e.g., a $20 Claude subscription) has a significantly lower barrier to entry and onboarding cost than the mainframe computing and manual office training required in the 1970s.
The "Infinite Report" Loop A major thread of cynicism focuses on the nature of corporate work. Users argue that while AI might make producing reports "3x faster," it often degrades the signal-to-noise ratio.
- The Fluff Tax: Critics point out that faster writing shifts the burden to the reader; if a report takes 10% longer to understand because of AI "fluff," overall organizational value is lost.
- The AI Ouroboros: Several users joked (or lamented) that the inevitable solution is people using AI to summarize the very reports that colleagues used AI to generate, resulting in a hollow loop of information transfer.
Skill Acquisition vs. "Licking the Window"
There is significant skepticism regarding using LLMs for learning and skill development.
- False Confidence: Users warn that AI gives a "false sense of security" regarding understanding material. One commenter vividly described it as "looking through the window" at knowledge rather than grasping it, advocating for the traditional "RTFM" (Read The F*ing Manual) approach for true expertise.
- Code vs. Prose: While confidence in AI for general communication is low, some developers defend the utility of current models for coding, noting that recent improvements in context handling allow models to effectively read codebases and implement solutions, unlike the "hallucinations" common in semantic text tasks.
Technical Bottlenecks
The discussion touched on the limits of current architectures. Some predict that simply scaling context windows or bolting on retrieval (RAG) yields diminishing returns or slows down processing. One prediction suggests that the real productivity breakthrough won't come from larger LLMs, but from hybrid models that pair LLMs with logic-based systems to eliminate hallucinations and perform actual reasoning rather than probabilistic token generation.
Microsoft says bug causes Copilot to summarize confidential emails
Submission URL | 261 points | by tablets | 71 comments
Microsoft says a bug in Microsoft 365 Copilot Chat has been summarizing emails marked confidential, effectively bypassing data loss prevention policies. Tracked as CW1226324 and first detected January 21, the issue hit the Copilot “work tab” chat, which pulled content from users’ Sent Items and Drafts in Outlook desktop—even when sensitivity labels should have blocked automated access. Microsoft attributes it to a code/configuration error and began rolling out a fix in early February; a worldwide configuration update for enterprise customers is now deployed, and the company is monitoring and validating with affected users. Microsoft stresses no one gained access to information they weren’t already authorized to see, but admits the behavior violated Copilot’s design to exclude protected content. The company hasn’t disclosed the scope or a final remediation timeline; the incident is flagged as an advisory, suggesting limited impact. Why it matters: it’s a trust hit for AI guardrails in enterprise email—showing how label- and DLP-based protections can be undermined by new AI features even without a classic data breach.
Based on the discussion, commenters focused on the fundamental conflict between rapid AI integration and enterprise security requirements. Several users criticized Microsoft's approach as "sprinkling AI" onto existing tech stacks without rethinking the underlying security architecture, noting that standard protections (like prompt injection defenses) are insufficient against "unknown unknowns." A self-identified AI researcher argued that engineering is currently outpacing theoretical understanding, leading to "minimum viable product" safeguards that cannot guarantee data safety or effectively "unlearn" information once processed.
Key themes in the thread included:
- Failure of Guardrails: Participants noted that Data Loss Prevention (DLP) tools are pointless if the AI layer can bypass them, effectively rendering manual classification (like employee NDAs or "Confidential" labels) moot.
- The OS Debate: The incident sparked a recurring debate about leaving the Microsoft ecosystem for Linux or macOS due to "user-hostile" feature bloat, though counter-arguments pointed out that switching operating systems does not mitigate cloud-service vulnerabilities.
- Terminology: There was significant skepticism regarding Microsoft’s classification of the bug as an "advisory." Users argued this term softens the reality of what they view as a significant breach of trust and privacy, distinguishing it from the typical, lower-severity definition of the word in IT contexts.
Fastest Front End Tooling for Humans and AI
Submission URL | 109 points | by cpojer | 94 comments
Fastest Frontend Tooling for Humans and AI: a push for 10x faster JS/TS feedback loops
The author argues 2026 is the year JavaScript tooling finally gets fast by pairing strict defaults with native-speed tools, benefiting both humans and LLMs. The centerpiece is tsgo, a Go rewrite of the TypeScript compiler that reportedly delivers ~10x faster type checking and editor support, and even catches some errors the JS implementation missed. It’s been used across 20+ projects (1k–1M LOC) and is pitched as stable enough to adopt, especially if you first swap builds to tsdown (Rolldown-based) for libraries or Vite for apps. Migration is simple: install @typescript/native-preview, replace tsc with tsgo, clean legacy flags, and flip a VS Code setting.
On formatting, Oxfmt aims to replace Prettier without losing ecosystem coverage. It bakes in popular plugins (import/Tailwind class sorting) and falls back to Prettier for the long tail of non-JS languages, easing migration and editor integration.
For linting, Oxlint is positioned as the first credible ESLint replacement because it can run ESLint plugins via a shim and NAPI-RS, supports TS config files, and adds type-aware rules. With oxlint --type-aware --type-check, you can lint and type-check in one fast pass powered by tsgo.
To make strictness easy, @nkzw/oxlint-config bundles a comprehensive, fast, and opinionated rule set designed to guide both developers and LLMs:
- Error-only (no warnings)
- Enforce modern, consistent style
- Ban bug-prone patterns (e.g., instanceof; illustrated after this list), disallow debug-only code in prod
- Prefer fast, autofixable rules; avoid slow or overly subjective ones
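As a rough illustration of what a rule set like this steers you toward (the post's summary doesn't spell out the exact rules or their rationale, so the motivation below is an assumption), here is the kind of instanceof check that commonly gets flagged, next to a discriminated-union alternative:

```typescript
class TimeoutError extends Error {}

// Pattern a strict rule set may flag: instanceof checks can misfire across
// realms or duplicated bundled copies of a class, and with down-compiled Error subclasses.
function isTimeout(err: unknown): boolean {
  return err instanceof TimeoutError;
}

// A common alternative: model failures as a discriminated union and branch on
// a plain data field instead of the prototype chain.
type FetchFailure =
  | { kind: "timeout"; afterMs: number }
  | { kind: "http"; status: number };

function isTimeoutFailure(f: FetchFailure): f is Extract<FetchFailure, { kind: "timeout" }> {
  return f.kind === "timeout";
}
```

Error-only, autofixable rules of this sort are also easier for an LLM to satisfy mechanically than subjective style warnings, which lines up with the post's claim that strict guardrails help models as well as humans.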
The post includes “migration prompts” for swapping Prettier→Oxfmt and ESLint→Oxlint, and points to ready-made templates (web, mobile, library, server) used by OpenClaw. Smaller DevX picks include npm-run-all2, ts-node, pnpm, Vite, and React.
Why it matters: Faster, stricter tooling shortens feedback loops, reduces bugs, and—per the author’s experiments—helps LLMs produce more correct code under strong guardrails. Caveat: tsgo is still labeled experimental, so teams should trial it on a branch before a full switch.
Based on the discussion, the community reaction is divided between excitement for performance gains and concern over the long-term maintainability of a "fractured" ecosystem.
The "Schism" and Maintainability The most contentious point, led by user conartist6, is the fear that rewriting JavaScript tooling in low-level languages (Rust, Go) creates a "big schism." Critics argue this prevents the average JS developer from understanding, debugging, or contributing to the tools they rely on, potentially leaving critical infrastructure in the hands of VC-backed entities (like the creators of VoidZero) rather than the community. TheAlexLichter counters this, arguing that the average web developer rarely contributes to tooling internals anyway, and that AI tools make crossing language barriers (JS to Rust) easier for those who do wish to contribute.
Performance vs. Architecture
There is a debate regarding why current tooling is slow.
- The Unified JS Argument: Some users argue that the slowness isn't due to JavaScript itself, but rather the inefficiency of running three separate programs (bundler, linter, formatter) that all parse the Abstract Syntax Tree (AST) separately. They suggest a unified toolchain written in JS would be sufficient if architected correctly.
- The Native Speed Argument: Others, including 9dev, argue that JS runtimes have hit a performance wall ("throughput cliff"), making native languages necessary for modern build speeds. They contend that "batch processing" speed is relevant and not just an architectural issue.
Adoption and Compatibility
Users like dcr express high interest in switching for the "10x speed increase," noting that if the tools are compatible (e.g., Oxfmt supporting Prettier plugins, Oxlint running ESLint rules via compatibility layers), the migration is worth it. TheAlexLichter confirms that tools like Oxlint and independent projects like Rolldown are designed to be compatible replacements for existing standards.
Other points raised:
- Bun: User fsmdbrg questions why Bun wasn't mentioned, noting it already offers a fast, unified runtime, bundler, and test runner that solves many of these problems.
- AI Skepticism: One user initially dismissed the post as "AI spam" due to its tone, highlighting a growing distrust in the community toward AI-generated technical content, though they later walked back the comment.
- Security: There were minor concerns regarding tracking CVEs in the native dependencies of these new tools, though others felt the risk was manageable compared to general supply chain risks.
The Future of AI Software Development
Submission URL | 199 points | by nthypes | 140 comments
Martin Fowler recaps Thoughtworks’ Future of Software Development Retreat, pushing back on calls for an “AI-era manifesto.” Instead, a 17-page summary distills eight themes showing how practices built for human-only development are buckling under AI-assisted work. Replacements are emerging but immature.
What’s new
- Supervisory engineering “middle loop”: a layer between prompt/coding and production that focuses on oversight, verification, and integration.
- Risk tiering as a core discipline: engineering practices and controls scale with the risk of the change/system.
- TDD reframed as prompt engineering: tests as the most reliable way to specify and constrain LLM behavior (a sketch follows this list).
- From DevEx to AgentEx: invest in tooling and workflows for humans plus agents, not just humans.
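A minimal sketch of the "tests as prompt engineering" idea, assuming a vitest setup: the test file is written first and handed to the model as the specification it must satisfy; redactEmailAddresses is a hypothetical function the model is asked to implement until the suite passes, not anything from the retreat summary.

```typescript
import { describe, expect, it } from "vitest";
// Hypothetical module the LLM is prompted to implement against this spec.
import { redactEmailAddresses } from "./redact";

describe("redactEmailAddresses", () => {
  it("replaces email addresses with a placeholder", () => {
    expect(redactEmailAddresses("contact me at jane@example.com"))
      .toBe("contact me at [redacted]");
  });

  it("leaves text without email addresses untouched", () => {
    expect(redactEmailAddresses("no contact info here"))
      .toBe("no contact info here");
  });
});
```

The constraint runs in both directions: the tests pin down the behavior the model must hit, and the model's misses show up as red tests in the supervisory "middle loop" rather than in production.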
Reality check
- AI is an accelerator/amplifier, not a panacea. It speeds coding, but if delivery practices are weak, it just accelerates tech debt (echoing the 2025 DORA report).
- No one has it figured out at scale; the most valuable outcome may be a shared set of questions.
Open questions
- Skill mix: will LLMs erode FE/BE specialization in favor of “expert generalists,” or just code around silos?
- Economics: what happens when token subsidies end?
- Process: do richer specs push teams toward waterfall, or can LLMs speed evolutionary delivery without losing feedback loops?
Security and platforms
- Security lagged in attendance, but consensus: platform teams must provide “bullet trains” — fast, safe AI paths with security baked in. Vendors may be underweighting safety factors.
Meta
- Open Space format fostered deep, respectful dialogue and notable inclusivity, a reminder that culture still compounds tooling.
Here is the daily digest summary for the top story:
Martin Fowler on AI’s impact: no new manifesto, but big shifts underway
Martin Fowler provides a recap of the Thoughtworks Future of Software Development Retreat, arguing against the creation of a new "AI manifesto" and instead presenting a summary of how practices originally designed solely for humans are buckling under AI-assisted work. The report identifies eight key themes, suggesting that while AI serves as an accelerator, it also threatens to speed up the accumulation of technical debt if delivery practices are weak.
Key emerging concepts include:
- Supervisory Engineering: A "middle loop" focused on oversight and verification rather than direct coding.
- Risk Tiering: Scaling engineering controls based on the risk level of the change.
- TDD as Prompt Engineering: Using tests as the primary method to specify and constrain LLM behavior.
- AgentEx: Moving beyond Developer Experience to build tooling for both humans and agents.
The report concludes that no one has solved this at scale yet, and the industry currently has more shared questions—about economics, skill specialization, and security—than answers.