AI Submissions for Sat Dec 20 2025
Claude in Chrome
Submission URL | 258 points | by ianrahman | 138 comments
Anthropic is rolling out “Claude in Chrome,” a beta extension that lets Claude navigate the web, click buttons, and fill forms directly in your browser. It’s available to all paid subscribers and integrates with Claude Code and the Claude Desktop app.
What it does
- Agentic browsing: Claude can read pages, navigate sites, submit forms, and operate across tabs.
- Background and scheduled runs: Kick off workflows that continue while you do other work, or schedule daily/weekly tasks.
- Example workflows: Pull metrics from analytics dashboards, organize Google Drive, prep from your calendar (read threads, book rooms), compare products across sites into a Google Sheet, log sales calls to Salesforce, and triage promotional emails.
How it integrates
- Claude Code: Build/test browser-based projects faster by iterating in Chrome.
- Claude Desktop: Start tasks in the desktop app and let Chrome handle the web steps via a connector.
Safety and limitations
- Strong warnings: This is a beta with unique risks (prompt injection, unintended actions, hallucinations).
- Permissions model: Pre-approve actions per site; Claude will still ask before irreversible steps (e.g., purchases). You can skip prompts for trusted workflows but should supervise closely.
- Not recommended: Financial transactions, password management, or other sensitive/high‑stakes tasks.
- Guidance provided: Docs on prompt‑injection risks, safe use, and permissions. Anthropic notes protections are not foolproof and asks users to report issues.
Why it matters
- Pushes “agentic” browser automation toward mainstream productivity and lightweight RPA.
- Honest risk framing acknowledges the open‑web attack surface for LLM agents—expect an arms race around prompt injection and permission design.
- Developers get a quicker loop for testing web apps; business users get scheduled, multi‑step workflows without leaving Chrome.
Availability
- Chrome extension, beta, for paid Claude subscribers. Claims compliance with Chrome Web Store User Data Policy (Limited Use). Includes demos and troubleshooting guides.
Based on the discussion, here is a summary of the community's reaction:
Security & "Sandboxing" Irony
The dominant sentiment is skepticism about security. The top comment notes the irony of engineers spending years hardening Chrome (V8 sandboxing, process isolation) only to plug an LLM directly into the browser to operate it, likening it to "lighting a gasoline fire." Users also mocked the security implementation found in the extension's code, a "comprehensive list of regexes" meant to prevent secret exfiltration (blocking strings like password or api_key), which they ridiculed as insufficient.
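For context on the mechanism being mocked, here is a minimal sketch of a regex-style blocklist (the patterns and helper below are hypothetical, not Anthropic's actual implementation) along with one obvious way it fails:

```python
import re

# Illustrative only: a naive blocklist of the kind commenters described,
# not Anthropic's actual code. Pattern names and coverage are assumptions.
SECRET_PATTERNS = [
    re.compile(r"password", re.IGNORECASE),
    re.compile(r"api[_-]?key", re.IGNORECASE),
    re.compile(r"auth[_-]?token", re.IGNORECASE),
]

def looks_like_secret(outgoing_text: str) -> bool:
    """Return True if any blocklisted pattern appears in text the agent is about to send."""
    return any(p.search(outgoing_text) for p in SECRET_PATTERNS)

print(looks_like_secret("my password is hunter2"))            # True: caught
print(looks_like_secret("bXkgcGFzc3dvcmQgaXMgaHVudGVyMg=="))  # False: the same secret, base64-encoded, sails through
```

Keyword filters of this kind catch only literal matches, which is why commenters considered them weak protection against deliberate exfiltration.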
Prior Art & Comparisons
A significant portion of the thread debated whether Anthropic is "inventing" the terminal coding agent category or simply commercializing it. Users pointed out that open-source tools like Aider have offered similar functionality since 2023, correcting claims that this is a novel workflow. Some users felt this was an attempt by Anthropic to "flex" rather than genuinely innovate on the interface.
Real-World Testing & Hallucinations
Early reports from users were mixed but revealing:
- Failures: One user tried to use it to analyze Zillow listings, but the agent failed to paginate or click links effectively, leading to the conclusion that the "promises are light years ahead of efficacy."
- "Scary" Success: Conversely, another user reported that when Claude Code couldn't find a public API for a task, it successfully navigated the Chrome UI, scraped authentication tokens and cookies, and constructed a curl request to the service's private API. The user described this as "amazing" problem-solving that was simultaneously terrifying from a security perspective.
Antitrust & Ecosystem
Commenters speculated that this is the "endgame" for the browser wars, predicting that Google will eventually bundle Gemini in a way that creates an antitrust conflict. Others worried that Google might use Manifest V4 or future updates to break functionality for third-party agents like Claude in favor of their own models.
Show HN: HN Wrapped 2025 - an LLM reviews your year on HN
Submission URL | 263 points | by hubraumhugo | 139 comments
Spotify Wrapped for Hacker News, but with snark. “HN Wrapped 2025” uses AI to roast your year on HN, chart your trends, and even predict what you’ll obsess over next. The makers (an “AI agents for web data” team that’s hiring) say they delete all data within 30 days and aren’t affiliated with YC or HN. Expect shareable, tongue-in-cheek summaries of your posts and comments—part toy, part recruiting pitch, and very on-brand for year-end internet rituals.
Here is a summary of the discussion:
Users had fun sharing the specific "roasts" generated by the tool, with many finding the summaries surprisingly accurate—or at least amusingly cutting. Common themes included the AI mocking users for being pedantic, obsessed with retro computing, or fixated on specific topics like GDP or cloud pricing. A standout feature for many was the generated "Future HN" headlines (predictions for 2026–2035), which some users admitted were realistic enough that they tried to click on them.
However, there was constructive technical feedback. Several commenters noticed a strong "recency bias," where the summary seemed to ignore the bulk of the year in favor of comments from the last month or two. The tool's creator, hubraumhugo, was active in the thread, explaining that the system uses a two-step process (extracting patterns then generating content) and subsequently pushed an update to shuffle posts and reduce the recency bias based on this feedback.
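As a rough sketch of that kind of fix (the function and parameters below are hypothetical, assuming a newest-first list of posts): shuffling before truncating spreads the sample across the whole year instead of over-weighting recent activity.

```python
import random

def sample_posts_for_summary(posts: list[dict], limit: int = 200, seed: int = 42) -> list[dict]:
    """Pick a bounded sample of a user's posts to fit in the LLM context.

    Slicing the first `limit` items of a newest-first list bakes in recency
    bias; shuffling a copy first gives every month of the year a fair shot.
    """
    pool = list(posts)                 # copy, so the caller's ordering is untouched
    random.Random(seed).shuffle(pool)  # seeded shuffle keeps runs reproducible
    return pool[:limit]
```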
On the critical side, some users felt the AI relied too heavily on keywords to generate generic stereotypes rather than understanding the nuance of their actual arguments. Others noted the irony (or mild horror) of how easily AI can profile individuals based on public data, calling it a "normalization of surveillance capitalism," though most admitted they still enjoyed the toy. A few bugs were also reported, such as issues with case-sensitivity in usernames and speech bubble attribution errors in the generated XKCD-style comics.
MIRA – An open-source persistent AI entity with memory
Submission URL | 118 points | by taylorsatula | 48 comments
MIRA OS: an open-source “brain-in-a-box” for building a persistent AI agent that never resets the chat, manages its own memories, and hot-loads tools on demand.
Highlights
- One conversation forever: No “new chat” button. Continuity is the design constraint, with REM‑sleep‑like async processing and self-directed context-window management.
- Memory that maintains itself: Discrete memories decay unless referenced or linked; relevant ones are loaded via semantic search and traversal. For long, non-decaying knowledge, “domaindocs” let you co-edit durable texts (e.g., a preseeded “knowledgeofself”), which Mira can expand/collapse to control token size.
- Drop-in tools, zero config: Put a tool file in tools/ and it self-registers on startup (see the sketch after this list). Mira enables tools only when needed (via invokeother_tool) and lets them expire from context after 5 turns to reduce token bloat. Ships with Contacts, Maps, Email, Weather, Pager, Reminder, Web Search, History Search, Domaindoc, SpeculativeResearch, and InvokeOther.
- Event-driven “long-horizon” architecture: Loose coupling via events; after 120 minutes idle, SegmentCollapseEvent triggers memory extraction, cache invalidation, and summaries—each module reacts independently.
- Built for hacking: Simple tool spec plus HOW_TO_BUILD_A_TOOL.md lets AI coding assistants generate new tools quickly. Run it, cURL it, it talks back, learns, and uses tools.
- Tone and license: The author calls it their TempleOS—opinionated, minimal, and exploratory. AGPL-3.0. Snapshot: ~243 stars, 14 forks.
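A minimal sketch of how drop-in tool discovery along these lines can work, assuming a decorator-based registry and a tools/ package; this is illustrative and does not follow MIRA's actual tool spec:

```python
import importlib
import pkgutil
from typing import Callable

TOOL_REGISTRY: dict[str, Callable] = {}

def tool(name: str):
    """Decorator: register a function under `name` when its module is imported."""
    def register(fn: Callable) -> Callable:
        TOOL_REGISTRY[name] = fn
        return fn
    return register

def load_tools(package: str = "tools") -> None:
    """Import every module in the tools/ package so its @tool decorators run."""
    pkg = importlib.import_module(package)
    for mod in pkgutil.iter_modules(pkg.__path__):
        importlib.import_module(f"{package}.{mod.name}")

# A hypothetical tools/weather.py would only need:
#
#   from registry import tool
#
#   @tool("weather")
#   def weather(city: str) -> str:
#       return f"Forecast for {city}: ..."
#
# After load_tools(), the agent can enable TOOL_REGISTRY["weather"] on demand
# and drop it from context again once it has gone unused for a few turns.
```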
Why it’s interesting
- A serious stab at believable persistence without human curation.
- Clever token discipline: decaying memories + transient tool context + collapsible docs.
- Easy extensibility via event-driven modules and drop-in tools.
Potential trade-offs
- Single-threaded lifetime chat can blur topics and history.
- AGPL may limit some commercial uses.
Licensing Controversy
The discussion began with immediate criticism of the submission's use of the term "Open Source." Users pointed out that the project originally carried a Business Source License (BSL-1.1), which is technically "Source Available" rather than Open Source under OSI definitions. The author (taylorsatula) acknowledged the error, explaining they initially copied HashiCorp's license to deter low-effort commercial clones, but ultimately agreed with the feedback and switched the license to AGPL-3.0 to align with the open-source spirit.
Memory and Context Poisoning
A significant technical discussion revolved around the pitfalls of persistent memory.
- One user asked how MIRA prevents "context poisoning," where an AI remembers incorrect facts or gets stuck in a bad state that persists across sessions.
- The author explained their mitigation strategy: a two-step retrieval process. Instead of stuffing the context window, MIRA performs a semantic vector search, then uses a secondary API call to rank and filter those memories. Only the most relevant "top 10" make it to the main generation context, preventing the model from getting overwhelmed or confused by outdated data (see the sketch after this list).
- Others noted the biological inspiration behind the memory system, comparing the decay mechanism to Hebbian plasticity.
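As a sketch of that retrieve-then-rerank pattern (the store.similarity_search and llm.complete calls below are placeholders, not MIRA's actual APIs):

```python
def retrieve_memories(query: str, store, llm, k_candidates: int = 50, k_final: int = 10) -> list[str]:
    """Two-step retrieval: broad semantic search, then a second model call to filter.

    Step 1 casts a wide net over embedded memories; step 2 spends one extra API
    call ranking them, so only the handful that are genuinely relevant reach the
    main generation context and stale entries cannot poison it.
    """
    # Step 1: semantic vector search (placeholder vector-store API).
    candidates = store.similarity_search(query, k=k_candidates)

    # Step 2: secondary ranking call (placeholder LLM API); keep only the top k_final.
    prompt = (
        f"Rank these memories by relevance to the query and return the best {k_final}, "
        f"one per line, most relevant first.\nQuery: {query}\nMemories:\n"
        + "\n".join(f"- {m}" for m in candidates)
    )
    ranked = [line.lstrip("- ").strip() for line in llm.complete(prompt).splitlines() if line.strip()]
    return ranked[:k_final]
```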
Bugs and Architecture
- Real-time fixes: Users reported runtime errors with tool searches and mobile image uploads; the author identified a bug related to stripping tool calls in the native Claude support and pushed a fix during the thread.
- Tech Stack: Developers confirmed the project is Python-based, prompting relief from Python users and some disappointment from those hoping for a C# backend.
- Philosophy: Commenters appreciated the "TempleOS" comparison in the README, which the author clarified was a tribute to the obsessive, deep-dive learning style of David Hahn.
Reflections on AI at the End of 2025
Submission URL | 222 points | by danielfalbo | 333 comments
Salvatore “antirez” Sanfilippo (creator of Redis) surveys how the AI narrative shifted in 2025 and where he thinks it’s going next.
Key points
- The “stochastic parrot” era is over: Most researchers now accept that LLMs form useful internal representations of prompts and their own outputs.
- Chain-of-thought (CoT) works because it enables internal search (sampling within model representations) and, when paired with RL, teaches stepwise token sequences that converge to better answers. But CoT doesn’t change the architecture—it's still next-token prediction.
- Reinforcement learning with verifiable rewards could push capabilities beyond data-scaling limits. Tasks like code speed optimization offer long, clear reward signals; expect RL for LLMs to be the next big wave (a toy reward sketch follows this list).
- Developer adoption: Resistance to AI-assisted coding has dropped as quality improved. The field is split between using LLMs as “colleagues” via chat vs. running autonomous coding agents.
- Beyond Transformers: Some prominent researchers are pursuing alternative architectures (symbolic/world models). Antirez argues LLMs may still reach AGI by approximating discrete reasoning, but multiple paths could succeed.
- ARC benchmarks: Once touted as anti-LLM, ARC now looks tractable—small specialized models do well on ARC-AGI-1, and large LLMs with extensive CoT score strongly on ARC-AGI-2.
- Long-term risk: He closes by saying the core challenge for the next 20 years is avoiding extinction.
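To make "verifiable reward" concrete, here is a toy harness (an assumption for illustration, not something from the essay) in which a candidate optimization earns reward only if it stays correct, scaled by its measured speedup over a reference implementation:

```python
import time

def verifiable_reward(candidate_fn, reference_fn, test_cases) -> float:
    """Toy RL reward for code speed optimization.

    The signal is verifiable: zero unless every test passes, otherwise the
    wall-clock speedup of the candidate over the reference implementation.
    """
    # Correctness gate: any wrong answer yields no reward at all.
    for args, expected in test_cases:
        if candidate_fn(*args) != expected:
            return 0.0

    # Speed signal: time both implementations on the same inputs.
    def elapsed(fn) -> float:
        start = time.perf_counter()
        for args, _ in test_cases:
            fn(*args)
        return time.perf_counter() - start

    return elapsed(reference_fn) / max(elapsed(candidate_fn), 1e-9)

# Example: a closed-form sum vs. a naive loop should score well above 1.0.
score = verifiable_reward(
    candidate_fn=lambda n: n * (n - 1) // 2,
    reference_fn=lambda n: sum(range(n)),
    test_cases=[((100_000,), 4_999_950_000)],
)
```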
Why it matters
- Signals a broad consensus shift on LLM reasoning, legitimizes CoT as a first-class technique, and frames RL with verifiable rewards as the likely engine of continued progress—especially for agentic, tool-using systems and program optimization tasks.
Medical Advice and Accessibility vs. Safety
A central thread of the discussion debates the safety of the general public using LLMs for critical advice (medical, life decisions) versus using them for verifiable tasks like coding.
- The Scarcity Argument: User ltrm argues that critics overlook scarcity. While an LLM isn't perfect, real doctors are often inaccessible (months-long wait times, short appointments). They contend that an "80-90% accurate" LLM is superior to the current alternative: relying on Google Search filled with SEO spam and scams.
- The Safety Counterpoint: etra0 and ndrpd push back, noting that while software engineers can verify code outputs, laypeople cannot easily verify medical diagnosis hallucinations. They argue that justifying AI doctors based on broken healthcare systems is a "ridiculous misrepresentation" of safety, given that LLMs are still stochastic generators that sound authoritative even when wrong.
The Inevitability of "Enshittification" and Ads
Participants strongly challenged the sentiment that LLMs are neutral actors that "don't try to scam."
- Subliminal Advertising: grgfrwny predicts that once financial pressure mounts, LLMs will move beyond overt ads to "subliminal contextual advertising." They offer a hypothetical where an AI responding to a user's feelings of loneliness subtly steers them toward a specific brand of antidepressants.
- Corporate Bias: lthcrps and JackSlateur argue that training data and system prompts will inevitably reflect the agendas of their owners. Commenters pointed to existing biases in models like Grok (reflecting Elon Musk's views) and Google's models (corporate safety/status quo) as evidence that there are already fingers on the scale.
Alternative Business Models
There was speculation regarding how these models will be funded to avoid the advertising trap. jonas21 proposed an insurance-based model where insurers pay for unlimited medical AI access to reduce costs, theoretically incentivizing accuracy over engagement. However, critics noted that the medical industry's reliance on liability regulations and the legal system makes this unlikely to happen quickly.
Anthropic: You can't change your Claude account email address
Submission URL | 85 points | by behnamoh | 65 comments
Anthropic: You can’t change the email on a Claude account (workaround = cancel + delete + recreate)
Key points:
- No email change support: Claude doesn’t let you update the email tied to your account. Choose an address you’ll keep long-term.
- If you need a different email, the official path is:
- Cancel any paid plan: Settings > Billing > Cancel. Cancellation takes effect at the end of the current billing period. To avoid another charge, cancel at least 24 hours before the next billing date. If you can’t log in (lost access to the original email), contact Support from an email you can access and CC the original address, confirming you want to cancel.
- Unlink your phone number: Ask Support to unlink it so you can reuse it on the new account.
- Delete the old account: Settings > Account > Delete Account. This is permanent and removes saved chats—export your data first. If you see “Contact support” instead of a delete button, you’ll need Support to assist.
- Then create a new account with the desired email.
Why it matters:
- Email changes are common (job changes, domain migrations). The lack of in-place email updates means extra friction: cancel, coordinate with Support, risk data loss if you don’t export, and downtime between accounts.
Discussion Summary:
The discussion on Hacker News focused on the technical, legal, and user experience implications of this limitation, noting that it is not unique to Anthropic.
- Industry Standard (unfortunately): Multiple commenters pointed out that OpenAI (ChatGPT) generally lacks support for changing email addresses as well, leading to frustration that two of the leading AI labs struggle with this basic web feature.
- Database Speculation: A common theory was that Anthropic might be using the email address as the database primary key. Developers criticized this as an amateur architectural decision ("scientists writing Python" rather than web engineers). One user pointed out the irony that Claude itself, if asked, strongly advises against using an email address as a primary key (a toy schema sketch follows this list).
- Security vs. Usability: A debate emerged regarding whether this is a security feature or a flaw. While some argued that locking emails prevents Account Takeovers (ATO) and simplifies verification logic, others countered that it creates a "customer service nightmare" and risks total account loss if a user loses access to their original inbox.
- GDPR Concerns: Users questioned how this policy interacts with GDPR’s "Right to Rectification," which mandates that companies allow users to correct inaccurate personal data (such as a defunct email address).
- Fraud Detection: Several users shared anecdotes of getting "instabanned" when signing up with non-Gmail addresses (like Outlook), suggesting Anthropic’s anti-abuse systems are overly sensitive to email reputation, further complicating account management.
- The "Day 2" Feature: Experienced developers noted that building "change email" functionality is difficult to get right and is often indefinitely postponed by startups focused on shipping core features, though many argued it should be standard for paid services.
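The primary-key point is easy to illustrate with a toy schema (an assumption for illustration only, not Anthropic's actual database): with a surrogate key, the email is just a mutable unique attribute, and changing it is a single UPDATE.

```python
import sqlite3

# Toy schema: a stable integer surrogate key, with email as a unique but
# changeable attribute. Nothing else in the database references the email.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,   -- surrogate key: never changes
        email TEXT NOT NULL UNIQUE   -- contact attribute: free to change
    );
    CREATE TABLE subscriptions (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)  -- points at id, not email
    );
""")
conn.execute("INSERT INTO users (email) VALUES ('old@example.com')")
conn.execute("INSERT INTO subscriptions (user_id) VALUES (1)")
# Changing the email touches exactly one row; no foreign keys need rewriting.
conn.execute("UPDATE users SET email = 'new@example.com' WHERE id = 1")
```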
School security AI flagged clarinet as a gun. Exec says it wasn't an error
Submission URL | 41 points | by kyrofa | 30 comments
Headline: Florida school locks down after AI flags a clarinet as a rifle; vendor says system worked as designed
- A Florida middle school went into lockdown after ZeroEyes, an AI gun-detection system with human review, flagged a student’s clarinet as a rifle. Police arrived expecting an armed suspect; they found a student in a camo costume for a Christmas-themed dress-up day hiding in the band room.
- ZeroEyes defended the alert as “better safe than sorry,” saying customers want notifications even with “any fraction of a doubt.” The district largely backed the vendor, warning parents to tell students not to mimic weapons with everyday objects.
- There’s disagreement over intent: the student said he didn’t realize how he was holding the clarinet; ZeroEyes claimed he intentionally shouldered it like a rifle.
- Similar misfires have dogged AI school surveillance: ZeroEyes has reportedly flagged shadows and theater prop guns; rival Omnilert once mistook a Doritos bag for a gun, leading to a student’s arrest.
- Critics label the tech “security theater,” citing stress, resource drain, and a lack of transparency. ZeroEyes won’t share false-positive rates or total detections and recently scrubbed marketing that claimed it “can prevent” mass shootings.
- Despite concerns, the district is seeking to expand use: a state senator requested $500,000 to add roughly 850 ZeroEyes-enabled cameras, arguing more coverage means more protection.
- Police said students were never in danger. Experts question whether recurring false positives do more harm than good compared to funding evidence-backed mental health services.
Takeaway: The clarinet incident underscores the core trade-off of AI gun detection in schools—“better safe than sorry” can mean frequent high-stakes false alarms, opaque performance metrics, and mounting costs, even as districts double down on expansion.
Discussion Summary:
- Skepticism regarding "Intentionality": Users heavily ridiculed the school district’s claim that the student was "holding the clarinet like a weapon." Commenters jokingly speculated on what constitutes a "tactical stance" for band instruments and listed other items—like crutches, telescopes, or sextants—that might trigger posture-based analysis. One user compared the district's defense to victim-blaming.
- System Failure vs. Design: While some offered a slight defense of the AI (noting the student was wearing camo and cameras can be grainy), the consensus was that the human verification step failed completely. Users argued that if a human reviewer cannot distinguish a clarinet from a rifle, the service provides little value over raw algorithms.
- Incentives and Accountability: Several commenters suggested that vendors should face financial penalties for false positives to discourage "security theater." There was suspicion that school officials defending the software as "working as intended" are merely protecting expensive contracts.
- Broader Societal Context: The thread devolved into a debate on the root causes necessitating such tech. Some argued that metal detectors are a more proven (albeit labor-intensive) solution, while others lamented that the US focuses on "technical solutions" (surveillance) for "real-world problems" (gun violence/mental health) that other countries don't have.
- Humor and Satire: The discussion included references to the "Not Hotdog" app from Silicon Valley, suggestions that students should protest with comically fake cartoon bombs, and dark satire regarding the "price of freedom" involving school safety.
What If Readers Like A.I.-Generated Fiction?
Submission URL | 9 points | by tkgally | 9 comments
- A new experiment by computer scientist Tuhin Chakrabarty fine-tuned GPT-4o on the complete works of 30 authors (in one test, nearly all of Han Kang's translated writings), then asked it to write new scenes in their style while holding out specific passages as ground-truth checks (a fine-tuning sketch follows this list).
- In blind evaluations by creative-writing grad students, the AI outputs beat human imitators in roughly two-thirds of cases. Judges often described the AI’s lines as more emotionally precise or rhythmic.
- An AI-detection tool (Pangram) failed to flag most of the fine-tuned outputs, suggesting style clones can evade current detectors.
- The work, co-authored with Paramveer Dhillon and copyright scholar Jane Ginsburg, appears as a preprint (not yet peer-reviewed). Ginsburg highlights the unsettling prospect that such indistinguishable, style-specific AI fiction could be commercially viable.
- Why it matters: This moves “AI can imitate vibe” to “AI can produce convincing, author-specific prose that readers may prefer,” raising acute questions about copyright (training on full oeuvres), consent, attribution, detectability, and the economics of publishing.
- Important caveats: Small sample and evaluator pool; translations were involved; results varied by author; outputs can still read trite; and legal/ethical legitimacy of the training data remains unresolved.
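The preprint's exact pipeline is not spelled out here, but a style-cloning fine-tune of this shape could look roughly like the following, assuming OpenAI's fine-tuning API and a JSONL file of scene prompts paired with the author's held-in passages (file name, prompts, and model snapshot are assumptions):

```python
from openai import OpenAI

client = OpenAI()

# Training data: JSONL chat examples, one per passage, e.g.
# {"messages": [
#   {"role": "system", "content": "You write fiction in the style of Author X."},
#   {"role": "user", "content": "Write the scene in which ..."},
#   {"role": "assistant", "content": "<the author's actual passage>"}]}
training_file = client.files.create(
    file=open("author_x_passages.jsonl", "rb"),
    purpose="fine-tune",
)

# Launch the fine-tune; held-out passages are excluded from this file so they
# can serve as ground-truth checks against the clone's output later.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",
)
print(job.id, job.status)
```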
Here is a summary of the discussion:
Commenters engaged in a debate over the cultural value and quality of AI-generated prose, drawing sharp parallels to the "processed" nature of modern pop music and film.
- The "Mass Slop" Theory: Several users argued that if people cannot differentiate between AI and human writing, it is because mass media has already conditioned audiences to accept formulaic, "processed" content (akin to auto-tuned music).
- Garbage In, Garbage Out: Discussion touched on "enshittification," with users noting that if AI models are trained on mass-market "slop," they will simply produce more of it, failing to fix underlying quality issues in publishing.
- Market Saturation: There were predictions that readers will eventually "drown" in or grow tired of the flood of AI-generated content.
- Narratives & Bias: While one user claimed this experiment smashes "AI-hater narratives," others maintained that readers still possess a bias toward "pure human" authorship when they are aware of the source.
- Article Accessibility: Users shared archive links to bypass the paywall, while some fiercely debated the quality of the article itself, advising others to "RTFA" (Read The F---ing Article) before judging.