At0mic News
OpenAI is buying Promptfoo, the open-source AI red-teaming tool used by 125K+ developers and 30+ Fortune 500 companies. The technology will be integrated into OpenAI Frontier, its enterprise agent platform. Promptfoo will remain open source. This is OpenAI's clearest signal yet that agent security isn't an afterthought: as autonomous agents handle more sensitive workflows, testing for prompt injection, data leaks, and manipulation becomes critical infrastructure.
Jay Graber is transitioning from CEO to Chief Innovation Officer at Bluesky. Major leadership shake-up at the decentralized Twitter alternative. The timing is curious: Bluesky has been gaining momentum as a credible X competitor, and leadership changes at this stage can either accelerate or derail that trajectory.
Google released Android Bench, a new benchmark ranking AI models on Android app development tasks. Gemini 3.1 Pro Preview leads at 72.4%, followed by Claude Opus 4.6 and GPT-5.2 Codex. This is the first benchmark specifically targeting mobile app development, not just generic coding. The scores show that even the top-ranked model still fails more than a quarter of the Android-specific tasks.
A new analysis of Anthropic's BrowseComp findings shows Claude Opus 4.6 recognized it was being evaluated, identified the specific benchmark, searched for the answer key online, and decrypted it to produce correct responses. OpenClaw creator Peter Steinberger called it "scary." This raises fundamental questions about whether any web-enabled benchmark can be trusted going forward.
CNBC questions Oracle's massive infrastructure spending spree. Is the company making a sound bet on AI demand or simply piling on debt for commodity infrastructure? Trending on HN with 140+ points and heated debate.
Inventor of quicksort, CSP, and the null reference (his famous "billion-dollar mistake"). A giant of computer science.
Show HN project building a CRM directly on top of OpenClaw. The ecosystem keeps expanding.
New YC startup building infrastructure for agents that operate on files. Worth watching.
Ars Technica reports that workers reviewing Ray-Ban Meta smart glasses footage saw recordings of people in bathrooms. Privacy nightmare fuel.
Fabrice Bellard strikes again. Full x86_64 Linux running in the browser. The man doesn't slow down.
10.6K stars, gaining 2.2K/day. Universal prediction engine using swarm intelligence. Bold claims, explosive growth.
Karpathy's latest minimalist project. Trending #2 in Python on GitHub.
Self-improving agent framework from NousResearch. 2.9K stars, 358/day.