simonwillison.net

Feed: 231 articles

GLM-5.1: Towards Long-Horizon Tasks

Chinese AI lab Z.ai's latest model is a giant 754B parameter 1.51TB (on Hugging Face) MIT-licensed monster - the same size as their previous GLM-5 release, and sharing the same paper. It's available via OpenRouter so I asked it to draw me a pelican:

    llm install llm-openrouter
    llm -m openrouter/z-ai/glm-5.1 'Generate an SVG of a pelican on a bicycle'

And something new happened... unprompted, the model decided to give me an HTML page that included both the SVG...

2026-04-07 21:25

Anthropic's Project Glasswing - restricting Claude Mythos to security researchers - sounds necessary to me

Anthropic didn't release their latest model, Claude Mythos (system card PDF), today. They have instead made it available to a very restricted set of preview partners under their newly announced Project Glasswing. The model is a general purpose model, similar to Claude Opus 4.6, but Anthropic claim that its cyber-security research abilities are strong enough that they need to give the software industry as a whole time to prepare. Mythos Preview has already found thousands of high-severity vuln...

2026-04-07 20:52

SQLite WAL Mode Across Docker Containers Sharing a Volume

Research: SQLite WAL Mode Across Docker Containers Sharing a Volume. Inspired by this conversation on Hacker News about whether two SQLite processes in separate Docker containers that share the same volume might run into problems due to WAL shared memory. The answer is that everything works fine - Docker containers on the same host and filesystem share the same shared memory in a way that allows WAL to collaborate as it should. Tags: docker, sqlite
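For readers who have not used WAL mode, here is a minimal sketch of the setup being discussed, using only Python's standard sqlite3 module. It is not the research code from the post: it opens two connections inside a single process rather than two processes in separate containers, and the function name wal_demo, the table name and the file name are all made up for illustration. In the Docker scenario, db_path would point at a file on the shared volume.

```python
import os
import sqlite3
import tempfile


def wal_demo(db_path: str) -> tuple[str, list[tuple[int, str]]]:
    """Switch a database to WAL mode, write on one connection, read on another."""
    writer = sqlite3.connect(db_path)
    # The pragma returns the journal mode now in effect for this database file.
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    writer.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
    )
    writer.execute(
        "INSERT INTO notes (body) VALUES (?)", ("written by connection one",)
    )
    writer.commit()

    # A second, independent connection to the same file sees the committed row.
    reader = sqlite3.connect(db_path)
    rows = reader.execute("SELECT id, body FROM notes ORDER BY id").fetchall()
    reader.close()
    writer.close()
    return mode, rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical path; with Docker this would live on the shared volume.
        mode, rows = wal_demo(os.path.join(tmp, "shared.db"))
        print(mode, rows)
```

WAL mode is a persistent property of the database file, so it only needs to be enabled once; later connections pick it up automatically.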

2026-04-07 15:41

Google AI Edge Gallery

Terrible name, really great app: this is Google's official app for running their Gemma 4 models (the E2B and E4B sizes, plus some members of the Gemma 3 family) directly on your iPhone. It works really well. The E2B model is a 2.54GB download and is both fast and genuinely useful. The app also provides "ask questions about images" and audio transcription (up to 30s) with the two small Gemma 4 models, and has an interesting "skills" demo which demonstrates tool calling agai...

2026-04-06 05:18

datasette-ports 0.2

Release: datasette-ports 0.2. No longer requires Datasette - running uvx datasette-ports now works as well. Installing it as a Datasette plugin continues to provide the datasette ports command. Tags: datasette

2026-04-06 03:25

scan-for-secrets 0.3

Release: scan-for-secrets 0.3. New -r/--redact option which shows the list of matches, asks for confirmation and then replaces every match with REDACTED, taking escaping rules into account. New Python function redact_file(file_path: str | Path, secrets: list[str], replacement: str = "REDACTED") -> int. Tags: projects
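To make that signature concrete, here is a rough standalone sketch of a function with the same shape. It is not the implementation in scan-for-secrets: the name redact_file_sketch is hypothetical, it does plain substring replacement only (it ignores the escaping rules the real tool takes into account), and treating the int return value as the number of replacements made is an assumption, since the release note does not say what it represents.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def redact_file_sketch(
    file_path: str | Path,
    secrets: list[str],
    replacement: str = "REDACTED",
) -> int:
    """Replace literal occurrences of each secret in a UTF-8 text file.

    Returns the number of replacements made (an assumed meaning for the int).
    """
    path = Path(file_path)
    text = path.read_text(encoding="utf-8")
    total = 0
    # Handle longer secrets first so a secret containing another is replaced whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if not secret:
            continue  # an empty string would match everywhere
        count = text.count(secret)
        if count:
            text = text.replace(secret, replacement)
            total += count
    if total:
        # Only rewrite the file when something actually changed.
        path.write_text(text, encoding="utf-8")
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "session.log"  # hypothetical file name
        log.write_text("key=abc123 and again abc123\n", encoding="utf-8")
        print(redact_file_sketch(log, ["abc123"]))
        print(log.read_text(encoding="utf-8"))
```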

2026-04-06 02:59

Cleanup Claude Code Paste

Tool: Cleanup Claude Code Paste. Super-niche tool this. I sometimes copy prompts out of the Claude Code terminal app and they come out with a bunch of weird additional whitespace. This tool cleans that up. Tags: tools, claude-code

2026-04-06 02:55

datasette-ports 0.1

Release: datasette-ports 0.1. Another example of README-driven development, this time solving a problem that might be unique to me. I often find myself running a bunch of different Datasette instances with different databases and different in-development plugins, spread across dozens of different terminal windows - enough that I frequently lose them! Now I can run this:

    datasette install datasette-ports
    datasette ports

And get a list of every running instance that looks something like this: http...

2026-04-06 00:23

Eight years of wanting, three months of building with AI

Lalit Maganti provides one of my favorite pieces of long-form writing on agentic engineering I've seen in ages. They spent eight years thinking about and then three months building syntaqlite, which they describe as "high-fidelity devtools that SQLite deserves". The goal was to provide fast, robust and comprehensive linting and verifying tools for SQLite, suitable for use in language servers and other development tools - a parser, forma...

2026-04-05 23:54

Quoting Chengpeng Mou

From anonymized U.S. ChatGPT data, we are seeing: ~2M weekly messages on health insurance; ~600K weekly messages [classified as healthcare] from people living in “hospital deserts” (30 min drive to nearest hospital); 7 out of 10 msgs happen outside clinic hours — Chengpeng Mou, Head of Business Finance, OpenAI. Tags: ai-ethics, generative-ai, openai, chatgpt, ai, llms

2026-04-05 21:47

Syntaqlite Playground

Tool: Syntaqlite Playground. Lalit Maganti's syntaqlite is currently being discussed on Hacker News thanks to Eight years of wanting, three months of building with AI, a deep dive into exactly how it was built. This inspired me to revisit a research project I ran when Lalit first released it a couple of weeks ago, where I tried it out and then compiled it to a WebAssembly wheel so it could run in Pyodide in a browser (the library itself uses C and Rust). This new playground loads up the Python l...

2026-04-05 19:32

scan-for-secrets 0.2

Release: scan-for-secrets 0.2. CLI tool now streams results as they are found rather than waiting until the end, which is better for large directories. -d/--directory option can now be used multiple times to scan multiple directories. New -f/--file option for specifying one or more individual files to scan. New scan_directory_iter(), scan_file() and scan_file_iter() Python API functions. New -v/--verbose option which shows each directory that is being scanned.
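The streaming behaviour described here maps naturally onto a Python generator. The sketch below illustrates that general idea only: the release note does not show the signatures of the real scan_directory_iter(), scan_file() or scan_file_iter() functions, so the name scan_directory_iter_sketch, its parameters and the shape of the yielded tuples are all assumptions, and it matches literal substrings without any of the escaping schemes the real tool checks.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path
from typing import Iterator


def scan_directory_iter_sketch(
    directory: str | Path, secrets: list[str]
) -> Iterator[tuple[Path, int, str]]:
    """Yield (path, line_number, secret) for each literal match, as it is found."""
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = Path(root) / name
            try:
                with path.open("r", encoding="utf-8", errors="replace") as handle:
                    for line_number, line in enumerate(handle, start=1):
                        for secret in secrets:
                            if secret and secret in line:
                                # Yielding here lets the caller print a match
                                # before the rest of the tree has been read.
                                yield path, line_number, secret
            except OSError:
                continue  # unreadable file: skip it and keep scanning


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "a.txt").write_text("nothing here\ntoken=tok123\n", encoding="utf-8")
        for match in scan_directory_iter_sketch(tmp, ["tok123"]):
            print(match)
```

Because the function is a generator, a caller looping over it can report each match immediately, which is the property that matters for large directories.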

2026-04-05 04:07

scan-for-secrets 0.1.1

Release: scan-for-secrets 0.1.1. Added documentation of the escaping schemes that are also scanned. Removed unnecessary repr escaping scheme, which was already covered by json.

2026-04-05 03:39

scan-for-secrets 0.1

Release: scan-for-secrets 0.1. I like publishing transcripts of local Claude Code sessions using my claude-code-transcripts tool but I'm often paranoid that one of my API keys or similar secrets might inadvertently be revealed in the detailed log files. I built this new Python scanning tool to help reassure me. You can feed it secrets and have it scan for them in a specified directory:

    uvx scan-for-secrets $OPENAI_API_KEY -d logs-to-publish/

If you leave off the -d it defaults to the current dire...

2026-04-05 03:27

research-llm-apis 2026-04-04

Release: research-llm-apis 2026-04-04. I'm working on a major change to my LLM Python library and CLI tool. LLM provides an abstraction layer over hundreds of different LLMs from dozens of different vendors thanks to its plugin system, and some of those vendors have grown new features over the past year which LLM's abstraction layer can't handle, such as server-side tool execution. To help design that new abstraction layer I had Claude Code read through the Python client libraries for Anthropic, ...

2026-04-05 00:32

Quoting Kyle Daigle

[GitHub] platform activity is surging. There were 1 billion commits in 2025. Now, it's 275 million per week, on pace for 14 billion this year if growth remains linear (spoiler: it won't.) GitHub Actions has grown from 500M minutes/week in 2023 to 1B minutes/week in 2025, and now 2.1B minutes so far this week. — Kyle Daigle, COO, GitHub. Tags: github, github-actions

2026-04-04 02:20

Vulnerability Research Is Cooked

Thomas Ptacek's take on the sudden and enormous impact the latest frontier models are having on the field of vulnerability research. Within the next few months, coding agents will drastically alter both the practice and the economics of exploit development. Frontier model improvement won’t be a slow burn, but rather a step function. Substantial amounts of high-impact vulnerability research (maybe even most of it) will happen simply by pointing an agent at a sourc...

2026-04-03 23:59

The cognitive impact of coding agents

A fun thing about recording a podcast with a professional like Lenny Rachitsky is that his team know how to slice the resulting video up into TikTok-sized short form vertical videos. Here's one he shared on Twitter today which ended up attracting over 1.1m views! That was 48 seconds. Our full conversation lasted 1 hour 40 minutes. Tags: ai-ethics, coding-agents, agentic-engineering, generative-ai, podcast-appearances, ai, llms, cognitive-debt

2026-04-03 23:57

Quoting Willy Tarreau

On the kernel security list we've seen a huge bump of reports. We were between 2 and 3 per week maybe two years ago, then reached probably 10 a week over the last year with the only difference being only AI slop, and now since the beginning of the year we're around 5-10 per day depending on the days (fridays and tuesdays seem the worst). Now most of these reports are correct, to the point that we had to bring in more maintainers to help us. And we're now seeing on a daily basis something that ne...

2026-04-03 21:48

Quoting Daniel Stenberg

The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good. I'm spending hours per day on this now. It's intense. — Daniel Stenberg, lead developer of cURL. Tags: daniel-stenberg, security, curl, generative-ai, ai, llms, ai-security-research

2026-04-03 21:46