RSS Bilingual Reader

minimaxir.com

An AI agent coding skeptic tries AI agent coding, in excessive detail

You’ve likely seen many blog posts about AI agent coding/vibecoding where the author talks about all the wonderful things agents can now do supported by vague anecdata, how agents will lead to the atrophy of programming skills, how agents impugn the sovereignty of the human soul, etc., etc. This is NOT one of those posts. You’ve been warned. Last May, I wrote a blog post tit...

2026-02-27 18:00 Original link

Not translated

Nano Banana Pro is the best AI image generator, with caveats

A month ago, I posted a very thorough analysis on Nano Banana, Google’s then-latest AI image generation model, and how it can be prompt engineered to generate high-quality and extremely nuanced images that most other image generation models can’t achieve, including ChatGPT at the time. For example, you can give Nano Banana a prompt with a comical amount of constraints: Create an image featuri...

2025-12-22 18:45 Original link

Not translated

Nano Banana can be prompt engineered for extremely nuanced AI image generation

You may not have heard about new AI image generation models as much lately, but that doesn’t mean that innovation in the field has stagnated: it’s quite the opposite. FLUX.1-dev immediately overshadowed the famous Stable Diffusion line of image generation models, while leading AI labs have released models such as Seedream, Ideogram, and Qwen-Image. Google also joined the action with Imagen 4...

2025-11-13 17:30 Original link

Not translated

Claude Haiku 4.5 does not appreciate my attempts to jailbreak it

Whenever a new large language model is released, one of my initial tests is to try and jailbreak it just to see how well the model handles adversarial attacks. Jailbreaking an LLM involves a form of adversarial prompt engineering to attempt to bypass its safeguards against prohibited user input such as prompts requesting sexual or illegal content. While most of the LLMs from top labs such as OpenAI’s...

2025-10-17 16:15 Original link

Not translated

Can modern LLMs actually count the number of b's in "blueberry"?

Last week, OpenAI announced and released GPT-5, and the common consensus both inside the AI community and outside is that the new LLM did not live up to the hype. Bluesky, whose community is skeptical at best of generative AI in all its forms, began putting the model through its paces: Michael Paulauski asked GPT-5 through the ChatGPT app interface “how many b’s are there in blueberry?”. A simple question that a human child could answer correctly, but ChatGPT states that the...

2025-08-12 16:00 Original link

Not translated

LLMs can now identify public figures in images

I’ve been working on a pipeline for representing an image as semantic structured data using multimodal LLMs for better image categorization, tagging, and searching. During my research, I started with something simple by taking an image and having an LLM describe who is in it: if they’re famous, there should be more than enough annotated images in the LLM’s training dataset to accurately id...

2025-07-28 20:15 Original link

Not translated