minimaxir.com
Nano Banana Pro is the best AI image generator, with caveats
A month ago, I posted a very thorough analysis on Nano Banana, Google’s then-latest AI image generation model, and how it can be prompt engineered to generate high-quality and extremely nuanced images that most other image generation models can’t achieve, including ChatGPT at the time. For example, you can give Nano Banana a prompt with a comical amount of constraints: Create an image featuri...
Nano Banana can be prompt engineered for extremely nuanced AI image generation
You may not have heard about new AI image generation models as much lately, but that doesn’t mean that innovation in the field has stagnated: it’s quite the opposite. FLUX.1-dev immediately overshadowed the famous Stable Diffusion line of image generation models, while leading AI labs have released models such as Seedream, Ideogram, and Qwen-Image. Google also joined the action with Imagen 4...
Claude Haiku 4.5 does not appreciate my attempts to jailbreak it
Whenever a new large language model is released, one of my initial tests is to try and jailbreak it just to see how well the model handles adversarial attacks. Jailbreaking an LLM involves a form of adversarial prompt engineering to attempt to bypass its safeguards against prohibited user input such as prompts requesting sexual or illegal content. While most of the LLMs from top labs such as OpenAI’s...
Can modern LLMs actually count the number of b's in "blueberry"?
Last week, OpenAI announced and released GPT-5, and the consensus both inside the AI community and outside is that the new LLM did not live up to the hype. Bluesky, whose community is skeptical at best of generative AI in all its forms, began putting the model through its paces: Michael Paulauski asked GPT-5 through the ChatGPT app interface “how many b’s are there in blueberry?”. A simple question that a human child could answer correctly, but ChatGPT states that the...
LLMs can now identify public figures in images
I’ve been working on a pipeline for representing an image as semantic structured data using multimodal LLMs for better image categorization, tagging, and searching. During my research, I started with something simple by taking an image and having an LLM describe who is in it: if they’re famous, there should be more than enough annotated images in the LLM’s training dataset to accurately id...