gilesthomas.com

Feed · 4 articles

Why smart instruction-following makes prompt injection easier

Back when I first started looking into LLMs, I noticed that I could use what I've since called the transcript hack to get LLMs to work as chatbots without specific fine-tuning. It's occurred to me that this partly explains why protection against prompt injection is so hard in practice. The transcript hack involved presenting chat text as something that made sense in the context of next-token prediction. Instead of just throwing something like this at a base LLM: User: Provide a synonym for 'bri...
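The transcript hack can be sketched in a few lines: wrap the user's request in a transcript-style framing so that a base (non-chat-tuned) model's next-token prediction naturally continues as the assistant's reply. The framing text and speaker labels below are illustrative assumptions, not the exact wording from the post.

```python
def transcript_prompt(user_message: str) -> str:
    """Frame a user message as a chat transcript for a base LLM to continue.

    A base model has no notion of "user" vs "assistant"; it just predicts
    the next tokens. Presenting the input as a transcript makes an
    assistant-style reply the most plausible continuation.
    """
    return (
        "The following is a transcript of a conversation between a "
        "helpful assistant and a user.\n\n"
        f"User: {user_message}\n"
        "Assistant:"
    )

prompt = transcript_prompt("Provide a synonym for 'bright'.")
```

Note that nothing in this framing is privileged: any text that looks like part of the transcript (including injected instructions inside `user_message`) is treated the same way by the model, which is exactly why prompt injection is hard to defend against.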

2025-11-12 19:00

Writing an LLM from scratch, part 27 -- what's left, and what's next?

On 22 December 2024, I wrote: Over the Christmas break (and probably beyond) I'm planning to work through Sebastian Raschka's book "Build a Large Language Model (from Scratch)". I'm expecting to get through a chapter or less a day, in order to give things time to percolate properly. Each day, or perhaps each chapter, I'll post here about anything I find particularly interesting. More than ten months and 26 blog posts later, I've reached the end of the main body of the book -- there's just th...

2025-11-04 00:40

Writing an LLM from scratch, part 26 -- evaluating the fine-tuned model

This post is on the second half of chapter 7 of Sebastian Raschka's book "Build a Large Language Model (from Scratch)". In the last post I covered the part of the chapter on instruction fine-tuning; this time round, we evaluate our model -- particularly interestingly, we try using another, smarter model to judge how good its responses are. Once again, Raschka's explanation in this section is very clear, and there's not that much that was conceptually new to me, so I don't have that ...
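The "smarter model as judge" idea can be sketched as follows: the judge model is shown the instruction, the expected answer, and the fine-tuned model's response, and asked to return a numeric score. The prompt wording, the 0-100 scale, and the parsing below are assumptions for illustration; the actual call to the judge model is left out, since any chat API would do.

```python
import re

def judge_prompt(instruction: str, expected: str, response: str) -> str:
    """Build a scoring prompt for a stronger 'judge' model (hypothetical wording)."""
    return (
        f"Given the input `{instruction}` and the correct output `{expected}`, "
        f"score the model response `{response}` on a scale from 0 to 100. "
        "Respond with the integer number only."
    )

def parse_score(judge_reply: str) -> int:
    """Pull the first integer out of the judge's reply, defaulting to 0.

    Judges don't always follow the 'number only' instruction, so we
    extract the first run of digits rather than calling int() directly.
    """
    match = re.search(r"\d+", judge_reply)
    return int(match.group()) if match else 0
```

Averaging `parse_score` over a held-out test set gives a single quality number for the fine-tuned model, at the cost of trusting the judge's calibration.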

2025-11-03 19:40

Writing an LLM from scratch, part 25 -- instruction fine-tuning

This post is on the first part of chapter 7 of Sebastian Raschka's book "Build a Large Language Model (from Scratch)", which covers instruction fine-tuning. In my last post, I went through a technique which I'd found could sometimes make it possible to turn non-fine-tuned models into reasonable chatbots; perhaps unsurprisingly, the GPT-2 model isn't powerful enough to work that way. So, with that proven, it was time to do the work :-) This post covers the first half of the chapter, where we ...
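Instruction fine-tuning starts by formatting each training example into a fixed prompt template. The sketch below uses the Alpaca-style template commonly used for this (and, as an assumption, the kind of template the chapter works with); field names and wording are illustrative.

```python
def format_example(instruction: str, input_text: str = "") -> str:
    """Format one training example in an Alpaca-style instruction template.

    Examples with an optional input field (e.g. a sentence to rewrite) get
    an extra "### Input:" section; the model is trained to produce the
    text that follows "### Response:".
    """
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}"
    )
    if input_text:
        prompt += f"\n\n### Input:\n{input_text}"
    prompt += "\n\n### Response:\n"
    return prompt
```

During fine-tuning, the formatted prompt plus the target response is tokenized as one sequence, and the usual next-token loss teaches the model to fill in the response section.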

2025-10-29 23:40