Thread History in Higher Dimensions

multi-agent shouldn’t work - but it does. for all the things llms are known for, reliability is not one. yann lecun’s now-infamous slides show P(correct) = (1-e)^n…tldr: each token is another chance for things to go wrong, so generating MORE tokens only exacerbates the issue.
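the compounding-error argument is easy to sketch: if each token independently has error rate e, the chance an n-token output is fully correct is (1-e)^n, which decays exponentially in n.

```python
# lecun-style compounding error: per-token error rate e, n tokens,
# P(fully correct) = (1 - e)^n under an independence assumption.

def p_correct(e: float, n: int) -> float:
    """Probability that all n tokens are correct, assuming independence."""
    return (1 - e) ** n

for n in (100, 1_000, 10_000):
    print(n, p_correct(0.001, n))
```

even a 0.1% per-token error rate leaves a 10,000-token generation with essentially zero chance of being perfect end to end.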

so…slapping AI on top of AI to improve reliability seems like it shouldn’t work. but it does. you don’t have to take my word for it. let’s look at the three hottest coding clis today: Claude Code, OpenCode, and Amp.

for many software engineers, myself included, these programs have been the first agents that are actually useful. and one tool is common to all of them: Task: the ability for the agent to delegate a narrowly defined task to another, typically less powerful, LLM.
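the shape of a Task-style tool is roughly this - the parent agent hands a narrow prompt to a sub-model and gets back only a final summary, so the sub-agent’s intermediate tokens never touch the parent’s context. a minimal sketch, where `call_model` is a hypothetical stand-in for any real completion API, not the actual implementation in any of these clis:

```python
# hypothetical sketch of Task-style delegation. the key property:
# only `summary` flows back to the parent agent; everything the
# sub-agent reads or emits along the way is discarded.
from dataclasses import dataclass

@dataclass
class TaskResult:
    summary: str  # the only thing the parent agent ever sees

def call_model(model: str, prompt: str) -> str:
    # placeholder for a real LLM API call
    return f"[{model}] done: {prompt.splitlines()[0]}"

def task(description: str, prompt: str, model: str = "small-model") -> TaskResult:
    """delegate a narrowly scoped task to a (typically cheaper) sub-model."""
    raw = call_model(model, prompt)
    return TaskResult(summary=raw)

result = task("find callers", "search the repo for callers of foo()")
print(result.summary)
```

the design choice that matters is the return type: a short summary, not a transcript.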

so first, why is such a tool included?

  1. the cost argument. it’s immediately obvious that shipping tokens to a smaller, cheaper model will be good for $$.

but that isn’t really why it matters. the market has shown a near-insatiable appetite for the smartest, best models to chew up tokens and spit out working solutions. after all, even at

  2. context engineering. the golden rule of context engineering is to labor over every token.