Benchmark

LLM Benchmarks: Is It Worth ($$) Mixing 2 Models? (Planner + Executor)
April 25, 2026

LLM Coding Benchmark (April 2026): GPT 5.5, DeepSeek v4, Kimi v2.6, MiMo, and the State of the Art
April 24, 2026

LLM Benchmarks Part 2: Is It Worth Combining Multiple Models in the Same Project? Claude + GLM??
April 18, 2026

Testing Open Source and Commercial LLMs - Can Anyone Beat Claude Opus?
April 5, 2026

Rant - Will LLMs Evolve Forever? Demystifying LLMs in Programming
May 1, 2025