How I built a benchmark to measure what truly matters for AI coding assistants: file-editing precision, instruction following, and complex-context handling.