I tried this a bit and kept ending up with large diffs where it was hard to trace which changes came from which prompt, instruction, or intermediate decision. Has anyone made this work in a normal company codebase? How do you keep the output reviewable and traceable?