Representation-Free Editing

Posted on 2026-06-16 :: 11 min read :: Tags: software engineering, ai

Representation is an overloaded term in computer science. Colloquially, a representation is a concrete object that reflects an abstract one. It carries the properties of the abstract object over into a new domain that is presumably more convenient for some purpose the representer has in mind. We use arrays to represent collections of items, adjacency lists to represent graphs, Unicode code points to represent written language, and 64-bit integers to represent numbers. We use language to represent thoughts, programming languages to represent programs, and so on.

The thing about a representation is that, well, it’s distinct from the thing it’s representing. Often the distinction becomes practically important, because the representation omits or transforms information in ways that cannot be undone. Take a compiler, for instance. Decompilation is inherently lossy: the machine code drops information such as types, turns loops and branches into jumps, and reorders code as needed, so we cannot perfectly reconstruct the source code from a given piece of assembly.

This phenomenon is not specific to programming. You cannot reconstruct the Word document from the PDF it was exported to, and you cannot reconstruct a set of polygons drawn on a canvas from the pixels they render to. Of course, you can produce a reconstruction of the original document or canvas, but it is necessarily lossy because of the mismatch between the representations: there will always be cases that don’t map back perfectly to the original.
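To make the polygon example concrete, here is a minimal sketch, assuming a made-up rectangle format and a toy character-grid "canvas" (the names `Rect` and `render` are hypothetical, not from any real graphics library). It shows two different drawings that render to exactly the same pixels, which is why no inverse of `render` can tell them apart.

```python
from typing import List, Tuple

# Hypothetical shape format: (x, y, width, height).
Rect = Tuple[int, int, int, int]


def render(rects: List[Rect], width: int = 6, height: int = 4) -> List[str]:
    """One-way arrow: rasterize rectangles onto a small character grid."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for x, y, w, h in rects:
        for row in range(y, min(y + h, height)):
            for col in range(x, min(x + w, width)):
                grid[row][col] = "#"
    return ["".join(row) for row in grid]


# Two distinct source drawings...
one_big = [(0, 0, 4, 2)]
two_small = [(0, 0, 2, 2), (2, 0, 2, 2)]

# ...that produce identical pixels, so the pixels cannot recover the source.
same_pixels = render(one_big) == render(two_small)
```

Any reconstruction from the pixels has to pick one of the candidate sources, and the choice between one rectangle and two is exactly the information the rendering threw away.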

Is that so bad? Why do we really need a roundtrip property anyway?

Well, the good thing about a roundtrip is that it gives us a crucial capability: backpropagation, that is, carrying an edit made on one representation back to the other. Here’s a classic one-way flow:

We have two representations, R1 and R2, and we start with the initial version of R1 at time t1. We obtain the corresponding R2 by computing the arrow. The broken arrow that goes from R2 back to R1 is information. Let’s say we’re dealing with a LaTeX document: R1 is LaTeX, R2 is PDF. We compile the LaTeX to PDF, view the PDF to see if there’s anything we don’t like, and note any changes we want to make to the document; that is the broken arrow going back to R1. We then update R1, recompile, and check whether the change really corresponds to the one we wanted. This process is pretty inefficient, as anyone who has worked on a large LaTeX document will tell you. I’ll call this the single-representation editability problem. What you ideally want is to be able to modify the representation that tells you what kind of edit you want, or what kind of information you want to present in the end result, like below:

Well, we can do this, right? Write your document, convert it to PDF, open up whatever free or cracked PDF editor is available these days, modify the PDF itself, and you get what you wanted in the first place. The unfortunate fact of the matter is that you had a reason for picking R1 (LaTeX) in the first place: a PDF editor is a terrible place to write a paper, and you really don’t want to do that. So eventually, when you need a non-minor change, you’ll want to go back to writing LaTeX. What happens to all the edits you made on the PDF then? They go up in smoke, as if they never existed. I’ll call this the single-lineage editability problem: you can edit many representations, but once you apply a one-way arrow to switch representations, you cannot go back; you have to continue in that representation.
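The single-lineage problem can be sketched in a few lines. This is a toy illustration under loud assumptions: `compile_doc` is a hypothetical stand-in for the LaTeX-to-PDF arrow (it just renders `*word*` markup as upper case), and plain strings stand in for both representations.

```python
def compile_doc(source: str) -> str:
    """Hypothetical one-way arrow R1 -> R2: render '*word*' as upper case."""
    words = []
    for word in source.split():
        if len(word) > 2 and word.startswith("*") and word.endswith("*"):
            words.append(word[1:-1].upper())
        else:
            words.append(word)
    return " ".join(words)


source_v1 = "a *tiny* paper"
output_v1 = compile_doc(source_v1)

# An edit made directly on R2 (the "PDF editor" step).
output_edited = output_v1.replace("paper", "article")

# Later, a non-minor change is made back in R1 and the document is recompiled.
source_v2 = source_v1 + " with *results*"
output_v2 = compile_doc(source_v2)

# The recompiled output is derived from R1 only, so the R2 edit is gone.
edit_survived = "article" in output_v2
```

Nothing in `compile_doc` knows about `output_edited`, so the only way to keep the edit is to redo it by hand in `source_v2`, which is the broken arrow from the earlier flow all over again.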

What would be really, really nice is if we could have two-way arrows. That would be amazing: we could make whichever edit we want on whichever representation is convenient to us, with no barriers or issues whatsoever.
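For intuition, here is what a two-way arrow looks like in a deliberately tiny case. This is a sketch under stated assumptions, not a model of LaTeX and PDF: the two representations are a list of key/value pairs and a canonical `key=value` text form, the function names `to_text` and `from_text` are made up, and keys are assumed to contain no `=` or line breaks (values no line breaks). Under those assumptions both directions are exact inverses, so an edit on either side survives a switch to the other.

```python
from typing import List, Tuple

Pairs = List[Tuple[str, str]]


def to_text(pairs: Pairs) -> str:
    """Arrow R1 -> R2: structured pairs to canonical 'key=value' lines."""
    return "".join(f"{key}={value}\n" for key, value in pairs)


def from_text(text: str) -> Pairs:
    """Arrow R2 -> R1: parse canonical lines back into pairs."""
    result = []
    for line in text.splitlines():
        key, value = line.split("=", 1)
        result.append((key, value))
    return result


pairs = [("title", "Draft"), ("author", "me")]
text = to_text(pairs)

# Edit on the text side, then switch back to the structured side.
pairs_after = from_text(text.replace("Draft", "Final"))

# Edit on the structured side, then switch to text again: both edits survive.
pairs_after.append(("year", "2026"))
text_after = to_text(pairs_after)

roundtrip_ok = from_text(to_text(pairs)) == pairs and to_text(from_text(text)) == text
```

The roundtrip only holds because neither side carries anything the other lacks; as soon as one representation drops or rewrites information, one of the two equalities in `roundtrip_ok` fails.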

Unfortunately, two-way arrows between such representation pairs are very rare. First, it is very much in the interest of a representation vendor not to offer a perfect roundtrip, because a perfect roundtrip lowers switching costs. Imagine a three-way version of the same interaction. Whenever the user wants to get away from R1, maybe because they don’t like the vendor, or because the R1 software is poorly compatible with their unstable Linux distribution, they can switch to a hypothetical R3 that is also compatible with R2 by simply going R1 -> R2 -> R3. If your representation (R1) is equivalent to the target one (R2), then anyone can build a compatible representation (R3) to compete with you. Vendors would also have to be willing to compromise on features and to open-source the representation in ways that allow you to build that compatibility layer. It is also a technically hard problem: a roundtrip requires you to carry all the information necessary for reconstruction, so your format is necessarily redundant; you carry not just what the document is, but also...
