Beagle: Git, URIs and all the dirty words

gritzko



Human authored

Git's basic model is a wonderfully simple system of blob trees and commit chains that one can explain in 5 minutes to anyone. Further up the stack, that wonderful simplicity devolves into a mess of commands and flags that even developers with 20 years of git experience have difficulty remembering.

That is doubly so when multi-tasking with LLMs. "I believe we implemented it on Tuesday, but it is not here. Where is it?" "Which branch corresponds to that remote?" And so on.

If only we had some universal language to address and access local and remote resources, files and locations in files! Oh wait, we have HTTP and URI, which are as standard as it gets. Those were specifically designed for this task. Supported in so many apps and libs. Can we apply that to git?

URIs

The URI layout we all remember by heart:

scheme: -- the access protocol / addressing scheme,

//authority -- most often the network host,

/path -- path in the remote filesystem,

?query -- other stuff (like arguments),

#fragment -- location within the document.

Can we retrofit that to a versioned store? Well, if all the versioning info goes into the query, the rest is obvious: http://somehost/dir/file?branch#L101 for example. In fact, Beagle is a git-compatible SCM doing exactly that.
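As a quick illustration, the example URI splits into those five parts with any standard URI parser. The sketch below uses Python's `urllib.parse`; the `describe` helper is a hypothetical name for this example and not part of Beagle, and reading the query as a branch and the fragment as a line follows the example URI, not a documented Beagle rule.

```python
from urllib.parse import urlsplit


def describe(uri: str) -> dict:
    """Split a URI into the five parts listed above."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,        # assumed here to carry the version (branch)
        "fragment": parts.fragment,  # assumed here to carry the location (line)
    }


if __name__ == "__main__":
    for name, value in describe("http://somehost/dir/file?branch#L101").items():
        print(f"{name:10} {value}")
```

Nothing git-specific is needed to take the address apart, which is the point: the versioning info simply rides in the query.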

HTTP verbs

The case of HTTP is more interesting. HTTP defines a vocabulary of verbs, among them HEAD, GET, PUT, POST, PATCH, DELETE. Most web pages get by with GET and POST alone. But there was some reason for the other verbs to exist, right?

GET "retrieves information"

HEAD is like GET, but no body

POST makes the server "accept the entity"

PUT requests the entity to be "stored"

DELETE does what it says

PATCH requests "changes" to be "applied"

While the vocabulary is a bit vague, fundamentally it grows out of the need to access a remote filesystem. That fits naturally the git model, which is, as described, a content-addressed filesystem. For that reason, Beagle uses the HTTP verbs exclusively.

Wait, but it only has PATCH? What about merge vs rebase?

Git's dirty words

There is always plenty of confusion around merge, rebase, squash, cherry-pick and all the related techniques of git-handling the twisted history of edits. Each command does several often unrelated things and each thing can be done by several commands, subtly differently.

Beagle decomposes those practices into a set of orthogonal operations, building on that wonderfully simple underlying model of git:

GET moves data from repo to worktree (including remotes)

HEAD is like GET's dry-run - fetch and report

POST moves data from worktree to repo (commits)

PUT only edits the reflog (sets branches/tags, stages)

DELETE is like PUT, but deletes

PATCH applies another version's changes to the worktree

As you might see, there is no way to substitute one operation for another: they are strictly orthogonal. Let's see how that applies to the pandemonium of merge/rebase/squash/cherry-pick.

Let's see what all git merge variants do:

they apply changes from a diverging commit or branch,

they reuse (rebase) or add a new (merge, squash) message,

they refer to the original (merge) or not (rebase, squash).

Consequently, we have 8 options: commit/branch, reuse/retitle, and refer/forget. In fact, only some of these 8 have git terms defined. For example, to squash we have to apply a diverging branch in its entirety, add a new commit message, and not refer to the original branch. To rebase, we apply separate commits, reuse the messages, and do not refer back. To merge, we apply all of a branch, add a new message, and refer back (the parent header).
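To make the counting concrete, here is a small Python sketch that enumerates the 2 x 2 x 2 combinations and labels the three that the paragraph above spells out. The tuple names (`SCOPE`, `MESSAGE`, `LINK`) and the `options` helper are made up for this illustration; the five unlabeled combinations are simply left without a git term, as the text does.

```python
from itertools import product

SCOPE = ("commit", "branch")      # apply one commit or the whole branch
MESSAGE = ("reuse", "retitle")    # keep the original message or write a new one
LINK = ("refer", "forget")        # record the original as a parent or not

# Only the three combinations described in the text are named here.
GIT_TERMS = {
    ("branch", "retitle", "forget"): "squash",
    ("commit", "reuse", "forget"): "rebase",
    ("branch", "retitle", "refer"): "merge",
}


def options():
    """Return all scope/message/link combinations with their git term, if any."""
    return [(combo, GIT_TERMS.get(combo)) for combo in product(SCOPE, MESSAGE, LINK)]


if __name__ == "__main__":
    for combo, term in options():
        print("/".join(combo), "->", term or "(no git term given above)")
```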

The way to express it in Beagle CLI:

    # arguments are quoted: an unquoted # starts a shell comment,
    # and ? is a glob character

    # rebase one commit: apply, post
    be patch '?feature'
    be post '#!'

    # merge a branch: apply all, post with a new message
    be patch '?feature!'
    be post '#merge the feature'

    # squash a branch
    be patch '?feature!'
    be post '#add a new feature!'

    # rebase the entire branch
    while be patch '?feature'; do
        make && make test && be post '#!'
    done

    # cherry pick one commit
    be patch '#391a0d33'
    be post '#!'

Here we use the bang modifier to:

'?branch!' apply the entire branch (default: one commit),

'#message!' don't link the original commit (the parent ref).

Note: when we supply no message, the original one gets reused. We may keep message/author but drop the original commit: #!.
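The bang rules above are compact enough to sketch as a tiny parser. This is only an illustration of the rules as stated here, not Beagle's actual argument handling: `Arg`, `parse_arg` and `explain` are hypothetical names, and the sketch assumes the bang is always the last character of the argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arg:
    kind: str    # "version" for ?..., "message" for #...
    value: str   # branch name or message text; may be empty
    bang: bool   # True when the argument ends with the bang modifier


def parse_arg(text: str) -> Arg:
    """Parse one '?branch[!]' or '#message[!]' argument (hypothetical helper)."""
    kinds = {"?": "version", "#": "message"}
    if not text or text[0] not in kinds:
        raise ValueError(f"expected '?...' or '#...', got {text!r}")
    body = text[1:]
    bang = body.endswith("!")
    if bang:
        body = body[:-1]  # strip the modifier, keep the rest
    return Arg(kinds[text[0]], body, bang)


def explain(arg: Arg) -> str:
    """Describe the argument using the rules listed above."""
    if arg.kind == "version":
        scope = "the entire branch" if arg.bang else "one commit"
        return f"apply {scope} of {arg.value}"
    message = f"new message {arg.value!r}" if arg.value else "reuse the original message"
    link = "do not link the original commit" if arg.bang else "link the original commit"
    return f"{message}, {link}"


if __name__ == "__main__":
    for raw in ("?feature", "?feature!", "#merge the feature", "#add a new feature!", "#!"):
        print(f"{raw!r:24} {explain(parse_arg(raw))}")
```

Read this way, `#!` is just the empty message plus the bang: reuse the original message, drop the link.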

Branch rebase here may only happen as a cycle, because we make as many posts as there are commits. This also ensures that all the committed revisions build and pass the tests.

FAQ

So, how is PUT different from POST?

POST does commit and/or fast-forward. PUT resets a branch or marks a file for commit/removal (reflog-only operations).

How does that compare to the URIs git uses?

git only uses URIs to access repos, e.g.
git://github.com/gritzko/beagle.git
That is very limiting, so we want to extend that addressing scheme to access files, revisions, locations in files.

How does that compare to GitHub URIs?

GitHub...

