BQN: Primitive Overloading

Primitive overloading

The expression 3↕↕6 uses ↕ twice to do two completely different things (Windows and Range). This is the dark side of overloading, the practice of stuffing one symbol with multiple meanings. Should a new array language use overloading? Should BQN have used it, for that matter?
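A minimal sketch of the two meanings, evaluated separately. The comments give the expected results in BQN's list notation rather than REPL display format.

```bqn
↕6          # Range (one argument): ⟨ 0 1 2 3 4 5 ⟩
3↕↕6        # Windows (two arguments): every length-3 slice of that list,
            # a 4×3 array with rows 0 1 2, 1 2 3, 2 3 4, 3 4 5
≢3↕↕6       # Shape of the result: ⟨ 4 3 ⟩
```

The same glyph is read as Range or Windows purely from whether it has a left argument.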

The easy answer is that it's better for BQN to use fewer special characters, so packing more meaning into each one makes sense. If all the awkward pairs were split up, we'd need another modifier key to fit them on the keyboard! K, which sticks to ASCII and mostly uses one character per primitive, overloads primitives more often and in more ways, while Uiua, whose stack-based syntax forces fixed-arity primitives, avoids the keyboard constraint by converting typed names into glyphs. Besides this, the number of nice symbols in Unicode is much smaller than you'd expect, so I didn't have any better ideas than using, say, ≠ for Length and ≢ for Shape. In a language that isn't tied to Unicode symbols, I'd definitely recommend splitting up primitives where the meanings don't fit together. Q's convention of keywords for monadic primitives is worth mentioning, and Klong is a nice effort with multi-symbol primitives.

BQN does get rid of a fair amount of APL overloading. Every modifier is best described as one thing with an optional left argument, with some looseness for ˜⊸⟜. The numerous meanings of . are reduced to two, the decimal in numbers and namespace field reference. The function-operator overloads /⌿\⍀ are separated out. Parentheses aren't used for lists like in K or recent Dyalog, which avoids ambiguity with 0- and 1-element lists.

But other things could be split up and aren't. The list below gives a few kinds of overloading: each presents the question of whether it should be considered one thing or two, which another language might answer differently.

⌽: reverse versus rotate?

-: negate versus subtract?

↕number versus ↕list?

↑: one axis versus many?

/boolean versus /integer?
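To make the list concrete, here is a small sketch with one pair of calls per item. The argument values are arbitrary, chosen only for illustration, and the comments give the expected results.

```bqn
⌽ "abcde"            # Reverse: "edcba"
2 ⌽ "abcde"          # Rotate left by 2: "cdeab"

- 5                  # Negate: ¯5
7 - 5                # Subtract: 2

↕ 4                  # Range of a number: ⟨ 0 1 2 3 ⟩
↕ 2‿3                # Range of a list: a 2×3 array of index pairs

2 ↑ 3‿4⥊↕12          # Take along the first axis only: shape ⟨ 2 4 ⟩
1‿2 ↑ 3‿4⥊↕12        # Take along two axes: shape ⟨ 1 2 ⟩

/ 1‿0‿1‿1            # Boolean argument: indices of the 1s, ⟨ 0 2 3 ⟩
/ 2‿0‿1              # Integer argument: each index repeated that many times, ⟨ 0 0 2 ⟩
```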

Overloading is built deep into APL in a few ways. I think Iverson came to view packing more functionality into the same "space" as a fundamentally good thing; he developed this trend in papers published at I.P. Sharp, with J as the maximalist culmination. I think J should have cut back on new primitives and used the extra space given by its spelling system to draw connections, but it's packed with overloading instead, such as +: meaning double monadically and nor dyadically (the second meaning is obscure enough that I avoided it when writing J code). At the same time, when overloading is used well it can make primitives easier to remember and use. Let's start at the bottom of the list, with the most defensible form of overloading.

Equivalent overloading

APL isn't really a "one obvious way to do it" language in the sense that Python is, but it does follow a principle I'd describe as "one way is enough". That means that if APL already has a way to represent some data or a computation, it won't add another without a concrete benefit like shorter or faster code. This is why APL booleans are a kind of integer (a decision I defend elsewhere), and why it has one array datatype instead of various kinds of collection or a separate string type.

This means that something like the number 1 can mean many things, like an index or a count or a boolean, and the replicate function / might mean repeating or filtering. It's overloading, but it's a very consistent form because the mathematical description of what's going on in either case is the same. But it's not the only way—some statically-typed languages like Java and Haskell prefer to declare classes to split things up, so that the type system can check things for the user. An extreme example is a system that takes user input but needs to sanitize or escape it before passing it to certain functions. The APL way would be to represent both unsafe and safe input as strings, which is obviously dangerous.
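A short sketch of that consistency, using made-up sample data: the same Replicate call filters when the left argument is boolean and repeats when it holds larger counts, and a boolean list can be summed like any other list of numbers.

```bqn
1‿0‿1‿1 / "abcd"     # counts of 0 or 1: filtering, gives "acd"
2‿0‿1‿3 / "abcd"     # larger counts: repeating, gives "aacddd"

+´ 1‿0‿1‿1           # booleans are just numbers, so summing counts the 1s: 3
1 ⊑ "abcd"           # the same 1 used as an index: 'b'
```

In both Replicate calls the rule is identical: each element of the right argument is copied as many times as the matching number on the left says.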

However, the advantage of representing everything in a consistent format is that methods that work on one thing tend to work on many things. Want to reverse a string? It's just ⌽. Defining boolean negation ¬𝕩 more generally as 1-𝕩 makes some arithmetic more obvious, for example +´¬l is (≠l)-+´l. And connections can be made at a higher level too: if you learn a rule like a⊏b⊏c ←→ (a⊏b)⊏c, that applies for every meaning of ⊏. As long as a and b are flat arrays, that is, which highlights a conflict between this sort of compatible overloading and other sorts of extension.
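The identities in this paragraph can be checked on small inputs. This is only a sketch: the names l, a, b and c follow the text, but the values assigned to them are arbitrary examples.

```bqn
⌽ "hello"            # reversing a string is ordinary Reverse: "olleh"

l ← 1‿0‿1‿1‿0
¬ l                  # same as 1-l: ⟨ 0 1 0 0 1 ⟩
+´ ¬ l               # number of 0s in l: 2
(≠l) - +´ l          # length minus number of 1s, 5-3: also 2

a ← 2‿0
b ← 3‿1‿0‿2
c ← "wxyz"
a ⊏ b ⊏ c            # b⊏c is "zxwy", then selecting 2‿0 gives "wz"
(a ⊏ b) ⊏ c          # a⊏b is ⟨ 0 3 ⟩, then selecting from c gives "wz"
```

Here a and b are flat lists of indices, which is the condition the text places on the rule.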

Extensions

Heading further into the woods, we see that the APL family extends functions in ways that don't follow strictly from the definition. Usually this happens within a single primitive. For example, First (monadic ⊑) only applies naturally to an array, since otherwise there's no first element. But when given an atom it returns it unchanged, effectively treating it as a...
