A Dialog on APL (2015)

A Dialog on APL | Dyalog Blog

-->

Home

Business

How We Can Help

Products

Services

Prices and Licences

Resellers

Application Development Partners

Reference Customers

Case Studies

Learning

Getting Started

Mastering Dyalog APL

Training

Dyalog in Education

TryAPL

2016 Year Game

Student Competition

Community

Sharing Ideas

Dyalog Forum

GitHub

User Meetings

User Groups

Blog

APL Wiki

Resources

Resource Map

Documentation

Tools and Interfaces

Download Dyalog

Support

Videos

Tutorials and Samples

Fonts and Keyboards

External Publications

Dyalog Papers

News

Event Calendar

Newsletters

About Us

Our Mission

Our Values

Meet Team Dyalog

The Dyalog Duck

Careers

Corporate

History

Products

Dyalog Services

Prices and Licences

Support (DSS)

Download Dyalog – Free

Dyalog '24

A discussion between Nicolas Delcros and Roger Hui

Nicolas, Prologue : From a language point of view, thanks to Ken Iverson, it is obvious that you want grade rather than sort as a primitive. Yet from a performance point of view, sort is currently faster than grade.

Can one be "more fundamental" than the other? If so, who’s wrong…APL or our CPUs? In any case, what does "fundamental" mean?

Roger : Sorting is faster than grading due to reasons presented in the J Wiki essay Sorting versus Grading, in the paragraph which begins "Is sorting necessarily faster than grading?" I can not prove it in the mathematical sense, but I believe that to be the case on any CPU when the items to be sorted/graded are machine units.

Nicolas, Formulation 1: Now, parsing ⍵[⍋⍵] (and not scanning the idiom) begs the question of how deeply an APL intepreter can "understand" what it’s doing to arrays.

How would an APL compiler resolve this conjunction in the parse tree? Do you simply have a bunch of state pointers such as "is the grade of" or "is sorted" or "is squozen" or "axis ordering" etc. walking along the tree? If so, do we have an idea of the number of such state pointers required to exhaustively describe what the APL language can do to arrays? If not, is there something more clever out there?

Roger: I don’t know of any general principles that can tell you what things can be faster. I do have two lists, one for J and another for Dyalog. A big part of the lists consists of compositions of functions, composition in the mathematical sense, that can be faster than doing the functions one after the other if you "recognize" what the composed function is doing and write a native implementation of it. Sort vs. grade is one example (sort is indexing composed with grade). Another one is (⍳∘1 >) or (1 ⍳⍨ >). The function is "find the first 1" composed with >. These compositions have native implementations and:

x←?1e6⍴1e6

cmpx '5e5(1 ⍳⍨ >)x' '(5e5>x)⍳1' 5e5(1 ⍳⍨ >)x → 0.00E0 | 0% (5e5>x)⍳1 → 1.06E¯2 | +272100% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

cmpx '¯1(1 ⍳⍨ >)x' '(¯1>x)⍳1' ¯1(1 ⍳⍨ >)x → 2.41E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ (¯1>x)⍳1 → 4.15E¯3 | +71% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

If you get a "hit" near the beginning, as would be the case with 5e5, you win big. Even if you have to go to the end (as with ¯1), you still save the cost of explicitly generating the Boolean vector and then scanning it to the end.

Another one, introduced in 14.1, is:

cmpx '(≢∪)x' '≢∪x' (≢∪)x → 4.43E¯3 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ ≢∪x → 1.14E¯2 | +157% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

This is the tally nub composition, used often in a customer application. If you "know" that you just want the tally of the nub (uniques), you don’t actually have to materialise the array for the nub.

I am not conversant with compiler technology so I don’t know what all an APL compiler can do. I do know that there’s a thing call "loop fusion" where, for example, in a+b×c÷2, it doesn’t have to go through a,b,c in separate loops, but can instead do

:for i :in ⍳≢a ⋄ z[i]←a[i]+b[i]×c[i]÷2 ⋄ :endif

saving on the array temp on every step. You win some with this, but I think the function composition approach wins bigger. On the other hand, I don’t know that there is a general technique for function composition. I mean, what general statements can you make about what things can be faster (AKA algebraically simpler)?

Nicholas: I sort of see…so jot is a "direct" conjunction. An indirect conjunction could be ⍵[⍺⍴⍋⍵] where the intermediate grade is reshaped. We "know" that grade and shape are "orthogonal" and can rewrite the formula to ⍴⍵[⍋⍵].

So if we can establish a list of flags, and establish how primitives touch these flags, and how these flags affect each other, then we can extend ∘ to any path of the parse tree, provided that the intermediate nodes don’t destroy the relationship between the two operations (here grade + indexing provides sort, independently of the reshape).

Of course we can spend our lives finding such tricks. Or we could try and systemise it.

Roger: What you are saying above is that reshape and indexing commute (i.e. reshape ∘...

A Dialog on APL (2015)

Related Articles

The Newest Instagram "Exploit" Is the Goofiest I've Seen

It's Not Just X. It's Y

Amazon, Facebook, FBI have access to a private intelligence-sharing network

Show HN: GoPeek – open links in live mini browser windows without new tabs

Agent Memory: An Anatomy