Idempotent + Moving Window is simply a reduction | iabdb
I want to calculate the moving window max or min in a data parallel way.<br>Lets start with the two argument function max
max(x,y) returns the greater of x and y<br>max(x,x) returns x (idempotent)
max(5,3) return 5
max(max(5,3),max(5,3)) reduces to max(5,5) which just returns 5
If we want max of an entire list we can simply think of it as a reduction in the lisp/APL sense
max(head (x), (max (head (tail (x), max( head(tail(tail(x))), …. )
or in a more readable way, replace max with |, and insert it between every element in the list
x[0] | x[1] | x[2] | x[3] this is a standard reduction/over.
Here is a concrete example:
x:5 4 3 2 7 2 9 1
Max over(x) –> 9
|/x –> 9
We can look at the intermediate results
Max scan(x) –> 5 5 5 5 7 7 9 9
|\x –> 5 5 5 5 7 7 9 9
Now let’s say that we want to look at the 3 slice moving window.
Let’s take advantage of the fact that max(x,x) yields x, idempotent
We can calculate the max between each pair in our list. Read max each-prior (‘:)
(|’:)x –> 5 5 4 3 7 7 9 9
applying this function n-1 times gives us a moving max
so (|’:) (|’:) x is 3 slice moving window, which we can rewrite as<br>2 |’:/ x
5 5 5 4 7 7 9 9
We can see that |\x is equivalent to (count[x]-1) |’:/ x which is data parallel by construction. In other words, we are doing an adjacent transform.
To make this a bit clearer, we can show intermediate results by using a scan(\) instead of an over(/)
(count[x]-1)|’:\x
5 4 3 2 7 2 9 1<br>5 5 4 3 7 7 9 9<br>5 5 5 4 7 7 9 9<br>5 5 5 5 7 7 9 9<br>5 5 5 5 7 7 9 9<br>5 5 5 5 7 7 9 9<br>5 5 5 5 7 7 9 9<br>5 5 5 5 7 7 9 9
One thing we can notice is that we could terminate early, since we know that if an adjacent element did not change there is no way for the max value to propagate further.
Which means we can rewrite (count[x]-1)|’:\x to simply (|’:\)x
and indeed this gives us:
(|’:\)x
5 4 3 2 7 2 9 1<br>5 5 4 3 7 7 9 9<br>5 5 5 4 7 7 9 9<br>5 5 5 5 7 7 9 9
Technically this is still slower than maxs(x) but if we had gpu support for each-prior(‘:) we could calculate maxs in n calls to max prior which has a parallelization factor of n/2. Depending on the range of x, the n calls might be bound significantly such that for small ranges, for example you are dealing with a short int (8 bytes), you can effectively compute the maxs in O(c *n/2) time with parallelism. Where c is a function of the effective range of your inputs — max[x]-min[x].
Now let’s get back to the problem of max in a sliding window, like this leetcode problem. This problem is classified as hard and the reason is that leetcode doesn’t want you to use parallelism to solve sliding window, it wants you to do it in O(n) time directly.
We know we can solve the problem in O(n) time when the window is equal to 1,2, and n.<br>identity({x}), max each prior ({|’:x}), and maxs ({|\x}) respectively.
The key to solving it for all window sizes is to recognize that each entry only depends on itself and its neighbors and that the result is the same if you duplicate neighbors – again idempotentcy comes to the rescue. max(1 2 3) is equal to max(1 1 2 2 3 3). Another good explanation can be found here. This is technically a dynamic programming approach with two pointers, but I have never seen it explained in what I think is the most straightforward way.<br>Let’s borrow the example from leetcode and work through what should happen.
Suppose list of numbers n=1 4 3 0 5 2 6 7 and window size k=3. Then reusing the prior code to show intermediate steps, we get:
1 4 3 0 5 2 6 7<br>1 4 4 3 5 5 6 7<br>1 4 4 4 5 5 6 7
If we look at a particular slot, 0 for example, it gets overtaken by 4, but the 5 slot only needs to compete with 3. In other words, each element is looking at at most n-1 elements to the right and n-1 elements to the left. So we can rearrange n to do this in a natural way. Let’s reshape n, so that there are k columns. q supports ragged arrays, so our last row is not the same length which is nice.
q)(0N;k)#n<br>1 4 3<br>0 5 2<br>6 7
We can then compute the maxs of each row
q)maxs each (0N;k)#n<br>1 4 4<br>0 5 5<br>6 7
And flatten it back:
q)l:raze maxs each (0N;k)#n<br>q)l<br>1 4 4 0 5 5 6 7
This gives us our left window. Now we will repeat the steps but instead of going from the left we will go from the right.<br>The simplest way go from the right is to reverse the list apply the function as normal from the left and reverse the list again. In J this operation is called under, as in going under anesthesia, perform the operation and then wake you from the anesthesia. This is equivalent to looking at the cummulative max from the right or walking backward through the array.
q)under:{[f;g](f g f ::)}<br>q)(reverse maxs reverse)~under[reverse;maxs] /this is the same<br>1b
q)(reverse maxs reverse ::) each (0N;k)#n<br>4 4 3<br>5 5 2<br>7 7
Now we just need to flatten the list and call it r.
q)r:raze (reverse maxs reverse ::) each (0N;k)#n
Because r is going from the right, we need to shift it by k-1 elements to the right so that...