One Problem, Four Languages, Two Paradigms (2021)

One Problem, Four Languages, Two Paradigms - Asher's Blog

Keyboard shortcuts

Press ← or → to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Auto

Light

Rust

Coal

Navy

Ayu

Asher's Blog

One Problem, Four Languages, Two Paradigms

2021-10-19

Solving a leetcode problem in four programming languages using two acceleration paradigms!

NOTE: This post is a transcript of the youtube video linked here.

In addition to that, we’ll be using various combinations of two programming paradigms common in distributed computing: using a GPU to perform some calculations and MPI to distribute our calculation among multiple processes, potentially on multiple machines.

We’ll be looking at this leetcode problem, which is to determine if a 9 x 9 Sudoku board is valid, but not necessarily solvable. Each row, column, and subbox of the grid must have the digits 1-9.

Let’s jump right in to our BQN solution.

Content

One Problem, Four Languages, Two Paradigms

Content

BQN

Approach

Python

Python And MPI

C++

C++ And MPI

C++ And CUDA

C++ And CUDA And MPI

Fortran

Fortran And MPI

Conclusion

YouTube Description

BQN

This is what our sudoku boards will look like:

# two 8s in the first block bad ← ⟨8, 3, 0, 0, 7, 0, 0, 0, 0 6, 0, 0, 1, 9, 5, 0, 0, 0 0, 9, 8, 0, 0, 0, 0, 6, 0 8, 0, 0, 0, 6, 0, 0, 0, 3 4, 0, 0, 8, 0, 3, 0, 0, 1 7, 0, 0, 0, 2, 0, 0, 0, 6 0, 6, 0, 0, 0, 0, 2, 8, 0 0, 0, 0, 4, 1, 9, 0, 0, 5 0, 0, 0, 0, 8, 0, 0, 7, 9⟩

# valid sudoku good ← ⟨5, 3, 0, 0, 7, 0, 0, 0, 0 6, 0, 0, 1, 9, 5, 0, 0, 0 0, 9, 8, 0, 0, 0, 0, 6, 0 8, 0, 0, 0, 6, 0, 0, 0, 3 4, 0, 0, 8, 0, 3, 0, 0, 1 7, 0, 0, 0, 2, 0, 0, 0, 6 0, 6, 0, 0, 0, 0, 2, 8, 0 0, 0, 0, 4, 1, 9, 0, 0, 5 0, 0, 0, 0, 8, 0, 0, 7, 9⟩

And here is our full solution. This solution will be the basis for all of our later solutions.

F ← {𝕊𝕩: Fl ← 0⊸≠⊸/ # Filter 0s out Dup ← (∨´∾´)¬∘∊¨ # Are there any duplicates?

rs ← Dup Fl¨(9/↕9)⊔𝕩 # Check rows cs ← Dup Fl¨(81⥊↕9)⊔𝕩 # Check columns

bi ← 27⥊3/↕3 bs ← Dup Fl¨(bi∾(3+bi)∾(6+bi))⊔𝕩 # Check blocks

(bs ∨ rs ∨ cs)⊑"true"‿"false"

This first line is a function to filter out any 0s:

Fl ← 0⊸≠⊸/ Fl ⟨5, 3, 0, 0, 7, 0, 0, 0, 0⟩ ⟨ 5 3 7 ⟩

Here we have another utility function to return an integer indicating whether any duplicates were found in any sublists:

Dup ← (∨´∾´)¬∘∊¨ Dup ⟨⟨5, 3, 7⟩, ⟨1, 2, 3⟩⟩ Dup ⟨⟨5, 3, 7⟩, ⟨1, 2, 2⟩⟩

Next we check for duplicates in all the filtered rows and columns:

rs ← Dup Fl¨(9/↕9)⊔𝕩 cs ← Dup Fl¨(81⥊↕9)⊔𝕩

These ranges are used to create indices for grouping the values in X. I’ll show a trimmed down version of their output here to give you an idea:

3‿3⥊(3/↕3) # For the rows ┌─ ╵ 0 0 0 1 1 1 2 2 2 3‿3⥊(9⥊↕3) # For the columns ┌─ ╵ 0 1 2 0 1 2 0 1 2

Next I do something similar to get the indices for the boxes.

bi ← 27⥊3/↕3 3‿9⥊bi ┌─ ╵ 0 0 0 1 1 1 2 2 2 0 0 0 1 1 1 2 2 2 0 0 0 1 1 1 2 2 2

This creats indices for the first three boxes, and you can probably imagine how to extend this to get the indices for all the boxes. I just add three to the previous indices, and then add six, and then append them all together. Here’s the second layer:

3‿9⥊bi+3 ┌─ ╵ 3 3 3 4 4 4 5 5 5 3 3 3 4 4 4 5 5 5 3 3 3 4 4 4 5 5 5

And the final layer:

3‿9⥊bi+6 ┌─ ╵ 6 6 6 7 7 7 8 8 8 6 6 6 7 7 7 8 8 8 6 6 6 7 7 7 8 8 8

And all three layers of indices stacked on top of each other:

9‿9⥊(bi∾(3+bi)∾(6+bi)) ┌─ ╵ 0 0 0 1 1 1 2 2 2 0 0 0 1 1 1 2 2 2 0 0 0 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 3 3 3 4 4 4 5 5 5 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 6 6 6 7 7 7 8 8 8 6 6 6 7 7 7 8 8 8

Using these indices, I group all the elements of the input, and then check all of them for duplicates:

bs ← Dup Fl¨(bi∾(3+bi)∾(6+bi))⊔𝕩 # Check blocks

And in the end I check if there were duplicates in the blocks, in the rows, or in the columns, and use that to index into our strings that indicate whether our sudoku board is valid or not.

(bs ∨ rs ∨ cs)⊑"true"‿"false"

Approach

Before we move on to the Python solution, I’d like to talk about our approach to this solution in the rest of the languages, because they will all be pretty similar.

Just like in the BQN solution, we have three collections which represent the validity of the rows, another for the columns, and a third for the blocks.

Here I have a subset of a sudoku board on the bottom.

In our procedural languages, we’ll create an array thrice the size of the grid to hold these values.

Note that this is not as space (or time) efficient as many of the solutions that you can find on the discussion page for the leetcode problem, but it is much easier to parallelize and that’s really the point of this video.

Let’s now walk through a few steps of our algorithm starting at the second row and first column of our sudoku board, which relates to the second row of our “row matrix.”

Because we’re looking at our row matrix, we’ll take the row index in our sudoku board as the row for our row matrix, and we’ll take the value in...

One Problem, Four Languages, Two Paradigms (2021)

Related Articles

Elevated error rates on requests to multiple models

Donald Trump and sons to be 'forever' exempt from tax audits

PopuLoRA: Co-Evolving LLM Populations for Reasoning Self- Play

Old Reddit Is Down

The ultimate female fantasy – A feminist critique of Beauty and the Beast