jndean/gpusnek: GPU-Parallelizing Arbitrary Python Code By Running 1 Million Python Interpreters on a GPU 🐍
gpusnek/
.gitignore
Makefile
README.md
example_allreduce.cu
example_repl.cu
example_sum_for_profiling.cu
example_test.cu
gpusnek_whitepaper.pdf
logo.png
utils_for_examples.h
Read the "whitepaper" (gpusnek_whitepaper.pdf in this repository).
gpusnek answers the question "What would it look like to be able to inline arbitrary Python code into your high-performance CUDA kernels, with no consideration for why that is a bad idea?".
This repository implements a full Python interpreter that can run on one GPU thread (or in parallel on many). It even includes the Python lexer, parser and bytecode compiler.
We take the source code from MicroPython, ram it through nvcc (NVIDIA's CUDA compiler), and fix most of the things which break.
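For readers unfamiliar with the stages named above, here is a minimal host-side sketch of the lexer, parser, bytecode compiler and interpreter steps. It uses CPython's standard library as a stand-in; it is not MicroPython and not gpusnek's code, and `run_stages` and `SOURCE` are hypothetical names chosen for illustration.

```python
import ast
import io
import tokenize

# Hypothetical snippet to push through each stage.
SOURCE = "total = sum(i * i for i in range(4))\n"


def run_stages(source):
    """Run source text through lexing, parsing, compilation and execution.

    This is CPython on the host, used only to illustrate the pipeline
    that gpusnek carries onto the GPU via MicroPython.
    """
    # 1. Lexer: source text -> token stream.
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # 2. Parser: source text -> abstract syntax tree.
    tree = ast.parse(source)
    # 3. Bytecode compiler: syntax tree -> code object.
    code = compile(tree, "<snippet>", "exec")
    # 4. Interpreter: execute the code object in a fresh namespace.
    namespace = {}
    exec(code, namespace)
    return tokens, tree, code, namespace


tokens, tree, code, namespace = run_stages(SOURCE)
print(namespace["total"])
```

In gpusnek all four stages run inside a GPU thread, so each thread can start from Python source text rather than precompiled bytecode.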
Examples include:
Running 1 Million Python interpreters on a consumer GPU and using them in an interactive REPL.
Communicating between CUDA threads by using Python to read from and write to a shared virtual filesystem living in VRAM.
Other such nonsense.
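The filesystem-based communication in the list above can be pictured with a toy model in plain Python running on the host. Everything here is an assumption made for illustration: `ToyVfs`, the `/partial/<id>` and `/result` paths, and the helper functions are hypothetical and do not reflect gpusnek's actual API or file layout. Each simulated "thread" writes a partial value to its own file, and a reduction step reads every file back and publishes the sum.

```python
class ToyVfs:
    """Hypothetical in-memory stand-in for a shared virtual filesystem."""

    def __init__(self):
        self._files = {}

    def write(self, path, text):
        self._files[path] = text

    def read(self, path):
        return self._files[path]


def worker_write(vfs, thread_id, value):
    # Each simulated thread publishes its partial result in its own file.
    vfs.write(f"/partial/{thread_id}", str(value))


def allreduce_sum(vfs, num_threads):
    # Read every partial result back and publish the total where all
    # threads could read it.
    total = sum(int(vfs.read(f"/partial/{t}")) for t in range(num_threads))
    vfs.write("/result", str(total))
    return total


def run(num_threads):
    vfs = ToyVfs()
    for t in range(num_threads):
        worker_write(vfs, t, t * t)  # toy per-thread value: thread_id squared
    allreduce_sum(vfs, num_threads)
    return int(vfs.read("/result"))


print(run(4))
```

The loop runs the simulated threads one after another, so this sketch sidesteps the synchronization a real parallel version would need between the write and read phases.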
    # Assuming you have CUDA development tools set up
    make TARGET=cuda -j
    ./example_allreduce
You can also build with TARGET=host, which is useful for checking you haven't broken anything :)