PyTorch Backend & Autograd - KlongPy
PyTorch Backend and Autograd¶
KlongPy supports multiple array backends. The PyTorch backend enables GPU acceleration and automatic differentiation (autograd) for gradient-based computations.
Enabling the PyTorch Backend¶
Command Line¶
# Use --backend flag
kgpy --backend torch
# With GPU device selection
kgpy --backend torch --device cuda
Programmatically¶
from klongpy import KlongInterpreter
# Create interpreter with torch backend
klong = KlongInterpreter(backend="torch")
print(klong._backend.name)  # 'torch'
# With specific device
klong = KlongInterpreter(backend="torch", device="cuda")
Backend Comparison¶
| Feature | NumPy Backend | PyTorch Backend |
| --- | --- | --- |
| Default | Yes | No (use --backend torch) |
| Object dtype | Yes | No |
| String operations | Yes | Not supported |
| GPU acceleration | No | Yes (CUDA/MPS) |
| Autograd | Numeric only | Native autograd |
| Small array performance | Faster | Slightly slower |
| Large array performance | Good | Better (especially on GPU) |
Performance¶
The torch backend excels with large arrays:
Benchmark          NumPy    Torch    Winner
vector_add_100K    0.04ms   0.08ms   NumPy (2x)
vector_add_1M      0.36ms   0.07ms   Torch (5x)
compound_expr_1M   0.61ms   0.07ms   Torch (8x)
grade_up_100K      0.59ms   0.19ms   Torch (3x)
For small arrays, the NumPy backend is faster.
Automatic Differentiation¶
KlongPy provides several gradient and differentiation operators:
Typing Special Characters¶
| Symbol | Name | Mac | Windows |
| --- | --- | --- | --- |
| ∇ | Nabla | Character Viewer (Ctrl+Cmd+Space) | Alt+8711 |
| ∂ | Partial | Option + d | Alt+8706 |
On Mac, ∂ can be typed directly with Option + d. For ∇, use the Character Viewer or copy-paste.
:> Autograd Operator (Recommended)¶
With the torch backend, the :> operator uses PyTorch autograd to compute exact gradients; on other backends it falls back to numeric differentiation:
f::{x^2} :" Define f(x) = x^2
f:>3 :" Compute f'(3) = 6.0
The syntax is function:>point where:
- function is a scalar-valued function (must return a single number)
- point is the input at which to compute the gradient
∇ Numeric Gradient Operator¶
The ∇ operator always uses numeric differentiation (finite differences), regardless of backend:
f::{x^2} :" Define f(x) = x^2
3∇f :" Compute f'(3) ≈ 6.0
The syntax is point∇function (note: reversed order from :>).
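To make "numeric differentiation" concrete, here is a minimal plain-Python sketch of a central finite difference. It does not call KlongPy, and the source does not say which difference scheme or step size ∇ uses; the helper name `numeric_derivative` and the step `h` are assumptions chosen for illustration only.

```python
def numeric_derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x).

    h is an assumed step size, not a value taken from KlongPy.
    """
    return (f(x + h) - f(x - h)) / (2.0 * h)


def f(x):
    # Same function as the Klong example: f(x) = x^2
    return x ** 2


approx = numeric_derivative(f, 3.0)
print(approx)  # an approximation of f'(3) = 6
```

The result is an approximation: it depends on `h` and on floating-point rounding, which is why the numeric operator is described with ≈ rather than =.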
How They Work¶
| Operator | Method | Precision | Speed |
| --- | --- | --- | --- |
| :> with torch | PyTorch autograd | Exact | Fast |
| :> without torch | Numeric | ~1e-6 error | Slower |
| ∇ (any backend) | Always numeric | ~1e-6 error | Slower |
With the torch backend (--backend torch or backend='torch'), prefer :> for:
- Exact gradients (no finite-difference approximation error)
- Complex computational graphs
- Better performance on large arrays
Examples¶
Scalar function:
f::{x^3} :" f(x) = x^3
f:>2 :" f'(2) = 3*4 = 12.0
Polynomial:
p::{((3*x^4)-(2*x^2))+x} :" p(x) = 3x^4 - 2x^2 + x
p:>1 :" p'(1) = 12 - 4 + 1 = 9.0
Vector function (sum of squares):
g::{+/x^2} :" g(x) = sum(x_i^2)
g:>[1.0 2.0 3.0] :" [2 4 6] = 2*x
Gradient descent:
f::{x^2}
x::5.0
lr::0.1
:" Update rule: x = x - lr * grad
x::x-(lr*f:>x)
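Repeating the update rule drives x toward the minimum of f at 0. The plain-Python sketch below mirrors that loop without KlongPy; `grad_f` is a hypothetical stand-in for f:>x that uses the analytic derivative 2x, and the step count of 50 is an arbitrary choice for illustration.

```python
def grad_f(x):
    # Analytic derivative of f(x) = x^2; stands in for the Klong expression f:>x
    return 2.0 * x


x = 5.0    # starting point, as in the Klong example
lr = 0.1   # learning rate, as in the Klong example
steps = 50  # assumed number of iterations

for _ in range(steps):
    # Update rule: x = x - lr * grad
    x = x - lr * grad_f(x)

print(x)  # a small positive number close to 0
```

With this learning rate each step multiplies x by 1 - 2*lr = 0.8, so x shrinks geometrically toward 0.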
Multi-Parameter Gradients¶
Compute gradients for multiple parameters simultaneously using a list of symbols:
w::2.0
b::3.0
loss::{(w^2)+(b^2)}
:" Compute gradients for both w and b
grads::loss:>[w b] :" [4.0 6.0] = [2w, 2b]
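This is especially useful when training models such as neural networks, where the loss depends on several parameters. For example, fitting a linear model: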
w::1.0
b::0.0
X::[1 2 3]
Y::[3 5 7]
:" MSE loss
loss::{(+/((w*X)+b-Y)^2)%3}
:" Compute both gradients in one call
grads::loss:>[w b]
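As a cross-check on the values an exact gradient should give for this loss, the partial derivatives can be written out by hand: with residuals r_i = w*x_i + b - y_i, dL/dw = (2/n) * sum(r_i * x_i) and dL/db = (2/n) * sum(r_i). The plain-Python sketch below evaluates these formulas without KlongPy; the function name `mse_grads` is hypothetical.

```python
X = [1.0, 2.0, 3.0]
Y = [3.0, 5.0, 7.0]


def mse_grads(w, b):
    """Hand-derived gradients of mean((w*X + b - Y)^2) with respect to w and b."""
    n = len(X)
    residuals = [(w * x + b) - y for x, y in zip(X, Y)]
    grad_w = (2.0 / n) * sum(r * x for r, x in zip(residuals, X))
    grad_b = (2.0 / n) * sum(residuals)
    return [grad_w, grad_b]


grads = mse_grads(1.0, 0.0)
print(grads)  # roughly [-13.33, -6.0]
```

Both gradients are negative at w = 1, b = 0, so a gradient descent step increases both parameters, moving toward the line y = 2x + 1 that fits the data exactly.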
Jacobian Computation¶
Compute the Jacobian matrix (matrix of partial derivatives) using the ∂ operator or .jacobian() function:
f::{x^2} :" Element-wise square
:" Using ∂ operator (point∂function)
[1 2]∂f :" [[2 0] [0 4]] diagonal matrix
:" Using .jacobian() function
.jacobian(f;[1 2]) :" Same result
For vector-valued functions f: R^n -> R^m, the Jacobian is an m x n matrix where J[i,j] = df_i/dx_j.
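The definition J[i,j] = df_i/dx_j can be illustrated with a small plain-Python sketch that fills in the matrix one column at a time using central differences. This is only an illustration of the definition, not KlongPy's implementation; the names `numeric_jacobian` and `square` and the step size `h` are assumptions.

```python
def numeric_jacobian(f, x, h=1e-6):
    """Approximate the m x n Jacobian of f at x, where J[i][j] = df_i/dx_j."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th input to obtain column j
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return J


def square(x):
    # Element-wise square, as in the Klong example f::{x^2}
    return [v * v for v in x]


J = numeric_jacobian(square, [1.0, 2.0])
print(J)  # approximately [[2, 0], [0, 4]]
```

The off-diagonal entries are zero because each output of an element-wise function depends only on the input at the same position.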
Multi-Parameter Jacobians¶
Just like gradients, you can compute Jacobians with respect to multiple parameters using a list of symbols:
w::[1.0 2.0]
b::[3.0 4.0]
f::{w^2} :" Returns [w0^2, w1^2]
:" Compute Jacobians for both w and b
jacobians::[w b]∂f :" Returns [J_w, J_b]
This returns a list of Jacobian matrices, one per parameter. It is useful for analyzing how vector-valued functions depend on multiple parameter sets.
Custom Optimizers¶
KlongPy provides the gradient primitives (:>, ∂, .jacobian()) but no built-in optimizers. For optimizers, use the example classes in examples/autograd/optimizers.py, which you can copy into your project and customize.
Manual gradient descent (no optimizer needed):
w::10.0
loss::{w^2}
lr::0.1
:" Update rule: w = w - lr * gradient
{w::w-(lr*loss:>w)}'!50
w...