I made a CPU only spiking neuron network lib that comes pretty close to PyTorch

etoxin/neuronguard-wikipedia-classifier · Hugging Face

Neuromorphic Wikipedia Domain Classifier

This repository hosts the pre-trained vocabulary and synaptic weights for NeuronGuard, a proof of concept that pairs a cache-aligned Spiking Neural Network (SNN) with a neuromorphic event engine, written in Rust and exposed to Python.

Note: This was a proof of concept to see whether an event-driven, neuron-inspired neural model could work.

The model is trained on the 44.4 GB 'wikimedia/structured-wikipedia' dataset (10.4M articles total) to classify articles into 5 high-level domains:

0: Science & Technology

1: Geography & Places

2: Biography & People

3: History & Events

4: Arts & Culture

Using a streaming dataset pipeline (zero disk overhead) and community-curated Infobox template names for labeling, the model was trained on exactly 1,000,000 valid articles (samples) and achieved 93.14% accuracy on a 10,000-article test split, training in 15.85 seconds using local pre-filtered data on a standard Apple Silicon CPU.
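As a rough illustration of labeling from Infobox template names while streaming, the sketch below maps a template name to one of the five domain IDs and yields labeled samples one at a time. The keyword rules and the `infobox` and `text` record keys are assumptions made for this example; the actual rules and dataset schema used by NeuronGuard are not described here.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical keyword -> domain ID rules (first match wins).
# The real NeuronGuard mapping is not published in this card.
RULES = (
    ("software", 0),           # Science & Technology
    ("settlement", 1),         # Geography & Places
    ("person", 2),             # Biography & People
    ("military conflict", 3),  # History & Events
    ("album", 4),              # Arts & Culture
)


def label_from_infobox(template_name: str) -> Optional[int]:
    """Return a domain ID for an Infobox template name, or None if unknown."""
    key = template_name.lower().strip()
    if key.startswith("infobox"):
        key = key[len("infobox"):].strip()
    for keyword, label in RULES:
        if keyword in key:
            return label
    return None


def labeled_stream(articles: Iterable[Dict[str, str]]) -> Iterator[Tuple[str, int]]:
    """Yield (text, label) pairs lazily, skipping articles without a usable label."""
    for article in articles:
        label = label_from_infobox(article.get("infobox", ""))
        if label is not None:
            yield article["text"], label
```

Because `labeled_stream` is a generator, only the current article is held in memory, which is the property a streaming pipeline relies on.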

Performance Metrics (Standard Apple Silicon CPU)

Training Set Size : 1,000,000 valid articles (samples)

Model Size : 320 KB (synaptic weights and vocabulary file)

Training Time : 15.85 seconds (using local pre-filtered data)

Inference Latency : Microseconds (~5–15 µs per article)

Memory Footprint :

Accuracy : 93.14% (evaluated on a 10,000-article test split)

Comparison: NeuronGuard vs. PyTorch (1,000,000 Samples)

To validate the efficiency of NeuronGuard's neuromorphic architecture, we ran a head-to-head comparison against a standard PyTorch Multi-Layer Perceptron (MLP) on the full Wikipedia dataset (1,000,000 training samples, 10,000 test samples, and a 10,000-word vocabulary).

Head-to-Head Results (Apple Silicon M-Series CPU/GPU)

| Metric | NeuronGuard (SNN) | PyTorch (MLP, 1 Epoch) | PyTorch (MLP, 10 Epochs) | Trade-Off / Insight |
|---|---|---|---|---|
| Hardware Used | Single CPU Core | Apple Silicon MPS GPU | Apple Silicon MPS GPU | NeuronGuard achieves line-rate speed without needing GPU acceleration. |
| Training Time | 15.85 seconds | 16.56 seconds | 163.84 seconds | NeuronGuard is 10.3x faster than PyTorch (10 epochs) on a single CPU core. |
| Data Loading Overhead | 0.00 seconds (on-the-fly) | 16.17 seconds | 16.17 seconds | NeuronGuard trains directly on the stream, bypassing memory loading overhead. |
| Total Pipeline Time | 15.85 seconds | 32.73 seconds | 180.01 seconds | NeuronGuard is about 2.1x to 11.4x faster overall from cold start to fully trained. |
| Model Size | ~320 KB | 2.44 MB | 2.44 MB | NeuronGuard's model is 7.6x smaller, making it ideal for edge devices. |
| Overall Accuracy | 93.14% | 97.98% | 98.15% | PyTorch's global backpropagation yields higher peak accuracy; NeuronGuard trails by about 5 percentage points. |

Because NeuronGuard trains on the fly via Hebbian plasticity, it completely bypasses the massive dataset loading and memory overhead required by traditional deep learning frameworks.
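The ratios quoted in the comparison can be rechecked from the raw figures. The short script below does only that arithmetic; the variable names are ours, and the size ratio assumes 1 MB = 1000 KB.

```python
# Figures copied from the head-to-head comparison.
NG_TRAIN_S = 15.85          # NeuronGuard training time (also its total pipeline time)
PT_TRAIN_10_S = 163.84      # PyTorch, 10 epochs, training only
PT_TOTAL_1_S = 32.73        # PyTorch, 1 epoch, loading + training
PT_TOTAL_10_S = 180.01      # PyTorch, 10 epochs, loading + training
NG_SIZE_KB = 320.0
PT_SIZE_KB = 2.44 * 1000    # assumption: 1 MB = 1000 KB
NG_ACC, PT_ACC_1, PT_ACC_10 = 93.14, 97.98, 98.15

train_speedup_10 = PT_TRAIN_10_S / NG_TRAIN_S
pipeline_speedup_1 = PT_TOTAL_1_S / NG_TRAIN_S
pipeline_speedup_10 = PT_TOTAL_10_S / NG_TRAIN_S
size_ratio = PT_SIZE_KB / NG_SIZE_KB
acc_gap_1 = PT_ACC_1 - NG_ACC      # percentage points
acc_gap_10 = PT_ACC_10 - NG_ACC    # percentage points

print(f"training speedup vs 10 epochs: {train_speedup_10:.2f}x")
print(f"pipeline speedup: {pipeline_speedup_1:.2f}x to {pipeline_speedup_10:.2f}x")
print(f"model size ratio: {size_ratio:.2f}x")
print(f"accuracy gap: {acc_gap_1:.2f} to {acc_gap_10:.2f} percentage points")
```

Note that the accuracy difference is an absolute gap in percentage points, not a relative percentage.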

Architectural Trade-Offs

NeuronGuard (Neuromorphic SNN) : Pros : Instant training (single-pass online learning) and an ultra-low memory footprint.

Cons : Slightly lower peak accuracy due to the lack of iterative global optimization (backpropagation).

PyTorch (Traditional Deep Learning) : Pros : High peak accuracy (98%+) due to multi-pass gradient descent and non-linear optimization.

Cons : Requires dedicated GPU acceleration for fast training, larger model files, and significantly higher memory and dependency overhead.

Technology Overview

NeuronGuard operates on a hardware-conscious, matrix-free neuromorphic design. Rather than relying on traditional deep learning architectures (like Transformers or Feedforward networks), it implements the following core technologies:

Spiking Neural Network (SNN) Core : Models neural computation using discrete event spikes rather than continuous floating-point activations. Stimuli are processed as temporal events that propagate through synaptic pathways.

Hebbian-Style Plasticity : Synaptic weights are updated on the fly using simple transactional increments and decrements based on co-activation, completely bypassing backpropagation and gradient storage.

Cache-Aligned Memory Layout : All neural structures are spatially aligned to exactly 64-byte boundaries (matching standard CPU cache lines). This maximizes L1/L2 cache hit rates, prevents cache thrashing, and eliminates false sharing during parallel execution.

GIL-Free Parallelism : Drops the Python Global Interpreter Lock (GIL) during stream processing to execute concurrent, lock-free evaluations across background worker threads.

Guard/Lease Transactional Pattern : Implements transactional, lock-free leases on specific memory addresses using atomic compare-and-swap (CAS) operations for safe concurrent weight mutations.

Flat, Pointerless Serialization : Synaptic weights are stored as flat, contiguous binary arrays, enabling sub-millisecond serialization and deserialization directly to and from disk.
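To make three of these ideas concrete (event-driven accumulation, Hebbian-style increments and decrements, and flat weight storage), here is a minimal pure-Python sketch. It is not NeuronGuard's Rust implementation or its API: the class name, the update rule, the weight clamp, and the 16-bit weight type are all assumptions chosen for illustration, and it omits temporal dynamics, cache alignment, and concurrency entirely.

```python
from array import array
from typing import Iterable, List


class TinySpikingClassifier:
    """Illustrative only: one output neuron per class, integer synapses."""

    def __init__(self, vocab: Iterable[str], num_classes: int, w_max: int = 127):
        self.vocab = {tok: i for i, tok in enumerate(vocab)}
        self.num_classes = num_classes
        self.w_max = w_max  # hypothetical clamp on synaptic strength
        # Flat, contiguous storage: synapse (token i -> class c) lives at
        # index i * num_classes + c. No pointers, no nested objects.
        self.weights = array("h", [0]) * (len(self.vocab) * num_classes)

    def _active_rows(self, tokens: Iterable[str]) -> List[int]:
        # Each known token is one input spike; unknown tokens produce no event.
        return sorted({self.vocab[t] * self.num_classes
                       for t in tokens if t in self.vocab})

    def potentials(self, tokens: Iterable[str]) -> List[int]:
        # Only synapses of tokens that actually spiked are touched.
        pot = [0] * self.num_classes
        for base in self._active_rows(tokens):
            for c in range(self.num_classes):
                pot[c] += self.weights[base + c]
        return pot

    def predict(self, tokens: Iterable[str]) -> int:
        pot = self.potentials(tokens)
        return max(range(self.num_classes), key=pot.__getitem__)

    def learn(self, tokens: Iterable[str], label: int) -> None:
        # Single-pass update, no gradients: strengthen synapses between the
        # active inputs and the labeled neuron, weaken those to a neuron
        # that fired incorrectly.
        tokens = list(tokens)
        fired = self.predict(tokens)
        w = self.weights
        for base in self._active_rows(tokens):
            w[base + label] = min(self.w_max, w[base + label] + 1)
            if fired != label:
                w[base + fired] = max(-self.w_max, w[base + fired] - 1)

    def dump(self) -> bytes:
        # The flat array serializes as one contiguous byte string.
        return self.weights.tobytes()

    def load(self, blob: bytes) -> None:
        restored = array("h")
        restored.frombytes(blob)
        if len(restored) != len(self.weights):
            raise ValueError("weight blob does not match vocabulary size")
        self.weights = restored
```

A model built this way is trained by calling `learn(tokens, label)` once per sample, and its weights round-trip through `dump()` and `load()` as raw bytes in the machine's native byte order.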

How to Load and Use in Python

To load and run inference using these...
