JuSeongvin/pinn · Hugging Face
PINN-JEPA — Physics-Informed Encoder for 3D Human Motion (Step 1)
A physics-informed neural network (PINN) encoder for 3D skeletal human motion.<br>The encoder is pretrained in a self-supervised way to produce motion representations<br>whose kinematic structure (position → velocity → acceleration → jerk) and bone<br>geometry stay physically consistent. This repository releases the Step 1 PINN<br>pretraining stage (encoder body + pretraining head) and an earlier checkpoint rather than the latest one.
Status: research release. The published weights are not the latest internal<br>checkpoint, and are intended for reproducibility and experimentation, not production.
Model overview
Input<br>(B, T, J, 12) — per joint state [p(3), v(3), a(3), j(3)]
Skeleton<br>H36M 17-joint topology (J = 17)
Output<br>token features (B, T, J, D) + reconstructed state s_hat (B, T, J, 12)
Backbone<br>State embedding → (GraphMix spatial + TemporalBlock) × depth → LayerNorm
Default size<br>d_model = 256, depth = 6, d_state = 64
Framework<br>PyTorch (custom modules, no transformers dependency)
The encoder predicts a residual on position only; velocity, acceleration, and jerk are<br>derived analytically via central differences, which keeps the representation<br>kinematically consistent rather than letting each channel drift independently.
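The residual-on-position design can be sketched in a few lines. This is a minimal NumPy illustration, not the repository's code: the helper names `build_state` and `reconstruct_state` are hypothetical, the `central_diff` here is a stand-in for the one in Utils.py (whose boundary handling may differ), and the real model operates on PyTorch tensors.

```python
import numpy as np

def central_diff(x, fps):
    """Time derivative along axis 1 (T) of an array shaped (B, T, ...)."""
    # np.gradient uses central differences at interior frames and
    # one-sided differences at the first and last frame.
    return np.gradient(x, 1.0 / fps, axis=1)

def build_state(p, fps):
    """Stack [p, v, a, j] on the last axis: (B, T, J, 3) -> (B, T, J, 12)."""
    v = central_diff(p, fps)
    a = central_diff(v, fps)
    j = central_diff(a, fps)
    return np.concatenate([p, v, a, j], axis=-1)

def reconstruct_state(p, delta_p, fps):
    """Apply a predicted position residual, then re-derive v, a, j from it.

    Because only the position is predicted, the higher-order channels of the
    output are consistent with it by construction.
    """
    return build_state(p + delta_p, fps)
```

With this layout, `state[..., 0:3]` is position, `state[..., 3:6]` velocity, `state[..., 6:9]` acceleration, and `state[..., 9:12]` jerk.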
Training objective (Step 1)
Self-supervised state reconstruction combined with physics-aware regularizers:
State reconstruction on p / v / a / j (weighted)
Bone-length consistency over skeleton edges
Kinematic consistency (finite-difference agreement between channels)
Jerk regularization for motion smoothness
See PINN_Lossfunction.py for exact terms and default weights.
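As a rough illustration of the physics-aware terms listed above, the following NumPy sketch computes unweighted versions of the bone-length, kinematic-consistency, and jerk terms. Everything here is an assumption made for illustration: the function names are hypothetical, the parent list is the commonly used H36M 17-joint ordering (the repository's edge list in Utils.py may differ), and the actual formulations and weights are those in PINN_Lossfunction.py.

```python
import numpy as np

# Assumed H36M 17-joint parent list (root = -1); may differ from Utils.py.
H36M_PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]
EDGES = [(c, q) for c, q in enumerate(H36M_PARENTS) if q >= 0]

def bone_lengths(p):
    """(B, T, J, 3) positions -> (B, T, E) edge lengths."""
    child = p[:, :, [c for c, _ in EDGES]]
    parent = p[:, :, [q for _, q in EDGES]]
    return np.linalg.norm(child - parent, axis=-1)

def bone_length_loss(p_hat, p_ref):
    """Mean squared difference between predicted and reference bone lengths."""
    return np.mean((bone_lengths(p_hat) - bone_lengths(p_ref)) ** 2)

def kinematic_loss(s_hat, fps):
    """Penalize disagreement between each channel and the finite-difference
    derivative of the channel below it (p -> v -> a -> j)."""
    p, v, a, j = np.split(s_hat, 4, axis=-1)
    d = lambda x: np.gradient(x, 1.0 / fps, axis=1)
    return (np.mean((d(p) - v) ** 2)
            + np.mean((d(v) - a) ** 2)
            + np.mean((d(a) - j) ** 2))

def jerk_loss(s_hat):
    """Mean squared jerk (last three channels) as a smoothness penalty."""
    return np.mean(s_hat[..., 9:12] ** 2)
```

A state whose v, a, and j channels were derived from p with the same finite-difference operator gives a kinematic term of zero, and a rigid translation of the skeleton leaves the bone-length term at zero.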
Repository contents
PINN_EncoderBody.py # backbone (StateEmbedding, GraphMix, TemporalBlock, EncoderBody)<br>PINN_PretrainModel.py # Step 1 model: encoder + residual-p head -> s_hat<br>PINN_Lossfunction.py # physics-aware pretraining losses<br>PINN_Training.py # train step, checkpoint save/load<br>PINN_ModelEvaluation_downstream.py # representation-quality eval (clustering)<br>PINN_ModelEvaluation_itself4.py # model self-evaluation<br>PINN_visualization_for_model3.py # 3D skeleton render / input-vs-output compare<br>Utils.py # skeleton edges, central_diff, masked_mean, etc.<br>config.json # architecture hyperparameters (edit to match the checkpoint)<br>export_weights.py # slim a training checkpoint -> release weights<br>inference_example.py # minimal load + forward example
Note on imports. Modules use flat imports (from Utils import ...). Keep all<br>.py files at the repository root, or add the repo root to PYTHONPATH, before importing.
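If setting PYTHONPATH is inconvenient, the repo root can also be added at runtime. A small sketch, where `add_repo_to_path` is a hypothetical helper and the clone location is a placeholder:

```python
import sys
from pathlib import Path

def add_repo_to_path(repo_root):
    """Prepend repo_root to sys.path so flat imports such as
    `from Utils import ...` resolve from any working directory."""
    root = str(Path(repo_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root

# Example (placeholder path): add_repo_to_path("path/to/pinn")
```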
Usage
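import json<br>import torch<br>from PINN_EncoderBody import EncoderBody<br>from PINN_PretrainModel import PINNPretrainModel
# Build the model from the shipped architecture config<br>with open("config.json") as f:<br>    cfg = json.load(f)<br>encoder = EncoderBody(**cfg["encoder"])<br>model = PINNPretrainModel(encoder=encoder, fps=cfg["fps"])
# Load the released weights on CPU and switch to inference mode<br>state = torch.load("pytorch_model.bin", map_location="cpu")<br>model.load_state_dict(state)<br>model.eval()
# x: (B, T, J=17, 12) = [p, v, a, j] per joint<br>x = torch.zeros(1, 32, 17, 12) # placeholder with arbitrary B and T; replace with real motion states<br>with torch.no_grad():<br>    out = model(x)<br>features = out["token_feat"] # (B, T, J, D) representation<br>s_hat = out["s_hat"] # (B, T, J, 12) reconstructed state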
config.json ships with the architecture defaults. If the released checkpoint was<br>trained with different settings, edit config.json so the shapes match before loading.
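When the config and checkpoint disagree, it can help to list which parameters differ before editing config.json. The helper below is a hypothetical, framework-free sketch that compares two name-to-shape mappings; it is not part of the repository.

```python
def find_shape_mismatches(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} mappings and report differences."""
    report = {"missing_in_ckpt": [], "unexpected_in_ckpt": [], "shape_mismatch": []}
    for name, shape in model_shapes.items():
        if name not in ckpt_shapes:
            report["missing_in_ckpt"].append(name)
        elif tuple(ckpt_shapes[name]) != tuple(shape):
            report["shape_mismatch"].append(
                (name, tuple(shape), tuple(ckpt_shapes[name]))
            )
    # Keys present only in the checkpoint, sorted for a stable report
    report["unexpected_in_ckpt"] = sorted(set(ckpt_shapes) - set(model_shapes))
    return report
```

With PyTorch, the two mappings can be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}` and the same expression over the loaded `state`. A mismatch in an embedding or projection width usually points at `d_model` or `d_state`, and missing or extra block indices point at `depth`.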
Intended use
Self-supervised motion representation learning research
Feature extraction for downstream pose/motion tasks
Studying physics-informed regularization for skeletal motion
Out of scope
Not a clinical, diagnostic, biometric, or safety-critical tool
Not trained or validated for person identification or surveillance
Tuned for the H36M 17-joint topology; other skeletons need adaptation/retraining
Limitations
Released weights are an older checkpoint and may underperform the latest internal version.
Assumes a fixed 17-joint topology and a consistent (p, v, a, j) input layout.
fps at inference should match the value used to build the (v, a, j) channels.
Evaluation utilities depend on scikit-learn; UMAP is optional.
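The fps limitation above matters because each finite-difference derivative divides by the frame interval. A sketch of the effect, assuming the channels are built with a uniform-step finite difference (the helper names `derive` and `channel_scale` are hypothetical):

```python
import numpy as np

def derive(p, fps, order):
    """Apply `order` successive finite-difference derivatives along time (axis 1)."""
    x = p
    for _ in range(order):
        x = np.gradient(x, 1.0 / fps, axis=1)
    return x

def channel_scale(p, fps_true, fps_assumed, order):
    """Factor by which a derived channel is mis-scaled when the wrong fps is assumed."""
    wrong = np.linalg.norm(derive(p, fps_assumed, order))
    right = np.linalg.norm(derive(p, fps_true, order))
    return wrong / right
```

Under this assumption, assuming the wrong frame rate scales velocity by the ratio `fps_assumed / fps_true`, acceleration by its square, and jerk by its cube, so even a modest mismatch distorts the jerk channel most.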
License
This release is distributed under the Academic Free License v3.0 (AFL-3.0).
Source code (*.py): AFL-3.0 — see LICENSE.
Model weights (released checkpoint): AFL-3.0, with the disclaimer below.
The weights are provided "as is", for research and reproducibility, without warranty of<br>any kind. They are not the latest internal checkpoint and carry no fitness guarantee for<br>any particular use. See NOTICE for the scope split between code and weights.
Citation
@misc{pinn_jepa_pose,<br>title = {PINN-JEPA: Physics-Informed Encoder for 3D Human Motion},<br>author = {},<br>year = {2026},<br>note = {Research code and weights, AFL-3.0},<br>howpublished = {\url{https://huggingface.co//}}<br>}