Data Science Weekly – Issue 657

Curated news, articles and jobs related to Data Science, AI, & Machine Learning

Issue #657 June 25, 2026

Hello! Once a week, we write this email to share the links we thought were worth sharing in the Data Science, ML, AI, Data Visualization, and ML/Data Engineering worlds.

And now…let’s dive into some interesting links from this week.

Editor's Picks

United Kingdom prime ministers The UK has had a spurt of prime ministerial turnover in the past decade or so, but it’s by no means unprecedented. I download data from Wikipedia and try several ways to visualise that turnover…

How The Heck Do Synthesizers Work? (An Interactive Exploration) Synthesizers have remained a staple of modern music. From iconic video game soundtracks to Hans Zimmer’s scores, from Radiohead to the Stranger Things theme. We hear more synthesized music than we even realize. So how do they actually work?…

Surprising lessons from my research scientist job search There are two recent blog posts from Alisa and Silvia, both CS PhD students, on how they prepared and got into frontier labs such as OpenAI and Google DeepMind. I highly recommend them, and after seeing the reactions on Twitter, I want to share a different angle: what surprised me during my own research scientist job search…

Data Science Articles & Videos

The PCA Mistake I Made During My PhD (and How to Avoid It in R) A few years into my PhD, I ran a principal component analysis on a dataset I’d spent six months collecting, looked at the scree plot, picked “the number of components that looked right,” and moved on. A reviewer later asked me to justify that number. I couldn’t. Not with anything more rigorous than “the elbow looked like it was there.”…
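One common way to justify a component count with more than an eyeballed elbow is Horn's parallel analysis: keep the leading components whose eigenvalues exceed what pure noise of the same shape would produce. The sketch below is in Python rather than R and is not taken from the linked article; the function name `parallel_analysis`, its parameters, and the synthetic two-factor data are all assumptions made for illustration.

```python
import numpy as np

def parallel_analysis(X, n_iter=200, quantile=0.95, seed=0):
    """Count leading PCA components whose correlation-matrix eigenvalues
    exceed the chosen quantile of eigenvalues from random normal data
    with the same number of rows and columns."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

    # Eigenvalues of correlation matrices built from uncorrelated noise.
    null = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.standard_normal((n, p))
        null[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.quantile(null, quantile, axis=0)

    # Keep components until the first one that fails to beat the noise threshold.
    keep = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        keep += 1
    return keep, observed, threshold

# Hypothetical data: 500 rows, 6 variables driven by 2 latent factors plus noise.
rng = np.random.default_rng(42)
latent = rng.standard_normal((500, 2))
loadings = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X = latent @ loadings + 0.3 * rng.standard_normal((500, 6))

n_keep, observed, threshold = parallel_analysis(X)
print(f"components retained: {n_keep}")
```

The retained count comes with a stated criterion (the noise quantile), which is the kind of answer a reviewer can check; the quantile and iteration count remain judgment calls.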

It turns out Analytics was a great career to go into even in a world with AI [Reddit] Maybe two or three years ago I lamented the fact I had never gone into software development in spite of the fact I probably had the coding mindset for it, regretting the tedious and stressful aspects of Analytics as well as lower overall pay. Now with AI leading to massive layoffs and/or reduced hiring in software development and other Engineering fields, I’m thinking Analytics was a good field to specialize in since it has that sweet spot of being just close enough to the business and just close enough to the tech side that it is hard to automate away via AI. Furthermore, I think demand for analysts in general to understand data and accommodate reporting changes will also increase if AI is accelerating software changes and changes to data models and systems…

minFLUX - A hackable implementation of FLUX diffusion models A simplified educational PyTorch implementation of FLUX.1 and FLUX.2 diffusion transformers (DiT) by Black Forest Labs. Built for understanding rectified flow matching, joint attention, and the key design choices behind FLUX with verifiable line-by-line source mappings to the official codebases…

Data-driven discovery of dynamical models in biology In this review, we survey approaches for model discovery in biological dynamical systems, focusing on three methodological families: regression-based methods, network-based architectures, and decomposition techniques. We compare their ability to address three core goals: forecasting future states, identifying interactions, and characterizing system states. Representative methods are applied to a common benchmark, the Oregonator model, a minimal nonlinear oscillator that captures shared design principles of chemical and biological systems. By highlighting strengths, limitations, and interpretability, we aim to guide researchers in selecting tools for analyzing complex, nonlinear, and high-dimensional dynamics in the life sciences…

Scaling Laws, Carefully Scaling laws are one of the most critical empirical findings in deep learning. The observation is simple in form: the training loss L decreases predictably as we scale up model size N, dataset size D, and compute C, following a power-law curve, which appears as a straight line on a log-log plot. We can view scaling laws as a framework for describing the relationship between compute, loss, model size and data; at its core, it is about how to allocate precious compute optimally between N and D. This predictability makes scaling laws highly valuable in practice. A common workflow is to fit scaling laws on a handful of small runs and then extrapolate to estimate the token and compute requirements for larger models…
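The fit-then-extrapolate workflow described in this blurb can be sketched in a few lines. This is a minimal illustration, not the linked article's method: it assumes a pure power law L = a * N^(-alpha) in model size alone, with no irreducible-loss offset and no dataset-size term, and the helper names `fit_power_law` and `predict_loss` as well as the run values are hypothetical.

```python
import numpy as np

def fit_power_law(n, loss):
    """Fit loss ≈ a * n**(-alpha) by ordinary least squares on log-log axes,
    where a power law becomes a straight line."""
    slope, intercept = np.polyfit(np.log(n), np.log(loss), deg=1)
    return float(np.exp(intercept)), float(-slope)

def predict_loss(n, a, alpha):
    """Evaluate the fitted power law at model size(s) n."""
    return a * np.asarray(n, dtype=float) ** (-alpha)

# Hypothetical small runs: losses generated from a known power law,
# not measurements from any real training run.
small_n = np.array([1e6, 3e6, 1e7, 3e7, 1e8])
true_a, true_alpha = 20.0, 0.1
small_loss = true_a * small_n ** (-true_alpha)

a_hat, alpha_hat = fit_power_law(small_n, small_loss)

# Extrapolate two orders of magnitude beyond the largest fitted run.
extrapolated = float(predict_loss(1e10, a_hat, alpha_hat))
print(f"a = {a_hat:.3f}, alpha = {alpha_hat:.3f}, predicted loss at N=1e10: {extrapolated:.3f}")
```

Because the data here are noise-free, the fit recovers the generating parameters; with real runs, the fitted exponent is sensitive to which points are included and to any loss floor the pure power law ignores, so extrapolations deserve error bars.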

ML Foundations (prerequisites) for Post-Training | RLHF Book Course, Lecture 0 In this video I try to cover a bunch of math, LLM training fundamentals, and probability concepts that come up again and again in post-training content (and this book). We cover things like the role of mid-training,...
