Sebastian Raschka's Blog

November 11, 2025

Recommendations for Getting the Most Out of a Technical Book

This short article compiles a few notes I previously shared when readers asked how to get the most out of my books on building large language models from scratch. I follow a similar approach when I read technical books myself. It is not meant as a universal recipe, but it may be a helpful starting point. For this particular book, I strongly suggest reading it in order, since each chapter depends on the previous one. And for each chapter, I recommend the following steps.

November 3, 2025

Beyond Standard LLMs

After I shared my Big LLM Architecture Comparison a few months ago, which focused on the main transformer-based LLMs, I received a lot of questions about what I think of alternative approaches. (I also recently gave a short talk on this topic at the PyTorch Conference 2025, where I promised attendees a follow-up write-up of these alternative approaches.) So here it is!

October 28, 2025

DGX Spark and Mac Mini for Local PyTorch Development

The DGX Spark for local LLM inference and fine-tuning has been a popular discussion topic recently. I got to play with one myself, primarily working with and on LLMs in PyTorch, and collected some benchmarks and takeaways.

October 4, 2025

Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)

Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples
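
As a rough illustration of the first of these approaches (a minimal sketch of mine, not code from the article), a multiple-choice benchmark is typically scored by checking whether the model's predicted answer letter matches the reference answer; the generate_answer call below is a hypothetical placeholder for whatever model interface you use:

# Minimal multiple-choice scoring sketch (illustrative, not the article's code).

def generate_answer(question: str, choices: dict[str, str]) -> str:
    """Hypothetical placeholder: returns the model's answer letter, e.g. 'B'."""
    raise NotImplementedError

def multiple_choice_accuracy(examples: list[dict]) -> float:
    """Each example: {'question': str, 'choices': {'A': ..., 'D': ...}, 'answer': 'B'}."""
    correct = 0
    for ex in examples:
        predicted = generate_answer(ex["question"], ex["choices"]).strip().upper()
        correct += predicted == ex["answer"].upper()
    return correct / len(examples)  # fraction of correctly answered questions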

September 6, 2025

Understanding and Implementing Qwen3 From Scratch

Previously, I compared the most notable open-weight architectures of 2025 in The Big LLM Architecture Comparison. Then, I zoomed in and discussed the various architecture components in From GPT-2 to gpt-oss: Analyzing the Architectural Advances on a conceptual level. Since all good things come in threes, I now want to dive into these architectures hands-on, in code, before covering some of the noteworthy research highlights of this summer. By following along, you will understand how these models actually work under the hood and gain building blocks you can adapt for your own experiments or projects.
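
To give a flavor of such a building block (a minimal sketch of mine, not code taken from the article), here is an RMSNorm layer of the kind used in many recent open-weight architectures, Qwen3 included:

import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    # Normalizes by the root mean square of the features, then applies a learned scale.
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight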

August 9, 2025

From GPT-2 to gpt-oss: Analyzing the Architectural Advances

OpenAI just released their new open-weight LLMs this week: gpt-oss-120b and gpt-oss-20b, their first open-weight models since GPT-2 in 2019. And yes, thanks to some clever optimizations, they can run locally. I spent the past few days reading through the code and technical reports to summarize the most interesting details.

July 18, 2025

The Big LLM Architecture Comparison

It has been seven years since the original GPT architecture was developed. At first glance, looking back at GPT-2 (2019) and forward to DeepSeek-V3 and Llama 4 (2024-2025), one might be surprised at how structurally similar these models still are. Comparing LLMs to determine the key ingredients that contribute to their good (or not-so-good) performance is notoriously challenging: datasets, training techniques, and hyperparameters vary widely and are often not well documented. However, I think that there is still a lot of value in examining the structural changes of the architectures themselves to see what LLM developers are up to in 2025.

June 30, 2025

LLM Research Papers: The 2025 List (January to June)

The latest in LLM research with a hand-curated, topic-organized list of over 200 research papers from 2025.

June 17, 2025

Understanding and Coding the KV Cache in LLMs from Scratch

KV caches are one of the most important techniques for compute-efficient LLM inference in production. This article explains how they work, conceptually and in code, with a from-scratch, human-readable implementation.
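
As a rough preview of the idea (a minimal sketch of mine, not the article's implementation), the cache stores the keys and values computed for earlier tokens, so each decoding step only needs to project the newest token and attend over everything cached so far:

import torch

class SimpleKVCache:
    # Appends each step's new keys/values and returns the full sequence seen so far.
    def __init__(self):
        self.keys = None    # (batch, num_heads, seq_len_so_far, head_dim)
        self.values = None

    def update(self, new_keys: torch.Tensor, new_values: torch.Tensor):
        if self.keys is None:
            self.keys, self.values = new_keys, new_values
        else:
            self.keys = torch.cat([self.keys, new_keys], dim=2)
            self.values = torch.cat([self.values, new_values], dim=2)
        return self.keys, self.values

# Inside the decoding loop, only the newest token is projected to q, k, v:
#   k_all, v_all = cache.update(k_new, v_new)
#   scores = q @ k_all.transpose(-2, -1) / k_all.shape[-1] ** 0.5
#   out = torch.softmax(scores, dim=-1) @ v_all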

May 9, 2025

Coding LLMs from the Ground Up: A Complete Course

Why build an LLM from scratch? It's probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they had a lot of fun doing it.
