Sebastian Raschka's Blog

April 18, 2026

My Workflow for Understanding LLM Architectures

A learning-oriented workflow for understanding new open-weight model releases
Published on April 18, 2026 04:24

April 4, 2026

Components of A Coding Agent

How coding agents use tools, memory, and repo context to make LLMs work better in practice
Published on April 04, 2026 04:45

March 22, 2026

A Visual Guide to Attention Variants in Modern LLMs

From MHA and GQA to MLA, sparse attention, and hybrid architectures
Published on March 22, 2026 04:55

March 14, 2026

New LLM Architecture Gallery

I put together a new LLM Architecture Gallery that collects the architecture figures from my recent comparison articles in one place, together with compact fact sheets and links.
Published on March 14, 2026 07:45

February 25, 2026

A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026

A roundup and comparison of 10 open-weight LLM releases from January and February 2026
Published on February 25, 2026 00:15

January 31, 2026

State of AI 2026 with Sebastian Raschka, Nathan Lambert, and Lex Fridman

I recently sat down with Lex Fridman and Nathan Lambert for a comprehensive 4.5-hour interview to discuss the current state of AI progress, and what the...
Published on January 31, 2026 22:20

January 24, 2026

Categories of Inference-Time Scaling for Improved LLM Reasoning

Inference scaling has become one of the most effective ways to improve answer quality and accuracy in deployed LLMs. The idea is straightforward: if we are willing to spend more compute, and more time at inference (when we use the model to generate text), we can get the model to produce better answers.
Published on January 24, 2026 00:15
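The trade-off described in the excerpt above can be illustrated with best-of-N sampling, one of the simplest inference-time scaling strategies. This is a minimal sketch, not the article's implementation: `generate` and `score` are hypothetical stand-ins for an LLM sampler and a verifier/reward model.

```python
import random

def generate(prompt: str, rng: random.Random) -> str:
    # Stub sampler: pretend each call yields a different candidate answer.
    # A real system would call an LLM with nonzero temperature here.
    return f"answer-{rng.randint(0, 9)}"

def score(prompt: str, answer: str) -> float:
    # Stub verifier: a real system would use a reward model or checker.
    # Here we simply prefer higher-numbered candidates.
    return float(answer.rsplit("-", 1)[-1])

def best_of_n(prompt: str, n: int = 8, seed: int = 0) -> str:
    # Spend n generations (more inference compute) and keep the
    # highest-scoring candidate instead of accepting the first sample.
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda a: score(prompt, a))

print(best_of_n("What is 2 + 2?", n=8))
```

Larger `n` spends more compute at inference but can only match or improve the selected candidate's score, which is the core idea behind inference-time scaling.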

December 30, 2025

The State Of LLMs 2025: Progress, Problems, and Predictions

A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.
Published on December 30, 2025 00:15

LLM Research Papers: The 2025 List (July to December)

A curated list of LLM research papers from July–December 2025, organized by reasoning models, inference-time scaling, architectures, training efficiency, and diffusion.
Published on December 30, 2025 00:00

December 7, 2025

From Random Forests to RLVR: A Short History of ML/AI Hello Worlds

Two years ago, I posted a list of "Hello World" examples for machine learning and AI on social media. Here, a "Hello World" means a beginner-friendly example that showcases a method. I set a biennial calendar alert to revisit and append to the list, and I thought hard about what a 2025 example could look like. So here is a short post with the updated list and some explanations for context.
Published on December 07, 2025 16:20
