Transformers for Natural Language Processing: Build, train, and fine-tune deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, and GPT-3, 2nd Edition

Transformers are a game-changer for natural language understanding (NLU) and have become one of the pillars of artificial intelligence.

Transformers for Natural Language Processing, 2nd Edition, investigates deep learning for machine translation, speech-to-text, text-to-speech, language modeling, question answering, and many more NLP domains with transformers.

An Industry 4.0 AI specialist needs to be adaptable; knowing just one NLP platform is not enough anymore. Different platforms have different benefits depending on the application, whether it's cost, flexibility, ease of implementation, results, or performance. In this book, we analyze numerous use cases with Hugging Face, Google Trax, OpenAI, and AllenNLP.

This book takes transformers' capabilities further by combining multiple NLP techniques, such as sentiment analysis, named entity recognition, and semantic role labeling, to analyze complex use cases, such as dissecting fake news on Twitter. Also, see how transformers can create code using just a brief description.

By the end of this NLP book, you will understand transformers from a cognitive science perspective and be proficient in applying pretrained transformer models to various datasets.

602 pages, Paperback

Published April 11, 2022

52 people are currently reading
54 people want to read

About the author

Denis Rothman

15 books, 12 followers

Ratings & Reviews

Community Reviews

5 stars: 10 (40%)
4 stars: 7 (28%)
3 stars: 5 (20%)
2 stars: 2 (8%)
1 star: 1 (4%)
Displaying 1 - 5 of 5 reviews
Adam
194 reviews, 11 followers
February 19, 2023
Good useful (though basic) info but padded with so much gratuitous philosophizing and boilerplate code that it's hard to find the morsels of actual content.
4 reviews, 6 followers
April 6, 2022
Worth every penny. If you are looking for an expert's take on NLP then look no further.

I've read everything I can get my hands on re: GPT-3/Huggingface. Prof. Rothman has raised the ante for what should be considered acceptable discourse re: GPT-3/Transformers and the new NLP-driven world that we live in.

Buy the book; it has a feel of an experienced mentor giving you the tools but more importantly, the judgement to navigate these new waters.
Kyle
28 reviews
September 24, 2024
I didn’t get much out of this.

TL;DR: Don’t follow the author’s example for how to write code. Read the seminal papers instead. A lot of the business-y angles of the book haven’t aged well. Watch 3blue1brown.

If the code examples are representative of the author’s work, this man writes cursed code. I am a software engineer who builds infrastructure, and I am aware the standard I hold my code to does not fit most science coding situations. But if the examples in this book are anything to go by, I would not trust an ML pipeline built by the author. And if he could demonstrate that it worked satisfactorily despite the state of the code, I would not want to be the poor soul who has to re-write all the author’s code later when something inevitably breaks.

His code examples often don't go much beyond what a "ChatGPT API README" would give you. If you have any experience with Python, installing packages, and reading documentation, the code sections are often pointless. Instances of what he calls "classical coding" (IIRC) are few and far between in this book. The exception is the early BERT sections, if memory serves.

Again, a symptom of my line of work, but PLEASE use poetry and a pyproject.toml. This will greatly improve your reproducibility and dependency management, which is vital for ML tasks, especially when transformer models are evolving so fast. A jupyter notebook is great for poking around, but as an ML data scientist turned software engineer, my grad work would have gone a lot more smoothly if I had focused on reproducibility through modular code design instead of "just throwing it together". I created a lot of extra work for myself, and I see those kinds of design patterns on display in several of the code examples in this textbook.
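The reviewer's point about poetry and pyproject.toml can be illustrated with a minimal sketch. The project name and the pinned versions below are illustrative assumptions, not taken from the book:

```toml
# pyproject.toml -- minimal poetry setup for a reproducible transformer project
[tool.poetry]
name = "transformer-experiments"  # hypothetical project name
version = "0.1.0"
description = "Reproducible environment for transformer experiments"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.10"
# Pinning exact versions guards against breaking API changes in
# fast-moving libraries; the versions here are placeholders.
transformers = "4.30.2"
torch = "2.0.1"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```

With this in place, `poetry install` resolves and locks the dependency tree in `poetry.lock`, so a collaborator (or your future self) can recreate the exact environment instead of rediscovering it from a notebook's implicit state.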

I also found the book poorly organized. It gave the impression that it was a first draft of a text. The author repeats himself a lot. On the one hand, I value: “tell the audience what you’ll tell them, tell them, then tell them what you just told them”. On the other hand, it often felt like the text didn’t flow from one chapter to the next. Nor did it feel like there was a cohesive vision.

I also chafe at any “business bro” lingo. It felt like “Industry 4.0” was the author’s hobby horse idea, and I really got bored of being reminded of it.

His opinions on where the field is going, the future role of the "AI specialist", etc., also don't feel like they have aged well. I recognize I don't have much exposure to prompt engineering in an industry context. But in my experience, it has not been hard to get ChatGPT (version 4 and on) to do what I want, and when it is, it's because I'm asking something of it that it cannot yet handle, not because of a poorly structured prompt.

ChatGPT is certainly not the only model available, but on more than one occasion the author himself demonstrates that the "cool new model" on the street, although innovative, can't one-up the scale of data that has gone into training ChatGPT, and that using ChatGPT ends up being more effective for many tasks. In my experience as an end user, prompt engineering has only been absolutely essential when working with Stable Diffusion.

Overall, I think you’re better off reading the seminal papers that the author mentions (e.g. Attention is All You Need) and watching 3blue1brown’s transformer series on Youtube.
Ita Cirovic Donev
12 reviews
September 27, 2024
I expected so much more from this book given its other reviews; however, I was deeply disappointed. At some points the book feels as if it was written by some not-so-great LLM, given the structure, choice, and sequence of sentences. Many sentences are repeated in only slightly different ways, as if there were no editor.

The topics are covered only marginally, with plenty of space given over to extra-large images rather than words. The wording in some chapters is repetitive, as if the chapters were written ages apart and the author forgot what was said in previous ones.

Very hand-wavy explanation of the transformer architecture. The author tries to maintain a structured flow; however, the concepts are explained in haste and without proper attention to detail. The examples are overly simplistic, which undermines their usefulness. No notation is presented, nor any guidance on how to implement this for a more complex case.
The author writes as if a novice is reading, but I cannot imagine a novice understanding this flow of explanation and examples.

Generally it provides basic examples and shows how to run the code, without going into interpretive details.
10 reviews
March 13, 2023
Has useful information, but the examples are difficult to get running because the package versions are out of date. I guess that's the nature of such a fast-moving field.
