Since their introduction, Transformers have become the dominant architecture in natural language processing almost overnight, delivering state-of-the-art results across a wide range of language-processing tasks. If you are a data scientist or programmer, this practical book shows you how to train and scale NLP models with Hugging Face Transformers, a Python-based deep-learning library. Transformers are used, for example, to machine-write news articles, to improve Google search queries, and to power chatbots. In this guide, Lewis Tunstall, Leandro von Werra, and Thomas Wolf, who also helped build Hugging Face's Transformers library, take a hands-on approach to showing you how Transformer-based models work and how you can integrate them into your applications. You will quickly get to know a variety of tasks you can solve with them, such as text classification, named entity recognition, and question answering.
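To give a flavor of what that looks like in practice, here is a minimal sketch using the library's high-level `pipeline` API for three of the tasks the blurb mentions; the checkpoints are the library's defaults, and the exact outputs depend on which models get downloaded.

```python
from transformers import pipeline

# Text classification (sentiment analysis with the default checkpoint)
classifier = pipeline("text-classification")
print(classifier("Transformers are remarkably easy to use."))

# Named entity recognition, aggregating sub-tokens into whole entities
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Hugging Face was founded in New York City."))

# Question answering over a short context
qa = pipeline("question-answering")
print(qa(question="What does the library provide?",
         context="The Transformers library provides pretrained models for NLP."))
```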
I was afraid this book would be redundant given how much information one can find online about transformers and the Hugging Face platform. Still, it turned out to be a very concise and pragmatic introduction to the topic and a valuable reference book with dozens of tips for training and tailoring your data tasks to the transformers paradigm.
This is a super cool book on NLP using the HuggingFace 🤗 ecosystem. It's well-written, and you can read it quite quickly (except for two very technical but important chapters). I would recommend it to anyone who has basic experience with deep learning and wants to dive into NLP.
Both a great primer on the subject and a nice collection of more advanced 'gotchas'. Would have loved to see a bit more on the 'in production' side of things, but a great read nonetheless :-)
As a programmer, I found this book initially interesting for laying the groundwork for a more fundamental understanding of transformers and why they are so hyped. Naively, I really hoped I would read something groundbreaking in terms of computation, and also hoped it would enable me to use transformers in future programming undertakings.
I was wrong. But that is really not the book's fault. The technology around transformers and LLMs strikes me as a black box whose underlying abstractions are not only difficult to understand but also difficult to put to use without tremendous effort and data. I truly do not understand where this is applicable outside of text-heavy domains, and therefore I am a bit disappointed in the technology.
At the same time, the book has given me a bit of calm when reading stories about so-called AI products and software, since now I know at least the fundamentals and how hard they really are to put to use. All those abstractions remind me of the inner workings of an RDBMS, and I know that for RDBMSs, those are a career path in their own right. Hence I am also starting to believe that, at this stage, transformers have a long and steep learning curve before they can really be put to use, and that their underpinnings must not change so much that they become difficult to follow over time.
I liked the book; it introduced me to a lot of things that are outside my daily work, and I will definitely glance through the fundamentals many times again.
Accessibly written, with useful code examples and lots of directly actionable information on how to use HuggingFace tools. The chapters on making models efficient for production and on dealing with situations in which few labels are available are especially illuminating. It seems as if HuggingFace has developed a number of useful abstraction layers to make ML engineers more productive, especially around storing and accessing both models and data in a straightforward manner. Training your own neural network and deploying it has never been as easy as it is today, and this book is a useful introduction to the ecosystem. The authors state that as of writing, 20,000 models had been shared on the model hub, but by the time I checked, the number was already 10 times as high. The explanations of model architectures and some technical details such as self-attention are also well written, though, given the current pace of technological change, they are of course sure to be in need of another revision in only a couple of years.
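As an illustration of the abstraction layers this reviewer mentions, here is a hedged sketch of pulling a tokenizer and model from the model hub by checkpoint name; `distilbert-base-uncased` is just one example of the many shared checkpoints.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Any checkpoint name on the hub works here; this is one common example.
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Tokenize a sentence and run a forward pass; the classification head is
# freshly initialized here and would still need fine-tuning on labeled data.
inputs = tokenizer("The model hub makes sharing checkpoints simple.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2])
```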
This book does a good job of introducing and explaining the concepts behind transformers. I especially like the Named Entity Recognition section, which also explains how to do model debugging.
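To make the NER part concrete, here is a small sketch of running a token-classification model and inspecting its per-token predictions, a common first step when debugging such a model; the checkpoint name is an example public model, not necessarily the one used in the book.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# An example public NER checkpoint from the hub.
checkpoint = "dslim/bert-base-NER"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForTokenClassification.from_pretrained(checkpoint)

text = "Leandro works at Hugging Face in Paris."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Print each sub-token next to its predicted entity label to spot
# where the model's predictions go wrong.
predictions = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for token, pred in zip(tokens, predictions):
    print(f"{token:15} {model.config.id2label[pred.item()]}")
```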
I wish that this book focused a bit more on the Hugging Face ecosystem rather than only on the Transformers part. When tackling your own custom problem, you will often have to deal with Hugging Face tokenizers and datasets. In my opinion they are vital to solving an NLP problem with Hugging Face, and this book sadly does not give them enough attention (a small sketch of that workflow follows this review).
Overall I very much enjoyed this book; however, I am still left a bit hungry with respect to solving real-world problems using Hugging Face.
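For reference, the tokenizers-plus-datasets workflow the reviewer is pointing at looks roughly like the following hedged sketch; the `emotion` dataset and the checkpoint name are illustrative choices only.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# "emotion" is one example dataset from the hub; any other works the same way.
dataset = load_dataset("emotion")  # a DatasetDict with train/validation/test splits
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Pad and truncate each batch of texts to a common length.
    return tokenizer(batch["text"], padding=True, truncation=True)

# Dataset.map applies the tokenizer over the whole corpus in batches,
# caching the results on disk.
encoded = dataset.map(tokenize, batched=True)
print(encoded["train"].column_names)  # original columns plus input_ids, attention_mask
```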
Very good overview of the capabilities of the transformer architecture. Great examples. However, the more complicated math topics are glossed over in a very inelegant way. The math typography is really bad. Some of the examples are really contrived.
Really nice overview of all things Transformers. Sometimes very vendor-centric, but they tried their best to be as neutral and inclusive as possible. One complaint I have is that it kinda starts midway, without any context about Transformers, but maybe that’s fine given the target audience of the book.
A lot of code and technical details. I read half of the book and scanned through the other half. I wish there were more details on architecture design choices instead. Nevertheless, a very well-written book.
A highly recommended book for anyone looking to understand how Transformer architecture works, combining theoretical concepts with practical code examples to ensure a thorough grasp of these models.
I’m impressed by how clearly the authors explain the architectures and their applications. The clarity definitely makes complex concepts approachable to both technical and non-technical people.