Advances in Financial Machine Learning

Learn to understand and implement the latest machine learning innovations to improve your investment performance

Machine learning (ML) is changing virtually every aspect of our lives. Today, ML algorithms accomplish tasks that - until recently - only expert humans could perform. And finance is ripe for disruptive innovations that will transform how the following generations understand money and invest.

In the book, readers will learn how to:

Structure big data in a way that is amenable to ML algorithms
Conduct research with ML algorithms on big data
Use supercomputing methods and backtest their discoveries while avoiding false positives

Advances in Financial Machine Learning addresses real-life problems faced by practitioners every day, and explains scientifically sound solutions using math, supported by code and examples. Readers become active users who can test the proposed solutions in their individual setting.

Written by a recognized expert and portfolio manager, this book will equip investment professionals with the groundbreaking tools needed to succeed in modern finance.

400 pages, Kindle Edition

First published January 1, 2018

540 people are currently reading
2284 people want to read

About the author

Marcos López de Prado

9 books, 34 followers

Community Reviews

5 stars: 189 (42%)
4 stars: 164 (36%)
3 stars: 58 (13%)
2 stars: 27 (6%)
1 star: 7 (1%)
Displaying 1 - 30 of 36 reviews
Profile Image for BCS.
218 reviews, 33 followers
August 14, 2018
Machine Learning is about gaining confidence in your algorithm. Looking at a financial trading model, you only get a limited amount of data from, for example, Bloomberg services on which to formulate confidence. Drilling down, you may approximate third-party transactions on which you can only obtain partial visibility. In this book we look at the various factors that obscure a supply data model and which therefore reduce the information that may be derived. Given a large and diverse supply population, backtesting becomes a crucial retrospective that may give pointers to trading forecasts, but they are only pointers; looking backwards is at best a simple guide to forecasting. However, there are several ways of analysing supply data for subsequent information.

Having gained separate PhDs in Financial Economics and Mathematical Finance, and holding multiple patent applications on algorithmic trading, our Dr and one-time academic Marcos Lopez de Prado now manages several multibillion-dollar funds using ML algorithms.

Complex, often inter-related topics are covered with a simplicity that only comes from mastery of the subject areas, useful to every data analyst and business analyst supporting risk management. Being a prolific author, Lopez often refers back to his prior publications. The book lays out a matrix of topics - Financial Data, Software, Hardware, Math (well, he is American!), Meta-Strat and Overfitting - each of which maps to chapters and parts within the book. However, defining the Sharpe Ratio and its common derivatives such as the Deflated Sharpe Ratio only after extensive earlier use is a presentational faux pas.

Standard industry financial risk models come with copious programming snippets in Python. Ultimately, terms such as molecules and atoms are used when trying to illustrate parallel (Python) programming. Using the KISS (Keep It Simple, Stupid) methodology, should Python Go?

Several elementary tools are introduced to visualise supply market data such as Time Bars, Tick Bars, Volume Bars and Dollar Bars and then one vertical and two horizontal barriers, information derivative bars and then Multi-Threaded Monte Carlo.
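As a rough illustration of the dollar bars mentioned above, the sketch below closes a bar each time the traded dollar value reaches a threshold, instead of at fixed clock intervals. The function name `dollar_bars`, the `(price, volume)` tick layout and the threshold value are assumptions made for this example, not code from the book.

```python
def dollar_bars(ticks, threshold):
    """Group (price, volume) ticks into bars, closing a bar once the
    traded dollar value accumulated in it reaches `threshold`."""
    bars = []
    current = []      # prices of the ticks in the bar being built
    dollars = 0.0     # dollar value accumulated in the bar being built
    for price, volume in ticks:
        current.append(price)
        dollars += price * volume
        if dollars >= threshold:
            bars.append({
                "open": current[0],
                "high": max(current),
                "low": min(current),
                "close": current[-1],
                "dollars": dollars,
            })
            current, dollars = [], 0.0
    return bars  # a trailing, incomplete bar is dropped


# Hypothetical ticks: (price, volume)
ticks = [(10.0, 100), (10.5, 100), (11.0, 50), (10.0, 300), (9.5, 10)]
bars = dollar_bars(ticks, threshold=2000.0)
```

Tick bars and volume bars follow the same pattern, with the accumulator counting ticks or summing volume instead of dollar value.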

Cross-validation (CV) splits supply data into Training and Testing pools to assist in model development and backtesting. Because k-fold CV, a popular scheme, is considered faulty for this kind of data, Lopez uses his own Purged k-fold CV, including for hyper-parameter tuning, to reduce leakage through purging and embargoes.
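To make purging and embargoes concrete, here is a minimal sketch under simplifying assumptions: samples are indexed 0..n-1 in time order, and `label_end[i]` (a hypothetical input) is the index of the last observation used to build the label of sample i. Training samples whose labels overlap the test fold are purged, and a further `embargo` samples after the test fold are dropped. This is an illustration of the idea, not the book's implementation.

```python
def purged_kfold(label_end, n_splits=3, embargo=0):
    """Yield (train, test) index lists for contiguous test folds.

    label_end[i] is the index of the last observation used to build
    the label of sample i, so label_end[i] >= i.
    """
    n = len(label_end)
    fold = n // n_splits
    for k in range(n_splits):
        start = k * fold
        stop = n if k == n_splits - 1 else start + fold
        test = list(range(start, stop))
        test_end = max(label_end[i] for i in test)
        train = []
        for i in range(n):
            if start <= i < stop:
                continue  # sample belongs to the test fold
            if i < start and label_end[i] >= start:
                continue  # purge: its label reaches into the test fold
            if stop <= i <= test_end + embargo:
                continue  # purge overlap after the fold, plus embargo
            train.append(i)
        yield train, test


# Each label looks one step ahead (capped at the last index).
label_end = [min(i + 1, 8) for i in range(9)]
splits = list(purged_kfold(label_end, n_splits=3, embargo=1))
```

With these inputs the middle fold tests on samples 3 to 5 and trains only on samples 0, 1 and 8; sample 2 is purged because its label ends inside the test fold, and samples 6 and 7 fall to the overlap and embargo rules.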

Backtesting (i.e. Stress Testing) is considered from several viewpoints. Three major Walk-Forward (WF) disadvantages are each considered: a single scenario can easily be overfit, WF is not normally representative of future performance, and its initial decisions are made on a limited portion of the total sample. Arguing against the benefits of WF, Lopez concludes the goal is to infer future performance from a number of out-of-sample scenarios. Extending his Purged k-fold principles, Lopez offers his Combinatorial Purged Cross-Validation method (CPCV), claiming it leads to fewer false discoveries, easily defeating WF overfitting.
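The combinatorial part of CPCV can be sketched as follows: the sample is cut into N groups and every choice of k groups serves once as the test set, giving C(N, k) splits that can be recombined into k/N * C(N, k) backtest paths. The helper name `cpcv_splits` is an assumption for this example, and purging and embargoing are deliberately left out to keep it short.

```python
from itertools import combinations
from math import comb


def cpcv_splits(n_groups, n_test_groups):
    """Yield (train_groups, test_groups) for every way of choosing
    n_test_groups test groups out of n_groups."""
    groups = range(n_groups)
    for test in combinations(groups, n_test_groups):
        train = tuple(g for g in groups if g not in test)
        yield train, test


splits = list(cpcv_splits(6, 2))
n_paths = comb(6, 2) * 2 // 6  # k/N * C(N, k) backtest paths
```

With 6 groups and 2 test groups this produces 15 train/test splits and 5 paths, whereas walk-forward yields a single scenario.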

Strategic risk and then portfolio risk are well covered, though this is a relatively stable topic with few recent advances.

Basics covered, we then focus on Asset Allocation, with increasing use of GCE A Level mathematics, looking at Markowitz’s Curse, Tree Clustering, Out-of-Sample Monte Carlo Simulations, Inverse Variance Allocation and others. Shannon’s Entropy and other financial applications of entropy then follow, as does a review of Microstructural Feature publications, including various lambdas such as Kyle’s and Hasbrouck’s.
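Of the allocation schemes listed, inverse variance allocation is the simplest to show: each asset's weight is proportional to the reciprocal of its variance, normalised so the weights sum to one. The function name and the variance figures below are made up for illustration.

```python
def inverse_variance_weights(variances):
    """Weight each asset by 1/variance, normalised to sum to one."""
    inv = [1.0 / v for v in variances]
    total = sum(inv)
    return [x / total for x in inv]


# Hypothetical return variances for three assets.
weights = inverse_variance_weights([0.04, 0.01, 0.02])
```

The lowest-variance asset receives the largest weight, and correlations between assets are ignored entirely.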

Simple illustrations conclude with brute force used in Quantum Computing to find optimum solutions by examining all feasible solutions at the same time. Considering the maturity of Quantum Computing, the number of advances, especially to standard models, seems somewhat lacking.

Finally, Dr Kesheng Wu and Dr Horst Simon look at Hierarchical Data Format 5 (HDF5), Supernova hunting, and High Performance Computing (HPC) against Cloud computing, reasoning HPC offers better cost effectiveness and higher performance. Several use cases are presented including Intraday Peak Electricity Usage, the latter providing recent interesting insights (2014) applied to a summer time study of American Advanced Metering Infrastructure (AMI), the Flash Crash of 2010 and High Frequency Events with Non-uniform Fast Fourier Transform used in the natural gas futures market.

Advances in Financial Machine Learning is a very interesting book. I would give it 8 out of 10 - the author knows his subject.

Review by Paul Ramsay
Originally published: https://www.bcs.org/content/conWebDoc...
Profile Image for Max Bolingbroke.
111 reviews, 24 followers
August 5, 2018
Read his free paper on hierarchical risk parity (SSRN 2708678) instead.
Profile Image for Denis Vasilev.
809 reviews, 107 followers
October 24, 2018
Practical advice on applying ML to stock market trading. Everything is to the point; it was very interesting to look at the main issues of working in one of the most competitive markets.
2 reviews, 3 followers
April 20, 2018
If you're coming from a computer science and/or machine learning background, you will learn a lot about how to frame your algorithmic thinking in the domain of finance, and the book will leave you hungry for more hardcore graph theory, parallelization, machine learning (beyond simple random forest ensembles and clustering), advanced algorithms, and gritty details of implementation, which are left for you to explore and enjoy.

The purpose of this book is not to explain how to apply Deep Learning to make money, but rather to lay a solid foundation of how to invest in a scientifically rigorous fashion given the modern machine learning toolset and access to PBs of data. In many cases, rather than focussing on the specifics of any given model, Dr. Lopez de Prado focuses on generating and selecting useful features.

The book, which is a hybrid of a textbook and a manual, explains using both formal mathematics and empirical evidence why many of the assumptions about Machine Learning applied to the financial world are wrong and follows through with rigorous and practical solutions. For example, one of the most common false assumptions addressed in the book is that of IID samples in financial time series data.

Dr. Lopez de Prado manages to pull together ideas from a wide spectrum of academic disciplines including mathematics, econometrics, machine learning, computer science, information theory, and physics to build a strong scientific basis upon which to algorithmically invest. Despite the diversity of subject matter, the book progresses well, building on and reusing early themes and then exploring domain specific topics like market microstructure and quantum computing. Source code to implement many of the methods is provided as a practical toolkit to test out the claims presented. The thorough use of references is particularly helpful as it keeps the content fairly short and to the point.

Speed reading not recommended. Using a programming analogy, the mathematical notation is more reminiscent of the explicit verbosity of C++ than that of Python (which is used in the book and is meant to be concise). It's not much of a problem, but be aware the information content is dense.

Something that's mentioned but not explored is how to make use of “alternative datasets”. Given many of the advances in the wider realm of ML have been around data you don’t get from exchanges, it would be nice if some helpful pointers or references for dealing with alternative data were included. That said, it's not the end of the world given the wealth of resources online for analyzing text, image, and video data.

Buy this book if you're an experienced programmer getting into Finance or a Financial Professional looking to strengthen your algorithmic understanding. It is densely packed with a wealth of practical methods and breaks down and offers alternatives to faulty investing science.
Profile Image for Terran M.
78 reviews, 107 followers
May 20, 2018
This book is for people who already understand machine learning or predictive modeling, and who already understand investment, and would like some guidance on applying the one to the other. It is an excellent book if and only if you meet these conditions.

The author has a hint of Taleb-style arrogance, wanting to be recognized for being the smartest person in the room, but not enough to impede enjoyment of the book, and it answers the question of why he published it at all in a field which is otherwise characterized by "those who know do not say."
Profile Image for IOANNIS TSIOKOS.
13 reviews
July 25, 2018
Knowledge like this is hard to come by because it is much more profitable to implement it than to write about it. Marcos must have had an urge to share his knowledge that overwhelmed the common wisdom in this industry - to not share or sell anything that works.
Profile Image for Thiago Marzagão.
220 reviews, 25 followers
April 7, 2021
I don't have any background in finance, so a lot in this book was completely news to me. I had never thought that you could use volume or dollar bars (as opposed to time bars), for instance. I imagine finance folks have known that for a long time, but it completely blew my mind. (Though there is some good pushback here: "when acting based on volume traded we may be too late already.") Similarly, I had never thought about how differentiating might eliminate signal. I saw fractional differentiation in grad school, in passing, but I had never found real-world applications for it (probably because I haven't done a lot of work with time series). I finally understand how it can be useful. Also, I had never thought that you could do cross-validation in a combinatorial way, with multiple test sets at a time. That is something I will probably try in non-finance work too.
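For readers who, like me, had not seen fractional differentiation applied before, a minimal sketch may help. The weights come from the binomial expansion of (1 - B)^d, with w_0 = 1 and w_k = -w_{k-1} * (d - k + 1) / k; with d = 1 they reduce to an ordinary first difference, while a fractional d keeps a slowly decaying memory of older values. The function names and the fixed window `size` are assumptions for this example.

```python
def frac_diff_weights(d, size):
    """First `size` weights of the expansion of (1 - B)^d."""
    w = [1.0]
    for k in range(1, size):
        w.append(-w[-1] * (d - k + 1) / k)
    return w


def frac_diff(series, d, size):
    """Apply a fixed-width fractional difference of order d."""
    w = frac_diff_weights(d, size)
    out = []
    for t in range(size - 1, len(series)):
        # w[0] multiplies the current value, w[k] the value k steps back
        out.append(sum(w[k] * series[t - k] for k in range(size)))
    return out


first_diff = frac_diff([1, 3, 6, 10], d=1, size=2)
half_weights = frac_diff_weights(0.5, 3)
```

With d = 0.5 the weights on past values shrink gradually rather than vanishing after one lag, which is how some of the signal in the levels is preserved.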

Now, I wish de Prado had separated data and strategy more clearly. Take chapter 13, "Backtesting on Synthetic Data", for example. You'd expect it to be about using Monte Carlo methods to generate synthetic market data (which you could then use to backtest your strategy). But no. The synthetic data includes the trading rules themselves. Why conflate both things like this? What if I just want to simulate market data, so that later I can use it to backtest whatever strategy I want? That might keep things simpler, clearer, easier to handle. (Can we even call it backtesting if you're trying to optimize the parameters of your strategy? Aren't these different things? What am I missing here?)

It's the same with the triple barrier labeling. I get it - "Every investment strategy has stop-loss limits" (p. 44). But it conflates data and strategy, which not only makes the whole thing harder to reason about, but also introduces problems like leakage and low uniqueness, which de Prado then spends entire chapters explaining how to fix. Those problems wouldn't exist in the first place if our labels were not dependent on our strategy. The man is clearly a genius (and he knows that - the book has a Taleb-ish style at times), I'm sure he has good reasons for suggesting triple barrier labeling, but where is the evidence that all this additional complication pays off in terms of higher Sharpe or some other metric?
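To make the point concrete, here is a minimal sketch of triple barrier labeling as I understand it: a label is 1 if the profit-taking barrier is touched first, -1 if the stop-loss barrier is touched first, and 0 if the vertical (time) barrier expires first. Returning 0 at the vertical barrier is one possible convention, and the function name and parameters are assumptions for this example. Note how the barrier widths, which are strategy parameters, end up baked into the labels.

```python
def triple_barrier_label(prices, entry, profit_take, stop_loss, max_holding):
    """Label the trade opened at index `entry`.

    profit_take and stop_loss are positive fractional returns;
    max_holding is the number of bars until the vertical barrier.
    """
    p0 = prices[entry]
    last = min(entry + max_holding, len(prices) - 1)
    for t in range(entry + 1, last + 1):
        ret = prices[t] / p0 - 1.0
        if ret >= profit_take:
            return 1    # upper horizontal barrier touched first
        if ret <= -stop_loss:
            return -1   # lower horizontal barrier touched first
    return 0            # vertical barrier reached first


prices = [100.0, 101.0, 103.0, 99.0, 96.0]
up = triple_barrier_label(prices, 0, 0.02, 0.02, 4)
down = triple_barrier_label(prices, 2, 0.02, 0.02, 2)
timeout = triple_barrier_label(prices, 0, 0.05, 0.05, 3)
```

Because each label spans several bars, labels of nearby entries overlap in time, which is the source of the leakage and low-uniqueness problems mentioned above.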

I also wonder whether some of these choices - dollar bars vs volume bars, triple barrier labeling vs conventional labeling, etc - could be learned from the data. I know, I know, overfitting. But I wish de Prado had discussed this possibility, even if to dismiss it. (Maybe reinforcement learning is the final destination? Btw, it would be great if de Prado could add a chapter on reinforcement learning for finance in a future edition.)

Finally, this is a super dense book (though de Prado warns you of that right in the beginning). Some passages I had to read 2, 3, 4 times to understand, and a few passages I just didn't understand at all. I hadn't struggled so hard with a technical book since grad school. The language could be more clear and precise. The notation could be more consistent. And examples with real-world data would help. (There are lots of code snippets though, and they are super helpful.)

All that said, this is an amazing book that anyone who wants to do algorithmic trading should read.
Profile Image for Oleksandr Nikitin.
23 reviews, 12 followers
May 8, 2019
Given the overall sad state of the literature in this area, it's good. Also, it's entertaining. Just don't expect it to be a guide of any kind.
Profile Image for Ayush.
23 reviews
April 8, 2021
Gold. The author has done a public service by sharing so many useful and mathematically grounded approaches. I see a lot of self-referencing in the research, but what can be done? Prado really is best positioned to write this book as he is both a practitioner and an academic.

I'll read this book twice, this time with end of chapter exercises to finally nail the concepts down.

Cons:
The math is confusing at times and the accompanying code makes Python look like C++
13 reviews
January 10, 2024
The book was generally well written, and I feel like I learned what I wanted from it.
I listened to this book as an audiobook (narrated by Steven Jay Cohen), which I'd not particularly recommend, as the formulas are poorly read aloud and unfortunately not all are available in the attached pdf.
Regarding the actual content of the book, I only had some small issues: The intro is a bit elitist, claiming you need to know a bunch of specific frameworks, APIs, programming languages, and even hardware vendors, while most of these are barely even mentioned throughout the rest of the book.
The last chapter about HPC could have been left out. It feels like a grant application for HPC funds was copy-pasted in.
5 reviews
September 17, 2024
What can I say? What can I say?

Masterful, with insane insights! This book has without a doubt given me all the weapons to revolutionize quantitative finance in the coming decade. Quite simply a gold mine, and I didn't even understand everything perfectly. The book doesn't lend itself to it, but it ought to be adapted into a film, it hits that hard, I swear. I read part of this book during my vacation in the mountains, and as majestic as Mont Blanc is, this work is the equivalent of Everest by comparison!

Bravo to Marcos Lopez de Prado in any case. The size of this guy's brain, really, and not a single flop to his name! He is in his prime; the value this guy delivers is GOAT-level.
Profile Image for David.
11 reviews
July 13, 2020
The single most important point of the book is the characterization of the failure modes of systematic (quant) outfits: what almost never works and what he has seen at least sometimes work. This is extremely useful and is possibly applicable to organizations outside of the systematic domain. de Prado also has a paper covering much the same topics.

Overall the book is useful since few are writing books like this. I only wish more effort and time was put into it to increase the quality and output. The tone of the book is one which encourages rote learning and an ignorance of the general setting.

Worth at least skimming since many people will read it and it does set out a language and some methods used in practice. Like a lot of de Prado's work there are useful heuristics but deep understanding is not always on offer and seldom is an attempt made to create links between the problems and heuristic solutions. Some of the code examples are of poor quality and the text itself is poorly formatted. There are some groups like https://hudsonthames.org/ working to put together cleaned up and improved versions of the methods presented in the book (among other things) so do see them for reference.
50 reviews, 3 followers
June 17, 2021
One of the most important and influential books written on quantitative methods for the study of financial markets. Particularly interesting is the focus on statistical properties of financial time series, the selection of strategies using computationally-efficient techniques such as random trees and clustering methods, as well as several important market microstructure features such as VPIN and other measures of toxic order flow, etc.

I would recommend to everyone that is interested in the science of financial market microstructure, as well as the cutting edge of data science as applied to markets.
Profile Image for Evan.
3 reviews
November 29, 2022
Quantitative finance looks to machine learning for additional sources of alpha. Marcos Lopez de Prado introduces hierarchical risk parity as an additional way to cluster assets as an alternative to Markowitz's mean-variance optimization. Interesting ideas of up/down/right bounded-box trading strategies, with varying or static thresholds. Novel factors like fractional differentiation to achieve partial-stationarity and partial-memory in time series problems.

I liked it and have implemented a number of the ideas. I think every quant has read this book and the market has likely removed any leftover alpha as a result.
Profile Image for Igor Pejic.
Author, 15 books, 16 followers
February 16, 2023
From the very broad AI books to a highly specific one: As the title suggests, Advances in Financial Machine Learning looks at a particular sub-discipline of AI and describes how to feed algorithms with huge troves of data in order to use them in financial services, in particular in investment banking.

López de Prado digs down deep into the details, so this book might not be the perfect fit for C-level executives, but rather for middle managers who have to design and set up machine learning projects. Finance is among those industries that offer the best preconditions for machine learning, but there are still many pitfalls this book will help you to avoid.
Profile Image for Salim Tlemçani.
7 reviews
June 2, 2024
Great ideas, breaking the codes of quant research. However, I wish I had been told that, although the title mentions it, this book is not about financial machine learning; it is rather about the framework around it and the processes to achieve a successful financial machine learning quant strategy. I love Marcos's ideas and the way he destroys conventional finance to establish a new norm that is more rigorous (following US national lab processes for instance). Finally, the fact that for every chapter the reader is able to dig deeper with exercises and, most importantly, research papers is really satisfying. Thanks Marcos!!
Profile Image for Jason Orthman.
260 reviews, 4 followers
August 9, 2019
Very difficult book to rate and review as it’s effectively a textbook for advanced participants in the field of coding (Python) and financial machine learning. The concepts and principles are still important. There is no easy win for fund managers who want to utilise financial machine learning to attain alpha. You will need a highly experienced team of skilled professionals across finance, coding, mathematics etc. that will keep evolving while avoiding common problems in areas such as over-fitting, back-testing etc.
Profile Image for Tony Murray.
7 reviews
May 7, 2020
Overall a decent textbook but one that I found too abstract to really dig into. I’m sure for specific people it is great but, as someone who is technically inclined, it just felt a bit too much about him referencing his papers and prior text. I was honestly hoping to be able to translate some of the code snippets from Python into R, but the code was very sparsely commented. I am working on a couple of simulations that the author coded and hope to get those translated. So overall it was a 4 star book.
1 review
May 16, 2022
A lot of ML books are written by academics and teach you about the theory of different models.

This book is written by someone who "manages several multibillion-dollar funds for institutional investors using machine learning algorithms". And it shows.

I'd recommend this book to anyone who wants to learn to build robust ML models for the real world that will actually work in prod.

There is a lot of finance specific information, but much of it translates across industries (in particular for time series problems).
Profile Image for Jaume Sués Caula.
247 reviews, 2 followers
March 15, 2020
Not a recommended reading if you are starting up at quantitative trading. The technical depth is astonishing, with great real-life examples.

In my case, I wanted to immerse myself to get the argot and a sense of the complexity of this world (just after reading Jim Simons's biography).
Profile Image for Ferhat Culfaz.
271 reviews, 18 followers
August 1, 2020
A recycling of many of his papers in book form. Has the cutting edge, but is a mix of the very specific and, at the same time, the very vague. Very advanced text that assumes you have vast prior knowledge. Very theoretical, yet contains snippets of Python code for implementation. Good bibliography after each chapter.
1 review
July 31, 2022
A couple of interesting ideas to glean from this, but nothing fancy. Quite a lot of pages are spent talking about proper backtesting and data-engineering procedures, which is more quant trading 101 and less "advances".
Profile Image for Aidan L.
2 reviews
October 9, 2024
This is the core textbook for any modern quant. Good luck reading the entire thing from cover-to-cover lol
Take notes and don't throw the book away, I guarantee you will flip it back open when research calls...
Profile Image for Connor Holm.
15 reviews, 1 follower
January 8, 2025
A really interesting book that combines both theory and application within financial mathematics specializing in machine learning. The author is extremely knowledgeable and his portfolio optimization method of hierarchical risk parity was cool. Thank you to Mr. Ruetz for gifting me this book!
Profile Image for Tom.
1 review, 3 followers
June 9, 2025
Highly practical and reliable information

For those looking to learn machine learning techniques that deliver reliable results in finance, this book is ideal. In particular, the HRP algorithm delivers exceptional results.
Profile Image for Hüseyin Çötel.
305 reviews, 13 followers
August 30, 2025
I liked this book and even bought a hard copy to read again when needed. It inspired a few ideas. I have to admit I understood maybe half of what I read, as mathematical language is hard for me to follow. I also find the code samples hard to read; they need better formatting and syntax colouring.
3 reviews
January 21, 2019
Excellent book with practical examples and issues in financial machine learning
Profile Image for Benji.
349 reviews, 75 followers
February 4, 2019
Application of ML algorithms to financial data is straightforward, at least in a technical sense.
Practically, God (or the devil) is in the details.
Profile Image for Randy Carlson.
34 reviews, 3 followers
May 27, 2019
Not bad. Very technical on both the finance end and the technical end.
