Multivariate data analysis (MDA) is a standard graduate-level course in many departments, notably mathematics and statistics, but also epidemiology and the social sciences, among others. This textbook provides a balanced introduction to the theory of MDA and its application using R. It covers many of the key topics found in a standard course, plus some modern topics such as tree-based methods and random forests. It includes many detailed examples and case studies to illustrate the methods, as well as exercises, making it suitable as a course text or for self-study. The book has been extensively class-tested through its use for a distance learning course at Sheffield.