C++ AVX Optimization: CPU SIMD Vectorization

AVX SIMD instructions are free parallelization hidden inside every modern x86 CPU. You can tap these vectorized instructions for extra speed, without learning assembly language, by coding with C++ AVX intrinsics, as the short sketch after the key topics list below illustrates.

Key topics:
- Introduction to AVX SIMD intrinsics
- Vectorization and horizontal reductions
- Low latency tricks and branchless programming
- Instruction-level parallelism and out-of-order execution
- Loop unrolling & double loop unrolling
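
The following is a minimal sketch, not taken from the book, of what C++ AVX intrinsics look like in practice: a vectorized element-wise add that processes eight floats per instruction, followed by a horizontal reduction that collapses an 8-lane register into a scalar sum. The function names add_avx and hsum_avx are illustrative assumptions, and the code assumes an AVX-capable x86-64 CPU compiled with an AVX flag (e.g. g++ -O2 -mavx example.cpp).

// Minimal illustrative sketch (not from the book): AVX vector add plus
// a horizontal reduction. Assumes AVX support and compilation with -mavx.
#include <immintrin.h>
#include <cstdio>

// Vectorized element-wise add: processes 8 floats per iteration.
void add_avx(const float* a, const float* b, float* out, int n)
{
    int i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);   // load 8 floats (unaligned)
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; ++i)                        // scalar tail for leftovers
        out[i] = a[i] + b[i];
}

// Horizontal reduction: collapse the 8 lanes of an __m256 into one sum.
float hsum_avx(__m256 v)
{
    __m128 lo   = _mm256_castps256_ps128(v);        // lower 4 lanes
    __m128 hi   = _mm256_extractf128_ps(v, 1);      // upper 4 lanes
    __m128 sum4 = _mm_add_ps(lo, hi);               // 4 partial sums
    __m128 shuf = _mm_movehdup_ps(sum4);            // duplicate odd lanes
    __m128 sum2 = _mm_add_ps(sum4, shuf);           // 2 partial sums
    __m128 sum1 = _mm_add_ss(sum2, _mm_movehl_ps(shuf, sum2));
    return _mm_cvtss_f32(sum1);                     // final scalar
}

int main()
{
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    float c[8];
    add_avx(a, b, c, 8);
    printf("sum of c = %f\n", hsum_avx(_mm256_loadu_ps(c))); // prints 72.0
    return 0;
}

The scalar tail loop handles element counts that are not a multiple of eight, and the reduction splits the 256-bit register into two 128-bit halves before summing with SSE shuffles, which is a common idiom for horizontal sums.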
Table of Contents
Part I: AVX Optimizations

1. AVX Intrinsics
2. Simple AVX Example
3. CPU Platform Detection
4. Common Bugs & Slugs
5. One-Dimensional Vectorization
6. Horizontal Reductions
7. Vector Dot Product
8. Loop Optimizations
9. Softmax
10. Advanced AVX Techniques
Part II: Low-Level Code Optimizations
11. Compile-Time Optimizations
12. Zero Runtime Cost Operations
13. Bitwise Operations
14. Floating-Point Computations
15. Arithmetic Optimizations
16. Branch Prediction
17. Instruction-Level Parallelism
18. Core Pinning
19. Cache Locality
20. Cache Warming
21. Contiguous Memory Blocks
22. False Sharing
23. Memory Pools
Appendix: Long List of Low Latency Techniques
Appendix: License Details

335 pages, Kindle Edition

Published July 16, 2025

About the author

David Spuler

20 books, 7 followers

Ratings & Reviews

No one has reviewed this book yet.
