Your Code as a Crime Scene: Use Forensic Techniques to Arrest Defects, Bottlenecks, and Bad Design in Your Programs

Name: Your Code as a Crime Scene: Use Forensic Techniques to Arrest Defects, Bottlenecks, and Bad Design in Your Programs (The Pragmatic Programmers)
Rating: 3.73 (50 reviews)
ISBN: 9781680500806

Adam Tornhill

Jack the Ripper and legacy codebases have more in common than you'd think. Inspired by forensic psychology methods, you'll learn strategies to predict the future of your codebase, assess refactoring direction, and understand how your team influences the design. With its unique blend of forensic psychology and code analysis, this book arms you with the strategies you need, no matter what programming language you use.

Software is a living entity that's constantly changing. To understand software systems, we need to know where they came from and how they evolved. By mining commit data and analyzing the history of your code, you can start fixes ahead of time to eliminate broken designs, maintenance issues, and team productivity bottlenecks.

In this book, you'll learn forensic psychology techniques to successfully maintain your software. You'll create a geographic profile from your commit data to find hotspots, and apply temporal coupling concepts to uncover hidden relationships between unrelated areas in your code. You'll also measure the effectiveness of your code improvements. You'll learn how to apply these techniques on projects both large and small. For small projects, you'll get new insights into your design and how well the code fits your ideas. For large projects, you'll identify the good and the fragile parts.

Large-scale development is also a social activity, and the team's dynamics influence code quality. That's why this book shows you how to uncover social biases when analyzing the evolution of your system. You'll use commit messages as eyewitness accounts to what is really happening in your code. Finally, you'll put it all together by tracking organizational problems in the code and finding out how to fix them. Come join the hunt for better code!

What You Need:

You need Java 6 and Python 2.7 to run the accompanying analysis tools. You also need Git to follow along with the examples.

Genres: Programming, Technology, Software, Computer Science, Technical, Nonfiction, Engineering

221 pages, Kindle Edition

First published May 1, 2014

78 people are currently reading

1083 people want to read

About the author

Adam Tornhill

5 books · 31 followers

Community Reviews

5 stars: 55 (21%)
4 stars: 110 (42%)
3 stars: 68 (26%)
2 stars: 20 (7%)
1 star: 6 (2%)

Displaying 1 - 30 of 50 reviews

Rod Hilton

152 reviews · 3,116 followers

December 10, 2015

Full Disclosure: I was a technical reviewer for this book

Your Code as a Crime Scene has a lot of extremely interesting ideas, and for those alone it's worth reading. The essential idea is exactly what the title says - this seems weird or impossible at first but I assure you, the title is genuine. Effectively what this book is about is using forensic techniques to figure out which spots in a large code base are most in need of improvement/refactoring.

There are lots of different kinds of things to look for and visualize in your code to help figure out dangerous areas in it, and the book takes you through each one, how it works, and how to get at the data to find these areas. It's really, really interesting, in fact it's one of the most interesting books on software I've ever read. At a previous job, we did something similar to find good candidates for our weekly Refactotum meetings - it was a script that used Git and our ticketing system to find files that were frequently modified, very large, and the source of a disproportionate number of bugs - the function for evaluating the sort order of the files was kind of complex and involved, and I was actually pretty proud of helping develop it because it was such a neat way of viewing a codebase. I had no idea that some day there'd be a book all about stuff like that, with even more techniques. Really cool.
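A ranking script of the kind described above might look roughly like the following. This is only a sketch under assumptions: the `FileStats` fields, the equal default weights, and the max-based normalization are hypothetical choices for illustration, not the reviewer's actual function or anything taken from the book.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    revisions: int  # number of commits that touched the file
    lines: int      # current size in lines of code
    bug_fixes: int  # commits linked to bug tickets (hypothetical input)


def normalize(values):
    """Scale values to the range 0..1 relative to the largest value."""
    top = max(values) if values else 0
    return [v / top if top else 0.0 for v in values]


def rank_hotspots(stats, weights=(1.0, 1.0, 1.0)):
    """Return (path, score) pairs, highest score first.

    The score is a weighted sum of normalized change frequency,
    size, and bug-fix count; the weights are an assumption.
    """
    revs = normalize([s.revisions for s in stats])
    size = normalize([s.lines for s in stats])
    bugs = normalize([s.bug_fixes for s in stats])
    w_rev, w_size, w_bug = weights
    scored = [
        (s.path, w_rev * r + w_size * z + w_bug * b)
        for s, r, z, b in zip(stats, revs, size, bugs)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Files near the top of the returned list are candidates for a closer look, not verdicts; the weights would need tuning against a real codebase.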

My main complaint with the book is that the author frequently uses a tool he wrote, code maat, to perform the analyses. I'd have much preferred more stress on the actual methods - I often came away feeling like I'd be unable to employ these techniques without code maat because I didn't get enough detail on the thinking behind the tool. In fact, I think the book often comes off as a code maat tutorial that was renamed. The other thing that bothers me is that often the code base analyzed to illustrate one of the book's ideas is the code maat codebase itself. This just seemed so self-referential to me, weirdly meta or something. It made it hard for me to really understand things - the code itself was already one step removed from the ideas, and then the codebases being looked at wound up kind of having the same leap. It's hard to explain, but the book would be much better if it analyzed popular open source projects that people use every day, and personally I'd have preferred less involvement from code maat itself - maybe a little paragraph or section at the very end showing how code maat could be used to automate a lot of the more manual work that the previous pages employed.

Overall, I actually highly recommend checking this book out - especially if you work on a large codebase and are concerned about its quality. There are lots of cool ideas in the book to help you find the biggest "bang for your buck" areas to improve the codebase.

Nikita Salnikov-tarnovski

25 reviews · 14 followers

October 4, 2015

Good ideas, terrible book.

The author presents several metrics that you can mine from your version control system and several ways to visualize them. All this with the goal of gaining more insights into the structure of your code, the ways your team collaborates and to spot emerging troubles early on. Some of these metrics are really interesting and do give you valuable information in a cool way.

But the book itself is very bad. The connection to forensics and crime investigation is so weak it completely disappears after the first few chapters. I would go so far as to claim that the title of the book is misleading if not outright lying. In many situations the author's reasoning and conclusions require big leaps of faith to follow. Several chapters seem totally out of context.

Separate blame goes to the source code accompanying the book. Although the author claims that the book is about the method, not the tool, examples of using those methods with the author's tool add up to a large portion of the book. And for such a central protagonist the code is just too immature. It is of alpha quality at best.

Sandro Mancuso

Author of 2 books · 289 followers

April 23, 2020

The book explores useful quality metrics and behavioral patterns by analysing our commit history. This is a fascinating subject which we need to pay more attention to, mainly when it comes to software modernization projects. I gave it 4 stars as I thought the author could have gone a bit deeper in certain areas. Really enjoyed it.

Mark Seemann

Author of 3 books · 487 followers

April 5, 2020

Learn techniques for analysing a code base. Not so much static code analysis, but analysis of source control repositories like Git.

The title's implied connection to real criminal forensics is tenuous at best, and becomes entirely conceptual as the book progresses, but the metaphor still seems apt. The techniques presented strike me as a set of suggestions that you can apply in an ad-hoc manner to investigate specific topics.

Part III covers how you can use these techniques to uncover communications patterns in and between teams. This was the best part, I thought.

software

Emre Sevinç

177 reviews · 434 followers

November 15, 2016

Some software projects start from scratch, are very limited in scope, you work on them with a very small team for a few months or a year, and then move on to another project. This book is not for people working in such conditions. On the other hand, if you find yourself dealing with software-based products comprised of million-line code bases that have been developed by tens of engineers spread throughout the world during many years, and you're supposed to fix bugs, and add new features, thinking about how you can even begin to understand the huge complexity, well, then, this book will mean something to you.

The book's beauty stems from the fact that software engineers can always do better, and that it is crucial to utilize the metadata surrounding the software development process, both statically, as well as temporally. Once you internalize this mindset, and start to look at the systems at a high level, you realize that there's still a lot of work to be done to enhance the practice of software development.

One of the good aspects of the book is that the analyses described by the author are mostly independent of programming languages and paradigms: the code repositories analyzed range from C# projects to the ones in Java, Scala, and Clojure. Therefore, you can make use of the book if you work with object oriented technologies, or are a functional programming aficionado. Another good point: the examined source code repositories are high profile open source projects such as Hibernate, and Scala. This means you can easily perform similar analyses yourself on the most recent version of those projects, and since they are popular ones, you either know about them, or used them heavily in your projects.

If I had to summarize the book, I'd say the gist of it is: find the hotspots in your project, check if they are problematic, and focus on temporal coupling for proactive handling of complexity that is to reveal itself soon. The author starts from these simple premises, and proceeds to show how they can be applied in detail to many different code bases, using the tools he developed and applied to the revision control system data such as Git logs. The set of analyses that can be done, and the insights that straightforward visualizations can lead to are impressive!
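To make the temporal coupling idea concrete, here is a minimal sketch. It assumes each commit has already been reduced to the list of file paths it touched, and it defines the coupling degree as the number of shared commits divided by the average number of revisions of the two files. That definition and the `min_shared` threshold are assumptions made for illustration; the formula used by the book's tool may differ.

```python
from collections import Counter
from itertools import combinations


def temporal_coupling(commits, min_shared=2):
    """Return (file_a, file_b, shared_commits, degree) tuples.

    `commits` is an iterable of collections of file paths, one per commit.
    Pairs that changed together fewer than `min_shared` times are dropped.
    """
    revisions = Counter()  # how often each file changed
    shared = Counter()     # how often each pair changed together
    for files in commits:
        unique = sorted(set(files))
        revisions.update(unique)
        shared.update(combinations(unique, 2))

    results = []
    for (a, b), count in shared.items():
        if count < min_shared:
            continue
        average = (revisions[a] + revisions[b]) / 2
        results.append((a, b, count, count / average))
    # Strongest coupling first; ties broken by path for stable output.
    return sorted(results, key=lambda r: (-r[3], r[0], r[1]))
```

A high degree between two files in unrelated parts of the system is the kind of hidden relationship worth investigating by hand.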

You might have noticed that I didn't touch on the author's use of concepts from criminal psychology: that's because I think contrary to what the title might imply, this book is much more about the scientific analysis of software product and process artifacts in the service of higher quality engineering, rather than analyzing software from the perspective of criminal psychology. Of course, the author's background in psychology and cognitive science, in addition to his engineering experience, is the reason this book is probably in a category of its own, but still, most of the anecdotes, even though interesting in themselves, can be considered tangential to the main topics of the book.

The book finishes by noting this is not the end of the story, showing we need more tools to utilize the metadata such as building code recommenders by doing statistical analysis on your code base (and I'd add even other code bases!), integrating dynamic information about the code, utilizing even more detailed data such as in-commit changes that your development tools can record, and other advanced features to be implemented as concrete technologies. Some of them, such as code recommenders already started to appear in the last few years (see http://www.eclipse.org/recommenders/), but we need more of them, as well as cross-platform ones.

I can recommend this book to experienced software engineers working on long-term, complex software projects and products, as well as engineering-oriented managers who seriously consider enhancing their processes.

Hildeberto

97 reviews

September 16, 2017

The book is good at investigating design issues, but not good at solving them. Problems are exposed, but not discussed. The tools presented aren't more than scripts that have to be executed manually and in the right sequence. There is no dashboard or some sort of continuous automated analysis to show the progress of the actions taken after the diagnosis.

The text emphasizes the approach of analyzing the history of changes in files based on version control logs, but I personally think that all the diagnoses shown in the book can also be done by a static analysis tool like SonarQube.

Chapter 10, "Use Beauty as a Guiding Principle", was the most relevant chapter for me. It really opened my mind to some aspects of programming. It made the book worth reading.

technology

John

30 reviews · 5 followers

July 22, 2017

I work professionally with some fairly large and old codebases (1-200K lines of code). I follow a craftsman's approach to software development, so it brings me great joy to leave a system cleaner than I found it. But not all poor code is equally impeding to maintenance and feature development, and careers are meant for more than post-hoc code cleanup. You know, like sustainably building the right thing for your customer.

This book hits an intersection of respect for cognitive load, seeing things in the world as they are rather than as we would like them to be, and borrowing from disparate fields of study to solve tricky problems. I couldn't wait to try the techniques out on my main code base. Within an hour I had a working hotspot visualization of code from my own team. The hotspots matched remarkably well with my own experience. These are some hairy pieces of code, and with analyses like this it's a lot easier to sell the value of taking some time to refactor these areas.

If your read is like mine, you will devour the book in a week. Although it's less than 200 pages, the author does not skimp on nuance, studies supporting his highlighted heuristics, or approachable language. This is a case of a very good idea, executed well, and conservatively explicated. The ideas here are good, and the tools are shovel-ready.

This is the first software book I've seen that is about studying one's code base's evolution to identify its problems and suggest solutions. I can't wait to see how this changes my work in the next week, month, quarter, and year.

Matthew Boehm

22 reviews

April 6, 2017

Great book for learning how to understand large codebases and how to make data-driven decisions on improving code. The parallels to forensics helped make the book more entertaining and kept me engaged.

It's very easy for programming books to either be too common-sense or too subjective, but Tornhill manages to present novel information which I think has a much better shot at being widely accepted than debates on various testing methodologies, OO design, type systems, and many other highly contested topics in the software engineering world.

Lucy Batson

468 reviews · 9 followers

May 15, 2019

Some good ideas and approaches in this book, but most are tied to a tool that makes it not especially helpful. Would be much better if it wasn't so tightly coupled to code-maat.

Pattie Reaves

20 reviews

June 6, 2024

One of my favorite books we did for Engineering Book Club.

Nathan Albright

4,488 reviews · 154 followers

August 18, 2016

Admittedly, I am not that much of a computer programmer or coder of any kind, although computer programming is a subject of some personal interest [1] and so is forensics [2], and this inventive and creative book manages to combine insights from forensics into an intensely practical way to improve computer coding by using the techniques that people use to solve crimes. Although the idea may seem a bit far-fetched, considering that both crimes and complex computer programs are aspects of the mind, and that both of them often involve unsolved mysteries and a fair bit of messiness and chaos and pattern recognition, they have more similarities than are often considered to be the case, and in combining the two fields, this book is even of interest on the conceptual level to those whose skills with programming, or the lack thereof, prevent a full application of these insights as the author would wish.

This book, in under 200 pages, manages to provide information that is strikingly unusual, technical and practical, and also very humorous at the same time. The author begins, as a good author of a practical guide should begin, with an introductory section that explains how to best read the book and apply its insights. After this, the author discusses the matter of evolving software for several chapters, looking at code as a crime scene, creating profiles of possible offending areas of code to investigate under intense scrutiny, finding hotspots in large-scale systems of code, judging the function of hotspots through their names, and calculating the complexity trends of code through the shape of code statistics over time. The author then spends a few chapters dissecting code architecture by treating code as a cooperative witness, detecting the signs of architectural decay, building a safety net for code architecture, and using beauty and elegance in code as a guiding principle in code design. The third part is perhaps even more striking in the way it encourages the reader to master the social aspects of code by looking at norms, groups, and false serial killers (bad code), discovering organizational metrics in one's codebase, building a knowledge map of a system, diving deeper by looking at code churn, and pointing the reader toward the future. Not only is this book practical and technically sound, but it is also thoughtful on the level of human communication and on the patterns of the human mind.

Among the many qualities that make this book such a surprising joy to read is the fact that the author spends a great deal of time poking fun at his own code, using the techniques he discusses on his own code project in order to remind himself of his own shortcomings and occasional shortsightedness in code design, including the combination of multiple features within the same files, which causes them to change more often and be more unstable, and in sloppy nomenclature which can cause ominous problems in trying to figure out what a particular piece of code happens to do. There are jokes about Minecraft, insights about the avoidance of certain areas once a crime has been committed there, and comments on the need to bolster our intuition with solid data. There are even considerable insights as to the importance of strong data visualization techniques to be found here. If a reader is someone who codes, and takes coding seriously and wants to minimize the difficulties of maintaining and improving a code base, this is a strong and short book to read, a genuine pleasure that far exceeds the usual dry and boring fare one gets from technical reading.

[1] See, for example:

https://edgeinducedcohesion.wordpress...

https://edgeinducedcohesion.wordpress...

https://edgeinducedcohesion.wordpress...

https://edgeinducedcohesion.wordpress...

https://edgeinducedcohesion.wordpress...

[2] See, for example:

https://edgeinducedcohesion.wordpress...

https://edgeinducedcohesion.wordpress...

https://edgeinducedcohesion.wordpress...

challenge

Victor

41 reviews · 8 followers

January 24, 2017

Check out my blog for a more detailed review and summary of the book.

Most of us use version control systems only as repositories for code. But they can be much more than that since they contain a lot of useful information about a project’s history. Your Code as a Crime Scene describes how to extract this information from them. It presents some novel techniques useful for detecting hot spots and error prone code. The book is accompanied by a tool, Code Maat, that you can use to apply the techniques on your own projects. Although the tool is a black box, having the ability to put theory into practice is a big win. The book takes a look at the source code of Code Maat, but also at more mature and larger projects like Craft.Net, Hibernate and Scala.

Most of the techniques rely on simple metrics. I was surprised by how powerful they can be. This simplicity has its strengths and weaknesses. On one hand, they are language agnostic, so they can be applied on any code base, regardless of language. On the other hand, simplicity means they are less precise than other more complete metrics. But the author cites studies that prove these simple metrics are as good at detecting error prone code as other, more complex metrics.

The author also draws from his personal experience and describes how these techniques helped him (or would have helped him) in the past. I always find this useful and I think it helps emphasize the effectiveness of these techniques.

At under 200 pages, this book is packed with information. I recommend reading it and then employing the analysis techniques on your own projects. This data can help you get a better knowledge of a project and make better decisions.

read-tech

Erika RS

857 reviews · 262 followers

January 30, 2017

I often wish there were a better publication format for things that are worth more than some blog posts but don't have a whole book's worth of content. This book would have made a great booklet, but did feel padded.

That said, the techniques were useful. The book, at its core, is about using data in the source code repository to get a broader understanding of a code base. The techniques are described in some detail, but the book leaned a bit heavily on a particular tool (Code Maat). I would have appreciated high level pseudocode versions of the various analysis techniques.

Throughout the book, we looked at several properties, all of which can be extracted from many source control repositories:
* change frequency
* lines of code -- the simplest proxy for complexity
* code churn (lines changed over time, rather than raw revisions)
* temporal coupling between code (files that are frequently changed together)
* authorship for expertise and churn (many authors)
* joint authorship to determine the social landscape of a project
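Several of the properties listed above can be pulled out of a plain-text Git log with very little code. The sketch below assumes the log was produced with something like `git log --numstat --pretty=format:'--%h--%an'`, so that each commit starts with a `--hash--author` header followed by tab-separated `added deleted path` lines. The header layout and the function names are assumptions chosen for this example, not the format the book's tool expects.

```python
from collections import Counter, defaultdict


def parse_log(text):
    """Yield (commit_id, author, [(added, deleted, path), ...]) per commit."""
    commit = None
    for line in text.splitlines():
        if line.startswith("--"):
            if commit:
                yield commit
            _, commit_id, author = line.split("--", 2)
            commit = (commit_id, author, [])
        elif line.strip() and commit:
            added, deleted, path = line.split("\t", 2)
            # Git reports "-" for binary files; count those as 0 lines.
            a = int(added) if added.isdigit() else 0
            d = int(deleted) if deleted.isdigit() else 0
            commit[2].append((a, d, path))
    if commit:
        yield commit


def summarize(commits):
    """Return per-file change frequency, churn, and set of authors."""
    revisions = Counter()        # change frequency
    churn = Counter()            # lines added plus lines deleted
    authors = defaultdict(set)   # who has touched each file
    for _, author, changes in commits:
        for added, deleted, path in changes:
            revisions[path] += 1
            churn[path] += added + deleted
            authors[path].add(author)
    return revisions, churn, authors
```

Renamed files appear in this output under `old => new` style paths, which the sketch does not resolve; a real analysis would need to handle that.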

The author also describes a number of interesting visualization methods and points to libraries that can generate them once you've extracted the raw data.

Note that Tornhill frequently warns that these are only heuristics. Although the framing around "forensic techniques" was rather thin, one important way that these tools are similar to tools of crime investigation is that they won't give you answers on their own. They will, however, point you at areas that warrant further investigation.

software

Johnny

600 reviews · 11 followers

March 6, 2016

First, this book is not about crimes, not even for the case where your application was hacked. The reference to crimes often feels contrived, and the material is better described as psychology than crime fighting. There are some great ideas and a great deal of promotion for the tool created by the author.

The idea of counting the whitespace in front of each line of code to determine its complexity is genius. Counting tabs and spaces is easy and still gives you a very good idea of the complexity, since those characters are used to indent the code (more indentation equals more complexity). There are great moments in this book, and this one alone makes the book worth reading.

However, many more pages are taken up by a tutorial for the tool code-maat. Parsing the output of git log is also a good idea, but if you can't use this tool, most of the analyses will be out of reach (they're only available in code-maat and no other tool). And since there are no pre-built binaries, that may easily be a problem for many readers, unless you like to build a tool written in Clojure from scratch.
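The indentation heuristic is simple enough to sketch in a few lines. The version below assumes one level of indentation is one tab or four spaces (the `spaces_per_level` parameter is a hypothetical knob) and skips blank lines; the function names are made up for this example and do not come from the book.

```python
def indentation_levels(source, spaces_per_level=4):
    """Return the logical indentation level of each non-blank line."""
    levels = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no information
        expanded = line.expandtabs(spaces_per_level)
        leading = len(expanded) - len(expanded.lstrip(" "))
        levels.append(leading / spaces_per_level)
    return levels


def complexity_summary(source, spaces_per_level=4):
    """Summarize indentation as a rough, language-neutral complexity proxy."""
    levels = indentation_levels(source, spaces_per_level)
    if not levels:
        return {"lines": 0, "total": 0, "mean": 0.0, "max": 0}
    return {
        "lines": len(levels),
        "total": sum(levels),
        "mean": sum(levels) / len(levels),
        "max": max(levels),
    }
```

Running this over successive revisions of the same file and plotting the totals gives a rough complexity trend. It remains a heuristic: flat but tangled code will still score low.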

Balhau

59 reviews · 5 followers

August 22, 2016

This is a very interesting book. Adam tackles some important points in software development. In a nutshell, this book is all about getting at the valuable information that is hidden in your version control system. Be it git, svn, or any other repository, there is lots of information regarding the health of your project. Adam presents us with a bunch of metrics and tools to extract the respective values. The approach is a mix of technical material with concepts from human sociology that will keep us sensitive to some key ideas that rule the way people work and the impact this has on software development. These ideas are used to extract and analyse information that will say whether your code has lots of code debt or not, has process loss or not, and help us to deal with legacy code in a tractable way. This is a must read and I truly recommend the book as well as the tools used in the book. Cheers

Chris

168 reviews · 3 followers

March 28, 2016

This fairly short book provided a really interesting way to analyze software projects, with analogies drawn from criminal forensics. Nearly all the analysis is done via processing source-control metadata, but the range of insight that can be extracted is impressive. The book backs up the value of these analytic techniques with a good amount of information drawn from academic studies, not just personal or anecdotal experience. I also appreciated the variety of projects used in the examples. I haven't tried out the author's tools & techniques myself yet, but I look forward to doing so.

Read for March 2016 Austin Computer Book Club meeting.

austin-computer-book-club

Suvash Thapaliya

24 reviews · 4 followers

July 22, 2015

I've always wondered about (the lack of) tools that introspect the software as it's being written, from personal to architectural level. The findings in this book resonate very well with the kind of tools/techniques I'd like to work with (hopefully) in the future, hence I'm obviously biased towards liking the book.
With more introspection in how we develop software, whether it be at a social/organizational level or within the code itself, we can get a better overview of maintaining/writing better software. This is really what the book is mostly about, and the best part is that the author has written tools to actually back up his points with data. This was a treat to read.

read-2015 software-development tools

Gábor Hajba

139 reviews · 3 followers

August 1, 2016

Nice book, gives new ideas on how to approach software projects and their potential failures.

The only drawback in my eyes is that this book could have had more examples (even for SVN, because this tool is widely used on big and legacy codebases and it is hard for the tools mentioned in the book to work with them).

computer-science

Manni

2 reviews

November 16, 2017

There is one interesting idea in this book: use your source control’s history feature to find which parts of your code change frequently and which do not. Then also correlate what parts change in unison. The rest is just marketing. The “crime scene” aspect is just very lame marketing. The tool used to analyze logs is today a commercial service.
Don’t buy.

programming

João Maia

48 reviews · 5 followers

July 23, 2023

A book about the author's tool; I didn't find it very useful, and found it limited.

dropped

Steven Jøris

8 reviews

January 28, 2024

This book was extremely interesting, and I would highly recommend it to other developers, but I have some concerns preventing me from giving this 5 stars. Let's see why!

First, "Let's see why" is a pun on the writing in the book, with extensive "reading guides", concluding almost every section with a summary, announcement of the next section, and concluding "Let's find out". For such a short read, it detracts from the otherwise valuable content!

When interpreting data, I found most of the observations would/should be known to people on the team. It can be nice to link objective numbers to experienced pains, e.g. for presentations, but a deeper understanding of a codebase would evolve regardless in effective teams.

The techniques definitely seem relevant to get a high-level overview/visualization of known problems, but I have doubts in their ability to catch problems before they would become known to active developers. The book's example analyses feel like post-hoc rationalizations.

But my main concern is with the code complexity measure: indenting. True, low indenting correlates with good code, but it's easy to write bad code with little indenting. I'm afraid, using it as a complexity metric may encourage premature decoupling and increase temporal coupling. A concrete example: Uncle Bob's destructive advice to "extract till you drop", leading to function hell (https://whatheco.de/2010/12/07/functi...), would not only be classified as good code, but be encouraged when integrated with a CI tool.

Therefore, I'm on the fence as to how/when to apply it in an effective software development process.

Anders

50 reviews · 1 follower

April 27, 2022

This book outlines how you can gain insights into your software code, development workflow and architecture from version control in order to improve your way of working. Although I do not endorse all of the practices the book lists, it gives software practitioners ideas about how to think about what they produce.

The book is not very long and is very easy to read without any special knowledge except for some insight into what version control is. However, in order to gain the full benefit of the book the reader should also have the ability to execute the sample scripts and commands. The short scripts and commands analysing the sample code repositories give the reader the ability to see in practice what the output looks like.

work-related

Heather

996 reviews · 23 followers

August 6, 2024

Well this took me longer than needed to finish! I kept forgetting where I was, since I was reading on the O’Reilly site and couldn’t save my spot, so I probably read this book at least times over.

This would be a good book for an engineering manager or a team lead (who isn’t coding all the time) to use to check in on the health of the code and the team. I also considered using some of the techniques in my on-ramping at my new job, to see what parts of the code get touched most. I didn’t because there was other on-ramping to do, but you know.

They published a newer version this year and that’s the one I finished reading today- it addresses the AI aspect of coding and keeping code clean and understandable.

Jake McCrary

424 reviews · 25 followers

June 27, 2018

The book has some interesting ideas in it. I should probably try them out on some codebases I've worked on to see what could be useful and what isn't.

The ideas didn't seem new to me. Maybe this is because I've heard them before or, once heard, they fit my mental model of how code changes.

I found the connection to forensic stories to be weak. I don't think that stories of catching serial killers helped the book's arguments, though maybe it made it slightly more interesting.

non-fiction read_2018 tech

Luca Campobasso

59 reviews · 2 followers

April 25, 2020

Interesting and indeed quite practical, I can't wait to put this knowledge to use. It's very specific, so you will find it useful only if you work in some organization and you want to actually apply it immediately. If you are just interested in reading it for the hell of it, there will still be some useful insights that might help you see mistakes before making them, and write better and more organized code from the beginning.

programming

Steve Goodreads

40 reviews

August 20, 2024

Terrific exploration of ideas to aid in understanding a code base. The metrics presented can be a good first step in analyzing a problem (aka support decisions but not make them). Conceptually, the ideas offer good insights into where to focus efforts. I have not had a chance to apply any of these techniques to large codebases that I've worked on, would like to see how useful they end up being (I am optimistic though).

4-5-stars library

Maksim Kiryanov

24 reviews

July 10, 2021

I liked the book.

It describes problems in program code very concisely and engagingly, along with how they can be analyzed and prevented by applying techniques from forensics.
And all of it comes with practical examples that you can pick up and try.

I discovered a completely new approach for myself: Hotspot Analysis.

I'd especially like to thank the author for not trying to pad the book with filler.

engineering

Christoph Kappel

470 reviews · 9 followers

November 8, 2023

Really interesting book and accompanying GitHub repository. I played a bit with code maat so far, but I am definitely using it to discover new code bases in the future.

The overall theme and relation to forensics is funny to read and it really dives into things like temporal coupling and such and explains what knowledge you can gain from it.

2023 code english

Dushan Hanuska

112 reviews · 2 followers

May 31, 2017

This book presents interesting ideas of extracting some hidden knowledge out of code meta-data, mostly commits to git repository. It is quite centric around a tool called "maat". All-in-all it is worth reading if you have time and are looking at how to improve your teams and their coding skills.

information-technology

Ruben

100 reviews · 10 followers

August 1, 2020

I subscribe to the opinion of a lot of other reviewers. While the ideas are interesting (I already used some, and will write some tools for others for fun), the packaging (i.e. the story and the book itself) is not that great. I would still recommend it for the ideas in it, though.
