This book was a quick read without a lot of meat. I've captured the nuggets below that highlight my takeaways. I wouldn't recommend spending much time on the book itself, though it's fine if you know nothing about AI and ML.
More and more of our lives are being governed by algorithms.
Sometimes AI is only a small part of a program while the rest is rules-based scripting. Other programs start out AI-powered but hand control over to humans if things get tough (pseudo-AI) - for example, customer service escalating from a chatbot to a human agent, or a safety driver taking over a self-driving car.
“People often sell AI as more capable than it actually is.”
Flawed data will throw an AI for a loop or send it off in the wrong direction. Since the example data is, in many cases, how we define the problem we're giving the AI to solve, it's no wonder that bad data leads to a bad solution.
Machine Learning (ML) is a subset of AI. It includes Deep Learning, Neural Networks, Markov Chains, Random Forests, etc.
The difference between ML algorithms and traditional rules-based programs is that ML figures out the rules for itself via trial and error. As an AI tries to reach the goal its programmers specify, it can discover new rules and correlations. All it needs is a goal and a data set to learn from (see the sketch below).
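A rough illustration of "just a goal and a data set": the pass/fail data and the threshold rule below are made up for this sketch (not from the book), but they show a program discovering its own rule purely by trial and error.

```python
import random

# Toy labeled data: (hours_studied, passed_exam). Hypothetical example data.
examples = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1), (7, 1)]

def accuracy(threshold):
    """The goal we give the machine: fraction of examples classified correctly."""
    correct = sum(1 for hours, passed in examples
                  if (hours >= threshold) == bool(passed))
    return correct / len(examples)

# Trial and error: try random rules, keep whichever best meets the goal.
best_threshold, best_score = None, -1.0
for _ in range(1000):
    guess = random.uniform(0, 8)
    score = accuracy(guess)
    if score > best_score:
        best_threshold, best_score = guess, score

print(f"Discovered rule: pass if hours >= {best_threshold:.2f} "
      f"(accuracy {best_score:.0%})")
```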
Algorithms are good at finding trends in huge data sets but not good with nuance. ML algorithms are just lines of computer code.
Researchers are working on designing AIs that can master a topic from fewer examples (i.e., one-shot learning), but for now a ton of training data is required.
While a human driver may only need to accumulate a few hundred hours of driving experience, Waymo’s cars have collected data from driving more than 6M road miles plus 5B more miles driven in simulation.
“Many AIs learn by copying humans. The question they’re answering is not ‘What is the best solution?’ But ‘What would the humans have done?’”
“It’s often not that easy to tell when AIs make mistakes. Since we don’t write their rules, they come up with their own...Instead, the AIs make complex interdependent adjustments to their own internal structures.”
“A monkey writing randomly on a typewriter for an infinite amount of time will eventually produce the entire works of Shakespeare.”
AI to generate new recipes - one called for handfuls of broken glass.
AI to generate pickup lines - the title of the book.
AI to generate ice cream flavors - “Beet Bourbon” and “Praline Cheddar Swirl.”
AI shapes our online experience and determines the ads we see. AI helps with hyperpersonalization for products, music and movie recommendations.
Commercial algorithms write up hyperlocal articles about election results, sports scores, and recent home sales. The algorithm, Heliograf, developed by the Washington Post, turns sports stats into news articles. This journalism algorithm translates individual lines of a spreadsheet into sentences in a formulaic sports story; it works because it can write each sentence more or less independently.
Google Translate is a language-translating neural network.
ANNs = Artificial Neural Networks, aka cybernetics or connectionism. They're loosely modeled after the way the brain works. In the 1950s, the goal was to test theories about how the brain works. The power of a neural network lies in how its cells are connected. The human brain is a neural network made of 86B neurons.
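A minimal sketch of what "the power lies in the connections" means in code (layer sizes and weights here are arbitrary, and training is skipped entirely):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny feed-forward network: 3 input cells -> 4 hidden cells -> 1 output cell.
# Each matrix holds the strengths of the connections between two layers;
# training (not shown) would adjust these weights.
w_hidden = rng.normal(size=(3, 4))
w_output = rng.normal(size=(4, 1))

def forward(x):
    hidden = np.maximum(0, x @ w_hidden)           # a cell "fires" if its weighted sum is positive
    return 1 / (1 + np.exp(-(hidden @ w_output)))  # squash the output between 0 and 1

print(forward(np.array([0.2, -1.0, 0.5])))
```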
Markov Chains, like Recurrent Neural Networks (RNNs), look at what happened in the past and predict what's most likely to happen next. Markov Chains are used for the autocomplete function in smartphones. Android's keyboard app, Gboard, would suggest "funeral" when you typed "I'm going to my grandma's."
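A toy word-level Markov chain to show the autocomplete idea (the tiny corpus is invented; Gboard's real model is far more sophisticated):

```python
from collections import defaultdict, Counter

# Tiny invented training corpus; a real keyboard model learns from far more text.
corpus = "i am going to my grandma's house . i am going to the store . going to the movies"
words = corpus.split()

# Count, for each word, which words followed it.
next_word_counts = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    next_word_counts[current][nxt] += 1

def suggest(word):
    """Autocomplete suggestion: the most frequent word seen after `word`."""
    followers = next_word_counts.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(suggest("going"))  # -> "to"
print(suggest("the"))    # -> whichever follower was counted first among ties
```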
The Random Forest algorithm is a type of machine learning algorithm frequently used for prediction and classification. It's made of individual decision trees - flowcharts that lead to an outcome based on the information we have - and it uses trial and error to configure itself. "If all the tiny trees in the forest pool their decisions and vote on the final outcome, they will be much more accurate than any individual tree."
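A quick sketch of the trees-voting idea using scikit-learn on synthetic data (exact numbers will vary; this only illustrates that the pooled forest usually beats a single tree):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data standing in for a real prediction problem.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One flowchart-like decision tree on its own...
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# ...versus a forest of 100 trees that pool their decisions and vote.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

print("single tree accuracy:", tree.score(X_test, y_test))
print("random forest accuracy:", forest.score(X_test, y_test))
```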
Companies use AI to scan resumes and decide which candidates to interview, decide who should be approved for a loan, recognize voice commands, apply video filters, auto-tag faces in photos, and power self-driving cars. Volvo, while testing AI in Australia, discovered it was confused by kangaroos because it had never before encountered anything that hopped.
AI is making decisions about who should get parole and powering surveillance.
AI’s consistency does not mean it’s unbiased. An algorithm can be consistently unfair, especially if it learned by copying humans, as many of them do.
Deepfakes allow people to swap one person’s head and/or body for another, even in video. They have the potential for creating fake but damaging videos - like realistic yet faked videos of a politician saying something inflammatory.
AI is pointing people to more polarizing content on YouTube.
Microsoft’s image recognition product tags sheep in pictures that do not contain sheep. It tended to see sheep in landscapes that had lush green fields - whether or not the sheep were actually there. The AI had been looking at the wrong thing.
At Stanford, a team trained an AI to tell the difference between pictures of healthy skin and skin cancer. They discovered they had inadvertently trained a ruler detector instead: because the pictures of cancerous lesions tended to include a ruler for scale, the AI found it easier to look for the presence of a ruler than to learn what skin cancer looks like.
AI is analyzing medical images, counting platelets, or examining tissue samples for abnormal cells - each of these tasks is simple, consistent, and self-contained.
The Turing test (as Alan Turing proposed in the 1950s) has been a famous benchmark for the intelligence level of a computer program.
Chatbots will struggle if the topic is too broad. In August 2015, Facebook launched an AI-powered chatbot called M that was meant to make hotel reservations, book theater tickets, and recommend restaurants. Years later, Facebook found that M still needed too much human help and shut down the service in January 2018.
ANI = Artificial Narrow Intelligence
AGI = Artificial General Intelligence
GANs = Generative Adversarial Networks = a sub-variety of neural networks (introduced by Ian Goodfellow in 2014). They're 2 algorithms in one - 2 adversaries that learn by testing each other (one the generator, the other the discriminator). GANs work by combining 2 algorithms - one that generates images and one that classifies images - to reach a goal. Through trial and error, both the generator and the discriminator get better. Researchers have designed a GAN to produce abstract art, managing to straddle the line between conformity and innovation. Microsoft's Seeing AI app is designed for people with vision impairments. Artist Gregory Chatonsky used 3 ML algorithms to generate paintings for a project called It's Not Really You.
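A heavily condensed GAN training loop in PyTorch, assuming a toy 2-D "data" distribution and tiny made-up networks - just to show the generator/discriminator contest, not any of the systems mentioned above:

```python
import torch
import torch.nn as nn

# Generator: turns random noise into fake "data" (here just 2-D points).
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
# Discriminator: scores how "real" a point looks (1 = real, 0 = fake).
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

def real_batch(n=64):
    # Stand-in for real training data: points scattered around (2, 2).
    return torch.randn(n, 2) * 0.3 + 2.0

for step in range(1000):
    # 1) Train the discriminator to tell real points from the generator's fakes.
    real = real_batch()
    fake = G(torch.randn(64, 8)).detach()
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + loss_fn(D(fake), torch.zeros(64, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # 2) Train the generator to produce fakes the discriminator labels as real.
    fake = G(torch.randn(64, 8))
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()

print(G(torch.randn(5, 8)))  # samples should drift toward the real cluster near (2, 2)
```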
If you don't have all the data you need on hand, you can crowdsource a data set - Amazon Mechanical Turk pays people to generate or label data.
ML algorithms don't have context for the problems we're trying to solve; they don't know what's important and what to ignore. Google trained an algorithm called BigGAN that had no way of distinguishing an object's surroundings from the object itself.
Security expert Melissa Elliott suggested the term giraffing for the phenomenon of AI overreporting relatively rare sights.
Bias in the dataset can skew the AI's responses. Humans asking questions about an image tend to ask questions to which the answer is yes. An algorithm trained on such a biased dataset found that answering yes to any question beginning with "Do you see a..." would result in 87% accuracy.
To maximize profit from betting on horse racing, a neural network determined the best strategy was to place zero bets.
When researchers tried to evolve a robot that wouldn't run into walls, the algorithm evolved to not move at all, and thus never hit walls.
It’s really tricky to come up with a goal that the AI isn’t going to accidentally misinterpret. The programmer still has to make sure that AI has actually solved the correct problem.
Why are AIs so prone to solving the wrong problems?
1) They develop their own ways of solving problems, and
2) They lack contextual knowledge.
“It’s surprisingly common to develop a sophisticated ML algorithm that does absolutely nothing.”
Dolphin trainers learned that to get dolphins to help keep their tanks clean, they could train them to fetch trash and bring it to their keepers in exchange for fish. Some dolphins figured out the exchange rate - tearing trash into small pieces and bringing them in one at a time, earning a fish apiece.
Navigation apps, during the 2017 CA wildfires, directed cars towards neighborhoods that were on fire since there was less traffic there.
The Google Flu Trends algorithm in the early 2010s made headlines for its ability to anticipate flu outbreaks by tracking how often people searched for information on flu symptoms. It ended up vastly overestimating the number of flu cases (overfitting).
The algorithm COMPAS (sold by Northpointe) was widely used across the US to decide whether to recommend prisoners for parole, predicting whether released prisoners were likely to be arrested again. Unfortunately, the data the COMPAS algorithm learned from is the result of hundreds of years of systematic racial bias in the US justice system. In the US, black people are much more likely to be arrested for crimes than white people, even though they commit crimes at a similar rate.
Amazon discontinued its AI tool for screening job candidates after discovering it was discriminating against women. If an algorithm is trained on the way human hiring managers have selected or ranked resumes in the past, it's very likely to pick up their bias. Since humans tend to be biased, the algorithms that learn from them will also tend to be biased.
Predictive policing looks at police records and tries to predict where and when crimes will be recorded in the future. Departments then send more police to those neighborhoods, and more crime will be detected there than in a lightly policed but equally crime-ridden neighborhood, just because there are more police around. This can lead to overpolicing.
Treating a decision as impartial just because it came from an AI is known as math-washing or bias laundering. The bias is still there, because the AI copied it from its training data, but now it's wrapped in a layer of hard-to-interpret AI behavior. Companies have begun to offer bias screening as a service; one bias-checking program is Themis. One way of removing bias from an algorithm is to edit the training data until it no longer shows the bias, or to selectively leave some applications out of the training data altogether (preprocessing).
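A rough sketch of the preprocessing idea (the group labels and hiring records below are invented): rebalance the historical data so the favored group's advantage disappears before a model ever trains on it.

```python
# Invented historical hiring records: (group, hired). A real preprocessing step
# would work with full applications; this only shows the rebalancing idea.
records = [("A", 1)] * 60 + [("A", 0)] * 40 + [("B", 1)] * 30 + [("B", 0)] * 70

def hire_rate(rows, group):
    outcomes = [hired for g, hired in rows if g == group]
    return sum(outcomes) / len(outcomes)

target = min(hire_rate(records, "A"), hire_rate(records, "B"))

# "Selectively leave some applications out": drop favorable examples from the
# over-favored group until historical hire rates match, so a model trained on
# `balanced` can't simply reproduce the original skew.
balanced = list(records)
while hire_rate(balanced, "A") > target:
    balanced.remove(("A", 1))

print("before:", hire_rate(records, "A"), hire_rate(records, "B"))
print("after: ", round(hire_rate(balanced, "A"), 2), hire_rate(balanced, "B"))
```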
Hackers may design adversarial attacks that fool your AI if you don't go to the time and expense of creating your own proprietary data set; people may poison publicly available data sets, such as the crowdsourced samples of malware used to train anti-malware AI. In a similar spirit, some advertisers have put fake specks of "dust" on their banner ads, hoping people will accidentally click the ads while trying to brush them off their touch screens.
The infamous Microsoft Tay chatbot, an ML-based Twitter bot designed to learn from the users who tweeted at it, learned to spew hate speech in no time.
“In 2019, 40% of European start-ups classified in the AI category didn’t use any AI at all.”
A lot of human engineering goes into the data set. A human has to choose the subalgorithms and set them up so they can learn together.
“Practical ML ends up being a bit of a hybrid between rules-based programming, in which a human tells a computer step-by-step how to solve a problem, and open-ended ML, in which an algorithm has to figure everything out.” Sometimes the programmer researches the problem and discovers that they now understand it so well that they no longer need to use machine learning at all. We just sometimes don’t know what the best approach to a problem is. ML also needs humans for maintenance and oversight.