Tomasello lays out a very plausible and radical theory of the phylogenetic origins of language, in this highly readable and absorbing book. This question of language's origins has implications for the essential functions of our natural languages, the nature of semantic meaning, and the social character of our species. His theory can show that we are interconnected in a very deep manner that is not evident, especially in our contemporary individualistic and materialist-scientific culture. I will return to this last point, after summarizing the key points in the chapters of this book.
The basic thesis is that our natural languages evolutionarily originated in hand signing and gesture, and the possibilities of semantic meaning depend on our "common ground," or shared background knowledge and experience (based in general cultural knowledge or individual shared experiences). For example, we are out on a walk together, and I point to the sky. The semantic meaning of this gesture depends on our common ground. If we had just been talking about how good the weather is today, this gesture might express the meaning 'indeed the weather is great, look how sunny it is'. If instead we had been having an existentialist conversation about how insignificant and fragile humanity is, this gesture might instead have the meaning 'yes, these heavens and the universe are so immense, and we humans are nothing.' This explanation of semantic meaning applies to verbal, linguistic utterances, too, though the full explanation is of course much more complicated than this.
Tomasello lays out this basic account in the first chapter.
In the second chapter, Tomasello reviews scientific findings regarding communication in the great ape species. Apes are able to perform simple 'mind-reading'; they know others have intentional states and can be sensitive to those. So they are capable of intentional communication. Speakers have a sense of the recipient's state and can address them accordingly, and recipients know that the speaker's communicative motions are directed towards them specifically. The majority of their communication, including its most sophisticated forms, happens in their practices of gesturing and signing. Apes can gesture, wait for the recipient's reactions, and modulate their gestures in response. The vast majority of their gestures are used to get attention or to make demands. Vocalizations, in contrast, express only basic emotions and are purely causal consequences of being overtaken by certain emotions (e.g., fear).
Out of their gestural repertoire, certain cases of gestures to get others' attention are the most sophisticated and the likely phylogenetic antecedent for human communication. Apes can direct others' attention to objects in their environment. There is some action the speaker wants from the recipient, and the speaker draws the recipient's attention to the object to get this done; the speaker will adjust and change her gestures if the recipient doesn't respond according to her desires. This communicative move implies a capacity for symbolism; different gestures can be used to symbolize or represent some goal. In contrast, the majority of gestural communication lacks this symbolic dimension. It mostly consists in apes performing the first steps of an action sequence, and the recipient's recognition that this entire action is intended. In that case, there is no symbolization; these steps are a literal part of the action sequence.
In chapters 3-6, Tomasello turns to human communication. His thesis is that the singular, major skill that humans gained over apes, and that makes all the difference in our language, is our capacity for joint intentionality. When A and B form a joint goal, A knows that B is aware of what A knows, and of the fact that A knows what B is aware of. Joint intentionality is defined by this recursive structure, or 'back-and-forthing', of each other's intentional states. This capacity enables us to be especially sensitive to one another's background and to form greater common ground. This common ground is the shared background knowledge and experience that can serve as the communicative context, in which simple gestures and signs can gain complex meaning.
Tomasello tells an evolutionary story for how we became capable of joint intentionality. Apes are capable of collaborative activity, but it is not properly coordinative. They often hunt together as groups, but they never form action plans beforehand, and each ape's behavior is geared solely towards maximizing its individual benefit (not the benefit of the group). This comes out in the fact that when a group succeeds in catching its prey, individuals will try their best to secure as much food as they can solely for themselves; there is no motivation or cognitive capacity to distribute food evenly. Given this inequity, each ape is motivated to be the one who kills the prey first, so that it has the best chance of securing the most food for itself. This makes it impossible for groups to function in the most effective way to secure their goal; they decrease their chances as a group of catching prey.
Tomasello speculates that our evolutionary ancestors became capable of agreeing to share the spoils equally among themselves. If this equal distribution is guaranteed, then individuals in a group are freed to engage in the actions that would be most effective for the group as a whole. Mutually helpful behaviors become increasingly possible; our ancestors could develop greater motivations to help one another out, since this would benefit the group as a whole and thereby themselves.
Apes are capable only of requesting (or more like demanding) from others. Humans, in contrast, are capable of informing and sharing; we can offer advice and information to help one another, and we enjoy letting others in on the things we appreciate, so we can appreciate these things together. This makes Gricean communicative intent possible: I want you to notice that I purposefully intend to let you know about X. It is not just that I want you to know about X. This is the basis of joint intentionality. It is the starting point for us to form strong coordinative groups, social identities and cultures, and massive common ground, all of which are requisite for the rich natural language distinctive of humans.
Parts of the book I have not summarized include Tomasello's review of the empirical literature on language formation in human infants, which amounts to the ontogenetic story of language. That is the focus of chapter 4. Tomasello also gives an account of how our full-blown natural languages might arise from the basic gestural communication that he accounts for in the phylogenetic story. That happens in chapter 6.
I'd highly recommend this book to anyone interested in why humans are distinct from other organisms, and in what makes our natural language categorically different from the communicative systems of other organisms. The book is very easy to read and presents deep ideas. Its various parts fit systematically together, and the writing is rarely redundant.
I'd like to write a bit on a consequence of all of this, which Tomasello briefly raises in the conclusion. Whenever we apprehend a case of language use (e.g., as I write out this sentence, and it is intelligible to me), the intelligibility and meaning of the language is based in the common ground we share with one another. This common ground consists in all the objects and events that we have jointly attended to; we have had shared experiences of these things, recognize that we share them, and intend for each other to recognize that we each intend for each other to share them. Thus the meaning of this sentence I'm writing is founded in intersubjective experiences; it is not strictly "I" who am speaking, but the possibilities of what I say are given by the social groups or partners with which I have participated. Poetically speaking, it is humans coalesced via joint intentionality who speak through my tongue; it is not any individual person, not I or you, who says my words. I don't know what this means on the literal register yet, and look forward to thinking about it more.