M’s Reviews > Elements of Information Theory 2nd Edition > Status Update
M is on page 109 of 784
Skipped the problems because reading on a bus is maybe not the best environment for deep thought. So instead I started the next chapter, on data compression, and got to the Kraft inequality. I remember learning about it as a side remark in my introductory TCS class in undergrad. Didn't remember the proof being so clean.
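To make it concrete for myself: for a binary prefix code with codeword lengths l_1, ..., l_m, the Kraft inequality says sum_i 2^{-l_i} <= 1. A tiny sanity check in Python (the codewords are a made-up example of mine, not from the book):

```python
# Kraft inequality sanity check: for a binary prefix code with codeword
# lengths l_1, ..., l_m, the Kraft sum  sum_i 2^{-l_i}  is at most 1.
# The codewords below are a made-up example, not taken from the book.

def is_prefix_free(codewords):
    """Return True if no codeword is a proper prefix of another."""
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

def kraft_sum(codewords):
    """Compute sum_i 2^{-len(c_i)} for binary codewords."""
    return sum(2 ** -len(c) for c in codewords)

code = ["0", "10", "110", "111"]   # a complete prefix code
assert is_prefix_free(code)
print(kraft_sum(code))             # 1.0, the inequality is tight

truncated = ["0", "10", "110"]     # drop one leaf of the code tree
print(kraft_sum(truncated))        # 0.875 < 1
```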
— Sep 16, 2025 10:17PM
M’s Previous Updates
M is on page 88 of 784
Finished the chapter on the entropy rates of stochastic processes. Very fun and light intro, though I'm kinda struggling to see what I'm missing with the first exercise.
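Noting down the chapter's main formula so I don't forget it: for a stationary Markov chain with transition matrix P and stationary distribution mu, the entropy rate is sum_i mu_i * H(row i of P), the stationary-weighted entropy of the rows. A rough Python sketch (the two-state chain and its flip probabilities are numbers I picked, not the book's):

```python
import numpy as np

# Entropy rate of a stationary Markov chain: H = sum_i mu_i * H(P[i, :]),
# where mu is the stationary distribution of the transition matrix P.
# The two-state chain below uses made-up flip probabilities.

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalised to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    mu = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    return mu / mu.sum()

def entropy_rate(P):
    mu = stationary_distribution(P)
    return sum(mu[i] * entropy(P[i]) for i in range(len(mu)))

alpha, beta = 0.1, 0.3                 # flip probabilities of the two states
P = np.array([[1 - alpha, alpha],
              [beta, 1 - beta]])
print(entropy_rate(P))                 # equals (beta*H(alpha) + alpha*H(beta)) / (alpha + beta)
```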
— Sep 10, 2025 07:20PM
M is on page 78 of 784
Pretty good. The AEP chapter was nice and short. I like that the book motivated the entropy rate of a stationary stochastic process too.
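The AEP statement itself is easy to poke at numerically: for X_1, ..., X_n i.i.d. ~ p(x), -(1/n) log p(X_1, ..., X_n) converges to H(X) in probability. A quick sketch (the distribution and sample sizes are arbitrary choices of mine):

```python
import numpy as np

# AEP sanity check: for X_1..X_n i.i.d. ~ p, the per-symbol quantity
# -(1/n) * log2 p(X_1, ..., X_n) concentrates around the entropy H(p).
# Distribution and sample sizes are arbitrary choices for illustration.

rng = np.random.default_rng(0)
p = np.array([0.5, 0.25, 0.125, 0.125])
H = float(-(p * np.log2(p)).sum())            # 1.75 bits

for n in (10, 100, 10_000):
    x = rng.choice(len(p), size=n, p=p)       # sample n symbols from p
    per_symbol = -np.log2(p[x]).sum() / n     # -(1/n) log2 p(x_1, ..., x_n)
    print(n, round(per_symbol, 3), "vs H =", H)
```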
— Sep 04, 2025 10:04PM
M is on page 43 of 784
Finished Chapter 2 up to the summary. Good so far, though I don't have a good intuition yet for some of the equalities, such as the chain rule for mutual information. - 23:16, 18.08.2025
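Writing the chain rule for mutual information out here so I can come back to it; as far as I can tell it is just the chain rule for entropy applied to both terms of I(X_1, ..., X_n; Y):

```latex
% Chain rule for mutual information, obtained from the chain rule for entropy.
\begin{align*}
I(X_1,\dots,X_n; Y)
  &= H(X_1,\dots,X_n) - H(X_1,\dots,X_n \mid Y) \\
  &= \sum_{i=1}^{n} H(X_i \mid X_{i-1},\dots,X_1)
     - \sum_{i=1}^{n} H(X_i \mid X_{i-1},\dots,X_1, Y) \\
  &= \sum_{i=1}^{n} I(X_i; Y \mid X_{i-1},\dots,X_1).
\end{align*}
```

So each term is the extra information X_i carries about Y once the earlier X's are already known, which is maybe the intuition I was missing.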
— Aug 18, 2025 08:16PM
M is on page 38 of 784
Got stuck in the proof of Theorem 2.10.1 (Fano's inequality). I don't have a good feel for how the probability Pr(X \neq \tilde{X}) is random, and the inequalities in (2.135) don't make sense to me yet.
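Parking the standard argument here as I understand it (not sure it lines up with the book's equation numbering): P_e = Pr(X \neq \tilde{X}) itself is just a number; the random object is the error indicator E = 1{X \neq \tilde{X}}, and the proof expands H(E, X | \tilde{X}) in two ways:

```latex
% Fano's inequality via the error indicator E = 1{X \neq \tilde{X}},
% with P_e = Pr(X \neq \tilde{X}). Expand H(E, X | \tilde{X}) two ways.
\begin{align*}
H(E, X \mid \tilde{X})
  &= H(X \mid \tilde{X}) + H(E \mid X, \tilde{X}) = H(X \mid \tilde{X})
     && \text{($E$ is a function of $X$ and $\tilde{X}$)} \\
H(E, X \mid \tilde{X})
  &= H(E \mid \tilde{X}) + H(X \mid E, \tilde{X})
     \le H(P_e) + P_e \log\bigl(|\mathcal{X}| - 1\bigr)
     && \text{(given $E=1$, $X$ takes at most $|\mathcal{X}|-1$ values)} \\
\Rightarrow \quad
H(X \mid \tilde{X}) &\le H(P_e) + P_e \log\bigl(|\mathcal{X}| - 1\bigr).
\end{align*}
```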
— Aug 11, 2025 08:08PM
M is on page 30 of 784
Read up to Section 2.7.
Finally got a feeling for what mutual information and the KL divergence are. I especially liked the coding interpretation for a random variable X: the KL divergence is the inefficiency, the additional bits needed when one uses an encoding that assumes X follows q(x) while the true probability mass function is actually p(x) (small numeric check below).
The Venn diagram with H(X | Y), H(Y | X), and I(X; Y) was helpful too.
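That interpretation is easy to verify with a toy example (the two distributions below are arbitrary picks of mine): coding with the ideal lengths log2(1/q(x)) while the symbols are actually drawn from p(x) costs H(p) + D(p||q) bits per symbol on average.

```python
import numpy as np

# Mismatched-coding view of KL divergence: if the code is designed for q
# (ideal lengths log2(1/q(x)), ignoring integer rounding) but symbols are
# actually drawn from p, the expected length is H(p) + D(p||q).
# The two distributions are arbitrary examples.

p = np.array([0.5, 0.25, 0.25])    # true distribution of X
q = np.array([0.25, 0.25, 0.5])    # distribution the code was designed for

H_p = float(-(p * np.log2(p)).sum())            # entropy H(p)
D_pq = float((p * np.log2(p / q)).sum())        # KL divergence D(p || q)
cross_entropy = float(-(p * np.log2(q)).sum())  # expected length of the mismatched code

print(H_p, D_pq, cross_entropy)                 # 1.5, 0.25, 1.75
assert np.isclose(cross_entropy, H_p + D_pq)
```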
— Aug 10, 2025 08:21PM

