A triangle-inequality-type result for iid random variables: an echo, in probability theory, of the positive semi-definiteness of f(x,y) = |x+y| - |x-y|

This blogpost discusses a beautiful way of solving the following interesting problem.

Example 1.
If X, Y are independent and identically distributed random variables taking values in the set \mathbf{R} of real numbers, then show that
\displaystyle \mathbf{E} [ | X + Y | ] \ge \mathbf{E} [ | X - Y | ] ~~~~~~~ (1).

The above result follows from the positive semi-definiteness of (x,y) \mapsto |x+y| - |x-y|. Here we discuss an alternative solution, due to professor Fedor Petrov.
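Before diving into the solution, here is a quick Monte Carlo sanity check of (1) (this is not part of the proof, and the shifted-exponential distribution below is just an arbitrary example chosen for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10**6
    # iid samples from an arbitrary, asymmetric example distribution
    X = rng.standard_exponential(n) - 0.3
    Y = rng.standard_exponential(n) - 0.3
    # empirical estimates of E[|X+Y|] and E[|X-Y|]
    print(np.abs(X + Y).mean(), np.abs(X - Y).mean())
    # the first estimate should dominate the second, in line with (1)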

Solution with comments (thanks to professor Fedor Petrov). Notice that the problem doesn’t assume the finiteness, or even the existence, of the means of X and Y. Echoing the standard practice of checking the case of infinite means first, we first dispose of the case when |X| (and hence |Y|) has infinite expected value. We leave this as an exercise for the reader.

Exercise 2.
Prove the claim in Example 1 in the case when \mathbf{E} [ |X|] (= \mathbf{E} [ |Y| ]) = + \infty.
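(One possible route, sketched: if \mathbf{E} [ |X| ] = + \infty then, for every fixed real number y, one has \mathbf{E} [ |X + y| ] \ge \mathbf{E} [ |X| ] - |y| = + \infty; conditioning on Y thus gives \mathbf{E} [ |X+Y| ] = + \infty, so (1) holds trivially.)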

So it suffices to consider the case when |X| has a finite expected value. Setting this simplification aside for the moment, let us look at the problem in a different way. The problem presents a claim about two iid random variables X and Y, which suggests that it might be reasonable to rewrite it as a claim about their common law. Letting \eta denote the common law of X and Y, we can restate the claim in Example 1 as follows.

For an arbitrary probability distribution \eta on the real numbers one has
\mathbf{E} [ | X +Y | ] \ge \mathbf{E} [ |X- Y|] whenever X and Y are independent random variables each having law \eta.

Echoing the general problem-solving strategy of finding a sufficient criterion, one may look for a claim which implies (1). A naive first guess would be that the random variable |X+Y| - |X-Y| is non-negative with probability one. But this is not true in general.

Exercise 3.
Show that there exist iid random variables X, Y such that \mathbf{P} \{ | X + Y| - |X-Y| < 0 \} > 0.
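(A hint: one cheap example is to let X and Y each take the values \pm 1 with probability 1/2. Then with probability 1/2 one has X = -Y, and on that event |X+Y| - |X-Y| = 0 - 2 = -2 < 0.)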

However we have
\displaystyle |X+X| + | X+Y| + |Y+X| + |Y+Y| \ge |X-Y| + |Y-X| \textsf{  with probability one} ~~~~~~~ (2).

Indeed, for any real numbers z_1 , z_2 one has \sum_{1 \le i , j \le 2} |z_i + z_j| \ge \sum_{1 \le i,j \le 2} |z_i - z_j|. The next natural step would be to take expectations on both sides of (2) and use the iid-ness of X, Y to get 2 \mathbf{E} [ |X| ] + \mathbf{E} [ |X+Y| ] \ge \mathbf{E} [ |X-Y| ], which is not quite what we wanted, but only slightly different from it.

Let us amplify the above argument to get a proof. The key role will be played by the following somewhat popular inequality.

Claim 4.
For any natural number n \in \mathbf{N} and arbitrary reals z_1 , \ldots , z_n one has
\displaystyle \sum_{1 \le i , j \le n} |z_i+z_j| \ge \sum_{1 \le i,j \le n} | z_i - z_j| ~~~~~~~ (3).

Applying the inequality (3) to the sample points X_1 (\omega) , \ldots , X_n (\omega) of jointly independent random variables X_1 , \ldots , X_n each having law \eta, and taking expectations on both sides, we get that
\displaystyle \mathbf{E} [ |X + Y| ] + \frac{\mathbf{E} [ 2 |X| ]}{n-1} \ge \mathbf{E} [ |X-Y|] ~~~~~~~ (4);

thanks to Exercise 2 it suffices to treat the case \mathbf{E} [ |X| ] < \infty, and then sending n \to \infty in (4) gives us the desired claim.
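For the record, here is the term-counting behind (4). Splitting both sides of (3), applied to X_1 , \ldots , X_n, into diagonal and off-diagonal terms and using that the X_i are iid with law \eta, we get
\displaystyle 2n \mathbf{E} [ |X| ] + n(n-1) \mathbf{E} [ |X+Y| ] = \sum_{1 \le i , j \le n} \mathbf{E} [ |X_i + X_j| ] \ge \sum_{1 \le i , j \le n} \mathbf{E} [ |X_i - X_j| ] = n(n-1) \mathbf{E} [ |X-Y| ] ,
and dividing throughout by n(n-1) (for n \ge 2) yields (4).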

So it suffices to prove Claim 4. A standard way is to induct on n.

Exercise 5.
Show (3) when n=2, and notice that there is nothing to prove if n = 1. Now, assuming the claim holds for every n \le m for some m \ge 2, we aim to prove it for n = m+1. To this end, replace each z_i by z_i + t for a fixed t and notice that the right-hand side of (3) remains unchanged; then optimize in t by minimizing the left-hand side of (3) and use the induction hypothesis to conclude.
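For concreteness, here is the bookkeeping behind the final reduction, sketched under the assumption that the minimizing t can be chosen so that some z_i + t vanishes (say z_{m+1} + t = 0 after relabeling). Writing w_i = z_i + t, one has
\displaystyle \sum_{1 \le i , j \le m+1} |w_i + w_j| = \sum_{1 \le i , j \le m} |w_i + w_j| + 2 \sum_{1 \le i \le m} |w_i| \quad \textsf{and} \quad \sum_{1 \le i , j \le m+1} |w_i - w_j| = \sum_{1 \le i , j \le m} |w_i - w_j| + 2 \sum_{1 \le i \le m} |w_i| ,
so the vanishing coordinate contributes equally to both sides, and the claim for n = m+1 follows from the case n = m.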

This completes the proof. \square

Remark 6.
An interesting fact is that the inequality (1) holds even for iid vector-valued random variables X and Y, but the proof is slightly more difficult. It involves a trick of writing magnitudes of vectors in terms of inner products with random unit vectors. Here is a quick summary of this trick, by professor Terence Tao.
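In brief, the trick can be sketched as follows (here u is uniformly distributed on the unit sphere of \mathbf{R}^d and \kappa_d > 0 is a constant depending only on the dimension). By rotation invariance, for every vector v \in \mathbf{R}^d one has
\displaystyle \mathbf{E}_u [ | \langle v , u \rangle | ] = \kappa_d |v| ,
so that \mathbf{E} [ |X+Y| ] - \mathbf{E} [ |X-Y| ] = \kappa_d^{-1} \mathbf{E}_u [ \mathbf{E} [ | \langle X , u \rangle + \langle Y , u \rangle | ] - \mathbf{E} [ | \langle X , u \rangle - \langle Y , u \rangle | ] ] \ge 0, because for each fixed u the scalar random variables \langle X , u \rangle and \langle Y , u \rangle are iid and the one-dimensional case applies.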


A retrospective analysis of the above solution shows us that, in some sense, we removed probability from the picture (in the sense of finding a deterministic problem which implies the original claim made for random variables). This is an example of a more general strategy of de-probabilizing a problem. We shall discuss this matter further in a later blogpost.

Also, the fact that we are using the Euclidean norm plays a crucial role in Example 1. For more details, see this post by professor Russ Lyons. This point will be discussed in more detail in a later blogpost.

#de-probabilization, #easy-beautiful-proofs, #euclidean-norm, #fedor-petrov, #inequalities, #mathoverflow, #russ-lyons, #terence-tao

Convergence of means from almost sure convergence in a superlinear bounded-mean setting

The long title of the post might already give some idea of its main focus. We are going to prove the following.

Proposition 1
Let (X_n)_{n =1}^{\infty} be a sequence of real (or complex) valued random variables such that there is some increasing, continuous function f : \mathbf{R} \to \mathbf{R} with \lim_{x \to \infty} \frac{f(x)}{x} = \infty (i.e., f grows strictly faster than linearly) and \mathbf{E} [ | f( |X_n | ) | ] < M for some absolute constant M \in \mathbf{R} and for all n \in \mathbf{N}.
Then, if X_n converges almost surely to X_{\infty} as n \to \infty, \mathbf{E} [ X_n ] also converges to \mathbf{E} [ X_{\infty} ] as n \to \infty.

Before we prove Proposition 1, we would like to remark that, in general, it is not true that almost sure convergence implies convergence of means.

Example 2
Let \Omega = [0,1] (equipped with the Lebesgue measure), and for each natural number n \in \mathbf{N} define the random variable X_n by setting
X_n ( \omega ) = n \mathbf{1}_{ \omega \in (0, 1/n] }.
Then clearly X_n ( \omega ) converges to 0 for every \omega \in [0,1], but still \mathbf{E} [ X_n ] = n \cdot \mathbf{P} \{ \omega \in (0, 1/n] \} = 1 for all n \in \mathbf{N}, while \mathbf{E} [ 0 ] = 0.

In the example above, the sequence (X_n) is bounded in mean; in fact, for any function f of at most linear growth we have \sup _n \mathbf{E} [ |f(X_n)| ] < \infty. But clearly, there is no function f of strictly superlinear growth for which the sequence ( f ( X_n) )_{n =1}^{\infty} has absolutely bounded means (see Exercise 3).

Exercise 3
If X_n are the random variables as defined in Example 2, then show that there is no function f : \mathbf{R} \to \mathbf{R} with \lim_{x \to \infty} f(x)/x = \infty for which one has \sup_n \mathbf{E} [ f(X_n) ] < \infty.
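(A hint: for the X_n of Example 2, one has
\displaystyle \mathbf{E} [ f(X_n) ] = \frac{f(n)}{n} + \Big( 1 - \frac{1}{n} \Big) f(0) ,
and f(n)/n \to \infty by assumption.)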

In this regard, Proposition 1 is quite elegant, as it allows us to conclude convergence of means from a condition only slightly stronger than linear boundedness of means. This works because boundedness of means in the superlinear setting actually implies the uniform integrability of ( |X_n| )_{n =1}^{\infty}, which, combined with the bounded convergence theorem, is more than enough to conclude the claim. I had a much longer proof of this claim, which is essentially equivalent to the following slicker proof by professor Yuval Peres.

Proof of Proposition 1. Thanks to the bounded convergence theorem, it suffices to show that \sup_n \mathbf{E} [ |X_n| \mathbf{1}_{|X_n| > m } ] \to 0 as m \to \infty.
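For completeness, here is the standard truncation bookkeeping behind this reduction, sketched for real-valued X_n (with \varphi_m denoting the clipping map x \mapsto \max ( -m , \min ( x , m ) ), a helper introduced just for this sketch). Since |x - \varphi_m (x)| \le |x| \mathbf{1}_{|x| > m} for every real x, one has
\displaystyle | \mathbf{E} [ X_n ] - \mathbf{E} [ X_{\infty} ] | \le \mathbf{E} [ |X_n| \mathbf{1}_{|X_n| > m} ] + | \mathbf{E} [ \varphi_m ( X_n ) ] - \mathbf{E} [ \varphi_m ( X_{\infty} ) ] | + \mathbf{E} [ |X_{\infty}| \mathbf{1}_{|X_{\infty}| > m} ] ;
for each fixed m the middle term tends to 0 as n \to \infty by the bounded convergence theorem, the first term is small uniformly in n once m is large, and the last term is handled similarly since Fatou’s lemma shows that X_{\infty} inherits the superlinear moment bound.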

To begin, notice that there is some N \in \mathbf{N} such that f(y) > 0 for all y > N. For each m > N define \tau_m = \sup_{y > m} \frac{y}{f(y)}.
Then notice that, for all natural numbers n and any m > N, one has

\displaystyle |X_n| \mathbf{1}_{|X_n| > m} \le f(|X_n|) \mathbf{1}_{|X_n| > m} \, \tau_m \le | f(|X_n|) | \, \tau_m ;

taking expectations on both sides of the above display gives \mathbf{E} [ |X_n| \mathbf{1}_{|X_n| > m} ] \le M \tau_m, and sending m \to \infty gives us the uniform integrability of (|X_n|)_{n = 1}^{\infty}, since \tau_m \to 0 as m \to \infty; this proves the claim. \square
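To see Proposition 1 in action, consider the following modification of Example 2 (an example concocted here for illustration, with f(x) = x^2): on \Omega = [0,1] set Y_n = \sqrt{n} \, \mathbf{1}_{ \omega \in (0, 1/n] }. Then \mathbf{E} [ Y_n^2 ] = n \cdot \frac{1}{n} = 1 for all n, so the hypothesis of Proposition 1 holds, and indeed \mathbf{E} [ Y_n ] = \frac{1}{\sqrt{n}} \to 0 = \mathbf{E} [ 0 ].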

In the special case when f is the function f(x) = x^{1 + \delta}, one obtains the very useful Theorem 25 from this blogpost by professor Terence Tao. In fact, Proposition 1 above was motivated by the remark following Theorem 25 on professor Tao’s blogpost. One classic application of Theorem 25 from professor Tao’s blogpost (together with Kolmogorov’s 0-1 law) is demonstrated in the proof of the opening proposition here. In a later blogpost I shall discuss consequences of Proposition 1 which are of a similar flavour.

#convergence-of-random-variables, #easy-beautiful-proofs, #terence-tao, #yuval-peres

Illustrations from the standard combinatorial toolbox

Recently I finished updating the document on standard techniques one often finds useful in combinatorics, which I wrote long ago and promised to update regularly. Unfortunately, for a while I drifted away from combinatorics and paused working on the document. It is still not complete; I feel that I have more to discuss in it, so it will see one or a few more updates, hopefully very soon.

The most recent draft can be found below, in the attached file.

#easy-beautiful-proofs, #essays-in-combinatorics

What does “Combinatorics” Mean?

A Point of View.

Here is a problem from the St. Petersburg Mathematical Olympiad. Like most Russian problems, the calculations are minimal and a trick is used. It is a problem I wrote about on the AoPS forum (see [2]), and it was labeled as combinatorics. But is it really combinatorics? In my opinion, it could be classified as number theory or inequalities, or even algebra, because it uses elements of those subjects. It reminded me of a funny classification I once came across regarding what combinatorics means. Here it is.

In Olympiad mathematics the topics are defined in the following way.

  • Routine functional equations and inequalities that use at least once Cauchy-Schwarz, some kind of AM-GM, Jensen, Karamata, etc. are called algebra.
  • Synthetic two-dimensional Euclidean geometry is called geometry.
  • Fermat’s little theorem, Euler’s totient function and modular arithmetic are called number theory.
  • Everything else is called …combinatorics!!

Maybe this isn’t a joke!


Mathematics at Indiana University Bloomington

As some of you may know, I am going to begin my term as a PhD student in the department of mathematics at Indiana University Bloomington. I am very excited about this, especially because my interim advisor there is professor Russell Lyons, and the kind of problems he works on is exactly what has piqued my interest recently. It also feels great to know professor Russ, because apparently our wavelengths match quite well.

As most of you know, I often post notes from the courses I take on my blog. I will be doing the same here; in fact, I plan to scribe all the mathematics and related lectures I attend at Indiana University and post them on my IU blog.

That being said, it doesn’t mean that I will discontinue posting updates on my mathematical findings on this blog. As always, I am pretty confident that I will still engage in a significant amount of self-study (if anything, it will be even more necessary and relevant now), and updates on all of that will be posted here.

#iu-bloomington