This section provides a more advanced treatment of probability theory, building on the basics covered in the main Probability Theory page. It includes topics such as measure-theoretic foundations, advanced probability distributions, and limit theorems.

Infinite Probability Spaces

Imagine a probability space where the sample space is infinite, such as:

  • choosing a random real number between 0 and 1
  • tossing a fair coin infinitely many times

We can still define the set of outcomes as $\Omega$ and events as subsets of $\Omega$.

But in each of these cases there are infinitely many possible outcomes, and each individual outcome has probability zero. If we tried to define probability only on individual outcomes and add those values up, every event would end up with probability zero, and the total probability would not sum to 1.

We need a definition of probability on sets of outcomes (events) that can handle infinite sample spaces. This is where $\sigma$-algebras and measure theory come in.

$\sigma$-Algebras/$\sigma$-Fields Definition

Let $\Omega$ be a non-empty set (the sample space), and let $\mathcal{F}$ be a collection of subsets of $\Omega$. We say that $\mathcal{F}$ is a $\sigma$-algebra (or $\sigma$-field) on $\Omega$ if it satisfies the following properties:

  1. the empty set is in $\mathcal{F}$: $\emptyset \in \mathcal{F}$
  2. if $A$ is in $\mathcal{F}$, then its complement is also in $\mathcal{F}$: if $A \in \mathcal{F}$, then $A^c = \Omega \setminus A \in \mathcal{F}$
  3. if $A_1, A_2, A_3, \ldots$ is a countable collection of sets in $\mathcal{F}$, then their union is also in $\mathcal{F}$: if $A_i \in \mathcal{F}$ for all $i \in \mathbb{N}$, then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$

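In short, a $\sigma$-field is a collection of subsets of $\Omega$ that contains the empty set and is closed under complementation and countable unions (and hence, by De Morgan's laws, under countable intersections). The collection itself need not be countable. These are the sets on which we can define a probability measure.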
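On a finite sample space the three properties can be checked mechanically, because countable unions reduce to finite ones. The following is a minimal sketch under that assumption; the helper names `generate_sigma_algebra` and `is_sigma_algebra` are hypothetical and introduced only for illustration.

```python
from itertools import combinations


def generate_sigma_algebra(omega, generators):
    """Close `generators` under complement and pairwise union.

    Assumes `omega` is finite, so closure under finite unions already
    gives closure under countable unions.
    """
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:
            complement = omega - a
            if complement not in sets:
                sets.add(complement)
                changed = True
        for a, b in combinations(current, 2):
            union = a | b
            if union not in sets:
                sets.add(union)
                changed = True
    return sets


def is_sigma_algebra(omega, family):
    """Check the three defining properties on a finite sample space."""
    omega = frozenset(omega)
    family = {frozenset(s) for s in family}
    if frozenset() not in family:
        return False
    if any(omega - a not in family for a in family):
        return False
    return all(a | b in family for a in family for b in family)


omega = {1, 2, 3, 4}
F = generate_sigma_algebra(omega, [{1}, {2}])
print(sorted(sorted(s) for s in F))
print(is_sigma_algebra(omega, F))                     # closed under all three properties
print(is_sigma_algebra(omega, [set(), {1}, omega]))   # fails: complement of {1} is missing
```

Here the generated family has eight sets, built from the three "atoms" $\{1\}$, $\{2\}$, and $\{3, 4\}$; the outcomes 3 and 4 cannot be told apart by any event in it.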

Probability Measure Definition

Let $(\Omega, \mathcal{F})$ be a measurable space, where $\Omega$ is the sample space and $\mathcal{F}$ is a $\sigma$-algebra on $\Omega$. A probability measure $\mathbb{P}$ is a function $\mathbb{P}: \mathcal{F} \to [0, 1]$ that satisfies the following properties:

  1. $\mathbb{P}(\Omega) = 1$ (the probability of the entire sample space is 1)
  2. countable additivity: if $A_1, A_2, A_3, \ldots$ is a countable collection of disjoint sets in $\mathcal{F}$ (i.e., $A_i \cap A_j = \emptyset$ for all $i \neq j$), then $\mathbb{P}\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mathbb{P}(A_i)$

If $\Omega$ is finite and $\mathcal{F}$ is the collection of all subsets of $\Omega$, this definition reduces to the basic definition of probability on finite sample spaces. The same definition also works for infinite sample spaces.
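For a finite sample space, both axioms can be verified directly. The sketch below assumes a fair six-sided die with every subset of $\Omega$ treated as an event; the names `weights` and `prob` are illustrative choices, not part of the definition.

```python
from fractions import Fraction

# Sample space of a fair die; each outcome gets weight 1/6 (an assumption
# made for this example).
omega = frozenset(range(1, 7))
weights = {w: Fraction(1, 6) for w in omega}


def prob(event):
    """P(event) as the exact sum of the weights of its outcomes."""
    return sum((weights[w] for w in event), Fraction(0))


evens = frozenset({2, 4, 6})
one = frozenset({1})

total = prob(omega)                 # axiom 1: P(Omega) = 1
lhs = prob(evens | one)             # P of the disjoint union
rhs = prob(evens) + prob(one)       # sum of the probabilities

print(total, lhs, rhs)
```

Exact fractions are used instead of floats so that the additivity check is an equality rather than an approximation.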

Almost Surely

We noted that in infinite sample spaces, individual outcomes can have probability zero. What about the complement of such an outcome? Since $\mathbb{P}(A^c) = 1 - \mathbb{P}(A)$, the complement of any event with probability zero has probability one, even though it is not the whole sample space. This leads to the concept of “almost surely” (a.s.) or “with probability one”.

For example, consider an infinite sequence of fair coin tosses:

  • Event $A$, all heads: $\mathbb{P}(A) = 0$
  • Event $A^c$, not all heads (that is, at least one tail): $\mathbb{P}(A^c) = 1$

In this case, we say that $A^c$ occurs “almost surely”.
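To see why $\mathbb{P}(A) = 0$ in this example, here is a short sketch of the computation. The events $A_n$ are notation introduced here for illustration, and the limit step relies on continuity of probability measures along decreasing events, a standard consequence of countable additivity that is not proved on this page. Let $A_n$ be the event that the first $n$ tosses are all heads, so that $A_1 \supseteq A_2 \supseteq \cdots$ and $A = \bigcap_{n=1}^{\infty} A_n$. Then:

\[\mathbb{P}(A) = \lim_{n \to \infty} \mathbb{P}(A_n) = \lim_{n \to \infty} 2^{-n} = 0, \qquad \mathbb{P}(A^c) = 1 - \mathbb{P}(A) = 1\]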

Definition:

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. An event $A \in \mathcal{F}$ is said to occur almost surely (a.s.) if $\mathbb{P}(A) = 1$. Equivalently, an event $A$ occurs almost surely if its complement $A^c$ has probability zero, i.e., $\mathbb{P}(A^c) = 0$.

Borel $\sigma$-Algebra

An important example of a $\sigma$-algebra is the Borel $\sigma$-algebra on the real line, denoted by $\mathcal{B}(\mathbb{R})$. It is the smallest $\sigma$-algebra containing all open intervals in $\mathbb{R}$; in particular, it contains every set that can be formed from open intervals by countable unions, countable intersections, and complements.

Just as we need $\sigma$-fields to define probabilities, we need the Borel $\sigma$-algebra to define random variables. A random variable is a measurable function from a probability space to $\mathbb{R}$, and the Borel $\sigma$-algebra ensures that it is well-defined: any statement about its values can be traced back to an event in the original probability space.

Random Variables Definition

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable $X$ is a function $X: \Omega \to \mathbb{R}$ that is measurable, meaning that for every Borel set $B \in \mathcal{B}(\mathbb{R})$, the preimage $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\}$ is in $\mathcal{F}$.

In other words, the preimage of any Borel set under $X$, written $X^{-1}(B)$, must be an event (a set in the $\sigma$-field $\mathcal{F}$) in the original probability space. Here $X^{-1}$ denotes the preimage, not an inverse function; $X$ does not need to be invertible.
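The measurability condition can be made concrete on a small finite example. The sketch below assumes a four-point sample space with a coarse $\sigma$-algebra, and uses the fact that for a function taking finitely many values it is enough to check preimages of single values, since every other preimage is a finite union of those. The names `preimage` and `is_measurable` are hypothetical helpers.

```python
def preimage(X, omega, values):
    """Return X^{-1}(values) = {w in omega : X(w) in values}."""
    return frozenset(w for w in omega if X(w) in values)


def is_measurable(X, omega, family):
    """Check that the preimage of every attained value is an event in `family`."""
    family = {frozenset(s) for s in family}
    attained = {X(w) for w in omega}
    return all(preimage(X, omega, {x}) in family for x in attained)


omega = {1, 2, 3, 4}
# A coarse sigma-algebra that cannot distinguish 1 from 2, or 3 from 4.
F = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]


def coarse(w):
    # Constant on {1, 2} and on {3, 4}.
    return 0 if w <= 2 else 1


def identity(w):
    # Distinguishes every outcome.
    return w


print(is_measurable(coarse, omega, F))    # True
print(is_measurable(identity, omega, F))  # False: {1} is not an event in F
```

The second function fails because asking "is the value equal to 1?" corresponds to the set $\{1\}$, which has no probability assigned under this $\mathcal{F}$.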

Distribution Function Definition

Let $X$ be a random variable defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The distribution measure $\mu_X$ induced by $X$ is defined on the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R})$ as follows:

\[\mu_X(B) = \mathbb{P}(X^{-1}(B)) = \mathbb{P}(\{\omega \in \Omega : X(\omega) \in B\})\]
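As an illustration, the distribution measure can be computed by brute force on a finite space. This sketch assumes two fair coin tosses with $X$ counting heads, and represents a Borel set $B$ by a membership predicate `in_B` (a modelling choice made here, not something the definition requires). It also evaluates the cumulative distribution function $F_X(x) = \mu_X((-\infty, x])$, which is the usual meaning of "distribution function".

```python
from fractions import Fraction
from itertools import product

# Two fair coin tosses; each of the four outcomes has probability 1/4.
omega = list(product("HT", repeat=2))
P = {w: Fraction(1, 4) for w in omega}


def X(w):
    """Number of heads in the outcome w."""
    return w.count("H")


def mu_X(in_B):
    """mu_X(B) = P({w : X(w) in B}), with B given as a predicate on values."""
    return sum((P[w] for w in omega if in_B(X(w))), Fraction(0))


def cdf(x):
    """F_X(x) = mu_X((-inf, x])."""
    return mu_X(lambda v: v <= x)


print(mu_X(lambda v: v == 1))   # 1/2
print(mu_X(lambda v: v >= 1))   # 3/4
print(cdf(1))                   # 3/4
```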

Expectations

One of the most important quantities associated with a random variable is its expectation (also called its expected value or mean).

Let $X$ be a random variable defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The expectation of $X$, denoted by $\mathbb{E}[X]$, is defined as:

\[\mathbb{E}[X] = \int_{\Omega} X(\omega) \, d\mathbb{P}(\omega)\]

On a finite or countably infinite sample space, this reduces to the familiar sum (assuming, in the countably infinite case, that the sum converges absolutely):

\[\mathbb{E}[X] = \sum_{\omega \in \Omega} X(\omega) \mathbb{P}(\{\omega\})\]
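The sum can be evaluated directly on a small example. The sketch below assumes three fair coin tosses and takes $X$ to be the number of heads; the helper name `expectation` is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# Three fair coin tosses: eight equally likely outcomes.
omega = list(product("HT", repeat=3))
P = {w: Fraction(1, len(omega)) for w in omega}


def expectation(X):
    """E[X] as the sum over outcomes of X(w) * P({w})."""
    return sum((X(w) * P[w] for w in omega), Fraction(0))


def num_heads(w):
    return w.count("H")


mean_heads = expectation(num_heads)
print(mean_heads)  # 3/2
```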

We skip the construction of the Lebesgue integral and the related proofs here. The key takeaway is that the expectation is defined as an integral with respect to the probability measure $\mathbb{P}$.

Properties of Expectation

(1) Integrability

A random variable $X$ is said to be integrable if $\mathbb{E}[|X|] < \infty$. This ensures that the expectation $\mathbb{E}[X]$ is well-defined and finite.

(2) Comparison

If $X$ and $Y$ are integrable and $X \leq Y$ almost surely, then $\mathbb{E}[X] \leq \mathbb{E}[Y]$.

(3) Linearity

For any integrable random variables $X$ and $Y$, and constants $a, b \in \mathbb{R}$:

\[\mathbb{E}[aX + bY] = a\mathbb{E}[X] + b\mathbb{E}[Y]\]
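A quick exact check of linearity on a finite space is sketched below. It assumes a fair six-sided die with $X(\omega) = \omega$ and $Y(\omega) = \omega^2$, and the constants $a = 2$, $b = -3$; all of these are arbitrary choices for illustration. Note that $X$ and $Y$ are not independent here, which linearity does not require.

```python
from fractions import Fraction

omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}


def expectation(Z):
    """E[Z] as the sum over outcomes of Z(w) * P({w})."""
    return sum((Z(w) * P[w] for w in omega), Fraction(0))


def X(w):
    return w


def Y(w):
    return w * w


a, b = 2, -3
lhs = expectation(lambda w: a * X(w) + b * Y(w))   # E[aX + bY]
rhs = a * expectation(X) + b * expectation(Y)      # a E[X] + b E[Y]
print(lhs, rhs)
```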

(4) Jensen’s Inequality

If $\phi: \mathbb{R} \to \mathbb{R}$ is a convex function and $X$ is an integrable random variable, then:

\[\phi(\mathbb{E}[X]) \leq \mathbb{E}[\phi(X)]\]
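Jensen's inequality can be checked exactly on a small example. The sketch below assumes a fair six-sided die and the convex function $\phi(x) = x^2$, both chosen only for illustration. For this particular $\phi$, the gap between the two sides is the variance of $X$.

```python
from fractions import Fraction

omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}


def expectation(Z):
    """E[Z] as the sum over outcomes of Z(w) * P({w})."""
    return sum((Z(w) * P[w] for w in omega), Fraction(0))


def X(w):
    return w


def phi(x):
    # A convex function.
    return x * x


lhs = phi(expectation(X))                   # phi(E[X])
rhs = expectation(lambda w: phi(X(w)))      # E[phi(X)]
gap = rhs - lhs                             # equals Var(X) when phi(x) = x^2
print(lhs, rhs, gap)
```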
