Probability and Discrete Random Variables

1. Applying Set Theory to Probability

  • Probability is a numerical value assigned to a set, indicating the likelihood of an event. A higher value corresponds to a greater probability. In this way, probability can be seen as a measurement, much like weight or temperature.
  • Fortunately for engineers, the language of probability (including the term probability itself) naturally relates to our everyday experiences. The fundamental concept is a repeatable experiment that consists of a procedure and observations. There is inherent uncertainty in the outcome. Here are some examples:
    1. Flip a coin. Does it land heads up or tails up?
    2. Walk to a bus stop. How long do you wait for the bus to arrive?
    3. Roll a die. Which number appears on the top face?
    4. Measure the temperature at noon each day. What is the temperature?

  • As mentioned earlier, an experiment involves both a procedure and observations.
  • We will describe models of experiments using a set of possible experimental outcomes. In the context of probability, we assign a precise meaning to the term “outcome.”
    • Definition 0.1 (Outcome). An outcome of an experiment is any possible observation of that experiment.
    • Definition 0.2 (Sample Space). The sample space of an experiment is the finest-grain, mutually exclusive, collectively exhaustive set of all possible outcomes.
      • The finest-grain property ensures that all distinguishable outcomes are listed separately.
      • Mutual exclusivity means that if one outcome occurs, no other can occur at the same time.
      • Collective exhaustiveness requires that every possible outcome is included in the sample space.
  • Example 0.3. Flip a coin and let it land on a table (this is the procedure) and then observe which side (head or tail) faces you after the coin lands (this is the observation).
  • The sample space is $S = \{h, t\}$, where $h$ is the outcome “observe head,” and $t$ is the outcome “observe tail.”

  • In everyday language, an event is something that happens. In the context of an experiment, an event occurs when a specific outcome is observed.
  • Mathematically, an event is defined by the set of outcomes in the sample space where the phenomenon occurs. For each outcome, either the event happens or it does not.
    • Definition 0.4 (Event). An event is a set of outcomes of an experiment.
| Set Algebra   | Probability  |
|---------------|--------------|
| Set           | Event        |
| Universal set | Sample space |
| Element       | Outcome      |

2. Probability Axioms

  • Our model of an experiment includes a procedure and observations, which can be represented using set theory. This involves a sample space S (the universal set), outcomes s (elements of S), and events A (sets of elements). To complete the model, we assign a probability $P [A]$ to each event A in the sample space. Probability, in this context, is the relative frequency of an event occurring in a large number of experimental trials. Mathematically, this is defined by the following axioms.

  • Definition 0.1 (Axioms of Probability). A probability measure $P [·]$ is a function that maps events in the sample space to real numbers such that:
    • Axiom 1: For any event $A$, $P[A]\geq0$.
    • Axiom 2: $P [S] = 1$.
    • Axiom 3: For any countable collection $A_1, A_2, \dots$ of mutually exclusive events,
    \[P[A_1 \cup A_2 \cup \cdots] = P[A_1] + P[A_2] + \cdots\]
  • Theorem 0.2. If $A = A_1 \cup A_2 \cup \cdots \cup A_m$ and $A_i \cap A_j = \varnothing$ for $i \not= j$, then
\[P[A]=\sum_{i=1}^{m}P[A_i]\]
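  • To connect the axioms with the relative-frequency idea above, here is a minimal Python sketch (the die-rolling experiment and the trial count are illustrative assumptions, not part of the notes):

```python
import random

# Relative-frequency interpretation: estimate P[A] as the fraction of
# trials in which A occurs. Experiment: roll a fair six-sided die.
random.seed(1)
trials = 100_000
rolls = [random.randint(1, 6) for _ in range(trials)]

p1 = rolls.count(1) / trials                      # relative frequency of {1}
p2 = rolls.count(2) / trials                      # relative frequency of {2}
p12 = sum(r in (1, 2) for r in rolls) / trials    # relative frequency of {1, 2}

# Axiom 3 for the disjoint events {1} and {2}: P[{1, 2}] = P[{1}] + P[{2}].
print(p12, p1 + p2)   # both close to 1/3
```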

3. Consequence of the Axioms

  • Theorem 0.1. The probability measure $P [·]$ satisfies:
    1. $P[A^c]=1-P[A]$
    2. For any $A$ and $B$ (not necessarily disjoint),
    \[P [A \cup B] = P [A] + P [B]-P[A \cap B]\]
    3. If $A \subset B$, then $P [A] \leq P [B].$
  • Theorem 0.2. For any event $A$ and event space $\{B_1, B_2, \dots, B_m\}$,
\[P[A]=\sum_{i=1}^{m}P[A \cap B_i]\]
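  • Both theorems can be checked exactly by enumeration. A minimal Python sketch, assuming a fair six-sided die and two illustrative events:

```python
from fractions import Fraction

# Exact check of Theorems 0.1 and 0.2 on a fair six-sided die
# (the events A and B are illustrative choices, not from the notes).
S = frozenset({1, 2, 3, 4, 5, 6})

def P(E):
    return Fraction(len(E), len(S))   # equiprobable outcomes

A = {2, 4, 6}        # "roll is even"
B = {1, 2, 3, 4}     # "roll is at most 4"

assert P(S - A) == 1 - P(A)                  # complement rule
assert P(A | B) == P(A) + P(B) - P(A & B)    # union rule
assert P(A) == P(A & B) + P(A & (S - B))     # total probability over {B, B^c}
print("all three identities hold")
```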

4. Conditional Probability

  • Conditional probability $P [A\mid B]$ describes the likelihood of event A given that event B has occurred. This notation, read as “the probability of A given B,” provides insight into how probabilities can be used when only partial information is available.
  • Definition 0.2 (Conditional probability). The conditional probability of the event A given the occurrence of the event B is
\[P[A\mid B]=\frac{P[AB]}{P[B]}\]
  • Conditional probability is defined only when $P [B] > 0$.
  • If $P [B] = 0$, the conditioning event carries no probability, so $P [A\mid B]$ is left undefined.
  • Note that $P [A\mid B]$ is a valid probability measure relative to the sample space consisting of outcomes in $B$, and it satisfies the properties of the three axioms of probability.

  • Example 0.4. Suppose you have a shuffled deck of cards, and you observe the top card. What is the conditional probability that the top card is the queen of hearts given that the top card is a red card?
  • Sol:
    • $B$ ⇒ a red card is on top ⇒ $P[B] = \frac{26}{52} = \frac{1}{2}$
    • $A$ ⇒ the queen of hearts is on top ⇒ $P[A]=\frac{1}{52}$
    • Since $A \subset B$, we have $P[AB]=P[A]$, so $P[A\mid B]=\frac{P[AB]}{P[B]}=\frac{P[A]}{P[B]}=\frac{\frac{1}{52}}{\frac{1}{2}}=\frac{1}{26}$
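  • A quick Monte Carlo check of this example (a sketch; the numeric card encoding is an assumption made for illustration):

```python
import random

# Monte Carlo check of Example 0.4. Illustrative encoding: cards are
# numbered 0-51, cards 0-25 are red, and card 0 is the queen of hearts.
random.seed(1)

red_top = 0   # trials in which B occurs
qh_top = 0    # trials in which A (and hence AB, since A is a subset of B) occurs
for _ in range(500_000):
    top = random.randrange(52)    # top card of a shuffled deck
    if top < 26:                  # event B: red card on top
        red_top += 1
        if top == 0:              # event A: queen of hearts on top
            qh_top += 1

print(qh_top / red_top)           # close to 1/26 ~ 0.0385
```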

5. Independence

  • Definition 0.1 (Two Independent Events). Events $A$ and $B$ are independent if and only if
\[P [AB] = P [A]· P [B].\]

Independence and Disjoint Events

Independence Interpretation

  • $P[A\mid B]=P[A]$ (when $P[B] > 0$)
  • If $A$ and $B$ are independent ⇒ $A$ and $B^c$ are independent

Disjoint Interpretation

  • $A \cap B = \emptyset$, so $P[A \cap B] = 0$. Hence disjoint events with positive probability are never independent, since $P[AB] = 0 \neq P[A]\,P[B]$.
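  • The contrast is easy to verify by exhaustive enumeration. A minimal Python sketch, assuming two fair coin flips and illustrative events $A$ and $B$:

```python
from fractions import Fraction
from itertools import product

# Two fair coin flips; A = "first flip is heads", B = "second flip is heads".
S = set(product("ht", repeat=2))

def P(E):
    return Fraction(len(E), len(S))   # equiprobable outcomes

A = {s for s in S if s[0] == "h"}
B = {s for s in S if s[1] == "h"}

print(P(A & B) == P(A) * P(B))   # True: A and B are independent
print(A.isdisjoint(B))           # False: independent events need not be disjoint
```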

Discrete Random Variables (DRV)

Random Variables and Their Relationships

  • A probability model begins with an experiment, and each random variable is directly related to the experiment. There are three types of relationships:
    1. The random variable is the observation.
    2. The random variable is a function of the observation.
    3. The random variable is a function of another random variable.
  • The value of a random variable is always derived from the outcome of the experiment, reflecting the relationship between the experiment and the random variable.
  • Definition 0.1 (Random Variable). A random variable consists of an experiment with a probability measure $P [·]$ defined on a sample space $S$ and a function that assigns a real number to each outcome in the sample space of the experiment.

Identifying Random Variables

  • A random variable $X$ can be represented by the function $X(s)$, which maps the sample outcome $s$ to the corresponding value of the random variable. The notation $\{X = x\}$ refers to the set of sample points $s\in S$ for which $X(s) = x$:
\[\{X=x\}=\{s \in S\mid X(s)=x\}\]
  • Some examples of random variables include:
    • $A$, the number of cars passing through a checkpoint in the next 10 minutes;
    • $C$, the number of correct answers given on a quiz with 12 questions;
    • $M$, the number of minutes until the next phone call is answered.

  • Definition 0.4 (Discrete Random Variable). $X$ is a discrete random variable if the range of $X$ is a countable set
\[S_X=\{x_1,x_2,...\}\]
  • Definition 0.5 (Finite Random Variable). $X$ is a finite random variable if the range is a finite set
\[S_X=\{x_1,x_2,...,x_n\}\]
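  • A minimal Python sketch of these definitions (the three-coin-flip experiment is an illustrative assumption): the random variable is literally a function on the sample space, and its range here is a finite set.

```python
from itertools import product

# Experiment: flip a fair coin three times; X = number of heads observed.
S = list(product("ht", repeat=3))    # finest-grain sample space, 8 outcomes

def X(s):
    return s.count("h")              # assigns a real number to each outcome

S_X = sorted({X(s) for s in S})      # range of X is finite, so X is a finite RV
print(S_X)                           # [0, 1, 2, 3]

# The event {X = 2} is the set of outcomes s with X(s) = 2:
print([s for s in S if X(s) == 2])   # three outcomes, each with two heads
```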

Probability Mass Function

  • In a discrete probability model, each outcome in the sample space is assigned a probability between 0 and 1. For a discrete random variable $X$, this model is expressed using a probability mass function (PMF), denoted as $P_X (x)$. The PMF provides the probability that $X$ takes on a specific value $x$. Although the argument of the PMF can be any real number, the PMF itself only assigns probabilities to the values that $X$ can assume.
  • Definition 0.1 (Probability Mass Function (PMF)). The probability mass function (PMF) of the discrete random variable X is
\[P_X(x)=P[X=x]\]
  • Notation:
    • $X$ is the random variable.
    • $x$ is a possible value of $X$.
    • $P_X (x)$ is the PMF of $X$, assigning probabilities to values $x$.
  • Theorem 0.2. For a discrete random variable $X$ with probability mass function $P_X (x)$ and range $S_X$, the following properties hold:
    1. For any $x$, $P_X (x) \geq 0$
    2. $\sum_{x \in S_X}P_X(x)=1$
    3. For any event $B \subseteq S_X$ , the probability that $X$ is in the set $B$ is
    \[P[B]=\sum_{x\in B}P_X(x)\]
  • Example 0.3. The random variable M has PMF given by
\[P_M(m) = \begin{cases} \frac{2d - 1}{m + 1} & (m=1,2,3) \\ 0 & (\text{otherwise}) \end{cases}\]
  • Find:
    1. The value of the constant $d$.
    2. $P [M = 1]$.
    3. $P [M \geq 2]$.
    4. $P [M > 2]$.
  • Solution
    1. \[\begin{gather} \sum_{m=1}^{3} P_M(m) = 1 \implies \frac{2d - 1}{1 + 1} + \frac{2d - 1}{2 + 1} + \frac{2d - 1}{3 + 1} = 1 \\ \text{(multiplying both sides by 12)} \quad 12d - 6 + 8d - 4 + 6d - 3 = 12 \\ 26d - 13 = 12 \implies 26d = 25 \implies \boxed{d = \frac{25}{26}} \end{gather}\]
    2. $\ P[M = 1] = \frac{2 \cdot \frac{25}{26} - 1}{1 + 1} = \frac{\frac{25}{13} - 1}{2} = \frac{\frac{25 - 13}{13}}{2} = \frac{\frac{12}{13}}{2} = \boxed{\frac{6}{13}}$
    3. $\ P[M \geq 2] = P[M = 2] + P[M = 3] = \frac{2 \cdot \frac{25}{26} - 1}{2 + 1} + \frac{2 \cdot \frac{25}{26} - 1}{3 + 1} = \frac{\frac{12}{13}}{3} + \frac{\frac{12}{13}}{4} = \frac{4}{13} + \frac{3}{13} = \boxed{\frac{7}{13}}$
    4. $\ P[M > 2] = P[M = 3] = \boxed{\frac{3}{13}}$
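  • The arithmetic above can be double-checked with exact fractions (a minimal Python sketch of the same computation):

```python
from fractions import Fraction

# Exact-arithmetic check of Example 0.3.
d = Fraction(25, 26)
P_M = {m: (2 * d - 1) / (m + 1) for m in (1, 2, 3)}

print(sum(P_M.values()))     # 1, confirming d = 25/26
print(P_M[1])                # P[M = 1]  = 6/13
print(P_M[2] + P_M[3])       # P[M >= 2] = 7/13
print(P_M[3])                # P[M > 2]  = 3/13
```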



Families of Discrete Random Variables

Bernoulli Discrete Random Variables

  • Definition 0.2 (Bernoulli ($p$) Random Variable). $X$ is a Bernoulli ($p$) random variable if the PMF of $X$ has the form
\[P_X(x) = \begin{cases} 1 - p, & \text{if } x = 0, \\ p, & \text{if } x = 1, \\ 0, & \text{otherwise}. \end{cases}\]

where the parameter $p$ is in the range $0 < p < 1.$
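  • A direct transcription of this PMF as a Python function (a minimal sketch; the function name is an illustrative choice):

```python
def bernoulli_pmf(x, p):
    """PMF of a Bernoulli(p) random variable, transcribed from the definition."""
    if x == 0:
        return 1 - p
    if x == 1:
        return p
    return 0.0   # the PMF is zero everywhere else

print(bernoulli_pmf(0, 0.2), bernoulli_pmf(1, 0.2), bernoulli_pmf(2, 0.2))
# 0.8 0.2 0.0
```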


Geometric Discrete Random Variables “How many trials until the first success?”

  • Definition 0.4 (Geometric ($p$) Random Variable). A random variable $X$ is a geometric ($p$) random variable with parameter $p$ if the PMF of $X$ has the following form:
\[P_X(x) = \begin{cases} p(1-p)^{x-1} & \text{for } x = 1, 2, 3, \dots\\ 0 & \text{otherwise}. \end{cases}\]

where the parameter $p$ is in the range $0 < p < 1$.

  • Example 0.5. Suppose you keep flipping a coin until you get heads. The probability of getting heads on any given flip is $p = 0.2$. Let $X$ represent the number of flips needed to get the first head. What is the probability mass function (PMF) of $X$?
  • Solution:
\[P_X(x) = \begin{cases} 0.2(0.8)^{x-1}, & x = 1, 2, 3, \dots \\ 0, & \text{otherwise} \end{cases}\]
  • Graph: (figure showing the geometric (0.2) PMF)
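  • A minimal Python sketch of this PMF, reproducing the numbers of Example 0.5 (the function name and the truncation point of the sum are illustrative choices):

```python
def geometric_pmf(x, p):
    """PMF of a geometric(p) random variable: trials until the first success."""
    if isinstance(x, int) and x >= 1:
        return p * (1 - p) ** (x - 1)
    return 0.0

# Example 0.5 with p = 0.2:
print([round(geometric_pmf(x, 0.2), 4) for x in range(1, 6)])
# [0.2, 0.16, 0.128, 0.1024, 0.0819]

# Numerical sanity check that the PMF sums to 1 (truncated at x = 199):
print(round(sum(geometric_pmf(x, 0.2) for x in range(1, 200)), 6))   # ~1.0
```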


Binomial Discrete Random Variable “How many successes in n trials?”

  • Definition 0.6 (n choose k). For an integer $n \geq 0,$ we define:
\[\dbinom{n}{k} = \begin{cases} \dfrac{n!}{k!(n - k)!} & \text{if } k = 0, 1, \dots, n, \\ 0 & \text{otherwise} \end{cases}\]
  • Definition 0.7 (Binomial ($n$, $p$) Random Variable). Let $X$ be a binomial ($n$, $p$) random variable. The probability mass function (PMF) of $X$ is given by:
\[P_X(x) = \dbinom{n}{x} p^x (1 - p)^{n - x}\]

where $0 < p < 1$ and $n$ is a positive integer ($n \geq 1$).

  • Example 0.9. Suppose you flip a coin 12 times. The probability of getting heads on any given flip is $p = 0.2$. Let $X$ represent the number of heads obtained in these 12 flips. What is the probability mass function (PMF) of $X$?
  • Solution:
\[P_X(x) = \dbinom{12}{x} (0.2)^x (0.8)^{12 - x}\]
  • Graph: (figure showing the binomial (12, 0.2) PMF)
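  • The same PMF as a Python sketch, applied to Example 0.9 (the function name is an illustrative choice):

```python
from math import comb

def binomial_pmf(x, n, p):
    """PMF of a binomial(n, p) random variable: successes in n trials."""
    if 0 <= x <= n:
        return comb(n, x) * p**x * (1 - p) ** (n - x)
    return 0.0

# Example 0.9 with n = 12, p = 0.2:
pmf = [binomial_pmf(x, 12, 0.2) for x in range(13)]
print(round(sum(pmf), 10))                    # 1.0: the PMF sums to one
print(max(range(13), key=pmf.__getitem__))    # most likely value: x = 2
```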


Pascal Discrete Random Variable “How many trials until the r-th success?”

  • Definition 0.11 (Pascal (k, p) Random Variable). X is a Pascal (k, p) random variable if the PMF of X has the form:
\[P_X(x) = \dbinom{x - 1}{k - 1} p^k (1 - p)^{x - k}\]

where $0 < p < 1$ and k is an integer such that $k \geq 1$.

  • Example 0.12. Assume each tested device is defective with probability 0.1, and we test devices until we find three defective ones. The random variable $L$ represents the number of tests necessary to find three defective devices. Find the PMF of $L$.
  • Solution:
\[P_L(l) = \dbinom{l - 1}{2} (0.1)^3 (0.9)^{l - 3} \quad \text{for } l = 3, 4, 5, \dots\]
  • Graph: (figure showing the Pascal (3, 0.1) PMF)
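  • A Python sketch of the Pascal PMF, applied to Example 0.12 (the function name and the truncation point of the sum are illustrative choices):

```python
from math import comb

def pascal_pmf(x, k, p):
    """PMF of a Pascal(k, p) random variable: trials until the k-th success."""
    if x < k:
        return 0.0
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)

# Example 0.12: tests needed to find k = 3 defectives, p = 0.1.
print(round(pascal_pmf(3, 3, 0.1), 6))   # 0.001: first three tests all defective

# Numerical sanity check that the PMF sums to 1 (truncated at x = 499):
print(round(sum(pascal_pmf(x, 3, 0.1) for x in range(3, 500)), 6))   # ~1.0
```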


Poisson Random Variable “How many events happen in this time?”

  • Definition 0.16 (Poisson ($\alpha$) Random Variable). A random variable X is a Poisson ($\alpha$) random variable if the PMF of X has the form:
\[P_X(x) = \begin{cases} \dfrac{\alpha^x e^{-\alpha}}{x!} & \text{for } x = 0, 1, 2, \dots, \\ 0 & \text{otherwise} \end{cases}\]

where the parameter $\alpha$ is a positive real number ($\alpha > 0$).

\[\begin{aligned} \lambda &\to \text{average rate (events per unit time)} \\ T &\to \text{length of the observation interval} \\ &\implies \alpha = \lambda T \end{aligned}\]
  • Example 0.17. The number of phone calls received at a call center in any time interval follows a Poisson distribution. A particular center receives on average $\lambda = 3$ calls per minute. Find the probability that there are no calls in a 0.5-minute interval. Also, find the probability that there are no more than four calls in a 2-minute interval.
  • Solution:
\[\begin{gathered} \text{Probability of no calls in a 0.5-min interval: } \alpha = \lambda T = 3 \cdot 0.5 = 1.5 \\ \\ P_H(h) = \begin{cases} \dfrac{1.5^h e^{-1.5}}{h!} & \text{for } h = 0, 1, 2, \dots, \\ 0 & \text{otherwise} \end{cases} \\ \\ P[H = 0] = P_H(0) = \dfrac{1.5^0 e^{-1.5}}{0!} = e^{-1.5} \approx 0.223 \end{gathered}\]
\[\begin{gathered} \text{Probability of no more than 4 calls in a 2-min interval:} \\ \\ \alpha = \lambda T = 3 \cdot 2 = 6 \\ \\ P_J(j) = \begin{cases} \dfrac{6^j e^{-6}}{j!} & \text{for } j = 0, 1, 2, \dots, \\ 0 & \text{otherwise} \end{cases} \\ \\ P[J \leq 4] = \sum_{j=0}^{4} P_J(j) = P_J(0) + P_J(1) + P_J(2) + P_J(3) + P_J(4) \\ \\ \begin{aligned} P_J(0) &= \dfrac{6^0 e^{-6}}{0!} = e^{-6}, \\ P_J(1) &= \dfrac{6^1 e^{-6}}{1!} = 6e^{-6}, \\ P_J(2) &= \dfrac{6^2 e^{-6}}{2!} = 18e^{-6}, \\ P_J(3) &= \dfrac{6^3 e^{-6}}{3!} = 36e^{-6}, \\ P_J(4) &= \dfrac{6^4 e^{-6}}{4!} = 54e^{-6}. \end{aligned} \\ \\ P[J \leq 4] = (1 + 6 + 18 + 36 + 54)e^{-6} = 115e^{-6} \approx 0.285 \end{gathered}\]
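  • A short Python sketch reproducing both answers (the function name is an illustrative choice):

```python
from math import exp, factorial

def poisson_pmf(x, alpha):
    """PMF of a Poisson(alpha) random variable."""
    return alpha**x * exp(-alpha) / factorial(x)

# Example 0.17 with lambda = 3 calls per minute:
print(round(poisson_pmf(0, 3 * 0.5), 3))                       # P[H = 0] ~ 0.223
print(round(sum(poisson_pmf(j, 3 * 2) for j in range(5)), 3))  # P[J <= 4] ~ 0.285
```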