sdr.max_iid_rvs

sdr.max_iid_rvs(X: rv_continuous | rv_histogram, n_vars: int, p: float = 1e-16) → rv_histogram

Numerically calculates the distribution of the maximum of \(n\) i.i.d. random variables \(X_i\).

Parameters:¶

X: rv_continuous | rv_histogram¶: The distribution of the i.i.d. random variables \(X_i\).
n_vars: int¶: The number \(n\) of random variables.
p: float = 1e-16¶: The probability of exceeding the x axis, on either side, for each distribution. This is used to determine the bounds on the x axis for the numerical convolution. Smaller values of \(p\) will result in more accurate analysis, but will require more computation.

Returns:¶

The distribution of the sum \(Z = \max(X_1, X_2, \dots, X_n)\).

Notes¶

Given a random variable \(X\) with PDF \(f_X(x)\) and CDF \(F_X(x)\), we compute the PDF of \(Z = \max(X_1, X_2, \dots, X_n)\), where \(X_1, X_2, \dots, X_n\) are independent and identically distributed (i.i.d.), as follows.

The CDF of \(Z\), denoted \(F_Z(z)\), is \(F_Z(z) = P(Z \leq z)\). Since \(Z = \max(X_1, X_2, \dots, X_n)\), the event \(Z \leq z\) occurs if and only if all \(X_i \leq z\). Using independence,

\[F_Z(z) = P(Z \leq z) = \prod_{i=1}^n P(X_i \leq z) = [F_X(z)]^n .\]

The PDF of \(Z\), denoted \(f_Z(z)\), is the derivative of \(F_Z(z)\). Therefore, \(f_Z(z) = \frac{d}{dz} F_Z(z)\). Substituting \(F_Z(z) = [F_X(z)]^n\) yields \(f_Z(z) = n \cdot [F_X(z)]^{n-1} \cdot f_X(z)\).

Therefore, the PDF of \(Z = \max(X_1, X_2, \dots, X_n)\) is

\[f_Z(z) = n \cdot [F_X(z)]^{n-1} \cdot f_X(z)\]

where \(F_X(z)\) is the CDF of the original random variable \(X\), \(f_X(z)\) is the PDF of \(X\), and \(n\) is the number of samples.

Examples¶

Compute the distribution of the maximum of two normal random variables.

In [1]: X = scipy.stats.norm(loc=-1, scale=0.5)

In [2]: n_vars = 2

In [3]: x = np.linspace(-3, 1, 1_001)

In [4]: plt.figure(); \
   ...: plt.plot(x, X.pdf(x), label="$X$"); \
   ...: plt.plot(x, sdr.max_iid_rvs(X, n_vars).pdf(x), label=r"$\max(X_1, X_2)$"); \
   ...: plt.hist(X.rvs((100_000, n_vars)).max(axis=1), bins=101, density=True, histtype="step", label=r"$\max(X_1, X_2)$ empirical"); \
   ...: plt.legend(); \
   ...: plt.xlabel("Random variable"); \
   ...: plt.ylabel("Probability density"); \
   ...: plt.title("Maximum of two normal random variables");
   ...: 

Compute the distribution of the maximum of ten Rayleigh random variables.

In [5]: X = scipy.stats.rayleigh(scale=1)

In [6]: n_vars = 10

In [7]: x = np.linspace(0, 6, 1_001)

In [8]: plt.figure(); \
   ...: plt.plot(x, X.pdf(x), label="$X$"); \
   ...: plt.plot(x, sdr.max_iid_rvs(X, n_vars).pdf(x), label="$\\max(X_1, \dots, X_3)$"); \
   ...: plt.hist(X.rvs((100_000, n_vars)).max(axis=1), bins=101, density=True, histtype="step", label="$\\max(X_1, \dots, X_3)$ empirical"); \
   ...: plt.legend(); \
   ...: plt.xlabel("Random variable"); \
   ...: plt.ylabel("Probability density"); \
   ...: plt.title("Maximum of 10 Rayleigh random variables");
   ...: 

Compute the distribution of the maximum of 100 Rician random variables.

In [9]: X = scipy.stats.rice(2)

In [10]: n_vars = 100

In [11]: x = np.linspace(0, 8, 1_001)

In [12]: plt.figure(); \
   ....: plt.plot(x, X.pdf(x), label="$X$"); \
   ....: plt.plot(x, sdr.max_iid_rvs(X, n_vars).pdf(x), label=r"$\max(X_1, \dots, X_{100})$"); \
   ....: plt.hist(X.rvs((100_000, n_vars)).max(axis=1), bins=101, density=True, histtype="step", label=r"$\max(X_1, \dots, X_{100})$ empirical"); \
   ....: plt.legend(); \
   ....: plt.xlabel("Random variable"); \
   ....: plt.ylabel("Probability density"); \
   ....: plt.title("Maximum of 100 Rician random variables");
   ....: