-
sdr.max_iid_rvs(X: rv_continuous | rv_histogram, n_vars: int, p: float =
1e-16
) rv_histogram Numerically calculates the distribution of the maximum of \(n\) i.i.d. random variables \(X_i\).
- Parameters:¶
- X: rv_continuous | rv_histogram¶
The distribution of the i.i.d. random variables \(X_i\).
- n_vars: int¶
The number \(n\) of random variables.
- p: float =
1e-16
¶ The probability of exceeding the x axis, on either side, for each distribution. This is used to determine the bounds on the x axis for the numerical convolution. Smaller values of \(p\) will result in more accurate analysis, but will require more computation.
- Returns:¶
The distribution of the sum \(Z = \max(X_1, X_2, \dots, X_n)\).
Notes
Given a random variable \(X\) with PDF \(f_X(x)\) and CDF \(F_X(x)\), we compute the PDF of \(Z = \max(X_1, X_2, \dots, X_n)\), where \(X_1, X_2, \dots, X_n\) are independent and identically distributed (i.i.d.), as follows.
The CDF of \(Z\), denoted \(F_Z(z)\), is \(F_Z(z) = P(Z \leq z)\). Since \(Z = \max(X_1, X_2, \dots, X_n)\), the event \(Z \leq z\) occurs if and only if all \(X_i \leq z\). Using independence,
\[F_Z(z) = P(Z \leq z) = \prod_{i=1}^n P(X_i \leq z) = [F_X(z)]^n .\]The PDF of \(Z\), denoted \(f_Z(z)\), is the derivative of \(F_Z(z)\). Therefore, \(f_Z(z) = \frac{d}{dz} F_Z(z)\). Substituting \(F_Z(z) = [F_X(z)]^n\) yields \(f_Z(z) = n \cdot [F_X(z)]^{n-1} \cdot f_X(z)\).
Therefore, the PDF of \(Z = \max(X_1, X_2, \dots, X_n)\) is
\[f_Z(z) = n \cdot [F_X(z)]^{n-1} \cdot f_X(z)\]where \(F_X(z)\) is the CDF of the original random variable \(X\), \(f_X(z)\) is the PDF of \(X\), and \(n\) is the number of samples.
Examples
Compute the distribution of the maximum of two normal random variables.
In [1]: X = scipy.stats.norm(loc=-1, scale=0.5) In [2]: n_vars = 2 In [3]: x = np.linspace(-3, 1, 1_001) In [4]: plt.figure(); \ ...: plt.plot(x, X.pdf(x), label="X"); \ ...: plt.plot(x, sdr.max_iid_rvs(X, n_vars).pdf(x), label=r"$\max(X_1, X_2)$"); \ ...: plt.hist(X.rvs((100_000, n_vars)).max(axis=1), bins=101, density=True, histtype="step", label=r"$\max(X_1, X_2)$ empirical"); \ ...: plt.legend(); \ ...: plt.xlabel("Random variable"); \ ...: plt.ylabel("Probability density"); \ ...: plt.title("Maximum of two normal random variables"); ...:
Compute the distribution of the maximum of ten Rayleigh random variables.
In [5]: X = scipy.stats.rayleigh(scale=1) In [6]: n_vars = 10 In [7]: x = np.linspace(0, 6, 1_001) In [8]: plt.figure(); \ ...: plt.plot(x, X.pdf(x), label="X"); \ ...: plt.plot(x, sdr.max_iid_rvs(X, n_vars).pdf(x), label="$\\max(X_1, \dots, X_3)$"); \ ...: plt.hist(X.rvs((100_000, n_vars)).max(axis=1), bins=101, density=True, histtype="step", label="$\\max(X_1, \dots, X_3)$ empirical"); \ ...: plt.legend(); \ ...: plt.xlabel("Random variable"); \ ...: plt.ylabel("Probability density"); \ ...: plt.title("Maximum of 10 Rayleigh random variables"); ...:
Compute the distribution of the maximum of 100 Rician random variables.
In [9]: X = scipy.stats.rice(2) In [10]: n_vars = 100 In [11]: x = np.linspace(0, 8, 1_001) In [12]: plt.figure(); \ ....: plt.plot(x, X.pdf(x), label="X"); \ ....: plt.plot(x, sdr.max_iid_rvs(X, n_vars).pdf(x), label=r"$\max(X_1, \dots, X_{100})$"); \ ....: plt.hist(X.rvs((100_000, n_vars)).max(axis=1), bins=101, density=True, histtype="step", label=r"$\max(X_1, \dots, X_{100})$ empirical"); \ ....: plt.legend(); \ ....: plt.xlabel("Random variable"); \ ....: plt.ylabel("Probability density"); \ ....: plt.title("Maximum of 100 Rician random variables"); ....: