# Uncertainty

## Types of Uncertainty
| Our knowledge ↓ \ Others’ knowledge → | Known | Unknown |
|---|---|---|
| **Known** | Things we are certain of | We know there are things we can’t predict, e.g. random processes |
| **Unknown** | Others know but we don’t, e.g. insufficient data | Completely unexpected/unforeseeable events, e.g. unknown distribution |
| | Aleatoric | Epistemic |
|---|---|---|
| Uncertainty in | Data | Model |
| Cause | Noisy input data, measurement errors | Missing training data |
| Describes confidence in | Input data | Prediction |
| Reducible through more training data | ❌ | ✅ |
| Can be learnt by model | ✅ | ❌ |
| Solution | Better instruments/measurements | Get more data |
## Uncertainty Quantification Methods

| | Concept | Assumptions | Works for non-linear | Limitations |
|---|---|---|---|---|
| Asymptotic approach | Central limit theorem | Normal distribution of response residuals; homoscedasticity of response residuals | ❌ | Requires large sample size to satisfy the asymptotic condition; requires an appropriate formula for the standard error (not possible for complex models) |
| Bootstrapping (preferred) | Random sampling with replacement | | ✅ | Higher computation cost |
| Delta Approach | First-order Taylor expansion | | ✅ | |
| Conformal Prediction | Calibration via nonconformity scores | Exchangeability of data | ✅ | Split variant needs a held-out calibration set |
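A minimal sketch of the bootstrap approach, assuming scikit-learn's `LinearRegression` as a stand-in model (any regressor works); the function name and defaults are illustrative, not from the source:

```python
import numpy as np
from sklearn.linear_model import LinearRegression  # placeholder model

rng = np.random.default_rng(0)

def bootstrap_interval(X, y, X_new, n_boot=1000, alpha=0.05):
    """Percentile-bootstrap interval for the model's mean prediction."""
    n = len(X)
    preds = np.empty((n_boot, len(X_new)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)                # sample rows with replacement
        model = LinearRegression().fit(X[idx], y[idx])  # refit on the resample
        preds[b] = model.predict(X_new)
    return (np.quantile(preds, alpha / 2, axis=0),
            np.quantile(preds, 1 - alpha / 2, axis=0))
```

Note this brackets the mean prediction only; adding resampled residuals to each `preds[b]` would widen it into a prediction interval.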
## Uncertainty Intervals

| | \(\Delta y\) |
|---|---|
| Normal Assumption | \(t_{n_\text{cal}, \alpha/2} \times \text{SE}\) |
| Conformal Prediction | \(S^{-1} \left[ q_{\frac{\lceil (n+1)\alpha \rceil}{n}} \right]\) |
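A minimal split-conformal sketch using absolute residuals as the score \(S\) and a pre-fitted `model` (both my assumptions). The table writes the quantile level as \(\lceil (n+1)\alpha \rceil / n\); the sketch uses the common \(\lceil (n+1)(1-\alpha) \rceil / n\) convention, where \(\alpha\) denotes the miscoverage level:

```python
import numpy as np

def split_conformal_interval(model, X_cal, y_cal, X_new, alpha=0.1):
    """Interval y_hat +/- q with >= 1 - alpha coverage under exchangeability."""
    scores = np.abs(y_cal - model.predict(X_cal))         # nonconformity scores S
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)  # finite-sample-corrected level
    q = np.quantile(scores, level, method="higher")
    y_hat = model.predict(X_new)
    return y_hat - q, y_hat + q
```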
### Normal Assumption

| | Coefficient Confidence Interval | Response Confidence Interval | Response Prediction Interval |
|---|---|---|---|
| Notation | \(\sigma_{\hat \beta}\) | \(\sigma \Big[ \hat \mu \vert x_{i, \text{new}} \Big]\) | \(\sigma \Big[ \hat y_{i, \text{new}} \vert x_{i, \text{new}} \Big]\) |
| The upper and lower bound for estimated __ at a given level of significance | \(\hat \beta\) | \(\hat \mu \vert x_{i, \text{new}}\) | \(\hat y \vert x_{i, \text{new}} = \hat \mu \vert x_{i, \text{new}} + \hat u \vert x_{i, \text{new}}\) |
| Univariate Linear Regression (Asymptotic Approach) | \(\left\{ \text{RMSE} \sqrt{\dfrac{1}{n_\text{cal}} + \dfrac{\bar x^2}{n_\text{cal} \sigma^2_x}} , \dfrac{\text{RMSE}}{\sqrt{n_\text{cal} \sigma^2_x}} \right\}\) | \(\text{RMSE} \times \sqrt{\dfrac{1}{n_\text{cal}} + \dfrac{(x_{i, \text{new}} - \bar x )^2}{n_\text{cal} \sigma_x^2}}\) | \(\text{RMSE} \times \sqrt{\dfrac{1}{n_\text{cal}} + \dfrac{(x_{i, \text{new}} - \bar x )^2}{n_\text{cal} \sigma_x^2} \ \textcolor{hotpink}{+ 1}}\) |
| Multivariate Linear Regression (Asymptotic Approach) | \(\text{RMSE} \times \sqrt{\text{Cov}_{jj}}\) | \(\text{RMSE} \times \sqrt{X_{i, \text{new}}^T \cdot \text{Cov} \cdot X_{i, \text{new}}}\) | \(\text{RMSE} \times \sqrt{X_{i, \text{new}}^T \cdot \text{Cov} \cdot X_{i, \text{new}} \ \textcolor{hotpink}{+ 1}}\) |
| Multivariate Non-Linear Regression (Asymptotic + Delta Approach) | \(\text{RMSE} \times \sqrt{\text{IF}_{jj}}\) | \(\text{RMSE} \times \sqrt{J_{i, \text{new}}^T \cdot \text{IF} \cdot J_{i, \text{new}}}\) | \(\text{RMSE} \times \sqrt{J_{i, \text{new}}^T \cdot \text{IF} \cdot J_{i, \text{new}} \ \textcolor{hotpink}{+ 1}}\) |
where

- \(\text{Cov}\): covariance matrix, \(\text{Cov} = (X' X)^{-1}\)
- \(J\): Jacobian matrix, \(J_{i, \text{new}} = \dfrac{\partial \hat y_{i, \text{new}}}{\partial \beta}\)
- \(H\): Hessian matrix, \(H \approx J^T J\)
- \(\text{IF}\): inverse Fisher information, \(\text{IF} = H^{-1}\)
High values of the off-diagonal elements of \(\text{Cov}_\beta\) mean that the errors of the \(\beta\) estimates are correlated with each other.
Degrees of freedom \(= n - k - 1\), where

- \(n =\) sample size
- \(k =\) number of input variables
Confidence and prediction intervals are narrowest at \(X = \bar X\), and get wider further from this point.
Under homoskedasticity,

$$
\begin{aligned}
\hat V(\hat \beta) &= (X' X)^{-1} \hat \sigma^2 \\
\hat V(\hat \beta_j) &= \dfrac{\hat \sigma^2}{\hat u_j' \hat u_j}
\end{aligned}
$$

where \(\hat u_j\) denotes the residuals from regressing \(x_j\) on the remaining regressors.
#### Note

- RMSE here refers to the RMSE on the validation data
- If the validation error distribution is not normal, or you have a lot of data, you can use the quantiles of the validation error distribution for the confidence intervals instead
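A sketch of the multivariate linear regression columns above, assuming an explicit intercept column; per the note above, validation RMSE is preferred, but the sketch computes it from the fitting residuals to stay self-contained:

```python
import numpy as np
from scipy import stats

def linear_intervals(X_cal, y_cal, x_new, alpha=0.05):
    """Asymptotic response confidence/prediction half-widths:
    t * RMSE * sqrt(x' Cov x [+ 1]), with Cov = (X'X)^{-1}."""
    X = np.column_stack([np.ones(len(X_cal)), X_cal])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(X, y_cal, rcond=None)
    resid = y_cal - X @ beta
    dof = X.shape[0] - X.shape[1]                      # n - k - 1
    rmse = np.sqrt(resid @ resid / dof)
    cov = np.linalg.inv(X.T @ X)                       # Cov = (X'X)^{-1}
    x = np.concatenate([[1.0], np.atleast_1d(x_new)])
    t = stats.t.ppf(1 - alpha / 2, dof)
    conf = t * rmse * np.sqrt(x @ cov @ x)             # response confidence interval
    pred = t * rmse * np.sqrt(x @ cov @ x + 1.0)       # response prediction interval
    return x @ beta, conf, pred
```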
## Intervals using Models’ Predictions

For each data point, take the __ of multiple models’ predictions (see the sketch after this list):
- average
- 5th quantile
- 95th quantile
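A minimal sketch, assuming `preds` holds one row of predictions per model:

```python
import numpy as np

def ensemble_interval(preds):
    """Point estimate and 90% band from an (n_models, n_points) prediction array."""
    return (preds.mean(axis=0),                # average -> point prediction
            np.quantile(preds, 0.05, axis=0),  # 5th quantile -> lower bound
            np.quantile(preds, 0.95, axis=0))  # 95th quantile -> upper bound
```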
## Predictive Density

Describes the full predictive probability distribution of the response, \(\forall x\).
## Trajectories/Scenarios
Equally-likely samples of multivariate predictive densities
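A minimal sketch of drawing equally-likely trajectories, assuming a Gaussian predictive density; the horizon length, `mu`, and `Sigma` below are made-up placeholders, not from the source:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multivariate predictive density over a 24-step horizon
mu = np.zeros(24)
lags = np.abs(np.subtract.outer(np.arange(24), np.arange(24)))
Sigma = np.exp(-lags / 5.0)  # smooth, positive-definite covariance

# Each row is one equally-likely trajectory/scenario
trajectories = rng.multivariate_normal(mu, Sigma, size=100)
```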
## Uncertainty Propagation

| Function \(f\) | Variance \(\sigma_f^2\) |
|---|---|
| \(aA\) | \(= a^2\sigma_A^2\) |
| \(aA + bB\) | \(= a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\text{Cov}(A, B)\) |
| \(aA - bB\) | \(= a^2\sigma_A^2 + b^2\sigma_B^2 - 2ab\,\text{Cov}(A, B)\) |
| \(AB\) | \(\approx f^2 \left[\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\text{Cov}(A, B)}{AB} \right]\) |
| \(\frac{A}{B}\) | \(\approx f^2 \left[\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2\frac{\text{Cov}(A, B)}{AB} \right]\) |
| \(\frac{A}{A+B}\) | \(\approx \frac{f^2}{\left(A+B\right)^2} \left(\frac{B^2}{A^2}\sigma_A^2 + \sigma_B^2 - 2\frac{B}{A} \text{Cov}(A, B) \right)\) |
| \(a A^{b}\) | \(\approx \left( a b A^{b-1} \sigma_A \right)^2 = \left( \frac{f b \sigma_A}{A} \right)^2\) |
| \(a \ln(bA)\) | \(\approx \left(a \frac{\sigma_A}{A} \right)^2\)[^4] |
| \(a \log_{10}(bA)\) | \(\approx \left(a \frac{\sigma_A}{A \ln(10)} \right)^2\)[^5] |
| \(a e^{bA}\) | \(\approx f^2 \left( b\sigma_A \right)^2\)[^6] |
| \(a^{bA}\) | \(\approx f^2 (b \ln(a) \sigma_A)^2\) |
| \(a \sin(bA)\) | \(\approx \left[ a b \cos(b A) \sigma_A \right]^2\) |
| \(a \cos(bA)\) | \(\approx \left[ a b \sin(b A) \sigma_A \right]^2\) |
| \(a \tan(bA)\) | \(\approx \left[ a b \sec^2(b A) \sigma_A \right]^2\) |
| \(A^B\) | \(\approx f^2 \left[ \left( \frac{B}{A}\sigma_A \right)^2 + \left( \ln(A)\sigma_B \right)^2 + 2 \frac{B \ln(A)}{A} \text{Cov}(A, B) \right]\) |
| \(\sqrt{aA^2 \pm bB^2}\) | \(\approx \left(\frac{A}{f}\right)^2 a^2\sigma_A^2 + \left(\frac{B}{f}\right)^2 b^2\sigma_B^2 \pm 2ab\frac{AB}{f^2}\,\text{Cov}(A, B)\) |
For uncorrelated variables (\(\rho_{AB} = 0\), \(\text{Cov}(A, B) = 0\)), expressions for more complicated functions can be derived by combining simpler functions. For example, repeated multiplication, assuming no correlation, gives

$$f = ABC; \qquad \left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + \left(\frac{\sigma_C}{C}\right)^2.$$
For the case \(f = AB\), we also have Goodman's expression[^7] for the exact variance: in the uncorrelated case it is

$$V(XY) = E(X)^2 V(Y) + E(Y)^2 V(X) + E\left((X - E(X))^2 (Y - E(Y))^2\right),$$

and therefore

$$\sigma_f^2 = A^2\sigma_B^2 + B^2\sigma_A^2 + \sigma_A^2\sigma_B^2.$$
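A quick Monte Carlo sanity check of the \(f = AB\) case, comparing the first-order approximation against Goodman's exact expression and the empirical variance (the input numbers below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Uncorrelated A and B with known means and standard deviations
A_mean, B_mean, sA, sB = 10.0, 4.0, 0.5, 0.3
A = rng.normal(A_mean, sA, 1_000_000)
B = rng.normal(B_mean, sB, 1_000_000)

f = A_mean * B_mean
approx = f**2 * ((sA / A_mean) ** 2 + (sB / B_mean) ** 2)        # first-order formula
goodman = A_mean**2 * sB**2 + B_mean**2 * sA**2 + sA**2 * sB**2  # exact (Goodman)
empirical = (A * B).var()

print(approx, goodman, empirical)  # goodman and empirical should agree closely
```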
### Effect of correlation on differences

If \(A\) and \(B\) are uncorrelated, their difference \(A - B\) will have more variance than either of them. An increasing positive correlation (\(\rho_{AB} \to 1\)) decreases the variance of the difference, converging to zero variance for perfectly correlated variables with the same variance. Conversely, a negative correlation (\(\rho_{AB} \to -1\)) further increases the variance of the difference, compared to the uncorrelated case.

For example, the self-subtraction \(f = A - A\) has zero variance \(\sigma_f^2 = 0\) only if the variate is perfectly autocorrelated (\(\rho_A = 1\)). If \(A\) is uncorrelated, \(\rho_A = 0\), then the output variance is twice the input variance, \(\sigma_f^2 = 2\sigma^2_A\). And if \(A\) is perfectly anticorrelated, \(\rho_A = -1\), then the input variance is quadrupled in the output, \(\sigma_f^2 = 4\sigma^2_A\) (notice \(1 - \rho_A = 2\) for \(f = aA - aA\) in the table above).
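Concretely, applying the \(aA - bB\) row with \(a = b = 1\) and equal variances \(\sigma_A = \sigma_B = \sigma\):

$$
\sigma_{A-B}^2 = \sigma_A^2 + \sigma_B^2 - 2\rho_{AB}\,\sigma_A \sigma_B
= 2\sigma^2 (1 - \rho_{AB}) =
\begin{cases}
0, & \rho_{AB} = 1 \\
2\sigma^2, & \rho_{AB} = 0 \\
4\sigma^2, & \rho_{AB} = -1
\end{cases}
$$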
## Value at Risk Models

Purpose:

- Derive the risk profile of the firm
- Protect the firm against unacceptably large concentrations
- Quantify potential losses

Steps (see the sketch after this list):

- Collect data
- Graph the data to inspect data quality
- Transform price data into returns (percentage difference of prices)
- Look at the frequency distribution
- Obtain the standard deviation (volatility)
- Multiply the volatility by the one-sided critical value \(Z\) (\(\approx 2.33\) at 99%) to estimate the 99% worst-case loss
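A minimal sketch of these steps as a parametric VaR estimate; the normality assumption for returns and the function name are mine, not from the source:

```python
import numpy as np
from scipy import stats

def parametric_var(prices, confidence=0.99):
    """Worst-case loss (as a return) at the given one-sided confidence level."""
    returns = np.diff(prices) / prices[:-1]  # prices -> percentage returns
    volatility = returns.std(ddof=1)         # standard deviation of returns
    z = stats.norm.ppf(confidence)           # one-sided z, ~2.33 at 99%
    return z * volatility
```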