cours/statistiques cheat sheet.md

up:: [[statistiques]], [[cheat sheet]]
#maths/statistiques

- binomial distribution $\mathcal{B}(n, p)$: $E(X) = n\cdot p$, $\sigma(X) = \sqrt{ n\cdot p\cdot (1-p) }$
- Poisson distribution $\mathcal{P}(\lambda)$: $P(X=k) = \dfrac{\lambda^{k}e^{-\lambda}}{k!}$, $E(X) = V(X) = \lambda$, $\sigma(X) = \sqrt{ \lambda }$
- normal distribution $\mathcal{N}(\mu, \sigma)$: for independent variables, $\mathcal{N}(\mu_1, \sigma_1)+\mathcal{N}(\mu_2, \sigma_2) = \mathcal{N}\left(\mu_1+\mu_2, \sqrt{ \sigma_1^{2}+\sigma_2^{2} }\right)$; $\Pi$ denotes the cumulative distribution function of $\mathcal{N}(0, 1)$
- variance: $V(X) = \frac{1}{n} \sum\limits_{i} \underbracket{n_{i}}_{\text{count}} \left( \underbracket{x_{i}}_{\text{value}} - \underbracket{\overline{X}}_{\text{mean}} \right)^{2} = \overline{(X - \overline{X})^{2}}$
- covariance: $cov(X, Y) = \overline{X\cdot Y} - \overline{X}\cdot \overline{Y} = \frac{1}{n}\sum\limits_{i} (x_{i}-\overline{X})\cdot(y_{i}-\overline{Y})$
- linear correlation coefficient: $\rho(X, Y) = \frac{cov(X, Y)}{\sigma(X)\cdot\sigma(Y)}$
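
The variance, covariance and correlation formulas above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the sample data are assumptions made up for illustration, not part of the course.

```python
import math

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Population variance: mean of the squared deviations from the mean."""
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])

def variance_grouped(values, counts):
    """Variance from distinct values x_i and their counts n_i."""
    n = sum(counts)
    m = sum(c * v for v, c in zip(values, counts)) / n
    return sum(c * (v - m) ** 2 for v, c in zip(values, counts)) / n

def covariance(xs, ys):
    """Population covariance: mean of the products of deviations."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def correlation(xs, ys):
    return covariance(xs, ys) / (math.sqrt(variance(xs)) * math.sqrt(variance(ys)))

# Hypothetical sample, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

# Second form of the covariance: mean(X*Y) - mean(X)*mean(Y).
cov_alt = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

print("V(X)      =", variance(xs))
print("cov(X, Y) =", covariance(xs, ys), "and", cov_alt)
print("rho(X, Y) =", correlation(xs, ys))
print("grouped V =", variance_grouped([1, 2, 3], [1, 2, 1]))
```

Both covariance expressions should agree up to floating-point rounding.
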
**Least-squares regression**: $\hat{y} = a(x - \overline{X}) + \overline{Y}$ where $a = \frac{cov(X, Y)}{V(X)}$; the linear fit is considered valid if $0.7 \leq |\rho(X, Y)| \leq 1$
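
As an illustration of the regression formula, here is a small sketch that computes the slope $a$, the correlation $\rho$ and a prediction function. The helper name `fit_line` and the data are hypothetical; only the formulas come from the note.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y_hat = a * (x - mean_x) + mean_y, with a = cov(X, Y) / V(X)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

    a = cov_xy / var_x
    rho = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

    def predict(x):
        return a * (x - mean_x) + mean_y

    return a, rho, predict

# Hypothetical sample, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a, rho, predict = fit_line(xs, ys)
print("slope a =", a)
print("rho     =", rho)
print("fit considered valid (|rho| >= 0.7):", abs(rho) >= 0.7)
print("prediction at x = 6:", predict(6.0))
```

By construction the fitted line passes through the point $(\overline{X}, \overline{Y})$, which is why the formula is written around the two means.
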
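**Central limit theorem:**
- let $X_{1}, \ldots, X_{n}$ be independent random variables following **the same probability distribution**, with mean $\mu$ and standard deviation $\sigma$; for large $n$, the sum $\sum\limits_{k=1}^{n} X_{k}$ is approximately distributed as $\mathcal{N}(n\mu, \sigma\sqrt{ n })$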
- **APPROXIMATIONS OF THE BINOMIAL DISTRIBUTION**
    - **by the normal distribution**
        - let $X \leadsto \mathcal{B}(n, p)$; for large $n$, $X$ approximately follows $\mathcal{N}(E(X), \sigma(X)) = \mathcal{N}(n\cdot p, \sqrt{ n\cdot p\cdot(1-p) })$
        - satisfactory if $n\cdot p \geq 5$ and $n\cdot(1-p) \geq 5$
    - **by the Poisson distribution**
        - when $n\cdot p < 5$, the normal approximation fails; a Poisson distribution can be used instead
        - let $X \leadsto \mathcal{B}(n, p)$ and $\lambda = n\cdot p$; then $P(X = k)$ tends to $\frac{\lambda^{k}e^{-\lambda}}{k!}$ as $n \to +\infty$ with $\lambda$ held fixed, for every possible value $k = 0, 1, 2, \ldots$
        - as $n \to +\infty$, $\mathcal{B}(n, p) \to \mathcal{P}(\lambda=n\cdot p)$
        - valid if $n \geq 30$, $p \leq 0.1$ and $\lambda=n\cdot p \leq 10$
        - ⚠️ both distributions are discrete, so no continuity correction is needed
- **approximation of the Poisson distribution by the normal distribution**
    - if $X \leadsto \mathcal{P}(\lambda)$ (Poisson distribution) with $\lambda \geq 10$, then $\mathcal{P}(\lambda) \approx \mathcal{N}(\lambda, \sqrt{ \lambda })$, i.e. the standardized variable $\frac{X-\lambda}{\sqrt{ \lambda }}$ approximately follows $\mathcal{N}(0, 1)$
    - with *continuity correction*: $P(a \leq X \leq b) \approx \Pi\left( \frac{b+0.5-\lambda}{\sqrt{ \lambda }} \right) - \Pi\left( \frac{a-0.5-\lambda}{\sqrt{ \lambda }} \right)$
    - example: if $X \leadsto \mathcal{P}(\lambda=10)$, then $P(X \leq 6) \approx \Pi \left( \frac{6.5 - 10}{\sqrt{ 10 }} \right)$
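
The three approximations above can be compared against exact probabilities. This is a sketch using only the Python standard library (`math.comb`, `statistics.NormalDist`). The parameter values are assumptions picked to satisfy the stated validity conditions, and applying the continuity correction to the binomial case is also an assumption, since the note only states it explicitly for the Poisson case.

```python
import math
from statistics import NormalDist

def binom_pmf(n, p, k):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

def poisson_pmf(lam, k):
    return lam ** k * math.exp(-lam) / math.factorial(k)

def poisson_cdf(lam, k):
    return sum(poisson_pmf(lam, i) for i in range(k + 1))

def normal_cdf_cc(mu, sigma, k):
    """P(X <= k) under N(mu, sigma), with continuity correction (k + 0.5)."""
    return NormalDist(mu, sigma).cdf(k + 0.5)

# 1) Binomial -> normal: n*p = 20 and n*(1-p) = 30 are both >= 5.
n, p = 50, 0.4
exact_bn = binom_cdf(n, p, 18)
approx_bn = normal_cdf_cc(n * p, math.sqrt(n * p * (1 - p)), 18)

# 2) Binomial -> Poisson: n >= 30, p <= 0.1, lambda = n*p = 3 <= 10.
#    Both distributions are discrete, so no continuity correction.
n2, p2 = 100, 0.03
exact_bp = binom_pmf(n2, p2, 2)
approx_bp = poisson_pmf(n2 * p2, 2)

# 3) Poisson -> normal: the note's example, P(X <= 6) for lambda = 10.
lam = 10
exact_pn = poisson_cdf(lam, 6)
approx_pn = normal_cdf_cc(lam, math.sqrt(lam), 6)

print("binomial vs normal :", exact_bn, approx_bn)
print("binomial vs Poisson:", exact_bp, approx_bp)
print("Poisson vs normal  :", exact_pn, approx_pn)
```

In each pair the first number is the exact probability and the second is the approximation, so the gap between them gives a feel for how tight each rule of thumb is.
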