-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
159 lines (119 loc) · 5.87 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
---
output: github_document
bibliography: misc/STAT_547C.bib
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
## Packages to use ----
pacman::p_load(
tidyverse, janitor, writexl,
readxl, scales, mytidyfunctions
)
## Set theme ------
theme_set(theme_jmr(text = element_text(family = "Times New Roman")))
options(
ggplot2.discrete.colour = c("#277DA1", "#f94144", "#F9C74F", "#43AA8B"),
ggplot2.discrete.fill = c("#277DA1", "#f94144", "#F9C74F", "#43AA8B")
)
```
# tidydelta <a href="https://javiermtzrdz.github.io/tidydelta/"><img src="man/figures/logo-tidydelta.png" align="right" height="138" width="138" /></a>
<!-- badges: start -->
[](https://CRAN.R-project.org/package=tidydelta)
[](https://github.com/JavierMtzRdz/tidydelta/actions/workflows/R-CMD-check.yaml)
[](https://cran.r-project.org/web/licenses/MIT)
[](https://lifecycle.r-lib.org/articles/stages.html#stable)
<!-- badges: end -->
Delta Method implementation to estimate standard errors in a {tidyverse} workflow.
## Installation
You can install the stable version from the [CRAN](https://cran.r-project.org/web/packages/tidydelta/index.html) with
``` r
install.packages("dplyr")
```
Or the development version from
[GitHub](https://github.com/JavierMtzRdz/tidydelta) with
``` r
remotes::install_github("JavierMtzRdz/tidydelta")
# Or
devtools::install_github("JavierMtzRdz/tidydelta")
```
## Theoretical Background
In general terms, the Delta Method provides a tool for approximating the behaviour of an estimator $\phi(T_n)$ using Taylor Expansion, where $T_n$ is an estimator of a parameter $\theta$, and $\phi$ is a function. To derive this result, we should begin with the observation that, according to the continuous mapping theorem, if $T_n \xrightarrow{\mathbb{P}} \theta$ implies $\phi(T_n) \xrightarrow{\mathbb{P}} \phi(\theta)$ assuming that $\phi$ is a continuous function [@Vaart:2000a, p.25].
This observation allows us to take a step forward to decompose the theorem of the DM. Assuming that $\phi$ is not only continuous but also differentiable and further assuming that $\sqrt{n} (T_n - \theta)$ converges in distribution to a variable $T$ as the sample size $n$ increases, [@Bouchard-Cote:2023, p.87] we can employ Taylor Expansion to show that
$$
\sqrt{n} (\phi(T_n) - \phi(\theta)) \approx \phi'(\theta)\sqrt{n} (T_n - \theta).
$$
Now, as the sample size $n$ becomes larger, the expression $\sqrt{n} (\phi(T_n) - \phi(\theta))$ converges in distribution to $\phi'(\theta)T$ [@Vaart:2000a, p. 25; @Bouchard-Cote:2023].
Given the previous result, we can rearrange the equations to show that
$$
\sqrt{n} (\phi(T_n) - \phi(\theta)) \sim N(0, \phi'(\theta)^2\sigma^2)
$$
as $n$ grows larger. For this reason, it is not a surprise that one of the primary uses of the Delta Method is to approximate the variance of transformations of estimators.
### The Multivariate Case
We often encounter scenarios where our $\theta$ can be expressed as a function of more than one parameter. For instance, we may be interested in the transformation $\phi$ of $\theta = (\theta_1, \ldots, \theta_k)$ [@Zepeda-TelloSchomakerMaringe:2022]. In this case, we are dealing with a multivariate parameter space and the estimator $T_{n,k} = (Y_{n,1}, \ldots, Y_{n,k})$ becomes a random vector. Consequently, the asymptotic behaviour of the estimator $T_{n,k}$ can be seen as follows [@Weisberg:2005, pp. 79-80]
$$
\sqrt{n} (T_{n,k} - \theta) \leadsto N(0, \Sigma).
$$
To approximate this scenario using the Delta Method, we need to compute the vector of all partial derivatives of $\phi(\theta)$ with respect to each parameter $\theta_1, \ldots, \theta_k$. This vector is denoted as $\nabla \phi$.
With this vector, we can extend the Delta Method to the multivariate case stating that asymptotically
$$
\sqrt{n} (\phi(T_{n,k}) - \phi(\theta)) \leadsto N(0, \nabla_{\theta}^\top\Sigma \nabla_{\theta}).
$$
In this equation, $\nabla_{\theta}^\top$ represents the transpose of the gradient vector $\nabla \phi$, and $\Sigma$ is the covariance matrix of the random vector $T_{nk}$ [@Weisberg:2005, pp. 79-80].From this, we obtain that $se(\theta) = \sqrt{\nabla_{\theta}^\top\Sigma \nabla_{\theta}}$, which constitutes the function that is being implemented in this project.
## Examples
Using `tidydelta()`, the following commands are equivalent:
```{r example}
# Load packages
library(tidydelta)
library(tidyverse)
# Simulate samples
set.seed(547)
x <- rnorm(10000, mean = 5, sd = 2)
y <- rnorm(10000, mean = 15, sd = 3)
bd <- tibble(x, y)
# Equivalent uses of tidydelta()
tidydelta(~ y / x,
conf_lev = .95
)
tidydelta(~ bd$y / bd$x,
conf_lev = .95
)
bd %>%
summarise(tidydelta(~ y / x,
conf_lev = .95
))
```
Now, the data frame is divided into samples to compare the transformation of the sample with the estimation of `tidydelta()`. In the real world, you would not need to compute the Delta Method if you have many samples, but it shows how it can be incorporated in a workflow with tidyverse.
```{r message=FALSE, warning=FALSE, fig.height=3.5, dpi=300}
(result <- bd %>%
summarise(tidydelta(~ x / y,
conf_lev = .95
)))
ggplot() +
geom_histogram(
data = bd %>%
mutate(t = x / y),
aes(x = t)
) +
geom_vline(aes(
xintercept = result$T_n,
color = "T_n"
)) +
geom_vline(
aes(
xintercept = c(
result$lower_ci,
result$upper_ci
),
color = "CI"
),
linetype = "dashed"
) +
labs(color = element_blank())
```
## References