class: center, middle, inverse, title-slide

# Causal Diagrams in R
### 2020-07-29 (updated: 2020-07-28)

---
class: middle, center, inverse

# **Draw your causal assumptions with causal directed acyclic graphs (DAGs)**

---
class: inverse

# The basic idea

1. Specify your causal question
1. Use domain knowledge
1. Write variables as nodes
1. Write causal pathways as arrows (edges)

---
class: middle, center, inverse

# **ggdag**

---

<img src="img/ggdagitty.png" width="100%" height="100%" style="display: block; margin: auto;" />

---

<img src="img/ggdagitty_alg.png" width="100%" height="100%" style="display: block; margin: auto;" />

---

<img src="img/ggdagitty_plots.png" width="100%" height="100%" style="display: block; margin: auto;" />

---

<img src="img/tidy_ggdagitty.png" width="100%" height="100%" style="display: block; margin: auto;" />

---

# Step 1: Specify your DAG

--

```r
dagify(
  cancer ~ smoking,
  coffee ~ smoking
)
```

---

# Step 1: Specify your DAG

```r
dagify(
* cancer ~ smoking,
  coffee ~ smoking
)
```

---

# Step 1: Specify your DAG

```r
dagify(
  cancer ~ smoking,
* coffee ~ smoking
)
```

---

# Step 1: Specify your DAG

```r
dagify(
  cancer ~ smoking,
  coffee ~ smoking
) %>%
  ggdag()
```

---

# Step 1: Specify your DAG

<img src="02-dags_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

---

# Step 1: Specify your DAG

```r
dagify(
  cancer ~ smoking + coffee,
  coffee ~ smoking
) %>%
  ggdag()
```

---

# Step 1: Specify your DAG

<img src="02-dags_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

---

## Your Turn 1 (**`02-dags-exercises.Rmd`**)

### Specify a DAG with `dagify()`. Write your assumption that `smoking` causes `cancer` as a formula.

### We're going to assume that coffee does not cause cancer, so there's no formula for that. But we still need to declare our causal question. Specify "coffee" as the exposure and "cancer" as the outcome (both in quotation marks).

### Plot the DAG using `ggdag()`
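---

# Declaring the causal question

The Step 1 examples draw arrows but never name an exposure or outcome. As a minimal sketch, the code below adds the `exposure` and `outcome` arguments to the second Step 1 DAG and then asks {dagitty} what was stored. The object name `step1_dag` is an assumption made for illustration; it does not appear in the exercise file.

```r
library(ggdag)

# Second Step 1 DAG, with the causal question declared.
# `step1_dag` is a hypothetical name used only on this slide.
step1_dag <- dagify(
  cancer ~ smoking + coffee,
  coffee ~ smoking,
  exposure = "coffee",
  outcome = "cancer"
)

# Inspect what the DAG object recorded
dagitty::exposures(step1_dag)          # node declared as the exposure
dagitty::outcomes(step1_dag)           # node declared as the outcome
dagitty::parents(step1_dag, "cancer")  # nodes with an arrow into cancer
```

Each formula's left-hand side is the node being caused, so `parents()` should return exactly the variables on the right-hand side of the `cancer ~ ...` formula.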
---

## Your Turn 1 (`02-dags-exercises.Rmd`)

```r
coffee_cancer_dag <- dagify(
  cancer ~ smoking,
  smoking ~ addictive,
  coffee ~ addictive,
  exposure = "coffee",
  outcome = "cancer",
  labels = c(
    "coffee" = "Coffee",
    "cancer" = "Lung Cancer",
    "smoking" = "Smoking",
    "addictive" = "Addictive \nBehavior"
  )
)
```

---

```r
ggdag(coffee_cancer_dag)
```

<img src="02-dags_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" />

---

# Causal effects and backdoor paths

---

# Causal effects and backdoor paths

## **Ok, correlation != causation. But why not?**

---

# Causal effects and backdoor paths

## ~~Ok, correlation != causation. But why not?~~

## **We want to know if `x -> y`...**

---

# Causal effects and backdoor paths

## ~~Ok, correlation != causation. But why not?~~

## ~~We want to know if `x -> y`...~~

## **But other paths also cause associations**

---

# `ggdag_paths()`

## Identify "backdoor" paths

--

```r
ggdag_paths(smk_wt_dag)
```

---

<img src="02-dags_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" />

---

## Your Turn 2

### Call `tidy_dagitty()` on `coffee_cancer_dag` to create a tidy DAG, then pass the results to `dag_paths()`. What's different about these data?

### Plot the open paths with `ggdag_paths()`. (Just give it `coffee_cancer_dag` rather than using `dag_paths()`; the quick plot function will do that for you.) Remember, since we assume there is *no* causal path from coffee to lung cancer, any open paths must be confounding pathways.
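---

# Paths as text

Plots are not the only way to see open paths. As a rough sketch, `dagitty::paths()` lists every path between the declared exposure and outcome and flags which ones are open. The example uses the simpler Step 1 DAG rather than the exercise DAG; `simple_dag` and `coffee_paths` are names assumed for illustration.

```r
library(ggdag)

# First Step 1 DAG with the causal question declared.
# `simple_dag` is a hypothetical name used only on this slide.
simple_dag <- dagify(
  cancer ~ smoking,
  coffee ~ smoking,
  exposure = "coffee",
  outcome = "cancer"
)

# paths() defaults to the declared exposure and outcome
coffee_paths <- dagitty::paths(simple_dag)

coffee_paths$paths  # each path written out as text
coffee_paths$open   # TRUE where the path is open (unblocked)
```

There is no arrow from coffee to cancer in this DAG, so the only path between them runs backwards through smoking, which is what makes it a backdoor path.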
---

## Your Turn 2

```r
coffee_cancer_dag %>%
  tidy_dagitty() %>%
  dag_paths()
```

```
## # A DAG with 4 nodes and 3 edges
## #
## # Exposure: coffee
## # Outcome: cancer
## #
## # A tibble: 5 x 11
##   set   name      x     y direction to     xend  yend
##   <chr> <chr> <dbl> <dbl> <fct>     <chr> <dbl> <dbl>
## 1 1     addi…  25.7  28.1 ->        coff…  24.5  27.9
## 2 1     addi…  25.7  28.1 ->        smok…  27.1  28.2
## 3 1     smok…  27.1  28.2 ->        canc…  28.3  28.3
## 4 1     coff…  24.5  27.9 <NA>      <NA>   NA    NA
## 5 1     canc…  28.3  28.3 <NA>      <NA>   NA    NA
## # … with 3 more variables: circular <lgl>, label <chr>,
## #   path <chr>
```

---

```r
coffee_cancer_dag %>%
  ggdag_paths()
```

<img src="02-dags_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" />

---

# Closing backdoor paths

---

# Closing backdoor paths

## **We need to account for these open, non-causal paths**

---

# Closing backdoor paths

## ~~We need to account for these open, non-causal paths~~

## **Randomization**

---

# Closing backdoor paths

## ~~We need to account for these open, non-causal paths~~

## ~~Randomization~~

## **Stratification, adjustment, weighting, matching, etc.**

---

# Identifying adjustment sets

```r
ggdag_adjustment_set(smk_wt_dag)
```

---

<img src="02-dags_files/figure-html/unnamed-chunk-19-1.png" style="display: block; margin: auto;" />

---

## Your Turn 3

#### Now that we know the open, confounding pathways (sometimes called "backdoor paths"), we need to know how to close them! First, we'll ask {ggdag} for adjustment sets, then we would need to do something in our analysis to account for at least one adjustment set (e.g. multivariable regression, weighting, or matching for the adjustment sets).

#### Use `ggdag_adjustment_set()` to visualize the adjustment sets. Add the arguments `use_labels = "label"` and `text = FALSE`.

#### Write an R formula for each adjustment set, as you might if you were fitting a model in `lm()` or `glm()`
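---

# Adjustment sets as text

`ggdag_adjustment_set()` draws the sets; the same information can be pulled out as plain character vectors. This sketch assumes {dagitty} is installed alongside {ggdag} and again uses the simpler Step 1 DAG, not the exercise DAG. The names `simple_dag`, `sets`, and `model_formula` are illustrative only.

```r
library(ggdag)

# First Step 1 DAG with the causal question declared.
# `simple_dag` is a hypothetical name used only on this slide.
simple_dag <- dagify(
  cancer ~ smoking,
  coffee ~ smoking,
  exposure = "coffee",
  outcome = "cancer"
)

# Minimal sets of variables that close the open backdoor paths
sets <- dagitty::adjustmentSets(simple_dag)
sets

# Turn the first set into a model formula: outcome ~ exposure + adjusters
model_formula <- reformulate(c("coffee", sets[[1]]), response = "cancer")
model_formula
```

A formula built this way could be passed to `lm()` or `glm()`. The DAG only says which variables to adjust for, not which model form is appropriate.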
---

## Your Turn 3

```r
ggdag_adjustment_set(
  coffee_cancer_dag,
  use_labels = "label",
  text = FALSE
)
```

---

<img src="02-dags_files/figure-html/unnamed-chunk-21-1.png" style="display: block; margin: auto;" />

---

## Your Turn 3

```r
cancer ~ coffee + addictive
cancer ~ coffee + smoking
```

---
class: inverse

# Resources: ggdag vignettes

## [An Introduction to ggdag](https://ggdag.malco.io/articles/intro-to-ggdag.html)

## [An Introduction to Directed Acyclic Graphs](https://ggdag.malco.io/articles/intro-to-dags.html)

## [Common Structures of Bias](https://ggdag.malco.io/articles/bias-structures.html)