- Introduction
- Problem Setup
2.1. Causal Graph
2.2. Model With and Without Z
2.3. Strength of Z as a Confounder
- Sensitivity Analysis
3.1. Goal
3.2. Robustness Value
- PySensemakr
- Conclusion
- Acknowledgements
- References
The specter of unobserved confounding (also known as omitted variable bias) is a notorious problem in observational studies. In most observational studies, unless we can reasonably assume that treatment assignment is as-if random, as in a natural experiment, we can never be truly sure that we controlled for all possible confounders in our model. Consequently, our model estimates can be severely biased if we fail to control for an important confounder, and we wouldn't even know it, since the unobserved confounder is, well, unobserved!
Given this problem, it is important to assess how sensitive our estimates are to possible sources of unobserved confounding. In other words, it is a useful exercise to ask ourselves: how much unobserved confounding would there have to be for our estimates to change drastically (e.g., for the treatment effect to no longer be statistically significant)? Sensitivity analysis for unobserved confounding is an active area of research, and there are several approaches to tackling the problem. In this post, I will cover a simple linear method [1] based on the concept of partial R² that is applicable to a wide spectrum of cases.
2.1. Causal Graph
Let us assume that we have four variables:
- Y: outcome
- D: treatment
- X: observed confounder(s)
- Z: unobserved confounder(s)
This is a common setting in many observational studies where the researcher is interested in knowing whether the treatment of interest has an effect on the outcome after controlling for possible treatment-outcome confounders.
In our hypothetical setting, the relationships between these variables are such that X and Z both affect D and Y, but D has no effect on Y. In other words, we are describing a scenario where the true treatment effect is null. As will become clear in the next section, the goal of sensitivity analysis is to be able to reason about this treatment effect when we have no access to Z, as we normally won't, since it is unobserved. Figure 1 visualizes our setup.
Figure 1: Problem Setup
2.2. Model With and Without Z
To demonstrate the problem that our unobserved Z can cause, I simulated some data according to the problem setup described above. You can refer to this notebook for the details of the simulation.
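The sketch below shows a minimal data generating process consistent with the causal graph: X and Z drive both D and Y, while D itself has no effect on Y. The specific coefficients and noise scales here are illustrative assumptions, not necessarily the ones used in the notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000  # the post reports 1000 simulated samples

X = rng.normal(size=n)                      # observed confounder
Z = rng.normal(size=n)                      # unobserved confounder
D = 0.5 * X + 0.5 * Z + rng.normal(size=n)  # treatment: driven by X and Z
Y = 0.5 * X + 0.5 * Z + rng.normal(size=n)  # outcome: note the absence of D

df = pd.DataFrame({"Y": Y, "D": D, "X": X, "Z": Z})
```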
Since Z would be unobserved in real life, the only model we can normally fit to the data is Y ~ D + X. Let us see what results we get if we run that regression.
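Using statsmodels, and assuming the simulated data sits in the DataFrame df from the sketch above, the observable model can be fit as follows (the name res_ydx is reused later in the post):

```python
import statsmodels.formula.api as smf

# The only model available to us in practice, since Z is unobserved
res_ydx = smf.ols("Y ~ D + X", data=df).fit()
print(res_ydx.summary())
```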
Based on these results, it looks like D has a statistically significant effect of 0.2686 (p < 0.001) per one unit change on Y, which we know isn't true given how we generated the data (no D effect).
Now, let's see what happens to our D estimate when we control for Z as well. (In real life, we of course won't be able to run this additional regression since Z is unobserved, but our simulation setting lets us peek behind the curtain at the true data generating process.)
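Within the simulation, the infeasible model that includes Z is just one more regression:

```python
# Peek behind the curtain: refit with the normally unobserved Z included
res_ydxz = smf.ols("Y ~ D + X + Z", data=df).fit()
print(res_ydxz.summary())
```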
As expected, controlling for Z correctly removes the D effect, shrinking the estimate towards zero and giving us a p-value that is no longer statistically significant at the 𝛼=0.05 threshold (p=0.059).
2.3. Strength of Z as a Confounder
At this point, we have established that Z is a strong enough confounder to eliminate the spurious D effect, since the statistically significant D effect disappears once we control for Z. What we haven't discussed yet is exactly how strong Z is as a confounder. For this, we will make use of a useful statistical concept called partial R², which quantifies the proportion of variation that a given variable of interest can explain that can't already be explained by the existing variables in a model. In other words, partial R² tells us the added explanatory power of that variable of interest, above and beyond the other variables already in the model. Formally, it can be defined as follows

partial R² = 1 − RSS_full / RSS_reduced

where RSS_reduced is the residual sum of squares from the model that doesn't include the variable(s) of interest and RSS_full is the residual sum of squares from the model that does include the variable(s) of interest.
In our case, the variable of interest is Z, and we want to know what proportion of the variation in Y and in D Z can explain that can't already be explained by the existing variables. More precisely, we are interested in the following two partial R² values

(1) partial R² of Z with Y: R²_Y~Z|D,X = 1 − RSS(Y ~ D + X + Z) / RSS(Y ~ D + X)
(2) partial R² of Z with D: R²_D~Z|X = 1 − RSS(D ~ X + Z) / RSS(D ~ X)

where (1) quantifies the proportion of variance in Y that can be explained by Z but not by D and X (so the reduced model is Y ~ D + X and the full model is Y ~ D + X + Z), and (2) quantifies the proportion of variance in D that can be explained by Z but not by X (so the reduced model is D ~ X and the full model is D ~ X + Z).
Now, let us see how strongly associated Z is with D and Y in our data in terms of partial R².
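Both quantities can be computed directly from residual sums of squares (the ssr attribute of a fitted statsmodels result), again assuming the simulated df and the fitted res_ydx from the sketches above:

```python
# (1) Partial R² of Z with Y, given D and X
rss_reduced_y = res_ydx.ssr                                # reduced: Y ~ D + X
rss_full_y = smf.ols("Y ~ D + X + Z", data=df).fit().ssr   # full: Y ~ D + X + Z
partial_r2_yz = 1 - rss_full_y / rss_reduced_y

# (2) Partial R² of Z with D, given X
rss_reduced_d = smf.ols("D ~ X", data=df).fit().ssr        # reduced: D ~ X
rss_full_d = smf.ols("D ~ X + Z", data=df).fit().ssr       # full: D ~ X + Z
partial_r2_dz = 1 - rss_full_d / rss_reduced_d

print(f"partial R² (Y ~ Z | D, X): {partial_r2_yz:.2f}")  # 0.16 in the post
print(f"partial R² (D ~ Z | X):   {partial_r2_dz:.2f}")   # 0.20 in the post
```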
It turns out that Z explains 16% of the variation in Y that can't already be explained by D and X (this is partial R² equation (1) above), and 20% of the variation in D that can't already be explained by X (this is partial R² equation (2) above).
3.1. Goal
As we discussed in the previous section, unobserved confounding poses a problem in real research settings precisely because, unlike in our simulation setting, Z cannot be observed. In other words, we are stuck with the model Y ~ D + X, with no way of knowing what our results would have been had we been able to run the model Y ~ D + X + Z instead. So, what can we do?
Intuitively, a reasonable sensitivity analysis approach should be able to tell us that if a Z such as the one in our data were to exist, it would nullify our results. Remember that our Z explains 16% of the variation in Y and 20% of the variation in D that can't be explained by the observed variables. We therefore expect sensitivity analysis to tell us that a hypothetical Z-like confounder of comparable strength would be enough to eliminate the statistically significant D effect.
But how can we calculate that the unobserved confounder's strength needs to be in this 16-20% range on the partial R² scale without ever having access to it? Enter the robustness value.
3.2. Robustness Value
The robustness value (RV) formalizes the idea sketched above: determining the necessary strength of a hypothetical unobserved confounder that could nullify our results. The usefulness of the RV stems from the fact that we only need our observable model Y ~ D + X, and not the unobservable model Y ~ D + X + Z, to calculate it.
Formally, the RV quantifying how strong unobserved confounding needs to be to change the observed statistical significance of the treatment effect can be written as follows (if the notation is too much to follow, just remember the key idea: the RV is a measure of the strength of confounding needed to change our results)

RV_q,𝛼 = ½ ( √(f_q,𝛼⁴ + 4 f_q,𝛼²) − f_q,𝛼² ),   with   f_q,𝛼 = q |t_betahat_treat| / √df − |t*_alpha,df−1| / √(df−1)

where
- 𝛼 is our chosen significance level (typically set to 0.05, i.e., 5%),
- q determines the percent reduction q*100% in significance that we care about (typically set to 1, since we usually care about confounding that could reduce statistical significance by 1*100% = 100%, rendering the result no longer statistically significant),
- t_betahat_treat is the observed t-value of our treatment from the model Y ~ D + X (which is 8.389 in this case, as can be seen from the regression results above),
- df is our degrees of freedom (which is 1000 − 3 = 997 in this case, since we simulated 1000 samples and are estimating 3 parameters including the intercept), and
- t*_alpha,df−1 is the t-value threshold associated with a given 𝛼 and df − 1 (1.96 if 𝛼 is set to 0.05).
We are now ready to calculate the RV in our own data using only the observed model Y ~ D + X (res_ydx).
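A direct translation of the formula into code might look like this, using only quantities available from res_ydx:

```python
import numpy as np
from scipy import stats

q, alpha = 1.0, 0.05

t_treat = res_ydx.tvalues["D"]   # observed t-value of the treatment (8.389)
dof = res_ydx.df_resid           # residual degrees of freedom (997)
t_crit = stats.t.ppf(1 - alpha / 2, dof - 1)  # t* threshold for alpha, df - 1

# f_{q,alpha}: partial Cohen's f of the treatment, net of the threshold
f_q_alpha = q * abs(t_treat) / np.sqrt(dof) - abs(t_crit) / np.sqrt(dof - 1)

# Robustness value formula from Cinelli and Hazlett [1]
rv = 0.5 * (np.sqrt(f_q_alpha**4 + 4 * f_q_alpha**2) - f_q_alpha**2)
print(f"RV = {rv:.3f}")  # 0.184 with the t-value and df reported above
```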
It is no stroke of luck that our RV (18%) falls right within the range of the partial R² values we calculated for Y~Z|D,X (16%) and D~Z|X (20%) above. What the RV is telling us here is that, even without any explicit knowledge of Z, we can still reason that any unobserved confounder needs, on average, at least 18% strength on the partial R² scale vis-à-vis both the treatment and the outcome to be able to nullify our statistically significant result.
The reason why the RV isn't 16% or 20% but falls somewhere in between (18%) is that it is designed to be a single number summarizing the necessary strength of the confounder with both the outcome and the treatment, so 18% makes perfect sense given what we know about the data. You can think about it like this: since the method doesn't have access to the actual numbers 16% and 20% when calculating the RV, it does its best to quantify the strength of the confounder by assigning 18% to both partial R² values (Y~Z|D,X and D~Z|X), which isn't far off from the truth and does a good job of summarizing the confounder's strength.
Of course, in real life we won't have the Z variable to double-check that our RV is correct, but seeing how the two results align here should at least give you some confidence in the method. Finally, once we calculate the RV, we should think about whether an unobserved confounder of that strength is plausible. In our case, the answer is 'yes' because we have access to the data generating process, but in your specific real-life application, the existence of such a strong confounder might be an unreasonable assumption. That would be good news for you, since no realistic unobserved confounder could then drastically change your results.
The sensitivity analysis technique described above has already been implemented, with all of its bells and whistles, as a Python package under the name PySensemakr (R, Stata, and Shiny App versions exist as well). For example, to get the exact same result that we calculated manually in the previous section, we can simply run the following code chunk.
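A sketch of that code chunk, based on the PySensemakr documentation (exact argument names may vary slightly across package versions):

```python
import sensemakr as smkr

sensitivity = smkr.Sensemakr(
    model=res_ydx,             # fitted OLS model Y ~ D + X
    treatment="D",             # treatment variable of interest
    benchmark_covariates="X",  # observed covariate used for benchmarking
    kd=[0.25, 0.5, 1],         # benchmarks 0.25x, 0.5x, and 1x as strong as X
    q=1,                       # nullify 100% of the significance
    alpha=0.05,                # significance level
)
sensitivity.summary()
```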
Note that "Robustness Value, q = 1 alpha = 0.05" is 0.184, which is exactly what we calculated above. In addition to the RV for statistical significance, the package also reports the RV needed for the coefficient estimate itself to shrink to 0. Unsurprisingly, unobserved confounding needs to be even stronger for this to happen (0.233 vs 0.184).
The package also provides contour plots over the two partial R² values, which allow for an intuitive visual display of sensitivity to possible levels of confounding with the treatment and the outcome (in this case, it shouldn't be surprising to see that the x/y-axis value pairs falling on the red dotted line include 0.18/0.18 as well as 0.20/0.16).
One can even add benchmark values to the contour plot as proxies for plausible amounts of confounding. In our case, since we only have one observed covariate X, we can set our benchmarks to be 0.25x, 0.5x, and 1x as strong as that observed covariate. The resulting plot tells us that a confounder half as strong as X would be enough to nullify our statistically significant result (since the "0.5x X" value falls right on the red dotted line); see the sketch below.
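Assuming the Sensemakr object from above (which already carries the 0.25x/0.5x/1x benchmarks), the t-value contour plot can be drawn as follows; the plotting arguments are again based on the package documentation and may differ by version:

```python
# Contour plot of the t-value with the X-based benchmark points overlaid;
# the red dotted line marks where statistical significance is lost
sensitivity.plot(plot_type="contour", sensitivity_of="t-value")
```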
Finally, I would like to note that, while the simulated data in this example used a continuous treatment variable, in practice the method works for any kind of treatment variable, including binary treatments. On the other hand, the outcome variable technically needs to be continuous, since we are working within the OLS framework. However, the method can still be used even with a binary outcome if we model it using OLS (this is known as a linear probability model, or LPM [2]).
The possibility that our effect estimate may be biased due to unobserved confounding is a common hazard in observational studies. Despite this hazard, observational studies remain a vital tool in data science, because randomization simply isn't feasible in many cases. It is therefore important to know how to address the issue of unobserved confounding by running sensitivity analyses to see how robust our estimates are to such potential confounding.
The robustness value method by Cinelli and Hazlett discussed in this post is a simple and intuitive approach to sensitivity analysis formulated in a familiar linear model framework. If you are interested in learning more about the method, I highly recommend looking at the original paper and the package documentation, where you can learn about many more interesting applications of the method, such as 'extreme scenario' analysis.
There are also many other approaches to sensitivity analysis for unobserved confounding, and I would like to briefly mention some of them here for readers who want to continue learning about this topic. One flexible technique is the E-value developed by VanderWeele and Ding, which formulates the problem in terms of risk ratios [3] (implemented in R here). Another technique is the Austen plot developed by Veitch and Zaveri, based on the concepts of partial R² and propensity scores [4] (implemented in Python here), and yet another recent approach is by Chernozhukov et al. [5] (implemented in Python here).
I would like to thank Chad Hazlett for answering my question about using the method with binary outcomes and Xinyi Zhang for providing a lot of valuable feedback on the post. Unless otherwise noted, all images are by the author.
[1] C. Cinelli and C. Hazlett, Making Sense of Sensitivity: Extending Omitted Variable Bias (2019), Journal of the Royal Statistical Society
[2] J. Murray, Linear Probability Model, Murray's personal website
[3] T. VanderWeele and P. Ding, Sensitivity Analysis in Observational Research: Introducing the E-Value (2017), Annals of Internal Medicine
[4] V. Veitch and A. Zaveri, Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding (2020), NeurIPS
[5] V. Chernozhukov, C. Cinelli, W. Newey, A. Sharma, and V. Syrgkanis, Long Story Short: Omitted Variable Bias in Causal Machine Learning (2022), NBER