11.4 An example: stroke in women

Table 1 illustrates how the three types of investigation (description, prediction and causal inference) differ, using stroke in women as the common topic. For each type, the table (taken from Hernan et al. 2019) gives an example research question, the data that would be required to answer it, and the kinds of analysis that could be used.

Table 1: From Hernan, Hsu & Healy 2019. Examples of Tasks Conducted by Data Scientists Working with Electronic Health Records

| | Description | Prediction | Causal inference |
|---|---|---|---|
| Example of scientific question | How can women aged 60-80 years with stroke history be partitioned in classes defined by their characteristics? | What is the probability of having a stroke next year for women with certain characteristics? | Will starting a statin reduce, on average, the risk of stroke in women with certain characteristics? |
| Data | Eligibility criteria; Features (symptoms, clinical parameters …) | Eligibility criteria; Output (diagnosis of stroke over the next year); Inputs (age, blood pressure, history of stroke, diabetes at baseline) | Eligibility criteria; Outcome (diagnosis of stroke over the next year); Treatment (initiation of statins at baseline); Confounders; Effect modifiers (optional) |
| Example of analytics | Cluster analysis | Regression; Decision trees; Random forests; Support vector machines; Neural networks | Regression; Matching; Inverse probability weighting; G-formula; G-estimation; Instrumental variable estimation |
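
To make the causal inference column more concrete, the following is a minimal sketch of one of the listed analytics, inverse probability weighting, applied to the statin question. It is not taken from Hernan et al. and rests on several assumptions: the data are entirely synthetic, there is a single binary confounder (a hypothetical `high_bp` indicator), and the function names and all numeric parameters are made up for illustration. Because the confounder is binary, the propensity score is estimated simply as the proportion treated within each confounder level rather than with a regression model.

```python
import random


def simulate(n=100_000, seed=0):
    """Return synthetic (high_bp, statin, stroke) rows. All parameters are illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        high_bp = 1 if rng.random() < 0.4 else 0            # confounder
        p_statin = 0.7 if high_bp else 0.2                  # confounder affects treatment
        statin = 1 if rng.random() < p_statin else 0        # treatment at baseline
        # Confounder raises stroke risk; statin lowers it by 0.03 (the assumed true effect).
        p_stroke = 0.05 + 0.10 * high_bp - 0.03 * statin
        stroke = 1 if rng.random() < p_stroke else 0        # outcome over follow-up
        rows.append((high_bp, statin, stroke))
    return rows


def naive_risk_difference(rows):
    """Unadjusted risk in the treated minus risk in the untreated."""
    treated = [y for _, a, y in rows if a == 1]
    untreated = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


def ipw_risk_difference(rows):
    """Risk difference from inverse probability weighting with normalized weights."""
    # Propensity score P(statin = 1 | high_bp), estimated within each confounder level.
    propensity = {}
    for level in (0, 1):
        stratum = [a for l, a, _ in rows if l == level]
        propensity[level] = sum(stratum) / len(stratum)

    num1 = den1 = num0 = den0 = 0.0
    for l, a, y in rows:
        if a == 1:
            w = 1.0 / propensity[l]           # weight for treated individuals
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - propensity[l])   # weight for untreated individuals
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


if __name__ == "__main__":
    data = simulate()
    print("naive risk difference:", round(naive_risk_difference(data), 4))
    print("IPW risk difference:  ", round(ipw_risk_difference(data), 4))
```

Under these made-up parameters, the unadjusted contrast is positive in expectation (about +0.02), because women with high blood pressure are both more likely to receive statins and more likely to have a stroke. The weighted estimate should instead land near the assumed true effect of -0.03. This is the sense in which the causal inference task needs confounders in its data, whereas the prediction task only needs inputs and an output.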