Statistics for Health Data Science
Welcome to Statistics for Health Data Science
Preamble
Acknowledgements
How to use this book
Overview
1. Introduction
Basic probability
Probability and statistics
2. Discrete Distributions
2.1 Application of Bayes’ Theorem
2.2 The binomial distribution
2.3 The Poisson distribution
2.4 Summary
3. Continuous distributions
3.1 Continuous random variables
3.2 Useful continuous distributions
3.3 Uses of the standard Normal distribution
3.4 Are the data normally distributed?
3.5 Joint distributions and correlations
Statistical Inference
4. Populations and Samples
4.1 Sampling from a population
4.2 Statistical models
4.3 Sampling distributions
4.4 Obtaining the sampling distribution
4.5 Summary
Appendix: Additional Reading
5. Likelihood
5.1 Maximum likelihood estimation
5.2 The likelihood
5.3 Log likelihood
5.4 Finding the MLE
5.5 Summary
6. Maximum Likelihood
6.1 Likelihood with independent observations
6.2 Properties of maximum likelihood estimators
6.3 Summary
Appendix: Additional Reading
7. Frequentist I: Confidence Intervals
7.1 Confidence intervals
7.2 Confidence intervals for the mean
7.3 Interpretation of confidence intervals
7.4 Approximate confidence intervals for parameters estimated using large samples
7.5 Confidence Intervals using resampling
7.6 Summary: Use of confidence intervals
Further resources
8. Frequentist II: Hypothesis tests
8.1 Evidence against hypotheses
8.2 The p-value
8.3 Connection between p-values and confidence intervals
8.4 Other (mis-)interpretations of p-values
8.5 Calculating p-values
Further resources
9. Bayesian Statistics I
9.1 Introduction to Bayesian Inference
9.2 Bayes’ Theorem (recap)
9.3 The Bayesian paradigm in Health data science problems
9.4 Bayes’ theorem for discrete and continuous data
9.5 Bayesian inference on proportions
9.6 Summarising Posteriors
9.7 Prior Predictions
9.8 Conjugacy
10. Bayesian Statistics II: Normal data
10.1 Example: CD4 cell counts
10.2 Calculating the posterior
10.3 Credible Intervals
10.4 Predictions
10.5 Multiparameter models
Further Resources
Statistical modelling
Investigations and the role of regression modelling
11. Types of Investigation
11.1 Specifying research questions
11.2 Different types of investigation
11.3 Properties of different types of investigation
11.4 An example: stroke in women
11.5 Role of explanatory variables in different types of investigation
11.6 Summary
References
12. Linear Regression I
12.1 Introduction
12.2 Data used in our examples
12.3 The simple linear regression model
12.4 Estimation of the population parameters
12.5 Example: continuous independent variable
12.6 Inference for the slope
12.7 Example: binary independent variable
12.8 Additional material
13. Linear Regression II
13.1 Categorical independent variables
13.2 Multivariable linear regression
13.3 Including multiple covariates
13.4 Centering
13.6 Including higher-order terms
13.7 Modelling interaction terms
14. Linear Regression III
14.1 Assumptions
14.2 Investigating assumptions using plots
14.2 Statistical tests of assumptions
14.3 Dealing with violations of assumptions
14.5 Collinearity
14.6 Optional Reading: Analysis of Variance
14.7 Proofs
15. Logistic Regression
15.1 Regression modelling for binary outcomes
15.2 Data used in our examples
15.3 The logistic regression model
15.4 Estimating the parameters
15.5 Examples
15.6 Inference
15.7 Multivariable logistic regression
15.8 Interactions and higher-order terms
15.9 Model diagnostics
15.12 Common pitfalls
15.13 Further resources
15.14 Additional reading
16. Generalised Linear Models (GLMs)
16.1 Introduction to Generalised Linear Models (GLMs)
16.2 Generalised Linear Model Components
16.3 GLM Assumptions
16.4 Link Functions
16.5 Programming GLMs in R
16.6 Introduction to Poisson Generalised Linear Modelling (Poisson Regression)
16.7 Poisson Regression Example
16.8 Common Problems in Poisson Regression
17. The role of regression in different types of investigation
Statistics and Health Data Science
Index