About this document

Content

This document offers an introduction to the statistical language R and the integrated development environment RStudio built on top of it. It is aimed at non-statisticians working in research, especially biomedical research and the life sciences, and does not require previous familiarity with R or RStudio.

This is not an introduction to statistics: I assume that the reader is familiar with basic statistical tools and concepts, including descriptive statistics, both numerical (means, medians etc.) and graphical (histograms, boxplots etc.), classical hypothesis tests (t-tests etc.), and linear regression. On the other hand, nothing beyond these basics is required for most parts: while some specialized sections deal with concepts like odds ratios and extensions of linear regression widely used in epidemiology, this material is fairly self-contained and can be skipped if not of interest.

R is one of the great success stories of scientific open source development. Its large and active community of users and developers has created a wide range of freely available introductory material, from one-page cheat-sheets to full books. So why have one more? Based on the requirements for my course, but also on personal preferences, this introduction offers a combination of the following features:

  • no recapitulation of basic statistics,
  • introduction of the R command line from scratch,
  • emphasis on base R, as opposed to many of the tidyverse extensions and replacements,
  • more weight on R command line functionality compared to the RStudio GUI,
  • focus on scripting for data analysis, as opposed to package development,
  • discussion of organising data, code, and workflow in the context of a scientific study,
  • focus on generating output for scientific publications.

This is definitely not the shortest introduction to getting productive with RStudio as fast as possible. The goal is to provide an understanding of how R works and how it can be used in (epidemiological) research. This gives the reader a context for broad applications and a foundation for extending their knowledge beyond the content presented here, according to their needs and interests.

Status

These notes are under development, and incomplete even for the simple purpose of accompanying the motivating course. Suggestions, comments and constructive criticism are welcome, and will be used to improve the product.

Version: 0.8.8