Introduction to statistics with R (beginners)

Biostatistics: Introduction to biostatistics (1st level)

This course is part of the Plateforme Biostatistique de Toulouse training session: “Initiation à LA statistique avec R”. The first session will be held on September 21-23, 2020, and the second is scheduled for March 29-31.

The content of this course is basic statistics, illustrated with the programming language R. The course covers the following topics:

  • exploratory statistics in one and two dimensions, including plots (Nathalie Vialaneix and Sandrine Laguerre)

  • statistical inference and statistical tests (Nathalie Vialaneix and Sandrine Laguerre)

  • PCA and clustering (Sébastien Déjean and Jérome Mariette)

This page gathers information about the course and material to download.

Please contact Nathalie Vialaneix with any questions, including questions about technical settings.

Install R

For this course, R, RStudio (with the ability to compile R Markdown files) and a few packages must be installed on your personal computer prior to the beginning of the course. The installation steps are described below.

Do not hesitate to contact me (email preferred) if you run into a problem during the installation. When reporting a problem, please describe the error message precisely (a screenshot is a plus).

Install R (preferably version 4.0 or higher)

R can be downloaded for free from the official repository website. Choose the version that matches your OS (Windows, Linux or Mac). Mac users should probably also install tcltk, which is available in the section called tools. Some Linux users might also find R in their distribution repositories (this is the case for Ubuntu and Debian users; further details are provided on this page, third bullet point).
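
Once R is installed, the version can be checked from the R console. The lines below are a minimal sketch using base R only; the 4.0.0 threshold simply mirrors the recommendation above, and the variable name r_is_recent is chosen for this example.

```r
# Print the full version string of the running R installation
print(R.version.string)

# Compare the running version with the recommended minimum (4.0.0)
r_is_recent <- getRversion() >= "4.0.0"

if (r_is_recent) {
  message("R version is 4.0 or higher.")
} else {
  message("R version is older than 4.0: consider updating.")
}
```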
   

Install RStudio

RStudio (Desktop version) can be downloaded for free at this link. Choose the version ("Installers" preferred) that matches your OS (Windows, Linux or Mac). Ubuntu users can install the .deb file with
sudo dpkg -i rstudio-XX.deb
sudo apt-get install -f
      
To be sure that you can compile R Markdown files, open RStudio, click on New / RMarkdown file and then click on the "knit" button. If some of the packages needed to knit your file are not installed, you should be prompted to install them.
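
The same check can be sketched from the RStudio console. This is only a complement to the "knit" button test, assuming that the rmarkdown package and the pandoc converter are the two pieces most likely to be missing; the variable names are chosen for this example.

```r
# TRUE if the rmarkdown package is installed and its namespace can be loaded
has_rmarkdown <- requireNamespace("rmarkdown", quietly = TRUE)

# pandoc is the converter used by rmarkdown; only queried if rmarkdown is present
has_pandoc <- has_rmarkdown && rmarkdown::pandoc_available()

if (has_rmarkdown && has_pandoc) {
  message("rmarkdown and pandoc were found: knitting should work.")
} else {
  message("rmarkdown or pandoc was not found: try the 'knit' button in ",
          "RStudio and accept the proposed installations.")
}
```

Run it inside RStudio rather than in a plain R terminal, since pandoc may not be found outside RStudio even when knitting works there.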

Install required CRAN packages

The following packages (available on CRAN) will be required:
  • RColorBrewer
  • FactoMineR
They are installed (with dependencies) using:
install.packages(c("RColorBrewer", "FactoMineR"))
      
We also recommend that you check the installation with
library("...")
      
where ... is a package name.
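
To check both packages in one go, something like the following can be used. It is a sketch, not an official procedure: check_packages() is a hypothetical helper defined here for illustration, and it relies on requireNamespace(), which tests whether a package can be loaded without attaching it.

```r
# Packages required for the course (from the list above)
required <- c("RColorBrewer", "FactoMineR")

# Hypothetical helper: returns a named logical vector,
# TRUE where the package namespace can be loaded
check_packages <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

status <- check_packages(required)
print(status)

# List the packages that still need to be installed, if any
missing_pkgs <- names(status)[!status]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```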

Special warning for INRAE users: in various units of INRAE Toulouse, the installation settings are such that your personal R library is located in a remote folder. When you are not on-site, this can result in errors or delays with installed packages. If you intend to follow the course from home, carefully check that the packages load properly (with the library command as stated above) after a reboot of your computer.
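
To see where R looks for installed packages, the library folders can be listed from the console. This is a minimal sketch using base R only; it cannot tell whether a folder is remote, so the printed paths have to be inspected by eye (a network drive letter or a server path is a hint).

```r
# Folders in which R looks for installed packages;
# the first one is the default target of install.packages()
lib_paths <- .libPaths()
print(lib_paths)

# file.access(..., mode = 2) returns 0 when the folder is writable
writable <- file.access(lib_paths, mode = 2) == 0
print(data.frame(path = lib_paths, writable = writable))
```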

Material for the class

Download the material and have it ready on your computer for the class.

  1. Course material
  2. Datasets. The class will be illustrated with the following datasets:
    • for the first two days: stages_TDF.csv, a dataset describing Tour de France stages. It originates from Kaggle, where its complete description is available. Be careful to download this file directly (with a right click) and not to open it with Excel!
    • for the third day: the three CSV files athle_records.csv, body_full.csv and body_light.csv (which must not be opened with Excel either).
  3. Material for the practical session
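
Instead of Excel, the downloaded files can be inspected directly in R. The sketch below assumes that stages_TDF.csv was saved in the current working directory and that it is comma-separated with a header line (neither is confirmed on this page); read_course_csv() is a hypothetical helper written for this example.

```r
# Hypothetical helper: reads a CSV file if it exists, returns NULL otherwise
read_course_csv <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path, " (working directory: ", getwd(), ")")
    return(NULL)
  }
  # read.csv() assumes comma-separated values with a header line
  read.csv(path)
}

stages <- read_course_csv("stages_TDF.csv")
if (!is.null(stages)) {
  str(stages)          # column names and types
  print(head(stages))  # first rows
}
```

If the file is reported as not found, setwd() can be used to move to the download folder, or the full path can be passed to the helper.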