Appendix C - Technical resources for intermediate to advanced data scientists

C.1 R programming

C.1.1 Tidyverse

tidyverse Collection of R packages designed for data science
Link Recommendation
Introduction to the Tidyverse (JHU Data Science Lab’s lectures) ⭐ ⭐ ⭐ ⭐ ⭐
Importing data in the Tidyverse (JHU Data Science Lab’s lectures) ⭐ ⭐ ⭐ ⭐ ⭐
Wrangling data in the Tidyverse
Visualizing data in the Tidyverse
Modeling Data in the Tidyverse
Link
Advanced R Programming
Building R packages
Building Data Visualization Tools
Advanced R and solutions

C.1.2 Spatial analyses

Link Recommendation
afrimapr - mapping data in Africa
https://afrimapr.github.io/afrimapr.website/
https://geocompr.robinlovelace.net/
https://www.paulamoraga.com/book-geospatial/
https://andysouth.shinyapps.io/afrilearnr-crash-course/
https://andysouth.shinyapps.io/intro-to-spatial-r/#section-summary

C.1.3 Literature reviews

metaverse Collection of R packages designed for evidence synthesis
litsearchr Excellent R package for supporting evidence synthesis generation

There are very well documented vignettes/tutorials:

  • https://luketudge.github.io/litsearchr-tutorial/litsearchr_tutorial.html
  • https://elizagrames.github.io/litsearchr/litsearchr_vignette.html
  • https://elizagrames.github.io/litsearchr/introduction_vignette_v010.html

C.1.4 Multilanguage programming

Link Language Recommendation
Harvard Data Science workshops materials R, Python, Stata

C.1.5 Machine learning

MOOC Language Recommendation
Machine learning in Python with scikit-learn Python ⭐ ⭐ ⭐ ⭐ ⭐
Machine learning for healthcare

Supervised Machine Learning for Text Analysis in R

https://ocw.mit.edu/courses/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/

https://ocw.mit.edu/courses/6-438-algorithms-for-inference-fall-2014/

C.1.6 Comprehensive lectures

Link Tools Recommendation Pricing
Principles, Statistical and Computational Tools for Reproducible Data Science

R and RStudio

Python, Git, and GitHub

  • Free (audit track)

  • 99 USD (verified track)

Collaborative Data Science for Healthcare Experience with R, Python and/or SQL is required
  • Free (audit track)

  • 49 USD (verified track)

https://www.coursera.org/learn/data-public-health

C.2 Data collection workflows

ruODK
sitrep
REDCap+R
REDCapTidieR

C.2.1 Python programming

C.3 Books and websites

C.3.1 Git and GitHub

Link Recommendation
GitHub Guides
Happy Git and GitHub for the useR

C.3.2 Reproducible research

Books and websites
Link Language Recommendation
R for applied epidemiology and public health (2) R ⭐ ⭐ ⭐ ⭐ ⭐
R for Epidemiology R
R4epis R
R for Data Science R ⭐ ⭐ ⭐ ⭐ ⭐
Quarto for scientists R
Tidyverse Skills for Data Science (jhudatascience.org) R

C.4 Communities

Conferences

https://www.r-project.org/conferences/ Generally held in June-July. Fees for Tanzanian participants are around 6 USD per person.

C.5 Statistics

https://rmisstastic.netlify.app/

grf

C.6 Quarto

Styling PDF documents with Quarto extensions - April 2024

Resource from Nicola Rennie