```{r}
library(openxlsx)
library(dplyr)
library(skimr)
library(gtsummary)
library(finalfit)
library(ggplot2)
library(ggthemes)
library(networkD3) # For alluvial/Sankey diagrams
library(tidyverse)
```
19 :books: Malaria case study
19.1 Introduction
19.1.1 Overview
These pages demonstrate how to use Quarto to analyse consultation data from Tanzania.
19.1.2 Learning objectives
- Apply what you learnt on Day 1 to real data
19.2 Getting started
19.2.1 Access the Quarto template
Download the Quarto template used for this case study (add link) from GitHub.
Before starting, please review the previous sections on Quarto, data import and data manipulation.
19.2.2 Install packages
```{r}
# Install the additional packages used in this case study (only needs to be run once)
install.packages(c("ggplot2", "ggthemes", "networkD3", "apyramid"))
```
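Re-running `install.packages()` every time the document is rendered is slow, so a common pattern is to install only what is missing. The sketch below is one possible way to do this; `missing_packages()` is a hypothetical helper written for this illustration, not part of any package, and the `interactive()` guard is a design choice that assumes you do not want rendering to trigger installations.

```{r}
# Packages installed for this case study
pkgs <- c("ggplot2", "ggthemes", "networkD3", "apyramid")

# Hypothetical helper: return the names of the packages that are not yet installed.
# requireNamespace() returns FALSE (instead of raising an error) when a package is absent.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(pkgs)

# Only install from an interactive session, never while the document is rendered
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Run the chunk once from the console; on later runs `to_install` should be empty and nothing is downloaded.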
19.2.3 Dataset description
We will use data from real consultations that took place in Tanzania between 2021-07-29 and 2021-12-17 within the Tools for Integrated Management of Childhood Illness (TIMCI) project.
Data are made available by the Ifakara Health Institute (IHI) for training purposes only. Please note that some data have been adapted to best achieve the training objectives. No personally identifiable information has been kept in this dataset.
The dataset contains information about the consultations of 10,308 children aged 1 day to 59 months from 18 facilities (dispensaries and health centres) in Kaliua District, Sengerema District and Tanga District, Tanzania.
19.2.4 Data collection
Data were collected using ODK (ODK Collect, ODK Central) between 2021-07-29 and 2021-12-17. Research assistants recorded the following information from different sources.
Information | Prefix | Source |
---|---|---|
Context | CTX | Metadata |
Sociodemographics | SDC | Caregiver |
Clinical presentation | CLIN | Caregiver |
Laboratory investigations | TEST | Child booklet or facility MTUHA book |
Diagnoses | DX | Child booklet or facility MTUHA book |
Treatments | RX | Caregiver; child booklet or facility MTUHA book |
Referral advice | MGMT | Caregiver; child booklet or facility MTUHA book |
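Because each variable name starts with the prefix of its information group, groups of variables can be selected by prefix. The following is a minimal sketch on a toy data frame: the column names follow the prefixes above, but the object name `consultations` and all values are invented for illustration and are not taken from the real dataset.

```{r}
library(dplyr)

# Toy data frame: column names use the documented prefixes, values are invented
consultations <- data.frame(
  SDC_age_in_month          = c(3, 27, 14),
  SDC_sex                   = c(1, 2, 2),
  CLIN_fever                = c(1, 0, 98),
  CLIN_cough                = c(0, 1, 1),
  RX_preconsult_antibiotics = c(0, 0, 1)
)

# Keep only the clinical presentation variables (prefix CLIN)
clinical <- consultations %>% select(starts_with("CLIN_"))
names(clinical)

# Count how many variables were recorded under each prefix
# (the prefix is everything before the first underscore)
prefix_counts <- table(sub("_.*$", "", names(consultations)))
prefix_counts
```

The same `starts_with()` call works for any other prefix, for example `"SDC_"` for sociodemographic variables.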
19.2.5 Data preparation
Data cleaning and data de-identification
Personally identifiable information (PII) was removed.
19.3 Population characteristics
19.3.1 Codebook
Variable | Coding |
---|---|
SDC_age_in_month | |
SDC_sex | 1: male; 2: female; 98: unknown |
CLIN_fever | 0: no; 1: yes; 98: not sure |
CLIN_fever_onset | |
CLIN_cough | 0: no; 1: yes; 98: not sure |
CLIN_diarrhoea | 0: no; 1: yes; 98: not sure |
RX_preconsult_antibiotics | |
RX_preconsult_antimalarials | |
CONSULT_district | Kaliua; Sengerema; Tanga |
CONSULT_area | urban; rural |
CONSULT_facility_type | dispensary; health centre |
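Coded variables are easier to read once the numeric codes are replaced by their labels. As a hedged sketch, the codes can be mapped to labels with `factor()`; the toy vectors below are invented, and only the code-to-label mappings come from the codebook.

```{r}
# Toy vectors using the codes listed in the codebook (values are invented)
sex_codes   <- c(1, 2, 2, 98, 1)
fever_codes <- c(0, 1, 98, 1, 0)

# Map each code to its codebook label
sex <- factor(sex_codes,
              levels = c(1, 2, 98),
              labels = c("male", "female", "unknown"))
fever <- factor(fever_codes,
                levels = c(0, 1, 98),
                labels = c("no", "yes", "not sure"))

# Frequency tables now show labels instead of codes
table(sex)
table(fever)
```

Any code that is not listed in `levels` becomes `NA`, which makes unexpected values easy to spot.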
How many variables are numerical? How many are categorical?
A numerical variable is a quantity represented by a real or integer number.
A categorical variable takes discrete values from a finite list of possible choices, typically (but not only) represented by string labels.
For instance, the variable CONSULT_district in our dataset is categorical because it encodes the data using a finite list of possible districts (Kaliua, Sengerema and Tanga).
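One way to answer the question is to count columns by type. This is a minimal sketch on a toy data frame with invented values; it assumes that categorical variables are stored as factors or character strings.

```{r}
# Toy data frame (invented values); categorical variables are stored as factors
df <- data.frame(
  SDC_age_in_month = c(3, 27, 14),
  SDC_sex          = factor(c("male", "female", "female")),
  CONSULT_district = factor(c("Kaliua", "Tanga", "Sengerema")),
  CONSULT_area     = factor(c("rural", "urban", "rural"))
)

# Flag each column as numerical or categorical
is_num <- vapply(df, is.numeric, logical(1))
is_cat <- vapply(df, function(x) is.factor(x) || is.character(x), logical(1))

n_numerical   <- sum(is_num)
n_categorical <- sum(is_cat)

n_numerical
n_categorical
```

Note that a categorical variable still stored as numeric codes (for example 1, 2 and 98) would be counted as numerical by `is.numeric()`, so the storage type alone is not enough: check the codebook as well.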