Public Health Insights: Clinical Reporting Using R

R for Public Health Workshop

🗓 Friday & Saturday, February 9 & 10, 2024 | 9:00pm - 11:00pm

🏨 Virtual

💥 Register with Google Form

📝 To register for the workshop, follow the instructions in the email “Workshops” you received after registration.

Workshop Overview

Join us for an engaging and informative workshop on “Public Health Insights: Clinical Reporting Using {gtsummary}.” This workshop is designed to equip public health professionals, researchers, and analysts with the skills and knowledge to effectively utilize the gtsummary package for creating comprehensive clinical reports. gtsummary is a powerful tool that streamlines the process of summarizing data, making it an invaluable asset for deriving meaningful insights and driving evidence-based decisions in public health contexts.

Through hands-on demonstrations and interactive sessions, participants will learn how to leverage gtsummary to generate visually appealing and interpretable clinical reports. The workshop will cover essential concepts, best practices, and practical examples that demonstrate the application of gtsummary in real-world public health scenarios. Whether you’re a seasoned data analyst or new to the world of data reporting, this workshop will provide you with the tools and techniques needed to elevate your clinical reporting capabilities.

What is gtsummary?

The gtsummary package provides an elegant and flexible way to create publication-ready summary tables in R.

A critical part of the work of statisticians, data scientists, and analysts is summarizing data sets and regression models in R and publishing or sharing polished summary tables.

The gtsummary package was created to streamline these everyday analysis tasks by allowing users to easily create reproducible summaries of data sets, regression models, survey data, and survival data with a simple interface and very little code.

The package follows a tidy framework, making it easy to integrate with standard data workflows, and offers many table customization features through function arguments, helper functions, and custom themes.
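As a rough illustration of the three customization routes mentioned above (function arguments, helper functions, and themes), here is a minimal sketch. It assumes the `trial` example data set that ships with gtsummary stands in for real study data, and the label, header, and caption text are made up for the example.

```r
library(gtsummary)

# Theme: a compact layout applied to every table created afterwards
theme_gtsummary_compact()

# Assumption: `trial` (bundled with gtsummary) stands in for the workshop data
custom_tbl <- trial |>
  tbl_summary(
    by = trt,                            # one column per treatment arm
    include = c(age, grade),             # variables to summarize
    label = list(age ~ "Age (years)"),   # function argument: custom row label
    missing = "no"                       # hide the "Unknown" rows
  ) |>
  bold_labels() |>                       # helper function: bold variable labels
  modify_header(label = "**Characteristic**") |>
  modify_spanning_header(all_stat_cols() ~ "**Treatment received**") |>
  modify_caption("**Patient characteristics by treatment**")

# Return to the default theme so later tables are unaffected
reset_gtsummary_theme()

# Print `custom_tbl` in the console or an R Markdown chunk to render it
```

The native pipe `|>` needs R 4.1 or later; the `%>%` operator works the same way here.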

Learning objectives

Understand the importance of standardized clinical reporting in public health research and practice.
Learn the basics of the gtsummary package and its capabilities for creating summary tables in R.
Explore various data formats and structures that can be used with gtsummary.
Learn how to prepare and clean data for clinical reporting using {gtsummary}.
Gain proficiency in creating common clinical summary tables, such as frequency tables, cross-tabulations, and stratified tables.
Understand how to incorporate descriptive statistics, including means, medians, and proportions, into {gtsummary} tables.
Learn how to customize table formatting, including titles, captions, footnotes, and table themes.
Explore advanced features of gtsummary for creating complex tables, including multi-variable summaries and interaction tables.
Understand the role of regression model summaries in clinical reporting and learn how to generate these summaries using gtsummary.

Is this course for me?

If your answer to any of the following questions is “yes”, then this is the right workshop for you.

Do you make summary tables in R (data, survey data, regression models, time-to-event data, adverse event reports)?
Do you want your workflow to be reproducible?
Are you often frustrated with the immense amount of code required to create great-looking tables in R?

The workshop is designed for those with some experience in R. It will be expected that you can perform basic data manipulation. Experience with the {tidyverse} and the %>% operator is a plus, but not required.

What will you learn?

Descriptive Tables

Compute descriptive statistics for continuous, categorical, and dichotomous variables in R and present the results in a customizable, publication-ready summary table (for example, Table 1 or demographic tables).

Table 1. General characteristics of the study participants (N = 680)
Characteristic	N = 680¹
Age	21.80 (2.99)
Gender
Female	301 (44%)
Male	379 (56%)
Marital Status
Married	52 (7.6%)
Unmarried	628 (92%)
Field of Study
Arts and Humanities	88 (13%)
Business	85 (13%)
Science	445 (65%)
Social science	62 (9.1%)
Year of Study
1st year	266 (39%)
2nd year	131 (19%)
3rd year	115 (17%)
4th year	114 (17%)
Masters	54 (7.9%)
Do you know about thalassemia?
No	70 (10%)
Yes	610 (90%)
¹ Mean (SD); n (%)
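A table in the style of Table 1 could be produced along the following lines. This is only a sketch: the survey data behind Table 1 is not part of this page, so it uses the `trial` example data set bundled with gtsummary, and the chosen variables are placeholders.

```r
library(gtsummary)

# Assumption: `trial` (bundled with gtsummary) stands in for the survey data
table1 <- trial |>
  tbl_summary(
    include = c(age, grade, stage, response),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",   # Mean (SD), as in the footnote above
      all_categorical() ~ "{n} ({p}%)"      # n (%)
    ),
    digits = all_continuous() ~ 2,          # two decimals for mean and SD
    missing = "no"                          # hide the "Unknown" rows
  ) |>
  modify_caption("**Table 1. General characteristics of the study participants**")

# Print `table1` in the console or an R Markdown chunk to render it
```

Dichotomous variables such as `response` are shown on a single row by default, which keeps demographic tables short.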

Comparative Tables

Comparative tables are a type of analytical table used to present and compare data across different variables, categories, or entities. These tables are commonly employed to highlight similarities, differences, and relationships between data points. They allow for a concise and structured way to showcase information side by side, making it easier for the audience to draw insights and conclusions.

Table 2. Level of knowledge of thalassemia among participants who had heard about thalassemia (N = 610)
Characteristic	Overall, N = 610¹	Good, N = 156¹	Poor, N = 454¹
Age	21.79 (3.06)	21.85 (1.85)	21.77 (3.38)
Gender
Female	274 (45%)	64 (41%)	210 (46%)
Male	336 (55%)	92 (59%)	244 (54%)
Marital Status
Married	48 (7.9%)	14 (9.0%)	34 (7.5%)
Unmarried	562 (92%)	142 (91%)	420 (93%)
Field of Study
Arts and Humanities	69 (11%)	17 (11%)	52 (11%)
Business	72 (12%)	16 (10%)	56 (12%)
Science	414 (68%)	115 (74%)	299 (66%)
Social science	55 (9.0%)	8 (5.1%)	47 (10%)
Year of Study
1st year	240 (39%)	55 (35%)	185 (41%)
2nd year	114 (19%)	25 (16%)	89 (20%)
3rd year	101 (17%)	26 (17%)	75 (17%)
4th year	107 (18%)	35 (22%)	72 (16%)
Masters	48 (7.9%)	15 (9.6%)	33 (7.3%)
¹ Mean (SD); n (%)
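The side-by-side layout of Table 2 (an overall column plus one column per group) can be sketched with the `by` argument and `add_overall()`. The example assumes the bundled `trial` data, with the treatment arm `trt` standing in for the knowledge level used in Table 2.

```r
library(gtsummary)

# Assumption: `trial$trt` plays the role of the grouping variable
table2 <- trial |>
  tbl_summary(
    by = trt,                               # one column per group
    include = c(age, grade, stage),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    missing = "no"
  ) |>
  add_overall()                             # add an "Overall" column for all rows

# Print `table2` in the console or an R Markdown chunk to render it
```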

Analytical Tables

Analytical tables typically refer to organized and structured data representations used for analysis, comparison, and interpretation of information. They are commonly used in various fields such as statistics, research, business, and academia. Analytical tables help present data in a clear and concise manner, making it easier to identify patterns, trends, and relationships among variables.

Table 3. Level of knowledge of thalassemia among university students (N = 610)
Characteristic	Overall, N = 610¹	Good, N = 156¹	Poor, N = 454¹	p-value²
Age	21.79 (3.06)	21.85 (1.85)	21.77 (3.38)	0.14
Gender				0.3
Female	274 (45%)	64 (41%)	210 (46%)
Male	336 (55%)	92 (59%)	244 (54%)
Marital Status				0.6
Married	48 (7.9%)	14 (9.0%)	34 (7.5%)
Unmarried	562 (92%)	142 (91%)	420 (93%)
Field of Study				0.2
Arts and Humanities	69 (11%)	17 (11%)	52 (11%)
Business	72 (12%)	16 (10%)	56 (12%)
Science	414 (68%)	115 (74%)	299 (66%)
Social science	55 (9.0%)	8 (5.1%)	47 (10%)
Year of Study				0.3
1st year	240 (39%)	55 (35%)	185 (41%)
2nd year	114 (19%)	25 (16%)	89 (20%)
3rd year	101 (17%)	26 (17%)	75 (17%)
4th year	107 (18%)	35 (22%)	72 (16%)
Masters	48 (7.9%)	15 (9.6%)	33 (7.3%)
¹ Mean (SD); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
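Adding a p-value column like the one in Table 3 is usually a single extra step, `add_p()`. The sketch below again assumes the bundled `trial` data in place of the study data. Depending on the installed gtsummary version, R may ask you to install additional helper packages the first time `add_p()` runs.

```r
library(gtsummary)

# Assumption: `trial$trt` plays the role of the grouping variable
table3 <- trial |>
  tbl_summary(
    by = trt,
    include = c(age, grade, stage),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    missing = "no"
  ) |>
  add_overall() |>
  add_p()          # tests are chosen from the variable type; see the table footnote

# Print `table3` in the console or an R Markdown chunk to render it
```

The tests that were actually applied are listed in the rendered table's footnote, so it is worth checking that they suit your data before reporting them.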

Univariate Regression Tables

Univariate regression fits a separate model for each independent variable (predictor) against a single dependent variable (response). Regression tables present the results of these models, including coefficients, confidence intervals, significance levels, and other relevant statistics. The example below reports odds ratios from logistic regression models:

Characteristic	N	Event N	OR¹	95% CI¹	p-value	q-value²
Age	183	58	1.02	1.00, 1.04	0.091	0.18
Grade	193	61			0.93	0.93
I			—	—
II			0.95	0.45, 2.00
III			1.10	0.52, 2.29
¹ OR = Odds Ratio, CI = Confidence Interval
² False discovery rate correction for multiple testing
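A table with the same columns as the one above could be built with `tbl_uvregression()`, which fits one model per predictor. This is a sketch under assumptions: it uses the bundled `trial` data with `response` as the binary outcome, and `add_global_p()` relies on the {car} package from the prework list.

```r
library(gtsummary)

# Assumption: `trial$response` is the binary outcome; age and grade are predictors
uv_table <- trial |>
  tbl_uvregression(
    y = response,                           # outcome shared by every model
    include = c(age, grade),                # one logistic model per predictor
    method = glm,
    method.args = list(family = binomial),
    exponentiate = TRUE,                    # report odds ratios, not log-odds
    pvalue_fun = function(x) style_pvalue(x, digits = 2)
  ) |>
  add_nevent() |>      # number of events for each model
  add_global_p() |>    # one p-value per variable (needs the {car} package)
  add_q()              # false discovery rate correction for multiple testing

# Print `uv_table` in the console or an R Markdown chunk to render it
```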

Multivariate Regression Tables

A multivariate regression table provides a summary of the results obtained from a multivariate regression analysis. This type of analysis involves multiple independent variables (predictors) and a single dependent variable.

Characteristic	OR¹	95% CI¹	p-value
Age	1.02	1.00, 1.04	0.087
T Stage			0.62
T1	—	—
T2	0.58	0.24, 1.37
T3	0.94	0.39, 2.28
T4	0.79	0.33, 1.90
¹ OR = Odds Ratio, CI = Confidence Interval
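For a model with several predictors, the usual pattern is to fit the model first and then pass it to `tbl_regression()`. The sketch below assumes the bundled `trial` data and a logistic model of `response` on `age` and `stage`. It illustrates the workflow and is not necessarily the exact code behind the table above.

```r
library(gtsummary)

# Assumption: logistic regression on the bundled `trial` data
fit <- glm(response ~ age + stage, data = trial, family = binomial)

mv_table <- fit |>
  tbl_regression(
    exponentiate = TRUE,                    # report odds ratios, not log-odds
    pvalue_fun = function(x) style_pvalue(x, digits = 2)
  ) |>
  add_global_p()       # one p-value per variable (needs the {car} package)

# Print `mv_table` in the console or an R Markdown chunk to render it
```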

Prework

Before attending the workshop, please have the following installed and configured on your machine.

Recent version of R
Recent version of RStudio

The most recent release of gtsummary and the other packages used in the workshop:

# Packages used in the workshop
install_pkgs <- 
  c("gtsummary", "tidyverse", "labelled", "usethis", 
    "causaldata", "fs", "skimr", "car", "emmeans")
install.packages(install_pkgs)
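After installation, a quick check like the one below can confirm that every package is available. It is an optional sketch that uses only base R; the vector name `workshop_pkgs` is made up for this example.

```r
# Packages listed in the prework (hypothetical vector name)
workshop_pkgs <- c("gtsummary", "tidyverse", "labelled", "usethis",
                   "causaldata", "fs", "skimr", "car", "emmeans")

# requireNamespace() returns FALSE when a package cannot be loaded
is_installed <- vapply(workshop_pkgs, requireNamespace,
                       logical(1), quietly = TRUE)
missing_pkgs <- workshop_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```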

Ensure you can knit R Markdown documents
- Open RStudio and create a new R Markdown document.
- Save the document and check that you are able to knit it.

Who We Are: CHIRAL Bangladesh

Center for Health Innovation, Research, Action and Learning - Bangladesh (CHIRAL Bangladesh) is a voluntary non-profit research organization, resolving to promote interdisciplinary research in the field of health data science, computational biology and genomics.

Instructor

Bio

Hi, I am Jubayer, a highly motivated biomedical research enthusiast with a Master of Science in Microbiology and a focus on public health and health data science. I have research experience designing and implementing projects for biomedical data analysis (including next‑generation sequencing, RNA‑seq, and ssRNA‑seq). I am interested in applying machine learning and deep learning tools and techniques to disease diagnosis and large-scale data analytics for public health, while bridging the gap between computational and experimental laboratories through highly engaging and fruitful collaborations.

Python is my primary language for data analysis and machine learning. I also have a basic understanding of R, Julia, SPSS, QGIS, and SQL.

This page highlights my teaching and research projects. Please reach out if you want to collaborate or have questions.

Skills

Programming Languages: Python, R, SQL, Julia, JavaScript
Data Science: scikit-learn, PyCaret, Dask, PySpark
GIS & Remote Sensing: ArcGIS, Geopandas, Xarray, Giovanni
Analytics Software: SPSS, PowerBI, Microsoft Excel
Survey Tools: REDCap, KoboToolBox, EpiCollect, Google Forms
Academic Writing Tools: Microsoft Word, LaTeX, Mendeley
Bioinformatics: BioPython, Bioconductor, BioPandas, Galaxy, NGS, RNASeq, ssRNASeq
Miscellaneous Skills: UNIX, Version Control (Git), Web Scraping, APIs

Selected Publications

Hossain, M.J., Islam, M.W., Munni, U.R. et al. Health-related quality of life among thalassemia patients in Bangladesh using the SF-36 questionnaire. Scientific Reports 13, 7734 (2023). https://doi.org/10.1038/s41598-023-34205-9
Towhid, S. T., Hossain, M. J., Sammo, M. A. S., & Akter, S. (2022). Perception of Students on Antibiotic Resistance and Prevention: An Online, Community-Based Case Study from Dhaka, Bangladesh. European Journal of Biology and Biotechnology, 3(3), 14–19. https://doi.org/10.24018/ejbio.2022.3.3.341
Hossain, M. J., Towhid, S. T., Sultana, S., Mukta, S. A., Gulshan, R., & Miah, M. S. (2022). Knowledge and Attitudes towards Thalassemia among Public University Students in Bangladesh. Microbial Bioactives, 5(2). https://doi.org/10.25163/microbbioacts.526325