Public Health Insights: Clinical Reporting Using R

R for Public Health Workshop


🗓 Friday & Saturday, February 9 & 10, 2024 | 9:00pm - 11:00pm

🏨 Virtual

💥 Register with Google Form

📝 To register for the workshop, follow the instructions in the “Workshops” email you received after registration.


Workshop Overview

Join us for an engaging and informative workshop on “Public Health Insights: Clinical Reporting Using {gtsummary}.” This workshop is designed to equip public health professionals, researchers, and analysts with the skills and knowledge to effectively utilize the gtsummary package for creating comprehensive clinical reports. gtsummary is a powerful tool that streamlines the process of summarizing data, making it an invaluable asset for deriving meaningful insights and driving evidence-based decisions in public health contexts.

Through hands-on demonstrations and interactive sessions, participants will learn how to leverage gtsummary to generate visually appealing and interpretable clinical reports. The workshop will cover essential concepts, best practices, and practical examples that demonstrate the application of gtsummary in real-world public health scenarios. Whether you’re a seasoned data analyst or new to the world of data reporting, this workshop will provide you with the tools and techniques needed to elevate your clinical reporting capabilities.

What is gtsummary?

The gtsummary package provides an elegant and flexible way to create publication-ready summary tables in R.

A critical part of the work of statisticians, data scientists, and analysts is summarizing data sets and regression models in R and publishing or sharing polished summary tables.

The gtsummary package was created to streamline these everyday analysis tasks by allowing users to easily create reproducible summaries of data sets, regression models, survey data, and survival data with a simple interface and very little code.

The package follows a tidy framework, making it easy to integrate with standard data workflows, and offers many table customization features through function arguments, helper functions, and custom themes.
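As a minimal sketch of how little code a first table needs, the example below summarizes three columns of `trial`, a simulated clinical data set that ships with gtsummary. The data set and the chosen columns are assumptions made for illustration; they are not the workshop data. The native pipe `|>` requires R 4.1 or later.

```r
library(gtsummary)

# `trial` is an example data set bundled with gtsummary (illustrative only)
tbl <- trial |>
  tbl_summary(include = c(age, grade, response))

# Printing the object renders the summary table
tbl
```

With no further arguments, tbl_summary() picks a summary type for each column based on its data type.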

Learning objectives

  • Understand the importance of standardized clinical reporting in public health research and practice.
  • Learn the basics of the gtsummary package and its capabilities for creating summary tables in R.
  • Explore various data formats and structures that can be used with gtsummary.
  • Learn how to prepare and clean data for clinical reporting using {gtsummary}.
  • Gain proficiency in creating common clinical summary tables, such as frequency tables, cross-tabulations, and stratified tables.
  • Understand how to incorporate descriptive statistics, including means, medians, and proportions, into {gtsummary} tables.
  • Learn how to customize table formatting, including titles, captions, footnotes, and table themes.
  • Explore advanced features of gtsummary for creating complex tables, including multi-variable summaries and interaction tables.
  • Understand the role of regression model summaries in clinical reporting and learn how to generate these summaries using gtsummary.

Is this course for me?

If your answer to any of the following questions is “yes”, then this is the right workshop for you.

  • Do you make summary tables in R (data, survey data, regression models, time-to-event data, adverse event reports)?

  • Do you want your workflow to be reproducible?

  • Are you often frustrated with the immense amount of code required to create great-looking tables in R?

The workshop is designed for those with some experience in R. You are expected to be able to perform basic data manipulation. Experience with the {tidyverse} and the %>% operator is a plus, but not required.

What will you learn?

Descriptive Tables

gtsummary calculates descriptive statistics for continuous, categorical, and dichotomous variables in R, and presents the results in a beautiful, customizable summary table ready for publication (for example, Table 1 or demographic tables).

Table 1. General characteristics of the study participants (N = 680)
Characteristic                    N = 680¹
Age                               21.80 (2.99)
Gender
    Female                        301 (44%)
    Male                          379 (56%)
Marital Status
    Married                       52 (7.6%)
    Unmarried                     628 (92%)
Field of Study
    Arts and Humanities           88 (13%)
    Business                      85 (13%)
    Science                       445 (65%)
    Social science                62 (9.1%)
Year of Study
    1st year                      266 (39%)
    2nd year                      131 (19%)
    3rd year                      115 (17%)
    4th year                      114 (17%)
    Masters                       54 (7.9%)
Do you know about thalassemia?
    No                            70 (10%)
    Yes                           610 (90%)
¹ Mean (SD); n (%)
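A descriptive table in the style of Table 1 could be produced along the following lines. This is a sketch only: it uses the `trial` example data bundled with gtsummary rather than the thalassemia survey data, and the labels and caption text are made up for illustration.

```r
library(gtsummary)

table1 <- trial |>
  tbl_summary(
    include = c(age, grade, stage),
    # Mean (SD) for continuous variables, n (%) for categorical variables
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    # Illustrative display labels for the example columns
    label = list(age ~ "Age", grade ~ "Tumor grade", stage ~ "T stage"),
    missing = "no" # hide the rows that count missing values
  ) |>
  bold_labels() |>
  modify_caption("Table 1. Characteristics of the trial participants")

table1
```

The `statistic` argument takes glue-style templates, so swapping `{mean} ({sd})` for `{median} ({p25}, {p75})` would report the median and interquartile range instead.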

Comparative Tables

Comparative tables are a type of analytical table used to present and compare data across different variables, categories, or entities. These tables are commonly employed to highlight similarities, differences, and relationships between data points. They allow for a concise and structured way to showcase information side by side, making it easier for the audience to draw insights and conclusions.

Table 2. Level of knowledge of thalassemia among participants who had heard about thalassemia (N = 610)
Characteristic              Overall, N = 610¹   Good, N = 156¹   Poor, N = 454¹
Age                         21.79 (3.06)        21.85 (1.85)     21.77 (3.38)
Gender
    Female                  274 (45%)           64 (41%)         210 (46%)
    Male                    336 (55%)           92 (59%)         244 (54%)
Marital Status
    Married                 48 (7.9%)           14 (9.0%)        34 (7.5%)
    Unmarried               562 (92%)           142 (91%)        420 (93%)
Field of Study
    Arts and Humanities     69 (11%)            17 (11%)         52 (11%)
    Business                72 (12%)            16 (10%)         56 (12%)
    Science                 414 (68%)           115 (74%)        299 (66%)
    Social science          55 (9.0%)           8 (5.1%)         47 (10%)
Year of Study
    1st year                240 (39%)           55 (35%)         185 (41%)
    2nd year                114 (19%)           25 (16%)         89 (20%)
    3rd year                101 (17%)           26 (17%)         75 (17%)
    4th year                107 (18%)           35 (22%)         72 (16%)
    Masters                 48 (7.9%)           15 (9.6%)        33 (7.3%)
¹ Mean (SD); n (%)
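A comparative layout like Table 2, with one column per group plus an overall column, might be built as sketched below. The example stratifies the bundled `trial` data by treatment arm (`trt`), which stands in for the knowledge-level grouping used in the workshop table; that substitution is an assumption for illustration.

```r
library(gtsummary)

table2 <- trial |>
  tbl_summary(
    by = trt, # one column per level of the grouping variable
    include = c(age, grade, stage),
    statistic = all_continuous() ~ "{mean} ({sd})",
    missing = "no"
  ) |>
  add_overall() # adds a column summarizing all rows together

table2
```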

Analytical Tables

Analytical tables typically refer to organized and structured data representations used for analysis, comparison, and interpretation of information. They are commonly used in various fields such as statistics, research, business, and academia. Analytical tables help present data in a clear and concise manner, making it easier to identify patterns, trends, and relationships among variables.

Table 3. Level of knowledge of thalassemia among university students (N = 610)
Characteristic              Overall, N = 610¹   Good, N = 156¹   Poor, N = 454¹   p-value²
Age                         21.79 (3.06)        21.85 (1.85)     21.77 (3.38)     0.14
Gender                                                                            0.3
    Female                  274 (45%)           64 (41%)         210 (46%)
    Male                    336 (55%)           92 (59%)         244 (54%)
Marital Status                                                                    0.6
    Married                 48 (7.9%)           14 (9.0%)        34 (7.5%)
    Unmarried               562 (92%)           142 (91%)        420 (93%)
Field of Study                                                                    0.2
    Arts and Humanities     69 (11%)            17 (11%)         52 (11%)
    Business                72 (12%)            16 (10%)         56 (12%)
    Science                 414 (68%)           115 (74%)        299 (66%)
    Social science          55 (9.0%)           8 (5.1%)         47 (10%)
Year of Study                                                                     0.3
    1st year                240 (39%)           55 (35%)         185 (41%)
    2nd year                114 (19%)           25 (16%)         89 (20%)
    3rd year                101 (17%)           26 (17%)         75 (17%)
    4th year                107 (18%)           35 (22%)         72 (16%)
    Masters                 48 (7.9%)           15 (9.6%)        33 (7.3%)
¹ Mean (SD); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
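The p-value column in a table like Table 3 can be added with add_p(). The sketch below again uses the bundled `trial` data as a stand-in, and it names the two tests from the footnote explicitly rather than relying on defaults. Depending on the gtsummary version installed, R may ask for additional helper packages the first time add_p() runs.

```r
library(gtsummary)

table3 <- trial |>
  tbl_summary(by = trt, include = c(age, grade, stage), missing = "no") |>
  add_overall() |>
  add_p(
    # Tests chosen to mirror the Table 3 footnote
    test = list(
      all_continuous() ~ "wilcox.test",
      all_categorical() ~ "chisq.test"
    )
  )

table3
```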

Univariate Regression Tables

Univariate regression fits a model with one independent variable (predictor) and one dependent variable (response). Regression tables are used to present the results of regression analysis, including coefficients (odds ratios in the example below), confidence intervals, significance levels, and other relevant statistics. Below is an example of what a simple univariate regression table might look like:

Characteristic    N     Event N   OR¹    95% CI¹       p-value   q-value²
Age               183   58        1.02   1.00, 1.04    0.091     0.18
Grade             193   61                             0.93      0.93
    I                             —      —
    II                            0.95   0.45, 2.00
    III                           1.10   0.52, 2.29
¹ OR = Odds Ratio, CI = Confidence Interval
² False discovery rate correction for multiple testing
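A table with these columns could be generated roughly as follows. This is a sketch under assumptions: it fits one logistic regression per predictor on the bundled `trial` data, with `response` as the outcome, and the workshop's own code may differ. add_global_p() relies on the car package from the prework list, and R may prompt for further helper packages depending on the gtsummary version.

```r
library(gtsummary)

uv_table <- trial |>
  tbl_uvregression(
    y = response,                          # binary outcome
    include = c(age, grade),               # one model per predictor
    method = glm,
    method.args = list(family = binomial), # logistic regression
    exponentiate = TRUE                    # report odds ratios
  ) |>
  add_global_p() |> # one p-value per variable instead of per level
  add_nevent() |>   # adds the "Event N" column
  add_q()           # false discovery rate adjusted q-values

uv_table
```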

Multivariate Regression Tables

A multivariate regression table provides a summary of the results obtained from a multivariate regression analysis. This type of analysis involves multiple independent variables (predictors) and a single dependent variable.

Characteristic    OR¹    95% CI¹       p-value
Age               1.02   1.00, 1.04    0.087
T Stage                                0.62
    T1            —      —
    T2            0.58   0.24, 1.37
    T3            0.94   0.39, 2.28
    T4            0.79   0.33, 1.90
¹ OR = Odds Ratio, CI = Confidence Interval
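For a model with several predictors, the fitted model object is passed to tbl_regression(). The sketch below assumes a logistic regression of `response` on `age` and `stage` in the bundled `trial` data; the model formula is an illustrative choice, not necessarily the one used for the table above.

```r
library(gtsummary)

# Fit the model first, then hand the model object to gtsummary
fit <- glm(response ~ age + stage, data = trial, family = binomial)

mv_table <- tbl_regression(fit, exponentiate = TRUE) |>
  add_global_p() # single p-value for the multi-level stage variable

mv_table
```

Because the table is built from the model object, refitting the model and rerunning the code updates every estimate in the table, which keeps the report reproducible.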

Prework

Before attending the workshop, please have the following installed and configured on your machine.

  • Recent version of R

  • Recent version of RStudio

  • The most recent release of gtsummary and the other packages used in the workshop.

    # Packages used in the workshop
    install_pkgs <- 
      c("gtsummary", "tidyverse", "labelled", "usethis", 
        "causaldata", "fs", "skimr", "car", "emmeans")
    install.packages(install_pkgs)
  • Ensure you can knit R Markdown documents

    • Open RStudio and create a new R Markdown document
    • Save the document and check that you are able to knit it.

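To confirm that the installation step worked, a quick check along these lines may help. It is only a sketch: the variable names are made up, and it tests whether each package from the prework list can be loaded, not whether it is the latest release.

```r
# Packages named in the prework list
workshop_pkgs <- c("gtsummary", "tidyverse", "labelled", "usethis",
                   "causaldata", "fs", "skimr", "car", "emmeans")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not available
installed <- vapply(workshop_pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- workshop_pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```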
Who We Are: CHIRAL Bangladesh

Center for Health Innovation, Research, Action and Learning - Bangladesh (CHIRAL Bangladesh) is a voluntary non-profit research organization committed to promoting interdisciplinary research in the fields of health data science, computational biology, and genomics.

Instructor

Bio


Hi, I am Jubayer, a highly motivated biomedical research enthusiast with a Master of Science in Microbiology and a focus on public health and health data science. I have research experience designing and implementing projects for biomedical data analysis (including next‑generation sequencing, RNA‑seq, and ssRNA‑seq). I am interested in applying machine learning and deep learning tools and techniques to disease diagnosis and large-scale data analytics for public health, while bridging the gap between computational and experimental laboratories through highly engaging and fruitful collaborations.

Python is my primary language for data analysis and machine learning. I also have a basic understanding of R, Julia, SPSS, QGIS, and SQL.

This page highlights my teaching and research projects. Please reach out if you want to collaborate or have questions.

Skills

  • Programming Languages: Python, R, SQL, Julia, JavaScript
  • Data Science: scikit-learn, PyCaret, Dask, PySpark
  • GIS & Remote Sensing: ArcGIS, Geopandas, Xarray, Giovani
  • Analytics Software: SPSS, PowerBI, Microsoft Excel
  • Survey Tools: RedCap, KoboToolBox, EpiCollect, Google Forms
  • Academic Writing Tools: Microsoft Word, LaTeX, Mendeley
  • Bioinformatics: BioPython, Bioconductor, BioPandas, Galaxy, NGS, RNASeq, ssRNASeq
  • Miscellaneous Skills: UNIX, Version Control (Git), Web Scraping, APIs

Selected Publications

  1. Hossain, M.J., Islam, M.W., Munni, U.R. et al. Health-related quality of life among thalassemia patients in Bangladesh using the SF-36 questionnaire. Scientific Reports 13, 7734 (2023). https://doi.org/10.1038/s41598-023-34205-9
  2. Towhid, S. T., Hossain, M. J., Sammo, M. A. S., & Akter, S. (2022). Perception of Students on Antibiotic Resistance and Prevention: An Online, Community-Based Case Study from Dhaka, Bangladesh. European Journal of Biology and Biotechnology, 3(3), 14–19. https://doi.org/10.24018/ejbio.2022.3.3.341
  3. Hossain, M.J., Towhid, S.T., Sultana, S., Mukta, S.A., Gulshan, R., & Miah, M.S. (2022). Knowledge and Attitudes towards Thalassemia among Public University Students in Bangladesh. Microbial Bioactives, 5(2). https://doi.org/10.25163/microbbioacts.526325