Public Health Insights: Clinical Reporting Using R
R for Public Health Workshop
🗓 Friday & Saturday, February 9 & 10, 2024 | 9:00pm - 11:00pm
🏨 Virtual
💥 Register with Google Form
📝 To register for the workshop, follow the instructions in the “Workshops” email you received after registration.
Workshop Overview
Join us for an engaging and informative workshop on “Public Health Insights: Clinical Reporting Using {gtsummary}.” This workshop is designed to equip public health professionals, researchers, and analysts with the skills and knowledge to effectively utilize the gtsummary package for creating comprehensive clinical reports. gtsummary is a powerful tool that streamlines the process of summarizing data, making it an invaluable asset for deriving meaningful insights and driving evidence-based decisions in public health contexts.
Through hands-on demonstrations and interactive sessions, participants will learn how to leverage gtsummary to generate visually appealing and interpretable clinical reports. The workshop will cover essential concepts, best practices, and practical examples that demonstrate the application of gtsummary in real-world public health scenarios. Whether you’re a seasoned data analyst or new to the world of data reporting, this workshop will provide you with the tools and techniques needed to elevate your clinical reporting capabilities.
What is gtsummary?
The gtsummary package provides an elegant and flexible way to create publication-ready summary tables in R.
A critical part of the work of statisticians, data scientists, and analysts is summarizing data sets and regression models in R and publishing or sharing polished summary tables.
The gtsummary package was created to streamline these everyday analysis tasks by allowing users to easily create reproducible summaries of data sets, regression models, survey data, and survival data with a simple interface and very little code.
The package follows a tidy framework, making it easy to integrate with standard data workflows, and offers many table customization features through function arguments, helper functions, and custom themes.
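As a minimal sketch of the "very little code" claim, the example below pipes a few columns into `tbl_summary()` with all defaults. It assumes the `trial` example data set that ships with gtsummary (not the workshop's own data) and loads dplyr for `select()` and `%>%`.

```r
library(dplyr)
library(gtsummary)

# `trial` is an example data set bundled with gtsummary;
# it stands in here for whatever data the workshop uses.
tbl_basic <- trial %>%
  select(age, grade, response) %>%  # keep a few columns
  tbl_summary()                     # default summary statistics

tbl_basic  # prints the table in the RStudio viewer or a knitted document
```

Because `tbl_summary()` takes a data frame as its first argument, it slots in after ordinary data preparation steps such as `filter()` or `mutate()`.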
Learning objectives
- Understand the importance of standardized clinical reporting in public health research and practice.
- Learn the basics of the gtsummary package and its capabilities for creating summary tables in R.
- Explore various data formats and structures that can be used with gtsummary.
- Learn how to prepare and clean data for clinical reporting using {gtsummary}.
- Gain proficiency in creating common clinical summary tables, such as frequency tables, cross-tabulations, and stratified tables.
- Understand how to incorporate descriptive statistics, including means, medians, and proportions, into {gtsummary} tables.
- Learn how to customize table formatting, including titles, captions, footnotes, and table themes.
- Explore advanced features of gtsummary for creating complex tables, including multi-variable summaries and interaction tables.
- Understand the role of regression model summaries in clinical reporting and learn how to generate these summaries using gtsummary.
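To give a flavor of the customization objectives above, here is a hedged sketch that relabels variables, changes a column header, and adds a caption. It uses gtsummary's bundled `trial` data as a stand-in; the label and caption text are invented for illustration.

```r
library(dplyr)
library(gtsummary)

tbl_custom <- trial %>%
  tbl_summary(
    include = c(age, grade),
    # Illustrative labels, not taken from the workshop material
    label = list(age ~ "Age (years)", grade ~ "Tumor grade")
  ) %>%
  modify_header(label ~ "**Characteristic**") %>%          # header of the label column
  modify_caption("**Table 1. Patient characteristics**") %>%  # table caption
  bold_labels()                                             # bold the variable labels

tbl_custom
```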
Is this course for me?
If your answer to any of the following questions is “yes”, then this is the right workshop for you.
Do you make summary tables in R (data, survey data, regression models, time-to-event data, adverse event reports)?
Do you want your workflow to be reproducible?
Are you often frustrated with the immense amount of code required to create great-looking tables in R?
The workshop is designed for those with some experience in R. You are expected to be able to perform basic data manipulation. Experience with the {tidyverse} and the %>% operator is a plus, but not required.
What will you learn?
Descriptive Tables
Calculate descriptive statistics for continuous, categorical, and dichotomous variables in R, and present the results in a beautiful, customizable summary table ready for publication (for example, Table 1 or demographic tables).
Characteristic | N = 680¹ |
---|---|
Age | 21.80 (2.99) |
Gender | |
Female | 301 (44%) |
Male | 379 (56%) |
Marital Status | |
Married | 52 (7.6%) |
Unmarried | 628 (92%) |
Field of Study | |
Arts and Humanities | 88 (13%) |
Business | 85 (13%) |
Science | 445 (65%) |
Social science | 62 (9.1%) |
Year of Study | |
1st year | 266 (39%) |
2nd year | 131 (19%) |
3rd year | 115 (17%) |
4th year | 114 (17%) |
Masters | 54 (7.9%) |
Do you know about thalassemia? | |
No | 70 (10%) |
Yes | 610 (90%) |
¹ Mean (SD); n (%) |
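A descriptive table reporting Mean (SD) and n (%) like the one above could be produced along the following lines. This is a sketch only: the survey data behind the table is not included here, so the bundled `trial` data set is used as an assumption.

```r
library(dplyr)
library(gtsummary)

tbl_desc <- trial %>%
  tbl_summary(
    include = c(age, grade, response),
    statistic = list(
      all_continuous()  ~ "{mean} ({sd})",  # Mean (SD) instead of the default median (IQR)
      all_categorical() ~ "{n} ({p}%)"      # count and percentage
    ),
    digits = all_continuous() ~ 2,          # two decimals for continuous summaries
    missing = "no"                          # hide the rows counting missing values
  )

tbl_desc
```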
Comparative Tables
Comparative tables are a type of analytical table used to present and compare data across different variables, categories, or entities. These tables are commonly employed to highlight similarities, differences, and relationships between data points. They allow for a concise and structured way to showcase information side by side, making it easier for the audience to draw insights and conclusions.
Characteristic | Overall, N = 610¹ | Good, N = 156¹ | Poor, N = 454¹ |
---|---|---|---|
Age | 21.79 (3.06) | 21.85 (1.85) | 21.77 (3.38) |
Gender | | | |
Female | 274 (45%) | 64 (41%) | 210 (46%) |
Male | 336 (55%) | 92 (59%) | 244 (54%) |
Marital Status | | | |
Married | 48 (7.9%) | 14 (9.0%) | 34 (7.5%) |
Unmarried | 562 (92%) | 142 (91%) | 420 (93%) |
Field of Study | | | |
Arts and Humanities | 69 (11%) | 17 (11%) | 52 (11%) |
Business | 72 (12%) | 16 (10%) | 56 (12%) |
Science | 414 (68%) | 115 (74%) | 299 (66%) |
Social science | 55 (9.0%) | 8 (5.1%) | 47 (10%) |
Year of Study | | | |
1st year | 240 (39%) | 55 (35%) | 185 (41%) |
2nd year | 114 (19%) | 25 (16%) | 89 (20%) |
3rd year | 101 (17%) | 26 (17%) | 75 (17%) |
4th year | 107 (18%) | 35 (22%) | 72 (16%) |
Masters | 48 (7.9%) | 15 (9.6%) | 33 (7.3%) |
¹ Mean (SD); n (%) |
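A side-by-side layout with an "Overall" column, as in the table above, is typically built with the `by` argument plus `add_overall()`. The sketch below assumes the bundled `trial` data and splits by its treatment column `trt`, which is a stand-in for the Good/Poor grouping shown above.

```r
library(dplyr)
library(gtsummary)

tbl_comp <- trial %>%
  tbl_summary(
    by = trt,                                       # one column per treatment group
    include = c(age, grade),
    statistic = all_continuous() ~ "{mean} ({sd})"
  ) %>%
  add_overall()                                     # add a column for the whole sample

tbl_comp
```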
Analytical Tables
Analytical tables typically refer to organized and structured data representations used for analysis, comparison, and interpretation of information. They are commonly used in various fields such as statistics, research, business, and academia. Analytical tables help present data in a clear and concise manner, making it easier to identify patterns, trends, and relationships among variables.
Characteristic | Overall, N = 610¹ | Good, N = 156¹ | Poor, N = 454¹ | p-value² |
---|---|---|---|---|
Age | 21.79 (3.06) | 21.85 (1.85) | 21.77 (3.38) | 0.14 |
Gender | | | | 0.3 |
Female | 274 (45%) | 64 (41%) | 210 (46%) | |
Male | 336 (55%) | 92 (59%) | 244 (54%) | |
Marital Status | | | | 0.6 |
Married | 48 (7.9%) | 14 (9.0%) | 34 (7.5%) | |
Unmarried | 562 (92%) | 142 (91%) | 420 (93%) | |
Field of Study | | | | 0.2 |
Arts and Humanities | 69 (11%) | 17 (11%) | 52 (11%) | |
Business | 72 (12%) | 16 (10%) | 56 (12%) | |
Science | 414 (68%) | 115 (74%) | 299 (66%) | |
Social science | 55 (9.0%) | 8 (5.1%) | 47 (10%) | |
Year of Study | | | | 0.3 |
1st year | 240 (39%) | 55 (35%) | 185 (41%) | |
2nd year | 114 (19%) | 25 (16%) | 89 (20%) | |
3rd year | 101 (17%) | 26 (17%) | 75 (17%) | |
4th year | 107 (18%) | 35 (22%) | 72 (16%) | |
Masters | 48 (7.9%) | 15 (9.6%) | 33 (7.3%) | |
¹ Mean (SD); n (%) |
² Wilcoxon rank sum test; Pearson’s Chi-squared test |
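The p-value column in a table like this one usually comes from `add_p()`, which picks a test for each variable based on its type. The following sketch again assumes the bundled `trial` data grouped by `trt`; depending on the installed gtsummary version, `add_p()` may ask for helper packages to be installed first.

```r
library(dplyr)
library(gtsummary)

tbl_p <- trial %>%
  tbl_summary(by = trt, include = c(age, grade)) %>%
  add_overall() %>%  # overall column next to the group columns
  add_p()            # group comparison test per variable, named in a footnote

tbl_p
```

The tests can be overridden through the `test` argument of `add_p()` when the defaults do not suit the study design.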
Univariate Regression Tables
Univariate regression models one independent variable (predictor) against one dependent variable (response) at a time. Regression tables are used to present the results of regression analysis, including coefficients, standard errors, significance levels, and other relevant statistics. Below is an example of a simple univariate regression table, here reporting odds ratios from logistic regression models:
Characteristic | N | Event N | OR¹ | 95% CI¹ | p-value | q-value² |
---|---|---|---|---|---|---|
Age | 183 | 58 | 1.02 | 1.00, 1.04 | 0.091 | 0.18 |
Grade | 193 | 61 | | | 0.93 | 0.93 |
I | | | — | — | | |
II | | | 0.95 | 0.45, 2.00 | | |
III | | | 1.10 | 0.52, 2.29 | | |
¹ OR = Odds Ratio, CI = Confidence Interval |
² False discovery rate correction for multiple testing |
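A table of this shape can be sketched with `tbl_uvregression()`, which fits one model per predictor. The code assumes the bundled `trial` data with `response` as the binary outcome; `add_global_p()` relies on the car package being installed, and the exact numbers printed are not guaranteed to match the table above.

```r
library(dplyr)
library(gtsummary)

tbl_uv <- trial %>%
  select(response, age, grade) %>%
  tbl_uvregression(
    method = glm,                           # one glm() per predictor
    y = response,                           # binary outcome
    method.args = list(family = binomial),  # logistic regression
    exponentiate = TRUE                     # report odds ratios, not log-odds
  ) %>%
  add_nevent() %>%     # number of events per model
  add_global_p() %>%   # one p-value per variable (needs the car package)
  add_q()              # false discovery rate adjusted q-values

tbl_uv
```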
Multivariate Regression Tables
A multivariate regression table provides a summary of the results obtained from a multivariate regression analysis. This type of analysis involves multiple independent variables (predictors) and a single dependent variable.
Characteristic | OR¹ | 95% CI¹ | p-value |
---|---|---|---|
Age | 1.02 | 1.00, 1.04 | 0.087 |
T Stage | | | 0.62 |
T1 | — | — | |
T2 | 0.58 | 0.24, 1.37 | |
T3 | 0.94 | 0.39, 2.28 | |
T4 | 0.79 | 0.33, 1.90 | |
¹ OR = Odds Ratio, CI = Confidence Interval |
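For a model with several predictors, the usual pattern is to fit the model first and pass it to `tbl_regression()`. This sketch assumes the bundled `trial` data and a logistic model of `response` on `age` and `stage`; the model formula is an illustrative choice, and `add_global_p()` again needs the car package.

```r
library(gtsummary)

# Fit a logistic regression with two predictors (illustrative model)
fit <- glm(response ~ age + stage, data = trial, family = binomial)

tbl_mv <- tbl_regression(fit, exponentiate = TRUE) %>%  # odds ratios with 95% CI
  add_global_p()                                        # single p-value for the multi-level stage term

tbl_mv
```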
Prework
Before attending the workshop, please have the following installed and configured on your machine.
- Recent version of R
- Recent version of RStudio
- Most recent release of gtsummary and the other packages used in the workshop.
- Ensure you can knit R Markdown documents
  - Open RStudio and create a new R Markdown document
  - Save the document and check you are able to knit it.
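One way to check the setup before the session is a short script like the one below. The package list is an assumption (the workshop's full list is not given here), so adjust `required_pkgs` to whatever the organizers announce.

```r
# Assumed package list; replace with the list provided by the organizers
required_pkgs <- c("gtsummary", "dplyr", "rmarkdown")

# Check which packages can be loaded without attaching them
installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  message("Install them with install.packages(missing_pkgs)")
} else {
  message("All required packages are installed.")
}

# Report the R version in use
message(R.version.string)
```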
Center for Health Innovation, Research, Action and Learning - Bangladesh (CHIRAL Bangladesh) is a voluntary non-profit research organization, resolving to promote interdisciplinary research in the field of health data science, computational biology and genomics.
Instructor
Bio
Hi, I am Jubayer, a highly motivated biomedical research enthusiast with a Master of Science in Microbiology and a focus on public health and health data science. I have research experience designing and implementing projects for biomedical data analysis (including next‑generation sequencing, RNA‑seq, and ssRNA‑seq). I am interested in applying machine learning and deep learning tools and techniques to disease diagnosis and large-scale data analytics for public health, while bridging the gap between computational and experimental laboratories through highly engaging and fruitful collaborations.
Python is my primary language for data analysis and machine learning. I also have a basic understanding of R, Julia, SPSS, QGIS, and SQL.
This page highlights my teaching and research projects. Please reach out if you want to collaborate or have questions.
Skills
- Programming Languages: Python, R, SQL, Julia, JavaScript
- Data Science: scikit-learn, PyCaret, Dask, PySpark
- GIS & Remote Sensing: ArcGIS, Geopandas, Xarray, Giovanni
- Analytics Software: SPSS, PowerBI, Microsoft Excel
- Survey Tools: REDCap, KoboToolbox, EpiCollect, Google Forms
- Academic Writing Tools: Microsoft Word, LaTeX, Mendeley
- Bioinformatics: BioPython, Bioconductor, BioPandas, Galaxy, NGS, RNASeq, ssRNASeq
- Miscellaneous Skills: UNIX, Version Control (Git), Web Scraping, APIs
Selected Publications
- Hossain, M.J., Islam, M.W., Munni, U.R. et al. Health-related quality of life among thalassemia patients in Bangladesh using the SF-36 questionnaire. Scientific Reports 13, 7734 (2023). https://doi.org/10.1038/s41598-023-34205-9
- Towhid, S. T., Hossain, M. J., Sammo, M. A. S., & Akter, S. (2022). Perception of Students on Antibiotic Resistance and Prevention: An Online, Community-Based Case Study from Dhaka, Bangladesh. European Journal of Biology and Biotechnology, 3(3), 14–19. https://doi.org/10.24018/ejbio.2022.3.3.341
- Hossain, M.J., Towhid ST, Sultana S, Mukta SA, Gulshan R, Miah MS (2022). Knowledge and Attitudes towards Thalassemia among Public University Students in Bangladesh. Microbial Bioactives, 5(2), https://doi.org/10.25163/microbbioacts.526325.