if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
::install(version = "3.16") BiocManager
R for Bioinformatics
Learn to Analyze Genomics, Metagenomics, and Transcriptomics Data with R
đź—“ July 16 - July 31, 2023 | 9:00 pm - 11:00pm (Bangladesh Time)
🏨 Medium - Zoom/Google Meet
đź’Ą Register with Google Forms
đź’Ą Registration Fee: 1020 BDT (for students),1530BDT (for professionals)
đź“ť To join private Telegram group for the course, follow instructions in the email you received after registration.
Overview
Welcome to the exciting world of bioinformatics! In this course, we will explore the application of the R programming language in the field of bioinformatics. Whether you are a biologist, a computer scientist, or simply someone interested in the intersection of biology and programming, this book is designed to help you get started with using R for bioinformatics.
Bioinformatics combines biology, computer science, statistics, and math to understand complex biological processes. R is a powerful programming language widely used in bioinformatics for data analysis, visualization, and modeling. It has a rich set of packages and libraries for processing biological data.
This course assumes no prior knowledge of R programming, but assumes basic understanding of statistics and bioinformatics. We’ll cover data manipulation, visualization, statistics, and common bioinformatics techniques.
Remember, learning a new language and field takes time and practice. Stay persistent, ask questions, and engage actively with the material. Let’s embark on this exciting journey into bioinformatics with R. By the end, you’ll have the skills to explore and contribute to bioinformatics research.
Enjoy your learning experience, and good luck!
Learning bbjectives
Understand the fundamentals of bioinformatics and its application in biological research.
Gain a solid foundation in R programming, including data structures, control flow, and functions.
Learn data manipulation techniques specific to bioinformatics, such as working with DNA sequences and protein structures.
Gain hands-on experience by working with real-world bioinformatics datasets and applying R to extract valuable information.
Understand the importance of reproducibility and best practices in bioinformatics research using R.
Feel confident in applying R for basic bioinformatics tasks and be ready to explore more advanced topics in the field.
Why R?
- R is a programming and statistical language.
- R is used for data Analysis and Visualization.
- R is simple and easy to learn, read and write.
- R is an example of a FLOSS (Free Libre and Open Source Software) where one can freely distribute copies of this software, read its source code, modify it, etc.
Useful resources for learning R
DataCamp: interactive online lessons in R.
- Some of the courses are free (particularly community-written lessons like the one you’ll do today), but for paid courses, DataCamp costs about 300 SEK / mo.
RStudio Cheat Sheets: Very helpful 1-2 page overviews of common tasks and packages in R.
Quick-R: Website with short example-driven overviews of R functionality.
StackOverflow: Part of the Stack Exchange network, StackOverflow is a Q&A community website for people who work in programming. Tons of incredibly good R users and developers interact on StackExchange, so it’s a great place to search for answers to your questions.
R-Bloggers: Blog aggregagator for posts about R. Great place to learn really cool things you can do in R.
R for Data Science: Online version of the book by Hadley Wickham, who has written many of the best packages for R, including the Tidyverse, which we will cover.
Required Textbooks
The following books purchased and are avilable at the online book store. We have also a placed a copy of each on reserve at our online library.
- Fundamentals of Biostatistics by Bernard Rosner, Harvard University
- Applied Statistics by PennState Eberly College of Science
- Applied Medical Statistics for Beginners by Dr. Mohamed Elsherif
- Biostatistics by University of Florida
- Introduction to R Programming by Dr. Roger D. Peng
- R for Data Science by Roger D.Peng
- Introduction to Data Science by Rafael Irizarry
- Data Analysis for the Life Sciences by Rafael Irizarry
- Exploratory Data Analysis with R by Roger D.Peng
Blogs for R Programming, Statistics, and Data Analyis
- Programiz - https://www.datamentor.io/r-programming/
- PennState STAT 484 - https://online.stat.psu.edu/stat484/
- PennState Topics in R Statistical Language - https://online.stat.psu.edu/stat484/
- Simply Statistics - https://simplystatistics.org/
- TutorialPoint - https://www.tutorialspoint.com/r/index.htm
- R for Biologists - https://www.rforbiologists.org/
- Computational Genomics with R - https://compgenomr.github.io/book/
- Stat and R - https://statsandr.com/
- Rafa Lab - https://rafalab.github.io/pages/harvardx.html
- University of Florida - https://bolt.mph.ufl.edu/software/r-phc-6055/
Required software
- R: http://www.r-project.org/ (FREE)
- RStudio (additional libraries required): http://www.rstudio.com/ (FREE)
Recording of classes
Class lectures will be recorded automatically using cloud. The links will be posted to CHIRAL Classes when they are available.
Is this course for me?
If your answer to any of the following questions is “yes”, then this is the right workshop for you.
Do you make summary tables in R (data, survey data, regression models, time-to-event data, adverse event reports)?
Do you want your workflow to be reproducible?
Are you often frustrated with the immense amount of code required to create great-looking tables in R?
The workshop is designed for those with some experience in R. It will be expected that you can perform basic data manipulation. Experience with the {tidyverse} and the %>%
operator is a plus, but not required.
Prework
Before attending the workshop please have the following installed and configured on your machine.
Recent version of R
Recent version of RStudio
Most recent release of the Bioconductor and other packages used in courses
Install the latest release of R, then get the latest version of Bioconductor by starting R and entering the commands.
Ensure you can knit R markdown documents
- Open RStudio and create a new Rmarkdown document
- Save the document and check you are able to knit it.
Install Bioconductor Packages
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
::install() BiocManager
Install specific packages, e.g., “GenomicFeatures” and “AnnotationDbi”, with
::install(c("GenomicFeatures", "AnnotationDbi")) BiocManager
The install() function (in the BiocManager package) has arguments that change its default behavior; type ?install for further help.
Zoom + Working Virtually
Zoom link will be emailed to students
Class sessions will be recorded and later posted
We will have lectures as well as breakout room sessions to work on labs
Please be aware that there is the option to use closed captioning:
Instructor
Bio
Md. Jubayer Hossain is the Founder, Management Lead and Data Analyst at CHIRAL Bangladesh. CHIRAL Bangladesh is a non-profit organization dedicated to health research to improve lives in Bangladesh. He aspires to maximize the quality of life of people around him by working at the intersection of education, technology, and health research. His research interests include public health, health data science, remote sensing, GIS applications in health, and machine learning in healthcare. He has been teaching programming, data analysis, data science, and research methodology since 2020 in CHIRAL Bangladesh. Detailed research and teaching activities were found on his website.
Skills
Programming Languages: Python, R, SQL, Julia, JavaScript; Data Science: scikit-learn, PyCaret, Dask, PySpark; GIS & Remote Sensing: ArcGIS, Geopandas, Xarray, Giovani; Analytics Softwares: SPSS, PowerBI, Microsoft Excel; Survey Tools: RedCap, KoboToolBox, EpiCollect, Google Forms; Academic Writing Tools: Microsoft Word, LaTeX, Mendeley; Bioinformatics: BioPython, Bioconductor, BioPandas, Galaxy, NGS, RNASeq, ssRNASeq; Miscellaneous Skills: UNIX, Version Control(Git), Web Scraping, APIs.
Publications
- Hossain, M.J., Islam, M.W., Munni, U.R. et al. Health-related quality of life among thalassemia patients in Bangladesh using the SF-36 questionnaire. Scientific Reports 13, 7734 (2023). https://doi.org/10.1038/s41598-023-34205-9
- Towhid, S. T., Hossain, M. J., Sammo, M. A. S., & Akter, S. (2022). Perception of Students on Antibiotic Resistance and Prevention: An Online, Community-Based Case Study from Dhaka, Bangladesh. European Journal of Biology and Biotechnology, 3(3), 14–19. https://doi.org/10.24018/ejbio.2022.3.3.341
- Hossain, M.J., Towhid ST, Sultana S, Mukta SA, Gulshan R, Miah MS (2022). Knowledge and Attitudes towards Thalassemia among Public University Students in Bangladesh. Microbial Bioactives, 5(2), https://doi.org/10.25163/microbbioacts.526325.