Topic: Introduction to R in Epidemiology


The Global Health Network (TGHN) Asia has set up a Data Science Club at icddr,b, Bangladesh. For the club's first two sessions, TGHN, in collaboration with HDR UK, icddr,b, Fiocruz, and Africa CDC, organized a hybrid workshop, Introduction to R in Epidemiology, on July 17 and 18, 2023, to train aspiring epidemiologists in the R statistical package. The goal of the two-day programme was to make R training more accessible and better tailored to the broader communities of practice in the Global South.

Participants in the two sessions learned introductory R topics such as syntax, data cleaning, basic data analysis, and visualization. A total of 37 participants attended: 15 joined virtually from 6 different countries, and 22 young researchers joined in person from institutions in Bangladesh.

Programme

Session 1:

Saimul Islam, Senior Research Investigator, Noncommunicable Diseases, NRD, icddr,b, moderated the whole session. Mathew Redford from HDR Global gave the opening remarks, in which he introduced the participants to the Global Health Data Science knowledge community and hub, the concept of Data Science Clubs and Clinics, opportunities for networking with researchers, and the availability of relevant tools. He also presented the Data Science hub's initiative to make health data more accessible, as well as future initiatives to incorporate Artificial Intelligence and Machine Learning in health research.

Aashna Uppal, an expert data scientist and a candidate in the HDRUK-Turing-Wellcome PhD Programme in Health Data Science at the University of Oxford, facilitated the training. Participants were introduced to Applied Epidemiology and to R and RStudio. They were then taken through the RStudio environment, where functions, packages, calculations, the working directory, R projects, and R syntax were covered. The participants then learned about object classes and indexing, and took part in practice exercises.

Session 2:

After a brief segment to address queries from the previous session, the training moved on to exposing the participants to real-world working challenges. Base R versus Tidyverse coding conventions, importing data and packages, basic data manipulation, and data cleaning in RStudio were discussed. In the next segment of the session, creating tables, data visualization, R Markdown, and automated reporting features were introduced.

In the closing remarks, Dr. Aliya Naheed, lead of TGHN Asia and scientist, Noncommunicable Diseases, Nutrition Research Division, shared her experiences of, and challenges with, data management. She encouraged the participants to advance their research careers through effective communication with institutional and global communities. She also emphasized the importance of practising the demonstrated topics for effective learning.

The participants shared that they found the two sessions useful and had learned a lot from the training. A participant joining from Cambodia expressed her interest in incorporating R/RStudio in their mixed-methods study analysis. Specific feedback from the survey participants will be implemented in future workshops and Data Science Club sessions. For example, future cohorts may be grouped by skill level and trained separately, and the workshops may be lengthened to allow more time on the materials covered. Participants may also be asked beforehand whether they have specific requests for intermediate and advanced topics, and the team will try to run workshops in regional languages where possible (although it was noted that running these workshops in English was not a barrier to learning).

Another key recommendation was to ground all learning in practical examples, which could not only encourage discussion but also solidify learning. Lastly, it was recommended to follow up with participants at higher skill levels to gauge whether they would be interested in undertaking a "training of trainers" course, in which they would learn how to run Data Clubs and Clinics themselves. This would help sustain the learning platform over the long term.

Click on the video links below to watch the recordings from the two sessions:

                                                                        



Introduction to R for Epidemiology - Session 1 (1 hour 20 min 21 s)

Introduction to R for Epidemiology - Session 2 (1 hour 46 min 52 s)

If you would like to set up a Data Science Club in your institute or network, we can support you with implementing and running it. Please reach out to us at asia@tghn2.org