Module overview
Training in bioinformatic cell analysis significantly broadens the research possibilities open to students in their future careers in Biomedical Sciences. The quantitative cell biology (QCB) module focuses on the practical use of analysis methods, rather than only on the mathematics and statistics underpinning them. Some of the mathematics and statistics will be discussed, but no prior knowledge is assumed. Analyses will predominantly be conducted in R, the software environment for statistical computing (https://www.r-project.org).
Students with or without experience of R programming and/or mathematics will be enrolled on this course.
Students with no background in this area will not be disadvantaged: they will receive computing support and training, via a Data Carpentry course delivered by the Southampton Research Software Group (https://rsgsoton.net), to help them succeed.
There is no opportunity to repeat the year on this programme.
Aims and Objectives
Learning Outcomes
Subject Specific Practical Skills
Having successfully completed this module you will be able to:
- Apply investigative skills/methods of enquiry to researching problems and issues in one’s area of research.
Transferable and Generic Skills
Having successfully completed this module you will be able to:
- Use information technology (e.g. web/internet, databases, spreadsheets, statistical packages and word processing) effectively
- Manage a research project with due attention to time and resource management.
Subject Specific Intellectual and Research Skills
Having successfully completed this module you will be able to:
- Devise valid and reliable methods and instruments for data and information collection in relation to one’s own research
- Gather, quantify, analyse, synthesise, critically evaluate and interpret complex information
- Analyse problems objectively using key theoretical perspectives and empirical research
- Apply scientific and clinical concepts to the development of new ideas and the synthesis of hypotheses
Knowledge and Understanding
Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:
- The identification and justification of the value of different sources of data in drawing conclusions from published literature
- The practical issues involved in carrying out quantitative research
- The value, nature, uses and limitations of a range of research methods
Syllabus
Introduction to the course and software installation; introduction to RNA sequencing experimental design, data structure, metadata, data pre-processing and exploratory data analysis (QCB Session 0).
Differential gene expression analysis for RNA sequencing data (QCB Session 1).
Clustering-based analyses for RNA sequencing data (QCB Session 2).
Dimension reduction analysis (i.e., Principal Component Analysis) for RNA sequencing data (QCB Session 3).
Data visualization and extracting biological meaning (i.e., gene set enrichment and pathway analysis) from RNA sequencing data (QCB Session 4).
Single-cell RNA sequencing data analysis from digital gene expression matrices (QCB Sessions 5 and 6).
Set coursework assignments and a student-led revision workshop (QCB Session 7).
Learning and Teaching
Teaching and learning methods
Teaching will begin with an introductory session (QCB Session 0) in which students meet the course instructors, set up their computers, establish an R environment and receive an introduction to RNA sequencing analysis.
Teaching will then consist of six one-day master classes and a final one-day revision session. Each day will cover one of the syllabus sections detailed above. Each session will begin with a taught overview of the material to be covered, followed by a hands-on computer session in which students can explore the data types and methods discussed. Example datasets for exploration will be provided at each session. Collaborative working between students will be encouraged during these sessions.
The hands-on sessions will be run by a member of staff and 1-2 computational PhD/postdoc demonstrators, of whom there are many suitably qualified in Southampton.
The training and analysis will primarily be conducted using the R software environment. All methods will be demonstrated and full code for example problems will be provided. Prior knowledge of programming is beneficial but not required, as all students will receive Data Carpentry training before QCB Session 1.
Total Study Time
The module follows the standard allocation of 200 hours of student effort for a 20-credit module.
Contact hours: 50 (including 10 hours of dedicated Data Carpentry time)
Non-contact hours: 150 of independent learning
Lectures will be delivered face to face. Tutorials, support and feedback for the practical computer workshop sessions will also be given face to face. Data Carpentry will be delivered via live online workshops.
Type | Hours |
---|---|
Independent Study | 150 |
Teaching | 50 |
Total study time | 200 |
Resources & Reading list
General Resources
The course is based on computational data analysis, so access to a computer, laptop or workstation with a working R environment and internet access is essential.
Textbooks
Bishop (2006). Pattern Recognition and Machine Learning. Springer.
Hastie, Tibshirani and Friedman (2009). The Elements of Statistical Learning. Springer.
Assessment
Assessment strategy
R programming test (10%)
A short computer-based class test, comprising approximately 20 questions to monitor and facilitate the acquisition of basic skills required for the primary substantive summative assessments described below.
Coursework 1. (40%) Summary of analysis techniques
A written summary of supervised and unsupervised analysis techniques and their applications in quantitative cell biology, using the references provided in lectures and on Blackboard as a starting point. Students are encouraged to explore cutting-edge methods that have been adopted for biological data analysis and are widely used in the scientific literature (1,500-word limit).
Coursework 2. (50%) Analysis of a dataset
Using the R software environment, you will conduct a thorough analysis of a provided dataset. The full problem details and the dataset will be made available via Blackboard. You will be assessed on how thoroughly your approach analyses the dataset, and your submission must include:
1. A fully annotated R script that executes all analysis commands and runs without error.
2. A set of professionally produced figures with appropriate captions that summarises the analysis.
3. A written summary providing biological interpretation of the analysis (300-word limit).
Assessment requirements
You must pass the module with an average overall mark of 50% or above. There is compensation between assessment elements provided a mark of 40% or higher is attained in each element. Candidates who fail one or more elements of the module at the first attempt will be permitted to re-sit the failed elements as supplementary assessments. Candidates who achieve at least 50% overall at the second attempt will be permitted to pass the module with a capped mark of 50%.
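As a worked illustration of the pass rule, the following Python sketch combines the three element marks using the weightings from the assessment strategy (10%, 40% and 50%) and checks both conditions. The function names, variable names and example marks are hypothetical. The reading that every element must reach 40% and the weighted overall mark must reach 50% is an assumption about how compensation is applied; the official regulations take precedence.

```python
# Weightings taken from the assessment strategy (percent of the module mark).
WEIGHTS = {"class_test": 10, "written_summary": 40, "data_analysis": 50}

OVERALL_PASS_MARK = 50.0  # minimum weighted overall mark
ELEMENT_MINIMUM = 40.0    # minimum mark in each element for compensation to apply


def overall_mark(marks):
    """Return the weighted overall mark (0-100) from per-element marks (0-100)."""
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS) / 100.0


def passes_module(marks):
    """Assumed reading of the rule: overall >= 50 and every element >= 40."""
    every_element_ok = all(marks[name] >= ELEMENT_MINIMUM for name in WEIGHTS)
    return every_element_ok and overall_mark(marks) >= OVERALL_PASS_MARK


# Hypothetical marks, for illustration only.
example = {"class_test": 45, "written_summary": 48, "data_analysis": 56}
print(overall_mark(example))   # 51.7
print(passes_module(example))  # True
```

In this example the class test and written summary marks fall below 50% but above 40%, so the stronger data analysis mark compensates and the weighted overall mark of 51.7% is a pass under the assumed reading.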
Summative
This is how we’ll formally assess what you have learned in this module.
Method | Percentage contribution |
---|---|
Data Analysis | 50% |
Class Test | 10% |
Written summary | 40% |
Referral
This is how we’ll assess you if you don’t meet the criteria to pass this module.
Method | Percentage contribution |
---|---|
Written summary | 40% |
Class Test | 10% |
Data Analysis | 50% |