Statistical methods for computational biology


Class Times: Tuesdays and Thursdays 2:50-4:05
Location: Physics 128
Instructor: Sayan Mukherjee
Office Hours: By appointment
Email Contact: sayan at stat dot duke dot edu

Course description

This course covers the use of statistical methods and tools from applied probability to address problems in computational molecular biology. Biological problems in sequence analysis, structure prediction, gene expression analysis, phylogenetic trees, and statistical genetics will be addressed. The following statistical topics and techniques will be used to address these biological problems: classical hypothesis testing, Bayesian hypothesis testing, multiple hypothesis testing, extremal statistics, Markov chains, continuous Markov processes, Expectation Maximization and imputation, classification methods, and clustering methods. Along the way we'll learn about gambling, card shuffling, and coin tossing.

Problem sets

  • Problem set #1
  • Problem set #2
  • Problem set #3

Prerequisites

STA 213: Introduction to Statistical Methods; basic knowledge of biology; MTH 104: Linear Algebra and Applications

Grading

There will be four problem sets accounting for 40% of the grade, a midterm exam accounting for 20%, and a final exam accounting for 40%. Students who have an A after the midterm (counting both the exam and the homeworks) will have the option of completing a final project in lieu of the final exam.

Syllabus

The material covered in each class is (hopefully) contained in the lecture notes that I am preparing. Most of this material will be taken from the five books listed in the reading list.
  • S. Mukherjee Course notes
  • S. Mukherjee Course notes for classification and regression (sections 2,3, mainly 5)


Class     Date         Title
Class 01  Thur 12 Jan  Course at a glance
Class 02  Tue 17 Jan   Overview of probability and stats and notation (I)
Class 03  Thur 19 Jan  Overview of probability and stats and notation (II)
Class 04  Tue 24 Jan   Statistical inference
Class 05  Thur 26 Jan  More statistical inference
Class 06  Tue 31 Jan   Classical hypothesis testing (with applications)
Class 07  Thur 2 Feb   Bayesian hypothesis testing (with applications)
Class 08  Tue 7 Feb    Multiple hypothesis testing
Class 09  Thur 9 Feb   Intro to theory of Markov chains and random walks (I)
Class 10  Tue 14 Feb   BLAST and its cousins (I)
Class 11  Thur 16 Feb  BLAST and its cousins (II)

Reading List