The LEARNING OBJECTIVE is spelled out here
I will mostly use web resources, or rather, ask you to find web resources and study from them.
No text is necessary, but a good book to have is An Introduction to Bioinformatics Algorithms
by Neil Jones and Pavel Pevzner, MIT Press 2004, ISBN: 0-262-10106-8
Text web page
Slides from the Pevzner textbook:
Background
Ch 3: Background
Ch 4: Motif finding
Ch 4: DNA mapping / Partial Digest
Ch 6: Sequence distance measurement with dynamic programming, slide 52 onwards.
Pairwise Alignment with DP
Alignment algorithms: Needleman–Wunsch algorithm, Smith–Waterman algorithm
Also, check Wiki
Protein Data Bank uses the MMseqs2 algorithm: search->sequence search->Advanced search
Multiple Sequence Alignment
Phylogenetic Tree
Bi-clustering for Gene Expression Analysis with Microarray
Ch 10: Clustering of Microarray data
Bioconductor tutorial on clustering: presentation
Bi-clustering presentation
Bi-clustering review paper by Tanay-Shamir 2004
Density-based non-partitioning clustering Gupta et al., ACM Tr. on Comp. Bio (2010).
Ch 10: Molecular Evolution
Ch 8: DNA sequencing: graph theory
Ch 8: Mass Spec analyses
Ch 5: Genome rearrangements
Ch 11: Hidden Markov Models
From the past:
SeqAlignment
Fragment Assembly
Structure Prediction
A tutorial on HMM
-----------------------------
RESOURCES:
A decent introduction to molecular biology by Hunter.
Some information on the Human Genome Project is here.
Wiki on DNA sequencing
Some collection of important web databases / tools
Jones-Pevzner book's assignment 1 page.
A short but good tutorial on Hidden Markov Model (from Horse's mouth!)
Rabiner (Problems 1, 2 & 3) in Proc. of IEEE, Feb 1989
Pfam protein family DB
BLAST original: Altschul et al's paper
Altschul et al.'s Gapped BLAST & PSI-BLAST, Comment on Gapped BLAST complexity
A lecture note on Protein structure
(acknowledgement: http://www.bmolchem.wisc.edu/courses/fall704/module1/J.Keck_Powerpoints_2009/Lectures%201%20and%202.ppt)
A good short description of domain, motif, fold, etc. of proteins from wiki: http://en.wikipedia.org/wiki/Protein_structure#Domains.2C_motifs.2C_and_folds_in_protein_structure
SCOP (Structural Classification of Proteins): http://en.wikipedia.org/wiki/Structural_Classification_of_Proteins
CATH (Class-Architecture-Topology-Homologous superfamily) protein structure classification wiki: http://en.wikipedia.org/wiki/CATH
FSSP (Families of Structurally Similar Proteins) automated database, uses DALI algorithm, wiki: http://en.wikipedia.org/wiki/Families_of_structurally_similar_proteins
Aaron's talk on Folds-Motifs
Steve Johnson's talk on GO ontology
Protein structure alignments:
Ye et al's paper
(1) STRAP site,
(2) TM-align paper
(3) LOCK alignment paper
(4) DALI paper
CONFERENCES:
Intelligent Systems for Molecular Biology: ISMB.
International Conference on Research in Computational Molecular Biology: RECOMB.
IEEE Computational Systems Bioinformatics Conference: CSB.
---------------- Summer 2020: Computational Virology ---------
Official Syllabi: CSE-4510-SpecialTopic-CompVirology, CSE-5400-SpecialTopic-CompVirology
A weekly course plan/overview
SarsCov2ProteinsListFromNYtimes
A codon translator logic-code, due to Dr. Frazan
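For illustration only, the idea of a codon translator can be sketched in a few lines of Python. This is not Dr. Frazan's code. It assumes the standard genetic code (NCBI translation table 1) and reads in frame from the first base; the names CODON_TABLE and translate are made up for this example.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons enumerated in TCAG order (standard genetic code).
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, to_stop=True):
    """Translate a DNA or RNA string, codon by codon, starting at position 0.

    Unknown codons (for example, ones containing N) become 'X'.
    Trailing bases that do not fill a whole codon are ignored.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*" and to_stop:
            break  # stop translating at the first stop codon
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))  # prints MA
```

A dictionary lookup keeps the logic short; a real tool would also need to handle reading frames and alternative genetic codes.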
================ Spring 2011 ================
Class: W 6:30-9:15pm Crawford 402 (we will see!)
Office Hours: 2-4pm TR
A dated course plan for Fall 2011.
A news item on gene expression analysis leading to regulatory pathway discovery.
-------------------- Spring 2010 --------------
Office Hours: 2-4pm TW
Home Work 1 on Biology primer
Assignment 1 on Sequence search
Assignment 2 on UniProt, Heart-2DE, and PHYLIP
Assignment 3 on Protein 3D structure alignment
Parallel suffix tree paper from IBM.
Exam time meeting: Wednesday 5/5/10, 8:20pm, on the final project results
Target PRIB 2010 conference: deadline May 20, 2010
============== PREVIOUS SEMESTERS =======
Spring 2009
---------------------------------------
plan /journal
Algorithms basics syllabus
Projects assignment (developing)
ALL PRESENTATIONS AFTER UPDATES SHOULD BE SUBMITTED WITHIN THE NEXT CLASS
Key to the Quiz 1 on Biology primer.
Programming Assignment 1
Programming Assignment 2
Updated & due Wednesday. Penalty after due date.
Sorry, the two input sequences are the same. Use any example pair(s) of your choice.
Quiz 2
Programming Assignment 3 on HMM.
On 4/15/09 Wednesday: Project-discussion with each group for 15 minutes
Final Project presentation guidelines: COMPREHENSIVE PRESENTATION WITH INTRO TO THE
PROBLEM, PAPERS YOU HAVE READ, ALGORITHMS YOU ARE IMPLEMENTING, INPUT DATA SET;
EMPHASIZE YOUR RESULTS
Final Project Presentations:
Protein Structure similarity measurement
Clustering gene Expression data
EST to homologous sequences
GO Ontology
DUE 4/29/09: ANONYMOUS class feedback.
THANKS.
Final (5/6, 8:30 pm): Closed book, 1 hour (not 2), some short questions,
some from bio-basics, some on each project, question on writing algorithms,
dry-running algorithms, and basic understanding of algorithms...
Grades
On Quiz2 BreakPtReversal & DP answers are regraded.
Formula for aggregate is up on the spreadsheet!
Spring 2006
---------------
Assignment 1 (due 1/19/06)
Assignment 2 (points 30):
(1) Answer the questions on GenBank and Swiss-Prot databases (print
the questions too)
(2) Questions 4.15, 4.16 and 4.17 from the text (p122-3).
(3) Analyze the complexities of the algorithms "BruteForceMotifSearch"
(p 109) and "SimpleMedianSearch" (p113). Do not use book's analyses
even if you arrive at the same results. (due 2/10/06)
Assignment 3 (points 50,
FINALLY Due: May 4, 06)
There will be a guest lecture by Dr. Leonard on Tuesday, February 28.
Projects
--(Due: Presentation on May 2, '06, 7:30-10:30 pm) SEE ANNOUNCEMENT --
Presentation schedule (Room Olin EC 239-240):
System biology: 7:30-8:30 pm. (Gary Hrezo and Weijung Huang)
Protein Docking: 8:30-9:30 pm. (Johannes Nangolo and Christopher Roach)
Correlogram method in protein classification: 9:30-10:30 pm. (Kyle Cacciatore and Stephen Jonsson)
Spring 2005
---------------
Assignment 1 (due 2/8/05): Text Exc. 1, 2, 3 on page 30
Biology Presentation schedule:
Robert Asfar - 2/8/05
Florent Launay - 2/8/05
Park Sung Hoon - 2/10/05
Ram, Anjali - 2/10/05
Programming assignment:
Implement the global alignment dynamic programming algorithm.
(Due: 2/12/05)
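As a starting point, the global alignment recurrence (Needleman-Wunsch) can be sketched as below. The scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name global_align are assumptions for illustration, not values prescribed by the assignment.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))  # prints (1, 'ACGT', 'A-GT')
```

The table takes O(nm) time and space. When several alignments tie for the best score, this traceback returns only one of them, preferring the diagonal move.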
Projects.
Project proposal due 3/17/05, Thursday.
Presentations (BLAST: Rob, PAM: Anjali,
Suffix tree: Park): 3/15/05 Tuesday (15 min)
Quiz on Fragment assembly: 3/24/05 Thursday
I will let you complete it in the next class Thursday 3/31/05, for
about 20 minutes at the end of the class
(Due: 4/20/05 Thursday)
Implement the dynamic programming algorithm for RNA base-pairing prediction
under the simplest assumption. Use alpha values as follows:
alpha(ri,rj) = -2, if (ri,rj) = (A,U) or (U,A) or (G,C) or (C,G),
             =  0, otherwise.
The program should work on any string of length up to 100.
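The recurrence implied by the alpha values above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference solution: it treats base pairs as independent, imposes no minimum hairpin loop length, and returns only the minimum energy without recovering the pairs. The function name min_pairing_energy is hypothetical.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
MAX_LEN = 100  # length limit stated in the assignment


def alpha(x, y):
    """Energy contribution of pairing bases x and y."""
    return -2 if (x, y) in PAIRS else 0


def min_pairing_energy(rna):
    """Minimum total energy over nested (non-crossing) base pairings of rna."""
    rna = rna.upper()
    n = len(rna)
    if n > MAX_LEN:
        raise ValueError("sequence longer than 100 bases")
    if n == 0:
        return 0
    # E[i][j] = minimum energy of the substring rna[i..j]; entries with j <= i stay 0.
    E = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            inner = E[i + 1][j - 1] if i + 1 <= j - 1 else 0
            best = min(
                E[i + 1][j],                        # base i left unpaired
                E[i][j - 1],                        # base j left unpaired
                inner + alpha(rna[i], rna[j]),      # i paired with j
            )
            for k in range(i + 1, j):               # split into two independent parts
                best = min(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E[0][n - 1]


if __name__ == "__main__":
    print(min_pairing_energy("GGGAAACCC"))  # prints -6
```

The triple loop costs O(n^3) time, which is fine for strings of length up to 100.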
Presentation schedule:
Protein structure prediction: Rob Asfar: 4/14-19/05
Anjali Ram: 4/19-21/05
System biology: Park: 4/21-26/05
PROJECT PRESENTATION: THURSDAY 5/5/05 EXAM TIME
POWER-POINT PRESENTATION+DEMO, MAX 40 MIN, MIN 20 MIN
IN CLASSROOM OR IN MY OFFICE
-----------------------------
A tutorial on BLAST.
-----------------------------
Spring 2005:
Class Time: Tuesday Thursday 6:30-7:45 pm
Room: E250
-----------------------------
Spring 2003:
(The notes below are primarily from the submissions from the students
in Spring 2003, particularly those from Michael Smith.)
Class schedule: Monday-Wednesday 11:00 am - 12:15 pm
Meets at: Room 132EC
Chapter 1: Introduction to Biology
lecture notes.
Some database search procedures: here.
String comparison algorithms: from Cormen et al.'s Algorithms textbook, embedded in my lecture notes for the Algorithms class.
Chapter 3: Sequence comparison lecture slides.
Chapter 4: Fragment Assembly lecture slides.
Chapter 6: Phylogenetic Trees lecture notes.
Chapter 8: Molecular Structure Prediction was not covered this time.
Chapter 9: DNA computing lecture slides.
Project description.
A self study done on Sickle Cell Anemia, some notes.
-----------------------------
Spring 2004:
Projects:
Expectation: (1) Literature survey on the current status of the
field evidenced in bibliography development and a presentation(s),
(2) and a software implementation. Both (1) and (2) for the Graduate
Students, and only (2) for the Undergraduate student.
A report of approximately 5 typed pages, data from the experiments,
and (outside the 5 pages) source code will be due.
E-mail, CD, or floppy: any format is acceptable.
Due date for report submission: April 15 or next class to that date.
System Biology of E. coli cell division process. (Data: Prof. Leonard)
Michel Lacle
Implementation of BLAST alignment algorithm and Sequence distance
measurement between Protein chains (subsequently to be expanded
toward usage of Correlogram method of Huang et al, as an MS Thesis).
(Data: Protein Data Bank)
Gandhali Samant
Microarray data clustering algorithm implementation. (Data: ??)
Sunjit Bir
Instance-based learning implementation for clustering sequences.
(Data: Prof. Leonard / PDB)
Manav Rattan
Fragment assembly implementation.(Data: ??) Aditi Gupta
Phylogeny reconstruction implementation. (Data: ??)
Lalit Samant
Helix, sheet (secondary structure) prediction, and solvent accessibility
prediction (1D structure) using DSP and homology modeling techniques.
(Data: PDB)
Seema Gandhi
Deploying matrix method, and dynamic programming method to detect motifs
in some nucleotide sequences and then representing the sequences based on
existing motifs (Ref: Gaurav Tandon's work on computer security).
(Data: Prof. Leonard)
Carl Harroch
-----------------------------
Materials are copyrighted to me (year 2003),
or shared with the acknowledged students, as the case may be.
E-mail:
dmitra@zach.fit.edu