This course introduces programming techniques for data analytics. Topics include programming languages, relevant software packages, good programming practices, linear algebra in data analytics, numerical computing, and four or five machine learning algorithms used as running examples. After completing the course, students will be able to implement a data analytics pipeline (data collection, data retrieval, data analysis, data visualization) and several "handy" machine learning algorithms.
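As a rough illustration of the pipeline stages named above, the sketch below runs a tiny collection, storage, retrieval, analysis, and visualization loop using only the Python standard library. The sample readings, the `readings` table, and all function names are hypothetical and chosen for illustration; they are not taken from the course materials.

```python
import sqlite3
import statistics

# Hypothetical sample data standing in for the "data collection" step;
# a real pipeline would pull records from a file, an API, or a web page.
RAW_RECORDS = [
    ("Atlanta", 31.0),
    ("Atlanta", 29.5),
    ("Boston", 24.0),
    ("Boston", 26.5),
    ("Chicago", 27.0),
]


def store(records):
    """Data storage: load the records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", records)
    conn.commit()
    return conn


def retrieve(conn):
    """Data retrieval: group the stored temperatures by city."""
    by_city = {}
    query = "SELECT city, temp_c FROM readings ORDER BY city"
    for city, temp in conn.execute(query):
        by_city.setdefault(city, []).append(temp)
    return by_city


def analyze(by_city):
    """Data analysis: compute the mean temperature per city."""
    return {city: statistics.mean(temps) for city, temps in by_city.items()}


def visualize(means):
    """Data visualization: build a crude text bar chart, one row per city."""
    lines = []
    for city, mean in sorted(means.items()):
        bar = "#" * round(mean)  # one '#' per degree, rounded
        lines.append(f"{city:<8} {bar} {mean:.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    conn = store(RAW_RECORDS)
    means = analyze(retrieve(conn))
    print(visualize(means))
```

Each stage is a separate function so that any one of them (for example, swapping the text chart for a plotting library) can be replaced without touching the others.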
We will use Piazza for discussion (e.g., homework, project). Post your questions there, and the teaching staff and your fellow classmates will be able to help answer them quickly. You can also use Piazza to find project teammates.
T-Square will be used only for submitting assignments and projects.
Instructor  Da Kuang  Thu, 2-3pm, Klaus 1315
Instructor  Polo Chau  Thu, 4-5pm, Klaus 1315
TA  Lianxiao (Shawn) Qiu  Mon, 12pm, Klaus 2108 
Date  Topics  Materials (Wed / Fri)  Events
Aug 20, 22  Course introduction; Course survey; Introduction to Python and its data structures  Slides / Slides  -
Aug 27, 29  Python exercises Q&A; Data collection  Slides  -
Sep 3, 5  Data collection (cont'd)  Slides / Slides  HW1 out (Wed)
Sep 10, 12  Charting/Visualization  R resources (Link 1) (Link 2) (Link 3) / Slides  -
Sep 17, 19  Data storage and retrieval in SQLite; Basic linear algebra overview  Slides / Notes  HW1 due (Fri)
Sep 24, 26  Dense and sparse matrices (including NumPy); Good programming practices  Slides  -
Oct 1, 3  Basic linear algebra overview  Notes / Notes  HW2 out (Mon)
Oct 8, 10  Linear regression  Notes / Scripts  HW2 due (Fri); HW3 out (Sat)
Oct 15, 17  Logistic regression  Slides / Scripts  -
Oct 22, 24  Computer architecture overview; Vectorization in NumPy and R  Slides / Scripts  HW3 due (Fri/Sat)
Oct 29, 31  K-means clustering; Project proposal presentations  Slides  Project proposal due (Thu)
Nov 5, 7  K-means clustering: Case studies; Efficient implementation of K-means  Scripts (README)  -
Nov 12, 14  Numerical software stacks; Singular value decomposition (SVD)  Slides / Notes  -
Nov 19, 21  SVD, eigenvalue decomposition (EVD), and PCA; Computing SVD, EVD, and PCA  Notes  Progress report due (Wed)
Nov 26, 28  (Thanksgiving holiday)  -  -
Dec 3, 5  Latest research and popular topics; Final project presentations  -  Final report due (Thu)
Prof. Le Song  Introduction to Computational Data Analysis  Spring 2014