CAP 6610, Machine Learning, Fall 2018
Place: CSE Building, Room E222
Time: MWF, Period 4 (10:40-11:30 a.m.)
Instructor:
Arunava Banerjee
Office: CSE E336.
E-mail: arunava@cise.ufl.edu.
Phone: 505-1556.
Office hours: Wednesday 2:00 p.m.-4:00 p.m.
TA:
XXX XXXX
Office: CSE Exxx.
E-mail: xxx@cise.ufl.edu.
Office hours: Monday x:00 p.m.-x:00 p.m. (at CSE E309) or by appointment.
Pre-requisites:
- The official pre-requisite for this course is COT5615 (Mathematics for
Intelligent Systems). Specifically, knowledge of calculus and linear algebra
is necessary, since we shall be touching on mathematical probability theory.
In addition, proficiency in some programming language is a must.
Textbook (recommended): Machine Learning: A Probabilistic Perspective,
Murphy, ISBN-10: 0262018020.
Reference: Pattern Recognition and Machine Learning,
Bishop, ISBN 0-387-31073-8.
Reference: Pattern Classification, 2nd Edition, Duda, Hart
and Stork, John Wiley, ISBN 0-471-05669-3.
Tentative list of Topics to be covered
- Review of mathematical probability theory; finite sample probability
bounds.
- Decision Trees
- Bayes decision theory
- Bayesian learning
- Maximum likelihood estimation and Expectation Maximization
- Linear and generalized linear models for regression and classification
- Sparsity-promoting priors with conjugates and their relationship to regularization
- Kernel methods including Support Vector Machines
- Error Back-propagation and Neural Networks
- Mixture models
- Hidden Markov models
- Principal Components Analysis
- Independent Components Analysis
- Reinforcement Learning
- Performance evaluation: re-substitution, cross-validation, bagging, and boosting
The above list is tentative at this juncture and the set of topics we end up
covering might change due to class interest and/or time constraints.
Please return to this page at least once a week to check for
updates in the tables below.
Evaluation:
- One individual project spanning the semester: 10%
- Homework assignments (written and programming): 30%
- Two midterm exams: 30% each (2 hrs, in-class)
- There will be no makeup exams (exceptions shall be made for those who
present appropriate letters from the Dean of Students Office).
Final grades will be assigned on a curve.
Course Policies:
- Late assignments: All homework assignments are due before class.
- Plagiarism: You are expected to submit your own solutions to the
assignments. While the final project and presentation will be done in groups,
each member will be required to demonstrate his/her contribution to the work.
- Attendance: There is no official attendance requirement. If you
find a better use for the time spent sitting through lectures, please feel
free to devote it to any occupation of your liking. However, keep in mind
that it is your responsibility to stay abreast of the material presented in
class.
- Cell Phones: Absolutely no phone calls during class. Please turn
off the ringer on your cell phone before coming to class.
Academic Dishonesty:
See http://www.dso.ufl.edu/judicial/honestybrochure.htm
for Academic Honesty Guidelines. All academic dishonesty cases will be
handled through the University of Florida Honor Court procedures as
documented by the office of Student Services, P202 Peabody Hall. You may
contact them at 392-1261 for a "Student Judicial Process: Guide for Students"
pamphlet.
Students with Disabilities: Students requesting classroom
accommodation must first register with the Dean of Students Office. The Dean of
Students Office will provide documentation to the student who must then provide
this documentation to the Instructor when requesting accommodation.
Announcements
The second midterm on Dec 5th is NOT cumulative. You
will be tested on material covered after neural networks.
As discussed in class, the final project report (written
in the form of a technical conference/journal paper) is due on or BEFORE
midnight on December 9th.
Midterm 1 will be held on October 12th.
Midterm 2 will be held on the last day of classes, December 5th.
HW1 has been posted on Canvas. Please pay special
attention to the deadline and to how much space you have for each answer.
I have posted Durrett's book below. You are expected to
be comfortable with Chapter 1.
Homeworks
Homework | Due Date | Solutions
List of Topics covered
Aug 19 - Aug 25
- Putative framework: supervised learning, unsupervised learning,
reinforcement learning
- Labeled/unlabeled datasets; training and testing
- Generalization; over-fitting to the training data
Aug 26 - Sep 01
- Continued discussing the roadmap for the rest of the semester
- High-level view of the topics we hope to cover
- Why we need to know basic mathematical probability theory
Sep 02 - Sep 08
- Sample space, outcome
- Measurable space, sigma algebra
- Limit supremum and limit infimum
- Probability, random variable
- Distribution function, density
Sep 09 - Sep 15
- The "risk functional" approach: loss function, hypothesis space
- Empirical risk and the empirical risk minimization principle (see the
sketch below)
- Application to classification, regression, density estimation
- Jensen's inequality
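For concreteness, here is a minimal Python sketch of the central quantity in
the ERM principle, the empirical risk; the function and loss names below are
illustrative, not code from the course:

    import numpy as np

    def empirical_risk(hypothesis, X, y, loss):
        # Average loss of the hypothesis over the training sample (X, y).
        return np.mean([loss(hypothesis(x), t) for x, t in zip(X, y)])

    # 0-1 loss for classification, squared loss for regression.
    zero_one_loss = lambda yhat, t: float(yhat != t)
    squared_loss = lambda yhat, t: (yhat - t) ** 2

The ERM principle then says: pick, from the hypothesis space, the hypothesis
that minimizes this quantity.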
Sep 16 - Sep 22
- Decision Trees
- Impurity: Entropy, Gini, Misclassification (see the sketch below)
- NP-hardness of the problem
- Pruning, cross-validation
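As a quick illustration of the three impurity measures, here is a small
Python sketch, assuming the class proportions at a node are given as a
probability vector:

    import numpy as np

    def entropy(p):
        p = p[p > 0]                      # avoid log(0)
        return -np.sum(p * np.log2(p))

    def gini(p):
        return 1.0 - np.sum(p ** 2)

    def misclassification(p):
        return 1.0 - np.max(p)

    p = np.array([0.5, 0.5])              # a maximally impure binary node
    print(entropy(p), gini(p), misclassification(p))   # 1.0 0.5 0.5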
Sep 23 - Sep 29
- Multivariate Regression
- Closed-form solution (see the sketch below)
- Overdetermined and underdetermined linear systems
- Moore-Penrose pseudoinverse
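A minimal Python sketch of the closed-form least-squares solution; the data
here is synthetic, purely for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))                      # overdetermined: n > d
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

    # Least-squares solution via the Moore-Penrose pseudoinverse. For a
    # full-rank overdetermined system this equals (X^T X)^{-1} X^T y, and
    # np.linalg.pinv also covers the underdetermined/rank-deficient case.
    w = np.linalg.pinv(X) @ y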
Sep 30 - Oct 06
- Perceptron learning rule (see the sketch below)
- Began the mistake-bound theorem for the perceptron
- Energy function for perceptron learning and gradient descent
Additional reading: Perceptron convergence theorem
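A minimal Python sketch of the perceptron learning rule, assuming labels in
{-1, +1} and a bias column folded into X:

    import numpy as np

    def perceptron(X, y, epochs=100):
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            mistakes = 0
            for x_i, y_i in zip(X, y):
                if y_i * (w @ x_i) <= 0:   # misclassified (or on the boundary)
                    w += y_i * x_i         # the perceptron update
                    mistakes += 1
            if mistakes == 0:              # converged: the data is separated
                break
        return w

The mistake-bound theorem says that if the data is linearly separable with
margin gamma inside a ball of radius R, this loop makes at most (R/gamma)^2
mistakes in total.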
Oct 07 - Oct 13
- Artificial sigmoidal neuron and gradient descent on error (see the sketch
below)
- Multi-layer perceptrons and error back-propagation
- Midterm
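A minimal Python sketch of gradient descent on squared error for a single
sigmoidal neuron; the learning rate and step count are arbitrary
illustrative choices:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_sigmoid_neuron(X, y, lr=0.5, steps=1000):
        w = np.zeros(X.shape[1])
        for _ in range(steps):
            a = sigmoid(X @ w)                    # forward pass
            # For E = (1/2) sum_i (a_i - y_i)^2, the chain rule gives
            # dE/dw = sum_i (a_i - y_i) * a_i * (1 - a_i) * x_i.
            grad = X.T @ ((a - y) * a * (1 - a))
            w -= lr * grad / len(y)
        return w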
Oct 14 - Oct 20
- Convolutional neural networks, recurrent neural networks
- ReLU, Softmax, Dropout
- LSTM, ResNet, HighwayNet, DenseNet
- Background for Support Vector Machines
- Constrained optimization problems
- Convexity, convex sets; local minima = global minima
Additional reading: LSTM, a very good blog post that describes the idea
Oct 21 - Oct 27
- Lagrange multipliers (see the worked example below)
- Lagrange duality
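A small worked example of the method of Lagrange multipliers, done
symbolically in Python with sympy; the particular objective and constraint
are illustrative:

    import sympy as sp

    # Maximize f(x, y) = x + y subject to g(x, y) = x^2 + y^2 - 1 = 0.
    x, y, lam = sp.symbols('x y lam')
    L = (x + y) - lam * (x**2 + y**2 - 1)          # the Lagrangian
    sols = sp.solve([sp.diff(L, x), sp.diff(L, y), x**2 + y**2 - 1],
                    [x, y, lam])
    print(sols)   # (+-1/sqrt(2), +-1/sqrt(2)); the maximum is the + branch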
Oct 28 - Nov 03
- Support Vector Machines
- Maximum margin
- Primal and dual formulations
- The kernel trick; polynomial and Gaussian kernels (see the sketch below)
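A minimal Python sketch of the two kernels and the Gram matrix that the SVM
dual actually consumes; the parameter values are illustrative:

    import numpy as np

    def polynomial_kernel(x, z, degree=3, c=1.0):
        # (x.z + c)^degree: an inner product over monomial features.
        return (np.dot(x, z) + c) ** degree

    def gaussian_kernel(x, z, sigma=1.0):
        # exp(-||x - z||^2 / (2 sigma^2)): an infinite-dimensional
        # feature space.
        return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

    # The kernel trick: the dual needs only K[i, j] = k(x_i, x_j),
    # never the feature map itself.
    X = np.random.default_rng(0).normal(size=(5, 2))
    K = np.array([[gaussian_kernel(a, b) for b in X] for a in X])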
Nov 04 - Nov 10
- Unsupervised learning
- Principal Component Analysis (see the sketch below)
- Derivation of the Markov, Chebyshev, and Hoeffding inequalities
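A minimal Python sketch of PCA via the eigendecomposition of the sample
covariance matrix:

    import numpy as np

    def pca(X, k):
        Xc = X - X.mean(axis=0)                 # center the data
        C = Xc.T @ Xc / (len(X) - 1)            # sample covariance matrix
        eigvals, eigvecs = np.linalg.eigh(C)    # eigenvalues, ascending
        top = eigvecs[:, ::-1][:, :k]           # top-k principal directions
        return Xc @ top                         # data projected onto them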
Nov 11 - Nov 17
- Overview of statistical learning theory
- Vapnik-Chervonenkis dimension
- Empirical Rademacher complexity
- Maximum Likelihood
Additional reading: VC Bound, a very good informal description of the theory
Nov 18 - Nov 24
- Maximum likelihood estimates of the mean and covariance of the
multivariate Normal distribution (see the sketch below)
- Thanksgiving
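A minimal Python sketch of the maximum likelihood estimates; note the 1/n
normalization of the covariance, which is biased but is what maximizes the
likelihood:

    import numpy as np

    def normal_mle(X):
        mu = X.mean(axis=0)            # ML estimate of the mean
        Xc = X - mu
        Sigma = Xc.T @ Xc / len(X)     # ML estimate of the covariance (1/n)
        return mu, Sigma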
Nov 25 - Dec 01
- K-Means clustering: objective function and algorithm (see the sketch below)
- Mixture of Gaussians and Expectation Maximization
Additional reading: the Wikipedia article on K-Means clustering, and
D'Souza's notes
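A minimal Python sketch of Lloyd's algorithm for K-Means; initializing the
centers by sampling k data points is one common illustrative choice:

    import numpy as np

    def kmeans(X, k, iters=100, seed=0):
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(iters):
            # Assignment step: each point joins its nearest center.
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            # Update step: each center moves to the mean of its cluster.
            new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
            if np.allclose(new, centers):
                break
            centers = new
        return centers, labels

Each of the two steps can only decrease the within-cluster sum-of-squares
objective, so the algorithm terminates, though only at a local optimum.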