This course focuses on the theory behind machine learning algorithms for regression and classification problems. We first present the general probabilistic approach to prediction problems, and then introduce elements of Vapnik-Chervonenkis theory, which provides theoretical guarantees for empirical risk minimization in the context of binary classification. We also discuss linear techniques for regression and classification, including regularization techniques such as ridge regression and the lasso. The last section of the course focuses on non-linear algorithms, such as trees, ensemble methods (bagging, random forests), support vector machines, and neural networks, and introduces reproducing kernel Hilbert spaces and their applications in machine learning.