**Teacher**: Dr. Luka Grubišić, professor

**Semester**: third

**ECTS**: 4

Required course

Course objectives

After having completed the course, students should be able to:

- Formulate a data analytics problem as an optimisation problem,
- Estimate the dataset size that the problem requires,
- Implement the appropriate optimisation algorithm,
- Decide on an appropriate stopping criterion to reach a meaningful accuracy for the learning task.

Expected learning outcomes

After the completion of this course, students are expected to be able to:

- use empirical risk minimisation to formulate data-driven modelling as an optimisation task,
- choose an appropriate machine learning method for high-dimensional datasets (SVM, LASSO, logistic regression, label propagation, neural networks),
- implement the chosen method as an efficient algorithm using techniques of mathematical optimisation,
- evaluate the efficiency of the implemented solution method when applied to a high-dimensional dataset,
- rationally compare available methods based on performance indicators as well as their fitness for a particular application,
- understand the time-data trade-off in data analytics,
- successfully apply the chosen optimisation method in the context of interdisciplinary research in mathematical biology,
- evaluate results of a validation method by simulation.

Course content

- Basics of data modelling (regression, classification, clustering).
- Introduction to statistical justification of data analytics (empirical risk minimisation).
- Review of convex analysis (separation theorems, gradient descent methods, duality theory, optimisation of separable, pairwise separable and composite convex functions).
- Regression and classification methods. Least squares and least absolute deviations methods. Generalised linear models, neural networks, support vector machines, least absolute shrinkage and selection operator (LASSO).
- Methods for reconstructing and segmenting images (singular value decomposition for matrices and tensors, regularisation and optimisation).
- Scaling methods to large datasets.
- Validation of the fitted model.
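
To give a flavour of the topics listed above, the following is a minimal sketch (not part of the official course material) of empirical risk minimisation for a least squares regression, solved by gradient descent with a gradient-norm stopping criterion. The function names, the synthetic data, the step size and the tolerance are all assumptions chosen for illustration.

```python
import random


def empirical_risk(w, b, xs, ys):
    """Mean squared error of the linear model y ~ w*x + b on the sample."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def fit_least_squares(xs, ys, step=0.5, tol=1e-8, max_iter=10_000):
    """Minimise the empirical risk by plain gradient descent.

    Stops when the gradient norm falls below `tol` (the stopping
    criterion) or after `max_iter` iterations, whichever comes first.
    Returns the fitted slope, intercept and the last iteration index.
    """
    n = len(xs)
    w, b = 0.0, 0.0
    it = 0
    for it in range(max_iter):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        if (grad_w ** 2 + grad_b ** 2) ** 0.5 < tol:
            break
        w -= step * grad_w
        b -= step * grad_b
    return w, b, it


# Synthetic data (an assumption for illustration): y = 2x + 1 plus small noise.
rng = random.Random(0)
xs = [i / 50 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.01) for x in xs]

w, b, iterations = fit_least_squares(xs, ys)
risk = empirical_risk(w, b, xs, ys)
```

The tolerance `tol` plays the role of the stopping criterion mentioned in the course objectives: a looser tolerance ends the iteration sooner at the cost of a less accurate minimiser, which is one simple instance of trading computation against accuracy.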