# Mathematics from Data

**Teacher**: Dr. Luka Grubišić, professor

**Semester**: third

**ECTS**: 4

Required course

After having completed the course, students should be able to:

- Formulate a data analytics problem as an optimisation problem,
- Estimate the required size of the underlying dataset,
- Implement the appropriate optimisation algorithm,
- Decide on an appropriate stopping criterion to obtain meaningful accuracy for the learning task.

After the completion of this course, students are expected to be able to:

- use empirical risk minimisation to formulate data-driven modelling as an optimisation task,
- choose an appropriate method for machine learning from high-dimensional datasets (SVM, LASSO, logistic regression, label propagation, neural networks),
- implement the chosen method as an efficient algorithm using techniques of mathematical optimisation,
- evaluate the efficiency of the implemented solution method when applied to a high-dimensional dataset,
- rationally compare available methods based on performance indicators as well as their fitness for a particular application,
- understand the time-data trade-off in data analytics,
- successfully apply the chosen optimisation method in the context of interdisciplinary research in mathematical biology,
- evaluate results of a validation method by simulation.

Course content:

- Basics of data modelling (regression, classification, clustering).
- Introduction to the statistical justification of data analytics (empirical risk minimisation).
- Review of convex analysis (separation theorems, gradient descent methods, duality theory, optimisation of separable, pairwise separable and composite convex functions).
- Regression and classification methods. Least squares and least absolute deviations methods. Generalised linear models, neural networks, support vector machines (SVM), least absolute shrinkage and selection operator (LASSO).
- Methods for reconstructing and segmenting images (singular value decomposition for matrices and tensors, regularisation and optimisation).
- Scaling methods to large datasets.
- Validation of the fitted model.
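
As an illustration of the kind of task the course addresses, the following is a minimal sketch, not course material: LASSO written as a regularised empirical risk minimisation problem and solved by proximal gradient descent with an explicit stopping criterion. The function names `soft_threshold` and `lasso_ista`, the synthetic data, the scaling of the objective by `1/(2n)`, and the values of `lam` and `tol` are all assumptions chosen for the example.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def lasso_ista(X, y, lam, tol=1e-8, max_iter=10_000):
    """Minimise (1/(2n)) * ||X w - y||^2 + lam * ||w||_1 by proximal gradient."""
    n, d = X.shape
    # Step size 1/L, with L the Lipschitz constant of the gradient of the smooth part.
    L = np.linalg.norm(X, 2) ** 2 / n
    w = np.zeros(d)
    for k in range(max_iter):
        grad = X.T @ (X @ w - y) / n
        w_new = soft_threshold(w - grad / L, lam / L)
        # Stopping criterion: relative change of the iterate falls below tol.
        if np.linalg.norm(w_new - w) <= tol * max(1.0, np.linalg.norm(w)):
            return w_new, k + 1
        w = w_new
    return w, max_iter


# Hypothetical synthetic data: 3 informative features out of 50.
rng = np.random.default_rng(0)
n, d = 200, 50
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, d))
y = X @ w_true + 0.01 * rng.standard_normal(n)

w_hat, iters = lasso_ista(X, y, lam=0.1)
print("iterations:", iters)
print("nonzero coefficients:", np.flatnonzero(np.abs(w_hat) > 1e-6))
```

The tolerance `tol` controls the optimisation error only. Tightening it beyond the statistical error of the fitted model costs extra iterations without improving the learning task, which is one simple view of the time-data trade-off listed among the learning outcomes.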