The objective of this course is to provide the classic tools of mathematical statistics, which include the choice of a probabilistic model, its estimation, and its evaluation. We will be particularly interested in the linear model and its extensions in the context of high-dimensional statistical learning (LASSO, ridge, PCR, PLS), the logistic model, and tree-based models (CART, RF, boosting, etc.). The course also aims to provide training in data manipulation and in the practical implementation of the models studied. To this end, a substantial part of the course is devoted to implementing the different models in R through the study of a large number of examples.
Linear and logistic regression. Model selection. Design of experiments. L1- and L2-penalized regression. Regression trees.
- Linear regression. Validity and limitations of the method. Model selection.
- Design of experiments: screening and response surface designs
- Logistic regression
- Elements of statistical learning in high dimension
PRACTICAL ACTIVITIES
The three activities will be devoted to learning regression modelling techniques in R. Numerous data sets will be studied.
- Know how to recognize different classes of statistical learning problems.
- Know how to implement basic models of statistical learning and validate their relevance.
- Know how to propose learning methods suited to high-dimensional settings.
- Know how to use R.
Final mark = 60% Knowledge + 40% Know-how
Know-how = 100% continuous assessment
Knowledge = 100% final exam