We will focus on the family of Bayesian methods, which stands out for its optimality with respect to certain criteria, its reduced algorithmic cost, and the interpretability of its results. We will also study the solutions available to the data scientist when the training sample is small relative to the number of parameters to be learned, or when learning must be unsupervised. On the application side, we will focus on exploring a text corpus, for example to identify new customers eligible for a service or product, to predict customer sentiment (opinions), or to understand the behaviours that predict fraud.
Bayesian decision theory, Unsupervised learning, Hidden Markov models, Text mining, Sentiment analysis, Chatbots, Natural Language Processing, Machine translation.
- Bayesian decision (2h)
- Gaussian mixture model (2h)
- Hidden Markov chain (2h)
- Practical work on Bayesian learning (2h)
- Computational linguistics, NLP and practical text mining (8h)
- Group presentation of a scientific paper (4h)
- Select the appropriate ML method(s) for their classification problem, considering different criteria.
- Develop programs using these methods to analyze their own data.
- Implement a processing chain to interpret texts (e.g. tweets).
- Become familiar with modern text mining techniques and tools, and read recent research papers on the topics mentioned.
Grade = 50% knowledge + 50% know-how
- Knowledge mark = 100% final exam
- Know-how mark = 50% practical work + 50% scientific paper presentation