Scikit-Learn is a very powerful Python package for machine learning.
The Scikit-Learn course is intended for engineers, economists (insurance, banking), marketing experts, and consultants who want to apply machine learning algorithms to create software that acts intelligently by learning from data. Exercises make up about 50% to 60% of the course, with one trainer per 5 to 9 participants providing individual help. By the end of the course, participants will be able to program powerful, state-of-the-art predictive algorithms in a few lines of code and use them in production for forecasting.
The course assumes familiarity with core Python as well as with linear algebra and basic calculus. It covers the theory behind the 3 to 4 most powerful machine learning algorithms of Scikit-Learn in more detail. Otherwise, the focus is on the main principles and tools of Scikit-Learn needed to apply it successfully to real-world problems. The topics:

  • Numpy
  • Estimator and Predictor interfaces (fit(), predict(), score() methods)
  • Linear-Regression & Logistic-Regression
  • Evaluating predictive models. Different metrics (precision, recall, ROC, gain chart, ...)
  • Visualizing interactions of variables and model quality with seaborn, scikit-plot, ...
  • Transformer interface. Transforming non-numerical to numerical data.
  • Pipelines. Combining transformation and prediction steps into one process.
  • How to detect and deal with outliers. Pitfalls and how to overcome them.
  • Ridge-Regression, Lasso-Regression, Regularization
  • Support-Vector-Machines (Classifier, Regressor, theoretical + practical aspects)
  • GridSearchCV (Finding good model parameters)
  • Principal-Components- and Space-Density Transformers
  • Decision Trees, Random Forests. (Theory + practical aspects)
  • Clustering

Each of the listed topics has one or more exercise units. The course duration is 5 days.
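
To give a rough idea of what "a few lines of code" can look like, the following is a minimal sketch that touches several of the listed topics: the fit(), predict() and score() methods, a pipeline, and GridSearchCV. It is not taken from the course material. The synthetic data set, the step names "scale" and "clf", and the candidate values for C are assumptions chosen purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real data set (illustrative assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A pipeline chains a transformer and a predictor into a single estimator.
pipeline = Pipeline([
    ("scale", StandardScaler()),                 # hypothetical step name
    ("clf", LogisticRegression(max_iter=1000)),  # hypothetical step name
])

# GridSearchCV evaluates each candidate value of the regularization
# parameter C with 5-fold cross-validation and refits the best model.
search = GridSearchCV(
    pipeline,
    param_grid={"clf__C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)

predictions = search.predict(X_test)     # predicted class labels
accuracy = search.score(X_test, y_test)  # mean accuracy on held-out data

print("best C:", search.best_params_["clf__C"])
print("test accuracy:", round(accuracy, 3))
```

Because the pipeline itself is an estimator, the scaling step is refit inside every cross-validation fold, so the held-out part of each fold does not influence the scaling.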