Pandas

Pandas is a Python package for handling and analyzing spreadsheet-like data. The Pandas course covers loading, cleaning, manipulating, merging, and visualizing such data. It assumes familiarity with Python. It should be of interest, for example, to people who work a lot with Excel and want to automate repetitive tasks or analyze more complicated data.

About 60% to 70% of the course consists of exercises, with one trainer per 5 to 9 participants providing individual help. By the end of the course, participants will have thorough knowledge of and practical experience with Pandas, and will know its core tools and capabilities.

The topics of the course:

1) NumPy

  • ndarray creation routines.
  • Array elements access.
  • Array slicing.
  • Elementwise operations.
  • Attributes of ndarray.
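
As a rough illustration of the topics in this chapter, the following minimal sketch touches each of them once. It is not taken from the course material, and all variable names are arbitrary.

```python
import numpy as np

# Creation routines
a = np.arange(6)                # the integers 0..5 as a 1-D array
m = a.reshape(2, 3)             # the same data viewed as 2 rows x 3 columns
z = np.zeros((2, 3))            # a float array of zeros with the same shape

# Element access and slicing
first = m[0, 0]                 # single element (row 0, column 0)
last_col = m[:, -1]             # last column of every row

# Elementwise operations
doubled = m * 2                 # every element multiplied by 2
total = m + z                   # elementwise sum; the result is a float array

# Attributes of an ndarray
print(m.shape, m.ndim, m.size, m.dtype)
```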

2) The Series object

  • Construct a Series object. Different methods.
  • Series object behaves like a NumPy array in certain aspects.
  • Checking if index key is present.
  • Series object behaves like a dict in certain aspects.
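
The two faces of a Series named above can be sketched as follows. This is an illustrative example with made-up data, not an excerpt from the course.

```python
import pandas as pd

# Two ways to construct a Series: from a list plus an index, and from a dict
prices = pd.Series([1.5, 2.0, 0.5], index=["apple", "pear", "plum"])
stock = pd.Series({"apple": 10, "pear": 0, "plum": 25})

# NumPy-like behavior: vectorized arithmetic and boolean masks
value = prices * stock          # multiplied label by label
cheap = prices[prices < 2.0]    # keeps only entries below 2.0

# dict-like behavior: membership test on the index and lookup by key
has_pear = "pear" in prices
has_kiwi = "kiwi" in prices
apple_price = prices["apple"]
```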

3) The DataFrame object

  • Construct a DataFrame object. Various methods to do so.
  • Add / delete columns.
  • Row selection and slicing.
  • df.loc[], df.iloc[], df.at[], df.iat[] selection and access methods.
  • head(), tail(), transpose() methods.
  • DataFrame attributes.
  • Column-wise, row-wise methods.
  • DataFrame behaves like a 2-dimensional NumPy array in certain aspects.
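
A short sketch of the selection and access methods listed in this chapter, using a small made-up table. The column names and values are assumptions chosen for illustration only.

```python
import pandas as pd

# Construct a DataFrame from a dict of columns, with explicit row labels
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cy"], "age": [31, 45, 27]},
    index=["a", "b", "c"],
)

# Add a column, then delete it again
df["age_in_months"] = df["age"] * 12
df = df.drop(columns=["age_in_months"])

# Label-based versus position-based selection
row_b = df.loc["b"]             # row with label "b"
first_two = df.iloc[0:2]        # first two rows by position
bobs_age = df.at["b", "age"]    # single cell by label
cell = df.iat[2, 0]             # single cell by position

# Attributes and simple methods
shape = df.shape                # (number of rows, number of columns)
top = df.head(2)                # first two rows
flipped = df.transpose()        # rows and columns swapped

# Column-wise calculation
mean_age = df["age"].mean()
```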

4) Cleaning and replacing data in a DataFrame

  • How to deal with missing data.
  • The replace() method.
  • Reading a DataFrame from, or writing it to, a CSV or Excel file.
  • String operations on string Series.
  • Iterating over rows, columns or cells.
  • Renaming certain columns or rows.
  • Sorting a DataFrame with respect to a self-defined criterion.
  • Calculating covariances and correlations of pairs of columns.
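
The cleaning steps in this chapter might look roughly like the sketch below. The data, the sentinel value -999, and the column names are invented for illustration. The CSV is read from an in-memory string so that the example needs no file on disk, and the `key` argument of `sort_values` assumes pandas 1.1 or newer.

```python
import io

import numpy as np
import pandas as pd

# Made-up CSV content: one missing temperature and one -999 sentinel
csv_text = (
    "city,temp,rain\n"
    " berlin ,21.0,0.0\n"
    "PARIS,,1.5\n"
    "rome,-999,3.0\n"
    "amsterdam,15.0,4.0\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Replace the sentinel with NaN, then fill missing values with the column mean
df["temp"] = df["temp"].replace(-999, np.nan)
df["temp"] = df["temp"].fillna(df["temp"].mean())

# String operations through the .str accessor
df["city"] = df["city"].str.strip().str.title()

# Rename a column and sort by a self-defined criterion (length of the city name)
df = df.rename(columns={"temp": "temp_c"})
df = df.sort_values("city", key=lambda s: s.str.len())

# Iterate over rows (fine for small tables, slow for large ones)
for row in df.itertuples(index=False):
    print(row.city, row.temp_c)

# Correlation of a pair of columns
corr = df["temp_c"].corr(df["rain"])

# Write back to CSV text; pass a file path instead to write a file
csv_out = df.to_csv(index=False)
```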

5) SQL-like operations on DataFrames

  • The Split-Calculate-Combine principle (called split-apply-combine in the Pandas documentation).
  • Adding data to Series or DataFrames.
  • Joining DataFrames with SQL-like join-operations.
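
The three topics of this chapter can be sketched on two small made-up tables. The table and column names are hypothetical and serve only as an illustration.

```python
import pandas as pd

orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "amount": [20.0, 35.0, 5.0, 12.5],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 4],
    "name": ["Ann", "Bob", "Dee"],
})

# Split by customer, calculate a sum per group, combine into one Series
totals = orders.groupby("customer_id")["amount"].sum()

# Add rows by concatenating DataFrames
more = pd.DataFrame({"customer_id": [2], "amount": [10.0]})
all_orders = pd.concat([orders, more], ignore_index=True)

# SQL-like joins: inner keeps matching keys only, left keeps every order
inner = orders.merge(customers, on="customer_id", how="inner")
left = orders.merge(customers, on="customer_id", how="left")
```

In the left join, the order of customer 3 has no match in `customers`, so its `name` is a missing value.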

6) Data visualization

  • The plot method of DataFrame.
  • The Seaborn plotting package.
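
A minimal sketch of the DataFrame plot method, with invented data. It assumes Matplotlib is installed, since the plot method draws with it by default. The `Agg` backend line is only there so that the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")           # non-interactive backend; not needed in a notebook
import pandas as pd

df = pd.DataFrame({"month": [1, 2, 3, 4], "sales": [10, 14, 9, 17]})

# The plot method returns a Matplotlib Axes object that can be customized further
ax = df.plot(x="month", y="sales", kind="line", title="Sales per month")
ax.set_ylabel("sales")

# Save the figure to an in-memory buffer; pass a file name to write a file instead
buf = io.BytesIO()
ax.figure.savefig(buf, format="png")

# With Seaborn installed, a comparable line plot would be:
#   import seaborn as sns
#   sns.lineplot(data=df, x="month", y="sales")
```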

Each of the above chapters has one or more exercise units. The course duration is 5 days.

On request, this course can be combined with other courses or shortened to a duration of between 2 and 5 days. If you are interested in this course, please send us a message, since we schedule courses on demand.