Data processing with Python (or R)
(TDADPP)
Data processing refers to the management and analysis of data along its entire life cycle. In this course, students learn how to develop Python software that collects, organises and analyses data in order to gain an initial, exploratory understanding of it.
Audience (and prerequisites)
Anyone with a basic knowledge of the Python language who wants to explore its applications in Data Science.
Approaches (Objective)
Data Collection
- Open data sources
- API
- Scraping
Data Representation
- Data formats
- Relational algebra
- Database
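As an illustration of the Data Representation topics above, the following minimal sketch loads a small CSV text into an in-memory SQLite database and expresses the three basic relational algebra operations (selection, projection, join) as one SQL query. The table names, column names and values are hypothetical and chosen only for the example; it uses the Python standard library and is not taken from the course material.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content standing in for a downloaded data file.
csv_text = "id,name,dept_id\n1,Ada,10\n2,Alan,20\n3,Grace,10\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))  # one dict per CSV row

# In-memory database with two illustrative tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, dept_id INTEGER)")
conn.execute("CREATE TABLE dept (id INTEGER, title TEXT)")
conn.executemany("INSERT INTO employee VALUES (:id, :name, :dept_id)", rows)
conn.executemany(
    "INSERT INTO dept VALUES (?, ?)",
    [(10, "Research"), (20, "Operations")],
)

# Join (JOIN ... ON), selection (WHERE) and projection (column list).
result = conn.execute(
    "SELECT e.name, d.title "
    "FROM employee AS e JOIN dept AS d ON e.dept_id = d.id "
    "WHERE d.title = 'Research' "
    "ORDER BY e.name"
).fetchall()
conn.close()

print(result)
```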
Data Quality Assessment
- Data source and data fusion
- Data volume
- Data standards and data impact
Data wrangling with pandas
- Indexing
- Reshaping
- Merging and joining
- Cleaning and normalisation
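The four wrangling topics above can be sketched on a toy dataset as follows. The data, column names and the choice to fill missing revenue with zero are assumptions made for illustration only; in a real project the right treatment of missing values depends on the data.

```python
import pandas as pd

# Hypothetical toy data with inconsistent labels and one missing value.
sales = pd.DataFrame({
    "city": [" Rome", "Milan ", "rome", "Milan"],
    "year": [2020, 2020, 2021, 2021],
    "revenue": [10.0, 20.0, None, 25.0],
})
regions = pd.DataFrame({
    "city": ["Rome", "Milan"],
    "region": ["Lazio", "Lombardy"],
})

# Cleaning and normalisation: trim whitespace, unify capitalisation,
# and fill the missing value (zero is just one possible choice).
sales["city"] = sales["city"].str.strip().str.title()
sales["revenue"] = sales["revenue"].fillna(0.0)

# Merging and joining: attach the region to every sales row.
merged = sales.merge(regions, on="city", how="left")

# Reshaping: one row per city, one column per year.
wide = merged.pivot(index="city", columns="year", values="revenue")

# Indexing: label-based lookup of a single cell.
milan_2021 = wide.loc["Milan", 2021]
print(wide)
```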
Exploratory Data Analysis
- Descriptive statistics
- Data visualisation
- Feature engineering
- Clustering
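As a rough sketch of the Exploratory Data Analysis topics above, the example below computes descriptive statistics, derives a new feature, and groups the rows with a small hand-written k-means loop. The dataset, the column names and the fixed starting centroids are assumptions made for illustration; the loop is a simplified teaching version, not a library routine, and visualisation is left out to keep the example dependency-free beyond pandas and NumPy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two visibly separated groups.
df = pd.DataFrame({
    "height_cm": [150.0, 152.0, 155.0, 180.0, 182.0, 185.0],
    "weight_kg": [50.0, 52.0, 55.0, 80.0, 85.0, 90.0],
})

# Descriptive statistics: count, mean, std, min, quartiles, max.
summary = df.describe()

# Feature engineering: derive body mass index from the raw columns.
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2

# Clustering: simplified k-means (k = 2) on standardised features.
X = df[["height_cm", "weight_kg"]].to_numpy()
X = (X - X.mean(axis=0)) / X.std(axis=0)
centroids = X[[0, -1]].copy()  # deterministic start: first and last rows
for _ in range(10):
    # Distance of every point to every centroid, shape (n_points, k).
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # Move each centroid to the mean of the points assigned to it.
    centroids = np.array([X[labels == k].mean(axis=0) for k in range(2)])
df["cluster"] = labels

print(summary)
print(df)
```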