ABOUT THE COURSE!
As part of our Certificate Program in Data Science, this course aims to provide a fundamental understanding of using SQL to access data, data visualization, and exploratory data analysis.
Data analysis is the process of obtaining raw data and converting it into information useful for decision-making. Data are the necessary inputs to the analysis, and much of the world’s data resides in databases. Extracting data from a database using SQL is therefore an important technique to learn if you want to become a data scientist. Data initially obtained must be processed, organised, and cleaned before analysis. This course introduces data wrangling and data exploration using Python libraries such as NumPy and Pandas. One of the key skills of a data scientist is the ability to tell a compelling story, visualizing data and findings in an approachable and stimulating way. The course teaches various techniques for presenting data visually.
Moreover, this course gives learners opportunities to practice the skills required to leverage data, reveal valuable insights, and advance their careers.
To begin the course, let's take a few minutes to explore the course site. Review the material we’ll cover each week, and preview the assignments/projects/quizzes you’ll need to complete to pass the course.
Main concepts are delivered through videos, demos and hands-on exercises.
|Course name:||Data Analysis with Python|
|Estimated Time:||6 weeks. Students should allocate an average of 2 hours/day to complete the course.|
- Comprehends the basics of SQL and applies SQL statements to query data
- Practices with advanced SQL statements
- Uses Python to access a database
- Comprehends data analysis and applies Python packages to import and export data
- Explains common problems in data and practices techniques to handle missing values and normalize data
- Practices descriptive statistics and explains various statistical correlation methods
- Explains and practices model development, evaluation, and tuning
- Comprehends data visualization and practices with Matplotlib
- Uses Seaborn to create plots
- Applies advanced data visualization techniques
Module 1: Databases and SQL for Data Science
- Lesson 1: Introduction to Databases
- Lesson 2: Basic SQL
- Lesson 3: String Patterns, Ranges, Sorting, and Grouping
- Lesson 4: Functions, Sub-Queries, Multiple Tables
- Lesson 5: Accessing databases using Python
Assignment 1: Real-world database project using SQL statements
Module 2: Data Analysis
- Lesson 6: Importing and Exporting Datasets
- Lesson 7: Data Wrangling
- Lesson 8: Exploratory Data Analysis
- Lesson 9: Model Development
- Lesson 10: Model Evaluation and Refinement
Module 3: Data Visualization with Python
- Lesson 11: Introduction to Data Visualization
- Lesson 12: Basic Visualization Tools
- Lesson 13: Specialized Visualization Tools
- Lesson 14: Advanced Visualization Tools
- Lesson 15: Visualizing Geospatial Data
- Lesson 16: Interactive Visualization with Plotly
Assignment 2: NYC taxi analysis project
M.S. Vu Thuong Huyen
REVIEWERS & TESTER
Ph.D. Dang Hoang Vu
M.Sc. Nguyen Cong Thanh
Assoc. Prof. Tu Minh Phuong
Ph.D. Nguyen Van Vinh
Ph.D. Tran The Trung
Below is the list of all free massive open online course (MOOC) materials from Coursera that FUNiX uses for this course: