This is the code repository for Data Mining with Python: Implementing Classification and Regression, published by Packt. It contains all the supporting project files necessary to work through the video course from start to finish.
Python is a dynamic programming language used in a wide range of domains by programmers who find it simple yet powerful. In today’s world, everyone wants to gain insights from the deluge of data coming their way. Data mining provides a way of finding these insights, and Python is one of the most popular languages for data mining, providing both power and flexibility in analysis. Python has become the language of choice for data scientists for data analysis, visualization, and machine learning.
In this course, you will discover the key concepts of data mining and learn how to apply different data mining techniques to find the valuable insights hidden in real-world data. You will also tackle some notorious data mining problems to get a concrete understanding of these techniques.
We begin by introducing you to the important data mining concepts and the Python libraries used for data mining. You will understand the process of cleaning data and the steps involved in filtering out noise so that the available data can support accurate analysis. You will also build your first intelligent application that makes predictions from data. Then you will learn classification and regression techniques such as logistic regression, the k-NN classifier, and SVM, and implement them in real-world scenarios such as predicting house prices and the number of TV show viewers.
By the end of this course, you will be able to apply the concepts of classification and regression using Python and implement them in a real-world setting.
- Understand the basic data mining concepts to implement efficient models using Python
- Know how to use Python libraries and mathematical toolkits such as NumPy, pandas, Matplotlib, and scikit-learn
- Build your first application that makes predictions from data and see how to evaluate the regression model
- Analyze and implement logistic regression and the k-NN model
- Dive into the most effective data cleaning process to get accurate results
- Master the classification concepts and implement the various classification algorithms
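
To make the k-NN idea mentioned above concrete, here is a minimal sketch of a k-nearest-neighbours classifier. It is not taken from the course material: the function name `knn_predict` and the toy data are assumptions made up for illustration, and it uses only the Python standard library so it runs without extra installs. In practice, a library such as scikit-learn provides a ready-made `KNeighborsClassifier`.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration only.
    """
    # Pair each training point's distance to the query with that point's label,
    # then sort so the closest points come first.
    distances = sorted(
        (dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data invented for this sketch: two well-separated clusters.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction_near_low = knn_predict(points, labels, (1.8, 1.2))
prediction_near_high = knn_predict(points, labels, (6.2, 6.8))
```

The choice of `k` trades off noise sensitivity (small `k`) against blurring of class boundaries (large `k`); the value 3 here is arbitrary.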
To fully benefit from the coverage included in this course, you will need:
- Basic knowledge of Python
This course has the following software requirements:
- OS: Any modern OS (Windows, macOS, or Linux)
- RAM: minimum required for the operating system
- CPU: minimum required for the operating system