Email-Spam-Classifier-Using-Naive-Bayes

Naive Bayes is a supervised classification technique based on Bayes' Theorem with an assumption of independence among predictors. That is, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.
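The independence assumption above can be written out as a short formula. This is a sketch of the standard Naive Bayes decision rule; the symbols $C$ (a class such as spam or ham) and $x_1, \dots, x_n$ (the features, here word counts) are notation assumed for illustration and do not appear in the original project.

```latex
% Bayes' Theorem for a class C given features x_1, ..., x_n
P(C \mid x_1, \dots, x_n) = \frac{P(C)\, P(x_1, \dots, x_n \mid C)}{P(x_1, \dots, x_n)}

% "Naive" step: features are assumed conditionally independent given the class
P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)

% The denominator is the same for every class, so the predicted class is
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)
```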

It is a popular technique for text categorization, judging documents as belonging to one category or the other (such as spam or legitimate, sports or politics, etc.) with word frequencies as features.
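To make "word frequencies as features" concrete, here is a minimal sketch of how CountVectorizer turns text into a matrix of word counts. The three example sentences are made up for illustration and are not taken from the project's emails folder.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy documents (not from the repository's data).
docs = [
    "win free money now",
    "meeting at noon tomorrow",
    "free free prize",
]

vectorizer = CountVectorizer()
# Sparse matrix: one row per document, one column per distinct word.
counts = vectorizer.fit_transform(docs)

# vocabulary_ maps each word to its column index in the matrix.
vocabulary = sorted(vectorizer.vocabulary_)
free_column = vectorizer.vocabulary_["free"]
free_counts = counts[:, free_column].toarray().ravel().tolist()

print(vocabulary)
print(free_counts)  # how often "free" occurs in each document: [1, 0, 2]
```

Each row of `counts` is the feature vector that a classifier would see for one document.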

Goal: Previously unseen records should be assigned a class as accurately as possible.

We have one set of emails labeled 'spam' and another labeled 'ham' (not spam).
The emails are first read and stored in a pandas DataFrame. They are then converted to word counts using CountVectorizer.
These counts are used to train the model, and its predictions are then tested with a sample input.
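The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's actual SpamClassifier.py: the helper `read_emails`, the column names `message` and `class`, the `latin-1` encoding, and the six toy messages are all hypothetical, and the toy emails are written to a temporary directory only so the example runs on its own.

```python
import os
import tempfile

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB


def read_emails(folder, label):
    """Read every file in `folder` and tag its text with `label` (hypothetical helper)."""
    rows = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        with open(path, encoding="latin-1") as handle:
            rows.append({"message": handle.read(), "class": label})
    return rows


# Made-up stand-in data; the real project reads its own emails folder.
toy = {
    "spam": ["win free money now", "free prize claim now", "cheap pills free offer"],
    "ham": [
        "meeting at noon tomorrow",
        "lunch with the project team",
        "please review the attached report",
    ],
}

with tempfile.TemporaryDirectory() as root:
    rows = []
    for label, messages in toy.items():
        folder = os.path.join(root, label)
        os.makedirs(folder)
        for i, text in enumerate(messages):
            with open(os.path.join(folder, f"{i}.txt"), "w", encoding="latin-1") as handle:
                handle.write(text)
        rows.extend(read_emails(folder, label))

# Step 1: store the emails and their labels in a DataFrame.
data = pd.DataFrame(rows)

# Step 2: convert each message into a vector of word counts.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(data["message"])

# Step 3: train a multinomial Naive Bayes model on the counts.
classifier = MultinomialNB()
classifier.fit(counts, data["class"])

# Step 4: classify previously unseen text, reusing the fitted vocabulary.
samples = ["free money offer", "review the report before the meeting"]
predictions = classifier.predict(vectorizer.transform(samples)).tolist()
print(predictions)  # ['spam', 'ham'] on this toy data
```

Note that new text is passed through `vectorizer.transform`, not `fit_transform`, so it is encoded with the vocabulary learned during training; words the model has never seen are simply ignored.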

Python libraries used: pandas, numpy, io, os, and CountVectorizer and MultinomialNB from sklearn

The spam classifier labels the given input as either spam or ham. A sample output is found here.

Some Practical Applications:

Direct Marketing
Fraud Detection
Text Classification
Spam Filtering
Categorizing News
Medical Diagnosis
Face Recognition


Surya-Murali/Email-Spam-Classifier-Using-Naive-Bayes
