Written by Prashant Basnet
ML model implementation using scikit-learn.
#NLP #series #model
Once you have created your model with the desired parameters, it's time to fit (train) it on some training data.
Scikit-learn comes with train/test split functionality.
The pandas library in Python lets us read CSV files, text files, or tab-separated files. It reads a file into what's known as a DataFrame object.
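As a rough illustration of reading tab-separated data, the sketch below uses an in-memory string in place of a real file so that it runs on its own. The three rows are copied (truncated messages included) from the table in this post, and the names raw and df are illustrative choices only. With a real file you would pass its path to read_csv instead of the StringIO object.

```python
import io

import pandas as pd

# In-memory stand-in for a tab-separated file (assumption: a real file
# would have the same four columns). Rows are copied from the table in this post.
raw = (
    "label\tmessage\tlength\tpunct\n"
    "ham\tOk lar... Joking wif u oni...\t29\t6\n"
    "spam\tFree entry in 2 a wkly comp to win FA Cup fina...\t155\t6\n"
    "ham\tNah I don't think he goes to usf, he lives aro...\t61\t2\n"
)

# sep="\t" tells pandas the columns are separated by tabs rather than commas.
df = pd.read_csv(io.StringIO(raw), sep="\t")

print(df.head())  # first rows of the DataFrame
print(df.shape)   # (number of rows, number of columns)
```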
| label | message | length | punct |
|----------|------------------------------------------------------------------------------------------|------------|-------|
| ham | Go until jurong point, crazy.. Available only ... | 111 | 9 |
| ham | Ok lar... Joking wif u oni... | 29 | 6 |
| spam | Free entry in 2 a wkly comp to win FA Cup fina... | 155 | 6 |
| ham | U dun say so early hor... U c already then say... | 49 | 6 |
| ham | Nah I don't think he goes to usf, he lives aro... | 61 | 2 |
At this point we don't yet know how to extract features from the text.
So for now we will use only the numerical features, i.e. length and punctuation. These two columns form the feature matrix X, and the label column is the target y. Later we will learn how to convert the text message into numerical information using text feature extraction.
Things to note: df.isnull().sum() counts the null values in each column, so you can check whether any data is missing.
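A minimal sketch of the null check and the feature selection described above. The small DataFrame is rebuilt by hand from the five table rows (the message column is left out for brevity), so the numbers are only as representative as that sample.

```python
import pandas as pd

# Hand-built copy of the five table rows (message column omitted).
df = pd.DataFrame({
    "label": ["ham", "ham", "spam", "ham", "ham"],
    "length": [111, 29, 155, 49, 61],
    "punct": [9, 6, 6, 6, 2],
})

# Count missing values per column; all zeros means nothing is missing.
null_counts = df.isnull().sum()
print(null_counts)

# How many messages fall under each label.
print(df["label"].value_counts())

# Features: the two numerical columns. Target: the label column.
X = df[["length", "punct"]]
y = df["label"]
```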
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# step 1: import the model
# step 2: instantiate the model
# step 3: fit the model on the training data
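The code for these steps is not shown here, so the following is only a sketch of what they could look like. Two things are assumptions: the model (LogisticRegression is picked purely as an example classifier) and the data (a synthetic stand-in generated with NumPy, not the real SMS dataset). The name lr_model is likewise hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression  # step 1: import the model

# Synthetic stand-in data (assumption, not the real dataset):
# 80 "ham" rows with shorter lengths and 20 "spam" rows with longer ones.
rng = np.random.default_rng(0)
n_ham, n_spam = 80, 20
df = pd.DataFrame({
    "label": ["ham"] * n_ham + ["spam"] * n_spam,
    "length": np.concatenate([
        rng.integers(10, 100, size=n_ham),
        rng.integers(100, 200, size=n_spam),
    ]),
    "punct": rng.integers(0, 12, size=n_ham + n_spam),
})

X = df[["length", "punct"]]
y = df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# step 2: instantiate the model (max_iter raised because the features are unscaled)
lr_model = LogisticRegression(max_iter=1000)

# step 3: fit the model on the training data only
lr_model.fit(X_train, y_train)
```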
Now that we have trained our model, it's ready for testing.
For testing and evaluation:
For example, say the accuracy is 83%; we might then want to try other models. Other models are implemented the same way: import, instantiate, fit, then predict.
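One possible way to evaluate a fitted classifier uses the functions in sklearn.metrics. The sketch below is hedged in the same way as before: the data is a synthetic stand-in and LogisticRegression is an assumed example model, so the scores it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption, not the real dataset).
rng = np.random.default_rng(0)
n_ham, n_spam = 80, 20
df = pd.DataFrame({
    "label": ["ham"] * n_ham + ["spam"] * n_spam,
    "length": np.concatenate([
        rng.integers(10, 100, size=n_ham),
        rng.integers(100, 200, size=n_spam),
    ]),
    "punct": rng.integers(0, 12, size=n_ham + n_spam),
})
X_train, X_test, y_train, y_test = train_test_split(
    df[["length", "punct"]], df["label"], test_size=0.3, random_state=42
)

# Assumed example model, trained on the training split.
lr_model = LogisticRegression(max_iter=1000)
lr_model.fit(X_train, y_train)

# Predict on data the model has not seen during training.
predictions = lr_model.predict(X_test)

# Rows are the true labels, columns the predicted labels (order: ham, spam).
cm = metrics.confusion_matrix(y_test, predictions, labels=["ham", "spam"])
print(cm)

# Precision, recall and F1 score per class.
print(metrics.classification_report(y_test, predictions))

# Fraction of test messages classified correctly.
accuracy = metrics.accuracy_score(y_test, predictions)
print(accuracy)
```

The confusion matrix is worth reading alongside the accuracy: when most messages are ham, a model can score well on accuracy while still missing much of the spam.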
Naive Bayes
from sklearn.naive_bayes import MultinomialNB
nb_model = MultinomialNB()
nb_model.fit(X_train,y_train)
predictions = nb_model.predict(X_test)
Support vector classification model
from sklearn.svm import SVC
svc_model = SVC()
svc_model.fit(X_train, y_train)
predictions2 = svc_model.predict(X_test)
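Because every scikit-learn classifier shares the same fit and predict methods, the two models above can also be compared in a loop. This is a sketch on synthetic stand-in data (an assumption, not the real dataset); the models and results dictionaries are hypothetical names introduced for the example.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

# Synthetic stand-in data (assumption, not the real dataset).
rng = np.random.default_rng(0)
n_ham, n_spam = 80, 20
df = pd.DataFrame({
    "label": ["ham"] * n_ham + ["spam"] * n_spam,
    "length": np.concatenate([
        rng.integers(10, 100, size=n_ham),
        rng.integers(100, 200, size=n_spam),
    ]),
    "punct": rng.integers(0, 12, size=n_ham + n_spam),
})
X_train, X_test, y_train, y_test = train_test_split(
    df[["length", "punct"]], df["label"], test_size=0.3, random_state=42
)

# Same train/test split for every model so the scores are comparable.
models = {"MultinomialNB": MultinomialNB(), "SVC": SVC()}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)          # train on the training split
    preds = model.predict(X_test)        # predict on the held-out split
    results[name] = accuracy_score(y_test, preds)

for name, score in results.items():
    print(f"{name}: {score:.3f}")
```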
This is how you train and test a model.
Next, we will learn feature extraction on raw text.