Topic Modelling in ML:

#series #MachineLearning

Topic modeling allows us to efficiently analyse large volumes of text by clustering documents into topics.

A large amount of text data is unlabelled, meaning we won't be able to apply our previous supervised learning approaches to create machine learning models for that data. For example, a collection of newspaper articles may span several different categories, but the text itself carries no category labels. So it's up to us to try to discover those labels through topic modeling.

If we have unlabelled data, we can attempt to discover labels.

In the case of text data, this means attempting to discover clusters of documents, grouped together by topic.

A very important idea here is that it's difficult to evaluate the effectiveness of an unsupervised learning model. We don't actually know the correct topic or the right answer to begin with.
All we know is that the documents clustered together share similar topic ideas.
It's up to the user to identify what these topics actually represent.
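
To make the idea concrete, here is a minimal sketch in plain Python. It is not a full topic model: it uses a toy k-means clustering over word counts as a stand-in, and the four example documents, the stopword list, and every function name are made up for illustration. It groups the documents into two clusters and then prints the highest-weighted words in each cluster, which is the point where a human has to step in and decide what each "topic" means.

```python
from collections import Counter

# Hypothetical toy corpus; note that there are no labels anywhere.
docs = [
    "the team won the football match",
    "the striker scored a goal in the match",
    "the election results favoured the new party",
    "voters chose the party in the election",
]

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "a", "in", "of", "to", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def vectorize(documents):
    """Turn each document into a word-count vector over a shared vocabulary."""
    vocab = sorted({w for d in documents for w in tokenize(d)})
    vectors = []
    for d in documents:
        counts = Counter(tokenize(d))
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=10):
    """Basic k-means with a deterministic farthest-point initialisation."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assign each document to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Move each centroid to the mean of the documents assigned to it.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_words(vocab, centroid, n=3):
    """Words with the largest weight in a centroid (ties broken alphabetically)."""
    ranked = sorted(zip(vocab, centroid), key=lambda p: (-p[1], p[0]))
    return [w for w, weight in ranked[:n] if weight > 0]


vocab, vectors = vectorize(docs)
labels, centroids = kmeans(vectors, k=2)

for j, centroid in enumerate(centroids):
    members = [i for i, lab in enumerate(labels) if lab == j]
    print(f"cluster {j}: documents {members}, top words {top_words(vocab, centroid)}")
```

The algorithm only reports that some documents belong together and which words they share. Nothing in the output says "sport" or "politics"; that naming step is left to whoever reads the top words, and there is no ground truth to score the clustering against.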