Introduction to Topic Modeling
If you do not have labels, it is unsupervised learning
If you have labels, it is supervised learning
A topic model is a type of statistical model that lets you discover the topics in a set of documents. Each topic consists of a cluster of words that frequently occur together.
Functions of Topic Modeling
- Find latent variables that describe the structure of a document
- Cluster words that occur together
Topic modeling is a form of fuzzy clustering: each document can belong to several clusters, each to a different degree.
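To make the "different degree" idea concrete, here is a minimal sketch that contrasts hard clustering with fuzzy clustering. The documents, topic names, numbers and the threshold are all made up for illustration; they do not come from a trained model.

```python
# Hypothetical degrees of membership for two documents (made-up numbers).
doc_topics = {
    "doc_a": {"sports": 0.7, "politics": 0.2, "food": 0.1},
    "doc_b": {"sports": 0.1, "politics": 0.45, "food": 0.45},
}

def hard_assignment(memberships):
    """Hard clustering: keep only the single strongest cluster."""
    return max(memberships, key=memberships.get)

def soft_assignment(memberships, threshold=0.2):
    """Fuzzy clustering: keep every cluster whose degree meets the threshold."""
    return {t: w for t, w in memberships.items() if w >= threshold}

for name, memberships in doc_topics.items():
    # The degrees of membership of one document sum to 1.
    assert abs(sum(memberships.values()) - 1.0) < 1e-9
    print(name, hard_assignment(memberships), soft_assignment(memberships))
```

With hard clustering, doc_b would be forced into one cluster even though it is split evenly between two. The fuzzy view keeps both.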
Workflow of Topic Models
1. Input the documents
2. Pass them through a topic model
3. Get the topics (clusters of words; a word can belong to more than one topic)
Topic: a distribution over words; the words with the highest frequency are the core words of that topic.
Document: a distribution over topics.
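One way to read these two definitions is as a recipe for generating a document: pick a topic from the document's topic distribution, then pick a word from that topic's word distribution. The sketch below follows that reading. The probabilities, the helper names draw and generate_words, and the seed are assumptions made for illustration only.

```python
import random

# Hypothetical distributions (made-up numbers). Note that "team" and "vote"
# appear in both topics: a word can belong to more than one topic.
topic_word = {
    "sports":   {"game": 0.5, "team": 0.4, "vote": 0.1},
    "politics": {"vote": 0.6, "law": 0.3, "team": 0.1},
}
doc_topic = {"sports": 0.75, "politics": 0.25}  # one document

def draw(distribution, rng):
    """Draw one key from a {key: probability} dictionary."""
    keys = list(distribution)
    return rng.choices(keys, weights=[distribution[key] for key in keys])[0]

def generate_words(doc_topic, topic_word, n_words, seed=0):
    """Generate words by first drawing a topic, then a word from that topic."""
    rng = random.Random(seed)
    words = []
    for _ in range(n_words):
        topic = draw(doc_topic, rng)
        words.append(draw(topic_word[topic], rng))
    return words

print(generate_words(doc_topic, topic_word, n_words=8))
```

A topic model works in the opposite direction: it only sees the words and tries to recover both kinds of distribution.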
Different types of algorithms: LSA (Latent Semantic Analysis), LDA (Latent Dirichlet Allocation), HDP (Hierarchical Dirichlet Process), etc.
Most important one: LDA (unsupervised)
Inconveniences of LDA (you have to choose these yourself before training):
- Number of topics you want to find (k)
- Number of iterations to run
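To show where these two settings enter, here is a toy sketch of LDA fitted with collapsed Gibbs sampling, which is one common way to train it (libraries often use other inference methods). The function name lda_gibbs, the tiny corpus, and the values of alpha, beta and seed are assumptions for illustration, not tuned or recommended settings.

```python
import random

def lda_gibbs(docs, k, n_iters, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA.

    docs: list of token lists, k: number of topics, n_iters: number of sweeps.
    alpha, beta and seed are illustrative defaults.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    v = len(vocab)
    doc_topic = [[0] * k for _ in docs]                       # count(doc, topic)
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(k)]  # count(topic, word)
    topic_total = [0] * k                                     # count(topic)
    assignments = []

    # Start from a random topic for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            t = rng.randrange(k)
            z_doc.append(t)
            doc_topic[d][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1
        assignments.append(z_doc)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                t = assignments[d][i]
                doc_topic[d][t] -= 1
                topic_word[t][w] -= 1
                topic_total[t] -= 1
                # Weight = (how much the doc uses topic j) * (how much topic j uses w).
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta) / (topic_total[j] + v * beta)
                    for j in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                assignments[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][w] += 1
                topic_total[t] += 1

    return doc_topic, topic_word, assignments

docs = [
    "game team game score team".split(),
    "vote law vote election law".split(),
    "team score game vote".split(),
]
doc_topic, topic_word, assignments = lda_gibbs(docs, k=2, n_iters=50)
for d, counts in enumerate(doc_topic):
    print("document", d, "topic counts", counts)
```

Both k and n_iters are fixed before the sampler runs, and nothing in the algorithm tells you whether you picked good values. That is why they are listed as inconveniences.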
LDA results answer two questions:
- Which topics occur in this document?
- Which topic should word X in this document be assigned to?
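As a rough sketch of how both questions can be read off a fitted model, the snippet below assumes you already have a topic distribution for one document and a word distribution for each topic. All numbers, the min_share cut-off, and the function names are hypothetical.

```python
# Hypothetical fitted model (made-up numbers): 2 topics, 4-word vocabulary.
topic_word = {
    "sports":   {"game": 0.5,  "team": 0.4,  "vote": 0.05, "law": 0.05},
    "politics": {"game": 0.05, "team": 0.15, "vote": 0.4,  "law": 0.4},
}
doc_topic = {"sports": 0.75, "politics": 0.25}  # topic mix of one document

def topics_in_document(doc_topic, min_share=0.1):
    """Question 1: which topics occur in this document, strongest first."""
    kept = [(t, p) for t, p in doc_topic.items() if p >= min_share]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

def topic_for_word(word, doc_topic, topic_word):
    """Question 2: the most likely topic for `word` in this document."""
    # Score each topic by (share of the topic in the doc) * (weight of the word in the topic).
    scores = {t: doc_topic[t] * topic_word[t][word] for t in doc_topic}
    total = sum(scores.values())
    posterior = {t: s / total for t, s in scores.items()}
    return max(posterior, key=posterior.get), posterior

print(topics_in_document(doc_topic))
print(topic_for_word("vote", doc_topic, topic_word))
print(topic_for_word("team", doc_topic, topic_word))
```

Even though this document is mostly about sports, the word "vote" is still assigned to politics, because that topic uses the word much more heavily.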