Roadmap of NLP for Machine Learning

Hrisav Bhowmick
2 min read · Dec 17, 2021

Natural Language Processing (NLP) is the branch of AI that helps computers understand, interpret, and manipulate human language. NLP has several practical use cases, such as machine translation, conversational AI bots, resume evaluation, and fraud detection. It leverages concepts like Tokenization, Entity Recognition, Word Embeddings, Topic Modeling, and Transfer Learning to build AI-based systems.

The following is the roadmap I followed during my post-grad Data Science course. It benefited me immensely in preparing for ML interviews, and it still helps me at the workplace, where my work focuses mainly on NLP and Deep Learning.

Pre-processing

  • Sentence cleaning
  • Stop Words
  • Regular Expression
  • Tokenization
  • N-grams (Unigram, Bigram, Trigram)
  • Text Normalization
  • Stemming
  • Lemmatization
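Most of the steps above can be sketched with the standard library alone. Here is a minimal illustration of cleaning, tokenization, stop-word removal, and n-grams; the stop-word list is a tiny made-up set for demonstration (real projects would use NLTK's or spaCy's lists, and their stemmers/lemmatizers):

```python
import re

# tiny illustrative stop-word set, NOT a real list
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to"}

def tokenize(text):
    """Lowercase, strip punctuation/digits, split on whitespace."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return text.split()

def remove_stopwords(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n):
    """Sliding window of n consecutive tokens (n=2 gives bigrams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = remove_stopwords(tokenize("The cats are chasing the mice!"))
# tokens -> ['cats', 'chasing', 'mice']
bigrams = ngrams(tokens, 2)
# bigrams -> [('cats', 'chasing'), ('chasing', 'mice')]
```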

Linguistics

  • Part-of-Speech Tags
  • Constituency Parsing
  • Dependency Parsing
  • Syntactic Parsing
  • Semantic Analysis
  • Lexical Semantics
  • Coreference Resolution
  • Chunking
  • Entity Extraction / Named Entity Recognition (NER)
  • Named Entity Disambiguation / Entity Linking
  • Knowledge Graphs

Word Embeddings

1. Frequency-based Word Embedding

  • One Hot Encoding
  • Bag of Words or CountVectorizer()
  • TF-IDF or TfidfVectorizer()
  • Co-occurrence Matrix, Co-occurrence Vector
  • HashingVectorizer
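To make the frequency-based ideas concrete, here is a from-scratch sketch of Bag of Words and TF-IDF on a toy corpus. Note this is a simplified formula: sklearn's TfidfVectorizer additionally smooths the idf and L2-normalizes the rows, so its numbers will differ.

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

# Bag of Words: each document becomes a count vector over a shared
# vocabulary -- conceptually what CountVectorizer() produces
vocab = sorted({w for d in docs for w in d.split()})
bow = [[Counter(d.split())[w] for w in vocab] for d in docs]

def tfidf(doc, all_docs):
    """TF-IDF weights for one document (unsmoothed, unnormalized)."""
    tokens = doc.split()
    counts = Counter(tokens)
    scores = {}
    for w, c in counts.items():
        df = sum(1 for d in all_docs if w in d.split())   # document frequency
        idf = math.log(len(all_docs) / df)                # inverse doc frequency
        scores[w] = (c / len(tokens)) * idf               # tf * idf
    return scores

scores = tfidf(docs[0], docs)
# "the" occurs in every document, so its idf (and weight) is 0;
# "cat" is unique to doc 0, so it gets a positive weight
```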

2. Pretrained Word Embedding

  • Word2Vec (by Google): two training variants, CBOW and Skip-Gram
  • GloVe (by Stanford)
  • fastText (by Facebook)
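Whichever model produces them, pretrained embeddings are just dense vectors, and "similar meaning" translates to "small angle between vectors". A minimal sketch with hand-made 3-d vectors (real Word2Vec/GloVe/fastText vectors typically have 100-300 dimensions; these numbers are invented for illustration):

```python
import math

# toy 3-d stand-ins for real pretrained embeddings (invented values)
emb = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# semantically related words sit closer together in the embedding space
sim_royalty = cosine(emb["king"], emb["queen"])
sim_fruit = cosine(emb["king"], emb["apple"])
```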

Topic Modeling

  • Latent Semantic Analysis (LSA)
  • Probabilistic Latent Semantic Analysis (pLSA)
  • Latent Dirichlet Allocation (LDA)
  • lda2Vec
  • Non-Negative Matrix Factorization (NMF)
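As a taste of how NMF (the last item above) uncovers topics, here is a sketch using NumPy and the classic Lee-Seung multiplicative updates on a made-up document-term matrix; the data and topic count are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy document-term count matrix: 4 documents x 6 terms (invented counts)
V = np.array([
    [3, 2, 0, 0, 1, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 3, 2, 3],
    [0, 1, 0, 2, 3, 2],
], dtype=float)

k = 2                                    # number of topics
W = rng.random((V.shape[0], k)) + 0.1    # document-topic weights
H = rng.random((k, V.shape[1])) + 0.1    # topic-term weights
err_before = np.linalg.norm(V - W @ H)

# Lee-Seung multiplicative updates minimizing ||V - WH||_F;
# they keep W and H non-negative by construction
for _ in range(300):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

err_after = np.linalg.norm(V - W @ H)
# each row of H now reads as one topic's (unnormalized) term weights
```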

NLP with Deep Learning

  • Machine Learning (Logistic Regression, SVM, Naïve Bayes)
  • Embedding Layer
  • Artificial Neural Network
  • Deep Neural Network
  • Convolutional Neural Network
  • RNN/LSTM/GRU
  • Bi-RNN/Bi-LSTM/Bi-GRU
  • Pretrained Language Models: ELMo, ULMFiT
  • Sequence-to-Sequence/Encoder-Decoder
  • Transformers (attention mechanism)
  • Encoder-only Transformers: BERT
  • Decoder-only Transformers: GPT
  • Transfer Learning
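The attention mechanism at the heart of Transformers is compact enough to sketch in NumPy. This is scaled dot-product attention as defined in "Attention Is All You Need"; the token count and dimension here are arbitrary toy values:

```python
import numpy as np

def softmax(x):
    """Row-wise softmax, shifted for numerical stability."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))  # 3 "tokens", each a 4-d query vector
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))

out, weights = attention(Q, K, V)
# weights is 3x3: row i says how much token i attends to every token
```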

Example Use cases

  • Sentiment Analysis
  • Question Answering
  • Language Translation
  • Text/Intent Classification
  • Text Summarization
  • Text Similarity
  • Text Clustering
  • Text Generation
  • Chatbots (DialogFlow, RASA, Self-made Bots)
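As the simplest possible entry point to the first use case, here is a lexicon-based sentiment scorer. The word lists are tiny invented examples; real systems learn weights from data or use a curated lexicon such as VADER's:

```python
# illustrative word lists, NOT a real sentiment lexicon
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

def sentiment(text):
    """Count positive vs. negative words and report the sign."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```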

Libraries

  • NLTK
  • spaCy
  • Gensim (mainly for topic modeling)

Free YouTube resources:

Credits to Stanford University, NPTEL, Sentdex, Krish Naik.

Thanks for reading the article! If you liked it, do 👏. Have I missed any vital topic? Let me know in the comments and I'll update!

If you are interested in the Mathematics roadmap for Machine Learning, click here.

Connect with me on LinkedIn for more updates or for any help moving forward with the topics above.

Hrisav Bhowmick

A machine learning enthusiast, eager to solve real-world problems.