CLASSIC NLP
TF-IDF & ML (8)
-
Write TF-IDF from scratch.
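A minimal from-scratch sketch of one possible answer. The smoothed idf formula `log((1+N)/(1+df)) + 1` mirrors scikit-learn's default and is an assumption; the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: list of token lists; returns one {term: weight} dict per doc."""
    n = len(docs)
    # document frequency: in how many documents each term appears
    df = Counter(t for doc in docs for t in set(doc))
    # smoothed idf (scikit-learn's default formula -- an assumption here)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * idf[t] for t in tf})
    return vectors

vecs = tfidf([["the", "cat", "sat"], ["the", "dog", "barked"]])
# the rare term "cat" outweighs "the", which appears in every document
```

In practice each vector is then normalized (e.g. L2) so that document length does not dominate similarity scores.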
-
What is normalization in TF-IDF?
-
Why is TF-IDF still worth knowing today, and how can you use it in combination with more complex models?
-
Explain how Naive Bayes works. What can you use it for?
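One possible illustration: a multinomial Naive Bayes classifier with Laplace (add-one) smoothing, written from scratch. The spam/ham toy data is hypothetical; typical uses are spam filtering and other text classification where the bag-of-words independence assumption is tolerable.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Multinomial NB: per-class log priors and smoothed log likelihoods."""
    classes = set(labels)
    vocab = {t for doc in docs for t in doc}
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for doc, y in zip(docs, labels):
        counts[y].update(doc)
    loglik = {}
    for c in classes:
        total = sum(counts[c].values())
        # Laplace smoothing avoids zero probabilities for unseen terms
        loglik[c] = {t: math.log((counts[c][t] + 1) / (total + len(vocab)))
                     for t in vocab}
    return prior, loglik, vocab

def predict_nb(model, doc):
    prior, loglik, vocab = model
    scores = {c: prior[c] + sum(loglik[c][t] for t in doc if t in vocab)
              for c in prior}
    return max(scores, key=scores.get)

model = train_nb(
    [["buy", "cheap", "pills"], ["meeting", "at", "noon"],
     ["cheap", "pills", "now"], ["lunch", "meeting"]],
    ["spam", "ham", "spam", "ham"],
)
```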
-
How can SVM be prone to overfitting?
-
Explain possible methods for text preprocessing (lemmatization and stemming). What algorithms do you know for this, and in what cases would you use them?
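To make the stemming/lemmatization contrast concrete, here is a deliberately naive suffix-stripping stemmer. This is a toy sketch only: real stemmers (Porter, Snowball) apply many ordered rules, and lemmatizers rely on dictionaries and part-of-speech information instead of surface suffixes.

```python
def naive_stem(word):
    # toy rule-based stemmer: strip one common suffix, keep a stem of >= 3 chars
    # (illustrative only -- not Porter/Snowball)
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```

Note the characteristic failure mode: stems need not be valid words (a lemmatizer would map "better" to "good", which no suffix rule can do).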
-
What metrics for text similarity do you know?
-
Explain the difference between cosine similarity and cosine distance. Which of these values can be negative? How would you use them?
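A small sketch that makes the sign question concrete: cosine similarity lies in [-1, 1] and can be negative, while cosine distance `1 - similarity` lies in [0, 2] and cannot.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cosine_distance(a, b):
    # distance = 1 - similarity: ranges over [0, 2], never negative
    return 1 - cosine_similarity(a, b)

# opposite vectors: similarity hits its minimum, distance its maximum
cosine_similarity([1, 0], [-1, 0])  # -1.0
cosine_distance([1, 0], [-1, 0])    # 2.0
```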
METRICS (7)
-
Explain precision and recall in simple words. What would you look at in the absence of an F1 score?
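A from-scratch computation for the binary case (the toy labels are hypothetical): precision is the fraction of positive predictions that are correct, recall the fraction of actual positives that were found.

```python
def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

p, r = precision_recall([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
# tp=2, fp=1, fn=1 -> precision = recall = 2/3
```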
-
In which cases would you observe changes in specificity?
-
When would you look at macro metrics, and when at micro? Why does the weighted variant exist?
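A sketch of macro vs. micro averaging for multi-class precision on a hypothetical imbalanced toy set. Macro averages per-class scores equally, so a badly handled rare class drags it down; micro pools all decisions, so the majority class dominates.

```python
from collections import Counter

def macro_micro_precision(y_true, y_pred):
    classes = sorted(set(y_true))
    tp, fp = Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[p] += 1
        else:
            fp[p] += 1
    # macro: unweighted mean of per-class precision
    macro = sum(tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
                for c in classes) / len(classes)
    # micro: pool all true/false positives before dividing
    micro = sum(tp.values()) / (sum(tp.values()) + sum(fp.values()))
    return macro, micro

# rare class "b" is always misclassified -> macro drops far below micro
macro, micro = macro_micro_precision(
    ["a", "a", "a", "a", "b"], ["a", "a", "a", "b", "a"])
```

Weighted averaging sits between the two: per-class scores averaged with class-support weights.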
-
What is perplexity? What can we evaluate with it?
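The definition in code: perplexity is the exponential of the average negative log-likelihood per token. A sanity check: a model that assigns uniform probability 1/V to every token has perplexity exactly V.

```python
import math

def perplexity(log_probs):
    """log_probs: natural-log probabilities the model assigned to each token."""
    return math.exp(-sum(log_probs) / len(log_probs))

# uniform 1/10 probability per token -> perplexity of 10
perplexity([math.log(0.1)] * 5)
```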
-
What is the BLEU metric?
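A minimal single-reference sentence-level sketch of the idea: clipped n-gram precisions combined by a geometric mean, multiplied by a brevity penalty. Real BLEU is corpus-level and handles multiple references and zero counts more carefully (e.g. smoothing), so treat this as illustrative.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, candidate, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        # clip each candidate n-gram count by its count in the reference
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0  # unsmoothed: any empty n-gram overlap zeroes the score
    gm = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # brevity penalty punishes candidates shorter than the reference
    bp = 1.0 if len(candidate) > len(reference) \
        else math.exp(1 - len(reference) / len(candidate))
    return bp * gm
```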
-
Explain the differences between the various types of ROUGE metrics.
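As one concrete member of the family, ROUGE-N recall: the fraction of reference n-grams that appear in the candidate (other variants include ROUGE-L, which uses the longest common subsequence instead of fixed n-grams). A minimal sketch:

```python
from collections import Counter

def rouge_n(reference, candidate, n=1):
    """ROUGE-N recall: share of reference n-grams recovered by the candidate."""
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    overlap = sum(min(c, cand[g]) for g, c in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

rouge_n(["the", "cat", "sat"], ["the", "cat"], n=1)  # 2 of 3 unigrams -> 2/3
```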
-
What is the difference between BLEU and ROUGE?
WORD2VEC (9)
-
Explain how Word2Vec learns. What is the loss function? What is maximized?
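A sketch of the skip-gram negative sampling (SGNS) loss for one (center, context) pair, `-log σ(u_o · v_c) - Σ_k log σ(-u_k · v_c)`: the dot product with the true context word is pushed up, dot products with sampled negatives are pushed down. The 2-d vectors are hypothetical; real training differentiates this loss and updates both embedding tables.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sgns_loss(center, context, negatives):
    """SGNS loss for one pair: center/context/negatives are dense vectors."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    # attract the observed context word
    loss = -math.log(sigmoid(dot(context, center)))
    # repel each sampled negative word
    for neg in negatives:
        loss += -math.log(sigmoid(-dot(neg, center)))
    return loss

# an aligned context vector yields a lower loss than an opposed one
aligned = sgns_loss([1, 0], [1, 0], [[0, 1]])
opposed = sgns_loss([1, 0], [-1, 0], [[0, 1]])
```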
-
What methods of obtaining embeddings do you know? When will each be better?
-
What is the difference between static and contextual embeddings?
-
What are the two main architectures you know, and which one learns faster?
-
What is the difference