Neural Embeddings & Word2Vec Fundamentals Quiz

Explore the basic concepts of neural embeddings and Word2Vec, including their key principles, training methods, and typical applications for representing words as vectors. Gain insight into how these techniques capture word meaning, context, and similarity for natural language processing tasks.

  1. Purpose of Word Embeddings

    What is the primary purpose of using word embeddings in natural language processing tasks?

    1. To organize words in alphabetical order
    2. To count the frequency of each word in a text
    3. To store dictionary definitions for each word
    4. To represent words as numerical vectors capturing semantic similarities

    Explanation: Word embeddings convert words into numerical vectors in which similar words lie close to each other in space, capturing semantic relations. Storing dictionary definitions is not the goal of embeddings; meaning is learned from usage context rather than from explicit definitions. Counting word frequency and sorting alphabetically are unrelated to the primary purpose of embeddings, which is to capture meaning and relationships.
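
    Example: To make the "vectors close to each other" idea concrete, here is a minimal NumPy sketch with made-up 3-dimensional vectors (the numbers are illustrative, not from a trained model). Cosine similarity is high for related words and low for unrelated ones.

      import numpy as np

      # Toy, hand-picked vectors purely for illustration -- a real model learns
      # hundreds of dimensions from data.
      embeddings = {
          "cat":    np.array([0.90, 0.80, 0.10]),
          "kitten": np.array([0.85, 0.75, 0.20]),
          "car":    np.array([0.10, 0.20, 0.90]),
      }

      def cosine_similarity(a, b):
          """Cosine of the angle between two vectors: 1.0 means same direction."""
          return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

      print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))  # close to 1
      print(cosine_similarity(embeddings["cat"], embeddings["car"]))     # much lower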

  2. Continuous Bag of Words (CBOW) Model

    In the Continuous Bag of Words (CBOW) Word2Vec model, what does the network try to predict?

    1. The target word from surrounding context words
    2. The context words from the target word
    3. The sentence sentiment
    4. The length of the sentence

    Explanation: CBOW predicts a missing or target word using its context, meaning the words around it. Predicting context from the target word describes the Skip-gram model's approach. Sentence length and sentiment predictions are unrelated to the CBOW model, which focuses on context-to-word prediction.
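
    Example: As a rough sketch of the context-to-word direction, the NumPy snippet below (tiny made-up vocabulary, random untrained weights) averages the context word embeddings and then scores every vocabulary word as the candidate target; training would adjust the two weight matrices so the true target receives the highest probability.

      import numpy as np

      vocab = ["the", "cat", "sat", "on", "mat"]
      word_to_id = {w: i for i, w in enumerate(vocab)}

      dim = 4
      rng = np.random.default_rng(0)
      W_in = rng.normal(size=(len(vocab), dim))    # input (context) embeddings
      W_out = rng.normal(size=(dim, len(vocab)))   # output (target-scoring) weights

      def cbow_predict(context_words):
          # Average the embeddings of the surrounding context words ...
          ctx = np.mean([W_in[word_to_id[w]] for w in context_words], axis=0)
          # ... then score every vocabulary word as the possible target.
          scores = ctx @ W_out
          probs = np.exp(scores) / np.sum(np.exp(scores))  # softmax
          return dict(zip(vocab, probs))

      # Predict the middle word of "the cat ___ on" from its surrounding context.
      print(cbow_predict(["the", "cat", "on"]))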

  3. Skip-gram Model Objective

    What is the main objective of the skip-gram variant of Word2Vec?

    1. To sort words by their frequency in the document
    2. To identify parts of speech for each word
    3. To predict surrounding context words from a given target word
    4. To translate sentences between languages

    Explanation: The skip-gram model predicts context words based on a target word, allowing it to learn strong relationships even for infrequent words. Sorting by frequency, performing translation, and tagging parts of speech are not the objectives of the skip-gram model, which is focused on context prediction.
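
    Example: To show how skip-gram flips the direction, the small pure-Python sketch below generates (target, context) training pairs from one sentence with a window of 2; the model is then trained to predict each context word from its target (the window size and sentence are illustrative).

      def skipgram_pairs(tokens, window=2):
          """Yield (target, context) pairs: the model predicts context from target."""
          pairs = []
          for i, target in enumerate(tokens):
              lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
              for j in range(lo, hi):
                  if j != i:
                      pairs.append((target, tokens[j]))
          return pairs

      sentence = "the cat sat on the mat".split()
      for target, context in skipgram_pairs(sentence, window=2):
          print(f"predict {context!r} from {target!r}")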

  4. Dimensionality of Embeddings

    Why is it beneficial to use a lower-dimensional vector (such as 100 or 300 dimensions) to represent words in NLP tasks?

    1. It increases the risk of overfitting the model
    2. It reduces computation and helps capture essential word meanings
    3. It always results in a loss of semantic information
    4. It makes the embedding harder to visualize

    Explanation: Lower-dimensional vectors are computationally efficient and capture key semantic information, enabling models to process language effectively. Overfitting is less likely with lower dimensions than with high-dimensional sparse vectors. While visualization can be challenging, it is unrelated to this main benefit. Carefully chosen low dimensions can still retain important semantics rather than always losing information.
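
    Example: A back-of-the-envelope NumPy comparison (the vocabulary size and dimension are illustrative) of a sparse one-hot representation versus a dense 300-dimensional embedding shows the memory saving per word; the dense vector is also what lets similar words share structure.

      import numpy as np

      vocab_size = 50_000   # example vocabulary size
      dense_dim = 300       # typical embedding dimensionality

      one_hot = np.zeros(vocab_size, dtype=np.float32)  # one slot per vocabulary word
      dense = np.zeros(dense_dim, dtype=np.float32)     # learned low-dimensional vector

      print(one_hot.nbytes)                  # 200,000 bytes per word
      print(dense.nbytes)                    # 1,200 bytes per word
      print(one_hot.nbytes // dense.nbytes)  # roughly 166x smaller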

  5. Word Similarity in Embedding Space

    If the vectors for 'cat' and 'kitten' are close together in embedding space, what does this indicate?

    1. That the neural network made a prediction error
    2. That 'cat' and 'kitten' are likely semantically similar
    3. That 'kitten' is always the next word after 'cat'
    4. That 'cat' is more common in the data

    Explanation: Close proximity in embedding space suggests that the model has learned a semantic similarity between 'cat' and 'kitten'. Frequency alone doesn't determine vector closeness. A prediction error is not implied by vector similarity. Word order or co-occurrence isn't directly represented by closeness of vectors.
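
    Example: With a trained model, this proximity can be checked directly. The sketch below assumes the gensim library (4.x API) and a toy corpus, so the exact numbers are not meaningful; trained on a large corpus, 'cat' and 'kitten' would score noticeably higher than 'cat' and 'car'.

      from gensim.models import Word2Vec

      # Tiny toy corpus -- far too small for meaningful vectors, used only to
      # illustrate the API. A real model is trained on millions of sentences.
      sentences = [
          "the cat chased the mouse".split(),
          "a kitten is a young cat".split(),
          "the car drove down the road".split(),
      ]
      model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)

      # Cosine similarity between two learned vectors (gensim 4.x API).
      print(model.wv.similarity("cat", "kitten"))
      print(model.wv.similarity("cat", "car"))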

  6. Input Data for Training Embeddings

    Which type of data is typically required to train Word2Vec embeddings?

    1. A database of speech audio files
    2. A list of word definitions from a dictionary
    3. A large collection of unlabeled text documents
    4. A set of manually annotated images

    Explanation: Word2Vec needs unlabeled text to learn patterns of word usage and context. Images, word definitions, and audio files are not suitable for directly training word embeddings using Word2Vec, though other models may use such data for different tasks.
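
    Example: The sketch below illustrates that plain, unlabeled text is sufficient: raw sentences are tokenized and passed straight to the trainer with no labels, definitions, or annotations (gensim 4.x API assumed; the three sentences stand in for a large text collection).

      from gensim.models import Word2Vec
      from gensim.utils import simple_preprocess

      # Plain, unlabeled text -- no annotations of any kind are required.
      raw_text = [
          "Word2Vec learns from word usage patterns in ordinary text.",
          "No labels, definitions, images, or audio are needed.",
          "A real corpus would contain millions of such sentences.",
      ]
      tokenized = [simple_preprocess(line) for line in raw_text]  # lowercase + split

      model = Word2Vec(tokenized, vector_size=100, window=5, min_count=1)
      print(model.wv.index_to_key)  # vocabulary learned from the raw text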

  7. Analogy Tasks in Word2Vec

    How can Word2Vec embeddings be used to solve analogy tasks, such as 'man is to king as woman is to ___'?

    1. By performing vector arithmetic operations like king - man + woman
    2. By arranging words in alphabetical order
    3. By counting how many times each word appears
    4. By translating words between languages

    Explanation: Word2Vec allows analogies to be solved through vector arithmetic because relational information is encoded directionally in the embeddings. Alphabetical ordering and word frequency counting do not reflect relationships. Translation is a separate task not directly addressed by embedding arithmetic.
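
    Example: In gensim, this arithmetic is wrapped by most_similar; the sketch below assumes pretrained vectors fetched through gensim.downloader (here the bundled 'glove-wiki-gigaword-50' set, though any large pretrained embeddings would do) and also spells out the underlying vector arithmetic.

      import gensim.downloader as api

      # Pretrained vectors (downloaded on first use); any large embedding set works.
      wv = api.load("glove-wiki-gigaword-50")

      # "man is to king as woman is to ___"  ==  king - man + woman
      print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

      # The same idea written out as explicit vector arithmetic.
      result_vec = wv["king"] - wv["man"] + wv["woman"]
      print(wv.similar_by_vector(result_vec, topn=3))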

  8. Context Window Size Effect

    What does increasing the context window size in a Word2Vec model typically lead to?

    1. Capturing broader, more general word relationships
    2. Focusing only on nearby word spelling
    3. Ignoring all punctuation marks
    4. Reducing the number of training epochs

    Explanation: A larger context window includes more words, helping the model capture general and topic-level associations. Spellings and punctuation are not the focus of context windows. The number of epochs relates to training cycles and is unaffected by window size.
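
    Example: In gensim (4.x API assumed), the context window is a single training hyperparameter; the toy corpus below only shows where it is set. A narrow window emphasizes immediate neighbours, while a wide window picks up broader, topic-level associations at some extra training cost.

      from gensim.models import Word2Vec

      sentences = [line.split() for line in [
          "the stock market fell sharply after the announcement",
          "investors sold shares as the market dropped",
      ]]

      # Narrow window: each word is trained against only its immediate neighbours.
      narrow = Word2Vec(sentences, vector_size=50, window=2, min_count=1)

      # Wide window: each word sees many surrounding words, capturing broader,
      # more topic-level associations.
      wide = Word2Vec(sentences, vector_size=50, window=10, min_count=1)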

  9. Handling Unknown Words

    How does a standard Word2Vec model typically handle words it did not see during training (out-of-vocabulary words)?

    1. It uses the embedding of the most similar seen word
    2. It cannot generate embeddings for unseen words
    3. It guesses embeddings by re-training the model
    4. It assigns random numbers to the unknown word

    Explanation: Standard Word2Vec models cannot create embeddings for words not encountered during training. Guessing by re-training is not typically performed in live usage. Using similar word vectors or assigning random vectors are workaround strategies but are not part of the original model's standard behavior.
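
    Example: In gensim's implementation (4.x API assumed), looking up a word the model never saw raises a KeyError rather than returning a vector, which is exactly the limitation described above; a common workaround is to check membership first.

      from gensim.models import Word2Vec

      sentences = [["the", "cat", "sat", "on", "the", "mat"]]
      model = Word2Vec(sentences, vector_size=50, window=2, min_count=1)

      word = "dinosaur"  # never seen during training
      try:
          vec = model.wv[word]  # raises KeyError for out-of-vocabulary words
      except KeyError:
          vec = None            # standard Word2Vec has no embedding to offer
          print(f"'{word}' is out of vocabulary; no embedding available")

      # Safer pattern: check membership before looking up.
      if word in model.wv:
          vec = model.wv[word]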

  10. Semantic Relationships Captured

    What type of relationships do neural embeddings like those from Word2Vec mainly capture between words?

    1. Syntactic parsing rules
    2. Numeric values assigned to words
    3. Physical locations of objects
    4. Semantic similarity and relatedness

    Explanation: Neural embeddings primarily model semantic similarity, grouping related words close together in vector space. Syntactic parsing rules, physical locations, and arbitrary numeric assignments are outside the scope of what Word2Vec embeddings capture; although embeddings are numeric vectors, those numbers exist to encode meaning and relatedness.