Foundations of Response Caching: Keys, TTL, Invalidation, and Strategies Quiz

Test your understanding of caching basics for generated responses, including cache keys, time-to-live (TTL), cache invalidation, and the differences between client-side and server-side caching. This easy-level quiz is designed to reinforce core caching concepts with practical questions and real-world scenarios.

  1. Identifying Cache Keys

    Which three key elements are commonly combined to form a unique cache key for storing generated responses?

    1. Prompt, user location, and data format
    2. Prompt, parameters, and model version
    3. Prompt, file size, and network speed
    4. Server name, client ID, and timestamp

    Explanation: Prompt, parameters, and model version are typically combined to generate a unique cache key, ensuring responses are correctly matched to specific requests. Server name, client ID, and timestamp are unrelated to the content of generated responses in most caching scenarios. User location and data format could matter in specific use cases but are not a general standard for keying generated responses. File size and network speed have no bearing on uniquely identifying a cached response.
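    A minimal sketch of such a key, assuming a Python service (the function name and field layout are illustrative, not a standard API):

```python
import hashlib
import json

def make_cache_key(prompt: str, parameters: dict, model_version: str) -> str:
    """Build a deterministic cache key from prompt, parameters, and model version.

    Parameters are serialized with sorted keys so that logically equal
    parameter sets always hash to the same key.
    """
    payload = json.dumps(
        {"prompt": prompt, "parameters": parameters, "model": model_version},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Same prompt, different parameters -> different keys (see question 3)
k1 = make_cache_key("Summarize this article.", {"temperature": 0.2}, "v1")
k2 = make_cache_key("Summarize this article.", {"temperature": 0.9}, "v1")
assert k1 != k2
```

    Hashing the serialized payload keeps keys a fixed length regardless of prompt size; including the model version in the payload is what isolates entries across model upgrades (see question 10).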

  2. Understanding TTL

    What does the TTL (Time To Live) setting in a cache control?

    1. How many users can access the cache
    2. How long a cached response remains valid
    3. The total storage quota for the cache
    4. The size of each generated response

    Explanation: TTL specifies how long a cached response should be considered fresh before being removed or replaced. It does not set the storage quota—that is a function of cache capacity. TTL doesn't affect how many users can access the cache, nor does it determine the size of responses.

  3. Scenario: Cache Key Uniqueness

    If two requests share the same prompt but have different parameters, why should their cache keys differ?

    1. Different parameters may result in different responses
    2. It increases the time spent computing results
    3. It saves memory space in the cache
    4. It prevents errors in network routing

    Explanation: Even with the same prompt, different parameters (such as length or temperature) can produce different outputs, so the cache keys must differ. Saving memory is not the motivation for distinct keys. Network routing errors are unrelated to caching logic, and keys are not made distinct in order to increase computation time; returning the correct response is the reason.

  4. Client vs. Server-Side Cache

    Which cache type stores generated responses on the user's device for faster repeated access?

    1. Database cache
    2. Proxy cache
    3. Server-side cache
    4. Client-side cache

    Explanation: Client-side caching stores responses locally on the user's device, reducing load times for repeated requests. Server-side caching keeps responses on the server, not on the user's device. Database caching refers to optimizing database queries, and proxy caching stores responses on an intermediate system, not on the end user's device.

  5. Cache Invalidation Basics

    What is cache invalidation in the context of caching generated responses?

    1. Removing outdated or incorrect cached responses
    2. Creating a new cache entry for every request
    3. Increasing the TTL of cache entries
    4. Encrypting cached data for security

    Explanation: Cache invalidation is the process of removing entries that are no longer valid, ensuring users receive up-to-date results. Creating new entries isn't invalidation; that's caching. Increasing TTL extends validity rather than removing outdated data. Encrypting cached data is a security measure, not invalidation.
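    A minimal sketch of explicit invalidation, assuming a plain Python dict as the cache (the key name is hypothetical):

```python
# Invalidation deletes the entry outright so the next request regenerates it.
# Contrast this with raising the TTL, which would keep a stale entry alive longer.
cache = {"report-key": "outdated response"}

def invalidate(cache, key):
    cache.pop(key, None)  # safe even if the entry was already evicted

invalidate(cache, "report-key")
assert "report-key" not in cache  # next lookup misses, forcing fresh generation
```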

  6. Example: TTL Expiry

    If a cache entry has a TTL of 5 minutes but is accessed after 10 minutes, what usually happens?

    1. The TTL is automatically extended by 5 minutes
    2. The expired entry is returned to the user
    3. The cache entry is considered expired and a new response is generated
    4. All cache entries are deleted

    Explanation: When the TTL is exceeded, the cache treats the entry as expired and will typically generate a new response. Returning an expired entry defeats the purpose of TTL. TTL doesn't automatically extend unless specifically configured. Deleting all cache entries due to one expiry is unnecessary and incorrect.

  7. Scenario: Prompt Mutation

    If the same parameters and model version are used but the prompt text slightly changes, what should happen with the cache key?

    1. TTL should be increased
    2. The cache should be bypassed entirely
    3. The previous cache key should be reused
    4. A new cache key should be generated

    Explanation: Even small changes to the prompt mean the generated response will likely differ, so a completely new cache key is needed. Reusing the key would cause mismatches. Increasing TTL does not address prompt changes. Bypassing the cache is unnecessary—valid caching can still occur with a new key.

  8. TTL Adjustment Effects

    What is a likely effect of reducing the TTL for cached generated responses from 10 minutes to 1 minute?

    1. The network speed increases for all requests
    2. Cache keys need to be changed
    3. Cached responses become more accurate but use more memory
    4. Cache hits decrease and more responses are generated

    Explanation: A shorter TTL means cached entries expire faster, leading to fewer cache hits and more newly generated responses. Freshness can improve, but memory use typically decreases rather than increases with a shorter TTL. Network speed is not directly affected by TTL, and changing the TTL does not change the cache keys themselves.

  9. Use Case: Cache Layer Choice

    If consistent responses are needed across multiple users, which caching strategy is preferable?

    1. Server-side caching
    2. Browser local storage
    3. Direct database caching
    4. Client-side caching

    Explanation: Server-side caching allows all users to benefit from the same cached responses, ensuring consistency. Client-side caches are limited to individual users. Browser local storage is a kind of client-side caching, so it's also user-specific. Direct database caching optimizes database queries, not generated response sharing.

  10. Importance of Model Version in the Cache Key

    Why is it important to include the model version in the cache key when storing generated responses?

    1. Model versions increase the cache storage cost
    2. Different model versions may produce different outputs for the same prompt
    3. The model version determines the user interface color
    4. Model versions automatically clear the entire cache

    Explanation: Updates to the underlying model can change response behavior, so the cache key needs to reflect the version used. Storage cost does not hinge on including the model version. User interface colors are unrelated. Model upgrades do not by themselves empty the whole cache; isolation is achieved through cache key versioning.

  11. Cache Consistency Issue

    A user complains of getting outdated information repeatedly even after an update. What cache problem is likely occurring?

    1. The prompt is too generic
    2. Stale cache entries were not properly invalidated
    3. The user is using an incorrect device
    4. The cache has inadequate storage space

    Explanation: If users receive outdated data, it usually means invalidation policies have failed and old entries weren't removed. Storage space issues might cause missed cache hits, but not persistent outdated data. Prompt generality is not directly linked to stale caches. Device type does not control cache freshness.

  12. Cache Storage Limit

    When a cache reaches its storage limit, what often happens to make room for new responses?

    1. The TTL for all entries is set to zero
    2. All existing entries are duplicated
    3. Only new responses are cached; old ones remain
    4. The least recently used entries are deleted

    Explanation: Many cache systems use a least recently used (LRU) strategy to delete the oldest unused entries for space. Duplicating entries wastes space. Setting TTL to zero expires all items, not just the oldest. Caching only new responses without removing old ones doesn't free memory.
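    The LRU strategy described above can be sketched with Python's `OrderedDict`, which tracks insertion order; the class name and capacity are illustrative:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry
```

    For example, with capacity 2, storing entries a and b, reading a, then storing c evicts b, since b is the entry that has gone longest without being used.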

  13. Matching Parameters to Cache Keys

    When a parameter like 'max_length' changes from 50 to 100 for the same prompt, what should a correct caching system do?

    1. Overwrite the existing cache entry
    2. Create a separate cache entry for each 'max_length' value
    3. Ignore the parameter and reuse the same cache key
    4. Set the TTL to unlimited

    Explanation: Changing relevant parameters such as 'max_length' alters the output, so each unique parameter set deserves its own cache entry. Overwriting would lose previous results. Ignoring the parameter risks delivering incorrect outputs. Unlimited TTL is unrelated to parameter changes.

  14. Cache Key Collisions

    If two different requests accidentally share the same cache key, which issue is most likely to happen?

    1. Cache entries get encrypted
    2. The server’s processing time increases
    3. Users receive incorrect or mismatched responses
    4. Cache memory doubles in usage

    Explanation: Cache key collisions cause different requests to return the same, potentially incorrect, cached result. Memory usage does not double; a collision actually reduces the number of unique entries stored. Server processing time is not necessarily affected by key reuse, and encryption is unrelated to cache key issues.

  15. Scenario: Cache Bypass

    Which of the following actions bypasses cached responses and always generates a fresh response?

    1. Increasing the cache size
    2. Disabling the cache for a request
    3. Setting a very long TTL
    4. Reusing the same cache key

    Explanation: Disabling caching for a request forces a fresh response regardless of cache content. A long TTL keeps responses cached longer rather than bypassing them. Increasing cache size only lets you store more entries. Reusing the same cache key returns whatever is already cached, not a fresh computation.

  16. Hinting Invalidation on Data Changes

    When the data used to generate responses changes significantly, what should a cache system ideally do?

    1. Invalidate affected cache entries to ensure accuracy
    2. Ignore and continue serving old cache entries
    3. Switch from client-side to server-side caching
    4. Reduce the TTL of unrelated entries

    Explanation: To keep answers accurate and up to date, affected cache entries should be invalidated when the underlying data changes. Reducing the TTL of unrelated entries does not address the stale entries. Switching from client-side to server-side caching is an architectural choice and does not by itself restore accuracy. Continuing to serve old entries risks delivering incorrect responses.
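    One way to invalidate only the affected entries is to tag each cached response with the data source it was built from, then drop every entry for a source when that source changes. The tagging scheme below is an illustrative sketch, not a standard API:

```python
from collections import defaultdict

cache = {}                         # key -> cached response
keys_by_source = defaultdict(set)  # data source -> keys built from it

def store_tagged(key, response, source):
    cache[key] = response
    keys_by_source[source].add(key)

def invalidate_source(source):
    # Drop every entry derived from the changed data source, leaving the rest.
    for key in keys_by_source.pop(source, set()):
        cache.pop(key, None)
```

    Entries built from untouched sources survive, so accuracy is restored without discarding the whole cache.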