Cassandra Failures and Repair: Core Concepts Quiz

Test your understanding of how Cassandra clusters handle node failures and repair with this focused quiz. It covers replication, consistency levels, anti-entropy repair, hinted handoff, and best practices for maintaining resilient, high-availability data systems.

  1. Node Failure Identification

    In a Cassandra cluster, what typically happens when a single node becomes unreachable due to a hardware failure?

    1. All data on the cluster becomes inaccessible.
    2. The entire cluster shuts down immediately.
    3. Data is permanently lost from the database.
    4. Other nodes continue to serve requests using replicas.

    Explanation: Cassandra is designed with high availability in mind, so when one node fails, requests are routed to other replicas that hold copies of the data. The entire cluster does not shut down, so option two is incorrect. All data does not become inaccessible; only the downed node is affected. Data is not permanently lost because replicas exist elsewhere in the cluster.
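
    The failover described above can be sketched with a toy token ring. This is a minimal illustration, not Cassandra's implementation: it assumes one token per node (no virtual nodes) and a placement rule loosely modeled on walking clockwise around the ring. The names `replicas_for` and `live_replicas` are hypothetical.

```python
import bisect


def replicas_for(token, ring, rf):
    """Return the nodes holding a token's data.

    ring is a list of (token, node) pairs sorted by token. Assumption: the
    first replica is the node owning the first ring token >= the key's token,
    and the remaining replicas are the next nodes clockwise.
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def live_replicas(token, ring, rf, down):
    """Replicas that can still serve the request when some nodes are down."""
    return [node for node in replicas_for(token, ring, rf) if node not in down]


ring = [(0, "n1"), (25, "n2"), (50, "n3"), (75, "n4")]
all_replicas = replicas_for(30, ring, rf=3)
still_serving = live_replicas(30, ring, rf=3, down={"n3"})
```

    With a replication factor of 3, losing one of the three replicas still leaves two nodes that hold the data and can answer requests.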

  2. Purpose of the Repair Process

    What is the main purpose of performing a repair operation in a Cassandra cluster?

    1. To shut down non-essential nodes.
    2. To reset all nodes to their original state.
    3. To synchronize data between replicas and ensure consistency.
    4. To clear out all user data in the cluster.

    Explanation: Repair operations in Cassandra are critical for synchronizing data between replicas, helping maintain consistency especially after failures or network partitions. Resetting nodes to their original state is not the purpose of repair. Clearing all user data is unrelated, and shutting down nodes is not part of the repair process.

  3. Understanding Hinted Handoff

    If a write targets a replica node that is currently down, which mechanism does Cassandra use to temporarily store the missed write?

    1. Node Caching
    2. Hinted Handoff
    3. Immediate Deletion
    4. Snapshot Restore

    Explanation: Hinted handoff is the technique used to store hints about missed writes so they can be delivered once the node is back online. Snapshot restore is for backup and recovery, not write availability. Node caching is unrelated to storing pending writes, and immediate deletion does not help in reapplying missed writes.
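
    As a rough sketch of the idea, the toy model below has a coordinator record a hint for every write a down replica misses and replay those hints once the replica is back. The `Replica` and `Coordinator` classes are hypothetical and in-memory only; real hints are persisted on disk and kept only for a limited time, which this sketch does not model.

```python
from collections import defaultdict


class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}  # key -> (value, timestamp)

    def apply(self, key, value, timestamp):
        # Last write wins: keep the row with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[1]:
            self.data[key] = (value, timestamp)


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = defaultdict(list)  # replica name -> missed writes

    def write(self, key, value, timestamp):
        """Write to live replicas, store a hint for each down replica."""
        acks = 0
        for replica in self.replicas:
            if replica.up:
                replica.apply(key, value, timestamp)
                acks += 1
            else:
                self.hints[replica.name].append((key, value, timestamp))
        return acks

    def replay_hints(self, replica):
        """Deliver stored hints to a recovered replica; return how many."""
        pending = self.hints.pop(replica.name, [])
        for key, value, timestamp in pending:
            replica.apply(key, value, timestamp)
        return len(pending)
```

    Because each hint carries the original write timestamp, replaying it cannot overwrite a newer value that reached the replica by another path.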

  4. Anti-Entropy Repair Mechanism

    Which repair method in Cassandra compares data between replicas and fixes any differences found?

    1. Anti-entropy repair
    2. Lazy write
    3. Row cache refresh
    4. Partition bounce

    Explanation: Anti-entropy repair specifically refers to comparing and reconciling data between replicas to fix inconsistencies. Lazy write is not a repair process but relates to write-back caching. Row cache refresh deals with caching and is unrelated to repairs. Partition bounce is not a valid repair method.
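
    The compare-and-reconcile step can be illustrated with a simplified sketch. Cassandra compares replicas using Merkle trees built over token ranges; the code below swaps in a flat list of per-bucket hashes instead, which keeps the same idea (hash, compare, sync only what differs) in far fewer lines. The function names, the bucket count, and the `(value, timestamp)` row layout are assumptions made for illustration.

```python
import hashlib


def bucket_of(key, buckets):
    """Assign a key to a bucket by hashing it (stand-in for a token range)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % buckets


def bucket_digests(data, buckets):
    """One hash per bucket, computed over the sorted rows that fall in it."""
    hashers = [hashlib.sha256() for _ in range(buckets)]
    for key in sorted(data):
        value, timestamp = data[key]
        row = f"{key}={value}@{timestamp};".encode()
        hashers[bucket_of(key, buckets)].update(row)
    return [h.hexdigest() for h in hashers]


def repair(a, b, buckets=8):
    """Sync two replicas in place; return the number of mismatched buckets."""
    digests_a = bucket_digests(a, buckets)
    digests_b = bucket_digests(b, buckets)
    mismatched = {i for i in range(buckets) if digests_a[i] != digests_b[i]}
    for key in set(a) | set(b):
        if bucket_of(key, buckets) not in mismatched:
            continue  # digests matched, so these rows are not exchanged
        rows = [row for row in (a.get(key), b.get(key)) if row is not None]
        newest = max(rows, key=lambda row: row[1])  # newest timestamp wins
        a[key] = b[key] = newest
    return len(mismatched)
```

    Only buckets whose hashes differ are reconciled, which is why comparing hashes is cheaper than shipping every row between replicas.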

  5. Effect of Repair on Consistency

    How does running a repair operation help improve data consistency after a node comes back online?

    1. It increases the replication factor automatically.
    2. It synchronizes missed updates across replicas.
    3. It deletes all new data since the node was offline.
    4. It blocks all writes until completion.

    Explanation: Repairs ensure that any missed updates or inconsistencies between replicas due to node downtime are reconciled. Repairs do not alter the replication factor, which must be manually reconfigured. Deleting new data would be harmful and is not part of the process, and repair does not block all writes.

  6. Consistency Level and Node Failures

    If a write request is sent using consistency level QUORUM and one node is down, what will happen in a cluster with three replicas?

    1. The write succeeds as long as two replicas acknowledge the write.
    2. The system ignores consistency levels during failures.
    3. All nodes must be up for QUORUM to work.
    4. The write fails instantly even if two replicas are available.

    Explanation: QUORUM requires a majority of replicas to acknowledge the write; in a three-replica setup, two acknowledgments suffice for success. The operation does not fail instantly if two nodes are available. Not all nodes must be up, as QUORUM does not require all replicas. Consistency levels are not ignored during failures.
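
    The arithmetic behind this answer is short enough to write out. A quorum is a majority of the replicas, floor(RF / 2) + 1, so a replication factor of 3 needs 2 acknowledgments. The helper names below are hypothetical.

```python
def quorum(replication_factor):
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def quorum_write_succeeds(replication_factor, replicas_up):
    """True if enough replicas are alive to acknowledge a QUORUM write."""
    return replicas_up >= quorum(replication_factor)
```

    With three replicas, `quorum_write_succeeds(3, 2)` is true and `quorum_write_succeeds(3, 1)` is false, so the cluster tolerates exactly one replica being down at QUORUM.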

  7. Full vs. Incremental Repair

    What is the main difference between a full repair and an incremental repair in Cassandra?

    1. Full repair checks all data, while incremental repair processes only data changed since the last repair.
    2. Full repair deletes old data while incremental repair does not.
    3. Incremental repair runs faster because it uses more nodes.
    4. Incremental repair increases the replication factor automatically.

    Explanation: A full repair examines all data in the range being repaired, whereas incremental repair processes only data that has not been marked as repaired by a previous repair, which makes it more efficient. Its speed comes from doing less work, not from using more nodes. Full repairs do not delete old data, and incremental repairs do not affect the replication factor.
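
    One way to picture the difference is as a selection rule over data files. In the sketch below each file carries a `repaired` flag; a full repair selects everything, while an incremental repair selects only unflagged files and flags them afterwards. The `SSTable` class and both functions are hypothetical, the comparison and streaming work is omitted, and how a full repair treats the flag (which varies by Cassandra version) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    name: str
    repaired: bool = False


def select_for_repair(sstables, incremental):
    """Incremental repair skips files already marked as repaired."""
    if incremental:
        return [s for s in sstables if not s.repaired]
    return list(sstables)


def run_repair(sstables, incremental):
    """Return the names of the files examined by this repair run."""
    selected = select_for_repair(sstables, incremental)
    # Comparing and streaming data between replicas would happen here.
    if incremental:
        for sstable in selected:
            sstable.repaired = True  # assumption: flag set only in this mode
    return [s.name for s in selected]
```

    After one incremental run, a second run only has to look at files written since, while a full run still walks every file.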

  8. When to Run Repairs

    Which situation best describes an appropriate time to run a repair operation in Cassandra?

    1. During every successful write operation.
    2. Every time a user reads data from the database.
    3. After a node was down and has rejoined the cluster.
    4. Before installing the database software.

    Explanation: Running a repair is important after a node returns to the cluster to ensure it is consistent with the rest. Performing repairs during every write or read is not practical or necessary. Repairs cannot be run before the database software is installed.

  9. Merits of Regular Repairs

    Why is it recommended to schedule regular repair operations in Cassandra clusters?

    1. To prevent data consistency issues from accumulating over time.
    2. To increase disk space usage intentionally.
    3. To reduce the number of nodes required in the cluster.
    4. To erase duplicate data entries automatically.

    Explanation: Regular repairs ensure inconsistencies do not build up between replicas, maintaining healthy data integrity. Repairs do not erase duplicate data but synchronize correct values. They do not reduce the node count or intentionally increase disk usage.

  10. Role of Hints During Recovery

    After a previously unavailable node rejoins the cluster, what happens to the hints stored for it by other nodes?

    1. The hints are delivered to the recovered node to replay missed writes.
    2. The hints are deleted immediately without use.
    3. The hints disable automatic repair on the node.
    4. The recovered node ignores any stored hints.

    Explanation: When a node returns, hints are sent to it so it can replay and apply any writes missed while offline. Deleting the hints unused would defeat their purpose and leave the recovered node missing those writes until a repair runs. The node does not ignore the hints, as they help it catch up with the other replicas. Hints do not disable repairs; the two mechanisms complement each other.