Explore key concepts of CouchDB scaling in distributed environments with this quiz, covering replication, consistency, node management, and common pitfalls. Designed for beginners, these questions help reinforce understanding of how distributed databases address scalability issues.
Which replication type in CouchDB is best suited for synchronizing data between two distributed servers located in different regions?
Explanation: Master-Master Replication is appropriate when synchronizing data between servers in different regions because it allows changes on both nodes to be shared. Single Node Replication handles only one direction and may not capture updates from both locations. Local Backup is not a replication method but rather a process of copying data for recovery. Shuffle Sharding is unrelated, as it refers to distributing workload across subsets of nodes, not replication mechanisms.
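In CouchDB, a two-way setup like this is typically expressed as two one-way replications, one per direction. The sketch below only builds the JSON bodies that would be posted to the `_replicate` endpoint; the server URLs and the database name are hypothetical placeholders, and the helper name is illustrative.

```python
import json


def bidirectional_replication_bodies(db_a, db_b, continuous=True):
    """Return one replication request body per direction (A->B and B->A)."""
    return [
        {"source": db_a, "target": db_b, "continuous": continuous},
        {"source": db_b, "target": db_a, "continuous": continuous},
    ]


# Hypothetical regional servers; replace with real database URLs.
bodies = bidirectional_replication_bodies(
    "http://eu.example.com:5984/orders",
    "http://us.example.com:5984/orders",
)
for body in bodies:
    # Each body would be sent as a POST to /_replicate on one of the servers.
    print(json.dumps(body))
```

Posting only the first body would give one-way replication, which is why updates made in the second region would never reach the first.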
In distributed CouchDB environments, what is the consistency model most commonly used by default?
Explanation: Eventual Consistency means that data written to one node will eventually appear on the others, which is the default behavior in distributed CouchDB setups. Strong Consistency requires all nodes to agree before a write is visible, which is not the default in CouchDB. Snapshot Isolation and Immediate Consistency are models associated with other systems and are not what CouchDB provides by default.
What is a common bottleneck when horizontally scaling CouchDB clusters with many active nodes?
Explanation: As more nodes participate in updates, increased replication conflicts often occur, becoming a significant bottleneck. Lack of network connectivity is a potential issue but not a typical scaling bottleneck. CouchDB uses JSON, not XML files, for storage. Memory usage generally increases, not decreases, with scaling, making that option incorrect.
When two users update the same document in different CouchDB nodes at the same time, how does CouchDB resolve the situation?
Explanation: CouchDB keeps both conflicting revisions: it deterministically picks one as the winning revision returned by default and marks the document as conflicted, leaving the actual resolution to the application or an operator. Automatic merging is not performed since CouchDB cannot safely merge arbitrary changes. Overwriting updates silently or deleting the document would risk data loss, which is not the system's approach.
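To make the idea of keeping both versions concrete, here is a toy model of choosing a winner among conflicting revision IDs of the form `generation-hash`. It is a simplified sketch: it assumes the higher generation wins and that ties are broken by comparing the hash part, and it ignores deleted revisions. The function names are illustrative, not part of any CouchDB client library.

```python
def parse_rev(rev):
    """Split a revision ID such as '2-abc' into (2, 'abc')."""
    generation, _, digest = rev.partition("-")
    return int(generation), digest


def pick_winner(revs):
    """Return (winner, losers); the losers stay stored as conflicts."""
    ordered = sorted(revs, key=parse_rev, reverse=True)
    return ordered[0], ordered[1:]


winner, conflicts = pick_winner(["2-aaa", "2-bbb"])
print("winner:", winner)
print("kept as conflicts:", conflicts)
```

Against a real server, the losing revisions can be listed by fetching the document with `?conflicts=true`, which adds a `_conflicts` field to the response. Resolving the conflict then means writing the merged result and deleting the revisions that are no longer wanted.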
If a CouchDB node becomes unavailable in a distributed cluster, what is a typical behavior?
Explanation: Distributed systems like CouchDB allow the cluster to keep functioning while a node is offline; once the node returns, it catches up on the changes it missed. The entire cluster will not halt due to one node's failure. Traffic is not sent to an XML server, and data loss does not occur as changes are replayed to the recovered node.
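The catch-up behavior can be pictured with a toy model in which every change carries a sequence number and a returning node applies everything newer than the last sequence it saw. This is a minimal sketch with hypothetical names (`Node`, `catch_up`), not CouchDB's internal mechanism.

```python
class Node:
    """A toy replica that remembers the last change sequence it applied."""

    def __init__(self, name):
        self.name = name
        self.docs = {}
        self.last_seq = 0


def catch_up(node, changes):
    """Apply every logged change the node has not seen yet, in order."""
    for seq, doc_id, body in changes:
        if seq > node.last_seq:
            node.docs[doc_id] = body
            node.last_seq = seq


# Changes accepted by the rest of the cluster, as (sequence, doc id, body).
change_log = [(1, "a", {"v": 1}), (2, "b", {"v": 2}), (3, "a", {"v": 3})]

# This node went offline after applying sequence 1.
recovering = Node("node3")
recovering.docs = {"a": {"v": 1}}
recovering.last_seq = 1

catch_up(recovering, change_log)
print(recovering.docs, recovering.last_seq)
```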
What is the main benefit of sharding data across multiple nodes in a distributed CouchDB setup?
Explanation: Sharding distributes data and incoming requests across nodes, increasing performance and balancing the load. Sharding does not compress data nor does it guarantee the absence of replication conflicts. It also is not responsible for encrypting databases.
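One way to picture how sharding spreads documents is to hash each document ID and map the hash onto a fixed number of equal ranges. The sketch below uses CRC32 and a shard count `q` of 8 as assumptions for illustration; CouchDB's actual placement logic may differ in detail.

```python
import zlib
from collections import Counter


def shard_for(doc_id, q=8):
    """Map a document ID to one of q equal hash ranges (0 .. q-1)."""
    key = zlib.crc32(doc_id.encode("utf-8"))  # value in 0 .. 2**32 - 1
    range_size = 2**32 // q
    # min() guards the top of the range when q does not divide 2**32 evenly.
    return min(key // range_size, q - 1)


# Count how many of 1000 made-up document IDs land in each shard.
counts = Counter(shard_for("doc-%d" % i) for i in range(1000))
for shard in sorted(counts):
    print(shard, counts[shard])
```

Because the shard depends only on the document ID, any node can compute where a document lives without consulting a central directory.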
When writing data in a distributed CouchDB cluster, what does the 'quorum' parameter control?
Explanation: The quorum parameter sets the number of nodes that must confirm a write for it to be considered successful. It does not relate to data shard sizes, encryption key management, or marking previous document versions for deletion. Adjusting quorum affects data consistency and availability.
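The arithmetic behind a quorum is small enough to sketch. The example assumes a majority rule, so that with three copies of each shard two acknowledgements are needed; in clustered CouchDB the write quorum can be overridden per request with the `w` parameter. The helper names are hypothetical.

```python
def default_quorum(n):
    """Smallest majority of n copies, e.g. 2 of 3."""
    return n // 2 + 1


def write_meets_quorum(acks, w):
    """True when at least w copies confirmed the write."""
    return acks >= w


n = 3  # assumed number of copies per shard
w = default_quorum(n)
for acks in range(n + 1):
    print("acks=%d quorum=%d met=%s" % (acks, w, write_meets_quorum(acks, w)))
```

Raising `w` toward `n` favors consistency, since more copies must agree before the write is reported as successful. Lowering it favors availability, since the write can succeed while some copies are unreachable.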
Which approach ensures that recent data is not lost if a CouchDB node crashes before synchronizing with others?
Explanation: Write-ahead logging records recent changes durably before they are applied, so that they can be recovered even after a crash, improving durability. CouchDB's append-only database files serve a similar purpose. Simply rebooting does not address data loss risk. Migrating to CSV files is unrelated and does not enhance safety. Reducing nodes may reduce system scalability and fault tolerance.
During a temporary network partition in a distributed CouchDB environment, what is a likely outcome?
Explanation: When nodes on opposite sides of a network partition make concurrent updates, conflicts can arise and will need resolving when connectivity is restored. Data is not instantly lost during a network partition. Reads and writes may still succeed within each partition, but the partitions do not synchronize with each other until the link is restored. Conflicts are not automatically merged without review.
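Whether a document edited during a partition ends up conflicted depends on whether its two revision histories diverged or one side is simply ahead. The following is a minimal sketch of that check, representing each history as a list of revision IDs from oldest to newest; the function name and the revision IDs are made up for illustration.

```python
def diverged(history_a, history_b):
    """True when neither revision history is a prefix of the other."""
    shorter, longer = sorted((history_a, history_b), key=len)
    return longer[: len(shorter)] != shorter


# Both partitions edited revision 1-a independently: a conflict.
print(diverged(["1-a", "2-b"], ["1-a", "2-c"]))

# Only one partition edited the document: the other side just catches up.
print(diverged(["1-a"], ["1-a", "2-b"]))
```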
When a new node is added to a CouchDB distributed cluster, what usually happens to balance data?
Explanation: To utilize the new node, data is usually redistributed among all nodes for balanced storage and workload. In CouchDB this redistribution is generally an operator-driven step of moving shards rather than something that happens on its own. The new node does not automatically replace others, and clients do not need to reconnect only to the latest node. No conversion to flat files is performed during such scaling operations.
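The redistribution step can be sketched as moving shards from the most loaded nodes to the newcomer until it holds roughly its fair share. This is a toy planning function with hypothetical node and shard names; it only computes a target layout and does not move any data.

```python
def rebalance(assignment, new_node):
    """Return a new node -> shards mapping that includes new_node."""
    result = {node: list(shards) for node, shards in assignment.items()}
    result[new_node] = []
    total = sum(len(shards) for shards in result.values())
    target = total // len(result)  # fair share, rounded down
    while len(result[new_node]) < target:
        # Take one shard from whichever existing node holds the most.
        donor = max(
            (node for node in result if node != new_node),
            key=lambda node: len(result[node]),
        )
        result[new_node].append(result[donor].pop())
    return result


before = {
    "node1": ["s0", "s1", "s2", "s3"],
    "node2": ["s4", "s5", "s6", "s7"],
}
after = rebalance(before, "node3")
for node, shards in after.items():
    print(node, shards)
```

Moving only as many shards as the new node needs keeps the amount of data copied across the network small, which matters more as the cluster grows.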