Explore the essential concepts of CouchDB replication with this quiz, featuring fundamental questions about replication types, benefits, conflicts, and key terms. Perfect for beginners seeking to reinforce their understanding of distributed databases and data synchronization.
Which of the following best describes the primary purpose of replication in CouchDB?
Explanation: Replication in CouchDB mainly synchronizes and copies data between two or more databases, ensuring that changes are reflected across all replicas. Query speed improvements, while possible through distribution, are not the central aim. Deleting documents or converting them to a different format, like XML, is not what replication does. Instead, replication keeps databases consistent and up-to-date.
What is the difference between unidirectional and bidirectional replication in CouchDB?
Explanation: Unidirectional replication sends changes from a source database to a target, while bidirectional replication allows both databases to exchange and apply changes to each other. The process is not about deleting or only updating data, nor is it focused on disk space usage. Data conversion is not a replication direction feature.
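The difference can be sketched as the request bodies a client would send to CouchDB's `/_replicate` endpoint. This is a minimal illustration, not a full client: the helper name `replication_jobs` and the database names are assumptions made for the example, and nothing is sent over the network.

```python
def replication_jobs(db_a, db_b, bidirectional=False):
    """Build the JSON bodies to POST to /_replicate (hypothetical helper).

    Unidirectional replication is a single job from db_a to db_b.
    Bidirectional replication is two jobs, one in each direction.
    """
    jobs = [{"source": db_a, "target": db_b}]
    if bidirectional:
        jobs.append({"source": db_b, "target": db_a})
    return jobs


one_way = replication_jobs("orders", "orders_backup")
two_way = replication_jobs("orders", "orders_backup", bidirectional=True)
```

Here `one_way` holds one job and `two_way` holds two, which reflects that bidirectional replication is simply two unidirectional replications running in opposite directions.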
In CouchDB replication, which data format is typically used for document storage and transfer?
Explanation: CouchDB stores and transfers documents in JSON format, which is lightweight and easily processed. CSV and XML are used in other contexts, but not as default formats for CouchDB documents. YML is a common abbreviation and file extension for YAML, a different data format unrelated to default CouchDB storage.
How does CouchDB typically handle replication conflicts when two documents have been updated independently?
Explanation: CouchDB preserves both conflicting versions: it deterministically picks one revision as the current winner, keeps the others marked as conflicts, and leaves it to an application or user to resolve them. Silently overwriting one version would cause data loss, and automatic merging is not supported out of the box because the correct merge is ambiguous. Deleting both versions would discard data, which is not acceptable for conflict management.
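Manual resolution can be sketched as follows. A document fetched with `?conflicts=true` lists its conflicting revisions in `_conflicts`, and one common approach is to delete every revision except the one to keep through `_bulk_docs`. The helper name and the sample document are assumptions for illustration only.

```python
def build_conflict_resolution(doc, keep_rev):
    """Return a _bulk_docs body that deletes every revision except keep_rev.

    `doc` is assumed to be a document fetched with ?conflicts=true, so it
    carries its current `_rev` plus an optional `_conflicts` list.
    """
    all_revs = [doc["_rev"]] + doc.get("_conflicts", [])
    if keep_rev not in all_revs:
        raise ValueError(f"unknown revision: {keep_rev}")
    tombstones = [
        {"_id": doc["_id"], "_rev": rev, "_deleted": True}
        for rev in all_revs
        if rev != keep_rev
    ]
    return {"docs": tombstones}


# Hypothetical conflicted document with one losing revision.
conflicted = {"_id": "cart-42", "_rev": "2-aaa", "_conflicts": ["2-bbb"], "items": 3}
resolution = build_conflict_resolution(conflicted, keep_rev="2-aaa")
```

An application that wants to merge the versions instead would first write the merged content as a new revision and then delete the remaining conflicting revisions in the same way.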
Why is replication considered important in distributed databases like CouchDB?
Explanation: Replication is used mainly to provide high availability, ensuring that data remains accessible and safe if one node fails. While security is essential, replication alone does not protect against all threats. Compression and indexing are unrelated features not directly influenced by replication.
Which of these is NOT a standard method for triggering replication in CouchDB?
Explanation: Replication is not initiated by saving a document as a PDF. It can be triggered on demand through the replication API, set to run continuously, configured with replication documents, or scheduled for specific times (typically by an external job that calls the API). Saving a PDF is unrelated to the replication process.
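As a rough sketch of the replication-document approach, the function below builds a document that could be saved to the `_replicator` database to start a persistent replication. The function name, document ID, and URLs are assumptions for the example; only the `source`, `target`, and `continuous` fields are taken from CouchDB's replication document format.

```python
import json


def replicator_doc(doc_id, source, target, continuous=False):
    """Build a replication document for the _replicator database."""
    return {
        "_id": doc_id,
        "source": source,
        "target": target,
        "continuous": continuous,
    }


# Hypothetical servers and database names.
doc = replicator_doc(
    "orders-to-backup",
    "http://localhost:5984/orders",
    "http://backup.example.com:5984/orders",
    continuous=True,
)
body = json.dumps(doc)  # JSON body for PUT /_replicator/orders-to-backup
```

Sending the same `source` and `target` fields in a POST to `/_replicate` would instead start a transient replication that is not stored as a document.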
What is a 'checkpoint' in the context of CouchDB replication?
Explanation: A checkpoint records the progress of replication so it can resume from the correct position after interruption. It is not for database optimization, admin access, or search acceleration. The term specifically relates to tracking synchronization progress.
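The idea can be illustrated with a toy model, not CouchDB's actual replicator. It assumes integer sequence numbers and an in-memory `target` dictionary. Real CouchDB sequence values are opaque strings, and checkpoints are stored as local documents on the source and target.

```python
def replicate_batch(changes, target, checkpoint=0, batch_size=2):
    """Copy up to batch_size changes newer than checkpoint into target.

    Returns the new checkpoint: the sequence number of the last change
    that was applied, so a later call can resume from that position.
    """
    pending = [c for c in changes if c["seq"] > checkpoint]
    for change in pending[:batch_size]:
        target[change["id"]] = change["doc"]
        checkpoint = change["seq"]
    return checkpoint


# Hypothetical change log on the source database.
changes = [
    {"seq": 1, "id": "a", "doc": {"n": 1}},
    {"seq": 2, "id": "b", "doc": {"n": 2}},
    {"seq": 3, "id": "c", "doc": {"n": 3}},
]
target = {}
first = replicate_batch(changes, target)           # interrupted after two changes
second = replicate_batch(changes, target, first)   # resumes after the checkpoint
```

The second call skips the two changes already copied and transfers only the remaining one, which is the work a checkpoint saves after an interruption.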
During replication, how does CouchDB know which documents have changed?
Explanation: CouchDB uses the changes feed to keep track of document modifications for efficient replication. Access logs track HTTP requests but do not list database changes specifically. Indexes are not solely for tracking data changes, and CSV exports are irrelevant to replication tracking.
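A response from `GET /{db}/_changes` lists one row per changed document, each with its ID and current leaf revisions. The sketch below extracts that information from a sample response. The sample values and the helper name are made up for illustration, and no server is contacted.

```python
def changed_revisions(changes_response):
    """Map each changed document ID to its list of leaf revisions."""
    return {
        row["id"]: [change["rev"] for change in row["changes"]]
        for row in changes_response["results"]
    }


# Hypothetical _changes response; sequence values are opaque strings.
sample = {
    "results": [
        {"seq": "1-xyz", "id": "invoice-1", "changes": [{"rev": "1-abc"}]},
        {"seq": "2-xyz", "id": "invoice-2", "changes": [{"rev": "3-def"}],
         "deleted": True},
    ],
    "last_seq": "2-xyz",
}
revs = changed_revisions(sample)
```

A replicator can pass the `last_seq` value back as the `since` parameter on its next request so that only newer changes are returned.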
What is the primary purpose of a replication filter in CouchDB?
Explanation: Replication filters allow selective replication, restricting which documents get copied based on specified rules. They do not influence memory usage or encryption, nor are they used for view creation. This helps optimize replication for specific needs.
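One way to express such a rule is the `selector` field of a replication request, which holds a Mango-style query. The sketch below builds that request and adds a deliberately simplified, equality-only matcher to show the effect. The helper names and sample documents are assumptions, and real selectors support many more operators than this toy matcher.

```python
def filtered_replication(source, target, selector):
    """Build a replication request that only copies matching documents."""
    return {"source": source, "target": target, "selector": selector}


def matches(doc, selector):
    """Toy equality-only matcher (not CouchDB's full Mango semantics)."""
    return all(doc.get(field) == value for field, value in selector.items())


request = filtered_replication("orders", "invoices_only", {"type": "invoice"})

# Hypothetical documents in the source database.
docs = [
    {"_id": "1", "type": "invoice"},
    {"_id": "2", "type": "receipt"},
]
selected = [d["_id"] for d in docs if matches(d, request["selector"])]
```

Only the first document would be replicated under this rule. Filters can also be written as JavaScript functions stored in a design document.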
What distinguishes continuous replication from a one-time replication in CouchDB?
Explanation: Continuous replication keeps listening for changes and replicates them as soon as they appear, while one-time replication synchronizes the data once and then stops. The options describing automatic data deletion or different data formats are incorrect. Continuous replication may use more bandwidth over time, but it does not inherently require double the network resources.