Data Modeling and Schema Design for Analytical Databases Quiz Quiz

Assess your understanding of data modeling concepts and schema design strategies for analytical databases. This quiz covers dimensional modeling, normalization, denormalization, star and snowflake schemas, surrogate keys, and best practices essential for analytical data environments.

  1. Star vs. Snowflake Schemas

    Which schema design uses denormalized dimension tables to reduce query complexity and is commonly used in analytical databases?

    1. Star schema
    2. Ring schema
    3. Snowflake schema
    4. Hierarchical schema

    Explanation: A star schema uses denormalized dimension tables connected directly to a central fact table, making queries simpler and faster for analytical workloads. The snowflake schema, in contrast, normalizes dimension tables, which can add complexity. The hierarchical schema is unrelated to analytical databases, and ring schema is not a standard term in data modeling.

  2. Fact Table Identification

    In a sales analytics example, which table type primarily stores transactional data such as 'SaleAmount' and 'QuantitySold'?

    1. Fact table
    2. Attribute table
    3. Dimension table
    4. Bridge table

    Explanation: A fact table records transactional metrics like sale amount and quantity sold, often at the intersection of various dimensions. Dimension tables hold descriptive data such as customer or product details. Attribute tables and bridge tables serve other purposes, such as storing additional properties or resolving many-to-many relationships, not the primary transactional data.

  3. Normalization Purpose

    Why is normalization typically avoided in dimension tables of analytical schemas?

    1. To enforce data consistency
    2. To improve query performance
    3. To save storage space
    4. To increase referential integrity

    Explanation: Denormalization in dimension tables is used to improve query performance by reducing the number of joins needed during data retrieval. Normalization, while useful for enforcing consistency and saving storage in transactional systems, is less favored here as it can slow down complex queries. Referential integrity is maintained but is not the primary reason for avoiding normalization in this context.

  4. Surrogate Key Usage

    What is the main benefit of using surrogate keys in dimension tables for analytical data models?

    1. They enforce business rules
    2. They encrypt sensitive data
    3. They simplify handling slowly changing dimensions
    4. They reduce data duplication

    Explanation: Surrogate keys help manage slowly changing dimensions by providing unique, non-business identifiers that make tracking history easier. They don't directly enforce business rules or encrypt data, and while they can reduce duplication, their primary purpose is related to data version management. This makes slowly changing dimensions easier to implement and track.

  5. Role of a Dimension Table

    Which statement best describes the purpose of a dimension table in analytical schema designs?

    1. They store aggregated metrics
    2. They log data transformations
    3. They provide descriptive context to facts
    4. They serve as indexes for speeding up searches

    Explanation: Dimension tables offer descriptive context, such as customer names or product categories, enabling meaningful analysis of the facts. They do not store aggregated metrics; that's the fact table's role. Indexes assist with search but aren't a schema design element, and logging data transformations is unrelated to dimension tables.

  6. Granularity Selection

    If a data warehouse tracks daily sales per store, what is the granularity of the corresponding fact table?

    1. One record per region
    2. One record per month
    3. One record per store per day
    4. One record per product category

    Explanation: Granularity refers to the level of detail stored; in this case, it's a record for each store on each day. Storing data by region, month, or product category would miss the specified requirement for daily store-level detail. Choosing the right granularity is essential for balancing detail and performance.

  7. Bridge Table Functionality

    Why might a bridge table be included in a dimensional model for analytical databases?

    1. To create star schemas
    2. To store metadata
    3. To resolve many-to-many relationships
    4. To enforce normalization

    Explanation: Bridge tables are used for resolving many-to-many relationships between fact and dimension tables in analytical models. They do not enforce normalization, create star schemas, or store metadata. Their typical purpose is to maintain accurate analytical associations, such as products belonging to multiple categories.

  8. Conformed Dimensions

    What is a conformed dimension in the context of analytical data modeling?

    1. A dimension automatically created by the database
    2. A dimension used across multiple fact tables or data marts
    3. A temporary lookup table
    4. A dimension with only numeric values

    Explanation: A conformed dimension is shared by multiple fact tables or data marts, ensuring consistent reporting and analysis. It is not defined by having only numeric values or being a temporary lookup; nor is it automatically created by databases. Semantic consistency across the system is their main function.

  9. Schema Evolution

    What is the best approach to handle new attributes added to a dimension table in a data warehouse?

    1. Normalize the existing table further
    2. Delete old data and reload
    3. Create a new fact table
    4. Add new columns to the existing dimension table

    Explanation: Adding new columns accommodates new descriptive attributes and is a standard practice for evolving dimensions. Creating a new fact table or normalizing the table may not meet the objective, and deleting old data risks losing historical context. Adjusting the table structure by adding columns maintains backward compatibility.

  10. Slowly Changing Dimension (SCD) Types

    Which type of slowly changing dimension keeps historical records by creating a new row when an attribute changes?

    1. Type 0
    2. Type 2
    3. Type 3
    4. Type 1

    Explanation: A Type 2 slowly changing dimension tracks history by adding a new row for each change, preserving old versions. Type 1 overwrites data without keeping history, Type 0 does not change, and Type 3 stores limited previous values but not full history. Type 2 best supports complete historical analysis for changing attributes.