Enhance your understanding of effective database indexing strategies with this quiz, designed to reinforce key performance optimization concepts. Explore best practices, common indexing pitfalls, and essential query optimization techniques for efficient data retrieval.
Why does having a primary key on a table automatically create an index in most relational databases?
Explanation: A primary key must uniquely identify each row, so most relational databases back it with a unique index. The index lets the database check uniqueness efficiently on every write and gives fast lookups by key. Allowing duplicates would defeat the purpose of a primary key, and compression is not a direct function of indexing. Indexing does not shuffle data randomly but organizes it for quick retrieval.
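As a minimal sketch of this behavior, the snippet below uses Python's built-in sqlite3 module. The `users` table and its columns are hypothetical, and a TEXT key is chosen on purpose because SQLite treats INTEGER PRIMARY KEY as an alias for its internal rowid rather than building a separate index. Other database systems name and report their automatic indexes differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the TEXT primary key makes SQLite build an automatic index.
conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT)")

# PRAGMA index_list rows have the shape (seq, name, unique, origin, partial).
indexes = conn.execute("PRAGMA index_list('users')").fetchall()
for _seq, name, unique, origin, _partial in indexes:
    print(name, "unique" if unique else "non-unique", "origin:", origin)

# The same index is what rejects a second row with the same key.
conn.execute("INSERT INTO users VALUES ('a@example.com', 'Ann')")
try:
    conn.execute("INSERT INTO users VALUES ('a@example.com', 'Bob')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print("duplicate rejected:", duplicate_rejected)
```

No CREATE INDEX statement was issued, yet the table reports one unique index whose origin is the primary key.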
Which type of column is generally best suited for indexing to improve search query performance, for example in a table that is frequently searched by name?
Explanation: Columns that appear often in WHERE conditions benefit the most from indexing, because the index lets the database jump to matching rows instead of scanning the whole table. Columns that are rarely queried or always null are not used for filtering, so indexing them is ineffective. Similarly, a column whose values are all identical has no selectivity, so an index on it cannot narrow the search.
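One way to see the effect is to ask the database for its query plan before and after adding the index. The sketch below assumes SQLite via Python's sqlite3 module; the `customers` table, the `idx_customers_last_name` index, and the `plan` helper are illustrative names, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, last_name TEXT, notes TEXT)"
)
conn.executemany(
    "INSERT INTO customers (last_name, notes) VALUES (?, ?)",
    [(f"name{i}", "") for i in range(1000)],
)

query = "SELECT * FROM customers WHERE last_name = 'name42'"

plan_before = plan(conn, query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_customers_last_name ON customers (last_name)")
plan_after = plan(conn, query)   # the index on the filtered column is used

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a scan of the table, while the second reports a search that names the new index.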
What is the main difference between a unique index and a non-unique index when applied to a column storing email addresses?
Explanation: A unique index guarantees that no two rows share the same value in the indexed column, which is often required for email addresses; a non-unique index speeds up lookups but accepts duplicates. Neither kind directly affects data sorting or specifically impacts deletion speed, and their storage costs are generally similar. The real difference is the uniqueness constraint.
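The difference is easy to observe by inserting the same address twice under each kind of index. This is a small sketch using SQLite through Python's sqlite3 module; the `subscribers` and `page_visits` tables, their index names, and the `insert_twice` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscribers (email TEXT)")
conn.execute("CREATE TABLE page_visits (email TEXT)")
# Unique index: at most one row per email address.
conn.execute("CREATE UNIQUE INDEX idx_subscribers_email ON subscribers (email)")
# Non-unique index: speeds up lookups but allows repeated values.
conn.execute("CREATE INDEX idx_page_visits_email ON page_visits (email)")

def insert_twice(table):
    """Insert the same address twice and return how many rows were stored."""
    for _ in range(2):
        try:
            conn.execute(f"INSERT INTO {table} (email) VALUES ('a@example.com')")
        except sqlite3.IntegrityError as exc:
            print(f"{table}: duplicate rejected ({exc})")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

unique_rows = insert_twice("subscribers")
plain_rows = insert_twice("page_visits")
print("rows under unique index:", unique_rows)
print("rows under non-unique index:", plain_rows)
```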
How can a composite index on (first_name, last_name) benefit queries that search for both fields together?
Explanation: A composite index allows efficient searches when both fields are used in queries, such as looking up a specific full name. It does not handle misspellings, nor does it restrict updates to the indexed columns. While there is some storage overhead, it does not double the database size.
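A query plan makes the benefit visible. The sketch below assumes SQLite via Python's sqlite3 module, with a hypothetical `people` table and an index named `idx_people_name`. It also runs a contrasting query on `last_name` alone. In this setup that query cannot seek through the index because the leading column is missing, though other database systems and configurations may plan it differently.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, city TEXT)"
)
conn.execute("CREATE INDEX idx_people_name ON people (first_name, last_name)")

# Both indexed columns appear in the filter: one index lookup finds the name.
plan_both = plan(
    conn,
    "SELECT * FROM people WHERE first_name = 'Ada' AND last_name = 'Lovelace'",
)

# Only the second column appears: the index order does not help a direct seek.
plan_last_only = plan(conn, "SELECT * FROM people WHERE last_name = 'Lovelace'")

print("both fields:   ", plan_both)
print("last_name only:", plan_last_only)
```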
What is a common drawback of adding too many indexes to a single table containing millions of rows?
Explanation: Excessive indexes require maintenance on each data change, increasing the time to insert or update records. Indexes do not influence query accuracy or impose direct row count limits. They also do not prevent read operations; rather, they typically accelerate search queries.
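The write overhead can be measured roughly by loading the same rows into a table with and without indexes. The sketch below uses SQLite through Python's sqlite3 module; the `events` table, the row count, and the `timed_insert` helper are arbitrary choices for illustration, and the timings depend on the machine, so no particular numbers are claimed.

```python
import sqlite3
import time

def timed_insert(index_count, rows=20000):
    """Insert `rows` rows into a fresh table that has `index_count` indexes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
    for col in "abcd"[:index_count]:
        conn.execute(f"CREATE INDEX idx_events_{col} ON events ({col})")

    # Scrambled values so that index entries do not arrive in sorted order.
    data = [(i, (i * 7919) % rows, (i * 104729) % rows, -i) for i in range(rows)]

    start = time.perf_counter()
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", data)
    conn.commit()
    elapsed = time.perf_counter() - start

    stored = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return elapsed, stored

time_no_index, rows_no_index = timed_insert(0)
time_four_indexes, rows_four_indexes = timed_insert(4)

print(f"no indexes:   {time_no_index:.4f}s for {rows_no_index} rows")
print(f"four indexes: {time_four_indexes:.4f}s for {rows_four_indexes} rows")
```

Both runs store the same rows; the indexed run additionally has to update four index structures for every insert.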
When frequently running queries like SELECT * FROM sales WHERE date BETWEEN '2024-01-01' AND '2024-03-31', which indexing strategy is most effective?
Explanation: Range queries on the date column benefit greatly from an index on that column, because the database can seek to the first matching date and read forward to the last. Indexing all columns is unnecessary and can negatively impact write performance. Removing indexes makes searches slower, and a unique index on price does not directly help date range queries.
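The sketch below runs the quiz's query against a tiny `sales` table in SQLite, using Python's sqlite3 module. The `price` column, the sample rows, the index name `idx_sales_date`, and the `plan` helper are assumptions made for the example. Dates are stored as ISO-8601 text, so text comparison matches chronological order.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO sales (date, price) VALUES (?, ?)",
    [
        ("2023-12-31", 10.0),
        ("2024-01-01", 12.5),
        ("2024-02-15", 8.0),
        ("2024-03-31", 20.0),
        ("2024-04-01", 5.0),
    ],
)
conn.execute("CREATE INDEX idx_sales_date ON sales (date)")

query = "SELECT * FROM sales WHERE date BETWEEN '2024-01-01' AND '2024-03-31'"
range_plan = plan(conn, query)
matching = conn.execute(query).fetchall()

print(range_plan)
print(len(matching), "rows in range")
```

BETWEEN is inclusive on both ends, so the rows dated 2024-01-01 and 2024-03-31 are part of the result.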
Why should unused indexes be periodically dropped from a large database table?
Explanation: Unused indexes consume disk space and impose overhead during insert, update, or delete operations. They do not prevent SELECT queries and have no effect on automatically merging rows or changing the order of stored data. Keeping unnecessary indexes can therefore hinder overall performance.
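The storage cost can be illustrated by counting database pages before creating an index, after creating it, and after dropping it. This sketch assumes SQLite via Python's sqlite3 module; the `logs` table, the index name, and the `pages_in_use` helper are hypothetical. How to identify which indexes are actually unused is specific to each database system and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany(
    "INSERT INTO logs (message) VALUES (?)",
    [(f"message number {i}",) for i in range(5000)],
)
conn.commit()

def pages_in_use():
    """Pages holding data: total pages minus pages on the free list."""
    total = conn.execute("PRAGMA page_count").fetchone()[0]
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return total - free

pages_without_index = pages_in_use()

conn.execute("CREATE INDEX idx_logs_message ON logs (message)")
pages_with_index = pages_in_use()

# Dropping the index hands its pages back to the database for reuse.
conn.execute("DROP INDEX idx_logs_message")
pages_after_drop = pages_in_use()

print("pages without index:", pages_without_index)
print("pages with index:   ", pages_with_index)
print("pages after drop:   ", pages_after_drop)
```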
What routine database activity helps ensure that indexes remain efficient over time, especially in tables with lots of updates?
Explanation: Rebuilding or reorganizing indexes helps eliminate the fragmentation that accumulates as rows are inserted, updated, and deleted, and keeps index access paths efficient. Deleting indexes removes their optimization benefit. Frequently renaming columns or adding random data does not maintain or improve index effectiveness, and can disrupt operations.
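As a rough illustration, the sketch below churns a table with updates and deletes and then rebuilds its index. It assumes SQLite, whose rebuild command is REINDEX, driven from Python's sqlite3 module; other database systems use their own maintenance commands. The `tasks` table and `idx_tasks_priority` index are hypothetical, and the example only checks that the rebuilt index still returns the same answers, not how much fragmentation was removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, priority INTEGER)")
conn.execute("CREATE INDEX idx_tasks_priority ON tasks (priority)")
conn.executemany(
    "INSERT INTO tasks (priority) VALUES (?)",
    [(i % 100,) for i in range(10000)],
)

# Simulate a busy table: rewrite every indexed value, then delete a third of the rows.
conn.execute("UPDATE tasks SET priority = (priority * 37 + id) % 100")
conn.execute("DELETE FROM tasks WHERE id % 3 = 0")
conn.commit()

count_query = "SELECT COUNT(*) FROM tasks WHERE priority = 5"
count_before = conn.execute(count_query).fetchone()[0]

# Rebuild the index from the current table contents.
conn.execute("REINDEX idx_tasks_priority")

count_after = conn.execute(count_query).fetchone()[0]
integrity = conn.execute("PRAGMA integrity_check").fetchone()[0]

print("matches before rebuild:", count_before)
print("matches after rebuild: ", count_after)
print("integrity check:", integrity)
```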
How can an index improve the speed of sorting data when running a query like SELECT * FROM orders ORDER BY order_date?
Explanation: An index on the sorting column already stores its entries in sorted order, so the database can read rows in the desired order instead of sorting them at query time. Indexes do not prevent duplication, encrypt the information, or randomly skip rows, none of which relates to sorting efficiency.
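The saved sort step shows up in the query plan. The sketch below assumes SQLite via Python's sqlite3 module; the `orders` columns other than `order_date`, the index name `idx_orders_order_date`, and the `plan` helper are illustrative, and plan wording can differ between SQLite versions.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_date, total) VALUES (?, ?)",
    [("2024-03-05", 30.0), ("2024-01-20", 12.0), ("2024-02-11", 45.5)],
)

query = "SELECT * FROM orders ORDER BY order_date"

plan_before = plan(conn, query)  # rows are collected, then sorted in a temp structure
conn.execute("CREATE INDEX idx_orders_order_date ON orders (order_date)")
plan_after = plan(conn, query)   # rows are read through the index, already in order

dates = [row[1] for row in conn.execute(query)]

print("before:", plan_before)
print("after: ", plan_after)
print(dates)
```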
In a table where most rows have a status of 'active', how can a partial index on status='inactive' help optimize specific queries?
Explanation: Partial indexes are useful when only a fraction of rows should be searchable quickly, such as those marked 'inactive'. They do not remove rows, sort results by row length, or block access to other data. Instead, they target the indexing to specific conditions, optimizing relevant queries.
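A partial index of this kind might look like the following sketch, which uses SQLite (one of the systems that supports partial indexes) through Python's sqlite3 module. The `accounts` table, its `owner` column, the index name `idx_accounts_inactive`, and the `plan` helper are assumptions for the example. Note that the query spells out the literal 'inactive', since SQLite needs to see that the query's condition matches the index's condition.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, status TEXT, owner TEXT)")
conn.executemany(
    "INSERT INTO accounts (status, owner) VALUES (?, ?)",
    [("inactive" if i % 100 == 0 else "active", f"user{i}") for i in range(1000)],
)

# Only rows with status = 'inactive' get an entry in this index.
conn.execute(
    "CREATE INDEX idx_accounts_inactive ON accounts (status) "
    "WHERE status = 'inactive'"
)

plan_inactive = plan(conn, "SELECT * FROM accounts WHERE status = 'inactive'")
plan_active = plan(conn, "SELECT * FROM accounts WHERE status = 'active'")
inactive_count = conn.execute(
    "SELECT COUNT(*) FROM accounts WHERE status = 'inactive'"
).fetchone()[0]

print("inactive query:", plan_inactive)
print("active query:  ", plan_active)
print("inactive rows: ", inactive_count)
```

The index holds entries for only the small inactive minority, so it stays compact, while queries for active rows fall back to a regular scan and remain fully answerable.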