Data Analytics SQL Interview Challenge Quiz

Test your SQL knowledge with this multiple-choice quiz focused on essential concepts and query techniques often encountered in data analytics interviews. Boost your understanding of SQL joins, performance optimization, window functions, and handling common data scenarios.

  1. Finding the Second-Highest Value

    Which SQL clause combination can you use to find the second-highest salary in an 'employees' table?

    1. MAX(salary) LIMIT 1 OFFSET 1
    2. ORDER BY salary ASC LIMIT 2
    3. ORDER BY salary DESC LIMIT 1 OFFSET 1
    4. GROUP BY salary HAVING COUNT(*) = 2

    Explanation: Using 'ORDER BY salary DESC LIMIT 1 OFFSET 1' sorts salaries in descending order and skips the first row, returning the second-highest. 'GROUP BY salary HAVING COUNT(*) = 2' is used for grouping and does not return the actual salary. 'ORDER BY salary ASC LIMIT 2' retrieves the two lowest salaries. 'MAX(salary) LIMIT 1 OFFSET 1' is not a valid SQL pattern for this task.

  2. SQL Query Optimization

    What is a common method to optimize a slow SQL query that involves multiple large tables?

    1. Adding proper indexes to the columns used in joins and filters
    2. Decreasing the number of table columns
    3. Changing data types to text
    4. Using SELECT * for all queries

    Explanation: Adding indexes to columns frequently used in joins or WHERE clauses can greatly improve query performance. Decreasing the number of table columns may not affect join speed significantly. Changing data types to text often slows down operations. Using SELECT * retrieves unnecessary data and may slow queries further.

  3. INNER JOIN vs OUTER JOIN

    Which statement best describes the main difference between INNER JOIN and OUTER JOIN in SQL?

    1. INNER JOIN uses more memory than OUTER JOIN
    2. INNER JOIN sorts the results automatically
    3. OUTER JOIN cannot be used with WHERE clauses
    4. INNER JOIN returns matching rows only, while OUTER JOIN includes unmatched rows with NULLs

    Explanation: INNER JOIN combines rows with matching values in both tables, whereas OUTER JOIN returns all rows from one table and fills in NULLs for missing matches. Memory usage depends on result size, not JOIN type. Both JOIN types can be used with WHERE clauses. Neither JOIN inherently sorts results.

  4. Handling Duplicate Rows

    When faced with duplicate rows in an SQL result, which keyword helps you return only unique rows?

    1. DISTINCT
    2. UNIQUE
    3. SEPARATE
    4. ISOLATE

    Explanation: The DISTINCT keyword filters out duplicate rows, returning only unique rows in the result set. UNIQUE is used as a constraint, not as a query keyword. SEPARATE and ISOLATE are not valid SQL keywords for distinct results.

  5. UNION vs UNION ALL

    When combining two SELECT statements, which keyword preserves duplicate rows in the resulting dataset?

    1. COLLECT
    2. JOIN
    3. UNION ALL
    4. UNION

    Explanation: UNION ALL combines results from multiple queries and includes all duplicate rows. UNION removes duplicates from the result. JOIN is used to relate tables, not to stack rows. COLLECT is not a standard SQL clause for combining results.

  6. Using Indexes Effectively

    How do indexes contribute to better SQL query performance?

    1. By encrypting column data
    2. By removing duplicate data automatically
    3. By allowing faster searching on indexed columns
    4. By compressing the database size

    Explanation: Indexes create data structures that speed up lookups and filtering on large tables. They do not compress the database, remove duplicates, or encrypt data. Their main advantage is reducing the time it takes to find records based on indexed columns.

  7. GROUP BY vs HAVING

    What is the primary purpose of the HAVING clause in SQL?

    1. To filter groups after aggregation
    2. To perform mathematical operations
    3. To increase memory usage
    4. To sort the results ascendingly

    Explanation: The HAVING clause is applied to filter groups created with GROUP BY after aggregation functions like SUM or COUNT. Sorting is accomplished with ORDER BY, not HAVING. HAVING does not influence memory use or directly perform arithmetic calculations.

  8. Difference Between TRUNCATE and DELETE

    Which of these describes the difference between TRUNCATE and DELETE in SQL?

    1. TRUNCATE can operate on multiple tables; DELETE cannot
    2. DELETE renames the table; TRUNCATE changes column types
    3. TRUNCATE removes all rows quickly without conditions; DELETE can remove specific rows using conditions
    4. TRUNCATE always fails if the table has any data

    Explanation: TRUNCATE removes all table rows efficiently and cannot include a WHERE clause, while DELETE allows row-by-row removal using WHERE. Neither statement renames tables or changes column types. TRUNCATE works only when the table exists and may have additional constraints in some databases.

  9. Using Window Functions

    Which SQL feature allows you to perform calculations like running totals or ranking within a query without grouping the results?

    1. Indexes
    2. Window functions
    3. Cursors
    4. Procedures

    Explanation: Window functions perform calculations across specified row sets, such as ranking or calculating running totals, without reducing result rows. Procedures are blocks of code, not calculation tools. Indexes support faster lookups, not analytics. Cursors navigate result sets but do not provide analytic functions.

  10. Subquery vs Join

    In SQL, what is a key distinction between using a subquery and a join?

    1. A subquery nests a SELECT inside another query; a join combines tables in a single query block
    2. A subquery can only be used with numeric data
    3. A subquery always runs faster than a join
    4. A join cannot return columns from both tables

    Explanation: A subquery is a query within another query, often used for filtering or calculating values on the fly, whereas a join brings together tables through related columns in one SELECT statement. Subqueries do not always offer better performance compared to joins. Joins can return columns from both or all joined tables, and subqueries are not restricted to numeric data.