Essential SQL Interview Questions for Data Engineers Quiz

Test your knowledge of SQL basics and data engineering interview concepts with this quiz. Explore questions designed to help you practice SQL syntax, query logic, joins, data modeling, and practical problem-solving skills for real-world data engineering scenarios.

  1. Understanding SQL Keywords

    Which SQL keyword is used to remove duplicate records from the result set of a query?

    1. DISTINCT
    2. DIFFERENT
    3. REMOVE
    4. UNIQUE

    Explanation: DISTINCT is the correct SQL keyword used to return only unique rows, removing duplicates from your result set. UNIQUE is incorrect because it is typically used in constraints, not in SELECT statements. REMOVE and DIFFERENT are not valid SQL keywords for eliminating duplicates. Only DISTINCT provides this functionality in standard SQL syntax.

  2. Selecting Columns

    In SQL, which symbol is used to select all columns from a table in a SELECT statement?

    1. #
    2. %
    3. *
    4. u0026

    Explanation: The asterisk (*) symbol is used in SQL to select all columns from a table, such as using SELECT * FROM table_name. The other options (#, %, u0026) are not valid symbols for this purpose. Using * is a quick way to retrieve every column, whereas specifying each column name gives more control.

  3. Filtering Results

    What SQL clause would you use to return only rows where an employee's salary is over 50000?

    1. WHERE
    2. SELECT
    3. HAVING
    4. ORDER BY

    Explanation: The WHERE clause is used to filter rows based on a specified condition, such as WHERE salary u003E 50000. HAVING is used with GROUP BY for aggregated data, SELECT is for choosing columns, and ORDER BY organizes the result set. Only WHERE directly filters rows by their individual values.

  4. Sorting Records

    Which clause sorts the result of a query in SQL?

    1. SORT
    2. GROUP BY
    3. ARRANGE
    4. ORDER BY

    Explanation: ORDER BY is the standard SQL clause used for sorting query results, either in ascending or descending order. GROUP BY is for grouping rows for aggregation, SORT and ARRANGE are not valid SQL keywords. ORDER BY is essential for arranging results based on one or more columns.

  5. Understanding Joins

    Which type of SQL join returns all rows from both tables, matching them where possible, and filling with NULLs where no match exists?

    1. RIGHT JOIN
    2. LEFT JOIN
    3. FULL OUTER JOIN
    4. INNER JOIN

    Explanation: A FULL OUTER JOIN returns all rows from both tables and fills in missing matches with NULL values. INNER JOIN only returns matched rows. LEFT JOIN returns all rows from the left table and matching rows from the right. RIGHT JOIN does the opposite for the right table. Only FULL OUTER JOIN includes all rows from both sides.

  6. Aggregated Data

    If you want to find the total number of orders in an 'orders' table, which SQL function should you use?

    1. SUM
    2. MAX
    3. AVG
    4. COUNT

    Explanation: COUNT is the function used to determine how many rows (orders) exist in a table or meet a condition. SUM would add up numeric values, AVG calculates the average, and MAX gives the largest value in a column. COUNT is the function specifically for counting rows.

  7. Grouping Rows

    Which clause in SQL groups rows that have the same values in specified columns?

    1. JOIN
    2. ORDER BY
    3. GROUP BY
    4. DISTINCT

    Explanation: GROUP BY clusters rows into groups based on the specified column values. ORDER BY is for sorting, JOIN combines tables, and DISTINCT removes duplicates. GROUP BY is essential for aggregate functions like SUM or COUNT to operate on each group.

  8. Alias Usage

    How would you rename a column in your result set to 'Total' using SQL?

    1. CHANGE
    2. RENAME
    3. AS
    4. LIKE

    Explanation: The AS keyword assigns an alias to a column, for example, SELECT SUM(price) AS Total. LIKE is used for pattern matching, RENAME is not a valid keyword in SELECT, and CHANGE does not rename columns in SQL. AS is the standard method for giving columns descriptive names in query results.

  9. Limiting Results

    Which keyword limits the number of rows returned by a SELECT query in SQL?

    1. RESTRICT
    2. TOP
    3. OFFSET
    4. LIMIT

    Explanation: LIMIT is used to specify the maximum number of rows a query returns, especially in systems like MySQL and PostgreSQL. TOP is used in some databases but is less standard. RESTRICT and OFFSET are not used for this purpose; OFFSET is used alongside LIMIT to skip rows. LIMIT is the correct and widely accepted choice.

  10. Checking for Nulls

    What condition would you use in SQL to find rows where the 'date_shipped' column has no value?

    1. IS BLANK
    2. IS NULL
    3. NULL
    4. EQUALS NULL

    Explanation: The IS NULL condition checks if a column has no value. IS BLANK and EQUALS NULL are not valid SQL syntax. NULL alone isn't a condition; it's a value. Using IS NULL is the correct and only standard way to test for missing or undefined values in SQL.

  11. Primary Key Meaning

    What is the primary purpose of a primary key in a SQL table?

    1. To calculate totals
    2. To store descriptive text
    3. To combine two tables
    4. To uniquely identify each row

    Explanation: A primary key enforces uniqueness, ensuring each row in the table can be distinguished by its key value. It does not store text, combine tables (that’s the purpose of joins or foreign keys), or calculate totals. Uniqueness and identification are the key functions of a primary key.

  12. Wildcards in Searching

    Which symbol is used as a wildcard for any sequence of characters in SQL's LIKE clause?

    1. *
    2. =
    3. #
    4. %

    Explanation: The percent sign (%) acts as a wildcard in SQL LIKE clauses, matching any sequence of characters. The asterisk (*) is used as a wildcard in some systems but not standard SQL's LIKE. The hash (#) and equals sign (=) do not serve as wildcards. Use % when searching for patterns in textual data.

  13. Updating Data

    Which SQL statement is used to change existing values in a table?

    1. INSERT
    2. ALTER
    3. SELECT
    4. UPDATE

    Explanation: UPDATE is the statement to modify existing values in database rows, often paired with SET and WHERE. INSERT adds new records, SELECT queries data, and ALTER changes the table structure, not its data. Only UPDATE modifies the current data within rows.

  14. Combining Results

    Which SQL operator combines the results of two SELECT statements and removes duplicates?

    1. UNION
    2. MERGE
    3. ADD
    4. JOIN

    Explanation: The UNION operator combines two result sets and automatically removes duplicate rows. JOINs are for connecting tables horizontally, MERGE is used for merging tables in some databases but with different behavior, and ADD is not an SQL operator. Use UNION when you want to stack the results of compatible queries.

  15. Foreign Key Purpose

    What is the role of a foreign key in a relational database table?

    1. To generate automatic values
    2. To display query results
    3. To store numeric data
    4. To establish a link to another table

    Explanation: A foreign key is a constraint that references a column in another table, creating relationships between data sets. It is not for storing numbers (any column can do that), generating values (handled by sequences or auto-increment features), or displaying results. The main use of a foreign key is to maintain data integrity between tables.

  16. Default Sorting Order

    What is the default sort order when you use ORDER BY in SQL without specifying ASC or DESC?

    1. Descending
    2. Grouped
    3. Random
    4. Ascending

    Explanation: By default, ORDER BY sorts results in ascending order (smallest to largest or A to Z). Descending would require the DESC keyword. Random is not the default behavior, and GROUPED refers to the GROUP BY operation, not to sorting. If no order is specified, ascending is assumed in SQL.