Test your knowledge of SQL basics and data engineering interview concepts with this quiz. Explore questions designed to help you practice SQL syntax, query logic, joins, data modeling, and practical problem-solving skills for real-world data engineering scenarios.
Which SQL keyword is used to remove duplicate records from the result set of a query?
Explanation: DISTINCT is the correct SQL keyword used to return only unique rows, removing duplicates from your result set. UNIQUE is incorrect because it is typically used in constraints, not in SELECT statements. REMOVE and DIFFERENT are not valid SQL keywords for eliminating duplicates. Only DISTINCT provides this functionality in standard SQL syntax.
In SQL, which symbol is used to select all columns from a table in a SELECT statement?
Explanation: The asterisk (*) symbol is used in SQL to select all columns from a table, such as using SELECT * FROM table_name. The other options (#, %, u0026) are not valid symbols for this purpose. Using * is a quick way to retrieve every column, whereas specifying each column name gives more control.
What SQL clause would you use to return only rows where an employee's salary is over 50000?
Explanation: The WHERE clause is used to filter rows based on a specified condition, such as WHERE salary u003E 50000. HAVING is used with GROUP BY for aggregated data, SELECT is for choosing columns, and ORDER BY organizes the result set. Only WHERE directly filters rows by their individual values.
Which clause sorts the result of a query in SQL?
Explanation: ORDER BY is the standard SQL clause used for sorting query results, either in ascending or descending order. GROUP BY is for grouping rows for aggregation, SORT and ARRANGE are not valid SQL keywords. ORDER BY is essential for arranging results based on one or more columns.
Which type of SQL join returns all rows from both tables, matching them where possible, and filling with NULLs where no match exists?
Explanation: A FULL OUTER JOIN returns all rows from both tables and fills in missing matches with NULL values. INNER JOIN only returns matched rows. LEFT JOIN returns all rows from the left table and matching rows from the right. RIGHT JOIN does the opposite for the right table. Only FULL OUTER JOIN includes all rows from both sides.
If you want to find the total number of orders in an 'orders' table, which SQL function should you use?
Explanation: COUNT is the function used to determine how many rows (orders) exist in a table or meet a condition. SUM would add up numeric values, AVG calculates the average, and MAX gives the largest value in a column. COUNT is the function specifically for counting rows.
Which clause in SQL groups rows that have the same values in specified columns?
Explanation: GROUP BY clusters rows into groups based on the specified column values. ORDER BY is for sorting, JOIN combines tables, and DISTINCT removes duplicates. GROUP BY is essential for aggregate functions like SUM or COUNT to operate on each group.
How would you rename a column in your result set to 'Total' using SQL?
Explanation: The AS keyword assigns an alias to a column, for example, SELECT SUM(price) AS Total. LIKE is used for pattern matching, RENAME is not a valid keyword in SELECT, and CHANGE does not rename columns in SQL. AS is the standard method for giving columns descriptive names in query results.
Which keyword limits the number of rows returned by a SELECT query in SQL?
Explanation: LIMIT is used to specify the maximum number of rows a query returns, especially in systems like MySQL and PostgreSQL. TOP is used in some databases but is less standard. RESTRICT and OFFSET are not used for this purpose; OFFSET is used alongside LIMIT to skip rows. LIMIT is the correct and widely accepted choice.
What condition would you use in SQL to find rows where the 'date_shipped' column has no value?
Explanation: The IS NULL condition checks if a column has no value. IS BLANK and EQUALS NULL are not valid SQL syntax. NULL alone isn't a condition; it's a value. Using IS NULL is the correct and only standard way to test for missing or undefined values in SQL.
What is the primary purpose of a primary key in a SQL table?
Explanation: A primary key enforces uniqueness, ensuring each row in the table can be distinguished by its key value. It does not store text, combine tables (that’s the purpose of joins or foreign keys), or calculate totals. Uniqueness and identification are the key functions of a primary key.
Which symbol is used as a wildcard for any sequence of characters in SQL's LIKE clause?
Explanation: The percent sign (%) acts as a wildcard in SQL LIKE clauses, matching any sequence of characters. The asterisk (*) is used as a wildcard in some systems but not standard SQL's LIKE. The hash (#) and equals sign (=) do not serve as wildcards. Use % when searching for patterns in textual data.
Which SQL statement is used to change existing values in a table?
Explanation: UPDATE is the statement to modify existing values in database rows, often paired with SET and WHERE. INSERT adds new records, SELECT queries data, and ALTER changes the table structure, not its data. Only UPDATE modifies the current data within rows.
Which SQL operator combines the results of two SELECT statements and removes duplicates?
Explanation: The UNION operator combines two result sets and automatically removes duplicate rows. JOINs are for connecting tables horizontally, MERGE is used for merging tables in some databases but with different behavior, and ADD is not an SQL operator. Use UNION when you want to stack the results of compatible queries.
What is the role of a foreign key in a relational database table?
Explanation: A foreign key is a constraint that references a column in another table, creating relationships between data sets. It is not for storing numbers (any column can do that), generating values (handled by sequences or auto-increment features), or displaying results. The main use of a foreign key is to maintain data integrity between tables.
What is the default sort order when you use ORDER BY in SQL without specifying ASC or DESC?
Explanation: By default, ORDER BY sorts results in ascending order (smallest to largest or A to Z). Descending would require the DESC keyword. Random is not the default behavior, and GROUPED refers to the GROUP BY operation, not to sorting. If no order is specified, ascending is assumed in SQL.