Data Engineering Interview Essentials Quiz: Python, AWS u0026 SQL Concepts Quiz

Test your foundational knowledge of data engineering with this beginner-friendly quiz covering key Python, AWS, and SQL topics. Ideal for interview preparation and review of essential concepts like inheritance, decorators, AWS Glue, SQL GROUP BY, database schemas, and more.

  1. Python Class Inheritance

    Which keyword is used in Python to create a child class that inherits from a parent class?

    1. super
    2. def
    3. import
    4. class

    Explanation: In Python, the 'class' keyword is used to define both parent and child classes, supporting inheritance of attributes and methods. 'super' is used inside methods to call the parent class, not to define the child class. 'def' is used for defining functions, and 'import' is for importing modules.

  2. Python Decorators

    What is a decorator in Python used for?

    1. To inherit from another class
    2. To encrypt data
    3. To sort a list
    4. To modify the behavior of a function

    Explanation: Decorators in Python are used to modify or enhance the behavior of functions or methods without changing their code. They are not related to encrypting data, sorting, or class inheritance. The other options describe entirely different features or functionalities within Python.

  3. Enumerate in Python

    When looping over a list in Python and you want both the index and the item, which built-in function should you use?

    1. enumerate
    2. counter
    3. zip
    4. sum

    Explanation: The 'enumerate' function allows you to loop over a sequence while keeping track of both the index and the item. 'zip' combines two or more iterables into tuples, 'sum' adds items together, and 'counter' is not a built-in function for this purpose.

  4. Python Unit Testing

    Which Python module is commonly used to write unit tests for your code?

    1. testit
    2. unittest
    3. pycheck
    4. pytest

    Explanation: 'unittest' is the standard module included with Python for writing unit tests. While 'pytest' is also used for testing, it is third-party and not the standard library option. 'testit' and 'pycheck' are not standard Python modules for unit testing.

  5. AWS Glue Purpose

    What is the main purpose of AWS Glue in data engineering tasks?

    1. Extract, Transform, Load (ETL) operations
    2. Storing backup files
    3. Running containerized applications
    4. Managing virtual servers

    Explanation: AWS Glue is designed for ETL (Extract, Transform, Load) tasks like preparing, transforming, and cataloging data. Managing servers and running containers are tasks for other AWS services, while storing backup files is not the main focus of Glue.

  6. Installing External Packages in AWS Glue

    How can you provide additional Python packages to an AWS Glue job?

    1. Add them to the root directory of Glue
    2. Reference .whl or .egg files in an S3 bucket with --extra-py-files
    3. Specify them in the database schema
    4. Install them using pip inside the job script

    Explanation: AWS Glue jobs require you to upload .whl or .egg files to S3 and reference them using the --extra-py-files parameter. Installing with pip inside the job script is not supported, and simply placing the packages in a directory or schema will not work.

  7. SQL GROUP BY Clause

    In SQL, which clause is used to group rows that have the same values in specified columns, often for aggregate functions?

    1. HAVING
    2. GROUP BY
    3. ORDER BY
    4. WHERE

    Explanation: The GROUP BY clause is used to organize rows into groups with the same values, enabling aggregate calculations. ORDER BY sorts results, WHERE filters rows before grouping, and HAVING filters groups after they're created.

  8. Lexicographical Order in Python

    If you have a list ['grape', 'apple', 'banana'] and sort it using Python's sort() method, what will be the first item in the sorted list?

    1. banana
    2. apple
    3. bananaapple
    4. grape

    Explanation: Sorting the list lexicographically (like in the dictionary) puts 'apple' first. 'banana' and 'grape' come later. 'bananaapple' is not in the original list, so it's incorrect.

  9. What is a Database Schema?

    Which statement best describes a database schema?

    1. A single row in a database table
    2. A SQL function
    3. A blueprint describing tables, columns, and data types
    4. A collection of queries

    Explanation: A database schema defines the structure of the entire database, including tables and columns. It does not refer to individual rows, query collections, or SQL functions.

  10. Shallow vs Deep Copying in Python

    In Python, which type of object copying duplicates nested objects completely rather than just referencing them?

    1. Direct assignment
    2. Deep copy
    3. Shallow copy
    4. Lambda copy

    Explanation: Deep copy creates a new object and recursively copies all nested elements, avoiding shared references. Shallow copy only duplicates the outer object. Direct assignment points to the same object, and 'lambda copy' is not a copying term.