Structured Data I/O (CSV, JSON) Quiz

Assess your understanding of key concepts in structured data input and output, with an emphasis on the CSV and JSON formats. This quiz covers data structure, parsing nuances, serialization, and common pitfalls, and it suits learners who want to deepen their knowledge of handling structured data files.

  1. CSV vs. JSON Data Organization

    Which statement best describes a key difference in how CSV and JSON formats organize data for structured I/O tasks?

    1. CSV files always require quotation marks around every value, but JSON does not.
    2. CSV stores data as nested objects, whereas JSON uses tabular columns.
    3. CSV organizes data in a table with rows and columns, while JSON uses hierarchical key-value pairs.
    4. CSV can represent complex hierarchies natively, unlike JSON.

    Explanation: CSV structures information as a flat table of rows and columns, which suits simple, tabular data. In contrast, JSON organizes data hierarchically with key-value pairs, allowing for nesting and more complex relationships. CSV does not support nested objects the way JSON does, making the second and fourth options incorrect. Quotation marks in CSV are only required for certain values, such as those containing the delimiter, so the first option is incorrect as well.
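
    As a rough illustration of the difference, the sketch below parses a small CSV table and a small JSON document with Python's standard `csv` and `json` modules. The sample data and variable names are invented for this example and are not part of the quiz.

```python
import csv
import io
import json

# CSV: a flat table. The first line is the header, each later line is one row.
csv_text = "name,city\nAda,London\nLinus,Helsinki\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
# Each row becomes a flat dict whose values are all strings.

# JSON: hierarchical key-value pairs that may nest objects and arrays.
json_text = '{"name": "Ada", "address": {"city": "London", "tags": ["math", "code"]}}'
record = json.loads(json_text)

print(rows[0]["city"])               # flat lookup by column name
print(record["address"]["tags"][1])  # nested lookup through object and array
```

    The CSV rows can only be addressed by column name, while the JSON record is reached by walking through nested keys and indexes.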

  2. Reading CSV Files with Embedded Commas

    When reading a CSV file where some values contain commas, which is the standard method that ensures the data is correctly parsed?

    1. Add a backslash before every comma inside values.
    2. Insert line breaks after every value with a comma.
    3. Replace commas with semicolons in the data.
    4. Enclose values with commas in double quotation marks.

    Explanation: Double quotation marks are the conventional way to enclose values containing commas within CSV files, so parsers can distinguish actual field separators from comma characters in the data. Adding backslashes before commas (first option) is not part of the conventional CSV format, and replacing commas with semicolons (third option) alters the data itself. Inserting line breaks (second option) disrupts the file structure and does not solve the parsing issue.
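
    The quoting convention can be seen in a minimal sketch using Python's standard `csv` module, which by default quotes only the fields that need it. The sample values are made up for illustration.

```python
import csv
import io

# Write a row whose second field contains a comma.
buffer = io.StringIO()
writer = csv.writer(buffer)  # default quoting: only fields that need it
writer.writerow(["name", "address"])
writer.writerow(["Ada", "12 Main St, London"])
csv_text = buffer.getvalue()
# csv_text is 'name,address\r\nAda,"12 Main St, London"\r\n'

# Read it back: the quoted comma stays inside a single field.
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed[1])
```

    Splitting the same line naively on commas would yield three fields instead of two, which is why a CSV-aware parser is usually the safer choice.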

  3. Serializing Data to JSON Format

    Which action is crucial when serializing data into a JSON string to ensure valid output?

    1. Sort the values alphabetically before writing.
    2. Convert non-string dictionary keys to strings.
    3. Remove all whitespace between values.
    4. Replace colons with equal signs for assignments.

    Explanation: Valid JSON requires that all object keys be strings, so non-string dictionary keys must be converted to strings during serialization. Removing whitespace is unnecessary since JSON allows optional whitespace between tokens for readability. Sorting values is not required by the JSON standard, and replacing colons with equal signs would produce invalid JSON, making those options incorrect.
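
    How the key conversion happens depends on the library. As a sketch with Python's standard `json` module: `json.dumps` converts simple keys such as integers to strings on its own, but raises `TypeError` for other key types such as tuples, which then need an explicit conversion. The `"x,y"` key format below is an arbitrary choice made for this example.

```python
import json

# Integer keys are converted to strings automatically by json.dumps.
scores = {1: "low", 2: "high"}
text = json.dumps(scores)
restored = json.loads(text)  # keys come back as the strings "1" and "2"

# Tuple keys are not supported and raise TypeError.
by_point = {(0, 1): "a"}
try:
    json.dumps(by_point)
    error_seen = False
except TypeError:
    error_seen = True

# One possible fix: build string keys explicitly before serializing.
safe = {f"{x},{y}": value for (x, y), value in by_point.items()}
safe_text = json.dumps(safe)
print(text, safe_text)
```

    Note that the round trip is lossy for key types: the integer keys of `scores` return as strings, so code that reloads the data has to convert them back if it needs integers.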

  4. Handling Missing Values in CSV Files

    When a CSV file has missing values in certain rows, what is the standard representation for these missing entries?

    1. They are replaced by the previous row's value.
    2. They must be filled with a zero.
    3. They are marked as 'N/A' in every cell.
    4. They are left as empty fields with no content between the delimiters.

    Explanation: The accepted method is to leave fields representing missing values empty between delimiters, allowing parsers to recognize them as missing without making assumptions. Marking them as 'N/A' or filling them with zero introduces arbitrary data. Carrying over the previous row’s value is not standard and can misrepresent the actual data.
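
    A short sketch of how empty fields look to a parser, using Python's standard `csv` module: the reader returns them as empty strings, and it is up to the program to decide how to treat them. Mapping them to `None` here is one possible choice for this example, not something the CSV format prescribes.

```python
import csv
import io

# Row 2 has no name and row 3 has no score: nothing sits between the delimiters.
csv_text = "id,name,score\n1,Ada,90\n2,,85\n3,Linus,\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# The csv module reports missing entries as empty strings.
# Here they are mapped to None so later code can tell "missing" from real text.
cleaned = [
    {key: (value if value != "" else None) for key, value in row.items()}
    for row in rows
]
print(cleaned[1])
print(cleaned[2])
```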

  5. JSON vs. CSV Data Types Support

    Which type of data can JSON represent directly but basic CSV structure cannot?

    1. Tab-delimited lists
    2. Flat numerical values
    3. Nested arrays and objects
    4. Single row headers

    Explanation: JSON natively supports nesting of arrays and objects, meaning complex or hierarchical data structures can be represented directly. CSV, by design, represents flat data and does not support native nesting. Flat numerical values and row headers are supported by both formats, while tab-delimited lists refer to a variation of CSV, not a data type.
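
    To make the contrast concrete, the sketch below keeps a nested record intact through a JSON round trip and then flattens the same record into CSV rows. The `order` data and the one-row-per-item layout are assumptions made for this example; other flattening conventions are equally possible.

```python
import csv
import io
import json

# A record with a nested array of objects.
order = {"id": 7, "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]}

# JSON keeps the nesting through a round trip.
round_tripped = json.loads(json.dumps(order))

# CSV has no nesting, so the record is flattened to one row per item,
# repeating the order id on every row.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["order_id", "sku", "qty"])
for item in order["items"]:
    writer.writerow([order["id"], item["sku"], item["qty"]])

flat_rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(flat_rows)
```

    The flattened rows also lose type information: every value read back from the CSV is a string, whereas the JSON round trip preserves the integers.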