Explore core skills in loading, manipulating, and visualizing data using the Pandas library in Python. Assess your knowledge of key Pandas concepts applied to real-world datasets like COVID-19 statistics.
Which two primary data structures does Pandas provide for handling and analyzing data in Python?
Explanation: DataFrame and Series are the main data structures in Pandas: a Series is a one-dimensional labeled array, and a DataFrame is a two-dimensional labeled table whose columns are Series. Array and Tuple, Matrix and Vector, and List and Set are not Pandas structures; they are built-in Python types or general array concepts associated with NumPy, which Pandas can interoperate with but does not define as its core structures.
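As a minimal sketch of how the two structures relate, the snippet below builds a Series and a DataFrame from made-up values (the country labels and case counts are hypothetical, not real statistics) and shows that selecting one DataFrame column returns a Series.

```python
import pandas as pd

# Hypothetical sample values, for illustration only.
cases = pd.Series([120, 450, 80], name="cases")

df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "cases": [120, 450, 80],
})

# Selecting a single column from a DataFrame yields a Series.
column = df["cases"]

print(type(df).__name__)      # DataFrame
print(type(column).__name__)  # Series
print(df.shape)               # (3, 2)
```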
What Pandas function is used to load a comma-separated values (CSV) file into a DataFrame?
Explanation: The read_csv function loads CSV files into Pandas DataFrames efficiently. to_csv is used for exporting DataFrames to CSV files. read_table reads files with a general delimiter but defaults to tab-separated files. import_csv is not a valid Pandas function.
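The sketch below illustrates the three real functions named above. To stay self-contained it reads from in-memory strings via io.StringIO instead of a file on disk; the column names and values are assumptions made for the example.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path or URL.
csv_text = "date,country,cases\n2020-03-01,A,120\n2020-03-02,A,450\n"

# read_csv parses comma-separated text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# to_csv goes the other way: with no path given it returns the CSV as a string.
exported = df.to_csv(index=False)

# read_table uses a tab as its default separator.
tsv_text = "country\tcases\nA\t120\n"
df_tab = pd.read_table(io.StringIO(tsv_text))

print(df)
print(exported)
print(df_tab)
```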
Which method computes the mean (average) of a DataFrame column in Pandas?
Explanation: The mean() method returns the arithmetic average of a numeric column, skipping missing values by default. sum() returns the total, median() gives the middle value, and average() is not a Pandas method.
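A small sketch of the difference between these methods, using a made-up 'cases' column (the numbers are assumptions chosen so the results are easy to check by hand):

```python
import pandas as pd

# Hypothetical values, for illustration only.
df = pd.DataFrame({"cases": [100, 200, 600]})

mean_cases = df["cases"].mean()      # (100 + 200 + 600) / 3
total_cases = df["cases"].sum()      # 100 + 200 + 600
median_cases = df["cases"].median()  # middle value of the sorted column

# A Series has no average() method, so this lookup finds nothing.
has_average = hasattr(df["cases"], "average")

print(mean_cases, total_cases, median_cases, has_average)
```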
How do you filter rows in a DataFrame where the 'cases' column is greater than 1000?
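Explanation: The bracketed syntax df[df['cases'] > 1000] is correct: the comparison produces a boolean Series, and indexing with it keeps only the rows where the condition is True. filter() selects rows or columns by label, not by value, so it does not accept a comparison like this. where() is a real Pandas method, but it keeps the original shape and replaces non-matching entries with NaN instead of dropping rows. select() is not a DataFrame method.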
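The following sketch contrasts boolean indexing with where() and filter() on a made-up DataFrame (the country labels and case counts are hypothetical):

```python
import pandas as pd

# Hypothetical values, for illustration only.
df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "cases": [500, 1500, 3000],
})

# The comparison yields a boolean Series, one True/False per row.
mask = df["cases"] > 1000

# Boolean indexing keeps only the rows where the mask is True.
high = df[mask]

# where() keeps every row and replaces non-matching values with NaN.
masked = df["cases"].where(mask)

# filter() selects by label (here, a column name), not by value.
only_cases = df.filter(items=["cases"])

print(high)
print(masked)
print(only_cases)
```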
Which Pandas method allows you to quickly plot a column, such as 'cases', as a basic graph?
Explanation: The plot() method provides a simple way to visualize DataFrame columns; by default it draws a line chart using Matplotlib. draw(), show(), and visualize() are not Pandas plotting methods.
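A minimal plotting sketch, assuming Matplotlib is installed (Pandas uses it as its default plotting backend). The data is made up, and the Agg backend plus the in-memory buffer are choices made here so the example runs without a display or writing files.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import pandas as pd

# Hypothetical daily counts, for illustration only.
df = pd.DataFrame({"cases": [100, 250, 400, 900]})

# Series.plot() draws a line chart by default and returns a Matplotlib Axes.
ax = df["cases"].plot(title="Cases over time")
ax.set_xlabel("row index")
ax.set_ylabel("cases")

# Render the figure to an in-memory PNG; pass a file name to save it to disk.
buffer = io.BytesIO()
ax.figure.savefig(buffer, format="png")
```

In an interactive session or notebook the chart is usually displayed directly, so the backend line and the buffer can be dropped.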