Window Functions in TimescaleDB: Advanced Querying Quiz Quiz

Explore your understanding of advanced querying with window functions in TimescaleDB. This quiz reviews key concepts such as ranking, aggregation, partitioning, and analytic SQL for time-series data, helping you master efficient data analysis with window functions.

Identifying Correct Use of ROW_NUMBER
Which window function assigns a unique sequential integer to rows within each partition, starting at 1 for the first row in each partition?
1. DENSE_RANK
2. ROW_NUMBER
3. NTILE
4. RANK
Explanation: ROW_NUMBER assigns a unique, sequential integer to each row in the order specified within the partition. RANK and DENSE_RANK are similar but assign the same number to rows with identical values and can skip or repeat ranks. NTILE divides rows into a specified number of groups and does not guarantee sequential unique integers. Therefore, ROW_NUMBER is the correct function for this behavior.
Understanding PARTITION BY in Window Functions
When using window functions, what is the role of the PARTITION BY clause in a query analyzing daily sales data per store?
1. It divides result rows into groups per store for separate window calculations
2. It increases the number of rows returned in the result
3. It filters only the rows for a specific store
4. It orders all sales by total daily revenue
Explanation: The PARTITION BY clause splits data into partitions, such as by store, so calculations like rankings or sums reset for each store. It doesn't filter rows; that's what WHERE does. It doesn't increase the number of rows but instead groups them logically. It also does not perform ordering by itself; the ORDER BY clause within the window function handles sorting.
Functionality of ORDER BY in Window Functions
In a query using SUM(quantity) OVER (ORDER BY date), what effect does the ORDER BY inside the OVER clause have on the result?
1. It filters out duplicate dates
2. It randomizes the rows within the window
3. It calculates a running total based on date order
4. It limits results to one row per date
Explanation: ORDER BY within the OVER clause determines the sequence for cumulative calculations, such as a running sum by date. It does not randomize, which would require a different function. Filtering out duplicates or limiting to one row per date is not the purpose of an ORDER BY within a window function.
Choosing the Correct Usage of LAG
What value does the LAG function return when used to access the previous temperature value in a time-ordered dataset?
1. The earliest temperature in the dataset
2. The temperature from the next row
3. The temperature from the previous row
4. The average temperature
Explanation: LAG is used to return the value from a previous row based on the defined order, such as time. It does not provide the next row's value (LEAD does that), nor does it give the average or earliest value, which are achieved with other functions or explicit queries. Thus, LAG retrieves the temperature from the previous row.
Selecting NTILE for Equal Distribution
Why would you use the NTILE window function on a table of sensor readings ordered by value?
1. To divide readings into a specified number of equal-sized groups
2. To return only the first value in each group
3. To find the exact median value
4. To skip duplicate values in the results
Explanation: NTILE divides ordered rows into a defined number of nearly equal-size groups, useful for percentiles or quartiles. It does not specifically calculate medians, nor does it filter out duplicates or return only first group values. Its main function is to assign group numbers for distribution analysis.
Correct Syntax for Window Function Usage
Which of the following shows the correct syntax for using AVG(temperature) as a window function for each sensor?
1. AVG(temperature) OVER (PARTITION BY sensor_id)
2. AVG(temperature) PARTITION BY sensor_id
3. AVG(temperature) GROUP sensor_id
4. OVER() AVG(temperature) sensor_id
Explanation: The correct syntax requires placing the window function inside the OVER clause, specifying PARTITION BY for grouping. The second and fourth options don't correctly use SQL syntax for window functions, and the third option misplaces keywords. Only the first option is valid SQL.
Ranking Rows with RANK vs. DENSE_RANK
What is the key difference between RANK and DENSE_RANK when ranking rows with duplicate values?
1. RANK always gives each row a unique number
2. RANK and DENSE_RANK are identical in all cases
3. RANK skips numbers after ties; DENSE_RANK does not
4. DENSE_RANK assigns random numbers to ties
Explanation: RANK will leave gaps in ranking after tied values, while DENSE_RANK assigns sequential numbers with no gaps. DENSE_RANK does not use random numbers, and although they are similar, their treatment of ties is different. RANK does not guarantee a unique number for each row in the case of ties.
Purpose of Frame Clauses in Window Functions
In window functions, what do frame clauses such as ROWS BETWEEN 1 PRECEDING AND CURRENT ROW specify?
1. The subset of rows to include in the window calculation for each row
2. The total number of columns in the table
3. The primary key used for grouping
4. The ordering of the entire query result
Explanation: Frame clauses define which rows are considered for each window calculation, such as including one preceding row to the current row. They are not responsible for ordering the overall query, setting column counts, or creating primary keys. The frame determines calculation scope row by row.
Using LEAD for Forward Values
When analyzing stock prices over time, which window function would you use to retrieve the price from the next row?
1. SUM
2. LAG
3. LEAD
4. RANK
Explanation: LEAD returns data from the next row in the specified order, perfect for comparing current stock prices to upcoming values. LAG gets values from previous rows, not next. RANK and SUM do not retrieve adjacent values; they provide rankings or aggregate totals instead.
Cumulative Sum Application Example
Which query is best for getting a cumulative sum of rainfall amounts ordered by day, when analyzing weather data?
1. SELECT day, SUM(rainfall) OVER (ORDER BY day) AS running_total FROM weather
2. SELECT day, SUM(rainfall) OVER rainfall ORDER BY day
3. SELECT SUM(rainfall) FROM weather GROUP BY day
4. SELECT running_total, rainfall FROM weather
Explanation: Using SUM with OVER (ORDER BY ...) gives a cumulative or running total per day. GROUP BY alone aggregates data but does not provide a running sum. The third option has incorrect syntax, and the fourth does not show how to calculate the running total. Only the first option implements the intended cumulative calculation.

Window Functions in TimescaleDB: Advanced Querying Quiz Quiz

Identifying Correct Use of ROW_NUMBER

Understanding PARTITION BY in Window Functions

Functionality of ORDER BY in Window Functions

Choosing the Correct Usage of LAG

Selecting NTILE for Equal Distribution

Correct Syntax for Window Function Usage

Ranking Rows with RANK vs. DENSE_RANK

Purpose of Frame Clauses in Window Functions

Using LEAD for Forward Values

Cumulative Sum Application Example