Test your foundational knowledge of time-bucket grouping, rollup techniques, and percentile calculation in time-series data analysis.
Grouping Data in Time-Series
Which SQL clause is commonly used to group time-series data into equal intervals, such as every 5 minutes?
- ORDER BY
- COLLECT BY
- GROUP BY
- CLUSTER BY
- ROLLUP
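As a rough illustration of interval grouping, the sketch below runs a GROUP BY over 5-minute buckets in an in-memory SQLite database. The table name `readings`, its columns, and the sample values are assumptions made up for this example; the integer-division trick is one common approach, not the only one.

```python
import sqlite3

# Hypothetical readings: (epoch seconds, value)
rows = [(0, 1.0), (120, 3.0), (299, 5.0), (300, 10.0), (450, 20.0), (900, 7.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts INTEGER, value REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

# Integer division by 300 seconds maps each timestamp to the start
# of the 5-minute bucket that contains it.
query = """
    SELECT (ts / 300) * 300 AS bucket_start,
           COUNT(*)         AS n,
           AVG(value)       AS avg_value
    FROM readings
    GROUP BY bucket_start
    ORDER BY bucket_start
"""
buckets = conn.execute(query).fetchall()
conn.close()

for bucket_start, n, avg_value in buckets:
    print(bucket_start, n, avg_value)
```

Buckets with no readings (here, the one starting at 600) simply do not appear in the result; filling such gaps is a separate step.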
Purpose of Downsampling
Why is downsampling applied to per-second time-series data before it is visualized across several years?
- To duplicate raw values
- To increase the data volume
- To randomly delete data points
- To reduce storage and improve query speed
- To make data more complex
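To make the size reduction concrete, here is a minimal sketch that downsamples one hour of per-second samples into per-minute averages. The synthetic data and the 60-second bucket width are assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

# One hour of hypothetical per-second samples: (epoch second, value)
raw = [(t, float(t % 60)) for t in range(3600)]

# Collect the values that fall into each 60-second bucket.
by_minute = defaultdict(list)
for ts, value in raw:
    by_minute[ts - ts % 60].append(value)

# Keep one averaged point per minute instead of sixty raw points.
downsampled = [(start, mean(vals)) for start, vals in sorted(by_minute.items())]

print(len(raw), "raw points ->", len(downsampled), "downsampled points")
```

The same idea scales up: fewer stored points means less to scan and less to draw when the chart spans years.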
Understanding Time Buckets
If data is grouped into 10-minute time buckets, what happens to data points within each bucket?
- They are separated into new tables
- They are sorted alphabetically
- Each point is displayed individually
- All points are deleted except the first
- They are aggregated into a single summarized value
Aggregation Example
Given temperature readings every minute, how could you summarize hourly trends using SQL?
- CLUSTER BY temperature values
- GROUP BY hour and apply AVG() to the temperature
- SELECT DISTINCT reading per second
- IGNORE PARTITIONS during querying
- USE MINUTE ORDER ON data
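A hedged sketch of hourly summarization follows, again using SQLite from Python. The `temps` table, its text timestamps, and the readings are invented for the example; `strftime` is SQLite's function, and other databases offer their own truncation functions.

```python
import sqlite3

# Hypothetical temperature readings with ISO-8601 text timestamps.
rows = [
    ("2024-01-01 00:00:00", 10.0),
    ("2024-01-01 00:30:00", 12.0),
    ("2024-01-01 01:15:00", 20.0),
    ("2024-01-01 01:45:00", 22.0),
    ("2024-01-01 01:59:00", 24.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temps (ts TEXT, temperature REAL)")
conn.executemany("INSERT INTO temps VALUES (?, ?)", rows)

# strftime truncates each timestamp to its hour; AVG() then
# summarizes every reading that shares that hour.
query = """
    SELECT strftime('%Y-%m-%d %H:00', ts) AS hour,
           AVG(temperature)               AS avg_temp
    FROM temps
    GROUP BY hour
    ORDER BY hour
"""
hourly = conn.execute(query).fetchall()
conn.close()

for hour, avg_temp in hourly:
    print(hour, avg_temp)
```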
Percentile Queries
What is the result of using a percentile function, such as PERCENTILE(90), in a time bucket aggregation?
- A list of all unique values in each bucket
- The maximum value per bucket
- The average across all buckets
- The sum of all values per bucket
- The value below which 90% of values fall in each bucket
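The sketch below computes a 90th percentile per 10-minute bucket in plain Python. It assumes the nearest-rank definition; real systems may interpolate between values instead, so their results can differ slightly. The helper name `nearest_rank_percentile` and the sample points are hypothetical.

```python
from collections import defaultdict

def nearest_rank_percentile(values, p):
    """Nearest-rank percentile for an integer p in 1..100 (one of several definitions)."""
    ordered = sorted(values)
    # Integer form of ceil(p / 100 * n), avoiding floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]

# Hypothetical points: ten values in the first 10-minute bucket, five in the second.
points = [(t * 60, float(t + 1)) for t in range(10)]
points += [(600, 100.0), (660, 200.0), (720, 300.0), (780, 400.0), (840, 500.0)]

by_bucket = defaultdict(list)
for ts, value in points:
    by_bucket[ts - ts % 600].append(value)

p90 = {start: nearest_rank_percentile(vals, 90) for start, vals in sorted(by_bucket.items())}
print(p90)
```

In the first bucket the 90th percentile (9.0) sits below the maximum (10.0), which is the distinction the question is probing.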
Choosing Bucket Size
Which factor most influences the choice of time bucket size when aggregating time-series data?
- The username of the analyst
- The alphabetical order of measurements
- The color of data points
- The total number of columns
- The analysis granularity required
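To see why granularity drives the choice, the following sketch aggregates the same hypothetical series at two bucket widths. The one-minute spike and the helper `bucket_means` are assumptions for illustration only.

```python
from statistics import mean

# One sample per minute for an hour; a single spike at minute 30.
samples = [(m * 60, 61.0 if m == 30 else 1.0) for m in range(60)]

def bucket_means(points, width_s):
    """Average the values that fall into each bucket of width_s seconds."""
    buckets = {}
    for ts, value in points:
        buckets.setdefault(ts // width_s * width_s, []).append(value)
    return {start: mean(vals) for start, vals in sorted(buckets.items())}

per_minute = bucket_means(samples, 60)    # fine granularity: spike is visible
per_hour = bucket_means(samples, 3600)    # coarse granularity: spike is smoothed

print(max(per_minute.values()), per_hour)
```

With 1-minute buckets the spike shows up as 61.0; with a 1-hour bucket it is averaged down to 2.0, so a coarse bucket suits long-term trends but hides short events.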
Rollup Concept
What does a 'rollup' typically refer to in time-series data processing?
- Aggregating detailed data into summary values for larger time intervals
- Increasing the frequency of stored data points
- Rolling data visualizations into a single graph
- Formatting data in text files
- Repeating data queries continuously
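A minimal rollup sketch, assuming minute-level summaries are stored as (bucket start, sum, count) tuples: carrying the sum and count forward lets the hourly average be computed exactly, whereas averaging the minute averages would weight sparse minutes too heavily. The tuple layout and values are hypothetical.

```python
from collections import defaultdict

# Hypothetical minute-level rollup rows: (bucket start in seconds, sum, count)
minute_rollup = [(0, 10.0, 2), (60, 30.0, 1), (3600, 8.0, 4)]

# Roll the minute rows up into hour rows by adding sums and counts.
hourly = defaultdict(lambda: [0.0, 0])
for start, total, count in minute_rollup:
    hour_start = start - start % 3600
    hourly[hour_start][0] += total
    hourly[hour_start][1] += count

hourly_avg = {h: s / c for h, (s, c) in sorted(hourly.items())}
print(hourly_avg)
```

For the first hour the exact average is 40 / 3, while the average of the two minute averages (5.0 and 30.0) would be 17.5.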
AVG() Function Use
When used with time window aggregations, what does the AVG() function calculate?
- The interval between each data point
- The alphabetical median of data labels
- The average of all collected values within each time bucket
- The sum of all time bucket durations
- The number of time buckets
Sample SQL Syntax
Which of the following is correct syntax for grouping by 1-hour intervals in a SQL-like query?
- ORDER BY hour_bucket(timestamp, 1)
- ROLLUP BY interval hour(timestamp)
- GROUP BY time_bucket('1 hour', timestamp)
- GROUP ON hour time, timestamp
- BUCKET HOUR GROUP(timestamp)
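The `time_bucket` function is provided by TimescaleDB; other databases expose similar truncation functions under different names. The Python sketch below mimics the underlying idea with a hypothetical helper, `floor_to_bucket`, which aligns buckets to the Unix epoch. That alignment is an assumption of this example and may not match any particular database's default origin.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def floor_to_bucket(ts: datetime, width: timedelta) -> datetime:
    """Return the start of the epoch-aligned bucket of size `width` containing `ts`."""
    whole_buckets = (ts - EPOCH) // width  # timedelta // timedelta yields an int
    return EPOCH + whole_buckets * width

stamp = datetime(2024, 5, 1, 14, 37, 12, tzinfo=timezone.utc)
hour_start = floor_to_bucket(stamp, timedelta(hours=1))
ten_min_start = floor_to_bucket(stamp, timedelta(minutes=10))

print(hour_start.isoformat(), ten_min_start.isoformat())
```

Grouping rows by the value this helper returns is the in-memory counterpart of grouping by a bucketing expression in SQL.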
Result of Downsampling
How does downsampling affect the resolution of a time-series data set?
- It removes all aggregate values
- It converts all data to categorical labels
- It reduces the data resolution by summarizing multiple points into aggregates
- It increases the data granularity by splitting points further
- It sorts the data by event name