Discover fundamental concepts and best practices commonly assessed in TCS Data Analyst interviews, including SQL operations, Power BI functions, data analysis steps, and problem-solving approaches.
Which statement best describes the difference between INNER JOIN and LEFT JOIN in SQL when querying two tables?
Explanation: The correct answer captures the key difference: INNER JOIN returns only rows that have a match in both tables, while LEFT JOIN returns every row from the left table and fills the right table's columns with NULL where no match exists. Option B reverses the behavior, C wrongly centers on the right table, and D is false because the two joins usually return different results.
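A minimal sketch of the difference, using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the data are hypothetical and exist only for illustration:

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

# LEFT JOIN: every customer; amount is NULL (None in Python) when no order matches.
left_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Meera has no orders, so she is missing from the INNER JOIN result but appears in the LEFT JOIN result with an amount of None.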
If you find missing values in a dataset, which of the following is NOT a standard method to handle them?
Explanation: Randomly filling with extreme values is not a recommended method, as it can distort the analysis. The other options—dropping, imputing, and using predictive methods—are common and appropriate ways to handle missing data.
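As a rough illustration of the two simplest approaches, dropping and imputing, here is a sketch in plain Python. The ages list is invented for this example, and None stands in for a missing value; real projects would more likely use a data-frame library:

```python
from statistics import mean, median

# Hypothetical column with two missing entries (None).
ages = [34, None, 29, 41, None, 38]

# Option 1: drop the missing entries entirely.
observed = [a for a in ages if a is not None]

# Option 2: impute with a summary statistic of the observed values.
mean_value = mean(observed)
median_value = median(observed)
mean_imputed = [a if a is not None else mean_value for a in ages]
median_imputed = [a if a is not None else median_value for a in ages]

print(observed)
print(mean_imputed)
print(median_imputed)
```

Dropping shrinks the dataset, while imputing keeps every row but pulls the filled values toward the center of the distribution; which trade-off is acceptable depends on how much data is missing and why.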
Which of the following steps comes FIRST in the data analysis process?
Explanation: Data analysis should begin by clearly defining the problem to be solved. Visualization, analysis, and interpretation occur after data collection and cleaning. Starting with visualization or analysis without a defined problem is inappropriate.
What is the main purpose of OLTP systems as compared to OLAP systems?
Explanation: OLTP is designed for fast, frequent transactional operations such as inserts, updates, and short lookups, while OLAP focuses on complex analytical queries over large volumes of data. Options C and D mix unrelated concepts; option B describes OLAP, not OLTP.
What is the primary use of the CALCULATE() function in Power BI?
Explanation: CALCULATE() evaluates an expression in a modified filter context, applying the filters passed as its arguments. It does not create visualizations, combine tables, or remove duplicate data.
Which method would NOT help in identifying outliers in a dataset?
Explanation: Alphabetical order provides no information about data outliers. The IQR method, Z-scores, and visualizations like boxplots or scatter plots are common outlier identification techniques.
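The IQR and Z-score checks can be sketched in plain Python as follows. The values list is made up, the 1.5 x IQR fences are the usual convention, and the Z-score cutoff of 2.5 is an assumption chosen for this small sample (a cutoff of 3 is also common on larger datasets):

```python
from statistics import mean, pstdev, quantiles

# Hypothetical measurements with one obviously extreme value.
values = [12, 14, 15, 15, 16, 17, 18, 19, 95]

# IQR method: flag points beyond 1.5 * IQR from the quartiles.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
iqr_outliers = [v for v in values if v < lower_fence or v > upper_fence]

# Z-score method: flag points far from the mean in standard-deviation units.
mu = mean(values)
sigma = pstdev(values)
z_cutoff = 2.5  # assumed threshold for illustration
z_outliers = [v for v in values if abs((v - mu) / sigma) > z_cutoff]

print(iqr_outliers)
print(z_outliers)
```

Both checks flag 95 here. Note that the extreme value itself inflates the mean and standard deviation, which is one reason the IQR method is often preferred for small or skewed samples.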
What is the primary goal of normalization in SQL databases?
Explanation: Normalization aims to reduce redundancy and logically organize data. It does not focus on graphical reports, increasing storage, or merging unrelated tables.
In an SQL query, how does the HAVING clause differ from the WHERE clause?
Explanation: The main difference is that WHERE filters rows before grouping and aggregation, while HAVING applies conditions to the aggregated groups. They are not interchangeable, and both are valid SQL clauses.
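To make the ordering concrete, here is a small sketch using Python's built-in sqlite3 module. The sales table, its columns, and the thresholds are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('North', 500), ('North', 50), ('North', 300),
        ('South', 200), ('South', 40),
        ('East', 30), ('East', 20);
""")

# WHERE removes individual rows BEFORE grouping: small sales never reach SUM().
where_rows = conn.execute("""
    SELECT region, SUM(amount) FROM sales
    WHERE amount >= 100
    GROUP BY region
    ORDER BY region
""").fetchall()

# HAVING removes whole groups AFTER aggregation: all rows are summed first.
having_rows = conn.execute("""
    SELECT region, SUM(amount) FROM sales
    GROUP BY region
    HAVING SUM(amount) >= 250
    ORDER BY region
""").fetchall()

# Both together: filter rows, aggregate, then filter the groups.
both_rows = conn.execute("""
    SELECT region, SUM(amount) FROM sales
    WHERE amount >= 100
    GROUP BY region
    HAVING SUM(amount) >= 250
    ORDER BY region
""").fetchall()

print(where_rows)
print(having_rows)
print(both_rows)
```

The North total differs between the queries (800 with the WHERE filter, 850 without it), which shows that WHERE changes what gets aggregated while HAVING only decides which aggregated groups are kept.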
In Power BI, which best describes a 'dimension'?
Explanation: Dimensions are descriptive fields like Product or Region. Numeric calculations are measures; Python scripts are tools, not dimensions; filters can use both measures and dimensions.
If sales have dropped significantly in one region, what is a good first step in your analysis?
Explanation: Comparing trends helps identify when and how the sales drop occurred. Options B and C jump to solutions without analysis, and D ignores potentially important external factors.
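One simple way to compare trends is to compute month-over-month changes per region and look for the first sharp decline. The sketch below uses invented figures; the region names, the helper first_big_drop, and the 10% threshold are all assumptions for illustration:

```python
# Hypothetical monthly sales (in thousands) for two regions.
months = ["Jan", "Feb", "Mar", "Apr"]
monthly_sales = {
    "North": [120, 125, 130, 128],
    "South": [110, 112, 80, 60],
}

def pct_changes(series):
    """Month-over-month fractional change, one value per month after the first."""
    return [(curr - prev) / prev for prev, curr in zip(series, series[1:])]

def first_big_drop(series, threshold=-0.10):
    """Return the label of the first month whose change falls below threshold, or None."""
    for i, change in enumerate(pct_changes(series)):
        if change < threshold:
            return months[i + 1]
    return None

drop_months = {region: first_big_drop(sales) for region, sales in monthly_sales.items()}
print(drop_months)
```

Here the South region's decline starts in March while North stays roughly flat, which narrows the follow-up questions to what changed in South around that month.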