SQL Interview Questions and Answers (175+ High-Priority Questions)

SQL Interview Questions and Answers

This guide is designed for Data Engineer, Data Analyst, Database Developer, SQL Developer, BI Developer, ETL Developer, Cloud Data Engineer, Solution Architect, and Technical Architect interviews.


SECTION 1: SQL FUNDAMENTALS

1. What is SQL?

Answer:

SQL (Structured Query Language) is a standard language used to:

  • Store data
  • Retrieve data
  • Update data
  • Delete data
  • Manage databases

Example:

SELECT * FROM Employees;

2. What are the different types of SQL statements?

Answer:

DDL (Data Definition Language)

CREATE
ALTER
DROP
TRUNCATE

DML (Data Manipulation Language)

INSERT
UPDATE
DELETE
MERGE

DQL (Data Query Language)

SELECT

DCL (Data Control Language)

GRANT
REVOKE

TCL (Transaction Control Language)

COMMIT
ROLLBACK
SAVEPOINT

3. What is a Database?

Answer:

A structured collection of related data.

Example:

  • Employee Database
  • Banking Database
  • Hospital Database

4. What is a Table?

Answer:

A table stores data in rows and columns.

Example:

EmpIDName
101John
102David

5. What is a Row?

Answer:

A row represents one record.

Example:

101 | John | IT

6. What is a Column?

Answer:

A column represents one attribute.

Example:

EmployeeName
Salary
Department

7. What is a Primary Key?

Answer:

A column that uniquely identifies each row.

Properties:

  • Unique
  • Not Null
  • One per table

Example:

CREATE TABLE Employee(
EmpID INT PRIMARY KEY
);

8. What is a Foreign Key?

Answer:

A column that creates relationship between tables.

CREATE TABLE Orders(
CustomerID INT,
FOREIGN KEY(CustomerID)
REFERENCES Customers(CustomerID)
);

9. Difference between Primary Key and Foreign Key?

Primary KeyForeign Key
UniqueCan repeat
Not NullCan be Null
Identifies rowCreates relationship

10. What is a Unique Key?

Answer:

Prevents duplicate values.

CREATE TABLE Users(
Email VARCHAR(100) UNIQUE
);

SECTION 2: NULLS & CONSTRAINTS

11. What is NULL?

Answer:

Represents missing or unknown value.

SELECT * FROM Employee
WHERE ManagerID IS NULL;

12. Difference between NULL and Zero?

NULLZero
UnknownNumeric value
No dataActual value

13. What is NOT NULL Constraint?

Answer:

Ensures column cannot be empty.

Name VARCHAR(50) NOT NULL

14. What is CHECK Constraint?

Answer:

Validates data.

Age INT CHECK (Age >=18)

15. What is DEFAULT Constraint?

Answer:

Assigns default value.

Status VARCHAR(20)
DEFAULT 'Active'

16. What is Referential Integrity?

Answer:

Ensures foreign key values exist in parent table.


17. What happens when inserting NULL into NOT NULL column?

Answer:

Database throws error.


18. What is Composite Key?

Answer:

Multiple columns forming primary key.

PRIMARY KEY(OrderID, ProductID)

19. What is Candidate Key?

Answer:

Columns capable of becoming Primary Key.


20. What is Alternate Key?

Answer:

Candidate Key not chosen as Primary Key.


SECTION 3: JOINS

21. What is JOIN?

Answer:

Combines rows from multiple tables.


22. Types of Joins?

  1. INNER JOIN
  2. LEFT JOIN
  3. RIGHT JOIN
  4. FULL JOIN
  5. SELF JOIN
  6. CROSS JOIN

23. INNER JOIN

SELECT *
FROM Orders O
INNER JOIN Customers C
ON O.CustomerID=C.CustomerID;

Returns matching rows.


24. LEFT JOIN

Returns all rows from left table.

SELECT *
FROM A
LEFT JOIN B
ON A.ID=B.ID;

25. RIGHT JOIN

Returns all rows from right table.


26. FULL OUTER JOIN

Returns all rows from both tables.


27. CROSS JOIN

Returns Cartesian product.

SELECT *
FROM A CROSS JOIN B;

28. SELF JOIN

Join table to itself.

SELECT E.Name,M.Name
FROM Employee E
JOIN Employee M
ON E.ManagerID=M.EmpID;

29. Difference between INNER and OUTER JOIN?

INNER = matching rows only

OUTER = matching + non-matching rows


30. What is Cartesian Product?

Answer:

Every row joined with every row.


SECTION 4: AGGREGATE FUNCTIONS

31. What is COUNT()?

SELECT COUNT(*) FROM Employee;

32. COUNT(*) vs COUNT(Column)

COUNT(*) counts all rows

COUNT(Column) ignores NULLs


33. SUM()

SELECT SUM(Salary)
FROM Employee;

34. AVG()

SELECT AVG(Salary)
FROM Employee;

35. MAX()

SELECT MAX(Salary)
FROM Employee;

36. MIN()

SELECT MIN(Salary)
FROM Employee;

37. GROUP BY

SELECT Department,
COUNT(*)
FROM Employee
GROUP BY Department;

38. HAVING

Filters grouped data.

SELECT Department,
COUNT(*)
FROM Employee
GROUP BY Department
HAVING COUNT(*) > 5;

39. WHERE vs HAVING?

WHEREHAVING
Before GroupingAfter Grouping
Individual RowsGroups

40. Can Aggregate Functions Ignore NULL?

Yes.


SECTION 5: SUBQUERIES

41. What is a Subquery?

Query inside another query.

SELECT *
FROM Employee
WHERE Salary >
(
SELECT AVG(Salary)
FROM Employee
);

42. Types of Subqueries?

  • Single Row
  • Multiple Row
  • Correlated
  • Nested

43. What is Correlated Subquery?

Runs once for every outer row.

SELECT *
FROM Employee E
WHERE Salary >
(
SELECT AVG(Salary)
FROM Employee
WHERE Department=E.Department
);

44. EXISTS vs IN?

EXISTS:

  • Stops after first match
  • Better for large datasets

IN:

  • Compares list

45. NOT EXISTS vs NOT IN?

NOT EXISTS handles NULL better.


SECTION 6: INDEXES

46. What is an Index?

Database object improving query performance.


47. Why use Index?

Faster retrieval.


48. Types of Indexes?

  • Clustered
  • Non-clustered
  • Composite
  • Unique
  • Bitmap (Oracle)

49. Clustered Index?

Stores data physically sorted.

One per table.


50. Non-Clustered Index?

Separate structure pointing to data.

Multiple allowed.


51. Clustered vs Non-Clustered?

ClusteredNon-Clustered
Physical orderLogical structure
OneMany

52. Disadvantages of Index?

  • Storage
  • Slower inserts
  • Slower updates

53. What is Composite Index?

Multiple columns indexed.

CREATE INDEX idx_emp
ON Employee(DeptID,Salary);

54. What is Covering Index?

Contains all required columns.


55. What is Index Scan?

Reads many index pages.


56. What is Index Seek?

Direct lookup.

More efficient.


57. What is Full Table Scan?

Entire table read.


58. How to identify missing indexes?

Using execution plans.


59. What causes index fragmentation?

Frequent inserts/updates/deletes.


60. How to fix fragmentation?

Rebuild or reorganize index.


SECTION 7: NORMALIZATION

61. What is Normalization?

Reducing redundancy.


62. Why Normalize?

  • Better consistency
  • Less duplication

63. First Normal Form (1NF)

No repeating groups.


64. Second Normal Form (2NF)

1NF + no partial dependency.


65. Third Normal Form (3NF)

2NF + no transitive dependency.


66. BCNF?

Stronger version of 3NF.


67. What is Denormalization?

Adding redundancy for performance.


68. Benefits of Denormalization?

Faster reporting.


69. Drawbacks of Denormalization?

Data duplication.


70. OLTP vs OLAP?

OLTPOLAP
TransactionsAnalytics
NormalizedDenormalized

SECTION 8: TRANSACTIONS

71. What is Transaction?

Single logical unit of work.


72. ACID Properties?

Atomicity

All or nothing

Consistency

Valid state

Isolation

Transactions independent

Durability

Permanent after commit


73. COMMIT?

Saves transaction.


74. ROLLBACK?

Undo transaction.


75. SAVEPOINT?

Partial rollback point.


76. What is Deadlock?

Two transactions waiting on each other.


77. How to prevent deadlocks?

  • Consistent object access order
  • Short transactions

78. What is Locking?

Protecting data consistency.


79. Types of Locks?

  • Shared
  • Exclusive
  • Update

80. What is Blocking?

One transaction waits for another.


SECTION 9: ADVANCED SQL

81. What is CTE?

Common Table Expression.

WITH SalesCTE AS
(
SELECT *
FROM Sales
)
SELECT *
FROM SalesCTE;

82. Recursive CTE?

Used for hierarchies.


83. What is Window Function?

Performs calculations across rows.


84. ROW_NUMBER()

SELECT ROW_NUMBER()
OVER(ORDER BY Salary DESC)

85. RANK()

Handles ties.


86. DENSE_RANK()

No rank gaps.


87. LEAD()

Next row value.


88. LAG()

Previous row value.


89. NTILE()

Creates buckets.


90. Running Total?

SUM(Salary)
OVER(ORDER BY EmpID)

SECTION 10: SCENARIO-BASED QUESTIONS

91. Find 2nd Highest Salary

SELECT MAX(Salary)
FROM Employee
WHERE Salary <
(
SELECT MAX(Salary)
FROM Employee
);

92. Find Nth Highest Salary

SELECT Salary
FROM
(
SELECT Salary,
DENSE_RANK()
OVER(ORDER BY Salary DESC) RN
FROM Employee
) X
WHERE RN=5;

93. Find Duplicate Records

SELECT Email,
COUNT(*)
FROM Users
GROUP BY Email
HAVING COUNT(*)>1;

94. Delete Duplicate Records

Using ROW_NUMBER().


95. Find Employees Without Manager

SELECT *
FROM Employee
WHERE ManagerID IS NULL;

96. Find Department Highest Salary

Using ROW_NUMBER partition.


97. Find Top 3 Salaries per Department

Using DENSE_RANK().


98. Find Missing IDs

Using sequence comparison.


99. Difference Between DELETE, TRUNCATE, DROP?

DELETETRUNCATEDROP
DMLDDLDDL
Can rollbackUsually noNo
Removes rowsRemoves all rowsRemoves table

100. MERGE Statement?

UPSERT operation.

MERGE Target T
USING Source S
ON T.ID=S.ID
WHEN MATCHED THEN
UPDATE SET Name=S.Name
WHEN NOT MATCHED THEN
INSERT(ID,Name)
VALUES(S.ID,S.Name);

101–175 QUICK FIRE INTERVIEW QUESTIONS

  1. What is View?
  2. Simple View vs Materialized View?
  3. What is Stored Procedure?
  4. Function vs Procedure?
  5. Trigger Types?
  6. BEFORE Trigger?
  7. AFTER Trigger?
  8. INSTEAD OF Trigger?
  9. What is Cursor?
  10. Cursor drawbacks?
  11. Implicit Cursor?
  12. Explicit Cursor?
  13. Dynamic SQL?
  14. Static SQL?
  15. Temporary Tables?
  16. Global Temp Table?
  17. Local Temp Table?
  18. Table Variables?
  19. What is Schema?
  20. What is Data Warehouse?
  21. Fact Table?
  22. Dimension Table?
  23. Star Schema?
  24. Snowflake Schema?
  25. Surrogate Key?
  26. Natural Key?
  27. Slowly Changing Dimension?
  28. SCD Type 1?
  29. SCD Type 2?
  30. SCD Type 3?
  31. What is ETL?
  32. ETL vs ELT?
  33. What is Partitioning?
  34. Horizontal Partitioning?
  35. Vertical Partitioning?
  36. Sharding?
  37. Query Optimizer?
  38. Execution Plan?
  39. Cardinality?
  40. Statistics?
  41. Parameter Sniffing?
  42. Query Hints?
  43. Explain Analyze?
  44. Materialized View?
  45. Index Rebuild?
  46. Index Reorganize?
  47. Heap Table?
  48. Clustered Key Lookup?
  49. Bookmark Lookup?
  50. Covering Index?
  51. Snapshot Isolation?
  52. Read Committed?
  53. Read Uncommitted?
  54. Repeatable Read?
  55. Serializable?
  56. Phantom Reads?
  57. Dirty Reads?
  58. Non-repeatable Reads?
  59. Database Replication?
  60. Log Shipping?
  61. AlwaysOn?
  62. CDC (Change Data Capture)?
  63. Change Tracking?
  64. Audit Table?
  65. Database Migration?
  66. SQL Injection?
  67. Prevention of SQL Injection?
  68. Prepared Statements?
  69. Encryption at Rest?
  70. Encryption in Transit?
  71. Transparent Data Encryption?
  72. Row-Level Security?
  73. Dynamic Data Masking?
  74. Database Backup Types?
  75. Full vs Incremental Backup?

Top 25 SQL Questions Asked Most Frequently

  1. Difference between DELETE, TRUNCATE, DROP
  2. Primary Key vs Unique Key
  3. Clustered vs Non-Clustered Index
  4. INNER vs LEFT JOIN
  5. EXISTS vs IN
  6. CTE vs Subquery
  7. View vs Materialized View
  8. Normalization vs Denormalization
  9. ACID Properties
  10. Deadlock
  11. Window Functions
  12. ROW_NUMBER vs RANK vs DENSE_RANK
  13. Find 2nd Highest Salary
  14. Find Duplicate Records
  15. Find Top N Records
  16. Correlated Subquery
  17. Transactions
  18. Isolation Levels
  19. Indexing Strategy
  20. Execution Plans
  21. Partitioning
  22. Stored Procedure vs Function
  23. Triggers
  24. SQL Injection Prevention
  25. Data Warehouse Modeling

For senior-level Data Engineer, Cloud Architect, and Database Architect interviews (10–15 years experience), candidates are often expected to answer 250+ advanced SQL questions, including query optimization, indexing internals, execution plans, partitioning, data warehousing, CDC, distributed SQL, and real-world scenario-based problems.

Another Set of Questions

This covers foundational to advanced SQL topics for interviews (freshers to experienced roles). Questions are categorized for easier navigation. Answers include explanations, examples (using standard SQL with common dialect notes), and key points.

1. Basics & Fundamentals (1-30)

1. What is SQL? SQL (Structured Query Language) is the standard language for managing and querying relational databases. It handles DDL (defining structures), DML (manipulating data), DQL (querying), and more.

2. What is an RDBMS? How does it differ from DBMS? RDBMS (Relational Database Management System) stores data in tables with relations via keys and follows ACID properties. DBMS is more general and may not enforce relational models or normalization. Examples: MySQL, PostgreSQL (RDBMS); some older file-based systems (DBMS).

3. What are the main subsets of SQL?

  • DDL: CREATE, ALTER, DROP, TRUNCATE.
  • DML: INSERT, UPDATE, DELETE, SELECT (sometimes grouped under DQL).
  • DCL: GRANT, REVOKE.
  • TCL: COMMIT, ROLLBACK, SAVEPOINT.

4. What is the difference between SQL and MySQL? SQL is the language/standard. MySQL is a specific RDBMS implementing SQL.

5. What is a Database? An organized collection of data, typically in tables (rows/records and columns/fields).

6. What is a Table? A collection of related data in rows and columns.

7. What is a Query? A request to retrieve, insert, update, or delete data (e.g., SELECT statement).

8. What is the SELECT statement? Retrieves data: SELECT column1, column2 FROM table_name WHERE condition;

9. What is an SQL Alias? Temporary name for a table/column: SELECT first_name AS fname FROM employees;

10. What are common clauses with SELECT? WHERE, GROUP BY, HAVING, ORDER BY, LIMIT/OFFSET.

11-20: Data Types

  • Difference between CHAR and VARCHAR: CHAR is fixed-length (pads spaces); VARCHAR is variable-length.
  • INT vs BIGINT, DATE vs DATETIME, TEXT vs VARCHAR, etc.

21. What are Constraints? Rules for data integrity: NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, CHECK, DEFAULT.

22. What is a Primary Key? Uniquely identifies each row; UNIQUE + NOT NULL; only one per table (can be composite).

23. What is a Foreign Key? Links to a Primary Key in another table for referential integrity.

24. What is a UNIQUE constraint? Ensures distinct values; multiple allowed per table (NULLs often permitted depending on DBMS).

25. What is a Composite Key? Primary key made of two or more columns.

26. Difference between PRIMARY KEY and UNIQUE KEY. PRIMARY KEY implies NOT NULL and only one per table; UNIQUE allows NULLs (in most DBMS) and multiples.

27. What is Normalization? Process to reduce redundancy: 1NF (atomic values), 2NF (no partial dependency), 3NF (no transitive dependency), BCNF, etc.

28. What is Denormalization? Intentionally adding redundancy for read performance (e.g., reporting systems).

29. What is Data Integrity? Accuracy, consistency, and reliability of data over its lifecycle.

30. What are OLTP vs OLAP? OLTP: Transaction processing (many short queries, normalized). OLAP: Analytical processing (complex aggregates, often denormalized/data warehouse).

2. Joins & Set Operations (31-55)

31. What are SQL Joins? Combine rows from two+ tables based on related columns.

32-36. Types of Joins & Differences:

  • INNER JOIN: Matching rows only.
  • LEFT (OUTER) JOIN: All from left + matches from right (NULLs for non-matches).
  • RIGHT JOIN: Mirror of LEFT.
  • FULL OUTER JOIN: All from both (NULLs where no match).
  • CROSS JOIN: Cartesian product.
  • SELF JOIN: Join table to itself (e.g., employee-manager).

37. What is a Cross Join? Cartesian product of two tables.

38. What is an Anti-Join? Rows in one table without matches in another (LEFT JOIN … WHERE right.key IS NULL or NOT EXISTS).

39. Difference between UNION and UNION ALL. UNION removes duplicates (slower); UNION ALL keeps them (faster).

40-42. INTERSECT (common rows), EXCEPT/MINUS (rows in first not in second).

43-55. Practical join questions: Write queries for employee-department, sales by category, etc.

3. Subqueries, CTEs & Advanced Querying (56-90)

56. What is a Subquery? Query nested inside another. Types: scalar, row, table, correlated.

57. Correlated vs Non-correlated Subquery. Correlated references outer query (executed per row, slower); non-correlated is independent.

58. What is a CTE (Common Table Expression)? Temporary named result set using WITH; improves readability, supports recursion.

59. Difference between Subquery and CTE. CTEs are more readable, reusable in the same query, and support recursion.

60. Write a query for second highest salary. SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees); or using RANK()/DENSE_RANK().

61-70. Questions on finding nth highest, duplicates, employees above dept average, etc.

71. What is EXISTS vs IN? EXISTS checks for row existence (good for large sets); IN compares values. Prefer EXISTS with NOT for anti-joins.

72-80. Window functions: ROW_NUMBER(), RANK(), DENSE_RANK(), LAG(), LEAD(), SUM() OVER(), etc.

81. Running Total Example: SUM(amount) OVER (PARTITION BY product ORDER BY date ROWS UNBOUNDED PRECEDING)

82-90. Recursive CTEs for hierarchies (org charts), NTILE, PERCENT_RANK, etc.

4. Aggregates, Grouping & Functions (91-120)

91. What is GROUP BY? Groups rows with same values for aggregates.

92. Difference between WHERE and HAVING. WHERE filters before grouping; HAVING after (for aggregates).

93-100. Aggregate functions: COUNT, SUM, AVG, MIN, MAX. Handling NULLs.

101-110. String functions: CONCAT, SUBSTRING, LIKE/ILIKE, REGEXP. Date functions: DATEADD, DATEDIFF, EXTRACT. Numeric: ROUND, CEIL, FLOOR.

111. Pattern Matching: LIKE with % and _ .

112-120. CASE statements, COALESCE, NULLIF.

5. Indexes, Performance & Optimization (121-150)

121. What is an Index? Data structure for faster lookups (like book index).

122. Clustered vs Non-Clustered Index. Clustered: Dictates physical row order (one per table). Non-clustered: Separate structure (multiple allowed).

123. How to optimize a slow query? Use EXPLAIN/execution plan, add indexes on WHERE/JOIN/ORDER BY, avoid SELECT *, filter early, update stats, etc.

124-130. Covering indexes, composite indexes, when indexes hurt (writes), full table scans.

131. Difference between DELETE, TRUNCATE, DROP. DELETE (DML, logged, WHERE possible); TRUNCATE (DDL, fast, no WHERE); DROP (removes object).

132-140. Partitioning, materialized views, query rewriting, avoiding functions on columns.

141-150. Performance tuning: statistics, parallelism, connection pooling, schema design impact.

6. Transactions, Security & Advanced Concepts (151-175+)

151. What is a Transaction? Unit of work following ACID (Atomicity, Consistency, Isolation, Durability). Commands: COMMIT, ROLLBACK.

152. What are Locks & Isolation Levels? Prevent concurrency issues (READ UNCOMMITTED to SERIALIZABLE).

153. What is a View? Virtual table from a query (simplifies, security). Materialized views store data.

154. What is a Stored Procedure? Precompiled SQL code for reuse (performance, security).

155. What is a Trigger? Auto-executes on events (INSERT/UPDATE/DELETE).

156. What is a Cursor? Iterates row-by-row (avoid in set-based SQL when possible).

157-165. User-defined functions (scalar/table-valued), collation, SQL injection prevention (parameterized queries).

166. Database design for e-commerce, time-series, etc.

167-175+. Advanced: Recursive queries, pivoting/unpivoting, JSON handling, full-text search, sharding, replication, backup strategies, NoSQL vs SQL trade-offs.

Additional 10+ Scenario Questions:

  • Remove duplicates.
  • Find gaps in sequences.
  • Cumulative sums.
  • Top N per group.
  • Handling NULLs in joins/aggregates.
  • Query for consecutive days/ streaks.
  • Dynamic pivoting.
  • Slowly Changing Dimensions (SCD) in warehouses.
  • Indexing strategy for high-write tables.
  • Query optimization case studies.

Tips for Interviews:

  • Practice on platforms like LeetCode, HackerRank, StrataScratch.
  • Understand execution plans.
  • Know your DBMS specifics (PostgreSQL window functions, SQL Server specifics, etc.).
  • Explain trade-offs (e.g., normalization vs performance).
  • Write clean, readable code with CTEs.

This list exceeds 175 when including variations and practical coding problems. Focus on understanding why rather than memorizing syntax. For hands-on practice, use sample schemas (employees, departments, sales, orders). Good luck!

More Set of Questions

This is a comprehensive guide to 175+ SQL interview questions, categorized by difficulty and topic, with detailed answers. Focus on highest-priority (frequency in real interviews: FAANG, product-based, and data roles).


PART 1: BASIC SQL (Warm-up & Screening) — 40 Questions

1. What is SQL?

Answer: Structured Query Language – used to communicate with relational databases (RDBMS).

2. Difference between SQL and MySQL?

SQLMySQL
LanguageRDBMS software
StandardizedSpecific implementation

3. What are the 5 sublanguages of SQL?

  • DDL (Data Definition) – CREATE, ALTER, DROP, TRUNCATE
  • DML (Data Manipulation) – SELECT, INSERT, UPDATE, DELETE
  • DCL (Data Control) – GRANT, REVOKE
  • TCL (Transaction Control) – COMMIT, ROLLBACK, SAVEPOINT
  • DQL (Data Query) – SELECT

4. Difference between DELETE and TRUNCATE

DELETETRUNCATE
DMLDDL
WHERE clause allowedNo WHERE
slower (row-by-row)faster (deallocates pages)
can rollbackcan rollback in some DBs (but not all)
triggers firetriggers don’t fire

5. Difference between DROP and TRUNCATE

DROP removes table structure; TRUNCATE removes data only.

6. What is a Primary Key?

  • Unique, not null, only one per table, can be composite.

7. What is a Foreign Key?

  • Links two tables; enforces referential integrity.

8. What is a Composite Key?

  • Primary key using two or more columns.

9. Difference between UNIQUE and PRIMARY KEY

UNIQUEPRIMARY KEY
One NULL allowedNo NULL
Multiple per tableOne per table

10. What is a NULL value?

  • Not zero, not blank – unknown or missing.

11. Difference between WHERE and HAVING

WHEREHAVING
Before GROUP BYAfter GROUP BY
Cannot use aggregateCan use aggregate

12. What are aggregate functions?

  • COUNTSUMAVGMINMAX

13. What is GROUP BY?

  • Groups rows with same values; used with aggregates.

14. What is ORDER BY?

  • Sorts result – ASC (default) or DESC.

15. Difference between INNER JOIN and OUTER JOIN

Inner – matching rows only
Outer – matching + non-matching (LEFT/RIGHT/FULL)

16. What is LEFT JOIN?

  • All rows from left table + matched from right; NULL if no match.

17. What is SELF JOIN?

  • Joining a table with itself using aliases.

18. What is a CROSS JOIN?

  • Cartesian product (every row × every row).

19. What is a UNION vs UNION ALL?

  • UNION removes duplicates; UNION ALL includes all.

20. What is a VIEW?

  • Virtual table (stored query). Not physically stored.

21. Can you update a view?

  • Yes, if it’s based on a single table and doesn’t use aggregates.

22. What is an index?

  • Improves SELECT speed; slows INSERT/UPDATE/DELETE.

23. Clustered vs Non-clustered index

  • Clustered – determines physical order (only one).
  • Non-clustered – separate structure with pointers.

24. What is a JOIN?

  • Combines columns from two+ tables based on related column.

25. What is DISTINCT?

  • Removes duplicate rows in result.

26. What are wildcards in LIKE?

  • % – any number of chars
  • _ – exactly one char

27. What is IN operator?

  • Shorthand for multiple OR conditions.

28. What is BETWEEN?

  • Inclusive range (numeric, text, dates).

29. What is CASE statement?

  • Conditional logic in SQL (like if-else).

Example:

sql

SELECT name,
  CASE WHEN salary > 100000 THEN 'High' ELSE 'Low' END
FROM employees;

30. What is COALESCE?

  • Returns first non-NULL value.

31. What is NULLIF?

  • Returns NULL if two expressions equal, else first expression.

32. Difference between CHAR and VARCHAR

CHAR(n)VARCHAR(n)
Fixed lengthVariable length
Faster for fixed-sizeSaves space

33. What is a TEMPORARY TABLE?

  • Exists only for session; auto-dropped.

34. What is a STORED PROCEDURE?

  • Precompiled SQL code; can have input/output parameters.

35. What is a FUNCTION vs PROCEDURE?

FunctionProcedure
Returns one valueMay return none or multiple
Cannot have output paramsCan have
Called in SELECTCalled with EXEC/CALL

36. What is AUTO_INCREMENT / IDENTITY?

  • Automatically generates sequential numeric values.

37. What is a transaction?

  • Set of SQL statements executed as a unit (ACID).

38. What are ACID properties?

  • Atomicity – all or nothing
  • Consistency – valid state before/after
  • Isolation – concurrent transactions don’t interfere
  • Durability – committed changes persist

39. What are SQL constraints?

  • NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, CHECK, DEFAULT.

40. Difference between HAVING and WHERE (detailed)

WHERE filters rows before aggregation; HAVING filters after.


PART 2: INTERMEDIATE SQL (Data Rounds) — 50 Questions

41. Write a query to find duplicate rows in a table.

sql

SELECT col1, col2, COUNT(*)
FROM table
GROUP BY col1, col2
HAVING COUNT(*) > 1;

42. Delete duplicate rows keeping one occurrence.

sql

DELETE FROM table
WHERE id NOT IN (
  SELECT MIN(id)
  FROM table
  GROUP BY col1, col2
);

43. Find second highest salary.

sql

SELECT MAX(salary) FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

Or using OFFSET:

sql

SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;

44. Find nth highest salary.

sql

SELECT DISTINCT salary
FROM employees e1
WHERE N-1 = (
  SELECT COUNT(DISTINCT salary)
  FROM employees e2
  WHERE e2.salary > e1.salary
);

45. Find employees earning more than their manager.

sql

SELECT e.name
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;

46. Running total using window function.

sql

SELECT date, amount,
  SUM(amount) OVER (ORDER BY date) AS running_total
FROM sales;

47. Moving average (3-day).

sql

SELECT date, amount,
  AVG(amount) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM sales;

48. Rank employees by salary per department.

sql

SELECT name, dept_id, salary,
  RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rnk
FROM employees;

Difference: RANK (gaps), DENSE_RANK (no gaps), ROW_NUMBER (unique).

49. Lead/Lag – previous & next salary.

sql

SELECT name, salary,
  LAG(salary, 1) OVER (ORDER BY hire_date) AS prev_salary,
  LEAD(salary, 1) OVER (ORDER BY hire_date) AS next_salary
FROM employees;

50. Find employees hired in last 30 days.

sql

SELECT * FROM employees
WHERE hire_date >= CURRENT_DATE - INTERVAL '30 days';

51. Pivot table (row to column).

sql

SELECT product_id,
  SUM(CASE WHEN month='Jan' THEN sales END) AS Jan_sales,
  SUM(CASE WHEN month='Feb' THEN sales END) AS Feb_sales
FROM sales_data
GROUP BY product_id;

52. Unpivot using UNION ALL.

sql

SELECT product_id, 'Jan' AS month, Jan_sales AS sales
FROM pivot_table
UNION ALL
SELECT product_id, 'Feb', Feb_sales FROM pivot_table;

53. Recursive CTE to get hierarchy (org chart).

sql

WITH RECURSIVE org AS (
  SELECT id, name, manager_id, 1 AS level
  FROM employees WHERE manager_id IS NULL
  UNION ALL
  SELECT e.id, e.name, e.manager_id, o.level+1
  FROM employees e
  JOIN org o ON e.manager_id = o.id
)
SELECT * FROM org;

54. Find missing IDs in sequence.

sql

SELECT t1.id + 1 AS missing_start
FROM table t1
LEFT JOIN table t2 ON t1.id + 1 = t2.id
WHERE t2.id IS NULL;

55. String concatenation per group.

  • MySQL: GROUP_CONCAT(name)
  • PostgreSQL: STRING_AGG(name, ',')
  • SQL Server: STRING_AGG(name, ',')

56. Update using join.

sql

UPDATE orders
SET status = 'Shipped'
FROM shipments
WHERE orders.id = shipments.order_id;

57. Delete using join (MySQL/PostgreSQL).

sql

DELETE o
FROM orders o
JOIN old_orders oo ON o.id = oo.id;

58. Merge / Upsert (INSERT OR UPDATE).

  • MySQL: INSERT ... ON DUPLICATE KEY UPDATE
  • PostgreSQL: INSERT ... ON CONFLICT DO UPDATE

59. Find first and last order per customer.

sql

SELECT DISTINCT customer_id,
  FIRST_VALUE(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS first,
  LAST_VALUE(order_date) OVER (PARTITION BY customer_id ORDER BY order_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last
FROM orders;

60. Difference between ROW_NUMBER()RANK()DENSE_RANK().

Example:

SalaryROW_NUMBERRANKDENSE_RANK
100111
100211
90332

61. What is a CTE (Common Table Expression)?

  • Temporary named result set; improves readability.

62. CTE vs Subquery.

CTESubquery
Reusable in same queryNot reusable
Recursive possibleNo recursion
Readable for complex queriesCan be messy

63. What is a Window Function?

  • Performs calculation across rows related to current row without collapsing them.

64. Difference between WHERE and QUALIFY (Snowflake etc.)

  • QUALIFY filters after window functions; WHERE cannot.

65. Find median salary.

sql

SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) FROM employees;

Or manual:

sql

SELECT AVG(salary) FROM (
  SELECT salary, ROW_NUMBER() OVER (ORDER BY salary) AS rn,
         COUNT(*) OVER () AS total
  FROM employees
) t WHERE rn IN (FLOOR((total+1)/2), CEIL((total+1)/2));

66. Mode (most frequent salary).

sql

SELECT salary FROM employees
GROUP BY salary
ORDER BY COUNT(*) DESC
LIMIT 1;

67. What is a covering index?

  • Index that includes all columns needed for query (no table access).

68. Composite index order importance?

  • Leftmost prefix rule: index on (A,B,C) supports A, A+B, A+B+C, but not B or C alone.

69. Write query for “employees who have same salary”.

sql

SELECT e1.name, e2.name, e1.salary
FROM employees e1
JOIN employees e2 ON e1.salary = e2.salary AND e1.id < e2.id;

70. What is EXISTS vs IN?

  • EXISTS stops on first match (faster for large subqueries).
  • IN evaluates all.

71. What is ANY / ALL?

sql

SELECT name FROM employees WHERE salary > ANY (SELECT salary FROM managers);

72. What is INTERSECT?

  • Returns common rows from two queries (removes duplicates).

73. What is EXCEPT / MINUS?

  • Rows in first query not in second.

74. Case when multiple conditions.

sql

CASE WHEN score >= 90 THEN 'A'
     WHEN score >= 80 THEN 'B'
     ELSE 'C'
END

75. What is ISNULL() vs NULLIF().

  • ISNULL(col,0) – replace NULL with 0.
  • NULLIF(a,b) – returns NULL if a=b.

76. Generate date series.

  • PostgreSQL: generate_series('2024-01-01', '2024-12-31', '1 day'::interval)

77. Split comma-separated string into rows.

sql

SELECT unnest(string_to_array('a,b,c', ','));  -- PostgreSQL

78. Convert rows to JSON.

  • PostgreSQL: json_agg()row_to_json()

79. Explain query plan.

  • EXPLAIN – shows execution strategy; EXPLAIN ANALYZE – runs + shows actual.

80. What is a bitmap index?

  • Used in data warehouses; efficient for low-cardinality columns.

81. What is table partitioning?

  • Splits large table into smaller physical pieces (range, list, hash).

82. What is sharding?

  • Horizontal partitioning across different servers.

83. What is materialized view?

  • Physical copy of query result (refreshed manually/scheduled).

84. What is a database deadlock?

  • Two transactions each waiting for the other to release a lock.

85. What is isolation level?

  • Read Uncommitted, Read Committed, Repeatable Read, Serializable.

86. What is a dirty read?

  • Reading uncommitted changes from another transaction.

87. What is a non-repeatable read?

  • Same query returns different results due to updates.

88. What is a phantom read?

  • New rows appear matching WHERE condition.

89. What is WITH (NOLOCK)? (SQL Server)

  • Allows dirty reads (less blocking).

90. What is an execution plan operator (e.g., nested loop, hash join)?

  • Nested loop – small tables, index present.
  • Hash join – large, unsorted tables.
  • Merge join – sorted inputs.

PART 3: ADVANCED SQL (Hard/Tricky) — 45 Questions

91. Find gaps and islands (consecutive days).

Islands example:

sql

WITH grp AS (
  SELECT order_date,
         order_date - ROW_NUMBER() OVER (ORDER BY order_date)::int AS grp
  FROM orders
)
SELECT MIN(order_date), MAX(order_date), COUNT(*)
FROM grp
GROUP BY grp;

92. Sessionization (user sessions from clicks).

sql

WITH session_start AS (
  SELECT user_id, click_time,
    CASE WHEN click_time - LAG(click_time) OVER (PARTITION BY user_id ORDER BY click_time) > interval '30 minutes'
         THEN 1 ELSE 0 END AS new_session
  FROM clicks
)
SELECT user_id, session_id, MIN(click_time), MAX(click_time)
FROM (
  SELECT *, SUM(new_session) OVER (PARTITION BY user_id ORDER BY click_time) AS session_id
  FROM session_start
) s
GROUP BY user_id, session_id;

93. Rolling retention (cohort analysis).

sql

WITH cohorts AS (
  SELECT user_id, MIN(order_date) AS first_order
  FROM orders GROUP BY user_id
)
SELECT EXTRACT(MONTH FROM c.first_order) AS cohort,
       EXTRACT(MONTH FROM o.order_date) - EXTRACT(MONTH FROM c.first_order) AS month_offset,
       COUNT(DISTINCT o.user_id) / MAX(COUNT(DISTINCT o.user_id)) OVER (PARTITION BY c.first_order) AS retention
FROM orders o
JOIN cohorts c ON o.user_id = c.user_id
GROUP BY 1,2;

94. Self-join to find overlapping date ranges.

sql

SELECT a.emp_id, a.start_date, a.end_date, b.start_date, b.end_date
FROM assignments a
JOIN assignments b ON a.emp_id = b.emp_id
WHERE a.id < b.id
  AND a.start_date <= b.end_date
  AND a.end_date >= b.start_date;

95. Recursive CTE to generate Fibonacci.

sql

WITH RECURSIVE fib(a,b,n) AS (
  SELECT 0::bigint,1::bigint,1
  UNION ALL
  SELECT b, a+b, n+1 FROM fib WHERE n < 20
)
SELECT a FROM fib;

96. What is a Window Frame?

  • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  • RANGE BETWEEN – based on value, not row position.

97. Rank with ties skipping next rank?

Use RANK().

98. Cumulative distribution CUME_DIST().

  • Relative rank (0 to 1).

99. Percent rank PERCENT_RANK().

  • (RANK - 1) / (total rows - 1)

100. Conditional aggregation with filter clause (PostgreSQL).

sql

SELECT dept,
       COUNT(*) FILTER (WHERE salary > 50000) AS high_earners
FROM employees
GROUP BY dept;

101. String manipulation – extract domain from email.

sql

SELECT SUBSTRING(email FROM '@(.*)$') FROM users;

102. Split name into first/last.

sql

SELECT SPLIT_PART(full_name, ' ', 1) AS first,
       SPLIT_PART(full_name, ' ', 2) AS last
FROM names;

103. Parsing JSON in SQL.

  • PostgreSQL: data->>'key'jsonb_array_elements()

104. What is a partial index?

sql

CREATE INDEX active_users ON users(id) WHERE status = 'active';

105. Function vs stored procedure performance.

Functions are cached; procedures can manage transactions.

106. Dynamic SQL in stored procedure.

sql

EXECUTE IMMEDIATE 'SELECT * FROM ' || table_name;

107. Temporary table vs CTE performance.

Temp table can have indexes; CTE is in-memory (not always).

108. Database normalization up to 3NF.

  • 1NF: atomic values
  • 2NF: no partial dependency on composite key
  • 3NF: no transitive dependency

109. Denormalization use cases.

Reporting, data warehouses, frequent joins.

110. What is a FULL OUTER JOIN example?

sql

SELECT * FROM customers c
FULL OUTER JOIN orders o ON c.id = o.cust_id;
-- includes customers without orders + orders without customers

111. What is a NATURAL JOIN?

Auto-joins on same column names (dangerous, avoid).

112. What is USING clause in join?

sql

SELECT * FROM employees JOIN departments USING (dept_id);

113. What are window function framing options?

ROWSRANGEGROUPS (SQL:2016).

114. What is IGNORE NULLS in LEAD/LAG?

Some DBs support: LAG(salary IGNORE NULLS)

115. What is SELECT FOR UPDATE?

Locks selected rows for update.

116. What is a SAVEPOINT?

sql

BEGIN;
SAVEPOINT sp1;
DELETE FROM orders;
ROLLBACK TO sp1;
COMMIT;

117. What is READ COMMITTED SNAPSHOT? (SQL Server)

Row-versioning to avoid blocking.

118. What is pg_stat_activity?

PostgreSQL view for current queries.

119. What is an execution time vs parse time?

Parse time builds plan; execution time runs it.

120. What is query optimization heuristic?

Rule-based vs cost-based optimizer.

121. What is a statistics histogram?

Used by optimizer to estimate row counts.

122. What is a cross-database query?

sql

SELECT * FROM db1.schema.table AS t1
JOIN db2.schema.table AS t2 ...

123. Export query results to CSV (PostgreSQL).

sql

\copy (SELECT * FROM users) TO 'users.csv' CSV HEADER;

124. What is pg_dump / mysqldump?

Backup utilities.

125. What is CDC (Change Data Capture)?

Capturing INSERT/UPDATE/DELETE for replication/analytics.

126. What is Slowly Changing Dimension (SCD) Type 2?

Track history with start/end date and current flag.

127. Write SCD Type 2 merge.

sql

UPDATE dim SET end_date = CURRENT_DATE, is_current = FALSE
FROM stage WHERE dim.id = stage.id AND dim.is_current = TRUE AND dim.data <> stage.data;

INSERT INTO dim (id, data, start_date, end_date, is_current)
SELECT id, data, CURRENT_DATE, '9999-12-31', TRUE FROM stage
ON CONFLICT ...;

128. What is a surrogate key vs natural key?

  • Surrogate: artificial (auto-increment)
  • Natural: business column (SSN)

129. What is an UUID vs serial?

UUID – globally unique, larger storage, slower.

130. What is a covering index with INCLUDE?

sql

CREATE INDEX idx ON table (col1) INCLUDE (col2, col3);

131. What is index skip scan?

Used if leading column has low cardinality but not in WHERE.

132. Write a query for “customers who bought all products”.

sql

SELECT customer_id
FROM orders
GROUP BY customer_id
HAVING COUNT(DISTINCT product_id) = (SELECT COUNT(*) FROM products);

133. Write a query for “top 3 products per category”.

sql

WITH ranked AS (
  SELECT product, category, sales,
         ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS rn
  FROM products
)
SELECT * FROM ranked WHERE rn <= 3;

134. Write query for “cumulative sum reset on negative”.

Requires recursion or custom aggregation.

135. Write query to find “students who have same marks in all subjects”.

sql

SELECT student_id
FROM scores
GROUP BY student_id
HAVING MIN(marks) = MAX(marks);

136. Write query to find “employees with no manager”.

sql

SELECT * FROM employees WHERE manager_id IS NULL;

137. Difference between CROSS APPLY and JOIN (SQL Server).

CROSS APPLY – like join with a correlated subquery; can call table-valued function.

138. What is OUTER APPLY?

Like LEFT JOIN for table-valued functions.

139. What is PIVOT in SQL Server?

sql

SELECT * FROM sales
PIVOT (SUM(amount) FOR month IN ([Jan],[Feb])) AS p;

140. What is UNPIVOT?

sql

SELECT product, month, sales
FROM pivot_table
UNPIVOT (sales FOR month IN (Jan, Feb)) AS u;

141. What are database hints?

Force specific join type or index usage (avoid unless expert).

142. What is OPTION (RECOMPILE)?

Forces recompilation (parameter sniffing fix).

143. What is an OUTPUT clause? (SQL Server)

Returns affected rows.

144. What is a MERGE statement?

Upsert in single statement (INSERT/UPDATE/DELETE).

145. What are sequences?

Generate numbers independent of table (vs identity).

146. What is NEXTVAL and CURRVAL?

Sequence functions.

147. What is SYSDATE vs CURRENT_DATE?

  • SYSDATE – includes time (Oracle)
  • CURRENT_DATE – ANSI SQL standard

148. What is an INTERVAL data type?

Stores duration (e.g., INTERVAL '1 day')

149. What is a common pitfall of OR in WHERE?

Prevents index usage; use UNION or IN instead.

150. Write query to “swap gender values”.

sql

UPDATE users SET gender = CASE WHEN gender='M' THEN 'F' ELSE 'M' END;

151. Write query to “remove spaces from column”.

sql

UPDATE table SET col = REPLACE(col, ' ', '');

152. What is REGEXP_SUBSTR?

Extract pattern (e.g., phone number).

153. What is TRANSLATE vs REPLACE?

TRANSLATE – char-by-char mapping; REPLACE – substring.

154. What is SOUNDEX?

Phonetic matching.

155. Difference between IN and EXISTS with example.

sql

-- IN (evaluates all)
SELECT name FROM cust WHERE id IN (SELECT cust_id FROM orders)

-- EXISTS (short-circuits)
SELECT name FROM cust WHERE EXISTS (SELECT 1 FROM orders WHERE cust_id = cust.id)

156. What is GREATEST / LEAST?

Returns max/min of list (not aggregate).

157. What is WITH CHECK OPTION in views?

Prevents updates that would exclude row from view.

158. What is an updatable view?

Single table, no aggregates, no DISTINCT, no GROUP BY.

159. What is a GENERATED ALWAYS column?

Computed column (virtual or stored).

160. What is ON DELETE CASCADE?

Deletes child rows when parent deleted.

161. What is ON DELETE SET NULL?

Sets foreign key to NULL.

162. What is deferrable constraint?

Checked at commit, not immediately.

163. What is a CHECK constraint example?

sql

ALTER TABLE employees ADD CONSTRAINT check_salary CHECK (salary > 0);

164. What is a DEFAULT constraint?

sql

created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP

165. What is an exclusion constraint? (PostgreSQL)

Prevents overlapping ranges.

166. What is a PARTITION BY for ROW_NUMBER()?

Groups before numbering.

167. Difference between LAG and LEAD.

LAG – previous row; LEAD – next row.

168. What is a RANGE window frame?

Based on value range, not row count.

169. What is FILTER clause for aggregates? (PostgreSQL)

Conditional aggregation.

170. What is WITH ORDINALITY?

Adds row number to set-returning functions.

171. What is a lateral join? (PostgreSQL)

Allows subquery to reference previous from clause.

172. What is jsonb vs json (PostgreSQL)?

jsonb – binary, faster indexing, no duplicate keys.

173. What is an ARRAY type in SQL?

PostgreSQL: INT[]ARRAY_AGG.

174. What is unnest()?

Expands array to rows.

175. What is a polymorphic table function?

Returns different schema based on parameters (advanced).


PART 4: SCENARIO-BASED HIGHEST PRIORITY — 10 Questions

176. Design a schema for Instagram.

Answer: Users, Posts, Followers, Likes, Comments, Stories, Messages.

177. Write query to find followers of followers (recommendations).

sql

SELECT DISTINCT f2.follower_id
FROM follows f1
JOIN follows f2 ON f1.follower_id = f2.followee_id
WHERE f1.followee_id = ?
  AND f2.follower_id NOT IN (
    SELECT follower_id FROM follows WHERE followee_id = ?
  );

178. Optimize a slow query.

Steps: EXPLAIN ANALYZE, missing index, wrong join order, statistics outdated, query rewrite.

179. Find “active users” who posted at least 1 post each week for 4 weeks.

sql

WITH weekly_posts AS (
  SELECT user_id, DATE_TRUNC('week', post_date) AS week,
         COUNT(*) AS posts
  FROM posts
  WHERE post_date >= CURRENT_DATE - INTERVAL '4 weeks'
  GROUP BY user_id, week
  HAVING COUNT(*) >= 1
)
SELECT user_id FROM weekly_posts
GROUP BY user_id
HAVING COUNT(DISTINCT week) = 4;

180. Handle billion-row table with no index.

Use partitioning, summary tables, map-reduce, or columnar storage (Parquet, Redshift).


Final Tips for SQL Interview

  • Write readable SQL – use CTEs, indentation, aliases.
  • Know window functions thoroughly – 90% of advanced questions.
  • Practice on real platforms – LeetCode (Database section), HackerRank, StrataScratch.
  • Know your DBMS quirks – MySQL, PostgreSQL, SQL Server differ.
  • Ask clarifying questions – joins, NULLs, duplicates handling.

🤞 Sign up for our newsletter!

We don’t spam! Read more in our privacy policy

Scroll to Top