Microsoft Fabric is a unified, end-to-end SaaS analytics platform from Microsoft that integrates data integration, engineering, warehousing, science, real-time analytics, and business intelligence (Power BI) on a shared foundation called OneLake. It eliminates data silos, reduces tool fragmentation, and enables seamless collaboration across roles.

Core Concepts and Architecture

Q: What is Microsoft Fabric? A: Microsoft Fabric is a SaaS analytics platform unifying data workloads (ingestion, transformation, analytics, ML, real-time, and reporting) in one environment. It uses OneLake as a single logical data lake (built on ADLS Gen2) for zero-copy data sharing across workloads, with shared governance, security, and compute. Key workloads include Data Factory, Data Engineering (Lakehouse), Data Warehouse, Data Science, Real-Time Intelligence, Power BI, and Databases.

Q: What is OneLake? How does it differ from Azure Data Lake Gen2 (ADLS Gen2)? A: OneLake is Fabric’s tenant-wide unified data lake, acting as “OneDrive for data.” It provides a single namespace, automatic provisioning, and seamless integration across all Fabric items without manual setup. Data is stored in open formats like Delta Parquet. Unlike ADLS Gen2, OneLake is SaaS-managed (no need to handle storage accounts, RBAC at the storage level, or infrastructure), supports shortcuts for zero-copy external data access, and enforces unified governance/metadata.

Q: Explain Lakehouse vs. Warehouse in Fabric. A:

Lakehouse: Combines data lake flexibility (files, unstructured/semi-structured data, Spark processing) with warehouse reliability (ACID via Delta Lake, SQL endpoint). Ideal for data engineering/science with notebooks, Spark jobs, and open formats.
Warehouse: SQL-first, high-performance for structured analytics (T-SQL). Separates compute/storage, optimized for BI/warehouse workloads. Both sit on OneLake and support Direct Lake.

Q: What is Direct Lake mode? A: A Power BI semantic model storage mode (Fabric-only) that queries Delta tables in OneLake directly with import-like performance (in-memory columnar) but near-real-time data without full imports or frequent refreshes. It transcodes Parquet on-the-fly and uses V-Order for optimization. Great for large datasets; supports Direct Lake on OneLake (multi-item) or SQL endpoints.

Q: What are Shortcuts in OneLake? A: Virtual references to data in other Fabric items, ADLS Gen2, S3, etc., without copying. They enable zero-copy analytics, reduce costs/latency, and support governance policies from the source.

Q: Explain the Medallion Architecture in Fabric. A: A layered data organization:

Bronze (Raw): Ingested data as-is.
Silver (Cleansed/Enriched): Validated, deduplicated, conformed.
Gold (Business-ready): Aggregated, modeled for consumption (e.g., semantic models). Implemented via Lakehouse with Delta tables, pipelines/dataflows for ETL, and Spark/SQL for transformations.

Workloads and Components

Q: What are the main workloads in Microsoft Fabric? A: Data Factory (ingestion/orchestration), Data Engineering (Spark/Lakehouse), Data Warehouse (SQL), Data Science (ML/notebooks), Real-Time Intelligence (KQL/eventstreams), Power BI (visualization), Databases (SQL/Cosmos mirroring). All share OneLake.

Q: What are Pipelines and Dataflows (Gen2) in Fabric? A:

Pipelines (Data Factory): Orchestration workflows (like Azure Data Factory) for scheduling, copying, and transforming data. Support low-code and activities like notebooks.
Dataflows Gen2: Low-code Power Query-based ETL for reusable transformations, landing data in Lakehouse/Warehouse as Delta tables. Faster and more scalable than Gen1.

Q: What notebooks and languages are supported in Fabric? A: Notebooks are interactive environments for code-first development (Data Engineering/Science). They support PySpark (Python), Spark SQL, Scala, and SparkR, and integrate with the Lakehouse for data exploration and ML.

Q: Explain Real-Time Intelligence in Fabric. A: Handles streaming/event data with low latency using Eventstreams, KQL (Kusto Query Language) for querying logs/time-series, and integration with Power BI for live dashboards. Includes Real-Time Hub for cataloging streams.

Security, Governance, and Administration

Q: How does security and governance work in Fabric? A: Built-in Microsoft Purview for lineage, sensitivity labels, auditing. RBAC (roles at workspace/item level), encryption, tenant settings, domains for logical organization, and cross-tenant sharing with policy enforcement. Unified metadata layer.

Q: What is the pricing model? A: Capacity-based. F SKUs can be bought pay-as-you-go or as reserved capacity. Compute is shared across workloads, and consumption is metered in capacity units (CU-seconds) against the purchased capacity. Trial capacities are also available.

Q: Explain Workspaces, Capacities, Domains, and Tenants. A:

Tenant: Org-level Fabric environment.
Capacity: Compute resource (F SKUs) assigned to workspaces.
Workspace: Collaboration boundary (like folders) containing items.
Domains: Logical grouping of workspaces for governance (e.g., by department).

Performance, Optimization, and Advanced Topics

Q: How do you optimize performance in Fabric (Spark jobs, queries, reports)? A: Use partitioning, OPTIMIZE/VACUUM on Delta tables, caching, proper file sizes, V-Order, Direct Lake, aggregate tables, and capacity scaling. Monitor with Fabric tools. For Spark: tune executors, avoid skew.

Q: How does Fabric handle schema drift/evolution and incremental loads? A: Delta Lake supports schema evolution/merges. Use watermarks, CDC, or pipeline conditions for incrementals. Dataflows/pipelines handle drift with dynamic mapping.
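
A minimal sketch of the watermark approach mentioned above, written in plain Python rather than as a Fabric pipeline. The `modified_at` column and the `incremental_batch` helper are hypothetical names chosen for illustration; a real pipeline would persist the watermark between runs.

```python
from datetime import datetime

def incremental_batch(rows, last_watermark):
    """Return rows modified after the watermark, plus the new watermark."""
    new_rows = [r for r in rows if r["modified_at"] > last_watermark]
    # Advance the watermark only if something new arrived.
    new_watermark = max((r["modified_at"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark

source = [
    {"id": 1, "modified_at": datetime(2026, 1, 1, 8, 0)},
    {"id": 2, "modified_at": datetime(2026, 1, 1, 9, 30)},
    {"id": 3, "modified_at": datetime(2026, 1, 1, 11, 15)},
]

batch, watermark = incremental_batch(source, datetime(2026, 1, 1, 9, 0))
print([r["id"] for r in batch], watermark)
```

Running the function again with the returned watermark yields an empty batch, which is what makes the load incremental.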

Q: How does Fabric support CI/CD and collaboration? A: Git integration for version control (notebooks, pipelines, etc.), deployment pipelines for promoting across environments, and workspace sharing.

Q: What is the difference between Fabric and Azure Synapse? A: Fabric is a unified SaaS platform with OneLake and shared everything; Synapse is more PaaS with separate services requiring integration. Fabric simplifies management and reduces duplication.

Scenario-Based and Practical Questions

Q: How would you design a real-time sales dashboard in Fabric? A: Ingest via Eventstreams/Pipelines → Process in Lakehouse/Eventhouse with KQL → Model in semantic layer → Visualize in Power BI with Direct Lake for near-real-time updates. Use shortcuts if needed.

Q: How do you migrate from Power BI or Synapse to Fabric? A: Use shortcuts/mirroring for data, convert datasets to Lakehouse/Warehouse, leverage Direct Lake, and use deployment pipelines. Start with OneLake as the foundation.

Q: Troubleshoot a failed pipeline or slow Spark job. A: Check activity logs, dependencies, capacity, data skew, and errors. For Spark: review Spark UI for stages, optimize code (broadcast joins, etc.). Use Lakehouse maintenance activities (OPTIMIZE/VACUUM).

1. What is Microsoft Fabric?

Answer

Microsoft Fabric is an end-to-end unified analytics platform that combines:

Data Engineering
Data Factory
Data Science
Data Warehouse
Real-Time Analytics
Power BI
Data Activator

into a single SaaS platform.

Fabric is built on OneLake and eliminates the need to manage multiple services separately.

Interview Answer

“Microsoft Fabric is Microsoft’s unified analytics platform that integrates data ingestion, storage, transformation, analytics, AI, and visualization into a single SaaS solution. It uses OneLake as a centralized data lake and supports Data Engineering, Data Factory, Data Science, Data Warehousing, Real-Time Analytics, and Power BI.”

2. Why was Microsoft Fabric introduced?

Answer

Problems with traditional architecture:

Separate services
Data duplication
Complex integrations
Multiple security models
High operational overhead

Fabric solves these by:

Single storage layer
Unified governance
Shared security model
Reduced ETL movement
Integrated AI capabilities

3. What are the core components of Microsoft Fabric?

Answer

Data Factory
Data Engineering
Data Warehouse
Data Science
Real-Time Analytics
Power BI
Data Activator
OneLake

4. What is OneLake?

Answer

OneLake is the central storage layer of Microsoft Fabric.

Similar to:

AWS S3
Azure Data Lake Storage Gen2

Features:

Single copy of data
Organization-wide lake
Open format storage
Delta Lake support

Interview Answer

“OneLake acts as the unified data lake for Fabric. Every workload stores data inside OneLake, eliminating data silos and reducing duplication.”

5. What is the significance of OneLake?

Answer

Benefits:

Single source of truth
Data sharing
Governance
Security
Faster analytics

6. Explain the Medallion Architecture in Fabric

Answer

Layers:

Bronze

Raw Data

Example:

Customer.csv
Sales.json

Silver

Cleaned Data

Example:

Deduplicated customers
Validated transactions

Gold

Business-ready Data

Example:

Sales KPI tables
Revenue Aggregations

Architecture:

Source
  ↓
Bronze
  ↓
Silver
  ↓
Gold
  ↓
Power BI
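
The layer responsibilities can be sketched in plain Python. This is an illustration of the Bronze, Silver, Gold idea only; in Fabric the same steps would normally run in Spark against Delta tables. The sample rows and the `to_silver` / `to_gold` helpers are assumptions made for the example.

```python
from collections import defaultdict

bronze = [  # raw rows, kept as ingested (duplicates and bad values included)
    {"order_id": 1, "region": "East", "amount": "100.0"},
    {"order_id": 1, "region": "East", "amount": "100.0"},  # duplicate
    {"order_id": 2, "region": "West", "amount": "abc"},    # invalid amount
    {"order_id": 3, "region": "West", "amount": "250.5"},
]

def to_silver(rows):
    """Deduplicate on order_id and drop rows whose amount is not numeric."""
    seen, clean = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        clean.append({**row, "amount": amount})
    return clean

def to_gold(rows):
    """Aggregate revenue per region for reporting."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```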

7. What is a Lakehouse?

Answer

A Lakehouse combines:

Data Lake flexibility
Data Warehouse performance

Features:

Structured data
Semi-structured data
Unstructured data

Stored as Delta Tables.

8. Difference Between Data Lake and Lakehouse

Data Lake	Lakehouse
Raw Storage	Managed Tables
Limited SQL	Full SQL
No ACID	ACID Support
Schema-on-read	Schema Enforcement

9. What is a Warehouse in Fabric?

Answer

Fabric Warehouse provides:

SQL analytics
T-SQL support
Star schema support
BI reporting

Use cases:

Enterprise reporting
KPI dashboards
Data marts

10. Lakehouse vs Warehouse

Lakehouse	Warehouse
Spark Engine	SQL Engine
Data Engineering	BI Analytics
Flexible Schema	Structured Schema
Data Science	Reporting

11. What is Delta Lake?

Answer

Delta Lake is an open-source storage layer providing:

ACID transactions
Schema enforcement
Time travel
Data versioning

12. What are Delta Tables?

Answer

Delta tables are Parquet data files plus a transaction log (the _delta_log folder) that records every commit.

Benefits:

Faster reads
Rollbacks
Incremental processing

13. Explain Time Travel

Answer

Allows querying historical data versions.

Example:

SELECT *
FROM sales
VERSION AS OF 5;

Useful for:

Audits
Recovery
Data validation
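
To see why a transaction log makes time travel possible, here is a toy model in plain Python. It is not how Delta Lake is implemented; the `VersionedTable` class is a hypothetical stand-in that only shows the idea of replaying commits up to a chosen version.

```python
class VersionedTable:
    """Toy append-only log: each commit records the rows added and removed."""

    def __init__(self):
        self.log = []  # one entry per committed version

    def commit(self, add=(), remove=()):
        self.log.append({"add": list(add), "remove": list(remove)})
        return len(self.log) - 1  # version number of this commit

    def read(self, version=None):
        """Replay the log up to `version` (latest if omitted)."""
        last = len(self.log) - 1 if version is None else version
        rows = []
        for entry in self.log[: last + 1]:
            rows = [r for r in rows if r not in entry["remove"]]
            rows.extend(entry["add"])
        return rows

sales = VersionedTable()
sales.commit(add=[("A", 10), ("B", 20)])  # version 0
sales.commit(add=[("C", 30)])             # version 1
sales.commit(remove=[("A", 10)])          # version 2

print(sales.read(version=1))  # state before the delete
print(sales.read())           # latest state
```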

14. What is Data Factory in Fabric?

Answer

Data Factory provides:

Pipelines
Dataflows Gen2
Data ingestion
ETL/ELT

Similar to Azure Data Factory.

15. Pipeline vs Dataflow

Pipeline	Dataflow
Orchestration	Transformation
Workflow Control	Power Query
Scheduling	Data Cleaning

16. What are Dataflows Gen2?

Answer

Low-code ETL tool.

Features:

Power Query
Incremental refresh
Data transformation

17. What is Data Engineering in Fabric?

Answer

Uses Spark notebooks for:

ETL
Big Data processing
Data transformation

Languages:

Python
Scala
SQL
Spark SQL

18. What is a Notebook?

Answer

Interactive development environment.

Supports:

PySpark
Spark SQL
Python

Example:

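# Assumes a default Lakehouse is attached to the notebook and the file has a header row
df = spark.read.csv("Files/sales.csv", header=True, inferSchema=True)
df.show()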

19. What is Spark?

Answer

Apache Spark is a distributed processing engine.

Benefits:

In-memory computation
Parallel processing
Big data support

20. What is Spark Pool?

Answer

Collection of Spark compute resources.

Fabric manages Spark automatically.

No infrastructure management required.

21. Explain Fabric Capacity

Answer

Fabric compute is sized in Capacity Units (CU).

Examples:

F2 = 2 CU
F64 = 64 CU

More CU = More compute power.
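
A rough arithmetic sketch of what a capacity provides, assuming the number in an F SKU name equals its CU count (so F64 = 64 CU) and that usage is metered in CU-seconds. The helper names and the 57,600 CU-second workload are made up for illustration.

```python
def cu_seconds_available(sku_cu, hours=1.0):
    """CU-seconds a capacity provides over a window, assuming it runs the whole time."""
    return sku_cu * hours * 3600

def utilization(consumed_cu_seconds, sku_cu, hours=1.0):
    """Fraction of the window's CU-seconds that a workload consumed."""
    return consumed_cu_seconds / cu_seconds_available(sku_cu, hours)

# An assumed F64 capacity over one hour, and a job that used 57,600 CU-seconds.
print(cu_seconds_available(64))
print(utilization(57_600, 64))
```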

22. What is Capacity Throttling?

Answer

Occurs when sustained consumption exceeds the purchased capacity, even after bursting and smoothing are applied.

Symptoms:

Slow reports
Delayed jobs
Query failures

23. How do you optimize Fabric Capacity?

Answer

Use incremental refresh
Optimize SQL
Schedule heavy jobs
Use partitioning
Monitor Capacity Metrics

24. What is Direct Lake?

Answer

Direct Lake allows Power BI to read OneLake data directly.

Benefits:

No import
No full data refresh (only a lightweight metadata sync, called framing)
Near real-time

25. Direct Lake vs Import Mode

Direct Lake	Import
Near real-time	Refresh Required
Faster Updates	Cached
No Data Duplication	Duplicate Storage

26. Direct Lake vs DirectQuery

Direct Lake	DirectQuery
Reads OneLake	Reads Source
Faster	Slower
Fabric Native	External Source

27. Explain Data Activator

Answer

No-code event monitoring tool.

Example:

If:

Sales < $1000

Then:

Send Alert
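
The rule above is a simple condition-action pair. A plain-Python sketch of the same logic follows; it does not use any Data Activator API, and the `evaluate_rule` helper and store names are hypothetical.

```python
def evaluate_rule(events, threshold=1000):
    """Return an alert message for each event whose sales fall below the threshold."""
    return [
        f"ALERT: {e['store']} sales {e['sales']} below {threshold}"
        for e in events
        if e["sales"] < threshold
    ]

events = [
    {"store": "Store-1", "sales": 1500},
    {"store": "Store-2", "sales": 800},
]
alerts = evaluate_rule(events)
print(alerts)
```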

28. What is Real-Time Analytics?

Answer

Analyzes streaming data.

Examples:

IoT
Clickstream
Telemetry

29. What is Eventstream?

Answer

Fabric service for streaming ingestion.

Sources:

Kafka
Event Hub
IoT

30. What is KQL?

Answer

Kusto Query Language.

Used in:

Real-Time Analytics
Log Analytics

Example:

StormEvents
| where State == "Texas"

31. Explain Fabric Security

Answer

Security layers:

Entra ID Authentication
RBAC
Workspace Permissions
OneLake Security
Sensitivity Labels

32. What is Row-Level Security (RLS)?

Answer

Restricts data access by user.

Example:

Manager sees:

All Regions

Sales Rep sees:

Assigned Region Only
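
Conceptually, RLS is a per-user filter applied before rows reach the report. The sketch below models that in plain Python; in Power BI the filter would be a DAX role expression instead. The user names and the `USER_REGIONS` mapping are hypothetical.

```python
SALES = [
    {"region": "East", "amount": 120},
    {"region": "West", "amount": 340},
    {"region": "East", "amount": 75},
]

# Hypothetical mapping of users to the regions they may see; None means all regions.
USER_REGIONS = {"manager@contoso.com": None, "rep@contoso.com": {"East"}}

def visible_rows(user, rows=SALES):
    """Apply the user's region filter; unknown users see nothing."""
    if user not in USER_REGIONS:
        return []
    allowed = USER_REGIONS[user]
    if allowed is None:
        return list(rows)
    return [r for r in rows if r["region"] in allowed]

print(len(visible_rows("manager@contoso.com")), len(visible_rows("rep@contoso.com")))
```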

33. What is Object-Level Security (OLS)?

Answer

Restricts access to:

Tables
Columns

34. What is Purview Integration?

Answer

Fabric integrates with Microsoft Purview for:

Data Catalog
Lineage
Governance
Compliance

35. Explain Data Lineage

Answer

Tracks data movement.

Example:

SQL DB
 ↓
Pipeline
 ↓
Lakehouse
 ↓
Power BI
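
Lineage is essentially a directed graph, and impact analysis is a traversal of it. A small sketch, using the example chain above as a hard-coded graph (the `LINEAGE` dictionary and `downstream` helper are illustrative, not a Purview or Fabric API):

```python
# Edges point from a source item to the items that consume it.
LINEAGE = {
    "SQL DB": ["Pipeline"],
    "Pipeline": ["Lakehouse"],
    "Lakehouse": ["Power BI"],
}

def downstream(item, graph=LINEAGE):
    """Return every item that directly or indirectly depends on `item`."""
    found, stack = [], list(graph.get(item, []))
    while stack:
        node = stack.pop()
        if node not in found:
            found.append(node)
            stack.extend(graph.get(node, []))
    return found

print(downstream("Pipeline"))  # ['Lakehouse', 'Power BI']
```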

36. What are Shortcuts in OneLake?

Answer

Virtual references to external data.

Supported:

ADLS
Amazon S3

No data copy required.

37. Why are Shortcuts important?

Answer

Benefits:

Zero-copy architecture
Reduced storage costs
Faster access

38. What is Mirroring?

Answer

Near real-time replication into Fabric.

Sources:

Azure SQL
SQL Server
Other supported databases (for example, Azure Cosmos DB and Snowflake)

39. Explain Fabric Mirroring Use Case

Answer

Example:

ERP Database → Fabric

Changes automatically synchronized.

40. What is Semantic Model?

Answer

Business layer used by Power BI.

Contains:

Measures
Relationships
KPIs

Advanced Scenario Questions

41. How would you design a Fabric architecture for a retail company?

Answer

Architecture:

POS Systems
ERP
CRM
   ↓
Data Factory
   ↓
Bronze Lakehouse
   ↓
Silver Transformations
   ↓
Gold Layer
   ↓
Direct Lake
   ↓
Power BI

42. How would you process 10 TB daily in Fabric?

Answer

Use Spark notebooks
Delta Tables
Partitioning
Incremental loads
Auto Optimize

43. How would you reduce Power BI refresh time?

Answer

Direct Lake
Aggregations
Incremental refresh
Star schema

44. How would you migrate from Synapse to Fabric?

Answer

Move data to OneLake
Recreate pipelines
Convert Spark notebooks
Rebuild semantic models
Enable Direct Lake

45. How would you handle Slowly Changing Dimensions (SCD Type 2)?

Answer

Maintain:

Customer_ID
Start_Date
End_Date
Current_Flag

Track historical changes.

Senior Architect Questions

46. Why choose Fabric over Databricks?

Answer

Choose Fabric when:

Heavy Power BI usage
Unified platform needed
Business analytics focus

Choose Databricks when:

Advanced ML
Custom Spark workloads
Multi-cloud strategy

47. Why choose Fabric over Snowflake?

Answer

Fabric:

Integrated Power BI
OneLake
Lower integration effort

Snowflake:

Multi-cloud
Strong SQL warehouse
Independent ecosystem

48. Explain End-to-End Fabric Data Flow

Source Systems
      ↓
Data Factory
      ↓
OneLake
      ↓
Lakehouse
      ↓
Spark Transformation
      ↓
Gold Layer
      ↓
Warehouse
      ↓
Semantic Model
      ↓
Power BI

Most Frequently Asked Microsoft Fabric Interview Questions

What is OneLake?

Unified storage layer.

What is Direct Lake?

Power BI reads OneLake directly.

What is a Lakehouse?

Combination of Data Lake + Warehouse.

What is Mirroring?

Near real-time database replication.

What is a Shortcut?

Zero-copy data access.

What is Eventstream?

Streaming ingestion service.

What is Delta Lake?

ACID-compliant storage layer.

What is Medallion Architecture?

Bronze → Silver → Gold.

What is Data Activator?

Event-driven alerting service.

What is Fabric Capacity?

Compute units used to run workloads.

Critical Real-World Questions Asked by US Companies (2025–2026)

Design a Fabric Lakehouse architecture for 100 million daily transactions.
Explain Direct Lake internals.
Fabric vs Databricks vs Snowflake.
Implement SCD Type 2 using Fabric.
Optimize a slow Direct Lake report.
Design Medallion architecture in Fabric.
Explain OneLake shortcuts.
Fabric security model.
Fabric capacity planning strategy.
Fabric migration roadmap from Synapse/ADF.
CDC implementation using Mirroring.
Delta Lake optimization techniques.
Spark optimization in Fabric.
Fabric governance using Purview.
Real-time streaming architecture using Eventstream + KQL.

These topics cover most of the Microsoft Fabric interview questions typically asked for Data Engineer, Analytics Engineer, BI Engineer, Cloud Data Engineer, and Data Architect roles in the U.S. market.

1. Design a Microsoft Fabric Lakehouse Architecture for 100 Million Daily Transactions

Interview Answer

For a platform processing 100 million transactions per day, I would design a highly scalable Medallion Architecture in Microsoft Fabric using OneLake as the centralized storage layer.

Architecture Flow

Source Systems
     │
     ▼
Data Factory / Eventstream / CDC
     │
     ▼
Bronze Layer (Raw Data)
     │
     ▼
Silver Layer (Validated & Cleansed)
     │
     ▼
Gold Layer (Business Aggregates)
     │
     ▼
Direct Lake Semantic Model
     │
     ▼
Power BI Reports

Components

Data Ingestion

SAP
Salesforce
SQL Server
Oracle
APIs
Kafka Streams

Use:

Fabric Data Factory
Fabric Mirroring
Eventstream

Bronze Layer

Store raw data as:

Delta Parquet Files

Partition:

/year/month/day/hour

Example:

transactions/2026/06/07/14/

Benefits:

Historical retention
Replay capability
Audit trail
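
A small sketch of how the year/month/day/hour folder for an event could be derived from its timestamp. The `partition_path` helper is a hypothetical name; in Spark the same layout would usually come from `partitionBy` on derived columns.

```python
from datetime import datetime

def partition_path(table, event_time):
    """Build a year/month/day/hour folder path for an event timestamp."""
    return f"{table}/{event_time:%Y/%m/%d/%H}/"

print(partition_path("transactions", datetime(2026, 6, 7, 14, 35)))
```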

Silver Layer

Transform using Spark notebooks.

Operations:

Deduplication
Data Quality Checks
Standardization
CDC Merge Logic

Example:

MERGE INTO customer

Gold Layer

Business-ready datasets.

Examples:

Sales Summary
Revenue Metrics
Customer KPIs
Product Performance

Serving Layer

Use:

Direct Lake Semantic Models

Benefits:

No Import refresh
Near real-time analytics
Billion-row scalability

Capacity Design

For 100M/day:

Recommended:

Fabric F128 or higher, or multiple F64 capacities, depending on concurrency.

Expected Interview Follow-Up

Why not Import Mode?

Because:

Dataset size exceeds memory limits.
Refresh windows become large.
Direct Lake eliminates refresh dependency.

2. Explain Direct Lake Internals

Interview Answer

Direct Lake is Fabric’s high-performance analytics mode.

Unlike Import mode:

Import:
Power BI → Copies Data → VertiPaq

Direct Lake:
Power BI → Reads Delta Tables Directly

Internal Architecture

OneLake Delta Files
        │
        ▼
Direct Lake Engine
        │
        ▼
VertiPaq On-Demand Loading
        │
        ▼
Power BI Visuals

Query Process

Step 1

User opens report.

Step 2

Power BI reads metadata.

Step 3

Only required columns are loaded.

Step 4

Query executed using VertiPaq engine.
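
The on-demand loading in Step 3 can be pictured with a toy model. This is only an illustration of the idea that untouched columns never enter memory; it says nothing about how the real engine is built, and `LazyColumnStore` is a hypothetical class.

```python
class LazyColumnStore:
    """Toy model: columns are read from storage only when a query first needs them."""

    def __init__(self, storage):
        self.storage = storage  # column name -> list of values (stands in for files)
        self.memory = {}        # columns already loaded

    def query_sum(self, column):
        if column not in self.memory:
            self.memory[column] = self.storage[column]  # load on demand
        return sum(self.memory[column])

store = LazyColumnStore({
    "amount": [10, 20, 30],
    "quantity": [1, 2, 3],
    "notes": ["a", "b", "c"],
})
total = store.query_sum("amount")
print(total, sorted(store.memory))
```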

Benefits

No Scheduled Refresh

Traditional:

Source → Import → Refresh

Direct Lake:

Source → OneLake → Query

Large Scale

Supports:

Billions of rows

Fast Performance

Typically:

Sub-second response

for optimized models.

Direct Lake Fallback

If unsupported operations occur:

Complex Security
Unsupported Calculations

Fabric falls back to:

DirectQuery

which is slower.

3. Fabric vs Databricks vs Snowflake

Feature	Fabric	Databricks	Snowflake
Lakehouse	Yes	Yes	Partial
BI Included	Yes	No	No
Power BI Integration	Native	External	External
OneLake	Yes	No	No
Spark	Yes	Excellent	Limited
SQL Analytics	Good	Good	Excellent
AI/ML	Good	Excellent	Good
Cost Simplicity	Excellent	Medium	Medium
Governance	Strong	Strong	Strong

When to Choose Fabric

If organization uses:

Microsoft ecosystem
Power BI
Microsoft Entra ID (formerly Azure AD)
M365

Fabric usually requires the least integration effort in this setup, which can lower total cost of ownership (TCO).

When to Choose Databricks

Best for:

Advanced ML
AI
Data Science
Large Spark workloads

When to Choose Snowflake

Best for:

Enterprise SQL Warehousing
Cross-cloud analytics

4. Implement SCD Type 2 Using Fabric

Scenario

Customer address changes.

Need history tracking.

Source

CustomerID
Address

Target Table

CustomerID
Address
StartDate
EndDate
IsCurrent

Process

Compare incoming records with current rows.

If change detected:

Expire Current Row

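-- Expire only the current row of the changed customer (123 is a placeholder ID)
UPDATE customer_dim
SET EndDate = current_date(),
    IsCurrent = 'N'
WHERE CustomerID = 123
  AND IsCurrent = 'Y'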

Insert New Version

INSERT INTO customer_dim
VALUES(...)

Spark MERGE Example

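MERGE INTO customer_dim t
USING source s
ON t.customerid = s.customerid AND t.iscurrent = 'Y'
WHEN MATCHED AND t.address <> s.address THEN
  UPDATE SET t.enddate = current_date(), t.iscurrent = 'N'
WHEN NOT MATCHED THEN
  INSERT (customerid, address, startdate, enddate, iscurrent)
  VALUES (s.customerid, s.address, current_date(), NULL, 'Y')

This MERGE expires changed rows and inserts brand-new customers; the new version of a changed customer still has to be inserted in a follow-up step.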

Benefits

Preserves:

Historical reporting
Regulatory compliance
Customer lifecycle tracking
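
The whole expire-and-insert process can be sketched in plain Python to make the logic easy to trace. This is a teaching sketch, not a replacement for a Delta MERGE; the `apply_scd2` helper and the sample addresses are assumptions.

```python
from datetime import date

def apply_scd2(dim_rows, incoming, today):
    """Expire changed current rows and append new versions (SCD Type 2)."""
    current = {r["CustomerID"]: r for r in dim_rows if r["IsCurrent"] == "Y"}
    for src in incoming:
        existing = current.get(src["CustomerID"])
        if existing and existing["Address"] == src["Address"]:
            continue  # no change, keep the current row
        if existing:
            existing["EndDate"] = today
            existing["IsCurrent"] = "N"
        dim_rows.append({
            "CustomerID": src["CustomerID"],
            "Address": src["Address"],
            "StartDate": today,
            "EndDate": None,
            "IsCurrent": "Y",
        })
    return dim_rows

dim = [{"CustomerID": 1, "Address": "Old St", "StartDate": date(2025, 1, 1),
        "EndDate": None, "IsCurrent": "Y"}]
updates = [{"CustomerID": 1, "Address": "New Ave"},
           {"CustomerID": 2, "Address": "Elm Rd"}]
dim = apply_scd2(dim, updates, date(2026, 6, 7))
print([(r["CustomerID"], r["Address"], r["IsCurrent"]) for r in dim])
```

Customer 1 ends up with two rows (the expired "Old St" row and the current "New Ave" row), while customer 2 gets a single current row.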

5. Optimize a Slow Direct Lake Report

Interview Answer

I follow a systematic approach.

Step 1: Check Model Size

Analyze:

Large Cardinality Columns
Unused Columns

Remove unnecessary fields.

Step 2: Star Schema

Avoid:

Snowflake Design

Use:

Fact + Dimensions

Step 3: Optimize DAX

Bad:

FILTER(ALL(Table))

Better:

CALCULATE()

with selective filters.

Step 4: Aggregation Tables

Create:

Daily Sales
Monthly Revenue

instead of scanning detailed facts.

Step 5: Reduce Visuals

Avoid:

30 visuals/page

Recommended:

5-10 visuals/page

Step 6: Capacity Metrics

Monitor:

CU utilization
Throttling and overload events
Query duration

using Fabric Capacity Metrics App.

6. Design Medallion Architecture in Fabric

Bronze Layer

Raw ingestion.

No Transformations

Purpose:

Audit & Replay

Silver Layer

Business cleansing.

Examples:

Deduplication
Data Validation
Standardization

Gold Layer

Business-ready datasets.

Examples:

Sales KPIs
Customer KPIs
Executive Dashboards

Benefits

Scalability
Governance
Data Quality
Reusability

7. Explain OneLake Shortcuts

Interview Answer

OneLake Shortcut is a virtual pointer to external storage.

No data copy required.

Supported Sources

ADLS Gen2
Amazon S3
Google Cloud Storage
Dataverse
Databricks Delta tables (through their underlying ADLS Gen2 or S3 storage)
Other Fabric Lakehouses

Traditional

Copy Data
Store Data
Manage Data

Shortcut

Reference Data

Benefits

Zero ETL
Reduced storage cost
Single logical lake

8. Fabric Security Model

Fabric security has multiple layers.

Identity Layer

Uses:

Microsoft Entra ID

Authentication:

SSO
MFA
Conditional Access

Workspace Security

Roles:

Admin
Member
Contributor
Viewer

Item Security

Applied on:

Lakehouse
Warehouse
Reports

Data Security

RLS

Region-based filtering

OLS

Hide sensitive columns

Network Security

Private Endpoints
Managed VNET
Encryption

9. Fabric Capacity Planning Strategy

Key Factors

Data Volume

Example:

100M transactions/day

User Concurrency

Example:

500 concurrent users

Workload Mix

ETL
Reporting
Data Science

Capacity Recommendation

Users	Capacity
<100	F16
100-300	F32
300-1000	F64-F128
Enterprise Scale	F128+
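
The table above can be expressed as a simple lookup. This only encodes the rule of thumb from the table and is not an official sizing formula; the boundary at exactly 300 users is ambiguous in the table, so the sketch assumes it falls in the F32 row.

```python
def suggest_capacity(concurrent_users):
    """Map a user count to the rule-of-thumb SKU range from the table above."""
    if concurrent_users < 100:
        return "F16"
    if concurrent_users <= 300:  # assumption: 300 belongs to the F32 row
        return "F32"
    if concurrent_users <= 1000:
        return "F64-F128"
    return "F128+"

print(suggest_capacity(500))
```

Real sizing should also weigh data volume and workload mix, as listed under Key Factors.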

Best Practice

Separate capacities for:

Development
Testing
Production

10. Fabric Migration Roadmap from Synapse / ADF

Phase 1 Assessment

Inventory:

Pipelines
Notebooks
SQL Pools

Phase 2 Mapping

Synapse	Fabric
Dedicated SQL Pool	Warehouse
Spark Pool	Fabric Spark
Data Lake	OneLake
ADF	Data Factory

Phase 3 Migration

Migrate:

Pipelines
Delta Tables
Reports

Phase 4 Validation

Validate:

Data Quality
Performance
Security

Phase 5 Production Cutover

Parallel run:

2-4 weeks

before full cutover.

11. CDC Implementation Using Mirroring

Interview Answer

Fabric Mirroring continuously captures source database changes.

Flow

SQL Server
     │
     ▼
CDC Logs
     │
     ▼
Fabric Mirroring
     │
     ▼
OneLake Delta Tables

Captured Operations

INSERT
UPDATE
DELETE

Benefits

Near real-time
No custom ETL
Minimal source impact
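
What "applying captured operations" means can be shown with a plain-Python sketch. It does not reflect Mirroring's internals; the event shape (`op`, `key`, `row`) and the `apply_changes` helper are hypothetical.

```python
def apply_changes(table, changes):
    """Apply ordered change events to a dict keyed by primary key."""
    for change in changes:
        op, key = change["op"], change["key"]
        if op in ("INSERT", "UPDATE"):
            table[key] = change["row"]
        elif op == "DELETE":
            table.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return table

replica = {}
apply_changes(replica, [
    {"op": "INSERT", "key": 1, "row": {"name": "Ann", "city": "Austin"}},
    {"op": "INSERT", "key": 2, "row": {"name": "Bob", "city": "Boston"}},
    {"op": "UPDATE", "key": 1, "row": {"name": "Ann", "city": "Dallas"}},
    {"op": "DELETE", "key": 2},
])
print(replica)
```

Order matters: replaying the same events in a different sequence could leave the replica in a different state, which is why change feeds are applied in commit order.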

12. Delta Lake Optimization Techniques

OPTIMIZE

Compacts small files.

OPTIMIZE sales

Z-ORDER

Improves data skipping.

OPTIMIZE sales
ZORDER BY(customerid)

VACUUM

Removes obsolete files.

VACUUM sales RETAIN 168 HOURS

Partitioning

Example:

Date
Region

Benefits

Faster scans
Reduced I/O
Lower cost
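
Data skipping, which Z-ORDER is meant to improve, relies on per-file minimum and maximum values. A toy sketch of the pruning step follows; the file names and statistics are invented, and a real engine reads them from the Delta log.

```python
# Hypothetical per-file statistics: min and max customerid stored in each file.
files = [
    {"path": "part-0.parquet", "min_id": 1, "max_id": 100},
    {"path": "part-1.parquet", "min_id": 101, "max_id": 200},
    {"path": "part-2.parquet", "min_id": 201, "max_id": 300},
]

def files_to_scan(file_stats, customer_id):
    """Keep only files whose [min, max] range can contain the requested id."""
    return [f["path"] for f in file_stats
            if f["min_id"] <= customer_id <= f["max_id"]]

print(files_to_scan(files, 150))  # ['part-1.parquet']
```

When values for a column are scattered across all files, every range overlaps and nothing can be skipped; clustering the data narrows the ranges so more files are pruned.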

13. Spark Optimization in Fabric

Partition Tuning

Bad:

1 partition

Good:

200-500 partitions

depending on cluster size.
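
One common way to pick a starting partition count is to target a fixed partition size and never go below the number of cores. The sketch below assumes a 128 MB target, which is a general Spark rule of thumb rather than a Fabric-specific setting; `suggest_partitions` is a hypothetical helper.

```python
import math

def suggest_partitions(data_size_mb, total_cores, target_partition_mb=128):
    """Rule of thumb: about 128 MB per partition, and at least one partition per core."""
    by_size = math.ceil(data_size_mb / target_partition_mb)
    return max(by_size, total_cores)

# Roughly 50 GB of data on an assumed 64-core pool.
print(suggest_partitions(data_size_mb=51_200, total_cores=64))
```

The result is only a starting point; skewed keys or very wide rows usually call for measuring in the Spark UI and adjusting.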

Broadcast Join

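from pyspark.sql.functions import broadcast
joined = fact_df.join(broadcast(dim_table), "customer_id")  # fact_df and the join key are placeholders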

for small dimensions.

Cache Frequently Used Data

df.cache()

Predicate Pushdown

Filter early.

df.filter(...)

Avoid Wide Transformations

Reduce:

Shuffle operations
Large joins

Use Delta Format

Better than:

CSV
JSON

14. Fabric Governance Using Purview

Components

Use Microsoft Purview for enterprise governance.

Capabilities

Data Catalog

Business metadata discovery.

Data Lineage

Track:

Source → Pipeline → Lakehouse → Report

Sensitivity Labels

Examples:

Public
Internal
Confidential
Restricted

Data Classification

Detect:

PII
PHI
Financial Data

Benefits

Compliance
Auditability
Regulatory reporting

15. Real-Time Streaming Architecture Using Eventstream + KQL

Architecture

IoT Devices
Applications
Kafka
Event Hub
      │
      ▼
Fabric Eventstream
      │
      ▼
KQL Database
      │
      ▼
Real-Time Analytics Dashboard

Eventstream Responsibilities

Ingestion
Routing
Filtering
Transformation

KQL Database Responsibilities

Time-series storage
Real-time analytics
Log analytics

Example KQL Query

Transactions
| where Timestamp > ago(5m)
| summarize Count=count() by Region
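
For readers new to KQL, the query above filters to the last five minutes and then counts rows per region. A plain-Python equivalent, with invented sample events and a fixed `now` so the result is reproducible:

```python
from collections import Counter
from datetime import datetime, timedelta

def count_by_region(transactions, now, window=timedelta(minutes=5)):
    """Count events newer than now - window, grouped by region."""
    cutoff = now - window
    return Counter(t["Region"] for t in transactions if t["Timestamp"] > cutoff)

now = datetime(2026, 6, 7, 14, 0)
transactions = [
    {"Region": "East", "Timestamp": now - timedelta(minutes=1)},
    {"Region": "East", "Timestamp": now - timedelta(minutes=3)},
    {"Region": "West", "Timestamp": now - timedelta(minutes=4)},
    {"Region": "West", "Timestamp": now - timedelta(minutes=9)},  # outside the window
]
print(dict(count_by_region(transactions, now)))
```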

Real-World Use Cases

Fraud Detection
Healthcare Monitoring
Manufacturing IoT
Application Monitoring
Financial Transaction Analytics

Executive Interview Summary (2-Minute Answer)

“At enterprise scale, I design Microsoft Fabric solutions using a Medallion Lakehouse architecture built on OneLake. Data is ingested through Mirroring, Data Factory, and Eventstream into Bronze, transformed through Spark into Silver, and aggregated into Gold. For analytics, I leverage Direct Lake semantic models to achieve near real-time performance without dataset refreshes. I optimize performance using Delta Lake techniques such as OPTIMIZE, Z-ORDER, partitioning, and Spark tuning. Security is implemented through Microsoft Entra ID, RLS, OLS, workspace roles, and governance with Microsoft Purview. For large workloads such as 100 million daily transactions, I plan capacity using F64–F128 SKUs, monitor utilization, and separate production workloads across dedicated capacities.”

Preparing for a Microsoft Fabric interview requires understanding both architectural concepts and practical implementation details. This guide covers fundamental, data engineering, security, and scenario-based questions with comprehensive answers.

Fundamental & Architecture Questions
Lakehouse & Data Warehousing Questions
Data Engineering & Pipelines
Security & Governance Questions
Performance Optimization Questions
Real-World Scenario Questions
Quick Reference Answer Guide

Fundamental & Architecture Questions

Q1: What is Microsoft Fabric, and how does it unify data analytics workflows?

Answer:

Microsoft Fabric is a unified, end-to-end analytics platform that integrates multiple data workloads into a single software-as-a-service (SaaS) solution. It combines capabilities from Power BI, Azure Synapse, and Azure Data Factory, creating a cohesive environment where data professionals can work without juggling multiple services.

Key unification aspects:

Single data lake (OneLake): One central storage system instead of separate silos
Shared semantic model: Consistent business definitions across all tools
Integrated experiences: Data engineering, data warehousing, real-time analytics, and Power BI all within one interface
SaaS simplicity: No infrastructure management—automatic scaling, patching, and updates

Interview Tip: Emphasize how Fabric eliminates the “spaghetti architecture” problem where companies had disconnected tools for each analytics task.

Q2: What is OneLake, and how does it compare to traditional data lakes?

Answer:

OneLake is the single, unified data lake that serves as the storage foundation for Microsoft Fabric. Unlike traditional data lakes, OneLake is not another storage account you provision; it’s built into Fabric and automatically available.

Comparison table:

Aspect	Traditional Data Lake (ADLS Gen2)	OneLake in Fabric
Provisioning	Manual storage account creation	Automatic with Fabric workspace
Scope	Separate accounts per workload	Single logical lake across all workloads
Structure	Folder-based organization	Workspace + item-based organization
Shortcuts	Not available	Native shortcuts to external data
Open formats	Supports Parquet/Delta	Built on Delta Parquet format

Key differentiator: OneLake provides “shortcuts” (similar to symbolic links) that let you reference data from other lakes without copying it.

Q3: Explain the concept of Lakehouse in Microsoft Fabric.

Answer:

A Lakehouse combines the best features of data lakes (low-cost storage, schema flexibility) and data warehouses (performance, ACID transactions, governance). In Microsoft Fabric, the Lakehouse is a native item that stores data in Delta Parquet format.

Core components:

Files section: Raw files (structured/semi-structured)
Tables section: Managed Delta tables with schema enforcement
SQL endpoint: Automatic availability for T-SQL queries
Default semantic model: Auto-generated for Power BI reporting

Why it matters: Fabric Lakehouse eliminates the need to choose between data lake and warehouse; you get both without data duplication.

Q4: What are the core components of Microsoft Fabric?

Answer:

Fabric consists of seven core workloads (called “experiences”). Older material prefixes three of them with “Synapse”; the current names are:

Component	Primary Use
Data Factory	Data ingestion and orchestration (like ADF)
Data Engineering	Spark-based data transformation using notebooks
Data Warehouse	Traditional data warehouse with T-SQL
Data Science	ML model development and tracking
Real-Time Intelligence	Streaming data with KQL (Kusto Query Language); formerly Real-Time Analytics
Power BI	Reporting and semantic modeling
Data Activator	Automated actions based on data patterns

All components share the same OneLake storage and security model.

Lakehouse & Data Warehousing Questions

Q5: What’s the difference between Lakehouse and Warehouse items in Fabric, and when would you choose one over the other?

Answer:

This is a critical distinction interviewers test frequently.

Lakehouse:

Storage format: Delta Parquet (open format) in OneLake
Query engine: Spark (read/write) plus a read-only T-SQL SQL analytics endpoint
Best for: Exploratory analytics, streaming data, ML workloads
Schema: Schema-on-read for files; Delta tables enforce their schema
Cost: Same OneLake storage; compute cost depends on Spark usage

Warehouse:

Storage format: Delta Parquet in OneLake, written and managed by the SQL engine
Query engine: T-SQL with full read/write (DML and DDL) support
Best for: Enterprise reporting, governed BI, complex joins
Schema: Schema-on-write (strict enforcement)
Cost: Same OneLake storage; tuned for high-concurrency SQL query performance

Decision guide:

Choose Lakehouse when: Data is unstructured, multiple formats, data science workloads, cost is primary concern
Choose Warehouse when: Strong governance required, complex T-SQL, existing Synapse migration, predictable schemas

Interview Tip: Explain hybrid approaches—build Lakehouse for raw ingestion, then create shortcuts or views from Warehouse.

Q6: How do you implement an end-to-end Lakehouse architecture for real-time analytics?

Answer:

A complete Lakehouse architecture for real-time analytics follows this flow:

Ingestion → Streaming Processing → Storage → Serving → Visualization

1. Event Hub/IoT Hub → 2. Stream Analytics/Spark Streaming → 3. Lakehouse Delta Tables → 4. SQL Endpoint → 5. Power BI Direct Lake

Step-by-step implementation:

Ingestion layer: Configure Event Hubs or Kafka to receive streaming data
Processing: Use Fabric notebooks with Structured Streaming or Real-Time Hub
Storage: Write to Lakehouse Delta tables with automatic partitioning
Optimization: Implement Z-order clustering on frequently filtered columns
Serving: Use the automatic SQL endpoint for reporting
Visualization: Connect Power BI using Direct Lake mode for sub-second latency

Key to real-time: Use Direct Lake mode (not DirectQuery or Import) to query Delta files directly without moving data.

Q7: How do you set up a Lakehouse, and what scenarios make it a good fit?

Answer:

Setup process:

Navigate to Fabric workspace → Create new item → Lakehouse
Name the Lakehouse and confirm creation
Two views appear: Explorer (files/tables) and SQL endpoint
Load data via Data Factory pipeline, notebook, or upload
Delta tables written under the Tables folder are discovered automatically

Best-fit scenarios:

Data lake modernization: Existing ADLS Gen2 with Delta format
Medallion architecture: Bronze (raw), Silver (cleaned), Gold (aggregated) layers
Data science projects: Need Spark and Python support
Multi-format data: Mix of JSON, CSV, Parquet, images
Cost-sensitive analytics: Large data volumes with moderate performance needs

Data Engineering & Pipelines

Q8: Explain how Fabric Data Pipelines differ from Azure Data Factory pipelines—and when to use each.

Answer:

This question tests your understanding of Fabric’s relationship to existing Azure services.

Aspect	Fabric Data Pipelines	Azure Data Factory (ADF)
Location	Within Fabric workspace	Standalone Azure service
Activities	Copy, Notebook, KQL, Dataflow, Spark	100+ activities including custom
Integration	Native with OneLake, shortcuts	Requires explicit connectors
Orchestration	Workspace-level triggers	More complex trigger options
Code reusability	Limited to workspace	Data Factory Studio reuse
Pricing model	Fabric Capacity Units	ADF v2 pricing

When to use each:

Use Fabric Pipeline: For workloads entirely within Fabric ecosystem, OneLake-to-OneLake copies, native Fabric activities
Use ADF: For complex hybrid scenarios, existing ADF investments, custom .NET activities, SSIS lift-and-shift

Interview Insight: Microsoft wants Fabric to eventually replace ADF for most data integration, but it maintains both services for now.

Q9: How do you optimize Spark job performance inside Fabric Notebooks for large-scale datasets?

Answer:

Spark optimization is a high-priority topic in Fabric interviews.

Key strategies:

1. Partitioning optimization:

python

# Read with optimal partitions
df = spark.read.parquet("path").repartition(200)  # Based on cluster cores

# Write with partitioning column
df.write.partitionBy("year", "month").format("delta").save("path")

2. Use Delta optimizations:

sql

OPTIMIZE table_name;  -- Compacts small files
VACUUM table_name RETAIN 168 HOURS;  -- Clean up old versions

3. Caching strategies:

python

df.cache()  # For reused DataFrames
df.count()  # Forces cache materialization

4. Shuffle tuning:

python

spark.conf.set("spark.sql.shuffle.partitions", "200")
spark.conf.set("spark.sql.adaptive.enabled", "true")  # AQE in Spark 3.x

5. Notebook-specific:

Use %run for modular code instead of large monolithic notebooks
Detach and reattach sessions when memory issues occur
Monitor Spark UI for skew detection
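
Skew can also be checked numerically rather than only by eye in the Spark UI. The sketch below is a plain-Python illustration, assuming you have already collected one row count per partition (in PySpark this could come from grouping by `spark_partition_id()`); the function name and the 5x-median threshold are arbitrary assumptions, not Fabric defaults.

```python
from statistics import median


def find_skewed_partitions(row_counts, ratio=5.0):
    """Return indexes of partitions holding more than ratio x the median row count."""
    if not row_counts:
        return []
    typical = median(row_counts)
    if typical == 0:
        # Most partitions are empty, so any non-empty partition is an outlier.
        return [i for i, n in enumerate(row_counts) if n > 0]
    return [i for i, n in enumerate(row_counts) if n > ratio * typical]


counts = [1000, 1100, 950, 12000, 1050]
print(find_skewed_partitions(counts))  # [3]
```

A flagged partition is a hint to repartition on a better-distributed key or to salt the hot key before a join.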

Q10: How would you implement complete data lineage, monitoring, and alerting within Fabric?

Answer:

Fabric provides multi-layered observability:

Data Lineage:

Impact analysis view: Shows upstream/downstream dependencies
Workspace lineage: Visual graph of item relationships (pipelines, dataflows, datasets)
Column-level lineage: When using Purview integration

Monitoring:

Monitor Hub: Central view of all runs (pipeline, notebook, dataflow)
Spark Application History: Detailed job stages, tasks, and executors
Capacity Metrics app: Track Consumption (CUs) and throttling events

Alerting Setup:

python

# Example: Custom alert from notebook
from notebookutils import mssparkutils

# failed_count and threshold are assumed to be computed earlier in the notebook
if failed_count > threshold:
    mssparkutils.notebook.run("SendAlertNotebook")
    # Or use Pipeline Web activity to call Logic App

Best practice: Set up alerts on:

Pipeline failures (email to Data Engineer DL)
Capacity exceeding 80% for more than 10 minutes
Long-running queries (> 5 minutes)
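
The capacity rule above ("above 80% for more than 10 minutes") is a sustained-breach check rather than a single-sample check. A minimal sketch, assuming utilization is sampled once per minute and passed in as a list of percentages; the function name and sampling interval are assumptions for illustration.

```python
def sustained_breach(utilization_pct, threshold_pct=80.0, min_minutes=10):
    """Return True if utilization stays above the threshold for more than min_minutes.

    utilization_pct is assumed to hold one sample per minute, oldest first.
    """
    consecutive = 0
    for pct in utilization_pct:
        # Any sample at or below the threshold resets the streak.
        consecutive = consecutive + 1 if pct > threshold_pct else 0
        if consecutive > min_minutes:
            return True
    return False
```

Requiring a streak avoids paging someone for a short spike that the capacity would absorb anyway.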

Q11: What’s your strategy for handling schema drift, incremental loading, and CDC in Fabric?

Answer:

Schema Drift Handling:

python

# In Dataflow Gen2: Use "Allow schema drift" option
# In Notebooks: Use dynamic schema evolution
df.write.mode("append").option("mergeSchema", "true").saveAsTable("target")

Incremental Loading patterns:

Watermark technique:

sql

SELECT * FROM source 
WHERE LastModifiedDate > (SELECT MAX(LastModifiedDate) FROM target)
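
One edge case is worth calling out: on the very first run the target is empty, so `MAX(LastModifiedDate)` is NULL and the comparison returns no rows. The plain-Python sketch below mirrors the watermark logic and handles that case explicitly; rows are modelled as dictionaries and the function name is hypothetical.

```python
from datetime import datetime


def rows_after_watermark(source_rows, target_rows, column="LastModifiedDate"):
    """Return the source rows modified after the newest row already in the target."""
    if not target_rows:
        # Empty target: there is no watermark yet, so load everything.
        return list(source_rows)
    watermark = max(row[column] for row in target_rows)
    return [row for row in source_rows if row[column] > watermark]


source = [
    {"id": 1, "LastModifiedDate": datetime(2024, 1, 1)},
    {"id": 2, "LastModifiedDate": datetime(2024, 1, 5)},
    {"id": 3, "LastModifiedDate": datetime(2024, 1, 9)},
]
print([r["id"] for r in rows_after_watermark(source, source[:2])])  # [3]
```

In SQL the same guard is commonly written by wrapping the subquery in `COALESCE` with a very old default date.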

Delta Lake Change Data Feed:

sql

-- Enable on table
ALTER TABLE mytable SET TBLPROPERTIES (delta.enableChangeDataFeed = true)

-- Read changes
SELECT * FROM table_changes('mytable', 123, 456)
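
Rows returned by the change feed carry a `_change_type` column (`insert`, `update_preimage`, `update_postimage`, `delete`) and a `_commit_version`. The sketch below shows, in plain Python, how such rows could be replayed onto a target keyed by a business key; the `id` key and the dictionary-based target are assumptions for illustration, and in Spark the same logic would normally be a `MERGE`.

```python
def apply_changes(target, changes, key="id"):
    """Replay change-feed rows onto target (a dict of key -> row), oldest commit first."""
    for change in sorted(changes, key=lambda c: c["_commit_version"]):
        kind = change["_change_type"]
        # Drop the change-feed metadata columns, which all start with an underscore.
        row = {k: v for k, v in change.items() if not k.startswith("_")}
        if kind in ("insert", "update_postimage"):
            target[row[key]] = row
        elif kind == "delete":
            target.pop(row[key], None)
        # update_preimage rows hold the old values and need no action here.
    return target
```

Sorting by commit version matters: applying a delete before the insert it follows would leave a row behind.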

CDC Implementation:

Source Type	Fabric Method
SQL Server	Azure Data Factory with CDC connector
Event Hubs	Real-Time Hub → Lakehouse
API sources	Incremental pipeline with last-run timestamp
Files (ADLS)	Filter on file last-modified time (for example, the Copy activity's modifiedDatetimeStart/modifiedDatetimeEnd settings)

Security & Governance Questions

Q12: How does Microsoft Fabric handle data encryption?

Answer:

Fabric implements a multi-layer encryption approach:

At Rest (OneLake storage):

Default: Azure Storage Service Encryption (SSE) with Microsoft-managed keys
Enhanced: Customer-managed keys (CMK) via Azure Key Vault for compliance (HIPAA, PCI DSS)
Double encryption: Optional for highest sensitivity workloads

In Transit:

TLS 1.2+ for all service-to-service communication
HTTPS required for all API access

Practical example: A healthcare provider storing patient records enables CMK encryption to meet HIPAA requirements while maintaining analytics processing.

Q13: What are the key security roles in Fabric, and how would you implement them in an enterprise scenario?

Answer:

Pre-defined roles (Workspace level):

Role	Permissions
Admin	Full control: manage users, settings, delete workspace
Member	Create/Edit items, share, and add users up to Member level; cannot add Admins or delete the workspace
Contributor	Create/Edit items, cannot share or manage access
Viewer	Read-only access to all workspace items

Enterprise implementation:

yaml

# Global retail company example:
Workspace_Europe_Sales:
  - Germany_Team: Contributor (local sales data only)
  - France_Team: Contributor
  - Global_Finance: Viewer (cross-region reports)
  - IT_Admin_Group: Admin

Workspace_APAC_SupplyChain:
  - Singapore_Ops: Admin
  - India_Logistics: Member
  - Global_Exec: Viewer

Best practice: Apply the least-privilege principle. Use Microsoft Entra ID (formerly Azure AD) groups; never assign permissions directly to users.

Q14: How do you implement Row-Level Security (RLS) for a sales organization using Fabric?

Answer:

RLS restricts row access based on user attributes. Implementation path:

Step 1: Define DAX filters in the semantic model

text

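-- SalesRepEmail and ManagerEmail are illustrative columns that store each user's sign-in name (UPN)
DAX Role: "SalesRep_Filter"
[SalesRepEmail] = USERPRINCIPALNAME()

DAX Role: "Manager_Filter"
[ManagerEmail] = USERPRINCIPALNAME()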

Step 2: Create roles in Fabric

Navigate to semantic model → Row-Level Security
Define role name and DAX expression

Step 3: Test using “View as” role

Verify each user sees only their rows

Step 4: Assign users (Azure AD groups recommended)

text

Role: SalesRep_Filter → Members: SalesRep_Europe_Group, SalesRep_APAC_Group
Role: Manager_Filter → Members: Sales_Management_Group

Real-world scenario: A global sales company ensures European sales reps cannot see APAC transaction data while maintaining a single dataset for management reporting.
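
To reason about what a filter of the form "column = USERPRINCIPALNAME()" does, it can help to simulate it outside the semantic model. The sketch below is plain Python, not DAX; the `SalesRepEmail` column and the sample addresses are hypothetical.

```python
def visible_rows(rows, user_principal_name, owner_column="SalesRepEmail"):
    """Return only the rows whose owner column matches the signed-in user."""
    upn = user_principal_name.lower()
    # Compare case-insensitively so "Anna@..." and "anna@..." count as the same user.
    return [row for row in rows if row[owner_column].lower() == upn]


sales = [
    {"OrderID": 1, "Region": "Europe", "SalesRepEmail": "anna@contoso.com"},
    {"OrderID": 2, "Region": "APAC", "SalesRepEmail": "kenji@contoso.com"},
    {"OrderID": 3, "Region": "Europe", "SalesRepEmail": "Anna@contoso.com"},
]
print([r["OrderID"] for r in visible_rows(sales, "anna@contoso.com")])  # [1, 3]
```

The same mental model applies when testing with "View as": each user should get exactly the rows this kind of filter would return for their sign-in name.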

Q15: How do sensitivity labels help secure financial reports?

Answer:

Microsoft Purview sensitivity labels integrate natively with Fabric.

How it works:

Classification: Label automatically applied based on content patterns (SSN, credit card, financial terms)
Protection: Encryption applied to the content
Enforcement: Prevents sharing outside authorized departments
Audit: Tracks all access attempts

Banking example:

M&A analysis reports automatically get “Highly Confidential” label
Label triggers encryption and blocks external sharing
Manager approval required before sharing outside M&A department
All access attempts logged for FINRA compliance

Implementation:

powershell

# Labels are applied via the Purview compliance portal.
# Auto-labeling policies can be based on:
#   - Sensitive info types (account numbers, SWIFT codes)
#   - Pattern matching (confidentiality headers)
#   - User classification (investment banking vs retail banking)

Q16: Why would a pharmaceutical company implement Private Link for their Fabric environment?

Answer:

Private Link creates secure, private network connections between Fabric and an Azure Virtual Network.

Clinical trial scenario:

Patient data must never traverse public internet
Private Link ensures all data movement stays within Azure backbone network
Combines with Network Security Groups for strict IP whitelisting
Complies with FDA data protection requirements

Benefits for regulated industries:

Concern	Private Link Solution
Data interception	No internet exposure
Hybrid connectivity	Secure on-prem gateway connection
IP restriction	Whitelist only research facility IPs
Compliance	No public endpoint logging required

Setup: Configure private endpoints for OneLake, Power BI, and Data Factory within the organization’s VNet.

Q17: How should a retail chain handle payment processing secrets in their Fabric pipelines?

Answer:

Use Azure Key Vault as the secrets management solution.

DO NOT hardcode:

python

# ❌ Wrong - Never do this
connection_string = "Server=paymentdb;User=sa;Password=SuperSecret123!"

Correct approach:

python

# ✅ Right - Reference Key Vault
from notebookutils import mssparkutils
secret = mssparkutils.credentials.getSecret("https://keyvault.vault.azure.net/", "payment-api-key")

Best practices for retail:

Store payment gateway credentials, API keys, connection strings
Implement automated key rotation (every 30-90 days)
Enable detailed audit logging for all secret access attempts
Use managed identities—no service principal secrets stored anywhere
Comply with PCI DSS by logging every secret access
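
The rotation practice above can be backed by a simple age check. This is a minimal sketch, assuming you already have each secret's last-rotation date (for example from an inventory export); the secret names and the function name are hypothetical, and nothing here calls Key Vault.

```python
from datetime import date, timedelta


def secrets_due_for_rotation(last_rotated, today, max_age_days=90):
    """Return the names of secrets whose last rotation is older than max_age_days."""
    limit = timedelta(days=max_age_days)
    return sorted(name for name, rotated in last_rotated.items() if today - rotated > limit)


inventory = {
    "payment-api-key": date(2024, 1, 1),
    "gateway-connection": date(2024, 3, 15),
}
print(secrets_due_for_rotation(inventory, today=date(2024, 4, 15)))  # ['payment-api-key']
```

Running a check like this on a schedule turns the "every 30-90 days" guideline into something that can raise an alert.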

Performance Optimization Questions

Q18: Your Power BI report connected via Direct Lake is showing slow load times. What performance tuning strategies would you apply?

Answer:

This is a common scenario-based question.

Diagnostic checklist:

Check if report is actually using Direct Lake (not falling back to DirectQuery)
Identify bottlenecks using Performance Analyzer in Power BI Desktop

Optimization strategies:

Issue	Solution
Small file problem	Run OPTIMIZE on Delta tables to compact files
Large column counts	Remove unused columns from the semantic model
Complex DAX measures	Pre-aggregate at Lakehouse level via Spark
Cross-table joins	Ensure join columns are partition-aligned
High cardinality columns	Implement aggregations for high-level reports

Step-by-step tuning:

sql

-- 1. Compact Delta files
OPTIMIZE sales_table WHERE date >= '2024-01-01'

-- 2. Apply Z-ordering on frequently filtered columns
OPTIMIZE sales_table ZORDER BY (transaction_date, region)

-- 3. Apply VACUUM to remove old versions
VACUUM sales_table RETAIN 168 HOURS

-- 4. Verify file sizes (target 100MB-1GB per file)

Direct Lake specific: After major Lakehouse changes, refresh (reframe) the semantic model so that it points at the latest version of the Delta tables.
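
The file-size check in step 4 can be approximated from two numbers that `DESCRIBE DETAIL` reports for a Delta table, `numFiles` and `sizeInBytes`. The sketch below is a hedged illustration in plain Python; the function names are hypothetical and the 100 MB floor is taken from the target range mentioned above.

```python
MB = 1024 * 1024


def average_file_mb(size_in_bytes, num_files):
    """Average data file size in MB, or 0.0 for an empty table."""
    if num_files == 0:
        return 0.0
    return size_in_bytes / num_files / MB


def needs_compaction(size_in_bytes, num_files, min_avg_mb=100):
    """Flag tables with several files whose average size is below the target floor."""
    return num_files > 1 and average_file_mb(size_in_bytes, num_files) < min_avg_mb


# 10 GB spread over 2000 files averages about 5 MB per file.
print(needs_compaction(10 * 1024 * MB, 2000))  # True
```

An average hides outliers, so this is a first-pass signal for when to run OPTIMIZE, not a full file-level audit.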

Q19: How do you configure and manage Fabric capacities to avoid throttling?

Answer:

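Fabric uses Capacity Units (CUs) that function like fuel: operations consume CUs, and Fabric smooths that consumption over time instead of charging it all at the moment an operation runs.

Key concepts:

Timepoints: Capacity usage is evaluated in 30-second timepoints
Smoothing: Interactive operations are smoothed over at least 5 minutes; background operations are smoothed over 24 hours
Throttling occurs when: Smoothed consumption keeps exceeding provisioned capacity; the first 10 minutes of accumulated overage are absorbed without throttling
Throttling behavior: Progressive. Interactive requests are delayed first, then rejected; background jobs are rejected only after about 24 hours of accumulated overage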

Management strategies:

1. Monitor capacity usage:

Use Fabric Capacity Metrics app (download from AppSource)
Track operations by user, item type, time of day

2. Optimize heavy operations:

python

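from functools import reduce
from pyspark.sql import DataFrame

# files (a list of Parquet source paths) and target_path are assumed to be defined earlier

# Instead of one small write per file:
for f in files:
    spark.read.parquet(f).write.format("delta").mode("append").save(target_path)  # one operation per file

# Do one batch write:
all_df = reduce(DataFrame.unionByName, [spark.read.parquet(f) for f in files])
all_df.write.format("delta").mode("append").save(target_path)  # 1 operation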

3. Time-shift workloads:

Schedule heavy ETL during off-hours
Stagger pipeline start times

4. Use bursting:
Fabric allows short-term spikes above provisioned capacity—design for average, not peak

5. Purchase backup capacity:
Pay-as-you-go for overflow during peak seasons
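
Smoothing is easier to explain with numbers. The sketch below spreads each timepoint's consumption evenly over a window of following timepoints; it is a simplified model for intuition only, and the window length is a parameter of the example rather than a statement about Fabric's actual settings.

```python
def smooth(usage, window):
    """Spread each timepoint's CU usage evenly across `window` consecutive timepoints."""
    smoothed = [0.0] * (len(usage) + window - 1)
    for t, cu in enumerate(usage):
        share = cu / window
        for offset in range(window):
            smoothed[t + offset] += share
    return smoothed


# A single 100-CU spike becomes four timepoints of 25 CU each.
print(smooth([100, 0, 0, 0], window=4))  # [25.0, 25.0, 25.0, 25.0, 0.0, 0.0, 0.0]
```

Total consumption is unchanged; only the peak drops. That is why a capacity sized for average load can absorb short spikes, and why overlapping spikes still add up.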

Real-World Scenario Questions

Q20: How do you integrate external Delta Lake tables from ADLS Gen2 into Fabric’s OneLake while preserving lineage?

Answer:

Use OneLake Shortcuts: references to external data without copying.

Implementation:

text

ADLS Gen2 location: abfss://container@storage.dfs.core.windows.net/external/tables/sales/

In Fabric Lakehouse: Create shortcut to above path
Result: Data appears in Lakehouse files/tables section

Lineage preservation:

External tables maintain original metadata (created/modified dates)
Purview integration traces source system through shortcut
Pipeline lineage shows data flow from external source to Fabric items

Steps:

python

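# Create shortcut via notebook by calling the Fabric REST API (Create Shortcut).
# workspace_id, lakehouse_id, and connection_id are placeholders you must supply.
import requests
from notebookutils import mssparkutils

token = mssparkutils.credentials.getToken("pbi")
url = f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items/{lakehouse_id}/shortcuts"
target = {"location": "https://storage.dfs.core.windows.net", "subpath": "/external/sales", "connectionId": connection_id}
body = {"path": "Tables", "name": "sales", "target": {"adlsGen2": target}}
requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"}).raise_for_status()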

Benefits over copy:

No storage duplication costs
Instant availability (zero copy time)
Always current (no sync lag)
Access control enforced at source level
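
When preparing a shortcut it is often useful to split an `abfss://` URI into its container, storage account, and path. A minimal sketch using only the standard library; the function name and the returned dictionary shape are assumptions for illustration.

```python
from urllib.parse import urlparse


def parse_abfss(uri):
    """Split an abfss:// URI into container, storage account, host, and path."""
    parsed = urlparse(uri)
    if parsed.scheme != "abfss":
        raise ValueError(f"Not an abfss URI: {uri}")
    # The authority part has the form <container>@<account>.dfs.core.windows.net
    container, _, host = parsed.netloc.partition("@")
    return {
        "container": container,
        "account": host.split(".")[0],
        "host": host,
        "path": parsed.path.strip("/"),
    }


parts = parse_abfss("abfss://container@storage.dfs.core.windows.net/external/tables/sales/")
print(parts["container"], parts["account"], parts["path"])  # container storage external/tables/sales
```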

Q21: How would you design a disaster recovery and data backup strategy in Fabric?

Answer:

Fabric presents unique challenges since OneLake is a unified storage system.

Backup strategies:

Component	DR Approach
OneLake data	Zone-redundant storage (ZRS) across availability zones
Cross-region	Manual or automated replication to paired region
Semantic models	Export .pbip files to source control
Pipelines/Notebooks	Git integration (Azure DevOps or GitHub)
Workspace metadata	Use Fabric REST APIs to export workspace JSON

Practical implementation:

Level 1 (Within region):

Enable ZRS on storage (Microsoft managed)
Recovery Point Objective (RPO): < 15 minutes

Level 2 (Cross-region):

python

# Scheduled notebook to replicate critical tables
def replicate_to_dr_region():
    df = spark.read.format("delta").load("abfss://primary@onelake.dfs.fabric.microsoft.com/critical")
    # Overwrite so that repeated scheduled runs do not fail on an existing path
    df.write.format("delta").mode("overwrite").save("abfss://dr@drstorage.dfs.core.windows.net/backup/")

Level 3 (Metadata backup):

Connect Fabric workspace to Git repository
Schedule weekly export of pipeline definitions
Store notebook source code in version control

Recovery procedure:

Restore workspace from Git (metadata)
Reattach to surviving OneLake data (automatic)
If region-level failure, redirect to DR storage using shortcuts
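
A replication job only protects you if it actually runs within the recovery point objective. The sketch below is a small check that could run after each replication; it assumes you record the time of the last successful copy yourself, and the function name and default of 15 minutes are illustrative.

```python
from datetime import datetime, timedelta


def rpo_met(last_replicated_at, now, rpo_minutes=15):
    """Return True if the newest replica is no older than the RPO."""
    return now - last_replicated_at <= timedelta(minutes=rpo_minutes)


last_copy = datetime(2024, 6, 1, 6, 0)
print(rpo_met(last_copy, now=datetime(2024, 6, 1, 6, 10)))  # True
print(rpo_met(last_copy, now=datetime(2024, 6, 1, 6, 40)))  # False
```

For a scheduled copy, the achievable RPO is roughly the schedule interval plus the copy duration, so the schedule has to be tighter than the target.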

Q22: Your organization wants to integrate machine learning predictions from Azure ML into a Fabric Lakehouse. How will you design that end-to-end integration pipeline?

Answer:

This tests your ability to combine Microsoft’s ML ecosystem with Fabric.

Architecture:

text

Azure ML Model Training → Model Registry → Batch Scoring Pipeline → Lakehouse Storage → Power BI Reporting

Step-by-step implementation:

1. Register and deploy model in Azure ML:

python

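# In Azure ML workspace (SDK v2); ml_client is assumed to be an authenticated MLClient
from azure.ai.ml.entities import Model, ManagedOnlineEndpoint

model = ml_client.models.create_or_update(Model(path="model_path", name="sales_forecast_model"))
endpoint = ml_client.online_endpoints.begin_create_or_update(
    ManagedOnlineEndpoint(name="forecast-endpoint")
).result()
# A ManagedOnlineDeployment that references the model is still needed before the endpoint can serve requests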

2. Create Fabric notebook for batch inference:

python

# Fabric notebook - Batch scoring
import requests
from notebookutils import mssparkutils

# Get model endpoint URL from Key Vault
endpoint = mssparkutils.credentials.getSecret("keyvault", "ml-endpoint")
token = mssparkutils.credentials.getToken("https://ml.azure.com")

# Read new data from Lakehouse
new_orders = spark.read.table("bronze.orders").filter("processed = false")

# Score in batches (5000 records per call).
# collect() assumes the unprocessed backlog fits in driver memory.
rows = [row.asDict() for row in new_orders.collect()]
results = []
for start in range(0, len(rows), 5000):
    batch = rows[start:start + 5000]
    response = requests.post(endpoint, json={"data": batch}, headers={"Authorization": f"Bearer {token}"}, timeout=300)
    response.raise_for_status()
    # Assumes the endpoint returns a JSON list with one prediction record per input row
    results.extend(response.json())

# Write predictions to Lakehouse
scored_df = spark.createDataFrame(results)
scored_df.write.mode("append").saveAsTable("silver.sales_predictions")

3. Orchestrate with Data Pipeline:

text

Pipeline Schedule (daily at 6 AM):
    Notebook: "Batch Scoring" →
    Copy Activity: predictions to gold layer →
    Semantic Model Refresh

4. Consumption:

Power BI connects to predictions table via Direct Lake
Optionally retrain model monthly using Fabric Data Science

Quick Reference Answer Guide

Top 5 Interviewers’ Focus Areas:

OneLake & Shortcuts – Understand zero-copy data sharing
Direct Lake mode – Know when and why to use it
Capacity management – Monitor and optimize CU consumption
Security (RLS + Purview) – Implement compliance patterns
Differences from Synapse/ADF – Know when Fabric is better

Critical Differences to Remember:

Concept	Traditional Azure	Microsoft Fabric
Storage	Separate ADLS accounts	OneLake (unified)
Data warehouse	Dedicated SQL pool	Warehouse item
ETL	ADF pipelines	Data Factory pipelines (Fabric-native)
Lake queries	Spark/SQL endpoints	Lakehouse + SQL endpoint
Semantic model	Power BI dataset	Same but native in Fabric

Final Preparation Tips

From real interview experiences:

✅ DO:

Practice explaining Fabric as “SaaS for all data workloads”
Prepare real project examples that use multiple Fabric components
Understand capacity units (CUs) and smoothing window concept
Know when to choose Lakehouse vs Warehouse with specific trade-offs

❌ DON’T:

Treat Fabric like “just another Synapse module” (this is a common failure point)
Claim Direct Lake mode is always better than Import mode
Ignore security and governance (it’s a top interview focus)
Forget about Purview integration for enterprise scenarios

Remember: Microsoft doesn’t test syntax; they test how you think about data as a unified, governed, and scalable ecosystem. Focus on architectural understanding and trade-off analysis in your answers.

Core Concepts and Architecture

Workloads and Components

Security, Governance, and Administration

Performance, Optimization, and Advanced Topics

Scenario-Based and Practical Questions

Other Common Questions

1. What is Microsoft Fabric?

Answer

Interview Answer

2. Why was Microsoft Fabric introduced?

Answer

3. What are the core components of Microsoft Fabric?

Answer

4. What is OneLake?

Answer

Interview Answer

5. What is the significance of OneLake?

Answer

6. Explain the Medallion Architecture in Fabric

Answer

Bronze

Silver

Gold

7. What is a Lakehouse?

Answer

8. Difference Between Data Lake and Lakehouse

9. What is a Warehouse in Fabric?

Answer

10. Lakehouse vs Warehouse

11. What is Delta Lake?

Answer

12. What are Delta Tables?

Answer

13. Explain Time Travel

Answer

14. What is Data Factory in Fabric?

Answer

15. Pipeline vs Dataflow

16. What are Dataflows Gen2?

Answer

17. What is Data Engineering in Fabric?

Answer

18. What is a Notebook?

Answer

19. What is Spark?

Answer

20. What is Spark Pool?

Answer

21. Explain Fabric Capacity

Answer

22. What is Capacity Throttling?

Answer

23. How do you optimize Fabric Capacity?

Answer

24. What is Direct Lake?

Answer

25. Direct Lake vs Import Mode

26. Direct Lake vs DirectQuery

27. Explain Data Activator

Answer

28. What is Real-Time Analytics?

Answer

29. What is Eventstream?

Answer

30. What is KQL?

Answer

31. Explain Fabric Security

Answer

32. What is Row-Level Security (RLS)?

Answer

33. What is Object-Level Security (OLS)?

Answer

34. What is Purview Integration?

Answer

35. Explain Data Lineage

Answer

36. What are Shortcuts in OneLake?

Answer

37. Why are Shortcuts important?

Answer