ETL
A data integration process that extracts data from sources, transforms it, and loads it into a target.
Also: Extract, Transform, Load
Definition
ETL is a data pipeline pattern that extracts raw data from source systems (databases, APIs, files), transforms it by cleaning, normalizing, deduplicating, and enriching it, then loads it into a destination such as a data warehouse or data lake. Applying the transformations before loading helps keep data quality and consistency uniform across analytical systems. The modern ELT variant reverses the last two steps: it loads raw data first and transforms it inside the warehouse. Common tools include Apache Spark and Informatica for ETL processing, dbt for in-warehouse transformation, and Apache Airflow for orchestration.
Example
“A nightly ETL job extracts transaction records from an OLTP database, converts currencies to USD, removes duplicate records, and loads the cleaned data into the analytics data warehouse for morning reports.”
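The job described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: it assumes in-memory SQLite databases stand in for the OLTP source and the warehouse, and the table names (`transactions`, `fact_transactions`), column names, and fixed exchange rates are all hypothetical.

```python
import sqlite3

# Hypothetical fixed exchange rates to USD; a real job would look up dated rates.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25, "GBP": 1.5}


def extract(source):
    """Extract: read raw transaction rows from the source database."""
    return source.execute(
        "SELECT txn_id, amount, currency FROM transactions"
    ).fetchall()


def transform(rows):
    """Transform: convert amounts to USD and drop repeated transaction IDs."""
    seen = set()
    cleaned = []
    for txn_id, amount, currency in rows:
        if txn_id in seen:
            continue  # duplicate record, keep only the first occurrence
        seen.add(txn_id)
        cleaned.append((txn_id, round(amount * RATES_TO_USD[currency], 2)))
    return cleaned


def load(target, rows):
    """Load: write the cleaned rows into the warehouse table."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS fact_transactions "
        "(txn_id INTEGER PRIMARY KEY, amount_usd REAL)"
    )
    target.executemany(
        "INSERT OR REPLACE INTO fact_transactions VALUES (?, ?)", rows
    )
    target.commit()


def run_demo():
    """Run the three steps end to end on a tiny made-up data set."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE transactions (txn_id INTEGER, amount REAL, currency TEXT)"
    )
    source.executemany(
        "INSERT INTO transactions VALUES (?, ?, ?)",
        [(1, 100.0, "EUR"), (2, 40.0, "GBP"), (2, 40.0, "GBP"), (3, 10.0, "USD")],
    )
    target = sqlite3.connect(":memory:")
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT txn_id, amount_usd FROM fact_transactions ORDER BY txn_id"
    ).fetchall()


if __name__ == "__main__":
    print(run_demo())
```

Keeping extract, transform, and load as separate functions mirrors the three stages of the pattern. An ELT variant would instead load the raw rows first and perform the currency conversion and deduplication with SQL inside the target.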
Synonyms
- data pipeline
- data integration
- data transformation pipeline
- ELT
Antonyms / Opposites
- real-time streaming
- raw data access
Related Terms
- data-warehouse
- data-lake
- sql
- replication
