Section: IT & Technology · Databases
Difficulty: Medium

ETL


A data integration process that extracts data from sources, transforms it, and loads it into a target.

Also: Extract, Transform, Load

Definition

ETL (extract, transform, load) is a data pipeline pattern that extracts raw data from source systems (databases, APIs, files), transforms it by cleaning, normalizing, deduplicating, and enriching it, then loads it into a destination such as a data warehouse or data lake. Applied consistently, it helps keep data quality and formats uniform across analytical systems. Modern ELT variants load the raw data first and transform it inside the warehouse. Common tools include Apache Spark and Informatica for processing, dbt for in-warehouse transformations, and Apache Airflow for orchestration.
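To make the ELT ordering concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a warehouse. The table names raw_events and clean_users, the column layout, and the function run_elt are hypothetical and chosen only for illustration; a real ELT setup would typically run the SQL step in the warehouse engine itself.

```python
import sqlite3


def run_elt(warehouse, raw_rows):
    """Load raw rows first, then transform them inside the warehouse.

    `warehouse` is a sqlite3 connection standing in for a real warehouse;
    `raw_rows` is an iterable of (user_id, email) tuples (assumed shape).
    """
    # Load: raw rows land in a staging table untouched.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS raw_events (user_id INTEGER, email TEXT)"
    )
    warehouse.executemany("INSERT INTO raw_events VALUES (?, ?)", raw_rows)

    # Transform: normalize and deduplicate with SQL, inside the warehouse.
    warehouse.execute("DROP TABLE IF EXISTS clean_users")
    warehouse.execute(
        "CREATE TABLE clean_users AS "
        "SELECT DISTINCT user_id, lower(trim(email)) AS email "
        "FROM raw_events"
    )
    warehouse.commit()
```

The only difference from classic ETL is where the transform runs: here the raw data is stored before any cleaning, so the original rows stay available for later reprocessing.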

Example

A nightly ETL job extracts transaction records from an OLTP database, converts currencies to USD, removes duplicate records, and loads the cleaned data into the analytics data warehouse for morning reports.
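The job described above could be sketched roughly as follows, using Python's built-in sqlite3 module for both the source and the target. Everything specific here is an assumption made for illustration: the table names transactions and fact_transactions, the columns, the fixed rates in RATES_TO_USD, and the choice to treat a repeated txn_id as a duplicate.

```python
import sqlite3

# Hypothetical fixed exchange rates; a real job would look these up per day.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.10, "GBP": 1.25}


def extract(source):
    """Extract: read raw transaction rows from the source (OLTP) database."""
    return source.execute(
        "SELECT txn_id, amount, currency FROM transactions"
    ).fetchall()


def transform(rows):
    """Transform: convert amounts to USD and drop repeated txn_id values."""
    seen = set()
    cleaned = []
    for txn_id, amount, currency in rows:
        if txn_id in seen:
            continue  # duplicate record, keep only the first occurrence
        seen.add(txn_id)
        cleaned.append((txn_id, round(amount * RATES_TO_USD[currency], 2)))
    return cleaned


def load(warehouse, rows):
    """Load: write the cleaned rows into the warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_transactions "
        "(txn_id INTEGER PRIMARY KEY, amount_usd REAL)"
    )
    # INSERT OR REPLACE keeps the job safe to rerun for the same rows.
    warehouse.executemany(
        "INSERT OR REPLACE INTO fact_transactions VALUES (?, ?)", rows
    )
    warehouse.commit()


def run_etl(source, warehouse):
    """Run the three stages in order."""
    load(warehouse, transform(extract(source)))
```

In practice a scheduler or orchestrator would call run_etl once per night with connections to the real source and warehouse; the three small functions keep each stage testable on its own.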

Synonyms

  • data pipeline
  • data integration
  • data transformation pipeline
  • ELT

Antonyms / Opposites

  • real-time streaming
  • raw data access

Tags

  • data-warehouse
  • data-lake
  • sql
  • replication
