Data lineage is the end-to-end record of a dataset’s origins, transformations, movements, and usage over time. It maps how data flows across systems, from source to storage to consumption, and captures each step in that lifecycle. With this record, anyone working with a dataset can trace where it came from, how it has been altered, and where it ends up.
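One common way to picture such a record is as a directed graph, with datasets as nodes and transformations as edges. The sketch below is a minimal illustration of that idea, not a reference to any particular lineage tool: the `LineageGraph` class, its method names, and the dataset names (`crm.orders`, `warehouse.orders_raw`, and so on) are all hypothetical.

```python
class LineageGraph:
    """Toy lineage store: nodes are dataset names, edges are transformations."""

    def __init__(self):
        # target dataset -> list of (source dataset, transformation label)
        self._parents = {}
        # source dataset -> list of target datasets
        self._children = {}

    def record(self, source, target, transformation):
        """Record that `target` was produced from `source` by `transformation`."""
        self._parents.setdefault(target, []).append((source, transformation))
        self._children.setdefault(source, []).append(target)

    def upstream(self, dataset):
        """Every dataset that `dataset` derives from, directly or indirectly."""
        seen, stack = set(), [dataset]
        while stack:
            node = stack.pop()
            for parent, _label in self._parents.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def downstream(self, dataset):
        """Every dataset that is derived from `dataset`, directly or indirectly."""
        seen, stack = set(), [dataset]
        while stack:
            node = stack.pop()
            for child in self._children.get(node, []):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


# Hypothetical flow: source system -> raw storage -> cleaned table -> dashboard.
graph = LineageGraph()
graph.record("crm.orders", "warehouse.orders_raw", "nightly extract")
graph.record("warehouse.orders_raw", "warehouse.orders_clean", "deduplicate")
graph.record("warehouse.orders_clean", "dashboard.revenue", "aggregate by month")

origins = graph.upstream("dashboard.revenue")   # where the dashboard's data comes from
consumers = graph.downstream("crm.orders")      # where the source data ends up
```

Walking the graph backward answers "where did this come from?", and walking it forward answers "what depends on this?". Those are the two questions the definition above highlights.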