Cleanse data to resolve quality issues

URN: TECDT80843
Business Sectors (Suites): IT(Data Science)
Developed by: e-skills
Approved on: 30 Mar 2023

Overview

This standard is about cleansing data to resolve quality issues.

Data engineers prepare data for analytical or operational uses. They are typically responsible for designing and building data pipelines to bring together information from different source systems. They integrate, consolidate, cleanse and structure data to make it easily accessible and in a usable format. Processed data can then be used by business executives, data analysts and other end users to inform organisational processes and decision making.

Cleansing data involves profiling datasets, organising data, tidying data, and cleansing datasets to resolve data quality issues. This also includes documenting quality issues and cleansing activities applied to resolve them.

This standard is for those who need to cleanse data to resolve quality issues as part of their duties.


Performance criteria

You must be able to:

  1. Review the organisation's data architecture and data types to inform data cleansing practice

  2. Organise data to arrange it into accessible structures for cleansing

  3. Profile datasets to identify data quality issues and cleansing needs

  4. Develop a data cleansing strategy to maintain high data integrity

  5. Investigate and resolve data quality issues in line with organisational procedures

  6. Apply data cleansing tools to datasets to filter out unwanted data 

  7. Write and execute tests to validate data quality

  8. Automate data cleansing processes to improve accuracy and efficiency

  9. Document data quality metrics, issues and resolutions in line with organisational procedures
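
As an illustration only, criteria 3, 6, 7 and 9 above could be sketched in Python as a small profile, cleanse, validate and log sequence. This is a minimal sketch under stated assumptions, not a prescribed method: the sample records, the field names (id, email, age) and the function names are all hypothetical, and a real organisation would follow its own procedures and tooling.

```python
from collections import Counter

# Hypothetical sample records; the field names are illustrative assumptions.
FIELDS = ("id", "email", "age")
RAW = [
    {"id": 1, "email": " Ann@Example.com ", "age": "34"},
    {"id": 2, "email": "bob@example.com", "age": ""},
    {"id": 2, "email": "bob@example.com", "age": ""},  # exact duplicate row
    {"id": 3, "email": None, "age": "41"},
]


def profile(rows):
    """Profile a dataset: count exact duplicate rows and missing values per field."""
    missing = Counter()
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row[f] for f in FIELDS)
        if key in seen:
            duplicates += 1
        seen.add(key)
        for f in FIELDS:
            if row[f] is None or str(row[f]).strip() == "":
                missing[f] += 1
    return {"rows": len(rows), "duplicates": duplicates, "missing": dict(missing)}


def cleanse(rows):
    """Drop exact duplicates and normalise values; return (clean_rows, log)."""
    clean, log, seen = [], [], set()
    for row in rows:
        key = tuple(row[f] for f in FIELDS)
        if key in seen:
            # Record the action so it can be documented later.
            log.append(f"dropped duplicate id={row['id']}")
            continue
        seen.add(key)
        email = row["email"].strip().lower() if row["email"] else None
        age_text = str(row["age"] or "").strip()
        age = int(age_text) if age_text.isdigit() else None
        clean.append({"id": row["id"], "email": email, "age": age})
    return clean, log


def validate(rows):
    """Simple data quality tests; raises AssertionError if a rule is broken."""
    ids = [r["id"] for r in rows]
    assert len(ids) == len(set(ids)), "ids must be unique"
    for r in rows:
        assert r["email"] is None or r["email"] == r["email"].strip().lower()
        assert r["age"] is None or isinstance(r["age"], int)


before = profile(RAW)
clean_rows, cleansing_log = cleanse(RAW)
validate(clean_rows)
after = profile(clean_rows)
```

Comparing `before` with `after`, together with `cleansing_log`, gives a basic record of the issues found and the actions taken. Missing values are kept as `None` rather than filled in, because how to treat them is a decision for the organisation's cleansing strategy.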


Knowledge and Understanding

You need to know and understand:

  1. How data is structured, where it is located and how it flows in organisational processes
  2. The organisation's data architecture, models and data types used
  3. Why it is important to cleanse data
  4. The range of common data quality issues that can arise in data and how to check for them, including duplicate data, missing values, null data and outliers
  5. The steps involved in profiling datasets to identify quality issues
  6. How to cleanse datasets to resolve data quality issues
  7. The industry standard tools and techniques used to cleanse data and how to apply them
  8. How to measure and report data quality metrics
  9. The supplementary performance issues that can occur in connection with data quality and how to resolve them
  10. The main steps involved in tidying and cleansing data and how to apply them
  11. The tools and techniques used to monitor data quality within an organisation
  12. Industry best practice strategies used to improve data quality
  13. How to document data quality metrics, issues and resolutions
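
To make items 4 and 8 above more concrete, the following is a hedged sketch of how duplicate data, missing values and outliers might be checked and reported as simple metrics. The metric names (completeness, uniqueness), the sample columns and the interquartile-range rule with a multiplier of 1.5 are assumptions chosen for illustration; they are one common convention, not a requirement of this standard.

```python
import statistics


def completeness(values):
    """Share of values that are present (not None and not blank).

    Assumes a non-empty list.
    """
    present = [v for v in values if v is not None and str(v).strip() != ""]
    return len(present) / len(values)


def uniqueness(values):
    """Share of values that are distinct. Assumes a non-empty list."""
    return len(set(values)) / len(values)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k interquartile ranges outside Q1..Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical columns from an imagined orders dataset.
order_ids = [101, 102, 102, 103]
postcodes = ["AB1 2CD", None, "", "EF3 4GH"]
amounts = [10, 12, 11, 13, 12, 11, 250]

report = {
    "order_id_uniqueness": uniqueness(order_ids),
    "postcode_completeness": completeness(postcodes),
    "amount_outliers": iqr_outliers(amounts),
}
print(report)
```

A flagged outlier is only a candidate for investigation. Whether it is an error or a genuine extreme value depends on the business context.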

Version Number

1

Indicative Review Date

30 Mar 2026

Validity

Current

Status

Original

Originating Organisation

ODAG Consultants Ltd.

Original URN

TECDT80843

Relevant Occupations

Information and Communication Technology Professionals

SOC Code

2134

Keywords

data engineering, data cleansing, data design, data processing