Design data processing systems
Overview
This standard is about designing data processing systems.
Data engineers prepare data for analytical or operational uses. They are typically responsible for designing and building data pipelines to bring together information from different source systems. They integrate, consolidate, cleanse and structure data to make it easily accessible and in a usable format. Processed data can then be used by business executives, data analysts and other end users to inform organisational processes and decision making.
Designing data processing systems involves selecting data storage systems, designing data pipelines, and implementing design approaches for security, scalability, efficiency, flexibility and portability. It also includes testing and documenting data processing design solutions.
This standard is for those who need to design data processing systems as part of their duties.
Performance criteria
You must be able to:
- Review requirements to plan data processing system specifications
- Translate business requirements into accurate data solution designs and roadmaps
- Produce and interpret data models to determine the relationships and data flows required
- Identify opportunities to reuse existing data flows to improve efficiency
- Reverse engineer data models from live systems to model data structures
- Design data extraction and manipulation routines to process data into usable information
- Identify data flows and data lineage to show which parts of the organisation generate and utilise data, and how data moves through the organisation
- Develop designs of data models, data pipelines and data services in line with organisational requirements
- Select storage technologies for data system designs in line with organisational requirements
- Verify the organisational procedures for data security and compliance to inform data system designs
- Review and apply organisational standards for scalability, efficiency, reliability, availability, flexibility and portability to inform data system designs
- Document data system design solutions in line with organisational standards
- Automate manual data flows to enable scaling and repeatable use
- Test data system solutions in line with system requirements
- Document data processing designs to inform system developers
- Provide technical advice and guidance to support system developers and end users
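As an illustration of the data flow and data lineage criterion above, the following is a minimal sketch in Python, not a prescribed method. It assumes data flows can be recorded as simple (source, target) pairs. The dataset names in `FLOWS` and the helper functions `build_graph`, `reachable`, `upstream` and `downstream` are hypothetical and exist only for this example.

```python
from collections import defaultdict, deque

# Hypothetical data flows as (source, target) pairs, invented for illustration.
FLOWS = [
    ("crm", "staging.customers"),
    ("billing", "staging.invoices"),
    ("staging.customers", "warehouse.customer_dim"),
    ("staging.invoices", "warehouse.revenue_fact"),
    ("warehouse.customer_dim", "report.churn"),
    ("warehouse.revenue_fact", "report.churn"),
]


def build_graph(flows, reverse=False):
    """Return an adjacency map; reverse=True maps each target to its sources."""
    graph = defaultdict(set)
    for source, target in flows:
        if reverse:
            graph[target].add(source)
        else:
            graph[source].add(target)
    return graph


def reachable(graph, start):
    """Breadth-first walk returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def upstream(flows, dataset):
    """Every dataset that feeds, directly or indirectly, into `dataset`."""
    return reachable(build_graph(flows, reverse=True), dataset)


def downstream(flows, dataset):
    """Every dataset that depends, directly or indirectly, on `dataset`."""
    return reachable(build_graph(flows), dataset)
```

Calling `upstream(FLOWS, "report.churn")` lists the systems that generate the data behind a report, while `downstream(FLOWS, "crm")` lists what would be affected by a change to a source system. In practice this information is often held in a data catalogue or lineage tool rather than in hand-written code.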
Knowledge and understanding
You need to know and understand:
- That organisational data is an asset with unique properties that influence its management
- The data management practices used by an organisation to maintain high quality data
- The types of data structures and architectures used in organisations
- The main principles of data access, privacy and security and how to apply these to data design
- How to develop, interpret and compare data models
- Where to use different types of data models
- The industry standard tools for data design and how to apply them
- Cloud data platforms and data storage technologies
- How to map storage systems to data processing requirements
- Industry standard data modelling patterns and standards
- Industry standard systems used for on-premises and cloud-based data storage and their implications for data privacy
- How to keep up to date with data technologies and platforms
- Corporate, industry and professional data standards
- Industry and organisational standards for data management including scalability, efficiency, reliability, availability, flexibility, portability and quality
- How to incorporate security into data processing design
- The data policy, legislative, regulatory and operational constraints which exist within the organisation
- The industry standard technologies and design principles involved in batch and streaming data processing
- The steps involved in data staging and how to apply them
- How to document data designs