Assist in developing and validating machine learning solutions

URN: TECIS805301

Business Sectors (Suites): IT (Data Science)

Developed by: e-skills

Approved on: 2020

Overview

This standard identifies the competences you need to assist in the development of machine learning algorithms and their implementation, in accordance with approved procedures.

You will be required to assist in the development of machine learning algorithms. This encompasses the machine learning workflow; supervised and unsupervised learning; the creation of training and test datasets; fitting classifier, regression and clustering models; and interpreting the results.

Machine learning algorithms are used in a wide variety of applications where it is difficult or infeasible to develop a conventional algorithm to perform the task. Your work will involve the practical use of software tools for machine learning algorithm development. You will be able to assess the performance of a developed model and identify the role of training and test datasets in this process.
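
The role of training and test datasets in assessing a model can be illustrated with a minimal sketch. It assumes a hypothetical toy dataset and a hand-written nearest neighbour classifier using only the Python standard library; the function names are illustrative, and in practice an approved industry tool would normally be used instead.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold back a fraction as an unseen test set."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def nearest_neighbour_predict(train, features):
    """Return the label of the training row closest to `features` (1-NN)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    closest = min(train, key=lambda row: squared_distance(row[0], features))
    return closest[1]


def accuracy(train, test):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(
        1 for features, label in test
        if nearest_neighbour_predict(train, features) == label
    )
    return correct / len(test)


# Hypothetical toy dataset of (features, label) pairs: two well-separated groups.
data = [((i * 0.1, i * 0.1), "a") for i in range(10)]
data += [((5.0 + i * 0.1, 5.0 + i * 0.1), "b") for i in range(10)]

train, test = train_test_split(data)
test_accuracy = accuracy(train, test)
print(f"{len(train)} training rows, {len(test)} test rows, accuracy {test_accuracy:.2f}")
```

The model only ever sees the training rows; the accuracy is measured on rows it has not seen, which is what makes the figure a fair estimate of performance on new data.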

Your responsibilities will require you to comply with organisational policies and procedures. You will be expected to work to instructions, alone or in conjunction with others, taking personal responsibility for your own actions.

Your underpinning knowledge will be sufficient to provide a sound basis for your work and will enable you to apply the required procedures for the development and testing of machine learning algorithms. You will recognise the importance of good quality data and the role of feature selection in the learning process.
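
One simple way to see the role of feature selection is to rank candidate features by how strongly each one correlates with the target. The sketch below assumes hypothetical column names and values, and uses absolute Pearson correlation as the ranking score; this is only one of several possible selection methods and is not prescribed by this standard.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return covariance / (spread_x * spread_y)


def rank_features(columns, target):
    """Order feature names from most to least correlated with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: one informative feature, one weak feature, one constant.
columns = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "shoe_size": [7, 9, 8, 7, 9, 8],
    "constant": [1, 1, 1, 1, 1, 1],
}
target = [52, 55, 61, 64, 70, 73]

ranking = rank_features(columns, target)
print(ranking)
```

A constant column carries no information about the target, so it scores zero and falls to the bottom of the ranking.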

This activity is increasingly found in any sector or organisation, in particular those associated with implementing automated reasoning systems that learn to respond based on the training datasets provided. It is likely to be undertaken by people working as Junior Machine Learning Specialists or Junior Machine Learning Engineers.

Performance criteria

You must be able to:

perform data extraction, preparation and transformation in order to produce required datasets
assist with data cleaning of noisy or incomplete data, or data with established data quality issues, using approved tools and techniques
assist in selecting and applying statistical tools to generate descriptive statistics from different datasets for the purpose of feature selection
assist with creating analytical models using approved modelling techniques to model structured data
assist in applying best-practice model fit testing and validation techniques to assess model performance
select a classifier algorithm, and use industry standard software tools to load a dataset and produce a machine learning classifier model
apply feature selection and linear methods in order to perform variable reduction to improve the performance of models
assist in evaluating the measures of model fit for a classifier model
assist in identifying the features of a classifier model
document the modelling process in line with organisational standards
assist in producing documentation in order to secure implementation sign-off
produce visualisations, charts and graphs to communicate data and the modelling process within required timescales
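
The data cleaning and descriptive statistics criteria above can be sketched as follows. The records, column names and the choice of median imputation are assumptions made for illustration; an organisation's approved tools and techniques may differ.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
raw = [
    {"id": 1, "height_cm": 170.0, "weight_kg": 65.0},
    {"id": 2, "height_cm": None, "weight_kg": 80.0},
    {"id": 2, "height_cm": None, "weight_kg": 80.0},  # exact duplicate row
    {"id": 3, "height_cm": 180.0, "weight_kg": None},
    {"id": 4, "height_cm": 160.0, "weight_kg": 55.0},
]


def drop_duplicates(records):
    """Keep the first occurrence of each fully identical record."""
    seen, unique = set(), []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def impute_median(records, column):
    """Replace missing values in `column` with the median of the observed values."""
    observed = [r[column] for r in records if r[column] is not None]
    fill = statistics.median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]


def describe(records, column):
    """Basic descriptive statistics for one numeric column."""
    values = [r[column] for r in records]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


clean = drop_duplicates(raw)
for column in ("height_cm", "weight_kg"):
    clean = impute_median(clean, column)

summary = {column: describe(clean, column) for column in ("height_cm", "weight_kg")}
for column, stats in summary.items():
    print(column, stats)
```

Duplicates are removed before imputation so that a repeated row does not skew the median used to fill the gaps.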

Knowledge and Understanding

You need to know and understand:

the purpose, key features and applications of machine learning
the role that algorithms play in machine learning
the need to validate machine learning models and how to carry this out
the basic concepts of variable creation and reduction in data analysis
how variables and features impact model performance in testing and validating analytical models
the potential data quality issues that can arise, including missing values, duplicate data and incorrect data, and how to deal with these
the implications of data quality for analysis model performance
the importance of feature selection in effective machine learning
the industry standard approaches used for data cleansing and how to apply them
the difference between supervised, unsupervised and reinforcement learning
the purpose of training and testing datasets in developing and evaluating a machine learning model
the feature identification stage used in model development and how to perform this
the industry standard statistical methods and best-practice modelling techniques (including classifier, regression and clustering models) used to develop machine learning solutions
the main types of algorithm used to develop machine learning solutions, including decision trees, nearest neighbour and linear classifiers
the industry standard tools used for implementing algorithms and developing machine learning models
the characteristics of under-fitting and over-fitting in a classifier model
the different approaches to model improvement in a classifier problem
the measures of performance that can be used in model development, covering classification, prediction and clustering
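
For the classification case, common measures of performance can be computed directly from the counts of correct and incorrect predictions. The sketch below assumes a hypothetical two-class problem with made-up labels and predictions; the function names are illustrative rather than taken from any particular tool.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive and a != positive:
            fp += 1
        elif p != positive and a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_measures(actual, predicted, positive):
    """Accuracy, precision, recall and F1 score for a binary classifier."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(actual),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical true labels and model predictions for eight test messages.
actual = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
predicted = ["spam", "spam", "ham", "spam", "ham", "ham", "ham", "ham"]

measures = classification_measures(actual, predicted, positive="spam")
print(measures)
```

Computing the same measures on both the training data and the test data is one way to spot over-fitting: a model that scores much higher on the data it was trained on than on unseen data has probably memorised the training set. Low scores on both usually point to under-fitting.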
