
Master Data Quality Accelerators


Fast Track Solution For South African Party Master Data Quality

Backed by our proven Data Quality Management Methodology, InfoBluePrint has developed specifications, processes, templates and reference data for South African natural person and juristic entity party data (e.g. customer, supplier, agent, intermediary and employee). These assets can be re-used to apply data quality rules consistently and reliably, both architecturally and programmatically.

The benefits of this solution are:

  • Faster time to value and better ROI on a data quality software purchase
  • Avoids the time-consuming and costly trial-and-error development that is inherent to data quality and data cleansing work, particularly when starting with a new tool and a "blank page"
  • Immediate access to tried and tested reference data gathered over many years, specifically for South Africa
  • Cleansing performed to industry standards where possible, e.g. SANS 1883
  • Consistent and reliable results
  • Common data model allows any data source and any structure to benefit from the same data cleansing code
  • Better performance and throughput due to scalable design and delta processing capability (if integrated with an ETL tool)
  • Audited programmatic data cleansing: detailed visibility of what was cleansed and why
  • Modular design for complete customisation and easy maintenance

How can InfoBluePrint help?


Automatic Data Classification: i.e. Natural Person vs Organisation, based on actual content rather than unreliable capturing practices.

Sub-Type Classification: SA ID, Temp Visa, Private Co., Trust, NGO, Medical, School, etc.

Fixing of Mis-fielded Data and Derivation of Missing Data.

Parsing, Cleansing and Standardising of: Names (personal, legal, trading, maiden); ID, Company Registration and VAT numbers; Addresses (all classes); Telephone numbers (all classes); Email; Banking details; and more.

Householding: various categorisations, e.g. via name, address, email, banking details, etc.

Data Quality Assessment, Validation & Scoring.


The main features of this solution

This solution can be built upon and customised to client-specific requirements.


Data Quality Improvement Methodology

InfoBluePrint has developed a formal methodology for all aspects of Data Quality Improvement. The methodology is supported by comprehensive Process Definitions, Checklists, Templates, Job Aids, Education Material and, where relevant, Technology Specific Material.

Reporting Templates

Generic Reporting templates are used to ensure that data quality reporting is done in a consistent and presentable manner, for all levels of an organisation. Where relevant, DQ Tool-specific reporting (e.g. via dashboards) is leveraged for comprehensive data quality monitoring.

Generic InfoBluePrint Data Model

The InfoBluePrint (IBP) Model is a logical data model for demographic data that is implemented physically in a relational database. Client-specific data is mapped against this model, and generic rules are then applied to the model to Validate, Standardise, Parse, Correct and Augment that data. This approach allows for maximum flexibility because it caters for any data source input, regardless of structural variation.

South African Reference Data

Reference data is used within the context of the InfoBluePrint Model to support validation and standardisation of attributes for both natural and juristic persons, such as name prefixes and suffixes, titles, address types, common misspellings, noise data, company types, ID types, telephone area codes, telephone types and common representations of various party data attributes. Reference Data is used both for Validation and Data Cleansing.
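As a rough illustration of how reference data can serve both validation and cleansing, the sketch below flags noise values and standardises company type suffixes. The value sets, the function names and the output format are all hypothetical placeholders invented for this example; they are not the InfoBluePrint reference data, which is far more extensive.

```python
import re

# Hypothetical reference data (illustrative only, stored in normalised form).
NOISE_VALUES = {"", "NA", "NONE", "UNKNOWN", "TBA", "XXX"}
COMPANY_TYPE_VARIANTS = {
    "PTY LTD": "(Pty) Ltd",
    "PROPRIETARY LIMITED": "(Pty) Ltd",
    "CLOSE CORPORATION": "CC",
    "CC": "CC",
    "LIMITED": "Ltd",
    "LTD": "Ltd",
    "NPC": "NPC",
}


def normalise_key(value):
    """Upper-case, drop punctuation and collapse spaces so variants compare equal."""
    stripped = re.sub(r"[^A-Z0-9 ]", "", value.upper())
    return " ".join(stripped.split())


def is_noise(value):
    """Validation use: True when the value is a known placeholder, not real data."""
    return normalise_key(value) in NOISE_VALUES


def standardise_company_type(name):
    """Cleansing use: split a name into (base name, standard company type).

    Longer variants are tried first so that 'PTY LTD' wins over 'LTD'.
    Returns (normalised name, None) when no known suffix is found.
    """
    key = normalise_key(name)
    for variant in sorted(COMPANY_TYPE_VARIANTS, key=len, reverse=True):
        suffix = " " + variant
        if key.endswith(suffix):
            return key[: -len(suffix)], COMPANY_TYPE_VARIANTS[variant]
    return key, None


print(is_noise("N/A"))
print(standardise_company_type("Acme Trading (PTY) LTD."))
```

Keeping the value sets as data rather than code means new variants can be added without touching the cleansing logic.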

AfriGIS Integration

For South African address reference data, InfoBluePrint has integrated with the AfriGIS data set for postcode, suburb and town.

DQ Technical Architecture

The InfoBluePrint Technical Architecture defines the configuration requirements for implementing the relevant DQ Tool so that it delivers consistent, auditable and high-performance data quality processes. This is the architecture within which data is validated, cleansed, augmented and so on.

Generic Mapping Specification

The InfoBluePrint Generic Mapping Specification provides for Client specific demographic data to be mapped and loaded to the InfoBluePrint Model. Once loaded to the InfoBluePrint Model, data can then be validated and cleansed against standard generic rules, without having to change code for data source variations.
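The idea of mapping differently structured sources onto one model, and then running the same rule over all of them, can be sketched as follows. The `PartyRecord` fields, the source system names and the column names are assumptions made up for this illustration; the actual InfoBluePrint Model and mapping specification are not described at this level of detail here.

```python
from dataclasses import dataclass


@dataclass
class PartyRecord:
    """Hypothetical, heavily simplified stand-in for a common party model."""
    source_system: str
    source_id: str
    full_name: str = ""
    id_number: str = ""
    email: str = ""


# Hypothetical mapping specification: target attribute -> source column.
MAPPINGS = {
    "crm": {"source_id": "cust_no", "full_name": "cust_name",
            "id_number": "id_no", "email": "email_addr"},
    "billing": {"source_id": "ACC", "full_name": "HOLDER", "id_number": "IDNUM"},
}


def map_to_model(source_system, row):
    """Load one source row into the common model using its mapping spec."""
    values = {}
    for target, column in MAPPINGS[source_system].items():
        raw = row.get(column)
        values[target] = "" if raw is None else str(raw).strip()
    return PartyRecord(source_system=source_system, **values)


def has_plausible_email(record):
    """A generic rule written once against the model, not against any source."""
    local, _, domain = record.email.partition("@")
    return bool(local) and "." in domain


rows = [
    ("crm", {"cust_no": 17, "cust_name": " Thandi Nkosi ",
             "id_no": None, "email_addr": "thandi@example.com"}),
    ("billing", {"ACC": "B-204", "HOLDER": "P Naidoo", "IDNUM": "8001015009087"}),
]
for system, row in rows:
    record = map_to_model(system, row)
    print(record.source_system, record.source_id, has_plausible_email(record))
```

Adding a new source in this sketch means adding one entry to `MAPPINGS`; the rule code does not change.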

Generic Business Rules

Generic Business Rules for South African party data are defined in Business terms and are applicable for attributes, and combinations of attributes, of the InfoBluePrint Model.

Generic Validation Rules

Generic Validation Rules are the technical implementation of Business Rules and are applied in the relevant DQ Tool against the InfoBluePrint Model.
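To make the step from business rule to validation rule concrete, here is a minimal sketch for one rule: "a South African ID number must be well formed". It checks the publicly known structure of the 13-digit ID number (six-digit date of birth followed by further digits, ending in a Luhn check digit). The function names and the returned rule labels are assumptions for this example, not the InfoBluePrint rule set, and the two-digit year is interpreted using Python's default century pivot.

```python
from datetime import datetime


def luhn_valid(digits):
    """Standard Luhn check: double every second digit from the right."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def validate_sa_id(value):
    """Return the list of failed rule names; an empty list means the value passed."""
    if len(value) != 13 or not value.isdigit():
        return ["format"]  # remaining checks are meaningless without 13 digits
    failures = []
    try:
        # Assumption: %y century pivot is acceptable for a plausibility check.
        datetime.strptime(value[:6], "%y%m%d")
    except ValueError:
        failures.append("date_of_birth")
    if not luhn_valid(value):
        failures.append("checksum")
    return failures


print(validate_sa_id("8001015009087"))  # -> []
print(validate_sa_id("8001015009086"))  # -> ['checksum']
```

Returning the names of the failed rules, rather than a bare true or false, is one way to keep validation results auditable.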

Generic Standardisation Rules

Standardisation Rules are associated with Data Cleansing and are normally used in conjunction with Data Parsing. Data Standardisation is used to provide a consistent format and structure to attributes such as title, initials, address elements, name prefixes and suffixes and many others: in fact, any attribute whose values should be drawn from value sets. See “Generic Process Specifications” below for additional detail.
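A small sketch of parsing and standardisation working together on a personal name is shown below. The title variants, the surname prefix list and the output structure are hypothetical and deliberately tiny; real names need far more reference data and rules than this.

```python
# Hypothetical value sets (illustrative only).
TITLE_VARIANTS = {
    "MR": "Mr", "MNR": "Mr", "MRS": "Mrs", "MEV": "Mrs",
    "MS": "Ms", "DR": "Dr", "PROF": "Prof", "PROFESSOR": "Prof",
}
SURNAME_PREFIXES = {"VAN", "DER", "DE", "DU", "LE", "VON"}


def parse_person_name(raw):
    """Parse a free-text name, then standardise title, initials and casing."""
    tokens = raw.replace(".", " ").split()
    title = None
    if tokens and tokens[0].upper() in TITLE_VARIANTS:
        title = TITLE_VARIANTS[tokens.pop(0).upper()]
    if not tokens:
        return {"title": title, "initials": "", "first_names": "", "surname": ""}

    # The surname is the last token plus any reference-data prefixes before it.
    start = len(tokens) - 1
    while start > 0 and tokens[start - 1].upper() in SURNAME_PREFIXES:
        start -= 1

    first_names = [token.capitalize() for token in tokens[:start]]
    surname_parts = [token.lower() for token in tokens[start:-1]]
    surname_parts.append(tokens[-1].capitalize())
    return {
        "title": title,
        "initials": "".join(name[0] for name in first_names),
        "first_names": " ".join(first_names),
        "surname": " ".join(surname_parts),
    }


print(parse_person_name("mr. john peter smith"))
print(parse_person_name("DR ANNA VAN DER MERWE"))
```

In this sketch the title comes from a value set, while the initials are derived from the parsed first names, which also illustrates derivation of a missing attribute.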

Generic Process Specifications

Generic DQ Tool Process Specifications are provided for the following:

  • Classification of party data into natural person vs juristic entity (individual vs organisation), based on actual data content anywhere in the record (rather than on unreliable field names and/or flags).
  • Sub-Type Classification based on identity data found within the data record, e.g. SA ID, Temporary Visa, Refugee, Private Company, Trust, NGO, Medical, School and others.
  • Parsing to move mis-fielded data into the correct fields, for example: “Trading As” names, “Attention of” data, and information typically mis-fielded in address fields.
  • Derivation of missing data from other data in the record.
  • Parsing, Cleansing and Standardising of: Name, Legal Name, Trading Name, Maiden Name, Nickname, ID, Co Reg No, Addresses (postal, physical, farm, site, international, informal), Telephone (fixed, mobile, premium rate), Email, Banking Details and more.
  • Quality Scoring of demographic data.
  • Householding along various lines, for example by name, address, email, banking details, or any combination thereof.
  • PAMSS data preparation to improve PAMSS scores.
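Two of the processes listed above, content-based classification and householding, can be sketched in a few lines. The patterns, marker words, field names and return labels below are assumptions chosen for illustration; they are not the InfoBluePrint process specifications, and a production classifier would need many more signals and safeguards.

```python
import re
from collections import defaultdict

# Illustrative content patterns (assumptions, not a complete rule set).
COMPANY_REG = re.compile(r"\b(?:19|20)\d{2}/\d{6}/\d{2}\b")  # e.g. 2001/012345/07
ID_NUMBER = re.compile(r"\b\d{13}\b")
ORG_MARKERS = re.compile(r"\b(PTY|LTD|LIMITED|CC|TRUST|NPC|INC|SCHOOL)\b", re.IGNORECASE)


def classify_party(record):
    """Classify on content found anywhere in the record, ignoring field names."""
    text = " ".join(str(value) for value in record.values() if value)
    if COMPANY_REG.search(text) or ORG_MARKERS.search(text):
        return "juristic"
    if ID_NUMBER.search(text):
        return "natural"
    return "unknown"


def household_by(records, *fields):
    """Group records sharing the same normalised values for the given fields.

    Records with any of the fields empty are left out; only groups with more
    than one member are returned.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(" ".join(str(record.get(field) or "").upper().split())
                    for field in fields)
        if all(key):
            groups[key].append(record)
    return [members for members in groups.values() if len(members) > 1]


records = [
    {"name": "Acme Trading", "ref": "2001/012345/07", "type_flag": "P",
     "address": "12 Main Rd"},
    {"name": "J Smith", "ref": "8001015009087", "address": "12  main rd"},
    {"name": "A Smith", "ref": "", "address": "12 Main Rd"},
]
for record in records:
    print(record["name"], classify_party(record))
print(len(household_by(records, "address")))
```

Note that the first record is classified as juristic from its registration number even though its captured flag suggests otherwise, which is the point of classifying on content rather than on flags.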