Channel: Concentra Blog » Microsoft

What you need in your data warehouse framework

Introduction

Every company that builds ETL processes for enterprise data warehouses for a living, or that develops and maintains its own data warehouses and ETL processes internally, eventually creates a standardised approach to the core functionality that enterprise-level data warehouses always require.

Advantages of a Data Warehouse framework

Creating a Data Warehouse framework provides a number of advantages.

1. Consistency of development

Development follows coding and design conventions, typically using templates. This ensures that new developers can easily pick up the standard approaches, and handover and support of the processes are typically easier.

2. Standardisation and automation of core functionality

Building core functionality into the framework is vital because it reduces repetitive development effort, ensures consistency of development and lets developers concentrate on the custom requirements without having to remember to implement standard processes.

3. Speed of development

This is a by-product of having pre-built standard functionality, templates and common components, which makes development cycles much shorter than developing processes from scratch.

Core Framework functionality

This is my list of core functionality that I believe should be standard in any enterprise Data Warehouse framework:

  • Log ALL data sources received and processed, with a standard set of progress reports
  • Automatic audit trail of all data flowing through the data loading process, giving the ability to track data from landing through to the data warehouse
  • ETL progress logging showing the history of all the ETL loading processes down to the task level. This can be used to identify bottlenecks or declining performance of the load process over time, as well as to show the current load progress
  • Data validation and data error logging down to row and column level values, with standard data error reports to assist data quality audits
  • Ability to easily roll data out of the system
  • Automatically handle duplicate data source items (reject/append/replace)
  • Automatically cater for multiple datasets sent by multiple data providers
  • Ability to add/cater for new data providers of existing data feeds with no additional development effort
  • Automatic filing/archiving of loaded flat files
  • Designed to handle any invalid data/files, to ensure other data loading is not affected
  • Modularised packages to allow granular control of data loading processes
  • Default behaviour for NULL/invalid/missing data
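To make one of these items concrete, the duplicate handling options (reject/append/replace) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how our framework is implemented: the `SourceLog` class, the `DuplicatePolicy` names and the in-memory dictionaries are hypothetical stand-ins for what would normally be logging and staging tables in the database.

```python
from dataclasses import dataclass, field
from enum import Enum


class DuplicatePolicy(Enum):
    """Hypothetical names for the three duplicate handling options."""
    REJECT = "reject"
    APPEND = "append"
    REPLACE = "replace"


@dataclass
class SourceLog:
    """In-memory stand-in for a data source log (an assumption for this sketch)."""
    loads: dict = field(default_factory=dict)   # source name -> list of loaded batches
    events: list = field(default_factory=list)  # audit trail of (source, action)

    def receive(self, source, rows, policy):
        """Apply the duplicate policy to an incoming source and log the outcome."""
        if source not in self.loads:
            # First time this source item has been seen: load it as normal.
            self.loads[source] = [list(rows)]
            action = "loaded"
        elif policy is DuplicatePolicy.REJECT:
            # Keep what is already loaded and ignore the new delivery.
            action = "rejected"
        elif policy is DuplicatePolicy.APPEND:
            # Keep the existing data and add the new delivery alongside it.
            self.loads[source].append(list(rows))
            action = "appended"
        else:
            # REPLACE: discard the existing data and keep only the new delivery.
            self.loads[source] = [list(rows)]
            action = "replaced"
        # Every delivery is logged, including rejected ones.
        self.events.append((source, action))
        return action
```

Calling `receive` twice with the same source name and `DuplicatePolicy.REJECT` leaves the first load untouched but still records both deliveries in `events`, which matches the requirement to log every data source received.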

The following list extends this with useful functionality that we have added to our framework, but that may not be a core requirement:

  • Automatically handle batches of data that need to be processed together
  • Caters for different data source structures and formats from different data providers automatically
  • Reduced effort to change the data loads if data feed specifications change over time
  • Ability to automatically generate the database objects required for the ETL process
  • Ability to automatically generate the ETL packages
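As an illustration of generating database objects automatically, the sketch below builds a staging table definition from a simple feed specification. It is a minimal example under assumptions: the `build_staging_ddl` function, the `stg_` prefix and the `source_file`/`load_id` audit columns are hypothetical and not taken from our framework.

```python
def build_staging_ddl(feed_name, columns):
    """Return a CREATE TABLE statement for a staging table.

    `columns` is a list of (column_name, sql_type) pairs taken from a
    hypothetical feed specification.
    """
    # Audit columns (assumed names) added to every generated table so that
    # each row can be traced back to the file and load that produced it.
    audit_columns = [("source_file", "TEXT"), ("load_id", "INTEGER")]

    all_columns = list(columns) + audit_columns
    for name, _ in [(feed_name, None)] + all_columns:
        # Only plain identifiers are accepted, so a malformed specification
        # cannot inject arbitrary SQL into the generated statement.
        if not name.isidentifier():
            raise ValueError(f"invalid identifier: {name!r}")

    body = ",\n".join(f"    {name} {sql_type}" for name, sql_type in all_columns)
    return f"CREATE TABLE stg_{feed_name} (\n{body}\n);"
```

For example, `build_staging_ddl("sales", [("order_id", "INTEGER"), ("amount", "REAL")])` returns a statement for a table named `stg_sales` with the two feed columns followed by the audit columns. Because the table is derived from the specification, a change to the feed means changing the specification and regenerating rather than editing each object by hand.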


Supporting standards

A useful and practical Data Warehouse/ETL framework depends on a number of very important standards:

  • Agreed data warehouse methodology – Kimball/Inmon/Other
  • Data warehouse design and naming conventions – to ensure all database objects are named consistently so that they are instantly recognisable and understood, regardless of which developer created them.
  • Standardised Data Warehouse Architecture
  • ETL development standards – required to ensure that all ETL packages look and behave consistently and contain all the standard functionality required

Summary

Concentra’s Data Warehouse Framework is built using Microsoft technologies, but the core functionality and approach do not depend on the technology; they reflect the common requirements of practically every enterprise data warehouse implementation.
Our framework is constantly evolving as we add useful functionality with each new data warehouse implementation we carry out, but the core functionality in the framework is used in every data warehouse we deploy. The advantage for our clients is that they receive a proven Data Warehouse implementation with all the typical functionality required for an enterprise Data Warehouse at a much lower cost and effort than is typically possible, along with the ability to fully customise and enhance their solution without being tied to any proprietary software or licensing restrictions.
To find out more about how Concentra can help you implement your own Data Warehouse framework, please contact us at sales@concentra.co.uk.

The post What you need in your data warehouse framework appeared first on Concentra Blog.

