What Is Data Preparation?
Data preparation is the act of manipulating raw data into a form that can readily and accurately be analysed, e.g. for business purposes.
Extract, Transform and Load (ETL) is the general procedure of copying data from one or more sources into a destination system. The destination system represents the data differently from the sources, in a format better suited to analysis and visual presentation.
To improve integrity, the data undergoes normalisation and cleansing, whereby duplicates are removed and missing information is completed. The improved data set is then loaded into a single database system, often referred to as a Data Warehouse.
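To make the extract, cleanse and load steps more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: the sample CSV, the `customers` table and the "Unknown" placeholder are all hypothetical, and an in-memory SQLite database stands in for a real Data Warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; in practice this would come from a source system.
RAW_CSV = """customer_id,email,country
1,Ann@Example.com ,UK
1,ann@example.com,UK
2,bob@example.com,
"""


def extract(text):
    """Extract: read raw rows from CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise values, complete missing fields, drop duplicates."""
    seen = set()
    clean = []
    for row in rows:
        record = (
            int(row["customer_id"]),
            row["email"].strip().lower(),         # normalise formatting
            row["country"].strip() or "Unknown",  # placeholder for missing data (assumption)
        )
        if record not in seen:                    # remove exact duplicates
            seen.add(record)
            clean.append(record)
    return clean


def load(records, conn):
    """Load: write the cleansed records into the destination database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id INTEGER, email TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


# An in-memory SQLite database stands in for the Data Warehouse here.
warehouse = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), warehouse)
rows = warehouse.execute(
    "SELECT customer_id, email, country FROM customers ORDER BY customer_id"
).fetchall()
```

In this sketch the two differently formatted rows for customer 1 collapse into one record once the email address is normalised, and the missing country for customer 2 is filled with the placeholder. A real pipeline would usually complete missing values from another source rather than a fixed label.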
Business Intelligence Analysts use this single data source to prepare business insights in the form of reports and dashboard views.
Why Data Sources Are Important
As data accumulates, managing it becomes increasingly complex. Traditional methods of discovering patterns in large data sets, such as Data Mining, can make the analytical process slow, unreliable and inconsistent.
As a business leader, you need to have absolute confidence in the data insights you are presented with. The future of your business depends on well-informed decisions. The analysts providing those insights also need to be confident that the data is accurate.
Disparate data sources create problems when trying to spot overall data trends. For example, customer contact data in a CRM system may be separated from the data on their purchasing behaviour, which is held in an accounting system. This can result in a disjointed overview of those customers.
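The CRM and accounting example can be sketched in a few lines of Python. This is a simplified illustration: the record layouts, the field names and the shared `customer_id` key are assumptions made for the example, not the structure of any specific system.

```python
from collections import defaultdict

# Hypothetical records from two separate systems, linked by customer_id.
crm_contacts = [
    {"customer_id": 1, "name": "Ann", "email": "ann@example.com"},
    {"customer_id": 2, "name": "Bob", "email": "bob@example.com"},
]
accounting_invoices = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
]


def customer_overview(contacts, invoices):
    """Combine contact details with total spend per customer."""
    totals = defaultdict(float)
    for invoice in invoices:
        totals[invoice["customer_id"]] += invoice["amount"]
    # Customers with no invoices still appear, with a total spend of 0.0.
    return [
        {**contact, "total_spend": totals.get(contact["customer_id"], 0.0)}
        for contact in contacts
    ]


overview = customer_overview(crm_contacts, accounting_invoices)
```

Once both sources are combined on a common key, each customer has a single record showing who they are and what they have spent, which is the joined-up view that separate systems cannot provide on their own.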
Today’s Business Intelligence solutions provide a ‘self-service’ data preparation model. This is beneficial because it reduces the burden on your IT department. It also ensures that security and governance are at the core of your data process.
Data Sources We Work With
- Amazon RDS (all types)
- Amazon S3
- Apache Drill
- Apache HBase
- Azure Cosmos DB
- Azure Data Explorer
- Azure Database for MySQL
- Azure Database for PostgreSQL
- Azure Data Lake Storage Gen1
- Azure Data Lake Storage Gen2
- Azure Search
- Azure SQL Data Warehouse
- Azure SQL Database
- Azure Table Storage
- Common Data Service for Apps
- Dynamics 365
- Flat Files
- Google AdWords
- Google BigQuery
- Greenplum Database
- Hadoop HDFS
- HTTP Endpoint
- IBM DB2
- Microsoft Access
- Microsoft Azure (Blob Storage)
- Microsoft Dynamics AX
- Microsoft Dynamics CRM
- Microsoft Excel
- MongoDB
- Office 365
- Oracle Responsys
- Oracle Service Cloud
- Standard File System
Benefits To You
- Business User Enablement – provide users with powerful analytical insights without the need for IT support
- Control – reusable transformation rules increase efficiency
- Data Quality – improved quality right from the beginning of the process
- Speed – automating this lengthy and manual process saves you time and money
- Transformation – immediate value from your data by creating business-specific standards and formats
What Is Data Visualisation?
Data visualisation is the graphic representation of data in the form of images and charts.
Presenting data in this way is aesthetically pleasing and easier for users to interact with. As the saying goes, ‘a picture speaks a thousand words’, but don’t just take our word for it. See for yourself:
Let’s take a look at this spreadsheet data showing coffee sales by country per month.
| Year/Month | Bolivia | Ecuador | Madagascar | Papua New Guinea | Rwanda | Average |
Here is the same data, this time displayed as an interactive chart.
Which format do you think provides more insight?
The spreadsheet is merely a table of numbers. The chart, however, shows colour-coded data, a plotted average line, and an immediate visual comparison of past and future year/month figures. Users can also zoom in and out, and highlight or select specific products.
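For the curious, the average line on a chart like this is simply the mean of the country figures for each month. The short Python sketch below shows the arithmetic; the sales values are made up purely for illustration and are not the figures from the spreadsheet.

```python
from statistics import mean

# Hypothetical monthly sales per country (illustrative values only).
monthly_sales = {
    "2020-01": {"Bolivia": 100, "Ecuador": 150, "Madagascar": 50,
                "Papua New Guinea": 120, "Rwanda": 80},
    "2020-02": {"Bolivia": 110, "Ecuador": 140, "Madagascar": 60,
                "Papua New Guinea": 130, "Rwanda": 90},
}

# One average per month: the mean across all countries for that month.
averages = {
    month: mean(by_country.values())
    for month, by_country in monthly_sales.items()
}
```

Plotting `averages` alongside the individual country series gives viewers a baseline against which each country can be compared at a glance.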
If you think the chart functionality is impressive, we can assure you that data displays get a lot more interesting when you add dashboards into the mix!