Data warehouse optimisation using Hadoop

Factsheet
A new solution from Capgemini, implemented in partnership with Informatica, Cloudera and Appfluent, optimises the balance between the value of data and the cost of storing it, making it easy to take advantage of new big data technologies.
Today’s escalating data volumes can prevent Online Transaction Processing (OLTP) systems from generating and processing transactions efficiently, and can cause Data Warehouse (DW) performance issues when data is queried. Total Cost of Ownership (TCO) also escalates rapidly, both because DW hardware requires frequent upgrades and because licences are priced by data volume. Organisations often pay to store information simply because they might need it one day.
Radically new capabilities are needed to enable DWs and OLTP systems to function cost-effectively in this environment. First, organisations need to cope with data volumes that traditional DW platforms (whether RDBMSs or appliances) were never designed for. Second, they must deal with unstructured, semi-structured and structured data.
Big data technologies such as Apache Hadoop excel at managing large volumes of unstructured data. For analytical purposes, however, this data needs to be brought together with structured data. As big data technologies and Apache Hadoop move into mainstream use, the key is to integrate them with existing legacy DW platforms to get the best of both worlds.
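To illustrate the integration pattern in general terms (this is a minimal sketch, not the partners’ actual implementation), an engine such as Apache Spark can read structured data still held in the warehouse over JDBC and join it with semi-structured data landed in Hadoop. All connection details, table names, paths and columns below are hypothetical placeholders.

# Hypothetical sketch: combining warehouse data with data stored in Hadoop.
# Assumes a Spark cluster with Hive support and JDBC access to the existing DW.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("dw-offload-example")
    .enableHiveSupport()          # access data already offloaded to Hadoop/Hive
    .getOrCreate()
)

# Structured data still held in the legacy data warehouse, read over JDBC.
customers = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@dw-host:1521/DWH")   # placeholder connection
    .option("dbtable", "sales.customers")                  # placeholder table
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# Semi-structured clickstream events landed directly in HDFS as JSON.
clicks = spark.read.json("hdfs:///data/raw/clickstream/*.json")

# Combine both sources for analysis without moving everything into the warehouse.
activity = (
    clicks.join(customers, clicks.customer_id == customers.id, "left")
          .groupBy(customers.segment)
          .count()
)
activity.show()

In this pattern the warehouse keeps serving hot, structured workloads while high-volume or rarely queried data stays in Hadoop, which is the cost trade-off the solution described here addresses.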
In partnership with Informatica, Cloudera and Appfluent, Capgemini has developed an integrated solution that allows OLTP systems and DWs to serve their primary functions efficiently and cost-effectively.