Train Sure Data Warehousing

Obtaining meaningful data or information from early computer systems was a tedious task. Consequently, a number of methods, techniques, and tools were developed to address the problem, including decentralized processing, extract processing, executive information systems (EIS), query tools, and relational databases. The need for timely and accurate decisions has also led to the development of decision-support systems, ranging from simple to very sophisticated. The data warehouse is the latest tool in this evolution.

Traditional business applications were designed and developed to serve specific departments or functions, such as marketing, human resources, finance, inventory management, and loan processing. Because such applications were typically developed independently and without coordination, they often collected redundant data over time. In addition, the data residing in these applications, which were often built on different platforms, were incompatible and inconsistent. Consequently, data management was poor, an enterprise view of the data was lacking, and the same query would frequently return different results depending on which application was accessed and analyzed.

The situation grew worse after 1981, when the IBM personal computer (PC) was introduced, because the number of systems and the quantity and types of data being collected exploded. The loss of a central data repository coincided with a widespread demand for more information, delivered more quickly.

Online transaction processing (OLTP) systems were developed to capture and store business operations data. Because robustness was their top priority, rather than reporting or user accessibility, they suffered from some serious limitations. Their most obvious shortcoming was the inability to meet, in a timely fashion, business users’ need to access stored transaction data and management’s decision-support requirements. OLTP systems also did not address history and summarization requirements or support integration needs (the ability to analyze data across different systems and/or platforms).

The failure of OLTP systems to provide decision-support capabilities ultimately led to the data warehousing concept in the late 1980s. In contrast to OLTP systems, the objective of a data warehouse was to extract information rather than to capture and store data, and hence it became a strategic tool for decision makers. Today, data warehousing strives to be the foundation of corporate-wide business reporting and analytics by serving as the enterprise information hub, supporting both tactical and strategic decision making.

In a way, the data warehouse concept has come full circle. After the PC was invented, islands of data sprouted in a move toward independence, away from the centralized mainframe concept. The data warehouse, by collecting data stored in disparate systems, is a return to the centralized concept. There is a key difference, however: a data warehouse enables both enterprise and local decision-support needs to be met, while permitting independent data islands to continue flourishing. It provides a central, validated data repository that can offer “one version of the truth” while supporting the satellite systems that it feeds.
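The idea of collecting data from disparate systems into one validated repository can be illustrated with a minimal sketch. Everything here is hypothetical and chosen only for illustration: the two regional source systems, their field names, the sample rows, and the `normalize_name` and `consolidate` helpers. The point is simply that each source uses its own naming and units, and the consolidation step conforms them to a single key and a single unit.

```python
# Hypothetical extracts from two independently built regional sales systems.
# Field names, formats, and values are assumptions made for illustration.
east_sales_rows = [
    {"cust": "ACME Corp ", "revenue_usd": "1,200.50"},
    {"cust": "globex", "revenue_usd": "300"},
]
west_sales_rows = [
    {"customer_name": "Acme Corp", "amount_cents": 80000},
    {"customer_name": "Initech", "amount_cents": 4550},
]


def normalize_name(name):
    """Conform customer names to a single key: trimmed and lowercased."""
    return name.strip().lower()


def consolidate(east, west):
    """Merge both sources into one revenue total (in cents) per customer."""
    totals = {}
    for row in east:
        key = normalize_name(row["cust"])
        # The east system stores dollars as formatted text; convert to cents.
        cents = round(float(row["revenue_usd"].replace(",", "")) * 100)
        totals[key] = totals.get(key, 0) + cents
    for row in west:
        key = normalize_name(row["customer_name"])
        totals[key] = totals.get(key, 0) + row["amount_cents"]
    return totals


# A single consolidated view that either region, or the enterprise, can query.
warehouse_view = consolidate(east_sales_rows, west_sales_rows)
```

In this sketch, "ACME Corp " and "Acme Corp" resolve to the same customer, so a report on that customer returns one figure regardless of which source system recorded the sale. A real warehouse would rely on far more careful matching and validation rules than a lowercase comparison.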

Most data warehouses do not hold fully current data, because the extraction, transformation, and loading (ETL) process typically runs on a daily loading schedule, except in real-time systems. As a result, the data warehouse and the operational OLTP system can be out of sync, and the same report executed in each system can return different results. In most cases, this is not an issue as long as users are trained and made aware of the loading schedule.
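The effect of a daily loading schedule can be shown with a small sketch. The order rows, the `nightly_load` and `total_sales` helpers, and the dates are all hypothetical; the example only assumes a batch load that copies rows created before a cutoff time.

```python
from datetime import datetime

# Hypothetical operational (OLTP) rows: (order_id, amount, created_at).
oltp_orders = [
    (1, 100.0, datetime(2024, 1, 1, 9, 0)),
    (2, 250.0, datetime(2024, 1, 1, 15, 30)),
]

warehouse_orders = []  # hypothetical warehouse table, filled by the batch load


def nightly_load(source, target, cutoff):
    """Copy rows created before the cutoff that are not yet in the target."""
    loaded_ids = {row[0] for row in target}
    for row in source:
        if row[2] < cutoff and row[0] not in loaded_ids:
            target.append(row)


def total_sales(rows):
    """Sum the amount column of the given rows."""
    return sum(amount for _, amount, _ in rows)


# The batch runs at midnight and picks up everything from January 1.
nightly_load(oltp_orders, warehouse_orders, cutoff=datetime(2024, 1, 2))

# A new order arrives the next morning, after the load has finished.
oltp_orders.append((3, 75.0, datetime(2024, 1, 2, 10, 15)))

oltp_total = total_sales(oltp_orders)            # includes the new order
warehouse_total = total_sales(warehouse_orders)  # reflects the last load only

# The following nightly load brings the warehouse back in sync.
nightly_load(oltp_orders, warehouse_orders, cutoff=datetime(2024, 1, 3))
synced_total = total_sales(warehouse_orders)
```

Between the two loads, the same sales report gives 425.0 against the operational data and 350.0 against the warehouse. Neither figure is wrong; the warehouse figure is correct as of the last load, which is why users need to know the loading schedule.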
