Data Warehousing: OLAP Report Example

Data Warehousing

In the recent past, relational database systems have commonly incorporated data retrieval and analysis tools into their functionality. One of the many tools used for this purpose is OLAP, on-line analytical processing. The term has been controversial for a while and has been given diverse definitions and meanings. It can be viewed as part of business intelligence, relating to marketing, various aspects of management, project planning, financial issues, reporting and, lastly, data mining (Becker, 2002).

OLAP not only facilitates retrieval of data but also helps in making a detailed analysis of the retrieved data. This view of OLAP is mainly used when it is related to data mining. Such analysis is possible through pre-aggregation: relational databases not only facilitate the creation of tables but can also manipulate them together with the data they contain. In connection with OLAP, pre-aggregation explains how factual information, in this sense data that has been collected, can be used to come up with probability estimations used for distribution (Kozielski & Wrembel, 2009).

Data mining is a concept that has long been confused with OLAP. The two terms have been used synonymously, but viewed critically they are similar yet distinct. Data mining is a knowledge discovery mechanism that aims at identifying, from a pool of data, sets of useful and important data that may have been ignored; it classifies the data in relation to the whole set and associates each item with its class. This form of analysis focuses mainly on dividing the existing data into small, manageable sets according to their relationships. OLAP, on the other hand, focuses on adding more data to a pool that already exists.
This is made possible because OLAP takes a multi-dimensional approach to data, creating summaries along different dimensions that are then added to the original data, making it more comprehensive.

A general definition describes OLAP as software that provides a platform on which the user interacts easily with a complex database accessed online and can query the database for a service, which in return provides a report in a form the user can understand. However, many other definitions are attached to the concept. Some people even use the expansion of the acronym as a definition, but this is questionable since it does not give the gist of what the concept is or what it refers to. The most commonly used definitions are listed below (Becker, 2002).

First, there is what can be called the popular definition, which describes OLAP as a set of many spreadsheets in a package. This is a typical explanation given to people who have little or no knowledge of information technology. It is true that OLAP is used in spreadsheets to enhance the view of data, but it is certainly not a set or group of spreadsheets.

Secondly, OLAP has been popularly defined as a report presented with some extra information attached to it. This definition is adopted from the way OLAP works, presenting data along different dimensions with diverse interpretations. In this sense, OLAP allows database users to look at an issue from different perspectives.

Lastly, there is the technical definition. It describes on-line analytical processing as enhanced, user-friendly browsing of similar, multidimensional data. When a user submits a query, the software typically returns output in under a minute, hence its reputation as fast software. It also presents results in levels, with a new set of data for each level.
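The multi-dimensional summaries described above can be illustrated with a minimal Python sketch. The fact rows, the dimension names and the `roll_up` helper are all hypothetical, invented for illustration; they do not represent any particular OLAP product. The sketch only shows how the same facts can be summed at different levels of detail.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, amount).
# All values are made up for illustration.
FACTS = [
    ("North", "Tea", "Q1", 120),
    ("North", "Tea", "Q2", 80),
    ("North", "Coffee", "Q1", 200),
    ("South", "Tea", "Q1", 50),
    ("South", "Coffee", "Q2", 150),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, keep):
    """Sum the amount measure, grouping by the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the amount is the last column
    return dict(totals)


# Three levels of the same data: coarse, finer, and the grand total.
by_region = roll_up(FACTS, ["region"])
by_region_product = roll_up(FACTS, ["region", "product"])
grand_total = roll_up(FACTS, [])

print(by_region)
print(by_region_product)
print(grand_total)
```

Grouping by fewer dimensions corresponds to rolling up, and adding a dimension back corresponds to drilling down to the next level. Storing such totals ahead of time, rather than computing them per query, is one reading of the pre-aggregation the essay mentions.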
MegaSave uses data marts, tools for presenting data and facts in multiple dimensions. The best definition for such a scenario is therefore the last one, because it is the only definition that recognizes OLAP as software supporting multi-dimensional concepts. It also supports the concept of aggregation in OLAP, since a data mart is itself an aggregation used when giving feedback, carrying out analysis and making critical decisions (Becker, 2002).

MegaSave proposes implementing a data mart as opposed to a data warehouse. The two concepts are similar; only their scope and size differ. A data warehouse is larger and covers much of an organization, while a data mart has a relatively narrower scope. A data mart focuses more on the business management side of the organization and is not meant to be independent, but rather to add to the functionality of a warehouse.

MegaSave's data mart is also an operational system, that is, a system meant to maintain the security, timeliness and accuracy of data used in business transactions by applying database normalization and creating relationships between similar entities. As is evident from the above discussion of what OLAP is, these concepts are covered by the functionality of this software. MegaSave's relational database also improves the management of relationships between tables containing similar or related data. With time, however, some outdated data is cleared from the operational system but retained in the data warehouse in case it is needed in future (Kozielski & Wrembel, 2009).

There are different approaches to data warehousing. The choice of approach depends on the kind of data to be managed, company policies and regulations and, most importantly, the kind of operations the software will be used for and the speed at which output must be presented.
The two main approaches covered in this paper are the bottom-up and top-down data warehousing approaches.

The bottom-up approach, proposed by Ralph Kimball, holds that data marts are originally meant to be used for specific transactions. They cannot be generalized to cover all activities of an organization, since they are designed to be task-specific. However, Kimball acknowledges that bottom-up is not a standalone approach but one that stems from a top-down analysis listing the tasks that can be analyzed. He further argues that data marts carry only facts, together with the many dimensions through which these facts can be analyzed. This explains why they need to be task-specific: it may not be logical to mix facts about different concepts, which may even contradict each other. Nevertheless, he proposes integrating these data marts to create a warehouse that covers all organizational activities. This is made possible by what he calls the data warehouse bus architecture, a move towards collecting in one place all the data marts addressing different business processes (Becker, 2002).

Integration is done by first identifying the data marts to be joined, then identifying the points at which they can merge without much difficulty, and finally carrying out the merger itself. The facts shared between the two data marts are summarized, and the keys of these summarized facts are then joined. The integrity of the data thereafter depends entirely on how the resulting warehouse is managed. Most important in achieving this is ensuring that the dimensions of the various data marts are highly consistent and compatible (Kozielski & Wrembel, 2009).

This approach has been termed good and reliable by data warehousing experts, mainly because of its modularity.
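The integration steps described above (summarize each mart's facts to a shared key, then join on that key) can be sketched as follows. This is a minimal illustration under stated assumptions, not Kimball's own procedure: the two marts, their rows, the product keys and the helper names `summarize` and `drill_across` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical data marts. Both identify products with the same keys,
# which stands in for a shared, consistent dimension.
SALES_MART = [  # (product_key, month, revenue)
    ("P1", "2024-01", 300),
    ("P1", "2024-02", 200),
    ("P2", "2024-01", 100),
]
MARKETING_MART = [  # (product_key, channel, spend)
    ("P1", "email", 40),
    ("P2", "tv", 90),
    ("P2", "email", 10),
    ("P3", "tv", 25),
]


def summarize(rows):
    """Sum a mart's measure (last column) per shared key (first column)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[0]] += row[-1]
    return totals


def drill_across(sales_rows, marketing_rows):
    """Join the two summaries on the shared product key."""
    revenue = summarize(sales_rows)
    spend = summarize(marketing_rows)
    merged = {}
    for key in sorted(set(revenue) | set(spend)):
        # A key missing from one mart contributes zero for that measure.
        merged[key] = {
            "revenue": revenue.get(key, 0),
            "spend": spend.get(key, 0),
        }
    return merged


report = drill_across(SALES_MART, MARKETING_MART)
for key, measures in report.items():
    print(key, measures)
```

The join only works because both marts use the same product keys. If one mart coded products differently, the summaries could not be matched, which is why the consistency of dimensions across marts matters so much.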
As opposed to a single complex system, the bottom-up approach presents a scenario where each segment of the system can run independently, and a problem in one segment has little effect on the others. It is easier to manipulate data in such a system, as it is self-training, user-friendly and relatively interactive. Kimball notes that such a system reduces the pressure of initial installation, as it can be done phase by phase until all the processes of an organization are covered. It is also easier to achieve productivity with the bottom-up approach, since a data mart is adopted for each business process and put to work independently of the others. Profits are therefore realized early in the specific transactions, which encourages the other departments to adopt data marts faster (Bellatreche, 2010).

The bottom-up approach also takes pride in its ability to merge the services offered by its integrated data marts. The data marts function independently to deliver their unique services, and the integration combines these independent services into one. For instance, if the sales and marketing data marts are merged, a combined sales-marketing service becomes possible, in addition to the independent sales service and the independent marketing service (Kozielski & Wrembel, 2009).

This approach has, however, been criticized by its opponents. A major criticism of the bottom-up approach has been its failure to appreciate the need for a centralized system. It focuses on decentralization but later integrates the system to behave as though it were centralized.
The reason behind this integration has been attributed to the need to share information and resources across the enterprise, the desire not to separate enterprise processes originally designed to work as one, the wish of management and other employees to understand processes and decisions across all departments, and the need to work towards a common goal following similar strategies. The critics of this model, however, have argued that if this matters to an enterprise, then the original model should be centralized and not a mix of centralization and decentralization (Silverston, Inmon & Graziano, 1997).

The other major model addressed by this paper is the top-down approach to data warehousing. It is a criticism of the bottom-up approach and was originally the idea of Bill Inmon. He starts by disputing the bottom-up approach and giving the standard definition of a data warehouse as a centralized repository for an organization. This has been a controversial definition, one that has not been adopted by the proponents of the bottom-up approach, who argue that a repository remains a repository whether centralized or not. In their view, the location serves to enhance effectiveness and efficiency but does not determine whether a data warehouse qualifies as a repository.

Inmon says that the data warehouse must be located at a central place in the organization, at a similar proximity to all departments. He also calls for all the dimensional marts unique to particular transactions to be created from and stored in the central warehouse, where they are managed by the people working in the repository and not by the staff of the specific department. This, according to him, helps ensure proper management of the business processes and a clear picture of the state of the enterprise. He regards a decentralized system as relatively compromising, saying that such a system may raise contradictions if focus is channeled towards the uniqueness of each department.
In support of his proposal of a top-down approach, Inmon characterizes a data warehouse as being subject-oriented. He explains that in a centralized repository, data is classified according to the subject it addresses. Indexing is also done using subject as the main entry to the database. This makes information in the database easily accessible, as one need only key in the subject to retrieve the required information.

Data contained in a warehouse is static. The bottom-up approach allows data to be deleted from the mart but retained in the warehouse. Inmon criticizes this, stating that once data has been adopted and judged worth storing in the data warehouse, it should not be done away with. It should instead be retained for as long as it can survive since, according to him, the data warehouse should be large enough to keep the existing data and still handle more (Kozielski & Wrembel, 2009).

Inmon also advocates data consistency in a centralized system. He argues that consistency can only be achieved by putting all the data of an enterprise in one place: since all activities work together for the good of the organization, there is no sense in separating the location of the data. Data in a centralized system is also time-variant and up to date. This can be attributed to the fact that a centralized management system can be easily monitored, and the parties responsible for managing it are more accountable. This is not the case for a bottom-up system, where each department takes charge of its own data management.

Given the design of the company in question, a bottom-up approach is recommended as the best approach to adopt. Because the businesses run by this company are so diverse in scope and focus, a centralized system would not be appropriate.
A bottom-up approach will do well because the various stores operate in geographically different places and can manage themselves effectively. They are likely to face diverse challenges and therefore need very diverse strategies (Kozielski & Wrembel, 2009). Implementation of the approach is also favorable to this company. It allows the departments and regions that have attained the capacity to adopt the system to gain from its services early, and hence to generate more profit, as opposed to a case where all the regions have to wait for central management to lay down strategies for adopting new systems (Silverston, Inmon & Graziano, 1997).

Adoption of a warehouse requires the choice of an architecture that will facilitate integration and efficiency of the proposed approach. There are many architectures from which the most suitable can be selected and implemented. Commonly, five architectures are applicable to data warehousing (Golfarelli & Rizzi, 2009).

First, there is the concept of standalone data marts. These usually apply to and are developed by individual departments or enterprises seeking to meet very specific requirements that may not be met by a general system applicable to a wide range of similar organizations. Such data marts are solely independent and cannot be applied in a situation other than the one for which they were originally meant. They are used in isolation and cannot be integrated with others that may have similar objectives. However, such data marts are known for high levels of inconsistency and lack of uniformity in the way data is stored and managed (Ferdinandi, 1999).

Then there is a commonly used architecture, the data mart bus architecture. It links dimensional data marts to form an overall system that is applicable to the whole organization.
Data is organized in what is called a star schema to allow for easier integration and applicability (Taniar, 2008).

Hub-and-spoke architecture is also common in data warehousing. It treats subject as the main concept that can help bring related concepts into close proximity. Data is stored in the database collectively by subject, which in itself yields a classification scheme. The architecture also recognizes levels in the database, to accommodate diversity and allow easier access. It resembles a centralized architecture, except that it has dependent data marts.

Lastly, there is the federated architecture. It takes the shape of most decision support systems, where globally set variables and metadata attachments determine the related data sets that need to be integrated. Integration is then made possible by the use of keys, which act as links to the shared data sets (Adelman & Moss, 2000).

The major challenge facing data warehousing is the loss of data during the shift into the data warehouse. Data quality is commonly compromised when data is minimized and compressed so that more can fit into the data marts. Lossless compression has been proposed as the way to go, though some losses are still incurred, especially during shifting (Balsters, Brock, & Conrad, 2001).

Failure to invest in data management and governance has also been a major challenge. Most companies invest little in these areas, mainly because they are unaware of the importance of information to the success and well-being of the organization.

There is also a need for business modeling, which has likewise been a challenge in data warehousing. An enterprise needs to know the specific objectives it aims to achieve so that its parts can work towards a common goal. The main business concepts should be considered in determining the approach to be taken and the data marts to be integrated.

Data warehousing also faces the challenge of timeliness.
The integrity of data is compromised when the data is not accurate and timely. Data warehousing also takes long to adopt because of the expenses incurred in implementing and installing the data warehouse (Taniar, 2008).

Interpretation of data is another major challenge in the adoption of data warehousing. Data that has been summarized to fit into the data marts, or that is shared and loses its independence, may be interpreted wrongly by other users, who might have no information on when, how and why the data was summarized.

SAS Web OLAP Viewer for Java can be used to implement OLAP on the web. It provides a good interface for user interaction and for using OLAP to view data. It allows users to view data in many dimensions and helps in the interpretation of compressed data, improving accuracy, effectiveness and efficiency. For instance, business analysts are better placed to analyze the data critically, since the tool helps them visualize, interpret and use the data from many angles. Another major benefit of such a system is its support for boardroom reports. These are advanced reports that can be edited and formatted with ease to suit company strategies (Golfarelli & Rizzi, 2009).

The tool also has a major advantage in that it eliminates much of the deployment cost, that is, the cost incurred when each user needs to acquire the software on their own machine in order to use it. Users can operate the software and use OLAP on the web without downloading it onto their machines (Bellatreche, 2010).

However, this software has some shortcomings. The fact that it can be accessed and edited on the web without user rights exposes data to the dangers of unauthorized access and editing. One can easily manipulate data, especially to suit one's own interests. A further major disadvantage of OLAP is that it does not work in the same way as the reporting tools.
The reporting tools work differently from the data analysis tools. This is a major disadvantage, since it slows the software down and makes integration difficult to achieve.

In implementing such a system at MegaSave, its features will be more of an advantage, given the structure of the company. The use of the software on the web without downloading gives the company an advantage, given that the regions are in geographically different places (Golfarelli & Rizzi, 2009).

References

Adelman & Moss, 2000, Data warehouse project management. Boston, MA: Addison-Wesley.
Balsters, Brock & Conrad, 2001, Database schema evolution and meta-modeling: 9th International Workshop on Foundations of Models and Languages for Data and Objects, FoMLaDO/DEMM 2000, Dagstuhl Castle, Germany, September 18-21, 2000: selected papers. Berlin: Springer.
Barquin & Edelstein, 1997, Building, using, and managing the data warehouse. Upper Saddle River, NJ: Prentice Hall PTR.
Becker, 2002, Data warehousing and web engineering. Hershey, PA: IRM Press.
Bellatreche, 2010, Data warehousing design and advanced engineering applications: methods for complex construction. Hershey, PA: Information Science Reference.
Ferdinandi, 1999, Data warehousing advice for managers. New York: AMACOM.
Golfarelli & Rizzi, 2009, Data warehouse design: modern principles and methodologies. New York: McGraw-Hill.
Kozielski & Wrembel, 2009, New trends in data warehousing and data analysis. New York: Springer.
Silverston, Inmon & Graziano, 1997, The data model resource book: a library of logical data models and data warehouse designs. New York: Wiley.
Taniar, 2008, Data mining and knowledge discovery technologies. Hershey, PA: IGI Pub.
