Data Warehouses & OLAP & Data Mining - Assignment Example



Business Intelligence Project: Data Warehouses, OLAP and Data Mining

An Introduction to Data Warehousing

Liu (2011) defines a data warehouse as a subject-oriented, time-variant, integrated, and non-volatile collection of data that supports management’s decision-making process. According to Berendt and Spiliopoulou (2000), a data warehouse is a centralized repository that gathers data from multiple information sources and transforms it into a common, multidimensional data model for efficient querying and analysis. Berendt and Spiliopoulou (2000) indicate that a data warehouse can handle a wide variety of situations. According to Liu (2011), a data warehouse is a store for an organization’s historical data. Information from operational systems is extracted and imported into the data warehouse at regular intervals. As a result, complex queries can be run against the data warehouse with very little interruption to the operational systems. Berendt and Spiliopoulou (2000) indicate that the imported data is read-only and can only be added to the data already in the warehouse. The value of a data warehouse increases as the volume of data in it grows, since analyses spanning long periods of time become possible. When a user submits a query to the warehouse, all historical data relevant to the query can be retrieved, which supports decision making.

An organization manages its information in two ways: through operational systems and through data warehouses. Organizations use operational systems for online transaction processing (OLTP). Data warehouses, on the other hand, are designed to support online analytical processing (OLAP). According to Liu (2011), operational systems concentrate on processing large volumes of transactions on a daily basis, and they use real-time data.
Berendt and Spiliopoulou (2000) further indicate that these systems are generally process-oriented and focus on specific tasks such as student registration, management of employees’ timesheets, and updating financial transactions. They are optimized for simplicity and speed of modification, allowing effective, efficient, and trouble-free data entry and retrieval. Berendt and Spiliopoulou (2000) argue that such systems also track historical and transactional data. According to Liu (2011), however, operational systems mainly focus on managing current data. The chief function of a data warehouse is to store and manage historical data. Berendt and Spiliopoulou (2000) argue that data warehouses are generally subject-specific and carry data from multiple operational systems in order to support organizational decision making. In academic institutions, data warehouses are used to address issues such as student satisfaction, the attrition rate, and the effectiveness of new instructional techniques. In response to a concern, relevant data can be extracted and used for analysis and report generation.

Data warehousing applications

There are three types of data warehousing applications:

a. Personal productivity applications: these include statistical packages, spreadsheets, and graphics tools. They are useful for presenting and manipulating data on individual PCs and are developed for a standalone environment. According to Maurizio et al. (2003), these tools address applications that require small volumes of warehouse data.

b. Data query and reporting applications: these deliver wide data access through simple, list-oriented queries. They also generate simple reports that provide a view of historical data. However, they do not address the enterprise requirement for in-depth analysis and planning (Maurizio et al. 2003).

c. Planning and analysis applications: these address essential business requirements such as budgeting, forecasting, customer profitability, financial consolidation, sales analysis, and manufacturing mix analysis, all of which use historical, projected, and derived data (Maurizio et al. 2003).

As with all information systems, data warehousing components should be viewed against a framework that focuses on the business applications they are designed to address, not on technology. The applications served by data warehousing should therefore be viewed in terms of these three categories.

Benefits of data warehousing

Data warehousing is one of the most strategically significant advances in information processing in modern times. According to Maurizio et al. (2003), data warehousing is seen as the modern-day answer to information overload. Some of the advantages of data warehousing include:

Enhanced business intelligence: insights are gained through improved access to information. Data warehousing provides ample information to assist in decision making. Decisions that affect an organization’s strategies and operations are based on credible facts and backed by actual organizational data. Moreover, data warehouses and related business intelligence can be applied directly to organizational processes such as market segmentation, financial management, inventory management, and sales.

Improved system and query performance: the chief purpose of data warehousing is speedy data retrieval and analysis. Data warehouses are designed for large-volume data storage and rapid data querying. These analytical systems are built differently from operational systems, whose main focus is the creation and modification of data. By contrast, a data warehouse is built for data analysis and retrieval rather than for efficient upkeep of individual records.
Because a data warehouse takes a large part of the system burden away from the operational environment, it effectively distributes the system load across the organization’s entire technology infrastructure.

Business intelligence from multiple sources: in many organizations, enterprise information systems consist of multiple subsystems that are physically separated and built on different platforms. Merging data from multiple sources is a common necessity when conducting business intelligence. A data warehouse integrates disparate data sources and makes the data accessible from a single place. Consolidating data into a single repository eases the burden of data duplication and enables easy extraction of data. It makes the data warehouse a single source of truth for the organization, rather than multiple truths from multiple individual subsystems.

Timely access to data: according to Maurizio et al. (2003), a data warehouse enables decision makers to access data from a variety of sources in a short period. Scheduled data integration routines, often referred to as ETL, are run within the data warehouse environment. Maurizio et al. (2003) indicate that these routines merge data from multiple source systems and convert it into a useful format. As a result, users can easily access data through a single interface.

Enhanced data quality, reliability, and consistency: data warehousing typically entails taking information from numerous source systems and data records and converting the disparate data into a common format. Data from various organizational units and departments is standardized, and the inconsistencies between distinct source systems are eliminated. As a result, different organizational units produce consistent results, which increases confidence in the organization’s data.
Historical intelligence: data warehouses keep many years’ worth of data that cannot be stored in or reported from transactional systems. Typically, transactional systems satisfy most operational reporting needs for a specified period of time without including historical data. By contrast, a data warehouse stores large volumes of historical data; Maurizio et al. (2003) indicate that this data can enable advanced business intelligence such as time-period analysis and trend analysis and prediction. A data warehouse allows advanced reporting and analysis across multiple time periods.

Elevated return on investment (ROI): this refers to the increased revenue and decreased expenses that an organization can realize from a capital investment. Maurizio et al. (2003) argue that implementing data warehouses and complementary business intelligence systems helps a business generate more revenue and achieve considerable cost savings.

The fundamental goal of a data warehouse, according to Ponniah (2010), is to support strategic planning, forecasting, and modeling at the organizational level. A data warehouse must fulfill the need for knowledge about an area of uncertainty or growth within an organization. Philip et al. (2002) state that, to accomplish these tasks, a data warehouse must provide a single, comprehensive, consistent, and reliable view of the organization. Philip et al. (2002) further indicate that data must be easily accessible and comprehensible. Further, the data warehouse should make information available to the user consistently and securely. Once data is collected from the source systems, it must pass a range of quality assurance measures to confirm its accuracy. According to Khan (2003), such data must be verified, fully accounted for, and appropriately labeled before it is made available to users.
According to Ponniah (2010), such data must be resilient and able to adapt seamlessly to change without discrediting the data already present. In addition, Khan (2003) indicates that effective data warehousing can help form a meaningful relationship between business and information technology, and can facilitate enterprise-level strategic planning and growth.

Components

A data warehouse is divided into four main components: the operational source systems, the data staging area, the presentation servers, and end-user data access. Each of the four components serves a unique function in preparing data for manipulation and examination.

As mentioned above, the operational systems of record capture and process the organization’s everyday transactions. Operational systems concentrate on efficient processing performance because they deal with a high volume of transactions. According to Philip et al. (2002), operational systems function in isolation and do not typically share common data with other source systems. The data obtained from the operational systems is uploaded to the staging area.

The data staging area acts both as a storage area for the data and as a platform for the set of processes referred to as extract-transform-load (ETL). Ponniah (2010) indicates that this set of processes standardizes the raw data so that it can be incorporated into the data warehouse environment. Khan (2003) indicates that, to begin with, data is extracted from several source systems and copied into the data staging area. There, data is combined, cleaned, and transformed into a uniform format and structure. This is where missing elements, duplicated data, incorrect labels, misspellings, and other errors are handled and corrected. After the data is standardized, it is directed to the data presentation area, where it can be accessed by users.
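The staging steps described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the source names (billing, web_shop), the field names, and the sample rows are hypothetical and do not come from the cited authors. It shows extraction from two source systems, cleaning in a staging step (label standardization, removal of incomplete rows and duplicates), and loading into a list that stands in for the presentation area.

```python
def extract(sources):
    """Copy raw rows from every (hypothetical) source system into one staging list."""
    staged = []
    for source_name, rows in sources.items():
        for row in rows:
            staged.append({**row, "source": source_name})
    return staged


def transform(staged):
    """Standardize names and amounts, drop incomplete rows, remove duplicates."""
    cleaned = {}
    for row in staged:
        name = (row.get("customer") or "").strip().title()
        amount = row.get("amount")
        if not name or amount is None:
            continue  # a required element is missing, so the row is skipped
        record = {"customer": name, "date": row["date"], "amount": float(amount)}
        key = (record["customer"], record["date"], record["amount"])
        cleaned.setdefault(key, record)  # keeps the first copy of each duplicate
    return list(cleaned.values())


def load(warehouse, rows):
    """Append the standardized rows to the stand-in presentation area."""
    warehouse.extend(rows)
    return warehouse


# Hypothetical sample data from two operational systems.
sources = {
    "billing": [
        {"customer": " alice smith", "date": "2024-01-05", "amount": "120.50"},
        {"customer": "Bob Jones", "date": "2024-01-06", "amount": None},
    ],
    "web_shop": [
        {"customer": "Alice Smith ", "date": "2024-01-05", "amount": 120.5},
        {"customer": "carol diaz", "date": "2024-01-07", "amount": 80},
    ],
}

warehouse = load([], transform(extract(sources)))
```

In this sketch the two Alice Smith rows collapse into one record once their labels and amounts are standardized, and the Bob Jones row is dropped because its amount is missing. A real ETL routine would typically log or repair such rows rather than silently discard them.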
The refined, formatted data is available for user queries in the data presentation area. The data presentation area is a set of integrated data marts. A data mart can be defined as a subset of the data warehouse that represents data for a specific function. An organization can use multiple data marts, each relevant to the department for which it has been designed. Data in the data presentation area must be meticulously and logically organized. According to Berendt and Spiliopoulou (2000), once the data presentation area holds formatted data, users are given a variety of data access tools to perform queries; these include data mining applications, ad hoc query tools, and sophisticated forecasting tools.

Beyond the components of the data warehouse, it is critical to establish a strong metadata structure. Metadata holds vital information that guides the process of changing raw data from the operational systems into usable data in the presentation area (Philip et al. 2002). Because of this value, metadata resources must be accessible, carefully categorized, and protected as well as the data itself.

Data warehouses are effective in moving an organization from intuitive information gathering to objective and systematic investigation. According to Ponniah (2010), they give users access to, and control over, a wide range of formatted and centralized data with which to choose the best course of action and support business decisions. Users can manipulate and customize the data in the warehouse to support specific queries that can enable positive changes at different business levels. Khan (2003) points out that since several stages increase data integrity and accuracy, complex queries can be conducted with a high degree of confidence. Data warehousing has several advantages; however, it also has some drawbacks and challenges (Khan, 2003).
Khan (2003) states that, depending on their design, data warehouses carry considerable risks because of their complex architecture, poor-quality information, long development cycles, and inability to adapt as fast as business conditions change. In addition, since the operational source systems provide the data that eventually reaches the data presentation area, Silva and Vieira (2002) state that data warehouses are limited by these source systems. As a result, each organization must focus on continual assessment and upgrading of its data warehouse and source systems. This will in turn lead to effectiveness in researching and supporting organizational decisions.

PART B: OLAP and Data Mining

Online Analytical Processing (OLAP), as defined by Philip et al. (2002), describes a class of technologies designed for live, ad hoc data access and analysis, based on multidimensional views of an organization’s data. Silva and Vieira (2002) further indicate that with OLAP tools, a person can navigate through and analyze data to identify trends, spot omissions, and obtain the underlying details needed to better understand the flow of the organization’s activities. ROLAP (Relational OLAP) can be defined as a set of user interfaces and applications that give a relational database a dimensional flavor. Silva and Vieira (2002) define MOLAP (Multidimensional OLAP) as a set of user interfaces, applications, and proprietary database technology with a strong dimensional flavor. According to Philip et al. (2002), the majority of OLAP approaches center on the idea of reformulating flat data into a data store that is multidimensional and optimized for data analysis. Silva and Vieira (2002) state that the multidimensional data store is referred to as a hypercube and that it stores data along dimensions. In its multidimensional form, OLAP technology is non-relational and is usually based on dedicated multidimensional databases (MDDBs).
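The idea of reformulating flat data into a multidimensional store can be sketched in a few lines of Python. This is a simplified illustration, not how any particular OLAP product works: the dimension names (year, dealer, category), the sample facts, and the helper names roll_up and slice_cube are all hypothetical. The sketch shows two common cube operations, rolling a measure up along chosen dimensions and slicing the cube on one dimension value.

```python
from collections import defaultdict

# Each flat fact row: (year, dealer, category, sales). Hypothetical sample data.
DIMENSIONS = ("year", "dealer", "category")
facts = [
    (2023, "Dealer A", "Sedan", 30000),
    (2023, "Dealer A", "SUV", 45000),
    (2023, "Dealer B", "Sedan", 28000),
    (2024, "Dealer A", "Sedan", 32000),
    (2024, "Dealer B", "SUV", 50000),
]


def roll_up(rows, keep):
    """Aggregate the sales measure over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the measure is the last field of each row
    return dict(totals)


def slice_cube(rows, dimension, value):
    """Keep only the rows where one dimension has the given value."""
    position = DIMENSIONS.index(dimension)
    return [row for row in rows if row[position] == value]


sales_by_year = roll_up(facts, ["year"])
sales_by_year_and_dealer = roll_up(facts, ["year", "dealer"])
sedan_sales_by_dealer = roll_up(slice_cube(facts, "category", "Sedan"), ["dealer"])
```

A dedicated multidimensional database would precompute and index such aggregates so that interactive queries return quickly; the sketch recomputes them on each call for clarity.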
OLAP-style data marts can be full participants in a data warehouse bus if they are designed around conformed dimensions and facts.

The multidimensional data model

The multidimensional data model is an integral part of OLAP. Because OLAP is online, the model must provide quick answers during interactive sessions. The multidimensional data model is designed to answer complex queries in real time. The central appeal of the dimensional model is its simplicity, which is fundamental in allowing users to understand databases and in allowing software to navigate databases effectively and efficiently. The multidimensional data model comprises logical cubes, dimensions, measures, levels, hierarchies, and attributes. Silva and Vieira (2002) indicate that the simplicity of the model is inherent, given that it defines objects that represent real-world organizational entities.

Logical multidimensional data model: logical data cubes

Logical cubes provide a way of organizing measures that have the same shape, that is, exactly the same dimensions. According to Sholom et al. (2005), measures in the same cube have the same relationships to other logical objects and can easily be analyzed and presented together.

Data mining

Data mining is the process of extracting previously unknown but relevant information from large databases and using that information for crucial business decisions. Data mining converts data into information, and in most cases it works bottom-up.

The process of data mining

Data mining entails the proactive discovery of information and actionable business models. For example, business objectives such as the following can guide the search for signs of customer behavior in the gathered data, so that clients can be classified accordingly.

Marketing: which people are likely to purchase?
Loyalty: which group of people is likely to defect?
Credit: where can one get more profitable loans?
Forecasts: which goods are likely to sell out?
Fraud: how and when did it occur?
The above business problems can be addressed through data mining patterns as follows.

Classification: classifying clients as loyal, high-profit, or high-loss.
Prediction: forecasting future demand for a product.
Segmentation: finding groups of similar customers.
Association: identifying products that are consumed together, e.g. bread and butter.
Sequence: identifying events that follow one another, e.g. after purchasing a car, people buy insurance cover.

Web and text mining

Web and text mining consists of web structure mining, web usage mining, and web content mining. According to Sholom et al. (2005), web usage mining refers to the discovery of user access patterns from web usage logs. Web structure mining is designed to discover useful knowledge from the structure of hyperlinks. Sholom et al. (2005) indicate that web content mining aims at extracting useful information from web page contents.

PART C: Designing a multidimensional model for a second-hand car dealer in the UAE

The qualifier matrix

The qualifier matrix below is based on the auto business requirements. It shows that time, customer demographics, date, dealer, and method of payment can be chosen using the matrix, as indicated below.
Part C, Table 1: The qualifier matrix
Measures: Sales revenue; MSRP base price; Distribution of sales; Number of vehicles sold; Amount payable
Qualifiers: Model name; Category; Dealer name; Year; Quarter; Month; Day; Financing type; Income range

Part C, Table 2: Dealer (stores information about dealers)
Column Name: Description
Dealer Key (PK): A code representing the dealer (unique)
Name: Dealer name
City: Dealer city
State: Dealer state
Load Timestamp: Date the records were loaded

Part C, Table 3: Customer Demographic (contains information pertaining to the customer’s location and demographics)
Column Name: Description
Demographic Key (PK): Unique code representing the customer’s location
Age: Customer’s age
Gender: Customer’s gender
Income Range: Customer’s income
Marital Status: Customer’s marital status
Household Size: Family size
Vehicles Owned: Number of vehicles the customer possesses
Home Value: Value of the customer’s residence
Owned or Rented Residence: Whether the customer’s home is rented or owned
Load Timestamp: Date the record was made

Part C, Table 4: Finance (stores information on methods of payment)
Column Name: Description
Finance Key (PK): Unique code for the mode of payment
Finance Type: Type of finance the customer uses
Terms in Months: Duration for which the lease or loan is taken
Rate: Interest on the loan
Lease Agent: Name of the lease agent
Load Timestamp: Date the records were made

Part C, Table 5: Product (contains information about the product)
Column Name: Description
Model Name: Vehicle model
Model Year: Year of manufacture
Product Category: Category of the vehicle
Exterior Color: Vehicle’s exterior color
Interior Color: Vehicle’s interior color
Load Timestamp: Date the records were loaded

Part C, Table 6: Time (information on the date of sale)
Column Name: Description
Time Key (PK): Unique code representing time
Year: Year of sale
Quarter: Quarter of sale
Month: Month the sale was made
Day: Day of the month on which the transaction took place
Load Timestamp: Date the records were loaded

Part C, Table 7: Sales fact table (contains facts regarding the sales transactions of the vehicles)
Column Name: Description
Time Key (FK): Unique code representing the date of sale
Product Key (FK): Unique code representing the vehicle type
Finance Key (FK): Unique code representing the payment
Demographic Key (FK): Unique code representing the customer’s location
Dealer Key (FK): Unique code representing the dealer
Actual Sales Price: Amount the customer pays for the vehicle
MSRP Base Price: Base price of the model sold
Down Payment Amount: Amount paid as down payment
Load Timestamp: Date this information was entered

Star schema: [figure]

References

Berendt, B. & Spiliopoulou, M. (2000). Analyzing navigation behavior in www integrating multiple data systems. VLDB Journal, Special Issue on Databases and the Web, 9(1), 56–75.
Khan, A. (2003). Data warehousing 101: Concepts and implementation. New York; Lincoln, NE: iUniverse, Inc.
Liu, B. (2011). Web data mining: Exploring hyperlinks, contents, and usage data. Heidelberg; New York: Springer.
Malinowski, E. (2008). Advanced data warehouse design: From conventional to spatial and temporal applications. Berlin: Springer.
Maurizio L., Panos V., Matthias J., & Yannis V. (2003). Fundamentals of data warehouses. Berlin; New York: Springer.
Nagabhushana, S. (2006). Data warehousing, OLAP and data mining. New Delhi: New Age International.
Philip A., Yannis E., Ramakrishnan R., & Dimitris P. (2002). Large volumes of data bases 2002: Proceedings of the 28th International Conference on Very Large Data Bases, Hong Kong SAR, China, 20-23 August 2002. St. Louis, MO: Morgan Kaufmann.
Ponniah, P. (2010). Data warehousing fundamentals for IT professionals. Hoboken, NJ: John Wiley & Sons.
Prabhu, L. (2002). Data warehousing: Concepts, techniques, products and applications. New Delhi: Prentice-Hall of India.
Pujari, A. (2001). Data mining techniques. Hyderabad; Great Britain: Universities Press.
Sholom M. W., Nitin I., Zhang T., & Fred D. (2005). Text mining: Predictive methods for analyzing unstructured information. New York: Springer.
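The star schema of Part C, in which the sales fact table references the surrounding dimension tables, can be sketched with Python’s built-in sqlite3 module. This is a reduced illustration under assumptions, not the design from the assignment: only three of the five dimensions and a subset of their columns are included, the table and column identifiers (for example time_dim and product_key) are hypothetical spellings, a product key is assumed on the product table because the fact table refers to one, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (subset of the columns described in Part C).
CREATE TABLE dealer   (dealer_key INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER,
                       quarter INTEGER, month INTEGER, day INTEGER);
CREATE TABLE product  (product_key INTEGER PRIMARY KEY, model_name TEXT,
                       product_category TEXT);

-- Fact table at the center of the star: one row per vehicle sale.
CREATE TABLE sales_fact (
    time_key            INTEGER REFERENCES time_dim(time_key),
    product_key         INTEGER REFERENCES product(product_key),
    dealer_key          INTEGER REFERENCES dealer(dealer_key),
    actual_sales_price  REAL,
    down_payment_amount REAL
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO dealer VALUES (?, ?, ?)",
                 [(1, "Dealer A", "Dubai"), (2, "Dealer B", "Abu Dhabi")])
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?, ?, ?)",
                 [(1, 2023, 4, 12, 15), (2, 2024, 1, 2, 10)])
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [(1, "Model X", "Sedan"), (2, "Model Y", "SUV")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 30000, 5000), (1, 2, 1, 45000, 10000),
                  (2, 1, 2, 28000, 4000), (2, 2, 1, 50000, 12000)])

# A typical star-schema query: join the fact table to two dimensions
# and total the sales price per dealer and year.
rows = conn.execute("""
    SELECT d.name, t.year, SUM(f.actual_sales_price)
    FROM sales_fact AS f
    JOIN dealer   AS d ON d.dealer_key = f.dealer_key
    JOIN time_dim AS t ON t.time_key   = f.time_key
    GROUP BY d.name, t.year
    ORDER BY d.name, t.year
""").fetchall()
```

The query joins the fact table to each dimension through its key, which is the access pattern a star schema is designed for. Note that SQLite only enforces the REFERENCES constraints when foreign key checking is switched on for the connection.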