Architecture and Techniques for Data Warehousing and Data Mining - Research Paper Example

Summary
The paper "Architecture and Techniques for Data Warehousing and Data Mining" focuses on the critical and comprehensive analysis of the technology's architecture methods, support tools, and applications that are used for data warehousing and data mining…

Extract of sample "Architecture and Techniques for Data Warehousing and Data Mining"

Running Head: TECHNIQUES FOR DATA WAREHOUSING & DATA MINING

Architecture And Techniques For Data Warehousing And Data Mining
[Name Of Student]
[Name Of Institution]

ABSTRACT

Organizations have been actively implementing data warehousing technology, which facilitates enormous enterprise-wide databases. As a result, the amount of data that organizations possess is growing at a phenomenal rate. The next challenge for these organizations is how to interpret the data and how to transform it into useful information and knowledge. Data mining is one technology used for meeting this challenge. This paper gives a comprehensive view of the architecture, methods, support tools, and applications used for data warehousing and data mining.

INTRODUCTION

Data mining technology makes it possible to extract implicit facts from existing enterprise databases. Data mining aims to reveal a great number of facts about data through structured online processes. This capability can be used to support corporate decision-making processes. This paper provides a review of the most widespread architectures and techniques for data warehousing and data mining.

DATA MINING METHODS

Data mining methods can generally be divided into two main categories. The first is statistical: popular statistical techniques in use include probability distributions, correlation, regression, cluster analysis, and discriminant analysis (Date, 2004). The second group of methods applied in data mining is a branch of leading-edge artificial intelligence known as machine learning. It starts with a training set of data from which the data mining system learns and determines the parameters for its models. Such a method is also known as inductive reasoning, which involves deriving rules by studying a large number of cases in the database (Date, 2004).
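The idea of inducing a rule from a training set can be made concrete with a very small sketch. The example below is not taken from any cited source; it assumes a hypothetical training set of (purchase amount, responded-to-offer) cases and a hypothetical helper, learn_threshold_rule, that searches for the single cut-off value that misclassifies the fewest cases. Real data mining systems use far richer models, but the principle of learning model parameters from many cases is the same.

```python
from typing import List, Tuple


def learn_threshold_rule(cases: List[Tuple[float, bool]]) -> float:
    """Induce a rule of the form 'predict True when value >= threshold'.

    Every observed value is tried as a candidate threshold, and the one
    that misclassifies the fewest training cases is returned.
    """
    candidates = sorted({value for value, _ in cases})
    best_threshold = candidates[0]
    best_errors = len(cases) + 1
    for threshold in candidates:
        # Count the cases where the rule's prediction disagrees with the label.
        errors = sum((value >= threshold) != label for value, label in cases)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Hypothetical training set: (purchase amount, customer responded to offer).
training_set = [
    (120.0, False),
    (300.0, False),
    (450.0, True),
    (800.0, True),
    (950.0, True),
]

threshold = learn_threshold_rule(training_set)
print(f"Induced rule: respond = amount >= {threshold}")
```

For this training set the search settles on 450.0, the lowest amount at which every case is classified correctly. A one-level rule of this kind is the building block that decision trees apply repeatedly.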
The following examples of machine learning are the ones most used by developers of data mining systems. Neural networks are the most widely used method in data mining. They imitate the way the human brain learns and use rules inferred from data patterns to build hidden layers of logic for analysis. Decision trees group data based on their values. Genetic algorithms are hybrids of biology and computer science. Most data mining methods are computationally intensive. For example, neural networks try to mimic the human brain, which consists of about 10^11 neurons, to perform tasks. A combination of interconnected tools must be employed to support data mining efforts (Ross, 2000).

DATA MINING TOOLS

The introduction of new data mining products is increasing at a phenomenal rate, and they can be categorized as data, hardware, software, and network.

Data

A large repository of data, often a data warehouse, is required in order to do data mining. The repository captures both internal and external data for the organization. Inmon identified four elements of data in data warehouses that enhance the data mining process (Rigdon, 2003):

1. Integrated data. Well-structured and consistent data makes the mining easier.
2. Detailed and summarized data.
3. Historical data. Historical data is crucial for businesses to understand their seasonality and business cycles.
4. Metadata. Metadata provides the context of data and serves as a road map for end users in data mining (Inmon, 2002).

Hardware

The major problem of most data mining techniques is that they are computationally intensive. While most database management systems (DBMS) can perform a certain degree of data mining, large data mining applications require very sophisticated hardware components.

Software

Data mining software products often integrate the products of vendors from various areas, including software engineering, statistics, and graphic presentation.
The software responsible for data analysis includes intelligent agents, OLAP, query tools, and statistical tools; the software responsible for information presentation includes data visualization, desktop presentation, and reporting software.

Network

Two kinds of data mining exist from a network perspective: centralized data mining, which uses specialists in a mainframe-centric IS department to extract information and report the results to managers, and client/server data mining, which allows end users to conduct the mining directly from their desktop computers on the data stored in servers.

DATA WAREHOUSING FUNDAMENTALS

A data warehouse (or smaller-scale data mart) is a specially prepared repository of data designed to support decision-making. The data come from operational systems and external sources. To build the data warehouse, data are extracted from source systems, cleaned (e.g., to detect and correct errors), transformed (e.g., put into subject groups or summarized), and loaded into a data store (i.e., placed into a data warehouse) (Elmasri & Navathe, 2003).

The data in a data warehouse have the following characteristics (Elmasri & Navathe, 2003):

Subject oriented -- The data are logically organized around major subjects of the organization, e.g., around customers, sales, or items produced.
Integrated -- All of the data about the subject are combined and can be analyzed together.
Time variant -- Historical data are maintained in detail form.
Nonvolatile -- The data are read only, not updated or changed by users.

A data warehouse draws data from operational systems, but is physically separate and serves a different purpose. Operational systems have their own databases and are used for transaction processing; a data warehouse has its own database and is used to support decision-making.
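The extract, clean, transform, and load steps described in this section can be sketched in a few lines. The example below is only an illustration under stated assumptions: the source rows, the field names, and the table sales_by_customer_month are hypothetical, and an in-memory SQLite database from the Python standard library stands in for the warehouse.

```python
import sqlite3
from collections import defaultdict

# Hypothetical rows extracted from an operational source system.
source_rows = [
    {"customer": " Acme ", "date": "2004-01-15", "amount": "100.50"},
    {"customer": "acme", "date": "2004-01-20", "amount": "49.50"},
    {"customer": "Bolt", "date": "2004-02-03", "amount": "n/a"},
    {"customer": "Bolt", "date": "2004-02-10", "amount": "75.00"},
]


def clean(rows):
    """Normalize customer names and drop rows whose amount cannot be parsed."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # error detected: the row is left out of the warehouse
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "month": row["date"][:7],  # keep year and month for time-variant analysis
            "amount": amount,
        })
    return cleaned


def transform(rows):
    """Summarize amounts by subject (customer) and month."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["customer"], row["month"])] += row["amount"]
    return sorted((cust, month, total) for (cust, month), total in totals.items())


def load(summary):
    """Load the summarized rows into a separate warehouse database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_by_customer_month (customer TEXT, month TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO sales_by_customer_month VALUES (?, ?, ?)", summary)
    conn.commit()
    return conn


warehouse = load(transform(clean(source_rows)))
result = warehouse.execute(
    "SELECT customer, month, total FROM sales_by_customer_month "
    "ORDER BY customer, month"
).fetchall()
print(result)
```

The source rows are never modified; the warehouse holds its own copy, organized by subject and time period, which reflects the separation between operational and decision-support databases described above.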
Once the warehouse is created, users (e.g., analysts, managers) access the data in the warehouse using tools that generate SQL (i.e., structured query language) queries or through applications such as a decision support system or an executive information system (Deitel, 2004). "Data warehousing" is a broader term than "data warehouse" and is used to describe the creation, maintenance, use, and continuous refreshing of the data in the warehouse.

THE WAREHOUSE ARCHITECTURE

There are numerous architectures for data warehouses, and several factors affect the architecture that is chosen.

PRIMARY DATA STORES

The information architecture illustrates two primary data stores that the majority of information consumers will be most interested in accessing. The Enterprise data marts tend to focus on integrating information of interest to the enterprise from multiple business units. The Departmental data marts tend to focus on business unit information for decision support. The types of access would include predefined reports, Web reports, queries, and possibly an OLAP-type database.

SECONDARY DATA STORES

The information architecture illustrates three secondary data stores: the Base/Staging Data Store (BDS), the Operational Data Store (ODS), and the Private Data Store.

1. The Base data store merges, converts, cleanses, and regroups multiple sources of data and calculates key derived values. Types of access can range from programmed queries to sophisticated data mining or powerful analytic tools.
2. Operational data stores tend to support operations or serve as an interim store of MVS data for staging to the BDS. Access is typically via predefined reports or specific applications.
3. The Private data store tends to support specialized or one-time analysis and is generally not stored as part of an integrated warehouse. Types of access are typically queries, desktop OLAP, and spreadsheets.
END USER ACCESS METHODS

WEB REPORTS (PRE-DEFINED REPORTS)

Pre-defined reports can be considered the simplest form of user access: a non-technical user chooses a fixed report, typically available through the common desktop Web browser. The advantage of a Web browser as the delivery option is that a vendor tool does not need to be launched to access the data.

QUERY

Query tools are PC-based software that provides a graphical interface to facilitate the creation of SQL queries to the database. The advantage of these tools is that they allow an end user with no knowledge of SQL and limited knowledge of the database structures to create ad hoc queries. After a result set is retrieved, these tools provide the capability to perform various operations on the data, such as computing totals and averages.

MULTIDIMENSIONAL OLAP

Online analytical processing (OLAP) is a term that describes a dimensional approach to decision support. The OLAP tools vary in the way they access and store data. Where query tools view data from a two-dimensional point of view, OLAP is concerned with looking at the data from all angles (hence the term "slice and dice" often used in reference to OLAP-type analysis).

DATA APPLICATIONS

Data applications are essentially front ends for a data mart, or may replace it and directly access the base data store. They can be custom developed using rapid application development tools or purchased. The three most critical components of this architecture involve understanding the distinguishing characteristics of the various data stores, interpreting the variety of options that end users have for accessing data warehouse information, and understanding how a data warehouse project team develops descriptions of subject area data marts.

CONCLUSION

Data warehousing has shown that it can satisfy the need for information about customers, suppliers, market trends, the competition, and the efficiency of the company's business procedures.
These days there is no room for guesswork; decisions have to be made on the basis of careful data analysis. Where do you go data mining for business intelligence? To the data warehouse, where you drill down to find the information you want. A feature of data warehousing is that the original operational databases are left untouched. The information from those databases is copied into the data warehousing system, where it is processed and used to prepare charts and graphs from which new facts will emerge.

REFERENCES

Adriaans, P. and Zantinge, D., Data Mining, Harlow, England: Addison Wesley Longman, 1996.
Connolly, T. & Begg, C. (2002): Database Systems: a practical approach to design, implementation, and management. Addison-Wesley/Pearson Education. ISBN 0-201-7085
Date, C.J. (2004): An Introduction to Database Systems. Pearson Education.
Deitel, H.M. et al. (2004): Internet and World Wide Web: how to program. Pearson Education. ISBN 0-13-124682-8.
Elmasri & Navathe (2003): Fundamentals of Database Systems. Addison-Wesley. ISBN 0321204484.
Gessaroli, J., "Data Mining: A Powerful Technology for Database Marketing," Telemarketing, (13:11), May 2001, pp. 64-68.
Greenfeld, N., "Data mining," UNIX Review, (14:5), May 1996, pp. 9-14.
Inmon, W.H., "The Data Warehouse and Data Mining," Communications of the ACM, (39:11), November 2002, pp. 49-50.
Rigdon, E.E., "Data Mining Gains New Respectability," Marketing News, (31:1), January 2003, pp. 8.
Ross, J.R., "Data Mining: Digging Deeper for Information Treasures," Stores, (78:5), May 2000, pp. 66-68.
