Data Mining Techniques for Identifying Information Sources - Report Example


Summary

This report, "Data Mining Techniques for Identifying Information Sources", discusses database efficiency. On the negative side, too much normalization can cause issues. The objective is to deploy the highest acceptable level of normalization…


Subject: Information Technology
Type: Report
Level: College
Pages: 5 (1250 words)
Downloads: 2
Author: vandervortabbey

Extract of sample "Data Mining Techniques for Identifying Information Sources"

Data Mining

Third normal form (3NF) is usually recommended for a corporate environment that manages a massive amount of replicated data. In 3NF, for instance, data does not have to be saved several times; however, more joins are required. By comparison, first normal form (1NF) allows replicated data to be stored regardless of the number of joins. It is up to the database administrator to evaluate which form is right; it may be 3NF or 1NF.

Normalization comprises five rules that are applied to a relational database. The main objective is to eliminate or minimize redundancy while increasing database efficiency. The drawback is that too much normalization can cause issues, so the objective is to deploy the highest acceptable level of normalization. Comparing the first three normal forms: 1NF removes repeating groups, second normal form (2NF) reduces data redundancy, and 3NF removes columns that do not depend on the primary key. Database design should therefore apply the highest level of normalization possible in order to make the database efficient and robust.

To maintain three large databases for a very large database (VLDB) and keep them efficient for two years if required, a 'store and forward' mechanism must be constructed to process the data from and through each distribution center database. At the same time, the mechanism must hold that data pending until the enterprise data warehouse (EDW) is complete. Data archiving is also required, because each distribution center database becomes a VLDB. An EDW is efficient enough to support this scenario.

A study showed that the overall worldwide cost of diabetes is $376 billion annually.
It is now widely accepted that a person over the age of 60 is more likely to develop this disease, which is now considered the fourth leading cause of death globally. The most common form of diabetes is type 2. Because almost 20% of inhabitants suffer from it in the United Arab Emirates alone, many research studies and debates are conducted yearly in Dubai and Abu Dhabi. Moreover, awareness sessions are held in every town to inform people about the disease (MoH launches second phase of diabetes campaign, 2010).

This case study examines diabetes and the medical data associated with patients from the Middle East region, specifically the United Arab Emirates, in order to discover concealed patterns and valuable information that can be used in decision making. These informed decisions are made by medical personnel and practitioners. The case study can therefore be used to illustrate the medication requirements for each type of diabetes and to forecast future trends reflected in the extracted data (MoH launches second phase of diabetes campaign, 2010).

In data mining, data associated with people has serious ethical implications. Data mining experts need to adopt norms that make the application of data sound (Keating, 2008). Where people are concerned, data mining can be associated with disparity and with behaviors such as racism, which run counter to those norms. How a use is perceived also depends on the classification being applied: classification is accepted when it singles out a disease that needs urgent attention, but for a financial institution or bank, applying it to loan decisions is considered unethical. Similarly, numerous other factors may be relevant to data mining.
For instance, a leading consumer report indicated that in France, customers who own a red car are more likely to default on their loans. Whether acting on such a finding is ethical or unethical is debatable. Similarly, insurance companies are always selective and discriminating: their statistics distinguish, for example, a young person from an elderly woman, because young people are more likely to have accidents, resulting in higher insurance payouts for their damaged cars. Other issues in data mining concern techniques, tools, user involvement, performance, and the variety of data types. Each is discussed below.

1.1 Methods of Mining and Interaction Concerns

This issue pertains to the information extracted from databases and to the capability to gain information. It focuses on mining information at many levels, on the use of domain-related knowledge, and on knowledge conception.

1.2 Mining Several Kinds of Information

Data mining must address several different criteria across a variety of data analysis and information discovery tasks. These tasks include characterization, association and correlation analysis, cluster analysis, classification, and forecasting. Such tasks may use a single database for data mining and produce their results in several different ways.

1.3 Collaborative Mining for Extracting Information

It is difficult to predict the form and composition of a database, so the mining process needs to be iterative. The data can be divided into parts by applying sampling techniques, which makes the data mining specialist's work easier and saves memory as well.
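The sampling idea in section 1.3 can be illustrated with a short sketch. The report does not name a specific sampling technique, so reservoir sampling is used here as an assumption: it draws a fixed-size uniform sample from a stream of records without holding the whole data set in memory. The `patient_rows` generator and its field names (`patient_id`, `age`) are hypothetical stand-ins for a real database cursor.

```python
import random
from typing import Dict, Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k records chosen uniformly at random from a stream.

    Only k records are kept in memory at any time, which is the
    memory-saving property the sampling step is meant to provide.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)
    sample: List[T] = []
    for index, record in enumerate(records):
        if index < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Keep the new record with probability k / (index + 1).
            slot = rng.randint(0, index)  # both bounds are inclusive
            if slot < k:
                sample[slot] = record
    return sample


def patient_rows(n: int) -> Iterator[Dict[str, int]]:
    """Hypothetical record source; a real one would read from a database."""
    for i in range(n):
        yield {"patient_id": i, "age": 20 + i % 60}


# Draw 100 records from 100,000 without materializing the full data set.
sample = reservoir_sample(patient_rows(100_000), k=100, seed=42)
print(len(sample))  # 100
```

Because the sample is built in a single pass, the same function could be pointed at any iterable record source; the fixed seed is only there to make the illustration repeatable.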
1.4 Synchronizing Contextual Information

Retaining contextual information about the issue within a domain helps to define the criteria for hidden data and allows them to be expressed in summarized terms. For a robust data mining process, domain knowledge is essential, because it helps to identify useful patterns that meet a set criterion.

1.5 Query Language for Data Mining

A relational query language is a valuable option: Structured Query Language (SQL) lets the data mining specialist run various queries to acquire a specific set of data. However, for high-level data processing in data mining, query languages are complex and far more advanced. These queries enable data mining experts to carry out data mining tasks that draw on domain knowledge. They are also easily integrated with state-of-the-art applications that operate in data warehouses, and they help data mining experts to execute queries for quality data acquisition (Data mining extensions, 2007).

1.6 Information Visualization

The high-level information extracted from databases is presented as high-level visual representations that enable the data mining specialist to understand and interpret the data (Data mining extensions, 2007). Interacting at a high level requires the aid of representations such as graphs, bar charts, tables, and rules.

1.7 Organizing Ineffectual Data

A database also contains a large amount of extraneous or incomplete data, which creates hurdles in the data mining process. Because the algorithms and criteria are set to gather information from complete data, incomplete data makes the process complex and, in some cases, inaccurate (Data mining extensions, 2007).
However, the best option is to clean the data first, using data cleaning and analysis techniques, before using it in the data mining process (Fowler, Karadayi, Chen, Meng, & Fowler, 2000).

1.8 Evaluating Patterns

Evaluating patterns is an essential task. Because numerous patterns are extracted by data mining processes and techniques, the data mining specialist analyzes only the relevant and adequate ones. This process involves a high level of expertise along with expert application knowledge, domain background knowledge, and awareness of the limitations associated with specific users. All these factors can limit the search for valid patterns in the data (Fowler, Karadayi, Chen, Meng, & Fowler, 2000).

1.9 Performance Limitations

Data mining performance bottlenecks are linked to the scalability, capability, and parallelism of data mining methods and procedures. To make the data mining process effective, it is necessary to acquire information from data warehouses that comprise numerous databases. However, accessing data from large data warehouses poses certain challenges, including long delays in running data mining algorithms. The solution to this challenge is to incorporate distributed and parallel data mining techniques, which can divide the data into segments to make the process faster.

References

Data mining extensions. (2007). Network Dictionary, 134.

Fowler, R. H., Karadayi, T., Chen, Z., Meng, X., & Fowler, W. A. L. (2000). A visualization system using data mining techniques for identifying information sources.

Keating, B. (2008). Data mining: What is it and how is it used? Journal of Business Forecasting, 27(3), 33-35.

MoH launches second phase of diabetes campaign. (2010). Arabia 2000.

Cite this document


(“Data Warehousing and Data Mining Essay Example | Topics and Well Written Essays - 1500 words”, n.d.)
Data Warehousing and Data Mining Essay Example | Topics and Well Written Essays - 1500 words. Retrieved from https://studentshare.org/information-technology/1480596-data-warehousing-and-data-mining

