Multi Agent Driven Data Mining for Knowledge Discovery in Cloud Computing - Research Proposal Example

Summary
The author of the paper "Multi-Agent Driven Data Mining for Knowledge Discovery in Cloud Computing" will begin with the statement that cloud computing and data mining have become famous phenomena in the current application of information technology…


Multi Agent Driven Data Mining for Knowledge Discovery in Cloud Computing

INTRODUCTION

Cloud computing and data mining have become prominent in the current application of information technology. With changing trends and the emergence of new concepts in the information technology sector, data mining and knowledge discovery have proved to be of significant importance. Data mining can be defined as the process of extracting information that is not explicitly stored in a database and that can be used to draw generalized conclusions from the trends found in the data. A database may be described as a collection of formally structured data (Song Il-Yeol et al. 15).

Multi-agent data mining may be defined as the use of several agents that cooperatively interact with the environment to achieve a specified objective. The agents act on behalf of users and coordinate, cooperate, negotiate and exchange data with each other. An agent may be a software agent, a robot or a human being (Wrembel & Koncilia 45).

Knowledge discovery can be defined as the process of critically searching large collections of data with the aim of finding patterns that can be used to draw generalized conclusions. These patterns are sometimes referred to as knowledge about the data.

Cloud computing can be defined as the delivery of computing services in which shared resources, information and software are provided over a network, for example the Internet. Cloud computing is normally delivered through a web-based service that hosts all the required resources (Bramer 36).

This paper critically analyzes the statement "Multi agent driven data mining for knowledge discovery in cloud computing". Data mining and knowledge discovery are very useful concepts, especially when determining unknown phenomena.
For example, a chain store that sells a variety of clothing can use data mining to support its decisions. If it is observed that more sweaters and jumpers are bought at the beginning of the year, and that more light clothes are bought as the year progresses, then the management can stock more jumpers and sweaters at the beginning of the year and ensure that they are sold at full price (Zhou 55). In this example the database does not store the purchasing pattern; it stores only the records of the items bought and when they were bought. To establish the pattern, we need an agent that observes the purchases. The knowledge learnt is that more jumpers and sweaters are bought at the beginning of the year, probably because of the winter season, while light clothes are bought around the middle of the year, probably because of the summer season.

Data mining in cloud computing can be very useful to cloud vendors, who can obtain useful information about their customers and use it to make valuable decisions. Cloud vendors should not, however, mine this data for the purpose of reselling it. For example, a company that is interested in what most users look for on the internet can employ the web usage mining technique, which gathers information about what most web users access, whether text or multimedia (Cooley 22). Web hosting is an example of cloud computing in the sense that most of the data is stored by the web hosting company and not by the individual or organization that owns the website. The services on a website are accessed with the help of a browser (Masand & Spiliopoulou 65).

The following model illustrates how data mining can be used to obtain information about the ordering behavior of customers who place their orders online using our interactive website.
The information obtained will be used to identify which months produce the most orders and by how much. I have used the OLAP (online analytical processing) method to analyze the ordering trends of customers. To order a product online, a customer visiting our site clicks the "place an order" link and fills in the order form, and the details entered are captured and stored in a database table. The customer is required to enter the following details on the order form: FIRSTNAME, LASTNAME, GENDER, MARITAL STATUS, LOCATION, TYPE OF CAKE, PAYMENT MODE, DATE and AGE. The following table shows an extract of the data stored in the orders table of the database.

DATE       FNAME   LNAME   GENDER  MARITAL STATUS  LOCATION  TYPE OF CAKE  PAYMENT MODE  AGE
12/2/2011  LEANZS  LIGALE  MALE    SINGLE          NAIROBI   VANILLA       MASTER CARD   18
12/2/2011  KEVIN   NGAIRA  MALE    SINGLE          KIAMBU    VANILLA       MPESA         22
13/2/2011  MILCAH  ADEMA   FEMALE  MARRIED         KISII     MILKY         PAYPAL        32
13/2/2011  STEVE   LUBITA  MALE    SINGLE          NAIROBI   VANILLA       CREDIT CARD   21
28/3/2011  EDWIN   MAFUNU  MALE    SINGLE          KITUI     WEDDING       MPESA         28
28/3/2011  KITEN   KOLO    MALE    SINGLE          KIAMBU    WEDDING       PAYPAL        30
29/3/2011  KEVO    POLOP   MALE    SINGLE          NAIROBI   WEDDING       MPESA         32

From the above data, we have to extract the data that can be used to study the ordering behavior of customers, which will help us reach conclusions that enable us to increase our sales. The database does not provide explicit data on the purchasing behavior of customers, hence we have to perform data pre-processing. After pre-processing, we load the information into a relational database and then represent it in a multidimensional array. The multidimensional array lets us analyze the data using a data cube and the OLAP operations of the relational database (Bramer 40).

Data pre-processing
Data pre-processing involves several activities, including data cleaning, identification of client details, and capturing the order date and the type of cake ordered by each client. Data cleaning involves removing unwanted data from the large collection of data. In the above model, for example, we do not need the mode of payment, first name, last name, age, gender or location, so data cleaning removes these details and keeps only the wanted fields. Data cleaning can be carried out by specialized intelligent software agents, by software applications such as SPSS, or by a structured query language (SQL) statement that selects the wanted fields (Witten et al 78). The table below shows the data that will be stored in the relational database, ready for the OLAP process.

DATE       MARITAL STATUS  TYPE OF CAKE
12/2/2011  SINGLE          VANILLA
12/2/2011  SINGLE          VANILLA
13/2/2011  MARRIED         MILKY
13/2/2011  SINGLE          VANILLA
28/3/2011  SINGLE          WEDDING
28/3/2011  SINGLE          WEDDING
29/3/2011  SINGLE          WEDDING

An OLAP process represents information as multidimensional data, so the above data has to be represented as a multidimensional array. The three attributes DATE, MARITAL STATUS and TYPE OF CAKE can be represented in a three-dimensional array. The array is then converted into a data cube that allows us to perform the roll-up, drill-down, slicing and dicing operations on the model (Witten 79). The figure below shows the data cube that represents the data, which can now be sliced and diced.

[Figure: data cube with the axes MARITAL STATUS (Single, Married, Widowed), CAKE TYPE (Vanilla, Milky, Wedding) and DATE (12/2, 13/2, 28/3, 29/3 of 2011)]

The data cube represents all the possible aggregates of the data, each of which results from selecting a subset of the dimensions and summing over all the remaining dimensions.
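The cleaning and cube-building steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names `RAW_ORDERS`, `clean` and `build_cube` are hypothetical, the rows are copied from the orders table with "VANILA" spelled "VANILLA" and the stray space in "29/ 3/2011" removed, and the cube is held as a plain dictionary rather than in a relational database or an OLAP engine.

```python
from collections import Counter

# Column order of the raw orders table in the text.
COLUMNS = ("date", "fname", "lname", "gender", "marital_status",
           "location", "cake", "payment_mode", "age")

# Rows copied from the orders table (spelling and spacing normalised).
RAW_ORDERS = [
    ("12/2/2011", "LEANZS", "LIGALE", "MALE", "SINGLE", "NAIROBI", "VANILLA", "MASTER CARD", 18),
    ("12/2/2011", "KEVIN", "NGAIRA", "MALE", "SINGLE", "KIAMBU", "VANILLA", "MPESA", 22),
    ("13/2/2011", "MILCAH", "ADEMA", "FEMALE", "MARRIED", "KISII", "MILKY", "PAYPAL", 32),
    ("13/2/2011", "STEVE", "LUBITA", "MALE", "SINGLE", "NAIROBI", "VANILLA", "CREDIT CARD", 21),
    ("28/3/2011", "EDWIN", "MAFUNU", "MALE", "SINGLE", "KITUI", "WEDDING", "MPESA", 28),
    ("28/3/2011", "KITEN", "KOLO", "MALE", "SINGLE", "KIAMBU", "WEDDING", "PAYPAL", 30),
    ("29/3/2011", "KEVO", "POLOP", "MALE", "SINGLE", "NAIROBI", "WEDDING", "MPESA", 32),
]

# Fields kept by data cleaning; everything else is dropped.
WANTED = ("date", "marital_status", "cake")

DATES = ["12/2/2011", "13/2/2011", "28/3/2011", "29/3/2011"]
STATUSES = ["SINGLE", "MARRIED", "WIDOWED"]
CAKES = ["VANILLA", "MILKY", "WEDDING"]


def clean(rows):
    """Project every raw row onto the wanted fields."""
    idx = [COLUMNS.index(name) for name in WANTED]
    return [tuple(row[i] for i in idx) for row in rows]


def build_cube(cleaned, dates, statuses, cakes):
    """Return {(date, status, cake): order count}; unseen cells hold 0."""
    counts = Counter(cleaned)
    return {(d, s, c): counts[(d, s, c)]
            for d in dates for s in statuses for c in cakes}


cleaned = clean(RAW_ORDERS)
cube = build_cube(cleaned, DATES, STATUSES, CAKES)
print(cube[("28/3/2011", "SINGLE", "WEDDING")])  # 2
```

The dictionary has one entry for each of the 4 x 3 x 3 combinations of date, marital status and cake type, so combinations that never occurred are stored explicitly as zero.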
In the data cube, combinations that did not occur are marked with a zero, and the cells that did occur are marked with their count. From the data cube we can conclude, for example, that on 28/3/2011 two single clients bought the wedding cake. To analyze the data properly we need to split the cube into the individual components that represent the data, that is, we have to perform the slicing and dicing operations. Slicing is the process of selecting a group of cells from the multidimensional array by specifying a specific value for one of its dimensions. Dicing, on the other hand, is the process of defining a sub-array of the entire multidimensional array by specifying a range of values for its dimensions (Cios 48). The tables below show the cross tabulations obtained after the slicing and dicing operations.

a) Vanilla cake

         12/2/2011  13/2/2011  28/3/2011  29/3/2011
single   2          1          0          0
married  0          0          0          0
widowed  0          0          0          0

b) Milky cake

         12/2/2011  13/2/2011  28/3/2011  29/3/2011
single   0          0          0          0
married  0          1          0          0
widowed  0          0          0          0

c) Wedding cake

         12/2/2011  13/2/2011  28/3/2011  29/3/2011
single   0          0          2          1
married  0          0          0          0
widowed  0          0          0          0

After slicing and dicing, we obtain a fact table that gives the final results, which can now be used to draw conclusions. The following table shows the data obtained from the data cube.

         12/2/2011  13/2/2011  28/3/2011  29/3/2011  TOTAL
WEDDING  0          0          2          1          3
VANILLA  2          1          0          0          3
MILKY    0          1          0          0          1
TOTAL    2          2          2          1          7

We have been able to extract a table that shows how many cakes were sold on each date and, at the same time, which types of cake were sold at which time of the year. The table is a two-dimensional representation derived from the three-dimensional array. Other two-dimensional views can be obtained in the same way, such as which type of cake is most liked by a given marital status group and at what time. None of this data is explicitly provided by the database.
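The slicing, dicing and roll-up operations used above can be illustrated with a short Python sketch over the cleaned three-column data. The helper names `slice_by_cake`, `dice` and `roll_up` are hypothetical and chosen only for this illustration; the sketch assumes the cube is a dictionary keyed by (date, marital status, cake type), which is one simple representation and not necessarily the one a real OLAP tool would use.

```python
from collections import Counter

# Cleaned records (date, marital status, cake type) from the text.
CLEANED = [
    ("12/2/2011", "SINGLE", "VANILLA"),
    ("12/2/2011", "SINGLE", "VANILLA"),
    ("13/2/2011", "MARRIED", "MILKY"),
    ("13/2/2011", "SINGLE", "VANILLA"),
    ("28/3/2011", "SINGLE", "WEDDING"),
    ("28/3/2011", "SINGLE", "WEDDING"),
    ("29/3/2011", "SINGLE", "WEDDING"),
]
DATES = ["12/2/2011", "13/2/2011", "28/3/2011", "29/3/2011"]
STATUSES = ["SINGLE", "MARRIED", "WIDOWED"]
CAKES = ["WEDDING", "VANILLA", "MILKY"]

# Three-dimensional cube of order counts; cells with no orders hold 0.
_counts = Counter(CLEANED)
CUBE = {(d, s, c): _counts[(d, s, c)]
        for d in DATES for s in STATUSES for c in CAKES}


def slice_by_cake(cube, cake):
    """Slice: fix the cake dimension, leaving a status-by-date table."""
    return {s: [cube[(d, s, cake)] for d in DATES] for s in STATUSES}


def dice(cube, dates, statuses, cakes):
    """Dice: keep the sub-cube whose coordinates lie in the given subsets."""
    return {key: n for key, n in cube.items()
            if key[0] in dates and key[1] in statuses and key[2] in cakes}


def roll_up(cube, keep):
    """Roll up: sum out every dimension whose index is not listed in keep."""
    table = Counter()
    for key, n in cube.items():
        table[tuple(key[i] for i in keep)] += n
    return table


vanilla = slice_by_cake(CUBE, "VANILLA")
print(vanilla["SINGLE"])  # [2, 1, 0, 0]

# Key positions: 0 = date, 1 = marital status, 2 = cake type.
cake_by_date = roll_up(CUBE, keep=(2, 0))      # cake type against date
cake_by_status = roll_up(CUBE, keep=(2, 1))    # cake type against marital status
```

Here `cake_by_date` holds the cells of the cake-by-date fact table, and `cake_by_status` gives the alternative two-dimensional view of cake type against marital status group.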
Viable conclusions can be drawn from the above data. We obtain the following knowledge (Cios 55):

1. Vanilla cakes, which our company designs for special moments with loved ones, are bought heavily in the days just before Valentine's Day. Hence more vanilla cakes could be baked and sold on those dates at a maximum price.
2. Wedding cakes are bought heavily towards the end of March. This can probably be attributed to the fact that most weddings take place in April and many people prefer to buy wedding cakes at this time in preparation for them. Hence more wedding cakes should be produced at this time and sold at a maximum price.

Also, if we want to determine which marital status group likes our cakes most, we can derive the following table from the data cube.

         SINGLE  MARRIED  WIDOWED  TOTAL
WEDDING  3       0        0        3
VANILLA  3       0        0        3
MILKY    0       1        0        1
TOTAL    6       1        0        7

From the above table we can conclude that most of our cakes are bought by single customers, and the original orders table shows that these customers are young men. This identifies our target audience for the marketing team, which can direct most of our advertisements and marketing activities at young single men.

CONCLUSION

From the above model, we have obtained knowledge through multi-agent data mining. Because our database is hosted on a cloud computing platform, this amounts to knowledge discovery in cloud computing.

Works cited

Spiliopoulou, M. & Masand, B. Web usage analysis and user profiling: International WEBKDD '99 Workshop, San Diego, CA, USA, August 15, 1999: revised papers. Berlin: Springer, 2000.
Song, I.-Y. & Eder, J. & Nguyen, T. Data warehousing and knowledge discovery. Berlin: Springer, 2008.
Wrembel, R. & Koncilia, C. Data warehousing and OLAP: concepts, architectures and solutions. Hershey: Idea Group Inc, 2007.
Zhou, Xiaofang. Web information systems. Berlin: 2004.
Witten, I. & Frank, E. & Hall, M. Data mining: practical machine learning tools and techniques. Amsterdam: Elsevier, 2011.
Cios, Krzysztof. Data mining: a knowledge discovery approach. Berlin: Springer, 2007.