GroupBased Labelling Scheme for Dynamic XML Databases - Case Study Example

Add to wishlist

Summary

The paper "GroupBased Labelling Scheme for Dynamic XML Databases" demonstrates how the scheme handles different types of insertion and how the relationships are preserved after insertions; simple examples were provided. The correctness of the scheme’s properties was given using simple algebra…

Download full paper File format: .doc, available for editing

GRAB THE BEST PAPER92.2% of users find it useful

GroupBased Labelling Scheme for Dynamic XML Databases

Read Text Preview

Subject: Information Technology
Type: Case Study
Level: Undergraduate
Pages: 13 (3250 words)
Downloads: 2
Author: tonidietrich

Extract of sample "GroupBased Labelling Scheme for Dynamic XML Databases"

Chapter 4: GroupBased Labelling Scheme for Dynamic XML Databases 4 Introduction Labelling a node of an XML document to reflect the structure is an important process that helps in indexing and retrieving XML data effectively. However, designing a dynamic labelling scheme which can handle insertions of new nodes without the need to re-label the existing labels, as well as taking the size of the labels and the query performance into consideration, is a challenging task; this was mentioned earlier in the literature chapter (Ch. 3). This chapter presents the principles of the dynamic labelling scheme proposed in this thesis before the design and implementation details are examined in Chapter 5. In this chapter, an overview is given in Section 4.2. Then, Section 4.3 illustrates how the initial labels are allocated and how the different relationships are determined. Section 4.4 describes how insertions are handled and how different relationships are preserved. A validation of the relationships using algebra is shown in Section 4.5. Finally, in Section 4.6, the chapter ends with a general conclusion that leads to the following chapter which discusses the scheme from the point of view of implementation. 4.2 An Overview of the Scheme The proposed scheme is based on the parent-child grouping to facilitate the identification of parent-child and sibling relationships faster, based on a simple comparison. Parent-child grouping was also selected due to the high number of xml documents that come with this type of relationship (Goldman & Widom, 1997). Again, parent-child and sibling grouping facilitate smother insertions of new nodes, given the fact that in this form of grouping only a simple tree structure will be dealt with rather than the whole tree (Cohen, Kaplan & Milo, 2002). The advantage of allowing smoother insertion builds on the prefix GroupID labelling scheme but does not determine a fixed number of nodes to be inserted. Gusfield (1997) also observed that when dealing with parent-child groupings, labelling can be thought of as being easier, faster and more accurate as it deals with a simple tree structure. The simple structure has to do with a root node and its direct children nodes. Another critical characteristic of the scheme is that it uses two labels for each node in order to facilitate the processing nodes within the same group that uses their simple local labels. This is in contrast with multiplication-based scheme where the global label is used to connect a group to the whole tree which helps in identifying relationships between nodes belong to different group (Milo & Suciu, 1999). Based on existing schemes such as the DDE labelling scheme, the researcher appreciates the fact that using the scheme to create the first part global label has its advantage as this scheme facilitates the insertion without re-labelling of the existing labels (McCreight, 1976). It also ensures that there is the identification of all relationship at a go (Tatarinov et al., 2008). Based on this advantage, the proposed scheme has also been intended to possess these properties. What is more, the scheme is designed to allow fast identification of relations. This is because it is appreciated that as fast as the relationships between nodes can be determined, the query processing will be optimised (Li & Moon, 2001). This labelling scheme is divided into two parts. Each label has a local and a global part. The local label, which is given to each node, can be duplicated whereas the global label uniquely identifies a group of local labels. This is generated based on the Dynamic Dewey labelling scheme (Xu, 2009 #48). Xu’s scheme is a Dewy labelling scheme (Tatarinov, 2002 #155) used when the document is static; i.e. when no insertions have occurred. Generally, each node has local and global labels and they are assigned as follows: Every node except the root node is grouped with its child nodes and is given a global label. The local label is assigned for each node within a group starting from the parent node; then, the child nodes are labelled in a serial order. The root node refers to the document root. And the child nodes for a specific node are the immediate child nodes without the grandchildren or further descendants (i.e. the nodes that have direct parent/child relationship with a specific node) 4.3 The Initial Labelling Firstly, ‘1’ is assigned to the document root as its global and local labels. Then two phases of the process are performed. The Phase 1: This starts by grouping every node and its child nodes to form a sub-tree. Each sub-tree is given a global label, which consists of two components if the node is not a child of the root. The first component is the Dewey label. The second one is either the number of the child node, starting from left to right 1,2,3…ith where ith is the last child node; or it is information about a new inserted node when random skewed insertion has occurred (Wu, Lee & Hsu, 2008). This second component of the global label is used to preserve the document order after insertions have been made. More details are provided in the next section. Phase 2: This phase involves assigning a local label to each node where all the document root’s child nodes have the same local label, which is ‘1.0’. Then, the local label of the first child within a group is calculated by incrementing its parent’s local label by one; the next sibling node’s local label is then derived by adding one to the previous sibling local label and so on until the end of the document. Figure (4.1) shows the initial labelling of the scheme and the nodes’ labels are presented in Table 4.1 As seen in Figure 4.1: A node can belong to two groups, which seems to overlap. However, this is not an issue because when such overlap occurs, the node can be treated in two ways as required to handle the situation. The first is to handle the overlap as a child node in a group, while the second is to handle it as a root node in another group (Zhang et al., 2001). For example, given a node H which belongs to a group with label 1.2.1 as child node, this same node could have a root in another group which is labelled 1.2.1.1. In this example, the root can be seen to have been assigned by simply concatenating the H node’s 1st group label where it appears as child node and second part of the global label (Yun & Chung, 2008). Thus, from this explanation an overlap requiring the intervention explained above will be said to be formed if and only if one of the following conditions hold: 1. Their 1st part of their global label is the same 2. The 1st part of the global label of one of them was the result of concatenating both parts of global labels of the other one. 4.3.1 The Scheme’s Properties Given two nodes, n1, n2, with level1, level2 as their levels (the level refers to the level of the element node in the XML tree), and with labels A and B, where global labels are a1.a2…am , ith and b1.b2…bn, , jth respectively and their local labels are La1. La2 and Lb1. Lb2, the label properties can be defined as in the following: Node Level: The level information of each node can be derived from its global label as follows: The level is the number of components in the first part of the node’s global label plus 1 if the second part of the global label exists; i.e. if the SG equals null, the level is the number of component in FG. e.g. As shown in Fig. 4.1 and Table 4.1, the level of node ‘B’ is (2), whereas the level of node ‘J’ is (5) based on their global labels, as shown in Table 4.1. Label Order: Case 1: label order between nodes within the same group; i.e. the first parts of their global label are equal or the global label of one of them forms the first part of the global label of the other. In this case, the order is based on the nodes’ local labels and can be simply determined as follows: Definition 1: n1 (is before) n2 if, and only if, one of the two following conditions holds: C1: level1 < level2 C2: level1 = level2 and La1. La2 < Lb1. Lb2 e.g. from Fig. 4.1 and Table 4.1, node ‘B’ with level (2) is before node ‘D’ as its level is (3), and node ‘H’ is before ‘I’ as ‘H’ local (3.0)< ‘I’ local (4.0). Case 2: label order between nodes within a different group. In this case, the order is based on the first part of nodes’ global labels and is determined using the DDE pre-order definition (Xu et al., 2009) which states that: e.g. from Fig. 4.1 and Table 4.1, node ‘C’ Read More

Cite this document

APA
MLA
CHICAGO

(“Explaining my scheme Essay Example | Topics and Well Written Essays - 3000 words”, n.d.)
Explaining my scheme Essay Example | Topics and Well Written Essays - 3000 words. Retrieved from https://studentshare.org/information-technology/1680400-explaining-my-scheme

(Explaining My Scheme Essay Example | Topics and Well Written Essays - 3000 Words)
Explaining My Scheme Essay Example | Topics and Well Written Essays - 3000 Words. https://studentshare.org/information-technology/1680400-explaining-my-scheme.

“Explaining My Scheme Essay Example | Topics and Well Written Essays - 3000 Words”, n.d. https://studentshare.org/information-technology/1680400-explaining-my-scheme.

Cited: 0 times

CHECK THESE SAMPLES OF GroupBased Labelling Scheme for Dynamic XML Databases

Application Development and Databases

All issues will be thoroughly reviewed and al possible solutions will be provided in the light of the case study.... At last the recommendation as to which the report can be further improved will also be discussed. Access control in these IT systems, is one of the cornerstones of any Information Security Policy....

17 Pages (4250 words) Essay

Using a Star Database Schema

Data in the data warehouse comes from various multiple operational databases and therefore in some instances constraints applied in transactional databases need to be loosened.... According to Michelle (2007), a data warehouse is a repository for a company's historical data....

2 Pages (500 words) Research Paper

Labelling Theory

The paper "labelling Theory" highlights that by gaining an understanding why individuals take part in a crime, experts can devise ways to break the cycle, curb crime, and offer rehabilitation to the deviating individuals.... hellip; In general, today, the labelling theory presents a highly significant aspect of criminal justice.... labelling theory presents a distinctly sociological perspective that concentrates on the role played by social labelling in the progression of deviance and crime....

1 Pages (250 words) Essay

Scheme Evaluation & Future Direction - GroupBased Scheme

s far as the outcome of facilitating node insertions in dynamic xml data inefficient way, the results of the study as given in chapter 6 and chapter 7 showed that even though different levels of increases in consumption time were identified for the node's levels of relationship, the GroupBased scheme gave a better performance in determining different relationships in static form as against the DDE scheme.... The paper "Scheme Evaluation & Future Direction - GroupBased Scheme" highlights that generally speaking, re-designing the GroupBased scheme using a different labelling scheme instead of the DDE scheme could improve efficiency or lead to a new theory....

13 Pages (3250 words) Case Study

Developing rigorous hypotheses

In effect, should deductive approach have been used exclusively in this study, the researcher would have had to focus the experiment on testing and examining the theory so that a firm theory or conclusion would be derived from the effect of the proposed scheme as a labelling scheme (Hardy & Bryman, 2004; Creswell, 2007).... As a result of this, instead of exclusively basing on the hypothesis set in the first chapter, greater part of the research approach was inductive, where the researcher's main basis for drawing conclusions on the performance, scalability and efficiency of the proposed labelling scheme was taken from the accumulated findings throughout the research process....

2 Pages (500 words) Essay

Static vs. Dynamic XML queries

dynamic xml queries A dynamic document is one that is continually edited and up d.... A dynamic xml query not only retrieves information and updates the content of the document in question; it also inserts new nodes while at the same time deleting the existing nodes often resulting in a change in the document structure.... A labeling scheme supporting solely static XML queries is not enough for XML to become a general standard for data representation and exchange; a labeling scheme that effectively supports dynamic xml trees is also necessary (BEHRENDS 2007)....

1 Pages (250 words) Essay

Web Site Development and Information Architecture

The author of this coursework "Web Site Development and Information Architecture" analyses two programing languages as HTML and xml.... nbsp;… xml is a document-processing standard that is an official recommendation of the World Wide Web Consortium (W3C), the same group responsible for overseeing the HTML standard (Eckstein & Casabianca, 2001).... According to Eckstein & Casabianca (2001), xml is actually a simplified form of Standard Generalized Markup Language (SGML2), which is extremely complex, especially for the Web....

9 Pages (2250 words) Coursework

History of XML Programming

"History of xml Programming" examines the evolution of the WWW in terms of the need for a general-purpose markup language, draws conclusions about the power and advantages of xml that have made it the global standard for enterprise data exchange, and compares Data Type Definitions and xml Schemas.... According to (Tidwell 4) xml has the ability to do data interchange.... However, xml will make it easy and possible to send structured data across the internet ensuring nothing gets lost during translation....

6 Pages (1500 words) Assignment