StudentShare
Contact Us
Sign In / Sign Up for FREE
Search
Go to advanced search...
Free

String Processing Structures and Algorithms - Assignment Example

Cite this document
Summary
The author of this paper "String Processing Structures and Algorithms" examines the need-based classification of that are text data and numeric data, elementary or basic string operations: substring operations, length operations, concatenation operation, and the description of each operation type…
Download full paper File format: .doc, available for editing
GRAB THE BEST PAPER93.3% of users find it useful
String Processing Structures and Algorithms
Read Text Preview

Extract of sample "String Processing Structures and Algorithms"

of school] of Discipline] String Processing of Introduction: The data when observed under the scope of computers is generally classified into two types that are text data and numeric data. This classification is sometimes considered as a need based classification. This is due to the fact that the criteria behind this classification is based on storage optimization and nature of processing of respective data. Numeric data is stored as per its binary representation. The equivalent binary number of a numeric decimal value is stored as a string of bits on required number of bytes. Whereas the text or character type data needs one byte to store one character. A finite sequence or band of characters or bits is called as string. However in the text below the term string is used for text or character strings. Strings are sometimes defined as array of characters (ASCII or Unicode). Array is one of the basic data structures that are used to store data. The design of array is based on a logical linear organization (Abstract Data Type) known as a LIST. List is any linear or columnar representation of data items. Another structure that is used to represent the LIST ADT is Linked List. This is a dynamic structure as compared to array which is static in nature. The size of static structures cannot be reduced or expanded at runtime (i.e. after they are established or declared in memory). The dynamic structures allow resizing at runtime. The strings due to their linear nature can also be represented through linked lists but the management of strings implemented through linked lists become highly cumbersome. Array is the most suitable structure to implement strings. To make strings finite there must be an end mark that can be easily indicated through a special character. As per the standards established by C language the NULL character (i.e. ‘\0’) is used as a terminator symbol in the following text. Some of the examples of strings are, S1 = A Q u i c k b r o w n ‘\0’ S2 = E n g a d g e t ‘\0’ The strings are generally represented through an identifier. ‘S1’ and ‘S2’ are used as string names or identifiers in the examples mentioned above. Note that both the strings are of varying length with their respective end markers (NULL characters). String Operations: Data without operations is meaningless. Similarly, the storage of strings as character arrays is of no use until and unless exposed to the required string processing operations that are required to support real world transactional requirements. The algorithms for the string operations discussed in the text below are highlighted through a generalized pseudo code. However the syntax of C language is dominant. Strings operations are generally classified into two categories, that are, 1. Elementary string operations. 2. Word Processing operations. ELEMENTARY STRING OPERATIONS: Elementary or basic string operations are the most essential operations that help advanced string operations to achieve required results. The discussion in the coming text explains each operation in detail. These operations are, LENGTH operation This operation or function is performed to find out the length of a given string. This string operation is the most elementary. Rest of the operations use it to produce the required results. This operation is generally represented as LEN (S) or StringLen(S) with a string (whose length is to be found out) as its single parameter. This function returns the length of given string as a numeric integer. Figure 1.0 highlights the respective algorithm. int StringLen (S) { count = 0; while (1) { if (S [ count ] == ‘\0’) break; count++; } return count; } Figure 1.0 SUBSTRING operation This operation is about extracting a required part of a given string i.e. taking out a sub string from a super or larger string. This is one of extremely useful string operations. This operation returns a string and needs a super string, the position number from where the sub string is to be extracted out or copied and the length of required substring. The substrings can be of multiple types but the most popular types are Initial Substrings, Terminal Substrings and Substrings. The operations is generally represented as, char[] SubString (S, position, length_substr) The algorithmic pseudo code of SubString ( ) function is presented in figure 1.1 char[] SubString (S, position, length_substr) { char S1[length_substr]; count = 0; while (count < = length_substr) { S1 [ count ] = S [position+count]; count++; } S1[count] = ‘\0’; return S1; } // Declare a new char array to hold the required substring with an added subscript to accommodate NULL character. Figure 1.1 Example: Consider S = A Q u i c k b r o w n ‘\0’ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 Then the result of SubString(S, 8,5) would be, S1 = b r o w n ‘\0’ 0 1 2 3 4 5 Note that position is given according to the zero orientation standard of C-language i.e. for 9th position the subscript 8 is given. The NULL character is placed at 6th position which is created as an additional position while declaring the sub string character array. There are number of versions of the same substring function/operation that are used for varied scenarios. For example instead of stopping at a counted length there can be a sentinel based break that allows the extraction of substrings of indefinite lengths. This version is very useful in performing certain business transactions where a particular information is required as a substring from larger strings. For example to extract out the user ids from respective email addresses in a database the sentinel character would be ‘@’. The detailed narration is as follows. Let ‘email’ be a character string or a field that contains email addresses of registered users like customers or suppliers in a scenario of a sales business. The starting position is understood be the first position in such cases. email = m y u s e r i d @ e m a i l . c o m ‘\0’ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 The narration if this version of SubString ( ) function is given below in figure 1.2. char[] SubString (email, sentinel) { char S1[l]; count = 0; while (email[count] != sentinel) { S1 [ count ] = email [count]; count++; } S1[count] = ‘\0’; return S1; } // Declare a new char array to hold the required substring. Figure 1.2 Now the result of SubString (email, ‘@’) would be, S1 = m y u s e r i d ‘\0’ 0 1 2 3 4 5 6 7 8 Sometimes, the identification numbers are assigned in a sequential manner with some other codes to identify the department number, work type or station of an employee. For example “ENG-1010” may mean an employee working at production department and its seniority rank is 1010 according to the employee hiring sequence. Thus if an organization wants to extract out the department name of an employee or requires to compare seniority levels for promotion of employees, the SubString function would be of great help. Concatenation The concatenation operation is about merging two different strings into one string. This operation is generally performed by declaring a third character array of size equal to the collective sized of both the arrays to be merged or concatenated. Afterwards both the strings are transferred to the third array character by character sequentially with a NULL character placed at end. The signature of concatenation operation or function can generally be given as, char [] Concat (S1, S2) Where S1 and S2 are strings to be concatenated and the result would be returned as a character string. The algorithmic narration of Concat( ) is presented in figure 1.3. char[] Concat (S1, S2) { char result[StringLen ( S1 ) + StringLen (S2) + 1 ]; count = 0; while (S1[count] != NULL) { result [ count ] = S1 [count]; count++; } c = 0; while (S2[c] != NULL) { result [ count ] = S2 [c]; count++; c ++; } result[count] = NULL; return result; } // Declare a new char array to hold the required substring with size = Length of S1 +Length of S2 + 1 // The NULL character is considered equivalent to ‘\0’ Figure 1.2 For example let S1 and S2 be the strings to be merged, S1 = b r o w n ‘\0’ 0 1 2 3 4 5 6 S2 = f o x ‘\0’ 0 1 2 3 Then Concat (S1, S2) would result in, result = b r o w n f o x ‘\0’ 0 1 2 3 4 5 6 7 8 9 The concatenation operation is useful in many respects. For example, consider a scenario of a business that requires to send letters to its respective clients (male and female both). The title association with the names stored in database can accordingly be done through the Concat ( ) operation / function. The gender field is to be checked to figure out whether the client is a male or female. The titles of “Mr. “ and “Ms. “ will then be concatenated accordingly. Following is the detailed narration. if (gender == “male”) Concat (“Mr. “, FirstName) else Concat (“Ms. “, FirstName) Figure 1.4 Pattern matching: Pattern matching is about searching the location of a substring (if exists) into a super string. This operation returns NULL if the substring is not found. The super string is sometimes referred to as Text while the substring that is to be found out and matched is referred to as Pattern. Pattern matching or string comparison is done through multiple algorithms. The most famous algorithms are Brute Force, Knuth Morris Pratt Algorithm and Rabin Karp Algorithm. This operation returns the location or position of matched string as an integer or returns NULL if not found. The signature of this function or operation is generally represented as, int INDEX (Text, Pattern) Where, Text and Pattern are of string type. An example below would elaborate the function of this operation. Let T be the text string and P be the pattern string. T = A Q u i c k b r o w n ‘\0’ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 And, P = Q u i c k ‘\0’ 0 1 2 3 4 5 The result of INDEX(T,P) would be 3, which is the position of the pattern in the given text. Two of the most popular algorithmic pseudo codes are presented through figures 1.5 and 1.6 subsequently. INDEX1 is a Brute Force Algorithm while INDEX2 uses the state transition methodology by creating initial and terminal substrings of the given pattern to originate intermediate states of matched pattern. INDEX1(T,P) { K = 1; while (K  StringLen(T)-StringLen(P)-1) { FOR (J=0; J < = StringLen(P); J++) { If (P[J] == P[K+J]) if ( J == StringLen(P) + 1) return K+ 1; else break; } K = K+1 } return NULL } Figure 1.5 INDEX2(T,P) { K = 0; S[K] = Q[K]; // Q[K] contains terminal substrings of pattern while (S[K]  P AND K  StringLen(T)) { S[K+1] = F (S[K] , T[K]) K = K+1 IF (S[K] = P) Return K-LENGTH(P) else Return NULL } } Figure 1.6 Pattern matching is of great use for generalized business processing. The databases that contain business records may be processed in numerous ways through pattern matching to achieve required results. For example to target the customers of certain profession like “Doctors” pattern matching is highly useful to conclude the list of doctors. WORD PROCESSING OPERATIONS: The most notable word processing operations or Advanced String Operations are INSERT( ), DELETE ( ) and REPLACE ( ). There are other advanced operation that are usually incorporated in a word processor but these are the simplest. These operations can easily be narrated through the primary or basic string processing functions that are explained above. The narration of each with examples is as follows. INSERT(T, S, position) means insert the given string S into the intended text T (which is also a string) at location indicated through the position variable. T = A Q u i c k f o x ‘\0’ 0 1 2 3 4 5 6 7 8 9 10 11 S = b r o w n ‘\0’ 0 1 2 3 4 5 6 INSERT(T.S,8) would result in, T = A Q u i c k b r o w n f o x ‘\0’ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 This result is achieved through the use of elementary string processing functions. The function calls are arranged in the manner presented below Concat( SubString(T, 0, position - 2) , S , SubString (T, position -1, StringLen( T ) – (position – 2)) The DELETE and REPLACE functions also work similarly i.e. with the help of basic string operations. REPLACE (T, S1, S2) means replace in S1 with S2 in text T. This is achieved through, 1. location = INDEX (T, S1) // Search for string 1 into the text 2. DELETE(T, location, StringLen(S1) // Delete number of characters equal to the length of string 1 from searched location. 3. INSERT (T, location, S2) // Insert S2 at position given as searched location in text T. CONCLUSION: The string processing functions lie at core of every scenario that is related to text or word processing. Be it a business database containing daily transactions or a personal document, the string processing is always required to achieve required results. REFERENCES: LANGSAM, Y., AUGENSTEIN, M., TENENBAUM, A. M., & TENENBAUM, A. M. (1996). Data structures using C and C++. Upper Saddle River, N.J., Prentice Hall. Read More
Cite this document
  • APA
  • MLA
  • CHICAGO
(String Processing Structures and Algorithms Assignment Example | Topics and Well Written Essays - 2000 words, n.d.)
String Processing Structures and Algorithms Assignment Example | Topics and Well Written Essays - 2000 words. https://studentshare.org/information-technology/1776867-data-structures-and-algorithms
(String Processing Structures and Algorithms Assignment Example | Topics and Well Written Essays - 2000 Words)
String Processing Structures and Algorithms Assignment Example | Topics and Well Written Essays - 2000 Words. https://studentshare.org/information-technology/1776867-data-structures-and-algorithms.
“String Processing Structures and Algorithms Assignment Example | Topics and Well Written Essays - 2000 Words”. https://studentshare.org/information-technology/1776867-data-structures-and-algorithms.
  • Cited: 0 times

CHECK THESE SAMPLES OF String Processing Structures and Algorithms

Writing Scientific Report

A simple comparison of the images presented above reveals that the first above tends to coalesce human structures and certain rocks with vegetation.... The second image has excluded vegetation near the centre especially and around it where human made structures exist....
22 Pages (5500 words) Coursework

Benchmarking Micro-architecture Using Software

Name: Instructor: Course: Date: Introduction The Microprocessor without Interlocked Pipeline Stages (MIPS) is common in the programming of microsystems.... These microsystems include microprocessors and microcontrollers, which are used to achieve certain tasks.... hellip; The performance of these systems greatly depends on the design of the hardware and the firmware used as well....
4 Pages (1000 words) Research Paper

Big Data Problem Causes and Exposition

Some problems experienced in handling big data include collecting, processing, analysing and storing of the meaningful data for future use (Ohlhorst, 2013, p11).... Big Data Problem Name: Institution: Course: Tutor: Date: Big data is a collection of sets of data that are very large and compound which are challenging to capture, store, search, transfer, analyse or visualise using simple method and using the usual database that analyse small data....
4 Pages (1000 words) Essay

Information Technology in Financial Organizations

Business functions in financial organizations in today's world are directed to attract more customers and investors.... Just a few decades back, most of the work was carried out manually thereby requiring a lot of effort, manpower and time to carry out even the most basic functions of the organization (tam-inc....
10 Pages (2500 words) Essay

Storage of User Generated Data Using Distributed Backup Method

Big data is a term used in the description of a massive volume of unstructured and structured data so large such that it presents considerable difficulty in its processing via conventional software… An IDC report has made the prediction that there will be growth in the volume of global data by a constant factor of 300 from 2005 to the year 2020 (Aluru & Simmhan, 2013)....
5 Pages (1250 words) Essay

Assessment of Going to College for a Computer Science Degree and How It Prepares One for That Career

The paper "Assessment of Going to College for a Computer Science Degree and How It Prepares One for That Career" describes that the author has decided to grow in this field in his professional career.... He has joined a reputed and good college and has started taking the classes quite seriously.... hellip; It is clear that I had a strong interest in computer programming and applications from my child hood....
6 Pages (1500 words) Essay

Computer Theory: Finite Automato

string - a sequence of symbols.... f the system is in one of the accept states while the FSm is being done, the string is made to be accepted in the process.... … FINITE AUTOMATOAn FA is a 5-tuple  , whereQ: Finite set of states : Finite input alphabet : Transition function Start stateF: Set of final statesFinite state automaton is also known as a finite state machine (FSM)....
6 Pages (1500 words) Essay

Computer Networks Problem Solving

… The paper “Computer Networks Problem Solving” is a  forceful variant of assignment on information technology.... Socket API is a service model that works on the TCP protocol of the application layer.... Sockets are defined as the created endpoints on both the sender and receiver that establish a service....
8 Pages (2000 words) Assignment
sponsored ads
We use cookies to create the best experience for you. Keep on browsing if you are OK with that, or find out how to manage cookies.
Contact Us