
Concurrency Control in Distributed Database Management Systems

Abstract: Many concurrency control algorithms, including large and complex ones, have been proposed for distributed database systems. Yet even as distributed database systems become popular and a commercial reality, the performance tradeoffs of distributed concurrency control are still not well understood. In this research paper I focus on some major and important issues after studying four representative algorithms: 1) Wound-Wait; 2) Distributed 2PL;

3) Basic Timestamp Ordering; and 4) a distributed optimistic algorithm, all evaluated using a detailed model of a distributed DBMS.

Keywords: Cohort Process (Ci), Master Process (M)

1. Introduction

Over the past years, distributed databases have attracted attention in the database research community.


Data distribution and replication offer opportunities for improving performance through parallel query execution and load balancing, as well as for increasing the availability of data. In fact, these opportunities have played a major role in motivating the design of the current generation of database machines (e.g., [1, 2]). This paper addresses some of the important performance issues related to these algorithms. Most distributed concurrency control algorithms fall into one of three basic classes: locking algorithms [3, 4, 5, 6, 7], timestamp algorithms [8, 9, 1], and optimistic (or certification) algorithms [10, 11, 12, 13]. Many proposed algorithms are reviewed in [14], which also describes how additional algorithms may be synthesized by combining basic mechanisms from the locking and timestamp classes. Given the many proposed distributed concurrency control algorithms, a number of researchers have undertaken studies of their performance.

For example, the behavior of various distributed locking algorithms was investigated in [15, 4, 16, 17], where algorithms with varying degrees of centralization of locking and different approaches to deadlock handling were studied and compared with one another. Several distributed timestamp-based algorithms were examined in [18]. A qualitative study addressing performance issues for a number of distributed locking and timestamp algorithms was presented in [14]. The performance of locking was compared with that of basic timestamp ordering in [19], and with basic and multi-version timestamp ordering in [20]. While the distributed concurrency control performance studies to date have been useful, a number of important questions remain unanswered.

These include:

1. How do the performance characteristics of the various basic algorithm classes compare under alternative assumptions about the nature of the database, the workload, and the computational environment?

2. How does the distributed nature of transactions affect the behavior of the various classes of concurrency control algorithms?

3. How much of a performance penalty must be incurred for synchronization and updates when data is replicated for availability or query performance reasons?

We examine four concurrency control algorithms in this study, including two locking algorithms, a timestamp algorithm, and an optimistic algorithm.

The algorithms considered span a wide range of characteristics in terms of how conflicts are detected and resolved. Section 2 describes our choice of concurrency control algorithms. The structure and characteristics of our model are described in Section 3, and Section 4 describes the testing of the algorithms.

Finally, Section 5 summarizes the main conclusions of this study and raises questions that we plan to address in the future.

2. DISTRIBUTED CONCURRENCY CONTROL ALGORITHMS

[138 International Journal of Computer Science & Communication (IJCSC)]

For this study, we have chosen to examine four algorithms that we consider to be representative of the basic design space for distributed concurrency control mechanisms. We summarize the salient aspects of these four algorithms in this section. In order to do this, we must first explain the structure that we have assumed for distributed transactions.

2.1. The Structure of Distributed Transactions

Figure 1 shows a general distributed transaction in terms of the processes involved in its execution. Each transaction has a master process (M) that runs at its site of origination. The master process in turn sets up a collection of cohort processes (Ci) to perform the actual processing involved in running the transaction.

Since virtually all query processing strategies for distributed database systems involve accessing data at the site(s) where it resides rather than accessing it remotely, there is at least one such cohort for each site where data is accessed by the transaction. In general, data may be replicated, in which case each cohort that updates any data items is assumed to have one or more update (Uij) processes associated with it at other sites. In particular, a cohort will have an update process at each remote site that stores a copy of the data items that it updates. It communicates with its update processes for concurrency control purposes, and it also sends them copies of the relevant updates during the first phase of the commit protocol described below. The centralized two-phase commit protocol is used in conjunction with each of the concurrency control algorithms examined. The protocol works as follows [5]:

Fig 1: Distributed Transaction Structure

When a cohort finishes executing its portion of a query, it sends an "execution complete" message to the master. When the master has received such a message from each cohort, it initiates the commit protocol by sending "prepare to commit" messages to all sites.

Assuming that a cohort wishes to commit, it sends a "prepared" message back to the master, and the master sends "commit" messages to each cohort after receiving prepared messages from all cohorts. The protocol ends with the master receiving "committed" messages from each of the cohorts. If any cohort is unable to commit, it returns a "cannot commit" message instead of a "prepared" message in the first phase, causing the master to send "abort" instead of "commit" messages in the second phase of the protocol.
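The message flow above can be condensed into a small sketch. The following Python model is illustrative only; the `Cohort` class, its fields, and the function names are our own, not the paper's simulator:

```python
# Illustrative model of the centralized two-phase commit protocol.
# Cohort names and message strings are hypothetical.

class Cohort:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit

    def prepare(self):
        # Phase 1: each cohort votes "prepared" or "cannot commit".
        return "prepared" if self.can_commit else "cannot commit"

    def finish(self, decision):
        # Phase 2: each cohort acknowledges the master's decision.
        return "committed" if decision == "commit" else "aborted"

def master_two_phase_commit(cohorts):
    # Phase 1: master sends "prepare to commit" and collects votes.
    votes = [c.prepare() for c in cohorts]
    decision = "commit" if all(v == "prepared" for v in votes) else "abort"
    # Phase 2: master sends the decision and collects acknowledgements.
    acks = [c.finish(decision) for c in cohorts]
    return decision, acks

# A single "cannot commit" vote flips the global decision to abort:
print(master_two_phase_commit([Cohort("c1"), Cohort("c2", can_commit=False)])[0])  # abort
```

Note that one dissenting vote in phase one aborts the whole transaction, exactly as the protocol description requires.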

When replica update processes are present, the commit protocol becomes a nested two-phase commit protocol. Messages flow between the master and the cohorts, and the cohorts in turn interact with their updaters. That is, each cohort sends "prepare to commit" messages to its updaters after receiving such a message from the master, and it gathers the responses from its updaters before sending a "prepared" message back to the master; phase two of the protocol is similarly modified.

2.2. Distributed Two-Phase Locking (2PL)

The first algorithm is the distributed "read any, write all" two-phase locking algorithm described in [15]. Transactions set read locks on items that they read, and they convert their read locks to write locks on items that need to be updated. To read an item, it suffices to set a read lock on any copy of the item, so the local copy is locked; to update an item, write locks are required on all copies.
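The "read any, write all" rule can be sketched with a per-copy lock table. The table structure and function names below are hypothetical, and a real lock manager would block (and risk deadlock) rather than return False on conflict:

```python
# Sketch of "read any, write all": a read locks one (local) copy,
# a write must lock every copy of the item.

class CopyLockTable:
    def __init__(self):
        self.locks = {}  # (item, site) -> (mode, set of holder txn ids)

    def acquire(self, txn, item, site, mode):
        held = self.locks.get((item, site))
        if held is None:
            self.locks[(item, site)] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "read" and held_mode == "read":
            holders.add(txn)          # shared read locks are compatible
            return True
        if holders == {txn}:
            self.locks[(item, site)] = (mode, {txn})  # upgrade by sole holder
            return True
        return False                  # conflict: the caller would block here

def read_any(table, txn, item, local_site):
    # Reading requires a read lock on any one copy, so lock the local copy.
    return table.acquire(txn, item, local_site, "read")

def write_all(table, txn, item, all_sites):
    # Updating requires write locks on all copies.
    return all(table.acquire(txn, item, s, "write") for s in all_sites)
```

In this sketch `write_all` simply reports failure on the first conflicting copy; the paper's algorithm instead blocks the transaction until every copy is locked.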

Write locks are obtained as the transaction executes, with the transaction blocking on a write request until all of the copies of the item to be updated have been successfully locked. All locks are held until the transaction has successfully committed or aborted. Deadlock is a possibility. Local deadlocks are checked for any time a transaction blocks, and are resolved when necessary by restarting the transaction with the most recent initial startup time among those involved in the deadlock cycle. (A cohort is restarted by aborting it locally and sending an "abort" message to its master, which in turn notifies all of the processes involved in the transaction.

) Global deadlock detection is handled by a "Snoop" process, which periodically requests waits-for information from all sites and then checks for and resolves any global deadlocks (using the same victim selection criteria as for local deadlocks). We do not associate the "Snoop" responsibility with any particular site. Instead, each site takes a turn being the "Snoop" site and then hands this task over to the next site. The "Snoop" responsibility thus rotates among the sites in a round-robin fashion, ensuring that no one site will become a bottleneck due to global deadlock detection costs.

2.3. Wound-Wait (WW)

The second algorithm is the distributed wound-wait locking algorithm, again with the "read any, write all" rule. It differs from 2PL in its handling of the deadlock problem: rather than maintaining waits-for information and then checking for local and global deadlocks, deadlocks are prevented via the use of timestamps.

[An Approach for Concurrency Control in Distributed Database System, 139]

Each transaction is numbered according to its initial startup time, and younger transactions are prevented from making older ones wait. If an older transaction requests a lock, and the request would lead to the older transaction waiting for a younger transaction, the younger transaction is "wounded": it is restarted unless it is already in the second phase of its commit protocol (in which case the "wound" is not fatal, and is simply ignored). Younger transactions can wait for older transactions, so the possibility of deadlock is eliminated.
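The wound-wait decision can be stated compactly. This sketch assumes, as in the text, that timestamps are startup times and that a smaller timestamp means an older transaction; the function and argument names are ours:

```python
# Sketch of the wound-wait rule: smaller timestamp = older transaction.

def wound_wait(requester_ts, holder_ts, holder_in_commit_phase2=False):
    """Action taken when a requester asks for a lock held by another txn."""
    if requester_ts < holder_ts:
        # Older requester: wound (restart) the younger holder, unless the
        # holder is already in phase two of its commit, in which case the
        # wound is not fatal and is simply ignored.
        return "wound ignored" if holder_in_commit_phase2 else "wound holder"
    # Younger requester simply waits for the older holder; since only
    # younger-waits-for-older edges exist, no deadlock cycle can form.
    return "requester waits"
```

Because every wait edge points from a younger to an older transaction, the waits-for graph is acyclic by construction, which is why no deadlock detector is needed.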

2.4. Basic Timestamp Ordering (BTO)

The third algorithm is the basic timestamp ordering algorithm of [9, 14]. Like wound-wait, it employs transaction startup timestamps, but it uses them differently. Rather than using a locking approach, BTO associates timestamps with all recently accessed data items and requires that conflicting data accesses by transactions be performed in timestamp order. Transactions that attempt to perform out-of-order accesses are restarted.

When a read request is received for an item, it is permitted if the timestamp of the requester exceeds the item's write timestamp. When a write request is received, it is permitted if the requester's timestamp exceeds the read timestamp of the item; in the event that the timestamp of the requester is less than the write timestamp of the item, the update is simply ignored (by the Thomas write rule [14]). For replicated data, the "read any, write all" approach is used, so a read request may be sent to any copy while a write request must be sent to (and approved by) all copies. Integration of the algorithm with two-phase commit is accomplished as follows: writers keep their updates in a private workspace until commit time.
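A minimal sketch of these checks, including the Thomas write rule, might look like this. The dict-based item representation is our assumption, as is permitting accesses on timestamp equality (the text only states strict comparisons):

```python
# Sketch of basic timestamp ordering checks on one data item, which
# carries a read timestamp ("rts") and a write timestamp ("wts").

def bto_read(item, ts):
    # A read behind a younger write must restart the reader.
    if ts < item["wts"]:
        return "restart"
    item["rts"] = max(item["rts"], ts)
    return "ok"

def bto_write(item, ts):
    # A write behind a younger read must restart the writer ...
    if ts < item["rts"]:
        return "restart"
    # ... but a write behind a younger write is simply ignored,
    # since its effect is already obsolete (Thomas write rule).
    if ts < item["wts"]:
        return "ignored"
    item["wts"] = ts
    return "ok"
```

Conflicting accesses thus execute in timestamp order, with out-of-order reads and read-blocked writes restarted and obsolete writes silently dropped.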

2.5. Distributed Certification (OPT)

The fourth algorithm is the distributed, timestamp-based, optimistic concurrency control algorithm from [13], which operates by exchanging certification information during the commit protocol. For each data item, a read timestamp and a write timestamp are maintained. Transactions may read and update data items freely, storing any updates into a local workspace until commit time. For each read, the transaction must remember the version identifier (i.e., write timestamp) associated with the item when it was read.
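The per-cohort bookkeeping just described, deferred updates in a private workspace plus remembered read versions, might be sketched as follows. The class and its version check are hypothetical simplifications; the algorithm's full certification also covers writes:

```python
# Sketch of a cohort's optimistic bookkeeping: reads record the version
# (write timestamp) seen, and writes stay in a private workspace.

class OptimisticCohort:
    def __init__(self, db):
        self.db = db             # item -> current write timestamp (version)
        self.read_versions = {}  # item -> version seen when it was read
        self.workspace = {}      # deferred updates, applied only on commit

    def read(self, item):
        # Remember the version identifier of the item at read time.
        self.read_versions[item] = self.db[item]

    def write(self, item, value):
        # Updates are invisible to other transactions until commit.
        self.workspace[item] = value

    def certify_reads(self):
        # A read certifies only if the version it saw is still current.
        return all(self.db[item] == seen
                   for item, seen in self.read_versions.items())
```

If any item read by the cohort has since acquired a newer version, certification fails and the transaction must be restarted.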

Then, when all of the transaction's cohorts have completed their work and have reported back to the master, the transaction is assigned a globally unique timestamp. This timestamp is sent to each cohort in the "prepare to commit" message, and it is used to locally certify all of its reads and writes as follows: a read request is certified if (i) the version that was read is still the current version of the item, and (ii) no write with a newer timestamp has already been locally certified. A write request is certified if (i) no later reads have been certified and subsequently committed, and (ii) no later reads have been locally certified already.

3. MODELING A DISTRIBUTED DBMS

Figure 2 shows the general structure of the model. Each site in the model has four components: a source, which generates transactions and also maintains transaction-level performance information for the site; a transaction manager, which models the execution behavior of transactions; a concurrency control manager, which implements the details of a particular concurrency control algorithm; and a resource manager, which models the CPU and I/O resources of the site.

In addition to these per-site components, the model also has a network manager, which models the behavior of the communications network. Figure 3 presents a slightly more detailed view of these components and their key interactions.

Fig 2: Distributed DBMS Model Structure

Fig 3: A Closer Look at the Model

3.1. The Transaction Manager

Each transaction in the workload has a master process, a number of cohorts, and possibly a number of updaters. As described earlier, the master resides at the site where the transaction was submitted. Each cohort makes a sequence of read and write requests to one or more files that are stored at its site; a transaction has one cohort at each site where it needs to access data.

Cohorts communicate with their updaters when remote write access permission is needed for replicated data, and the updaters then make the required write requests for local copies of the data on behalf of their cohorts. A transaction can execute in either a sequential or parallel fashion, depending on the execution pattern of the transaction class.

3.2. The Resource Manager

The resource manager can be viewed as a model of the operating system for a site; it manages the physical resources of the site, including its CPU and its disks. The resource manager provides CPU and I/O service to the transaction manager and concurrency control manager, and it also provides message sending services (which involve using the CPU resource). The transaction manager uses CPU and I/O resources for reading and writing disk pages, and it also sends messages.

The concurrency control manager uses the CPU resource for processing requests, and it too sends messages.

3.3. The Network Manager

The network manager encapsulates the model of the communications network. Our network model is currently quite simplistic, acting just as a switch for routing messages from site to site. The characteristics of the network are isolated in this module; it would be a simple matter to replace our current model with a more sophisticated one in the future.

3.4. The Concurrency Control Manager

The concurrency control manager captures the semantics of a given concurrency control algorithm, and it is the only module that must be changed from algorithm to algorithm. As was illustrated in Figure 3, it is responsible for handling concurrency control requests made by the transaction manager, including read and write access requests, requests to get permission to commit a transaction, and several types of master and cohort management requests to initialize and terminate master and cohort processes.

4. TESTING OF ALGORITHMS

Suppose two transactions, T1 and T2, are submitted at the same time from clients 1 and 2. The cohort processes T11 and T12 of transaction T1, and T21 and T22 of transaction T2, are executing on servers 1 & 3 and servers 1 & 4 respectively. In the first case, the cohort processes T11 and T21 of transactions T1 and T2 update the same data content Q1.

The two-phase locking protocol (2PL) ensures that T11 and T21 update the data content sequentially. Both cohorts first request a lock on the data content Q1. The database management system grants the lock either to T11 or to T21, depending on their order of arrival at the server. The wound-wait (WW) protocol is used to ensure the consistency of the database: if TS(T11) < TS(T21) and Q1 is presently locked by T21, then T21 is wounded and Q1 is allocated to T11.

When T11 finishes execution, Q1 is given to T21 for further execution. Basic timestamp ordering (BTO) is used to check whether all the cohorts read and update the correct versions of the data contents, thereby ensuring the consistency of the database. If T11 has locked and is updating the data content, the BTO protocol ensures that T11 is allowed to update only if TS(T11) > RTS(Q1); otherwise the update is ignored. If T11 is reading the data content, it ensures that T11 is permitted only if TS(T11) > WTS(Q1); otherwise the request is simply ignored. The distributed certification (OPT) protocol is used to ensure that transactions read and update the correct versions of the data contents.

This protocol uses read certification, where it ensures that the version of the data content that was read is still the current version and that no write with a newer timestamp has been locally or globally certified. In the case of write request certification, it ensures that no later reads have been certified and subsequently committed, and that no later reads have been locally or globally certified already. In the second case, the cohort processes T12 and T22 of transactions T1 and T2 update the data content Q2 on servers 3 & 4 respectively. The two-phase locking protocol (2PL) ensures that the data content Q2 is updated sequentially. T12 and T22 both request a lock on the data content Q2.

In this situation, there may be a global deadlock. This global deadlock is handled by the "Snoop" process: one of the cohorts, either T12 or T22, is selected as the victim and restarted.
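The Snoop's global check amounts to cycle detection on the merged waits-for graph, with the transaction in the cycle that has the most recent startup timestamp chosen as the victim. A sketch, assuming for simplicity that each transaction waits behind at most one other; the graph encoding is ours:

```python
# Sketch of global deadlock detection over a merged waits-for graph.

def find_cycle(waits_for):
    # waits_for: txn -> the txn it is blocked behind (one edge per txn here).
    for start in waits_for:
        seen, node = [], start
        while node in waits_for:
            if node in seen:
                return seen[seen.index(node):]  # the cycle itself
            seen.append(node)
            node = waits_for[node]
    return None  # no cycle: no global deadlock

def choose_victim(cycle, startup_ts):
    # Restart the youngest transaction (most recent startup time) in the cycle.
    return max(cycle, key=lambda t: startup_ts[t])
```

For the scenario above, a cycle between T12 and T22 with TS(T22) > TS(T12) would yield T22 as the victim to be restarted.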

The lock can thus be granted either to T12 or to T22. The wound-wait (WW) protocol checks whether TS(T12) < TS(T22); if so, Q2 is locked by T12, otherwise by T22. In this case we assume that T22 was selected as the victim and submitted again. Basic timestamp ordering (BTO) is used to check whether all the cohorts read and update the correct versions of the data contents, thereby ensuring the consistency of the database. If T12 has locked and is updating the data content, the BTO protocol ensures that T12 is allowed to update only if TS(T12) > RTS(Q2); otherwise the update is ignored.

If T12 is reading the data content, it ensures that T12 is permitted only if TS(T12) > WTS(Q2); otherwise the request is simply ignored. The distributed certification (OPT) protocol uses read and write certification to maintain the consistency of the data content. In this testing, we find that in all cases our algorithms maintain the consistency of the data contents while also avoiding and resolving deadlocks.

5. CONCLUSIONS AND FUTURE WORK

In this paper we have tried to shed light on distributed concurrency control performance tradeoffs by studying the performance of four representative algorithms (distributed 2PL, wound-wait, basic timestamp ordering, and a distributed optimistic algorithm) using a common performance framework. We examined the performance of these algorithms under various degrees of contention, data replication, and workload distributions.

The combination of these results suggests that "optimistic locking," where transactions lock remote copies of data only as they enter the commit protocol (at the risk of end-of-transaction deadlocks), may actually be the best performer in replicated databases where messages are costly. We plan to investigate this assumption in the future.

REFERENCES

[1] "Implementing Atomic Actions on Decentralized Data," ACM Trans. on Comp. Sys. 1, 1, Feb. 1994.
[2] "A High Performance Backend Database Machine," Proc. 12th VLDB Conf., Kyoto, Japan, Aug. 1996.
[3] "Locking and Deadlock Detection in Distributed Databases," Proc. 3rd Berkeley Workshop on Dist. Data Mgmt. and Comp. Networks, Aug. 2000.
[4] "The Effects of Concurrency Control on the Performance of a Distributed Data Management System," Proc. 4th Berkeley Workshop on Dist. Data Mgmt. and Comp. Networks, Aug. 2000.
[5] "Distributed Concurrency Control Performance: A Study of Algorithms, Distribution, and Replication," Comp. Sci. Dept., Madison, 2002.
[6] "Concurrency Control and Consistency of Multiple Copies of Data in Distributed INGRES," IEEE Trans. on Software Engineering SE-5, 3, May 2004.
[7] "Transactions and Consistency in Distributed Database Systems," ACM Trans. on Database Sys. 7, 3, Sept. 2004.
[8] "A Majority Consensus Approach to Concurrency Control for Multiple Copy Databases," ACM Trans. on Database Sys. 4, 2, June 2003.
[9] "Timestamp-Based Algorithms for Concurrency Control in Distributed Database Systems," Proc. 6th VLDB Conf., Mexico City, Mexico, Oct. 2003.
[10] "Correctness of Concurrency Control and Implications in Distributed Databases," Proc. COMPSAC '04 Conf., Chicago, IL, Nov. 2004.
[11] "Optimistic Methods for Concurrency Control in Distributed Database Systems," Proc. 7th VLDB Conf., Cannes, France, Sept. 2005.
[12] "On the Use of Optimistic Methods for Concurrency Control in Distributed Databases," Proc. 6th Berkeley Workshop on Dist. Data Mgmt. and Comp. Networks, Feb. 2000.
[13] "Timestamp Based Certification Schemes for Transactions in Distributed Database Systems," Proc. ACM SIGMOD Conf., Austin, TX, May 2000.
[14] "Concurrency Control in Distributed Database Systems," ACM Comp. Surveys 13, 2, June 2001.
[15] "Performance of Update Algorithms for Replicated Data in a Distributed Database," Ph.D. Thesis, Comp. Sci. Dept., Stanford Univ., June 2006.
[16] "Performance of Two Phase Locking," Proc. 6th Berkeley Workshop on Dist. Data Mgmt. and Comp. Networks, Feb. 2005.
[17] "Measured Performance of Time Interval Concurrency Control Techniques," Proc. 13th VLDB Conf., Brighton, England, Sept. 2005.
[18] "Performance Model of Timestamp-ordering Concurrency Control Algorithms in Distributed Databases," IEEE Trans. on Comp. C-36, 9, Sept. 2000.
[19] "Concurrency Control Performance Issues," Ph.D. Thesis, Comp. Sci. Dept., Univ. of Toronto, Sept. 2006.
[20] "Basic Timestamp, Multiple Version Timestamp, and Two-Phase Locking," Proc. 9th VLDB Conf., Florence, Italy, Nov. 2004.