Misplaced Pages

Global serializability

Article snapshot taken from[REDACTED] with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
This article has multiple issues. Please help improve it or discuss these issues on the talk page. (Learn how and when to remove these messages)
This article may be too technical for most readers to understand. Please help improve it to make it understandable to non-experts, without removing the technical details. (January 2017) (Learn how and when to remove this message)
This article needs attention from an expert in databases or computer science. The specific problem is: extremely difficult to understand for a non-technical audience and parts read like an essay or personal POV. Discussion on the talk page went on in 2011-2012, but seemingly came to no consensus. Needs an overhaul. WikiProject Databases or WikiProject Computer science may be able to help recruit an expert. (September 2022)
(Learn how and when to remove this message)

In concurrency control of databases, transaction processing (transaction management), and other transactional distributed applications, global serializability (or modular serializability) is a property of a global schedule of transactions. A global schedule is the unified schedule of all the individual database (and other transactional object) schedules in a multidatabase environment (e.g., federated database). Complying with global serializability means that the global schedule is serializable, has the serializability property, while each component database (module) has a serializable schedule as well. In other words, a collection of serializable components provides overall system serializability, which is usually incorrect. A need in correctness across databases in multidatabase systems makes global serializability a major goal for global concurrency control (or modular concurrency control). With the proliferation of the Internet, Cloud computing, Grid computing, and small, portable, powerful computing devices (e.g., smartphones), as well as increase in systems management sophistication, the need for atomic distributed transactions and thus effective global serializability techniques, to ensure correctness in and among distributed transactional applications, seems to increase.

In a federated database system or any other more loosely defined multidatabase system, which are typically distributed in a communication network, transactions span multiple (and possibly distributed) databases. Enforcing global serializability in such system, where different databases may use different types of concurrency control, is problematic. Even if every local schedule of a single database is serializable, the global schedule of a whole system is not necessarily serializable. The massive communication exchanges of conflict information needed between databases to reach conflict serializability globally would lead to unacceptable performance, primarily due to computer and communication latency. Achieving global serializability effectively over different types of concurrency control has been open for several years.

The global serializability problem

Problem statement

The difficulties described above translate into the following problem:

Find an efficient (high-performance and fault tolerant) method to enforce Global serializability (global conflict serializability) in a heterogeneous distributed environment of multiple autonomous database systems. The database systems may employ different concurrency control methods. No limitation should be imposed on the operations of either local transactions (confined to a single database system) or global transactions (span two or more database systems).

Quotations

Lack of an appropriate solution for the global serializability problem has driven researchers to look for alternatives to serializability as a correctness criterion in a multidatabase environment (e.g., see Relaxing global serializability below), and the problem has been characterized as difficult and open. The following two quotations demonstrate the mindset about it by the end of the year 1991, with similar quotations in numerous other articles:

  • "Without knowledge about local as well as global transactions, it is highly unlikely that efficient global concurrency control can be provided... Additional complications occur when different component DBMSs and the FDBMSs support different concurrency mechanisms... It is unlikely that a theoretically elegant solution that provides conflict serializability without sacrificing performance (i.e., concurrency and/or response time) and availability exists."

Proposed solutions

Several solutions, some partial, have been proposed for the global serializability problem. Among them:

Relaxing global serializability

Some techniques have been developed for relaxed global serializability (i.e., they do not guarantee global serializability; see also Relaxing serializability). Among them (with several publications each):

  • Quasi serializability
  • Two-level serializability

Another common reason nowadays for Global serializability relaxation is the requirement of availability of internet products and services. This requirement is typically answered by large scale data replication. The straightforward solution for synchronizing replicas' updates of a same database object is including all these updates in a single atomic distributed transaction. However, with many replicas such a transaction is very large, and may span several computers and networks that some of them are likely to be unavailable. Thus such a transaction is likely to end with abort and miss its purpose. Consequently, Optimistic replication (Lazy replication) is often utilized (e.g., in many products and services by Google, Amazon, Yahoo, and alike), while global serializability is relaxed and compromised for eventual consistency. In this case relaxation is done only for applications that are not expected to be harmed by it.

Classes of schedules defined by relaxed global serializability properties either contain the global serializability class, or are incomparable with it. What differentiates techniques for relaxed global conflict serializability (RGCSR) properties from those of relaxed conflict serializability (RCSR) properties that are not RGCSR is typically the different way global cycles (span two or more databases) in the global conflict graph are handled. No distinction between global and local cycles exists for RCSR properties that are not RGCSR. RCSR contains RGCSR. Typically RGCSR techniques eliminate local cycles, i.e., provide local serializability (which can be achieved effectively by regular, known concurrency control methods); however, obviously they do not eliminate all global cycles (which would achieve global serializability).

References

  1. Amit Sheth, James Larson (1990): "Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases" Archived 2007-10-19 at the Wayback Machine, ACM Computing Surveys, Vol. 22, No 3, pp. 183-236, September 1990 (quotation from page 227)
  2. Weimin Du and Ahmed K. Elmagarmid (1989): "Quasi Serializability: a Correctness Criterion for Global Concurrency Control in InterBase" Archived 2010-12-21 at the Wayback Machine, Proceedings of the Fifteenth International Conference on Very Large Data Bases (VLDB), August 22–25, 1989, Amsterdam, the Netherlands, pp. 347-355, Morgan Kaufmann, ISBN 1-55860-101-5
  3. Sharad Mehrotra, Rajeev Rastogi, Henry Korth, Abraham Silberschatz (1998): "Ensuring Consistency in Multidatabases by Preserving Two-Level Serializability", ACM Transactions on Database Systems (TODS), Vol. 23, No. 2, pp. 199-230, June 1998 (quotation from the article's Abstract)
  4. Gray, J.; Helland, P.; O’Neil, P.; Shasha, D. (1996). The dangers of replication and a solution. Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data. pp. 173–182. CiteSeerX 10.1.1.43.6498. doi:10.1145/233269.233330.
Categories:
Global serializability Add topic