Features of NewSQL



NewSQL is a class of modern relational database management systems that seek to provide the scalable performance of NoSQL systems for online transaction processing (OLTP) and read-write workloads while maintaining the ACID guarantees of a traditional database system. The term was first used by 451 Group analyst Matthew Aslett in a 2011 research paper discussing the rise of new database systems as challengers to established vendors.

Many enterprise systems that handle high-profile data (e.g., financial and order processing systems) also need to scale but cannot use NoSQL solutions, because they cannot give up strong transactional and consistency requirements. Previously, the only options for these organizations were to purchase a more powerful single-node machine or to develop custom middleware that distributes queries over traditional DBMS nodes. Both approaches are prohibitively expensive and therefore out of reach for many. In his paper, Aslett discusses how NewSQL upstarts are poised to challenge the supremacy of commercial vendors, especially Oracle.

Although NewSQL systems vary greatly in internal architecture, they share two distinguishing features: (a) they support the relational data model, and (b) they use SQL as their primary interface. The applications targeted by these systems are characterized by a large number of transactions that (1) are short-lived (i.e., no user stalls), (2) touch a small subset of data using index lookups (i.e., no full table scans or large distributed joins), and (3) are repetitive (i.e., they execute the same queries with different inputs). NewSQL systems achieve high performance and scalability by eschewing much of the legacy architecture of the original IBM System R design, such as heavyweight recovery or concurrency control algorithms. One of the first known NewSQL systems is the H-Store parallel database system.
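The three workload traits above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in for a SQL interface; SQLite is not a NewSQL system, and the accounts table, the transfer function, and the helper names are hypothetical. The transaction is short, touches two rows by primary key, and reuses the same parameterized statements with different inputs.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` in one short transaction."""
    # The connection context manager commits on success and rolls back
    # if an exception escapes the block, so both updates apply or neither.
    with conn:
        # Primary-key lookup: no table scan, no join.
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, src, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient funds or unknown source account")
        cur = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
        if cur.rowcount != 1:
            raise ValueError("unknown destination account")


def balances(conn):
    """Return {account_id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Calling transfer(conn, 1, 2, 30) repeatedly with different arguments is the "same query, different inputs" pattern; a failed transfer leaves both balances unchanged because the whole transaction is rolled back.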

NewSQL systems can be loosely grouped into three categories:

New Architectures: The first type of NewSQL system is a completely new database platform. These systems are designed to operate in a distributed cluster of shared-nothing nodes, in which each node owns a subset of the data. They are often written from scratch with a distributed architecture in mind and include components such as distributed concurrency control, flow control, and distributed query processing. Examples of systems in this category are Google Spanner, Clustrix, VoltDB, MemSQL, Pivotal’s GemFire XD, SAP HANA, NuoDB, TiDB, and Trafodion.

SQL Engines: The second category consists of highly optimized storage engines for SQL. These systems provide the same SQL programming interface but scale better than built-in engines such as InnoDB. Examples of these storage engines include MySQL Cluster, Infobright, TokuDB, and the now-defunct InfiniDB.

Transparent Sharding: The third category provides a sharding middleware layer that automatically splits databases across multiple nodes. ScaleBase is an example of this type of system.
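To make the idea of nodes that each own a subset of the data more concrete, here is a minimal sketch of a sharding router. It is not how ScaleBase or any other product listed above works internally; the ShardRouter class, the orders table, and the hash-modulo placement rule are all assumptions made for illustration, with in-memory SQLite databases standing in for separate nodes.

```python
import sqlite3
import zlib


class ShardRouter:
    """Hypothetical middleware that routes single-key statements to one node."""

    def __init__(self, num_shards):
        # Each in-memory database plays the role of one shared-nothing node.
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for conn in self.shards:
            conn.execute(
                "CREATE TABLE orders (order_id TEXT PRIMARY KEY, total INTEGER)"
            )

    def shard_for(self, order_id):
        # CRC32 is stable across runs, so a key always maps to the same node.
        return zlib.crc32(order_id.encode("utf-8")) % len(self.shards)

    def insert_order(self, order_id, total):
        conn = self.shards[self.shard_for(order_id)]
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))

    def get_order(self, order_id):
        # A lookup by shard key touches exactly one node.
        conn = self.shards[self.shard_for(order_id)]
        row = conn.execute(
            "SELECT total FROM orders WHERE order_id = ?", (order_id,)
        ).fetchone()
        return None if row is None else row[0]

    def count_orders(self):
        # A query without a shard key must visit every node (scatter-gather).
        return sum(
            conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
            for conn in self.shards
        )
```

The application calls insert_order and get_order without knowing which node holds a row, which is the sense in which the sharding is transparent. Hash-modulo placement is the simplest possible rule; it forces most rows to move when the number of shards changes, which is one reason real systems tend to use more elaborate schemes.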

Taken together, these combined and rejuvenated technologies make the relational database a practical choice again for well-organized hosting scenarios that must scale. Such a database can provide information in a directly usable form or pass it on for further processing by other applications. The continued growth and redistribution of database technologies therefore remains a major concern, and one that calls for further improvement.
