NTU General GBase 8a Enterprise Enhancement 1

Distributed storage of data

Hybrid row and column storage

The data managed by NTU General GBase 8a is organized by column and physically stored on disk in columns. Analytical databases that handle massive data sets store table data by column because a columnar architecture has natural advantages for query, statistics, and analysis operations.

Its advantages are reflected in the following aspects:

Lower IO

Only the columns involved in a query incur disk IO. Columns that the query does not reference are never read and incur no disk IO.
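
The IO saving can be illustrated with a rough back-of-the-envelope model. The sketch below is not GBase 8a code; the table, column names, and byte widths are made up for illustration, and it assumes that a row store must read whole rows while a column store reads only the referenced columns.

```python
# Toy table: 1000 rows, 4 columns. Names and widths are illustrative assumptions.
ROWS = 1000
COLUMN_WIDTHS = {"id": 8, "name": 32, "region": 16, "amount": 8}  # bytes per value


def bytes_read_row_store(query_columns):
    # A row store fetches whole rows, so every column is read
    # no matter which columns the query actually uses.
    return ROWS * sum(COLUMN_WIDTHS.values())


def bytes_read_column_store(query_columns):
    # A column store reads only the columns the query references.
    return ROWS * sum(COLUMN_WIDTHS[c] for c in query_columns)


# Columns touched by a query such as: SELECT region, SUM(amount) ... GROUP BY region
query = ["region", "amount"]
row_bytes = bytes_read_row_store(query)
col_bytes = bytes_read_column_store(query)
print(row_bytes, col_bytes)
```

In this model the columnar layout reads 24 of the 64 bytes per row, and the gap widens as tables get wider relative to the columns a query needs.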

High compression ratio

The compression ratio can reach 2x to 20x.

Mixed row and column storage

NTU General GBase 8a MPP Cluster supports mixed row and column storage. In a purely columnar cluster, an operation that touches a large number of columns and accesses widely scattered records generates a large number of discrete IOs. The row-column hybrid feature improves disk IO performance by storing redundant row-oriented information alongside the columns.
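
A simplified cost model shows why scattered, wide lookups are hard on a pure column store. This is only a sketch under stated assumptions: each value fetched from a column file costs one discrete read, each record fetched from the redundant row copy costs one discrete read, and the function names are hypothetical rather than part of GBase 8a.

```python
def discrete_reads_column_only(row_ids, columns):
    # Pure column store: every (record, column) pair lives in a different
    # place on disk, so each one costs a separate discrete read.
    return len(row_ids) * len(columns)


def discrete_reads_with_row_copy(row_ids, columns):
    # With a redundant row-oriented copy, all columns of one record
    # sit together, so each record costs a single discrete read.
    return len(row_ids)


scattered_rows = [3, 5_000, 91_234, 700_000]       # widely separated records
wide_projection = [f"c{i}" for i in range(20)]     # query touches 20 columns

column_only = discrete_reads_column_only(scattered_rows, wide_projection)
with_row_copy = discrete_reads_with_row_copy(scattered_rows, wide_projection)
print(column_only, with_row_copy)
```

The trade-off in this model is extra storage for the redundant rows in exchange for fewer discrete reads on wide, scattered lookups.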

Distributed storage

GBase 8a MPP Cluster can process structured data at petabyte scale and above, and can store large tables using either a random distribution policy or a hash distribution policy. Users can choose the distribution policy that fits their business scenario to get the best balance of performance, reliability, and flexibility.

Random distribution policy

Under the random distribution policy, the database creates a randomly distributed table, and rows are spread randomly and evenly across the data nodes as they are stored.
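
The behavior described above can be sketched in a few lines. This is an illustration only, not the cluster's actual placement logic: the function name, the four-node cluster, and the use of a seeded pseudo-random generator are all assumptions.

```python
import random


def distribute_randomly(rows, node_count, seed=0):
    # Assign each row to a node picked uniformly at random.
    # The seed only makes this illustration repeatable.
    rng = random.Random(seed)
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[rng.randrange(node_count)].append(row)
    return nodes


rows = [{"order_id": i} for i in range(10_000)]
nodes = distribute_randomly(rows, node_count=4)
sizes = [len(node_rows) for node_rows in nodes]
print(sizes)  # four counts close to 2500 each
```

With enough rows the node sizes come out nearly equal, but rows with the same key value can land on different nodes.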

Hash distribution policy

Under the hash distribution policy, each row is processed as it is stored: the database hashes the value of the specified hash distribution column and loads the row into the hash bucket that matches the hash value. Each hash bucket corresponds to one cluster data node. As a result, the rows on each node share a common characteristic (their values in the specified column map to the same hash bucket), and the optimization engine can use this at query time to optimize the query plan and shorten query time.
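
The placement rule can be sketched as follows. The hash function GBase 8a uses is not stated here, so `zlib.crc32` stands in for it; the function names, the `customer_id` column, and the four-node cluster are likewise illustrative assumptions, with one bucket per node.

```python
import zlib


def hash_bucket(value, node_count):
    # crc32 is a stand-in hash chosen because it is stable across runs;
    # it is not claimed to be the hash that GBase 8a uses.
    return zlib.crc32(str(value).encode("utf-8")) % node_count


def distribute_by_hash(rows, column, node_count):
    # One bucket per data node: a row goes to the node selected by the
    # hash of its distribution-column value.
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[hash_bucket(row[column], node_count)].append(row)
    return nodes


orders = [{"customer_id": i % 50, "amount": i} for i in range(1000)]
nodes = distribute_by_hash(orders, "customer_id", node_count=4)

# Record which nodes hold rows for each customer_id.
owners = {}
for node_index, node_rows in enumerate(nodes):
    for row in node_rows:
        owners.setdefault(row["customer_id"], set()).add(node_index)

print(all(len(node_set) == 1 for node_set in owners.values()))
```

Because every row for a given `customer_id` sits on a single node, an engine could, for instance, aggregate by that column on each node independently, which is the kind of plan optimization the text refers to.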
