1. Basis and significance of topic selection (no less than 300 words).
1) Basis for topic selection.
New database algorithms are difficult to integrate into most database systems. Most current systems, such as MySQL, Oracle, and Microsoft SQL Server, offer poor extensibility: adding a new algorithm usually requires modifying the system's source code, which raises the development threshold and lengthens the development cycle. This hinders the adoption of new database algorithms, especially AI-based ones, in database systems. A common alternative is to implement a new algorithm in an environment that simulates the relevant behaviors and operations of a database. Although this greatly reduces the workload and the development threshold, the algorithm remains confined to the simulation, its performance in a real system cannot be tested, and its industrial value is limited. For example, the new indexing algorithm FITing-Tree was evaluated in a simulated database environment, and it does not address compatibility with real database systems, such as the fact that the physical location of a data item is not a single value.
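The last compatibility issue can be made concrete. In PostgreSQL, for example, a row's physical location (its tuple identifier, visible as `ctid`) is a pair consisting of a block number and an item offset within that block, whereas a learned index typically predicts a single numeric position. The following Python sketch shows one possible way to bridge the two by packing the pair into one order-preserving integer. The function names and the 16-bit offset width are assumptions made for illustration; they are not taken from the FITing-Tree paper or from the VDS design.

```python
# Illustrative sketch only: pack_tid / unpack_tid are hypothetical helpers,
# and the 16-bit offset width is an assumption chosen for this example.
OFFSET_BITS = 16
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def pack_tid(block: int, offset: int) -> int:
    """Pack a (block, offset) pair into a single integer position."""
    if block < 0:
        raise ValueError("block must be non-negative")
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset does not fit in OFFSET_BITS bits")
    # The block number occupies the high bits, so integer order matches
    # the lexicographic order of (block, offset) pairs.
    return (block << OFFSET_BITS) | offset


def unpack_tid(position: int) -> tuple[int, int]:
    """Recover the (block, offset) pair from a packed position."""
    return position >> OFFSET_BITS, position & OFFSET_MASK


if __name__ == "__main__":
    packed = pack_tid(2, 5)
    print(packed, unpack_tid(packed))
```

Because the packing preserves order, an index that predicts a position with a bounded error could in principle search the neighborhood of the packed value and then unpack the result into a block number and an offset.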
The best way to solve this problem is to integrate new algorithms into the database system as extensions. This approach is highly flexible and greatly reduces the development workload, but few current systems are sufficiently extensible. Compared with other database systems, the open-source database system PostgreSQL is highly extensible: it was designed with extensibility in mind, which makes it far more flexible than other databases. PostgreSQL adds new algorithms to the database through an extension interface, and at run time they behave just like PostgreSQL's built-in features. However, the PostgreSQL extension interface is not perfect. Using it sets a high threshold for developers, especially academic researchers, who must first familiarize themselves with PostgreSQL's underlying implementation, so developing an extension usually takes a long time.
2) Significance of the research.
Artificial intelligence technology is developing rapidly and is widely used in many fields because of its powerful learning, reasoning, and planning capabilities. In the database field, it has broad application prospects and also provides new development opportunities for database systems. In recent years, many new database algorithms based on artificial intelligence have emerged. By modeling and learning characteristics such as data distribution, query load, and performance, they automate operations such as query load handling, parameter tuning, data partitioning, index maintenance, and query optimization. This improves the performance of database algorithms and reduces the workload of database operation and maintenance personnel, which is of great significance for the development of database technology.
To promote the application and development of artificial intelligence technology in the database field, this thesis plans to develop Visual Dock Station (hereinafter referred to as VDS), a system based on the open-source database PostgreSQL. The system provides an easy-to-use extension interface for index algorithms and scan algorithms, as well as visual management, configuration, and installation of extensions. In addition, to help researchers tune and test their algorithms, the system provides a query plan visualization function. On the one hand, the system helps researchers quickly test and apply algorithms in a real system; on the other hand, it helps them optimize those algorithms, which is of great significance for research on database algorithms. Finally, this thesis uses the new extension interface to implement the FITing-Tree algorithm, which improves the industrial value of the algorithm and enriches the features of PostgreSQL.
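As an illustration of what query plan visualization involves, the sketch below renders a plan tree as indented text. It assumes the plan is available in the JSON layout produced by PostgreSQL's `EXPLAIN (FORMAT JSON)`, where each node carries a `Node Type` and its children are listed under `Plans`. The function name `render_plan` and the sample plan (including the table and index names) are hypothetical and hand-written for this example; this is not the VDS implementation.

```python
import json

# Hand-written sample in the shape of EXPLAIN (FORMAT JSON) output.
# The table name, index name, and costs are made up for illustration.
SAMPLE_PLAN = """
[{"Plan": {
    "Node Type": "Sort",
    "Total Cost": 12.5,
    "Plans": [
        {"Node Type": "Index Scan",
         "Relation Name": "orders",
         "Index Name": "orders_fiting_idx",
         "Total Cost": 8.3}
    ]
}}]
"""


def render_plan(node: dict, depth: int = 0) -> list[str]:
    """Return one indented text line per plan node, parents first."""
    label = node.get("Node Type", "?")
    relation = node.get("Relation Name")
    if relation:
        label += f" on {relation}"
    cost = node.get("Total Cost")
    if cost is not None:
        label += f" (cost={cost})"
    lines = ["  " * depth + label]
    # Child nodes, if any, are nested under the "Plans" key.
    for child in node.get("Plans", []):
        lines.extend(render_plan(child, depth + 1))
    return lines


if __name__ == "__main__":
    root = json.loads(SAMPLE_PLAN)[0]["Plan"]
    print("\n".join(render_plan(root)))
```

A graphical front end could walk the same structure and draw boxes and edges instead of text lines; the recursive traversal stays the same.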
2. Research objectives and main content (including ** (design) outline, no less than 500 words).
1) Research objectives.
Artificial intelligence technology has broad prospects in the database field, but its application is hindered by the high threshold and heavy workload of adding new algorithms to a real database system. The primary objective of this thesis is therefore to reduce the difficulty of integrating new algorithms into a real database system. After weighing the pros and cons of existing work, this thesis plans to integrate new database algorithms as extensions.
2) Main content.
This thesis is divided into six chapters. Chapter 1 is the introduction, which describes the research background and the main work of this thesis. Chapter 2 is the requirements analysis of the VDS system. Chapter 3 is the architecture design of VDS, which presents the overall system architecture and introduces the functions of each subsystem. Chapter 4 covers key technologies and implementation; it describes in detail the key technologies used to implement the VDS system and the specific technical solutions. Chapter 5 covers system implementation and operational testing; it describes the implementation of the system and demonstrates that it meets the design requirements for functionality, performance, and security. Chapter 6 is the summary, which reviews the whole thesis, discusses the shortcomings of the current work, and presents future prospects.
3. Research methods and means.
Chapter 1 Introduction.
Chapter 2 System Requirements Analysis.
2.1 System requirements description.
2.2 Analysis of system functional requirements.
2.3 Analysis of system non-functional requirements.
Chapter 3 System Architecture Design.
3.1 Overall structural design.
3.2 System function design.
3.2.1 Extension development module design.
3.2.2 Extension management module design.
3.2.3 Extension test module design.
3.2.4 Query visualization module design.
Chapter 4 Key Technologies and Implementation.
4.1 General technical overview of the system.
4.2 Implementation of extension development functions.
4.3 Implementation of extension management functions.
4.4 Implementation of extension test functions.
4.5 Implementation of query visualization.
4.6 Implementation of the FITing-Tree index extension.
Chapter 5 System Implementation and Operational Testing.
Conclusion. References.
Bibliography (author, title or title of the book, publisher or issue number, date of publication or issue number).
[1] Sun Luming, Zhang Shaomin, Ji Tao, Li Cuiping, Chen Hong. "Research on data management technology empowered by artificial intelligence." Journal of Software 31.3 (2020): 600-619.
[2] Galakatos, Alex, et al. "FITing-Tree: A data-aware index structure." Proceedings of the 2019 International Conference on Management of Data. 2019.
[3] Zhou, Wensheng, and Xiaojun Ye. "Implementation of ARIES algorithms in PostgreSQL." Computer Engineering 1 (2006): 24.
[4] Vengerov, David, et al. "Join size estimation subject to filter conditions." Proceedings of the VLDB Endowment 8.12 (2015): 1530-1541.
[5] Zhang et al. "Join algorithms for in-memory computing." Journal of East China Normal University (Natural Science Edition).
[6] Barthels, Claude, et al. "Distributed join algorithms on thousands of cores." Proceedings of the VLDB Endowment 10.5 (2017): 517-528.
[7] Chakraborty, Sanjay, N. K. Nagwani, and Lopamudra Dey. "Performance comparison of incremental k-means and incremental DBSCAN algorithms." arXiv preprint arXiv:1406.4751 (2014).
[8] Golshanara, Ladan, Seyed Mohammad Taghi Rouhani Rankoohi, and Hamed Shah-Hosseini. "A multi-colony ant algorithm for optimizing join queries in distributed database systems." Knowledge and Information Systems 39.1 (2014): 175-206.
[9] Ngo, Hung Q. "Worst-case optimal join algorithms: Techniques, results, and open problems." Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (2018): 111-124. doi:10.1145/3196959.3196990.
[10] Ngo, Hung Q. "Worst-case optimal join algorithms: Techniques, results, and open problems." Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. 2018.
[11] Fan Xieyu, Ren Yingchao. "Implementation of a parallel spatial join algorithm for open-source relational database clusters." Computer Systems and Applications 25.10 (2016): 233-239.
[12] Fahad, Adil, et al. "A survey of clustering algorithms for big data: Taxonomy and empirical analysis." IEEE Transactions on Emerging Topics in Computing 2.3 (2014): 267-279.
[13] Velmurugan, T. "Performance based analysis between k-means and fuzzy c-means clustering algorithms for connection oriented telecommunication data." Applied Soft Computing 19 (2014): 134-146.
[14] Giles, David M., et al. "Advancements in the Aerosol Robotic Network (AERONET) Version 3 database - automated near-real-time quality control algorithm with improved cloud screening for Sun photometer aerosol optical depth (AOD) measurements." Atmospheric Measurement Techniques 12.1 (2019).
[15] Shao B, Xu Guosheng. "DLB+ Tree: An in-memory database indexing algorithm based on double-leaf nodes." The 10th Annual Conference of China Institute of Communications, 2014.
[16] Zhang Yansong, Zhang Yu, Wang Shan. "A new vector-index-based acceleration technology for in-memory OLAP star joins." Chinese Journal of Computers 8 (2019): 2.