Open Access
A BiLSTM cardinality estimator in complex database systems based on attention mechanism
Author(s) - Zhou Qiang, Yang Guoping, Song Haiquan, Guo Jin, Zhang Yadong, Wei Shengjie, Qu Lulu, Gutierrez Louis Alberto, Qiao Shaojie
Publication year - 2022
Publication title - CAAI Transactions on Intelligence Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.613
H-Index - 15
ISSN - 2468-2322
DOI - 10.1049/cit2.12069
Subject(s) - cardinality (data modeling) , computer science , estimator , set (abstract data type) , data mining , theoretical computer science , mathematics , statistics , programming language
Abstract Accurate cardinality estimation helps the query optimiser produce a good execution plan. Although cardinality estimation has been studied before, existing estimators give inaccurate predictions and cannot guarantee query efficiency. In particular, they struggle to capture the complex relationships between multiple tables in complex database systems, and they perform poorly on complex queries. In this study, a novel cardinality estimator is proposed. It is built on the BiLSTM network structure and adds an attention mechanism. First, the columns involved in the query statements in the training set are sampled and compressed into bitmaps. Then, the Word2vec model is used to embed the query statements as word vectors. Finally, the BiLSTM network and the attention mechanism are employed to process the word vectors. The proposed model takes into account not only the correlation between tables but also the processing of complex predicates. Extensive experiments are conducted to evaluate the BiLSTM‐Attention Cardinality Estimator (BACE) on the IMDB datasets. The results show that the deep learning model can significantly improve the quality of cardinality estimation, which plays a vital role in query optimisation for complex databases.
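Two steps named in the abstract, the sample bitmap and the attention step, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the bitmap encoding or the attention form, so the dot-product attention, the toy table, and the names `sample_bitmap` and `attention_pool` are all assumptions made for illustration. The `hidden` vectors stand in for BiLSTM outputs, which are not computed here.

```python
import math
import random


def sample_bitmap(rows, predicate, sample_size, seed=0):
    """Draw a fixed-size row sample and mark which sampled rows satisfy the predicate.

    Hypothetical helper: one bit per sampled row, 1 if the row qualifies.
    """
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    return [1 if predicate(row) else 0 for row in sample]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention (an assumed form) over a sequence of hidden states.

    Returns the attention weights and the weighted sum of the hidden states.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Toy table (hypothetical data, not taken from IMDB).
rows = [{"year": 1990 + i} for i in range(20)]
bitmap = sample_bitmap(rows, lambda r: r["year"] > 2000, sample_size=8)

# Stand-ins for per-token BiLSTM outputs; the query vector is also assumed.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(hidden, query=[1.0, 1.0])
```

The fraction of set bits in `bitmap` gives a rough selectivity signal for the predicate on the sample, and `weights` shows how attention lets some positions in the query sequence contribute more to the pooled vector than others. In a full model the pooled vector would feed a regression layer that predicts the cardinality.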