A data processing apparatus stores and processes input data and outputs a result corresponding to a query input by a user. In particular, when the volume of the input data is large, various types of databases are used to increase processing speed and obtain reliable results.
Among these databases, a graph database is optimized to process semi-structured data, that is, data which does not conform to the structured data model of a relational database or of another table-based data store. The graph database may therefore be applied to various fields such as social data, recommendation, and geospatial analysis.
Meanwhile, a query of the graph database may be represented as a graph pattern, and the desired data is retrieved by searching the overall graph for that specific pattern.
FIG. 1 is a diagram illustrating, as a graph pattern, a query used in a conventional graph database. The query is processed by searching the whole graph for a subgraph matching the input query pattern. In this case, which portion of the graph pattern query is searched first has a great effect on query processing performance. Therefore, it is important to accurately predict the size of an intermediate result at the time of searching the graph.
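The effect of search order on the intermediate result can be seen in a minimal sketch. The toy graph, the edge labels, and the function name `match_path` below are hypothetical and chosen only for illustration; they are not taken from FIG. 1 or from any particular graph database. The sketch matches a two-edge path pattern and reports how many partial matches are produced by whichever edge of the pattern is searched first.

```python
from collections import defaultdict

# Hypothetical toy graph stored as (source, edge_label, target) triples.
EDGES = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("dave", "knows", "alice"),
    ("carol", "lives_in", "seoul"),
]


def match_path(edges, first_label, second_label, start_with_first=True):
    """Match the pattern x -[first_label]-> y -[second_label]-> z.

    Returns (matches, intermediate_count), where intermediate_count is the
    number of partial matches produced by the edge that is searched first.
    """
    by_label = defaultdict(list)
    for source, label, target in edges:
        by_label[label].append((source, target))

    matches = []
    if start_with_first:
        # Search the first edge of the pattern, then extend each partial match.
        partial = by_label[first_label]
        for x, y in partial:
            for y2, z in by_label[second_label]:
                if y == y2:
                    matches.append((x, y, z))
    else:
        # Search the second edge of the pattern, then extend backwards.
        partial = by_label[second_label]
        for y, z in partial:
            for x, y2 in by_label[first_label]:
                if y2 == y:
                    matches.append((x, y, z))
    return sorted(matches), len(partial)


# Both orders find the same two subgraphs, but the intermediate result differs.
print(match_path(EDGES, "knows", "lives_in", start_with_first=True)[1])   # 5
print(match_path(EDGES, "knows", "lives_in", start_with_first=False)[1])  # 1
```

In this toy case, searching the rarer `lives_in` edge first yields one partial match instead of five, while the final result is identical. A query processor can choose such an order in advance only if it can estimate these intermediate sizes.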
Even the conventional relational database predicts an intermediate result by building a histogram for a table, and uses the prediction to make a query execution plan at the time of processing the query.
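As an illustrative sketch of such histogram-based prediction, the following builds an equal-width histogram over one numeric column and estimates how many rows satisfy a range predicate. The column values, the bucket layout, and the function names `build_histogram` and `estimate_less_than` are assumptions made for this example; actual relational databases use their own histogram types and estimation rules.

```python
def build_histogram(values, num_buckets, lo, hi):
    """Build an equal-width histogram over [lo, hi].

    Assumes every value lies within [lo, hi]; a value equal to hi is
    placed in the last bucket.
    """
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return counts, width


def estimate_less_than(counts, width, lo, bound):
    """Estimate the number of rows with value < bound.

    Buckets entirely below the bound are counted in full; the bucket that
    contains the bound is counted proportionally, assuming values are
    uniformly distributed inside it.
    """
    estimate = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        if bound >= bucket_hi:
            estimate += count
        elif bound > bucket_lo:
            estimate += count * (bound - bucket_lo) / width
    return estimate


# Hypothetical "age" column of a ten-row table.
ages = [21, 25, 25, 32, 38, 41, 47, 55, 63, 70]
counts, width = build_histogram(ages, num_buckets=6, lo=20, hi=80)
print(counts)                                     # [3, 2, 2, 1, 1, 1]
print(estimate_less_than(counts, width, 20, 35))  # 4.0
```

The estimate is obtained from the bucket counts alone, without scanning the table, which is what makes it usable while an execution plan is being made. It relies on every row having the same known column, which is the fixed-schema assumption.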
However, the graph database does not have a fixed schema, and its data takes a more complicated form than that of the relational database, so there is a problem in that the histogram used for the relational database cannot be applied to the graph database.