Technical Field
The present invention relates to a method and apparatus for distributing graph data in a distributed computing environment and, more particularly, to a method and apparatus in which partitions are generated based on common sub-graphs or on vertexes, depending on whether common sub-graphs are present in the graph data, and the corresponding graph data is distributed, for each vertex, to the partition having the minimum processing cost.
Description of the Related Art
As the Internet advances, Internet users generate and distribute enormous amounts of data every day. For many companies, in particular search engine companies and web portals, competitiveness now depends on collecting and accumulating as much data as possible and on extracting meaningful data from the collected data as quickly as possible.
For this reason, many companies construct large-scale clusters at low cost and are actively researching technologies for high-capacity distributed management and for distributed, parallel processing of tasks.
That is, large amounts of data that are difficult to process on an existing single-machine system are becoming increasingly valuable. Distributed parallel systems have been introduced and are used in various fields as an alternative for processing such data.
A hashing method is used to process a large amount of graph data in a distributed computing environment.
However, the hashing method offers limited distributed computing performance for graph data, for two reasons: the data is distributed without taking the graph structure into consideration, and a network cost is incurred whenever data distributed to different servers has to be searched.
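The limitation described above can be illustrated with a minimal sketch. The toy graph, the function names `hash_partition` and `count_cut_edges`, and the use of the number of edges crossing servers as a stand-in for network cost are all assumptions made for illustration; they are not taken from the related art or from the present invention.

```python
def hash_partition(vertices, num_servers):
    """Assign each vertex to a server from its ID alone (graph structure is ignored)."""
    # For small non-negative integers, Python's hash(v) equals v, so this is v % num_servers.
    return {v: hash(v) % num_servers for v in vertices}


def count_cut_edges(edges, assignment):
    """Count edges whose two endpoints were placed on different servers."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
vertices = sorted({v for edge in edges for v in edge})

# Hash-based placement on two servers.
hashed = hash_partition(vertices, num_servers=2)
hashed_cut = count_cut_edges(edges, hashed)

# Hand-written structure-aware placement for comparison: one triangle per server.
structured = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
structured_cut = count_cut_edges(edges, structured)

print("edges crossing servers (hash):", hashed_cut)
print("edges crossing servers (structure-aware):", structured_cut)
```

In this toy case the hash placement splits both triangles across the two servers, so most traversals of an edge would require a lookup on the other server, whereas the structure-aware placement leaves only the single joining edge crossing servers.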