Social networking applications enable groups of people to connect with each other and share information online. The groups can share common attributes, e.g., the same city, country, workplace, or profession. The groups can also be from different parts of the world or different walks of life. Social networking applications can also recommend that a user of the application connect with another user, e.g., by sending friend suggestions to the user. The social networking application can determine the friend suggestion based on user characteristics of the users. For example, the social networking application can suggest that a first user become friends with a second user because the second user is a friend of a friend of the first user, went to the same school as the first user, or lived in the same city as the first user. If there are multiple users to be suggested, the social networking application can rank the users to be suggested based on the user characteristics they have in common with the first user.
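By way of illustration only, the ranking of friend suggestions described above can be sketched as follows. The `User` record, its fields, and the equal weighting of the three characteristics are hypothetical and chosen for this sketch; they are not drawn from any particular social networking application. Friendship is assumed to be symmetric, so a friend of a friend is detected as a candidate with at least one mutual friend.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    # Hypothetical user record; the field names are assumptions for this sketch.
    user_id: str
    friends: set = field(default_factory=set)   # IDs of direct friends
    schools: set = field(default_factory=set)
    cities: set = field(default_factory=set)


def shared_characteristics(first: User, candidate: User) -> int:
    """Count how many of the three example characteristics the users share."""
    score = 0
    if first.friends & candidate.friends:   # mutual friend: friend of a friend
        score += 1
    if first.schools & candidate.schools:   # went to the same school
        score += 1
    if first.cities & candidate.cities:     # lived in the same city
        score += 1
    return score


def rank_suggestions(first: User, candidates: list) -> list:
    """Order candidates by shared characteristics, most shared first."""
    eligible = [
        c for c in candidates
        if c.user_id != first.user_id and c.user_id not in first.friends
    ]
    return sorted(
        eligible,
        key=lambda c: shared_characteristics(first, c),
        reverse=True,
    )


alice = User("alice", friends={"bob"}, schools={"State U"}, cities={"Austin"})
bob = User("bob", friends={"alice", "carol"})
carol = User("carol", friends={"bob"}, schools={"State U"}, cities={"Austin"})
dave = User("dave", cities={"Austin"})

ranked_ids = [u.user_id for u in rank_suggestions(alice, [dave, bob, carol])]
```

In this sketch, users who are already friends of the first user are excluded from the suggestions, and the remaining candidates are ordered by a simple count; an actual application may weight the characteristics differently.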
Some social networking applications use clustering tools to categorize users into buckets such that similar users may be categorized into the same bucket. For example, to determine whether two users are similar, the tool identifies the buckets to which the user identifications (IDs) of the users are assigned, and if the assigned bucket ID of both users is the same, the tool determines that there is a high probability that the users are similar. For example, the users may be friends, or friends of friends. In some cases the users may not be similar, and the clustering process most likely categorized them into the same bucket only because doing so was beneficial based on some properties of the users. A problem with such a tool is that whenever two users are not in the same bucket, the tool does not offer much data to interpret the similarity between the users. For example, if one of the two users is assigned to bucket B5 and the other to bucket B9, the two users may be complete strangers to each other, or may be good friends that the clustering process could not fit into the same bucket. In some clustering processes, on average a fair percentage of the friends of a user can be assigned to different buckets. Therefore, the fact that two users are not assigned to the same bucket may not be useful information for determining the similarity of the users, and concluding that they are not similar may be incorrect. Accordingly, the tools provided by current social networking applications for determining whether two users are similar may be inaccurate.
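By way of illustration only, the bucket comparison and its limitation can be sketched as follows. The user IDs, bucket assignments, friendship data, and function names are hypothetical and chosen for this sketch. The comparison yields a signal only when the bucket IDs match; when they differ, it cannot distinguish strangers from friends that the clustering process placed in different buckets.

```python
# Hypothetical bucket assignments produced by some clustering process.
bucket_of = {"u1": "B5", "u2": "B5", "u3": "B9", "u4": "B9"}

# Hypothetical symmetric friendship data: u1 is friends with u2 and u3.
friends_of = {
    "u1": {"u2", "u3"},
    "u2": {"u1"},
    "u3": {"u1"},
    "u4": set(),
}


def bucket_similarity(a: str, b: str) -> str:
    """Compare bucket IDs; a mismatch carries no usable information."""
    if bucket_of[a] == bucket_of[b]:
        return "likely similar"
    return "unknown"


def cross_bucket_friend_fraction(user: str) -> float:
    """Fraction of a user's friends assigned to a different bucket."""
    friends = friends_of.get(user, set())
    if not friends:
        return 0.0
    split = sum(1 for f in friends if bucket_of[f] != bucket_of[user])
    return split / len(friends)


# u1 and u3 are friends, u1 and u4 are strangers, yet both pairs look alike.
friend_pair = bucket_similarity("u1", "u3")
stranger_pair = bucket_similarity("u1", "u4")
split_fraction = cross_bucket_friend_fraction("u1")
```

In this sketch, the friend pair and the stranger pair produce the same result, and half of the friends of user `u1` fall outside the bucket of `u1`, which is the situation in which a bucket mismatch says little about similarity.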