3D modeling techniques are used for various applications such as augmented reality, visualization, video messaging and so on. 3D models of objects can be generated from images and data scanned by multi-sensor systems, including cameras and depth sensors. Certain applications such as augmented reality based applications require analysis of multiple 3D objects in conjunction with each other. Artificial intelligence techniques have been used for analysis of images and sensor data. For example, neural networks such as GOOGLENET and RESNET perform image recognition, with very high accuracy. Convolutional Neural Networks (CNNs) have been used for image recognition tasks, exploiting spatial structures (e.g., edges, texture, color), while recurrent neural networks (RNNs) are used for temporal processing (such as with natural language: speech, text). These neural networks have also been used in combination, for example to create text annotations for images. However, conventional techniques have been inadequate to perform analysis of 3D models of objects in association with each other as required for augmented reality and other applications.