Video content has become a part of everyday life with an increasing amount of video content becoming available online, and people spending an increasing amount of time online. Additionally, individuals are able to create and share video content online using video sharing websites and social media. Recognizing visual contents in unconstrained videos has found a new importance in many applications, such as video searches on the Internet, video recommendations, smart advertising, etc. However, conventional approaches for recognizing activities in videos do not properly and efficiently make predictions of the activities depicted in the videos.