1. Technical Field
The present invention relates generally to information processing and, in particular, to systems and methods for inferring gender by fusion of multimodal content.
2. Description of the Related Art
Obtaining gender information about users (e.g., in social media) can be very useful for marketing purposes (e.g., user segmentation, targeted advertising, and so forth). Current approaches rely on the following: (1) text analysis, which is tied to a specific language model and reaches a performance ceiling because it uses only one source of information; (2) social graphs, which are complementary to textual information but offer limited gender prediction accuracy; and (3) visual information, drawn only from either face analysis of the profile picture, which is not always available and/or reliable, or profile colors, which offer very limited gender prediction accuracy.
The semantic information in profile pictures that do not show a face, and in the images/videos in a user's collection, has not yet been appropriately employed. Furthermore, the combination of multimodal (namely, visual and non-visual) cues for gender estimation has largely been ignored or has been performed in a trivial, suboptimal manner. Thus, there is a need for a system that derives user gender using an effective multimodal combination of visual and non-visual cues.