Users may consume audio content using a variety of content consumption devices. Certain content consumption devices may be configured to receive voice-based commands, or may otherwise be configured to recognize speech. Voice input that a user provides to such a device may reflect a physical or emotional characteristic of that user. Accordingly, it may be desirable to determine a physical or emotional characteristic of a user from the user's voice input.