Smart home devices are commonly capable of collecting real-time multimedia data (including video and/or audio data) and identifying events in the collected multimedia data. For example, some multimedia surveillance devices identify individual audio events such as screams and gunshots. As another example, an automatic health monitoring device detects cough sounds as a representative acoustic symptom of abnormal health conditions, for the purpose of attending to the health of elderly people who live alone. Some home devices include digital audio applications that classify acoustic events into distinct classes (e.g., music, news, sports, cartoons and movies). Regardless of the audio events or classes that are detected, existing home devices rely on predetermined audio programs to identify individual audio events independently or, at most, in the context of general background noise. These smart home devices do not differentiate among multiple audio events that often occur simultaneously, nor do they adjust the predetermined audio programs according to the capabilities of the home devices and variations in the ambient environment. It would be beneficial to have a more efficient audio event detection mechanism than the current practice.