The field of human-computer interaction and interfaces has undergone rapid change in recent years due to advances in computer technology and related fields. In particular, considerable time and effort have been directed to using technological advances in multimedia and multimodal processing to improve the usability and productivity of processing systems. Multimedia systems have been developed and improved at a rapid pace to give users more options and flexibility in their interface with the systems, for both input and output applications.
However, current multimedia systems still have a number of shortcomings in how they use the available multimedia capabilities to present information in various output modalities. Current multimedia systems typically prescribe the modality and representation for presenting any given output information in a fixed or predetermined manner. For example, a word processing application may be configured to always present output information to users in text format on the display screen, or a company announcement system may be designed to always present company announcements as audio output. Even when a system or an application is designed to present certain information simultaneously in multiple modalities (e.g., both visually and aurally), such a system or application typically cannot determine whether such a multimodal presentation is desirable or even acceptable from a user's point of view in a given situation. For example, assuming that a company announcement system is designed and configured to always present company announcements both visually (as text on the display screen) and aurally (as audio output on speakers attached to the user's computer), such an inflexible selection of output modality (or modalities) may not be desirable or acceptable to certain users in certain situations.
For example, a user A who is currently listening to music or participating in a conference call may not want to hear a company announcement as audio output on the speakers. In this case, user A may just want to see the company announcement displayed on the computer screen. Similarly, a user B who is working on a word processing document and already has a number of windows open on his computer screen may not want his screen cluttered with any more text, but rather may only want to hear the company announcement as audio output. In other words, for any given system and/or application, rigid or inflexible assignment of information to any particular output modality may be undesirable or even unacceptable to certain users in certain circumstances. Too much information presented in any given output modality may lead to information overload with respect to a user's sensory and mental capacity to absorb information. In addition, certain users may not be able to receive or interpret information in certain modalities at all. For example, audio output may not be acceptable for users with hearing impairments. Likewise, visual output (e.g., text, graphics, video, etc.) is not acceptable to users who are blind.
In addition to the shortcomings mentioned above, current multimodal and multimedia systems also lack the capability to effectively and efficiently support multiple representations of information in a multimodal environment. For example, output information from a given application may be represented in different forms or formats. However, current systems and applications are unable to determine which representation(s) of information may be preferable from a user's point of view under given circumstances.
Even when an application is designed to provide and support alternative representations of information, the selection of a particular type of representation is typically predetermined, or requires the user to manually specify the representation that is suitable for that user at a particular moment in time. For example, a typical file management system may allow users to view their file listing as a summary listing (e.g., a listing of file names only), a detailed listing (e.g., a listing of file names with additional file information such as the file types, the file sizes, etc.), a list of small icons, or a list of large icons, etc. However, a user is typically required to specify or select the particular type of listing (representation of file information) that he wants to use at any given moment. Similarly, even when systems and applications support multiple window sizes and positions for visual display, the size (and the position) of the window for any given information is either predetermined by the system or application, or the user is required to manually select or specify the desired size (and position). For example, when a user brings up multiple applications, the information generated by those applications is displayed either in all large windows or all small windows in fixed positions, or in a fixed overlapping manner, without taking into consideration the user's preferences or needs or changes in the system's conditions and environment.