In present-day (hands-free) one-to-one communication systems, speech is transmitted over a monophonic transmission channel, among other reasons because of bandwidth limitations. As a result, at the reproduction side all sounds come from the same direction (or directions, if multiple loudspeakers are used), and hence the human ability to separate sound sources based on binaural hearing cannot be used. As a consequence, listening to speech contaminated with noise and/or competing speakers is difficult and leads to reduced speech intelligibility and listener fatigue. For this reason, in hands-free telephony systems the desired speech signal that is transmitted is made as “clean” as possible, i.e. it comprises only the desired direct speech. Stationary noise suppression is a must-have in hands-free communication. Microphone array beam-forming with additional processing can be used to further enhance the speech. However, the known systems do not provide a face-to-face feeling during communication, especially not in informal settings where not only the speech (the message) is important but also the feeling of being together.