There are many methods in the prior art for filtering noise from speech data. One of the more recent approaches is to use a neural network to model a speaker's voice and then extract the speech from the noisy speech data. This approach has met with limited success for several reasons. First, it requires a large amount of training time and training data, including samples of the speaker's speech, which are sometimes impractical or impossible to acquire. Second, the speech, after being extracted from the noise, may include some distortion depending on the precision with which the speech was modeled by the neural network. Third, the neural network model is likely to work with only a single language, meaning that an additional neural network must be trained and deployed to model speech in each other language. There is therefore a need for a robust technique to accurately represent and filter noise from speech data that operates independently of the language and of the amount of training speech data.