When multimedia content is delivered over a distribution network to a plurality of end users, whether via satellite, cable, twisted copper, fiber, or another medium, audio components and video components may be segregated to improve network efficiency. However, when segregated audio and video packets are transported across the network, random and systematic sources of error or delay may affect video and audio packets differently and can, therefore, negatively impact the synchronization between the audio and the video. Because the most common or recognizable manifestation of the problem may be a detectable difference in timing between the visual perception of the movement of a speaker's lips and the audio perception of the corresponding sound, this problem is commonly referred to as lip synchronization error or, more simply, lip sync error.