Knowledge of channel state information (CSI) at the transmitter and/or the receiver can have a dramatic impact on the power and spectral efficiency of communication over multipath channels. In particular, coherent communication is generally far more efficient than non-coherent communication, but it requires that the CSI be known at the receiver. Training-based schemes, which probe the channel with known signaling waveforms and apply linear processing to the corresponding channel output, are commonly used to learn the CSI at the receiver. Recent measurement studies have shown that physical multipath channels tend to exhibit an approximately sparse multipath structure when the signal space dimension (the time-bandwidth-antenna product) is high. The CSI for such channels is characterized by significantly fewer dominant parameters than the maximum number dictated by the delay-Doppler-angle spread of the channel. Conventional training-based methods, which are often based on exhaustive probing coupled with a linear least squares approach or a non-linear parametric estimator, are ill-suited to exploiting the inherently low-dimensional nature of sparse or approximately sparse multipath channels.
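As a rough illustration of the conventional approach described above, the following sketch performs training-based least squares estimation of a sparse channel. Everything in it is an assumption made for illustration rather than a setup taken from this work: a single-antenna, delay-only channel with 16 resolvable taps of which only three are non-zero, a random ±1 training sequence of length 64, additive Gaussian noise, and the helper names `convolution_matrix`, `matvec`, and `least_squares`. The point it is meant to show is that the least squares estimator solves for all 16 taps even though only three carry energy.

```python
import random


def convolution_matrix(x, num_taps):
    """Toeplitz matrix X such that X h equals the linear convolution of x and h."""
    rows = len(x) + num_taps - 1
    return [[x[i - j] if 0 <= i - j < len(x) else 0.0 for j in range(num_taps)]
            for i in range(rows)]


def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def least_squares(A, y):
    """Minimize ||A h - y||^2 via the normal equations and Gaussian elimination."""
    m, n = len(A), len(A[0])
    # Augmented system [A^T A | A^T y].
    G = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         + [sum(A[k][i] * y[k] for k in range(m))] for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(G[r][col]))
        G[col], G[pivot] = G[pivot], G[col]
        for r in range(col + 1, n):
            factor = G[r][col] / G[col][col]
            for c in range(col, n + 1):
                G[r][c] -= factor * G[col][c]
    # Back substitution.
    h = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(G[i][j] * h[j] for j in range(i + 1, n))
        h[i] = (G[i][n] - tail) / G[i][i]
    return h


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical sparse channel: 16 taps allowed by the delay spread, 3 dominant.
NUM_TAPS = 16
h_true = [0.0] * NUM_TAPS
h_true[2], h_true[7], h_true[11] = 1.0, -0.6, 0.3
dominant = [i for i, tap in enumerate(h_true) if tap != 0.0]

# Known training sequence (assumed random +/-1) probing the channel.
training = [rng.choice((-1.0, 1.0)) for _ in range(64)]
X = convolution_matrix(training, NUM_TAPS)

# Channel output observed in additive Gaussian noise (assumed level).
NOISE_STD = 0.05
received = [v + rng.gauss(0.0, NOISE_STD) for v in matvec(X, h_true)]

# Linear least squares estimate: every tap is treated as an unknown.
h_ls = least_squares(X, received)

print("parameters estimated by least squares:", len(h_ls))
print("dominant parameters in the channel:   ", len(dominant))
```

The estimator spends its training budget on all 16 unknowns, so the noise is spread across taps that are in fact zero; this is the sense in which such methods do not exploit the low-dimensional structure of the channel.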