Compression schemes and memory layouts have been proposed over time for better achieving efficiency in data processing. However, conventional techniques are severely limited due to resource constraints, which is particularly true in machine learning environments. For example, conventional schemes are known to require performance of multiple convolutional operations on layers in neural networks, which often results in over-fetching of data around boundaries.