Data resampling refers to recalculating the values of the samples at new locations, using a combination of the original values near those locations. For example, as shown in FIGS. 14A and 14B, in the context of image-based resampling, similar areas (e.g., from a map of Europe) can be shown in different projections, such as geographic latitude/longitude (FIG. 14A) and Lambert Conformal Conic (LCC) (FIG. 14B). In an exemplary reprojection scenario, an original set of data is received in the geographic latitude/longitude format, but the received data is only valid for the land portions and not the water portions, such as would exist in NASA's Land Surface Temperature Anomaly data sets. Thus, all the land-based pixels are valid data and the water-based pixels are invalid (e.g., as represented by NaN, Inf, or a special out-of-range value like 99999.0). Since this data is in angular coordinates, shapes closer to the poles are stretched, as evidenced by the comparison with the LCC projection of the same area (FIG. 14B). For example, the Scandinavian countries appear smaller in the LCC projection as compared to the geographic view.
A reprojection from the geographic view to the LCC view therefore warps each pixel in the image and requires resampling to be performed to create the output image as a regular grid of values. Three known modes of resampling are nearest neighbor, bilinear, and cubic convolution. However, using known interpolation techniques, the invalid pixels can affect the output values of valid land pixels that are near the coast. In fact, the cubic convolution algorithm, with its 4×4 neighborhood, might make some of the smaller islands completely disappear into values (e.g., NaN or impossible values like 50000) representing invalid data points, because every pixel on the island is within 3 pixels of a white “no data” pixel.
As described herein, data samples are assumed to lie on a regular grid that is indexed by a sequential series of integers. Pairs of floating point resampling coordinates that are to be evaluated are identified using (x, y) notation. Further, the floor(x) function is denoted $\lfloor x \rfloor$, which returns the largest integer less than or equal to x. Also, dx is defined as $dx = x - \lfloor x \rfloor$, which is the fractional portion of x in the [0, 1) range. For a given collection of data F, the notation $F_{i,j}$ is used as the value of the sample at integer coordinates (i, j).
For nearest neighbor resampling, the data sample closest to (x, y) is selected. The closest sample is in the 2×2 neighborhood of samples $F_{i,j}$, $F_{i+1,j}$, $F_{i,j+1}$, $F_{i+1,j+1}$, where i = floor(x) and j = floor(y). This is usually done by testing dx<0.5 and dy<0.5 to determine which of the 4 neighboring samples is closest.
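By way of non-limiting illustration, the nearest neighbor selection described above can be sketched in Python as follows. The function name `nearest_neighbor`, the row-major storage convention `F[j][i]`, and the sample grid are assumptions made for this example only, and no bounds checking is performed.

```python
import math

def nearest_neighbor(F, x, y):
    """Return the sample of F closest to (x, y).

    Assumption for this sketch: F is a list of rows, so F[j][i] is the
    sample at integer coordinates (i, j). The caller must keep (x, y)
    far enough from the grid edge that i + 1 and j + 1 are valid indices.
    """
    i, j = math.floor(x), math.floor(y)
    dx, dy = x - i, y - j          # fractional parts in [0, 1)
    # dx < 0.5 means column i is closer; otherwise column i + 1 is.
    ni = i if dx < 0.5 else i + 1
    # The same test on dy chooses between row j and row j + 1.
    nj = j if dy < 0.5 else j + 1
    return F[nj][ni]

# Hypothetical 2x2 grid used only to exercise the function.
grid = [[10.0, 20.0],
        [30.0, 40.0]]
```

In this sketch a coordinate exactly halfway between two samples (dx = 0.5) resolves to the higher index, which follows directly from the strict dx<0.5 test.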
For bilinear resampling, the same 2×2 neighborhood of samples around the (x, y) coordinates is used, but a weighted average of those 4 values is calculated. The full expression for this weighted average is:

$$F_{i,j}\,(1-dx)(1-dy) + F_{i+1,j}\,dx\,(1-dy) + F_{i,j+1}\,(1-dx)\,dy + F_{i+1,j+1}\,dx\,dy$$
Through the separation of variables, it is possible to first pair-wise interpolate in one direction, and then interpolate those intermediate values in the orthogonal direction. The order is unimportant; the same result is calculated either way. So, for example, each pair of samples in the same row is first interpolated using dx, and then those two intermediate values are interpolated using dy:

$$\begin{aligned}
\mathrm{top} &= F_{i,j}\,(1-dx) + F_{i+1,j}\,dx \\
\mathrm{bottom} &= F_{i,j+1}\,(1-dx) + F_{i+1,j+1}\,dx \\
\mathrm{value} &= \mathrm{top}\,(1-dy) + \mathrm{bottom}\cdot dy
\end{aligned}$$
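As a non-limiting sketch, the equivalence of the full weighted average and the two-step separable calculation can be checked in Python. The function names, the `F[j][i]` storage convention, and the sample grid are assumptions introduced for this example; the coordinates must lie where i + 1 and j + 1 are valid indices.

```python
import math

def bilinear_direct(F, x, y):
    """Full four-term weighted average over the 2x2 neighborhood."""
    i, j = math.floor(x), math.floor(y)
    dx, dy = x - i, y - j
    return (F[j][i] * (1 - dx) * (1 - dy)
            + F[j][i + 1] * dx * (1 - dy)
            + F[j + 1][i] * (1 - dx) * dy
            + F[j + 1][i + 1] * dx * dy)

def bilinear_separable(F, x, y):
    """Interpolate along each row with dx, then between rows with dy."""
    i, j = math.floor(x), math.floor(y)
    dx, dy = x - i, y - j
    top = F[j][i] * (1 - dx) + F[j][i + 1] * dx
    bottom = F[j + 1][i] * (1 - dx) + F[j + 1][i + 1] * dx
    return top * (1 - dy) + bottom * dy

# Hypothetical 2x2 grid; F[j][i] holds the sample at (i, j).
grid = [[10.0, 20.0],
        [30.0, 40.0]]
```

For the grid above at (x, y) = (0.25, 0.5), the row interpolations give top = 12.5 and bottom = 32.5, and both functions evaluate to 22.5.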
For cubic convolution resampling, a 4×4 sample neighborhood is used, using all the values from (i−1, j−1) to (i+2, j+2) inclusive. Again, separation of variables is used to perform the calculations in one direction, then the other. The cubic convolution resampling function is actually a family of functions, parameterized by a value a in the [−1, 0) range. The full expression for this in the horizontal direction, where $dx^2$ and $dx^3$ denote the square and cube of dx, is:

$$\begin{aligned}
&F_{i-1}\,\bigl(a\,dx^3 - 2a\,dx^2 + a\,dx\bigr) \\
{}+{} &F_{i}\,\bigl((a+2)\,dx^3 - (a+3)\,dx^2 + 1\bigr) \\
{}+{} &F_{i+1}\,\bigl(-(a+2)\,dx^3 + (2a+3)\,dx^2 - a\,dx\bigr) \\
{}+{} &F_{i+2}\,\bigl(-a\,dx^3 + a\,dx^2\bigr)
\end{aligned}$$

This function is used for each of the four rows, j−1, j, j+1, j+2, and then the same function is used to interpolate those values using dy instead of dx. The most common value for 2-D resampling is a=−0.5, which reduces the interpolation to cubic Hermite spline interpolation. This yields a continuous resampled surface with a continuous first derivative, which looks best to the eye.
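The cubic convolution calculation described above can be sketched, by way of example only, as follows. The helper names `cubic_weights` and `cubic_convolution`, the `F[j][i]` storage convention, and the test plane are assumptions made for this illustration; the sketch performs no edge handling.

```python
import math

def cubic_weights(d, a=-0.5):
    """Weights for the samples at offsets -1, 0, +1, +2.

    d is the fractional coordinate in [0, 1); a is the family parameter.
    """
    d2, d3 = d * d, d * d * d
    return (
        a * d3 - 2 * a * d2 + a * d,
        (a + 2) * d3 - (a + 3) * d2 + 1,
        -(a + 2) * d3 + (2 * a + 3) * d2 - a * d,
        -a * d3 + a * d2,
    )

def cubic_convolution(F, x, y, a=-0.5):
    """Separable cubic convolution over the 4x4 neighborhood of (x, y).

    Assumption: 1 <= i and i + 2 < width (likewise for j). Python would
    silently wrap a negative index, so the caller must respect this.
    """
    i, j = math.floor(x), math.floor(y)
    wx = cubic_weights(x - i, a)
    wy = cubic_weights(y - j, a)
    offsets = (-1, 0, 1, 2)
    # Horizontal pass: one interpolated value per row j-1 .. j+2.
    rows = [sum(w * F[j + r][i + c] for w, c in zip(wx, offsets))
            for r in offsets]
    # Vertical pass: combine the four row values using the dy weights.
    return sum(w * v for w, v in zip(wy, rows))

# Hypothetical 5x5 test data lying on the plane 2*i + 3*j.
plane = [[2.0 * i + 3.0 * j for i in range(5)] for j in range(5)]
```

The four weights sum to 1 for any dx, and with a=−0.5 the sketch reproduces a linear plane, which gives a simple sanity check on the coefficients.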
Such resampling functions expect that all the $F_{i,j}$ sample values are valid. However, this is not always the case, as data collections frequently define a “no data” value for samples that were not measured, and known methods can mask out unimportant samples to restrict the calculations. In both such cases, samples that should not be used are nevertheless utilized in the calculations. The worst case arises when the original data set uses floating point values and the “no data” value is set to the special value of Infinity or NaN (not a number). Calculations involving these special values generally return a special value as well, so the invalid sample propagates into every result it contributes to. With nearest neighbor resampling, the number of “no data” samples generated is comparable to the number of “no data” samples in the input data set. With bilinear resampling, each “no data” sample can affect a 3×3 neighborhood centered on it, since it could be any of the samples in the 2×2 neighborhood used in the weighted average calculations. With cubic convolution resampling, the “no data” sample can affect a 7×7 neighborhood centered on it, since it could be any of the samples in the 4×4 neighborhood used in the calculations. Thus, the resampled output can have many more “no data” values than the original data set, which can severely affect the utility of the resampled data set.
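The propagation of a “no data” value through bilinear resampling can be illustrated, under stated assumptions, with the following Python sketch. The grid contents, the half-sample evaluation positions, and the function names are hypothetical and chosen only to make the effect easy to count.

```python
import math

NAN = float("nan")

def bilinear(F, x, y):
    """Plain bilinear interpolation with no special handling of NaN."""
    i, j = math.floor(x), math.floor(y)
    dx, dy = x - i, y - j
    top = F[j][i] * (1 - dx) + F[j][i + 1] * dx
    bottom = F[j + 1][i] * (1 - dx) + F[j + 1][i + 1] * dx
    return top * (1 - dy) + bottom * dy

def count_nan_outputs(F):
    """Resample at the center of every 2x2 cell and count NaN results."""
    n_rows, n_cols = len(F), len(F[0])
    results = [bilinear(F, i + 0.5, j + 0.5)
               for j in range(n_rows - 1)
               for i in range(n_cols - 1)]
    return sum(math.isnan(v) for v in results)

# Hypothetical 5x5 grid of valid samples with a single "no data" sample.
grid = [[1.0] * 5 for _ in range(5)]
grid[2][2] = NAN
```

In this example the single NaN sample at (2, 2) belongs to four different 2×2 neighborhoods, so `count_nan_outputs(grid)` reports four invalid outputs. A NaN sample also invalidates a result when its weight is exactly zero, as at (x, y) = (1.0, 1.0), because the product of zero and NaN is NaN.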