co.cask.cdap.metrics.data.TimeValueAggregator
Given a collection of time series, aggregate the values at each timestamp between the earliest and latest data points, where the value at each timestamp can be interpolated if there is no data point at that timestamp.

For example, given two series that look like:

    t1  t2  t3  t4  t5  t6  t7  t8
    -   5   -   5   -   5   -   9
    1   -   3   -   1   -   3   -

without any interpolation, they would aggregate to a single time series like:

    t1  t2  t3  t4  t5  t6  t7  t8
    1   5   3   5   1   5   3   9

This is fine if the absence of data really means the value there is 0. However, if there is no data because data points are not being written at the finest granularity (1 second), then this aggregate does not give an accurate picture. This can be the case when sampling, or when the metric being tracked changes infrequently and is therefore written infrequently. Interpolating the data means filling in the missing points with something that is reasonably likely to have been the true value at that point. Only gaps between two data points of the same series are filled; timestamps before a series' first point or after its last point are left empty.

With linear interpolation, the individual time series get transformed into:

    t1  t2  t3  t4  t5  t6  t7  t8
    -   5   5   5   5   5   7   9
    1   2   3   2   1   2   3   -

and the final aggregate time series becomes:

    t1  t2  t3  t4  t5  t6  t7  t8
    1   7   8   7   6   7   10  9

With step interpolation, the individual time series get transformed into:

    t1  t2  t3  t4  t5  t6  t7  t8
    -   5   5   5   5   5   5   9
    1   1   3   3   1   1   3   -

and the final aggregate time series becomes:

    t1  t2  t3  t4  t5  t6  t7  t8
    1   6   8   8   6   6   8   9
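
The three aggregates in the example can be reproduced with a small standalone sketch. This is not the actual TimeValueAggregator implementation: the class name InterpolatingAggregatorSketch, the Mode enum, and the helpers seriesOf, valueAt, and aggregate are all hypothetical names chosen for illustration. It assumes 1-second granularity, timestamps t1..t8 mapped to 1..8, values summed across series, and linear results rounded to the nearest long.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class InterpolatingAggregatorSketch {

  // Hypothetical modes; the real class may model interpolation differently.
  enum Mode { NONE, STEP, LINEAR }

  // Builds a series with one slot per second starting at firstTs; null marks a missing point.
  static TreeMap<Long, Long> seriesOf(long firstTs, Long... values) {
    TreeMap<Long, Long> series = new TreeMap<>();
    for (int i = 0; i < values.length; i++) {
      if (values[i] != null) {
        series.put(firstTs + i, values[i]);
      }
    }
    return series;
  }

  // Returns the (possibly interpolated) value at ts, or null if the series contributes nothing.
  static Long valueAt(TreeMap<Long, Long> series, long ts, Mode mode) {
    Long exact = series.get(ts);
    if (exact != null) {
      return exact;
    }
    Map.Entry<Long, Long> prev = series.lowerEntry(ts);
    Map.Entry<Long, Long> next = series.higherEntry(ts);
    // Only fill gaps that lie between two real data points of the same series.
    if (mode == Mode.NONE || prev == null || next == null) {
      return null;
    }
    if (mode == Mode.STEP) {
      // Step: carry the previous value forward until the next data point.
      return prev.getValue();
    }
    // Linear: walk along the straight line between the surrounding points.
    double slope = (double) (next.getValue() - prev.getValue()) / (next.getKey() - prev.getKey());
    return Math.round(prev.getValue() + slope * (ts - prev.getKey()));
  }

  // Sums all series at every second between the earliest and latest data point.
  static TreeMap<Long, Long> aggregate(List<TreeMap<Long, Long>> all, Mode mode) {
    long start = Long.MAX_VALUE;
    long end = Long.MIN_VALUE;
    for (TreeMap<Long, Long> series : all) {
      if (!series.isEmpty()) {
        start = Math.min(start, series.firstKey());
        end = Math.max(end, series.lastKey());
      }
    }
    TreeMap<Long, Long> result = new TreeMap<>();
    for (long ts = start; ts <= end; ts++) {
      for (TreeMap<Long, Long> series : all) {
        Long value = valueAt(series, ts, mode);
        if (value != null) {
          result.merge(ts, value, Long::sum);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<TreeMap<Long, Long>> input = Arrays.asList(
        seriesOf(1, null, 5L, null, 5L, null, 5L, null, 9L),
        seriesOf(1, 1L, null, 3L, null, 1L, null, 3L, null));
    System.out.println(aggregate(input, Mode.NONE).values());   // [1, 5, 3, 5, 1, 5, 3, 9]
    System.out.println(aggregate(input, Mode.LINEAR).values()); // [1, 7, 8, 7, 6, 7, 10, 9]
    System.out.println(aggregate(input, Mode.STEP).values());   // [1, 6, 8, 8, 6, 6, 8, 9]
  }
}
```

The sketch materializes every series in a sorted map and looks up neighbours per timestamp, which keeps the logic easy to follow. A production aggregator would more likely stream through the series in timestamp order rather than hold them all in memory.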