TimeSeries
After the successful creation of a TimeAxis, a Shyft TimeSeries can be instantiated. In Shyft we see the TimeSeries as a function f(t) that can be evaluated at any time point. If we evaluate the TimeSeries outside the time axis range, it will return NaN. Inside the defined intervals of the time axis, it will interpolate between values; how it interpolates depends on the type of TimeSeries we use.
A concrete time series can be instantiated by giving a TimeAxis, ta, a set of values, values, and the point interpretation, point_fx, you want the time series to have.
The available point interpretations are POINT_INSTANT_VALUE and POINT_AVERAGE_VALUE.
These are available in shyft.time_series.point_interpretation_policy.
As mentioned above, the time series value f(t) is defined by its numbers and point interpretation within the total_period of the TimeAxis, and is NaN outside this definition interval.
From this it mathematically follows that a binary operation between two time series will yield a time series with a time axis covering the intersection of the two operands.
Thus the result will also be NaN outside the resulting time axis.
For all time series representations, we strongly recommend using un-prefixed SI units, e.g. W (watt), not MW (megawatt).
This greatly simplifies doing math with formulas, and it works well with the integral and derivative functions. Avoid anything but un-prefixed SI units until the presentation level (UI, or export to third-party system interfaces).
Recall that time in Shyft is simply a number whose SI unit is s (seconds).
Time series types
POINT_INSTANT_VALUE
A POINT_INSTANT_VALUE is a time series where the value is linearly interpolated between the start and end points of each interval. The term "linear between points" captures an important implication of this time series type: it requires at least two points to define a line.
This representation is useful for signals that represent a state, such as an observed water level at a point in time.
So if we create a four-interval time axis and input the values [0, 3, 1, 4] as shown below
from shyft.time_series import (
    TimeSeries, TimeAxis, Calendar, point_interpretation_policy
)
ta = TimeAxis(0, 1, 4)
values = [0, 3, 1, 4]
ts_instant = TimeSeries(
    ta=ta,
    values=values,
    point_fx=point_interpretation_policy.POINT_INSTANT_VALUE
)
it will produce a time series of the form
              t3_____t4
      t1     /
      /\    /
     /  \  /
    /    \/
t0 /     t2
and evaluating the time series at different time points we can see how it interpolates between the points.
print(ts_instant(0)) # 0.0
print(ts_instant(1)) # 3.0
print(ts_instant(1.5)) # 2.0
print(ts_instant(3.5)) # 4.0
print(ts_instant(4.0)) # nan
Note
Note that in the last time interval the series holds the last value as a flat line, since there is no following point to interpolate to.
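The evaluation rule described above can be mimicked in plain Python. This is only a sketch of the semantics, not Shyft's implementation: eval_linear is a hypothetical helper, and the time points are written out by hand as the start of each interval plus the end of the last one.

```python
import math
from bisect import bisect_right


def eval_linear(time_points, values, t):
    """Evaluate a linear-between-points series at time t (hypothetical helper)."""
    if not (time_points[0] <= t < time_points[-1]):
        return math.nan  # outside the time axis
    i = bisect_right(time_points, t) - 1  # index of the interval containing t
    if i == len(values) - 1:
        return float(values[i])  # last interval: no next point, hold the value
    t0, t1 = time_points[i], time_points[i + 1]
    w = (t - t0) / (t1 - t0)  # relative position inside the interval
    return values[i] + w * (values[i + 1] - values[i])


time_points = [0, 1, 2, 3, 4]
values = [0, 3, 1, 4]
for t in (0, 1, 1.5, 3.5, 4.0):
    print(t, eval_linear(time_points, values, t))
```

The printed values follow the same pattern as the ts_instant calls above: 0.0, 3.0, 2.0, 4.0 and nan.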
POINT_AVERAGE_VALUE
A POINT_AVERAGE_VALUE is a time series type where the whole interval has the same value. It is typically used to represent signals that are constant over a time interval, such as power produced [W] or water flow [m3/s].
ts_average = TimeSeries(
    ta=ta,
    values=values,
    point_fx=point_interpretation_policy.POINT_AVERAGE_VALUE
)
                   t3______t4
       t1_____     |
       |      |    |
       |      |    |
       |      |____|
t0_____|      t2
Evaluating the average time series at the same points as the instant series shows how the two interpretations differ:
print(ts_average(0)) # 0.0
print(ts_average(1)) # 3.0
print(ts_average(1.5)) # 3.0
print(ts_average(3.5)) # 4.0
print(ts_average(4.0)) # nan
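For comparison, the stair-case rule can be sketched in a few lines of plain Python. Again this only illustrates the semantics described above; eval_stair_case is a hypothetical helper and not part of Shyft.

```python
import math
from bisect import bisect_right


def eval_stair_case(time_points, values, t):
    """Evaluate a stair-case series at time t (hypothetical helper)."""
    if not (time_points[0] <= t < time_points[-1]):
        return math.nan  # outside the time axis
    # every point inside an interval takes that interval's value
    return float(values[bisect_right(time_points, t) - 1])


time_points = [0, 1, 2, 3, 4]
values = [0, 3, 1, 4]
for t in (0, 1, 1.5, 3.5, 4.0):
    print(t, eval_stair_case(time_points, values, t))
```

The only evaluation point where this differs from the linear interpretation is t = 1.5, which gives 3.0 instead of 2.0.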
Inspection functions
To inspect a time series there are a few utility functions we should know about.
TimeSeries(t: int/float/time)
If we call the time series with an int, float or time object, it will evaluate itself at that time point and return the value.
ta = TimeAxis(0, 1, 4)
values = [0, 3, 1, 4]
ts = TimeSeries(
    ta=ta,
    values=values,
    point_fx=point_interpretation_policy.POINT_AVERAGE_VALUE
)
print(ts(1)) # 3.0
point_interpretation()
Returns the point interpretation of the time series.
print(ts.point_interpretation()) # POINT_AVERAGE_VALUE
size()
Returns the number of intervals in the time series
print(ts.size()) # 4
time_axis
Returns the time axis of the time series
print(ts.time_axis) # TimeAxis('1970-01-01T00:00:00Z', 1s, 4)
values.to_numpy()
Returns a numpy array with the values it was set up with
print(ts.values.to_numpy()) # [0. 3. 1. 4.]
General time series manipulation
For a comprehensive list of available functions see shyft.time_series.TimeSeries().
The time series in Shyft are thought of as mathematical expressions and not indexed values.
This makes the manipulation of time series a bit different than with numpy arrays or pandas series.
We set up some convenience functions to inspect the time series.
from shyft.time_series import (
    time, TimeSeries, TimeAxis, POINT_AVERAGE_VALUE as stair_case,
    POINT_INSTANT_VALUE as linear, Calendar,
    FORWARD as d_forward, BACKWARD as d_backward, CENTER as d_center
)
def show_values(text: str, ts: TimeSeries):
    # print a label followed by the raw values of the time series
    print(f'{text}:\n {ts.values.to_numpy()}')
Arithmetics
Basic arithmetic with time series is as simple as with numbers.
HOUR = time(3600)
values = list(range(24))
ta = TimeAxis(start=time('2021-01-01T00:00:00Z'), delta_t=HOUR, n=len(values))
ts = TimeSeries(ta=ta, values=values, point_fx=stair_case)
show_values('Initial TimeSeries', ts)
# Initial TimeSeries:
# [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17.
# 18. 19. 20. 21. 22. 23.]
ts_addition = ts + 2
ts_multiplication = ts*2
show_values('TimeSeries + 2', ts_addition)
#TimeSeries + 2:
# [ 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19.
# 20. 21. 22. 23. 24. 25.]
show_values('TimeSeries*2', ts_multiplication)
#TimeSeries*2:
# [ 0. 2. 4. 6. 8. 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34.
# 36. 38. 40. 42. 44. 46.]
We can also do the same arithmetic on two time series
values.reverse()
ts_reversed = TimeSeries(ta=ta, values=values, point_fx=stair_case)
show_values('TimeSeries + TimeSeries', ts + ts_reversed)
#TimeSeries + TimeSeries:
# [23. 23. 23. 23. 23. 23. 23. 23. 23. 23. 23. 23. 23. 23. 23. 23. 23. 23.
# 23. 23. 23. 23. 23. 23.]
show_values('TimeSeries*TimeSeries', ts*ts_reversed)
#TimeSeries*TimeSeries:
# [ 0. 22. 42. 60. 76. 90. 102. 112. 120. 126. 130. 132. 132. 130.
# 126. 120. 112. 102. 90. 76. 60. 42. 22. 0.]
One thing to be aware of when doing arithmetic on time series is whether they have different time axes. If, for example, two time series only partially overlap in time, the result is defined only in the overlapping time intervals.
values.reverse()
ta1 = TimeAxis(start=time('2021-01-01T00:00:00Z'), delta_t=HOUR, n=len(values))
# time axis shifted 12 hours
ta2 = TimeAxis(start=time('2021-01-01T12:00:00Z'), delta_t=HOUR, n=len(values))
ts1 = TimeSeries(ta=ta1, values=values, point_fx=stair_case)
ts2 = TimeSeries(ta=ta2, values=values, point_fx=stair_case)
ts_addition = ts1 + ts2
show_values('TimeSeries.time_axis1 + TimeSeries.time_axis2', ts_addition)
#TimeSeries.time_axis1 + TimeSeries.time_axis2:
# [12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34.]
print(f'{"t":15}{"ts1":10}{"ts2":10}{"ts1 + ts2":10}')
# evaluate all three series at every time point of either time axis
for t in map(int, sorted(set(list(ta1.time_points) + list(ta2.time_points)))):
    print(f'{t:<15}{ts1(t):<10}{ts2(t):<10}{ts_addition(t):<10}')
#t ts1 ts2 ts1 + ts2
#1609459200 0.0 nan nan
#1609462800 1.0 nan nan
#1609466400 2.0 nan nan
#1609470000 3.0 nan nan
#1609473600 4.0 nan nan
#1609477200 5.0 nan nan
#1609480800 6.0 nan nan
#1609484400 7.0 nan nan
#1609488000 8.0 nan nan
#1609491600 9.0 nan nan
#1609495200 10.0 nan nan
#1609498800 11.0 nan nan
#1609502400 12.0 0.0 12.0
#1609506000 13.0 1.0 14.0
#1609509600 14.0 2.0 16.0
#1609513200 15.0 3.0 18.0
#1609516800 16.0 4.0 20.0
#1609520400 17.0 5.0 22.0
#1609524000 18.0 6.0 24.0
#1609527600 19.0 7.0 26.0
#1609531200 20.0 8.0 28.0
#1609534800 21.0 9.0 30.0
#1609538400 22.0 10.0 32.0
#1609542000 23.0 11.0 34.0
#1609545600 nan 12.0 nan
#1609549200 nan 13.0 nan
#1609552800 nan 14.0 nan
#1609556400 nan 15.0 nan
#1609560000 nan 16.0 nan
#1609563600 nan 17.0 nan
#1609567200 nan 18.0 nan
#1609570800 nan 19.0 nan
#1609574400 nan 20.0 nan
#1609578000 nan 21.0 nan
#1609581600 nan 22.0 nan
#1609585200 nan 23.0 nan
#1609588800 nan nan nan
Since ts2 is not defined in the first 12 intervals of ts1 (indices 0 to 11), the resulting time series will not be defined in that period either. The same goes for the period where ts1 is not defined.
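The period covered by the result can be worked out by hand as the intersection of the two total periods. The sketch below does this in plain Python with half-open (start, end) tuples in epoch seconds; intersect is a hypothetical helper, not a Shyft function.

```python
def intersect(period_a, period_b):
    """Return the overlap of two half-open (start, end) periods, or None."""
    start = max(period_a[0], period_b[0])
    end = min(period_a[1], period_b[1])
    return (start, end) if start < end else None


HOUR_S = 3600    # seconds per hour
t0 = 1609459200  # 2021-01-01T00:00:00Z as seconds since epoch
p1 = (t0, t0 + 24 * HOUR_S)                # total period of ta1
p2 = (t0 + 12 * HOUR_S, t0 + 36 * HOUR_S)  # total period of ta2, shifted 12 hours

overlap = intersect(p1, p2)
print(overlap)                              # (1609502400, 1609545600)
print((overlap[1] - overlap[0]) // HOUR_S)  # 12
```

The overlap starts at the first row in the table where all three columns are defined and ends at the first row where ts1 is NaN again, giving the 12 values of ts1 + ts2.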
Mathematical operations
There are many built-in functions to help with manipulating time series in Shyft. These examples are not exhaustive, so please refer to the documentation on shyft.time_series.TimeSeries() for a complete list.
Average
The average function takes only one argument, the time axis. The resulting expression (still a time series) yields the true average over each period of that time axis.
The value of each interval of the resulting time series equals the true average of the non-NaN sections of that interval, i.e. for the i’th interval, the range from period(i).start to period(i).end.
Example: We create two time axes, both covering one week, one with daily resolution and one with hourly resolution. Then we create two hourly time series, one stair case and one linear, with linearly increasing values. After that we average both of them over the daily time axis.
t0 = time('2021-01-01T00:00:00Z')
ta_hourly = TimeAxis(start=t0, delta_t=HOUR, n=24*7)
ta_daily = TimeAxis(start=t0, delta_t=24*HOUR, n=7)
values = [i for i in range(ta_hourly.size())]
ts_stairs = TimeSeries(ta=ta_hourly, values=values, point_fx=stair_case)
ts_linear = TimeSeries(ta=ta_hourly, values=values, point_fx=linear)
show_values('Original stairs', ts_stairs)
#Original stairs:
# [ 0. 1. 2. 3. 4. ... 163. 164. 165. 166. 167.]
show_values('Original linear', ts_linear)
#Original linear:
# [ 0. 1. 2. 3. 4. ... 163. 164. 165. 166. 167.]
ts_stairs_avg = ts_stairs.average(ta_daily)
ts_linear_avg = ts_linear.average(ta_daily)
show_values('Daily stairs average', ts_stairs_avg)
#Daily stairs average:
# [ 11.5 35.5 59.5 83.5 107.5 131.5 155.5]
show_values('Daily linear average', ts_linear_avg)
#Daily linear average:
# [ 12. 36. 60. 84. 108. 132. 155.5]
Note
The point interpretation of a time series created by averaging will always be stair case, since by definition it represents the true average of each interval.
print(f'Point interpretation of ts_stairs_avg: {ts_stairs_avg.point_interpretation()}\n'
      f'Point interpretation of ts_linear_avg: {ts_linear_avg.point_interpretation()}')
#Point interpretation of ts_stairs_avg: POINT_AVERAGE_VALUE
#Point interpretation of ts_linear_avg: POINT_AVERAGE_VALUE
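To see where these numbers come from, the averages can be recomputed by hand in plain Python. This is a sketch under simple assumptions (equal-length hourly intervals, no NaN values) and does not use Shyft; daily_stair_case_average is a hypothetical helper. For the linear series only the first day is recomputed, using the trapezoid rule between the hourly points.

```python
values = list(range(24 * 7))  # one value per hour for a week


def daily_stair_case_average(hourly, hours_per_day=24):
    """True average of a stair-case series with equal-length intervals:
    the plain mean of the values that fall in each day."""
    return [
        sum(hourly[i:i + hours_per_day]) / hours_per_day
        for i in range(0, len(hourly), hours_per_day)
    ]


stairs_avg = daily_stair_case_average(values)
print(stairs_avg)  # [11.5, 35.5, 59.5, 83.5, 107.5, 131.5, 155.5]

# Linear interpretation, first day only: integrate each hour as a trapezoid
# between neighbouring points, then divide by the length of the day.
linear_first_day = sum((values[i] + values[i + 1]) / 2 for i in range(24)) / 24
print(linear_first_day)  # 12.0
```

The half-unit difference between 11.5 and 12 comes from the interpretation alone: the stair-case series holds each value for the whole hour, while the linear series ramps towards the next value.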
Accumulate
Accumulate takes a time axis as input and returns a new time series where the i’th value is the integral of the non-NaN fragments from t0 to ti.
ts_linear_hourly_acc = ts_linear.accumulate(ta_hourly)
ts_linear_daily_acc = ts_linear.accumulate(ta_daily)
show_values('Hourly accumulation', ts_linear_hourly_acc/HOUR)
#Hourly accumulation:
# [0.00 0.50 2.00 4.50 8.00 12.50
# ...
# 13122.0 13284.5 13448.0 13612.5 13778.0 13944.5]
show_values('Daily accumulation', ts_linear_daily_acc/(HOUR*24))
#Daily accumulation:
# [ 0. 12. 48. 108. 192. 300. 432.]
Note
Integral operations on Shyft time series are done with dt in seconds, the SI unit for time.
This implies that if the source unit of the time series is W (watt) and you integrate or accumulate it, the result has the correct unit: Ws = J/s x s = J (joule).
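The role of seconds as the time unit can be illustrated with a running trapezoid integral in plain Python. This is a sketch of the idea only, not Shyft's implementation; accumulate_linear and HOUR_S are hypothetical names.

```python
HOUR_S = 3600  # seconds per hour


def accumulate_linear(values, dt):
    """Running trapezoid integral of a linear series, evaluated at the
    start of each interval (hypothetical helper)."""
    acc = [0.0]
    for a, b in zip(values, values[1:]):
        acc.append(acc[-1] + (a + b) / 2 * dt)
    return acc


acc = accumulate_linear(list(range(6)), HOUR_S)
print(acc)                         # integral in value-seconds
print([a / HOUR_S for a in acc])   # [0.0, 0.5, 2.0, 4.5, 8.0, 12.5]

# Units: a constant 1000 W held for one hour integrates to joules
energy_j = 1000 * HOUR_S
print(energy_j)  # 3600000
```

Dividing by HOUR_S, as in the example above, only rescales the result for display; the accumulated series itself stays in value-seconds.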
Derivative
We can compute the derivative of a time series forward, backward or centered. As with accumulate, the operations happen at second resolution, so the resulting time unit follows the SI system.
E.g. if you have a time series with SI unit J (joule) and apply the .derivative() function, the resulting unit will be J/s (joule per second), i.e. W (watt).
show_values('Daily derivative forward', ts_stairs_avg.derivative(d_forward)*(HOUR*24))
#Daily derivative forward:
# [24. 24. 24. 24. 24. 24. 0.]
show_values('Daily derivative backward', ts_stairs_avg.derivative(d_backward)*(HOUR*24))
#Daily derivative backward:
# [ 0. 24. 24. 24. 24. 24. 24.]
show_values('Daily derivative center', ts_stairs_avg.derivative(d_center)*(HOUR*24))
#Daily derivative center:
# [12. 24. 24. 24. 24. 24. 12.]
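The three outputs above can be reproduced with simple finite differences in plain Python. This is a sketch inferred from the printed numbers, not taken from Shyft's source: the zero at the open end and the center variant as the mean of forward and backward are assumptions, and forward, backward and center are hypothetical helpers.

```python
DAY_S = 24 * 3600  # seconds per day
daily = [11.5, 35.5, 59.5, 83.5, 107.5, 131.5, 155.5]


def forward(v, dt):
    """Difference to the next value; assumed 0 where there is no next value."""
    return [(b - a) / dt for a, b in zip(v, v[1:])] + [0.0]


def backward(v, dt):
    """Difference from the previous value; assumed 0 where there is none."""
    return [0.0] + [(b - a) / dt for a, b in zip(v, v[1:])]


def center(v, dt):
    """Assumed to be the mean of the forward and backward differences."""
    return [(f + b) / 2 for f, b in zip(forward(v, dt), backward(v, dt))]


for name, fn in (("forward", forward), ("backward", backward), ("center", center)):
    # scale from per-second to per-day, as in the example above
    print(name, [round(d * DAY_S, 6) for d in fn(daily, DAY_S)])
```

The derivative itself is per second; multiplying by the number of seconds in a day gives the change per day, 24 for all interior intervals.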
Integral
We can integrate a time series over a specified time axis. As for the average function, it works with the non-nan section of each interval of the time axis.
show_values('Daily integral', ts_stairs.integral(ta=ta_daily)/(HOUR*24))
#Daily integral:
# [ 11.5 35.5 59.5 83.5 107.5 131.5 155.5]
Statistics
With the statistics function we can directly get the different percentiles over a specified time axis.
show_values('Daily 10 percentile', ts_linear.statistics(ta=ta_daily, p=10))
#Daily 10 percentile:
# [ 2.3 26.3 50.3 74.3 98.3 122.3 146.3]
show_values('Daily 50 percentile', ts_linear.statistics(ta=ta_daily, p=50))
#Daily 50 percentile:
# [ 11.5 35.5 59.5 83.5 107.5 131.5 155.5]
show_values('Daily 90 percentile', ts_linear.statistics(ta=ta_daily, p=90))
#Daily 90 percentile:
# [ 20.7 44.7 68.7 92.7 116.7 140.7 164.7]
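The first-day numbers above match a percentile computed with linear interpolation between sorted values, which can be sketched in plain Python. That Shyft computes its statistics exactly this way is an assumption based on the printed output; percentile is a hypothetical helper.

```python
def percentile(values, p):
    """Percentile with linear interpolation between ranks (hypothetical helper)."""
    s = sorted(values)
    rank = p / 100 * (len(s) - 1)  # fractional position in the sorted list
    lo = int(rank)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (rank - lo) * (s[hi] - s[lo])


day0 = list(range(24))  # the hourly values of the first day
for p in (10, 50, 90):
    print(p, round(percentile(day0, p), 6))
```

This prints 2.3, 11.5 and 20.7, the first entries of the three daily series above.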
Utility functions
Here is a small collection of helpful functions when manipulating or extracting information from time series.
Inside
This function creates a new time series whose values indicate whether the source value is inside or outside a defined range. We can set the minimum and maximum value of the range, the value to use where the source is NaN, and the values to set where the source is inside or outside the range.
By default NaN stays NaN, inside the range gives 1 and outside the range gives 0.
t0 = time('2021-01-01T00:00:00Z')
ta = TimeAxis(start=t0, delta_t=HOUR, n=10)
values = [i*10 for i in range(ta.size())]
ts = TimeSeries(ta=ta, values=values, point_fx=linear)
show_values('Smaller than 50', ts.inside(min_v=float('nan'), max_v=50))
#Smaller than 50:
# [1. 1. 1. 1. 1. 0. 0. 0. 0. 0.]
show_values('Larger than 50', ts.inside(min_v=50, max_v=float('nan')))
#Larger than 50:
# [0. 0. 0. 0. 0. 1. 1. 1. 1. 1.]
To have no upper or lower limit we set min_v or max_v to NaN.
show_values('Between 25 and 65', ts.inside(min_v=25, max_v=65, inside_v=10, outside_v=20))
#Between 25 and 65:
# [20. 20. 20. 10. 10. 10. 10. 20. 20. 20.]
Here we check if values are inside the range 25-65 and map inside values to 10 and outside values to 20.
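The rule behind these three results can be written out in plain Python. This sketch assumes a half-open range, min_v <= value < max_v, which is inferred from the outputs above (50 is not counted as smaller than 50, but is counted as larger than 50). The function is a hypothetical stand-in, and the parameter name nan_v is an assumption.

```python
import math


def inside(values, min_v, max_v, nan_v=math.nan, inside_v=1.0, outside_v=0.0):
    """Map each value to inside_v or outside_v; NaN limits mean no limit."""
    out = []
    for v in values:
        if math.isnan(v):
            out.append(nan_v)  # NaN stays NaN unless nan_v is given
        elif (math.isnan(min_v) or v >= min_v) and (math.isnan(max_v) or v < max_v):
            out.append(inside_v)
        else:
            out.append(outside_v)
    return out


values = [float(i * 10) for i in range(10)]
print(inside(values, math.nan, 50))
print(inside(values, 50, math.nan))
print(inside(values, 25, 65, inside_v=10.0, outside_v=20.0))
```

The three printed lists match the three show_values outputs above.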
Max/min
These functions return a new time series where each value is the maximum/minimum of the input number and the corresponding value in the time series.
show_values('Max of 40', ts.max(number=40))
#Max of 40:
# [40. 40. 40. 40. 40. 50. 60. 70. 80. 90.]
show_values('Min of 40', ts.min(number=40))
#Min of 40:
# [ 0. 10. 20. 30. 40. 40. 40. 40. 40. 40.]
Time shift
This function shifts the values forward or backward in time by a given dt. It moves them forward for a positive time step and backward for a negative time step.
ts_hour_shift = ts.time_shift(HOUR)
show_values('Shifted time series', ts_hour_shift)
#Shifted time series:
# [ 0. 10. 20. 30. 40. 50. 60. 70. 80. 90.]
print(ts.time_axis)
#TimeAxis('2021-01-01T00:00:00Z', 3600s, 10)
print(ts_hour_shift.time_axis)
#TimeAxis('2021-01-01T01:00:00Z', 3600s, 10)
As we can see here, the values stay the same, but the time axis has been shifted one hour forward.