shyft.time_series
This package contains the following classes and functions to use by end-users. The namespace itself contains more classes and functions, but these are used internally.
Note
Vector types: Because the actual code is written in C++, which is strongly typed, Shyft Python code uses the concept of “vector” classes, which are basically lists of the named type, e.g. TsBindInfo and TsBindInfoVector. These are not specifically documented here.
However, some vector types like TsVector are documented, because they provide more functionality than a simple list.
Time
Elements in this category deal with date/time.
Function utctime_now
- shyft.time_series.utctime_now() time :
returns time now as seconds since epoch (1970-01-01 UTC)
Function deltahours
- shyft.time_series.deltahours((object)n) time :
returns a time equal to the specified n hours
Function deltaminutes
- shyft.time_series.deltaminutes((object)n) time :
returns a time equal to the specified n minutes
Class time
- class shyft.time_series.time
Bases: instance
time is represented as a number, in SI-unit seconds.
For accuracy and performance, it is internally represented as a 64-bit integer at micro-second resolution. It is usually used in two roles:
- time measured in seconds since epoch (1970-01-01 UTC)
often constructed using the Calendar class that takes calendar-coordinates (YMDhms etc) and returns the corresponding time-point, taking time-zone and dst etc. into account.
>>> utc = Calendar()
>>> t1 = utc.time(2018,10,15,16,30,15)
>>> t2 = time('2018-10-15T16:30:15Z')
>>> t3 = time(1539621015)
- time measure, in unit of seconds (resolution up to 1us)
often constructed from numbers, always use SI-unit of seconds
>>> dt1 = time(3600)      # 3600 seconds
>>> dt2 = time(0.123456)  # 0.123456 seconds
It can be constructed by supplying a number of seconds, or a well-defined ISO 8601 string.
To convert it to a Python number, use float() or int() cast operations. When dealing with time-zone/calendar conversion, use the Calendar.time(..)/.calendar_units functions. If you want time-zone/calendar semantic add/diff/trim, use the corresponding Calendar methods.
See also
Calendar, deltahours, deltaminutes
- __init__((object)arg1) object :
construct a time of 0 s
- __init__( (object)arg1, (object)seconds) -> object :
construct a time from a precise time-like item, e.g. time, float, int or an ISO 8601 string
- Args:
seconds (): A time-like item: time, float, int or an ISO 8601 YYYY-MM-DDThh:mm:ss[.xxxxxx]Z string
- epoch = time(0)
- max = time.max
- min = time.min
- property seconds
returns time in seconds
- Type:
float
- sqrt()
object sqrt(tuple args, dict kwds)
- undefined = time.undefined
Class YMDhms
- class shyft.time_series.YMDhms
Bases: instance
Defines calendar coordinates as Year, Month, Day, hour, minute, second and micro_second. The intended usage is ONLY as a result from Calendar.calendar_units(t), to ensure a type-safe return of these entities for a given time.
Please use this as a read-only return type from Calendar.calendar_units(t).
- __init__((YMDhms)arg1) None
- __init__( (YMDhms)arg1, (object)Y [, (object)M [, (object)D [, (object)h [, (object)m [, (object)s [, (object)us]]]]]]) -> None :
Creates calendar coordinates specifying Y,M,D,h,m,s,us
- property day
int:
- property hour
int:
- is_null((YMDhms)arg1) bool :
returns true if all values are 0 (the null definition)
- is_valid((YMDhms)arg1) bool :
returns true if YMDhms values are reasonable
- static max() YMDhms :
returns the maximum representation
- property micro_second
int:
- static min() YMDhms :
returns the minimum representation
- property minute
int:
- property month
int:
- property second
int:
- property year
int:
Class YWdhms
- class shyft.time_series.YWdhms
Bases: instance
Defines calendar coordinates as iso Year, Week, week-day, hour, minute, second and micro_second. The intended usage is ONLY as a result from Calendar.calendar_week_units(t), to ensure a type-safe return of these entities for a given time.
Notes
Please use this as a read-only return type from the Calendar.calendar_week_units(t)
- __init__((YWdhms)arg1) None
- __init__( (YWdhms)arg1, (object)Y [, (object)W [, (object)wd [, (object)h [, (object)m [, (object)s [, (object)us]]]]]]) -> None :
Creates calendar coordinates specifying iso Y,W,wd,h,m,s,us
- Args:
Y (int): iso-year
W (int): iso week [1..53]
wd (int): week_day [1..7]=[mo..sun]
h (int): hour [0..23]
m (int): minute [0..59]
s (int): second [0..59]
us (int): micro_second [0..999999]
- property hour
int:
- is_null((YWdhms)arg1) bool :
returns true if all values are 0 (the null definition)
- is_valid((YWdhms)arg1) bool :
returns true if YWdhms values are reasonable
- property iso_week
int:
- property iso_year
int:
- static max() YWdhms :
returns the maximum representation
- property micro_second
int:
- static min() YWdhms :
returns the minimum representation
- property minute
int:
- property second
int:
- property week_day
week_day,[1..7]=[mo..sun]
- Type:
int
Class TzInfo
- class shyft.time_series.TzInfo
Bases: instance
The TzInfo class is responsible for providing information about the time-zone of the calendar. This includes:
name (olson identifier)
base_offset
utc_offset(t) time-dependent
The Calendar class provides a shared pointer to its TzInfo object
- __init__((TzInfo)arg1, (time)base_tz) None :
creates a TzInfo with a fixed utc-offset(no dst-rules)
- base_offset((TzInfo)arg1) time :
returns the time-invariant part of the utc-offset
- is_dst((TzInfo)arg1, (time)t) bool :
returns true if DST is observed at given utc-time t
- name((TzInfo)arg1) str :
returns the olson time-zone identifier or name for the TzInfo
- utc_offset((TzInfo)arg1, (time)t) time :
returns the utc_offset at specified utc-time, takes DST into account if applicable
Class Calendar
- class shyft.time_series.Calendar
Bases: instance
Calendar deals with the concept of a human calendar.
In Shyft we practice the ‘utc-time perimeter’ principle:
the core is utc-time exclusively
we deal with time-zones and calendars at the interfaces/perimeters
In Python, this corresponds to a 64-bit timestamp, i.e. the integer version of the time package representation; e.g. the difference time.time() - utctime_now() is in split-seconds.
Calendar functionality:
Conversion between the calendar coordinates YMDhms or iso week YWdhms and utctime, taking any timezone and DST into account
Calendar constants, utctimespan like values for Year,Month,Week,Day,Hour,Minute,Second
Calendar arithmetic, like adding calendar units, e.g. day,month,year etc.
Calendar arithmetic, like trim/truncate a utctime down to nearest timespan/calendar unit. eg. day
Calendar arithmetic, like calculate difference in calendar units(e.g days) between two utctime points
Calendar Timezone and DST handling
Converting time to string and vice-versa
Notes
Please notice that although the calendar concept is complete:
We only implement features as needed in the core and interfaces
Currently this includes most options, including olson time-zone handling
The time-zone support is currently a snapshot of rules ~2023,
but we plan to use standard packages like Howard Hinnant’s online approach for this later.
Working range for DST is 1905..2105 (dst is considered 0 outside)
Working range/validity of the calendar computations is limited to gregorian, as of boost::date.
Working range avoiding overflows is -4000..+4000 years
- DAY = time(86400)
- HOUR = time(3600)
- MINUTE = time(60)
- MONTH = time(2592000)
- QUARTER = time(7776000)
- RANGE = [-9999-01-01T00:00:00Z,9999-12-31T23:59:59Z>
- SECOND = time(1)
- TZ_RANGE = [1905-01-01T00:00:00Z,2105-01-01T00:00:00Z>
- WEEK = time(604800)
- YEAR = time(31536000)
- __init__((Calendar)arg1) None
- __init__( (Calendar)arg1, (time)tz_offset) -> None :
creates a calendar with constant tz-offset
- Args:
tz_offset (time): specifies utc offset, time(3600) gives UTC+01 zone
- __init__( (Calendar)arg1, (object)tz_offset) -> None :
creates a calendar with constant tz-offset
- Args:
tz_offset (int): seconds utc offset, 3600 gives UTC+01 zone
- __init__( (Calendar)arg1, (object)olson_tz_id) -> None :
create a Calendar from Olson timezone id, eg. ‘Europe/Oslo’
- Args:
olson_tz_id (str): Olson time-zone id, e.g. ‘Europe/Oslo’
- add()
- object add(tuple args, dict kwds) :
This function does a calendar semantic add.
Conceptually this is similar to t + deltaT*n, but with deltaT equal to Calendar.DAY, WEEK, MONTH or YEAR, and/or with a dst-enabled time-zone, the variation of length due to dst as well as month/year length is taken into account. E.g. adding one day, when the calendar has dst, could give 23, 24 or 25 hours due to dst. Similarly for week or any other time step.
- Args:
t (time): timestamp utc seconds since epoch
delta_t (time): timestep in seconds, with semantic interpretation of DAY,WEEK,MONTH,YEAR
n (int): number of timesteps to add
- Returns:
time: t. new timestamp with the added time-steps, seconds utc since epoch
- Notes:
ref. to related functions .diff_units(…) and .trim(..)
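The dst-aware day addition described above can be sketched with Python’s stdlib zoneinfo. This is a hypothetical illustration of the semantics only, not shyft’s implementation; calendar_add_days and the Europe/Oslo choice are assumptions:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def calendar_add_days(t: float, n: int, tz: str = "Europe/Oslo") -> float:
    """Sketch of Calendar.add(t, Calendar.DAY, n): add n calendar days,
    keeping the local wall-clock time across dst transitions."""
    zone = ZoneInfo(tz)
    local = datetime.fromtimestamp(t, tz=zone)
    # shift the wall-clock date, then re-resolve the utc offset
    shifted = (local.replace(tzinfo=None) + timedelta(days=n)).replace(tzinfo=zone)
    return shifted.timestamp()

# 2023-03-25 12:00 local is the day before the spring dst transition in Europe/Oslo
t0 = datetime(2023, 3, 25, 12, 0, tzinfo=ZoneInfo("Europe/Oslo")).timestamp()
t1 = calendar_add_days(t0, 1)
# one calendar day here is only 23 hours of physical time
assert t1 - t0 == 23 * 3600
```

Note that shyft’s Calendar handles the corner cases (ambiguous/non-existent wall-clock times, other units like WEEK/MONTH/YEAR) that this sketch glosses over.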
- calendar_units()
- object calendar_units(tuple args, dict kwds) :
returns YMDhms (.year,.month,.day,.hour,.minute..) for specified t, in the time-zone as given by the calendar
- Args:
t (time): timestamp utc seconds since epoch
- Returns:
YMDhms: calendar_units. calendar units as in year-month-day hour-minute-second
- calendar_week_units()
- object calendar_week_units(tuple args, dict kwds) :
returns iso YWdhms, with properties (.iso_year,.iso_week,.week_day,.hour,..) for specified t, in the time-zone as given by the calendar
- Args:
t (time): timestamp utc seconds since epoch
- Returns:
YWdms: calendar_week_units. calendar units as in iso year-week-week_day hour-minute-second
- day_of_year((Calendar)self, (time)t) int :
returns the day of year for the specified time
- Parameters:
t (time) – time to use for computation
- Returns:
day_of_year. in range 1..366
- Return type:
int
- diff_units()
- object diff_units(tuple args, dict kwds) :
calculate the distance t1..t2 in the specified units, taking dst into account if observed. The function takes calendar semantics when delta_t is Calendar.DAY, WEEK, MONTH or YEAR, and in addition also dst if observed. E.g. the diff_units of Calendar.DAY over the summer->winter shift is 1, even if the number of hours during those days is 23 and 25 for the summer and winter transitions respectively. It computes the calendar semantics of (t2-t1)/deltaT, where deltaT can be the calendar units DAY, WEEK, MONTH or YEAR.
- Args:
t1 (time): timestamp utc seconds since epoch
t2 (time): timestamp utc seconds since epoch
delta_t (time): timestep in seconds, with semantic interpretation of DAY,WEEK,MONTH,YEAR
trim_policy (trim_policy): Default TRIM_IN, could be TRIM_OUT or TRIM_ROUND.
- Returns:
int: n_units. number of units, so that t2 = c.add(t1,delta_t,n) + remainder(discarded). Depending on the trim_policy, the remainder results will add/subtract one unit to the result.
- Notes:
ref. to related functions .add(…) and .trim(…)
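A simplified sketch of the calendar-day diff semantics, using stdlib zoneinfo. It counts local calendar-date boundaries crossed, which reproduces the dst example above but is not the exact TRIM_IN behaviour of shyft; diff_calendar_days is a hypothetical helper:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def diff_calendar_days(t1: float, t2: float, tz: str = "Europe/Oslo") -> int:
    """Sketch of Calendar.diff_units(t1, t2, Calendar.DAY): count calendar
    days between two epoch times in the calendar's zone, dst-aware."""
    zone = ZoneInfo(tz)
    d1 = datetime.fromtimestamp(t1, tz=zone)
    d2 = datetime.fromtimestamp(t2, tz=zone)
    return (d2.date() - d1.date()).days

zone = ZoneInfo("Europe/Oslo")
t1 = datetime(2023, 3, 25, 12, 0, tzinfo=zone).timestamp()
t2 = datetime(2023, 3, 26, 12, 0, tzinfo=zone).timestamp()
assert (t2 - t1) == 23 * 3600           # only 23 physical hours...
assert diff_calendar_days(t1, t2) == 1  # ...but one calendar day
```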
- quarter((Calendar)arg1, (time)t) int :
returns the quarter of the specified t, -1 if invalid t
- Parameters:
t (int) – timestamp utc seconds since epoch
- Returns:
quarter. in range[1..4], -1 if invalid time
- Return type:
int
- static region_id_list() StringVector :
Returns a list over predefined Olson time-zone identifiers
Notes
the list is currently static and reflects tz-rules approx as of 2014
- time((Calendar)self, (YMDhms)YMDhms) time :
convert calendar coordinates into time using the calendar time-zone
- Args:
YMDhms (YMDhms): calendar coordinate structure containing year, month, day, hour, minute, second
- Returns:
int: timestamp. timestamp as in seconds utc since epoch
- time( (Calendar)self, (YWdhms)YWdhms) -> time :
convert calendar iso week coordinates structure into time using the calendar time-zone
- Args:
YWdhms (YWdhms): structure containing iso specification calendar coordinates
- Returns:
int: timestamp. timestamp as in seconds utc since epoch
- time( (Calendar)self, (object)Y [, (object)M=1 [, (object)D=1 [, (object)h=0 [, (object)m=0 [, (object)s=0 [, (object)us=0]]]]]]) -> time :
convert calendar coordinates into time using the calendar time-zone
- Args:
Y (int): Year
M (int): Month [1..12], default=1
D (int): Day [1..31], default=1
h (int): hour [0..23], default=0
m (int): minute [0..59], default=0
s (int): second [0..59], default=0
us (int): micro second[0..999999], default=0
- Returns:
time: timestamp. timestamp as in seconds utc since epoch
- time_from_week((Calendar)self, (object)Y[, (object)W=1[, (object)wd=1[, (object)h=0[, (object)m=0[, (object)s=0[, (object)us=0]]]]]]) time :
convert calendar iso week coordinates into time using the calendar time-zone
- Parameters:
Y (int) – ISO Year
W (int) – ISO Week [1..54], default=1
wd (int) – week_day [1..7]=[mo..su], default=1
h (int) – hour [0..23], default=0
m (int) – minute [0..59], default=0
s (int) – second [0..59], default=0
us (int) – micro second[0..999999], default=0
- Returns:
timestamp. timestamp as in seconds utc since epoch
- Return type:
time
- to_string()
- object to_string(tuple args, dict kwds) :
- convert time t to a readable iso standard string, taking the current calendar properties, including timezone, into account
- Args:
utctime (time): seconds utc since epoch
- Returns:
str: iso time string. iso standard formatted string,including tz info
- to_string( (Calendar)self, (UtcPeriod)utcperiod) -> str :
convert utcperiod p to readable string taking current calendar properties, including timezone into account
- Args:
utcperiod (UtcPeriod): An UtcPeriod object
- Returns:
str: period-string. [start..end>, iso standard formatted string,including tz info
- trim()
- object trim(tuple args, dict kwds) :
Round time t down to the nearest calendar time-unit delta_t, taking the calendar time-zone and dst into account.
- Args:
t (time): timestamp utc seconds since epoch
delta_t (time): timestep in seconds, with semantic interpretation of Calendar.{DAY|WEEK|MONTH|YEAR}
- Returns:
time: t. new trimmed timestamp, seconds utc since epoch
- Notes:
ref to related functions .add(t,delta_t,n),.diff_units(t1,t2,delta_t)
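The day-trim behaviour can be sketched with stdlib datetime/zoneinfo. trim_to_day is a hypothetical helper, not shyft’s implementation, which also supports WEEK/MONTH/YEAR and other units:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def trim_to_day(t: float, tz: str = "Europe/Oslo") -> float:
    """Sketch of Calendar.trim(t, Calendar.DAY): round t down to the
    start of the local calendar day, time-zone and dst aware."""
    zone = ZoneInfo(tz)
    local = datetime.fromtimestamp(t, tz=zone)
    start = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return start.timestamp()

zone = ZoneInfo("Europe/Oslo")
t = datetime(2023, 6, 15, 13, 45, 30, tzinfo=zone).timestamp()
midnight = datetime(2023, 6, 15, 0, 0, tzinfo=zone).timestamp()
assert trim_to_day(t) == midnight
```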
Class UtcPeriod
- class shyft.time_series.UtcPeriod
Bases: instance
UtcPeriod defines the half-open utc-time range [start..end>, where end is required to be greater than or equal to start
- __init__((UtcPeriod)arg1) None
- __init__( (UtcPeriod)arg1, (time)start, (time)end) -> None :
Create utcperiod given start and end
- contains((UtcPeriod)self, (time)t) bool :
returns true if time t is contained in this utcperiod
- contains( (UtcPeriod)self, (object)t) -> bool :
returns true if time t is contained in this utcperiod
- contains( (UtcPeriod)self, (UtcPeriod)p) -> bool :
returns true if utcperiod p is contained in this utcperiod
- diff_units((UtcPeriod)self, (Calendar)calendar, (time)delta_t) int :
- Calculate the distance from start to end of the UtcPeriod in the specified units, taking dst into account if observed.
The function takes calendar semantics when delta_t is Calendar.DAY, WEEK, MONTH or YEAR, and in addition also dst if observed. E.g. the diff_units of Calendar.DAY over the summer->winter shift is 1, even if the number of hours during those days is 23 and 25 for the summer and winter transitions respectively.
- Args:
calendar (calendar): shyft calendar
delta_t (time): timestep in seconds, with semantic interpretation of DAY,WEEK,MONTH,YEAR
- Returns:
int: n_units. number of units in UtcPeriod
- diff_units( (UtcPeriod)self, (Calendar)calendar, (object)delta_t) -> int :
Calculate the distance from start to end of the UtcPeriod in the specified units, taking dst into account if observed. The function takes calendar semantics when delta_t is Calendar.DAY, WEEK, MONTH or YEAR, and in addition also dst if observed. E.g. the diff_units of Calendar.DAY over the summer->winter shift is 1, even if the number of hours during those days is 23 and 25 for the summer and winter transitions respectively.
- Args:
calendar (calendar): shyft calendar
delta_t (int): timestep in seconds, with semantic interpretation of DAY,WEEK,MONTH,YEAR
- Returns:
int: n_units. number of units in UtcPeriod
- intersection((UtcPeriod)a, (UtcPeriod)b) UtcPeriod :
Returns the intersection of two periods. If there is an intersection, the resulting period will be .valid() and have .timespan()>0. If there is no intersection, an empty, not .valid(), period is returned.
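The half-open [start..end> semantics of contains/intersection can be sketched in plain Python. Period is a hypothetical stand-in for UtcPeriod using integer seconds, not the shyft class:

```python
from dataclasses import dataclass

@dataclass
class Period:
    """Minimal sketch of UtcPeriod semantics: half-open range [start..end>."""
    start: int
    end: int

    def valid(self) -> bool:
        return self.start <= self.end

    def timespan(self) -> int:
        return self.end - self.start

    def contains(self, t: int) -> bool:
        return self.start <= t < self.end

    def intersection(self, other: "Period") -> "Period":
        s, e = max(self.start, other.start), min(self.end, other.end)
        return Period(s, e) if s < e else Period(0, -1)  # invalid if no overlap

a, b = Period(0, 10), Period(5, 20)
assert a.intersection(b) == Period(5, 10)
assert a.intersection(b).timespan() == 5
assert not Period(0, 5).intersection(Period(10, 20)).valid()
```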
- overlaps((UtcPeriod)self, (UtcPeriod)p) bool :
returns true if period p overlaps this utcperiod
- timespan((UtcPeriod)arg1) time :
returns end-start, the timespan of the period
- to_string((UtcPeriod)arg1) str :
A readable representation in UTC
- trim((UtcPeriod)self, (Calendar)calendar, (time)delta_t[, (trim_policy)trim_policy=shyft.time_series._time_series.trim_policy.TRIM_IN]) UtcPeriod :
- Round the UtcPeriod up or down to the nearest calendar time-unit delta_t, taking the calendar time-zone and dst into account
- Args:
calendar (calendar): shyft calendar
delta_t (time): timestep in seconds, with semantic interpretation of Calendar.(DAY,WEEK,MONTH,YEAR)
trim_policy (trim_policy): TRIM_IN if rounding period inwards, else rounding outwards
- Returns:
UtcPeriod: trimmed_UtcPeriod. new trimmed UtcPeriod
- trim( (UtcPeriod)self, (Calendar)calendar, (object)delta_t [, (trim_policy)trim_policy=shyft.time_series._time_series.trim_policy.TRIM_IN]) -> UtcPeriod :
Round UtcPeriod up or down to the nearest calendar time-unit delta_t taking the calendar time-zone and dst into account
- Args:
calendar (calendar): shyft calendar
delta_t (int): timestep in seconds, with semantic interpretation of Calendar.(DAY,WEEK,MONTH,YEAR)
trim_policy (trim_policy): TRIM_IN if rounding period inwards, else rounding outwards
- Returns:
UtcPeriod: trimmed_UtcPeriod. new trimmed UtcPeriod
- valid((UtcPeriod)arg1) bool :
returns true if start<=end otherwise false
Class UtcTimeVector
- class shyft.time_series.UtcTimeVector
Bases: instance
- __init__((object)arg1) object :
construct an empty UtcTimeVector
- __init__( (object)arg1, (UtcTimeVector)clone_me) -> object :
construct a copy of the supplied UtcTimeVector
- Args:
clone_me (UtcTimeVector): to be cloned
- __init__( (object)arg1, (IntVector)seconds_vector) -> object :
construct from a vector of seconds since epoch utc, as integers
- Args:
seconds (IntVector): seconds
- __init__( (object)arg1, (DoubleVector)seconds_vector) -> object :
construct from a vector of seconds since epoch utc, as floats
- Args:
seconds (DoubleVector): seconds, up to us resolution epoch utc
- __init__( (object)arg1, (list)times) -> object :
construct from a list of items convertible to time
- Args:
times (list): a list with convertible times
- __init__( (object)arg1, (object)np_times) -> object :
construct from a numpy array of int64 seconds since epoch
- Args:
np_times (list): a list with convertible times
- __init__( (object)arg1, (object)np_times) -> object :
construct from a numpy array of float seconds since epoch
- Args:
np_times (list): a list with float convertible times
- append((UtcTimeVector)arg1, (object)arg2) None
- extend((UtcTimeVector)arg1, (object)arg2) None
- static from_numpy((object)arg1) UtcTimeVector
- push_back()
- object push_back(tuple args, dict kwds) :
appends a utctime like value to the vector
- Args:
t (utctime): an int (seconds), or utctime
- size()
- to_numpy((UtcTimeVector)self) object :
convert to numpy array of type np.int64, seconds since epoch
- to_numpy_double((UtcTimeVector)self) object :
convert to numpy array of type np.float64, seconds since epoch
Time series
Elements in this category are the actual time series.
Class TimeAxis
- class shyft.time_series.TimeAxis
Bases: instance
A time-axis is a set of ordered non-overlapping periods, and TimeAxis provides the most generic implementation of this. The internal representation is selected based on the parameters provided to the constructor, and is one of TimeAxisFixedDeltaT, TimeAxisCalendarDeltaT or TimeAxisByPoints. The internal representation type and corresponding realizations are available as properties.
Notes
The internal representation can be one of TimeAxisCalendarDeltaT,TimeAxisFixedDeltaT,TimeAxisByPoints
- __call__((TimeAxis)self, (int)i) UtcPeriod :
Returns the i-th period of the time-axis
- Parameters:
i (int) – index to lookup
- Returns:
period. The period for the supplied index
- Return type:
- __init__((TimeAxis)arg1) None
- __init__( (TimeAxis)arg1, (time)start, (time)delta_t, (object)n) -> None :
creates a time-axis with n intervals, fixed delta_t, starting at start
- Args:
start (utctime): utc-time 1970 utc based
delta_t (utctime): number of seconds delta-t, length of periods in the time-axis
n (int): number of periods in the time-axis
- __init__( (TimeAxis)arg1, (Calendar)calendar, (time)start, (time)delta_t, (object)n) -> None :
creates a calendar time-axis
- Args:
calendar (Calendar): specifies the calendar to be used, keeps the time-zone and dst-arithmetic rules
start (utctime): utc-time 1970 utc based
delta_t (utctime): number of seconds delta-t, length of periods in the time-axis
n (int): number of periods in the time-axis
- __init__( (TimeAxis)arg1, (UtcTimeVector)time_points, (time)t_end) -> None :
creates a time-axis by specifying the time_points and t-end of the last interval
- Args:
time_points (UtcTimeVector): ordered set of unique utc-time points, the start of each consecutive period
t_end (time): the end of the last period in time-axis, utc-time 1970 utc based, must be > time_points[-1]
- __init__( (TimeAxis)arg1, (UtcTimeVector)time_points) -> None :
create a time-axis supplying n+1 points to define n intervals
- Args:
time_points (UtcTimeVector): ordered set of unique utc-time points, 0..n-2:the start of each consecutive period,n-1: end of last period
- __init__( (TimeAxis)arg1, (TimeAxisCalendarDeltaT)calendar_dt) -> None :
create a time-axis from a calendar time-axis
- Args:
calendar_dt (TimeAxisCalendarDeltaT): existing calendar time-axis
- __init__( (TimeAxis)arg1, (TimeAxisFixedDeltaT)fixed_dt) -> None :
create a time-axis from a fixed delta-t time-axis
- Args:
fixed_dt (TimeAxisFixedDeltaT): existing fixed delta-t time-axis
- __init__( (TimeAxis)arg1, (TimeAxisByPoints)point_dt) -> None :
create a time-axis from a by-points time-axis
- Args:
point_dt (TimeAxisByPoints): existing by points time-axis
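For the fixed delta-t representation created by the first constructor, period lookup and index_of reduce to simple arithmetic. A hypothetical pure-Python sketch, not the shyft implementation:

```python
def fixed_dt_period(start: int, dt: int, n: int, i: int):
    """Sketch of TimeAxis(start, dt, n).period(i): the i-th half-open
    interval [start + i*dt .. start + (i+1)*dt>."""
    if not 0 <= i < n:
        raise IndexError(i)
    return (start + i * dt, start + (i + 1) * dt)

def fixed_dt_index_of(start: int, dt: int, n: int, t: int):
    """Sketch of index_of(t): the interval containing t, or None outside."""
    if t < start or t >= start + n * dt:
        return None
    return (t - start) // dt

# a 24-interval hourly axis starting at epoch 0
assert fixed_dt_period(0, 3600, 24, 2) == (7200, 10800)
assert fixed_dt_index_of(0, 3600, 24, 7200) == 2
assert fixed_dt_index_of(0, 3600, 24, 86400) is None  # past the last interval
```

The calendar and by-points representations need a calendar lookup or a binary search instead, which is why the generic TimeAxis keeps the representation type as a property.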
- property calendar_dt
The calendar dt representation(if active)
- Type:
TimeAxisCalendarDeltaT
- empty((TimeAxis)self) bool :
true if empty time-axis
- Returns:
empty. true if empty time-axis
- Return type:
bool
- property fixed_dt
The fixed dt representation (if active)
- Type:
TimeAxisFixedDeltaT
- index_of((TimeAxis)self, (time)t[, (int)ix_hint=18446744073709551615]) int :
- Parameters:
t (int) – utctime in seconds 1970.01.01
ix_hint (int) – index-hint to make search in point-time-axis faster
- Returns:
index. the index of the time-axis period that contains t, npos if outside range
- Return type:
int
- merge((TimeAxis)self, (TimeAxis)other) TimeAxis :
Returns a new time-axis that contains the union of time-points/periods of the two time-axes. If there is a gap between them, it is filled. A merge with an empty time-axis results in the other time-axis.
- open_range_index_of((TimeAxis)self, (time)t[, (int)ix_hint=18446744073709551615]) int :
returns the index that contains t, or is before t
- Parameters:
t (int) – utctime in seconds 1970.01.01
ix_hint (int) – index-hint to make search in point-time-axis faster
- Returns:
index. the index of the time-axis period that contains t, npos if before the first period, n-1 if t is after the last period
- Return type:
int
- period((TimeAxis)self, (int)i) UtcPeriod :
- Parameters:
i (int) – the i’th period, 0..n-1
- Returns:
period. the i’th period of the time-axis
- Return type:
- property point_dt
point_dt representation(if active)
- Type:
TimeAxisByPoints
- size((TimeAxis)arg1) int :
- Returns:
number of periods in time-axis
- Return type:
int
- slice((TimeAxis)self, (int)start, (int)n) TimeAxis :
returns slice of time-axis as a new time-axis
- Parameters:
start (int) – first interval to include
n (int) – number of intervals to include
- Returns:
time-axis. A new time-axis with the specified slice
- Return type:
- time((TimeAxis)self, (int)i) time :
- Parameters:
i (int) – the i’th period, 0..n-1
- Returns:
utctime. the start(utctime) of the i’th period of the time-axis
- Return type:
int
- property time_points
- extract all time-points from a TimeAxis, like [ time_axis.time(i) ].append(time_axis.total_period().end) if time_axis.size() else []
- Parameters:
time_axis (TimeAxis)
- Returns:
time_points – [ time_axis.time(i) ].append(time_axis.total_period().end)
- Return type:
numpy.array(dtype=np.int64)
- property time_points_double
extract all time-points from a TimeAxis with microseconds like [ time_axis.time(i) ].append(time_axis.total_period().end) if time_axis.size() else []
- Parameters:
time_axis (TimeAxis)
- Returns:
time_points – [ time_axis.time(i) ].append(time_axis.total_period().end)
- Return type:
numpy.array(dtype=np.float64)
- property timeaxis_type
describes what time-axis representation type this is,e.g (fixed|calendar|point)_dt
- Type:
TimeAxisType
Class TimeSeries
- class shyft.time_series.TimeSeries
Bases: instance
A time-series providing mathematical and statistical operations and functionality.
A time-series can be an expression, or a concrete point time-series. All time-series have a time-axis, values, and a point fx policy. The value f(t) outside the time-axis is nan. Operations between time-series, e.g. a+b, respect the mathematical rule that nan op something equals nan.
The time-series can provide a value for all the intervals, and the point_fx policy defines how the values should be interpreted:
POINT_INSTANT_VALUE(linear):
the point value is valid at the start of the period, linear between points, or a flat extended value if the next point is nan. Typical for state-variables, like water-level, or temperature measured at 12:00, etc.
POINT_AVERAGE_VALUE(stair-case):
the point represents an average or constant value over the period. Typical for model-input and results, e.g. precipitation in mm/h, discharge in m^3/s.
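A sketch of how the two policies affect f(t) for a concrete point series. This is a hypothetical pure-Python helper (f_of_t) illustrating the semantics; shyft evaluates this in C++:

```python
from bisect import bisect_right
import math

def f_of_t(t_points, values, t, policy):
    """Sketch of evaluating a point time-series at time t under the two
    point-interpretation policies; t_points are interval starts plus end."""
    if t < t_points[0] or t >= t_points[-1]:
        return math.nan  # outside the time-axis the value is nan
    i = bisect_right(t_points, t) - 1
    if policy == "POINT_AVERAGE_VALUE":        # stair-case: constant over interval
        return values[i]
    # POINT_INSTANT_VALUE: linear between points, flat on the last interval
    if i + 1 >= len(values):
        return values[i]
    w = (t - t_points[i]) / (t_points[i + 1] - t_points[i])
    return values[i] * (1 - w) + values[i + 1] * w

ts, vs = [0, 10, 20], [1.0, 3.0]
assert f_of_t(ts, vs, 5, "POINT_AVERAGE_VALUE") == 1.0   # stair-case value
assert f_of_t(ts, vs, 5, "POINT_INSTANT_VALUE") == 2.0   # linear interpolation
assert math.isnan(f_of_t(ts, vs, 25, "POINT_AVERAGE_VALUE"))
```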
Examples:
>>> import numpy as np
>>> from shyft.time_series import Calendar,deltahours,TimeAxis,TimeSeries,POINT_AVERAGE_VALUE as fx_avg
>>>
>>> utc = Calendar()  # ensure easy consistent explicit handling of calendar and time
>>> ta = TimeAxis(utc.time(2016, 9, 1, 8, 0, 0), deltahours(1), 10)  # create a time-axis to use
>>> a = TimeSeries(ta, np.linspace(0, 10, num=len(ta)), fx_avg)
>>> b = TimeSeries(ta, np.linspace(0, 1, num=len(ta)), fx_avg)
>>> c = a + b*3.0  # c is now an expression, time-axis is the overlap of a and b, lazy evaluation
>>> c_values = c.values.to_numpy()  # compute and extract the values, as numpy array
>>> c_evaluated = c.evaluate()  # computes the expression, returns a new concrete point-ts equal to the expression
>>>
>>> # Calculate data for new time-points
>>> value_1 = a(utc.time(2016, 9, 1, 8, 30))  # calculates value at a given time
>>> ta_target = TimeAxis(utc.time(2016, 9, 1, 7, 30), deltahours(1), 12)  # create a target time_axis
>>> ts_new = a.average(ta_target)  # new time-series with values on ta_target
TimeSeries can also be symbolic, that is, have urls that are resolved later, server-side, using the DtsServer.
The TimeSeries functionality includes:
construction:
TimeSeries(time-axis,values,point_interpretation), TimeSeries(ts_url), TimeSeries(ts_url,ts_fragment)
mutating points:
set()
fill()
scale_by()
merge_points()
combining/extending:
extend()
use_time_axis_from()
resampling:
average()
accumulate()
time_shift()
use_time_axis_from()
use_time_axis()
f(x):
transform()
abs()
derivative()
integral()
min()
max()
pow()
log()
boolean f(x):
inside()
, creates a mask (1.0/0.0 series) you can use for math/filtering expressions
statistics:
statistics()
kling_gupta()
nash_sutcliffe()
filtering:
convolve_w()
krls_interpolation()
quality and correction:
quality_and_self_correction()
quality_and_ts_correction()
, min-max limits, replace by surrounding points or a replacement ts
bit-encoded:
decode()
stacking and percentiles:
stack()
n-ary operations:
TsVector.sum()
TsVector.forecast_merge()
, operations on TsVector that result in a TimeSeries
hydrology domain:
rating_curve()
bucket_to_hourly()
ice_packing()
ice_packing_recession()
shyft.time_series.create_glacier_melt_ts_m3s()
Other useful classes to look at: TimeAxis, Calendar, TsVector, point_interpretation_policy
Please check the extensive test suite, notebooks, examples and time_series for usage.
- __call__((TimeSeries)self, (time)t) float :
return the f(t) value for the time-series
- __init__((TimeSeries)self) None :
constructs an empty time-series
- __init__( (TimeSeries)self, (TimeAxis)ta, (DoubleVector)values, (point_interpretation_policy)point_fx) -> None :
construct a timeseries time-axis ta, corresponding values and point interpretation policy point_fx
- __init__( (TimeSeries)self, (TimeAxis)ta, (object)fill_value, (point_interpretation_policy)point_fx) -> None :
construct a time-series with time-axis ta, specified fill-value, and point interpretation policy point_fx
- __init__( (TimeSeries)self, (TimeAxisFixedDeltaT)ta, (DoubleVector)values, (point_interpretation_policy)point_fx) -> None :
construct a timeseries timeaxis ta with corresponding values, and point interpretation policy point_fx
- __init__( (TimeSeries)self, (TimeAxisFixedDeltaT)ta, (object)fill_value, (point_interpretation_policy)point_fx) -> None :
construct a timeseries with fixed-delta-t time-axis ta, specified fill-value, and point interpretation policy point_fx
- __init__( (TimeSeries)self, (TimeAxisByPoints)ta, (DoubleVector)values, (point_interpretation_policy)point_fx) -> None :
construct a time-series with a point-type time-axis ta, corresponding values, and point-interpretation point_fx
- __init__( (TimeSeries)self, (TimeAxisByPoints)ta, (object)fill_value, (point_interpretation_policy)point_fx) -> None :
construct a time-series with a point-type time-axis ta, specified fill-value, and point-interpretation point_fx
- __init__( (TimeSeries)self, (TsFixed)core_result_ts) -> None :
construct a time-series from a shyft core time-series, to ease working with core-time-series in user-interface/scripting
- __init__( (TimeSeries)self, (TimeSeries)clone) -> None :
creates a shallow copy of the clone time-series
- __init__( (TimeSeries)self, (DoubleVector)pattern, (time)dt, (TimeAxis)ta) -> None :
construct a repeated-pattern time-series given an equally spaced dt pattern and a time-axis ta
- Args:
pattern (DoubleVector): a list of numbers giving the pattern
dt (int): number of seconds between each of the pattern-values
ta (TimeAxis): time-axis that forms the resulting time-series time-axis
- __init__( (TimeSeries)self, (DoubleVector)pattern, (time)dt, (time)t0, (TimeAxis)ta) -> None :
construct a time-series given an equally spaced dt pattern, starting at t0, and a time-axis ta
- __init__( (TimeSeries)self, (object)ts_id) -> None :
constructs a bind-able ts, providing a symbolic, possibly unique, id that at a later time can be bound to concrete values using the .bind(ts) method. If the ts is used as a ts (e.g. size(), .value(), time()) before it is bound, a runtime-exception is raised
- Args:
ts_id (str): url-like identifier for the time-series,notice that shyft://<container>/<path> is for shyft-internal store
- __init__( (TimeSeries)self, (object)ts_id, (TimeSeries)bts) -> None :
constructs a ready bound ts, providing a symbolic possibly unique id that at a later time can be used to correlate with back-end store
- Args:
ts_id (str): url-type of id, notice that shyft://<container>/<path> is for shyft-internal store
bts (TimeSeries): A time-series, that is either a concrete ts, or an expression that can be evaluated to form a concrete ts
- abs((TimeSeries)self) TimeSeries :
create a new ts, abs(self)
- Returns:
ts. a new time-series expression, that will provide the abs-values of self.values
- Return type:
- accumulate((TimeSeries)self, (TimeAxis)ta) TimeSeries :
create a new ts where each i-th value is the integral of f(t)dt from t0..ti,
given the specified time-axis ta, and the point interpretation.
- Parameters:
ta (TimeAxis) – time-axis that specifies the periods where accumulated integral is applied
- Returns:
ts. a new time-series expression, that will provide the accumulated values when requested
- Return type:
Notes: In contrast to integral(), accumulate has a point-instant interpretation. As values() gives the start value of each interval (see TimeSeries), accumulate(ta).values provides the accumulation over the intervals [t0..t0, t0..t1, t0..t2, …], thus values[0] is always 0.
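The accumulate semantics described above can be illustrated with a small plain-Python sketch (this is not the shyft API; it assumes a fixed interval length dt and a stair-case, POINT_AVERAGE_VALUE, series):

```python
# Conceptual sketch of accumulate(): each output value i is the integral of
# f(t)*dt from t0 up to t_i, so values[0] is always 0.
def accumulate(values, dt):
    out = [0.0]
    total = 0.0
    for v in values:
        total += v * dt          # stair-case: constant v over each interval
        out.append(total)
    return out[:-1]              # one accumulated value per time-axis point

acc = accumulate([1.0, 2.0, 3.0], dt=3600)
# acc == [0.0, 3600.0, 10800.0]
```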
- average((TimeSeries)self, (TimeAxis)ta) TimeSeries :
create a new ts that is the true average of self over the specified time-axis ta. Notice that the same definition as for integral applies; only the non-nan parts go into the average
- Parameters:
ta (TimeAxis) – time-axis that specifies the periods where true-average is applied
- Returns:
ts. a new time-series expression, that will provide the true-average when requested
- Return type:
Notes
the self point interpretation policy is used when calculating the true average
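The true-average rule, where only the non-nan parts of an interval contribute, can be sketched in plain Python (not the shyft API; it assumes a stair-case series and one target interval subdivided into equal steps of length dt):

```python
import math

# Conceptual sketch of average(): integrate the non-nan parts and divide by
# the covered (non-nan) duration only.
def true_average(values, dt):
    area = 0.0
    covered = 0.0
    for v in values:
        if not math.isnan(v):
            area += v * dt
            covered += dt
    return area / covered if covered > 0 else float('nan')

avg = true_average([2.0, float('nan'), 4.0], dt=3600)
# nan part excluded: (2*3600 + 4*3600) / (2*3600) == 3.0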
- bind((TimeSeries)self, (TimeSeries)bts) None :
given that this ts,self, is a bind-able ts (aref_ts) and that bts is a concrete point TimeSeries, or something that can be evaluated to one, use it as representation for the values of this ts. Other related functions are find_ts_bind_info,TimeSeries(‘a-ref-string’)
- Parameters:
bts (TimeSeries) – a concrete point ts, or ready-to-evaluate expression, with time-axis, values and fx_policy
Notes
raises runtime_error if any of preconditions is not true
- bind_done((TimeSeries)self[, (object)skip_check=False]) None :
after bind operations on unbound time-series of an expression is done, call bind_done() to prepare the expression for use Other related methods are .bind(), .find_ts_bind_info() and needs_bind().
- Parameters:
skip_check (bool) – If set true this function assumes all siblings are bound, as pr. standard usage pattern for the mentioned functions
Notes
Usually this is done automatically by the dtss framework, but if not using dtss
this function is needed after the symbolic ts’s are bound
- bucket_to_hourly((TimeSeries)self, (object)start_hour_utc, (object)bucket_emptying_limit) TimeSeries :
Precipitation bucket measurements have a lot of tweaks that needs to be resolved, including negative variations over the day due to faulty temperature-dependent volume/weight sensors attached.
A precipitation bucket accumulates precipitation, so the readings should be strictly increasing by time, until the bucket is emptied (full, or as part of maintenance).
The goal for the bucket_to_hourly algorithm is to provide hourly precipitation, based on some input signal that usually is hourly(averaging is used if not hourly).
The main strategy is to use 24 hour differences (typically at hours in a day where the temperature is low, like early in the morning.), to adjust the hourly volume.
Differences in periods of 24hour are distributed on all positive hourly evenets, the negative derivatives are zeroed out, so that the hourly result for each 24 hour is steady increasing, and equal to the difference of the 24hour area.
The derivative is then used to compute the hourly precipitation rate in mm/h
- Parameters:
start_hour_utc (int) – valid range [0..24], usually set to early morning(low-stable temperature)
bucket_emptying_limit (float) – a negative number, range[-oo..0>, limit of when to detect an emptying of a bucket in the unit of the measurements series
- Returns:
ts. a new hourly rate ts, that transforms the accumulated series, compensated for the described defects
- Return type:
- clone_expression((TimeSeries)self) TimeSeries :
create a copy of the ts-expressions, except for the bound payload of the reference ts. For the reference terminals, those with ts_id, only the ts_id is copied. Thus, to re-evaluate the expression, those have to be bound.
Notes
this function is only useful in context where multiple bind/rebind while keeping the expression is needed.
- compress((TimeSeries)self, (object)accuracy) TimeSeries :
Compress by reducing number of points sufficient to represent the same f(t) within accuracy. The returned ts is a new ts with break-point/variable interval representation. note: lazy binding expressions(server-side eval) is not yet supported.
- Parameters:
() (accuracy) – if v[i]-v[i+1] <accuracy the v[i+1] is dropped
- Returns:
compressed_ts. a new compressed within accuracy time-series
- Return type:
- compress_size((TimeSeries)self, (object)accuracy) int :
Compute number of points this time-series could be reduced to if calling ts.compress(accuracy). note: lazy binding expressions(server-side eval) is not yet supported.
- Parameters:
() (accuracy) – if v[i]-v[i+1] <accuracy the v[i+1] is dropped
- Returns:
compressed_size. number of distinct point needed to represent the time-series
- Return type:
int
- convolve_w((TimeSeries)self, (DoubleVector)weights, (convolve_policy)policy) TimeSeries :
create a new ts that is the convolved ts with the given weights list
- Parameters:
weights (DoubleVector) – the weights profile, use DoubleVector.from_numpy(…) to create these. It’s the callers responsibility to ensure the sum of weights are 1.0
policy (convolve_policy) – (USE_NEAREST|USE_ZERO|USE_NAN + BACKWARD|FORWARD|CENTER). Specifies how to handle boundary values
- Returns:
ts. a new time-series that is evaluated on request to the convolution of self
- Return type:
- decode((TimeSeries)self, (object)start_bit, (object)n_bits) TimeSeries :
Create an time-series that decodes the source using provided specification start_bit and n_bits. This function can typically be used to decode status-signals from sensors stored as binary encoded bits, using integer representation The floating point format allows up to 52 bits to be precisely stored as integer - thus there are restrictions to start_bit and n_bits accordingly. Practical sensors quality signals have like 32 bits of status information encoded If the value in source time-series is:
negative
nan
larger than 52 bits
Then nan is returned for those values
ts.decode(start_bit=1,n_bits=1) will return values [0,1,nan] similar: ts.decode(start_bit=1,n_bits=2) will return values [0,1,2,3,nan] etc..
- Parameters:
start_bit (int) – where in the n-bits integer the value is stored, range[0..51]
n_bits (int) – how many bits are encoded, range[0..51], but start_bit +n_bits < 51
- Returns:
decode_ts. Evaluated on demand decoded time-series
- Return type:
- derivative((TimeSeries)self[, (derivative_method)method=shyft.time_series._time_series.derivative_method.DEFAULT]) TimeSeries :
Compute the derivative of the ts, according to the method specified. For linear(POINT_INSTANT_VALUE), it is always the derivative of the straight line between points, - using nan for the interval starting at the last point until end of time-axis. Default for stair-case(POINT_AVERAGE_VALUE) is the average derivative over each time-step, - using 0 as rise for the first/last half of the intervals at the boundaries. here you can influence the method used, selecting .forward_diff, .backward_diff
- Parameters:
method (derivative_method) – default value gives center/average derivative .(DEFAULT|FORWARD|BACKWARD|CENTER)
- Returns:
derivative. The derivative ts
- Return type:
- static deserialize((ByteVector)blob) TimeSeries :
convert a blob, as returned by .serialize() into a Timeseries
- evaluate((TimeSeries)self) TimeSeries :
Forces evaluation of the expression, returns a new concrete time-series that is detached from the expression.
- Returns:
ts. the evaluated copy of the expression that self represents
- Return type:
- extend((TimeSeries)self, (TimeSeries)ts[, (extend_split_policy)split_policy=shyft.time_series._time_series.extend_split_policy.LHS_LAST[, (extend_fill_policy)fill_policy=shyft.time_series._time_series.extend_fill_policy.FILL_NAN[, (time)split_at=time(0)[, (object)fill_value=nan]]]]) TimeSeries :
create a new time-series that is self extended with ts
- Parameters:
ts (TimeSeries) – time-series to extend self with, only values after both the start of self, and split_at is used
split_policy (extend_split_policy) – policy determining where to split between self and ts
fill_policy (extend_fill_policy) – policy determining how to fill any gap between self and ts
split_at (utctime) – time at which to split if split_policy == EPS_VALUE
fill_value (float) – value to fill any gap with if fill_policy == EPF_FILL
- Returns:
extended_ts. a new time-series that is the extension of self with ts
- Return type:
- fill((TimeSeries)self, (object)v) None :
fill all values with v
- find_ts_bind_info((TimeSeries)self) TsBindInfoVector :
recursive search through the expression that this ts represents, and return a list of TsBindInfo that can be used to inspect and possibly ‘bind’ to ts-values. see also related function bind()
- Returns:
bind_info. A list of BindInfo where each entry contains a symbolic-ref and a ts that needs binding
- Return type:
TsBindInfoVector
- get((TimeSeries)self, (int)i) Point :
returns i’th point(t,v)
- get_krls_predictor((TimeSeries)self, (time)dt[, (object)gamma=0.001[, (object)tolerance=0.01[, (int)size=1000000]]]) KrlsRbfPredictor :
Get a KRLS predictor trained on this time-series.
If you only want a interpolation of self use krls_interpolation instead, this method return the underlying predictor instance that can be used to generate mean-squared error estimates, or can be further trained on more data.
Notes
A predictor can only be generated for a bound time-series.
- Parameters:
dt (float) – The time-step in seconds the underlying predictor is specified for. Note that this does not put a limit on time-axes used, but for best results it should be approximatly equal to the time-step of time-axes used with the predictor. In addition it should not be to long, else you will get poor results. Try to keep the dt less than a day, 3-8 hours is usually fine.
gamma (float (optional)) – Determines the width of the radial basis functions for the KRLS algorithm. Lower values mean wider basis functions, wider basis functions means faster computation but lower accuracy. Note that the tolerance parameter also affects speed and accurcy. A large value is around 1E-2, and a small value depends on the time step. By using values larger than 1E-2 the computation will probably take to long. Testing have reveled that 1E-3 works great for a time-step of 3 hours, while a gamma of 1E-2 takes a few minutes to compute. Use 1E-4 for a fast and tolerably accurate prediction. Defaults to 1E-3
tolerance (float (optional)) – The krls training tolerance. Lower values makes the prediction more accurate, but slower. This typically have less effect than gamma, but is usefull for tuning. Usually it should be either 0.01 or 0.001. Defaults to 0.01
size (int (optional)) – The size of the “memory” of the underlying predictor. The default value is usually enough. Defaults to 1000000.
Examples:
>>> import numpy as np >>> import scipy.stats as stat >>> from shyft.time_series import ( ... Calendar, utctime_now, deltahours, ... TimeAxis, TimeSeries ... ) >>> >>> cal = Calendar() >>> t0 = utctime_now() >>> dt = deltahours(1) >>> n = 365*24 # one year >>> >>> # generate random bell-shaped data >>> norm = stat.norm() >>> data = np.linspace(0, 20, n) >>> data = stat.norm(10).pdf(data) + norm.pdf(np.random.rand(*data.shape)) >>> # ----- >>> ta = TimeAxis(cal, t0, dt, n) >>> ts = TimeSeries(ta, data) >>> >>> # create a predictor >>> pred = ts.get_krls_predictor() >>> total_mse = pred.predictor_mse(ts) # compute mse relative to ts >>> krls_ts = pred.predict(ta) # generate a prediction, this is the result from ts.krls_interpolation >>> krls_mse_ts = pred.mse_ts(ts, points=6) # compute a mse time-series using 6 points around each sample
- Returns:
krls_predictor. A KRLS predictor pre-trained once on self.
- Return type:
- Other related methods are:
- get_time_axis((TimeSeries)self) TimeAxis :
TimeAxis: the time-axis
- ice_packing((TimeSeries)self, (IcePackingParameters)ip_params, (ice_packing_temperature_policy)ipt_policy) TimeSeries :
Create a binary time-series indicating whether ice-packing is occuring or not.
Note
self is interpreted and assumed to be a temperature time-series.
The ice packing detection is based on the mean temperature in a predetermined time window before the time-point of interrest (see IcePackingParameters.window. The algorithm determines there to be ice packing when the mean temperature is below a given threshold temperature (see IcePackingParameters.threshold_temp).
- Parameters:
ip_param (IcePackingParameters) – Parameter container controlling the ice packing detection.
ipt_policy (ice_packing_temperature_policy) – Policy flags for determining how to deal with missing temperature values.
- Returns:
ice_packing_ts. A time-series indicating wheter ice packing occurs or not
- Return type:
Example:
>>> import numpy as np >>> from shyft.time_series import ( ... IcePackingParameters, ice_packing_temperature_policy, ... TimeAxis, TimeSeries, point_interpretation_policy, DoubleVector, ... utctime_now, deltahours, deltaminutes, ... ) >>> >>> t0 = utctime_now() >>> dt = deltaminutes(15) >>> n = 100 >>> >>> # generate jittery data >>> # - first descending from +5 to -5 then ascending back to +5 >>> # - include a NaN hole at the bottom of the V >>> n_ = n if (n//2)*2 == n else n+1 # assure even >>> data = np.concatenate(( ... np.linspace(5, -5, n_//2), np.linspace(-5, 5, n_//2) ... )) + np.random.uniform(-0.75, 0.75, n_) # add uniform noise >>> data[n_//2 - 1:n_//2 + 2] = float('nan') # add some missing data >>> >>> # create Shyft data structures >>> ta = TimeAxis(t0, dt, n_) >>> temperature_ts = TimeSeries(ta, DoubleVector.from_numpy(data), ... point_interpretation_policy.POINT_AVERAGE_VALUE) >>> >>> # do the ice packing detection >>> ip_param = IcePackingParameters( ... threshold_window=deltahours(5), ... threshold_temperature=-1.0) >>> # try all the different temperature policies >>> ice_packing_ts_disallow = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.DISALLOW_MISSING) >>> ice_packing_ts_initial = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.ALLOW_INITIAL_MISSING) >>> ice_packing_ts_any = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.ALLOW_ANY_MISSING) >>> >>> # plotting >>> from matplotlib import pyplot as plt >>> from shyft.time_series import time_axis_extract_time_points >>> >>> # NOTE: The offsets below are added solely to be able to distinguish between the different time-axes >>> >>> plt.plot(time_axis_extract_time_points(ta)[:-1], temperature_ts.values, label='Temperature') >>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_ts_disallow.values.to_numpy() + 1, ... label='Ice packing? 
[DISALLOW_MISSING]') >>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_ts_initial.values.to_numpy() - 1, ... label='Ice packing? [ALLOW_INITIAL_MISSING]') >>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_ts_any.values.to_numpy() - 3, ... label='Ice packing? [ALLOW_ANY_MISSING]') >>> plt.legend() >>> plt.show()
- ice_packing_recession((TimeSeries)self, (TimeSeries)ip_ts, (IcePackingRecessionParameters)ipr_params) TimeSeries :
Create a new time series where segments are replaced by recession curves.
Note
The total period (TimeSeries.total_period) of self needs to be equal to, or contained in the total period of ip_ts.
- Parameters:
ip_ts (TimeSeries) – A binary time-series indicating if ice packing occurring. See TimeSeries.ice_packing.
ip_param (IcePackingParameters) – Parameter container controlling the ice packing recession curve.
- Returns:
ice_packing_recession_ts. A time-series where sections in self is replaced by recession curves as indicated by ip_ts.
- Return type:
Example:
>>> import numpy as np >>> from shyft.time_series import ( ... IcePackingParameters, IcePackingRecessionParameters, ice_packing_temperature_policy, ... TimeAxis, TimeSeries, point_interpretation_policy, DoubleVector, ... utctime_now, deltahours, deltaminutes, ... ) >>> >>> t0 = utctime_now() >>> dt = deltaminutes(15) >>> n = 100 >>> >>> # generate jittery temperature data >>> # - first descending from +5 to -5 then ascending back to +5 >>> # - include a NaN hole at the bottom of the V >>> n_ = n if (n//2)*2 == n else n+1 # assure even >>> temperature_data = np.concatenate(( ... np.linspace(5, -5, n_//2), np.linspace(-5, 5, n_//2) ... )) + np.random.uniform(-0.75, 0.75, n_) # add uniform noise >>> temperature_data[n_ // 2 - 1:n_ // 2 + 2] = float('nan') # add some missing data >>> >>> # create Shyft data structures for temperature >>> ta = TimeAxis(t0, dt, n_) >>> temperature_ts = TimeSeries(ta, DoubleVector.from_numpy(temperature_data), ... point_interpretation_policy.POINT_AVERAGE_VALUE) >>> >>> # generate jittery waterflow data >>> # - an upwards curving parabola >>> x0 = ta.total_period().start >>> x1 = ta.total_period().end >>> x = np.linspace(x0, x1, n_) >>> flow_data = -0.0000000015*(x - x0)*(x - x1) + 1 + np.random.uniform(-0.5, 0.5, n_) >>> del x0, x1, x >>> >>> # create Shyft data structures for temperature >>> flow_ts = TimeSeries(ta, DoubleVector.from_numpy(flow_data), ... point_interpretation_policy.POINT_AVERAGE_VALUE) >>> >>> # do the ice packing detection >>> ip_param = IcePackingParameters( ... threshold_window=deltahours(5), ... 
threshold_temperature=-1.0) >>> # compute the detection time-series >>> # ice_packing_ts = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.DISALLOW_MISSING) >>> # ice_packing_ts = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.ALLOW_INITIAL_MISSING) >>> ice_packing_ts = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.ALLOW_ANY_MISSING) >>> >>> # setup for the recession curve >>> ipr_param = IcePackingRecessionParameters( ... alpha=0.00009, ... recession_minimum=2.) >>> # compute a recession curve based on the ice packing ts >>> ice_packing_recession_ts_initial = flow_ts.ice_packing_recession(ice_packing_ts, ipr_param) >>> >>> # plotting >>> from matplotlib import pyplot as plt >>> from shyft.time_series import time_axis_extract_time_points >>> >>> plt.plot(time_axis_extract_time_points(ta)[:-1], temperature_ts.values, label='Temperature') >>> plt.plot(time_axis_extract_time_points(ta)[:-1], flow_ts.values, label='Flow') >>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_ts.values.to_numpy(), ... label='Ice packing?') >>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_recession_ts_initial.values.to_numpy(), ... label='Recession curve') >>> plt.legend() >>> plt.show()
- index_of((TimeSeries)self, (time)t) int :
return the index of the intervall that contains t, or npos if not found
- inside((TimeSeries)self, (object)min_v, (object)max_v[, (object)nan_v=nan[, (object)inside_v=1.0[, (object)outside_v=0.0]]]) TimeSeries :
Create an inside min-max range ts, that transforms the point-values that falls into the half open range [min_v .. max_v > to the value of inside_v(default=1.0), or outside_v(default=0.0), and if the value considered is nan, then that value is represented as nan_v(default=nan) You would typically use this function to form a true/false series (inside=true, outside=false)
- Parameters:
min_v (float) – minimum range, values < min_v are not inside min_v==NaN means no lower limit
max_v (float) – maximum range, values >= max_v are not inside. max_v==NaN means no upper limit
nan_v (float) – value to return if the value is nan
inside_v (float) – value to return if the ts value is inside the specified range
outside_v (float) – value to return if the ts value is outside the specified range
- Returns:
inside_ts. Evaluated on demand inside time-series
- Return type:
- integral((TimeSeries)self, (TimeAxis)ta) TimeSeries :
create a new ts that is the true integral of self over the specified time-axis ta. defined as integral of the non-nan part of each time-axis interval
- Parameters:
ta (TimeAxis) – time-axis that specifies the periods where true-integral is applied
- Returns:
ts. a new time-series expression, that will provide the true-integral when requested
- Return type:
Notes
the self point interpretation policy is used when calculating the true average
- kling_gupta(other_ts: TimeSeries, s_r: float = 1.0, s_a: float = 1.0, s_b: float = 1.0) float
computes the kling_gupta correlation using self as observation, and self.time_axis as the comparison time-axis
- Parameters:
other_ts (Timeseries) – the predicted/calculated time-series to correlate
s_r (float) – the kling gupta scale r factor(weight the correlation of goal function)
s_a (float) – the kling gupta scale a factor(weight the relative average of the goal function)
s_b (float) – the kling gupta scale b factor(weight the relative standard deviation of the goal function)
- Returns:
KGEs
- Return type:
float
- krls_interpolation((TimeSeries)self, (time)dt[, (object)gamma=0.001[, (object)tolerance=0.01[, (int)size=1000000]]]) TimeSeries :
Compute a new TS that is a krls interpolation of self.
The KRLS algorithm is a kernel regression algorithm for aproximating data, the implementation used here is from DLib: http://dlib.net/ml.html#krls The new time-series has the same time-axis as self, and the values vector contain no nan entries.
If you also want the mean-squared error of the interpolation use get_krls_predictor instead, and use the predictor api to generate a interpolation and a mse time-series. Other related functions are TimeSeries.get_krls_predictor, KrlsRbfPredictor
- Parameters:
dt (float) – The time-step in seconds the underlying predictor is specified for. Note that this does not put a limit on time-axes used, but for best results it should be approximatly equal to the time-step of time-axes used with the predictor. In addition it should not be to long, else you will get poor results. Try to keep the dt less than a day, 3-8 hours is usually fine.
gamma (float (optional)) – Determines the width of the radial basis functions for the KRLS algorithm. Lower values mean wider basis functions, wider basis functions means faster computation but lower accuracy. Note that the tolerance parameter also affects speed and accurcy. A large value is around 1E-2, and a small value depends on the time step. By using values larger than 1E-2 the computation will probably take to long. Testing have reveled that 1E-3 works great for a time-step of 3 hours, while a gamma of 1E-2 takes a few minutes to compute. Use 1E-4 for a fast and tolerably accurate prediction. Defaults to 1E-3
tolerance (float (optional)) – The krls training tolerance. Lower values makes the prediction more accurate, but slower. This typically have less effect than gamma, but is usefull for tuning. Usually it should be either 0.01 or 0.001. Defaults to 0.01
size (int (optional)) – The size of the “memory” of the underlying predictor. The default value is usually enough. Defaults to 1000000.
Examples:
>>> import numpy as np >>> import scipy.stats as stat >>> from shyft.time_series import ( ... Calendar, utctime_now, deltahours, ... TimeAxis, TimeSeries ... ) >>> >>> cal = Calendar() >>> t0 = utctime_now() >>> dt = deltahours(1) >>> n = 365*24 # one year >>> >>> # generate random bell-shaped data >>> norm = stat.norm() >>> data = np.linspace(0, 20, n) >>> data = stat.norm(10).pdf(data) + norm.pdf(np.random.rand(*data.shape)) >>> # ----- >>> ta = TimeAxis(cal, t0, dt, n) >>> ts = TimeSeries(ta, data) >>> >>> # compute the interpolation >>> ts_ipol = ts.krls_interpolation(deltahours(3))
- Returns:
krls_ts. A new time series being the KRLS interpolation of self.
- Return type:
- log((TimeSeries)self) TimeSeries :
create a new ts that contains log(py::self)
- lower_half((TimeSeries)self) TimeSeries :
Create a ts that contains non-negative values only.
- Returns:
lower_half_ts. Evaluated on demand inside time-series
- Return type:
- lower_half_mask((TimeSeries)self) TimeSeries :
Create a ts that contains 1.0 in place of non-positive values, and 0.0 in case of positive values.
- Returns:
lower_half_mask_ts. Evaluated on demand inside time-series
- Return type:
- max((TimeSeries)self, (object)number) TimeSeries :
create a new ts that contains the max of self and number for each time-step
- max( (TimeSeries)self, (TimeSeries)ts_other) -> TimeSeries :
create a new ts that contains the max of self and ts_other
- merge_points((TimeSeries)self, (TimeSeries)ts) TimeSeries :
Given that self is a concrete point-ts(not an expression), or empty ts, this function modifies the point-set of self, with points, (time,value) from other ts The result of the merge operation is the distinct set of time-points from self and other ts where values from other ts overwrites values of self if they happen to be at the same time-point
- Parameters:
ts (TimeSeries) – time-series to merge the time,value points from
- Returns:
self. self modified with the merged points from other ts
- Return type:
- min((TimeSeries)self, (object)number) TimeSeries :
create a new ts that contains the min of self and number for each time-step
- min( (TimeSeries)self, (TimeSeries)ts_other) -> TimeSeries :
create a new ts that contains the min of self and ts_other
- min_max_check_linear_fill((TimeSeries)self, (object)v_min, (object)v_max[, (object)dt_max=time.max]) TimeSeries :
- Create a min-max range checked ts with fill-values if value is NaN or outside range
If the underlying time-series is point-instant, then fill-values are linear-interpolation, otherwise, the previous value, if available is used as fill-value. A similar function with more features is quality_and_self_correction()
- Args:
v_min (float): minimum range, values < v_min are considered NaN. v_min==NaN means no lower limit
v_max (float): maximum range, values > v_max are considered NaN. v_max==NaN means no upper limit
dt_max (int): maximum time-range in seconds allowed for interpolating/extending values, default= max_utctime
- Returns:
TimeSeries: min_max_check_linear_fill. Evaluated on demand time-series with NaN, out of range values filled in
- min_max_check_linear_fill( (TimeSeries)self, (object)v_min, (object)v_max [, (time)dt_max=time.max]) -> TimeSeries :
Create a min-max range checked ts with fill-values if value is NaN or outside range If the underlying time-series is point-instant, then fill-values are linear-interpolation, otherwise, the previous value, if available is used as fill-value. Similar and more parameterized function is quality_and_self_correction()
- Args:
v_min (float): minimum range, values < v_min are considered NaN. v_min==NaN means no lower limit
v_max (float): maximum range, values > v_max are considered NaN. v_max==NaN means no upper limit
dt_max (int): maximum time-range in seconds allowed for interpolating/extending values, default= max_utctime
- Returns:
TimeSeries: min_max_check_linear_fill. Evaluated on demand time-series with NaN, out of range values filled in
- min_max_check_ts_fill((TimeSeries)self, (object)v_min, (object)v_max, (object)dt_max, (TimeSeries)cts) TimeSeries :
Create a min-max range checked ts with cts-filled-in-values if value is NaN or outside range
- Args:
v_min (float): minimum range, values < v_min are considered NaN. v_min==NaN means no lower limit
v_max (float): maximum range, values > v_max are considered NaN. v_max==NaN means no upper limit
dt_max (int): maximum time-range in seconds allowed for interpolating values
cts (TimeSeries): time-series that keeps the values to be filled in at points that are NaN or outside min-max-limits
- Returns:
TimeSeries: min_max_check_ts_fill. Evaluated on demand time-series with NaN, out of range values filled in
- min_max_check_ts_fill( (TimeSeries)self, (object)v_min, (object)v_max, (time)dt_max, (TimeSeries)cts) -> TimeSeries :
Create a min-max range checked ts with cts-filled-in-values if value is NaN or outside range
- Args:
v_min (float): minimum range, values < v_min are considered NaN. v_min==NaN means no lower limit
v_max (float): maximum range, values > v_max are considered NaN. v_max==NaN means no upper limit
dt_max (int): maximum time-range in seconds allowed for interpolating values
cts (TimeSeries): time-series that keeps the values to be filled in at points that are NaN or outside min-max-limits
- Returns:
TimeSeries: min_max_check_ts_fill. Evaluated on demand time-series with NaN, out of range values filled in
- nash_sutcliffe(other_ts: TimeSeries) float
Computes the Nash-Sutcliffe model effiency coefficient (n.s) for the two time-series over the time_axis of the observed_ts, self. Ref: http://en.wikipedia.org/wiki/Nash%E2%80%93Sutcliffe_model_efficiency_coefficient :param other_ts: the time-series that is the model simulated / calculated ts :type other_ts: TimeSeries
- Returns:
float
- Return type:
The n.s performance, that have a maximum at 1.0
- needs_bind((TimeSeries)self) bool :
returns true if there are any unbound time-series in the expression this time-series represent These functions also supports symbolic time-series handling: .find_ts_bind_info(),bind() and bind_done()
- partition_by((TimeSeries)self, (Calendar)calendar, (time)t, (time)partition_interval, (int)n_partitions, (time)common_t0) TsVector :
DEPRECATED(replaced by .stack ) : from a time-series, construct a TsVector of n time-series partitions. The partitions are simply specified by calendar, delta_t(could be symbolic, like YEAR : MONTH:DAY) and n. To make yearly partitions, just pass Calendar.YEAR as partition_interval. The t - parameter set the start - time point in the source-time-series, e.g. like 1930.09.01 The common_t0 - parameter set the common start - time of the new partitions, e.g. 2017.09.01
The typical usage will be to use this function to partition years into a vector with 80 years, where we can do statistics, percentiles to compare and see the different effects of yearly season variations. Note that the function is more general, allowing any periodic partition, like daily, weekly, monthly etc. that allows you to study any pattern or statistics that might be periodic by the partition pattern. Other related methods are time_shift,average,TsVector.
- Parameters:
calendar (Calendar) – The calendar to use, typically utc
t (utctime) – specifies where to pick the first partition
partition_interval (utctimespan) – the length of each partition, Calendar.YEAR,Calendar.DAY etc.
n_partitions (int) – number of partitions
common_t0 (utctime) – specifies the time to correlate all the partitions
- Returns:
ts-partitions. with length n_partitions, each ts is time-shifted to common_t0 expressions
- Return type:
- point_interpretation((TimeSeries)self) point_interpretation_policy :
returns the point interpretation policy
- pow((TimeSeries)self, (object)number) TimeSeries :
create a new ts that contains pow(py::self,number)
- pow( (TimeSeries)self, (TimeSeries)ts_other) -> TimeSeries :
create a new ts that contains pow(py::self,ts_other)
- quality_and_self_correction((TimeSeries)self, (QacParameter)parameters) TimeSeries :
returns a new time-series that applies quality checks accoring to parameters and fills in values according to rules specified in parameters.
- Parameters:
parameter (QacParameter) – Parameter with rules for quality and corrections
- Returns:
ts. a new time-series where the values are subject to quality and correction as specified
- Return type:
- quality_and_ts_correction((TimeSeries)self, (QacParameter)parameters, (TimeSeries)cts) TimeSeries :
returns a new time-series that applies quality checks accoring to parameters and fills in values from the cts, according to rules specified in parameters.
- Parameters:
parameter (QacParameter) – Parameter with rules for quality and corrections
cts (TimeSeries) – is used to fill in correct values, as f(t) for values that fails quality-checks
- Returns:
ts. a new time-series where the values are subject to quality and correction as specified
- Return type:
- rating_curve((TimeSeries)self, (RatingCurveParameters)rc_param) TimeSeries :
Create a new TimeSeries that is computed using a RatingCurveParameter instance.
Examples:
>>> import numpy as np
>>> from shyft.time_series import (
...     utctime_now, deltaminutes,
...     TimeAxis, TimeSeries,
...     RatingCurveFunction, RatingCurveParameters
... )
>>>
>>> # parameters
>>> t0 = utctime_now()
>>> dt = deltaminutes(30)
>>> n = 48*2
>>>
>>> # make rating functions, each with two segments
>>> rcf_1 = RatingCurveFunction()
>>> rcf_1.add_segment(0, 2, 0, 1)      # add segment from level 0, computing f(h) = 2*(h - 0)**1
>>> rcf_1.add_segment(5.3, 1, 1, 1.4)  # add segment from level 5.3, computing f(h) = 1*(h - 1)**1.4
>>> rcf_2 = RatingCurveFunction()
>>> rcf_2.add_segment(0, 1, 1, 1)      # add segment from level 0, computing f(h) = 1*(h - 1)**1
>>> rcf_2.add_segment(8.0, 0.5, 0, 2)  # add segment from level 8.0, computing f(h) = 0.5*(h - 0)**2
>>>
>>> # add rating curves to a parameter pack
>>> rcp = RatingCurveParameters()
>>> rcp.add_curve(t0, rcf_1)            # rcf_1 is active from t0
>>> rcp.add_curve(t0 + dt*n//2, rcf_2)  # rcf_2 takes over from t0 + dt*n/2
>>>
>>> # create a time-axis/-series
>>> ta = TimeAxis(t0, dt, n)
>>> ts = TimeSeries(ta, np.linspace(0, 12, n))
>>> rc_ts = ts.rating_curve(rcp)        # new time-series computed using the rating-curve functions
- Parameters:
rc_param (RatingCurveParameter) – RatingCurveParameter instance.
- Returns:
rcts. A new TimeSeries computed using self and rc_param.
- Return type:
TimeSeries
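The segment evaluation behind a rating curve can be sketched in plain numpy. This is an illustrative stand-in, not the actual shyft implementation: each segment added with add_segment computes f(h) = factor*(h - x0)**exponent, and the segment with the highest lower level not exceeding h is the active one (the `rating_curve` helper below is hypothetical).

```python
import numpy as np

def rating_curve(segments, h):
    """Illustrative stand-in: segments is a list of
    (lower_level, factor, x0, exponent) tuples.
    The active segment is the one with the largest lower_level <= h."""
    flow = np.full_like(h, np.nan, dtype=float)
    for lower_level, factor, x0, exponent in sorted(segments):
        mask = h >= lower_level  # this segment takes over from its lower_level
        flow[mask] = factor * (h[mask] - x0) ** exponent
    return flow

# mirrors rcf_1 from the example above: two segments
segments = [(0.0, 2.0, 0.0, 1.0),   # f(h) = 2*(h - 0)**1 for 0 <= h < 5.3
            (5.3, 1.0, 1.0, 1.4)]   # f(h) = 1*(h - 1)**1.4 for h >= 5.3
levels = np.array([1.0, 6.0])
print(rating_curve(segments, levels))  # [2.0, 5**1.4]
```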
- repeat((TimeSeries)self, (TimeAxis)repeat_time_axis) TimeSeries :
Repeat all time-series over the given repeat_time_axis periods
- Parameters:
repeat_time_axis (TimeAxis) – A time-axis that have the coarse repeat interval, like YEAR or similar
- Returns:
repeated_ts. time-series where pattern of self is repeated throughout the period of repeat_time_axis
- Return type:
TimeSeries
- scale_by((TimeSeries)self, (object)v) None :
scale all values by the specified factor v
- serialize((TimeSeries)self) ByteVector :
convert ts (expression) into a binary blob
- set((TimeSeries)self, (int)i, (object)v) None :
set the i’th value
- set_point_interpretation((TimeSeries)self, (point_interpretation_policy)policy) None :
set new policy
- set_ts_id((TimeSeries)self, (object)ts_id) None :
Set a new ts_id of symbolic ts, requires unbound ts. To create symbolic time-series use TimeSeries(‘url://like/id’) or with payload: TimeSeries(‘url://like/id’,ts_with_values)
- size((TimeSeries)self) int :
returns number of points
- slice((TimeSeries)self, (object)i0, (object)n) TimeSeries :
Given that self is a concrete point-ts (not an expression), or an empty ts, return a new TimeSeries containing the n values starting from index i0.
- Parameters:
i0 (int) – Index of first element to include in the slice
n (int) – Number of elements to include in the slice
- stack((TimeSeries)self, (Calendar)calendar, (time)t0, (int)n_dt, (time)dt, (int)n_partitions, (time)target_t0, (time)dt_snap) TsVector :
stack time-series into a TsVector of n_partitions time-series, each with the semantic calendar length n_dt x dt. The partitions are specified by calendar, n_dt x dt (which could be symbolic, like YEAR, MONTH or DAY) and n_partitions. To make yearly partitions, just pass 1 and Calendar.YEAR as n_dt and dt respectively. The t0 parameter sets the start time-point in the source time-series, e.g. 1930.09.01. The target_t0 parameter sets the common start-time of the stack, e.g. 2017.09.01. The dt_snap parameter is useful to ensure that if target_t0 is a Monday, then each partition is adjusted to the nearest Monday. The snap mechanism can be useful when stacking something like consumption, which follows a weekly pattern.
The typical usage is to partition years into a vector of, say, 80 years, on which statistics and percentiles can be computed to compare the effects of yearly seasonal variations. Note that the function is more general, allowing any periodic partition (daily, weekly, monthly etc.), so you can study any pattern or statistics that is periodic by the partition pattern. Other related methods are time_shift, average and TsVector.
- Parameters:
calendar (Calendar) – The calendar to use, typically utc
t0 (utctime) – specifies where to pick the first partition, e.g. 1930.09.01
n_dt (int) – number of calendar units for the length of the stride
dt (utctimespan) – the basic calendar length unit, Calendar.YEAR,Calendar.DAY
n_partitions (int) – number of partitions,e.g. length of the resulting TsVector
target_t0 (utctime) – specifies the common target time for the stack, e.g. 2017.09.01
dt_snap (utctimespan) – default 0, if set to WEEK, each stacked partition will be week-aligned.
- Returns:
stacked_ts. With length n_partitions, each ts is time-shifted (by calendar n_dt x dt strides) to target_t0 (as expressions)
- Return type:
TsVector
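The time-shifting performed by stack can be sketched with plain integer arithmetic. This is a simplification that ignores the calendar semantics (DST, leap years) which the real implementation handles via Calendar; the `stack_offsets` helper is hypothetical.

```python
YEAR = 365 * 24 * 3600  # simplified fixed-length year, no calendar semantics

def stack_offsets(t0, n_dt, dt, n_partitions, target_t0):
    """Illustrative: partition i starts at t0 + i*n_dt*dt in the source
    and is shifted so it starts at target_t0 in the stacked result."""
    return [target_t0 - (t0 + i * n_dt * dt) for i in range(n_partitions)]

t0 = 0                 # pretend 1930-09-01
target_t0 = 87 * YEAR  # pretend 2017-09-01
offsets = stack_offsets(t0, n_dt=1, dt=YEAR, n_partitions=3, target_t0=target_t0)
# partition 0 shifts forward 87 years, partition 1 by 86, partition 2 by 85
print([o // YEAR for o in offsets])  # [87, 86, 85]
```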
- statistics((TimeSeries)self, (TimeAxis)ta, (object)p) TimeSeries :
Create a new ts that extracts the specified statistics from self over the specified time-axis ta. Statistics are created for the point values of the time-series that fall within each time-period of the time-axis. If there are no points within a period, nan will be the result. Tip: use ts.average(ta_hourly_resolution).statistics(ta_weekly, p=50) to get the functional true hourly-average statistics.
- Parameters:
ta (TimeAxis) – time-axis for the statistics
p (int) – percentile range [0..100], or statistical_property.AVERAGE|MIN_EXTREME|MAX_EXTREME
- Returns:
ts. a new time-series expression, will provide the statistics when requested
- Return type:
TimeSeries
- stringify((TimeSeries)self) str :
return human-readable string of ts or expression
- time((TimeSeries)self, (int)i) time :
returns the time at the i’th point
- time_shift((TimeSeries)self, (time)delta_t) TimeSeries :
create a new ts that is the time-shifted version of self
- Parameters:
delta_t (int) – number of seconds to time-shift, positive values moves forward
- Returns:
ts. a new time-series, that appears as time-shifted version of self
- Return type:
TimeSeries
- total_period((TimeSeries)self) UtcPeriod :
returns the total period covered by the time-axis of this time-series
- transform((TimeSeries)self, (object)points, (interpolation_scheme)method) TimeSeries :
Create a transformed time-series, having values taken from pointwise function evaluation. Function values are determined by interpolating the given points, using the specified method. Valid method arguments are ‘polynomial’, ‘linear’ and ‘catmull-rom’.
- Returns:
transform_ts. New TimeSeries where each element is an evaluated-on-demand transformed time-series.
- Return type:
TimeSeries
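For the 'linear' method, the pointwise function built from the supplied points behaves like ordinary linear interpolation. A numpy sketch of those semantics (illustrative, not the actual shyft implementation; the points and `transform_value` helper are made up):

```python
import numpy as np

# hypothetical transformation points: (x, f(x)) pairs, e.g. a level-to-volume curve
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([0.0, 10.0, 40.0])

def transform_value(v):
    """'linear'-method semantics: each ts value v is replaced by f(v),
    where f linearly interpolates the given points."""
    return np.interp(v, xs, ys)

print(transform_value(0.5))  # 5.0
print(transform_value(1.5))  # 25.0
```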
- ts_id((TimeSeries)self) str :
returns ts_id of symbolic ts, or empty string if not symbolic ts To create symbolic time-series use TimeSeries(‘url://like/id’) or with payload: TimeSeries(‘url://like/id’,ts_with_values)
- Returns:
ts_id. url-like ts_id as passed to constructor or empty if the ts is not a ts with ts_id
- Return type:
str
- unbind((TimeSeries)self) None :
Reset the ts-expression to the unbound state, discarding bound symbol references. For time-series, or expressions, that do not have symbolic references, this has no effect. See also .find_ts_bind_info(), bind() and bind_done()
- upper_half((TimeSeries)self) TimeSeries :
Create a ts that contains non-negative values only.
- Returns:
upper_half_ts. Evaluated on demand inside time-series
- Return type:
TimeSeries
- upper_half_mask((TimeSeries)self) TimeSeries :
Create a ts that contains 1.0 in place of non-negative values, and 0.0 in case of negative values.
- Returns:
upper_half_mask_ts. Evaluated on demand inside time-series
- Return type:
TimeSeries
- use_time_axis((TimeSeries)self, (TimeAxis)time_axis) TimeSeries :
Create a new ts that has the same values as self, but filtered to the time-axis points of the supplied time-axis. This function might be useful for making a new time-series that exactly matches the time-axis of another series. The values of the resulting time-series are like: [self(t) for t in time_axis.time_points[:-1]]
- Parameters:
time_axis (TimeAxis) – the wanted time-axis
- Returns:
ts. a new time-series, that appears as resampled values of self
- Return type:
TimeSeries
- use_time_axis_from((TimeSeries)self, (TimeSeries)other) TimeSeries :
Create a new ts that has the same values as self, but filtered to the time-axis points of the other supplied time-series. This function might be useful for making a new time-series that exactly matches the time-axis of another series. The values of the resulting time-series are like: [self(t) for t in other.time_axis.time_points[:-1]]. Notice that the other time-series can be unbound (an expression) in this case.
- Parameters:
other (TimeSeries) – time-series that provides the wanted time-axis
- Returns:
ts. a new time-series, that appears as resampled values of self
- Return type:
TimeSeries
- property v
returns the point-values of timeseries, alias for .values
- value((TimeSeries)self, (int)i) float :
returns the value at the i’th time point
- property values
the point values (possibly calculated on the fly)
- Type:
DoubleVector
Class TsVector
- class shyft.time_series.TsVector
Bases:
instance
A vector (as in a strongly typed list/array) of time-series that supports ts-math operations. You can create a TsVector from a list, or list generator, of type TimeSeries. TsVector is to TimeSeries what a numpy array is to numbers, see also
TimeSeries
Math operations and their types transformations:
number bin_op ts_vector -> ts_vector
ts_vector bin_op ts_vector -> ts_vector
ts bin_op ts_vector -> ts_vector
where bin_op is any of (*,/,+,-) and explicit forms of binary functions like pow,log,min,max.
In addition these are also available:
average()
integral()
accumulate()
time_shift()
percentiles()
All operations return a new object, usually a ts-vector, containing the resulting expressions
Examples:
>>> import numpy as np
>>> from shyft.time_series import TsVector, Calendar, deltahours, TimeAxis, TimeSeries, POINT_AVERAGE_VALUE as fx_avg
>>>
>>> utc = Calendar()  # ensure easy consistent explicit handling of calendar and time
>>> ta1 = TimeAxis(utc.time(2016, 9, 1, 8, 0, 0), deltahours(1), 10)  # create a time-axis for ts1
>>> ts1 = TimeSeries(ta1, np.linspace(0, 10, num=len(ta1)), fx_avg)
>>> ta2 = TimeAxis(utc.time(2016, 9, 1, 8, 30, 0), deltahours(1), 5)  # create a time-axis for ts2
>>> ts2 = TimeSeries(ta2, np.linspace(0, 1, num=len(ta2)), fx_avg)
>>> tsv = TsVector([ts1, ts2])  # create a ts-vector from a list of time-series
>>> c = tsv + tsv*3.0           # c is now an expression, time-axis is the overlap of a and b, lazy evaluation
>>> c_values = c[0].values.to_numpy()  # compute and extract the values of the i'th (here: 0) time-series, as a numpy array
>>>
>>> # Calculate data for new time-points
>>> value_1 = tsv(utc.time(2016, 9, 1, 8, 30))  # calculates the value at a given time
>>> ta_target = TimeAxis(utc.time(2016, 9, 1, 7, 30), deltahours(1), 12)  # create a target time_axis
>>> tsv_new = tsv.average(ta_target)  # new ts-vector with values on the target time_axis
>>> ts0_val = tsv_new[0].values.to_numpy()  # access values of the i'th (here: 0) time-series as a numpy array
- __init__((TsVector)arg1, (TsVector)clone) None :
Create a clone.
- __init__( (object)arg1) -> object :
Create an empty TsVector
- __init__( (object)arg1, (TsVector)cloneme) -> object :
Create a shallow clone of the TsVector
- Args:
cloneme (TsVector): The TsVector to be cloned
- __init__( (object)arg1, (list)ts_list) -> object :
Create a TsVector from a python list of TimeSeries
- Args:
ts_list (List[TimeSeries]): A list of TimeSeries
- abs((TsVector)self) TsVector :
create a new ts-vector, with all members equal to abs(py::self)
- Returns:
tsv. a new TsVector expression, that will provide the abs-values of self.values
- Return type:
TsVector
- accumulate((TsVector)self, (TimeAxis)ta) TsVector :
create a new vector of time-series where the value of each i-th element is computed as: integral of f(t)*dt, from t0..ti, given the specified time-axis ta and point interpretation.
- Parameters:
ta (TimeAxis) – time-axis that specifies the periods where accumulated integral is applied
- Returns:
tsv. a new time-series expression, that will provide the accumulated values when requested
- Return type:
TsVector
Notes
Has a point-instant interpretation, see also note in
TimeSeries.accumulate()
for possible consequences
- append((TsVector)arg1, (object)arg2) None
- average((TsVector)self, (TimeAxis)ta) TsVector :
create a new vector of ts that is the true average of self over the specified time-axis ta.
- Parameters:
ta (TimeAxis) – time-axis that specifies the periods where true-average is applied
- Returns:
tsv. a new time-series expression, that will provide the true-average when requested
- Return type:
TsVector
Notes
the self point interpretation policy is used when calculating the true average
- average_slice((TsVector)self, (time)lead_time, (time)delta_t, (object)n) TsVector :
Returns a ts-vector with the average time-series of the specified slice. The slice for each ts is specified by the lead_time, delta_t and n parameters. See also nash_sutcliffe, forecast_merge
- Parameters:
lead_time (int) – number of seconds lead-time offset from each ts .time(0)
delta_t (int) – delta-time seconds to average as basis for n.s. simulation and observation values
n (int) – number of time-steps of length delta_t to slice out of each forecast/simulation ts
- Returns:
ts_vector_sliced. a ts-vector with average ts of each slice specified.
- Return type:
TsVector
- clone_expression((TsVector)self) TsVector :
create a copy of the ts-expressions, except for the bound payload of the reference ts. For the reference terminals, those with ts_id, only the ts_id is copied. Thus, to re-evaluate the expression, those have to be bound.
Notes
this function is only useful in context where multiple bind/rebind while keeping the expression is needed.
- derivative((TsVector)self[, (derivative_method)method=shyft.time_series._time_series.derivative_method.DEFAULT]) TsVector :
create a new vector of ts where each i’th element is the derivative of f(t)
- Parameters:
method (derivative_method) – what derivative_method variant to use
- Returns:
tsv. where each member is the derivative of the source
- Return type:
TsVector
- evaluate((TsVector)self) TsVector :
Evaluates the expressions in TsVector multithreaded, and returns the resulting TsVector, where all items now are concrete terminals, that is, not expressions anymore. Useful client-side if you have complex large expressions where all time-series are bound (not symbols)
- Returns:
evaluated_clone. returns the computed result as a new ts-vector
- Return type:
TsVector
- extend((TsVector)arg1, (object)arg2) None
- extend_ts((TsVector)arg1, (TimeSeries)ts[, (extend_split_policy)split_policy=shyft.time_series._time_series.extend_split_policy.LHS_LAST[, (extend_fill_policy)fill_policy=shyft.time_series._time_series.extend_fill_policy.FILL_NAN[, (time)split_at=time(0)[, (object)fill_value=nan]]]]) TsVector :
create a new dd::ats_vector where all time-series are extended by ts
- Args:
ts (TimeSeries): time-series to extend each time-series in self with
split_policy (extend_ts_split_policy): policy determining where to split between self and ts
fill_policy (extend_ts_fill_policy): policy determining how to fill any gap between self and ts
split_at (utctime): time at which to split if split_policy == EPS_VALUE
fill_value (float): value to fill any gap with if fill_policy == EPF_FILL
- Returns:
TsVector: new_ts_vec. a new time-series vector where all time-series in self have been extended by ts
- extend_ts( (TsVector)arg1, (TsVector)ts [, (extend_split_policy)split_policy=shyft.time_series._time_series.extend_split_policy.LHS_LAST [, (extend_fill_policy)fill_policy=shyft.time_series._time_series.extend_fill_policy.FILL_NAN [, (time)split_at=time(0) [, (object)fill_value=nan]]]]) -> TsVector :
create a new dd::ats_vector where all ts’ are extended by the matching ts from ts_vec
- Args:
ts_vec (TsVector): time-series vector to extend time-series in self with
split_policy (extend_ts_split_policy): policy determining where to split between self and ts
fill_policy (extend_ts_fill_policy): policy determining how to fill any gap between self and ts
split_at (utctime): time at which to split if split_policy == EPS_VALUE
fill_value (float): value to fill any gap with if fill_policy == EPF_FILL
- Returns:
TsVector: new_ts_vec. a new time-series vector where all time-series in self have been extended by the corresponding time-series in ts_vec
- extract_as_table((TsVector)self, (Calendar)cal, (object)time_scale) DoubleVectorVector :
Extract values in the ts-vector as a table, where column [0] is the distinct union of all time_scale*(time-points i, + cal.tz_offset(i)), and columns [1..n] are the value contributions of the i'th ts, nan if there is no contribution at that time-point. This function's primary usage is within the visual layer of the shyft.dashboard package, to speed up processing; the semantics and parameters reflect this.
- Parameters:
cal (Calendar) – Calendar to use for tz-offset of each time-point (to resolve bokeh lack of tz-handling)
time_scale (float) – time-scale to multiply the time from si-unit [s] to any scaled unit, typically ms
- Returns:
table. A 2d vector where [0] contains time, [1..n] the values
- Return type:
DoubleVectorVector
- forecast_merge((TsVector)self, (time)lead_time, (time)fc_interval) TimeSeries :
merge the forecasts in this vector into a time-series that is constructed by taking a slice of length fc_interval, starting lead_time into each of the forecasts of this time-series vector. The content of the vector should be ordered in forecast-time, each entry at least fc_interval separated from the previous. If there are missing forecasts (more than fc_interval between two forecasts), this is automatically repaired using extended slices from the existing forecasts
- Parameters:
lead_time (int) – start slice number of seconds from t0 of each forecast
fc_interval (int) – length of each slice in seconds, and thus also gives the forecast-interval separation
- Returns:
merged time-series. A merged forecast time-series
- Return type:
TimeSeries
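The slicing described above can be sketched with plain Python lists: from each forecast, take the values covering [t0+lead_time, t0+lead_time+fc_interval) and concatenate. This is a hypothetical fixed-resolution simplification; the real method works on time-series and repairs gaps between forecasts.

```python
def forecast_merge(forecasts, lead_time, fc_interval, dt):
    """Illustrative: forecasts is a list of value-lists, issued fc_interval apart.
    Take n = fc_interval//dt values starting at index lead_time//dt from each."""
    i0 = lead_time // dt       # index of the first value inside the slice
    n = fc_interval // dt      # number of values per slice
    merged = []
    for fc in forecasts:
        merged.extend(fc[i0:i0 + n])
    return merged

# two hourly-resolution forecasts, issued 2 hours apart
fc_a = [10, 11, 12, 13]
fc_b = [20, 21, 22, 23]
print(forecast_merge([fc_a, fc_b], lead_time=3600, fc_interval=7200, dt=3600))
# [11, 12, 21, 22]
```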
- inside((TsVector)self, (object)min_v, (object)max_v[, (object)nan_v=nan[, (object)inside_v=1.0[, (object)outside_v=0.0]]]) TsVector :
Create an inside min-max range ts-vector, that transforms the point-values that fall into the half-open range [min_v .. max_v) to the value inside_v (default=1.0), or outside_v (default=0.0); if the value considered is nan, that value is represented as nan_v (default=nan). You would typically use this function to form a true/false series (inside=true, outside=false)
- Parameters:
min_v (float) – minimum range, values < min_v are not inside min_v==NaN means no lower limit
max_v (float) – maximum range, values >= max_v are not inside. max_v==NaN means no upper limit
nan_v (float) – value to return if the value is nan
inside_v (float) – value to return if the ts value is inside the specified range
outside_v (float) – value to return if the ts value is outside the specified range
- Returns:
inside_tsv. New TsVector where each element is an evaluated-on-demand inside time-series
- Return type:
TsVector
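The mapping rules can be sketched in numpy; this is an illustration of the semantics described above, not the actual implementation (the `inside` helper below is a stand-in):

```python
import numpy as np

def inside(values, min_v, max_v, nan_v=np.nan, inside_v=1.0, outside_v=0.0):
    """Illustrative numpy version of the half-open range test [min_v .. max_v):
    nan maps to nan_v, values in range to inside_v, others to outside_v.
    A NaN limit means 'no limit' on that side."""
    v = np.asarray(values, dtype=float)
    in_range = np.ones_like(v, dtype=bool)
    if not np.isnan(min_v):
        in_range &= v >= min_v   # below min_v is outside
    if not np.isnan(max_v):
        in_range &= v < max_v    # at or above max_v is outside
    out = np.where(in_range, inside_v, outside_v)
    return np.where(np.isnan(v), nan_v, out)

print(inside([0.5, 2.0, np.nan, -1.0], min_v=0.0, max_v=1.0))
# [1.0, 0.0, nan, 0.0]
```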
- integral((TsVector)self, (TimeAxis)ta) TsVector :
create a new vector of ts that is the true integral of self over the specified time-axis ta, defined as the integral of the non-nan part of each time-axis interval
- Parameters:
ta (TimeAxis) – time-axis that specifies the periods where true-integral is applied
- Returns:
tsv. a new time-series expression, that will provide the true-integral when requested
- Return type:
TsVector
Notes
the self point interpretation policy is used when calculating the true integral
- log((TsVector)self) TsVector :
returns TsVector log(py::self)
- max((TsVector)self, (object)number) TsVector :
returns max of vector and a number
- max( (TsVector)self, (TimeSeries)ts) -> TsVector :
returns max of ts-vector and a ts
- max( (TsVector)self, (TsVector)tsv) -> TsVector :
returns max of ts-vector and another ts-vector
- min((TsVector)self, (object)number) TsVector :
returns min of vector and a number
- min( (TsVector)self, (TimeSeries)ts) -> TsVector :
returns min of ts-vector and a ts
- min( (TsVector)self, (TsVector)tsv) -> TsVector :
returns min of ts-vector and another ts-vector
- nash_sutcliffe((TsVector)self, (TimeSeries)observation_ts, (time)lead_time, (time)delta_t, (object)n) float :
Computes the nash-sutcliffe (wiki: nash-sutcliffe) criterion between the observation_ts and the slice of each time-series in the vector. The slice for each ts is specified by the lead_time, delta_t and n parameters. The function is provided to ease evaluation of forecast performance for different lead-time periods into each forecast. The returned value range is 1.0 for a perfect match, towards -oo for no match, or nan if the observation is constant or data is missing. See also nash_sutcliffe_goal_function
- Parameters:
observation_ts (TimeSeries) – the observation time-series
lead_time (int) – number of seconds lead-time offset from each ts .time(0)
delta_t (int) – delta-time seconds to average as basis for n.s. simulation and observation values
n (int) – number of time-steps of length delta_t to slice out of each forecast/simulation ts
- Returns:
nash-sutcliffe value. the nash-sutcliffe criteria evaluated over all time-series in the TsVector for the specified lead-time, delta_t and number of elements
- Return type:
double
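The criterion itself is the standard Nash–Sutcliffe efficiency, NS = 1 - Σ(sim - obs)² / Σ(obs - mean(obs))². A numpy sketch of the formula (the shyft method additionally performs the lead-time slicing described above; the helper below is illustrative only):

```python
import numpy as np

def nash_sutcliffe(sim, obs):
    """NS = 1 - sum((sim-obs)^2) / sum((obs - mean(obs))^2);
    1.0 is a perfect match, values fall towards -inf for poor matches."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    denom = np.sum((obs - obs.mean()) ** 2)
    if denom == 0.0:  # constant observations: criterion undefined
        return np.nan
    return 1.0 - np.sum((sim - obs) ** 2) / denom

obs = [1.0, 2.0, 3.0, 4.0]
print(nash_sutcliffe(obs, obs))                   # 1.0 (perfect match)
print(nash_sutcliffe([2.0, 2.0, 2.0, 2.0], obs))  # -0.2
```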
- percentiles((TsVector)self, (TimeAxis)time_axis, (IntVector)percentiles) TsVector :
Calculate the percentiles of all time-series over the specified time-axis. The definition is equal to e.g. NIST R7, excel, and R. The time-series point_fx interpretation is used when performing the true-average over the time_axis periods. This function works on bound expressions; for unbound expressions, use DtsClient.percentiles.
See also
DtsClient.percentiles()
if you want to evaluate percentiles of an unbound expression.
- Args:
percentiles (IntVector): A list of numbers, like [0, 25, 50, -1, 75, 100], will return 6 time-series. Numbers with special semantics are: -1 -> arithmetic average, -1000 -> min extreme value, +1000 -> max extreme value
time_axis (TimeAxis): The time-axis used when applying true-average to the time-series
- Returns:
TsVector: calculated_percentiles. Time-series list with evaluated percentile results, same length as input
- percentiles( (TsVector)self, (TimeAxisFixedDeltaT)time_axis, (IntVector)percentiles) -> TsVector :
Calculate the percentiles of the time-series over the specified time-axis. The definition is equal to e.g. NIST R7, excel, and R. The time-series point_fx interpretation is used when performing the true-average over the time_axis periods. This function works on bound expressions; for unbound expressions, use DtsClient.percentiles.
See also
DtsClient.percentiles()
if you want to evaluate percentiles of an unbound expression.
- Args:
percentiles (IntVector): A list of numbers, like [0, 25, 50, -1, 75, 100], will return 6 time-series. -1 -> arithmetic average, -1000 -> min extreme value, +1000 -> max extreme value
time_axis (TimeAxisFixedDeltaT): The time-axis used when applying true-average to the time-series
- Returns:
TsVector: calculated_percentiles. Time-series list with evaluated percentile results, same length as input
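The NIST R7 percentile definition matches numpy's default linear interpolation. A sketch of the per-interval cross-section computation, including the special codes (-1 for average, ±1000 for extremes); the `cross_section_percentiles` helper is illustrative, not the shyft API:

```python
import numpy as np

def cross_section_percentiles(values, percentiles):
    """values: one number per time-series for a given interval.
    Special codes: -1 -> arithmetic average, -1000/+1000 -> min/max extreme."""
    out = []
    for p in percentiles:
        if p == -1:
            out.append(np.mean(values))
        elif p == -1000:
            out.append(np.min(values))
        elif p == 1000:
            out.append(np.max(values))
        else:
            out.append(np.percentile(values, p))  # linear interpolation == NIST R7
    return out

vals = [1.0, 2.0, 3.0, 4.0]
print(cross_section_percentiles(vals, [0, 50, -1, 100]))
# [1.0, 2.5, 2.5, 4.0]
```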
- pow((TsVector)self, (object)number) TsVector :
returns TsVector pow(py::self,number)
- pow( (TsVector)self, (TimeSeries)ts) -> TsVector :
returns TsVector pow(py::self,ts)
- pow( (TsVector)self, (TsVector)tsv) -> TsVector :
returns TsVector pow(py::self,tsv)
- repeat((TsVector)self, (TimeAxis)repeat_time_axis) TsVector :
Repeat all time-series over the given repeat_time_axis periods
- size()
- slice((TsVector)self, (IntVector)indexes) TsVector :
returns a slice of self, specified by indexes
- Parameters:
indexes (IntVector) – the indicies to pick out from self, if indexes is empty, then all is returned
- Returns:
slice. a new TsVector, with content according to indexes specified
- Return type:
TsVector
- statistics((TsVector)self, (TimeAxis)ta, (object)p) TsVector :
create a new vector of ts where each element is ts.statistics(ta,p)
- sum((TsVector)self) TimeSeries :
returns the sum of all ts in the TsVector, as a ts, as in reduce(add, ...)
- time_shift((TsVector)self, (time)delta_t) TsVector :
create a new vector of ts that is the time-shifted version of self
- transform((TsVector)self, (object)points, (interpolation_scheme)method) TsVector :
Create a transformed ts-vector, having values taken from pointwise function evaluation. Function values are determined by interpolating the given points, using the specified method. Valid method arguments are ‘polynomial’, ‘linear’ and ‘catmull-rom’.
- Returns:
transform_tsv. New TsVector where each element is an evaluated-on-demand transformed time-series.
- Return type:
TsVector
- use_time_axis((TsVector)self, (TimeAxis)time_axis) TsVector :
Create a new ts-vector applying
TimeSeries.use_time_axis()
on each member, e.g. resampling instant values at specified time-points.
- use_time_axis_from((TsVector)self, (TimeSeries)other) TsVector :
Create a new ts-vector applying
TimeSeries.use_time_axis_from()
on each member
- Parameters:
other (TimeSeries) – time-series that provides the wanted time-axis
- Returns:
tsv. time-series vector, where each element have time-axis from other
- Return type:
TsVector
- value_range((TsVector)self, (UtcPeriod)p) DoubleVector :
Computes min and max of all non-nan values in the period for bound expressions.
- Parameters:
p (UtcPeriod)
- Returns:
values. Resulting [min_value, max_value]. If all values are equal, min = max = the_value
- Return type:
DoubleVector
- values_at((TsVector)self, (time)t) DoubleVector :
Computes the value at specified time t for all time-series
- Args:
t (utctime): seconds since epoch 1970 UTC
- values_at( (TsVector)self, (object)t) -> DoubleVector :
Computes the value at specified time t for all time-series
- Args:
t (int): seconds since epoch 1970 UTC
- values_at_time(t: int)
Time series expressions
The elements in this category implement the time series expressions solution.
Class TsBindInfo
- class shyft.time_series.TsBindInfo
Bases:
instance
TsBindInfo gives information about a time-series and its binding, represented by an encoded string reference. Given that you have a concrete ts, you can bind it to bind_info.ts using bind_info.ts.bind(). See also TimeSeries.find_ts_bind_info() and TimeSeries.bind()
- __init__((TsBindInfo)self) None
- property id
a unique id/url that identifies a time-series in a ts-database/file-store/service
- Type:
str
- property ts
the ts, provides .bind(another_ts) to set the concrete values
- Type:
TimeSeries
DTSS - The Distributed Time series System
The elements in this category implement the DTSS. The DTSS provides ready-to-use services and components, which are useful in themselves.
In addition, the services are extensible by python hooks, callbacks, that allow the user to extend/adapt the functionality to cover other time-series data base backends and services.
Note that the DTSS is not a database as such, but it does have a built-in high-performance time-series db. The DTSS is better viewed as a computing component/service that is capable of evaluating time-series expressions, extracting the wanted information, and sending it back to the clients. One of the important properties of the DTSS is that we can bring the heavy computations to where the data is located. In addition it has a specialized advanced caching system that allows evaluations to run in memory (utilizing multi-core evaluation).
The DTSS contains a high performance in memory queue for messages that consists of collections of time-series. The queue mechanism also provide end-to-end handshake, so that producer can know that consumer have processed the queue message.
The transfer service built into the DTSS also allows for efficient direct replication to other DTSS instances, for time-series that match a regular expression, and even with regular-expression translation before pushing to the remote instance.
The transfer mechanism is resilient to network and service interruptions, and can propagate changes to large sets of time-series in a few milliseconds (limited by network/storage bandwidth).
The open design allows it to utilize any existing legacy ts-databases/services through customization points.
Class DtsServer
- class shyft.time_series.DtsServer
Bases:
instance
A distributed time-series server.
The server part of the Shyft Distributed TimeSeries System(DTSS). Capable of processing time-series messages and responding accordingly.
It has dual service interfaces:
raw-socket boost serialized binary, use
DtsClient
web-api, web-socket(https/wss w. auth supported) using boost.beast, boost.spirit to process/emit messages. This also supports ts-change subscriptions.
python customization and extension capability
The user can set up callbacks to python to handle unbound symbolic time-series references, ts-urls. This means that you can use your own ts database backend if you have one that can beat the shyft-internal ts-db.
The DtsServer then resolves symbolic references, reading time-series from a service or storage for the specified period. The server object will then compute the resulting time-series vector, and respond back to clients with the results.
Multi-node considerations:
firewall/routing: ensure that the port you are using is open for ip-traffic (use an ssh-tunnel if you need ssl/tls)
we strongly recommend using linux for performance and long-term stability
The DTSS also supports master-slave mode, which allows scaling out computations to several DTSS instances, see set_master_slave_mode
backend storage
There are 3 internal backends, and customization for external storage as well. Internal storage containers:
rocksdb - by facebook, configurable specifying ts_rdb in the set_container method
leveldb - (deprecated, replaced by rocksdb) by google, configurable specifying ts_ldb in the set_container method
filedb - fast zero overhead, and simple internal binary formats, configurable specifying ts_db in the set_container method
The kind of backend storage for the backing-store ts-containers is specified in the set_container method, for explicit creation of ts-containers. Notice that for remotely client-created containers for geo time-series storage, the default_geo_db_type applies, set to ts_rdb.
External storage can be set up by supplying python callbacks for the find, read, store and remove_container hooks. To ensure that containers are (remotely) found and configured after reboot/restart, provide a dtss configuration file where this information is stored. Specifying something other than the shyft:// prefix for the ts-urls then allows any external storage to be used.
HPC setup: configure linux os user limits. For high-performance environments, the ulimits, especially memory and the number of open files, need to be set higher than the defaults; usually nofiles is 1024, which is too low for HPC apps. We recommend 4096, 8192 or even higher for demanding databases. For tuning rocksdb or leveldb, read the tuning guides for those libraries; we provide some basic parameters for tuning, but more can be added if needed.
See also
DtsClient
- __init__((DtsServer)self) None
- add_auth_tokens((DtsServer)self, (StringVector)tokens) None :
Adds auth tokens, and activates authentication. Each token is compared exactly to the authorization token passed in the request. Authorization should only be used over https/wss, unless other measures (vpn/ssh tunnels etc.) are used to protect auth tokens on the wire. Important! Ensure to start_web_api with tls_only=True when using auth!
- Parameters:
tokens (StringVector) – list of tokens, where each token is like Basic dXNlcjpwd2Q=, e.g. base64-encoded user:pwd
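A token of the expected form can be produced with Python's standard base64 module:

```python
import base64

# encode "user:pwd" as a Basic auth token, matching the example above
user_pwd = "user:pwd"
token = "Basic " + base64.b64encode(user_pwd.encode()).decode()
print(token)  # Basic dXNlcjpwd2Q=
```

Such tokens could then be collected in a StringVector and passed to add_auth_tokens.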
- property alive_connections
returns currently alive connections to the server
- Type:
int
- property auth_needed
returns true if the server is setup with auth-tokens, requires web-api clients to pass a valid token
- Type:
bool
- auth_tokens((DtsServer)self) StringVector :
returns the registered authentication tokens.
- cache((DtsServer)self, (StringVector)ts_ids, (TsVector)ts_vector) None :
add/update the specified ts_ids with the corresponding ts in the cache. Please notice that there is no validation of the ts_ids; they are treated as identifiers, not verified against any existing containers etc. Requests that follow will use the cached item as long as it satisfies the identifier and the coverage period requested
- Parameters:
ts_ids (StringVector) – a list of time-series ids
ts_vector (TsVector) – a list of corresponding time-series
- property cache_max_items
cache_max_items is the maximum number of time-series identities that are kept in memory. Elements exceeding this capacity are evicted using the least-recently-used algorithm. Notice that assigning a lower value than the existing one will also flush time-series out of the cache in least-recently-used order.
- Type:
int
- property cache_memory_target
The memory max target in number of bytes. If not set directly, the following equation is used: cache_memory_target = cache_ts_initial_size_estimate * cache_max_items. When setting the target directly, the number of items in the cache is adjusted so that real memory usage stays below the specified target. The setter could cause elements to be flushed out of the cache.
- Type:
int
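The default relation above is simple arithmetic; a minimal sketch with illustrative numbers (both values below are assumptions, not shyft defaults):

```python
cache_max_items = 100_000                 # maximum number of ts identities kept in cache
cache_ts_initial_size_estimate = 10_000   # assumed initial size estimate per time-series, in bytes
# default target when cache_memory_target is not set explicitly:
cache_memory_target = cache_ts_initial_size_estimate * cache_max_items
print(cache_memory_target)  # → 1000000000, i.e. ~1 GB
```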
- property cache_stats
the current cache statistics
- Type:
- property cache_ts_initial_size_estimate
The initial time-series size estimate in bytes for the cache mechanism, used in the memory-target = cache_ts_initial_size_estimate * cache_max_items algorithm. Notice that assigning a lower value than the existing one will also flush time-series out of the cache in least-recently-used order.
- Type:
int
- property cb
callback for binding unresolved time-series references to concrete time-series. Called if the incoming messages contain unbound time-series. The signature of the callback function should be TsVector cb(StringVector, utcperiod)
Examples:
>>> from shyft import time_series as sa
>>> def resolve_and_read_ts(ts_ids, read_period):
...     print('ts_ids:', len(ts_ids), ', read period=', str(read_period))
...     ta = sa.TimeAxis(read_period.start, sa.deltahours(1), read_period.timespan() // sa.deltahours(1))
...     x_value = 1.0
...     r = sa.TsVector()
...     for ts_id in ts_ids:
...         r.append(sa.TimeSeries(ta, fill_value=x_value))
...         x_value = x_value + 1
...     return r
>>> # and then bind the function to the callback
>>> dtss = sa.DtsServer()
>>> dtss.cb = resolve_and_read_ts
>>> dtss.set_listening_port(20000)
>>> dtss.process_messages(60000)
- clear((DtsServer)self) None :
stop serving connections, gracefully.
See also
cb, process_messages(msec),start_server()
- clear_cache_stats((DtsServer)self) None :
clear accumulated cache_stats
- close((DtsServer)self) None :
stop serving connections, gracefully.
See also
cb, process_messages(msec),start_server()
- property configuration_file
configuration file to enable persistent container configurations over coldstarts
- Type:
str
- property default_geo_db_config
Default parameters for geo db created by clients
- property default_geo_db_type
default container type for geo db created by clients, one of (ts_rdb, ts_ldb, ts_db); defaults to ts_rdb
- Type:
str
- find((DtsServer)self, (object)search_expression) TsInfoVector :
Find ts information that fully matches the regular search-expression. For the shyft file based backend, take care to specify path elements precisely, so that the number of directories visited is minimised. E.g. a/.*/my.ts will prune out any top level directory not starting with a, but will match any subdirectories below that level. Refer to the python test-suites for a wide range of examples using find. Notice that the regexp search algorithm ignores case. Please be aware that custom backends implemented as python extensions might have different rules.
- Parameters:
search_expression (str) – regular search-expression, to be interpreted by the back-end tss server
- Returns:
ts_info_vector. The search result, as vector of TsInfo objects
- Return type:
TsInfoVector
See also
TsInfo,TsInfoVector
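The matching semantics described above can be illustrated stand-alone; here python's re module serves as an approximation of the server-side matcher (full match, ignore case), not the shyft implementation itself:

```python
import re

# the documented example expression: prune top-level dirs not starting with 'a'
pattern = re.compile(r"a/.*/my.ts", re.IGNORECASE)

print(bool(pattern.fullmatch("a/sub/dir/my.ts")))  # → True
print(bool(pattern.fullmatch("b/sub/dir/my.ts")))  # → False
print(bool(pattern.fullmatch("A/SUB/MY.TS")))      # → True, find() ignores case
```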
- property find_cb
callback for finding time-series using a search-expression. Called every time the .find() method is called. The signature of the callback function should be fcb(search_expr: str)->TsInfoVector
Examples:
>>> from shyft import time_series as sa
>>> def find_ts(search_expr: str) -> sa.TsInfoVector:
...     print('find:', search_expr)
...     r = sa.TsInfoVector()
...     tsi = sa.TsInfo()
...     tsi.name = 'some_test'
...     r.append(tsi)
...     return r
>>> # and then bind the function to the callback
>>> dtss = sa.DtsServer()
>>> dtss.find_cb = find_ts
>>> dtss.set_listening_port(20000)
>>> # more code to invoke .find etc.
- Type:
Callable[[str],TsInfoVector]
- fire_cb((DtsServer)self, (StringVector)msg, (UtcPeriod)rp) TsVector :
testing fire cb from c++
- flush_cache((DtsServer)self, (StringVector)ts_ids) None :
flushes the specified ts_ids from the cache. This only has effect for ts-ids that are in the cache; non-existing items are ignored
- Parameters:
ts_ids (StringVector) – a list of time-series ids to flush out
- flush_cache_all((DtsServer)self) None :
flushes all items out of the cache (cache_stats remain untouched)
- property geo_ts_read_cb
Callback for reading the geo_ts db. Called every time there is a need for geo_ts not stored in the cache. The signature of the callback function should be grcb(cfg: GeoTimeSeriesConfiguration, slice: GeoSlice)->GeoMatrix
- Type:
Callable[[GeoTimeSeriesConfiguration,GeoSlice],GeoMatrix]
- property geo_ts_store_cb
callback for storing to the geo_ts db. Called every time the client.store_geo_ts() method is called. The signature of the callback function should be gscb(cfg: GeoTimeSeriesConfiguration, tsm: GeoMatrix, replace: bool)->None
- Type:
Callable[[GeoTimeSeriesConfiguration,GeoMatrix,bool],None]
- get_container_names((DtsServer)self) StringVector :
Return a list of the names of containers available on the server
- get_geo_db_ts_info((DtsServer)self) GeoTimeSeriesConfigurationVector :
Returns the configured geo-ts data-bases on the server, so queries can be specified and formulated
- Returns:
A strongly typed list of GeoTimeseriesConfiguration
- Return type:
GeoTimeseriesConfigurationVector
See also
.geo_evaluate()
- get_listening_ip((DtsServer)self) str :
Get the current ip listen address
- Returns:
listening ip; note that 0.0.0.0 means listening on all interfaces
- get_listening_port((DtsServer)self) int :
returns the port number it's listening at for serving incoming requests
- get_max_connections((DtsServer)self) int :
returns the maximum number of connections to be served concurrently
- property graceful_close_timeout_ms
how long to let a connection linger after the message is processed, to allow flushing out the reply to the client. Ref. dlib.net/dlib/server/server_kernel_abstract.h.html
- Type:
int
- is_running((DtsServer)self) bool :
true if server is listening and running
See also
start_server(),process_messages(msec)
- process_messages((DtsServer)self, (object)msec) None :
wait and process messages for the specified number of msec before returning. The dtss-server is started if not already running.
- Parameters:
msec (int) – number of millisecond to process messages
Notes
this method releases GIL so that callbacks are not blocked when the
dtss-threads perform the callback
See also
cb,start_server(),is_running,clear()
- read((DtsServer)self, (StringVector)ts_ids, (UtcPeriod)read_period[, (object)use_ts_cached_read=True[, (object)update_ts_cache=True]]) TsVector :
Reads from the db-backend/cache the specified ts_ids covering the read_period. NOTE: the ts-backing-store, either cached or by read, will return data for:
at least the period needed to evaluate the read_period
In case of cached result, this will currently involve the entire matching cached time-series segment.
- Parameters:
ts_ids (StringVector) – a list of shyft-urls, like shyft://abc/def
read_period (UtcPeriod) – the valid non-zero length period that the binding service should read from the backing ts-store/ts-service
use_ts_cached_read (bool) – use of server-side ts-cache
update_ts_cache (bool) – when reading time-series, also update the cache with the data
- Returns:
tsvector. an evaluated list of point time-series in the same order as the input list
- Return type:
TsVector
See also
DtsServer
- remove_auth_tokens((DtsServer)self, (StringVector)tokens) None :
removes auth tokens; if all available tokens are removed, the auth requirement for clients is deactivated
- Parameters:
tokens (StringVector) – list of tokens, where each token is like Basic dXNlcjpwd2Q=, i.e. base64-encoded user:pwd
- remove_container((DtsServer)self, (object)container_url[, (object)delete_from_disk=False]) None :
remove an internal shyft store container, or an external container, from the dtss-server. A container_url of the form shyft://<container>/ will remove internal containers; all other urls will be forwarded to the remove_container_cb callback on the server. Removal of containers can take a long time to finish.
- Parameters:
container_url (str) – url of the container as per the url definition above
delete_from_disk (bool) – Flag to indicate if the container should be deleted from disk
- property remove_container_cb
callback for removing external containers. Called when the .remove_container() method is called with a non-shyft container url. The signature of the callback function should be rcb(container_url: str, remove_from_disk: bool)->None
- Type:
Callable[[str, bool],None]
- set_auto_cache((DtsServer)self, (object)active) None :
set auto caching of all reads active or passive. Default is off, and caching must be done through explicit calls to .cache(ts_ids, ts_vector)
- Parameters:
active (bool) – if set True, all reads will be put into cache
- set_can_remove((DtsServer)self, (object)can_remove) None :
Set whether the DtsServer supports removing time-series. The default setting is false, so unless this method is called with true as the argument, the server will not allow removing data using DtsClient.remove.
- Parameters:
can_remove (bool) – true if the server should allow removing data, false otherwise
- set_container((DtsServer)self, (object)name, (object)root_dir[, (object)container_type=''[, (DtssCfg)cfg=DtssCfg()]]) None :
sets (or replaces) an internal shyft store container on the dtss-server. All ts-urls with shyft://<container>/ will resolve to this internal time-series storage for find/read/store operations
- Parameters:
name (str) – Name of the container as per the url definition above
root_dir (str) – A valid directory root for the container
container_type (str) – one of (‘ts_rdb’, ‘ts_ldb’,’ts_db’), container type to add.
Notes
currently this call should only be used when the server is not processing messages, i.e. before starting, or after stopping, listening operations
- set_geo_ts_db((DtsServer)self, (GeoTimeSeriesConfiguration)geo_ts_cfg) None :
This adds/replaces a geo-ts database on the server, so that geo-related requests can be resolved by means of this configuration and the geo-related callbacks.
- Parameters:
geo_ts_cfg (GeoTimeseriesConfiguration) – The configuration for the new geo-ts data-base
- set_listening_ip((DtsServer)self, (object)ip) None :
Set the listening ip address to a specific interface ip. Must be called prior to the start server method
- Parameters:
ip (str) – ip address, like 127.0.0.1 for the local-host-only interface
- set_listening_port((DtsServer)self, (object)port_no) None :
set the listening port for the service
- Parameters:
port_no (int) – a valid and available tcp-ip port number to listen on, typically 20000.
- Returns:
nothing.
- Return type:
None
- set_master_slave_mode((DtsServer)self, (object)ip, (object)port, (object)master_poll_time, (int)unsubscribe_threshold, (object)unsubscribe_max_delay) None :
Set master-slave mode, redirecting all IO calls on this dtss to the master ip:port dtss. This instance of the dtss is kept in sync with changes done on the master, using subscriptions to changes on the master. Calculations and caches are still done locally, unloading the computational efforts from the master.
- Parameters:
ip (str) – The ip address where the master dtss is running
port (int) – The port number for the master dtss
master_poll_time (time) – [s] max time between each update from the master; typically 0.1 s is ok
unsubscribe_threshold (int) – minimum number of unsubscribed time-series before also unsubscribing from the master
unsubscribe_max_delay (int) – maximum time to delay unsubscriptions, regardless number
- set_max_connections((DtsServer)self, (object)max_connect) None :
limits simultaneous connections to the server (it's multithreaded, and uses one thread per connection)
- Parameters:
max_connect (int) – maximum number of connections before denying more connections
See also
get_max_connections()
- start_async((DtsServer)self) int :
(deprecated, use start_server) starts the server listening in the background, processing messages
See also
set_listening_port(port_no),set_listening_ip,is_running,cb,process_messages(msec)
- Returns:
port_no. the port used for listening operations, either the value set by set_listening_port, or, if it was unspecified, a new available port
- Return type:
int
Notes
you should have set up the callback cb before calling start_async. Also notice that processing will acquire the GIL, so you need to release the GIL to allow messages to be processed.
See also
process_messages(msec)
- start_server((DtsServer)self) int :
starts the server listening in the background, processing messages
See also
set_listening_port(port_no),set_listening_ip,is_running,cb,process_messages(msec)
- Returns:
port_no. the port used for listening operations, either the value set by set_listening_port, or, if it was unspecified, a new available port
- Return type:
int
Notes
you should have set up the callback cb before calling start_server. Also notice that processing will acquire the GIL, so you need to release the GIL to allow messages to be processed.
See also
process_messages(msec)
- start_web_api((DtsServer)self, (object)host_ip, (object)port, (object)doc_root[, (object)fg_threads=2[, (object)bg_threads=4[, (object)tls_only=False]]]) int :
starts the dtss web-api on the specified host_ip, port, doc_root and number of threads
- Parameters:
host_ip (str) – 0.0.0.0 for any interface, 127.0.0.1 for local only etc.
port (int) – port number to serve the web_api on, ensure it’s available!
doc_root (str) – directory from which we will serve http/https documents, like index.html etc.
fg_threads (int) – number of web-api foreground threads, typically 1-4 depending on load
bg_threads (int) – number of long-running background worker threads to serve dtss-requests etc.
tls_only (bool) – default false, set to true to enforce tls sessions only.
- Returns:
port. the real port number used; if 0 is passed as port, it is auto-allocated
- Return type:
int
- stop_server((DtsServer)self[, (object)timeout=1000]) None :
stop serving connections, gracefully.
See also
start_server()
- stop_web_api((DtsServer)self) None :
Stops any ongoing web-api service
- store((DtsServer)self, (TsVector)tsv, (StorePolicy)store_policy) None :
Store the time-series in the ts-vector in the dtss backend, i.e. the time-series fragment data passed. If store_policy.strict == True, it is semantically stored as if:
first erasing the existing stored points in the range of ts.time_axis().total_period()
then inserting the points of the ts.
Thus, only the parts of the time-series covered by the passed ts-fragment are modified. If there is no previously existing time-series, it merely stores the ts-fragment as the initial content and definition of the time-series.
When creating a time-series the 1st time, pay attention to the time-axis and point-interpretation, as these remain the properties of the newly created time-series. Storing 15min data to a time-series initially defined as an hourly series will raise an exception. On the other hand, variable-interval time-series are generic and will accept any time-resolution to be stored.
If store_policy.strict == False: the passed time-series fragment is interpreted as an f(t), and projected to the time-axis time-points/intervals of the target time-series. If the target time-series is a stair-case type (POINT_AVERAGE_VALUE), then the true average of the passed time-series fragment is used to align with the target. If the target time-series is a linear type (POINT_INSTANT_VALUE), then the f(t) of the passed time-series fragment at the time-points of the target series is used.

store_policy.recreate == True is used to replace the entire definition of any previously stored time-series. This is semantically as if erasing the previously stored time-series and replacing its entire content and definition, starting fresh with the newly passed time-series.

store_policy.best_effort == True or False controls how logical errors are handled. If best_effort is set to True, then all time-series are attempted stored, and if any fail, the returned value of the function will be a non-empty list of diagnostics identifying those that failed. If best_effort is set to False, then an exception is raised on the first item that fails, and the remaining items are not stored.

The time-series should be created like this, with a url and a concrete point-ts:
>>> a = sa.TimeSeries(ts_url, ts_points)
>>> tsv.append(a)
- Parameters:
tsv (TsVector) – ts-vector with time-series, url-reference and values to be stored at dtss server
store_policy (StorePolicy) – Determines how to project the passed time-series fragments to the backend stored time-series
- Returns:
diagnostics. For any failed items, normally empty
- Return type:
TsDiagnosticsItemList
See also
TsVector
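The strict (store_policy.strict == True) erase-then-insert semantics described above can be illustrated stand-alone with a plain dict of time→value points; this is our own illustration, not the shyft implementation:

```python
def strict_store(existing: dict, fragment: dict, period: tuple) -> dict:
    """Erase existing points inside the fragment's total_period [start, end), then insert the fragment."""
    start, end = period
    kept = {t: v for t, v in existing.items() if not (start <= t < end)}
    kept.update(fragment)
    return dict(sorted(kept.items()))

existing = {0: 1.0, 3600: 2.0, 7200: 3.0}
fragment = {3600: 9.0}  # a fragment whose total_period covers [3600, 7200)
print(strict_store(existing, fragment, (3600, 7200)))  # → {0: 1.0, 3600: 9.0, 7200: 3.0}
```

Points outside the fragment's period (here 0 and 7200) are untouched; points inside it are replaced by the fragment's points.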
- property store_ts_cb
callback for storing time-series. Called every time the .store_ts() method is called and non-shyft urls are passed. The signature of the callback function should be scb(tsv: TsVector)->None
Examples:
>>> from shyft import time_series as sa
>>> def store_ts(tsv: sa.TsVector) -> None:
...     print('store:', len(tsv))
...     # each member is a bound ref_ts with an url
...     # extract the url, decode and store
...     return
>>> # and then bind the function to the callback
>>> dtss = sa.DtsServer()
>>> dtss.store_ts_cb = store_ts
>>> dtss.set_listening_port(20000)
>>> # more code to invoke .store_ts etc.
- Type:
Callable[[TsVector],None]
- swap_container((DtsServer)self, (object)container_name_a[, (object)container_name_b=False]) None :
Swap the backend storage for containers a and b. The content of a and b should be equal prior to the call to ensure the wanted semantics, as well as cache correctness. This is the case if a is immutable, and copied to b prior to the operation. If a is not permanently immutable, immutability has to be ensured at least for the time the copy/swap operation takes. The intended purpose is to support migration and moving ts-db backends. When the swap is done, remove_container can be used for the container that is redundant. A typical operation is: copy a -> a_tmp, then swap(a, a_tmp), then remove(shyft://a_tmp, True)
- Parameters:
container_name_a (str) – Name of container a
container_name_b (str) – Name of container b
- class shyft.time_series.DtssCfg
Bases:
instance
Configuration for google level db specific parameters.
Each parameter has reasonable defaults; have a look at the google level db documentation for the effect of max_file_size, write_buffer_size and compression. The ppf remains constant once the db is created (any changes will be ignored). The others can be changed on persisted/existing databases.
About compression: it turns out that although very effective for a lot of time-series, it has a single-thread performance cost of 2..3x native read/write performance due to compression/decompression.
However, for geo dtss we are using multithreaded writes, so performance is limited by the io-capacity, and compression might be set to true for those kinds of scenarios.
- __init__((DtssCfg)self) None
- __init__( (DtssCfg)self, (object)ppf, (object)compress, (object)max_file_size, (object)write_buffer_size [, (object)log_level=200 [, (object)test_mode=0 [, (object)ix_cache=0 [, (object)ts_cache=0]]]]) -> None :
construct a DtssCfg with all values specified
- property compression
(default False), using snappy compression; could reduce storage by a factor of 3, at a similar cost in performance
- Type:
bool
- property ix_cache
low-level index-cache, could be useful when working with large compressed databases
- Type:
int
- property log_level
default warn(200), trace(-1000),debug(0),info(100),error(300),fatal(400)
- Type:
int
- property max_file_size
(default 100Mega), chosen to give a reasonable number of files for storing time-series
- Type:
int
- property ppf
(default 1024) ts-points per fragment (e.g. key/value); how a large ts is chunked into fragments. Read/write operations to the key-value storage are in fragment sizes.
- Type:
int
- property test_mode
for internal use only, should always be set to 0(the default)
- Type:
int
- property ts_cache
low-level data-cache, could be useful in case of very large compressed databases
- Type:
int
- property write_buffer_size
(default 10Mega), to balance write io-activity.
- Type:
int
Class DtsClient
- class shyft.time_series.DtsClient
Bases:
instance
The client side part of the distributed time series system(DTSS).
The DtsClient communicates with the DtsServer using an efficient raw socket protocol with boost binary serialization. A typical operation would be that the DtsClient forwards a TsVector (that represents lists and structures of time-series expressions) to the DtsServer(s), which takes care of binding unbound symbolic time-series, evaluates, and returns the results back to the DtsClient. This class is closely related to the
DtsServer
and a useful reference is also TsVector.
Best practice for client/server is to use the cache, following two simple rules (the default):
Always caching writes (because then consumers get it fresh and fast).
Always use caching reads(utilize and maintain the adaptive cache).
There are only two known, very rare and special, scenarios where this does not apply. Uncached writes can be useful when loading a large initial content of time-series into the db. Caching reads should be turned off when using a 3rd party dtss backend extension, where the 3rd party db is written/modified outside the control of the dtss.

Also note that the caching works with the ts-terminals, not the result of the expressions. When reading time-series expressions, such as ts = ts1 - ts2, the cache contains the ts-terminals (here, ts1 and ts2), not the expression itself (ts).

The .cache_stats property provides cache statistics for the server. The cache can be flushed, which is useful for some special cases of loading data outside the cache.
- __init__((DtsClient)self, (object)host_port[, (object)auto_connect=True[, (object)timeout_ms=1000]]) None :
- Constructs a dts-client with the specified host_port parameter.
A connection is immediately made to the server at the specified port. If no such connection can be made, it raises a RuntimeError.
host_port (string): a string of the format ‘host:portnumber’, e.g. ‘localhost:20000’
auto_connect (bool): default True, connection per call. If false, the connection lasts the lifetime of the object unless explicitly closed/reopened
timeout_ms (int): default 1000 ms, used as timeout for connect/reconnect/close operations
- __init__( (DtsClient)self, (StringVector)host_ports, (object)auto_connect, (object)timeout_ms) -> None :
Constructs a dts-client with the specified host_ports parameters. A connection is immediately made to the server at the specified port. If no such connection can be made, it raises a RuntimeError. If several servers are passed, the .evaluate and .percentiles functions will partition the ts-vector between the provided servers and scale out the computation
host_ports (StringVector): a list of strings of the format ‘host:portnumber’, e.g. ‘localhost:20000’
auto_connect (bool): default True, connection pr. call. if false, connection last lifetime of object unless explicitly closed/reopened
timeout_ms (int): default 1000ms, used for timeout connect/reconnect/close operations
- add_geo_ts_db((DtsClient)self, (GeoTimeSeriesConfiguration)geo_cfg) None :
Adds a new geo time-series database to the dtss-server with the given specifications
geo_cfg (GeoTimeSeriesConfiguration): the configuration to be added to the server specifying the dimensionality etc.
See also
.get_geo_db_ts_info()
- property auto_connect
If true, connections are made as needed, and kept short; otherwise the connection is externally managed.
- Type:
bool
- cache_flush((DtsClient)self) None :
Flush the cache (including statistics) on the server. This can be useful in scenarios where cache_on_write=False is used in the store operations.
- property cache_stats
Get the cache_stats (including statistics) on the server.
- Type:
- close((DtsClient)self[, (object)timeout_ms=1000]) None :
Close the connection. If auto_connect is enabled it will automatically reopen if needed.
- property compress_expressions
If True, the expressions are compressed before sending to the server. For expressions of some size, like 100 elements, with expression depth 100 (e.g. nested sums), this can speed up the transmission by a factor of 3.
- Type:
bool
- property connections
Get remote server connections.
- Type:
int
- evaluate((DtsClient)self, (TsVector)ts_vector, (UtcPeriod)utcperiod[, (object)use_ts_cached_read=True[, (object)update_ts_cache=True[, (UtcPeriod)clip_result=[not-valid-period>]]]) TsVector :
Evaluates the expressions in the ts_vector. If the expression includes unbound symbolic references to time-series, these time-series will be passed to the binding service callback on the server side, passing on the specified utcperiod.
- NOTE: That the ts-backing-store, either cached or by read, will return data for:
at least the period needed to evaluate the utcperiod
In case of cached result, this will currently involve the entire matching cached time-series segment.
- In particular, this means that the returned result could be larger than the specified utcperiod, unless you specify clip_result
Other available methods, such as using an expression (x.average(ta)) that includes a time-axis, can be used to exactly control the returned result size. Also note that the semantics of utcperiod is to ensure that enough data is read from the backend so that the expressions can be evaluated. Use the clip_result argument to clip the time-range of the resulting time-series to fit your needs if needed - this will typically be in scenarios where you have not supplied time-axis operations (unbounded eval), and you are also using caching.
- See also
DtsClient.percentiles()
if you want to evaluate percentiles of an expression.
- Parameters:
ts_vector (TsVector) – a list of time-series (expressions), including unresolved symbolic references
utcperiod (UtcPeriod) – the valid non-zero length period that the binding service should read from the backing ts-store/ts-service
use_ts_cached_read (bool) – use of server-side ts-cache
update_ts_cache (bool) – when reading time-series, also update the cache with the data
clip_result (UtcPeriod) – If supplied, clip the time-range of the resulting time-series to cover evaluation f(t) over this period only
- Returns:
tsvector. an evaluated list of point time-series in the same order as the input list
- Return type:
TsVector
See also
DtsServer
- find((DtsClient)self, (object)search_expression) TsInfoVector :
Find ts information that fully matches the regular search-expression. For the shyft file based backend, take care to specify path elements precisely, so that the number of directories visited is minimised. E.g. a/.*/my.ts will prune out any top level directory not starting with a, but will match any subdirectories below that level. Refer to the python test-suites for a wide range of examples using find. Notice that the regexp search algorithm ignores case. Please be aware that custom backends implemented as python extensions might have different rules.
- Parameters:
search_expression (str) – regular search-expression, to be interpreted by the back-end tss server
- Returns:
ts_info_vector. The search result, as vector of TsInfo objects
- Return type:
TsInfoVector
See also
TsInfo,TsInfoVector
- geo_evaluate((DtsClient)self, (object)geo_ts_db_name, (StringVector)variables, (IntVector)ensembles, (TimeAxis)time_axis, (time)ts_dt, (GeoQuery)geo_range, (object)concat, (time)cc_dt0[, (object)use_cache=True[, (object)update_cache=True]]) GeoTsMatrix :
Evaluates a geo-temporal query on the server, and return the results
- Args:
geo_ts_db_name (string): The name of the geo_ts_db, e.g. arome, ec, arome_cc ec_cc etc.
variables (StringVector): list of variables, like ‘temperature’,’precipitation’. If empty, return data for all available variables
ensembles (IntVector): list of ensembles to read, if empty return all available
time_axis (TimeAxis): return geo_ts where t0 matches time-points of this time-axis. If concat, the ta.total_period().end determines how long to extend latest forecast
ts_dt (time): specifies the length of the time-slice to read from each time-series
geo_range (GeoQuery): Specify polygon to include, empty means all
concat (bool): If true, the geo_ts for each ensemble/point is joined together to form one single time-series, concatenating a slice from each of the forecasts
cc_dt0 (time): concat delta time to skip from beginning of each geo_ts, so you can specify 3h, then select +3h.. slice-end from each forecast
use_cache (bool): use cache if available(speedup)
update_cache (bool): if reading data from backend, also stash it to the cache for faster evaluations
- Returns:
GeoMatrix: r. A matrix where the elements are GeoTimeSeries, accessible using the indices time, variable, ensemble, t0
- See also:
.get_geo_ts_db_info()
- geo_evaluate( (DtsClient)self, (GeoEvalArgs)eval_args [, (object)use_cache=True [, (object)update_cache=True]]) -> GeoTsMatrix :
Evaluates a geo-temporal query on the server, and return the results
- Args:
eval_args (GeoEvalArgs): complete set of arguments for geo-evaluation, including geo-db, scope for variables, ensembles, time and geo-range
use_cache (bool): use cache if available(speedup)
update_cache (bool): if reading data from backend, also stash it to the cache for faster evaluations
- Returns:
GeoMatrix: r. A matrix where the elements are GeoTimeSeries, accessible using the indices time, variable, ensemble, t0
- See also:
.get_geo_ts_db_info()
- geo_store((DtsClient)self, (object)geo_ts_db_name, (GeoMatrix)tsm, (object)replace[, (object)cache=True]) None :
Store a ts-matrix with needed dimensions and data to the specified geo-ts-db
- Parameters:
geo_ts_db_name (string) – The name of the geo_ts_db, e.g. arome, ec, arome_cc ec_cc etc.
tsm (TsMatrix) – A dense matrix with dimensionality complete for variables, ensembles and geo-points, with a flexible time-dimension 1..n
replace (bool) – Replace existing geo time-series with the new ones, does not extend existing ts, replaces them!
cache (bool) – Also put values to the cache
See also
.get_geo_ts_db_info(),.geo_evaluate
- get_container_names((DtsClient)arg1) StringVector :
Return a list of the names of containers available on the server
- get_geo_db_ts_info((DtsClient)self) GeoTimeSeriesConfigurationVector :
Returns the configured geo-ts data-bases on the server, so queries can be specified and formulated
- Returns:
A strongly typed list of GeoTimeseriesConfiguration
- Return type:
GeoTimeseriesConfigurationVector
See also
.geo_evaluate()
- get_server_version((DtsClient)arg1) str :
Returns the server version major.minor.patch string, if multiple servers, the version of the first is returned
- get_transfer_status((DtsClient)self, (object)name, (object)clear_status) TransferStatus :
Get the status of the specified transfer; if clear_status is True, also clear it.
- Parameters:
name (str) – the name of the transfer
clear_status (bool) – if true, also clear the status at the server
- Returns:
transfer_status. The TransferStatus
- get_transfers((DtsClient)self) TransferConfigurationList :
returns configured active transfers.
- Returns:
transfer_configurations. A list of configured Transfers
- get_ts_info((DtsClient)self, (object)ts_url) TsInfo :
Get ts information for a time-series from the backend
- Parameters:
ts_url (str) – Time-series url to lookup ts info for
- Returns:
ts_info. A TsInfo object
- Return type:
TsInfo
See also
TsInfo
- merge_store_ts_points((DtsClient)self, (TsVector)tsv[, (object)cache_on_write=True]) None :
Merge the ts-points supplied in the tsv into the existing time-series on the server side. The effect of each ts is similar to as if:
read ts.total_period() from the ts point store
in memory apply TimeSeries.merge_points(ts) on the read ts
write the resulting merge-result back to the ts-store
This function is suitable for typical data-collection tasks, where the points collected are from an external source and appear as batches that should simply be added to the existing point-set
- Parameters:
tsv (TsVector) – ts-vector with time-series, url-reference and values to be stored at dtss server
cache_on_write (bool) – updates the cache with the result of the merge operation, if set to False, this is skipped, notice that this is only useful for very special use-cases.
- Returns:
None.
See also
TsVector
- percentiles((DtsClient)self, (TsVector)ts_vector, (UtcPeriod)utcperiod, (TimeAxis)time_axis, (IntVector)percentile_list[, (object)use_ts_cached_read=True[, (object)update_ts_cache=True]]) TsVector :
Evaluates the expressions in the ts_vector for the specified utcperiod. If the expression includes unbound symbolic references to time-series, these time-series will be passed to the binding service callback on the serverside.
- Parameters:
ts_vector (TsVector) – a list of time-series (expressions), including unresolved symbolic references
utcperiod (UtcPeriod) – the valid non-zero length period that the binding service should read from the backing ts-store/ts-service
time_axis (TimeAxis) – the time_axis for the percentiles, e.g. a weekly time_axis
percentile_list (IntVector) – a list of percentiles, where -1 means true average, 25=25percentile etc
use_ts_cached_read (bool) – utilize server-side cached results
update_ts_cache (bool) – when reading time-series, also update the server-side cache
- Returns:
tsvector. an evaluated list of percentile time-series in the same order as the percentile input list
- Return type:
TsVector
See also
.evaluate(), DtsServer
- q_ack((DtsClient)self, (object)name, (object)msg_id, (object)diagnostics) None :
After q_get, q_ack confirms back to the process that called q_put that the message is ok/handled.
- Parameters:
name (str) – the name of the queue
msg_id (str) – the msg_id, required to be unique within the current messages kept by the queue
diagnostics (str) – free-text diagnostics to put along with the message; we recommend json format
- q_add((DtsClient)self, (object)name) None :
Add a named queue to the dtss server
- Parameters:
name (str) – the name of the new queue, required to be unique
- q_get((DtsClient)self, (object)name, (time)max_wait) QueueMessage :
Get a message out from the named queue, waiting max_wait time for it if it’s not already there.
- Parameters:
name (str) – the name of the queue
max_wait (time) – max time to wait for a message to arrive
- Returns:
q_msg. A queue message consisting of .info describing the message, and the time-series vector .tsv
- q_list((DtsClient)self) StringVector :
returns a list of defined queues on the dtss server
- q_maintain((DtsClient)self, (object)name, (object)keep_ttl_items[, (object)flush_all=False]) None :
Maintain the queue: remove items that have passed through the queue and are marked as done. To flush absolutely all items, pass flush_all=True.
- Parameters:
name (str) – the name of the queue
keep_ttl_items (bool) – if true, the ttl set for the done messages is respected, and they are not removed until created+ttl has expired
flush_all (bool) – removes all items in the queue and kept by the queue; the queue is emptied
- q_msg_info((DtsClient)self, (object)name, (object)msg_id) QueueMessageInfo :
From the specified queue, fetch info about the specified msg_id. By inspecting the provided information, one can see when the message was created, fetched, and done with.
- Parameters:
name (str) – the name of the queue
msg_id (str) – the msg_id
- Returns:
msg_info. the information/state of the identified message
- q_msg_infos((DtsClient)self, (object)name) QueueMessageInfoVector :
Returns all message information from a queue, including not yet pruned fetched/done messages
- Parameters:
name (str) – the name of the queue
- Returns:
msg_infos. the list of information kept in the named queue
- q_put((DtsClient)self, (object)name, (object)msg_id, (object)description, (time)ttl, (TsVector)tsv) None :
Put a message, as specified with the supplied parameters, into the specified named queue.
- Parameters:
name (str) – the name of the queue
msg_id (str) – the msg_id, required to be unique within the current messages kept by the queue
description (str) – free-text description to put along with the message; we recommend json format
ttl (time) – time-to-live for the message after done; if specified, the q_maintain process can be asked to keep done messages that have ttl
tsv (TsVector) – time-series vector with the wanted payload of time-series
- q_remove((DtsClient)self, (object)name) None :
Removes a named queue from the dtss server, including all data in flight on the queue
- Parameters:
name (str) – the name of the queue
- q_size((DtsClient)self, (object)name) int :
Returns number of queue messages waiting to be read by q_get.
- Parameters:
name (str) – the name of the queue
- Returns:
unread count. number of elements queued up
- remove((DtsClient)arg1, (object)ts_url) None :
Remove a time-series from the dtss backend. The time-series referenced by ts_url is removed from the backend DtsServer. Note that the DtsServer may prohibit removing time-series.
- Parameters:
ts_url (str) – shyft url referencing a time series
- remove_container((DtsClient)self, (object)container_url[, (object)delete_from_disk=False]) None :
Remove an internal shyft store container, or an external container, from the dtss-server. A container_url of the form shyft://<container>/ removes internal containers; all other urls are forwarded to the remove_external_cb callback on the server. Removal of containers can take a long time to finish
- Parameters:
container_url (str) – url of the container as per the url definition above
delete_from_disk (bool) – Flag to indicate if the container should be deleted from disk
- remove_geo_ts_db((DtsClient)self, (object)geo_ts_db_name) None :
Remove the specified geo time-series database from dtss-server
geo_ts_db_name (string): the name of the geo-ts-database to be removed
See also
.get_geo_db_ts_info(), .add_geo_ts_db()
- reopen((DtsClient)self[, (object)timeout_ms=1000]) None :
(Re)open a connection after close or server restart.
- set_container((DtsClient)self, (object)name, (object)relative_path[, (object)container_type='ts_db'[, (DtssCfg)cfg=DtssCfg()]]) None :
Create an internal shyft store container on the dtss-server with a root-relative path. All ts-urls of the form shyft://<container>/ will resolve to this internal time-series storage for find/read/store operations. Will not replace existing containers that have the same name
- Parameters:
name (str) – Name of the container as per the url definition above
relative_path (str) – A valid directory for the container relative to the root path of the server.
container_type (str) – one of ('ts_rdb', 'ts_ldb', 'ts_db'), the container type to add.
- start_transfer((DtsClient)self, (TransferConfiguration)cfg) None :
Starts a transfer on the server using the provided TransferConfiguration.
- Parameters:
cfg (TransferConfiguration) – the configuration for the transfer
- stop_transfer((DtsClient)self, (object)name, (time)max_wait) None :
Stop, cancel, and remove a named transfer.
- Parameters:
name (str) – the name of the transfer to remove
max_wait (time) – time to let existing transfers gracefully finish
- store((DtsClient)self, (TsVector)tsv, (StorePolicy)store_policy) object :
Store the time-series in the ts-vector in the dtss backend. Stores the time-series fragment data passed to the backend. If store_policy.strict == True, it is semantically stored as if
first erasing the existing stored points in the range of ts.time_axis().total_period()
then inserting the points of the ts.
Thus, only the parts of the time-series that are covered by the passed ts-fragment are modified. If there is no previously existing time-series, it merely stores the ts-fragment as the initial content and definition of the time-series.
When creating a time-series the 1st time, pay attention to the time-axis and point-interpretation, as these remain the properties of the newly created time-series. Storing 15min data to a time-series initially defined as an hourly series will raise an exception. On the other hand, variable-interval time-series are generic and will accept any time-resolution to be stored
If store_policy.strict == False, the passed time-series fragment is interpreted as an f(t) and projected to the time-axis time-points/intervals of the target time-series. If the target time-series is a stair-case type (POINT_AVERAGE_VALUE), then the true average of the passed time-series fragment is used to align with the target. If the target time-series is a linear type (POINT_INSTANT_VALUE), then the f(t) of the passed time-series fragment at the time-points of the target series is used. store_policy.recreate == True is used to replace the entire definition of any previously stored time-series; this is semantically as if erasing the previously stored time-series and replacing its entire content and definition, starting fresh with the newly passed time-series. store_policy.best_effort controls how logical errors are handled: if best_effort is True, all time-series are attempted stored, and if any fail, the returned value of the function is a non-empty list of diagnostics identifying those that failed; if best_effort is False, an exception is raised on the first item that fails, and the remaining items are not stored. The time-series should be created like this, with a url and a concrete point-ts:
>>> a = sa.TimeSeries(ts_url, ts_points)
>>> tsv.append(a)
- Parameters:
tsv (TsVector) – ts-vector with time-series, url-reference and values to be stored at dtss server
store_policy (StorePolicy) – Determines how to project the passed time-series fragments to the backend stored time-series
- Returns:
diagnostics. For any failed items, normally empty
- Return type:
TsDiagnosticsItemList
See also
TsVector
- store_ts((DtsClient)self, (TsVector)tsv[, (object)overwrite_on_write=False[, (object)cache_on_write=True]]) None :
Store the time-series in the ts-vector in the dtss backend. Stores the time-series fragments data passed to the backend. It is semantically stored as if
first erasing the existing stored points in the range of ts.time_axis().total_period()
then inserting the points of the ts.
Thus, only the parts of the time-series that are covered by the passed ts-fragment are modified. If there is no previously existing time-series, it merely stores the ts-fragment as the initial content and definition of the time-series.
When creating a time-series the 1st time, pay attention to the time-axis and point-interpretation, as these remain the properties of the newly created time-series. Storing 15min data to a time-series initially defined as an hourly series will raise an exception. On the other hand, variable-interval time-series are generic and will accept any time-resolution to be stored
overwrite_on_write = True is used to replace the entire definition of any previously stored time-series. This is semantically as if erasing the previously stored time-series and replacing its entire content and definition, starting fresh with the newly passed time-series. The time-series should be created like this, with a url and a concrete point-ts:
>>> a = sa.TimeSeries(ts_url, ts_points)
>>> tsv.append(a)
- Parameters:
tsv (TsVector) – ts-vector with time-series, url-reference and values to be stored at dtss server
overwrite_on_write (bool) – When True the backend replaces the entire content and definition of any existing time-series with the passed time-series
cache_on_write (bool) – defaults True, if set to False, the cache is not updated, and should only be considered used in very special use-cases.
- Returns:
None.
See also
TsVector
- swap_container((DtsClient)self, (object)container_name_a[, (object)container_name_b=False]) None :
Swap the backend storage for container a and b. The content of a and b should be equal prior to the call to ensure the wanted semantics, as well as cache correctness. This is the case if a is immutable and copied to b prior to the operation. If a is not permanently immutable, immutability has to be ensured at least for the time the copy/swap operation runs. The intended purpose is to support migration and moving ts-db backends. When the swap is done, remove_container can be used for the container that is redundant. A typical operation is copy a -> a_tmp, then swap(a, a_tmp), then remove_container(shyft://a_tmp, True)
- Parameters:
container_name_a (str) – Name of container a
container_name_b (str) – Name of container b
- total_clients = 0
- update_geo_ts_db_info((DtsClient)self, (object)geo_ts_db_name, (object)description, (object)json, (object)origin_proj4) None :
Update info fields of the geo ts db configuration to the supplied parameters
- Parameters:
geo_ts_db_name (string) – The name of the geo_ts_db, e.g. arome, ec, arome_cc, ec_cc etc.
description (str) – The description field of the database
json (str) – The user specified json like string
origin_proj4 (str) – The origin proj4 field update
See also
.get_geo_db_ts_info()
Class StorePolicy
- class shyft.time_series.StorePolicy
Bases:
instance
Determine how DTSS stores time-series, used in context of the DtsClient.
- __init__((StorePolicy)self) None
- __init__( (StorePolicy)self, (object)recreate, (object)strict, (object)cache [, (object)best_effort=False]) -> None :
construct object with specified parameters
- property best_effort
try to store all; return diagnostics for logical errors instead of raising an exception
- Type:
bool
- property cache
update cache with new values
- Type:
bool
- property recreate
recreate time-series if it existed, with entirely new definition
- Type:
bool
- property strict
use strict requirements for alignment of time-points. If True (default), then require a perfectly matching time-axis on the incoming ts fragment, and transfer those points to the target. Notice that if the target is a break-point/flexible-interval time-series, then every time-axis passed is a perfect match. If False, use a functional mapping approach, resampling or averaging the passed time-series fragment to align with the target time-series. If the target is a linear time-series, the new ts fragment is evaluated as f(t) for the covered time-points in the target time-series. If the target is a stair-case time-series, then the true average of the new ts fragment, evaluated over the covering/touched periods of the target time-series, is used.
- Type:
bool
Class TransferConfiguration
- class shyft.time_series.TransferConfiguration
Bases:
instance
The transfer configuration describes what time-series to transfer, when, how and where to transfer.
- class HowSpec
Bases:
instance
- __init__()
Raises an exception. This class cannot be instantiated from Python
- property partition_size
maximum number of ts to transfer in one batch
- Type:
int
- property retries
retries on connect/re-send for remote
- Type:
int
- class PeriodSpec
Bases:
instance
- __init__()
Raises an exception. This class cannot be instantiated from Python
- class RemoteSpec
Bases:
instance
- __init__()
Raises an exception. This class cannot be instantiated from Python
- property host
host ip or name
- Type:
str
- property port
host port number
- Type:
str
- class TimeSeriesSpec
Bases:
instance
- __init__()
Raises an exception. This class cannot be instantiated from Python
- property replace_pattern
reg-ex replace pattern
- Type:
str
- property search_pattern
reg-ex search pattern
- Type:
str
- class WhenSpec
Bases:
instance
- __init__()
Raises an exception. This class cannot be instantiated from Python
- property changed
if true, use subscription to transfer when changed
- Type:
bool
- __init__((TransferConfiguration)self) None
- property json
a user specified json info that could be useful for extensions
- Type:
str
- property name
name of the transfer, used for reference to active transfers
- Type:
str
- property period
specification of the time-period to transfer
- Type:
PeriodSpec
- property read_remote
transfer direction; if true, pull from remote, otherwise push
- Type:
bool
- property read_updates_cache
if true, also update cache while reading
- Type:
bool
- property store_policy
used for writing time-series
- Type:
StorePolicy
- property what
time-series to transfer, reg-expr
- Type:
TimeSeriesSpec
- property where
specification for the remote, host, port
- Type:
RemoteSpec
Class TransferStatus
- class shyft.time_series.TransferStatus
Bases:
instance
Transfer status gives insight into the current state of a transfer.
- class ReadError
Bases:
instance
- __init__()
Raises an exception. This class cannot be instantiated from Python
- property code
read failure code
- Type:
LookupError
- property ts_url
ts-url
- Type:
str
- class ReadErrorList
Bases:
instance
A strongly typed list of ReadError
- __init__((ReadErrorList)arg1) None
- __init__( (ReadErrorList)arg1, (ReadErrorList)clone) -> None :
Create a clone.
- append((ReadErrorList)arg1, (object)arg2) None
- extend((ReadErrorList)arg1, (object)arg2) None
- class WriteError
Bases:
instance
- __init__()
Raises an exception. This class cannot be instantiated from Python
- property code
write failure code
- Type:
TsDiagnostics
- property ts_url
ts-url
- Type:
str
- class WriteErrorList
Bases:
instance
A strongly typed list of WriteError
- __init__((WriteErrorList)arg1) None
- __init__( (WriteErrorList)arg1, (WriteErrorList)clone) -> None :
Create a clone.
- append((WriteErrorList)arg1, (object)arg2) None
- extend((WriteErrorList)arg1, (object)arg2) None
- __init__()
Raises an exception. This class cannot be instantiated from Python
- property n_ts_found
number of ts found for transfer; if it is zero and readers are alive, the reader part will continue searching for time-series until they appear
- Type:
int
- property read_errors
read errors
- Type:
ReadErrorList
- property read_speed
points/second
- Type:
float
- property reader_alive
true if reader is still working/monitoring
- Type:
bool
- property remote_errors
remote connection errors
- Type:
StringVector
- property total_transferred
number of points transferred
- Type:
int
- property write_errors
write errors
- Type:
WriteErrorList
- property write_speed
points/second
- Type:
float
- property writer_alive
true if writer is still working/monitoring
- Type:
bool
Class QueueMessage
- class shyft.time_series.QueueMessage
Bases:
instance
A QueueMessage, as returned from DtsClient.q_get(..), consists of the .info part and the payload time-series vector .tsv
- __init__((QueueMessage)arg1) None
- __init__( (object)arg1, (object)msg_id, (object)desc, (time)ttl, (TsVector)tsv) -> object :
constructs a QueueMessage
- Args:
msg_id (str): unique identifier for the message
desc (str): custom description
ttl (time): time the message should live on the queue
tsv (TsVector): timeseries payload
- property info
The information about the message
- property tsv
The time-series vector payload part of the message
Class QueueMessageInfo
- class shyft.time_series.QueueMessageInfo
Bases:
instance
Information about the queue item, such as the state of the item: in-queue, fetched, done. This element is never created by the python user, but is a return type from the dtss queue message info related calls.
- __init__((QueueMessageInfo)self) None
- property description
A user specified description, we recommend json format
- Type:
str
- property diagnostics
Diagnostics supplied when the message was acknowledged done by the receiver (end-to-end ack)
- Type:
str
- property msg_id
The unique id for this message in the live-queue
- Type:
str
Class TsInfo
- class shyft.time_series.TsInfo
Bases:
instance
Gives some information from the backend ts data-store about the stored time-series, that could be useful in some contexts
- __init__((TsInfo)self) None
- __init__( (TsInfo)self, (object)name, (point_interpretation_policy)point_fx, (time)delta_t, (object)olson_tz_id, (UtcPeriod)data_period, (time)created, (time)modified) -> None :
construct a TsInfo with all values specified
- property created
when time-series was created, seconds 1970s utc
- property data_period
the period for data-stored, if applicable
- property delta_t
time-axis steps, in seconds, 0 if irregular time-steps
- property modified
when time-series was last modified, seconds 1970 utc
- property name
the unique name
- property olson_tz_id
empty or time-axis calendar for calendar,t0,delta_t type time-axis
- property point_fx
how to interpret the points, instant value, or average over period
Class CacheStats
- class shyft.time_series.CacheStats
Bases:
instance
Cache statistics for the DtsServer.
- __init__((CacheStats)self) None
- property coverage_misses
number of misses where we did find the time-series id, but the period coverage was insufficient
- Type:
int
- property fragment_count
number of time-series fragments in the cache (greater than or equal to id_count)
- Type:
int
- property hits
number of hits by time-series id
- Type:
int
- property id_count
number of unique time-series identities in cache
- Type:
int
- property misses
number of misses by time-series id
- Type:
int
- property point_count
total number of time-series points in the cache
- Type:
int
Geo-location Time series
The elements in this section integrate the generic time series concepts above with a geo-spatial co-ordinate system. This functionality extends to co-ordinate based queries in the time series storage.
Class GeoPoint
- class shyft.time_series.GeoPoint
Bases:
instance
GeoPoint is commonly used in the shyft::core for representing a 3D point in the terrain model. The primary usage is in geo-located time-series and the interpolation routines.
It is deliberately a primitive point model, aiming for efficiency and simplicity.
Units of x, y, z are metric; z is positive upwards to the sky and represents elevation; x is the east-west axis; y is the south-north axis
- __init__((GeoPoint)arg1) None
- __init__( (GeoPoint)arg1, (object)x, (object)y, (object)z) -> None :
construct a geo_point with x,y,z
- Args:
x (float): meter units
y (float): meter units
z (float): meter units
- __init__( (GeoPoint)arg1, (GeoPoint)clone) -> None :
create a copy
- Args:
clone (GeoPoint): the object to clone
- static difference((GeoPoint)a, (GeoPoint)b) GeoPoint :
returns GeoPoint(a.x - b.x, a.y - b.y, a.z - b.z)
- static distance2((GeoPoint)a, (GeoPoint)b) float :
returns the euclidean distance^2
- static distance_measure((GeoPoint)arg1, (GeoPoint)a, (object)b, (object)p) float :
return sum(a-b)^p
- transform((GeoPoint)self, (object)from_epsg, (object)to_epsg) GeoPoint :
compute transformed point from_epsg to to_epsg coordinate
- Args:
from_epsg (int): interpret the current point as cartesian epsg coordinate
to_epsg (int): the returned points cartesian epsg coordinate system
- Returns:
GeoPoint: new point. The new point in the specified cartesian epsg coordinate system
- transform( (GeoPoint)self, (object)from_proj4, (object)to_proj4) -> GeoPoint :
compute transformed point from_proj4 to to_proj4 coordinate
- Args:
from_proj4 (str): interpret the current point as this proj4 specification
to_proj4 (str): the returned points proj4 specified coordinate system
- Returns:
GeoPoint: new point. The new point in the specified proj4 coordinate system
- property x
east->west
- Type:
float
- static xy_distance((GeoPoint)a, (GeoPoint)b) float :
returns sqrt((a.x - b.x)*(a.x - b.x) + (a.y - b.y)*(a.y - b.y))
- property y
south->north
- Type:
float
- property z
ground->upwards
- Type:
float
- static zscaled_distance((GeoPoint)a, (GeoPoint)b, (object)zscale) float :
sqrt( (a.x - b.x)*(a.x - b.x) + (a.y - b.y)*(a.y - b.y) + (a.z - b.z)*(a.z - b.z)*zscale*zscale)
Class GeoTimeSeries
- class shyft.time_series.GeoTimeSeries
Bases:
instance
A minimal geo-located time-series, a time-series plus a representative 3d mid_point
- __init__((GeoTimeSeries)arg1) None
- __init__( (GeoTimeSeries)arg1, (GeoPoint)mid_point, (TimeSeries)ts) -> None :
Construct a GeoTimeSeries
- Args:
mid_point (GeoPoint): The 3d location representative for ts
ts (TimeSeries): Any kind of TimeSeries
- property mid_point
the mid-point(of an area) for which the assigned time-series is valid
- Type:
GeoPoint
- property ts
the assigned time-series
- Type:
TimeSeries
Class GeoTimeSeriesVector
- class shyft.time_series.GeoTimeSeriesVector
Bases:
instance
- __init__((object)arg1) object :
Create an empty GeoTimeSeriesVector
- __init__( (object)arg1, (list)geo_ts_list) -> object :
Create a GeoTimeSeriesVector from a python list of GeoTimeSeries
- Args:
geo_ts_list (List[GeoTimeSeries]): A list of GeoTimeSeries
- __init__( (object)arg1, (TimeAxis)time_axis, (GeoPointVector)geo_points, (object)np_array, (point_interpretation_policy)point_fx) -> object :
Create a GeoTimeSeriesVector from time-axis,geo-points,2d-numpy-array and point-interpretation
- Args:
time_axis (TimeAxis): time-axis that matches in length to 2nd dim of np_array
geo_points (GeoPointVector): the geo-positions for the time-series, should be of length n_ts
np_array (np.ndarray): numpy array of dtype=np.float64, and shape(n_ts,n_points)
point_fx (point interpretation): one of POINT_AVERAGE_VALUE|POINT_INSTANT_VALUE
- Returns:
GeoTimeSeriesVector: a GeoTimeSeriesVector of length equal to the first np_array dim, n_ts, each element with a geo-point and a time-series carrying the time-axis, values and point_fx
- append((GeoTimeSeriesVector)arg1, (object)arg2) None
- extend((GeoTimeSeriesVector)arg1, (object)arg2) None
- extract_ts_vector((GeoTimeSeriesVector)self) TsVector :
Provides a TsVector of the time-series part of GeoTimeSeries
- Returns:
ts-vector. A TsVector (shallow copy) of the time-series part of the GeoTimeSeriesVector
- Return type:
TsVector
- values_at_time((GeoTimeSeriesVector)self, (time)t) DoubleVector :
The values at the specified time, as a DoubleVector; use .to_numpy() on it to get a np array. This function can be suitable if you are doing area-animated (birds-view) presentations
- Parameters:
t (time) – the time that should be used for getting each value
- Returns:
values. The evaluated geo.ts(t) for all items in the vector
- Return type:
DoubleVector
Class GeoQuery
- class shyft.time_series.GeoQuery
Bases:
instance
A query as a polygon with specified geo epsg coordinate system
- __init__((GeoQuery)arg1) None
- __init__( (GeoQuery)arg1, (object)epsg, (GeoPointVector)points) -> None :
Construct a GeoQuery from specified parameters
- Args:
epsg (int): A valid epsg for the polygon, and also wanted coordinate system
points (GeoPointVector): 3 or more points forming a polygon that is the spatial scope
- property epsg
the epsg coordinate system
- Type:
int
- property polygon
the polygon giving the spatial scope
- Type:
GeoPointVector
Class GeoSlice
- class shyft.time_series.GeoSlice
Bases:
instance
Keeps data that describes a slice into the t0-variable-ensemble-geo, (t,v,e,g), space. It is the result-type of GeoTimeSeriesConfiguration.compute(GeoEvalArgs) and is passed to the geo-db-read callback to specify the wanted time-series to read. Note that the content of a GeoSlice can only be interpreted in terms of the GeoTimeSeriesConfiguration it is derived from. The indices and values of the slice strongly relate to the definition of its geo-ts-db.
- __init__((GeoSlice)arg1) None
- __init__( (GeoSlice)arg1, (IntVector)v, (IntVector)g, (IntVector)e, (UtcTimeVector)t, (time)ts_dt [, (time)skip_dt]) -> None :
Construct a GeoSlice from supplied vectors.
- Args:
v (IntVector): list of variables idx, each defined by GeoTimeSeriesConfiguration.variables[i]
e (IntVector): list of ensembles, each in range 0..GeoTimeSeriesConfiguration.n_ensembles-1
g (IntVector): list of geo-point idx, each defined by GeoTimeSeriesConfiguration.grid.points[i]
t (UtcTimeVector): list of t0-time points, each of them should exist in GeoTimeSeriesConfiguration.t0_times
ts_dt (time): time-length to read from each time-series, we read from [t0+skip_dt.. t0+skip_dt+ts_dt>
skip_dt (time): time-length to skip from start each time-series, we read from [t0+skip_dt.. t0+skip_dt+ts_dt>
- property e
list of ensembles, each in range 0..GeoTimeSeriesConfiguration.n_ensembles-1
- Type:
IntVector
- property g
list of geo-point idx, each defined by GeoTimeSeriesConfiguration.grid.points[i]
- Type:
IntVector
- property skip_dt
time length to skip from start of each time-series, [t0+skip_dt.. t0+skip_dt+ts_dt>
- Type:
time
- property t
list of t0-time points, each of them should exist in GeoTimeSeriesConfiguration.t0_times
- Type:
UtcTimeVector
- property ts_dt
time length to read from each time-series, [t0+skip_dt.. t0+skip_dt+ts_dt>
- Type:
time
- property v
list of variables idx, each defined by GeoTimeSeriesConfiguration.variables[i]
- Type:
IntVector
Class GeoTsMatrix
- class shyft.time_series.GeoTsMatrix
Bases:
instance
GeoTsMatrix is a 4d matrix with index dimensions (t0, variable, ensemble, geo_point), to be understood as a slice of a geo-ts-db (the slice could be the entire db). The element type of the matrix is
GeoTimeSeries
- __init__((GeoTsMatrix)arg1, (object)n_t0, (object)n_v, (object)n_e, (object)n_g) None :
create GeoTsMatrix with specified t0,variables,ensemble and geo-point dimensions
- concatenate((GeoTsMatrix)self, (time)cc_dt0, (time)concat_interval) GeoTsMatrix :
Concatenate all the forecasts in the GeoTsMatrix using supplied parameters
- Parameters:
- Returns:
tsm. A new concatenated geo-ts-matrix
- Return type:
GeoTsMatrix
- evaluate((GeoTsMatrix)self) GeoTsMatrix :
Evaluate all time-series in the matrix.
- Returns:
GeoTsMatrix. A new GeoTsMatrix where all time-series are evaluated
- Return type:
GeoTsMatrix
- extract_geo_ts_vector((GeoTsMatrix)self, (object)t, (object)v, (object)e) GeoTimeSeriesVector :
Given the supplied arguments, return the GeoTimeSeriesVector suitable for constructing a GeoPointSource for hydrology region-environment forcing data
- Parameters:
t (int) – the forecast index, e.g. selects specific forecast, in case of several (t0)
v (int) – the variable index, e.g. selects temperature,precipitation etc.
e (int) – the ensemble index, in case of many ensembles, select specific ensemble
- Returns:
GeoTimeSeriesVector. The GeoTimeSeriesVector for the selected forecast time t, variable and ensemble
- Return type:
GeoTimeSeriesVector
- get_geo_point((GeoTsMatrix)self, (object)t, (object)v, (object)e, (object)g) GeoPoint :
return self[t,v,e,g].mid_point of type GeoPoint
- get_ts((GeoTsMatrix)self, (object)t, (object)v, (object)e, (object)g) TimeSeries :
return self[t,v,e,g] of type TimeSeries
- set_geo_point((GeoTsMatrix)self, (object)t, (object)v, (object)e, (object)g, (GeoPoint)point) None :
performs self[t,v,e,g].mid_point= point
- set_ts((GeoTsMatrix)self, (object)t, (object)v, (object)e, (object)g, (TimeSeries)ts) None :
performs self[t,v,e,g].ts= ts
- property shape
The shape of a GeoTsMatrix in terms of forecasts(n_t0), variables(n_v), ensembles(n_e) and geopoints(n_g)
- Type:
GeoMatrixShape
- transform((GeoTsMatrix)self, (object)variable, (TimeSeries)expression) GeoTsMatrix :
Apply the expression to each time-series of the specified variable.
- Args:
variable (int): the variable index to select the specific variable, use -1 to apply to all
expr (TimeSeries): ts expression, like 2.0*TimeSeries('x'), where x will be substituted with the variable; notice that it is required to be just one unbound time-series with the reference name 'x'.
- Returns:
GeoTsMatrix: GeoTsMatrix. A new GeoTsMatrix, where the time-series for the specified variable is transformed
- transform( (GeoTsMatrix)self, (TsVector)expr_vector) -> GeoTsMatrix :
Apply each expression to the time-series of the corresponding variable.
- Args:
expr_vector (TsVector): ts expressions, like 2.0*TimeSeries('0'), where 0 is substituted with the corresponding variable; note that each expression is required to contain exactly one unbound time-series whose reference name is the variable index as a string, e.g. '0'
- Returns:
GeoTsMatrix: A new GeoTsMatrix, where the time-series for each variable are transformed
Class GeoMatrixShape
- class shyft.time_series.GeoMatrixShape
Bases:
instance
- __init__((GeoMatrixShape)arg1, (object)n_t0, (object)n_v, (object)n_e, (object)n_g) None :
Create with specified dimensionality
- property n_e
number of ensembles
- Type:
int
- property n_g
number of geo points
- Type:
int
- property n_t0
number of t0 time-points, e.g. forecasts
- Type:
int
- property n_v
number of variables
- Type:
int
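As a plain-Python sketch of the dimensionality a GeoMatrixShape describes (illustration only; the flat_index helper and the row-major layout are assumptions for the sketch, not shyft API):

```python
def flat_index(t, v, e, g, n_v, n_e, n_g):
    """Row-major flat index into a (n_t0, n_v, n_e, n_g) matrix."""
    return ((t * n_v + v) * n_e + e) * n_g + g

# a shape like GeoMatrixShape(n_t0=2, n_v=3, n_e=10, n_g=100)
n_t0, n_v, n_e, n_g = 2, 3, 10, 100
total = n_t0 * n_v * n_e * n_g        # 6000 elements in all

print(total)                                   # 6000
print(flat_index(0, 0, 0, 0, n_v, n_e, n_g))   # 0, the first element
print(flat_index(1, 2, 9, 99, n_v, n_e, n_g))  # 5999, the last element
```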
Class GeoGridSpec
- class shyft.time_series.GeoGridSpec
Bases:
instance
A point set for a geo-grid, which does not have to be a regular grid. It serves the role of defining the representative spatial mid-points for a typical spatial grid, e.g. as for arome or ec forecasts, where the origin shape usually is a box. To support a general grid-spec, the optional shapes (equal in number to the mid-points) provide the polygon shape for each individual mid-point.
- __init__((GeoGridSpec)arg1) None
- __init__( (GeoGridSpec)arg1, (object)epsg, (GeoPointVector)points) -> None :
Construct a GeoGridSpec from specified parameters
- Args:
epsg (int): A valid epsg for the spatial points
points (GeoPointVector): 0 or more representative points for the spatial properties of the grid
- __init__( (GeoGridSpec)arg1, (object)epsg, (GeoPointVectorVector)polygons) -> None :
Construct a GeoGridSpec from specified parameters
- Args:
epsg (int): A valid epsg for the spatial points
polygons (GeoPointVectorVector): 0 or more representative shapes as polygons, the mid-points are computed based on shapes
- property epsg
the epsg coordinate system
- Type:
int
- find_geo_match((GeoGridSpec)self, (GeoQuery)geo_query) IntVector :
Finds the points in the grid that are covered by the polygon of the geo_query. Note: currently only the horizontal dimension is considered when matching points.
- Parameters:
geo_query (GeoQuery) – A polygon giving an area to capture
- Returns:
matches. a list of all points that are inside, or on the border of, the specified polygon, in guaranteed ascending point-index order
- Return type:
IntVector
- property points
the representative mid-points of the spatial grid
- Type:
GeoPointVector
- property polygons
the polygons describing the grid, mid-points are centroids of the polygons
- Type:
GeoPointVectorVector
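The horizontal-only matching done by find_geo_match can be sketched in plain Python (illustration only; point_in_polygon and the tuple-based points are assumptions for the sketch, not shyft API, and the ray-casting border handling here is approximate, whereas shyft also includes points exactly on the border):

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def find_geo_match(points, polygon):
    """Indices of grid points inside the polygon, ascending index order."""
    return [i for i, (x, y) in enumerate(points)
            if point_in_polygon(x, y, polygon)]

grid = [(1, 1), (5, 5), (20, 20), (7, 3)]
query = [(0, 0), (10, 0), (10, 10), (0, 10)]  # a 10x10 box
print(find_geo_match(grid, query))  # [0, 1, 3]; (20, 20) is outside
```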
Class GeoEvalArgs
- class shyft.time_series.GeoEvalArgs
Bases:
instance
GeoEvalArgs is used for the geo-evaluate functions.
It describes scope for the geo-evaluate function, in terms of:
the geo-ts database identifier
variables to extract, by names
ensemble members (list of ints)
temporal, using t0 from specified time-axis, + ts_dt for time-range
spatial, using points for a polygon
and optionally:
the concat postprocessing with parameters
- __init__((GeoEvalArgs)arg1) None
- __init__( (GeoEvalArgs)arg1, (object)geo_ts_db_id, (StringVector)variables, (IntVector)ensembles, (TimeAxis)time_axis, (time)ts_dt, (GeoQuery)geo_range, (object)concat, (time)cc_dt0) -> None :
Construct GeoEvalArgs from specified parameters
- Args:
geo_ts_db_id (str): identifies the geo-ts-db, short, as ‘arome’, ‘ec’, as specified with server.add_geo_ts_db(cfg)
variables (StringVector): names of the wanted variables, if empty, return all variables configured
ensembles (IntVector): List of ensembles, if empty, return all ensembles configured
time_axis (TimeAxis): specifies the t0, and .total_period().end is used as concatenation open-end fill-in length
ts_dt (time): specifies the time-length to read from each time-series, t0..t0+ts_dt
geo_range (GeoQuery): the spatial scope of the query, if empty, return all configured
concat (bool): postprocess using concatenated forecast, returns ‘one’ concatenated forecast from many.
cc_dt0 (time): concat lead-time, skip cc_dt0 of each forecast (offsets the slice you select)
- __init__( (GeoEvalArgs)arg1, (object)geo_ts_db_id, (IntVector)ensembles, (TimeAxis)time_axis, (time)ts_dt, (GeoQuery)geo_range, (object)concat, (time)cc_dt0, (TsVector)ts_expressions) -> None :
Construct GeoEvalArgs from specified parameters
- Args:
geo_ts_db_id (str): identifies the geo-ts-db, short, as ‘arome’, ‘ec’, as specified with server.add_geo_ts_db(cfg)
ensembles (IntVector): List of ensembles, if empty, return all ensembles configured
time_axis (TimeAxis): specifies the t0, and .total_period().end is used as concatenation open-end fill-in length
ts_dt (time): specifies the time-length to read from each time-series, t0..t0+ts_dt
geo_range (GeoQuery): the spatial scope of the query, if empty, return all configured
concat (bool): postprocess using concatenated forecast, returns ‘one’ concatenated forecast from many.
cc_dt0 (time): concat lead-time, skip cc_dt0 of each forecast (offsets the slice you select)
ts_expressions (TsVector): expressions to evaluate, where the existing variables are referred to by index number as a string, ex. TimeSeries(‘0’)
- property concat
postprocess using concatenated forecast, returns ‘one’ concatenated forecast from many
- Type:
bool
- property ens
list of ensembles to return; if empty, all configured ensembles are returned
- Type:
IntVector
- property geo_ts_db_id
the name for the config (keep it minimal)
- Type:
str
- property t0_time_axis
specifies the t0, and .total_period().end is used as concatenation open-end fill-in length
- Type:
- property variables
names of the variables to extract; if empty, all configured variables are returned
- Type:
StringVector
Class GeoTimeSeriesConfiguration
- class shyft.time_series.GeoTimeSeriesConfiguration
Bases:
instance
Contains a minimal description to efficiently work with arome/ec forecast data. It defines the spatial, temporal and ensemble dimensions available, and provides means of mapping a GeoQuery to a set of ts_urls that serve as keys for manipulating and assembling forcing input data, for example to the shyft hydrology region-models.
- __init__((GeoTimeSeriesConfiguration)arg1) None
- __init__( (GeoTimeSeriesConfiguration)arg1, (object)prefix, (object)name, (object)description, (GeoGridSpec)grid, (UtcTimeVector)t0_times, (time)dt, (object)n_ensembles, (StringVector)variables [, (object)json=’’ [, (object)origin_proj4=’’]]) -> None :
Construct a GeoTimeSeriesConfiguration from specified parameters
- Args:
prefix (str): ts-url prefix, like shyft:// for internally stored ts, or geo:// for externally stored parts
name (str): A shortest possible unique name of the configuration
description (str): a human readable description of the configuration
grid (GeoGridSpec): specification of the spatial grid
t0_times (UtcTimeVector): list of time-points where time-series are registered, e.g. forecast (t0) times, each the first time-point of a series
dt (time): the (max) length of each geo-ts, so geo_ts total_period is [t0..t0+dt>
n_ensembles (int): number of ensembles available, must be >0, 1 if no ensembles
variables (StringVector): list of minimal keys, representing temperature, precipitation etc
json (str): A user specified json string
origin_proj4 (str): The proj4 string for the origin transform of this dataset
- bounding_box((GeoTimeSeriesConfiguration)self, (GeoSlice)slice) GeoPointVector :
Compute the 3D bounding_box, as two GeoPoints containing the min-max of x,y,z of points in the GeoSlice. Could be handy when generating queries to externally stored geo-ts databases like netcdf etc. See also convex_hull().
- Parameters:
slice (GeoSlice) – a geo-slice with specified dimensions in terms of t0, variables, ensembles,geo-points
- Returns:
bbox. with two GeoPoints, [0] keeping the minimum x,y,z, and [1] the maximum x,y,z
- Return type:
GeoPointVector
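The result can be pictured with a plain-Python sketch (illustration only, using (x, y, z) tuples in place of GeoPoint):

```python
def bounding_box(points):
    """Return [(min_x, min_y, min_z), (max_x, max_y, max_z)] of the points."""
    xs, ys, zs = zip(*points)
    return [(min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))]

pts = [(100.0, 200.0, 5.0), (150.0, 180.0, 12.0), (120.0, 260.0, 3.0)]
print(bounding_box(pts))  # [(100.0, 180.0, 3.0), (150.0, 260.0, 12.0)]
```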
- compute((GeoTimeSeriesConfiguration)self, (GeoEvalArgs)eval_args) GeoSlice :
Compute the GeoSlice from evaluation arguments
- Parameters:
eval_args (GeoEvalArgs) – Specification to evaluate
- Returns:
geo_slice. A geo-slice describing (t0,v,e,g) computed
- Return type:
- convex_hull((GeoTimeSeriesConfiguration)self, (GeoSlice)slice) GeoPointVector :
Compute the 2D convex hull, as a list of GeoPoints describing the smallest convex planar polygon containing all points in the slice wrt. x,y. The returned point sequence is 'closed', i.e. the first and last point in the sequence are equal. See also bounding_box().
- Parameters:
slice (GeoSlice) – a geo-slice with specified dimensions in terms of t0, variables, ensembles,geo-points
- Returns:
hull. containing the sequence of points of the convex hull polygon.
- Return type:
GeoPointVector
- create_geo_ts_matrix((GeoTimeSeriesConfiguration)self, (GeoSlice)slice) GeoTsMatrix :
Creates a GeoTsMatrix(element type is GeoTimeSeries) to hold the values according to dimensionality of GeoSlice
- Parameters:
slice (GeoSlice) – a geo-slice with specified dimensions in terms of t0, variables, ensembles,geo-points
- Returns:
geo_ts_matrix. ready to be filled in with points and time-series
- Return type:
- create_ts_matrix((GeoTimeSeriesConfiguration)self, (GeoSlice)slice) GeoMatrix :
Creates a GeoMatrix (element type is TimeSeries only) to hold the values according to dimensionality of GeoSlice
- Parameters:
slice (GeoSlice) – a geo-slice with specified dimensions in terms of t0, variables, ensembles,geo-points
- Returns:
ts_matrix. ready to be filled in with time-series (they are all empty/null)
- Return type:
GeoMatrix
- property description
the human readable description of this geo ts db
- Type:
str
- find_geo_match_ix((GeoTimeSeriesConfiguration)self, (GeoQuery)geo_query) IntVector :
Returns the indices to the points that matches the geo_query (polygon)
- Parameters:
geo_query (GeoQuery) – The query, polygon that matches the spatial scope
- Returns:
point_indexes. The list of indices that matches the geo_query
- Return type:
IntVector
- property grid
the spatial grid definition
- Type:
- property json
a json formatted string with custom data as needed
- Type:
str
- property n_ensembles
number of ensembles available, range 1..n
- Type:
int
- property name
the name for the config (keep it minimal)
- Type:
str
- property origin_proj4
informative only, if not empty, specifies the origin proj4 of this dataset
- Type:
str
- property prefix
ts-url prefix, like shyft:// for internally stored ts, or geo:// for externally stored parts
- Type:
str
- property t0_times
list of time-points, where there are registered/available time-series
- Type:
- property variables
the list of available properties, like short keys for precipitation,temperature etc
- Type:
StringList
Working with time series
The elements in this section define how code shall behave or are actual tools dealing with time series.
Policies
The elements in this section describe how time series are interpreted.
Class convolve_policy
- class shyft.time_series.convolve_policy
Bases:
enum
Ref. the TimeSeries.convolve_w function; this policy determines how to handle initial conditions:
USE_NEAREST: value(0) is used for all values before value(0), and value(n-1) is used for all values after value(n-1) == 'mass preserving'
USE_ZERO: use zero for all values before value(0) or after value(n-1) == 'shape preserving'
USE_NAN: nan is used for all values outside the ts
BACKWARD: filter is 'backward looking' == boundary handling at the beginning of the ts
FORWARD: filter is 'forward looking' == boundary handling at the end of the ts
CENTER: filter is centered == boundary handling at both ends
- BACKWARD = shyft.time_series._time_series.convolve_policy.BACKWARD
- CENTER = shyft.time_series._time_series.convolve_policy.CENTER
- FORWARD = shyft.time_series._time_series.convolve_policy.FORWARD
- USE_NAN = shyft.time_series._time_series.convolve_policy.USE_NAN
- USE_NEAREST = shyft.time_series._time_series.convolve_policy.USE_NEAREST
- USE_ZERO = shyft.time_series._time_series.convolve_policy.USE_ZERO
- names = {'BACKWARD': shyft.time_series._time_series.convolve_policy.BACKWARD, 'CENTER': shyft.time_series._time_series.convolve_policy.CENTER, 'FORWARD': shyft.time_series._time_series.convolve_policy.FORWARD, 'USE_NAN': shyft.time_series._time_series.convolve_policy.USE_NAN, 'USE_NEAREST': shyft.time_series._time_series.convolve_policy.USE_NEAREST, 'USE_ZERO': shyft.time_series._time_series.convolve_policy.USE_ZERO}
- values = {1: shyft.time_series._time_series.convolve_policy.USE_NEAREST, 2: shyft.time_series._time_series.convolve_policy.USE_ZERO, 4: shyft.time_series._time_series.convolve_policy.USE_NAN, 16: shyft.time_series._time_series.convolve_policy.FORWARD, 32: shyft.time_series._time_series.convolve_policy.CENTER, 64: shyft.time_series._time_series.convolve_policy.BACKWARD}
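A plain-Python sketch of how the value-boundary policies behave for a backward-looking filter (illustration only; convolve_backward is a hypothetical helper, not the shyft implementation, which also supports the FORWARD and CENTER filter directions):

```python
import math

def convolve_backward(values, weights, policy):
    """Backward-looking weighted sum; `policy` fills values before index 0."""
    out = []
    for i in range(len(values)):
        s = 0.0
        for k, w in enumerate(weights):
            j = i - k  # current point and k steps back
            if j >= 0:
                v = values[j]
            elif policy == "USE_NEAREST":
                v = values[0]          # 'mass preserving'
            elif policy == "USE_ZERO":
                v = 0.0                # 'shape preserving'
            else:                      # USE_NAN
                v = math.nan
            s += w * v
        out.append(s)
    return out

ts = [4.0, 8.0, 6.0]
w = [0.5, 0.5]  # average of current and previous value
print(convolve_backward(ts, w, "USE_NEAREST"))  # [4.0, 6.0, 7.0]
print(convolve_backward(ts, w, "USE_ZERO"))     # [2.0, 6.0, 7.0]
```

Only the first value differs between the two policies: it is the only one whose window reaches outside the series.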
Class derivative_method
- class shyft.time_series.derivative_method
Bases:
enum
Ref. the .derivative time-series function, this defines how to compute the derivative of a given time-series
- BACKWARD = shyft.time_series._time_series.derivative_method.BACKWARD
- CENTER = shyft.time_series._time_series.derivative_method.CENTER
- DEFAULT = shyft.time_series._time_series.derivative_method.DEFAULT
- FORWARD = shyft.time_series._time_series.derivative_method.FORWARD
- names = {'BACKWARD': shyft.time_series._time_series.derivative_method.BACKWARD, 'CENTER': shyft.time_series._time_series.derivative_method.CENTER, 'DEFAULT': shyft.time_series._time_series.derivative_method.DEFAULT, 'FORWARD': shyft.time_series._time_series.derivative_method.FORWARD}
- values = {0: shyft.time_series._time_series.derivative_method.DEFAULT, 1: shyft.time_series._time_series.derivative_method.FORWARD, 2: shyft.time_series._time_series.derivative_method.BACKWARD, 3: shyft.time_series._time_series.derivative_method.CENTER}
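A plain-Python sketch of the three difference schemes (illustration only; the real .derivative function works on the time-series point interpretation, not on plain sample lists):

```python
def derivative(t, x, method="CENTER"):
    """Finite-difference derivative at each interior sample of (t, x)."""
    out = []
    for i in range(1, len(x) - 1):
        if method == "FORWARD":       # slope towards the next point
            d = (x[i + 1] - x[i]) / (t[i + 1] - t[i])
        elif method == "BACKWARD":    # slope from the previous point
            d = (x[i] - x[i - 1]) / (t[i] - t[i - 1])
        else:                         # CENTER: slope across both neighbours
            d = (x[i + 1] - x[i - 1]) / (t[i + 1] - t[i - 1])
        out.append(d)
    return out

t = [0.0, 1.0, 2.0, 3.0]
x = [0.0, 1.0, 4.0, 9.0]  # samples of x = t**2
print(derivative(t, x, "FORWARD"))   # [3.0, 5.0]
print(derivative(t, x, "BACKWARD"))  # [1.0, 3.0]
print(derivative(t, x, "CENTER"))    # [2.0, 4.0]
```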
Class extend_fill_policy
- class shyft.time_series.extend_fill_policy
Bases:
enum
Ref. the TimeSeries.extend function; this policy determines how to represent values in a gap:
FILL_NAN: use nan values in the gap
USE_LAST: use the last value before the gap
FILL_VALUE: use a supplied value in the gap
- FILL_NAN = shyft.time_series._time_series.extend_fill_policy.FILL_NAN
- FILL_VALUE = shyft.time_series._time_series.extend_fill_policy.FILL_VALUE
- USE_LAST = shyft.time_series._time_series.extend_fill_policy.USE_LAST
- names = {'FILL_NAN': shyft.time_series._time_series.extend_fill_policy.FILL_NAN, 'FILL_VALUE': shyft.time_series._time_series.extend_fill_policy.FILL_VALUE, 'USE_LAST': shyft.time_series._time_series.extend_fill_policy.USE_LAST}
- values = {0: shyft.time_series._time_series.extend_fill_policy.FILL_NAN, 1: shyft.time_series._time_series.extend_fill_policy.USE_LAST, 2: shyft.time_series._time_series.extend_fill_policy.FILL_VALUE}
Class extend_split_policy
- class shyft.time_series.extend_split_policy
Bases:
enum
Ref. the TimeSeries.extend function; this policy determines where to split/shift from one ts to the other:
LHS_LAST: split at the end of the left-hand series
RHS_FIRST: split at the start of the right-hand series
AT_VALUE: split at a supplied time value
- AT_VALUE = shyft.time_series._time_series.extend_split_policy.AT_VALUE
- LHS_LAST = shyft.time_series._time_series.extend_split_policy.LHS_LAST
- RHS_FIRST = shyft.time_series._time_series.extend_split_policy.RHS_FIRST
- names = {'AT_VALUE': shyft.time_series._time_series.extend_split_policy.AT_VALUE, 'LHS_LAST': shyft.time_series._time_series.extend_split_policy.LHS_LAST, 'RHS_FIRST': shyft.time_series._time_series.extend_split_policy.RHS_FIRST}
- values = {0: shyft.time_series._time_series.extend_split_policy.LHS_LAST, 1: shyft.time_series._time_series.extend_split_policy.RHS_FIRST, 2: shyft.time_series._time_series.extend_split_policy.AT_VALUE}
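A plain-Python sketch of the gap-fill semantics (illustration only; extend here is a hypothetical helper on an integer time-axis using the LHS_LAST split, whereas the real TimeSeries.extend works on time-axes and also supports the other split policies):

```python
import math

def extend(lhs, rhs_start, rhs, fill_policy, fill_value=math.nan):
    """lhs starts at t=0; rhs starts at t=rhs_start (LHS_LAST split).

    Any gap between len(lhs) and rhs_start is filled per fill_policy.
    """
    out = list(lhs)
    for t in range(len(lhs), rhs_start):        # the gap, if any
        if fill_policy == "FILL_NAN":
            out.append(math.nan)
        elif fill_policy == "USE_LAST":
            out.append(lhs[-1])
        else:                                   # FILL_VALUE
            out.append(fill_value)
    out.extend(rhs)
    return out

lhs = [1.0, 2.0, 3.0]
rhs = [10.0, 11.0]  # starts at t=5, leaving a two-step gap
print(extend(lhs, 5, rhs, "USE_LAST"))          # [1.0, 2.0, 3.0, 3.0, 3.0, 10.0, 11.0]
print(extend(lhs, 5, rhs, "FILL_VALUE", -1.0))  # [1.0, 2.0, 3.0, -1.0, -1.0, 10.0, 11.0]
```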
Class interpolation_scheme
- class shyft.time_series.interpolation_scheme
Bases:
enum
Interpolation methods used by TimeSeries.transform
- SCHEME_CATMULL_ROM = shyft.time_series._time_series.interpolation_scheme.SCHEME_CATMULL_ROM
- SCHEME_LINEAR = shyft.time_series._time_series.interpolation_scheme.SCHEME_LINEAR
- SCHEME_POLYNOMIAL = shyft.time_series._time_series.interpolation_scheme.SCHEME_POLYNOMIAL
- names = {'SCHEME_CATMULL_ROM': shyft.time_series._time_series.interpolation_scheme.SCHEME_CATMULL_ROM, 'SCHEME_LINEAR': shyft.time_series._time_series.interpolation_scheme.SCHEME_LINEAR, 'SCHEME_POLYNOMIAL': shyft.time_series._time_series.interpolation_scheme.SCHEME_POLYNOMIAL}
- values = {0: shyft.time_series._time_series.interpolation_scheme.SCHEME_LINEAR, 1: shyft.time_series._time_series.interpolation_scheme.SCHEME_POLYNOMIAL, 2: shyft.time_series._time_series.interpolation_scheme.SCHEME_CATMULL_ROM}
Class point_interpretation_policy
- class shyft.time_series.point_interpretation_policy
Bases:
enum
Determines how to interpret the points in a timeseries when interpreted as a function of time, f(t)
- POINT_AVERAGE_VALUE = shyft.time_series._time_series.point_interpretation_policy.POINT_AVERAGE_VALUE
- POINT_INSTANT_VALUE = shyft.time_series._time_series.point_interpretation_policy.POINT_INSTANT_VALUE
- names = {'POINT_AVERAGE_VALUE': shyft.time_series._time_series.point_interpretation_policy.POINT_AVERAGE_VALUE, 'POINT_INSTANT_VALUE': shyft.time_series._time_series.point_interpretation_policy.POINT_INSTANT_VALUE}
- values = {0: shyft.time_series._time_series.point_interpretation_policy.POINT_INSTANT_VALUE, 1: shyft.time_series._time_series.point_interpretation_policy.POINT_AVERAGE_VALUE}
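A plain-Python sketch of the two interpretations (illustration only; f_of_t is a hypothetical helper, assuming t lies within the time range of the points):

```python
def f_of_t(t, times, values, interpretation):
    """Evaluate a ts with points (times[i], values[i]) at time t."""
    # find the interval [times[i], times[i+1]) containing t
    i = max(k for k in range(len(times)) if times[k] <= t)
    if interpretation == "POINT_AVERAGE_VALUE" or i == len(times) - 1:
        return values[i]  # stair-case: constant over the interval
    # POINT_INSTANT_VALUE: linear between the two surrounding points
    w = (t - times[i]) / (times[i + 1] - times[i])
    return values[i] + w * (values[i + 1] - values[i])

times = [0.0, 10.0, 20.0]
values = [1.0, 3.0, 3.0]
print(f_of_t(5.0, times, values, "POINT_AVERAGE_VALUE"))  # 1.0 (stair-case)
print(f_of_t(5.0, times, values, "POINT_INSTANT_VALUE"))  # 2.0 (linear)
```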
Class trim_policy
- class shyft.time_series.trim_policy
Bases:
enum
Enum deciding whether to trim inwards or outwards: TRIM_IN trims inwards, TRIM_OUT trims outwards, and TRIM_ROUND rounds halfway cases away from zero.
- TRIM_IN = shyft.time_series._time_series.trim_policy.TRIM_IN
- TRIM_OUT = shyft.time_series._time_series.trim_policy.TRIM_OUT
- TRIM_ROUND = shyft.time_series._time_series.trim_policy.TRIM_ROUND
- names = {'TRIM_IN': shyft.time_series._time_series.trim_policy.TRIM_IN, 'TRIM_OUT': shyft.time_series._time_series.trim_policy.TRIM_OUT, 'TRIM_ROUND': shyft.time_series._time_series.trim_policy.TRIM_ROUND}
- values = {0: shyft.time_series._time_series.trim_policy.TRIM_IN, 1: shyft.time_series._time_series.trim_policy.TRIM_OUT, 2: shyft.time_series._time_series.trim_policy.TRIM_ROUND}
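One way to picture the policy, sketched in plain Python as trimming a period to whole multiples of dt (illustration only; the exact trimming semantics in shyft may differ, so treat this as an interpretation):

```python
import math

def trim_period(start, end, dt, policy):
    """Trim [start, end) to whole multiples of dt (illustration)."""
    if policy == "TRIM_IN":    # shrink to the largest aligned period inside
        return math.ceil(start / dt) * dt, math.floor(end / dt) * dt
    if policy == "TRIM_OUT":   # expand to the smallest aligned period covering
        return math.floor(start / dt) * dt, math.ceil(end / dt) * dt
    # TRIM_ROUND: round each end to the nearest boundary,
    # halfway cases away from zero
    def round_half_away(x):
        return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)
    return round_half_away(start / dt) * dt, round_half_away(end / dt) * dt

print(trim_period(250, 1750, 1000, "TRIM_IN"))     # (1000, 1000)
print(trim_period(250, 1750, 1000, "TRIM_OUT"))    # (0, 2000)
print(trim_period(250, 1750, 1000, "TRIM_ROUND"))  # (0, 2000)
```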
Tools
The elements in this section work with time series.
Class KrlsRbfPredictor
- class shyft.time_series.KrlsRbfPredictor
Bases:
instance
Time-series predictor using the KRLS algorithm with radial basis functions.
The KRLS (Kernel Recursive Least-Squares) algorithm is a kernel regression algorithm for approximating data; the implementation used here is from:
This predictor uses KRLS with radial basis functions (RBF). Other related functions:
shyft.time_series.TimeSeries.krls_interpolation()
shyft.time_series.TimeSeries.get_krls_predictor()
Examples:
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from shyft.time_series import (
...     Calendar, utctime_now, deltahours,
...     TimeAxis, TimeSeries,
...     KrlsRbfPredictor
... )
>>>
>>> # setup
>>> cal = Calendar()
>>> t0 = utctime_now()
>>> dt = deltahours(3)
>>> n = 365*8  # one year
>>>
>>> # ready plot
>>> fig, ax = plt.subplots()
>>>
>>> # shyft objects
>>> ta = TimeAxis(t0, dt, n)
>>> pred = KrlsRbfPredictor(
...     dt=deltahours(8),
...     gamma=1e-5,  # NOTE: this should be 1e-3 for real data
...     tolerance=0.001
... )
>>>
>>> # generate data
>>> total_series = 4
>>> data_range = np.linspace(0, 2*np.pi, n)
>>> ts = None  # to store the final data-ts
>>> for i in range(total_series):
...     data = np.sin(data_range) + (np.random.random(data_range.shape) - 0.5)/5
...     ts = TimeSeries(ta, data)
...     training_mse = pred.train(ts)  # train the predictor
...     print(f'training step {i+1:2d}: mse={training_mse}')
...     ax.plot(ta.time_points[:-1], ts.values, 'bx')  # plot data
>>>
>>> # prediction
>>> ts_pred = pred.predict(ta)
>>> ts_mse = pred.mse_ts(ts, points=3)  # mse using 7 point wide filter
>>>                                     # (3 points before/after)
>>>
>>> # plot interpolation/prediction on top of results
>>> ax.plot(ta.time_points[:-1], ts_mse.values, '0.6', label='mse')
>>> ax.plot(ta.time_points[:-1], ts_pred.values, 'r-', label='prediction')
>>> ax.legend()
>>> plt.show()
- __init__((KrlsRbfPredictor)arg1) None
- __init__( (KrlsRbfPredictor)self, (time)dt [, (object)gamma=0.001 [, (object)tolerance=0.01 [, (int)size=1000000]]]) -> None :
Construct a new predictor.
- Args:
dt (float): The time-step in seconds the predictor is specified for. Note that this does not put a limit on time-axes used, but for best results it should be approximately equal to the time-step of time-axes used with the predictor. In addition it should not be too long, else you will get poor results. Try to keep the dt less than a day; 3-8 hours is usually fine.
gamma (float (optional)): Determines the width of the radial basis functions for the KRLS algorithm. Lower values mean wider basis functions; wider basis functions mean faster computation but lower accuracy. Note that the tolerance parameter also affects speed and accuracy. A large value is around 1E-2, and a small value depends on the time step. By using values larger than 1E-2 the computation will probably take too long. Testing has revealed that 1E-3 works great for a time-step of 3 hours, while a gamma of 1E-2 takes a few minutes to compute. Use 1E-4 for a fast and tolerably accurate prediction. Defaults to 1E-3.
tolerance (float (optional)): The krls training tolerance. Lower values make the prediction more accurate, but slower. This typically has less effect than gamma, but is useful for tuning. Usually it should be either 0.01 or 0.001. Defaults to 0.01.
size (int (optional)): The size of the “memory” of the predictor. The default value is usually enough. Defaults to 1000000.
- clear((KrlsRbfPredictor)self) None :
Clear all training data from the predictor.
- mse_ts((KrlsRbfPredictor)self, (TimeSeries)ts[, (int)points=0]) TimeSeries :
Compute a mean-squared error time-series of the predictor relative to the supplied ts.
- Parameters:
ts (TimeSeries) – Time-series to compute mse against.
points (int (optional)) – Positive number of extra points around each point to use for mse. Defaults to 0.
- Returns:
mse_ts. Time-series with mean-squared error values.
- Return type:
See also
KrlsRbfPredictor.predictor_mse, KrlsRbfPredictor.predict
- predict((KrlsRbfPredictor)self, (TimeAxis)ta) TimeSeries :
Predict a time-series for the given time-axis.
Notes
The predictor will predict values outside the range of the values it is trained on, but these
values will often be zero. This may also happen if there are long gaps in the training data
and you try to predict values for the gap. Using wider basis functions partly remedies this,
but makes the prediction overall less accurate.
- Parameters:
ta (TimeAxis) – Time-axis to predict values for.
- Returns:
ts. Predicted time-series.
- Return type:
See also
KrlsRbfPredictor.mse_ts, KrlsRbfPredictor.predictor_mse
- predictor_mse((KrlsRbfPredictor)self, (TimeSeries)ts[, (int)offset=0[, (int)count=18446744073709551615[, (int)stride=1]]]) float :
Compute the predictor's mean-squared prediction error against the first count samples of ts.
- Parameters:
ts (TimeSeries) – Time-series to compute mse against.
offset (int (optional)) – Positive offset from the start of the time-series. Defaults to 0.
count (int (optional)) – Positive number of samples from the time-series to use. Defaults to the maximum value.
stride (int (optional)) – Positive stride between samples from the time-series. Defaults to 1.
See also
KrlsRbfPredictor.predict, KrlsRbfPredictor.mse_ts
- train((KrlsRbfPredictor)self, (TimeSeries)ts[, (int)offset=0[, (int)count=18446744073709551615[, (int)stride=1[, (int)iterations=1[, (object)mse_tol=0.001]]]]]) float :
Train the predictor using samples from ts.
- Parameters:
ts (TimeSeries) – Time-series to train on.
offset (int (optional)) – Positive offset from the start of the time-series. Defaults to 0.
count (int (optional)) – Positive number of samples to use. Defaults to the maximum value.
stride (int (optional)) – Positive stride between samples from the time-series. Defaults to 1.
iterations (int (optional)) – Positive maximum number of times to train on the samples. Defaults to 1.
mse_tol (float (optional)) – Positive tolerance for the mean-squared error over the training data. If the mse after a training session is less than this, further training is skipped. Defaults to 1E-9.
- Returns:
mse. Mean squared error of the predictor relative to the time-series trained on.
- Return type:
float
Class QacParameter
Qac = Quality Assurance Controls
- class shyft.time_series.QacParameter
Bases:
instance
The qac parameter controls how quality checks are done, providing a min-max range plus repeated-values checks. It also provides parameters that control how the replacement/correction values are filled in, like the maximum time-span between two valid neighbour points that allows for linear/extension filling.
- __init__((QacParameter)arg1) None
- __init__( (QacParameter)self, (time)max_timespan, (object)min_x, (object)max_x, (time)repeat_timespan, (object)repeat_tolerance [, (object)constant_filler=nan]) -> None :
a quite complete qac, only lacks repeat_allowed value(s)
- __init__( (QacParameter)self, (time)max_timespan, (object)min_x, (object)max_x, (time)repeat_timespan, (object)repeat_tolerance, (object)repeat_allowed [, (object)constant_filler=nan]) -> None :
a quite complete qac, including one repeat_allowed value
- property constant_filler
this is applied to values that fails quality checks, if no correction ts, and no interpolation/extension is active
- Type:
float
- property max_timespan
maximum timespan between two ok values that allows interpolation or extension of values. If zero, no linear/extend correction is done
- Type:
- property max_v
maximum value or nan for no maximum value limit
- Type:
float
- property min_v
minimum value or nan for no minimum value limit
- Type:
float
- property repeat_allowed
values that are allowed to repeat, within repeat-tolerance
- Type:
bool
- property repeat_timespan
maximum timespan the same value can be repeated (within repeat_tolerance). If zero, no repeat validation is done
- Type:
- property repeat_tolerance
values are considered repeated if they differ by less than repeat_tolerance
- Type:
float
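A plain-Python sketch of the kind of checks the parameter controls (illustration only; qac_check is a hypothetical helper that counts repeated samples, whereas the real parameter measures the repeat span in time, and real correction can also interpolate or use a correction ts):

```python
import math

def qac_check(values, min_x, max_x, repeat_max, repeat_tolerance, constant_filler):
    """Flag out-of-range and over-repeated values; fill with constant_filler."""
    out = []
    repeats = 0
    for i, v in enumerate(values):
        ok = not math.isnan(v) and min_x <= v <= max_x
        if i > 0 and abs(v - values[i - 1]) < repeat_tolerance:
            repeats += 1
            if repeats > repeat_max:  # value repeated too long
                ok = False
        else:
            repeats = 0
        out.append(v if ok else constant_filler)
    return out

vals = [1.0, 99.0, 2.0, 2.0, 2.0, 2.0, 3.0]
# range check [0, 10], allow at most 2 consecutive repeats, fill with -999
print(qac_check(vals, 0.0, 10.0, 2, 1e-9, -999.0))
# [1.0, -999.0, 2.0, 2.0, 2.0, -999.0, 3.0]
```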
Hydrology
The elements in this section are hydrology specific.
Class IcePackingParameters
- class shyft.time_series.IcePackingParameters
Bases:
instance
Parameter pack controlling ice packing computations. See TimeSeries.ice_packing for usage.
- __init__((IcePackingParameters)self, (time)threshold_window, (object)threshold_temperature) None :
Defines a parameter pack for ice packing detection.
- Args:
threshold_window (utctime): Positive number of seconds for the lookback window.
threshold_temperature (float): Floating point threshold temperature.
- __init__( (IcePackingParameters)self, (object)threshold_window, (object)threshold_temperature) -> None :
Defines a parameter pack for ice packing detection.
- Args:
threshold_window (int): Positive integer seconds for the lookback window.
threshold_temperature (float): Floating point threshold temperature.
- property threshold_temperature
The threshold temperature for ice packing to occur. Ice packing will occur when the average temperature in the window period is less than the threshold.
- Type:
float
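A plain-Python sketch of the detection idea (illustration only; ice_packing is a hypothetical helper using a sample-count window, whereas threshold_window is a time span, and the initial partial windows here behave roughly like the ALLOW_INITIAL_MISSING policy below):

```python
def ice_packing(temps, window, threshold):
    """True where the average of the last `window` temperatures < threshold."""
    flags = []
    for i in range(len(temps)):
        lo = max(0, i - window + 1)     # window may be partial at the start
        w = temps[lo:i + 1]
        flags.append(sum(w) / len(w) < threshold)
    return flags

temps = [2.0, -1.0, -3.0, -4.0, 1.0]
print(ice_packing(temps, 3, 0.0))  # [False, False, True, True, True]
```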
Class IcePackingRecessionParameters
- class shyft.time_series.IcePackingRecessionParameters
Bases:
instance
Parameter pack controlling ice packing recession computations. See TimeSeries.ice_packing_recession for usage.
- __init__((IcePackingRecessionParameters)self, (object)alpha, (object)recession_minimum) None :
Defines a parameter pack for ice packing reduction using a simple recession for the water-flow.
- Parameters:
alpha (float) – Recession curve curving parameter.
recession_minimum (float) – Minimum value for the recession.
- property alpha
Parameter controlling the curving of the recession curve.
- Type:
float
- property recession_minimum
The minimum value of the recession curve.
- Type:
float
Class ice_packing_temperature_policy
- class shyft.time_series.ice_packing_temperature_policy
Bases:
enum
Policy enum to specify how TimeSeries.ice_packing handles missing temperature values.
- The enum defines three values:
DISALLOW_MISSING disallows any missing values. With this policy whenever a NaN value is encountered, or the window of values to consider extends outside the range of the time series, a NaN value will be written to the result time-series.
ALLOW_INITIAL_MISSING disallows explicit NaN values, but allows the window of values to consider to extend past the range of the time-series for the initial values.
ALLOW_ANY_MISSING allows the window of values to contain NaN values, averaging what it can. Only if all the values in the window are NaN will the result be NaN.
- ALLOW_ANY_MISSING = shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_ANY_MISSING
- ALLOW_INITIAL_MISSING = shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_INITIAL_MISSING
- DISALLOW_MISSING = shyft.time_series._time_series.ice_packing_temperature_policy.DISALLOW_MISSING
- names = {'ALLOW_ANY_MISSING': shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_ANY_MISSING, 'ALLOW_INITIAL_MISSING': shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_INITIAL_MISSING, 'DISALLOW_MISSING': shyft.time_series._time_series.ice_packing_temperature_policy.DISALLOW_MISSING}
- values = {0: shyft.time_series._time_series.ice_packing_temperature_policy.DISALLOW_MISSING, 1: shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_INITIAL_MISSING, 2: shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_ANY_MISSING}
Class RatingCurveFunction
- class shyft.time_series.RatingCurveFunction
Bases:
instance
Combine multiple RatingCurveSegments into a rating function.
RatingCurveFunction aggregates multiple RatingCurveSegments and routes computation calls to the correct segment based on the water level to compute for.
See also
RatingCurveSegment, RatingCurveParameters
- __init__((RatingCurveFunction)self) None :
Defines a new empty rating curve function.
- __init__( (RatingCurveFunction)self, (RatingCurveSegments)segments [, (object)is_sorted=True]) -> None :
constructs a function from a segment-list
- add_segment((RatingCurveFunction)self, (object)lower, (object)a, (object)b, (object)c) None :
Add a new curve segment with the given parameters.
- See also:
RatingCurveSegment
- add_segment( (RatingCurveFunction)self, (RatingCurveSegment)segment) -> None :
Add a new curve segment as a copy of an existing one.
- See also:
RatingCurveSegment
- flow((RatingCurveFunction)self, (DoubleVector)levels) DoubleVector :
Compute flow for a range of water levels.
- Args:
levels (DoubleVector): Range of water levels to compute flow for.
- flow( (RatingCurveFunction)self, (object)level) -> float :
Compute flow for the given level.
- Args:
level (float): Water level to compute flow for.
- size((RatingCurveFunction)self) int :
Get the number of RatingCurveSegments composing the function.
Class RatingCurveParameters
- class shyft.time_series.RatingCurveParameters
Bases:
instance
Parameter pack controlling rating level computations.
A parameter pack encapsulates multiple RatingCurveFunction’s with time-points. When used with a TimeSeries representing level values it maps computations for each level value onto the correct RatingCurveFunction, which again maps onto the correct RatingCurveSegment for the level value.
See also
RatingCurveSegment, RatingCurveFunction, TimeSeries.rating_curve
- __init__((object)arg1) object :
Defines an empty RatingCurveParameters instance
- __init__( (object)arg1, (RatingCurveTimeFunctions)t_f_list) -> object :
create parameters in one go from list of RatingCurveTimeFunction elements
- Args:
t_f_list (RatingCurveTimeFunctions): a list of RatingCurveTimeFunction elements
- add_curve((RatingCurveParameters)self, (time)t, (RatingCurveFunction)curve) RatingCurveParameters :
Add a curve to the parameter pack.
- Parameters:
t (time) – First time-point the curve is valid for.
curve (RatingCurveFunction) – RatingCurveFunction to add at t.
- Returns:
self, to allow chained construction
- Return type:
RatingCurveParameters
- flow((RatingCurveParameters)self, (time)t, (object)level) float :
Compute the flow at a specific time point.
- Args:
t (utctime): Time-point of the level value.
level (float): Level value at t.
- Returns:
float: flow. Flow corresponding to the input level at t; nan if the level is less than the least water level of the first segment, or if t is before the time of the first rating curve function.
- flow( (RatingCurveParameters)self, (TimeSeries)ts) -> DoubleVector :
Compute the flow for a time series of level values.
- Args:
ts (TimeSeries): Time series of level values.
- Returns:
DoubleVector: flow. Flow corresponding to the input levels of the time series; nan where the level is less than the least water level of the first segment, and for time-points before the first rating curve function.
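The time-dependent selection described above can be sketched as follows. This is illustrative Python, not shyft code: each function is taken to be active from its time-point until the next one, and queries before the first time-point yield nan. The epoch times and flow functions are hypothetical.

```python
import bisect
import math

# Hypothetical (t, function) pairs standing in for RatingCurveTimeFunctions.
curves = [
    (1000, lambda h: 1.0 * h ** 2),
    (2000, lambda h: 2.0 * h ** 2),
]

def flow_at(t: int, level: float) -> float:
    """Pick the curve with the greatest time-point <= t, then evaluate it."""
    times = [c[0] for c in curves]
    i = bisect.bisect_right(times, t) - 1
    if i < 0:
        return math.nan  # before the first rating-curve function
    return curves[i][1](level)

print(flow_at(1500, 3.0))  # first curve active: 1.0*3^2 = 9.0
print(flow_at(2500, 3.0))  # second curve active: 2.0*3^2 = 18.0
```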
Class RatingCurveSegment
- class shyft.time_series.RatingCurveSegment
Bases:
instance
Represent a single rating-curve equation.
The rating curve function is a*(h - b)^c, where a, b, and c are parameters for the segment and h is the water level to compute flow for. Additionally, there is a lower parameter giving the least water level the segment is valid for. Seen separately, a segment is considered valid for any level greater than lower. Multiple segments are gathered into a RatingCurveFunction to represent a set of different rating functions for different levels. Related classes are RatingCurveFunction, RatingCurveParameters
- __init__((RatingCurveSegment)self) None
- __init__( (RatingCurveSegment)self, (object)lower, (object)a, (object)b, (object)c) -> None :
Defines a new RatingCurveSegment with the specified parameters
- property a
Parameter a
- Type:
float
- property b
Parameter b
- Type:
float
- property c
Parameter c
- Type:
float
- flow((RatingCurveSegment)self, (DoubleVector)levels[, (int)i0=0[, (int)iN=18446744073709551615]]) DoubleVector :
Compute the flow for a range of water levels
- Args:
levels (DoubleVector): Vector of water levels
i0 (int): first index to use from levels, defaults to 0
iN (int): first index _not_ to use from levels, defaults to std::size_t maximum.
- Returns:
DoubleVector: flow. Vector of flow values.
- flow( (RatingCurveSegment)self, (object)level) -> float :
Compute the flow for the given water level.
- Notes:
There is _no_ check that the level is valid. It's up to the user to call with a correct level.
- Args:
level (float): water level
- Returns:
float: flow. The flow for the given water level.
- property lower
Least valid water level. Not mutable after constructing a segment.
- Type:
float
- valid((RatingCurveSegment)self, (object)level) bool :
Check if a water level is valid for the curve segment
- Args:
level (float): water level
- Returns:
valid. True if level is greater or equal to lower
- Return type:
bool
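A minimal sketch of the segment behaviour described above, assuming the a*(h - b)^c formula from the class description (plain Python with made-up parameter values, not the actual shyft implementation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # lower is not mutable after construction
class Segment:
    lower: float  # least water level the segment is valid for
    a: float
    b: float
    c: float

    def valid(self, level: float) -> bool:
        # True if level is greater than or equal to lower
        return level >= self.lower

    def flow(self, level: float) -> float:
        # like RatingCurveSegment.flow, no validity check is performed here
        return self.a * (level - self.b) ** self.c

s = Segment(lower=0.0, a=2.0, b=0.5, c=1.5)
print(s.valid(1.0))  # True
print(s.flow(1.0))   # 2.0*(1.0 - 0.5)^1.5, about 0.7071
```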
Class RatingCurveSegments
- class shyft.time_series.RatingCurveSegments
Bases:
instance
A typed list of RatingCurveSegment, used to construct RatingCurveParameters.
- __init__((RatingCurveSegments)self) None
- __init__( (RatingCurveSegments)arg1, (RatingCurveSegments)clone_me) -> None
- append((RatingCurveSegments)arg1, (object)arg2) None
- extend((RatingCurveSegments)arg1, (object)arg2) None
Class RatingCurveTimeFunction
- class shyft.time_series.RatingCurveTimeFunction
Bases:
instance
Composed of time t and RatingCurveFunction
- __init__((RatingCurveTimeFunction)self) None :
Defines an empty pair (t, f)
- __init__( (RatingCurveTimeFunction)self, (time)t, (RatingCurveFunction)f) -> None :
Construct an object with function f valid from time t
- Args:
t (int): epoch time in 1970 utc [s]
f (RatingCurveFunction): the function
- property f
the rating curve function
- Type:
RatingCurveFunction
Class RatingCurveTimeFunctions
- class shyft.time_series.RatingCurveTimeFunctions
Bases:
instance
A typed list of RatingCurveTimeFunction elements
- __init__((RatingCurveTimeFunctions)self) None :
Defines an empty list of (t, f) pairs
- __init__( (RatingCurveTimeFunctions)arg1, (RatingCurveTimeFunctions)clone_me) -> None
- append((RatingCurveTimeFunctions)arg1, (object)arg2) None
- extend((RatingCurveTimeFunctions)arg1, (object)arg2) None