shyft.time_series

This package contains the following classes and functions for use by end-users. The namespace itself contains more classes and functions, but these are intended for internal use.

Note

Vector types: Because the actual code is written in strongly typed C++, Shyft Python code uses the concept of “vector” classes, which are essentially lists of the named type, e.g. TsBindInfo and TsBindInfoVector. These are not specifically documented here.

However, some vector types like TsVector are documented, because they provide more functionality than a simple list.

Time

Elements in this category deal with date/time.

Function utctime_now

class shyft.time_series.utctime_now

returns the current time as seconds since epoch (1970-01-01 UTC)

Function deltahours

class shyft.time_series.deltahours((object)n)

returns a time equal to the specified n hours

Function deltaminutes

class shyft.time_series.deltaminutes((object)n)

returns a time equal to the specified n minutes

Class time

class shyft.time_series.time

Bases: instance

time is represented as a number, in the SI unit of seconds.

For accuracy and performance, it is internally represented as a 64-bit integer, at micro-second resolution. It is usually used in two roles:

  1. time measured in seconds since epoch (1970.01.01UTC)

    often constructed using the Calendar class that takes calendar-coordinates (YMDhms etc) and returns the corresponding time-point, taking time-zone and dst etc. into account.

>>>      utc = Calendar()
>>>      t1 = utc.time(2018,10,15,16,30,15)
>>>      t2 = time('2018-10-15T16:30:15Z')
>>>      t3 = time(1539621015)
  2. time measure, in unit of seconds (resolution up to 1us)

    often constructed from numbers, always use SI-unit of seconds

>>>      dt1 = time(3600)  # 3600 seconds
>>>      dt2 = time(0.123456)  #  0.123456 seconds

It can be constructed by supplying a number of seconds, or a well-defined iso8601 string.

To convert it to a python number, use float() or int() cast operations. If dealing with time-zone/calendar conversions, use the calendar.time(..)/.calendar_units functions. If you want time-zone/calendar semantic add/diff/trim, use the corresponding Calendar methods.

See also

Calendar,deltahours,deltaminutes

__init__((object)arg1) object :

constructs a time of 0 seconds

__init__( (object)arg1, (object)seconds) -> object :

constructs a time from a precise time-like item, e.g. time, float, int or an iso8601 string

Args:

seconds (time|float|int|str): A time-like item: time, float, int or an iso8601 YYYY-MM-DDThh:mm:ss[.xxxxxx]Z string

epoch = time(0)
max = time.max
min = time.min
static now() time
property seconds

returns time in seconds

Type:

float

sqrt()

object sqrt(tuple args, dict kwds)

undefined = time.undefined

Class YMDhms

class shyft.time_series.YMDhms

Bases: instance

Defines calendar coordinates as Year Month Day hour minute second and micro_second. The intended usage is ONLY as a result from Calendar.calendar_units(t), to ensure a type-safe return of these entities for a given time.

Please use this as a read-only return type from the Calendar.calendar_units(t)

__init__((YMDhms)arg1) None
__init__( (YMDhms)arg1, (object)Y [, (object)M [, (object)D [, (object)h [, (object)m [, (object)s [, (object)us]]]]]]) -> None :

Creates calendar coordinates specifying Y,M,D,h,m,s,us

property day

int:

property hour

int:

is_null((YMDhms)arg1) bool :

returns true if all values are 0, i.e. the null definition

is_valid((YMDhms)arg1) bool :

returns true if YMDhms values are reasonable

static max() YMDhms :

returns the maximum representation

property micro_second

int:

static min() YMDhms :

returns the minimum representation

property minute

int:

property month

int:

property second

int:

property year

int:

Class YWdhms

class shyft.time_series.YWdhms

Bases: instance

Defines calendar coordinates as iso Year Week week-day hour minute second and micro_second. The intended usage is ONLY as a result from Calendar.calendar_week_units(t), to ensure a type-safe return of these entities for a given time.

Notes

Please use this as a read-only return type from the Calendar.calendar_week_units(t)

__init__((YWdhms)arg1) None
__init__( (YWdhms)arg1, (object)Y [, (object)W [, (object)wd [, (object)h [, (object)m [, (object)s [, (object)us]]]]]]) -> None :

Creates calendar coordinates specifying iso Y,W,wd,h,m,s,us

Args:

Y (int): iso-year

W (int): iso week [1..53]

wd (int): week_day [1..7]=[mo..sun]

h (int): hour [0..23]

m (int): minute [0..59]

s (int): second [0..59]

us (int): micro_second [0..999999]

property hour

int:

is_null((YWdhms)arg1) bool :

returns true if all values are 0, i.e. the null definition

is_valid((YWdhms)arg1) bool :

returns true if YWdhms values are reasonable

property iso_week

int:

property iso_year

int:

static max() YWdhms :

returns the maximum representation

property micro_second

int:

static min() YWdhms :

returns the minimum representation

property minute

int:

property second

int:

property week_day

week_day, [1..7]=[mo..sun]

Type:

int

Class TzInfo

class shyft.time_series.TzInfo

Bases: instance

The TzInfo class is responsible for providing information about the time-zone of the calendar. This includes:

  • name (olson identifier)

  • base_offset

  • utc_offset(t) time-dependent

The Calendar class provides a shared pointer to its TzInfo object

__init__((TzInfo)arg1, (time)base_tz) None :

creates a TzInfo with a fixed utc-offset (no dst-rules)

base_offset((TzInfo)arg1) time :

returns the time-invariant part of the utc-offset

is_dst((TzInfo)arg1, (time)t) bool :

returns true if DST is observed at given utc-time t

name((TzInfo)arg1) str :

returns the olson time-zone identifier or name for the TzInfo

utc_offset((TzInfo)arg1, (time)t) time :

returns the utc_offset at specified utc-time, takes DST into account if applicable

Class Calendar

class shyft.time_series.Calendar

Bases: instance

Calendar deals with the concept of a human calendar

In Shyft we practice the ‘utc-time perimeter’ principle,

  • the core is utc-time exclusively

  • we deal with time-zone and calendars at the interfaces/perimeters

In python, this corresponds to a 64-bit timestamp, i.e. the integer version of the time package representation; e.g. the difference time.time() - utctime_now() is within split-seconds.

Calendar functionality:

  • Conversion between the calendar coordinates YMDhms or iso week YWdhms and utctime, taking any timezone and DST into account

  • Calendar constants, utctimespan like values for Year,Month,Week,Day,Hour,Minute,Second

  • Calendar arithmetic, like adding calendar units, e.g. day,month,year etc.

  • Calendar arithmetic, like trim/truncate a utctime down to the nearest timespan/calendar unit, e.g. day

  • Calendar arithmetic, like calculate difference in calendar units(e.g days) between two utctime points

  • Calendar Timezone and DST handling

  • Converting time to string and vice-versa

Notes

Please notice that although the calendar concept is complete:

  • We only implement features as needed in the core and interfaces

    Currently this includes most options, including olson time-zone handling

    The time-zone support is currently a snapshot of rules ~2023

    but we plan to use standard packages like Howard Hinnant’s online approach for this later.

  • Working range for DST is 1905..2105 (dst is considered 0 outside)

  • Working range/validity of the calendar computations is limited to the gregorian calendar, as of boost::date.

  • Working range avoiding overflows is -4000 .. + 4000 Years

DAY = time(86400)
HOUR = time(3600)
MINUTE = time(60)
MONTH = time(2592000)
QUARTER = time(7776000)
RANGE = [-9999-01-01T00:00:00Z,9999-12-31T23:59:59Z>
SECOND = time(1)
TZ_RANGE = [1905-01-01T00:00:00Z,2105-01-01T00:00:00Z>
WEEK = time(604800)
YEAR = time(31536000)
__init__((Calendar)arg1) None
__init__( (Calendar)arg1, (time)tz_offset) -> None :

creates a calendar with constant tz-offset

Args:

tz_offset (time): specifies utc offset, time(3600) gives UTC+01 zone

__init__( (Calendar)arg1, (object)tz_offset) -> None :

creates a calendar with constant tz-offset

Args:

tz_offset (int): seconds utc offset, 3600 gives UTC+01 zone

__init__( (Calendar)arg1, (object)olson_tz_id) -> None :

create a Calendar from Olson timezone id, eg. ‘Europe/Oslo’

Args:

olson_tz_id (str): Olson time-zone id, e.g. ‘Europe/Oslo’

add()
object add(tuple args, dict kwds) :

This function does a calendar semantic add.

Conceptually this is similar to t + deltaT*n, but with deltaT equal to calendar::DAY,WEEK,MONTH,YEAR, and/or with a dst-enabled time-zone, the variation in length due to dst as well as month/year length is taken into account. E.g. adding one day with a calendar that has dst could give 23, 24 or 25 hours due to dst. Similarly for week or any other time step.

Args:

t (time): timestamp utc seconds since epoch

delta_t (time): timestep in seconds, with semantic interpretation of DAY,WEEK,MONTH,YEAR

n (int): number of timesteps to add

Returns:

time: t. new timestamp with the added time-steps, seconds utc since epoch

Notes:

ref. to related functions .diff_units(…) and .trim(..)

calendar_units()
object calendar_units(tuple args, dict kwds) :

returns YMDhms (.year,.month,.day,.hour,.minute..) for specified t, in the time-zone as given by the calendar

Args:

t (time): timestamp utc seconds since epoch

Returns:

YMDhms: calendar_units. calendar units as in year-month-day hour-minute-second

calendar_week_units()
object calendar_week_units(tuple args, dict kwds) :

returns iso YWdhms, with properties (.iso_year,.iso_week,.week_day,.hour,..) for specified t, in the time-zone as given by the calendar

Args:

t (time): timestamp utc seconds since epoch

Returns:

YWdhms: calendar_week_units. calendar units as in iso year-week-week_day hour-minute-second

day_of_year((Calendar)self, (time)t) int :

returns the day of year for the specified time

Parameters:

t (time) – time to use for computation

Returns:

day_of_year. in range 1..366

Return type:

int

diff_units()
object diff_units(tuple args, dict kwds) :

calculate the distance t1..t2 in the specified units, taking dst into account if observed. The function uses calendar semantics when delta_t is calendar::DAY,WEEK,MONTH,YEAR, and in addition also dst if observed. E.g. the diff_units of calendar::DAY over the summer->winter shift is 1, even if the number of hours during those days is 23 and 25 for the summer and winter transitions respectively. It computes the calendar semantics of (t2-t1)/deltaT, where deltaT can be the calendar units DAY,WEEK,MONTH,YEAR.

Args:

t1 (time): timestamp utc seconds since epoch

t2 (time): timestamp utc seconds since epoch

delta_t (time): timestep in seconds, with semantic interpretation of DAY,WEEK,MONTH,YEAR

trim_policy (trim_policy): Default TRIM_IN, could be TRIM_OUT or TRIM_ROUND.

Returns:

int: n_units. number of units, so that t2 = c.add(t1,delta_t,n) + remainder(discarded). Depending on the trim_policy, the remainder results will add/subtract one unit to the result.

Notes:

ref. to related functions .add(…) and .trim(…)

quarter((Calendar)arg1, (time)t) int :

returns the quarter of the specified t, -1 if invalid t

Parameters:

t (int) – timestamp utc seconds since epoch

Returns:

quarter. in range[1..4], -1 if invalid time

Return type:

int

static region_id_list() StringVector :

Returns a list over predefined Olson time-zone identifiers

Notes

the list is currently static and reflects tz-rules approx as of 2014

time((Calendar)self, (YMDhms)YMDhms) time :

convert calendar coordinates into time using the calendar time-zone

Args:

YMDhms (YMDhms): calendar coordinate structure containing year,month,day,hour,minute,second

Returns:

time: timestamp. timestamp as in seconds utc since epoch

time( (Calendar)self, (YWdhms)YWdhms) -> time :

convert calendar iso week coordinates structure into time using the calendar time-zone

Args:

YWdhms (YWdhms): structure containing iso specification calendar coordinates

Returns:

time: timestamp. timestamp as in seconds utc since epoch

time( (Calendar)self, (object)Y [, (object)M=1 [, (object)D=1 [, (object)h=0 [, (object)m=0 [, (object)s=0 [, (object)us=0]]]]]]) -> time :

convert calendar coordinates into time using the calendar time-zone

Args:

Y (int): Year

M (int): Month [1..12], default=1

D (int): Day [1..31], default=1

h (int): hour [0..23], default=0

m (int): minute [0..59], default=0

s (int): second [0..59], default=0

us (int): micro second[0..999999], default=0

Returns:

time: timestamp. timestamp as in seconds utc since epoch

time_from_week((Calendar)self, (object)Y[, (object)W=1[, (object)wd=1[, (object)h=0[, (object)m=0[, (object)s=0[, (object)us=0]]]]]]) time :

convert calendar iso week coordinates into time using the calendar time-zone

Parameters:
  • Y (int) – ISO Year

  • W (int) – ISO Week [1..54], default=1

  • wd (int) – week_day [1..7]=[mo..su], default=1

  • h (int) – hour [0..23], default=0

  • m (int) – minute [0..59], default=0

  • s (int) – second [0..59], default=0

  • us (int) – micro second[0..999999], default=0

Returns:

timestamp. timestamp as in seconds utc since epoch

Return type:

time

to_string()
object to_string(tuple args, dict kwds) :
convert time t to a readable iso standard string, taking the current calendar properties, including timezone, into account

Args:

utctime (time): seconds utc since epoch

Returns:

str: iso time string. iso standard formatted string, including tz info

to_string( (Calendar)self, (UtcPeriod)utcperiod) -> str :

convert utcperiod p to readable string taking current calendar properties, including timezone into account

Args:

utcperiod (UtcPeriod): An UtcPeriod object

Returns:

str: period-string. [start..end>, iso standard formatted string, including tz info

trim()
object trim(tuple args, dict kwds) :

Round time t down to the nearest calendar time-unit delta_t, taking the calendar time-zone and dst into account.

Args:

t (time): timestamp utc seconds since epoch

delta_t (time): timestep in seconds, with semantic interpretation of Calendar.{DAY|WEEK|MONTH|YEAR}

Returns:

time: t. new trimmed timestamp, seconds utc since epoch

Notes:

ref to related functions .add(t,delta_t,n),.diff_units(t1,t2,delta_t)

property tz_info

keeping the time-zone name, utc-offset and DST rules (if any)

Type:

TzInfo

Class UtcPeriod

class shyft.time_series.UtcPeriod

Bases: instance

UtcPeriod defines the half-open utc-time range [start..end>, where end is required to be greater than or equal to start

__init__((UtcPeriod)arg1) None
__init__( (UtcPeriod)arg1, (time)start, (time)end) -> None :

Create utcperiod given start and end

contains((UtcPeriod)self, (time)t) bool :

returns true if time t is contained in this utcperiod

contains( (UtcPeriod)self, (object)t) -> bool :

returns true if time t is contained in this utcperiod

contains( (UtcPeriod)self, (UtcPeriod)p) -> bool :

returns true if utcperiod p is contained in this utcperiod

diff_units((UtcPeriod)self, (Calendar)calendar, (time)delta_t) int :
Calculate the distance from start to end of UtcPeriod in specified units, taking dst into account if observed

The function uses calendar semantics when delta_t is calendar::DAY,WEEK,MONTH,YEAR, and in addition also dst if observed. E.g. the diff_units of calendar::DAY over the summer->winter shift is 1, even if the number of hours during those days is 23 and 25 for the summer and winter transitions respectively.

Args:

calendar (calendar): shyft calendar

delta_t (time): timestep in seconds, with semantic interpretation of DAY,WEEK,MONTH,YEAR

Returns:

int: n_units. number of units in UtcPeriod

diff_units( (UtcPeriod)self, (Calendar)calendar, (object)delta_t) -> int :

Calculate the distance from start to end of UtcPeriod in the specified units, taking dst into account if observed. The function uses calendar semantics when delta_t is calendar::DAY,WEEK,MONTH,YEAR, and in addition also dst if observed. E.g. the diff_units of calendar::DAY over the summer->winter shift is 1, even if the number of hours during those days is 23 and 25 for the summer and winter transitions respectively.

Args:

calendar (calendar): shyft calendar

delta_t (int): timestep in seconds, with semantic interpretation of DAY,WEEK,MONTH,YEAR

Returns:

int: n_units. number of units in UtcPeriod

property end

Defines the end of the period, not inclusive

Type:

time

intersection((UtcPeriod)a, (UtcPeriod)b) UtcPeriod :

Returns the intersection of two periods. If there is an intersection, the resulting period will be .valid() and have .timespan()>0. If there is no intersection, an empty, not .valid() period is returned.

Parameters:
  • a (UtcPeriod) – first period

  • b (UtcPeriod) – second period

Returns:

intersection. The computed intersection, or an empty not .valid() UtcPeriod

Return type:

UtcPeriod

overlaps((UtcPeriod)self, (UtcPeriod)p) bool :

returns true if period p overlaps this utcperiod

property start

Defines the start of the period, inclusive

Type:

time

timespan((UtcPeriod)arg1) time :

returns end-start, the timespan of the period

to_string((UtcPeriod)arg1) str :

A readable representation in UTC

trim((UtcPeriod)self, (Calendar)calendar, (time)delta_t[, (trim_policy)trim_policy=shyft.time_series._time_series.trim_policy.TRIM_IN]) UtcPeriod :
Round UtcPeriod up or down to the nearest calendar time-unit delta_t

taking the calendar time-zone and dst into account

Args:

calendar (calendar): shyft calendar

delta_t (time): timestep in seconds, with semantic interpretation of Calendar.(DAY,WEEK,MONTH,YEAR)

trim_policy (trim_policy): TRIM_IN if rounding period inwards, else rounding outwards

Returns:

UtcPeriod: trimmed_UtcPeriod. new trimmed UtcPeriod

trim( (UtcPeriod)self, (Calendar)calendar, (object)delta_t [, (trim_policy)trim_policy=shyft.time_series._time_series.trim_policy.TRIM_IN]) -> UtcPeriod :

Round UtcPeriod up or down to the nearest calendar time-unit delta_t taking the calendar time-zone and dst into account

Args:

calendar (calendar): shyft calendar

delta_t (int): timestep in seconds, with semantic interpretation of Calendar.(DAY,WEEK,MONTH,YEAR)

trim_policy (trim_policy): TRIM_IN if rounding period inwards, else rounding outwards

Returns:

UtcPeriod: trimmed_UtcPeriod. new trimmed UtcPeriod

valid((UtcPeriod)arg1) bool :

returns true if start<=end, otherwise false

Class UtcTimeVector

class shyft.time_series.UtcTimeVector

Bases: instance

__init__((object)arg1) object :

construct an empty UtcTimeVector

__init__( (object)arg1, (UtcTimeVector)clone_me) -> object :

construct a copy of the supplied UtcTimeVector

Args:

clone_me (UtcTimeVector): to be cloned

__init__( (object)arg1, (IntVector)seconds_vector) -> object :

construct from a vector of integer seconds since epoch utc

Args:

seconds (IntVector): seconds

__init__( (object)arg1, (DoubleVector)seconds_vector) -> object :

construct from a vector of float seconds since epoch utc

Args:

seconds (DoubleVector): seconds, up to us resolution epoch utc

__init__( (object)arg1, (list)times) -> object :

construct from a list of items convertible to UtcTime

Args:

times (list): a list with convertible times

__init__( (object)arg1, (object)np_times) -> object :

construct from a numpy array of int64 seconds since epoch utc

Args:

np_times (list): a list with convertible times

__init__( (object)arg1, (object)np_times) -> object :

construct from a numpy array of float seconds since epoch utc

Args:

np_times (list): a list with float convertible times

append((UtcTimeVector)arg1, (object)arg2) None
extend((UtcTimeVector)arg1, (object)arg2) None
static from_numpy((object)arg1) UtcTimeVector
push_back()
object push_back(tuple args, dict kwds) :

appends a utctime like value to the vector

Args:

t (utctime): an int (seconds), or utctime

size()
to_numpy((UtcTimeVector)self) object :

convert to numpy array of type np.int64, seconds since epoch

to_numpy_double((UtcTimeVector)self) object :

convert to numpy array of type np.float64, seconds since epoch

Time series

Elements in this category are the actual time series.

Class TimeAxis

class shyft.time_series.TimeAxis

Bases: instance

A time-axis is a set of ordered non-overlapping periods, and TimeAxis provides the most generic implementation of this. The internal representation is selected based on the parameters provided to the constructor, and is one of fixed delta-t, calendar delta-t, or by-points. The internal representation type and the corresponding realizations are available as properties.

Notes

The internal representation can be one of TimeAxisCalendarDeltaT,TimeAxisFixedDeltaT,TimeAxisByPoints

__call__((TimeAxis)self, (int)i) UtcPeriod :

Returns the i-th period of the time-axis

Parameters:

i (int) – index to lookup

Returns:

period. The period for the supplied index

Return type:

UtcPeriod

__init__((TimeAxis)arg1) None
__init__( (TimeAxis)arg1, (time)start, (time)delta_t, (object)n) -> None :

creates a time-axis with n intervals, fixed delta_t, starting at start

Args:

start (utctime): utc-time 1970 utc based

delta_t (utctime): number of seconds delta-t, length of periods in the time-axis

n (int): number of periods in the time-axis

__init__( (TimeAxis)arg1, (Calendar)calendar, (time)start, (time)delta_t, (object)n) -> None :

creates a calendar time-axis

Args:

calendar (Calendar): specifies the calendar to be used, keeps the time-zone and dst-arithmetic rules

start (utctime): utc-time 1970 utc based

delta_t (utctime): number of seconds delta-t, length of periods in the time-axis

n (int): number of periods in the time-axis

__init__( (TimeAxis)arg1, (UtcTimeVector)time_points, (time)t_end) -> None :

creates a time-axis by specifying the time_points and t-end of the last interval

Args:

time_points (UtcTimeVector): ordered set of unique utc-time points, the start of each consecutive period

t_end (time): the end of the last period in time-axis, utc-time 1970 utc based, must be > time_points[-1]

__init__( (TimeAxis)arg1, (UtcTimeVector)time_points) -> None :

create a time-axis supplying n+1 points to define n intervals

Args:

time_points (UtcTimeVector): ordered set of unique utc-time points, 0..n-2:the start of each consecutive period,n-1: end of last period

__init__( (TimeAxis)arg1, (TimeAxisCalendarDeltaT)calendar_dt) -> None :

create a time-axis from a calendar time-axis

Args:

calendar_dt (TimeAxisCalendarDeltaT): existing calendar time-axis

__init__( (TimeAxis)arg1, (TimeAxisFixedDeltaT)fixed_dt) -> None :

create a time-axis from a fixed delta-t time-axis

Args:

fixed_dt (TimeAxisFixedDeltaT): existing fixed delta-t time-axis

__init__( (TimeAxis)arg1, (TimeAxisByPoints)point_dt) -> None :

create a time-axis from a by-points time-axis

Args:

point_dt (TimeAxisByPoints): existing by points time-axis

property calendar_dt

The calendar dt representation (if active)

Type:

TimeAxisCalendarDeltaT

empty((TimeAxis)self) bool :

true if empty time-axis

Returns:

empty. true if empty time-axis

Return type:

bool

property fixed_dt

The fixed dt representation (if active)

Type:

TimeAxisFixedDeltaT

index_of((TimeAxis)self, (time)t[, (int)ix_hint=18446744073709551615]) int :

returns the index of the time-axis period that contains t

Parameters:
  • t (int) – utctime in seconds 1970.01.01

  • ix_hint (int) – index-hint to make search in point-time-axis faster

Returns:

index. the index of the time-axis period that contains t, npos if outside range

Return type:

int

merge((TimeAxis)self, (TimeAxis)other) TimeAxis :

Returns a new time-axis that contains the union of time-points/periods of the two time-axes. If there is a gap between them, it is filled. Merging with an empty time-axis results in the other time-axis.

Parameters:

other (TimeAxis) – The other time-axis to merge with

Returns:

merge_result. the resulting merged time-axis

Return type:

TimeAxis

open_range_index_of((TimeAxis)self, (time)t[, (int)ix_hint=18446744073709551615]) int :

returns the index that contains t, or is before t

Parameters:
  • t (int) – utctime in seconds 1970.01.01

  • ix_hint (int) – index-hint to make search in point-time-axis faster

Returns:

index. the index of the time-axis period that contains t; npos if t is before the first period, n-1 if t is after the last period

Return type:

int

period((TimeAxis)self, (int)i) UtcPeriod :
Parameters:

i (int) – the i’th period, 0..n-1

Returns:

period. the i’th period of the time-axis

Return type:

UtcPeriod

property point_dt

The point_dt representation (if active)

Type:

TimeAxisByPoints

size((TimeAxis)arg1) int :
Returns:

n. the number of periods in the time-axis

Return type:

int

slice((TimeAxis)self, (int)start, (int)n) TimeAxis :

returns slice of time-axis as a new time-axis

Parameters:
  • start (int) – first interval to include

  • n (int) – number of intervals to include

Returns:

time-axis. A new time-axis with the specified slice

Return type:

TimeAxis

time((TimeAxis)self, (int)i) time :
Parameters:

i (int) – the i’th period, 0..n-1

Returns:

utctime. the start(utctime) of the i’th period of the time-axis

Return type:

int

property time_points
extract all time-points from a TimeAxis

like [ time_axis.time(i) ].append(time_axis.total_period().end) if time_axis.size() else []

Parameters:

time_axis (TimeAxis)

Returns:

time_points – [ time_axis.time(i) ].append(time_axis.total_period().end)

Return type:

numpy.array(dtype=np.int64)

property time_points_double

extract all time-points from a TimeAxis with microseconds like [ time_axis.time(i) ].append(time_axis.total_period().end) if time_axis.size() else []

Parameters:

time_axis (TimeAxis)

Returns:

time_points – [ time_axis.time(i) ].append(time_axis.total_period().end)

Return type:

numpy.array(dtype=np.float64)

property timeaxis_type

describes which time-axis representation type this is, e.g. (fixed|calendar|point)_dt

Type:

TimeAxisType

total_period((TimeAxis)arg1) UtcPeriod :
Returns:

total_period. the period that covers the entire time-axis

Return type:

UtcPeriod

Class TimeSeries

class shyft.time_series.TimeSeries

Bases: instance

A time-series providing mathematical and statistical operations and functionality.

A time-series can be an expression, or a concrete point time-series. All time-series have a time-axis, values, and a point fx policy. The value f(t) outside the time-axis is nan. Operations between time-series, e.g. a+b, respect the mathematical rule that nan op something equals nan.

The time-series can provide a value for all the intervals, and the point_fx policy defines how the values should be interpreted:

POINT_INSTANT_VALUE(linear):

the point value is valid at the start of the period, linear between points, or flat-extended if the next point is nan. Typical for state-variables, like water-level, or temperature measured at 12:00, etc.

POINT_AVERAGE_VALUE(stair-case):

the point represents an average or constant value over the period. Typical for model-input and results, e.g. precipitation mm/h, discharge m^3/s.

Examples:

>>> import numpy as np
>>> from shyft.time_series import Calendar,deltahours,TimeAxis,TimeSeries,POINT_AVERAGE_VALUE as fx_avg
>>>
>>> utc = Calendar()  # ensure easy consistent explicit handling of calendar and time
>>> ta = TimeAxis(utc.time(2016, 9, 1, 8, 0, 0), deltahours(1), 10)  # create a time-axis to use
>>> a = TimeSeries(ta, np.linspace(0, 10, num=len(ta)), fx_avg)
>>> b = TimeSeries(ta, np.linspace(0,  1, num=len(ta)), fx_avg)
>>> c = a + b*3.0  # c is now an expression, time-axis is the overlap of a and b, lazy evaluation
>>> c_values = c.values.to_numpy()  # compute and extract the values, as numpy array
>>> c_evaluated=c.evaluate() # computes the expression, return a new concrete point-ts equal to the expression
>>>
>>> # Calculate data for new time-points
>>> value_1 = a(utc.time(2016, 9, 1, 8, 30)) # calculates value at a given time
>>> ta_target = TimeAxis(utc.time(2016, 9, 1, 7, 30), deltahours(1), 12)  # create a target time_axis
>>> ts_new = a.average(ta_target) # new time-series with values on ta_target
>>>

TimeSeries can also be symbolic, i.e. have urls that are resolved later, server-side, using the DtsServer.

Other useful classes to look at: TimeAxis, Calendar, TsVector, point_interpretation_policy

Please check the extensive test suite, notebooks, examples and time_series for usage.

__call__((TimeSeries)self, (time)t) float :

return the f(t) value for the time-series

__init__((TimeSeries)self) None :

constructs an empty time-series

__init__( (TimeSeries)self, (TimeAxis)ta, (DoubleVector)values, (point_interpretation_policy)point_fx) -> None :

construct a timeseries time-axis ta, corresponding values and point interpretation policy point_fx

__init__( (TimeSeries)self, (TimeAxis)ta, (object)fill_value, (point_interpretation_policy)point_fx) -> None :

construct a time-series with time-axis ta, specified fill-value, and point interpretation policy point_fx

__init__( (TimeSeries)self, (TimeAxisFixedDeltaT)ta, (DoubleVector)values, (point_interpretation_policy)point_fx) -> None :

construct a timeseries timeaxis ta with corresponding values, and point interpretation policy point_fx

__init__( (TimeSeries)self, (TimeAxisFixedDeltaT)ta, (object)fill_value, (point_interpretation_policy)point_fx) -> None :

construct a timeseries with fixed-delta-t time-axis ta, specified fill-value, and point interpretation policy point_fx

__init__( (TimeSeries)self, (TimeAxisByPoints)ta, (DoubleVector)values, (point_interpretation_policy)point_fx) -> None :

construct a time-series with a point-type time-axis ta, corresponding values, and point-interpretation point_fx

__init__( (TimeSeries)self, (TimeAxisByPoints)ta, (object)fill_value, (point_interpretation_policy)point_fx) -> None :

construct a time-series with a point-type time-axis ta, specified fill-value, and point-interpretation point_fx

__init__( (TimeSeries)self, (TsFixed)core_result_ts) -> None :

construct a time-series from a shyft core time-series, to ease working with core-time-series in user-interface/scripting

__init__( (TimeSeries)self, (TimeSeries)clone) -> None :

creates a shallow copy of the clone time-series

__init__( (TimeSeries)self, (DoubleVector)pattern, (time)dt, (TimeAxis)ta) -> None :

construct a repeated pattern time-series given a equally spaced dt pattern and a time-axis ta

Args:

pattern (DoubleVector): a list of numbers giving the pattern

dt (int): number of seconds between each of the pattern-values

ta (TimeAxis): time-axis that forms the resulting time-series time-axis

__init__( (TimeSeries)self, (DoubleVector)pattern, (time)dt, (time)t0, (TimeAxis)ta) -> None :

construct a time-series given a equally spaced dt pattern, starting at t0, and a time-axis ta

__init__( (TimeSeries)self, (object)ts_id) -> None :

constructs a bind-able ts, providing a symbolic, possibly unique, id that at a later time can be bound to concrete values using the .bind(ts) method. If the ts is used as a ts, e.g. size(), .value(), .time(), before it is bound, a runtime-exception is raised

Args:

ts_id (str): url-like identifier for the time-series,notice that shyft://<container>/<path> is for shyft-internal store

__init__( (TimeSeries)self, (object)ts_id, (TimeSeries)bts) -> None :

constructs a ready bound ts, providing a symbolic possibly unique id that at a later time can be used to correlate with back-end store

Args:

ts_id (str): url-type of id, notice that shyft://<container>/<path> is for shyft-internal store

bts (TimeSeries): A time-series, that is either a concrete ts, or an expression that can be evaluated to form a concrete ts

abs((TimeSeries)self) TimeSeries :

create a new ts, abs(self)

Returns:

ts. a new time-series expression, that will provide the abs-values of self.values

Return type:

TimeSeries

accumulate((TimeSeries)self, (TimeAxis)ta) TimeSeries :

create a new ts where each i-th value is the integral of f(t)*dt from t0..ti,

given the specified time-axis ta, and the point interpretation.

Parameters:

ta (TimeAxis) – time-axis that specifies the periods where accumulated integral is applied

Returns:

ts. a new time-series expression, that will provide the accumulated values when requested

Return type:

TimeSeries

Notes: In contrast to integral(), accumulate has a point-instant interpretation. As values() gives the start value of each interval (see TimeSeries), accumulate(ta).values provides the accumulation over the intervals [t0..t0, t0..t1, t0..t2, …]; thus values[0] is always 0.
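The note above can be illustrated with a plain-Python sketch of the accumulation semantics for a stair-case series on a fixed-interval axis (names are my own; illustrative only, not the shyft implementation):

```python
def accumulate_values(values, dt):
    # for a stair-case series on a fixed-interval axis, the i-th output is
    # the integral from t0 to ti, so the first value is always 0
    out, total = [], 0.0
    for v in values:
        out.append(total)
        total += v * dt  # constant value over each interval of length dt
    return out

acc = accumulate_values([1.0, 2.0, 3.0], dt=3600)
print(acc)  # [0.0, 3600.0, 10800.0]
```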

average((TimeSeries)self, (TimeAxis)ta) TimeSeries :

create a new ts that is the true average of self over the specified time-axis ta. Notice that the same definition as for integral applies; only the non-nan parts go into the average

Parameters:

ta (TimeAxis) – time-axis that specifies the periods where true-average is applied

Returns:

ts. a new time-series expression, that will provide the true-average when requested

Return type:

TimeSeries

Notes

the self point interpretation policy is used when calculating the true average

bind((TimeSeries)self, (TimeSeries)bts) None :

given that this ts (self) is a bind-able ts (aref_ts), and that bts is a concrete point TimeSeries, or something that can be evaluated to one, use it as the representation for the values of this ts. Other related functions are find_ts_bind_info, TimeSeries(‘a-ref-string’)

Parameters:

bts (TimeSeries) – a concrete point ts, or ready-to-evaluate expression, with time-axis, values and fx_policy

Notes

raises runtime_error if any of the preconditions is not true

bind_done((TimeSeries)self[, (object)skip_check=False]) None :

after the bind operations on the unbound time-series of an expression are done, call bind_done() to prepare the expression for use. Other related methods are .bind(), .find_ts_bind_info() and needs_bind().

Parameters:

skip_check (bool) – If set true, this function assumes all siblings are bound, as per the standard usage pattern for the mentioned functions

Notes

Usually this is done automatically by the dtss framework, but if not using dtss, this function must be called after the symbolic ts’s are bound

bucket_to_hourly((TimeSeries)self, (object)start_hour_utc, (object)bucket_emptying_limit) TimeSeries :

Precipitation bucket measurements have a number of quirks that need to be resolved, including negative variations over the day due to faulty temperature-dependent volume/weight sensors.

A precipitation bucket accumulates precipitation, so the readings should be strictly increasing by time, until the bucket is emptied (full, or as part of maintenance).

The goal of the bucket_to_hourly algorithm is to provide hourly precipitation, based on an input signal that usually is hourly (averaging is used if it is not).

The main strategy is to use 24-hour differences (typically at hours of the day where the temperature is low, like early in the morning) to adjust the hourly volume.

Differences over 24-hour periods are distributed on all positive hourly events; the negative derivatives are zeroed out, so that the hourly result for each 24-hour period is steadily increasing and equal to the difference over that 24-hour window.

The derivative is then used to compute the hourly precipitation rate in mm/h.

Parameters:
  • start_hour_utc (int) – valid range [0..24], usually set to early morning (low, stable temperature)

  • bucket_emptying_limit (float) – a negative number, range [-oo..0>, the limit for detecting an emptying of the bucket, in the unit of the measurement series

Returns:

ts. a new hourly rate ts, that transforms the accumulated series, compensated for the described defects

Return type:

TimeSeries
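A rough plain-Python sketch of the redistribution step described above: negative hourly differences are zeroed, and the positive ones are scaled so the day still sums to the true 24-hour difference (names are my own; illustrative only, the real algorithm additionally handles emptying detection, averaging and the sensor quirks described):

```python
def redistribute_day(hour_diffs):
    # zero the negative hourly differences, then scale the positive ones
    # so their sum equals the true difference over the whole period
    total = sum(hour_diffs)
    pos = [d if d > 0.0 else 0.0 for d in hour_diffs]
    s = sum(pos)
    return [p * total / s for p in pos] if s > 0.0 else pos

day = redistribute_day([2.0, -1.0, 3.0])  # pretend a 3-step "day"
print(day, sum(day))  # all values >= 0 and the total (4.0) is preserved
```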

clone_expression((TimeSeries)self) TimeSeries :

create a copy of the ts-expression, except for the bound payload of the reference ts. For the reference terminals (those with ts_id), only the ts_id is copied. Thus, to re-evaluate the expression, those have to be bound.

Notes

this function is only useful in contexts where multiple bind/rebind operations are needed while keeping the expression.

Returns:

semantic_clone. returns a copy of the ts, except for the payload at reference/symbolic terminals, where only `ts_id` is copied

Return type:

TimeSeries

compress((TimeSeries)self, (object)accuracy) TimeSeries :

Compress by reducing the number of points to the minimum sufficient to represent the same f(t) within the given accuracy. The returned ts is a new ts with break-point/variable-interval representation. Note: lazy binding expressions (server-side eval) are not yet supported.

Parameters:

accuracy (float) – if v[i]-v[i+1] < accuracy, then v[i+1] is dropped

Returns:

compressed_ts. a new compressed within accuracy time-series

Return type:

TimeSeries

compress_size((TimeSeries)self, (object)accuracy) int :

Compute the number of points this time-series could be reduced to by calling ts.compress(accuracy). Note: lazy binding expressions (server-side eval) are not yet supported.

Parameters:

accuracy (float) – if v[i]-v[i+1] < accuracy, then v[i+1] is dropped

Returns:

compressed_size. number of distinct points needed to represent the time-series

Return type:

int
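The drop criterion used by compress()/compress_size() can be sketched in plain Python (an absolute-difference reading of the accuracy rule; names are my own, and the real implementation also builds the break-point representation):

```python
def compress_points(points, accuracy):
    # keep a point only when its value differs from the last kept value
    # by at least `accuracy`; points are (time, value) pairs
    kept = points[:1]
    for t, v in points[1:]:
        if abs(v - kept[-1][1]) >= accuracy:
            kept.append((t, v))
    return kept

pts = compress_points([(0, 1.0), (1, 1.005), (2, 2.0)], accuracy=0.01)
print(pts)  # the middle point is within accuracy of 1.0 and is dropped
```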

convolve_w((TimeSeries)self, (DoubleVector)weights, (convolve_policy)policy) TimeSeries :

create a new ts that is the convolved ts with the given weights list

Parameters:
  • weights (DoubleVector) – the weights profile, use DoubleVector.from_numpy(…) to create these. It’s the caller’s responsibility to ensure the sum of weights is 1.0

  • policy (convolve_policy) – (USE_NEAREST|USE_ZERO|USE_NAN + BACKWARD|FORWARD|CENTER). Specifies how to handle boundary values

Returns:

ts. a new time-series that is evaluated on request to the convolution of self

Return type:

TimeSeries
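A plain-Python sketch of one policy combination, my reading of USE_NEAREST + BACKWARD: each output is a weighted sum of the current and previous values, reusing the nearest available value at the start boundary (illustrative only, not the shyft implementation):

```python
def convolve_backward_nearest(values, weights):
    # out[i] = sum_k weights[k] * values[i-k], clamping the index at the
    # start so the nearest available value is used (USE_NEAREST)
    return [sum(w * values[max(i - k, 0)] for k, w in enumerate(weights))
            for i in range(len(values))]

out = convolve_backward_nearest([1.0, 2.0, 3.0], [0.5, 0.5])
print(out)  # [1.0, 1.5, 2.5]
```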

decode((TimeSeries)self, (object)start_bit, (object)n_bits) TimeSeries :

Create a time-series that decodes the source using the provided specification start_bit and n_bits. This function can typically be used to decode status-signals from sensors, stored as binary encoded bits in an integer representation. The floating point format allows up to 52 bits to be precisely stored as an integer, so there are corresponding restrictions on start_bit and n_bits. Practical sensor quality signals have around 32 bits of status information encoded. If the value in the source time-series is:

  • negative

  • nan

  • larger than 52 bits

Then nan is returned for those values

ts.decode(start_bit=1, n_bits=1) will return values [0,1,nan]; similarly, ts.decode(start_bit=1, n_bits=2) will return values [0,1,2,3,nan], etc.

Parameters:
  • start_bit (int) – where in the n-bits integer the value is stored, range[0..51]

  • n_bits (int) – how many bits are encoded, range[0..51], but start_bit +n_bits < 51

Returns:

decode_ts. Evaluated on demand decoded time-series

Return type:

TimeSeries
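The per-value bit-extraction rule can be sketched in plain Python (illustrative semantics only; the function name is my own, not the shyft API):

```python
import math

def decode_value(v, start_bit, n_bits):
    # nan is returned for nan, negative, or values needing more than 52 bits
    if math.isnan(v) or v < 0 or v >= 2.0 ** 52:
        return float('nan')
    # shift the requested field down and mask out n_bits of it
    return float((int(v) >> start_bit) & ((1 << n_bits) - 1))

print(decode_value(6.0, start_bit=1, n_bits=2))  # 6 = 0b110 -> bits 1..2 = 3.0
```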

derivative((TimeSeries)self[, (derivative_method)method=shyft.time_series._time_series.derivative_method.DEFAULT]) TimeSeries :

Compute the derivative of the ts, according to the method specified. For linear (POINT_INSTANT_VALUE), it is always the derivative of the straight line between points, using nan for the interval starting at the last point until the end of the time-axis. The default for stair-case (POINT_AVERAGE_VALUE) is the average derivative over each time-step, using 0 as rise for the first/last half of the intervals at the boundaries. Here you can influence the method used, selecting .forward_diff or .backward_diff

Parameters:

method (derivative_method) – default value gives center/average derivative .(DEFAULT|FORWARD|BACKWARD|CENTER)

Returns:

derivative. The derivative ts

Return type:

TimeSeries
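For the linear (POINT_INSTANT_VALUE) case, the derivative described above is just the slope between consecutive points, with nan for the interval after the last point. A plain-Python sketch (names are my own; illustrative only):

```python
def linear_derivative(times, values):
    # slope of the straight line between consecutive points; nan for the
    # interval starting at the last point
    out = [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
           for i in range(len(values) - 1)]
    out.append(float('nan'))
    return out

d = linear_derivative([0, 10, 20], [0.0, 5.0, 5.0])
print(d)  # [0.5, 0.0, nan]
```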

static deserialize((ByteVector)blob) TimeSeries :

convert a blob, as returned by .serialize(), into a TimeSeries

evaluate((TimeSeries)self) TimeSeries :

Forces evaluation of the expression, returns a new concrete time-series that is detached from the expression.

Returns:

ts. the evaluated copy of the expression that self represents

Return type:

TimeSeries

extend((TimeSeries)self, (TimeSeries)ts[, (extend_split_policy)split_policy=shyft.time_series._time_series.extend_split_policy.LHS_LAST[, (extend_fill_policy)fill_policy=shyft.time_series._time_series.extend_fill_policy.FILL_NAN[, (time)split_at=time(0)[, (object)fill_value=nan]]]]) TimeSeries :

create a new time-series that is self extended with ts

Parameters:
  • ts (TimeSeries) – time-series to extend self with; only values after both the start of self and split_at are used

  • split_policy (extend_split_policy) – policy determining where to split between self and ts

  • fill_policy (extend_fill_policy) – policy determining how to fill any gap between self and ts

  • split_at (utctime) – time at which to split if split_policy == EPS_VALUE

  • fill_value (float) – value to fill any gap with if fill_policy == EPF_FILL

Returns:

extended_ts. a new time-series that is the extension of self with ts

Return type:

TimeSeries

fill((TimeSeries)self, (object)v) None :

fill all values with v

find_ts_bind_info((TimeSeries)self) TsBindInfoVector :

recursive search through the expression that this ts represents, returning a list of TsBindInfo that can be used to inspect and possibly ‘bind’ to ts-values. See also the related function bind()

Returns:

bind_info. A list of BindInfo where each entry contains a symbolic-ref and a ts that needs binding

Return type:

TsBindInfoVector
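The typical symbolic workflow is: find the unbound references, bind each one from a back-end store, then finalize. The pattern can be sketched with plain-Python stand-ins (class and function names are my own, not the shyft API; with shyft you would use find_ts_bind_info(), bind() and bind_done()):

```python
class SymbolicRef:
    # stand-in for a TsBindInfo entry: a ts_id plus a bind target
    def __init__(self, ts_id):
        self.ts_id, self.values = ts_id, None

    def bind(self, values):
        self.values = values

    def needs_bind(self):
        return self.values is None

def bind_all(refs, store):
    # look each symbolic id up in a back-end store and bind it
    for ref in refs:
        ref.bind(store[ref.ts_id])
    # once everything is bound, the expression is ready to finalize/evaluate
    return all(not r.needs_bind() for r in refs)

refs = [SymbolicRef('shyft://stm/a'), SymbolicRef('shyft://stm/b')]
ok = bind_all(refs, {'shyft://stm/a': [1.0], 'shyft://stm/b': [2.0]})
print(ok)  # True
```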

get((TimeSeries)self, (int)i) Point :

returns the i’th point (t,v)

get_krls_predictor((TimeSeries)self, (time)dt[, (object)gamma=0.001[, (object)tolerance=0.01[, (int)size=1000000]]]) KrlsRbfPredictor :

Get a KRLS predictor trained on this time-series.

If you only want an interpolation of self, use krls_interpolation instead; this method returns the underlying predictor instance, which can be used to generate mean-squared error estimates, or can be further trained on more data.

Notes

A predictor can only be generated for a bound time-series.

Parameters:
  • dt (float) – The time-step in seconds the underlying predictor is specified for. Note that this does not put a limit on time-axes used, but for best results it should be approximately equal to the time-step of the time-axes used with the predictor. In addition it should not be too long, else you will get poor results. Try to keep dt less than a day; 3-8 hours is usually fine.

  • gamma (float (optional)) – Determines the width of the radial basis functions for the KRLS algorithm. Lower values mean wider basis functions; wider basis functions mean faster computation but lower accuracy. Note that the tolerance parameter also affects speed and accuracy. A large value is around 1E-2, and a small value depends on the time step. By using values larger than 1E-2 the computation will probably take too long. Testing has revealed that 1E-3 works great for a time-step of 3 hours, while a gamma of 1E-2 takes a few minutes to compute. Use 1E-4 for a fast and tolerably accurate prediction. Defaults to 1E-3

  • tolerance (float (optional)) – The krls training tolerance. Lower values make the prediction more accurate, but slower. This typically has less effect than gamma, but is useful for tuning. Usually it should be either 0.01 or 0.001. Defaults to 0.01

  • size (int (optional)) – The size of the “memory” of the underlying predictor. The default value is usually enough. Defaults to 1000000.

Examples:

>>> import numpy as np
>>> import scipy.stats as stat
>>> from shyft.time_series import (
...     Calendar, utctime_now, deltahours,
...     TimeAxis, TimeSeries
... )
>>>
>>> cal = Calendar()
>>> t0 = utctime_now()
>>> dt = deltahours(1)
>>> n = 365*24  # one year
>>>
>>> # generate random bell-shaped data
>>> norm = stat.norm()
>>> data = np.linspace(0, 20, n)
>>> data = stat.norm(10).pdf(data) + norm.pdf(np.random.rand(*data.shape))
>>> # -----
>>> ta = TimeAxis(cal, t0, dt, n)
>>> ts = TimeSeries(ta, data)
>>>
>>> # create a predictor
>>> pred = ts.get_krls_predictor()
>>> total_mse = pred.predictor_mse(ts)  # compute mse relative to ts
>>> krls_ts = pred.predict(ta)  # generate a prediction, this is the result from ts.krls_interpolation
>>> krls_mse_ts = pred.mse_ts(ts, points=6)  # compute a mse time-series using 6 points around each sample
Returns:

krls_predictor. A KRLS predictor pre-trained once on self.

Return type:

KrlsRbfPredictor

Other related methods are:

shyft.time_series.TimeSeries.krls_interpolation()

get_time_axis((TimeSeries)self) TimeAxis :

TimeAxis: the time-axis

ice_packing((TimeSeries)self, (IcePackingParameters)ip_params, (ice_packing_temperature_policy)ipt_policy) TimeSeries :

Create a binary time-series indicating whether ice-packing is occurring or not.

Note

self is interpreted and assumed to be a temperature time-series.

The ice packing detection is based on the mean temperature in a predetermined time window before the time-point of interest (see IcePackingParameters.window). The algorithm determines there to be ice packing when the mean temperature is below a given threshold temperature (see IcePackingParameters.threshold_temp).

Parameters:
  • ip_params (IcePackingParameters) – Parameter container controlling the ice packing detection.

  • ipt_policy (ice_packing_temperature_policy) – Policy determining how to treat missing values in the temperature series.

Returns:

ice_packing_ts. A time-series indicating whether ice packing occurs or not

Return type:

TimeSeries

Example:

>>> import numpy as np
>>> from shyft.time_series import (
...     IcePackingParameters, ice_packing_temperature_policy,
...     TimeAxis, TimeSeries, point_interpretation_policy, DoubleVector,
...     utctime_now, deltahours, deltaminutes,
... )
>>>
>>> t0 = utctime_now()
>>> dt = deltaminutes(15)
>>> n = 100
>>>
>>> # generate jittery data
>>> # - first descending from +5 to -5 then ascending back to +5
>>> # - include a NaN hole at the bottom of the V
>>> n_ = n if (n//2)*2 == n else n+1  # assure even
>>> data = np.concatenate((
...     np.linspace(5, -5, n_//2), np.linspace(-5, 5, n_//2)
... )) + np.random.uniform(-0.75, 0.75, n_)  # add uniform noise
>>> data[n_//2 - 1:n_//2 + 2] = float('nan')  # add some missing data
>>>
>>> # create Shyft data structures
>>> ta = TimeAxis(t0, dt, n_)
>>> temperature_ts = TimeSeries(ta, DoubleVector.from_numpy(data),
...                             point_interpretation_policy.POINT_AVERAGE_VALUE)
>>>
>>> # do the ice packing detection
>>> ip_param = IcePackingParameters(
...     threshold_window=deltahours(5),
...     threshold_temperature=-1.0)
>>> # try all the different temperature policies
>>> ice_packing_ts_disallow = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.DISALLOW_MISSING)
>>> ice_packing_ts_initial = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.ALLOW_INITIAL_MISSING)
>>> ice_packing_ts_any = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.ALLOW_ANY_MISSING)
>>>
>>> # plotting
>>> from matplotlib import pyplot as plt
>>> from shyft.time_series import time_axis_extract_time_points
>>>
>>> # NOTE: The offsets below are added solely to be able to distinguish between the different time-axes
>>>
>>> plt.plot(time_axis_extract_time_points(ta)[:-1], temperature_ts.values, label='Temperature')
>>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_ts_disallow.values.to_numpy() + 1,
...          label='Ice packing? [DISALLOW_MISSING]')
>>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_ts_initial.values.to_numpy() - 1,
...          label='Ice packing? [ALLOW_INITIAL_MISSING]')
>>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_ts_any.values.to_numpy() - 3,
...          label='Ice packing? [ALLOW_ANY_MISSING]')
>>> plt.legend()
>>> plt.show()
ice_packing_recession((TimeSeries)self, (TimeSeries)ip_ts, (IcePackingRecessionParameters)ipr_params) TimeSeries :

Create a new time series where segments are replaced by recession curves.

Note

The total period (TimeSeries.total_period) of self needs to be equal to, or contained in the total period of ip_ts.

Parameters:
  • ip_ts (TimeSeries) – A binary time-series indicating if ice packing is occurring. See TimeSeries.ice_packing.

  • ipr_params (IcePackingRecessionParameters) – Parameter container controlling the ice packing recession curve.

Returns:

ice_packing_recession_ts. A time-series where sections in self are replaced by recession curves as indicated by ip_ts.

Return type:

TimeSeries

Example:

>>> import numpy as np
>>> from shyft.time_series import (
...     IcePackingParameters, IcePackingRecessionParameters, ice_packing_temperature_policy,
...     TimeAxis, TimeSeries, point_interpretation_policy, DoubleVector,
...     utctime_now, deltahours, deltaminutes,
... )
>>>
>>> t0 = utctime_now()
>>> dt = deltaminutes(15)
>>> n = 100
>>>
>>> # generate jittery temperature data
>>> # - first descending from +5 to -5 then ascending back to +5
>>> # - include a NaN hole at the bottom of the V
>>> n_ = n if (n//2)*2 == n else n+1  # assure even
>>> temperature_data = np.concatenate((
...     np.linspace(5, -5, n_//2), np.linspace(-5, 5, n_//2)
... )) + np.random.uniform(-0.75, 0.75, n_)  # add uniform noise
>>> temperature_data[n_ // 2 - 1:n_ // 2 + 2] = float('nan')  # add some missing data
>>>
>>> # create Shyft data structures for temperature
>>> ta = TimeAxis(t0, dt, n_)
>>> temperature_ts = TimeSeries(ta, DoubleVector.from_numpy(temperature_data),
...                             point_interpretation_policy.POINT_AVERAGE_VALUE)
>>>
>>> # generate jittery waterflow data
>>> # - an upwards curving parabola
>>> x0 = ta.total_period().start
>>> x1 = ta.total_period().end
>>> x = np.linspace(x0, x1, n_)
>>> flow_data = -0.0000000015*(x - x0)*(x - x1) + 1 + np.random.uniform(-0.5, 0.5, n_)
>>> del x0, x1, x
>>>
>>> # create Shyft data structures for flow
>>> flow_ts = TimeSeries(ta, DoubleVector.from_numpy(flow_data),
...                      point_interpretation_policy.POINT_AVERAGE_VALUE)
>>>
>>> # do the ice packing detection
>>> ip_param = IcePackingParameters(
...     threshold_window=deltahours(5),
...     threshold_temperature=-1.0)
>>> # compute the detection time-series
>>> # ice_packing_ts = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.DISALLOW_MISSING)
>>> # ice_packing_ts = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.ALLOW_INITIAL_MISSING)
>>> ice_packing_ts = temperature_ts.ice_packing(ip_param, ice_packing_temperature_policy.ALLOW_ANY_MISSING)
>>>
>>> # setup for the recession curve
>>> ipr_param = IcePackingRecessionParameters(
...     alpha=0.00009,
...     recession_minimum=2.)
>>> # compute a recession curve based on the ice packing ts
>>> ice_packing_recession_ts_initial = flow_ts.ice_packing_recession(ice_packing_ts, ipr_param)
>>>
>>> # plotting
>>> from matplotlib import pyplot as plt
>>> from shyft.time_series import time_axis_extract_time_points
>>>
>>> plt.plot(time_axis_extract_time_points(ta)[:-1], temperature_ts.values, label='Temperature')
>>> plt.plot(time_axis_extract_time_points(ta)[:-1], flow_ts.values, label='Flow')
>>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_ts.values.to_numpy(),
...          label='Ice packing?')
>>> plt.plot(time_axis_extract_time_points(ta)[:-1], ice_packing_recession_ts_initial.values.to_numpy(),
...          label='Recession curve')
>>> plt.legend()
>>> plt.show()
index_of((TimeSeries)self, (time)t) int :

return the index of the interval that contains t, or npos if not found

inside((TimeSeries)self, (object)min_v, (object)max_v[, (object)nan_v=nan[, (object)inside_v=1.0[, (object)outside_v=0.0]]]) TimeSeries :

Create an inside min-max range ts that transforms the point-values falling into the half-open range [min_v .. max_v> to the value of inside_v (default=1.0), or outside_v (default=0.0); if the value considered is nan, that value is represented as nan_v (default=nan). You would typically use this function to form a true/false series (inside=true, outside=false)

Parameters:
  • min_v (float) – minimum range, values < min_v are not inside. min_v==NaN means no lower limit

  • max_v (float) – maximum range, values >= max_v are not inside. max_v==NaN means no upper limit

  • nan_v (float) – value to return if the value is nan

  • inside_v (float) – value to return if the ts value is inside the specified range

  • outside_v (float) – value to return if the ts value is outside the specified range

Returns:

inside_ts. Evaluated on demand inside time-series

Return type:

TimeSeries
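The per-value mapping can be sketched in plain Python, illustrating the half-open range and the NaN handling (the function name is my own, not the shyft implementation):

```python
import math

def inside_value(v, min_v, max_v, nan_v=float('nan'),
                 inside_v=1.0, outside_v=0.0):
    # half-open range [min_v .. max_v>; a NaN limit disables that bound
    if math.isnan(v):
        return nan_v
    above_min = math.isnan(min_v) or v >= min_v
    below_max = math.isnan(max_v) or v < max_v
    return inside_v if (above_min and below_max) else outside_v

print(inside_value(0.5, 0.0, 1.0), inside_value(1.0, 0.0, 1.0))  # 1.0 0.0
```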

integral((TimeSeries)self, (TimeAxis)ta) TimeSeries :

create a new ts that is the true integral of self over the specified time-axis ta, defined as the integral of the non-nan part of each time-axis interval

Parameters:

ta (TimeAxis) – time-axis that specifies the periods where true-integral is applied

Returns:

ts. a new time-series expression, that will provide the true-integral when requested

Return type:

TimeSeries

Notes

the self point interpretation policy is used when calculating the true integral

kling_gupta(other_ts: TimeSeries, s_r: float = 1.0, s_a: float = 1.0, s_b: float = 1.0) float

computes the kling_gupta correlation using self as observation, and self.time_axis as the comparison time-axis

Parameters:
  • other_ts (Timeseries) – the predicted/calculated time-series to correlate

  • s_r (float) – the kling gupta scale r factor(weight the correlation of goal function)

  • s_a (float) – the kling gupta scale a factor(weight the relative average of the goal function)

  • s_b (float) – the kling gupta scale b factor(weight the relative standard deviation of the goal function)

Returns:

KGEs

Return type:

float
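Based on the parameter descriptions above, the goal function appears to be the Kling-Gupta efficiency, KGE = 1 - sqrt(s_r*(r-1)^2 + s_a*(a-1)^2 + s_b*(b-1)^2), with r the correlation, a the relative mean and b the relative standard deviation. A plain-Python sketch under that assumption (not the shyft implementation):

```python
import math

def kling_gupta(obs, sim, s_r=1.0, s_a=1.0, s_b=1.0):
    # KGE = 1 - sqrt(s_r*(r-1)^2 + s_a*(a-1)^2 + s_b*(b-1)^2)
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    so = math.sqrt(sum((x - mo) ** 2 for x in obs) / n)
    ss = math.sqrt(sum((x - ms) ** 2 for x in sim) / n)
    r = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (n * so * ss)
    a, b = ms / mo, ss / so  # relative average, relative std-deviation
    return 1.0 - math.sqrt(s_r * (r - 1) ** 2
                           + s_a * (a - 1) ** 2
                           + s_b * (b - 1) ** 2)

kge = kling_gupta([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
print(kge)  # a perfect match gives 1.0
```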

krls_interpolation((TimeSeries)self, (time)dt[, (object)gamma=0.001[, (object)tolerance=0.01[, (int)size=1000000]]]) TimeSeries :

Compute a new TS that is a krls interpolation of self.

The KRLS algorithm is a kernel regression algorithm for approximating data; the implementation used here is from DLib: http://dlib.net/ml.html#krls The new time-series has the same time-axis as self, and the values vector contains no nan entries.

If you also want the mean-squared error of the interpolation, use get_krls_predictor instead, and use the predictor api to generate an interpolation and a mse time-series. Other related functions are TimeSeries.get_krls_predictor, KrlsRbfPredictor

Parameters:
  • dt (float) – The time-step in seconds the underlying predictor is specified for. Note that this does not put a limit on time-axes used, but for best results it should be approximately equal to the time-step of the time-axes used with the predictor. In addition it should not be too long, else you will get poor results. Try to keep dt less than a day; 3-8 hours is usually fine.

  • gamma (float (optional)) – Determines the width of the radial basis functions for the KRLS algorithm. Lower values mean wider basis functions; wider basis functions mean faster computation but lower accuracy. Note that the tolerance parameter also affects speed and accuracy. A large value is around 1E-2, and a small value depends on the time step. By using values larger than 1E-2 the computation will probably take too long. Testing has revealed that 1E-3 works great for a time-step of 3 hours, while a gamma of 1E-2 takes a few minutes to compute. Use 1E-4 for a fast and tolerably accurate prediction. Defaults to 1E-3

  • tolerance (float (optional)) – The krls training tolerance. Lower values make the prediction more accurate, but slower. This typically has less effect than gamma, but is useful for tuning. Usually it should be either 0.01 or 0.001. Defaults to 0.01

  • size (int (optional)) – The size of the “memory” of the underlying predictor. The default value is usually enough. Defaults to 1000000.

Examples:

>>> import numpy as np
>>> import scipy.stats as stat
>>> from shyft.time_series import (
...     Calendar, utctime_now, deltahours,
...     TimeAxis, TimeSeries
... )
>>>
>>> cal = Calendar()
>>> t0 = utctime_now()
>>> dt = deltahours(1)
>>> n = 365*24  # one year
>>>
>>> # generate random bell-shaped data
>>> norm = stat.norm()
>>> data = np.linspace(0, 20, n)
>>> data = stat.norm(10).pdf(data) + norm.pdf(np.random.rand(*data.shape))
>>> # -----
>>> ta = TimeAxis(cal, t0, dt, n)
>>> ts = TimeSeries(ta, data)
>>>
>>> # compute the interpolation
>>> ts_ipol = ts.krls_interpolation(deltahours(3))
Returns:

krls_ts. A new time series being the KRLS interpolation of self.

Return type:

TimeSeries

log((TimeSeries)self) TimeSeries :

create a new ts that contains log(self)

lower_half((TimeSeries)self) TimeSeries :

Create a ts that contains the non-positive values only.

Returns:

lower_half_ts. Evaluated on demand lower-half time-series

Return type:

TimeSeries

lower_half_mask((TimeSeries)self) TimeSeries :

Create a ts that contains 1.0 in place of non-positive values, and 0.0 in case of positive values.

Returns:

lower_half_mask_ts. Evaluated on demand lower-half-mask time-series

Return type:

TimeSeries

max((TimeSeries)self, (object)number) TimeSeries :

create a new ts that contains the max of self and number for each time-step

max( (TimeSeries)self, (TimeSeries)ts_other) -> TimeSeries :

create a new ts that contains the max of self and ts_other

merge_points((TimeSeries)self, (TimeSeries)ts) TimeSeries :

Given that self is a concrete point-ts (not an expression), or an empty ts, this function modifies the point-set of self with the (time, value) points from the other ts. The result of the merge operation is the distinct set of time-points from self and the other ts, where values from the other ts overwrite values of self if they happen to be at the same time-point

Parameters:

ts (TimeSeries) – time-series to merge the time,value points from

Returns:

self. self modified with the merged points from other ts

Return type:

TimeSeries
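The merge rule can be sketched with a plain-Python dict of time -> value points (illustrative only, not the shyft implementation): the union of the time-points, with the other series winning on collisions:

```python
def merge_points(self_pts, other_pts):
    # distinct union of time-points; values from the other ts overwrite
    # values of self at identical time-points
    merged = dict(self_pts)
    merged.update(other_pts)
    return dict(sorted(merged.items()))

m = merge_points({0: 1.0, 3600: 2.0}, {3600: 9.0, 7200: 3.0})
print(m)  # {0: 1.0, 3600: 9.0, 7200: 3.0}
```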

min((TimeSeries)self, (object)number) TimeSeries :

create a new ts that contains the min of self and number for each time-step

min( (TimeSeries)self, (TimeSeries)ts_other) -> TimeSeries :

create a new ts that contains the min of self and ts_other

min_max_check_linear_fill((TimeSeries)self, (object)v_min, (object)v_max[, (object)dt_max=time.max]) TimeSeries :
Create a min-max range checked ts with fill-values if a value is NaN or outside the range

If the underlying time-series is point-instant, then fill-values are linearly interpolated; otherwise the previous value, if available, is used as fill-value. A similar function with more features is quality_and_self_correction()

Args:

v_min (float): minimum range, values < v_min are considered NaN. v_min==NaN means no lower limit

v_max (float): maximum range, values > v_max are considered NaN. v_max==NaN means no upper limit

dt_max (int): maximum time-range in seconds allowed for interpolating/extending values, default= max_utctime

Returns:

TimeSeries: min_max_check_linear_fill. Evaluated on demand time-series with NaN, out of range values filled in

min_max_check_linear_fill( (TimeSeries)self, (object)v_min, (object)v_max [, (time)dt_max=time.max]) -> TimeSeries :

Create a min-max range checked ts with fill-values if a value is NaN or outside the range. If the underlying time-series is point-instant, then fill-values are linearly interpolated; otherwise the previous value, if available, is used as fill-value. A similar, more parameterized function is quality_and_self_correction()

Args:

v_min (float): minimum range, values < v_min are considered NaN. v_min==NaN means no lower limit

v_max (float): maximum range, values > v_max are considered NaN. v_max==NaN means no upper limit

dt_max (int): maximum time-range in seconds allowed for interpolating/extending values, default= max_utctime

Returns:

TimeSeries: min_max_check_linear_fill. Evaluated on demand time-series with NaN, out of range values filled in

min_max_check_ts_fill((TimeSeries)self, (object)v_min, (object)v_max, (object)dt_max, (TimeSeries)cts) TimeSeries :

Create a min-max range checked ts with cts-filled-in-values if value is NaN or outside range

Args:

v_min (float): minimum range, values < v_min are considered NaN. v_min==NaN means no lower limit

v_max (float): maximum range, values > v_max are considered NaN. v_max==NaN means no upper limit

dt_max (int): maximum time-range in seconds allowed for interpolating values

cts (TimeSeries): time-series that holds the values to be filled in at points that are NaN or outside the min-max limits

Returns:

TimeSeries: min_max_check_ts_fill. Evaluated on demand time-series with NaN, out of range values filled in

min_max_check_ts_fill( (TimeSeries)self, (object)v_min, (object)v_max, (time)dt_max, (TimeSeries)cts) -> TimeSeries :

Create a min-max range checked ts with cts-filled-in-values if value is NaN or outside range

Args:

v_min (float): minimum range, values < v_min are considered NaN. v_min==NaN means no lower limit

v_max (float): maximum range, values > v_max are considered NaN. v_max==NaN means no upper limit

dt_max (int): maximum time-range in seconds allowed for interpolating values

cts (TimeSeries): time-series that keeps the values to be filled in at points that are NaN or outside min-max-limits

Returns:

TimeSeries: min_max_check_ts_fill. Evaluated on demand time-series with NaN, out of range values filled in
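The first stage of all the min_max_check variants, flagging out-of-range values as NaN before fill-in, can be sketched in plain Python (the function name is my own; illustrative only, the fill step then uses linear interpolation, the previous value, or cts as described above):

```python
import math

def min_max_flag(values, v_min, v_max):
    # values outside [v_min .. v_max] become NaN; a NaN limit disables
    # that bound
    def ok(v):
        return (not math.isnan(v)
                and (math.isnan(v_min) or v >= v_min)
                and (math.isnan(v_max) or v <= v_max))
    return [v if ok(v) else float('nan') for v in values]

flagged = min_max_flag([1.0, 99.0, 2.0], 0.0, 10.0)
print(flagged)  # [1.0, nan, 2.0]
```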

nash_sutcliffe(other_ts: TimeSeries) float

Computes the Nash-Sutcliffe model efficiency coefficient (n.s) for the two time-series over the time_axis of the observed_ts, self. Ref: http://en.wikipedia.org/wiki/Nash%E2%80%93Sutcliffe_model_efficiency_coefficient

Parameters:

other_ts (TimeSeries) – the time-series that is the model simulated/calculated ts

Returns:

ns. The n.s performance, which has a maximum of 1.0

Return type:

float

needs_bind((TimeSeries)self) bool :

returns true if there are any unbound time-series in the expression this time-series represents. These functions also support symbolic time-series handling: .find_ts_bind_info(), bind() and bind_done()

partition_by((TimeSeries)self, (Calendar)calendar, (time)t, (time)partition_interval, (int)n_partitions, (time)common_t0) TsVector :

DEPRECATED (replaced by .stack): from a time-series, construct a TsVector of n time-series partitions. The partitions are simply specified by calendar, partition_interval (could be symbolic, like YEAR, MONTH, DAY) and n. To make yearly partitions, just pass Calendar.YEAR as partition_interval. The t-parameter sets the start time-point in the source time-series, e.g. 1930.09.01. The common_t0-parameter sets the common start-time of the new partitions, e.g. 2017.09.01

The typical usage will be to partition years into a vector of 80 years, on which we can do statistics and percentiles to compare and see the different effects of yearly season variations. Note that the function is more general, allowing any periodic partition, like daily, weekly, monthly etc., so you can study any pattern or statistic that might be periodic by the partition pattern. Other related methods are time_shift, average, TsVector.

Parameters:
  • calendar (Calendar) – The calendar to use, typically utc

  • t (utctime) – specifies where to pick the first partition

  • partition_interval (utctimespan) – the length of each partition, Calendar.YEAR,Calendar.DAY etc.

  • n_partitions (int) – number of partitions

  • common_t0 (utctime) – specifies the time to correlate all the partitions

Returns:

ts-partitions. with length n_partitions, each ts is time-shifted to common_t0 expressions

Return type:

TsVector

point_interpretation((TimeSeries)self) point_interpretation_policy :

returns the point interpretation policy

pow((TimeSeries)self, (object)number) TimeSeries :

create a new ts that contains pow(self, number)

pow( (TimeSeries)self, (TimeSeries)ts_other) -> TimeSeries :

create a new ts that contains pow(self, ts_other)

quality_and_self_correction((TimeSeries)self, (QacParameter)parameters) TimeSeries :

returns a new time-series that applies quality checks according to parameters, and fills in values according to the rules specified in parameters.

Parameters:

parameters (QacParameter) – Parameter with rules for quality and corrections

Returns:

ts. a new time-series where the values are subject to quality and correction as specified

Return type:

TimeSeries

quality_and_ts_correction((TimeSeries)self, (QacParameter)parameters, (TimeSeries)cts) TimeSeries :

returns a new time-series that applies quality checks according to parameters and fills in values from the cts, according to the rules specified in parameters.

Parameters:
  • parameter (QacParameter) – Parameter with rules for quality and corrections

  • cts (TimeSeries) – used to fill in correct values, as f(t), for values that fail the quality-checks

Returns:

ts. a new time-series where the values are subject to quality and correction as specified

Return type:

TimeSeries

rating_curve((TimeSeries)self, (RatingCurveParameters)rc_param) TimeSeries :

Create a new TimeSeries that is computed using a RatingCurveParameters instance.

Examples:

>>> import numpy as np
>>> from shyft.time_series import (
...     utctime_now, deltaminutes,
...     TimeAxis, TimeSeries,
...     RatingCurveFunction, RatingCurveParameters
... )
>>>
>>> # parameters
>>> t0 = utctime_now()
>>> dt = deltaminutes(30)
>>> n = 48*2
>>>
>>> # make rating function, each with two segments
>>> rcf_1 = RatingCurveFunction()
>>> rcf_1.add_segment(0, 2, 0, 1)    # add segment from level 0, computing f(h) = 2*(h - 0)**1
>>> rcf_1.add_segment(5.3, 1, 1, 1.4)  # add segment from level 5.3, computing f(h) = 1*(h - 1)**1.4
>>> rcf_2 = RatingCurveFunction()
>>> rcf_2.add_segment(0, 1, 1, 1)    # add segment from level 0, computing f(h) = 1*(h - 1)**1
>>> rcf_2.add_segment(8.0, 0.5, 0, 2)  # add segment from level 8.0, computing f(h) = 0.5*(h - 0)**2
>>>
>>> # add rating curves to a parameter pack
>>> rcp = RatingCurveParameters()
>>> rcp.add_curve(t0, rcf_1)  # rcf_1 is active from t0
>>> rcp.add_curve(t0+dt*n//2, rcf_2)  # rcf_2 takes over from t0 + dt*n/2
>>>
>>> # create a time-axis/-series
>>> ta = TimeAxis(t0, dt, n)
>>> ts = TimeSeries(ta, np.linspace(0, 12, n))
>>> rc_ts = ts.rating_curve(rcp)  # create a new time series computed using the rating curve functions
>>>
Parameters:

rc_param (RatingCurveParameter) – RatingCurveParameter instance.

Returns:

rcts. A new TimeSeries computed using self and rc_param.

Return type:

TimeSeries

repeat((TimeSeries)self, (TimeAxis)repeat_time_axis) TimeSeries :

Repeat the time-series over the given repeat_time_axis periods

Parameters:

repeat_time_axis (TimeAxis) – A time-axis that has the coarse repeat interval, like YEAR or similar

Returns:

repeated_ts. time-series where pattern of self is repeated throughout the period of repeat_time_axis

Return type:

TimeSeries

scale_by((TimeSeries)self, (object)v) None :

scale all values by the specified factor v

serialize((TimeSeries)self) ByteVector :

convert ts (expression) into a binary blob

set((TimeSeries)self, (int)i, (object)v) None :

set the i’th value

set_point_interpretation((TimeSeries)self, (point_interpretation_policy)policy) None :

set new policy

set_ts_id((TimeSeries)self, (object)ts_id) None :

Set a new ts_id of a symbolic ts; requires an unbound ts. To create symbolic time-series use TimeSeries('url://like/id') or with payload: TimeSeries('url://like/id', ts_with_values)

size((TimeSeries)self) int :

returns number of points

slice((TimeSeries)self, (object)i0, (object)n) TimeSeries :

Given that self is a concrete point-ts (not an expression), or an empty ts, return a new TimeSeries containing the n values starting from index i0.

Parameters:
  • i0 (int) – Index of first element to include in the slice

  • n (int) – Number of elements to include in the slice

stack((TimeSeries)self, (Calendar)calendar, (time)t0, (int)n_dt, (time)dt, (int)n_partitions, (time)target_t0, (time)dt_snap) TsVector :

stack time-series into a TsVector of n_partitions time-series, each with semantic calendar length n_dt x dt. The partitions are simply specified by calendar, n_dt x dt (could be symbolic, like YEAR : MONTH : DAY) and n_partitions. To make yearly partitions, just pass 1 and Calendar.YEAR as n_dt and dt respectively. The t0 parameter sets the start time point in the source time-series, e.g. 1930.09.01. The target_t0 parameter sets the common start time of the stack, e.g. 2017.09.01. The dt_snap parameter is useful to ensure that if target_t0 is a Monday, each partition is adjusted to the nearest Monday. The snap mechanism is useful when stacking something like consumption, which follows a weekly pattern.

The typical usage will be to use this function to partition years into a vector of, e.g., 80 years, on which we can compute statistics and percentiles to compare and see the effects of yearly seasonal variations. Note that the function is more general, allowing any periodic partition (daily, weekly, monthly etc.), so you can study any pattern or statistic that might be periodic by the partition pattern. Other related methods are time_shift, average and TsVector.

Parameters:
  • calendar (Calendar) – The calendar to use, typically utc

  • t0 (utctime) – specifies where to pick the first partition, e.g. 1930.09.01

  • n_dt (int) – number of calendar units for the length of the stride

  • dt (utctimespan) – the basic calendar length unit, Calendar.YEAR,Calendar.DAY

  • n_partitions (int) – number of partitions,e.g. length of the resulting TsVector

  • target_t0 (utctime) – specifies the common target time for the stack, e.g. 2017.09.01

  • dt_snap (utctimespan) – default 0, if set to WEEK, each stacked partition will be week-aligned.

Returns:

stacked_ts. A TsVector of length n_partitions, where each ts is an expression time-shifted (by calendar n_dt x dt steps) to target_t0

Return type:

TsVector
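The partition arithmetic behind stack can be sketched in plain Python. This is illustrative only, not the shyft API: the names below are invented, the stride is simplified to one calendar year per partition, and the real implementation uses Calendar arithmetic that handles time zones and DST.

```python
import datetime as dt

def partition_starts(t0: dt.datetime, n_partitions: int) -> list:
    # Illustrative only: yearly partitions, one calendar-year stride each.
    # The real stack() uses tz/DST-aware Calendar steps of n_dt x dt.
    return [t0.replace(year=t0.year + i) for i in range(n_partitions)]

def partition_shifts(t0: dt.datetime, target_t0: dt.datetime, n_partitions: int) -> list:
    # Each partition i is time-shifted so that its start lands on target_t0.
    return [target_t0 - s for s in partition_starts(t0, n_partitions)]

starts = partition_starts(dt.datetime(1930, 9, 1), 3)
shifts = partition_shifts(dt.datetime(1930, 9, 1), dt.datetime(2017, 9, 1), 3)
```

Each element of shifts is the offset applied to the corresponding source partition so all partitions line up at the common target start.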

statistics((TimeSeries)self, (TimeAxis)ta, (object)p) TimeSeries :

Create a new ts that extracts the specified statistics from self over the specified time-axis ta. Statistics are created for the point values of the time-series that fall within each time-period of the time-axis. If there are no points within a period, nan will be the result. Tip: use ts.average(ta_hourly_resolution).statistics(ta_weekly, p=50) to get the functional true hourly average statistics.

Parameters:
  • ta (TimeAxis) – time-axis for the statistics

  • p (int) – percentile range [0..100], or statistical_property.AVERAGE|MIN_EXTREME|MAX_EXTREME

Returns:

ts. a new time-series expression, will provide the statistics when requested

Return type:

TimeSeries

stringify((TimeSeries)self) str :

return human-readable string of ts or expression

time((TimeSeries)self, (int)i) time :

returns the time at the i’th point

property time_axis

the time-axis

Type:

TimeAxis

time_shift((TimeSeries)self, (time)delta_t) TimeSeries :

create a new ts that is the time-shifted version of self

Parameters:

delta_t (int) – number of seconds to time-shift, positive values moves forward

Returns:

ts. a new time-series, that appears as time-shifted version of self

Return type:

TimeSeries
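The point semantics of the shift can be sketched in plain Python (illustrative only, not the shyft API): the shifted series s satisfies s(t) = f(t - delta_t), so positive delta_t moves values forward in time.

```python
def time_shifted(f, delta_t):
    # s(t) = f(t - delta_t): a value observed at t0 in f appears at t0 + delta_t in s
    return lambda t: f(t - delta_t)

f = {0: 1.0, 3600: 2.0}.get    # toy "series": values at two exact time points
s = time_shifted(f, 3600)      # shift one hour forward
```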

total_period((TimeSeries)self) UtcPeriod :

returns the total period covered by the time-axis of this time-series

transform((TimeSeries)self, (object)points, (interpolation_scheme)method) TimeSeries :

Create a transformed time-series, having values taken from pointwise function evaluation. Function values are determined by interpolating the given points, using the specified method. Valid method arguments are ‘polynomial’, ‘linear’ and ‘catmull-rom’.

Returns:

transform_ts. New TimeSeries where each element is an evaluated-on-demand transformed time-series.

Return type:

TimeSeries

ts_id((TimeSeries)self) str :

returns ts_id of symbolic ts, or empty string if not a symbolic ts. To create symbolic time-series use TimeSeries('url://like/id') or with payload: TimeSeries('url://like/id', ts_with_values)

Returns:

ts_id. url-like ts_id as passed to constructor or empty if the ts is not a ts with ts_id

Return type:

str

unbind((TimeSeries)self) None :

Reset the ts-expression to unbound state, discarding bound symbol references. For time-series or expressions that do not have symbolic references, this has no effect. See also .find_ts_bind_info(), bind() and bind_done()

upper_half((TimeSeries)self) TimeSeries :

Create a ts that contains non-negative values only.

Returns:

upper_half_ts. Evaluated on demand inside time-series

Return type:

TimeSeries

upper_half_mask((TimeSeries)self) TimeSeries :

Create a ts that contains 1.0 in place of non-negative values, and 0.0 in case of negative values.

Returns:

upper_half_mask_ts. Evaluated on demand inside time-series

Return type:

TimeSeries

use_time_axis((TimeSeries)self, (TimeAxis)time_axis) TimeSeries :

Create a new ts that has the same values as self, but filtered to the time-axis points of the supplied time-axis. This function might be useful for making a new time-series that exactly matches the time-axis of another series. The values of the resulting time-series are: [self(t) for t in time_axis.time_points[:-1]]

Parameters:

time_axis (TimeAxis) – the wanted time-axis

Returns:

ts. a new time-series, that appears as resampled values of self

Return type:

TimeSeries
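The resampling rule quoted above can be sketched in plain Python (names illustrative, not the shyft API):

```python
def use_time_axis(series_fn, time_points):
    # one value per period: evaluate at each period-start point;
    # the last time point only closes the final period and yields no value
    return [series_fn(t) for t in time_points[:-1]]

values = use_time_axis(lambda t: 2.0 * t, [0, 10, 20, 30])
```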

use_time_axis_from((TimeSeries)self, (TimeSeries)other) TimeSeries :

Create a new ts that has the same values as self, but filtered to the time-axis points of the other supplied time-series. This function might be useful for making a new time-series that exactly matches the time-axis of another series. The values of the resulting time-series are: [self(t) for t in other.time_axis.time_points[:-1]]. Notice that the other time-series can be unbound (an expression) in this case.

Parameters:

other (TimeSeries) – time-series that provides the wanted time-axis

Returns:

ts. a new time-series, that appears as resampled values of self

Return type:

TimeSeries

property v

returns the point-values of timeseries, alias for .values

value((TimeSeries)self, (int)i) float :

returns the value at the i’th time point

property values

the point values (possibly calculated on the fly)

Type:

DoubleVector

Class TsVector

class shyft.time_series.TsVector

Bases: instance

A vector, as in a strongly typed list/array, of time-series that supports ts-math operations. You can create a TsVector from a list, or list generator, of type TimeSeries. TsVector is to TimeSeries what a numpy array is to numbers; see also TimeSeries.

Math operations and their types transformations:

  • number bin_op ts_vector -> ts_vector

  • ts_vector bin_op ts_vector -> ts_vector

  • ts bin_op ts_vector -> ts_vector

where bin_op is any of (*,/,+,-) and explicit forms of binary functions like pow,log,min,max.

In addition these are also available: average() integral() accumulate() time_shift() percentiles()

All operations return a new object, usually a ts-vector, containing the resulting expressions

Examples:

>>> import numpy as np
>>> from shyft.time_series import TsVector,Calendar,deltahours,TimeAxis,TimeSeries,POINT_AVERAGE_VALUE as fx_avg
>>>
>>> utc = Calendar()  # ensure easy consistent explicit handling of calendar and time
>>> ta1 = TimeAxis(utc.time(2016, 9, 1, 8, 0, 0), deltahours(1), 10)  # create a time-axis for ts1
>>> ts1 = TimeSeries(ta1, np.linspace(0, 10, num=len(ta1)), fx_avg)
>>> ta2 = TimeAxis(utc.time(2016, 9, 1, 8, 30, 0), deltahours(1), 5)  # create a time-axis to ts2
>>> ts2 = TimeSeries(ta2, np.linspace(0,  1, num=len(ta2)), fx_avg)
>>> tsv = TsVector([ts1, ts2]) # create ts vector from list of time-series
>>> c = tsv + tsv*3.0  # c is now a vector of expressions, lazily evaluated
>>> c_values = c[0].values.to_numpy()  # compute and extract the values of the ith (here: 0) time-series, as numpy array
>>>
>>> # Calculate data for new time-points
>>> value_1 = tsv(utc.time(2016, 9, 1, 8, 30)) # calculates value at a given time
>>> ta_target = TimeAxis(utc.time(2016, 9, 1, 7, 30), deltahours(1), 12)  # create a target time_axis
>>> tsv_new = tsv.average(ta_target) # new ts-vector with values on target time_axis
>>> ts0_val = tsv_new[0].values.to_numpy() # access values of the ith (here: 0) time-series as a numpy array
>>>
__init__((TsVector)arg1, (TsVector)clone) None :

Create a clone.

__init__( (object)arg1) -> object :

Create an empty TsVector

__init__( (object)arg1, (TsVector)cloneme) -> object :

Create a shallow clone of the TsVector

Args:

cloneme (TsVector): The TsVector to be cloned

__init__( (object)arg1, (list)ts_list) -> object :

Create a TsVector from a python list of TimeSeries

Args:

ts_list (List[TimeSeries]): A list of TimeSeries

abs((TsVector)self) TsVector :

create a new ts-vector, with all members equal to abs(py::self)

Returns:

tsv. a new TsVector expression, that will provide the abs-values of self.values

Return type:

TsVector

accumulate((TsVector)self, (TimeAxis)ta) TsVector :

create a new vector of time-series where the value of each i-th element is computed as the integral of f(t) dt from t0..ti, given the specified time-axis ta and point interpretation.

Parameters:

ta (TimeAxis) – time-axis that specifies the periods where accumulated integral is applied

Returns:

tsv. a new time-series expression, that will provide the accumulated values when requested

Return type:

TsVector

Notes

Has a point-instant interpretation, see also note in TimeSeries.accumulate() for possible consequences
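For a stair-case (point-average) series on a fixed-interval axis, the accumulation can be sketched in plain Python (illustrative only; the real accumulate handles arbitrary time-axes and nan intervals):

```python
def accumulate(values, dt):
    # element i is the integral of f from t0 up to t_i, so element 0 is 0.0
    out, acc = [], 0.0
    for v in values:
        out.append(acc)
        acc += v * dt  # stair-case contribution of interval i
    return out

acc = accumulate([1.0, 2.0, 3.0], 3600)
```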

append((TsVector)arg1, (object)arg2) None
average((TsVector)self, (TimeAxis)ta) TsVector :

create a new vector of ts that is the true average of self over the specified time-axis ta.

Parameters:

ta (TimeAxis) – time-axis that specifies the periods where true-average is applied

Returns:

tsv. a new time-series expression, that will provide the true-average when requested

Return type:

TsVector

Notes

the self point interpretation policy is used when calculating the true average

average_slice((TsVector)self, (time)lead_time, (time)delta_t, (object)n) TsVector :

Returns a ts-vector with the average time-series of the specified slice. The slice for each ts is specified by the lead_time, delta_t and n parameters. See also nash_sutcliffe, forecast_merge

Parameters:
  • lead_time (int) – number of seconds lead-time offset from each ts .time(0)

  • delta_t (int) – delta-time seconds to average as basis for n.s. simulation and observation values

  • n (int) – number of time-steps of length delta_t to slice out of each forecast/simulation ts

Returns:

ts_vector_sliced. a ts-vector with average ts of each slice specified.

Return type:

TsVector

clone_expression((TsVector)self) TsVector :

create a copy of the ts-expressions, except for the bound payload of the reference ts. For the reference terminals, those with ts_id, only the ts_id is copied. Thus, to re-evaluate the expression, those have to be bound.

Notes

this function is only useful in context where multiple bind/rebind while keeping the expression is needed.

Returns:

semantic_clone. returns a copy of the ts, except for the payload at reference/symbolic terminals, where only ts_id is copied

Return type:

TsVector

derivative((TsVector)self[, (derivative_method)method=shyft.time_series._time_series.derivative_method.DEFAULT]) TsVector :

create a new vector of ts where each i’th element is the derivative of f(t)

Parameters:

method (derivative_method) – what derivative_method variant to use

Returns:

tsv. where each member is the derivative of the source

Return type:

TsVector

evaluate((TsVector)self) TsVector :

Evaluates the expressions in TsVector multithreaded, and returns the resulting TsVector, where all items now are concrete terminals, that is, not expressions anymore. Useful client-side if you have complex large expressions where all time-series are bound (not symbols)

Returns:

evaluated_clone. returns the computed result as a new ts-vector

Return type:

TsVector

extend((TsVector)arg1, (object)arg2) None
extend_ts((TsVector)arg1, (TimeSeries)ts[, (extend_split_policy)split_policy=shyft.time_series._time_series.extend_split_policy.LHS_LAST[, (extend_fill_policy)fill_policy=shyft.time_series._time_series.extend_fill_policy.FILL_NAN[, (time)split_at=time(0)[, (object)fill_value=nan]]]]) TsVector :

create a new TsVector where all time-series are extended by ts

Args:

ts (TimeSeries): time-series to extend each time-series in self with

split_policy (extend_ts_split_policy): policy determining where to split between self and ts

fill_policy (extend_ts_fill_policy): policy determining how to fill any gap between self and ts

split_at (utctime): time at which to split if split_policy == EPS_VALUE

fill_value (float): value to fill any gap with if fill_policy == EPF_FILL

Returns:

TsVector: new_ts_vec. a new time-series vector where all time-series in self have been extended by ts

extend_ts( (TsVector)arg1, (TsVector)ts [, (extend_split_policy)split_policy=shyft.time_series._time_series.extend_split_policy.LHS_LAST [, (extend_fill_policy)fill_policy=shyft.time_series._time_series.extend_fill_policy.FILL_NAN [, (time)split_at=time(0) [, (object)fill_value=nan]]]]) -> TsVector :

create a new TsVector where each ts is extended by the matching ts from ts_vec

Args:

ts_vec (TsVector): time-series vector to extend time-series in self with

split_policy (extend_ts_split_policy): policy determining where to split between self and ts

fill_policy (extend_ts_fill_policy): policy determining how to fill any gap between self and ts

split_at (utctime): time at which to split if split_policy == EPS_VALUE

fill_value (float): value to fill any gap with if fill_policy == EPF_FILL

Returns:

TsVector: new_ts_vec. a new time-series vector where all time-series in self have been extended by the corresponding time-series in ts_vec

extract_as_table((TsVector)self, (Calendar)cal, (object)time_scale) DoubleVectorVector :

Extract values in the ts-vector as a table, where column [0] is the distinct union of all time_scale*(time-point i + cal.tz_offset(i)), and columns [1..n] are the value contributions of the i'th ts, nan if no contribution at that time-point. This function's primary usage is within the visual layer of the shyft.dashboard package, to speed up processing; the semantics and parameters reflect this.

Parameters:
  • cal (Calendar) – Calendar to use for tz-offset of each time-point (to resolve bokeh lack of tz-handling)

  • time_scale (float) – time-scale to multiply the time from si-unit [s] to any scaled unit, typically ms

Returns:

table. A 2d vector where [0] contains time, [1..n] the values

Return type:

DoubleVectorVector

forecast_merge((TsVector)self, (time)lead_time, (time)fc_interval) TimeSeries :

merge the forecasts in this vector into a time-series that is constructed by taking a slice of length fc_interval, starting lead_time into each of the forecasts in this time-series vector. The content of the vector should be ordered in forecast-time, each entry at least fc_interval separated from the previous. If there are missing forecasts (more than fc_interval between two forecasts), this is automatically repaired using extended slices from the existing forecasts

Parameters:
  • lead_time (int) – start slice number of seconds from t0 of each forecast

  • fc_interval (int) – length of each slice in seconds, and thus also gives the forecast-interval separation

Returns:

merged time-series. A merged forecast time-series

Return type:

TimeSeries
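The slicing can be sketched in plain Python (illustrative only, not the shyft API: each forecast is modeled as a (t0, {t: value}) pair, and the gap-repair mechanism is ignored):

```python
def forecast_merge(forecasts, lead_time, fc_interval):
    # take the half-open slice [t0+lead_time, t0+lead_time+fc_interval>
    # from each forecast, concatenated in forecast order
    merged = {}
    for t0, fc in forecasts:
        for t, v in fc.items():
            if t0 + lead_time <= t < t0 + lead_time + fc_interval:
                merged[t] = v
    return merged

fcs = [(0, {0: 1.0, 3600: 2.0}), (3600, {3600: 10.0, 7200: 20.0})]
merged = forecast_merge(fcs, lead_time=0, fc_interval=3600)
```

With hourly forecasts issued every hour and fc_interval of one hour, each forecast contributes exactly its first hour to the merged result.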

inside((TsVector)self, (object)min_v, (object)max_v[, (object)nan_v=nan[, (object)inside_v=1.0[, (object)outside_v=0.0]]]) TsVector :

Create an inside min-max range ts-vector that transforms point-values falling into the half-open range [min_v .. max_v> to the value inside_v (default=1.0), otherwise outside_v (default=0.0); if the value considered is nan, it is represented as nan_v (default=nan). You would typically use this function to form a true/false series (inside=true, outside=false)

Parameters:
  • min_v (float) – minimum range, values < min_v are not inside min_v==NaN means no lower limit

  • max_v (float) – maximum range, values >= max_v are not inside. max_v==NaN means no upper limit

  • nan_v (float) – value to return if the value is nan

  • inside_v (float) – value to return if the ts value is inside the specified range

  • outside_v (float) – value to return if the ts value is outside the specified range

Returns:

inside_tsv. New TsVector where each element is an evaluated-on-demand inside time-series

Return type:

TsVector
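The point-wise rule can be sketched in plain Python (illustrative only, not the shyft API):

```python
import math

def inside(v, min_v, max_v, nan_v=math.nan, inside_v=1.0, outside_v=0.0):
    # half-open range [min_v .. max_v>; a NaN limit means "no limit on that side"
    if math.isnan(v):
        return nan_v
    ok = (math.isnan(min_v) or v >= min_v) and (math.isnan(max_v) or v < max_v)
    return inside_v if ok else outside_v
```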

integral((TsVector)self, (TimeAxis)ta) TsVector :

create a new vector of ts that is the true integral of self over the specified time-axis ta, defined as the integral of the non-nan part of each time-axis interval

Parameters:

ta (TimeAxis) – time-axis that specifies the periods where true-integral is applied

Returns:

tsv. a new time-series expression, that will provide the true-integral when requested

Return type:

TsVector

Notes

the self point interpretation policy is used when calculating the true integral

log((TsVector)self) TsVector :

returns TsVector log(py::self)

max((TsVector)self, (object)number) TsVector :

returns max of vector and a number

max( (TsVector)self, (TimeSeries)ts) -> TsVector :

returns max of ts-vector and a ts

max( (TsVector)self, (TsVector)tsv) -> TsVector :

returns max of ts-vector and another ts-vector

min((TsVector)self, (object)number) TsVector :

returns min of vector and a number

min( (TsVector)self, (TimeSeries)ts) -> TsVector :

returns min of ts-vector and a ts

min( (TsVector)self, (TsVector)tsv) -> TsVector :

returns min of ts-vector and another ts-vector

nash_sutcliffe((TsVector)self, (TimeSeries)observation_ts, (time)lead_time, (time)delta_t, (object)n) float :

Computes the nash-sutcliffe (wiki nash-sutcliffe) criterion between the observation_ts and the slice of each time-series in the vector. The slice for each ts is specified by the lead_time, delta_t and n parameters. The function is provided to ease evaluation of forecast performance for different lead-time periods into each forecast. The returned value is 1.0 for a perfect match, towards -oo for no match, or nan if the observations are constant or data is missing. See also nash_sutcliffe_goal_function

Parameters:
  • observation_ts (TimeSeries) – the observation time-series

  • lead_time (int) – number of seconds lead-time offset from each ts .time(0)

  • delta_t (int) – delta-time seconds to average as basis for n.s. simulation and observation values

  • n (int) – number of time-steps of length delta_t to slice out of each forecast/simulation ts

Returns:

nash-sutcliffe value. the nash-sutcliffe criteria evaluated over all time-series in the TsVector for the specified lead-time, delta_t and number of elements

Return type:

double

percentiles((TsVector)self, (TimeAxis)time_axis, (IntVector)percentiles) TsVector :

Calculate the percentiles of all time-series over the specified time-axis. The definition is equal to e.g. NIST R-7, Excel, and R. The time-series point_fx interpretation is used when performing the true-average over the time_axis periods. This function works on bound expressions; for unbound expressions, use DtsClient.percentiles.

See also DtsClient.percentiles() if you want to evaluate percentiles of an unbound expression.

Args:

percentiles (IntVector): A list of numbers, like [0, 25, 50, -1, 75, 100], will return 6 time-series. Numbers with special semantics are: -1 -> arithmetic average, -1000 -> min extreme value, +1000 -> max extreme value

time_axis (TimeAxis): The time-axis used when applying true-average to the time-series

Returns:

TsVector: calculated_percentiles. Time-series list with evaluated percentile results, same length as input

percentiles( (TsVector)self, (TimeAxisFixedDeltaT)time_axis, (IntVector)percentiles) -> TsVector :

Calculate the percentiles of the time-series over the specified time-axis. The definition is equal to e.g. NIST R-7, Excel, and R. The time-series point_fx interpretation is used when performing the true-average over the time_axis periods. This function works on bound expressions; for unbound expressions, use DtsClient.percentiles.

See also DtsClient.percentiles() if you want to evaluate percentiles of an unbound expression.

Args:

percentiles (IntVector): A list of numbers, e.g. [0, 25, 50, -1, 75, 100], will return 6 time-series; -1 -> arithmetic average, -1000 -> min extreme value, +1000 -> max extreme value

time_axis (TimeAxisFixedDeltaT): The time-axis used when applying true-average to the time-series

Returns:

TsVector: calculated_percentiles. Time-series list with evaluated percentile results, same length as input
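The R-7 percentile definition referenced above (the default linear-interpolation method in NumPy and Excel) can be sketched in plain Python for 0 <= p <= 100 (illustrative only; the special values -1, -1000 and +1000 are not handled here):

```python
def percentile_r7(values, p):
    # NIST/Hyndman-Fan R-7: linear interpolation between closest ranks
    xs = sorted(values)
    h = (len(xs) - 1) * p / 100.0
    lo = int(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])
```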

pow((TsVector)self, (object)number) TsVector :

returns TsVector pow(py::self,number)

pow( (TsVector)self, (TimeSeries)ts) -> TsVector :

returns TsVector pow(py::self,ts)

pow( (TsVector)self, (TsVector)tsv) -> TsVector :

returns TsVector pow(py::self,tsv)

repeat((TsVector)self, (TimeAxis)repeat_time_axis) TsVector :

Repeat all time-series over the given repeat_time_axis periods

Parameters:

repeat_time_axis (TimeAxis) – A time-axis that has the coarse repeat interval, like YEAR or similar

Returns:

tsv. time-series vector, where each element is repeated according to parameter

Return type:

TsVector

size()
slice((TsVector)self, (IntVector)indexes) TsVector :

returns a slice of self, specified by indexes

Parameters:

indexes (IntVector) – the indices to pick out from self; if indexes is empty, all is returned

Returns:

slice. a new TsVector, with content according to indexes specified

Return type:

TsVector

statistics((TsVector)self, (TimeAxis)ta, (object)p) TsVector :

create a new vector of ts where each element is ts.statistics(ta,p)

Parameters:
  • ta (TimeAxis) – time-axis for the statistics

  • p (int) – percentile range [0..100], or statistical_property.AVERAGE|MIN_EXTREME|MAX_EXTREME

Returns:

tsv. a new time-series expression, will provide the statistics when requested

Return type:

TsVector

sum((TsVector)self) TimeSeries :

returns the sum of all ts in the TsVector as a single ts, as in reduce(add, ...)

time_shift((TsVector)self, (time)delta_t) TsVector :

create a new vector of ts that is the time-shifted version of self

Parameters:

delta_t (time) – number of seconds to time-shift, positive values moves forward

Returns:

tsv. a new ts-vector that appears as the time-shifted version of self

Return type:

TsVector

transform((TsVector)self, (object)points, (interpolation_scheme)method) TsVector :

Create a transformed ts-vector, having values taken from pointwise function evaluation. Function values are determined by interpolating the given points, using the specified method. Valid method arguments are ‘polynomial’, ‘linear’ and ‘catmull-rom’.

Returns:

transform_tsv. New TsVector where each element is an evaluated-on-demand transformed time-series.

Return type:

TsVector

use_time_axis((TsVector)self, (TimeAxis)time_axis) TsVector :

Create a new ts-vector applying TimeSeries.use_time_axis() on each member, e.g. resampling instant values at specified time-points.

Parameters:

time_axis (TimeAxis) – time-axis used to resample values from the original ts

Returns:

tsv. time-series vector, where each element has resampled time-axis values

Return type:

TsVector

use_time_axis_from((TsVector)self, (TimeSeries)other) TsVector :

Create a new ts-vector applying TimeSeries.use_time_axis_from() on each member

Parameters:

other (TimeSeries) – time-series that provides the wanted time-axis

Returns:

tsv. time-series vector, where each element has the time-axis from other

Return type:

TsVector

value_range((TsVector)self, (UtcPeriod)p) DoubleVector :

Computes min and max of all non-nan values in the period for bound expressions.

Parameters:

p (UtcPeriod)

Returns:

values. Resulting [min_value, max_value]. If all values are equal, min = max = the_value

Return type:

DoubleVector

values_at((TsVector)self, (time)t) DoubleVector :

Computes the value at specified time t for all time-series

Args:

t (utctime): seconds since epoch 1970 UTC

values_at( (TsVector)self, (object)t) -> DoubleVector :

Computes the value at specified time t for all time-series

Args:

t (int): seconds since epoch 1970 UTC

values_at_time(t: int)

Time series expressions

The elements in this category implement the time series expressions solution.

Class TsBindInfo

class shyft.time_series.TsBindInfo

Bases: instance

TsBindInfo gives information about the time-series and its binding, represented by an encoded string reference. Given that you have a concrete ts, you can bind bind_info.ts using bind_info.ts.bind(); see also TimeSeries.find_ts_bind_info() and TimeSeries.bind()

__init__((TsBindInfo)self) None
property id

a unique id/url that identifies a time-series in a ts-database/file-store/service

Type:

str

property ts

the ts, provides .bind(another_ts) to set the concrete values

Type:

TimeSeries

DTSS - The Distributed Time series System

The elements in this category implement the DTSS. The DTSS provides ready-to-use services and components, which are useful in themselves.

In addition, the services are extensible by python hooks, callbacks, that allow the user to extend/adapt the functionality to cover other time-series data base backends and services.

Note that the DTSS is not a database as such, but it does have a built-in high performance time-series db. The DTSS is better viewed as a computing component/service that is capable of evaluating time-series expressions, extracting the wanted information, and sending it back to the clients. One of the important properties of the DTSS is that it can bring the heavy computations to where the data is located. In addition it has a specialized advanced caching system that allows evaluations to run in memory (utilizing multi-core evaluation).

The DTSS contains a high performance in-memory queue for messages that consist of collections of time-series. The queue mechanism also provides an end-to-end handshake, so that the producer can know that the consumer has processed the queue message.

The transfer service built into the DTSS also allows efficient direct replication to other DTSS instances for time-series that match a regular expression, even with regular-expression translation before pushing to the remote instance.

The transfer mechanism is resilient to network and service interruptions, and can propagate changes to large sets of time-series in a few milliseconds (limited by network/storage bandwidth).

The open design allows it to utilize any existing legacy ts-databases/services through customization points.

Class DtsServer

class shyft.time_series.DtsServer

Bases: instance

A distributed time-series server.

The server part of the Shyft Distributed TimeSeries System (DTSS), capable of processing time-series messages and responding accordingly.

It has dual service interfaces:

  1. raw-socket boost serialized binary, use DtsClient

  2. web-api, web-socket(https/wss w. auth supported) using boost.beast, boost.spirit to process/emit messages. This also supports ts-change subscriptions.

python customization and extension capability

The user can set up callbacks to python to handle unbound symbolic time-series references, ts-urls. This means that you can use your own ts database backend if you have one that can beat the shyft-internal ts-db.

The DtsServer then resolves symbolic references by reading time-series from a service or storage for the specified period. The server object then computes the resulting time-series vector and responds to the clients with the results.

Multi-node considerations:

  1. firewall/routing: ensure that the port you are using is open for ip-traffic (use an ssh-tunnel if you need ssl/tls)

  2. we strongly recommend using linux for performance and long-term stability

The DTSS also supports master-slave mode, which allows scaling out computations to several DTSS instances; see set_master_slave_mode

backend storage

There are 3 internal backends, and customization for external storage as well. Internal storage containers:

  1. rocksdb - by facebook, configurable specifying ts_rdb in the set_container method

  2. leveldb - (deprecated, replaced by rocksdb) by google, configurable specifying ts_ldb in the set_container method

  3. filedb - fast zero overhead, and simple internal binary formats, configurable specifying ts_db in the set_container method

The kind of backend storage for the backing-store ts-containers is specified in the set_container method, for explicit creation of ts-containers. Notice that for remotely client-created containers for geo time-series storage, the default_geo_db_type applies, set to ts_rdb.

External storage can be set up by supplying python callbacks for the find, read, store and remove_container hooks. To ensure that containers are (remotely) found and configured after reboot/restart, provide a dtss configuration file where this information is stored. Specifying something other than the shyft:// prefix for the ts-urls then allows any external storage to be used.

HPC setup: configure linux os user limits. For high-performance environments, the ulimits, especially memory and the number of open files, need to be set higher than the defaults; nofiles is usually 1024, which is too low for HPC apps. We recommend 4096, 8192 or even higher for demanding databases. For tuning rocksdb or leveldb, read the tuning guides for those libraries - we provide some basic parameters for tuning, but more can be added if needed.

See also

DtsClient

__init__((DtsServer)self) None
add_auth_tokens((DtsServer)self, (StringVector)tokens) None :

Adds auth tokens, and activates authentication. Each token is compared exactly to the authorization token passed in the request. Authorization should only be used over https/wss, unless other measures (vpn/ssh tunnels etc.) are used to protect auth tokens on the wire. Important! Ensure to start_web_api with tls_only=True when using auth!

Parameters:

tokens (StringVector) – list of tokens, where each token is like Basic dXNlcjpwd2Q=, e.g. base64 of user:pwd
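The token format can be produced with the standard library; a minimal sketch (the user/password values are just placeholders):

```python
import base64

def make_auth_token(user: str, pwd: str) -> str:
    # Encode "user:pwd" as base64 and prefix with "Basic ",
    # matching the token format expected by add_auth_tokens.
    creds = base64.b64encode(f"{user}:{pwd}".encode()).decode()
    return f"Basic {creds}"

token = make_auth_token("user", "pwd")  # -> 'Basic dXNlcjpwd2Q='
```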

property alive_connections

returns currently alive connections to the server

Type:

int

property auth_needed

returns true if the server is setup with auth-tokens, requires web-api clients to pass a valid token

Type:

bool

auth_tokens((DtsServer)self) StringVector :

returns the registered authentication tokens.

cache((DtsServer)self, (StringVector)ts_ids, (TsVector)ts_vector) None :

add/update the specified ts_ids with the corresponding ts in the cache. Please notice that there is no validation of the ts_ids; they are treated as identifiers, not verified against any existing containers etc. Requests that follow will use the cached item as long as it satisfies the identifier and the coverage period requested

Parameters:
  • ts_ids (StringVector) – a list of time-series ids

  • ts_vector (TsVector) – a list of corresponding time-series
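A minimal sketch of pre-warming the cache (the ts-url and values are hypothetical; no container needs to exist, since ts_ids are treated purely as identifiers):

```python
from shyft import time_series as sa

utc = sa.Calendar()
ta = sa.TimeAxis(utc.time(2020, 1, 1), sa.deltahours(1), 24)

dtss = sa.DtsServer()
# hypothetical ts-url; the cache does not validate it against any container
ts_ids = sa.StringVector(["shyft://demo/signal_a"])
tsv = sa.TsVector()
tsv.append(sa.TimeSeries(ta, fill_value=1.0))
dtss.cache(ts_ids, tsv)  # reads covering this period can now be served from cache
```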

property cache_max_items

cache_max_items is the maximum number of time-series identities that are kept in memory. Elements exceeding this capacity are evicted using the least-recently-used algorithm. Notice that assigning a lower value than the existing value will also flush out time-series from the cache in least-recently-used order.

Type:

int

property cache_memory_target

The memory max target in number of bytes. If not set directly, the following equation is used: cache_memory_target = cache_ts_initial_size_estimate * cache_max_items. When setting the target directly, the number of items in the cache is set so that real memory usage is less than the specified target. The setter could cause elements to be flushed out of the cache.

Type:

int

property cache_stats

the current cache statistics

Type:

CacheStats

property cache_ts_initial_size_estimate

The initial time-series size estimate in bytes for the cache mechanism, used in the memory-target = cache_ts_initial_size_estimate * cache_max_items computation. Notice that assigning a lower value than the existing value will also flush out time-series from the cache in least-recently-used order.

Type:

int

property cb

callback for binding unresolved time-series references to concrete time-series. Called if the incoming message contains unbound time-series. The signature of the callback function should be cb(ts_ids: StringVector, read_period: UtcPeriod) -> TsVector

Examples:

>>> from shyft import time_series as sa
>>> def resolve_and_read_ts(ts_ids,read_period):
>>>     print('ts_ids:', len(ts_ids), ', read period=', str(read_period))
>>>     ta = sa.TimeAxis(read_period.start, sa.deltahours(1), read_period.timespan()//sa.deltahours(1))
>>>     x_value = 1.0
>>>     r = sa.TsVector()
>>>     for ts_id in ts_ids :
>>>         r.append(sa.TimeSeries(ta, fill_value = x_value))
>>>         x_value = x_value + 1
>>>     return r
>>> # and then bind the function to the callback
>>> dtss=sa.DtsServer()
>>> dtss.cb=resolve_and_read_ts
>>> dtss.set_listening_port(20000)
>>> dtss.process_messages(60000)
Type:

Callable[[StringVector,UtcPeriod],TsVector]

clear((DtsServer)self) None :

stop serving connections, gracefully.

See also

cb, process_messages(msec),start_server()

clear_cache_stats((DtsServer)self) None :

clear accumulated cache_stats

close((DtsServer)self) None :

stop serving connections, gracefully.

See also

cb, process_messages(msec),start_server()

property configuration_file

configuration file to enable persistent container configurations over coldstarts

Type:

str

property default_geo_db_config

Default parameters for geo db created by clients

Type:

GeoTimeSeriesConfiguration

property default_geo_db_type

default container type for geo db created by clients (ts_rdb, ts_ldb, ts_db), defaults to ts_rdb

Type:

str

find((DtsServer)self, (object)search_expression) TsInfoVector :

Find ts information that fully matches the regular search-expression. For the shyft file-based backend, take care to specify path elements precisely, so that the number of directories visited is minimised. E.g. a/.*/my.ts will prune out any top-level directory not starting with a, but will match any subdirectories below that level. Refer to the python test-suites for a wide range of examples using find. Notice that the regexp search algorithm ignores case. Please be aware that custom backends via python extension might have different rules.

Parameters:

search_expression (str) – regular search-expression, to be interpreted by the back-end tss server

Returns:

ts_info_vector. The search result, as vector of TsInfo objects

Return type:

TsInfoVector

See also

TsInfo,TsInfoVector
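A small sketch of a server-side lookup (the container name and path pattern are hypothetical, and assume a container was configured with set_container):

```python
from shyft import time_series as sa

dtss = sa.DtsServer()
# hypothetical: a container named 'demo' configured earlier via set_container
infos = dtss.find(r"demo/measured/.*\.ts")  # case-insensitive regex match
for tsi in infos:
    # each TsInfo carries meta-data such as name and stored data period
    print(tsi.name, tsi.data_period)
```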

property find_cb

callback for finding time-series using a search-expression. Called every time the .find() method is called. The signature of the callback function should be fcb(search_expr: str)->TsInfoVector

Examples:

>>> from shyft import time_series as sa
>>> def find_ts(search_expr: str)->sa.TsInfoVector:
>>>     print('find:',search_expr)
>>>     r = sa.TsInfoVector()
>>>     tsi = sa.TsInfo()
>>>     tsi.name = 'some_test'
>>>     r.append(tsi)
>>>     return r
>>> # and then bind the function to the callback
>>> dtss=sa.DtsServer()
>>> dtss.find_cb=find_ts
>>> dtss.set_listening_port(20000)
>>> # more code to invoke .find etc.
Type:

Callable[[str],TsInfoVector]

fire_cb((DtsServer)self, (StringVector)msg, (UtcPeriod)rp) TsVector :

testing fire cb from c++

flush_cache((DtsServer)self, (StringVector)ts_ids) None :

flushes the specified ts_ids from the cache. Only has effect for ts-ids that are in the cache; non-existing items are ignored

Parameters:

ts_ids (StringVector) – a list of time-series ids to flush out

flush_cache_all((DtsServer)self) None :

flushes all items out of the cache (cache_stats remain untouched)

property geo_ts_read_cb

Callback for reading the geo_ts db. Called every time there is a need for geo_ts not stored in the cache. The signature of the callback function should be grcb(cfg:GeoTimeSeriesConfiguration, slice:GeoSlice)->GeoMatrix

Type:

Callable[[GeoTimeSeriesConfiguration,GeoSlice],GeoMatrix]

property geo_ts_store_cb

callback for storing to the geo_ts db. Called every time the client.store_geo_ts() method is called. The signature of the callback function should be gscb(cfg:GeoTimeSeriesConfiguration, tsm:GeoMatrix, replace:bool)->None

Type:

Callable[[GeoTimeSeriesConfiguration,GeoMatrix,bool],None]

get_container_names((DtsServer)self) StringVector :

Return a list of the names of containers available on the server

get_geo_db_ts_info((DtsServer)self) GeoTimeSeriesConfigurationVector :

Returns the configured geo-ts data-bases on the server, so queries can be specified and formulated

Returns:

  1. A strongly typed list of GeoTimeseriesConfiguration

Return type:

GeoTimeseriesConfigurationVector

See also

.geo_evaluate()

get_listening_ip((DtsServer)self) str :

Get the current ip listen address

Returns:

listening ip. note that 0.0.0.0 means listening for all interfaces

get_listening_port((DtsServer)self) int :

returns the port number it’s listening at for serving incoming request

get_max_connections((DtsServer)self) int :

returns the maximum number of connections to be served concurrently

property graceful_close_timeout_ms

how long to let a connection linger after the message is processed, to allow flushing the reply out to the client. Refer to dlib.net/dlib/server/server_kernel_abstract.h.html

Type:

int

is_running((DtsServer)self) bool :

true if server is listening and running

See also

start_server(),process_messages(msec)

process_messages((DtsServer)self, (object)msec) None :

wait and process messages for the specified number of msec before returning. The dtss-server is started if not already running

Parameters:

msec (int) – number of millisecond to process messages

Notes

this method releases the GIL so that callbacks are not blocked when the dtss-threads perform the callback

See also

cb,start_server(),is_running,clear()

read((DtsServer)self, (StringVector)ts_ids, (UtcPeriod)read_period[, (object)use_ts_cached_read=True[, (object)update_ts_cache=True]]) TsVector :

Reads from the db-backend/cache the specified ts_ids covering read_period. Note that the ts-backing-store, either cached or by read, will return data for:

  • at least the period needed to evaluate the read_period

  • In case of cached result, this will currently involve the entire matching cached time-series segment.

Parameters:
  • ts_ids (StringVector) – a list of shyft-urls, like shyft://abc/def

  • read_period (UtcPeriod) – the valid non-zero length period that the binding service should read from the backing ts-store/ts-service

  • use_ts_cached_read (bool) – use of server-side ts-cache

  • update_ts_cache (bool) – when reading time-series, also update the cache with the data

Returns:

tsvector. an evaluated list of point time-series in the same order as the input list

Return type:

TsVector

See also

DtsServer

remove_auth_tokens((DtsServer)self, (StringVector)tokens) None :

removes auth tokens. If all available tokens are removed, the auth requirement for clients is deactivated

Parameters:

tokens (StringVector) – list of tokens, where each token is like Basic dXNlcjpwd2Q=, e.g. base64 of user:pwd

remove_container((DtsServer)self, (object)container_url[, (object)delete_from_disk=False]) None :

remove an internal shyft store container, or an external container, from the dtss-server. A container_url of the form shyft://<container>/ will remove internal containers; all other urls will be forwarded to the remove_container_cb callback on the server. Removal of containers can take a long time to finish

Parameters:
  • container_url (str) – url of the container as pr. url definition above

  • delete_from_disk (bool) – Flag to indicate if the container should be deleted from disk

property remove_container_cb

callback for removing external containers. Called when the .remove_container() method is called with a non-shyft container url. The signature of the callback function should be rcb(container_url: string, remove_from_disk: bool)->None

Type:

Callable[[str, bool],None]

set_auto_cache((DtsServer)self, (object)active) None :

set auto-caching of all reads on or off. Default is off, and caching must then be done through explicit calls to .cache(ts_ids, ts_vector)

Parameters:

active (bool) – if set True, all reads will be put into cache

set_can_remove((DtsServer)self, (object)can_remove) None :

Set whether the DtsServer supports removing time-series. The default setting is false, so unless this method is called with true as the argument, the server will not allow removing data using DtsClient.remove.

Parameters:

can_remove (bool) – true if the server should allow removing data. false otherwise

set_container((DtsServer)self, (object)name, (object)root_dir[, (object)container_type=''[, (DtssCfg)cfg=DtssCfg()]]) None :

set (or replace) an internal shyft store container on the dtss-server. All ts-urls with shyft://<container>/ will resolve to this internal time-series storage for find/read/store operations

Parameters:
  • name (str) – Name of the container as pr. url definition above

  • root_dir (str) – A valid directory root for the container

  • container_type (str) – one of (‘ts_rdb’, ‘ts_ldb’,’ts_db’), container type to add.

Notes

currently this call should only be used when the server is not processing messages, - before starting, or after stopping listening operations
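A minimal server-setup sketch combining set_container with the lifecycle methods (the container name, directory and port are examples):

```python
from shyft import time_series as sa

dtss = sa.DtsServer()
# containers must be configured before the server starts processing messages
dtss.set_container("demo", "/tmp/dtss_demo", "ts_rdb")  # rocksdb-backed container
dtss.set_listening_port(20000)
port = dtss.start_server()  # listens and processes messages in background threads
# ... serve shyft://demo/... find/read/store requests, then shut down gracefully:
dtss.stop_server()
```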

set_geo_ts_db((DtsServer)self, (GeoTimeSeriesConfiguration)geo_ts_cfg) None :

This adds/replaces a geo-ts database on the server, so that geo-related requests can be resolved by means of this configuration and the geo-related callbacks.

Parameters:

geo_ts_cfg (GeoTimeseriesConfiguration) – The configuration for the new geo-ts data-base

set_listening_ip((DtsServer)self, (object)ip) None :

Set the ip address to a specific interface ip. Must be called prior to the start-server method

Parameters:

ip (str) – ip address, like 127.0.0.1 for the localhost-only interface

set_listening_port((DtsServer)self, (object)port_no) None :

set the listening port for the service

Parameters:
  • port_no (int) – a valid and available tcp-ip port number to listen on, typically 20000

Returns:

nothing.

Return type:

None

set_master_slave_mode((DtsServer)self, (object)ip, (object)port, (object)master_poll_time, (int)unsubscribe_threshold, (object)unsubscribe_max_delay) None :

Set master-slave mode, redirecting all IO calls on this dtss to the master ip:port dtss. This instance of the dtss is kept in sync with changes done on the master, using subscriptions to changes on the master. Calculations and caches are still done locally, offloading the computational efforts from the master.

Parameters:
  • ip (str) – The ip address where the master dtss is running

  • port (int) – The port number for the master dtss

  • master_poll_time (time) – [s] max time between each update from master, typically 0.1 s is ok

  • unsubscribe_threshold (int) – minimum number of unsubscribed time-series before also unsubscribing from the master

  • unsubscribe_max_delay (int) – maximum time to delay unsubscriptions, regardless of number
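A sketch of wiring a compute node to a master dtss (the host, ports and threshold values are examples):

```python
from shyft import time_series as sa

slave = sa.DtsServer()
slave.set_master_slave_mode(
    "10.0.0.10",     # ip: where the master dtss runs
    20000,           # port: master dtss port
    sa.time(0.1),    # master_poll_time: [s] max time between updates from master
    100,             # unsubscribe_threshold: min unsubscribed ts before telling master
    sa.time(60),     # unsubscribe_max_delay: max delay before unsubscriptions propagate
)
slave.set_listening_port(20001)
slave.start_server()  # serves clients locally, IO redirected to the master
```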

set_max_connections((DtsServer)self, (object)max_connect) None :

limits simultaneous connections to the server (it’s multithreaded, and uses one thread per connection)

Parameters:

max_connect (int) – maximum number of connections before denying more connections

See also

get_max_connections()

start_async((DtsServer)self) int :

(deprecated, use start_server) start server listening in background, and processing messages

See also

set_listening_port(port_no),set_listening_ip,is_running,cb,process_messages(msec)

Returns:

port_no. the port used for listening operations, either the value as by set_listening_port, or if it was unspecified, a new available port

Return type:

int

Notes

you should have set up the callback cb before calling start_async

Also notice that processing will acquire the GIL, so you need to release the GIL to allow for processing messages

See also

process_messages(msec)

start_server((DtsServer)self) int :

start server listening in background, and processing messages

See also

set_listening_port(port_no),set_listening_ip,is_running,cb,process_messages(msec)

Returns:

port_no. the port used for listening operations, either the value as by set_listening_port, or if it was unspecified, a new available port

Return type:

int

Notes

you should have set up the callback cb before calling start_server

Also notice that processing will acquire the GIL, so you need to release the GIL to allow for processing messages

See also

process_messages(msec)

start_web_api((DtsServer)self, (object)host_ip, (object)port, (object)doc_root[, (object)fg_threads=2[, (object)bg_threads=4[, (object)tls_only=False]]]) int :

starts the dtss web-api on the specified host_ip, port, doc_root and number of threads

Parameters:
  • host_ip (str) – 0.0.0.0 for any interface, 127.0.0.1 for local only etc.

  • port (int) – port number to serve the web_api on, ensure it’s available!

  • doc_root (str) – directory from which we will serve http/https documents, like index.html etc.

  • fg_threads (int) – number of web-api foreground threads, typical 1-4 depending on load

  • bg_threads (int) – number of long running background threads workers to serve dtss-request etc.

  • tls_only (bool) – default false, set to true to enforce tls sessions only.

Returns:

port. The real port number used; if 0 is passed as port, it is auto-allocated

Return type:

int

stop_server((DtsServer)self[, (object)timeout=1000]) None :

stop serving connections, gracefully.

See also

start_server()

stop_web_api((DtsServer)self) None :

Stops any ongoing web-api service

store((DtsServer)self, (TsVector)tsv, (StorePolicy)store_policy) None :

Store the time-series in the ts-vector in the dtss backend, i.e. the time-series fragment data passed to the backend.

If store_policy.strict == True, it is semantically stored as if:

  1. first erasing the existing stored points in the range of ts.time_axis().total_period()

  2. then inserting the points of the ts.

Thus, only the parts of the time-series that are covered by the passed ts-fragment are modified. If there is no previously existing time-series, it merely stores the ts-fragment as the initial content and definition of the time-series.

When creating a time-series for the first time, pay attention to the time-axis and point-interpretation, as these remain the properties of the newly created time-series. Storing 15min data to a time-series initially defined as an hourly series will raise an exception. On the other hand, variable-interval time-series are generic and will accept any time-resolution to be stored.

If store_policy.strict == False, the passed time-series fragment is interpreted as an f(t), and projected onto the time-axis time-points/intervals of the target time-series. If the target time-series is a stair-case type (POINT_AVERAGE_VALUE), then the true average of the passed time-series fragment is used to align with the target. If the target time-series is a linear type (POINT_INSTANT_VALUE), then the f(t) of the passed time-series fragment at the time-points of the target series is used.

store_policy.recreate == True is used to replace the entire definition of any previously stored time-series. This is semantically as if erasing the previously stored time-series and replacing its entire content and definition, starting fresh with the newly passed time-series.

store_policy.best_effort == True or False controls how logical errors are handled. If best_effort is True, then storing is attempted for all time-series, and if any fail, the return value of the function will be a non-empty list of diagnostics identifying those that failed. If best_effort is False, an exception is raised on the first item that fails, and the remaining items are not stored.

The time-series should be created like this, with a url and a concrete point-ts:

>>>   a=sa.TimeSeries(ts_url,ts_points)
>>>   tsv.append(a)
Parameters:
  • tsv (TsVector) – ts-vector with time-series, url-reference and values to be stored at dtss server

  • store_policy (StorePolicy) – Determines how to project the passed time-series fragments to the backend stored time-series

Returns:

diagnostics. For any failed items, normally empty

Return type:

TsDiagnosticsItemList

See also

TsVector
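A sketch of storing a ts-fragment under a ts-url (the container/url and values are examples, and assume a container named demo has been configured on the server):

```python
from shyft import time_series as sa

utc = sa.Calendar()
ta = sa.TimeAxis(utc.time(2021, 1, 1), sa.deltahours(1), 24)
ts_points = sa.TimeSeries(ta, fill_value=0.0,
                          point_fx=sa.POINT_AVERAGE_VALUE)  # stair-case series

tsv = sa.TsVector()
# url + concrete point-ts, as described above
tsv.append(sa.TimeSeries("shyft://demo/level", ts_points))

sp = sa.StorePolicy()
sp.strict = True       # erase-and-insert over the fragment's total_period
sp.best_effort = True  # collect diagnostics instead of raising on first failure

dtss = sa.DtsServer()  # assumes set_container("demo", ...) was done earlier
diags = dtss.store(tsv, sp)  # empty diagnostics list when all items stored ok
```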

property store_ts_cb

callback for storing time-series. Called every time the .store_ts() method is called and non-shyft urls are passed. The signature of the callback function should be scb(tsv: TsVector)->None

Examples:

>>> from shyft import time_series as sa
>>> def store_ts(tsv:sa.TsVector)->None:
>>>     print('store:',len(tsv))
>>>     # each member is a bound ref_ts with an url
>>>     # extract the url, decode and store
>>>     #
>>>     #
>>>     return
>>> # and then bind the function to the callback
>>> dtss=sa.DtsServer()
>>> dtss.store_ts_cb=store_ts
>>> dtss.set_listening_port(20000)
>>> # more code to invoke .store_ts etc.
Type:

Callable[[TsVector],None]

swap_container((DtsServer)self, (object)container_name_a[, (object)container_name_b=False]) None :

Swap the backend storage for containers a and b. The content of a and b should be equal prior to the call to ensure the wanted semantics, as well as cache correctness. This is the case if a is immutable and copied to b prior to the operation. If a is not permanently immutable, immutability has to be ensured at least for the time the copy/swap operation takes. The intended purpose is to support migration and moving ts-db backends. When the swap is done, remove_container can be used for the container that is redundant. A typical operation is copy a -> a_tmp, then swap(a, a_tmp), then remove(shyft://a_tmp, True)

Parameters:
  • container_name_a (str) – Name of container a

  • container_name_b (str) – Name of container b
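The copy/swap/remove migration recipe above can be sketched like this (container names, paths and backend types are examples; the bulk copy step is elided):

```python
from shyft import time_series as sa

dtss = sa.DtsServer()
dtss.set_container("a", "/data/dtss/a", "ts_db")           # legacy filedb container
dtss.set_container("a_tmp", "/data/dtss/a_tmp", "ts_rdb")  # new rocksdb container
# 1. copy all time-series from a to a_tmp (find/read/store, not shown),
#    keeping a immutable while the copy runs
# 2. swap backends, so shyft://a/... now resolves to the rocksdb store
dtss.swap_container("a", "a_tmp")
# 3. drop the now-redundant container, deleting its files from disk
dtss.remove_container("shyft://a_tmp", True)
```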

class shyft.time_series.DtssCfg

Bases: instance

Configuration of google leveldb specific parameters.

Each parameter has reasonable defaults; have a look at the google leveldb documentation for the effect of max_file_size, write_buffer_size and compression. The ppf remains constant once the db is created (any changes will be ignored). The others can be changed on persisted/existing databases.

About compression: it turns out that although very effective for a lot of time-series, it has a single-thread performance cost of 2..3x native read/write performance due to compression/decompression.

However, for geo dtss we are using multithreaded writes, so performance is limited by the io-capacity, and it might be set to true for those kinds of scenarios.

__init__((DtssCfg)self) None
__init__( (DtssCfg)self, (object)ppf, (object)compress, (object)max_file_size, (object)write_buffer_size [, (object)log_level=200 [, (object)test_mode=0 [, (object)ix_cache=0 [, (object)ts_cache=0]]]]) -> None :

construct a DtssCfg with all values specified

property compression

(default False), using snappy compression; could reduce storage 1:3 at a similar cost in performance

Type:

bool

property ix_cache

low-level index-cache, could be useful when working with large compressed databases

Type:

int

property log_level

default warn(200), trace(-1000),debug(0),info(100),error(300),fatal(400)

Type:

int

property max_file_size

(default 100MB), choose to make a reasonable number of files for storing time-series

Type:

int

property ppf

(default 1024) ts-points per fragment (e.g. key/value), i.e. how large a ts is chunked into fragments; read/write operations to the key-value storage are in fragment sizes.

Type:

int

property test_mode

for internal use only, should always be set to 0(the default)

Type:

int

property ts_cache

low-level data-cache, could be useful in case of very large compressed databases

Type:

int

property write_buffer_size

(default 10MB), to balance write io-activity.

Type:

int
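A sketch of constructing a DtssCfg and using it when creating a container (the values, container name and path are illustrative):

```python
from shyft import time_series as sa

# ppf=1024 points per fragment, no compression,
# 100 MB max file size, 10 MB write buffer
cfg = sa.DtssCfg(1024, False, 100 * 1024 * 1024, 10 * 1024 * 1024)

dtss = sa.DtsServer()
# pass the cfg when creating/replacing the backing container
dtss.set_container("demo", "/data/dtss/demo", "ts_rdb", cfg)
```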

Class DtsClient

class shyft.time_series.DtsClient

Bases: instance

The client side part of the distributed time series system(DTSS).

The DtsClient communicates with the DtsServer using an efficient raw-socket protocol based on boost binary serialization. A typical operation would be that the DtsClient forwards a TsVector (that represents lists and structures of time-series expressions) to the DtsServer(s), which takes care of binding unbound symbolic time-series, evaluates, and returns the results back to the DtsClient. This class is closely related to the DtsServer, and a useful reference is also TsVector .

Best practice for client/server is to use cache following two simple rules(the default):

  1. Always cache writes (because then consumers get the data fresh and fast).

  2. Always use caching reads (utilizes and maintains the adaptive cache).

There are only two known, very rare and special scenarios where deviating from this is useful. Uncached writes can be useful when loading the large initial content of a time-series db. Caching reads should be turned off when using a 3rd-party dtss backend extension, where the 3rd-party db is written/modified outside the control of the dtss. Also note that the caching works with the ts-terminals, not the result of the expressions. When reading time-series expressions, such as ts = ts1 - ts2, the cache contains the ts-terminals (here, ts1 and ts2), not the expression itself (ts). The .cache_stats property provides cache statistics for the server. The cache can be flushed, useful for some special cases of loading data outside the cache.

__init__((DtsClient)self, (object)host_port[, (object)auto_connect=True[, (object)timeout_ms=1000]]) None :
Constructs a dts-client with the specified host_port parameter.

A connection is immediately done to the server at specified port. If no such connection can be made, it raises a RuntimeError.

host_port (string): a string of the format ‘host:portnumber’, e.g. ‘localhost:20000’

auto_connect (bool): default True, connection pr. call. If false, the connection lasts the lifetime of the object unless explicitly closed/reopened

timeout_ms (int): default 1000ms, used for timeout of connect/reconnect/close operations

__init__( (DtsClient)self, (StringVector)host_ports, (object)auto_connect, (object)timeout_ms) -> None :

Constructs a dts-client with the specifed host_ports parameters. A connection is immediately done to the server at specified port. If no such connection can be made, it raises a RuntimeError. If several servers are passed, the .evaluate and .percentile function will partition the ts-vector between the provided servers and scale out the computation

host_ports (StringVector): a list of strings of the format ‘host:portnumber’, e.g. ‘localhost:20000’

auto_connect (bool): default True, connection pr. call. if false, connection last lifetime of object unless explicitly closed/reopened

timeout_ms (int): default 1000ms, used for timeout connect/reconnect/close operations

add_geo_ts_db((DtsClient)self, (GeoTimeSeriesConfiguration)geo_cfg) None :

Adds a new geo time-series database to the dtss-server with the given specifications

geo_cfg (GeoTimeSeriesConfiguration): the configuration to be added to the server specifying the dimensionality etc.

See also

.get_geo_db_ts_info()

property auto_connect

Whether connections are made as needed and kept short; otherwise they are externally managed.

Type:

bool

cache_flush((DtsClient)self) None :

Flush the cache (including statistics) on the server. This can be useful in the scenario where cache_on_write=False in the store operations.

property cache_stats

Get the cache_stats (including statistics) on the server.

Type:

CacheStats

close((DtsClient)self[, (object)timeout_ms=1000]) None :

Close the connection. If auto_connect is enabled it will automatically reopen if needed.

property compress_expressions

If True, the expressions are compressed before sending to the server. For expressions of any size, like 100 elements with expression depth 100 (e.g. nested sums), this can speed up the transmission by a factor of 3.

Type:

bool

property connections

Get remote server connections.

Type:

int

evaluate((DtsClient)self, (TsVector)ts_vector, (UtcPeriod)utcperiod[, (object)use_ts_cached_read=True[, (object)update_ts_cache=True[, (UtcPeriod)clip_result=[not-valid-period>]]]) TsVector :

Evaluates the expressions in the ts_vector. If the expression includes unbound symbolic references to time-series, these time-series will be passed to the binding service callback on the server side, passing on the specified utcperiod.

Note that the ts-backing-store, either cached or by read, will return data for:
  • at least the period needed to evaluate the utcperiod

  • In case of cached result, this will currently involve the entire matching cached time-series segment.

In particular, this means that the returned result could be larger than the specified utcperiod, unless you specify clip_result

Expressions including time-axis operations, such as x.average(ta), can be used to exactly control the returned result size. Also note that the semantics of utcperiod are to ensure that enough data is read from the backend to evaluate the expressions. Use the clip_result argument to clip the time-range of the resulting time-series to fit your needs - this will typically be in scenarios where you have not supplied time-axis operations (unbounded evaluation) and you are also using caching.

See also

DtsClient.percentiles() if you want to evaluate percentiles of an expression.

Parameters:
  • ts_vector (TsVector) – a list of time-series (expressions), including unresolved symbolic references

  • utcperiod (UtcPeriod) – the valid non-zero length period that the binding service should read from the backing ts-store/ts-service

  • use_ts_cached_read (bool) – use of server-side ts-cache

  • update_ts_cache (bool) – when reading time-series, also update the cache with the data

  • clip_result (UtcPeriod) – If supplied, clip the time-range of the resulting time-series to cover evaluation f(t) over this period only

Returns:

tsvector. an evaluated list of point time-series in the same order as the input list

Return type:

TsVector

See also

DtsServer
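A client-side sketch evaluating a symbolic expression (the host:port and ts-urls are examples, and assume a DtsServer is listening on localhost:20000 with a demo container):

```python
from shyft import time_series as sa

utc = sa.Calendar()
period = sa.UtcPeriod(utc.time(2021, 1, 1), utc.time(2021, 1, 2))

client = sa.DtsClient("localhost:20000")
try:
    a = sa.TimeSeries("shyft://demo/a")  # unbound symbolic references,
    b = sa.TimeSeries("shyft://demo/b")  # resolved server-side during evaluate
    expr = sa.TsVector()
    expr.append((a + b) * 0.5)           # expression evaluated on the server
    result = client.evaluate(expr, period)
finally:
    client.close()
```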

find((DtsClient)self, (object)search_expression) TsInfoVector :

Find ts information that fully matches the regular search-expression. For the shyft file-based backend, take care to specify path elements precisely, so that the number of directories visited is minimised. E.g. a/.*/my.ts will prune out any top-level directory not starting with a, but will match any subdirectories below that level. Refer to the python test-suites for a wide range of examples using find. Notice that the regexp search algorithm ignores case. Please be aware that custom backends via python extension might have different rules.

Parameters:

search_expression (str) – regular search-expression, to be interpreted by the back-end tss server

Returns:

ts_info_vector. The search result, as vector of TsInfo objects

Return type:

TsInfoVector

See also

TsInfo,TsInfoVector

geo_evaluate((DtsClient)self, (object)geo_ts_db_name, (StringVector)variables, (IntVector)ensembles, (TimeAxis)time_axis, (time)ts_dt, (GeoQuery)geo_range, (object)concat, (time)cc_dt0[, (object)use_cache=True[, (object)update_cache=True]]) GeoTsMatrix :

Evaluates a geo-temporal query on the server, and return the results

Args:

geo_ts_db_name (string): The name of the geo_ts_db, e.g. arome, ec, arome_cc ec_cc etc.

variables (StringVector): list of variables, like ‘temperature’,’precipitation’. If empty, return data for all available variables

ensembles (IntVector): list of ensembles to read, if empty return all available

time_axis (TimeAxis): return geo_ts where t0 matches time-points of this time-axis. If concat, the ta.total_period().end determines how long to extend latest forecast

ts_dt (time): specifies the length of the time-slice to read from each time-series

geo_range (GeoQuery): Specify polygon to include, empty means all

concat (bool): If true, the geo_ts for each ensemble/point is joined together to form one single time-series, concatenating a slice from each of the forecasts

cc_dt0 (time): concat delta time to skip from beginning of each geo_ts, so you can specify 3h, then select +3h.. slice-end from each forecast

use_cache (bool): use cache if available(speedup)

update_cache (bool): if reading data from backend, also stash it to the cache for faster evaluations

Returns:

GeoMatrix: r. A matrix where the elements are GeoTimeSeries, accessible using indices time, variable, ensemble, t0

See also:

.get_geo_ts_db_info()

geo_evaluate( (DtsClient)self, (GeoEvalArgs)eval_args [, (object)use_cache=True [, (object)update_cache=True]]) -> GeoTsMatrix :

Evaluates a geo-temporal query on the server, and return the results

Args:

eval_args (GeoEvalArgs): complete set of arguments for geo-evaluation, including geo-db, scope for variables, ensembles, time and geo-range

use_cache (bool): use cache if available(speedup)

update_cache (bool): if reading data from backend, also stash it to the cache for faster evaluations

Returns:

GeoMatrix: r. A matrix where the elements are GeoTimeSeries, accessible using indices time, variable, ensemble, t0

See also:

.get_geo_ts_db_info()

geo_store((DtsClient)self, (object)geo_ts_db_name, (GeoMatrix)tsm, (object)replace[, (object)cache=True]) None :

Store a ts-matrix with needed dimensions and data to the specified geo-ts-db

Parameters:
  • geo_ts_db_name (string) – The name of the geo_ts_db, e.g. arome, ec, arome_cc, ec_cc etc.

  • tsm (TsMatrix) – A dense matrix with dimensionality complete for variables, ensembles and geo-points, flexible time-dimension 1..n

  • replace (bool) – Replace existing geo time-series with the new ones, does not extend existing ts, replaces them!

  • cache (bool) – Also put values to the cache

See also

.get_geo_ts_db_info(),.geo_evaluate

get_container_names((DtsClient)arg1) StringVector :

Return a list of the names of containers available on the server

get_geo_db_ts_info((DtsClient)self) GeoTimeSeriesConfigurationVector :

Returns the configured geo-ts data-bases on the server, so queries can be specified and formulated

Returns:

  1. A strongly typed list of GeoTimeseriesConfiguration

Return type:

GeoTimeseriesConfigurationVector

See also

.geo_evaluate()

get_server_version((DtsClient)arg1) str :

Returns the server version major.minor.patch string; if there are multiple servers, the version of the first is returned

get_transfer_status((DtsClient)self, (object)name, (object)clear_status) TransferStatus :

Get status of the named transfer; if clear_status is True, also clear it.

Parameters:
  • name – the name of the transfer

  • clear_status – if true, also clear status at server

Returns:

transfer_status. The TransferStatus

get_transfers((DtsClient)self) TransferConfigurationList :

returns configured active transfers.

Returns:

transfer_configurations. A list of configured Transfers

get_ts_info((DtsClient)self, (object)ts_url) TsInfo :

Get ts information for a time-series from the backend

Parameters:

ts_url (str) – Time-series url to lookup ts info for

Returns:

ts_info. A TsInfo object

Return type:

TsInfo

See also

TsInfo

merge_store_ts_points((DtsClient)self, (TsVector)tsv[, (object)cache_on_write=True]) None :

Merge the ts-points supplied in the tsv into the existing time-series on the server side. The effect of each ts is similar to as if:

  1. read ts.total_period() from ts point store

  2. in memory apply the TimeSeries.merge_points(ts) on the read-ts

  3. write the resulting merge-result back to the ts-store

This function is suitable for typical data-collection tasks where the points collected are from an external source, arrive in batches, and should simply be added to the existing point-set

Parameters:
  • tsv (TsVector) – ts-vector with time-series, url-reference and values to be stored at dtss server

  • cache_on_write (bool) – updates the cache with the result of the merge operation; if set to False, this is skipped. Notice that skipping the cache update is only useful for very special use-cases.

Returns:

None.

See also

TsVector
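The three merge steps above can be sketched with plain Python, using a dict of {time: value} in place of a stored point time-series. The merge_points function here is an illustrative stand-in, not the shyft implementation:

```python
def merge_points(stored, batch):
    """Points from the new batch replace overlapping stored points
    at the same time, and extend the series with any new times."""
    merged = dict(stored)
    merged.update(batch)
    return merged

stored = {0: 1.0, 3600: 2.0}            # existing point-set
batch = {3600: 2.5, 7200: 3.0}          # overlaps t=3600, appends t=7200
result = merge_points(stored, batch)
# result == {0: 1.0, 3600: 2.5, 7200: 3.0}
```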

percentiles((DtsClient)self, (TsVector)ts_vector, (UtcPeriod)utcperiod, (TimeAxis)time_axis, (IntVector)percentile_list[, (object)use_ts_cached_read=True[, (object)update_ts_cache=True]]) TsVector :

Evaluates the expressions in the ts_vector for the specified utcperiod. If the expression includes unbound symbolic references to time-series, these time-series will be passed to the binding service callback on the serverside.

Parameters:
  • ts_vector (TsVector) – a list of time-series (expressions), including unresolved symbolic references

  • utcperiod (UtcPeriod) – the valid non-zero length period that the binding service should read from the backing ts-store/ts-service

  • time_axis (TimeAxis) – the time_axis for the percentiles, e.g. a weekly time_axis

  • percentile_list (IntVector) – a list of percentiles, where -1 means true average, 25 = 25th percentile, etc.

  • use_ts_cached_read (bool) – utilize server-side cached results

  • update_ts_cache (bool) – when reading time-series, also update the server-side cache

Returns:

tsvector. an evaluated list of percentile time-series in the same order as the percentile input list

Return type:

TsVector

See also

.evaluate(), DtsServer
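A minimal sketch of what one evaluated time-step means: across all series in the ts_vector, each requested percentile yields one value, with -1 denoting the true average. The nearest-rank percentile used here is a simplification; the server-side method may interpolate differently:

```python
from statistics import mean

def percentile(values, pct):
    """Nearest-rank percentile over the values of all series at one
    time-step; pct == -1 means the true average (per the API docs)."""
    if pct == -1:
        return mean(values)
    s = sorted(values)
    idx = min(len(s) - 1, int(pct / 100.0 * len(s)))
    return s[idx]

values_at_step = [1.0, 2.0, 3.0, 4.0]  # one value per series at this step
result = [percentile(values_at_step, p) for p in [-1, 25, 100]]
# result == [2.5, 2.0, 4.0]
```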

q_ack((DtsClient)self, (object)name, (object)msg_id, (object)diagnostics) None :

After q_get, q_ack confirms back to the process that called q_put that the message is ok/handled.

Parameters:
  • name – the name of the queue

  • msg_id – the msg_id, required to be unique within the current messages kept by the queue

  • diagnostics – the freetext diagnostics to put along with the message, we recommend json formatted

q_add((DtsClient)self, (object)name) None :

Add a named queue to the dtss server

Parameters:

name – the name of the new queue, required to be unique

q_get((DtsClient)self, (object)name, (time)max_wait) QueueMessage :

Get a message out from the named queue, waiting max_wait time for it if it’s not already there.

Parameters:
  • name – the name of the queue

  • max_wait (time) – max time to wait for the message to arrive

Returns:

q_msg. A queue message consisting of .info describing the message, and the time-series vector .tsv

q_list((DtsClient)self) StringVector :

returns a list of defined queues on the dtss server

q_maintain((DtsClient)self, (object)name, (object)keep_ttl_items[, (object)flush_all=False]) None :

Maintains the queue: removes items that have passed through the queue and are marked as done. To flush absolutely all items, pass flush_all=True.

Parameters:
  • name – the name of the queue

  • keep_ttl_items – if true, the ttl set for done messages is respected, and they are not removed until created+ttl has expired

  • flush_all – removes all items in the queue and kept by the queue; the queue is emptied

q_msg_info((DtsClient)self, (object)name, (object)msg_id) QueueMessageInfo :

From the specified queue, fetch info about the specified msg_id. By inspecting the provided information, one can see when the message was created, fetched, and done with.

Parameters:
  • name – the name of the queue

  • msg_id – the msg_id

Returns:

msg_info. the information/state of the identified message

q_msg_infos((DtsClient)self, (object)name) QueueMessageInfoVector :

Returns all message information records from a queue, including not yet pruned fetched/done messages

Parameters:

name – the name of the queue

Returns:

msg_infos. the list of information kept in the named queue

q_put((DtsClient)self, (object)name, (object)msg_id, (object)description, (time)ttl, (TsVector)tsv) None :

Put a message, as specified with the supplied parameters, into the specified named queue.

Parameters:
  • name – the name of the queue

  • msg_id – the msg_id, required to be unique within the current messages kept by the queue

  • description – the freetext description to put along with the message, we recommend json formatted

  • ttl (time) – time-to-live for the message after done; if specified, the q_maintain process can be asked to keep done messages that have ttl

  • tsv (TsVector) – time-series vector, with the wanted payload of time-series

q_remove((DtsClient)self, (object)name) None :

Removes a named queue from dtss server, including all data in flight on the queue

Parameters:

name – the name of the queue

q_size((DtsClient)self, (object)name) int :

Returns number of queue messages waiting to be read by q_get.

Parameters:

name – the name of the queue

Returns:

unread count. number of elements queued up
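The put/get/ack/maintain lifecycle of the queue API can be sketched with a toy in-memory queue. This is illustrative only; ToyQueue and its members are not part of shyft, and the real q_get waits up to max_wait for a message:

```python
class ToyQueue:
    """Toy model of the dtss queue states: in-queue -> fetched -> done."""

    def __init__(self):
        self.msgs = {}  # msg_id -> {"tsv", "desc", "fetched", "done"}

    def put(self, msg_id, description, tsv):
        self.msgs[msg_id] = {"tsv": tsv, "desc": description,
                             "fetched": False, "done": False}

    def get(self):
        # q_get: hand out the first not-yet-fetched message
        for m in self.msgs.values():
            if not m["fetched"]:
                m["fetched"] = True
                return m
        return None

    def ack(self, msg_id, diagnostics):
        # q_ack: end-to-end confirmation from the receiver
        self.msgs[msg_id]["done"] = True
        self.msgs[msg_id]["diag"] = diagnostics

    def maintain(self):
        # q_maintain: prune messages that are marked done
        self.msgs = {k: v for k, v in self.msgs.items() if not v["done"]}

q = ToyQueue()
q.put("m-1", "nightly forecast", ["ts-payload"])
msg = q.get()                      # receiver picks up the message
q.ack("m-1", '{"status":"ok"}')    # receiver confirms it is handled
q.maintain()                       # done messages are pruned
```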

remove((DtsClient)arg1, (object)ts_url) None :

Remove a time-series from the dtss backend. The time-series referenced by ts_url is removed from the backend DtsServer. Note that the DtsServer may prohibit removing time-series.

Parameters:

ts_url (str) – shyft url referencing a time series

remove_container((DtsClient)self, (object)container_url[, (object)delete_from_disk=False]) None :

Remove an internal shyft store container, or an external container, from the dtss-server. A container_url of the form shyft://<container>/ will remove internal containers; all other urls will be forwarded to the remove_external_cb callback on the server. Removal of containers can take a long time to finish.

Parameters:
  • container_url (str) – url of the container as pr. url definition above

  • delete_from_disk (bool) – Flag to indicate if the container should be deleted from disk

remove_geo_ts_db((DtsClient)self, (object)geo_ts_db_name) None :

Remove the specified geo time-series database from dtss-server

geo_ts_db_name (string): the name of the geo-ts-database to be removed

See also

.get_geo_db_ts_info(),add_geo_ts_db()

reopen((DtsClient)self[, (object)timeout_ms=1000]) None :

(Re)open a connection after close or server restart.

set_container((DtsClient)self, (object)name, (object)relative_path[, (object)container_type='ts_db'[, (DtssCfg)cfg=DtssCfg()]]) None :

Create an internal shyft store container on the dtss-server with a root-relative path. All ts-urls of the form shyft://<container>/ will resolve to this internal time-series storage for find/read/store operations. Will not replace existing containers that have the same name.

Parameters:
  • name (str) – Name of the container as pr. url definition above

  • relative_path (str) – A valid directory for the container relative to the root path of the server.

  • container_type (str) – one of (‘ts_rdb’, ‘ts_ldb’,’ts_db’), container type to add.

start_transfer((DtsClient)self, (TransferConfiguration)cfg) None :

Starts a transfer on the server using the provided TransferConfiguration.

Parameters:

cfg (TransferConfiguration) – the configuration for the transfer

stop_transfer((DtsClient)self, (object)name, (time)max_wait) None :

Stops, cancels, and removes a named transfer.

Parameters:
  • name – the name of the transfer to remove

  • max_wait (time) – time to let existing transfers gracefully finish

store((DtsClient)self, (TsVector)tsv, (StorePolicy)store_policy) object :

Store the time-series in the ts-vector in the dtss backend. Stores the time-series fragments data passed to the backend. If store_policy.strict == True: It is semantically stored as if

first erasing the existing stored points in the range of ts.time_axis().total_period()

then inserting the points of the ts.

Thus, only the parts of the time-series covered by the passed ts-fragment are modified. If there is no previously existing time-series, it merely stores the ts-fragment as the initial content and definition of the time-series.

When creating a time-series the 1st time, pay attention to the time-axis and point-interpretation, as these remain the properties of the newly created time-series. Storing 15min data to a time-series initially defined as an hourly series will raise an exception. On the other hand, variable-interval time-series are generic and will accept any time-resolution to be stored.

If store_policy.strict == False, the passed time-series fragment is interpreted as a f(t) and projected to the time-axis time-points/intervals of the target time-series. If the target time-series is a stair-case type (POINT_AVERAGE_VALUE), then the true average of the passed time-series fragment is used to align with the target. If the target time-series is a linear type (POINT_INSTANT_VALUE), then the f(t) of the passed time-series fragment at the time-points of the target series is used.

store_policy.recreate == True is used to replace the entire definition of any previously stored time-series. This is semantically as if erasing the previously stored time-series and replacing its entire content and definition, starting fresh with the newly passed time-series.

store_policy.best_effort controls how logical errors are handled. If best_effort is set to True, then all time-series are attempted stored, and if any failed, the returned value of the function will be a non-empty list of diagnostics identifying those that failed. If best_effort is set to False, then an exception is raised on the first item that fails, and the remaining items are not stored.

The time-series should be created like this, with url and a concrete point-ts:

>>>   a=sa.TimeSeries(ts_url,ts_points)
>>>   tsv.append(a)
Parameters:
  • tsv (TsVector) – ts-vector with time-series, url-reference and values to be stored at dtss server

  • store_policy (StorePolicy) – Determines how to project the passed time-series fragments to the backend stored time-series

Returns:

diagnostics. For any failed items, normally empty

Return type:

TsDiagnosticsItemList

See also

TsVector
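The strict erase-then-insert semantics above can be sketched with plain dicts standing in for stored point time-series. The strict_store function is an illustrative stand-in, not the shyft implementation:

```python
def strict_store(existing, fragment, period_start, period_end):
    """Erase existing points inside the fragment's total_period
    [period_start, period_end), then insert the fragment's points;
    points outside the period are left untouched."""
    kept = {t: v for t, v in existing.items()
            if not (period_start <= t < period_end)}
    kept.update(fragment)
    return kept

existing = {0: 1.0, 3600: 2.0, 7200: 3.0}
fragment = {3600: 9.0}                        # covers [3600, 7200)
result = strict_store(existing, fragment, 3600, 7200)
# only the covered part is replaced:
# result == {0: 1.0, 3600: 9.0, 7200: 3.0}
```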

store_ts((DtsClient)self, (TsVector)tsv[, (object)overwrite_on_write=False[, (object)cache_on_write=True]]) None :

Store the time-series in the ts-vector in the dtss backend. Stores the time-series fragments data passed to the backend. It is semantically stored as if

first erasing the existing stored points in the range of ts.time_axis().total_period()

then inserting the points of the ts.

Thus, only the parts of the time-series covered by the passed ts-fragment are modified. If there is no previously existing time-series, it merely stores the ts-fragment as the initial content and definition of the time-series.

When creating a time-series the 1st time, pay attention to the time-axis and point-interpretation, as these remain the properties of the newly created time-series. Storing 15min data to a time-series initially defined as an hourly series will raise an exception. On the other hand, variable-interval time-series are generic and will accept any time-resolution to be stored.

The overwrite_on_write = True, is used to replace the entire definition of any previously stored time-series. This is semantically as if erasing the previously stored time-series and replacing its entire content and definition, starting fresh with the newly passed time-series. The time-series should be created like this, with url and a concrete point-ts:

>>>   a=sa.TimeSeries(ts_url,ts_points)
>>>   tsv.append(a)
Parameters:
  • tsv (TsVector) – ts-vector with time-series, url-reference and values to be stored at dtss server

  • overwrite_on_write (bool) – When True the backend replaces the entire content and definition of any existing time-series with the passed time-series

  • cache_on_write (bool) – defaults to True; if set to False, the cache is not updated. Skipping the cache update should only be considered in very special use-cases.

Returns:

None.

See also

TsVector

swap_container((DtsClient)self, (object)container_name_a[, (object)container_name_b=False]) None :

Swap the backend storage for containers a and b. The content of a and b should be equal prior to the call to ensure wanted semantics, as well as cache correctness. This is the case if a is immutable and copied to b prior to the operation. If a is not permanently immutable, immutability has to be ensured at least for the time the copy/swap operation runs. The intended purpose is to support migration and moving ts-db backends. When the swap is done, remove_container can be used for the container that is redundant. A typical operation is copy a->`a_tmp`, then swap(a,`a_tmp`), then remove(shyft://a_tmp,True)

Parameters:
  • container_name_a (str) – Name of container a

  • container_name_b (str) – Name of container b

total_clients = 0
update_geo_ts_db_info((DtsClient)self, (object)geo_ts_db_name, (object)description, (object)json, (object)origin_proj4) None :

Update info fields of the geo ts db configuration to the supplied parameters

Parameters:
  • geo_ts_db_name (string) – The name of the geo_ts_db, e.g. arome, ec, arome_cc ec_cc etc.

  • description (str) – The description field of the database

  • json (str) – The user specified json like string

  • origin_proj4 (str) – The origin proj4 field update

See also

.get_geo_ts_db_info()

Class StorePolicy

class shyft.time_series.StorePolicy

Bases: instance

Determine how DTSS stores time-series, used in context of the DtsClient.

__init__((StorePolicy)self) None
__init__( (StorePolicy)self, (object)recreate, (object)strict, (object)cache [, (object)best_effort=False]) -> None :

construct object with specified parameters

property best_effort

try to store all, returning diagnostics for logical errors instead of raising an exception

Type:

bool

property cache

update cache with new values

Type:

bool

property recreate

recreate the time-series if it existed, with an entirely new definition

Type:

bool

property strict

use strict requirement for alignment of time-points. If True (default), require a perfectly matching time-axis on the incoming ts fragment, and transfer those points to the target. Notice that if the target is a break-point/flexible-interval time-series, then every time-axis passed is a perfect match. If False, use a functional mapping approach, resampling or averaging the passed time-series fragment to align with the target time-series: if the target is a linear time-series, the new ts fragment is evaluated as f(t) at the covered time-points in the target time-series; if the target is a stair-case time-series, the true average of the new ts fragment is evaluated over the covering/touched periods of the target time-series.

Type:

bool
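The true-average projection onto a stair-case target described above can be sketched in plain Python. The true_average function is an illustrative stand-in; shyft computes this internally from the ts fragment and target time-axis:

```python
def true_average(points, t0, t1):
    """True average over [t0, t1) of a stair-case fragment given as a
    sorted list of (t, value) pairs; each value holds until the next t."""
    total = 0.0
    for i, (pt, v) in enumerate(points):
        seg_end = points[i + 1][0] if i + 1 < len(points) else t1
        lo, hi = max(t0, pt), min(t1, seg_end)
        if hi > lo:
            total += v * (hi - lo)
    return total / (t1 - t0)

# A 15-min fragment projected onto one 1-hour target interval:
fragment = [(0, 1.0), (900, 2.0), (1800, 3.0), (2700, 4.0)]
avg = true_average(fragment, 0, 3600)
# avg == 2.5, the time-weighted mean of the four 15-min values
```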

Class TransferConfiguration

class shyft.time_series.TransferConfiguration

Bases: instance

The transfer configuration describes what time-series to transfer, when, how and where to transfer.

class HowSpec

Bases: instance

__init__()

Raises an exception. This class cannot be instantiated from Python

property partition_size

maximum number of ts to transfer in one batch

Type:

int

property retries

retries on connect/re-send for remote

Type:

int

property sleep_before_retry

on failure, how long to sleep before retrying connect/send to the remote

Type:

time

class PeriodSpec

Bases: instance

__init__()

Raises an exception. This class cannot be instantiated from Python

property now_relative

if >0, then time.now() trimmed to now_relative

Type:

time

property period

period range to transfer, either now_relative or absolute

Type:

UtcPeriod

class RemoteSpec

Bases: instance

__init__()

Raises an exception. This class cannot be instantiated from Python

property host

host ip or name

Type:

str

property port

host port number

Type:

str

class TimeSeriesSpec

Bases: instance

__init__()

Raises an exception. This class cannot be instantiated from Python

property replace_pattern

reg-ex replace pattern

Type:

str

property search_pattern

reg-ex search pattern

Type:

str

class WhenSpec

Bases: instance

__init__()

Raises an exception. This class cannot be instantiated from Python

property changed

if true, use subscription to transfer when changed

Type:

bool

property linger_time

max time to wait before transferring detected changes

Type:

time

property poll_interval

how often to check for changes

Type:

time

property schedule

if non-empty, transfer according to time-axis

Type:

TimeAxis

__init__((TransferConfiguration)self) None
property how

retries and delays for remote connection

Type:

HowSpec

property json

a user specified json info that could be useful for extensions

Type:

str

property name

name of the transfer, used for reference to active transfers

Type:

str

property period

specification of the time-period to transfer

Type:

PeriodSpec

property read_remote

transfer direction; if true, pull from remote, otherwise push

Type:

bool

property read_updates_cache

if true, also update cache while reading

Type:

bool

property store_policy

used for writing time-series

Type:

StorePolicy

property what

time-series to transfer, as reg-expr

Type:

TimeSeriesSpec

property when

oneshot, subscription, continuous, or scheduled

Type:

WhenSpec

property where

specification for the remote, host, port

Type:

RemoteSpec

Class TransferStatus

class shyft.time_series.TransferStatus

Bases: instance

Transfer status gives insight into the current state of a transfer.

class ReadError

Bases: instance

__init__()

Raises an exception. This class cannot be instantiated from Python

property code

read failure code

Type:

LookupError

property ts_url

ts-url

Type:

str

class ReadErrorList

Bases: instance

A strongly typed list of ReadError

__init__((ReadErrorList)arg1) None
__init__( (ReadErrorList)arg1, (ReadErrorList)clone) -> None :

Create a clone.

append((ReadErrorList)arg1, (object)arg2) None
extend((ReadErrorList)arg1, (object)arg2) None
class WriteError

Bases: instance

__init__()

Raises an exception. This class cannot be instantiated from Python

property code

write failure code

Type:

TsDiagnostics

property ts_url

ts-url

Type:

str

class WriteErrorList

Bases: instance

A strongly typed list of WriteError

__init__((WriteErrorList)arg1) None
__init__( (WriteErrorList)arg1, (WriteErrorList)clone) -> None :

Create a clone.

append((WriteErrorList)arg1, (object)arg2) None
extend((WriteErrorList)arg1, (object)arg2) None
__init__()

Raises an exception. This class cannot be instantiated from Python

property last_activity

time of the last transfer activity

Type:

time

property n_ts_found

number of ts found for transfer; if it is zero and readers are alive, the reader part will continue searching for time-series until they appear

Type:

int

property read_errors

read errors

Type:

ReadErrorList

property read_speed

points/seconds

Type:

float

property reader_alive

true if reader is still working/monitoring

Type:

bool

property remote_errors

remote connection errors

Type:

StringVector

property total_transferred

number of points transferred

Type:

int

property write_errors

write errors

Type:

WriteErrorList

property write_speed

points/seconds

Type:

float

property writer_alive

true if writer is still working/monitoring

Type:

bool

Class QueueMessage

class shyft.time_series.QueueMessage

Bases: instance

A QueueMessage, as returned from DtsClient.q_get(..), consists of the .info part and the payload time-series vector .tsv

__init__((QueueMessage)arg1) None
__init__( (object)arg1, (object)msg_id, (object)desc, (time)ttl, (TsVector)tsv) -> object :

constructs a QueueMessage

Args:

msg_id (str): unique identifier for the message

desc (str): custom description

ttl (time): time the message should live on the queue

tsv (TsVector): timeseries payload

property info

The information about the message

property tsv

The time-series vector payload part of the message

Class QueueMessageInfo

class shyft.time_series.QueueMessageInfo

Bases: instance

Information about the queue item, such as the state of the item: in-queue, fetched, done. This element is never created by the python user, but is a return type from the dtss queue message info related calls.

__init__((QueueMessageInfo)self) None
property created

Time when the message was put into the queue

Type:

time

property description

A user specified description, we recommend json format

Type:

str

property diagnostics

The freetext diagnostics supplied when the message was acknowledged done (see q_ack)

Type:

str

property done

Time when the message was acknowledged done by the receiver (end-to-end ack)

Type:

time

property fetched

Time when the message was fetched from the queue

Type:

time

property msg_id

The unique id for this message in the live-queue

Type:

str

property ttl

Time to live set for this message, used to prune out old messages

Type:

time

Class TsInfo

class shyft.time_series.TsInfo

Bases: instance

Gives some information from the backend ts data-store about the stored time-series, that could be useful in some contexts

__init__((TsInfo)self) None
__init__( (TsInfo)self, (object)name, (point_interpretation_policy)point_fx, (time)delta_t, (object)olson_tz_id, (UtcPeriod)data_period, (time)created, (time)modified) -> None :

construct a TsInfo with all values specified

property created

when time-series was created, seconds 1970s utc

property data_period

the period for data-stored, if applicable

property delta_t

time-axis steps, in seconds, 0 if irregular time-steps

property modified

when time-series was last modified, seconds 1970 utc

property name

the unique name

property olson_tz_id

empty, or the time-axis calendar for calendar, t0, delta_t type time-axis

property point_fx

how to interpret the points, instant value, or average over period

Class CacheStats

class shyft.time_series.CacheStats

Bases: instance

Cache statistics for the DtsServer.

__init__((CacheStats)self) None
property coverage_misses

number of misses where we did find the time-series id, but the period coverage was insufficient

Type:

int

property fragment_count

number of time-series fragments in the cache (greater than or equal to id_count)

Type:

int

property hits

number of hits by time-series id

Type:

int

property id_count

number of unique time-series identities in cache

Type:

int

property misses

number of misses by time-series id

Type:

int

property point_count

total number of time-series points in the cache

Type:

int

Geo-location Time series

The elements in this section integrate the generic time series concepts above with a geo-spatial co-ordinate system. This functionality extends to co-ordinate based queries in the time series storage.

Class GeoPoint

class shyft.time_series.GeoPoint

Bases: instance

GeoPoint commonly used in the shyft::core for representing a 3D point in the terrain model. The primary usage is in geo-located time-series and the interpolation routines

A deliberately primitive point model, aiming for efficiency and simplicity.

Units of x, y, z are metric; z is positive upwards to the sky and represents elevation, x is the east-west axis, y is the south-north axis

__init__((GeoPoint)arg1) None
__init__( (GeoPoint)arg1, (object)x, (object)y, (object)z) -> None :

construct a geo_point with x,y,z

Args:

x (float): meter units

y (float): meter units

z (float): meter units

__init__( (GeoPoint)arg1, (GeoPoint)clone) -> None :

create a copy

Args:

clone (GeoPoint): the object to clone

static difference((GeoPoint)a, (GeoPoint)b) GeoPoint :

returns GeoPoint(a.x - b.x, a.y - b.y, a.z - b.z)

static distance2((GeoPoint)a, (GeoPoint)b) float :

returns the euclidean distance^2

static distance_measure((GeoPoint)arg1, (GeoPoint)a, (object)b, (object)p) float :

return sum(a-b)^p

transform((GeoPoint)self, (object)from_epsg, (object)to_epsg) GeoPoint :

compute transformed point from_epsg to to_epsg coordinate

Args:

from_epsg (int): interpret the current point as cartesian epsg coordinate

to_epsg (int): the returned points cartesian epsg coordinate system

Returns:

GeoPoint: new point. The new point in the specified cartesian epsg coordinate system

transform( (GeoPoint)self, (object)from_proj4, (object)to_proj4) -> GeoPoint :

compute transformed point from_proj4 to to_proj4 coordinate

Args:

from_proj4 (str): interpret the current point as this proj4 specification

to_proj4 (str): the returned points proj4 specified coordinate system

Returns:

GeoPoint: new point. The new point in the specified proj4 coordinate system

property x

east->west

Type:

float

static xy_distance((GeoPoint)a, (GeoPoint)b) float :

returns sqrt((a.x - b.x)*(a.x - b.x) + (a.y - b.y)*(a.y - b.y))

property y

south->north

Type:

float

property z

ground->upwards

Type:

float

static zscaled_distance((GeoPoint)a, (GeoPoint)b, (object)zscale) float :

sqrt( (a.x - b.x)*(a.x - b.x) + (a.y - b.y)*(a.y - b.y) + (a.z - b.z)*(a.z - b.z)*zscale*zscale)
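The distance helpers above can be written out in plain Python to show the formulas; here points are plain (x, y, z) tuples rather than GeoPoint objects:

```python
from math import sqrt

def xy_distance(a, b):
    # horizontal (map-plane) distance, ignoring elevation
    return sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

def zscaled_distance(a, b, zscale):
    # 3d distance where the elevation difference is weighted by zscale
    return sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                + (a[2] - b[2]) ** 2 * zscale ** 2)

a, b = (0.0, 0.0, 100.0), (3.0, 4.0, 200.0)
d_xy = xy_distance(a, b)            # 5.0: the classic 3-4-5 triangle
d_z0 = zscaled_distance(a, b, 0.0)  # 5.0: zscale 0 ignores elevation
```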

Class GeoTimeSeries

class shyft.time_series.GeoTimeSeries

Bases: instance

A minimal geo-located time-series, a time-series plus a representative 3d mid_point

__init__((GeoTimeSeries)arg1) None
__init__( (GeoTimeSeries)arg1, (GeoPoint)mid_point, (TimeSeries)ts) -> None :

Construct a GeoTimeSeries

Args:

mid_point (GeoPoint): The 3d location representative for ts

ts (TimeSeries): Any kind of TimeSeries

property mid_point

the mid-point(of an area) for which the assigned time-series is valid

Type:

GeoPoint

property ts

the assigned time-series

Type:

TimeSeries

Class GeoTimeSeriesVector

class shyft.time_series.GeoTimeSeriesVector

Bases: instance

__init__((object)arg1) object :

Create an empty GeoTimeSeriesVector

__init__( (object)arg1, (list)geo_ts_list) -> object :

Create a GeoTimeSeriesVector from a python list of GeoTimeSeries

Args:

geo_ts_list (List[GeoTimeSeries]): A list of GeoTimeSeries

__init__( (object)arg1, (TimeAxis)time_axis, (GeoPointVector)geo_points, (object)np_array, (point_interpretation_policy)point_fx) -> object :

Create a GeoTimeSeriesVector from time-axis,geo-points,2d-numpy-array and point-interpretation

Args:

time_axis (TimeAxis): time-axis that matches in length to 2nd dim of np_array

geo_points (GeoPointVector): the geo-positions for the time-series, should be of length n_ts

np_array (np.ndarray): numpy array of dtype=np.float64, and shape(n_ts,n_points)

point_fx (point interpretation): one of POINT_AVERAGE_VALUE|POINT_INSTANT_VALUE

Returns:

GeoTimeSeriesVector: GeoTimeSeriesVector. a GeoTimeSeriesVector of length equal to the first np_array dim (n_ts), each with a geo-point and a time-series with time-axis, values and point_fx

append((GeoTimeSeriesVector)arg1, (object)arg2) None
extend((GeoTimeSeriesVector)arg1, (object)arg2) None
extract_ts_vector((GeoTimeSeriesVector)self) TsVector :

Provides a TsVector of the time-series part of GeoTimeSeries

Returns:

ts-vector. A TsVector (shallow copy) of the time-series part of the GeoTimeSeriesVector

Return type:

TsVector

values_at_time((GeoTimeSeriesVector)self, (time)t) DoubleVector :

The values at the specified time, as a DoubleVector; use .to_numpy() to get an np array from it. This function can be suitable if you are doing area-animated (birds-view) presentations

Parameters:

t (time) – the time that should be used for getting each value

Returns:

values. The evaluated geo.ts(t) for all items in the vector

Return type:

DoubleVector

Class GeoQuery

class shyft.time_series.GeoQuery

Bases: instance

A query as a polygon with specified geo epsg coordinate system

__init__((GeoQuery)arg1) None
__init__( (GeoQuery)arg1, (object)epsg, (GeoPointVector)points) -> None :

Construct a GeoQuery from specified parameters

Args:

epsg (int): A valid epsg for the polygon, and also wanted coordinate system

points (GeoPointVector): 3 or more points forming a polygon that is the spatial scope

property epsg

the epsg coordinate system

Type:

int

property polygon

the polygon giving the spatial scope

Type:

GeoPointVector

Class GeoSlice

class shyft.time_series.GeoSlice

Bases: instance

Keeps data that describes a slice into the t0-variable-ensemble-geo, (t,v,e,g), space. It is the result-type of GeoTimeSeriesConfiguration.compute(GeoEvalArgs) and is passed to the geo-db-read callback to specify the wanted time-series to read. Note that the content of a GeoSlice can only be interpreted in terms of the GeoTimeSeriesConfiguration it is derived from. The indices and values of the slice relate strongly to the definition of its geo-ts-db.

__init__((GeoSlice)arg1) None
__init__( (GeoSlice)arg1, (IntVector)v, (IntVector)g, (IntVector)e, (UtcTimeVector)t, (time)ts_dt [, (time)skip_dt]) -> None :

Construct a GeoSlice from supplied vectors.

Args:

v (IntVector): list of variables idx, each defined by GeoTimeSeriesConfiguration.variables[i]

e (IntVector): list of ensembles, each in range 0..GeoTimeSeriesConfiguration.n_ensembles-1

g (IntVector): list of geo-point idx, each defined by GeoTimeSeriesConfiguration.grid.points[i]

t (UtcTimeVector): list of t0-time points, each of them should exist in GeoTimeSeriesConfiguration.t0_times

ts_dt (time): time-length to read from each time-series, we read from [t0+skip_dt.. t0+skip_dt+ts_dt>

skip_dt (time): time-length to skip from start each time-series, we read from [t0+skip_dt.. t0+skip_dt+ts_dt>

property e

list of ensembles, each in range 0..GeoTimeSeriesConfiguration.n_ensembles-1

Type:

IntVector

property g

list of geo-point idx, each defined by GeoTimeSeriesConfiguration.grid.points[i]

Type:

IntVector

property period

slice period as [skip_dt..skip_dt+ts_dt) from each ts relative t0

Type:

UtcPeriod

property skip_dt

time length to skip from start of each time-series, [t0+skip_dt.. t0+skip_dt+ts_dt>

Type:

time

property t

list of t0-time points, each of them should exist in GeoTimeSeriesConfiguration.t0_times

Type:

UtcTimeVector

property ts_dt

time length to read from each time-series, [t0+skip_dt.. t0+skip_dt+ts_dt>

Type:

time

property v

list of variables idx, each defined by GeoTimeSeriesConfiguration.variables[i]

Type:

IntVector
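The half-open read window [t0+skip_dt .. t0+skip_dt+ts_dt) that skip_dt and ts_dt define can be illustrated with plain numbers. This is a pure-Python sketch of the semantics only, not using shyft itself; times are integer seconds since epoch, and the helper name read_window is hypothetical:

```python
# Pure-Python sketch of the GeoSlice read-window semantics.
# All times are plain integers in SI-unit seconds (as shyft's time type is).

def read_window(t0: int, skip_dt: int, ts_dt: int) -> tuple:
    """Return the half-open period [t0+skip_dt, t0+skip_dt+ts_dt) read per time-series."""
    start = t0 + skip_dt
    return (start, start + ts_dt)

# A forecast registered at 2018-10-15T00:00:00Z (1539561600), skipping the
# first 6 hours of lead-time and reading 66 hours of data:
start, end = read_window(t0=1539561600, skip_dt=6 * 3600, ts_dt=66 * 3600)
print(start, end)  # the callback should deliver values within [start, end)
```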

Class GeoTsMatrix

class shyft.time_series.GeoTsMatrix

Bases: instance

GeoTsMatrix is a 4d matrix with index dimensions (t0, variable, ensemble, geo_point), to be understood as a slice of a geo-ts-db (the slice could be the entire db). The element type of the matrix is GeoTimeSeries.

__init__((GeoTsMatrix)arg1, (object)n_t0, (object)n_v, (object)n_e, (object)n_g) None :

create GeoTsMatrix with specified t0,variables,ensemble and geo-point dimensions

concatenate((GeoTsMatrix)self, (time)cc_dt0, (time)concat_interval) GeoTsMatrix :

Concatenate all the forecasts in the GeoTsMatrix using supplied parameters

Parameters:
  • cc_dt0 (time) – skip the first period of length cc_dt0 of each forecast

  • concat_interval (time) – the nominal length between each ts.time(0) of all time-series

Returns:

tsm. A new concatenated geo-ts-matrix

Return type:

GeoTsMatrix

evaluate((GeoTsMatrix)self) GeoTsMatrix :

Evaluate all time-series in the matrix.

Returns:

GeoTsMatrix. A new GeoTsMatrix, where all time-series are evaluated

Return type:

GeoTsMatrix

extract_geo_ts_vector((GeoTsMatrix)self, (object)t, (object)v, (object)e) GeoTimeSeriesVector :

Given the arguments, return the GeoTimeSeriesVector suitable for constructing GeoPointSource for hydrology region-environment forcing data

Parameters:
  • t (int) – the forecast index, e.g. selects specific forecast, in case of several (t0)

  • v (int) – the variable index, e.g. selects temperature,precipitation etc.

  • e (int) – the ensemble index, in case of many ensembles, select specific ensemble

Returns:

GeoTimeSeriesVector. The GeoTsVector for the selected forecast time t, variable and ensemble

Return type:

GeoTimeSeriesVector

get_geo_point((GeoTsMatrix)self, (object)t, (object)v, (object)e, (object)g) GeoPoint :

return self[t,v,e,g].mid_point of type GeoPoint

get_ts((GeoTsMatrix)self, (object)t, (object)v, (object)e, (object)g) TimeSeries :

return self[t,v,e,g] of type TimeSeries

set_geo_point((GeoTsMatrix)self, (object)t, (object)v, (object)e, (object)g, (GeoPoint)point) None :

performs self[t,v,e,g].mid_point= point

set_ts((GeoTsMatrix)self, (object)t, (object)v, (object)e, (object)g, (TimeSeries)ts) None :

performs self[t,v,e,g].ts= ts

property shape

The shape of a GeoMatrix in terms of forecasts(n_t0),variables(n_v),ensembles(n_e) and geopoints(n_g)

Type:

GeoMatrixShape

transform((GeoTsMatrix)self, (object)variable, (TimeSeries)expression) GeoTsMatrix :

Apply the expression to each time-series of the specified variable.

Args:

variable (int): the variable index to select the specific variable, use -1 to apply to all

expr (TimeSeries): ts expression, like 2.0*TimeSeries(‘x’), where x will be substituted with the variable; notice that it is required to be just one unbound time-series with the reference name ‘x’.

Returns:

GeoTsMatrix: GeoTsMatrix. A new GeoTsMatrix, where the time-series for the specified variable is transformed

transform( (GeoTsMatrix)self, (TsVector)expr_vector) -> GeoTsMatrix :

Apply each expression to the time-series of the corresponding variable.

Args:

expr_vector (TsVector): ts expressions, like 2.0*TimeSeries(‘0’), where 0 will be substituted with the corresponding variable; notice that each expression is required to contain just one unbound time-series with the reference name equal to the variable index, e.g. ‘0’.

Returns:

GeoTsMatrix: GeoTsMatrix. A new GeoTsMatrix, where the time-series for the specified variable is transformed

Class GeoMatrixShape

class shyft.time_series.GeoMatrixShape

Bases: instance

__init__((GeoMatrixShape)arg1, (object)n_t0, (object)n_v, (object)n_e, (object)n_g) None :

Create with specified dimensionality

property n_e

number of ensembles

Type:

int

property n_g

number of geo points

Type:

int

property n_t0

number of t0, e.g forecasts

Type:

int

property n_v

number of variables

Type:

int

Class GeoGridSpec

class shyft.time_series.GeoGridSpec

Bases: instance

A point set for a geo-grid; it does not have to be a regular grid. It serves the role of defining the spatially representative mid-points of a typical spatial grid, e.g. as for arome or ec forecasts, where the origin shape usually is a box. To support a general grid-spec, the optional, then equally sized, shapes provide the polygon shape for each individual mid-point.

__init__((GeoGridSpec)arg1) None
__init__( (GeoGridSpec)arg1, (object)epsg, (GeoPointVector)points) -> None :

Construct a GeoGridSpec from specified parameters

Args:

epsg (int): A valid epsg for the spatial points

points (GeoPointVector): 0 or more representative points for the spatial properties of the grid

__init__( (GeoGridSpec)arg1, (object)epsg, (GeoPointVectorVector)polygons) -> None :

Construct a GeoGridSpec from specified parameters

Args:

epsg (int): A valid epsg for the spatial points

polygons (GeoPointVectorVector): 0 or more representative shapes as polygons, the mid-points are computed based on shapes

property epsg

the epsg coordinate system

Type:

int

find_geo_match((GeoGridSpec)self, (GeoQuery)geo_query) IntVector :

Finds the points in the grid that are covered by the polygon of the geo_query. Note: currently only the horizontal dimensions are considered when matching points

Parameters:

geo_query (GeoQuery) – A polygon giving an area to capture

Returns:

matches. A list of all points that are inside, or on the border of, the specified polygon, in guaranteed ascending point index order

Return type:

IntVector

property points

the representative mid-points of the spatial grid

Type:

GeoPointVector

property polygons

the polygons describing the grid, mid-points are centroids of the polygons

Type:

GeoPointVectorVector
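The horizontal-only matching that find_geo_match performs can be illustrated with a small pure-Python ray-casting test. This is a sketch of the idea only; shyft's own implementation is in C++, and in particular its handling of points exactly on the polygon border may differ from this simplified version:

```python
# Pure-Python illustration of matching grid points against a query polygon.
# Only x,y are considered, mirroring find_geo_match's horizontal-only matching.

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) is inside the closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def find_geo_match(points, polygon):
    """Indices of grid points inside the polygon, in ascending order."""
    return [i for i, (x, y) in enumerate(points) if point_in_polygon(x, y, polygon)]

grid = [(0, 0), (5, 5), (20, 20)]                 # hypothetical grid mid-points
query = [(-1, -1), (10, -1), (10, 10), (-1, 10)]  # a simple box polygon
print(find_geo_match(grid, query))                # indices of points inside the box
```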

Class GeoEvalArgs

class shyft.time_series.GeoEvalArgs

Bases: instance

GeoEvalArgs is used for the geo-evaluate functions.

It describes scope for the geo-evaluate function, in terms of:

  • the geo-ts database identifier

  • variables to extract, by names

  • ensemble members (list of ints)

  • temporal, using t0 from specified time-axis, + ts_dt for time-range

  • spatial, using points for a polygon

and optionally:

  • the concat postprocessing with parameters

__init__((GeoEvalArgs)arg1) None
__init__( (GeoEvalArgs)arg1, (object)geo_ts_db_id, (StringVector)variables, (IntVector)ensembles, (TimeAxis)time_axis, (time)ts_dt, (GeoQuery)geo_range, (object)concat, (time)cc_dt0) -> None :

Construct GeoEvalArgs from specified parameters

Args:

geo_ts_db_id (str): identifies the geo-ts-db, short, as ‘arome’, ‘ec’, as specified with server.add_geo_ts_db(cfg)

variables (StringVector): names of the wanted variables, if empty, return all variables configured

ensembles (IntVector): List of ensembles, if empty, return all ensembles configured

time_axis (TimeAxis): specifies the t0, and .total_period().end is used as concatenation open-end fill-in length

ts_dt (time): specifies the time-length to read from each time-series, t0..t0+ts_dt

geo_range (GeoQuery): the spatial scope of the query, if empty, return all configured

concat (bool): postprocess using concatenated forecast, returns ‘one’ concatenated forecast from many.

cc_dt0 (time): concat lead-time, skip cc_dt0 of each forecast (offsets the slice you select)

__init__( (GeoEvalArgs)arg1, (object)geo_ts_db_id, (IntVector)ensembles, (TimeAxis)time_axis, (time)ts_dt, (GeoQuery)geo_range, (object)concat, (time)cc_dt0, (TsVector)ts_expressions) -> None :

Construct GeoEvalArgs from specified parameters

Args:

geo_ts_db_id (str): identifies the geo-ts-db, short, as ‘arome’, ‘ec’, as specified with server.add_geo_ts_db(cfg)

ensembles (IntVector): List of ensembles, if empty, return all ensembles configured

time_axis (TimeAxis): specifies the t0, and .total_period().end is used as concatenation open-end fill-in length

ts_dt (time): specifies the time-length to read from each time-series, t0..t0+ts_dt

geo_range (GeoQuery): the spatial scope of the query, if empty, return all configured

concat (bool): postprocess using concatenated forecast, returns ‘one’ concatenated forecast from many.

cc_dt0 (time): concat lead-time, skip cc_dt0 of each forecast (offsets the slice you select)

ts_expressions (TsVector): expressions to evaluate, where the existing variables are referred to by index number as a string, ex. TimeSeries(‘0’)

property cc_dt0

concat lead-time

Type:

time

property concat

postprocess using concatenated forecast, returns ‘one’ concatenated forecast from many

Type:

bool

property ens

list of ensembles to return, empty=all, if specified >0

Type:

IntVector

property geo_range

the spatial scope, as simple polygon

Type:

GeoQuery

property geo_ts_db_id

the name for the config (keep it minimal)

Type:

str

property t0_time_axis

specifies the t0, and .total_period().end is used as concatenation open-end fill-in length

Type:

TimeAxis

property ts_dt

specifies the time-length to read from each time-series, t0..t0+ts_dt

Type:

time

property ts_expressions

time series expressions to evaluate instead of variables

Type:

TsVector

property variables

names of the wanted variables; if empty, all configured variables are returned

Type:

StringVector

Class GeoTimeSeriesConfiguration

class shyft.time_series.GeoTimeSeriesConfiguration

Bases: instance

Contains a minimal description to efficiently work with arome/ec forecast data. It defines the spatial, temporal and ensemble dimensions available, and provides means of mapping a GeoQuery to a set of ts_urls that serve as keys for manipulating and assembling forcing input data, for example to the shyft hydrology region-models.

__init__((GeoTimeSeriesConfiguration)arg1) None
__init__( (GeoTimeSeriesConfiguration)arg1, (object)prefix, (object)name, (object)description, (GeoGridSpec)grid, (UtcTimeVector)t0_times, (time)dt, (object)n_ensembles, (StringVector)variables [, (object)json=’’ [, (object)origin_proj4=’’]]) -> None :

Construct a GeoTimeSeriesConfiguration from specified parameters

Args:

prefix (str): ts-url prefix, like shyft:// for internally stored ts, or geo:// for externally stored parts

name (str): A shortest possible unique name of the configuration

description (str): a human readable description of the configuration

grid (GeoGridSpec): specification of the spatial grid

t0_times (UtcTimeVector): list of time-points where time-series are registered, e.g. forecast times (the first time-point of each series)

dt (time): the (max) length of each geo-ts, so geo_ts total_period is [t0..t0+dt>

n_ensembles (int): number of ensembles available, must be >0, 1 if no ensembles

variables (StringVector): list of minimal keys, representing temperature, precipitation etc

json (str): A user specified json string

origin_proj4 (str): The proj4 string for the origin transform of this dataset

bounding_box((GeoTimeSeriesConfiguration)self, (GeoSlice)slice) GeoPointVector :

Compute the 3D bounding_box, as two GeoPoints containing the min-max of x,y,z of points in the GeoSlice. Could be handy when generating queries to externally stored geo-ts databases like netcdf etc. See also convex_hull().

Parameters:

slice (GeoSlice) – a geo-slice with specified dimensions in terms of t0, variables, ensembles,geo-points

Returns:

bbox. with two GeoPoints, [0] keeping the minimum x,y,z, and [1] the maximum x,y,z

Return type:

GeoPointVector

compute((GeoTimeSeriesConfiguration)self, (GeoEvalArgs)eval_args) GeoSlice :

Compute the GeoSlice from evaluation arguments

Parameters:

eval_args (GeoEvalArgs) – Specification to evaluate

Returns:

geo_slice. A geo-slice describing (t0,v,e,g) computed

Return type:

GeoSlice

convex_hull((GeoTimeSeriesConfiguration)self, (GeoSlice)slice) GeoPointVector :

Compute the 2D convex hull, as a list of GeoPoints describing the smallest convex planar polygon containing all points in the slice wrt. x,y. The returned point sequence is ‘closed’, i.e the first and last point in the sequence are equal. See also bounding_box().

Parameters:

slice (GeoSlice) – a geo-slice with specified dimensions in terms of t0, variables, ensembles,geo-points

Returns:

hull. containing the sequence of points of the convex hull polygon.

Return type:

GeoPointVector

create_geo_ts_matrix((GeoTimeSeriesConfiguration)self, (GeoSlice)slice) GeoTsMatrix :

Creates a GeoTsMatrix (element type is GeoTimeSeries) to hold the values according to the dimensionality of the GeoSlice

Parameters:

slice (GeoSlice) – a geo-slice with specified dimensions in terms of t0, variables, ensembles,geo-points

Returns:

geo_ts_matrix. ready to be filled in with points and time-series

Return type:

GeoTsMatrix

create_ts_matrix((GeoTimeSeriesConfiguration)self, (GeoSlice)slice) GeoMatrix :

Creates a GeoMatrix (element type is TimeSeries only) to hold the values according to dimensionality of GeoSlice

Parameters:

slice (GeoSlice) – a geo-slice with specified dimensions in terms of t0, variables, ensembles,geo-points

Returns:

ts_matrix. Ready to be filled in with time-series (they are all empty/null)

Return type:

GeoMatrix

property description

the human readable description of this geo ts db

Type:

str

property dt

the (max) length of each geo-ts, so geo_ts total_period is [t0..t0+dt>

Type:

time

find_geo_match_ix((GeoTimeSeriesConfiguration)self, (GeoQuery)geo_query) IntVector :

Returns the indices to the points that matches the geo_query (polygon)

Parameters:

geo_query (GeoQuery) – The query, polygon that matches the spatial scope

Returns:

point_indexes. The list of indices that matches the geo_query

Return type:

IntVector

property grid

the spatial grid definition

Type:

GeoGridSpec

property json

a json formatted string with custom data as needed

Type:

str

property n_ensembles

number of ensembles available, range 1..n

Type:

int

property name

the name for the config (keep it minimal)

Type:

str

property origin_proj4

informative only, if not empty, specifies the origin proj4 of this dataset

Type:

str

property prefix

ts-url prefix, like shyft:// for internally stored ts, or geo:// for externally stored parts

Type:

str

property t0_time_axis

t0 time-points as time-axis

Type:

TimeAxis

property t0_times

list of time-points, where there are registered/available time-series

Type:

UtcTimeVector

property variables

the list of available properties, like short keys for precipitation,temperature etc

Type:

StringList

Working with time series

The elements in this section define how code shall behave or are actual tools dealing with time series.

Policies

The elements in this section describe how time series are interpreted.

Class convolve_policy

class shyft.time_series.convolve_policy

Bases: enum

Ref the TimeSeries.convolve_w function; this policy determines how to handle initial conditions.

  • USE_NEAREST: value(0) is used for all values before value(0), and value(n-1) is used for all values after value(n-1) == ‘mass preserving’

  • USE_ZERO: use zero for all values before value(0) or after value(n-1) == ‘shape preserving’

  • USE_NAN: nan is used for all values outside the ts

  • BACKWARD: filter is ‘backward looking’ == boundary handling in the beginning of the ts

  • FORWARD: filter is ‘forward looking’ == boundary handling in the end of the ts

  • CENTER: filter is centered == boundary handling in both ends

BACKWARD = shyft.time_series._time_series.convolve_policy.BACKWARD
CENTER = shyft.time_series._time_series.convolve_policy.CENTER
FORWARD = shyft.time_series._time_series.convolve_policy.FORWARD
USE_NAN = shyft.time_series._time_series.convolve_policy.USE_NAN
USE_NEAREST = shyft.time_series._time_series.convolve_policy.USE_NEAREST
USE_ZERO = shyft.time_series._time_series.convolve_policy.USE_ZERO
names = {'BACKWARD': shyft.time_series._time_series.convolve_policy.BACKWARD, 'CENTER': shyft.time_series._time_series.convolve_policy.CENTER, 'FORWARD': shyft.time_series._time_series.convolve_policy.FORWARD, 'USE_NAN': shyft.time_series._time_series.convolve_policy.USE_NAN, 'USE_NEAREST': shyft.time_series._time_series.convolve_policy.USE_NEAREST, 'USE_ZERO': shyft.time_series._time_series.convolve_policy.USE_ZERO}
values = {1: shyft.time_series._time_series.convolve_policy.USE_NEAREST, 2: shyft.time_series._time_series.convolve_policy.USE_ZERO, 4: shyft.time_series._time_series.convolve_policy.USE_NAN, 16: shyft.time_series._time_series.convolve_policy.FORWARD, 32: shyft.time_series._time_series.convolve_policy.CENTER, 64: shyft.time_series._time_series.convolve_policy.BACKWARD}
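The boundary policies can be illustrated with a plain moving-average filter in pure Python. This is a sketch of the padding semantics only, with hypothetical helper names; it is not shyft's convolve_w implementation:

```python
from math import nan

def pad_left(values, k, policy):
    """Produce k padding values before values[0] according to the policy."""
    if policy == 'USE_NEAREST':   # repeat first value == 'mass preserving'
        return [values[0]] * k
    if policy == 'USE_ZERO':      # zeros == 'shape preserving'
        return [0.0] * k
    if policy == 'USE_NAN':       # nan outside the ts
        return [nan] * k
    raise ValueError(policy)

def convolve_backward(values, weights, policy):
    """Backward-looking filter: y[i] = sum_j w[j] * x[i-j], padding at the start."""
    k = len(weights) - 1
    padded = pad_left(values, k, policy) + list(values)
    return [sum(w * padded[k + i - j] for j, w in enumerate(weights))
            for i in range(len(values))]

x = [1.0, 2.0, 3.0, 4.0]
w = [0.5, 0.5]  # 2-point average
print(convolve_backward(x, w, 'USE_NEAREST'))  # first value repeats x[0]
print(convolve_backward(x, w, 'USE_ZERO'))     # first value is halved
```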

Class derivative_method

class shyft.time_series.derivative_method

Bases: enum

Ref. the .derivative time-series function, this defines how to compute the derivative of a given time-series

BACKWARD = shyft.time_series._time_series.derivative_method.BACKWARD
CENTER = shyft.time_series._time_series.derivative_method.CENTER
DEFAULT = shyft.time_series._time_series.derivative_method.DEFAULT
FORWARD = shyft.time_series._time_series.derivative_method.FORWARD
names = {'BACKWARD': shyft.time_series._time_series.derivative_method.BACKWARD, 'CENTER': shyft.time_series._time_series.derivative_method.CENTER, 'DEFAULT': shyft.time_series._time_series.derivative_method.DEFAULT, 'FORWARD': shyft.time_series._time_series.derivative_method.FORWARD}
values = {0: shyft.time_series._time_series.derivative_method.DEFAULT, 1: shyft.time_series._time_series.derivative_method.FORWARD, 2: shyft.time_series._time_series.derivative_method.BACKWARD, 3: shyft.time_series._time_series.derivative_method.CENTER}
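The three methods correspond to the usual finite-difference stencils. The following is a pure-Python sketch on evenly spaced samples, illustrative only; shyft's implementation operates on the time-series' actual time-points and interpretation policy:

```python
def derivative(values, dt, method='CENTER'):
    """Finite-difference derivative of evenly spaced samples (spacing dt seconds)."""
    n = len(values)
    out = []
    for i in range(n):
        if method == 'FORWARD':            # difference towards the next point
            j0, j1, h = i, min(i + 1, n - 1), dt
        elif method == 'BACKWARD':         # difference towards the previous point
            j0, j1, h = max(i - 1, 0), i, dt
        else:                              # CENTER: symmetric at interior points
            j0, j1 = max(i - 1, 0), min(i + 1, n - 1)
            h = (j1 - j0) * dt
        out.append((values[j1] - values[j0]) / h if h else 0.0)
    return out

x = [0.0, 1.0, 4.0, 9.0]  # f(t) = t^2 sampled at dt = 1
print(derivative(x, 1.0, 'FORWARD'))
print(derivative(x, 1.0, 'CENTER'))
```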

Class extend_fill_policy

class shyft.time_series.extend_fill_policy

Bases: enum

Ref the TimeSeries.extend function; this policy determines how to represent values in a gap.

  • FILL_NAN: use nan values in the gap

  • USE_LAST: use the last value before the gap

  • FILL_VALUE: use a supplied value in the gap

FILL_NAN = shyft.time_series._time_series.extend_fill_policy.FILL_NAN
FILL_VALUE = shyft.time_series._time_series.extend_fill_policy.FILL_VALUE
USE_LAST = shyft.time_series._time_series.extend_fill_policy.USE_LAST
names = {'FILL_NAN': shyft.time_series._time_series.extend_fill_policy.FILL_NAN, 'FILL_VALUE': shyft.time_series._time_series.extend_fill_policy.FILL_VALUE, 'USE_LAST': shyft.time_series._time_series.extend_fill_policy.USE_LAST}
values = {0: shyft.time_series._time_series.extend_fill_policy.FILL_NAN, 1: shyft.time_series._time_series.extend_fill_policy.USE_LAST, 2: shyft.time_series._time_series.extend_fill_policy.FILL_VALUE}
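A pure-Python sketch of the three gap-fill strategies; illustrative only, since TimeSeries.extend itself works on time-axes rather than plain lists, and the helper name fill_gap is hypothetical:

```python
from math import nan

def fill_gap(gap_length, last_value, policy, fill_value=nan):
    """Values used to bridge a gap between the extended ts and the extension."""
    if policy == 'FILL_NAN':    # nan in the gap
        return [nan] * gap_length
    if policy == 'USE_LAST':    # repeat the last value before the gap
        return [last_value] * gap_length
    if policy == 'FILL_VALUE':  # a user-supplied constant
        return [fill_value] * gap_length
    raise ValueError(policy)

lhs, rhs = [1.0, 2.0], [5.0, 6.0]   # two series with a 3-step gap between them
print(lhs + fill_gap(3, lhs[-1], 'USE_LAST') + rhs)
print(lhs + fill_gap(3, lhs[-1], 'FILL_VALUE', fill_value=0.0) + rhs)
```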

Class extend_split_policy

class shyft.time_series.extend_split_policy

Bases: enum

Ref the TimeSeries.extend function; this policy determines where to split/shift from one ts to the other.

  • LHS_LAST: split at the last value of the left-hand-side ts

  • RHS_FIRST: split at the first value of the right-hand-side ts

  • AT_VALUE: split at a supplied time value

AT_VALUE = shyft.time_series._time_series.extend_split_policy.AT_VALUE
LHS_LAST = shyft.time_series._time_series.extend_split_policy.LHS_LAST
RHS_FIRST = shyft.time_series._time_series.extend_split_policy.RHS_FIRST
names = {'AT_VALUE': shyft.time_series._time_series.extend_split_policy.AT_VALUE, 'LHS_LAST': shyft.time_series._time_series.extend_split_policy.LHS_LAST, 'RHS_FIRST': shyft.time_series._time_series.extend_split_policy.RHS_FIRST}
values = {0: shyft.time_series._time_series.extend_split_policy.LHS_LAST, 1: shyft.time_series._time_series.extend_split_policy.RHS_FIRST, 2: shyft.time_series._time_series.extend_split_policy.AT_VALUE}

Class interpolation_scheme

class shyft.time_series.interpolation_scheme

Bases: enum

Interpolation methods used by TimeSeries.transform

SCHEME_CATMULL_ROM = shyft.time_series._time_series.interpolation_scheme.SCHEME_CATMULL_ROM
SCHEME_LINEAR = shyft.time_series._time_series.interpolation_scheme.SCHEME_LINEAR
SCHEME_POLYNOMIAL = shyft.time_series._time_series.interpolation_scheme.SCHEME_POLYNOMIAL
names = {'SCHEME_CATMULL_ROM': shyft.time_series._time_series.interpolation_scheme.SCHEME_CATMULL_ROM, 'SCHEME_LINEAR': shyft.time_series._time_series.interpolation_scheme.SCHEME_LINEAR, 'SCHEME_POLYNOMIAL': shyft.time_series._time_series.interpolation_scheme.SCHEME_POLYNOMIAL}
values = {0: shyft.time_series._time_series.interpolation_scheme.SCHEME_LINEAR, 1: shyft.time_series._time_series.interpolation_scheme.SCHEME_POLYNOMIAL, 2: shyft.time_series._time_series.interpolation_scheme.SCHEME_CATMULL_ROM}

Class point_interpretation_policy

class shyft.time_series.point_interpretation_policy

Bases: enum

Determines how to interpret the points in a timeseries when interpreted as a function of time, f(t)

POINT_AVERAGE_VALUE = shyft.time_series._time_series.point_interpretation_policy.POINT_AVERAGE_VALUE
POINT_INSTANT_VALUE = shyft.time_series._time_series.point_interpretation_policy.POINT_INSTANT_VALUE
names = {'POINT_AVERAGE_VALUE': shyft.time_series._time_series.point_interpretation_policy.POINT_AVERAGE_VALUE, 'POINT_INSTANT_VALUE': shyft.time_series._time_series.point_interpretation_policy.POINT_INSTANT_VALUE}
values = {0: shyft.time_series._time_series.point_interpretation_policy.POINT_INSTANT_VALUE, 1: shyft.time_series._time_series.point_interpretation_policy.POINT_AVERAGE_VALUE}
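The two interpretations can be sketched in pure Python: POINT_AVERAGE_VALUE reads the series as a stair-case (each value holds over its interval), while POINT_INSTANT_VALUE interpolates linearly between points. Illustrative only; the helper f_of_t is hypothetical and borrows the enum names as strings:

```python
from bisect import bisect_right

def f_of_t(t, times, values, policy):
    """Evaluate a point time-series as f(t) under a point interpretation policy."""
    i = bisect_right(times, t) - 1           # index of the point at or before t
    if i < 0:
        raise ValueError('t before start of series')
    if policy == 'POINT_AVERAGE_VALUE' or i == len(times) - 1:
        return values[i]                     # stair-case: value holds over interval
    # POINT_INSTANT_VALUE: linear interpolation towards the next point
    w = (t - times[i]) / (times[i + 1] - times[i])
    return values[i] + w * (values[i + 1] - values[i])

times, values = [0, 3600, 7200], [10.0, 20.0, 15.0]
print(f_of_t(1800, times, values, 'POINT_AVERAGE_VALUE'))  # stair-case value
print(f_of_t(1800, times, values, 'POINT_INSTANT_VALUE'))  # halfway between points
```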

Class trim_policy

class shyft.time_series.trim_policy

Bases: enum

Enum to decide whether to trim inwards or outwards: TRIM_IN means inwards, TRIM_OUT means outwards, and TRIM_ROUND rounds halfway cases away from zero.

TRIM_IN = shyft.time_series._time_series.trim_policy.TRIM_IN
TRIM_OUT = shyft.time_series._time_series.trim_policy.TRIM_OUT
TRIM_ROUND = shyft.time_series._time_series.trim_policy.TRIM_ROUND
names = {'TRIM_IN': shyft.time_series._time_series.trim_policy.TRIM_IN, 'TRIM_OUT': shyft.time_series._time_series.trim_policy.TRIM_OUT, 'TRIM_ROUND': shyft.time_series._time_series.trim_policy.TRIM_ROUND}
values = {0: shyft.time_series._time_series.trim_policy.TRIM_IN, 1: shyft.time_series._time_series.trim_policy.TRIM_OUT, 2: shyft.time_series._time_series.trim_policy.TRIM_ROUND}
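Trimming a single time-point to a whole interval dt under the policies can be sketched in pure Python. This is a simplified single-point illustration (for a period, inwards/outwards apply in opposite directions at each end), and shyft's Calendar-based trimming also accounts for time-zone and dst:

```python
def trim(t, dt, policy):
    """Trim time-point t (seconds) to a multiple of dt under a trim policy."""
    if t % dt == 0:
        return t
    floor = (t // dt) * dt
    if policy == 'TRIM_IN':     # trim down, towards the covered interval
        return floor
    if policy == 'TRIM_OUT':    # trim up, so the result covers t
        return floor + dt
    if policy == 'TRIM_ROUND':  # nearest boundary, halfway away from zero
        return floor + dt if (t - floor) * 2 >= dt else floor
    raise ValueError(policy)

hour = 3600
print(trim(5400, hour, 'TRIM_IN'))     # 1.5 h trims down to 1 h
print(trim(5400, hour, 'TRIM_OUT'))    # 1.5 h trims up to 2 h
print(trim(5400, hour, 'TRIM_ROUND'))  # halfway rounds up
```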

Tools

The elements in this section work with time series.

Class KrlsRbfPredictor

class shyft.time_series.KrlsRbfPredictor

Bases: instance

Time-series predictor using the KRLS algorithm with radial basis functions.

The KRLS (Kernel Recursive Least-Squares) algorithm is a kernel regression algorithm for approximating data; the implementation used here is from:

This predictor uses KRLS with radial basis functions (RBF). Other related: shyft.time_series.TimeSeries.krls_interpolation(), shyft.time_series.TimeSeries.get_krls_predictor()

Examples:

>>>
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from shyft.time_series import (
...     Calendar, utctime_now, deltahours,
...     TimeAxis, TimeSeries,
...     KrlsRbfPredictor
... )
>>>
>>> # setup
>>> cal = Calendar()
>>> t0 = utctime_now()
>>> dt = deltahours(3)
>>> n = 365*8  # one year
>>>
>>> # ready plot
>>> fig, ax = plt.subplots()
>>>
>>> # shyft objects
>>> ta = TimeAxis(t0, dt, n)
>>> pred = KrlsRbfPredictor(
...     dt=deltahours(8),
...     gamma=1e-5,  # NOTE: this should be 1e-3 for real data
...     tolerance=0.001
... )
>>>
>>> # generate data
>>> total_series = 4
>>> data_range = np.linspace(0, 2*np.pi, n)
>>> ts = None  # to store the final data-ts
>>> # -----
>>> for i in range(total_series):
...     data = np.sin(data_range) + (np.random.random(data_range.shape) - 0.5)/5
...     ts = TimeSeries(ta, data)
...     # -----
...     training_mse = pred.train(ts)  # train the predictor
...     # -----
...     print(f'training step {i+1:2d}: mse={training_mse}')
...     ax.plot(ta.time_points[:-1], ts.values, 'bx')  # plot data
>>>
>>> # prediction
>>> ts_pred = pred.predict(ta)
>>> ts_mse = pred.mse_ts(ts, points=3)  # mse using 7 point wide filter
>>>                                     # (3 points before/after)
>>>
>>> # plot interpolation/predicton on top of results
>>> ax.plot(ta.time_points[:-1], ts_mse.values, '0.6', label='mse')
>>> ax.plot(ta.time_points[:-1], ts_pred.values, 'r-', label='prediction')
>>> ax.legend()
>>> plt.show()
__init__((KrlsRbfPredictor)arg1) None
__init__( (KrlsRbfPredictor)self, (time)dt [, (object)gamma=0.001 [, (object)tolerance=0.01 [, (int)size=1000000]]]) -> None :

Construct a new predictor.

Args:

dt (float): The time-step in seconds the predictor is specified for. Note that this does not put a limit on time-axes used, but for best results it should be approximately equal to the time-step of the time-axes used with the predictor. In addition it should not be too long, or you will get poor results. Try to keep dt less than a day; 3-8 hours is usually fine.

gamma (float (optional)): Determines the width of the radial basis functions for the KRLS algorithm. Lower values mean wider basis functions; wider basis functions mean faster computation but lower accuracy. Note that the tolerance parameter also affects speed and accuracy. A large value is around 1E-2, and a small value depends on the time step. Using values larger than 1E-2 will probably make the computation take too long. Testing has revealed that 1E-3 works great for a time-step of 3 hours, while a gamma of 1E-2 takes a few minutes to compute. Use 1E-4 for a fast and tolerably accurate prediction. Defaults to 1E-3

tolerance (float (optional)): The krls training tolerance. Lower values make the prediction more accurate, but slower. This typically has less effect than gamma, but is useful for tuning. Usually it should be either 0.01 or 0.001. Defaults to 0.01

size (int (optional)): The size of the “memory” of the predictor. The default value is usually enough. Defaults to 1000000.

clear((KrlsRbfPredictor)self) None :

Clear all training data from the predictor.

mse_ts((KrlsRbfPredictor)self, (TimeSeries)ts[, (int)points=0]) TimeSeries :

Compute a mean-squared error time-series of the predictor relative to the supplied ts.

Parameters:
  • ts (TimeSeries) – Time-series to compute mse against.

  • points (int (optional)) – Positive number of extra points around each point to use for mse. Defaults to 0.

Returns:

mse_ts. Time-series with mean-squared error values.

Return type:

TimeSeries

See also

KrlsRbfPredictor.predictor_mse, KrlsRbfPredictor.predict

predict((KrlsRbfPredictor)self, (TimeAxis)ta) TimeSeries :

Predict a time-series for the given time-axis.

Notes

The predictor will predict values outside the range of the values it is trained on, but these values will often be zero. This may also happen if there are long gaps in the training data and you try to predict values for the gap. Using wider basis functions partly remedies this, but makes the prediction overall less accurate.

Parameters:

ta (TimeAxis) – Time-axis to predict values for.

Returns:

ts. Predicted time-series.

Return type:

TimeSeries

See also

KrlsRbfPredictor.mse_ts, KrlsRbfPredictor.predictor_mse

predictor_mse((KrlsRbfPredictor)self, (TimeSeries)ts[, (int)offset=0[, (int)count=18446744073709551615[, (int)stride=1]]]) float :

Compute the predictor mean-squared prediction error for count first from ts.

Parameters:
  • ts (TimeSeries) – Time-series to compute mse against.

  • offset (int (optional)) – Positive offset from the start of the time-series. Default to 0.

  • count (int (optional)) – Positive number of samples from the time-series to use. Defaults to the maximum value.

  • stride (int (optional)) – Positive stride between samples from the time-series. Defaults to 1.

See also

KrlsRbfPredictor.predict, KrlsRbfPredictor.mse_ts

train((KrlsRbfPredictor)self, (TimeSeries)ts[, (int)offset=0[, (int)count=18446744073709551615[, (int)stride=1[, (int)iterations=1[, (object)mse_tol=0.001]]]]]) float :

Train the predictor using samples from ts.

Parameters:
  • ts (TimeSeries) – Time-series to train on.

  • offset (int (optional)) – Positive offset from the start of the time-series. Default to 0.

  • count (int (optional)) – Positive number of samples to use. Defaults to the maximum value.

  • stride (int (optional)) – Positive stride between samples from the time-series. Defaults to 1.

  • iterations (int (optional)) – Positive maximum number of times to train on the samples. Defaults to 1.

  • mse_tol (float (optional)) – Positive tolerance for the mean-squared error over the training data. If the mse after a training session is less than this, skip further training. Defaults to 1E-9.

Returns:

mse. Mean squared error of the predictor relative to the time-series trained on.

Return type:

float (optional)

Class QacParameter

Qac = Quality Assurance Controls

class shyft.time_series.QacParameter

Bases: instance

The qac parameter controls how quality checks are done, providing a min-max range plus repeated-values checks. It also provides parameters that control how the replacement/correction values are filled in, like the maximum time-span between two valid neighbour points that allows for linear/extension filling.

__init__((QacParameter)arg1) None
__init__( (QacParameter)self, (time)max_timespan, (object)min_x, (object)max_x, (time)repeat_timespan, (object)repeat_tolerance [, (object)constant_filler=nan]) -> None :

a quite complete qac, only lacks repeat_allowed value(s)

__init__( (QacParameter)self, (time)max_timespan, (object)min_x, (object)max_x, (time)repeat_timespan, (object)repeat_tolerance, (object)repeat_allowed [, (object)constant_filler=nan]) -> None :

a quite complete qac, including one repeat_allowed value

property constant_filler

this is applied to values that fails quality checks, if no correction ts, and no interpolation/extension is active

Type:

float

property max_timespan

maximum timespan between two ok values that allows interpolation, or extension of values. If zero, no linear/extend correction

Type:

time

property max_v

maximum value or nan for no maximum value limit

Type:

float

property min_v

minimum value or nan for no minimum value limit

Type:

float

property repeat_allowed

values that are allowed to repeat, within repeat-tolerance

Type:

bool

property repeat_timespan

maximum timespan the same value can be repeated (within repeat_tolerance). If zero, no repeat validation is done

Type:

time

property repeat_tolerance

values are considered repeated if they differ by less than repeat_tolerance

Type:

float
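The min-max and repeated-value checks can be sketched in pure Python. This is a simplified illustration of the semantics described above, with a hypothetical helper name; it is not the shyft implementation and ignores the correction/fill-in step:

```python
from math import isnan

def qac_flags(times, values, min_x, max_x, repeat_timespan, repeat_tolerance):
    """Return True per sample where it fails the min-max or repeated-value check."""
    flags = []
    repeat_start = None  # time when the current run of repeated values began
    for i, (t, v) in enumerate(zip(times, values)):
        bad = isnan(v) or v < min_x or v > max_x
        if i > 0 and abs(v - values[i - 1]) < repeat_tolerance:
            if repeat_start is None:
                repeat_start = times[i - 1]
            if t - repeat_start > repeat_timespan:
                bad = True  # value stuck longer than allowed
        else:
            repeat_start = None
        flags.append(bad)
    return flags

times = [0, 600, 1200, 1800, 2400]
values = [5.0, 120.0, 5.0, 5.0, 5.0]  # one out-of-range spike, then a stuck value
print(qac_flags(times, values, min_x=-40.0, max_x=60.0,
                repeat_timespan=1000, repeat_tolerance=0.01))
```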

Hydrology

The elements in this section are hydrology specific.

Class IcePackingParameters

class shyft.time_series.IcePackingParameters

Bases: instance

Parameter pack controlling ice packing computations. See TimeSeries.ice_packing for usage.

__init__((IcePackingParameters)self, (time)threshold_window, (object)threshold_temperature) None :

Defines a parameter pack for ice packing detection.

Args:

threshold_window (utctime): Positive number of seconds for the lookback window.

threshold_temperature (float): Floating point threshold temperature.

__init__( (IcePackingParameters)self, (object)threshold_window, (object)threshold_temperature) -> None :

Defines a parameter pack for ice packing detection.

Args:

threshold_window (int): Positive integer seconds for the lookback window.

threshold_temperature (float): Floating point threshold temperature.

property threshold_temperature

The threshold temperature for ice packing to occur. Ice packing will occur when the average temperature in the window period is less than the threshold.

Type:

float

property threshold_window

The period back in seconds for which the average temperature is computed when looking for ice packing.

Type:

time

Class IcePackingRecessionParameters

class shyft.time_series.IcePackingRecessionParameters

Bases: instance

Parameter pack controlling ice packing recession computations. See TimeSeries.ice_packing_recession for usage.

__init__((IcePackingRecessionParameters)self, (object)alpha, (object)recession_minimum) None :

Defines a parameter pack for ice packing reduction using a simple recession for the water-flow.

Parameters:
  • alpha (float) – Recession curve curving parameter.

  • recession_minimum (float) – Minimum value for the recession.

property alpha

Parameter controlling the curving of the recession curve.

Type:

float

property recession_minimum

The minimum value of the recession curve.

Type:

float

Class ice_packing_temperature_policy

class shyft.time_series.ice_packing_temperature_policy

Bases: enum

Policy enum to specify how TimeSeries.ice_packing handles missing temperature values.

The enum defines three values:
  • DISALLOW_MISSING disallows any missing values. With this policy whenever a NaN value is encountered, or the window of values to consider extends outside the range of the time series, a NaN value will be written to the result time-series.

  • ALLOW_INITIAL_MISSING disallows explicit NaN values, but allows the window of values to consider to extend past the range of the time-series for the initial values.

  • ALLOW_ANY_MISSING allows the window of values to contain NaN values, averaging what it can. Only if all the values in the window are NaN will the result be NaN.

ALLOW_ANY_MISSING = shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_ANY_MISSING
ALLOW_INITIAL_MISSING = shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_INITIAL_MISSING
DISALLOW_MISSING = shyft.time_series._time_series.ice_packing_temperature_policy.DISALLOW_MISSING
names = {'ALLOW_ANY_MISSING': shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_ANY_MISSING, 'ALLOW_INITIAL_MISSING': shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_INITIAL_MISSING, 'DISALLOW_MISSING': shyft.time_series._time_series.ice_packing_temperature_policy.DISALLOW_MISSING}
values = {0: shyft.time_series._time_series.ice_packing_temperature_policy.DISALLOW_MISSING, 1: shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_INITIAL_MISSING, 2: shyft.time_series._time_series.ice_packing_temperature_policy.ALLOW_ANY_MISSING}
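The three policies can be illustrated with a small pure-Python sketch of the windowed temperature average that ice packing detection is based on (ice packing occurs when this average falls below the threshold temperature). The function name `window_mean_temperature` and the integer window length are illustrative choices; this is not the shyft implementation.

```python
import math

# Illustrative stand-ins for the enum values, in the order documented above.
DISALLOW_MISSING, ALLOW_INITIAL_MISSING, ALLOW_ANY_MISSING = 0, 1, 2

def window_mean_temperature(temps, i, window, policy):
    """Average temps[i-window+1 .. i], honouring the missing-value policy.

    Sketch of how the ice_packing_temperature_policy values behave;
    not the shyft implementation.
    """
    lo = i - window + 1
    initial_missing = lo < 0                      # window starts before the series
    vals = [temps[j] for j in range(max(lo, 0), i + 1)]
    has_nan = any(math.isnan(v) for v in vals)
    if policy == DISALLOW_MISSING and (initial_missing or has_nan):
        return float("nan")                       # any gap at all -> NaN
    if policy == ALLOW_INITIAL_MISSING and has_nan:
        return float("nan")                       # explicit NaNs still forbidden
    finite = [v for v in vals if not math.isnan(v)]
    return sum(finite) / len(finite) if finite else float("nan")
```

For the first points of a series, DISALLOW_MISSING yields NaN while ALLOW_INITIAL_MISSING averages the values that are available; ALLOW_ANY_MISSING additionally skips over explicit NaNs inside the window.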

Class RatingCurveFunction

class shyft.time_series.RatingCurveFunction

Bases: instance

Combine multiple RatingCurveSegments into a rating function.

RatingCurveFunction aggregates multiple RatingCurveSegments and routes computation calls to the correct segment based on the water level to compute for.

See also

RatingCurveSegment, RatingCurveParameters

__init__((RatingCurveFunction)self) None :

Defines a new empty rating curve function.

__init__( (RatingCurveFunction)self, (RatingCurveSegments)segments [, (object)is_sorted=True]) -> None :

constructs a function from a segment-list

add_segment((RatingCurveFunction)self, (object)lower, (object)a, (object)b, (object)c) None :

Add a new curve segment with the given parameters.

See also:

RatingCurveSegment

add_segment( (RatingCurveFunction)self, (RatingCurveSegment)segment) -> None :

Add a new curve segment as a copy of an existing one.

See also:

RatingCurveSegment

flow((RatingCurveFunction)self, (DoubleVector)levels) DoubleVector :

Compute flow for a range of water levels.

Args:

levels (DoubleVector): Range of water levels to compute flow for.

flow( (RatingCurveFunction)self, (object)level) -> float :

Compute flow for the given level.

Args:

level (float): Water level to compute flow for.

size((RatingCurveFunction)self) int :

Get the number of RatingCurveSegments composing the function.
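The routing described above can be sketched in a few lines of plain Python: pick the segment with the greatest lower bound that does not exceed the level, then evaluate its a*(h - b)^c equation. The function name `rating_flow` and the tuple representation of segments are illustrative; this is not the shyft implementation, and it assumes the segment list is sorted on the lower bound (compare the is_sorted argument of the constructor).

```python
def rating_flow(segments, level):
    """Route a water level to the correct rating-curve segment.

    segments: list of (lower, a, b, c), sorted ascending on lower.
    Returns a*(level - b)**c for the active segment, or NaN when the
    level is below the first segment.  Sketch only; not the shyft code.
    """
    active = None
    for lower, a, b, c in segments:
        if level >= lower:
            active = (a, b, c)    # this segment is valid; keep looking higher
        else:
            break                 # sorted list: no later segment can match
    if active is None:
        return float("nan")       # below the least valid water level
    a, b, c = active
    return a * (level - b) ** c
```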

Class RatingCurveParameters

class shyft.time_series.RatingCurveParameters

Bases: instance

Parameter pack controlling rating level computations.

A parameter pack encapsulates multiple RatingCurveFunction’s with time-points. When used with a TimeSeries representing level values it maps computations for each level value onto the correct RatingCurveFunction, which again maps onto the correct RatingCurveSegment for the level value.

See also

RatingCurveSegment, RatingCurveFunction, TimeSeries.rating_curve

__init__((object)arg1) object :

Defines an empty RatingCurveParameters instance

__init__( (object)arg1, (RatingCurveTimeFunctions)t_f_list) -> object :

create parameters in one go from list of RatingCurveTimeFunction elements

Args:

t_f_list (RatingCurveTimeFunctions): a list of RatingCurveTimeFunction elements

add_curve((RatingCurveParameters)self, (time)t, (RatingCurveFunction)curve) RatingCurveParameters :

Add a curve to the parameter pack.

Parameters:
  • t (time) – Time point from which the curve is valid.

  • curve (RatingCurveFunction) – The rating curve function to add.

Returns:

self, to allow chaining of build calls

Return type:

RatingCurveParameters

flow((RatingCurveParameters)self, (time)t, (object)level) float :

Compute the flow at a specific time point.

Args:

t (utctime): Time-point of the level value.

level (float): Level value at t.

Returns:

float: flow. Flow corresponding to input level at t, nan if level is less than the least water level of the first segment or before the time of the first rating curve function.

flow( (RatingCurveParameters)self, (TimeSeries)ts) -> DoubleVector :

Compute the flow for a time series of level values.

Args:

ts (TimeSeries): Time series of level values.

Returns:

DoubleVector: flow. Flow corresponding to the input levels of the time-series, nan where the level is less than the least water level of the first segment and for time-points before the first rating curve function.
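The time mapping a parameter pack performs can be sketched as a sorted lookup: for each time point, find the last curve whose valid-from time is not after it, then evaluate that curve for the level. The names `flow_at` and `flow_fn` are illustrative, and the curves are represented as plain (valid-from-time, curve) pairs; this is not the shyft implementation.

```python
from bisect import bisect_right

def flow_at(curves, t, level, flow_fn):
    """Select the rating curve active at time t and evaluate it.

    curves: list of (valid_from, curve), sorted ascending on valid_from.
    flow_fn(curve, level) evaluates one curve (a stand-in for
    RatingCurveFunction.flow).  Returns NaN for t before the first curve.
    Sketch only; not the shyft code.
    """
    times = [vf for vf, _ in curves]
    i = bisect_right(times, t) - 1     # last curve with valid_from <= t
    if i < 0:
        return float("nan")            # before the first rating curve function
    return flow_fn(curves[i][1], level)
```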

Class RatingCurveSegment

class shyft.time_series.RatingCurveSegment

Bases: instance

Represent a single rating-curve equation.

The rating curve function is a*(h - b)^c where a, b, and c are parameters for the segment and h is the water level to compute flow for. Additionally there is a lower parameter giving the least water level the segment is valid for. Seen separately, a segment is considered valid for any level greater than lower. The segments are gathered into a RatingCurveFunction to represent a set of different rating functions for different levels. Related classes are RatingCurveFunction, RatingCurveParameters

__init__((RatingCurveSegment)self) None
__init__( (RatingCurveSegment)self, (object)lower, (object)a, (object)b, (object)c) -> None :

Defines a new RatingCurveSegment with the specified parameters

property a

Parameter a

Type:

float

property b

Parameter b

Type:

float

property c

Parameter c

Type:

float

flow((RatingCurveSegment)self, (DoubleVector)levels[, (int)i0=0[, (int)iN=18446744073709551615]]) DoubleVector :

Compute the flow for a range of water levels

Args:

levels (DoubleVector): Vector of water levels

i0 (int): first index to use from levels, defaults to 0

iN (int): first index _not_ to use from levels, defaults to std::size_t maximum.

Returns:

DoubleVector: flow. Vector of flow values.

flow( (RatingCurveSegment)self, (object)level) -> float :

Compute the flow for the given water level.

Notes:

There is _no_ check to see if level is valid. It’s up to the user to call with a correct level.

Args:

level (float): water level

Returns:

double: flow. the flow for the given water level

property lower

Least valid water level. Not mutable after constructing a segment.

Type:

float

valid((RatingCurveSegment)self, (object)level) bool :

Check if a water level is valid for the curve segment

level (float): water level

Returns:

valid. True if level is greater than or equal to lower

Return type:

bool
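The segment equation and its validity rule documented above can be written out directly. The function name `segment_flow` is an illustrative choice; this is a one-line restatement of the documented formula, not the shyft implementation (which, as noted, does not check validity in flow()).

```python
def segment_flow(lower, a, b, c, level):
    """Flow for a single rating-curve segment: a*(h - b)^c.

    Unlike RatingCurveSegment.flow, this sketch refuses levels below the
    segment's least valid water level instead of leaving the check to
    the caller.
    """
    if level < lower:
        raise ValueError("level below the segment's valid range")
    return a * (level - b) ** c
```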

Class RatingCurveSegments

class shyft.time_series.RatingCurveSegments

Bases: instance

A typed list of RatingCurveSegment, used to construct RatingCurveParameters.

__init__((RatingCurveSegments)self) None

__init__( (RatingCurveSegments)arg1, (RatingCurveSegments)clone_me) -> None

append((RatingCurveSegments)arg1, (object)arg2) None
extend((RatingCurveSegments)arg1, (object)arg2) None

Class RatingCurveTimeFunction

class shyft.time_series.RatingCurveTimeFunction

Bases: instance

Composed of time t and RatingCurveFunction

__init__((RatingCurveTimeFunction)self) None :

Defines empty pair t,f

__init__( (RatingCurveTimeFunction)self, (time)t, (RatingCurveFunction)f) -> None :

Construct an object with function f valid from time t

Args:

t (int): epoch time in 1970 utc [s]

f (RatingCurveFunction): the function

property f

the rating curve function

Type:

RatingCurveFunction

property t

the time t from which .f is valid, as seconds since epoch 1970 UTC

Type:

time

Class RatingCurveTimeFunctions

class shyft.time_series.RatingCurveTimeFunctions

Bases: instance

A typed list of RatingCurveTimeFunction elements

__init__((RatingCurveTimeFunctions)self) None :

Defines empty list pair t,f

__init__( (RatingCurveTimeFunctions)arg1, (RatingCurveTimeFunctions)clone_me) -> None

append((RatingCurveTimeFunctions)arg1, (object)arg2) None
extend((RatingCurveTimeFunctions)arg1, (object)arg2) None