tspy.data_structures package

class tspy.data_structures.MultiTimeSeries(tsc, j_mts)

Bases: object

A collection of TimeSeries where each time-series is identified by some key.

Notes

Like time-series, operations performed against a multi-time-series are executed lazily unless specified otherwise.

All transforms against a multi-time-series are run in parallel across time-series.

There is no assumption that all time-series must be aligned or have like periodicity.

Examples

create a multi-time-series from a dict

>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.builder().add(tspy.observation(2,4)).add(tspy.observation(10,1)).result().to_time_series()
>>> mts = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1

create a multi-time-series from a pandas dataframe

>>> import tspy
>>> import numpy as np
>>> import pandas as pd
>>> header = ['', 'key', 'timestamp', "name", "age"]
>>> row1 = ['Row1', "a", 1, "josh", 27]
>>> row2 = ['Row2', "b", 3, "john", 4]
>>> row3 = ['Row3', "a", 5, "bob", 17]
>>> data = np.array([header, row1, row2, row3])
>>> df = pd.DataFrame(data=data[1:, 1:], index=data[1:, 0], columns=data[0, 1:]).astype(dtype={'key': 'object', 'timestamp': 'int64'})
>>> mts = tspy.multi_time_series(df, "key", "timestamp")
>>> mts
a time series
------------------------------
TimeStamp: 1     Value: {name=josh, age=27}
TimeStamp: 5     Value: {name=bob, age=17}
b time series
------------------------------
TimeStamp: 3     Value: {name=john, age=4}

Attributes

keys

all keys in this multi-time-series

Methods

aggregate(zero, seq_func, comb_func)

aggregate all series in this multi-time-series to produce a single value

aggregate_series(list_to_val_func)

aggregate all time-series in the multi-time-series using an aggregation function to produce a single time-series

aggregate_series_with_key(list_to_val_func)

aggregate all time-series in the multi-time-series, with access to each key, using an aggregation function to produce a single time-series

align(key[, interp_func])

align all time-series on a key

cache([cache_size])

suggest to the multi-time-series to cache values

collect([inclusive])

collect and materialize this multi-time-series

collect_series(key[, inclusive])

get a collection of observations given a key

describe()

retrieve a NumStats object per time-series computed from all values in this multi-time-series (double)

fillna(interpolator[, null_value])

produce a new multi-time-series which is the result of filling all null values.

filter(func)

produce a new multi-time-series which is the result of filtering by each observation’s value given a filter function.

filter_series(func)

filter each time-series by its time-series object

filter_series_key(func)

filter each time-series by its key

forecast(num_predictions, fm[, …])

forecast the next num_predictions using a forecasting model for each time-series

full_align(multi_time_series[, …])

align two multi-time-series based on a temporal full join strategy and optionally interpolate missing values

full_join(multi_time_series[, join_func, …])

join two multi-time-series based on a temporal full join strategy and optionally interpolate missing values

get_time_series(key)

get a time-series given a key

get_values(start, end[, inclusive])

get all values between a range in this multi-time-series

inner_align(multi_time_series)

align two multi-time-series based on a temporal inner join strategy

inner_join(multi_time_series[, join_func])

join two multi-time-series based on a temporal inner join strategy

left_align(multi_time_series[, interp_func])

align two multi-time-series based on a temporal left join strategy and optionally interpolate missing values

left_join(multi_time_series[, join_func, …])

join two multi-time-series based on a temporal left join strategy and optionally interpolate missing values

left_outer_align(multi_time_series[, …])

align two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values

left_outer_join(multi_time_series[, …])

join two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values

map(func)

produce a new multi-time-series where each observation’s value in this multi-time-series is mapped to a new observation value

map_series(func)

map each ObservationCollection to a new collection of observations

map_series_key(func)

map each time-series key to a new key

map_series_with_key(func)

map each ObservationCollection to a new collection of observations giving access to each time-series key

pair_wise_transform(binary_transform)

produce a new multi-time-series which is the product of performing a pair-wise transform against all combinations of keys

print([start, end, inclusive])

print this multi-time-series

reduce(func)

reduce each time-series in this multi-time-series to a single value

reduce_range(func, start, end[, inclusive])

reduce each time-series in this multi-time-series to a single value given a range

resample(period, interp_func)

produce a new multi-time-series by resampling each time-series to a given periodicity

right_align(multi_time_series[, interp_func])

align two multi-time-series based on a temporal right join strategy and optionally interpolate missing values

right_join(multi_time_series[, join_func, …])

join two multi-time-series based on a temporal right join strategy and optionally interpolate missing values

right_outer_align(multi_time_series[, …])

align two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values

right_outer_join(multi_time_series[, …])

join two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values

segment(window[, step, enforce_bounds])

produce a new segment-multi-time-series from performing a sliding-based segmentation over each time-series

segment_by(func)

produce a new segment-multi-time-series from performing a group-by operation on each observation’s value for each time-series

segment_by_anchor(func, left_delta, right_delta)

produce a new segment-multi-time-series from performing an anchor-based segmentation over each time-series.

segment_by_changepoint([change_point])

produce a new segment-multi-time-series from performing a change-point based segmentation.

segment_by_time(window, step)

produce a new segment-multi-time-series from performing a time-based segmentation over each time-series

time_series(key)

get a time-series given a key

to_df([format, inclusive])

convert this multi-time-series to a pandas dataframe.

to_df_instants([inclusive])

convert this multi-time-series to an observations pandas dataframe.

to_df_observations([inclusive])

convert this multi-time-series to an observations pandas dataframe.

to_segments(segment_transform)

produce a new segment-multi-time-series from a segmentation transform

transform(*args)

produce a new multi-time-series which is the result of performing a transform over each time-series.

trs(key)

get a time-series time-reference-system given a key

uncache()

remove the multi-time-series caching mechanism

with_trs([granularity, time_tick])

create a new multi-time-series with its timestamps mapped based on a granularity and start_time.

write([start, end, inclusive])

create a multi-time-series-writer given a range

aggregate(zero, seq_func, comb_func)

aggregate all series in this multi-time-series to produce a single value

Parameters
zero : any

zero value for aggregation

seq_func : func

operation to perform against each time-series to reduce it to a single value

comb_func : func

operation to combine the reduced values of the individual time-series

Returns
any

single output value representing the aggregate of all time-series in the multi-time-series

Raises
TSErrorWithMessage

If there is an error in aggregating, e.g. incorrect type

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.builder().add(tspy.observation(2,4)).add(tspy.observation(10,1)).result().to_time_series()
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1

get the sum over all time-series

>>> from tspy.functions import reducers
>>> total = mts_orig.aggregate(0, lambda agg, cur: agg + cur.reduce(reducers.sum()), lambda agg1, agg2: agg1 + agg2)
>>> total
11.0
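
The roles of zero, seq_func and comb_func can be pictured as a two-level fold. The following is a minimal plain-Python sketch, not the library's implementation: the dict of lists stands in for the multi-time-series, and the assumption that each series is folded separately before the partial results are combined is ours.

```python
from functools import reduce

# Plain-Python stand-in for the multi-time-series above: key -> list of values.
series_by_key = {"a": [1, 2, 3], "b": [4, 1]}

def aggregate(series_by_key, zero, seq_func, comb_func):
    # Fold each series into its own partial result, starting from `zero`
    # (assumed behaviour), then merge the partial results with `comb_func`.
    partials = [seq_func(zero, series) for series in series_by_key.values()]
    return reduce(comb_func, partials, zero)

total = aggregate(
    series_by_key,
    0,
    lambda agg, cur: agg + sum(cur),  # reduce one series and add it to the running value
    lambda agg1, agg2: agg1 + agg2,   # merge two partial results
)
print(total)  # 11
```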

aggregate_series(list_to_val_func)

aggregate all time-series in the multi-time-series using an aggregation function to produce a single time-series

Parameters
list_to_val_func : func

function which produces a single value given a list of values

Returns
TimeSeries

a new time-series

Notes

all time-series in this multi-time-series should be aligned prior to calling aggregate_series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.time_series([2,3,4])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4

create a sum per time-tick time-series

>>> ts = mts_orig.aggregate_series(lambda l: sum(l))
>>> ts
TimeStamp: 0     Value: 3
TimeStamp: 1     Value: 5
TimeStamp: 2     Value: 7
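
What list_to_val_func receives can be modelled without the library. This is a hedged sketch in plain Python: the nested dicts stand in for already aligned time-series, and the helper name mirrors the method only for readability.

```python
# key -> {timestamp: value}; both series share the same time-ticks (already aligned).
aligned = {
    "a": {0: 1, 1: 2, 2: 3},
    "b": {0: 2, 1: 3, 2: 4},
}

def aggregate_series(aligned, list_to_val_func):
    # Gather the values of every series at each time-tick and collapse them to one value.
    ticks = sorted(next(iter(aligned.values())))
    return {t: list_to_val_func([obs[t] for obs in aligned.values()]) for t in ticks}

summed = aggregate_series(aligned, sum)
print(summed)  # {0: 3, 1: 5, 2: 7}
```

Any function from a list to a single value fits the same slot, for example max or a mean.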

aggregate_series_with_key(list_to_val_func)

aggregate all time-series in the multi-time-series, with access to each key, using an aggregation function to produce a single time-series

Parameters
list_to_val_func : func

function which produces a single value given a list of pairs with key and value

Returns
TimeSeries

a new time-series
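
Access to the key makes per-series weighting possible. The sketch below is plain Python and rests on two assumptions not stated above: that each pair arrives as a (key, value) tuple, and that the series are already aligned. The weights dict is purely illustrative.

```python
aligned = {"a": {0: 1, 1: 2, 2: 3}, "b": {0: 2, 1: 3, 2: 4}}
weights = {"a": 1.0, "b": 0.5}  # hypothetical per-key weights

def aggregate_series_with_key(aligned, list_to_val_func):
    # At each time-tick, hand the function a list of (key, value) pairs.
    ticks = sorted(next(iter(aligned.values())))
    return {
        t: list_to_val_func([(key, obs[t]) for key, obs in aligned.items()])
        for t in ticks
    }

weighted = aggregate_series_with_key(
    aligned, lambda pairs: sum(weights[k] * v for k, v in pairs)
)
print(weighted)  # {0: 2.0, 1: 3.5, 2: 5.0}
```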

align(key, interp_func=<function MultiTimeSeries.<lambda>>)

align all time-series on a key

Parameters
key : any

key to a time-series within this multi-time-series

interp_func : func or interpolator, optional

the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
MultiTimeSeries

a new multi-time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.builder().add(tspy.observation(2,4)).add(tspy.observation(10,1)).result().to_time_series()
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1

align all time-series on time-series ‘b’

>>> mts = mts_orig.align("b")
>>> mts
a time series
------------------------------
TimeStamp: 2     Value: 3
TimeStamp: 10     Value: null
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1
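
The result above can be reproduced with a small plain-Python model. This is a sketch under assumptions: nested dicts stand in for time-series, and the (observations, time-tick) signature of interp_func is hypothetical.

```python
series = {
    "a": {0: 1, 1: 2, 2: 3},
    "b": {2: 4, 10: 1},
}

def align(series, key, interp_func=lambda obs, t: None):
    # Use the time-ticks of the series stored under `key` as the reference,
    # then look every series up at those ticks, interpolating when a value is missing.
    ticks = sorted(series[key])
    return {
        k: {t: obs[t] if t in obs else interp_func(obs, t) for t in ticks}
        for k, obs in series.items()
    }

aligned = align(series, "b")
print(aligned["a"])  # {2: 3, 10: None}
print(aligned["b"])  # {2: 4, 10: 1}
```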

cache(cache_size=None)

suggest to the multi-time-series to cache values

Parameters
cache_size : int, optional

the max cache size (default is max long)

Returns
MultiTimeSeries

a new multi-time-series

Notes

this is a lazy operation and will only suggest to the multi-time-series to save values once computed

collect(inclusive=False)

collect and materialize this multi-time-series

Parameters
inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Returns
dict

a collection of observations for each key

Raises
TSErrorWithMessage

If there is an error in collecting data, e.g. incorrect type

Notes

see collect() for usage

collect_series(key, inclusive=False)

get a collection of observations given a key

Parameters
key : any

the key associated with a time-series in this multi-time-series

inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Returns
ObservationCollection

the collection of observations associated with the given key

Raises
ValueError

If there is an error in retrieving the observations, e.g. incorrect key

describe()

retrieve a NumStats object per time-series computed from all values in this multi-time-series (double)

Returns
dict

NumStats for each key

Raises
TSErrorWithMessage

If describe does not support the data type of the values

fillna(interpolator, null_value=None)

produce a new multi-time-series which is the result of filling all null values.

Parameters
interpolator : func or interpolator

the interpolator method to be used when a value is null

null_value : any, optional

denotes a null value, for instance if null_value = NaN, NaN would be filled

Returns
MultiTimeSeries

a new multi-time-series
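
As an illustration of the idea, the sketch below fills nulls with the previous non-null value, applied independently to each series. It is plain Python, not a tspy call: fill_prev is a hypothetical stand-in for an interpolator, and -1 is an arbitrary sentinel playing the role of null_value.

```python
def fill_prev(values, null_value=None):
    # Replace each null with the most recent non-null value seen so far.
    filled, last = [], None
    for v in values:
        if v is None or (null_value is not None and v == null_value):
            filled.append(last)  # stays None when no earlier value exists
        else:
            last = v
            filled.append(v)
    return filled

series = {"a": [1, None, 3], "b": [None, 4, -1]}
filled = {key: fill_prev(values, null_value=-1) for key, values in series.items()}
print(filled)  # {'a': [1, 1, 3], 'b': [None, 4, 4]}
```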

filter(func)

produce a new multi-time-series which is the result of filtering by each observation’s value given a filter function.

Parameters
func : func

the filter function applied to each observation’s value

Returns
MultiTimeSeries

a new multi-time-series

Notes

see filter() for usage

filter_series(func)

filter each time-series by its time-series object

Parameters
func : func

function which given a time-series will produce a boolean denoting whether to keep the time-series

Returns
MultiTimeSeries

a new multi-time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1.0, 2.0, 3.0, 4.0])
>>> ts2 = tspy.time_series([1.0, -2.0, 3.0, 4.0])
>>> ts3 = tspy.time_series([0.0, 1.0, 2.0, 4.0])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2, 'c': ts3})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0

keep only the time-series that do not contain the value -2.0

>>> mts = mts_orig.filter_series(lambda s: -2.0 not in [x.value for x in s.collect()])
>>> mts
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0

filter_series_key(func)

filter each time-series by its key

Parameters
func : func

function which given a key will produce a boolean denoting whether to keep the time-series

Returns
MultiTimeSeries

a new multi-time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1.0, 2.0, 3.0, 4.0])
>>> ts2 = tspy.time_series([1.0, -2.0, 3.0, 4.0])
>>> ts3 = tspy.time_series([0.0, 1.0, 2.0, 4.0])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2, 'c': ts3})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0

filter each series by key != ‘a’

>>> mts = mts_orig.filter_series_key(lambda k: k != 'a')
>>> mts
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0

forecast(num_predictions, fm, start_training_time=None, confidence=1.0)

forecast the next num_predictions using a forecasting model for each time-series

Parameters
num_predictions : int

number of forecasts past the end of the time-series to retrieve

fm : ForecastingModel

the forecasting model to use

start_training_time : int or datetime, optional

point at which to start training the forecasting model

confidence : float, optional

number between 0 and 1 which is used in calculating the confidence interval (default is 1.0)

Returns
dict

a collection of observations for each key

Raises
TSErrorWithMessage

If there is an error in forecasting

Notes

see forecast() for usage

full_align(multi_time_series, left_interp_func=<function MultiTimeSeries.<lambda>>, right_interp_func=<function MultiTimeSeries.<lambda>>)

align two multi-time-series based on a temporal full join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries

the multi-time-series to align with

left_interp_func : func or interpolator, optional

the left multi-time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

right_interp_func : func or interpolator, optional

the right multi-time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
tuple

aligned multi-time-series

Notes

full align will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see full_align() for usage
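
A plain-Python model of the behaviour described here: matching keys are kept, and for each of them both sides are expanded to the union of their time-ticks. The nested dicts and the (observations, time-tick) interpolator signature are assumptions made for this sketch.

```python
def full_align(left, right, left_interp=None, right_interp=None):
    fill_none = lambda obs, t: None  # default: fill missing values with None
    left_interp = left_interp or fill_none
    right_interp = right_interp or fill_none
    aligned_left, aligned_right = {}, {}
    for key in sorted(left.keys() & right.keys()):  # keys missing on either side are dropped
        l_obs, r_obs = left[key], right[key]
        ticks = sorted(l_obs.keys() | r_obs.keys())  # full join: union of time-ticks
        aligned_left[key] = {t: l_obs[t] if t in l_obs else left_interp(l_obs, t) for t in ticks}
        aligned_right[key] = {t: r_obs[t] if t in r_obs else right_interp(r_obs, t) for t in ticks}
    return aligned_left, aligned_right

left = {"a": {0: 1, 2: 3}, "b": {1: 5}}
right = {"a": {2: 30, 4: 50}, "c": {0: 9}}
aligned_left, aligned_right = full_align(left, right)
print(aligned_left)   # {'a': {0: 1, 2: 3, 4: None}}
print(aligned_right)  # {'a': {0: None, 2: 30, 4: 50}}
```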

full_join(multi_time_series, join_func=None, left_interp_func=<function MultiTimeSeries.<lambda>>, right_interp_func=<function MultiTimeSeries.<lambda>>)

join two multi-time-series based on a temporal full join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries or TimeSeries

the multi-time-series to join with

join_func : func, optional

function to join two values (default is join to list where left is index 0, right is index 1)

left_interp_func : func or interpolator, optional

the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

right_interp_func : func or interpolator, optional

the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
MultiTimeSeries

a new multi-time-series

Notes

full join will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see full_join() for usage

get_time_series(key)

get a time-series given a key

Parameters
key : any

the key associated with a time-series in this multi-time-series

Returns
TimeSeries

the time-series associated with the given key

Raises
ValueError

If there is an error in retrieving the time-series, e.g. incorrect key

get_values(start, end, inclusive=False)

get all values between a range in this multi-time-series

Parameters
start : int or datetime

start of range (inclusive)

end : int or datetime

end of range (inclusive)

inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Returns
dict

a collection of observations for each key

Raises
TSErrorWithMessage

If there is an error in collecting data, e.g. incorrect type

Notes

see get_values() for usage

inner_align(multi_time_series)

align two multi-time-series based on a temporal inner join strategy

Parameters
multi_time_series : MultiTimeSeries

the multi-time-series to align with

Returns
tuple

aligned multi-time-series

Notes

inner align will align on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see inner_align() for usage

inner_join(multi_time_series, join_func=None)

join two multi-time-series based on a temporal inner join strategy

Parameters
multi_time_series : MultiTimeSeries or TimeSeries

the multi-time-series to join with

join_func : func, optional

function to join 2 values at a given time-tick. If None given, joined value will be in a list (default is None)

Returns
MultiTimeSeries

a new multi-time-series

Notes

inner join will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see inner_join() for usage
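
The combination of key matching, time-tick intersection and join_func can be sketched in plain Python. This is an illustrative model with nested dicts standing in for time-series, not the library's code.

```python
def inner_join(left, right, join_func=None):
    if join_func is None:
        join_func = lambda l, r: [l, r]  # default: pair the two values in a list
    joined = {}
    for key in sorted(left.keys() & right.keys()):  # only keys present on both sides
        ticks = sorted(left[key].keys() & right[key].keys())  # only shared time-ticks
        joined[key] = {t: join_func(left[key][t], right[key][t]) for t in ticks}
    return joined

left = {"a": {0: 1, 1: 2, 2: 3}, "b": {0: 7}}
right = {"a": {1: 20, 2: 30, 3: 40}}

print(inner_join(left, right))                     # {'a': {1: [2, 20], 2: [3, 30]}}
print(inner_join(left, right, lambda l, r: l + r)) # {'a': {1: 22, 2: 33}}
```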

property keys
Returns
list

all keys in this multi-time-series

left_align(multi_time_series, interp_func=<function MultiTimeSeries.<lambda>>)

align two multi-time-series based on a temporal left join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries

the multi-time-series to align with

interp_func : func or interpolator, optional

the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
tuple

aligned multi-time-series

Notes

left align will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see left_align() for usage

left_join(multi_time_series, join_func=None, interp_func=<function MultiTimeSeries.<lambda>>)

join two multi-time-series based on a temporal left join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries or TimeSeries

the multi-time-series to join with

join_func : func, optional

function to join two values (default is join to list where left is index 0, right is index 1)

interp_func : func or interpolator, optional

the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
MultiTimeSeries

a new multi-time-series

Notes

left join will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see left_join() for usage
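
A left join keeps every time-tick of the left series and asks the interpolator for the right-hand value when it is missing. The sketch below models this in plain Python; prev_value is a hypothetical interpolator and its (observations, time-tick) signature is an assumption.

```python
def prev_value(obs, t):
    # Hypothetical interpolator: most recent value at or before t, else None.
    earlier = [ts for ts in obs if ts <= t]
    return obs[max(earlier)] if earlier else None

def left_join(left, right, join_func=None, interp_func=lambda obs, t: None):
    join_func = join_func or (lambda l, r: [l, r])
    joined = {}
    for key in sorted(left.keys() & right.keys()):
        r_obs = right[key]
        # Every left time-tick survives; the right value is interpolated when absent.
        joined[key] = {
            t: join_func(l_val, r_obs[t] if t in r_obs else interp_func(r_obs, t))
            for t, l_val in sorted(left[key].items())
        }
    return joined

left = {"a": {0: 1, 1: 2, 2: 3}}
right = {"a": {1: 20}}

print(left_join(left, right))
# {'a': {0: [1, None], 1: [2, 20], 2: [3, None]}}
print(left_join(left, right, interp_func=prev_value))
# {'a': {0: [1, None], 1: [2, 20], 2: [3, 20]}}
```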

left_outer_align(multi_time_series, interp_func=<function MultiTimeSeries.<lambda>>)

align two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries

the multi-time-series to align with

interp_func : func or interpolator, optional

the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
tuple

aligned multi-time-series

Notes

left outer align will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see left_outer_align() for usage

left_outer_join(multi_time_series, join_func=None, interp_func=<function MultiTimeSeries.<lambda>>)

join two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries or TimeSeries

the multi-time-series to join with

join_func : func, optional

function to join two values (default is join to list where left is index 0, right is index 1)

interp_func : func or interpolator, optional

the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
MultiTimeSeries

a new multi-time-series

Notes

left outer join will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see left_outer_join() for usage

map(func)

produce a new multi-time-series where each observation’s value in this multi-time-series is mapped to a new observation value

Parameters
func : func

value mapping function

Returns
MultiTimeSeries

a new multi-time-series

Notes

see map() for usage

map_series(func)

map each ObservationCollection to a new collection of observations

Parameters
func : func

function which given a collection of observations, will produce a new collection of observations

Returns
MultiTimeSeries

a new multi-time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.builder().add(tspy.observation(2,4)).add(tspy.observation(10,1)).result().to_time_series()
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1

add one to each value in our multi-time-series

>>> mts = mts_orig.map_series(lambda s: s.to_time_series().map(lambda x: x + 1).collect())
>>> mts
a time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4
b time series
------------------------------
TimeStamp: 2     Value: 5
TimeStamp: 10     Value: 2

map_series_key(func)

map each time-series key to a new key

Parameters
func : func

function which given a time-series key, will produce a new time-series key

Returns
MultiTimeSeries

a new multi-time-series

Notes

all produced keys must be unique
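
The uniqueness requirement is easy to check before remapping. The sketch below is plain Python; how the library itself reacts to a collision is not stated here, so raising ValueError is an assumption of this model.

```python
def map_series_key(series, func):
    remapped = {}
    for key, obs in series.items():
        new_key = func(key)
        if new_key in remapped:
            # Two old keys mapped to the same new key: one series would be lost.
            raise ValueError(f"duplicate key after mapping: {new_key!r}")
        remapped[new_key] = obs
    return remapped

series = {"a": [1, 2, 3], "b": [4, 1]}
print(map_series_key(series, str.upper))  # {'A': [1, 2, 3], 'B': [4, 1]}
```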

map_series_with_key(func)

map each ObservationCollection to a new collection of observations giving access to each time-series key

Parameters
func : func

function which given a collection of observations and a key, will produce a new collection of observations

Returns
MultiTimeSeries

a new multi-time-series

pair_wise_transform(binary_transform)

produce a new multi-time-series which is the product of performing a pair-wise transform against all combinations of keys

Parameters
binary_transform : BinaryTransform

the binary transform to execute across all pairs in this multi-time-series

Returns
MultiTimeSeries

a new multi-time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1.0, 2.0, 3.0, 4.0])
>>> ts2 = tspy.time_series([1.0, -2.0, 3.0, 4.0])
>>> ts3 = tspy.time_series([0.0, 1.0, 2.0, 4.0])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2, 'c': ts3})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0

perform a pair-wise correlation on this multi-time-series (sliding windows of size 3)

>>> from tspy.functions import reducers
>>> mts = mts_orig.segment(3).pair_wise_transform(reducers.correlation())
>>> mts
(a, a) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 1.0
(a, b) time series
------------------------------
TimeStamp: 0     Value: 0.3973597071195132
TimeStamp: 1     Value: 0.9332565252573828
(b, a) time series
------------------------------
TimeStamp: 0     Value: 0.3973597071195132
TimeStamp: 1     Value: 0.9332565252573828
(a, c) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 0.9819805060619657
(b, b) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 1.0
(c, a) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 0.9819805060619657
(b, c) time series
------------------------------
TimeStamp: 0     Value: 0.3973597071195132
TimeStamp: 1     Value: 0.8485552916276634
(c, b) time series
------------------------------
TimeStamp: 0     Value: 0.3973597071195132
TimeStamp: 1     Value: 0.8485552916276633
(c, c) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 1.0
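
The numbers above can be cross-checked without the library. This is a plain-Python sketch that assumes Pearson correlation over sliding windows of 3 observations; windows and correlation are helper names introduced for the illustration.

```python
from itertools import product
from math import sqrt

series = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [1.0, -2.0, 3.0, 4.0],
    "c": [0.0, 1.0, 2.0, 4.0],
}

def windows(values, size):
    # All full sliding windows of `size` consecutive values, step 1.
    return [values[i:i + size] for i in range(len(values) - size + 1)]

def correlation(xs, ys):
    # Pearson correlation of two equally long windows.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# One output series per ordered pair of keys, one value per window.
pairwise = {
    (k1, k2): [
        correlation(w1, w2)
        for w1, w2 in zip(windows(series[k1], 3), windows(series[k2], 3))
    ]
    for k1, k2 in product(series, repeat=2)
}

print(len(pairwise))                               # 9
print([round(v, 3) for v in pairwise[("a", "b")]]) # [0.397, 0.933]
```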

print(start=None, end=None, inclusive=False)

print this multi-time-series

Parameters
start : int or datetime, optional

start of range (inclusive) (default is current first time-tick)

end : int or datetime, optional

end of range (inclusive) (default is current last time-tick)

inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Raises
ValueError

If there is an error in the input arguments.

reduce(func)

reduce each time-series in this multi-time-series to a single value

Parameters
func : unary reducer or func

the unary reducer method to be used

Returns
dict

the output of time-series reduction for each key

Notes

see reduce() for usage

reduce_range(func, start, end, inclusive=False)

reduce each time-series in this multi-time-series to a single value given a range

Parameters
func : unary reducer or func

the unary reducer method to be used

start : int or datetime

start of range (inclusive)

end : int or datetime

end of range (inclusive)

inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Returns
dict

the output of time-series reduction for each key

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1.0, 2.0, 3.0, 4.0])
>>> ts2 = tspy.time_series([1.0, -2.0, 3.0, 4.0])
>>> ts3 = tspy.time_series([0.0, 1.0, 2.0, 4.0])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2, 'c': ts3})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0

reduce each time-series to an average from [1,2]

>>> from tspy.functions import reducers
>>> avg_dict = mts_orig.reduce_range(reducers.average(), 1, 2)
>>> avg_dict
{'a': 2.5, 'b': 0.5, 'c': 1.5}
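
The same averages follow from a small plain-Python model. It is a sketch only: nested dicts stand in for time-series, and both range bounds are treated as inclusive, which is what the result for [1,2] above implies.

```python
series = {
    "a": {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0},
    "b": {0: 1.0, 1: -2.0, 2: 3.0, 3: 4.0},
    "c": {0: 0.0, 1: 1.0, 2: 2.0, 3: 4.0},
}

def average(values):
    return sum(values) / len(values)

def reduce_range(series, func, start, end):
    # Reduce only the values whose timestamp lies in [start, end], once per key.
    return {
        key: func([v for t, v in sorted(obs.items()) if start <= t <= end])
        for key, obs in series.items()
    }

avg_dict = reduce_range(series, average, 1, 2)
print(avg_dict)  # {'a': 2.5, 'b': 0.5, 'c': 1.5}
```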

resample(period, interp_func)

produce a new multi-time-series by resampling each time-series to a given periodicity

Parameters
period : int

the period to resample to

interp_func : func or interpolator

the interpolator method to be used when a value doesn’t exist at a given time-tick

Returns
MultiTimeSeries

a new multi-time-series

Notes

see resample() for usage
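
One way to picture resampling is a regular grid of time-ticks on which existing values are kept and missing ones are interpolated. The plain-Python sketch below assumes that the grid starts at each series' first timestamp; prev_value is a hypothetical interpolator with an assumed (observations, time-tick) signature.

```python
def prev_value(obs, t):
    # Hypothetical interpolator: most recent value at or before t, else None.
    earlier = [ts for ts in obs if ts <= t]
    return obs[max(earlier)] if earlier else None

def resample(obs, period, interp_func):
    # Walk a regular grid from the first to the last timestamp (assumed anchoring).
    start, end = min(obs), max(obs)
    return {
        t: obs[t] if t in obs else interp_func(obs, t)
        for t in range(start, end + 1, period)
    }

series = {"a": {0: 1.0, 3: 4.0, 4: 5.0, 8: 9.0}, "b": {1: 2.0, 2: 3.0}}
resampled = {key: resample(obs, 2, prev_value) for key, obs in series.items()}
print(resampled["a"])  # {0: 1.0, 2: 1.0, 4: 5.0, 6: 5.0, 8: 9.0}
print(resampled["b"])  # {1: 2.0}
```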

right_align(multi_time_series, interp_func=<function MultiTimeSeries.<lambda>>)

align two multi-time-series based on a temporal right join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries

the multi-time-series to align with

interp_func : func or interpolator, optional

the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
tuple

aligned multi-time-series

Notes

right align will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see right_align() for usage

right_join(multi_time_series, join_func=None, interp_func=<function MultiTimeSeries.<lambda>>)

join two multi-time-series based on a temporal right join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries or TimeSeries

the multi-time-series to join with

join_func : func, optional

function to join two values (default is join to list where left is index 0, right is index 1)

interp_func : func or interpolator, optional

the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
MultiTimeSeries

a new multi-time-series

Notes

right join will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see right_join() for usage

right_outer_align(multi_time_series, interp_func=<function MultiTimeSeries.<lambda>>)

align two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries

the multi-time-series to align with

interp_func : func or interpolator, optional

the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
tuple

aligned multi-time-series

Notes

right outer align will join on like time-series keys. If a key does not exist in one of the two multi-time-series, it will be discarded

see right_outer_align() for usage

right_outer_join(multi_time_series, join_func=None, interp_func=<function MultiTimeSeries.<lambda>>)

join two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values

Parameters
multi_time_series : MultiTimeSeries or TimeSeries

the multi-time-series to join with

join_func : func, optional

function to join two values (default is join to list where left is index 0, right is index 1)

interp_func : func or interpolator, optional

the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)

Returns
MultiTimeSeries

a new multi-time-series

Notes

right outer join joins on matching time-series keys. A key that does not exist in both multi-time-series is discarded

see right_outer_join() for usage

segment(window, step=1, enforce_bounds=True)

produce a new segment-multi-time-series by performing a sliding-window segmentation over each time-series

Parameters
window : int

number of observations per window

step : int, optional

step size to slide (default is 1)

enforce_bounds : bool, optional

if true, each window must contain exactly the given window size number of observations; otherwise a window may contain fewer than or equal to that number of observations (default is True)

Returns
SegmentMultiTimeSeries

a new segment-multi-time-series

Notes

see segment() for usage
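
As a rough sketch of what a sliding-window segmentation does to a single time-series, the plain-Python helper below cuts a list of (time_tick, value) pairs into windows. It is not tspy code: sliding_segments is a hypothetical name, and the flag is called enforce_bounds only to mirror the signature above.

```python
def sliding_segments(observations, window, step=1, enforce_bounds=True):
    """Cut a sorted list of (time_tick, value) pairs into count-based windows."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    segments = []
    for start in range(0, len(observations), step):
        chunk = observations[start:start + window]
        if enforce_bounds and len(chunk) < window:
            break  # every later window would be short as well
        segments.append(chunk)
    return segments


obs = [(0, 1), (1, 2), (2, 3)]
full_windows = sliding_segments(obs, window=2)
all_windows = sliding_segments(obs, window=2, enforce_bounds=False)
```

With the flag enabled only the two complete windows are produced; with it disabled the sketch also emits the trailing window that holds a single observation.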

segment_by(func)

produce a new segment-multi-time-series by performing a group-by operation on each observation’s value for each time-series

Parameters
func : func

value to key function

Returns
SegmentMultiTimeSeries

a new segment-multi-time-series

Notes

see segment_by() for usage

segment_by_anchor(func, left_delta, right_delta)

produce a new segment-multi-time-series from performing an anchor-based segmentation over each time-series. An anchor point is defined as any value that satisfies the filter function. When an anchor point is determined the segment is built based on left_delta time ticks to the left of the point and right_delta time ticks to the right of the point.

Parameters
func : func

the filter anchor point function

left_delta : int

number of time ticks to include to the left of the anchor point

right_delta : int

number of time ticks to include to the right of the anchor point

perc : float, optional

number between 0 and 1.0 to denote how often to accept the anchor (default is None)

Returns
SegmentMultiTimeSeries

a new segment-multi-time-series

Notes

see segment_by_anchor() for usage
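
The anchor-based strategy can be illustrated with a small plain-Python sketch over one time-series. The name anchor_segments, the dict layout of a segment, and the choice to treat both window bounds as inclusive are assumptions of this example rather than documented tspy behaviour.

```python
def anchor_segments(observations, func, left_delta, right_delta):
    """Build one segment around every observation whose value satisfies func."""
    segments = []
    for anchor_tick, anchor_value in observations:
        if not func(anchor_value):
            continue  # not an anchor point
        start = anchor_tick - left_delta
        end = anchor_tick + right_delta
        # assumption: both ends of the window are inclusive
        inside = [(t, v) for t, v in observations if start <= t <= end]
        segments.append({"start": start, "end": end, "observations": inside})
    return segments


obs = [(0, 1), (2, 5), (3, 1), (7, 9), (8, 2)]
segments = anchor_segments(obs, lambda v: v > 4, left_delta=2, right_delta=1)
```

The second segment spans time-ticks 5 to 8 although its first observation sits at time-tick 7, which shows how a segment's bounds can differ from its first and last time-tick.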

segment_by_changepoint(change_point=None)

produce a new segment-multi-time-series from performing a change-point based segmentation. A change-point is any pair of consecutive values for which the change-point function evaluates to true.

Parameters
change_point : func, optional

a function given a prev/next value to determine if a change exists (default is simple constant change)

Returns
SegmentMultiTimeSeries

a new segment-multi-time-series

Notes

see segment_by_changepoint() for usage
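
A minimal plain-Python sketch of change-point segmentation over one time-series follows. It assumes that the default "simple constant change" means the next value differs from the previous one and that a new segment starts at the observation where the change is detected; both points, and the name changepoint_segments, are assumptions of the example.

```python
def changepoint_segments(observations, change_point=None):
    """Start a new segment whenever change_point(prev_value, next_value) is true."""
    if change_point is None:
        # assumption: the default treats any change in value as a change-point
        change_point = lambda prev_value, next_value: prev_value != next_value

    segments = []
    current = []
    for time_tick, value in observations:
        if current and change_point(current[-1][1], value):
            segments.append(current)
            current = []
        current.append((time_tick, value))
    if current:
        segments.append(current)
    return segments


states = [(0, "a"), (1, "a"), (2, "b"), (3, "b"), (4, "a")]
by_state = changepoint_segments(states)

readings = [(0, 1.0), (1, 1.5), (2, 9.0), (3, 9.2)]
by_jump = changepoint_segments(readings, lambda prev, nxt: abs(nxt - prev) > 5)
```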

segment_by_time(window, step)

produce a new segment-multi-time-series by performing a time-based segmentation over each time-series

Parameters
window : int

time-tick length of window

step : int

time-tick length of step

Returns
SegmentMultiTimeSeries

a new segment-multi-time-series

Notes

see segment_by_time() for usage
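
To contrast time-based windows with the observation-count windows of segment(), here is a hedged plain-Python sketch for a single time-series. The name time_segments, the choice to start the first window at the first observation, and the inclusive window bounds are all assumptions of this example and may not match tspy exactly.

```python
def time_segments(observations, window, step):
    """Cut sorted (time_tick, value) pairs into windows measured in time-ticks."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not observations:
        return []

    segments = []
    start = observations[0][0]       # assumption: first window starts here
    last_tick = observations[-1][0]
    while start <= last_tick:
        end = start + window - 1     # assumption: both bounds are inclusive
        inside = [(t, v) for t, v in observations if start <= t <= end]
        segments.append({"start": start, "end": end, "observations": inside})
        start += step
    return segments


obs = [(0, 1), (1, 2), (5, 3), (6, 4)]
segments = time_segments(obs, window=3, step=3)
```

Each window covers three time-ticks, so the number of observations per segment varies with how densely the series is sampled.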

time_series(key)

get a time-series given a key

Parameters
key : any

the key associated with a time-series in this multi-time-series

Returns
TimeSeries

the time-series associated with this given key

to_df(format='observations', inclusive=False)

convert this multi-time-series to a pandas dataframe. A pandas dataframe can either be stored in observations format (key column per row) or instants format (one time-series per column)

Parameters
format : str, optional

dataframe format to store in, either "observations" or "instants" (default is observations format)

inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Returns
dataframe

a pandas dataframe representation of this multi-time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.time_series([2,3,4])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4

create an observations dataframe

>>> df = mts_orig.to_df(format="observations")
>>> df
   timestamp key  value
0          0   a      1
1          1   a      2
2          2   a      3
3          0   b      2
4          1   b      3
5          2   b      4

create an instants dataframe

>>> mts = mts_orig.to_df(format="instants")
>>> mts
   timestamp  a  b
0          0  1  2
1          1  2  3
2          2  3  4
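
The difference between the two layouts can also be shown without pandas or tspy. The sketch below builds both as lists of row dicts from a plain dict of series; the helper names are hypothetical, and filling an absent value with None in the instants layout is an assumption of the example.

```python
def to_observation_rows(mts):
    """One row per observation, with the series key stored in a column."""
    rows = []
    for key, series in mts.items():
        for time_tick, value in series:
            rows.append({"timestamp": time_tick, "key": key, "value": value})
    return rows


def to_instant_rows(mts):
    """One row per time-tick, with one column per series key."""
    lookups = {key: dict(series) for key, series in mts.items()}
    ticks = sorted({t for lookup in lookups.values() for t in lookup})
    rows = []
    for t in ticks:
        row = {"timestamp": t}
        for key, lookup in lookups.items():
            row[key] = lookup.get(t)  # assumption: None when a value is absent
        rows.append(row)
    return rows


mts = {"a": [(0, 1), (1, 2), (2, 3)], "b": [(0, 2), (1, 3), (2, 4)]}
observation_rows = to_observation_rows(mts)
instant_rows = to_instant_rows(mts)
```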
to_df_instants(inclusive=False)

convert this multi-time-series to an instants pandas dataframe. An instants dataframe is one which contains a time-series per column.

Parameters
inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Returns
dataframe

a pandas dataframe representation of this multi-time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.time_series([2,3,4])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4

create an instants dataframe

>>> mts = mts_orig.to_df_instants()
>>> mts
   timestamp  a  b
0          0  1  2
1          1  2  3
2          2  3  4
to_df_observations(inclusive=False)

convert this multi-time-series to an observations pandas dataframe. An observations dataframe is one which contains a key column per record.

Parameters
inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Returns
dataframe

a pandas dataframe representation of this multi-time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.time_series([2,3,4])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4

create an observations dataframe

>>> df = mts_orig.to_df_observations()
>>> df
   timestamp key  value
0          0   a      1
1          1   a      2
2          2   a      3
3          0   b      2
4          1   b      3
5          2   b      4
to_segments(segment_transform)

produce a new segment-multi-time-series from a segmentation transform

Parameters
segment_transform : UnaryTransform

the transform which will result in a time-series of segments

Returns
SegmentMultiTimeSeries

a new segment-multi-time-series

Notes

see to_segments() for usage

transform(*args)

produce a new multi-time-series which is the result of applying a transform to each time-series. A transform can be of type unary (one time-series in, one time-series out) or binary (two time-series in, one time-series out)

Parameters
args : UnaryTransform or BinaryTransform

the transformation to apply on each time-series

Returns
MultiTimeSeries

a new multi-time-series

Raises
ValueError

If there is an error in the input arguments

Notes

transforms can be shape changing (time-series size out does not necessarily equal time-series size in)

see transform() for usage

trs(key)

get a time-series time-reference-system given a key

Parameters
key : any

the key associated with a time-series in this multi-time-series

Returns
TRS

the time-reference-system of the time-series associated with the given key

uncache()

remove the multi-time-series caching mechanism

Returns
MultiTimeSeries

a new multi-time-series

with_trs(granularity=datetime.timedelta(0, 0, 1000), time_tick=datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc))

create a new multi-time-series with its timestamps mapped based on a granularity and start_time (passed as the time_tick argument). In the scope of this method, granularity refers to the granularity at which to see time_ticks and start_time refers to the zoned date-time at which to start your time-series data when calling get_values()

Parameters
granularity : datetime.timedelta, optional

the granularity for use in time-series TRS (default is 1ms)

time_tick : datetime, optional

the starting date-time of the time-series, referred to as start_time in this description (default is 1970-01-01 UTC)

Returns
MultiTimeSeries

a new multi-time-series with its time_ticks mapped based on a new TRS.

Notes

time_ticks will be mapped as follows - (current_time_tick - start_time) / granularity

if any source time-series does not have a time-reference-system associated with it, this method will throw an exception
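
The mapping formula in the note above can be tried out with the standard datetime module. In this sketch the current time-tick is represented as an absolute datetime, map_time_tick is a hypothetical helper, and rounding down to a whole number of granularity units is an assumption rather than documented tspy behaviour.

```python
from datetime import datetime, timedelta, timezone


def map_time_tick(current, start_time, granularity):
    """Apply (current_time_tick - start_time) / granularity to one timestamp."""
    # timedelta // timedelta yields an int; rounding down is an assumption here
    return (current - start_time) // granularity


# defaults from the signature above: 1ms granularity, 1970-01-01 UTC start
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
one_second_in = datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
default_tick = map_time_tick(one_second_in, epoch, timedelta(milliseconds=1))

# an hourly granularity with a later start time
start_2020 = datetime(2020, 1, 1, tzinfo=timezone.utc)
reading = datetime(2020, 1, 1, 5, 30, tzinfo=timezone.utc)
hourly_tick = map_time_tick(reading, start_2020, timedelta(hours=1))
```

One second after the epoch maps to time-tick 1000 at millisecond granularity, while a reading taken five and a half hours after the 2020 start falls into hourly time-tick 5 under the rounding assumption.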

write(start=None, end=None, inclusive=False)

create a multi-time-series-writer given a range

Parameters
start : int or datetime, optional

start of range (inclusive) (default is None)

end : int or datetime, optional

end of range (inclusive) (default is None)

inclusive : bool, optional

if true, will use inclusive bounds (default is False)

Returns
MultiTimeSeriesWriter

a new multi-time-series-writer

Raises
ValueError

If there is an error in the input arguments

class tspy.data_structures.Observation(tsc, time_tick=-1, value=None)

Bases: object

Basic storage unit for a single time-series observation

Examples

create a simple observation

>>> import tspy
>>> obs = tspy.observation(1,1)
>>> obs
TimeStamp: 1     Value: 1
Attributes
time_tick : int

the time-tick associated with this observation

value : any

the value associated with this observation

Methods

__call__(timestamp, value)

Call self as a function.

property time_tick
Returns
int

the time-tick associated with this observation

property value
Returns
any

the value associated with this observation

class tspy.data_structures.ObservationCollection(tsc, j_observations=None)

Bases: object

A special form of materialized time-series (sorted collection) whose values are of type Observation.

An observation-collection has the following properties:

  1. Sorted by observation time-tick

  2. Support for observations with duplicate time-ticks

  3. Duplicate time-ticks will keep ordering

Examples

create an observation-collection

>>> import tspy
>>> ts_builder = tspy.builder()
>>> ts_builder.add(tspy.observation(1,1))
>>> ts_builder.add(tspy.observation(2,2))
>>> ts_builder.add(tspy.observation(1,3))
>>> observations = ts_builder.result()
>>> observations
[(1,1),(1,3),(2,2)]

iterate through this collection

>>> for o in observations:
...     print(o.time_tick, ",", o.value)
1 , 1
1 , 3
2 , 2
Attributes
size : int

the number of observations in this collection

trs : TRS

this time-series time-reference-system

Methods

Java()

mapping to a compatible class in Java via Py4J

ceiling(time_tick)

get the ceiling observation for the given time-tick.

contains(time_tick)

Checks for containment of time-tick within the collection

first()

get the first observation in this collection.

floor(time_tick)

get the floor observation for the given time-tick.

higher(time_tick)

get the higher observation for the given time-tick.

is_empty()

checks if there is any observation

last()

get the last observation in this collection.

lower(time_tick)

get the lower observation for the given time-tick.

to_time_series([granularity, start_time])

convert this collection to a time-series

class Java

Bases: object

mapping to a compatible class in Java via Py4J

implements = ['com.ibm.research.time_series.core.utils.ObservationCollection']
ceiling(time_tick)

get the ceiling observation for the given time-tick. The ceiling is defined as the observation which bears the same time-tick as the given time-tick, or if one does not exist, the next higher observation. If no such observation exists in the collection, None will be returned.

Parameters
time_tick : int

the time-tick

Returns
Observation

the ceiling observation

contains(time_tick)

Checks for containment of time-tick within the collection

Parameters
time_tick : int

the time-tick

Returns
bool

True if an observation in this collection has the given time-tick, otherwise False

first()

get the first observation in this collection. The first observation is that observation which has the lowest timestamp in the collection. If 2 observations have the same timestamp, the first observation that was in the collection will be the one returned.

Returns
Observation

the first observation in this collection

floor(time_tick)

get the floor observation for the given time-tick. The floor is defined as the observation which bears the same time-tick as the given time-tick, or if one does not exist, the next lower observation. If no such observation exists in the collection, None will be returned.

Parameters
time_tick : int

the time-tick

Returns
Observation

the floor observation

higher(time_tick)

get the higher observation for the given time-tick. The higher is defined as the observation which bears a time-tick greater than the given time-tick. If no such observation exists in the collection, None will be returned.

Parameters
time_tick : int

the time-tick

Returns
Observation

the higher observation

is_empty()

checks if there is any observation

Returns
bool

True if no observations exist in this collection, otherwise False

last()

get the last observation in this collection. The last observation is that observation which has the highest timestamp in the collection. If 2 observations have the same timestamp, the last observation that was in the collection will be the one returned.

Returns
Observation

the last observation in this collection

lower(time_tick)

get the lower observation for the given time-tick. The lower is defined as the observation which bears a time-tick less than the given time-tick. If no such observation exists in the collection, None will be returned.

Parameters
time_tick : int

the time-tick

Returns
Observation

the lower observation
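
The four lookups above (ceiling, floor, higher, lower) can be sketched with the standard bisect module on a sorted list. This is an illustration, not the Java-backed implementation: SortedObservations is a hypothetical class, observations are plain (time_tick, value) tuples, and returning the nearest qualifying observation is an assumption borrowed from the usual navigable-set meaning of these names.

```python
from bisect import bisect_left, bisect_right


class SortedObservations:
    """Sorted (time_tick, value) pairs with navigable lookups (sketch only)."""

    def __init__(self, observations):
        # sorted() is stable, so duplicate time-ticks keep their given order
        self._obs = sorted(observations, key=lambda o: o[0])
        self._ticks = [t for t, _ in self._obs]

    def ceiling(self, time_tick):
        """Observation at time_tick, else the next higher one, else None."""
        i = bisect_left(self._ticks, time_tick)
        return self._obs[i] if i < len(self._obs) else None

    def floor(self, time_tick):
        """Observation at time_tick, else the next lower one, else None."""
        i = bisect_right(self._ticks, time_tick) - 1
        return self._obs[i] if i >= 0 else None

    def higher(self, time_tick):
        """Nearest observation strictly after time_tick, else None."""
        i = bisect_right(self._ticks, time_tick)
        return self._obs[i] if i < len(self._obs) else None

    def lower(self, time_tick):
        """Nearest observation strictly before time_tick, else None."""
        i = bisect_left(self._ticks, time_tick) - 1
        return self._obs[i] if i >= 0 else None


coll = SortedObservations([(4, "b"), (1, "a"), (9, "c")])
```

In this sketch ceiling(4) and floor(4) both return the observation at time-tick 4, whereas higher(4) and lower(4) skip it and return its neighbours.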

property size
Returns
int

the number of observations in this collection

to_time_series(granularity=None, start_time=None)

convert this collection to a time-series

Parameters
granularity : datetime.timedelta, optional

the granularity for use in time-series TRS (default is None if no start_time, otherwise 1ms)

start_time : datetime, optional

the starting date-time of the time-series (default is None if no granularity, otherwise 1970-01-01 UTC)

Returns
TimeSeries

a new time-series

property trs
Returns
TRS

this time-series time-reference-system

class tspy.data_structures.Segment(tsc, j_observations, start=None, end=None)

Bases: tspy.data_structures.observations.ObservationCollection.ObservationCollection

A special form of observation-collection which holds additional information as to how the segment was created. Segments are usually created through the use of a segmentation transform.

Notes

a segment’s start/end need not equal its first/last time-tick

Attributes
start : int

start time-tick of window at instantiation time

end : int

end time-tick of window at instantiation time

Methods

Java()

mapping to a compatible class in Java via Py4J

ceiling(time_tick)

get the ceiling observation for the given time-tick.

contains(time_tick)

Checks for containment of time-tick within the collection

first()

get the first observation in this collection.

floor(time_tick)

get the floor observation for the given time-tick.

higher(time_tick)

get the higher observation for the given time-tick.

is_empty()

checks if there is any observation

last()

get the last observation in this collection.

lower(time_tick)

get the lower observation for the given time-tick.

to_time_series([granularity, start_time])

convert this collection to a time-series

toString

property end
Returns
int

end time-tick of window at instantiation time

property observations
Returns
ObservationCollection

the underlying collection of observations in this segment

property start
Returns
int

start time-tick of window at instantiation time

toString()
class tspy.data_structures.SegmentMultiTimeSeries(tsc, j_mts)

Bases: tspy.data_structures.multi_time_series.MultiTimeSeries.MultiTimeSeries

A special form of multi-time-series that consists of observations with a value of type Segment

Attributes
keys

Returns

Methods

aggregate(zero, seq_func, comb_func)

aggregate all series in this multi-time-series to produce a single value

aggregate_series(list_to_val_func)

aggregate all time-series in the multi-time-series using a summation function to produce a single time-series

aggregate_series_with_key(list_to_val_func)

aggregate all time-series with key in the multi-time-series using a summation function to produce a single time-series

align(key[, interp_func])

align all time-series on a key

cache([cache_size])

suggest to the multi-time-series to cache values

collect([inclusive])

collect and materialize this multi-time-series

collect_series(key[, inclusive])

get a collection of observations given a key

describe()

retrieve a NumStats object per time-series computed from all values in this multi-time-series (double)

fillna(interpolator[, null_value])

produce a new multi-time-series which is the result of filling all null values.

filter(func)

produce a new multi-time-series which is the result of filtering by each observation’s value given a filter function.

filter_series(func)

filter each time-series by its time-series object

filter_series_key(func)

filter each time-series by its key

flatten([key_func])

converts this segment-multi-time-series into a multi-time-series where each time-series will be the result of a single segment

forecast(num_predictions, fm[, …])

forecast the next num_predictions using a forecasting model for each time-series

full_align(multi_time_series[, …])

align two multi-time-series based on a temporal full join strategy and optionally interpolate missing values

full_join(multi_time_series[, join_func, …])

join two multi-time-series based on a temporal full join strategy and optionally interpolate missing values

get_time_series(key)

get a time-series given a key

get_values(start, end[, inclusive])

get all values between a range in this multi-time-series

inner_align(multi_time_series)

align two multi-time-series based on a temporal inner join strategy

inner_join(multi_time_series[, join_func])

join two multi-time-series based on a temporal inner join strategy

left_align(multi_time_series[, interp_func])

align two multi-time-series based on a temporal left join strategy and optionally interpolate missing values

left_join(multi_time_series[, join_func, …])

join two multi-time-series based on a temporal left join strategy and optionally interpolate missing values

left_outer_align(multi_time_series[, …])

align two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values

left_outer_join(multi_time_series[, …])

join two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values

map(func)

produce a new multi-time-series where each observation’s value in this multi-time-series is mapped to a new observation value

map_series(func)

map each ObservationCollection to a new collection of observations

map_series_key(func)

map each time-series key to a new key

map_series_with_key(func)

map each ObservationCollection to a new collection of observations giving access to each time-series key

pair_wise_transform(binary_transform)

produce a new multi-time-series which is the product of performing a pair-wise transform against all combination of keys

print([start, end, inclusive])

print this multi-time-series

reduce(func)

reduce each time-series in this multi-time-series to a single value

reduce_range(func, start, end[, inclusive])

reduce each time-series in this multi-time-series to a single value given a range

resample(period, interp_func)

produce a new multi-time-series by resampling each time-series to a given periodicity

right_align(multi_time_series[, interp_func])

align two multi-time-series based on a temporal right join strategy and optionally interpolate missing values

right_join(multi_time_series[, join_func, …])

join two multi-time-series based on a temporal right join strategy and optionally interpolate missing values

right_outer_align(multi_time_series[, …])

align two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values

right_outer_join(multi_time_series[, …])

join two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values

segment(window[, step, enforce_bounds])

produce a new segment-multi-time-series by performing a sliding-window segmentation over each time-series

segment_by(func)

produce a new segment-multi-time-series by performing a group-by operation on each observation’s value for each time-series

segment_by_anchor(func, left_delta, right_delta)

produce a new segment-multi-time-series from performing an anchor-based segmentation over each time-series.

segment_by_changepoint([change_point])

produce a new segment-multi-time-series from performing a change-point based segmentation.

segment_by_time(window, step)

produce a new segment-multi-time-series by performing a time-based segmentation over each time-series

time_series(key)

get a time-series given a key

to_df([format, inclusive])

convert this multi-time-series to a pandas dataframe.

to_df_instants([inclusive])

convert this multi-time-series to an instants pandas dataframe.

to_df_observations([inclusive])

convert this multi-time-series to an observations pandas dataframe.

to_segments(segment_transform)

produce a new segment-multi-time-series from a segmentation transform

transform(*args)

produce a new multi-time-series which is the result of applying a transform to each time-series.

trs(key)

get a time-series time-reference-system given a key

uncache()

remove the multi-time-series caching mechanism

with_trs([granularity, time_tick])

create a new multi-time-series with its timestamps mapped based on a granularity and start_time.

write([start, end, inclusive])

create a multi-time-series-writer given a range

flatten(key_func=None)

converts this segment-multi-time-series into a multi-time-series where each time-series will be the result of a single segment

Parameters
key_func : func, optional

operation where given a segment, produce a unique key (default is create key based on start of segment)

Returns
MultiTimeSeries

a new multi-time-series

Notes

this is not a lazy operation and will materialize the time-series

Examples

create a simple multi-time-series

>>> import tspy
>>> mts_orig = tspy.multi_time_series({'a': tspy.time_series([1,2,3]), 'b': tspy.time_series([4,5,6])})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 4
TimeStamp: 1     Value: 5
TimeStamp: 2     Value: 6

segment the multi-time-series using a simple sliding window

>>> mts_sliding = mts_orig.segment(2)
>>> mts_sliding
a time series
------------------------------
TimeStamp: 0     Value: original bounds: (0,1) actual bounds: (0,1) observations: [(0,1),(1,2)]
TimeStamp: 1     Value: original bounds: (1,2) actual bounds: (1,2) observations: [(1,2),(2,3)]
b time series
------------------------------
TimeStamp: 0     Value: original bounds: (0,1) actual bounds: (0,1) observations: [(0,4),(1,5)]
TimeStamp: 1     Value: original bounds: (1,2) actual bounds: (1,2) observations: [(1,5),(2,6)]

flatten the segments into a single multi-time-series

>>> mts = mts_sliding.flatten()
>>> mts
(a, 0) time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
(b, 0) time series
------------------------------
TimeStamp: 0     Value: 4
TimeStamp: 1     Value: 5
(a, 1) time series
------------------------------
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
(b, 1) time series
------------------------------
TimeStamp: 1     Value: 5
TimeStamp: 2     Value: 6
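
The keying scheme visible in the output above, where each flattened series is identified by the original key paired with a per-segment key, can be sketched in plain Python. The dict-based segment layout and the name flatten_sketch are assumptions of this example; only the default of keying by the segment start comes from the parameter description.

```python
def flatten_sketch(segment_mts, key_func=None):
    """Turn {key: [segment, ...]} into {(key, segment_key): observations}."""
    if key_func is None:
        # documented default: derive the key from the start of the segment
        key_func = lambda segment: segment["start"]

    flat = {}
    for series_key, segments in segment_mts.items():
        for segment in segments:
            flat[(series_key, key_func(segment))] = list(segment["observations"])
    return flat


segment_mts = {
    "a": [
        {"start": 0, "end": 1, "observations": [(0, 1), (1, 2)]},
        {"start": 1, "end": 2, "observations": [(1, 2), (2, 3)]},
    ],
    "b": [
        {"start": 0, "end": 1, "observations": [(0, 4), (1, 5)]},
        {"start": 1, "end": 2, "observations": [(1, 5), (2, 6)]},
    ],
}
flat = flatten_sketch(segment_mts)
```

A key_func must return a distinct value for every segment of a series, otherwise later segments would overwrite earlier ones in this sketch.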
class tspy.data_structures.SegmentTimeSeries(tsc, j_ts, trs=None)

Bases: tspy.data_structures.time_series.TimeSeries.TimeSeries

A special form of time-series that consists of observations with a value of type Segment

Attributes
trs

Returns

Methods

cache([cache_size])

suggest to the time-series to cache values

collect([inclusive])

collect all observations in this time-series

concat(other_time_series)

produce a new time-series which is the result of concatenating two time-series

count([inclusive])

count the current number of observations in this time-series

describe()

retrieve time-series statistics computed from all values in this time-series

fillna(interpolator[, null_value])

produce a new time-series which is the result of filling all null values.

filter(func)

produce a new time-series which is the result of filtering by each observation’s value given a filter function.

flatmap(func)

produce a new time-series where each observation’s value in this time-series is mapped to 0 to N new values.

flatten([key_func])

converts this segment-time-series into a multi-time-series where each time-series will be the result of a single segment

forecast(num_predictions, fm[, …])

forecast the next num_predictions using a forecasting model

full_align(time_series[, left_interp_func, …])

align two time-series based on a temporal full join strategy and optionally interpolate missing values

full_join(time_series[, join_func, …])

join two time-series based on a temporal full join strategy and optionally interpolate missing values

get_values(start, end[, inclusive])

get all values between a range in this time-series

inner_align(time_series)

align two time-series based on a temporal inner join strategy

inner_join(time_series[, join_func])

join two time-series based on a temporal inner join strategy

lag(lag_amount)

produce a new time-series which is a lagged version of the current time-series.

left_align(time_series[, interp_func])

align two time-series based on a temporal left join strategy and optionally interpolate missing values

left_join(time_series[, join_func, interp_func])

join two time-series based on a temporal left join strategy and optionally interpolate missing values

left_outer_align(time_series[, interp_func])

align two time-series based on a temporal left outer join strategy and optionally interpolate missing values

left_outer_join(time_series[, join_func, …])

join two time-series based on a temporal left outer join strategy and optionally interpolate missing values

map(func)

produce a new time-series where each observation’s value in this time-series is mapped to a new observation value

map_with_index(func)

produce a new time-series where each observation’s value in this time-series is mapped given the old value and an index to a new observation value

print([start, end, inclusive, human_readable])

print this time-series

reduce(*args)

reduce this time-series or two time-series to a single value

resample(period, func)

produce a new time-series by resampling the current time-series to a given periodicity

right_align(time_series[, interp_func])

align two time-series based on a temporal right join strategy and optionally interpolate missing values

right_join(time_series[, join_func, interp_func])

join two time-series based on a temporal right join strategy and optionally interpolate missing values

right_outer_align(time_series[, interp_func])

align two time-series based on a temporal right outer join strategy and optionally interpolate missing values

right_outer_join(time_series[, join_func, …])

join two time-series based on a temporal right outer join strategy and optionally interpolate missing values

segment(window[, step, enforce_size])

produce a new segment-time-series by performing a sliding-window segmentation over the time-series

segment_by(func)

produce a new segment-time-series by performing a group-by operation on each observation’s value

segment_by_anchor(func, left_delta, right_delta)

produce a new segment-time-series from performing an anchor-based segmentation over the time-series.

segment_by_changepoint([change_point])

produce a new segment-time-series from performing a change-point based segmentation.

segment_by_marker(*args, **kwargs)

produce a new segment-time-series from performing a marker based segmentation.

segment_by_time(window, step)

produce a new segment-time-series by performing a time-based segmentation over the time-series

shift(shift_amount[, default_value])

produce a new time-series which is a shifted version of the current time-series.

to_df([inclusive])

convert this time-series to a pandas dataframe

to_segments(segment_transform)

produce a new segment-time-series from a segmentation transform

transform(*args)

produce a new time-series which is the result of applying a transform to the time-series.

uncache()

remove the time-series caching mechanism

with_trs([granularity, start_time])

create a new time-series with its timestamps mapped based on a granularity and start_time.

write([start, end, inclusive])

create a time-series-writer given a range

cache(cache_size=None)

suggest to the time-series to cache values

Parameters
cache_size : int, optional

the max cache size (default is max long)

Returns
TimeSeries

a new time-series

Notes

this is a lazy operation and will only suggest to the time-series to save values once computed

flatmap(func)

produce a new time-series where each observation’s value in this time-series is mapped to 0 to N new values.

Parameters
func : func

value mapping function which returns a list of values

Returns
TimeSeries

a new time-series with its values flat-mapped

Notes

an observation’s time-tick will be duplicated if a single value maps to multiple values

Examples

create a simple time-series

>>> import tspy
>>> ts_orig = tspy.time_series([1, 2, 3])
>>> ts_orig
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3

flat map each time-series observation value by duplicating the value

>>> ts = ts_orig.flatmap(lambda x: [x, x])
>>> ts
TimeStamp: 0     Value: 1
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
TimeStamp: 2     Value: 3
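
For readers without a tspy session at hand, the same flat-map behaviour can be mimicked on a plain list of (time_tick, value) pairs. flatmap_sketch is a hypothetical helper written for this illustration and is not part of tspy.

```python
def flatmap_sketch(observations, func):
    """Map each value to zero or more values, repeating the time-tick for each."""
    return [
        (time_tick, new_value)
        for time_tick, value in observations
        for new_value in func(value)
    ]


obs = [(0, 1), (1, 2), (2, 3)]
duplicated = flatmap_sketch(obs, lambda x: [x, x])
odd_only = flatmap_sketch(obs, lambda x: [x] if x % 2 else [])
```

Returning an empty list from func drops the observation entirely, which is the "0 to N" case mentioned in the description.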
flatten(key_func=None)

converts this segment-time-series into a multi-time-series where each time-series will be the result of a single segment

Parameters
key_func : func, optional

operation where given a segment, produce a unique key (default is create key based on start of segment)

Returns
MultiTimeSeries

a new multi-time-series

Notes

this is not a lazy operation and will materialize the time-series

Examples

create a simple time-series

>>> import tspy
>>> ts_orig = tspy.time_series([1,2,3,4,5,6])
>>> ts_orig
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
TimeStamp: 3     Value: 4
TimeStamp: 4     Value: 5
TimeStamp: 5     Value: 6

segment the time-series using a simple sliding window

>>> ts_sliding = ts_orig.segment(2)
>>> ts_sliding
TimeStamp: 0     Value: original bounds: (0,1) actual bounds: (0,1) observations: [(0,1),(1,2)]
TimeStamp: 1     Value: original bounds: (1,2) actual bounds: (1,2) observations: [(1,2),(2,3)]
TimeStamp: 2     Value: original bounds: (2,3) actual bounds: (2,3) observations: [(2,3),(3,4)]
TimeStamp: 3     Value: original bounds: (3,4) actual bounds: (3,4) observations: [(3,4),(4,5)]
TimeStamp: 4     Value: original bounds: (4,5) actual bounds: (4,5) observations: [(4,5),(5,6)]

flatten the segments into a single multi-time-series

>>> mts = ts_sliding.flatten()
>>> mts
0 time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
1 time series
------------------------------
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
2 time series
------------------------------
TimeStamp: 2     Value: 3
TimeStamp: 3     Value: 4
3 time series
------------------------------
TimeStamp: 3     Value: 4
TimeStamp: 4     Value: 5
4 time series
------------------------------
TimeStamp: 4     Value: 5
TimeStamp: 5     Value: 6
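
The relationship between segments and the keys of the resulting multi-time-series can be sketched in plain Python. This is only a model under stated assumptions: a time-series is a list of (time_tick, value) pairs, the default key is taken to be the time-tick of the first observation in the segment, and the helpers sliding_segments and flatten_segments are hypothetical names, not tspy functions.

```python
# Pure-Python model of segment + flatten; this does not call tspy.
def sliding_segments(observations, window):
    """Return every run of `window` consecutive observations."""
    return [observations[i:i + window]
            for i in range(len(observations) - window + 1)]

def flatten_segments(segments, key_func=None):
    """Turn each segment into its own keyed series (a dict of lists)."""
    if key_func is None:
        # Assumed default: key on the time-tick where the segment starts
        key_func = lambda segment: segment[0][0]
    return {key_func(segment): list(segment) for segment in segments}

observations = [(t, t + 1) for t in range(6)]   # ticks 0..5, values 1..6
segments = sliding_segments(observations, 2)

flattened = flatten_segments(segments)
print(list(flattened.keys()))
print(flattened[0])

# A custom key function, here keyed on the last time-tick of each segment
by_end = flatten_segments(segments, key_func=lambda seg: "end-%d" % seg[-1][0])
print(list(by_end.keys()))
```

Whatever key function is supplied must return a distinct key per segment; in this dict-based model a repeated key would silently overwrite an earlier segment.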
map(func)

produce a new time-series where each observation’s value in this time-series is mapped to a new observation value

Parameters
func : func

value mapping function

Returns
TimeSeries

a new time-series with its values re-mapped

Examples

create a simple time-series

>>> import tspy
>>> ts_orig = tspy.time_series([1, 2, 3])
>>> ts_orig
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3

add one to each value of the time-series and produce a new time-series

>>> ts = ts_orig.map(lambda x: x + 1)
>>> ts
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4

map each value of the time-series to a string and produce a new time-series

>>> ts = ts_orig.map(lambda x: "value - " + str(x))
>>> ts
TimeStamp: 0     Value: value - 1
TimeStamp: 1     Value: value - 2
TimeStamp: 2     Value: value - 3

add one to each value of the time-series using high-performance expressions

>>> from tspy.functions import expressions as exp
>>> ts = ts_orig.map(exp.add(exp.id(), 1))
>>> ts
TimeStamp: 0     Value: 2.0
TimeStamp: 1     Value: 3.0
TimeStamp: 2     Value: 4.0
class tspy.data_structures.Stats(tsc, j_stats)

Bases: object

time-series statistics object in use when describing a time-series

Attributes
top : any
unique : int
frequency : int
first : Observation
last : Observation
count : int
min_inter_arrival_time : int
max_inter_arrival_time : int
mean_inter_arrival_time : float

Methods

toString

property count
Returns
int

number of elements in the time-series

property first
Returns
Observation

the first observation

property frequency
Returns
int

number of occurrences of the top element

property last
Returns
Observation

the last observation

property max_inter_arrival_time
Returns
int

max time between observations

property mean_inter_arrival_time
Returns
float

mean time between observations

property min_inter_arrival_time
Returns
int

min time between observations

toString()
property top
Returns
any

element that occurs most frequently

property unique
Returns
int

number of unique elements
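
To make the meaning of these properties concrete, the sketch below computes equivalent quantities in plain Python. It assumes a non-empty time-series given as a list of (time_tick, value) pairs sorted by time-tick; the function name describe_observations and the returned dict layout are illustrative only and do not reflect how tspy builds a Stats object.

```python
from collections import Counter

# Pure-Python model of the Stats properties; this does not call tspy.
def describe_observations(observations):
    """Summarise a non-empty list of (time_tick, value) pairs sorted by time_tick."""
    ticks = [t for t, _ in observations]
    values = [v for _, v in observations]
    top, frequency = Counter(values).most_common(1)[0]
    # Inter-arrival times are the gaps between consecutive time-ticks
    gaps = [later - earlier for earlier, later in zip(ticks, ticks[1:])]
    return {
        "count": len(observations),
        "unique": len(set(values)),
        "top": top,
        "frequency": frequency,
        "first": observations[0],
        "last": observations[-1],
        "min_inter_arrival_time": min(gaps) if gaps else None,
        "max_inter_arrival_time": max(gaps) if gaps else None,
        "mean_inter_arrival_time": sum(gaps) / len(gaps) if gaps else None,
    }

stats = describe_observations([(0, "a"), (2, "b"), (3, "a"), (7, "a")])
print(stats["top"], stats["frequency"], stats["unique"])
print(stats["min_inter_arrival_time"], stats["max_inter_arrival_time"])
```

In this example the gaps between time-ticks are 2, 1 and 4, so the minimum is 1, the maximum is 4 and the mean is 7/3.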

class tspy.data_structures.TRS(tsc, granularity, start_time, j_trs=None)

Bases: object

Time reference system (TRS) is a local, regional or global system used to identify time. A time reference system defines a specific projection for forward and reverse mapping between timestamp and its numeric representation. A common example that most of us are familiar with is UTC time, which maps a timestamp (Jan 1, 2019 12am midnight GMT) into a 64-bit integer value (1546300800000) that captures the number of milliseconds that have elapsed since Jan 1, 1970 12am (midnight) GMT. Generally speaking, the timestamp value is better suited for human readability, while the numeric representation is better suited for machine processing.

Notes

A timestamp is mapped into a numeric representation by computing the number of elapsed time-ticks since the offset. A numeric representation is scaled by the time-tick and shifted by the offset when it is mapped back to a timestamp.

Forward + reverse projections may be lossy. For instance, if the true time granularity of a time-series is in seconds, then forward and reverse mapping of timestamps 9:00:01 and 9:00:02 (to be read as hh:mm:ss) to a time-tick of one minute would result in timestamps 9:00:00 and 9:00:00 (respectively). In this example a time-series whose granularity is in seconds is being mapped to minutes, and thus the reverse mapping loses information. However, if the mapped granularity is finer than the granularity of the input time-series (more specifically, if the time-series granularity is an integral multiple of the mapped granularity), then the forward + reverse projection is guaranteed to be lossless. For example, mapping a time-series whose granularity is in minutes to seconds and reverse projecting it to minutes would result in lossless reconstruction of the timestamps.

By default, Granularity is one millisecond, and Start-Time is 1st Jan 1970 00:00:00
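
The forward and reverse mappings described in these notes can be illustrated with the standard datetime module. This is a minimal sketch of the arithmetic only, not the TRS implementation; the helper names ticks_since_start, tick_to_timestamp and round_trip are hypothetical, and the start time of 1st Jan 2019 UTC is chosen purely for the example.

```python
from datetime import datetime, timedelta, timezone

# Pure-Python model of the TRS projections; this does not call tspy.
def ticks_since_start(moment, granularity, start_time):
    """Forward mapping: whole time-ticks elapsed between start_time and moment."""
    return (moment - start_time) // granularity

def tick_to_timestamp(index, granularity, start_time):
    """Reverse mapping: scale by the time-tick, then shift by the offset."""
    return start_time + index * granularity

def round_trip(moment, granularity, start_time):
    index = ticks_since_start(moment, granularity, start_time)
    return tick_to_timestamp(index, granularity, start_time)

start = datetime(2019, 1, 1, tzinfo=timezone.utc)
t1 = datetime(2019, 1, 1, 9, 0, 1, tzinfo=timezone.utc)
t2 = datetime(2019, 1, 1, 9, 0, 2, tzinfo=timezone.utc)

# Second-level timestamps mapped to one-minute ticks both collapse to 9:00:00
lossy = [round_trip(t, timedelta(minutes=1), start) for t in (t1, t2)]

# A minute-aligned timestamp mapped to one-second ticks is recovered exactly
t3 = datetime(2019, 1, 1, 9, 1, 0, tzinfo=timezone.utc)
lossless = round_trip(t3, timedelta(seconds=1), start)

print(lossy[0].time(), lossy[1].time())  # 09:00:00 09:00:00
print(lossless == t3)                    # True
```

The floor division in the forward mapping is what discards the sub-tick remainder, which is why the round trip is lossless only when every timestamp falls exactly on a tick boundary.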

Attributes
granularity : datetime.timedelta
start_time : datetime

Methods

to_index(time)

get the given millisecond long as a time-tick index

to_long_lower(index)

get the given time-tick as a millisecond long (lower bound)

to_long_upper(index)

get the given time-tick as a millisecond long (upper bound)

property granularity
Returns
granularity : datetime.timedelta

granularity that captures time-tick granularity (e.g., 1 minute)

property start_time
Returns
start_time : datetime

start-time that captures an offset (e.g., 1st Jan 2019 12am midnight US Eastern Daylight Saving Time) to the start time of the time-series

to_index(time)

get the given millisecond long as a time-tick index

Parameters
time : int or datetime

the time to convert to an index

Returns
int

a time-tick index

to_long_lower(index)

get the given time-tick as a millisecond long (lower bound)

Parameters
index : int

the time-tick to convert

Returns
int

a millisecond long

to_long_upper(index)

get the given time-tick as a millisecond long (upper bound)

Parameters
index : int

the time-tick to convert

Returns
int

a millisecond long
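
Because one time-tick can cover many milliseconds, converting a tick back to a millisecond long yields a range rather than a single value. The sketch below shows one plausible reading of the lower and upper bounds. It is an assumption, not the documented behaviour: in particular, the upper bound is treated here as inclusive (the last millisecond inside the tick), and the helper names tick_index, tick_lower_ms and tick_upper_ms are hypothetical.

```python
# Pure-Python model of tick bounds in milliseconds; this does not call tspy.
def tick_index(time_ms, granularity_ms, start_ms=0):
    """Index of the time-tick that contains the given millisecond value."""
    return (time_ms - start_ms) // granularity_ms

def tick_lower_ms(index, granularity_ms, start_ms=0):
    """First millisecond covered by the tick."""
    return start_ms + index * granularity_ms

def tick_upper_ms(index, granularity_ms, start_ms=0):
    """Last millisecond covered by the tick (assumed inclusive bound)."""
    return tick_lower_ms(index + 1, granularity_ms, start_ms) - 1

start_ms = 1546300800000        # Jan 1, 2019 12am GMT, as in the class description
minute_ms = 60_000

index = tick_index(start_ms + 61_500, minute_ms, start_ms)
lower = tick_lower_ms(index, minute_ms, start_ms)
upper = tick_upper_ms(index, minute_ms, start_ms)

print(index)
print(lower - start_ms, upper - start_ms)
```

Under this assumption every millisecond from the lower bound to the upper bound maps back to the same index, and the next millisecond after the upper bound belongs to the following tick.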

Subpackages