tspy.data_structures package
-
class tspy.data_structures.MultiTimeSeries(tsc, j_mts)
Bases: object
A collection of TimeSeries where each time-series is identified by some key.
Notes
Like time-series, operations performed against a multi-time-series are executed lazily unless specified otherwise
All transforms against a multi-time-series will be run in parallel across time-series
There is no assumption that all time-series must be aligned or have like periodicity.
Examples
create a multi-time-series from a dict
>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.builder().add(tspy.observation(2,4)).add(tspy.observation(10,1)).result().to_time_series()
>>> mts = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1
create a multi-time-series from a pandas dataframe
>>> import tspy
>>> import numpy as np
>>> import pandas as pd
>>> header = ['', 'key', 'timestamp', "name", "age"]
>>> row1 = ['Row1', "a", 1, "josh", 27]
>>> row2 = ['Row2', "b", 3, "john", 4]
>>> row3 = ['Row3', "a", 5, "bob", 17]
>>> data = np.array([header, row1, row2, row3])
>>> df = pd.DataFrame(data=data[1:, 1:], index=data[1:, 0], columns=data[0, 1:]).astype(dtype={'key': 'object', 'timestamp': 'int64'})
>>> mts = tspy.multi_time_series(df, "key", "timestamp")
>>> mts
a time series
------------------------------
TimeStamp: 1     Value: {name=josh, age=27}
TimeStamp: 5     Value: {name=bob, age=17}
b time series
------------------------------
TimeStamp: 3     Value: {name=john, age=4}
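The dataframe example above splits rows into one time-series per key and orders each series by timestamp. The following plain-Python sketch illustrates that grouping only; it does not use tspy, and the function name `group_rows_by_key` and the list-of-tuples representation are assumptions made for illustration.

```python
from collections import defaultdict


def group_rows_by_key(rows, key_column, timestamp_column):
    """Group row dicts into {key: [(timestamp, value_dict), ...]}.

    Hypothetical helper: each series is a list of (timestamp, value)
    tuples, sorted by timestamp. This is not tspy's internal layout.
    """
    grouped = defaultdict(list)
    for row in rows:
        # everything except the key and timestamp columns becomes the value
        value = {
            name: cell
            for name, cell in row.items()
            if name not in (key_column, timestamp_column)
        }
        grouped[row[key_column]].append((row[timestamp_column], value))
    # sort every series by timestamp
    return {key: sorted(obs, key=lambda o: o[0]) for key, obs in grouped.items()}


rows = [
    {"key": "a", "timestamp": 1, "name": "josh", "age": 27},
    {"key": "b", "timestamp": 3, "name": "john", "age": 4},
    {"key": "a", "timestamp": 5, "name": "bob", "age": 17},
]
series_by_key = group_rows_by_key(rows, "key", "timestamp")
```

Here `series_by_key["a"]` holds the observations at timestamps 1 and 5, and `series_by_key["b"]` holds the single observation at timestamp 3, mirroring the printed output above.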
- Attributes
keys: all keys in this multi-time-series
Methods
aggregate(zero, seq_func, comb_func): aggregate all series in this multi-time-series to produce a single value
aggregate_series(list_to_val_func): aggregate all time-series in the multi-time-series using an aggregation function to produce a single time-series
aggregate_series_with_key(list_to_val_func): aggregate all time-series with key in the multi-time-series using an aggregation function to produce a single time-series
align(key[, interp_func]): align all time-series on a key
cache([cache_size]): suggest to the multi-time-series to cache values
collect([inclusive]): collect and materialize this multi-time-series
collect_series(key[, inclusive]): get a collection of observations given a key
describe(): retrieve a NumStats object per time-series computed from all values in this multi-time-series (double)
fillna(interpolator[, null_value]): produce a new multi-time-series which is the result of filling all null values.
filter(func): produce a new multi-time-series which is the result of filtering by each observation’s value given a filter function.
filter_series(func): filter each time-series by its time-series object
filter_series_key(func): filter each time-series by its key
forecast(num_predictions, fm[, …]): forecast the next num_predictions using a forecasting model for each time-series
full_align(multi_time_series[, …]): align two multi-time-series based on a temporal full join strategy and optionally interpolate missing values
full_join(multi_time_series[, join_func, …]): join two multi-time-series based on a temporal full join strategy and optionally interpolate missing values
get_time_series(key): get a time-series given a key
get_values(start, end[, inclusive]): get all values between a range in this multi-time-series
inner_align(multi_time_series): align two multi-time-series based on a temporal inner join strategy
inner_join(multi_time_series[, join_func]): join two multi-time-series based on a temporal inner join strategy
left_align(multi_time_series[, interp_func]): align two multi-time-series based on a temporal left join strategy and optionally interpolate missing values
left_join(multi_time_series[, join_func, …]): join two multi-time-series based on a temporal left join strategy and optionally interpolate missing values
left_outer_align(multi_time_series[, …]): align two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values
left_outer_join(multi_time_series[, …]): join two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values
map(func): produce a new multi-time-series where each observation’s value in this multi-time-series is mapped to a new observation value
map_series(func): map each ObservationCollection to a new collection of observations
map_series_key(func): map each time-series key to a new key
map_series_with_key(func): map each ObservationCollection to a new collection of observations giving access to each time-series key
pair_wise_transform(binary_transform): produce a new multi-time-series which is the product of performing a pair-wise transform against all combinations of keys
print([start, end, inclusive]): print this multi-time-series
reduce(func): reduce each time-series in this multi-time-series to a single value
reduce_range(func, start, end[, inclusive]): reduce each time-series in this multi-time-series to a single value given a range
resample(period, interp_func): produce a new multi-time-series by resampling each time-series to a given periodicity
right_align(multi_time_series[, interp_func]): align two multi-time-series based on a temporal right join strategy and optionally interpolate missing values
right_join(multi_time_series[, join_func, …]): join two multi-time-series based on a temporal right join strategy and optionally interpolate missing values
right_outer_align(multi_time_series[, …]): align two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values
right_outer_join(multi_time_series[, …]): join two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values
segment(window[, step, enforce_bounds]): produce a new segment-multi-time-series by performing a sliding-based segmentation over each time-series
segment_by(func): produce a new segment-multi-time-series by performing a group-by operation on each observation’s value for each time-series
segment_by_anchor(func, left_delta, right_delta): produce a new segment-multi-time-series by performing an anchor-based segmentation over each time-series.
segment_by_changepoint([change_point]): produce a new segment-multi-time-series by performing a change-point based segmentation.
segment_by_time(window, step): produce a new segment-multi-time-series by performing a time-based segmentation over each time-series
time_series(key): get a time-series given a key
to_df([format, inclusive]): convert this multi-time-series to a pandas dataframe.
to_df_instants([inclusive]): convert this multi-time-series to an observations pandas dataframe.
to_df_observations([inclusive]): convert this multi-time-series to an observations pandas dataframe.
to_segments(segment_transform): produce a new segment-multi-time-series from a segmentation transform
transform(*args): produce a new multi-time-series which is the result of performing a transform over each time-series.
trs(key): get a time-series time-reference-system given a key
uncache(): remove the multi-time-series caching mechanism
with_trs([granularity, time_tick]): create a new multi-time-series with its timestamps mapped based on a granularity and start_time.
write([start, end, inclusive]): create a multi-time-series-writer given a range
-
aggregate(zero, seq_func, comb_func)
aggregate all series in this multi-time-series to produce a single value
- Parameters
- zero : any
zero value for aggregation
- seq_func : func
operation to perform against each time-series to reduce it to a single value
- comb_func : func
operation to perform against the reduced time-series values to combine them
- Returns
- any
single output value representing the aggregate of all time-series in the multi-time-series
- Raises
- TSErrorWithMessage
If there is an error in aggregating, e.g. incorrect type
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.builder().add(tspy.observation(2,4)).add(tspy.observation(10,1)).result().to_time_series()
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1
get the sum over all time-series
>>> from tspy.functions import reducers
>>> total = mts_orig.aggregate(0, lambda agg, cur: agg + cur.reduce(reducers.sum()), lambda agg1, agg2: agg1 + agg2)
>>> total
11.0
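The roles of zero, seq_func and comb_func can be illustrated without tspy. The sketch below is a plain-Python approximation under stated assumptions: each series is a list of (timestamp, value) tuples, the series are split into a hypothetical number of partitions, seq_func folds the series of one partition into an accumulator, and comb_func merges the per-partition accumulators. How tspy actually partitions the work is not described in this page.

```python
from functools import reduce


def aggregate(series_by_key, zero, seq_func, comb_func, num_partitions=2):
    """Fold all series into one value (illustrative sketch, not tspy code)."""
    series = list(series_by_key.values())
    # assumption: series are distributed round-robin over partitions
    partitions = [series[i::num_partitions] for i in range(num_partitions)]
    # seq_func folds every series of a partition into an accumulator
    partials = [reduce(seq_func, part, zero) for part in partitions]
    # comb_func merges the accumulators; zero must be neutral for comb_func
    return reduce(comb_func, partials, zero)


mts = {"a": [(0, 1), (1, 2), (2, 3)], "b": [(2, 4), (10, 1)]}
total = aggregate(
    mts,
    0,
    lambda agg, cur: agg + sum(value for _, value in cur),
    lambda agg1, agg2: agg1 + agg2,
)
```

With the data from the example above, `total` is 11, the sum of all values in both series.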
-
aggregate_series(list_to_val_func)
aggregate all time-series in the multi-time-series using an aggregation function to produce a single time-series
- Parameters
- list_to_val_func : func
function which produces a single value given a list of values
- Returns
- TimeSeries
a new time-series
Notes
all time-series in this multi-time-series should be aligned prior to calling aggregate_series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.time_series([2,3,4])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4
create a sum per time-tick time-series
>>> ts = mts_orig.aggregate_series(lambda l: sum(l))
>>> ts
TimeStamp: 0     Value: 3
TimeStamp: 1     Value: 5
TimeStamp: 2     Value: 7
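To make the per-time-tick behaviour concrete, here is a plain-Python sketch that does not use tspy. It assumes every series is a list of (timestamp, value) tuples and that all series are already aligned (same timestamps in the same order), as the note above requires; the function below is a stand-in, not the library implementation.

```python
def aggregate_series(series_by_key, list_to_val_func):
    """Collapse aligned series into one series, tick by tick (sketch)."""
    all_series = list(series_by_key.values())
    # assumption: all series share the timestamps of the first one
    timestamps = [timestamp for timestamp, _ in all_series[0]]
    # transpose the values so that each column holds one time-tick
    columns = zip(*[[value for _, value in series] for series in all_series])
    return [
        (timestamp, list_to_val_func(list(column)))
        for timestamp, column in zip(timestamps, columns)
    ]


mts = {"a": [(0, 1), (1, 2), (2, 3)], "b": [(0, 2), (1, 3), (2, 4)]}
per_tick_sum = aggregate_series(mts, sum)
```

For the two series above, `per_tick_sum` is `[(0, 3), (1, 5), (2, 7)]`, which matches the doctest output.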
-
aggregate_series_with_key(list_to_val_func)
aggregate all time-series with key in the multi-time-series using an aggregation function to produce a single time-series
- Parameters
- list_to_val_func : func
function which produces a single value given a list of pairs with key and value
- Returns
- TimeSeries
a new time-series
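A possible use of the key in each pair is a per-series weight. The following plain-Python sketch assumes aligned series stored as lists of (timestamp, value) tuples and passes a list of (key, value) pairs to the function for every time-tick; the `weights` dict and the function body are illustrative assumptions and do not come from tspy.

```python
def aggregate_series_with_key(series_by_key, list_to_val_func):
    """Collapse aligned series tick by tick, exposing each key (sketch)."""
    keys = list(series_by_key)
    # assumption: all series share the timestamps of the first one
    timestamps = [timestamp for timestamp, _ in series_by_key[keys[0]]]
    result = []
    for index, timestamp in enumerate(timestamps):
        # one (key, value) pair per series for this time-tick
        pairs = [(key, series_by_key[key][index][1]) for key in keys]
        result.append((timestamp, list_to_val_func(pairs)))
    return result


series = {"a": [(0, 1), (1, 2), (2, 3)], "b": [(0, 2), (1, 3), (2, 4)]}
weights = {"a": 2, "b": 1}  # hypothetical per-key weights
weighted = aggregate_series_with_key(
    series, lambda pairs: sum(weights[key] * value for key, value in pairs)
)
```

Here `weighted` is `[(0, 4), (1, 7), (2, 10)]`: series 'a' counts twice and series 'b' once at every time-tick.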
-
align(key, interp_func=<function MultiTimeSeries.<lambda>>)
align all time-series on a key
- Parameters
- key : any
key to a time-series within this multi-time-series
- interp_func : func or interpolator, optional
the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- MultiTimeSeries
a new multi-time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.builder().add(tspy.observation(2,4)).add(tspy.observation(10,1)).result().to_time_series()
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1
align all time-series on time-series ‘b’
>>> mts = mts_orig.align("b")
>>> mts
a time series
------------------------------
TimeStamp: 2     Value: 3
TimeStamp: 10     Value: null
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1
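The alignment shown above can be approximated in plain Python. The sketch assumes lists of (timestamp, value) tuples and a hypothetical interpolator signature `interp_func(observations, timestamp)`; tspy's real interpolators may take different arguments, so treat the signature as a placeholder.

```python
def align(series_by_key, key, interp_func=lambda observations, timestamp: None):
    """Re-sample every series onto the timestamps of series `key` (sketch)."""
    anchor_timestamps = [timestamp for timestamp, _ in series_by_key[key]]
    aligned = {}
    for name, observations in series_by_key.items():
        lookup = dict(observations)
        aligned[name] = [
            # keep an existing value, otherwise ask the interpolator
            (t, lookup[t] if t in lookup else interp_func(observations, t))
            for t in anchor_timestamps
        ]
    return aligned


mts = {"a": [(0, 1), (1, 2), (2, 3)], "b": [(2, 4), (10, 1)]}
aligned = align(mts, "b")
```

As in the doctest, series 'a' keeps its value at timestamp 2 and receives `None` at timestamp 10, while series 'b' is unchanged.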
-
cache(cache_size=None)
suggest to the multi-time-series to cache values
- Parameters
- cache_size : int, optional
the max cache size (default is max long)
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
this is a lazy operation and will only suggest to the multi-time-series to save values once computed
-
collect(inclusive=False)
collect and materialize this multi-time-series
- Parameters
- inclusive : bool, optional
if true, will use inclusive bounds (default is False)
- Returns
- dict
a collection of observations for each key
- Raises
- TSErrorWithMessage
If there is an error in collecting data, e.g. incorrect type
Notes
see collect() for usage
-
collect_series(key, inclusive=False)
get a collection of observations given a key
- Parameters
- key : any
the key associated with a time-series in this multi-time-series
- inclusive : bool, optional
if true, will use inclusive bounds (default is False)
- Returns
- ObservationCollection
the collection of observations associated with the given key
- Raises
- ValueError
If there is an error in retrieving the observations, e.g. incorrect key
-
describe()
retrieve a NumStats object per time-series computed from all values in this multi-time-series (double)
- Returns
- dict
NumStats for each key
- Raises
- TSErrorWithMessage
If describe doesn’t work with the data type
-
fillna(interpolator, null_value=None)
produce a new multi-time-series which is the result of filling all null values.
- Parameters
- interpolator : func or interpolator
the interpolator method to be used when a value is null
- null_value : any, optional
denotes a null value, for instance if null_value = NaN, NaN would be filled
- Returns
- MultiTimeSeries
a new multi-time-series
-
filter(func)
produce a new multi-time-series which is the result of filtering by each observation’s value given a filter function.
- Parameters
- func : func
the filter function applied to each observation’s value
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
see filter() for usage
-
filter_series(func)
filter each time-series by its time-series object
- Parameters
- func : func
function which given a time-series will produce a boolean denoting whether to keep the time-series
- Returns
- MultiTimeSeries
a new multi-time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1.0, 2.0, 3.0, 4.0])
>>> ts2 = tspy.time_series([1.0, -2.0, 3.0, 4.0])
>>> ts3 = tspy.time_series([0.0, 1.0, 2.0, 4.0])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2, 'c': ts3})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0
keep only the time-series that do not contain the value -2.0
>>> mts = mts_orig.filter_series(lambda s: -2.0 not in [x.value for x in s.collect()])
>>> mts
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0
-
filter_series_key(func)
filter each time-series by its key
- Parameters
- func : func
function which given a key will produce a boolean denoting whether to keep the time-series
- Returns
- MultiTimeSeries
a new multi-time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1.0, 2.0, 3.0, 4.0])
>>> ts2 = tspy.time_series([1.0, -2.0, 3.0, 4.0])
>>> ts3 = tspy.time_series([0.0, 1.0, 2.0, 4.0])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2, 'c': ts3})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0
filter each series by key != ‘a’
>>> mts = mts_orig.filter_series_key(lambda k: k != 'a')
>>> mts
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0
-
forecast(num_predictions, fm, start_training_time=None, confidence=1.0)
forecast the next num_predictions using a forecasting model for each time-series
- Parameters
- num_predictions : int
number of forecasts past the end of the time-series to retrieve
- fm : ForecastingModel
the forecasting model to use
- start_training_time : int or datetime, optional
point at which to start training the forecasting model
- confidence : float, optional
number between 0 and 1 which is used in calculating the confidence interval (default is 1.0)
- Returns
- dict
a collection of observations for each key
- Raises
- TSErrorWithMessage
If there is an error in forecasting
Notes
see forecast() for usage
-
full_align(multi_time_series, left_interp_func=<function MultiTimeSeries.<lambda>>, right_interp_func=<function MultiTimeSeries.<lambda>>)
align two multi-time-series based on a temporal full join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries
the multi-time-series to align with
- left_interp_func : func or interpolator, optional
the left multi-time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- right_interp_func : func or interpolator, optional
the right multi-time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- tuple
aligned multi-time-series
Notes
full align will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see full_align() for usage
-
full_join(multi_time_series, join_func=None, left_interp_func=<function MultiTimeSeries.<lambda>>, right_interp_func=<function MultiTimeSeries.<lambda>>)
join two multi-time-series based on a temporal full join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries or TimeSeries
the multi-time-series to join with
- join_func : func, optional
function to join two values (default is join to list where left is index 0, right is index 1)
- left_interp_func : func or interpolator, optional
the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- right_interp_func : func or interpolator, optional
the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
full join will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see full_join()
for usage
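A temporal full join keeps every timestamp that appears on either side and fills the missing side. The plain-Python sketch below shows that behaviour for two dicts of (timestamp, value) lists. It fixes the interpolation to the documented default (fill with None) and uses the documented default join (a two-element list); everything else, including the function names, is an assumption for illustration.

```python
def full_join_series(left, right, join_func):
    """Full join of two series on timestamp, missing side filled with None."""
    left_lookup, right_lookup = dict(left), dict(right)
    # union of the timestamps of both sides
    timestamps = sorted(set(left_lookup) | set(right_lookup))
    return [
        (t, join_func(left_lookup.get(t), right_lookup.get(t)))
        for t in timestamps
    ]


def full_join(left_mts, right_mts, join_func=None):
    """Join series that share a key; other keys are discarded (sketch)."""
    if join_func is None:
        # default described above: left at index 0, right at index 1
        join_func = lambda left_value, right_value: [left_value, right_value]
    return {
        key: full_join_series(left_mts[key], right_mts[key], join_func)
        for key in left_mts
        if key in right_mts
    }


left = {"a": [(0, 1), (1, 2)], "b": [(0, 5)]}
right = {"a": [(1, 20), (3, 30)], "c": [(0, 9)]}
joined = full_join(left, right)
```

Only key 'a' exists on both sides, so `joined` is `{'a': [(0, [1, None]), (1, [2, 20]), (3, [None, 30])]}`.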
-
get_time_series(key)
get a time-series given a key
- Parameters
- key : any
the key associated with a time-series in this multi-time-series
- Returns
- TimeSeries
the time-series associated with the given key
- Raises
- ValueError
If there is an error in retrieving the time-series, e.g. incorrect key
-
get_values(start, end, inclusive=False)
get all values between a range in this multi-time-series
- Parameters
- start : int or datetime
start of range (inclusive)
- end : int or datetime
end of range (inclusive)
- inclusive : bool, optional
if true, will use inclusive bounds (default is False)
- Returns
- dict
a collection of observations for each key
- Raises
- TSErrorWithMessage
If there is an error in collecting data, e.g. incorrect type
Notes
see get_values() for usage
-
inner_align(multi_time_series)
align two multi-time-series based on a temporal inner join strategy
- Parameters
- multi_time_series : MultiTimeSeries
the multi-time-series to align with
- Returns
- tuple
aligned multi-time-series
Notes
inner align will align on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see inner_align() for usage
-
inner_join(multi_time_series, join_func=None)
join two multi-time-series based on a temporal inner join strategy
- Parameters
- multi_time_series : MultiTimeSeries or TimeSeries
the multi-time-series to join with
- join_func : func, optional
function to join 2 values at a given time-tick. If None given, joined value will be in a list (default is None)
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
inner join will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see inner_join()
for usage
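A temporal inner join keeps only the timestamps present on both sides. The following plain-Python sketch illustrates this for two dicts of (timestamp, value) lists; it is not the tspy implementation, and the function name and data layout are assumptions.

```python
def inner_join(left_mts, right_mts, join_func=None):
    """Join like keys on shared timestamps only (illustrative sketch)."""
    if join_func is None:
        # with no join function the joined value is a list
        join_func = lambda left_value, right_value: [left_value, right_value]
    joined = {}
    for key, left_observations in left_mts.items():
        if key not in right_mts:
            continue  # keys missing on one side are discarded
        right_lookup = dict(right_mts[key])
        joined[key] = [
            (t, join_func(value, right_lookup[t]))
            for t, value in left_observations
            if t in right_lookup
        ]
    return joined


left = {"a": [(0, 1), (1, 2), (2, 3)], "b": [(0, 5)]}
right = {"a": [(1, 10), (2, 20), (5, 50)]}
summed = inner_join(left, right, lambda l, r: l + r)
```

Timestamps 1 and 2 are the only ones shared by both 'a' series, so `summed` is `{'a': [(1, 12), (2, 23)]}`.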
-
property keys
- Returns
- list
all keys in this multi-time-series
-
left_align(multi_time_series, interp_func=<function MultiTimeSeries.<lambda>>)
align two multi-time-series based on a temporal left join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries
the multi-time-series to align with
- interp_func : func or interpolator, optional
the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- tuple
aligned multi-time-series
Notes
left align will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see left_align() for usage
-
left_join(multi_time_series, join_func=None, interp_func=<function MultiTimeSeries.<lambda>>)
join two multi-time-series based on a temporal left join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries or TimeSeries
the multi-time-series to join with
- join_func : func, optional
function to join two values (default is join to list where left is index 0, right is index 1)
- interp_func : func or interpolator, optional
the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
left join will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see left_join() for usage
-
left_outer_align(multi_time_series, interp_func=<function MultiTimeSeries.<lambda>>)
align two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries
the multi-time-series to align with
- interp_func : func or interpolator, optional
the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- tuple
aligned multi-time-series
Notes
left outer align will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see left_outer_align() for usage
-
left_outer_join(multi_time_series, join_func=None, interp_func=<function MultiTimeSeries.<lambda>>)
join two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries or TimeSeries
the multi-time-series to join with
- join_func : func, optional
function to join two values (default is join to list where left is index 0, right is index 1)
- interp_func : func or interpolator, optional
the right time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
left outer join will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see left_outer_join() for usage
-
map(func)
produce a new multi-time-series where each observation’s value in this multi-time-series is mapped to a new observation value
- Parameters
- func : func
value mapping function
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
see map() for usage
-
map_series(func)
map each ObservationCollection to a new collection of observations
- Parameters
- func : func
function which given a collection of observations, will produce a new collection of observations
- Returns
- MultiTimeSeries
a new multi-time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.builder().add(tspy.observation(2,4)).add(tspy.observation(10,1)).result().to_time_series()
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 2     Value: 4
TimeStamp: 10     Value: 1
add one to each value in our multi-time-series
>>> mts = mts_orig.map_series(lambda s: s.to_time_series().map(lambda x: x + 1).collect())
>>> mts
a time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4
b time series
------------------------------
TimeStamp: 2     Value: 5
TimeStamp: 10     Value: 2
-
map_series_key(func)
map each time-series key to a new key
- Parameters
- func : func
function which given a time-series key, will produce a new time-series key
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
all produced keys must be unique
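The uniqueness requirement can be illustrated with a plain-Python sketch over a dict of series. How tspy reacts to a duplicate key is not stated in this page; raising ValueError below is purely an assumption chosen to make the constraint visible.

```python
def map_series_key(series_by_key, func):
    """Rename every series key with func (illustrative sketch)."""
    mapped = {}
    for key, observations in series_by_key.items():
        new_key = func(key)
        if new_key in mapped:
            # assumption: a duplicate key is treated as an error here
            raise ValueError(f"duplicate key produced: {new_key!r}")
        mapped[new_key] = observations
    return mapped


mts = {"a": [(0, 1)], "b": [(0, 2)]}
renamed = map_series_key(mts, lambda key: "sensor_" + key)
```

`renamed` has the keys 'sensor_a' and 'sensor_b' with the original observations, while a function that maps every key to the same value would violate the uniqueness requirement.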
-
map_series_with_key(func)
map each ObservationCollection to a new collection of observations giving access to each time-series key
- Parameters
- func : func
function which given a collection of observations and a key, will produce a new collection of observations
- Returns
- MultiTimeSeries
a new multi-time-series
-
pair_wise_transform(binary_transform)
produce a new multi-time-series which is the product of performing a pair-wise transform against all combinations of keys
- Parameters
- binary_transform : BinaryTransform
the binary transform to execute across all pairs in this multi-time-series
- Returns
- MultiTimeSeries
a new multi-time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1.0, 2.0, 3.0, 4.0])
>>> ts2 = tspy.time_series([1.0, -2.0, 3.0, 4.0])
>>> ts3 = tspy.time_series([0.0, 1.0, 2.0, 4.0])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2, 'c': ts3})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0
perform a pair-wise correlation on this multi-time-series (sliding windows of size 3)
>>> from tspy.functions import reducers
>>> mts = mts_orig.segment(3).pair_wise_transform(reducers.correlation())
>>> mts
(a, a) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 1.0
(a, b) time series
------------------------------
TimeStamp: 0     Value: 0.3973597071195132
TimeStamp: 1     Value: 0.9332565252573828
(b, a) time series
------------------------------
TimeStamp: 0     Value: 0.3973597071195132
TimeStamp: 1     Value: 0.9332565252573828
(a, c) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 0.9819805060619657
(b, b) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 1.0
(c, a) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 0.9819805060619657
(b, c) time series
------------------------------
TimeStamp: 0     Value: 0.3973597071195132
TimeStamp: 1     Value: 0.8485552916276634
(c, b) time series
------------------------------
TimeStamp: 0     Value: 0.3973597071195132
TimeStamp: 1     Value: 0.8485552916276633
(c, c) time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 1.0
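The numbers in the example above can be reproduced without tspy. The sketch below computes a Pearson correlation for every ordered pair of keys over sliding windows of size 3. It works on plain lists of values, and the helper names are hypothetical; it is meant to show what the pair-wise transform computes, not how the library computes it.

```python
from itertools import product
from math import sqrt


def sliding_windows(values, window):
    """All full windows of `window` consecutive values, step 1."""
    return [values[i:i + window] for i in range(len(values) - window + 1)]


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


def pair_wise_correlation(values_by_key, window):
    """Correlate the windows of every ordered pair of keys (sketch)."""
    result = {}
    for left_key, right_key in product(values_by_key, repeat=2):
        left_windows = sliding_windows(values_by_key[left_key], window)
        right_windows = sliding_windows(values_by_key[right_key], window)
        result[(left_key, right_key)] = [
            pearson(l, r) for l, r in zip(left_windows, right_windows)
        ]
    return result


values = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [1.0, -2.0, 3.0, 4.0],
    "c": [0.0, 1.0, 2.0, 4.0],
}
correlations = pair_wise_correlation(values, 3)
```

Three keys yield nine ordered pairs with two windows each; `correlations[("a", "b")]` is approximately `[0.39736, 0.93326]`, in line with the (a, b) series printed above.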
-
print(start=None, end=None, inclusive=False)
print this multi-time-series
- Parameters
- start : int or datetime, optional
start of range (inclusive) (default is current first time-tick)
- end : int or datetime, optional
end of range (inclusive) (default is current last time-tick)
- inclusive : bool, optional
if true, will use inclusive bounds (default is False)
- Raises
- ValueError
If there is an error in the input arguments.
-
reduce(func)
reduce each time-series in this multi-time-series to a single value
- Parameters
- func : unary reducer or func
the unary reducer method to be used
- Returns
- dict
the output of time-series reduction for each key
Notes
see reduce() for usage
-
reduce_range(func, start, end, inclusive=False)
reduce each time-series in this multi-time-series to a single value given a range
- Parameters
- func : unary reducer or func
the unary reducer method to be used
- start : int or datetime
start of range (inclusive)
- end : int or datetime
end of range (inclusive)
- inclusive : bool, optional
if true, will use inclusive bounds (default is False)
- Returns
- dict
the output of time-series reduction for each key
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1.0, 2.0, 3.0, 4.0])
>>> ts2 = tspy.time_series([1.0, -2.0, 3.0, 4.0])
>>> ts3 = tspy.time_series([0.0, 1.0, 2.0, 4.0])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2, 'c': ts3})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: 2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
b time series
------------------------------
TimeStamp: 0     Value: 1.0
TimeStamp: 1     Value: -2.0
TimeStamp: 2     Value: 3.0
TimeStamp: 3     Value: 4.0
c time series
------------------------------
TimeStamp: 0     Value: 0.0
TimeStamp: 1     Value: 1.0
TimeStamp: 2     Value: 2.0
TimeStamp: 3     Value: 4.0
reduce each time-series to an average from [1,2]
>>> from tspy.functions import reducers
>>> avg_dict = mts_orig.reduce_range(reducers.average(), 1, 2)
>>> avg_dict
{'a': 2.5, 'b': 0.5, 'c': 1.5}
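The averages in the example can be checked with a plain-Python sketch. It assumes lists of (timestamp, value) tuples and treats both range bounds as inclusive, which is what the documented output for the range [1,2] implies; the `average` helper stands in for `reducers.average()` and is not part of tspy.

```python
def average(values):
    """Arithmetic mean of a non-empty list (stand-in reducer)."""
    return sum(values) / len(values)


def reduce_range(series_by_key, func, start, end):
    """Reduce the values of each series within [start, end] (sketch)."""
    return {
        key: func([value for t, value in observations if start <= t <= end])
        for key, observations in series_by_key.items()
    }


mts = {
    "a": [(0, 1.0), (1, 2.0), (2, 3.0), (3, 4.0)],
    "b": [(0, 1.0), (1, -2.0), (2, 3.0), (3, 4.0)],
    "c": [(0, 0.0), (1, 1.0), (2, 2.0), (3, 4.0)],
}
avg_dict = reduce_range(mts, average, 1, 2)
```

Each series contributes its values at timestamps 1 and 2, giving `{'a': 2.5, 'b': 0.5, 'c': 1.5}` as in the doctest.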
-
resample(period, interp_func)
produce a new multi-time-series by resampling each time-series to a given periodicity
- Parameters
- period : int
the period to resample to
- interp_func : func or interpolator
the interpolator method to be used when a value doesn’t exist at a given time-tick
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
see resample() for usage
-
right_align(multi_time_series, interp_func=<function MultiTimeSeries.<lambda>>)
align two multi-time-series based on a temporal right join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries
the multi-time-series to align with
- interp_func : func or interpolator, optional
the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- tuple
aligned multi-time-series
Notes
right align will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see right_align() for usage
-
right_join(multi_time_series, join_func=None, interp_func=<function MultiTimeSeries.<lambda>>)
join two multi-time-series based on a temporal right join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries or TimeSeries
the multi-time-series to join with
- join_func : func, optional
function to join two values (default is join to list where left is index 0, right is index 1)
- interp_func : func or interpolator, optional
the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- MultiTimeSeries
a new multi-time-series
Notes
right join will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see right_join() for usage
-
right_outer_align(multi_time_series, interp_func=<function MultiTimeSeries.<lambda>>)
align two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values
- Parameters
- multi_time_series : MultiTimeSeries
the multi-time-series to align with
- interp_func : func or interpolator, optional
the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
- tuple
aligned multi-time-series
Notes
right outer align will join on like time-series keys. If a key does not exist in one of the multi-time-series, it will be discarded
see right_outer_align() for usage
-
right_outer_join
(multi_time_series, join_func=None, interp_func=<function MultiTimeSeries.<lambda>>)¶ join two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values
- Parameters
- multi_time_series
MultiTimeSeries
orTimeSeries
the multi-time-series to join with
- join_funcfunc, optional
function to join two values (default is join to list where left is index 0, right is index 1)
- interp_funcfunc or interpolator, optional
the left time-series interpolator method to be used when a value doesn’t exist at a given time-tick (default is fill with None)
- Returns
MultiTimeSeries
a new multi-time-series
Notes
right outer join will join on like time-series keys. If a key does not exist in one time-series, it will be discarded
see
right_outer_join()
for usage
-
segment
(window, step=1, enforce_bounds=True)¶ produce a new segment-multi-time-series by performing a sliding-window segmentation over each time-series
- Parameters
- windowint
number of observations per window
- stepint, optional
step size to slide (default is 1)
- enforce_sizebool, optional
if true, will require a window to have the given window size number of observations, otherwise windows can have less than or equal to the window size number of observations. (default is True)
- Returns
SegmentMultiTimeSeries
a new segment-multi-time-series
Notes
see
segment()
for usage
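As an illustration of sliding-window segmentation, here is a minimal pure-Python sketch that does not call tspy. The function name `sliding_segments` is hypothetical, observations are modeled as (time_tick, value) tuples, and the size-enforcement flag is assumed to drop trailing windows that hold fewer than `window` observations.

```python
def sliding_segments(observations, window, step=1, enforce_size=True):
    """Slide a fixed-size window over a list of (time_tick, value) tuples."""
    segments = []
    for start in range(0, len(observations), step):
        segment = observations[start:start + window]
        if enforce_size and len(segment) < window:
            break  # remaining windows would all be too short
        segments.append(segment)
    return segments


observations = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
segments = sliding_segments(observations, 2)
for segment in segments:
    print(segment)
```

With `enforce_size=False` the same input would yield one extra, shorter window containing only the final observation.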
-
segment_by
(func)¶ produce a new segment-multi-time-series by performing a group-by operation on each observation’s value for each time-series
- Parameters
- funcfunc
value to key function
- Returns
SegmentMultiTimeSeries
a new segment-multi-time-series
Notes
see
segment_by()
for usage
-
segment_by_anchor
(func, left_delta, right_delta)¶ produce a new segment-multi-time-series from performing an anchor-based segmentation over each time-series. An anchor point is defined as any value that satisfies the filter function. When an anchor point is determined the segment is built based on left_delta time ticks to the left of the point and right_delta time ticks to the right of the point.
- Parameters
- funcfunc
the filter anchor point function
- left_deltaint
left delta time ticks to the left of the anchor point
- right_deltaint
right delta time ticks to the right of the anchor point
- percint, optional
number between 0 and 1.0 to denote how often to accept the anchor (default is None)
- Returns
SegmentMultiTimeSeries
a new segment-multi-time-series
Notes
see
segment_by_anchor()
for usage
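The anchor-based segmentation described above can be sketched in plain Python without tspy. The helper name `anchor_segments` is hypothetical, and the sketch assumes both window bounds are inclusive; the library's actual bound handling may differ.

```python
def anchor_segments(observations, func, left_delta, right_delta):
    """Build one segment around every observation whose value satisfies func."""
    segments = []
    for tick, value in observations:
        if func(value):
            low, high = tick - left_delta, tick + right_delta
            # Assumption: both ends of the window are inclusive.
            segments.append([(t, v) for t, v in observations if low <= t <= high])
    return segments


observations = [(0, 1), (1, 5), (2, 2), (3, 9), (4, 3)]
segments = anchor_segments(observations, lambda v: v > 4, 1, 1)
for segment in segments:
    print(segment)
```

Here the values 5 and 9 act as anchors, so two overlapping segments are produced.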
-
segment_by_changepoint
(change_point=None)¶ produce a new segment-multi-time-series by performing a change-point based segmentation. A change-point can be defined as any change between 2 values that results in a true statement.
- Parameters
- change_pointfunc, optional
a function given a prev/next value to determine if a change exists (default is simple constant change)
- Returns
SegmentMultiTimeSeries
a new segment-multi-time-series
Notes
see
segment_by_changepoint()
for usage
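A minimal pure-Python sketch of change-point segmentation follows; it does not use tspy. The helper name `changepoint_segments` is hypothetical, and the default "simple constant change" is assumed here to mean that a new segment starts whenever the value differs from the previous one.

```python
def changepoint_segments(observations, change_point=None):
    """Start a new segment whenever change_point(prev_value, next_value) is true."""
    if change_point is None:
        # Assumed default: any change in value is a change-point.
        change_point = lambda prev, nxt: prev != nxt
    segments = []
    for obs in observations:
        if segments and not change_point(segments[-1][-1][1], obs[1]):
            segments[-1].append(obs)
        else:
            segments.append([obs])
    return segments


observations = [(0, 1), (1, 1), (2, 2), (3, 2), (4, 2), (5, 3)]
segments = changepoint_segments(observations)
for segment in segments:
    print(segment)
```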
-
segment_by_time
(window, step)¶ produce a new segment-multi-time-series by performing a time-based segmentation over each time-series
- Parameters
- windowint
time-tick length of window
- stepint
time-tick length of step
- Returns
SegmentMultiTimeSeries
a new segment-multi-time-series
Notes
see
segment_by_time()
for usage
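Time-based segmentation differs from sliding segmentation in that windows are measured in time-ticks rather than in observation counts. The pure-Python sketch below does not use tspy; `time_segments` is a hypothetical name, and it assumes windows start at the first time-tick and are half-open, [start, start + window), which may not match the library's exact bounds.

```python
def time_segments(observations, window, step):
    """Group (time_tick, value) tuples into windows of a fixed time-tick length."""
    if step <= 0:
        raise ValueError("step must be positive")
    if not observations:
        return []
    first_tick, last_tick = observations[0][0], observations[-1][0]
    segments = []
    start = first_tick
    while start <= last_tick:
        end = start + window  # assumption: exclusive upper bound
        segments.append([(t, v) for t, v in observations if start <= t < end])
        start += step
    return segments


observations = [(0, "a"), (3, "b"), (4, "c"), (9, "d")]
segments = time_segments(observations, window=5, step=5)
for segment in segments:
    print(segment)
```

Unlike a count-based window, the two resulting segments hold different numbers of observations because the data is irregularly spaced.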
-
time_series
(key)¶ get a time-series given a key
- Parameters
- keyany
the key associated with a time-series in this multi-time-series
- Returns
TimeSeries
the time-series associated with this given key
-
to_df
(format='observations', inclusive=False)¶ convert this multi-time-series to a pandas dataframe. A pandas dataframe can either be stored in observations format (key column per row) or instants format (one time-series per column)
- Parameters
- formatstr, optional
dataframe format to store in (default is observations format)
- inclusivebool, optional
if true, will use inclusive bounds (default is False)
- Returns
- dataframe
a pandas dataframe representation of this time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.time_series([2,3,4])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4
create an observations dataframe
>>> df = mts_orig.to_df(format="observations")
>>> df
   timestamp key  value
0          0   a      1
1          1   a      2
2          2   a      3
3          0   b      2
4          1   b      3
5          2   b      4
create an instants dataframe
>>> mts = mts_orig.to_df(format="instants")
>>> mts
   timestamp  a  b
0          0  1  2
1          1  2  3
2          2  3  4
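To make the difference between the two layouts concrete, the following pure-Python sketch builds both from the same data without using tspy or pandas. The variable names are illustrative only; rows are modeled as plain dicts rather than dataframe rows.

```python
series = {"a": [(0, 1), (1, 2), (2, 3)], "b": [(0, 2), (1, 3), (2, 4)]}

# Observations format: one row per observation, with a key column.
observations_rows = [
    {"timestamp": tick, "key": key, "value": value}
    for key, obs in series.items()
    for tick, value in obs
]

# Instants format: one row per time-tick, one column per time-series key.
ticks = sorted({tick for obs in series.values() for tick, _ in obs})
lookup = {key: dict(obs) for key, obs in series.items()}
instants_rows = [
    {"timestamp": tick, **{key: lookup[key].get(tick) for key in series}}
    for tick in ticks
]

for row in observations_rows:
    print(row)
for row in instants_rows:
    print(row)
```

In this sketch a key with no value at a given time-tick would appear as None in the instants layout; how the library fills such gaps is not stated in this section.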
-
to_df_instants
(inclusive=False)¶ convert this multi-time-series to an instants pandas dataframe. An instants dataframe is one which contains a time-series per column.
- Parameters
- inclusivebool, optional
if true, will use inclusive bounds (default is False)
- Returns
- dataframe
a pandas dataframe representation of this time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.time_series([2,3,4])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4
create an instants dataframe
>>> mts = mts_orig.to_df_instants()
>>> mts
   timestamp  a  b
0          0  1  2
1          1  2  3
2          2  3  4
-
to_df_observations
(inclusive=False)¶ convert this multi-time-series to an observations pandas dataframe. An observations dataframe is one which contains a key column per record.
- Parameters
- inclusivebool, optional
if true, will use inclusive bounds (default is False)
- Returns
- dataframe
a pandas dataframe representation of this time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> ts1 = tspy.time_series([1,2,3])
>>> ts2 = tspy.time_series([2,3,4])
>>> mts_orig = tspy.multi_time_series({'a': ts1, 'b': ts2})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 2
TimeStamp: 1     Value: 3
TimeStamp: 2     Value: 4
create an observations dataframe
>>> df = mts_orig.to_df_observations()
>>> df
   timestamp key  value
0          0   a      1
1          1   a      2
2          2   a      3
3          0   b      2
4          1   b      3
5          2   b      4
-
to_segments
(segment_transform)¶ produce a new segment-multi-time-series from a segmentation transform
- Parameters
- segment_transformUnaryTransform
the transform which will result in a time-series of segments
- Returns
SegmentMultiTimeSeries
a new segment-multi-time-series
Notes
see
to_segments()
for usage
-
transform
(*args)¶ produce a new multi-time-series which is the result of performing a transform over each time-series. A transform can be of type unary (one time-series in, one time-series out) or binary (two time-series in, one time-series out)
- Parameters
- args
UnaryTransform
or BinaryTransform
the transformation to apply on each time-series
- Returns
MultiTimeSeries
a new multi-time-series
- Raises
- ValueError
If there is an error in the input arguments
Notes
transforms can be shape changing (time-series size out does not necessarily equal time-series size in)
see
transform()
for usage
-
trs
(key)¶ get a time-series time-reference-system given a key
- Parameters
- keyany
the key associated with a time-series in this multi-time-series
- Returns
TRS
this time-series time-reference-system
-
uncache
()¶ remove the multi-time-series caching mechanism
- Returns
MultiTimeSeries
a new multi-time-series
-
with_trs
(granularity=datetime.timedelta(0, 0, 1000), time_tick=datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc))¶ create a new multi-time-series with its timestamps mapped based on a granularity and start_time. In the scope of this method, granularity refers to the granularity at which to see time_ticks and start_time refers to the zone-date-time in which to start your time-series data when calling
get_values()
- Parameters
- granularitydatetime.timedelta, optional
the granularity for use in time-series
TRS
(default is 1ms)
- start_timedatetime, optional
the starting date-time of the time-series (default is 1970-01-01 UTC)
- Returns
MultiTimeSeries
a new multi-time-series with its time_ticks mapped based on a new
TRS
.
Notes
time_ticks will be mapped as follows - (current_time_tick - start_time) / granularity
if any source time-series does not have a time-reference-system associated with it, this method will throw an exception
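The mapping rule in the Notes can be worked through with the standard library alone. The sketch below does not use tspy; `to_time_tick` is a hypothetical helper, and it assumes the division is an integer (floor) division, which the Notes do not state explicitly.

```python
from datetime import datetime, timedelta, timezone


def to_time_tick(current_time, start_time, granularity):
    """Apply (current_time - start_time) / granularity, assuming floor division."""
    return (current_time - start_time) // granularity


start_time = datetime(1970, 1, 1, tzinfo=timezone.utc)  # documented default
moment = datetime(2019, 1, 1, tzinfo=timezone.utc)

ms_tick = to_time_tick(moment, start_time, timedelta(milliseconds=1))
day_tick = to_time_tick(moment, start_time, timedelta(days=1))
print(ms_tick, day_tick)
```

The same instant therefore maps to very different time-ticks depending on the chosen granularity: milliseconds since the start time in the first case, whole days in the second.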
-
write
(start=None, end=None, inclusive=False)¶ create a multi-time-series-writer given a range
- Parameters
- startint or datetime, optional
start of range (inclusive) (default is None)
- endint or datetime, optional
end of range (inclusive) (default is None)
- inclusivebool, optional
if true, will use inclusive bounds (default is False)
- Returns
MultiTimeSeriesWriter
a new multi-time-series-writer
- Raises
- ValueError
If there is an error in the input arguments
-
class
tspy.data_structures.
Observation
(tsc, time_tick=-1, value=None)¶ Bases:
object
Basic storage unit for a single time-series observation
Examples
create a simple observation
>>> import tspy
>>> obs = tspy.observation(1,1)
>>> obs
TimeStamp: 1     Value: 1
Methods
__call__
(timestamp, value)Call self as a function.
-
property
time_tick
¶ - Returns
- int
the time-tick associated with this observation
-
property
value
¶ - Returns
- any
the value associated with this observation
-
class
tspy.data_structures.
ObservationCollection
(tsc, j_observations=None)¶ Bases:
object
A special form of materialized time-series (sorted collection) whose values are of type
Observation
.
An observation-collection has the following properties:
Sorted by observation time-tick
Support for observations with duplicate time-ticks
Duplicate time-ticks will keep ordering
Examples
create an observation-collection
>>> import tspy
>>> ts_builder = tspy.builder()
>>> ts_builder.add(tspy.observation(1,1))
>>> ts_builder.add(tspy.observation(2,2))
>>> ts_builder.add(tspy.observation(1,3))
>>> observations = ts_builder.result()
>>> observations
[(1,1),(1,3),(2,2)]
iterate through this collection
>>> for o in observations:
...     print(o.time_tick, ",", o.value)
1 , 1
1 , 3
2 , 2
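The ordering properties listed above (sorted by time-tick, duplicates allowed, duplicates keep insertion order) correspond to a stable sort on the time-tick. A minimal sketch in plain Python, without tspy and with observations modeled as (time_tick, value) tuples:

```python
# Observations in the order they were added to the builder in the example above.
added = [(1, 1), (2, 2), (1, 3)]

# Python's sorted() is stable, so observations that share a time-tick
# keep the relative order in which they were added.
ordered = sorted(added, key=lambda obs: obs[0])
print(ordered)
```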
Methods
Java
()mapping to a compatible class in Java via Py4J
ceiling
(time_tick)get the ceiling observation for the given time-tick.
contains
(time_tick)Checks for containment of time-tick within the collection
first
()get the first observation in this collection.
floor
(time_tick)get the floor observation for the given time-tick.
higher
(time_tick)get the higher observation for the given time-tick.
is_empty
()checks if there is any observation
last
()get the last observation in this collection.
lower
(time_tick)get the lower observation for the given time-tick.
to_time_series
([granularity, start_time])convert this collection to a time-series
-
class
Java
¶ Bases:
object
mapping to a compatible class in Java via Py4J
-
implements
= ['com.ibm.research.time_series.core.utils.ObservationCollection']¶
-
-
ceiling
(time_tick)¶ get the ceiling observation for the given time-tick. The ceiling is defined as the observation which bears the same time-tick as the given time-tick or, if one does not exist, the next higher observation. If no such observation exists in the collection, None will be returned.
- Parameters
- time_tickint
the time-tick
- Returns
Observation
the ceiling observation
-
contains
(time_tick)¶ Checks for containment of time-tick within the collection
- Parameters
- time_tickint
the time-tick
- Returns
- bool
True if an observation in this collection has the given time-tick, otherwise False
-
first
()¶ get the first observation in this collection. The first observation is that observation which has the lowest timestamp in the collection. If 2 observations have the same timestamp, the first observation that was in the collection will be the one returned.
- Returns
Observation
the first observation in this collection
-
floor
(time_tick)¶ get the floor observation for the given time-tick. The floor is defined as the observation which bears the same time-tick as the given time-tick or, if one does not exist, the next lower observation. If no such observation exists in the collection, None will be returned.
- Parameters
- time_tickint
the time-tick
- Returns
Observation
the floor observation
-
higher
(time_tick)¶ get the higher observation for the given time-tick. The higher is defined as the observation which bears a time-tick greater than the given time-tick. If no such observation exists in the collection, None will be returned.
- Parameters
- time_tickint
the time-tick
- Returns
Observation
the higher observation
-
is_empty
()¶ checks if there is any observation
- Returns
- bool
True if no observations exist in this collection, otherwise False
-
last
()¶ get the last observation in this collection. The last observation is that observation which has the highest timestamp in the collection. If 2 observations have the same timestamp, the last observation that was in the collection will be the one returned.
- Returns
Observation
the last observation in this collection
-
lower
(time_tick)¶ get the lower observation for the given time-tick. The lower is defined as the observation which bears a time-tick less than the given time-tick. If no such observation exists in the collection, None will be returned.
- Parameters
- time_tickint
the time-tick
- Returns
Observation
the lower observation
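The four lookups (floor, ceiling, higher, lower) differ only in whether an exact time-tick match counts and in which direction the search goes. The sketch below shows one way to express them with the standard `bisect` module over a sorted list of (time_tick, value) tuples. It does not use tspy, the function names merely mirror the documented methods, and it assumes the nearest neighbor is returned. Which observation is returned among duplicates of the same time-tick is an assumption of this sketch.

```python
from bisect import bisect_left, bisect_right

observations = [(1, "a"), (3, "b"), (3, "c"), (7, "d")]
ticks = [tick for tick, _ in observations]


def floor(time_tick):
    """Observation at time_tick, or the nearest one before it."""
    i = bisect_right(ticks, time_tick)
    return observations[i - 1] if i > 0 else None


def ceiling(time_tick):
    """Observation at time_tick, or the nearest one after it."""
    i = bisect_left(ticks, time_tick)
    return observations[i] if i < len(observations) else None


def lower(time_tick):
    """Nearest observation strictly before time_tick."""
    i = bisect_left(ticks, time_tick)
    return observations[i - 1] if i > 0 else None


def higher(time_tick):
    """Nearest observation strictly after time_tick."""
    i = bisect_right(ticks, time_tick)
    return observations[i] if i < len(observations) else None


print(floor(5), ceiling(5), lower(3), higher(3))
```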
-
property
size
¶ - Returns
- int
the number of observations in this collection
-
to_time_series
(granularity=None, start_time=None)¶ convert this collection to a time-series
- Parameters
- granularitydatetime.timedelta, optional
the granularity for use in time-series
TRS
(default is None if no start_time, otherwise 1ms)
- start_timedatetime, optional
the starting date-time of the time-series (default is None if no granularity, otherwise 1970-01-01 UTC)
- Returns
TimeSeries
a new time-series
-
property
trs
¶ - Returns
TRS
this time-series time-reference-system
-
class
tspy.data_structures.
Segment
(tsc, j_observations, start=None, end=None)¶ Bases:
tspy.data_structures.observations.ObservationCollection.ObservationCollection
A special form of observation-collection which holds additional information as to how the segment was created. Segments are usually created through the use of a segmentation transform.
Notes
a segment’s start/end need not equal its first/last time-tick
Methods
Java
()mapping to a compatible class in Java via Py4J
ceiling
(time_tick)get the ceiling observation for the given time-tick.
contains
(time_tick)Checks for containment of time-tick within the collection
first
()get the first observation in this collection.
floor
(time_tick)get the floor observation for the given time-tick.
higher
(time_tick)get the higher observation for the given time-tick.
is_empty
()checks if there is any observation
last
()get the last observation in this collection.
lower
(time_tick)get the lower observation for the given time-tick.
to_time_series
([granularity, start_time])convert this collection to a time-series
toString
-
property
end
¶ - Returns
- int
end time-tick of window at instantiation time
-
property
observations
¶ - Returns
ObservationCollection
the underlying collection of observations in this segment
-
property
start
¶ - Returns
- int
start time-tick of window at instantiation time
-
toString
()¶
-
class
tspy.data_structures.
SegmentMultiTimeSeries
(tsc, j_mts)¶ Bases:
tspy.data_structures.multi_time_series.MultiTimeSeries.MultiTimeSeries
A special form of multi-time-series that consists of observations with a value of type
Segment
- Attributes
keys
Returns
Methods
aggregate
(zero, seq_func, comb_func)aggregate all series in this multi-time-series to produce a single value
aggregate_series
(list_to_val_func)aggregate all time-series in the multi-time-series using a summation function to produce a single time-series
aggregate_series_with_key
(list_to_val_func)aggregate all time-series with key in the multi-time-series using a summation function to produce a single time-series
align
(key[, interp_func])align all time-series on a key
cache
([cache_size])suggest to the multi-time-series to cache values
collect
([inclusive])collect and materialize this multi-time-series
collect_series
(key[, inclusive])get a collection of observations given a key
describe
()retrieve a
NumStats
object per time-series computed from all values in this multi-time-series (double)
fillna
(interpolator[, null_value])produce a new multi-time-series which is the result of filling all null values.
filter
(func)produce a new multi-time-series which is the result of filtering by each observation’s value given a filter function.
filter_series
(func)filter each time-series by its time-series object
filter_series_key
(func)filter each time-series by its key
flatten
([key_func])converts this segment-multi-time-series into a multi-time-series where each time-series will be the result of a single segment
forecast
(num_predictions, fm[, …])forecast the next num_predictions using a forecasting model for each time-series
full_align
(multi_time_series[, …])align two multi-time-series based on a temporal full join strategy and optionally interpolate missing values
full_join
(multi_time_series[, join_func, …])join two multi-time-series based on a temporal full join strategy and optionally interpolate missing values
get_time_series
(key)get a time-series given a key
get_values
(start, end[, inclusive])get all values between a range in this multi-time-series
inner_align
(multi_time_series)align two multi-time-series based on a temporal inner join strategy
inner_join
(multi_time_series[, join_func])join two multi-time-series based on a temporal inner join strategy
left_align
(multi_time_series[, interp_func])align two multi-time-series based on a temporal left join strategy and optionally interpolate missing values
left_join
(multi_time_series[, join_func, …])join two multi-time-series based on a temporal left join strategy and optionally interpolate missing values
left_outer_align
(multi_time_series[, …])align two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values
left_outer_join
(multi_time_series[, …])join two multi-time-series based on a temporal left outer join strategy and optionally interpolate missing values
map
(func)produce a new multi-time-series where each observation’s value in this multi-time-series is mapped to a new observation value
map_series
(func)map each
ObservationCollection
to a new collection of observations
map_series_key
(func)map each time-series key to a new key
map_series_with_key
(func)map each
ObservationCollection
to a new collection of observations giving access to each time-series key
pair_wise_transform
(binary_transform)produce a new multi-time-series which is the product of performing a pair-wise transform against all combination of keys
print
([start, end, inclusive])print this multi-time-series
reduce
(func)reduce each time-series in this multi-time-series to a single value
reduce_range
(func, start, end[, inclusive])reduce each time-series in this multi-time-series to a single value given a range
resample
(period, interp_func)produce a new multi-time-series by resampling each time-series to a given periodicity
right_align
(multi_time_series[, interp_func])align two multi-time-series based on a temporal right join strategy and optionally interpolate missing values
right_join
(multi_time_series[, join_func, …])join two multi-time-series based on a temporal right join strategy and optionally interpolate missing values
right_outer_align
(multi_time_series[, …])align two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values
right_outer_join
(multi_time_series[, …])join two multi-time-series based on a temporal right outer join strategy and optionally interpolate missing values
segment
(window[, step, enforce_bounds])produce a new segment-multi-time-series by performing a sliding-window segmentation over each time-series
segment_by
(func)produce a new segment-multi-time-series by performing a group-by operation on each observation’s value for each time-series
segment_by_anchor
(func, left_delta, right_delta)produce a new segment-multi-time-series from performing an anchor-based segmentation over each time-series.
segment_by_changepoint
([change_point])produce a new segment-multi-time-series by performing a change-point based segmentation.
segment_by_time
(window, step)produce a new segment-multi-time-series by performing a time-based segmentation over each time-series
time_series
(key)get a time-series given a key
to_df
([format, inclusive])convert this multi-time-series to a pandas dataframe.
to_df_instants
([inclusive])convert this multi-time-series to an instants pandas dataframe.
to_df_observations
([inclusive])convert this multi-time-series to an observations pandas dataframe.
to_segments
(segment_transform)produce a new segment-multi-time-series from a segmentation transform
transform
(*args)produce a new multi-time-series which is the result of performing a transform over each time-series.
trs
(key)get a time-series time-reference-system given a key
uncache
()remove the multi-time-series caching mechanism
with_trs
([granularity, time_tick])create a new multi-time-series with its timestamps mapped based on a granularity and start_time.
write
([start, end, inclusive])create a multi-time-series-writer given a range
-
flatten
(key_func=None)¶ converts this segment-multi-time-series into a multi-time-series where each time-series will be the result of a single segment
- Parameters
- key_funcfunc, optional
operation where given a segment, produce a unique key (default is create key based on start of segment)
- Returns
MultiTimeSeries
a new multi-time-series
Notes
this is not a lazy operation and will materialize the time-series
Examples
create a simple multi-time-series
>>> import tspy
>>> mts_orig = tspy.multi_time_series.dict({'a': tspy.data_structures.list([1,2,3]), 'b': tspy.data_structures.list([4,5,6])})
>>> mts_orig
a time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
b time series
------------------------------
TimeStamp: 0     Value: 4
TimeStamp: 1     Value: 5
TimeStamp: 2     Value: 6
segment the multi-time-series using a simple sliding window
>>> mts_sliding = mts_orig.segment(2)
>>> mts_sliding
a time series
------------------------------
TimeStamp: 0     Value: original bounds: (0,1) actual bounds: (0,1) observations: [(0,1),(1,2)]
TimeStamp: 1     Value: original bounds: (1,2) actual bounds: (1,2) observations: [(1,2),(2,3)]
b time series
------------------------------
TimeStamp: 0     Value: original bounds: (0,1) actual bounds: (0,1) observations: [(0,4),(1,5)]
TimeStamp: 1     Value: original bounds: (1,2) actual bounds: (1,2) observations: [(1,5),(2,6)]
flatten the segments into a single multi-time-series
>>> mts = mts_sliding.flatten()
>>> mts
(a, 0) time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
(b, 0) time series
------------------------------
TimeStamp: 0     Value: 4
TimeStamp: 1     Value: 5
(a, 1) time series
------------------------------
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
(b, 1) time series
------------------------------
TimeStamp: 1     Value: 5
TimeStamp: 2     Value: 6
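The keying scheme in the output above, where each flattened series is identified by the original key paired with a per-segment key, can be sketched in plain Python without tspy. The helper name `flatten_segments` is hypothetical, and the default key is approximated here by the time-tick of the segment's first observation, whereas the library documents it as the start of the segment.

```python
def flatten_segments(segmented_mts, key_func=None):
    """Turn {key: [segment, ...]} into {(key, segment_key): segment}."""
    if key_func is None:
        # Stand-in for "start of segment": the first observation's time-tick.
        key_func = lambda segment: segment[0][0]
    return {
        (key, key_func(segment)): segment
        for key, segments in segmented_mts.items()
        for segment in segments
    }


segmented = {
    "a": [[(0, 1), (1, 2)], [(1, 2), (2, 3)]],
    "b": [[(0, 4), (1, 5)], [(1, 5), (2, 6)]],
}
flat = flatten_segments(segmented)
for key, observations in flat.items():
    print(key, observations)
```

A custom `key_func` must produce a distinct key for each segment of a series, otherwise later segments would overwrite earlier ones in this dict-based model.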
-
class
tspy.data_structures.
SegmentTimeSeries
(tsc, j_ts, trs=None)¶ Bases:
tspy.data_structures.time_series.TimeSeries.TimeSeries
A special form of time-series that consists of observations with a value of type
Segment
- Attributes
trs
Returns
Methods
cache
([cache_size])suggest to the time-series to cache values
collect
([inclusive])collect all observations in this time-series
concat
(other_time_series)produce a new time-series which is the result of concatenating two time-series
count
([inclusive])count the current number of observations in this time-series
describe
()retrieve time-series statistics computed from all values in this time-series
fillna
(interpolator[, null_value])produce a new time-series which is the result of filling all null values.
filter
(func)produce a new time-series which is the result of filtering by each observation’s value given a filter function.
flatmap
(func)produce a new time-series where each observation’s value in this time-series is mapped to 0 to N new values.
flatten
([key_func])converts this segment-time-series into a multi-time-series where each time-series will be the result of a single segment
forecast
(num_predictions, fm[, …])forecast the next num_predictions using a forecasting model
full_align
(time_series[, left_interp_func, …])align two time-series based on a temporal full join strategy and optionally interpolate missing values
full_join
(time_series[, join_func, …])join two time-series based on a temporal full join strategy and optionally interpolate missing values
get_values
(start, end[, inclusive])get all values between a range in this time-series
inner_align
(time_series)align two time-series based on a temporal inner join strategy
inner_join
(time_series[, join_func])join two time-series based on a temporal inner join strategy
lag
(lag_amount)produce a new time-series which is a lagged version of the current time-series.
left_align
(time_series[, interp_func])align two time-series based on a temporal left join strategy and optionally interpolate missing values
left_join
(time_series[, join_func, interp_func])join two time-series based on a temporal left join strategy and optionally interpolate missing values
left_outer_align
(time_series[, interp_func])align two time-series based on a temporal left outer join strategy and optionally interpolate missing values
left_outer_join
(time_series[, join_func, …])join two time-series based on a temporal left outer join strategy and optionally interpolate missing values
map
(func)produce a new time-series where each observation’s value in this time-series is mapped to a new observation value
map_with_index
(func)produce a new time-series where each observation’s value in this time-series is mapped given the old value and an index to a new observation value
print
([start, end, inclusive, human_readable])print this time-series
reduce
(*args)reduce this time-series or two time-series to a single value
resample
(period, func)produce a new time-series by resampling the current time-series to a given periodicity
right_align
(time_series[, interp_func])align two time-series based on a temporal right join strategy and optionally interpolate missing values
right_join
(time_series[, join_func, interp_func])join two time-series based on a temporal right join strategy and optionally interpolate missing values
right_outer_align
(time_series[, interp_func])align two time-series based on a temporal right outer join strategy and optionally interpolate missing values
right_outer_join
(time_series[, join_func, …])join two time-series based on a temporal right outer join strategy and optionally interpolate missing values
segment
(window[, step, enforce_size])produce a new segment-time-series by performing a sliding-window segmentation over the time-series
segment_by
(func)produce a new segment-time-series by performing a group-by operation on each observation’s value
segment_by_anchor
(func, left_delta, right_delta)produce a new segment-time-series from performing an anchor-based segmentation over the time-series.
segment_by_changepoint
([change_point])produce a new segment-time-series by performing a change-point based segmentation.
segment_by_marker
(*args, **kwargs)produce a new segment-time-series from performing a marker based segmentation.
segment_by_time
(window, step)produce a new segment-time-series by performing a time-based segmentation over the time-series
shift
(shift_amount[, default_value])produce a new time-series which is a shifted version of the current time-series.
to_df
([inclusive])convert this time-series to a pandas dataframe
to_segments
(segment_transform)produce a new segment-time-series from a segmentation transform
transform
(*args)produce a new time-series which is the result of performing a transform over the time-series.
uncache
()remove the time-series caching mechanism
with_trs
([granularity, start_time])create a new time-series with its timestamps mapped based on a granularity and start_time.
write
([start, end, inclusive])create a time-series-writer given a range
-
cache
(cache_size=None)¶ suggest to the time-series to cache values
- Parameters
- cache_sizeint, optional
the max cache size (default is max long)
- Returns
TimeSeries
a new time-series
Notes
this is a lazy operation and will only suggest to the time-series to save values once computed
-
flatmap
(func)¶ produce a new time-series where each observation’s value in this time-series is mapped to 0 to N new values.
- Parameters
- funcfunc
value mapping function which returns a list of values
- Returns
TimeSeries
a new time-series with its values flat-mapped
Notes
an observation’s time-tick will be duplicated if a single value maps to multiple values
Examples
create a simple time-series
>>> import tspy
>>> ts_orig = tspy.time_series([1, 2, 3])
>>> ts_orig
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
flat map each time-series observation value by duplicating the value
>>> ts = ts_orig.flatmap(lambda x: [x, x])
>>> ts
TimeStamp: 0     Value: 1
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
TimeStamp: 2     Value: 3
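The flat-map behavior, including the duplicated time-ticks mentioned in the Notes, can be modeled in a few lines of plain Python without tspy. The helper name `flatmap_observations` is hypothetical and observations are modeled as (time_tick, value) tuples.

```python
def flatmap_observations(observations, func):
    """Map each value to zero or more values, repeating the original time-tick."""
    return [(tick, new) for tick, value in observations for new in func(value)]


ts_orig = [(0, 1), (1, 2), (2, 3)]
duplicated = flatmap_observations(ts_orig, lambda x: [x, x])
odd_only = flatmap_observations(ts_orig, lambda x: [x] if x % 2 else [])
print(duplicated)
print(odd_only)
```

Returning an empty list from the mapping function drops the observation entirely, which is the "0 to N" part of the description.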
-
flatten
(key_func=None)¶ converts this segment-time-series into a multi-time-series where each time-series will be the result of a single segment
- Parameters
- key_funcfunc, optional
operation where given a segment, produce a unique key (default is create key based on start of segment)
- Returns
MultiTimeSeries
a new multi-time-series
Notes
this is not a lazy operation and will materialize the time-series
Examples
create a simple time-series
>>> import tspy
>>> ts_orig = tspy.data_structures.list([1,2,3,4,5,6])
>>> ts_orig
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
TimeStamp: 3     Value: 4
TimeStamp: 4     Value: 5
TimeStamp: 5     Value: 6
segment the time-series using a simple sliding window
>>> ts_sliding = ts_orig.segment(2)
>>> ts_sliding
TimeStamp: 0     Value: original bounds: (0,1) actual bounds: (0,1) observations: [(0,1),(1,2)]
TimeStamp: 1     Value: original bounds: (1,2) actual bounds: (1,2) observations: [(1,2),(2,3)]
TimeStamp: 2     Value: original bounds: (2,3) actual bounds: (2,3) observations: [(2,3),(3,4)]
TimeStamp: 3     Value: original bounds: (3,4) actual bounds: (3,4) observations: [(3,4),(4,5)]
TimeStamp: 4     Value: original bounds: (4,5) actual bounds: (4,5) observations: [(4,5),(5,6)]
flatten the segments into a single multi-time-series
>>> mts = ts_sliding.flatten()
>>> mts
0 time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
1 time series
------------------------------
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
2 time series
------------------------------
TimeStamp: 2     Value: 3
TimeStamp: 3     Value: 4
3 time series
------------------------------
TimeStamp: 3     Value: 4
TimeStamp: 4     Value: 5
4 time series
------------------------------
TimeStamp: 4     Value: 5
TimeStamp: 5     Value: 6
-
map
(func)¶ produce a new time-series where each observation’s value in this time-series is mapped to a new observation value
- Parameters
- funcfunc
value mapping function
- Returns
TimeSeries
a new time-series with its values re-mapped
Examples
create a simple time-series
>>> import tspy
>>> ts_orig = tspy.time_series([1, 2, 3])
>>> ts_orig
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
add one to each value of the time-series and produce a new time-series
>>> ts = ts_orig.map(lambda x: x + 1)
>>> ts
TimeStamp: 0 Value: 2
TimeStamp: 1 Value: 3
TimeStamp: 2 Value: 4
map each value of the time-series to a string and produce a new time-series
>>> ts = ts_orig.map(lambda x: "value - " + str(x))
>>> ts
TimeStamp: 0 Value: value - 1
TimeStamp: 1 Value: value - 2
TimeStamp: 2 Value: value - 3
add one to each value of the time-series using high-performance expressions
>>> from tspy.functions import expressions as exp
>>> ts = ts_orig.map(exp.add(exp.id(), 1))
>>> ts
TimeStamp: 0 Value: 2.0
TimeStamp: 1 Value: 3.0
TimeStamp: 2 Value: 4.0
-
class tspy.data_structures.Stats(tsc, j_stats)
Bases: object
time-series statistics object used when describing a time-series
- Attributes
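top
any: element that occurs most frequently
unique
int: number of unique elements
frequency
int: number of occurrences of the top element
first
Observation: the first observation
last
Observation: the last observation
count
int: number of elements in the time-series
min_inter_arrival_time
int: min time between observations
max_inter_arrival_time
int: max time between observations
mean_inter_arrival_time
float: mean time between observations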
Methods
toString
-
property count
- Returns
- int
number of elements in the time-series
-
property first
- Returns
- Observation
the first observation
-
property frequency
- Returns
- int
number of occurrences of the top element
-
property last
- Returns
- Observation
the last observation
-
property max_inter_arrival_time
- Returns
- int
max time between observations
-
property mean_inter_arrival_time
- Returns
- float
mean time between observations
-
property min_inter_arrival_time
- Returns
- int
min time between observations
-
toString()
-
property top
- Returns
- any
element that occurs most frequently
-
property unique
- Returns
- int
number of unique elements
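To make the meaning of these statistics concrete, the following plain-Python sketch computes the same quantities for a small list of `(timestamp, value)` pairs. It is not how tspy computes them: the helper `describe` is hypothetical, and it assumes that `unique` counts distinct values and that inter-arrival times are differences between consecutive timestamps.

```python
from collections import Counter
from statistics import mean


def describe(observations):
    """Summarise a non-empty list of (timestamp, value) pairs sorted by timestamp."""
    timestamps = [t for t, _ in observations]
    values = [v for _, v in observations]

    # Most frequent value and how often it occurs.
    top, frequency = Counter(values).most_common(1)[0]

    # Gaps between consecutive observations (empty for a single observation).
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]

    return {
        "count": len(observations),
        "first": observations[0],
        "last": observations[-1],
        "top": top,
        "frequency": frequency,
        "unique": len(set(values)),
        "min_inter_arrival_time": min(gaps) if gaps else None,
        "max_inter_arrival_time": max(gaps) if gaps else None,
        "mean_inter_arrival_time": mean(gaps) if gaps else None,
    }


stats = describe([(0, "a"), (2, "b"), (3, "a"), (7, "a")])
print(stats["top"], stats["frequency"], stats["unique"])  # a 3 2
print(stats["min_inter_arrival_time"], stats["max_inter_arrival_time"])  # 1 4
```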
-
class tspy.data_structures.TRS(tsc, granularity, start_time, j_trs=None)
Bases: object
A time reference system (TRS) is a local, regional, or global system used to identify time. It defines a specific projection for forward and reverse mapping between a timestamp and its numeric representation. A familiar example is UTC time, which maps a timestamp (Jan 1, 2019 12:00 am GMT) to a 64-bit integer value (1546300800000) that captures the number of milliseconds elapsed since Jan 1, 1970 12:00 am GMT. Generally speaking, the timestamp is better suited for human readability, while the numeric representation is better suited for machine processing.
Notes
A timestamp is mapped into a numeric representation by computing the number of elapsed time-ticks since the offset. A numeric representation is scaled by the time-tick and shifted by the offset when it is mapped back to a timestamp.
Forward and reverse projections may be lossy. For instance, if the true time granularity of a time-series is in seconds, then forward and reverse mapping of timestamps 9:00:01 and 9:00:02 (to be read as hh:mm:ss) with a time-tick of one minute would result in timestamps 9:00:00 and 9:00:00 (respectively). In this example, a time-series whose granularity is in seconds is being mapped to minutes, and thus the reverse mapping loses information. However, if the mapped granularity is finer than the granularity of the input time-series (more specifically, if the time-series granularity is an integral multiple of the mapped granularity), then the forward and reverse projection is guaranteed to be lossless. For example, mapping a time-series whose granularity is in minutes to seconds and reverse projecting it to minutes would result in lossless reconstruction of the timestamps.
By default, the granularity is one millisecond and the start time is 1st Jan 1970 00:00:00.
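The forward and reverse projections described in these notes can be sketched with the standard `datetime` module. This is a minimal illustration under stated assumptions rather than the TRS implementation: `forward_map`, `reverse_map`, and `round_trip` are hypothetical names, the default offset is taken to be the Unix epoch in UTC, and the forward mapping is assumed to truncate to whole time-ticks.

```python
from datetime import datetime, timedelta, timezone

# Assumed default offset: the Unix epoch, taken here as UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def forward_map(timestamp, granularity, start_time=EPOCH):
    """Timestamp -> number of whole time-ticks elapsed since start_time."""
    return (timestamp - start_time) // granularity


def reverse_map(index, granularity, start_time=EPOCH):
    """Time-tick index -> timestamp: scale by the tick, shift by the offset."""
    return start_time + index * granularity


def round_trip(timestamp, granularity, start_time=EPOCH):
    """Forward projection followed by reverse projection."""
    index = forward_map(timestamp, granularity, start_time)
    return reverse_map(index, granularity, start_time)


millisecond = timedelta(milliseconds=1)
second = timedelta(seconds=1)
minute = timedelta(minutes=1)

# Millisecond ticks since the epoch reproduce the familiar epoch-millis value.
new_year_2019 = datetime(2019, 1, 1, tzinfo=timezone.utc)
epoch_millis = forward_map(new_year_2019, millisecond)
print(epoch_millis)  # 1546300800000

# Second-level timestamps mapped to one-minute ticks both collapse onto 9:00:00.
t1 = datetime(2019, 1, 1, 9, 0, 1, tzinfo=timezone.utc)
t2 = datetime(2019, 1, 1, 9, 0, 2, tzinfo=timezone.utc)
lossy = [round_trip(t, minute) for t in (t1, t2)]

# A minute-level timestamp mapped to one-second ticks survives the round trip.
t3 = datetime(2019, 1, 1, 9, 5, tzinfo=timezone.utc)
lossless = round_trip(t3, second)
```

The integer division in `forward_map` is where information is discarded: any remainder smaller than one tick is dropped, so the round trip is exact only when every timestamp falls on a tick boundary.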
- Attributes
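granularity
datetime.timedelta: granularity that captures the time-tick granularity (e.g., 1 minute)
start_time
datetime: start-time that captures an offset to the start time of the time-series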
Methods
to_index(time)
get the given millisecond long as a time-tick index
to_long_lower(index)
get the given time-tick as a millisecond long (lower bound)
to_long_upper(index)
get the given time-tick as a millisecond long (upper bound)
-
property granularity
- Returns
- granularity : datetime.timedelta
granularity that captures the time-tick granularity (e.g., 1 minute)
-
property start_time
- Returns
- start_time : datetime
start-time that captures an offset (e.g., 1st Jan 2019 12:00 am US Eastern Time) to the start time of the time-series
-
to_index(time)
get the given millisecond long as a time-tick index
- Parameters
- time : int or datetime
the time to convert to an index
- Returns
- int
a time-tick index
-
to_long_lower(index)
get the given time-tick as a millisecond long (lower bound)
- Parameters
- index : int
the time-tick to convert
- Returns
- int
a millisecond long
-
to_long_upper(index)
get the given time-tick as a millisecond long (upper bound)
- Parameters
- index : int
the time-tick to convert
- Returns
- int
a millisecond long
Subpackages
- tspy.data_structures.forecasting package
- tspy.data_structures.io package
- Submodules
- tspy.data_structures.io.DataSink module
- tspy.data_structures.io.MultiDataSink module
- tspy.data_structures.io.MultiTimeSeriesWriteFormat module
- tspy.data_structures.io.MultiTimeSeriesWriter module
- tspy.data_structures.io.PullStreamMultiTimeSeriesReader module
- tspy.data_structures.io.PullStreamTimeSeriesReader module
- tspy.data_structures.io.PushStreamMultiTimeSeriesReader module
- tspy.data_structures.io.PushStreamTimeSeriesReader module
- tspy.data_structures.io.PythonQueueStreamMultiTimeSeriesReader module
- tspy.data_structures.io.PythonQueueStreamTimeSeriesReader module
- tspy.data_structures.io.TimeSeriesReader module
- tspy.data_structures.io.TimeSeriesWriteFormat module
- tspy.data_structures.io.TimeSeriesWriter module
- tspy.data_structures.ml package
- tspy.data_structures.multi_time_series package
- tspy.data_structures.observations package
- tspy.data_structures.stream_multi_time_series package
- tspy.data_structures.stream_time_series package
- tspy.data_structures.time_series package
- tspy.data_structures.transforms package