tspy package
Entry point to time-series data-structure creation functions
Use the time-series related data builders, e.g.
function | description |
---|---|
tspy.observation | create one observation |
tspy.observations | create one observation collection |
tspy.record | create one record |
tspy.time_series | create one time-series |
tspy.multi_time_series | create one multi-time-series |
tspy.builder | create a time-series builder |
tspy.stream_time_series module : APIs for connecting stream data into time-series form
tspy.stream_multi_time_series module : APIs for connecting stream data into multi-time-series form
tspy.functions module : APIs for performing different time-series related operations, e.g. reduce, transforms, segment
tspy.models module : APIs for loading/creating a time-series-based model
tspy.forecasters module : APIs for different forecasting models
tspy.exceptions module
tspy.ml module : APIs for different machine-learning methods
tspy.data_structures module : tspy-specific data structure (internally used)
tspy.multi_time_series(*args, **kwargs)
creates a multi-time-series object using data in one of the following forms:
- dict
- dataframe
- time-series-reader
- observation collection
- Parameters
- data : dict or pandas.DataFrame
dict: one key per ObservationCollection or TimeSeries
dataframe: there are two ways to convert to a multi-time-series (MTS), depending on the use-case:
(1) group by the values of a given column [e.g. a temperature time series and a set of locations (keys)]
(2) turn each column into its own time-series [a single timestamp and multiple metrics, e.g. temperature and humidity columns]
- key_column : string
(only used when data is a pandas DataFrame and use-case (1)) name of the column containing the key; each key value is used to group data into a single time-series. IMPORTANT: key_column and key_columns are mutually exclusive.
- key_columns : list, optional
(only used when data is a pandas DataFrame and use-case (2)) columns to use in multi-time-series creation (default is all columns), i.e. each column is turned into its own time-series component. IMPORTANT: key_column and key_columns are mutually exclusive.
- ts_column : string, optional
(only used when data is a pandas DataFrame) name of the column containing the time-ticks (default: the time-tick is based on the index into the dataframe)
- value_column : list or string, optional
(only used when data is a pandas DataFrame and use-case (1)) name(s) of the column(s) containing the values (default is all columns)
- granularity : datetime.timedelta, optional
the granularity used in the time-series TRS, i.e. time-reference-system (default is None if no start_time is given, otherwise 1ms)
- start_time : datetime, optional
the starting date-time of the time-series (default is None if no granularity is given, otherwise 1970-01-01 UTC)
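The two DataFrame use-cases described above can be pictured without tspy at all. The following is a minimal pure-Python sketch, not tspy's implementation: rows are plain dicts rather than a DataFrame, the helper names `group_by_key` and `split_by_column` are hypothetical, and it assumes observations end up ordered by time-tick, as the examples in this section suggest.

```python
from collections import defaultdict

# Rows mirror the example dataframe used in this section.
rows = [
    {"letters": "a", "timestamp": 1, "numbers": 27.0},
    {"letters": "b", "timestamp": 3, "numbers": 4.0},
    {"letters": "a", "timestamp": 5, "numbers": 17.0},
    {"letters": "a", "timestamp": 3, "numbers": 7.0},
    {"letters": "b", "timestamp": 2, "numbers": 45.0},
]


def group_by_key(rows, key_column, ts_column):
    """Use-case (1): one series per distinct value of key_column (hypothetical helper)."""
    series = defaultdict(list)
    for row in rows:
        # The remaining columns become the observation's value.
        value = {k: v for k, v in row.items() if k not in (key_column, ts_column)}
        series[row[key_column]].append((row[ts_column], value))
    # Order each series by time-tick; sorted() is stable for equal ticks.
    return {key: sorted(obs, key=lambda o: o[0]) for key, obs in series.items()}


def split_by_column(rows, ts_column, key_columns=None):
    """Use-case (2): one series per column (hypothetical helper)."""
    if key_columns is None:
        # Default: every column except the time-tick column.
        key_columns = [c for c in rows[0] if c != ts_column]
    ordered = sorted(rows, key=lambda r: r[ts_column])
    return {c: [(r[ts_column], r[c]) for r in ordered] for c in key_columns}


by_key = group_by_key(rows, key_column="letters", ts_column="timestamp")
by_column = split_by_column(rows, ts_column="timestamp")
```

Here `by_key` holds one series per letter, while `by_column` holds one series per remaining column, which is the distinction between key_column and key_columns.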
- Returns
MultiTimeSeries
a new multi-time-series
- Raises
- ValueError
If there is an error in the input arguments, e.g. an unsupported data type
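Since key_column and key_columns cannot be combined, a caller may want to check its own arguments before building a multi-time-series. The sketch below is illustrative only: `check_key_arguments` is a hypothetical helper, and whether tspy itself raises ValueError for this particular combination is an assumption, not something this page states.

```python
def check_key_arguments(key_column=None, key_columns=None):
    """Hypothetical pre-check: reject calls that pass both arguments.

    Returns a label for the conversion mode that would apply.
    """
    if key_column is not None and key_columns is not None:
        # Assumption: mirrors the documented ValueError for bad input arguments.
        raise ValueError("pass either key_column or key_columns, not both")
    if key_column is not None:
        return "group-by-key"       # use-case (1)
    return "column-per-series"      # use-case (2), also the default
```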
Examples
create a dict with observation-collection values
>>> import tspy
>>> my_dict = {"ts1": tspy.time_series([1,2,3]).collect(), "ts2": tspy.time_series([4,5,6]).collect()}
>>> my_dict
{'ts1': [(0,1),(1,2),(2,3)], 'ts2': [(0,4),(1,5),(2,6)]}
create a multi-time-series from dict without a time-reference-system
>>> mts = tspy.multi_time_series(my_dict)
>>> mts
ts2 time series
------------------------------
TimeStamp: 0     Value: 4
TimeStamp: 1     Value: 5
TimeStamp: 2     Value: 6
ts1 time series
------------------------------
TimeStamp: 0     Value: 1
TimeStamp: 1     Value: 2
TimeStamp: 2     Value: 3
create a simple df with a single index
>>> import numpy as np
>>> import pandas as pd
>>> data = np.array([['', 'letters', 'timestamp', "numbers"],
...                  ['', "a", 1, 27],
...                  ['', "b", 3, 4],
...                  ['', "a", 5, 17],
...                  ['', "a", 3, 7],
...                  ['', "b", 2, 45]
...                  ])
>>> df = pd.DataFrame(data=data[1:, 1:],
...                   columns=data[0, 1:]).astype(dtype={'letters': 'object', 'timestamp': 'int64', 'numbers': 'float64'})
>>> df
  letters  timestamp  numbers
0       a          1     27.0
1       b          3      4.0
2       a          5     17.0
3       a          3      7.0
4       b          2     45.0
create a multi-time-series from a df using instants format
>>> mts = tspy.multi_time_series(df, ts_column='timestamp')
>>> mts
numbers time series
------------------------------
TimeStamp: 1     Value: 27.0
TimeStamp: 2     Value: 45.0
TimeStamp: 3     Value: 4.0
TimeStamp: 3     Value: 7.0
TimeStamp: 5     Value: 17.0
letters time series
------------------------------
TimeStamp: 1     Value: a
TimeStamp: 2     Value: b
TimeStamp: 3     Value: b
TimeStamp: 3     Value: a
TimeStamp: 5     Value: a
create a multi-time-series from a df using observations format where the key is letters
>>> mts = tspy.multi_time_series(df, key_column="letters", ts_column='timestamp')
>>> mts
a time series
------------------------------
TimeStamp: 1     Value: {numbers=27.0}
TimeStamp: 3     Value: {numbers=7.0}
TimeStamp: 5     Value: {numbers=17.0}
b time series
------------------------------
TimeStamp: 2     Value: {numbers=45.0}
TimeStamp: 3     Value: {numbers=4.0}
tspy.observation(time_tick, value)
create an observation
- Parameters
- time_tick : int
the observation's time-tick
- value : any
the observation's value
- Returns
a new observation
tspy.observations(*varargs)
returns an ObservationCollection
- Parameters
- observations : varargs
either empty or a variable number of observations
- Returns
ObservationCollection
a new observation-collection
tspy.record(**kwargs)
create a record type (similar to dict)
- Parameters
- kwargs : named args
key/value arguments
- Returns
record
a dict-like structure that is handled for high performance in time-series
tspy.time_series(*args, **kwargs)
creates a single-time-series object using data in one of the following forms:
- list
- dataframe
- time-series-reader
- observation collection
- Parameters
- data : list or pandas.DataFrame or TimeSeriesReader or ObservationCollection
- ts_func : func, optional (default None)
(only used when data is a list) if given, the function used to combine duplicate time-ticks (default: do not combine)
- granularity : datetime.timedelta, optional
the granularity used in the time-series TRS, i.e. time-reference-system (default is None if no start_time is given, otherwise 1ms)
- start_time : datetime, optional
the starting date-time of the time-series (default is None if no granularity is given, otherwise 1970-01-01 UTC)
- ts_column : string, optional
(only used when data is a pandas DataFrame) name of the column containing the timestamps (default: timestamps are based on the record index)
- value_column : string or list, optional
(only used when data is a pandas DataFrame) name(s) of the column(s) containing the values (default: the value is built from all columns)
- Returns
TimeSeries
a new time-series
Examples
create a simple pandas dataframe
>>> import numpy as np
>>> import pandas as pd
>>> data = np.array([['', 'key', 'timestamp', "value"],
...                  ['', "a", 1, 27],
...                  ['', "b", 3, 4],
...                  ['', "a", 5, 17],
...                  ['', "a", 3, 7],
...                  ['', "b", 2, 45]
...                  ])
>>> df = pd.DataFrame(data=data[1:, 1:],
...                   index=data[1:, 0],
...                   columns=data[0, 1:]).astype(dtype={'key': 'object', 'timestamp': 'int64', 'value': 'float64'})
>>> df
 key  timestamp  value
   a          1   27.0
   b          3    4.0
   a          5   17.0
   a          3    7.0
   b          2   45.0
create a time-series from a dataframe specifying a timestamp and value column
>>> ts = tspy.time_series(df, ts_column="timestamp", value_column="value")
>>> ts
TimeStamp: 1     Value: 27.0
TimeStamp: 2     Value: 45.0
TimeStamp: 3     Value: 4.0
TimeStamp: 3     Value: 7.0
TimeStamp: 5     Value: 17.0
create a time-series from a dataframe specifying only a timestamp column; all other columns are used and stored as the value in a single dictionary
>>> ts = tspy.time_series(df, ts_column="timestamp")
>>> ts
TimeStamp: 1     Value: {value=27.0, key=a}
TimeStamp: 2     Value: {value=45.0, key=b}
TimeStamp: 3     Value: {value=4.0, key=b}
TimeStamp: 3     Value: {value=7.0, key=a}
TimeStamp: 5     Value: {value=17.0, key=a}
create a time-series from a dataframe specifying no timestamp or value column
>>> ts = tspy.time_series(df)
>>> ts
TimeStamp: 0     Value: {value=27.0, key=a, timestamp=1}
TimeStamp: 1     Value: {value=4.0, key=b, timestamp=3}
TimeStamp: 2     Value: {value=17.0, key=a, timestamp=5}
TimeStamp: 3     Value: {value=7.0, key=a, timestamp=3}
TimeStamp: 4     Value: {value=45.0, key=b, timestamp=2}
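The dataframe examples above differ only in where the time-tick comes from and which columns form the value. A rough pure-Python sketch of that rule follows; it is not tspy code, `to_observations` is a hypothetical helper, and rows are plain dicts standing in for DataFrame records.

```python
rows = [
    {"key": "a", "timestamp": 1, "value": 27.0},
    {"key": "b", "timestamp": 3, "value": 4.0},
    {"key": "a", "timestamp": 5, "value": 17.0},
    {"key": "a", "timestamp": 3, "value": 7.0},
    {"key": "b", "timestamp": 2, "value": 45.0},
]


def to_observations(rows, ts_column=None):
    """Return (time_tick, value) pairs ordered by time-tick (hypothetical helper)."""
    observations = []
    for index, row in enumerate(rows):
        if ts_column is None:
            # No timestamp column: the record index is the time-tick
            # and every column is kept in the value.
            observations.append((index, dict(row)))
        else:
            # Timestamp column given: all other columns form the value.
            value = {k: v for k, v in row.items() if k != ts_column}
            observations.append((row[ts_column], value))
    # sorted() is stable, so equal time-ticks keep their input order.
    return sorted(observations, key=lambda o: o[0])


with_ts = to_observations(rows, ts_column="timestamp")
by_index = to_observations(rows)
```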
create a time-series from a dataframe specifying a timestamp column and using a time-reference-system
>>> import datetime
>>> start_time = datetime.datetime(1990, 7, 6)
>>> granularity = datetime.timedelta(weeks=1)
>>> ts = tspy.time_series(df, ts_column="timestamp", granularity=granularity, start_time=start_time)
>>> ts
TimeStamp: 1990-07-13T00:00Z     Value: {value=27.0, key=a}
TimeStamp: 1990-07-20T00:00Z     Value: {value=45.0, key=b}
TimeStamp: 1990-07-27T00:00Z     Value: {value=4.0, key=b}
TimeStamp: 1990-07-27T00:00Z     Value: {value=7.0, key=a}
TimeStamp: 1990-08-10T00:00Z     Value: {value=17.0, key=a}
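The output above is consistent with a time-reference-system that maps each integer time-tick to start_time plus that many granularity steps. The snippet below is a small sketch of that arithmetic using only the standard library; `tick_to_datetime` is a hypothetical name, and the mapping is inferred from the example rather than taken from tspy's source.

```python
import datetime


def tick_to_datetime(time_tick, granularity, start_time):
    """Map an integer time-tick to a wall-clock time (assumed TRS rule)."""
    return start_time + time_tick * granularity


start_time = datetime.datetime(1990, 7, 6, tzinfo=datetime.timezone.utc)
granularity = datetime.timedelta(weeks=1)

# Time-ticks taken from the 'timestamp' column of the example dataframe.
stamps = [tick_to_datetime(t, granularity, start_time) for t in (1, 2, 3, 3, 5)]
```

With a one-week granularity, tick 1 lands on 1990-07-13 and tick 5 on 1990-08-10, matching the timestamps printed in the example.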
create a time-series from a list of values
>>> ts = tspy.time_series([0, 1])
>>> ts
TimeStamp: 0     Value: 0
TimeStamp: 1     Value: 1
create a time-series from a list of values with a time-reference system
>>> import datetime
>>> granularity = datetime.timedelta(days=1)
>>> start_time = datetime.datetime(1990,7,6)
>>> ts = tspy.time_series([0, 1], granularity=granularity, start_time=start_time)
>>> ts
TimeStamp: 1990-07-06T00:00Z     Value: 0
TimeStamp: 1990-07-07T00:00Z     Value: 1
create a collection of observations
>>> import tspy
>>> observations = tspy.builder().add(tspy.observation(0,0)).add(tspy.observation(1,1)).result()
>>> observations
[(0,0),(1,1)]
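The chained add(...).add(...).result() call above follows the familiar fluent-builder pattern. As a rough illustration of how such chaining works, here is a self-contained sketch; `ObservationBuilder` is a hypothetical stand-in, observations are plain (time_tick, value) tuples, and ordering the result by time-tick is an assumption rather than documented tspy behavior.

```python
class ObservationBuilder:
    """Hypothetical stand-in for the object returned by tspy.builder()."""

    def __init__(self):
        self._observations = []

    def add(self, observation):
        # Returning self is what allows calls to be chained.
        self._observations.append(observation)
        return self

    def result(self):
        # Assumption: the finished collection is ordered by time-tick.
        return sorted(self._observations, key=lambda o: o[0])


collected = ObservationBuilder().add((1, 1)).add((0, 0)).result()
```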
create a time-series from observations
>>> ts = tspy.time_series(observations)
>>> ts
TimeStamp: 0     Value: 0
TimeStamp: 1     Value: 1
create a time-series from observations with a time-reference system
>>> import datetime
>>> granularity = datetime.timedelta(days=1)
>>> start_time = datetime.datetime(1990,7,6)
>>> ts = tspy.time_series(observations, granularity=granularity, start_time=start_time)
>>> ts
TimeStamp: 1990-07-06T00:00Z     Value: 0
TimeStamp: 1990-07-07T00:00Z     Value: 1
Subpackages
- tspy.data_structures package
- Subpackages
- tspy.data_structures.forecasting package
- tspy.data_structures.io package
- tspy.data_structures.ml package
- tspy.data_structures.multi_time_series package
- tspy.data_structures.observations package
- tspy.data_structures.stream_multi_time_series package
- tspy.data_structures.stream_time_series package
- tspy.data_structures.time_series package
- tspy.data_structures.transforms package
- Submodules
- Subpackages
- tspy.functions package
- tspy.ml package