warsa.timeseries package

Submodules

warsa.timeseries.daily module

warsa.timeseries.hourly module

warsa.timeseries.interpolation module

warsa.timeseries.monthly module

warsa.timeseries.threshold module

threshold_ranges(sr, threshold)

Returns a pandas data frame with columns Beg, End, and Type. Beg is the date/time at which an interval over or under the threshold begins, and End is the date/time at which that interval ends. Type is 1 for intervals over the threshold and -1 for intervals under or equal to the threshold. The Beg of each interval equals the End of the previous interval. The End of the last interval is NaT (not a time), i.e., it is an open interval. Consecutive intervals alternate between 1 and -1. Example:

Beg               End               Type
01.11.1937 07:30  12.10.1947 07:30   1   -> over threshold
12.10.1947 07:30  14.11.1947 07:30  -1   -> under or equal threshold
14.11.1947 07:30  01.09.1959 07:30   1
01.09.1959 07:30  30.01.1960 07:30  -1
…
12.09.2015 07:30  25.10.2015 07:30   1
25.10.2015 07:30  20.11.2015 07:30  -1
20.11.2015 07:30  NaT                1

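Parameters:

sr – pandas series with date as index
threshold – threshold value that the values of sr are compared against

Returns:	pandas data frame with columns Beg, End, and Type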
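The interval logic described above can be sketched in a few lines of pandas. This is only an illustration under stated assumptions, not the package's implementation: the name threshold_ranges_sketch is hypothetical, and missing values are simply treated as "under or equal".

```python
import pandas as pd


def threshold_ranges_sketch(sr, threshold):
    """Hypothetical re-implementation of the behaviour documented above."""
    # 1 where the value is over the threshold, -1 where it is under or equal
    state = (sr > threshold).map({True: 1, False: -1})
    # Keep only the rows where the state differs from the previous row
    starts = state[state != state.shift()]
    beg = starts.index
    # Each interval ends where the next begins; the last one stays open (NaT)
    end = list(beg[1:]) + [pd.NaT]
    return pd.DataFrame({"Beg": beg, "End": end, "Type": starts.to_numpy()})


sr = pd.Series(
    [5, 6, 1, 0, 7],
    index=pd.date_range("2000-01-01", periods=5, freq="D"),
)
ranges = threshold_ranges_sketch(sr, threshold=2)
print(ranges)
```

With this input the sketch yields three intervals (over, under or equal, over), and the End of the last one is NaT.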

threshold_ranges_annual(df, beg_month=1, beg_day=1, beg_hour=0, beg_minute=0)

threshold_ranges_annual_plot(df, ax=None, plot_name=None, color1='C1', color2='C2')

threshold_ranges_annual_plot1(df, ax=None, plot_name=None, color1='C1', color2='C2')

warsa.timeseries.timeseries module

create_annual_precipitation_statistics(filenames_in, column_name_in, filename_out, beg_datetime, end_datetime, frequency='D', column_name_out=None)

Parameters:

filenames_in – list of filenames of time series. The first column is assumed to be the datetime or date and will be used as index
column_name_in – name of the column in the time series with precipitation values
filename_out – name of the output file
beg_datetime –
end_datetime –
frequency –
column_name_out – function to extract the column name from the filename. If None, all digits in the time series will be used as column name, preceded by ‘P’

Returns:

create_annual_statistics(df, beg_datetime, end_datetime, frequency='D', stat=<function sum>, gaps=True, isnull=True, zeros=True, greaterthenzero=True)

create_gap_statistics(df, beg_datetime, end_datetime, frequency='D')

Return a data frame with the number of i-day gaps: number of 1-day gaps, number of 2-day gaps, etc. Starts from

drop_date_duplicates(df)

get_n_monthly_data(sr, months=range(1, 13), n=1, how='sum')

Parameters:

sr – Series
months – list of months, e.g. [11, 12, 1, 2, 3]
n – number of months to aggregate: 1, 2, 3, 4, or 6. Default: n=1

Returns:	pandas.DataFrame with n-monthly values ‘M’ for each year

get_year_begin_and_end_timestamp(beg_timestamp, end_timestamp, year=None)

is_leap_day(dt)

Check whether dt is 29 February of a leap year

Parameters:	dt (datetime, pd.Timestamp, np.datetime64) – datetime
Returns:	True/False
Return type:	bool
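A check of this kind might look as follows. This is a minimal sketch, not the package's code: the name is_leap_day_sketch is hypothetical, and it assumes that converting the input with pd.Timestamp is an acceptable way to handle all three accepted types. Since 29 February only exists in leap years, testing month and day is sufficient for any valid date.

```python
from datetime import datetime

import numpy as np
import pandas as pd


def is_leap_day_sketch(dt):
    """Hypothetical stand-in: True if dt falls on 29 February."""
    # pd.Timestamp accepts datetime, pd.Timestamp, and np.datetime64 alike
    ts = pd.Timestamp(dt)
    return ts.month == 2 and ts.day == 29


leap = is_leap_day_sketch(np.datetime64("2020-02-29T07:30"))
not_leap = is_leap_day_sketch(datetime(2021, 2, 28))
```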

read_csv(filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), separator=';', index_col=0)

read_time_series(filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), worksheet_name=None)

read_xls(excel_filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), worksheet_name=0)

Reads only one worksheet. If worksheet_name=None, reads the first worksheet

io_ex: string, file-like object, or xlrd workbook

return: DataFrame

read_xlsx(excel_filename, columns=None, beg_datetime=None, end_datetime=None, worksheet_name=None)

Reads only one worksheet. If worksheet_name=None, reads the first worksheet

return: list of DataFrame

replace_year(dt, year)

set_time(datetime, time)

Set the given time on all elements of the list datetime

Parameters:

datetime (list, pd.DatetimeIndex, or np.array) – list of datetime, pd.Timestamp, or np.datetime64
time (datetime.time) – time (hour, minute, second, microsecond), 0 < microsecond < 999999
Returns:	list of pd.Timestamp
Return type:	list
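One way to obtain this behaviour is to keep the date part of each element and combine it with the new time. The sketch below is an assumption about how such a helper could work, not the package's implementation; set_time_sketch is a hypothetical name.

```python
from datetime import datetime, time

import pandas as pd


def set_time_sketch(datetimes, t):
    """Hypothetical stand-in: replace the time of day of every element."""
    return [
        # Keep the calendar date, swap in the new time of day
        pd.Timestamp(datetime.combine(pd.Timestamp(d).date(), t))
        for d in datetimes
    ]


stamps = set_time_sketch(
    [datetime(2000, 1, 1, 5, 0), pd.Timestamp("2000-01-02 23:59")],
    time(7, 30),
)
```

Both elements of stamps keep their date and now carry the time 07:30.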

shift(df, frequency, minutes)

Parameters:

df – data frame or series
frequency – frequency of df.index as string, e.g., ‘30min’, ‘1H’, ‘3H’, or ‘1D’
minutes – minutes to shift, negative values for west time zones
Returns:	data frame or series

slice_by_timestamp(df, beg_timestamp=Timestamp('1677-09-21 00:12:43.145225'), end_timestamp=Timestamp('2262-04-11 23:47:16.854775807'))

Slice the data frame from index starting at beg_timestamp to end_timestamp, including the latter.

Parameters:

df –
beg_timestamp – datetime.datetime, pandas.timestamp, or numpy.datetime64
end_timestamp – datetime.datetime, pandas.timestamp, or numpy.datetime64

Returns:	data frame
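An inclusive slice of this kind can be illustrated with a boolean mask on the index. This is a sketch under assumptions rather than the package's code: slice_by_timestamp_sketch is a hypothetical name, and the open-ended default arguments of the real function are left out for brevity.

```python
import pandas as pd


def slice_by_timestamp_sketch(df, beg_timestamp, end_timestamp):
    """Hypothetical stand-in: keep rows with beg <= index <= end."""
    beg = pd.Timestamp(beg_timestamp)
    end = pd.Timestamp(end_timestamp)
    # Both comparisons are inclusive, so a row stamped exactly end is kept
    return df[(df.index >= beg) & (df.index <= end)]


df = pd.DataFrame(
    {"v": range(5)},
    index=pd.date_range("2000-01-01", periods=5, freq="D"),
)
sliced = slice_by_timestamp_sketch(df, "2000-01-02", "2000-01-04")
```

Here sliced contains the rows for 2 to 4 January, including the row at end_timestamp.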

slice_init(df)

Remove leading rows until the first row with data is found

Returns df[df.first_valid_index():]

Parameters:	df –
Returns:
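The expression df[df.first_valid_index():] quoted above can be tried on a small frame. The data and the column name P below are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"P": [np.nan, np.nan, 1.2, np.nan, 0.4]},
    index=pd.date_range("2000-01-01", periods=5, freq="D"),
)

# first_valid_index() returns the label of the first row that holds data
trimmed = df[df.first_valid_index():]
```

Only the leading empty rows are dropped: trimmed starts on 3 January and still contains the later gap on 4 January.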

slice_to_full_years(df, beg_timestamp=None)

Slice to full hydrological years according to beg_timestamp. The year in beg_timestamp is a dummy (not used).

Parameters:

df – data frame or series
beg_timestamp –

Returns:

split_annually(df, beg_datetime=None, end_datetime=None)

Returns an ordered dictionary with (initial_datetime, final_datetime) as key and the data frame from the corresponding year as value.

beg_datetime: datetime to start the first split.

end_datetime: datetime to finish the last split (the last row may include end_datetime).

The slices start at beg_datetime and repeat each year at the same month, day, hour, minute, and microsecond as defined in beg_datetime.

The slices' upper limits correspond to the month, day, hour, minute, and microsecond as defined in end_datetime.

If the month, day, hour, minute, and microsecond in end_datetime are less than those defined in beg_datetime, then the slices' upper limits will fall in the next year. Example: beg_datetime=datetime(2000,10,1), end_datetime=datetime(2003,3,31)

1. slice key: datetime(2000,10,1) to datetime(2001,3,31)
2. slice key: datetime(2001,10,1) to datetime(2002,3,31)
3. slice key: datetime(2002,10,1) to datetime(2003,3,31)

Parameters:

df – pandas series or data frame with datetime or Timestamp as index
beg_datetime – datetime or Timestamp
end_datetime – datetime or Timestamp: split to less than end_datetime
Returns:	dictionary (Timestamp, Timestamp): series or data frame
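The rule for the slice keys can be reproduced with the standard library alone. The following is a sketch of the key generation only, assuming the behaviour described above; annual_slice_keys is a hypothetical helper, it does not slice any data, and it does not handle a beg_datetime or end_datetime that falls on 29 February.

```python
from datetime import datetime


def annual_slice_keys(beg_datetime, end_datetime):
    """Hypothetical helper: list the (lower, upper) keys of the annual slices."""
    # Compare the within-year positions of both limits in a common (leap) year
    beg_in_year = beg_datetime.replace(year=2000)
    end_in_year = end_datetime.replace(year=2000)
    # If the upper limit comes earlier in the year, it falls in the next year
    offset = 1 if end_in_year < beg_in_year else 0

    keys = []
    year = beg_datetime.year
    while True:
        lower = beg_datetime.replace(year=year)
        upper = end_datetime.replace(year=year + offset)
        if upper > end_datetime:
            break
        keys.append((lower, upper))
        year += 1
    return keys


keys = annual_slice_keys(datetime(2000, 10, 1), datetime(2003, 3, 31))
for lower, upper in keys:
    print(lower.date(), "to", upper.date())
```

For the example given above, this produces the same three keys, each running from 1 October to 31 March of the following year.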

to_csv(df, filename, start_row=None, separated=True, float_format='%.2f')

to_excel(df, filename, start_row=None, separated=True, column=None)

to_open_interval(dt)

write_time_series(df, filename, start_row=0, separated=False, column=None, float_format='%.2f')

Write the time series in df to filename. The file can be in csv, xls, or xlsx format.
If separated is False, save the time series into one file in the case of csv files, or into one worksheet in the case of xls or xlsx format. Otherwise, save each time series into a different file in the case of csv files, or into different worksheets in the case of Excel files. In these cases, the code (identifier) of the time series is appended to the respective output filename before the suffix for csv files, or is used to name the worksheets for Excel files.
column applies only when separated is False and the filename has the suffix .xls or .xlsx.