warsa.timeseries package¶
Submodules¶
warsa.timeseries.daily module¶
warsa.timeseries.hourly module¶
warsa.timeseries.interpolation module¶
warsa.timeseries.monthly module¶
warsa.timeseries.threshold module¶
-
threshold_ranges
(sr, threshold)¶ Returns a pandas data frame with columns Beg, End, and Type. Beg is the date/time at which a time interval over/under the threshold begins; End is the date/time at which this interval ends. Type is 1 for intervals over the threshold and -1 for intervals under or equal to the threshold. Beg of an interval is equal to End of the previous interval. End of the last interval is NaT (not a time), i.e., it is an open interval. Consecutive intervals have alternating 1 and -1 values. Example:
Beg               End               Type
01.11.1937 07:30  12.10.1947 07:30   1  -> over threshold
12.10.1947 07:30  14.11.1947 07:30  -1  -> under or equal threshold
14.11.1947 07:30  01.09.1959 07:30   1
01.09.1959 07:30  30.01.1960 07:30  -1
…
12.09.2015 07:30  25.10.2015 07:30   1
25.10.2015 07:30  20.11.2015 07:30  -1
20.11.2015 07:30  NaT                1
Parameters: - sr – pandas series with date as index
- threshold –
Returns:
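The interval logic described for threshold_ranges can be sketched with plain pandas. This is a minimal illustration, not the warsa source: the function name threshold_ranges_sketch and the sample data are hypothetical, and it assumes that missing values count as "under or equal".

```python
import pandas as pd


def threshold_ranges_sketch(sr, threshold):
    """Hypothetical re-implementation for illustration; not the warsa source."""
    # 1 where the value is over the threshold, -1 where it is under or equal
    # (assumption: NaN compares as False and therefore ends up as -1)
    kind = (sr > threshold).astype(int) * 2 - 1
    # keep only the rows where the type changes; the first row always counts
    starts = kind[kind != kind.shift()]
    beg = starts.index
    # each interval ends where the next one begins; the last one stays open
    end = list(beg[1:]) + [pd.NaT]
    return pd.DataFrame({"Beg": beg, "End": end, "Type": starts.to_numpy()})


# made-up sample series with a daily date index
idx = pd.date_range("2000-01-01", periods=6, freq="D")
sr = pd.Series([5, 6, 1, 0, 7, 8], index=idx)
ranges = threshold_ranges_sketch(sr, threshold=2)
```

With this sample, the sketch yields three alternating intervals, and the End of the last one is NaT.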
-
threshold_ranges_annual
(df, beg_month=1, beg_day=1, beg_hour=0, beg_minute=0)¶
-
threshold_ranges_annual_plot
(df, ax=None, plot_name=None, color1='C1', color2='C2')¶
-
threshold_ranges_annual_plot1
(df, ax=None, plot_name=None, color1='C1', color2='C2')¶
warsa.timeseries.timeseries module¶
-
create_annual_precipitation_statistics
(filenames_in, column_name_in, filename_out, beg_datetime, end_datetime, frequency='D', column_name_out=None)¶ Parameters: - filenames_in – list of filenames of time series. The first column is assumed to be the datetime or date and will be used as index
- column_name_in – name of the column in the time series with precipitation values
- filename_out – name of the output file
- beg_datetime –
- end_datetime –
- frequency –
- column_name_out – function to extract the column name from the filename. If None, all digits in the time series will be used as column name preceded by ‘P’
Returns:
-
create_annual_statistics
(df, beg_datetime, end_datetime, frequency='D', stat=<function sum>, gaps=True, isnull=True, zeros=True, greaterthenzero=True)¶
-
create_gap_statistics
(df, beg_datetime, end_datetime, frequency='D')¶ Return a data frame with the number of i-day gaps: number of 1-day gaps, number of 2-day gaps, etc. Starts from
-
drop_date_duplicates
(df)¶
-
get_n_monthly_data
(sr, months=range(1, 13), n=1, how='sum')¶ Parameters: - sr – Series
- months – list of months, e.g. [11, 12, 1, 2, 3]
- n – number of months to aggregate: 1, 2, 3, 4, or 6. Default: n=1
Returns: pandas.DataFrame with n-monthly values ‘M’ for each year
-
get_year_begin_and_end_timestamp
(beg_timestamp, end_timestamp, year=None)¶
-
is_leap_day
(dt)¶ Check whether dt is the 29.02 of a leap year.
Parameters: dt (datetime, pd.Timestamp, np.datetime64) – datetime
Returns: True/False
Return type: bool
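The check described for is_leap_day can be sketched with the standard library. The name is_leap_day_sketch is hypothetical, and the sketch assumes an object exposing year, month, and day attributes (datetime or pd.Timestamp); an np.datetime64 would need to be converted first.

```python
import calendar
from datetime import datetime


def is_leap_day_sketch(dt):
    """Hypothetical equivalent for objects with year, month and day."""
    # 29 February only exists in leap years, so the calendar check is
    # redundant for valid dates; it is kept to mirror the description.
    return dt.month == 2 and dt.day == 29 and calendar.isleap(dt.year)


on_leap_day = is_leap_day_sketch(datetime(2016, 2, 29))
on_other_day = is_leap_day_sketch(datetime(2016, 2, 28))
```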
-
read_csv
(filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), separator=';', index_col=0)¶
-
read_time_series
(filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), worksheet_name=None)¶
-
read_xls
(excel_filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), worksheet_name=0)¶ Reads only one worksheet. If worksheet_name=None, reads the first worksheet
io_ex: string, file-like object, or xlrd workbook
return: DataFrame
-
read_xlsx
(excel_filename, columns=None, beg_datetime=None, end_datetime=None, worksheet_name=None)¶ Reads only one worksheet. If worksheet_name=None, reads the first worksheet
return: list of DataFrame
-
replace_year
(dt, year)¶
-
set_time
(datetime, time)¶ Set the given time on all elements of the list datetime
Parameters: - datetime (list, pd.DatetimeIndex, or np.array) – list of datetime, pd.Timestamp, or np.datetime64
- time (datetime.time) – time (hour, minute, second, microsecond), 0 <= microsecond <= 999999
Returns: list of pd.Timestamp
Return type: list
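A rough standard-library sketch of what set_time describes: keep each date and replace the time of day. The name set_time_sketch is hypothetical, and unlike the documented function it returns datetime objects rather than pd.Timestamp.

```python
from datetime import datetime, time


def set_time_sketch(datetimes, new_time):
    """Hypothetical helper: keep each date, replace the time of day."""
    return [datetime.combine(d.date(), new_time) for d in datetimes]


# made-up input values with differing times of day
stamps = [datetime(2000, 1, 1, 12, 0), datetime(2000, 1, 2, 18, 45)]
aligned = set_time_sketch(stamps, time(7, 30))
```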
-
shift
(df, frequency, minutes)¶ Parameters: - df – data frame or series
- frequency – frequency of df.index as string, e.g., ‘30min’, ‘1H’, ‘3H’, or ‘1D’
- minutes – minutes to shift, negative values for west time zones
Returns: data frame or series
-
slice_by_timestamp
(df, beg_timestamp=Timestamp('1677-09-21 00:12:43.145225'), end_timestamp=Timestamp('2262-04-11 23:47:16.854775807'))¶ Slice the data frame from index starting at beg_timestamp to end_timestamp, including the latter.
Parameters: - df –
- beg_timestamp – datetime.datetime, pandas.Timestamp, or numpy.datetime64
- end_timestamp – datetime.datetime, pandas.Timestamp, or numpy.datetime64
Returns: data frame
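The inclusive upper bound matches how pandas handles label-based slicing on a sorted DatetimeIndex. The snippet below only illustrates that pandas behaviour with made-up data; whether slice_by_timestamp is implemented this way is an assumption. The default bounds in the signature appear to correspond to the smallest and largest timestamps pandas can represent.

```python
import pandas as pd

# made-up daily series
idx = pd.date_range("2000-01-01", periods=5, freq="D")
df = pd.DataFrame({"P": [0.0, 1.2, 3.4, 0.0, 5.6]}, index=idx)

# label-based slicing on a sorted DatetimeIndex includes both end points
part = df.loc[pd.Timestamp("2000-01-02"):pd.Timestamp("2000-01-04")]
```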
-
slice_init
(df)¶ Remove leading rows until the first row with data is found.
Returns df[df.first_valid_index():]
Parameters: df –
Returns:
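The expression quoted above can be tried on a small series. The data is made up; the sketch uses .loc to make the label-based slice explicit, which should be equivalent for a DatetimeIndex.

```python
import numpy as np
import pandas as pd

# made-up series that starts with two missing values
idx = pd.date_range("2000-01-01", periods=5, freq="D")
sr = pd.Series([np.nan, np.nan, 1.5, np.nan, 2.0], index=idx)

# drop everything before the first non-missing value
trimmed = sr.loc[sr.first_valid_index():]
```

Only the leading missing values are dropped; a gap after the first valid row is kept.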
-
slice_to_full_years
(df, beg_timestamp=None)¶ Slice to full hydrological years according to beg_timestamp. The year in beg_timestamp is a dummy (not used).
Parameters: - df – data frame or series
- beg_timestamp –
Returns:
-
split_annually
(df, beg_datetime=None, end_datetime=None)¶ Returns an ordered dictionary with (initial_datetime, final_datetime) as key and the data frame from the corresponding year as value.
beg_datetime: datetime to start the first split.
end_datetime: datetime to finish the last split (the last row may include end_datetime).
The slices start at beg_datetime and repeat each year at the same month, day, hour, minute, and microsecond as defined in beg_datetime.
The upper limits of the slices correspond to the month, day, hour, minute, and microsecond as defined in end_datetime.
If month, day, hour, minute, and microsecond in end_datetime are less than those defined in beg_datetime, then the upper limits of the slices fall in the next year. Example: beg_datetime=datetime(2000,10,1), end_datetime=datetime(2003,3,31)
1. slice key: datetime(2000,10,1) to datetime(2001,3,31)
2. slice key: datetime(2001,10,1) to datetime(2002,3,31)
3. slice key: datetime(2002,10,1) to datetime(2003,3,31)
Parameters: - df – pandas series or data frame with datetime or Timestamp as index
- beg_datetime – datetime or Timestamp
- end_datetime – datetime or Timestamp: split to less than end_datetime
Returns: dictionary (Timestamp, Timestamp): series or data frame
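The slice keys from the example can be reproduced with a short standard-library sketch. The helper annual_slice_keys is hypothetical and only computes the keys, not the data slices. It assumes that neither bound falls on 29 February, and that an upper limit equal to the begin also moves to the next year.

```python
from datetime import datetime


def annual_slice_keys(beg_datetime, end_datetime):
    """Hypothetical helper: list the (begin, end) keys of the annual slices."""
    keys = []
    year = beg_datetime.year
    while True:
        beg = beg_datetime.replace(year=year)
        end = end_datetime.replace(year=year)
        # an upper limit that is not after the begin falls in the next year
        if end <= beg:
            end = end_datetime.replace(year=year + 1)
        # stop once a slice would extend past end_datetime
        if end > end_datetime:
            break
        keys.append((beg, end))
        year += 1
    return keys


keys = annual_slice_keys(datetime(2000, 10, 1), datetime(2003, 3, 31))
```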
-
to_csv
(df, filename, start_row=None, separated=True, float_format='%.2f')¶
-
to_excel
(df, filename, start_row=None, separated=True, column=None)¶
-
to_open_interval
(dt)¶
-
write_time_series
(df, filename, start_row=0, separated=False, column=None, float_format='%.2f')¶ Write the time series in df into the filename. Filename can be in the formats csv, xls, or xlsx.
If separated is False, save the time series into one file in the case of csv-files or into one worksheet in the case of xls- or xlsx-format. Otherwise, save each time series into a different file in the case of csv-files or into different worksheets in the case of excel-files. In these cases, the code (identifier) of the time series will be appended to the respective output filename before the suffix for csv-files or will be used to name the different worksheets for excel files.
column applies only when separated is False and the filename has a suffix .xls or .xlsx
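How a series code might be appended before the suffix of a csv filename can be sketched with pathlib. The helper separated_csv_name and the underscore separator are assumptions made for illustration; the documentation does not state the exact naming pattern.

```python
from pathlib import Path


def separated_csv_name(filename, code):
    """Hypothetical helper: insert a series code before the file suffix."""
    path = Path(filename)
    # the underscore separator is an assumption, not documented behaviour
    return str(path.with_name(f"{path.stem}_{code}{path.suffix}"))


name = separated_csv_name("rain.csv", "P123")
```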