warsa.timeseries package

Submodules

warsa.timeseries.daily module

warsa.timeseries.hourly module

warsa.timeseries.interpolation module

warsa.timeseries.monthly module

warsa.timeseries.threshold module

threshold_ranges(sr, threshold)

Returns a pandas data frame with columns Beg, End, and Type. Beg is the date/time at which an interval over or under the threshold begins; End is the date/time at which that interval ends. Type is 1 for intervals over the threshold and -1 for intervals under or equal to the threshold. The Beg of an interval equals the End of the previous interval. The End of the last interval is NaT (not a time), i.e., it is an open interval. Consecutive intervals have alternating Type values of 1 and -1. Example:

Beg               End               Type
01.11.1937 07:30  12.10.1947 07:30   1   -> over threshold
12.10.1947 07:30  14.11.1947 07:30  -1   -> under or equal threshold
14.11.1947 07:30  01.09.1959 07:30   1
01.09.1959 07:30  30.01.1960 07:30  -1
…
12.09.2015 07:30  25.10.2015 07:30   1
25.10.2015 07:30  20.11.2015 07:30  -1
20.11.2015 07:30  NaT                1
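The interval logic described above can be illustrated with a minimal pandas sketch. It is not the package's implementation: the function name threshold_ranges_sketch and the sample data are assumptions made for illustration, and missing values are simply treated as not over the threshold.

```python
import numpy as np
import pandas as pd


def threshold_ranges_sketch(sr, threshold):
    """Hypothetical re-creation of the Beg/End/Type table described above."""
    # 1 where the value is over the threshold, -1 otherwise (NaN counts as -1 here).
    state = pd.Series(np.where(sr > threshold, 1, -1), index=sr.index)
    # A new interval begins wherever the state differs from the previous row;
    # the first row always starts an interval because its diff is NaN.
    starts = state.diff().ne(0).to_numpy()
    beg = pd.Series(state.index[starts])
    # Each interval ends where the next one begins; the last End stays NaT.
    end = beg.shift(-1)
    return pd.DataFrame({"Beg": beg, "End": end, "Type": state.to_numpy()[starts]})


# Illustrative data only: five daily values at 07:30.
index = pd.date_range("2000-01-01 07:30", periods=5, freq="D")
sr = pd.Series([5.0, 6.0, 1.0, 1.0, 7.0], index=index)
ranges = threshold_ranges_sketch(sr, threshold=3.0)
print(ranges)
```

The sketch keeps one row per change of state, so consecutive rows alternate between 1 and -1 and the last interval stays open.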

Parameters:
  • sr – pandas series with date as index
  • threshold
Returns:

threshold_ranges_annual(df, beg_month=1, beg_day=1, beg_hour=0, beg_minute=0)
threshold_ranges_annual_plot(df, ax=None, plot_name=None, color1='C1', color2='C2')
threshold_ranges_annual_plot1(df, ax=None, plot_name=None, color1='C1', color2='C2')

warsa.timeseries.timeseries module

create_annual_precipitation_statistics(filenames_in, column_name_in, filename_out, beg_datetime, end_datetime, frequency='D', column_name_out=None)
Parameters:
  • filenames_in – list of filenames of time series. The first column is assumed to be the datetime or date and will be used as index
  • column_name_in – name of the column in the time series with precipitation values
  • filename_out – name of the output file
  • beg_datetime
  • end_datetime
  • frequency
  • column_name_out – function to extract the column name from the filename. If None, all digits in the time series will be used as the column name, preceded by ‘P’
Returns:

create_annual_statistics(df, beg_datetime, end_datetime, frequency='D', stat=<function sum>, gaps=True, isnull=True, zeros=True, greaterthenzero=True)
create_gap_statistics(df, beg_datetime, end_datetime, frequency='D')

Returns a data frame with the number of i-day gaps: number of 1-day gaps, number of 2-day gaps, etc. Starts from

drop_date_duplicates(df)
get_n_monthly_data(sr, months=range(1, 13), n=1, how='sum')
Parameters:
  • sr – Series
  • months – list of months, e.g. [11, 12, 1, 2, 3]
  • n – number of months to aggregate: 1, 2, 3, 4, or 6. Default: n=1
Returns:

pandas.DataFrame with n-monthly values ‘M’ for each year

get_year_begin_and_end_timestamp(beg_timestamp, end_timestamp, year=None)
is_leap_day(dt)

Check whether dt is 29 February (29.02) of a leap year

Parameters: dt (datetime, pd.Timestamp, np.datetime64) – datetime
Returns: True/False
Return type: bool
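
A check of this kind can be sketched as follows. The function name is_leap_day_sketch is an assumption, and the body is one possible approach rather than the package's code: it relies on the fact that 29 February only exists in leap years.

```python
import datetime

import numpy as np
import pandas as pd


def is_leap_day_sketch(dt):
    """Hypothetical check: True if dt falls on 29 February."""
    ts = pd.Timestamp(dt)  # accepts datetime, pd.Timestamp, and np.datetime64
    # 29 February exists only in leap years, so month and day are sufficient.
    return bool(ts.month == 2 and ts.day == 29)


print(is_leap_day_sketch(datetime.datetime(2016, 2, 29, 7, 30)))
print(is_leap_day_sketch(np.datetime64("2015-02-28")))
```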
read_csv(filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), separator=';', index_col=0)
read_time_series(filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), worksheet_name=None)
read_xls(excel_filename, columns=None, beg_datetime=Timestamp('1677-09-21 00:12:43.145225'), end_datetime=Timestamp('2262-04-11 23:47:16.854775807'), worksheet_name=0)

Reads only one worksheet. If worksheet_name=None, reads the first worksheet

io_ex: string, file-like object, or xlrd workbook

return: DataFrame

read_xlsx(excel_filename, columns=None, beg_datetime=None, end_datetime=None, worksheet_name=None)

Reads only one worksheet. If worksheet_name=None, reads the first worksheet

return: list of DataFrame

replace_year(dt, year)
set_time(datetime, time)

Set time to all elements in the list values

Parameters:
  • datetime (list, pd.DatetimeIndex, or np.array) – list of datetime, pd.Timestamp, or np.datetime64
  • time (datetime.time) – time (hour, minute, second, microsecond), 0 <= microsecond <= 999999
Returns:

list of pd.Timestamp

Return type:

list
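
As an illustration of the behaviour described above, the following sketch replaces the time of day of every element. The name set_time_sketch is hypothetical and the body is an assumption about how such a helper could work, not the package's implementation.

```python
import datetime

import numpy as np
import pandas as pd


def set_time_sketch(datetimes, time):
    """Hypothetical version: replace the time of day of every element."""
    return [
        # Keep the calendar date of each element and attach the new time.
        pd.Timestamp(datetime.datetime.combine(pd.Timestamp(d).date(), time))
        for d in datetimes
    ]


values = [datetime.datetime(2020, 1, 1, 10, 0), np.datetime64("2020-01-02T05:15")]
result = set_time_sketch(values, datetime.time(7, 30))
print(result)
```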

shift(df, frequency, minutes)
Parameters:
  • df – data frame or series
  • frequency – frequency of df.index as string, e.g., ‘30min’, ‘1H’, ‘3H’, or ‘1D’
  • minutes – minutes to shift, negative values for west time zones
Returns:

data frame or series

slice_by_timestamp(df, beg_timestamp=Timestamp('1677-09-21 00:12:43.145225'), end_timestamp=Timestamp('2262-04-11 23:47:16.854775807'))

Slice the data frame from index starting at beg_timestamp to end_timestamp, including the latter.

Parameters:
  • df
  • beg_timestamp – datetime.datetime, pandas.Timestamp, or numpy.datetime64
  • end_timestamp – datetime.datetime, pandas.Timestamp, or numpy.datetime64
Returns:

data frame
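
An inclusive slice of this kind can be sketched with label-based indexing in pandas. The function name slice_by_timestamp_sketch and the sample data are assumptions, and the sketch presumes a sorted DatetimeIndex.

```python
import pandas as pd


def slice_by_timestamp_sketch(df, beg_timestamp, end_timestamp):
    """Hypothetical version; assumes a sorted DatetimeIndex."""
    # Label-based slicing with .loc includes both end points.
    return df.loc[pd.Timestamp(beg_timestamp):pd.Timestamp(end_timestamp)]


# Illustrative data only: ten daily values.
index = pd.date_range("2000-01-01", periods=10, freq="D")
df = pd.DataFrame({"P": range(10)}, index=index)
sliced = slice_by_timestamp_sketch(df, "2000-01-03", "2000-01-06")
print(sliced)
```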

slice_init(df)

Remove rows until a first row with data is found

Returns df[df.first_valid_index():]
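
The expression above can be tried on a small data frame. The sample data and variable names below are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Illustrative data only: two leading missing values and one later gap.
index = pd.date_range("2000-01-01", periods=5, freq="D")
df = pd.DataFrame({"P": [np.nan, np.nan, 1.0, np.nan, 2.0]}, index=index)

# first_valid_index() gives the label of the first row holding a non-missing value.
first = df.first_valid_index()
# Slicing from that label drops only the leading empty rows; later gaps remain.
trimmed = df[first:]
print(trimmed)
```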

Parameters: df
Returns:
slice_to_full_years(df, beg_timestamp=None)

Slice to full hydrological years according to beg_timestamp. The year in beg_timestamp is a dummy (not used).

Parameters:
  • df – data frame or series
  • beg_timestamp
Returns:

split_annually(df, beg_datetime=None, end_datetime=None)
Returns an ordered dictionary with (initial_datetime, final_datetime) as key and the data frame from the
corresponding year as value.

beg_datetime: datetime to start the first split.

end_datetime: datetime to finish the last split (the last row may include end_datetime).

The slices start at beg_datetime and repeat each year at the same month, day, hour, minute, and microsecond as defined in beg_datetime.

The slices' upper limits correspond to the month, day, hour, minute, and microsecond as defined in end_datetime.

If the month, day, hour, minute, and microsecond in end_datetime are less than those defined in beg_datetime, then the slices' upper limits will fall in the next year. Example: beg_datetime=datetime(2000,10,1), end_datetime=datetime(2003,3,31)
1. slice key: datetime(2000,10,1) to datetime(2001,3,31)
2. slice key: datetime(2001,10,1) to datetime(2002,3,31)
3. slice key: datetime(2002,10,1) to datetime(2003,3,31)

Parameters:
  • df – pandas series or data frame with datetime or Timestamp as index
  • beg_datetime – datetime or Timestamp
  • end_datetime – datetime or Timestamp: split to less than end_datetime
Returns:

dictionary (Timestamp, Timestamp): series or data frame
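
The way the slice keys repeat each year can be sketched as below. This only reproduces the keys of the documented example; the function name annual_slice_keys is hypothetical, and how the package treats rows that fall exactly on the limits is not covered.

```python
from datetime import datetime


def annual_slice_keys(beg_datetime, end_datetime):
    """Hypothetical helper: (begin, end) keys repeating every year."""
    keys = []
    year = beg_datetime.year
    while True:
        # Note: replace(year=...) raises for 29 February in a non-leap year.
        beg = beg_datetime.replace(year=year)
        end = end_datetime.replace(year=year)
        if end <= beg:
            # The upper limit falls in the following year.
            end = end_datetime.replace(year=year + 1)
        if end > end_datetime:
            break
        keys.append((beg, end))
        year += 1
    return keys


keys = annual_slice_keys(datetime(2000, 10, 1), datetime(2003, 3, 31))
for beg, end in keys:
    print(beg, "to", end)
```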

to_csv(df, filename, start_row=None, separated=True, float_format='%.2f')
to_excel(df, filename, start_row=None, separated=True, column=None)
to_open_interval(dt)
write_time_series(df, filename, start_row=0, separated=False, column=None, float_format='%.2f')
Write the time series in df to filename. The file can be in csv, xls, or xlsx format.
If separated is False, all time series are saved into one file (csv) or into one worksheet (xls or xlsx). Otherwise, each time series is saved into a different file (csv) or a different worksheet (xls or xlsx). In these cases, the code (identifier) of each time series is appended to the output filename before the suffix for csv files, or is used to name the worksheets for Excel files.
column applies only when separated is False and the filename has the suffix .xls or .xlsx.
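
The naming of separated csv files can be sketched as follows. The helper separated_csv_names and the underscore separator are assumptions for illustration; the text above only states that the code is appended before the suffix.

```python
from pathlib import Path


def separated_csv_names(filename, codes, sep="_"):
    """Hypothetical helper: one output path per time series code."""
    path = Path(filename)
    # Insert the code between the file stem and the suffix.
    return [path.with_name(f"{path.stem}{sep}{code}{path.suffix}") for code in codes]


names = separated_csv_names("precipitation.csv", ["P101", "P202"])
print([str(n) for n in names])
```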

warsa.timeseries.timestamp module

warsa.timeseries.yearly module

Module contents