memilio.epidata.getTestingData

Functions

download_testing_data()

Downloads the SARS-CoV-2 test data sets from RKI at country and federal state level.

get_testing_data([read_data, file_format, ...])

Downloads the RKI testing data and provides positive rates of testing data in different ways.

main()

Main program entry.

transform_weeks_to_dates(df_test)

Transforms the calendar weeks of the two data frames obtained from RKI sources to dates in the middle of the corresponding week (i.e., Thursdays).

memilio.epidata.getTestingData.download_testing_data()

Downloads the SARS-CoV-2 test data sets from RKI at country and federal state level. The federal state figures do not sum to the country-wide figures because fewer laboratories participate at federal state level.

Returns:

Array of data frames, with country-level information first and federal-state-level information second.

memilio.epidata.getTestingData.get_testing_data(
read_data=False,
file_format='json_timeasstring',
out_folder='/home/docs/checkouts/readthedocs.org/user_builds/memilio/data/',
start_date=datetime.date(2020, 1, 1),
end_date=datetime.date(2026, 6, 15),
impute_dates=False,
moving_average=0,
**kwargs,
)

Downloads the RKI testing data and provides positive rates of testing data in different ways. Only positive rates are provided: they implicitly carry information on testing numbers, whereas testing numbers do not necessarily yield positive rates without additional information.

The data is downloaded from the internet or, if read_data is True, read from a stored file. Files are read from or written to the folder “out_folder”/Germany/pydata. pandas is used to store and modify the data.

While working with the data:

  • The column names are changed to English depending on defaultDict.

  • The column “Date” provides information on the date of each data point given in the corresponding columns.

  • The data is exported in three different ways:
    • germany_testpos: Positive rates of testing for the whole of Germany

    • germany_states_testpos: Positive rates of testing for all federal states of Germany

    • germany_counties_from_states_testpos: Positive rates of testing for all counties of Germany, only taken from the values of the federal states. No extrapolation applied.

  • Missing dates are imputed for all data frames (‘impute_dates’ is not optional but always executed).

  • A central moving average of N days is optional.

  • Start and end dates can be provided to define the length of the returned data frames.
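
The imputation and smoothing steps listed above can be illustrated with a minimal sketch in plain Python. This is not the library's implementation: the helper names impute_daily and central_moving_average are hypothetical, forward filling is an assumed imputation strategy, the truncated window at the series boundaries is an assumed edge treatment, and the sample values are made up.

```python
from datetime import date, timedelta


def impute_daily(series, start, end):
    """Forward-fill a sparse {date: value} mapping onto every day in [start, end].

    Assumption: missing dates take the most recent known value.
    """
    filled = {}
    last = None
    day = start
    while day <= end:
        if day in series:
            last = series[day]
        filled[day] = last  # stays None until the first known value is reached
        day += timedelta(days=1)
    return filled


def central_moving_average(values, window):
    """Centered moving average over an odd window of `window` entries.

    Assumption: near the boundaries the window is truncated to the
    entries that exist. A window of 0 or 1 leaves the series unchanged.
    """
    if window <= 1:
        return list(values)
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up weekly positive rates, dated on two consecutive Thursdays.
weekly = {date(2021, 1, 7): 0.10, date(2021, 1, 14): 0.17}
daily = impute_daily(weekly, date(2021, 1, 7), date(2021, 1, 14))
smoothed = central_moving_average(list(daily.values()), 7)
```

Imputing before smoothing matters here: the source data is weekly, so a moving average counted in days only makes sense once every day carries a value.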

Parameters:
  • read_data – True or False. Defines if data is read from file or downloaded. (Default value = dd.defaultDict[‘read_data’])

  • file_format – File format which is used for writing the data. Default defined in defaultDict.

  • out_folder – Folder where data is written to. Default defined in defaultDict.

  • start_date – First date in the data frame. Default defined in defaultDict.

  • end_date – Last date in the data frame. Default defined in defaultDict.

  • impute_dates – True or False. Defines if values for dates without new information are imputed. Default defined in defaultDict. Currently, dates are always imputed regardless of this value.

  • moving_average – Integer >= 0. Applies a ‘moving_average’-day moving average to all time series to smooth out effects of irregular reporting. Default defined in defaultDict.

  • **kwargs

memilio.epidata.getTestingData.main()

Main program entry.

memilio.epidata.getTestingData.transform_weeks_to_dates(df_test)

Transforms the calendar weeks of the two data frames obtained from RKI sources to dates in the middle of the corresponding week (i.e., Thursdays).

Parameters:

df_test – The two test data frames obtained from RKI sources.

Returns:

Test data frames with calendar weeks replaced by dates.
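
The week-to-date mapping can be sketched with the standard library alone. The helper name week_to_thursday is hypothetical, and the sketch assumes that the RKI calendar weeks follow ISO 8601 week numbering; it does not reproduce how the function parses the week labels in the RKI files.

```python
from datetime import date


def week_to_thursday(year, week):
    """Return the Thursday of the given ISO calendar week.

    Assumption: weeks are numbered according to ISO 8601, where
    weekday 1 is Monday and weekday 4 is Thursday.
    """
    return date.fromisocalendar(year, week, 4)


# Calendar week 11 of 2020 runs from Monday 9 March to Sunday 15 March.
mid_week = week_to_thursday(2020, 11)
```

Thursday is the middle day of an ISO week, which is why it serves as the representative date for a weekly value.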