ClimateData

Climate data loader for ClimAID.

Built-in climate data is available for South Asia. Other countries require user-supplied weather and projection data.

get_historical(district)

Retrieve historical climate data for a given district.

This method loads historical weather data from either:
  • Built-in ClimAID datasets (for supported countries), or
  • A user-provided dataset (CSV, Excel, or Parquet)

It then filters the data for the specified district.
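The choice of loader by file type could look roughly like the sketch below. It is not ClimAID's actual implementation: the helper name `pick_reader` and the exact set of accepted extensions are assumptions. The returned strings name the standard pandas readers `read_csv`, `read_excel`, and `read_parquet`.

```python
from pathlib import Path

# Assumed mapping from file extension to the pandas reader that would load it.
# The extensions ClimAID itself accepts may differ.
READERS = {
    ".csv": "read_csv",
    ".xlsx": "read_excel",
    ".xls": "read_excel",
    ".parquet": "read_parquet",
}


def pick_reader(path):
    """Return the name of the pandas reader for `path` (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        # Mirrors the documented ValueError for unsupported file formats.
        raise ValueError(f"Unsupported file format: {suffix or '(none)'}")
    return READERS[suffix]
```

Matching on the lower-cased suffix keeps the check independent of how the user capitalised the file name.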

Parameters

district : str

Name of the district (must match 'Dist_States' column in dataset).

Returns

pandas.DataFrame

Historical climate data for the specified district.

Raises

ValueError
  • If the country is not supported and no custom weather file is provided.
  • If no data is found for the given district.
  • If file format is unsupported.

Notes

  • Built-in data currently includes South Asia weather datasets.
  • User datasets must contain:
    • 'Dist_States' column (district name)
    • 'time' column (date/time)
  • The 'time' column is parsed automatically when it uses a supported format.
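Before passing a custom file to ClimAID, it can help to check that it has the two required columns. The following is a minimal, dependency-free sketch using only the standard library. The helper `check_user_dataset`, the sample rows, the `tmax` column, and the ISO date format are all assumptions made for illustration; ClimAID's own parsing may accept other formats.

```python
import csv
import io
from datetime import datetime

REQUIRED_COLUMNS = {"Dist_States", "time"}

# Invented rows for illustration only; "tmax" is a made-up variable column.
SAMPLE_CSV = """Dist_States,time,tmax
DistrictA_StateX,2020-01-01,29.5
DistrictA_StateX,2020-02-01,31.0
"""


def check_user_dataset(text):
    """Parse CSV text and verify the columns required for user datasets."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing required column(s): {sorted(missing)}")
    rows = list(reader)
    for row in rows:
        # Assumes ISO dates (YYYY-MM-DD); other formats need their own parsing.
        row["time"] = datetime.strptime(row["time"], "%Y-%m-%d")
    return rows


rows = check_user_dataset(SAMPLE_CSV)
```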

get_projection(district, model=None, ssp=None)

Retrieve climate projection data for a specific district.

This method:
  • Automatically selects the appropriate dataset (India or South Asia)
  • Downloads and caches data if not already available
  • Filters projections based on district, model, and scenario
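The filtering step can be pictured with the sketch below, which applies the optional `model` and `ssp` filters only when they are given and raises `ValueError` when nothing matches. It works on plain dictionaries rather than a DataFrame to stay dependency-free. The function name `filter_projection`, the column names `model`, `ssp`, and `tas`, and the sample rows are assumptions, not taken from ClimAID.

```python
def filter_projection(rows, district, model=None, ssp=None):
    """Filter projection rows by district and, optionally, model and scenario."""
    matches = [r for r in rows if r["Dist_States"] == district]
    if model is not None:
        matches = [r for r in matches if r["model"] == model]
    if ssp is not None:
        matches = [r for r in matches if r["ssp"] == ssp]
    if not matches:
        raise ValueError(
            f"No projection data for district={district!r}, "
            f"model={model!r}, ssp={ssp!r}"
        )
    return matches


# Invented rows for illustration; "model", "ssp" and "tas" are assumed column names.
ROWS = [
    {"Dist_States": "DistrictA_StateX", "model": "MIROC6", "ssp": "ssp245", "tas": 27.1},
    {"Dist_States": "DistrictA_StateX", "model": "MIROC6", "ssp": "ssp585", "tas": 28.4},
    {"Dist_States": "DistrictB_StateY", "model": "MIROC6", "ssp": "ssp245", "tas": 24.9},
]

subset = filter_projection(ROWS, "DistrictA_StateX", ssp="ssp585")
```

Leaving `model` and `ssp` as `None` returns every row for the district, which matches the optional nature of both parameters.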

Parameters

district : str

District name (must match 'Dist_States' column in dataset).

model : str, optional

Climate model name (e.g., "MIROC6"). If provided, filters dataset to that model.

ssp : str, optional

Emission scenario (e.g., "ssp245", "ssp585"). Filters dataset based on scenario column.

Returns

pandas.DataFrame

Filtered projection data for the specified district.

Raises

ValueError

If no matching data is found for the given filters.

Notes

  • Uses DatasetManager to fetch datasets lazily.
  • Data is cached in memory after first load for performance.
  • Supports both built-in datasets and user-provided files.
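Lazy loading with an in-memory cache is a common pattern, and one generic way to get it is `functools.lru_cache`. The sketch below illustrates the idea only; it does not show how DatasetManager is implemented. The function `load_dataset` and the key `"south_asia"` are hypothetical.

```python
from functools import lru_cache

LOAD_CALLS = []  # records each real load so the caching effect is visible


@lru_cache(maxsize=None)
def load_dataset(name):
    """Stand-in for an expensive download or disk read (hypothetical)."""
    LOAD_CALLS.append(name)
    return (("dataset", name),)  # immutable placeholder payload


first = load_dataset("south_asia")   # performs the load
second = load_dataset("south_asia")  # served from the in-memory cache
```

After both calls, `LOAD_CALLS` contains a single entry, because the second call never reaches the function body.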

load_sample_dataset(name)

Load bundled sample datasets included with climaid.

These datasets are intended for demonstration and testing purposes, particularly for users exploring the Global Mode via the browser interface.

The datasets do not represent real-world observations; they are synthetic or simplified samples designed to illustrate expected data structure, variable formats, and workflows within climaid.

Warning
  • The data is incomplete and may not work with ClimAID.
  • Users should treat the dataset structure as a reference and prepare their own data accordingly.
  • After verifying the content, they may upload their data through the ClimAID Browser Interface.

The datasets can also be viewed here: https://github.com/sam-as//climaid/docs/dataset_samples

Parameters

name : str

Name of the dataset to load.

  • Available options:
    • "climate" : Sample historical climate data
    • "projection" : Sample climate projection data
    • "disease" : Sample disease/incidence data

Returns

pandas.DataFrame

A DataFrame containing the requested sample dataset.

Raises

ValueError

If an invalid dataset name is provided.
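Name validation of this kind can be sketched as a lookup in a fixed mapping. The three keys come from the options listed above. The helper `resolve_sample` and the file names are hypothetical and are not the names used inside the climaid package.

```python
SAMPLE_FILES = {
    "climate": "sample_climate.csv",        # hypothetical file name
    "projection": "sample_projection.csv",  # hypothetical file name
    "disease": "sample_disease.csv",        # hypothetical file name
}


def resolve_sample(name):
    """Map a sample dataset name to its file, rejecting unknown names."""
    if name not in SAMPLE_FILES:
        options = ", ".join(sorted(SAMPLE_FILES))
        raise ValueError(f"Unknown dataset {name!r}; choose one of: {options}")
    return SAMPLE_FILES[name]
```

Listing the valid options in the error message lets a caller correct a typo without consulting the documentation.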

Notes

These datasets are packaged with the library and accessed using importlib.resources, ensuring compatibility after installation.

Examples

>>> from climaid.climate_data import ClimateData
>>> cl = ClimateData()
>>> df = cl.load_sample_dataset("climate")
>>> df.head()