Wave Module

The wave module contains a set of functions to calculate quantities of interest for wave energy converters (WEC).

The wave module uses wave elevation time series data and spectra data.


The io submodule contains functions to request, load, and manipulate CDIP data, National Data Buoy Center (NDBC) data, WPTO Hindcast data, and WIND Toolkit data. The io submodule also has functions to load and manipulate WEC-Sim and SWAN model data.



Returns historic or realtime data from CDIP THREDDS server


Parses a passed CDIP netCDF file, or requests the data for a station number from http://cdip.ucsd.edu/ and parses it.


Iterates over and extracts variables from CDIP buoy data.

mhkit.wave.io.cdip.request_netCDF(station_number, data_type)[source]

Returns historic or realtime data from CDIP THREDDS server

  • station_number (string) – CDIP station number of interest

  • data_type (string) – ‘historic’ or ‘realtime’


nc (xarray Dataset) – netCDF data for the given station number and data type

mhkit.wave.io.cdip.request_parse_workflow(nc=None, station_number=None, parameters=None, years=None, start_date=None, end_date=None, data_type='historic', all_2D_variables=False, silent=False, to_pandas=True)[source]

Parses a passed CDIP netCDF file, or requests the data for a station number from http://cdip.ucsd.edu/ and parses it. The function returns only specific parameters if any are passed. Years may be non-consecutive, e.g. [2001, 2010]. Time may be sliced by dates (start_date or end_date in YYYY-MM-DD). data_type defaults to ‘historic’ but may also be set to ‘realtime’. By default, 2D variables are not parsed; set all_2D_variables=True if all 2D variables are needed. See the MHKiT CDiP example Jupyter notebook for information on available parameters.

  • nc (netCDF Object) – netCDF data for the given station number and data type. Can be the output of request_netCDF

  • station_number (string) – Station number of CDIP wave buoy

  • parameters (string or list of strings) – Parameters to return. If None, all variables except 2D variables are returned.

  • years (int or list of int) – Year date, e.g. 2001 or [2001, 2010]

  • start_date (string) – Start date in YYYY-MM-DD, e.g. ‘2012-04-01’

  • end_date (string) – End date in YYYY-MM-DD, e.g. ‘2012-04-30’

  • data_type (string) – Either ‘historic’ or ‘realtime’

  • all_2D_variables (boolean) – Will return all 2D data. Enabling this adds significant processing time. If not all 2D variables are needed, it is recommended to pass the 2D parameters of interest using the ‘parameters’ keyword and leave this set to False. Default False.

  • silent (boolean) – Set to True to prevent the print statement that announces when 2D variable processing begins. Default False.

  • to_pandas (bool (optional)) – Flag to output a dictionary of pandas objects instead of a dictionary of xarray objects. Default = True.


data (dictionary) –

‘data’: dictionary of variables
’vars’: pandas DataFrame or xarray Dataset

1D variables indexed by time

’vars2D’: dictionary of DataFrames or Datasets, optional

If 2D variables are passed in the ‘parameters’ key or if run with all_2D_variables=True, then this key will appear with a dictionary of DataFrames/Datasets of 2D variables.

’metadata’: dictionary

Anything not of length time
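The year and date options described above can be pictured with a small standard-library sketch. This is only an illustration of the selection logic, not MHKiT's implementation: the helper name select_times is hypothetical, and treating end_date as inclusive is an assumption.

```python
from datetime import date, datetime


def select_times(times, years=None, start_date=None, end_date=None):
    """Keep timestamps that match the requested years and date window."""
    if isinstance(years, int):
        years = [years]  # a single year is treated like a one-item list
    start = date.fromisoformat(start_date) if start_date else None
    end = date.fromisoformat(end_date) if end_date else None
    kept = []
    for t in times:
        if years is not None and t.year not in years:
            continue  # year not requested (years may be non-consecutive)
        if start is not None and t.date() < start:
            continue  # before the requested window
        if end is not None and t.date() > end:
            continue  # after the requested window (end assumed inclusive)
        kept.append(t)
    return kept


times = [datetime(2001, 3, 1), datetime(2005, 6, 1), datetime(2010, 4, 15)]
by_year = select_times(times, years=[2001, 2010])
by_date = select_times(times, start_date="2005-01-01", end_date="2010-04-30")
```

Here by_year keeps only the 2001 and 2010 samples, while by_date keeps the samples that fall inside the date window.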

mhkit.wave.io.cdip.get_netcdf_variables(nc, start_date=None, end_date=None, parameters=None, all_2D_variables=False, silent=False, to_pandas=True)[source]

Iterates over and extracts variables from CDIP buoy data. See the MHKiT CDiP example Jupyter notebook for information on available parameters.

  • nc (netCDF Object) – netCDF data for the given station number and data type

  • start_stamp (float) – Data of interest start in seconds since epoch

  • end_stamp (float) – Data of interest end in seconds since epoch

  • parameters (string or list of strings) – Parameters to return. If None, all variables except 2D variables are returned. Default None.

  • all_2D_variables (boolean) – Will return all 2D data. Enabling this adds significant processing time. If not all 2D variables are needed, it is recommended to pass the 2D parameters of interest using the ‘parameters’ keyword and leave this set to False. Default False.

  • silent (boolean) – Set to True to prevent the print statement that announces when 2D variable processing begins. Default False.

  • to_pandas (bool (optional)) – Flag to output a dictionary of pandas objects instead of a dictionary of xarray objects. Default = True.


results (dictionary) –

‘data’: dictionary of variables
’vars’: pandas DataFrame or xarray Dataset

1D variables indexed by time

’vars2D’: dictionary of DataFrames or Datasets, optional

If 2D variables are passed in the ‘parameters’ key or if run with all_2D_variables=True, then this key will appear with a dictionary of DataFrames/Datasets of 2D variables.

’metadata’: dictionary

Anything not of length time



Reads a NDBC wave buoy data file (from https://www.ndbc.noaa.gov).


For a given parameter this will return a DataFrame or Dataset of years, station IDs and file names that contain that parameter data.


Requests data by filenames and returns a dictionary of DataFrames or dictionary of Datasets for each filename passed.


Converts the NDBC date and time information reported in separate columns into a DateTime index and removes the NDBC date & time columns.


Takes a DataFrame/Dataset and converts the NDBC date columns


Returns an ordered dictionary of NDBC parameters with unit values.


Request the directional spectrum data and return an xarray.Dataset containing all 5 variables.


Create the spread function from the 4 relevant NDBC parameter data.


Create the spectrum from the 5 relevant NDBC parameter data.


Fetches and parses the metadata of a National Data Buoy Center (NDBC) station from https://www.ndbc.noaa.gov.

mhkit.wave.io.ndbc.read_file(file_name, missing_values=['MM', 9999, 999, 99], to_pandas=True)[source]

Reads a NDBC wave buoy data file (from https://www.ndbc.noaa.gov).

Realtime and historical data files can be loaded with this function.

Note: With realtime data, missing data is denoted by “MM”. With historical data, missing data is denoted using a variable number of 9’s, depending on the data type (for example: 9999.0, 999.0, 99.0). ‘N/A’ is automatically converted to missing data.

Data values are converted to float/int when possible. Column names are also converted to float/int when possible (this is useful when column names are frequency).

  • file_name (string) – Name of NDBC wave buoy data file

  • missing_values (list of values) – List of values that denote missing data

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


  • data (pandas DataFrame or xarray Dataset) – Data indexed by datetime with columns named according to header row

  • metadata (dict or None) – Dictionary with {column name: units} key value pairs when the NDBC file contains unit information, otherwise None is returned
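The missing-value convention described in the note can be sketched with the standard library alone. The helper clean_value below is hypothetical and only mirrors the documented behavior (the default missing values, ‘N/A’ handling, and float conversion when possible); it is not the code that read_file runs.

```python
def clean_value(token, missing_values=("MM", 9999, 999, 99)):
    """Return None for missing data, a float when possible, else the raw string."""
    if token == "N/A" or token in missing_values:
        return None  # text markers such as "MM" or "N/A"
    try:
        number = float(token)
    except ValueError:
        return token  # not numeric, keep the original text
    # Numeric markers: 9999.0 == 9999, so floats match the integer defaults
    return None if number in missing_values else number


row = ["MM", "9999.0", "999.0", "99.0", "1.25", "N/A", "WSW"]
cleaned = [clean_value(tok) for tok in row]
```

One consequence of this convention is that a genuine measurement equal to 99 would also be treated as missing, which is a reason to pass a narrower missing_values list when that matters.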

mhkit.wave.io.ndbc.available_data(parameter, buoy_number=None, proxy=None, clear_cache=False, to_pandas=True)[source]

For a given parameter this will return a DataFrame or Dataset of years, station IDs and file names that contain that parameter data.

  • parameter (string) – NDBC data type. Options:

    ‘swden’: ‘Raw Spectral Wave Current Year Historical Data’
    ‘swdir’: ‘Spectral Wave Current Year Historical Data (alpha1)’
    ‘swdir2’: ‘Spectral Wave Current Year Historical Data (alpha2)’
    ‘swr1’: ‘Spectral Wave Current Year Historical Data (r1)’
    ‘swr2’: ‘Spectral Wave Current Year Historical Data (r2)’
    ‘stdmet’: ‘Standard Meteorological Current Year Historical Data’
    ‘cwind’: ‘Continuous Winds Current Year Historical Data’

  • buoy_number (string (optional)) – Buoy Number. 5-character alpha-numeric station identifier

  • proxy (dict) – Proxy dict passed to python requests (e.g. proxy={“http”: ‘http://wwwproxy.yourProxy:80/’})

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


available_data (pandas DataFrame or xarray Dataset) – DataFrame with station ID, years, and NDBC file names.

mhkit.wave.io.ndbc.request_data(parameter, filenames, proxy=None, clear_cache=False, to_pandas=True)[source]

Requests data by filenames and returns a dictionary of DataFrames or dictionary of Datasets for each filename passed. If filenames for a single buoy are passed then the yearly DataFrames in the returned dictionary (ndbc_data) are indexed by year (e.g. ndbc_data[‘2014’]). If multiple buoy ids are passed then the returned dictionary is indexed by buoy id and year (e.g. ndbc_data[‘46022’][‘2014’]).

  • parameter (string) – NDBC data type. Options:

    ‘swden’: ‘Raw Spectral Wave Current Year Historical Data’
    ‘swdir’: ‘Spectral wave data (alpha1)’
    ‘swdir2’: ‘Spectral wave data (alpha2)’
    ‘swr1’: ‘Spectral wave data (r1)’
    ‘swr2’: ‘Spectral wave data (r2)’
    ‘stdmet’: ‘Standard Meteorological Current Year Historical Data’
    ‘cwind’: ‘Continuous Winds Current Year Historical Data’

  • filenames (pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Data filenames on https://www.ndbc.noaa.gov/data/historical/{parameter}/

  • proxy (dict) – Proxy dict passed to python requests (e.g. proxy={“http”: ‘http://wwwproxy.yourProxy:80/’})

  • to_pandas (bool (optional)) – Flag to output a dictionary of pandas objects instead of a dictionary of xarray objects. Default = True.


ndbc_data (dict) – Dictionary of DataFrames/Datasets indexed by buoy and year.
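The two dictionary layouts described above (year keys for a single buoy, buoy then year keys for several buoys) can be sketched as follows. The helper is hypothetical, and the filename pattern of a five-character station id followed by a letter and a four-digit year (for example 46022w2014.txt.gz) is an assumption made for the illustration.

```python
def index_by_buoy_and_year(filenames):
    """Group filenames by buoy id and year, flattening when only one buoy is present."""
    nested = {}
    for name in filenames:
        stem = name.split(".")[0]          # e.g. "46022w2014"
        buoy, year = stem[:5], stem[-4:]   # 5-character station id, 4-digit year
        nested.setdefault(buoy, {})[year] = name
    if len(nested) == 1:
        return next(iter(nested.values()))  # single buoy: keyed by year only
    return nested


single = index_by_buoy_and_year(["46022w2013.txt.gz", "46022w2014.txt.gz"])
multi = index_by_buoy_and_year(["46022w2014.txt.gz", "46042w2014.txt.gz"])
```

With this layout single['2014'] and multi['46022']['2014'] refer to the same file, matching the indexing examples in the description.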

mhkit.wave.io.ndbc.to_datetime_index(parameter, ndbc_data, to_pandas=True)[source]

Converts the NDBC date and time information reported in separate columns into a DateTime index and removes the NDBC date & time columns.

  • parameter (string) – NDBC data type. Options:

    ‘swden’: ‘Raw Spectral Wave Current Year Historical Data’
    ‘swdir’: ‘Spectral wave data (alpha1)’
    ‘swdir2’: ‘Spectral wave data (alpha2)’
    ‘swr1’: ‘Spectral wave data (r1)’
    ‘swr2’: ‘Spectral wave data (r2)’
    ‘stdmet’: ‘Standard Meteorological Current Year Historical Data’
    ‘cwind’: ‘Continuous Winds Current Year Historical Data’

  • ndbc_data (pandas DataFrame or xarray Dataset) – NDBC data in dataframe with date and time columns to be converted

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


df_datetime (pandas DataFrame or xarray Dataset) – Dataframe with NDBC date columns removed, and datetime index

mhkit.wave.io.ndbc.dates_to_datetime(data, return_date_cols=False, return_as_dataframe=False, to_pandas=True)[source]
Takes a DataFrame/Dataset and converts the NDBC date columns (e.g. “#YY MM DD hh mm”) to datetime. Returns a DataFrame/Dataset with the NDBC date columns removed and a new [‘date’] column in DateTime format.

  • data (pandas DataFrame or xarray Dataset) – Dataframe with headers (e.g. [‘YY’, ‘MM’, ‘DD’, ‘hh’, ‘mm’], where ‘mm’ may be absent)

  • return_date_cols (bool (optional)) – Default False. When True, the list of NDBC date columns is also returned

  • return_as_dataframe (bool) – Results returned as a DataFrame (useful for MHKiT-MATLAB)

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


  • date (pandas Series or xarray DataArray) – Series with NDBC dates dropped and new [‘date’] column in DateTime format

  • ndbc_date_cols (list (optional)) – List of the DataFrame/Dataset columns headers for dates as provided by NDBC
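A minimal sketch of the date-column conversion, using plain dictionaries in place of a DataFrame. The helper names are hypothetical; reading two-digit years as 19xx and defaulting a missing ‘mm’ column to zero are assumptions for the illustration, not documented MHKiT behavior.

```python
from datetime import datetime

DATE_COLS = ["YY", "MM", "DD", "hh", "mm"]


def row_to_datetime(row):
    """Build a datetime from the separate NDBC date columns of one row."""
    year = int(row["YY"])
    if year < 100:
        year += 1900  # assumption: two-digit years belong to the 1900s
    minute = int(row.get("mm", 0))  # assumption: no minute column means :00
    return datetime(year, int(row["MM"]), int(row["DD"]), int(row["hh"]), minute)


def split_dates(rows):
    """Return (dates, data) where data no longer holds the date columns."""
    dates = [row_to_datetime(r) for r in rows]
    data = [{k: v for k, v in r.items() if k not in DATE_COLS} for r in rows]
    return dates, data


rows = [
    {"YY": 2014, "MM": 1, "DD": 2, "hh": 3, "mm": 40, "WVHT": 1.8},
    {"YY": 96, "MM": 12, "DD": 31, "hh": 23, "WVHT": 2.4},
]
dates, data = split_dates(rows)
```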


Returns an ordered dictionary of NDBC parameters with unit values. If no parameter is passed, an ordered dictionary of all NDBC parameters with their specified units is returned. If a parameter is specified, only the units associated with that parameter are returned. Note that many NDBC parameters report multiple measurements; in that case the returned dictionary contains the NDBC measurement name and associated unit for every measurement associated with the specified parameter. Optional parameter values are given below. All units are based on https://www.ndbc.noaa.gov/measdes.shtml.


parameter (string (optional)) – NDBC data type. Options:

    ‘adcp’: ‘Acoustic Doppler Current Profiler Current Year Historical Data’
    ‘cwind’: ‘Continuous Winds Current Year Historical Data’
    ‘dart’: ‘Water Column Height (DART) Current Year Historical Data’
    ‘derived2’: ‘Derived Met Values’
    ‘ocean’: ‘Oceanographic Current Year Historical Data’
    ‘rain’: ‘Hourly Rain Current Year Historical Data’
    ‘rain10’: ‘10-Minute Rain Current Year Historical Data’
    ‘rain24’: ‘24-Hour Rain Current Year Historical Data’
    ‘realtime2’: ‘Detailed Wave Summary (Realtime .spec data files only)’
    ‘srad’: ‘Solar Radiation Current Year Historical Data’
    ‘stdmet’: ‘Standard Meteorological Current Year Historical Data’
    ‘supl’: ‘Supplemental Measurements Current Year Historical Data’
    ‘swden’: ‘Raw Spectral Wave Current Year Historical Data’
    ‘swdir’: ‘Spectral Wave Current Year Historical Data (alpha1)’
    ‘swdir2’: ‘Spectral Wave Current Year Historical Data (alpha2)’
    ‘swr1’: ‘Spectral Wave Current Year Historical Data (r1)’
    ‘swr2’: ‘Spectral Wave Current Year Historical Data (r2)’


units (dict) – Dictionary of parameter units

mhkit.wave.io.ndbc.request_directional_data(buoy, year)[source]

Request the directional spectrum data and return an xarray.Dataset containing all 5 variables. The NDBC historical data is organized into files based on buoy number, year, and parameter. For a given buoy number and year, the five files—corresponding to the 5 parameters NDBC uses to describe directional wave spectrum—are fetched and processed.

  • buoy (string) – Buoy Number. Five character alpha-numeric station identifier.

  • year (int) – Four digit year.


ndbc_data (xr.Dataset) – Dataset containing the five parameter data indexed by frequency and date.

mhkit.wave.io.ndbc.create_spread_function(data, directions)[source]

Create the spread function from the 4 relevant NDBC parameter data. Return as an xarray.DataArray indexed by frequency and wave direction.

  • data (xr.Dataset) – Dataset containing the four NDBC parameter data indexed by frequency.

  • directions (np.ndarray) – One-dimensional array of wave directions in degrees.


spread (xr.DataArray) – DataArray containing the spread function values indexed by frequency and wave direction.

mhkit.wave.io.ndbc.create_directional_spectrum(data, directions)[source]

Create the spectrum from the 5 relevant NDBC parameter data. Return as an xarray.DataArray indexed by frequency and wave direction.

  • data (xr.Dataset) – Dataset containing the five NDBC parameter data indexed by frequency.

  • directions (np.ndarray) – One-dimensional array of wave directions in degrees.


spectrum (xr.DataArray) – DataArray containing the spectrum values indexed by frequency and wave direction.
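For orientation, the spread function and directional spectrum can be sketched from the formula NDBC publishes in its measurement descriptions, D(f, A) = (1/pi) * (0.5 + r1*cos(A - alpha1) + r2*cos(2*(A - alpha2))) and S(f, A) = C11(f) * D(f, A). The sketch below evaluates a single frequency with the standard library; the function names are hypothetical, and it assumes r1 and r2 are dimensionless fractions with any file-level scaling already removed. It is not the MHKiT implementation.

```python
import math


def spread_function(r1, r2, alpha1, alpha2, directions):
    """Evaluate D(f, A) for one frequency at each direction A (degrees)."""
    a1, a2 = math.radians(alpha1), math.radians(alpha2)
    values = []
    for direction in directions:
        a = math.radians(direction)
        d = (0.5 + r1 * math.cos(a - a1) + r2 * math.cos(2 * (a - a2))) / math.pi
        values.append(d)
    return values


def directional_spectrum(c11, r1, r2, alpha1, alpha2, directions):
    """Scale the spread function by the nondirectional spectral density C11(f)."""
    return [c11 * d for d in spread_function(r1, r2, alpha1, alpha2, directions)]


directions = list(range(0, 360, 10))
spread = spread_function(0.8, 0.4, 270.0, 270.0, directions)
spectrum = directional_spectrum(2.0, 0.8, 0.4, 270.0, 270.0, directions)

# Riemann sum over the full circle: the cosine terms cancel, leaving 1
integral = sum(spread) * math.radians(10)
```

Because the spread function integrates to one over all directions, integrating the directional spectrum over direction recovers C11(f), which is a quick sanity check on any directional result.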

mhkit.wave.io.ndbc.get_buoy_metadata(station_number: str)[source]

Fetches and parses the metadata of a National Data Buoy Center (NDBC) station from https://www.ndbc.noaa.gov.

Extracts information such as provider, buoy type, latitude, longitude, and other metadata from the station’s webpage.


station_number (string) – The station number (ID) of the NDBC buoy


data (dict) – A dictionary containing metadata of the buoy with keys representing the information type and values containing the corresponding data



Reads in SWAN table format output


Reads in SWAN block output with headers and creates a dictionary of DataFrames or Datasets for each SWAN output variable in the output file.


Converts a dictionary of structured 2D grid SWAN block format x (columns),y (index) to SWAN table format x (column),y (column), values (column) DataFrame or Dataset.


Converts structured 2D grid SWAN block format x (columns), y (index) to SWAN table format x (column),y (column), values (column) DataFrame.

mhkit.wave.io.swan.read_table(swan_file, to_pandas=True)[source]

Reads in SWAN table format output

  • swan_file (str) – filename to import

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


  • swan_data (pandas DataFrame or xarray Dataset) – Dataframe of swan output

  • metaDict (Dictionary) – Dictionary of metaData

mhkit.wave.io.swan.read_block(swan_file, to_pandas=True)[source]

Reads in SWAN block output with headers and creates a dictionary of DataFrames or Datasets for each SWAN output variable in the output file.

  • swan_file (str) – swan block file to import

  • to_pandas (bool (optional)) – Flag to output a dictionary of pandas objects instead of a dictionary of xarray objects. Default = True.


  • data (Dictionary) – Dictionary of DataFrames or Datasets of swan output variables

  • metaDict (Dictionary) – Dictionary of metaData dependent on file type

mhkit.wave.io.swan.dictionary_of_block_to_table(dictionary_of_DataFrames, names=None, to_pandas=True)[source]

Converts a dictionary of structured 2D grid SWAN block format x (columns),y (index) to SWAN table format x (column),y (column), values (column) DataFrame or Dataset.

  • dictionary_of_DataFrames (Dictionary) – Dictionary of DataFrames with X indices as columns and Y as the index.

  • names (List (Optional)) – Name of data column in returned table. Default=Dictionary.keys()

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


swanTables (pandas DataFrame or xarray Dataset) – DataFrame/Dataset with columns x,y,values where values = Dictionary.keys() or names

mhkit.wave.io.swan.block_to_table(data, name='values', to_pandas=True)[source]

Converts structured 2D grid SWAN block format x (columns), y (index) to SWAN table format x (column),y (column), values (column) DataFrame.

  • data (pandas DataFrame or xarray Dataset) – DataFrame with X indices as columns and Y as the index.

  • name (string (Optional)) – Name of data column in returned table. Default=’values’

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


table (pandas DataFrame or xarray Dataset) – DataFrame with columns x,y,values
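The block-to-table reshaping can be pictured with nested lists standing in for a DataFrame. This sketch is only meant to show the change of layout; the helper name is hypothetical, and the row ordering of the result (all x for the first y, then the next y) is an assumption.

```python
def block_to_table_sketch(block, x_values, y_values, name="values"):
    """Flatten block[row][col] into rows of {'x', 'y', name}."""
    table = []
    for j, y in enumerate(y_values):        # rows of the block follow y
        for i, x in enumerate(x_values):    # columns of the block follow x
            table.append({"x": x, "y": y, name: block[j][i]})
    return table


# Two y rows by three x columns, e.g. a small grid of one SWAN output variable
block = [
    [1.0, 1.1, 1.2],
    [2.0, 2.1, 2.2],
]
table = block_to_table_sketch(block, x_values=[0, 10, 20], y_values=[0, 5], name="Hsig")
```

Each grid cell becomes one table row, so a block with two rows and three columns yields six rows with x, y, and value columns.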



Loads the wecSim response class once 'output' has been saved to a .mat structure.

mhkit.wave.io.wecsim.read_output(file_name, to_pandas=True)

Loads the wecSim response class once ‘output’ has been saved to a .mat structure.

NOTE: Python is unable to import MATLAB objects. MATLAB must be used to save the wecSim object as a structure.

  • file_name (string) – Name of wecSim output file saved as a .mat structure

  • to_pandas (bool (optional)) – Flag to output a dictionary of pandas objects instead of a dictionary of xarray objects. Default = True.


ws_output (dict) – Dictionary of pandas DataFrames or xarray Datasets, indexed by time (s)

WIND Toolkit Hindcast

Wind Toolkit Data Utility Functions

This module contains a collection of utility functions designed to facilitate the extraction, caching, and visualization of wind data from the WIND Toolkit hindcast dataset hosted on AWS. This dataset includes offshore wind hindcast data with various parameters like wind speed, direction, temperature, and pressure.

Key Functions:
  • region_selection: Determines which predefined wind region a given latitude and longitude fall within.

  • get_region_data: Retrieves latitude and longitude data points for a specified wind region. Uses caching to speed up repeated requests.

  • plot_region: Plots the geographical extent of a specified wind region and can overlay a given latitude-longitude point.

  • elevation_to_string: Converts a parameter (e.g., ‘windspeed’) and elevation values (e.g., [20, 40, 120]) to the formatted strings used in the WIND Toolkit.

  • request_wtk_point_data: Fetches specified wind data parameters for given latitude-longitude points and years from the WIND Toolkit hindcast dataset. Supports caching for faster repeated data retrieval.

Dependencies:
  • rex: Library to handle renewable energy datasets.

  • pandas: Data manipulation and analysis.

  • os, hashlib, pickle: Used for caching functionality.

  • matplotlib: Used for plotting.

Notes:
  • To access the WIND Toolkit hindcast data, users need to configure h5pyd for data access on HSDS (see the metocean_example or WPTO_hindcast_example notebook for more details).

  • While some functions perform basic checks (e.g., verifying that latitude and longitude are within a predefined region), it’s essential to understand the boundaries of each region and the available parameters and elevations in the dataset.


Author: akeeste, ssolson




Returns the name of the predefined region in which the given coordinates reside.


Retrieves the latitude and longitude data points for the specified region from the cache if available; otherwise, fetches the data and caches it for subsequent calls.


Visualizes the area that a given region covers.


Takes in a parameter (e.g. 'windspeed') and elevations (e.g. [20, 40, 120]) and returns the formatted strings that are input to WIND Toolkit (e.g. windspeed_20m).


Returns data from the WIND Toolkit offshore wind hindcast hosted on AWS at the specified latitude and longitude point(s), or the closest available point(s). Visit https://registry.opendata.aws/nrel-pds-wtk/ for more information about the dataset and available locations and years.

mhkit.wave.io.hindcast.wind_toolkit.region_selection(lat_lon, preferred_region='')[source]

Returns the name of the predefined region in which the given coordinates reside. Can be used to check if the passed lat/lon pair is within the WIND Toolkit hindcast dataset.

  • lat_lon (tuple) – Latitude and longitude coordinates as floats or integers

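  • preferred_region (string (optional)) – Region that the lat_lon belongs to (‘Offshore_CA’ or ‘NW_Pacific’). Required when a lat_lon point falls in both the Offshore California and NW Pacific regions. Default = ‘’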


region (string) – Name of predefined region for given coordinates


Retrieves the latitude and longitude data points for the specified region from the cache if available; otherwise, fetches the data and caches it for subsequent calls.

The function forms a unique identifier from the region parameter and checks whether the corresponding data is available in the cache. If the data is found, it’s loaded and returned. If not, the data is fetched, cached, and then returned.


region (str) – Name of the predefined region in the WIND Toolkit for which to retrieve latitude and longitude data points. It is case-sensitive. Examples: ‘Offshore_CA’, ‘Hawaii’, ‘Mid_Atlantic’, ‘NW_Pacific’


  • lats (numpy.ndarray) – A 1D array containing the latitude coordinates of data points in the specified region.

  • lons (numpy.ndarray) – A 1D array containing the longitude coordinates of data points in the specified region.
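The check-cache-then-fetch pattern described above can be sketched with hashlib, os, and pickle, the modules the WIND Toolkit utilities list for caching. Everything named here (cached_call, fake_fetch, the temporary cache directory, the made-up coordinates) is hypothetical and stands in for the real request; the actual cache location and key format used by MHKiT are not shown in this section.

```python
import hashlib
import os
import pickle
import tempfile


def cached_call(key, fetch, cache_dir):
    """Return fetch() for this key, reading from or writing to a pickle cache."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()  # unique file name per key
    path = os.path.join(cache_dir, digest + ".pkl")
    if os.path.isfile(path):
        with open(path, "rb") as f:
            return pickle.load(f)  # cache hit: skip the fetch entirely
    result = fetch()
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(result, f)  # store for subsequent calls
    return result


calls = []


def fake_fetch():
    # Stand-in for the remote request; returns made-up coordinates
    calls.append(1)
    return ([34.0, 34.1], [-120.5, -120.4])


with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_call("Offshore_CA", fake_fetch, cache_dir)
    second = cached_call("Offshore_CA", fake_fetch, cache_dir)
```

The second call returns the same data without invoking the fetch again, which is the speed-up the caching is meant to provide.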


>>> lats, lons = get_region_data('Offshore_CA')

mhkit.wave.io.hindcast.wind_toolkit.plot_region(region, lat_lon=None, ax=None)[source]

Visualizes the area that a given region covers. Can help users understand the extent of a region since they are not all rectangular.

  • region (string) – Name of predefined region in the WIND Toolkit. Options: ‘Offshore_CA’, ‘Hawaii’, ‘Mid_Atlantic’, ‘NW_Pacific’

  • lat_lon (tuple (optional)) – Latitude and longitude pair to plot on top of the chosen region. Useful to inform accurate latitude-longitude selection for data analysis.

  • ax (matplotlib axes object (optional)) – Axes for plotting. If None, then a new figure is created.


ax (matplotlib pyplot axes)

mhkit.wave.io.hindcast.wind_toolkit.elevation_to_string(parameter, elevations)[source]

Takes in a parameter (e.g. ‘windspeed’) and elevations (e.g. [20, 40, 120]) and returns the formatted strings that are input to WIND Toolkit (e.g. windspeed_20m). Does not check the parameter against the elevation levels; that check is done in request_wtk_point_data.

  • parameter (string) – Name of the WIND Toolkit parameter. Options: ‘windspeed’, ‘winddirection’, ‘temperature’, ‘pressure’

  • elevations (list) – List of elevations (float). Values can range from approximately 20 to 200 in increments of 20, depending on the parameter in question. See the documentation for request_wtk_point_data for the full list of available parameters.


parameter_list (list) – Formatted List of WIND Toolkit parameter strings
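The formatting described above amounts to joining the parameter name, an underscore, the elevation, and an ‘m’ suffix. A minimal sketch follows; the function name is hypothetical, and casting each elevation to an integer is an assumption made so that 20.0 renders as 20.

```python
def elevation_to_string_sketch(parameter, elevations):
    """Build WIND Toolkit style names such as 'windspeed_20m'."""
    return [f"{parameter}_{int(elevation)}m" for elevation in elevations]


names = elevation_to_string_sketch("windspeed", [20, 40, 120])
```

Like the documented function, this sketch does not check that the requested elevations exist for the parameter.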

mhkit.wave.io.hindcast.wind_toolkit.request_wtk_point_data(time_interval, parameter, lat_lon, years, preferred_region='', tree=None, unscale=True, str_decode=True, hsds=True, clear_cache=False, to_pandas=True)[source]

Returns data from the WIND Toolkit offshore wind hindcast hosted on AWS at the specified latitude and longitude point(s), or the closest available point(s). Visit https://registry.opendata.aws/nrel-pds-wtk/ for more information about the dataset and available locations and years.

Calls with multiple parameters must have the same time interval. Calls with multiple locations must use the same region (use the plot_region function).

Note: To access the WIND Toolkit hindcast data, you will need to configure h5pyd for data access on HSDS. Please see the metocean_example or WPTO_hindcast_example notebook for more information.

  • time_interval (string) – Data set type of interest. Options: ‘1-hour’, ‘5-minute’

  • parameter (string or list of strings) –

    Dataset parameter to be downloaded. Other parameters may be available. This list is limited to those available at both 5-minute and 1-hour time intervals for all regions. Options:

    ’precipitationrate_0m’, ‘inversemoninobukhovlength_2m’, ‘relativehumidity_2m’, ‘surface_sea_temperature’, ‘pressure_0m’, ‘pressure_100m’, ‘pressure_200m’, ‘temperature_10m’, ‘temperature_20m’, ‘temperature_40m’, ‘temperature_60m’, ‘temperature_80m’, ‘temperature_100m’, ‘temperature_120m’, ‘temperature_140m’, ‘temperature_160m’, ‘temperature_180m’, ‘temperature_200m’, ‘winddirection_10m’, ‘winddirection_20m’, ‘winddirection_40m’, ‘winddirection_60m’, ‘winddirection_80m’, ‘winddirection_100m’, ‘winddirection_120m’, ‘winddirection_140m’, ‘winddirection_160m’, ‘winddirection_180m’, ‘winddirection_200m’, ‘windspeed_10m’, ‘windspeed_20m’, ‘windspeed_40m’, ‘windspeed_60m’, ‘windspeed_80m’, ‘windspeed_100m’, ‘windspeed_120m’, ‘windspeed_140m’, ‘windspeed_160m’, ‘windspeed_180m’, ‘windspeed_200m’

  • lat_lon (tuple or list of tuples) – Latitude longitude pairs at which to extract data. Use plot_region() or region_selection() to see the corresponding region for a given location.

  • years (list) – Year(s) to be accessed. The years 2000-2019 are available (up to 2020 for Mid-Atlantic). Examples: [2015] or [2004,2006,2007]

  • preferred_region (string (optional)) – Region that the lat_lon belongs to (‘Offshore_CA’ or ‘NW_Pacific’). Required when a lat_lon point falls in both the Offshore California and NW Pacific regions. Overlap region defined by latitude = (41.213, 42.642) and longitude = (-129.090, -121.672). Default = ‘’

  • tree (str | cKDTree (optional)) – cKDTree or path to .pkl file containing pre-computed tree of lat, lon coordinates, default = None

  • unscale (bool (optional)) – Boolean flag to automatically unscale variables on extraction. Default = True

  • str_decode (bool (optional)) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read. Default = True

  • hsds (bool (optional)) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS. Setting to False will indicate to look for files on local machine, not AWS. Default = True

  • clear_cache (bool (optional)) – Boolean flag to clear the cache related to this specific request. Default is False.

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


  • data (DataFrame) – Data indexed by datetime with columns named for parameter and corresponding metadata index

  • meta (DataFrame) – Location metadata for the requested data location
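The overlap rule for preferred_region can be checked before a request is made. The sketch below uses the overlap bounds quoted in the parameter description; the helper names are hypothetical, treating the bounds as inclusive is an assumption, and the real function may report the problem differently.

```python
# Overlap between the Offshore California and NW Pacific regions, as documented
OVERLAP_LAT = (41.213, 42.642)
OVERLAP_LON = (-129.090, -121.672)


def in_overlap(lat_lon):
    """True when the point lies inside the documented overlap box."""
    lat, lon = lat_lon
    return (OVERLAP_LAT[0] <= lat <= OVERLAP_LAT[1]
            and OVERLAP_LON[0] <= lon <= OVERLAP_LON[1])


def check_request(lat_lons, preferred_region=""):
    """Return the ambiguous points, raising if they lack a preferred_region."""
    ambiguous = [p for p in lat_lons if in_overlap(p)]
    if ambiguous and preferred_region not in ("Offshore_CA", "NW_Pacific"):
        raise ValueError(f"preferred_region is required for points {ambiguous}")
    return ambiguous


clear = check_request([(35.0, -121.5)])
resolved = check_request([(42.0, -125.0)], preferred_region="NW_Pacific")
```

A point outside the box needs no preferred_region, while a point inside it is accepted only once one of the two region names is supplied.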

WPTO Hindcast

This module provides functions to access and process WPTO wave hindcast data hosted on AWS at specified latitude and longitude points or the closest available points. It includes functions to retrieve data for predefined regions, request point data for various parameters, and request directional spectrum data.

  • region_selection(lat_lon): Returns the name of the predefined region for given latitude and longitude coordinates.

  • request_wpto_point_data(data_type, parameter, lat_lon, years, tree=None, unscale=True, str_decode=True, hsds=True): Returns data from the WPTO wave hindcast hosted on AWS at the specified latitude and longitude point(s) for the requested data type, parameter, and years.

  • request_wpto_directional_spectrum(lat_lon, year, tree=None, unscale=True, str_decode=True, hsds=True): Returns directional spectra data from the WPTO wave hindcast hosted on AWS at the specified latitude and longitude point(s) for the given year.

Dependencies:
  • sys

  • time.sleep

  • pandas

  • xarray

  • numpy

  • rex.MultiYearWaveX, rex.WaveX

Author: rpauly, aidanbharath, ssolson Date: 2023-09-26


Returns the name of the predefined region in which the given coordinates reside.


Returns data from the WPTO wave hindcast hosted on AWS at the specified latitude and longitude point(s), or the closest available point(s).


Returns directional spectra data from the WPTO wave hindcast hosted on AWS at the specified latitude and longitude point(s), or the closest available point(s).


Returns the name of the predefined region in which the given coordinates reside. Can be used to check if the passed lat/lon pair is within the WPTO hindcast dataset.


lat_lon (list or tuple) – Latitude and longitude coordinates as floats or integers


region (string) – Name of predefined region for given coordinates

mhkit.wave.io.hindcast.hindcast.request_wpto_point_data(data_type, parameter, lat_lon, years, tree=None, unscale=True, str_decode=True, hsds=True, path=None, to_pandas=True)[source]

Returns data from the WPTO wave hindcast hosted on AWS at the specified latitude and longitude point(s), or the closest available point(s). Visit https://registry.opendata.aws/wpto-pds-us-wave/ for more information about the dataset and available locations and years.

Note: To access the WPTO hindcast data, you will need to configure h5pyd for data access on HSDS. Please see the WPTO_hindcast_example notebook for setup instructions.

  • data_type (string) – Data set type of interest. Options: ‘3-hour’, ‘1-hour’

  • parameter (string or list of strings) –

    Dataset parameter to be downloaded.

    3-hour dataset options: ‘directionality_coefficient’, ‘energy_period’, ‘maximum_energy_direction’, ‘mean_absolute_period’, ‘mean_zero-crossing_period’, ‘omni-directional_wave_power’, ‘peak_period’, ‘significant_wave_height’, ‘spectral_width’, ‘water_depth’

    1-hour dataset options: ‘directionality_coefficient’, ‘energy_period’, ‘maximum_energy_direction’, ‘mean_absolute_period’, ‘mean_zero-crossing_period’, ‘omni-directional_wave_power’, ‘peak_period’, ‘significant_wave_height’, ‘spectral_width’, ‘water_depth’, ‘maximim_energy_direction’, ‘mean_wave_direction’, ‘frequency_bin_edges’

  • lat_lon (tuple or list of tuples) – Latitude longitude pairs at which to extract data

  • years (list) – Year(s) to be accessed. The years 1979-2010 are available. Examples: [1996] or [2004,2006,2007]

  • tree (str | cKDTree (optional)) – cKDTree or path to .pkl file containing pre-computed tree of lat, lon coordinates, default = None

  • unscale (bool (optional)) – Boolean flag to automatically unscale variables on extraction. Default = True

  • str_decode (bool (optional)) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read. Default = True

  • hsds (bool (optional)) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS. Setting to False will indicate to look for files on local machine, not AWS. Default = True

  • path (string (optional)) – Optionally override with a custom .h5 filepath. Useful when setting hsds=False.

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


  • data (pandas DataFrame or xarray Dataset) – Data indexed by datetime with columns named for parameter and corresponding metadata index

  • meta (DataFrame) – Location metadata for the requested data location

mhkit.wave.io.hindcast.hindcast.request_wpto_directional_spectrum(lat_lon, year, tree=None, unscale=True, str_decode=True, hsds=True, path=None)[source]

Returns directional spectra data from the WPTO wave hindcast hosted on AWS at the specified latitude and longitude point(s), or the closest available point(s). The data is returned as an xarray Dataset with keys indexed by a graphical identifier (gid). gids are integers which represent a lat, lon at which data is stored. Requesting an array of lat_lons will return a dataset with multiple gids representing the data closest to each requested lat, lon.

Visit https://registry.opendata.aws/wpto-pds-us-wave/ for more information about the dataset and available locations and years.

Note: To access the WPTO hindcast data, you will need to configure h5pyd for data access on HSDS. Please see the WPTO_hindcast_example notebook for more information.

  • lat_lon (tuple or list of tuples) – Latitude longitude pairs at which to extract data

  • year (string) – Year to be accessed. The years 1979-2010 are available. Only one year can be requested at a time.

  • tree (str | cKDTree (optional)) – cKDTree or path to .pkl file containing pre-computed tree of lat, lon coordinates, default = None

  • unscale (bool (optional)) – Boolean flag to automatically unscale variables on extraction. Default = True

  • str_decode (bool (optional)) – Boolean flag to decode the bytestring meta data into normal strings. Setting this to False will speed up the meta data read. Default = True

  • hsds (bool (optional)) – Boolean flag to use h5pyd to handle .h5 ‘files’ hosted on AWS behind HSDS. Setting to False will indicate to look for files on local machine, not AWS. Default = True

  • path (string (optional)) – Optionally override with a custom .h5 filepath. Useful when setting hsds=False


  • data (xarray Dataset) – Coordinates as datetime, frequency, and direction for data at specified location(s)

  • meta (DataFrame) – Location metadata for the requested data location


The resource submodule contains functions to compute wave energy spectra and metrics.

The following functions can be used to compute wave energy spectra:


Calculates the wave energy spectrum from wave elevation time-series


Calculates Pierson-Moskowitz Spectrum from IEC TS 62600-2 ED2 Annex C.2 (2019)


Calculates JONSWAP Spectrum from IEC TS 62600-2 ED2 Annex C.2 (2019)

The following functions can be used to compute wave metrics from spectra:


Calculates wave elevation time-series from spectrum


Calculates the Nth frequency moment of the spectrum


Calculates significant wave height from spectra


Calculates wave average zero crossing period from spectra


Calculates wave average crest period from spectra


Calculates mean wave period from spectra


Calculates wave peak period from spectra


Calculates wave energy period from spectra


Calculates bandwidth from spectra


Calculates wave spectral width from spectra


Calculates the omnidirectional wave energy flux of the spectra


Convert from spectral energy period (Te) to peak period (Tp) using ITTC approximation for JONSWAP Spectrum.


Calculates wave celerity (group velocity)


Calculates wave length from wave number. To compute: 2*pi/wavenumber


Calculates wave number


Calculates the depth regime based on wavelength and water depth. Deep water: h/l > ratio. This function exists so sinh in wave celerity doesn't blow up to infinity.

mhkit.wave.resource.elevation_spectrum(eta, sample_rate, nnft, window='hann', detrend=True, noverlap=None, time_dimension='', to_pandas=True)[source]

Calculates the wave energy spectrum from wave elevation time-series

  • eta (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Wave surface elevation [m] indexed by time [datetime or s]

  • sample_rate (float) – Data frequency [Hz]

  • nnft (integer) – Number of bins in the Fast Fourier Transform

  • window (string (optional)) – Signal window type. ‘hann’ is used by default given the broadband nature of waves. See scipy.signal.get_window for more options.

  • detrend (bool (optional)) – Specifies if a linear trend is removed from the data before calculating the wave energy spectrum. Data is detrended by default.

  • noverlap (int, optional) – Number of points to overlap between segments. If None, noverlap = nperseg / 2. Defaults to None.

  • time_dimension (string (optional)) – Name of the xarray dimension corresponding to time. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz].

mhkit.wave.resource.pierson_moskowitz_spectrum(f, Tp, Hs, to_pandas=True)[source]

Calculates Pierson-Moskowitz Spectrum from IEC TS 62600-2 ED2 Annex C.2 (2019)

  • f (list, np.ndarray, pd.Series, xr.DataArray) – Frequency [Hz]

  • Tp (float/int) – Peak period [s]

  • Hs (float/int) – Significant wave height [m]

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


S (xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

mhkit.wave.resource.jonswap_spectrum(f, Tp, Hs, gamma=None, to_pandas=True)[source]

Calculates JONSWAP Spectrum from IEC TS 62600-2 ED2 Annex C.2 (2019)

  • f (list, np.ndarray, pd.Series, xr.DataArray) – Frequency [Hz]

  • Tp (float/int) – Peak period [s]

  • Hs (float/int) – Significant wave height [m]

  • gamma (float (optional)) – Peak enhancement factor (gamma) for the JONSWAP spectrum

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


S (pandas Series or xarray DataArray) – Spectral density [m^2/Hz] indexed by frequency [Hz]
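To make the two parametric spectra above concrete, here is a minimal pure-Python sketch of the commonly quoted Pierson-Moskowitz and JONSWAP forms. It is not the library code: the function names, the peak-width constants (0.07 and 0.09), and the normalising factor 1 - 0.287 ln(gamma) are assumptions taken from the usual textbook formulation and should be checked against IEC TS 62600-2 Annex C.2. The documented function derives gamma itself when gamma=None; that rule is not reproduced here, so gamma must be supplied.

```python
import math

def pierson_moskowitz(f, Tp, Hs):
    """Pierson-Moskowitz spectral density [m^2/Hz] at frequency f [Hz] (f > 0)."""
    B = 1.25 * (1.0 / Tp) ** 4
    A = B * (Hs / 2.0) ** 2
    return A * f ** -5 * math.exp(-B * f ** -4)

def jonswap(f, Tp, Hs, gamma):
    """JONSWAP density: the PM shape scaled by a peak enhancement term."""
    fp = 1.0 / Tp
    sigma = 0.07 if f <= fp else 0.09            # assumed peak-width constants
    alpha = math.exp(-0.5 * ((f / fp - 1.0) / sigma) ** 2)
    C = 1.0 - 0.287 * math.log(gamma)            # assumed normalising factor
    return C * pierson_moskowitz(f, Tp, Hs) * gamma ** alpha

df = 0.01
freqs = [df * i for i in range(1, 501)]          # 0.01 Hz to 5.00 Hz
S_pm = [pierson_moskowitz(f, Tp=8.0, Hs=2.0) for f in freqs]
Hm0_check = 4.0 * math.sqrt(sum(S_pm) * df)      # should be close to the input Hs
```

With gamma = 1 the enhancement term drops out and the JONSWAP sketch reduces to the Pierson-Moskowitz shape, which is a quick sanity check on any implementation.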

mhkit.wave.resource.surface_elevation(S, time_index, seed=None, frequency_bins=None, phases=None, method='ifft', frequency_dimension='', to_pandas=True)[source]

Calculates wave elevation time-series from spectrum

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • time_index (numpy array) – Time used to create the wave elevation time-series [s], for example, time = np.arange(0,100,0.01)

  • seed (int (optional)) – Random seed

  • frequency_bins (numpy array, pandas Series, or xarray DataArray (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • phases (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Explicit phases for frequency components (overrides seed), for example, phases = np.random.rand(len(S)) * 2 * np.pi

  • method (str (optional)) – Method used to calculate the surface elevation. ‘ifft’ (Inverse Fast Fourier Transform) is used by default if frequency_bins is None or evenly spaced. ‘sum_of_sines’ explicitly sums each frequency component and is used by default if uneven frequency_bins are provided. The ‘ifft’ method is significantly faster.

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension (the index for pandas input).

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


eta (pandas DataFrame or xarray Dataset) – Wave surface elevation [m] indexed by time [s].
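The ‘sum_of_sines’ idea can be sketched without any dependencies: each frequency bin contributes one cosine whose amplitude is sqrt(2 S Δf). The helper below is illustrative only; its name mirrors the documented function, but the signature, the evenly sized bins, and the use of Python's random module for phases are assumptions rather than the library's behaviour.

```python
import math
import random

def surface_elevation(S, f, delta_f, t, phases):
    """Sum-of-sines synthesis: one cosine per frequency bin."""
    amplitudes = [math.sqrt(2.0 * s * delta_f) for s in S]
    return [
        sum(a * math.cos(2.0 * math.pi * fi * ti + ph)
            for a, fi, ph in zip(amplitudes, f, phases))
        for ti in t
    ]

f = [0.25, 0.50, 0.75, 1.00]          # frequencies [Hz]
S = [0.5, 0.5, 0.5, 0.5]              # spectral density [m^2/Hz]
delta_f = 0.25                        # bin width [Hz]
t = [0.1 * i for i in range(100)]     # time [s]

rng = random.Random(0)                # seeded so the phases are repeatable
phases = [rng.random() * 2.0 * math.pi for _ in f]
eta = surface_elevation(S, f, delta_f, t, phases)

# With all phases zero, the components align at t = 0 and the amplitudes add up.
eta_aligned = surface_elevation(S, f, delta_f, [0.0], [0.0] * len(f))
```

Passing explicit phases, as in the second call, makes the result deterministic, which matches the note above that phases override seed.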

mhkit.wave.resource.frequency_moment(S, N, frequency_bins=None, frequency_dimension='', to_pandas=True)[source]

Calculates the Nth frequency moment of the spectrum

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • N (int) – Moment (0 for 0th, 1 for 1st, ...)

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


m (pandas DataFrame or xarray Dataset) – Nth Frequency Moment indexed by S.columns
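As a rough illustration of what an Nth frequency moment is, the sketch below evaluates the discrete sum of S(f) · f^N · Δf over plain Python lists. The helper name matches the documented function for readability, but the signature and the rule for inferring a uniform bin width are assumptions.

```python
def frequency_moment(S, f, N, delta_f=None):
    """Discrete Nth moment: sum of S_i * f_i**N * delta_f_i."""
    if delta_f is None:
        # Assume evenly sized bins when no widths are given.
        step = f[1] - f[0]
        delta_f = [step] * len(f)
    return sum(s * fi ** N * dfi for s, fi, dfi in zip(S, f, delta_f))

f = [0.1, 0.2, 0.3, 0.4]              # frequency [Hz]
S = [1.0, 2.0, 2.0, 1.0]              # spectral density [m^2/Hz]
m0 = frequency_moment(S, f, 0)        # area under the spectrum
m1 = frequency_moment(S, f, 1)
```

Negative moments (for example N = -1, used by the energy period) require every frequency to be strictly positive.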

mhkit.wave.resource.significant_wave_height(S, frequency_dimension='', frequency_bins=None, to_pandas=True)[source]

Calculates significant wave height from spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


Hm0 (pandas DataFrame or xarray Dataset) – Significant wave height [m] indexed by S.columns
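The spectral significant wave height follows from the zeroth moment as Hm0 = 4 · sqrt(m0). A minimal sketch, assuming evenly sized frequency bins and plain lists in place of pandas or xarray inputs:

```python
import math

def significant_wave_height(S, delta_f):
    """Hm0 = 4 * sqrt(m0), with m0 the area under the spectrum."""
    m0 = sum(s * delta_f for s in S)
    return 4.0 * math.sqrt(m0)

# Triangular toy spectrum [m^2/Hz] on bins 0.125 Hz wide; m0 = 0.25 m^2.
Hm0 = significant_wave_height([0.0, 0.5, 1.0, 0.5, 0.0], 0.125)
```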

mhkit.wave.resource.average_zero_crossing_period(S, frequency_dimension='', frequency_bins=None, to_pandas=True)[source]

Calculates wave average zero crossing period from spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


Tz (pandas DataFrame or xarray Dataset) – Average zero crossing period [s] indexed by S.columns

mhkit.wave.resource.average_crest_period(S, frequency_dimension='', frequency_bins=None, to_pandas=True)[source]

Calculates wave average crest period from spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


Tavg (pandas DataFrame or xarray Dataset) – Average crest period [s] indexed by S.columns

mhkit.wave.resource.average_wave_period(S, frequency_dimension='', frequency_bins=None, to_pandas=True)[source]

Calculates mean wave period from spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


Tm (pandas DataFrame or xarray Dataset) – Mean wave period [s] indexed by S.columns

mhkit.wave.resource.peak_period(S, frequency_dimension='', to_pandas=True)[source]

Calculates wave peak period from spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


Tp (pandas DataFrame or xarray Dataset) – Wave peak period [s] indexed by S.columns

mhkit.wave.resource.energy_period(S, frequency_dimension='', frequency_bins=None, to_pandas=True)[source]

Calculates wave energy period from spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


Te (pandas DataFrame or xarray Dataset) – Wave energy period [s] indexed by S.columns
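The period metrics documented above (Tz, Tavg, Tm, Tp, Te) are all ratios of frequency moments, apart from Tp, which is read off the spectral peak. The sketch below shows the standard definitions on plain lists; the helper names and the uniform bin width are assumptions.

```python
import math

def moment(S, f, delta_f, N):
    """Discrete Nth frequency moment for evenly sized bins."""
    return sum(s * fi ** N * delta_f for s, fi in zip(S, f))

def wave_periods(S, f, delta_f):
    m = {N: moment(S, f, delta_f, N) for N in (-1, 0, 1, 2, 4)}
    i_peak = max(range(len(S)), key=lambda i: S[i])
    return {
        "Tz": math.sqrt(m[0] / m[2]),    # average zero crossing period
        "Tavg": math.sqrt(m[2] / m[4]),  # average crest period
        "Tm": m[0] / m[1],               # mean wave period
        "Tp": 1.0 / f[i_peak],           # peak period
        "Te": m[-1] / m[0],              # energy period
    }

# A single-bin spectrum at 0.1 Hz: every period metric collapses to 10 s.
periods = wave_periods(S=[0.0, 1.0, 0.0], f=[0.05, 0.10, 0.15], delta_f=0.05)
```

For broader spectra the metrics separate, which is why they are reported individually.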

mhkit.wave.resource.spectral_bandwidth(S, frequency_dimension='', frequency_bins=None, to_pandas=True)[source]

Calculates bandwidth from spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


e (pandas DataFrame or xarray Dataset) – Spectral bandwidth [s] indexed by S.columns

mhkit.wave.resource.spectral_width(S, frequency_dimension='', frequency_bins=None, to_pandas=True)[source]

Calculates wave spectral width from spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


v (pandas DataFrame or xarray Dataset) – Spectral width [m] indexed by S.columns
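One common definition of spectral width uses the negative moments: v = sqrt(m0 · m₋₂ / m₋₁² − 1). The sketch below assumes that definition and evenly sized bins; whether the library uses exactly this expression should be confirmed in its source.

```python
import math

def spectral_width(S, f, delta_f):
    """Spectral width from the moments m0, m_-1 and m_-2."""
    m0 = sum(s * delta_f for s in S)
    mn1 = sum(s / fi * delta_f for s, fi in zip(S, f))
    mn2 = sum(s / fi ** 2 * delta_f for s, fi in zip(S, f))
    # Clamp at zero so rounding on very narrow spectra cannot go negative.
    return math.sqrt(max(m0 * mn2 / mn1 ** 2 - 1.0, 0.0))

v = spectral_width(S=[1.0, 1.0], f=[0.1, 0.2], delta_f=0.1)
```

A spectrum concentrated in a single bin gives a width of zero; spreading energy across frequencies increases it.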

mhkit.wave.resource.energy_flux(S, h, deep=False, rho=1025, g=9.80665, ratio=2, frequency_dimension='', to_pandas=True)[source]

Calculates the omnidirectional wave energy flux of the spectra

  • S (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • h (float) – Water depth [m]

  • deep (bool (optional)) – If True use the deep water approximation. Default False. When False a depth check is run to check for shallow water. The ratio of the shallow water regime can be changed using the ratio keyword.

  • rho (float (optional)) – Water Density [kg/m^3]. Default = 1025 kg/m^3

  • g (float (optional)) – Gravitational acceleration [m/s^2]. Default = 9.80665 m/s^2

  • ratio (float or int (optional)) – Only applied if deep=False. If h/l > ratio, water depth will be set to deep. Default ratio = 2.

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


J (pandas DataFrame or xarray Dataset) – Omni-directional wave energy flux [W/m] indexed by S.columns
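For intuition, the omnidirectional flux is J = ρ g Σ Cg(f) S(f) Δf, and in deep water the group velocity is Cg = g / (4πf). The sketch below covers only that deep-water case on plain lists; the function name and the evenly sized bins are assumptions, and the finite-depth branch of the documented function is not reproduced.

```python
import math

def energy_flux_deep(S, f, delta_f, rho=1025.0, g=9.80665):
    """Deep-water omnidirectional wave energy flux [W/m]."""
    return rho * g * sum(
        g / (4.0 * math.pi * fi) * s * delta_f for s, fi in zip(S, f)
    )

S = [0.0, 1.0, 0.0]                   # spectral density [m^2/Hz]
f = [0.05, 0.10, 0.15]                # frequency [Hz]
J = energy_flux_deep(S, f, delta_f=0.05)

# Equivalent closed form: J = rho * g**2 / (64 * pi) * Hm0**2 * Te
Hm0 = 4.0 * math.sqrt(0.05)           # m0 = 1.0 * 0.05
Te = 10.0                             # all energy sits at 0.1 Hz
J_closed = 1025.0 * 9.80665 ** 2 / (64.0 * math.pi) * Hm0 ** 2 * Te
```

The two values agree because, in deep water, the sum reduces to ρ g² m₋₁ / (4π).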

mhkit.wave.resource.energy_period_to_peak_period(Te, gamma)[source]

Convert from spectral energy period (Te) to peak period (Tp) using ITTC approximation for JONSWAP Spectrum.

Approximation is given in “The Specialist Committee on Waves, Final Report and Recommendations to the 23rd ITTC”, Proceedings of the 23rd ITTC - Volume 2, Table A4.

  • Te (int, float, np.ndarray, pd.Series, pd.DataFrame, xr.DataArray, xr.Dataset)

  • gamma (float or int) – Peak enhancement factor for JONSWAP spectrum


Tp (float or array) – Spectral peak period [s]

mhkit.wave.resource.wave_celerity(k, h, g=9.80665, depth_check=False, ratio=2, frequency_dimension='', to_pandas=True)[source]

Calculates wave celerity (group velocity)

  • k (pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Wave number [1/m] indexed by frequency [Hz]

  • h (float) – Water depth [m]

  • g (float (optional)) – Gravitational acceleration [m/s^2]. Default 9.80665 m/s^2.

  • depth_check (bool (optional)) – If True check depth regime. Default False.

  • ratio (float or int (optional)) – Only applied if depth_check=True. If h/l > ratio, water depth will be set to deep. Default ratio = 2

  • frequency_dimension (string (optional)) – Name of the xarray dimension corresponding to frequency. If not supplied, defaults to the first dimension. Does not affect pandas input.

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


Cg (pandas DataFrame or xarray Dataset) – Wave celerity [m/s] indexed by frequency [Hz]
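The group velocity from linear wave theory is Cg = (πf / k) · (1 + 2kh / sinh(2kh)), which tends to πf / k in deep water. A scalar sketch under those assumptions follows; the function name and the deep flag are illustrative and do not mirror the library's depth_check logic.

```python
import math

def group_velocity(f, k, h, deep=False):
    """Group velocity [m/s] for frequency f [Hz], wave number k [1/m], depth h [m]."""
    if deep:
        return math.pi * f / k
    kh = k * h
    return (math.pi * f / k) * (1.0 + 2.0 * kh / math.sinh(2.0 * kh))

g = 9.80665
f = 0.1
k = (2.0 * math.pi * f) ** 2 / g      # deep-water dispersion relation
cg_deep = group_velocity(f, k, h=500.0, deep=True)
cg_finite = group_velocity(f, k, h=500.0)
```

For very large k·h the sinh term overflows in floating point, which is the practical reason for switching to the deep-water expression.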


Calculates wave length from wave number. To compute: 2*pi/wavenumber


k (int, float, numpy ndarray, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Wave number [1/m] indexed by frequency


l (int, float, numpy ndarray, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Wave length [m] indexed by frequency. Output type is identical to the type of k.

mhkit.wave.resource.wave_number(f, h, rho=1025, g=9.80665, to_pandas=True)[source]

Calculates wave number

To compute wave number from angular frequency (w), convert w to f before using this function (f = w/(2*pi))

  • f (int, float, numpy ndarray, pandas DataFrame, pandas Series, xarray DataArray) – Frequency [Hz]

  • h (float) – Water depth [m]

  • rho (float (optional)) – Water density [kg/m^3]

  • g (float (optional)) – Gravitational acceleration [m/s^2]

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


k (pandas DataFrame or xarray Dataset) – Wave number [1/m] indexed by frequency [Hz]
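The wave number solves the linear dispersion relation (2πf)² = g · k · tanh(k h). The sketch below uses a plain Newton iteration started from the deep-water value; this is a swapped-in technique for illustration, and the library's own solver and initial guess may differ. The tol and max_iter arguments are hypothetical.

```python
import math

def wave_number(f, h, g=9.80665, tol=1e-12, max_iter=50):
    """Solve (2*pi*f)**2 = g*k*tanh(k*h) for k [1/m] by Newton's method."""
    omega2 = (2.0 * math.pi * f) ** 2
    k = omega2 / g                                  # deep-water first guess
    for _ in range(max_iter):
        th = math.tanh(k * h)
        residual = g * k * th - omega2
        slope = g * th + g * k * h * (1.0 - th * th)
        step = residual / slope
        k -= step
        if abs(step) < tol * k:
            break
    return k

k_deep = wave_number(0.1, 1000.0)     # effectively deep water
k_shallow = wave_number(0.1, 10.0)    # finite depth gives a larger k
```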

mhkit.wave.resource.depth_regime(l, h, ratio=2)[source]

Calculates the depth regime based on wavelength and water depth. Deep water: h/l > ratio. This function exists so sinh in wave celerity doesn’t blow up to infinity.

P.K. Kundu and I.M. Cohen (2000) suggest h/l >> 1 for deep water (pg 209). They also suggest h/l > 0.28 for 3% accuracy (pg 210). However, since this function allows multiple wavelengths, higher ratio values are more accurate across varying wavelengths.

  • l (int, float, np.ndarray, pd.Series, pd.DataFrame, xr.DataArray, xr.Dataset) – wavelength [m]

  • h (float or int) – water column depth [m]

  • ratio (float or int (optional)) – if h/l > ratio, water depth will be set to deep. Default ratio = 2


depth_reg (boolean or boolean array-like) – Boolean True if deep water, False otherwise
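The wave length and depth regime rules described above are simple enough to write out directly. This scalar sketch reuses the documented names but is not the library code, which also accepts array-like inputs.

```python
import math

def wave_length(k):
    """Wave length [m] from wave number k [1/m]."""
    return 2.0 * math.pi / k

def depth_regime(l, h, ratio=2):
    """True if the water counts as deep, i.e. h / l > ratio."""
    return h / l > ratio

l = wave_length(math.pi)              # 2 m
is_deep = depth_regime(100.0, 250.0)  # 250 / 100 = 2.5 > 2
is_not_deep = depth_regime(100.0, 50.0)
```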


The performance submodule contains functions to compute capture length, statistics, performance matrices, and mean annual energy production.


Calculates the capture length (often called capture width).


Calculates statistics, including count, mean, standard deviation (std), min, percentiles (25%, 50%, 75%), and max.


Generates a capture length matrix for a given statistic


Generates a wave energy flux matrix for a given statistic


Generates a power matrix from a capture length matrix and wave energy flux matrix


Calculates mean annual energy production (MAEP) from time-series


Calculates mean annual energy production (MAEP) from matrix data along with data frequency in each bin


High-level function to compute power performance quantities of interest following IEC TS 62600-100 for given wave spectra.

mhkit.wave.performance.capture_length(P, J, to_pandas=True)[source]

Calculates the capture length (often called capture width).

  • P (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Power [W]

  • J (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Omnidirectional wave energy flux [W/m]

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


L (pandas Series or xarray DataArray) – Capture length [m]
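Capture length is the ratio of absorbed power to the incident energy flux, L = P / J, evaluated sample by sample. A minimal sketch on plain lists (the values are made up for illustration):

```python
def capture_length(P, J):
    """Capture length [m] = power [W] / wave energy flux [W/m]."""
    return [p / j for p, j in zip(P, J)]

P = [1000.0, 3000.0]                  # power [W]
J = [500.0, 1000.0]                   # omnidirectional energy flux [W/m]
L = capture_length(P, J)
```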

mhkit.wave.performance.statistics(X, to_pandas=True)[source]

Calculates statistics, including count, mean, standard deviation (std), min, percentiles (25%, 50%, 75%), and max.

Note that std uses a degree of freedom of 1 in accordance with IEC/TS 62600-100.

  • X (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Data

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


stats (pandas Series or xarray DataArray) – Statistics
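The same summary can be sketched with Python's standard statistics module. statistics.stdev uses n - 1 in the denominator, which matches the one degree of freedom noted above. The quartiles use the "inclusive" method, which is assumed here to correspond to linear interpolation between data points; the helper name describe is hypothetical.

```python
import statistics

def describe(X):
    """Count, mean, sample std (n - 1), min, quartiles, and max."""
    q25, q50, q75 = statistics.quantiles(X, n=4, method="inclusive")
    return {
        "count": len(X),
        "mean": statistics.fmean(X),
        "std": statistics.stdev(X),
        "min": min(X),
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "max": max(X),
    }

stats = describe([1.0, 2.0, 3.0, 4.0, 5.0])
```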

mhkit.wave.performance.capture_length_matrix(Hm0, Te, L, statistic, Hm0_bins, Te_bins, to_pandas=True)[source]

Generates a capture length matrix for a given statistic

Note that IEC/TS 62600-100 requires capture length matrices for the mean, std, count, min, and max.

  • Hm0 (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Significant wave height from spectra [m]

  • Te (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Energy period from spectra [s]

  • L (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Capture length [m]

  • statistic (string) – Statistic for each bin, options include: ‘mean’, ‘std’, ‘median’, ‘count’, ‘sum’, ‘min’, ‘max’, and ‘frequency’. Note that ‘std’ uses a degree of freedom of 1 in accordance with IEC/TS 62600-100.

  • Hm0_bins (numpy array) – Bin centers for Hm0 [m]

  • Te_bins (numpy array) – Bin centers for Te [s]

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


LM (pandas DataFrame or xarray DataArray) – Capture length matrix with index equal to Hm0_bins and columns equal to Te_bins
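Conceptually, the matrix groups capture length samples by sea state and applies the chosen statistic within each (Hm0, Te) cell. The sketch below assigns each sample to the nearest bin center and takes the mean. It is illustrative only: the helper names are hypothetical, and the library's handling of bin edges and out-of-range samples may differ.

```python
from collections import defaultdict
import statistics

def nearest(value, centers):
    """Index of the bin center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - value))

def capture_length_matrix_mean(Hm0, Te, L, Hm0_bins, Te_bins):
    cells = defaultdict(list)
    for h, t, l in zip(Hm0, Te, L):
        cells[(nearest(h, Hm0_bins), nearest(t, Te_bins))].append(l)
    # Rows follow Hm0_bins, columns follow Te_bins; empty cells become NaN.
    return [
        [statistics.fmean(cells[(i, j)]) if (i, j) in cells else float("nan")
         for j in range(len(Te_bins))]
        for i in range(len(Hm0_bins))
    ]

LM = capture_length_matrix_mean(
    Hm0=[1.1, 0.9, 2.2], Te=[5.2, 4.9, 7.8], L=[4.0, 6.0, 10.0],
    Hm0_bins=[1.0, 2.0], Te_bins=[5.0, 8.0],
)
```

Swapping statistics.fmean for another reducer (count, min, max, and so on) gives the other matrices the IEC note above asks for.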

mhkit.wave.performance.wave_energy_flux_matrix(Hm0, Te, J, statistic, Hm0_bins, Te_bins, to_pandas=True)[source]

Generates a wave energy flux matrix for a given statistic

  • Hm0 (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Significant wave height from spectra [m]

  • Te (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Energy period from spectra [s]

  • J (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Wave energy flux from spectra [W/m]

  • statistic (string) – Statistic for each bin, options include: ‘mean’, ‘std’, ‘median’, ‘count’, ‘sum’, ‘min’, ‘max’, and ‘frequency’. Note that ‘std’ uses a degree of freedom of 1 in accordance with IEC/TS 62600-100.

  • Hm0_bins (numpy array) – Bin centers for Hm0 [m]

  • Te_bins (numpy array) – Bin centers for Te [s]

  • to_pandas (bool (optional)) – Flag to output pandas instead of xarray. Default = True.


JM (pandas DataFrame or xarray DataArray) – Wave energy flux matrix with index equal to Hm0_bins and columns equal to Te_bins

mhkit.wave.performance.power_matrix(LM, JM)[source]

Generates a power matrix from a capture length matrix and wave energy flux matrix

  • LM (pandas DataFrame, xarray DataArray, or xarray Dataset) – Capture length matrix

  • JM (pandas DataFrame, xarray DataArray, or xarray Dataset) – Wave energy flux matrix


PM (pandas DataFrame, xarray DataArray, or xarray Dataset) – Power matrix

mhkit.wave.performance.mean_annual_energy_production_timeseries(L, J)[source]

Calculates mean annual energy production (MAEP) from time-series

  • L (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Capture length

  • J (numpy array, pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Wave energy flux


maep (float) – Mean annual energy production

mhkit.wave.performance.mean_annual_energy_production_matrix(LM, JM, frequency)[source]

Calculates mean annual energy production (MAEP) from matrix data along with data frequency in each bin

  • LM (pandas DataFrame, xarray DataArray, or xarray Dataset) – Capture length

  • JM (pandas DataFrame, xarray DataArray, or xarray Dataset) – Wave energy flux

  • frequency (pandas DataFrame, xarray DataArray, or xarray Dataset) – Data frequency for each bin


maep (float) – Mean annual energy production
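Both MAEP functions amount to averaging the product L · J (power in W) and scaling by the hours in a year. The sketch below assumes 8766 h per year (365.25 days) and ignores missing values; both are assumptions to verify against the library, and the helper names are hypothetical.

```python
HOURS_PER_YEAR = 8766  # assumed: 365.25 days * 24 h

def maep_timeseries(L, J):
    """MAEP [W h] from paired capture length [m] and energy flux [W/m] samples."""
    n = len(L)
    return HOURS_PER_YEAR / n * sum(l * j for l, j in zip(L, J))

def maep_matrix(LM, JM, frequency):
    """MAEP [W h] from matrices, weighting each bin by its frequency of occurrence."""
    total = sum(
        l * j * p
        for rows in zip(LM, JM, frequency)
        for l, j, p in zip(*rows)
    )
    return HOURS_PER_YEAR * total

maep_ts = maep_timeseries(L=[2.0, 2.0], J=[1000.0, 3000.0])
maep_mx = maep_matrix(
    LM=[[2.0, 2.0]], JM=[[1000.0, 3000.0]], frequency=[[0.5, 0.5]]
)
```

The two routes agree here because the bin frequencies (0.5 each) reproduce the equal weighting of the two time-series samples.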

mhkit.wave.performance.power_performance_workflow(S, h, P, statistic, frequency_bins=None, deep=False, rho=1205, g=9.80665, ratio=2, show_values=False, savepath='')[source]

High-level function to compute power performance quantities of interest following IEC TS 62600-100 for given wave spectra.

  • S (pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • h (float) – Water depth [m]

  • P (numpy ndarray, pandas DataFrame, pandas Series, xarray DataArray, or xarray Dataset) – Power [W]

  • statistic (string or list of strings) – Statistics for plotting capture length matrices, options include: “mean”, “std”, “median”, “count”, “sum”, “min”, “max”, and “frequency”. Note that “std” uses a degree of freedom of 1 in accordance with IEC/TS 62600-100. To output capture length matrices for multiple binning parameters, define as a list of strings: statistic = [“”, “”, “”]

  • frequency_bins (numpy array or pandas Series (optional)) – Bin widths for frequency of S. Required for unevenly sized bins

  • deep (bool (optional)) – If True use the deep water approximation. Default False. When False a depth check is run to check for shallow water. The ratio of the shallow water regime can be changed using the ratio keyword.

  • rho (float (optional)) – Water density [kg/m^3]. Default = 1025 kg/m^3

  • g (float (optional)) – Gravitational acceleration [m/s^2]. Default = 9.80665 m/s^2

  • ratio (float or int (optional)) – Only applied if deep=False. If h/l > ratio, water depth will be set to deep. Default ratio = 2.

  • show_values (bool (optional)) – Show values on the scatter diagram. Default = False.

  • savepath (string (optional)) – Path to save figure. Terminate with ‘’. Default=””.


  • LM (xarray Dataset) – Capture length matrices

  • maep_matrix (float) – Mean annual energy production


The graphics submodule contains functions to plot wave data and related metrics.


Plots wave amplitude spectrum versus omega


Plot wave surface elevation time-series


Plots values in the matrix as a scatter diagram


Plots, in the style of Chakrabarti (2005), the relative importance of viscous, inertia, and diffraction phenomena.


Plots an overlay of the x1 and x2 variables on the calculated environmental contours.


Creates an average annual energy matrix with frequency of occurrence.


Creates a cumulative distribution of energy flux as described in IEC TS 62600-101.


Create subplots showing: Significant Wave Height (Hs), Peak Period (Tp), and Direction (Dp) using OPeNDAP service from CDIP THREDDS Server.


Create plot of monthly-averaged boxes of Significant Wave Height (Hs) data.


Create a contour polar plot of a directional spectrum.

mhkit.wave.graphics.plot_spectrum(S, ax=None)[source]

Plots wave amplitude spectrum versus omega

  • S (pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Spectral density [m^2/Hz] indexed by frequency [Hz]

  • ax (matplotlib axes object) – Axes for plotting. If None, then a new figure is created.


ax (matplotlib pyplot axes)

mhkit.wave.graphics.plot_elevation_timeseries(eta, ax=None)[source]

Plot wave surface elevation time-series

  • eta (pandas Series, pandas DataFrame, xarray DataArray, or xarray Dataset) – Wave surface elevation [m] indexed by time [datetime or s]

  • ax (matplotlib axes object) – Axes for plotting. If None, then a new figure is created.


ax (matplotlib pyplot axes)

mhkit.wave.graphics.plot_matrix(M, xlabel='Te', ylabel='Hm0', zlabel=None, show_values=True, ax=None)[source]

Plots values in the matrix as a scatter diagram

  • M (pandas Series, pandas DataFrame, xarray DataArray) – Matrix with numeric labels for x and y axis, and numeric entries. An example would be the average capture length matrix generated by mhkit.device.wave, or something similar.

  • xlabel (string (optional)) – Title of the x-axis

  • ylabel (string (optional)) – Title of the y-axis

  • zlabel (string (optional)) – Colorbar label

  • show_values (bool (optional)) – Show values on the scatter diagram

  • ax (matplotlib axes object) – Axes for plotting. If None, then a new figure is created.


ax (matplotlib pyplot axes)

mhkit.wave.graphics.plot_chakrabarti(H, lambda_w, D, ax=None)[source]

Plots, in the style of Chakrabarti (2005), the relative importance of viscous, inertia, and diffraction phenomena. Reference: Chakrabarti, Subrata. Handbook of Offshore Engineering (2-volume set). Elsevier, 2005.


Using floats

>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> import pandas as pd
>>> from mhkit import wave
>>> plt.figure()
>>> D = 5
>>> H = 8
>>> lambda_w = 200
>>> wave.graphics.plot_chakrabarti(H, lambda_w, D)

Using numpy array

>>> plt.figure()
>>> D = np.linspace(5,15,5)
>>> H = 8*np.ones_like(D)
>>> lambda_w = 200*np.ones_like(D)
>>> wave.graphics.plot_chakrabarti(H, lambda_w, D)

Using pandas DataFrame

>>> plt.figure()
>>> D = np.linspace(5,15,5)
>>> H = 8*np.ones_like(D)
>>> lambda_w = 200*np.ones_like(D)
>>> df = pd.DataFrame([H.flatten(),lambda_w.flatten(),D.flatten()], index=['H','lambda_w','D']).transpose()
>>> wave.graphics.plot_chakrabarti(df.H, df.lambda_w, df.D)

  • H (int, float, numpy array, pandas Series, or xarray DataArray) – Wave height [m]

  • lambda_w (int, float, numpy array, pandas Series, or xarray DataArray) – Wave length [m]

  • D (int, float, numpy array, pandas Series, or xarray DataArray) – Characteristic length [m]

  • ax (matplotlib axes object (optional)) – Axes for plotting. If None, then a new figure is created.


ax (matplotlib pyplot axes)

mhkit.wave.graphics.plot_environmental_contour(x1, x2, x1_contour, x2_contour, **kwargs)[source]

Plots an overlay of the x1 and x2 variables on the calculated environmental contours.

  • x1 (list, np.ndarray, pd.Series, xr.DataArray) – x-axis data

  • x2 (list, np.ndarray, pd.Series, xr.DataArray) – y-axis data

  • x1_contour (list, np.ndarray, pd.Series, xr.DataArray) – Calculated x1 contour values

  • x2_contour (list, np.ndarray, pd.Series, xr.DataArray) – Calculated x2 contour values

  • **kwargs (optional) –

    x_label: string (optional)

    x-axis label. Default None.

    y_label: string (optional)

    y-axis label. Default None.

    data_label: string (optional)

    Legend label for x1, x2 data (e.g. ‘Buoy 46022’). Default None.

    contour_label: string or list of strings (optional)

    Legend label for x1_contour, x2_contour contour data (e.g. ‘100-year contour’). Default None.

    ax: matplotlib axes object (optional)

    Axes for plotting. If None, then a new figure is created. Default None.

    markers: string or list of strings

    Marker types to use.


ax (matplotlib pyplot axes)

mhkit.wave.graphics.plot_avg_annual_energy_matrix(Hm0, Te, J, time_index=None, Hm0_bin_size=None, Te_bin_size=None, Hm0_edges=None, Te_edges=None)[source]

Creates an average annual energy matrix with frequency of occurrence.

  • Hm0 (array-like) – Significant wave height

  • Te (array-like) – Energy period

  • J (array-like) – Energy flux

  • time_index (DateTime Index) – Time to index by. Optional, default None. If None, the passed parameters must be Series indexed by datetime.

  • Hm0_bin_size (float, int) – Creates bin edges using this discretization. Optional, default None. If not passed, Hm0_edges must be passed.

  • Te_bin_size (float, int) – Creates bin edges using this discretization. Optional, default None. If not passed, Te_edges must be passed.

  • Hm0_edges (array-like) – Defines the Hm0 bin edges to use. Optional default None.

  • Te_edges (array-like) – Defines the Te bin edges to use. Optional default None.


fig (Figure) – Average annual energy table plot


Creates a cumulative distribution of energy flux as described in IEC TS 62600-101.


J (pd.Series, xr.DataArray) – Energy Flux with DateTime index


ax (axes) – Figure of monthly cumulative distribution
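As a rough illustration of what a monthly cumulative distribution involves, the sketch below groups energy flux samples by calendar month and builds an empirical cumulative distribution for each. The function name `monthly_cumulative` and the simple rank-based estimator are assumptions made for this example; the plotting function may use a different estimator.

```python
from collections import defaultdict
from datetime import datetime


def monthly_cumulative(times, J):
    """Return {month: (sorted J values, cumulative fractions)}."""
    by_month = defaultdict(list)
    for t, j in zip(times, J):
        by_month[t.month].append(j)

    result = {}
    for month, values in sorted(by_month.items()):
        values.sort()
        n = len(values)
        # Fraction of samples in this month at or below each sorted value.
        fractions = [(i + 1) / n for i in range(n)]
        result[month] = (values, fractions)
    return result


# Example with three samples across two months.
times = [datetime(2020, 1, 1), datetime(2020, 1, 2), datetime(2020, 2, 1)]
cdf = monthly_cumulative(times, [30.0, 10.0, 5.0])
```

Plotting each month's sorted values against its fractions gives one curve per month.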

mhkit.wave.graphics.plot_compendium(Hs, Tp, Dp, buoy_title=None, ax=None)[source]

Create subplots showing: Significant Wave Height (Hs), Peak Period (Tp), and Direction (Dp) using OPeNDAP service from CDIP THREDDS Server.

See http://cdip.ucsd.edu/themes/cdip?pb=1&bl=cdip?pb=1&d2=p70&u3=s:100:st:1:v:compendium:dt:201204 for example Compendium plot.

Developed based on: http://cdip.ucsd.edu/themes/media/docs/documents/html_pages/compendium.html

  • Hs (pandas Series or xarray DataArray) – significant wave height

  • Tp (pandas Series or xarray DataArray) – peak period

  • Dp (pandas Series or xarray DataArray) – peak direction

  • buoy_title (string (optional)) – Buoy title from the CDIP THREDDS Server

  • ax (matplotlib axes object (optional)) – Axes for plotting. If None, then a new figure is created.


ax (matplotlib pyplot axes)

mhkit.wave.graphics.plot_boxplot(Hs, buoy_title=None)[source]

Create plot of monthly-averaged boxes of Significant Wave Height (Hs) data.

Developed based on:


  • Hs (pandas Series or xarray DataArray) – Significant wave height data indexed by time

  • buoy_title (string (optional)) – Buoy title from the CDIP THREDDS Server

  • ax (matplotlib axes object (optional)) – Axes for plotting. If None, then a new figure is created.


ax (matplotlib pyplot axes)
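The statistics a monthly box plot summarizes can be sketched without any plotting library. The helper `monthly_box_stats` below is a hypothetical name, and the quartile convention (the standard library's "inclusive" method) is an assumption; a plotting backend may compute quartiles slightly differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, quantiles


def monthly_box_stats(times, Hs):
    """Return {month: quartiles and mean of Hs} for months with 2+ samples."""
    by_month = defaultdict(list)
    for t, h in zip(times, Hs):
        by_month[t.month].append(h)

    stats = {}
    for month, values in sorted(by_month.items()):
        if len(values) < 2:
            continue  # quartiles need at least two samples
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        stats[month] = {"q1": q1, "median": median, "q3": q3, "mean": mean(values)}
    return stats


# Example: five daily samples in January.
start = datetime(2020, 1, 1)
times = [start + timedelta(days=d) for d in range(5)]
stats = monthly_box_stats(times, [1.0, 2.0, 3.0, 4.0, 5.0])
```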

mhkit.wave.graphics.plot_directional_spectrum(spectrum, color_level_min=None, fill=True, nlevels=11, name='Elevation Variance', units='m^2')[source]

Create a contour polar plot of a directional spectrum.

  • spectrum (xarray.DataArray) – Spectral data indexed by frequency [Hz] and wave direction [deg].

  • color_level_min (float (optional)) – Minimum color bar level.

  • fill (bool) – Whether to use contourf (filled) instead of contour (lines).

  • nlevels (int) – Number of contour levels to plot.

  • name (str) – Name of the (integral) spectrum variable.

  • units (str) – Units of the (integral) spectrum variable.


ax (matplotlib pyplot axes)


Contains functions for calculating environmental contours of extreme sea states.


Returns a Dictionary of x1 and x2 components for each contour method passed.


Calculates environmental contours of extreme sea states using the improved joint probability distributions with the inverse first-order reliability method (I-FORM) probability for the desired return period (return_period).


Sample a sea state between contours of specified return periods.


Get Hs points along a specified environmental contour using user-defined T values.

mhkit.wave.contours.environmental_contours(x1, x2, sea_state_duration, return_period, method, **kwargs)[source]

Returns a Dictionary of x1 and x2 components for each contour method passed. A method may be one of the following: Principal Component Analysis, Gaussian, Gumbel, Clayton, Rosenblatt, nonparametric Gaussian, nonparametric Clayton, nonparametric Gumbel, bivariate KDE, log bivariate KDE.

  • x1 (list, np.ndarray, pd.Series, xr.DataArray) – Component 1 data

  • x2 (list, np.ndarray, pd.Series, xr.DataArray) – Component 2 data

  • sea_state_duration (int or float) – x1 and x2 averaging period in seconds

  • return_period (int, float) – Return period of interest in years

  • method (string or list) – Copula method to apply. Options include [‘PCA’, ‘gaussian’, ‘gumbel’, ‘clayton’, ‘rosenblatt’, ‘nonparametric_gaussian’, ‘nonparametric_clayton’, ‘nonparametric_gumbel’, ‘bivariate_KDE’, ‘bivariate_KDE_log’].

  • **kwargs

    min_bin_count: int

    Passed to _copula_parameters to set the minimum number of bins allowed. Default = 40.

    initial_bin_max_val: int, float

    Passed to _copula_parameters to set the max value of the first bin. Default = 1.

    bin_val_size: int, float

    Passed to _copula_parameters to set the size of each bin after the initial bin. Default 0.25.

    nb_steps: int

    Discretization of the circle in the normal space used for copula component calculation. Default nb_steps=1000.


    Must specify bandwidth for bivariate KDE method. Default = None.

    Ndata_bivariate_KDE: int

    Applies only when a bivariate KDE method is specified. Defines the contoured space from which samples are taken. Default = 100.

    max_x1: float

    Defines the max value of x1 to discretize the KDE space

    max_x2: float

    Defines the max value of x2 to discretize the KDE space

    PCA: dict

    If provided, the principal component analysis (PCA) on x1, x2 is skipped. The PCA is the same for a given x1, x2 pair, so this step may be skipped when multiple calls to environmental contours are made for the same pair. The PCA dict may be obtained by setting return_fit=True when calling the PCA method.

    return_fit: boolean

    Will return fitting parameters used for each method passed. Default False.


copulas (Dictionary) – Dictionary of x1 and x2 copula components for each copula method
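The way sea_state_duration and return_period combine can be shown with a short calculation. This is a sketch under stated assumptions rather than the library's code: a 365-day year is assumed, and the names `exceedance_probability` and `reliability_index` are hypothetical. The idea is that a return period spans a known number of sea states, and the contour corresponds to the event exceeded once in that many states.

```python
from statistics import NormalDist

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year


def exceedance_probability(sea_state_duration, return_period):
    """Probability that a single sea state exceeds the return-period event."""
    states_per_return_period = return_period * SECONDS_PER_YEAR / sea_state_duration
    return 1.0 / states_per_return_period


def reliability_index(sea_state_duration, return_period):
    """Standard-normal quantile matching the exceedance probability."""
    p = exceedance_probability(sea_state_duration, return_period)
    return NormalDist().inv_cdf(1.0 - p)


# Hourly sea states (3600 s) and a 100-year return period.
p = exceedance_probability(3600, 100)
beta = reliability_index(3600, 100)
```

With hourly sea states and a 100-year return period, there are 876,000 sea states in the return period, so the exceedance probability is about 1.14e-6 and the corresponding standard-normal quantile is a little above 4.7.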

mhkit.wave.contours.PCA_contour(x1, x2, fit, kwargs)[source]

Calculates environmental contours of extreme sea states using the improved joint probability distributions with the inverse first-order reliability method (I-FORM) probability for the desired return period (return_period). Given the return_period of interest, a circle of iso-probability is created in the principal component analysis (PCA) joint probability (x1, x2) reference frame. Using the joint probability value, the cumulative distribution function (CDF) of the marginal distribution is used to find the quantile of each component. Finally, using the improved PCA methodology, the component 2 contour lines are calculated from component 1 using the relationships defined in Eckert-Gallup et al. (2016).

Eckert-Gallup, A. C., Sallaberry, C. J., Dallman, A. R., & Neary, V. S. (2016). Application of principal component analysis (PCA) and improved joint probability distributions to the inverse first-order reliability method (I-FORM) for predicting extreme sea states. Ocean Engineering, 112, 307-319.

  • x1 (list, np.ndarray, pd.Series, xr.DataArray) – Component 1 data

  • x2 (list, np.ndarray, pd.Series, xr.DataArray) – Component 2 data

  • fit (dict) – Dictionary of the iso-probability results. May additionally contain the principal component analysis (PCA) on x1, x2. The PCA is the same for a given x1, x2 pair, so this step may be skipped when multiple calls to environmental contours are made for the same pair. The PCA dict may be obtained by setting return_fit=True when calling the PCA method.

  • kwargs (optional) –


    Data points in each bin for the PCA fit. Default bin_size=250.


    Discretization of the circle in the normal space used for I-FORM calculation. Default nb_steps=1000.

    return_fit: boolean

    Default False, if True will return the PCA fit dictionary


  • x1_contour (numpy array) – Calculated x1 values along the contour boundary following return to original input orientation.

  • x2_contour (numpy array) – Calculated x2 values along the contour boundary following return to original input orientation.

  • fit (dict (optional)) – Principal component analysis dictionary. Keys:

    ‘principal_axes’: sign-corrected PCA axes

    ‘shift’: the shift applied to x2

    ‘x1_fit’: Gaussian fit of x1 data

    ‘mu_param’: fit to _mu_fcn

    ‘sigma_param’: fit to _sig_fits
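The circle of iso-probability mentioned in the description can be sketched as follows. This is an illustrative reading of the I-FORM step, not the library's implementation: `iso_probability_circle` is a hypothetical helper, `beta` is the circle radius in standard normal space (assumed to be computed beforehand from the return period), and the angular discretization here omits the duplicated end point.

```python
import math
from statistics import NormalDist


def iso_probability_circle(beta, nb_steps=1000):
    """Return CDF values (p1, p2) for points on a circle of radius beta.

    Each point (u1, u2) lies in standard normal space; applying the standard
    normal CDF gives the probabilities at which the marginal distributions of
    the two components would then be inverted to obtain contour values.
    """
    std_normal = NormalDist()
    points = []
    for k in range(nb_steps):
        theta = 2.0 * math.pi * k / nb_steps
        u1 = beta * math.cos(theta)
        u2 = beta * math.sin(theta)
        points.append((std_normal.cdf(u1), std_normal.cdf(u2)))
    return points


# Coarse example with four points on the circle.
circle = iso_probability_circle(4.73, nb_steps=4)
```

A larger nb_steps gives a smoother contour at the cost of more quantile evaluations.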

mhkit.wave.contours.samples_full_seastate(x1, x2, points_per_interval, return_periods, sea_state_duration, method='PCA', bin_size=250)[source]

Sample a sea state between contours of specified return periods.

This function is used for the full sea state approach for the extreme load. See Coe et al. 2018 for more details. It was originally part of WDRT.

Coe, R. G., Michelen, C., Eckert-Gallup, A., & Sallaberry, C. (2018). Full long-term design response analysis of a wave energy converter. Renewable Energy, 116, 356-366.

  • x1 (list, np.ndarray, pd.Series, xr.DataArray) – Component 1 data

  • x2 (list, np.ndarray, pd.Series, xr.DataArray) – Component 2 data

  • points_per_interval (int) – Number of sample points to be calculated per contour interval.

  • return_periods (np.array) – Vector of return periods that define the contour intervals in which samples will be taken. Values must be greater than zero and must be in increasing order.

  • sea_state_duration (int or float) – x1 and x2 sample rate (seconds)

  • method (string or list) – Copula method to apply. Currently only ‘PCA’ is implemented.

  • bin_size (int) – Number of data points in each bin


  • Hs_Samples (np.array) – Vector of Hs values for each sample point.

  • Te_Samples (np.array) – Vector of Te values for each sample point.

  • weight_points (np.array) – Vector of probabilistic weights for each sampling point to be used in risk calculations.

mhkit.wave.contours.samples_contour(t_samples, t_contour, hs_contour)[source]

Get Hs points along a specified environmental contour using user-defined T values.

  • t_samples (list, np.ndarray, pd.Series, xr.DataArray) – Points for sampling along return contour

  • t_contour (list, np.ndarray, pd.Series, xr.DataArray) – T values along contour

  • hs_contour (list, np.ndarray, pd.Series, xr.DataArray) – Hs values along contour


hs_samples (np.ndarray) – points sampled along return contour
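One way to picture sampling along a contour is linear interpolation over the contour's line segments. The sketch below is a simplified stand-in, not the library's implementation: `sample_hs_on_contour` is a hypothetical name, the contour is assumed to be a closed polyline, and where a period crosses the contour more than once the largest Hs is assumed to be the value of interest.

```python
def sample_hs_on_contour(t_samples, t_contour, hs_contour):
    """For each sample period, return the largest Hs where the contour crosses it."""
    n = len(t_contour)
    hs_samples = []
    for t in t_samples:
        crossings = []
        for i in range(n):
            j = (i + 1) % n  # wrap around so the contour is treated as closed
            t0, t1 = t_contour[i], t_contour[j]
            h0, h1 = hs_contour[i], hs_contour[j]
            if min(t0, t1) <= t <= max(t0, t1):
                if t1 == t0:
                    # Vertical segment: keep its upper end.
                    crossings.append(max(h0, h1))
                else:
                    # Linear interpolation along the segment.
                    crossings.append(h0 + (h1 - h0) * (t - t0) / (t1 - t0))
        if not crossings:
            raise ValueError(f"t = {t} lies outside the contour's period range")
        hs_samples.append(max(crossings))
    return hs_samples


# Diamond-shaped contour used purely as test data.
t_contour = [5.0, 10.0, 15.0, 10.0]
hs_contour = [2.0, 6.0, 2.0, 0.0]
hs = sample_hs_on_contour([7.5, 10.0], t_contour, hs_contour)
```

Sample periods outside the range spanned by t_contour raise an error here; a caller could equally choose to return NaN for them.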