DOLfYN Module
The DOLfYN module contains a set of functions to analyze binary Nortek or TRDI files. Instrument datafiles are processed and returned as an xarray dataset.
There a couple ways to start using the dolfyn module. The first way is to import the entire module using import mhkit.dolfyn as dolfyn. Many functions are directly available from the main dolfyn import:
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file. 

Read an ADCP or ADV datafile from the examples directory. 

Save xarray dataset as netCDF (.nc). 

Load xarray dataset from netCDF (.nc) 

Save xarray dataset as a MATLAB (.mat) file 

Load xarray dataset from MATLAB (.mat) file, complimentary to save_mat() 

Rotate a dataset to a new coordinate system. 

Compute the principal angle of the horizontal velocity. 

Set the magnetic declination 

Set the instrument to head rotation matrix for the Nortek ADV if it hasn't already been set through a '.userdata.json' file. 

Calculate the orientation matrix from DOLfYNdefined euler angles. 

Calculate DOLfYNdefined euler angles from the orientation matrix. 

Calculate orientation from Nortek AHRS quaternions, where q = [W, X, Y, Z] instead of the standard q = [X, Y, Z, W] = [q1, q2, q3, q4] 

This is the base binning (averaging) tool. 
ADP Module
The other two ways are to import the instrumentspecific modules. The ADP module contains routines for reading and working with ADP/ADCP data and is imported using from mhkit.dolfyn.adp import api. It contains:
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file. 

Load xarray dataset from netCDF (.nc) 

Rotate a dataset to a new coordinate system. 

Compute the principal angle of the horizontal velocity. 

Module containing functions to clean data 

This is the base binning (averaging) tool. 


ADV Module
The ADV module contains routines for reading and working with ADV data and is imported using from mhkit.dolfyn.adv import api. It contains:
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file. 

Load xarray dataset from netCDF (.nc) 

Rotate a dataset to a new coordinate system. 

Compute the principal angle of the horizontal velocity. 

Set the instrument to head rotation matrix for the Nortek ADV if it hasn't already been set through a '.userdata.json' file. 

Module containing functions to clean data 

This function performs motion correction on an IMUADV data object. 

This is the base binning (averaging) tool. 

A class that builds upon VelBinner for calculating turbulence statistics and velocity spectra from ADV data 

Functional version of ADVBinner that computes a suite of turbulence statistics for the input dataset, and returns a binned data object. 
IO
The io submodule contains the following functions to read binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or TRDI (.000, .PD0, .ENX, etc.) data file.
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file. 

Read an ADCP or ADV datafile from the examples directory. 

Save xarray dataset as netCDF (.nc). 

Load xarray dataset from netCDF (.nc) 

Save xarray dataset as a MATLAB (.mat) file 

Load xarray dataset from MATLAB (.mat) file, complimentary to save_mat() 
 mhkit.dolfyn.io.api.read(fname, userdata=True, nens=None, **kwargs)[source]
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file.
 Parameters:
filename (string) – Filename of instrument file to read.
userdata (True, False, or string of userdata.json filename (default
True
)) – Whether to read the ‘<basefilename>.userdata.json’ file.nens (None, int or 2element tuple (start, stop)) – Number of pings or ensembles to read from the file. Default is None, read entire file
**kwargs (dict) – Passed to instrumentspecific parser.
 Returns:
ds (xarray.Dataset) – An xarray dataset from instrument datafile.
 mhkit.dolfyn.io.api.read_example(name, **kwargs)[source]
Read an ADCP or ADV datafile from the examples directory.
 Parameters:
name (str) –
A few available files:
AWAC_test01.wpr BenchFile01.ad2cp RDI_test01.000 vector_burst_mode01.VEC vector_data01.VEC vector_data_imu01.VEC winriver01.PD0 winriver02.PD0
 Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data.
 mhkit.dolfyn.io.api.save(ds, filename, format='NETCDF4', engine='netcdf4', compression=False, **kwargs)[source]
Save xarray dataset as netCDF (.nc).
 Parameters:
ds (xarray.Dataset) – Dataset to save
filename (str) – Filename and/or path with the ‘.nc’ extension
compression (bool) – When true, compress all variables with zlib complevel=1. Default is False
**kwargs (dict) – These are passed directly to
xarray.Dataset.to_netcdf()
Notes
Drops ‘config’ lines.
Rewrites variable encoding dict
More detailed compression options can be specified by specifying ‘encoding’ in kwargs. The values in encoding will take precedence over whatever is set according to the compression option above. See the xarray.to_netcdf documentation for more details.
 mhkit.dolfyn.io.api.load(filename)[source]
Load xarray dataset from netCDF (.nc)
 Parameters:
filename (str) – Filename and/or path with the ‘.nc’ extension
 Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data.
 mhkit.dolfyn.io.api.save_mat(ds, filename, datenum=True)[source]
Save xarray dataset as a MATLAB (.mat) file
 Parameters:
ds (xarray.Dataset) – Dataset to save
filename (str) – Filename and/or path with the ‘.mat’ extension
datenum (bool) – If true, converts time to datenum. If false, time will be saved in “epoch time”.
Notes
The xarray data format is saved as a MATLAB structure with the fields ‘vars, coords, config, units’. Converts time to datenum
See also
scipy.io.savemat
 mhkit.dolfyn.io.api.load_mat(filename, datenum=True)[source]
Load xarray dataset from MATLAB (.mat) file, complimentary to save_mat()
A .mat file must contain the fields: {vars, coords, config, units}, where ‘coords’ contain the dimensions of all variables in ‘vars’.
 Parameters:
filename (str) – Filename and/or path with the ‘.mat’ extension
datenum (bool) – If true, converts time from datenum. If false, converts time from “epoch time”.
 Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data.
See also
scipy.io.loadmat
 mhkit.dolfyn.io.nortek.read_nortek(filename, userdata=True, debug=False, do_checksum=False, nens=None, **kwargs)[source]
Read a classic Nortek (AWAC and Vector) datafile
 Parameters:
filename (string) – Filename of Nortek file to read.
userdata (bool, or string of userdata.json filename) – Whether to read the ‘<basefilename>.userdata.json’ file. Default = True
debug (bool) – Logs debugger ouput if true. Default = False
do_checksum (bool) – Whether to perform the checksum of each data block. Default = False
nens (None, int or 2element tuple (start, stop)) – Number of pings or ensembles to read from the file. Default is None, read entire file
 Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data
 mhkit.dolfyn.io.nortek2.read_signature(filename, userdata=True, nens=None, rebuild_index=False, debug=False, dual_profile=False, **kwargs)[source]
Read a Nortek Signature (.ad2cp) datafile
 Parameters:
filename (string) – The filename of the file to load.
userdata (bool) – To search for and use a .userdata.json or not
nens (None, int or 2element tuple (start, stop)) – Number of pings or ensembles to read from the file. Default is None, read entire file
rebuild_index (bool) – Force rebuild of dolfynwritten datafile index. Useful for code updates. Default = False
debug (bool) – Logs debugger ouput if true. Default = False
dual_profile (bool) – Set to true if instrument is running multiple profiles. Default = False
 Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data
 mhkit.dolfyn.io.nortek2.split_dp_datasets(ds)[source]
Splits a dataset containing dual profiles into individual profiles
 mhkit.dolfyn.io.rdi.read_rdi(filename, userdata=None, nens=None, debug_level=1, vmdas_search=False, winriver=False, **kwargs)[source]
Read a TRDI binary data file.
 Parameters:
filename (string) – Filename of TRDI file to read.
userdata (True, False, or string of userdata.json filename) – Whether to read the ‘<basefilename>.userdata.json’ file. Default = True
nens (None, int or 2element tuple (start, stop)) – Number of pings or ensembles to read from the file. Default is None, read entire file
debug_level (int) – Debug level [0  2]. Default = 1
vmdas_search (bool) – Search from the end of each ensemble for the VMDAS navigation block. The byte offsets are sometimes incorrect. Default = False
winriver (bool) – If file is winriver or not. Automatically set by dolfyn, this is helpful for debugging. Default = False
 Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data
Rotate
The rotate submodule contains tools to rotate a dataset to different coordinate systems. When a file is read into dolfyn, the data will be stored in the same coordinate system it was saved in. The different coordinate systems are “beam” <> “inst” <> “earth” <> “principal”. “Beam” and “inst” are manufacturer/instrument specific and refer to the alongbeam (either 3 or 4 beams) and instrument (XYZ) reference frame. Instrument reference frames differ across sensors, but the rotate code takes this into account when rotating into the “earth” frame, defined as EastNorthUp (ENU). The earth frame can then be rotated to the principal axes, defined as streamwisecross_streamvertical.
Rotate a dataset to a new coordinate system. 

Compute the principal angle of the horizontal velocity. 

Set the magnetic declination 

Set the instrument to head rotation matrix for the Nortek ADV if it hasn't already been set through a '.userdata.json' file. 

Calculate the orientation matrix from DOLfYNdefined euler angles. 

Calculate DOLfYNdefined euler angles from the orientation matrix. 

Calculate orientation from Nortek AHRS quaternions, where q = [W, X, Y, Z] instead of the standard q = [X, Y, Z, W] = [q1, q2, q3, q4] 

Calculate "tilt", the vertical inclination, from pitch and roll. 
 mhkit.dolfyn.rotate.api.rotate2(ds, out_frame='earth', inplace=True)[source]
Rotate a dataset to a new coordinate system.
 Parameters:
ds (xr.Dataset) – The dolfyn dataset (ADV or ADCP) to rotate.
out_frame (string {'beam', 'inst', 'earth', 'principal'}) – The coordinate system to rotate the data into.
inplace (bool) – When True
ds
is modified. When False a copy is returned. Default = True
 Returns:
ds (xarray.Dataset or None) – Returns a new rotated dataset when ``inplace=False``, otherwise returns None.
Notes
This function rotates all variables in
ds.attrs['rotate_vars']
.In order to rotate to the ‘principal’ frame, a value should exist for
ds.attrs['principal_heading']
. The functioncalc_principal_heading
is recommended for this purpose, e.g.:ds.attrs[‘principal_heading’] = dolfyn.calc_principal_heading(ds[‘vel’].mean(range))
where here we are using the depthaveraged velocity to calculate the principal direction.
 mhkit.dolfyn.rotate.api.calc_principal_heading(vel, tidal_mode=True)[source]
Compute the principal angle of the horizontal velocity.
 Parameters:
vel (np.ndarray (2,...,Nt), or (3,...,Nt)) – The 2D or 3D velocity array (3rddim is ignored in this calculation)
tidal_mode (bool) – If true, range is set from 0 to +/180 degrees. If false, range is 0 to 360 degrees. Default = True
 Returns:
p_heading (float or ndarray) – The principal heading in degrees clockwise from North.
Notes
When tidal_mode=True, this tool calculates the heading that is aligned with the bidirectional flow. It does so following these steps:
rotates vectors with negative velocity by 180 degrees
then doubles those angles to make a complete circle again
computes a mean direction from this, and halves that angle (to undo the doubledangles in step 2)
The returned angle is forced to be between 0 and 180. So, you may need to add 180 to this if you want your positive direction to be in the westernhalf of the plane.
Otherwise, this function simply computes the average direction using a vector method.
 mhkit.dolfyn.rotate.api.set_declination(ds, declin, inplace=True)[source]
Set the magnetic declination
 Parameters:
ds (xarray.Dataset or
dolfyn.velocity.Velocity
) – The input dataset or velocity classdeclination (float) – The value of the magnetic declination in degrees (positive values specify that Magnetic North is clockwise from True North)
inplace (bool) – When True
ds
is modified. When False a copy is returned. Default = True
 Returns:
ds (xarray.Dataset or None) – Returns a new dataset with declination set when ``inplace=False``, otherwise returns None.
Notes
This function modifies the data object in the following ways:
If the dataset is in the earth reference frame at the time of setting declination, it will be rotated into the “TrueEast, TrueNorth, Up” (hereafter, ETU) coordinate system
dat['orientmat']
is modified to be an ETU to instrument (XYZ) rotation matrix (rather than the magneticENU to XYZ rotation matrix). Therefore, all rotations to/from the ‘earth’ frame will now be to/from this ETU coordinate system.The value of the specified declination will be stored in
dat.attrs['declination']
dat['heading']
is adjusted for declination (i.e., it is relative to True North).If
dat.attrs['principal_heading']
is set, it is adjusted to account for the orientation of the new ‘True’ earth coordinate system (i.e., calling set_declination on a data object in the principal coordinate system, then calling dat.rotate2(‘earth’) will yield a data object in the new ‘True’ earth coordinate system)
 mhkit.dolfyn.rotate.api.set_inst2head_rotmat(ds, rotmat, inplace=True)[source]
Set the instrument to head rotation matrix for the Nortek ADV if it hasn’t already been set through a ‘.userdata.json’ file.
 Parameters:
ds (xarray.Dataset) – The data set to assign inst2head_rotmat
rotmat (float) – 3x3 rotation matrix
inplace (bool) – When True
ds
is modified. When False a copy is returned. Default = True
 Returns:
ds (xarray.Dataset or None) – Returns a new dataset with inst2head_rotmat set when ``inplace=False``, otherwise returns None.
Notes
If the data object is in earth or principal coords, it is first rotated to ‘inst’ before assigning inst2head_rotmat, it is then rotated back to the coordinate system in which it was input. This way the inst2head_rotmat gets applied correctly (in inst coordinate system).
 mhkit.dolfyn.rotate.base.euler2orient(heading, pitch, roll, units='degrees')[source]
Calculate the orientation matrix from DOLfYNdefined euler angles.
This function is not likely to be called during data processing since it requires DOLfYNdefined euler angles. It is intended for testing DOLfYN.
The matrices H, P, R are the transpose of the matrices for rotation about z, y, x as shown here https://en.wikipedia.org/wiki/Rotation_matrix. The transpose is used because in DOLfYN the orientation matrix is organized for rotation from EARTH –> INST, while the wiki’s matrices are organized for rotation from INST –> EARTH.
 Parameters:
heading (numpy.ndarray) – The heading angle.
pitch (numpy.ndarray) – The pitch angle.
roll (numpy.ndarray) – The roll angle.
units (str) – Units in degrees or radians. is ‘degrees’
 Returns:
omat (numpy.ndarray (3x3xtime)) – The orientation matrix of the data. The returned orientation matrix obeys the following conventions:
a “ZYX” rotation order. That is, these variables are computed assuming that rotation from the earth > instrument frame happens by rotating around the zaxis first (heading), then rotating around the yaxis (pitch), then rotating around the xaxis (roll). Note this requires matrix multiplication in the reverse order.
heading is defined as the direction the xaxis points, positive clockwise from North (this is opposite the righthandrule around the Zaxis), range 0360 degrees.
pitch is positive when the xaxis pitches up (this is opposite the righthandrule around the Yaxis)
roll is positive according to the righthandrule around the instrument’s xaxis
 mhkit.dolfyn.rotate.base.orient2euler(omat)[source]
Calculate DOLfYNdefined euler angles from the orientation matrix.
 Parameters:
omat (numpy.ndarray) – The orientation matrix
 Returns:
heading (numpy.ndarray) – The heading angle. Heading is defined as the direction the xaxis points, positive clockwise from North (this is opposite the righthandrule around the Zaxis), range 0360 degrees.
pitch (np.ndarray) – The pitch angle (degrees). Pitch is positive when the xaxis pitches up (this is opposite the righthandrule around the Yaxis).
roll (np.ndarray) – The roll angle (degrees). Roll is positive according to the righthandrule around the instrument’s xaxis.
 mhkit.dolfyn.rotate.base.quaternion2orient(quaternions)[source]
Calculate orientation from Nortek AHRS quaternions, where q = [W, X, Y, Z] instead of the standard q = [X, Y, Z, W] = [q1, q2, q3, q4]
 Parameters:
quaternions (xarray.DataArray) – Quaternion dataArray from the raw dataset
 Returns:
orientmat (numpy.ndarray) – The earth2inst rotation maxtrix as calculated from the quaternions
See also
scipy.spatial.transform.Rotation
 mhkit.dolfyn.rotate.base.calc_tilt(pitch, roll)[source]
Calculate “tilt”, the vertical inclination, from pitch and roll.
 Parameters:
roll (numpy.ndarray or xarray.DataArray) – Instrument roll in degrees
pitch (numpy.ndarray or xarray.DataArray) – Instrument pitch in degrees
 Returns:
tilt (numpy.ndarray) – Vertical inclination of the instrument in degrees
Cleaning Data
The clean submodule contains basic quality control tools specific to ADCP and ADVs.
ADPspecific functions:
Adds an instrument's height above seafloor (for an upfacing instrument) or depth below water surface (for a downfacing instrument) to the range coordinate. 

Find the surface (water level or seafloor) from amplitude data and adds the variable "depth" to the input Dataset. 

Calculates the distance to the water surface. 

Mask the values of 3D data (vel, amp, corr, echo) that are beyond the surface. 

Filters out data where correlation is below a threshold in the alongbeam correlation data. 

Median filters the orientation data (headingpitchroll or quaternions) 

Find values of a variable that exceed a threshold value, and assign "val" to the velocity data where the threshold is exceeded. 

Fill gaps (nan values) in var across time using the specified method 

Fill gaps (nan values) in var along the depth profile using the specified method 
ADVspecific functions:
Interpolate over mask values in timeseries data using the specified method 

Fill missing values with the ensemble mean. 

Returns a logical vector where a spike in u of magnitude greater than thresh occurs. 

Returns a logical vector that is True where the values of u are outside of range. 

The Goring & Nikora 2002 'despiking' method, with Wahl2003 correction. 
Module containing functions to clean data
 mhkit.dolfyn.adp.clean.set_range_offset(ds, h_deploy)[source]
Adds an instrument’s height above seafloor (for an upfacing instrument) or depth below water surface (for a downfacing instrument) to the range coordinate. Also adds an attribute to the Dataset with the current “h_deploy” distance.
 Parameters:
ds (xarray.Dataset) – The adcp dataset to ajust ‘range’ on
h_deploy (numeric) – Deployment location in the water column, in [m]
 Returns:
None, operates “in place”
Notes
Center of bin 1 = h_deploy + blank_dist + cell_size
Nortek doesn’t take h_deploy into account, so the range that DOLfYN calculates distance is from the ADCP transducers. TRDI asks for h_deploy input in their deployment software and is thereby known by DOLfYN.
If the ADCP is mounted on a tripod on the seafloor, h_deploy will be the height of the tripod +/ any extra distance to the transducer faces. If the instrument is vesselmounted, h_deploy is the distance between the surface and downwardfacing ADCP’s transducers.
 mhkit.dolfyn.adp.clean.find_surface(ds, thresh=10, nfilt=None)[source]
Find the surface (water level or seafloor) from amplitude data and adds the variable “depth” to the input Dataset.
 Parameters:
ds (xarray.Dataset) – The full adcp dataset
thresh (int) – Specifies the threshold used in detecting the surface. Default = 10 (The amount that amplitude must increase by near the surface for it to be considered a surface hit)
nfilt (int) – Specifies the width of the median filter applied, must be odd. Default is None
 Returns:
None, operates “in place”
 mhkit.dolfyn.adp.clean.find_surface_from_P(ds, salinity=35)[source]
Calculates the distance to the water surface. Temperature and salinity are used to calculate seawater density, which is in turn used with the pressure data to calculate depth.
 Parameters:
ds (xarray.Dataset) – The full adcp dataset
salinity (numeric) – Water salinity in psu. Default = 35
 Returns:
None, operates “in place” and adds the variables “water_density” and
”depth” to the input dataset.
Notes
Requires that the instrument’s pressure sensor was calibrated/zeroed before deployment to remove atmospheric pressure.
Calculates seawater density using a linear approximation of the sea water equation of state:
\[\rho  \rho_0 = \alpha (TT_0) + \beta (SS_0) + \kappa P\]Where \(\rho\) is water density, \(T\) is water temperature, \(P\) is water pressure, \(S\) is practical salinity, \(\alpha\) is the thermal expansion coefficient, \(\beta\) is the haline contraction coefficient, and \(\kappa\) is adiabatic compressibility.
 mhkit.dolfyn.adp.clean.nan_beyond_surface(ds, val=nan, beam_angle=None, inplace=False)[source]
Mask the values of 3D data (vel, amp, corr, echo) that are beyond the surface.
 Parameters:
ds (xarray.Dataset) – The adcp dataset to clean
val (nan or numeric) – Specifies the value to set the bad values to. Default is numpy.nan
beam_angle (int) – ADCP beam inclination angle. Default = dataset.attrs[‘beam_angle’]
inplace (bool) – When True the existing data object is modified. When False a copy is returned. Default = False
 Returns:
ds (xarray.Dataset) – Sets the adcp dataset where relevant arrays with values greater than depth set to NaN
Notes
Surface interference expected to happen at distance > range * cos(beam angle)  cell size
 mhkit.dolfyn.adp.clean.correlation_filter(ds, thresh=50, inplace=False)[source]
Filters out data where correlation is below a threshold in the alongbeam correlation data.
 Parameters:
ds (xarray.Dataset) – The adcp dataset to clean.
thresh (numeric) – The maximum value of correlation to screen, in counts or %. Default = 50
inplace (bool) – When True the existing data object is modified. When False a copy is returned. Default = False
 Returns:
ds (xarray.Dataset) – Elements in velocity, correlation, and amplitude are removed if below the correlation threshold
Notes
Does not edit correlation or amplitude data.
 mhkit.dolfyn.adp.clean.medfilt_orient(ds, nfilt=7)[source]
Median filters the orientation data (headingpitchroll or quaternions)
 Parameters:
ds (xarray.Dataset) – The adcp dataset to clean
nfilt (numeric) – The length of the medianfiltering kernel. Must be odd. Default = 7
 Returns:
ds (xarray.Dataset) – The adcp dataset with the filtered orientation data
See also
scipy.signal.medfilt
 mhkit.dolfyn.adp.clean.val_exceeds_thresh(var, thresh=5, val=nan)[source]
Find values of a variable that exceed a threshold value, and assign “val” to the velocity data where the threshold is exceeded.
 Parameters:
var (xarray.DataArray) – Variable to clean
thresh (numeric) – The maximum value of velocity to screen. Default = 5
val (nan or numeric) – Specifies the value to set the bad values to. Default is numpy.nan
 Returns:
ds (xarray.Dataset) – The adcp dataset with datapoints beyond thresh are set to val
 mhkit.dolfyn.adp.clean.fillgaps_time(var, method='cubic', maxgap=None)[source]
Fill gaps (nan values) in var across time using the specified method
 Parameters:
var (xarray.DataArray) – The variable to clean
method (string) – Interpolation method to use. Default is ‘cubic’
maxgap (numeric) – Maximum gap of missing data to interpolate across. Default is None
 Returns:
out (xarray.DataArray) – The input DataArray ‘var’ with gaps in ‘var’ interpolated across time
See also
xarray.DataArray.interpolate_na
 mhkit.dolfyn.adp.clean.fillgaps_depth(var, method='cubic', maxgap=None)[source]
Fill gaps (nan values) in var along the depth profile using the specified method
 Parameters:
var (xarray.DataArray) – The variable to clean
method (string) – Interpolation method to use. Default is ‘cubic’
maxgap (numeric) – Maximum gap of missing data to interpolate across. Default is None
 Returns:
out (xarray.DataArray) – The input DataArray ‘var’ with gaps in ‘var’ interpolated across depth
See also
xarray.DataArray.interpolate_na
Module containing functions to clean data
 mhkit.dolfyn.adv.clean.clean_fill(u, mask, npt=12, method='cubic', maxgap=6)[source]
Interpolate over mask values in timeseries data using the specified method
 Parameters:
u (xarray.DataArray) – The dataArray to clean.
mask (bool) – Logical tensor of elements to “nan” out (from spikeThresh, rangeLimit, or GN2002) and replace
npt (int) – The number of points on either side of the bad values that interpolation occurs over
method (string) – Interpolation method to use (linear, cubic, pchip, etc). Default is ‘cubic’
maxgap (numeric) – Maximum gap of missing data to interpolate across. Default is None
 Returns:
da (xarray.DataArray) – The dataArray with nan’s filled in
See also
xarray.DataArray.interpolate_na
 mhkit.dolfyn.adv.clean.fill_nan_ensemble_mean(u, mask, fs, window)[source]
Fill missing values with the ensemble mean.
 Parameters:
u (xarray.DataArray (..., time)) – The dataArray to clean. Can be 1D or 2D.
mask (bool) – Logical tensor of elements to “nan” out (from spikeThresh, rangeLimit, or GN2002) and replace
fs (int) – Instrument sampling frequency
window (int) – Size of window in seconds used to calculate ensemble means
 Returns:
da (xarray.DataArray) – The dataArray with nan’s filled in
Notes
Gaps larger than the ensemble size will not get filled in.
 mhkit.dolfyn.adv.clean.spike_thresh(u, thresh=10)[source]
Returns a logical vector where a spike in u of magnitude greater than thresh occurs. Both ‘Negative’ and ‘positive’ spikes are found.
 Parameters:
u (xarray.DataArray) – The timeseries data to clean.
thresh (int) – Magnitude of velocity spike, must be positive. Default = 10
 Returns:
mask (numpy.ndarray) – Logical vector with spikes labeled as ‘True’
 mhkit.dolfyn.adv.clean.range_limit(u, range=[5, 5])[source]
Returns a logical vector that is True where the values of u are outside of range.
 Parameters:
u (xarray.DataArray) – The timeseries data to clean.
range (list) – Min and max magnitudes beyond which are masked. Default is [5, 5]
 Returns:
mask (numpy.ndarray) – Logical vector with spikes labeled as ‘True’
 mhkit.dolfyn.adv.clean.GN2002(u, npt=5000)[source]
The Goring & Nikora 2002 ‘despiking’ method, with Wahl2003 correction. Returns a logical vector that is true where spikes are identified.
 Parameters:
u (xarray.DataArray) – The velocity array (1D or 3D) to clean.
npt (int) – The number of points over which to perform the method. Default = 5000
 Returns:
mask (numpy.ndarray) – Logical vector with spikes labeled as ‘True’
Motion Correction
The motion correction submodule contains tools to correct Nortek Vector ADVIMU data using the onboard IMU, if equipped. Requires proper setup prior to collecting data.
This function performs motion correction on an IMUADV data object. 
This function performs motion correction on an IMUADV data object. The IMU and ADV data should be tightly synchronized and contained in a single data object.
 param ds:
Cleaned ADV dataset in “inst” coordinates
 type ds:
xarray.Dataset
 param accel_filtfreq:
the frequency at which to highpass filter the acceleration sigal to remove lowfrequency drift.
 type accel_filtfreq:
float
 param vel_filtfreq:
a second frequency to highpass filter the integrated acceleration. Optional, default = 1/3 of accel_filtfreq
 type vel_filtfreq:
float
 param to_earth:
All variables in the ds.props[‘rotate_vars’] list will be rotated into either the earth frame (to_earth=True) or the instrument frame (to_earth=False). Optional, default = True
 type to_earth:
bool
 param separate_probes:
a flag to perform motioncorrection at the probe tips, and perform motion correction in beamcoordinates, then transform back into XYZ/earth coordinates. This correction seems to be lower than the noise levels of the ADV, so the default is to not use it (False).
 type separate_probes:
bool
 returns:
This function returns None, it operates on the input data object,
ds
. The following attributes are added to ds –velraw
is the uncorrected velocityvelrot
is the rotational component of the head motion (fromangrt)
velacc
is the translational component of the head motion (fromaccel, the highpass filtered accel sigal)
acclow
is the lowpass filtered accel sigal (i.e.,The primary velocity vector attribute,
vel
, is motion correctedsuch that – vel = velraw + velrot + velacc
The sigs are correct in this equation. The measured velocity
induced by headmotion is *in the opposite direction of the head*
motion. i.e. when the head moves one way in stationary flow, it
measures a velocity in the opposite direction. Therefore, to
remove the motion from the raw sigal we *add the head motion.*
Notes
Acceleration signals from inertial sensors are notorious for having a small bias that can drift slowly in time. When integrating these sigals to estimate velocity the bias is amplified and leads to large errors in the estimated velocity. There are two methods for removing these errors,
highpass filter the acceleration sigal prior and/or after integrating. This implicitly assumes that the lowfrequency translational velocity is zero.
provide a slowlyvarying reference position (often from a GPS) to an IMU that can use the sigal (usually using Kalman filters) to debias the acceleration sigal.
Because method (1) removes real lowfrequency acceleration, method (2) is more accurate. However, providing reference position estimates to undersea instruments is practically challenging and expensive. Therefore, lacking the ability to use method (2), this function utilizes method (1).
For deployments in which the ADV is mounted on a mooring, or other semifixed structure, the assumption of zero lowfrequency translational velocity is a reasonable one. However, for deployments on ships, gliders, or other moving objects it is not. The measured velocity, after motioncorrection, will still hold some of this contamination and will be a sum of the ADV motion and the measured velocity on long time scales. If lowfrequency motion is known separate from the ADV (e.g. from a bottomtracking ADP, or from a ship’s GPS), it may be possible to remove that sigal from the ADV sigal in postprocessing.
Velocity Analysis
Analysis in DOLfYN is primarily handled through the VelBinner class. Below is a list of functions that can be called from VelBinner.
This is the base binning (averaging) tool. 

Reshape the array arr to shape (...,n,n_bin+n_pad). 

Reshape the array arr to shape (...,n,n_bin+n_pad) and remove the bestfit trend line from each bin. 

Reshape the array arr to shape (...,n,n_bin+n_pad) and remove the mean from each bin. 

Reshape the array arr to shape (...,n,n_bin+n_pad) and take the mean of each bin along the specified axis. 

Reshape the array arr to shape (...,n,n_bin+n_pad) and take the variance of each bin along the specified axis. 

Reshape the array arr to shape (...,n,n_bin+n_pad) and take the standard deviation of each bin along the specified axis. 

Calculate the turbulent kinetic energy (TKE) (variances of u,v,w). 

Calculate the power spectral density of velocity. 
 class mhkit.dolfyn.binned.TimeBinner(n_bin, fs, n_fft=None, n_fft_coh=None, noise=[0, 0, 0])[source]
Bases:
object
 reshape(arr, n_pad=0, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad).
 Parameters:
arr (numpy.ndarray)
n_pad (int) – Is used to add n_pad/2 points from the end of the previous ensemble to the top of the current, and n_pad/2 points from the top of the next ensemble to the bottom of the current. Zeros are padded in the upperleft and lowerright corners of the matrix (beginning/end of timeseries). In this case, the array shape will be (…,`n`,`n_pad`+`n_bin`)
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
 Returns:
out (numpy.ndarray)
Notes
n_bin can be noninteger, in which case the output array size will be n_pad`+`n_bin, and the decimal will cause skipping of some data points in arr. In particular, every mod(n_bin,1) bins will have a skipped point. For example:  for n_bin=2048.2 every 1/5 bins will have a skipped point.  for n_bin=4096.9 every 9/10 bins will have a skipped point.
 detrend(arr, axis=1, n_pad=0, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and remove the bestfit trend line from each bin.
 Parameters:
arr (numpy.ndarray)
axis (int) – Axis along which to take mean. Default = 1
n_pad (int) – Is used to add n_pad/2 points from the end of the previous ensemble to the top of the current, and n_pad/2 points from the top of the next ensemble to the bottom of the current. Zeros are padded in the upperleft and lowerright corners of the matrix (beginning/end of timeseries). In this case, the array shape will be (…,`n`,`n_pad`+`n_bin`). Default = 0
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
 Returns:
out (numpy.ndarray)
 demean(arr, axis=1, n_pad=0, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and remove the mean from each bin.
 Parameters:
arr (numpy.ndarray)
axis (int) – Axis along which to take mean. Default = 1
n_pad (int) – Is used to add n_pad/2 points from the end of the previous ensemble to the top of the current, and n_pad/2 points from the top of the next ensemble to the bottom of the current. Zeros are padded in the upperleft and lowerright corners of the matrix (beginning/end of timeseries). In this case, the array shape will be (…,`n`,`n_pad`+`n_bin`). Default = 0
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
 Returns:
out (numpy.ndarray)
 mean(arr, axis=1, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and take the mean of each bin along the specified axis.
 Parameters:
arr (numpy.ndarray)
axis (int) – Axis along which to take mean. Default = 1
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
 Returns:
out (numpy.ndarray)
 variance(arr, axis=1, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and take the variance of each bin along the specified axis.
 Parameters:
arr (numpy.ndarray)
axis (int) – Axis along which to take variance. Default = 1
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
 Returns:
out (numpy.ndarray)
 standard_deviation(arr, axis=1, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and take the standard deviation of each bin along the specified axis.
 Parameters:
arr (numpy.ndarray)
axis (int) – Axis along which to take std dev. Default = 1
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
 Returns:
out (numpy.ndarray)
 class mhkit.dolfyn.velocity.Velocity(ds, *args, **kwargs)[source]
Bases:
object
All ADCP and ADV xarray datasets wrap this base class.
The turbulencerelated attributes defined within this class assume that the
'tke_vec'
and'stress_vec'
data entries are included in the dataset. These are typically calculated using aVelBinner
tool, but the method for calculating these variables can depend on the details of the measurement (instrument, it’s configuration, orientation, etc.).See also
 rotate2(out_frame='earth', inplace=True)[source]
Rotate the dataset to a new coordinate system.
 Parameters:
out_frame (string {'beam', 'inst', 'earth', 'principal'}) – The coordinate system to rotate the data into.
inplace (bool) – When True the existing data object is modified. When False a copy is returned. Default = True
 Returns:
ds (xarray.Dataset or None) – Returns the rotated dataset when ``inplace=False``, otherwise returns None.
Notes
This function rotates all variables in
ds.attrs['rotate_vars']
.To rotate to the ‘principal’ frame, a value of
ds.attrs['principal_heading']
must exist. The functioncalc_principal_heading
is recommended for this purpose, e.g.:ds.attrs['principal_heading'] = dolfyn.calc_principal_heading(ds['vel'].mean(range))
where here we are using the depthaveraged velocity to calculate the principal direction.
 set_declination(declin, inplace=True)[source]
Set the magnetic declination
 Parameters:
declination (float) – The value of the magnetic declination in degrees (positive values specify that Magnetic North is clockwise from True North)
inplace (bool) – When True the existing data object is modified. When False a copy is returned. Default = True
 Returns:
ds (xarray.Dataset or None) – Returns the rotated dataset when ``inplace=False``, otherwise returns None.
Notes
This method modifies the data object in the following ways:
If the dataset is in the earth reference frame at the time of
setting declination, it will be rotated into the “TrueEast, TrueNorth, Up” (hereafter, ETU) coordinate system
dat['orientmat']
is modified to be an ETU to
instrument (XYZ) rotation matrix (rather than the magneticENU to XYZ rotation matrix). Therefore, all rotations to/from the ‘earth’ frame will now be to/from this ETU coordinate system.
The value of the specified declination will be stored in
dat.attrs['declination']
dat['heading']
is adjusted for declination
(i.e., it is relative to True North).
If
dat.attrs['principal_heading']
is set, it is
adjusted to account for the orientation of the new ‘True’ earth coordinate system (i.e., calling set_declination on a data object in the principal coordinate system, then calling dat.rotate2(‘earth’) will yield a data object in the new ‘True’ earth coordinate system)
 set_inst2head_rotmat(rotmat, inplace=True)[source]
Set the instrument to head rotation matrix for the Nortek ADV if it hasn’t already been set through a ‘.userdata.json’ file.
 Parameters:
rotmat (float) – 3x3 rotation matrix
inplace (bool) – When True the existing data object is rotated. When False a copy is returned that is rotated. Default = True
 Returns:
ds (xarray.Dataset or None) – Returns the rotated dataset when ``inplace=False``, otherwise returns None.
Notes
If the data object is in earth or principal coords, it is first rotated to ‘inst’ before assigning inst2head_rotmat, it is then rotated back to the coordinate system in which it was input. This way the inst2head_rotmat gets applied correctly (in inst coordinate system).
 save(filename, **kwargs)[source]
Save the data object (underlying xarray dataset) as netCDF (.nc).
 Parameters:
filename (str) – Filename and/or path with the ‘.nc’ extension
**kwargs (dict) – These are passed directly to
xarray.Dataset.to_netcdf()
.
Notes
See DOLfYN’s
save
function for additional details.
 property variables
A sorted list of the variable names in the dataset.
 property attrs
The attributes in the dataset.
 property coords
The coordinates in the dataset.
 property u
The first velocity component.
This is simply a shortcut to self[‘vel’][0]. Therefore, depending on the coordinate system of the data object (self.attrs[‘coord_sys’]), it is:
beam: beam1
inst: x
earth: east
principal: streamwise
 property v
The second velocity component.
This is simply a shortcut to self[‘vel’][1]. Therefore, depending on the coordinate system of the data object (self.attrs[‘coord_sys’]), it is:
beam: beam2
inst: y
earth: north
principal: crossstream
 property w
The third velocity component.
This is simply a shortcut to self[‘vel’][2]. Therefore, depending on the coordinate system of the data object (self.attrs[‘coord_sys’]), it is:
beam: beam3
inst: z
earth: up
principal: up
 property U
Horizontal velocity as a complex quantity
 property U_mag
Horizontal velocity magnitude
 property U_dir
Angle of horizontal velocity vector. Direction is ‘to’, as opposed to ‘from’. This function calculates angle as “degrees CCW from X/East/streamwise” and then converts it to “degrees CW from X/North/streamwise”.
 property E_coh
Coherent turbulent energy
Niel Kelley’s ‘coherent turbulence energy’, which is the RMS of the Reynold’s stresses.
See: NREL Technical Report TP50052353
 property I_tke
Turbulent kinetic energy intensity.
Ratio of sqrt(tke) to horizontal velocity magnitude.
 property I
Turbulence intensity.
Ratio of standard deviation of horizontal velocity to horizontal velocity magnitude.
 property tke
Turbulent kinetic energy (sum of the three components)
 property upvp_
u’v’bar Reynolds stress
 property upwp_
u’w’bar Reynolds stress
 property vpwp_
v’w’bar Reynolds stress
 property upup_
u’u’bar component of the tke
 property vpvp_
v’v’bar component of the tke
 property wpwp_
w’w’bar component of the tke
 class mhkit.dolfyn.velocity.VelBinner(n_bin, fs, n_fft=None, n_fft_coh=None, noise=[0, 0, 0])[source]
Bases:
TimeBinner
This is the base binning (averaging) tool. All DOLfYN binning tools derive from this base class.
Examples
The VelBinner class is used to compute averages and turbulence statistics from ‘raw’ (not averaged) ADV or ADP measurements, for example:
# First read or load some data. rawdat = dolfyn.read_example('BenchFile01.ad2cp') # Now initialize the averaging tool: binner = dolfyn.VelBinner(n_bin=600, fs=rawdat.fs) # This computes the basic averages avg = binner.bin_average(rawdat)
 tke = <xarray.DataArray 'tke' (tke: 3)> Size: 60B array(['upup_', 'vpvp_', 'wpwp_'], dtype='<U5') Dimensions without coordinates: tke Attributes: units: 1 long_name: Turbulent Kinetic Energy Vector Components coverage_content_type: coordinate
 tau = <xarray.DataArray 'tau' (tau: 3)> Size: 60B array(['upvp_', 'upwp_', 'vpwp_'], dtype='<U5') Dimensions without coordinates: tau Attributes: units: 1 long_name: Reynolds Stress Vector Components coverage_content_type: coordinate
 S = <xarray.DataArray 'S' (S: 3)> Size: 36B array(['Sxx', 'Syy', 'Szz'], dtype='<U3') Dimensions without coordinates: S Attributes: units: 1 long_name: Power Spectral Density Vector Components coverage_content_type: coordinate
 C = <xarray.DataArray 'C' (C: 3)> Size: 36B array(['Cxy', 'Cxz', 'Cyz'], dtype='<U3') Dimensions without coordinates: C Attributes: units: 1 long_name: CrossSpectral Density Vector Components coverage_content_type: coordinate
 bin_average(raw_ds, out_ds=None, names=None)[source]
Bin the dataset and calculate the ensemble averages of each variable.
 Parameters:
raw_ds (xarray.Dataset) – The raw data structure to be binned
out_ds (xarray.Dataset) – The bin’d (output) data object to which averaged data is added.
names (list of strings) – The names of variables to be averaged. If names is None, all data in raw_ds will be binned.
 Returns:
out_ds (xarray.Dataset) – The new (or updated when out_ds is not None) dataset with the averages of all the variables in raw_ds.
:raises AttributeError : when out_ds is supplied as input (not None): :raises and the values in out_ds.attrs are inconsistent with: :raises raw_ds.attrs or the properties of this VelBinner (n_bin,: :raises n_fft, fs, etc.):
Notes
raw_ds.attrs are copied to out_ds.attrs. Inconsistencies between the two (when out_ds is specified as input) raise an AttributeError.
 bin_variance(raw_ds, out_ds=None, names=None, suffix='_var')[source]
Bin the dataset and calculate the ensemble variances of each variable. Complementary to bin_average().
 Parameters:
raw_ds (xarray.Dataset) – The raw data structure to be binned.
out_ds (xarray.Dataset) – The binned (output) dataset to which variance data is added, nominally dataset output from bin_average()
names (list of strings) – The names of variables of which to calculate variance. If names is None, all data in raw_ds will be binned.
 Returns:
out_ds (xarray.Dataset) – The new (or updated when out_ds is not None) dataset with the variance of all the variables in raw_ds.
:raises AttributeError : when out_ds is supplied as input (not None): :raises and the values in out_ds.attrs are inconsistent with: :raises raw_ds.attrs or the properties of this VelBinner (n_bin,: :raises n_fft, fs, etc.):
Notes
raw_ds.attrs are copied to out_ds.attrs. Inconsistencies between the two (when out_ds is specified as input) raise an AttributeError.
 autocovariance(veldat, n_bin=None)[source]
Calculate the autocovariance of the rawsignal veldat
 Parameters:
veldat (xarray.DataArray) – The raw dataArray of which to calculate autocovariance
n_bin (float) – Number of data elements to use
 Returns:
da (xarray.DataArray) – The autocovariance of veldat
Notes
As opposed to crosscovariance, which returns the full crosscovariance between two arrays, this function only returns a quarter of the full autocovariance. It computes the autocovariance over half of the range, then averages the two sides (to return a ‘quartered’ covariance).
This has the advantage that the 0 index is actually zerolag.
 turbulence_intensity(U_mag, noise=0, thresh=0, detrend=False)[source]
Calculate noisecorrected turbulence intensity.
 Parameters:
U_mag (xarray.DataArray) – Raw horizontal velocity magnitude
noise (numeric) – Instrument noise level in same units as velocity. Typically found from <adv or adp>.turbulence.doppler_noise_level. Default: None.
thresh (numeric) – Theshold below which TI will not be calculated
detrend (bool (default: False)) – Detrend the velocity data (True), or simply demean it (False), prior to computing TI.
 turbulent_kinetic_energy(veldat, noise=None, detrend=True)[source]
Calculate the turbulent kinetic energy (TKE) (variances of u,v,w).
 Parameters:
veldat (xarray.DataArray) – Velocity data array from ADV or single beam from ADCP. The last dimension is assumed to be time.
noise (float or arraylike) – Instrument noise level in same units as velocity. Typically found from <adv or adp>.turbulence.doppler_noise_level. Default: None.
detrend (bool (default: False)) – Detrend the velocity data (True), or simply demean it (False), prior to computing TKE. Note: the PSD routines use detrend, so if you want to have the same amount of variance here as there use
detrend=True
.
 Returns:
tke_vec (xarray.DataArray) – dataArray containing u’u’_, v’v’_ and w’w’_
 power_spectral_density(veldat, freq_units='rad/s', fs=None, window='hann', noise=0, n_bin=None, n_fft=None, n_pad=None, step=None)[source]
Calculate the power spectral density of velocity.
 Parameters:
veldat (xr.DataArray) – The raw velocity data (of dims ‘dir’ and ‘time’).
freq_units (string) – Frequency units of the returned spectra in either Hz or rad/s (f or \(\omega\))
fs (float (optional)) – The sample rate. Default is binner.fs
window (string or array) – Specify the window function. Options: 1, None, ‘hann’, ‘hamm’
noise (numeric or array) – Instrument noise level in same units as velocity. Default: 0 (ADCP) or [0, 0, 0] (ADV).
n_bin (int (optional)) – The binsize. Default: from the binner.
n_fft (int (optional)) – The fft size. Default: from the binner.
n_pad (int (optional)) – The number of values to pad with zero. Default = 0.
step (int (optional)) – Controls amount of overlap in fft. Default: the step size is chosen to maximize data use, minimize nens, and have a minimum of 50% overlap.
 Returns:
psd (xarray.DataArray (3, M, N_FFT)) – The spectra in the ‘u’, ‘v’, and ‘w’ directions.
Turbulence Analysis
Functions for analyzing ADV data via the ADVBinner class, beyond those described in VelBinner.
A class that builds upon VelBinner for calculating turbulence statistics and velocity spectra from ADV data 

Functional version of ADVBinner that computes a suite of turbulence statistics for the input dataset, and returns a binned data object. 

Calculate the specific Reynolds stresses (crosscovariances of u,v,w in m^2/s^2) 

Calculate the crossspectral density of velocity components. 

Calculate the dissipation rate from the PSD 

Calculate dissipation rate using the "structure function" (SF) method 

Calculate the dissipation rate according to TE01. 

Calculate integral length scales. 
 class mhkit.dolfyn.adv.turbulence.ADVBinner(n_bin, fs, n_fft=None, n_fft_coh=None, noise=[0, 0, 0])[source]
Bases:
VelBinner
A class that builds upon VelBinner for calculating turbulence statistics and velocity spectra from ADV data
 Parameters:
n_bin (int) – The length of each bin, in number of points, for this averaging operator.
fs (int) – Instrument sampling frequency in Hz
n_fft (int) – The length of the FFT for computing spectra (must be <= n_bin). Optional, default n_fft = n_bin
n_fft_coh (int) –
Number of data points to use for coherence and crossspectra fft’s. Optional, default n_fft_coh = n_fft
 noisefloat or arraylike
Instrument noise level in same units as velocity. Typically found from adv.turbulence.doppler_noise_level. Default: None.
 reynolds_stress(veldat, detrend=True)[source]
Calculate the specific Reynolds stresses (crosscovariances of u,v,w in m^2/s^2)
 Parameters:
veldat (xr.DataArray) – A velocity data array. The last dimension is assumed to be time.
detrend (bool) – Detrend the velocity data (True), or simply demean it (False), prior to computing stress. Note: the psd routines use detrend, so if you want to have the same amount of variance here as there use
detrend=True
. Default = True
 Returns:
out (xarray.DataArray)
 cross_spectral_density(veldat, freq_units='rad/s', fs=None, window='hann', n_bin=None, n_fft_coh=None)[source]
Calculate the crossspectral density of velocity components.
 Parameters:
veldat (xarray.DataArray) – The raw 3D velocity data.
freq_units (string) – Frequency units of the returned spectra in either Hz or rad/s (f or \(\omega\))
fs (float (optional)) – The sample rate. Default = binner.fs
window (string or array) – Specify the window function. Options: 1, None, ‘hann’, ‘hamm’
n_bin (int (optional)) – The binsize. Default = binner.n_bin
n_fft_coh (int (optional)) – The fft size. Default = binner.n_fft_coh
 Returns:
csd (xarray.DataArray (3, M, N_FFT)) – The firstdimension of the crossspectrum is the three different crossspectra: ‘uv’, ‘uw’, ‘vw’.
 doppler_noise_level(psd, pct_fN=0.8)[source]
Calculate bias due to Doppler noise using the noise floor of the velocity spectra.
 Parameters:
psd (xarray.DataArray (dir, time, f)) – The ADV power spectral density of velocity (autospectra)
pct_fN (float) – Percent of Nyquist frequency to calculate characeristic frequency
 Returns:
doppler_noise (xarray.DataArray) – Doppler noise level in units of m/s
Notes
Approximates bias from
where :math: sigma_{noise} is the bias due to Doppler noise, N is the constant variance or spectral density, and f_{c} is the characteristic frequency.
The characteristic frequency is then found as
where f_{s}/2 is the Nyquist frequency.
Richard, JeanBaptiste, et al. “Method for identification of Doppler noise levels in turbulent flow measurements dedicated to tidal energy.” International Journal of Marine Energy 3 (2013): 5264.
Thiébaut, Maxime, et al. “Investigating the flow dynamics and turbulence at a tidalstream energy site in a highly energetic estuary.” Renewable Energy 195 (2022): 252262.
 check_turbulence_cascade_slope(psd, freq_range=[6.28, 12.57])[source]
This function calculates the slope of the PSD, the power spectra of velocity, within the given frequency range. The purpose of this function is to check that the region of the PSD containing the isotropic turbulence cascade decreases at a rate of \(f^{5/3}\).
 Parameters:
psd (xarray.DataArray ([time,] freq)) – The power spectral density (1D or 2D)
freq_range (iterable(2) (default: [6.28, 12.57])) – The range over which the isotropic turbulence cascade occurs, in units of the psd frequency vector (Hz or rad/s)
 Returns:
(m, b) (tuple (slope, yintercept)) – A tuple containing the coefficients of the logadjusted linear regression between PSD and frequency
Notes
Calculates slope based on the standard formula for dissipation:
\[S(k) = \alpha \epsilon^{2/3} k^{5/3} + N\]The slope of the isotropic turbulence cascade, which should be equal to \(k^{5/3}\) or \(f^{5/3}\), where k and f are the wavenumber and frequency vectors, is estimated using linear regression with a log transformation:
\[log10(y) = m*log10(x) + b\]Which is equivalent to
\[y = 10^{b} x^{m}\]Where \(y\) is S(k) or S(f), \(x\) is k or f, \(m\) is the slope (ideally 5/3), and \(10^{b}\) is the intercept of y at x^m=1.
 dissipation_rate_LT83(psd, U_mag, freq_range=[6.28, 12.57], noise=None)[source]
Calculate the dissipation rate from the PSD
 Parameters:
psd (xarray.DataArray (...,time,f)) – The power spectral density
U_mag (xarray.DataArray (...,time)) – The binaveraged horizontal velocity [m/s] (from dataset shortcut)
freq_range (iterable(2)) – The range over which to integrate/average the spectrum, in units of the psd frequency vector (Hz or rad/s). Default = [6.28, 12.57] rad/s
noise (float or arraylike) – Instrument noise level in same units as velocity. Typically found from adv.turbulence.calc_doppler_noise. Default: None.
 Returns:
epsilon (xarray.DataArray (…,n_time)) – dataArray of the dissipation rate
Notes
This uses the standard formula for dissipation:
\[S(k) = \alpha \epsilon^{2/3} k^{5/3} + N\]where \(\alpha = 0.5\) (1.5 for all three velocity components), k is wavenumber, S(k) is the turbulent kinetic energy spectrum, and `N’ is the doppler noise level associated with the TKE spectrum.
With \(k \rightarrow \omega / U\), then – to preserve variance – \(S(k) = U S(\omega)\), and so this becomes:
\[S(\omega) = \alpha \epsilon^{2/3} \omega^{5/3} U^{2/3} + N\]With \(k \rightarrow (2\pi f) / U\), then
\[S(\omega) = \alpha \epsilon^{2/3} f^{5/3} (U/(2*\pi))^{2/3} + N\]LT83 : Lumley and Terray, “Kinematics of turbulence convected by a random wave field”. JPO, 1983, vol13, pp20002007.
 dissipation_rate_SF(vel_raw, U_mag, fs=None, freq_range=[2.0, 4.0])[source]
Calculate dissipation rate using the “structure function” (SF) method
 Parameters:
vel_raw (xarray.DataArray (time)) – The raw velocity data upon which to perform the SF technique.
U_mag (xarray.DataArray) – The binaveraged horizontal velocity (from dataset shortcut)
fs (float) – The sample rate of vel_raw [Hz]
freq_range (iterable(2)) – The frequency range over which to compute the SF [Hz] (i.e. the frequency range within which the isotropic turbulence cascade falls). Default = [2., 4.] Hz
 Returns:
epsilon (xarray.DataArray) – dataArray of the dissipation rate
 dissipation_rate_TE01(dat_raw, dat_avg, freq_range=[6.28, 12.57])[source]
Calculate the dissipation rate according to TE01.
 Parameters:
dat_raw (xarray.Dataset) – The raw (off the instrument) adv dataset
dat_avg (xarray.Dataset) – The binaveraged adv dataset (calc’d from ‘calc_turbulence’ or ‘do_avg’). The spectra (psd) and basic turbulence statistics (‘tke_vec’ and ‘stress_vec’) must already be computed.
freq_range (iterable(2)) – The range over which to integrate/average the spectrum, in units of the psd frequency vector (Hz or rad/s). Default = [6.28, 12.57] rad/s
Notes
TE01 : Trowbridge, J and Elgar, S, “Turbulence measurements in the Surf Zone”. JPO, 2001, vol31, pp24032417.
 integral_length_scales(a_cov, U_mag, fs=None)[source]
Calculate integral length scales.
 Parameters:
a_cov (xarray.DataArray) – The autocovariance array (i.e. computed using autocovariance).
U_mag (xarray.DataArray) – The binaveraged horizontal velocity (from dataset shortcut)
fs (numeric) – The raw sample rate
 Returns:
L_int (numpy.ndarray (…, n_time)) – The integral length scale (T_int*U_mag).
Notes
The integral time scale (T_int) is the lagtime at which the autocovariance falls to 1/e.
If T_int is not reached, L_int will default to ‘0’.
 mhkit.dolfyn.adv.turbulence.turbulence_statistics(ds_raw, n_bin, fs, n_fft=None, freq_units='rad/s', window='hann')[source]
Functional version of ADVBinner that computes a suite of turbulence statistics for the input dataset, and returns a binned data object.
 Parameters:
ds_raw (xarray.Dataset) – The raw adv datset to bin, average and compute turbulence statistics of.
freq_units (string) – Frequency units of the returned spectra in either Hz or rad/s (f or \(\omega\)). Default is ‘rad/s’
window (string or array) – The window to use for calculating spectra.
 Returns:
ds (xarray.Dataset) – Returns an ‘binned’ (i.e. ‘averaged’) data object. All fields (variables) of the input data object are averaged in n_bin chunks. This object also computes the following items over those chunks:
tke_vec : The energy in each component, each components is alternatively accessible as:
upup_
,vpvp_
,wpwp_
)stress_vec : The Reynolds stresses, each component is alternatively accessible as:
upwp_
,vpwp_
,upvp_
)U_std : The standard deviation of the horizontal velocity U_mag.
psd : DataArray containing the spectra of the velocity in radial frequency units. The dataarray contains:  vel : the velocity spectra array (m^2/s/rad))  omega : the radial frequncy (rad/s)
Tools
Spectral analysis and miscellaneous DOLfYN functions are stored here. These functions are used throughout DOLfYN’s core code and may also be helpful to users in general.
FFTbased Functions:
Compute the frequency vector for a given nfft and fs. 

Compute the power spectral density (PSD). 

Compute the cross power spectral density (CPSD) of the signals a and b. 

Compute the cross power spectral density (CPSD) of the signals a and b. 
Other Functions:
Remove a linear trend from arr. 

Find continuous segments in a boolean array. 

Return an iterator object for looping over 1D slices, along 

Converts between the 'cartesian angle' (counterclockwise from East) and the 'polar angle' in (degrees clockwise from North) 

Linearly fill NaN value in an array. 

Fill gaps (NaN values) in 

Do a running median filter of the data. 
 mhkit.dolfyn.tools.fft.fft_frequency(nfft, fs, full=False)[source]
Compute the frequency vector for a given nfft and fs.
 Parameters:
fs (float) – The sampling frequency (e.g. samples/sec)
nfft (int) – The number of samples in a window.
full (bool) – Whether to return half frequencies (positive), or the full frequencies. Default = False
 Returns:
freq (numpy.ndarray) – The frequency vector, in same units as ‘fs’
 mhkit.dolfyn.tools.fft.cpsd_quasisync_1D(a, b, nfft, fs, window='hann')[source]
Compute the cross power spectral density (CPSD) of the signals a and b.
 Parameters:
a (numpy.ndarray) – The first signal.
b (numpy.ndarray) – The second signal.
nfft (int) – The number of points in the fft.
fs (float) – The sample rate (e.g. sample/second).
window ({None, 1, 'hann', numpy.ndarray}) – The window to use (default: ‘hann’). Valid entries are:  None,1 : uses a ‘boxcar’ or ones window.  ‘hann’ : hanning window.  a length(nfft) array : use this as the window directly.
 Returns:
cpsd (numpy.ndarray) – The crossspectral density of a and b.
See also
psd()
,coherence_1D()
,cpsd_1D()
,numpy.fft
Notes
a and b do not need to be ‘tightly’ synchronized, and can even be different lengths, but the first and lastindex of both series should be synchronized (to whatever degree you want unbiased phases).
This performs:
\[fft(a)*conj(fft(b))\]Note that this is consistent with
numpy.correlate()
.It detrends the data and uses a minimum of 50% overlap for the shorter of a and b. For the longer, the overlap depends on the difference in size. 1(l_short/l_long) data will be underutilized (where l_short and l_long are the length of the shorter and longer series, respectively).
The units of the spectra is the product of the units of a and b, divided by the units of fs.
 mhkit.dolfyn.tools.fft.cpsd_1D(a, b, nfft, fs, window='hann', step=None)[source]
Compute the cross power spectral density (CPSD) of the signals a and b.
 Parameters:
a (numpy.ndarray) – The first signal.
b (numpy.ndarray) – The second signal.
nfft (int) – The number of points in the fft.
fs (float) – The sample rate (e.g. sample/second).
window ({None, 1, 'hann', numpy.ndarray}) – The window to use (default: ‘hann’). Valid entries are:  None,1 : uses a ‘boxcar’ or ones window.  ‘hann’ : hanning window.  a length(nfft) array : use this as the window directly.
step (int) – Use this to specify the overlap. For example:  step : nfft/2 specifies a 50% overlap.  step : nfft specifies no overlap.  step=2*nfft means that half the data will be skipped. By default, step is calculated to maximize data use, have at least 50% overlap and minimize the number of ensembles.
 Returns:
cpsd (numpy.ndarray) – The crossspectral density of a and b.
See also
psd()
,coherence_1D()
,None
Notes
cpsd removes a linear trend from the signals.
The two signals should be the same length, and should both be real.
This performs:
\[fft(a)*conj(fft(b))\]This implementation is consistent with the numpy.correlate definition of correlation. (The conjugate of D.B. Chelton’s definition of correlation.)
The units of the spectra is the product of the units of a and b, divided by the units of fs.
 mhkit.dolfyn.tools.fft.psd_1D(a, nfft, fs, window='hann', step=None)[source]
Compute the power spectral density (PSD).
This function computes the onedimensional npoint PSD.
The units of the spectra is the product of the units of a and b, divided by the units of fs.
 Parameters:
a (numpy.ndarray) – The first signal, as a 1D vector
nfft (int) – The number of points in the fft.
fs (float) – The sample rate (e.g. sample/second).
window ({None, 1, 'hann', numpy.ndarray}) – The window to use (default: ‘hann’). Valid entries are:  None,1 : uses a ‘boxcar’ or ones window.  ‘hann’ : hanning window.  a length(nfft) array : use this as the window directly.
step (int) – Use this to specify the overlap. For example:  step : nfft/2 specifies a 50% overlap.  step : nfft specifies no overlap.  step=2*nfft means that half the data will be skipped. By default, step is calculated to maximize data use, have at least 50% overlap and minimize the number of ensembles.
 Returns:
cpsd (numpy.ndarray) – The crossspectral density of a and b.
Notes
Credit: This function’s line of code was copied from JN’s fast_psd.m routine.
See also
cpsd_1D()
,coherence_1D()
,None
 mhkit.dolfyn.tools.misc.detrend_array(arr, axis=1, in_place=False)[source]
Remove a linear trend from arr.
 Parameters:
arr (array_like) – The array from which to remove a linear trend.
axis (int) – The axis along which to operate.
Notes
This method is copied from the matplotlib.mlab library, but implements the covariance calcs explicitly for added speed.
This works much faster than mpl.mlab.detrend for multidimensional arrays, and is also faster than linalg.lstsq methods.
 mhkit.dolfyn.tools.misc.group(bl, min_length=0)[source]
Find continuous segments in a boolean array.
 Parameters:
bl (numpy.ndarray (dtype='bool')) – The input boolean array.
min_length (int (optional)) – Specifies the minimum number of continuous points to consider a group (i.e. that will be returned).
 Returns:
out (np.ndarray(slices,)) – a vector of slice objects, which indicate the continuous sections where bl is True.
Notes
This function has funny behavior for single points. It will return the same two indices for the beginning and end.
 mhkit.dolfyn.tools.misc.slice1d_along_axis(arr_shape, axis=0)[source]
Return an iterator object for looping over 1D slices, along
axis
, of an array of shape arr_shape. Parameters:
arr_shape (tuple,list) – Shape of the array over which the slices will be made.
axis (integer) – Axis along which arr is sliced.
 Returns:
Iterator (object) – The iterator object returns slice objects which slices arrays of shape arr_shape into 1D arrays.
Examples
>> out=np.empty(replace(arr.shape,0,1)) >> for slc in slice1d_along_axis(arr.shape,axis=0): >> out[slc]=my_1d_function(arr[slc])
 mhkit.dolfyn.tools.misc.convert_degrees(deg, tidal_mode=True)[source]
Converts between the ‘cartesian angle’ (counterclockwise from East) and the ‘polar angle’ in (degrees clockwise from North)
 Parameters:
deg (float or arraylike) – Number or array in ‘degrees CCW from East’ or ‘degrees CW from North’
tidal_mode (bool) – If true, range is set from 0 to +/180 degrees. If false, range is 0 to 360 degrees. Default = True
 Returns:
out (float or arraylike) – Input data transformed to ‘degrees CW from North’ or ‘degrees CCW from East’, respectively (based on deg)
Notes
The same algorithm is used to convert back and forth between ‘CCW from E’ and ‘CW from N’
 mhkit.dolfyn.tools.misc.fillgaps(a, maxgap=inf, dim=0, extrapFlg=False)[source]
Linearly fill NaN value in an array.
 Parameters:
a (numpy.ndarray) – The array to be filled.
maxgap (numpy.ndarray (optional: inf)) – The maximum gap to fill.
dim (int (optional: 0)) – The dimension to operate along.
extrapFlg (bool (optional: False)) – Whether to extrapolate if NaNs are found at the ends of the array.
See also
mhkit.dolfyn.tools.misc._interpgaps
Linearly interpolates in time.
Notes
This function interpolates assuming spacing/timestep between successive points is constant. If the spacing is not constant, use _interpgaps.
 mhkit.dolfyn.tools.misc.interpgaps(a, t, maxgap=inf, dim=0, extrapFlg=False)[source]
Fill gaps (NaN values) in
a
by linear interpolation along dimensiondim
with the point spacing specified int
. Parameters:
a (numpy.ndarray) – The array containing NaN values to be filled.
t (numpy.ndarray (len(t) == a.shape[dim])) – Independent variable of the points in
a
, e.g. timestepmaxgap (numpy.ndarray (optional: inf)) – The maximum gap to fill.
dim (int (optional: 0)) – The dimension to operate along.
extrapFlg (bool (optional: False)) – Whether to extrapolate if NaNs are found at the ends of the array.
See also
mhkit.dolfyn.tools.misc.fillgaps
Linearly interpolates in arrayindex space.
 mhkit.dolfyn.tools.misc.medfiltnan(a, kernel, thresh=0)[source]
Do a running median filter of the data. Regions where more than
thresh
fraction of the points are NaN are set to NaN. Parameters:
a (numpy.ndarray) – 2D array containing data to be filtered.
kernel_size (numpy.ndarray or list, optional) – A scalar or a list of length 2, giving the size of the median filter window in each dimension. Elements of kernel_size should be odd. If kernel_size is a scalar, then this scalar is used as the size in each dimension.
thresh (int) – Maximum gap in a to filter over
 Returns:
out (numpy.ndarray) – 2D array of same size containing filtered data
See also
scipy.signal.medfilt2d
Time
The time submodule contains functions to modify the format of the stored time between a variety of time formats.
Convert from epoch time (seconds since 1/1/1970 00:00:00) to numpy.datetime64 array 

Convert numpy.datetime64 array to epoch time (seconds since 1/1/1970 00:00:00) 

Convert numpy.datetime64 array to list of datetime objects 

Convert numpy.datetime64 array to list of datetime objects 

Convert from epoch time (seconds since 1/1/1970 00:00:00) to a list of datetime objects 

Convert list of datetime objects to legible strings 

Convert list of datetime objects to epoch time 

Convert list of datetime objects to MATLAB datenum 

Convert MATLAB datenum to list of datetime objects 
 mhkit.dolfyn.time.epoch2dt64(ep_time)[source]
Convert from epoch time (seconds since 1/1/1970 00:00:00) to numpy.datetime64 array
 Parameters:
ep_time (xarray.DataArray) – Time coordinate dataarray or single time element
 Returns:
time (numpy.datetime64) – The converted datetime64 array
 mhkit.dolfyn.time.dt642epoch(dt64)[source]
Convert numpy.datetime64 array to epoch time (seconds since 1/1/1970 00:00:00)
 Parameters:
dt64 (numpy.datetime64) – Single or array of datetime64 object(s)
 Returns:
time (float) – Epoch time (seconds since 1/1/1970 00:00:00)
 mhkit.dolfyn.time.date2dt64(dt)[source]
Convert numpy.datetime64 array to list of datetime objects
 Parameters:
time (datetime.datetime) – The converted datetime object
 Returns:
dt64 (numpy.datetime64) – Single or array of datetime64 object(s)
 mhkit.dolfyn.time.dt642date(dt64)[source]
Convert numpy.datetime64 array to list of datetime objects
 Parameters:
dt64 (numpy.datetime64) – Single or array of datetime64 object(s)
 Returns:
time (datetime.datetime) – The converted datetime object
 mhkit.dolfyn.time.epoch2date(ep_time, offset_hr=0, to_str=False)[source]
Convert from epoch time (seconds since 1/1/1970 00:00:00) to a list of datetime objects
 Parameters:
ep_time (xarray.DataArray) – Time coordinate dataarray or single time element
offset_hr (int) – Number of hours to offset time by (e.g. UTC 7 hours = PDT)
to_str (logical) – Converts datetime object to a readable string
 Returns:
time (datetime.datetime) – The converted datetime object or list(strings)
Notes
The specific time instance is set during deployment, usually sync’d to the deployment computer. The time seen by DOLfYN is in the timezone of the deployment computer, which is unknown to DOLfYN.
 mhkit.dolfyn.time.date2str(dt, format_str=None)[source]
Convert list of datetime objects to legible strings
 Parameters:
dt (datetime.datetime) – Single or list of datetime object(s)
format_str (string) – Timestamp string formatting. Default is ‘%Y%m%d %H:%M:%S.%f’ See datetime.strftime documentation for timestamp string formatting.
 Returns:
time (string) – Converted timestamps
 mhkit.dolfyn.time.date2epoch(dt)[source]
Convert list of datetime objects to epoch time
 Parameters:
dt (datetime.datetime) – Single or list of datetime object(s)
 Returns:
time (float) – Datetime converted to epoch time (seconds since 1/1/1970 00:00:00)