DOLfYN Module
The DOLfYN module contains a set of functions to analyze binary Nortek or TRDI files. Instrument datafiles are processed and returned as an xarray dataset.
There a couple ways to start using the dolfyn module. The first way is to import the entire module using import mhkit.dolfyn as dolfyn. Many functions are directly available from the main dolfyn import:
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file. |
|
Read an ADCP or ADV datafile from the examples directory. |
|
Save xarray dataset as netCDF (.nc). |
|
Load xarray dataset from netCDF (.nc) |
|
Save xarray dataset as a MATLAB (.mat) file |
|
Load xarray dataset from MATLAB (.mat) file, complimentary to save_mat() |
|
Rotate a dataset to a new coordinate system. |
|
Compute the principal angle of the horizontal velocity. |
|
Set the magnetic declination |
|
Set the instrument to head rotation matrix for the Nortek ADV if it hasn't already been set through a '.userdata.json' file. |
|
Calculate the orientation matrix from DOLfYN-defined euler angles. |
|
Calculate DOLfYN-defined euler angles from the orientation matrix. |
|
Calculate orientation from Nortek AHRS quaternions, where q = [W, X, Y, Z] instead of the standard q = [X, Y, Z, W] = [q1, q2, q3, q4] |
|
This is the base binning (averaging) tool. |
ADP Module
The other two ways are to import the instrument-specific modules. The ADP module contains routines for reading and working with ADP/ADCP data and is imported using from mhkit.dolfyn.adp import api. It contains:
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file. |
|
Load xarray dataset from netCDF (.nc) |
|
Rotate a dataset to a new coordinate system. |
|
Compute the principal angle of the horizontal velocity. |
|
Module containing functions to clean data |
|
This is the base binning (averaging) tool. |
|
|
ADV Module
The ADV module contains routines for reading and working with ADV data and is imported using from mhkit.dolfyn.adv import api. It contains:
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file. |
|
Load xarray dataset from netCDF (.nc) |
|
Rotate a dataset to a new coordinate system. |
|
Compute the principal angle of the horizontal velocity. |
|
Set the instrument to head rotation matrix for the Nortek ADV if it hasn't already been set through a '.userdata.json' file. |
|
Module containing functions to clean data |
|
This function performs motion correction on an IMU-ADV data object. |
|
This is the base binning (averaging) tool. |
|
A class that builds upon VelBinner for calculating turbulence statistics and velocity spectra from ADV data |
|
Functional version of ADVBinner that computes a suite of turbulence statistics for the input dataset, and returns a binned data object. |
IO
The io submodule contains the following functions to read binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or TRDI (.000, .PD0, .ENX, etc.) data file.
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file. |
|
Read an ADCP or ADV datafile from the examples directory. |
|
Save xarray dataset as netCDF (.nc). |
|
Load xarray dataset from netCDF (.nc) |
|
Save xarray dataset as a MATLAB (.mat) file |
|
Load xarray dataset from MATLAB (.mat) file, complimentary to save_mat() |
- mhkit.dolfyn.io.api.read(fname, userdata=True, nens=None, **kwargs)[source]
Read a binary Nortek (e.g., .VEC, .wpr, .ad2cp, etc.) or RDI (.000, .PD0, .ENX, etc.) data file.
- Parameters:
filename (string) – Filename of instrument file to read.
userdata (True, False, or string of userdata.json filename (default
True
)) – Whether to read the ‘<base-filename>.userdata.json’ file.nens (None, int or 2-element tuple (start, stop)) – Number of pings or ensembles to read from the file. Default is None, read entire file
**kwargs (dict) – Passed to instrument-specific parser.
- Returns:
ds (xarray.Dataset) – An xarray dataset from instrument datafile.
- mhkit.dolfyn.io.api.read_example(name, **kwargs)[source]
Read an ADCP or ADV datafile from the examples directory.
- Parameters:
name (str) –
A few available files:
AWAC_test01.wpr BenchFile01.ad2cp RDI_test01.000 vector_burst_mode01.VEC vector_data01.VEC vector_data_imu01.VEC winriver01.PD0 winriver02.PD0
- Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data.
- mhkit.dolfyn.io.api.save(ds, filename, format='NETCDF4', engine='netcdf4', compression=False, **kwargs)[source]
Save xarray dataset as netCDF (.nc).
- Parameters:
ds (xarray.Dataset) – Dataset to save
filename (str) – Filename and/or path with the ‘.nc’ extension
compression (bool) – When true, compress all variables with zlib complevel=1. Default is False
**kwargs (dict) – These are passed directly to
xarray.Dataset.to_netcdf()
Notes
Drops ‘config’ lines.
Rewrites variable encoding dict
More detailed compression options can be specified by specifying ‘encoding’ in kwargs. The values in encoding will take precedence over whatever is set according to the compression option above. See the xarray.to_netcdf documentation for more details.
- mhkit.dolfyn.io.api.load(filename)[source]
Load xarray dataset from netCDF (.nc)
- Parameters:
filename (str) – Filename and/or path with the ‘.nc’ extension
- Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data.
- mhkit.dolfyn.io.api.save_mat(ds, filename, datenum=True)[source]
Save xarray dataset as a MATLAB (.mat) file
- Parameters:
ds (xarray.Dataset) – Dataset to save
filename (str) – Filename and/or path with the ‘.mat’ extension
datenum (bool) – If true, converts time to datenum. If false, time will be saved in “epoch time”.
Notes
The xarray data format is saved as a MATLAB structure with the fields ‘vars, coords, config, units’. Converts time to datenum
See also
scipy.io.savemat
- mhkit.dolfyn.io.api.load_mat(filename, datenum=True)[source]
Load xarray dataset from MATLAB (.mat) file, complimentary to save_mat()
A .mat file must contain the fields: {vars, coords, config, units}, where ‘coords’ contain the dimensions of all variables in ‘vars’.
- Parameters:
filename (str) – Filename and/or path with the ‘.mat’ extension
datenum (bool) – If true, converts time from datenum. If false, converts time from “epoch time”.
- Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data.
See also
scipy.io.loadmat
- mhkit.dolfyn.io.nortek.read_nortek(filename, userdata=True, debug=False, do_checksum=False, nens=None, **kwargs)[source]
Read a classic Nortek (AWAC and Vector) datafile
- Parameters:
filename (string) – Filename of Nortek file to read.
userdata (bool, or string of userdata.json filename) – Whether to read the ‘<base-filename>.userdata.json’ file. Default = True
debug (bool) – Logs debugger ouput if true. Default = False
do_checksum (bool) – Whether to perform the checksum of each data block. Default = False
nens (None, int or 2-element tuple (start, stop)) – Number of pings or ensembles to read from the file. Default is None, read entire file
- Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data
- mhkit.dolfyn.io.nortek2.read_signature(filename, userdata=True, nens=None, rebuild_index=False, debug=False, dual_profile=False, **kwargs)[source]
Read a Nortek Signature (.ad2cp) datafile
- Parameters:
filename (string) – The filename of the file to load.
userdata (bool) – To search for and use a .userdata.json or not
nens (None, int or 2-element tuple (start, stop)) – Number of pings or ensembles to read from the file. Default is None, read entire file
rebuild_index (bool) – Force rebuild of dolfyn-written datafile index. Useful for code updates. Default = False
debug (bool) – Logs debugger ouput if true. Default = False
dual_profile (bool) – Set to true if instrument is running multiple profiles. Default = False
- Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data
- mhkit.dolfyn.io.nortek2.split_dp_datasets(ds)[source]
Splits a dataset containing dual profiles into individual profiles
- mhkit.dolfyn.io.rdi.read_rdi(filename, userdata=None, nens=None, debug_level=-1, vmdas_search=False, winriver=False, **kwargs)[source]
Read a TRDI binary data file.
- Parameters:
filename (string) – Filename of TRDI file to read.
userdata (True, False, or string of userdata.json filename) – Whether to read the ‘<base-filename>.userdata.json’ file. Default = True
nens (None, int or 2-element tuple (start, stop)) – Number of pings or ensembles to read from the file. Default is None, read entire file
debug_level (int) – Debug level [0 - 2]. Default = -1
vmdas_search (bool) – Search from the end of each ensemble for the VMDAS navigation block. The byte offsets are sometimes incorrect. Default = False
winriver (bool) – If file is winriver or not. Automatically set by dolfyn, this is helpful for debugging. Default = False
- Returns:
ds (xarray.Dataset) – An xarray dataset from the binary instrument data
Rotate
The rotate submodule contains tools to rotate a dataset to different coordinate systems. When a file is read into dolfyn, the data will be stored in the same coordinate system it was saved in. The different coordinate systems are “beam” <-> “inst” <-> “earth” <-> “principal”. “Beam” and “inst” are manufacturer/instrument specific and refer to the along-beam (either 3 or 4 beams) and instrument (XYZ) reference frame. Instrument reference frames differ across sensors, but the rotate code takes this into account when rotating into the “earth” frame, defined as East-North-Up (ENU). The earth frame can then be rotated to the principal axes, defined as streamwise-cross_stream-vertical.
Rotate a dataset to a new coordinate system. |
|
Compute the principal angle of the horizontal velocity. |
|
Set the magnetic declination |
|
Set the instrument to head rotation matrix for the Nortek ADV if it hasn't already been set through a '.userdata.json' file. |
|
Calculate the orientation matrix from DOLfYN-defined euler angles. |
|
Calculate DOLfYN-defined euler angles from the orientation matrix. |
|
Calculate orientation from Nortek AHRS quaternions, where q = [W, X, Y, Z] instead of the standard q = [X, Y, Z, W] = [q1, q2, q3, q4] |
|
Calculate "tilt", the vertical inclination, from pitch and roll. |
- mhkit.dolfyn.rotate.api.rotate2(ds, out_frame='earth', inplace=True)[source]
Rotate a dataset to a new coordinate system.
- Parameters:
ds (xr.Dataset) – The dolfyn dataset (ADV or ADCP) to rotate.
out_frame (string {'beam', 'inst', 'earth', 'principal'}) – The coordinate system to rotate the data into.
inplace (bool) – When True
ds
is modified. When False a copy is returned. Default = True
- Returns:
ds (xarray.Dataset or None) – Returns a new rotated dataset when ``inplace=False``, otherwise returns None.
Notes
This function rotates all variables in
ds.attrs['rotate_vars']
.In order to rotate to the ‘principal’ frame, a value should exist for
ds.attrs['principal_heading']
. The functioncalc_principal_heading
is recommended for this purpose, e.g.:ds.attrs[‘principal_heading’] = dolfyn.calc_principal_heading(ds[‘vel’].mean(range))
where here we are using the depth-averaged velocity to calculate the principal direction.
- mhkit.dolfyn.rotate.api.calc_principal_heading(vel, tidal_mode=True)[source]
Compute the principal angle of the horizontal velocity.
- Parameters:
vel (np.ndarray (2,...,Nt), or (3,...,Nt)) – The 2D or 3D velocity array (3rd-dim is ignored in this calculation)
tidal_mode (bool) – If true, range is set from 0 to +/-180 degrees. If false, range is 0 to 360 degrees. Default = True
- Returns:
p_heading (float or ndarray) – The principal heading in degrees clockwise from North.
Notes
When tidal_mode=True, this tool calculates the heading that is aligned with the bidirectional flow. It does so following these steps:
rotates vectors with negative velocity by 180 degrees
then doubles those angles to make a complete circle again
computes a mean direction from this, and halves that angle (to undo the doubled-angles in step 2)
The returned angle is forced to be between 0 and 180. So, you may need to add 180 to this if you want your positive direction to be in the western-half of the plane.
Otherwise, this function simply computes the average direction using a vector method.
- mhkit.dolfyn.rotate.api.set_declination(ds, declin, inplace=True)[source]
Set the magnetic declination
- Parameters:
ds (xarray.Dataset or
dolfyn.velocity.Velocity
) – The input dataset or velocity classdeclination (float) – The value of the magnetic declination in degrees (positive values specify that Magnetic North is clockwise from True North)
inplace (bool) – When True
ds
is modified. When False a copy is returned. Default = True
- Returns:
ds (xarray.Dataset or None) – Returns a new dataset with declination set when ``inplace=False``, otherwise returns None.
Notes
This function modifies the data object in the following ways:
If the dataset is in the earth reference frame at the time of setting declination, it will be rotated into the “True-East, True-North, Up” (hereafter, ETU) coordinate system
dat['orientmat']
is modified to be an ETU to instrument (XYZ) rotation matrix (rather than the magnetic-ENU to XYZ rotation matrix). Therefore, all rotations to/from the ‘earth’ frame will now be to/from this ETU coordinate system.The value of the specified declination will be stored in
dat.attrs['declination']
dat['heading']
is adjusted for declination (i.e., it is relative to True North).If
dat.attrs['principal_heading']
is set, it is adjusted to account for the orientation of the new ‘True’ earth coordinate system (i.e., calling set_declination on a data object in the principal coordinate system, then calling dat.rotate2(‘earth’) will yield a data object in the new ‘True’ earth coordinate system)
- mhkit.dolfyn.rotate.api.set_inst2head_rotmat(ds, rotmat, inplace=True)[source]
Set the instrument to head rotation matrix for the Nortek ADV if it hasn’t already been set through a ‘.userdata.json’ file.
- Parameters:
ds (xarray.Dataset) – The data set to assign inst2head_rotmat
rotmat (float) – 3x3 rotation matrix
inplace (bool) – When True
ds
is modified. When False a copy is returned. Default = True
- Returns:
ds (xarray.Dataset or None) – Returns a new dataset with inst2head_rotmat set when ``inplace=False``, otherwise returns None.
Notes
If the data object is in earth or principal coords, it is first rotated to ‘inst’ before assigning inst2head_rotmat, it is then rotated back to the coordinate system in which it was input. This way the inst2head_rotmat gets applied correctly (in inst coordinate system).
- mhkit.dolfyn.rotate.base.euler2orient(heading, pitch, roll, units='degrees')[source]
Calculate the orientation matrix from DOLfYN-defined euler angles.
This function is not likely to be called during data processing since it requires DOLfYN-defined euler angles. It is intended for testing DOLfYN.
The matrices H, P, R are the transpose of the matrices for rotation about z, y, x as shown here https://en.wikipedia.org/wiki/Rotation_matrix. The transpose is used because in DOLfYN the orientation matrix is organized for rotation from EARTH –> INST, while the wiki’s matrices are organized for rotation from INST –> EARTH.
- Parameters:
heading (numpy.ndarray) – The heading angle.
pitch (numpy.ndarray) – The pitch angle.
roll (numpy.ndarray) – The roll angle.
units (str) – Units in degrees or radians. is ‘degrees’
- Returns:
omat (numpy.ndarray (3x3xtime)) – The orientation matrix of the data. The returned orientation matrix obeys the following conventions:
a “ZYX” rotation order. That is, these variables are computed assuming that rotation from the earth -> instrument frame happens by rotating around the z-axis first (heading), then rotating around the y-axis (pitch), then rotating around the x-axis (roll). Note this requires matrix multiplication in the reverse order.
heading is defined as the direction the x-axis points, positive clockwise from North (this is opposite the right-hand-rule around the Z-axis), range 0-360 degrees.
pitch is positive when the x-axis pitches up (this is opposite the right-hand-rule around the Y-axis)
roll is positive according to the right-hand-rule around the instrument’s x-axis
- mhkit.dolfyn.rotate.base.orient2euler(omat)[source]
Calculate DOLfYN-defined euler angles from the orientation matrix.
- Parameters:
omat (numpy.ndarray) – The orientation matrix
- Returns:
heading (numpy.ndarray) – The heading angle. Heading is defined as the direction the x-axis points, positive clockwise from North (this is opposite the right-hand-rule around the Z-axis), range 0-360 degrees.
pitch (np.ndarray) – The pitch angle (degrees). Pitch is positive when the x-axis pitches up (this is opposite the right-hand-rule around the Y-axis).
roll (np.ndarray) – The roll angle (degrees). Roll is positive according to the right-hand-rule around the instrument’s x-axis.
- mhkit.dolfyn.rotate.base.quaternion2orient(quaternions)[source]
Calculate orientation from Nortek AHRS quaternions, where q = [W, X, Y, Z] instead of the standard q = [X, Y, Z, W] = [q1, q2, q3, q4]
- Parameters:
quaternions (xarray.DataArray) – Quaternion dataArray from the raw dataset
- Returns:
orientmat (numpy.ndarray) – The earth2inst rotation maxtrix as calculated from the quaternions
See also
scipy.spatial.transform.Rotation
- mhkit.dolfyn.rotate.base.calc_tilt(pitch, roll)[source]
Calculate “tilt”, the vertical inclination, from pitch and roll.
- Parameters:
roll (numpy.ndarray or xarray.DataArray) – Instrument roll in degrees
pitch (numpy.ndarray or xarray.DataArray) – Instrument pitch in degrees
- Returns:
tilt (numpy.ndarray) – Vertical inclination of the instrument in degrees
Cleaning Data
The clean submodule contains basic quality control tools specific to ADCP and ADVs.
ADP-specific functions:
Adds an instrument's height above seafloor (for an up-facing instrument) or depth below water surface (for a down-facing instrument) to the range coordinate. |
|
Find the surface (water level or seafloor) from amplitude data and adds the variable "depth" to the input Dataset. |
|
Calculates the distance to the water surface. |
|
Mask the values of 3D data (vel, amp, corr, echo) that are beyond the surface. |
|
Filters out data where correlation is below a threshold in the along-beam correlation data. |
|
Median filters the orientation data (heading-pitch-roll or quaternions) |
|
Find values of a variable that exceed a threshold value, and assign "val" to the velocity data where the threshold is exceeded. |
|
Fill gaps (nan values) in var across time using the specified method |
|
Fill gaps (nan values) in var along the depth profile using the specified method |
ADV-specific functions:
Interpolate over mask values in timeseries data using the specified method |
|
Fill missing values with the ensemble mean. |
|
Returns a logical vector where a spike in u of magnitude greater than thresh occurs. |
|
Returns a logical vector that is True where the values of u are outside of range. |
|
The Goring & Nikora 2002 'despiking' method, with Wahl2003 correction. |
Module containing functions to clean data
- mhkit.dolfyn.adp.clean.set_range_offset(ds, h_deploy)[source]
Adds an instrument’s height above seafloor (for an up-facing instrument) or depth below water surface (for a down-facing instrument) to the range coordinate. Also adds an attribute to the Dataset with the current “h_deploy” distance.
- Parameters:
ds (xarray.Dataset) – The adcp dataset to ajust ‘range’ on
h_deploy (numeric) – Deployment location in the water column, in [m]
- Returns:
None, operates “in place”
Notes
Center of bin 1 = h_deploy + blank_dist + cell_size
Nortek doesn’t take h_deploy into account, so the range that DOLfYN calculates distance is from the ADCP transducers. TRDI asks for h_deploy input in their deployment software and is thereby known by DOLfYN.
If the ADCP is mounted on a tripod on the seafloor, h_deploy will be the height of the tripod +/- any extra distance to the transducer faces. If the instrument is vessel-mounted, h_deploy is the distance between the surface and downward-facing ADCP’s transducers.
- mhkit.dolfyn.adp.clean.find_surface(ds, thresh=10, nfilt=None)[source]
Find the surface (water level or seafloor) from amplitude data and adds the variable “depth” to the input Dataset.
- Parameters:
ds (xarray.Dataset) – The full adcp dataset
thresh (int) – Specifies the threshold used in detecting the surface. Default = 10 (The amount that amplitude must increase by near the surface for it to be considered a surface hit)
nfilt (int) – Specifies the width of the median filter applied, must be odd. Default is None
- Returns:
None, operates “in place”
- mhkit.dolfyn.adp.clean.find_surface_from_P(ds, salinity=35)[source]
Calculates the distance to the water surface. Temperature and salinity are used to calculate seawater density, which is in turn used with the pressure data to calculate depth.
- Parameters:
ds (xarray.Dataset) – The full adcp dataset
salinity (numeric) – Water salinity in psu. Default = 35
- Returns:
None, operates “in place” and adds the variables “water_density” and
”depth” to the input dataset.
Notes
Requires that the instrument’s pressure sensor was calibrated/zeroed before deployment to remove atmospheric pressure.
Calculates seawater density using a linear approximation of the sea water equation of state:
\[\rho - \rho_0 = -\alpha (T-T_0) + \beta (S-S_0) + \kappa P\]Where \(\rho\) is water density, \(T\) is water temperature, \(P\) is water pressure, \(S\) is practical salinity, \(\alpha\) is the thermal expansion coefficient, \(\beta\) is the haline contraction coefficient, and \(\kappa\) is adiabatic compressibility.
- mhkit.dolfyn.adp.clean.nan_beyond_surface(ds, val=nan, beam_angle=None, inplace=False)[source]
Mask the values of 3D data (vel, amp, corr, echo) that are beyond the surface.
- Parameters:
ds (xarray.Dataset) – The adcp dataset to clean
val (nan or numeric) – Specifies the value to set the bad values to. Default is numpy.nan
beam_angle (int) – ADCP beam inclination angle. Default = dataset.attrs[‘beam_angle’]
inplace (bool) – When True the existing data object is modified. When False a copy is returned. Default = False
- Returns:
ds (xarray.Dataset) – Sets the adcp dataset where relevant arrays with values greater than depth set to NaN
Notes
Surface interference expected to happen at distance > range * cos(beam angle) - cell size
- mhkit.dolfyn.adp.clean.correlation_filter(ds, thresh=50, inplace=False)[source]
Filters out data where correlation is below a threshold in the along-beam correlation data.
- Parameters:
ds (xarray.Dataset) – The adcp dataset to clean.
thresh (numeric) – The maximum value of correlation to screen, in counts or %. Default = 50
inplace (bool) – When True the existing data object is modified. When False a copy is returned. Default = False
- Returns:
ds (xarray.Dataset) – Elements in velocity, correlation, and amplitude are removed if below the correlation threshold
Notes
Does not edit correlation or amplitude data.
- mhkit.dolfyn.adp.clean.medfilt_orient(ds, nfilt=7)[source]
Median filters the orientation data (heading-pitch-roll or quaternions)
- Parameters:
ds (xarray.Dataset) – The adcp dataset to clean
nfilt (numeric) – The length of the median-filtering kernel. Must be odd. Default = 7
- Returns:
ds (xarray.Dataset) – The adcp dataset with the filtered orientation data
See also
scipy.signal.medfilt
- mhkit.dolfyn.adp.clean.val_exceeds_thresh(var, thresh=5, val=nan)[source]
Find values of a variable that exceed a threshold value, and assign “val” to the velocity data where the threshold is exceeded.
- Parameters:
var (xarray.DataArray) – Variable to clean
thresh (numeric) – The maximum value of velocity to screen. Default = 5
val (nan or numeric) – Specifies the value to set the bad values to. Default is numpy.nan
- Returns:
ds (xarray.Dataset) – The adcp dataset with datapoints beyond thresh are set to val
- mhkit.dolfyn.adp.clean.fillgaps_time(var, method='cubic', maxgap=None)[source]
Fill gaps (nan values) in var across time using the specified method
- Parameters:
var (xarray.DataArray) – The variable to clean
method (string) – Interpolation method to use. Default is ‘cubic’
maxgap (numeric) – Maximum gap of missing data to interpolate across. Default is None
- Returns:
out (xarray.DataArray) – The input DataArray ‘var’ with gaps in ‘var’ interpolated across time
See also
xarray.DataArray.interpolate_na
- mhkit.dolfyn.adp.clean.fillgaps_depth(var, method='cubic', maxgap=None)[source]
Fill gaps (nan values) in var along the depth profile using the specified method
- Parameters:
var (xarray.DataArray) – The variable to clean
method (string) – Interpolation method to use. Default is ‘cubic’
maxgap (numeric) – Maximum gap of missing data to interpolate across. Default is None
- Returns:
out (xarray.DataArray) – The input DataArray ‘var’ with gaps in ‘var’ interpolated across depth
See also
xarray.DataArray.interpolate_na
Module containing functions to clean data
- mhkit.dolfyn.adv.clean.clean_fill(u, mask, npt=12, method='cubic', maxgap=6)[source]
Interpolate over mask values in timeseries data using the specified method
- Parameters:
u (xarray.DataArray) – The dataArray to clean.
mask (bool) – Logical tensor of elements to “nan” out (from spikeThresh, rangeLimit, or GN2002) and replace
npt (int) – The number of points on either side of the bad values that interpolation occurs over
method (string) – Interpolation method to use (linear, cubic, pchip, etc). Default is ‘cubic’
maxgap (numeric) – Maximum gap of missing data to interpolate across. Default is None
- Returns:
da (xarray.DataArray) – The dataArray with nan’s filled in
See also
xarray.DataArray.interpolate_na
- mhkit.dolfyn.adv.clean.fill_nan_ensemble_mean(u, mask, fs, window)[source]
Fill missing values with the ensemble mean.
- Parameters:
u (xarray.DataArray (..., time)) – The dataArray to clean. Can be 1D or 2D.
mask (bool) – Logical tensor of elements to “nan” out (from spikeThresh, rangeLimit, or GN2002) and replace
fs (int) – Instrument sampling frequency
window (int) – Size of window in seconds used to calculate ensemble means
- Returns:
da (xarray.DataArray) – The dataArray with nan’s filled in
Notes
Gaps larger than the ensemble size will not get filled in.
- mhkit.dolfyn.adv.clean.spike_thresh(u, thresh=10)[source]
Returns a logical vector where a spike in u of magnitude greater than thresh occurs. Both ‘Negative’ and ‘positive’ spikes are found.
- Parameters:
u (xarray.DataArray) – The timeseries data to clean.
thresh (int) – Magnitude of velocity spike, must be positive. Default = 10
- Returns:
mask (numpy.ndarray) – Logical vector with spikes labeled as ‘True’
- mhkit.dolfyn.adv.clean.range_limit(u, range=[-5, 5])[source]
Returns a logical vector that is True where the values of u are outside of range.
- Parameters:
u (xarray.DataArray) – The timeseries data to clean.
range (list) – Min and max magnitudes beyond which are masked. Default is [-5, 5]
- Returns:
mask (numpy.ndarray) – Logical vector with spikes labeled as ‘True’
- mhkit.dolfyn.adv.clean.GN2002(u, npt=5000)[source]
The Goring & Nikora 2002 ‘despiking’ method, with Wahl2003 correction. Returns a logical vector that is true where spikes are identified.
- Parameters:
u (xarray.DataArray) – The velocity array (1D or 3D) to clean.
npt (int) – The number of points over which to perform the method. Default = 5000
- Returns:
mask (numpy.ndarray) – Logical vector with spikes labeled as ‘True’
Motion Correction
The motion correction submodule contains tools to correct Nortek Vector ADV-IMU data using the onboard IMU, if equipped. Requires proper setup prior to collecting data.
This function performs motion correction on an IMU-ADV data object. |
This function performs motion correction on an IMU-ADV data object. The IMU and ADV data should be tightly synchronized and contained in a single data object.
- param ds:
Cleaned ADV dataset in “inst” coordinates
- type ds:
xarray.Dataset
- param accel_filtfreq:
the frequency at which to high-pass filter the acceleration sigal to remove low-frequency drift.
- type accel_filtfreq:
float
- param vel_filtfreq:
a second frequency to high-pass filter the integrated acceleration. Optional, default = 1/3 of accel_filtfreq
- type vel_filtfreq:
float
- param to_earth:
All variables in the ds.props[‘rotate_vars’] list will be rotated into either the earth frame (to_earth=True) or the instrument frame (to_earth=False). Optional, default = True
- type to_earth:
bool
- param separate_probes:
a flag to perform motion-correction at the probe tips, and perform motion correction in beam-coordinates, then transform back into XYZ/earth coordinates. This correction seems to be lower than the noise levels of the ADV, so the default is to not use it (False).
- type separate_probes:
bool
- returns:
This function returns None, it operates on the input data object,
ds
. The following attributes are added to ds –velraw
is the uncorrected velocityvelrot
is the rotational component of the head motion (fromangrt)
velacc
is the translational component of the head motion (fromaccel, the high-pass filtered accel sigal)
acclow
is the low-pass filtered accel sigal (i.e.,The primary velocity vector attribute,
vel
, is motion correctedsuch that – vel = velraw + velrot + velacc
The sigs are correct in this equation. The measured velocity
induced by head-motion is *in the opposite direction of the head*
motion. i.e. when the head moves one way in stationary flow, it
measures a velocity in the opposite direction. Therefore, to
remove the motion from the raw sigal we *add the head motion.*
Notes
Acceleration signals from inertial sensors are notorious for having a small bias that can drift slowly in time. When integrating these sigals to estimate velocity the bias is amplified and leads to large errors in the estimated velocity. There are two methods for removing these errors,
high-pass filter the acceleration sigal prior and/or after integrating. This implicitly assumes that the low-frequency translational velocity is zero.
provide a slowly-varying reference position (often from a GPS) to an IMU that can use the sigal (usually using Kalman filters) to debias the acceleration sigal.
Because method (1) removes real low-frequency acceleration, method (2) is more accurate. However, providing reference position estimates to undersea instruments is practically challenging and expensive. Therefore, lacking the ability to use method (2), this function utilizes method (1).
For deployments in which the ADV is mounted on a mooring, or other semi-fixed structure, the assumption of zero low-frequency translational velocity is a reasonable one. However, for deployments on ships, gliders, or other moving objects it is not. The measured velocity, after motion-correction, will still hold some of this contamination and will be a sum of the ADV motion and the measured velocity on long time scales. If low-frequency motion is known separate from the ADV (e.g. from a bottom-tracking ADP, or from a ship’s GPS), it may be possible to remove that sigal from the ADV sigal in post-processing.
Velocity Analysis
Analysis in DOLfYN is primarily handled through the VelBinner class. Below is a list of functions that can be called from VelBinner.
This is the base binning (averaging) tool. |
|
Reshape the array arr to shape (...,n,n_bin+n_pad). |
|
Reshape the array arr to shape (...,n,n_bin+n_pad) and remove the best-fit trend line from each bin. |
|
Reshape the array arr to shape (...,n,n_bin+n_pad) and remove the mean from each bin. |
|
Reshape the array arr to shape (...,n,n_bin+n_pad) and take the mean of each bin along the specified axis. |
|
Reshape the array arr to shape (...,n,n_bin+n_pad) and take the variance of each bin along the specified axis. |
|
Reshape the array arr to shape (...,n,n_bin+n_pad) and take the standard deviation of each bin along the specified axis. |
|
Calculate the turbulent kinetic energy (TKE) (variances of u,v,w). |
|
Calculate the power spectral density of velocity. |
- class mhkit.dolfyn.binned.TimeBinner(n_bin, fs, n_fft=None, n_fft_coh=None, noise=[0, 0, 0])[source]
Bases:
object
- reshape(arr, n_pad=0, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad).
- Parameters:
arr (numpy.ndarray) –
n_pad (int) – Is used to add n_pad/2 points from the end of the previous ensemble to the top of the current, and n_pad/2 points from the top of the next ensemble to the bottom of the current. Zeros are padded in the upper-left and lower-right corners of the matrix (beginning/end of timeseries). In this case, the array shape will be (…,`n`,`n_pad`+`n_bin`)
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
- Returns:
out (numpy.ndarray)
Notes
n_bin can be non-integer, in which case the output array size will be n_pad`+`n_bin, and the decimal will cause skipping of some data points in arr. In particular, every mod(n_bin,1) bins will have a skipped point. For example: - for n_bin=2048.2 every 1/5 bins will have a skipped point. - for n_bin=4096.9 every 9/10 bins will have a skipped point.
- detrend(arr, axis=-1, n_pad=0, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and remove the best-fit trend line from each bin.
- Parameters:
arr (numpy.ndarray) –
axis (int) – Axis along which to take mean. Default = -1
n_pad (int) – Is used to add n_pad/2 points from the end of the previous ensemble to the top of the current, and n_pad/2 points from the top of the next ensemble to the bottom of the current. Zeros are padded in the upper-left and lower-right corners of the matrix (beginning/end of timeseries). In this case, the array shape will be (…,`n`,`n_pad`+`n_bin`). Default = 0
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
- Returns:
out (numpy.ndarray)
- demean(arr, axis=-1, n_pad=0, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and remove the mean from each bin.
- Parameters:
arr (numpy.ndarray) –
axis (int) – Axis along which to take mean. Default = -1
n_pad (int) – Is used to add n_pad/2 points from the end of the previous ensemble to the top of the current, and n_pad/2 points from the top of the next ensemble to the bottom of the current. Zeros are padded in the upper-left and lower-right corners of the matrix (beginning/end of timeseries). In this case, the array shape will be (…,`n`,`n_pad`+`n_bin`). Default = 0
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
- Returns:
out (numpy.ndarray)
- mean(arr, axis=-1, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and take the mean of each bin along the specified axis.
- Parameters:
arr (numpy.ndarray) –
axis (int) – Axis along which to take mean. Default = -1
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
- Returns:
out (numpy.ndarray)
- variance(arr, axis=-1, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and take the variance of each bin along the specified axis.
- Parameters:
arr (numpy.ndarray) –
axis (int) – Axis along which to take variance. Default = -1
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
- Returns:
out (numpy.ndarray)
- standard_deviation(arr, axis=-1, n_bin=None)[source]
Reshape the array arr to shape (…,n,n_bin+n_pad) and take the standard deviation of each bin along the specified axis.
- Parameters:
arr (numpy.ndarray) –
axis (int) – Axis along which to take std dev. Default = -1
n_bin (int) – Override this binner’s n_bin. Default is binner.n_bin
- Returns:
out (numpy.ndarray)
- class mhkit.dolfyn.velocity.Velocity(ds, *args, **kwargs)[source]
Bases:
object
All ADCP and ADV xarray datasets wrap this base class.
The turbulence-related attributes defined within this class assume that the
'tke_vec'
and'stress_vec'
data entries are included in the dataset. These are typically calculated using aVelBinner
tool, but the method for calculating these variables can depend on the details of the measurement (instrument, it’s configuration, orientation, etc.).See also
- rotate2(out_frame='earth', inplace=True)[source]
Rotate the dataset to a new coordinate system.
- Parameters:
out_frame (string {'beam', 'inst', 'earth', 'principal'}) – The coordinate system to rotate the data into.
inplace (bool) – When True the existing data object is modified. When False a copy is returned. Default = True
- Returns:
ds (xarray.Dataset or None) – Returns the rotated dataset when ``inplace=False``, otherwise returns None.
Notes
This function rotates all variables in
ds.attrs['rotate_vars']
.To rotate to the ‘principal’ frame, a value of
ds.attrs['principal_heading']
must exist. The functioncalc_principal_heading
is recommended for this purpose, e.g.:ds.attrs['principal_heading'] = dolfyn.calc_principal_heading(ds['vel'].mean(range))
where here we are using the depth-averaged velocity to calculate the principal direction.
- set_declination(declin, inplace=True)[source]
Set the magnetic declination
- Parameters:
declination (float) – The value of the magnetic declination in degrees (positive values specify that Magnetic North is clockwise from True North)
inplace (bool) – When True the existing data object is modified. When False a copy is returned. Default = True
- Returns:
ds (xarray.Dataset or None) – Returns the rotated dataset when ``inplace=False``, otherwise returns None.
Notes
This method modifies the data object in the following ways:
If the dataset is in the earth reference frame at the time of
setting declination, it will be rotated into the “True-East, True-North, Up” (hereafter, ETU) coordinate system
dat['orientmat']
is modified to be an ETU to
instrument (XYZ) rotation matrix (rather than the magnetic-ENU to XYZ rotation matrix). Therefore, all rotations to/from the ‘earth’ frame will now be to/from this ETU coordinate system.
The value of the specified declination will be stored in
dat.attrs['declination']
dat['heading']
is adjusted for declination
(i.e., it is relative to True North).
If
dat.attrs['principal_heading']
is set, it is
adjusted to account for the orientation of the new ‘True’ earth coordinate system (i.e., calling set_declination on a data object in the principal coordinate system, then calling dat.rotate2(‘earth’) will yield a data object in the new ‘True’ earth coordinate system)
- set_inst2head_rotmat(rotmat, inplace=True)[source]
Set the instrument to head rotation matrix for the Nortek ADV if it hasn’t already been set through a ‘.userdata.json’ file.
- Parameters:
rotmat (float) – 3x3 rotation matrix
inplace (bool) – When True the existing data object is rotated. When False a copy is returned that is rotated. Default = True
- Returns:
ds (xarray.Dataset or None) – Returns the rotated dataset when ``inplace=False``, otherwise returns None.
Notes
If the data object is in earth or principal coords, it is first rotated to ‘inst’ before assigning inst2head_rotmat, it is then rotated back to the coordinate system in which it was input. This way the inst2head_rotmat gets applied correctly (in inst coordinate system).
- save(filename, **kwargs)[source]
Save the data object (underlying xarray dataset) as netCDF (.nc).
- Parameters:
filename (str) – Filename and/or path with the ‘.nc’ extension
**kwargs (dict) – These are passed directly to
xarray.Dataset.to_netcdf()
.
Notes
See DOLfYN’s
save
function for additional details.
- property variables
A sorted list of the variable names in the dataset.
- property attrs
The attributes in the dataset.
- property coords
The coordinates in the dataset.
- property u
The first velocity component.
This is simply a shortcut to self[‘vel’][0]. Therefore, depending on the coordinate system of the data object (self.attrs[‘coord_sys’]), it is:
beam: beam1
inst: x
earth: east
principal: streamwise
- property v
The second velocity component.
This is simply a shortcut to self[‘vel’][1]. Therefore, depending on the coordinate system of the data object (self.attrs[‘coord_sys’]), it is:
beam: beam2
inst: y
earth: north
principal: cross-stream
- property w
The third velocity component.
This is simply a shortcut to self[‘vel’][2]. Therefore, depending on the coordinate system of the data object (self.attrs[‘coord_sys’]), it is:
beam: beam3
inst: z
earth: up
principal: up
- property U
Horizontal velocity as a complex quantity
- property U_mag
Horizontal velocity magnitude
- property U_dir
Angle of horizontal velocity vector. Direction is ‘to’, as opposed to ‘from’. This function calculates angle as “degrees CCW from X/East/streamwise” and then converts it to “degrees CW from X/North/streamwise”.
- property E_coh
Coherent turbulent energy
Niel Kelley’s ‘coherent turbulence energy’, which is the RMS of the Reynold’s stresses.
See: NREL Technical Report TP-500-52353
- property I_tke
Turbulent kinetic energy intensity.
Ratio of sqrt(tke) to horizontal velocity magnitude.
- property I
Turbulence intensity.
Ratio of standard deviation of horizontal velocity to horizontal velocity magnitude.
- property tke
Turbulent kinetic energy (sum of the three components)
- property upvp_
u’v’bar Reynolds stress
- property upwp_
u’w’bar Reynolds stress
- property vpwp_
v’w’bar Reynolds stress
- property upup_
u’u’bar component of the tke
- property vpvp_
v’v’bar component of the tke
- property wpwp_
w’w’bar component of the tke
- class mhkit.dolfyn.velocity.VelBinner(n_bin, fs, n_fft=None, n_fft_coh=None, noise=[0, 0, 0])[source]
Bases:
TimeBinner
This is the base binning (averaging) tool. All DOLfYN binning tools derive from this base class.
Examples
The VelBinner class is used to compute averages and turbulence statistics from ‘raw’ (not averaged) ADV or ADP measurements, for example:
# First read or load some data. rawdat = dolfyn.read_example('BenchFile01.ad2cp') # Now initialize the averaging tool: binner = dolfyn.VelBinner(n_bin=600, fs=rawdat.fs) # This computes the basic averages avg = binner.bin_average(rawdat)
- tke = <xarray.DataArray 'tke' (tke: 3)> Size: 60B array(['upup_', 'vpvp_', 'wpwp_'], dtype='<U5') Dimensions without coordinates: tke Attributes: units: 1 long_name: Turbulent Kinetic Energy Vector Components coverage_content_type: coordinate
- tau = <xarray.DataArray 'tau' (tau: 3)> Size: 60B array(['upvp_', 'upwp_', 'vpwp_'], dtype='<U5') Dimensions without coordinates: tau Attributes: units: 1 long_name: Reynolds Stress Vector Components coverage_content_type: coordinate
- S = <xarray.DataArray 'S' (S: 3)> Size: 36B array(['Sxx', 'Syy', 'Szz'], dtype='<U3') Dimensions without coordinates: S Attributes: units: 1 long_name: Power Spectral Density Vector Components coverage_content_type: coordinate
- C = <xarray.DataArray 'C' (C: 3)> Size: 36B array(['Cxy', 'Cxz', 'Cyz'], dtype='<U3') Dimensions without coordinates: C Attributes: units: 1 long_name: Cross-Spectral Density Vector Components coverage_content_type: coordinate
- bin_average(raw_ds, out_ds=None, names=None)[source]
Bin the dataset and calculate the ensemble averages of each variable.
- Parameters:
raw_ds (xarray.Dataset) – The raw data structure to be binned
out_ds (xarray.Dataset) – The bin’d (output) data object to which averaged data is added.
names (list of strings) – The names of variables to be averaged. If names is None, all data in raw_ds will be binned.
- Returns:
out_ds (xarray.Dataset) – The new (or updated when out_ds is not None) dataset with the averages of all the variables in raw_ds.
:raises AttributeError : when out_ds is supplied as input (not None): :raises and the values in out_ds.attrs are inconsistent with: :raises raw_ds.attrs or the properties of this VelBinner (n_bin,: :raises n_fft, fs, etc.):
Notes
raw_ds.attrs are copied to out_ds.attrs. Inconsistencies between the two (when out_ds is specified as input) raise an AttributeError.
- bin_variance(raw_ds, out_ds=None, names=None, suffix='_var')[source]
Bin the dataset and calculate the ensemble variances of each variable. Complementary to bin_average().
- Parameters:
raw_ds (xarray.Dataset) – The raw data structure to be binned.
out_ds (xarray.Dataset) – The binned (output) dataset to which variance data is added, nominally dataset output from bin_average()
names (list of strings) – The names of variables of which to calculate variance. If names is None, all data in raw_ds will be binned.
- Returns:
out_ds (xarray.Dataset) – The new (or updated when out_ds is not None) dataset with the variance of all the variables in raw_ds.
:raises AttributeError : when out_ds is supplied as input (not None): :raises and the values in out_ds.attrs are inconsistent with: :raises raw_ds.attrs or the properties of this VelBinner (n_bin,: :raises n_fft, fs, etc.):
Notes
raw_ds.attrs are copied to out_ds.attrs. Inconsistencies between the two (when out_ds is specified as input) raise an AttributeError.
- autocovariance(veldat, n_bin=None)[source]
Calculate the auto-covariance of the raw-signal veldat
- Parameters:
veldat (xarray.DataArray) – The raw dataArray of which to calculate auto-covariance
n_bin (float) – Number of data elements to use
- Returns:
da (xarray.DataArray) – The auto-covariance of veldat
Notes
As opposed to cross-covariance, which returns the full cross-covariance between two arrays, this function only returns a quarter of the full auto-covariance. It computes the auto-covariance over half of the range, then averages the two sides (to return a ‘quartered’ covariance).
This has the advantage that the 0 index is actually zero-lag.
- turbulence_intensity(U_mag, noise=0, thresh=0, detrend=False)[source]
Calculate noise-corrected turbulence intensity.
- Parameters:
U_mag (xarray.DataArray) – Raw horizontal velocity magnitude
noise (numeric) – Instrument noise level in same units as velocity. Typically found from <adv or adp>.turbulence.doppler_noise_level. Default: None.
thresh (numeric) – Theshold below which TI will not be calculated
detrend (bool (default: False)) – Detrend the velocity data (True), or simply de-mean it (False), prior to computing TI.
- turbulent_kinetic_energy(veldat, noise=None, detrend=True)[source]
Calculate the turbulent kinetic energy (TKE) (variances of u,v,w).
- Parameters:
veldat (xarray.DataArray) – Velocity data array from ADV or single beam from ADCP. The last dimension is assumed to be time.
noise (float or array-like) – Instrument noise level in same units as velocity. Typically found from <adv or adp>.turbulence.doppler_noise_level. Default: None.
detrend (bool (default: False)) – Detrend the velocity data (True), or simply de-mean it (False), prior to computing TKE. Note: the PSD routines use detrend, so if you want to have the same amount of variance here as there use
detrend=True
.
- Returns:
tke_vec (xarray.DataArray) – dataArray containing u’u’_, v’v’_ and w’w’_
- power_spectral_density(veldat, freq_units='rad/s', fs=None, window='hann', noise=0, n_bin=None, n_fft=None, n_pad=None, step=None)[source]
Calculate the power spectral density of velocity.
- Parameters:
veldat (xr.DataArray) – The raw velocity data (of dims ‘dir’ and ‘time’).
freq_units (string) – Frequency units of the returned spectra in either Hz or rad/s (f or \(\omega\))
fs (float (optional)) – The sample rate. Default is binner.fs
window (string or array) – Specify the window function. Options: 1, None, ‘hann’, ‘hamm’
noise (numeric or array) – Instrument noise level in same units as velocity. Default: 0 (ADCP) or [0, 0, 0] (ADV).
n_bin (int (optional)) – The bin-size. Default: from the binner.
n_fft (int (optional)) – The fft size. Default: from the binner.
n_pad (int (optional)) – The number of values to pad with zero. Default = 0.
step (int (optional)) – Controls amount of overlap in fft. Default: the step size is chosen to maximize data use, minimize nens, and have a minimum of 50% overlap.
- Returns:
psd (xarray.DataArray (3, M, N_FFT)) – The spectra in the ‘u’, ‘v’, and ‘w’ directions.
Turbulence Analysis
Functions for analyzing ADV data via the ADVBinner class, beyond those described in VelBinner.
A class that builds upon VelBinner for calculating turbulence statistics and velocity spectra from ADV data |
|
Functional version of ADVBinner that computes a suite of turbulence statistics for the input dataset, and returns a binned data object. |
|
Calculate the specific Reynolds stresses (cross-covariances of u,v,w in m^2/s^2) |
|
Calculate the cross-spectral density of velocity components. |
|
Calculate the dissipation rate from the PSD |
|
Calculate dissipation rate using the "structure function" (SF) method |
|
Calculate the dissipation rate according to TE01. |
|
Calculate integral length scales. |
- class mhkit.dolfyn.adv.turbulence.ADVBinner(n_bin, fs, n_fft=None, n_fft_coh=None, noise=[0, 0, 0])[source]
Bases:
VelBinner
A class that builds upon VelBinner for calculating turbulence statistics and velocity spectra from ADV data
- Parameters:
n_bin (int) – The length of each bin, in number of points, for this averaging operator.
fs (int) – Instrument sampling frequency in Hz
n_fft (int) – The length of the FFT for computing spectra (must be <= n_bin). Optional, default n_fft = n_bin
n_fft_coh (int) –
Number of data points to use for coherence and cross-spectra fft’s. Optional, default n_fft_coh = n_fft
- noisefloat or array-like
Instrument noise level in same units as velocity. Typically found from adv.turbulence.doppler_noise_level. Default: None.
- reynolds_stress(veldat, detrend=True)[source]
Calculate the specific Reynolds stresses (cross-covariances of u,v,w in m^2/s^2)
- Parameters:
veldat (xr.DataArray) – A velocity data array. The last dimension is assumed to be time.
detrend (bool) – Detrend the velocity data (True), or simply de-mean it (False), prior to computing stress. Note: the psd routines use detrend, so if you want to have the same amount of variance here as there use
detrend=True
. Default = True
- Returns:
out (xarray.DataArray)
- cross_spectral_density(veldat, freq_units='rad/s', fs=None, window='hann', n_bin=None, n_fft_coh=None)[source]
Calculate the cross-spectral density of velocity components.
- Parameters:
veldat (xarray.DataArray) – The raw 3D velocity data.
freq_units (string) – Frequency units of the returned spectra in either Hz or rad/s (f or \(\omega\))
fs (float (optional)) – The sample rate. Default = binner.fs
window (string or array) – Specify the window function. Options: 1, None, ‘hann’, ‘hamm’
n_bin (int (optional)) – The bin-size. Default = binner.n_bin
n_fft_coh (int (optional)) – The fft size. Default = binner.n_fft_coh
- Returns:
csd (xarray.DataArray (3, M, N_FFT)) – The first-dimension of the cross-spectrum is the three different cross-spectra: ‘uv’, ‘uw’, ‘vw’.
- doppler_noise_level(psd, pct_fN=0.8)[source]
Calculate bias due to Doppler noise using the noise floor of the velocity spectra.
- Parameters:
psd (xarray.DataArray (dir, time, f)) – The ADV power spectral density of velocity (auto-spectra)
pct_fN (float) – Percent of Nyquist frequency to calculate characeristic frequency
- Returns:
doppler_noise (xarray.DataArray) – Doppler noise level in units of m/s
Notes
Approximates bias from
where :math: sigma_{noise} is the bias due to Doppler noise, N is the constant variance or spectral density, and f_{c} is the characteristic frequency.
The characteristic frequency is then found as
where f_{s}/2 is the Nyquist frequency.
Richard, Jean-Baptiste, et al. “Method for identification of Doppler noise levels in turbulent flow measurements dedicated to tidal energy.” International Journal of Marine Energy 3 (2013): 52-64.
Thiébaut, Maxime, et al. “Investigating the flow dynamics and turbulence at a tidal-stream energy site in a highly energetic estuary.” Renewable Energy 195 (2022): 252-262.
- check_turbulence_cascade_slope(psd, freq_range=[6.28, 12.57])[source]
This function calculates the slope of the PSD, the power spectra of velocity, within the given frequency range. The purpose of this function is to check that the region of the PSD containing the isotropic turbulence cascade decreases at a rate of \(f^{-5/3}\).
- Parameters:
psd (xarray.DataArray ([time,] freq)) – The power spectral density (1D or 2D)
freq_range (iterable(2) (default: [6.28, 12.57])) – The range over which the isotropic turbulence cascade occurs, in units of the psd frequency vector (Hz or rad/s)
- Returns:
(m, b) (tuple (slope, y-intercept)) – A tuple containing the coefficients of the log-adjusted linear regression between PSD and frequency
Notes
Calculates slope based on the standard formula for dissipation:
\[S(k) = \alpha \epsilon^{2/3} k^{-5/3} + N\]The slope of the isotropic turbulence cascade, which should be equal to \(k^{-5/3}\) or \(f^{-5/3}\), where k and f are the wavenumber and frequency vectors, is estimated using linear regression with a log transformation:
\[log10(y) = m*log10(x) + b\]Which is equivalent to
\[y = 10^{b} x^{m}\]Where \(y\) is S(k) or S(f), \(x\) is k or f, \(m\) is the slope (ideally -5/3), and \(10^{b}\) is the intercept of y at x^m=1.
- dissipation_rate_LT83(psd, U_mag, freq_range=[6.28, 12.57], noise=None)[source]
Calculate the dissipation rate from the PSD
- Parameters:
psd (xarray.DataArray (...,time,f)) – The power spectral density
U_mag (xarray.DataArray (...,time)) – The bin-averaged horizontal velocity [m/s] (from dataset shortcut)
freq_range (iterable(2)) – The range over which to integrate/average the spectrum, in units of the psd frequency vector (Hz or rad/s). Default = [6.28, 12.57] rad/s
noise (float or array-like) – Instrument noise level in same units as velocity. Typically found from adv.turbulence.calc_doppler_noise. Default: None.
- Returns:
epsilon (xarray.DataArray (…,n_time)) – dataArray of the dissipation rate
Notes
This uses the standard formula for dissipation:
\[S(k) = \alpha \epsilon^{2/3} k^{-5/3} + N\]where \(\alpha = 0.5\) (1.5 for all three velocity components), k is wavenumber, S(k) is the turbulent kinetic energy spectrum, and `N’ is the doppler noise level associated with the TKE spectrum.
With \(k \rightarrow \omega / U\), then – to preserve variance – \(S(k) = U S(\omega)\), and so this becomes:
\[S(\omega) = \alpha \epsilon^{2/3} \omega^{-5/3} U^{2/3} + N\]With \(k \rightarrow (2\pi f) / U\), then
\[S(\omega) = \alpha \epsilon^{2/3} f^{-5/3} (U/(2*\pi))^{2/3} + N\]LT83 : Lumley and Terray, “Kinematics of turbulence convected by a random wave field”. JPO, 1983, vol13, pp2000-2007.
- dissipation_rate_SF(vel_raw, U_mag, fs=None, freq_range=[2.0, 4.0])[source]
Calculate dissipation rate using the “structure function” (SF) method
- Parameters:
vel_raw (xarray.DataArray (time)) – The raw velocity data upon which to perform the SF technique.
U_mag (xarray.DataArray) – The bin-averaged horizontal velocity (from dataset shortcut)
fs (float) – The sample rate of vel_raw [Hz]
freq_range (iterable(2)) – The frequency range over which to compute the SF [Hz] (i.e. the frequency range within which the isotropic turbulence cascade falls). Default = [2., 4.] Hz
- Returns:
epsilon (xarray.DataArray) – dataArray of the dissipation rate
- dissipation_rate_TE01(dat_raw, dat_avg, freq_range=[6.28, 12.57])[source]
Calculate the dissipation rate according to TE01.
- Parameters:
dat_raw (xarray.Dataset) – The raw (off the instrument) adv dataset
dat_avg (xarray.Dataset) – The bin-averaged adv dataset (calc’d from ‘calc_turbulence’ or ‘do_avg’). The spectra (psd) and basic turbulence statistics (‘tke_vec’ and ‘stress_vec’) must already be computed.
freq_range (iterable(2)) – The range over which to integrate/average the spectrum, in units of the psd frequency vector (Hz or rad/s). Default = [6.28, 12.57] rad/s
Notes
TE01 : Trowbridge, J and Elgar, S, “Turbulence measurements in the Surf Zone”. JPO, 2001, vol31, pp2403-2417.
- integral_length_scales(a_cov, U_mag, fs=None)[source]
Calculate integral length scales.
- Parameters:
a_cov (xarray.DataArray) – The auto-covariance array (i.e. computed using autocovariance).
U_mag (xarray.DataArray) – The bin-averaged horizontal velocity (from dataset shortcut)
fs (numeric) – The raw sample rate
- Returns:
L_int (numpy.ndarray (…, n_time)) – The integral length scale (T_int*U_mag).
Notes
The integral time scale (T_int) is the lag-time at which the auto-covariance falls to 1/e.
If T_int is not reached, L_int will default to ‘0’.
- mhkit.dolfyn.adv.turbulence.turbulence_statistics(ds_raw, n_bin, fs, n_fft=None, freq_units='rad/s', window='hann')[source]
Functional version of ADVBinner that computes a suite of turbulence statistics for the input dataset, and returns a binned data object.
- Parameters:
ds_raw (xarray.Dataset) – The raw adv datset to bin, average and compute turbulence statistics of.
freq_units (string) – Frequency units of the returned spectra in either Hz or rad/s (f or \(\omega\)). Default is ‘rad/s’
window (string or array) – The window to use for calculating spectra.
- Returns:
ds (xarray.Dataset) – Returns an ‘binned’ (i.e. ‘averaged’) data object. All fields (variables) of the input data object are averaged in n_bin chunks. This object also computes the following items over those chunks:
tke_vec : The energy in each component, each components is alternatively accessible as:
upup_
,vpvp_
,wpwp_
)stress_vec : The Reynolds stresses, each component is alternatively accessible as:
upwp_
,vpwp_
,upvp_
)U_std : The standard deviation of the horizontal velocity U_mag.
psd : DataArray containing the spectra of the velocity in radial frequency units. The data-array contains: - vel : the velocity spectra array (m^2/s/rad)) - omega : the radial frequncy (rad/s)
Tools
Spectral analysis and miscellaneous DOLfYN functions are stored here. These functions are used throughout DOLfYN’s core code and may also be helpful to users in general.
FFT-based Functions:
Compute the frequency vector for a given nfft and fs. |
|
Compute the power spectral density (PSD). |
|
Compute the cross power spectral density (CPSD) of the signals a and b. |
|
Compute the cross power spectral density (CPSD) of the signals a and b. |
Other Functions:
Remove a linear trend from arr. |
|
Find continuous segments in a boolean array. |
|
Return an iterator object for looping over 1-D slices, along |
|
Converts between the 'cartesian angle' (counter-clockwise from East) and the 'polar angle' in (degrees clockwise from North) |
|
Linearly fill NaN value in an array. |
|
Fill gaps (NaN values) in |
|
Do a running median filter of the data. |
- mhkit.dolfyn.tools.fft.fft_frequency(nfft, fs, full=False)[source]
Compute the frequency vector for a given nfft and fs.
- Parameters:
fs (float) – The sampling frequency (e.g. samples/sec)
nfft (int) – The number of samples in a window.
full (bool) – Whether to return half frequencies (positive), or the full frequencies. Default = False
- Returns:
freq (numpy.ndarray) – The frequency vector, in same units as ‘fs’
- mhkit.dolfyn.tools.fft.cpsd_quasisync_1D(a, b, nfft, fs, window='hann')[source]
Compute the cross power spectral density (CPSD) of the signals a and b.
- Parameters:
a (numpy.ndarray) – The first signal.
b (numpy.ndarray) – The second signal.
nfft (int) – The number of points in the fft.
fs (float) – The sample rate (e.g. sample/second).
window ({None, 1, 'hann', numpy.ndarray}) – The window to use (default: ‘hann’). Valid entries are: - None,1 : uses a ‘boxcar’ or ones window. - ‘hann’ : hanning window. - a length(nfft) array : use this as the window directly.
- Returns:
cpsd (numpy.ndarray) – The cross-spectral density of a and b.
See also
psd()
,coherence_1D()
,cpsd_1D()
,numpy.fft
Notes
a and b do not need to be ‘tightly’ synchronized, and can even be different lengths, but the first- and last-index of both series should be synchronized (to whatever degree you want unbiased phases).
This performs:
\[fft(a)*conj(fft(b))\]Note that this is consistent with
numpy.correlate()
.It detrends the data and uses a minimum of 50% overlap for the shorter of a and b. For the longer, the overlap depends on the difference in size. 1-(l_short/l_long) data will be underutilized (where l_short and l_long are the length of the shorter and longer series, respectively).
The units of the spectra is the product of the units of a and b, divided by the units of fs.
- mhkit.dolfyn.tools.fft.cpsd_1D(a, b, nfft, fs, window='hann', step=None)[source]
Compute the cross power spectral density (CPSD) of the signals a and b.
- Parameters:
a (numpy.ndarray) – The first signal.
b (numpy.ndarray) – The second signal.
nfft (int) – The number of points in the fft.
fs (float) – The sample rate (e.g. sample/second).
window ({None, 1, 'hann', numpy.ndarray}) – The window to use (default: ‘hann’). Valid entries are: - None,1 : uses a ‘boxcar’ or ones window. - ‘hann’ : hanning window. - a length(nfft) array : use this as the window directly.
step (int) – Use this to specify the overlap. For example: - step : nfft/2 specifies a 50% overlap. - step : nfft specifies no overlap. - step=2*nfft means that half the data will be skipped. By default, step is calculated to maximize data use, have at least 50% overlap and minimize the number of ensembles.
- Returns:
cpsd (numpy.ndarray) – The cross-spectral density of a and b.
See also
psd()
,coherence_1D()
,None
Notes
cpsd removes a linear trend from the signals.
The two signals should be the same length, and should both be real.
This performs:
\[fft(a)*conj(fft(b))\]This implementation is consistent with the numpy.correlate definition of correlation. (The conjugate of D.B. Chelton’s definition of correlation.)
The units of the spectra is the product of the units of a and b, divided by the units of fs.
- mhkit.dolfyn.tools.fft.psd_1D(a, nfft, fs, window='hann', step=None)[source]
Compute the power spectral density (PSD).
This function computes the one-dimensional n-point PSD.
The units of the spectra is the product of the units of a and b, divided by the units of fs.
- Parameters:
a (numpy.ndarray) – The first signal, as a 1D vector
nfft (int) – The number of points in the fft.
fs (float) – The sample rate (e.g. sample/second).
window ({None, 1, 'hann', numpy.ndarray}) – The window to use (default: ‘hann’). Valid entries are: - None,1 : uses a ‘boxcar’ or ones window. - ‘hann’ : hanning window. - a length(nfft) array : use this as the window directly.
step (int) – Use this to specify the overlap. For example: - step : nfft/2 specifies a 50% overlap. - step : nfft specifies no overlap. - step=2*nfft means that half the data will be skipped. By default, step is calculated to maximize data use, have at least 50% overlap and minimize the number of ensembles.
- Returns:
cpsd (numpy.ndarray) – The cross-spectral density of a and b.
Notes
Credit: This function’s line of code was copied from JN’s fast_psd.m routine.
See also
cpsd_1D()
,coherence_1D()
,None
- mhkit.dolfyn.tools.misc.detrend_array(arr, axis=-1, in_place=False)[source]
Remove a linear trend from arr.
- Parameters:
arr (array_like) – The array from which to remove a linear trend.
axis (int) – The axis along which to operate.
Notes
This method is copied from the matplotlib.mlab library, but implements the covariance calcs explicitly for added speed.
This works much faster than mpl.mlab.detrend for multi-dimensional arrays, and is also faster than linalg.lstsq methods.
- mhkit.dolfyn.tools.misc.group(bl, min_length=0)[source]
Find continuous segments in a boolean array.
- Parameters:
bl (numpy.ndarray (dtype='bool')) – The input boolean array.
min_length (int (optional)) – Specifies the minimum number of continuous points to consider a group (i.e. that will be returned).
- Returns:
out (np.ndarray(slices,)) – a vector of slice objects, which indicate the continuous sections where bl is True.
Notes
This function has funny behavior for single points. It will return the same two indices for the beginning and end.
- mhkit.dolfyn.tools.misc.slice1d_along_axis(arr_shape, axis=0)[source]
Return an iterator object for looping over 1-D slices, along
axis
, of an array of shape arr_shape.- Parameters:
arr_shape (tuple,list) – Shape of the array over which the slices will be made.
axis (integer) – Axis along which arr is sliced.
- Returns:
Iterator (object) – The iterator object returns slice objects which slices arrays of shape arr_shape into 1-D arrays.
Examples
>> out=np.empty(replace(arr.shape,0,1)) >> for slc in slice1d_along_axis(arr.shape,axis=0): >> out[slc]=my_1d_function(arr[slc])
- mhkit.dolfyn.tools.misc.convert_degrees(deg, tidal_mode=True)[source]
Converts between the ‘cartesian angle’ (counter-clockwise from East) and the ‘polar angle’ in (degrees clockwise from North)
- Parameters:
deg (float or array-like) – Number or array in ‘degrees CCW from East’ or ‘degrees CW from North’
tidal_mode (bool) – If true, range is set from 0 to +/-180 degrees. If false, range is 0 to 360 degrees. Default = True
- Returns:
out (float or array-like) – Input data transformed to ‘degrees CW from North’ or ‘degrees CCW from East’, respectively (based on deg)
Notes
The same algorithm is used to convert back and forth between ‘CCW from E’ and ‘CW from N’
- mhkit.dolfyn.tools.misc.fillgaps(a, maxgap=inf, dim=0, extrapFlg=False)[source]
Linearly fill NaN value in an array.
- Parameters:
a (numpy.ndarray) – The array to be filled.
maxgap (numpy.ndarray (optional: inf)) – The maximum gap to fill.
dim (int (optional: 0)) – The dimension to operate along.
extrapFlg (bool (optional: False)) – Whether to extrapolate if NaNs are found at the ends of the array.
See also
mhkit.dolfyn.tools.misc._interpgaps
Linearly interpolates in time.
Notes
This function interpolates assuming spacing/timestep between successive points is constant. If the spacing is not constant, use _interpgaps.
- mhkit.dolfyn.tools.misc.interpgaps(a, t, maxgap=inf, dim=0, extrapFlg=False)[source]
Fill gaps (NaN values) in
a
by linear interpolation along dimensiondim
with the point spacing specified int
.- Parameters:
a (numpy.ndarray) – The array containing NaN values to be filled.
t (numpy.ndarray (len(t) == a.shape[dim])) – Independent variable of the points in
a
, e.g. timestepmaxgap (numpy.ndarray (optional: inf)) – The maximum gap to fill.
dim (int (optional: 0)) – The dimension to operate along.
extrapFlg (bool (optional: False)) – Whether to extrapolate if NaNs are found at the ends of the array.
See also
mhkit.dolfyn.tools.misc.fillgaps
Linearly interpolates in array-index space.
- mhkit.dolfyn.tools.misc.medfiltnan(a, kernel, thresh=0)[source]
Do a running median filter of the data. Regions where more than
thresh
fraction of the points are NaN are set to NaN.- Parameters:
a (numpy.ndarray) – 2D array containing data to be filtered.
kernel_size (numpy.ndarray or list, optional) – A scalar or a list of length 2, giving the size of the median filter window in each dimension. Elements of kernel_size should be odd. If kernel_size is a scalar, then this scalar is used as the size in each dimension.
thresh (int) – Maximum gap in a to filter over
- Returns:
out (numpy.ndarray) – 2D array of same size containing filtered data
See also
scipy.signal.medfilt2d
Time
The time submodule contains functions to modify the format of the stored time between a variety of time formats.
Convert from epoch time (seconds since 1/1/1970 00:00:00) to numpy.datetime64 array |
|
Convert numpy.datetime64 array to epoch time (seconds since 1/1/1970 00:00:00) |
|
Convert numpy.datetime64 array to list of datetime objects |
|
Convert numpy.datetime64 array to list of datetime objects |
|
Convert from epoch time (seconds since 1/1/1970 00:00:00) to a list of datetime objects |
|
Convert list of datetime objects to legible strings |
|
Convert list of datetime objects to epoch time |
|
Convert list of datetime objects to MATLAB datenum |
|
Convert MATLAB datenum to list of datetime objects |
- mhkit.dolfyn.time.epoch2dt64(ep_time)[source]
Convert from epoch time (seconds since 1/1/1970 00:00:00) to numpy.datetime64 array
- Parameters:
ep_time (xarray.DataArray) – Time coordinate data-array or single time element
- Returns:
time (numpy.datetime64) – The converted datetime64 array
- mhkit.dolfyn.time.dt642epoch(dt64)[source]
Convert numpy.datetime64 array to epoch time (seconds since 1/1/1970 00:00:00)
- Parameters:
dt64 (numpy.datetime64) – Single or array of datetime64 object(s)
- Returns:
time (float) – Epoch time (seconds since 1/1/1970 00:00:00)
- mhkit.dolfyn.time.date2dt64(dt)[source]
Convert numpy.datetime64 array to list of datetime objects
- Parameters:
time (datetime.datetime) – The converted datetime object
- Returns:
dt64 (numpy.datetime64) – Single or array of datetime64 object(s)
- mhkit.dolfyn.time.dt642date(dt64)[source]
Convert numpy.datetime64 array to list of datetime objects
- Parameters:
dt64 (numpy.datetime64) – Single or array of datetime64 object(s)
- Returns:
time (datetime.datetime) – The converted datetime object
- mhkit.dolfyn.time.epoch2date(ep_time, offset_hr=0, to_str=False)[source]
Convert from epoch time (seconds since 1/1/1970 00:00:00) to a list of datetime objects
- Parameters:
ep_time (xarray.DataArray) – Time coordinate data-array or single time element
offset_hr (int) – Number of hours to offset time by (e.g. UTC -7 hours = PDT)
to_str (logical) – Converts datetime object to a readable string
- Returns:
time (datetime.datetime) – The converted datetime object or list(strings)
Notes
The specific time instance is set during deployment, usually sync’d to the deployment computer. The time seen by DOLfYN is in the timezone of the deployment computer, which is unknown to DOLfYN.
- mhkit.dolfyn.time.date2str(dt, format_str=None)[source]
Convert list of datetime objects to legible strings
- Parameters:
dt (datetime.datetime) – Single or list of datetime object(s)
format_str (string) – Timestamp string formatting. Default is ‘%Y-%m-%d %H:%M:%S.%f’ See datetime.strftime documentation for timestamp string formatting.
- Returns:
time (string) – Converted timestamps
- mhkit.dolfyn.time.date2epoch(dt)[source]
Convert list of datetime objects to epoch time
- Parameters:
dt (datetime.datetime) – Single or list of datetime object(s)
- Returns:
time (float) – Datetime converted to epoch time (seconds since 1/1/1970 00:00:00)