Acoustics Module #349

jmcvey3 · 2024-08-29T18:31:50Z

Functions and example notebook for reading passive acoustics data from hydrophones.

# MHKiT v0.8.1 MHKiT v0.8.1, includes bug fixes in the example notebooks and fixes the dependencies to requirements updates prior to Numpy 2.0.0. Fixes MHKIT v0.8.0 installation issues (MHKiT-Software#334) by fixing the dependencies. - MHKiT-Software#335 Fixes bugs in MHKiT example notebooks - MHKiT-Software#327

…-Software#326) * tke updates * Fix shear velocity functions * More detail for tke shear production * Don't rotate heading beyond 360 degrees * Fix some typos in notebook * Rename deprecated function

…#340) * Test: Determine method using input frequency index * Feat: Use sum of sines if ifft is not computable This change allows `surface_elevation` to return a result if the user inputs a spectrum with a frequency index that does not have a zero frequency. If the non zero frequency index condition is found when the method is `ifft` we warn the user and change the method to `sum_of_sines` * Fix: Use previously found frequency index S.index may not exist for some input datasets, but f[0] does and we should use the value of f[0] here. * Test: Warn when using ifft with a non zero frequency * Lint

This PR adds a Github action to test the example notebooks as part of or CD pipeline. Additionally a timeout is added to which notebooks will fail if they exceed the given time.

Fixes MHKiT-Software#223

This PR addresses MHKiT-Software#315 by: - Changes the data called by the Wind Toolkit tests so that they run faster - Updates the .csv files that the tests compare against - Updates a few descriptions in the metocean example, fixes a sorting issue, reduces the data downloaded there - Tests in hindcast match the notebooks and use the same cache.

Bug fix by removing matplotlib version check needed for < 3.8.0

*MHKiT v0.8.2

jmcvey3 · 2024-09-17T18:19:30Z

What would you like to do about the "R" error codes? The number of variables and lines used per function are arbitrary at best.

I am going to try to address the too many locals by using a dictionary.

I am still thinking about the too many arguments. What do you think about keeping the current way the function works but adding the ability for the user to pass a dictionary e.g. {'quantile': 0.25} or {'min': None} ect.?

I have not looked into the too many statements yet.

A dictionary could work in some cases, and so could updating a variable name. I think it becomes a case of "when am I making the code harder to read or more complicated than it needs to be for the sake of an arbitrary test?". Solving "too many variables" will probably break "too many statements" and/or vice versa.

A better coded way of solving "too many arguments" is to use *args and **kwargs, but these concepts may be difficult for new users to understand.

ssolson · 2024-09-17T18:36:29Z

What would you like to do about the "R" error codes? The number of variables and lines used per function are arbitrary at best.

I am going to try to address the too many locals by using a dictionary.
I am still thinking about the too many arguments. What do you think about keeping the current way the function works but adding the ability for the user to pass a dictionary e.g. {'quantile': 0.25} or {'min': None} ect.?
I have not looked into the too many statements yet.

A dictionary could work in some cases, and so could updating a variable name. I think it becomes a case of "when am I making the code harder to read or more complicated than it needs to be for the sake of an arbitrary test?". Solving "too many variables" will probably break "too many statements" and/or vice versa.

A better coded way of solving "too many arguments" is to use *args and **kwargs, but these concepts may be difficult for new users to understand.

I resolved the issues without ignores in a new PR on your branch. I need to write tests still for some of the functions I broke out.

Dictionaries make the code harder to read imo but should also force you to group like variables.

In my experience trying to follow pylint is annoying, and takes longer but ultimately makes the code more modular and easier to work with long term. We're fully the realm of preferences with the "Refactor" codes so there is no right or wrong. My goal is to not use inline ignores unless there is no other way. Then with global ignores I think it is hard to justify that there should not be some limit on too many locals and too many arguments. So at that point it just boils down to what the limit should be and again there is no right answer.

ssolson

@jmcvey3 here are a couple of preliminary thoughts I had while trying to fix the pylint issues. I plan to give another pass at this after we get jmcvey3#6 merged. I attempted to fix some but not all of these in my PR against your branch.

ssolson · 2024-09-17T13:06:59Z

mhkit/acoustics/io.py

+        f.read(4)  # wave_key
+        list_key = f.read(4)
+
+        # Skip metadata if it exits (don't know how to parse)


I think this "don't know how to parse" should be removed here and added to the docstring... I don't know what LIST metadata is but something like:

Read .wav file from a hydrophone. Returns voltage timeseries if sensitivity not provided, returns pressure timeseries if it is provided. If present "LIST" metadata is not parsed.

Then update this comment to just say # Skip metadata if it exits

ssolson · 2024-09-17T13:10:27Z

mhkit/acoustics/io.py

+    peak_voltage=None,
+    sensitivity=None,
+    gain=0,
+    start_time="2024-06-06T00:00:00",


What is the significance of this start time? Should it be None or epoch time zero?

The date is random; it shows the correct time format to input cause people don't always look at docstrings

ssolson · 2024-09-17T15:22:30Z

mhkit/acoustics/base.py

+def band_average(
+    spsdl, octave=3, fmin=10, fmax=100000, method="median", method_arg=None
+):


Is this function poorly named?

Its called band average but defaults to median... but it can also process min , max, quantile ect.

Ah yes, yes it is then. I'll have to think about that one

ssolson · 2024-09-17T15:42:39Z

mhkit/tests/acoustics/test_analysis.py

Should this be called test_base or should base be called analysis?

ssolson · 2024-09-17T16:01:22Z

examples/data/acoustics/RBW_6661_20240601_053114.wav

remote: warning: File examples/data/acoustics/RBW_6661_20240601_053114.wav is 87.89 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB

Any way we could get this a little smaller and maintain a good example?

ssolson · 2024-09-17T16:02:00Z

examples/data/acoustics/6247.230204150508.wav

remote: warning: File examples/data/acoustics/6247.230204150508.wav is 54.93 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB

Any way we could get this a little smaller and maintain a good example?

ssolson · 2024-09-18T14:40:06Z

mhkit/acoustics/io.py

+        list_key = f.read(4)
+
+        # Skip metadata if it exits (don't know how to parse)
+        if "LIST" in list_key.decode():


Do the files we are testing have have "LIST" in them?

LIST is an optional header keyword in a .wav file, if I remember right. I've only used Ocean Sonics and Sountrap hydrophones, and of those two just the former has a LIST keyword.

ssolson · 2024-09-18T14:53:03Z

mhkit/acoustics/io.py

+
+    # Read data using scipy cause that's easier (will auto drop as int16 or int32)
+    fs, raw = wavfile.read(filename)
+    length = raw.shape[0] // fs  # length of recording in seconds


We should check that the number of samples is larger than the sampling frequency otherwise this integer division will return 0 and throw unclear errors because the user's sample time is less than 1s.

Well, it shouldn't matter the total time segment is less than 1 second

ssolson · 2024-09-19T13:59:40Z

mhkit/acoustics/base.py

+
+    # Use dolfyn PSD
+    binner = VelBinner(n_bin=win, fs=fs, n_fft=win)
+    # Always 50% overlap if numbers reshape perfectly


Not sure about this comment. I had to look this up but here is my understanding:

Overlapping refers to the technique where consecutive windows share some portion of their data.

So if I create a 10s dataset with 100 samples with no overlap this should return a len of 10.

If I use the same dataset with a 50% overlap I would expect this to return 19 values, however, only 10 samples are being return indicating to me there is no overlap.

Am I misunderstanding the use of overlapping here or missing something?

# Create some sample pressure data time = np.arange(0, 10, 0.1) data = np.sin(time) pressure = xr.DataArray(data, coords=[time], dims=["time"], attrs={"units": "Pa", "fs": 100}) fs = len(time) window = 0.1 # seconds win_samples = int(window * fs) # Calculate expected number of segments overlap = 0.5 step = int(win_samples * (1 - overlap)) expected_segments = (len(pressure) - win_samples) // step + 1

Yes, you'd have 19 segments, from which you're calculating periodograms for each. Segments and windows are different concepts.
I'm not an expert in fast fourier transforms, but here is the original document for Welch's method for power spectra, which is the mathematical explanation for that dolfyn function: https://ieeexplore.ieee.org/abstract/document/1161901

Okay if it should be 19 then I don't think the windowing in the function is working correctly.

Can you review this test?

def test_sound_pressure_spectral_density(self): """ Test sound pressure spectral density calculation. """ # Create some sample pressure data time = np.arange(0, 10, 0.1) data = np.sin(time) pressure = xr.DataArray(data, coords=[time], dims=["time"], attrs={"units": "Pa", "fs": 100}) # Adjust window size to get multiple segments fs = 100 window = 0.1 # seconds win_samples = int(window * fs) # Run the spectral density function spsd = acoustics.sound_pressure_spectral_density(pressure, fs=fs, window=window) # Assert that output is an xarray DataArray with expected dimensions self.assertIsInstance(spsd, xr.DataArray) self.assertIn("freq", spsd.dims) self.assertIn("time", spsd.dims) self.assertEqual(spsd.attrs["units"], "Pa^2/Hz") self.assertEqual(spsd.attrs["window"], f"{window}s") # Calculate expected number of segments overlap = 0.5 step = int(win_samples * (1 - overlap)) expected_segments = (len(pressure) - win_samples) // step + 1 self.assertEqual(spsd.shape[0], expected_segments)

======================================================================
FAIL: test_sound_pressure_spectral_density (main.TestAnalysis.test_sound_pressure_spectral_density)
Test sound pressure spectral density calculation.

Traceback (most recent call last):
File "C:\Users\sterl\Codes\MHKiT-Python\mhkit\tests\acoustics\test_analysis.py", line 196, in > test_sound_pressure_spectral_density
self.assertEqual(spsd.shape[0], expected_segments)
AssertionError: 10 != 19

No, segments in this case are not "windows". It also looks like I mispoke it seems. It's been a long time since I've dug into it, but "windowing" is the term used to describe smoothing functions (or filters) of a specified length that are applied atop of the signal. The 50% part comes in based on the amount of data the window, i.e. smoothing function, covers.

I'm digging into textbooks at this point to try and explain it to you.

For these functions the word "window" is equivalent to "bin size" or "ensemble", or the number of points to discretize.

I replaced all the instances of "window" with "bin" and updated the docstring to try and avoid confusion

Acoustics module

jmcvey3 · 2024-09-19T17:24:05Z

@ssolson Git wouldn't let me push to your branch, so I merged your PR and then committed my changes. A couple "dry" changes, and some other edits that are relatively minor but better for readability.

…ython into acoustics_module

ssolson · 2024-09-26T13:39:16Z

@jmcvey3 could you take a look at my comments above?

into acoustics_module

…ython into acoustics_module

jmcvey3 and others added 30 commits May 30, 2024 18:30

Initial commit

a319b33

Write up docstrings and comments

8a8664d

Setup and read iclisten hydrophone wav file

61ff635

Spectral analysis code

c648c24

Subtle changes

091bc64

Read Soundtrap and any other .wav file

4fb4bcf

Add first plots

865429e

Updates

43a70e7

Audio export, fractional octaves, plots

3c941e4

Fix SPL calculation, add docstrings, upload test data

7663273

black formatting

5f89bd6

Fix third octave spl integral

5ca2dc5

Improve time binning

70ddf6d

Updates backup

9bb3c41

Add and edit functions for correcting via a calibrated sensitivity curve

4f13ccb

Black formatting

6e869f1

Add function for decidecade SPL

4ecaa66

Improve Reynolds stress ADCP estimation notebook discussion V2 (MHKiT…

4cdbe03

…-Software#326) * tke updates * Fix shear velocity functions * More detail for tke shear production * Don't rotate heading beyond 360 degrees * Fix some typos in notebook * Rename deprecated function

ipython notebook test (MHKiT-Software#330)

ecbdfae

This PR adds a Github action to test the example notebooks as part of or CD pipeline. Additionally a timeout is added to which notebooks will fail if they exceed the given time.

Update NOAA Request function (MHKiT-Software#332)

f843cd1

Fixes MHKiT-Software#223

matplotlib >=3.8 (MHKiT-Software#345)

b6f47c7

Bug fix by removing matplotlib version check needed for < 3.8.0

MHKiT version increase for new release (MHKiT-Software#346)

948c6b9

*MHKiT v0.8.2

Add decidecade levels

1fd72f0

Docstring fixes, notebook text

d330162

Merge branch 'MHKiT-Software:master' into acoustics_module

dffc74c

Update acoustics example notebook

216c1b0

Add testing

3450e10

adjust quatile call to dict

764433c

ssolson added 7 commits September 18, 2024 11:03

add tests for helper functions

38489b4

test for iclisten metadata

fd5148f

type hints & checks

48318ba

type and value hints

c45a604

pylint

b933690

add tests for helper functions

5417cfe

modifiy to avoid inplace overwrite

2fc8527

ssolson reviewed Sep 19, 2024

View reviewed changes

jmcvey3 added 2 commits September 19, 2024 10:19

Merge pull request #6 from ssolson/acoustics_module

57a1922

Acoustics module

Cleanup and minor edits

f899f76

jmcvey3 and others added 8 commits September 19, 2024 10:27

less dry but fine

d560041

handle non freq dim

e7e7a02

water_depth can be array

09d08fb

pass float not array

b4b9ac2

Merge branch 'acoustics_module' of https://github.com/jmcvey3/MHKiT-P…

a186779

…ython into acoustics_module

fix errors from acoustics module updates

db0edab

pylint

5ac20ad

Ignore too many POSITIONAL args in addition to too many args

68dc32a

jmcvey3 and others added 7 commits October 1, 2024 10:43

Minor changes

8fe42de

Use 'bin' instead of 'window' to avoid confusion

0de1b6a

Remove redundant type checks

378ea29

Reduce complexity and don't hardcode plotting units

22512f1

Merge branch 'develop' of https://github.com/MHKiT-Software/MHKiT-Python

4f0c18b

into acoustics_module

Reduce data size by using float32 on output, f-string fix

5a1b8f3

Merge branch 'acoustics_module' of https://github.com/jmcvey3/MHKiT-P…

cc6bb2b

…ython into acoustics_module

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Acoustics Module #349

Acoustics Module #349

jmcvey3 commented Aug 29, 2024

jmcvey3 commented Sep 17, 2024

ssolson commented Sep 17, 2024 •

edited

Loading

ssolson left a comment •

edited

Loading

ssolson Sep 17, 2024

ssolson Sep 17, 2024

jmcvey3 Oct 1, 2024

ssolson Sep 17, 2024

jmcvey3 Oct 1, 2024

ssolson Sep 17, 2024

ssolson Sep 17, 2024

ssolson Sep 17, 2024

ssolson Sep 18, 2024

jmcvey3 Oct 1, 2024

ssolson Sep 18, 2024

jmcvey3 Oct 1, 2024 •

edited

Loading

ssolson Sep 19, 2024

jmcvey3 Oct 1, 2024

ssolson Oct 6, 2024 •

edited

Loading

jmcvey3 Oct 8, 2024 •

edited

Loading

jmcvey3 Oct 9, 2024

jmcvey3 commented Sep 19, 2024

ssolson commented Sep 26, 2024

Acoustics Module #349

Are you sure you want to change the base?

Acoustics Module #349

Conversation

jmcvey3 commented Aug 29, 2024

jmcvey3 commented Sep 17, 2024

ssolson commented Sep 17, 2024 • edited Loading

ssolson left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmcvey3 Oct 1, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ssolson Oct 6, 2024 • edited Loading

Choose a reason for hiding this comment

====================================================================== FAIL: test_sound_pressure_spectral_density (main.TestAnalysis.test_sound_pressure_spectral_density) Test sound pressure spectral density calculation.

jmcvey3 Oct 8, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmcvey3 commented Sep 19, 2024

ssolson commented Sep 26, 2024

ssolson commented Sep 17, 2024 •

edited

Loading

ssolson left a comment •

edited

Loading

jmcvey3 Oct 1, 2024 •

edited

Loading

ssolson Oct 6, 2024 •

edited

Loading

======================================================================
FAIL: test_sound_pressure_spectral_density (main.TestAnalysis.test_sound_pressure_spectral_density)
Test sound pressure spectral density calculation.

jmcvey3 Oct 8, 2024 •

edited

Loading