Proof of concept of "buffer_description_api" and xarray reference API bridge #1513

samuelgarcia · 2024-07-26T10:41:08Z

Following an idea from @bendichter and this gist

Here is a very early-stage proof of concept of the "buffer_description_api" for analogsignal chunks.

The idea/goal is to:

have a simple buffer description for analogsignal in the binary or hdf5 cases (maybe more)
make this description exportable to JSON
make this description easy to transform into the xarray/zarr external reference API
make some files cloud-ready for streaming chunks of traces

See also kerchunk
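
To make the goal above concrete, here is a minimal sketch of how a raw buffer description could be mapped onto a kerchunk-style reference dict. It is not the code of this PR: the `shape` and `file_offset` keys, the `traces` array name, the file name and the `to_reference_dict` helper are assumptions made for illustration. The layout follows the version 1 reference format as I understand it (zarr v2 metadata plus `[path, offset, length]` chunk entries), for the simplest case of one uncompressed, C-ordered chunk.

```python
import json

import numpy as np

# Hypothetical raw buffer description. Only "type", "file_path" and "dtype"
# appear in this PR's diff; "shape" and "file_offset" are assumed keys.
buffer_description = {
    "type": "raw",
    "file_path": "recording.dat",
    "dtype": "int16",
    "shape": [30000, 4],  # (num_samples, num_channels)
    "file_offset": 1024,  # size of the file header in bytes
}


def to_reference_dict(desc, array_name="traces"):
    """Describe one C-ordered raw buffer as a single uncompressed zarr v2 chunk."""
    dtype = np.dtype(desc["dtype"])
    num_samples, num_channels = desc["shape"]
    nbytes = num_samples * num_channels * dtype.itemsize
    zarray = {
        "zarr_format": 2,
        "shape": [num_samples, num_channels],
        "chunks": [num_samples, num_channels],  # the whole buffer is one chunk
        "dtype": dtype.str,  # typestring with explicit byte order, e.g. "<i2"
        "compressor": None,
        "filters": None,
        "fill_value": None,
        "order": "C",
    }
    refs = {
        ".zgroup": json.dumps({"zarr_format": 2}),
        f"{array_name}/.zarray": json.dumps(zarray),
        # chunk key "0.0" points at a byte range of the original file
        f"{array_name}/0.0": [desc["file_path"], desc["file_offset"], nbytes],
    }
    return {"version": 1, "refs": refs}


rfs = to_reference_dict(buffer_description)
print(json.dumps(rfs, indent=2))
```

Splitting the buffer into several chunks along the time axis would only change `chunks` and add one `[path, offset, length]` entry per chunk key.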


bendichter · 2024-07-26T11:38:10Z

neo/rawio/xarray_utils.py

+    rfs = dict()
+    rfs["version"] = 1
+    rfs["refs"] = dict()
+    rfs["refs"][".zgroup"] = json.dumps(dict(zarr_format=2))


You don't need the dumps command anymore

neo/rawio/xarray_utils.py

bendichter · 2024-07-26T11:41:44Z

Looks good!


doc/source/rawio.rst

zm711

Looks pretty good to me.

doc/source/rawio.rst


h-mayorquin

I did a first reading. I think that this PR and the future documentation would benefit greatly if we had the schema of the buffer description somewhere. It does not have to be a formal schema (although that would be great); it could be just a description in the documentation or a Python dataclass with types. Something that I can reference to see what I am expected to fill in when I write a buffer description like this.
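
As a starting point for that discussion, a typed description could look roughly like the sketch below. Only the `type`, `file_path` and `dtype` fields are taken from the diff in this PR; `shape`, `file_offset` and `order`, as well as the class name `RawBufferDescription`, are assumptions and may not match what the PR ends up using.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RawBufferDescription:
    """Hypothetical schema for a "raw" (flat binary) analogsignal buffer."""

    file_path: str  # path of the binary file, stored as str so it is JSON-safe
    dtype: str  # numpy dtype as a string, e.g. "int16"
    shape: tuple[int, int]  # assumed field: (num_samples, num_channels)
    file_offset: int = 0  # assumed field: header size in bytes
    order: str = "C"  # assumed field: memory layout of the 2D buffer
    type: str = "raw"

    def to_json(self) -> str:
        # asdict() yields plain Python containers, so json.dumps works directly
        return json.dumps(asdict(self))


desc = RawBufferDescription(file_path="recording.dat", dtype="int16", shape=(30000, 4))
print(desc.to_json())
```

An hdf5 variant would need its own class, for example with a dataset path inside the file instead of `file_offset`.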

h-mayorquin · 2024-10-16T16:33:47Z

doc/source/rawio.rst

+For reading analog signals **neo.rawio** has 2 important concepts:
+
+ 1. The **signal_stream** : it is a group of channels that can be read together using :func:`get_analog_signal_chunk()`.
+    This group of channels is guaranteed to have the same sampling rate, and the same duration per segment.


Now that we have logical channels, this should guarantee the same units as well, right?

Not yet. At the moment the API neither enforces nor guarantees this.
See https://github.com/NeuralEnsemble/python-neo/blob/master/neo/rawio/baserawio.py#L118
Ideally units should be added, but some IOs may be mixing units within the same stream.

Can we add it as an ideal? Or would you rather make the changes first and then update it here?

I can close this for example:

#1133

doc/source/rawio.rst

h-mayorquin · 2024-10-16T16:36:02Z

neo/rawio/axonrawio.py

@@ -115,7 +116,7 @@ def _parse_header(self):
            head_offset = info["sections"]["DataSection"]["uBlockIndex"] * BLOCKSIZE
            totalsize = info["sections"]["DataSection"]["llNumEntries"]

-        self._raw_data = np.memmap(self.filename, dtype=sig_dtype, mode="r", shape=(totalsize,), offset=head_offset)
+        # self._raw_data = np.memmap(self.filename, dtype=sig_dtype, mode="r", shape=(totalsize,), offset=head_offset)


Should this be commented out?

You mean removed?
Done.

Thanks, yeah, we can get stuff from the diff.

h-mayorquin · 2024-10-16T16:38:03Z

neo/rawio/baserawio.py

@@ -149,6 +151,9 @@ class BaseRawIO:

    rawmode = None  # one key from possible_raw_modes

+    # If true then 


This comment is not clear: if what is True?

h-mayorquin · 2024-10-16T16:39:18Z

neo/rawio/baserawio.py

@@ -1284,6 +1298,7 @@ def _get_signal_size(self, block_index: int, seg_index: int, stream_index: int):

        All channels indexed must have the same size and t_start.
        """
+        # must NOT be implemented if has_buffer_description_api=True


Will it be ignored?

No, this is done in the BaseRawWithBufferApiIO subclass.
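
To illustrate the pattern being described, here is a minimal sketch in which an intermediate class derives the signal size from the buffer description, so a concrete reader only has to provide the description. All class and method names are hypothetical stand-ins rather than neo's actual classes, and the one-stream-per-buffer mapping and the `shape` key are simplifying assumptions.

```python
class BaseRawIOSketch:
    """Stand-in for a reader base class (hypothetical, not neo's BaseRawIO)."""

    def get_signal_size(self, block_index, seg_index, stream_index):
        return self._get_signal_size(block_index, seg_index, stream_index)

    def _get_signal_size(self, block_index, seg_index, stream_index):
        raise NotImplementedError


class BufferApiSketch(BaseRawIOSketch):
    """Implements the generic method once, on top of buffer descriptions."""

    def _get_buffer_description(self, block_index, seg_index, buffer_id):
        raise NotImplementedError

    def _get_signal_size(self, block_index, seg_index, stream_index):
        # Assumption for this sketch: each stream maps to exactly one buffer.
        buffer_id = str(stream_index)
        desc = self._get_buffer_description(block_index, seg_index, buffer_id)
        return desc["shape"][0]  # number of samples


class MyFormatSketch(BufferApiSketch):
    """A concrete reader only fills in the descriptions, not _get_signal_size."""

    def __init__(self):
        self._buffer_descriptions = {
            0: {0: {"0": {"type": "raw", "file_path": "recording.dat",
                          "dtype": "int16", "shape": (30000, 4)}}}
        }

    def _get_buffer_description(self, block_index, seg_index, buffer_id):
        return self._buffer_descriptions[block_index][seg_index][buffer_id]


reader = MyFormatSketch()
signal_size = reader.get_signal_size(0, 0, 0)
print(signal_size)
```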

h-mayorquin · 2024-10-16T16:45:08Z

neo/rawio/utils.py

+
+def get_memmap_chunk_from_opened_file(fid, num_channels,  start, stop, dtype, file_offset=0):
+    """
+    Utility fonction to get a chunk as a memmap array directly from an opened file.


Typo: "fonction" should be "function".
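
For readers who have not seen this kind of helper, the sketch below shows one way to map only the requested samples of an already opened binary file. It reuses the argument list from the diff but is not the PR's implementation: the name `memmap_chunk_sketch`, the use of the standard `mmap` module and the alignment handling are assumptions.

```python
import mmap
import os
import tempfile

import numpy as np


def memmap_chunk_sketch(fid, num_channels, start, stop, dtype, file_offset=0):
    """Return samples [start, stop) of a (num_samples, num_channels) C-ordered buffer."""
    dtype = np.dtype(dtype)
    bytes_per_sample = num_channels * dtype.itemsize
    chunk_start = file_offset + start * bytes_per_sample
    chunk_nbytes = (stop - start) * bytes_per_sample
    # mmap offsets must be a multiple of the allocation granularity, so map
    # from the previous aligned position and skip the extra leading bytes.
    aligned_start = (chunk_start // mmap.ALLOCATIONGRANULARITY) * mmap.ALLOCATIONGRANULARITY
    extra = chunk_start - aligned_start
    mm = mmap.mmap(fid.fileno(), chunk_nbytes + extra, access=mmap.ACCESS_READ, offset=aligned_start)
    # The array keeps a reference to the mmap object, which keeps the mapping alive.
    return np.ndarray(shape=(stop - start, num_channels), dtype=dtype, buffer=mm, offset=extra)


# Usage on a small temporary file with a 16-byte header.
header_size = 16
data = np.arange(100 * 4, dtype="int16").reshape(100, 4)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "recording.dat")
    with open(path, "wb") as f:
        f.write(b"\x00" * header_size)
        f.write(data.tobytes())
    with open(path, "rb") as fid:
        view = memmap_chunk_sketch(fid, 4, 10, 20, "int16", file_offset=header_size)
        chunk = np.array(view)  # copy, so nothing refers to the mapping afterwards
        del view
```

Because only the requested byte range is mapped, and the mapping is released once the returned array is garbage collected, no file-wide memmap has to stay open for the lifetime of the reader.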

zm711

couple more comments from me too.

doc/source/rawio.rst

zm711 · 2024-10-17T19:23:26Z

neo/rawio/axonrawio.py

+            self._buffer_descriptions[0][seg_index][buffer_id] = {
+                "type" : "raw",
+                "file_path" : str(self.filename),
+                "dtype" : str(sig_dtype),


Is this for jsonification, rather than storing an np.dtype()?

If a data layout is split within a structure, we could theoretically handle it in zarr by giving the list of chunks.
This could be done in the future, but it will not be part of the current PR.

Yes, str(sig_dtype) is for jsonification.

There are some places without this safety step. So if we really want to be safe, we should do this for all uses of dtype and Path, no?
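
A short sketch of why the `str()` step matters: the standard `json` encoder rejects both `np.dtype` and `pathlib.Path` objects, while their string forms survive a round trip. The file name and the values below are made up for the example.

```python
import json
from pathlib import Path

import numpy as np

sig_dtype = np.dtype("int16")
filename = Path("data") / "recording.abf"

# Raw objects: the json encoder raises TypeError for dtype and Path values.
try:
    json.dumps({"dtype": sig_dtype, "file_path": filename})
    raw_objects_serializable = True
except TypeError:
    raw_objects_serializable = False

# Stringified values: encoding works and the dtype can be rebuilt on reading.
description = {
    "type": "raw",
    "file_path": str(filename),
    "dtype": str(sig_dtype),
}
decoded = json.loads(json.dumps(description))
restored_dtype = np.dtype(decoded["dtype"])
restored_path = Path(decoded["file_path"])
print(raw_objects_serializable, restored_dtype, restored_path)
```

If the byte order has to be explicit in the JSON regardless of the machine writing it, `sig_dtype.str` (for example `"<i2"`) could be stored instead of `str(sig_dtype)`.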

neo/rawio/baserawio.py

zm711 · 2024-10-17T19:26:31Z

neo/rawio/baserawio.py

+                for seg_index in self._hdf5_analogsignal_buffers[block_index].keys():
+                    for buffer_id, h5_file in self._hdf5_analogsignal_buffers[block_index][seg_index].items():
+                        h5_file.close()
+


No del here?

Oops, yes.
Thanks.
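
A sketch of the close-then-delete pattern under discussion, using in-memory file objects in place of open hdf5 files. The class and attribute names are hypothetical; the nested `block_index -> seg_index -> buffer_id` layout mirrors the loop in the diff above.

```python
import io


class ReaderSketch:
    """Hypothetical reader holding open file handles per block/segment/buffer."""

    def __init__(self):
        # io.BytesIO stands in for an open hdf5 file handle.
        self._open_buffers = {0: {0: {"0": io.BytesIO(b"\x00" * 8)}}}

    def close(self):
        if not hasattr(self, "_open_buffers"):
            return  # already closed
        for segments in self._open_buffers.values():
            for buffers in segments.values():
                for handle in buffers.values():
                    handle.close()
        # Drop the references too, so closed handles cannot be reused by mistake.
        del self._open_buffers


reader = ReaderSketch()
handle = reader._open_buffers[0][0]["0"]
reader.close()
reader.close()  # a second call is a no-op
```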

zm711 · 2024-10-17T19:27:12Z

neo/rawio/brainvisionrawio.py

+        self._buffer_descriptions[0][0][buffer_id] = {
+            "type" : "raw",
+            "file_path" : binary_filename,
+            "dtype" : sig_dtype,


Here you didn't stringify like above. Why?

samuelgarcia · 2024-10-21T11:56:17Z

Thank you very much @zm711 and @h-mayorquin for the review.


Proof of concept of "buffer_description_api" and xarray reference API…

3083513

… bridge

bendichter reviewed Jul 26, 2024


apdavison added this to the future milestone Jul 26, 2024

samuelgarcia added 3 commits August 27, 2024 20:43

Implement get_analogsignal_chunk() generically when a rawio class has…

e70c014

…_buffer_description_api=True This should also solve the memmap and memmory leak problem.

wip

24fe98e

test on micromed

22c6698

samuelgarcia mentioned this pull request Sep 3, 2024

neo.rawio : API enhance proposal buffer_id and stream_id #1543

Open

samuelgarcia added 12 commits September 9, 2024 11:11

rebase on buffer_id

38a84f6

Implement get_analogsignal_chunk() generically when a rawio class has…

4d31b75

…_buffer_description_api=True This should also solve the memmap and memmory leak problem.

wip

5506b43

test on micromed

7a0de15

some fix

a53d147

Merge

907fa25

make strema a slice of buffer and xarray api use buffer_id

f6ea345

json api : winedr + winwcp

81c7121

buffer api : RawBinarySignalRawIO + RawMCSRawIO

2d21e48

json api : neuroscope + openephysraw

a64b054

More reader with buffer description

bc5a122

merge with master

356b281

samuelgarcia mentioned this pull request Oct 11, 2024

simple proposal for buffer_id in rawio #1544

Merged

samuelgarcia added 6 commits October 11, 2024 11:31

update with buffer api branch

cf612df

Merge branch 'add_signal_buffer_id' of github.com:samuelgarcia/python…

cb6ba8d

…-neo into json_api

wip

4714535

Merge branch 'add_signal_buffer_id' of github.com:samuelgarcia/python…

e9836d3

…-neo into json_api

json api start hdf5 on maxwell

334a882

doc for signal_stream signal_buffer

383ada7

samuelgarcia commented Oct 11, 2024


zm711 reviewed Oct 11, 2024


Fix conflicts.

9c35752

Merci Zach

672e8fe

Co-authored-by: Zach McKenzie <92116279+zm711@users.noreply.github.com>

h-mayorquin reviewed Oct 16, 2024


Use class approach for buffer api : BaseRawWithBufferApiIO

a540840

zm711 reviewed Oct 17, 2024


samuelgarcia and others added 3 commits October 21, 2024 14:09

feedback

282951a

Apply suggestions from code review

401956a

Co-authored-by: Heberto Mayorquin <h.mayorquin@gmail.com> Co-authored-by: Zach McKenzie <92116279+zm711@users.noreply.github.com>

clean

2fa19fb
