-
Notifications
You must be signed in to change notification settings - Fork 247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proof of concept of "buffer_description_api" and xarray reference API bridge #1513
base: master
Are you sure you want to change the base?
Conversation
rfs = dict() | ||
rfs["version"] = 1 | ||
rfs["refs"] = dict() | ||
rfs["refs"][".zgroup"] = json.dumps(dict(zarr_format=2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the dumps command anymore
Looks good! |
…_buffer_description_api=True This should also solve the memmap and memmory leak problem.
…_buffer_description_api=True This should also solve the memmap and memmory leak problem.
…-neo into json_api
…-neo into json_api
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me.
Co-authored-by: Zach McKenzie <92116279+zm711@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a first reading. I think that this PR and the future documentation would benefit greatly if we have the schema of the buffer description somewhere. It does not have to be a formal schema (although that would be great) it could be just a description on the documentation or a python data class with types. Something that I can reference to see what should I expect to fill when I am doing a buffer like this.
For reading analog signals **neo.rawio** has 2 important concepts: | ||
|
||
1. The **signal_stream** : it is a group of channels that can be read together using :func:`get_analog_signal_chunk()`. | ||
This group of channels is guaranteed to have the same sampling rate, and the same duration per segment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have logical channels, this should be same units as well, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment not yet. The API do not enforce and ensure this.
See this https://github.com/NeuralEnsemble/python-neo/blob/master/neo/rawio/baserawio.py#L118
Ideally units should be add but some IO are mixing maybe the units in the same stream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add it as an ideal? Or would you rather first do the changes and then change it here?
I can close this for example:
neo/rawio/axonrawio.py
Outdated
@@ -115,7 +116,7 @@ def _parse_header(self): | |||
head_offset = info["sections"]["DataSection"]["uBlockIndex"] * BLOCKSIZE | |||
totalsize = info["sections"]["DataSection"]["llNumEntries"] | |||
|
|||
self._raw_data = np.memmap(self.filename, dtype=sig_dtype, mode="r", shape=(totalsize,), offset=head_offset) | |||
# self._raw_data = np.memmap(self.filename, dtype=sig_dtype, mode="r", shape=(totalsize,), offset=head_offset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be commented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You lean removed ?
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, yeah, we can get stuff from the diff.
neo/rawio/baserawio.py
Outdated
@@ -149,6 +151,9 @@ class BaseRawIO: | |||
|
|||
rawmode = None # one key from possible_raw_modes | |||
|
|||
# If true then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is not clear, if what is True?
neo/rawio/baserawio.py
Outdated
@@ -1284,6 +1298,7 @@ def _get_signal_size(self, block_index: int, seg_index: int, stream_index: int): | |||
|
|||
All channels indexed must have the same size and t_start. | |||
""" | |||
# must NOT be implemented if has_buffer_description_api=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this is done in the BaseRawWithBufferApiIO sub class
neo/rawio/utils.py
Outdated
|
||
def get_memmap_chunk_from_opened_file(fid, num_channels, start, stop, dtype, file_offset=0): | ||
""" | ||
Utility fonction to get a chunk as a memmap array directly from an opened file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo fonction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple more comments from me too.
self._buffer_descriptions[0][seg_index][buffer_id] = { | ||
"type" : "raw", | ||
"file_path" : str(self.filename), | ||
"dtype" : str(sig_dtype), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this for jsonification? rather than store a np.dtype()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a data layout is split with a structure theorically we could handle it in zarr giving the list of chunks.
This could be done a the futur but this will not be in the actual PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
str(sig_dtype) is for jsonification yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some without this safety step. So if we really want to be safe we should do this for all uses of dtype and Path no?
for seg_index in self._hdf5_analogsignal_buffers[block_index].keys(): | ||
for buffer_id, h5_file in self._hdf5_analogsignal_buffers[block_index][seg_index].items(): | ||
h5_file.close() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no del here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oups yes.
tahnks
neo/rawio/brainvisionrawio.py
Outdated
self._buffer_descriptions[0][0][buffer_id] = { | ||
"type" : "raw", | ||
"file_path" : binary_filename, | ||
"dtype" : sig_dtype, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you didn't stringify like above. Why?
Merci beaucoup @zm711 and @h-mayorquin for the review |
Given the idea of @bendichter and this gist
Here a very very early stage of a very very proof of concept of the "buffer_description_api" for analogsignal chunk.
The idea/goal is:
See also kerchunk