Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: Rewrite #53

Open
JoschD opened this issue Sep 15, 2022 · 0 comments
Open

[Feature Request]: Rewrite #53

JoschD opened this issue Sep 15, 2022 · 0 comments
Labels
Type: Bug Something isn't working as it should.

Comments

@JoschD
Copy link
Member

JoschD commented Sep 15, 2022

Feature Description

Upon inspection of the SDDS code, I think I stumbled upon multiple possible bugs.

a) if you have entries with duplicate names, these will overwrite each other, as they are stored by name in the dict. -> this should at least raise an error somewhere (e.g. assert len(names) == len(set(names)))

b) The order of the definitions in the header is sorted (parameter, array, column) upon reading, but not the data. And it is sorted before reading the data. So basically, the data needs to be already sorted in that order. If sorting would actually sort anything, the file could not be read anymore.
Same goes upon writing: here actually the data is always written in that order, but the order of the headers is whatever came into the SDDSFile init. Here sorting the header would make sense, but it is not done!!

We never encountered problems, as we read LHC data which is already sorted and mostly use the turn_by_turn package for writing, which sorts the entries in the same way.
If it wouldn't the order in the header would be different from the order in the data and that order is actually required.

My suggestion is to remove all of that sorting and loop in the writer simply through the lists and write parameter, array, column one at a time, instead of gathering them first.

Possible Implementation

We could add a random hash to the defintions, maybe build from name + some random chars.
Then definitions + values could be combined into a single dict with definitions as keys and the corresponding values as values. (which we could also accept as alternative input to the init, so either two lists, or this kind of dict).
The only problem is, that we then have to rewrite the getter a bit, so that if a name-string is given, we actually return a tuple in case multiple entries are found.

something like that.

@JoschD JoschD added Type: Bug Something isn't working as it should. Feature Request labels Sep 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug Something isn't working as it should.
Projects
None yet
Development

No branches or pull requests

2 participants