Output results progressively #30
Comments
This is already part of RABBIT as an input option, --incremental (https://github.com/natarajan-chidambaram/RABBIT#:~:text=%2D%2Dincremental), which users can specify if they want each result as soon as it is predicted.
Indeed, sorry ^^
Are you sure that it's working correctly for text-based output? Looking at the code, it seems that the "full" dataframe (up to the current contributor) is displayed within the loop... I guess the headers are also displayed each time a contributor is processed. Also, it does not really make sense to have this "incremental" mode for file-based output (you can't reliably read a file while it is being written to repeatedly, so what's the value of doing this?). Please reopen if I'm right :-p
(i) Yes, the headers are displayed each time a contributor is processed in --incremental mode. Furthermore, all the contributors processed so far, up to the most recent one, are displayed on the screen each time. I am reopening the issue as more discussion is required.
(1) This makes it barely usable for users. Why not display the headers only once and then, each time an account is processed, add a line to the output? Another point is that, in the current version of RABBIT, it seems we cannot output JSON or CSV on STDOUT (which can be really convenient when calling rabbit from code). I think we should give users more control over the output, to fit their needs without restricting the use cases. One possibility would be to have three "groups" of options: one to control where the results are provided (STDOUT or a file), one to control how rabbit processes the output (incremental mode or "full output" mode), and one to control the format of the output (text, csv, json, ...). These options could be To make it even more user-friendly, without adding much to the complexity, we can also provide a Basically, this implies some rewriting of the code (not much, that said), and could be very beneficial, since it is likely we can drop "pandas" as a dependency if we proceed that way!
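To make the three groups of options more concrete, here is a minimal sketch using only the Python standard library (no pandas). Everything in it is an assumption for illustration: the option names --output and --format, the column names in FIELDS, and the helpers build_parser and write_results are hypothetical and not part of RABBIT; only --incremental exists today, and its behaviour here is what is being proposed, not what is currently implemented.

```python
import argparse
import csv
import json

# Hypothetical column names; RABBIT's real output columns may differ.
FIELDS = ["account", "prediction", "confidence"]


def build_parser():
    """Three hypothetical groups of options: destination, mode, format."""
    parser = argparse.ArgumentParser(prog="rabbit-sketch")
    parser.add_argument("--output", default="-",
                        help="'-' for STDOUT, otherwise a file path")
    parser.add_argument("--incremental", action="store_true",
                        help="emit each result as soon as it is available")
    parser.add_argument("--format", choices=["text", "csv", "json"],
                        default="text")
    return parser


def write_results(results, stream, fmt, incremental):
    """Write an iterable of result dicts to stream, headers only once."""
    # In "full output" mode, wait for every result before writing anything.
    rows = results if incremental else list(results)

    if fmt == "json" and not incremental:
        json.dump(rows, stream)
        stream.write("\n")
        return

    if fmt == "csv":
        writer = csv.DictWriter(stream, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()  # written once, before any row
        emit = writer.writerow
    elif fmt == "json":
        # One JSON object per line, so partial output stays parseable.
        def emit(row):
            stream.write(json.dumps(row) + "\n")
    else:
        stream.write("\t".join(FIELDS) + "\n")  # header written once

        def emit(row):
            stream.write("\t".join(str(row[f]) for f in FIELDS) + "\n")

    for row in rows:
        emit(row)
        if incremental:
            stream.flush()  # make the new line visible immediately
```

With this shape, a caller would pick the stream from --output (sys.stdout for '-', an opened file otherwise) and pass a generator of results, so the same code path serves both STDOUT and file output in every format.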
Note that I do not think we need
Hello,
Related to #29, the current behaviour of RABBIT is to retrieve all the data (for all accounts passed as input), process them all, and then display all the results at once. What about processing accounts "one by one", so that we don't have to wait for all the accounts to be processed before getting the first results?
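As a rough illustration of what "one by one" could look like, here is a minimal sketch based on a Python generator. The names predict_one_by_one and classify are hypothetical: classify stands in for whatever retrieves and processes the data of a single account, and is not RABBIT's actual API.

```python
from typing import Callable, Iterable, Iterator, Tuple


def predict_one_by_one(
    accounts: Iterable[str],
    classify: Callable[[str], str],
) -> Iterator[Tuple[str, str]]:
    """Yield (account, prediction) as soon as each account is processed."""
    for account in accounts:
        # Nothing is fetched or computed for later accounts until the
        # caller asks for the next result.
        yield account, classify(account)
```

A caller can then print each result inside a plain for loop over predict_one_by_one(...), so the first prediction appears before the remaining accounts have been processed.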