AssistantBench #186

Open: wants to merge 6 commits into main


oriyor commented Oct 15, 2024

Add

  • task implementation
  • evaluation
  • logic to write predictions to file to support the hidden test set
  • readme/.toml

Tests

  • Test for evaluation on dev set
  • Toy implementation task
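
For readers unfamiliar with the "write predictions to file" step, here is a minimal sketch of what such logic could look like. It is not the code from this PR: the function names `write_predictions` and `read_predictions`, the JSON Lines layout, and the `id` / `answer` field names are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_predictions(predictions, path):
    """Write one JSON object per line: {"id": ..., "answer": ...}.

    The field names and the JSON Lines layout are assumptions for this
    sketch, not the format confirmed by the PR.
    """
    path = Path(path)
    # Create the output directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for task_id, answer in predictions.items():
            f.write(json.dumps({"id": task_id, "answer": answer}) + "\n")
    return path


def read_predictions(path):
    """Read the file back into a {task_id: answer} dict."""
    result = {}
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                record = json.loads(line)
                result[record["id"]] = record["answer"]
    return result


# Round trip in a temporary directory; task ids here are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    out = write_predictions(
        {"task-0": "42", "task-1": None},
        Path(tmp) / "preds" / "predictions.jsonl",
    )
    restored = read_predictions(out)
```

Writing one record per line keeps the file appendable and easy to submit for scoring when the gold answers are not available locally.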

oriyor and others added 5 commits October 18, 2024 16:49
Add
- task implementation
- evaluation
- logic to write predictions to file to support the hidden test set
- readme/.toml

Tests
- Test for evaluation on dev set
- Toy implementation task

gasse (Collaborator) commented Oct 18, 2024

Hey @oriyor, I refactored the packaging a bit and added a few tests. Some of the tests fail (goal_answer is None), and I am not sure how to fix that. Can you have a look?

Also, would you mind if we rename the tasks from browsergym/ab.X to browsergym/assistantbench.X for clarity?

gasse changed the title from "add AssistantBench" to "AssistantBench" on Oct 18, 2024