Implement a lotus-shed command to garbage collect all the indices not available in the state store #12377

Closed
3 of 9 tasks
akaladarshi opened this issue Aug 12, 2024 · 3 comments
Assignees
Labels
kind/feature Kind: Feature

Comments

@akaladarshi
Contributor

akaladarshi commented Aug 12, 2024

Checklist

  • This is not brainstorming ideas. If you have an idea you'd like to discuss, please open a new discussion on the lotus forum and select the category as Ideas.
  • I have a specific, actionable, and well motivated feature request to propose.

Lotus component

  • lotus daemon - chain sync
  • lotus fvm/fevm - Lotus FVM and FEVM interactions
  • lotus miner/worker - sealing
  • lotus miner - proving(WindowPoSt/WinningPoSt)
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

What is the motivation behind this feature request? Is your feature request related to a problem? Please describe.

Before unifying all the indexes (msg, tx, event) into a single DB, it’s crucial to garbage-collect (GC) any indices that are no longer referenced in the state store. This step will ensure that pruned or obsolete data isn't carried over during the migration, leading to a cleaner and more efficient database.

Describe the solution you'd like

Introduce a lotus-shed command that performs garbage collection on all indices not referenced by the current state. This command should effectively remove unnecessary indexes, optimizing the database before the unification of the indexes.

Additional context

This has been discussed here as well and it will be helpful to @aarshkshah1992, as he is working on unifying the DB for the indexes.

@rvagg
Member

rvagg commented Aug 13, 2024

@akaladarshi @aarshkshah1992 we need to work out the most efficient way to determine what our oldest state tree is. We have all the tipsets in the chain store, but I suppose if we were to iterate over the state roots and try to load just the state root block, we'd eventually find one that doesn't load.
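
A minimal sketch of the "iterate over the state roots until one doesn't load" idea, assuming a simplified in-memory model. `Tipset`, `StateStore`, `memStore` and `OldestAvailableState` are hypothetical names for illustration, not Lotus APIs, and CIDs are represented as plain strings:

```go
package main

import "fmt"

// Tipset is a hypothetical, simplified stand-in for a chain tipset:
// only the epoch and the state root (a string here, a CID in practice).
type Tipset struct {
	Epoch     int64
	StateRoot string
}

// StateStore is a hypothetical view of the state blockstore. Has reports
// whether the block for the given state root can be loaded.
type StateStore interface {
	Has(root string) bool
}

// memStore is an in-memory StateStore used only for this example.
type memStore map[string]bool

func (m memStore) Has(root string) bool { return m[root] }

// OldestAvailableState walks the chain from head (index 0) towards genesis
// and returns the epoch of the last tipset whose state root still loads.
// ok is false if even the head's state root is missing.
func OldestAvailableState(chain []Tipset, store StateStore) (epoch int64, ok bool) {
	for _, ts := range chain {
		if !store.Has(ts.StateRoot) {
			break // first state root that does not load: stop here
		}
		epoch, ok = ts.Epoch, true
	}
	return epoch, ok
}

func main() {
	// Newest tipset first; state for epoch 102 has been pruned.
	chain := []Tipset{
		{Epoch: 105, StateRoot: "r105"},
		{Epoch: 104, StateRoot: "r104"},
		{Epoch: 103, StateRoot: "r103"},
		{Epoch: 102, StateRoot: "r102"},
	}
	store := memStore{"r105": true, "r104": true, "r103": true}

	epoch, ok := OldestAvailableState(chain, store)
	fmt.Println("oldest epoch with state:", epoch, "found:", ok)
}
```

This is a linear scan from the head. If state availability can be assumed to be contiguous from the head backwards, a binary search over epochs would need fewer loads, but that assumption is not confirmed in this thread.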

@aarshkshah1992
Contributor

@rvagg

As discussed offline, I think the best approach here is for the user to specify "start with tipset X and gc everything backwards till epoch Y". That at least enables the majority of our users, who run a snapshot-synced node with splitstore enabled (sync once -> then "run forever"), to GC events they know they no longer have the state for (these users usually retain state for the last 1 or 2 days).

Once we have this in place, we can think about GC strategies for other use cases.
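
The "start with tipset X and gc everything backwards till epoch Y" approach could look roughly like the sketch below. It assumes index rows can be filtered by epoch and that epoch Y itself is included in the range; `IndexEntry` and `GCRange` are hypothetical names, not part of lotus-shed or any Lotus API:

```go
package main

import "fmt"

// IndexEntry is a hypothetical index row: a key (for example a message or
// event identifier) together with the epoch it was indexed at.
type IndexEntry struct {
	Key   string
	Epoch int64
}

// GCRange drops every entry whose epoch lies in [untilEpoch, startEpoch],
// i.e. from the starting tipset's epoch backwards to untilEpoch.
// Treating untilEpoch as inclusive is an assumption of this sketch.
// It returns the entries that were kept and the number removed.
func GCRange(entries []IndexEntry, startEpoch, untilEpoch int64) ([]IndexEntry, int, error) {
	if untilEpoch > startEpoch {
		return entries, 0, fmt.Errorf("until epoch %d is after start epoch %d", untilEpoch, startEpoch)
	}
	kept := make([]IndexEntry, 0, len(entries))
	for _, e := range entries {
		if e.Epoch >= untilEpoch && e.Epoch <= startEpoch {
			continue // inside the GC window: drop it
		}
		kept = append(kept, e)
	}
	return kept, len(entries) - len(kept), nil
}

func main() {
	entries := []IndexEntry{
		{Key: "msg-a", Epoch: 100},
		{Key: "msg-b", Epoch: 101},
		{Key: "evt-c", Epoch: 102},
		{Key: "evt-d", Epoch: 103},
	}
	// GC from the tipset at epoch 102 backwards to epoch 100.
	kept, removed, err := GCRange(entries, 102, 100)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("removed:", removed, "kept:", kept)
}
```

Having the user supply both bounds keeps the command independent of any automatic detection of the oldest available state, which matches the snapshot-synced use case described above.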

@BigLep
Member

BigLep commented Oct 3, 2024

This is subsumed by work happening in #12453

@BigLep BigLep closed this as completed Oct 3, 2024
Projects
Status: ☑️ Done (Archive)

4 participants