Embeddings of atoms from different representations #35
Comments
So you would like to have one vector per atom in a structure?
I would like to get a vector for an atom on its own, not in the context of any particular structure, so that I can see whether all the alkali elements are similar for models trained with different representations.
Can I use the learned token embedding? Or do I even need to pass it through the model if it is not in the context of a structure?
Ah, for this, people have used the learned embeddings of the different tokens. Some existing techniques are here: https://github.com/kjappelbaum/element-coder
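For what it's worth, a minimal sketch of the "compare learned token embeddings" idea could look like the following. Everything here is an assumption for illustration: `toy_embeddings` holds made-up vectors, not values from any trained model, and `cosine_similarity` and `similarity_table` are hypothetical helper names. With a Hugging Face model the real table would typically come from `model.get_input_embeddings().weight`, indexed by the tokenizer's token ids, and it assumes each element symbol maps to a single token, which may not hold for every tokenizer.

```python
import math

# Made-up stand-in for a learned token-embedding table (NOT real model weights).
# In practice each row would be looked up by the token id of the element symbol.
toy_embeddings = {
    "Li": [0.90, 0.10, 0.00],
    "Na": [0.80, 0.20, 0.10],
    "K": [0.85, 0.15, 0.05],
    "Cl": [0.10, 0.90, 0.20],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_table(embeddings):
    """Pairwise cosine similarities, keyed by (symbol_a, symbol_b)."""
    symbols = sorted(embeddings)
    return {
        (a, b): cosine_similarity(embeddings[a], embeddings[b])
        for a in symbols
        for b in symbols
    }


table = similarity_table(toy_embeddings)
for (a, b), sim in table.items():
    if a < b:  # print each unordered pair once
        print(f"{a}-{b}: {sim:.3f}")
```

Running the same comparison on the embedding table of each model (one per representation) would show whether the alkali elements end up close to each other in every case.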
@n0w0f did you ever give this a look? Do you still plan to look into it?
I have not yet, but I think there could be a lot of hidden insights there, and I would love to follow up.
@kjappelbaum, in order to check the similarity between atoms, or to do analyses like
King - Queen = Man - Woman
I would like to embed individual atoms with models trained on different representations. This is a follow-up to see whether composition or atoms mean anything for smaller models.
For slice and composition, maybe I can keep the atom as the first token and pad all other tokens, but for crystal-llm or cif_rep, atoms usually come in the later part of the representation. Would keeping the atom at the beginning work for these representations?
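To make the King - Queen = Man - Woman style of analysis concrete, here is a rough sketch of the vector arithmetic applied to element vectors. It is only an illustration under stated assumptions: the two-dimensional `toy_vectors` are invented (first axis loosely "group", second loosely "period"), and `analogy` is a hypothetical helper, not part of any existing package. It does not answer the padding or token-position question; it only shows what one could compute once a standalone vector per atom is available.

```python
import math

# Invented 2D vectors for illustration only (NOT learned embeddings).
toy_vectors = {
    "Li": [0.0, 2.0],
    "Na": [0.0, 3.0],
    "K": [0.0, 4.0],
    "F": [1.0, 2.0],
    "Cl": [1.0, 3.0],
    "Br": [1.0, 4.0],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the symbol closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [s for s in vectors if s not in (a, b, c)]
    return max(candidates, key=lambda s: cosine_similarity(target, vectors[s]))


# "Cl is to Na as ? is to K": compute Cl - Na + K and look up the nearest element.
result = analogy(toy_vectors, "Cl", "Na", "K")
print(result)
```

The query symbols are excluded from the candidates because the target vector usually stays close to one of its own inputs, which would otherwise dominate the ranking.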