Error when adding textrank component for language model in Python 3.12 Docker setup #282

Open
matteosdocsity opened this issue Oct 10, 2024 · 0 comments

Comments

@matteosdocsity

I encountered an issue when trying to use PyTextRank with spaCy in a Docker container running Python 3.12.7. The problem arises when I try to add the textrank component to any of the language models (Italian, `it`, in the example below).

Environment:

Python version: 3.12.7 (using the Docker image python:3.12.7-slim-bullseye)
spaCy versions: 3.0.5 and 3.7.4 (tested both)
PyTextRank version: 3.3.0
Steps to reproduce: Here are the commands I'm using to set up the environment in Docker:

RUN /root/.cargo/bin/uv pip install --no-cache --system spacy==3.0.5 pytextrank==3.3.0
RUN python -m spacy download it_core_news_sm
RUN python -m spacy download en_core_web_sm
RUN python -m spacy download es_core_news_sm
RUN python -m spacy download pt_core_news_sm
RUN python -m spacy download ru_core_news_sm
RUN python -m spacy download fr_core_news_sm
RUN python -m spacy download de_core_news_sm
RUN python -m spacy download pl_core_news_sm
RUN python -m spacy download xx_ent_wiki_sm
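
To rule out a failed download before looking at the pipeline itself, the setup can be sanity-checked from inside the container. This is a minimal sketch, assuming each `spacy download` installs the model as an importable Python package under the same name; the helper `missing_models` is a hypothetical name used only for illustration.

```python
import importlib.util

# Model packages installed by the `spacy download` commands in the Docker setup.
MODELS = [
    "it_core_news_sm",
    "en_core_web_sm",
    "es_core_news_sm",
    "pt_core_news_sm",
    "ru_core_news_sm",
    "fr_core_news_sm",
    "de_core_news_sm",
    "pl_core_news_sm",
    "xx_ent_wiki_sm",
]


def missing_models(names=MODELS):
    """Return the model packages that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_models()
    print("missing models:", missing or "none")
```

An empty result only shows that the model packages are present; it says nothing about whether the `textrank` factory is registered.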

Error Message: The following error is thrown when attempting to add the textrank component to the Italian language model:

  File "/app/extractive_summary/text_rank.py", line 139, in process_chunk_summary
    nlp.add_pipe("textrank", config={"stopwords": {"word": list(self.stop_words)}}, last=True)
  File "/usr/local/lib/python3.12/site-packages/spacy/language.py", line 824, in add_pipe
    pipe_component = self.create_pipe(
                     ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/spacy/language.py", line 693, in create_pipe
    raise ValueError(err)
ValueError: [E002] Can't find factory for 'textrank' for language Italian (it). 
This usually happens when spaCy calls `nlp.create_pipe` with a custom component name that's not registered on the current language class. 
If you're using a custom component, make sure you've added the decorator `@Language.component` (for function components) or `@Language.factory` (for class components).

Expected Behavior: The textrank component should be successfully added to the Italian language model without throwing any errors.

Additional Information: The E002 error suggests the problem lies in component registration: spaCy finds no `textrank` factory registered at the time `add_pipe` is called.
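
One way to narrow this down is to check whether the factory is visible to spaCy at all. The sketch below assumes the usual PyTextRank pattern, in which `import pytextrank` registers the `textrank` factory as a side effect, so a module that calls `nlp.add_pipe("textrank")` without that import would hit E002. Whether that is the cause here is not confirmed; `textrank_factory_status` is a hypothetical helper name.

```python
import importlib.util


def textrank_factory_status():
    """Report whether spaCy can see the 'textrank' factory after importing pytextrank."""
    status = {
        "spacy": importlib.util.find_spec("spacy") is not None,
        "pytextrank": importlib.util.find_spec("pytextrank") is not None,
        "factory_registered": False,
    }
    # Without both packages there is nothing further to check.
    if not (status["spacy"] and status["pytextrank"]):
        return status

    from spacy.language import Language
    import pytextrank  # noqa: F401  (assumed to register the factory on import)

    # Language.has_factory checks the registry that add_pipe consults.
    status["factory_registered"] = Language.has_factory("textrank")
    return status


if __name__ == "__main__":
    print(textrank_factory_status())
```

If `factory_registered` comes back `True` here but `add_pipe` still fails in the application, the import is probably missing from (or happening after) the code path in `text_rank.py`; if it comes back `False`, the installed spaCy and PyTextRank versions may not be working together on Python 3.12.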
