Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Update device deserialization for components that use local models #7686

Merged
merged 9 commits into from
May 14, 2024

Conversation

sjrl
Copy link
Contributor

@sjrl sjrl commented May 13, 2024

Related Issues

  • fixes #issue-number

Proposed Changes:

I would like to be able to specify None or null when deserializin a pipeline that uses:

  • SentenceTransfomersTextEmbedder
  • SentenceTransfomersDocumentEmbedder
  • NamedEntityExtractor
  • SentenceTransformersDiversityRanker

Currently I get an error if I try and I'm required to provide a specific device following the ComponentDevice.to_dict() format.

Following implementation that is used for ExtractiveReader to overcome the same problem.

How did you test it?

  • added two unit tests

Notes for the reviewer

Checklist

@sjrl sjrl requested a review from a team as a code owner May 13, 2024 11:02
@sjrl sjrl requested review from anakin87 and removed request for a team May 13, 2024 11:02
@sjrl sjrl changed the title fix: Update device deserializtion for SentenceTransformersTextEmbedder fix: Update device deserialization for SentenceTransformersTextEmbedder May 13, 2024
@sjrl sjrl marked this pull request as draft May 13, 2024 11:02
@coveralls
Copy link
Collaborator

coveralls commented May 13, 2024

Pull Request Test Coverage Report for Build 9065132032

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 20 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.007%) to 90.421%

Files with Coverage Reduction New Missed Lines %
components/extractors/named_entity_extractor.py 20 68.48%
Totals Coverage Status
Change from base Build 9030307905: 0.007%
Covered Lines: 6532
Relevant Lines: 7224

💛 - Coveralls

@sjrl sjrl marked this pull request as ready for review May 13, 2024 12:27
@sjrl sjrl requested a review from a team as a code owner May 13, 2024 12:27
@sjrl sjrl requested review from dfokina and removed request for a team May 13, 2024 12:27
@sjrl sjrl added this to the 2.1.2 milestone May 13, 2024
Copy link
Member

@anakin87 anakin87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, @sjrl!

Can you explain when this is necessary?
Typically, when serializing a Pipeline containing a SentenceTransformersTextEmbedder, the used device is serialized and I don't expect it to be null, but maybe I am overlooking something...

(If needed, please also update the LocalWhisperTranscriber)

@sjrl
Copy link
Contributor Author

sjrl commented May 13, 2024

Hey @anakin87! And of course.

This is more to do with deserializing from a YAML file where the pipeline was only generated as a YAML file (no serialization in this case). It's often the case that in dC we will create a pipeline from a YAML file in an editor that does not have access to the hardware we will run it on. Also it's possible as the editor of the pipeline I don't know what hardware this pipeline will run on (e.g. cpu vs gpu) so it's helpful to be able to specify device: null in the yaml file which will let init methods of these components auto detect which hardware to use.

@sjrl sjrl changed the title fix: Update device deserialization for SentenceTransformersTextEmbedder fix: Update device deserialization for components that use local models May 13, 2024
@anakin87
Copy link
Member

Thanks for the explanation!

This PR looks good.

  • Please add a similar mechanism to LocalWhisperTranscriber.
  • Update the release notes.

@sjrl sjrl requested a review from anakin87 May 13, 2024 14:57
Copy link
Member

@anakin87 anakin87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@sjrl sjrl merged commit a2be90b into main May 14, 2024
25 checks passed
@sjrl sjrl deleted the sjrl-patch-1 branch May 14, 2024 06:36
davidsbatista pushed a commit that referenced this pull request May 16, 2024
…ls (#7686)

* fix: Update device deserializtion for SentenceTransformersTextEmbedder

* Add unit test

* Fix unit test

* Make same change to doc embedder

* Add release notes

* Add same change to Diversity Ranker and Named Entity Extractor

* Add unit test

* Add the same for whisper local

* Update release notes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants