replace RequestsFetcher for Urllib3Fetcher #2762
Merged
Changes from 4 commits
Commits (13):
- 0675f0c: create urllib3 fetcher, replace requestsFetcher with urllibFetcher in… (NicholasTanz)
- 20d825f: fix line too long linting error (NicholasTanz)
- 031778f: more linting stuff (NicholasTanz)
- 18e42ce: replacing RequestsFecther with Urllib3Fetcher in .rst (NicholasTanz)
- 2128030: utilize one pool manager (NicholasTanz)
- 2aed81f: change error handling to MaxRetryError in _fetch() (NicholasTanz)
- a48fca5: add retry error handling to _chunks() (NicholasTanz)
- f8b1dbd: linting (NicholasTanz)
- 326529b: Merge branch 'theupdateframework:develop' into switchUrlLib3 (NicholasTanz)
- 86cc7ad: clarify urllib3 as requirement in pyproject.toml and add back in requ… (NicholasTanz)
- d67f126: remove self.app_user_agent attribute, as it's not used outside of init (NicholasTanz)
- 6318760: swap invalid urls that are used in testing. (takes care of deprecatio… (NicholasTanz)
- 2ac8bdc: linting (NicholasTanz)

    @@ -0,0 +1,152 @@
    # Copyright 2021, New York University and the TUF contributors
    # SPDX-License-Identifier: MIT OR Apache-2.0

    """Provides an implementation of ``FetcherInterface`` using the urllib3 HTTP
    library.
    """

    from __future__ import annotations

    import logging
    from typing import TYPE_CHECKING
    from urllib import parse

    # Imports
    import urllib3

    import tuf
    from tuf.api import exceptions
    from tuf.ngclient.fetcher import FetcherInterface

    if TYPE_CHECKING:
        from collections.abc import Iterator

    # Globals
    logger = logging.getLogger(__name__)


    # Classes
    class Urllib3Fetcher(FetcherInterface):
        """An implementation of ``FetcherInterface`` based on the urllib3 library.

        Attributes:
            socket_timeout: Timeout in seconds, used for both initial connection
                delay and the maximum delay between bytes received.
            chunk_size: Chunk size in bytes used when downloading.
        """

        def __init__(
            self,
            socket_timeout: int = 30,
            chunk_size: int = 400000,
            app_user_agent: str | None = None,
        ) -> None:
            # NOTE: We use a separate urllib3.PoolManager per scheme+hostname
            # combination, in order to reuse connections to the same hostname to
            # improve efficiency, but avoiding sharing state between different
            # hosts-scheme combinations to minimize subtle security issues.
            # Some cookies may not be HTTP-safe.
            self._poolManagers: dict[tuple[str, str], urllib3.PoolManager] = {}

            # Default settings
            self.socket_timeout: int = socket_timeout  # seconds
            self.chunk_size: int = chunk_size  # bytes
            self.app_user_agent = app_user_agent

        def _fetch(self, url: str) -> Iterator[bytes]:
            """Fetch the contents of HTTP/HTTPS url from a remote server.

            Args:
                url: URL string that represents a file location.

            Raises:
                exceptions.SlowRetrievalError: Timeout occurs while receiving
                    data.
                exceptions.DownloadHTTPError: HTTP error code is received.

            Returns:
                Bytes iterator
            """
            # Get a customized session for each new schema+hostname combination.
            poolmanager = self._get_poolmanager(url)

            # Get the urllib3.PoolManager object for this URL.
            #
            # Defer downloading the response body with preload_content=False.
            # Always set the timeout. This timeout value is interpreted by
            # urllib3 as:
            #  - connect timeout (max delay before first byte is received)
            #  - read (gap) timeout (max delay between bytes received)
            try:
                response = poolmanager.request(
                    "GET",
                    url,
                    preload_content=False,
                    timeout=urllib3.Timeout(connect=self.socket_timeout),
                )
            except urllib3.exceptions.TimeoutError as e:
                raise exceptions.SlowRetrievalError from e

            # Check response status.
            try:
                if response.status >= 400:
                    raise urllib3.exceptions.HTTPError
            except urllib3.exceptions.HTTPError as e:
                response.close()
                status = response.status
                raise exceptions.DownloadHTTPError(str(e), status) from e

            return self._chunks(response)

        def _chunks(
            self, response: urllib3.response.BaseHTTPResponse
        ) -> Iterator[bytes]:
            """A generator function to be returned by fetch.

            This way the caller of fetch can differentiate between connection
            and actual data download.
            """

            try:
                yield from response.stream(self.chunk_size)
            except (
                urllib3.exceptions.ConnectionError,
                urllib3.exceptions.TimeoutError,
            ) as e:
                raise exceptions.SlowRetrievalError from e

            finally:
                response.close()

        def _get_poolmanager(self, url: str) -> urllib3.PoolManager:
            """Return a different customized urllib3.PoolManager per schema+hostname
            combination.

            Raises:
                exceptions.DownloadError: When there is a problem parsing the url.
            """
            # Use a different urllib3.PoolManager per schema+hostname
            # combination, to reuse connections while minimizing subtle
            # security issues.
            parsed_url = parse.urlparse(url)

            if not parsed_url.scheme:
                raise exceptions.DownloadError(f"Failed to parse URL {url}")

            poolmanager_index = (parsed_url.scheme, parsed_url.hostname or "")
            poolmanager = self._poolManagers.get(poolmanager_index)

            if not poolmanager:
                # no default User-Agent when creating a poolManager
                ua = f"python-tuf/{tuf.__version__}"
                if self.app_user_agent is not None:
                    ua = f"{self.app_user_agent} {ua}"

                poolmanager = urllib3.PoolManager(headers={"User-Agent": ua})
                self._poolManagers[poolmanager_index] = poolmanager

                logger.debug("Made new poolManager %s", poolmanager_index)
            else:
                logger.debug("Reusing poolManager %s", poolmanager_index)

            return poolmanager
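
For readers following the diff, the two patterns it relies on can be tried without urllib3 or python-tuf installed: one cached object per scheme+hostname pair (as in `_get_poolmanager`), and a generator that closes its source in a `finally` clause (as in `_chunks`). The following is a minimal stdlib-only sketch, not the PR's code. The names `pool_key`, `PerHostCache`, and `stream_chunks` are hypothetical, `ValueError` stands in for `exceptions.DownloadError`, and an in-memory `io.BytesIO` stands in for the HTTP response.

```python
from __future__ import annotations

import io
from collections.abc import Callable, Iterator
from typing import BinaryIO, Generic, TypeVar
from urllib import parse

T = TypeVar("T")


def pool_key(url: str) -> tuple[str, str]:
    """Return the (scheme, hostname) pair used as the cache key."""
    parsed = parse.urlparse(url)
    if not parsed.scheme:
        # The diff raises exceptions.DownloadError here; ValueError is a
        # stand-in so this sketch has no python-tuf dependency.
        raise ValueError(f"Failed to parse URL {url}")
    return (parsed.scheme, parsed.hostname or "")


class PerHostCache(Generic[T]):
    """Hypothetical helper: lazily create one object per scheme+hostname."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._items: dict[tuple[str, str], T] = {}

    def get(self, url: str) -> T:
        key = pool_key(url)
        if key not in self._items:
            # First request for this scheme+hostname: build a new object.
            self._items[key] = self._factory()
        return self._items[key]


def stream_chunks(body: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield the body in chunks and always close it afterwards."""
    try:
        while chunk := body.read(chunk_size):
            yield chunk
    finally:
        # Runs on normal exhaustion, on error, and on early generator close.
        body.close()


# Usage: plain object() instances stand in for urllib3.PoolManager.
cache: PerHostCache[object] = PerHostCache(object)
a = cache.get("https://example.com/metadata/root.json")
b = cache.get("https://example.com/targets/file.txt")
c = cache.get("http://example.com/targets/file.txt")
chunks = list(stream_chunks(io.BytesIO(b"abcdefgh"), 3))
```

In this sketch `a` and `b` are the same object because the two URLs share a scheme and hostname, while `c` is a separate object because the scheme differs. Commit 2128030 in the list above ("utilize one pool manager") suggests the PR later dropped the per-host dictionary, so treat the cache as an illustration of this revision only.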