misc
Miscellaneous utility functions for sharded repodata processing.
This module contains utility functions that don't fit cleanly into other modules:
- URL handling
- Package name parsing
- Data transformation helpers
- Threading utilities
Functions
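- _shards_connections(): If context.repodata_threads is not set, find the size of the connection pool in a typical https:// session.
- _safe_urljoin_with_slash(base_url, relative_url): Join base_url with relative_url, ensuring proper handling of all URL schemes.
- _is_http_error_most_400_codes(status_code): Determine whether the status code is an HTTP 4xx error code (except for 416).
- ensure_hex_hash(record): Convert bytes checksums to hex; leave unchanged if already str.
- spec_to_package_name(spec): Given a dependency spec, return the package name, or None if the spec is not parseable.
- filter_redundant_packages(repodata, use_only_tar_bz2): Given repodata or a single shard, remove any .tar.bz2 packages that have a .conda counterpart.
- combine_batches_until_none(in_queue): Combine lists from in_queue until we see None. Yield combined lists.
- exception_to_queue(func): Decorator to send unhandled exceptions to the second argument out_queue.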
Attributes
- _T
- _URLJOIN_SAFE_SCHEMES
- SHARDS_CONNECTIONS_DEFAULT = 10
- _shards_connections() -> int
If context.repodata_threads is not set, find the size of the connection pool in a typical https:// session. This should significantly reduce dropped connections. We match requests' default of 10.
Is this shared between all sessions? Or do we get a different pool for a different get_session(url)?
Other adapters (file://, s3://) used in conda would have different concurrency behavior; we are not prepared to have separate threadpools per connection type.
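The fallback described above can be sketched roughly as follows. This is only an illustration: `pick_worker_count` and `fetch_all` are hypothetical names, `configured_threads` stands in for context.repodata_threads, and the sketch simply falls back to the constant rather than inspecting a real session's connection pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Optional

SHARDS_CONNECTIONS_DEFAULT = 10  # matches requests' default pool size


def pick_worker_count(configured_threads: Optional[int]) -> int:
    """Hypothetical helper: prefer an explicit setting, else the default."""
    if configured_threads:
        return configured_threads
    return SHARDS_CONNECTIONS_DEFAULT


def fetch_all(urls: Iterable[str], fetch: Callable, configured_threads=None) -> list:
    """Run fetch(url) for every url using a single bounded thread pool."""
    workers = pick_worker_count(configured_threads)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of urls in its results.
        return list(pool.map(fetch, urls))
```

Keeping the worker count at or below the pool size means each thread can hold a pooled connection instead of opening and discarding extra ones.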
- _safe_urljoin_with_slash(base_url: str, relative_url: str = '') -> str
Join base_url with relative_url, ensuring proper handling of all URL schemes.
Python's urllib.parse.urljoin only handles schemes registered in urllib.parse.uses_relative. For unregistered schemes like s3://, it returns just "." instead of the resolved URL. This function falls back to a scheme-swap workaround for those cases.
The result always ends with "/" to enable proper string concatenation with filenames.
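One way a scheme-swap workaround could look is sketched below. This is not the module's actual implementation: `join_with_slash` is a hypothetical name, the choice of https as the temporary scheme is an assumption, and the sketch assumes relative_url is genuinely relative (no scheme of its own).

```python
from urllib.parse import urljoin, urlsplit, uses_relative


def join_with_slash(base_url: str, relative_url: str = "") -> str:
    """Join two URL parts and guarantee a trailing slash (illustrative only)."""
    # Assumption: treat base_url as a directory so the last segment is kept.
    if not base_url.endswith("/"):
        base_url += "/"
    scheme = urlsplit(base_url).scheme
    if scheme in uses_relative:
        joined = urljoin(base_url, relative_url)
    else:
        # Temporarily present the URL as https:// so urljoin resolves it,
        # then put the original scheme back on the result.
        swapped = "https" + base_url[len(scheme):]
        resolved = urljoin(swapped, relative_url)
        joined = scheme + resolved[len("https"):]
    return joined if joined.endswith("/") else joined + "/"
```

Because the result always ends with "/", a caller can append a filename with plain concatenation, for example `join_with_slash("s3://bucket/channel", "linux-64") + "repodata.json"`.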
- _is_http_error_most_400_codes(status_code: str | int) -> bool
Determine whether the status code is an HTTP 4xx error code (except for 416, Range Not Satisfiable).
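A minimal sketch of such a check, assuming "400 error code" means the 4xx range and that string inputs are numeric; `is_most_4xx` is a hypothetical name used only for illustration.

```python
def is_most_4xx(status_code) -> bool:
    """Return True for 4xx status codes other than 416 (illustrative only)."""
    code = int(status_code)  # the documented signature accepts str or int
    return 400 <= code < 500 and code != 416
```

Under this reading, 404 and "403" count as matching errors, while 416, 200, and 500 do not.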
- ensure_hex_hash(record: conda._private.shards.typing.PackageRecordDict)
Convert bytes checksums to hex; leave unchanged if already str.
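The conversion can be illustrated with a short sketch. Which keys hold checksums ("sha256" and "md5" here) and whether the record is modified in place are assumptions; `to_hex_checksums` is a hypothetical name.

```python
def to_hex_checksums(record: dict, keys=("sha256", "md5")) -> dict:
    """Convert bytes checksum fields to hex strings (illustrative only)."""
    for key in keys:
        value = record.get(key)
        if isinstance(value, bytes):
            record[key] = value.hex()  # bytes -> lowercase hex str
        # str values (already hex) and missing keys are left untouched
    return record
```

Binary shard formats can store digests as raw bytes, while consumers of repodata generally expect hex strings, so a normalizing step like this keeps both representations usable.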
- spec_to_package_name(spec: str) -> str | None
Given a dependency spec, return the package name, or None if the spec is not parseable.
Uses conda's MatchSpec rather than libmambapy to avoid a hard dependency on a solver backend. With @functools.cache the performance is equivalent (benchmarked at ~10ms for 5000 unique specs either way).
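The caching pattern mentioned above can be sketched without depending on conda. Note that `naive_spec_name` is a hypothetical stand-in that uses a simple regular expression, not MatchSpec; it ignores channel prefixes and other spec syntax and is meant only to show how @functools.cache avoids re-parsing repeated specs.

```python
import functools
import re

# Leading run of characters that may appear in a package name.
_NAME_RE = re.compile(r"^[A-Za-z0-9_.\-]+")


@functools.cache
def naive_spec_name(spec: str):
    """Return the leading package name of a spec, or None (stand-in parser)."""
    match = _NAME_RE.match(spec.strip())
    return match.group(0).lower() if match else None
```

Since dependency specs repeat heavily across packages, each distinct spec string is parsed once and later lookups are served from the cache.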
- filter_redundant_packages(repodata: conda._private.shards.typing.ShardDict, use_only_tar_bz2=False) -> conda._private.shards.typing.ShardDict
Given repodata or a single shard, remove any .tar.bz2 packages that have a .conda counterpart.
Returns a shallow copy if use_only_tar_bz2 is False; otherwise returns the input unmodified.
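A rough sketch of this filtering, assuming the usual repodata layout with a "packages" mapping for .tar.bz2 files and a "packages.conda" mapping for .conda files; `drop_redundant_tar_bz2` is a hypothetical name and the real function may differ in detail.

```python
def drop_redundant_tar_bz2(repodata: dict, use_only_tar_bz2: bool = False) -> dict:
    """Remove .tar.bz2 entries that also exist as .conda (illustrative only)."""
    if use_only_tar_bz2:
        return repodata  # caller wants .tar.bz2 packages; nothing to filter

    # Filename stems (name-version-build) that exist in .conda format.
    conda_stems = {
        name.removesuffix(".conda") for name in repodata.get("packages.conda", {})
    }
    filtered = dict(repodata)  # shallow copy; other values are shared
    filtered["packages"] = {
        name: record
        for name, record in repodata.get("packages", {}).items()
        if name.removesuffix(".tar.bz2") not in conda_stems
    }
    return filtered
```

Because the copy is shallow, the input mapping is not mutated, but the record dictionaries themselves are shared between input and output.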
- combine_batches_until_none(in_queue: queue.SimpleQueue[collections.abc.Sequence[_T] | None]) -> collections.abc.Iterator[collections.abc.Sequence[_T]]
Combine lists from in_queue until we see None. Yield combined lists.
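One plausible reading of this behavior is sketched below: block for the first batch, greedily merge whatever else is already waiting, yield the merged list, and stop at the None sentinel. The batching policy is an assumption and `combine_until_none` is a hypothetical name.

```python
import queue
from collections.abc import Iterator


def combine_until_none(in_queue: queue.SimpleQueue) -> Iterator[list]:
    """Yield merged batches from in_queue until None is seen (illustrative)."""
    while True:
        batch = in_queue.get()  # block until a batch or the sentinel arrives
        if batch is None:
            return
        combined = list(batch)
        finished = False
        while True:
            try:
                batch = in_queue.get_nowait()  # drain without blocking
            except queue.Empty:
                break
            if batch is None:
                finished = True
                break
            combined.extend(batch)
        yield combined
        if finished:
            return
```

Merging everything that is already queued lets a slow consumer handle one larger batch instead of many small ones.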
- exception_to_queue(func)
Decorator to send unhandled exceptions to the second argument out_queue.
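A decorator of this kind might look like the following sketch. It assumes the wrapped function takes (in_queue, out_queue, ...) and that the exception object itself is what gets forwarded; `send_exceptions_to_out_queue` is a hypothetical name.

```python
import functools
import queue


def send_exceptions_to_out_queue(func):
    """Forward unhandled exceptions to the second positional argument."""

    @functools.wraps(func)
    def wrapper(in_queue, out_queue, *args, **kwargs):
        try:
            return func(in_queue, out_queue, *args, **kwargs)
        except Exception as exc:
            # A worker thread cannot raise into its consumer, so hand the
            # exception over on the queue the consumer is already reading.
            out_queue.put(exc)

    return wrapper
```

The consumer can then check each item it reads from out_queue and re-raise if it is an exception, so a failing worker thread does not leave the main thread waiting forever.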