shards#
Interface to sharded repodata code.
Classes#
RepodataSubset | Build a subset of repodata by traversing all packages that are dependencies and transitive dependencies of a root set of packages. |
BuildRepodataSubset | Protocol for build_repodata_subset callable. |
Functions#
build_repodata_subset | Retrieve all necessary information to build a repodata subset. |
- class RepodataSubset(shardlikes: collections.abc.Iterable[conda._private.shards.shards.ShardBase], spec_to_package_name: collections.abc.Callable[[str], str] = spec_to_package_name, repodata_version: int = 1, depth: int = sys.maxsize)#
Build a subset of repodata by traversing all packages that are dependencies and transitive dependencies of a root set of packages.
- DEFAULT_STRATEGY = 'pipelined'#
- _spec_to_package_name: collections.abc.Callable[[str], str]#
- _repodata_version = 1#
- depth = 9223372036854775807#
- classmethod has_strategy(strategy: str) -> bool#
Return True if this class provides the named shard traversal strategy.
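A lookup like this can be sketched with a naming convention. The class documents reachable_bfs() and reachable_pipelined(), so the sketch assumes that a strategy named x is provided by a method named reachable_x. That convention, and the stand-in class ToySubset, are assumptions for illustration, not conda's implementation.

```python
class ToySubset:
    """Hypothetical stand-in for RepodataSubset; not conda's implementation."""

    DEFAULT_STRATEGY = "pipelined"

    @classmethod
    def has_strategy(cls, strategy: str) -> bool:
        # Assumption: strategy "x" is provided by a method named "reachable_x".
        return callable(getattr(cls, f"reachable_{strategy}", None))

    def reachable_bfs(self, root_packages):
        return "bfs"  # placeholder body

    def reachable_pipelined(self, root_packages):
        return "pipelined"  # placeholder body

    def reachable(self, root_packages, *, strategy=DEFAULT_STRATEGY):
        # Dispatch to the named strategy, rejecting unknown names early.
        if not self.has_strategy(strategy):
            raise ValueError(f"unknown strategy: {strategy!r}")
        return getattr(self, f"reachable_{strategy}")(root_packages)
```

With this layout, ToySubset.has_strategy("bfs") is true and ToySubset.has_strategy("dfs") is false, so a caller can validate a user-supplied algorithm name before starting any traversal.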
- _neighbors(node: Node) -> collections.abc.Iterator[Node]#
Retrieve all unvisited neighbors of a node.
In this context, the neighbors of a node are the dependencies of its package.
NOTE: This method assumes that the required shards have already been retrieved from the network via batch_retrieve_from_network() before _neighbors() is called. It uses visit_package() to access already-loaded shards.
- reachable(root_packages, *, strategy=DEFAULT_STRATEGY) -> None#
Run the named reachability strategy, or the default strategy if none is given.
Update self.shardlikes with reachable package records. Later, [shardlike.build_repodata() for shardlike in shardlikes] can be used to generate repodata.json-format subsets of each channel.
- reachable_bfs(root_packages)#
Fetch all packages reachable from root_packages by following dependencies, using a breadth-first search.
Update associated self.shardlikes to contain enough data to build a repodata subset.
- _reachable_bfs(root_packages, shard_cache: conda._private.shards.cache.ShardCache)#
Inner reachable_bfs() implementation.
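The breadth-first traversal described above can be illustrated on a toy dependency graph. This is a minimal sketch, not conda's code: the function name bfs_reachable, the in-memory dependencies mapping (standing in for shard data), and the exact meaning of the depth limit (nodes at the limit are included but not expanded) are all assumptions.

```python
from collections import deque


def bfs_reachable(dependencies, root_packages, depth=None):
    """Return the set of package names reachable from root_packages.

    `dependencies` maps a package name to the names it depends on.
    It is a toy in-memory graph standing in for shard data.
    """
    roots = list(root_packages)
    visited = set(roots)
    frontier = deque((name, 0) for name in roots)
    while frontier:
        name, level = frontier.popleft()
        if depth is not None and level >= depth:
            continue  # assumed semantics: keep the node, do not expand it
        for dep in dependencies.get(name, ()):
            if dep not in visited:
                visited.add(dep)
                frontier.append((dep, level + 1))
    return visited
```

Because every package is added to visited before it is queued, each one is expanded at most once, even when several packages depend on it.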
- reachable_pipelined(root_packages)#
Fetch all packages reachable from root_packages by following dependencies.
Build repodata subset using concurrent threads to follow dependencies, fetch from cache, and fetch from network.
- _reachable_pipelined(root_packages, network_worker: collections.abc.Callable[[queue.SimpleQueue[collections.abc.Sequence[NodeId] | None], queue.SimpleQueue[list[tuple[NodeId, conda._private.shards.typing.ShardDict] | Exception]], RepodataSubset._reachable_pipelined.cache, collections.abc.Sequence[conda._private.shards.shards.ShardBase]], None], cache: RepodataSubset._reachable_pipelined.cache)#
Set up queues and threads for shard traversal with a configurable network_worker. Called by reachable_pipelined().
- _pipelined_traversal(root_packages, cache_in_queue: queue.SimpleQueue[list[NodeId] | None], shard_out_queue: queue.SimpleQueue[list[tuple[NodeId, conda._private.shards.typing.ShardDict]] | Exception], cache_thread: threading.Thread, network_thread: threading.Thread)#
Run reachability algorithm given queues to submit and receive shards.
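The queue-and-thread arrangement described above can be sketched with the standard library. This is an illustrative model only: the CACHED and REMOTE dictionaries, the worker functions, and the use of plain package names instead of NodeId objects are assumptions, and error propagation through the output queue is omitted. A None sentinel shuts down each stage, mirroring the `| None` in the queue types shown in the signatures.

```python
import queue
import threading

# Toy data: shards "already cached" and shards only available "on the network".
CACHED = {"a": ["b", "c"], "b": []}
REMOTE = {"c": ["d"], "d": []}


def cache_worker(in_q, network_q, out_q):
    """Serve cached shards; forward cache misses to the network stage."""
    while (batch := in_q.get()) is not None:
        hits = [(name, CACHED[name]) for name in batch if name in CACHED]
        misses = [name for name in batch if name not in CACHED]
        if hits:
            out_q.put(hits)
        if misses:
            network_q.put(misses)
    network_q.put(None)  # propagate shutdown to the next stage


def network_worker(network_q, out_q):
    """Pretend to download shards the cache did not have."""
    while (batch := network_q.get()) is not None:
        out_q.put([(name, REMOTE.get(name, [])) for name in batch])


def pipelined_reachable(root_packages):
    cache_in, network_in, shard_out = (queue.SimpleQueue() for _ in range(3))
    threads = [
        threading.Thread(target=cache_worker, args=(cache_in, network_in, shard_out)),
        threading.Thread(target=network_worker, args=(network_in, shard_out)),
    ]
    for thread in threads:
        thread.start()
    seen = set(root_packages)
    outstanding = len(seen)  # shards requested but not yet received
    cache_in.put(list(seen))
    while outstanding:
        for _name, deps in shard_out.get():
            outstanding -= 1
            new = list(dict.fromkeys(d for d in deps if d not in seen))
            seen.update(new)
            if new:
                outstanding += len(new)
                cache_in.put(new)
    cache_in.put(None)  # no more work: shut the pipeline down
    for thread in threads:
        thread.join()
    return seen
```

The traversal thread never blocks on a slow download while cached shards are available, because both workers write to the same output queue; the outstanding counter tells it when every requested shard has come back.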
- _visit_node(parent_node: Node, mentioned_packages: collections.abc.Iterable[str]) -> collections.abc.Iterable[NodeId]#
Broadcast mentioned packages across channels. Yield pending NodeIds.
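"Broadcasting" here can be read as pairing every mentioned package name with every channel, since a dependency may be satisfied from any of them. The sketch below shows that idea with a hypothetical two-field NodeId and a free function broadcast(); the field names, the seen set, and the function signature are assumptions rather than conda's definitions.

```python
from typing import Iterable, Iterator, NamedTuple, Set


class NodeId(NamedTuple):
    """Hypothetical node key: one package name in one channel."""

    package: str
    channel: str


def broadcast(
    mentioned_packages: Iterable[str],
    channels: Iterable[str],
    seen: Set[NodeId],
) -> Iterator[NodeId]:
    """Yield each (package, channel) pair that has not been seen yet."""
    channels = list(channels)  # reused for every package
    for package in mentioned_packages:
        for channel in channels:
            node_id = NodeId(package, channel)
            if node_id not in seen:
                seen.add(node_id)  # mark as pending so it is yielded once
                yield node_id
```

Recording each NodeId in seen as it is yielded keeps a package that many others depend on from being queued repeatedly.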
- _drain_pending(pending: set[NodeId], shardlikes_by_url: dict[str, conda._private.shards.shards.ShardBase]) -> tuple[list[tuple[NodeId, conda._private.shards.typing.ShardDict]], list[NodeId]]#
Check pending for shards that are already in memory, then clear pending.
Return a list of (NodeId, shard) pairs that are already available and a list of NodeIds that still need to be fetched.
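The have/need split can be sketched as a simple partition. In this sketch the in_memory dictionary stands in for whatever lookup the real method performs through shardlikes_by_url, node ids are plain tuples, and sorting is added only to make the output deterministic; all three are assumptions.

```python
def drain_pending(pending, in_memory):
    """Split pending node ids into shards we have and ids we must fetch.

    `pending` is a set of node ids; `in_memory` maps node id -> shard.
    The set is emptied in place, matching the "Clear pending" behavior.
    """
    have, need = [], []
    for node_id in sorted(pending):  # sorted only for reproducible output
        if node_id in in_memory:
            have.append((node_id, in_memory[node_id]))
        else:
            need.append(node_id)
    pending.clear()
    return have, need
```

Clearing the set in place lets the caller keep a single pending set across iterations of the traversal loop.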
- build_repodata_subset(root_packages: collections.abc.Iterable[str], channels: dict[str, conda.models.channel.Channel], algorithm: Literal['bfs', 'pipelined'] = RepodataSubset.DEFAULT_STRATEGY, spec_to_package_name_func: collections.abc.Callable[[str], str] = spec_to_package_name, repodata_version: int = 1, depth: int = sys.maxsize) -> dict[str, conda._private.shards.shards.ShardBase] | None#
Retrieve all necessary information to build a repodata subset.
This function implements the conda.gateways.shards.BuildRepodataSubset protocol, allowing it to be passed to solvers that support sharded repodata optimization.
- Parameters:
root_packages -- Iterable of installed and requested package names.
channels -- Channel objects; dict form preferred.
algorithm -- Desired traversal algorithm ("bfs" or "pipelined").
spec_to_package_name_func -- Callable to convert package specs to names. Defaults to the standard spec_to_package_name.
repodata_version -- repodata format version (1 = classic, 3 = v3).
depth -- The maximum depth of dependent packages to include in the repodata subset.
- Returns:
None if there are no shards available, or a mapping of channel URLs to ShardBase objects where build_repodata() returns the computed subset.
- class BuildRepodataSubset#
Bases: Protocol
Protocol for build_repodata_subset callable.
This function is used by solvers to construct a minimal subset of repodata based on the root packages that might be installed and the available channels. It traverses package dependencies to discover all reachable (channel, package) tuples, which are then used by the solver to reduce search space.
- __call__(root_packages: collections.abc.Iterable[str], channels: dict[str, Any], algorithm: Literal['bfs', 'pipelined'] = 'pipelined', repodata_version: int = 1) -> dict[str, Shards] | None#
Retrieve a minimal subset of repodata based on root packages.
- Parameters:
root_packages -- Iterable of installed and requested package names
channels -- Dictionary mapping channel URLs to Channel objects
algorithm -- Traversal algorithm to use ("bfs" or "pipelined")
repodata_version -- repodata format version (1 = classic, 3 = v3).
- Returns:
A dictionary mapping channel URLs to Shards objects containing the subset of packages needed, or None if shards are unavailable.
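To show how a callable protocol of this shape is used, the sketch below defines a local protocol with the same four parameters and a fake implementation that satisfies it. Nothing here imports conda: SubsetBuilder, fake_build_repodata_subset, solve, and the example channel URL are hypothetical names, and the fallback on None is an assumed solver behavior.

```python
from collections.abc import Iterable
from typing import Any, Literal, Optional, Protocol


class SubsetBuilder(Protocol):
    """Local stand-in mirroring the documented __call__ signature."""

    def __call__(
        self,
        root_packages: Iterable[str],
        channels: dict[str, Any],
        algorithm: Literal["bfs", "pipelined"] = "pipelined",
        repodata_version: int = 1,
    ) -> Optional[dict[str, Any]]: ...


def fake_build_repodata_subset(
    root_packages, channels, algorithm="pipelined", repodata_version=1
):
    """Fake builder: returns None when there is nothing to work with."""
    if not channels:
        return None
    roots = sorted(set(root_packages))
    return {url: {"roots": roots, "algorithm": algorithm} for url in channels}


def solve(build: SubsetBuilder, specs):
    """Hypothetical solver entry point that accepts any conforming callable."""
    subset = build(specs, {"https://example.invalid/channel": object()})
    if subset is None:
        return None  # assumption: caller falls back to full repodata
    return sorted(subset)
```

Because the protocol is structural, a plain function with a matching signature conforms without inheriting from anything, which is what allows a function such as build_repodata_subset to be handed to a solver directly.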