shards#

Classes#

ShardBase

Abstract base class for shard-like objects.

ShardLike

Present a "classic" repodata.json as per-package shards.

Shards

Handle repodata_shards.msgpack.zst and individual per-package shards.

Functions#

fetch_channels(→ dict[str, ShardBase] | None)

fetch_shards_index(→ Shards | None)

Check a SubdirData's URL for shards.

class ShardBase#

Bases: abc.ABC

Abstract base class for shard-like objects.

Defines the common interface for both sharded repodata (Shards) and monolithic repodata presented as shards (ShardLike).

url: str#
repodata_no_packages: conda._private.shards.typing.RepodataDict#
visited: dict[str, conda._private.shards.typing.ShardDict | None]#
_base_url: str#
property package_names: collections.abc.KeysView[str]#
Abstractmethod:

Return the names of all packages available in this shard collection.

property base_url: str#

Return self.url joined with base_url from repodata, or self.url if no base_url was present. Packages are found here.

Note base_url can be a relative or an absolute url. Uses _safe_urljoin_with_slash to handle non-HTTP schemes (s3://, etc.).
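As a rough illustration of why non-HTTP schemes need special handling, the sketch below joins a relative or absolute base_url onto a repodata URL. The function join_url is hypothetical and is not conda's _safe_urljoin_with_slash; the scheme-swapping trick is only one assumed way to work around urllib.parse.urljoin, which leaves relative references unresolved for schemes it does not recognize, such as s3://.

```python
from urllib.parse import urljoin, urlsplit

# Schemes that urljoin is known to resolve relative references for.
_JOINABLE = {"", "http", "https", "file"}


def join_url(base: str, relative: str) -> str:
    """Hypothetical helper: join `relative` onto `base` for any scheme."""
    if not relative:
        # No base_url in the repodata: fall back to the URL itself.
        return base
    if urlsplit(relative).scheme:
        # An absolute base_url replaces the original URL entirely.
        return relative
    scheme = urlsplit(base).scheme
    if scheme in _JOINABLE:
        return urljoin(base, relative)
    # For other schemes (s3://, etc.), borrow urljoin's path logic by
    # temporarily swapping in "http", then restore the real scheme.
    placeholder = "http" + base[len(scheme):]
    return scheme + urljoin(placeholder, relative)[len("http"):]


print(join_url("s3://bucket/ch/linux-64/repodata.json", "pkgs/"))
```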

__contains__(package: str) bool#

Check if a package is available in this shard collection.

abstractmethod shard_url(package: str) str#

Return shard URL for a given package. For monolithic repodata, this URL should not be fetched; it only serves as a unique identifier.

Raise KeyError if package is not in the index.

abstractmethod shard_loaded(package: str) bool#

Return True if the given package's shard is in memory.

visit_package(package: str) conda._private.shards.typing.ShardDict#

Return a shard that is already loaded in memory and mark it as visited.

visit_shard(package: str, shard: conda._private.shards.typing.ShardDict)#

Store new shard data in the visited dict.

build_repodata() conda._private.shards.typing.RepodataDict#

Return monolithic repodata including all visited shards.

Does not return "v3" repodata.

Prefer iter_records_v3() over this method.
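The merge that build_repodata() describes can be pictured with the following minimal sketch. It assumes each visited shard is a dict with "packages" and "packages.conda" sections keyed by filename, and that unfetched entries in visited are None; merge_visited_shards is a hypothetical name, not part of this module.

```python
def merge_visited_shards(repodata_no_packages: dict, visited: dict) -> dict:
    """Hypothetical sketch: combine visited shards into one repodata dict."""
    # Start from the metadata-only repodata and add empty package sections.
    repodata = {**repodata_no_packages, "packages": {}, "packages.conda": {}}
    for shard in visited.values():
        if shard is None:
            # Assumed convention: None marks a package with no loaded shard.
            continue
        for section in ("packages", "packages.conda"):
            repodata[section].update(shard.get(section, {}))
    return repodata


visited = {
    "numpy": {
        "packages": {},
        "packages.conda": {"numpy-2.0.0-0.conda": {"name": "numpy"}},
    },
    "scipy": None,
}
merged = merge_visited_shards({"info": {"subdir": "linux-64"}}, visited)
print(sorted(merged["packages.conda"]))
```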

iter_records() collections.abc.Iterable[tuple[str, dict]]#

Yield (filename, record) tuples for all packages in visited shards.

iter_records_v3() collections.abc.Iterable[tuple[tuple[str, str], dict]]#

Yield ((key, section), record) tuples for all packages in visited shards.

Section can be: "packages" for .tar.bz2 packages, "packages.conda" for .conda packages, "v3.whl", "v3.conda", "v3.tar.bz2" for v3 packages.

key is the same as the filename for "packages" and "packages.conda", but differs from the filename for v3 packages.
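To make the shape of the yielded tuples concrete, here is a small sketch restricted to the two classic sections. The v3 sections are omitted because their in-shard layout is not described here; iter_classic_records and the sample data are hypothetical.

```python
from collections.abc import Iterator


def iter_classic_records(visited: dict) -> Iterator[tuple[tuple[str, str], dict]]:
    """Hypothetical sketch: yield ((key, section), record) for classic sections."""
    for shard in visited.values():
        if shard is None:
            continue
        for section in ("packages", "packages.conda"):
            for filename, record in shard.get(section, {}).items():
                # For classic sections the key is simply the filename.
                yield (filename, section), record


visited = {
    "zlib": {
        "packages": {"zlib-1.3-0.tar.bz2": {"name": "zlib"}},
        "packages.conda": {"zlib-1.3-0.conda": {"name": "zlib"}},
    }
}
records = list(iter_classic_records(visited))
```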

class ShardLike(repodata: conda._private.shards.typing.RepodataDict, url: str = '')#

Bases: ShardBase

Present a "classic" repodata.json as per-package shards.

url: must be unique among all ShardLike instances used together.
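One way to picture the per-package view is to group the records of a monolithic repodata dict by their "name" field. The sketch below does only that; split_by_package is a hypothetical name and the real class may organize its data differently.

```python
def split_by_package(repodata: dict) -> dict[str, dict]:
    """Hypothetical sketch: group classic repodata records into per-package shards."""
    shards: dict[str, dict] = {}
    for section in ("packages", "packages.conda"):
        for filename, record in repodata.get(section, {}).items():
            # Each shard mirrors the two classic sections of repodata.json.
            shard = shards.setdefault(
                record["name"], {"packages": {}, "packages.conda": {}}
            )
            shard[section][filename] = record
    return shards


repodata = {
    "info": {"subdir": "noarch"},
    "packages": {"six-1.16-0.tar.bz2": {"name": "six"}},
    "packages.conda": {
        "six-1.16-0.conda": {"name": "six"},
        "idna-3.7-0.conda": {"name": "idna"},
    },
}
shards = split_by_package(repodata)
print(sorted(shards))
```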

repodata_no_packages: conda._private.shards.typing.RepodataDict#
url = ''#
shards: dict[str, conda._private.shards.typing.ShardDict]#
visited: dict[str, conda._private.shards.typing.ShardDict | None]#
__repr__()#
property package_names: collections.abc.KeysView[str]#

Return the names of all packages available in this shard collection.

shard_url(package: str) str#

Return shard URL for a given package.

Raise KeyError if package is not in the index.

shard_loaded(package: str) bool#

Return True if the given package's shard is in memory.

visit_package(package: str) conda._private.shards.typing.ShardDict#

Return a shard that is already in memory and mark it as visited.

class Shards(shards_index: conda._private.shards.typing.ShardsIndexDict, url: str)#

Bases: ShardBase

Handle repodata_shards.msgpack.zst and individual per-package shards.

Parameters:
  • shards_index -- raw parsed msgpack dict. Don't change it, or base_url and shards_base_url will be wrong.

  • url -- URL of repodata_shards.msgpack.zst

_shards_base_url: str#
shards_index#
url#
_base_url#
session#
repodata_no_packages#
visited: dict[str, conda._private.shards.typing.ShardDict | None]#
_shard_url_cache: dict[str, str]#
property package_names#

Return the names of all packages available in this shard collection.

property packages_index#
property shards_base_url: str#

Return self.url joined with shards_base_url. Note shards_base_url can be a relative or an absolute url.

shard_url(package: str) str#

Return shard URL for a given package.

Raise KeyError if package is not in the index.
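The following sketch shows how a shard URL might be derived, assuming the layout described in the sharded repodata proposal (CEP 16): the index has a "shards" mapping from package name to a SHA-256 digest, and each shard is stored as the hex digest plus ".msgpack.zst" under shards_base_url. The function name and sample values are hypothetical, and the real method may add caching or other details.

```python
def sketch_shard_url(shards_index: dict, shards_base_url: str, package: str) -> str:
    """Hypothetical sketch: build a shard URL from the index entry for `package`."""
    # A missing package raises KeyError, matching the documented behavior.
    digest = shards_index["shards"][package]
    # Assumption: digests are raw bytes after msgpack parsing; accept hex str too.
    hex_digest = digest.hex() if isinstance(digest, bytes) else digest
    return f"{shards_base_url}{hex_digest}.msgpack.zst"


index = {"shards": {"python": bytes.fromhex("ab" * 32)}}
url = sketch_shard_url(index, "https://example.com/ch/linux-64/shards/", "python")
print(url)
```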

shard_loaded(package: str) bool#

Return True if the given package's shard is in memory.

visit_package(package: str) conda._private.shards.typing.ShardDict#

Return a shard that is already in memory and mark it as visited.

fetch_channels(url_to_channel: dict[str, conda.models.channel.Channel]) dict[str, ShardBase] | None#
Parameters:

url_to_channel -- not modified, must already be expanded to subdirs.

Attempt to fetch the sharded index first and then fall back to retrieving a monolithic repodata.json file.

Returns:

A dict mapping channel URLs to Shards or ShardLike objects, or None if no channels have shards. The dict preserves the key order of the input url_to_channel.

fetch_shards_index(sd: conda.core.subdir_data.SubdirData) Shards | None#

Check a SubdirData's URL for shards.

Return a Shards object built from the shards index, loaded from cache or network. Return None if not found; the caller should then fetch normal repodata.

TODO: If this function fails to retrieve the sharded repodata index file, it marks the channel in the cache as not supporting this feature. This can be problematic because transient server errors will lead it to wrongly assume that the channel doesn't support sharding. We need to rethink our logic for determining shard support.