developmentseed/object-store-py

object-store-py

A Python interface and pyo3 integration to the Rust object_store crate, providing a uniform API for interacting with object storage services and local files.

Run the same code in multiple clouds via a simple runtime configuration change.

  • Easy to install with no Python dependencies.
  • Sync and async API.
  • Streaming downloads with configurable chunking.
  • Automatically supports multipart uploads under the hood for large file objects.
  • The underlying Rust library is production-quality and is used in large-scale production systems, such as the Rust package registry crates.io.
  • Simple API with static type checking.
  • Helpers for constructing stores from environment variables and boto3.Session objects.

Supported object storage providers include:

  • Amazon S3 and S3-compliant APIs like Cloudflare R2
  • Google Cloud Storage
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • Local filesystem
  • In-memory storage

Installation

pip install object-store-py

Documentation

Full documentation is available on the website.

Usage

Constructing a store

Classes to construct a store are exported from the object_store_py.store submodule:

  • S3Store: Configure a connection to Amazon S3.
  • GCSStore: Configure a connection to Google Cloud Storage.
  • AzureStore: Configure a connection to Microsoft Azure Blob Storage.
  • HTTPStore: Configure a connection to a generic HTTP server.
  • LocalStore: Local filesystem storage providing the same object store interface.
  • MemoryStore: A fully in-memory implementation of ObjectStore.

Example

import boto3
from object_store_py.store import S3Store

session = boto3.Session()
store = S3Store.from_session(session, "bucket-name", config={"AWS_REGION": "us-east-1"})

Configuration

Each store class above has its own configuration, passed through the config named parameter. The available options are covered in the docs, and the accepted keys are also listed as string literals in the type hints.

Additional HTTP client configuration is available via the client_options named parameter.
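To picture how environment variables could feed a config mapping like the one in the S3Store example above, here is a minimal sketch. The helper name collect_config is hypothetical and is not part of object_store_py; the library ships its own helpers for this, so treat the code only as an illustration of the idea.

```python
import os
from typing import Mapping, Optional


def collect_config(prefix: str = "AWS_",
                   environ: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: gather prefixed environment variables into a dict."""
    env = os.environ if environ is None else environ
    return {key: value for key, value in env.items() if key.startswith(prefix)}


# Use an explicit mapping so the result does not depend on the real environment.
example_env = {"AWS_REGION": "us-east-1", "HOME": "/home/user"}
config = collect_config(environ=example_env)
print(config)
```

Passing the mapping explicitly keeps the helper easy to test; calling collect_config() with no arguments would read from os.environ instead.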

Interacting with a store

All methods for interacting with a store are exported as top-level functions (not methods on the store object):

  • copy: Copy an object from one path to another in the same object store.
  • delete: Delete the object at the specified location.
  • get: Return the bytes that are stored at the specified location.
  • head: Return the metadata for the specified location.
  • list: List all the objects with the given prefix.
  • put: Save the provided bytes to the specified location.
  • rename: Move an object from one path to another in the same object store.
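One consequence of this function-based design is that application code depends only on the functions, not on the concrete store type. The sketch below mimics that calling convention with a hypothetical DictStore class and stand-in put, get, and copy functions. None of these are the library's implementation; they only show the shape of the API.

```python
class DictStore:
    """Hypothetical stand-in for a store: keeps objects in a plain dict."""

    def __init__(self) -> None:
        self.objects = {}


def put(store: DictStore, path: str, data: bytes) -> None:
    # Save the provided bytes at the given path.
    store.objects[path] = bytes(data)


def get(store: DictStore, path: str) -> bytes:
    # Return the bytes stored at the given path.
    return store.objects[path]


def copy(store: DictStore, src: str, dst: str) -> None:
    # Copy an object from one path to another in the same store.
    store.objects[dst] = store.objects[src]


def backup(store: DictStore, path: str) -> str:
    """Application code: written against the functions, not the store type."""
    target = path + ".bak"
    copy(store, path, target)
    return target


store = DictStore()
put(store, "file.txt", b"hello world!")
backup_path = backup(store, "file.txt")
print(backup_path, get(store, backup_path))
```

Because backup only receives a store argument, swapping in a different store is a change at the call site rather than inside the function.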

There are a few additional APIs useful for specific use cases, such as get_range, which fetches a byte range of an object and appears in the example below.

All methods have a comparable async method with the same name plus an _async suffix.

Example

import object_store_py as obs

store = obs.store.MemoryStore()

obs.put(store, "file.txt", b"hello world!")
response = obs.get(store, "file.txt")
response.meta
# {'path': 'file.txt',
#  'last_modified': datetime.datetime(2024, 10, 21, 16, 19, 45, 102620, tzinfo=datetime.timezone.utc),
#  'size': 12,
#  'e_tag': '0',
#  'version': None}
assert response.bytes() == b"hello world!"

byte_range = obs.get_range(store, "file.txt", offset=0, length=5)
assert byte_range == b"hello"

obs.copy(store, "file.txt", "other.txt")
assert obs.get(store, "other.txt").bytes() == b"hello world!"
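Judging by the get_range assertion above, offset=0 and length=5 select the first five bytes of the object, which matches a plain Python slice. The helper below is a hypothetical pure-Python equivalent for in-memory bytes, handy for reasoning about which bytes a range request should return. It is not how the library fetches ranges.

```python
def slice_range(data: bytes, offset: int, length: int) -> bytes:
    """Return `length` bytes of `data`, starting at byte `offset`."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    return data[offset : offset + length]


payload = b"hello world!"
first = slice_range(payload, offset=0, length=5)
last = slice_range(payload, offset=6, length=6)
print(first, last)
```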

The same example written with the async counterparts:

import asyncio

import object_store_py as obs


async def main() -> None:
    store = obs.store.MemoryStore()

    await obs.put_async(store, "file.txt", b"hello world!")
    response = await obs.get_async(store, "file.txt")
    print(response.meta)
    # {'path': 'file.txt',
    #  'last_modified': datetime.datetime(2024, 10, 21, 16, 20, 36, 477418, tzinfo=datetime.timezone.utc),
    #  'size': 12,
    #  'e_tag': '0',
    #  'version': None}
    assert await response.bytes_async() == b"hello world!"

    byte_range = await obs.get_range_async(store, "file.txt", offset=0, length=5)
    assert byte_range == b"hello"

    await obs.copy_async(store, "file.txt", "other.txt")
    resp = await obs.get_async(store, "other.txt")
    assert await resp.bytes_async() == b"hello world!"


# Top-level await only works in an async REPL or notebook; a script must run the coroutine explicitly.
asyncio.run(main())
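A common reason to reach for the async functions is to issue several requests concurrently. The sketch below shows that pattern with asyncio.gather. Here fake_get_async is a hypothetical stand-in that sleeps instead of talking to a store, so the example runs without any storage backend; with the real library the awaited call would presumably be something like obs.get_async(store, path).

```python
import asyncio


async def fake_get_async(path: str) -> bytes:
    """Hypothetical stand-in for a store read: waits briefly, then returns bytes."""
    await asyncio.sleep(0.01)
    return f"contents of {path}".encode()


async def fetch_all(paths: list) -> dict:
    # gather schedules every coroutine at once and returns results in input order.
    results = await asyncio.gather(*(fake_get_async(p) for p in paths))
    return dict(zip(paths, results))


fetched = asyncio.run(fetch_all(["a.txt", "b.txt", "c.txt"]))
print(fetched)
```

Since gather preserves input order, zipping the paths with the results pairs each path with its own payload.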

Comparison to object-store-python

Read a detailed comparison to object-store-python, a previous Python library that also wraps the same Rust object_store crate.
