Module vs_object_store

Source
Available on docsrs only.
Expand description

§OpenDAL vs object_store

NOTE: This document is written by OpenDAL’s maintainers and not reviewed by object_store’s maintainers. So it could not be very objective.

§About object_store

object_store is

A focused, easy to use, idiomatic, high performance, async object store library interacting with object stores.

It was initially developed for InfluxDB IOx and later split out and donated to Apache Arrow.

§Similarities

§Language

Yes, of course. Both opendal and object_store are developed in Rust, a language empowering everyone to build reliable and efficient software.

§License

Both opendal and object_store are licensed under Apache 2.0.

§Owner

object_store is a part of Apache Arrow which means it’s hosted and maintained by the Apache Software Foundation.

opendal is now hosted by the Apache Software Foundation also.

§Domain

Both opendal and object_store can be used to access data stored on object storage services. The primary users of those projects are both cloud-native databases too:

  • opendal is mainly used by:
    • databend: A modern Elasticity and Performance cloud data warehouse
    • GreptimeDB: An open-source, cloud-native, distributed time-series database.
    • mozilla/sccache: sccache is ccache with cloud storage
    • risingwave: A Distributed SQL Database for Stream Processing
    • Vector: A high-performance observability data pipeline.
  • object_store is mainly used by:
    • datafusion: Apache Arrow DataFusion SQL Query Engine
    • Influxdb IOx: The new core of InfluxDB is written in Rust on top of Apache Arrow.

§Differences

§Vision

opendal is Open Data Access Layer that accesses data freely, painlessly, and efficiently. object_store is more focused on async object store support.

You will see the different visions lead to very different routes.

§Design

object_store exposed a trait called ObjectStore to users.

Users need to build a dyn ObjectStore and operate on it directly:

let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
let path: Path = "data/file01.parquet".try_into()?;
let stream = object_store
    .get(&path)
    .await?
    .into_stream();

opendal has a similar trait called Access

But opendal don’t expose this trait to end users directly. Instead, opendal expose a new struct called Operator and builds public API on it.

let op: Operator = Operator::from_env(Scheme::S3)?;
let r = op.reader("data/file01.parquet").await?;

§Interception

Both object_store and opendal provide a mechanism to intercept operations.

object_store called Adapters:

let object_store = ThrottledStore::new(get_object_store(), ThrottleConfig::default())

opendal called Layer:

let op = op.layer(TracingLayer).layer(MetricsLayer);

At the time of writing:

object_store (v0.5.0) supports:

  • ThrottleStore: Rate Throttling
  • LimitStore: Concurrent Request Limit

opendal supports:

  • ImmutableIndexLayer: immutable in-memory index.
  • LoggingLayer: logging.
  • MetadataCacheLayer: metadata cache.
  • ContentCacheLayer: content data cache.
  • MetricsLayer: metrics
  • RetryLayer: retry
  • SubdirLayer: Allow switch directory without changing original operator.
  • TracingLayer: tracing

§Services

opendal and object_store have different visions, so they have other services support:

serviceopendalobject_store
azblobYY
fsYY
ftpYN
gcsYY
hdfsYY (via datafusion-objectstore-hdfs)
httpY (read only)N
ipfsY (read only)N
ipmfsYN
memoryYY
obsYN
s3YY

opendal has an idea called Capability, so it’s services may have different capability sets. For example, opendal’s http and ipfs are read only.

§Features

opendal and object_store have different visions, so they have different feature sets:

opendalobject_storenotes
metadata-get some metadata from underlying storage
createput-
readget-
readget_range-
-get_rangesopendal doesn’t support read multiple ranges
writeput-
stathead-
deletedelete-
-listopendal doesn’t support list with prefix
listlist_with_delimiter-
-copy-
-copy_if_not_exists-
-rename-
-rename_if_not_exists-
presign-get a presign URL of object
multipartmultipartboth support, but API is different
blocking-opendal supports blocking API

§Demo show

The most straightforward complete demo how to read a file from s3:

opendal

let mut builder = S3::default();
builder.bucket("example");
builder.access_key_id("access_key_id");
builder.secret_access_key("secret_access_key");

let store = Operator::new(builder)?.finish();
let r = store.reader("data.parquet").await?;

object_store

let mut builder = AmazonS3Builder::new()
.with_bucket_name("example")
.with_access_key_id("access_key_id")
.with_secret_access_key("secret_access_key");

let store = Arc::new(builder.build()?);
let path: Path = "data.parquet".try_into().unwrap();
let stream = store.get(&path).await()?.into_stream();