Releases: lancedb/lance

v0.2.6 Schema evolution bug fixes, Google Colab support, and more datasets

13 Dec 02:41

What's Changed

  • [C++] Remove unused Reader APIs by @eddyxu in #344
  • [Python] fix timezone issue with version timestamp by @changhiskhan in #345
  • [C++] add Dataset::Make(string) API by @eddyxu in #346
  • [DUCKDB] Native duckdb lance reader by @eddyxu in #347
  • [DUCKDB] Read a specific version of dataset by @eddyxu in #350
  • [DUCKDB] Fix duckdb manylinux build by @eddyxu in #351
  • [Python] Add colab badge to notebooks by @eddyxu in #354
  • [Notebook] ML dev cycle for DINO by @eddyxu in #355
  • [DUCKDB] fix type mapping for other int types by @changhiskhan in #359
  • [Python] Fix lance.dataset opening a local relative path by @eddyxu in #365
  • [C++] Store relative path for data files by @eddyxu in #368
  • [C++] Add RAII util (defer) to auto cleanup / close resources after exiting the scope by @eddyxu in #369
  • [Python] Conversion of ImageNet 1K into Lance dataset by @eddyxu in #366
  • [Python] Imagenet data quality analytics notebook by @eddyxu in #370
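
The RAII `defer` utility added in #369 is in the family of C++ scope guards: an object whose destructor runs a cleanup callback when the enclosing scope exits. The following is a minimal sketch of that general pattern, not Lance's actual implementation; the names `Defer` and `RunDemo` are hypothetical and chosen only for illustration.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard (an assumption for illustration, not the Lance API):
// runs the stored callback when the guard object is destroyed.
class Defer {
 public:
  explicit Defer(std::function<void()> fn) : fn_(std::move(fn)) {}
  ~Defer() {
    if (fn_) fn_();
  }

  // Non-copyable, so the cleanup runs exactly once.
  Defer(const Defer&) = delete;
  Defer& operator=(const Defer&) = delete;

 private:
  std::function<void()> fn_;
};

// Records the order in which "resources" are opened and cleaned up.
std::vector<std::string> RunDemo() {
  std::vector<std::string> events;
  {
    events.push_back("open a");
    Defer close_a([&events] { events.push_back("close a"); });

    events.push_back("open b");
    Defer close_b([&events] { events.push_back("close b"); });

    events.push_back("work");
  }  // Guards are destroyed here in reverse order of declaration.
  return events;
}
```

Because local objects are destroyed in reverse order of construction, `close_b` runs before `close_a`, mirroring how nested resources are usually released. The cleanup also runs on early returns and when an exception unwinds the scope.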

Full Changelog: v0.2.5...v0.2.6

v0.2.5 Schema evolution, support merging with arrow Table

02 Dec 06:15

What's Changed

  • [DOC] Fix notebook build by @eddyxu in #339
  • [Python] lance.write_dataset takes pandas DataFrame by @eddyxu in #342
  • [DOC] update readme docs to cater for import pathways from df/parquet by @jaichopra in #340
  • [Python] Improve PyTorch dataset ergonomics by @eddyxu in #336
  • [C++] Add columns from in-memory table by @eddyxu in #337
  • [Python] append column with an in-memory PyArrow Table by @eddyxu in #338
  • [C++][Python] Add timestamp to each manifest version. by @eddyxu in #343

Full Changelog: v0.2.4...v0.2.5

v0.2.4: Schema Evolution and Append Column

28 Nov 21:25

Support Schema Evolution via Append Column.

What's Changed

  • [Notebook] fixes for notebook backing the blog post by @changhiskhan in #316
  • [C++] Append column by @eddyxu in #299
  • [Python] Append columns by @eddyxu in #318
  • Use column projection during update by @eddyxu in https://github.com//pull/322
  • update to duckdb 0.6 by @changhiskhan in #312
  • [Python] Support add column via Expression. by @eddyxu in #324
  • [Python] Expose projection for append column by @eddyxu in #325
  • [C++] Support column projection during add_columns via expression by @eddyxu in #326
  • [Python] PyTorch Dataset uses Fragment instead of files and supports versions by @eddyxu in #327
  • [C++] Move writer API to a private API by @eddyxu in #329
  • [C++] Refactor Metadata class to eliminate protobuf reference. by @eddyxu in #328
  • [C++] Performance profiling and improvement by @eddyxu in #333
  • [C++] Upgrade lq cmd tool to be able to inspect new versioned format by @eddyxu in #334

Full Changelog: v0.2.3...v0.2.4

v0.2.3 Bugfix release; breaks dataset proto schema

16 Nov 04:23

What's Changed

Full Changelog: v0.2.2...v0.2.3

v0.2.2 Python notebooks and CV dataset conversion.

09 Nov 17:25

What's Changed

Full Changelog: v0.2.1...v0.2.2

v0.2.1 Bug fix release

04 Nov 22:41

Fixed a bug affecting writes of fixed-size list arrays, as well as the datagen code for COCO.
Updated to the newly released Arrow 10.0.

What's Changed

Full Changelog: v0.2.0...v0.2.1

v0.2.0 Dataset Versioning, DuckDB extension built with CUDA

02 Nov 19:38

Highlights

  • Lance Dataset versioning support
  • DuckDB extension supports building against PyTorch with CUDA
  • Revamp README and documentation.

What's Changed

  • Fetch Dataset Versions by @eddyxu in #272
  • Readability improvement for metadata class by @Renkai in #275
  • [DuckDB] Enable DuckDB extension to build/run on CUDA-enabled PyTorch by @changhiskhan in #273
  • [Python] Support multi-versioned dataset by @eddyxu in #278
  • [Document] Add logo/README refresh by @jaichopra in #279
  • [Python] Fetch dataset versions. by @eddyxu in #280
  • [Python] Support group size and rows_per_file customization via write_dataset API by @eddyxu in #281
  • [Python] use new write API in python benchmark by @eddyxu in #282

Full Changelog: v0.1.5...v0.2.0

v0.1.5 Pandas Extension Type, Jupyter Notebook and Document Improvements

28 Oct 16:52

What's Changed

Full Changelog: v0.1.4...v0.1.5

v0.1.4: Var-length binary decoder performance improvements, Open Discord server for community.

16 Oct 17:18

What's Changed

  • CLI to inspect lance dataset by @eddyxu in #231
  • Generate primary key for Oxford Pet dataset by @eddyxu in #233
  • Fix datagen test by @eddyxu in #234
  • Add discord link and fix typo in README by @eddyxu in #236
  • Improve VarBinaryDecoder::Take performance by accumulating small batches by @eddyxu in #239

Full Changelog: v0.1.3...v0.1.4

v0.1.3 Document improvements and bug fixes

09 Oct 02:34

What's Changed

Full Changelog: v0.1.2...v0.1.3