Releases: lancedb/lance
v0.2.6 Schema evolution bug fixes, Google Colab support, and more datasets
What's Changed
- [C++] Remove unused Reader APIs by @eddyxu in #344
- [Python] fix timezone issue with version timestamp by @changhiskhan in #345
- [C++] add Dataset::Make(string) API by @eddyxu in #346
- [DUCKDB] Native duckdb lance reader by @eddyxu in #347
- [DUCKDB] Read a special version of dataset by @eddyxu in #350
- [DUCKDB] Fix duckdb manylinux build by @eddyxu in #351
- [Python] Add colab badge to notebooks by @eddyxu in #354
- [Notebook] ML dev cycle for DINO by @eddyxu in #355
- [DUCKDB] fix type mapping for other int types by @changhiskhan in #359
- [Python] Fix lance.dataset opening a local relative path by @eddyxu in #365
- [C++] Store relative path for data files by @eddyxu in #368
- [C++] Add RAII util (defer) to auto cleanup / close resources after exiting the scope by @eddyxu in #369
- [Python] Conversion of ImageNet 1K into a Lance dataset by @eddyxu in #366
- [Python] Imagenet data quality analytics notebook by @eddyxu in #370
Full Changelog: v0.2.5...v0.2.6
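The defer utility added in #369 is C++ and its actual interface is not shown in these notes. As a rough illustration of the same scope-exit cleanup idea, here is a minimal Python sketch using the standard library's `contextlib.ExitStack`. The function name and the logged strings are hypothetical; this is an analogue of the technique, not Lance code.

```python
from contextlib import ExitStack


def read_with_cleanup(log):
    """Record events in `log` to show when deferred cleanups run."""
    with ExitStack() as stack:
        log.append("open file")
        # Register cleanup right after acquiring the resource,
        # similar in spirit to a C++ defer/RAII guard.
        stack.callback(log.append, "close file")

        log.append("open connection")
        stack.callback(log.append, "close connection")

        log.append("do work")
    # Leaving the `with` block runs the callbacks in reverse
    # registration order, even if an exception was raised inside.
    return log


events = read_with_cleanup([])
```

Cleanups run last-in, first-out, so the connection is closed before the file, which mirrors how destructors unwind in a C++ scope.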
v0.2.5 Schema evolution, support merging with arrow Table
What's Changed
- [DOC] Fix notebook build by @eddyxu in #339
- [Python] lance.write_dataset takes pandas DataFrame by @eddyxu in #342
- [DOC] update readme docs to cater for import pathways from df/parquet by @jaichopra in #340
- [Python] Improve PyTorch dataset ergonomic by @eddyxu in #336
- [C++] Add columns from in-memory table by @eddyxu in #337
- [Python] append column with an in-memory PyArrow Table by @eddyxu in #338
- [C++][Python] Add timestamp to each manifest version. by @eddyxu in #343
Full Changelog: v0.2.4...v0.2.5
v0.2.4: Schema Evolution and Append Column
Support Schema Evolution via Append Column.
What's Changed
- [Notebook] fixes for notebook backing the blog post by @changhiskhan in #316
- [C++] Append column by @eddyxu in #299
- [Python] Append columns by @eddyxu in #318
- Use column projection during update by @eddyxu in #322
- update to duckdb 0.6 by @changhiskhan in #312
- [Python] Support add column via Expression. by @eddyxu in #324
- [Python] Expose projection for append column by @eddyxu in #325
- [C++] Support column projection during add_columns via expression by @eddyxu in #326
- [Python] Pytorch Dataset uses Fragment instead of files and support versions by @eddyxu in #327
- [C++] Move writer API to a private API by @eddyxu in #329
- [C++] Refactor Metadata class to eliminate protobuf reference. by @eddyxu in #328
- [C++] Performance profiling and improvement by @eddyxu in #333
- [C++] Upgrade lq cmd tool to be able to inspect new versioned format by @eddyxu in #334
Full Changelog: v0.2.3...v0.2.4
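To make "schema evolution via append column" concrete, the following is a toy in-memory model, not Lance's API or on-disk format. It assumes that each version is described by a manifest listing the visible columns, and that appending a column creates a new version without touching existing data. `ToyDataset`, `Manifest`, and their methods are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Manifest:
    version: int
    columns: tuple  # column names visible at this version
    timestamp: datetime


class ToyDataset:
    """Hypothetical model: column data is shared, manifests are per version."""

    def __init__(self, columns):
        self._data = {name: list(values) for name, values in columns.items()}
        self._manifests = [
            Manifest(1, tuple(self._data), datetime.now(timezone.utc))
        ]

    def append_column(self, name, values):
        num_rows = len(next(iter(self._data.values())))
        if name in self._data:
            raise ValueError(f"column {name!r} already exists")
        if len(values) != num_rows:
            raise ValueError("new column must have one value per existing row")
        # Existing columns are left untouched; only a new column and a
        # new manifest are added.
        self._data[name] = list(values)
        latest = self._manifests[-1]
        self._manifests.append(
            Manifest(
                latest.version + 1,
                latest.columns + (name,),
                datetime.now(timezone.utc),
            )
        )
        return self._manifests[-1].version

    def read(self, version=None):
        if version is None:
            manifest = self._manifests[-1]
        elif 1 <= version <= len(self._manifests):
            manifest = self._manifests[version - 1]
        else:
            raise ValueError(f"unknown version {version}")
        return {name: self._data[name] for name in manifest.columns}


ds = ToyDataset({"id": [1, 2]})
new_version = ds.append_column("label", ["cat", "dog"])
```

Reading version 1 still returns only the `id` column, while the latest version also exposes `label`; older readers are unaffected by the added column in this model.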
v0.2.3 Bugfix release; breaks dataset proto schema
What's Changed
- [C++] Project schema via field Ids and Schema intersection by @eddyxu in #305
- when writing in batches, handle all na arrays properly by @changhiskhan in #306
- [C++] Use LanceFragment to build I/O exec plan by @eddyxu in #307
- [CI] Fix Github Action warning to upgrade nodejs 12 based actions by @eddyxu in #309
- Update README.md by @changhiskhan in #310
- Temporarily pin duckdb to 0.5.1 by @changhiskhan in #313
- Notebook for new blog post on versioning by @changhiskhan in #311
- [C++] Fix reading dictionary values from manifest files by @eddyxu in #314
Full Changelog: v0.2.2...v0.2.3
v0.2.2 Python notebooks and CV dataset conversion.
What's Changed
- [DOC] Update README.md by @jaichopra in #294
- [DUCKDB] Script to upload lance extension zip by @changhiskhan in #295
- [C++] Scan Node reads multiple files by @eddyxu in #300
- [Python] Add lance.util.duckdb to help install the extension transparently by @changhiskhan in #301
- [Python] Notebook fixes by @changhiskhan in #303
- [Python] Make dataset conversion a feature by @changhiskhan in #304
Full Changelog: v0.2.1...v0.2.2
v0.2.1 Bug fix release
Fixed bug affecting writes of fixed size list arrays as well as datagen code for Coco.
Updated to the newly released Arrow 10.0.
What's Changed
- remove duplicate test_mac.sh by @changhiskhan in #284
- Fix build on intel mac by @eddyxu in #286
- [C++] Fix write fixed list array bug by @eddyxu in #288
- Upgrade Apache Arrow to 10.0 by @eddyxu in #266
- temporary hack to fix pytorch loader until it can handle a versioned … by @changhiskhan in #293
- fix image_id alignment in coco datagen by @changhiskhan in #289
Full Changelog: v0.2.0...v0.2.1
v0.2.0 Dataset Versioning, DuckDB extension built with CUDA
Highlights
- Lance Dataset versioning support
- Duckdb Extension supports building against PyTorch with Cuda
- Revamp README and documentation.
What's Changed
- Fetch Dataset Versions by @eddyxu in #272
- Readability improvement for metadata class by @Renkai in #275
- [DuckDB] Enable DuckDB extension to build/run on CUDA-enabled PyTorch by @changhiskhan in #273
- [Python] Support multi-versioned dataset by @eddyxu in #278
- [Document] Add logo/README refresh by @jaichopra in #279
- [Python] Fetch dataset versions. by @eddyxu in #280
- [Python] Support group size and rows_per_file customization via write_dataset API by @eddyxu in #281
- [Python] use new write API in python benchmark by @eddyxu in #282
Full Changelog: v0.1.5...v0.2.0
v0.1.5 Pandas Extension Type, Jupyter Notebook and Document Improvements
What's Changed
- Add model inference notebook by @changhiskhan in #244
- update README.md to simplify comms, prior to blog post by @jaichopra in #248
- Jaichopra/rebrand lance by @jaichopra in #249
- Exclude jupyter notebook from github language stats by @changhiskhan in #251
- linguist fix by @changhiskhan in #253
- restore skipped test since extension types are working on mac by @changhiskhan in #256
- ingestion example by @changhiskhan in #252
- Update README.md by @jaichopra in #261
- [CI] pin arrow 9.0 in GHA by @eddyxu in #268
- Update README.md by @jaichopra in #264
- Pandas extension dtype for image by @changhiskhan in #267
- Reorganize tutorial notebooks by @changhiskhan in #265
- Merge two Schemas by @eddyxu in #263
- Versioning support with Appending Dataset by @eddyxu in #262
- Change datagen to use public https image urls by @changhiskhan in #271
Full Changelog: v0.1.4...v0.1.5
v0.1.4: Var-length binary decoder performance improvements, Open Discord server for community.
What's Changed
- CLI to inspect lance dataset by @eddyxu in #231
- Generate primary key for Oxford Pet dataset by @eddyxu in #233
- Fix datagen test by @eddyxu in #234
- Add discord link and fix typo in README by @eddyxu in #236
- Improve VarBinaryDecoder::Take performance by accumulating small batches by @eddyxu in #239
Full Changelog: v0.1.3...v0.1.4