13 Dec 02:41

eddyxu

00102dc

v0.2.6 Schema evolution bug fixes, Google Colab support, and more datasets

What's Changed

[C++] Remove unused Reader APIs by @eddyxu in #344
[Python] fix timezone issue with version timestamp by @changhiskhan in #345
[C++] add Dataset::Make(string) API by @eddyxu in #346
[DUCKDB] Native duckdb lance reader by @eddyxu in #347
[DUCKDB] Read a special version of dataset by @eddyxu in #350
[DUCKDB] Fix duckdb manylinux build by @eddyxu in #351
[Python] Add colab badge to notebooks by @eddyxu in #354
[Notebook] ML dev cycle for DINO by @eddyxu in #355
[DUCKDB] fix type mapping for other int types by @changhiskhan in #359
[Python] Fix lance.dataset opening a local relative path by @eddyxu in #365
[C++] Store relative path for data files by @eddyxu in #368
[C++] Add RAII util (defer) to auto cleanup / close resources after exiting the scope by @eddyxu in #369
[Python] Conversion of ImageNet 1K into a Lance dataset by @eddyxu in #366
[Python] Imagenet data quality analytics notebook by @eddyxu in #370
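
As a rough illustration of why storing relative paths for data files (#368) can matter, the sketch below records a data file's location relative to a dataset root and resolves it again after the dataset directory has moved. It uses only the Python standard library. The `to_relative` and `resolve` helpers and the directory names are hypothetical; they are not part of the Lance API and do not describe how the C++ change is implemented.

```python
from pathlib import PurePosixPath


def to_relative(dataset_root: str, data_file: str) -> str:
    """Return data_file expressed relative to dataset_root (hypothetical helper)."""
    return str(PurePosixPath(data_file).relative_to(PurePosixPath(dataset_root)))


def resolve(dataset_root: str, relative_file: str) -> str:
    """Rebuild a full path from the current dataset root and a stored relative path."""
    return str(PurePosixPath(dataset_root) / relative_file)


# Hypothetical layout: the entry is stored once, then the dataset directory is moved.
stored = to_relative("/data/pets.lance", "/data/pets.lance/data/part-0.lance")
moved = resolve("/mnt/archive/pets.lance", stored)
print(stored)
print(moved)
```

Because only the relative part is stored, the same entry stays valid wherever the dataset directory ends up.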

Full Changelog: v0.2.5...v0.2.6

Contributors

eddyxu and changhiskhan

02 Dec 06:15

eddyxu

v0.2.5

ceb65ae

v0.2.5 Schema evolution, support merging with arrow Table

What's Changed

[DOC] Fix notebook build by @eddyxu in #339
[Python] lance.write_dataset takes pandas DataFrame by @eddyxu in #342
[DOC] update readme docs to cater for import pathways from df/parquet by @jaichopra in #340
[Python] Improve PyTorch dataset ergonomics by @eddyxu in #336
[C++] Add columns from in-memory table by @eddyxu in #337
[Python] Append column with an in-memory PyArrow Table by @eddyxu in #338
[C++][Python] Add timestamp to each manifest version. by @eddyxu in #343

Full Changelog: v0.2.4...v0.2.5

Contributors

eddyxu and jaichopra

28 Nov 21:25

eddyxu

v0.2.4

b6ba75f

v0.2.4: Schema Evolution and Append Column

Support Schema Evolution via Append Column.
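
To make the idea concrete, here is a toy, in-memory sketch of schema evolution by appending a column: each write produces a new version, and an appended column creates a new version that reuses the existing columns rather than rewriting them. `ToyDataset` and its methods are hypothetical names chosen for illustration. This is not Lance's API or on-disk format, only a minimal model of the concept.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Toy model: each version is a mapping of column name -> list of values."""

    versions: list = field(default_factory=list)

    def write(self, columns: dict) -> int:
        """Create a new version from a full set of columns; return its 1-based number."""
        self.versions.append(dict(columns))
        return len(self.versions)

    def append_column(self, name: str, values: list) -> int:
        """Create a new version that reuses existing columns and adds one more."""
        latest = self.versions[-1]
        if name in latest:
            raise ValueError(f"column {name!r} already exists")
        num_rows = len(next(iter(latest.values())))
        if len(values) != num_rows:
            raise ValueError("new column must have one value per existing row")
        # Existing column lists are shared, not copied; only the mapping is new.
        self.versions.append({**latest, name: values})
        return len(self.versions)

    def schema(self, version: int) -> list:
        """Column names visible at a 1-based version number."""
        return list(self.versions[version - 1])


ds = ToyDataset()
v1 = ds.write({"id": [1, 2, 3], "label": ["cat", "dog", "cat"]})
v2 = ds.append_column("score", [0.9, 0.8, 0.7])
print(ds.schema(v1))
print(ds.schema(v2))
```

Older versions keep their original schema, so a reader pinned to version 1 never sees the `score` column.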

What's Changed

[Notebook] fixes for notebook backing the blog post by @changhiskhan in #316
[C++] Append column by @eddyxu in #299
[Python] Append columns by @eddyxu in #318
Use column projection during update by @eddyxu in #322
update to duckdb 0.6 by @changhiskhan in #312
[Python] Support add column via Expression. by @eddyxu in #324
[Python] Expose projection for append column by @eddyxu in #325
[C++] Support column projection during add_columns via expression by @eddyxu in #326
[Python] PyTorch Dataset uses Fragment instead of files and supports versions by @eddyxu in #327
[C++] Move writer API to a private API by @eddyxu in #329
[C++] Refactor Metadata class to eliminate protobuf reference. by @eddyxu in #328
[C++] Performance profiling and improvement by @eddyxu in #333
[C++] Upgrade lq cmd tool to be able to inspect new versioned format by @eddyxu in #334

Full Changelog: v0.2.3...v0.2.4

Contributors

eddyxu and changhiskhan

16 Nov 04:23

changhiskhan

v0.2.3

a55f929

v0.2.3 Bugfix release; breaks dataset proto schema

What's Changed

[C++] Project schema via field Ids and Schema intersection by @eddyxu in #305
when writing in batches, handle all na arrays properly by @changhiskhan in #306
[C++] Use LanceFragment to build I/O exec plan by @eddyxu in #307
[CI] Fix Github Action warning to upgrade nodejs 12 based actions by @eddyxu in #309
Update README.md by @changhiskhan in #310
Temporarily pin duckdb to 0.5.1 by @changhiskhan in #313
Notebook for new blog post on versioning by @changhiskhan in #311
[C++] Fix reading dictionary values from manifest files by @eddyxu in #314

Full Changelog: v0.2.2...v0.2.3

Contributors

eddyxu and changhiskhan

09 Nov 17:25

eddyxu

v0.2.2

8a9d736

v0.2.2 Python notebooks and CV dataset conversion.

What's Changed

[DOC] Update README.md by @jaichopra in #294
[DUCKDB] Script to upload lance extension zip by @changhiskhan in #295
[C++] Scan Node reads multiple files by @eddyxu in #300
[Python] Add lance.util.duckdb to help install the extension transparently by @changhiskhan in #301
[Python] Notebook fixes by @changhiskhan in #303
[Python] Make dataset conversion a feature by @changhiskhan in #304

Full Changelog: v0.2.1...v0.2.2

Contributors

eddyxu, changhiskhan, and jaichopra

04 Nov 22:41

changhiskhan

v0.2.1

70d72fb

v0.2.1 Bug fix release

Fixed a bug affecting writes of fixed-size list arrays, as well as the datagen code for COCO.
Updated to the newly released Arrow 10.0.

What's Changed

remove duplicate test_mac.sh by @changhiskhan in #284
Fix build on intel mac by @eddyxu in #286
[C++] Fix write fixed list array bug by @eddyxu in #288
Upgrade Apache Arrow to 10.0 by @eddyxu in #266
temporary hack to fix pytorch loader until it can handle a versioned … by @changhiskhan in #293
fix image_id alignment in coco datagen by @changhiskhan in #289

Full Changelog: v0.2.0...v0.2.1

Contributors

eddyxu and changhiskhan

02 Nov 19:38

eddyxu

v0.2.0

f4eaa21

v0.2.0 Dataset Versioning, DuckDB extension built with CUDA

Highlights

Lance Dataset versioning support
DuckDB extension supports building against PyTorch with CUDA
Revamp README and documentation.

What's Changed

Fetch Dataset Versions by @eddyxu in #272
Readability improvement for metadata class by @Renkai in #275
[DuckDB] Enable DuckDB extension to build/run on CUDA-enabled PyTorch by @changhiskhan in #273
[Python] Support multi-versioned dataset by @eddyxu in #278
[Document] Add logo/README refresh by @jaichopra in #279
[Python] Fetch dataset versions. by @eddyxu in #280
[Python] Support group size and rows_per_file customization via write_dataset API by @eddyxu in #281
[Python] use new write API in python benchmark by @eddyxu in #282

Full Changelog: v0.1.5...v0.2.0

Contributors

eddyxu, changhiskhan, and 2 other contributors

28 Oct 16:52

eddyxu

v0.1.5

72e47ef

v0.1.5 Pandas Extension Type, Jupyter Notebook and Document Improvements

What's Changed

Add model inference notebook by @changhiskhan in #244
update README.md to simplify comms, prior to blog post by @jaichopra in #248
Jaichopra/rebrand lance by @jaichopra in #249
Exclude jupyter notebook from github language stats by @changhiskhan in #251
linguist fix by @changhiskhan in #253
restore skipped test since extension types are working on mac by @changhiskhan in #256
ingestion example by @changhiskhan in #252
Update README.md by @jaichopra in #261
[CI] pin arrow 9.0 in GHA by @eddyxu in #268
Update README.md by @jaichopra in #264
Pandas extension dtype for image by @changhiskhan in #267
Reorganize tutorial notebooks by @changhiskhan in #265
Merge two Schemas by @eddyxu in #263
Versioning support with Appending Dataset by @eddyxu in #262
Change datagen to use public https image urls by @changhiskhan in #271

Full Changelog: v0.1.4...v0.1.5

Contributors

eddyxu, changhiskhan, and jaichopra

16 Oct 17:18

eddyxu

v0.1.4

d7dfad6

v0.1.4: Var-length binary decoder performance improvements, Open Discord server for community.

What's Changed

CLI to inspect lance dataset by @eddyxu in #231
Generate primary key for Oxford Pet dataset by @eddyxu in #233
Fix datagen test by @eddyxu in #234
Add discord link and fix typo in README by @eddyxu in #236
Improve VarBinaryDecoder::Take performance by accumulating small batches by @eddyxu in #239

Full Changelog: v0.1.3...v0.1.4

Contributors

eddyxu

09 Oct 02:34

eddyxu

v0.1.3

41bdf4c

Document improvements and bug fixes

What's Changed

EDA Howtos by @eddyxu in #186
Apply Limit across multiple files in the dataset. by @eddyxu in #226
Fix false assertion during BinaryEncoding by @eddyxu in #227

Full Changelog: v0.1.2...v0.1.3

Contributors

eddyxu

Releases: lancedb/lance
