lexi: add new database package #3033

Open
wants to merge 3 commits into
base: master

buck54321
Member

Add the Lexi DB package, which wraps badger DB to provide a simplified API, to capture the sometimes tedious mechanics that we've repeated in various places, and to add utilities for indexing data for quick retrieval of filtered data.

This change is just the addition of the package. I will demonstrate its use in tatanka separately.

I'll add some more testing, but feel free to review as is.

See this diff for an example of how it would replace the tx DB for Ethereum.

Add the Lexi DB package, which wraps badger DB to provide a simplified
API, to capture the sometimes tedious mechanics that we've repeated
in various places, and to add utilities for indexing data for quick
retrieval of filtered data.

This change is just the addition of the package. I will demonstrate
its use in tatanka separately, and also show how this can replace
our client/asset tx DB implementations.
Comment on lines +25 to +29
bLen := 1 + len(d.v) + wire.VarIntSerializeSize(uint64(len(d.v))) + wire.VarIntSerializeSize(uint64(len(d.indexes)))
for _, ib := range d.indexes {
	bLen += len(ib) + wire.VarIntSerializeSize(uint64(len(ib)))
}
b := bytes.NewBuffer(make([]byte, 0, bLen))
Member Author

@buck54321 buck54321 Oct 21, 2024

Oops. I forgot to doc this encoding stuff. It's pretty much like BuildyBytes, but I'm pre-allocating the buffer and allowing any blob size. And I'm leveraging wire for the var int stuff.

I guess it's a lot of work just to pre-allocate the buffer though. derp.
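
For anyone following the encoding discussion, here is a minimal sketch of the pre-sizing idea, not the code in this PR. It assumes a layout of one version byte, a blob count, then length-prefixed blobs, and it uses the standard library's `encoding/binary` uvarints in place of wire's var ints, so the byte format differs. The names `uvarintLen`, `encodeBlobs`, and `decodeBlobs` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// uvarintLen returns how many bytes binary.AppendUvarint writes for x.
func uvarintLen(x uint64) int {
	n := 1
	for x >= 0x80 {
		x >>= 7
		n++
	}
	return n
}

// encodeBlobs writes a version byte, the blob count, then each blob prefixed
// with its length. The exact size is computed first so the buffer is
// allocated once and never grows.
func encodeBlobs(ver byte, blobs [][]byte) []byte {
	bLen := 1 + uvarintLen(uint64(len(blobs)))
	for _, b := range blobs {
		bLen += uvarintLen(uint64(len(b))) + len(b)
	}
	out := make([]byte, 0, bLen)
	out = append(out, ver)
	out = binary.AppendUvarint(out, uint64(len(blobs)))
	for _, b := range blobs {
		out = binary.AppendUvarint(out, uint64(len(b)))
		out = append(out, b...)
	}
	return out
}

// decodeBlobs reverses encodeBlobs. The returned blobs alias the input.
func decodeBlobs(b []byte) (ver byte, blobs [][]byte, err error) {
	if len(b) == 0 {
		return 0, nil, errors.New("empty input")
	}
	ver, b = b[0], b[1:]
	n, sz := binary.Uvarint(b)
	if sz <= 0 {
		return 0, nil, errors.New("bad blob count")
	}
	b = b[sz:]
	for i := uint64(0); i < n; i++ {
		l, sz := binary.Uvarint(b)
		if sz <= 0 || uint64(len(b)-sz) < l {
			return 0, nil, errors.New("bad blob length")
		}
		blobs = append(blobs, b[sz:sz+int(l)])
		b = b[sz+int(l):]
	}
	return ver, blobs, nil
}

func main() {
	enc := encodeBlobs(0, [][]byte{[]byte("value"), []byte("idx-a"), nil})
	// len == cap means the size estimate was exact.
	fmt.Println(len(enc) == cap(enc))
	ver, blobs, err := decodeBlobs(enc)
	fmt.Println(ver, len(blobs), err)
}
```

Whether the single allocation is worth the extra size arithmetic is the trade-off the comment above is weighing; a plain `bytes.Buffer` that grows on demand would be shorter to write.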

Contributor

When reading the doc I noticed this:

// MarshalJSON satisfies the json.Unmarshaler interface, returns a quoted copy

"...satisfies the json.Unmarshaler interface..."

I think it is the json.Marshaler interface. I could be wrong.

Contributor

Tests all PASS on my machine.

Contributor

@martonp martonp left a comment

Very nice, sir.

// concurrently. This bugs the hell out of me, because I thought that if a
// database was ACID-compliant, this was impossible, but I guess not. Either
// way, the solution is to try again.
func (db *DB) Update(f func(txn *badger.Txn) error) (err error) {
Contributor

Does this need to be exported?

Member Author

I like it like this.

// lookup and iteration.
type Table struct {
	*DB
	name string
Contributor

Looks like name isn't used anywhere.

Member Author

It's used in (*Table).AddIndex

Comment on lines +89 to +102
type setOpts struct {
	replace bool
}

// SetOption is a knob to control how items are inserted into the table with
// Set.
type SetOption func(opts *setOpts)

// WithReplace allows replacing pre-existing values when calling Set.
func WithReplace() SetOption {
	return func(opts *setOpts) {
		opts.replace = true
	}
}
Contributor

This stuff should be above UseDefaultSetOptions.


// AddIndex adds an index to a Table. Once an Index is added, every datum
// Set in the Table will generate an entry in the Index too.
func (t *Table) AddIndex(name string, f func(k, v encoding.BinaryMarshaler) ([]byte, error)) (*Index, error) {
Contributor

What if we want to add an index when the table already has some data? The table could be versioned, and if a new index is added that didn't exist before, then all the existing entries are added to the index, and the version is incremented. This could be added in the future though if it's actually needed.

Member Author

We're on the same page. There will be some upgrade logic at some point, but we don't need to do it here. We'll add functionality as we go.

defer iter.Close()

if len(seek) == 0 {
iter.Rewind()
Contributor

Do you need to Rewind if you haven't started iterating yet?

Member Author

That's how they do it in the example. Maybe not necessary?

iter.Seek(seek)
}

for ; iter.ValidForPrefix(prefix); iter.Next() {
Contributor

Can just use iter.Valid() since you set the prefix in the opts.

Comment on lines 21 to 24
// ErrDeleteEntry can be returned from the function passed to Iterate to
// trigger deletion of the datum, all of its index entries, and its key-id
// entries.
ErrDeleteEntry = dex.ErrorKind("delete entry")
Contributor

You probably know, but this isn't implemented.

Member Author

Oh right. Did (*Iter).Delete() instead.


func lastKeyForPrefix(txn *badger.Txn, p keyPrefix) (k []byte) {
reverseIteratePrefix(txn, p[:], nil, func(iter *badger.Iterator) error {
k = iter.Item().Key()[prefixSize:]
Contributor

Does this need KeyCopy?

if err != nil {
return
}
return k, item.Value(func(kB []byte) error {
Contributor

ValueCopy?
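
Background for the KeyCopy and ValueCopy questions: Badger documents that the slice from `Item.Key()` is only valid while the item is valid, and that the slice passed to the `Item.Value` callback is only valid inside the callback; `KeyCopy` and `ValueCopy` return copies that can be kept. The sketch below does not use Badger. It simulates the hazard with a hypothetical `reuseIter` that recycles one buffer, to show what can happen to a retained slice once the iterator moves on.

```go
package main

import "fmt"

// reuseIter is a hypothetical iterator that reuses one internal buffer for
// every key it yields, as storage engines often do to avoid allocations.
type reuseIter struct {
	keys []string
	pos  int
	buf  []byte
}

// next loads the next key into the shared buffer, returning false when done.
func (it *reuseIter) next() bool {
	if it.pos >= len(it.keys) {
		return false
	}
	it.buf = append(it.buf[:0], it.keys[it.pos]...)
	it.pos++
	return true
}

// key returns a slice that aliases the shared buffer. It is only valid
// until the following call to next.
func (it *reuseIter) key() []byte { return it.buf }

// keyCopy copies the key into dst (reusing its capacity if any) and returns
// a slice that stays valid after next.
func (it *reuseIter) keyCopy(dst []byte) []byte {
	return append(dst[:0], it.buf...)
}

func main() {
	it := &reuseIter{keys: []string{"aaa", "bbb"}}
	it.next()
	aliased, copied := it.key(), it.keyCopy(nil)
	it.next() // overwrites the buffer that aliased still points at
	fmt.Printf("aliased=%s copied=%s\n", aliased, copied)
}
```

In the two snippets under review, the key and value are both returned to the caller, so they outlive the iterator and the Value callback respectively, which is the situation the copy variants are meant for.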

Comment on lines +201 to +202
// Iterate iterates the index, providing access to the index entry, datum, and
// datum key via the Iter.
Contributor

Referencing the datum in these comments is confusing when looking at the godoc, as datum is not exported.

Member Author

Meh. Datum is also just a word for a single piece of data. Should I use "value" instead?
