GC doesn't seem to run #2003

Open
lkp-k opened this issue Sep 6, 2023 · 7 comments
Labels
kind/bug Something is broken.

Comments

@lkp-k

lkp-k commented Sep 6, 2023

What version of Badger are you using?

Latest v4

What version of Go are you using?

1.20.3

Have you tried reproducing the issue with the latest release?

Yes

What is the hardware spec (RAM, CPU, OS)?

16 GB RAM, Intel i5, macOS

What steps will reproduce the bug?

package main

import (
	"fmt"
	"log"
	"math/rand"
	"strconv"
	"time"

	"github.com/dgraph-io/badger/v4"
)

var db *badger.DB

const charset = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

func generateRandomString(length int) string {
	b := make([]byte, length)
	for i := range b {
		b[i] = charset[rand.Intn(len(charset))]
	}
	return string(b)
}

func main() {
	opts := badger.DefaultOptions("badger")
	opts.ValueLogFileSize = 1 << 20
	var err error
	db, err = badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	fmt.Println("Start inserting 5 million items.")

	for i := 0; i < 5_000_000; i++ {
		err := db.Update(func(txn *badger.Txn) error {
			key := strconv.Itoa(i)
			value := []byte(generateRandomString(300))

			e := badger.NewEntry([]byte(key), value).WithTTL(30 * time.Second)

			// The error from SetEntry (if any) becomes the result of the transaction.
			return txn.SetEntry(e)
		})

		if err != nil {
			log.Fatal("Error while inserting:", err)
		}
	}

	fmt.Println("Successfully inserted 5 million items.")

	startGarbageCollection()
}

func startGarbageCollection() {
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
	again:
		err := db.RunValueLogGC(0.01)
		fmt.Println(err)
		if err == nil {
			goto again
		}
	}
}

Expected behavior and actual result.

Since the keys are expiring in 30 seconds, I thought that the files in badger/ would get removed.

However, here's what I see:

[two screenshots, not preserved]

Additional information

No response

@lkp-k lkp-k added the kind/bug Something is broken. label Sep 6, 2023
@mangalaman93
Contributor

May be related to #1995. Thanks for filing an issue with code to reproduce. I will try to find some time to look into it.

@wangyang0918

Maybe the compaction is not triggered. Similar to https://discuss.dgraph.io/t/gc-may-not-work-in-some-cases/17197

@lkp-k
Author

lkp-k commented Sep 29, 2023

Hi Friends. Any updates on this? Ty

@GraphR00t

@mangalaman93 Would it be possible for you to take a look when you have some time? Thank you!


This issue has been stale for 60 days and will be closed automatically in 7 days. Comment to keep it open.

@github-actions github-actions bot added the Stale label Jul 18, 2024
@GraphR00t

Any updates? Thanks!

@github-actions github-actions bot removed the Stale label Jul 19, 2024
@chrisDeFouRire

Here's some information I hope could be of interest:

  • Just calling RunValueLogGC will not always trigger a GC.
  • Compaction must happen before RunValueLogGC can clear value log files.
  • I tested with a 2-minute TTL.
  • I modified the LSM params to cause frequent compactions, and then I saw the GC running... sometimes!
  • If compaction did not clear any key, no value will be cleared from the value log.

So if you want GC to happen more often, you must make it so compaction happens more often.
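The dependency described above can be sketched as a toy model. This is not Badger's implementation: the `store` type, its fields, and the `put`, `compact`, and `runGC` functions are all hypothetical names invented for illustration, and real value log GC rewrites surviving entries rather than simply dropping files. The only point it tries to show is that GC has nothing to act on until compaction has marked expired entries as discardable.

```go
package main

import "fmt"

// entry is a pointer kept in the toy index: which value log file holds the
// value, and when the key expires (in abstract ticks).
type entry struct {
	file      int
	expiresAt int
}

// store is a hypothetical, heavily simplified model, not Badger's real layout.
type store struct {
	index   map[string]entry // stands in for the LSM tree
	written map[int]int      // value log file -> number of values written
	discard map[int]int      // value log file -> values known to be dead
}

func newStore() *store {
	return &store{
		index:   map[string]entry{},
		written: map[int]int{},
		discard: map[int]int{},
	}
}

// put records a key whose value lives in the given value log file.
func (s *store) put(key string, file, expiresAt int) {
	s.index[key] = entry{file: file, expiresAt: expiresAt}
	s.written[file]++
}

// compact drops expired keys from the index and counts them as discardable.
// In this model it is the only place where discard counts are updated.
func (s *store) compact(now int) {
	for k, e := range s.index {
		if e.expiresAt <= now {
			s.discard[e.file]++
			delete(s.index, k)
		}
	}
}

// runGC removes every file whose recorded discard fraction reaches ratio and
// returns how many files were removed. It never looks at expiry times itself.
func (s *store) runGC(ratio float64) int {
	removed := 0
	for file, total := range s.written {
		if float64(s.discard[file])/float64(total) >= ratio {
			delete(s.written, file)
			delete(s.discard, file)
			removed++
		}
	}
	return removed
}

func main() {
	s := newStore()
	for i := 0; i < 6; i++ {
		s.put(fmt.Sprintf("k%d", i), i/3, 30) // two files, every key expires at tick 30
	}
	// All keys are long expired at tick 60, but nothing has been compacted yet.
	fmt.Println("files removed before compaction:", s.runGC(0.01))
	s.compact(60)
	fmt.Println("files removed after compaction:", s.runGC(0.01))
}
```

In this model the first GC pass removes nothing even though every key has expired, and the second pass removes both files once `compact` has recorded the discards.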

In my test, I had to use the following (extreme!) params before I could see things happen within a few minutes, with constant additions (the LSM doesn't fill up fast with a high threshold).

badger.Open(badger.DefaultOptions("archiver").
	WithValueLogMaxEntries(1000).
	WithValueThreshold(100).
	WithCompactL0OnClose(true).
	WithBaseLevelSize(10000).
	WithNumLevelZeroTables(2).
	WithBaseTableSize(4000).
	WithMemTableSize(1000000))

Of course I do not recommend these params; I just wanted to make sure Badger does indeed GC under the right conditions. Make sure you understand each param and don't blame me 🤣

All in all, GC works if compaction occurs and compacts TTL'ed entries.
