[Feature] Simplify the current state to just one table #1

Open
thoven87 opened this issue Jul 24, 2024 · 1 comment

Comments

@thoven87
Contributor

thoven87 commented Jul 24, 2024

I was thinking of the following schema

CREATE TABLE hummingbird.job_queue (
    id uuid,
    job_name VARCHAR(255) NOT NULL,
    payload bytea NOT NULL,
    priority INTEGER DEFAULT 100,
    status SMALLINT NOT NULL,
    error_count INTEGER NOT NULL DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    started_at TIMESTAMP WITH TIME ZONE,
    updated_at TIMESTAMP WITH TIME ZONE,
    completed_at TIMESTAMP WITH TIME ZONE,
    CONSTRAINT _hb_pg_job_queue_pkey PRIMARY KEY (priority, created_at, id)
);

This proposal would simplify the current two-step approach.

Instead of inserting into the jobs table and then moving the job to the job_queue table, we can insert once, up front, and change the job query to the following:

UPDATE
    _hb_pg_job_queue
SET status = \(Status.processing),
    started_at = \(Date.now)
WHERE id IN (
    SELECT
        id
    FROM _hb_pg_job_queue task_queue
    WHERE status IN (\(Status.pending)) OR completed_at IS NULL
    ORDER BY task_queue.created_at ASC
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload
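
The query interpolates Status.pending and Status.processing into a SMALLINT column. A minimal sketch of what such a type could look like is below. The raw values and the failed and completed cases are assumptions for illustration only; the proposal does not specify them.

```swift
// Hypothetical status codes backing the SMALLINT `status` column.
// Only `pending` and `processing` appear in the proposal; the other
// cases and all raw values are assumptions.
enum Status: Int16, CaseIterable {
    case pending = 0
    case processing = 1
    case failed = 2
    case completed = 3
}
```

Int16 is used here because it matches the range of a PostgreSQL SMALLINT.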

On failure, error_count can be incremented, and that count can then be used for exponential backoff retries.
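
As a rough sketch of how error_count could drive the backoff: the helper below doubles the delay for each recorded failure and caps it at a maximum. The function name, the one-second base, and the five-minute cap are all assumptions, not part of the proposal.

```swift
import Foundation

/// Hypothetical helper: seconds to wait before the next retry,
/// derived from the row's error_count.
/// `baseDelay` and `maxDelay` are illustrative defaults.
func retryDelay(
    errorCount: Int,
    baseDelay: TimeInterval = 1.0,
    maxDelay: TimeInterval = 300.0
) -> TimeInterval {
    // No recorded failures means the job can run immediately.
    guard errorCount > 0 else { return 0 }
    // Clamp the exponent so very large counts stay well-behaved.
    let exponent = Double(min(errorCount - 1, 30))
    // Double the delay per failure, never exceeding maxDelay.
    return min(baseDelay * pow(2.0, exponent), maxDelay)
}
```

With these defaults the first failure waits 1 second, the third waits 4 seconds, and anything past the ninth failure is held at the 300-second cap.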

Moreover, this lets a user retain logs/job information in the table via a config option such as shouldDeleteJobAfterCompletion: Bool.
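
One way the flag could be wired up, as a sketch only: the configuration type, the function name, the default value, and the exact SQL are assumptions. Only shouldDeleteJobAfterCompletion and the column names come from the proposal above.

```swift
// Hypothetical configuration type; only the property name is from the proposal.
struct JobQueueConfiguration {
    // Defaulting to true is an assumption.
    var shouldDeleteJobAfterCompletion: Bool = true
}

/// Returns the SQL a driver might run when a job finishes.
/// $1 is the job id; $2 is the completed status code.
func completionStatement(for config: JobQueueConfiguration) -> String {
    if config.shouldDeleteJobAfterCompletion {
        // Drop the row entirely once the job is done.
        return "DELETE FROM _hb_pg_job_queue WHERE id = $1"
    } else {
        // Keep the row as a record and stamp the completion time.
        return "UPDATE _hb_pg_job_queue SET status = $2, completed_at = NOW(), updated_at = NOW() WHERE id = $1"
    }
}
```

Keeping completed rows means the table grows over time, so a retention policy would probably be needed alongside this option.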

@adam-fowler
Member

I'm trying to remember what issues a single table brought up, but I can't recall them offhand. This does look like it makes sense. Any additional functionality (like priority or additional timestamps) should probably be added to the driver protocol, to better see whether it makes sense, and kept separate from this change.

@thoven87 thoven87 transferred this issue from hummingbird-project/hummingbird-postgres Sep 16, 2024