Custom annotation support #124

Open
alnfedorov opened this issue Jul 21, 2020 · 3 comments
alnfedorov commented Jul 21, 2020

Hello!
I have found the JBR Genome Browser to be a convenient tool for creating weak genome annotations. However, the current annotation mode supports only SPAN-like annotations. It would be great to let users define, save, and load custom annotation tracks.

For me, it would be sufficient to let users add new labels directly to the current SPAN annotation track (and save/load them within the session). This avoids the following questions and uncertainties:

  • Should intersections be allowed?
  • Should multiple parallel annotations be supported?
  • If tracks are saved and loaded, where should the colors be stored? What should be done about annotation-color collisions?
  • Etc.

Sophisticated use-cases can be supported later when more feedback is available.

To give some perspective, I plan to annotate large portions of the genome for dozens of histone ChIP-seq experiments. The datasets will eventually be published and could potentially be used to benchmark or optimize peak callers and to train ML/DL models. The problem is that the current annotation scheme is not expressive enough. For example, in some situations it is clear that a region contains exactly one peak, and the body of that peak can be labeled, at least partially, with high confidence. This can be partly covered by abutting start and stop labels, but the semantics are different: "this is the exact (perhaps partial) body of a single peak" versus "a single peak starts and stops somewhere in here".

Thank you!
P.S. I am really impressed by the browser/SPAN/aging paper! Great job!

Edit: clarity.

olegs (Contributor) commented Jul 21, 2020

@alnfedorov Thanks for your interest and for the warm words about the browser/SPAN/paper.

Indeed, at the moment JBR supports only the annotations that SPAN supports, i.e. only 4 possible marks, without any intersections.
Annotations are saved to a regular BED file, so intersections are forbidden.
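
As a rough illustration of the no-intersection constraint, here is a minimal Python sketch that flags overlapping records before they are written to a BED file. It is not JBR or SPAN code; the function name `find_overlaps` is hypothetical, and intervals are assumed to follow the BED convention (0-based start, exclusive end), so abutting intervals do not count as overlapping.

```python
from collections import defaultdict


def find_overlaps(intervals):
    """Report each interval that overlaps an earlier one on the same chromosome.

    `intervals` is an iterable of (chrom, start, end) tuples in BED
    coordinates: 0-based start, end exclusive.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    overlaps = []
    for chrom, spans in by_chrom.items():
        spans.sort()
        prev = spans[0]  # interval reaching furthest to the right so far
        for cur in spans[1:]:
            if cur[0] < prev[1]:  # strict: abutting intervals are fine
                overlaps.append((chrom, prev, cur))
            if cur[1] > prev[1]:
                prev = cur
    return overlaps


labels = [
    ("chr1", 100, 200),
    ("chr1", 200, 300),  # abuts the previous label, no overlap
    ("chr1", 250, 400),  # overlaps (200, 300)
    ("chr2", 0, 50),
]
print(find_overlaps(labels))
```
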

> Should intersections be allowed?

If you want to store the annotation in BED format, intersections are not allowed.

> Should multiple parallel annotations be supported?

I don't think that creating several annotations simultaneously would be easy; it is much easier to create the first annotation, save it to a BED file, open it in JBR, then proceed with the second, and so on.

> If tracks are saved and loaded, where should the colors be stored? What should be done about annotation-color collisions?

We use the names of BED entries to determine the annotation type in SPAN, and the BED color for visualisation purposes.
You can learn more about the extended BED format here: https://m.ensembl.org/info/website/upload/bed.html
From this perspective, colours and names can be any values that conform to that specification.
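
To make the name and colour fields concrete, here is a minimal Python sketch of writing one annotation as a 9-column BED line, where column 4 holds the name and column 9 (itemRgb) holds the colour as `R,G,B`. The helper `format_bed9` and the label `peakBody` are hypothetical, chosen for illustration; they are not part of JBR or SPAN.

```python
def format_bed9(chrom, start, end, name, rgb, score=0, strand="."):
    """Format one annotation as a tab-separated BED9 line.

    Columns: chrom, chromStart, chromEnd, name, score, strand,
    thickStart, thickEnd, itemRgb.
    """
    if start < 0 or end <= start:
        raise ValueError(f"invalid interval: {start}-{end}")
    if not name or any(ch.isspace() for ch in name):
        raise ValueError(f"name must be non-empty without whitespace: {name!r}")
    r, g, b = rgb
    if not all(0 <= c <= 255 for c in (r, g, b)):
        raise ValueError(f"colour components must be in 0..255: {rgb}")
    # thickStart/thickEnd simply repeat the interval bounds here.
    fields = [chrom, start, end, name, score, strand, start, end, f"{r},{g},{b}"]
    return "\t".join(str(f) for f in fields)


line = format_bed9("chr1", 1000, 2000, "peakBody", (255, 0, 0))
print(line)
```
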

We can consider adding user-configured markup annotations to JBR,
i.e. letting users configure which colours correspond to which marks, and providing a user interface for adding them.
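
One way such a user-configured mapping could handle the colour collisions raised in the original question is sketched below in Python. This is only an assumption about a possible design, not how JBR works; the class `LabelRegistry` and the label names are hypothetical. The sketch rejects a second colour for an existing label and a second label for an existing colour.

```python
class LabelRegistry:
    """Hypothetical label-to-colour mapping that rejects collisions."""

    def __init__(self):
        self._colors = {}  # label name -> (r, g, b)

    def register(self, name, rgb):
        existing = self._colors.get(name)
        if existing == rgb:
            return  # re-registering the same pair is harmless
        if existing is not None:
            raise ValueError(f"{name!r} already uses colour {existing}")
        for other, other_rgb in self._colors.items():
            if other_rgb == rgb:
                raise ValueError(f"colour {rgb} is already used by {other!r}")
        self._colors[name] = rgb

    def color_of(self, name):
        return self._colors[name]


registry = LabelRegistry()
registry.register("peakBody", (0, 128, 0))
registry.register("uncertain", (128, 128, 128))
print(registry.color_of("peakBody"))
```
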

alnfedorov (Author) commented Jul 21, 2020

@olegs Thanks for the fast response!

> > Should intersections be allowed?

> If you want to store the annotation in BED format, intersections are not allowed.

> > Should multiple parallel annotations be supported?

> I don't think that creating several annotations simultaneously would be easy; it is much easier to create the first annotation, save it to a BED file, open it in JBR, then proceed with the second, and so on.

> > If tracks are saved and loaded, where should the colors be stored? What should be done about annotation-color collisions?

> We use the names of BED entries to determine the annotation type in SPAN, and the BED color for visualisation purposes.
> You can learn more about the extended BED format here: https://m.ensembl.org/info/website/upload/bed.html
> From this perspective, colours and names can be any values that conform to that specification.

You are absolutely right, and it is good that several problems that seemed evident to me are already solved or constrained by the specification. However, I just meant that even a simple enhancement to the current annotation mode would cover 99% of my needs, and it should not require any difficult design choices.

(By the way, I didn't find any restrictions on overlapping intervals in the BED specs. Also, I am sure that BED is sometimes used to store ChIP-seq reads, and those can and often should overlap.)

> We can consider adding user-configured markup annotations to JBR,
> i.e. letting users configure which colours correspond to which marks, and providing a user interface for adding them.

That would be great!

olegs (Contributor) commented Jul 21, 2020

> (By the way, I didn't find any restrictions on overlapping intervals in the BED specs. Also, I am sure that BED is sometimes used to store ChIP-seq reads, and those can and often should overlap.)

Indeed, .BED.GZ files are often used to store raw reads. I should have been more specific. I meant that there are lots of tools that don't expect BED intervals to overlap.
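
For tools that do not expect overlapping BED intervals, one common workaround is to merge overlaps before handing the file over. The Python sketch below shows the idea under the same BED coordinate assumptions (0-based start, exclusive end); the function name `merge_overlapping` is hypothetical, and the sketch drops names and colours, so it would lose label semantics if applied to an annotation track.

```python
def merge_overlapping(intervals):
    """Merge overlapping (chrom, start, end) intervals per chromosome.

    Abutting intervals (end == next start) are kept separate.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


reads = [
    ("chr1", 250, 400),
    ("chr1", 100, 200),
    ("chr1", 200, 300),
    ("chr2", 0, 50),
]
print(merge_overlapping(reads))
```
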
