Support RepeatMasker output format for CHM13 (hs1) #15

Merged: 2 commits merged into master on Aug 9, 2023

serge-p7v
Contributor

No description provided.

@serge-p7v serge-p7v requested a review from olegs July 5, 2023 15:29
@olegs
Contributor

olegs commented Jul 7, 2023

Adding repeats track fails: #16

@olegs
Contributor

olegs commented Jul 7, 2023

Adding CpG track fails: #17

@@ -645,6 +645,7 @@ fun String.toStrand() = single().toStrand()
fun Char.toStrand() = when (this) {
'+' -> Strand.PLUS
'-' -> Strand.MINUS
'C' -> Strand.MINUS
Contributor

Can you please remove this option, or move this hs1-specific handling into the hs1 parsing code?
Alternatively, add an explicit comment explaining why this exception is needed.

Contributor Author

I've moved this option to AnnotationsHs1.kt
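
For illustration only, a minimal sketch of what format-local strand parsing could look like. This is not the code that was merged: `parseRepeatMaskerStrand` is a hypothetical name, and the `Strand` enum below is a stand-in for the project's own type. The one fact it relies on is that RepeatMasker's .out format marks the complement strand with `C` rather than `-`.

```kotlin
// Stand-in for the project's Strand type, so the sketch compiles on its own.
enum class Strand { PLUS, MINUS }

// Hypothetical helper kept next to the hs1 / RepeatMasker parsing code.
// RepeatMasker .out files write 'C' (complement) where other formats use '-',
// so the exception lives here and the generic toStrand() stays strict.
fun parseRepeatMaskerStrand(field: String): Strand = when (field) {
    "+" -> Strand.PLUS
    "C" -> Strand.MINUS
    else -> throw IllegalArgumentException("Unexpected RepeatMasker strand: '$field'")
}

fun main() {
    println(parseRepeatMaskerStrand("+"))
    println(parseRepeatMaskerStrand("C"))
}
```

Keeping the `C` case out of the shared `Char.toStrand()` means a stray `C` in any other input format still fails loudly.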

@@ -103,6 +119,67 @@ object Repeats {

return builder.build()
}

private fun parseHs1(
Contributor

It would be better to move this code to AnnotationsHs1.kt and have Repeats delegate to specialized functions in that file. That will make it easier to add the next nonstandard genome.

Contributor Author

Done.
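
A rough sketch of the delegation pattern discussed in this thread, under stated assumptions: `RepeatsSketch`, `AnnotationsHs1Sketch`, the `build` string key, and the `List<String>` result are all hypothetical placeholders, not the project's actual API. It only shows the shape: the generic entry point selects a genome-specific parser and hands the input over.

```kotlin
// Placeholder for the genome-specific file (AnnotationsHs1.kt in the review).
object AnnotationsHs1Sketch {
    // Hypothetical specialized parser; tags each kept line so the delegation is visible.
    fun parseRepeats(lines: Sequence<String>): List<String> =
        lines.filter { it.isNotBlank() }.map { "hs1:" + it.trim() }.toList()
}

object RepeatsSketch {
    // Generic entry point: chooses a parser by genome build and delegates to it.
    fun parse(build: String, lines: Sequence<String>): List<String> = when (build) {
        "hs1" -> AnnotationsHs1Sketch.parseRepeats(lines)
        else -> parseDefault(lines)
    }

    // Default path for genomes that use the common format.
    private fun parseDefault(lines: Sequence<String>): List<String> =
        lines.filter { it.isNotBlank() }.map { it.trim() }.toList()
}

fun main() {
    println(RepeatsSketch.parse("hs1", sequenceOf(" a ", "")))
    println(RepeatsSketch.parse("hg38", sequenceOf(" a ")))
}
```

With this layout, supporting another unusual genome would mean adding one file and one branch in `parse`.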

return line.isNotBlank() && line.contains(chrMarker) && !line.contains(substringPresentOnlyInHeader)
}

private fun parseHs1RepeatsLine(line: String, chromosomes: Map<String, Chromosome>): Pair<Chromosome, Repeat>? {
Contributor

Please add a test with a short prepared data sample for repeats in the hs1-specific format.

Contributor Author

I've added some tests for hs1 repeats parsing.
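
As a sketch of what a test on short prepared data might look like (not the tests that were added in this PR): the sample below uses made-up values laid out in RepeatMasker .out column order, and `RepeatRecord`, `isDataLine`, `parseRepeatMaskerLine`, and the `"chr"` / `"class/family"` markers are hypothetical stand-ins for the project's `Repeat`, `Chromosome`, and header-detection logic.

```kotlin
// Hypothetical simplified record; coordinates are kept exactly as written in the file.
data class RepeatRecord(
    val chromosome: String,
    val begin: Int,
    val end: Int,
    val isComplement: Boolean,
    val name: String,
    val classFamily: String
)

// Data lines mention a chromosome; the two header lines do not
// (marker values here are assumptions for this sketch).
fun isDataLine(line: String): Boolean =
    line.isNotBlank() && line.contains("chr") && !line.contains("class/family")

// Columns: score, div, del, ins, query, begin, end, (left), strand, repeat, class/family, ...
fun parseRepeatMaskerLine(line: String): RepeatRecord? {
    if (!isDataLine(line)) return null
    val f = line.trim().split(Regex("\\s+"))
    if (f.size < 11) return null
    val begin = f[5].toIntOrNull() ?: return null
    val end = f[6].toIntOrNull() ?: return null
    return RepeatRecord(f[4], begin, end, f[8] == "C", f[9], f[10])
}

// Made-up sample: two header lines, a blank line, then one '+' and one 'C' record.
val sample = """
   SW   perc perc perc  query     position in query    matching  repeat        position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family  begin end (left)  ID

  463   1.3  0.6  1.7   chr1      1001  1468  (99000)  +  (ACCCTA)n  Simple_repeat  1  471  (0)  1
 1320  15.6  6.2  0.0   chr1      6563  6781  (93687)  C  MER7A      DNA/MER2_type  (0)  336  103  2
""".trimIndent()

fun main() {
    val records = sample.lines().mapNotNull(::parseRepeatMaskerLine)
    check(records.size == 2) { "header and blank lines should be skipped" }
    check(records[0] == RepeatRecord("chr1", 1001, 1468, false, "(ACCCTA)n", "Simple_repeat"))
    check(records[1].isComplement && records[1].name == "MER7A")
    println("ok")
}
```

A sample this small still covers the three cases that differ from the usual format: header skipping, the `C` strand marker, and the parenthesized columns.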

@olegs olegs merged commit cc3f7f9 into master Aug 9, 2023
olegs pushed a commit that referenced this pull request Sep 7, 2023
* Support RepeatMasker output format for CHM13 (hs1)
* Move methods related to hs1 repeats parsing to AnnotationsHs1.kt. 
* Added tests

---------

Co-authored-by: Sergey Pestrikov <[email protected]>