Fix fasta file generation and change fasta files to use 2bit as source #20

Merged
iromeo merged 3 commits into master from #19-generate-fa-using-2bit
Aug 25, 2023

Conversation

Xewar313
Contributor

@Xewar313 Xewar313 commented Aug 23, 2023

#19

@iromeo
Contributor

iromeo commented Aug 24, 2023

I see that fun Iterable<FastaRecord>.write(path: Path, width: Int = 80) was introduced earlier and isn't covered by tests. Could you please cover it with some unit tests? Real genomes such as hg38 and hg19 are not available in tests. Instead, we use a generated Genome["to1"] genome with a couple of chromosomes.

  • For this method it seems you do not need to1; it's OK to generate some fake FastaRecords configured from a string sequence and validate that the header and the split by the desired width work correctly
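
A minimal sketch of what such a test could check, under stated assumptions: the FastaRecord data class and the toFastaLines helper below are hypothetical stand-ins that only mimic a header line followed by width-wrapped sequence lines. A real test would call Iterable<FastaRecord>.write(path, width) and read the file back instead.

```kotlin
// Hypothetical stand-in for the library's FastaRecord; not the real class.
data class FastaRecord(val description: String, val sequence: String)

// Hypothetical stand-in for Iterable<FastaRecord>.write: returns the lines
// instead of writing them to a Path, so the sketch needs no file system.
fun Iterable<FastaRecord>.toFastaLines(width: Int = 80): List<String> =
    flatMap { record ->
        listOf(">" + record.description) + record.sequence.chunked(width)
    }

fun main() {
    // Fake records configured from plain string sequences.
    val records = listOf(
        FastaRecord("chr1", "ACGTACGTAC"),
        FastaRecord("chr2", "GGCC")
    )
    val lines = records.toFastaLines(width = 4)

    // Header lines start with '>' and keep the record order.
    check(lines.filter { it.startsWith(">") } == listOf(">chr1", ">chr2"))
    // Sequence lines are split by the desired width; the last one may be shorter.
    check(lines == listOf(">chr1", "ACGT", "ACGT", "AC", ">chr2", "GGCC"))
}
```

Using a small width such as 4 keeps the expected lines short enough to write out by hand.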

Also, please add a test for Genome.writeAsFasta and verify:

  • that the chromosome order in the resulting file is the same as in Genome.chromSizesMap
  • that each chromosome sequence starts with the same nucleotides as Chromosome.sequence() for the corresponding chromosome
  • that the chromosome length in the resulting file matches the Chromosome length in the to1 genome.
    Keep in mind that the exact nucleotide sequence in the test chromosome is randomly generated, although the seed is fixed, so it should be the same each time (see org.jetbrains.bio.genome.TestOrganismDataGenerator#generateSequence)
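
The three checks above could be sketched roughly as follows. This is only an illustration: readFasta and all sample values are hypothetical. In a real test the expected names, lengths and prefixes would come from Genome["to1"] (chromSizesMap and Chromosome.sequence()), and the text would be the file produced by Genome.writeAsFasta.

```kotlin
// Hypothetical helper: parses FASTA text into name -> sequence, preserving
// the order in which records appear (LinkedHashMap keeps insertion order).
fun readFasta(text: String): Map<String, String> {
    val result = LinkedHashMap<String, StringBuilder>()
    var current: StringBuilder? = null
    for (line in text.lineSequence()) {
        when {
            line.startsWith(">") -> {
                current = StringBuilder()
                result[line.substring(1).trim()] = current
            }
            line.isNotBlank() -> current?.append(line.trim())
        }
    }
    return result.mapValues { it.value.toString() }
}

fun main() {
    // Made-up expected data standing in for the to1 genome.
    val expected = linkedMapOf("chr1" to "ACGTACGTAC", "chr2" to "GGCCTT")
    // Made-up file content standing in for the writeAsFasta output.
    val fasta = ">chr1\nACGT\nACGT\nAC\n>chr2\nGGCC\nTT\n"
    val actual = readFasta(fasta)

    // 1. Chromosome order matches the expected order.
    check(actual.keys.toList() == expected.keys.toList())
    for ((name, sequence) in expected) {
        val written = actual.getValue(name)
        // 2. The written sequence starts with the same nucleotides.
        check(written.startsWith(sequence.take(5)))
        // 3. The written length matches the chromosome length.
        check(written.length == sequence.length)
    }
}
```

Comparing only a short prefix plus the total length keeps the test cheap while still catching reordered or truncated chromosomes.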

@iromeo
Contributor

iromeo commented Aug 24, 2023

Also, AnnotationsConfig.fastaUrl isn't used anymore, so it seems we could clean it up.

@iromeo iromeo assigned Xewar313 and unassigned iromeo Aug 24, 2023
@Xewar313 Xewar313 assigned iromeo and unassigned Xewar313 Aug 25, 2023
@iromeo
Contributor

iromeo commented Aug 25, 2023

@Xewar313 GenomeAnnotationEditor compilation is broken due to the missing fasta URL in the epigenome repo; please fix it there as well.

@iromeo iromeo assigned Xewar313 and unassigned iromeo Aug 25, 2023
@Xewar313 Xewar313 assigned iromeo and unassigned Xewar313 Aug 25, 2023
@Xewar313
Contributor Author

@iromeo Sorry, I forgot to push the jbr changes.

@iromeo iromeo merged commit 88ec880 into master Aug 25, 2023
@iromeo iromeo deleted the #19-generate-fa-using-2bit branch August 25, 2023 17:17