Early edge predictions #74
Merged
Co-authored-by: Ivan Siutsou <[email protected]>
Signed-off-by: Kristof Roomp <[email protected]>
Initial Lepton implementation tests all the DC coefficients on the edges to be nonzero
mcroomp reviewed on May 24, 2024 (4 times)
Merge the last changes and it looks like it's ready to check in.
Hi there, if you get a chance to merge with master, I can approve. Thanks!
This evening then :)
mcroomp approved these changes on May 28, 2024
Melirius added a commit to Melirius/lepton_jpeg_rust that referenced this pull request on Jun 11, 2024
* use bitscan to shortcut zero searching
* fastest so far
* cleaned up code a bit
* more optimizations
* clarified changes
* added comments
* use aligned block as input
* add unroll dependency
* work in progress
* work in progress
* update cargo.lock
* minor fixes
* make envli 32 bit
* update z in 16 increments to avoid extra shifts
* working
* remove bogus change
* clean up envli
* added comments=
* add comments
* improved comments for envli
* update wide library
* precalculate abs value
* Update src/structs/jpeg_write.rs
  Co-authored-by: Ivan Siutsou <[email protected]>
  Signed-off-by: Kristof Roomp <[email protected]>
* remove unused field in HuffCodes
* Store length with code
* Alternative VLI encoding
* Cosmetics
* Working early prediction - to clean
* Cleared
* Early edge prediction
* Nonzero mask
  TODO: move raster update into decode_one_edge
* Start of work on encoder
* WIP: get rid of first transposition in IDCT
* WIP: Nest step
* No transposition in decoder
* Code clear and formatting
* More masks
* Use transposed block all the way in decoding
* Transposed nonzero mask in decoder
* Simplified test
* Working early prediction in encode, to clean
* Code cleared, unused arrays removed
* Shortened min_noise_threshold, more unified encoder/decoder code
* Code clear
* fill raster in at the same time as coordinate
* missing updated cargo.lock since widen_mul was added later
* RUSTy NeighborSummary
* attempt 1 merge
* remove extranious changes
* remove extranious changes
* removing extranious changes
* removing extranious changes
* removing extranious changes
* removing extranious changes
* removing extranious changes
* avoid casting i32x8 using bytemuck if not strictly necessary as wide also provides typesafe casts
* Some comments (by code review)
* Comment elaboration
* Typo fix
* Unification of DC predictors calculations
* fixed transpose
* fixed mul
* fix warnings
* incorrect upcast of quantization table value
* update dependencies and use from_u16x8 which is new in the wide crate
* Revert "Merge remote-tracking branch 'MS/idctmul' into fasterjpeg_simd_variation"
  This reverts commit f2511d8, reversing changes made to 438a1a1.
* Shorter FREQ_MAX
* Correct checks of quantization tables
  Initial Lepton implementation tests all the DC coefficients on the edges to be nonzero
* Formatting
---------
Signed-off-by: Kristof Roomp <[email protected]>
Co-authored-by: Kristof <[email protected]>
mcroomp added a commit that referenced this pull request on Jul 25, 2024
mcroomp added a commit that referenced this pull request on Sep 24, 2024
#82)
* remove extra code dealing with corner cases that never got implemented
* simplify even more and make encoder and decoder the same
* no need mut
* simplifyh parameters slightly
* make parameter order the same for encode/decode row
* remove warnings
* make ProbabilityTableSet static since it never changes
* Early edge predictions (#74)
* use bitscan to shortcut zero searching
* fastest so far
* cleaned up code a bit
* more optimizations
* clarified changes
* added comments
* use aligned block as input
* add unroll dependency
* work in progress
* work in progress
* update cargo.lock
* minor fixes
* make envli 32 bit
* update z in 16 increments to avoid extra shifts
* working
* remove bogus change
* clean up envli
* added comments=
* add comments
* improved comments for envli
* update wide library
* precalculate abs value
* Update src/structs/jpeg_write.rs
  Co-authored-by: Ivan Siutsou <[email protected]>
  Signed-off-by: Kristof Roomp <[email protected]>
* remove unused field in HuffCodes
* Store length with code
* Alternative VLI encoding
* Cosmetics
* Working early prediction - to clean
* Cleared
* Early edge prediction
* Nonzero mask
  TODO: move raster update into decode_one_edge
* Start of work on encoder
* WIP: get rid of first transposition in IDCT
* WIP: Nest step
* No transposition in decoder
* Code clear and formatting
* More masks
* Use transposed block all the way in decoding
* Transposed nonzero mask in decoder
* Simplified test
* Working early prediction in encode, to clean
* Code cleared, unused arrays removed
* Shortened min_noise_threshold, more unified encoder/decoder code
* Code clear
* fill raster in at the same time as coordinate
* missing updated cargo.lock since widen_mul was added later
* RUSTy NeighborSummary
* attempt 1 merge
* remove extranious changes
* remove extranious changes
* removing extranious changes
* removing extranious changes
* removing extranious changes
* removing extranious changes
* removing extranious changes
* avoid casting i32x8 using bytemuck if not strictly necessary as wide also provides typesafe casts
* Some comments (by code review)
* Comment elaboration
* Typo fix
* Unification of DC predictors calculations
* fixed transpose
* fixed mul
* fix warnings
* incorrect upcast of quantization table value
* update dependencies and use from_u16x8 which is new in the wide crate
* Revert "Merge remote-tracking branch 'MS/idctmul' into fasterjpeg_simd_variation"
  This reverts commit f2511d8, reversing changes made to 438a1a1.
* Shorter FREQ_MAX
* Correct checks of quantization tables
  Initial Lepton implementation tests all the DC coefficients on the edges to be nonzero
* Formatting
---------
Signed-off-by: Kristof Roomp <[email protected]>
Co-authored-by: Kristof <[email protected]>
* Fix of calc_sign_index (#85)
* Fix of calc_sign_index
  To get in line with initial Lepton implementation
* Formatting
* Some remarks (by review)
* make rayon optional (#90)
* Fix fuzzer target and bump version (#91)
* Fix fuzzing target and bump the nuget version
* Revert "Fix fuzzing target and bump the nuget version"
  This reverts commit bfde3db.
* Fix fuzzing target and bump the nuget version
---------
Co-authored-by: Gadi Brovman <[email protected]>
* move lepton_header into its own module (#92)
* pipeline update (#94)
* pipeline updats
* update versions for rust 81
* move out color_index to get rid of probability_table_set entirely
* save get_color_index
---------
Signed-off-by: Kristof Roomp <[email protected]>
Co-authored-by: Ivan Siutsou <[email protected]>
Co-authored-by: gbrovman <[email protected]>
Co-authored-by: Gadi Brovman <[email protected]>
Closes #70
DCT coefficients are now stored in transposed raster order, which is more suitable for the IDCT (one transposition is eliminated). Edge coefficient predictions are now produced along with the edge DCT coefficients. The overall performance gain on my machine (Zen3, x5950) is ~1.5%.
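To illustrate why transposed storage can save a transposition, here is a minimal sketch, not the code from this PR. It assumes a separable two-pass transform in which each pass works down the columns of an 8x8 block with a transposition between passes. All names (`transpose`, `column_pass`, `transform_from_raster`, `transform_from_transposed`) are hypothetical, and a running sum stands in for the real 1D IDCT kernel.

```rust
const N: usize = 8;
type Block = [i32; N * N];

/// Swap rows and columns of an 8x8 block.
fn transpose(b: &Block) -> Block {
    let mut out = [0i32; N * N];
    for r in 0..N {
        for c in 0..N {
            out[c * N + r] = b[r * N + c];
        }
    }
    out
}

/// Placeholder 1D transform applied down each column (a running sum).
/// A real decoder would run its 1D IDCT kernel here instead.
fn column_pass(b: &Block) -> Block {
    let mut out = *b;
    for r in 1..N {
        for c in 0..N {
            out[r * N + c] += out[(r - 1) * N + c];
        }
    }
    out
}

/// Input in raster order: two passes and two transpositions
/// are needed to get the result back in raster order.
fn transform_from_raster(b: &Block) -> Block {
    transpose(&column_pass(&transpose(&column_pass(b))))
}

/// Input already in transposed order: the same result in raster order
/// with only one transposition.
fn transform_from_transposed(bt: &Block) -> Block {
    column_pass(&transpose(&column_pass(bt)))
}

fn main() {
    // Arbitrary small test values; not real DCT coefficients.
    let mut block = [0i32; N * N];
    for (i, v) in block.iter_mut().enumerate() {
        *v = (i as i32 * 7) % 11 - 5;
    }
    let from_raster = transform_from_raster(&block);
    let from_transposed = transform_from_transposed(&transpose(&block));
    assert_eq!(from_raster, from_transposed);
}
```

In this sketch the call to `transpose` in `main` only builds the test input; the saving comes from coefficients already being stored in that layout, so no extra work is needed at transform time.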