Allocate filter_buf for two scan lines that we do all the filter
processing in, then do all other conversions (bit depth,
endianness, inserting alpha=255 values) on the way out.
Separating the two concerns makes everything much clearer.
This formulation is equivalent to the original (reference)
implementation but runs _significantly_ faster - this speeds up
the filtering portion of a Paeth-heavy 8192x8192 16-bit/channel
image by a factor of more than 2 on a Zen2 CPU.
I'm investigating doing a more thorough restructuring of this pass,
but this seems like a good first step.
I want to make some changes to the PNG loader, this is to get some
test coverage at least to make it easier not to break anything.
The two ref_results files that are "corrupt" files that stb_image
nevertheless loads without error are checksum failures; this is
by design, since stb_image does not verify checksums.
We speculatively try to fill our bit buffer to always contain
at least 16 bits for stbi__zhuffman_decode. It's not a sign of
a malformed stream for us to be reading past the end there,
because the contents of that bit buffer are speculative; it's
only a malformed stream if we actually _consume_ the extra bits.
This fix adds some extra logic where we the first time we hit
zeof, we add an explicit 16 extra zero bits at the top of the
bit buffer just so that for the purposes of the decoder, we have
16 bits in the buffer.
However, if at the end of stream, we have the "hit zeof once"
flag set and less than 16 bits remaining in the bit buffer, we
know some of those implicit zero bits got read, which indicates
we actually had a past-end-of-stream read. In that case, flag
it as an error.
While I'm in here, also rephrase the length-too-large check to
not do any potentially-overflowing pointer arithmetic.
Fixes issue #1456.
For paletted images, header-scanning mode (used by stbi_info) kept
going after the image header to see if a tRNS (transparency) chunk
shows up to correctly determine the channel count, but we would
immediately return after IHDR for non-paletted images.
This is incorrect. We also change our reported channel count on
RGB images with a tRNS chunk, therefore non-paletted images also
have to resume scanning further chunks.
Update the logic to keep scanning regardless and report the
correct channel count from stbi_info for RGB images with tRNS
(constant alpha channel).
Fixes issue #1419.
Limit to 1k, which is the maximum size of a 256-entry palette that
would ordinarily go there. It feels sensible to relax this a bit but
we don't want to go overboard.
Fixes issue #1325.
Specifically, this rejects length codes 286 and 287, and distance codes 30 and 31.
This avoids a scenario in which a file could contain a table in which
0 corresponded to length code 287, which would result in writing 0 bits.
Signed-off-by: Neil Bickford <nbickford@nvidia.com>