This formulation is equivalent to the original (reference)
implementation but runs _significantly_ faster - this speeds up
the filtering portion of a Paeth-heavy 8192x8192 16-bit/channel
image by a factor of more than 2 on a Zen2 CPU.
I'm investigating doing a more thorough restructuring of this pass,
but this seems like a good first step.
We speculatively try to fill our bit buffer to always contain
at least 16 bits for stbi__zhuffman_decode. It's not a sign of
a malformed stream for us to be reading past the end there,
because the contents of that bit buffer are speculative; it's
only a malformed stream if we actually _consume_ the extra bits.
This fix adds some extra logic where we the first time we hit
zeof, we add an explicit 16 extra zero bits at the top of the
bit buffer just so that for the purposes of the decoder, we have
16 bits in the buffer.
However, if at the end of stream, we have the "hit zeof once"
flag set and less than 16 bits remaining in the bit buffer, we
know some of those implicit zero bits got read, which indicates
we actually had a past-end-of-stream read. In that case, flag
it as an error.
While I'm in here, also rephrase the length-too-large check to
not do any potentially-overflowing pointer arithmetic.
Fixes issue #1456.
For paletted images, header-scanning mode (used by stbi_info) kept
going after the image header to see if a tRNS (transparency) chunk
shows up to correctly determine the channel count, but we would
immediately return after IHDR for non-paletted images.
This is incorrect. We also change our reported channel count on
RGB images with a tRNS chunk, therefore non-paletted images also
have to resume scanning further chunks.
Update the logic to keep scanning regardless and report the
correct channel count from stbi_info for RGB images with tRNS
(constant alpha channel).
Fixes issue #1419.
Limit to 1k, which is the maximum size of a 256-entry palette that
would ordinarily go there. It feels sensible to relax this a bit but
we don't want to go overboard.
Fixes issue #1325.
Specifically, this rejects length codes 286 and 287, and distance codes 30 and 31.
This avoids a scenario in which a file could contain a table in which
0 corresponded to length code 287, which would result in writing 0 bits.
Signed-off-by: Neil Bickford <nbickford@nvidia.com>
The component resamplers are not written to support this and I've
never seen it happen in a real (non-crafted) JPEG file so I'm
fine rejecting this as outright corrupt.
Fixes issue #1178.
As per MS's own docs, should ignore the r/g/b bitmasks in the
header unless BI_BITFIELDS compression is selected. Factor out
setting the default masks since that now exists in two branches.
Add some more checking for unsupported compression formats and
illegal bpp/compression combinations while I'm at it.
Fixes issue #783.
Put the formats that start with a clear magic number first,
the dodgy ones that don't have much of a distinctive header
should be tested for later after we've ruled out the clearer
ones.
Fixes issue #787, hopefully. (Never got a clean repro.)
It's implementation-specified behavior. Writing this code and then
relying on compiler strength reduction to turn it back into shifts
feels extremely silly but it is what it is.
Fixes issue #1097.
Define lrot in a way that doesn't involve UB when n==0.
Also, the previous patch ensures that n <= 15 for all callers
of stbi__extend_receive, so can remove the (less restrictive)
bounds check for 0 <= n < 17 (the bounds of stbi__bmask)
entirely.
Fixes issue #1065.
extend_receive implicitly requires n <= 15 (code length);
the maximum that actually makes sense for 8-bit baseline JPEG is
11, but 15 is the natural limit for us because the AC coding path
stores the number of magnitude bits in a nibble.
Check that DC delta bits are in range before attempting to call
extend_receive.
Fixes issue #1108.