We tried but it was nothing but trouble. New rule: with
GCC/Clang, if you're compiling with -msse2, you get always-on
SSE2 code, otherwise you don't get any. Trying to ship
anything with proper runtime dispatch requires both working
around certain bugs and some fiddling with build settings,
which runs contrary to the intent of a one-file library,
so bail on it entirely.
Fixes issue #280.
Fixes issue #410.
stbi__process_frame_header had two bugs when dealing with progressive
JPEGs:
1. when malloc failed allocating raw_data, previous components'
raw_coeff didn't get freed
2. no out-of-memory check in raw_coeff allocation
Fix both and share a bit more cleanup code in general.
Adds some helpers that check whether a product of multiple
factors (that need to be non-negative: this is enforced)
summed with another non-negative value overflows when
performed as int. Since stb_image mostly works in ints,
this seems like the safest route. Limits size of images
to 2GB but several of the decoders already enforce this
limit (or even lower ones).
Also adds wrappers for malloc that combine a mul-add-with-
overflow-check with the actual malloc, and return NULL
on failure. Then use them when allocating something that
is the product of multiple factors.
For image formats, also add a top-level "is this too big?"
check that gives a more useful error message; otherwise,
the failed mallocs result in an "out of memory" error.
The idea is that the top-level checks should be the primary
way to catch these bugs (and produce a useful error message).
But a misleading error message is still vastly preferable to
a buffer overflow exploit.
Fixes issues #310, #313, #314, #318. (Verified with the
provided test images)
Along the way, this fixes a previously unnoticed bug in
ldr_to_hdr / hdr_to_ldr (missing NULL check); these functions
are called with the result of an image decoder, so NULLs can
definitely happen.
Another bug noticed along the way is that handling of
interlaced 16-bit PNGs was incorrect. Fixing this (along
with the previous modifications) fixes issue #311.
Yet another bug noticed during this change is that reduce_png
did not check the right pointer during its out of memory
check. Fix that too.
I claimed that if the most significant bit of a 16bit pixel is set,
it should be opaque (as is suggested by some sources on the internet),
but implemented the opposite.
If implemented "correctly", lots of 16bit TGAs become invisible.. so I
guess 16bit TGAs aren't really supposed to have an alpha-channel, or at
least most 16bit TGAs (despite having set an "alpha-bit" in the "image
descriptor byte") in the wild don't seem to work like that.
So just assume 16bit non-greyscale TGAs are always STBI_rgb without
an alpha channel.
* Calculate correct stb format (incl. proper 16bit support) also when
using a colormap (palette)
* Create colormap with tga_comp, to correctly support 16bit RGB
(instead of using tga_palette_bits/8 and just copying the data)
* For TGAs with colormap, the TGA bits per pixel field specifies the
size of an index to the colormap - the "real" color depth
of the image is saved in the color map specification's bits per pixel
field. I think only 8 and 16bit indices make sense (16 should be
supported, otherwise the colormap length could be u8 instead of u16),
so I added support for both.
* Helper functions stbi__tga_get_comp() to calculate stb pixelformat and
stbi__tga_read_rgb16() to read one 16bit pixel and convert it to
24/32bit RGB(A) - for less duplicate code
* for paletted images, .._info()'s comp should be based on the palette's
bits per pixel, not the images bits per pixel (which describes the
size of an index into the palette and is also checked now)
* make sure the color (map) type and the image type fields of the header
are consistent (=> if TGA color type is 1 for paletted, the TGA image
type must be 1 or 9)
* .._test() does some more checks and uses stbi__get16le() instead of
stbi__get16be() - TGA is little endian.
* .._test() now always rewinds (sometimes it used to do only return 0;
without rewinding)
* remove "error check" at the beginning of stbi__tga_load(), because
all that is already tested in stbi__tga_test()
stbi__tga_* assumed that 16bit TGAs were Grayscale + Alpha.
However, if the TGA imagetype is not one of the gray ones, it's 16Bit
RGB data, with 5 Bits per channel. If the TGA image descriptor field
has alpha bits (the 3 least significant ones) set, the pixel's most
significant bit is for alpha: 1 for opaque and 0 for translucent.
Furthermore people claim that TGAs can also pretend to have 15bpp,
which is the same as 16bpp but definitely without alpha.
So 15/16bpp TGAs are now decoded to STBI_rgb(_alpha).
The warning concerns the return value of stbi_err, which is an int, being converted to a pointer. In VS2015 it seems casting directly from a 32-bit int to a 64-bit pointer triggers this warning. Worked around by first converting to a 64-bit int (here size_t) and then to a pointer.
The original AC decoding logic handled ZRL (runs of 16 zeros)
incorrectly.
The problem is that the original flow set r=16 and skipped the
final coeff write when s=0. This is not actually correct. The
problem is the intervening "refinement" bits.
With the original logic, even once we decrement r to 0, we keep
reading more refinement bits for subsequent coefficients until
we find the next currently-unsent AC in the current scan. That is,
it works as if it was trying to place 17 new AC values, and only
bails at the last minute from actually setting that 17th value.
This is wrong. Once we've found the 16th previously-unsent AC, we
need to stop reading refinement bits, otherwise we get out of sync
with the bit stream (which expects us to read a huffman code next).
The easiest fix is to just do what the JPEG standard implicitly
assumes anyway: treat ZRL as a run of 15 zeros followed by an
explicit magnitude-zero AC coeff. (That is, leave s=0 and actually
write s). So this is what this fix does.
GCC 4.7 gave the warning "signed and unsigned type in conditional
expression" because the ternary operator mixes signed and unsigned
integers. Fixed by casting to unsigned inside the "if" branch instead
of casting the result of the entire conditional.
This fixes two things. First, the logic to disable SSE2 on
GCC unless "-msse2" was not specific enough, and ended up
disabling SIMD support on NEON targets entirely. Shuffle
the detection logic around to make that bit x86-specific.
Second, 32-bit MinGW assumes 16-byte aligned stacks, but this is
not in the Windows ABI and hence DLLs and callbacks don't
necessarily provide it. This caused a crash.
This can be fixed by providing the right command-line option,
which we have no control over. As a compromise, disable the SSE2
path on MinGW unless a specific #define explained in the comments
is set. That way, we default to safe (never-crashing) behavior
unless the user explicitly signals they know what they're doing.
add STBI__ prefixes to internal SCAN_ enum;
strip unused function arguments for progressive funcs;
tweak release notes;
forget to git commit frequently so these would all be in their own commits;