`utf8proc_map_custom` custom_data should be `const void *` (not `void *`) and could corrupt memory
#249
Comments
Here is an alternative implementation of
I don't think the However, I'm reluctant to make the pointer
I think the core of the issue is that this "map function might call your callback once, probably will call it twice per element" behavior is just plain weird. Is there any other context in the world where a
OK, point taken. But, see above
Fair enough, but in that case I would think it also conventional to provide an alternative whereby the callback accepts (and is passed) an index parameter indicating the element position. If it isn't possible to call the custom function only once, instead of twice, per element, then maybe a solution would be to offer a more granular form that accepts two custom funcs. Note, however, that this would not address the problem of the order assumption.
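To make the index-parameter idea concrete, here is a rough sketch in plain C. Everything in it is hypothetical: `indexed_map_func`, `two_pass_map_indexed` and the demo helper are illustrative names, not part of utf8proc, and the mapper only mimics the "call once for sizing, once for writing" pattern discussed here. Because the callback decides from the position alone and takes `const void *`, it gives the same answer on both passes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index-aware callback type (NOT part of utf8proc's API). */
typedef int32_t (*indexed_map_func)(int32_t codepoint, size_t index,
                                    const void *data);

/* Hypothetical two-pass mapper: pass 1 discards results (sizing only),
 * pass 2 keeps them. The callback runs twice per element. */
static size_t two_pass_map_indexed(const int32_t *in, size_t n, int32_t *out,
                                   indexed_map_func fn, const void *data)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {      /* pass 1: count only */
        (void)fn(in[i], i, data);
        count++;
    }
    for (size_t i = 0; i < n; i++)        /* pass 2: write results */
        out[i] = fn(in[i], i, data);
    return count;
}

/* Replace the first codepoint with 'A' using only the index, no state. */
static int32_t replace_first_by_index(int32_t cp, size_t index,
                                      const void *data)
{
    (void)data;                           /* unused: nothing to mutate */
    return index == 0 ? 'A' : cp;
}

/* Maps "xyz" and returns output element i (caller keeps i < 3). */
static int32_t demo_indexed_output(size_t i)
{
    const int32_t in[3] = { 'x', 'y', 'z' };
    int32_t out[3] = { 0, 0, 0 };
    two_pass_map_indexed(in, 3, out, replace_first_by_index, NULL);
    return out[i];
}
```

With this shape the double invocation is harmless for position-based transforms, though it still assumes elements are visited with meaningful indices, which is the order assumption mentioned above.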
Hi,

`utf8proc_map_custom` takes a `void *custom_data` parameter. However, if the `custom_data` is modified by the custom function, `utf8proc_map_custom` might not work as expected, and could possibly corrupt memory, because `utf8proc_map_custom` calls `utf8proc_decompose_custom` twice, and only the second call's results are kept, but there is no way to "reset" the custom data to its initial state before the second call.

As an example, imagine I use a custom transformation to replace the first character with 'A', but keep the rest of the string. Using `utf8proc_map_custom` would seem easy enough:

However, this will not actually work because `utf8proc_map_custom` only keeps the results of its second `utf8proc_decompose_custom` call, at which time `ctx->start_of_string` will already be set to 0.

I believe this could also lead to memory corruption if the above example was run with an input string that had a multi-byte first character (in which case the first run of `utf8proc_decompose_custom` would receive a length assuming a single-byte first char, but the second run would write a multi-byte first char).

It would be nice if there was a way to fix this issue inside `utf8proc_map_custom` without changing its signature, but that does not seem possible. But at a minimum, it would seem safer and more accurate if `custom_data` was a `const void *` instead of a `void *` (and arguably, it is a bug in the latter form as it currently stands).
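For anyone who wants to see the state problem in isolation, below is a minimal sketch that does not use utf8proc at all. `two_pass_map` is a hypothetical stand-in that only imitates the behavior described in this issue (the callback runs over the input twice and only the second pass is kept); `first_char_ctx` and `replace_first` are likewise assumed names, modeled on the `ctx->start_of_string` flag mentioned above. It is not the library's actual implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback shape assumed for illustration: one codepoint in, one out,
 * plus a mutable user-data pointer. */
typedef int32_t (*map_func)(int32_t codepoint, void *data);

/* Hypothetical two-pass mapper (NOT utf8proc's code): pass 1 is used
 * only for sizing and its results are thrown away; pass 2 is kept. */
static size_t two_pass_map(const int32_t *in, size_t n, int32_t *out,
                           map_func fn, void *data)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {      /* pass 1: results discarded */
        (void)fn(in[i], data);
        count++;
    }
    for (size_t i = 0; i < n; i++)        /* pass 2: results kept */
        out[i] = fn(in[i], data);
    return count;
}

struct first_char_ctx { int start_of_string; };

/* Intended behavior: replace only the first codepoint with 'A'. */
static int32_t replace_first(int32_t cp, void *data)
{
    struct first_char_ctx *ctx = data;
    if (ctx->start_of_string) {
        ctx->start_of_string = 0;         /* mutation survives into pass 2 */
        return 'A';
    }
    return cp;
}

/* Maps "xyz" and returns the first kept output codepoint. */
static int32_t demo_first_output(void)
{
    const int32_t in[3] = { 'x', 'y', 'z' };
    int32_t out[3] = { 0, 0, 0 };
    struct first_char_ctx ctx = { 1 };
    two_pass_map(in, 3, out, replace_first, &ctx);
    return out[0];  /* 'x', not 'A': the flag was cleared during pass 1 */
}
```

In this sketch the flag is consumed by the discarded first pass, so the kept output starts with the original character. If the callback could change the encoded size of an element, the two passes would also disagree on length, which is the memory-safety concern raised above.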