Revision 0da4c671

ID 0da4c671659cfbae12def127b2e94690b9d9b5e1
Parent 881ac26f
Child 535c7777

Added by Felix Geisendörfer about 10 years ago

string_bytes: Guarantee valid utf-8 output

Previously, v8's WriteUtf8 function would produce invalid UTF-8 output
when encountering unmatched surrogate code units [1]. The new
REPLACE_INVALID_UTF8 option fixes that by replacing invalid code points
with the Unicode replacement character (U+FFFD).

[1]: JS strings are defined as arrays of 16-bit unsigned integers. There
is no Unicode enforcement, so one can easily end up with invalid Unicode
code unit sequences inside a string.
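
To illustrate the behaviour described above, here is a minimal sketch in Python. It is not v8's implementation and does not use v8's API. The function name utf16_units_to_utf8 is hypothetical, and modelling a JS string as a plain list of 16-bit integers is an assumption made for illustration. It shows how an unmatched surrogate can be swapped for U+FFFD so that the encoded output is always valid UTF-8.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def utf16_units_to_utf8(units):
    """Encode a sequence of 16-bit code units as UTF-8.

    Hypothetical helper: unmatched surrogates are replaced with U+FFFD
    instead of being written out as invalid byte sequences.
    """
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: only valid if a low surrogate follows it.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                low = units[i + 1]
                code_points.append(
                    0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
                )
                i += 2
                continue
            code_points.append(REPLACEMENT)
        elif 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            code_points.append(REPLACEMENT)
        else:
            # Ordinary code unit in the Basic Multilingual Plane.
            code_points.append(unit)
        i += 1
    return "".join(chr(cp) for cp in code_points).encode("utf-8")


# "a", a lone high surrogate, then "b": the surrogate becomes U+FFFD.
encoded = utf16_units_to_utf8([0x61, 0xD83D, 0x62])
```

A properly matched pair such as [0xD83D, 0xDE00] is combined into a single code point and encoded as four bytes, while a lone surrogate in any position yields the three-byte encoding of U+FFFD.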
