Strip invalid XML 1.0 control characters (0x00-0x08, 0x0B-0x0C, 0x0E-0x1F) from text to prevent corrupted docx files that fail to open in LibreOffice. Fixes SAXParseException 'PCData Invalid Char value' errors.
16 lines
392 B
Ruby
# frozen_string_literal: true

module Notare
  module XmlSanitizer
    # Invalid XML 1.0 control characters: 0x00-0x08, 0x0B-0x0C, 0x0E-0x1F
    # Valid whitespace preserved: 0x09 (tab), 0x0A (LF), 0x0D (CR)
    INVALID_XML_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F]/

    # Returns a copy of text with invalid control characters removed.
    # Non-String values are returned unchanged.
    def self.sanitize(text)
      return text unless text.is_a?(String)

      text.gsub(INVALID_XML_CHARS, "")
    end
  end
end
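A minimal usage sketch may help show what the sanitizer keeps and what it drops. The module is copied here only so the snippet runs on its own; the input string and the variable names `dirty` and `clean` are hypothetical and do not come from the Notare codebase.

```ruby
# frozen_string_literal: true

# Copy of the module above, repeated so this snippet is self-contained.
module Notare
  module XmlSanitizer
    INVALID_XML_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F]/

    def self.sanitize(text)
      return text unless text.is_a?(String)

      text.gsub(INVALID_XML_CHARS, "")
    end
  end
end

# Hypothetical input: a NUL byte and a vertical tab mixed with
# legitimate whitespace (tab, CR, LF).
dirty = "Total:\x00 42\x0B items\tdone\r\n"
clean = Notare::XmlSanitizer.sanitize(dirty)

puts clean.inspect                                # => "Total: 42 items\tdone\r\n"
puts Notare::XmlSanitizer.sanitize(nil).inspect   # => nil
puts Notare::XmlSanitizer.sanitize(42).inspect    # => 42
```

The tab, carriage return, and line feed survive because XML 1.0 allows them, while the NUL and vertical tab are removed. Non-String values pass through untouched, so callers can hand over optional fields without checking for `nil` first.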