Sanitize invalid XML characters in text content

Strip invalid XML 1.0 control characters (0x00-0x08, 0x0B-0x0C, 0x0E-0x1F)
from text to prevent corrupted docx files that fail to open in LibreOffice.

Fixes SAXParseException 'PCData Invalid Char value' errors.
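
A minimal sketch of the stripping described above, for illustration only. The
module name XmlSanitizerSketch, the constant, and the sanitize method are
hypothetical; the actual contents of notare/xml_sanitizer are not shown in
this excerpt and may differ. Tab (0x09), line feed (0x0A), and carriage
return (0x0D) are legal in XML 1.0 and are left untouched.

```ruby
# Hypothetical sketch; not the actual notare/xml_sanitizer implementation.
module XmlSanitizerSketch
  # Control characters that XML 1.0 forbids in character data.
  # 0x09 (tab), 0x0A (LF), and 0x0D (CR) are valid and deliberately excluded.
  INVALID_XML_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F]/

  # Returns a copy of text with the forbidden control characters removed.
  # Non-string input (for example nil) is returned unchanged.
  def self.sanitize(text)
    return text unless text.is_a?(String)

    text.gsub(INVALID_XML_CHARS, "")
  end
end

XmlSanitizerSketch.sanitize("Hello\u0000World\u000B!") # => "HelloWorld!"
```

Removing the characters, rather than escaping them, matters here: XML 1.0
does not permit these code points even as numeric character references, so
there is no valid way to keep them in the document.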
2026-01-22 09:10:33 +01:00
parent 8b4f538cbb
commit 64c8679044
6 changed files with 108 additions and 2 deletions

@@ -3,6 +3,7 @@
 require "nokogiri"
 require_relative "notare/version"
+require_relative "notare/xml_sanitizer"
 require_relative "notare/nodes/base"
 require_relative "notare/nodes/break"
 require_relative "notare/nodes/hyperlink"