Add guillemet smart punctuation (<< and >> to « and »)
When --smart is enabled, << is converted to « (U+00AB) and >> to »
(U+00BB). Single < and > are unaffected. Autolinks and HTML tags
take precedence over guillemet detection.
Fix off-by-one in hex entity digit limit
The decimal digit check `(1..=7).contains(&num_digits)` was joined by
`||` so it also matched hex entities with 7 digits. Extract `is_hex`
and gate each branch so hex allows 1-6 digits and decimal allows 1-7.
Fixes: `A` (7 hex digits) was incorrectly parsed as `A`
instead of being treated as literal text per the CommonMark spec.
Support comma-delimited syntect language tokens
Background:
- cmark-gfm preserves `language-rust,ignore` (whitespace-delimited info token).
- GitHub rendering highlights `rust,ignore` as Rust.
This keeps Comrak's HTML class behavior unchanged while improving syntect token lookup by splitting on the first comma for syntax selection.
Includes a regression test for fenced blocks using `rust,ignore`.
Co-authored-by: Cosmic Horror <CosmicHorrorDev@pm.me>