test/compatibility/windows-1252.tst
author ali <ali@juiblex.co.uk>
Thu May 30 18:33:44 2013 +0100 (2013-05-30)
changeset 72 52d4a7f926b4
permissions -rw-r--r--
Support WINDOWS-1252 characters encoded as UTF-8
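The changeset summary above concerns Windows-1252 characters that arrive encoded as UTF-8. As a rough illustration only, the Python sketch below shows why such characters can shift column numbers: each one occupies a single byte in Windows-1252 but two or three bytes in UTF-8. gutcheck itself is written in C, and every name here (`SAMPLE`, `LINE`, `cp1252_roundtrip`, `char_column`, `byte_column`) is hypothetical and not taken from its source.

```python
# Illustrative sketch only; none of these names come from gutcheck.

# Z-caron, AE, OE, oe, Y-diaeresis, multiplication sign, division sign, trademark
SAMPLE = "\u017d\u00c6\u0152\u0153\u0178\u00d7\u00f7\u2122"

# The line that the fixture expects a "Missing space?" warning for.
LINE = ("The trademark symbol \u2122 and \u0153thel might,for "
        "whatever reason, confuse the")


def cp1252_roundtrip(text: str) -> bytes:
    """Encode text as Windows-1252, then re-encode the same characters as UTF-8."""
    raw = text.encode("cp1252")  # one byte per character in Windows-1252
    return raw.decode("cp1252").encode("utf-8")


def char_column(line: str, needle: str) -> int:
    """1-based column of the first occurrence of needle, counted in characters."""
    return line.index(needle) + 1


def byte_column(line: str, needle: str) -> int:
    """1-based column of the first occurrence of needle, counted in UTF-8 bytes."""
    return len(line[: line.index(needle)].encode("utf-8")) + 1


if __name__ == "__main__":
    print(len(SAMPLE.encode("cp1252")), len(cp1252_roundtrip(SAMPLE)))
    print(char_column(LINE, ","))  # 39, counting characters
    print(byte_column(LINE, ","))  # 42, counting UTF-8 bytes
```

Counting in characters puts the comma at column 39, which agrees with the column in the fixture's expected warning; counting UTF-8 bytes would put it at 42.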
**************** ENCODING ****************
WINDOWS-1252
**************** INPUT ****************
gutcheck has only a very limited support for windows-1252, but it does
recognise some characters as letters.

Žal at the start of a paragraph would throw a warning if its first letter
wasn't recognised since the paragraph would then appear to start with
something other than a capital letter. Æsop likewise proves that ash is
seen as a letter (otherwise a warning would be given for a period not
followed by a capital letter). Œcolampadius does the same for œthel.

Ÿ-decay is something I don't even pretend to understand, but I'm quite
happy to abuse it to test that strange letter.

Contrawise, we can prove that some characters are _not_ seen as letters
since neither 2×2=4 nor 4÷2=2 produce a warning (if they had been seen
as letters, we would expect ‘Query digit’ warnings).

The trademark symbol ™ and œthel might,for whatever reason, confuse the
column numbers in warnings.

**************** EXPECTED ****************

gutcheck has only a very limited support for windows-1252, but it does
    Line 1 column 1 - Paragraph starts with lower-case

The trademark symbol ™ and œthel might,for whatever reason, confuse the
    Line 17 column 39 - Missing space?