Fix bug #19: Update documentation for 2.1 default tip
author ali <ali@juiblex.co.uk>
Wed Oct 02 09:14:33 2013 +0100 (2013-10-02)
changeset 105 2d48e8cdda24
parent 104 70cc629ec1e0
Fix bug #19: Update documentation for 2.1
configure.ac
doc/bookloupe.txt
doc/loupe-test.txt
     1.1 --- a/configure.ac	Wed Oct 16 22:51:29 2013 +0100
     1.2 +++ b/configure.ac	Wed Oct 02 09:14:33 2013 +0100
     1.3 @@ -1,7 +1,7 @@
     1.4  #                                               -*- Autoconf -*-
     1.5  # Process this file with autoconf to produce a configure script.
     1.6  
     1.7 -AC_INIT([bookloupe],[2.0],[ali@juiblex.co.uk])
     1.8 +AC_INIT([bookloupe],[2.1],[ali@juiblex.co.uk])
     1.9  AC_PREREQ(2.59)
    1.10  AC_CONFIG_AUX_DIR([config])
    1.11  AC_CONFIG_SRCDIR([bookloupe/bookloupe.c])
     2.1 --- a/doc/bookloupe.txt	Wed Oct 16 22:51:29 2013 +0100
     2.2 +++ b/doc/bookloupe.txt	Wed Oct 02 09:14:33 2013 +0100
     2.3 @@ -9,7 +9,7 @@
     2.4  Microsoft Windows, Mac or Unix. For Windows-only people, there is
     2.5  an appendix at the end with brief instructions for running it.
     2.6  
     2.7 -Current version: 2.0
     2.8 +Current version: 2.1
     2.9  
    2.10  This software is Copyright Jim Tinsley 2000-2005 and
    2.11  J. Ali Harlow 2012 onwards.
    2.12 @@ -20,31 +20,128 @@
    2.13  See http://www.juiblex.co.uk/pgdp/bookloupe/ for the latest version.
    2.14  
    2.15  
    2.16 -Usage is: bookloupe [-setopxlywm] filename
    2.17 -      where:
    2.18 -      -s checks Single quotes 
    2.19 -      -e switches off Echoing of lines 
    2.20 -      -t checks Typos
    2.21 -      -o produces an Overview only
    2.22 -      -p sets strict quotes checking for Paragraphs
    2.23 -      -x (paranoid) switches OFF typo checking and extra checks
    2.24 -      -l turns off Line-end checks
    2.25 -      -y sets error messages to stdout
    2.26 -      -w is a special mode for web uploads (for future use)
    2.27 -      -v (verbose) forces individual reporting of minor problems
    2.28 -      -m interprets Markup of some common HTML tags and entities    
    2.29 -      -u warns about words in a user-defined typo file gutcheck.typ 
    2.30 -      -d ignores some DP-specific markup
    2.31 +                         Recent changes in behaviour
    2.32 +
    2.33 +Each new version of bookloupe brings bug fixes and improvements. Sometimes
    2.34 +the behaviour is also changed in ways that might be unexpected:
    2.35 +
    2.36 +Odd characters
    2.37 +
     2.38 +    The check for "odd" characters (tab, tilde, caret, forward slash and
    2.39 +    asterisks) is disabled in bookloupe 2.0 when the character set is
    2.40 +    switched from ASCII/ISO-8859-1 to UNICODE (ie., when the "There are a
    2.41 +    lot of foreign letters here." message is printed). As of bookloupe 2.1
    2.42 +    these tests operate independently of the character set selected.
    2.43 +
    2.44 +    Users may notice this change most especially in the case of the
    2.45 +    DP-specific /* ... */ markup. Bookloupe 2.0 often did not warn when
    2.46 +    this markup was encountered even when the --dp switch was not given.
    2.47 +    Bookloupe 2.1 will warn about this markup unless dp-specific mode is
    2.48 +    switched on, paranoid mode is switched off or the ebook contains more
    2.49 +    than 10 lines containing asterisks. In the last case
    2.50 +
    2.51 +      --> 11 lines in this file contain asterisks. Not reporting them.
    2.52 +
    2.53 +    will be printed.
    2.54 +
    2.55 +
    2.56 +
    2.57 +Usage is: bookloupe [OPTION...] filename
    2.58 +
    2.59 +Options:
    2.60 +      -d, --dp                  ignores some DP-specific markup
    2.61 +      -e, --no-echo             switches off Echoing of lines
    2.62 +      -s, --squote              checks Single quotes
    2.63 +      --typo                    checks Typos
    2.64 +      -p, --qpara               sets strict quotes checking for Paragraphs
    2.65 +      --no-paranoid             switches OFF typo checking and extra checks
    2.66 +      -l, --no-line-end         turns off Line-end checks
    2.67 +      -o, --overview            produces an Overview only
    2.68 +      -y, --stdout              sets error messages to stdout
     2.69 +      -h, --header              echoes the header fields
    2.70 +      -m, --markup              ignore some common HTML markup
    2.71 +      -u, --usertypo            warns about words in a user-defined typo file
    2.72 +      -v, --verbose             forces individual reporting of minor problems
    2.73 +      -w, --web                 special mode for web uploads (for future use)
    2.74 +      --charset=NAME            the set of characters valid for this ebook
    2.75 +      --dump-config             dump the current configuration
    2.76 +
    2.77 +There are also inverted options available which are useful when it is
    2.78 +desired to override an option set in the configuration file:
    2.79 +
    2.80 +      --no-dp, --echo, --no-squote, --no-typo, --no-qpara, --paranoid,
    2.81 +      --line-end, --no-overview, --no-stdout, --no-header, --no-markup,
     2.82 +      --no-usertypo, --no-verbose.
    2.83 +
    2.84 +Note: there is no --no-web since --web simply selects a set of options.
    2.85 +
    2.86 +Finally there are a couple of options that toggle the state of options
    2.87 +rather than setting or unsetting them: -t (for typo) and -x (for typo
     2.88 +and paranoid). These are mainly intended for compatibility with gutcheck.
    2.89  
    2.90  Running bookloupe without any parameters will display a brief help message.
    2.91  
    2.92 -Sample usage: 
    2.93 +Sample usage:
    2.94  
    2.95      bookloupe warpeace.txt
    2.96  
    2.97  
    2.98  More detail:
    2.99  
   2.100 +    Configuration file
   2.101 +
   2.102 +      Bookloupe will look for a file named bookloupe.ini to read as
   2.103 +      a configuration file. Options set in a configuration file can
   2.104 +      be overridden from the command line as required.
   2.105 +
   2.106 +      The following directories are searched in order:
   2.107 +
   2.108 +        1) The current working directory. When run from the command
   2.109 +	line, this is the directory you ran it from. When run from
   2.110 +	guiguts it will normally be the directory that contains the
   2.111 +	guiguts program.
   2.112 +
   2.113 +	2) The directory containing the bookloupe program.
   2.114 +
   2.115 +	3) The user's configuration directory. Under MS-Windows this
   2.116 +	is normally CSIDL_LOCAL_APPDATA which is typically set to
   2.117 +	C:\Documents and Settings\<user>\Local Settings\Application Data.
   2.118 +	On other platforms this is normally $XDG_CONFIG_HOME which, if
    2.119 +	not set, defaults to $HOME/.config.
   2.120 +
   2.121 +	The directories to search can also be changed using the
   2.122 +	$BOOKLOUPE_CONFIG_PATH environment variable which is a colon
   2.123 +	separated (semi-colon separated under MS-Windows) list of
   2.124 +	directories.
   2.125 +
   2.126 +      The configuration file is a key file. This is very similar to,
   2.127 +      but not identical to a typical ini file as found under MS-Windows.
   2.128 +      Key files consist of a number of groups which start with the
   2.129 +      group name enclosed in square brackets on a line by itself.
   2.130 +      Bookloupe recognises just one group, "options". Then below the
    2.131 +      group name there follow the keys and their values for that
   2.132 +      group, one per line in the format key=value. Most of bookloupe's
   2.133 +      options are flags (ie., either on or off). For these keys, the
   2.134 +      value must be either "true" or "false". The file may also contain
   2.135 +      comment lines which begin with the # symbol. The names of the
   2.136 +      keys follow the long option names.
   2.137 +
   2.138 +      A sample configuration file is provided (in sample.ini). The file
   2.139 +      will need to be copied to bookloupe.ini before bookloupe will
   2.140 +      read it. You can also use the --dump-config option to write a
   2.141 +      configuration file for you. For example, if you typically want
   2.142 +      to run bookloupe with the --dp and --squote options, then you
   2.143 +      might do:
   2.144 +
   2.145 +        $ bookloupe --dp --squote --dump-config > configuration.ini
   2.146 +	$ ren configuration.ini bookloupe.ini
   2.147 +
   2.148 +      (Don't be tempted to merge these two steps or bookloupe will see
   2.149 +      an empty configuration file and complain.)
   2.150 +
   2.151 +      This same idea can also be used to modify an existing configuration.
   2.152 +
   2.153 +
   2.154      Character encoding
   2.155  
   2.156        Bookloupe will handle e-texts encoded in UTF-8 (preferred),
   2.157 @@ -52,44 +149,86 @@
   2.158        incorrectly, as ansi). The output will be in the same encoding
   2.159        as the input e-text.
   2.160  
   2.161 -    Echoing lines (-e to switch off)
   2.162  
   2.163 -      You may find it convenient, when reviewing Bookloupe's 
   2.164 +    Character set (--charset)
   2.165 +
   2.166 +      Character encodings have an implicit set of characters that
   2.167 +      can be encoded and thus define a set of characters that can
   2.168 +      be present in the text. However sometimes it is desirable
   2.169 +      that not all characters that can be encoded should be present
   2.170 +      in a text. The set of characters that should be present is
   2.171 +      known as the character set.
   2.172 +
   2.173 +      The default setting for the character set (called auto) does
   2.174 +      the same as gutcheck for Windows-1252 encoded texts for
    2.175 +      compatibility:
   2.176 +
   2.177 +      If the file is predominately ASCII then the set of legal
   2.178 +      characters is ASCII and warnings are issued whenever non-ASCII
   2.179 +      characters are encountered. The message will either warn of
   2.180 +      non-ASCII or non-ISO-8859-1 characters as appropriate.
   2.181 +
   2.182 +      If the file contains a significant number of non-ASCII characters
   2.183 +      then a message is printed as follows:
   2.184 +
   2.185 +        --> There are a lot of foreign letters here. Not reporting them.
   2.186 +
   2.187 +      and the character set is widened to include all possible
   2.188 +      characters.
   2.189 +
   2.190 +      For UTF-8 encoded texts, auto selects UNICODE.
   2.191 +      
   2.192 +      Most character sets are simply defined in bookloupe as the
   2.193 +      set of all characters that can be encoded in the encoding of
   2.194 +      the same name. UNICODE is an exception and includes only the
   2.195 +      characters assigned in the relevant Unicode standard but
   2.196 +      excluding the Private Use Area characters. Note that the
   2.197 +      relevant Unicode standard is given by the version of glib in
   2.198 +      use rather than by any code in bookloupe and thus can vary
   2.199 +      from system to system. PG texts however are likely to be
   2.200 +      using characters assigned in very early Unicode standards,
   2.201 +      thus mitigating this issue.
   2.202 +
   2.203 +
   2.204 +    Echoing lines (--no-echo to switch off)
   2.205 +
   2.206 +      You may find it convenient, when reviewing Bookloupe's
   2.207        suggestions, to see the line that Bookloupe is questioning.
   2.208        That way, you can often see at a glance whether it is
   2.209        a real error that needs to be fixed, or a false positive
   2.210        that should be in the text, but Bookloupe's limited
   2.211        programming doesn't understand.
   2.212  
   2.213 -      By default, bookloupe echoes these lines, but if you don't 
   2.214 -      want to see the lines referred to, -e will switch it OFF.
   2.215 +      By default, bookloupe echoes these lines, but if you don't
   2.216 +      want to see the lines referred to, --no-echo will switch it
   2.217 +      OFF.
   2.218  
   2.219  
   2.220 -    Quotes (-s and -p switches)
   2.221 +    Quotes (--squote and --qpara switches)
   2.222  
   2.223 -      Bookloupe always looks for unbalanced doublequotes in a 
   2.224 +      Bookloupe always looks for unbalanced doublequotes in a
   2.225        paragraph. It is a common convention for writers not to
   2.226        close quotes in a paragraph if the next paragraph opens
   2.227        with quotes and is a continuation by the same speaker.
   2.228  
   2.229 -      Bookloupe therefore does not normally report unclosed quotes 
   2.230 +      Bookloupe therefore does not normally report unclosed quotes
   2.231        if the next paragraph begins with a quote. If you need
   2.232        to see all unclosed quotes, even where the next paragraph
   2.233        begins with a quote, you should use the -p switch.
   2.234  
   2.235 -      Singlequotes (' and ’) are a problem, since the same
   2.236 -      character is used for an apostrophe. I'm not sure that it is
   2.237 -      possible to get 100% accuracy on singlequotes checking,
   2.238 +      Singlequotes (', `, ‘ and ’) are a problem, since the same
   2.239 +      character can be used for an apostrophe. I'm not sure that it
   2.240 +      is possible to get 100% accuracy on singlequotes checking,
   2.241        particularly since dialect, quite common in PG texts,
   2.242        upsets the normal rules so badly. Consider the sentence:
   2.243          'Tis often said that a man's a man for a' that.
   2.244        As humans, we recognize that both apostrophes are used
   2.245 -      for contractions rather than quotes, but it isn't easy 
   2.246 +      for contractions rather than quotes, but it isn't easy
   2.247        to get a program to recognize that.
   2.248  
   2.249        Since bookloupe makes too many mistakes when trying to match
   2.250        singlequotes, it doesn't look for unbalanced singlequotes
   2.251 -      unless you specify the -s switch.
   2.252 +      unless you specify the --squote switch.
   2.253  
   2.254        Consider these sentences, which illustrate the main cases:
   2.255  
   2.256 @@ -102,12 +241,11 @@
   2.257          Those 'pack dogs' of yours look more like wolves.
   2.258  
   2.259  
   2.260 +    Typos (--typo switch)
   2.261  
   2.262 -    Typos (-t switch)
   2.263 -
   2.264 -      It's not bookoupe's job to be a spelling checker, but it
   2.265 -      does check for a list of common typos and OCR errors if you
   2.266 -      use the -t switch. (The -x switch also turns typo checking on.)
    2.267 +      It's not bookloupe's job to be a spelling checker, but it does
    2.268 +      check for a list of common typos and OCR errors if you use the
    2.269 +      --typo switch. (The -t and -x switches also toggle typo checking.)
   2.270  
   2.271        It also checks for character combinations, especially involving
   2.272        h and b, which are often confused by OCR, that rarely or never
   2.273 @@ -119,10 +257,10 @@
   2.274        Bookloupe suppresses multiple reporting of the first 40 "typos"
   2.275        found. This is to remove the annoyance of seeing something like
   2.276        "FN" (footnote) or "LK" (initials) flagged as a typo 147 times
   2.277 -      in a text. 
   2.278 +      in a text.
   2.279  
   2.280  
   2.281 -    Line-end checking (-l switch to disable)
   2.282 +    Line-end checking (--no-line-end switch to disable)
   2.283  
   2.284        All PG texts should have a Carriage Return (CR - character 13)
   2.285        and a Line Feed (LF - character 10) at end of each line,
   2.286 @@ -134,31 +272,31 @@
   2.287        the correct terminator, but if you're on a work-in-progress
   2.288        in Linux, you might want to convert the line-ends as a final
   2.289        step, and not want to see thousands of errors every time you
   2.290 -      run bookloupe before that final step, so you can turn off 
   2.291 -      this checking with the -l switch.
   2.292 +      run bookloupe before that final step, so you can turn off
   2.293 +      this checking with the --no-line-end switch.
   2.294  
   2.295  
   2.296 -    Paranoid mode (-x switch to disable: Trust No One :-)
   2.297 +    Paranoid mode (--no-paranoid switch to disable: Trust No One :-)
   2.298  
   2.299 -      -x switches OFF typo-checking, the -t flag, automatically
   2.300 -      and some extra checks like standalone 1 and 0 queries.
   2.301 +      --no-paranoid switches OFF some extra checks like standalone
   2.302 +      1 and 0 queries.
   2.303  
   2.304  
   2.305 -    Overview mode (-o switch)
   2.306 +    Overview mode (--overview switch)
   2.307  
   2.308        This mode just gives a count of queries found
   2.309        instead of a detailed list.
   2.310  
   2.311  
   2.312 -    Header quote  (-h switch)
   2.313 +    Header quote  (--header switch)
   2.314  
   2.315 -      If you use the -h switch, bookloupe will also display
   2.316 +      If you use the --header switch, bookloupe will also display
   2.317        the Title, Author, Release and Edition fields from the
   2.318        PG header. This is useful mostly for the automated
   2.319        checks we do on recently-posted texts.
   2.320  
   2.321  
   2.322 -    Errors to stdout (-y switch)
   2.323 +    Errors to stdout (--stdout switch)
   2.324  
   2.325        If you're just running bookloupe normally, you can ignore
   2.326        this. It's only there for programs that provide a front
   2.327 @@ -167,23 +305,24 @@
   2.328        bookloupe ran OK.
   2.329  
   2.330  
   2.331 -    Verbose reporting (-v switch)
   2.332 +    Verbose reporting (--verbose switch)
   2.333  
   2.334        Normally, if bookloupe sees lots of long lines, short lines,
   2.335        spaced dashes, non-ASCII characters or dot-commas ".," it
   2.336        assumes these are features of the text, counts and summarizes
   2.337 -      them at the top of its report, but does not list them 
   2.338 -      individually. If the -v switch is on, bookloupe will list them all.
   2.339 +      them at the top of its report, but does not list them
   2.340 +      individually. If the verbose switch is on, bookloupe will list
   2.341 +      them all.
   2.342  
   2.343  
   2.344 -    Markup interpretation (-m switch)
   2.345 +    Markup interpretation (--markup switch)
   2.346  
   2.347        Normally, bookloupe flags anything it suspects of being HTML
   2.348 -      markup as a possible error. When you use the -m switch,
   2.349 +      markup as a possible error. When you use the --markup switch,
   2.350        however, it matches anything that looks like markup against
   2.351        a short list of common HTML tags and entities. If the markup
   2.352        is in that list, it either ignores the markup, in the case
   2.353 -      of a tag, or "interprets" the markup as its nearest ASCII 
   2.354 +      of a tag, or "interprets" the markup as its nearest ASCII
   2.355        equivalent, in the case of an entity. So, for example, using
   2.356        this switch, bookloupe will "see"
   2.357  
   2.358 @@ -200,28 +339,30 @@
   2.359        for PG, and get sane results. It does not support all tags.
   2.360        It does not support all entities. When it sees a tag or entity
   2.361        it does not recognize, it will query it as HTML just as if
   2.362 -      you hadn't specified the -m switch.
   2.363 +      you hadn't specified the --markup switch.
   2.364  
   2.365        Bookloupe will automatically switch on markup interpretation
   2.366        if it sees a lot of tags that appear to be markup, so mostly, you
   2.367        won't have to specify this.
   2.368  
   2.369 -    User-defined typos (-u switch)
   2.370 +
   2.371 +    User-defined typos (--usertypo switch)
   2.372  
   2.373        If you have a file named bookloupe.typ or gutcheck.typ either
   2.374        in your current working directory or in the directory from
   2.375        which you explicitly invoked bookoupe, but not necessarily on
   2.376 -      your path, and if you specify the -u switch, bookloupe will
   2.377 -      query any word specified in that file. The file is simple: one
   2.378 -      word, in lower case, per line. Be careful not to put multiple
   2.379 +      your path, and if you specify the --usertypo switch, bookloupe
   2.380 +      will query any word specified in that file. The file is simple:
   2.381 +      one word, in lower case, per line. Be careful not to put multiple
   2.382        words onto a line, or leave any rubbish other than the word on
   2.383        the line. You should have received a sample file bookloupe.typ
   2.384        with this package. The file may be encoded in UTF-8 (preferred),
   2.385        ISO-8859-1 (also known as Latin-1), or WINDOWS-1252 (also known,
   2.386        incorrectly, as ansi).
   2.387  
   2.388 -    Ignore DP markup (-d switch)
   2.389 -        
   2.390 +
   2.391 +    Ignore DP markup (--dp switch)
   2.392 +
   2.393        Distributed Proofreaders (http://www.pgdp.net) has for some
   2.394        time been the main source of PG texts, and proofers there use
   2.395        special conventions. This switch understands those conventions,
   2.396 @@ -229,6 +370,17 @@
   2.397        haven't had the special conventions removed yet. The special
   2.398        conventions supported are page-separators and
   2.399        "<sc>", "</sc>", "/*", "*/", "/#", "#/", "/$", "$/".
   2.400 + 
   2.401 +
   2.402 +    Dump the current configuration (--dump-config switch)
   2.403 +
   2.404 +      The --dump-config switch can be used to dump the current
   2.405 +      configuration. This is a combination of the internal defaults,
   2.406 +      the configuration file (if any) and the command line options.
   2.407 +      If a configuration file is present, any comments found in that
   2.408 +      file will be preserved in the dumped configuration. If there
   2.409 +      is no configuration file, then a default set of comments to
   2.410 +      go with the internal default configuration is generated.
   2.411  
   2.412  
   2.413  You will probably only run bookloupe on a text once or maybe twice,
   2.414 @@ -257,7 +409,7 @@
   2.415  length, HTML tags perhaps left from a conversion, unbalanced
   2.416  brackets.
   2.417  
   2.418 -Suggestions for additional checks would be appreciated and duly 
   2.419 +Suggestions for additional checks would be appreciated and duly
   2.420  considered, but no guarantees that they will be implemented.
   2.421  
   2.422  
   2.423 @@ -271,8 +423,8 @@
   2.424      gutcheck -o filename.txt
   2.425  
   2.426  That gives me a quick idea what I'm dealing with. It'll tell
   2.427 -me what kind of problems gutcheck sees, and give me an idea 
   2.428 -of how much more work needs to be done on the text. Keep in 
   2.429 +me what kind of problems gutcheck sees, and give me an idea
   2.430 +of how much more work needs to be done on the text. Keep in
   2.431  mind that gutcheck doesn't do anything like a full spellcheck,
   2.432  but when I see a text that has a lot of problems, I assume that
   2.433  it probably needs a spellcheck too.
   2.434 @@ -284,10 +436,10 @@
   2.435  where jj is my personal, all-purpose filename for temporary data
   2.436  that doesn't need to be kept. Then I open filename.txt and jj in
   2.437  a split-screen view in my editor, and work down the text, fixing
   2.438 -whatever needs fixing, and skipping whatever doesn't. If your 
   2.439 -editor doesn't split-screen, you can get much the same effect by 
   2.440 +whatever needs fixing, and skipping whatever doesn't. If your
   2.441 +editor doesn't split-screen, you can get much the same effect by
   2.442  opening your original file in your normal editor, and jj (or your
   2.443 -equivalent name) in something like Notepad, keeping both in view 
   2.444 +equivalent name) in something like Notepad, keeping both in view
   2.445  at the same time.
   2.446  
   2.447  Twice a day, an automatic process looks at all recently-posted
   2.448 @@ -296,17 +448,6 @@
   2.449  
   2.450  
   2.451  
   2.452 -        Future development of bookloupe
   2.453 -
   2.454 -Future versions will add support for UTF-8 characters that
   2.455 -are not in ISO-8859-1 (eg., curled quotation marks);
   2.456 -characters that do not have a composed form (version 2.0
   2.457 -treats these as taking 2 or more columns); zero width and
   2.458 -wide characters (version 2.0 treats these as taking 1 column).
   2.459 -
   2.460 -
   2.461 -
   2.462 -
   2.463  Explanations of common bookloupe messages:
   2.464  
   2.465      --> 74 lines in this file have white space at end
   2.466 @@ -343,11 +484,11 @@
   2.467      Line 3020 - Non-ASCII character 233
   2.468  
   2.469      Standard PG texts should use only ASCII characters with values
   2.470 -    up to 127; however, non-English, accented characters can be 
   2.471 -    represented according to several different non-ASCII encoding 
   2.472 +    up to 127; however, non-English, accented characters can be
   2.473 +    represented according to several different non-ASCII encoding
   2.474      schemes, using values over 127. If you have a plain English text
   2.475      with a few accented characters in words like cafe or tete-a-tete,
   2.476 -    you might replace the accented characters with their unaccented 
   2.477 +    you might replace the accented characters with their unaccented
   2.478      versions. The English pound sign is another commonly-seen
   2.479      non-ASCII character. If you have enough non-ASCII characters in
   2.480      your text that you feel removing them would degrade your text,
   2.481 @@ -376,6 +517,7 @@
   2.482      of spaces.
   2.483  
   2.484  
   2.485 +
   2.486      Line 1327 - Tilde character?
   2.487  
   2.488      The tilde character (~) might be legitimately used, but it's the
   2.489 @@ -386,7 +528,7 @@
   2.490  
   2.491      Line 1347 - Asterisk?
   2.492  
   2.493 -    Asterisks are reported only in paranoid mode (see -x). 
   2.494 +    Asterisks are reported only in paranoid mode (see -x).
   2.495      Like tildes, they are often used to indicate errors, but they are
   2.496      also legitimately used as line delimiters and footnote markers.
   2.497  
   2.498 @@ -411,7 +553,7 @@
   2.499  
   2.500      Hint: bookloupe will not flag lines as short if they are indented
   2.501      —if they start with a space. I like to start inserted stanzas
   2.502 -    and other such items indented with a couple of spaces so that 
   2.503 +    and other such items indented with a couple of spaces so that
   2.504      they stand out from the main text anyway.
   2.505  
   2.506  
   2.507 @@ -427,7 +569,7 @@
   2.508  
   2.509      The PG standard for an em-dash--like these--is two minus signs
   2.510      with no spaces before or after them. Bookloupe flags non-PG
   2.511 -    em-dashes - like this one. Normally, you will replace it with a 
   2.512 +    em-dashes - like this one. Normally, you will replace it with a
   2.513      PG-standard em-dash.
   2.514  
   2.515  
   2.516 @@ -451,8 +593,8 @@
   2.517  
   2.518      Line 2083 - Query standalone 0
   2.519  
   2.520 -    In paranoid mode (see -x) only, bookloupe warns about the digit 0 
   2.521 -    and the number 1 standing alone as a word. This can happen if the 
   2.522 +    In paranoid mode (see -x) only, bookloupe warns about the digit 0
   2.523 +    and the number 1 standing alone as a word. This can happen if the
   2.524      OCR misreads the words O or I.
   2.525  
   2.526  
   2.527 @@ -531,22 +673,22 @@
   2.528      Another bookloupe mainstay—unclosed doublequotes in a paragraph.
   2.529      See the discussion of quotes in the switches section near the
   2.530      start of this file.
   2.531 -    
   2.532 +
   2.533      Since the mismatch doesn't occur on any one line, bookloupe quotes
   2.534      the line number of the first blank line following the paragraph,
   2.535      since this is the point where it reconciles the count of quotes.
   2.536      However, if bookloupe is echoing lines, that is, you haven't used
   2.537 -    the -e switch, it will show the _first_ line of the paragraph, 
   2.538 -    to help you find the place without using line numbers. The 
   2.539 -    offending paragraph is therefore between the quoted line and 
   2.540 +    the -e switch, it will show the _first_ line of the paragraph,
   2.541 +    to help you find the place without using line numbers. The
   2.542 +    offending paragraph is therefore between the quoted line and
   2.543      the line number given.
   2.544  
   2.545  
   2.546  
   2.547      Line 2587 - Mismatched single quotes
   2.548  
   2.549 -    Only checked with the -s switch, since checking single quotes is 
   2.550 -    not a very reliable process. Otherwise, the same logic as for 
   2.551 +    Only checked with the -s switch, since checking single quotes is
   2.552 +    not a very reliable process. Otherwise, the same logic as for
   2.553      doublequotes applies.
   2.554  
   2.555  
   2.556 @@ -575,7 +717,7 @@
   2.557  
   2.558      to be put in, like the blank line above, and this often
   2.559      shows up as a new paragraph beginning with lower case.
   2.560 -    Sometimes the blank line is deliberate, as when a 
   2.561 +    Sometimes the blank line is deliberate, as when a
   2.562      quotation is inserted in a speech. Use your judgement.
   2.563  
   2.564  
   2.565 @@ -609,11 +751,11 @@
   2.566      option that will be somewhere on your
   2.567      Start/Programs menu.
   2.568  
   2.569 -    Now get into the C:\gut directory. 
   2.570 -    You can do this using the cd (change directory) 
   2.571 +    Now get into the C:\gut directory.
   2.572 +    You can do this using the cd (change directory)
   2.573      command, like this:
   2.574          cd \gut
   2.575 -    and your prompt will change to 
   2.576 +    and your prompt will change to
   2.577          C:\gut>
   2.578      so you know you're in the right place.
   2.579  
   2.580 @@ -641,7 +783,7 @@
   2.581      replace any existing file of that name.
   2.582  
   2.583      So, for example, if you have two Tolstoy files
   2.584 -    that you want to check, called WARPEACE.TXT and 
   2.585 +    that you want to check, called WARPEACE.TXT and
   2.586      ANNAK.TXT, make sure that neither of these names
   2.587      is ever used following the greater-than sign.
   2.588      To check these correctly, you might do:
   2.589 @@ -670,7 +812,7 @@
   2.590  
   2.591      6) Browse to the folder where you extracted bookloupe
   2.592  
   2.593 -    7) Double-click bookloupe.exe 
   2.594 +    7) Double-click bookloupe.exe
   2.595  
   2.596      Now, whenever you do "Gutcheck" in Guiguts, it will run bookloupe
   2.597      instead. Since the output will look very like gutcheck output, you
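Reviewer's sketch (not part of the changeset): the configuration-file search order documented in doc/bookloupe.txt above can be illustrated roughly as follows. This is not bookloupe's actual code. The function names config_search_dirs and find_config are hypothetical, only the non-Windows branch ($XDG_CONFIG_HOME, falling back to $HOME/.config) is modelled, and it assumes that $BOOKLOUPE_CONFIG_PATH replaces the default list rather than extending it, which the documentation does not state explicitly.

```python
import os
from pathlib import Path


def config_search_dirs(program_dir, env=None, cwd=None):
    """Return candidate directories for bookloupe.ini, in search order.

    Hypothetical helper: models the documented order on non-Windows
    platforms only.
    """
    env = os.environ if env is None else env

    override = env.get("BOOKLOUPE_CONFIG_PATH")
    if override:
        # Assumption: the variable replaces the default list entirely.
        # os.pathsep is ':' on Unix and ';' on MS-Windows.
        return [Path(p) for p in override.split(os.pathsep) if p]

    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:
        user_dir = Path(xdg)
    else:
        user_dir = Path(env.get("HOME", str(Path.home()))) / ".config"

    return [
        Path(cwd) if cwd is not None else Path.cwd(),  # 1) working directory
        Path(program_dir),                             # 2) program directory
        user_dir,                                      # 3) user config directory
    ]


def find_config(dirs, name="bookloupe.ini"):
    """Return the first existing configuration file, or None."""
    for directory in dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

Passing env and cwd explicitly keeps the sketch easy to exercise without touching the real environment.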
     3.1 --- a/doc/loupe-test.txt	Wed Oct 16 22:51:29 2013 +0100
     3.2 +++ b/doc/loupe-test.txt	Wed Oct 02 09:14:33 2013 +0100
     3.3 @@ -30,6 +30,13 @@
     3.4  C:\DP> set BOOKLOUPE=C:\GUTCHECK\GUTCHECK.EXE
     3.5  C:\DP> loupe-test *.tst
     3.6  
     3.7 +When a testcase fails, loupe-test shows the output of bookloupe (or gutcheck)
     3.8 +up until the point where it deviates from the expected result and displays a
      3.9 +caret (^) to point to the exact column where the deviation occurred. Sometimes
    3.10 +it can still be difficult to work out what is happening and so loupe-test also
    3.11 +supports a -o option which will simply print bookloupe's output without comment
    3.12 +or checking.
    3.13 +
    3.14  Writing your own testcases
    3.15  --------------------------
    3.16  
    3.17 @@ -163,6 +170,24 @@
    3.18         │    Line 3 column 29 - Query possible scanno arid              │
    3.19         └───────────────────────────────────────────────────────────────┘
    3.20  
    3.21 +Non standard output
    3.22 +-------------------
    3.23 +
    3.24 +Bookloupe normally follows a standard pattern when printing warnings which
    3.25 +loupe-test knows how to interpret. Occasionally this is not suitable and
    3.26 +the testcase needs to specify exactly what should be printed. This can
    3.27 +be done by adding a literal stdout to the EXPECTED tag:
    3.28 +
    3.29 +       ┌───────────────────────────────────────────────────────────────┐
    3.30 +       │**************** OPTIONS ****************                      │
    3.31 +       │--dump-config                                                  │
    3.32 +       │**************** EXPECTED(stdout) ****************             │
    3.33 +       │# Trivial configuration                                        │
    3.34 +       │                                                               │
    3.35 +       │[options]                                                      │
    3.36 +       │dp=true                                                        │
    3.37 +       └───────────────────────────────────────────────────────────────┘
    3.38 +
    3.39  False-positives
    3.40  ---------------
    3.41
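Reviewer's sketch (not part of the changeset): the failure display described in doc/loupe-test.txt, where output is shown up to the point of deviation with a marker under the offending column, might look roughly like this. It is an illustration only and is not taken from loupe-test itself; the function name show_deviation is hypothetical, and the comparison is a plain line-by-line match that ignores any pattern handling loupe-test may perform.

```python
def show_deviation(expected, actual):
    """Return the actual output up to the first differing line, followed
    by a line whose ^ marks the first differing column.

    Returns None when the two outputs are identical.
    """
    exp_lines = expected.splitlines()
    act_lines = actual.splitlines()

    for i in range(max(len(exp_lines), len(act_lines))):
        exp = exp_lines[i] if i < len(exp_lines) else ""
        act = act_lines[i] if i < len(act_lines) else ""
        if exp == act:
            continue

        # Find the first column at which the two lines differ.
        col = 0
        while col < min(len(exp), len(act)) and exp[col] == act[col]:
            col += 1

        # Show everything printed so far, including the deviating line
        # (an empty line if the actual output ended early).
        if i < len(act_lines):
            shown = act_lines[:i + 1]
        else:
            shown = act_lines + [""]
        return "\n".join(shown + [" " * col + "^"])

    return None
```

Returning a string instead of printing keeps the sketch easy to check in isolation.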