doc/bookloupe.txt
author ali <ali@juiblex.co.uk>
Sun May 26 22:43:45 2013 +0100 (2013-05-26)
changeset 67 865063352146
parent 0 c2f4c0285180
child 74 411867e8e20b
permissions -rw-r--r--
Break check_for_control_characters() out
     1 
     2 
     3                             Gutcheck documentation
     4 
     5 
     6 gutcheck:  lists possible common formatting errors in a Project
     7 Gutenberg candidate file. It is a command line program and can be used
     8 under Win32 or Unix (gutcheck.c should compile anywhere; if it doesn't,
     9 tell me). For Windows-only people, there is an appendix at the end
    10 with brief instructions for running it.
    11 
    12 
    13 Current version: 0.99. Users of 0.98 see end of file for changes.
    14 
    15 You should also have received the licence file COPYING, a README file, 
    16 gutcheck.c, the source code, and gutcheck.exe, a DOS executable, with
    17 this file.
    18 
    19 This software is Copyright Jim Tinsley 2000-2005.
    20 
    21 Gutcheck comes with ABSOLUTELY NO WARRANTY. For details, read the file COPYING.
    22 This is Free Software; you may redistribute it under certain conditions (GPL).
    23 
    24 See http://gutcheck.sourceforge.net for the latest version.
    25 
    26 
    27 Usage is: gutcheck [-setopxlywm] filename
    28       where:
    29       -s checks Single quotes 
    30       -e switches off Echoing of lines 
    31       -t checks Typos
    32       -o produces an Overview only
    33       -p sets strict quotes checking for Paragraphs
    34       -x (paranoid) switches OFF typo checking and extra checks
    35       -l turns off Line-end checks
    36       -y sets error messages to stdout
    37       -w is a special mode for web uploads (for future use)
    38       -v (verbose) forces individual reporting of minor problems
    39       -m interprets Markup of some common HTML tags and entities    
    40       -u warns about words in a user-defined typo file gutcheck.typ 
    41       -d ignores some DP-specific markup
    42 
    43 Running gutcheck without any parameters will display a brief help message.
    44 
    45 Sample usage: 
    46 
    47     gutcheck warpeace.txt
    48 
    49 
    50 More detail:
    51 
    52     Echoing lines (-e to switch off)
    53 
    54       You may find it convenient, when reviewing Gutcheck's 
    55       suggestions, to see the line that Gutcheck is questioning.
    56       That way, you can often see at a glance whether it is
    57       a real error that needs to be fixed, or a false positive:
    58       something that belongs in the text but that Gutcheck's
    59       limited programming doesn't understand.
    60 
    61       By default, gutcheck echoes these lines, but if you don't 
    62       want to see the lines referred to, -e will switch it OFF.
    63 
    64 
    65     Quotes (-s and -p switches)
    66 
    67       Gutcheck always looks for unbalanced doublequotes in a 
    68       paragraph. It is a common convention for writers not to
    69       close quotes in a paragraph if the next paragraph opens
    70       with quotes and is a continuation by the same speaker.
    71 
    72       Gutcheck therefore does not normally report unclosed quotes 
    73       if the next paragraph begins with a quote. If you need
    74       to see all unclosed quotes, even where the next paragraph
    75       begins with a quote, you should use the -p switch.
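
The paragraph rule above can be sketched in a few lines of Python.
This is only an illustration, not gutcheck's own code (gutcheck is
written in C). The function name unclosed_quote_paragraphs and the
strict parameter, which stands in for the -p switch, are made up for
the example, and paragraphs are assumed to be separated by one blank
line.

```python
def unclosed_quote_paragraphs(text, strict=False):
    """Return 1-based numbers of paragraphs with an odd count of doublequotes.

    With strict=False, a paragraph left open is not reported when the
    next paragraph begins with a doublequote (a continued speech).
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    flagged = []
    for i, para in enumerate(paragraphs):
        if para.count('"') % 2 == 0:
            continue  # quotes balance within this paragraph
        following = paragraphs[i + 1].lstrip() if i + 1 < len(paragraphs) else ""
        if not strict and following.startswith('"'):
            continue  # same speaker carries on in the next paragraph
        flagged.append(i + 1)
    return flagged
```

Calling it with strict=True reports every unclosed paragraph, which is
the behaviour described for the -p switch.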
    76 
    77       Singlequotes (') are a problem, since the same character
    78       is used for an apostrophe. I'm not sure that it is 
    79       possible to get 100% accuracy on singlequotes checking,
    80       particularly since dialect, quite common in PG texts,
    81       upsets the normal rules so badly. Consider the sentence:
    82         'Tis often said that a man's a man for a' that.
    83       As humans, we recognize that both apostrophes are used
    84       for contractions rather than quotes, but it isn't easy 
    85       to get a program to recognize that.
    86 
    87       Since Gutcheck makes too many mistakes when trying to match
    88       singlequotes, it doesn't look for unbalanced singlequotes
    89       unless you specify the -s switch.
    90 
    91       Consider these sentences, which illustrate the main cases:
    92 
    93         'Tis often said that a fool and his money are soon parted.
    94 
    95         'Becky's goin' home,' said Tom.
    96 
    97         The dogs' tails wagged in unison.
    98 
    99         Those 'pack dogs' of yours look more like wolves.
   100 
   101 
   102 
   103     Typos (-t switch)
   104 
   105       It's not Gutcheck's job to be a spelling checker, but it
   106       does check for a list of common typos and OCR errors if you
   107       use the -t switch. (The -x switch turns typo checking off.)
   108 
   109       It also checks for character combinations, especially involving
   110       h and b, which are often confused by OCR, that rarely or never
   111       occur. For example, it queries "tbe" in a word. Now, "the" often
   112       occurs, but "tbe" is very rare (heartbeat, hotbed), so I'm
   113       playing the odds - a few false positives for many errors found.
   114       Similarly with "ii", which is a very common OCR error.
   115 
   116       Gutcheck suppresses multiple reporting of the first 40 "typos"
   117       found. This is to remove the annoyance of seeing something like
   118       "FN" (footnote) or "LK" (initials) flagged as a typo 147 times
   119       in a text. 
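
A rough sketch of the idea, in Python rather than gutcheck's C: query
any word containing a suspicious letter combination, and report each
distinct word only once for the first 40 words found. The SUSPECT
tuple holds just the two combinations mentioned above, and the names
query_words and max_tracked are invented for this example. The real
list and the real bookkeeping may well differ.

```python
import re

# Illustrative only: the two combinations named in the text above.
SUSPECT = ("tbe", "ii")


def query_words(lines, max_tracked=40):
    """Return (line_number, word) for suspicious words, without repeats."""
    seen = []      # words already reported, capped at max_tracked entries
    queries = []
    for lineno, line in enumerate(lines, start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            lower = word.lower()
            if not any(combo in lower for combo in SUSPECT):
                continue
            if lower in seen:
                continue  # already reported once, so stay quiet
            if len(seen) < max_tracked:
                seen.append(lower)
            queries.append((lineno, word))
    return queries
```

Note that "heartbeat" would be queried by this sketch too, which is
exactly the kind of false positive described above.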
   120 
   121 
   122     Line-end checking (-l switch to disable)
   123 
   124       All PG texts should have a Carriage Return (CR - character 13)
   125       and a Line Feed (LF - character 10) at end of each line,
   126       regardless of what O/S you made them on. DOS/Windows, Unix
   127       and Mac have different conventions, but the final text should
   128       always use a CR/LF pair as its line terminator.
   129 
   130       By default, Gutcheck verifies that every line does have
   131       the correct terminator, but if you're on a work-in-progress
   132       in Linux, you might want to convert the line-ends as a final
   133       step, and not want to see thousands of errors every time you
   134       run Gutcheck before that final step, so you can turn off 
   135       this checking with the -l switch.
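
For anyone who wants to test line-ends outside gutcheck, here is a
minimal Python sketch of a CR/LF check. It works on the raw bytes of
the file. The message strings are borrowed from gutcheck's own
line-end warnings, but the function check_line_ends and the exact
conditions that trigger each message are assumptions made for this
example.

```python
def check_line_ends(data):
    """Return (line_number, message) pairs for lines not ending in CR/LF."""
    problems = []
    segments = data.split(b"\n")
    if segments and segments[-1] == b"":
        segments.pop()  # the data ended with LF, so no partial last line
    for lineno, seg in enumerate(segments, start=1):
        body = seg.rstrip(b"\r")
        trailing_crs = len(seg) - len(body)
        if b"\r" in body:
            problems.append((lineno, "CR without LF?"))
        if trailing_crs == 0:
            problems.append((lineno, "No CR?"))
        elif trailing_crs > 1:
            problems.append((lineno, "Two successive CRs?"))
    return problems
```

Read the file in binary mode, for example open(name, "rb").read(), so
that Python does not translate the line-ends before they are checked.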
   136 
   137 
   138     Paranoid mode (-x switch to disable: Trust No One :-)
   139 
   140       -x automatically switches OFF typo-checking (the -t flag)
   141       and some extra checks, such as the standalone 1 and 0 queries.
   142 
   143 
   144     Overview mode (-o switch)
   145 
   146        This mode just gives a count of queries found
   147        instead of a detailed list.
   148 
   149 
   150     Header quote  (-h switch)
   151 
   152        If you use the -h switch, gutcheck will also display
   153        the Title, Author, Release and Edition fields from the
   154        PG header. This is useful mostly for the automated
   155        checks we do on recently-posted texts.
   156 
   157 
   158     Errors to stdout (-y switch)
   159 
   160        If you're just running gutcheck normally, you can ignore
   161        this. It's only there for programs that provide a front
   162        end to gutcheck. It makes error messages appear within
   163        the output of gutcheck so that the front end knows whether
   164        gutcheck ran OK.
   165 
   166 
   167     Verbose reporting (-v switch)
   168 
   169        Normally, if gutcheck sees lots of long lines, short lines,
   170        spaced dashes, non-ASCII characters or dot-commas ".," it
   171        assumes these are features of the text, counts and summarizes
   172        them at the top of its report, but does not list them 
   173        individually. If the -v switch is on, gutcheck will list them all.
   174 
   175 
   176     Markup interpretation (-m switch)
   177 
   178        Normally, gutcheck flags anything it suspects of being HTML
   179        markup as a possible error. When you use the -m switch,
   180        however, it matches anything that looks like markup against
   181        a short list of common HTML tags and entities. If the markup
   182        is in that list, it either ignores the markup, in the case
   183        of a tag, or "interprets" the markup as its nearest ASCII 
   184        equivalent, in the case of an entity. So, for example, using
   185        this switch, gutcheck will "see"
   186 
   187        &ldquo;He went <i>thataway!</i>&rdquo;
   188 
   189        as
   190 
   191        "He went thataway!"
   192 
   193        and report accordingly.
   194 
   195        This switch does not, not, NOT check the validity of HTML;
   196        it exists so that you can run gutcheck on most HTML texts
   197        for PG, and get sane results. It does not support all tags.
   198        It does not support all entities. When it sees a tag or entity
   199        it does not recognize, it will query it as HTML just as if
   200        you hadn't specified the -m switch.
   201 
   202        Gutcheck 0.99 will automatically switch on markup interpretation
   203        if it sees a lot of tags that appear to be markup, so mostly, you
   204        won't have to specify this.
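
The "ignore known tags, translate known entities" idea can be sketched
in Python as follows. KNOWN_TAGS and KNOWN_ENTITIES are hypothetical
short lists chosen for the example, not the lists gutcheck actually
uses, and interpret_markup is an invented name.

```python
import re

# Hypothetical short lists; the real ones may differ.
KNOWN_TAGS = {"i", "b", "em", "p"}
KNOWN_ENTITIES = {"ldquo": '"', "rdquo": '"', "amp": "&", "mdash": "--"}


def interpret_markup(line):
    """Drop known tags and replace known entities with ASCII stand-ins."""

    def tag(match):
        name = match.group(1).lower()
        return "" if name in KNOWN_TAGS else match.group(0)

    def entity(match):
        return KNOWN_ENTITIES.get(match.group(1), match.group(0))

    line = re.sub(r"</?([A-Za-z][A-Za-z0-9]*)[^<>]*>", tag, line)
    return re.sub(r"&([A-Za-z]+);", entity, line)
```

Anything not in the two lists is left untouched, so a later check can
still query it as possible HTML.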
   205 
   206     User-defined typos (-u switch)
   207 
   208         If you have a file named gutcheck.typ either in your current
   209         working directory or in the directory from which you explicitly
   210         invoked gutcheck, but not necessarily on your path, and if you
   211         specify the -u switch, gutcheck will query any word specified 
   212         in that file. The file is simple: one word, in lower case, per
   213         line. 999 lines are allowed for. Be careful not to put multiple
   214         words onto a line, or leave any rubbish other than the word on
   215         the line. You should have received a sample file gutcheck.typ
   216         with this package.
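
To make the file format concrete, here is a small Python sketch that
reads such a list and queries a line against it. The helper names
parse_typo_lines and user_typo_queries are invented. Skipping lines
that hold more than one word, and lower-casing each entry, are choices
made for this example only; gutcheck itself may treat such lines
differently, so keep the file clean.

```python
def parse_typo_lines(lines, limit=999):
    """Collect one lower-case word per line, up to `limit` words."""
    words = []
    for raw in lines:
        fields = raw.split()
        if len(fields) != 1:
            continue  # blank line or several words: skipped in this sketch
        words.append(fields[0].lower())
        if len(words) >= limit:
            break
    return words


def user_typo_queries(line, typos):
    """Return the words on `line` that appear in the user-defined list."""
    listed = set(typos)
    hits = []
    for token in line.split():
        word = token.strip('.,;:!?"\'')
        if word.lower() in listed:
            hits.append(word)
    return hits
```

With a real file, pass the open file object straight in, for example
parse_typo_lines(open("gutcheck.typ")).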
   217 
   218     Ignore DP markup (-d switch)
   219         
   220         Distributed Proofreaders (http://www.pgdp.net) is currently
   221         (2005) the main source of PG texts, and proofers there use
   222         special conventions. This switch understands those conventions,
   223         so that people can use gutcheck on files in progress that
   224         haven't had the special conventions removed yet. The special
   225         conventions supported in 0.99 are page-separators and
   226         "<sc>", "</sc>", "/*", "*/", "/#", "#/", "/$", "$/".
   227 
   228 
   229 You will probably only run gutcheck on a text once or maybe twice,
   230 just prior to uploading; it usually finds a few formatting problems;
   231 it also usually finds queries that aren't problems at all - it often
   232 questions Tables of Contents for having short lines, for example.
   233 These are called "false positives", and need a human to decide on
   234 them.
   235 
   236 The text should be standard prose, and already close to PG normal
   237 format (plain text, about 70 characters per line with blank lines
   238 between paragraphs).
   239 
   240 Gutcheck merely draws your attention to things that might be errors.
   241 It is NOT a substitute for human judgement. Formatting choices like
   242 short lines may be for a reason that this program can't understand.
   243 
   244 Even the most careful human proofing can leave errors behind in a
   245 text, and there are several automated checks you can do to help find
   246 them. Of these, spellchecking (with _very_ careful human judgement) is
   247 the most important and most useful.
   248 
   249 Gutcheck does perform some basic typo-checking if you ask it to,
   250 but its focus is on formatting errors specific to PG texts - 
   251 mismatched quotes, non-ASCII characters, bad spacing, bad line
   252 length, HTML tags perhaps left from a conversion, unbalanced
   253 brackets.
   254 
   255 Suggestions for additional checks would be appreciated and duly 
   256 considered, but no guarantees that they will be implemented.
   257 
   258 
   259 
   260 
   261                 How do _I_ use it?
   262 
   263 Practically everyone I give gutcheck to asks me how _I_ use it.
   264 Well, when I get a text for posting, say filename.txt, I run
   265 
   266     gutcheck -o filename.txt
   267 
   268 That gives me a quick idea what I'm dealing with. It'll tell
   269 me what kind of problems gutcheck sees, and give me an idea 
   270 of how much more work needs to be done on the text. Keep in 
   271 mind that gutcheck doesn't do anything like a full spellcheck,
   272 but when I see a text that has a lot of problems, I assume that
   273 it probably needs a spellcheck too.
   274 
   275 Having got a feel for the ballpark, I run
   276 
   277     gutcheck filename.txt > jj
   278 
   279 where jj is my personal, all-purpose filename for temporary data
   280 that doesn't need to be kept. Then I open filename.txt and jj in
   281 a split-screen view in my editor, and work down the text, fixing
   282 whatever needs fixing, and skipping whatever doesn't. If your 
   283 editor doesn't split-screen, you can get much the same effect by 
   284 opening your original file in your normal editor, and jj (or your
   285 equivalent name) in something like Notepad, keeping both in view 
   286 at the same time.
   287 
   288 Twice a day, an automatic process looks at all recently-posted
   289 texts, and emails Michael, me, and sometimes other people with
   290 their gutcheck summaries.
   291 
   292 
   293 
   294         Future development of gutcheck
   295 
   296 Gutcheck has gone about as far as it can, given its current
   297 structure. In order to add better singlequotes checking,
   298 sentence checking, better he/be checking and other good stuff
   299 that I'd like to see, I'll have to rewrite it from a different
   300 angle - looking at the syntax instead of the lines. And I'll
   301 probably get around to that sooner or later.
   302 
   303 Meantime, I'm just trying to get this version stabilized, so
   304 please report any bugs you find. When it is stable, I'll run
   305 up a Windows port for those timid souls who can't look a 
   306 command line in the eye. :-)
   307 
   308 And I've started work on gutspell, a companion to gutcheck
   309 which will concentrate on spelling problems. PG spelling
   310 problems are unusual, since the range of texts we cover is
   311 so wide, and I'll be taking a somewhat unorthodox approach
   312 to writing this spelling-checker _specifically_ for texts
   313 containing a lot of dialect and uncommon words that have
   314 probably already been spell-checked against a standard
   315 modern dictionary.
   316 
   317 
   318 
   319 
   320 Explanations of common gutcheck messages:
   321 
   322     --> 74 lines in this file have white space at end
   323 
   324     PG texts shouldn't have extra white space added at end of line.
   325     Don't worry too much about this; they're not doing any harm,
   326     and they'll be removed during posting anyway.
   327 
   328 
   329     --> 348 lines in this file are short. Not reporting short lines.
   330     --> 84 lines in this file are long. Not reporting long lines.
   331     --> 8 lines in this file are VERY long!
   332 
   333     If there are a lot of long or short lines, Gutcheck won't list
   334     them individually. The short lines version of this message
   335     is commonly seen when gutchecking poetry and some plays, where
   336     the normal line length is shorter than the standard for prose.
   337     A "VERY long" line is one over 80 characters.  You normally
   338     shouldn't have any of these, but sometimes you may have to render
   339     a table that must be that long, or some special preformatted
   340     quotation that can't be broken.
   341 
   342 
   343     --> There are 75 spaced dashes and em-dashes in this file. Not reporting them.
   344 
   345     The PG standard for an emdash--like these--is two minus signs
   346     with no spaces before or after them. However, some older texts
   347     used spaced dashes - like these -- and if there are very many
   348     such spaced dashes in the file, gutcheck just draws your
   349     attention to it and doesn't list them individually.
   350 
   351 
   352 
   353     Line 3020 - Non-ASCII character 233
   354 
   355     Standard PG texts should use only ASCII characters with values
   356     up to 127; however, non-English, accented characters can be 
   357     represented according to several different non-ASCII encoding 
   358     schemes, using values over 127. If you have a plain English text
   359     with a few accented characters in words like cafe or tete-a-tete,
   360     you should replace the accented characters with their unaccented 
   361     versions. The English pound sign is another commonly-seen
   362     non-ASCII character. If you have enough non-ASCII characters in
   363     your text that you feel removing them would degrade your text
   364     unacceptably, you should probably consider doing an 8-bit text
   365     as well as a plain-ASCII version.
   366 
   367 
   368 
   369     Line 1207 - Non-ISO-8859 character 156
   370 
   371     Even in "8-bit" texts, there are distinctions between code sets.
   372     The ISO-8859 family of 8-bit code sets is the most commonly used
   373     in PG, and these sets do not define values in the range 128 through
   374     159 as printable characters. It's quite common for someone on a
   375     Windows or Mac machine to use a non-ISO character inadvertently,
   376     so this message warns that the character is not only not ASCII,
   377     but also outside the ISO-8859 range.
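
The two character-set messages come down to a range test on each byte.
A minimal Python sketch, assuming the text is read as raw bytes; the
function name charset_queries is invented, and the message wording
simply copies the samples above.

```python
def charset_queries(data):
    """Return (line_number, message) for bytes above the ASCII range."""
    queries = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        for value in line:  # iterating over bytes yields integers
            if 128 <= value <= 159:
                # Not printable in the ISO-8859 code sets.
                queries.append((lineno, "Non-ISO-8859 character %d" % value))
            elif value > 127:
                queries.append((lineno, "Non-ASCII character %d" % value))
    return queries
```

Character 233 is e-acute in ISO-8859-1, while 156 falls in the
128-159 block that the ISO-8859 sets leave without printable
characters.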
   378 
   379 
   380 
   381     Line 46 - Tab character?
   382 
   383     Some editors and WPs will put in Tab characters (character 9) to
   384     indicate indented text. You should not use these in a PG text,
   385     because you can't be sure how they will appear on a reader's
   386     screen. Find the Tab, and replace it with the appropriate number
   387     of spaces.
   388 
   389 
   390     Line 1327 - Tilde character?
   391 
   392     The tilde character (~) might be legitimately used, but it's the
   393     character commonly used by OCR software to indicate a place where
   394     it couldn't make out the letter, so gutcheck flags it.
   395 
   396 
   397 
   398     Line 1347 - Asterisk?
   399 
   400     Asterisks are reported only in paranoid mode (see -x). 
   401     Like tildes, they are often used to indicate errors, but they are
   402     also legitimately used as line delimiters and footnote markers.
   403 
   404 
   405 
   406     Line 1451 - Long line 129
   407 
   408     PG texts should have lines shorter than 76 characters. There may be
   409     occasions where you decide that you really have to go out to 79,
   410     but the sample above says that line 1451 is 129 characters long -
   411     probably two lines run together.
   412 
   413 
   414 
   415     Line 1590 - Short line?
   416 
   417     PG texts should have lines longer than 54 characters. However,
   418     there are special cases like poetry and tables of contents where
   419     the lines _should_ be shorter. So treat Gutcheck warnings about
   420     short lines carefully. Sometimes it's a genuine formatting
   421     problem; sometimes the line really needs to be short.
   422 
   423     Hint: gutcheck will not flag lines as short if they are indented
   424     - if they start with a space. I like to start inserted stanzas
   425     and other such items indented with a couple of spaces so that 
   426     they stand out from the main text anyway.
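
The line-length rules can be sketched in Python like this. The
thresholds come from this file: shorter than 55 is short, over 75 is
long, over 80 is very long, and indented lines are never short. Two
things are assumptions of the sketch, not statements about gutcheck:
the function name length_queries, and the rule that a short line is
only queried when the next line is not blank (so that the last line of
a paragraph is left alone).

```python
def length_queries(lines, short=55, long=75, very_long=80):
    """Return (line_number, kind, length) for suspicious line lengths."""
    queries = []
    for i, line in enumerate(lines):
        n = len(line)
        if n > very_long:
            queries.append((i + 1, "very long", n))
        elif n > long:
            queries.append((i + 1, "long", n))
        elif 0 < n < short and not line.startswith(" "):
            following = lines[i + 1] if i + 1 < len(lines) else ""
            if following.strip():  # assumption: paragraph end may be short
                queries.append((i + 1, "short", n))
    return queries
```

Strip the line terminators before calling it, so that CR and LF are
not counted in the length.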
   427 
   428 
   429 
   430     Line 1804 - Begins with punctuation?
   431 
   432     Lines should normally not begin with commas, periods and so on.
   433     An exception is ellipses . . . which can happen at start of line.
   434 
   435 
   436 
   437     Line 1850 - Spaced em-dash?
   438 
   439     The PG standard for an em-dash--like these--is two minus signs
   440     with no spaces before or after them. Gutcheck flags non-PG
   441     em-dashes - like this one. Normally, you will replace it with a 
   442     PG-standard em-dash.
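
A spaced dash is easy to find with a regular expression. This Python
sketch returns the column of each one or two hyphens that has a space
on both sides. The name spaced_dashes is invented, and the pattern is
only a guess at a reasonable test, not gutcheck's own logic.

```python
import re

# One or two hyphens with a space on each side.
SPACED_DASH = re.compile(r" -{1,2} ")


def spaced_dashes(line):
    """Return 1-based columns of dashes that have spaces around them."""
    return [m.start() + 2 for m in SPACED_DASH.finditer(line)]
```

A hyphenated word such as "em-dash" and a PG-standard dash--like
this--are both left alone, since neither has spaces around it.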
   443 
   444 
   445 
   446     Line 1904 - Query he/be error?
   447 
   448     Gutcheck makes a very minor effort to look for that scourge of all
   449     proofreaders, "be" replacing "he" or vice-versa, and draws your
   450     attention to it when it thinks it has found one.
   451 
   452 
   453 
   454     Line 2017 - Query digit in a1most
   455 
   456     The digit 1 is commonly OCRed for the letter l, the digit 0 for
   457     the letter O, and so on. When gutcheck sees a mix of digits and
   458     letters, it warns you. It may generate a false positive for
   459     something like 7am.
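
The test described above amounts to "does this word contain both a
letter and a digit?". A minimal Python sketch, with an invented
function name:

```python
import re


def digit_queries(line):
    """Return the words on `line` that mix letters and digits."""
    mixed = []
    for word in re.findall(r"[A-Za-z0-9]+", line):
        if re.search(r"[A-Za-z]", word) and re.search(r"[0-9]", word):
            mixed.append(word)
    return mixed
```

As the text says, something like 7am is caught along with a1most,
while a plain number such as 1905 is not.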
   460 
   461 
   462 
   463     Line 2083 - Query standalone 0
   464 
   465     In paranoid mode (see -x) only, gutcheck warns about the digits 0
   466     and 1 standing alone as words. This can happen if the OCR misreads
   467     the words O or I.
   468 
   469 
   470 
   471     Line 2115 - Query word whetber
   472 
   473     If you have switched typo-checking on, gutcheck looks for
   474     potential typos, especially common h/b errors. It's not
   475     infallible; it sometimes queries legit words, but it's
   476     always worth taking a look.
   477 
   478 
   479 
   480     Line 2190 column 14 - Missing space?
   481 
   482     Omitting a space is a very common error,especially coming from
   483     OCRed text,and can be hard for a human to spot. The commas in
   484     the previous sentence illustrate the kind of thing I mean.
   485 
   486 
   487 
   488     Line 2240 column 48 - Spaced punctuation?
   489 
   490     The flip side of the "missing space" error , here , is when extra
   491     spaces are added before punctuation . Some old texts appear to add
   492     extra spaces around punctuation consistently, but this was a
   493     typographical convention rather than the author's intent, and the
   494     extra "spaces" should be removed when preparing a PG text.
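
Both spacing errors, the missing space and the spaced punctuation, can
be hunted with simple patterns. The Python sketch below is an
approximation written for this file: the function name
spacing_queries and both patterns are assumptions, and it will
query a spaced ellipsis . . . as well, which a real checker would want
to let through.

```python
import re


def spacing_queries(line):
    """Return (column, message) pairs for suspicious spacing."""
    queries = []
    # A comma, semicolon or colon running straight into a letter.
    for m in re.finditer(r"[,;:](?=[A-Za-z])", line):
        queries.append((m.start() + 1, "Missing space?"))
    # Spaces between a word and the punctuation that follows it.
    for m in re.finditer(r"(?<=\w) +(?=[,;:.!?])", line):
        queries.append((m.end() + 1, "Spaced punctuation?"))
    return sorted(queries)
```

The column reported is that of the punctuation mark itself, counting
from 1.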
   495 
   496 
   497 
   498     Line 2301 column 19 - Unspaced quotes?
   499 
   500     Another common spacing problem occurs in a phrase like "You wait
   501     there,"he said.
   502 
   503 
   504 
   505     Line 2385 column 27 - Wrongspaced quotes?
   506 
   507     As of version 0.98, gutcheck adds extra checks on whether a quote
   508     seems to be a start or end quote, and queries those that appear to
   509     be misplaced. This does give rise to false positives when quotes are
   510     nested, for example:
   511 
   512     "And how," she asked, "will your "friends" help you now?"
   513 
   514     but these false positives are worth it because of the many cases
   515     that this test catches, notably those like:
   516 
   517     "And how, "she said," will your friends help you now?"
   518 
   519     Sometimes a "wrongspaced quotes" query will arise because an earlier
   520     quote in the paragraph was omitted, so if the place specified seems
   521     to be OK, look back to see whether there's a problem in the preceding
   522     lines.
   523 
   524 
   525 
   526     Line 2400 - HTML Tag? <PRE>
   527 
   528     Some PG texts have been converted from HTML, and not all of the
   529     HTML tags have been removed.
   530 
   531 
   532 
   533     Line 2402 - HTML symbol? &emdash;
   534 
   535     Similarly, special HTML symbol characters can survive into PG
   536     texts. Can occasionally produce amusing false positives like
   537     . . . Marwick & Co were well known for it;
   538 
   539 
   540 
   541     Line 2540 - Mismatched quotes
   542 
   543     Another gutcheck mainstay - unclosed doublequotes in a paragraph.
   544     See the discussion of quotes in the switches section near the
   545     start of this file.
   546     
   547     Since the mismatch doesn't occur on any one line, gutcheck quotes
   548     the line number of the first blank line following the paragraph,
   549     since this is the point where it reconciles the count of quotes.
   550     However, if gutcheck is echoing lines, that is, you haven't used
   551     the -e switch, it will show the _first_ line of the paragraph, 
   552     to help you find the place without using line numbers. The 
   553     offending paragraph is therefore between the quoted line and 
   554     the line number given.
   555 
   556 
   557 
   558     Line 2587 - Mismatched single quotes
   559 
   560     Only checked with the -s switch, since checking single quotes is 
   561     not a very reliable process. Otherwise, the same logic as for 
   562     doublequotes applies.
   563 
   564 
   565 
   566     Line 2877 - Mismatched round brackets?
   567 
   568     Also curly and square brackets. Texts with a lot of brackets, like
   569     plays with bracketed stage instructions, may have mismatches.
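
Counting is all that is needed for a first-pass bracket check. A
minimal Python sketch, assuming the check is made one paragraph at a
time; the function name bracket_queries is invented.

```python
def bracket_queries(paragraph):
    """Return the kinds of bracket whose open and close counts differ."""
    pairs = {"round": "()", "curly": "{}", "square": "[]"}
    return [
        kind
        for kind, (opener, closer) in pairs.items()
        if paragraph.count(opener) != paragraph.count(closer)
    ]
```

Counting does not notice brackets that are closed in the wrong order,
such as "(like [this) one]", so a human eye is still needed.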
   570 
   571 
   572     Line 3150 - No CR?
   573     Line 3204 - Two successive CRs?
   574     Line 3281 position 75 - CR without LF?
   575 
   576     These are the invalid line-end warnings. See the discussion of
   577     line-end checking in the switches section near the start of this
   578     file. If you see these, and your editor doesn't show anything
   579     wrong, you should probably try deleting the characters just before
   580     and after the line end, and the line-end itself, then retyping the
   581     characters and the line-end.
   582 
   583 
   584     Line 2940 - Paragraph starts with lower-case
   585 
   586     A common error in an e-text is for an extra blank line
   587 
   588     to be put in, like the blank line above, and this often
   589     shows up as a new paragraph beginning with lower case.
   590     Sometimes the blank line is deliberate, as when a 
   591     quotation is inserted in a speech. Use your judgement.
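
In Python, the test might look like the sketch below: a line is the
start of a paragraph if the line before it is blank, and it is queried
if its first character is a lower-case letter. The function name is
invented, and treating the first line of the file as a paragraph start
is an assumption of the example.

```python
def lowercase_paragraph_starts(lines):
    """Return 1-based numbers of paragraph-opening lines in lower case."""
    queries = []
    previous_blank = True  # assumption: the first line opens a paragraph
    for lineno, line in enumerate(lines, start=1):
        if previous_blank and line[:1].islower():
            queries.append(lineno)
        previous_blank = not line.strip()
    return queries
```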
   592 
   593 
   594     Line 2987 - Extra period?
   595 
   596     An extra period. is a. common problem in OCRed text. and usually
   597     arises when a speck of dust on the page is mistaken for a period.
   598     or. as occasionally happens. when a comma loses its tail.
   599 
   600 
   601     Line 3012 column 12 - Double punctuation?
   602 
   603     Double punctuation., like that,, is a common typo and
   604     scanno. Some books have much legit double punctuation,
   605     like etc., etc., but it's worth checking anyway.
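
The extra-period and double-punctuation queries can both be
approximated with regular expressions. In this Python sketch, an
"extra period" is a period followed by spaces and a lower-case letter,
and "double punctuation" is two adjacent marks from . , ; : that are
not both periods. The function name and both patterns are guesses made
for illustration. They will query legit text such as "etc., etc." too,
just as described above.

```python
import re


def punctuation_queries(line):
    """Return (column, message) pairs for doubtful punctuation."""
    queries = []
    # A period followed by spaces and then a lower-case letter.
    for m in re.finditer(r"\.(?= +[a-z])", line):
        queries.append((m.start() + 1, "Extra period?"))
    # Two adjacent marks, excluding a pair of periods.
    for m in re.finditer(r"[.,;:][,;:]|[,;:]\.", line):
        queries.append((m.start() + 1, "Double punctuation?"))
    return sorted(queries)
```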
   606 
   607 
   608 
   609             *       *       *        *
   610 
   611 For Windows-only users who are unfamiliar with DOS:
   612 
   613     If you're a Windows-only user, you need to save
   614     gutcheck.exe into the folder (directory) where the
   615     text file you want to check is. Let's say your
   616     text file is in C:\GUT, then you should save
   617     GUTCHECK.EXE into C:\GUT.
   618 
   619     Now get to a DOS prompt. You can do this by
   620     selecting the "Command Prompt" or "MS-DOS Prompt"
   621     option that will be somewhere on your
   622     Start/Programs menu.
   623 
   624     Now get into the C:\GUT directory. 
   625     You can do this using the CD (change directory) 
   626     command, like this:
   627         CD \GUT
   628     and your prompt will change to 
   629         C:\GUT>
   630     so you know you're in the right place.
   631 
   632     Now type
   633         gutcheck yourfile.txt
   634     and you'll see gutcheck's report.
   635 
   636     By default, gutcheck prints its queries to screen.
   637     If you want to create a file of them, to edit
   638     against the text, you can use the greater-than
   639     sign (>) to tell it to output the report to a
   640     file. For example, if you want its report in a
   641     file called QUERIES.LST, you could type
   642     
   643         gutcheck yourfile.txt > queries.lst
   644 
   645     The queries.lst file will then contain the listing
   646     of possible formatting errors, and you can
   647     edit it alongside your text.
   648 
   649     Whatever you do, DON'T make the filename after
   650     the greater-than sign the name of a file already
   651     on your disk that you want to keep, because
   652     the greater-than sign will cause gutcheck to
   653     replace any existing file of that name.
   654 
   655     So, for example, if you have two Tolstoy files
   656     that you want to check, called WARPEACE.TXT and 
   657     ANNAK.TXT, make sure that neither of these names
   658     is ever used following the greater-than sign.
   659     To check these correctly, you might do:
   660 
   661     gutcheck warpeace.txt > war.lst
   662 
   663     and
   664 
   665     gutcheck annak.txt > annak.lst
   666 
   667     separately. Then you can look at war.lst and annak.lst
   668     to see the gutcheck reports.
   669 
   670             *       *       *        *
   671 
   672 
   673 For existing 0.98 users upgrading to 0.99:
   674 
   675     If you run on old 16-bit DOS or Windows 3.x, I'm afraid
   676     you're out of luck. I'm not saying it _can't_ be compiled
   677     to run on 16-bit, but the executable with the package is
   678     for Win32 only. *nix users won't notice the change at all.
   679 
   680 
   681     There are two new switches: -u and -d. 
   682           See above for full rundown.
   683 
   684 
   685 Here's a list of the new errors:
   686 
   687     Line 1456 - Carat character?
   688 
   689     I^ve found a few.
   690 
   691 
   692     Line 1821 - Forward slash?
   693 
   694     Common error for italicized "I", or so /'ve found.
   695 
   696 
   697     Line 2139 - Query missing paragraph break?
   698 
   699     "Come here, son." "Do I _have_ to go, dad?"
   700     Like that. False positives in some texts. Sorry 'bout that,
   701     but these are often errors.
   702 
   703 
   704     Line 2200 - Query had/bad error?
   705 
   706     Clear enough. Doesn't catch as many as I'd like it to,
   707     but rarely gives false alarms.
   708 
   709 
   710     Line 2268 - Query punctuation after the?
   711 
   712     Some words, like "the", very rarely have punctuation
   713     following them. Others, like "Mrs", usually have a
   714     period, but never a comma. Occasional false positives.
   715 
   716 
   717     Line 2380 - Query possible scanno arid
   718 
   719     It found one of your user-defined typos when you
   720     used the -u switch.
   721 
   722 
   723     Line 2511 - Capital "S"?
   724 
   725     Surprisingly common specific case, like: Jane'S 
   726 
   727     
   728     Line 3469 - endquote missing punctuation?
   729 
   730     OK. This one can really cause a lot of false positives
   731     in some books, but it switches itself off if it finds
   732     more than 20 in a text, unless you force it to list them
   733     all with the -v switch.
   734     "Hey, dad" Johnny said, "can we go now?"
   735     is a common punctuation-missing error.
   736 
   737 
   738     Line 4266 - Mismatched underscores?
   739 
   740     Like mismatched anything else!
   741 
   742