                         Gutcheck documentation


gutcheck: lists possible common formatting errors in a Project
Gutenberg candidate file. It is a command line program and can be used
under Win32 or Unix (gutcheck.c should compile anywhere; if it doesn't,
tell me). For Windows-only people, there is an appendix at the end
with brief instructions for running it.


Current version: 0.99. Users of 0.98 see end of file for changes.

You should also have received the licence file COPYING, a README file,
gutcheck.c, the source code, and gutcheck.exe, a DOS executable, with
this file.

This software is Copyright Jim Tinsley 2000-2005.

Gutcheck comes with ABSOLUTELY NO WARRANTY. For details, read the file COPYING.
This is Free Software; you may redistribute it under certain conditions (GPL).

See http://gutcheck.sourceforge.net for the latest version.


Usage is:  gutcheck [-setopxlywvmudh] filename
  where:
    -s checks Single quotes
    -e switches off Echoing of lines
    -t checks Typos
    -o produces an Overview only
    -p sets strict quotes checking for Paragraphs
    -x (paranoid) switches OFF typo checking and extra checks
    -l turns off Line-end checks
    -y sets error messages to stdout
    -w is a special mode for web uploads (for future use)
    -v (verbose) forces individual reporting of minor problems
    -m interprets Markup of some common HTML tags and entities
    -u warns about words in a user-defined typo file gutcheck.typ
    -d ignores some DP-specific markup
    -h displays the Title, Author, Release and Edition Header fields

Running gutcheck without any parameters will display a brief help message.

Sample usage:

    gutcheck warpeace.txt


More detail:

    Echoing lines (-e to switch off)

        You may find it convenient, when reviewing Gutcheck's
        suggestions, to see the line that Gutcheck is questioning.
        That way, you can often see at a glance whether it is
        a real error that needs to be fixed, or a false positive
        that should be in the text, but Gutcheck's limited
        programming doesn't understand.

        By default, gutcheck echoes these lines, but if you don't
        want to see the lines referred to, -e will switch it OFF.


    Quotes (-s and -p switches)

        Gutcheck always looks for unbalanced doublequotes in a
        paragraph. It is a common convention for writers not to
        close quotes in a paragraph if the next paragraph opens
        with quotes and is a continuation by the same speaker.

        Gutcheck therefore does not normally report unclosed quotes
        if the next paragraph begins with a quote. If you need
        to see all unclosed quotes, even where the next paragraph
        begins with a quote, you should use the -p switch.

        Singlequotes (') are a problem, since the same character
        is used for an apostrophe. I'm not sure that it is
        possible to get 100% accuracy on singlequotes checking,
        particularly since dialect, quite common in PG texts,
        upsets the normal rules so badly. Consider the sentence:
            'Tis often said that a man's a man for a' that.
        As humans, we recognize that both apostrophes are used
        for contractions rather than quotes, but it isn't easy
        to get a program to recognize that.

        Since Gutcheck makes too many mistakes when trying to match
        singlequotes, it doesn't look for unbalanced singlequotes
        unless you specify the -s switch.

        Consider these sentences, which illustrate the main cases:

            'Tis often said that a fool and his money are soon parted.

            'Becky's goin' home,' said Tom.

            The dogs' tails wagged in unison.

            Those 'pack dogs' of yours look more like wolves.



    Typos (-t switch)

        It's not Gutcheck's job to be a spelling checker, but it
        does check for a list of common typos and OCR errors if you
        use the -t switch. (The -x switch turns typo checking OFF;
        see Paranoid mode below.)

        It also checks for character combinations that rarely or
        never occur, especially those involving h and b, which OCR
        often confuses. For example, it queries "tbe" in a word. Now,
        "the" often occurs, but "tbe" is very rare (heartbeat, hotbed),
        so I'm playing the odds - a few false positives for many errors
        found. Similarly with "ii", which is a very common OCR error.

        Gutcheck reports each of the first 40 distinct "typos" it
        finds only once. This is to remove the annoyance of seeing
        something like "FN" (footnote) or "LK" (initials) flagged as
        a typo 147 times in a text.


    Line-end checking (-l switch to disable)

        All PG texts should have a Carriage Return (CR - character 13)
        and a Line Feed (LF - character 10) at end of each line,
        regardless of what O/S you made them on. DOS/Windows, Unix
        and Mac have different conventions, but the final text should
        always use a CR/LF pair as its line terminator.

        By default, Gutcheck verifies that every line does have
        the correct terminator. If you're on a work-in-progress
        in Linux, though, you might want to convert the line-ends as
        a final step, and not want to see thousands of errors every
        time you run Gutcheck before that final step, so you can turn
        off this checking with the -l switch.
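
        The "convert the line-ends as a final step" mentioned above
        can be done with many tools. As one illustration only, here is
        a minimal Python sketch; it is not part of gutcheck, and the
        function name to_crlf is made up for this example. It works on
        raw bytes so that no character encoding has to be assumed.

```python
def to_crlf(data: bytes) -> bytes:
    """Return data with every line terminator rewritten as CR/LF."""
    # First reduce CR/LF pairs and lone CRs to plain LF, so that no
    # terminator gets converted twice in the final step.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # Then expand every LF to the CR/LF pair that PG texts should use.
    return unified.replace(b"\n", b"\r\n")
```

        To use it, read the text file in binary mode, pass the bytes
        through to_crlf, and write the result back in binary mode.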


    Paranoid mode (-x switch to disable: Trust No One :-)

        -x automatically switches OFF typo-checking (the -t flag)
        and some extra checks, such as the standalone 1 and 0 queries.


    Overview mode (-o switch)

        This mode just gives a count of queries found
        instead of a detailed list.


    Header quote (-h switch)

        If you use the -h switch, gutcheck will also display
        the Title, Author, Release and Edition fields from the
        PG header. This is useful mostly for the automated
        checks we do on recently-posted texts.


    Errors to stdout (-y switch)

        If you're just running gutcheck normally, you can ignore
        this. It's only there for programs that provide a front
        end to gutcheck. It makes error messages appear within
        the output of gutcheck so that the front end knows whether
        gutcheck ran OK.


    Verbose reporting (-v switch)

        Normally, if gutcheck sees lots of long lines, short lines,
        spaced dashes, non-ASCII characters or dot-commas ".," it
        assumes these are features of the text, counts and summarizes
        them at the top of its report, but does not list them
        individually. If the -v switch is on, gutcheck will list them all.


    Markup interpretation (-m switch)

        Normally, gutcheck flags anything it suspects of being HTML
        markup as a possible error. When you use the -m switch,
        however, it matches anything that looks like markup against
        a short list of common HTML tags and entities. If the markup
        is in that list, it either ignores the markup, in the case
        of a tag, or "interprets" the markup as its nearest ASCII
        equivalent, in the case of an entity. So, for example, using
        this switch, gutcheck will "see"

            &ldquo;He went thataway!&rdquo;

        as

            "He went thataway!"

        and report accordingly.

        This switch does not, not, NOT check the validity of HTML;
        it exists so that you can run gutcheck on most HTML texts
        for PG, and get sane results. It does not support all tags.
        It does not support all entities. When it sees a tag or entity
        it does not recognize, it will query it as HTML just as if
        you hadn't specified the -m switch.

        Gutcheck 0.99 will automatically switch on markup interpretation
        if it sees a lot of tags that appear to be markup, so mostly, you
        won't have to specify this.


    User-defined typos (-u switch)

        If you have a file named gutcheck.typ either in your current
        working directory or in the directory from which you explicitly
        invoked gutcheck, but not necessarily on your path, and if you
        specify the -u switch, gutcheck will query any word specified
        in that file. The file is simple: one word, in lower case, per
        line. 999 lines are allowed for. Be careful not to put multiple
        words onto a line, or leave any rubbish other than the word on
        the line. You should have received a sample file gutcheck.typ
        with this package.


    Ignore DP markup (-d switch)

        Distributed Proofreaders (http://www.pgdp.net) is currently
        (2005) the main source of PG texts, and proofers there use
        special conventions. This switch understands those conventions,
        so that people can use gutcheck on files in process that
        haven't had the special conventions removed yet. The special
        conventions supported in 0.99 are page-separators and
        "<i>", "</i>", "/*", "*/", "/#", "#/", "/$", "$/".
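
        To picture what "ignoring" the DP conventions amounts to, here
        is a minimal Python sketch. It is an illustration only, not
        gutcheck's own code: it assumes the italics markers are the
        DP tags <i> and </i>, it leaves out page-separators because
        their exact form isn't described here, and the names DP_MARKERS
        and strip_dp_markers are invented for the example.

```python
# Markers a DP work-in-progress file may still contain (assumed list).
DP_MARKERS = ("<i>", "</i>", "/*", "*/", "/#", "#/", "/$", "$/")


def strip_dp_markers(line: str) -> str:
    """Return the line with every known DP marker removed."""
    for marker in DP_MARKERS:
        line = line.replace(marker, "")
    return line
```

        A checker could run each line through such a filter before
        applying its other tests, so that the markers themselves are
        never queried as stray slashes or HTML.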


You will probably only run gutcheck on a text once or maybe twice,
just prior to uploading; it usually finds a few formatting problems;
it also usually finds queries that aren't problems at all - it often
questions Tables of Contents for having short lines, for example.
These are called "false positives", and need a human to decide on
them.

The text should be standard prose, and already close to PG normal
format (plain text, about 70 characters per line with blank lines
between paragraphs).

Gutcheck merely draws your attention to things that might be errors.
It is NOT a substitute for human judgement. Formatting choices like
short lines may be for a reason that this program can't understand.

Even the most careful human proofing can leave errors behind in a
text, and there are several automated checks you can do to help find
them. Of these, spellchecking (with _very_ careful human judgement) is
the most important and most useful.

Gutcheck does perform some basic typo-checking if you ask it to,
but its focus is on formatting errors specific to PG texts -
mismatched quotes, non-ASCII characters, bad spacing, bad line
length, HTML tags perhaps left from a conversion, unbalanced
brackets.

Suggestions for additional checks would be appreciated and duly
considered, but no guarantees that they will be implemented.




How do _I_ use it?

    Practically everyone I give gutcheck to asks me how _I_ use it.
    Well, when I get a text for posting, say filename.txt, I run

        gutcheck -o filename.txt

    That gives me a quick idea what I'm dealing with. It'll tell
    me what kind of problems gutcheck sees, and give me an idea
    of how much more work needs to be done on the text. Keep in
    mind that gutcheck doesn't do anything like a full spellcheck,
    but when I see a text that has a lot of problems, I assume that
    it probably needs a spellcheck too.

    Having got a feel for the ballpark, I run

        gutcheck filename.txt > jj

    where jj is my personal, all-purpose filename for temporary data
    that doesn't need to be kept. Then I open filename.txt and jj in
    a split-screen view in my editor, and work down the text, fixing
    whatever needs fixing, and skipping whatever doesn't. If your
    editor doesn't split-screen, you can get much the same effect by
    opening your original file in your normal editor, and jj (or your
    equivalent name) in something like Notepad, keeping both in view
    at the same time.

    Twice a day, an automatic process looks at all recently-posted
    texts, and emails Michael, me, and sometimes other people with
    their gutcheck summaries.



Future development of gutcheck

    Gutcheck has gone about as far as it can, given its current
    structure. In order to add better singlequotes checking,
    sentence checking, better he/be checking and other good stuff
    that I'd like to see, I'll have to rewrite it from a different
    angle - looking at the syntax instead of the lines. And I'll
    probably get around to that sooner or later.

    Meantime, I'm just trying to get this version stabilized, so
    please report any bugs you find. When it is stable, I'll run
    up a Windows port for those timid souls who can't look a
    command line in the eye. :-)

    And I've started work on gutspell, a companion to gutcheck
    which will concentrate on spelling problems. PG spelling
    problems are unusual, since the range of texts we cover is
    so wide, and I'll be taking a somewhat unorthodox approach
    to writing this spelling-checker _specifically_ for texts
    containing a lot of dialect and uncommon words that have
    probably already been spell-checked against a standard
    modern dictionary.




Explanations of common gutcheck messages:

    --> 74 lines in this file have white space at end

    PG texts shouldn't have extra white space added at end of line.
    Don't worry too much about this; the extra spaces do no harm,
    and they'll be removed during posting anyway.


    --> 348 lines in this file are short. Not reporting short lines.
    --> 84 lines in this file are long. Not reporting long lines.
    --> 8 lines in this file are VERY long!

    If there are a lot of long or short lines, Gutcheck won't list
    them individually. The short lines version of this message
    is commonly seen when gutchecking poetry and some plays, where
    the normal line length is shorter than the standard for prose.
    A "VERY long" line is one over 80 characters. You normally
    shouldn't have any of these, but sometimes you may have to render
    a table that must be that long, or some special preformatted
    quotation that can't be broken.


    --> There are 75 spaced dashes and em-dashes in this file. Not reporting them.

    The PG standard for an emdash--like these--is two minus signs
    with no spaces before or after them. However, some older texts
    used spaced dashes - like these -- and if there are very many
    such spaced dashes in the file, gutcheck just draws your
    attention to it and doesn't list them individually.
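
    As a rough illustration of what counts as a "spaced dash", here
    is a small Python sketch. It is a guess at a workable rule, not
    gutcheck's actual test: it treats a single hyphen with a space on
    both sides, or a double hyphen with a space on either side, as
    spaced. The names SPACED_DASH and count_spaced_dashes are made up
    for the example.

```python
import re

# " - "  single hyphen with a space on each side
# " --"  double hyphen preceded by a space
# "-- "  double hyphen followed by a space
SPACED_DASH = re.compile(r" - | --|-- ")


def count_spaced_dashes(line: str) -> int:
    """Count the spaced dashes on one line of text."""
    return len(SPACED_DASH.findall(line))
```

    Summing this over every line gives the kind of total shown in
    the summary message above; a PG-standard dash--like this--is
    not counted.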



    Line 3020 - Non-ASCII character 233

    Standard PG texts should use only ASCII characters with values
    up to 127; however, non-English, accented characters can be
    represented according to several different non-ASCII encoding
    schemes, using values over 127. If you have a plain English text
    with a few accented characters in words like cafe or tete-a-tete,
    you should replace the accented characters with their unaccented
    versions. The English pound sign is another commonly-seen
    non-ASCII character. If you have enough non-ASCII characters in
    your text that you feel removing them would degrade your text
    unacceptably, you should probably consider doing an 8-bit text
    as well as a plain-ASCII version.



    Line 1207 - Non-ISO-8859 character 156

    Even in "8-bit" texts, there are distinctions between code sets.
    The ISO-8859 family of 8-bit code sets is the most commonly used
    in PG, and these sets do not define values in the range 128 through
    159 as printable characters. It's quite common for someone on a
    Windows or Mac machine to use a non-ISO character inadvertently,
    so this message warns that the character is not only not ASCII,
    but also outside the ISO-8859 range.



    Line 46 - Tab character?

    Some editors and WPs will put in Tab characters (character 9) to
    indicate indented text. You should not use these in a PG text,
    because you can't be sure how they will appear on a reader's
    screen. Find the Tab, and replace it with the appropriate number
    of spaces.


    Line 1327 - Tilde character?

    The tilde character (~) might be legitimately used, but it's the
    character commonly used by OCR software to indicate a place where
    it couldn't make out the letter, so gutcheck flags it.
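
    The four character checks just described can be pictured with a
    short Python sketch. This is an illustration under assumptions,
    not gutcheck's code: it assumes the line was read with the
    latin-1 codec, so that each byte maps to one character with the
    same numeric value, and it reports every offending character,
    which may be more often than gutcheck itself does. The function
    name query_characters is invented for the example.

```python
from typing import List


def query_characters(line_no: int, line: str) -> List[str]:
    """Return queries for suspicious characters on one line."""
    queries = []
    for ch in line:
        code = ord(ch)
        if 128 <= code <= 159:
            # Not ASCII, and not printable in the ISO-8859 sets either.
            queries.append(f"Line {line_no} - Non-ISO-8859 character {code}")
        elif code > 127:
            queries.append(f"Line {line_no} - Non-ASCII character {code}")
        elif ch == "\t":
            queries.append(f"Line {line_no} - Tab character?")
        elif ch == "~":
            queries.append(f"Line {line_no} - Tilde character?")
    return queries
```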



    Line 1347 - Asterisk?

    Asterisks are reported only in paranoid mode (see -x).
    Like tildes, they are often used to indicate errors, but they are
    also legitimately used as line delimiters and footnote markers.



    Line 1451 - Long line 129

    PG texts should have lines shorter than 76 characters. There may
    be occasions where you decide that you really have to go out to
    79 characters, but the sample above says that line 1451 is 129
    characters long - probably two lines run together.



    Line 1590 - Short line?

    PG texts should have lines longer than 54 characters. However,
    there are special cases like poetry and tables of contents where
    the lines _should_ be shorter. So treat Gutcheck warnings about
    short lines carefully. Sometimes it's a genuine formatting
    problem; sometimes the line really needs to be short.

    Hint: gutcheck will not flag lines as short if they are indented
    - if they start with a space. I like to start inserted stanzas
    and other such items indented with a couple of spaces so that
    they stand out from the main text anyway.



    Line 1804 - Begins with punctuation?

    Lines should normally not begin with commas, periods and so on.
    An exception is ellipses . . . which can happen at start of line.



    Line 1850 - Spaced em-dash?

    The PG standard for an em-dash--like these--is two minus signs
    with no spaces before or after them. Gutcheck flags non-PG
    em-dashes - like this one. Normally, you will replace it with a
    PG-standard em-dash.



    Line 1904 - Query he/be error?

    Gutcheck makes a very minor effort to look for that scourge of all
    proofreaders, "be" replacing "he" or vice-versa, and draws your
    attention to it when it thinks it has found one.



    Line 2017 - Query digit in a1most

    The digit 1 is commonly OCRed for the letter l, the digit 0 for
    the letter O, and so on. When gutcheck sees a mix of digits and
    letters, it warns you. It may generate a false positive for
    something like 7am.



    Line 2083 - Query standalone 0

    In paranoid mode (see -x) only, gutcheck warns about the digits 0
    and 1 standing alone as a word. This can happen if the OCR
    misreads the words O or I.



    Line 2115 - Query word whetber

    If you have switched typo-checking on, gutcheck looks for
    potential typos, especially common h/b errors. It's not
    infallible; it sometimes queries legit words, but it's
    always worth taking a look.



    Line 2190 column 14 - Missing space?

    Omitting a space is a very common error,especially coming from
    OCRed text,and can be hard for a human to spot. The commas in
    the previous sentence illustrate the kind of thing I mean.



    Line 2240 column 48 - Spaced punctuation?

    The flip side of the "missing space" error , here , is when extra
    spaces are added before punctuation . Some old texts appear to add
    extra spaces around punctuation consistently, but this was a
    typographical convention rather than the author's intent, and the
    extra "spaces" should be removed when preparing a PG text.



    Line 2301 column 19 - Unspaced quotes?

    Another common spacing problem occurs in a phrase like "You wait
    there,"he said.



    Line 2385 column 27 - Wrongspaced quotes?

    As of version 0.98, gutcheck adds extra checks on whether a quote
    seems to be a start or end quote, and queries those that appear to
    be misplaced. This does give rise to false positives when quotes are
    nested, for example:

    "And how," she asked, "will your "friends" help you now?"

    but these false positives are worth it because of the many cases
    that this test catches, notably those like:

    "And how, "she said," will your friends help you now?"

    Sometimes a "wrongspaced quotes" query will arise because an earlier
    quote in the paragraph was omitted, so if the place specified seems
    to be OK, look back to see whether there's a problem in the preceding
    lines.



    Line 2400 - HTML Tag?

    Some PG texts have been converted from HTML, and not all of the
    HTML tags have been removed.



    Line 2402 - HTML symbol? &emdash;

    Similarly, special HTML symbol characters can survive into PG
    texts. This check can occasionally produce amusing false
    positives like
    . . . Marwick & Co were well known for it;



    Line 2540 - Mismatched quotes

    Another gutcheck mainstay - unclosed doublequotes in a paragraph.
    See the discussion of quotes in the switches section near the
    start of this file.

    Since the mismatch doesn't occur on any one line, gutcheck quotes
    the line number of the first blank line following the paragraph,
    since this is the point where it reconciles the count of quotes.
    However, if gutcheck is echoing lines, that is, you haven't used
    the -e switch, it will show the _first_ line of the paragraph,
    to help you find the place without using line numbers. The
    offending paragraph is therefore between the quoted line and
    the line number given.
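
    The reconciliation described above can be sketched in a few lines
    of Python. This is only an illustration of the idea, not gutcheck's
    implementation: it counts doublequotes per paragraph, reports the
    number of the blank line that closes a paragraph with an odd count,
    and, unless strict is set (standing in for the -p switch), lets a
    paragraph off if the next one opens with a quote. The function name
    mismatched_quotes and its strict parameter are invented for the
    example.

```python
def mismatched_quotes(lines, strict=False):
    """Return the line numbers at which unclosed quotes are reported."""
    queries = []
    pending = None      # blank-line number of a paragraph awaiting a verdict
    count = 0           # doublequotes seen so far in the current paragraph
    in_paragraph = False
    # A final empty string makes sure the last paragraph gets closed.
    for number, line in enumerate(list(lines) + [""], start=1):
        text = line.strip()
        if text:
            if not in_paragraph:
                # A new paragraph starts: settle the previous one.
                if pending is not None:
                    if strict or not text.startswith('"'):
                        queries.append(pending)
                    pending = None
                in_paragraph = True
            count += text.count('"')
        elif in_paragraph:
            # The blank line is where the count is reconciled.
            if count % 2:
                pending = number
            count = 0
            in_paragraph = False
    if pending is not None:
        queries.append(pending)
    return queries
```

    For example, a three-line text whose first paragraph has one
    unclosed quote and whose next paragraph opens with a quote gives
    no query by default, but a query at the blank line when strict
    is set.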



    Line 2587 - Mismatched single quotes

    Only checked with the -s switch, since checking single quotes is
    not a very reliable process. Otherwise, the same logic as for
    doublequotes applies.



    Line 2877 - Mismatched round brackets?

    Curly and square brackets are checked too. Texts with a lot of
    brackets, like plays with bracketed stage instructions, may have
    mismatches.
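
    A simple way to picture this check is to compare the number of
    opening and closing brackets of each kind within a paragraph. The
    Python sketch below does just that; it is an assumption about a
    reasonable approach rather than a description of gutcheck's code,
    it does not look at nesting order, and the names BRACKETS and
    mismatched_brackets are made up for the example.

```python
# Each bracket kind with its opening and closing character.
BRACKETS = {"round": "()", "curly": "{}", "square": "[]"}


def mismatched_brackets(paragraph: str):
    """Return the kinds of bracket whose open and close counts differ."""
    return [
        kind
        for kind, (opener, closer) in BRACKETS.items()
        if paragraph.count(opener) != paragraph.count(closer)
    ]
```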


    Line 3150 - No CR?
    Line 3204 - Two successive CRs?
    Line 3281 position 75 - CR without LF?

    These are the invalid line-end warnings. See the discussion of
    line-end checking in the switches section near the start of this
    file. If you see these, and your editor doesn't show anything
    wrong, you should probably try deleting the characters just before
    and after the line end, and the line-end itself, then retyping the
    characters and the line-end.
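
    Because editors often hide these characters, it can help to look
    at the raw bytes. The Python sketch below classifies line ends in
    the spirit of the three warnings above. The exact conditions under
    which gutcheck issues each message are my assumption, the function
    name line_end_queries is invented, and the sketch reports line
    numbers only, not positions.

```python
CR, LF = 13, 10


def line_end_queries(data: bytes):
    """Return (line number, message) pairs for invalid line ends."""
    queries = []
    line_no = 1
    for i, byte in enumerate(data):
        following = data[i + 1] if i + 1 < len(data) else None
        if byte == CR:
            if following == CR:
                queries.append((line_no, "Two successive CRs?"))
            elif following != LF:
                queries.append((line_no, "CR without LF?"))
        elif byte == LF:
            if i == 0 or data[i - 1] != CR:
                queries.append((line_no, "No CR?"))
            line_no += 1    # lines are counted by their LF
    return queries
```

    Read the file in binary mode before passing it in; text mode
    may translate the line ends before you ever see them.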


    Line 2940 - Paragraph starts with lower-case

    A common error in an e-text is for an extra blank line

    to be put in, like the blank line above, and this often
    shows up as a new paragraph beginning with lower case.
    Sometimes the blank line is deliberate, as when a
    quotation is inserted in a speech. Use your judgement.


    Line 2987 - Extra period?

    An extra period. is a. common problem in OCRed text. and usually
    arises when a speck of dust on the page is mistaken for a period.
    or. as occasionally happens. when a comma loses its tail.


    Line 3012 column 12 - Double punctuation?

    Double punctuation., like that,, is a common typo and
    scanno. Some books have much legit double punctuation,
    like etc., etc., but it's worth checking anyway.
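
    To show how a check of this kind might report a column, here is a
    small Python sketch. The rule it uses is my own guess, not
    gutcheck's: it queries a comma, semicolon or colon followed by any
    punctuation mark, or a sentence-ending mark followed by a comma,
    semicolon or colon, so that ellipses are left alone. The names
    DOUBLE_PUNCT and double_punctuation_columns are invented for the
    example.

```python
import re

# Either ",", ";" or ":" followed by another punctuation mark, or
# ".", "!" or "?" followed by ",", ";" or ":". Runs of periods such
# as "..." deliberately do not match.
DOUBLE_PUNCT = re.compile(r"[,;:][.,;:!?]|[.!?][,;:]")


def double_punctuation_columns(line: str):
    """Return the 1-based column of each doubled punctuation mark."""
    return [match.start() + 1 for match in DOUBLE_PUNCT.finditer(line)]
```

    As the text above says, legitimate cases such as "etc., etc." are
    queried too, and need a human to wave them through.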



            *       *       *        *

For Windows-only users who are unfamiliar with DOS:

    If you're a Windows-only user, you need to save
    gutcheck.exe into the folder (directory) where the
    text file you want to check is. Let's say your
    text file is in C:\GUT, then you should save
    GUTCHECK.EXE into C:\GUT.

    Now get to a DOS prompt. You can do this by
    selecting the "Command Prompt" or "MS-DOS Prompt"
    option that will be somewhere on your
    Start/Programs menu.

    Now get into the C:\GUT directory.
    You can do this using the CD (change directory)
    command, like this:
        CD \GUT
    and your prompt will change to
        C:\GUT>
    so you know you're in the right place.

    Now type
        gutcheck yourfile.txt
    and you'll see gutcheck's report.

    By default, gutcheck prints its queries to screen.
    If you want to create a file of them, to edit
    against the text, you can use the greater-than
    sign (>) to tell it to output the report to a
    file. For example, if you want its report in a
    file called QUERIES.LST, you could type

        gutcheck yourfile.txt > queries.lst

    The queries.lst file will then contain the listing
    of possible formatting errors, and you can
    edit it alongside your text.

    Whatever you do, DON'T make the filename after
    the greater-than sign the name of a file already
    on your disk that you want to keep, because
    the greater-than sign will cause any existing
    file of that name to be replaced.

    So, for example, if you have two Tolstoy files
    that you want to check, called WARPEACE.TXT and
    ANNAK.TXT, make sure that neither of these names
    is ever used following the greater-than sign.
    To check these correctly, you might do:

        gutcheck warpeace.txt > war.lst

    and

        gutcheck annak.txt > annak.lst

    separately. Then you can look at war.lst and annak.lst
    to see the gutcheck reports.

            *       *       *        *


For existing 0.98 users upgrading to 0.99:

    If you run on old 16-bit DOS or Windows 3.x, I'm afraid
    you're out of luck. I'm not saying it _can't_ be compiled
    to run on 16-bit, but the executable with the package is
    for Win32 only. *nix users won't notice the change at all.


    There are two new switches: -u and -d.
    See above for full rundown.


Here's a list of the new errors:

    Line 1456 - Carat character?

    I^ve found a few.


    Line 1821 - Forward slash?

    Common error for italicized "I", or so /'ve found.


    Line 2139 - Query missing paragraph break?

    "Come here, son." "Do I _have_ to go, dad?"
    Like that. False positives in some texts. Sorry 'bout that,
    but these are often errors.


    Line 2200 - Query had/bad error?

    Clear enough. Doesn't catch as many as I'd like it to,
    but rarely gives false alarms.


    Line 2268 - Query punctuation after the?

    Some words, like "the", very rarely have punctuation
    following them. Others, like "Mrs", usually have a
    period, but never a comma. Occasional false positives.


    Line 2380 - Query possible scanno arid

    It found one of your user-defined typos when you
    used the -u switch.
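
    To make the -u mechanism concrete, here is a minimal Python
    sketch of a user-defined typo list. It follows the file format
    described in the switches section (one lower-case word per line,
    999 lines allowed for), but everything else is an assumption made
    for illustration: the way words are split, the case-insensitive
    match, and the names MAX_USER_TYPOS, load_user_typos and
    query_user_typos are not taken from gutcheck.

```python
import re

MAX_USER_TYPOS = 999    # the number of lines the typo file allows for


def load_user_typos(lines):
    """Build the set of user typos from the lines of a typo file."""
    words = []
    for line in lines:
        word = line.strip()
        if word:
            words.append(word)
        if len(words) == MAX_USER_TYPOS:
            break
    return set(words)


def query_user_typos(line_no, line, typos):
    """Return a query for each word on the line found in the typo set."""
    return [
        f"Line {line_no} - Query possible scanno {word}"
        for word in re.findall(r"[A-Za-z']+", line)
        if word.lower() in typos
    ]
```

    In practice the lines would come from opening gutcheck.typ, for
    example load_user_typos(open("gutcheck.typ")).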


    Line 2511 - Capital "S"?

    Surprisingly common specific case, like: Jane'S


    Line 3469 - endquote missing punctuation?

    OK. This one can really cause a lot of false positives
    in some books, but it switches itself off if it finds
    more than 20 in a text, unless you force it to list them
    all with the -v switch.
    "Hey, dad" Johnny said, "can we go now?"
    is a common punctuation-missing error.


    Line 4266 - Mismatched underscores?

    Like mismatched anything else!
