ali@0:
ali@0:
ali@74:                     Bookloupe documentation
ali@0:
ali@0:
ali@74: bookloupe: lists possible common formatting errors in a Project
ali@74: Gutenberg candidate file. Bookloupe is based on gutcheck, written
ali@74: by Jim Tinsley. It is a command line program and can be used under
ali@74: Microsoft Windows, Mac or Unix. For Windows-only people, there is
ali@74: an appendix at the end with brief instructions for running it.
ali@0:
ali@87: Current version: 1.94, a beta version leading up to version 2.0
ali@0:
ali@74: This software is Copyright Jim Tinsley 2000-2005 and
ali@74: J. Ali Harlow 2012 onwards.
ali@0:
ali@74: Bookloupe comes with ABSOLUTELY NO WARRANTY. For details, read the file COPYING.
ali@0: This is Free Software; you may redistribute it under certain conditions (GPL).
ali@0:
ali@74: See http://www.juiblex.co.uk/pgdp/bookloupe/ for the latest version.
ali@0:
ali@0:
ali@74: Usage is: bookloupe [-setopxlywm] filename
ali@0:   where:
ali@0:     -s checks Single quotes
ali@0:     -e switches off Echoing of lines
ali@0:     -t checks Typos
ali@0:     -o produces an Overview only
ali@0:     -p sets strict quotes checking for Paragraphs
ali@0:     -x (paranoid) switches OFF typo checking and extra checks
ali@0:     -l turns off Line-end checks
ali@0:     -y sets error messages to stdout
ali@0:     -w is a special mode for web uploads (for future use)
ali@0:     -v (verbose) forces individual reporting of minor problems
ali@0:     -m interprets Markup of some common HTML tags and entities
ali@0:     -u warns about words in a user-defined typo file gutcheck.typ
ali@0:     -d ignores some DP-specific markup
ali@0:
ali@74: Running bookloupe without any parameters will display a brief help message.
ali@0:
ali@0: Sample usage:
ali@0:
ali@74:     bookloupe warpeace.txt
ali@0:
ali@0:
ali@0: More detail:
ali@0:
ali@74:     Character encoding
ali@74:
ali@74:         Bookloupe will handle e-texts encoded in UTF-8 (preferred),
ali@74:         ISO-8859-1 (also known as Latin-1), or WINDOWS-1252 (also known,
ali@74:         incorrectly, as ansi). The output will be in the same encoding
ali@74:         as the input e-text.
ali@74:
ali@0:     Echoing lines (-e to switch off)
ali@0:
ali@74:         You may find it convenient, when reviewing Bookloupe's
ali@74:         suggestions, to see the line that Bookloupe is questioning.
ali@0:         That way, you can often see at a glance whether it is
ali@0:         a real error that needs to be fixed, or a false positive:
ali@74:         something that should be in the text, but that Bookloupe's
ali@0:         limited programming doesn't understand.
ali@0:
ali@74:         By default, bookloupe echoes these lines, but if you don't
ali@0:         want to see the lines referred to, -e will switch it OFF.
ali@0:
ali@0:
ali@0:     Quotes (-s and -p switches)
ali@0:
ali@74:         Bookloupe always looks for unbalanced doublequotes in a
ali@0:         paragraph. It is a common convention for writers not to
ali@0:         close quotes in a paragraph if the next paragraph opens
ali@0:         with quotes and is a continuation by the same speaker.
ali@0:
ali@74:         Bookloupe therefore does not normally report unclosed quotes
ali@0:         if the next paragraph begins with a quote. If you need
ali@0:         to see all unclosed quotes, even where the next paragraph
ali@0:         begins with a quote, you should use the -p switch.
ali@0:
ali@0:         Singlequotes (') are a problem, since the same character
ali@0:         is used for an apostrophe. I'm not sure that it is
ali@0:         possible to get 100% accuracy on singlequotes checking,
ali@0:         particularly since dialect, quite common in PG texts,
ali@0:         upsets the normal rules so badly. Consider the sentence:
ali@0:             'Tis often said that a man's a man for a' that.
ali@0:         As humans, we recognize that all three apostrophes are used
ali@0:         for contractions rather than quotes, but it isn't easy
ali@0:         to get a program to recognize that.
ali@0:
ali@74:         Since bookloupe makes too many mistakes when trying to match
ali@0:         singlequotes, it doesn't look for unbalanced singlequotes
ali@0:         unless you specify the -s switch.
ali@0:
ali@0:         Consider these sentences, which illustrate the main cases:
ali@0:
ali@0:             'Tis often said that a fool and his money are soon parted.
ali@0:
ali@0:             'Becky's goin' home,' said Tom.
ali@0:
ali@0:             The dogs' tails wagged in unison.
ali@0:
ali@0:             Those 'pack dogs' of yours look more like wolves.
ali@0:
ali@0:
ali@0:
ali@0:     Typos (-t switch)
ali@0:
ali@74:         It's not bookloupe's job to be a spelling checker, but it
ali@0:         does check for a list of common typos and OCR errors if you
ali@0:         use the -t switch. (The -x switch turns typo checking off.)
ali@0:
ali@0:         It also checks for character combinations that rarely or never
ali@0:         occur, especially those involving h and b, which OCR often
ali@0:         confuses. For example, it queries "tbe" in a word. Now, "the" often
ali@0:         occurs, but "tbe" is very rare (heartbeat, hotbed), so I'm
ali@0:         playing the odds - a few false positives for many errors found.
ali@0:         Similarly with "ii", which is a very common OCR error.
ali@0:
ali@74:         Bookloupe suppresses multiple reporting of the first 40 "typos"
ali@0:         found. This is to remove the annoyance of seeing something like
ali@0:         "FN" (footnote) or "LK" (initials) flagged as a typo 147 times
ali@0:         in a text.
ali@0:
ali@0:
ali@0:     Line-end checking (-l switch to disable)
ali@0:
ali@0:         All PG texts should have a Carriage Return (CR - character 13)
ali@0:         and a Line Feed (LF - character 10) at the end of each line,
ali@0:         regardless of what O/S you made them on. DOS/Windows, Unix
ali@0:         and Mac have different conventions, but the final text should
ali@0:         always use a CR/LF pair as its line terminator.
ali@0:
ali@74:         By default, bookloupe verifies that every line does have
ali@0:         the correct terminator, but if you're on a work-in-progress
ali@0:         in Linux, you might want to convert the line-ends as a final
ali@0:         step, and not want to see thousands of errors every time you
ali@74:         run bookloupe before that final step, so you can turn off
ali@0:         this checking with the -l switch.
ali@0:
ali@0:
ali@0:     Paranoid mode (-x switch to disable: Trust No One :-)
ali@0:
ali@0:         -x automatically switches OFF typo-checking (the -t flag)
ali@0:         and some extra checks, such as the standalone 1 and 0 queries.
ali@0:
ali@0:
ali@0:     Overview mode (-o switch)
ali@0:
ali@74:         This mode just gives a count of queries found
ali@74:         instead of a detailed list.
ali@0:
ali@0:
ali@0:     Header quote (-h switch)
ali@0:
ali@74:         If you use the -h switch, bookloupe will also display
ali@74:         the Title, Author, Release and Edition fields from the
ali@74:         PG header. This is useful mostly for the automated
ali@74:         checks we do on recently-posted texts.
ali@0:
ali@0:
ali@0:     Errors to stdout (-y switch)
ali@0:
ali@74:         If you're just running bookloupe normally, you can ignore
ali@74:         this. It's only there for programs that provide a front
ali@74:         end to bookloupe. It makes error messages appear within
ali@74:         the output of bookloupe so that the front end knows whether
ali@74:         bookloupe ran OK.
ali@0:
ali@0:
ali@0:     Verbose reporting (-v switch)
ali@0:
ali@74:         Normally, if bookloupe sees lots of long lines, short lines,
ali@74:         spaced dashes, non-ASCII characters or dot-commas ".," it
ali@74:         assumes these are features of the text, counts and summarizes
ali@74:         them at the top of its report, but does not list them
ali@74:         individually. If the -v switch is on, bookloupe will list them all.
ali@0:
ali@0:
ali@0:     Markup interpretation (-m switch)
ali@0:
ali@74:         Normally, bookloupe flags anything it suspects of being HTML
ali@74:         markup as a possible error. When you use the -m switch,
ali@74:         however, it matches anything that looks like markup against
ali@74:         a short list of common HTML tags and entities. If the markup
ali@74:         is in that list, it either ignores the markup, in the case
ali@74:         of a tag, or "interprets" the markup as its nearest ASCII
ali@74:         equivalent, in the case of an entity. So, for example, using
ali@74:         this switch, bookloupe will "see"
ali@0:
ali@74:             &ldquo;He went thataway!&rdquo;
ali@0:
ali@74:         as
ali@0:
ali@74:             "He went thataway!"
ali@0:
ali@74:         and report accordingly.
ali@0:
ali@74:         This switch does not, not, NOT check the validity of HTML;
ali@74:         it exists so that you can run bookloupe on most HTML texts
ali@74:         for PG, and get sane results. It does not support all tags.
ali@74:         It does not support all entities. When it sees a tag or entity
ali@74:         it does not recognize, it will query it as HTML just as if
ali@74:         you hadn't specified the -m switch.
ali@0:
ali@74:         Bookloupe will automatically switch on markup interpretation
ali@74:         if it sees a lot of tags that appear to be markup, so mostly, you
ali@74:         won't have to specify this.
ali@0:
ali@0:     User-defined typos (-u switch)
ali@0:
ali@74:         If you have a file named bookloupe.typ or gutcheck.typ either
ali@74:         in your current working directory or in the directory from
ali@74:         which you explicitly invoked bookloupe, but not necessarily on
ali@74:         your path, and if you specify the -u switch, bookloupe will
ali@74:         query any word specified in that file. The file is simple: one
ali@74:         word, in lower case, per line. Be careful not to put multiple
ali@74:         words onto a line, or leave any rubbish other than the word on
ali@74:         the line. You should have received a sample file bookloupe.typ
ali@74:         with this package. The file may be encoded in UTF-8 (preferred),
ali@74:         ISO-8859-1 (also known as Latin-1), or WINDOWS-1252 (also known,
ali@74:         incorrectly, as ansi).
ali@0:
ali@0:     Ignore DP markup (-d switch)
ali@0:
ali@74:         Distributed Proofreaders (http://www.pgdp.net) has for some
ali@74:         time been the main source of PG texts, and proofers there use
ali@74:         special conventions. This switch understands those conventions,
ali@74:         so that people can use bookloupe on files in progress that
ali@74:         haven't had the special conventions removed yet. The special
ali@74:         conventions supported are page-separators and
ali@74:         "<i>", "</i>", "/*", "*/", "/#", "#/", "/$", "$/".
ali@0:
ali@0:
ali@74: You will probably only run bookloupe on a text once or maybe twice,
ali@0: just prior to uploading; it usually finds a few formatting problems;
ali@0: it also usually finds queries that aren't problems at all - it often
ali@0: questions Tables of Contents for having short lines, for example.
ali@74: These are called "false positives," and need a human to decide on
ali@0: them.
ali@0:
ali@0: The text should be standard prose, and already close to PG normal
ali@0: format (plain text, about 70 characters per line with blank lines
ali@0: between paragraphs).
ali@0:
ali@74: Bookloupe merely draws your attention to things that might be errors.
ali@0: It is NOT a substitute for human judgement. Formatting choices like
ali@0: short lines may be for a reason that this program can't understand.
ali@0:
ali@0: Even the most careful human proofing can leave errors behind in a
ali@0: text, and there are several automated checks you can do to help find
ali@0: them. Of these, spellchecking (with _very_ careful human judgement) is
ali@0: the most important and most useful.
ali@0:
ali@74: Bookloupe does perform some basic typo-checking if you ask it to,
ali@74: but its focus is on formatting errors specific to PG texts—
ali@0: mismatched quotes, non-ASCII characters, bad spacing, bad line
ali@0: length, HTML tags perhaps left from a conversion, unbalanced
ali@0: brackets.
ali@0:
ali@0: Suggestions for additional checks would be appreciated and duly
ali@0: considered, but no guarantees that they will be implemented.
ali@0:
ali@0:
ali@0:
ali@74: How does Jim Tinsley use gutcheck?
ali@0:
ali@0:     Practically everyone I give gutcheck to asks me how _I_ use it.
ali@0:     Well, when I get a text for posting, say filename.txt, I run
ali@0:
ali@0:         gutcheck -o filename.txt
ali@0:
ali@0:     That gives me a quick idea what I'm dealing with. It'll tell
ali@0:     me what kind of problems gutcheck sees, and give me an idea
ali@0:     of how much more work needs to be done on the text. Keep in
ali@0:     mind that gutcheck doesn't do anything like a full spellcheck,
ali@0:     but when I see a text that has a lot of problems, I assume that
ali@0:     it probably needs a spellcheck too.
ali@0:
ali@0:     Having got a feel for the ballpark, I run
ali@0:
ali@0:         gutcheck filename.txt > jj
ali@0:
ali@0:     where jj is my personal, all-purpose filename for temporary data
ali@0:     that doesn't need to be kept. Then I open filename.txt and jj in
ali@0:     a split-screen view in my editor, and work down the text, fixing
ali@0:     whatever needs fixing, and skipping whatever doesn't. If your
ali@0:     editor doesn't split-screen, you can get much the same effect by
ali@0:     opening your original file in your normal editor, and jj (or your
ali@0:     equivalent name) in something like Notepad, keeping both in view
ali@0:     at the same time.
ali@0:
ali@0:     Twice a day, an automatic process looks at all recently-posted
ali@0:     texts, and emails Michael, me, and sometimes other people with
ali@0:     their gutcheck summaries.
ali@0:
ali@0:
ali@0:
ali@74: Future development of bookloupe
ali@0:
ali@74:     Bookloupe version 2.0 is intended to add UTF-8 support to
ali@74:     gutcheck. All the functionality should already be implemented
ali@74:     in the beta versions leading up to version 2.0, although
ali@74:     some bugs may well remain.
ali@0:
ali@74:     Future versions will add support for UTF-8 characters that
ali@74:     are not in ISO-8859-1 (e.g., curled quotation marks);
ali@74:     characters that do not have a composed form (version 2
ali@74:     treats these as taking 2 or more columns); zero width and
ali@74:     wide characters (version 2 treats these as taking 1 column).
ali@0:
ali@0:
ali@0:
ali@74: Explanations of common bookloupe messages:
ali@0:
ali@0:     --> 74 lines in this file have white space at end
ali@0:
ali@0:     PG texts shouldn't have extra white space added at end of line.
ali@0:     Don't worry too much about this; they're not doing any harm,
ali@0:     and they'll be removed during posting anyway.
ali@0:
ali@0:
ali@0:     --> 348 lines in this file are short. Not reporting short lines.
ali@0:     --> 84 lines in this file are long. Not reporting long lines.
ali@0:     --> 8 lines in this file are VERY long!
ali@0:
ali@74:     If there are a lot of long or short lines, bookloupe won't list
ali@0:     them individually. The short lines version of this message
ali@0:     is commonly seen when gutchecking poetry and some plays, where
ali@0:     the normal line length is shorter than the standard for prose.
ali@0:     A "VERY long" line is one over 80 characters. You normally
ali@0:     shouldn't have any of these, but sometimes you may have to render
ali@0:     a table that must be that long, or some special preformatted
ali@0:     quotation that can't be broken.
ali@0:
ali@0:
ali@0:     --> There are 75 spaced dashes and em-dashes in this file. Not reporting them.
ali@0:
ali@0:     The PG standard for an emdash--like these--is two minus signs
ali@0:     with no spaces before or after them. However, some older texts
ali@0:     used spaced dashes - like these -- and if there are very many
ali@74:     such spaced dashes in the file, bookloupe just draws your
ali@0:     attention to them and doesn't list them individually.
ali@0:
ali@0:
ali@0:
ali@0:     Line 3020 - Non-ASCII character 233
ali@0:
ali@0:     Standard PG texts should use only ASCII characters with values
ali@0:     up to 127; however, non-English, accented characters can be
ali@0:     represented according to several different non-ASCII encoding
ali@0:     schemes, using values over 127. If you have a plain English text
ali@0:     with a few accented characters in words like cafe or tete-a-tete,
ali@74:     you might replace the accented characters with their unaccented
ali@0:     versions. The English pound sign is another commonly-seen
ali@0:     non-ASCII character. If you have enough non-ASCII characters in
ali@74:     your text that you feel removing them would degrade your text,
ali@74:     you should probably consider doing a UTF-8 text.
ali@0:
ali@0:
ali@0:
ali@0:     Line 1207 - Non-ISO-8859 character 156
ali@0:
ali@0:     Even in "8-bit" texts, there are distinctions between code sets.
ali@0:     The ISO-8859 family of 8-bit code sets is the most commonly used
ali@0:     in PG, and these sets do not define values in the range 128 through
ali@0:     159 as printable characters. It's quite common for someone on a
ali@0:     Windows or Mac machine to use a non-ISO character inadvertently,
ali@0:     so this message warns that the character is not only not ASCII,
ali@0:     but also outside the ISO-8859 range.
ali@0:
ali@0:
ali@0:
ali@0:     Line 46 - Tab character?
ali@0:
ali@0:     Some editors and word processors will put in Tab characters
ali@0:     (character 9) to indicate indented text. You should not use these
ali@0:     in a PG text, because you can't be sure how they will appear on a
ali@0:     reader's screen. Find the Tab, and replace it with the appropriate
ali@0:     number of spaces.
ali@0:
ali@0:
ali@0:     Line 1327 - Tilde character?
ali@0:
ali@0:     The tilde character (~) might be legitimately used, but it's the
ali@0:     character commonly used by OCR software to indicate a place where
ali@74:     it couldn't make out the letter, so bookloupe flags it.
ali@0:
ali@0:
ali@0:
ali@0:     Line 1347 - Asterisk?
ali@0:
ali@0:     Asterisks are reported only in paranoid mode (see -x).
ali@0:     Like tildes, they are often used to indicate errors, but they are
ali@0:     also legitimately used as line delimiters and footnote markers.
ali@0:
ali@0:
ali@0:
ali@0:     Line 1451 - Long line 129
ali@0:
ali@0:     PG texts should have lines shorter than 76 characters. There may be
ali@0:     occasions where you decide that you really have to go out to 79 characters,
ali@74:     but the sample above says that line 1451 is 129 characters long—
ali@0:     probably two lines run together.
ali@0:
ali@0:
ali@0:
ali@0:     Line 1590 - Short line?
ali@0:
ali@0:     PG texts should have lines longer than 54 characters. However,
ali@0:     there are special cases like poetry and tables of contents where
ali@74:     the lines _should_ be shorter. So treat bookloupe warnings about
ali@0:     short lines carefully. Sometimes it's a genuine formatting
ali@0:     problem; sometimes the line really needs to be short.
ali@0:
ali@74:     Hint: bookloupe will not flag lines as short if they are indented
ali@74:     —if they start with a space. I like to start inserted stanzas
ali@0:     and other such items indented with a couple of spaces so that
ali@0:     they stand out from the main text anyway.
ali@0:
ali@0:
ali@0:
ali@0:     Line 1804 - Begins with punctuation?
ali@0:
ali@0:     Lines should normally not begin with commas, periods and so on.
ali@0:     An exception is ellipses . . . which can happen at start of line.
ali@0:
ali@0:
ali@0:
ali@0:     Line 1850 - Spaced em-dash?
ali@0:
ali@0:     The PG standard for an em-dash--like these--is two minus signs
ali@74:     with no spaces before or after them. Bookloupe flags non-PG
ali@0:     em-dashes - like this one. Normally, you will replace it with a
ali@0:     PG-standard em-dash.
ali@0:
ali@0:
ali@0:
ali@0:     Line 1904 - Query he/be error?
ali@0:
ali@74:     Bookloupe makes a very minor effort to look for that scourge of all
ali@0:     proofreaders, "be" replacing "he" or vice-versa, and draws your
ali@0:     attention to it when it thinks it has found one.
ali@0:
ali@0:
ali@0:
ali@0:     Line 2017 - Query digit in a1most
ali@0:
ali@0:     The digit 1 is commonly OCRed for the letter l, the digit 0 for
ali@74:     the letter O, and so on. When bookloupe sees a mix of digits and
ali@0:     letters, it warns you. It may generate a false positive for
ali@0:     something like 7am.
ali@0:
ali@0:
ali@0:
ali@0:     Line 2083 - Query standalone 0
ali@0:
ali@74:     In paranoid mode (see -x) only, bookloupe warns about the digits 0
ali@0:     and 1 standing alone as a word. This can happen if the
ali@0:     OCR misreads the words O or I.
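The two digit queries above can be illustrated with a short Python sketch. This is not bookloupe's own code (bookloupe is a separate command-line program); the function name query_digits and the way words are split are assumptions made only for this illustration.

```python
import string


def query_digits(line, paranoid=True):
    """Return query messages for one line of text (illustrative only)."""
    queries = []
    for token in line.split():
        # Drop surrounding punctuation so that "7am," is examined as "7am".
        word = token.strip(string.punctuation)
        has_digit = any(ch.isdigit() for ch in word)
        has_alpha = any(ch.isalpha() for ch in word)
        if has_digit and has_alpha:
            # A mix of digits and letters, as in "a1most".
            queries.append("Query digit in " + word)
        elif paranoid and word in ("0", "1"):
            # A lone 0 or 1 may be a misread "O" or "I".
            queries.append("Query standalone " + word)
    return queries


print(query_digits("He was a1most there at 7am, 0 Lord!"))
```

As the text warns, "7am" is queried alongside "a1most"; a check this simple cannot tell the two apart, which is why a human has to decide.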
ali@0:
ali@0:
ali@0:
ali@0:     Line 2115 - Query word whetber
ali@0:
ali@74:     If you have switched typo-checking on, bookloupe looks for
ali@0:     potential typos, especially common h/b errors. It's not
ali@0:     infallible; it sometimes queries legit words, but it's
ali@0:     always worth taking a look.
ali@0:
ali@0:
ali@0:
ali@0:     Line 2190 column 14 - Missing space?
ali@0:
ali@0:     Omitting a space is a very common error,especially coming from
ali@0:     OCRed text,and can be hard for a human to spot. The commas in
ali@0:     the previous sentence illustrate the kind of thing I mean.
ali@0:
ali@0:
ali@0:
ali@0:     Line 2240 column 48 - Spaced punctuation?
ali@0:
ali@0:     The flip side of the "missing space" error , here , is when extra
ali@0:     spaces are added before punctuation . Some old texts appear to add
ali@0:     extra spaces around punctuation consistently, but this was a
ali@0:     typographical convention rather than the author's intent, and the
ali@0:     extra "spaces" should be removed when preparing a PG text.
ali@0:
ali@0:
ali@0:
ali@0:     Line 2301 column 19 - Unspaced quotes?
ali@0:
ali@0:     Another common spacing problem occurs in a phrase like "You wait
ali@0:     there,"he said.
ali@0:
ali@0:
ali@0:
ali@0:     Line 2385 column 27 - Wrongspaced quotes?
ali@0:
ali@74:     Bookloupe checks whether a quote seems to be a start or end quote,
ali@74:     and queries those that appear to be misplaced. This does give rise
ali@74:     to false positives when quotes are nested, for example:
ali@0:
ali@0:         "And how," she asked, "will your "friends" help you now?"
ali@0:
ali@0:     but these false positives are worth it because of the many cases
ali@0:     that this test catches, notably those like:
ali@0:
ali@0:         "And how, "she said," will your friends help you now?"
ali@0:
ali@0:     Sometimes a "wrongspaced quotes" query will arise because an earlier
ali@0:     quote in the paragraph was omitted, so if the place specified seems
ali@0:     to be OK, look back to see whether there's a problem in the preceding
ali@0:     lines.
ali@0:
ali@0:
ali@0:
ali@0:     Line 2400 - HTML Tag?
ali@0: 
ali@0:     Some PG texts have been converted from HTML, and not all of the
ali@0:     HTML tags have been removed.
ali@0: 
ali@0: 
ali@0: 
ali@0:     Line 2402 - HTML symbol? &emdash;
ali@0: 
ali@0:     Similarly, special HTML symbol characters can survive into PG
ali@0:     texts. This check can occasionally produce amusing false positives like
ali@0:     . . . Marwick & Co were well known for it;
ali@0: 
ali@0: 
ali@0: 
ali@0:     Line 2540 - Mismatched quotes
ali@0: 
ali@74:     Another bookloupe mainstay—unclosed doublequotes in a paragraph.
ali@0:     See the discussion of quotes in the switches section near the
ali@0:     start of this file.
ali@0:     
ali@74:     Since the mismatch doesn't occur on any one line, bookloupe quotes
ali@0:     the line number of the first blank line following the paragraph,
ali@0:     since this is the point where it reconciles the count of quotes.
ali@74:     However, if bookloupe is echoing lines, that is, you haven't used
ali@0:     the -e switch, it will show the _first_ line of the paragraph, 
ali@0:     to help you find the place without using line numbers. The 
ali@0:     offending paragraph is therefore between the quoted line and 
ali@0:     the line number given.
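The reporting rule just described can be sketched in Python. This is only an illustration of the logic as documented here, not bookloupe's implementation; the function name mismatched_quotes and its strict parameter (standing in for the -p switch) are hypothetical.

```python
def mismatched_quotes(lines, strict=False):
    """Return (line_number, first_line) for paragraphs with unbalanced
    double quotes. line_number is the 1-based number of the first blank
    line after the paragraph; if the text ends without a blank line, it
    is one past the last line."""
    paragraphs = []  # (first line, number of closing blank line, quote count)
    current = []
    # The extra "" acts as a closing blank line for the final paragraph.
    for number, line in enumerate(list(lines) + [""], start=1):
        if line.strip():
            current.append(line)
        elif current:
            count = sum(text.count('"') for text in current)
            paragraphs.append((current[0], number, count))
            current = []

    results = []
    for index, (first, end_number, count) in enumerate(paragraphs):
        if count % 2 == 0:
            continue  # quotes balance within this paragraph
        following = paragraphs[index + 1][0] if index + 1 < len(paragraphs) else ""
        continued = following.lstrip().startswith('"')
        # Without strict checking, an unclosed quote is tolerated when the
        # next paragraph opens with a quote (same speaker continuing).
        if strict or not continued:
            results.append((end_number, first))
    return results


text = [
    '"It is a long road,',
    'and a hard one.',
    '',
    '"But I will go."',
    '',
    'He said, "Goodbye.',
    '',
    'Then he left.',
]
print(mismatched_quotes(text))
print(mismatched_quotes(text, strict=True))
```

In the sample, the first paragraph is left alone by default because the next one opens with a quote, and it is reported only when strict checking is requested; the paragraph starting 'He said' is reported either way, with the number of the blank line that follows it.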
ali@0: 
ali@0: 
ali@0: 
ali@0:     Line 2587 - Mismatched single quotes
ali@0: 
ali@0:     Only checked with the -s switch, since checking single quotes is 
ali@0:     not a very reliable process. Otherwise, the same logic as for 
ali@0:     doublequotes applies.
ali@0: 
ali@0: 
ali@0: 
ali@0:     Line 2877 - Mismatched round brackets?
ali@0: 
ali@0:     Also curly and square brackets. Texts with a lot of brackets, like
ali@0:     plays with bracketed stage instructions, may have mismatches.
ali@0: 
ali@0: 
ali@0:     Line 3150 - No CR?
ali@0:     Line 3204 - Two successive CRs?
ali@0:     Line 3281 position 75 - CR without LF?
ali@0: 
ali@0:     These are the invalid line-end warnings. See the discussion of
ali@0:     line-end checking in the switches section near the start of this
ali@0:     file. If you see these, and your editor doesn't show anything
ali@0:     wrong, you should probably try deleting the characters just before
ali@0:     and after the line end, and the line-end itself, then retyping the
ali@0:     characters and the line-end.
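When an editor hides the problem, it can help to look at the raw bytes. The Python sketch below is one way to do that, assuming the file is read in binary mode; the function name check_line_ends and its message wording are made up for this example and are not bookloupe's own output.

```python
def check_line_ends(data):
    """Return (line_number, message) pairs for a bytes object whose
    lines are expected to end in a CR/LF pair."""
    problems = []
    pieces = data.split(b"\n")
    if pieces[-1] == b"":
        pieces.pop()  # the data ended with LF, so there is no partial last line
    for number, piece in enumerate(pieces, start=1):
        ends_with_cr = piece.endswith(b"\r")
        body = piece[:-1] if ends_with_cr else piece
        if not ends_with_cr:
            problems.append((number, "line does not end with CR/LF"))
        if b"\r" in body:
            # A CR that is not immediately followed by LF.
            problems.append((number, "stray CR inside line"))
    return problems


sample = b"one\r\ntwo\nthr\ree\r\n"
print(check_line_ends(sample))
```

To try it on a real file, read the bytes with open("yourfile.txt", "rb").read() and pass them to the function.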
ali@0: 
ali@0: 
ali@0:     Line 2940 - Paragraph starts with lower-case
ali@0: 
ali@0:     A common error in an e-text is for an extra blank line
ali@0: 
ali@0:     to be put in, like the blank line above, and this often
ali@0:     shows up as a new paragraph beginning with lower case.
ali@0:     Sometimes the blank line is deliberate, as when a 
ali@0:     quotation is inserted in a speech. Use your judgement.
ali@0: 
ali@0: 
ali@0:     Line 2987 - Extra period?
ali@0: 
ali@0:     An extra period. is a. common problem in OCRed text. and usually
ali@0:     arises when a speck of dust on the page is mistaken for a period.
ali@0:     or. as occasionally happens. when a comma loses its tail.
ali@0: 
ali@0: 
ali@0:     Line 3012 column 12 - Double punctuation?
ali@0: 
ali@0:     Double punctuation., like that,, is a common typo and
ali@0:     scanno. Some books have much legit double punctuation,
ali@0:     like etc., etc., but it's worth checking anyway.
ali@0: 
ali@0: 
ali@0: 
ali@0:             *       *       *        *
ali@0: 
ali@0: For Windows-only users who are unfamiliar with DOS:
ali@0: 
ali@0:     If you're a Windows-only user, you need to save
ali@74:     bookloupe.exe into the folder (directory) where the
ali@0:     text file you want to check is. Let's say your
ali@74:     text file is in C:\gut, then you should save
ali@74:     bookloupe.exe into C:\gut.
ali@0: 
ali@74:     Now get to a console. You can do this by
ali@0:     selecting the "Command Prompt" or "MS-DOS Prompt"
ali@0:     option that will be somewhere on your
ali@0:     Start/Programs menu.
ali@0: 
ali@74:     Now get into the C:\gut directory. 
ali@74:     You can do this using the cd (change directory) 
ali@0:     command, like this:
ali@74:         cd \gut
ali@0:     and your prompt will change to 
ali@74:         C:\gut>
ali@0:     so you know you're in the right place.
ali@0: 
ali@0:     Now type
ali@74:         bookloupe yourfile.txt
ali@74:     and you'll see bookloupe's report.
ali@0: 
ali@74:     By default, bookloupe prints its queries to screen.
ali@0:     If you want to create a file of them, to edit
ali@0:     against the text, you can use the greater-than
ali@0:     sign (>) to tell it to output the report to a
ali@0:     file. For example, if you want its report in a
ali@74:     file called queries.lst, you could type
ali@74: 
ali@74:         bookloupe yourfile.txt > queries.lst
ali@0: 
ali@0:     The queries.lst file will then contain the listing
ali@0:     of possible formatting errors, and you can
ali@0:     edit it alongside your text.
ali@0: 
ali@0:     Whatever you do, DON'T make the filename after
ali@0:     the greater-than sign the name of a file already
ali@0:     on your disk that you want to keep, because
ali@74:     the greater-than sign will cause any existing file
ali@0:     of that name to be replaced.
ali@0: 
ali@0:     So, for example, if you have two Tolstoy files
ali@0:     that you want to check, called WARPEACE.TXT and 
ali@0:     ANNAK.TXT, make sure that neither of these names
ali@0:     is ever used following the greater-than sign.
ali@0:     To check these correctly, you might do:
ali@0: 
ali@74:     bookloupe warpeace.txt > war.lst
ali@0: 
ali@0:     and
ali@0: 
ali@74:     bookloupe annak.txt > annak.lst
ali@0: 
ali@0:     separately. Then you can look at war.lst and annak.lst
ali@74:     to see the bookloupe reports.
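If you would rather not risk overwriting a file with the greater-than sign, a small wrapper script can refuse to do so. The Python sketch below is only an illustration: the function name save_report is made up, and the commented usage line assumes that bookloupe can be found from the current directory or on your path.

```python
import subprocess


def save_report(command, report_path):
    """Run command and write its standard output to report_path.

    Mode "xb" makes open() raise FileExistsError when report_path
    already exists, so an existing file is never replaced. The output
    is written as raw bytes, whatever its encoding.
    """
    with open(report_path, "xb") as report:
        return subprocess.run(command, stdout=report).returncode


# Hypothetical usage, assuming bookloupe is installed:
#     save_report(["bookloupe", "warpeace.txt"], "war.lst")
```

The return value is the exit status of the program that was run.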
ali@83: 
ali@83: For Windows-only users who want to use bookloupe from guiguts:
ali@83: 
ali@83:     1) If you haven't already done so, download bookloupe-win32-xxx.zip
ali@83:     from http://www.juiblex.co.uk/pgdp/bookloupe/
ali@83: 
ali@83:     2) Extract the files into a suitable folder, e.g. C:\DP\bookloupe
ali@83: 
ali@83:     3) Start Guiguts
ali@83: 
ali@83:     4) Choose Preferences | File Paths | Set File Paths..
ali@83: 
ali@83:     5) Click the "Locate Gutcheck..." button
ali@83: 
ali@83:     6) Browse to the folder where you extracted bookloupe
ali@83: 
ali@83:     7) Double-click bookloupe.exe