4.3 B. Proofreading using Kurzweil 1000 software


Contributed by Bookshare volunteer  Mayrie ReNae



Though this article is primarily concerned with proofreading a book using Kurzweil 1000, there are a few things that can more easily be done in Microsoft Word.


In Microsoft Word:
  1.  convert smart quotes to standard quotes,
  2.  convert em dashes to double hyphens,
  3.  determine whether the scan contains section breaks, or page breaks,
  4.  convert section breaks to page breaks,
  5.  remove spaces surrounding double hyphens.


Note:  All the parts of step 1 are no longer technically necessary, as they are done automatically by the RTF Converter immediately after one's finished proof has been uploaded back to Bookshare:

1.  Convert Smart Quotes to standard quotes using Find and Replace

Open the Find and Replace dialogue by typing:
    Control + h


In the Find box, type:
    ^0147

(That is a caret, then the four digits)

In the Replace box, type:
    ^0034

Press Replace All.


In the Find box, type:
    ^0148

In the Replace box, type:
    ^0034

Press Replace All.
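
For readers who prefer to see the same substitution outside Word, here is a minimal sketch in Python.  It assumes the book text is already available as a plain string; the helper name is hypothetical and is not part of Word, Kurzweil 1000, or the RTF Converter.

```python
# Hypothetical helper: maps the two curly double quotes to the straight quote.
SMART_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark (Word code ^0147)
    "\u201d": '"',  # right double quotation mark (Word code ^0148)
})


def straighten_double_quotes(text: str) -> str:
    """Replace curly double quotes with the straight quote (Word code ^0034)."""
    return text.translate(SMART_TO_STRAIGHT)
```

Both replacements happen in a single pass here, whereas the Word procedure above uses two Replace All operations.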



Note:  The following step is no longer technically necessary, as it is done automatically by the RTF Converter immediately after one's finished proof has been uploaded back to Bookshare:

2.  Convert em dashes to double hyphens using Find and Replace

Open the Find and Replace dialogue by typing:
    Control + h


In the Find box, type:
    ^+

(That is a caret, then a plus sign)

In the Replace box type:
    --

(That is two hyphens or two dashes, depending upon what you call that key to the right of the zero on the number row)

Press Replace All.
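
The same replacement can be sketched in Python, assuming the text is held in a plain string.  The function name is hypothetical and only illustrates what the Replace All above does.

```python
def em_dashes_to_double_hyphens(text: str) -> str:
    """Replace every em dash (U+2014) with two hyphens."""
    return text.replace("\u2014", "--")
```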



3.  Determine whether your file contains section breaks, or page breaks

Most raw scans contain section breaks.  We want to end up with the file containing page breaks.  If your file contains section breaks, we will replace them with page breaks.

However, the Submitter (the person who did the scan) may have already replaced the section breaks with page breaks, so first you must determine which flavor of breaks exist in the file you are working on.

To determine this, have Word tell you which type of break it found.  The breaks in a correctly-scanned file will either be all section breaks, or all page breaks.


First, we will search for section breaks.

Open the Find dialogue (not the Find and Replace dialogue!) by typing:
    Control + f

In the Find box, type the symbols to find a section break:
    ^b

(That is:  a caret, followed by a lower case letter 'b')

Press Find Next.


If Word reports that it found one section break, you may safely assume that all of the breaks in the file are section breaks.

If Word reports that no section breaks were found, the file likely contains all page breaks instead, but we should confirm this.


Again in the Find box, type the symbols to find a page break:
    ^m

(That is:  a caret, followed by a lower case letter 'm')

Press Find Next.


If Word reports that it found one page break, you may safely assume that all of the breaks in the file are page breaks.


If Word reports that it did not find any section breaks at all, nor any page breaks at all, this scan cannot be proofread!  Please Reject the book, and in the Comments box that will come up, write:
"This scan contained no section breaks and no page breaks."

Thank you for reporting this, as it will save other proofreaders from spending time on a file that cannot be proofread!


If you've determined that your file contains section breaks, you'll do only part "a" of Step 4, below.

If on the other hand, your file contains page breaks, you'll do only part "b" of Step 4, below.
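
As a rough cross-check, the decision above can be sketched in Python.  This sketch assumes the downloaded file is RTF and that its raw source has been read into a string; in RTF, the control word \sect ends a section and \page is a hard page break.  The function name and the four result labels are hypothetical, and the check is deliberately crude.

```python
import re

# The negative lookahead keeps longer control words such as \sectd
# (section defaults) from being counted as breaks.
SECTION_BREAK = re.compile(r"\\sect(?![a-z])")
PAGE_BREAK = re.compile(r"\\page(?![a-z])")


def classify_breaks(rtf_source: str) -> str:
    """Return 'section', 'page', 'mixed', or 'none' for raw RTF source."""
    has_section = SECTION_BREAK.search(rtf_source) is not None
    has_page = PAGE_BREAK.search(rtf_source) is not None
    if has_section and has_page:
        return "mixed"
    if has_section:
        return "section"
    if has_page:
        return "page"
    return "none"
```

A result of "none" corresponds to the case in which the scan should be rejected.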



4a.  Convert section breaks to page breaks using Find and Replace

Open the Find and Replace dialogue.

In the Find box, type:
    ^b

(That is:  a caret, then a lower case letter 'b')

In the Replace box, type:
    ^p^m^p

(That is:  a caret, a lower case letter 'p', a caret, a lower case letter 'm', a caret, then a lower case letter 'p')

Press Replace All.

Note:  This will also surround your newly converted page breaks with blank lines.



4b.  Surround page breaks with blank lines using Find and Replace

Open the Find and Replace dialogue.

In the Find box, type:
    ^m

(That is:  a caret, then a lower case letter 'm')

In the Replace box, type:
    ^p^m^p

(That is:  a caret, a lower case letter 'p', a caret, a lower case letter 'm', a caret, then a lower case letter 'p')

Press Replace All.
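
To picture the result, here is a small Python sketch.  It assumes a plain-text copy of the book in which a form feed character stands in for each page break; that stand-in and the function name are assumptions made for illustration only.

```python
PAGE_BREAK = "\f"  # form feed, used here as a stand-in for Word's ^m


def pad_page_breaks(text: str) -> str:
    """Put a paragraph mark before and after every page break (^p^m^p)."""
    return text.replace(PAGE_BREAK, "\n" + PAGE_BREAK + "\n")
```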



5.  Remove spaces surrounding double hyphens using Find and Replace

Open the Find and Replace dialogue.

In the Find box, type:
    space--

(That is:  a space, followed by two hyphens)

In the Replace box, type:
    --

(That is:  two hyphens)

Press Replace All.


In the Find box, type:
    --space

(That is:  two hyphens, followed by a space)

In the Replace box, type:
    --

(That is:  two hyphens)

Press Replace All.
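
The two passes above can be sketched in Python as follows, assuming the text is a plain string.  The function name is hypothetical.

```python
def tighten_double_hyphens(text: str) -> str:
    """Remove one space before and one space after each double hyphen."""
    text = text.replace(" --", "--")   # first pass: space, then two hyphens
    return text.replace("-- ", "--")   # second pass: two hyphens, then space
```

Like the Word procedure, each pass removes only a single space, so a double hyphen surrounded by several spaces would need the passes repeated.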



Save your document,

Close your document, and

Exit Microsoft Word.




Now you are ready to run Kurzweil 1000 and open your book.

Before beginning to proofread a book using Kurzweil 1000, there are some settings that need to be confirmed:

1. In the Reading Settings section of the Settings menu, make sure that the option "line endings will be ignored by the editor" is selected.

2. In the Voices section of the Settings menu, under "Reading voice", have Kurzweil change pitch for italicized and bolded text.  This gives you the most information auditorily about the text in the book you are reading.

3. In the Conversion Settings menu, disable all possible instances of "split long pages".

Note:  The Conversion Settings dialog did not exist in K1000 before version 11, so if you have an earlier version, skip this step.  If you are using version 11, it is important that you perform this step to avoid Kurzweil-generated pages that will mess up your book.




Now you are ready to start proofreading using Kurzweil 1000.


1.  After opening your downloaded book, save it as a .KES file. This will ensure that you have a backup file in case of problems.


2.  Clean up the preliminary pages and confirm an accurate page count.

Label:
[From The Back Cover]
[From The Front Flap]
[From The Back Flap]
[This Page is blank.]
(if any blank pages exist)


Read through all the preliminary pages and correct all scannos.


3.  Determine where the publisher thought page one should go, and set an operator defined page number there as page 1.

Check that the last page in the book is numbered properly, telling you that you do not have any missing or duplicated pages.  If the numbers don’t match, either rescan and insert the pages that were missed, or delete the duplicated pages.
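
The page-count check can be illustrated with a short Python sketch.  It assumes the page numbers found in the file have already been collected into a list and that numbering is meant to run from 1 to the last page; the function name is hypothetical.

```python
from collections import Counter


def check_page_sequence(page_numbers: list[int]) -> tuple[list[int], list[int]]:
    """Return (missing, duplicated) page numbers, assuming pages run 1..last."""
    if not page_numbers:
        return [], []
    counts = Counter(page_numbers)
    last = max(page_numbers)
    # A Counter returns 0 for numbers that never appeared.
    missing = [n for n in range(1, last + 1) if counts[n] == 0]
    duplicated = sorted(n for n, count in counts.items() if count > 1)
    return missing, duplicated
```

Two empty lists mean that the last page number matches the page count, so nothing is missing or duplicated.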


4.  Book orientation.


4.1.  Standardize the Font Style, and Font Sizes

        -  Standardize the Font Style of the entire document, to Times New Roman

        -  Standardize the Font Size of the entire document, to 12 point

        -  Standardize the Font Size of the book title only, to 20 point, plus bolding

        -  Standardize the Font Size of all chapter headings, to 16 point, plus bolding


You can standardize the Font Style to Times New Roman, and the Font Size to 12 point by doing the following:

To first select the entire document, press:
    Control + a

To get the Edit menu, press:
    Alt + e

To get to Format, press:
    Down arrow

To open the Format menu, press:
    Enter

To get to Font, press:
    Enter


The first thing you will read is "the selected font is".
  -  In the list, find Times New Roman,
  -  Press tab until you get to Font Size,
  -  The Font Size should be set to 12,
  -  Bypass any references to bold, underlined or italicized text in this dialog,
  -  Tab to OK,
  -  Then press Enter.

Since you’ve selected the entire document for this function, it may take a while to finish.


Note:  The following step is no longer technically necessary:
4.2.  Protect chapter headings.

Protect all chapter headings by placing the page number followed by a blank line above the chapter heading and a blank line between the chapter heading and the text on the page


4.3.  Page down through the document, numbering and labeling all blank pages.


4.4.1.  Get rid of end-of-line hyphens.  This can usually be accomplished by replacing all hyphens that are followed by a line break (-\n), with 'nothing'.

Open the Find and Replace dialogue by typing:
    Control + h

In the Find box, type:
    -\n

(That is:  a hyphen (or dash), a backslash, followed by a lower case letter 'n')

Be sure to delete the entire contents of the Replace box!

Press Replace All.
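
In Python terms, the replacement above amounts to the following sketch, assuming the text is a plain string with "\n" as the line break.  The function name is hypothetical.

```python
def join_hyphenated_line_ends(text: str) -> str:
    """Delete every hyphen that is immediately followed by a line break."""
    return text.replace("-\n", "")
```

Note that a genuine compound that happened to break at the end of a line (for example "well-" then "known") also loses its hyphen and its line break, so such words still need a manual check.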


4.4.2.  Ensure that words ending in hyphens are legitimate words.


4.4.3.  Sometimes the OCR engine will insert an additional space following a hyphen.  These extraneous spaces should be removed.


4.5.  Make sure that only one page number exists on each page.


4.6.  Remove all "Running Headers" and "Running Footers".

Do this only after protecting chapter headings, as very often the absence of a running header is the only indication of where a poorly scanned chapter heading should go.

Note:  Running Headers and Running Footers are strings of text that appear regularly at the top or bottom of all pages.  They can be:
  -  the book title,
  -  the author's name,
  -  the title of each chapter.


4.7.  Look at the first word on each page to be sure that it is a complete word.  If not, reconnect hyphenated words onto one page.  (An often-used convention is to cut the last part of a hyphenated word and paste it to the first part, which occurs on the previous page.)


Note:  The following step is no longer technically necessary:
4.8.  Ensure that blank lines at the tops of pages will be preserved.

On each page beginning with a lower case letter, insert a space before that initial lower case letter.  This will help later.


5.  Insert page numbers where they did not scan, numbering preliminary pages with lower case Roman Numerals, making sure that a blank line exists on either side of the page number.
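
If you are unsure of the lower case Roman numeral for a preliminary page, the conversion can be sketched in Python.  The function name is hypothetical; the sketch covers 1 through 3999, which is far more than any set of preliminary pages needs.

```python
def to_lower_roman(number: int) -> str:
    """Convert a positive integer (1..3999) to a lower case Roman numeral."""
    if not 1 <= number <= 3999:
        raise ValueError("number must be between 1 and 3999")
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    parts = []
    for value, symbol in numerals:
        # Take as many copies of this symbol as fit, then carry the rest on.
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)
```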


Note:  The following step is no longer technically necessary, as it is done automatically by the RTF Converter immediately after one's finished proof has been uploaded back to Bookshare:
6.  Remove all extra blank lines by using Find and Replace.  "\n" is the character string that matches a line break (aka a carriage return), so a run of several in a row marks one or more blank lines.

Open the Find and Replace dialogue.

In the Find box, type:
    \n\n\n\n\n\n

(That is:  six instances of a backslash followed by a lower case letter 'n')


In the Replace box, type:
    \n\n

(That is:  two instances of a backslash followed by a lower case letter 'n')

Press Replace All.


Do this with the Replace box remaining the same, but with five, then four, then three Carriage Return symbols each successive time in the Find box.  This will get rid of instances of more than one blank line between:
  -  blocks of text,
  -  page numbers and chapter headings, and
  -  page numbers and the body of text on a page.
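
The four passes described above (six, five, four, then three line breaks) can be sketched in Python, assuming the text is a plain string with "\n" as the line break.  The function name is hypothetical.

```python
def collapse_blank_lines(text: str) -> str:
    """Run the four Replace All passes: 6, 5, 4, then 3 line breaks -> 2."""
    for run_length in (6, 5, 4, 3):
        text = text.replace("\n" * run_length, "\n\n")
    return text
```

A very long run of line breaks can survive one round of four passes, so repeating the round until the text stops changing is a safe variation.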


7.1.  Insert any Paragraph Marks between lines of dialogue that were removed by the OCR.  Each instance of a space between two double quotes will be replaced with a Paragraph Mark.

Open the Find and Replace dialogue.

In the Find box, type:
    " "

(That is:  double quotes, a space, then double quotes)

In the Replace box, type:
    "\n"

(That is:  double quotes, a backslash, a lower case letter 'n', then double quotes)

Press Replace All.


This will separate any paragraphs between speakers, where the OCR software should have inserted a Paragraph Mark, but didn't.  This does happen regularly.
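
As an illustration, here is a Python sketch that replaces only the space between the two quotes, as this step describes, and leaves both quotation marks in place.  It assumes the text is a plain string with straight double quotes; the function name is hypothetical.

```python
def split_adjacent_dialogue(text: str) -> str:
    """Turn the space between a closing and an opening quote into a line break."""
    return text.replace('" "', '"\n"')
```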


7.2.  Remove any extra Paragraph Marks (aka Carriage Returns) that shouldn't be there, and that were inadvertently inserted by the OCR.  This involves using the Find and Replace command 26 times, once for each letter of the alphabet.

In the Find box, type:
    \na

(That is:  a backslash, followed immediately by a lower case letter 'n', then a lower case letter 'a')

In the Replace box, type:
    space a

(That is:  a space, followed immediately by a lower case letter 'a')

Tab over to the More button to display more options.
Then check the box "Match Case" for case sensitivity.

Press Replace All.


You will do this with each letter of the alphabet in lower case.

The space you inserted in step 4.8 before each lower case letter at the top of a page now keeps these replacements from removing your carefully inserted blank lines between page numbers and the text on the page.
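
The repeated replacements in this step, one per lower case letter, can be sketched in Python.  The sketch assumes the text is a plain string with "\n" as the line break; the function name is hypothetical.

```python
import string


def rejoin_broken_paragraphs(text: str) -> str:
    """Replace a line break directly before a lower case letter with a space."""
    for letter in string.ascii_lowercase:
        # str.replace is case sensitive, like the "Match Case" option.
        text = text.replace("\n" + letter, " " + letter)
    return text
```

A line break followed by a space is left alone, which is why a protective space before a lower case letter at the top of a page preserves the blank line above it.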


8.  Make sure that all ellipses:
  -  are three periods, and not four or five,
  -  have no space before them, and
  -  have no spaces between any of the periods.

If the preceding are not done, the ellipsis will not be represented properly in Braille.
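
The three ellipsis rules can be sketched as a single regular expression in Python.  This is an illustration under stated assumptions rather than a Kurzweil feature: the text is a plain string, the function name is hypothetical, and any run of three or more periods (with or without single spaces between them) is treated as an ellipsis.

```python
import re

# Optional spaces before the first period, then at least two more periods,
# each of which may be preceded by a single space.
ELLIPSIS = re.compile(r" *\.(?: ?\.){2,}")


def normalize_ellipses(text: str) -> str:
    """Rewrite spaced or over-long ellipses as exactly three periods."""
    return ELLIPSIS.sub("...", text)
```

Because the pattern follows the rule as stated, a sentence-ending period followed by an ellipsis (four periods) is also reduced to three.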


9.  Run Ranked Spelling.  Correct all scannos as Ranked Spelling or the Spell Checker finds them.


10.  At this point you should read the book and correct any errors that Ranked Spelling or the Spell Checker didn't find.  Hopefully you will catch them all!


Note:  The following step is no longer technically necessary, as it is done automatically by the RTF Converter immediately after one's finished proof has been uploaded back to Bookshare:
11.  Replace any instance of tabs with spaces using Find and Replace.

Note:  Absolutely only do this if the book you are proofreading does not contain tables!

Open the Find and Replace dialogue.

In the Find box, type:
    \t

(That is:  a backslash, then a lower case letter 't')

In the Replace box, type:
    a space

Press Replace All.


12.  Convert to RTF.  Close the file.

Once the book is approved, don’t forget to go back and clear out all the backup files relating to this book.


NOW YOU'RE DONE!




