           Rosegate Harbour



About the Format of Online Books

Please see the formatting page to learn how to easily get these books ready to print.

What File Format are the Books In?  

    Plain Text Files - like those at Project Gutenberg
    HTML - like those at Baldwin Project, Bartleby
    PDFs
    Scanned Images or PDFs of Scanned Images - like those at Google Books, Internet Archive, MOA
    Uncorrected OCR Text - also available at Google Books, Internet Archive, MOA

  • Plain Text Files   Sample: Sense and Sensibility
    Plain and simple, every computer can read these. No pictures, though. (But many at Gutenberg are also available in HTML with pictures.) All of my links to online books go to plain text files, unless specified otherwise. Text files were my first choice, and I used them as often as possible. Be sure you have the latest version of a text. At Gutenberg, the original texts are often labeled as #10 in the "Edition" column. The newest version available will have the highest number. (See example) So, if 10 and 11 are available, get the 11. (If only one version is available, there may not be any number listed, and there may not even be an "edition" column.)
    Downloading the texts - The books are often available in more than one format. Look for one that says "plain text" (unless you want the HTML for pictures). The plain text often comes in 2 formats: "us-ascii" or "iso-8859-1". The only real difference is that iso-8859-1 can include accented letters (like é), which matters little for most English books. I generally download the "us-ascii" version, as it's the most common. (Of course, if you just copy and paste from the browser, it doesn't matter which one you use.)
    To save, or not to save - If you want to save the file, download the one that says "zip" in the "compression" column. (You may want to change the file name when you save it, so you can find it later.) If you just want to view it, or copy and paste it, then click on the one that says "none" in the "compression" column, and it will load on your screen.
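
    For anyone who prefers to open a downloaded text with a short script rather than a word processor, here is a minimal sketch in Python. It is only an illustration, not something Gutenberg provides: the function name read_book_text and the file name in the usage note are made up for this example. It tries us-ascii first and falls back to iso-8859-1 if the file turns out to contain accented letters.

```python
from pathlib import Path


def read_book_text(path):
    """Read a plain-text book saved as us-ascii or iso-8859-1.

    Hypothetical helper for illustration; the name is not from any library.
    """
    raw = Path(path).read_bytes()
    try:
        # us-ascii only succeeds if the file has no accented characters.
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # iso-8859-1 maps every byte to a character, so this cannot fail.
        return raw.decode("iso-8859-1")
```

    Calling read_book_text("sense_and_sensibility.txt") (an assumed file name) would return the whole book as one string, ready to reformat or save again.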


  • HTML   Sample: The Swiss Twins (these pics print in b & w, without yellow background)
    Fancier than text, because they can have bold, italics, and pictures. It can be a bit of extra work to copy and paste each page, but may be worth it to you. On RC and Series page only: if a book's available with pictures, I've marked it with this color, and also provided a separate link if it's not at the same location as the text file.
    Do I really need pictures?  No, but sometimes they're fun for younger children. And some books contain useful maps or diagrams. One option is to just save the picture you want, and paste it into whatever program you're reformatting your text with.


  • PDFs   Sample: A Christmas Carol
    I tried not to link to PDFs unless I couldn't find another version. Why? You can't reformat them to your tastes, they often have a lot of wasted white space, and, depending on the page size, the text can come out itty bitty if you print half size. (There are some that are suitable for half size printing, like those at Planet PDF. To determine the size of a PDF page, download the file and open it with Adobe. Then click: File/ Document Properties. Page size is near the bottom.) If you don't mind printing full size, these are nice files and can contain pictures. (For PDFs of Scanned Images, read below. They are different and usually come out quite well in half size.)


  • Scanned Images or PDF of Scanned Images
      Sample: The Children's Book of Thanksgiving Stories at Internet Archive [IA]
      Sample: The Children's Book of Christmas Stories at Google Books [GB]
    These are like getting a photograph of each page of the original book (just like RC books), including any pictures it may contain. (Because of the possible pictures, I've marked scanned images in a color font, just like the HTML files with pics.) These are often large files and can take a while to download with dial-up, so I've also linked to a substitute in another format when possible. But if you prefer scanned books there are many available. Most of the text files I've linked to are also available at Google Books or the Internet Archive.
    Printing - You can usually print these in half size without any problem, since the originals were about that size anyway. (These PDFs seem to print best if you change "Page Scaling" to say: "Fit to Printable Area" so the image fills the page. Of course, this might vary from one book to another, so see what looks best before printing.)
    Text option - Uncorrected OCR text is usually available as well as the PDFs or scans. (See below)
    Scan Color - Most online scans (like at Google Books [GB], MOA, Canadiana, etc.) are crisp black and white. But the PDF scans at the Internet Archive [IA] are sometimes yellowed and may print with a grey background. There is a way around that, and they are beginning to have more b & w available there.
    Quick-Fix for Yellow Scans - Once the IA file is downloaded and opened in Adobe, look at the row of tabs at the far left. Click the one that says "layers". Click on the picture of the eye next to where it says "Background" and it will get rid of the colored background for you. (Pretty slick, eh?) The text and b & w pictures come out very well. Colored pictures need the background turned back on, but most books have b & w pictures anyway.
    [I did a book with colored pics as follows: I printed the book to FinePrint one section at a time (just the pages without pictures), so all the text pages ran together. (It combines all of the jobs for you.) After printing those, I turned the background back on and printed the pictures on toner saver in b & w, then inserted them into the book. (Many old books have a blank page after the picture page, and often they are without numbers, so it works out perfectly.)]

  • Uncorrected OCR text   Sample: The Story Girl
    What's OCR anyway? A technology that "reads" a scanned page and translates it into text. It's not a perfect science, but usually works pretty well. I've linked to these only as a last resort, because I'm a perfectionist and don't like so many errors. (Those listed as uncorrected OCR are also available at the same link as scanned images, or PDFs of scanned images. IA's OCR texts seem to download faster than those at GB.)

    What isn't corrected? Remember that everything that was on the original page becomes part of the text. If the title of the book, or chapter, was written at the top of each page, then that title will show up in the text everywhere a new page started. And the page numbers get thrown in too. These stand out pretty well and can be easily deleted, but that can be a pain with a large book. (Or you could just ignore the page numbers.) Sometimes the OCR translates things incorrectly and you will get strange typos, or extra blank spaces here and there.

    Usually you can still tell what the text is supposed to say, even if it has typos. The table of contents will be the worst part, so scroll down to the main content before you decide whether it's acceptable to you.
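
    The repeated titles and page numbers described above can also be removed with a short script instead of by hand. The following is a rough sketch in Python, under some assumptions: the function name strip_page_furniture is made up for this example, a line containing only digits is assumed to be a page number, and you supply the running title yourself. It will not fix typos, and it would also drop a genuine line that is nothing but a number (or the title itself on the title page), so look over the result before printing.

```python
import re


def strip_page_furniture(text, running_title):
    """Drop lines that are only a page number or the repeated running title.

    Hypothetical helper for illustration; adjust the rules to suit each book.
    """
    title = running_title.strip().lower()
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # A line made up only of digits is treated as a page number.
        if re.fullmatch(r"\d+", stripped):
            continue
        # Remove a page number stuck to either end, then compare to the title.
        core = re.sub(r"^\d+\s+|\s+\d+$", "", stripped).lower()
        if title and core == title:
            continue
        kept.append(line)
    return "\n".join(kept)
```

    For example, strip_page_furniture(ocr_text, "The Story Girl") would return the text with those header and page-number lines left out, where ocr_text is assumed to hold the OCR file's contents as a string.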

    When I get a bit of time, I hope to clean up some of the OCR texts I've linked to (at least those on the RC and Henty lists), and provide them on my free files page. I'll make a note of it on the booklists when that happens, so you can have a nicer version. In the meantime, you can make the best of these OCR files, print from the scans, or use one of the substitute books I've suggested.

    Go To Project Gutenberg
