Submission Guidelines for Tips, File Types, File Naming_Washington County Genealogy_PAGenWeb Project

Washington County PAGenWeb 
Genealogy Project

About the Washington County PAGenWeb Project


Washington County PA 
Submission Guidelines for Tips, File Types, File Naming

Please read Submission Guidelines here for information about Privacy and the kinds of files I can use.

This page discusses the "technical" side: how the way you create and name your files before submitting can help me make a webpage.

I realize most people do not know how HTML works to create a webpage, or exactly what a webmaster does.  It can be a time-consuming process to make one page.  So I offer the following short list of tips about sending files or images, along with some explanations.

1. Word .doc files put in extra "coding"; I have to do extra steps to get rid of Word's coding.  Even typing information into an email works better for me than working from a Word file.
2. If you do use MS Word, you must send me the images (jpg files) separately - I can handle 3 or 4 images per email.  I cannot copy the embedded images from a Word document.
3. I appreciate that most people like to do "formatting" in Word.  But I just have to remove your formatting codes and re-do them in HTML.  It's okay to show what you want to be bold, italicized, or centered -- just know that I still have to re-do it manually.
4. Name your images in the form first-name-last-name_date.jpg.  Don't use a capital letter anywhere in the filename.  Use a hyphen (-) between each word in the filename.  I spend hours re-naming files so they will upload.  You will save me HOURS of work and frustration: it is not easy to switch capital letters to lowercase when the filename is MiXeD_CaSe, like Mary Johnson.jpg, or to remove spaces between names.  Files with spaces that are not changed would end up showing on the internet like this:  Mary%20Johnson.jpg  -- IF the file will even upload with a space in the filename.
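As a small illustration of why a space turns into %20, here is a minimal Python sketch. It uses the standard library's urllib.parse.quote; the helper name as_url_path is made up for this example and is not a tool used on this site.

```python
from urllib.parse import quote

def as_url_path(filename):
    """Return the filename the way it would appear inside a web address."""
    # quote() replaces characters that are not allowed in a URL
    # with a percent sign followed by a two-digit code.
    return quote(filename)

print(as_url_path("Mary Johnson.jpg"))   # Mary%20Johnson.jpg
print(as_url_path("mary-johnson.jpg"))   # mary-johnson.jpg (unchanged)
```

A lowercase, hyphenated filename passes through untouched, which is why it is the easier form to work with.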

What is code?

Every computer program produces "code".  Some programs work on binary codes (numbers) or pixels (such as pictures).  The program Word produces very "fat" coding; it is more than a webpage needs.

Code shows what is bold, italicized, big font, or small font.  Word says on every line, through its code "do this".

HTML tries to avoid "fat code".  HTML tries NOT to use "do this" coding on every line.  For example, HTML may use just one line of code to set the font size no matter how many lines of "data" are typed in.  By comparison, a Word file that contains 3,000 lines of typing may end up with 6,000 lines of "do this" coding to control what you see on a page.

In programs like Word and in HTML, a simple "space" can be filled with a code built on 4 letters.  The code for one space is &nbsp; (the letters nbsp, with an ampersand to start it and a semicolon to end it).  So, let's say you indent a paragraph 5 spaces: there are 5 sets of &nbsp; in the code.  If you use the spacebar to set up how your document looks in Word, and you touch the spacebar 15 times, the code would be &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

So, even though your Word document looks nice when you have it all set up the way you want it, to get the file ready for the web I have to go into the file (what you don't see) and remove excess coding.  "A space" code, above, is just one example.
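To give a feel for the size of the problem, here is a minimal Python sketch. It assumes the "space" code described above is the HTML entity &nbsp; and the function name collapse_nbsp is invented for this illustration; it is not the actual cleanup method used for this site.

```python
import re

def collapse_nbsp(html_text):
    """Replace each run of &nbsp; codes with one ordinary space."""
    return re.sub(r"(?:&nbsp;)+", " ", html_text)

# A paragraph "indented" with five presses of the spacebar.
indented = "&nbsp;" * 5 + "Mary Johnson was born in 1802."

print(len("&nbsp;" * 5))          # 30 characters of code for 5 spaces
print(collapse_nbsp(indented))    # one leading space, then the sentence
```

Each &nbsp; is six characters long once the ampersand and semicolon are counted, so the excess adds up quickly.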

How does a Webmaster remove excess code?

There are two ways to "strip out" or remove excess coding:  (1) Locate and copy a piece of code that is not needed, and use Find > Replace, putting the changed code into the Replace box; or, (2) Locate a piece of code, then use Find and manually replace the excess code with the needed code.  Both ways can take hours (sometimes several days). 
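The Find > Replace idea can also be sketched in a few lines of Python. This is only an illustration: the three patterns below are assumptions about what Word-generated HTML often contains, real files vary a great deal, and this is not the process actually used for this site.

```python
import re

# Hypothetical (pattern, replacement) pairs for illustration only.
REPLACEMENTS = [
    (r"<o:p>.*?</o:p>", ""),          # Word-only paragraph markers
    (r'\s+class="?Mso\w+"?', ""),     # Word style classes such as MsoNormal
    (r'\s+style="[^"]*"', ""),        # inline "do this" styling
]

def strip_word_code(html_text):
    """Apply every find-and-replace pair, one after another."""
    for pattern, replacement in REPLACEMENTS:
        html_text = re.sub(pattern, replacement, html_text,
                           flags=re.IGNORECASE | re.DOTALL)
    return html_text

sample = ('<p class="MsoNormal" style="margin-left:.5in">'
          '<b>Mary Johnson</b><o:p></o:p></p>')
print(strip_word_code(sample))    # <p><b>Mary Johnson</b></p>
```

Even with a script, someone still has to find each kind of excess code first, which is where the hours go.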

Which programs produce the "worst" code to convert manually into web-ready form?

1. Because Word uses excess code (bloated code), it is the worst to convert.  Also Microsoft uses "proprietary coding"--which, boiled down, means the program puts in clunky codes that a webpage doesn't really need.  The result is slow-loading pages and pages that don't "validate" (see below).

Tables in MSWord create bigger boxes than what is needed.

Setting up Word tables so that words are at the TOP or BOTTOM of a "cell" (box) is very difficult to convert into an HTML table.

2. Excel is great to organize data into "rows" and "columns" (the basics of a table).  But since it is also a Microsoft Office program, the coding is also fat (bloated).

Although it takes time to remove Excel coding, having the data separated into rows and columns is much easier than the alternative---manually copying each item of data and putting it into the correct "cell" (box) on a "table".
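Here is a minimal Python sketch of why rows and columns help. It assumes the spreadsheet has been saved as plain comma-separated text (CSV); the function name csv_to_html_table and the sample data are invented for this example.

```python
import csv
import io
from html import escape

def csv_to_html_table(csv_text):
    """Turn comma-separated rows into a bare HTML table."""
    rows = csv.reader(io.StringIO(csv_text))
    lines = ["<table>"]
    for row in rows:
        # escape() keeps characters like & and < from breaking the page.
        cells = "".join(f"<td>{escape(cell)}</td>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)

sample = "Name,Born\nMary Johnson,1802\n"
print(csv_to_html_table(sample))
```

Because every item is already in its own "cell", nothing has to be copied into place by hand.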

3. PDF files, especially ones that cannot be copied from or that have embedded images.  Again, people love PDF files because you can arrange material how you like it.  BUT, you must use the right settings so I can use the Copy command inside the PDF and Paste the text into a plain text editor (e.g. Windows Notepad).

Optical Character Recognition (OCR) does NOT always work to successfully take the words from a PDF and put the words into a "plain text file".   If the OCR does not "understand" a word or certain letters, it simply inserts ANY character or it inserts a "best interpretation" of letters.  For example, this sentence

The cow jumped over the moon.

may end up looking like this:

T^e cov junneh o*wed t^e noom.

If OCR fails, I must manually fix or re-type every error!
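A small Python sketch shows why the fixing stays manual. The function suspicious_words is a made-up helper for this illustration; it only flags words that contain a character other than a letter, digit, apostrophe, or hyphen.

```python
import re

def suspicious_words(text):
    """Return the words that contain characters a normal word would not."""
    flagged = []
    for token in text.split():
        word = token.strip(".,;:!?\"'()")   # ignore ordinary punctuation
        if word and not re.fullmatch(r"[A-Za-z0-9'-]+", word):
            flagged.append(word)
    return flagged

print(suspicious_words("T^e cov junneh o*wed t^e noom."))
# ['T^e', 'o*wed', 't^e']
```

The check catches the stray symbols, but "cov", "junneh", and "noom" look like ordinary words to a program, so a person still has to read every line.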

Please!  Do not send newspaper articles you saved into a PDF file!

4. Any "latest version" of Excel, because I may not have the 2010-2011 versions.  In other words, I may not be able to use the file you send because I have no way to OPEN it.


So, how can a submitter best help ME?

1. NAME your jpg files!
PLEASE -- do NOT send files you just uploaded from your camera unless you NAME them! I spend a lot of time changing filenames like "DSC_Jany-22-2009_O115" to a real name!

2. NAME files according to whether the subject is a person, place, or object.  Include a date at the end if you can.  Don't make super long file names.
YES: washington-court-house_may-2007.jpg
YES: person-homestead_
YES: mary-martha-person_died-1812.jpg
YES: will-of-mary-martha-person_1802.txt

3. PLEASE!  Put names on cemetery tombstone photos.  Yes, I know it is a hassle---I end up doing the naming and I don't feel very friendly by the time I'm done. ;-)   At least put the person's FULL name and in the email tell me which cemetery it is. Examples of good filenames (even though they are long):

hill-family-cemetery_overview.jpg  (Shows all or part of the cemetery.)

PLEASE - Use a 0 before single-number months or days. 01 02 etc
Type the FULL YEAR.  Instead of 3-4-56 type 03-04-1856

 If the photos should be kept in a certain order, put a number in FRONT of the filename.
(Then continue the numbering for individual stones, IF you want them in a certain order.)

If several photos show the same person or place, but are just different shots, put the number at the end.


* I use these 2 numbering methods to keep files organized and so *I* don't get confused.
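The date and numbering tips above can be sketched in Python. The helper names, the two-digit sequence numbers, and the underscore used to attach them are all assumptions made for this illustration; the page does not prescribe an exact format for the numbers.

```python
def date_part(month, day, year):
    """Format a date with leading zeros and a full four-digit year."""
    return f"{month:02d}-{day:02d}-{year:04d}"

def numbered_in_front(filenames):
    """Put a sequence number in FRONT to keep photos in a set order."""
    return [f"{i:02d}_{name}" for i, name in enumerate(filenames, start=1)]

def numbered_at_end(base, count, ext=".jpg"):
    """Put a number at the END for several shots of the same subject."""
    return [f"{base}_{i:02d}{ext}" for i in range(1, count + 1)]

print(date_part(3, 4, 1856))
# 03-04-1856
print(numbered_in_front(["hill-family-cemetery_overview.jpg", "mary-johnson.jpg"]))
# ['01_hill-family-cemetery_overview.jpg', '02_mary-johnson.jpg']
print(numbered_at_end("mary-johnson", 2))
# ['mary-johnson_01.jpg', 'mary-johnson_02.jpg']
```

Leading zeros matter because files sort as text, so "10" would otherwise land before "2".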

4. Use all lowercase on filenames.  NOT MiXeD CaSe.
YES:  mary-johnson.jpg
YES: mary_johnson.jpg
NO: Mary Johnson.jpg

5. File naming for the Web is different from what you are allowed to do in Windows Explorer (e.g. in My Documents or in My Pictures).

Windows Explorer allows you to use commas (,), periods (.), parentheses ( ), and some other non-letter characters.  But if you use those, *I* have to sit here taking each one OUT because they will NOT upload to the servers at Rootsweb.
NO:  HOPKINS Mary H. & family Wife of JOHNSON, Charles.jpg
BETTER: hopkins-mary-h_and-family_wife-of-johnson-charles.jpg
BEST: mary-h-hopkins_and-family_wife-of-charles-johnson.jpg
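For anyone comfortable with a little scripting, the clean-up can be sketched in Python. The function web_safe_name is a hypothetical helper written for this illustration, not a tool used on this site. It lowercases the name, spells out the ampersand, and turns every run of other characters into a single hyphen.

```python
import os
import re

def web_safe_name(filename):
    """Return a lowercase, hyphenated version of a filename."""
    stem, ext = os.path.splitext(filename)
    stem = stem.lower().replace("&", " and ")
    # Anything that is not a lowercase letter, digit, or underscore
    # (spaces, commas, periods, parentheses) becomes one hyphen.
    stem = re.sub(r"[^a-z0-9_]+", "-", stem).strip("-")
    return stem + ext.lower()

print(web_safe_name("HOPKINS Mary H. & family Wife of JOHNSON, Charles.jpg"))
# hopkins-mary-h-and-family-wife-of-johnson-charles.jpg
print(web_safe_name("Mary Johnson.jpg"))
# mary-johnson.jpg
```

A script can only clean up the characters. It cannot reorder the name into first-middle-last or decide where an underscore should separate the parts, so the BEST form above still needs a person.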

6. Write names as you would say them (first-middle-last).
Use "mary-johnson", instead of "johnson-mary".  You'll help Google to "read" the names, show the names in search results, and help other researchers to find the file--and to connect to you.  People don't do Google searches as "Johnson Mary"; they'd type "Mary Johnson".

7. Use FULL names in your writing each time the name appears, especially for women's names!

Many people write something like:  Mary A., daughter of John Persons and Clarice, was born...
It is better for Google Search if you put a woman's first and last name(s) together, side by side.  Yes, Google will find "every instance on this page", but then you might end up with a result that shows 5 "Mary" names on a page (Google can't tell which Mary you want if you search for "Mary Persons").

BETTER / BEST:  Mary A. Persons, daughter of John Persons and Clarice __ Persons, was born...

TIP: Always use full names for every person, if you can, or at least do Full Name once in each paragraph that talks about that person.

8. Please use Spell Checker.  I'm not saying this to be rude, but if you don't catch and fix your typos, then I must do it.  (If I leave the typos in, you'll end up writing to me to point out the "mistake" and want me to fix it.)


Again, I know that most people don't fully understand how a webmaster makes a webpage or why filenames need to be a certain way for Rootsweb servers (when other hosting sites allow mixed case or spaces) or why some programs make "bad" code.  I have listed these items so you have an idea of the problems I encounter and what things take up so very much of my time.  After having to re-name a bunch of files made by a digital camera, or having to spend hours removing periods, commas, and ampersands from filenames, I do get cranky!

I cannot possibly cover all aspects of how to create or send submissions.  But the items listed are the top problems for me when I get files.


Read Submission Guidelines, Privacy, Copyright





This page was added October 21, 2010.


Return to the Washington County Genealogy Project

© 2004 Judith Ann Florian, all rights reserved.