HOW TO DO GEDCOMs
GEnealogical Data COMmunications
by Jerry Merritt
(re-printed with permission)
I've noticed lately that sending and receiving GEDCOMs is becoming more and more of a problem for newcomers and old-timers alike. I have floundered around with GEDCOMs for about a year now and have discovered a number of things about them that may help some of the rest of you.
Most people's reaction upon hearing what a GEDCOM does is to think "WOW. That must be so complicated I could never hope to understand how it works, much less make changes to it." Nothing could be further from the truth.
So let's see if we can de-mystify the GEDCOM so you can get better use from it.
I. What exactly is a GEDCOM:
A GEDCOM is just a format, agreed upon among the people who make computers, communications equipment, and programs, for a simple document that encodes a family tree. This GEDCOM document is so basic that any properly constructed program can recreate the encoded family tree by manipulating the data in the GEDCOM. The GEDCOM is written in the most basic of all communications codes. This code is called ASCII (sometimes referred to as TEXT mode), and just about any word processor can read it BUT cannot turn it back into a family tree. It takes another computer program such as Family Origins or Family TreeMaker to read the GEDCOM and make a family tree from the simple data in it.
II. Can I read data directly from a GEDCOM:
Yes. Just call it up on your word processor.
Depending on how cleverly you have trained your computer to recognize various file extensions you may have to temporarily change the GEDCOM name from, say, SMITH.GED to SMITH.TXT to get your word processor to read the GEDCOM as a text file. See examples below.
III. What is all that stuff in the GEDCOM:
The GEDCOM starts with a header that tells a family history program where to start. The header looks like this:
0 HEAD
The header must be the very first line of the file called, say, SMITH.GED for the family history program to be able to read it. (Well, some programs are smarter than others and can read through stuff entered above 0 HEAD, but for consistent results always make sure the GEDCOM starts with 0 HEAD.) There should not be any empty line above the header. It must be the very first line.
Then there's some general information (or boiler plate) about the GEDCOM and who made it.
After that you're into the real heart of the encoded family tree. Here's what a typical header and boiler plate look like:
0 HEAD
1 SOUR Family Origins
2 VERS 3.0 for Windows
2 CORP FormalSoft, Inc.
1 DEST FamilyOrigins
1 DATE 11 NOV 1996
1 SUBM @S1@
1 FILE BRIGMAN.GED
2 VERS 5.3
2 FORM LINEAGE-LINKED
0 @S1@ SUBM
1 NAME Jerry Merritt
1 ADDR 1730 Poston Drive
2 CONT Panama City, Fl 32404
1 PHON (904) 874-2600
It tells what program made the GEDCOM and what version and type GEDCOM it is, as well as who made the data entries. All good stuff for future reference. Notice that every line starts with a numeral. This is important. I'll explain later.
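To make the line layout concrete, here is a minimal Python sketch that pulls one GEDCOM line apart into its level number, optional @...@ identifier, tag, and value. The function name parse_line and the regular expression are illustrative assumptions based only on the sample lines above, not a complete reading of the GEDCOM standard.

```python
import re

# Assumed layout, taken from the sample above:
#   level, optional @ID@, TAG, optional value
LINE_RE = re.compile(r"^(\d+) (?:(@[^@]+@) )?(\w+)(?: (.*))?$")

def parse_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a GEDCOM line: {line!r}")
    level, xref, tag, value = match.groups()
    return int(level), xref, tag, value or ""

print(parse_line("0 HEAD"))                 # (0, None, 'HEAD', '')
print(parse_line("1 SOUR Family Origins"))  # (1, None, 'SOUR', 'Family Origins')
print(parse_line("0 @S1@ SUBM"))            # (0, '@S1@', 'SUBM', '')
print(parse_line("1 SUBM @S1@"))            # (1, None, 'SUBM', '@S1@')
```

Note how @S1@ shows up twice: once in front of the tag, where it names the submitter record, and once after the tag, where the header points at that record.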
IV. Okay, what's all that other stuff:
After the boiler plate comes the data on all the people. Here's what the data on one person looks like:
0 @I30@ INDI
1 NAME Postell /BRIGMAN/
1 SEX M
1 REFN 12
1 BIRT
2 DATE 21 FEB 1860
2 PLAC Freeport, Fl
1 DEAT
2 DATE 6 JUN 1941
2 PLAC Mobile, Al
1 BURI
2 PLAC Pine Crest Cemetery, Mobile, Al
2 SOUR Daughter and Mobile cemetery records where he is listed as Peter Brigman
3 CONT Census records of Walton, Holmes, and Bay Co., Florida used to track
3 CONT Postell's migration. The early Panama City Pilot has numerous mention of
3 CONT Postell beginning about 1914 to 1922.
1 FAMC @F16@
1 FAMS @F7@
1 FAMS @F8@
1 FAMS @F9@
1 NOTE Personal history of final years from interview of his daughter, Annice
2 CONT Brigman Branch, in Mobile in 1995. Photos from Annice as well. Postell
2 CONT had children over a 52 year span. See the Panama City Pilot for numerous
2 CONT mention of Postell from about 1914 to 1922. By 1925 he had moved to Miami
2 CONT with his son Lorin and then relocated to Mobile where he spent the
2 CONT remainder of his days. See family history by Jerry Merritt for a
2 CONT biographical sketch and more back up documentation.
The first line, beginning with 0, tells you this is the start of data on a new person and how to link that person into a family tree.
The lines beginning with 1 are the next level of detail under level 0, and the lines beginning with 2 are the next level of detail under level 1. It's just like an outline format in a research paper. Under the level 1 birth entry is level 2 data giving more detail about the birth, like where and when the person was born. And so it continues until all of the data about that particular person is complete.
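The outline idea can be sketched in a few lines of Python. This is only an illustration: build_tree and show are made-up names, and the sample is a shortened version of the person record above with the birth entry written as a 1 BIRT line.

```python
def build_tree(lines):
    """Nest GEDCOM lines by level number, like headings in an outline."""
    root = {"level": -1, "text": "", "children": []}
    stack = [root]  # chain of currently open lines, shallowest first
    for line in lines:
        level_text, _, rest = line.strip().partition(" ")
        node = {"level": int(level_text), "text": rest, "children": []}
        # Close every open line that is at the same or a deeper level.
        while stack[-1]["level"] >= node["level"]:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]

def show(nodes, depth=0):
    """Print the tree with one indent step per level."""
    for node in nodes:
        print("    " * depth + node["text"])
        show(node["children"], depth + 1)

sample = [
    "0 @I30@ INDI",
    "1 NAME Postell /BRIGMAN/",
    "1 BIRT",
    "2 DATE 21 FEB 1860",
    "2 PLAC Freeport, Fl",
]
show(build_tree(sample))
# @I30@ INDI
#     NAME Postell /BRIGMAN/
#     BIRT
#         DATE 21 FEB 1860
#         PLAC Freeport, Fl
```

The level number does all the work here: a line belongs to the nearest line above it that has a smaller number.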
IMPORTANT: As long as we're at this point, I may as well cover one of the major reasons GEDCOMs you receive won't load properly into your family history program. You'll notice that there are some lines in the above sample that might be so long they would be broken and continued on the next line down if your email margins aren't set wide enough. (If they aren't broken, move your margins in until they do break to see what I mean.) If your email program broke these long lines by putting a paragraph marker in that line, the GEDCOM won't load until you take that paragraph marker out to make one unbroken line again. (The usual response of a family history program to a broken line is to stop loading the GEDCOM and ask for another disk containing more data.) Recall I said every line had to start with a numeral. On a broken line, the lower (broken) part has no numeral in front of it. This is easy to fix, if somewhat tedious. You just call the GEDCOM up in your word processor and remove the paragraph markers that are creating the broken lines. Then save the changes in TEXT format and it will load right up.
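The same repair can be automated. The Python sketch below glues any line that does not look like the start of a GEDCOM line onto the line above it. The test for "looks like a GEDCOM line" (a one or two digit level, an optional @...@ identifier, then an upper-case tag) is a heuristic assumed for this example, and joining the pieces with a single space assumes the mail program broke the line at a space. The name rejoin_broken_lines is made up for illustration.

```python
import re

# Heuristic (an assumption): level number, optional @ID@, upper-case tag.
STARTS_RECORD = re.compile(r"^\d{1,2} (?:@[^@ ]+@ )?[A-Z_][A-Z0-9_]*(?: |$)")

def rejoin_broken_lines(text):
    """Glue lines that lack a leading level number onto the line above."""
    fixed = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # drop empty lines left behind by the mail program
        if STARTS_RECORD.match(stripped) or not fixed:
            fixed.append(stripped)
        else:
            # Continuation of a broken line: put it back where it came from.
            fixed[-1] += " " + stripped
    return "\n".join(fixed) + "\n"

broken = (
    "0 @I30@ INDI\n"
    "1 NOTE Personal history of final years from interview\n"
    "of his daughter, Annice\n"
    "2 CONT had children over a\n"
    "52 year span.\n"
    "1 SEX M\n"
)
print(rejoin_broken_lines(broken), end="")
```

Requiring an upper-case tag after the number keeps a fragment such as "52 year span." from being mistaken for a new line. A fragment that happens to start with a small number followed by an upper-case word would still fool it, so look over the result before loading it.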
V. How do I know when the next person starts in the GEDCOM:
That's easy. Read down to the next line starting with 0. It will look something like this:
0 @I36@ INDI
1 NAME Otis Hill /BRIGMAN/
1 SEX M
The 0 and the stuff after it tell the family history program that this is another person and how to link this person into the family tree. Knowing how to link this person based on that 0 @I36@ is the only thing complicated about this whole business. (Well, there's some FAMC and FAMS stuff in there also that points a program to other links but it's the same principle.)
As you have seen, everything else about a GEDCOM is dirt simple. And you don't need to understand the linking business to work with GEDCOMs. The computer does that for you.
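For the curious, here is a rough Python sketch of the bookkeeping a program might do with those identifiers. It collects each person's name along with the FAMC pointers (the family in which the person is a child) and FAMS pointers (families in which the person is a spouse). The function name index_individuals and the dictionary layout are assumptions made for this example; real family history programs do a good deal more.

```python
def index_individuals(lines):
    """Map each @I..@ identifier to a name and its family pointers."""
    people = {}
    current = None
    for line in lines:
        parts = line.strip().split(" ", 2)
        level, rest = parts[0], parts[1:]
        if level == "0":
            # A level 0 line opens a new record; only INDI records are people.
            if len(rest) == 2 and rest[1] == "INDI":
                current = people.setdefault(
                    rest[0], {"name": "", "FAMC": [], "FAMS": []}
                )
            else:
                current = None
        elif level == "1" and current is not None and len(rest) == 2:
            tag, value = rest
            if tag == "NAME":
                current["name"] = value
            elif tag in ("FAMC", "FAMS"):
                current[tag].append(value)
    return people

sample = [
    "0 @I30@ INDI",
    "1 NAME Postell /BRIGMAN/",
    "1 FAMC @F16@",
    "1 FAMS @F7@",
    "1 FAMS @F8@",
    "1 FAMS @F9@",
    "0 @I36@ INDI",
    "1 NAME Otis Hill /BRIGMAN/",
    "1 SEX M",
]
for xref, person in index_individuals(sample).items():
    print(xref, person["name"], person["FAMC"], person["FAMS"])
```

A full program would then look up the matching family records (the @F..@ lines) to find out who the parents, spouses, and children are.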
And so it continues until you reach the end of the GEDCOM. Here there must be one final line to tell the family history program that it has now reached the end of the data and that all of the data was there. That final line looks like this:
0 TRLR
VI. How do you send a GEDCOM:
You can send a GEDCOM exactly like you send anything else over communications/phone lines. Just think of the GEDCOM as a letter you wrote in your word processor and saved in TEXT mode. That's because a GEDCOM is in TEXT mode. It's just a letter in a very specific format. It's that simple. Really. You can send it as part of the email text or as an attachment.
With what you know now, if you decided you wanted to send out a GEDCOM but without birth dates of living people, you could just call up the GEDCOM on your word processor and find the lines beginning with 2 DATE under 1 BIRT and delete the dates for everybody still alive. The resulting GEDCOM will still work just fine. It just won't show the dates you took out.
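If the file is large, a short script can do the deleting. The Python sketch below drops the 2 DATE line under 1 BIRT for every person who has no 1 DEAT line. Treating "no death entry" as "still alive" is an assumption of this sketch, not a rule of the GEDCOM format, so ancestors whose deaths were never recorded will lose their birth dates too. The function name and the sample person "Living /PERSON/" are made up for illustration.

```python
def strip_living_birth_dates(lines):
    """Drop '2 DATE' under '1 BIRT' for people with no '1 DEAT' line."""
    # Split the file into records, each starting at a level 0 line.
    records = []
    for line in lines:
        if line.startswith("0 ") or not records:
            records.append([])
        records[-1].append(line)

    output = []
    for record in records:
        is_person = record[0].rstrip().endswith(" INDI")
        # ASSUMPTION: no DEAT entry means the person is still alive.
        has_death = any(row.split()[:2] == ["1", "DEAT"] for row in record)
        in_birth = False
        for line in record:
            words = line.split()
            if words and words[0] == "1":
                in_birth = words[1:2] == ["BIRT"]
            if (is_person and not has_death and in_birth
                    and words[:2] == ["2", "DATE"]):
                continue  # leave out the birth date of a living person
            output.append(line)
    return output

sample = [
    "0 HEAD",
    "0 @I1@ INDI",
    "1 NAME Living /PERSON/",
    "1 BIRT",
    "2 DATE 1 JAN 1950",
    "2 PLAC Mobile, Al",
    "0 @I30@ INDI",
    "1 NAME Postell /BRIGMAN/",
    "1 BIRT",
    "2 DATE 21 FEB 1860",
    "1 DEAT",
    "2 DATE 6 JUN 1941",
    "0 TRLR",
]
print("\n".join(strip_living_birth_dates(sample)))
```

Only the 1950 date disappears. The birth place stays, and so does everything recorded for Postell, because his record has a death entry.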
1. To avoid the problem of broken lines in your GEDCOM, send it as an attachment. Of course, if the other person's email can't stand attachments or converts attachments into part of the message text, this won't do much good. They'll have to clean the GEDCOM up as already described before they can use it. If you aren't sending as an attachment, at least set your margins on your email as wide as they will go to prevent your program from breaking any lines with paragraph markers.
2. When you receive a GEDCOM use your "SAVE AS" function to save it in a directory or folder where you can find it easily. I have a folder called GENTEMP where I send these GEDCOMs until I can check them and clean them up before loading into my family history program.
3. Always open the GEDCOM with your word processor to check that it starts with 0 HEAD and ends with 0 TRLR.
4. Check that there are no paragraph markers inserted into long lines by expanding the margins of your word processor window so that the longest line clearly fits into the window. Now look for lines that don't begin with a NUMERAL 0, 1, 2, or 3. When you find them, go up to the end of the line above and delete the paragraph marker so that the two lines become one again.
Do this until EVERY line in the GEDCOM begins with a numeral. Then save the cleaned up document as a TEXT document named, say, SMITH.GED. Be sure the extension is GED and not TXT.
5. If everything else fails, try calling the GEDCOM up in your word processor and saving it again in TEXT format with a .GED extension. This will usually clear up any little differences between coding used by the sender's computer and what your computer uses.
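The checks in steps 3 and 4 can also be run by a small script before you open the word processor at all. This Python sketch only reports problems and changes nothing. The function name check_gedcom and the wording of its messages are made up for this example.

```python
import re

LEVEL_RE = re.compile(r"^\d+ ")  # a numeral followed by a space

def check_gedcom(text):
    """Return a list of problems: bad first or last line, broken lines."""
    lines = text.splitlines()
    problems = []
    if not lines or lines[0].rstrip() != "0 HEAD":
        problems.append("first line is not 0 HEAD")
    if not lines or lines[-1].rstrip() != "0 TRLR":
        problems.append("last line is not 0 TRLR")
    for number, line in enumerate(lines, start=1):
        if not LEVEL_RE.match(line):
            problems.append(
                f"line {number} does not begin with a numeral: {line!r}"
            )
    return problems

good = "0 HEAD\n1 SOUR Family Origins\n0 TRLR\n"
bad = "0 HEAD\n1 NOTE Personal history of final years\nfrom interview\n0 TRLR\n"
print(check_gedcom(good))  # []
print(check_gedcom(bad))
```

An empty list means the file passed all three checks. Anything else tells you which lines to go and look at.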
I hope this helps. It's allowed me to load every GEDCOM I've ever received even though some of them took a lot of word processor work. By using "Find and Replace" cleverly, however, you can even get the word processor to find and fix the broken lines for you. Perhaps the real key here is to understand that a GEDCOM is such a simple document that you can fix it yourself once you understand a little about it. So don't get scared by the strange looking GEDCOM format.
Just rip into it and fix it up so it will do what you want it to. After all, if it won't work when you receive it, how much worse can you make it?
Posted 07 Dec 97