Latin Language and Script — Resources for the Genealogist

The Font Problem.

Unfortunately, there is still not a satisfactory way to write about the abbreviations you are likely to see in Latin manuscripts that used the very common "Gothic cursive" scripts. Things are a little better if you are working with more formal (non-cursive) scripts, or if you are publishing books where all the scripts are rendered in a non-cursive way. But that still doesn't help the scholar discuss what he sees in the original manuscript, except by including images of the original texts, an option that is awkward and time-consuming.

The first problem to be solved is to specify the various characters that need to be added to the Unicode standard, so that suitable fonts might be constructed. The Medieval Unicode Font Initiative (MUFI) has undertaken this task, with impressive results. Most of the characters and diacritical marks are now defined in the Unicode standard. Further, there are now several fonts that comply with the MUFI recommendations, though it has to be said that none of these fonts looks very much like what you will see in a "Gothic cursive" manuscript. The available fonts that include the MUFI recommendations are far better suited to formal publications than to discussions of the peculiarities of manuscripts.

At the same time, there are various fonts that do a good job of representing the basic alphabet and some of the alternate letter forms of the old cursive scripts. For example, the 1413 Cursive font by Errance does a good job of representing the basic script, but lacks the diacritical marks and abbreviation symbols. It is possible to "borrow" those symbols from other fonts in many cases, to produce usable documents in Microsoft Word, etc. Whether such documents can reliably be converted to PDF files that will appear identical when displayed in other operating systems, without requiring that these fonts be installed, is still not clear.

Using the free PDF add-in for Microsoft Word, it is possible to convert Word documents into PDF files that include all of the required characters from any "embeddable" fonts that were used in the native Word document. So far, we have had good results using a combination of fonts to represent some of the problem abbreviations. For the basic "Gothic cursive" script, we use the 1413 Cursive font, substituting as necessary some of the characters from the Vinque font (the small s, for example, since the 1413 Cursive font has only the long form of the small s). To this we add the "combining" or "spacing modifier" characters and the additional MUFI characters from the Junicode MUFI or Palemonas MUFI fonts, adjusting the size and weight of these characters if necessary; many of these characters look better in "bold" weight, or at a larger size. More experimentation is needed.

The available fonts that follow the MUFI recommendations still need some tinkering. The additional symbols don't always have a consistent weight when used with the rest of the font (although I understand they look better when printed), and some of the combining forms don't look much like the versions I encounter in real cursive scripts.

Once a suitable font or combination of fonts is found, we still want to be able to present our discussions on the internet, in a way that will give the same appearance in all the major browsers. The Google Web Fonts project is clearly a major step in the right direction, but the fonts now included contain only a small subset of the Unicode standard, suitable for ordinary text, but not for the "combining diacritical marks" and the other extensions that we would need for our purposes. Google Web Fonts and similar efforts are still quite new, so it is possible we may get over this hump in 2013.

After some experimentation, we used the FontSquirrel web site to generate a "web kit" for a subset of the Junicode font (removing a large number of characters that are not needed for our purposes, such as characters peculiar to Greek or Nordic texts). Then we installed the "web kit" on the RootsWeb server. So far, it looks like the Junicode font is doing what it should across at least the main browsers: Internet Explorer, Firefox, and Chrome. Some issues with "combining characters" drawn from different fonts have been discovered, so we are not yet able to combine different fonts to represent some of the abbreviations we see in medieval manuscripts.

You might think we could use popular fonts such as Times New Roman that include the "combining diacritical marks" for this purpose on the internet, at least for non-cursive scripts, but that turns out to be problematic. Some of the combining characters that look just right in Microsoft Word turn out to be placed incorrectly when displayed in Internet Explorer on the same computer. The same is true of other fonts, and the results are not the same in different browsers.

At the moment, we have two separate routes for presenting textual information on the internet in a way that is independent of installed fonts and proprietary software, browsers, etc. First, as discussed above, we can use a word processing program and then convert the document to PDF format. Second, we can use HTML and Cascading Style Sheets (CSS), which, together, provide standards-based tools for positioning and manipulating text. However, it turns out that these two approaches do not yield identical results. Sometimes "combining" characters that look just right in the word processor are not positioned correctly when used in HTML. On the other hand, the CSS standards give us the means to alter the position and size of characters to achieve effects that are awkward to obtain in Microsoft Word, such as lowering the characters 9 and 3 so that they can serve as the Latin abbreviation symbols that resemble them. The "Character Spacing" dialogue in Microsoft Word can do this, but each effect has to be calibrated individually, and will be different for different font sizes, while the CSS model allows us to define named rules that will apply across all font sizes and styles. A word processing program with CSS capabilities would seem to be a wonderful invention!
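As a rough illustration of the CSS approach, the following Python sketch generates one named rule and the markup that uses it. The class name abbr-low, the offset, and the scaling factor are assumptions chosen for illustration, not values taken from this site. Because they are expressed in em units, they are relative to whatever font size is in effect, which is the property that makes a single named rule reusable.

```python
# Hypothetical class name and measurements; adjust by eye for the font in use.
ABBREVIATION_CSS = """\
.abbr-low {
  position: relative;
  top: 0.3em;        /* shift downward by a fraction of the current font size */
  font-size: 0.9em;  /* slightly smaller than the surrounding text */
}
"""


def lowered(ch: str) -> str:
    """Wrap a character in a span that the rule above will lower."""
    return f'<span class="abbr-low">{ch}</span>'


def demo_page() -> str:
    """Build a tiny HTML page showing the same rule at two font sizes."""
    # Illustrative word ending in the 9-shaped sign.
    word = "omnib" + lowered("9")
    return (
        "<!DOCTYPE html>\n"
        "<html><head><meta charset=\"utf-8\">\n"
        f"<style>\n{ABBREVIATION_CSS}</style></head>\n"
        "<body>\n"
        f"<p style=\"font-size: 12pt\">{word}</p>\n"
        f"<p style=\"font-size: 24pt\">{word}</p>\n"
        "</body></html>\n"
    )


if __name__ == "__main__":
    print(demo_page())
```

Saving the output as an HTML file and opening it in each browser is a quick way to see whether the lowered character keeps its proportion at both sizes.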

Different browser implementations also contribute to the problem. Some Internet Explorer versions (8 and 9, at least) will break a "word" containing a hyphen inappropriately at the end of a line, for example breaking a prefix such as pri- before the hyphen, or the suffix -ibus after the hyphen. When these situations are detected, we use the non-standard but apparently functional <NOBR> </NOBR> tag in the HTML code.
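A standards-based alternative to the <NOBR> tag is the CSS declaration "white-space: nowrap". The Python sketch below wraps isolated prefixes and suffixes (a word fragment with a leading or trailing hyphen) in a span carrying that declaration. The function name is hypothetical, and we have not verified how the Internet Explorer versions mentioned above treat the result, so it should be taken as an experiment rather than a proven fix.

```python
import re
from html import escape

# A word fragment with a trailing hyphen ("pri-") or a leading hyphen ("-ibus").
# Hyphens inside a complete word are deliberately left alone.
_FRAGMENT = re.compile(r"(?<!\w)(-\w+|\w+-)(?!\w)")


def protect_fragments(text: str) -> str:
    """Escape text for HTML and keep hyphenated fragments on one line."""
    escaped = escape(text)
    return _FRAGMENT.sub(
        r'<span style="white-space: nowrap">\1</span>', escaped
    )


if __name__ == "__main__":
    print(protect_fragments("the prefix pri- and the suffix -ibus"))
```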

In summary, then, while the Unicode standard is now roughly where it needs to be for discussing Latin paleography, the fonts that implement this standard need further development, and the methodology for presenting documents and web sites across multiple operating systems and browsers still lags far behind. We are left with half-measures and the awkward necessity of illustrating almost everything with images.

Where Do We Go From Here?

Font developers can hardly be expected to implement the hundreds of characters that are mentioned in the MUFI project, most of which occur rarely, or pertain to scripts other than the Gothic cursive that is the main focus for medieval genealogy. When someone asked which of the characters are most needed, I suddenly realized that what we really need here is statistics! I propose to tally up the abbreviation signs used in a selection of documents in diverse hands from the 15th century, so we will know which Unicode characters are most essential for realistic fonts. A first attempt at a list of the most useful characters is here. Stay tuned! There are opportunities for individuals to contribute to the advancement of Latin paleography!
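The tally itself could be done with a few lines of Python, provided the documents have first been transcribed using Unicode characters for the abbreviation signs. The sketch below is only one possible approach: it assumes that every character outside the basic ASCII range is a sign of interest, and the sample strings are invented for illustration, not taken from real transcriptions.

```python
from collections import Counter
import unicodedata


def tally_special_characters(transcriptions):
    """Count every non-ASCII, non-space character across the transcriptions."""
    counts = Counter()
    for text in transcriptions:
        for ch in text:
            if ord(ch) > 0x7F and not ch.isspace():
                counts[ch] += 1
    return counts


def report(counts):
    """Return one line per character, most frequent first."""
    lines = []
    for ch, n in counts.most_common():
        # Characters without a Unicode name (e.g. Private Use Area) get a placeholder.
        name = unicodedata.name(ch, "UNNAMED (possibly Private Use Area)")
        lines.append(f"U+{ord(ch):04X}  {n:4d}  {name}")
    return lines


if __name__ == "__main__":
    # Invented sample strings; \u0304 is the combining macron,
    # \ua751 is a letter p with a stroke through its descender.
    samples = ["dn\u0304s", "gr\u0304a", "\ua751"]
    for line in report(tally_special_characters(samples)):
        print(line)
```

Sorting by frequency puts the signs a realistic font most needs at the top of the list, which is the question the statistics are meant to answer.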

Coordinator for this site is John W. McCoy. This page was last updated Tuesday, 26-Feb-2013 11:07:44 MST.