Larimer County Genealogical Society

(+) What Format Should You Use to Store Your Files?

The following is a Plus Edition article, written by and copyright by Dick Eastman. 

One question that pops up frequently is: “What format should I use to save my files?” The question is often asked about digital pictures. Should they be saved as JPG or PDF or GIF or PNG or TIFF or some other format? Similar questions are often asked about word processing files, although there seem to be fewer options available. I thought I would offer a few suggestions and also tell what works for me.

Digital Pictures

Today’s technology allows for a selection of image file formats, including JPG, GIF, TIFF, BMP, PSD, RAW, PNG, EPS, PDF, and others in a seemingly endless alphabet soup of abbreviations and acronyms.

You can find many good reasons and bad reasons for selecting any of these file formats. However, from a genealogist’s point of view, there are two significant issues to deal with: image size and image compression.

NOTE: PDF files have unique advantages and disadvantages for both digital pictures and documents. I will write about PDF separately later in this article.

Image size has been an issue since the first scanned images were stored on a computer, back in the vacuum tube days. In this case, the physical size of the picture is not the issue; the size of the file you create is. That is, the problem revolves around the number of bytes required to store a faithful reproduction of the original image.

Not many years ago, disk drives were expensive. Luckily, that problem is disappearing as the price per byte of storage has plummeted in the past few decades. Prices for one-terabyte disk drives have now dropped to the $50 range, a price undreamed of only a few years ago. It is now cost-effective to store hundreds of thousands of very large digital image files. Prices for disk storage are still dropping nearly every week.

However, file size remains an issue when transferring those files to another computer or when inserting images into a web page. Not everyone uses high-speed, multi-megabyte-per-second Internet connections. Even those who do use such high-speed connections find that including very large digital images in a web page results in slow performance. A high-resolution picture also might not display properly inside a web page. Such a picture might fill the entire screen or even “overflow” the screen, leaving no space for text, links, and other information in the web page. Finally, sending a hundred or so old family photographs to a cousin can be a painstaking effort if the files are very large.

Image file size, expressed as the number of bytes, increases with the number of pixels composing an image and the color depth of the pixels. The greater the number of rows and columns, the greater the image resolution and the larger the file. Also, each pixel of an image takes up more space when its color depth increases: an 8-bit pixel (1 byte) stores 256 colors, and a 24-bit pixel (3 bytes) stores about 16.7 million colors. Most color images these days are stored with 16-bit or, even better, 24-bit color. As a result, if the original picture is large (perhaps 8-by-10 inches or larger) and is scanned as a high-resolution image, the resultant digital image can be huge.
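To make the arithmetic concrete, here is a minimal Python sketch of the calculation described above: rows times columns times bytes per pixel. The function name and the 600 dpi scan setting are my own assumptions for illustration; the result is the raw, uncompressed size, before any file format applies compression.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Return the raw (uncompressed) size in bytes of a scanned image.

    width_in, height_in: size of the original picture in inches
    dpi: scan resolution in dots (pixels) per inch (assumed value)
    bits_per_pixel: color depth, e.g. 8 or 24
    """
    columns = round(width_in * dpi)
    rows = round(height_in * dpi)
    return columns * rows * bits_per_pixel // 8


# An 8-by-10-inch photo scanned at 600 dpi in 24-bit color
size = uncompressed_size_bytes(8, 10, 600, 24)
print(f"{size:,} bytes (about {size / 1_000_000:.0f} MB)")
# prints: 86,400,000 bytes (about 86 MB)
```

The same scan at 8-bit color depth would need one third of that space, which is why color depth matters as much as resolution.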

Fold3.com (formerly known as Footnote.com) created a single image of the entire Vietnam Veterans Memorial in Washington, D.C. The picture was created by taking several thousand very high resolution photographs, one for each small section of “the Wall,” and then electronically “stitching the images together.” The result is one huge image that consumes gigabytes of disk space. It is believed to be the biggest single image ever posted to the Internet, and special software had to be developed so that users could view pieces of the original image without downloading the entire master image. Downloading the entire master image might require several days or a week or even longer on a dial-up connection! Luckily, there is no need to do that, as the custom-written software allows the user to “zoom in” and look only at specific segments. The result is quick downloads, even on dial-up connections. However, that is the only picture I know of that is available via custom-written software that transfers only part of the image at a time.
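For a rough sense of why a full download could take days, here is a short Python sketch of the arithmetic. The 4-gigabyte file size and the 56 kilobit-per-second dial-up speed are assumptions chosen only for illustration (the actual size of the image is not given here beyond “gigabytes”), and the function name is hypothetical. Real connections are also slower than their rated speed.

```python
def transfer_time_seconds(file_size_bytes, link_bits_per_second):
    """Estimate transfer time, ignoring protocol overhead and line noise."""
    return file_size_bytes * 8 / link_bits_per_second


# Assumed values for illustration only
file_size = 4 * 10**9      # a hypothetical 4 GB image
dial_up_speed = 56_000     # a 56 kbit/s dial-up modem at its rated speed

days = transfer_time_seconds(file_size, dial_up_speed) / 86_400
print(f"{days:.1f} days")
# prints: 6.6 days
```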

The issue of file size quickly became a problem back in the days of expensive disk drives, when typical computer connection speeds were 300 baud or so. Storing hundreds of images on the limited-capacity disk drives of the day was a problem, as was sending large images across very slow network connections. To solve these problems, image compression was invented. Compression is less of an issue in these days of high-speed Internet connections and cheap disk drives, but it still cannot be ignored.

Image compression refers to the application of computer algorithms to analyze images and to find pixels to delete, thereby reducing the file size. For instance, if the picture has three red pixels in a row, the compression algorithm might eliminate one, or even two, of those pixels. The human eye probably won’t notice the difference, and the savings in file size are significant when thousands of pixels can be combined and the duplicates eliminated. The elimination of duplicate pixels is only one part of the sophisticated compression techniques used.
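The idea of combining runs of identical pixels can be sketched in a few lines of Python using run-length encoding, one of the simplest compression schemes. Please note that this is a swapped-in illustration, not the method any particular image format uses: run-length encoding is lossless, meaning it stores each run as a value plus a count rather than throwing anything away, and formats such as JPG use far more sophisticated techniques. The function names are hypothetical.

```python
from itertools import groupby


def run_length_encode(pixels):
    """Collapse each run of identical pixels into a (pixel, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def run_length_decode(pairs):
    """Expand (pixel, count) pairs back into the original list of pixels."""
    return [value for value, count in pairs for _ in range(count)]


row = ["red", "red", "red", "blue", "blue", "red"]
encoded = run_length_encode(row)
print(encoded)
# prints: [('red', 3), ('blue', 2), ('red', 1)]

# Decoding restores the row exactly, so nothing was lost
assert run_length_decode(encoded) == row
```

Six pixels shrink to three pairs here. The savings grow with the length of the runs, which is why images with large areas of a single color compress so well.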

Of course, any time you delete pixels you are also reducing the quality of the original image. However, modern compression algorithms are very good at reducing file sizes without introducing significant loss of image quality. The most important word is “significant.”

The remainder of this article is reserved for Plus Edition subscribers only. If you have a Plus Edition subscription, you may read the full article at: https://eogn.com/(*)-Plus-Edition-News-Articles/10619149.

If you are not yet a Plus Edition subscriber, you can learn more about such subscriptions and even upgrade to a Plus Edition subscription immediately at https://eogn.com/page-18077.