USGS downloadable topo maps are in GeoPDF format. GeoPDF, like PDF, is regarded as a "final" format and, unlike GeoTIFF, is not intended for further processing. It is still possible to extract the data from a GeoPDF file, but it is more complex than with other file formats.
To extract the map image alone, several image editing programs, including Photoshop and GIMP, provide the necessary functionality. The easiest way, in my opinion, is IrfanView, with its plugins and Ghostscript installed. But all these methods lose the georeferencing information embedded in the GeoPDF file. The concept is similar to GeoTIFF vs. plain TIFF, except that GeoPDF is a proprietary format and GeoTIFF is open.
So, with the image extracted, we could now use standard 3 point georeferencing in TransDEM and achieve satisfying results. But wouldn't it be nice to have a more automated method by accessing the embedded georeferencing information in one way or another?
One possibility - and presumably the only reasonable one - is GDAL, an open source software library. GDAL handles dozens of geo data formats and has already been of use to TransDEM, for ESRI Binary Grid (*.adf) and the upcoming ERDAS Imagine (*.img).
GDAL can work with GeoPDF but relies on another external library for this. This part is still in development. Furthermore, GDAL itself has a free license (X11/MIT), but that external library is licensed under the GPL, which is unsuitable for TransDEM.
Things may change in the future but for the time being I will not pursue direct GeoPDF support in TransDEM. But partial automation is a realistic option.
The additional step would be to extract both the image and the georeferencing from the GeoPDF file. This would remain a manual step.
A binary distribution of GDAL comes with a console program called gdal_translate.exe. It can take the GeoPDF as input and produce a JPEG image as output. It can also produce an ESRI world file with basic georeferencing information. Unfortunately, this does not suffice.
I have checked two different USGS GeoPDF map samples, a modern one and a historical one.
The modern one is in UTM/NAD83 projection, but slightly rotated. TransDEM can process world files, but only without rotation. I am still struggling here.
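To illustrate what "rotated" means in a world file, here is a minimal Python sketch. It relies only on the standard six-line world file layout (A, D, B, E, C, F, one number per line), where D and B are the rotation terms. The function names are my own for illustration and the sketch is not how TransDEM reads world files.

```python
import math

def parse_world_file(text):
    """Parse the six numbers of an ESRI world file (one per line)."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    # File order is A, D, B, E, C, F:
    #   map_x = A * col + B * row + C
    #   map_y = D * col + E * row + F
    a, d, b, e, c, f = values
    return {"A": a, "D": d, "B": b, "E": e, "C": c, "F": f}

def is_rotated(wf, tolerance=1e-9):
    """True if either rotation term (D or B) is non-zero."""
    return abs(wf["D"]) > tolerance or abs(wf["B"]) > tolerance

def rotation_degrees(wf):
    """Angle of the image rows relative to map east, in degrees."""
    return math.degrees(math.atan2(wf["D"], wf["A"]))
```

A north-up image has D = B = 0, so `is_rotated` returns False. Any other case needs a full affine transformation rather than a simple scale and offset.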
The older one is in Polyconic projection. Now what the heck is "Polyconic"? It has been used by USGS for decades but you will never see any polyconic coordinates.
"Snyder" has the formulas, numeric examples, and also background information. It turns out that for large scale topo maps, Polyconic was used to create the layout of a particular map on paper and for nothing else. So each map may have its own projection, a so-called local projection, similar to every town having its independent clock tower before the arrival of standard time. Polyconic coordinates are therefore not interchangeable between maps and make little sense on their own. Hence, you don't see them printed. And we didn't care in the past, as those older USGS quadrangles delivered perfect results when georeferenced as UTM/NAD27.
Now, the GeoPDF file for a "polyconic" map takes polyconic literally and actually comes up with those unfamiliar projection coordinates in the georeferencing appendix. Because this projection is regarded as a local one, we also need a reference that tells us what "local" actually means for a particular map, in the form of a central meridian. That's the way the US polyconic projection defines itself. Usually, the central meridian of the projection is the central meridian of the map quadrangle, making it local indeed.
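To show the role of the central meridian, here is a small Python sketch of the forward polyconic projection in its spherical form, as given by Snyder. This is a simplification: the USGS maps use the ellipsoidal form, and the sphere radius and sample coordinates below are illustrative values of mine, not taken from any particular map.

```python
import math

def polyconic_forward_sphere(lat_deg, lon_deg, lon0_deg, lat0_deg=0.0,
                             radius=6370997.0):
    """Spherical polyconic forward projection.

    lon0_deg is the central meridian, lat0_deg the latitude of origin.
    radius is an illustrative sphere radius in metres (an assumption).
    Returns (easting, northing) in the units of radius.
    """
    lat = math.radians(lat_deg)
    lat0 = math.radians(lat0_deg)
    dlon = math.radians(lon_deg - lon0_deg)
    if abs(lat) < 1e-12:
        # Special case on the equator, where cot(lat) is undefined.
        return radius * dlon, -radius * lat0
    cot_lat = 1.0 / math.tan(lat)
    e = dlon * math.sin(lat)
    x = radius * cot_lat * math.sin(e)
    y = radius * (lat - lat0 + cot_lat * (1.0 - math.cos(e)))
    return x, y

# Hypothetical 15' quadrangle with central meridian 93.125 W.
east, _ = polyconic_forward_sphere(37.0, -93.0, lon0_deg=-93.125)
west, _ = polyconic_forward_sphere(37.0, -93.25, lon0_deg=-93.125)
```

Points on the central meridian get an easting of exactly 0, and points at equal distances east and west of it get eastings of equal size and opposite sign. The same latitude and longitude projected with a different central meridian yields different coordinates, which is why the values from one map cannot be reused on another.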
The world file, unfortunately, does not tell us the central meridian longitude. GDAL produces an additional file, though. It's in GDAL xml format and we find the central meridian there. TransDEM needs to parse both the world file and the auxiliary xml file. Since yesterday, it works in my software lab.
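As a rough sketch of how the central meridian could be read from the auxiliary file, the Python snippet below assumes that the file has an SRS element holding the projection as WKT text with a PARAMETER["central_meridian", ...] entry. The sample XML is made up for illustration and real files may differ between GDAL versions. This is not TransDEM's parser.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample, modelled on the assumed layout of a GDAL auxiliary file.
SAMPLE_AUX_XML = '''<PAMDataset>
  <SRS>PROJCS["unnamed",GEOGCS["NAD27",DATUM["North_American_Datum_1927",
  SPHEROID["Clarke 1866",6378206.4,294.978698213898]],
  PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]],
  PROJECTION["Polyconic"],PARAMETER["latitude_of_origin",0],
  PARAMETER["central_meridian",-93.125],PARAMETER["false_easting",0],
  PARAMETER["false_northing",0],UNIT["metre",1]]</SRS>
</PAMDataset>'''

CENTRAL_MERIDIAN = re.compile(
    r'PARAMETER\["central_meridian",\s*([-+0-9.eE]+)\s*\]', re.IGNORECASE)

def central_meridian_from_aux_xml(xml_text):
    """Return the central meridian in degrees, or None if not found."""
    root = ET.fromstring(xml_text)
    srs = root.find(".//SRS")  # assumed element name
    if srs is None or not srs.text:
        return None
    match = CENTRAL_MERIDIAN.search(srs.text)
    return float(match.group(1)) if match else None

print(central_meridian_from_aux_xml(SAMPLE_AUX_XML))
```

Returning None instead of raising an error lets the caller fall back to manual georeferencing when the file does not carry the expected entry.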
Thus, the next version of TransDEM will offer automatic georeferencing for polyconic maps supplied as a plain raster image and accompanied by both the world file and the GDAL auxiliary xml file.
The procedure with TransDEM will consist of these steps:
- Download (historical/polyconic) GeoPDF map from USGS server. In this example it is a 15' quadrangle from 1943.
- Extract with gdal_translate.exe.
Code:
F:\Data\DigitalMapping\TransDEM\geopdf>gdal_translate -of jpeg -co "worldfile=yes" MO_Ozark_325125_1943_62500_geo.pdf MO_Ozark_325125_1943_62500_geo.jpg
It produces the jpeg file, the world file, and the auxiliary xml file.
- Open the image file in TransDEM and select Polyconic/NAD27 projection.
- TransDEM will look for the world file and, because Polyconic has been selected, for the auxiliary xml file. If all is fine, TransDEM will show the polyconic grid, typically with an easting of 0 for the central meridian in the map.
The projection coordinates are of no use outside the georeferencing procedure. Latitude and longitude, however, do matter and they should fit. Compare the SE corner of the map with the TransDEM status line.
- Convert to UTM.
- Optional: Crop the "map collar", using the "Transparent Margins" tool.
- Save.
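For those who have many maps to process, the extraction step in the procedure above could be scripted. The following Python sketch simply wraps the same gdal_translate call shown in the Code box. It assumes that gdal_translate is on the PATH. The function names are hypothetical, and the sketch lists whatever files GDAL wrote rather than assuming their names.

```python
import subprocess
from pathlib import Path

def build_gdal_translate_command(pdf_path, jpg_path):
    """Build the same command line as in the manual extraction step."""
    return ["gdal_translate", "-of", "jpeg", "-co", "worldfile=yes",
            str(pdf_path), str(jpg_path)]

def extract_geopdf(pdf_path):
    """Run gdal_translate on one GeoPDF and list the files it produced.

    Requires gdal_translate on the PATH (an assumption of this sketch).
    """
    pdf_path = Path(pdf_path)
    jpg_path = pdf_path.with_suffix(".jpg")
    subprocess.run(build_gdal_translate_command(pdf_path, jpg_path),
                   check=True)
    # Everything sharing the base name, except the input PDF itself.
    return sorted(p.name for p in jpg_path.parent.iterdir()
                  if p.name.startswith(jpg_path.stem) and p != pdf_path)

if __name__ == "__main__":
    print(" ".join(build_gdal_translate_command(
        "MO_Ozark_325125_1943_62500_geo.pdf",
        "MO_Ozark_325125_1943_62500_geo.jpg")))
```

With `check=True` the script stops with an error if gdal_translate fails, so a missing output file does not go unnoticed.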