IT into everything — a geological survey in transition
From: Allen, P M. 2003. A geological survey in transition. British Geological Survey Occasional Publication No. 1. Keyworth: British Geological Survey.
Chapter 18 IT into everything
The phrase ‘IT into everything’ was introduced into the senior management’s vocabulary at a strategic planning workshop in 1995 organised and facilitated by CEST, the Centre for the Exploitation of Science and Technology. It was a statement of the obvious in the sense that IT had been in nearly everything for some considerable time and the BGS had a very high proportion of computer-literate staff, more so than in many other geological surveys. But in the interviews with senior staff it conducted before the workshop, CEST found that this was not fully recognised in the organisation’s strategic thinking.
It was recognised by members of the NERC Experimental Cartography Unit back in the 1960s that computers would, in due course, fundamentally change the way geological surveys worked, but it was the geophysicists in the BGS who spearheaded the change process. Geochemists followed. The geologists were a slow, somewhat reluctant last, but by the mid-1990s all BGS scientists were exploiting computer technology together. When a new technology is introduced in any work area it is first used to do more efficiently and more quickly the sorts of things that were previously done with the old technology. The next stage begins when the new technology is applied to tasks that only it can do. The BGS is now sitting on the cusp between these two stages.
It is no surprise that geophysicists and geochemists did lead the way in the use of computer technology. Theirs are dominantly numerate sciences, within which it is relatively easy to find a use for computers. Geomagnetists and seismologists were amongst the early computer users. As early as 1964 the Atlas System, housed near Didcot, was used for processing geomagnetic data. Both the program and the data were input on paper tape. The next year Glasgow University obtained an English Electric Leo Marconi KDF9 computer which used the same language as the Atlas and operated in the same way. This computer was then used by preference by the geomagnetism staff, who were based in Edinburgh, but in 1970 Edinburgh University set up its own computer centre and the BGS migrated to this. The global seismologists found a need for computing power in the 1980s to work in real time. First, they took advantage of the DEC PDP15 in Edinburgh University, but later the BGS purchased the same range of computers and installed machines in the Edinburgh office to be used by the geomagnetists and seismologists. Both then kept pace closely with the emerging technology.
Also amongst the earliest users of computers were geophysicists dealing with gravity data. The data had been assembled and organised on 80-column library cards in the 1960s. These remained in use until 1974, when the data were transferred onto the IBM360/195 computer at the Rutherford Laboratory. The aeromagnetic data, collected on airborne surveys during the 1950s and 1960s, were initially held in analogue form, but these data were digitised, starting in 1982. By 1989, when this data conversion exercise was completed, the two main geophysical databases were held, and the data processed, entirely on computers.
Data transmission using computers was pioneered in the BGS by the Geomagnetism Unit. Working in partnership with the United States Geological Survey, they established a pilot system, called INTERMAGNET, for exchanging geomagnetic data in real time between the UK and Denver, USA, in 1987. The system went fully operational the next year. Originally, they took advantage of satellite links for data transfer, but when the Internet became available this was used. INTERMAGNET is now a world-wide service using both the Internet and satellite links.
The Geomagnetism Group began using the World Wide Web to run an enquiry service in 1994. A little before the BGS set up its corporate website, the group developed its own to allow enquirers access to non-sensitive magnetic data, such as mean value magnetic declinations in specific geographical areas. The website was also used to give information about recent earthquakes, usually within hours of their occurrence.
In the Regional Geochemical Survey Programme, analytical data had been held on punch cards since 1969–70 and were easily migrated onto computer databases. Computers were used for running instruments and data processing from 1968 when the first computer was attached to the Jarrel Ash spectrometer. Computers are now an integral part of all analytical equipment.
There are several reasons for the slow start in the use of computers made by geologists. The innate conservatism of the Land Survey is a factor, but it is rather less important than the large size of the task they faced and the lack of appropriate computing technology to deal with complex, essentially graphic features in the early years. The attempt made by the Experimental Cartography Unit in the late 1960s to produce a geological map by digital methods led to the publication of the Abingdon 1:50 000 geological sheet in 1971. Unlike the geophysicists' pioneering work of this period, which led directly to successful operational systems, the Abingdon experiment demonstrated that the technology was insufficiently mature and too expensive for this method to be given serious consideration as a production process at that time. Further research on it effectively stopped until the mid 1980s. In the meantime, however, computers were being used to generate relatively simple graphics on sand-and-gravel resource maps produced in the Industrial Mineral Assessment Programme in the late 1970s and a dedicated computerised word processor was purchased to produce mineral assessment reports in the early 1980s.
The use of computers for storing geological data also has a long history, but a faltering start. Bill Read, at that time based in Edinburgh, is credited with proposing to the then Director, Kingsley Dunham, in 1967 that a system for storing geological data on computers, similar to that being used in the oil industry, should be introduced into the BGS. Some work on this was started and David Gray was appointed to chair the Computer Committee, which was meant to coordinate computing across the Survey. It does not appear to have been successful. Conflict with NERC Computer Services, which was trying to implement pan-NERC solutions, may have played its part, but throughout the 1970s computers were being tried in a wide variety of tasks in all parts of the Survey. Many of them were funded, after Rothschild, from commissioned-research budgets and lay outside corporate control, but this was an era of experimentation and it was probably unreasonable to expect much success in coordination during this experimental phase of activity. An independent Computer Unit was set up in the BGS in 1974, again to try to bring some control to this expanding and exciting field, but this also was not successful.
Meanwhile, the attempt to begin to computerise the BGS data holdings, which was started in the late 1960s, languished. It was not until 1981, when a two-man group, the Geological Databank Scotland, was established in the Edinburgh Office, that serious experimentation on computer databases for geological information began again, largely in the Edinburgh and Newcastle offices but also in Exeter. There was pressure from the Edinburgh group to adopt the database management system called MIMER, developed by the Swedish Geological Survey, which had been loaned to the BGS on trial. It failed to make an impact on the decision-makers, who decided to standardise on the ORACLE relational database management system for the whole of the NERC.
The domination of the BGS computer policy by the demands of geophysicists, though challenged periodically, remained for nearly twenty years. The first significant shift in policy came in 1988, when NERC Computer Services (NCS) expressed the view to the NERC Scientific Computing Strategy Committee that a future strategy should address the total configuration for advanced research computing and not just the part supporting ‘number crunching’. The NCS produced ‘A plan for the 1990s’ in 1989 and initiated the second round of reviews of computing needs within the NERC institutes. The user requirements that emerged from this indicated that in the BGS the explosion in the volume of scientific data, fuelled in part by increasingly automated means of data capture, had created a demand for computer data management. This was no surprise to many staff, coming, as it did, over twenty years after Bill Read’s paper. It was recognised that there was also a strong and growing need for a modelling capability using computers and to support computerised cartographic and other graphical representation, in particular where it was interactive with computer databases. The implementation plan developed from this review led computing in the BGS decisively into new areas and away from the dominance of the mathematically based science disciplines, but the tensions remained. The BGS has a need for computing to support all areas of its work activity. With a limited budget for computing there is inevitably a struggle for priority in funding between those whose main need is for computers to support their research and those concerned with information management and computer databases. This conflict raises its head every year when spend on the capital budget is decided.
Although these new areas were pushing the limits of the current technological capability, the major manufacturers had become firmly established and many of the tools that are currently required within BGS were available then, though some in primitive forms. The complexities of dealing with geological map data with computers were no longer insurmountable. Strong pressures were brought to bear on the Director to force a change in the BGS computer policy and give priority to computer developments that supported geology as opposed to geophysics.
The money that came with the PES award for the enhancement of the NGIS Programme in 1990 was the key that opened the door to the future. Suddenly, the BGS was able to acquire appropriate kit in all areas where computers could make a contribution. Computers became increasingly used for word processing, database management and cartography. In the case of the last, the invention and widespread availability of GIS (geographical information systems) was revolutionary. By the middle of the 1990s every scientist had access to a PC or Apple Mac computer. By the end of the decade every scientist had one or more computers on his or her desk and access to a laptop for out-of-office use.
It is easy now to forget just how recent and how dramatic this change in computer use is and how, as recently as 1990, much of what is now taken for granted by geologists, as opposed to geophysicists and geochemists, could not be done. There is, perhaps, no better example of this than the development of digital cartography. The Digital Map Production System (DMPS) for 1:50 000 maps, which was being developed from 1988 into the 1990s, stored data in an ORACLE relational database management system. The cartographic system and GIS in use was Intergraph, and the maps being produced were plotted using a Versatec electrostatic plotter through which vectorised linework and polygons depicting the geology were overlain on raster images of the topography. In 1990, the DMPS had been developed to meet the cartographic requirement for efficient production of the printed map via a GIS. Cartographic attributes were held digitally within the graphics file and within the ORACLE database, and were used to generate a finished digital map ready for proof checking using electrostatic plots and then reprographic processing. The success of the 1:50 000 production system contributed to the development of a comprehensive database/map production system based on the geological attribution of map data at a scale of 1:10 000, which was also being developed in the early 1990s.
The complexities of the 1:10 000 DMPS tested the application software beyond its limits in a number of areas — particularly in relation to the ‘doughnut’ effect. This is where one polygon (or many) is completely enclosed within another, like the hole in the middle of a doughnut. This disposition of polygons is commonplace on geological maps, but the software could not deal with it effectively. The DMPS could not be made into a practical, operating system until this and other problems were solved. Compared with 1971, when the Abingdon sheet was published, they were trivial problems and work-arounds were devised, but it took pressure from BGS and visits to the Intergraph headquarters in the United States to force the company to recognise the seriousness of this problem, in particular, and improve their software. In 1991, the fully operational 1:10 000 DMPS, free of the clever work-arounds that BGS cartographers and programmers had had to devise to make the prototype system work, was set for the last stages of its development. It was possible to think ambitiously about developing the sorts of systems to handle geological data that were comparable to those already in existence for geophysical and geochemical data.
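The doughnut problem can be illustrated with a short sketch. This is not BGS or Intergraph code; it is a minimal, hypothetical example of a polygon-with-holes data structure, showing why software that ignores interior rings misattributes points. An inlier of older rock enclosed within a younger formation is exactly such a ‘doughnut hole’: a point inside the hole must not be counted as part of the enclosing polygon.

```python
# Illustrative sketch (not the DMPS implementation): a polygon defined by an
# exterior ring plus interior rings ("holes"), with an even-odd ray-casting
# point-in-polygon test that honours the holes.

def point_in_ring(x, y, ring):
    """Even-odd ray-casting test: is (x, y) inside the closed ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

def point_in_polygon(x, y, exterior, holes):
    """Inside the exterior ring but outside every hole."""
    if not point_in_ring(x, y, exterior):
        return False
    return not any(point_in_ring(x, y, hole) for hole in holes)

# Outer formation: a 10 x 10 square; hole: a 2 x 2 inlier at its centre.
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]

print(point_in_polygon(5, 5, outer, [hole]))  # False: inside the hole
print(point_in_polygon(1, 1, outer, [hole]))  # True: inside the formation
```

A system that stored only the exterior ring would return True for both points, attributing the inlier to the wrong formation — the kind of failure the text describes.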
The principle of the 1:10 000 DMPS was to put all the geological information that was on an approved 1:10 000 geological map into databases. This included descriptions (attributes) for every line, polygon and point feature on the map. A line depicting a fault was described in the database as a fault, with information on the fault as well as the names of the formations and rock types on either side of it. This same line, however, was also given cartographic attributes, such as the thickness at which it had to be drawn, the length of the dashes if it were a dashed line and the colour of ink to be used. This approach meant that the databases could be interrogated for their geological information and a map drawn to illustrate it. A map showing only sandstones could be derived from the database, or one that showed only faults or only coal seams.
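The dual-attribution principle can be sketched as follows. The feature records, field names and query function here are invented for illustration and do not reflect the actual DMPS schema; the point is only that each feature carries both geological attributes (what it is) and cartographic attributes (how it is drawn), so thematic maps can be derived by querying the geological attributes alone.

```python
# Hypothetical sketch of dual attribution: every map feature carries
# geological attributes and, separately, cartographic drawing instructions.

features = [
    {"id": 1, "kind": "fault", "formation_left": "Coal Measures",
     "formation_right": "Millstone Grit",
     "carto": {"weight_mm": 0.35, "style": "dashed", "colour": "black"}},
    {"id": 2, "kind": "polygon", "lithology": "sandstone",
     "formation": "Sherwood Sandstone",
     "carto": {"fill_colour": "orange"}},
    {"id": 3, "kind": "polygon", "lithology": "mudstone",
     "formation": "Mercia Mudstone",
     "carto": {"fill_colour": "green"}},
]

def derive_map(features, **criteria):
    """Select only the features matching the geological criteria,
    e.g. a 'sandstones only' or 'faults only' map."""
    return [f for f in features
            if all(f.get(k) == v for k, v in criteria.items())]

sandstone_map = derive_map(features, lithology="sandstone")
fault_map = derive_map(features, kind="fault")
print([f["id"] for f in sandstone_map])  # [2]
print([f["id"] for f in fault_map])      # [1]
```

The cartographic attributes travel with each selected feature, so the derived subset can be drawn with the same styling rules as the full map.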
The implications of this approach were many. The DMPS required comprehensive dictionaries to support the databases. Corporate rock classification schemes had to be drawn up. Surprisingly, in a Geological Survey with a 150-year history, none of these already existed. Classifications had to be fixed for all other data that were to be taken off the map and put in the databases. A Lexicon of names of stratigraphical units used on BGS maps was made and, eventually, mounted on the BGS website. The concept of ‘seamlessness’ was introduced for geological maps. In its simplest sense this meant that any two adjacent map sheets had to fit together seamlessly, such that an internally consistent map, straddling the boundary between two map sheets, could be generated from the computer databases. In the broader sense it meant that the geology of the whole of the UK should be depicted as a single entity to nationwide, common standards. To achieve this meant not only that there should not be any boundaries between map tiles, but that there should be a single, internally consistent stratigraphical classification scheme for all rocks in the country. Old practices, such as stopping solid geological boundary lines at the edge of drift deposits, had to end because the computer did not like dealing with gaps in information.
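The two disciplines described above — attribution drawn only from corporate dictionaries such as the Lexicon, and seamless agreement between adjacent map tiles — can be sketched as simple validation checks. The data structures, unit codes and tile representation here are invented for illustration; they are not the BGS schemas.

```python
# Illustrative sketch: (1) every unit code used on a tile must exist in the
# corporate dictionary; (2) units mapped along the shared edge of two
# adjacent tiles must agree, or the map is not seamless.

LEXICON = {"MMG": "Mercia Mudstone Group", "SSG": "Sherwood Sandstone Group"}

def unknown_codes(tile):
    """Return unit codes on the tile that are absent from the Lexicon."""
    return [code for code in sorted(tile["units"]) if code not in LEXICON]

def seam_mismatches(tile_a, tile_b):
    """Compare units at matching positions along the shared edge."""
    return [(pos, a, b)
            for (pos, a), (_, b) in zip(tile_a["east_edge"],
                                        tile_b["west_edge"])
            if a != b]

tile_1 = {"units": {"MMG", "SSG"},
          "east_edge": [(0, "MMG"), (1, "MMG"), (2, "SSG")]}
tile_2 = {"units": {"MMG", "SSG"},
          "west_edge": [(0, "MMG"), (1, "SSG"), (2, "SSG")]}

print(unknown_codes(tile_1))            # []  -- all codes in the dictionary
print(seam_mismatches(tile_1, tile_2))  # [(1, 'MMG', 'SSG')] -- seam broken
```

In the real programme this kind of consistency had to hold not just at tile edges but across a single national stratigraphical classification, which is why the dictionaries had to be built first.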
The impact of the introduction of digital map-making processes on thinking among everyone who was involved with the production of geological maps, from the field surveyor to the cartographer, was profound and led to many changes in practice. Outstanding among these were those related to map production. Prior to 1988, map production was carried out partly in house, partly externally. In house, the generation of vector linework was done digitally, while colour-polygon encoding was done manually. External bureaux were used for reprographic processing to produce plate-ready films for the printer. There were cost disadvantages in this arrangement. By 1990 a fully digital system was in place, which allowed the whole process, apart from plate-ready film production (until 1993) and printing, to be done in house cost effectively. Quality control became tighter and more effective and improvements in efficiency could be freely introduced. The effect on output was considerable. Between 1990 and 1994, output of printed maps per year more than doubled to over thirty and it has remained steady at this level since then.
The biggest demand that the introduction of digital map production made was that the BGS should operate to corporate standards in all things related to data storage. This was not required for map-production purposes, but full advantage of the DMPS could only be taken if all geological information for the whole of the UK could be integrated with it. To do this required a single set of hardware, software and database standards. If geochemical, geophysical and hydrogeological information were to be used in conjunction with the geological data within the DMPS, then they also had to conform to the same standards. Similarly, geotechnical, mineral-resource-potential and other derived information should also be kept within the same database framework to the same standards. Lastly, there is no reason why the coastline should act as a boundary between storage systems, when the geology crosses it.
The battle to introduce conformity in software and database standards had begun to be addressed by the Computer Management Group, an outcome of the 1997 reorganisation. The CMG consisted of the head of Geospatial Information Group in the chair, the head of Facilities Management and the chairs of the two Computer Users’ Groups. It was a small, innovative group not afraid to take on the big issues that had been highlighted in 1995. It championed the IT matrix, which, though it became operational, did not contain everyone who should have been in it. Group Managers and ADs, it transpired, had kept some key staff out of it so that they would remain totally in their control. The biggest achievement of the CMG, however, was the introduction of corporate software standards. Despite knowing that all the successful firms in the private sector and many government bodies had introduced them, this was regarded as an infringement of the individual rights of managers in some quarters of the BGS and was vigorously resisted. The small community of Apple-Mac users at first resisted the change, as did those in the three main GIS camps. Even though the BGS had standardised on the WordPerfect word-processing package some years earlier, Word was becoming the de facto standard and there were several other packages in use in corners of the Survey. After several months of discussion, the issue was still not fully resolved when David Falvey took office as Director in January 1998. The matter of database standards was taken up by a special project (called BGS-geoIDS), which was approved to run for three years by the BGS Board in March 1998. This was designed to ensure that at the end of it there should be a coherent set of database standards for the whole of the Survey.
At that same meeting the Board approved another project, named DigMapGB, the purpose of which was to digitise all 1:50 000 maps for the whole of the country and selected areas at 1:10 000. The 1:250 000 maps for the whole offshore and onshore area had already been digitised. This project was meant to provide a first version of a seamless digital map of the whole country. To function effectively it would require a unified map data standard, incorporating the geological attribution from the 1:10 000 database and the cartographic enhancements to the 1:50 000 database, and the upgrading of all data to that standard. It had once been rejected by the Programme Board, but was now being revived. Some members of the Directorate rejected it because there was nothing in it for them. It is easy to see why. One division, PGGOS, had only a marginal interest in UK onshore geology. Within the Minerals, Environment and Geochemistry Division, the practice of digitising whatever geological maps they needed for themselves was well established. Only the Geological and Hydrogeological Surveys and Corporate Services and Business Development divisions had a majority interest in the outcome of this proposal. The argument that the Directorate itself had developed in April 1995, that co-operation between groups was best exemplified when there were no overlapping interests, did not appear to apply to divisions.
The completion of these two projects paved the way for the implementation of the Digital Geoscientific Spatial Model (the DGSM UK). It is this that defines the threshold to the next logical stage in the development of the Geological Survey. The origin of the idea for the DGSM can be traced to an internal paper by Vic Loudon, ‘The data library, the database, and the computable model in geology’, dated June 1976. This paper presented the idea of the computable model as ‘an explicit statement of a hypothetical construct which represents a system or some aspect of it, implemented on the computer as a set of procedures and parameters …’. In 1979 a committee was set up under the chairmanship of Innes Lumsden to look into digital cartography within BGS and ended up recommending going beyond the digital map to construct the model behind it. Thus, a second committee was tasked to examine small-scale spatial modelling. It reported in 1982 and several small R&D projects were carried out as a consequence of its recommendations, but neither the computer power nor the software was then available for the task envisaged in the report.
Despite this, a small group of specialists in the BGS, under the guidance of Vic Loudon, continued to carry out research that contributed eventually to the implementation of his ideal. Some of the research was carried out within projects funded by the Department of the Environment. There were three of these towards the end of the 1980s. Each was a contract to carry out research into the delivery of thematic maps derived from geoscientific information for the use of town planners. Two were based on Southampton, the third on Wrexham. Out of these came the basic knowledge that led to the development of the DMPS, which delivers an essentially ‘flat’ model from a set of inter-linked databases. This went into production in 1993/94.
The logical next step was to extend the DMPS into the third and fourth dimensions. There had been several attempts to run projects to devise a 3-D modelling capability for geological information in the 1990s. Largely because of lack of resources and low priority, none had been successful beyond a certain point. The technical problems to be overcome were also very complex, and commercially available software was not adequate. Because the BGS had adopted a ‘first follower’ policy the Survey was not going to be tempted to put any resources to writing its own software for 3-D geological modelling. In different parts of the Survey expertise was building up in the use of the existing commercial 3-D modelling software packages such as EarthVision and Vulcan. The geophysicists, also, were expert at devising 2.5-D and 3-D models that simulated gravity and magnetic fields, relating them to low-resolution 3-D geological models. Thus, with the know-how gained through the development of the DMPS, it became timely in the mid-1990s to reconsider Vic Loudon’s original concept of a computable spatial model. It is now generally referred to as the DGSM and defined as the totality of the spatially referenced, validated and tested geoscientific information of the UK landmass and continental shelf within a single system.
There are two important strands to the DGSM concept. One is that it is fully four-dimensional. The standard, printed geological map is a representation of a three-dimensional model in two dimensions, which itself is a clever and enduring concept, quite different from the depiction of, say, a perspective drawing like a block diagram, on a two-dimensional surface. To progress beyond this and hold all the geoscientific information that is on a map, and which has been used to construct the map, within a properly spatially referenced three-dimensional model that may vary with time can only be done with computers. In the ideal model each piece of information is given xyzT coordinates and held in either a system of inter-linked databases or, ultimately, a single xyzT database. The total model itself will be too big and data-rich ever to be viewed, except as low-resolution versions, but parts of it could be illustrated at high resolution. The core of the DGSM, therefore, is the database, or set of databases, which have to be designed and maintained to exceptionally high standards of conformity and integrity.
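The xyzT idea — every piece of information held against full spatial and temporal coordinates, queryable by volume — can be sketched as a minimal record type. This is an invented illustration, not the DGSM design; the field names, coordinate conventions and sample data are assumptions made for the example.

```python
# A minimal sketch of the xyzT concept: each observation carries spatial
# (x, y, z) and temporal (t) coordinates, so any volume of the model can
# be queried directly rather than via a flat map sheet.

from dataclasses import dataclass

@dataclass
class Observation:
    x: float    # easting (m)
    y: float    # northing (m)
    z: float    # elevation relative to datum (m); negative = depth
    t: float    # time coordinate, e.g. survey date as decimal year
    value: str  # the geoscientific attribute recorded at this point

def query_volume(obs, xmin, xmax, ymin, ymax, zmin, zmax):
    """Return the observations falling inside the requested 3-D box."""
    return [o for o in obs
            if xmin <= o.x <= xmax and ymin <= o.y <= ymax
            and zmin <= o.z <= zmax]

db = [
    Observation(452100.0, 339800.0,  -35.0, 1994.5, "Sherwood Sandstone"),
    Observation(452150.0, 339900.0, -120.0, 1998.2, "Mercia Mudstone"),
    Observation(460000.0, 345000.0,  -10.0, 2001.7, "till"),
]

# Everything in a 5 km x 7 km area down to 50 m depth.
shallow = query_volume(db, 450000, 455000, 335000, 342000, -50, 0)
print([o.value for o in shallow])  # ['Sherwood Sandstone']
```

Extending the filter to the t coordinate is what would let the model be queried as it stood at a given date, the fourth dimension the text describes.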
This leads, naturally, to the second key strand. Nowadays, the central reference sources for geoscientific information are the standard publications of the BGS. These are the 1:10 000 and smaller-scale geological and other maps, the published memoirs, reports and various books, and the databases of interpreted and validated geochemical and geophysical data. Volumetrically, most of this is in analogue form and with few exceptions it is an ad hoc collection of sources that have not been validated to a common date stamp. The DGSM UK will replace these as the standard reference source for geoscience. All the data in it will be integrated, validated and updated each time new data are added. Thus, it will never lose its currency.
This is a big concept, to hold most, if not all, of the BGS geoscientific data in a single set of three-dimensional databases, which have conduits to systems that allow the data to be processed or transferred elsewhere for representation on maps, diagrams or whatever. A scoping study to produce a plan for developing the DGSM UK was carried out in 1999/00 and the NERC agreed £4.4 million additional funding for a programme to develop it over five years from 2000/01.
In some respects the building of the DGSM is little more than a massive housekeeping exercise to put the Survey’s data in good, modern order, but in parallel with this will be research into ways of making the data accessible. Here, the technology is still maturing. The growth in the use of Internet and e-mail has been phenomenal in the last ten years. There is now a general assumption among BGS scientists that communications between each other and with other researchers, customers and the general public will be by electronic means. The BGS established its corporate website in 1994, and in 1999/2000 it developed a commercial website in partnership with Compaq. Both exercises were accompanied by research into various ways of making data accessible via the web, culminating in 2000 with the launch of the Geoscientific Data Index on the web. This gives users access, at index or metadata level, to most of the BGS data holdings and will be one of the conduits that will be developed to allow access to the DGSM. The development of user-friendly and fast methods of transmitting large and complex packages of data via the web has now been taken up by industry. The BGS is poised to take advantage of anything new that comes along in this field.
Many complex problems need to be addressed before the DGSM can be made functional. Some are technical and can be handled internally or with manufacturers, but others, such as the management of copyright on the World Wide Web, are not and depend on national and international legal agreements. The use of the World Wide Web to distribute BGS data, until it is replaced by a better system, is now a reality, but the problems of policing the usage of data so distributed make a nonsense of some of the pressures brought to bear on the BGS by Government to exploit its data holdings for profit. This is a conundrum, far from solved.