
By KATHY SHIRLEY
EXPLORER Correspondent

Hierarchical knowledge management analyzes the different parts of an exploration project and identifies, particularly for seismic data, the different datasets and the most recently used version of each.

"Large geotechnical datasets in petroleum exploration have created a continuously growing challenge to traditional data storage, management and delivery systems. New strategies are required in order for this data to be effectively utilized as an organizational asset to add to proven reserves," said Jess Kozman, service delivery manager for Schlumberger Information Solutions.

"Analysis of best practices based on successful information management implementations at over 50 international oil and gas organizations shows that the focus of the solutions has moved from data and information to knowledge," he added.

Storage management solutions now recognize three levels of digital datasets:

• Large-volume static data files.

• Smaller but more numerous and dynamic user interpretation files.

• One-time capture and retrieval files used for snapshots and archives at key milestones and benchmarks in the life of a field project.

Kozman said that five years ago an exploration project might have included only a migrated and stacked version of the seismic data. Today there can be three or four different AVO versions, a prestack depth migration version, attributes and prestack gathers, and all those volumes grow exponentially. The impetus for hierarchical management came from IT personnel who realized they could no longer afford to just keep buying additional disks.

"The hierarchical knowledge management system is automated and works in the background with no user intervention required," he said. "It automatically tracks which versions have been most recently used, so the user is spending his time on geophysical analysis and adding value to the bottom line rather than spending time dealing with disk storage and figuring out which version was most recently used."
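The version tracking Kozman describes can be pictured with a small sketch. It assumes, purely for illustration, that the system keeps an access log of (dataset, version, time) records; the function name, the record layout and the survey and version names are all hypothetical and not taken from any product.

```python
from datetime import datetime

def most_recent_versions(access_log):
    """Return {dataset: (version, last_used)} for the most recently used version."""
    latest = {}
    for dataset, version, used_at in access_log:
        current = latest.get(dataset)
        # Keep the record with the newest access time for each dataset.
        if current is None or used_at > current[1]:
            latest[dataset] = (version, used_at)
    return latest

# Hypothetical access records: (dataset, version, time of access).
log = [
    ("survey_A", "stacked_migrated", datetime(2003, 1, 10)),
    ("survey_A", "prestack_depth_migration", datetime(2003, 3, 2)),
    ("survey_A", "avo_near_angle", datetime(2003, 2, 14)),
    ("survey_B", "stacked_migrated", datetime(2003, 2, 20)),
]
latest = most_recent_versions(log)
```

Because the lookup is rebuilt from the log, the interpreter never has to record which version was used last; that bookkeeping stays in the background.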

 

Can We Have It All?

Storing and Harvesting Massive Data Sets

Three-D seismic and the explosion of processing techniques that have continually made the data more useful are a vital part of today’s exploration and production projects -- but this proliferation has created a problem for oil companies: What to do with all that data?

Companies are increasingly struggling with this costly issue.

As a result, new storage management techniques have been developed, and those are now evolving into knowledge management for large databases.

Hierarchical storage management basically allows companies to move data between online disk systems and near-line tape systems, depending on how frequently a dataset is used.

"This is an automated system that ensures your most recent and most accurate data is always on line at any given time," explains Jess Kozman, service delivery manager for Schlumberger Information Solutions.

However, that concept is now moving to the next level, into what Kozman calls hierarchical knowledge management.

Kozman, who presented a paper titled "Where Has All My Data Gone? Case Histories of Hierarchical Knowledge Management for Large Datasets" at the 2003 AAPG international meeting in Barcelona, cites several projects where hierarchical knowledge management was successfully implemented to solve data management issues, including:

One of the first successful implementations of hierarchical content management, at Chevron Overseas Petroleum Inc. in San Ramon, Calif., at what is now the ChevronTexaco Upstream Technical Computing Center.

The management of large static trace files in the interpretation environment began in 2000 with the installation of a five-terabyte automated tape library to manage application trace files stored on network-attached storage for six worldwide business units.

"Business rules were put in place to allow the release to near-line media of trace files not accessed in 32 days from over 500 seismic projects containing up to 30,000 physical files," he said. "In addition, a second copy of the archived tapes was used to provide backup and disaster recovery capabilities."
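A business rule of this kind is simple to express. The following is a minimal sketch, not the ChevronTexaco implementation: the `TraceFile` record, its field names and the file paths are hypothetical, and the requirement that a tape copy exist before disk space is released is an assumption added here for safety. Only the 32-day threshold comes from the article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

RELEASE_AFTER_DAYS = 32  # idle threshold quoted in the article

@dataclass
class TraceFile:
    path: str              # hypothetical file name
    size_bytes: int
    last_access: datetime
    has_tape_copy: bool    # assumption: release only after a near-line copy exists

def release_candidates(files, now, max_idle_days=RELEASE_AFTER_DAYS):
    """Return files idle longer than the threshold that already have a tape copy."""
    cutoff = now - timedelta(days=max_idle_days)
    return [f for f in files if f.has_tape_copy and f.last_access < cutoff]

now = datetime(2003, 6, 1)
files = [
    TraceFile("proj1/vol_a.trc", 4_000_000_000, datetime(2003, 4, 1), True),   # idle 61 days
    TraceFile("proj1/vol_b.trc", 2_000_000_000, datetime(2003, 5, 20), True),  # idle 12 days
    TraceFile("proj2/vol_c.trc", 3_000_000_000, datetime(2003, 1, 15), False), # idle, no tape copy yet
]
to_release = release_candidates(files, now)
freed_bytes = sum(f.size_bytes for f in to_release)
```

Passing `max_idle_days` as a parameter keeps the rule adjustable, which matters because the two case histories in this article use different thresholds.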

The system has grown to over 11 terabytes and over 100,000 physical files.

"According to the ChevronTexaco project manager, seven terabytes of the data exists only on near-line tape, and she recently wrote, ‘I shudder to think of how we would have handled all that inactive data’ without the near line systems," he said. "She indicated the previous system involved hours of work creating offline tapes ‘destined to be lost in desk drawers,’ and would have eventually required the purchase of more network attached storage at substantially higher cost than that of tape."

The near-line system also saves money, reduces pressure on the backup system, provides a comfort level for disaster recovery and makes it easier to retrieve files to a project than a traditional UNIX backup system does.

"Standard backup tapes at ChevronTexaco are kept for only three months, so projects have been saved by having the data on near line tapes after discovering it had been deleted completely from disk at some unknown time," he said. "Since the technical center is billed internally for backup services, taking the managed disks out of the backup schedule also saves money."

At Conoco in Lafayette, La., a near-line robotic tape system was installed to manage the deepwater Gulf of Mexico exploration division's seismic project storage from 1999 to 2003. Online storage had grown to approximately 2.2 terabytes, driven by up to a terabyte per year in newly delivered data and the addition of reprocessed and specialty processed volumes.

"Conoco had purchased over 30,000 square kilometers of 3-D seismic surveys in five years to evaluate over 300 leases in the deepwater Gulf of Mexico," he said. "Manual intervention was required to manage the data volumes by physically backing up and deleting projects, taking interpretation time away from geoscientists."

The cost of managing this growth by simply buying additional network-attached storage was projected to reach approximately $10 million by 2000 -- an unacceptable figure in a cost control environment.

The solution: a 25-terabyte automated tape library and a hierarchical project content management system. Business rules moved seismic trace volumes that had not been accessed in over 30 days to near-line media, where they could be recovered when needed using four high-speed, high-reliability drives working in parallel.

"Conoco determined independently that the implementation of a hierarchical near line storage strategy saved over $7 million over three years compared with the cost of buying additional network attached storage," Kozman said.

The latest development in the hierarchical storage of project content is the shift in focus from data and information to knowledge.

A hierarchical knowledge management system manages large static data files, dynamic user files containing project information and archive files. Such a system was installed by German Oil & Gas Egypt (GEOGE) in Cairo, where approximately 2.6 terabytes of interpretation data is moved automatically between online and network attached storage devices and a robotic automated tape library, he said.

Files from all three categories are identified by access patterns and segregated onto separate storage partitions and pools of tape media. File usage patterns are continuously monitored to gauge the effectiveness of the background processes and allow tuning of the system.
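The article does not say how the access patterns are evaluated, so the following is only one plausible sketch of the idea. It assumes lifetime read and rewrite counts are available for each file and applies deliberately crude rules; the `FileStats` record, the pool names, the file names and the rules themselves are illustrative assumptions, not the GEOGE configuration.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class FileStats:
    path: str      # hypothetical file name
    reads: int     # reads since the file was created
    rewrites: int  # modifications since the file was created

def classify(stats):
    """Assign a file to one of the three categories using assumed rules."""
    if stats.rewrites > 0:
        return "dynamic"   # user interpretation files that keep changing
    if stats.reads > 0:
        return "static"    # large trace volumes: read, never rewritten
    return "archive"       # written once at a milestone, not touched since

def assign_pools(all_stats):
    """Group file paths by category so each can go to its own partition or tape pool."""
    pools = defaultdict(list)
    for s in all_stats:
        pools[classify(s)].append(s.path)
    return dict(pools)

pools = assign_pools([
    FileStats("seis/full_stack.trc", reads=120, rewrites=0),
    FileStats("interp/horizons_user1.dat", reads=40, rewrites=15),
    FileStats("snapshots/milestone_q2.tar", reads=0, rewrites=0),
])
```

In practice the monitoring described above would be used to tune such rules, since a static volume that nobody has opened yet would be misfiled as an archive by this simple version.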

The system provides effective storage management, backup and disaster recovery capabilities and a method to capture and archive the knowledge contained in projects at key milestones in the project life cycle, including preparation for application upgrades, according to Kozman.

"Prior to the installation, GEOGE had more than one terabyte of data on disk with only manual backups of project seismic data and no scheduled backups for data outside of application projects," he said. "There was no in-place disaster recovery system. Ninety software application system projects were spread over 36 disk partitions and 30,000 physical files. Most of the disks were 95 to 100 percent full."

Some seismic trace files in interpretation projects had not been accessed by users in up to nine months, and manually created backup files were occupying more than 200 gigabytes on a high-end, network-attached storage device.

"Corrupted disks required system administrators to physically move data, one project at a time, and reload projects from tape -- resulting in days of lost work," he said. "Plus, a 15 percent growth in seismic data over three quarters was predicted and disk usage had begun to grow exponentially."
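To put the predicted growth in perspective, the short calculation below converts 15 percent over three quarters into equivalent per-quarter and annualized rates. It assumes steady compound growth, which the article does not state, and the 1.0 terabyte starting size is an illustrative round number rather than a reported figure.

```python
def per_period_rate(total_growth, periods):
    """Constant per-period rate that compounds to total_growth over the given periods."""
    return (1.0 + total_growth) ** (1.0 / periods) - 1.0

def project(size_tb, rate, periods):
    """Projected size after compounding at a constant rate."""
    return size_tb * (1.0 + rate) ** periods

quarterly = per_period_rate(0.15, 3)       # roughly 4.8 percent per quarter
annualized = (1.0 + quarterly) ** 4 - 1.0  # roughly 20 percent per year
after_three_quarters = project(1.0, quarterly, 3)  # illustrative 1.0 TB start
```

At that pace a disk that is already 95 percent full runs out of room within about one quarter, which is consistent with the urgency described in the quote.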

