Reducing risk in the petroleum industry

Reducing Risk in the Petroleum Industry

Machine Data and Human Intelligence

Naveen Viswanath

Reducing Risk in the Petroleum Industry

by Naveen Viswanath

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editor: Tim McGovern

Production Editor: Shiny Kalapurakkel

Copyeditor: Gillian McGarvey

Interior Designer: David Futato

Cover Designer: Karen Montgomery

Illustrator: Rebecca Panzer

August 2016: First Edition

Revision History for the First Edition

2016-08-11: First Release

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Reducing Risk in the Petroleum Industry, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-96471-2

[LSI]

Reducing Risk in the Petroleum Industry: Machine Data and Human Intelligence

To the buzzword-weary, Big Data has become the latest in the infinite series of technologies that “change the world as we know it.” But amidst the hype, there is an epochal shift: the current exponential growth in data is unprecedented and is not showing any signs of slowing down.

Compared to the short timelines of technology startups, the long history of the petroleum industry provides stark examples to illustrate this change. Seismic research happens early in the exploration and extraction stages. In 1990, one square kilometer yielded 300 megabytes of seismic data. In 2015, this was 10 petabytes (33 million times more), according to Satyam Priyadarshy, chief data scientist at Halliburton. First principles, intuition, and manual arts are overwhelmed by this volume and variety of data. Data-driven models, however, can derive immense value from this data flood. This report gathers highlights from Strata+Hadoop World conferences that showcase the use of data science to minimize risk in the petroleum industry.

In the short term, data can be used to mitigate operational risk. Given good data, machine learning can be used to optimize well completion parameters such as the amount and type of proppant used. Ben Hamner, chief technology officer at the data science startup Kaggle, says these are the biggest drivers of well cost and the biggest expense when drilling the well. They also have a proportionate impact on how much a well can produce. Using completion parameters from machine learning on one well, the gain after costs was $700,000.

Priyadarshy shared how pipelining seismic, drilling, and production data can be used for long-term reservoir management. Since it can be expensive to move data from offshore or remote operations, models use the data on site and the results are aggregated with previously collected data and models.

Oliver Mainka (vice president of product management at SAP), Hamner, and Priyadarshy all agree that the quality of data determines the value that can be derived from it. Machines are very good at spotting new patterns in oceans of data. The iterative use of human intelligence to clean the input data and validate results based on experience makes machine data-crunching an effective generator of value. Big or small, using all the available data is justified if it generates value.

Operational Risk

The spectrum of available data can be used to answer a variety of questions. High-quality input data is required for most analyses, and the output data can address different realms, like current operational risk and longer-term organizational challenges.

Here are some examples of addressing operational risk during different stages of the upstream process.

Exploration is an exciting time during which there can be immense payback for making the correct choices. The right data, and the information that results from processing it, can be valuable tools in the upstream arsenal.

Domain expertise on data sources

The oil and gas industry has been a prolific user of data for a long time, as Chevron’s Martin Waterhouse points out. Just as keeping oil flowing is a complex operation running across continents, keeping information flowing can be just as much of a challenge. Big oil firms are large companies, but they are not monoliths. They are conglomerations of organizations that could be considered large companies in their own right. The culture of the people, the role data plays, and the time that the data is retained can be very different in each organization. It can take years to figure out whom to ask questions, where things are done, and how the company functions. Connecting domain expertise with the latest in modeling and predictive analytics is as important as implementing those models, but the payoff is worth it.

In unconventional production (shale), well production is highly correlated with location. Machine learning can help determine where to acquire acreage. The input data can come from:

Geology

Core samples are rich and accurate, but also rare and very expensive.

Drilling and completion

Amount of proppant and fluid, number of stages, and injection rate.

Production

Publicly available in the US; varies by state.

Garbage in, garbage out applies here just as much as anywhere else. Human intelligence is critical for quality control of data. Domain experts can tell the difference between a bad sensor measurement and slowed production because of transport issues. For good performance, a combination of manual and automated approaches is used to correct data when possible and reject it otherwise. Hamner estimates that 95% of the effort in tackling predictive problems in the industry lies in deeply understanding data sources and how they fit into the business use case. A related challenge is how to expose results to key decision-makers.
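The "correct when possible, reject otherwise" approach can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration rather than anything described by the speakers: the plausibility limits, the sample values, and the rule that only an isolated bad reading between two good neighbors is repaired.

```python
from typing import Optional

# Hypothetical plausibility limits for a daily production rate (barrels/day).
# In practice these would come from domain experts, not from this sketch.
MIN_RATE = 0.0
MAX_RATE = 5000.0


def is_valid(value: Optional[float]) -> bool:
    """A reading is usable if it exists and lies inside the plausible range."""
    return value is not None and MIN_RATE <= value <= MAX_RATE


def clean_readings(readings: list[Optional[float]]) -> list[Optional[float]]:
    """Correct isolated bad readings by interpolation; reject the rest as None."""
    cleaned: list[Optional[float]] = []
    for i, value in enumerate(readings):
        if is_valid(value):
            cleaned.append(value)
            continue
        prev_ok = i > 0 and is_valid(readings[i - 1])
        next_ok = i + 1 < len(readings) and is_valid(readings[i + 1])
        if prev_ok and next_ok:
            # Correct: a single bad point between two good neighbors.
            cleaned.append((readings[i - 1] + readings[i + 1]) / 2)
        else:
            # Reject: not enough trustworthy context to repair the value.
            cleaned.append(None)
    return cleaned


raw = [410.0, 405.0, -999.0, 395.0, None, None, 380.0]
print(clean_readings(raw))
# [410.0, 405.0, 400.0, 395.0, None, None, 380.0]
```

An automated pass like this cannot tell a faulty sensor from a real production slowdown caused by transport issues, which is where the manual review by domain experts comes in.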

Integrating disparate data sources

A variety of sources can contribute to the data repository. These can range from automated high-sample-rate sensors to a human dropping a rope in a tank every six months. They can include audio, video, handwritten notes, and text reports. The challenge is to convert these different sample rates, accuracies, accessibilities, costs, and difficulties into a validated, usable form.
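One small piece of that conversion, reconciling different sample rates, might be sketched as follows. This is only an illustration under stated assumptions: the function name `align_last_known`, the hourly sensor timestamps, and the two manual tank readings are all hypothetical, and carrying the last known value forward is just one of several possible alignment choices.

```python
from bisect import bisect_right


def align_last_known(target_times, source_times, source_values):
    """For each target time, take the latest source value at or before it.

    source_times must be sorted in ascending order. Target times that precede
    the first source measurement get None.
    """
    aligned = []
    for t in target_times:
        idx = bisect_right(source_times, t) - 1
        aligned.append(source_values[idx] if idx >= 0 else None)
    return aligned


# Hypothetical data: hourly sensor timestamps and two manual tank gaugings.
sensor_hours = [0, 1, 2, 3, 4, 5]
manual_hours = [1, 4]
manual_levels = [12.5, 11.0]

print(align_last_known(sensor_hours, manual_hours, manual_levels))
# [None, 12.5, 12.5, 12.5, 11.0, 11.0]
```

Once the sparse series is on the same timeline as the dense one, the two can be validated against each other and used together.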

In a case that (like many others) cuts across both data varieties and domains, André Karpištšenko and his team at Marinexplore Inc. (now PlanetOS) have been working to ease the flow and increase the utility of ocean-related data.

In many parts of the world, risk is synonymous with weather. The advent of inexpensive, robust drones powered by wave and solar energy has made available data that was once impossible to gather (in the eye of a storm) or too expensive (across the Pacific), which can keep us better informed of upcoming weather. This can directly impact planning locations for offshore drilling platforms and shipping routes for oil tankers.

Risk is also equated to uncertainty. In the ocean, no two days are the same, and attributes like wind, waves, ocean currents, temperature, and pressure vary depending on location and time. A prompt, easily accessible system is more valuable than one with long data collection and processing times, when delays can render information useless.

When data is democratized, the experts are not isolated anymore. There are no long timelines to process and visualize data. Data streams from sensors, models, and simulations are available to everyone. This can even involve sharing, that often maligned word. Since many data sources (satellites, models, gliders, buoys) are capital-intensive, Marinexplore started sharing public data as a demonstration of using existing resources well. Now, leading companies are thinking about how to better exchange data. Karpištšenko’s aim is a borderless ocean-data analysis world.

Drilling and Production

Over the life of a well, the risk-return equation can be optimized with predictive maintenance. Predictive maintenance, as understood by data folk, uses predictive analytics to understand causation and correlation with millions or even billions of records as a matter of course, and formulates predictions about machine failure in order to proactively service devices instead of relying on isolated inspections. In a compressor, monitoring oil temperatures and vibrations in real time offers direct cost advantages: operating until the desired point on the PF curve (potential failure, functional failure) maximizes utility by avoiding service that comes too soon and minimizes downtime by avoiding service that comes too late. This, says Mainka, can result in big numbers. Even a 0.1% reduction in maintenance costs can translate into millions of dollars saved. For example, in Europe, maintenance cost is estimated to be 450 billion euros. Of this, 300 billion could be addressed by maintenance improvements and 70 billion is lost due to ineffective maintenance.
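As a rough illustration of servicing a machine between potential failure and functional failure, the Python sketch below raises an alert when a rolling average of vibration readings crosses a threshold. The function name, window size, threshold, and readings are hypothetical values chosen for the example. A real system would learn such limits from historical failure records, and this simple threshold rule stands in for the predictive models the speakers describe.

```python
from collections import deque


def first_service_alert(vibration, window, threshold):
    """Return the index where the rolling mean first exceeds the threshold.

    Returns None if the rolling mean never crosses the threshold.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` readings
    for i, reading in enumerate(vibration):
        recent.append(reading)
        if len(recent) == window and sum(recent) / window > threshold:
            return i
    return None


# Hypothetical vibration amplitudes (mm/s) drifting upward as a bearing wears.
readings = [2.0, 2.1, 2.0, 2.2, 2.6, 3.1, 3.7, 4.4]

print(first_service_alert(readings, window=3, threshold=3.0))
# 6
```

Averaging over a window rather than reacting to a single reading is a deliberate choice: it trades a slightly later alert for fewer false alarms from one-off sensor spikes.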

The methods chosen for data processing should be able to handle the characteristics of the incoming data. Priyadarshy highlights the characteristics of different types of upstream data. During seismic studies, the volume of data is very large, but the velocity is slow and the data does not have to be analyzed in real time. The value is significant because if you wrongly choose the drilling location of a well, it could cost you a few hundred million dollars. The complementary example is during drilling. The volume of data is much smaller compared to seismic studies, but the velocity is faster, and sometimes you have to analyze the data in real time. If predictive models fail, it can be expensive (when a drill bit gets stuck, for example). The value of real-time data in any particular case is significant but not as high as that of well location.

Sensors in real time

Sensors are becoming more pervasive, but what companies do with them still varies significantly. Mainka offers an example. Consider six data sources, producing trillions of records. Processing all of them as a matter of course, in real time, is new for 98% of companies, even though these are sophisticated companies (Fortune 100, Fortune 500).

Sensor maturity translates to lower cost and improved robustness. Petabytes of data are now collected by millions of sensors. The challenge is how to use this data fast enough that its value is not lost to collection and processing delays. Karpištšenko shares an example from the early life of Marinexplore: once buoy data was collected and analyzed, it took a customer three months to make a decision. Given that the ocean is highly dynamic, this delay seems to negate the usefulness of the information. Marinexplore’s platform can show measurements from sensors and data from models and simulations (such as daily sea temperatures) in seconds instead of months or years.

Data methods

A few data science methods can be applied verbatim, whereas others require tailoring to suit the petroleum industry. While explaining use cases, the speakers offer a glimpse into their instantiation of this world.

Asset-intensive industries are especially interested in maximizing asset productivity. Mainka describes how either the end user or the manufacturer is involved depending on whether the assets are owned or rented. By looking at billions of records, models can create rules and back-calculate possible root causes of failure. Anomalies can be either good or bad. If an anomaly is good, try to repeat it. If it is bad, try to avoid it. Multiple rules can be chained together to classify scenarios. In each case, by monitoring future performance, the system can be iteratively improved. When an impending failure is detected, from the perspective of the manufacturer, the next step could be to offer preventative maintenance service for a positive customer experience. The risk of unscheduled maintenance and associated costs can thus be reduced.
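Chaining rules to classify a scenario might look something like the following Python sketch. The rule names, field names, and thresholds are hypothetical and written by hand for illustration. In the approach Mainka describes, rules are derived by models from billions of records, which this sketch does not attempt.

```python
from typing import Callable, Optional

Record = dict[str, float]
Rule = Callable[[Record], Optional[str]]


# Each rule returns a label when it matches, or None to pass to the next rule.
# All thresholds below are made-up illustrative values.
def overheating(record: Record) -> Optional[str]:
    return "bad: overheating" if record["oil_temp_c"] > 95.0 else None


def excess_vibration(record: Record) -> Optional[str]:
    return "bad: excess vibration" if record["vibration_mm_s"] > 3.0 else None


def high_output(record: Record) -> Optional[str]:
    return "good: unusually high output" if record["output_pct"] > 110.0 else None


RULES: list[Rule] = [overheating, excess_vibration, high_output]


def classify(record: Record, rules: list[Rule], default: str = "normal") -> str:
    """Apply the rules in order and return the first label that matches."""
    for rule in rules:
        label = rule(record)
        if label is not None:
            return label
    return default


print(classify({"oil_temp_c": 80.0, "vibration_mm_s": 3.5, "output_pct": 100.0}, RULES))
# bad: excess vibration
print(classify({"oil_temp_c": 80.0, "vibration_mm_s": 2.0, "output_pct": 115.0}, RULES))
# good: unusually high output
```

Because the chain is an ordered list, rules can be added, reordered, or retired as monitoring of later performance shows which ones hold up.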

Organizations that generate the majority of maintenance work orders from preventative and predictive inspections, and that use sophisticated reliability-based maintenance procedures and tools to increase asset availability, have 27% lower unplanned downtime without any increase in service and maintenance cost.

As with most modeling, machine learning applied to exploration and production can be validated against future performance. Hamner lists the following model evaluation strategies as being useful in picking parameters for deeper study or for selecting between models:

In oil and gas, drilling is based on physics and first principles, with data crunching to generate metrics and evaluate key performance indicators (KPIs). However, using the volumes of data already stored, the goal is to learn, innovate, and move to holistic data-driven analytics in real time. Priyadarshy details the three aspects that make now seem like the right time:

Long-Term Risk

Different aspects of long-term risk require unique approaches and solutions. Practical matters whose value can be quantified, like reservoir management, are better understood than institutional ones, like loss of expertise, whose value is more difficult to quantify.

The oil and gas industry was one of the first aggregators of large amounts of data. Most of the data challenges in upstream operations revolve around storage. Upstream data is expensive to gather, and it isn’t clear at the time what will be useful in the future. Because companies could use it at some future time for some yet-to-be-determined purpose, they store as much as they can. Chevron has exabytes of such data, according to Waterhouse. The long arc of data analytics in the industry reaches back to the ’80s and ’90s, when Chevron was an early adopter of Cray supercomputers, used for reservoir modeling. More recently, to maximize production over the long term, reservoir characterization and reservoir simulation both use big data technologies, says Priyadarshy.

It is not sufficient to pick the right problem and solve it using good data. It is equally important to share the results among the target population, ensure that the acquired knowledge does not perish, and ensure that future decisions are based on what was learned during a given study. Any of these can be more challenging than the others, for unexpected reasons.

As a model of the integration of machine learning with human expertise in materials research, Kai Trepte, lead engineer at Harvard’s Clean Energy Project, explains how building blocks are mixed in computer models and their properties analyzed. The data from this analysis is mined for promising candidates, speeding up the discovery process. In addition, constraints for manufacturing and distribution are added to speed up real-world usage.

Combinatorially, 26 promising fragments (from research at Stanford University) resulted in 10 million molecules. Help from human experimentalists and theorists, along with data mining and machine learning, reduced this number to 2.3 million molecules that required further study. These 2.3 million molecules required 150 million calculations, generating 400 terabytes of data. From that, the yield was about 0.5%.
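To get a feel for the scale of those figures, the short Python calculation below derives two per-unit ratios from them. The ratios are simple averages computed here for illustration and were not reported by Trepte. The calculation also assumes decimal units (1 terabyte = 10^12 bytes, 1 megabyte = 10^6 bytes).

```python
# Figures quoted in the text.
molecules = 2.3e6        # molecules kept for further study
calculations = 150e6     # calculations run on those molecules
data_bytes = 400e12      # 400 terabytes, assuming 1 TB = 10**12 bytes

# Derived averages (illustrative; not reported in the source).
calcs_per_molecule = calculations / molecules
mb_per_calculation = data_bytes / calculations / 1e6

print(round(calcs_per_molecule, 1))  # 65.2
print(round(mb_per_calculation, 2))  # 2.67
```

On average, then, each candidate molecule needed roughly 65 calculations, and each calculation produced under 3 megabytes of output. The 400 terabytes come from the sheer number of runs, not from any single large result.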

The compute time for such simulations is very large, so they used an existing open source framework, IBM World Community Grid and Berkeley Open Infrastructure for Network Computing (BOINC), where volunteers donate processing on their devices. With 600,000 volunteers donating 22,000 CPU years, it was equivalent to a 170,000-core supercomputer. This is orders of magnitude higher than what a single, well-funded research team could afford.

It is difficult to fathom how long physically making these millions of different molecules and testing them would take. Humans and machines together made this study possible.

But as a general statement in research, whether academic or industrial, there is little funding for data persistence (especially when there aren’t publishable results). In short, most of the data collected during research is lost. By Trepte’s estimate, within five years, 50% of raw data is lost. In 10 years, 95% of data is lost. This is changing. The Materials Genome Initiative has funded
