How data science helps us understand the subsurface
Historical data, software, new ways of working and data science can help us better understand the subsurface world. Here's how.
In Equinor, we’re working in many different realms. What first springs to mind might be the offshore rigs or processing plants on shore, but a major part of our work occurs in the natural world. One area of the natural world we work in is the subsurface, which we can describe as layers of rock and associated fluids below the seabed offshore and below ground level onshore. It’s not just where we find and extract oil and gas, but also where we can pump CO2 back in for storage.
Before we know what areas are suitable for which, we need to take samples and survey the area. This is done in a few different ways. Direct samples are taken by drilling wells and sampling the rock through coring, but we add to this mix of data by also sending a variety of signals into the subsurface and recording the signals that return – methods such as well logging or seismic surveys.
We’ve been doing this since 1972 so we have nothing short of a monumental amount of data available. Then, add all the other energy companies and their available data into the mix, some with much older data than Equinor – as well as any other organization, government or corporation with datasets we access or purchase.
“The complexity of our data is truly daunting. If we’re to understand where the best rock characteristics or the best spots in the subsurface are, and do it faster and safer than ever before, then we need to somehow unlock all this data systematically,” Ashley Russell says.
Ashley is a geologist and self-taught data scientist working as the leader for data science and analytics in EPI’s Subsurface Excellence and Digital unit.
“Unlocking all this potential begins with data science – first with data engineering to transform the data into analytics-ready structures that are universally understood everywhere, especially by computers, and next with machine learning to pick out relationships and make predictions in these huge quantities of data, jobs impossible for a human,” Ashley explains.
Think of data science like a cake. The first layers are all about collecting the data, moving/storing it somewhere and then exploring, visualizing and transforming it all – before doing analytics, creating new calculations (which we call features) and choosing training data. When these layers are all done, that’s when the sweet glaze that is learning and optimizing can take form. The basics enable us to run machine learning, artificial intelligence and deep learning on our data – the juiciest bite of them all. That’s the slice that enables us to learn more and do more.
Credit: Monica Rogati/Hackernoon
Have no fear, the data is here
We have data spanning decades, but that doesn’t mean we can use it right away. Some of it is stored on tapes, disks or, if we’re lucky, hard drives. One of the most important tools we have to unlock and unpack all these different data types is open-source software – software you’re free to use as you see fit. The open-source programming language Python and its various packages are frequently used tools in the subsurface world.
So, let’s look at an example. How would we unpack well logging data from Australia stored on a magnetic tape from 1982? The first step, naturally, is to digitize it with a tape reader. This would give us a .DLIS file, a tape-based standard file type commonly used in the geosciences of oil and gas.
Then, we can use DLISIO, an Equinor-developed open-source Python package. Traditionally, software for these files was designed for reading DLIS files manually, giving you a window to look at a file’s content. While that’s a useful way to do things, it doesn’t exactly scale up. DLISIO lets users read files programmatically, which enables us to automate the reading of large numbers of files with just a few lines of code.
“This gives a user the ability to work with many files at once, which opens up for new ways of working. The goal of DLISIO was always to make a general-purpose package to facilitate both the specific needs at the time as well as future ones. It was also evident that the current ways of working were shaped by the tools available. We wanted to change that with DLISIO,” says Erlend Hårstad, one of the creators.
Using DLISIO, the first Python code we run extracts metadata, showing us what the data contains and when it is from. With this, we can run another bit of code and extract the data we’re interested in. “With a few lines of code, we can actually transform a tape-based and very subsurface-specific file format into something we’re all more familiar with – data in rows and columns,” Ashley explains.
“Once we have the data in this structure, we can take it anywhere – including quick visualizations to look at that data statistically and also look at it in a familiar subsurface way,” she adds.
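The two steps described above (metadata first, then the curves as rows and columns) could look roughly like the sketch below. It is an illustration, not the team's actual code: the function names `summarize` and `read_tables` are made up for this example, the attribute names follow our understanding of DLISIO's object model and should be checked against the installed version, and the conversion to a pandas table assumes every channel in a frame holds a single value per row.

```python
def summarize(logical_file):
    """Collect basic metadata from one DLIS logical file.

    Only reads attributes that dlisio exposes on its origin, frame and
    channel objects, so any object with the same attributes works too.
    """
    origins = [
        {
            "well": origin.well_name,
            "field": origin.field_name,
            "created": origin.creation_time,
        }
        for origin in logical_file.origins
    ]
    frames = [
        {
            "frame": frame.name,
            "index_type": frame.index_type,
            "channels": [(ch.name, ch.units) for ch in frame.channels],
        }
        for frame in logical_file.frames
    ]
    return {"origins": origins, "frames": frames}


def read_tables(path):
    """Return (metadata, tables) for each logical file in a .DLIS file.

    Hypothetical helper: 'tables' maps a frame name to a pandas DataFrame.
    """
    # Third-party imports are kept local so summarize() stays usable
    # without dlisio or pandas installed.
    from dlisio import dlis
    import pandas as pd

    results = []
    with dlis.load(path) as logical_files:
        for logical_file in logical_files:
            # frame.curves() yields a structured NumPy array; building a
            # DataFrame from it assumes all channels are one-dimensional.
            tables = {
                frame.name: pd.DataFrame(frame.curves())
                for frame in logical_file.frames
            }
            results.append((summarize(logical_file), tables))
    return results
```

Splitting the metadata summary from the curve extraction mirrors the order described above: we first look at what a file contains and when it is from, and only then pull out the data we are interested in.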
But there’s a problem – that was just one single .DLIS file. This Australian data set has over 36,000 files alone and that’s a pretty standard size. Multiply that with the amount of .DLIS files worldwide and you have millions of files in one very specific data type – and there are more data types that also need to be included, such as .LAS and .LIS.
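Before processing at that scale, a useful first step is simply to inventory which files exist per format. The sketch below is a minimal illustration using only Python's standard library; it assumes the files sit in a local directory tree and are recognizable by their extension, and the function name `inventory` is made up for this example.

```python
from collections import Counter
from pathlib import Path

# File extensions for the well log formats mentioned above.
WELL_LOG_SUFFIXES = {".dlis", ".las", ".lis"}


def inventory(root):
    """Count well log files per extension under a directory tree."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        suffix = path.suffix.lower()  # treat .DLIS and .dlis alike
        if path.is_file() and suffix in WELL_LOG_SUFFIXES:
            counts[suffix] += 1
    return counts
```

Calling `inventory("some/data/folder")` returns a counter such as the number of `.dlis`, `.las` and `.lis` files found, which gives a rough idea of how much work each format represents.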
“Cloud technology is a must to harness the massive computer power it offers to analyze these files, but it also requires a team to be able to put all this into practice on a scale,” Ashley says.
Looking for mathematical relationships in rock
In Ashley’s case, the team was all in place. There’s a Swedish and a Serbian software developer, a Chinese cloud architect, a British geologist, a Ukrainian petrophysicist, a Dutch data scientist and an Italian Scrum Master – as well as Ashley, an American data scientist. The fact that we’re bringing up everyone’s nationality isn’t a coincidence.
“All our backgrounds are unique, and we have very different experiences and different competencies. But when combined, they allow us to address the subsurface data engineering chaos and start to unlock all the data in totally new ways,” Ashley says.
Magnus I. Karlsson, a software developer on the team, adds: “Technically, to tackle the problem of processing huge amounts of well log data, we needed to learn totally new technologies – how to best utilize the services from our cloud provider, Azure, to create scalable and cost-efficient solutions.”
Top row from left: Gerrit Toxopeus, Alexander Rolland, Erlend Hårstad. Bottom row from left: Ashley Russell, Pier Lorenzo Paracchini, Magnus Karlsson.
The team applied the previously mentioned cloud processing power and extracted data from every single one of the 36,000 files.
“Now that we have all that data in a form we can work with, we can start looking for mathematical relationships in the data. We can start sorting out the signals we want from the noise and understand what the rocks can tell us,” Ashley explains. “Then we can start investigating these relationships, utilizing them and combining them with other data types. That allows us to mathematically and statistically quantify the relationships in our subsurface,” she says.
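One of the simplest ways to quantify such a relationship is a correlation coefficient between two log curves. The sketch below is only an illustration of that idea, not the team's method: it computes the Pearson correlation with Python's standard library, and it assumes the two curves are sampled at the same depths and that missing readings are marked as `None` or NaN. The function name `pearson` is made up for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally sampled curves.

    Pairs where either value is missing (None or NaN) are skipped.
    """
    pairs = [
        (x, y)
        for x, y in zip(xs, ys)
        if x is not None and y is not None
        and not math.isnan(x) and not math.isnan(y)
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid pairs")

    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant curve has no defined correlation")
    return sxy / math.sqrt(sxx * syy)
```

A value close to 1 or -1 suggests a strong linear relationship between the two curves, while a value near 0 suggests none; real projects would go on to use richer statistical and machine learning methods.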
The methodology developed by this team has now been applied to unlock 1,270,000 well log files from 8 countries, providing a massive dataset that is used for various machine learning projects and has been integrated into Equinor’s Reservoir Experience Platform. There, the exposed data can be quickly viewed alongside other unlocked datasets in a simple web browser.
Gerrit Toxopeus, a fellow geoscientist turned data scientist, puts it this way: “By utilizing cloud technology I can analyze huge amounts of data in a matter of hours rather than months. Even better is that the cloud allows us to host this data in a way that makes it more easily available to not only me, but everybody in Equinor working in the subsurface.”
The team has now moved on to apply what they learned in other teams in Equinor, expanding on the same ambitions and successes – including working with the Open Subsurface Data Universe.
Tackling the big challenges ahead
As more and more data makes its way up the pyramid, we can make more and more use of it and find relationships we hadn’t previously seen, letting us classify the subsurface in even better ways. But why would we want to do this with historical data?
Our industry’s biggest challenge is to reduce our carbon footprint. Equinor’s ambition is to be carbon neutral by 2050 and we can’t do that without focusing on several things.
“Firstly, all hydrocarbons, oil and gas, are not the same, so we want to make sure we’re targeting the ones with the lowest CO2 footprint. Secondly, we need to explore locations suitable for carbon capture and storage. Not all subsurface locations are suitable, so we need to find the right properties within the rock to effectively pump the CO2 back and hold it there safely,” Ashley says. “We also have to do this faster, safer and under more competitive pressure than today,” she adds.
That’s why running machine learning and AI on our data will be ever so important – together with an ongoing engagement in the open-source community and working in diverse teams. “Analyzing, combining and understanding the historical data from the past will let us make even better decisions in the future. The subsurface and its data will play an important role in this work,” Ashley explains.