AI for Earth: From scribes to deeds
Bruno Sanchez-Andrade Nuño
Executive Director Clay, AI for Earth | Due Diligence consultant
We have a very clear set of needs to understand Earth’s climate and nature, including measuring biodiversity, mapping disasters, and monitoring crops, to mention just a few. Yet despite having an enormous amount of diverse Earth data (both open and commercial), most insights about the Earth remain locked in that data, requiring very advanced technical skills, resources, time, and tools to extract.
We believe we finally have the missing piece to unlock this situation: to take Earth Observation beyond being the scribes of doom and instead have it support the change we want to see in the world, removing complexity and adding clarity and speed. That last piece is AI. AI for Earth.
While AI has already revolutionized many fields, this recent AI wave has largely eluded Earth. With AI we can transcribe and translate any text like never before. We can have self-driving cars and detect cancer on X-rays. Moreover, the most famous examples in each of these cases use the same AI architecture: transformers. It is a remarkable human achievement to create such a universal learner. But we have not seen it applied to Earth data at scale. Until now.
Clay is a nonprofit team of 30 people from around the world (USA, Portugal, Denmark, Spain, India, Poland, New Zealand…) on a mission to use AI to make understanding Earth data — whether it’s finding deforestation, tracking coastal erosion, or monitoring crop health — as simple as using ChatGPT or searching the web.
There are a number of barriers to making EO insights simple. The current state of geospatial analytics often involves inefficient one-off processes: we typically start from the exact same data (e.g. Sentinel or Landsat), process it, and develop a specific application for a task like mapping deforestation or detecting floods. Dan Hammer and I helped create many such processes. Thankfully, this is improving. Some of the most impactful of these products show a pattern of progress: Global Forest Watch, Global Plastics Watch, and Amazon Mining Watch. They are arguably the same kind of "semantic monitoring" tool, tracking forest change, plastic waste sites, and illegal mines, respectively. They were created only years apart, and yet each took orders of magnitude less time, resources, and funding than the one before. Where are we going here, and how can we get there faster together?
AI has this thing called foundation models, where we frontload most of the undifferentiated heavy lifting of compute. Foundation models were invented for text data: one pretrains on a large dataset, like Wikipedia text, and then fine-tunes on a smaller dataset for a specific task, like scoring product reviews. The larger dataset gives the model plenty of cases from which to learn patterns, which it can then use to do the task far better, faster, and more efficiently than if it only had access to the smaller dataset. Moreover, the pretraining is universal in the sense that it can be reused for any task. If you want to learn more, read the ULMFiT paper, or the later BERT/GPT papers.
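To make that pattern concrete, here is a minimal sketch of the pretrain-then-fine-tune workflow in PyTorch. The encoder, dimensions, and task are placeholders for illustration, not Clay’s actual code: the expensive pretraining is assumed to be done once, and only a small task head is trained.

```python
import torch
import torch.nn as nn

# --- Hypothetical pretrained encoder (a stand-in, not a real foundation model) ---
# Pretraining (expensive, done once) would have produced these weights from a
# huge unlabeled corpus; here we just instantiate the architecture.
encoder = nn.Sequential(
    nn.Linear(768, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
)
# encoder.load_state_dict(torch.load("pretrained_encoder.pt"))  # frontloaded compute

# Freeze the encoder: the general patterns it learned are reused as-is.
for p in encoder.parameters():
    p.requires_grad = False

# Small task-specific head, e.g. scoring product reviews into 5 classes.
head = nn.Linear(256, 5)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def finetune_step(x, y):
    """One fine-tuning step on a small labeled dataset."""
    with torch.no_grad():          # encoder stays fixed and cheap
        features = encoder(x)
    logits = head(features)
    loss = loss_fn(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy batch: 8 examples of 768-dim inputs, 5 classes.
x = torch.randn(8, 768)
y = torch.randint(0, 5, (8,))
print(finetune_step(x, y))
```

The point is the economics: the frozen encoder carries the reusable knowledge, so each new task only pays for the small head.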
When it comes to Earth data, the approach is even stronger, for both technical and usage reasons. On the technical side, I believe the Earth embedding space will be much sparser than that of text (text has infinite options; Earth data does not — in fact I believe Earth is semantically largely ergodic). On the usage side, most downstream tasks will use the same data sources we used to create the foundation model. I wrote a whole article on this a few weeks ago (which convinced us to fundraise for Clay).
Foundation models work by clustering semantics, but these semantics are numbers, and humans would rather use labels (mine, coast, river, ...). So we have also created an encoder that aligns these mathematical semantics with human labels. It works by pulling human label descriptions of every image (from OpenStreetMap) and creating text embeddings that align with the embeddings created from the image. We also released this "text2earth" as open source, open data.
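As a rough illustration of that alignment idea (a sketch of a CLIP-style contrastive objective, not Clay’s actual text2earth code), matching image/description pairs are pulled together in a shared embedding space while mismatched pairs are pushed apart:

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(img_emb, txt_emb, temperature=0.07):
    """CLIP-style loss: each matching image/text pair (row i, row i) should
    score higher than every mismatched pair in the batch."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.T / temperature       # cosine similarities
    targets = torch.arange(len(img))         # i-th image matches i-th text
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.T, targets)
    return (loss_i2t + loss_t2i) / 2

# Toy batch: 4 chips and 4 OSM-derived descriptions, each already embedded
# into a shared 256-dim space by its respective (hypothetical) encoder.
img_emb = torch.randn(4, 256)
txt_emb = torch.randn(4, 256)
print(contrastive_alignment_loss(img_emb, txt_emb))
```

Once trained this way, a text label like "mine" or "coast" can be embedded and compared directly against image embeddings to search or tag imagery.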
So, the “what” is clear. We also have strong opinions about the “how”. We could have raised funds as a for-profit, or we could have written a paper validating the concept. Both are perfectly valid options that others are exploring. Instead, we decided that creating a non-profit to gather the funds needed to explore, compute, and maintain this AI model was not only the best way to create positive impact for both non-profit and for-profit stakeholders, but also the best opportunity to put the right governance around it. Just as open EO data became an extremely powerful driver of impact, we can contribute an open AI model for EO that maximizes the value of that open data for everyone. Moreover, we can do it in a way that lets for-profits build upon this foundation, getting a multimillion-dollar starter model on day one.
We’re building it openly, with as much community and partner feedback as possible to make the most operational AI for Earth, aimed at delivering real nature and climate value.
Clay is uniquely focused on ease of use:
ChatGPT didn't win because it was technically better; it won because it was extremely easy to use.
We launched Clay at Davos 2024, having raised $4M in philanthropy, and at SatSummit last week we released Clay model v1. Our latest model can ingest data from any instrument at any resolution. It was trained on 70.7 million image chips from 2017 to 2023, with resolutions from 30 m down to 5 cm, and 1-10 images per location.
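How can one model ingest any instrument and resolution? One common approach (an assumption here for illustration, not necessarily Clay’s published pipeline) is to resample every chip to a fixed grid and carry the sensor’s ground sample distance and band wavelengths as metadata, so Sentinel, Landsat, and aerial bands can be handled uniformly:

```python
import torch
import torch.nn.functional as F

def prepare_chip(chip: torch.Tensor, gsd_m: float, wavelengths_nm: list,
                 target_size: int = 256):
    """Hypothetical preprocessing for a sensor-agnostic model.
    chip: (bands, H, W) tensor; gsd_m: ground sample distance in meters."""
    # Resample any native resolution onto a common grid.
    chip = F.interpolate(chip.unsqueeze(0), size=(target_size, target_size),
                         mode="bilinear", align_corners=False).squeeze(0)
    # Keep sensor characteristics as metadata the model can condition on.
    meta = {"gsd": gsd_m, "wavelengths": wavelengths_nm}
    return chip, meta

# A 10 m Sentinel-2 RGB+NIR chip and a 5 cm aerial RGB chip both end up on
# the same grid, distinguished only by their metadata.
s2_chip, s2_meta = prepare_chip(torch.randn(4, 120, 120), 10.0,
                                [490.0, 560.0, 665.0, 842.0])
aerial_chip, aerial_meta = prepare_chip(torch.randn(3, 512, 512), 0.05,
                                        [480.0, 550.0, 640.0])
print(s2_chip.shape, aerial_chip.shape)
```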
It's extremely early, but we’ve already demonstrated promising results such as:
Clay v1 is the best AI model of Earth, partly because there's no agreed definition of what that means.
We are impressed by the potential of Clay v1, yet we also struggle to accurately assess how good it is: there is no agreed benchmark for foundation models in EO. There have been several attempts, but none has been widely adopted. We want to fix that, so we've started a working group to propose and promote such benchmarks. And to fast-track this process, we have also launched a competition where the winning open model gets $10K in compute to make the model even better for everyone.
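One common ingredient of such benchmarks is a linear probe: freeze each candidate model, embed the same labeled dataset with it, and train the same simple classifier on top, so the resulting score is comparable across models. A minimal sketch with scikit-learn, where the embeddings and labels are random placeholders:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder data: in a real benchmark these would be frozen embeddings
# produced by each candidate foundation model on the same labeled dataset
# (e.g. a land-cover set), one row per image chip.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 256))   # model-produced features
labels = rng.integers(0, 10, size=1000)     # e.g. 10 land-cover classes

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.2, random_state=0
)

# Linear probe: a simple classifier on frozen features. Better embeddings
# yield a higher score, so the same probe gives a comparable number
# across different foundation models.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```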
What's next?
Clay is switching modes from "explore" to "exploit". We won't work on Clay v1.5 for a few months; instead we'll focus on making sure the model and the app are indeed as easy to use as possible. We'll work with partners to test the model, to find where it works and where it doesn't.
To help coordinate our first steps here, we are also holding bi-weekly open calls to demo the app, plus general open office hours. Sign up here: https://madewithclay.org/demo
There's a whole Earth to make Clay with.
Postdoctoral Researcher, Technical University of Munich
3 months ago

> "ChatGPT didn't win because it was technically better, it won because it was extremely easy to use."

Completely agree with this!!

> "Dynamic convolution (Doha paper)"

Do you mean DOFA (https://arxiv.org/abs/2403.15356)?

> "There is no agreed benchmark when it comes to foundational EO models."

While that may be the case, there are dozens of widely used benchmark datasets. An imperfect benchmark is better than no benchmark. This announcement does not provide any evidence that Clay works better than the dozens of existing EO FMs that are already free and open source. https://arxiv.org/abs/2405.04285 lists several such EO FMs and benchmark datasets.

> "last week we released Clay model v.1"

Will the source code or model weights be released? We would LOVE to add Clay to TorchGeo (https://github.com/microsoft/torchgeo) alongside DOFA, Sat-MAE, GASSL, SatlasPretrain, and many other EO FMs. This will enable direct comparison on a wide variety of datasets, and make it easier for more scientists to use.
Map Building @ Meta | Data, Product, Growth
3 months ago

Are the labels you pull from OSM mainly for land cover, like that something is a building, or do you try to identify specific things, like whether a building is a school or a library or a hospital? I have some ideas in the works that might be very compatible with this, so perhaps we should arrange a chat!