Reactive Java + Blocking Code

We have a backend system written with Spring Boot 3.x on a full reactive Java stack. For the most part it's doing well, but one part of it causes latency spikes that I have been trying to solve for the past month or so.

This problem, which has been living in my head rent-free for the past month, has been a very hard one. I've come across bottlenecks in these applications before that took me a few hours, maybe a few days, to figure out and address. For a lot of those bottlenecks, tools like YourKit are more than enough to identify the cause.

But like I said, this one is different.

How the problem surfaced is worth an article of its own, and for now it's not that relevant. Basically, our backend system needed to integrate a library that reads some score data from a datastore (an I/O operation). The library is written without the reactive stack, meaning its calls block, and we can simply define it as this:

public abstract class SomeDataLibrary {
  // Blocking call: performs the datastore I/O on the calling thread
  abstract SomeData getMyData();
}

And to integrate this in our application, we simply wrapped it with Mono::fromCallable:

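import lombok.RequiredArgsConstructor;
import reactor.core.publisher.Mono;

@RequiredArgsConstructor
public class ReactiveSomeDataLibrary {
  private final SomeDataLibrary source;

  // Defers the blocking call until subscription; it still runs on the subscribing thread
  Mono<SomeData> getMyData() {
    return Mono.fromCallable(() -> source.getMyData());
  }
}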

And to use the library:

@RequiredArgsConstructor
public class ConvertedDataSource {
  private final ReactiveSomeDataLibrary rsource;
  private Mono<ConvertedData> getConvertedData() {
    return rsource.getMyData()
      .map(d -> ConvertedData.fromSomeData(d));
  }
}        

SomeDataLibrary::getMyData would often cause latency spikes depending on traffic volume, which in itself is nothing out of this world. But our requirement is to respond within a couple of hundred milliseconds, and a spike from the library can span multiple seconds.

Now this is the kind of thing that's not going to be obvious with YourKit or even BlockHound, because there's no thread contention and it only happens occasionally, under high load. But when it happens...boy, the effects are near-catastrophic.

The first step to mitigate this is to use a reactive timeout:

import java.time.Duration;
import lombok.RequiredArgsConstructor;
import reactor.core.publisher.Mono;

@RequiredArgsConstructor
public class ConvertedDataSource {
  private final ReactiveSomeDataLibrary rsource;
  private Mono<ConvertedData> getConvertedData() {
    return rsource.getMyData()
      .map(d -> ConvertedData.fromSomeData(d))
      .timeout(Duration.ofMillis(200));
  }
}

It's not enough. Can you see the problem?

Since ReactiveSomeDataLibrary simply wraps the blocking code, the blocking call runs on whichever thread subscribes, which here is one of the reactive threads. For code that's natively reactive, the timeout will cut off at 200ms, but in the case above it will not: the timeout cannot interrupt a call that is holding the thread. If SomeDataLibrary::getMyData takes 5 seconds, that block will take 5 seconds and only then return a timeout error.
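The same trap can be reproduced without Reactor. What follows is a minimal plain-JDK sketch, not a picture of Reactor's internals: CompletableFuture.orTimeout stands in for the timeout operator, and slowBlockingCall is a hypothetical substitute for SomeDataLibrary::getMyData. Because the blocking call runs on the caller's own thread, the timeout fires in the background but the caller only finds out once the call has returned.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class InlineBlockingTimeoutDemo {

    /** Outcome of one run: how long the caller waited and whether it saw a timeout. */
    public static final class Result {
        public final long elapsedMillis;
        public final boolean timedOut;

        Result(long elapsedMillis, boolean timedOut) {
            this.elapsedMillis = elapsedMillis;
            this.timedOut = timedOut;
        }
    }

    /** Hypothetical stand-in for SomeDataLibrary::getMyData: blocks the calling thread. */
    static String slowBlockingCall(long workMillis) {
        try {
            Thread.sleep(workMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "data";
    }

    /** Arms a timeout, then runs the blocking call on the caller's own thread. */
    public static Result runInline(long workMillis, long timeoutMillis) {
        long start = System.nanoTime();
        CompletableFuture<String> future =
                new CompletableFuture<String>().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);

        // The timeout fires on a JDK background thread while this thread is still stuck here.
        future.complete(slowBlockingCall(workMillis));

        boolean timedOut = false;
        try {
            future.join();
        } catch (CompletionException e) {
            timedOut = e.getCause() instanceof TimeoutException;
        }
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new Result(elapsedMillis, timedOut);
    }

    public static void main(String[] args) {
        Result r = runInline(1000, 200);
        System.out.println("timedOut=" + r.timedOut + ", elapsedMillis=" + r.elapsedMillis);
    }
}
```

With a 1-second call and a 200ms timeout, the run is reported as timed out, yet the caller still waited for roughly the full second.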

This is where the Eureka moment (cue Captain Picard facepalm meme) finally sets in: I need to have the blocking code executing in a different thread pool!

Fortunately, with Java reactive there are simple enough ways to do this without writing a lot of code, so now this becomes:

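import java.time.Duration;
import lombok.RequiredArgsConstructor;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

@RequiredArgsConstructor
public class ConvertedDataSource {
  private final ReactiveSomeDataLibrary rsource;
  private Mono<ConvertedData> getConvertedData() {
    return rsource.getMyData()
      // subscribeOn (not publishOn) moves the subscription, and with it the
      // blocking callable, onto the boundedElastic pool
      .subscribeOn(Schedulers.boundedElastic())
      .map(d -> ConvertedData.fromSomeData(d))
      .timeout(Duration.ofMillis(200));
  }
}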

Voila! It solved our latency issue.
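Why a separate pool makes the difference can be sketched in plain-JDK terms as well. This is an analogy under stated assumptions rather than the Reactor code itself: a small fixed pool (blockingPool, a made-up name) plays the role of the dedicated scheduler, CompletableFuture.orTimeout plays the role of the timeout, and slowBlockingCall is again a hypothetical stand-in for the library call. Since the caller is no longer the thread doing the blocking work, it can give up as soon as the timeout fires.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class OffloadedBlockingTimeoutDemo {

    /** Outcome of one run: how long the caller waited and whether it saw a timeout. */
    public static final class Result {
        public final long elapsedMillis;
        public final boolean timedOut;

        Result(long elapsedMillis, boolean timedOut) {
            this.elapsedMillis = elapsedMillis;
            this.timedOut = timedOut;
        }
    }

    /** Hypothetical stand-in for SomeDataLibrary::getMyData: blocks the calling thread. */
    static String slowBlockingCall(long workMillis) {
        try {
            Thread.sleep(workMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "data";
    }

    /** Runs the blocking call on a dedicated pool so the caller can give up on time. */
    public static Result runOffloaded(long workMillis, long timeoutMillis) {
        ExecutorService blockingPool = Executors.newFixedThreadPool(2);
        long start = System.nanoTime();
        boolean timedOut = false;
        try {
            CompletableFuture
                    .supplyAsync(() -> slowBlockingCall(workMillis), blockingPool)
                    .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                    .join(); // the caller waits here, but only until the timeout fires
        } catch (CompletionException e) {
            timedOut = e.getCause() instanceof TimeoutException;
        } finally {
            // Interrupt the still-running worker so the demo JVM can exit promptly.
            blockingPool.shutdownNow();
        }
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new Result(elapsedMillis, timedOut);
    }

    public static void main(String[] args) {
        Result r = runOffloaded(1000, 200);
        System.out.println("timedOut=" + r.timedOut + ", elapsedMillis=" + r.elapsedMillis);
    }
}
```

One thing this sketch makes visible: the timeout frees the caller, not the worker. Without the shutdownNow call used here for cleanup, the abandoned call would keep occupying a pool thread until it returned, which is a good reason to keep the pool for blocking work bounded.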

It reminded me so much of this video which I saw going viral a few years back: https://youtu.be/vJnAowdUyK4?si=_00H36Fcv-WrvPQ_





More articles by Dexter Legaspi, MSc