Side Project: Local LLM-Powered Image Renaming JavaFX Tool
In this article, I will discuss the technical aspects of a personal project that I completed during my job search in Singapore, highlighting challenges and solutions.
Use case: you have a disk full of randomly named images and you want to give them filenames relevant to their content.
Technical details
This article will cover: llama.cpp with Java, JavaFX as GUI, packaging the application and pitfalls along the way.
1. Back-end: llama.cpp + LLaVA
llama.cpp is a free, open-source tool for running Large Language Models locally. It made LLMs practical on consumer-grade PCs by lowering the hardware requirements to run them at decent speeds. Better-known tools such as Ollama, KoboldCpp, LM Studio, llamafile, GPT4All and Jan all build on llama.cpp!
My application launches the llama.cpp server process using Java's ProcessBuilder.start(), with a script that loads LLaVA, a multimodal LLM that can read text and interpret images (using CLIP). LLaVA is in charge of analyzing the image and providing a filename.
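As an illustration of the launch step, here is a minimal sketch that starts the server binary directly instead of going through a script. The class name, the binary path and the model file names are placeholders of mine, and the flags (-m, --mmproj, --host, --port) are the ones I would expect the llama.cpp server to accept; check them against the version you actually run.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not the class used in the project.
final class LlamaServerLauncher {

    /** Builds the command line. All paths are placeholders supplied by the caller. */
    static List<String> buildCommand(Path serverBinary, Path model, Path mmproj, int port) {
        return List.of(
                serverBinary.toString(),
                "-m", model.toString(),        // LLaVA language model file
                "--mmproj", mmproj.toString(), // multimodal projector (the CLIP part)
                "--host", "127.0.0.1",         // listen on localhost only
                "--port", Integer.toString(port));
    }

    /** Starts the process. The caller must keep reading process.getInputStream(). */
    static Process start(List<String> command) throws IOException {
        return new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout: one stream to consume
                .start();
    }
}
```

Merging stderr into stdout with redirectErrorStream(true) is a design choice that leaves a single stream to read, which keeps the logging code simpler.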
Then, I use Java's HttpClient to query the llama.cpp server, providing the Base64-encoded image and a prompt: "A short but descriptive 5-10 words filename in camelCase with no whitespace for this image could be:".
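// Requires: java.net.URI, java.net.http.*, java.net.http.HttpResponse.BodyHandlers, java.io.IOException
HttpClient client = HttpClient.newHttpClient();
try {
    String bodyInString = objectMapper.writeValueAsString(body); // objectMapper is a Jackson ObjectMapper
    HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://127.0.0.1:8080/completion")) // local llama.cpp server (plain HTTP by default)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(bodyInString))
            .build();
    HttpResponse<String> response = client.send(request, BodyHandlers.ofString());
    ResponseBody r = objectMapper.readValue(response.body(), ResponseBody.class);
    // ... build the new filename from r (omitted)
} catch (IOException e) {
    LOGGER.error("Request to llama.cpp failed", e);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // restore the interrupt flag
}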
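The request body itself is not shown in this article, so here is a hedged sketch of how it could be built. It assumes the JSON layout that older llama.cpp server versions documented for images (an image_data array of {data, id} objects, referenced from the prompt with an [img-N] tag) and the n_predict parameter; the class name and the value 32 are mine. To stay dependency-free it assembles the JSON by hand, whereas the real project serializes an object with Jackson.

```java
import java.util.Base64;

// Hypothetical helper, written without Jackson to stay self-contained.
final class CompletionRequestBody {

    /** Escapes quotes, backslashes and control characters for use inside a JSON string. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c)); // control characters
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds the JSON body: the prompt references the image through its id. */
    static String build(byte[] imageBytes, String prompt, int imageId) {
        String base64 = Base64.getEncoder().encodeToString(imageBytes);
        String fullPrompt = "[img-" + imageId + "]" + prompt; // assumed image tag syntax
        return "{\"prompt\":\"" + escapeJson(fullPrompt) + "\","
                + "\"n_predict\":32," // assumed cap on generated tokens; a filename is short
                + "\"image_data\":[{\"data\":\"" + base64 + "\",\"id\":" + imageId + "}]}";
    }
}
```

Base64 output only contains characters that are safe inside a JSON string, so only the prompt needs escaping.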
Pitfall 1: llama.cpp removed image support from its server in early March. Solution: use a version from before March 5 (b2334). Another solution would be to use KoboldCpp instead, a llama.cpp derivative strongly focused on maintaining old features and backward compatibility.
Pitfall 2: LLaVA's answers are not perfect. The new filename may not be in camel case. Solution: use Java to correct it. I settled on the org.apache.commons.text.CaseUtils class from the Apache Commons Text library.
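To show the kind of correction involved, here is a minimal standard-library sketch. It is a stand-in for CaseUtils, not its implementation: the class and method names are hypothetical, and unlike CaseUtils it leaves the letters after the first one of each word untouched, so an answer that is already in camel case passes through unchanged.

```java
// Hypothetical helper: a stdlib-only stand-in for the camel case correction.
final class FileNameSanitizer {

    /** Turns free text into camelCase by splitting on every non-alphanumeric character. */
    static String toCamelCase(String raw) {
        // Spaces, underscores, hyphens, quotes and punctuation all act as word separators
        String[] words = raw.trim().split("[^\\p{L}\\p{N}]+");
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // happens when the text starts with a separator
            }
            if (sb.length() == 0) {
                sb.append(Character.toLowerCase(word.charAt(0))); // first word starts lower-case
            } else {
                sb.append(Character.toUpperCase(word.charAt(0))); // later words start upper-case
            }
            sb.append(word.substring(1)); // rest of the word is kept as-is
        }
        return sb.toString();
    }
}
```

Because every separator is dropped, the result also contains no whitespace or punctuation that would be awkward in a filename; the file extension has to be appended separately.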
Pitfall 3: It's VERY slow, at least 5 minutes per image. LLaVA is CLIP plus an LLM, both loaded at the same time. Solution: don't use the LLM part, use CLIP directly through a Java deep learning library. On the flip side, CLIP alone won't return a filename as pretty or relevant as LLaVA's.
2. Front-end: JavaFX
JavaFX is the successor to Swing for developing Java desktop GUI applications. It's quite nice to use, especially with Scene Builder, which lets you drag and drop components and try them immediately, making interface creation a breeze.
Pitfall 1: JavaFX is no longer bundled with the standard JDK (since JDK 11). Solutions: use the JavaFX Maven/Gradle plugin, or make a fat jar, or use a JDK provider that still includes it (BellSoft's Liberica JDK).
Pitfall 2: To launch the JavaFX application, even in the IDE, we are supposed to run mvn clean javafx:run. It's a bit clunky and slow to start and hard to debug. Solution: create another main class for the quick launches in your IDE:
class AnotherClassInMyMainClass {
    public static void main(String[] args) {
        MyMainClass.main(args);
    }
}
3. Front-end: Logging and interacting with llama.cpp
To show progress, I display the output of the llama.cpp process launched in the background (read through process.getInputStream()) together with the logs of my own program, which I redirect with System.setOut to a custom OutputRedirector class.
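The OutputRedirector class is not shown here, so the following is only a sketch of what such a redirector could look like: an OutputStream that collects bytes until a line break and hands each complete line to a callback. The class name LineRedirector and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical sketch of a redirector usable with System.setOut.
final class LineRedirector extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final Consumer<String> lineConsumer;

    LineRedirector(Consumer<String> lineConsumer) {
        this.lineConsumer = lineConsumer;
    }

    @Override
    public synchronized void write(int b) {
        if (b == '\n') {
            // End of line: decode the buffered bytes and pass the line on
            lineConsumer.accept(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
            buffer.reset();
        } else if (b != '\r') {
            buffer.write(b); // ignore '\r' so Windows line endings work too
        }
    }

    /** Wraps the redirector in an auto-flushing PrintStream suitable for System.setOut. */
    static PrintStream asPrintStream(Consumer<String> lineConsumer) {
        return new PrintStream(new LineRedirector(lineConsumer), true, StandardCharsets.UTF_8);
    }
}
```

It would be installed with System.setOut(LineRedirector.asPrintStream(callback)), where the callback appends the line to a GUI component. Since System.out can be written to from any thread, that callback should wrap its GUI update in Platform.runLater.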
Pitfall 1: When launching a process from Java, you NEED to consume its standard output and error streams, or it will FREEZE at some point, once the output buffers fill up. Solution: read the streams continuously, preferably in another thread. Search for "StreamGobbler" implementations.
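For reference, a typical StreamGobbler looks roughly like the sketch below. This is the common pattern rather than the exact class from my project: a Runnable that reads a stream line by line until it ends and forwards each line to a callback. The startDaemon helper is a convenience I added for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Sketch of the usual StreamGobbler pattern.
final class StreamGobbler implements Runnable {
    private final InputStream inputStream;
    private final Consumer<String> lineConsumer;

    StreamGobbler(InputStream inputStream, Consumer<String> lineConsumer) {
        this.inputStream = inputStream;
        this.lineConsumer = lineConsumer;
    }

    @Override
    public void run() {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(inputStream, StandardCharsets.UTF_8))) {
            String line;
            // readLine() returns null when the process closes its output
            while ((line = reader.readLine()) != null) {
                lineConsumer.accept(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs a gobbler in a daemon thread so it never blocks application shutdown. */
    static Thread startDaemon(InputStream in, Consumer<String> consumer) {
        Thread thread = new Thread(new StreamGobbler(in, consumer), "stream-gobbler");
        thread.setDaemon(true);
        thread.start();
        return thread;
    }
}
```

Usage would be StreamGobbler.startDaemon(process.getInputStream(), callback) right after starting the process, plus a second gobbler on process.getErrorStream() unless stderr was merged with redirectErrorStream(true).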
Pitfall 2: Multithreading. Freezes or errors such as "Not on FX application thread" can appear. Solution: run long operations in a background thread using javafx.concurrent.Task, and use Platform.runLater to update the interface.
private void processFiles(UserGuiInfos userGuiInfos, List<Path> fileList) {
    ExecutorService service = Executors.newSingleThreadExecutor();
    List<RenamingInfos> renamingInfosList = new ArrayList<>();
    for (int i = 0; i < fileList.size(); i++) {
        final int fileNumberFinal = i; // must be effectively final to be captured by the lambda
        final Path p = fileList.get(i);
        LOGGER.info("Processing \"{}\"...", p);
        // GUI updates must run on the JavaFX Application Thread
        Platform.runLater(() -> userGuiInfos.getMainController().updateProgress(
                (double) fileNumberFinal / fileList.size(), p.toString(), fileNumberFinal + "/" + fileList.size()));
        // ... rest of the loop (querying llama.cpp and renaming the file) omitted
    }
}
4. Packaging the application
jlink is a tool that creates a tailored, lightweight-ish Java runtime containing only what your application needs. In my case, the final result is around 70 MB, whereas a full JRE is 200 MB. The JavaFX plugin provides a javafx:jlink goal to help with this task.
Pitfall 1: it produces platform-specific artifacts for JavaFX. So, by default, an artifact built in a GitHub pipeline, on a dockerized Linux, won't work on Windows. Solution: use the javafx.platform property to select the target platform.
Pitfall 2: the JRE is platform-specific. Solution: to create a mini JRE for a given platform, you need the JDK of that target platform, then you point jlink to its jmods folder through the module path. This is how you would do it with jlink manually on the command line: jlink --module-path "path_to_the_targeted_platform_JDK\jmods"
But, using the JavaFX Maven plugin, I simply had to configure it like this:
<plugin>
    <groupId>org.openjfx</groupId>
    <!-- This plugin is mandatory to run/package JavaFx Applications. It also runs jlink. -->
    <artifactId>javafx-maven-plugin</artifactId>
    <version>0.0.8</version>
    <configuration>
        <!-- This configuration is used when we launch the app manually using "mvn javafx:run"-->
        <launcher>PicturesAutoNamer</launcher>
        <jlinkImageName>PicturesAutoNamer</jlinkImageName>
        <mainClass>fr.pan.main.Main</mainClass>
    </configuration>
    <!-- The following executions are used to compile/package the app for distribution -->
    <executions>
        <execution>
            <id>windows-build</id>
            <goals>
                <goal>jlink</goal>
            </goals>
            <phase>package</phase>
            <configuration>
                <stripDebug>true</stripDebug>
                <noHeaderFiles>true</noHeaderFiles>
                <noManPages>true</noManPages>
                <launcher>PicturesAutoNamer</launcher>
                <jlinkImageName>PicturesAutoNamer</jlinkImageName>
                <mainClass>fr.pan.main.Main</mainClass>
                <!-- Here we point to the downloaded windows JDK from mvn-jlink-wrapper and also the modules created by moditect -->
                <jmodsPath>${project.build.directory}${file.separator}jdkCache${file.separator}ADOPTIUM_21u_2024-01-21-02-59_windows_x64_hotspot${file.separator}jmods${path.separator}${project.build.directory}${file.separator}modules</jmodsPath>
                <jlinkImageName>${project.name}-${project.version}-win</jlinkImageName>
                <bindServices>true</bindServices>
                <runtimePathOption>MODULEPATH</runtimePathOption>
            </configuration>
        </execution>
    </executions>
</plugin>
Pitfall 3: not really a pitfall, but to use the JDK of another platform, you first need to download it. Solution: I used the mvn-jlink-wrapper Maven plugin, but you can also just use wget.
Pitfall 4: modules. You have to decide whether your application is modular or not, and you may need to tinker with the runtimePathOption of the JavaFX plugin accordingly.
Pitfall 5: dealing with non-modular dependencies, a.k.a. the "automatic module cannot be used with jlink" error. Solution: use moditect-maven-plugin. It injects a module-info into the JARs that don't have one. It's a bit hacky but hey, it works. The following example adds a module-info descriptor to the non-modular library "imgscalr", which I use to resize images:
<plugin>
    <!-- Plugin needed to avoid the error "automatic module cannot be used with jlink": it injects module info into dependencies that don't have it -->
    <groupId>org.moditect</groupId>
    <artifactId>moditect-maven-plugin</artifactId>
    <version>1.0.0.RC2</version>
    <executions>
        <execution>
            <id>add-module-infos</id>
            <phase>generate-resources</phase>
            <goals>
                <goal>add-module-info</goal>
            </goals>
            <configuration>
                <overwriteExistingFiles>true</overwriteExistingFiles>
                <outputDirectory>${project.build.directory}${file.separator}modules</outputDirectory>
                <modules>
                    <module>
                        <artifact>
                            <groupId>org.imgscalr</groupId>
                            <artifactId>imgscalr-lib</artifactId>
                            <version>4.2</version>
                        </artifact>
                        <!-- I generated the module-info using "jdeps.exe dashdash generate-module-info out thumbnailator-0.4.20.jar" and copy/pasted it here ("dashdash" stands for a double hyphen, which is not allowed inside an XML comment) -->
                        <moduleInfoSource>
                            module imgscalr.lib {
                                requires transitive java.desktop;
                                exports org.imgscalr;
                            }
                        </moduleInfoSource>
                    </module>
                </modules>
            </configuration>
        </execution>
    </executions>
</plugin>
This is the end of this long article. I hope it will be useful.
PS: as a reminder, I'm currently looking for a job in Singapore as a Java/JavaScript developer.
Reader comment (Founder & CEO at 杭州福强科技有限公司, 5 months ago): For concurrency, the facilities in the javafx.concurrent package are preferred; with them you don't need Platform.runLater ;)