Into the Forest I Go (Again) - Part 1

This is an update of something I worked on a few years back. At the time, Colab's pricing was still reasonable and EfficientDet object detection models were still hot. This time around I used the (almost) current YOLOv10 model (YOLO11 just came out, but it looks like it should have been called YOLO10.1) and ClaudeAI to format the input files appropriately. Since Colab is now about as reliable as one of the late Tom Magliozzi's cars, I've switched to Kaggle Notebooks.

The problem is simple enough. Sweden has extensive larch forests in the province of Västergötland. The larch trees themselves were introduced from the Alps nearly 300 years ago, but they have now started to fall victim to an invasive pest of their own, a small moth whose larval stage likes to bore into their needles. After a few years of this indelicate treatment the tree's growth is stunted and it becomes vulnerable to other, even less pleasant, infections.

The above should be enough to convince you that detecting infection is of more than academic interest. Like most insect pests, the larva is invisible but its effects are not. The data consists of aerial, presumably drone, photos of several sections of forest. Trees are annotated as healthy (H), low damage (LD), high damage (HD), or other (O) if they're a different species. From a human perspective the problem reduces to detecting changes in color. Healthy trees are a light green. Trees with low damage have some brown but usually still have green in the central crown. Heavily damaged trees are almost all brown. Other species are usually a deeper green. Each image contains a lot of trees, about 65 on average and often over a hundred. The images are good sized too, at 1500x1500 pixels. About 60% of the trees are low damage larch, 16% are badly damaged larch, 20% are something else, and just under 5% are healthy larch.

A fuller description of the problem is in the original writeup.

Along with the images there are XML files that provide the annotations in a reasonably good attempt at PASCAL VOC format. Common as that format is, it's useless to YOLO, which expects its own plain-text label layout. Given the way the data was structured, one folder for each location, services like Roboflow for file conversion weren't entirely practical. Nor were they necessary when I could just tell Claude to convert VOC to COCO and then COCO to YOLO, and have it do the train/val/test split at the same time. For those interested, Claude's work can be found here.
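The core of that conversion is just a coordinate change: VOC stores corner pixel coordinates, while YOLO wants normalized center coordinates. Here's a minimal sketch of that step; the class list and file layout are my assumptions, and Claude's actual script also went through COCO and handled the train/val/test split.

```python
# Minimal sketch of the VOC -> YOLO annotation conversion.
# CLASSES order is an assumption; the real script may differ.
import xml.etree.ElementTree as ET

CLASSES = ["H", "LD", "HD", "O"]  # healthy, low damage, high damage, other

def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert VOC corner coordinates to YOLO's normalized center format."""
    cx = (xmin + xmax) / 2 / img_w   # box center x, as a fraction of width
    cy = (ymin + ymax) / 2 / img_h   # box center y, as a fraction of height
    w = (xmax - xmin) / img_w        # box width, normalized
    h = (ymax - ymin) / img_h        # box height, normalized
    return cx, cy, w, h

def convert_annotation(xml_path):
    """Yield one 'class cx cy w h' YOLO label line per annotated tree."""
    root = ET.parse(xml_path).getroot()
    img_w = int(root.find("size/width").text)
    img_h = int(root.find("size/height").text)
    for obj in root.findall("object"):
        cls = CLASSES.index(obj.find("name").text)
        box = obj.find("bndbox")
        coords = [int(float(box.find(k).text))
                  for k in ("xmin", "ymin", "xmax", "ymax")]
        cx, cy, w, h = voc_box_to_yolo(*coords, img_w, img_h)
        yield f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

One label file per image, one line per tree; that's all YOLO needs.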

With those preliminaries out of the way (and isn't it nice that Claude can do the parsing grunt work now) I could start training the model. YOLOv10 comes in six sizes, although once you get past 10b (for balanced) the performance gains may not be worth the extra time and memory. I've focused on the smallest model, 10n (for nano), since it's fast enough for quick experiments. When I did go from 10n to 10b the improvement was modest considering the roughly 10-fold increase in parameters (+2 mAP@50 and +2.3 mAP@50-95 after 100 epochs with identical training parameters).

What did I use as training parameters? I stuck with the default learning rate and learning rate scheduler; all else being equal, I assumed Ultralytics knew what they were doing. There are some default Albumentations they don't let me turn off (Blur and ToGray) that are probably not helpful for this problem but don't seem to do any damage either. I didn't mess with the default image size, although since the 1500x1500 originals get downsampled to 640x640 it might be worth cutting each image into four tiles instead. I left the mosaic and, somewhat reluctantly, erasing settings at their defaults. I've never believed in color augmentations, except adjusting brightness, especially when the problem really reduces to detecting color changes, so the hue, saturation, and brightness randomizers all get set to 0. Flips stay, since there's nothing special about the image orientation, as do scaling and the default translations. I eventually added mixup (i.e. averaging two images together) with probability 0.3 (based off a paper I found), which did improve overall mAP.
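In the Ultralytics API those choices come down to a handful of keyword overrides on the train call. A sketch of what that looks like, with the dataset config path and epoch count as placeholders:

```python
# Training overrides mirroring the settings described above:
# color jitter off, mixup at 0.3, everything else left at defaults.
# "larch.yaml" is a placeholder dataset config, not the real file name.
train_args = dict(
    data="larch.yaml",   # hypothetical dataset config file
    epochs=100,
    imgsz=640,           # default image size
    hsv_h=0.0,           # hue randomization off
    hsv_s=0.0,           # saturation randomization off
    hsv_v=0.0,           # brightness randomization off
    mixup=0.3,           # average pairs of images 30% of the time
)

# Actual training (requires the ultralytics package and a GPU budget):
# from ultralytics import YOLO
# model = YOLO("yolov10n.pt")
# model.train(**train_args)
```

Flips, scaling, translation, mosaic, and erasing aren't listed because their defaults were kept.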


A training batch from the best model to date.

That's enough rambling. What about results? First off, I should say something about the bounding boxes. Accuracy for the bounding boxes themselves is fantastic. Running a model with just a single class (i.e. just a tree detector) got me an mAP@50 of 94.5 and an mAP@50-95 of 62.6. Almost all the error is in classification.
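The single-class experiment doesn't need a new dataset, just the labels with every class collapsed to one. A hedged sketch, assuming YOLO-format label files with one `class cx cy w h` line per tree (the directory path is a placeholder):

```python
# Rewrite every YOLO label file so all four classes collapse to class 0,
# turning the dataset into one for a pure "tree detector".
from pathlib import Path

def collapse_to_single_class(label_dir):
    """Overwrite each label file in place, keeping boxes, dropping classes."""
    for label_file in Path(label_dir).glob("*.txt"):
        lines = label_file.read_text().splitlines()
        collapsed = ["0 " + line.split(maxsplit=1)[1]
                     for line in lines if line.strip()]
        label_file.write_text("\n".join(collapsed) + "\n")
```

Train on the rewritten labels and any remaining error has to be localization, which is how the classification error can be isolated.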

Most of the metrics track instance frequency. Given their unfortunate rarity, healthy trees have the lowest mAP. The majority low damage class performs about as well as the generic tree detector, and high damage also performs well.

To improve the accuracy of the less frequent classes I tried adjusting the focal loss weight, but the default value of 1.5 worked best. Ultralytics does seem to know what they're doing. Adding mixup did improve accuracy; mAP for healthy trees had been 64.2 without it.

I am actually happy with this result. Despite being just over a third the size of EfficientDet-D1, YOLOv10n slightly outperformed the D1 models I built three years ago. And it took a lot less work.


Come back later for part 2, where I provide the results of YOLOv10b and a deeper dive into what I think the models are, and are not, doing well.
