Heatmaps: FiftyOne Computer Vision Tips and Tricks – Oct 6, 2023
Author: Daniel G., Machine Learning Engineer at Voxel51
Welcome to our weekly FiftyOne tips and tricks blog, where we cover interesting workflows and features of FiftyOne! This week we are taking a look at heatmaps. We will cover the basics of creating a heatmap in FiftyOne and one real-world application of heatmaps.
Wait, what’s FiftyOne?
FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.
Ok, let’s dive into this week’s tips and tricks! Also, feel free to follow along in our notebook or on YouTube!
What exactly is a Heatmap?
A heatmap in computer vision is a visual representation often used to highlight areas of interest or intensity within an image. It typically assigns color gradients to different regions of the image based on the intensity or significance of certain features, such as object boundaries, key points, or activation values from neural networks.
Darker regions usually indicate lower intensity or lesser importance, while brighter or more vibrant areas signify higher intensity or greater significance for the task at hand. This helps in understanding and analyzing an image’s content and structure, especially for tasks like object detection, pose estimation, or saliency mapping.
In FiftyOne, heatmaps can be created in one of two ways: you can either point to the location of a map saved on disk, or create a map in memory and load that instead. Check out how both are implemented below:
import cv2
import numpy as np
import fiftyone as fo

# Example heatmap: random 8-bit intensities written to disk
map_path = "/tmp/heatmap.png"
_map = np.random.randint(256, size=(128, 128), dtype=np.uint8)
cv2.imwrite(map_path, _map)

sample = fo.Sample(filepath="/path/to/image.png")

# Load with a saved map
sample["heatmap1"] = fo.Heatmap(map_path=map_path)

# Load from map in memory
sample["heatmap2"] = fo.Heatmap(map=_map)

print(sample)
Note that in the example above, the heatmap is a two-dimensional array, where each value is the intensity of the heatmap at that pixel location. You can also specify the range of your intensity values by passing the range argument.
sample["heatmap1"] = fo.Heatmap(map_path=map_path, range=[-10,10])
With the code above, pixels with the value +9 will be rendered with 90% opacity, and pixels with the value -3 will be rendered with 30% opacity.
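The two figures above are consistent with opacity scaling with the magnitude of a value relative to the largest magnitude in the range. Here is a minimal sketch of that arithmetic. It is an illustration only, not FiftyOne's rendering code, and opacity_for is a hypothetical helper name introduced for this example.

```python
def opacity_for(value, value_range):
    """Hypothetical helper: opacity as |value| / max(|low|, |high|), capped at 1.0.

    This mirrors the +9 -> 90% and -3 -> 30% examples for range=[-10, 10];
    it is an assumption about the mapping, not FiftyOne's actual code.
    """
    low, high = value_range
    scale = max(abs(low), abs(high))
    if scale == 0:
        return 0.0
    return min(abs(value) / scale, 1.0)


print(opacity_for(9, (-10, 10)))   # 0.9
print(opacity_for(-3, (-10, 10)))  # 0.3
```

Under this reading, the sign of a value does not change how opaque the pixel is; only its distance from zero does.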
Pose Estimation with Heatmaps
One of the most popular ways to use heatmaps today is with pose estimation. Pose estimation is a computer vision task where the goal is to detect the position and orientation of a person or an object. Usually, this is done by predicting the locations of specific keypoints such as the hands, head, and elbows.
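To make the link between keypoints and heatmaps concrete, here is a minimal sketch of reading a keypoint location off the peak of a single heatmap. The heatmap is synthetic (a Gaussian bump), and gaussian_heatmap and decode_keypoint are hypothetical helpers written for this post, not part of FiftyOne or any pose model.

```python
import numpy as np


def gaussian_heatmap(height, width, center_xy, sigma=2.0):
    """Synthetic heatmap with a Gaussian peak at center_xy = (x, y)."""
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def decode_keypoint(heatmap):
    """Return (x, y, score) of the strongest response in a 2D heatmap."""
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(x), int(y), float(heatmap[y, x])


heatmap = gaussian_heatmap(64, 48, center_xy=(30, 12))
print(decode_keypoint(heatmap))  # (30, 12, 1.0)
```

Real models typically add refinements on top of a plain argmax, such as sub-pixel offsets or grouping keypoints into people, which this sketch leaves out.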
Heatmaps are a great localization tool to use when trying to create keypoint skeletons within your images. One model that uses a heatmap-to-skeleton approach is the SWAHR-HumanPose model from CVPR 2021. A model deserving of a post of its own, SWAHR produces high-quality results with impressive inference times. Up next, we’ll take a look at the heatmaps it generates and how we can visualize them in FiftyOne.
Installation
The repo for the model unfortunately has not stood the test of time as well as the model’s results, and it can be tricky to set up. If you are interested in getting it running, I included additional steps in the notebook. They include instructions on rolling back some libraries like numpy and torch, as well as building some C++ libraries, to get it all to work.
You will also need to download one of the pretrained models linked here. I used pose_higher_hrnet_w32_512.pth in my example. Make sure the model is placed in the local ./models directory inside the repo.
Generating Heatmaps
We will be using the provided script dist_inference.py with a few modifications to save our heatmaps to disk. The script takes several command line options, most importantly the model file, the image directory, and the save directory. On execution, the model will generate 18 different heatmaps, one for each keypoint of interest in the skeleton. We will then combine all of these heatmaps into one master heatmap and save it to disk. Let’s take a look at the main steps in these code snippets:
import os

import cv2
import numpy as np
import torch
import torchvision.transforms  # used to build the `transforms` pipeline applied below

# First, iterate through your images to set up each inference
for i, img_path in enumerate(img_list):
    image_name = img_path.split('/')[-1].split('.')[0]
    image = cv2.imread(
        img_path,
        cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION
    )
    ...
After setting up a loop for inference, we can run the model and parse the results.
# size at scale 1.0
base_size, center, scale = get_multi_scale_size(
    image, cfg.DATASET.INPUT_SIZE, 1.0, min(cfg.TEST.SCALE_FACTOR)
)

# At several scale factors, run the model
with torch.no_grad():
    final_heatmaps = None
    tags_list = []
    for idx, s in enumerate(sorted(cfg.TEST.SCALE_FACTOR, reverse=True)):
        input_size = cfg.DATASET.INPUT_SIZE
        image_resized, center, scale = resize_align_multi_scale(
            image, input_size, s, min(cfg.TEST.SCALE_FACTOR)
        )
        image_resized = transforms(image_resized)
        image_resized = image_resized.unsqueeze(0).cuda()

        outputs, heatmaps, tags = get_multi_stage_outputs(
            cfg, model, image_resized, cfg.TEST.FLIP_TEST,
            True, base_size,
        )

        # Accumulate heatmaps and tags across scales
        final_heatmaps, tags_list = aggregate_results(
            cfg, s, final_heatmaps, tags_list, heatmaps, tags
        )
Running our basic inference loop leaves us with our final_heatmaps object. Normally, this final_heatmaps array would be passed on to create the keypoint skeleton. However, in our case, we are interested in visualizing just our heatmaps. Therefore, we use the provided function make_heatmaps to grab our heatmaps in a form that is easier to understand. We then combine all of our heatmaps by taking the pointwise maximum.
images, heatmaps = make_heatmaps(image, final_heatmaps[0])

# Combine the per-keypoint heatmaps with a pointwise maximum
master_heatmap = heatmaps[0]
for x in heatmaps:
    master_heatmap = np.maximum(master_heatmap, x)
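To see what the pointwise maximum does, here is a small sketch on toy arrays. The values and the keypoint names in the comments are made up for illustration and are not output from the model.

```python
import numpy as np

# Stand-in for per-keypoint heatmaps: shape (num_keypoints, height, width)
heatmaps = np.array([
    [[0.9, 0.1], [0.0, 0.0]],  # e.g. a "nose" channel
    [[0.0, 0.2], [0.0, 0.7]],  # e.g. a "left wrist" channel
    [[0.1, 0.0], [0.3, 0.0]],  # e.g. a "right ankle" channel
])

# Same result as looping with np.maximum: each pixel keeps its strongest response
master_heatmap = heatmaps.max(axis=0)
print(master_heatmap)
# [[0.9 0.2]
#  [0.3 0.7]]
```

Taking the maximum rather than the mean keeps each keypoint's peak at full strength in the combined map, since a keypoint usually lights up in only one channel.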
Finally, with our master heatmap ready, our last step is to resize back to the original image size and save to disk!
# cv2.resize expects the target size as (width, height)
resized_m_heatmap = cv2.resize(master_heatmap, (image.shape[1], image.shape[0]))
cv2.imwrite(
    os.path.join(save_dir, "{}_heatmap.png".format(image_name)),
    resized_m_heatmap
)
With our code laid out, we can generate heatmaps for the quickstart dataset. To reproduce our example, run the following command:
python3 tools/dist_inference.py --world_size 1 --img_dir ~/fiftyone/quickstart/data/ --save_dir output_quick --cfg experiments/coco/higher_hrnet/w32_512_adam_lr1e-3.yaml TEST.MODEL_FILE models/pose_higher_hrnet_w32_512.pth TEST.SCALE_FACTOR '[0.5, 1.0, 2.0]'
Congrats! With our new heatmaps saved to disk, we can now load them back into FiftyOne. Take a moment to switch your virtual environment back to one with the latest FiftyOne version, and let’s load in our new heatmaps.
Visualizing SWAHR Heatmaps
To start, we load the quickstart dataset that we ran inference on.
import os
import fiftyone as fo
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")
After our dataset is loaded, we can match each sample’s filepath to the filename of its saved heatmap to load in our new heatmaps:
dataset.compute_metadata()

for sample in dataset:
    filepath = sample.filepath
    base_name = os.path.basename(filepath)
    name, extension = os.path.splitext(base_name)
    heatmap = fo.Heatmap(map_path="/path/to/SWAHR-HumanPose/output_quick/" + name + "_heatmap.png")
    sample["heatmap"] = heatmap
    sample.save()

session = fo.launch_app(dataset)
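One detail worth checking when matching files this way: the inference loop builds the image name with split('.')[0], while the loading loop uses os.path.splitext. The two agree for filenames with a single extension but differ when a filename contains extra dots. Here is a small sketch of the difference. heatmap_path_for and name_from_inference_script are hypothetical helpers written for this example, and the file paths are made up.

```python
import os


def heatmap_path_for(image_path, heatmap_dir):
    """Hypothetical helper: expected heatmap path, using the loading loop's naming."""
    name, _ = os.path.splitext(os.path.basename(image_path))
    return os.path.join(heatmap_dir, name + "_heatmap.png")


def name_from_inference_script(image_path):
    """Mirrors the name logic used in the inference loop."""
    return image_path.split('/')[-1].split('.')[0]


# Single extension: both approaches produce the same name
print(name_from_inference_script("/data/000880.jpg"))                  # 000880
print(os.path.basename(heatmap_path_for("/data/000880.jpg", "out")))   # 000880_heatmap.png

# Extra dot in the filename: the names diverge
print(name_from_inference_script("/data/cat.v2.jpg"))                  # cat
print(os.path.basename(heatmap_path_for("/data/cat.v2.jpg", "out")))   # cat.v2_heatmap.png
```

If your own images have dots in their names, using the same naming rule on both sides, or checking os.path.exists before attaching a heatmap, avoids pointing a sample at a file that was never written.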
After launching the App, we can see that our heatmaps have been added! Notably, the model used to generate the heatmaps appears to have a very low false positive rate: we do not see any heatmaps on non-person samples.
Looking more closely at a person sample, we can begin to see how the heatmap localizes each keypoint in our image. In the sample below, we can see various parts highlighted, such as the nose, eyes, ears, arms, and legs.
The model even performs well from unusual angles and obstructed views!
See Heatmaps in Action!
Conclusion
Visualizing heatmaps on images enhances the computer vision workflow by providing a clear and intuitive representation of important features and regions within an image. It enables a deeper understanding of the model’s focus and decision-making, aiding in model interpretation, validation, and refinement. Using FiftyOne, heatmap visualization can help identify the areas that the model is emphasizing, allowing for targeted improvements and fine-tuning that can improve the model’s accuracy and performance. Moreover, these visualizations facilitate effective communication and collaboration among researchers, developers, and stakeholders, streamlining the development and deployment of computer vision applications!
Join the FiftyOne Community!
Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!
What’s Next?