Reconstructing Russian History in Color
Satya Mallick
CEO @ OpenCV | BIG VISION Consulting | AI, Computer Vision, Machine Learning
In this post we will embark on a fun and entertaining journey into the history of color photography while learning about image alignment in the process. This post is dedicated to the early pioneers of color photography who have enabled us to capture and store our memories in color.
A Brief and Incomplete History of Color Photography
Figure 1. The ribbon photographed by Maxwell and Sutton, the first color photo ever taken, created by superimposing three grayscale images.
The idea that you can take three different photos using three primary color filters (Red, Green, Blue) and combine them to obtain a color image was first proposed by James Clerk Maxwell ( yes, the Maxwell ) in 1855. Six years later, in 1861, English photographer Thomas Sutton produced the first color photograph by putting Maxwell’s theory into practice. He took three grayscale images of a ribbon ( see Figure 1 ), using three different color filters, and then superimposed the images using three projectors equipped with the same color filters. The photographic material available at that time was sensitive to blue light, but not very sensitive to green light, and almost insensitive to red light. Although revolutionary for its time, the method was not practical.
By the early 1900s, the sensitivity of photographic material had improved substantially, and in the first decade of the century a few different practical cameras were available for color photography. Perhaps the most popular among these cameras, the Autochrome, was invented by the Lumière brothers.
A competing camera system was designed by Adolf Miethe and built by Wilhelm Bermpohl, and was called Professor Dr. Miethe’s Dreifarben-Camera. In German the word “Dreifarben” means tri-color. This camera, also referred to as the Miethe-Bermpohl camera, had a long glass plate on which the three images were acquired with three different filters (see Figure 2). A very good description and an image of the camera can be found here.
Figure 2: Three images captured on a vertical glass plate by a Miethe-Bermpohl camera.
In the hands of Sergey Prokudin-Gorskii, the Miethe-Bermpohl camera ( or a variant of it ) would secure a special place in Russian history. In 1909, with funding from Tsar Nicholas II, Prokudin-Gorskii started a decade-long journey of capturing Russia in color! He took more than 10,000 color photographs. The most notable among his photographs is the only known color photo of Leo Tolstoy.
Fortunately for us, the Library of Congress purchased a large collection of Prokudin-Gorskii’s photographs in 1948. They are now in the public domain and we get a chance to reconstruct Russian history in color!
It is not trivial to generate a color image from these black and white images (shown in Figure 2). The Miethe-Bermpohl camera was a mechanical device that took these three images over a span of 2-6 seconds. Therefore the three channels were often misaligned, and naively stacking them up leads to a pretty unsatisfactory result.
Well, it’s time for some vision magic!
Motion models in OpenCV
OpenCV is an open source computer vision library and in this post we will use it for image alignment.
In a typical image alignment problem we have two images of a scene, and they are related by a motion model. Different image alignment algorithms aim to estimate the parameters of these motion models using different tricks and assumptions. Once these parameters are known, warping one image so that it aligns with the other is straightforward.
Let’s quickly see what these motion models look like.
Figure 3. This shows how the image of a square gets transformed by different motion models.
The OpenCV constants that represent these models have a prefix MOTION_ and are shown inside the brackets.
- Translation ( MOTION_TRANSLATION ) : The first image can be shifted ( translated ) by (x, y) to obtain the second image. There are only two parameters, x and y, that we need to estimate.
- Euclidean ( MOTION_EUCLIDEAN ) : The first image is a rotated and shifted version of the second image. So there are three parameters: x, y, and angle. You will notice in Figure 3 that when a square undergoes a Euclidean transformation, the size does not change, parallel lines remain parallel, and right angles remain unchanged.
- Affine ( MOTION_AFFINE ) : An affine transform is a combination of rotation, translation ( shift ), scale, and shear. This transform has six parameters. When a square undergoes an Affine transformation, parallel lines remain parallel, but lines meeting at right angles no longer remain orthogonal.
- Homography ( MOTION_HOMOGRAPHY ) : All the transforms described above are 2D transforms. They do not account for 3D effects. A homography transform, on the other hand, can account for some 3D effects ( but not all ). This transform has eight parameters. A square, when transformed using a homography, can change into any quadrilateral.
In OpenCV an Affine transform is stored in a 2 x 3 matrix. Translation and Euclidean transforms are special cases of the Affine transform: a Translation involves no rotation, scaling, or shear, and a Euclidean transform adds rotation but still no scaling or shear. So Translation and Euclidean transforms are also stored in a 2 x 3 matrix. Once this matrix is estimated ( as we shall see in the next section ), the images can be brought into alignment using the function warpAffine.
Homography, on the other hand, is stored in a 3 x 3 matrix. Once the Homography is estimated, the images can be brought into alignment using warpPerspective.
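As a quick illustration, here is a minimal Python sketch of applying the two kinds of warps (the file name and the matrix values below are placeholders, not code from this post):

```python
import cv2
import numpy as np

# Read an image (placeholder file name)
im = cv2.imread("image.jpg")
height, width = im.shape[:2]

# A 2 x 3 affine matrix: identity rotation/scale plus a shift of (10, 20) pixels.
# Translation and Euclidean warps use the same 2 x 3 shape.
affine_matrix = np.array([[1, 0, 10],
                          [0, 1, 20]], dtype=np.float32)
im_affine = cv2.warpAffine(im, affine_matrix, (width, height))

# A 3 x 3 homography: here the same shift written as a projective transform.
homography = np.array([[1, 0, 10],
                       [0, 1, 20],
                       [0, 0, 1]], dtype=np.float32)
im_homography = cv2.warpPerspective(im, homography, (width, height))
```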
Image Registration using Enhanced Correlation Coefficient (ECC) Maximization
The ECC image alignment algorithm introduced in OpenCV 3 is based on a 2008 paper titled Parametric Image Alignment using Enhanced Correlation Coefficient Maximization by Georgios D. Evangelidis and Emmanouil Z. Psarakis. They propose using a new similarity measure called Enhanced Correlation Coefficient (ECC) for estimating the parameters of the motion model. There are two advantages of using their approach.
- Unlike the traditional similarity measure of difference in pixel intensities, ECC is invariant to photometric distortions in contrast and brightness.
- Although the objective function is a nonlinear function of the parameters, the iterative scheme they develop to solve the optimization problem is linear. In other words, they took a problem that looks computationally expensive on the surface and found a simpler way to solve it iteratively.
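For the mathematically curious, the quantity being maximized can be written (simplifying the paper's notation a little) as the correlation coefficient between the zero-mean versions of the reference image and the warped image, each stacked into a vector:

```latex
\rho(\mathbf{p}) \;=\;
\frac{\bar{\mathbf{i}}_r^{\top}\, \bar{\mathbf{i}}_w(\mathbf{p})}
     {\lVert \bar{\mathbf{i}}_r \rVert \,\lVert \bar{\mathbf{i}}_w(\mathbf{p}) \rVert}
```

Here p holds the motion parameters and the bar denotes subtraction of the mean intensity. Subtracting the mean absorbs brightness (bias) changes and dividing by the norms absorbs contrast (gain) changes, which is where the photometric invariance mentioned in the first point comes from.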
findTransformECC Example in OpenCV
In OpenCV 3, the motion model for ECC image alignment is estimated using the function findTransformECC. Here are the steps for using findTransformECC:
- Read the images.
- Convert them to grayscale.
- Pick a motion model you want to estimate.
- Allocate space (warp_matrix) to store the motion model.
- Define the termination criteria that tell the algorithm when to stop.
- Estimate the warp matrix using findTransformECC.
- Apply the warp matrix to one of the images to align it with the other image.
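Here is a minimal Python sketch of these steps (a rough outline with placeholder file names; for the complete C++/Python code see the link below):

```python
import cv2
import numpy as np

# 1. Read the images (placeholder file names)
im1 = cv2.imread("image1.jpg")
im2 = cv2.imread("image2.jpg")

# 2. Convert them to grayscale
im1_gray = cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY)
im2_gray = cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY)

# 3. Pick a motion model
warp_mode = cv2.MOTION_EUCLIDEAN

# 4. Allocate space for the warp matrix: 3 x 3 for a homography, 2 x 3 otherwise
if warp_mode == cv2.MOTION_HOMOGRAPHY:
    warp_matrix = np.eye(3, 3, dtype=np.float32)
else:
    warp_matrix = np.eye(2, 3, dtype=np.float32)

# 5. Termination criteria: stop after 5000 iterations or when the update falls below 1e-10
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 5000, 1e-10)

# 6. Estimate the warp matrix with ECC
# (some OpenCV 4.x builds also expect inputMask and gaussFiltSize arguments here)
cc, warp_matrix = cv2.findTransformECC(im1_gray, im2_gray, warp_matrix, warp_mode, criteria)

# 7. Warp im2 so that it aligns with im1
height, width = im1.shape[:2]
if warp_mode == cv2.MOTION_HOMOGRAPHY:
    im2_aligned = cv2.warpPerspective(im2, warp_matrix, (width, height),
                                      flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)
else:
    im2_aligned = cv2.warpAffine(im2, warp_matrix, (width, height),
                                 flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)
```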
To obtain the OpenCV code ( C++ / Python ) for doing this alignment, please visit
https://www.learnopencv.com/image-alignment-ecc-in-opencv-c-python
Reconstructing the Prokudin-Gorskii Collection in Color
The above image is also part of the Prokudin-Gorskii collection. On the left is the image with unaligned RGB channels, and on the right is the image after alignment. This photo also shows that by the early 20th century the photographic plates were sensitive enough to beautifully capture a wide color spectrum. The vivid red, blue and green colors are stunning.
The problem is that the red, green, and blue channels of an image are not as strongly correlated in pixel intensities as you might guess. For example, check out the blue gown the Emir is wearing in Figure 2. It looks quite different in the three channels. However, even though the intensities are different, something in the three channels is similar, because a human eye can easily tell that it is the same scene.
It turns out that the three channels of the image are more strongly correlated in the gradient domain. This is not surprising, because even though the intensities may be different in the three channels, the edge map generated by object and color boundaries is consistent across them.
So we can calculate the alignment based on the gradients of the color channels instead of the color channels themselves. You can find the code and images used to generate the results in this post by following the link below:
https://www.learnopencv.com/image-alignment-ecc-in-opencv-c-python
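As a rough sketch of this gradient-based approach (not necessarily the exact code behind the results above: the file name emir.jpg is a placeholder, the scanned plate is assumed to be stacked top to bottom as blue, green, red, and the Sobel-based gradient helper is just one reasonable choice), the idea is to hand gradient magnitudes to findTransformECC instead of the raw channels:

```python
import cv2
import numpy as np

def get_gradient(im):
    # Approximate the gradient magnitude as the average of the absolute x and y Sobel responses
    grad_x = cv2.Sobel(im, cv2.CV_32F, 1, 0, ksize=3)
    grad_y = cv2.Sobel(im, cv2.CV_32F, 0, 1, ksize=3)
    return cv2.addWeighted(np.absolute(grad_x), 0.5, np.absolute(grad_y), 0.5, 0)

# Read the scanned plate (placeholder file name) and cut it into three equal strips,
# assumed to be blue on top, green in the middle, and red at the bottom.
plate = cv2.imread("emir.jpg", cv2.IMREAD_GRAYSCALE)
h = plate.shape[0] // 3
blue, green, red = plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

# Align the green and red plates to the blue plate using ECC on the gradients
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 5000, 1e-10)
aligned = [blue]
for channel in (green, red):
    warp_matrix = np.eye(2, 3, dtype=np.float32)
    _, warp_matrix = cv2.findTransformECC(get_gradient(blue), get_gradient(channel),
                                          warp_matrix, cv2.MOTION_AFFINE, criteria)
    aligned.append(cv2.warpAffine(channel, warp_matrix, (blue.shape[1], blue.shape[0]),
                                  flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP))

# Stack the aligned plates into a color image (OpenCV expects BGR channel order)
color_image = cv2.merge(aligned)
cv2.imwrite("emir_color.jpg", color_image)
```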
Subscribe & Download Code
If you liked this article and would like to download code and example images used in this post, please subscribe to our newsletter.
https://www.learnopencv.com/computer-vision-resources
You will also receive a free Computer Vision Resource guide. In our newsletter we share OpenCV tutorials and examples written in C++/Python, and Computer Vision and Machine Learning algorithms and news.
Consulting
I do consulting work through my consulting company Big Vision LLC. If you have a computer vision or machine learning problem you need help with, please shoot me an email at [email protected].