0:00
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
0:05
The classic computer vision trick behind smooth image blending.
0:08
By Farzone Lotfi. If you have ever tried to create a multi-panel collage by stitching
0:13
many images together, you have likely run into the problem of harsh, visible seams.
0:18
Slightly better would be some naive alpha blending. Slightly worse are simple cut and paste jobs.
0:24
Both leave unnatural transitions or ghosting artifacts. To get those buttery smooth,
0:28
seamless transitions, we can use a classic computer vision technique,
0:32
Laplacian Pyramid blending. Below you can see how awesome the results can be.
0:37
Building the pyramids. To blend two images, let's call them image A and image B,
0:42
without harsh seams, we have to treat the broad strokes
0:46
(low frequencies, like lighting and color) differently than the fine details
0:50
(high frequencies, like sharp edges). We do this using image pyramids.
0:56
The Gaussian Pyramid (reduce). We take our original image and run a Gaussian filter (blur)
1:01
over it, then downsample it, e.g. shrink an 8x8 image to 4x4, then 2x2, then 1x1.
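As a rough sketch of the reduce step (not the implementation linked later in this story), the snippet below builds a Gaussian pyramid with plain NumPy. The 5-tap binomial kernel, the reflected borders, and the names `blur`, `reduce_level`, and `gaussian_pyramid` are assumptions made for illustration, and it only handles a single-channel float image.

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian.
# Assumption: the story does not say which filter its implementation uses.
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(img):
    """Separable blur of a 2-D float image, with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    # Filter along the rows first, then along the columns.
    horiz = sum(KERNEL[k] * padded[:, k:k + w] for k in range(5))
    return sum(KERNEL[k] * horiz[k:k + h, :] for k in range(5))


def reduce_level(img):
    """Blur, then keep every other row and column."""
    return blur(img)[::2, ::2]


def gaussian_pyramid(img, levels):
    """Return [img, reduce(img), reduce(reduce(img)), ...] with `levels` entries."""
    pyramid = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


image = np.arange(64, dtype=np.float64).reshape(8, 8)
shapes = [level.shape for level in gaussian_pyramid(image, 4)]
print(shapes)  # [(8, 8), (4, 4), (2, 2), (1, 1)]
```

The printed shapes mirror the 8x8, 4x4, 2x2, 1x1 example. A color image could be handled by running the same functions on each channel separately.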
1:10
Each step is a new "level" in our pyramid, representing progressively coarser,
1:15
lower-frequency data. The Laplacian Pyramid (expand). This isolates the details.
1:21
We take a smaller, coarser level of the Gaussian Pyramid and expand it to match the size
1:25
of the level below it. Because expanding is basically guessing the pixels in between,
1:30
it's not perfect. We subtract this expanded image from the actual image at that level to get
1:35
an error image. This error image is the Laplacian. It basically looks like an edge map containing
1:41
all the high-frequency details. The blending process. Suppose you have image A, image B,
1:46
and a mask, region R, that tells the computer where the blend should happen.
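Before walking through the blend itself, here is a hedged sketch of the error image idea from the Laplacian Pyramid description above. It assumes a single-channel float image, a 5-tap binomial blur, and an expand step that inserts zeros between samples and blurs them in; the function names are illustrative and not taken from the linked code.

```python
import numpy as np

# Assumed 5-tap binomial kernel; the story does not name a specific filter.
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(img):
    """Separable blur of a 2-D float image, with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    horiz = sum(KERNEL[k] * padded[:, k:k + w] for k in range(5))
    return sum(KERNEL[k] * horiz[k:k + h, :] for k in range(5))


def reduce_level(img):
    """Blur, then keep every other row and column."""
    return blur(img)[::2, ::2]


def expand_level(img, shape):
    """Upsample to `shape`: place samples on a zero grid, then blur to fill the gaps."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    # Three of every four pixels were zero, so scale by 4 to restore brightness.
    return 4.0 * blur(up)


def laplacian_pyramid(img, levels):
    """Detail (error) images from fine to coarse, plus the coarsest Gaussian level."""
    gauss = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels - 1):
        gauss.append(reduce_level(gauss[-1]))
    lap = [fine - expand_level(coarse, fine.shape)
           for fine, coarse in zip(gauss[:-1], gauss[1:])]
    lap.append(gauss[-1])  # the coarsest level is stored as-is
    return lap


# A flat image with a single sharp vertical edge in the middle.
image = np.zeros((16, 16))
image[:, 8:] = 1.0
lap = laplacian_pyramid(image, 3)
print(np.round(np.abs(lap[0][0]), 3))  # one row of the finest error image
```

In this toy case the finest error image is zero in the flat regions and nonzero only around the edge, which is the edge-map behavior described above.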
1:51
In the process I learned, the blending works from coarse to fine: 1.
1:55
Build the Laplacian Pyramids for image A and image B, L_A and L_B.
2:01
2. Build a Gaussian Pyramid for your mask, G_R. Blurring the mask as it gets smaller
2:08
ensures that the transition zone widens for lower frequencies. 3. Form a combined Laplacian Pyramid,
2:14
L_O, using the mask's Gaussian Pyramid as weights.
2:19
The formula at each level looks like this: L_O = G_R * L_A + (1 - G_R) * L_B. 4. Finally,
2:24
collapse the combined Pyramid, L_O, by continually expanding and adding the levels
2:29
back together to get the final blended image. The code implementation. I wrote a Python
2:35
implementation that handles this exact mathematical process. It computes the Pyramids,
2:41
applies the blending step at each level, and reconstructs the output.
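Pulling the steps together, a minimal NumPy sketch of the whole process might look like the following. This is not the code from the GitHub repository; it is an illustration under stated assumptions: single-channel float images, a 5-tap binomial blur with reflected borders, and hypothetical function names. The per-level weighting used here is L_O = G_R * L_A + (1 - G_R) * L_B, the usual mask-weighted form, with the mask equal to 1 where image A should show.

```python
import numpy as np

# Assumed 5-tap binomial kernel; the story does not name a specific filter.
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(img):
    """Separable blur of a 2-D float image, with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    horiz = sum(KERNEL[k] * padded[:, k:k + w] for k in range(5))
    return sum(KERNEL[k] * horiz[k:k + h, :] for k in range(5))


def reduce_level(img):
    """Blur, then keep every other row and column."""
    return blur(img)[::2, ::2]


def expand_level(img, shape):
    """Upsample to `shape` by zero insertion followed by a blur."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return 4.0 * blur(up)


def gaussian_pyramid(img, levels):
    pyramid = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


def laplacian_pyramid(img, levels):
    gauss = gaussian_pyramid(img, levels)
    lap = [fine - expand_level(coarse, fine.shape)
           for fine, coarse in zip(gauss[:-1], gauss[1:])]
    return lap + [gauss[-1]]


def collapse(pyramid):
    """Start at the coarsest level, then repeatedly expand and add the next finer one."""
    out = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        out = expand_level(out, detail.shape) + detail
    return out


def blend(image_a, image_b, mask, levels=4):
    """Blend two images; mask is 1.0 where image_a shows and 0.0 where image_b shows."""
    l_a = laplacian_pyramid(image_a, levels)   # step 1
    l_b = laplacian_pyramid(image_b, levels)
    g_r = gaussian_pyramid(mask, levels)       # step 2
    # Step 3: weight each level by the blurred mask at the same scale.
    l_o = [g * a + (1.0 - g) * b for g, a, b in zip(g_r, l_a, l_b)]
    return collapse(l_o)                       # final step


# Toy example: a dark image on the left, a bright image on the right.
a = np.full((32, 32), 0.2)
b = np.full((32, 32), 0.8)
mask = np.zeros((32, 32))
mask[:, :16] = 1.0
out = blend(a, b, mask)
print(np.round(out[0, ::4], 3))  # samples along one row, left to right
```

In the toy example, the output rises gradually from roughly the dark value on the left to roughly the bright value on the right instead of jumping at column 16. Blending an image with itself returns the image unchanged, which is a handy sanity check for any implementation.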
2:45
You can check out the full source code on my GitHub here. Laplacian blending with Python.
2:50
Or check out the Colab directly. Conclusion. By breaking images down into their frequency bands
2:55
using Gaussian and Laplacian Pyramids, we can blend the structural details and the broad
3:00
lighting separately. The result is a composite image that fools the human eye into seeing
3:05
a buttery smooth transition. Grab the code, create some image masks, and try making your own seamless
3:11
collages. Credit where it's due: Computational Photography course at Georgia Tech. I have
3:16
to mention that a fantastic resource for learning these concepts is Georgia Tech's CS 6475
3:22
Computational Photography Class. I took this course when I was in grad school and it was one of my
3:27
favorite courses. If you want a deep dive into the math and theory, I highly recommend checking
3:33
out these excellent course notes compiled by Monser Sala. The theory above leans on the foundational
3:39
concepts taught in that course. Thank you for listening to this Hackernoon story,
3:43
read by artificial intelligence. Visit Hackernoon.com to read, write, learn and publish.