In this part of the project, I used finite difference operators to compute the partial derivatives of the "cameraman" image (shown below) in the x and y directions.
These derivatives help us understand how pixel values change along these axes, which is critical for edge detection.
Applying the finite difference operator [-1, 1] along the x axis allows us to compute the derivative in x to highlight vertical edges.
Similarly, I applied the finite difference operator along the y axis to compute the derivative in y to highlight horizontal edges.
Next, I calculated the gradient magnitude, which combines the x and y derivatives to detect the overall strength of edges in the image.
More specifically, I used this line of code to calculate the gradient magnitude: `gradient_magnitude = np.sqrt(Ix**2 + Iy**2)`
Finally, I found a suitable threshold value by trial and error to create a binary image that highlights the prominent edges in the image while suppressing noise.
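Putting this together, a minimal sketch of the pipeline looks like the following (loading the cameraman image from skimage for illustration; the threshold value here is just a placeholder, not the one I settled on):

```python
import numpy as np
from scipy.signal import convolve2d
from skimage import data

im = data.camera() / 255.0  # the cameraman image as a float in [0, 1]

# Finite difference operators, written as 2D arrays so convolve2d accepts them
Dx = np.array([[-1, 1]])    # derivative in x -> responds to vertical edges
Dy = np.array([[-1], [1]])  # derivative in y -> responds to horizontal edges

Ix = convolve2d(im, Dx, mode='same', boundary='symm')
Iy = convolve2d(im, Dy, mode='same', boundary='symm')

# Gradient magnitude combines the two directional derivatives
gradient_magnitude = np.sqrt(Ix**2 + Iy**2)

# Binarize with a hand-tuned threshold (placeholder value)
edges = gradient_magnitude > 0.25
```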
In this part of the project, I incorporated a Gaussian filter to improve upon the simple finite difference operator. While the finite difference method is effective at detecting edges, it is also very sensitive to noise. A Gaussian filter smooths the image before the derivative operations are applied.
First, I applied a Gaussian filter to the original image, which blurs it, reducing noise and making the edges more distinct. I achieved this using the gaussian_filter() function.
I then computed the x and y derivatives (Ix and Iy) on the smoothed image by convolving it with the finite difference operators (Dx and Dy). This gives us smoother gradients compared to the unblurred image.
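A short sketch of this step, reusing im, Dx, Dy, and convolve2d from the earlier snippet and assuming scipy.ndimage's gaussian_filter (the sigma value is illustrative):

```python
from scipy.ndimage import gaussian_filter

# Smooth first to suppress noise, then take the same finite differences
blurred = gaussian_filter(im, sigma=2)
Ix_smooth = convolve2d(blurred, Dx, mode='same', boundary='symm')
Iy_smooth = convolve2d(blurred, Dy, mode='same', boundary='symm')
```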
Instead of applying Gaussian smoothing and then the derivative separately, we can combine both operations into a single convolution by computing Derivative of Gaussian (DoG) filters.
I convolved the Gaussian filter with the finite difference operators (Dx and Dy) to produce DoG filters. These filters are then directly applied to the image to compute the x and y derivatives in one single step.
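A sketch of the DoG construction, again reusing im, Dx, and Dy from above (the kernel size and sigma are illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel_2d(size, sigma):
    """2D Gaussian kernel built as the outer product of two 1D Gaussians."""
    x = np.arange(size) - (size - 1) / 2
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

G = gaussian_kernel_2d(size=11, sigma=2)

# Convolving the Gaussian with the finite difference operators once
# yields the DoG filters ('full' mode keeps the whole filter support)
DoG_x = convolve2d(G, Dx)
DoG_y = convolve2d(G, Dy)

# A single convolution per direction now gives the smoothed derivatives
Ix_dog = convolve2d(im, DoG_x, mode='same', boundary='symm')
Iy_dog = convolve2d(im, DoG_y, mode='same', boundary='symm')
```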
In this part of the project, I implemented unsharp masking, which enhances the sharpness of an image by emphasizing its high-frequency components.
This process has three steps:
1. Gaussian Blur: First, I apply a Gaussian blur, which acts as a low-pass filter, removing the high-frequency details typically associated with sharp edges.
2. Extracting High Frequencies: Next, I subtract the blurred version of the image from the original image, which essentially isolates the high-frequency components.
3. Enhancing the Sharpness: I then add the high-frequency components back to the original image, with an adjustable weight (alpha) that controls how much the image is sharpened. The result appears sharper because the high-frequency details are emphasized.
The unsharp mask filter can be mathematically represented as:
Sharpened Image = Original Image + alpha * (Original Image - Blurred Image)
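A minimal sketch of this formula in code, assuming a grayscale float image in [0, 1] and scipy's gaussian_filter (the sigma and alpha values are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(im, sigma=2.0, alpha=1.0):
    """Sharpen a float image by adding back weighted high frequencies."""
    blurred = gaussian_filter(im, sigma=sigma)  # step 1: low-pass
    high_freq = im - blurred                    # step 2: isolate detail
    sharpened = im + alpha * high_freq          # step 3: emphasize detail
    return np.clip(sharpened, 0, 1)             # stay in a valid range

sharp = unsharp_mask(im, sigma=2.0, alpha=1.5)  # alpha controls the strength
```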
The following images illustrate the process mentioned above:
I also tested a scenario in which a sharp image is intentionally blurred and then sharpened again. By comparing the original image with the re-sharpened version of its blurred copy, we can see how much of the original detail the unsharp masking process restores.
In this section, I explored hybrid images, which involve combining two images in a way that allows different interpretations depending on the viewing distance.
Essentially, high-frequency details dominate perception when viewed up close, while low-frequency components are more prominent from a distance.
To create a hybrid image, I first applied a low-pass filter (Gaussian filter) to one image to isolate the low-frequency content. Next, I applied a high-pass filter to the second image by subtracting a blurred version of the image from the original, leaving only the high-frequency details. The two filtered images were then combined to form the hybrid image.
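A minimal sketch of this combination, assuming the two images are already aligned, same-size grayscale floats in [0, 1] (the two cutoff sigmas are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(im_low, im_high, sigma_low=8.0, sigma_high=4.0):
    """Low frequencies of im_low plus high frequencies of im_high."""
    low = gaussian_filter(im_low, sigma=sigma_low)                # low-pass
    high = im_high - gaussian_filter(im_high, sigma=sigma_high)   # high-pass
    return np.clip(low + high, 0, 1)
```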
These are the images I started with:
To align and scale the images, I manually selected points in both images to align key features and scaled them appropriately so that the high-frequency and low-frequency parts would blend smoothly.
This is the image I got:
To visualize how the frequency components are distributed in the original and hybrid images, I computed the 2D Fourier transforms of the images. This analysis shows the distribution of high and low-frequency content.
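Concretely, the visualization amounts to something like the following (the small constant just avoids taking the log of zero):

```python
import numpy as np

def log_magnitude_spectrum(im):
    """Log magnitude of the 2D FFT, shifted so low frequencies sit at the center."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(im))) + 1e-8)
```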
These are the Fourier analyses:
These are some other images I tested the same algorithm on:
In this part of the project, I explored multi-resolution blending. The goal of this technique is to combine two images seamlessly by creating a spline between them at each frequency level.
This technique ensures that blending appears smooth, as it processes the images at different resolutions, minimizing any harsh seams or abrupt transitions.
This blending is done using two important tools: a Gaussian stack and a Laplacian stack.
A Gaussian stack is a stack of images created by successively applying a Gaussian blur. Each level applies a Gaussian filter with increasing sigma (blur strength), but unlike a pyramid, the image size remains constant because there is no downsampling. The stack represents the low-frequency content of the image at different scales.
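A minimal sketch of the Gaussian stack, assuming scipy's gaussian_filter (the level count and sigma schedule are illustrative):

```python
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=5, base_sigma=2.0):
    """Progressively blurred copies; no downsampling, so all levels keep the original size."""
    stack = [im]  # level 0: the original image
    for i in range(1, levels):
        # Double sigma at each level for progressively stronger blur
        stack.append(gaussian_filter(im, sigma=base_sigma * 2 ** (i - 1)))
    return stack
```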
A Laplacian Stack is derived from the Gaussian stack by subtracting the next, more blurred image from the current one. It captures the high-frequency details between each level, which is crucial for multi-resolution blending.
Once the Gaussian stack is built, I computed the Laplacian stack by subtracting adjacent levels in the Gaussian stack. This stack highlights the details, or high-frequency information, at each level. The final Laplacian stack can then be used for image blending by merging the Laplacian stacks of two images while maintaining fine details and smooth transitions.
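The Laplacian stack then falls out of the Gaussian stack in a few lines; keeping the final low-pass level as a residual means the stack sums back to the original image:

```python
def laplacian_stack(g_stack):
    """Band-pass details between adjacent Gaussian levels, plus the low-pass residual."""
    l_stack = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    l_stack.append(g_stack[-1])  # residual, so sum(l_stack) reconstructs the image
    return l_stack
```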
In this section, I focused on blending two images together using multi-resolution blending. This technique allows us to seamlessly merge two images, such as the classic apple and orange, to create what is famously known as the "Oraple".
With the stacks I created in section 2.3, we blend the two images. First, I resized both images to matching dimensions. Next, I generated the Gaussian stack for each image to capture its low-frequency content. From there, I computed the Laplacian stack to isolate the high-frequency details at each level. These stacks allow me to blend the two images smoothly, creating the famous "Oraple".
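A sketch of the blending step, building on the gaussian_stack and laplacian_stack helpers above; apple, orange, and the half-and-half mask are stand-in names for illustration:

```python
import numpy as np

def multires_blend(im1, im2, mask, levels=5):
    """Blend im1 and im2 band-by-band under a progressively softened mask."""
    l1 = laplacian_stack(gaussian_stack(im1, levels))
    l2 = laplacian_stack(gaussian_stack(im2, levels))
    gm = gaussian_stack(mask, levels)  # blurring the mask softens the seam
    blended = [gm[i] * l1[i] + (1 - gm[i]) * l2[i] for i in range(levels)]
    # Collapse the blended stack by summing all levels
    return np.clip(np.sum(blended, axis=0), 0, 1)

# For the Oraple: a vertical seam, ones on the left half and zeros on the right
mask = np.zeros_like(apple)
mask[:, :apple.shape[1] // 2] = 1.0
oraple = multires_blend(apple, orange, mask)
```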
Using the same principle, I tried making my own set of blends, the first being Stankeley.
Then, I sought to combine Magritte's "The Son of Man" and Munch's "The Scream".
This is a failed example, the result of setting the threshold improperly: "The Scream" isn't as prominent as in the successful example above.