The captions I used were 'an oil painting of a snowy mountain village', 'a man wearing a hat', and 'a rocket ship'.
The outputs are clean, and each one closely matches the caption it was generated from.
I used seed 180.
The image shown here is the result of applying the forward() function to the test_image (the beautiful Campanile) at noise levels [250, 500, 750].
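For reference, here is a minimal sketch of what a forward() noising function of this kind typically computes: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, with eps drawn from a standard normal. The linear beta schedule in make_alphas_cumprod and the all-zero stand-in image are assumptions for illustration only; the real schedule comes from the pretrained model.

```python
import numpy as np

def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear beta schedule (assumption, not the model's own)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward(image, t, alphas_cumprod, rng):
    """Noise a clean image to timestep t."""
    abar_t = alphas_cumprod[t]
    eps = rng.standard_normal(image.shape)
    # Scale the signal down and mix in Gaussian noise.
    return np.sqrt(abar_t) * image + np.sqrt(1.0 - abar_t) * eps

rng = np.random.default_rng(180)
alphas_cumprod = make_alphas_cumprod()
test_image = np.zeros((3, 64, 64))  # stand-in for the Campanile image
noisy = {t: forward(test_image, t, alphas_cumprod, rng) for t in [250, 500, 750]}
```

Larger t means a smaller abar_t, so less of the original image survives and more noise is mixed in.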
The image shown here is a side-by-side comparison of the Gaussian-denoised versions of the 3 noisy test images from the previous part.
The image shown here shows the UNet-denoised versions of the 3 noisy images from 1.2.
The image shown here shows every 5th intermediate image during denoising (gradually becoming less noisy). The final predicted clean image was produced using iterative denoising.
Pixels are less clear.
Pixels are quite clear.
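As a rough sketch of what one iterative denoising step can look like, the code below uses the standard DDPM posterior mean to move from a noisier timestep t to a less noisy timestep t'. The function names, and the idea that the noise estimate eps_hat comes from the UNet, are assumptions for illustration; this is not necessarily the exact update used to make the images above.

```python
import numpy as np

def estimate_x0(x_t, eps_hat, abar_t):
    """One-step clean-image estimate from a noisy image and a noise estimate."""
    return (x_t - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)

def denoise_step(x_t, eps_hat, abar_t, abar_prev, noise=None):
    """Move from timestep t to a less noisy timestep t' (abar_prev > abar_t)."""
    alpha_t = abar_t / abar_prev
    beta_t = 1.0 - alpha_t
    x0_hat = estimate_x0(x_t, eps_hat, abar_t)
    # Interpolate between the clean estimate and the current noisy image.
    coef_x0 = np.sqrt(abar_prev) * beta_t / (1.0 - abar_t)
    coef_xt = np.sqrt(alpha_t) * (1.0 - abar_prev) / (1.0 - abar_t)
    x_prev = coef_x0 * x0_hat + coef_xt * x_t
    if noise is not None:
        x_prev = x_prev + noise  # optional extra variance term
    return x_prev
```

Calling estimate_x0 once corresponds to one-step denoising; calling denoise_step repeatedly over a decreasing list of timesteps corresponds to iterative denoising.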
I picked Pink Floyd's album, The Dark Side of the Moon, as the image I will edit.
The image I got here was edited using CFG at noise levels [1, 3, 5, 7, 10, 20].
Here's my autograph.
And here's a cat I drew.
And here are both drawings similarly edited using CFG at noise levels [1, 3, 5, 7, 10, 20].
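For context, classifier-free guidance (CFG) combines a conditional and an unconditional noise estimate at every denoising step. The sketch below shows only that combination; the function name and the scale value in the usage line are placeholders, not the settings used for these images.

```python
import numpy as np

def cfg_noise(eps_cond, eps_uncond, scale):
    """Extrapolate from the unconditional estimate toward the conditional one."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Toy usage: scale = 1 returns the conditional estimate unchanged,
# while scale > 1 pushes further in the direction of the prompt.
eps_cond = np.ones((3, 4, 4))
eps_uncond = np.zeros((3, 4, 4))
guided = cfg_noise(eps_cond, eps_uncond, scale=7.0)
```

The noise level is a separate choice: it sets how much noise is added to the input image before this guided denoising starts.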
The top part of the Campanile here has a mask applied to it. I just covered the hole with a grey box.
The center of the image here has a circular mask applied to it. I also covered the hole with a grey box.
The diagonal of the image here has a mask applied to it. I covered this hole with a grey box too.
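Here is a minimal sketch of how masks like these can be built, and how inpainting typically uses them: after each denoising step, everything outside the mask is reset to a noised copy of the original image, so only the masked region is newly generated. The helper names and the convention that the mask is 1 where new content goes are assumptions for illustration.

```python
import numpy as np

def top_mask(h, w, rows):
    """Mask covering the top `rows` rows of the image."""
    mask = np.zeros((h, w))
    mask[:rows, :] = 1.0
    return mask

def circle_mask(h, w, radius):
    """Circular mask centered in the image."""
    yy, xx = np.ogrid[:h, :w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return ((yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2).astype(float)

def diagonal_mask(h, w, half_width):
    """Band along the main diagonal, `half_width` pixels to each side."""
    yy, xx = np.ogrid[:h, :w]
    return (np.abs(yy - xx) <= half_width).astype(float)

def inpaint_blend(x_t, noised_original, mask):
    """Keep generated pixels inside the mask and original pixels outside it."""
    return mask * x_t + (1.0 - mask) * noised_original
```

The mask broadcasts over the channel dimension, so the same (h, w) mask works for a (3, h, w) image.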
I was able to apply the noise to the rocket at the specified levels [1, 3, 5, 7, 10, 20] but wasn't able to have it converge back to the Campanile.
I was able to apply the noise to the car at the specified levels [1, 3, 5, 7, 10, 20] but wasn't able to have it converge back to the Campanile.
I was able to apply the noise to the cat at the specified levels [1, 3, 5, 7, 10, 20] but wasn't able to have it converge back to the Campanile.
You can see here that there is an old man sitting upright, but when flipped, we see a campfire.
You can see here that there is a snowy mountain village, but when flipped, we see a skull.
Although this image is incorrectly labeled, it shows a hipster bartender when upright and the beautiful Amalfi Coast when flipped.
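A minimal sketch of how flip illusions like these are usually produced: at each denoising step, one noise estimate is computed for the upright image with the first prompt, a second is computed for the flipped image with the second prompt and flipped back, and the two are averaged. The predict_noise callable is a hypothetical stand-in for the UNet, and the (channels, height, width) layout is an assumption.

```python
import numpy as np

def anagram_noise(x_t, predict_noise, prompt_upright, prompt_flipped):
    """Average the noise estimates for the upright and upside-down views."""
    eps_up = predict_noise(x_t, prompt_upright)
    # Flip vertically (height axis), predict, then flip the estimate back.
    flipped = np.flip(x_t, axis=-2)
    eps_down = np.flip(predict_noise(flipped, prompt_flipped), axis=-2)
    return 0.5 * (eps_up + eps_down)
```

The averaged estimate is then used in the ordinary denoising update, so the result has to look plausible in both orientations.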
You can see a hybrid image of a skull and a waterfall. Up close, you can see the details that make up a waterfall, but from afar, you will see a skull.
You can see a hybrid image of a pencil and the Amalfi Coast. Up close, you can see the details that make up the famous coastal town, but from afar, you will see a pencil (sort of).
You can see a hybrid image of a dog and a rocket. Up close, you can see the details that make up a dog, but from afar, you will see a rocket.
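A rough sketch of the usual hybrid-image recipe for diffusion models: take the low frequencies of the noise estimate for the far-away prompt (for example the skull) and add the high frequencies of the estimate for the close-up prompt (for example the waterfall). The kernel size and sigma below are placeholder values, and the NumPy blur is an illustrative stand-in for whatever Gaussian blur was actually used.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 1-D Gaussian kernel (size should be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-0.5 * (ax / sigma) ** 2)
    return k / k.sum()

def low_pass(img, size=9, sigma=2.0):
    """Separable Gaussian blur over the last two axes (height, width)."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    widths = [(0, 0)] * (img.ndim - 2) + [(pad, pad), (pad, pad)]
    out = np.pad(img, widths, mode="reflect")
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), -1, out)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), -2, out)
    return out

def hybrid_noise(eps_far, eps_near, size=9, sigma=2.0):
    """Low frequencies from eps_far plus high frequencies from eps_near."""
    high = eps_near - low_pass(eps_near, size, sigma)
    return low_pass(eps_far, size, sigma) + high
```

A larger sigma moves more of the image content into the far-away prompt's share.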
Here is a visualization of the noising process with sigma values = [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0].
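A minimal sketch of a noising process of this kind, assuming the common form z = x + sigma * eps with eps drawn from a standard normal. The constant 28x28 image is only a stand-in for a real training image.

```python
import numpy as np

def add_noise(x, sigma, rng):
    """Add Gaussian noise with standard deviation sigma to a clean image."""
    return x + sigma * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
x = np.full((28, 28), 0.5)  # stand-in for one clean image
sigmas = [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0]
noisy_images = [add_noise(x, s, rng) for s in sigmas]
```

With sigma = 0.0 the image is returned unchanged, and at sigma = 1.0 the noise is as large as the full pixel range of an image scaled to [0, 1].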
Here is a visualization of the sample results on the test set after the first epoch.
Here is a visualization of the sample results on the test set after the fifth epoch.
Here is a plot of the training loss curve for the whole training process.
I've varied the sigma values, but kept the same image.
I was unable to complete part 2.5.