Maochuan Lu - CS180 Project 1

What did I do?

In this project, we aim to align the separately captured R, G, and B channels to produce color images. Below I document my code and the ideas behind it:

To align channels, we need to shift one channel relative to another by an (x, y) displacement. For this we wrote a helper function shift(pic, x, y) using np.roll.
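A minimal sketch of what such a helper could look like. The axis convention (x moves along columns, y along rows) is an assumption, since the write-up only gives the signature. Note that np.roll wraps pixels around to the opposite edge rather than discarding them.

```python
import numpy as np

def shift(pic, x, y):
    """Circularly shift a 2-D image: x pixels along columns, y pixels along rows.

    Assumed convention: positive x moves content right, positive y moves it down.
    """
    return np.roll(pic, shift=(y, x), axis=(0, 1))
```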
We also need a metric to judge whether a given shift improves the alignment. I chose normalized cross-correlation (NCC) for this, because it is robust to brightness differences between channels.
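One common way to write NCC is as the dot product of the two images after each is mean-centered and scaled to unit length. The sketch below follows that definition; the function name ncc comes from the text, but the zero-variance guard is my own addition.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two same-shaped images, in [-1, 1]."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    # Subtracting the mean removes brightness offsets between channels.
    a = a - a.mean()
    b = b - b.mean()
    # Dividing by the norms removes contrast (scale) differences.
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0  # a flat image carries no alignment information
    return float(np.dot(a, b) / denom)
```

Because of the centering and scaling, an image and a brighter or higher-contrast copy of it still score close to 1.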
With ncc and shift in place, we implemented a naive align function, which tries every possible shift within a search window and returns the one with the highest NCC score.
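The exhaustive search could be sketched as follows. The window size of 15 pixels and the (dx, dy) return order are assumptions for illustration; the helpers are repeated so the snippet runs on its own.

```python
import numpy as np

def ncc(a, b):
    a = a.astype(np.float64).ravel() - a.mean()
    b = b.astype(np.float64).ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

def align(pic, ref, window=15):
    """Try every shift in [-window, window]^2 and keep the best NCC score.

    Returns the (dx, dy) shift to apply to `pic` so that it matches `ref`.
    """
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            candidate = np.roll(pic, (dy, dx), axis=(0, 1))
            score = ncc(candidate, ref)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift
```

The cost grows with (2 * window + 1)^2 score evaluations, each touching every pixel, which is why this becomes slow on the large .tif scans.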
However, the naive alignment function is quite slow on large pictures, so we implemented a pyramid alignment function using recursion (an iterative version was quite slow).
- First, in the base case (level == 0), we simply return the naive alignment at the coarsest layer.
- Second, for other levels, we scale both pictures down by a factor of 2, creating a pyramid of images at lower resolutions, and recursively call pyramid_align on the images at the next level.
- After finding the best alignment at the lower resolution, we multiply the returned offsets by 2, since that level was downscaled by a factor of 2.
- Then we shift pic1 by these scaled offsets and refine the alignment by running the align function once more at the current level.
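The recursion described above can be sketched like this. Several details are assumptions: downscaling is done by keeping every second pixel (the write-up does not say which resizing method was used), the refinement window is fixed at 2 pixels, and the final offset is taken as the scaled coarse offset plus the refinement. The parameter names window and refine_window are hypothetical.

```python
import numpy as np

def ncc(a, b):
    a = a.astype(np.float64).ravel() - a.mean()
    b = b.astype(np.float64).ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

def align(pic, ref, window):
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ncc(np.roll(pic, (dy, dx), axis=(0, 1)), ref)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def pyramid_align(pic1, pic2, level, window=4, refine_window=2):
    """Coarse-to-fine search for the (dx, dy) shift that aligns pic1 to pic2."""
    if level == 0:
        # Base case: exhaustive search at the coarsest resolution.
        return align(pic1, pic2, window)
    # Downscale by 2 (assumed: plain subsampling) and recurse.
    dx, dy = pyramid_align(pic1[::2, ::2], pic2[::2, ::2], level - 1,
                           window, refine_window)
    # One coarse pixel corresponds to two pixels at this level.
    dx, dy = 2 * dx, 2 * dy
    # Apply the coarse estimate, then search a small window around it.
    shifted = np.roll(pic1, (dy, dx), axis=(0, 1))
    rdx, rdy = align(shifted, pic2, refine_window)
    return dx + rdx, dy + rdy
```

With this structure, only the coarsest level pays for a wide search; every finer level checks just (2 * refine_window + 1)^2 candidates.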
Once the functions above were implemented, the alignment for simple pictures like cathedral.jpg looked great, but images like emir.tif did not align well.
On Ed, a classmate had suggested that feature.canny and cropping would help, so I looked up what Canny edge detection is and how to use it through Google.
I then implemented the manual_crop function, which simply trims the height and width by a fixed percentage (e.g., 15%) to remove the borders.
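A possible shape for this function is shown below. Whether the percentage is removed from each side or split across both sides is not stated, so this sketch assumes it is trimmed from every side; the parameter name frac is hypothetical.

```python
import numpy as np

def manual_crop(pic, frac=0.15):
    """Trim `frac` of the height and width from each side of the image.

    Assumption: the fraction applies per side, so frac=0.15 keeps the
    central 70% of each dimension.
    """
    h, w = pic.shape[:2]
    dh, dw = int(h * frac), int(w * frac)
    return pic[dh:h - dh, dw:w - dw]
```

Cropping before scoring keeps the dark plate borders from dominating the NCC score.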
Finally, I used the skeleton code to align the channels after processing them with canny and obtained their shift offsets. I then applied these offsets to the original channels and shifted them to get the full color image.
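The idea of estimating shifts on edge maps and then applying them to the original channels can be sketched as below. To keep the snippet NumPy-only, a simple gradient-magnitude edge map stands in for feature.canny; it is not the Canny detector, and the edges parameter is a hypothetical hook where a Canny-based function could be passed instead. Using blue as the reference channel and a naive search window of 5 are also assumptions.

```python
import numpy as np

def edge_map(pic):
    """Gradient magnitude: a simple stand-in for a Canny edge map."""
    gy, gx = np.gradient(pic.astype(np.float64))
    return np.hypot(gx, gy)

def ncc(a, b):
    a = a.astype(np.float64).ravel() - a.mean()
    b = b.astype(np.float64).ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

def align(pic, ref, window):
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ncc(np.roll(pic, (dy, dx), axis=(0, 1)), ref)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift

def colorize(b, g, r, window=5, edges=edge_map):
    """Estimate shifts on edge maps, then apply them to the raw channels."""
    eb, eg, er = edges(b), edges(g), edges(r)
    gdx, gdy = align(eg, eb, window)  # green relative to blue (assumed reference)
    rdx, rdy = align(er, eb, window)  # red relative to blue
    g_aligned = np.roll(g, (gdy, gdx), axis=(0, 1))
    r_aligned = np.roll(r, (rdy, rdx), axis=(0, 1))
    return np.dstack([r_aligned, g_aligned, b]), (gdx, gdy), (rdx, rdy)
```

Edge maps depend on where intensity changes rather than on how bright a channel is, which may explain why this helps on images like emir.tif, where the channels differ strongly in brightness.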
It works really well for all of the images!

Results

cathedral.jpg

church.tif

emir.tif

harvesters.tif

icon.tif

lady.tif

melons.tif

monastery.jpg

sculpture.tif

self_portrait.tif

three_generations.tif

tobolsk.jpg

train.tif