I was a research intern (Summer 2012) at Tachyon Technologies between graduating and starting full-time at Yahoo!. I worked on an applied computer vision problem to convert photographs of comic books into individual panels suitable for mobile viewing. My mentor was the CEO and an MIT TR35 awardee.
This was my first exposure to machine learning and image processing. I performed a bunch of experiments and eventually settled on a research prototype that worked well in most cases. I also built an Android application wrapping this research up into a usable tool.
Solving the problem involved a series of steps, best understood in pictures (taken from the Tachyon Demos site):
Figure 1: Overview
- Dewarping: Book pages in photographs usually appear warped and skewed, depending on the angle at which the photograph was taken. We need flat panels.
Figure 2: Dewarping
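One common way to approach the skew part of this, not necessarily what the prototype did, is a perspective transform (homography) fitted to the four corners of the page. The sketch below assumes the page is flat and that its corners have already been located in the photograph; it does not handle curvature near the spine. The function names and the corner coordinates are illustrative only.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return 8 coefficients h mapping each src corner (x, y) to its dst corner (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def apply_homography(h, x, y):
    """Map a single point through the homography."""
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical page corners in the photo (clockwise from top-left) and the
# flat rectangle they should map to.
photo_corners = [(12, 30), (410, 8), (440, 620), (5, 590)]
flat_corners = [(0, 0), (400, 0), (400, 600), (0, 600)]
h = homography(photo_corners, flat_corners)
```

To actually resample an image, one would usually fit the map in the other direction (flat page to photo) and look up a source pixel for every output pixel, which avoids holes in the result.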
- Splitting Panels: A comic book page consists of a series of panels. We need to cut these out and rearrange them in reading order.
Figure 3: Splitting the comic into panels
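As an illustration of the idea only, here is a minimal sketch that splits a page using projection profiles. It assumes a dewarped grayscale page whose panels are separated by straight, near-white gutters and read left to right, top to bottom; real pages with overlapping or irregular panels need more than this. The `white` threshold and function names are assumptions, not taken from the prototype.

```python
def ink_runs(profile):
    """Return (start, end) pairs (end exclusive) of consecutive non-zero entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def split_panels(page, white=200):
    """page: 2D list of grayscale values (0-255).

    Returns (top, bottom, left, right) boxes in reading order.
    """
    width = len(page[0])
    # Count non-white pixels per row; empty rows are horizontal gutters.
    row_ink = [sum(1 for p in row if p < white) for row in page]
    panels = []
    for top, bottom in ink_runs(row_ink):
        # Within one horizontal band, empty columns are vertical gutters.
        col_ink = [sum(1 for y in range(top, bottom) if page[y][x] < white)
                   for x in range(width)]
        for left, right in ink_runs(col_ink):
            panels.append((top, bottom, left, right))
    return panels
```

Because bands are visited top to bottom and columns left to right, the returned list is already in the assumed reading order.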
- Restoring Colour: The colours in photographs are usually noisy, have gradients, and appear faded. But we know that the comic artist simply filled in outlines with solid colours. We need to restore these colours.
Figure 4: Restoring colour
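One simple way to exploit the "solid fill inside outlines" observation, offered as a sketch rather than a description of the prototype, is to flood-fill each region enclosed by dark outline pixels and replace it with its median colour. The `outline_max` threshold and the function name are assumptions, and the sketch expects outlines to be closed; gaps in the ink would let neighbouring regions merge.

```python
from statistics import median


def flatten_colours(image, outline_max=60):
    """image: 2D list of (r, g, b) tuples.

    Returns a copy in which every region bounded by outline pixels is
    filled with that region's per-channel median colour.
    """
    height, width = len(image), len(image[0])

    def is_outline(pixel):
        return max(pixel) <= outline_max  # dark in every channel = ink

    out = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or is_outline(image[y0][x0]):
                continue
            # 4-connected flood fill collecting one enclosed region.
            region, stack = [], [(y0, x0)]
            seen[y0][x0] = True
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and not is_outline(image[ny][nx])):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            fill = tuple(int(median(image[y][x][c] for y, x in region))
                         for c in range(3))
            for y, x in region:
                out[y][x] = fill
    return out
```

The median is used instead of the mean so that a few badly lit or noisy pixels do not shift the restored colour.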
And the end result:
Figure 5: Result