In the previous post, I covered what led to the idea of photosynthernet and walked through some of the early research that led to building v1. So if you want more background on what photosynthernet is and how we got here, be sure to read Part 1. If you’re just here for the cool parts, then read on!

Quick Part 1 Retro

In Part 1, we learned two significant things that would lead to a more robust and successful palette algorithm for photosynthernet. First, we discovered we could measure the difference between two colors with the Euclidean distance between their sRGB values (this was used to determine whether the “tiles” in the logo were similar to the tiles in the image). Second, we unintentionally stumbled onto the concept of Super Pixels (the “tiles” we discussed in Part 1 were really just basic superpixels). These two concepts are the foundation for the current palette generation algorithm used by photosynthernet.

What are Super Pixels?

Superpixels are clusters of perceptually similar pixels that have been merged to form a single polygon that represents a single color.

For example: imagine you have three pixels, all some shade of green, represented by the Cartesian coordinates ([0, 1], green), ([0, 2], light-green), and ([0, 3], swamp-green). You could simply say that this image is just “green”, or you could say that this image contains a single Super Pixel spanning the coordinates [0, 1] … [0, 3] whose color is “green”.

The decision to merge two pixels is made via a distance and weighting equation compared against a maximum threshold. For now, think of it in terms of Euclidean distance: if the delta between two pixels is less than, say, 20, then consider them to be the same color and merge them into a superpixel.
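To make that concrete, here is a rough Python sketch of the merge decision. This is not photosynthernet’s actual code; the threshold of 20 and the names are illustrative:

import math

# Hypothetical threshold below which two sRGB colors are considered "the same"
MERGE_THRESHOLD = 20

def should_merge(c1, c2):
    # math.dist computes the Euclidean distance between the two (r, g, b) tuples
    return math.dist(c1, c2) < MERGE_THRESHOLD

green = (0, 155, 0)
light_green = (5, 165, 6)
print(should_merge(green, light_green))  # True, so these two pixels would merge into one superpixel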

[Image: the three pixels mentioned above, before merging them into a single superpixel]
[Image: the resultant superpixel after merging]
[Image: a real photo converted into superpixels using the SLIC algorithm]

Implementing Rudimentary Super Pixels

In the above image, I show the results of a more intelligent superpixel clustering algorithm, which we will get to eventually… However, when I first started, I didn’t even know the phrase “Super Pixel” existed, so I stuck with my “tiles”, described in Part 1. I broke each image up into a grid of 50×50 pixel tiles, then iterated over each pixel within each tile and compared it against all the other pixels in that tile. If the Euclidean distance between two pixels was less than approximately 120, I incremented the “score” of the first color by 1. This gave me a map of color -> score for each tile. I would then sort the map by score and pop the top item off; this was the color I could assume was the dominant color in the tile. Once I ran this on all tiles, I usually had an array of 50-100 of what I called “Dominant Colors”, which was just my ignorant re-branding of “Super Pixels”.

// Each tile is a key/value store of an sRGB color (as the key)
// and the count of all colors within a delta of 120 to that color (as the value)
tile_1 = {
'20,155,22' : 30,
'200,5,29' : 12,
'79,152,240' : 9
}

tile_2 = {
'56,172,46' : 60,
'200,5,29' : 2,
}

// The dominant colors array below is the result of taking the "top" item in each tile map for the entire image
dominant_colors = ['20,155,22', '56,172,46']
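To show how those tile maps could be produced, here is a rough Python sketch of the scoring loop described above. It is a reconstruction for illustration, not photosynthernet’s actual code, and it uses (r, g, b) tuples as keys rather than the string keys shown above:

import math
from collections import defaultdict

SIMILARITY_THRESHOLD = 120  # delta under which two colors count toward each other's score

def dominant_color(tile_pixels):
    # tile_pixels is a list of (r, g, b) tuples from one 50x50 tile
    scores = defaultdict(int)
    for i, c1 in enumerate(tile_pixels):
        for j, c2 in enumerate(tile_pixels):
            if i != j and math.dist(c1, c2) < SIMILARITY_THRESHOLD:
                scores[c1] += 1  # c1 is similar to one more pixel in the tile
    # Sort by score and take the top entry: the tile's "Dominant Color"
    return max(scores, key=scores.get) if scores else tile_pixels[0]

# One dominant color per tile gives the image-level list:
# dominant_colors = [dominant_color(tile) for tile in tiles]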

Now that I had an array of superpixels, I could begin thinking about how to extract a palette from them. The brainstorming session that followed produced a list of criteria that I decided would create the most visually appealing and fun palettes:

  1. The palette should contain only 6 colors.
    1. The point of photosynthernet is to search images by their color palette and discover similar photos. If each image had a palette of 100 unique colors, then a search for “green” would have a very high likelihood of returning a slew of irrelevant results.
  2. The palette shouldn’t contain any pure-white or pure-black colors.
    1. Even though this may not be very accurate, I thought it would make for more interesting palettes on images with lower color diversity (such as a night sky or bright skyline).

After the above criteria were selected, I could get to work on an algorithm for eliminating some of the similarly colored tiles. This was done by running the same algorithm used to determine the dominant color of each tile, but against the tiles themselves. The only difference was that this implementation was recursive and incremental: if the results did not satisfy the selection criteria above, the input variables were tweaked and the algorithm re-run. The most notable variable was the threshold used to decide whether two colors were different.

Essentially the logic was: if, at the end of the palette selection algorithm, there were not at least 6 unique colors for the palette, then I needed to lower the bar for two colors to be considered “unique”. With a threshold of 120, burnt orange and bright orange may be considered the same color; lower that threshold to 80 and they are seen as two distinct colors, opening us up to a palette full of all the shades of a beautiful sunset.
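Here is a rough sketch of that retry loop, assuming a simple helper that collapses colors within the current threshold. The starting threshold, step size, and function names are illustrative, and the pure-black/pure-white filter is omitted for brevity:

import math

def merge_similar_colors(colors, threshold):
    # Keep a color only if it is not within `threshold` of a color we already kept
    kept = []
    for color in colors:
        if all(math.dist(color, k) >= threshold for k in kept):
            kept.append(color)
    return kept

def select_palette(dominant_colors, size=6, threshold=120, step=10):
    # Lower the "uniqueness" threshold until at least `size` distinct colors survive
    palette = merge_similar_colors(dominant_colors, threshold)
    while len(palette) < size and threshold - step > 0:
        threshold -= step
        palette = merge_similar_colors(dominant_colors, threshold)
    return palette[:size]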

Finding the Delta

Above I discussed splitting the image into rudimentary superpixels, but what I didn’t touch on was the algorithm I used to determine the distance between two colors.

Initially I started with basic 3-dimensional Euclidean distance, which can be expressed as:

Values:
c_1 = (r_1, g_1, b_1)
c_2 = (r_2, g_2, b_2)

Euclidean Distance:
\Delta(c_1, c_2) = \sqrt{(r_1 - r_2)^2 + (g_1 - g_2)^2 + (b_1 - b_2)^2}
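In Python, that formula is just a few lines (equivalent to calling math.dist on the two tuples); a minimal sketch:

def srgb_distance(c1, c2):
    # Basic 3-dimensional Euclidean distance between two (r, g, b) colors
    r1, g1, b1 = c1
    r2, g2, b2 = c2
    return ((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2) ** 0.5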

And this worked fairly well for a lot of images; in fact, all my test images performed incredibly well with basic Euclidean distance. Because of this, I stopped improving the palette selection algorithm and began working on the actual photosynthernet app, a Laravel-based web application that lets you log in and upload images. However, this blog post isn’t about that for the time being (I’ll touch on it later, I promise), so I’ll skip ahead to improving the palette algorithm and why I had to do so…

Perceptual Color Space

Once I had the app running in a capacity where I could share it with friends and get more real use out of the algorithm, I began to see things happening with the palette selection that I didn’t like.

  • Images with low color diversity were lowering the threshold so much that the resultant palette was virtually 6 of the same color.
    • By this, I mean that if an image was essentially tri-chromatic, the algorithm borked itself trying to force 6 unique colors out of it.
  • Some colors were dominating others.
    • In some images where ~50% of the photo was one color, the secondary colors, although significant, were not being selected for the palette; instead, they were being rolled into some of the other superpixels.
    • A notable example was blues and purples being merged.

So I began doing some research. I’d seen palette selection algorithms that worked better than mine, so what were they doing that I wasn’t?

After a bit of looking, I came across the following article by CompuPhase, who had built a desktop program for selecting 256-color palettes from an image (somewhat familiar, yeah?): https://www.compuphase.com/cmetric.htm .

The main takeaway from this article was learning of the existence of “Perceptual Color Space”. Just because we perceive one color to be very different from another (take blue and purple, for example) doesn’t mean they are as numerically distant as we think, especially in sRGB color space. To get more technical, sRGB is non-linear and not perceptually uniform: the same numeric difference does not represent the same perceived difference in every channel, or at every point in the color space. Which means our simple Euclidean distance equation wasn’t going to cut it.

Weighted Euclidean Distance

The solution presented by CompuPhase to the above problem was a weighted Euclidean distance. This meant that all I needed to do was multiply the r, g, and b terms by certain coefficients, or “weights”, in order to get a more perceptually uniform color distance. These weights were determined by various experiments carried out by the CompuPhase team, so I would suggest reading the article above if you want to get into the nitty-gritty of it all.

However, one thing I would like to restate in this post is that a fixed-weight Euclidean distance wasn’t enough. CompuPhase found that the weights for the red and blue terms should depend on how much red is present in the two colors.

All this means for me is that I had to tweak my distance algorithm to use the equation below to get much more consistent results. (Keep in mind for Part 3 that I’m still using my rudimentary “tile superpixels” here.)

Red Mean:
\bar{r} = \frac{r_1 + r_2}{2}

Dynamic Weighted Euclidean Distance:
\Delta C = \sqrt{\left(2 + \frac{\bar{r}}{256}\right)(r_1 - r_2)^2 + 4(g_1 - g_2)^2 + \left(2 + \frac{255 - \bar{r}}{256}\right)(b_1 - b_2)^2}
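For reference, here is the same “redmean” equation as a small Python sketch. This is my reading of the formula above, not CompuPhase’s code:

def redmean_distance(c1, c2):
    # Weighted Euclidean distance whose red and blue weights depend on the mean red value
    r1, g1, b1 = c1
    r2, g2, b2 = c2
    r_mean = (r1 + r2) / 2
    return (
        (2 + r_mean / 256) * (r1 - r2) ** 2
        + 4 * (g1 - g2) ** 2
        + (2 + (255 - r_mean) / 256) * (b1 - b2) ** 2
    ) ** 0.5

# Example: a blue and a purple, the kind of pair that was being merged under plain Euclidean distance
print(redmean_distance((0, 0, 255), (128, 0, 128)))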

In Conclusion

In this part we primarily talked about superpixels and perceptual color space. This post may have seemed a little dry, and it might seem like I didn’t accomplish much (I didn’t), but learning these foundations was very important for creating the next phase of the palette selection algorithm.

In Part 3 we will ditch the sRGB color space altogether, explore Adaptive Histogram Equalization, and implement a much more intelligent Super Pixel clustering algorithm. With these three things, I was able to create a more accurate palette selection algorithm and, in doing so, gain a much deeper understanding of color processing that gives me confidence in delivering something substantial via the Photosynthernet project.

2023-03-09
