Welcome to the special launch edition blog post! Now that Photosynthernet has launched to the public at https://photosynther.net (as discussed in Part 1), I figured now was the best time to write the next installment in the “Building Photosynthernet” series. After reading, feel free to sign up on the live site and see the techniques presented here hard at work extracting color palettes from your images!

Quick Part 2 Retro

In Part 2 we discussed things like what Perceptual Color Space is and how Superpixels are determined. If you haven’t read it, I recommend doing so, as this part presents a deeper dive into the fundamental concepts introduced there.

Let’s talk Lab


In the context of colors and images, Lab is a color space that is designed to be perceptually uniform. That means that when you add to or subtract from any of its values linearly, the perceived change in the resulting color is also linear. But before we get too far, let’s get a mental image of a color represented in Lab color space.

First of all, Lab* (the asterisks are a common addition to prevent confusion with another color space designated “Hunter Lab”; I will be using the asterisks for the remainder of this post) is a name composed of three parts that make up the whole, very similar to RGB. The first part, L*, is the perceptual lightness value of the color; it simply tells us how light or dark the color appears. The next part, a*, tells us where the color falls on the range from red to green; and similarly, b* tells us how blue or yellow a color is.
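To make that concrete, here is a minimal sketch of converting a single sRGB color into Lab* using scikit-image (the library choice is my assumption for illustration; Photosynthernet’s actual conversion code may differ):

```python
import numpy as np
from skimage.color import rgb2lab

# A single sRGB pixel (values scaled to 0-1), wrapped as a 1x1 "image"
# so the conversion helper accepts it.
teal_rgb = np.array([[[0.0, 0.5, 0.5]]])

lab = rgb2lab(teal_rgb)[0, 0]
# L* is perceptual lightness (0-100); a* runs green (-) to red (+);
# b* runs blue (-) to yellow (+).
print(f"L*={lab[0]:.1f}  a*={lab[1]:.1f}  b*={lab[2]:.1f}")
```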

Now, there is a very long history to Lab* color space, and it’s an accomplishment by the color industry that I don’t want to diminish by reducing it to a single paragraph in this blog post. If you want to know more, please read the Wikipedia article and work from there; if you’re a geek like me you’ll find it incredibly fascinating! However, for the sake of brevity, all you need to know for this blog post is that Lab* color space is perceptually uniform and device independent. (We’ll get into the device thing later.) Lastly, here is a 3D representation of Lab* color space.

If you remember from Part 2, we often perceive the range of blues as more distinct than we do other colors, which is why the coefficient for the blue value in our sRGB Euclidean distance calculations was 4 instead of 2 or 3. That fact is demonstrated in this visualization by the asymmetric grouping of the blue colors.

Better Superpixels

We aren’t done with Lab* color space just yet, as we will need that information later, but for now let’s talk about improving our superpixel algorithm.

In the previous post we defined superpixels as

… Clusters of perceptually similar pixels that have been merged to form a single polygon that represents a single color

Part 2!

And these clusters were defined as “tiles” that I broke the image up into. However, this is not very effective, as a tile can often break up parts of an image that “blend” together very naturally, shattering a natural segment of the image into an array of artificial squares. What I found was that this method of tiling often led to artificially dominant color palettes: a single color that was a global minority in the image could appear as the dominant color in several superpixels, so the initial algorithm thought it was more important than it really was.

Thankfully, people much smarter than I am have developed better algorithms for determining superpixels, so I figured I should just use one of those. And again, similar to Lab* color space, the topic of superpixels has incredible depth and deserves its own post. In the meantime, if you want to read up on some of the different superpixel algorithms and how they operate, I would suggest starting here, where I did.

To put it tersely, after much experimentation I settled on the SLIC (Simple Linear Iterative Clustering) superpixel algorithm. This algorithm works in a very similar manner to K-Means clustering: a set of initial centroids/clusters is chosen, and then, in an iterative fashion, each centroid pulls nearby pixels with similar colors into its cluster.
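If you want to try this yourself, here is a minimal sketch using scikit-image’s SLIC implementation (the library and the parameter values are my assumptions for illustration, not Photosynthernet’s production settings):

```python
from skimage import io, segmentation
from skimage.util import img_as_ubyte

image = io.imread("photo.jpg")  # any RGB image

# SLIC clusters pixels by Lab* color plus (x, y) position.
# n_segments is a target rather than an exact count; compactness trades
# color similarity against spatial regularity of the superpixels.
labels = segmentation.slic(image, n_segments=400, compactness=10, start_label=1)

# Draw the superpixel boundaries on top of the original for a quick preview.
preview = segmentation.mark_boundaries(image, labels)
io.imsave("slic_preview.png", img_as_ubyte(preview))
```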

Original Image

After SLIC Transformation

As you can see above, this algorithm does a fairly good job of determining where the “perceptually natural” clusters of each color are. However, as good as the algorithm is at pulling similar colors into clusters, it’s still too granular and produces far too many shades of each color to yield a reduced color palette. Remember, Photosynthernet wants the distinct color palette of an image, which means highlights and lowlights. If we just grabbed the dominant palette of the above image, we would get something like 15 boring blues; we need to reduce the blues and bring out the reds and beiges. To do so, we will be using a Region Adjacency Graph (RAG)…

The Graphening!

Source : http://melvincabatuan.github.io/RAG/

A Region Adjacency Graph is formed by drawing a line between the centroids, or centers, of each pair of adjacent superpixels in the clustered image above. Each line is then given a “weight” based on how similar the two superpixels it connects are. This graph doesn’t do anything for us on its own, other than look cool, but it does provide a framework for merging similar superpixels.

Once we have the RAG, we can work on merging superpixels. While generating the RAG, we assign a weight to the connection between each pair of adjacent clusters (how those weights are calculated is covered in the next section). Then we iterate over all the connections and compare each weight against an arbitrary threshold; if the weight is less than the threshold, we consider the two clusters to be the same color. Once that’s done, we choose a dominant color to represent the new mega-cluster (usually just the color of the cluster with the most surface area), delete the boundary between the two clusters, turning them into one, and assign the merged cluster a new centroid.

This process is applied to every connection, and it is done in cycles: once a single run has produced a set of mega-clusters, we run it again, and again, until no remaining clusters share a weighted connection under the set threshold. When we’re done, we should get something resembling the SLIC-clustered image above, but with larger and more clearly defined boundaries around each superpixel.
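Here is a minimal sketch of that SLIC-plus-RAG-merge pipeline using scikit-image (again an assumption on my part; note also that the stock rag_mean_color weights edges by plain Euclidean distance between mean region colors, whereas Photosynthernet swaps in the Lab*-based distance described in the next section — the merging mechanics are the same):

```python
from skimage import color, graph, io, segmentation
# Note: on older scikit-image releases the RAG helpers live under
# skimage.future.graph instead of skimage.graph.

image = io.imread("photo.jpg")
labels = segmentation.slic(image, n_segments=400, compactness=10, start_label=1)

# Build the Region Adjacency Graph: one node per superpixel, with edges
# weighted by the distance between the mean colors of adjacent superpixels.
rag = graph.rag_mean_color(image, labels, mode="distance")

# Merge every pair of adjacent regions whose edge weight falls under the
# threshold, repeating transitively until no such edges remain.
merged = graph.cut_threshold(labels, rag, thresh=30)

# Paint each merged region with its average color for a quick preview.
# (bg_label=-1 because cut_threshold can legitimately emit a region label 0.)
preview = color.label2rgb(merged, image, kind="avg", bg_label=-1)
io.imsave("rag_merged.png", preview.astype("uint8"))
```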

Original

SLIC Clustering

After RAG Merging

Calculating Color Distance

In the previous couple of sections I’ve mentioned “similar colors” and “weighted connections.” These are not determined through some magical unknown method; they are precise calculations done through a well-researched and well-known magical method called CIEDE2000.

CIEDE2000 is a color distance formula for determining the perceptual distance between two colors in Lab* color space. It takes things into consideration such as “Is this color viewed on textiles or on a screen? Is this color very light or very dark? Is it very blue?” and so on. Think of CIEDE2000 as playing the same role for Lab* color space that the basic Euclidean distance played for sRGB. Unfortunately for us, it is anything but simple. If you are really interested, I recommend starting at the Wikipedia page for color difference formulas and working from there. I will only be covering it at a high level and sharing what helped me during the implementation phase.

CIEDE2000 for determining the distance between two colors on a screen or print media. Not applicable to textiles. Diagram made by me! 😀

There is a lot to cover here, so I’ll keep it to the highlights and describe each primary phrase of the above formula; the full top-level expression is also reproduced just after the list. The diagram is structured with the full formula presented at the top and each phrase below it defining the components that make up that formula. At the bottom of the diagram are the two colors represented in Lab* color space (these are quickly converted to LCH* color space so we can use cylindrical coordinates).

  • a. This is the Lightness Phrase. It determines the difference in the lightness values of the two colors. It’s the simplest of all the phrases, combining only the squared difference between the L* values with a single compensation factor.
  • b. This is the Chroma Phrase. Most of this phrase consists of just converting Lab* to LCH* color space.
  • c. This is the Hue Phrase. This phrase consists of a lot of geometry to determine how close the hues of two colors are in the cylindrical LCH* color space.
  • d. This is the Hue Rotation Phrase. This phrase provides the compensation that is still necessary to achieve perceptual uniformity in our distance calculations.
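For reference, here is the top-level expression that those four phrases plug into: ΔL′, ΔC′, and ΔH′ are the lightness, chroma, and hue differences, the S and k values are the compensation weights, and R_T is the hue rotation term.

$$
\Delta E_{00} = \sqrt{\left(\frac{\Delta L'}{k_L S_L}\right)^{2} + \left(\frac{\Delta C'}{k_C S_C}\right)^{2} + \left(\frac{\Delta H'}{k_H S_H}\right)^{2} + R_T\,\frac{\Delta C'}{k_C S_C}\cdot\frac{\Delta H'}{k_H S_H}}
$$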

This formula was essentially implemented part for part from the Wikipedia article in our Python palette extraction service to serve the needs of our superpixel and RAG merging stages. I did find that the Python libraries I was using already included an implementation of CIEDE2000, but in the hue rotation phrase they were only rotating by 30 degrees instead of the 60 degrees outlined in the formal definition of this formula. I personally found better results by strictly following the 60-degree coefficient instead of the 30, so I decided to re-implement CIEDE2000 myself, both to get better results and as a learning exercise.
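Whichever implementation you end up with (a library’s or your own), the call pattern is the same: two Lab* colors in, one perceptual distance out. Here is a minimal sketch using scikit-image’s built-in deltaE_ciede2000, shown purely for illustration (as noted above, Photosynthernet ultimately uses its own re-implementation):

```python
import numpy as np
from skimage.color import rgb2lab, deltaE_ciede2000

# Three colors as 1x1 "images" so the conversion helper is happy.
navy  = rgb2lab(np.array([[[0.10, 0.10, 0.45]]]))
royal = rgb2lab(np.array([[[0.15, 0.25, 0.75]]]))
beige = rgb2lab(np.array([[[0.96, 0.91, 0.76]]]))

# The two blues should land much closer to each other than either does
# to the beige, which is exactly what drives the merging decisions.
print("navy vs royal:", deltaE_ciede2000(navy, royal))
print("navy vs beige:", deltaE_ciede2000(navy, beige))
```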

Lastly, Give Me That Palette!

So, finally, we have an image with better superpixels, and we have a superior distance formula for determining how similar two colors are. At this stage, we can go back to the basics discussed in Part 2.

First, just like before, we iterate over each superpixel; we then compare that superpixel with all the other superpixels, and if we find one whose distance is below our new threshold for “uniqueness”, we “merge” them. You might be asking, just as I did, why don’t we use the RAG merge for this and call it a day? Well, a RAG only merges region-adjacent nodes, but an image may have two clusters of the same color on opposite sides of the frame, a pair the RAG would never connect. So this final step is necessary to form a truly unique and accurate color palette.

Another difference here is that we start with an incredibly high threshold for similarity. If we assume that red is the polar opposite of green with a distance value of 100, then we set our threshold to somewhere around 75. We aim for a target of at least 4-6 unique colors, and if we find only 2, we lower that threshold in steps of ~2.5 and re-run this final palette selection step until we get there. This solves a disparity we were seeing in our algorithm between the palettes of visually “busy” images and visually “minimalist” ones. However, some images really do have only two primary colors, and the palette extraction algorithm should reflect that reality. So, to prevent the algorithm from lowering the threshold to 0 and creating a palette of 15 of the same blue, we set a lower bound of about 35. If the algorithm only finds 2 primary colors and the threshold is already 35, then we just accept that this image has 2 primary colors and no more.
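A rough sketch of that loop in Python follows (the names, structure, and exact numbers are mine for illustration, not Photosynthernet’s production code; `delta_e` stands for any CIEDE2000 implementation and `regions` for the merged superpixels with their Lab* colors and surface areas):

```python
# Illustrative sketch of the adaptive-threshold palette selection described
# above. `regions` is a list of (lab_color, surface_area) pairs left over
# after RAG merging; `delta_e` is any CIEDE2000 distance function.

def select_palette(regions, delta_e, target_min=4, start=75.0, floor=35.0, step=2.5):
    threshold = start
    while True:
        palette = []
        # Largest regions first, so the dominant color wins each comparison.
        for lab_color, _area in sorted(regions, key=lambda r: r[1], reverse=True):
            if all(delta_e(lab_color, kept) >= threshold for kept in palette):
                palette.append(lab_color)
        # Accept once we have enough distinct colors, or once the threshold
        # has bottomed out (some images genuinely only have 2 primary colors).
        if len(palette) >= target_min or threshold <= floor:
            return palette
        threshold -= step
```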

Wrapping Up!

Hopefully this blog post was as interesting to read as it was to write; but even more so, I hope it was informative. This concludes the primary three-part journey of building Photosynthernet; there will of course be more to come about Photosynthernet, such as how we deployed on Kubernetes, how we implemented WebSockets, how we managed the project, and more about the palette selection algorithm!

If you have any questions feel free to ask in the comments and I will try to improve this post with as many addendums as necessary to ensure it’s interesting and understandable.

Thanks for reading!

2023-03-09
