The colour summarizer will produce descriptive colour statistics for an image. It reports the average, median, minimum and maximum of each component of RGB, HSV, LCH and Lab (CIELAB, L*a*b*). Average hues are calculated using the mean of circular quantities.
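Hue is an angle, so a plain arithmetic average fails near the 0°/360° wrap-around. The sketch below shows the general idea of the mean of circular quantities: treat each hue as a unit vector, sum the vectors and take the angle of the result. The function name `circular_mean_degrees` is made up for illustration and this is not necessarily how the summarizer implements it.

```python
import math

def circular_mean_degrees(hues):
    """Mean of angles given in degrees, returned in the range [0, 360)."""
    # Sum the unit vectors that point in each hue direction.
    sin_sum = sum(math.sin(math.radians(h)) for h in hues)
    cos_sum = sum(math.cos(math.radians(h)) for h in hues)
    # The angle of the summed vector is the circular mean.
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0

hues = [350.0, 30.0]                  # two hues on either side of red
naive = sum(hues) / len(hues)         # 190.0, which lands in the cyan/blue range
circ = circular_mean_degrees(hues)    # approximately 10.0, an orange-red

print(naive, round(circ, 6))
```

Note that the circular mean is ill-defined when the vectors cancel out (for example, hues of 0° and 180°), so a real implementation would need to decide what to report in that case.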
Some of the questions the summarizer will answer are
You will be able to download the results of your analysis: your original file, cluster partition images and text/JSON/XML reports. These will be archived (.tgz, readable with any archive manager) and organized under a randomly named subdirectory, which will be different for each analysis.
Pixels are clustered (grouped) based on their color similarity — pixels with similar colors are more likely to be grouped together. Here, color similarity isn't merely based on hue — it takes into account all aspects of a color: luminance, chroma and hue.
You'll be asked to set the number of clusters — the rest of the process is automated.
Clustering is done using k-means. The process starts by randomly assigning pixels to clusters. Next, it calculates the color center of each cluster (the average color of its pixels) and reassigns each pixel to the cluster whose color center is closest to the pixel's color. This is repeated many times until the "error" (taken to be the color distance between pixels and the center of their corresponding cluster) is sufficiently low.
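The steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the summarizer's actual code. It assumes pixels are 3-tuples in some color space, uses squared Euclidean distance as the color distance, and stops when assignments no longer change (rather than at an error threshold). The `seed` and `max_iter` parameters and the handling of empty clusters are my own additions.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two colors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(pixels, k, max_iter=100, seed=None):
    rng = random.Random(seed)
    # Step 1: randomly assign every pixel to one of k clusters.
    labels = [rng.randrange(k) for _ in pixels]
    centers = []
    for _ in range(max_iter):
        # Step 2: the center of each cluster is the average color of its pixels.
        centers = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                # Assumption: an empty cluster is re-seeded with a random pixel.
                centers.append(tuple(rng.choice(pixels)))
        # Step 3: reassign each pixel to the cluster with the closest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in pixels]
        if new_labels == labels:   # nothing moved, so we have converged
            break
        labels = new_labels
    return labels, centers

pixels = [(250, 0, 0), (255, 5, 5), (0, 0, 250), (5, 5, 255), (0, 240, 10)]
labels, centers = kmeans(pixels, k=3, seed=1)
print(labels, centers)
```

Fixing `seed` makes a run repeatable. Leaving it as `None` gives a different random start each time, which is why repeated runs can produce different clusters.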
How many clusters should you ask for? This depends on your use. If the image is a technical drawing with only 3 colors, then use 3 clusters. For photos, start with 5 clusters and adjust as needed.
Because the method is initialized with random assignment, each time you run it, you'll get a different result. In some cases the differences are negligible and in some cases they can be very large.
The method works well on images with relatively well-defined color boundaries. It works less well on images with gradients that transition across a large range of colors (in hue, brightness and saturation).
If you want to find specific colors or snap colors to reference colors then clustering isn't for you. Instead, use my
LCH is the perceptually uniform equivalent of HSV, and defines colors using intuitive and perceptually-based luminance (perceived brightness), chroma (richness) and hue. If you are doing any kind of image analysis, it's likely that LCH will be much more useful to you than HSV or RGB.
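To make the relationship between RGB, Lab and LCH concrete, here is a sketch of the standard conversion from 8-bit sRGB to LCH by way of CIELAB. It assumes the sRGB transfer function and a D65 white point. The summarizer may use a different white point or library, so its numbers may not match these exactly. The function name `srgb_to_lch` is hypothetical.

```python
import math

def srgb_to_lch(r, g, b):
    """Convert 8-bit sRGB to (L, C, H) via CIE XYZ and CIELAB (D65 white)."""
    def linearize(c):
        # Undo the sRGB gamma encoding.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB to CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        # CIELAB companding function: cube root with a linear toe.
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    lightness = 116.0 * fy - 16.0
    a_star = 500.0 * (fx - fy)
    b_star = 200.0 * (fy - fz)

    # LCH is Lab in polar form: chroma is the radius, hue is the angle.
    chroma = math.hypot(a_star, b_star)
    hue = math.degrees(math.atan2(b_star, a_star)) % 360.0
    return lightness, chroma, hue

print(srgb_to_lch(255, 0, 0))
```

The last two lines of the function are the whole of the Lab-to-LCH step: the L component is shared, and C and H simply re-express a* and b* as a radius and an angle.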
To learn about LCH, see my presentation about color spaces and perceptual uniformity.
If you are a data geek, you'll be happy to know that XML or plain-text API output of the image statistics now includes RGB and HSV histograms, as well as individual pixel values. Munge away!
The purpose of this utility is to generate metadata that summarizes an image's colour characteristics for inclusion in an image database, such as Flickr. In particular this tool is being used to generate metadata for Flickr's Color Fields group.
A little car—lonely, broken and in Havana. I know that feeling.
Here's how the color summarizer describes this image:
altitude antidote aqua beau black blue botticelli bronze brown cod columbia cork dark derby desert drought dust eighth escape geebung goldenrod grey half joss judge jungle kabul malta mash medium metallic millbrook mist moleskin nullarbor pale paperback parchment pizza rich rickshaw road rock rocky rodeo smoky soho sweetwaters triple yellow ziggurat
If you feed in an image with all the colors, you'll get arbitrary and irreproducible clusters. Below is a Granger rainbow grouped into 6 clusters.
By asking for 6 clusters we got: (1) everything that's bright and desaturated, (2) everything that's dark, (3) purple/blues, (4) bright saturated green/yellows, (5) reds and (6) saturated dark greens.
Image statistics are computed using every pixel in the image — analysis of images in which the background is dominant will be skewed by the background color.
Specifically, statistics for product images (e.g. an item photographed on a white background) can be difficult to interpret.
The answer to this is to mask areas (e.g. white, or close to white) of the image before computing the statistics, but this is not implemented.
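Since masking is not built in, you could do it yourself before computing statistics. The sketch below is one possible approach and not part of the summarizer. It assumes pixels are 8-bit RGB tuples and treats a pixel as background when every channel is at or above an arbitrary `threshold` of 240. Both function names are hypothetical.

```python
from statistics import mean, median

def mask_near_white(pixels, threshold=240):
    """Drop pixels whose R, G and B values are all at or above the threshold."""
    return [p for p in pixels if not all(ch >= threshold for ch in p)]

def channel_stats(pixels):
    """Mean, median, min and max for each RGB channel."""
    stats = {}
    for name, values in zip("RGB", zip(*pixels)):
        stats[name] = {
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
    return stats

# A toy "product photo": six white background pixels and two red product pixels.
pixels = [(255, 255, 255)] * 6 + [(200, 30, 40), (180, 20, 30)]

all_stats = channel_stats(pixels)                       # skewed by the background
product_stats = channel_stats(mask_near_white(pixels))  # product pixels only

print(all_stats["G"]["mean"], product_stats["G"]["mean"])
```

A simple threshold like this will also remove genuinely white parts of the product, so the cutoff needs tuning per image set.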