Making of Goethe’s Colours

By Nicholas Rougeux, posted on January 12, 2020 in Art, Web

Closeup of visualization

Figuring out how to put a new face on something old is never easy, and devising a new way to look at Goethe’s Theory of Colours was no exception. What started as a relatively simple idea turned out to be more complex than I expected, but the process was a good learning experience. The final result is fun too.

I’ve analyzed my fair share of antique texts and each time I find a new one, the prospect of looking at it in new ways is both daunting and exciting. I tried dozens of ideas for Goethe’s Colours before finally settling on one. As is often the case, I don’t know what works until I see it, so a lot of experimenting is needed.

In 1810, the German statesman and writer Johann Wolfgang von Goethe explored the psychology of colors and how they’re perceived by humans in Theory of Colours. While mostly rejected by the scientific community, it was embraced by and influenced philosophers and artists.

The book had come across my radar several times in doing research for other projects and this time around, it piqued my interest enough to explore ways of visualizing its contents. Considering the book is all about colors, I chose to focus on the colors themselves—specifically, which ones were mentioned and when. My assumption was that the book would discuss many colors and visualizing this could prove interesting. I don’t speak German so I chose to use the 1840 English translation available on both the Internet Archive and Project Gutenberg.

I was surprised to learn that while this was ultimately true, not as many colors were mentioned as I expected. Of the 95,000+ words spanning 471 pages, there were only 1,851 mentions of colors by name (e.g. “red,” “yellow,” etc.) and only 191 unique names used to reference these colors (e.g. “red,” “yellow-green,” “gold,” etc).

Finding colors

One of the first tasks for any data-driven project is to look at the raw data. Like many of my projects, I needed to generate these data myself. I used the text on Project Gutenberg as my primary source of data and since it was marked up in HTML my goal was to do the parsing in JavaScript to avoid reshaping the original data into another format.

I skimmed through the book and saw that common color names like “red,” “orange,” and “yellow” were used most frequently, so I started by simply searching for how often each occurred using a very basic regular expression in JavaScript:

red[\. ]|orange|yellow|green|blue|purple|brown|black|white|grey

The red[\. ] was an initial attempt at catching variations on its usage such as, “redder,” “reddish,” or anything else with the word “red” in it. This had its own issues but this first test was a good glimpse into the volume and variety of data available.
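As a rough sketch of that first pass (not the code used for the project), the basic expression can be run against a string with `matchAll` to tally each name. The `countColors` function and the sample sentence are made up for illustration.

```javascript
// Count how often each basic color name appears in a block of text.
// The pattern mirrors the basic expression above; the sample text is invented.
const basicColors = /red[\. ]|orange|yellow|green|blue|purple|brown|black|white|grey/gi;

function countColors(text) {
    const counts = {};
    for (const match of text.matchAll(basicColors)) {
        // Trim the trailing space or period captured by red[\. ]
        const name = match[0].toLowerCase().replace(/[\. ]$/, "");
        counts[name] = (counts[name] || 0) + 1;
    }
    return counts;
}

const sample = "The red light turns yellow, then red. A blue shadow is inferred.";
// "inferred." ends in "red." so it is counted as red too
console.log(countColors(sample)); // { red: 3, yellow: 1, blue: 1 }
```

The false positive on “inferred” is a small example of the kind of issue that made a more careful expression necessary.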

Early visualization attempt
Screenshot of the first view sequence of basic colors

An early picture of the data using circles to represent the sequence of basic color names as they appeared in the book

Determined to find all the colors with a regular expression rather than manually finding each one, I started trying out other names for colors and variations on names to see what else appeared. I found many others but there was no getting around the need for a manual review. I couldn’t help but think that some text analysis and machine learning would have made the process more efficient and interesting but, lacking the knowledge of how to get that working, I forged ahead with the manual approach. I eventually settled on the following long-winded regular expression. It could probably have been streamlined more but it did the job for what I needed.

(
    (
        (
            blood
            |bright
            |burnt
            |citron
            |dark
            |deep
            |dull
            |emerald
            |hyacinth
            |light
            |nearly
            |pale(st)?
            |pearly
            |prussian
            |pure
            |quiet
            |sea
            |spanish
            |silver
            |subdued
            |(?<![a-z])sky
            |vegetable
            |vivid
            |warm
        )
        (e(r|st))?
        (<span class=\"pagenum\">.*<\/span>)?
        ( |-|\n|\r)?
    )?
    (
        bianchezza
        |bianco
        |black(er|ish|ness)?
        |bleu
        |blue
        |bluer
        |bluish
        |brown(er|ish)?
        |carmine
        |copper
        |coral
        |cramoisi
        |crimson
        |cyaneum
        |(flavum( |-|\n|\r)saturum)
        |flesh-colour
        |(florido( |-|\n|\r)flavo)
        |gelb(en)?
        |(?<![a-z])gold(en)?
        |green(er|ish)?
        |indigo
        |nero
        |orange(ish|r)?
        |peach-blossom
        |purple(ish|r)?
        |purpur(?![a-z])
        |(?<![a-z])red(d(er|ish|en(ed|ing)?)|ness)?(?!uc(e(s|d)|ction)?)
        |(?<![a-z])rose(?![a-z])
        |rouge
        |rubescentem
        |rubra
        |ruby
        |scarlet
        |topaz
        |vermilion
        |violet(er|ish)?
        |white(ish|ness|r|st)?
        |yellow(er|ish)?
        |grey(er|ish)?
    )
    (( |-|\n|\r)?
        (
            bianchezza
            |bianco
            |black(er|ish|ness)?
            |bleu
            |blue
            |bluer
            |bluish
            |brown(er|ish)?
            |carmine
            |copper
            |coral
            |cramoisi
            |crimson
            |cyaneum
            |(flavum( |-|\n|\r)saturum)
            |flesh-colour
            |(florido( |-|\n|\r)flavo)
            |gelb(en)?
            |(?<![a-z])gold(en)?
            |green(er|ish)?
            |indigo
            |nero
            |orange(ish|r)?
            |peach-blossom
            |purple(ish|r)?
            |purpur(?![a-z])
            |(?<![a-z])red(d(er|ish|en(ed|ing)?)|ness)?(?!uc(e(s|d)|ction)?)
            |(?<![a-z])rose(?![a-z])
            |rouge
            |rubescentem
            |rubra
            |ruby
            |scarlet
            |topaz
            |vermilion
            |violet(er|ish)?
            |white(ish|ness|r|st)?
            |yellow(er|ish)?
            |grey(er|ish)?
        )
    )?
)

This complicated mess allowed me to find all the names and their variations like “whiter” and “whiteness,” including hyphenated names like “yellow-green” or “red-orange,” even if they were interrupted by page markers or included a modifier like “light” or “dark.”

In developing the way to detect “red” but not words containing “red” like “inferred” or “coloured,” I used a negative lookbehind but learned that it didn’t work in Firefox which is my primary browser. I found this thread on Stack Overflow from April 2018 that mentions it only works in the latest versions of Chrome.
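For browsers without lookbehind support, one common alternative (not what I used here) is to capture the character before the word in a group instead of asserting it. The sketch below is a simplified, hypothetical pattern for “red” only, and `findRed` is an illustrative name.

```javascript
// Sketch of matching "red" without a lookbehind: the character before the word
// is captured in group 1 instead of being asserted with (?<![a-z]).
const redWithoutLookbehind = /(^|[^a-z])(red(?:d(?:er|ish)|ness)?)(?![a-z])/gi;

function findRed(text) {
    const found = [];
    for (const match of text.matchAll(redWithoutLookbehind)) {
        found.push({
            word: match[2],
            // Skip past the captured leading character to get the word's position
            index: match.index + match[1].length
        });
    }
    return found;
}

// Finds "reddish" and "red" but skips "inferred" and "coloured"
console.log(findRed("A reddish hue was inferred from the coloured glass, not red."));
```

The trade-off is that the match itself includes the leading character, so the position has to be adjusted by hand as shown.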

Despite all my efforts to develop a comprehensive regular expression, I still had to flag a handful of words to ignore when doing the final parsing because even though they matched the expression, they weren’t used as a way to name a color. For example, take the following passage from page 47:

Screenshot of passage from page 47
Screenshot of passage from page 47 with color words highlighted

In this example, “carmine” is used as the name of a pigment material and the first instance of “rose” references a flower. Neither is used as a color name, so both were ignored. All other colors are used as adjectives or nouns as color names.

In addition to finding the colors, I also calculated their exact position in the text which I used in many of the early design iterations but not in the final result. A handful of names were also not translated from the original German text and remained in either German, Italian, or Latin. Google Translate was used to approximate a best guess for the English equivalent.
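A minimal sketch of how a position could be derived, assuming it is measured as the character offset of each match relative to the length of the whole text. The `colorPositions` function and its field names are hypothetical.

```javascript
// Record where each match falls in the text, both as a character index
// and as a percentage of the way through the text.
function colorPositions(text, pattern) {
    const positions = [];
    for (const match of text.matchAll(pattern)) {
        positions.push({
            name: match[0].toLowerCase(),
            index: match.index,
            percent: (match.index / text.length) * 100
        });
    }
    return positions;
}

// A 20-character sample: "blue" starts at index 5, "green" at index 15
console.log(colorPositions("aaaa blue aaaa green", /blue|green/gi));
```

A percentage like this is what a layout such as “a blue color 2.3% of the way through the book” would need.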

Screenshot of data

Once the raw data were collected, the final preparation task involved assigning actual colors to the text. Since Goethe did not provide visual examples of each, some creative license was taken to devise a palette. Robert Ridgway’s Color Standards and Color Nomenclature from 1912 was used as a basis for this palette.

Whenever possible, I matched the language used by Goethe to the names developed by Ridgway and sampled hex values based on the match. Not all names matched so I used some creative license to determine the rest by what I thought was a close match.

Highlighted swatches
Some swatches sampled from Ridgway’s Color Standards and Color Nomenclature. Not all swatches pictured.

With all the data collected, next came the visuals.

Designing iterations

My goal with this project was to design something that showcased all the colors Goethe mentioned at a glance in a colorful way to serve as a new way of looking at Goethe’s work. Many ideas sounded good in theory but ended up falling flat because the final result diminished the vibrancy of the colors in several ways.

Presented here are several of the dozens of iterations in the order that I developed them as a kind of timeline of my experimentation. They’re rough and not polished but saving and reviewing them can be a useful way to know what doesn’t work or get inspiration for future projects.

NodeBox was my tool of choice to create all of these.

Design concept
Circles representing the sequence of basic color names as they appeared in the book
Design concept
Treemap of color names grouped by generic name.
Design concept
Beeswarm plot of color names and when they occur in the book grouped by generic name
Design concept
The text of the book was divided into 100 equal parts represented by the 100 horizontal lines (e.g. 0–1%, 1–2%, etc.) and circles were placed precisely where each color appears in the book (e.g. a blue color 2.3% of the way through the book)
Design concept
Stripes of hash marks for each generic color positioned horizontally based on where they appear in the book
Design concept
Updated version of circles representing the sequence of colors using the Ridgway palette
Design concept
Concentric circles of generic colors comprising lines of each color where the angle is based on where they appear in the book
Design concept
Stacked horizontal bar chart of color frequency
Design concept
Ridgeline-like plot of the gaps between when colors occur grouped by generic name
Design concept
Stacked bar chart of which colors appear per page
Design concept
Spiral representing the text of the entire book with colored circles placed based on their position in the book
Design concept
Spiral of the sequence of colors as they appear in the book
Design concept
Same spiral as the previous design but with a stacked horizontal bar chart of color frequency with lines connecting each circle in the spiral to its corresponding color in the bar chart
Design concept
Updated treemap of color names grouped by generic name using the Ridgway palette
Design concept
Radar scatter plot of the colors mentioned. Circle size = total mentions, position around the center = hue, distance from the center = saturation value.
Design concept
Horizontal and vertical lines positioned based on where they appear in the book. Vertical lines were positioned horizontally based on where the colors appear in the first half of the book and horizontal lines were placed vertically for the second half.

Up to this point, I had focused mostly on data for each color as it related to the book as a whole. The results were always messy and none piqued my interest to explore as a final design. However, the radar scatter plot did prove somewhat interesting and I considered making a poster from it. I shifted focus to examine color usage per page, which led to the following iterations and eventually the final result.

Design concepts
Color usage by page where horizontal stripes represent each color used. The fewer colors used on a page, the thicker the stripes are. Left: stripes per page in the order they appeared on the page, Right: stripes reordered by spectrum order (red, orange, yellow, etc).

This design concept would eventually be the basis for the final result but using circles instead of stripes. Comparing these two iterations is a good example of intent versus appeal. When viewing the stripes in the order they appear on each page from a distance, the colors appear muddy as they blend together. Reordering them by spectrum order (red, orange, yellow, etc.) improves their vibrancy. The former may have the added layer of meaning but the latter is more appealing.

Design concept
Color usage by page represented by a single pattern of diagonal stripes where only the stripes of the colors that appear on the page are shown.
Design concept
Color usage by page using pixel-wide horizontal stripes positioned vertically on each page based on where they appeared on that page (earlier toward the top, later toward the bottom)
Design concepts
Radar plots of how many colors appeared on each page in separate plots (left) and layered with curved lines (right)
Design concept
Color usage by page represented by a single three-by-three square pattern where only the pixels of the colors that appear on the page are shown.
Design concept
Color usage by page represented by a single horizontal striped pattern where only the stripes of the colors that appear on the page are shown.
Design concept
Color usage by page where circles represent each color used. The fewer colors used on a page, the larger the circles are. Circles are arranged on each page using a circle packing algorithm to maximize their size.

The first time I saw this last design, I knew it was worth exploring. All the colors were visible and vibrant and the layout was varied enough that it made me want to explore everything.

I used white and black backgrounds for all the iterations up to this point but both colors presented issues with seeing all the colors clearly. With a white background, the lighter colors were difficult to see and vice versa with a black background. By switching to a grey background, and adding a subtle border around each circle, I was able to find a sweet spot that didn’t interfere with the grey circles or other colors like yellow.

The position of the circles was based on a packing algorithm that fits a given number of equally sized circles into a square. Years ago, I discovered Packomania, which has tons of packing diagrams available for download as PDFs including circles packed in squares, rectangles of varying ratios, triangles, and more. I used these diagrams rather than developing my own algorithm. The maximum number of colors on any given page was 33 so I only needed the first 33 diagrams.

However, just importing these diagrams as-is still resulted in some muddy colors from a distance. Fortunately, NodeBox has a useful option to sort shapes by angle. Using that and sorting the color data per page in spectrum order produced pleasantly colorful results without looking muddy.
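The sorting itself was done in NodeBox, but the idea of spectrum order can be sketched in JavaScript by sorting hex values by hue. This assumes “spectrum order” means ascending hue starting at red; the function names and the four primary test colors are illustrative and are not from the Ridgway palette.

```javascript
// Convert a hex color like "#ff0000" to its hue in degrees (0-360).
function hexToHue(hex) {
    const n = parseInt(hex.slice(1), 16);
    const r = ((n >> 16) & 255) / 255;
    const g = ((n >> 8) & 255) / 255;
    const b = (n & 255) / 255;
    const max = Math.max(r, g, b);
    const min = Math.min(r, g, b);
    const delta = max - min;
    if (delta === 0) return 0; // greys have no hue; treated as 0 here
    let hue;
    if (max === r) hue = ((g - b) / delta) % 6;
    else if (max === g) hue = (b - r) / delta + 2;
    else hue = (r - g) / delta + 4;
    return (hue * 60 + 360) % 360;
}

// Return a new array ordered red, orange, yellow, green, blue, and so on.
function sortBySpectrum(hexColors) {
    return [...hexColors].sort((a, b) => hexToHue(a) - hexToHue(b));
}

console.log(sortBySpectrum(["#0000ff", "#ffff00", "#ff0000", "#00ff00"]));
// [ '#ff0000', '#ffff00', '#00ff00', '#0000ff' ]
```

Greys, blacks, and whites have no meaningful hue, so a real palette would need a separate rule for where to place them.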

Design concepts
Comparison of not sorting by color and angle (left) and sorting (right)

While I polished this design, I found myself repeatedly wondering what colors were mentioned on any given page because the pattern of circles was interesting. I initially only intended to make a data-driven poster and not an interactive version but this led me to want to make one because I knew I wouldn’t be the only one wondering this.

Building experiences

Since I already had color data in a structured format to create the visuals, using that to build out an interactive version seemed straightforward and for the most part, it was—except for a few challenges that popped up.

I opted for a simple approach using plain HTML/CSS to create the visuals but one catch was that I needed to position the circles to match the diagrams from Packomania and I didn’t feel like generating these myself using something like d3. To do this, I needed three parts for each circle: coordinates (x and y) and a radius. NodeBox wasn’t designed for raw data manipulation but it has a useful feature to export to a CSV so I used it to create a dataset of what I needed by detecting the coordinates of each SVG circle and combining that with a list of radii from Packomania for the first 33 packing diagrams.
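To illustrate how those three values could translate into plain CSS, here is a minimal sketch. It assumes the coordinates and radius are normalized to a unit square with the origin at the top left, which may not match the exported data, and `circleStyle` is a hypothetical helper.

```javascript
// Turn a packed circle (center x, y and radius r, all in 0-1 units)
// into absolute-positioning styles for a square page of a given pixel size.
function circleStyle(circle, pageSize) {
    const diameter = circle.r * 2 * pageSize;
    return {
        position: "absolute",
        // CSS positions the top-left corner, so offset the center by the radius
        left: `${(circle.x - circle.r) * pageSize}px`,
        top: `${(circle.y - circle.r) * pageSize}px`,
        width: `${diameter}px`,
        height: `${diameter}px`,
        borderRadius: "50%"
    };
}

console.log(circleStyle({ x: 0.25, y: 0.75, r: 0.25 }, 200));
```

In a browser, the returned object could be applied to an element with `Object.assign(element.style, circleStyle(circle, size))`.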

Design concepts
Screenshot of NodeBox setup that produced packing data

The interactive experience is powered by four sets of data:

  • Colors: Every mention of colors in the book
  • Packs: Packing coordinates for positioning circles on each page
  • Pages: Roman numerals for the first 48 pages
  • Sections: Titles of each section of the book (introduction, parts, etc.)

Using these datasets, I was able to replicate what I had created in NodeBox for the poster and enhance it with the ability to click a page to see the colors mentioned in context.

Design concepts
Screenshot of a popup showing colors mentioned on page 48

Clicking on a page to show the colors in context loads the full text from Project Gutenberg with all the colors highlighted. The process of highlighting the colors was a little slow due to the initial detection process and since that process used regular expressions that only worked in Chrome, I chose to regenerate the HTML with the highlighted colors so they weren’t detected each time the popup opened.
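The regeneration step amounts to wrapping each match in markup once and saving the result. A simplified sketch, assuming plain text input and a hypothetical `color` class name:

```javascript
// Wrap every match of the pattern in a span so it can be styled with CSS.
// The pattern must use the g flag so all matches are replaced.
function highlightColors(text, pattern) {
    return text.replace(pattern, (match) => `<span class="color">${match}</span>`);
}

console.log(highlightColors("a blue sky and red rose", /blue|red/g));
// a <span class="color">blue</span> sky and <span class="color">red</span> rose
```

Running this against real HTML rather than plain text would need extra care so that matches inside tags or attributes are not wrapped.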

I ran into an interesting quirk with this popup in Safari and mobile Safari. The popup contains an iframe which loads a static HTML file. My initial idea was to load the file with a hash to jump to a page anchor that already existed:

<iframe src="theory-of-colours.html#Page_50"></iframe>

However, each time the iframe’s source changed, Safari and mobile Safari scrolled the parent window down seemingly random amounts, thereby causing the page that was clicked to scroll out of frame. After a few headaches and unsuccessful searches, I found that using a querystring and then some extra code on the page loaded into the iframe to parse that and scroll to the desired position worked without affecting the parent window.

<iframe src="theory-of-colours.html?Page_50"></iframe>

This extra code also had to hijack the other anchor links on the page that linked to other parts like footnotes and other sections. I admit it’s a strange workaround but it works.

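// Intercept clicks on links inside the iframe (footnotes, other sections)
$("a").on("click touch", function(){
    // Hand the link's href to newScroll instead of following it
    newScroll($(this).attr("href"));
    // Cancel the browser's default navigation
    return false;
});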
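For the query string part of the workaround, the code on the page inside the iframe could look something like the following sketch. It is not the project’s actual code: `anchorFromQuery` and `scrollToAnchor` are hypothetical names, and it assumes the page anchors can be found by `id`.

```javascript
// Extract the anchor name from a query string such as "?Page_50".
function anchorFromQuery(search) {
    return decodeURIComponent(search.replace(/^\?/, ""));
}

// Scroll the given document to the named anchor, if it exists.
// Returns true when a target was found.
function scrollToAnchor(doc, name) {
    const target = doc.getElementById(name);
    if (target) target.scrollIntoView();
    return Boolean(target);
}

// Only run in a browser, where window and document exist.
if (typeof window !== "undefined" && typeof document !== "undefined") {
    scrollToAnchor(document, anchorFromQuery(window.location.search));
}
```

Because the scrolling happens from inside the iframe’s own document, the parent window’s scroll position is left alone.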

Finally, I added a simple chart at the bottom of the page showing the total times each color was mentioned with the ability to drill down to see which names were used.

Design concepts
Screenshots of the chart showing how often all generic (left) and orange names (right) were used
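The data behind a chart like this can be sketched as a tally grouped by generic color with a nested count per name for the drill-down. The `generic` and `name` fields and the sample mentions are hypothetical and may not match the structure of the real dataset.

```javascript
// Group mentions by generic color and count each specific name within it.
function tallyByGeneric(mentions) {
    const tally = {};
    for (const { generic, name } of mentions) {
        if (!tally[generic]) tally[generic] = { total: 0, names: {} };
        tally[generic].total += 1;
        tally[generic].names[name] = (tally[generic].names[name] || 0) + 1;
    }
    return tally;
}

// Invented sample data for illustration only
const mentions = [
    { generic: "orange", name: "orange" },
    { generic: "orange", name: "red-orange" },
    { generic: "orange", name: "orange" },
    { generic: "blue", name: "sky-blue" }
];
console.log(tallyByGeneric(mentions));
```

The `total` values would drive the top-level bars and the `names` object the drill-down for a single color.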

Final thoughts

I began this project thinking it would be a fun, simple project to see how often colors were referenced in a book all about them. As usual, it became much more involved with many design ideas and programming challenges. A few headaches were encountered along the way but I’m pleased with the final result and hope others are as well.

Explore the project »
