Metameric Failure Test with Two Reference Instruments


Update: 03/25/2018

For this article, I want to do something a little different. I just obtained a new Colorimetry Research CR-300, so I thought that this offered a good opportunity to test something about which I have heard a lot of anecdotal evidence, but had never experienced directly: metameric failure.

What is Metameric Failure?

Metameric colors are two colors with dissimilar spectral distributions that are nonetheless perceptually the same. This occurs because the human eye is a tristimulus receptor: it reduces any spectral distribution to just three signals, so many different spectra can produce the same sensation. Thus, metameric colors look the same but have different spectral distributions. This is normal. What is not normal is metameric failure, which occurs when two colors measure the same but appear different to the human eye. The most common source of metameric failure is differences in color perception across individuals. We simply do not all process color in exactly the same way, the most extreme example being color blindness.

However, there has been a lot of discussion recently, among those who regularly engage in color measurements, about metameric failure due not to differences in human vision but to inherent limitations of the instruments we use to measure color. The most disturbing aspect of this for those who rely upon accurate color measurements is that high-end spectroradiometers are said to be no less susceptible to this problem.

The Test

This is precisely the thesis I wished to test. In so doing I wanted to answer two questions:

1. Is there instrument metameric failure when measuring W-OLED, CCFL LCD, and LED LCD? I will use the CR-300 to test this hypothesis. The CR-300's exceptional optical resolution of 2nm makes it an ideal instrument for this sort of test. It offers twice the optical resolution of most reference instruments, such as the Photo Research 650/670, Minolta CS2000, and Colorimetry Research's own CR-250. The CR-300 operates, as far as I can tell, exactly the same as the CR-250 and it even looks like a CR-250, except that it is much larger and heavier. For those interested in the operation of this instrument, read the previous article on the CR-250.

2. Are there any significant measurement differences between the CR-300 and the JETI 1211 that might account for instrument metameric failure in one but not the other?

The setup. From left to right CCFL LCD, W-OLED, LED LCD

The methodology for this test is quite simple.

  1. In a dark room I will calibrate a 2016 LG W-OLED (OLED55B6P) to as close to D65 as possible.
  2. I will visually inspect the two LCDs before calibration. I know that they are close to D65, but I want to see if my eyes are sufficiently sensitive to accurately report relatively small differences without measurements.
  3. I will calibrate the LCDs to as close to D65 as possible. This will include adjusting the light output of all three displays so they are as close to equally bright as is practically possible.
  4. I will visually inspect all three displays. Depending on the results of (2), if there is any instrument metameric failure—measuring the same, but looking different—I should see it.
  5. Finally, I will remeasure the three displays with the JETI 1211 to determine whether its results correspond to those I obtained with the CR-300. This will serve to test the accuracy and reliability of reference instrumentation (if two presumably reference instruments measured significantly differently, that would be troubling). It will also expand the instrument metameric failure test to more than just one reference spectro.

The Results

The two uncalibrated LCDs did not visually appear to have the same white point. In particular, the LED LCD appeared bluer to my eyes. When I measured them, the measurements bore out my subjective assessment. The CCFL LCD (Samsung LN-32B650) was a nearly perfect D65 (x0.3114, y0.3283, 72.64 nits/R99.5%, G100.1%, B100.4%), but the LED LCD (Samsung UN-32B600) was too blue (x0.3123, y0.3205, 74.28 nits/R102.7%, G98.5%, B105.3%). This verified that my eyes were sufficiently sensitive to identify relatively small differences in the white point. If differences are present, then I should be able to see them without instrumentation.

After calibrating the LCDs and adjusting the light output, they measured very nearly the same.

  • CCFL: x0.3114, y0.3283, 72.64 nits/R99.5%, G100.1%, B100.4%
  • LED: x0.3136, y0.3290, 73.39 nits/R100.6%, G99.8%, B99.8%

Next, I matched the luminance of the W-OLED. It had been previously calibrated to x0.3136, y0.3293, 73.14 nits/R100.4%, G99.9%, B99.8%.

Finally, I visually inspected all three displays simultaneously in both a dark and well-lit room.

To my eyes the white patches looked identical. I saw no difference whatever. Clearly, if instrument metameric failure is a thing, it does not appear with these families of displays.

Next, I repeated all of my measurements. I did this because, although I was not seeing instrument metameric failure, I did see the displays drift somewhat. This is something I have seen many times before. Commercial displays are just not perfectly stable. The CCFL was the worst offender. Interestingly, although the W-OLED loses brightness rapidly when displaying a static test pattern for more than about a minute, once it is woken out of its luminance dive its color and luminance remain the most stable of the three.

Finally, I measured the same three displays using the JETI 1211. Here are the results.

CR-300        x        y        cd/m²
  CCFL        0.3134   0.3307   74.36
  LED         0.3130   0.3289   71.76
  W-OLED      0.3131   0.3291   74.32

JETI 1211     x        y        cd/m²
  CCFL        0.3128   0.3309   72.01
  LED         0.3144   0.3306   69.68
  W-OLED      0.3139   0.3298   69.07

JETI variation from CR-300
              x        y        cd/m²    %
  CCFL        0.0006   0.0002   2.35     3.2%
  LED         0.0014   0.0017   2.08     2.9%
  W-OLED      0.0008   0.0007   5.25     7.1%

CR-100 W-OLED luminance: 74.17 cd/m²

Clearly, there was no significant difference between the white points measured by the CR-300 and the JETI 1211. The specification for both instruments is xy ± 0.0015, which means that their readings could differ by as much as xy0.003 with both still within spec. As it turned out, the largest difference was xy0.0017 and the average difference was xy0.0009, which is phenomenal consistency between two instruments, just what you would expect of reference devices.
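For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the xy differences from the readings above. It assumes the first, unlabeled set of readings is the CR-300's (which is what the "JETI variation from CR-300" figures imply); the variable and function names are mine, not anything from the instruments' software.

```python
# Chromaticity (x, y) readings from the results above. Treating the first,
# unlabeled set as the CR-300's is an assumption based on the
# "JETI variation from CR-300" figures.
CR300_XY = {"CCFL": (0.3134, 0.3307), "LED": (0.3130, 0.3289), "W-OLED": (0.3131, 0.3291)}
JETI_XY = {"CCFL": (0.3128, 0.3309), "LED": (0.3144, 0.3306), "W-OLED": (0.3139, 0.3298)}

XY_SPEC = 0.0015          # stated tolerance for each instrument
WORST_CASE = 2 * XY_SPEC  # two in-spec instruments can disagree by this much


def xy_differences(ref, other):
    """Return {display: (|dx|, |dy|)}, rounded to four decimal places."""
    return {
        name: (
            round(abs(ref[name][0] - other[name][0]), 4),
            round(abs(ref[name][1] - other[name][1]), 4),
        )
        for name in ref
    }


differences = xy_differences(CR300_XY, JETI_XY)
flat = [d for pair in differences.values() for d in pair]
largest = max(flat)
average = round(sum(flat) / len(flat), 4)

for name, (dx, dy) in differences.items():
    print(f"{name:7s} dx={dx:.4f} dy={dy:.4f}")
print(f"largest={largest:.4f} average={average:.4f} "
      f"within worst case: {largest <= WORST_CASE}")
```

The largest and average figures it reports agree with the 0.0017 and 0.0009 quoted above.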

There was one area of significant difference, and that was luminance. The JETI's luminance measurement was noticeably and consistently lower than the CR-300's. However, since the luminance spec for both is only ± 2%, they could differ by as much as 4% and still be within tolerances. In the case of the OLED, where the difference was 7.1%, they were not even within this generous tolerance. When I remeasured the W-OLED with the CR-100 colorimeter (74.17 cd/m², against 74.32 for the CR-300 and 69.07 for the JETI), it became clear that the problem was with the JETI. Its luminance calibration is clearly off and needs to be redone.
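The luminance comparison can be sketched the same way. Again, this is only an illustration using the readings above, with the first set of readings assumed to be the CR-300's and all names my own.

```python
# Luminance readings (cd/m²) from the results above. Treating the first,
# unlabeled set as the CR-300's is an assumption.
CR300_NITS = {"CCFL": 74.36, "LED": 71.76, "W-OLED": 74.32}
JETI_NITS = {"CCFL": 72.01, "LED": 69.68, "W-OLED": 69.07}
CR100_WOLED_NITS = 74.17

LUMINANCE_SPEC_PCT = 2.0                  # stated tolerance for each instrument
WORST_CASE_PCT = 2 * LUMINANCE_SPEC_PCT   # two in-spec instruments can differ by this


def percent_below(reference, reading):
    """How far 'reading' falls below 'reference', as a percentage of 'reference'."""
    return (reference - reading) / reference * 100.0


# JETI shortfall relative to the CR-300, per display.
shortfall = {
    name: round(percent_below(CR300_NITS[name], JETI_NITS[name]), 1)
    for name in CR300_NITS
}
out_of_tolerance = [name for name, pct in shortfall.items() if pct > WORST_CASE_PCT]

# The CR-100 colorimeter acts as the tie-breaker on the W-OLED.
cr100_gap = round(percent_below(CR300_NITS["W-OLED"], CR100_WOLED_NITS), 1)

print(shortfall)
print("outside 4%:", out_of_tolerance)
print("CR-100 vs CR-300 on W-OLED:", cr100_gap, "%")
```

Only the W-OLED falls outside the combined 4% window, and the CR-100 sits within a fraction of a percent of the CR-300, which is why the JETI looks like the odd one out.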


There is no doubt that instrument metameric failure is real. Sony went so far as to issue guidance on the xy offsets calibrators should use with its BVM OLED broadcast monitors. However, these are RGB OLEDs, not the white OLEDs used in the consumer world. So too, RGB laser projectors seem to have a problem with instrument metameric failure, but these also are rarely used in the consumer display market, which generally uses a combination of a blue laser and a yellow phosphor to achieve all three primary colors. An RGB LED or quantum dot display could probably also pose problems, but these have been abandoned by the consumer world as well in favor of blue LEDs and yellow phosphors, much like laser projectors.

Fortunately, this test reveals that the problem of instrument metameric failure does not seem to arise on commercial displays. I did not test a CRT or plasma display as these technologies are now obsolete. I know that many people whose opinion I respect swear that they have seen this phenomenon, but all I can say is that I could not repeat it in a fairly careful test. There is little point engaging in debate about this, because ultimately it is about what you see. If someone sees what they believe is instrument metameric failure, how am I to respond? No, you didn't? That would be a silly thing to say. Clearly, people see what they see. I would only suggest that maybe, just maybe, what they are seeing reflects not a failure of instrumentation, but rather a natural variation in human color perception.

Update: 03/25/2018

There has been a lot of continued discussion about this, and so I decided to update my test by adding some additional displays: a Panasonic plasma and a Samsung Quantum Dot LCD (QLED). I retained the original 2016 LG OLED for comparison.

I looked at this earlier, comparing the visual appearance of white on three calibrated displays: a 2016 LG OLED, a Samsung LED LCD, and a Samsung CCFL LCD. I could find no visual difference in their appearance. This time, the results were both interesting and unexpected.

Here is the data for the calibrated displays.

Quantum Dot
Luminance (cd/m²)   32.1     31.2     33.0
x                   0.3127   0.3129   0.3131
y                   0.3287   0.3282   0.3295
R                   100.1%   100.3%   100.1%
G                   100.0%   99.9%    100.0%
B                   100.1%   100.2%   99.8%
CIE2000             0.24     0.79     0.26

Visual Inspection
First, I determined that it is much easier to detect apparent differences in white point when using full field test patterns than when using 10% windows. With windows, the differences I saw were MUCH more difficult to detect than with fields. There are a couple of downsides to using full fields.

1. Not all display types (OLED and plasma, in particular) are capable of rendering the full luminance range with full fields. This problem is easily remedied by simply using lower level test patterns. I used 60% video.

2. Another, more troubling problem is that displays do not all have good white field uniformity. This was an especially irritating problem with the OLED. There were clearly visible red/pink vertical discolorations running down each side of the screen. The only solution to this problem was to focus my attention as much as possible on the center of the screen where the measurement was taken.

Pretty much as before, I saw no significant difference between the appearance of the OLED and plasma white points. There may have been a very small difference. However, it was so small that it was unclear to me in which direction I should adjust the OLED to match the plasma, or whether there was no difference at all and my visual field was simply being contaminated by the OLED's poor white field uniformity.

However, I was finally able to detect a clear issue with instrument metamerism when comparing the OLED and plasma with the Quantum Dot. The QD had an easily recognizable reddish bias that I did not see in either the plasma or the OLED (center screen), despite the fact that it measured essentially the same as the other two displays.

The bottom line is that I still do not see any evidence of significant instrument metamerism with OLED. Subsequent investigation did reveal one small issue: the OLED appears slightly redder than a neutral white, even when it measures neutral. This is easily fixed by backing off the OLED's High Red control by a couple of ticks. I did, however, detect a significant problem with the Samsung QLED.

To bring the Quantum Dot visually into line with the other two displays, I adjusted green and blue, arriving at these measurements:
cd/m² 33.6
x 0.3132
y 0.3369
R 98.1%
G 100.8%
B 97.5%
CIE2000 4.83

BTW, here's a measurement of the LG OLED side discoloration:
cd/m² 31.9
x 0.3173
y 0.3267
R 103.0%
G 99.1%
B 100.1%
CIE2000 4.82

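As you can see, the error I had to dial in to correct the Quantum Dot's metamerism (CIE2000 4.83) is almost exactly the same size as the discoloration at the sides of the OLED (CIE2000 4.82), and that discoloration was quite visible.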
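The CIE2000 figures above also take luminance into account, so they cannot be recomputed from xy alone. As a rough, chromaticity-only cross-check, here is a sketch that converts each xy reading to CIE 1976 u'v' and measures its distance from D65. This is a different metric from the CIE2000 numbers reported above, and the function names are mine.

```python
import math

D65_XY = (0.3127, 0.3290)  # target white point used throughout this article


def xy_to_uv(x, y):
    """Convert CIE 1931 xy chromaticity to CIE 1976 u'v'."""
    denom = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / denom, 9.0 * y / denom


def delta_uv(xy, reference=D65_XY):
    """Euclidean distance between two chromaticities in the u'v' diagram."""
    u, v = xy_to_uv(*xy)
    u_ref, v_ref = xy_to_uv(*reference)
    return math.hypot(u - u_ref, v - v_ref)


qd_visual_match = (0.3132, 0.3369)   # Quantum Dot after the visual adjustment
oled_side = (0.3173, 0.3267)         # LG OLED side discoloration

print(f"Quantum Dot visual match: {delta_uv(qd_visual_match):.4f}")
print(f"OLED side discoloration:  {delta_uv(oled_side):.4f}")
```

Both distances come out at roughly 0.004 to 0.005, so the two errors are of the same order in chromaticity terms as well, although they point in different directions on the diagram.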

I used a Colorimetry Research CR-300 for all measurements.

For what it is worth, here are the spectral signatures of the three displays. You will notice that the Quantum Dot is the only display of the three that demonstrates reasonably narrow bandwidths for all three primaries. This may or may not explain the instrument metamerism I observed. The plasma's spectral signature looks rather bizarre. Green and blue are fairly broad, but red shows quite narrow peaks, including one prominent peak in orange. OLED, by comparison, looks the most normal of the three. Blue is fairly narrow, though not as narrow as the Quantum Dot's, and green and red are quite broad.

[Figure: Quantum Dot spectral signature]

Practical Remedy

Since there appears to be a problem with instrument metamerism with a few displays, really the only way to fix this is to use an optical comparator as a corrective.

Use this technique to address the problem.

  1. Display a 50% gray test pattern on your computer screen.
  2. Using the most accurate instrument you have and the monitor's own white balance controls, calibrate that white point to as close to x0.3127, y0.3290 as possible. Once you are satisfied with your efforts, the PC screen becomes your optical comparator.
  3. Maximize the test pattern if possible, and then display a 50% gray test pattern on the TV to be calibrated. In the vast majority of cases, if the TV has already been calibrated, the two screens should look the same. If they don't, then you have a metamerism problem.
  4. If you see a problem, adjust the TV's white balance controls until the TV's 50% white point visually matches the computer screen's 50% white point.

As mentioned above, in the case of the LG OLED, the error is very small, so the procedure is quite simple. After getting the LG screen as close to x0.3127, y0.3290 as possible, back the High Red control off by a couple of clicks. That should fix the problem.
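Once you have a visual match, it can be handy to record it as an xy offset, in the spirit of the Sony guidance mentioned earlier. The sketch below does this for the Quantum Dot reading from my test. The function names are hypothetical, and the resulting offset applies only to this one display as read by this one instrument; it should not be taken as a general correction for Quantum Dot sets.

```python
TARGET_XY = (0.3127, 0.3290)  # D65 target used in the procedure above


def visual_offset(matched_xy, target_xy=TARGET_XY):
    """xy offset between what the instrument reads after a visual match
    and the nominal target (hypothetical helper, for illustration only)."""
    return (
        round(matched_xy[0] - target_xy[0], 4),
        round(matched_xy[1] - target_xy[1], 4),
    )


def offset_target(offset, target_xy=TARGET_XY):
    """The instrument reading to aim for on this display next time."""
    return (
        round(target_xy[0] + offset[0], 4),
        round(target_xy[1] + offset[1], 4),
    )


# Quantum Dot reading after it was visually matched to the OLED and plasma.
qd_offset = visual_offset((0.3132, 0.3369))
print("Quantum Dot offset:", qd_offset)
print("Aim for:", offset_target(qd_offset))
```

Nearly all of the offset is in y, which fits with the adjustment having been made with green and blue.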

Previous articles

Colorimetry Research CR-250

Colorimetry Research CR-100

X-Rite i1 Display Pro III colorimeter

JETI Specbos 1211

DVDO Duo Video Processor

K-10 Colorimeter

Lumagen LUT Color Correction

Sony 4K Projector