Adventures in Scene Referred Space – Part Two

Champagne Logarithmic Camera Output Dreams on a Beer Budget

As previously discussed, when integrating CG renders with background stills or video footage, the preferred approach is to composite all elements together in a common “Scene Referred” color space. Only at the end of our compositing and color grading pipeline should we use a “Color Transform” to output to a “Display Referred” space appropriate for the destination monitor or projector. Working this way gives us the most flexibility throughout our pipeline, apples-to-apples interoperability between 2D and 3D assets, AND more accurately calculated compositing operations such as ADD and MULTIPLY.
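To see why those operations care, here’s a minimal sketch in Python. The sRGB transfer function below is the standard, documented one, standing in for whatever display transform your footage actually carries:

```python
# A minimal sketch of why ADD only behaves physically in scene referred
# linear light: doubling a light source should exactly double the recorded
# value, but once values have been gamma-encoded for display, it doesn't.

def srgb_encode(x):
    """Standard sRGB transfer function: scene linear -> display referred."""
    return 12.92 * x if x <= 0.0031308 else 1.055 * x ** (1 / 2.4) - 0.055

grey = 0.18                        # a mid-grey patch, scene linear

# Scene referred: two identical lights sum to exactly twice the intensity.
print(grey + grey)                 # 0.36 -- physically correct

# Display referred: adding the encoded values badly overshoots.
print(srgb_encode(grey) * 2)       # ~0.92...
print(srgb_encode(grey + grey))    # ...but the true sum encodes to only ~0.63
```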

That’s great, but can your off-the-rack DSLR give you a scene referred file? Unless you have a champagne budget, probably not.

If I won the lottery and had an ARRI Alexa sitting around, in addition to bragging rights and a lovely dynamic range (about 14.5 stops’ worth), I’d get two outputs critical to a great VFX pipeline.

First, it would let me output all those lovely f-stops in a logarithmic (AKA Log) format. This preserves as much usable information in the highlights and shadows as possible. (More on this later.)
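As a toy illustration of what a log encode buys you (this is a generic pure-log curve for the sake of the example, not ARRI’s actual LogC math, which adds refinements near black):

```python
# A toy log encode: give every stop of scene light an equal slice of the
# 0..1 output range, centered on mid grey. Illustrative only, not any
# camera maker's actual math.

import math

STOPS = 14.5            # usable range, borrowing the Alexa figure above
MID_GREY = 0.18

def toy_log_encode(linear):
    """Map a scene linear value to 0..1, one equal slice per stop."""
    stops_from_grey = math.log2(max(linear, 1e-6) / MID_GREY)
    return min(max((stops_from_grey + STOPS / 2) / STOPS, 0.0), 1.0)

# Two deep-shadow values one stop apart keep a full 1/14.5 of the output
# range between them (about 18 steps in 8 bits)...
print(toy_log_encode(0.0025) - toy_log_encode(0.00125))   # ~0.069

# ...whereas the sRGB display encode leaves the same pair ~4 codes apart.
```

Code values a grade can always redistribute later; stops that were crushed together at capture are gone for good.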

Second, I’d get that footage out with a known, documented Color Transform that I can later decode or transform again into a common color-managed space for my VFX pipeline.

I’d then be in a really nice position to drop in some CG that’s been appropriately ranged to match that same dynamic range, and bingo: 90% of my compositing fudging and futzing would fall by the wayside as I composite away with true scene values under common light intensity conditions. Optical flares, motion blur, bokeh, and other similar effects would all behave as expected with little to no massaging, and color would behave naturally.

But I don’t have an Alexa sitting around. Do you? What I have is a consumer level Nikon DSLR and a dream.

The good news is there is a way to get some of those champagne dreams on a beer budget. It’s an approximation, technically an interpolation, but it’s pretty darn close, all things considered.

My Nikon D5200 with the stock lens (a nice but not quite “prosumer” camera, approximately $600 USD) is capable of about 10 stops of dynamic range when shooting video, so right there I can’t compete with the big boys as far as capturing light goes. But dynamic range limitations aside, the biggest hurdles are two aspects of my camera’s output video format.

What’s coming out of my (your?) camera?

First: with very few exceptions, consumer DSLRs (mine included) don’t offer a logarithmic video output option, so the default video file that lands on the memory card has been transformed into a display referred color space before you even start a compositing workflow. Put another way, a lot of helpful light and shadow information is sacrificed immediately for the sake of presenting an image that a) mimics the way our eyes work, b) can be easily compressed into a relatively small data set (8 bits) and c) looks good on most monitors, including the LCD on the back of your camera.
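To put a number on that sacrifice, here’s a quick sketch counting how many of the 256 available 8-bit codes each stop of scene light receives. It assumes a standard sRGB display encode; your camera’s actual transform differs (and is usually undocumented), but the shape of the problem is the same:

```python
# Count how many of the 256 8-bit code values each stop of scene light
# receives under a standard sRGB display encode, walking down from white.

def srgb_encode(x):
    """Standard sRGB transfer function: scene linear -> display referred."""
    return 12.92 * x if x <= 0.0031308 else 1.055 * x ** (1 / 2.4) - 0.055

for stop in range(8):
    hi, lo = 2.0 ** -stop, 2.0 ** -(stop + 1)
    codes = round(255 * (srgb_encode(hi) - srgb_encode(lo)))
    print(f"stop {stop} below white: ~{codes} codes")

# Roughly 67 codes for the brightest stop, dwindling to single digits by
# the seventh -- the shadows get almost no precision to grade with later.
```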

Left: a logarithmically encoded image. Looks washed out, right? That’s because it’s not designed to look great on your monitor. It’s designed to retain as much information in the lightest and darkest parts of the image as possible. Right: only once it’s been graded is it ready for prime time. Images courtesy of NikonPC.com.

Second: in addition to having already been converted to a display referred color space, that nice picture you just snapped of Grandma at her birthday party doesn’t truly represent the physical ratios of light you shot. It’s undergone an additional hidden (and, with rare exceptions at this budget, undocumented) color transform inside the camera to give you an aesthetically pleasing result. Smartphone cameras are the worst offenders here, but it’s true of the standard factory settings of DSLRs too. Big-name brands want everyone to feel like a pro photographer, so “special sauce” is applied to make your snaps shine. Well, that’s great for Grandma, but not very helpful for VFX integration, where we want the output of our camera to represent the captured scene as closely as possible. Why? Because we’ll be looking to match our footage to CG renders that are (hopefully) computed in a non-sullied color space. And we want to do our color grading later, on our own terms, thankyouverymuch.

Left: the light and color accuracy recorded at the sensor. Right: the “special sauce” image as saved by the camera. Images courtesy of NikonPC.com.

Getting what you need out of your budget camera for CG integration

So the two key questions are: how do I retain the exposure information in the lightest and darkest parts of my video that would otherwise be lost? And how do I get rid of the “special sauce”? If I can do both, I’ll have a grounded starting point that I can grade with flexibility later in my pipeline AND something I can quantify to match with other image elements such as CG. And I’ll have done it all with a budget camera.

The solution for getting from an unknown display referred source to a known scene referred output is a three-step process.

  1. We need to understand the total dynamic range our camera can capture. Every camera is different, and different ISOs and picture styles (transforms) will also impact the dynamic range response. A shoot under controlled conditions is necessary to capture this information.

  2. We need to shoot with a color transform that aims to retain as much information as possible in the lightest and darkest parts of the image - in short, our own Log format. This will also have the added benefit of giving us a known, quantifiable color transform that we can leverage to untangle the undocumented “special sauce” of our camera.

  3. Once we understand the unique dynamic range (1) and the unique color handling (2) of our camera, we can use that information to re-map our 8-bit display referred video into a 32-bit scene referred image sequence in a known color space, ready for compositing (see the sketch after this list).
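As a rough preview of where step 3 lands, here’s a minimal sketch. It assumes steps 1 and 2 have already yielded a measured curve mapping each 8-bit code value to a scene linear exposure; the file name, its format, and the helper function are hypothetical placeholders, not a real tool:

```python
# Minimal sketch of step 3: re-mapping 8-bit display referred video frames
# into 32-bit scene referred linear via a measured 1D LUT.
# ASSUMPTION: "d5200_flat_profile_lut.txt" is a hypothetical file holding
# 256 scene linear values (one per 8-bit code), produced by the controlled
# shoot of step 1 and the log-style picture profile of step 2.

import numpy as np

code_to_linear = np.loadtxt("d5200_flat_profile_lut.txt")   # shape: (256,)

def to_scene_referred(frame_8bit):
    """Re-map an 8-bit display referred frame to 32-bit scene linear."""
    # Every pixel's code value becomes an index into the measured curve.
    return code_to_linear[frame_8bit].astype(np.float32)

# Example: one 1080p RGB frame decoded from the camera's video stream.
frame = np.random.randint(0, 256, size=(1080, 1920, 3), dtype=np.uint8)
scene_linear = to_scene_referred(frame)
# scene_linear is now float32, ready to write out as (say) an OpenEXR
# sequence and composite against linear CG renders.
```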

In the next post I’ll walk you through step one – a methodology to determine your camera’s unique dynamic range. In subsequent posts I’ll document the entire process so you too can have some champagne logarithmic camera output dreams on a beer budget and pull imagery out of your consumer camera that is set up for success with CG integration.

Further Reading

For an overview of the benefits of shooting in a logarithmic format without getting too esoteric, check out David Adler’s nicely written practical article, “Understanding Log-Format Recording”.

Thanks

If you’ve found this post helpful, please consider following me on Twitter for updates on future posts.

Once again, a huge thanks goes to Troy Sobotka: industry veteran, the brain behind Filmic Blender (standard in Blender as of version 2.79), and a wealth of knowledge on all things color and lighting. He opened my eyes to the importance of a Scene Referred workflow during the production of a recent VFX project. Be sure to follow him on Twitter.

Cover image by montillon.a used under Creative Commons licensing.

Paul Chambers