OpenGL Pixel Shader Dewarping


Chris

Introduction

When a user requested adding dewarping to my camera streaming software Monocle, I tried various methods and found that each had significant drawbacks, until I came up with a novel solution.


Fisheye Dewarping

CPU

Having previously done transformations on the CPU, I knew this would be far too slow to be satisfactory, so I didn’t even try implementing this.

Vertex Shader

Using the vertex shader was my first attempt, because I knew it would be fast and I felt intuitively that it could provide enough accuracy to look good. Unfortunately, the result suffered from severe aliasing along triangle edges and required a dense mesh of polygons to look acceptable. It was also difficult to create an exact match to the OpenCV dewarping models.

CUDA

Using CUDA was the next experiment. It quickly became cumbersome: drawing the original image to a frame buffer, passing the texture to the CUDA interop, doing the calculations, and then going back to OpenGL to draw the result. Even though the data remained on the GPU throughout the pipeline, the performance was only barely acceptable. The approach was also unsupported on systems without an NVIDIA GPU.

Novel Pixel Shader

At first sight, it might appear impossible to dewarp in a pixel shader. However, with a small amount of engineering, it not only becomes possible but, I believe, preferable. The first insight was that a sampler2D in a pixel shader can be hijacked to contain a lookup table instead of a texture. The second was that OpenGL samples texture colours with sub-pixel accuracy, so we get excellent anti-aliasing. This does not come without challenges: most OpenGL drivers require the texture type to be GL_UNSIGNED_BYTE, which on its own does not provide a wide enough range to identify each source pixel. So we use GL_RGBA and split each coordinate across two channels: x across the red and green values, and y across the blue and alpha values, giving a range of 0–65535, which is more than sufficient for most use cases. In the shader below, each output pixel samples the LUT to find its corresponding location in the distorted source image.

#version 330 core

in vec2 tex_coord;
out vec4 FragColor;

uniform sampler2D tex; // distorted source image
uniform sampler2D lut; // RGBA8 lookup table of source coordinates

void main()
{
    // Reassemble each coordinate from its coarse (green/alpha) and
    // fine (red/blue) bytes, then sample the source image there.
    vec4 lut_value = texture(lut, tex_coord);
    float x = lut_value.g + (lut_value.r / 255.0);
    float y = lut_value.a + (lut_value.b / 255.0);
    FragColor = texture(tex, vec2(x, y));
}
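
To make the packing concrete, below is a minimal CPU-side sketch of how such a lookup table could be built with OpenCV's fisheye model and uploaded as an RGBA texture. This is not Monocle's actual code: the function name CreateDewarpLut, the choice of an identity rectification and of K as the new camera matrix, and the use of GLEW are illustrative assumptions; only the base-255 byte split mirrors the decode in the shader above.

#include <GL/glew.h>
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Build an RGBA8 lookup table texture mapping each output pixel to a
// normalised source coordinate, using OpenCV's fisheye model.
GLuint CreateDewarpLut(const cv::Mat& K, const cv::Mat& D, const cv::Size& size)
{
  // Per-output-pixel source coordinates, in pixels, from the fisheye model.
  cv::Mat map_x, map_y;
  cv::fisheye::initUndistortRectifyMap(K, D, cv::Mat::eye(3, 3, CV_64F), K,
                                       size, CV_32FC1, map_x, map_y);

  std::vector<uint8_t> lut(static_cast<size_t>(size.area()) * 4);
  for (int row = 0; row < size.height; ++row)
  {
    for (int col = 0; col < size.width; ++col)
    {
      // Normalise to [0, 1] texture space and clamp points that fall outside.
      const float u = std::clamp(map_x.at<float>(row, col) / size.width, 0.0f, 1.0f);
      const float v = std::clamp(map_y.at<float>(row, col) / size.height, 0.0f, 1.0f);
      // Split each coordinate into coarse and fine bytes so the shader can
      // reconstruct it as coarse + fine / 255.
      const long x = std::lround(u * 65025.0f); // 255 * 255
      const long y = std::lround(v * 65025.0f);
      uint8_t* texel = &lut[(static_cast<size_t>(row) * size.width + col) * 4];
      texel[0] = static_cast<uint8_t>(x % 255); // red:   fine x
      texel[1] = static_cast<uint8_t>(x / 255); // green: coarse x
      texel[2] = static_cast<uint8_t>(y % 255); // blue:  fine y
      texel[3] = static_cast<uint8_t>(y / 255); // alpha: coarse y
    }
  }

  GLuint texture = 0;
  glGenTextures(1, &texture);
  glBindTexture(GL_TEXTURE_2D, texture);
  // Nearest filtering so the packed bytes are never blended between texels.
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, size.width, size.height, 0,
               GL_RGBA, GL_UNSIGNED_BYTE, lut.data());
  return texture;
}

Clamping and nearest-neighbour filtering are design choices for this sketch: interpolating between lookup-table texels would blend the high and low bytes and corrupt the decoded coordinates.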

We also get excellent performance: only two triangles need to be drawn, and the shader stays simple because the computationally intensive work is precalculated when creating the lookup table. Performance could be improved further by doing the YUV conversion within this shader, instead of first drawing to a separate frame buffer.
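
As a sketch of how little geometry is involved, here is one way the full-screen pass might be set up: two triangles covering clip space, with texture coordinates running from 0 to 1. It assumes a GL 3.3 core context with a loader such as GLEW, and the attribute locations feeding tex_coord are illustrative rather than Monocle's actual layout.

#include <GL/glew.h>

// Create a VAO holding two triangles that cover the whole viewport.
// Layout: location 0 = clip-space position, location 1 = texture coordinate.
GLuint CreateFullScreenQuad()
{
  static const GLfloat vertices[] = {
    // x      y      u     v
    -1.0f, -1.0f,  0.0f, 0.0f,
     1.0f, -1.0f,  1.0f, 0.0f,
     1.0f,  1.0f,  1.0f, 1.0f,
    -1.0f, -1.0f,  0.0f, 0.0f,
     1.0f,  1.0f,  1.0f, 1.0f,
    -1.0f,  1.0f,  0.0f, 1.0f
  };

  GLuint vao = 0;
  GLuint vbo = 0;
  glGenVertexArrays(1, &vao);
  glGenBuffers(1, &vbo);
  glBindVertexArray(vao);
  glBindBuffer(GL_ARRAY_BUFFER, vbo);
  glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
  glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 4 * sizeof(GLfloat), nullptr);
  glEnableVertexAttribArray(0);
  glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 4 * sizeof(GLfloat),
                        reinterpret_cast<const void*>(2 * sizeof(GLfloat)));
  glEnableVertexAttribArray(1);
  return vao;
}

// Per frame: bind the shader program, the source texture and the lookup table,
// then draw the six vertices (two triangles):
//   glBindVertexArray(quad_vao);
//   glDrawArrays(GL_TRIANGLES, 0, 6);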

I created an example C++ application demonstrating the novel pixel shader dewarping technique. Full code here.

