Video for Linux Two - Video Compression/Decompression Specification

Bill Dirks - September 13, 1999

Video for Linux Two is a set of APIs and standards for handling video devices on Linux. It is a replacement for the Video for Linux API that comes with the kernel.

This document is the specification for Video for Linux Two video compression/decompression (CODEC) devices. This is a companion document to the V4L2 Devices document.



A V4L2 codec device is a device that can compress, decompress, transform, or otherwise convert video data from one format into another format, in memory. Applications send data to be converted to the driver through the write() call, and receive the converted data through the read() call. For efficiency, a driver may also support I/O through memory-mapped buffers.

Refer to the Device document for the device file names used by codec devices.

Multiple Opens per Device

Drivers should support multiple opens on a codec device. This is necessary because it is normal to have multiple video windows showing or playing video, or for an application to access more than one video stream at a time. Each open is an independent conversion context.

Query Capabilities - VIDIOC_QUERYCAP

This ioctl call is used to obtain the capability information for the device. See the Device document.

Codec drivers have a type of V4L2_TYPE_CODEC. Codecs will probably support the read() and write() calls indicated by the V4L2_FLAG_READ and V4L2_FLAG_WRITE flags. A codec driver can also set V4L2_FLAG_STREAMING if it supports I/O through memory-mapped buffers, and V4L2_FLAG_SELECT if it supports the select() call.

Enumerating Supported Image Conversions - VIDIOC_ENUMCVT

An application can query the list of supported image conversions using the VIDIOC_ENUMCVT ioctl. The application fills in the index field of a struct v4l2_cvtdesc, and then passes it to the VIDIOC_ENUMCVT ioctl, which fills in the rest of the fields. The application should use index values from 0 on up; the ioctl will return an error when the index goes out of range.

struct v4l2_cvtdesc
int    index             Index of the conversion in the list of supported conversions
__u32  in.pixelformat    The pixelformat value for the input format
__u32  in.flags          V4L2_FMT_FLAG_COMPRESSED if applicable
__u32  in.depth          The depth value for the input format
__u32  in.reserved[2]    Reserved
__u32  out.pixelformat   The pixelformat value for the output format
__u32  out.flags         V4L2_FMT_FLAG_COMPRESSED if applicable
__u32  out.depth         The depth value for the output format
__u32  out.reserved[2]   Reserved


Input and Output Image Formats - VIDIOC_G_FMT, VIDIOC_S_FMT

Refer to the Device document for information regarding stream data formats and the video image format structure, struct v4l2_pix_format.

Use VIDIOC_S_FMT to set the image formats for input and output, and VIDIOC_G_FMT to retrieve the current formats. Both ioctls pass the format in a struct v4l2_format. Set the type field to V4L2_BUF_TYPE_CODECIN to access the input format, or V4L2_BUF_TYPE_CODECOUT to access the output format.

An application should set the input format first. The ioctl call will return an error code if the driver cannot handle the format for some reason. After setting the input format, the application should set the output format. Devices may have restrictions on the output format based on the input format; for example, the output format may be required to be the same width and height as the input format. The driver will modify the output format structure to indicate the nearest acceptable output format, and the application must make sure the result is suitable. The driver will fill in the depth and sizeimage fields in the output format. The sizeimage value returned from the driver is the minimum required size of the output buffer, and the application should allocate output buffers of at least this size. If the output format is a variable-length compressed format, the sizeimage field is the upper bound on the size of any image, so the application can always safely allocate buffers of that size.

Remember that the input data is the data input to the codec, and the output data is the converted data coming out of the codec, which means, paradoxically, you write the input data to the device and read the output data from the device.

If you are planning on using memory mapped device buffers, you should set the formats before mapping the buffers. Also, you should unmap the buffers before changing the formats.

Compression Parameters - VIDIOC_G_COMP, VIDIOC_S_COMP

These ioctls set additional parameters needed for compressing video. See the capture spec for more information.

Memory-Mapping Device Buffers - VIDIOC_REQBUFS, VIDIOC_QUERYBUF

See the Device document for instructions on allocating memory-mapped device buffers. For video data input and output buffers, use V4L2_BUF_TYPE_CODECIN and V4L2_BUF_TYPE_CODECOUT respectively.

Writing and Reading Images - write(), read()

Convert images by sending them to the driver with the write() call, and get the converted images by reading them with read(). Each call to write() or read() should be used to transfer a complete frame. The application may have to write several frames before the first frame is available for reading, because some video compression algorithms scan several frames and encode the motion in the images. The read() function will return the error EPERM if the read cannot complete because more input frames are needed.

If an application is both writing and reading, it should use non-blocking I/O calls to prevent it from deadlocking. It is possible for one program to write data while another reads the output. Non-blocking write() will return the error EAGAIN if a frame or frames need to be read before more data can be accepted. Non-blocking read() will return the error EAGAIN if a conversion is in progress, but not yet complete. Note that this is a different condition from the more-input-frames-needed error condition mentioned in the previous paragraph, which returns the EPERM code. Read() and write() do not work if memory-mapped I/O is in use.


Streaming I/O - VIDIOC_QBUF, VIDIOC_DQBUF, VIDIOC_STREAMON, VIDIOC_STREAMOFF

This mode is supported if the V4L2_FLAG_STREAMING flag is set in the struct v4l2_capability.

This mode uses memory-mapped buffers for transferring video data to and from the driver. First set up the image formats, and allocate the buffers as described in the section titled Memory-Mapping Device Buffers in the Device document. You will need two sets of buffers, one set for the original data, another for the converted data. Use buffer type V4L2_BUF_TYPE_CODECIN for the original data buffers and V4L2_BUF_TYPE_CODECOUT for the converted data buffers.

Streaming data through a codec is more complicated than streaming from a capture device because there are two streams, one going in and one coming out, so there is both a write queue and a read queue. Write buffers and read buffers are queued with VIDIOC_QBUF. When the driver is finished using a buffer, dequeue it with VIDIOC_DQBUF. VIDIOC_QBUF operates as described in the capture spec. (Capturing, of course, involves a read queue only.) VIDIOC_QBUF passes a struct v4l2_buffer object to the driver.

To run the conversions, first queue up the read buffers with VIDIOC_QBUF. These buffers, of type V4L2_BUF_TYPE_CODECOUT, will receive the converted data as it becomes available. The driver will internally queue one of the read buffers. Next fill in a write buffer or buffers (type V4L2_BUF_TYPE_CODECIN) with the data to be converted, and call VIDIOC_QBUF for each buffer to queue the data for conversion. VIDIOC_QBUF takes a struct v4l2_buffer parameter. For the write buffers, the application should also fill in the bytesused field with the actual number of bytes of data in the buffer before calling VIDIOC_QBUF.

When all is ready, call VIDIOC_STREAMON with the type of the output buffer to start the conversion process. Conversion will proceed as long as there are fresh input data and output buffers queued up. As the codec finishes each buffer, the buffers are available for dequeuing. The application can wait for the next buffer to be done by calling select(). Select() returns when some buffer is ready to be dequeued. The driver will not reuse a buffer until the application has dequeued and requeued it, so it is important to dequeue and requeue the input and output buffers as the driver finishes with them.

To dequeue a buffer fill in the type field of a struct v4l2_buffer and call VIDIOC_DQBUF. The driver will fill in the rest of the fields. If there is no buffer of that type ready then the ioctl will return an error code. When a read buffer is ready, you can read the data out of the buffer, and requeue the buffer. When a write buffer is ready, you can fill it in with new data and requeue it.

To turn off streaming mode, call VIDIOC_STREAMOFF with the buffer type of the read buffers. When streaming is turned off, all input buffers are automatically dequeued, and any unused output buffers are dequeued as well. Finished output buffers remain queued, and the application should dequeue them and recover the data, or else the data will be lost. When streaming is restarted, any finished output buffers still queued will be automatically dequeued.

There are certain things you can't do when streaming is active, for example changing the formats, using the read() or write() calls, or munmap()ing buffers.

Waiting for Frames Using select()

The driver supports the select() call on its file descriptors if the V4L2_FLAG_SELECT flag is set in the struct v4l2_capability. If streaming mode is off, select() returns when there is data ready to be read with the read() call and/or if the driver is ready to accept new input data through the write() call. If streaming mode is on, select() returns when there is a buffer ready to be dequeued. The caller should be sure there is a buffer queued first.