MPEG-4 Part 11

Binary Format for Scenes (BIFS) is a binary-coded description language for two- and three-dimensional interactive audiovisual multimedia content. It is based on VRML97 and standardized in MPEG-4 Part 11 (ISO/IEC 14496-11, "Scene description and application engine").

Interactive content, the scene concept, and object-based coding

Whereas the earlier MPEG-1 and MPEG-2 standards served exclusively to encode audio and video data, MPEG-4 was planned from the beginning as a "toolbox" for coding and transmitting (not necessarily) interactive audiovisual content. Where MPEG-4 is mentioned in the following, the scene-description aspect is always meant: the audiovisual content is described as a so-called scene, which can be made up of natural objects (for example, video sequences or recorded audio tracks) or synthetic objects (e.g. 2D and 3D graphics). In addition, MPEG-4 introduces for the first time the concept of object-based coding, in which the individual objects of a scene (e.g. background, performers, 3D and 2D objects, speech, background music, etc.) are encoded and transmitted separately. On playback, the scene is then reassembled.

The advantages of object-based coding and transmission:

  • It enables efficient transmission or storage of multimedia content.
  • For each object, an appropriate codec from one of the MPEG standards can be used (e.g. speech codec, still-image codec, video codec, etc.).
  • Individual objects can easily be re-used in the production of other scenes.
  • The playback device can adapt the scene to its own capabilities.
  • Interactive content becomes possible.


The terminology has been adopted from VRML. BIFS scenes thus have a scene graph, a hierarchical data structure whose individual elements are called nodes. The properties of a node are described by its fields; for each field, a data type and a range of values are defined.

There are nodes both for visible objects (such as rectangles or cylinders) and for defining their properties, such as color, texture, or position. In addition, nodes exist for positioning sound sources in the scene as well as for audio signal processing (see "AudioBIFS" below).
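In the VRML97 textual notation, which BIFS encodes in binary form, a scene-graph fragment of this kind might be sketched as follows (the node and field names are standard VRML97; the concrete values are purely illustrative):

```vrml
#VRML V2.0 utf8
# A red box placed in the scene: Transform, Shape, Appearance,
# Material and Box are nodes; translation, diffuseColor and size
# are fields with defined data types and value ranges.
Transform {
  translation 1 0 -5          # SFVec3f field: position of the object
  children [
    Shape {
      appearance Appearance {
        material Material {
          diffuseColor 1 0 0  # SFColor field: red surface color
        }
      }
      geometry Box { size 2 1 1 }  # node for a visible object
    }
  ]
}
```

A BIFS stream carries the equivalent information in compact binary form rather than as text.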

Technologies used

In order to integrate performers seamlessly into a scene, the MPEG-4 Video standard (ISO/IEC 14496-2) enables the encoding of so-called shaped video objects. Here, a binary or grayscale mask is encoded in the video stream along with the actual image content, so that when the scene is composed, the background remains visible wherever no actor is present. To generate such video objects, the blue-screen technique is used. Shaped video objects are integrated into the BIFS scene as textures.

In order to include three-dimensional objects in a scene (e.g. objects modeled in 3D programs), BIFS references the entire VRML standard, so that everything that is possible with VRML can also be used in BIFS scenes.

MPEG-J defines a Java interface that allows access to the objects in the scene. Complete interactive applications (e.g. a guide) are thus possible with Java and MPEG-4. The bytecode of the application (also referred to as an "MPEGlet") is transmitted or stored along with the scene.

AudioBIFS refers to the part of BIFS that enables complete audio signal processing. It makes it possible to position sound sources in a virtual space, to apply effects to audio data, and to describe rooms acoustically. To describe the effects applied, a dedicated language, SAOL (Structured Audio Orchestra Language), is standardized in MPEG-4 Audio, with which arbitrary signal-processing operations can be described. For the realistic simulation of acoustic environments, objects in the scene can be given acoustic properties (frequency-dependent transmission and reflection) so that these can be taken into account during playback (virtual acoustics).
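In the VRML97 text notation underlying BIFS, positioning a sound source in the virtual space might be sketched as follows (Sound and AudioClip are standard VRML97 nodes, which AudioBIFS builds upon and extends; the file name is a placeholder):

```vrml
#VRML V2.0 utf8
# A spatialized sound source: the Sound node places the audio
# produced by its source node at a location in the 3D scene.
Sound {
  location 2 0 -5      # SFVec3f field: position of the source
  intensity 0.8        # scales the amplitude of the source
  spatialize TRUE      # render the sound with 3D spatialization
  source AudioClip {
    url "music.wav"    # placeholder reference to the audio data
    loop TRUE
  }
}
```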