Shot transition detection

Shot transition detection, also called cut detection, is a technique from multimedia computing, a branch of computer science, for automatically detecting cuts in a digital video.

Purpose

Cut detection is a useful tool in video post-production on a computer because it saves the user the time-consuming manual search for cuts. It is also one of the cornerstones of automatic video archiving, where the aim is to generate scripts for large video archives automatically; cut detection can help both in classifying a video and in choosing preview images.

Hard and soft cuts

Cut detection distinguishes between hard cuts (simply called "cuts" in film terminology) and soft cuts (called "dissolves" or "fades" in film terminology). In a hard cut, one scene changes into another abruptly and without any transition. In a soft cut, by contrast, one scene passes gradually into the next.

Modern cut detection algorithms achieve excellent results on hard cuts, but soft cuts remain a challenge. The abrupt change of the entire image content at a hard cut is clearly visible even to quite simple image processing methods (see, for example, the histogram difference described below). The gradual change of image content in a soft cut, however, is quite often misinterpreted by existing algorithms as motion of the filmed objects, so the cut goes undetected.

Method

A cut detection method works according to a two-stage principle:

Scoring. Each frame of the digital video is compared with the frame that immediately follows it. Each pair of frames is assigned a score, which should be as high as possible when a cut is likely and as low as possible when it is not.
Filtering. All pairs of frames are then filtered with a threshold. Pairs whose score lies below the threshold are discarded. A cut probably lies between the two frames of each remaining pair.
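
The two stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: frames are represented as flat lists of grey values, and the names score_pairs, filter_pairs and mean_abs_diff are hypothetical helpers, not part of any particular library. The placeholder score can be swapped for any other measure.

```python
from typing import Callable, List, Sequence

Frame = Sequence[int]  # assumption: a frame is a flat sequence of grey values


def score_pairs(frames: Sequence[Frame],
                score: Callable[[Frame, Frame], float]) -> List[float]:
    """Stage 1: assign a score to every pair of consecutive frames."""
    return [score(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]


def filter_pairs(scores: Sequence[float], threshold: float) -> List[int]:
    """Stage 2: discard every pair whose score lies below the threshold.

    Index i in the result means a cut is suspected between frame i and i + 1.
    """
    return [i for i, s in enumerate(scores) if s >= threshold]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Placeholder score: mean absolute grey-value difference per pixel."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Four tiny frames; the content changes abruptly between frame 1 and frame 2.
frames = [
    [10, 10, 10, 10],
    [11, 10, 9, 10],
    [200, 210, 190, 205],
    [201, 209, 191, 205],
]
scores = score_pairs(frames, mean_abs_diff)
cuts = filter_pairs(scores, threshold=50.0)
print(scores, cuts)
```

Because the score function is passed in as a parameter, the scoring and the filtering can be tuned separately. The threshold of 50.0 is an arbitrary value chosen for this toy data.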

This approach is prone to error. Because even a marginal crossing of the threshold is interpreted as a cut, the threshold must be chosen very carefully. As a rule, its value is determined statistically from a large number of test runs.

A cut detection method thus consists of two parts that can be optimized independently. The scoring should be tuned so that it spreads the scores as widely as possible, that is, so that the gap between the scores for cut and non-cut is as large as possible. The filtering can be made more tolerant so that a soft cut is not misinterpreted as several consecutive cuts.

Scoring methods

Optimizing the scoring is not a simple task. Numerous algorithms have been developed to date, and they deliver more or less reliable results.

The sum of absolute differences (SAD) is the most obvious way to measure the difference between two images: the color values of the two images are subtracted pixel by pixel, and the absolute values of the differences are added up. The result, the SAD, is a non-negative number indicating how much the pixels of the two images differ in total. The SAD reacts very sensitively to small changes in image content and therefore often suspects cuts where there are none; fast camera movements, explosions, or a light being switched on in a previously dark scene are misinterpreted particularly often. On the other hand, the SAD hardly responds to soft cuts, because the changes happen too slowly to raise the value enough. The method is nevertheless still widely used because it detects all visible hard cuts reliably and is also very fast.
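
As a rough sketch of the computation, assuming each frame is a list of (R, G, B) tuples with channel values from 0 to 255 (a representation chosen here only for illustration), the SAD can be written as follows. The example frames mimic a light being switched on in a dark scene, which produces a large SAD although no cut has taken place.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # assumption: (R, G, B), each channel 0..255


def sum_of_absolute_differences(a: Sequence[Pixel], b: Sequence[Pixel]) -> int:
    """Add up the absolute channel differences of corresponding pixels."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(ca - cb)
               for pa, pb in zip(a, b)    # corresponding pixels
               for ca, cb in zip(pa, pb))  # corresponding color channels


dark = [(5, 5, 5)] * 4        # four nearly black pixels
lit = [(205, 205, 205)] * 4   # same scene after a light is switched on

sad_same = sum_of_absolute_differences(dark, dark)
sad_light = sum_of_absolute_differences(dark, lit)
print(sad_same, sad_light)
```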

The histogram difference (HD) is a slight modification of the sum of absolute differences. Instead of comparing the images point by point, it compares the histograms of the two images. A histogram records, for each color in an image, the number of pixels that have that color. The histogram difference therefore does not examine directly how the image contents differ, but how much the color distributions of the two images differ. This can be a drawback, because two completely different images can have identical histograms; consider a picture of sea and beach and one of a corn field and sky. There is therefore no guarantee that hard cuts are detected. On the other hand, the histogram difference is less sensitive to minor changes in the image, such as object motion and camera movement.
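
A minimal sketch of the idea, assuming frames are flat lists of grey values and that the histograms are compared by summing the absolute differences of their bin counts (one common choice among several): the first example shows two different pixel arrangements with identical histograms, which is exactly the weakness described above.

```python
from collections import Counter
from typing import Sequence


def histogram(frame: Sequence[int]) -> Counter:
    """Count how many pixels have each grey value."""
    return Counter(frame)


def histogram_difference(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute differences between the two histograms' bin counts."""
    ha, hb = histogram(a), histogram(b)
    # A Counter returns 0 for values that do not occur in a frame.
    return sum(abs(ha[value] - hb[value]) for value in set(ha) | set(hb))


top_bright = [255, 255, 0, 0]     # bright pixels first
bottom_bright = [0, 0, 255, 255]  # same colors, different arrangement
all_dark = [0, 0, 0, 0]

hd_rearranged = histogram_difference(top_bright, bottom_bright)
hd_changed = histogram_difference(top_bright, all_dark)
print(hd_rearranged, hd_changed)
```

In practice the colors are usually grouped into a limited number of bins rather than counted individually; that refinement is left out here to keep the sketch short.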

The edge change ratio (ECR) attempts to compare the actual content of two images. To do so, the outlines of all objects in both images are extracted, producing so-called edge images. The two edge images are then compared to determine the proportion of edges that disappear from the first image and the proportion that newly appear in the second; this indicates how much the objects depicted in the two images differ. The edge change ratio is one of the most reliable indicators of a cut. It is sensitive to hard cuts and can detect some forms of soft cut with high confidence. It nevertheless reaches its limits with animated transitions such as wipes, for example black bars that "wipe" the image away.
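
The following sketch illustrates the principle under several simplifying assumptions that are not taken from the text above: images are lists of rows of grey values, edges are found with a crude neighbour-difference test instead of a real edge detector, edge pixels are dilated by a small radius so that slight motion is tolerated, and the final ratio is taken as the larger of the outgoing and incoming edge fractions. All function names are hypothetical.

```python
from typing import List, Set, Tuple

Image = List[List[int]]  # assumption: rows of grey values
Point = Tuple[int, int]  # (row, column)


def edge_pixels(img: Image, min_step: int = 50) -> Set[Point]:
    """Crude edge image: a pixel is an edge if it differs strongly
    from its right or lower neighbour."""
    h, w = len(img), len(img[0])
    edges = set()
    for y in range(h):
        for x in range(w):
            right = abs(img[y][x] - img[y][x + 1]) if x + 1 < w else 0
            down = abs(img[y][x] - img[y + 1][x]) if y + 1 < h else 0
            if max(right, down) >= min_step:
                edges.add((y, x))
    return edges


def dilate(edges: Set[Point], radius: int) -> Set[Point]:
    """Grow each edge pixel into a square neighbourhood."""
    return {(y + dy, x + dx)
            for (y, x) in edges
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)}


def edge_change_ratio(prev: Image, curr: Image,
                      radius: int = 1, min_step: int = 50) -> float:
    """Larger of the fraction of edges leaving prev and entering curr."""
    e_prev = edge_pixels(prev, min_step)
    e_curr = edge_pixels(curr, min_step)
    # Edges of prev with no edge of curr nearby have disappeared.
    outgoing = len(e_prev - dilate(e_curr, radius))
    # Edges of curr with no edge of prev nearby are new.
    incoming = len(e_curr - dilate(e_prev, radius))
    out_ratio = outgoing / len(e_prev) if e_prev else 0.0
    in_ratio = incoming / len(e_curr) if e_curr else 0.0
    return max(out_ratio, in_ratio)


def vertical_bar(column: int, size: int = 6) -> Image:
    """Test image: a bright vertical bar on a black background."""
    return [[255 if x == column else 0 for x in range(size)]
            for _ in range(size)]


ecr_small_motion = edge_change_ratio(vertical_bar(1), vertical_bar(2))
ecr_new_content = edge_change_ratio(vertical_bar(1), vertical_bar(4))
print(ecr_small_motion, ecr_new_content)
```

With a radius of 1, the bar that moves by one pixel produces no edge change at all, whereas the bar that appears in an entirely different place replaces every edge.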

Another option is to combine several of these methods.

Filtering method

Simple threshold filtering can be extended so that several closely spaced threshold crossings are merged into a single one. To do this, one chooses a minimum distance that two crossings must have from each other in order to count as two separate cuts, and within each such range of frames only one crossing, usually the one with the highest score, is selected.
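
One possible way to implement this extension is sketched below. It assumes that a crossing belongs to the current group whenever it lies closer than min_distance frames to the previous crossing; other grouping rules are equally conceivable. The function name and parameters are hypothetical.

```python
from typing import List, Optional, Sequence


def merge_close_crossings(scores: Sequence[float], threshold: float,
                          min_distance: int) -> List[int]:
    """Return one index per group of closely spaced threshold crossings,
    keeping the crossing with the highest score in each group."""
    cuts: List[int] = []
    last: Optional[int] = None  # index of the most recent crossing
    for i, s in enumerate(scores):
        if s < threshold:
            continue  # not a crossing
        if last is not None and i - last < min_distance:
            # Same group as the previous crossing: keep the higher score.
            if s > scores[cuts[-1]]:
                cuts[-1] = i
        else:
            # Far enough from the previous crossing: start a new group.
            cuts.append(i)
        last = i
    return cuts


# A soft cut spread over pairs 1 to 3, followed by a hard cut at pair 7.
scores = [0.1, 0.7, 0.9, 0.8, 0.1, 0.1, 0.1, 0.95, 0.2]
plain = [i for i, s in enumerate(scores) if s >= 0.6]
merged = merge_close_crossings(scores, threshold=0.6, min_distance=3)
print(plain, merged)
```

Plain thresholding reports the soft cut three times, while the merged result reports it once, at the pair with the highest score.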

Quality measures

Three measures are used to assess the quality of a cut detection method. If C denotes the number of correctly detected cuts, M the number of missed cuts and F the number of falsely detected cuts, the quality measures are given by the following formulas:

Precision. The probability that a detected cut is actually a cut: P = C / (C + F).

Recall. The probability that an actual cut is detected: R = C / (C + M).

F1. A combination of the other two quality measures that yields high values only when both precision and recall are high: F1 = 2 * P * R / (P + R).

All three quality measures take values only between 0 and 1; in each case, the higher the value, the better the method.
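
As a small worked example, the three measures can be computed from the counts C, M and F using the standard definitions of precision, recall and F1. The function names and the sample counts are made up for illustration, and returning 0.0 for an empty denominator is a convention chosen here, not something the definitions prescribe.

```python
def precision(c: int, f: int) -> float:
    """Share of detected cuts that are real: C / (C + F)."""
    return c / (c + f) if c + f else 0.0


def recall(c: int, m: int) -> float:
    """Share of real cuts that were detected: C / (C + M)."""
    return c / (c + m) if c + m else 0.0


def f1(c: int, m: int, f: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c, f), recall(c, m)
    return 2 * p * r / (p + r) if p + r else 0.0


# Hypothetical run: 90 cuts found correctly, 10 missed, 30 false alarms.
p = precision(90, 30)
r = recall(90, 10)
f1_score = f1(90, 10, 30)
print(p, r, round(f1_score, 3))
```

In this example the recall is high but the many false alarms pull the precision down, and F1 lies between the two.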

