Getting Started

As a concrete example to become familiar with PySceneDetect, let's use the following short clip from the James Bond movie, GoldenEye (Copyright © 1995 MGM):

https://www.youtube.com/watch?v=OMgIPnCnlbQ

You can download the clip from here (may have to right-click and save-as, put the video in your working directory as goldeneye.mp4).

Content-Aware Detection

In this case, we want to split this clip up into each individual scene - at each location where a fast cut occurs. This means we need to use content-aware detecton mode (detect-content) or adaptive mode (detect-adaptive). If the video instead contains fade-in/fade-out transitions you want to find, you can use detect-threshold instead.

Using the following command, let's run PySceneDetect on the video, and also save a scene list CSV file and some images of each scene:

scenedetect --input goldeneye.mp4 detect-content list-scenes save-images

Running the above command, in the working directory, you should see a file goldeneye.scenes.csv, as well as individual frames for the start/middle/end of each scene as goldeneye-XXXX-00/01.jpg. The results should appear as follows:

Scene # Start Time Preview
1 00:00:00.000
2 00:00:03.754
3 00:00:08.759
4 00:00:10.802
5 00:00:15.599
6 00:00:27.110
7 00:00:34.117
8 00:00:36.536
... ... ...
18 00:01:06.316
19 00:01:10.779
20 00:01:18.036
21 00:01:19.913
22 00:01:21.999

Splitting Video into Clips

The split-video command can be used to automatically split the input video using ffmpeg or mkvmerge. For example:

scenedetect -i goldeneye.mp4 detect-content split-video

Type scenedetect help split-video for a full list of options which can be specified for video splitting, including high quality mode (-hq/--high-quality) or copy mode (-c/--copy).

Finding Optimal Threshold/Sensitivity

If the default threshold of 27 does not produce correct results, we can determine the proper threshold by generating a statistics file with the -s / --stats option:

scenedetect --input goldeneye.mp4 --stats goldeneye.stats.csv detect-content

We can then plot the values of the content_val column:

goldeneye.mp4 statistics graph

The peaks in values correspond to the scene breaks in the input video. In some cases the threshold may need to be raised or lowered accordingly.