SceneManager

scenedetect.scene_manager Module

This module implements SceneManager, which coordinates running a SceneDetector over the frames of a video (VideoStream). Video decoding is done in a separate thread to improve performance.

This module also contains other helper functions (e.g. save_images()) which can be used to process the resulting scene list.

Usage

The following example shows basic usage of a SceneManager:

from scenedetect import open_video, SceneManager, ContentDetector
video = open_video(video_path)
scene_manager = SceneManager()
scene_manager.add_detector(ContentDetector())
# Detect all scenes in video from current position to end.
scene_manager.detect_scenes(video)
# `get_scene_list` returns a list of start/end timecode pairs
# for each scene that was found.
scenes = scene_manager.get_scene_list()

An optional callback can also be invoked on each detected scene, for example:

import numpy

from scenedetect import open_video, SceneManager, ContentDetector

# Callback to invoke on the first frame of every new scene detection.
def on_new_scene(frame_img: numpy.ndarray, frame_num: int):
    print("New scene found at frame %d." % frame_num)

video = open_video(test_video_file)
scene_manager = SceneManager()
scene_manager.add_detector(ContentDetector())
scene_manager.detect_scenes(video=video, callback=on_new_scene)

To use a SceneManager with a webcam/device or existing cv2.VideoCapture device, use the VideoCaptureAdapter instead of open_video.

Storing Per-Frame Statistics

SceneManager can use an optional StatsManager to save frame statistics to disk:

from scenedetect import open_video, ContentDetector, SceneManager, StatsManager
video = open_video(test_video_file)
scene_manager = SceneManager(stats_manager=StatsManager())
scene_manager.add_detector(ContentDetector())
scene_manager.detect_scenes(video=video)
scene_list = scene_manager.get_scene_list()
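# Print the start/end timecode of each detected scene.
for start, end in scene_list:
    print("Scene from %s to %s" % (start.get_timecode(), end.get_timecode()))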
# Save per-frame statistics to disk.
scene_manager.stats_manager.save_to_csv(csv_file=STATS_FILE_PATH)

The stats file can be used to find a better threshold for certain inputs, or to perform statistical analysis of the video.

class scenedetect.scene_manager.Interpolation(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Interpolation method used for image resizing. Based on constants defined in OpenCV.

AREA = 3

Pixel area relation resampling. Provides moiré-free downscaling.

CUBIC = 2

Bicubic interpolation.

LANCZOS4 = 4

Lanczos interpolation over 8x8 neighborhood.

LINEAR = 1

Bilinear interpolation.

NEAREST = 0

Nearest neighbor interpolation.

class scenedetect.scene_manager.SceneManager(stats_manager=None)

The SceneManager facilitates detection of scenes (detect_scenes()) on a video (VideoStream) using a detector (add_detector()). Video decoding is done in parallel in a background thread.

Parameters:

stats_manager (StatsManager | None) – StatsManager to bind to this SceneManager. Can be accessed via the stats_manager property of the resulting object to save to disk.

add_detector(detector)

Add/register a SceneDetector (e.g. ContentDetector, ThresholdDetector) to run when detect_scenes is called. The SceneManager owns the detector object, so a temporary may be passed.

Parameters:

detector (SceneDetector) – Scene detector to add to the SceneManager.

Return type:

None

clear()

Clear all cuts/scenes and reset the SceneManager’s position.

Any statistics generated are still saved in the StatsManager object passed to the SceneManager’s constructor. Subsequent calls to detect_scenes that use the same frame source, seeked back to the original time (or the beginning of the video), will therefore reuse the cached frame metrics computed in the previous call.

Return type:

None

clear_detectors()

Remove all scene detectors added to the SceneManager via add_detector().

Return type:

None

detect_scenes(video=None, duration=None, end_time=None, frame_skip=0, show_progress=False, callback=None, frame_source=None)

Perform scene detection on the given video using the added SceneDetectors, returning the number of frames processed. Results can be obtained by calling get_scene_list() or get_cut_list().

Video decoding is performed in a background thread to allow scene detection and frame decoding to happen in parallel. Detection will continue until no more frames are left, the specified duration or end time has been reached, or stop() was called.

Parameters:
  • video (VideoStream) – VideoStream obtained from either scenedetect.open_video, or by creating one directly (e.g. scenedetect.backends.opencv.VideoStreamCv2).

  • duration (FrameTimecode | None) – Amount of time to detect from current video position. Cannot be specified if end_time is set.

  • end_time (FrameTimecode | None) – Time to stop processing at. Cannot be specified if duration is set.

  • frame_skip (int) – Not recommended except for extremely high framerate videos. Number of frames to skip (i.e. process 1 in every N+1 frames, where N is frame_skip, so only 1/(N+1) of the video is processed, speeding up detection at the expense of accuracy). frame_skip must be 0 (the default) when using a StatsManager.

  • show_progress (bool) – If True, and the tqdm module is available, displays a progress bar with the progress, framerate, and expected time to complete processing the video frame source.

  • callback (Callable[[ndarray, int], None] | None) – If set, called after each detected scene/event with the frame image and the frame number.

  • frame_source (VideoStream | None) – [DEPRECATED] DO NOT USE. For compatibility with previous version.

Returns:

Number of frames read and processed from the frame source.

Return type:

int

Raises:

ValueError – frame_skip must be 0 (the default) if the SceneManager was constructed with a StatsManager object.
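
As a rough illustration of the frame_skip arithmetic described above, the sketch below lists which frame indices would be processed for a given value. processed_frame_indices is a hypothetical helper written for this example and is not part of scenedetect; it assumes the first frame is processed and the next N frames are then skipped.

```python
def processed_frame_indices(total_frames: int, frame_skip: int = 0) -> list:
    """Return the indices of the frames that get processed (hypothetical helper)."""
    if frame_skip < 0:
        raise ValueError("frame_skip must be >= 0")
    # Process one frame, then skip the next `frame_skip` frames.
    return list(range(0, total_frames, frame_skip + 1))


all_frames = processed_frame_indices(10)                 # frame_skip=0: every frame
every_third = processed_frame_indices(10, frame_skip=2)  # 1 in every 3 frames
```

With frame_skip=2, only frames 0, 3, 6, and 9 of a 10-frame clip are examined, which is why cuts can be located less precisely.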

get_num_detectors()

Get number of registered scene detectors added via add_detector.

Return type:

int

get_scene_list(base_timecode=None, start_in_scene=False)

Return a list of tuples of start/end FrameTimecodes for each detected scene.

Parameters:
  • base_timecode (FrameTimecode | None) – [DEPRECATED] DO NOT USE. For backwards compatibility.

  • start_in_scene (bool) – Assume the video begins in a scene. This means that when detecting fast cuts with ContentDetector, if no cuts are found, the resulting scene list will contain a single scene spanning the entire video (instead of no scenes). When detecting fades with ThresholdDetector, the beginning portion of the video will always be included until the first fade-out event is detected.

Returns:

List of tuples in the form (start_time, end_time), where both start_time and end_time are FrameTimecode objects representing the exact time/frame where each detected scene in the video begins and ends.

Return type:

List[Tuple[FrameTimecode, FrameTimecode]]

stop()

Stop the current detect_scenes() call, if any. Thread-safe.

Return type:

None

property auto_downscale: bool

If set to True, will automatically downscale based on video frame size.

Overrides downscale if set.

property crop: Tuple[int, int, int, int] | None

Portion of the frame to crop. Tuple of 4 ints in the form (X0, Y0, X1, Y1) where X0, Y0 describes one point and X1, Y1 is another which describe a rectangle inside of the frame. Coordinates start from 0 and are inclusive. For example, with a 100x100 pixel video, (0, 0, 99, 99) covers the entire frame.
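
The inclusive coordinate convention can be sketched in plain Python as follows. crop_frame is a hypothetical helper used only for illustration (the library itself works on image arrays), and the frame here is a simple list of rows.

```python
def crop_frame(frame, region):
    """Crop a frame (list of rows) to an inclusive (X0, Y0, X1, Y1) region."""
    x0, y0, x1, y1 = region
    # Both corners are inclusive, so add 1 to the upper slice bounds.
    return [row[x0:x1 + 1] for row in frame[y0:y1 + 1]]


# A 100x100 "frame" where each pixel stores its own (x, y) position.
frame = [[(x, y) for x in range(100)] for y in range(100)]

full = crop_frame(frame, (0, 0, 99, 99))    # covers the entire frame
part = crop_frame(frame, (10, 20, 49, 39))  # 40 pixels wide, 20 pixels high
```

Because the coordinates are inclusive, the cropped width is X1 - X0 + 1 and the cropped height is Y1 - Y0 + 1.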

property downscale: int

Factor to downscale each frame by. Will always be >= 1, where 1 indicates no scaling. Will be ignored if auto_downscale=True.

property interpolation: Interpolation

Interpolation method to use when downscaling frames. Must be one of cv2.INTER_*.

property stats_manager: StatsManager | None

Getter for the StatsManager associated with this SceneManager, if any.

scenedetect.scene_manager.compute_downscale_factor(frame_width, effective_width=256)

Get the optimal default downscale factor based on a video’s resolution (currently only the width in pixels is considered).

The resulting effective width of the video will be between effective_width and 1.5 * effective_width pixels (e.g. if effective_width is 200, the range of effective widths will be between 200 and 300).

Parameters:
  • frame_width (int) – Actual width of the video frame in pixels.

  • effective_width (int) – Desired minimum width in pixels.

Returns:

The default downscale factor to use to achieve at least the target effective_width.

Return type:

int

scenedetect.scene_manager.get_scenes_from_cuts(cut_list, start_pos, end_pos, base_timecode=None)

Returns a list of tuples of start/end FrameTimecodes for each scene based on a list of detected scene cuts/breaks.

This function is called when using the SceneManager.get_scene_list() method. The scene list is generated from a cutting list (SceneManager.get_cut_list()), noting that each scene is contiguous, starting from the first to last frame of the input. If cut_list is empty, the resulting scene will span from start_pos to end_pos.

Parameters:
  • cut_list (List[FrameTimecode]) – List of FrameTimecode objects where scene cuts/breaks occur.

  • start_pos (int | FrameTimecode) – The start frame or FrameTimecode of the cut list. Used to generate the first scene’s start time.

  • end_pos (int | FrameTimecode) – The end frame or FrameTimecode of the video that was processed. Used to generate the last scene’s end time.

  • base_timecode (FrameTimecode | None) – [DEPRECATED] DO NOT USE. For backwards compatibility only.

Returns:

List of tuples in the form (start_time, end_time), where both start_time and end_time are FrameTimecode objects representing the exact time/frame range that each scene occupies based on the input cut_list.

Return type:

List[Tuple[FrameTimecode, FrameTimecode]]
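
The conversion from a cut list to a scene list can be sketched with plain integers standing in for FrameTimecode objects. scenes_from_cuts below is a hypothetical re-implementation for illustration only, not the library’s code; it assumes each scene ends at the position where the next one starts.

```python
def scenes_from_cuts(cut_list, start_pos, end_pos):
    """Build contiguous (start, end) pairs from a sorted list of cut positions."""
    scenes = []
    scene_start = start_pos
    for cut in cut_list:
        # Each cut closes the current scene and opens the next one.
        scenes.append((scene_start, cut))
        scene_start = cut
    # The final scene runs to the end of the processed input.
    scenes.append((scene_start, end_pos))
    return scenes


with_cuts = scenes_from_cuts([90, 210], start_pos=0, end_pos=300)
no_cuts = scenes_from_cuts([], start_pos=0, end_pos=300)
```

Two cuts produce three contiguous scenes, and an empty cut list produces a single scene spanning start_pos to end_pos, matching the behaviour described above.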

scenedetect.scene_manager.save_images(scene_list, video, num_images=3, frame_margin=1, image_extension='jpg', encoder_param=95, image_name_template='$VIDEO_NAME-Scene-$SCENE_NUMBER-$IMAGE_NUMBER', output_dir=None, show_progress=False, scale=None, height=None, width=None, interpolation=Interpolation.CUBIC, threading=True, video_manager=None)

Save a set number of images from each scene, given a list of scenes and the associated video/frame source.

Parameters:
  • scene_list (List[Tuple[FrameTimecode, FrameTimecode]]) – A list of scenes (pairs of FrameTimecode objects) returned from calling a SceneManager’s get_scene_list() method.

  • video (VideoStream) – A VideoStream object corresponding to the scene list. Note that the video will be closed/re-opened and seeked through.

  • num_images (int) – Number of images to generate for each scene. Minimum is 1.

  • frame_margin (int) – Number of frames to pad each scene around the beginning and end (e.g. moves the first/last image into the scene by N frames). Can be set to 0, but this will result in some video files failing to extract the very last frame.

  • image_extension (str) – Type of image to save (must be one of ‘jpg’, ‘png’, or ‘webp’).

  • encoder_param (int) – Quality/compression efficiency, based on type of image: ‘jpg’ / ‘webp’: Quality 0-100, higher is better quality. 100 is lossless for webp. ‘png’: Compression from 1-9, where 9 achieves best filesize but is slower to encode.

  • image_name_template (str) – Template to use for naming image files. Can use the template variables $VIDEO_NAME, $SCENE_NUMBER, $IMAGE_NUMBER, $TIMECODE, $FRAME_NUMBER, $TIMESTAMP_MS. Should not include an extension.

  • output_dir (str | None) – Directory to output the images into. If not set, the output is created in the working directory.

  • show_progress (bool | None) – If True, shows a progress bar if tqdm is installed.

  • scale (float | None) – Optional factor by which to rescale saved images. A scaling factor of 1 would not result in rescaling. A value < 1 results in a smaller saved image, while a value > 1 results in an image larger than the original. This value is ignored if either the height or width values are specified.

  • height (int | None) – Optional value for the height of the saved images. Specifying both the height and width will resize images to an exact size, regardless of aspect ratio. Specifying only height will rescale the image to that number of pixels in height while preserving the aspect ratio.

  • width (int | None) – Optional value for the width of the saved images. Specifying both the width and height will resize images to an exact size, regardless of aspect ratio. Specifying only width will rescale the image to that number of pixels wide while preserving the aspect ratio.

  • interpolation (Interpolation) – Type of interpolation to use when resizing images.

  • threading (bool) – Offload image encoding and disk IO to background threads to improve performance.

  • video_manager – [DEPRECATED] DO NOT USE. For backwards compatibility only.

Returns:

Dictionary of the format { scene_num: [image_paths] }, where scene_num is the number of the scene in scene_list (starting from 1), and image_paths is a list of the paths to the newly saved/created images.

Return type:

Dict[int, List[str]]

Raises:

ValueError – Raised if any arguments are invalid or out of range (e.g. if num_images is negative).
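
The precedence between the scale, height, and width parameters above can be sketched as follows. output_size is a hypothetical helper written for this example; the rounding behaviour in particular is an assumption and may differ from what the library does.

```python
def output_size(frame_width, frame_height, scale=None, height=None, width=None):
    """Return the (width, height) of a saved image under the documented rules."""
    if height is not None and width is not None:
        # Both given: exact size, aspect ratio is not preserved.
        return (width, height)
    if height is not None:
        # Only height: derive width so the aspect ratio is preserved.
        return (round(frame_width * height / frame_height), height)
    if width is not None:
        # Only width: derive height so the aspect ratio is preserved.
        return (width, round(frame_height * width / frame_width))
    if scale is not None:
        # scale is only considered when neither height nor width is set.
        return (round(frame_width * scale), round(frame_height * scale))
    return (frame_width, frame_height)


half_height = output_size(1920, 1080, height=540)
exact = output_size(1920, 1080, width=100, height=100)
scale_ignored = output_size(1920, 1080, scale=0.5, height=270)
```

In the last call the scale value has no effect because height is specified.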

scenedetect.scene_manager.write_scene_list(output_csv_file, scene_list, include_cut_list=True, cut_list=None, col_separator=',', row_separator='\n')

Writes the given list of scenes to an output file handle in CSV format.

Parameters:
  • output_csv_file (TextIO) – Handle to open file in write mode.

  • scene_list (List[Tuple[FrameTimecode, FrameTimecode]]) – List of pairs of FrameTimecodes denoting each scene’s start/end FrameTimecode.

  • include_cut_list (bool) – Bool indicating if the first row should include the timecodes where each scene starts. Should be set to False if RFC 4180 compliant CSV output is required.

  • cut_list (List[FrameTimecode] | None) – Optional list of FrameTimecode objects denoting the cut list (i.e. the frames in the video that need to be split to generate individual scenes). If not specified, the cut list is generated using the start times of each scene following the first one.

  • col_separator (str) – Delimiter to use between values. Must be single character.

  • row_separator (str) – Line terminator to use between rows.

Raises:

TypeError – Raised if col_separator is not a single character (“delimiter” must be a 1-character string).
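
The separator parameters behave like the delimiter and lineterminator options of Python’s csv module. The sketch below assumes write_scene_list forwards col_separator and row_separator to csv.writer in this way, which is an assumption rather than something stated above. write_rows and the sample rows are hypothetical; the real column layout is defined by write_scene_list.

```python
import csv
import io


def write_rows(rows, col_separator=",", row_separator="\n"):
    """Write rows to an in-memory buffer and return the resulting text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=col_separator, lineterminator=row_separator)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows, for illustration only.
rows = [["scene", "start", "end"], [1, "00:00:00.000", "00:00:03.750"]]
tab_separated = write_rows(rows, col_separator="\t")

# A multi-character column separator is rejected by csv.writer.
try:
    write_rows(rows, col_separator="||")
    rejected = False
except TypeError:
    rejected = True
```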

scenedetect.scene_manager.write_scene_list_html(output_html_filename, scene_list, cut_list=None, css=None, css_class='mytable', image_filenames=None, image_width=None, image_height=None)

Writes the given list of scenes to an output file in HTML format.

Parameters:
  • output_html_filename (str) – Filename of the output HTML file.

  • scene_list (List[Tuple[FrameTimecode, FrameTimecode]]) – List of pairs of FrameTimecodes denoting each scene’s start/end FrameTimecode.

  • cut_list (List[FrameTimecode] | None) – Optional list of FrameTimecode objects denoting the cut list (i.e. the frames in the video that need to be split to generate individual scenes). If not passed, the start times of each scene (besides the 0th scene) are used instead.

  • css (str) – String containing all the CSS information for the resulting HTML page.

  • css_class (str) – String containing the named CSS class.

  • image_filenames (Dict[int, List[str]] | None) – Dict where key i contains a list with n elements (filenames of the n saved images from that scene).

  • image_width (int | None) – Optional desired width of images in the table, in pixels.

  • image_height (int | None) – Optional desired height of images in the table, in pixels.

scenedetect.scene_manager.CropRegion

Type hint for rectangle of the form X0 Y0 X1 Y1 for cropping frames. Coordinates are relative to source frame without downscaling.

alias of Tuple[int, int, int, int]

scenedetect.scene_manager.CutList

Type hint for a list of cuts, where each timecode represents the first frame of a new shot.

alias of List[FrameTimecode]

scenedetect.scene_manager.DEFAULT_MIN_WIDTH: int = 256

The default minimum width a frame will be downscaled to when calculating a downscale factor.

scenedetect.scene_manager.MAX_FRAME_QUEUE_LENGTH: int = 4

Maximum number of decoded frames which can be buffered while waiting to be processed.
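
The role of a bounded frame queue can be sketched with the standard library alone. This is an illustrative producer/consumer pattern, not the library’s implementation: decode_frames and process_video are hypothetical names, and integers stand in for decoded frames.

```python
import queue
import threading

MAX_FRAME_QUEUE_LENGTH = 4  # same value as the constant documented above


def decode_frames(frames, frame_queue):
    """Producer: stands in for the background decoding thread."""
    for frame in frames:
        # put() blocks while the queue already holds the maximum number of frames.
        frame_queue.put(frame)
    frame_queue.put(None)  # sentinel: no more frames


def process_video(frames):
    """Consumer: pulls frames off the queue and counts how many were processed."""
    frame_queue = queue.Queue(maxsize=MAX_FRAME_QUEUE_LENGTH)
    decoder = threading.Thread(target=decode_frames, args=(frames, frame_queue), daemon=True)
    decoder.start()
    processed = 0
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        processed += 1  # a detector would examine the frame here
    decoder.join()
    return processed


num_processed = process_video(range(100))
```

Bounding the queue keeps memory use fixed: the decoder can run at most a few frames ahead of detection before it has to wait.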

scenedetect.scene_manager.MAX_FRAME_SIZE_ERRORS: int = 16

Maximum number of frame size error messages that can be logged.

scenedetect.scene_manager.PROGRESS_BAR_DESCRIPTION = '  Detected: %d | Progress'

Template to use for progress bar.

scenedetect.scene_manager.SceneList

Type hint for a list of scenes in the form (start time, end time).

alias of List[Tuple[FrameTimecode, FrameTimecode]]