Reference for ultralytics/data/loaders.py
Note
This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/loaders.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
ultralytics.data.loaders.SourceTypes
dataclass
SourceTypes(stream: bool = False, screenshot: bool = False, from_img: bool = False, tensor: bool = False)
Class to represent various types of input sources for predictions.
This class uses dataclass to define boolean flags for different types of input sources that can be used for making predictions with YOLO models.
Attributes:
Name | Type | Description |
---|---|---|
stream | bool | Flag indicating if the input source is a video stream. |
screenshot | bool | Flag indicating if the input source is a screenshot. |
from_img | bool | Flag indicating if the input source is an image file. |
tensor | bool | Flag indicating if the input source is a tensor. |
Examples:
>>> source_types = SourceTypes(stream=True, screenshot=False, from_img=False)
>>> print(source_types.stream)
True
>>> print(source_types.from_img)
False
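As a rough illustration of how such flags might be consumed, the sketch below defines a stand-in dataclass with the same four fields as the signature above. The names SourceTypesSketch and describe are hypothetical and are not part of Ultralytics.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SourceTypesSketch:
    """Stand-in mirroring the documented signature; not the library class itself."""

    stream: bool = False
    screenshot: bool = False
    from_img: bool = False
    tensor: bool = False


def describe(st: SourceTypesSketch) -> List[str]:
    """Hypothetical helper: return the names of all flags that are set."""
    return [f.name for f in fields(st) if getattr(st, f.name)]


flags = SourceTypesSketch(stream=True)
print(describe(flags))  # ['stream']
```

Using one boolean per source kind keeps downstream checks simple, for example branching on the stream flag to choose a real-time code path.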
ultralytics.data.loaders.LoadStreams
Stream Loader for various types of video streams.
Supports RTSP, RTMP, HTTP, and TCP streams. This class handles the loading and processing of multiple video streams simultaneously, making it suitable for real-time video analysis tasks.
Attributes:
Name | Type | Description |
---|---|---|
sources | List[str] | The source input paths or URLs for the video streams. |
vid_stride | int | Video frame-rate stride. |
buffer | bool | Whether to buffer input streams. |
running | bool | Flag to indicate if the streaming thread is running. |
mode | str | Set to 'stream' indicating real-time capture. |
imgs | List[List[ndarray]] | List of image frames for each stream. |
fps | List[float] | List of FPS for each stream. |
frames | List[int] | List of total frames for each stream. |
threads | List[Thread] | List of threads for each stream. |
shape | List[Tuple[int, int, int]] | List of shapes for each stream. |
caps | List[VideoCapture] | List of cv2.VideoCapture objects for each stream. |
bs | int | Batch size for processing. |
Methods:
Name | Description |
---|---|
update | Read stream frames in daemon thread. |
close | Close stream loader and release resources. |
__iter__ | Returns an iterator object for the class. |
__next__ | Returns source paths, transformed, and original images for processing. |
__len__ | Return the length of the sources object. |
Examples:
>>> stream_loader = LoadStreams("rtsp://example.com/stream1.mp4")
>>> for sources, imgs, _ in stream_loader:
...     # Process the images
...     pass
>>> stream_loader.close()
Notes
- The class uses threading to efficiently load frames from multiple streams simultaneously.
- It automatically handles YouTube links, converting them to the best available stream URL.
- The class implements a buffer system to manage frame storage and retrieval.
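The threading and buffering behavior described in these notes can be sketched without OpenCV. In the example below, FakeCapture, reader, and run are hypothetical names introduced for illustration; FakeCapture stands in for cv2.VideoCapture and yields integers instead of frames. The real update method may differ in detail.

```python
import threading
from typing import List


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture that yields integer 'frames'."""

    def __init__(self, n_frames: int):
        self._frames = iter(range(n_frames))

    def read(self):
        frame = next(self._frames, None)
        return frame is not None, frame


def reader(cap: FakeCapture, buf: List[int], lock: threading.Lock, vid_stride: int, buffer: bool) -> None:
    """Read frames in a background thread, keeping every vid_stride-th frame."""
    n = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        n += 1
        if n % vid_stride:
            continue  # skip frames that fall between strides
        with lock:
            if buffer:
                buf.append(frame)  # buffered: keep every strided frame
            else:
                buf[:] = [frame]  # unbuffered: keep only the latest frame


def run(n_frames: int, vid_stride: int = 1, buffer: bool = False) -> List[int]:
    """Start one daemon reader thread, wait for it, and return the buffer."""
    buf: List[int] = []
    lock = threading.Lock()
    cap = FakeCapture(n_frames)
    t = threading.Thread(target=reader, args=(cap, buf, lock, vid_stride, buffer), daemon=True)
    t.start()
    t.join()
    return buf


print(run(10, vid_stride=2, buffer=True))  # [1, 3, 5, 7, 9]
print(run(10, vid_stride=2, buffer=False))  # [9]
```

With buffering off, a slow consumer always sees the most recent frame, at the cost of dropping frames. With buffering on, no strided frame is lost, but latency can grow.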
Source code in ultralytics/data/loaders.py
__iter__
__len__
__next__
Returns the next batch of frames from multiple video streams for processing.
close
Terminates stream loader, stops threads, and releases video capture resources.
update
Read stream frames in daemon thread and update image buffer.
ultralytics.data.loaders.LoadScreenshots
Ultralytics screenshot dataloader for capturing and processing screen images.
This class manages the loading of screenshot images for processing with YOLO. It is suitable for use with the CLI command "yolo predict source=screen".
Attributes:
Name | Type | Description |
---|---|---|
source | str | The source input indicating which screen to capture. |
screen | int | The screen number to capture. |
left | int | The left coordinate for screen capture area. |
top | int | The top coordinate for screen capture area. |
width | int | The width of the screen capture area. |
height | int | The height of the screen capture area. |
mode | str | Set to 'stream' indicating real-time capture. |
frame | int | Counter for captured frames. |
sct | mss | Screen capture object from the mss library. |
bs | int | Batch size, set to 1. |
fps | int | Frames per second, set to 30. |
monitor | Dict[str, int] | Monitor configuration details. |
Methods:
Name | Description |
---|---|
__iter__ | Returns an iterator object. |
__next__ | Captures the next screenshot and returns it. |
Examples:
>>> loader = LoadScreenshots("0 100 100 640 480") # screen 0, top-left (100,100), 640x480
>>> for sources, imgs, info in loader:
...     print(f"Captured frame: {imgs[0].shape}")
__iter__
__next__
Captures and returns the next screenshot as a numpy array using the mss library.
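To make the capture-region attributes concrete, here is a minimal sketch that turns the example string "0 100 100 640 480" into a screen number and a monitor dictionary with the left, top, width, and height keys that mss uses for regions. The function parse_capture_region is hypothetical, and it assumes exactly five integers in the order shown in the example comment. It is not the loader's own parser, which may accept other forms.

```python
from typing import Dict, Tuple


def parse_capture_region(source: str) -> Tuple[int, Dict[str, int]]:
    """Hypothetical parser: expects 'screen left top width height' as integers."""
    parts = source.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 integers, got {len(parts)}: {source!r}")
    screen, left, top, width, height = (int(x) for x in parts)
    # Region dictionary in the shape mss expects for a capture area
    monitor = {"left": left, "top": top, "width": width, "height": height}
    return screen, monitor


screen, monitor = parse_capture_region("0 100 100 640 480")
print(screen, monitor)
```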
ultralytics.data.loaders.LoadImagesAndVideos
A class for loading and processing images and videos for YOLO object detection.
This class manages the loading and pre-processing of image and video data from various sources, including single image files, video files, and lists of image and video paths.
Attributes:
Name | Type | Description |
---|---|---|
files | List[str] | List of image and video file paths. |
nf | int | Total number of files (images and videos). |
video_flag | List[bool] | Flags indicating whether a file is a video (True) or an image (False). |
mode | str | Current mode, 'image' or 'video'. |
vid_stride | int | Stride for video frame-rate. |
bs | int | Batch size. |
cap | VideoCapture | Video capture object for OpenCV. |
frame | int | Frame counter for video. |
frames | int | Total number of frames in the video. |
count | int | Counter for iteration, initialized at 0 during iter(). |
ni | int | Number of images. |
Methods:
Name | Description |
---|---|
__iter__ | Returns an iterator object for VideoStream or ImageFolder. |
__next__ | Returns the next batch of images or video frames along with their paths and metadata. |
_new_video | Creates a new video capture object for the given path. |
__len__ | Returns the number of batches in the object. |
Examples:
>>> loader = LoadImagesAndVideos("path/to/data", batch=32, vid_stride=1)
>>> for paths, imgs, info in loader:
...     # Process batch of images or video frames
...     pass
Notes
- Supports various image formats including HEIC.
- Handles both local files and directories.
- Can read from a text file containing paths to images and videos.
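The relationship between files, video_flag, ni, and the batch count can be sketched as follows. The extension sets and the helpers split_sources and num_batches are assumptions made for illustration; the library keeps its own lists of supported formats and may order or count files differently.

```python
import math
from pathlib import Path
from typing import List, Tuple

# Assumed, deliberately small subsets of supported extensions
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".heic"}
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv"}


def split_sources(files: List[str]) -> Tuple[List[str], List[bool]]:
    """Hypothetical helper: order images before videos and flag which is which."""
    images = [f for f in files if Path(f).suffix.lower() in IMAGE_EXTS]
    videos = [f for f in files if Path(f).suffix.lower() in VIDEO_EXTS]
    video_flag = [False] * len(images) + [True] * len(videos)
    return images + videos, video_flag


def num_batches(n_items: int, batch: int) -> int:
    """Number of batches needed to cover n_items, rounding up."""
    return math.ceil(n_items / batch)


files, video_flag = split_sources(["a.jpg", "clip.mp4", "b.HEIC", "notes.txt"])
print(files)  # ['a.jpg', 'b.HEIC', 'clip.mp4']
print(video_flag)  # [False, False, True]
print(num_batches(70, 32))  # 3
```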
__iter__
__len__
__next__
Returns the next batch of images or video frames with their paths and metadata.
ultralytics.data.loaders.LoadPilAndNumpy
Load images from PIL and Numpy arrays for batch processing.
This class manages loading and pre-processing of image data from both PIL and Numpy formats. It performs basic validation and format conversion to ensure that the images are in the required format for downstream processing.
Attributes:
Name | Type | Description |
---|---|---|
paths | List[str] | List of image paths or autogenerated filenames. |
im0 | List[ndarray] | List of images stored as Numpy arrays. |
mode | str | Type of data being processed, set to 'image'. |
bs | int | Batch size, equivalent to the length of im0. |
Methods:
Name | Description |
---|---|
_single_check | Validate and format a single image to a Numpy array. |
Examples:
>>> from PIL import Image
>>> import numpy as np
>>> pil_img = Image.new("RGB", (100, 100))
>>> np_img = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
>>> loader = LoadPilAndNumpy([pil_img, np_img])
>>> paths, images, _ = next(iter(loader))
>>> print(f"Loaded {len(images)} images")
Loaded 2 images
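The paths attribute holds either real image paths or autogenerated filenames. One way this could work is sketched below, using plain objects in place of PIL images so the snippet has no third-party dependencies. The helper make_paths and the "image{i}.jpg" naming pattern are assumptions for illustration.

```python
from types import SimpleNamespace
from typing import Any, List


def make_paths(images: List[Any]) -> List[str]:
    """Hypothetical helper: use an image's filename if present, else generate one."""
    return [getattr(im, "filename", "") or f"image{i}.jpg" for i, im in enumerate(images)]


# Stand-ins: one object with a filename, one without, one with an empty filename
imgs = [SimpleNamespace(filename="cat.png"), object(), SimpleNamespace(filename="")]
print(make_paths(imgs))  # ['cat.png', 'image1.jpg', 'image2.jpg']
```

Falling back on the index keeps every entry in the batch addressable even when an in-memory image carries no file name.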
__iter__
__len__
__next__
Returns the next batch of images, paths, and metadata for processing.
ultralytics.data.loaders.LoadTensor
A class for loading and processing tensor data for object detection tasks.
This class handles the loading and pre-processing of image data from PyTorch tensors, preparing them for further processing in object detection pipelines.
Attributes:
Name | Type | Description |
---|---|---|
im0 | Tensor | The input tensor containing the image(s) with shape (B, C, H, W). |
bs | int | Batch size, inferred from the shape of im0. |
mode | str | Current processing mode, set to 'image'. |
paths | List[str] | List of image paths or auto-generated filenames. |
Methods:
Name | Description |
---|---|
_single_check | Validates and formats an input tensor. |
Examples:
>>> import torch
>>> tensor = torch.rand(1, 3, 640, 640)
>>> loader = LoadTensor(tensor)
>>> paths, images, info = next(iter(loader))
>>> print(f"Processed {len(images)} images")
__iter__
__len__
__next__
Yields the next batch of tensor images and metadata for processing.
ultralytics.data.loaders.autocast_list
Merges a list of sources into a list of numpy arrays or PIL images for Ultralytics prediction.
ultralytics.data.loaders.get_best_youtube_url
Retrieves the URL of the best quality MP4 video stream from a given YouTube video.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
url | str | The URL of the YouTube video. | required |
method | str | The method to use for extracting video info. Options are "pytube", "pafy", and "yt-dlp". Defaults to "pytube". | 'pytube' |
Returns:
Type | Description |
---|---|
Optional[str] | The URL of the best quality MP4 video stream, or None if no suitable stream is found. |
Examples:
>>> url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
>>> best_url = get_best_youtube_url(url)
>>> print(best_url)
https://rr4---sn-q4flrnek.googlevideo.com/videoplayback?expire=...
Notes
- Requires additional libraries based on the chosen method: pytubefix, pafy, or yt-dlp.
- The function prioritizes streams with at least 1080p resolution when available.
- For the "yt-dlp" method, it looks for formats with video codec, no audio, and *.mp4 extension.
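The selection rule in the last two notes can be sketched over a plain list of format dictionaries. The helper pick_stream and the sample data are hypothetical; the key names (height, vcodec, acodec, ext, url) follow the fields yt-dlp reports for each format, and the example.com URLs are placeholders. This is an approximation of the described behavior, not the function's actual code.

```python
from typing import List, Optional


def pick_stream(formats: List[dict]) -> Optional[str]:
    """Hypothetical helper: return the URL of a >=1080p, video-only MP4 format."""
    # Walk from the end, assuming better formats are listed last
    for f in reversed(formats):
        good_size = (f.get("height") or 0) >= 1080
        has_video = f.get("vcodec", "none") != "none"
        no_audio = f.get("acodec", "none") == "none"
        if good_size and has_video and no_audio and f.get("ext") == "mp4":
            return f.get("url")
    return None


# Made-up sample data for illustration only
formats = [
    {"height": 720, "vcodec": "avc1", "acodec": "none", "ext": "mp4", "url": "https://example.com/720.mp4"},
    {"height": 1080, "vcodec": "avc1", "acodec": "none", "ext": "mp4", "url": "https://example.com/1080.mp4"},
    {"height": 1080, "vcodec": "vp9", "acodec": "none", "ext": "webm", "url": "https://example.com/1080.webm"},
    {"height": 1080, "vcodec": "avc1", "acodec": "mp4a", "ext": "mp4", "url": "https://example.com/1080-av.mp4"},
]
print(pick_stream(formats))  # https://example.com/1080.mp4
print(pick_stream(formats[:1]))  # None
```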