Batch Object Detection
This page shows how to run batch object detection on Machine GPU runners — the kind of pipeline you’d use to label a dataset, generate annotations, or analyze a video frame-by-frame.
When to use this
Why might you want to use GPU-accelerated batch object detection?
- Process large collections of images efficiently
- Extract and analyze objects, people, vehicles, or other entities
- Generate metadata and annotations for computer vision datasets
- Create analytics from visual content without manual intervention
How GPU-Powered Detection Works in CI/CD
The GPU Batch Object Detection workflow uses the DETR model (facebook/detr-resnet-50) with Hugging Face Transformers to detect and annotate objects in images. The workflow is defined in GitHub Actions and can be triggered on-demand with configurable parameters.
The detection process:
- Sets up the necessary environment with GPU support
- Downloads images from the COCO2017 dataset
- Processes the images using the DETR object detection model
- Generates annotated results and a CSV file with detection details
- Uploads the results as a GitHub Actions artifact
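The detection step above can be sketched as a short Python script. This is a minimal sketch, assuming the `facebook/detr-resnet-50` model via the Hugging Face `transformers` object-detection pipeline; helper names like `detections_to_rows` and `run_batch` are illustrative, not taken from the repository's actual `object_detection.py`:

```python
# Sketch of the batch detection step. Assumes the DETR model from the
# Hugging Face Hub; function names here are illustrative only.
import csv
from pathlib import Path

def detections_to_rows(image_name, detections, threshold=0.9):
    """Flatten pipeline output into CSV rows, keeping only confident boxes."""
    rows = []
    for det in detections:
        if det["score"] < threshold:
            continue
        box = det["box"]
        rows.append({
            "image": image_name,
            "label": det["label"],
            "score": round(det["score"], 4),
            "xmin": box["xmin"], "ymin": box["ymin"],
            "xmax": box["xmax"], "ymax": box["ymax"],
        })
    return rows

def run_batch(image_dir="images", out_csv="detection_output/detections.csv"):
    # Heavy import kept local so the helper above is usable without a GPU.
    from transformers import pipeline
    detector = pipeline("object-detection", model="facebook/detr-resnet-50")
    Path(out_csv).parent.mkdir(parents=True, exist_ok=True)
    fields = ["image", "label", "score", "xmin", "ymin", "xmax", "ymax"]
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        for img in sorted(Path(image_dir).glob("*.jpg")):
            writer.writerows(detections_to_rows(img.name, detector(str(img))))
```

The pipeline returns one dict per detected object with `score`, `label`, and a `box` of pixel coordinates, which maps directly onto the CSV artifact the workflow uploads.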
GitHub Actions Workflow for GPU Object Detection
The GPU Batch Object Detection is implemented as a GitHub Actions workflow that can be triggered manually. Here’s the workflow definition:
```yaml
name: Batch Object Detection

on:
  workflow_dispatch:
    inputs:
      tenancy:
        type: choice
        required: false
        description: 'The tenancy of the machine'
        default: 'spot'
        options:
          - 'spot'
          - 'on_demand'

jobs:
  detect_objects:
    name: Detect Objects
    runs-on:
      - machine
      - gpu=t4
      - cpu=4
      - ram=16
      - architecture=x64
      - tenancy=${{ inputs.tenancy }}
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.10
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'

      - name: Install Dependencies
        run: |
          pip install -r requirements.txt

      - name: Run Object Detection
        run: python3 object_detection.py

      - name: Upload Detection Results CSV
        uses: actions/upload-artifact@v4
        with:
          name: detection-results-csv
          path: detection_output/detections.csv
```
Using Machine GPU Runners
This object detection process leverages Machine GPU runners to provide the necessary computing power for efficient image processing. The workflow is configured to use:
- T4 GPU: An entry-level ML GPU with 16GB VRAM, well-suited for computer vision tasks
- Spot instance: To optimize for cost while maintaining performance
- Configurable resources: CPU, RAM, and architecture specifications
The T4 GPU provides excellent performance for batch object detection tasks, delivering significantly faster processing compared to CPU-only solutions. For larger workloads or more complex models, you can also configure the workflow to use more powerful GPUs like L4 or L40S.
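For example, moving to an L4 only requires changing the GPU label in the `runs-on` block; the other labels stay as in the workflow above (a sketch):

```yaml
runs-on:
  - machine
  - gpu=l4          # 24 GB VRAM instead of the T4's 16 GB
  - cpu=4
  - ram=16
  - architecture=x64
  - tenancy=${{ inputs.tenancy }}
```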
Benefits of GPU-Accelerated Computer Vision in GitHub Actions
- GPU Acceleration: Efficiently perform object detection using GPUs via Machine
- Seamless Data Integration: Automatically fetch and process images from datasets
- Automated Detection Pipeline: Detect, annotate, and export results without manual intervention
- Results in CSV: Generate a structured CSV artifact with detection details for easy review
- Easy Deployment: Use the repository as a GitHub template to quickly start your own GPU-accelerated workflows
Getting Started
To run the GPU Batch Object Detection workflow:
- Use the MachineDotDev/gpu-batch-object-detection repository as a template
- Navigate to the Actions tab in your repository
- Select the “Batch Object Detection” workflow
- Click “Run workflow” and configure your parameters:
- Choose between spot or on-demand tenancy
- Run the workflow and wait for results
- Download the detection-results-csv artifact to view the detection details
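If you prefer the command line, the same steps can be driven with the GitHub CLI (assuming `gh` is installed and authenticated against your copy of the repository):

```shell
# Trigger the workflow manually with the spot tenancy input
gh workflow run "Batch Object Detection" -f tenancy=spot

# Watch the run, then download the CSV artifact once it completes
gh run watch
gh run download --name detection-results-csv
```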
Best Practices
- Use spot instances for non-time-critical batch processing to optimize costs
- Adjust batch sizes to match your GPU memory capacity
- Pre-filter your input data to reduce processing time on irrelevant images
- Consider dataset chunking for extremely large collections of images
- Implement progress tracking for long-running batch jobs
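The batching and progress-tracking practices above can be sketched with two small helpers. These are illustrative, not from the repository script; `detect` stands in for whatever per-batch detection callable you use:

```python
# Illustrative helpers for chunked batch processing with progress logging.
def chunked(items, size):
    """Split a list of image paths into fixed-size chunks so each batch
    fits in GPU memory (tune `size` to your VRAM)."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def process_with_progress(paths, detect, batch_size=8):
    """Run detection chunk by chunk, logging progress for long batch jobs."""
    results = []
    batches = chunked(paths, batch_size)
    for n, batch in enumerate(batches, start=1):
        results.extend(detect(batch))
        print(f"batch {n}/{len(batches)} done")
    return results
```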
How to adapt this
- Larger images / heavier model: swap `gpu=t4` for `gpu=l4` (24 GB VRAM, $0.006/min spot)
- Different model: change the model ID in the Python script; any Hugging Face object detection model works
- Run on a schedule: add `on: schedule: cron: '0 2 * * *'` to process new images nightly
- Trigger from a webhook: add `on: repository_dispatch` and POST to GitHub when new data lands
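Combined, the extra triggers from the list above might look like this in the workflow file (the `repository_dispatch` event type name is illustrative):

```yaml
on:
  workflow_dispatch:        # existing manual trigger
  schedule:
    - cron: '0 2 * * *'     # nightly at 02:00 UTC
  repository_dispatch:
    types: [new-images]     # fired by your POST to the dispatches API
```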
Next steps
- Working repo — fork or use as a template
- CPU vs GPU — picking GPU size for CV workloads
- Parallel Hyperparameter Tuning — fan out the same workflow across multiple model variants