Frequently Asked Questions

This page addresses common questions about Machine, its features, pricing, and usage patterns. If you don’t find an answer to your question here, please reach out to our support team.

General Questions

What is Machine?

Machine is a SaaS platform that provides GPU acceleration for GitHub Actions workflows. It allows you to run compute-intensive machine learning and AI workloads directly in your GitHub Actions CI/CD pipeline without managing infrastructure.

How does Machine differ from other GPU cloud providers?

Unlike traditional cloud providers that require setting up and managing infrastructure, Machine integrates directly with GitHub Actions. This eliminates the need for separate accounts, billing systems, and infrastructure management. You can seamlessly add GPU acceleration to your existing GitHub workflows without any DevOps overhead.

What types of workloads are best suited for Machine?

Machine is optimized for:

  • Training and fine-tuning machine learning models
  • Running inference at scale
  • Processing large datasets with GPU acceleration
  • Computer vision and image processing
  • Language model training and evaluation
  • Hyperparameter tuning

Can I use Machine with private repositories?

Yes, Machine supports both public and private GitHub repositories. Your code and data remain secure and are only accessible within your workflow.

Getting Started

How do I sign up for Machine?

  1. Visit machine.dev and sign up for the beta program
  2. Once accepted, log in with your GitHub account
  3. Install the Machine Provisioner GitHub App to your account or organization
  4. Select the repositories you want to enable Machine for
  5. Start using GPU acceleration in your workflows

Do I need to install anything to use Machine?

No. Machine is a cloud service that requires no installation. Simply specify the machine runner in your GitHub Actions workflow YAML file, and Machine will automatically provision the GPU resources when your workflow runs.
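
For example, here is a minimal workflow sketch (the workflow, job, and step names are illustrative):

name: gpu-smoke-test
on: push

jobs:
  check-gpu:
    runs-on:
      - machine
      - gpu=T4
    steps:
      - uses: actions/checkout@v4
      - name: Verify the GPU is visible
        run: nvidia-smi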

How quickly are GPU runners provisioned?

Most GPU runners are provisioned within 1 minute of your workflow being triggered. The exact time depends on current demand and region availability.

Technical Questions

What GPUs does Machine offer?

Machine offers a range of NVIDIA GPUs:

  • T4G (16GB)
  • T4 (16GB)
  • L4 (24GB)
  • A10G (24GB)
  • L40S (48GB)

and AWS Accelerators:

  • Inferentia (32GB)
  • Inferentia2 (32GB)

Can I use a specific CUDA or driver version?

Yes. You can install any CUDA toolkit or driver version on Machine runners; you have complete freedom to customize the software environment to match your exact requirements.
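
For example, one way to pin an exact CUDA version is to run the job inside an NVIDIA CUDA container. This is a sketch that assumes Machine runners support GitHub Actions container jobs; the image tag is illustrative:

jobs:
  train:
    runs-on:
      - machine
      - gpu=L4
    container:
      image: nvidia/cuda:12.2.2-devel-ubuntu22.04  # pins CUDA 12.2
      options: --gpus all  # expose the GPU to the container
    steps:
      - name: Verify the pinned toolkit
        run: nvcc --version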

Are there limitations on network access or outbound connections?

Machine runners have full outbound internet access, allowing you to download datasets, models, or dependencies. One restriction applies:

  • Only common outbound ports are accessible (80, 443, etc.)

Can I install custom software on the runners?

Yes. Machine runners provide full root access, so you can customize the environment with any tools, libraries, or dependencies your workflow requires.
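
For example, since the runners are Ubuntu-based with root access, a step can install system packages with apt (the package names are illustrative):

steps:
  - name: Install extra system packages
    run: |
      sudo apt-get update
      sudo apt-get install -y ffmpeg build-essential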

How long can my jobs run?

Jobs can run for up to 5 days of execution time, as per GitHub Actions limits for self-hosted runners. If a job reaches this limit, it will be terminated and fail to complete. Additionally, entire workflow runs are limited to 35 days, including execution, waiting, and approval time.

For long-running workloads, we still recommend:

  • Breaking your job into smaller steps when practical
  • Implementing checkpointing to resume from interruptions
  • Using persistent storage options for checkpoints
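
As an additional safeguard, you can use GitHub Actions' built-in timeout-minutes to fail fast well before the hard limit; the value below is illustrative:

jobs:
  train:
    runs-on:
      - machine
      - gpu=L4
    timeout-minutes: 1440  # terminate after 24 hours rather than the 5-day maximum
    steps:
      # Your job steps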

Can I run multiple GPUs in a single job?

Currently, Machine supports single GPU configurations for most users. For specialized multi-GPU needs, reach out to our support team to discuss your requirements.

What operating system do the runners use?

All Machine runners use Ubuntu 22.04 LTS as the base operating system, with optimized drivers and libraries for machine learning workloads.

Pricing and Billing

How is Machine billed?

Machine uses a credit-based system where you pay only for what you use. Credits are consumed based on the GPU type, runtime duration, additional resources (CPU, RAM), and instance type (on-demand vs. spot). One credit is worth $0.005 at the base rate.

What plans are available?

Machine offers several pricing plans:

  • Pay As You Go: $0.005 per credit, no commitment, maximum 2 concurrent machines
  • Growth Plan: 11,000 credits per month (~9% discount at $0.004545 per credit), maximum 3 concurrent machines
  • Professional Plan: 20,000 credits per month (15% discount at $0.00425 per credit), maximum 5 concurrent machines
  • Business Plan: 40,000 credits per month (20% discount at $0.004 per credit), maximum 10 concurrent machines
  • Custom Enterprise Solutions: For teams needing more credits or concurrent machines

Credit Consumption Rates

Credit consumption varies by GPU type and configuration. Spot instances provide significant savings (up to 85%) compared to on-demand instances. For example:

GPU Type | vCPU | RAM (GB) | Credits/Min (Spot) | Credits/Min (On-Demand)
T4G      | 4    | 8        | 1                  | 4
T4       | 4    | 16       | 2                  | 6
L4       | 4    | 16       | 2                  | 7
A10G     | 4    | 16       | 3                  | 9
L40S     | 4    | 32       | 4                  | 13
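
As a worked example using the rates above: a 60-minute job on an A10G spot instance consumes 60 × 3 = 180 credits, or 180 × $0.005 = $0.90 at the Pay As You Go rate. The same job on-demand consumes 60 × 9 = 540 credits, or $2.70.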

Do credits expire?

Pay As You Go credits don’t expire. Credits included in monthly subscriptions expire at the end of the billing cycle if unused.

Do I pay for provisioning time?

No, you only pay for the time your job is actually running. Provisioning time and teardown time are not billed.

What happens if my job is interrupted when using spot instances?

If your job is interrupted while using spot instances, you will only be billed for the actual runtime up to the point of interruption. We recommend implementing checkpointing in your workflows when using spot instances.
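
As a sketch of that pattern, the step below assumes a hypothetical training script that pushes checkpoints to the Hugging Face Hub as it runs and resumes from the newest one on startup; the script name, flags, and secret name are all illustrative. Saving checkpoints from inside the training loop is more robust than an end-of-job upload step, because a reclaimed spot runner cannot execute any further workflow steps:

steps:
  - uses: actions/checkout@v4
  - name: Train, resuming from the latest checkpoint if one exists
    env:
      HF_TOKEN: ${{ secrets.HF_TOKEN }}  # illustrative secret name
    run: |
      # train.py is assumed to push checkpoints to the Hugging Face Hub
      # periodically and to resume from the newest one on startup.
      python train.py --checkpoint-every=10 --resume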

Workflow Configuration

How do I specify a GPU type in my workflow?

Use the runs-on syntax with the machine label and the desired GPU type:

jobs:
  train:
    runs-on:
      - machine
      - gpu=A10G
    steps:
      # Your job steps

Can I use both regular GitHub runners and Machine runners in the same workflow?

Yes, you can mix regular GitHub-hosted runners and Machine GPU runners in different jobs within the same workflow:

jobs:
  prepare:
    runs-on: ubuntu-latest
    # Preparation steps on standard runner
  train:
    needs: prepare
    runs-on:
      - machine
      - gpu=A10G
    # GPU training steps

How can I tell if my job is efficiently using the GPU?

Machine provides GPU monitoring metrics during your job run. You can also use nvidia-smi in your workflow to check GPU utilization:

steps:
  - name: Check GPU utilization
    run: |
      # Poll GPU 0 every 5 seconds; run this in the background (append &)
      # if it needs to overlap a long-running training step
      nvidia-smi -l 5 -i 0

What happens if my workflow exceeds the maximum runtime?

Jobs running longer than the maximum allowed runtime will be automatically terminated. The job will be marked as failed in GitHub Actions.

Can I specify regional preferences for my GPU runners?

Yes, you can specify preferred regions for your GPU runners using the regions parameter:

runs-on:
  - machine
  - gpu=L4
  - regions=us-east-1,us-east-2

This allows you to meet data sovereignty requirements or target regions with better availability.

Troubleshooting

My job is failing with an “out of memory” error. What should I do?

If you’re encountering GPU out of memory errors:

  1. Reduce your batch size
  2. Enable gradient checkpointing
  3. Use mixed precision training
  4. Choose a GPU with more memory
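
For the last item, moving a job from the 24GB L4 to the 48GB L40S is a one-line change to the runner labels:

runs-on:
  - machine
  - gpu=L40S  # 48GB, up from 24GB on the L4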

My spot instance was terminated before my job completed. How do I handle this?

For spot instances, implement checkpointing in your code:

  1. Save model state regularly
  2. Design your workflow to resume from checkpoints

For example:

steps:
  - name: Train with checkpoints
    run: python train.py --checkpoint-every=10

I’m experiencing slow download/upload speeds. How can I improve this?

To improve data transfer speeds:

  1. Use Hugging Face to store models, checkpoints and datasets when possible
  2. Use compression when uploading/downloading large artifacts
  3. Consider using GitHub’s cache action for frequently used datasets or dependencies
  4. Pre-process and downsample data where possible
  5. Use efficient formats for storing data (e.g., parquet instead of CSV)
  6. Break large uploads/downloads into smaller chunks
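
For items 1 and 3, here is a minimal sketch that caches the default Hugging Face download directory between runs so models and datasets are fetched only once (the cache key is illustrative):

steps:
  - name: Cache Hugging Face downloads
    uses: actions/cache@v4
    with:
      path: ~/.cache/huggingface
      key: hf-${{ runner.os }}-${{ hashFiles('requirements.txt') }}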

How do I optimize my workflow to reduce costs?

To optimize costs on Machine:

  1. Use spot instances whenever possible
  2. Choose the right GPU for your workload (don’t use L40S when L4 would suffice)
  3. Implement efficient checkpointing to recover from interruptions
  4. Use workflow conditionals to only run GPU jobs when necessary
  5. Optimize your code to reduce total runtime
  6. Consider running preparatory and post-processing steps on standard GitHub runners
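
For item 4, a minimal sketch that triggers the GPU workflow only when model code changes (the paths filter is illustrative):

on:
  push:
    paths:
      - 'model/**'  # skip the GPU job entirely for unrelated changes

jobs:
  train:
    runs-on:
      - machine
      - gpu=L4
    steps:
      # GPU training steps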

For more optimization tips, see our Cost Optimization Guide.

Where can I learn more?

To learn more about Machine and how to optimize your workflows, explore the rest of our documentation or reach out to our support team.