Machine supercharges your GitHub Workflows with seamless GPU acceleration. Say goodbye to the tedious overhead of managing GPU runners and hello to streamlined efficiency.

What GPU types are available?

Machine offers a wide selection of NVIDIA GPUs including T4G, T4, L4, A10G, and L40S, as well as AWS Inferentia options to match your specific workflow needs.

How do I integrate Machine with GitHub Actions?

Machine integrates natively with GitHub Actions. Set the runs-on key of a job to the Machine runner labels and the job runs on a GPU runner, with no other changes to the workflow.

How much faster are GPU-accelerated workflows?

Machine accelerates model training, inference, batch processing, and simulations, with speedups of up to 100× over CPU-only workflows depending on the workload.

LLM Supervised Fine-Tuning

The LLM Supervised Fine-Tuning workflow lets you fine-tune language models on popular conversational datasets. It runs on Machine GPU runners to train efficiently and produces a model version tuned to your use case.

This example fine-tunes the Llama 3.2 3B Instruct model on the FineTome-100k dataset.

Use Case Overview

Why might you want to fine-tune language models?

Adapt pre-trained models to specific domains or tasks
Improve performance on domain-specific conversational scenarios
Create models that better align with your brand voice or style
Reduce hallucinations and improve factual accuracy in specific domains

How It Works

The LLM Supervised Fine-Tuning workflow uses Unsloth to accelerate the fine-tuning process. The workflow is defined in GitHub Actions workflow files and can be triggered on-demand with configurable parameters.

The fine-tuning process:

Loads a specified base model (e.g., Llama 3.2 3B Instruct)
Prepares a conversational dataset (e.g., FineTome-100k or OpenAssistant’s oasst1)
Applies Low-Rank Adaptation (LoRA) for memory-efficient training
Automatically saves checkpoints during training (in the retry-enabled workflow)
Pushes the fine-tuned model to Hugging Face Hub
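
To make the memory savings from LoRA concrete, the sketch below counts trainable parameters for a single weight matrix. It assumes the standard LoRA formulation, in which a frozen d × k weight is left untouched and two small factors of shape d × r and r × k are trained instead. The 3072 × 3072 size is an illustrative assumption, not a value read from the repository, and the function names are hypothetical.

```python
def lora_trainable_params(d: int, k: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one d x k weight matrix."""
    # LoRA learns an update B @ A, with B of shape (d, rank) and A of shape (rank, k).
    return rank * (d + k)


def full_params(d: int, k: int) -> int:
    """Parameters updated when the d x k matrix is fine-tuned directly."""
    return d * k


if __name__ == "__main__":
    d = k = 3072  # illustrative projection size (assumption)
    rank = 64     # same value as the workflow's default lora_rank
    lora = lora_trainable_params(d, k, rank)
    full = full_params(d, k)
    print(f"LoRA: {lora:,} trainable vs {full:,} full ({lora / full:.1%})")
```

With these numbers LoRA trains 393,216 parameters for the matrix instead of 9,437,184, roughly 4% of the original, which is why a higher rank costs more memory and a lower rank costs less.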

Workflow Implementation

The LLM Supervised Fine-Tuning workflow is implemented as GitHub Actions workflows that are triggered manually. Here’s the basic workflow definition:

name: Supervised Fine-Tuning

on:
  workflow_dispatch:
    inputs:
      source_model:
        type: string
        required: false
        description: 'The base model to fine-tune'
        default: 'unsloth/Llama-3.2-3B-Instruct'
      data_set:
        type: string
        required: false
        description: 'Which dataset to use for fine-tuning'
        default: 'finetome-100k'
      max_seq_length:
        type: string
        required: false
        description: 'The maximum sequence length'
        default: '4096'
      lora_rank:
        type: string
        required: false
        description: 'The lora rank'
        default: '64'
      max_steps:
        type: string
        required: false
        description: 'The maximum number of steps'
        default: '250'
      gpu_memory_utilization:
        type: string
        required: false
        description: 'The GPU memory utilization'
        default: '0.90'
      learning_rate:
        type: string
        required: false
        description: 'The learning rate'
        default: '2e-5'
      per_device_train_batch_size:
        type: string
        required: false
        description: 'The per device training batch size'
        default: '2'
      hf_repo:
        type: string
        required: true
        description: 'The Hugging Face repository to upload the model to'

jobs:
  train:
    name: Supervised LoRA Training (unsloth)
    # Machine runner labels: GPU type, CPU, RAM, and CPU architecture
    runs-on:
      - machine
      - gpu=T4
      - cpu=4
      - ram=16
      - architecture=x64
    timeout-minutes: 180
    # Job-level env: workflow inputs reach every step as environment variables
    env:
      SOURCE_MODEL: ${{ inputs.source_model }}
      MAX_SEQ_LENGTH: ${{ inputs.max_seq_length }}
      LORA_RANK: ${{ inputs.lora_rank }}
      DATA_SET: ${{ inputs.data_set }}
      GPU_MEMORY_UTILIZATION: ${{ inputs.gpu_memory_utilization }}
      MAX_STEPS: ${{ inputs.max_steps }}
      LEARNING_RATE: ${{ inputs.learning_rate }}
      PER_DEVICE_TRAIN_BATCH_SIZE: ${{ inputs.per_device_train_batch_size }}
      HF_TOKEN: ${{ secrets.HF_TOKEN }}
      HF_HUB_ENABLE_HF_TRANSFER: 1
      HF_REPO: ${{ inputs.hf_repo }}
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.10
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          pip install -r requirements.txt

      - name: Run Training
        run: |
          python3 train.py
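
Because the workflow passes every input through the job's env block and calls train.py without arguments, the script presumably reads its settings from environment variables. The following is a minimal sketch of how that could look, not the repository's actual train.py: the TrainingConfig class and load_config function are hypothetical names, and the fallback defaults simply mirror the workflow inputs above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    source_model: str
    data_set: str
    max_seq_length: int
    lora_rank: int
    max_steps: int
    gpu_memory_utilization: float
    learning_rate: float
    per_device_train_batch_size: int
    hf_repo: str


def _get(env, name: str, default: str) -> str:
    # Treat an unset or empty variable as "use the default".
    return env.get(name) or default


def load_config(env=None) -> TrainingConfig:
    """Build a typed config from the variables set in the job's env block."""
    env = os.environ if env is None else env
    hf_repo = env.get("HF_REPO")
    if not hf_repo:
        # hf_repo is the only required workflow input, so there is no default.
        raise ValueError("HF_REPO must be set")
    return TrainingConfig(
        source_model=_get(env, "SOURCE_MODEL", "unsloth/Llama-3.2-3B-Instruct"),
        data_set=_get(env, "DATA_SET", "finetome-100k"),
        max_seq_length=int(_get(env, "MAX_SEQ_LENGTH", "4096")),
        lora_rank=int(_get(env, "LORA_RANK", "64")),
        max_steps=int(_get(env, "MAX_STEPS", "250")),
        gpu_memory_utilization=float(_get(env, "GPU_MEMORY_UTILIZATION", "0.90")),
        learning_rate=float(_get(env, "LEARNING_RATE", "2e-5")),
        per_device_train_batch_size=int(_get(env, "PER_DEVICE_TRAIN_BATCH_SIZE", "2")),
        hf_repo=hf_repo,
    )
```

Workflow inputs arrive as strings, so converting them once at startup surfaces a malformed value (for example a non-numeric max_steps) before any GPU time is spent.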

Advanced Retry Mechanism

For enhanced reliability, the repository also provides a workflow with automatic checkpointing and retry functionality:

name: Supervised Fine-Tuning with Retry

on:
  workflow_dispatch:
    inputs:
      attempt:
        type: string
        description: 'The attempt number'
        default: '1'
      max_attempts:
        type: number
        description: 'The maximum number of attempts'
        default: 5
      # Same parameters as in the basic workflow
      # ...

This implementation ensures training progress isn’t lost due to spot instance interruptions by:

Automatically saving checkpoints to Hugging Face Hub during training
Detecting spot instance interruptions using a custom GitHub Action
Restarting the workflow with an incremented attempt number
Resuming training from the latest checkpoint
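
Resuming depends on picking the most recent checkpoint. The sketch below shows one way to do that, assuming checkpoints are stored in folders named checkpoint-<step>, the convention used by the Hugging Face Trainer. How the repository actually names and lists its checkpoints is not shown here, so latest_checkpoint is a hypothetical helper.

```python
import re

# Matches folder names such as "checkpoint-150" and captures the step number.
_CHECKPOINT = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(names):
    """Return the checkpoint name with the highest step, or None if there is none."""
    best_step, best_name = -1, None
    for name in names:
        match = _CHECKPOINT.match(name)
        if match is None:
            continue  # ignore files that are not checkpoints
        step = int(match.group(1))
        if step > best_step:
            best_step, best_name = step, name
    return best_name


if __name__ == "__main__":
    listing = ["README.md", "checkpoint-50", "checkpoint-200", "checkpoint-100"]
    print(latest_checkpoint(listing))
```

The step is compared as an integer because a plain string sort would rank checkpoint-50 above checkpoint-200.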

The retry mechanism works through the following steps:

The workflow starts a training job with a specified attempt number (default: 1)
During training, checkpoints are periodically saved to Hugging Face Hub
If the job completes successfully, the workflow ends
If the job fails due to a spot instance interruption:
- The check-runner-interruption action detects that the failure was due to a spot instance preemption
- The workflow calculates the next attempt number
- If within the maximum attempts limit, it triggers a new workflow run with an incremented attempt number
- All original parameters are preserved for the new attempt
When a new attempt starts, it downloads the latest checkpoint and resumes training from that point

This mechanism ensures that even if a spot instance is reclaimed, your training progress isn’t lost, and the job can continue from the last checkpoint on a new instance.
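
The retry decision described above can be sketched as a small function. This is an illustration of the logic rather than the repository's implementation: next_attempt is a hypothetical name, the sketch assumes that an attempt number equal to max_attempts is still allowed, and it assumes that failures not caused by an interruption are not retried.

```python
from typing import Optional


def next_attempt(attempt: int, max_attempts: int,
                 succeeded: bool, interrupted: bool) -> Optional[int]:
    """Return the attempt number to dispatch next, or None to stop."""
    if succeeded:
        return None  # training finished, nothing to retry
    if not interrupted:
        return None  # assumption: only spot interruptions trigger a retry
    upcoming = attempt + 1
    # Assumption: attempts are numbered from 1 up to and including max_attempts.
    return upcoming if upcoming <= max_attempts else None


if __name__ == "__main__":
    print(next_attempt(attempt=1, max_attempts=5, succeeded=False, interrupted=True))
```

Keeping the attempt number as a workflow input lets each new run know how many retries have already been used without any external state.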

Using Machine GPU Runners

This fine-tuning process leverages Machine GPU runners to provide the necessary computing power. The workflow is configured to use:

T4 GPU: An entry-level ML training GPU with 16 GB of VRAM, suitable for efficient training with Unsloth optimizations
Spot instance: To optimize for cost while maintaining performance
Configurable resources: CPU, RAM, and architecture specifications

For more demanding models or larger datasets, you can also configure the workflow to use more powerful GPUs:

runs-on:
  - machine
  - gpu=L4
  - cpu=4
  - ram=16
  - architecture=x64

Getting Started

To run the LLM Supervised Fine-Tuning workflow:

Use the MachineDotDev/llm-supervised-fine-tuning repository as a template
Set up a Hugging Face access token with write permissions
Add this token as a repository secret named HF_TOKEN in your GitHub repository settings
Navigate to the Actions tab in your repository
Select the “Supervised Fine-Tuning with Retry” workflow
Click “Run workflow” and configure your parameters:
- Choose your base model and dataset
- Adjust sequence length, LoRA rank, and training steps
- Configure GPU memory utilization and learning rate
- Specify your Hugging Face target repository
Run the workflow and wait for results
Access your fine-tuned model on Hugging Face Hub
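
The workflow can also be started without the web UI, since GitHub exposes workflow_dispatch through its REST API. The sketch below only builds the request so it can be inspected before sending. The owner, workflow file name, and token are placeholders (assumptions), because the repository's actual workflow file names are not listed here.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow_file, ref, inputs, token):
    """Build (but do not send) a workflow_dispatch request for the GitHub REST API."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_dispatch_request(
        owner="your-org",                       # placeholder
        repo="llm-supervised-fine-tuning",
        workflow_file="your-workflow.yaml",     # placeholder file name
        ref="main",
        inputs={"hf_repo": "your-name/your-model", "max_steps": "250"},
        token="<github-token>",                 # placeholder
    )
    print(request.get_method(), request.full_url)
    # To actually start the run: urllib.request.urlopen(request)
```

Input values are sent as strings, matching how the workflow declares them, and any input left out falls back to the default in the workflow file.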

Best Practices

Select appropriate datasets: Choose datasets that match your target application domain
Adjust batch size for your GPU: Reduce the per-device batch size if you encounter out-of-memory errors
Use checkpointing for longer runs: For extensive training sessions, use the retry-enabled workflow
Monitor training progress: Check workflow logs to observe loss metrics
Test with prompts similar to your use case: Evaluate the model on examples that match your intended application
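
When lowering the batch size to avoid out-of-memory errors, a common companion technique is to raise gradient accumulation so the effective batch size stays the same. The sketch below illustrates the arithmetic only. Gradient accumulation is not one of the workflow's inputs, so treat it as an assumption about how the training script could be configured, and shrink_batch as a hypothetical helper.

```python
def shrink_batch(per_device_batch_size: int, grad_accum_steps: int):
    """Halve the per-device batch and double accumulation, keeping their product."""
    if per_device_batch_size <= 1:
        raise ValueError("batch size is already 1 and cannot be halved")
    if per_device_batch_size % 2:
        raise ValueError("odd batch size cannot be halved exactly")
    return per_device_batch_size // 2, grad_accum_steps * 2


if __name__ == "__main__":
    batch, accum = 2, 4  # 2 matches the workflow default; 4 is illustrative
    new_batch, new_accum = shrink_batch(batch, accum)
    # The effective batch size (batch * accumulation) is unchanged.
    assert batch * accum == new_batch * new_accum
    print(new_batch, new_accum)
```

Each optimizer step then sees the same number of examples, while peak activation memory drops because fewer sequences are held on the GPU at once.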

Next Steps

Explore the full MachineDotDev/llm-supervised-fine-tuning repository
Learn about GPU runner specifications