How We Built an AI-Powered Automated Product Enrichment Pipeline for Shopify

Engineering a fully automated workflow for a Shopify store Maintaining a successful e-commerce store comes with its fair share of challenges. It demands constant attention to ever-changing details across inventory, customer experience, and platform updates. With so many moving parts, manual oversight can quickly become overwhelming, error-prone, and time-consuming. That’s where automation steps in — not just as a convenience but as a necessity to keep your store running efficiently and at scale. While Shopify offers a rich ecosystem of apps and drag-and-drop interfaces, it often requires you to trade transparency and control for convenience. TL;DR We’ll build a pipeline using GitHub Actions to export the latest products from the Shopify store, perform some actions using LLM, and update the products. Full source for the pipeline can be found here. Taking Back Control Let the robots worry about the boring stuff! Sooner or later, you will hit the limits with off-the-shelf apps and manual workflows and start looking for alternatives. One such alternative is to shift away from GUI-centric tools toward programmable pipelines that offer complete flexibility and control. What you want is: Full ownership of your data Enhancements tailored to your brand and products Sharable Workflows: multiple stores could use the same workflow with little to no tweak Confidence in every step of the process Now, let’s explore how we can build an automated CI pipeline to help mitigate the issues mentioned above. As a proof-of-concept, we’ll create a pipeline to streamline our product-content workflow. The pipeline will use LLM to review the latest products on our store, optimize the title, add SEO title and description, and generate a summary for the team to review. The Stack Here’s what powers the workflow: Shopify — where our products live GitHub Actions — for orchestration and automation ShopCTL — A command line utility for Shopify store management OpenAI API — to revise product titles, generate SEO content, and suggestions Python and some Bash scripts — for the enrichment logic and updates First things first — Setting up the stack Let’s start by setting up a GitHub Actions workflow. We’ll store pipeline configs in the .github/workflows/ directory. Create a file named enrich-products.yml inside the workflows directory. This file will define jobs for our product-content workflow. # .github/workflows/enrich-products.yml name: Shopify Product Enrichment on: workflow_dispatch: The workflow_dispatch event in GitHub Actions allows you to manually trigger a workflow from the GitHub interface or via the API , or you can schedule it to run automatically at a specific time. API Keys We’d need a few API keys to complete our configuration: OPENAI_API_KEY for AI operations and SHOPIFY_ACCESS_TOKEN to communicate with our store. Get the OpenAI API key from your OpenAI account and set it as a secret in GitHub. Getting a Shopify access token is tricky since you need to create a dummy app to do so. Follow this official guide to get one. ShopCTL We’ll use a command-line tool to export and update our products. Let’s create a custom action that we can reuse to reference in our pipeline. Create a file called setup-shopctl.yml inside actions directory and add the following config. # .github/workflows/actions/setup-shopctl.yml name: Setup ShopCTL description: Installs Go and ShopCTL CLI runs: using: "composite" steps: - name: Set up Go uses: actions/setup-go@v5 with: go-version: "1.24" - name: Install ShopCTL shell: bash run: | sudo apt-get update sudo apt-get install -y libx11-dev go install github.com/ankitpokhrel/shopctl/cmd/shopctl@main echo "$HOME/go/bin" >> "$GITHUB_PATH" Apart from custom actions, we need to add a configuration for the store we’re operating. Create a folder called shopctl on the repo’s root and add the following config in a file called .shopconfig.yml. Replace all occurrences of store1 with your store alias. # shopctl/.shopcofig.yml ver: v0 contexts: - alias: store1 store: store1.myshopify.com currentContext: store1 Finalizing the pipeline Our pipeline has four stages, viz: Export -> Enrich -> Update -> Notify Stage 1: Export products The first step in our pipeline is to export the latest products from our store. Add a job called export-products in the enrich-products.yml file we created earlier. jobs: export-products: runs-on: ubuntu-latest env: SHOPIFY_ACCESS_TOKEN: ${{ secrets.SHOPIFY_ACCESS_TOKEN }} # The secret we set earlier SHOPIFY_CONFIG_HOME: ${{ github.workspace }} # This will tell shopctl to use current dir to look for .shopconfig outputs: has-data: ${{ steps.check.outputs.has_data }} steps: - name: Checkout repo uses: actions/checkout@v3

Apr 25, 2025 - 10:40
 0
How We Built an AI-Powered Automated Product Enrichment Pipeline for Shopify

Engineering a fully automated workflow for a Shopify store

Maintaining a successful e-commerce store comes with its fair share of challenges. It demands constant attention to ever-changing details across inventory, customer experience, and platform updates. With so many moving parts, manual oversight can quickly become overwhelming, error-prone, and time-consuming.

That’s where automation steps in — not just as a convenience but as a necessity to keep your store running efficiently and at scale. While Shopify offers a rich ecosystem of apps and drag-and-drop interfaces, it often requires you to trade transparency and control for convenience.

TL;DR

We’ll build a pipeline using GitHub Actions to export the latest products from the Shopify store, perform some actions using LLM, and update the products.

Full source for the pipeline can be found here.

Taking Back Control

Let the robots worry about the boring stuff!

Sooner or later, you will hit the limits with off-the-shelf apps and manual workflows and start looking for alternatives. One such alternative is to shift away from GUI-centric tools toward programmable pipelines that offer complete flexibility and control. What you want is:

  • Full ownership of your data
  • Enhancements tailored to your brand and products
  • Sharable Workflows: multiple stores could use the same workflow with little to no tweak
  • Confidence in every step of the process

Now, let’s explore how we can build an automated CI pipeline to help mitigate the issues mentioned above. As a proof-of-concept, we’ll create a pipeline to streamline our product-content workflow. The pipeline will use LLM to review the latest products on our store, optimize the title, add SEO title and description, and generate a summary for the team to review.

The Stack

Here’s what powers the workflow:

  • Shopify — where our products live
  • GitHub Actions — for orchestration and automation
  • ShopCTL — A command line utility for Shopify store management
  • OpenAI API — to revise product titles, generate SEO content, and suggestions
  • Python and some Bash scripts — for the enrichment logic and updates

First things first — Setting up the stack

Let’s start by setting up a GitHub Actions workflow. We’ll store pipeline configs in the .github/workflows/ directory. Create a file named enrich-products.yml inside the workflows directory. This file will define jobs for our product-content workflow.

# .github/workflows/enrich-products.yml

name: Shopify Product Enrichment

on:
  workflow_dispatch:

The workflow_dispatch event in GitHub Actions allows you to manually trigger a workflow from the GitHub interface or via the API , or you can schedule it to run automatically at a specific time.

API Keys

We’d need a few API keys to complete our configuration: OPENAI_API_KEY for AI operations and SHOPIFY_ACCESS_TOKEN to communicate with our store.

Get the OpenAI API key from your OpenAI account and set it as a secret in GitHub. Getting a Shopify access token is tricky since you need to create a dummy app to do so. Follow this official guide to get one.

ShopCTL

We’ll use a command-line tool to export and update our products. Let’s create a custom action that we can reuse to reference in our pipeline.

Create a file called setup-shopctl.yml inside actions directory and add the following config.

# .github/workflows/actions/setup-shopctl.yml

name: Setup ShopCTL
description: Installs Go and ShopCTL CLI
runs:
  using: "composite"
  steps:
    - name: Set up Go
      uses: actions/setup-go@v5
      with:
        go-version: "1.24"

    - name: Install ShopCTL
      shell: bash
      run: |
        sudo apt-get update
        sudo apt-get install -y libx11-dev
        go install github.com/ankitpokhrel/shopctl/cmd/shopctl@main
        echo "$HOME/go/bin" >> "$GITHUB_PATH"

Apart from custom actions, we need to add a configuration for the store we’re operating. Create a folder called shopctl on the repo’s root and add the following config in a file called .shopconfig.yml. Replace all occurrences of store1 with your store alias.

# shopctl/.shopcofig.yml

ver: v0
contexts:
    - alias: store1
      store: store1.myshopify.com
currentContext: store1

Finalizing the pipeline

Our pipeline has four stages, viz: Export -> Enrich -> Update -> Notify

Stage 1: Export products

The first step in our pipeline is to export the latest products from our store. Add a job called export-products in the enrich-products.yml file we created earlier.

jobs:
  export-products:
    runs-on: ubuntu-latest
    env:
      SHOPIFY_ACCESS_TOKEN: ${{ secrets.SHOPIFY_ACCESS_TOKEN }} # The secret we set earlier
      SHOPIFY_CONFIG_HOME: ${{ github.workspace }} # This will tell shopctl to use current dir to look for .shopconfig
    outputs:
      has-data: ${{ steps.check.outputs.has_data }}

    steps:
      - name: Checkout repo
        uses: actions/checkout@v3

      - name: Setup ShopCTL
        uses: ./.github/workflows/actions/setup-shopctl

      - name: Export products
        run: |
          mkdir -p data

          # Export latest data (last 7 days) using the shopctl tool as latest_products.tar.gz
          shopctl export -r product="created_at:>=$(date -v -7d +%Y-%m-%d)" -o data/ -n latest_products -vvv

      - name: Check if export has data
        id: check
        run: |
          if [ -s data/latest_products.tar.gz ]; then
            echo "has_data=true" >> "$GITHUB_OUTPUT"
          else
            echo "has_data=false" >> "$GITHUB_OUTPUT"
            echo "No products found to process"
          fi

      - name: Upload exported products
        if: steps.check.outputs.has_data == 'true'
        uses: actions/upload-artifact@v4
        with:
          name: exported-products
          path: data/latest_products.tar.gz

The job above will set up ShopCTL using the custom action we created earlier. It will export all products created in the last 7 days and upload them as artifacts if any new products exist.

Stage 2a: Review catalog

The next we want to do is to review our catalog. We’ll use OpenAI API to review product data samples and identify the following:

  • Issues or inconsistencies in tags, product types, or variants
  • Missing or inconsistent inventory information
  • Gaps in product configuration or variant structure
  • Duplicate or overly similar products
  • General recommendations to improve catalog quality and its completeness
review-catalog:
    needs: export-products
    runs-on: ubuntu-latest
    env:
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

    steps:
      - name: Checkout repo
        uses: actions/checkout@v3

      - name: Download product export
        uses: actions/download-artifact@v4
        with:
          name: exported-products
          path: data/

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Install dependencies
        run: pip install openai

      - name: Run catalog review script
        run: |
          # Assuming your script is saved in scripts/review_catalog.py
          python scripts/review_catalog.py \
            data/latest_products.tar.gz \
            data/review_summary.md

      - name: Upload catalog summary
        uses: actions/upload-artifact@v4
        with:
          name: catalog-review-summary
          path: data/review_summary.md

      - name: Final summary
        run: echo "✅ Shopify product catalog review completed!"

Notice the needs section. We want to run it after products are exported and made available as artifacts. We also need to set up Python, as our review script is written in Python. You can use any language of your choice here. The script generates review_summary.md, which is uploaded as an artificat in the next step (example output below).

## Identified Issues

### 1. Missing or Inconsistent Information:
- Some products have missing or inconsistent `productType` (e.g. `"gid://shopify/Product/8790718087392"`, `"gid://shopify/Product/879071795632

The sample script and the prompt can be found here.

Stage 2b: Enrich Products

Similar to the review-catalog job, add an enrich-products job that will run the script to review the product title and generate an SEO title and description for the product using OpenAI. This job runs in parallel with the review catalog job and generates a CSV with details on metadata to update.

Generated enriched_products.csv file

The sample script and the prompt can be found here.

Stage 3: Update products

Once the metadata is generated in stage 2b, we can update products using ShopCTL. We’ll use a bash script instead of Python at this stage.

Add a job called update-products, as shown below.

update-products:
    needs: enrich-products
    runs-on: ubuntu-latest
    env:
      SHOPIFY_ACCESS_TOKEN: ${{ secrets.SHOPIFY_ACCESS_TOKEN }}
      SHOPIFY_CONFIG_HOME: ${{ github.workspace }}

    steps:
      - name: Checkout repo
        uses: actions/checkout@v3

      - name: Setup ShopCTL
        uses: ./.github/workflows/actions/setup-shopctl

      - name: Download enriched products
        uses: actions/download-artifact@v4
        with:
          name: enriched-products
          path: data/

      - name: Apply updates using shopctl
        run: |
          mkdir -p logs
          touch logs/audit.txt

          while IFS=, read -r pid new_title seo_title seo_desc; do
            # Strip leading/trailing quotes
            seo_desc="${seo_desc%\"}"
            seo_desc="${seo_desc#\"}"

            # Use shopctl to update product details
            if output=$(shopctl product update "$pid" \
                --title "$new_title" \
                --seo-title "$seo_title" \
                --seo-desc "$seo_desc" 2>&1); then
                echo "$pid,success" >> logs/audit.txt
            else
              sanitized_error=$(echo "$output" | tr '\n' ' ' | sed 's/,/ /g')
              echo "$pid,failure,$sanitized_error" >> logs/audit.txt
            fi
          done < <(tail -n +2 data/enriched_products.csv)

        - name: Upload audit log
          uses: actions/upload-artifact@v4
          with:
            name: product-audit-log
            path: logs/audit.txt

        - name: Final summary
          run: echo "✅ Shopify product enrichment and updates completed!"

The job is relatively simple; it uses a bash script to read from the CSV file generated in the previous step, update the product using ShopCTL, and create a log file.

Stage 4: Notify

Now, the only thing remaining is to notify interested parties that the job has been completed (or failed) and what has changed. You can either send a Slack notification or email the details. We will simply fetch and print the logs for the tutorial’s sake.

notify:
    needs: [review-catalog, update-products]
    runs-on: ubuntu-latest

    steps:
      - name: Download audit log
        uses: actions/download-artifact@v4
        with:
          name: product-audit-log
          path: logs/

      - name: Download catalog review
        uses: actions/download-artifact@v4
        with:
          name: catalog-review-summary
          path: data/

      - name: Print audit summary
        run: |
          ls -lah logs/
          ls -lah data/
          echo "