First Look at NVIDIA Jetson Orin Nano Super - The Most Affordable Generative AI Supercomputer
Table of Contents
- Comparing Jetson Orin Nano Vs Jetson Orin Nano Super
- How powerful is NVIDIA Jetson Orin Nano Super?
- Generative AI Capabilities
- Large Language Models (LLMs)
- Vision Language Models (VLMs)
- Vision Transformers
- I/O and Connectivity
- Developer-Friendly Features
- AI Development Tools
- Getting Started with Jetson Orin Nano Super
- Prerequisites
- Software
- Preparing Your Jetson Orin Nano
- Installation Steps
- Step 1: Verify L4T Version
- Step 2: Keep apt up to date
- Step 3: Install JetPack
- Step 4: Add users
- Step 5: Install jetson-examples
- Step 6: Reboot system
- Step 7: Install Ollama
- Step 8: Run a model
- Step 9: Install models from Ollama Library
- Step 10: Install and run Open WebUI through Docker
- Using GPU
- Using CPU only
- Conclusion
- References
NVIDIA has just reinvented edge computing with its latest offering - the Jetson Orin Nano Super Developer Kit. This isn't just an incremental update; it's a significant leap forward in bringing generative AI capabilities to the edge at an unprecedented price point of $249.
Comparing Jetson Orin Nano Vs Jetson Orin Nano Super
The NVIDIA Jetson Orin Nano Super Developer Kit is a compact, yet powerful computer that redefines generative AI for small edge devices.
It delivers up to 67 TOPS of AI performance—a 1.7X improvement over its predecessor—to seamlessly run a wide variety of generative AI models, like vision transformers, large language models, vision-language models, and more.
It provides developers, students, and makers with the most affordable and accessible platform, backed by the NVIDIA AI software stack and a broad AI software ecosystem, to democratize generative AI at the edge. Existing Jetson Orin Nano Developer Kit users can experience this performance boost with just a software upgrade, so everyone can now unlock new possibilities with generative AI.
Let's dive into the specs and see how it compares to its predecessors:
Feature | Orin Nano Original | Orin Nano Super | Improvement |
---|---|---|---|
GPU Architecture | NVIDIA Ampere (1024 CUDA cores, 32 Tensor cores) @ 635 MHz | NVIDIA Ampere (1024 CUDA cores, 32 Tensor cores) @ 1020 MHz | 1.6x GPU Clock |
AI Performance | 40 TOPS (Sparse) / 20 TOPS (Dense) | 67 TOPS (Sparse) / 33 TOPS (Dense) | 1.7x AI Performance |
CPU | 6-core Arm Cortex-A78AE @ 1.5 GHz | 6-core Arm Cortex-A78AE @ 1.7 GHz | 1.13x CPU Clock |
Memory | 8GB 128-bit LPDDR5 @ 68 GB/s | 8GB 128-bit LPDDR5 @ 102 GB/s | 1.5x Memory Bandwidth |
Module Power | 7W/15W | 7W/15W/25W | Additional Power Mode |
How powerful is NVIDIA Jetson Orin Nano Super?
The most striking aspect of the Super variant is its performance improvements:
- 1.7x increase in AI compute performance (67 TOPS vs 40 TOPS)
- 1.5x increase in memory bandwidth (102 GB/s vs 68 GB/s)
- Higher GPU and CPU clock speeds for better overall performance
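Note that reaching the full 67 TOPS requires running the board in its highest power mode. As a minimal sketch (the exact index of the 25W/MAXN mode depends on your JetPack release; check /etc/nvpmodel.conf), you can inspect and switch power modes with nvpmodel:
# Show the currently active power mode
sudo nvpmodel -q
# Switch to mode 0 (typically the highest-power MAXN mode; verify the index in /etc/nvpmodel.conf)
sudo nvpmodel -m 0
# Optionally pin clocks to their maximums for benchmarking
sudo jetson_clocks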
Generative AI Capabilities
The NVIDIA Jetson™ platform runs the NVIDIA AI software stack, with a variety of available use-case-specific application frameworks. These include NVIDIA Isaac™ for robotics, NVIDIA Metropolis for vision AI, and NVIDIA Holoscan for sensor processing. You can save significant time with NVIDIA Omniverse™ Replicator for synthetic data generation (SDG) and NVIDIA TAO Toolkit for fine-tuning pretrained AI models from the NVIDIA® NGC™ catalog.
One of the most impressive aspects of the Orin Nano Super is its ability to run various types of generative AI models:
Large Language Models (LLMs):
Model | Performance Gain |
---|---|
Llama-3.1 8B | 1.37x |
Llama 3.2 3B | 1.55x |
Qwen2.5 7B | 1.53x |
Gemma2 2B | 1.63x |
Gemma2 9B | 1.28x |
Phi 3.5 3B | 1.54x |
SmoLLM2 1.7B | 1.57x |
Vision Language Models (VLMs):
Model | Performance Gain |
---|---|
VILA 1.5 3B | 1.51x |
VILA 1.5 8B | 1.45x |
LLAVA 1.6 7B | 1.36x |
Qwen2-VL-2B | 1.57x |
InternVL2.5-4B | 2.04x |
PaliGemma2-3B | 1.58x |
SmoLVLM-2B | 1.59x |
Vision Transformers
Model | Performance Gain |
---|---|
clip-vit-base-patch32 | 1.60x |
clip-vit-base-patch16 | 1.69x |
DINOv2-base-patch14 | 1.68x |
SAM2 base | 1.43x |
Grounding-DINO | 1.52x |
vit-base-patch16-224 | 1.61x |
vit-base-patch32-224 | 1.60x |
I/O and Connectivity
Interface | Specification |
---|---|
Camera | 2x MIPI CSI-2 22-pin Camera Connectors |
PCIe | M.2 Key M x4 PCIe Gen 3 |
Additional PCIe | M.2 Key M x2 PCIe Gen3 |
Expansion | M.2 Key E PCIe (x1), USB 2.0, UART, I2S, and I2C |
USB | 4x USB 3.2 Gen2 Type A + 1x Type C for Debug |
Network | 1x GbE Connector |
Display | DisplayPort 1.2 (+MST) |
Storage | microSD slot (UHS-1 cards up to SDR104 mode) |
GPIO | 40-Pin Expansion Header |
Developer-Friendly Features
Feature | Description |
---|---|
Software Stack | Full support for TensorRT-LLM |
Framework Compatibility | Native compatibility with popular frameworks |
Jetson Ecosystem | Jetson Software Stack & Microservices support |
Deployment | Pre-built containers for rapid deployment |
AI Development Tools
Tool | Description |
---|---|
TensorRT Optimization | Optimized inference using TensorRT |
Quantization Support | INT8/FP16 quantization support |
Multi-Model Inference | Ability to run multiple models simultaneously |
Containerization | Docker container support for easy deployment |
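As a hedged illustration of the container workflow, you can pull one of NVIDIA's pre-built L4T base images from the NGC catalog and start an interactive, GPU-enabled session (the image tag shown is illustrative; match it to your L4T release):
# Start an interactive container with the NVIDIA runtime enabled (tag is illustrative)
sudo docker run --runtime nvidia -it --rm nvcr.io/nvidia/l4t-base:r36.2.0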
Getting Started with Jetson Orin Nano Super
This guide will walk you through setting up Ollama on your Jetson device, integrating it with Open WebUI, and configuring the system for optimal GPU utilization. Whether you're a developer or an AI enthusiast, this setup allows you to harness the full potential of LLMs right on your Jetson device.
Prerequisites
- Jetson Orin Nano
- A DC power supply
- 64GB or 128GB microSD card
- WiFi Adapter
- Wireless Keyboard
- Wireless mouse
Software
- Download Jetson SD card image
- Raspberry Pi Imager / Etcher installed on your local system
Download the Jetson SDK from the official NVIDIA Developer site.
Preparing Your Jetson Orin Nano
- Unzip the SD card image
- Insert SD card into your system.
- Launch the Raspberry Pi Imager tool to flash the image onto the SD card
Installation Steps
- Before you begin, ensure that you have JetPack 6.0 installed on your Jetson Orin Nano device. You can download the SDK Manager on a remote Windows or Linux machine and follow the tutorial from the official NVIDIA Developer site.
Step 1. Verify L4T Version
To check the L4T (Linux for Tegra) version on your NVIDIA Jetson device, run the following command:
head -n 1 /etc/nv_tegra_release
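On a device flashed with JetPack 6.0 (L4T 36.3.0), the output looks roughly like this (build ID and date will vary):
# R36 (release), REVISION: 3.0, GCID: <build id>, BOARD: generic, EABI: aarch64, DATE: <build date>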
The supported L4T versions are:
- 35.3.1
- 35.4.1
- 35.5.0
- 36.3.0
If your L4T version does not match any of the supported versions listed above, you will need to re-flash the device, typically using SDK Manager on another computer. You can download SDK Manager and follow the tutorial from the official NVIDIA Developer site.
Step 2. Keep apt up to date:
sudo apt update && sudo apt upgrade
Step 3. Install JetPack:
sudo apt install nvidia-jetpack
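To confirm the installation succeeded, you can query apt for the meta-package details:
# Print the installed JetPack meta-package information, including the JetPack version
apt show nvidia-jetpack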
Step 4. Add users
Add your user to the docker group and restart the Docker service to apply the change:
sudo usermod -aG docker $USER && \
newgrp docker && \
sudo systemctl daemon-reload && \
sudo systemctl restart docker
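A quick smoke test to verify that your user can reach the Docker daemon without sudo:
# Should download and run the test image, printing "Hello from Docker!"
docker run --rm hello-world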
Step 5. Install jetson-examples:
pip3 install jetson-examples
Step 6. Reboot system
sudo reboot
Step 7. Install Ollama
reComputer run ollama
Optional: If you run the above command via ssh and encounter the error command not found: reComputer, you can resolve this by executing the following command:
source ~/.profile
Step 8. Run a model
One of the smallest models you can run is TinyLlama, a compact 1.1-billion-parameter model built on the Llama 2 architecture. Despite its reduced size, TinyLlama demonstrates remarkable performance across various tasks, making it suitable for applications with limited computational resources. You can access TinyLlama through its GitHub repository or via Hugging Face.
Let's run the tinyllama model and perform tasks like generating Python code:
ollama run tinyllama
>>> Can you write a Python script to calculate the factorial of a number?
Sure! Here’s the code:
def factorial(n):
    if n == 0 or n == 1:
        return 1
    else:
        return n * factorial(n - 1)

num = int(input("Enter a number: "))
print(f"The factorial of {num} is {factorial(num)}")
Step 9. Install models (e.g. llama3.2) from Ollama Library
ollama pull llama3.2
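You can confirm the download with ollama list, which prints every locally installed model:
# llama3.2 should appear in the output alongside tinyllama
ollama list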
Step 10. Install and run Open WebUI through Docker
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
Once the installation is finished, you can access the GUI by visiting YOUR_SERVER_IP:3000 in your browser.
Access the API endpoints by navigating to YOUR_SERVER_IP:3000/ollama/docs#/. For comprehensive documentation, please refer to the official resources: the Ollama API Documentation (recommended) and Open WebUI API Endpoints.
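As a quick hedged example of calling the Ollama API directly (this assumes the llama3.2 model pulled in Step 9):
# Request a single non-streamed completion from the local Ollama server
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'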
Using GPU
Alternatively, this installation method uses a single container image that bundles Open WebUI with Ollama, allowing for a streamlined setup via a single command. Choose the appropriate command based on your hardware setup:
sudo docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
Using CPU only
If you're not using a GPU, use this command instead:
sudo docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
Both commands facilitate a built-in, hassle-free installation of both Open WebUI and Ollama, ensuring that you can get everything up and running swiftly.
Once configured, Open WebUI can be accessed at http://localhost:3000, while Ollama operates at http://localhost:11434. This setup provides a seamless and GPU-accelerated environment for running and managing LLMs locally on NVIDIA Jetson devices.
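A quick way to verify both services from the device itself:
# Ollama's root endpoint replies with "Ollama is running"
curl http://localhost:11434
# Open WebUI should answer with an HTTP 200 status
curl -I http://localhost:3000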
Conclusion
The Jetson Orin Nano Super Developer Kit represents a significant milestone in edge AI computing. It brings datacenter-class AI capabilities to the edge at an unprecedented price point, making it an ideal platform for developers, researchers, and businesses looking to deploy advanced AI applications at the edge.
The combination of increased AI performance, enhanced memory bandwidth, and broad model support makes it a compelling choice for anyone serious about edge AI development. At $249, it's not just a product - it's a revolution in accessible AI computing.