在国产定制化NVIDIA Jetson Orin NX上离线化部署Ollama(GPU)及各类折腾经验

0.1 摘要

在边缘计算设备上部署大模型是一个新鲜事儿,网上教程多,但实战少,多为按照官网文档直接在线安装,本文主要解决以下几个问题:

  1. 根本没有考虑到 特殊的网络环境
  2. 即使考虑到了 问题1 ,也没考虑到实际的边缘计算设备在工业环境下往往是不联网的,顶多在开发时连接网络,部署上后的运维一直存活在内网。
  3. 官方文档仅针对公版设备,国产定制的芯片、驱动、系统均需要特色化,其实际的配置与开发固然不能直接照搬。
  4. 折腾来折腾去,这货真的调用GPU了吗?
  5. CPU也不差,GPU也不强 ,权衡功耗与性能是得多考虑考虑。
  6. 这货的CUDA跟X86 GPU的CUDA到底有什么区别?如何迁移学习

在此,感谢 图为信息科技(深圳)有限公司 对我赞助的 TWOWIN TW-T806 设备。

本文设备的核心为 NVIDIA Jetson Orin NX ,原则上,本文的相关经验可供同核心或同系列开发板参考。

目录

0.2 背景依托

博0阶段横向

  • 工业大模型的定制化开发,满足其 可信性、实时性
  • 大部分工业场景中的边缘计算设备不能连接互联网,完全依靠外部API是不可行的。
  • “未来是小模型的世界”
  • “具身智能”“大小协同”

0.3 更新日志

  • 2025.01.02 开始折腾
  • 2025.01.03 基本实现Ollama的部署,经验证已经全速运行在GPU上
  • 2025.01.04 完成第一版文章
  • 2025.01.04 在一段激烈的开发中,我手上的WOWIN TW-T806壮烈牺牲(系统配崩了),已经联系图为科技公司进行维修......
  • 2025.01.05 联系图为工程师,按照文档开始尝试刷机解决。
  • 2025.01.06 刷机成功,重新配置并同时对文章检错
  • 2025.01.07 增加了Anaconda、OpenWebUi、手动功耗设置等
  • 2025.01.08 优化性能。
  • 2025.02.14 针对概率性无信号问题给出解决方案。

1 便于“配置”的配置

折腾嵌入式Linux设备往往需要多个屏幕、多套键鼠,还需要用U盘互传文件,这很不方便,完全不符合“敏捷开发”的要求,更不符合新时代嵌入式开发的要求。

本小节主要解决拿到机器后便于后续配置的一些 小技巧儿 ,方便人民群众后续的各种配置。

1.1 VNC远程桌面

孔子曰:不会用、不能用SSH或VNC的嵌入式工程师不是一个合格的嵌入式工程师。

VNC是一个局域网内进行远程桌面的服务器/客户端,Todesk也能用,但是传输速度、数据安全都不能保障。局域网内的解决方案才是正道。

1.1.1 Ubuntu端服务器配置

之前配置VNC Server还需要各种库,但是自从 Ubuntu20.04LTS 之后就已经内置VNC Server了,配置方法如下:

在Ubuntu的系统设置中:

在终端中运行以下命令

gsettings set org.gnome.Vino require-encryption false

关闭加密访问验证。

1.1.2 Windows/Mac/Ubuntu客户端配置

同一局域网下 的设备上,下载VNC客户端,安装好后输入IP地址开始连接,并输入设置的访问密码即可。

1.2 安装Todesk

参考Arm64的Linux安装过程

1.3 Git的代理设置

如果你有能力找到国内的Gitee等克隆仓库,就不必设置代理。本小节针对GIthub仓库的克隆过程中网络连接代理问题。

当然,一个较为稳定的网络环境也能够勉强连接Github,本文所涉及的仓库下载量并不大。

设置Socks代理

git config --global http.proxy 'socks5://127.0.0.1:10808'
git config --global https.proxy 'socks5://127.0.0.1:10808'

设置常规Http代理

git config --global http.proxy http://127.0.0.1:10808
git config --global https.proxy http://127.0.0.1:10808

这里建议使用v2ray的 局域网访问端口 ,实测非常好用。

v2ray的使用按照规定不能在此介绍,请大家自行查阅资料。

取消代理

git config --global --unset http.proxy
git config --global --unset https.proxy

1.4 针对jetson-containers的Docker镜像配置

安装请参考 2.4 Docker安装

使用vim编辑/etc/docker/daemon.json文件,如下

{

    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
     "registry-mirrors": [
       "https://hub.icert.top/",
       "https://ghcr.geekery.cn"
   ]
}

然后执行

systemctl daemon-reload
systemctl restart docker

以上代码结合了 Nvidia Jetson的Docker运行时库国内运维开发绿皮书Geekery的Docker镜像站

在此感谢运维开发绿皮书Geekery坚持互联网开源精神,提供的稳定高效的Docker镜像。

1.5 Python全局换源

清华源(本地用户)

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/

请注意区分pip版本、Python版本、不用用户Python的概念,自行选择合适的命令

sudo pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/

sudo pip3 config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/

pip3 config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/

1.6 Jetson SDK性能监控

1.6.1 Tegrastats

这是Jetson自带的性能监控软件,每隔一秒以文字的形式输出当前系统的性能情况。

tegrastats

1.6.2 jtop

安装

sudo -H pip install jetson-stats

使用

sudo jtop

1.6.3 htop

Linux基本的性能监视器

sudo apt-get install htop

运行

htop

1.6.4 Jetson Power GUI

1.7 Ubuntu ARM64 国内源

修改/etc/apt/sources.list

经过测试,清华源的ARM64相关库并不能正常连接,推荐使用阿里源

对于20.04LTS,阿里源如下

deb https://mirrors.aliyun.com/ubuntu-ports/ focal main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu-ports/ focal main restricted universe multiverse

deb https://mirrors.aliyun.com/ubuntu-ports/ focal-security main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu-ports/ focal-security main restricted universe multiverse

deb https://mirrors.aliyun.com/ubuntu-ports/ focal-updates main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu-ports/ focal-updates main restricted universe multiverse

# deb https://mirrors.aliyun.com/ubuntu-ports/ focal-proposed main restricted universe multiverse
# deb-src https://mirrors.aliyun.com/ubuntu-ports/ focal-proposed main restricted universe multiverse

deb https://mirrors.aliyun.com/ubuntu-ports/ focal-backports main restricted universe multiverse
deb-src https://mirrors.aliyun.com/ubuntu-ports/ focal-backports main restricted universe multiverse

2 本不该发生的麻烦事

在此之前,再次感谢 图为信息科技(深圳)有限公司 赞助的 TWOWIN TW-T806

但是还是提出一些小意见:预置的Ubuntu没有对标官方Jetson公版设备的系统,包括

  • Jetson SDK版本低(但能够满足本文需求)
  • 较多常用的软件包并未提前安装好
  • 没有配置国内镜像服务器
  • Root权限管理存在一定的安全性问题
  • Python版本过低(后文有针对性解决)

以上问题很大程度上成为编写本文的导火索,本文也对其中的问题进行了修改。

2.1 jetson-containers的安装与配置

官网默认用Jetson就是用公版设备,公版设备都自带jetson-container,所以没有过多介绍其安装。

具体安装命令如下

git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh

如遇到权限问题,则需要运行以下命令

sudo bash jetson-containers/install.sh

以下小节均代表安装过程中出现的各种错误提示,请按需阅读。

2.1.1 Python问题

主要解决TypeError: unsupported operand type(s) for |: ‘type‘ and ‘type‘错误提示,问题的原因是机器上的Python版本低于3.10,缺少了部分语法特性的支持需要更新Python版本。

2.1.1.1 指定版本源码与必要编译环境

安装依赖

sudo apt update
sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev libsqlite3-dev wget libbz2-dev

从官网下载需要的python3源文件,这里选择了Python3.10.9

wget https://www.python.org/ftp/python/3.10.9/Python-3.10.9.tgz 

解压到Home

tar -zvxf Python-3.10.9.tgz
2.1.1.2 开启pip依赖下载时的SSL认证

前往Python-3.10.9/Modules,找到Setup文件,对以下代码去除注释

# To statically link OpenSSL:
 _ssl _ssl.c \
     -I$(OPENSSL)/include -L$(OPENSSL)/lib \
     -l:libssl.a -Wl,--exclude-libs,libssl.a \
     -l:libcrypto.a -Wl,--exclude-libs,libcrypto.a
_hashlib _hashopenssl.c \
     -I$(OPENSSL)/include -L$(OPENSSL)/lib \
     -l:libcrypto.a -Wl,--exclude-libs,libcrypto.a

2.1.1.3 编译环境中的OpenSSL开发引用环境

上一小节解决了Python编译时的SSL引入,但是其引用的OpenSSL来自编译环境,需要为机器配置OpenSSL编译环境。

进入OpenSSL源码发布站,下载最新版本的源码。

解压(按需修改命令中的文件名)

tar -zxvf openssl-3.4.0.tar.gz

配置并编译

cd openssl-3.4.0
./config
sudo make -j
sudo make install
2.1.1.4 编译Python

在Python源码根目录下开始编译并安装

cd Python-3.10.9/
./configure --enable-optimizations  --with-openssl --with-openssl-rpath=auto
sudo make
sudo make install
2.1.1.5 多个Python的管理与调用

执行以下命令

whereis python

可以看到众多Python版本,假如需要针对某一Python,例如其中的/usr/local/lib/python3.10,可以使用

/usr/local/lib/python3.10 -m 相关命令

例如当前机器的Python软连接被替换为Python3.10,无法使用Python3.8环境中的Jtop,则可以使用以下命令

/usr/bin/python3.8 -m jtop

pip安装同理。

2.2 Jetson SDK版本查看

在jtop中的INFO

暂不推荐升级此SDK。

2.3 Jetson功率问题

建议在开发阶段切换为最高可选瓦数(而非无限高),即数字最大的功率数。

此处选择25W

对于功率的选择,阿木实验室发表《Jetson Orin NX 功耗模式选择:MAXN与25W模式的对比与优化建议》

提出:

MAXN模式虽然提供了最高的性能,但在散热性能有限的情况下,可能导致系统过热,进而触发降频或自动关机。相比之下,25W模式的温度控制更为稳定,适合大多数开发任务。

结论:

如果25W模式可以满足你的应用需求:建议选择25W模式,既能降低功耗,又能保持系统的稳定性。
如果需要极致性能:选择MAXN模式,但要确保散热系统能够充分应对高温,以避免降频或系统保护性关机。

以下是一些常见的应用场景及其对应的功耗模式选择建议:

边缘AI计算:通常选择15W或25W模式,以在保证性能的同时降低功耗。
计算密集型任务:如深度学习推理或图像处理,选择MAXN模式,但要确保环境的散热条件。
移动设备开发:由于功耗和散热的双重限制,建议选择10W或15W模式。

有时系统会抽风,左上角的功率选择工具会莫名其妙消失,此时可以使用命令手动查看、切换功耗模式。

#查询当前模式
sudo nvpmodel -q verbose
#开启最大频率。重启后失效,使用时不需要设置
sudo jetson_clocks
#查询cpu/gpu的cur、min、max频率,cpu在线数量
sudo jetson_clocks --show
#查询所有支持的模式命令,会导出一个文件
sudo nvpmodel -p --verbose
#查看导出的文件
cat /etc/nvpmodel.conf
#修改为工作模式0,重启后保持
sudo nvpmodel -m 0

2.4 Docker安装

curl -s https://get.docker.com | bash

也可以手动将以下代码存为.sh文件,然后执行

#!/bin/sh
set -e
# Docker Engine for Linux installation script.
#
# This script is intended as a convenient way to configure docker's package
# repositories and to install Docker Engine, This script is not recommended
# for production environments. Before running this script, make yourself familiar
# with potential risks and limitations, and refer to the installation manual
# at https://docs.docker.com/engine/install/ for alternative installation methods.
#
# The script:
#
# - Requires `root` or `sudo` privileges to run.
# - Attempts to detect your Linux distribution and version and configure your
#   package management system for you.
# - Doesn't allow you to customize most installation parameters.
# - Installs dependencies and recommendations without asking for confirmation.
# - Installs the latest stable release (by default) of Docker CLI, Docker Engine,
#   Docker Buildx, Docker Compose, containerd, and runc. When using this script
#   to provision a machine, this may result in unexpected major version upgrades
#   of these packages. Always test upgrades in a test environment before
#   deploying to your production systems.
# - Isn't designed to upgrade an existing Docker installation. When using the
#   script to update an existing installation, dependencies may not be updated
#   to the expected version, resulting in outdated versions.
#
# Source code is available at https://github.com/docker/docker-install/
#
# Usage
# ==============================================================================
#
# To install the latest stable versions of Docker CLI, Docker Engine, and their
# dependencies:
#
# 1. download the script
#
#   $ curl -fsSL https://get.docker.com -o install-docker.sh
#
# 2. verify the script's content
#
#   $ cat install-docker.sh
#
# 3. run the script with --dry-run to verify the steps it executes
#
#   $ sh install-docker.sh --dry-run
#
# 4. run the script either as root, or using sudo to perform the installation.
#
#   $ sudo sh install-docker.sh
#
# Command-line options
# ==============================================================================
#
# --version <VERSION>
# Use the --version option to install a specific version, for example:
#
#   $ sudo sh install-docker.sh --version 23.0
#
# --channel <stable|test>
#
# Use the --channel option to install from an alternative installation channel.
# The following example installs the latest versions from the "test" channel,
# which includes pre-releases (alpha, beta, rc):
#
#   $ sudo sh install-docker.sh --channel test
#
# Alternatively, use the script at https://test.docker.com, which uses the test
# channel as default.
#
# --mirror <Aliyun|AzureChinaCloud>
#
# Use the --mirror option to install from a mirror supported by this script.
# Available mirrors are "Aliyun" (https://mirrors.aliyun.com/docker-ce), and
# "AzureChinaCloud" (https://mirror.azure.cn/docker-ce), for example:
#
#   $ sudo sh install-docker.sh --mirror AzureChinaCloud
#
# ==============================================================================


# Git commit from https://github.com/docker/docker-install when
# the script was uploaded (Should only be modified by upload job):
SCRIPT_COMMIT_SHA="4c94a56999e10efcf48c5b8e3f6afea464f9108e"

# strip "v" prefix if present
VERSION="${VERSION#v}"

# The channel to install from:
#   * stable
#   * test
DEFAULT_CHANNEL_VALUE="stable"
if [ -z "$CHANNEL" ]; then
	CHANNEL=$DEFAULT_CHANNEL_VALUE
fi

DEFAULT_DOWNLOAD_URL="https://download.docker.com"
if [ -z "$DOWNLOAD_URL" ]; then
	DOWNLOAD_URL=$DEFAULT_DOWNLOAD_URL
fi

DEFAULT_REPO_FILE="docker-ce.repo"
if [ -z "$REPO_FILE" ]; then
	REPO_FILE="$DEFAULT_REPO_FILE"
fi

mirror=''
DRY_RUN=${DRY_RUN:-}
while [ $# -gt 0 ]; do
	case "$1" in
		--channel)
			CHANNEL="$2"
			shift
			;;
		--dry-run)
			DRY_RUN=1
			;;
		--mirror)
			mirror="$2"
			shift
			;;
		--version)
			VERSION="${2#v}"
			shift
			;;
		--*)
			echo "Illegal option $1"
			;;
	esac
	shift $(( $# > 0 ? 1 : 0 ))
done

case "$mirror" in
	Aliyun)
		DOWNLOAD_URL="https://mirrors.aliyun.com/docker-ce"
		;;
	AzureChinaCloud)
		DOWNLOAD_URL="https://mirror.azure.cn/docker-ce"
		;;
	"")
		;;
	*)
		>&2 echo "unknown mirror '$mirror': use either 'Aliyun', or 'AzureChinaCloud'."
		exit 1
		;;
esac

case "$CHANNEL" in
	stable|test)
		;;
	*)
		>&2 echo "unknown CHANNEL '$CHANNEL': use either stable or test."
		exit 1
		;;
esac

command_exists() {
	command -v "$@" > /dev/null 2>&1
}

# version_gte checks if the version specified in $VERSION is at least the given
# SemVer (Maj.Minor[.Patch]), or CalVer (YY.MM) version.It returns 0 (success)
# if $VERSION is either unset (=latest) or newer or equal than the specified
# version, or returns 1 (fail) otherwise.
#
# examples:
#
# VERSION=23.0
# version_gte 23.0  // 0 (success)
# version_gte 20.10 // 0 (success)
# version_gte 19.03 // 0 (success)
# version_gte 26.1  // 1 (fail)
version_gte() {
	if [ -z "$VERSION" ]; then
			return 0
	fi
	version_compare "$VERSION" "$1"
}

# version_compare compares two version strings (either SemVer (Major.Minor.Path),
# or CalVer (YY.MM) version strings. It returns 0 (success) if version A is newer
# or equal than version B, or 1 (fail) otherwise. Patch releases and pre-release
# (-alpha/-beta) are not taken into account
#
# examples:
#
# version_compare 23.0.0 20.10 // 0 (success)
# version_compare 23.0 20.10   // 0 (success)
# version_compare 20.10 19.03  // 0 (success)
# version_compare 20.10 20.10  // 0 (success)
# version_compare 19.03 20.10  // 1 (fail)
version_compare() (
	set +x

	yy_a="$(echo "$1" | cut -d'.' -f1)"
	yy_b="$(echo "$2" | cut -d'.' -f1)"
	if [ "$yy_a" -lt "$yy_b" ]; then
		return 1
	fi
	if [ "$yy_a" -gt "$yy_b" ]; then
		return 0
	fi
	mm_a="$(echo "$1" | cut -d'.' -f2)"
	mm_b="$(echo "$2" | cut -d'.' -f2)"

	# trim leading zeros to accommodate CalVer
	mm_a="${mm_a#0}"
	mm_b="${mm_b#0}"

	if [ "${mm_a:-0}" -lt "${mm_b:-0}" ]; then
		return 1
	fi

	return 0
)

is_dry_run() {
	if [ -z "$DRY_RUN" ]; then
		return 1
	else
		return 0
	fi
}

is_wsl() {
	case "$(uname -r)" in
	*microsoft* ) true ;; # WSL 2
	*Microsoft* ) true ;; # WSL 1
	* ) false;;
	esac
}

is_darwin() {
	case "$(uname -s)" in
	*darwin* ) true ;;
	*Darwin* ) true ;;
	* ) false;;
	esac
}

deprecation_notice() {
	distro=$1
	distro_version=$2
	echo
	printf "\033[91;1mDEPRECATION WARNING\033[0m\n"
	printf "    This Linux distribution (\033[1m%s %s\033[0m) reached end-of-life and is no longer supported by this script.\n" "$distro" "$distro_version"
	echo   "    No updates or security fixes will be released for this distribution, and users are recommended"
	echo   "    to upgrade to a currently maintained version of $distro."
	echo
	printf   "Press \033[1mCtrl+C\033[0m now to abort this script, or wait for the installation to continue."
	echo
	sleep 10
}

get_distribution() {
	lsb_dist=""
	# Every system that we officially support has /etc/os-release
	if [ -r /etc/os-release ]; then
		lsb_dist="$(. /etc/os-release && echo "$ID")"
	fi
	# Returning an empty string here should be alright since the
	# case statements don't act unless you provide an actual value
	echo "$lsb_dist"
}

echo_docker_as_nonroot() {
	if is_dry_run; then
		return
	fi
	if command_exists docker && [ -e /var/run/docker.sock ]; then
		(
			set -x
			$sh_c 'docker version'
		) || true
	fi

	# intentionally mixed spaces and tabs here -- tabs are stripped by "<<-EOF", spaces are kept in the output
	echo
	echo "================================================================================"
	echo
	if version_gte "20.10"; then
		echo "To run Docker as a non-privileged user, consider setting up the"
		echo "Docker daemon in rootless mode for your user:"
		echo
		echo "    dockerd-rootless-setuptool.sh install"
		echo
		echo "Visit https://docs.docker.com/go/rootless/ to learn about rootless mode."
		echo
	fi
	echo
	echo "To run the Docker daemon as a fully privileged service, but granting non-root"
	echo "users access, refer to https://docs.docker.com/go/daemon-access/"
	echo
	echo "WARNING: Access to the remote API on a privileged Docker daemon is equivalent"
	echo "         to root access on the host. Refer to the 'Docker daemon attack surface'"
	echo "         documentation for details: https://docs.docker.com/go/attack-surface/"
	echo
	echo "================================================================================"
	echo
}

# Check if this is a forked Linux distro
check_forked() {

	# Check for lsb_release command existence, it usually exists in forked distros
	if command_exists lsb_release; then
		# Check if the `-u` option is supported
		set +e
		lsb_release -a -u > /dev/null 2>&1
		lsb_release_exit_code=$?
		set -e

		# Check if the command has exited successfully, it means we're in a forked distro
		if [ "$lsb_release_exit_code" = "0" ]; then
			# Print info about current distro
			cat <<-EOF
			You're using '$lsb_dist' version '$dist_version'.
			EOF

			# Get the upstream release info
			lsb_dist=$(lsb_release -a -u 2>&1 | tr '[:upper:]' '[:lower:]' | grep -E 'id' | cut -d ':' -f 2 | tr -d '[:space:]')
			dist_version=$(lsb_release -a -u 2>&1 | tr '[:upper:]' '[:lower:]' | grep -E 'codename' | cut -d ':' -f 2 | tr -d '[:space:]')

			# Print info about upstream distro
			cat <<-EOF
			Upstream release is '$lsb_dist' version '$dist_version'.
			EOF
		else
			if [ -r /etc/debian_version ] && [ "$lsb_dist" != "ubuntu" ] && [ "$lsb_dist" != "raspbian" ]; then
				if [ "$lsb_dist" = "osmc" ]; then
					# OSMC runs Raspbian
					lsb_dist=raspbian
				else
					# We're Debian and don't even know it!
					lsb_dist=debian
				fi
				dist_version="$(sed 's/\/.*//' /etc/debian_version | sed 's/\..*//')"
				case "$dist_version" in
					12)
						dist_version="bookworm"
					;;
					11)
						dist_version="bullseye"
					;;
					10)
						dist_version="buster"
					;;
					9)
						dist_version="stretch"
					;;
					8)
						dist_version="jessie"
					;;
				esac
			fi
		fi
	fi
}

do_install() {
	echo "# Executing docker install script, commit: $SCRIPT_COMMIT_SHA"

	if command_exists docker; then
		cat >&2 <<-'EOF'
			Warning: the "docker" command appears to already exist on this system.

			If you already have Docker installed, this script can cause trouble, which is
			why we're displaying this warning and provide the opportunity to cancel the
			installation.

			If you installed the current Docker package using this script and are using it
			again to update Docker, you can ignore this message, but be aware that the
			script resets any custom changes in the deb and rpm repo configuration
			files to match the parameters passed to the script.

			You may press Ctrl+C now to abort this script.
		EOF
		( set -x; sleep 20 )
	fi

	user="$(id -un 2>/dev/null || true)"

	sh_c='sh -c'
	if [ "$user" != 'root' ]; then
		if command_exists sudo; then
			sh_c='sudo -E sh -c'
		elif command_exists su; then
			sh_c='su -c'
		else
			cat >&2 <<-'EOF'
			Error: this installer needs the ability to run commands as root.
			We are unable to find either "sudo" or "su" available to make this happen.
			EOF
			exit 1
		fi
	fi

	if is_dry_run; then
		sh_c="echo"
	fi

	# perform some very rudimentary platform detection
	lsb_dist=$( get_distribution )
	lsb_dist="$(echo "$lsb_dist" | tr '[:upper:]' '[:lower:]')"

	if is_wsl; then
		echo
		echo "WSL DETECTED: We recommend using Docker Desktop for Windows."
		echo "Please get Docker Desktop from https://www.docker.com/products/docker-desktop/"
		echo
		cat >&2 <<-'EOF'

			You may press Ctrl+C now to abort this script.
		EOF
		( set -x; sleep 20 )
	fi

	case "$lsb_dist" in

		ubuntu)
			if command_exists lsb_release; then
				dist_version="$(lsb_release --codename | cut -f2)"
			fi
			if [ -z "$dist_version" ] && [ -r /etc/lsb-release ]; then
				dist_version="$(. /etc/lsb-release && echo "$DISTRIB_CODENAME")"
			fi
		;;

		debian|raspbian)
			dist_version="$(sed 's/\/.*//' /etc/debian_version | sed 's/\..*//')"
			case "$dist_version" in
				12)
					dist_version="bookworm"
				;;
				11)
					dist_version="bullseye"
				;;
				10)
					dist_version="buster"
				;;
				9)
					dist_version="stretch"
				;;
				8)
					dist_version="jessie"
				;;
			esac
		;;

		centos|rhel)
			if [ -z "$dist_version" ] && [ -r /etc/os-release ]; then
				dist_version="$(. /etc/os-release && echo "$VERSION_ID")"
			fi
		;;

		*)
			if command_exists lsb_release; then
				dist_version="$(lsb_release --release | cut -f2)"
			fi
			if [ -z "$dist_version" ] && [ -r /etc/os-release ]; then
				dist_version="$(. /etc/os-release && echo "$VERSION_ID")"
			fi
		;;

	esac

	# Check if this is a forked Linux distro
	check_forked

	# Print deprecation warnings for distro versions that recently reached EOL,
	# but may still be commonly used (especially LTS versions).
	case "$lsb_dist.$dist_version" in
		centos.8|centos.7|rhel.7)
			deprecation_notice "$lsb_dist" "$dist_version"
			;;
		debian.buster|debian.stretch|debian.jessie)
			deprecation_notice "$lsb_dist" "$dist_version"
			;;
		raspbian.buster|raspbian.stretch|raspbian.jessie)
			deprecation_notice "$lsb_dist" "$dist_version"
			;;
		ubuntu.bionic|ubuntu.xenial|ubuntu.trusty)
			deprecation_notice "$lsb_dist" "$dist_version"
			;;
		ubuntu.mantic|ubuntu.lunar|ubuntu.kinetic|ubuntu.impish|ubuntu.hirsute|ubuntu.groovy|ubuntu.eoan|ubuntu.disco|ubuntu.cosmic)
			deprecation_notice "$lsb_dist" "$dist_version"
			;;
		fedora.*)
			if [ "$dist_version" -lt 40 ]; then
				deprecation_notice "$lsb_dist" "$dist_version"
			fi
			;;
	esac

	# Run setup for each distro accordingly
	case "$lsb_dist" in
		ubuntu|debian|raspbian)
			pre_reqs="ca-certificates curl"
			apt_repo="deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] $DOWNLOAD_URL/linux/$lsb_dist $dist_version $CHANNEL"
			(
				if ! is_dry_run; then
					set -x
				fi
				$sh_c 'apt-get -qq update >/dev/null'
				$sh_c "DEBIAN_FRONTEND=noninteractive apt-get -y -qq install $pre_reqs >/dev/null"
				$sh_c 'install -m 0755 -d /etc/apt/keyrings'
				$sh_c "curl -fsSL \"$DOWNLOAD_URL/linux/$lsb_dist/gpg\" -o /etc/apt/keyrings/docker.asc"
				$sh_c "chmod a+r /etc/apt/keyrings/docker.asc"
				$sh_c "echo \"$apt_repo\" > /etc/apt/sources.list.d/docker.list"
				$sh_c 'apt-get -qq update >/dev/null'
			)
			pkg_version=""
			if [ -n "$VERSION" ]; then
				if is_dry_run; then
					echo "# WARNING: VERSION pinning is not supported in DRY_RUN"
				else
					# Will work for incomplete versions IE (17.12), but may not actually grab the "latest" if in the test channel
					pkg_pattern="$(echo "$VERSION" | sed 's/-ce-/~ce~.*/g' | sed 's/-/.*/g')"
					search_command="apt-cache madison docker-ce | grep '$pkg_pattern' | head -1 | awk '{\$1=\$1};1' | cut -d' ' -f 3"
					pkg_version="$($sh_c "$search_command")"
					echo "INFO: Searching repository for VERSION '$VERSION'"
					echo "INFO: $search_command"
					if [ -z "$pkg_version" ]; then
						echo
						echo "ERROR: '$VERSION' not found amongst apt-cache madison results"
						echo
						exit 1
					fi
					if version_gte "18.09"; then
							search_command="apt-cache madison docker-ce-cli | grep '$pkg_pattern' | head -1 | awk '{\$1=\$1};1' | cut -d' ' -f 3"
							echo "INFO: $search_command"
							cli_pkg_version="=$($sh_c "$search_command")"
					fi
					pkg_version="=$pkg_version"
				fi
			fi
			(
				pkgs="docker-ce${pkg_version%=}"
				if version_gte "18.09"; then
						# older versions didn't ship the cli and containerd as separate packages
						pkgs="$pkgs docker-ce-cli${cli_pkg_version%=} containerd.io"
				fi
				if version_gte "20.10"; then
						pkgs="$pkgs docker-compose-plugin docker-ce-rootless-extras$pkg_version"
				fi
				if version_gte "23.0"; then
						pkgs="$pkgs docker-buildx-plugin"
				fi
				if ! is_dry_run; then
					set -x
				fi
				$sh_c "DEBIAN_FRONTEND=noninteractive apt-get -y -qq install $pkgs >/dev/null"
			)
			echo_docker_as_nonroot
			exit 0
			;;
		centos|fedora|rhel)
			repo_file_url="$DOWNLOAD_URL/linux/$lsb_dist/$REPO_FILE"
			(
				if ! is_dry_run; then
					set -x
				fi
				if command_exists dnf5; then
					$sh_c "dnf -y -q --setopt=install_weak_deps=False install dnf-plugins-core"
					$sh_c "dnf5 config-manager addrepo --overwrite --save-filename=docker-ce.repo --from-repofile='$repo_file_url'"

					if [ "$CHANNEL" != "stable" ]; then
						$sh_c "dnf5 config-manager setopt \"docker-ce-*.enabled=0\""
						$sh_c "dnf5 config-manager setopt \"docker-ce-$CHANNEL.enabled=1\""
					fi
					$sh_c "dnf makecache"
				elif command_exists dnf; then
					$sh_c "dnf -y -q --setopt=install_weak_deps=False install dnf-plugins-core"
					$sh_c "rm -f /etc/yum.repos.d/docker-ce.repo  /etc/yum.repos.d/docker-ce-staging.repo"
					$sh_c "dnf config-manager --add-repo $repo_file_url"

					if [ "$CHANNEL" != "stable" ]; then
						$sh_c "dnf config-manager --set-disabled \"docker-ce-*\""
						$sh_c "dnf config-manager --set-enabled \"docker-ce-$CHANNEL\""
					fi
					$sh_c "dnf makecache"
				else
					$sh_c "yum -y -q install yum-utils"
					$sh_c "rm -f /etc/yum.repos.d/docker-ce.repo  /etc/yum.repos.d/docker-ce-staging.repo"
					$sh_c "yum-config-manager --add-repo $repo_file_url"

					if [ "$CHANNEL" != "stable" ]; then
						$sh_c "yum-config-manager --disable \"docker-ce-*\""
						$sh_c "yum-config-manager --enable \"docker-ce-$CHANNEL\""
					fi
					$sh_c "yum makecache"
				fi
			)
			pkg_version=""
			if command_exists dnf; then
				pkg_manager="dnf"
				pkg_manager_flags="-y -q --best"
			else
				pkg_manager="yum"
				pkg_manager_flags="-y -q"
			fi
			if [ -n "$VERSION" ]; then
				if is_dry_run; then
					echo "# WARNING: VERSION pinning is not supported in DRY_RUN"
				else
					if [ "$lsb_dist" = "fedora" ]; then
						pkg_suffix="fc$dist_version"
					else
						pkg_suffix="el"
					fi
					pkg_pattern="$(echo "$VERSION" | sed 's/-ce-/\\\\.ce.*/g' | sed 's/-/.*/g').*$pkg_suffix"
					search_command="$pkg_manager list --showduplicates docker-ce | grep '$pkg_pattern' | tail -1 | awk '{print \$2}'"
					pkg_version="$($sh_c "$search_command")"
					echo "INFO: Searching repository for VERSION '$VERSION'"
					echo "INFO: $search_command"
					if [ -z "$pkg_version" ]; then
						echo
						echo "ERROR: '$VERSION' not found amongst $pkg_manager list results"
						echo
						exit 1
					fi
					if version_gte "18.09"; then
						# older versions don't support a cli package
						search_command="$pkg_manager list --showduplicates docker-ce-cli | grep '$pkg_pattern' | tail -1 | awk '{print \$2}'"
						cli_pkg_version="$($sh_c "$search_command" | cut -d':' -f 2)"
					fi
					# Cut out the epoch and prefix with a '-'
					pkg_version="-$(echo "$pkg_version" | cut -d':' -f 2)"
				fi
			fi
			(
				pkgs="docker-ce$pkg_version"
				if version_gte "18.09"; then
					# older versions didn't ship the cli and containerd as separate packages
					if [ -n "$cli_pkg_version" ]; then
						pkgs="$pkgs docker-ce-cli-$cli_pkg_version containerd.io"
					else
						pkgs="$pkgs docker-ce-cli containerd.io"
					fi
				fi
				if version_gte "20.10"; then
					pkgs="$pkgs docker-compose-plugin docker-ce-rootless-extras$pkg_version"
				fi
				if version_gte "23.0"; then
						pkgs="$pkgs docker-buildx-plugin"
				fi
				if ! is_dry_run; then
					set -x
				fi
				$sh_c "$pkg_manager $pkg_manager_flags install $pkgs"
			)
			echo_docker_as_nonroot
			exit 0
			;;
		sles)
			if [ "$(uname -m)" != "s390x" ]; then
				echo "Packages for SLES are currently only available for s390x"
				exit 1
			fi
			repo_file_url="$DOWNLOAD_URL/linux/$lsb_dist/$REPO_FILE"
			pre_reqs="ca-certificates curl libseccomp2 awk"
			(
				if ! is_dry_run; then
					set -x
				fi
				$sh_c "zypper install -y $pre_reqs"
				$sh_c "rm -f /etc/zypp/repos.d/docker-ce-*.repo"
				$sh_c "zypper addrepo $repo_file_url"

				opensuse_factory_url="https://download.opensuse.org/repositories/security:/SELinux/openSUSE_Factory/"
				if ! zypper lr -d | grep -q "${opensuse_factory_url}"; then
					opensuse_repo="${opensuse_factory_url}security:SELinux.repo"
					if ! is_dry_run; then
						cat >&2 <<- EOF
							WARNING!!
							openSUSE repository ($opensuse_repo) will be enabled now.
							Do you wish to continue?
							You may press Ctrl+C now to abort this script.
						EOF
						( set -x; sleep 20 )
					fi
					$sh_c "zypper addrepo $opensuse_repo"
				fi
				$sh_c "zypper --gpg-auto-import-keys refresh"
				$sh_c "zypper lr -d"
			)
			pkg_version=""
			if [ -n "$VERSION" ]; then
				if is_dry_run; then
					echo "# WARNING: VERSION pinning is not supported in DRY_RUN"
				else
					pkg_pattern="$(echo "$VERSION" | sed 's/-ce-/\\\\.ce.*/g' | sed 's/-/.*/g')"
					search_command="zypper search -s --match-exact 'docker-ce' | grep '$pkg_pattern' | tail -1 | awk '{print \$6}'"
					pkg_version="$($sh_c "$search_command")"
					echo "INFO: Searching repository for VERSION '$VERSION'"
					echo "INFO: $search_command"
					if [ -z "$pkg_version" ]; then
						echo
						echo "ERROR: '$VERSION' not found amongst zypper list results"
						echo
						exit 1
					fi
					search_command="zypper search -s --match-exact 'docker-ce-cli' | grep '$pkg_pattern' | tail -1 | awk '{print \$6}'"
					# It's okay for cli_pkg_version to be blank, since older versions don't support a cli package
					cli_pkg_version="$($sh_c "$search_command")"
					pkg_version="-$pkg_version"
				fi
			fi
			(
				pkgs="docker-ce$pkg_version"
				if version_gte "18.09"; then
					if [ -n "$cli_pkg_version" ]; then
						# older versions didn't ship the cli and containerd as separate packages
						pkgs="$pkgs docker-ce-cli-$cli_pkg_version containerd.io"
					else
						pkgs="$pkgs docker-ce-cli containerd.io"
					fi
				fi
				if version_gte "20.10"; then
					pkgs="$pkgs docker-compose-plugin docker-ce-rootless-extras$pkg_version"
				fi
				if version_gte "23.0"; then
						pkgs="$pkgs docker-buildx-plugin"
				fi
				if ! is_dry_run; then
					set -x
				fi
				$sh_c "zypper -q install -y $pkgs"
			)
			echo_docker_as_nonroot
			exit 0
			;;
		*)
			if [ -z "$lsb_dist" ]; then
				if is_darwin; then
					echo
					echo "ERROR: Unsupported operating system 'macOS'"
					echo "Please get Docker Desktop from https://www.docker.com/products/docker-desktop"
					echo
					exit 1
				fi
			fi
			echo
			echo "ERROR: Unsupported distribution '$lsb_dist'"
			echo
			exit 1
			;;
	esac
	exit 1
}

# wrapped up in a function so that we have some protection against only getting
# half the file during "curl | sh"
do_install

然后安装nvidia-container-runtime

sudo apt-get install nvidia-container-runtime

最后按照 1.4 针对jetson-containers的Docker镜像配置 进行配置

3 Ollama调度环境配置

3.1 真的调用GPU了吗?

查看jtop的GPU页,可以查看GPU的利用率。

Jetson架构的内存与GPU显存是共享的, 但算力不共享 。因此若真正调用GPU,大模型在推理时必然出现 GPU高、CPU低 的情况。

依据此可以检查大模型是否真正跑在GPU上。

3.2 Ollama的手动安装(CPU运算)

下载官方最新发布版本

  • ollama-linux-arm64-jetpack5.tgz ollama-linux-arm64-jetpack6.tgz,依据机器中的Jetpack(Jetson SDK)版本选择。
  • ollama-linux-arm64.tgz

安装

sudo tar -C /usr -xzvf ollama-linux-arm64.tgz
sudo tar -C /usr -xzvf ollama-linux-arm64-jetpack6.tgz

使用在此不过多赘述,可以参考我之前的文章对GPU算力服务器裸机的容器共享化架构设想与实践

注意: 这种安装方式并不能调用GPU,原因未知,按照Ollama对调用GPU的计算性能要求,可调用GPU的计算性能指数(Nvidia定义的)需要大于等于5,官网对Jetson核心的计算指数的描述如下 完全满足要求,但经过多次实验,依旧不能完美调用GPU。

3.3 jetson-container优化Docker的Ollama

官网对jetson-container配置专用Ollama有详细的介绍

Docker container for ollama using jetson-containers

下载专用Ollama镜像,并自动创建对应容器( 注意,此容器为临时

jetson-containers run --name ollama $(autotag ollama)

查看已有镜像

sudo docker images

手动新建容器

sudo docker run --runtime nvidia --name 容器名 -itd -p Ollama外漏端口号:11434 --shm-size=8g --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/nvidia/jetson-containers/data:/data -v /etc/localtime:/etc/localtime:ro -v /etc/timezone:/etc/timezone:ro --device /dev/snd -e PULSE_SERVER=unix:/run/user/1000/pulse/native -v /run/user/1000/pulse:/run/user/1000/pulse --device /dev/bus/usb -e DISPLAY=:1 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-3 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-6 --device /dev/i2c-7 --device /dev/i2c-8 --device /dev/i2c-9 -v /run/jtop.sock:/run/jtop.sock -v 主机Ollama模型存放目录:/ollama -e OLLAMA_MODELS=/ollama --restart=always 镜像名:标记 /bin/bash

例如

sudo docker run --runtime nvidia --name ckh -itd -p 11435:11434 --shm-size=8g --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/nvidia/jetson-containers/data:/data -v /etc/localtime:/etc/localtime:ro -v /etc/timezone:/etc/timezone:ro --device /dev/snd -e PULSE_SERVER=unix:/run/user/1000/pulse/native -v /run/user/1000/pulse:/run/user/1000/pulse --device /dev/bus/usb -e DISPLAY=:1 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-3 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-6 --device /dev/i2c-7 --device /dev/i2c-8 --device /dev/i2c-9 -v /run/jtop.sock:/run/jtop.sock -v ~/ollama:/ollama -e OLLAMA_MODELS=/ollama --restart=always dustynv/ollama:r35.4.1

进入容器

sudo docker exec -it 容器名 /bin/bash

手动创建的容器有概率无法自动运行Ollama服务,需要进入容器后执行

ollama serve

记下Key值后关闭此终端,在新的终端重新进入容器,即可使用Ollama服务。

查看正在运行的容器

sudo docker ps

删除容器

sudo docker rm -f 容器名

删除镜像

sudo docker rmi -f 镜像名

4 一件令人感到悲伤的事情

在我激烈的配置过程中,我突然有了一计,我在没有任何必要的情况下重启了机器。

我知道这没有必要,但我还是做了,我的大脑控制不住我伸向机器电源的手。

感谢我的小杨同志给我精神上的支持。

4.1 刷机过程中的小点

图为官方的刷机仅支持线刷,从官网下载固件刷入,经过实际操作,只有 Ubuntu20.04LTS 才能完美刷机,高于、低于此版本均不能完美刷机。

所有版本在Nvidia Jeston官网

具体的刷机过程请参考各厂商文档,此处不再赘述。

请注意JetPack版本的对应关系,版本过高、过低 都会导致这类定制Jetson无法正常刷入。

5 常用环境管理以及大模型快速测试

5.1 安装Anaconda

Anaconda官网下载Arm64版本Anaconda。

运行以下命令

bash ./安装包

按照提示一路回车或者yes即可。其中init选择no,其余确认全部选择yes。

5.1.1 将Anaconda添加到环境变量中

编辑当前用户环境变量

sudo gedit ~/.bashrc

在打开的编辑器中的最后加入

#conda env 
export PATH="/home/你的用户名/archiconda3/bin:"$PATH

保存后刷新bash

source ~/.bashrc

测试

conda -V

5.2 安装Open-WebUI

这是一个快捷的大模型WebUI,支持Ollama。

首先创建Python环境

conda create -n ckh python=3.11

激活创建的Python环境

conda activate ckh

安装Open-WebUI

pip install --upgrade pip
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
sudo apt-get install cmake
sudo apt-get install aptitude
sudo aptitude install clang
pip install open-webui

运行

open-webui serve

按照提示打开对应端口网址即可。

对于其具体使用本文暂不赘述。

6 边缘计算大模型运行的性能分析

Jetson的CPU、GPU的Memory是共享的,即显存与内存共享,当模型占用内存过大,将无法完全调用GPU。

推测原因为JetPack SDK的运行本身也需要占用一定资源,模型本身过高的内存占用导致SDK无法正常运行。

使用命令ollama ps可以查看模型的运算占比。 可以看出,3B模型完全运行在GPU上;7B模型的运算15%在CPU上,85%在GPU上。

经验得出,Qwen2.5的7B模型占用内存大约为7GB(模型参数量每十亿约经验占用1.5GB内存,此处应该发生了部分量化),已经到达调用GPU的极限,并且部分运算使用CPU分担。

因此, 本核心(8GB版本且关闭图形界面)能够GPU运行的模型参数量最大约为7B ,能够保证 一定响应性 的模型参数量大约为9B。

以下是Qwen2.5 7B模型的实际运行速度(15%/85% CPU/GPU)

以下是Qwen 4B模型的实际运行速度(100%GPU)

以下是Qwen2.5 7B模型的实际运行速度(100%GPU)

7 项目交付与资源节约

7.1 开启SSH

请看之前的文章开启SSH

7.2 关闭图形界面

sudo systemctl set-default multi-user.target
sudo reboot

关闭后可以100%GPU运行7B模型

7.3 开启图形界面

sudo systemctl set-default graphical.target
sudo reboot

8 概率性问题的解决方案

8.1 HDMI无信号输出

原因推测:安装的远程桌面软件影响了图形化桌面管理环境的启动。

插入刷机线,在主机上进行ssh连接

ssh  用户名@192.168.55.1

其中的IP地址是固定的。

登录后输入以下命令强制启动图形化界面

sudo systemctl restart gdm3.service 

以上为临时解决方案,以下有两个较为彻底的解决方案可供选择:

  • 卸载远程桌面软件
  • 重装GDM
    sudo apt purge gdm3
    sudo apt install gdm3
    

再次感谢图为科技技术支持对故障排除的支持!


在国产定制化NVIDIA Jetson Orin NX上离线化部署Ollama(GPU)及各类折腾经验

http://blog.ckh-cn.site/index.php/2025/01/04/79.html

作者

CKH

发布时间

2025-01-04

许可协议

CC BY 4.0

OS: Windows NT 10_0_4_9 6.3 build 9600 (Windows Server 2012 R2 Datacenter Edition) AMD64
CPU Info: Name Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz
Memory Info: TotalPhysicalMemory 2146938880
评论