1. Reproducing the Problem
(1) Environment
    FROM nvidia/cuda:12.2.2-cudnn8-devel-ubuntu22.04

    # Copy local dependencies (TensorRT tarball, cuDNN headers and libraries) into the image
    COPY ["dependence/*", "/root"]
    COPY ["dependence/cudnn-linux-x86_64-8.8.1.3_cuda12-archive/include/*", "/usr/local/cuda/include/"]
    COPY ["dependence/cudnn-linux-x86_64-8.8.1.3_cuda12-archive/lib/*", "/usr/local/cuda/lib64/"]

    # System packages, TensorRT, locale, and the hoshinory user.
    # The environment variables are appended to /etc/profile, which only login shells read.
    RUN apt update && \
        apt install -y python3-pip net-tools \
        less curl iputils-ping telnet nmon zip cron gcc \
        language-pack-zh-hans sudo \
        ssh dos2unix tmux \
        gawk htop libgl1-mesa-glx zsh vim git && \
        pip install jupyter -i http://pypi.douban.com/simple --trusted-host pypi.douban.com && \
        ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && \
        tar -zvxf /root/TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-12.0.tar.gz -C /usr/local/ && \
        rm /bin/sh && \
        ln -s /bin/bash /bin/sh && \
        echo export LC_ALL=zh_CN.utf8 >>/etc/profile && \
        echo export LANG=zh_CN.utf8 >>/etc/profile && \
        echo export EDITOR=vim >>/etc/profile && \
        echo export PATH=$PATH:/usr/local/cuda/bin >>/etc/profile && \
        echo export CUDA_HOME=$CUDA_HOME:/usr/local/cuda >>/etc/profile && \
        echo export LD_LIBRARY_PATH=/usr/local/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/lib:$LD_LIBRARY_PATH >>/etc/profile && \
        echo export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH >>/etc/profile && \
        echo export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH >>/etc/profile && \
        echo SHELL=/bin/bash >> /etc/profile && \
        echo alias ll=\'ls -al\' >> /etc/profile && \
        echo alias ls=\'ls -a --color\' >> /etc/profile && \
        echo source /etc/bash.bashrc >> /etc/profile && \
        sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config && \
        useradd -ms /bin/bash hoshinory && \
        echo "hoshinory:hoshinory" | chpasswd && \
        groupadd -g 4000 policy && \
        usermod -aG policy hoshinory && \
        usermod -u 997 hoshinory && \
        groupmod -g 997 hoshinory && \
        echo "hoshinory ALL=NOPASSWD:ALL" >>/etc/sudoers

    USER hoshinory
    COPY ["dependence/.bashrc", "/home/hoshinory/"]
    COPY ["dependence/.vimrc", "/home/hoshinory/"]
    COPY ["dependence/start.sh", "/home/hoshinory/"]

    # Install JupyterLab tooling and prepend environment settings to the generated config
    RUN pip install jupyter -i http://pypi.douban.com/simple --trusted-host pypi.douban.com && \
        pip install jupyterlab-lsp -i http://pypi.douban.com/simple --trusted-host pypi.douban.com && \
        pip install python-lsp-server[all] -i http://pypi.douban.com/simple --trusted-host pypi.douban.com && \
        jupyter lab --generate-config && \
        sed -i '2i\import os' /home/hoshinory/.jupyter/jupyter_lab_config.py && \
        sed -i '3i\os.environ["LD_LIBRARY_PATH"] = os.environ.get("LD_LIBRARY_PATH", "") + ":/usr/local/cuda/lib64"' /home/hoshinory/.jupyter/jupyter_lab_config.py && \
        sed -i '4i\os.environ["LD_LIBRARY_PATH"] = os.environ.get("LD_LIBRARY_PATH", "") + ":/usr/local/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/lib"' /home/hoshinory/.jupyter/jupyter_lab_config.py && \
        sed -i '5i\os.environ["LD_LIBRARY_PATH"] = os.environ.get("LD_LIBRARY_PATH", "") + ":/usr/local/cuda/extras/CUPTI/lib64"' /home/hoshinory/.jupyter/jupyter_lab_config.py && \
        sed -i '6i\os.environ["CUDA_HOME"] = os.environ.get("CUDA_HOME", "") + ":/usr/local/cuda:/usr/local/cuda"' /home/hoshinory/.jupyter/jupyter_lab_config.py && \
        sed -i '7i\os.environ["PATH"] = os.environ.get("PATH", "") + ":/usr/local/cuda/bin"' /home/hoshinory/.jupyter/jupyter_lab_config.py

    EXPOSE 8888

    # Jupyter Lab is the container's main process
    CMD jupyter lab --notebook-dir=/home/hoshinory \
        --ip=0.0.0.0 \
        --no-browser \
        --allow-root \
        --port=8888 \
        --config=/home/hoshinory/.jupyter/jupyter_lab_config.py \
        --NotebookApp.token='' \
        --NotebookApp.password='' \
        --NotebookApp.allow_origin='*' \
        --NotebookApp.base_url=/ \
        --NotebookApp.iopub_data_rate_limit=1e10 \
        --FileContentsManager.delete_to_trash=False
The main problem is that environment variables do not take effect in
the notebook, which prevents me from developing GPU code.
There is an infinite-loop bug in prophet. The details are as
follows:
All programs run in a Linux container built with
Docker.
A running container must keep its main process alive. The main
process cannot be changed once the container has started, and the
container exits if the main process dies.
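To make the idea of a main process concrete, here is a minimal sketch that reads the command line of PID 1 from /proc. Inside a container, PID 1 is the main process. It assumes a Linux system with /proc mounted; the helper name process_cmdline is hypothetical and not part of Docker or prophet.

```python
import os
from pathlib import Path
from typing import List


def process_cmdline(pid: int) -> List[str]:
    """Return the argument list of a process, or [] if it cannot be read."""
    try:
        raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    except OSError:
        # No such process, no /proc on this system, or no permission.
        return []
    # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


# Inside the container, PID 1 is the main process (here, presumably Jupyter).
print("main process:", process_cmdline(1))
print("this process:", process_cmdline(os.getpid()))
```

Run from a notebook cell or a terminal inside the container, this shows which command the container is keeping alive.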
On the prophet platform, the notebook runs as the main process of the
container. I do not know which config file is loaded when the notebook
starts (there may be none, but let us assume there is one).
What's worse, the config file that is loaded does not include the
environment variables we need.
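One way to confirm which variables are missing is to inspect the environment from inside a notebook cell. The sketch below is only an illustration: the REQUIRED table is an assumption built from the paths in the Dockerfile above, and missing_entries is a hypothetical helper name.

```python
import os
from typing import Dict, List, Mapping

# Assumed requirements, copied from the paths used in the Dockerfile.
REQUIRED: Dict[str, List[str]] = {
    "LD_LIBRARY_PATH": [
        "/usr/local/cuda/lib64",
        "/usr/local/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/lib",
        "/usr/local/cuda/extras/CUPTI/lib64",
    ],
    "PATH": ["/usr/local/cuda/bin"],
}


def missing_entries(
    environ: Mapping[str, str],
    required: Mapping[str, List[str]] = REQUIRED,
) -> Dict[str, List[str]]:
    """Return, per variable, the required path entries that are absent."""
    missing: Dict[str, List[str]] = {}
    for var, entries in required.items():
        # Colon-separated search paths; ignore empty components.
        present = [p for p in environ.get(var, "").split(":") if p]
        absent = [e for e in entries if e not in present]
        if absent:
            missing[var] = absent
    return missing


# An empty dict means the process sees every required entry.
print(missing_entries(os.environ))
```

Separately, running `jupyter --paths` in a terminal lists the directories Jupyter searches for config files, which may help narrow down which config file was loaded.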
I searched Google for an answer, and something interesting happened.
>(1) I must source ~/.bashrc
first and then restart the
notebook process
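The restart is needed because a process receives a copy of its environment when it starts, and later changes made elsewhere do not reach it. The following minimal sketch illustrates this with ordinary Python subprocesses. DEMO_VAR and env_snapshot_demo are hypothetical names chosen for the demonstration, and the child processes merely stand in for the notebook.

```python
import os
import subprocess
import sys

# Child program: wait for a line on stdin, then print the DEMO_VAR it sees.
CHILD_CODE = "import os, sys; sys.stdin.readline(); print(os.environ.get('DEMO_VAR'))"


def env_snapshot_demo():
    """Return (value seen by an already-running child, value seen by a new child)."""
    os.environ["DEMO_VAR"] = "old"
    # This child plays the role of the notebook that is already running.
    running = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Changing the variable now is like sourcing ~/.bashrc after startup.
    os.environ["DEMO_VAR"] = "new"
    seen_by_running, _ = running.communicate("go\n")
    # A freshly started child plays the role of the restarted notebook.
    restarted = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input="go\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return seen_by_running.strip(), restarted.stdout.strip()


print(env_snapshot_demo())
```

The already-running child keeps the value it started with, while the newly started one picks up the change.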
To restart the notebook process, I must first kill it.
If I kill the notebook process (remember, it is the main process),
the container exits and restarts. When the notebook restarts
automatically, it loads a config file that nobody can identify for sure.
If I change the main process to something other than the notebook,
the container cannot run.