
[Deep Learning] Environment Setup: Setting Up TensorFlow 2.0

程序员文章站 2024-03-11 09:48:43

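Preface:
I originally wanted to set up a local environment, but the installation requires CUDA 10.0 while I have CUDA 10.1 installed, so the versions do not match. I therefore decided to install Docker and run TensorFlow in a container instead.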

Flowchart: install Docker -> install nvidia-docker -> pull TensorFlow

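1. Install Docker

See the official Docker installation tutorial.

2. Install nvidia-docker

To be added.

3. Install TensorFlow

Check whether the current user's groups include docker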

$ groups
li adm cdrom sudo dip plugdev lpadmin sambashare docker
// The last entry is docker, so docker commands can be run directly without sudo.
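
The same group check can be scripted. The following is a minimal sketch, assuming a Linux host; the helper names `group_names` and `in_docker_group` are hypothetical and only illustrate the idea.

```python
import grp
import os


def group_names(gids):
    """Map numeric group IDs to group names, skipping IDs with no entry."""
    names = []
    for gid in gids:
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # This GID has no entry in the group database; ignore it.
            continue
    return names


def in_docker_group(names):
    """Return True if 'docker' is among the given group names."""
    return "docker" in names


if __name__ == "__main__":
    current = group_names(os.getgroups())
    print(current)
    print("member of docker group:", in_docker_group(current))
```

If the result is False, docker commands will usually need sudo until the user is added to the docker group and logs in again.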

Verify Docker

$ docker run hello-world

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/
For more examples and ideas, visit:
 https://docs.docker.com/get-started/
// Installation succeeded
$ docker --version
Docker version 18.09.5, build e8ff056

Verify nvidia-docker

$ docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi
Unable to find image 'nvidia/cuda:10.1-base' locally
10.1-base: Pulling from nvidia/cuda
898c46f3b1a1: Already exists 
63366dfa0a50: Already exists 
041d4cd74a92: Already exists 
6e1bee0f8701: Already exists 
131dbe7c254d: Pull complete 
5bca6b05dcd6: Pull complete 
0d286a7b6e12: Pull complete 
Digest: sha256:6ddf907e77f4b53ac8b0b8ce9fa9cd43ffb6882f1ad0f2d41ca996f154f17c7b
Status: Downloaded newer image for nvidia/cuda:10.1-base
Mon Apr 22 13:21:37 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:65:00.0  On |                  N/A |
| 31%   30C    P8    22W / 260W |     84MiB / 10986MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
$ docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi
Unable to find image 'nvidia/cuda:10.0-base' locally
10.0-base: Pulling from nvidia/cuda
898c46f3b1a1: Already exists 
63366dfa0a50: Already exists 
041d4cd74a92: Already exists 
6e1bee0f8701: Already exists 
112097260ef3: Pull complete 
30a67c795176: Pull complete 
0d286a7b6e12: Pull complete 
Digest: sha256:faac85a7d28e086173915df6456784778c4dacb429ff067def0c4a12671240e8
Status: Downloaded newer image for nvidia/cuda:10.0-base
Mon Apr 22 13:22:09 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:65:00.0  On |                  N/A |
| 31%   30C    P8    22W / 260W |     84MiB / 10986MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
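
Both containers report "CUDA Version: 10.1" even though one image is 10.0-base. The value in the nvidia-smi header is the highest CUDA version the host driver supports, not the toolkit version inside the image. As a rough sketch, the header line can be parsed like this; the function name `parse_smi_header` and the regular expression are my own assumptions, written against the header format shown above.

```python
import re

# Pattern for the nvidia-smi header line as it appears in the output above.
HEADER_RE = re.compile(
    r"NVIDIA-SMI\s+(?P<smi>[\d.]+)\s+"
    r"Driver Version:\s+(?P<driver>[\d.]+)\s+"
    r"CUDA Version:\s+(?P<cuda>[\d.]+)"
)


def parse_smi_header(text):
    """Return the nvidia-smi, driver, and CUDA versions, or None if absent."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.groupdict()


sample = (
    "| NVIDIA-SMI 418.56       Driver Version: 418.56"
    "       CUDA Version: 10.1     |"
)
info = parse_smi_header(sample)
print(info)
```

Comparing the "cuda" field with the CUDA version an image needs gives a quick check that the host driver is new enough for that image.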

Install TensorFlow

 docker pull tensorflow/tensorflow:2.0.0a0-gpu-py3         // pull command
$ docker pull tensorflow/tensorflow:2.0.0a0-gpu-py3
2.0.0a0-gpu-py3: Pulling from tensorflow/tensorflow
7b722c1070cd: Pull complete 
5fbf74db61f1: Pull complete 
ed41cb72e5c9: Pull complete 
7ea47a67709e: Pull complete 
53d00018d593: Pull complete 
d452561571e2: Pull complete 
741421562e36: Pull complete 
cf5a5f77591f: Pull complete 
8e44471d34e9: Pull complete 
95409a313744: Pull complete 
3ca5dc868f92: Pull complete 
a1c783d09ef0: Pull complete 
eed91d5a4f29: Pull complete 
b36de521e979: Pull complete 
Digest: sha256:f43f2ea436eebc7b9fe3c80205e6649f4d1a66cfda8626ba010f8d8dfd7985ab
Status: Downloaded newer image for tensorflow/tensorflow:2.0.0a0-gpu-py3

Run TensorFlow

 docker run -it -p 8888:8888 tensorflow/tensorflow:2.0.0a0-gpu-py3         // run command
$ docker run -it -p 8888:8888 tensorflow/tensorflow:2.0.0a0-gpu-py3

________                               _______________                
___  __/__________________________________  ____/__  /________      __
__  /  _  _ \_  __ \_  ___/  __ \_  ___/_  /_   __  /_  __ \_ | /| / /
_  /   /  __/  / / /(__  )/ /_/ /  /   _  __/   _  / / /_/ /_ |/ |/ / 
/_/    \___//_/ /_//____/ \____//_/    /_/      /_/  \____/____/|__/


WARNING: You are running this container as root, which can cause new files in
mounted volumes to be created as the root user on your host machine.

To avoid this, run the container by specifying your user's userid:

$ docker run -u $(id -u):$(id -g) args...
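
The run command recorded above does not include the --runtime=nvidia flag that the nvidia-smi checks used. Below is a minimal sketch of assembling the same command with that flag, assuming nvidia-docker is installed as in the checks above; the helper name `docker_run_args` is hypothetical.

```python
import shlex

IMAGE = "tensorflow/tensorflow:2.0.0a0-gpu-py3"


def docker_run_args(image, port=8888, use_nvidia_runtime=True):
    """Build the argument list for an interactive `docker run`."""
    args = ["docker", "run"]
    if use_nvidia_runtime:
        # Same flag as in the nvidia-smi checks; exposes the GPU to the container.
        args.append("--runtime=nvidia")
    args += ["-it", "-p", "{0}:{0}".format(port), image]
    return args


cmd = docker_run_args(IMAGE)
# Print the command in a form that can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list can be passed to `subprocess.run(cmd)` to start the container, or the printed line can be run by hand.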

Test

# python
Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import absolute_import, division, print_function, unicode_literals
>>> !pip install -q tensorflow==2.0.0-alpha0
  File "<stdin>", line 1
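    !pip install -q tensorflow==2.0.0-alpha0
    ^
SyntaxError: invalid syntax
// This line fails because the "!command" form is IPython/Jupyter shell-escape syntax, which the plain Python REPL does not accept. The image already ships TensorFlow, so the install step can be skipped here.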
>>> import tensorflow as tf

>>> 
>>> mnist = tf.keras.datasets.mnist
>>> (x_train, y_train), (x_test, y_test) = mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 6s 1us/step
>>> x_train, x_test = x_train / 255.0, x_test / 255.0
>>> model = tf.keras.models.Sequential([
...   tf.keras.layers.Flatten(input_shape=(28, 28)),
...   tf.keras.layers.Dense(128, activation='relu'),
...   tf.keras.layers.Dropout(0.2),
...   tf.keras.layers.Dense(10, activation='softmax')
... ])
2019-04-22 14:03:51.302251: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-04-22 14:03:51.316205: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64:
2019-04-22 14:03:51.316261: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2019-04-22 14:03:51.316319: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
2019-04-22 14:03:51.337157: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3600000000 Hz
2019-04-22 14:03:51.338981: I tensorflow/compiler/xla/service/service.cc:162] XLA service 0x4147790 executing computations on platform Host. Devices:
2019-04-22 14:03:51.339038: I tensorflow/compiler/xla/service/service.cc:169]   StreamExecutor device (0): <undefined>, <undefined>
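// TensorFlow is running on the CPU here: libcuda.so.1 and /dev/nvidia0 are not visible inside the container, most likely because the container was started without --runtime=nvidia.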
>>> model.compile(optimizer='adam',
...               loss='sparse_categorical_crossentropy',
...               metrics=['accuracy'])
>>> model.fit(x_train, y_train, epochs=5)
Epoch 1/5
60000/60000 [==============================] - 7s 109us/sample - loss: 0.2981 - accuracy: 0.9136
Epoch 2/5
60000/60000 [==============================] - 6s 107us/sample - loss: 0.1438 - accuracy: 0.9565
Epoch 3/5
60000/60000 [==============================] - 6s 107us/sample - loss: 0.1094 - accuracy: 0.9674
Epoch 4/5
60000/60000 [==============================] - 6s 107us/sample - loss: 0.0904 - accuracy: 0.9715
Epoch 5/5
60000/60000 [==============================] - 6s 107us/sample - loss: 0.0752 - accuracy: 0.9764
<tensorflow.python.keras.callbacks.History object at 0x7f9ae6231a20>
>>> model.evaluate(x_test, y_test)
10000/10000 [==============================] - 1s 55us/sample - loss: 0.0759 - accuracy: 0.9760
[0.07590396217172965, 0.976]
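
The log above mentions two missing pieces: the libcuda.so.1 library and the /dev/nvidia0 device node. The following is a small standard-library sketch for checking both from inside the container before starting a training run; the function names are hypothetical, and the check is only a heuristic, not a substitute for TensorFlow's own device detection.

```python
import ctypes.util
import glob


def gpu_visibility():
    """Collect the NVIDIA device nodes and the CUDA driver library, if any."""
    device_nodes = sorted(glob.glob("/dev/nvidia[0-9]*"))
    # find_library returns a name such as 'libcuda.so.1', or None if not found.
    libcuda = ctypes.util.find_library("cuda")
    return {"device_nodes": device_nodes, "libcuda": libcuda}


def gpu_likely_usable(report):
    """Heuristic: at least one device node and the driver library are present."""
    return bool(report["device_nodes"]) and report["libcuda"] is not None


if __name__ == "__main__":
    report = gpu_visibility()
    print(report)
    print("GPU likely usable:", gpu_likely_usable(report))
```

If this prints False inside the container, training will fall back to the CPU as in the session above. Within TensorFlow itself, `tf.test.is_gpu_available()` offers a comparable check.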