TensorFlow Serving

개발허재 2022. 3. 21. 14:02

TensorFlow Serving

TensorFlow Serving [2] is a flexible, high-performance serving system built by Google for production environments.

Once a model has been designed and trained, using it in a real production environment requires a system that can run inference. TensorFlow Serving supports this step in an optimized form.

Running TensorFlow Serving with Docker

After a model has been saved in the SavedModel format, TensorFlow Serving can load it and expose it as an API server.
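The article does not spell out the directory layout, but TensorFlow Serving looks for version-numbered subdirectories (for example 1/) under the model directory, each holding a saved_model.pb file. The sketch below is one way to check that layout before starting the server; the helper name find_model_versions is hypothetical, and the demo runs against a temporary directory rather than a real model.

```python
import os
import tempfile


def find_model_versions(base_dir):
    """Return the numeric version subdirectories of base_dir that hold a SavedModel."""
    versions = []
    for name in os.listdir(base_dir):
        version_dir = os.path.join(base_dir, name)
        # Only subdirectories whose name is an integer count as model versions.
        if not (name.isdecimal() and os.path.isdir(version_dir)):
            continue
        # A SavedModel directory contains saved_model.pb (and usually variables/).
        if os.path.isfile(os.path.join(version_dir, "saved_model.pb")):
            versions.append(int(name))
    return sorted(versions)


# Demo on a throwaway directory that mimics the expected layout.
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "1", "variables"))
    open(os.path.join(base, "1", "saved_model.pb"), "wb").close()
    os.makedirs(os.path.join(base, "notes"))  # ignored: not a numeric version
    demo_versions = find_model_versions(base)

print(demo_versions)
```

An empty result for the directory you plan to mount usually means the model was saved without a version subdirectory.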

TensorFlow Serving is usually run with Docker. The steps are as follows.

First, download the TensorFlow Serving image with the docker pull command. [2]

docker pull tensorflow/serving

Next, the following command uses TensorFlow Serving to start a REST API inference server for the saved model.

docker run -t --rm -p 8501:8501 \
    -v "/home/solaris/Desktop/tf_serving/saved_model:/models/fashion_model" \
    -e MODEL_NAME=fashion_model \
    tensorflow/serving &

The command-line arguments are explained below.

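-p : Publishes the port the server uses to exchange data, written as host port:container port. (In the example above, port 8501.)
-v : Mounts a host directory into the container, written as host path:container path. The part before the colon is the full path of the directory where the model is saved in SavedModel format. (In the example above, the model files are read from /home/solaris/Desktop/tf_serving/saved_model.) The part after the colon is the path inside the container where the model is mounted. (In the example above, /models/fashion_model.) It is a filesystem path, not the REST API URL itself.
-e : Sets an environment variable inside the container. MODEL_NAME=fashion_model tells TensorFlow Serving to load the model from /models/fashion_model, and this name is what appears in the REST API URL.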
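To make the relationship between the three flags explicit, the same command can be assembled from its parts. This is a minimal sketch: build_serving_command is a hypothetical helper, not part of Docker or TensorFlow Serving, and it only builds the argument list without running anything.

```python
import shlex


def build_serving_command(host_model_dir, model_name, rest_port=8501):
    """Build the docker run argument list for a TensorFlow Serving REST server."""
    return [
        "docker", "run", "-t", "--rm",
        # host port : container port (8501 is the REST port inside the container)
        "-p", f"{rest_port}:8501",
        # host SavedModel directory : mount point inside the container
        "-v", f"{host_model_dir}:/models/{model_name}",
        # model name; must match the last component of the mount point
        "-e", f"MODEL_NAME={model_name}",
        "tensorflow/serving",
    ]


cmd = build_serving_command(
    "/home/solaris/Desktop/tf_serving/saved_model", "fashion_model")
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd) to start the container from Python. Note that the model name appears twice, once in the mount point and once in MODEL_NAME, and the two must agree.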

If the TensorFlow Serving server started correctly, the log shows that a REST API server has been set up on port 8501.
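Besides reading the log, the server can be probed over HTTP: a GET request to /v1/models/fashion_model returns the load state of each model version. The sketch below assumes the server from the command above is reachable on localhost:8501; the function names are hypothetical, and the sample dictionary is a hand-written illustration of the response shape, not captured output.

```python
import json
import urllib.request


def model_status_url(model_name, host="localhost", port=8501):
    """URL of the model status endpoint for the given model."""
    return f"http://{host}:{port}/v1/models/{model_name}"


def available_versions(status):
    """Return the versions whose state is AVAILABLE in a parsed status response."""
    return [
        entry["version"]
        for entry in status.get("model_version_status", [])
        if entry.get("state") == "AVAILABLE"
    ]


def fetch_model_status(model_name, timeout=5.0):
    """GET the status endpoint; this call needs the server to be running."""
    url = model_status_url(model_name)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Offline demo with a hand-written response of the expected shape.
sample = {"model_version_status": [
    {"version": "1", "state": "AVAILABLE",
     "status": {"error_code": "OK", "error_message": ""}}]}
print(model_status_url("fashion_model"))
print(available_versions(sample))
```

With the container running, available_versions(fetch_model_status("fashion_model")) should list the versions that are ready to serve.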

Sending image data with a POST request and visualizing the predictions

Input data can now be sent to the URL below, which contains the model name specified when TensorFlow Serving was started (models/fashion_model), and the prediction for that input is returned.

http://localhost:8501/v1/models/fashion_model:predict

To request a specific model version as well, use the following URL.

http://localhost:8501/v1/models/fashion_model/versions/1:predict
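The two URL forms differ only in the optional /versions/<number> segment, so they can be produced by one small function. This is an illustrative sketch; predict_url is a hypothetical helper name, and the host and port defaults are taken from the example above.

```python
def predict_url(model_name, version=None, host="localhost", port=8501):
    """Build the predict URL, optionally pinned to a model version."""
    url = f"http://{host}:{port}/v1/models/{model_name}"
    if version is not None:
        # Pin the request to one version instead of the one served by default.
        url += f"/versions/{int(version)}"
    return url + ":predict"


print(predict_url("fashion_model"))
print(predict_url("fashion_model", version=1))
```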

The example code below sends three Fashion MNIST test images to the running server in a POST request and visualizes the predictions returned by the API server.

from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
import random
import json
import requests

def show(idx, title):
  plt.figure(figsize=(12, 3))
  plt.imshow(test_images[idx].reshape(28,28))
  plt.axis('off')
  plt.title('\n\n{}'.format(title), fontdict={'size': 16})
  plt.show()

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# scale the values to 0.0 to 1.0
test_images = test_images / 255.0

# reshape for feeding into the model
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))

rando = random.randint(0,len(test_images)-1)
show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))

data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))


# send data using POST request and receive prediction result
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']
# show first prediction result
show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
  class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))

# set model version and send data using POST request and receive prediction result
json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']
# show all prediction result
for i in range(0,3):
  show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
    class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))

With TensorFlow Serving, a REST API inference server for a model can be built as simply as shown above.
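The request and response handling in the script can also be separated from the plotting, which makes it easier to test without a server. The sketch below uses only the standard library; build_predict_body and decode_predictions are hypothetical helper names, and the response text in the demo is made up for illustration. It also checks for the "error" key that the REST API returns in place of "predictions" when a request fails.

```python
import json

CLASS_NAMES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def build_predict_body(instances, signature_name="serving_default"):
    """Serialize a batch of inputs in the row ("instances") request format."""
    return json.dumps({"signature_name": signature_name, "instances": instances})


def decode_predictions(response_text, class_names=CLASS_NAMES):
    """Turn a predict response into (class index, class name) pairs."""
    body = json.loads(response_text)
    if "error" in body:
        # Failed requests carry an error message instead of predictions.
        raise RuntimeError("prediction failed: {}".format(body["error"]))
    results = []
    for scores in body["predictions"]:
        # Pure-Python argmax: index of the highest score.
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, class_names[best]))
    return results


# Made-up response for two images; real scores come from the server.
fake_response = json.dumps({"predictions": [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
    [0.1, 0.0, 0.8, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0],
]})
print(decode_predictions(fake_response))
```

In the script above, decode_predictions(json_response.text) would replace the repeated np.argmax calls.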

References

[1] https://www.tensorflow.org/tfx/tutorials/serving/rest_simple

[2] https://github.com/tensorflow/serving

[3] https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/saved_model

[4] http://solarisailab.com/archives/2703
