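After scikit-image 0.20 was released recently, code that called rescale() started failing with an error saying that the multichannel parameter does not exist.

The multichannel option was still accepted up through 0.19, where it was deprecated in favor of channel_axis, and it was removed in 0.20.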

[Image: scikit-image API documentation for version 0.19]

[Image: scikit-image API documentation for version 0.20; the multichannel option is gone]

 

 

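If your code still passes multichannel, you can fix this by downgrading scikit-image to 0.19, by deleting the argument (only safe for grayscale input, since the image is then treated as single-channel by default), or by passing channel_axis instead (for example, channel_axis=-1 where you previously passed multichannel=True).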
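The switch from multichannel to channel_axis can be wrapped in a small version check so that the same call works on both sides of the change. The sketch below is only an illustration: channel_kwargs is a hypothetical helper, not part of scikit-image, and it assumes that multichannel=True corresponds to channel_axis=-1 (channels on the last axis) and that channel_axis is accepted from 0.19 onward.

```python
def channel_kwargs(skimage_version, is_color):
    """Return the keyword argument that marks the channel axis for this version.

    Hypothetical helper: skimage_version is a string such as skimage.__version__.
    """
    major, minor = (int(part) for part in skimage_version.split(".")[:2])
    if (major, minor) >= (0, 19):
        # Newer API: channel_axis=-1 means the last axis holds the channels,
        # channel_axis=None means a grayscale (single-channel) image.
        return {"channel_axis": -1 if is_color else None}
    # Older API: a boolean flag.
    return {"multichannel": is_color}


new_style = channel_kwargs("0.20.0", is_color=True)
old_style = channel_kwargs("0.18.3", is_color=True)
print(new_style, old_style)

# Intended use (needs scikit-image installed, so it is left as a comment):
#   import skimage
#   from skimage.transform import rescale
#   small = rescale(image, 0.5, **channel_kwargs(skimage.__version__, True))
```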

우송송

,

When using OpenCV or working with Qt in Python, you will occasionally run into an error like the following:

QObject::moveToThread: Current thread (0x557f778d2d20) is not the object's thread (0x557f765b2450).
Cannot move to target thread (0x557f778d2d20)

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/ubuntu/anaconda3/envs/torch3.8/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.

Aborted (core dumped)

 

This can be fixed by uninstalling opencv-python from the virtual environment in use and installing opencv-python-headless instead:

pip uninstall opencv-python
pip install opencv-python-headless
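The opencv-python project asks that only one of its wheels (opencv-python, opencv-contrib-python, and their -headless variants) be installed in an environment, so it can help to check what is actually present before and after the swap. Below is a minimal sketch using only the standard library (Python 3.8+); installed_opencv_wheels is a name made up for this example. Keep in mind that the headless builds ship without GUI support, so calls such as cv2.imshow will not work afterwards.

```python
from importlib import metadata


def installed_opencv_wheels():
    """Return sorted names of installed distributions whose name starts with 'opencv'."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or ""
        if name.lower().startswith("opencv"):
            names.add(name.lower())
    return sorted(names)


found = installed_opencv_wheels()
if len(found) > 1:
    print("More than one OpenCV wheel is installed:", found)
else:
    print("OpenCV wheels found:", found)
```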

 

 

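I remember this not working in the past and spending quite a while on the problem, but this time it was resolved more easily than expected.

The link below lists other fixes, so refer to it if the method above does not work.

Reference: https://github.com/NVlabs/instant-ngp/discussions/300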


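Running import caffe works fine after activating the virtual environment in a terminal, but when the same environment is used from PyCharm through an SSH interpreter, import caffe can fail with an error saying that caffe cannot be found.

This probably differs between virtual environment tools, but with virtualenv the caffe directory has to be added to the module search path by hand.

There are two ways to do this.

The first is to add the path directly in code: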

import sys
sys.path.append("/home/ubuntu/caffe/python")
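If the path is added from code, it is worth guarding against adding it twice (for example when the module is imported again) and against a directory that does not exist on the machine the interpreter actually runs on. Here is a small sketch; add_module_dir is a hypothetical helper name, and the caffe path is the one used above.

```python
import os
import sys


def add_module_dir(path):
    """Append path to sys.path once; return True if it is on sys.path afterwards."""
    path = os.path.abspath(path)
    if not os.path.isdir(path):
        # With a remote (SSH) interpreter the directory must exist on the remote host.
        return False
    if path not in sys.path:
        sys.path.append(path)
    return True


if add_module_dir("/home/ubuntu/caffe/python"):
    print("caffe directory is on sys.path")
else:
    print("directory not found; check the path on the remote machine")
```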

 

 

The second is to register the path with the virtual environment itself, using the add2virtualenv command from virtualenvwrapper:

$ add2virtualenv /home/ubuntu/caffe/python

Model initialization code that is packaged properly and works when imported from another Python file can still raise errors once it is used inside a Flask API.

 

This could mean that the variable was uninitialized.
Tensor("mrcnn_detection/Reshape_1:0", shape=(1, 100, 6), dtype=float32) is not an element of this graph.

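These were the two errors I ran into.

As the messages say, they occur when the TensorFlow session or graph has not been initialized properly for the code that is running.

But why do they appear only when the code is brought into a Flask API?

Flask can handle requests on multiple threads, and TensorFlow 1.x treats the default graph and default session as properties of the current thread. The model is loaded on one thread, so a request-handling thread that relies on the defaults can end up working against a graph or session in which the model was never loaded.

The fix I found is to keep the graph and session in global variables and have TensorFlow use them explicitly.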
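The per-thread behavior can be imitated with the standard library's threading.local. The snippet below does not use TensorFlow at all: set_default and get_default are stand-ins invented for this illustration, and the real library is more involved. It only shows why an explicit, shared reference (like a global graph or session) still works where an implicit per-thread default does not.

```python
import threading

# Stand-in for a per-thread "default graph" slot (an assumption for illustration).
_defaults = threading.local()


def set_default(graph):
    _defaults.graph = graph


def get_default():
    return getattr(_defaults, "graph", None)


# Main thread: "load the model" and remember the graph explicitly.
set_default("graph-with-loaded-model")
shared_graph = get_default()

seen = {}


def handle_request():
    # A different thread: the implicit default is empty here,
    # but the explicitly shared reference is still usable.
    seen["implicit"] = get_default()
    seen["explicit"] = shared_graph


worker = threading.Thread(target=handle_request)
worker.start()
worker.join()
print(seen)
```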

 

 

flask_API.py

from flask import Flask

from InferenceClass import InferenceClass  # assumes InferenceClass.py defines this class


class InferenceServer:
    def __init__(self):
        # self.model_path is assumed to be set elsewhere (not shown here)
        self.tensorflow_obj = InferenceClass(self.model_path)
        self.tensorflow_obj.model_init()

        self.app = self.create_flask_app()

    def create_flask_app(self):
        app = Flask(__name__)

        @app.route('/image', methods=['POST'])
        def mask_rcnn_inference():

            ...

            inference_image, inference_time = self.tensorflow_obj.inference(img)

            ...

        return app

 

 

InferenceClass.model_init()

    def model_init(self):
        self.model = modellib.MaskRCNN(mode="inference", model_dir=self.MODEL_DIR,
                                  config=self.config)
                                  
        self.model.load_weights(self.weights_path, by_name=True)

  

 

 

InferenceClass.inference()

    def inference(self, image):
        ...

        results = self.model.detect([image], verbose=1)

        ...

 

 

Given code shaped like the above, put the session and model into global variables on the Flask side, and on the side that Flask imports, run the inference code under the graph that was stored in a global variable.

 

 

 

 

 

flask_API.py

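import keras
import tensorflow as tf
from flask import Flask

from InferenceClass import InferenceClass  # assumes InferenceClass.py defines this class

model = None    # added: module-level handle shared by all request threads
session = None  # added


class InferenceServer:
    def __init__(self):
        global model    # added
        global session  # added

        session = tf.Session()              # added
        keras.backend.set_session(session)  # added

        # self.model_path is assumed to be set elsewhere (not shown here)
        model = InferenceClass(self.model_path)  # global variable instead of an instance attribute
        model.model_init()                       # global variable instead of an instance attribute
        self.app = self.create_flask_app()

    def create_flask_app(self):
        app = Flask(__name__)

        @app.route('/image', methods=['POST'])
        def mask_rcnn_inference():
            with session.as_default():  # added

                ...

                inference_image, inference_time = model.inference(img)  # global variable instead of an instance attribute

                ...
        return app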

 

 

InferenceClass.model_init()

    def model_init(self):
        global graph  # added
        self.model = modellib.MaskRCNN(mode="inference", model_dir=self.MODEL_DIR,
                                       config=self.config)
        print(self.weights_path)
        self.model.load_weights(self.weights_path, by_name=True)

        graph = tf.get_default_graph()  # added: remember the graph the model was built in

 

 

InferenceClass.inference()

    def inference(self, image):
        with graph.as_default():  # added

            ...

            results = self.model.detect([image], verbose=1)

            ...

Installing Python packages with pip install on Ubuntu occasionally fails with a UnicodeDecodeError.

 

 

 

 

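In my case the error appeared while using Ubuntu inside Docker, and the cause was that no locale had been configured when the image was built from the Dockerfile.

One fix is to add the following near the top of the Dockerfile and rebuild: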

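FROM ubuntu:18.04

# Set the locale (-y avoids the interactive prompt; locale-gen creates the locale before it is used)
RUN apt-get update && apt-get install -y locales \
    && locale-gen ko_KR.UTF-8 \
    && update-locale LANG=ko_KR.UTF-8
ENV LANG=ko_KR.UTF-8
ENV LANGUAGE=ko_KR.UTF-8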

 

 

If you would rather not rebuild the image and only want to get past the error right now, run the following command first and then run pip install:

$ export LC_ALL=C.UTF-8
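The failure itself is easy to reproduce without pip: bytes containing non-ASCII text cannot be decoded when a process falls back to an ASCII codec, which is what typically happens when no UTF-8 locale is configured. Here is a hedged sketch (the sample string is arbitrary, and decodes_cleanly is a helper made up for this example):

```python
import locale


def decodes_cleanly(raw, encoding):
    """Return True if raw bytes can be decoded with the given codec."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


raw = "한글 설명".encode("utf-8")  # any non-ASCII text stored as UTF-8 bytes
ascii_ok = decodes_cleanly(raw, "ascii")
utf8_ok = decodes_cleanly(raw, "utf-8")

print("ascii:", ascii_ok, "utf-8:", utf8_ok)
print("preferred encoding here:", locale.getpreferredencoding(False))
```

If the last line reports an ASCII codec rather than UTF-8, the locale is probably not set up for UTF-8 in that shell or container.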

 


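I am in the middle of writing code that loads a Mask R-CNN checkpoint and runs inference with it.

During training, each checkpoint produces several files:

checkpoint, data-00000-of-0000*, index, and meta. The meta file stores the graph definition, the index and data files store the variable values, and the checkpoint file records which checkpoints exist.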
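What TensorFlow treats as one checkpoint is the common prefix of these files (for example model.ckpt-2396686), not any single file. As a rough illustration, the prefixes can be recovered from a directory listing by looking at the .index files. The listing below is a made-up example in the same naming pattern, and checkpoint_prefixes is a hypothetical helper.

```python
def checkpoint_prefixes(filenames):
    """Return checkpoint prefixes, i.e. the names of '.index' files without the suffix."""
    suffix = ".index"
    return sorted(name[:-len(suffix)] for name in filenames if name.endswith(suffix))


# Example listing (assumed for illustration; use os.listdir(...) on a real directory).
listing = [
    "checkpoint",
    "model.ckpt-2396686.data-00000-of-00002",
    "model.ckpt-2396686.data-00001-of-00002",
    "model.ckpt-2396686.index",
    "model.ckpt-2396686.meta",
]
prefixes = checkpoint_prefixes(listing)
print(prefixes)
```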

 

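Until now I had only used pretrained models (mostly pb or h5), so this is my first time handling TensorFlow checkpoint files (data-00000-of-00002, data-00001-of-00002, index, meta) directly, and it has involved a lot of trial and error.

Whether these files end up converted to pb or to h5, they have to be loaded first. Since that is the basic step, I am writing it down here so that I do not forget it.

The following is basic code for loading one specific checkpoint: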

import tensorflow as tf

meta_ckpt_path = "C:/Users/user/Desktop/models/maskrcnn_models/model.ckpt-2396686.meta"
model_ckpt_path = "C:/Users/user/Desktop/models/maskrcnn_models/model.ckpt-2396686"

with tf.Session() as sess:
    # Restore the graph
    saver = tf.train.import_meta_graph(meta_ckpt_path)

    # Load weights
    saver.restore(sess, model_ckpt_path)

 

 

The following is basic code for loading the most recent checkpoint:

 

import tensorflow as tf

meta_ckpt_path = "C:/Users/user/Desktop/models/maskrcnn_models/model.ckpt-2396686.meta"
model_ckpt_path = "C:/Users/user/Desktop/models/maskrcnn_models/"

with tf.Session() as sess:
    # Restore the graph
    saver = tf.train.import_meta_graph(meta_ckpt_path)

    # Load weights
    saver.restore(sess, tf.train.latest_checkpoint(model_ckpt_path))
    

 

 

The two snippets differ in what is passed to saver.restore:

1. The directory where the checkpoints are saved is given to tf.train.latest_checkpoint, and its result is passed on.

      ex) "C:/Users/user/Desktop/models/maskrcnn_models/"

2. A specific checkpoint is named directly, without using tf.train.latest_checkpoint.

      ex) "C:/Users/user/Desktop/models/maskrcnn_models/model.ckpt-2396686"

 

 

 

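When you pass the checkpoint directory as in option 1, there is one thing to watch out for. Among the files listed earlier there is one named 'checkpoint'.

If you open that file, you can see that it records the paths of the checkpoints.

tf.train.latest_checkpoint looks up the most recently recorded checkpoint path in that list, so the paths on the first and last lines have to be edited to match where the files actually are.

In my case the checkpoints were created on Ubuntu and are being used on Windows, so I changed them to Windows paths.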
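Editing the file by hand works, but the rewrite can also be scripted. The sketch below assumes the usual text layout of the 'checkpoint' file (a model_checkpoint_path line followed by all_model_checkpoint_paths lines, each with a quoted path). The /home/ubuntu/train paths are invented sample data, and relocate_checkpoint_index is a hypothetical helper, not a TensorFlow function.

```python
import posixpath
import re

# Matches lines such as: model_checkpoint_path: "/some/dir/model.ckpt-123"
_PATH_LINE = re.compile(
    r'^([ \t]*(?:model_checkpoint_path|all_model_checkpoint_paths):[ \t]*)"([^"]*)"',
    re.MULTILINE,
)


def relocate_checkpoint_index(text, new_dir):
    """Rewrite every quoted path so it points into new_dir, keeping the file name."""
    new_dir = new_dir.rstrip("/")

    def _swap(match):
        filename = posixpath.basename(match.group(2))
        return '{}"{}/{}"'.format(match.group(1), new_dir, filename)

    return _PATH_LINE.sub(_swap, text)


# Sample contents (assumed for illustration), as written on the training machine.
original = (
    'model_checkpoint_path: "/home/ubuntu/train/model.ckpt-2396686"\n'
    'all_model_checkpoint_paths: "/home/ubuntu/train/model.ckpt-2390000"\n'
    'all_model_checkpoint_paths: "/home/ubuntu/train/model.ckpt-2396686"\n'
)
updated = relocate_checkpoint_index(
    original, "C:/Users/user/Desktop/models/maskrcnn_models/"
)
print(updated)
```

Rewriting every entry rather than only the first and last lines is a choice made for this sketch; it keeps the whole list consistent with the new location.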

 

 

 

However, even with the code written correctly as above, running it produced the following output:

 

C:\ProgramData\Anaconda3\envs\tf12_env\python.exe C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py
C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
2020-04-16 15:52:45.147591: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-04-16 15:52:45.248758: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: 
name: GeForce GTX 1080 with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.468
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.61GiB
2020-04-16 15:52:45.249101: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2020-04-16 15:52:46.118933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-16 15:52:46.119121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2020-04-16 15:52:46.119212: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2020-04-16 15:52:46.119427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6370 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _run_fn
    self._extend_graph()
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\client\session.py", line 1352, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation IteratorToStringHandle: Could not satisfy explicit device specification '' because the node {{colocation_node IteratorToStringHandle}} was colocated with a group of nodes that required incompatible device '/device:GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices: 
IteratorToStringHandle: GPU CPU 
OneShotIterator: CPU 
IteratorGetNext: GPU CPU 

Colocation members and user-requested devices:
  OneShotIterator (OneShotIterator) 
  IteratorToStringHandle (IteratorToStringHandle) 
  clone_0/IteratorGetNext (IteratorGetNext) /device:GPU:0

	 [[{{node IteratorToStringHandle}} = IteratorToStringHandle[](OneShotIterator)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\training\saver.py", line 1546, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation IteratorToStringHandle: Could not satisfy explicit device specification '' because the node node IteratorToStringHandle (defined at C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py:11) having device No device assignments were active during op 'IteratorToStringHandle' creation.  was colocated with a group of nodes that required incompatible device '/device:GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices: 
IteratorToStringHandle: GPU CPU 
OneShotIterator: CPU 
IteratorGetNext: GPU CPU 

Colocation members and user-requested devices:
  OneShotIterator (OneShotIterator) 
  IteratorToStringHandle (IteratorToStringHandle) 
  clone_0/IteratorGetNext (IteratorGetNext) /device:GPU:0

	 [[node IteratorToStringHandle (defined at C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py:11)  = IteratorToStringHandle[](OneShotIterator)]]

No node-device colocations were active during op 'IteratorToStringHandle' creation.
No device assignments were active during op 'IteratorToStringHandle' creation.

Caused by op 'IteratorToStringHandle', defined at:
  File "C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py", line 11, in <module>
    saver = tf.train.import_meta_graph(meta_ckpt_path)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\training\saver.py", line 1674, in import_meta_graph
    meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\training\saver.py", line 1696, in _import_meta_graph_with_return_elements
    **kwargs))
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\meta_graph.py", line 806, in import_scoped_meta_graph_with_return_elements
    return_elements=return_elements)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\importer.py", line 442, in import_graph_def
    _ProcessNewOps(graph)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\importer.py", line 234, in _ProcessNewOps
    for new_op in graph._add_new_tf_operations(compute_devices=False):  # pylint: disable=protected-access
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\ops.py", line 3440, in _add_new_tf_operations
    for c_op in c_api_util.new_tf_operations(self)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\ops.py", line 3440, in <listcomp>
    for c_op in c_api_util.new_tf_operations(self)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\ops.py", line 3299, in _create_op_from_tf_operation
    ret = Operation(c_op, self)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Cannot assign a device for operation IteratorToStringHandle: Could not satisfy explicit device specification '' because the node node IteratorToStringHandle (defined at C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py:11) having device No device assignments were active during op 'IteratorToStringHandle' creation.  was colocated with a group of nodes that required incompatible device '/device:GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices: 
IteratorToStringHandle: GPU CPU 
OneShotIterator: CPU 
IteratorGetNext: GPU CPU 

Colocation members and user-requested devices:
  OneShotIterator (OneShotIterator) 
  IteratorToStringHandle (IteratorToStringHandle) 
  clone_0/IteratorGetNext (IteratorGetNext) /device:GPU:0

	 [[node IteratorToStringHandle (defined at C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py:11)  = IteratorToStringHandle[](OneShotIterator)]]

No node-device colocations were active during op 'IteratorToStringHandle' creation.
No device assignments were active during op 'IteratorToStringHandle' creation.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py", line 14, in <module>
    saver.restore(sess, model_ckpt_path)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\training\saver.py", line 1582, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Cannot assign a device for operation IteratorToStringHandle: Could not satisfy explicit device specification '' because the node node IteratorToStringHandle (defined at C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py:11) having device No device assignments were active during op 'IteratorToStringHandle' creation.  was colocated with a group of nodes that required incompatible device '/device:GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices: 
IteratorToStringHandle: GPU CPU 
OneShotIterator: CPU 
IteratorGetNext: GPU CPU 

Colocation members and user-requested devices:
  OneShotIterator (OneShotIterator) 
  IteratorToStringHandle (IteratorToStringHandle) 
  clone_0/IteratorGetNext (IteratorGetNext) /device:GPU:0

	 [[node IteratorToStringHandle (defined at C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py:11)  = IteratorToStringHandle[](OneShotIterator)]]

No node-device colocations were active during op 'IteratorToStringHandle' creation.
No device assignments were active during op 'IteratorToStringHandle' creation.

Caused by op 'IteratorToStringHandle', defined at:
  File "C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py", line 11, in <module>
    saver = tf.train.import_meta_graph(meta_ckpt_path)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\training\saver.py", line 1674, in import_meta_graph
    meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\training\saver.py", line 1696, in _import_meta_graph_with_return_elements
    **kwargs))
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\meta_graph.py", line 806, in import_scoped_meta_graph_with_return_elements
    return_elements=return_elements)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\importer.py", line 442, in import_graph_def
    _ProcessNewOps(graph)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\importer.py", line 234, in _ProcessNewOps
    for new_op in graph._add_new_tf_operations(compute_devices=False):  # pylint: disable=protected-access
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\ops.py", line 3440, in _add_new_tf_operations
    for c_op in c_api_util.new_tf_operations(self)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\ops.py", line 3440, in <listcomp>
    for c_op in c_api_util.new_tf_operations(self)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\ops.py", line 3299, in _create_op_from_tf_operation
    ret = Operation(c_op, self)
  File "C:\ProgramData\Anaconda3\envs\tf12_env\lib\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Cannot assign a device for operation IteratorToStringHandle: Could not satisfy explicit device specification '' because the node node IteratorToStringHandle (defined at C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py:11) having device No device assignments were active during op 'IteratorToStringHandle' creation.  was colocated with a group of nodes that required incompatible device '/device:GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices: 
IteratorToStringHandle: GPU CPU 
OneShotIterator: CPU 
IteratorGetNext: GPU CPU 

Colocation members and user-requested devices:
  OneShotIterator (OneShotIterator) 
  IteratorToStringHandle (IteratorToStringHandle) 
  clone_0/IteratorGetNext (IteratorGetNext) /device:GPU:0

	 [[node IteratorToStringHandle (defined at C:/Users/chzhq/Desktop/roof_detection_bbox/tf_inference/make_ckpt_to_pb.py:11)  = IteratorToStringHandle[](OneShotIterator)]]

No node-device colocations were active during op 'IteratorToStringHandle' creation.
No device assignments were active during op 'IteratorToStringHandle' creation.


Process finished with exit code 1

 

In other words, the run can fail with this error:

Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint.

 

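Some answers I found suggested adding the CUDA and cuDNN environment variables, but those were already set on my machine, so I started trying every piece of code that turned up in my searches.

One option happened to work:

passing allow_soft_placement=True when creating tf.Session().

With allow_soft_placement enabled, TensorFlow falls back to an existing, supported device when an operation cannot be placed on the device it was assigned to.

Adding log_device_placement=True as well prints a log of which device each operation was placed on.

The code below has both options applied: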

 

import tensorflow as tf

meta_ckpt_path = "C:/Users/user/Desktop/models/maskrcnn_models/model.ckpt-2396686.meta"
model_ckpt_path = "C:/Users/user/Desktop/models/maskrcnn_models/"

# Pass the options here
with tf.Session(config=tf.ConfigProto(
        allow_soft_placement=True, log_device_placement=True)) as sess:
        
    # Restore the graph
    saver = tf.train.import_meta_graph(meta_ckpt_path)

    # Load weights
    saver.restore(sess, tf.train.latest_checkpoint(model_ckpt_path))
    

 

 

With these options applied, the checkpoint loads normally.

Corrections are always welcome :)


After installing tensorflow, running import tensorflow as tf can fail like this:

 

>>> import tensorflow as tf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/__init__.py", line 59, in <module>
    from tensorflow.core.framework.graph_pb2 import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/core/framework/graph_pb2.py", line 15, in <module>
    from tensorflow.core.framework import node_def_pb2 as tensorflow_dot_core_dot_framework_dot_node__def__pb2
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/core/framework/node_def_pb2.py", line 15, in <module>
    from tensorflow.core.framework import attr_value_pb2 as tensorflow_dot_core_dot_framework_dot_attr__value__pb2
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/core/framework/attr_value_pb2.py", line 15, in <module>
    from tensorflow.core.framework import tensor_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__pb2
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/core/framework/tensor_pb2.py", line 15, in <module>
    from tensorflow.core.framework import resource_handle_pb2 as tensorflow_dot_core_dot_framework_dot_resource__handle__pb2
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/core/framework/resource_handle_pb2.py", line 22, in <module>
    serialized_pb=_b('\n/tensorflow/core/framework/resource_handle.proto\x12\ntensorflow\"r\n\x13ResourceHandleProto\x12\x0e\n\x06\x64\x65vice\x18\x01 \x01(\t\x12\x11\n\tcontainer\x18\x02 \x01(\t\x12\x0c\n\x04name\x18\x03 \x01(\t\x12\x11\n\thash_code\x18\x04 \x01(\x04\x12\x17\n\x0fmaybe_type_name\x18\x05 \x01(\tBn\n\x18org.tensorflow.frameworkB\x0eResourceHandleP\x01Z=github.com/tensorflow/tensorflow/tensorflow/go/core/framework\xf8\x01\x01\x62\x06proto3')
TypeError: __new__() got an unexpected keyword argument 'serialized_options'

 

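When this serialized_options error appears, check whether the installed protobuf version matches what your TensorFlow build expects, and install the matching version.

For TensorFlow 1.12, installing protobuf 3.6.1 resolves it: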

 

pip install protobuf==3.6.1
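To see whether the installed protobuf is older than the version a given TensorFlow build expects, the versions can be compared before importing TensorFlow. Below is a minimal sketch, assuming plain dotted version strings; version_tuple and meets_minimum are helpers invented here, and 3.6.1 is simply the version mentioned above for TensorFlow 1.12. It only checks for a protobuf that is too old; a much newer protobuf may be incompatible in other ways.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '3.6.1' into (3, 6, 1); parsing stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, required):
    return version_tuple(installed) >= version_tuple(required)


try:
    installed = metadata.version("protobuf")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("protobuf is not installed")
elif meets_minimum(installed, "3.6.1"):
    print("protobuf", installed, "is at least 3.6.1")
else:
    print("protobuf", installed, "is older than 3.6.1")
```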

 


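After cloning the TensorFlow models repository, configuring everything, and running train.py to start training, I always got an error saying that 'object_detection' could not be found, no matter which environment I used.

[Image: running train.py fails because the 'object_detection' module cannot be found]

I wondered whether the module was really missing, but searching the tensorflow folder showed it sitting right there, so it was baffling that the script died claiming otherwise.

At first I ran the code from PyCharm, went to every red underline, and spelled out the parent directory on each object_detection import... I will happily do grunt work if it saves me from working late.

But that is not something to repeat every time an environment is set up, so I searched for the error, and the fix that never turned up when I was in a hurry was sitting at the top of the results.

The fix is simple.

Open a terminal and add a single environment variable as shown below: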

export PYTHONPATH=$PYTHONPATH:/home/user/tensorflow/models/research:/home/user/tensorflow/models/research/slim
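The export only lasts for the current shell session, so it has to be repeated (or added to a shell startup file) for new terminals. The same value can also be assembled from Python, for instance to pass it to a subprocess. Here is a small sketch; build_pythonpath is a hypothetical helper, and the directories are the ones from the command above.

```python
import os


def build_pythonpath(existing, extra_dirs, sep=os.pathsep):
    """Join existing PYTHONPATH entries with extra_dirs, skipping blanks and duplicates."""
    entries = [p for p in (existing or "").split(sep) if p]
    for directory in extra_dirs:
        if directory not in entries:
            entries.append(directory)
    return sep.join(entries)


research = "/home/user/tensorflow/models/research"
value = build_pythonpath(os.environ.get("PYTHONPATH"), [research, research + "/slim"])

# A copy of the environment with the extended path, e.g. for subprocess.run(..., env=env).
env = dict(os.environ, PYTHONPATH=value)
print(value)
```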