Solved: apex Error in FusedLayerNorm

After installing apex with the cuda extensions and running pytorch-pretrained-BERT, I get the following error in FusedLayerNormAffineFunction, apex/normalization/fused_layer_norm.py (line 21).

RuntimeError: a Tensor with 2482176 elements cannot be converted to Scalar (item at /pytorch/aten/src/ATen/native/Scalar.cpp:9)

Here are the shapes of my tensors:

input_ - [32, 101, 768]
bias_ - [768]
weight_ - [768]
self.normalized_shape - [768]
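
A quick arithmetic check, offered as a sketch rather than a diagnosis: the element count in the error message equals the number of elements in input_. That suggests (but does not prove) that the whole input tensor ended up where the extension expected a scalar argument. The variable names below are illustrative only.

```python
import math

# Shape of input_ as reported above.
input_shape = [32, 101, 768]

# The number of elements in a tensor is the product of its dimensions.
num_elements = math.prod(input_shape)

print(num_elements)  # 2482176, the count quoted in the RuntimeError
```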

I'm not sure if it's a problem with pytorch-pretrained-BERT calling it incorrectly or a bug in apex. Any idea? I've also created an issue here.

I'm running Ubuntu with CUDA 9, PyTorch 0.4.1.

Full stacktrace below.

File "/home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 710, in forward
    embedding_output = self.embeddings(input_ids, token_type_ids)
  File "/home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 261, in forward
    embeddings = self.LayerNorm(embeddings)
  File "/home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/normalization/fused_layer_norm.py", line 149, in forward
    input, self.weight, self.bias)
  File "/home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/normalization/fused_layer_norm.py", line 21, in forward
    input_, self.normalized_shape, weight_, bias_, self.eps)

RuntimeError: a Tensor with 2482176 elements cannot be converted to Scalar (item at /pytorch/aten/src/ATen/native/Scalar.cpp:9)

frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f1aa5da3021 in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f1aa5da28ea in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: at::native::item(at::Tensor const&) + 0x12c3 (0x7f1aa690d5b3 in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #3: at::TypeDefault::item(at::Tensor const&) const + 0x55 (0x7f1aa6b1c905 in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #4: torch::autograd::VariableType::eye_out(at::Tensor&, long, long) const + 0x184 (0x7f1aa4faeec4 in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #5: <unknown function> + 0x89ca (0x7f1a82e739ca in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #6: layer_norm_affine(at::Tensor, c10::ArrayRef<long>, at::Tensor, at::Tensor, double) + 0x185 (0x7f1a82e762a5 in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #7: <unknown function> + 0x18d44 (0x7f1a82e83d44 in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #8: <unknown function> + 0x16495 (0x7f1a82e81495 in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #9: _PyCFunction_FastCallDict + 0x154 (0x55a8f9925744 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #10: <unknown function> + 0x198610 (0x55a8f99ac610 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #11: _PyEval_EvalFrameDefault + 0x30a (0x55a8f99d138a in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #12: <unknown function> + 0x71e1 (0x7f1af51ee1e1 in /home/hyper/.PyCharm2018.1/system/cythonExtensions/_pydevd_frame_eval_ext/pydevd_frame_evaluator.cpython-36m-x86_64-linux-gnu.so)
frame #13: _PyFunction_FastCallDict + 0x11b (0x55a8f99a6bab in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #14: _PyObject_FastCallDict + 0x26f (0x55a8f9925b0f in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #15: _PyObject_Call_Prepend + 0x63 (0x55a8f992a6a3 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #16: PyObject_Call + 0x3e (0x55a8f992554e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #17: THPFunction_do_forward(THPFunction*, _object*) + 0x15c (0x7f1ae02e21ec in /home/hyper/Documents/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #18: PyCFunction_Call + 0x5f (0x55a8f992863f in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #19: PyObject_Call + 0x3e (0x55a8f992554e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #20: <unknown function> + 0x16ba91 (0x55a8f997fa91 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #21: _PyObject_FastCallDict + 0x8b (0x55a8f992592b in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #22: <unknown function> + 0x19857e (0x55a8f99ac57e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x30a (0x55a8f99d138a in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #24: <unknown function> + 0x71e1 (0x7f1af51ee1e1 in /home/hyper/.PyCharm2018.1/system/cythonExtensions/_pydevd_frame_eval_ext/pydevd_frame_evaluator.cpython-36m-x86_64-linux-gnu.so)
frame #25: _PyFunction_FastCallDict + 0x11b (0x55a8f99a6bab in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #26: _PyObject_FastCallDict + 0x26f (0x55a8f9925b0f in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #27: _PyObject_Call_Prepend + 0x63 (0x55a8f992a6a3 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #28: PyObject_Call + 0x3e (0x55a8f992554e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #29: _PyEval_EvalFrameDefault + 0x19ec (0x55a8f99d2a6c in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #30: <unknown function> + 0x71e1 (0x7f1af51ee1e1 in /home/hyper/.PyCharm2018.1/system/cythonExtensions/_pydevd_frame_eval_ext/pydevd_frame_evaluator.cpython-36m-x86_64-linux-gnu.so)
frame #31: <unknown function> + 0x1918e4 (0x55a8f99a58e4 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #32: _PyFunction_FastCallDict + 0x1bc (0x55a8f99a6c4c in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #33: _PyObject_FastCallDict + 0x26f (0x55a8f9925b0f in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #34: _PyObject_Call_Prepend + 0x63 (0x55a8f992a6a3 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #35: PyObject_Call + 0x3e (0x55a8f992554e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #36: <unknown function> + 0x16ba91 (0x55a8f997fa91 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #37: _PyObject_FastCallDict + 0x8b (0x55a8f992592b in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #38: <unknown function> + 0x19857e (0x55a8f99ac57e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #39: _PyEval_EvalFrameDefault + 0x30a (0x55a8f99d138a in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #40: <unknown function> + 0x71e1 (0x7f1af51ee1e1 in /home/hyper/.PyCharm2018.1/system/cythonExtensions/_pydevd_frame_eval_ext/pydevd_frame_evaluator.cpython-36m-x86_64-linux-gnu.so)
frame #41: <unknown function> + 0x1918e4 (0x55a8f99a58e4 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #42: _PyFunction_FastCallDict + 0x3da (0x55a8f99a6e6a in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #43: _PyObject_FastCallDict + 0x26f (0x55a8f9925b0f in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #44: _PyObject_Call_Prepend + 0x63 (0x55a8f992a6a3 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #45: PyObject_Call + 0x3e (0x55a8f992554e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #46: _PyEval_EvalFrameDefault + 0x19ec (0x55a8f99d2a6c in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #47: <unknown function> + 0x71e1 (0x7f1af51ee1e1 in /home/hyper/.PyCharm2018.1/system/cythonExtensions/_pydevd_frame_eval_ext/pydevd_frame_evaluator.cpython-36m-x86_64-linux-gnu.so)
frame #48: <unknown function> + 0x1918e4 (0x55a8f99a58e4 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #49: _PyFunction_FastCallDict + 0x1bc (0x55a8f99a6c4c in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #50: _PyObject_FastCallDict + 0x26f (0x55a8f9925b0f in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #51: _PyObject_Call_Prepend + 0x63 (0x55a8f992a6a3 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #52: PyObject_Call + 0x3e (0x55a8f992554e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #53: <unknown function> + 0x16ba91 (0x55a8f997fa91 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #54: _PyObject_FastCallDict + 0x8b (0x55a8f992592b in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #55: <unknown function> + 0x19857e (0x55a8f99ac57e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #56: _PyEval_EvalFrameDefault + 0x30a (0x55a8f99d138a in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #57: <unknown function> + 0x71e1 (0x7f1af51ee1e1 in /home/hyper/.PyCharm2018.1/system/cythonExtensions/_pydevd_frame_eval_ext/pydevd_frame_evaluator.cpython-36m-x86_64-linux-gnu.so)
frame #58: <unknown function> + 0x1918e4 (0x55a8f99a58e4 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #59: _PyFunction_FastCallDict + 0x3da (0x55a8f99a6e6a in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #60: _PyObject_FastCallDict + 0x26f (0x55a8f9925b0f in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #61: _PyObject_Call_Prepend + 0x63 (0x55a8f992a6a3 in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #62: PyObject_Call + 0x3e (0x55a8f992554e in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
frame #63: _PyEval_EvalFrameDefault + 0x19ec (0x55a8f99d2a6c in /home/hyper/Documents/anaconda3/envs/allennlp/bin/python)
33 Answers

✔️Accepted Answer

@Hyperparticle @thomwolf @geniki While I wait for the results of Thor's runs, one thing that occurs to me is that your segfault may be happening because, when you upgraded PyTorch, the existing (installed) Apex binaries were no longer compatible. Try a full pip uninstall apex, then cd apex_repo_dir; rm -rf build; pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" . and see if the segfault persists.
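
Before and after the rebuild, it can help to see which compiled extension file Python would actually pick up. The following is a minimal sketch using only the standard library; the helper name extension_location is hypothetical, and the module name fused_layer_norm_cuda is taken from the stack trace in the question.

```python
import importlib.util


def extension_location(module_name):
    """Return the file a top-level module would be loaded from, or None if it is not installed."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


# Module name as it appears in the stack trace above.
location = extension_location("fused_layer_norm_cuda")
if location is None:
    print("fused_layer_norm_cuda not found; apex may have been built without --cuda_ext")
else:
    print("fused_layer_norm_cuda would be loaded from:", location)
```

After pip uninstall apex the helper should report nothing found; after the rebuild the path should point at the freshly installed copy rather than an old egg directory.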

Other Answers:

@mrdbourke I think you may have compiled apex without cuda support. You need to compile it with python setup.py install --cpp_ext --cuda_ext.

Me too - PyTorch 1.0.1, CUDA 10. It's not specific to pytorch-pretrained-BERT, the script below is enough for me:

import torch
import apex

input = torch.rand(3, 10).cuda()
fln = apex.normalization.FusedLayerNorm(10).cuda()
fln(input)

I solved the problem: it was the GCC version. It should be 4.9+, but Ubuntu 14.04 ships 4.8.
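
A small pre-build check along these lines could catch an old compiler before the extensions are compiled. This is a sketch under assumptions: the 4.9 threshold comes from the answer above, the helper names are hypothetical, and it only inspects whichever gcc is first on PATH, which may not be the compiler the build actually uses.

```python
import re
import subprocess

MIN_GCC = (4, 9)  # threshold reported in the answer above


def parse_gcc_version(text):
    """Extract (major, minor) from a version string such as '4.8.4' or '7'."""
    match = re.search(r"(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def gcc_is_new_enough(version, minimum=MIN_GCC):
    """True if a parsed (major, minor) version meets the minimum."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    try:
        # `gcc -dumpversion` prints just the version number.
        output = subprocess.run(
            ["gcc", "-dumpversion"], capture_output=True, text=True, check=True
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        output = ""  # gcc missing or failed to run
    version = parse_gcc_version(output)
    if gcc_is_new_enough(version):
        print("gcc version looks OK:", version)
    else:
        print("gcc is missing or older than 4.9:", version)
```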

Whew, this is a useful gotcha to know about. Good old emergency repair procedure number one: turn it off and on again. Glad people seem to be happy, especially since, as I said, I don't have the bandwidth to do a deep-dive debug right this second.

Note to self: make the setup.py smarter to avoid such cases in the future.