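I followed the Fine-tuning BERT guide to build a model with my own dataset (it is fairly large, over 20 GB), then re-encoded my data and loaded it from tf_record files.
The training_dataset I created has the same signature as the one in the guide: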
training_dataset.element_spec
({'input_word_ids': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None),
'input_mask': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None),
'input_type_ids': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None)},
TensorSpec(shape=(32,), dtype=tf.int32, name=None))
where batch_size is 32 and max_seq_length is 1024. As the guide says,
The resulting tf.data.Datasets return (features, labels) pairs, as expected by keras.Model.fit
Everything seems to work as expected (although the guide does not show how to use training_dataset). However, the following code
bert_classifier.fit(
x = training_dataset,
validation_data=test_dataset, # has the same signature just as training_dataset
batch_size=32,
epochs=epochs,
verbose=1,
)
fails with an error that seems strange to me:
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/captain/project/dataload/train.py", line 81, in <module>
verbose=1,
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
tmp_logs = self.train_function(iterator)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
result = self._call(*args, **kwds)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 871, in _call
self._initialize(args, kwds, add_initializers_to=initializers)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 726, in _initialize
*args, **kwds))
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2969, in _get_concrete_function_internal_garbage_collected
graph_function, _ = self._maybe_define_function(args, kwargs)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3361, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
out = weak_wrapped_fn().__wrapped__(*args, **kwds)
File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:805 train_function *
return step_function(self, iterator)
/home/captain/.local/lib/python3.7/site-packages/official/nlp/keras_nlp/layers/position_embedding.py:88 call *
return tf.broadcast_to(position_embeddings, input_shape)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py:845 broadcast_to **
"BroadcastTo", input=input, shape=shape, name=name)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
attrs=attr_protos, op_def=op_def)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py:592 _create_op_internal
compute_device)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:3536 _create_op_internal
op_def=op_def)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:2016 __init__
control_input_ops, op_def)
/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:1856 _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 512 and 1024 for '{{node bert_classifier/bert_encoder_1/position_embedding/BroadcastTo}} =
BroadcastTo[T=DT_FLOAT, Tidx=DT_INT32](bert_classifier/bert_encoder_1/position_embedding/strided_slice_1, bert_classifier/bert_encoder_1/position_embedding/Shape)'
with input shapes: [512,768], [3] and with input tensors computed as partial shapes: input[1] = [32,1024,768].
I have nothing to do with 512, and my code does not use 512 anywhere. So what is wrong with my code, and how can I fix it?
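To make the shapes in the error message easier to read, here is a minimal pure-Python sketch of the broadcast check that fails. It is not TensorFlow code and not a confirmed diagnosis: the helper `broadcast_shape_error` is hypothetical, and the idea that the [512, 768] tensor is a position-embedding table capped at 512 rows is an assumption based only on the shapes printed in the traceback (512 is the usual default for BERT's maximum number of positions).

```python
def broadcast_shape_error(source_shape, target_shape):
    """Return None if source_shape can broadcast to target_shape, else a message.

    Hypothetical helper: it mimics the usual trailing-dimension broadcasting rule.
    """
    if len(source_shape) > len(target_shape):
        return "source has more dimensions than target"
    # Align the source against the trailing dimensions of the target.
    offset = len(target_shape) - len(source_shape)
    for i, src in enumerate(source_shape):
        tgt = target_shape[offset + i]
        if src != 1 and src != tgt:
            return f"Dimensions must be equal, but are {src} and {tgt}"
    return None


# Values taken from the question and the traceback.
batch_size, max_seq_length, hidden_size = 32, 1024, 768
position_table_rows = 512  # assumption: rows in the position-embedding table

# Slicing a 512-row table to its first 1024 rows still yields 512 rows.
rows_after_slice = min(position_table_rows, max_seq_length)

error = broadcast_shape_error(
    (rows_after_slice, hidden_size),
    (batch_size, max_seq_length, hidden_size),
)
print(error)
```

Under these assumptions the 512 would come from the encoder's own configuration rather than from my code, which would explain why I cannot find it anywhere in train.py.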