ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

ACL 2022

code: [link]

0. Abstract (translated)

Charts are very popular for data analysis. When exploring charts, people often ask complex reasoning questions that involve multiple logical and arithmetic operations. They also frequently refer to a chart's visual features in their questions. However, most existing datasets do not focus on such complex reasoning questions, because their questions are template-based and their answers come from a fixed vocabulary. In this work, we present a large-scale benchmark covering 9.6K human-written questions plus 23.1K questions generated from human-written chart summaries. To address the unique challenges of visual and logical reasoning over charts in our benchmark, we propose two Transformer-based models that combine a chart's visual features and its data table in a unified way to answer questions. While our models achieve state-of-the-art results on previous datasets as well as on our benchmark, the evaluation also reveals several remaining challenges in answering complex reasoning questions.

A quick gripe: skimming the section headings and figures, the paper seems riddled with holes, and many of the numbers in the figures appear to be computed incorrectly... Maybe I'm misreading; I'll withhold judgment until I've read it in full.

1. INTRODUCTION

To analyze data, people ask complex reasoning questions over the data in a chart, involving arithmetic and logical operations. Such questions demand substantial perceptual and cognitive effort. Figure 1, for example, requires computing the difference between the two lines for every year and finding the year with the largest gap.

A Chart Question Answering (ChartQA) system takes a chart and a natural-language question as input and predicts the answer. Unlike text-based QA, the chart in ChartQA is a visual representation, and a reader's attention tends to go to salient features such as trends and outliers. Accordingly, readers are also more inclined to ask questions about visual attributes; Q2 in Figure 1, for instance, asks about the peak of the "orange" line.

Interest in ChartQA has been growing, but existing datasets have several limitations:

  1. The questions are generated automatically from templates and lack naturalness;
  2. The charts are all rendered with tools such as Matplotlib and cannot reflect the diversity of real-world charts;
  3. In most datasets, the answers come from a small fixed vocabulary (e.g., limited to axis labels, yes, no). Such answers ignore the fact that complex reasoning questions require extensive data operations such as aggregation and comparison.

Since most datasets only support fixed-vocabulary questions, existing models typically treat the task as classification and rely on dynamic encoding, where questions and answers are encoded with the spatial positions of chart elements (e.g., x-axis-label-1). This approach fails when the OCR model makes errors or when the question refers to a chart element by a synonym. PlotQA attempts to support open-vocabulary questions by using TableQA, but it ignores the chart's visual features.

To address these limitations, this paper presents a large-scale benchmark covering 9,608 human-written questions focused on logical and visual reasoning. To save cost, another 23,111 questions are generated automatically with a T5 model from human-written chart summaries (preserving rich linguistic variation), and a subset is manually verified for quality. The benchmark consists of 20,882 charts from 4 different online sources, ensuring diversity in visual style and topic.

The pipeline first uses the ChartOCR model to extract the chart's underlying data table. This data, together with visual features of the chart extracted by a neural network, is then fed in a unified way into two Transformer-based QA models. The models achieve SotA results on previous datasets as well as on the new benchmark.

In summary, the authors' main contributions:

  1. A large-scale ChartQA dataset with real-world charts and human-authored question-answer pairs.
  2. A pipeline that feeds visual features and data automatically extracted from charts into Transformer-based QA models, achieving SotA results.
  3. An extensive analysis of the proposed models' performance.

Existing Datasets

ChartQA differs from previous datasets in two respects:

  1. question 类型:human-authored vs. template-based
  2. chart 来源:real-world vs. generated using a tool

As shown in Table 1, FigureQA, DVQA, LEAF-QA, and LEAFQA++ all draw their charts with the same tool, generate questions from templates, and take answers from a fixed vocabulary. FigureQA and DVQA plot synthetic data, while LEAF-QA and LEAFQA++ plot real data. PlotQA is the only dataset with open-vocabulary questions that require aggregation operations over the underlying data, but its questions are still template-generated, it has no visual reasoning questions, and its charts are rendered with software. The authors found no existing large-scale ChartQA dataset of this kind, which motivates this work.

Existing Models

Work on ChartQA falls into two categories:

  1. Classification-based visual QA models

    These can only handle fixed-vocabulary questions.

    An encoder encodes the question and the chart image. Before the classification layer, an attention mechanism fuses the encoded question and chart features. These models usually use dynamic encoding, which ties the question encoding to the positions of the text elements in the chart image, and therefore easily picks up OCR noise.

  2. Table QA methods

    These methods assume the table data is given, or extract it from the chart image with vision techniques.

Chart Data Extraction

Previously proposed semi- or fully-automatic systems extract data from chart images, but they rely on assorted heuristics, do not transfer to real-world data, and remain limited in performance.

A system published at WACV extracts data fully automatically from real-world charts with good results. However, that model only predicts the labeled raw data values and does not associate them with the legend and axes. This paper extends the WACV pipeline to extract a fully structured data table, which is then passed to the proposed models.

3. ChartQA Datasets

3.1 Data Collection & Preparation

The charts come from four chart websites covering different topics and diverse styles. Wherever a site also exposed the underlying data table, that content was scraped as well.

3.2 Data Annotation

Two main annotation methods:

  1. Collecting human-written QA pairs with AMT (Amazon Mechanical Turk).

    Annotators write compositional questions (involving at least two operations) and visual questions.

  2. Generating QA pairs from the human-written chart summaries on Statista.

Dataset splits:

3.3 Dataset Analysis

As shown in Table 4, the benchmark's questions fall into 4 categories:

4. Method

4.1 Problem Formulation & Data Extraction

The full ChartQA pipeline is shown in Figure 2:

Two problem settings are considered:

  1. The chart's underlying data table is assumed to be available;

    Given a dataset of N examples $\mathcal{D} = \{c_i, t_i, q_i, a_i\}_{i=1}^{N}$, where $c_i$ denotes the chart image, $t_i$ the underlying data table, $q_i$ a question about $c_i$, and $a_i$ the answer to the question.

    ChartQA predicts $a_i$ from $c_i$, $t_i$, and $q_i$.

  2. The data table $t_i$ is extracted from the chart image $c_i$ with the SotA ChartOCR model.

    ChartOCR first uses key-point detection networks to locate the chart image's key elements (e.g., plot area, title) and the data-encoding marks (e.g., bars). It then uses the detected key points of each mark together with the axis labels to estimate the mark's data value. However, it does not associate the predicted values with the corresponding text labels (e.g., x-axis labels). The authors extend the method to output a fully structured data table: the CRAFT model recognizes the text of chart elements, and position and colour information are then used to associate data values with text labels.
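The label-association step above can be sketched roughly as follows. This is a minimal illustration with hypothetical inputs, matching each detected value to the nearest label by horizontal position only; the paper's pipeline also uses colour to match series to legend entries.

```python
# Hypothetical sketch: associate each detected bar value with the nearest
# x-axis label by horizontal distance. Inputs and function name are
# illustrative, not the authors' code.

def associate_values_with_labels(bars, labels):
    """bars: list of (x_center, value); labels: list of (x_center, text)."""
    table = {}
    for bar_x, value in bars:
        nearest = min(labels, key=lambda lab: abs(lab[0] - bar_x))
        table[nearest[1]] = value
    return table

bars = [(10.0, 3.5), (52.0, 7.1), (95.0, 2.4)]
labels = [(11.0, "2019"), (50.0, "2020"), (93.0, "2021")]
print(associate_values_with_labels(bars, labels))
# {'2019': 3.5, '2020': 7.1, '2021': 2.4}
```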

4.2 Models

This paper builds ChartQA on two SotA TableQA models: T5 and TAPAS. TableQA takes $q_i$ and $t_i$ as input; ChartQA additionally takes visual information distilled from the chart image. T5 has a vision variant, VL-T5, but TAPAS does not, so the authors extend TAPAS to take visual features into account and call the result VisionTaPas.

  • T5

    T5 is an encoder-decoder model that unifies NLP tasks as text-to-text generation with a single architecture and loss function. It is pre-trained on large unlabeled corpora with a self-supervised denoising objective. To fine-tune T5 on ChartQA, the data table is flattened and fed together with the question in the form "Question: <question tokens> Table: <flattened table tokens>", and the model is trained to generate the answer directly.
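The flattening step can be sketched as follows. Only the overall "Question: ... Table: ..." template comes from the paper; the exact separators and row handling here are assumptions.

```python
# Sketch (assumed format): flatten a table row by row and prepend the
# question, matching the "Question: ... Table: ..." template described above.

def build_t5_input(question, header, rows):
    flat = " ".join(header) + " " + " ".join(
        " ".join(str(cell) for cell in row) for row in rows
    )
    return f"Question: {question} Table: {flat}"

inp = build_t5_input(
    "What is the value for 2020?",
    ["Year", "Value"],
    [[2019, 3.5], [2020, 7.1]],
)
print(inp)
# Question: What is the value for 2020? Table: Year Value 2019 3.5 2020 7.1
```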

  • VL-T5

    VL-T5 extends T5 to unify vision-language (VL) tasks as text generation conditioned on multimodal input. The input consists of text tokens together with visual features of objects extracted from the image with Faster R-CNN. The model is pre-trained on several multimodal tasks, including language modeling, visual question answering, and visual grounding. The authors apply VL-T5 to ChartQA as follows:

    For the text input they follow the same approach as with T5: the chart's data table is flattened and concatenated with the question text. For the visual input they use Mask R-CNN (He et al., 2017) with a ResNet-101 backbone to extract visual features of the different marks (e.g., bars, lines) in the chart image. The original VL-T5 expects a fixed number of objects (36), whereas the number of marks varies from chart to chart; to resolve this, the extracted visual features are zero-padded to a fixed length of 36.
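The zero-padding trick is straightforward; here is a minimal NumPy sketch. The feature dimension is illustrative, and truncating when more than 36 marks are detected is my assumption.

```python
import numpy as np

# Sketch: pad a variable number of detected-mark features to a fixed 36
# slots with zeros, as described above. Truncation for >36 marks is an
# assumption of this sketch.

def pad_visual_features(feats, max_objects=36):
    """Pad (or truncate) an (n, d) feature matrix to (max_objects, d)."""
    n, d = feats.shape
    k = min(n, max_objects)
    padded = np.zeros((max_objects, d), dtype=feats.dtype)
    padded[:k] = feats[:k]
    return padded

feats = np.ones((12, 4), dtype=np.float32)  # 12 detected marks, 4-d features
print(pad_visual_features(feats).shape)  # (36, 4)
```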

  • TAPAS

    TAPAS (Herzig et al., 2020) extends the BERT (Devlin et al., 2019) architecture with extra positional embeddings for rows and columns in order to encode tables. As shown in Figure 3a, the model's input has the format: [CLS] question tokens [SEP] flattened table tokens. On top of BERT's segment and position embeddings, these tokens are encoded with table-specific positional embeddings. The model has two output heads: an aggregation-operation head and a cell-selection head. The aggregation-operation head predicts an operation (e.g., COUNT, SUM, AVERAGE, NONE) that is then applied to the cell values chosen by the cell-selection head. Depending on the operation, the selected cells either constitute the final answer or serve as input for inferring it.

    TaPas is first pre-trained on table-text pairs from Wikipedia with a masked language modeling task, in which table cells are randomly masked and the model is trained to predict them. It is then fine-tuned in a weakly supervised fashion (with only the answer as supervision) using an end-to-end differentiable objective.
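The two-head design described above can be illustrated with a toy post-processing step: given a predicted operation and the selected cell values, compute the final answer. The operation names follow the ones listed above; everything else is illustrative, not TaPas code.

```python
# Sketch: apply the predicted aggregation operation to the cells chosen
# by the cell-selection head, mirroring the two-head design described above.

def aggregate(op, cells):
    if op == "NONE":
        # the selected cells themselves are the answer
        return cells[0] if len(cells) == 1 else cells
    if op == "COUNT":
        return len(cells)
    if op == "SUM":
        return sum(cells)
    if op == "AVERAGE":
        return sum(cells) / len(cells)
    raise ValueError(f"unknown op: {op}")

print(aggregate("SUM", [3, 6, 8]))    # 17
print(aggregate("COUNT", [3, 6, 8]))  # 3
```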

  • VisionTaPas

    An extension of TaPas, consisting of three main components:

    a vision Transformer encoder that encodes the chart image; a TaPas encoder that encodes the question and data table; and a cross-modal encoder (Figure 3b).

  • Vision Transformer or ViT

    The Vision Transformer (ViT; Dosovitskiy et al., 2021) applies the Transformer encoder architecture (Vaswani et al., 2017) to vision tasks. Given a 2D chart image, the image is split into a sequence of 2D patches $\{p_1, \ldots, p_n\}$. Each patch is then flattened and linearly projected into a d-dimensional embedding vector. To incorporate the patches' positional information, learnable 1D position embeddings are added to the image features. An L-layer ViT encoder produces a sequence of embeddings $H = \{h^L_{cls}, h^L_1, \ldots, h^L_n\}$ representing the special [CLS] token and the image patches. The ViT module is initialized with the pre-trained weights of Dosovitskiy et al. (2021).

    The TaPas encoder is used as described above to encode the tokens of the question and data table. For an input token sequence $\{w_{cls}, w_1, \ldots, w_m\}$, an L-layer TaPas produces the corresponding encodings $Z = \{z^L_{cls}, z^L_1, \ldots, z^L_m\}$. This module is initialized with TaPas weights (Herzig et al., 2020) pre-trained on the WikiTQ dataset (Pasupat and Liang, 2015).

    The cross-modal encoder takes the outputs H and Z of the ViT and TaPas encoders and computes multimodal encodings. It has four blocks, each containing a visual branch and a textual-tabular branch. The input first passes through parallel multi-head cross-attention layers: in the visual branch the queries are the visual features and the keys and values are the textual-tabular features, while in the textual-tabular branch the roles are swapped. The cross-attended features then pass through a self-attention layer and a fully connected layer. As in the Transformer, each layer applies layer normalization (Ba et al., 2016) and has a residual connection. Finally, TaPas's aggregation-operation and cell-selection heads are attached to the last layer of the textual-tabular branch.
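One cross-attention step of the visual branch can be sketched in NumPy as follows. The dimensions and random projection matrices are purely illustrative (single head, no layer norm or residual), not the model's parameters.

```python
import numpy as np

# Minimal sketch of cross-attention from the visual branch: queries come
# from visual features H, keys/values from textual-tabular features Z.
# Projections are random stand-ins for learned weights.

def cross_attention(H, Z, d_k=8):
    rng = np.random.default_rng(0)  # fixed seed: stand-in for learned weights
    Wq = rng.standard_normal((H.shape[1], d_k))
    Wk = rng.standard_normal((Z.shape[1], d_k))
    Wv = rng.standard_normal((Z.shape[1], d_k))
    Q, K, V = H @ Wq, Z @ Wk, Z @ Wv
    scores = Q @ K.T / np.sqrt(d_k)          # (n_visual, n_text)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                        # (n_visual, d_k)

H = np.random.default_rng(1).standard_normal((5, 16))  # visual tokens
Z = np.random.default_rng(2).standard_normal((7, 16))  # text-table tokens
print(cross_attention(H, Z).shape)  # (5, 8)
```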

  • Extension to Other Operations

    Many questions in the ChartQA dataset require subtraction or ratio operations, which the original TaPas model does not support. The authors therefore extend the operation head with these two operations (Figure 3b). However, instead of training them with weak supervision from the final answer as in TaPas, they found it works better to provide more direct, if potentially noisy, cell-level supervision, generated from the training data with a few heuristics. For example, given the question "What is the difference between A and B?" with answer 5 and data values "3, 6, 8", they look for two values whose difference is 5 (i.e., 8 and 3). Although this can produce noisy supervision, similar approaches have successfully injected reasoning skills into neural models (Geva et al., 2020; Saxton et al., 2019); manual inspection of a random sample of 100 such questions showed the heuristics produce 24% noise. To handle fixed-vocabulary answers (e.g., "yes", "no"), the operation head is further extended with these classes.
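The operand-search heuristic above can be sketched directly. This is illustrative code, not the authors'; the tolerance and the exact search order are assumptions in the same spirit.

```python
from itertools import permutations

# Sketch of the heuristic described above: given the gold answer and the
# chart's data values, search for an ordered pair whose difference (or
# ratio) equals the answer, to generate noisy cell-level supervision.

def find_operands(answer, values, tol=1e-6):
    for a, b in permutations(values, 2):
        if abs((a - b) - answer) < tol:
            return ("DIFF", a, b)
        if b != 0 and abs(a / b - answer) < tol:
            return ("RATIO", a, b)
    return None  # no pair explains the answer

print(find_operands(5, [3, 6, 8]))  # ('DIFF', 8, 3)
```

As the 24%-noise figure suggests, such a search can pick the wrong pair when several pairs happen to produce the gold answer.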

5. Evaluation

5.1 Datasets, Baselines & Metrics

The proposed models are compared against PReFIL and PlotQA* on three prior datasets (FigureQA, PlotQA, DVQA) and on the proposed ChartQA dataset.

  • PReFIL is a classification approach that fuses question and image features in parallel and then passes the fused features to a final classification layer.

  • PlotQA* does not originally take visual features as input; the authors extend it to do so. Because its own data extraction does not cope well with real-world charts, the experiments use the extraction method from Section 4.1 to obtain the underlying data table.

The evaluation metric is relaxed: a numeric answer within 5% of the gold answer counts as correct. Non-numeric answers still require an exact match to be counted correct.
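The relaxed metric can be sketched as follows (illustrative; the case handling for a zero gold value and for whitespace/casing is my assumption):

```python
# Sketch of relaxed accuracy: numeric predictions count as correct within
# 5% of the gold answer; non-numeric answers require an exact match.

def is_correct(pred, gold, tolerance=0.05):
    try:
        p, g = float(pred), float(gold)
        if g == 0:
            return p == 0  # assumption: exact match when gold is zero
        return abs(p - g) / abs(g) <= tolerance
    except ValueError:
        return str(pred).strip().lower() == str(gold).strip().lower()

print(is_correct("10.3", "10"))  # True  (3% relative error)
print(is_correct("11", "10"))    # False (10% relative error)
print(is_correct("Yes", "yes"))  # True
```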

5.2 Results

5.3 Ablation Studies

5.4 Qualitative Analysis

6. Conclusion

This paper presents a new large-scale benchmark built from human-written questions focused on visual and logical reasoning. The evaluation shows the approach is promising, but also exposes several challenges unique to human-posed visual and logical reasoning questions, which exhibit informal, complex, and nuanced language.

To be added

7. 代码复现

7.1 环境配置

Having just gotten access to the SCIR HPC, I need to set up Anaconda and switch the conda and pip sources.

Switching the conda source

Write the following configuration into ~/.condarc:
vim ~/.condarc

Tsinghua mirror:

channels:
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
ssl_verify: true


Reference: https://zhuanlan.zhihu.com/p/87123943

Switching the pip source

Installing PyTorch is faster with pip3. To switch the pip source:

1. Create a .pip folder in the home directory

mkdir ~/.pip

2. Create the pip source config file inside the new .pip folder

touch ~/.pip/pip.conf

3. Open pip.conf with vim

vim ~/.pip/pip.conf

4. Add the following content (trusted-host should match the index host)

[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
[install]
trusted-host=pypi.tuna.tsinghua.edu.cn

5. Save and quit

:wq

That completes the pip source configuration.

Original post: https://blog.csdn.net/qq_44716044/article/details/123347432

7.2 Reproducing T5

qyfan@gpu12:~/code/ChartQA-main$ source activate
(base) qyfan@gpu12:~/code/ChartQA-main$ conda activate ChartQA
(ChartQA) qyfan@gpu12:~/code/ChartQA-main$ ls
ChartQA-Dataset Data-Extraction Figures-and-Examples LICENSE Models README.md
(ChartQA) qyfan@gpu12:~/code/ChartQA-main$ cd Models/T5/
(ChartQA) qyfan@gpu12:~/code/ChartQA-main/Models/T5$ sh train_test.sh
[nltk_data] Error loading punkt: <urlopen error [Errno 111] Connection
[nltk_data] refused>
12/09/2023 23:24:52 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
Downloading data files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 6693.04it/s]
Extracting data files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.16it/s]
Generating train split: 29999 examples [00:00, 87931.53 examples/s]
Generating validation split: 29999 examples [00:00, 114942.30 examples/s]
Generating test split: 29999 examples [00:00, 162474.66 examples/s]
/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/models/t5/tokenization_t5_fast.py:160: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
- To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
warnings.warn(
Running tokenizer on train dataset: 0%| | 0/29999 [00:00<?, ? examples/s]/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:3856: UserWarning: `as_target_tokenizer` is deprecated and will be removed in v5 of Transformers. You can tokenize your labels by using the argument `text_target` of the regular `__call__` method (either in the same call as your input texts if you use the same keyword arguments, or in a separate call.
warnings.warn(
Running tokenizer on train dataset: 100%|███████████████████████████████████████████████████████████████████████████| 29999/29999 [00:10<00:00, 2997.48 examples/s]
Running tokenizer on validation dataset: 100%|██████████████████████████████████████████████████████████████████████| 29999/29999 [00:09<00:00, 3049.45 examples/s]
Running tokenizer on prediction dataset: 100%|██████████████████████████████████████████████████████████████████████| 29999/29999 [00:09<00:00, 3022.47 examples/s]
0%| | 0/112500 [00:00<?, ?it/s][WARNING|logging.py:314] 2023-12-09 23:25:31,991 >> You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[W reducer.cpp:1346] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
{'loss': 1.2697, 'learning_rate': 9.955555555555556e-05, 'epoch': 0.13}
{'loss': 1.0826, 'learning_rate': 9.911111111111112e-05, 'epoch': 0.27}
1%|█▏ | 1076/112500 [03:08<5:3 1%|▉ | 1077/112500 [03:08<5:29:15, 5.64it/s]{'loss': 0.9864, 'learning_rate': 9.866666666666668e-05, 'epoch': 0.4}
{'loss': 0.9707, 'learning_rate': 9.822222222222223e-05, 'epoch': 0.53}
2%|█▊ | 2000/112500 [05:50<5:31:17, 5.56it/s]/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/generation/utils.py:1273: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
warnings.warn(
Traceback (most recent call last):██████████████████████████████████████████████████████████████████████████████| 1875/1875 [09:56<00:00, 3.92it/s]
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 647, in <module>
main()
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 569, in main
train_result = trainer.train(resume_from_checkpoint=checkpoint)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 1555, in train
return inner_training_loop(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 1922, in _inner_training_loop
self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 2271, in _maybe_log_save_evaluate
metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer_seq2seq.py", line 165, in evaluate
return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 3011, in evaluate
output = eval_loop(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 3304, in evaluation_loop
metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 541, in compute_metrics
decoded_preds, decoded_labels = postprocess_text(decoded_preds, decoded_labels)
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 525, in postprocess_text
preds = ["\n".join(nltk.sent_tokenize(pred)) for pred in preds]
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 525, in <listcomp>
preds = ["\n".join(nltk.sent_tokenize(pred)) for pred in preds]
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/nltk/tokenize/__init__.py", line 106, in sent_tokenize
tokenizer = load(f"tokenizers/punkt/{language}.pickle")
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/nltk/data.py", line 750, in load
opened_resource = _open(resource_url)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/nltk/data.py", line 876, in _open
return find(path_, path + [""]).open()
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/nltk/data.py", line 583, in find
raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource punkt not found.
Please use the NLTK Downloader to obtain the resource:

>>> import nltk
>>> nltk.download('punkt')

For more information see: https://www.nltk.org/data.html

Attempted to load tokenizers/punkt/PY3/english.pickle

Searched in:
- '/home/qyfan/nltk_data'
- '/home/qyfan/anaconda3/envs/ChartQA/nltk_data'
- '/home/qyfan/anaconda3/envs/ChartQA/share/nltk_data'
- '/home/qyfan/anaconda3/envs/ChartQA/lib/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- ''
**********************************************************************

2%|█▊ | 2000/112500 [15:47<14:32:20, 2.11it/s]
[2023-12-09 23:41:22,421] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 2406783) of binary: /home/qyfan/anaconda3/envs/ChartQA/bin/python
Traceback (most recent call last):
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 810, in <module>
main()
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 806, in main
run(args)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
run_T5.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-12-09_23:41:22
host : gpu12.cluster.com
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 2406783)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
(ChartQA) qyfan@gpu12:~/code/ChartQA-main/Models/T5$ python
Python 3.9.18 | packaged by conda-forge | (main, Aug 30 2023, 03:49:32)
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download("punkt")
[nltk_data] Error loading punkt: <urlopen error [Errno 111] Connection
[nltk_data] refused>
False
>>> exit()
(ChartQA) qyfan@gpu12:~/code/ChartQA-main/Models/T5$ python
Python 3.9.18 | packaged by conda-forge | (main, Aug 30 2023, 03:49:32)
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download("punkt")
[nltk_data] Error loading punkt: <urlopen error [Errno 111] Connection
[nltk_data] refused>
False
>>> srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd-gpu12: error: *** STEP 60242.0 ON gpu12 CANCELLED AT 2023-12-09T23:52:55 DUE TO TIME LIMIT ***
srun: error: gpu12: task 0: Killed
qyfan@hpc-login-01:~/code/ChartQA-main$ cd Models/
qyfan@hpc-login-01:~/code/ChartQA-main/Models$ cd T5/
qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ ls
predict_test.sh result run_T5.py t5-base train_test.sh
qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ wget https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/tokenizers/punkt.zip
--2023-12-09 23:53:58-- https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/tokenizers/punkt.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 0.0.0.0, ::
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|0.0.0.0|:443... failed: Connection refused.
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|::|:443... failed: Connection refused.
qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ pwd
/home/qyfan/code/ChartQA-main/Models/T5
qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ source activate
(base) qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ conda activate ChartQA
(ChartQA) qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ ls
predict_test.sh punkt result run_T5.py t5-base train_test.sh
(ChartQA) qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ python
Python 3.9.18 | packaged by conda-forge | (main, Aug 30 2023, 03:49:32)
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download("punkt")
[nltk_data] Error loading punkt: <urlopen error [Errno 111] Connection
[nltk_data] refused>
False
>>> nltk.data.find("tokenizers/punkt")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/nltk/data.py", line 583, in find
raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource punkt not found.
Please use the NLTK Downloader to obtain the resource:

>>> import nltk
>>> nltk.download('punkt')

For more information see: https://www.nltk.org/data.html

Attempted to load tokenizers/punkt

Searched in:
- '/home/qyfan/nltk_data'
- '/home/qyfan/anaconda3/envs/ChartQA/nltk_data'
- '/home/qyfan/anaconda3/envs/ChartQA/share/nltk_data'
- '/home/qyfan/anaconda3/envs/ChartQA/lib/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
**********************************************************************

>>> exit()
(ChartQA) qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ python
Python 3.9.18 | packaged by conda-forge | (main, Aug 30 2023, 03:49:32)
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.data.find("./punkt")
FileSystemPathPointer('/home/qyfan/nltk_data/punkt')
>>> exit()
(ChartQA) qyfan@hpc-login-01:~/code/ChartQA-main/Models/T5$ srun --gres=gpu:tesla_v100s-pcie-32gb:1 --pty bash -i
qyfan@gpu12:~/code/ChartQA-main/Models/T5$ source activate
(base) qyfan@gpu12:~/code/ChartQA-main/Models/T5$ conda activate ChartQA
(ChartQA) qyfan@gpu12:~/code/ChartQA-main/Models/T5$ sh predict_test.sh
model_arg: ModelArguments(model_name_or_path='t5-base', config_name=None, tokenizer_name=None, cache_dir=None, use_fast_tokenizer=True, model_revision='main', use_auth_token=False, resize_position_embeddings=None)
data_args: DataTrainingArguments(dataset_name=None, dataset_config_name=None, text_column='Input', summary_column='Output', train_file=None, validation_file=None, test_file='/home/qyfan/code/ChartQA-main/Figures-and-Examples/T5andVL-T5InputFileExamples.csv', overwrite_cache=False, preprocessing_num_workers=None, max_source_length=512, max_target_length=128, val_max_target_length=128, pad_to_max_length=False, max_train_samples=None, max_eval_samples=None, max_predict_samples=None, num_beams=None, ignore_pad_token_for_loss=True, source_prefix='')
training_args Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=False,
do_predict=True,
do_train=False,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=no,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
gradient_checkpointing_kwargs=None,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
include_tokens_per_second=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=result/runs/Dec10_00-28-05_gpu12,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
neftune_noise_alpha=None,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_torch,
optim_args=None,
output_dir=result,
overwrite_output_dir=False,
past_index=-1,
per_device_eval_batch_size=192,
per_device_train_batch_size=8,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=result,
save_on_each_node=False,
save_safetensors=True,
save_steps=500,
save_strategy=steps,
save_total_limit=None,
seed=42,
skip_memory_metrics=True,
sortish_sampler=False,
split_batches=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
12/10/2023 00:28:05 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/models/t5/tokenization_t5_fast.py:160: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
- To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
warnings.warn(
[WARNING|logging.py:314] 2023-12-10 00:28:10,765 >> You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 157/157 [10:07<00:00, 2.75s/it]Traceback (most recent call last):
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 650, in <module>
main()
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 605, in main
predict_results = trainer.predict(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer_seq2seq.py", line 228, in predict
return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 3087, in predict
output = eval_loop(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 3304, in evaluation_loop
metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 537, in compute_metrics
decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3706, in batch_decode
return [
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3707, in <listcomp>
self.decode(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3746, in decode
return self._decode(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 625, in _decode
text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
OverflowError: out of range integral type conversion attempted
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 157/157 [10:08<00:00, 3.87s/it]
[2023-12-10 00:38:28,013] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 2410039) of binary: /home/qyfan/anaconda3/envs/ChartQA/bin/python
Traceback (most recent call last):
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 810, in <module>
main()
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 806, in main
run(args)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
run_T5.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-12-10_00:38:28
host : gpu12.cluster.com
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 2410039)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
(ChartQA) qyfan@gpu12:~/code/ChartQA-main/Models/T5$

bug2

(ChartQA) qyfan@gpu12:~/code/ChartQA-main/Models/T5$ sh predict_test.sh 
model_arg: ModelArguments(model_name_or_path='t5-base', config_name=None, tokenizer_name=None, cache_dir=None, use_fast_tokenizer=True, model_revision='main', use_auth_token=False, resize_position_embeddings=None)
data_args: DataTrainingArguments(dataset_name=None, dataset_config_name=None, text_column='Input', summary_column='Output', train_file=None, validation_file=None, test_file='/home/qyfan/code/ChartQA-main/Figures-and-Examples/T5andVL-T5InputFileExamples.csv', overwrite_cache=False, preprocessing_num_workers=None, max_source_length=128, max_target_length=128, val_max_target_length=128, pad_to_max_length=True, max_train_samples=None, max_eval_samples=None, max_predict_samples=None, num_beams=None, ignore_pad_token_for_loss=True, source_prefix='summarize: ')
training_args Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=False,
do_predict=True,
do_train=False,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=no,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
gradient_checkpointing_kwargs=None,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
include_tokens_per_second=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=result/runs/Dec10_11-00-58_gpu12,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
neftune_noise_alpha=None,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_torch,
optim_args=None,
output_dir=result,
overwrite_output_dir=False,
past_index=-1,
per_device_eval_batch_size=192,
per_device_train_batch_size=8,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=result,
save_on_each_node=False,
save_safetensors=True,
save_steps=500,
save_strategy=steps,
save_total_limit=None,
seed=42,
skip_memory_metrics=True,
sortish_sampler=False,
split_batches=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
12/10/2023 11:00:58 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/models/t5/tokenization_t5_fast.py:160: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
- To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
warnings.warn(
Running tokenizer on prediction dataset: 0%| | 0/29999 [00:00<?, ? examples/s]/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:3856: UserWarning: `as_target_tokenizer` is deprecated and will be removed in v5 of Transformers. You can tokenize your labels by using the argument `text_target` of the regular `__call__` method (either in the same call as your input texts if you use the same keyword arguments, or in a separate call.
warnings.warn(
Running tokenizer on prediction dataset: 100%|██████████████████████████████████████████████████████████████████████| 29999/29999 [00:13<00:00, 2177.59 examples/s]
[WARNING|logging.py:314] 2023-12-10 11:01:17,020 >> You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 157/157 [09:34<00:00, 3.11s/it]Traceback (most recent call last):
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 654, in <module>
main()
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 609, in main
predict_results = trainer.predict(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer_seq2seq.py", line 228, in predict
return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 3087, in predict
output = eval_loop(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/trainer.py", line 3304, in evaluation_loop
metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
File "/home/qyfan/code/ChartQA-main/Models/T5/run_T5.py", line 541, in compute_metrics
decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3706, in batch_decode
return [
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3707, in <listcomp>
self.decode(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3746, in decode
return self._decode(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 625, in _decode
text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
OverflowError: out of range integral type conversion attempted
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 157/157 [09:35<00:00, 3.66s/it]
[2023-12-10 11:11:00,433] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 2441501) of binary: /home/qyfan/anaconda3/envs/ChartQA/bin/python
Traceback (most recent call last):
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 810, in <module>
main()
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 806, in main
run(args)
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/qyfan/anaconda3/envs/ChartQA/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
run_T5.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-12-10_11:11:00
host : gpu12.cluster.com
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 2441501)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
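Both runs die at the same place: `compute_metrics` hands `all_preds` straight to `tokenizer.batch_decode`, and the fast (Rust) tokenizer raises `OverflowError: out of range integral type conversion attempted`. A likely cause is the `-100` sentinel that the HF `Trainer` uses to pad labels (and, when concatenating generation outputs across batches, predictions): `-100` is not a valid token id and cannot be converted to the tokenizer's unsigned id type. A common workaround is to map `-100` back to `pad_token_id` before decoding. The helper below is a minimal sketch under that assumption; the function name is illustrative, not part of `run_T5.py`:

```python
import numpy as np

IGNORE_INDEX = -100  # sentinel HF Trainer uses to mask padded positions


def sanitize_token_ids(ids, pad_token_id=0):
    """Replace the -100 sentinel with a real pad id so decode() won't overflow."""
    ids = np.asarray(ids)
    return np.where(ids != IGNORE_INDEX, ids, pad_token_id)


# Inside compute_metrics, before decoding (sketch):
#   preds = sanitize_token_ids(preds, tokenizer.pad_token_id)
#   decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
#   labels = sanitize_token_ids(labels, tokenizer.pad_token_id)
#   decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
```

If this is indeed the cause, applying the same replacement to the labels (which are always padded with `-100` when `ignore_pad_token_for_loss=True`) avoids the same crash on the reference side.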