
Problem reproducing results #4

Open
Zhou2019 opened this issue Mar 25, 2023 · 9 comments

@Zhou2019
First, thanks for your great work!

I am trying to reproduce the results of your paper using your code, but I encountered some problems. I hope you can help me solve them. Thank you for your time and attention.

1. In line 47 of the file train.py, there is this snippet:

   ```python
   if i == 20:
       break
   ```

   This code needs to be commented out; otherwise the model is undertrained and performs very poorly. Could you explain why you added this code and how it affects the training process? I guess this snippet was left in for debugging?

2. The open-source model structure is inconsistent with the paper and with the released pre-trained model. Specifically, the encoder_feat part of the model weights released by the author contains three convolutional layers:

   ```
   (conv3d): ModuleList(
     (0): Conv3d(512, 1024, kernel_size=(2, 3, 3), stride=(1, 1, 1), padding=(0, 1, 1))
     (1): Conv3d(512, 1024, kernel_size=(2, 3, 3), stride=(1, 1, 1), padding=(0, 1, 1))
     (2): Conv3d(512, 1024, kernel_size=(2, 3, 3), stride=(1, 1, 1), padding=(0, 1, 1))
   )
   ```

   But this part does not exist in the model defined in the code. Therefore, even after commenting out the code from the first problem, the model's performance still does not reach the results in the author's paper. Could you provide the correct model structure code for the paper, or explain how to reproduce the paper's results?
   Thanks a lot!!

@Chen-Yang-Liu (Owner)

Thank you for your interest in our work.

1. Response to Question 1: You are right. Those two lines of code were used to confirm the training process was correct before we released the code; you can delete them now.

2. Response to Question 2: Based on our public code and model, I ran the eval.py file and got the following scores:

   [screenshot: eval.py evaluation scores]

   As you can see, the released model scores higher than the numbers in our paper, because the paper reports the average over five training runs.

In fact, while running the code, we did not see any warnings about loading the pretrained model. To help you reproduce the results above, we provide our preprocessed data that you can try (download links: [Google Drive]; [Baidu Pan] (code: nq9y)).

@Zhou2019 (Author)

Thanks for your response!

About question 2: Yes, there is no error when loading the model. PyTorch has two ways of saving: saving the entire network, or saving only the network parameters (the state_dict). The author saved the entire network, which stores the network structure, so the checkpoint runs even though its structure differs from the model structure in the code.
If you print the model structure from the training weights, you will find that its encoder_feat component has three more convolution layers than the model structure in the code.
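For reference, a minimal sketch of the two save modes being discussed (the model and file names here are placeholders for illustration, not the repository's actual ones):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # placeholder model purely for illustration

# Way 1: save the entire network. This pickles the class and its structure,
# so loading restores the structure stored in the checkpoint even if the
# class definition in the current code has diverged.
torch.save(model, 'checkpoint_full.pth')
restored = torch.load('checkpoint_full.pth')

# Way 2: save only the parameters (state_dict) -- the portable, usual way.
torch.save(model.state_dict(), 'checkpoint_sd.pth')
model.load_state_dict(torch.load('checkpoint_sd.pth'))
```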

The printed structure of the pre-trained model is attached: pretrain_model_acrch.txt

At line 537 of that file, you can see the three convolution layers that differ from the code's model. I could not find any code associated with them in the model file, so I guess this is why I could not reproduce the author's results.

Thanks a lot again!

@Chen-Yang-Liu (Owner)


I remembered the reason: once, while exploring the model structure, I defined some 3D convolutions in the initialization function, but we did not use them in the forward pass. Therefore, these 3D convolutions exist in our saved model structure. In my opinion, you can ignore their weights when loading the model and still get correct results.
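A minimal sketch of one way to ignore those unused weights, assuming the checkpoint stores the full network as described above (the checkpoint file name and model-construction call are hypothetical placeholders):

```python
import torch

# Hypothetical names -- substitute the repository's actual checkpoint and model.
full_checkpoint = torch.load('pretrained_full_model.pth')  # whole saved network
model = build_model_from_code()  # however the repo constructs its model

# strict=False skips keys that exist only in the checkpoint,
# e.g. the unused encoder_feat.conv3d.* weights.
missing, unexpected = model.load_state_dict(full_checkpoint.state_dict(), strict=False)
print('ignored checkpoint keys:', unexpected)
```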

@Zhou2019 (Author) commented Mar 26, 2023

Thanks for your reply.

I have reproduced the result in your paper. I guess the random seed was not fixed, which led to a bad result on my first attempt.
The following is the result of my reproduction:

```
['Bleu_1', 'Bleu_2', 'Bleu_3', 'Bleu_4'] [0.9441283734498986, 0.9350939664945913, 0.9295561396083202, 0.9255503767000808]
METEOR 0.7088351869561866
ROUGE_L 0.951196575871603
CIDEr 0.0
nochange_acc: 0.9367875647668393
change_metric:
['Bleu_1', 'Bleu_2', 'Bleu_3', 'Bleu_4'] [0.7669174160401347, 0.6236078299399817, 0.4933946274791389, 0.38481022734657844]
METEOR 0.25945607910671487
ROUGE_L 0.5328756478768945
CIDEr 0.6234487965908536
change_acc: 0.9128630705394191
.......................................................
['Bleu_1', 'Bleu_2', 'Bleu_3', 'Bleu_4'] [0.8504440043441608, 0.7665358703553551, 0.6927971568203759, 0.631244447164277]
METEOR 0.3970548039576966
ROUGE_L 0.7421445413527336
CIDEr 1.342158995158655
trans - beam size 1: BLEU-1 0.8504 BLEU-2 0.7665 BLEU-3 0.6928 BLEU-4 0.6312 METEOR 0.3971 ROUGE_L 0.7421 CIDEr 1.3422
```
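Since an unfixed seed seemed to matter here, a minimal seed-fixing sketch (standard PyTorch practice, not code from this repository):

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Fix the common sources of randomness for more repeatable runs."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade speed for determinism in cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)  # call once, before building the model and data loaders
```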

Thanks again for your great work and patient answers.

@Zhou2019 (Author)

I apologize for disturbing you again.
Did you encounter large variations in model performance during your experiments? I trained a model that reached the paper's metrics last time, but I couldn't reproduce it again.
Could you share any helpful log files or advice for reproduction?

Thanks a lot again.

Zhou2019 reopened this Mar 29, 2023
@Chen-Yang-Liu (Owner)


The problem you mentioned does exist, which is why I repeated the training five times and averaged the scores. From my point of view, this problem currently affects both image captioning and change captioning in the remote sensing field. I think it may be related to two factors: 1) there is a gap between the cross-entropy loss and the evaluation metrics; 2) compared with the image-text datasets of natural images, remote sensing image-text datasets are relatively small.

@Zhou2019 (Author)

Thanks a lot for your reply and advice.

@tuyunbin commented May 8, 2023

Hi Zhou @Zhou2019, how did you reproduce the results? I ran the code directly, without any modification, several times, but I only obtained much lower results, such as a CIDEr score around 90+. I ran the code on 3080 and 3090 GPUs and got similar results.

@TangZwei
We can set batch_size over 35 to achieve stable results.
