Hi, I have a question about how to perform neural rescoring on Conformer-Transducer models, using the .tsv file generated by the script 'eval_beamsearch_ngram_transducer.py' (with the pred_output_folder flag set) as input to the script 'eval_neural_rescorer.py'. It seems that the number of lines in the .tsv file should be equal to num_eval_examples x beam_size, as it is in the .tsv file generated by 'eval_beamsearch_ngram.py' for CTC models, but it is not. The number of candidates is not the same for every text, so I suppose the eval_neural_rescorer.py script cannot know which candidates correspond to which text. Am I doing something wrong when executing 'eval_beamsearch_ngram_transducer.py'? (I have not modified the script.) Here are the candidates obtained for the first 4 texts, from a .tsv file for a CTC model using 'eval_beamsearch_ngram.py' and from another .tsv file for a Transducer model using 'eval_beamsearch_ngram_transducer.py'. (The results are in Basque and were produced on the same dataset.)
As you can see, the CTC model has 6 lines for every text, but the Transducer model has 3 lines for the first, 4 for the second, 4 for the third and 2 for the fourth. I hope you can help me with this issue.
Replies: 1 comment
Transducer beam search is not guaranteed to generate exactly the number of beams requested; the beam size is more of a maximum than a required number. This is due to the mAES algorithm, which performs pruning internally to speed up computation.
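One possible workaround, sketched below, is to pad each utterance's candidate list up to beam_size before the file is written, so that it again has num_eval_examples x beam_size lines. This is only an illustration and not something the scripts do: `pad_candidates`, the `(text, score)` tuple layout and the `pad_score` value are all hypothetical, and it assumes the candidates are still grouped per utterance, which is the case inside the decoding loop but not in the flat .tsv file.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (hypothesis text, score); this layout is an assumption


def pad_candidates(
    per_utterance: List[List[Candidate]],
    beam_size: int,
    pad_score: float = -1e4,  # hypothetical "very unlikely" score for filler rows
) -> List[Candidate]:
    """Return a flat list with exactly beam_size candidates per utterance."""
    rows: List[Candidate] = []
    for candidates in per_utterance:
        if not candidates:
            raise ValueError("every utterance needs at least one candidate")
        # Keep the best-scoring candidates first and never exceed beam_size.
        kept = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        # Fill the missing slots with copies of the best hypothesis at a low score.
        filler = (kept[0][0], pad_score)
        kept.extend([filler] * (beam_size - len(kept)))
        rows.extend(kept)
    return rows


if __name__ == "__main__":
    beams = [
        [("a", -1.0), ("b", -2.0), ("c", -3.0)],  # 3 candidates returned
        [("d", -0.5), ("e", -0.7)],               # 2 candidates returned
    ]
    for text, score in pad_candidates(beams, beam_size=4):
        print(f"{text}\t{score}")
```

The filler rows repeat the best hypothesis rather than an empty string, so even if a rescorer were to pick one of them, the selected text would be the same as the top candidate. Whether a given score value is safe for a particular rescoring setup is something to verify before relying on it.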