Hello,
Thank you for sharing your code.
I have a question regarding the PER calculation.
In the paper, Phoneme Error Rate (PER) is used to evaluate intelligibility instead of WER.
Could you please let me know which phoneme recognition model was used to obtain the phoneme transcriptions?
If possible, I would also appreciate it if you could share the checkpoint or any relevant details about the model (e.g., architecture or training data).
Thank you in advance for your help!