OpenAI Whisper and ChatGPT Voice Assistant

OpenAI Whisper and ChatGPT ASR Gradio Web UI

1. Environment setup
  1.1 Python
  1.2 Windows
2. Import the required packages
3. Load the model
4. Define the OpenAI and Whisper interfaces
5. Build the Gradio Web UI

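The app takes microphone input and shows three results: the ASR transcript of the input, the reply as text, and the reply as TTS audio.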

1. Environment setup

1.1 Python

gradio==3.19.1
gTTS==2.3.1
openai==0.27.0
openai-whisper==20230124
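
As a quick sanity check, the pins above can be compared with what is actually installed. The following is a minimal sketch that uses only the standard library's importlib.metadata; the helper name find_mismatches is illustrative and not part of any of these packages.

```python
from importlib import metadata

# Pins copied from the list above.
PINS = {
    "gradio": "3.19.1",
    "gTTS": "2.3.1",
    "openai": "0.27.0",
    "openai-whisper": "20230124",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

An empty result means every pinned package is installed at the listed version.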

1.2 Windows

Install ffmpeg with the following command:

choco install ffmpeg

A proxy or VPN may be needed; otherwise the connection times out.
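
Because whisper.load_audio calls the ffmpeg executable, it can help to confirm that ffmpeg is on the PATH before starting the app. A small sketch using the standard library's shutil.which; the function name ffmpeg_available is an illustrative choice.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if `executable` can be found on the current PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found on PATH; whisper.load_audio will fail")
```

If the check fails right after installing, opening a new terminal so the updated PATH is picked up is usually worth trying first.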

2. Import the required packages

import whisper
import gradio as gr
import time
import warnings
import json
import openai
import os
from gtts import gTTS

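3. Load the model

# Replace the placeholder with your own OpenAI API key.
openai.api_key = 'your-openai-api-key'
# Load the "base" Whisper checkpoint.
model = whisper.load_model("base")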
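
Hard-coding the key works for a quick test, but reading it from an environment variable keeps it out of the source file. The sketch below is one possible way to do that; the helper read_api_key and the variable name OPENAI_API_KEY are assumptions for illustration, not something the original script defines.

```python
import os


def read_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Usage in the script above (not executed here):
# openai.api_key = read_api_key()
```

Failing early with a clear message tends to be easier to debug than an authentication error on the first request.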

4. Define the OpenAI and Whisper interfaces

def chatgpt_api(input_text):
    # Every request starts with the same system prompt.
    messages = [
        {"role": "system", "content": "you are great!"}]
    if input_text:
        messages.append(
            {"role": "user", "content": input_text},
        )
    chat_completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo", messages=messages
    )
    reply = chat_completion.choices[0].message.content
    return reply


def transcribe(audio):
    language = "zh-CN"  # gTTS language code for the spoken reply
    # Load the recording and pad or trim it to Whisper's 30-second window.
    audio = whisper.load_audio(audio)
    audio = whisper.pad_or_trim(audio)
    mel = whisper.log_mel_spectrogram(audio).to(model.device)
    _, probs = model.detect_language(mel)  # probabilities are not used below
    options = whisper.DecodingOptions(fp16=False)
    result = whisper.decode(model, mel, options)
    result_text = result.text
    # Send the transcript to ChatGPT and convert the reply to speech.
    out_result = chatgpt_api(result_text)
    audioobj = gTTS(text=out_result, lang=language, slow=False)
    audioobj.save("Aria.mp3")
    return [result_text, out_result, "Aria.mp3"]
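
In transcribe, the language probabilities returned by model.detect_language are computed but never used, and the TTS language is fixed to "zh-CN". One possible refinement is to choose the gTTS language from those probabilities. The sketch below assumes a small hand-written mapping (WHISPER_TO_GTTS) and a helper name (pick_tts_language), both hypothetical; it is not what the original script does.

```python
# Hypothetical mapping from Whisper language codes to gTTS language codes.
# Only a few entries are shown; extend it as needed.
WHISPER_TO_GTTS = {"zh": "zh-CN", "en": "en", "ja": "ja"}


def pick_tts_language(probs, default="zh-CN"):
    """Pick a gTTS language code from Whisper's language probabilities.

    `probs` is a dict of language code to probability, like the second
    value returned by model.detect_language(mel).
    Falls back to `default` when `probs` is empty or the language is unmapped.
    """
    if not probs:
        return default
    detected = max(probs, key=probs.get)
    return WHISPER_TO_GTTS.get(detected, default)
```

Inside transcribe this could replace the fixed assignment, for example language = pick_tts_language(probs) placed after the detect_language call.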

5. Build the Gradio Web UI

# Output components: transcript, ChatGPT reply, and the synthesized audio.
output_1 = gr.Textbox(label="Speech to Text")
output_2 = gr.Textbox(label="ChatGPT Output")
output_3 = gr.Audio("Aria.mp3")

gr.Interface(
    title='OpenAI Whisper and ChatGPT ASR Gradio Web UI',
    fn=transcribe,
    inputs=[
        gr.inputs.Audio(source="microphone", type="filepath")
    ],
    outputs=[
        output_1, output_2, output_3
    ],
    live=True).launch()
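
With live=True the callback runs whenever the input changes, so it may be invoked with no audio at all (for example after the recording is cleared), in which case whisper.load_audio would fail. A minimal sketch of a guard, assuming that empty input arrives as None; the wrapper name skip_empty_audio is illustrative.

```python
def skip_empty_audio(handler, n_outputs=3):
    """Wrap `handler` so that a missing audio path yields empty outputs.

    Assumption: with live=True the callback may be invoked with None
    (for example after the recording is cleared).
    """
    def wrapped(audio):
        if not audio:
            # One None per output component; the handler is not called.
            return [None] * n_outputs
        return handler(audio)
    return wrapped


# Usage in the script above (not executed here):
# gr.Interface(fn=skip_empty_audio(transcribe), ...)
```

This also avoids sending an empty transcript to the ChatGPT API when nothing was recorded.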

Reference: https://github.com/bhattbhavesh91/voice-assistant-whisper-chatgpt
