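Azure audio Whisper (preview) example

Note: A newer version of the openai library is available. See https://github.com/openai/openai-python/discussions/742

This example shows how to use the Azure OpenAI Whisper model to transcribe audio files.

Setup

First, we install the necessary dependencies.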

! pip install "openai>=0.28.1,<1.0.0"
! pip install python-dotenv

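Next, we import our libraries and configure the Python OpenAI SDK to work with the Azure OpenAI service.

Note: In this example, we configure the library to use the Azure API by setting variables in code. For development, consider setting the following environment variables instead: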

OPENAI_API_BASE
OPENAI_API_KEY
OPENAI_API_TYPE
OPENAI_API_VERSION
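
As a minimal sketch, assuming the four variable names listed above, a small helper can report which of them are unset before the SDK is configured. The name `missing_env_vars` is hypothetical and is not part of the openai library.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the list above.
REQUIRED_VARS = (
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_TYPE",
    "OPENAI_API_VERSION",
)


def missing_env_vars(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment; passing a dictionary makes the helper easy to try out without touching real settings.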

import os
import dotenv
import openai


dotenv.load_dotenv()

True

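To access the Azure OpenAI service, we first need to create the appropriate resources in the Azure Portal (a detailed guide on how to do this is available in the Microsoft Docs).

Once the resource is created, the first thing we need is its endpoint. You can find the endpoint under "Keys and Endpoint" in the "Resource Management" section. With the endpoint in hand, we set up the SDK: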

openai.api_base = os.environ["OPENAI_API_BASE"]

# Minimum API version that supports Whisper
openai.api_version = "2023-09-01-preview"

# Enter the deployment ID to use for the Whisper model.
deployment_id = "<deployment-id-for-your-whisper-model>"
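
The comment above names a minimum API version. A rough sketch of checking a configured version against it follows. It assumes that API version strings start with a `YYYY-MM-DD` date and that later-dated versions keep Whisper support; neither is guaranteed by this example. The helper names are hypothetical.

```python
from datetime import date


def api_version_date(version: str) -> date:
    """Parse the leading YYYY-MM-DD part of an API version string."""
    return date.fromisoformat(version[:10])


def supports_whisper(version: str, minimum: str = "2023-09-01-preview") -> bool:
    """Return True if `version` is dated on or after `minimum` (assumed rule)."""
    return api_version_date(version) >= api_version_date(minimum)
```

For instance, `supports_whisper(openai.api_version)` could be used as an early sanity check before making any request.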

Authentication

The Azure OpenAI service supports multiple authentication mechanisms, including API keys and Azure credentials.

# Set to True if using Azure Active Directory authentication.
use_azure_active_directory = False

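Authentication using API key

To set up the OpenAI SDK to use an Azure API key, we need to set api_type to azure and api_key to a key associated with your endpoint (you can find this key under "Keys and Endpoint" in the "Resource Management" section of the Azure Portal).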

if not use_azure_active_directory:
    openai.api_type = 'azure'
    openai.api_key = os.environ["OPENAI_API_KEY"]

Authentication using Azure Active Directory

Now let's look at how to obtain a token through Azure Active Directory authentication.

from azure.identity import DefaultAzureCredential

if use_azure_active_directory:
    default_credential = DefaultAzureCredential()
    token = default_credential.get_token("https://cognitiveservices.azure.com/.default")

    openai.api_type = 'azure_ad'
    openai.api_key = token.token

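A token is valid for a limited period, after which it expires. To make sure a valid token is sent with every request, you can refresh an expiring token by hooking into requests.auth: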

import typing
import time
import requests

if typing.TYPE_CHECKING:
    from azure.core.credentials import AccessToken, TokenCredential

class TokenRefresh(requests.auth.AuthBase):
    """Attach a bearer token to each request, refreshing it shortly before it expires."""

    def __init__(self, credential: "TokenCredential", scopes: typing.List[str]) -> None:
        self.credential = credential
        self.scopes = scopes
        # Holds the full token object (token string plus expiry), not just the string.
        self.cached_token: typing.Optional["AccessToken"] = None

    def __call__(self, req):
        # Fetch a new token if none is cached or the cached one expires within 5 minutes.
        if not self.cached_token or self.cached_token.expires_on - time.time() < 300:
            self.cached_token = self.credential.get_token(*self.scopes)
        req.headers["Authorization"] = f"Bearer {self.cached_token.token}"
        return req

if use_azure_active_directory:
    session = requests.Session()
    session.auth = TokenRefresh(default_credential, ["https://cognitiveservices.azure.com/.default"])

    openai.requestssession = session
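
The refresh rule used by TokenRefresh can be tried out without Azure or network access. The sketch below repeats the same 300-second margin check against a stand-in credential. `FakeToken`, `FakeCredential`, and `CachedToken` are hypothetical names for illustration only and are not part of azure-identity or openai.

```python
import time
from collections import namedtuple

# Stand-in for the token object a credential returns (hypothetical).
FakeToken = namedtuple("FakeToken", ["token", "expires_on"])


class FakeCredential:
    """Hypothetical credential that issues tokens with a fixed lifetime."""

    def __init__(self, lifetime_seconds: int) -> None:
        self.lifetime_seconds = lifetime_seconds
        self.calls = 0

    def get_token(self, *scopes: str) -> FakeToken:
        self.calls += 1
        return FakeToken(f"token-{self.calls}", time.time() + self.lifetime_seconds)


class CachedToken:
    """Reuse a token until it is within `margin` seconds of expiry."""

    def __init__(self, credential, scopes, margin: int = 300) -> None:
        self.credential = credential
        self.scopes = scopes
        self.margin = margin
        self.cached = None

    def get(self) -> str:
        if self.cached is None or self.cached.expires_on - time.time() < self.margin:
            self.cached = self.credential.get_token(*self.scopes)
        return self.cached.token


# A one-hour token is fetched once and then reused.
long_lived = CachedToken(FakeCredential(3600), ["scope"])
first, second = long_lived.get(), long_lived.get()

# A 60-second token is always inside the 300-second margin, so it is refetched.
short_lived = CachedToken(FakeCredential(60), ["scope"])
third, fourth = short_lived.get(), short_lived.get()
```

The margin trades a few extra token requests for a lower chance of sending a token that expires while the request is in flight.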

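Audio transcription

Audio transcription, or speech-to-text, is the process of converting spoken words into text. Use the openai.Audio.transcribe method to transcribe an audio file stream to text.

You can get sample audio files from the Azure AI Speech SDK repository on GitHub.

# Download a sample audio file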
import requests

sample_audio_url = "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/sampledata/audiofiles/wikipediaOcelot.wav"
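audio_file = requests.get(sample_audio_url)
# Stop early if the download failed, instead of saving an error page as audio.
audio_file.raise_for_status()
with open("wikipediaOcelot.wav", "wb") as f:
    f.write(audio_file.content)

# Open the file in a context manager so the handle is closed after the request.
with open("wikipediaOcelot.wav", "rb") as audio:
    transcription = openai.Audio.transcribe(
        file=audio,
        model="whisper-1",
        deployment_id=deployment_id,
    )
print(transcription.text)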
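
Before sending a file for transcription, it can be useful to confirm that it is a readable WAV file and to see how long it is. The following is a minimal sketch using Python's standard wave module, which only reads uncompressed PCM WAV; whether a given sample file is encoded that way is an assumption here. The helper name `wav_duration_seconds` is hypothetical.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Write one second of 16 kHz, 16-bit mono silence to exercise the helper.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "silence_demo.wav")
    with wave.open(demo_path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * 16000)

    duration = wav_duration_seconds(demo_path)
```

The same helper could be pointed at the downloaded wikipediaOcelot.wav; a wave.Error raised on open would suggest the file is not plain PCM WAV or the download was incomplete.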
