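Examples
Here are some examples of how to use PandasAI. More examples are included in the repository, along with data samples.
Working with pandas dataframes
Using PandasAI with a Pandas DataFrame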
import os
from pandasai import SmartDataframe
import pandas as pd
# pandas dataframe
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
# convert to SmartDataframe
sdf = SmartDataframe(sales_by_country)
response = sdf.chat('Which are the top 5 countries by sales?')
print(response)
# Output: China, United States, Japan, Germany, United Kingdom
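Because the answer comes from an LLM, it can be worth cross-checking it against a deterministic computation. The following is a minimal sketch in plain Python (no PandasAI or API key required) that ranks the same sales figures; the helper name `top_n` is hypothetical and not part of PandasAI.

```python
# Same figures as the sales_by_country example, as plain Python data.
sales_by_country = {
    "United States": 5000,
    "United Kingdom": 3200,
    "France": 2900,
    "Germany": 4100,
    "Italy": 2300,
    "Spain": 2100,
    "Canada": 2500,
    "Australia": 2600,
    "Japan": 4500,
    "China": 7000,
}


def top_n(sales, n):
    """Return the n (country, sales) pairs with the highest sales, largest first."""
    return sorted(sales.items(), key=lambda item: item[1], reverse=True)[:n]


top5 = [country for country, _ in top_n(sales_by_country, 5)]
print(", ".join(top5))
# prints: China, United States, Japan, Germany, United Kingdom
```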
Working with CSVs
Example of using PandasAI with a CSV file
import os
from pandasai import SmartDataframe
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
# You can instantiate a SmartDataframe with a path to a CSV file
sdf = SmartDataframe("data/Loan payments data.csv")
response = sdf.chat("How many loans are from men and have been paid off?")
print(response)
# Output: 247 loans have been paid off by men.
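The examples hard-code a placeholder key for brevity. A possible alternative, sketched below, is to read the key from the environment and fail early when it is missing or still set to the placeholder. The helper `require_api_key` is hypothetical, and the sketch assumes the variable has already been exported in the shell or loaded from a .env file by whatever mechanism you use.

```python
import os


def require_api_key(name="PANDASAI_API_KEY"):
    """Return the key stored in environment variable `name`.

    Raises RuntimeError if the variable is unset, empty, or still the
    "YOUR_API_KEY" placeholder used in the examples.
    """
    value = os.environ.get(name, "").strip()
    if not value or value == "YOUR_API_KEY":
        raise RuntimeError(
            f"Set {name} in your environment or .env file before running the examples."
        )
    return value


# Demonstration only: a dummy value stands in for a real key here.
os.environ["PANDASAI_API_KEY"] = "dummy-key-for-demo"
print(require_api_key())
# prints: dummy-key-for-demo
```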
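Working with Excel files
Example of using PandasAI with an Excel file. In order to use Excel files as a data source, you need to install the pandasai[excel] extra dependency.
pip install pandasai[excel]
Then, you can use PandasAI with an Excel file as follows: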
import os
from pandasai import SmartDataframe
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
# You can instantiate a SmartDataframe with a path to an Excel file
sdf = SmartDataframe("data/Loan payments data.xlsx")
response = sdf.chat("How many loans are from men and have been paid off?")
print(response)
# Output: 247 loans have been paid off by men.
Working with Parquet files
Example of using PandasAI with a Parquet file
import os
from pandasai import SmartDataframe
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
# You can instantiate a SmartDataframe with a path to a Parquet file
sdf = SmartDataframe("data/Loan payments data.parquet")
response = sdf.chat("How many loans are from men and have been paid off?")
print(response)
# Output: 247 loans have been paid off by men.
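Working with Google Sheets
Example of using PandasAI with a Google Sheet. In order to use Google Sheets as a data source, you need to install the pandasai[google-sheet] extra dependency.
pip install pandasai[google-sheet]
Then, you can use PandasAI with a Google Sheet as follows: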
import os
from pandasai import SmartDataframe
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
# You can instantiate a SmartDataframe with a path to a Google Sheet
sdf = SmartDataframe("https://docs.google.com/spreadsheets/d/fake/edit#gid=0")
response = sdf.chat("How many loans are from men and have been paid off?")
print(response)
# Output: 247 loans have been paid off by men.
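Remember that, at the moment, you need to make sure the Google Sheet is public.
Working with Modin dataframes
Example of using PandasAI with a Modin DataFrame. In order to use Modin dataframes as a data source, you need to install the pandasai[modin] extra dependency.
pip install pandasai[modin]
Then, you can use PandasAI with a Modin DataFrame as follows: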
import os
import pandasai
from pandasai import SmartDataframe
import modin.pandas as pd
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})
pandasai.set_pd_engine("modin")
sdf = SmartDataframe(sales_by_country)
response = sdf.chat('Which are the top 5 countries by sales?')
print(response)
# Output: China, United States, Japan, Germany, United Kingdom
# you can switch back to pandas using
# pandasai.set_pd_engine("pandas")
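Working with Polars dataframes
Example of using PandasAI with a Polars DataFrame (still in beta). In order to use Polars dataframes as a data source, you need to install the pandasai[polars] extra dependency.
pip install pandasai[polars]
Then, you can use PandasAI with a Polars DataFrame as follows: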
import os
from pandasai import SmartDataframe
import polars as pl
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
# You can instantiate a SmartDataframe with a Polars DataFrame
sales_by_country = pl.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})
sdf = SmartDataframe(sales_by_country)
response = sdf.chat("Which are the top 5 countries by sales?")
print(response)
# Output: China, United States, Japan, Germany, United Kingdom
Plotting
Example of using PandasAI to plot a chart from a Pandas DataFrame
import os
from pandasai import SmartDataframe
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe("data/Countries.csv")
response = sdf.chat(
    "Plot the histogram of countries showing for each the gdp, using different colors for each bar",
)
print(response)
# Output: check out assets/histogram-chart.png
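Saving Plots with User Defined Path
You can pass a custom path to save the charts. The path must be a valid absolute path. Here is an example of saving charts to a user-defined location.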
import os
from pandasai import SmartDataframe
user_defined_path = os.getcwd()
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe("data/Countries.csv", config={
    "save_charts": True,
    "save_charts_path": user_defined_path,
})
response = sdf.chat(
    "Plot the histogram of countries showing for each the gdp,"
    " using different colors for each bar",
)
print(response)
# Output: check out $pwd/exports/charts/{hashid}/chart.png
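Since the chart path has to be absolute, one way to prepare it is to resolve the location with `pathlib` and create the directory up front. This is only a sketch: the helper `prepare_chart_dir` is hypothetical, and whether PandasAI creates missing directories on its own is not assumed here.

```python
import os
import tempfile
from pathlib import Path


def prepare_chart_dir(path):
    """Resolve `path` to an absolute directory, creating it if needed."""
    chart_dir = Path(path).expanduser().resolve()
    chart_dir.mkdir(parents=True, exist_ok=True)
    return str(chart_dir)


# Demonstration in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    user_defined_path = prepare_chart_dir(Path(tmp) / "charts")
    print(os.path.isabs(user_defined_path))  # prints: True
    print(os.path.isdir(user_defined_path))  # prints: True
```

The returned string could then be used as the value of "save_charts_path" in the config shown above.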
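Working with multiple dataframes (using the SmartDatalake)
Example of using PandasAI with multiple dataframes. In order to use multiple dataframes as a data source, you need to use a SmartDatalake instead of a SmartDataframe. You can instantiate a SmartDatalake as follows: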
import os
from pandasai import SmartDatalake
import pandas as pd
employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}
salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}
employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
lake = SmartDatalake([employees_df, salaries_df])
response = lake.chat("Who gets paid the most?")
print(response)
# Output: Olivia gets paid the most.
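Answering this question requires relating the two tables through their shared EmployeeID column. The sketch below reproduces that join in plain Python so the expected answer can be checked without an LLM; it is an illustration of the underlying lookup, not a description of how SmartDatalake works internally.

```python
employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}
salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}

# Build an EmployeeID -> Name lookup, then pair each salary with its name.
name_by_id = dict(zip(employees_data["EmployeeID"], employees_data["Name"]))
salary_by_name = {
    name_by_id[emp_id]: salary
    for emp_id, salary in zip(salaries_data["EmployeeID"], salaries_data["Salary"])
}

# Pick the name with the largest salary.
top_earner = max(salary_by_name, key=salary_by_name.get)
print(top_earner, salary_by_name[top_earner])
# prints: Olivia 7000
```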
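Working with Agent
With the chat agent, you can engage in dynamic conversations where the agent retains context throughout the discussion. This enables you to have more interactive and meaningful exchanges.
Key Features
- Context retention: The agent remembers the conversation history, allowing for seamless, context-aware interactions.
- Clarification questions: You can use the clarification_questions method to request clarification on any aspect of the conversation. This helps ensure you fully understand the information provided.
- Explanation: The explain method is available to obtain a detailed explanation of how the agent arrived at a particular solution or response. It offers transparency and insight into the agent's decision-making process.
Feel free to initiate conversations, seek clarifications, and explore explanations to enhance your interactions with the chat agent!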
import os
import pandas as pd
from pandasai import Agent
employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}
salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}
employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
agent = Agent([employees_df, salaries_df], memory_size=10)
query = "Who gets paid the most?"
# Chat with the agent
response = agent.chat(query)
print(response)
# Get Clarification Questions
questions = agent.clarification_questions(query)
for question in questions:
    print(question)
# Explain how the chat response is generated
response = agent.explain()
print(response)
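To give an intuition for context retention with a bounded memory, here is a minimal sketch of a rolling message buffer. The class `ConversationMemory` is hypothetical and is not PandasAI's implementation; in particular, it assumes the size limit counts individual messages, which may differ from how the Agent's memory_size parameter is interpreted.

```python
from collections import deque


class ConversationMemory:
    """Hypothetical rolling buffer that keeps only the most recent messages."""

    def __init__(self, memory_size):
        # A deque with maxlen silently drops the oldest item when full.
        self._messages = deque(maxlen=memory_size)

    def add(self, role, text):
        self._messages.append((role, text))

    def context(self):
        """Render the retained messages as text that could precede a new prompt."""
        return "\n".join(f"{role}: {text}" for role, text in self._messages)


memory = ConversationMemory(memory_size=2)
memory.add("user", "Who gets paid the most?")
memory.add("agent", "Olivia gets paid the most.")
memory.add("user", "Which department is she in?")
print(memory.context())
# prints:
# agent: Olivia gets paid the most.
# user: Which department is she in?
```

With a limit of two, the first question has already been dropped, which is why a follow-up such as "she" can only be resolved while the earlier answer is still in the buffer.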
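Description for an Agent
When you instantiate an agent, you can provide a description of the agent. This description is used to describe the agent in the chat and gives the LLM more context on how to respond to queries.
Some examples of descriptions can be:
- You are a data analysis agent. Your main goal is to help non-technical users to analyze data
- Act as a data analyst. Every time I ask you a question, you should provide the code to visualize the answer using plotly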
import os
from pandasai import Agent
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
agent = Agent(
    "data.csv",
    description="You are a data analysis agent. Your main goal is to help non-technical users to analyze data",
)
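Add Skills to the Agent
You can add custom functions to the agent, allowing it to expand its capabilities. These custom functions integrate seamlessly with the agent's skills, enabling a wide range of user-defined operations.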
import os
import pandas as pd
from pandasai import Agent
from pandasai.skills import skill
employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}
salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}
employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)
@skill
def plot_salaries(merged_df: pd.DataFrame):
    """
    Plots a bar chart with names on the x-axis and salaries on the y-axis using matplotlib and saves it to temp_chart.png
    """
    import matplotlib.pyplot as plt
    plt.bar(merged_df["Name"], merged_df["Salary"])
    plt.xlabel("Employee Name")
    plt.ylabel("Salary")
    plt.title("Employee Salaries")
    plt.xticks(rotation=45)
    plt.savefig("temp_chart.png")
    plt.close()
# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key by signing up at https://pandabi.ai (you can also configure it in your .env file)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
agent = Agent([employees_df, salaries_df], memory_size=10)
agent.add_skills(plot_salaries)
# Chat with the agent
response = agent.chat("Plot the employee salaries against names")
print(response)
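As a rough mental model of why a skill needs a clear name, signature, and docstring, the sketch below shows a generic decorator-based registry whose entries can be rendered as text for a prompt. Everything here (`SKILLS`, `register_skill`, `describe_skills`) is hypothetical and only illustrates the general pattern; it is not how pandasai.skills is implemented.

```python
import inspect

# Hypothetical registry: function name -> (function, description).
SKILLS = {}


def register_skill(func):
    """Hypothetical decorator: record the function and its docstring, return it unchanged."""
    SKILLS[func.__name__] = (func, inspect.getdoc(func) or "")
    return func


@register_skill
def total_salary(salaries):
    """Return the sum of all salaries."""
    return sum(salaries)


def describe_skills():
    """Render each registered skill as 'name(signature): description'."""
    return "\n".join(
        f"{name}{inspect.signature(func)}: {doc}"
        for name, (func, doc) in SKILLS.items()
    )


print(describe_skills())
# prints: total_salary(salaries): Return the sum of all salaries.
print(SKILLS["total_salary"][0]([5000, 6000, 4500, 7000, 5500]))
# prints: 28000
```

In a design like this the docstring is the only description available to whatever selects the function, so it should state accurately what the function does.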