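I have a class whose functions connect to Snowflake (using the Snowflake connector) and perform data cleaning. My idea is to create several functions, each handling a separate part of the cleaning.
Inside a function named "work_data" I define a variable named "self.calls" and assign a pandas DataFrame to it. I want to access that DataFrame from another function.
This is my current flow:
I first initialize the variable in the init function, for now as an empty list. I then run the "work_data" function and afterwards the "finalize_data" function, but I get the empty list back instead of the pandas DataFrame.
Here is my code so far: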
import logging

import pandas as pd
import snowflake.connector

logger = logging.getLogger(__name__)

# INTEGRATION_USER and SNOW_PASSWORD are defined elsewhere (not shown here).


class SnowReader:
    sql_pslink = 'select * from xyz in xyz;'
    sql_account = 'select * from xyz in xyz;'
    sql_call = 'select * from xyz in xyz;'
    sql_sales = 'select * from xyz in xyz;'
    sql_address = 'select * from xyz in xyz;'

    def __init__(self) -> None:
        self.database = "xyz"
        self.username = INTEGRATION_USER
        self.password = SNOW_PASSWORD
        self.account = "xyz"
        self.warehouse = "xyz"
        self.role = "xyz"
        self.schema = "xyz"
        self.conn = self._connect_snow()
        self.calls = []  # replaced by a DataFrame once work_data() has run

    def _connect_snow(self):
        """Open a Snowflake connection, store it on self.conn and return it."""
        self.conn = None
        try:
            self.conn = snowflake.connector.connect(
                user=self.username,
                password=self.password,
                account=self.account,
                warehouse=self.warehouse,
                database=self.database,
                role=self.role,
                schema=self.schema,
            )
            logger.info("You are connected to Snowflake")
        except Exception as ex:
            # Not every exception carries errno, so read it defensively.
            if getattr(ex, "errno", None) == 250001:
                logger.error(
                    "Invalid username/password, please re-enter username and password.."
                )
            else:
                raise
        # Returned so that `self.conn = self._connect_snow()` keeps the connection.
        return self.conn

    def work_data(self):
        if not self.conn:
            self._connect_snow()
        link = pd.read_sql(SnowReader.sql_pslink, self.conn)
        account = pd.read_sql(SnowReader.sql_account, self.conn)
        # Stored on the instance so other methods of the same object can read it.
        self.calls = pd.read_sql(SnowReader.sql_call, self.conn)
        sales = pd.read_sql(SnowReader.sql_sales, self.conn)
        Calls_merged = self.calls.groupby('xyz', as_index=False)['xyz'].count()
        Account_step1 = Calls_merged.merge(account, left_on='xyz', right_on='xyz', how="left")
        Sales_merged = sales.groupby(['xyz'], as_index=False)['xyz'].count()
        Account_final = Sales_merged.merge(Account_step1, left_on='xyz', right_on='xyz', how="left")
        Master_Address = pd.read_sql(SnowReader.sql_address, self.conn)
        return Account_final, Master_Address, link, self.calls

    def finalize_data(self):
        # Returns whatever this instance currently holds in self.calls.
        return self.calls

a, b, c, d = SnowReader().work_data()
display(a, b, c, d)
testt = SnowReader().finalize_data()
testt
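
One likely explanation, sketched under assumptions: each `SnowReader()` call builds a new object, so the object that ran `work_data` is not the one on which `finalize_data` is called, and the second object still holds the empty list from `__init__`. The snippet below reproduces that without Snowflake or pandas. `DemoReader` and the `call_id` rows are hypothetical stand-ins (a list of dicts plays the role of the DataFrame), not part of the original code.

```python
# Minimal stand-in for SnowReader: no Snowflake connection, and a list of
# dicts plays the role of the pandas DataFrame (both are assumptions).
class DemoReader:
    def __init__(self) -> None:
        self.calls = []  # placeholder until work_data() runs

    def work_data(self):
        # Stand-in for pd.read_sql(...): the result is stored on this instance only.
        self.calls = [{"call_id": 1}, {"call_id": 2}]
        return self.calls

    def finalize_data(self):
        # Returns whatever this particular instance currently holds.
        return self.calls


# Pattern from the question: two separate objects, two separate states.
DemoReader().work_data()
fresh_result = DemoReader().finalize_data()  # new object, still the empty list

# Reusing one object: the second call sees what the first one stored.
reader = DemoReader()
reader.work_data()
shared_result = reader.finalize_data()

print(fresh_result)
print(shared_result)
```

Running this prints the empty list first and the two stored rows second. Applied to the code above, that would suggest creating the reader once (for example `reader = SnowReader()`) and calling both `work_data()` and `finalize_data()` on that same object.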