I am trying to query a Denodo database from Airflow. For this I have a sample Dockerfile, shown below:
FROM apache/airflow:2.1.0
RUN pip install apache-airflow-providers-databricks
RUN pip install apache-airflow-providers-jdbc
# Install OpenJDK-11
USER root
RUN apt-get update && apt-get install -y openjdk-11-jdk && apt-get clean
ENV JAVA_HOME /usr/lib/jvm/java-11-openjdk-amd64/
RUN export JAVA_HOME
ENV CLASSPATH /usr/local/airflow/jars/denodo-vdp-jdbcdriver.jar/
USER airflow
The task log in Airflow shows the following:
*** Reading local file: /opt/airflow/logs/denodo_example/select_query/2021-08-04T06:32:04.775406+00:00/1.log
[2021-08-04 06:32:05,912] {taskinstance.py:876} INFO - Dependencies all met for <TaskInstance: denodo_example.select_query 2021-08-04T06:32:04.775406+00:00 [queued]>
[2021-08-04 06:32:05,921] {taskinstance.py:876} INFO - Dependencies all met for <TaskInstance: denodo_example.select_query 2021-08-04T06:32:04.775406+00:00 [queued]>
[2021-08-04 06:32:05,921] {taskinstance.py:1067} INFO -
--------------------------------------------------------------------------------
[2021-08-04 06:32:05,921] {taskinstance.py:1068} INFO - Starting attempt 1 of 1
[2021-08-04 06:32:05,923] {taskinstance.py:1069} INFO -
--------------------------------------------------------------------------------
[2021-08-04 06:32:05,934] {taskinstance.py:1087} INFO - Executing <Task(PythonOperator): select_query> on 2021-08-04T06:32:04.775406+00:00
[2021-08-04 06:32:05,939] {standard_task_runner.py:52} INFO - Started process 640 to run task
[2021-08-04 06:32:05,942] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'denodo_example', 'select_query', '2021-08-04T06:32:04.775406+00:00', '--job-id', '14475', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/test.py', '--cfg-path', '/tmp/tmpja39uja5', '--error-file', '/tmp/tmpsv_d06ky']
[2021-08-04 06:32:05,943] {standard_task_runner.py:77} INFO - Job 14475: Subtask select_query
[2021-08-04 06:32:05,975] {logging_mixin.py:104} INFO - Running <TaskInstance: denodo_example.select_query 2021-08-04T06:32:04.775406+00:00 [running]> on host 6093b1cb6783
[2021-08-04 06:32:06,015] {taskinstance.py:1282} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=denodo_example
AIRFLOW_CTX_TASK_ID=select_query
AIRFLOW_CTX_EXECUTION_DATE=2021-08-04T06:32:04.775406+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-08-04T06:32:04.775406+00:00
[2021-08-04 06:32:06,246] {base.py:78} INFO - Using connection to: id: denodo_test. Host: jdbc:vdb://http://vpce-04c9e69adf77418bb-der51m2n.vpce-svc-07bf2027450b5bbe0.eu-central-1.vpce.amazonaws.com/:9999/distributed_tpcds?userAgent=jaydebeapi-ip-10-123-136-144, Port: None, Schema: , Login: user_nikitagupta, Password: ***, extra: {'extra__jdbc__drv_clsname': 'com.denodo.vdp.jdbc.Driver', 'extra__jdbc__drv_path': '/opt/airflow/denodo-vdp-jdbcdriver.jar'}
[2021-08-04 06:32:07,038] {local_task_job.py:151} INFO - Task exited with return code 1
What changes might be needed in the Dockerfile to establish a connection between Airflow and Denodo?
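
For reference, here is a minimal sketch of how the Dockerfile above might be adjusted. It is not a confirmed fix. It rests on several assumptions: that the driver jar is available in the build context as denodo-vdp-jdbcdriver.jar, that the image runs on amd64 (as the original JAVA_HOME path implies), and that the base image is a slim Debian variant where creating /usr/share/man/man1 beforehand may be needed for the OpenJDK package to install. The sketch copies the jar to the path named in the connection's extra__jdbc__drv_path, points CLASSPATH at that same file without a trailing slash, and drops the `RUN export JAVA_HOME` line, which has no effect on later layers or on the running container.

```dockerfile
FROM apache/airflow:2.1.0

# System packages need root; the official image otherwise runs as the airflow user.
USER root

# Assumption: on slim Debian bases the man1 directory may be missing, which can
# make the OpenJDK post-install step fail, so it is created first.
RUN mkdir -p /usr/share/man/man1 \
    && apt-get update \
    && apt-get install -y --no-install-recommends openjdk-11-jdk \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# ENV persists into the running container; a separate "RUN export" does not.
ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64

# Assumption: the driver jar sits next to this Dockerfile in the build context.
# The target path matches extra__jdbc__drv_path from the Airflow connection.
COPY --chown=airflow:root denodo-vdp-jdbcdriver.jar /opt/airflow/denodo-vdp-jdbcdriver.jar
ENV CLASSPATH=/opt/airflow/denodo-vdp-jdbcdriver.jar

# Python packages are installed as the airflow user, as in the original file.
USER airflow
RUN pip install --no-cache-dir \
    apache-airflow-providers-databricks \
    apache-airflow-providers-jdbc
```

Separately from the image, the Host value in the log contains `http://` after `jdbc:vdb://` and a `/` before the port. Denodo JDBC URLs normally take the form jdbc:vdb://<host>:<port>/<database>, so the connection definition may also need correcting, whatever the Dockerfile looks like.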