
I am trying to run a PDI transformation involving a database (any database, but preferably NoSQL) from Java.

I have tried MongoDB and Cassandra and ran into missing plugins; I already asked about that here: Running PDI Kettle on Java - Mongodb Step Missing Plugins, but no one has answered yet.

I also tried switching to a SQL database using PostgreSQL, but it still does not work. From the research I have done, I think it is because I am not connecting to the database properly from Java, but I have not found any tutorial or direction that works for me. I have followed the instructions in this blog post: http://ameethpaatil.blogspot.co.id/2010/11/pentaho-data-integration-java-maven.html, but still ran into some problems around repositories (which I do not have, and which seem to be required).

The transformation runs fine when I launch it from Spoon. It only fails when I run it from Java.

Can anyone help me figure out how to run a PDI transformation that involves a database? Where am I going wrong?

Has anyone successfully run a PDI transformation involving NoSQL or SQL databases? Which database did you use?

I am sorry if I am asking too many questions; I am getting desperate. Any kind of information would be greatly appreciated. Thank you.


4 Answers


Executing PDI jobs from Java is pretty straightforward. You just need to import all the necessary jar files (including the database drivers) and then invoke the Kettle classes. The best way is, of course, to use Maven to manage the dependencies. In the Maven pom.xml file, simply declare the database driver.

Assuming you are using Pentaho v5.0.0 GA with PostgreSQL as the database, a sample Maven dependency section would look like this:

<dependencies>
    <!-- Pentaho Kettle Core dependencies development -->
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-core</artifactId>
        <version>5.0.0.1</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-dbdialog</artifactId>
        <version>5.0.0.1</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-engine</artifactId>
        <version>5.0.0.1</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-ui-swt</artifactId>
        <version>5.0.0.1</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle5-log4j-plugin</artifactId>
        <version>5.0.0.1</version>
    </dependency>

    <!-- The database dependency files. Use it if your kettle file involves database connectivity. -->
    <dependency>
        <groupId>postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <version>9.1-902.jdbc4</version>
    </dependency>
</dependencies>
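
Once those dependencies are on the classpath, running a transformation comes down to initializing the Kettle environment and invoking the engine classes. A minimal sketch (the class name and .ktr path below are placeholders, not taken from the original answer):

import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunTransformation {

    public static void main(String[] args) throws KettleException {
        // Initialize the Kettle environment (core plugins, database types, etc.)
        KettleEnvironment.init();

        // Load the transformation metadata from the .ktr file created in Spoon
        TransMeta transMeta = new TransMeta("etl/postgres_transformation.ktr");

        // Create and run the transformation
        Trans trans = new Trans(transMeta);
        trans.execute(null);
        trans.waitUntilFinished();

        if (trans.getErrors() > 0) {
            throw new KettleException("The transformation finished with errors.");
        }
    }
}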

You can check my blog for more information. It works for database connectivity as well.

Hope this helps :)

answered 2015-10-06T07:01:20.000

I ran into the same problem in an application that uses the Pentaho libraries. I solved it with the following code:

A singleton to initialize Kettle:

import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Initializes the Kettle environment variable configuration
 * 
 * @author Marcos Souza
 * @version 1.0
 *
 */
public class AtomInitKettle {

    private static final Logger LOGGER = LoggerFactory.getLogger(AtomInitKettle.class);

    private AtomInitKettle() throws KettleException {
        try {
            LOGGER.info("Initializing Kettle");
            KettleJNDI.protectSystemProperty();
            KettleEnvironment.init();
            LOGGER.info("Kettle initialized successfully");
        } catch (Exception e) {
            LOGGER.error("Message: {} Cause {} ", e.getMessage(), e.getCause());
        }
    }
}

以及拯救我的代码:

import java.io.File;
import java.util.Properties;

import org.pentaho.di.core.Const;
import org.pentaho.di.core.exception.KettleException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class KettleJNDI {

    private static final Logger LOGGER = LoggerFactory.getLogger(KettleJNDI.class);

    public static final String SYS_PROP_IC = "java.naming.factory.initial";

    private static boolean init = false;

    private KettleJNDI() {

    }

    public static void initJNDI() throws KettleException {
        String path = Const.JNDI_DIRECTORY;
        LOGGER.info("Kettle Const.JNDI_DIRECTORY= {}", path);

        if (path == null || path.equals("")) {
            try {
                File file = new File("simple-jndi");
                path = file.getCanonicalPath();
            } catch (Exception e) {
                throw new KettleException("Error initializing JNDI", e);
            }
            Const.JNDI_DIRECTORY = path;
            LOGGER.info("Kettle null > Const.JNDI_DIRECTORY= {}", path);
        }

        System.setProperty("java.naming.factory.initial", "org.osjava.sj.SimpleContextFactory");
        System.setProperty("org.osjava.sj.root", path);
        System.setProperty("org.osjava.sj.delimiter", "/");
    }

    public static void protectSystemProperty() {
        if (init) {
            return;
        }

        System.setProperties(new ProtectionProperties(SYS_PROP_IC, System.getProperties()));

        if (LOGGER.isInfoEnabled()) {
            LOGGER.info("Kettle System Property Protector: System.properties replaced by custom properties handler");
        }

        init = true;
    }

    public static class ProtectionProperties extends Properties {

        private static final long serialVersionUID = 1L;
        private final String protectedKey;

        public ProtectionProperties(String protectedKey, Properties prprts) {
            super(prprts);
            if (protectedKey == null) {
                throw new IllegalArgumentException("Properties protection was provided a null key");
            }
            this.protectedKey = protectedKey;
        }

        @Override
        public synchronized Object setProperty(String key, String value) {
            // Silently ignore attempts to change the protected key; all other properties are set normally
            if (protectedKey.equals(key)) {
                if (LOGGER.isDebugEnabled()) {
                    LOGGER.debug("Kettle System Property Protector: Protected change to '" + key + "' with value '" + value + "'");
                }

                return super.getProperty(protectedKey);
            }

            return super.setProperty(key, value);
        }
    }
}
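
A minimal usage sketch of the two classes above (hypothetical, not part of the original answer): call protectSystemProperty() and initJNDI() before KettleEnvironment.init(), and then load and run the transformation as usual.

import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunWithProtectedJndi {

    public static void main(String[] args) throws KettleException {
        // Guard java.naming.factory.initial and point simple-jndi at the connection definitions
        KettleJNDI.protectSystemProperty();
        KettleJNDI.initJNDI();
        KettleEnvironment.init();

        // Load and execute the transformation
        TransMeta transMeta = new TransMeta("transformation_with_jndi.ktr");
        Trans trans = new Trans(transMeta);
        trans.execute(null);
        trans.waitUntilFinished();
    }
}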
answered 2016-02-16T19:34:23.113
  • I tried your code with a "transformation without jndi" and it works fine!

But I needed to add this repository to my pom.xml:

<repositories>
    <repository>
        <id>pentaho-releases</id>
        <url>http://repository.pentaho.org/artifactory/repo/</url>
    </repository>
</repositories>
  • Also, when I tried to use a datasource, I got this error: Cannot instantiate class: org.osjava.sj.SimpleContextFactory [Root exception is java.lang.ClassNotFoundException: org.osjava.sj.SimpleContextFactory]

Full log here: https://gist.github.com/eb15f8545e3382351e20.git

[FIX]: Add this dependency:

<dependency>
    <groupId>pentaho</groupId>
    <artifactId>simple-jndi</artifactId>
    <version>1.0.1</version>
</dependency>
  • After that, a new error appears:

    transformation_with_jndi - Dispatching started for transformation [transformation_with_jndi]
    Table input.0 - ERROR (version 5.0.0.1.19046, build 1 from 2013-09-11_13-51-13 by buildguy) : An error occurred, processing will be stopped:
    Table input.0 - An error occurred while trying to connect to the database
    Table input.0 - java.io.File parameter must be a directory. [D:\opt\workspace-eclipse\invoke-ktr-jndi\simple-jndi]

Full log: https://gist.github.com/jrichardsz/9d74c7263f3567ac4b45

[EXPLANATION] This is caused by

KettleEnvironment.init(); 

https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/KettleEnvironment.java

which contains this initialization:

if (simpleJndi) {
  JndiUtil.initJNDI();
}

And in JndiUtil:

String path = Const.JNDI_DIRECTORY;
if ((path == null) || (path.equals("")))

https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/JndiUtil.java

And in the Const class:

public static String JNDI_DIRECTORY = NVL(System.getProperty("KETTLE_JNDI_ROOT"), System.getProperty("org.osjava.sj.root"));

https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/Const.java

So we need to set the KETTLE_JNDI_ROOT variable.

[FIX] A small change to your example: just add this

System.setProperty("KETTLE_JNDI_ROOT", jdbcPropertiesPath);

KettleEnvironment.init();

A complete example based on your code:

import java.io.File;
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class ExecuteSimpleTransformationWithJndiDatasource {    

    public static void main(String[] args) {

        String resourcesPath = (new File(".").getAbsolutePath())+"\\src\\main\\resources";
        String ktr_path = resourcesPath+"\\transformation_with_jndi.ktr";

        //KETTLE_JNDI_ROOT could be the simple-jndi folder in your pdi or spoon home.
        //in this example, it is the resources folder
        String jdbcPropertiesPath = resourcesPath;

        try {
            /**
             * Initialize the Kettle Environment
             */
            System.setProperty("KETTLE_JNDI_ROOT", jdbcPropertiesPath);
            KettleEnvironment.init();

            /**
             * Create a trans object to properly assign the ktr metadata.
             * 
             * @filedb: The ktr file path to be executed.
             * 
             */
            TransMeta metadata = new TransMeta(ktr_path);
            Trans trans = new Trans(metadata);

            // Execute the transformation
            trans.execute(null);
            trans.waitUntilFinished();

            // checking for errors
            if (trans.getErrors() > 0) {
                System.out.println("Error executing the transformation");
            }

        } catch (KettleException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }

}

For the complete example, check my github channel:

https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/tree/master/running-etl-transformation-using-java/invoke-transformation-from-java-jndi/src/main/resources

answered 2015-10-06T19:15:31.083

I think your problem is the database connection. You can configure it inside the transformation itself; there is no need to use JNDI.

public class DatabaseMetaStep {

    private static final Logger LOGGER = LoggerFactory.getLogger(DatabaseMetaStep.class);

    /**
     * Builds the database access configuration
     * 
     * @return the configured DatabaseMeta
     */
    public static DatabaseMeta createDatabaseMeta() {
        DatabaseMeta databaseMeta = new DatabaseMeta();

        LOGGER.info("Loading database access information");
        databaseMeta.setHostname("localhost");
        databaseMeta.setName("connectionName"); // the connection name referenced by the transformation steps
        databaseMeta.setUsername("user");
        databaseMeta.setPassword("password");
        databaseMeta.setDBPort("port");
        databaseMeta.setDBName("database");
        databaseMeta.setDatabaseType("MonetDB"); // e.g. "PostgreSQL", "MySQL", "MonetDB" ...
        databaseMeta.setAccessType(DatabaseMeta.TYPE_ACCESS_NATIVE);

        return databaseMeta;
    }
}

Then you need to set the databaseMeta on the TransMeta:

DatabaseMeta databaseMeta = DatabaseMetaStep.createDatabaseMeta();

TransMeta transMeta = new TransMeta();
transMeta.setUsingUniqueConnections(true);
transMeta.setName("transMetaName");

List<DatabaseMeta> databases = new ArrayList<>();
databases.add(databaseMeta);
transMeta.setDatabases(databases);
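
If the transformation is loaded from a .ktr built in Spoon, a similar idea is to override the connection it already defines instead of building the TransMeta from scratch. A hedged sketch, assuming the connection name in the .ktr matches the name passed to databaseMeta.setName(...):

TransMeta transMeta = new TransMeta("my_transformation.ktr");

// Replace the connection stored in the .ktr (matched by name) with the programmatic one
transMeta.addOrReplaceDatabase(DatabaseMetaStep.createDatabaseMeta());

Trans trans = new Trans(transMeta);
trans.execute(null);
trans.waitUntilFinished();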
answered 2016-02-17T11:28:06.023