我正在尝试使用 sqoop 将一些数据从 hdfs 导出到 mysql。问题是当我尝试导出它正确导出的未压缩文件时,但如果我尝试导出使用 lzo 压缩压缩的相同文件,则 sqoop 作业将失败。我正在标准 cloudera CDH4 VM 环境中尝试它。文件中的列由制表符分隔,null 表示为 '\N'。
文件内容:
[cloudera@localhost ~]$ cat dipayan-test.txt
dipayan koramangala 29
raju marathahalli 32
raju marathahalli 32
raju \N 32
raju marathahalli 32
raju \N 32
raju marathahalli 32
raju marathahalli \N
raju marathahalli \N
mysql表说明:
mysql> describe sqooptest;
+---------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+-------+
| name | varchar(100) | YES | | NULL | |
| address | varchar(100) | YES | | NULL | |
| age | int(11) | YES | | NULL | |
+---------+--------------+------+-----+---------+-------+
3 rows in set (0.01 sec)
hdfs中的文件:
[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/dipayan-test
Found 1 items
-rw-r--r-- 3 cloudera cloudera 138 2014-02-16 23:18 /user/cloudera/dipayan-test/dipayan-test.txt.lzo
sqoop 命令:
sqoop export --connect "jdbc:mysql://localhost/bigdata" --username "root" --password "XXXXXX" --driver "com.mysql.jdbc.Driver" --table sqooptest --export-dir /user/cloudera/dipayan-test/ --input-fields-terminated-by '\t' -m 1 --input-null-string '\\N' --input-null-non-string '\\N'
错误:
[cloudera@localhost ~]$ sqoop export --connect "jdbc:mysql://localhost/bigdata" --username "root" --password "mysql" --driver "com.mysql.jdbc.Driver" --table sqooptest --export-dir /user/cloudera/dipayan-test/ --input-fields-terminated-by '\t' -m 1 --input-null-string '\\N' --input-null-non-string '\\N'
14/02/16 23:19:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/02/16 23:19:26 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
14/02/16 23:19:26 INFO manager.SqlManager: Using default fetchSize of 1000
14/02/16 23:19:26 INFO tool.CodeGenTool: Beginning code generation
14/02/16 23:19:26 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM sqooptest AS t WHERE 1=0
14/02/16 23:19:26 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM sqooptest AS t WHERE 1=0
14/02/16 23:19:27 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
14/02/16 23:19:27 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-cloudera/compile/676bc185f1efffa3b0de0a924df4a02d/sqooptest.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/02/16 23:19:29 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/676bc185f1efffa3b0de0a924df4a02d/sqooptest.jar
14/02/16 23:19:29 INFO mapreduce.ExportJobBase: Beginning export of sqooptest
14/02/16 23:19:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM sqooptest AS t WHERE 1=0
14/02/16 23:19:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/02/16 23:19:31 INFO input.FileInputFormat: Total input paths to process : 1
14/02/16 23:19:31 INFO input.FileInputFormat: Total input paths to process : 1
14/02/16 23:19:31 INFO mapred.JobClient: Running job: job_201402162201_0013
14/02/16 23:19:32 INFO mapred.JobClient: map 0% reduce 0%
14/02/16 23:19:41 INFO mapred.JobClient: Task Id : attempt_201402162201_0013_m_000000_0, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.util.NoSuchElementException
at java.util.AbstractList$Itr.next(AbstractList.java:350)
at sqooptest.__loadFromFields(sqooptest.java:225)
at sqooptest.parse(sqooptest.java:174)
at org.apach
14/02/16 23:19:48 INFO mapred.JobClient: Task Id : attempt_201402162201_0013_m_000000_1, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.util.NoSuchElementException
at java.util.AbstractList$Itr.next(AbstractList.java:350)
at sqooptest.__loadFromFields(sqooptest.java:225)
at sqooptest.parse(sqooptest.java:174)
at org.apach
14/02/16 23:19:55 INFO mapred.JobClient: Task Id : attempt_201402162201_0013_m_000000_2, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.util.NoSuchElementException
at java.util.AbstractList$Itr.next(AbstractList.java:350)
at sqooptest.__loadFromFields(sqooptest.java:225)
at sqooptest.parse(sqooptest.java:174)
at org.apach
14/02/16 23:20:04 INFO mapred.JobClient: Job complete: job_201402162201_0013
14/02/16 23:20:04 INFO mapred.JobClient: Counters: 7
14/02/16 23:20:04 INFO mapred.JobClient: Job Counters
14/02/16 23:20:04 INFO mapred.JobClient: Failed map tasks=1
14/02/16 23:20:04 INFO mapred.JobClient: Launched map tasks=4
14/02/16 23:20:04 INFO mapred.JobClient: Data-local map tasks=4
14/02/16 23:20:04 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=29679
14/02/16 23:20:04 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
14/02/16 23:20:04 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/02/16 23:20:04 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/02/16 23:20:04 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
14/02/16 23:20:04 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 33.5335 seconds (0 bytes/sec)
14/02/16 23:20:04 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
14/02/16 23:20:04 INFO mapreduce.ExportJobBase: Exported 0 records.
14/02/16 23:20:04 ERROR tool.ExportTool: Error during export: Export job failed!
如果文件未压缩并且我直接使用dipayan-test.txt
文件,这将非常有效。
在解决此问题时需要帮助,并且还想知道在处理 lzo 文件时是否遗漏了某些内容。