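I am trying to migrate data from a smaller Cassandra ring to a larger one.

I wrote a script that collects the SSTables from a keyspace and then SCPs them to an instance, which uses the sstableloader utility to stream them into the larger ring.

Importing the SSTables from the smaller ring into the larger ring fails with an error. Judging by the exceptions, the SSTables appear to be corrupted or incompatible in some way. Does anyone have any insight into this error?

Here are the details:

I run the following export script on each node of the original cluster: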

#!/bin/bash 

NODETOOL=$1
CASSDATA=$2
NODE=$3
KEYSPACE=$4
DESTINATION=$5

ORIG=$(pwd)

KEY=" 
-----BEGIN RSA PRIVATE KEY-----
A PRIVATE KEY GOES HERE
-----END RSA PRIVATE KEY-----"


echo "Flushing node...."
$NODETOOL flush

echo "Compacting...."
$NODETOOL compact

echo "Tarring data and indexes...."
cd "$CASSDATA" || exit 1

# Archive only the files whose names end in Data.db or Index.db.
find "$KEYSPACE" -type f \( -name '*Data.db' -o -name '*Index.db' \) -print0 | xargs -0 tar -czvf "$ORIG/$KEYSPACE-$NODE.tar.gz"
cd "$ORIG" || exit 1

echo "SCP'ing files to destination."

echo "$KEY" > key.key
chmod 600 key.key

scp -i key.key $KEYSPACE-$NODE.tar.gz root@$DESTINATION:~/import

rm key.key

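As you can see, I flush the node's memtables, run a compaction, then find all the data and index files and tar them up for transport. The archive is then sent to the node that runs the sstableloader script: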

#!/bin/bash 

SSTOOL=$1 
KEYSPACE=$2
NODE=$3

ORIG=$(pwd)

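# Make sure the extraction directory exists before unpacking into it.
mkdir -p "exportdir/$NODE"
tar -xzf "$KEYSPACE-$NODE.tar.gz" -C "exportdir/$NODE"
echo "Importing..."

cd "exportdir/$NODE" || exit 1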

$SSTOOL -v -d<IPADDRESS> $KEYSPACE/ColFam1
$SSTOOL -v -d<IPADDRESS> $KEYSPACE/ColFam2

cd $ORIG
echo "DONE" 
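
To rule out corruption during packaging or transfer, one thing I could do is write a checksum manifest on the export side and verify it after extraction. A minimal sketch, assuming GNU coreutils and findutils are available; the function names write_manifest and verify_manifest and the manifest file are made up for illustration and are not part of the scripts above:

```shell
#!/bin/bash

# Hypothetical helper: write a SHA-256 manifest for the same files the export
# script selects (names ending in Data.db or Index.db).
# Usage: write_manifest <cassandra-data-dir> <keyspace> <manifest-path>
write_manifest() {
    local datadir=$1 keyspace=$2 manifest=$3
    (
        cd "$datadir" || exit 1
        # Paths are recorded relative to the data directory, matching the
        # layout inside the tarball.
        find "$keyspace" -type f \( -name '*Data.db' -o -name '*Index.db' \) -print0 \
            | sort -z \
            | xargs -0 -r sha256sum
    ) > "$manifest"
}

# Hypothetical helper: verify extracted files against that manifest.
# Usage: verify_manifest <extract-dir> <absolute-manifest-path>
# Returns non-zero if any file is missing or its checksum differs.
verify_manifest() {
    local dir=$1 manifest=$2
    ( cd "$dir" && sha256sum --quiet -c "$manifest" )
}
```

On the source node this would be called from the export script as write_manifest "$CASSDATA" "$KEYSPACE" "$ORIG/$KEYSPACE-$NODE.sha256", and on the import host as verify_manifest "exportdir/$NODE" "$ORIG/$KEYSPACE-$NODE.sha256" once the manifest has been copied over next to the archive. The manifest path has to be absolute because the check runs from inside the extraction directory.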

When I run the script on that node, Cassandra contacts the ring and correctly identifies all of the target instances in the larger ring:

Streaming revelant part of Msq/ColFam1-DATA.db to [<IP>,<IP>,<IP>]

But then it immediately fails with the following error:

ERROR 15:49:54,672 Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.EOFException
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
 Caused by: java.io.EOFException
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:416)
at org.apache.cassandra.streaming.FileStreamTask.write(FileStreamTask.java:217)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:164)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
... 3 more
Exception in thread "Streaming to <IP>:1" java.lang.RuntimeException: java.io.EOFException
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.EOFException
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:416)
at org.apache.cassandra.streaming.FileStreamTask.write(FileStreamTask.java:217)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:164)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
... 3 more

This message is repeated for every node in the larger ring.

As is this one:

 ERROR 15:49:54,789 Error in ThreadPoolExecutor
 java.lang.IllegalArgumentException: unable to seek to position 6774 in /root/import/exportdir/2/KEYSPACE/ColFam1/KEYSPACE-ColFam1-ic-3-Data.db (6523 bytes) in read-only mode
at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:306)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:155)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Exception in thread "Streaming to /10.182.178.228:1" java.lang.IllegalArgumentException: unable to seek to position 6774 in /root/import/exportdir/2/KEYSPACE/ColFam1/KEYSPACE-ColFam1-ic-3-Data.db (6523 bytes) in read-only mode
at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:306)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:155)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
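
The seek error above asks for position 6774 in a file that is only 6523 bytes long, which is 251 bytes past the end of the file. A small check like the following could confirm on the import host that the file on disk really is shorter than the requested offset. This is only a sketch; the function name offset_within_file is made up for illustration:

```shell
#!/bin/bash

# Hypothetical helper: report whether a byte offset lies past the end of a file.
# Usage: offset_within_file <path> <offset>
# Returns 0 if offset <= file size, 1 if it is past the end, 2 if the file
# cannot be read.
offset_within_file() {
    local path=$1 offset=$2 size
    size=$(wc -c < "$path") || return 2
    size=$((size))  # normalise: some wc implementations pad the number
    if [ "$offset" -le "$size" ]; then
        echo "offset $offset is within $path ($size bytes)"
    else
        echo "offset $offset is $((offset - size)) bytes past the end of $path ($size bytes)"
        return 1
    fi
}
```

Running it against the Data.db file named in the exception with offset 6774 shows whether the extracted file matches the 6523 bytes reported by the loader.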

The log file on each node of the ring contains:

 INFO [Thread-24] 2013-09-08 15:49:54,793 StreamInSession.java (line 136) Streaming of file KEYSPACE/ColFam1/KEYSPACE-COLFAM1-Data.db sections=20 progress=0/5673 - 0% for org.apache.cassandra.streaming.StreamInSession@386e5d failed: requesting a retry.

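From these log lines it looks like gossip between the nodes is working correctly, but I am hitting an error when the files are parsed.

Some of the paths, keyspace names and column family names in the stack traces have been obfuscated. If you see a mistake in my approach, please answer!

Thanks.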
