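This function reads a (document.doc) file, but it turns the Arabic characters into English (Latin) characters.
I want it to read the Arabic characters, or at least strip them out.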
function word($filename) {
    if (($fh = fopen($filename, 'r')) !== false) {
        // Read the first 0xA00 bytes, which hold the header.
        $headers = fread($fh, 0xA00);

        // Build the text length from the four bytes at offsets 0x21C-0x21F.
        $n1 = ord($headers[0x21C]) - 1;
        $n2 = (ord($headers[0x21D]) - 8) * 256;
        $n3 = ord($headers[0x21E]) * 256 * 256;
        $n4 = ord($headers[0x21F]) * 256 * 256 * 256;
        $textLength = $n1 + $n2 + $n3 + $n4;

        // Read the text that follows the header; fall back to docx2text() if nothing comes back.
        $extracted_plaintext = @fread($fh, $textLength);
        if (!$extracted_plaintext) {
            return docx2text($filename); // Save this contents to file
        }

        // Turn carriage returns into newlines and print the result.
        $text = str_replace(chr(13), "\n", $extracted_plaintext);
        echo $text;
    }
}
word('filename.doc');
For example: filename.doc -> the file contains the phrase “بسم الله الرحمن الرحيم”
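
To make the two outcomes I am after concrete, here is a minimal sketch, not a confirmed fix. It assumes the bytes read from the .doc are UTF-16LE (a guess on my part about how Word stores Arabic text), that the mbstring extension is installed, and that the page is served as UTF-8. The helper names utf16le_to_utf8() and strip_non_ascii() are hypothetical and are not part of the function above.

```php
<?php
// Hypothetical helper: treat the extracted bytes as UTF-16LE and convert them to UTF-8.
// Assumption: the .doc stores this text run as 16-bit characters.
function utf16le_to_utf8(string $bytes): string
{
    // Drop a trailing odd byte so the input is a whole number of 16-bit units.
    if (strlen($bytes) % 2 !== 0) {
        $bytes = substr($bytes, 0, -1);
    }
    return mb_convert_encoding($bytes, 'UTF-8', 'UTF-16LE');
}

// Hypothetical fallback: keep only tab, newline, carriage return and printable ASCII,
// which removes the Arabic characters (and every other non-ASCII byte).
function strip_non_ascii(string $text): string
{
    return preg_replace('/[^\x09\x0A\x0D\x20-\x7E]/', '', $text);
}

// Self-contained demo: encode a sample phrase as UTF-16LE, then decode it again.
$original = 'بسم الله الرحمن الرحيم';
$raw = mb_convert_encoding($original, 'UTF-16LE', 'UTF-8');

echo utf16le_to_utf8($raw), "\n";            // the phrase, back as UTF-8
echo strip_non_ascii('abc' . $original), "\n"; // Arabic removed, spaces kept
```

If the assumption holds, the conversion would have to run on $extracted_plaintext before the str_replace() call, since replacing single bytes inside 16-bit text could corrupt it. I am also not sure the length computed from the header is still correct for 16-bit text.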