c++ - Error converting Python binary data to C++ binary data in boost.python

Question

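I have to pass some binary data from Python to C++ through boost::python. The binary data may come from an image or a text file. But an error occurs when passing an image file's binary data to C++. Here is an example.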

C++

#include <boost/python.hpp>
#include <boost/python/module.hpp>
#include <boost/python/def.hpp>
#include <fstream>
#include <iostream>

using namespace boost::python;
void greet(char *name,char *ss)
{
    std::ofstream fout(name,std::ios::binary);
    std::cout<< "This length is:" << strlen(ss) <<std::endl;
    fout.write(ss, strlen(ss));
    fout.close();
    return;
}

BOOST_PYTHON_MODULE(ctopy)
{
    def("greet",greet);
}

Python:

import ctopy
# This works.
f=open("t1.txt","rb")
ctopy.greet("t2.txt",f.read())
f.close()

# This fails: no data ends up in the file "p2.jpg".
f2=open("p1.jpg","rb")
ctopy.greet("p2.jpg",f2.read())  # error: "p2.jpg" ends up as an empty file
f2.close()

How do I pass an image's binary data to C++?

score 1 · Accepted Answer

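The encoding of a binary file typically depends on factors other than the programming language, such as the file type and the operating system. For example, on POSIX a text file contains characters organized into zero or more lines, and it is represented the same way in both C++ and Python. Both languages just need to use the correct encoding for the given format. In this case there is no special process for converting a Python binary stream to a C++ binary stream, because it is a raw stream of bytes in both languages.

There are two problems with the approach in the original code: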

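strlen() should be used to determine the length of a null-terminated string. If the binary file contains bytes with a value of \0, strlen() will not return the full size of the data.
The size of the data is lost because char* is used instead of std::string. Consider using std::string, as it both provides the size via size() and permits null characters within the string itself. An alternative solution is to explicitly pass the size of the data alongside the data.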
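
To make the first point concrete, here is a small standalone sketch, independent of Boost.Python. The helper names `length_via_strlen`, `length_via_string`, and `make_sample` are hypothetical and exist only for illustration; the sample buffer is made up to mimic binary data with a null byte in the middle.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: measures a buffer the way the original greet() does.
// strlen() stops at the first '\0', so binary data is cut short.
std::size_t length_via_strlen(const char* data)
{
  return std::strlen(data);
}

// Hypothetical helper: std::string stores its size explicitly, so embedded
// null bytes are counted like any other byte.
std::size_t length_via_string(const std::string& data)
{
  return data.size();
}

// Builds a 5-byte sample with a null byte in the middle.
std::string make_sample()
{
  const char raw[] = {'a', 'b', '\0', 'c', 'd'};
  return std::string(raw, sizeof(raw));
}
```

With this sample, `length_via_strlen(make_sample().c_str())` gives 2, while `length_via_string(make_sample())` gives 5.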

Here is a complete Boost.Python example:

#include <boost/python.hpp>
#include <fstream>
#include <iostream>
#include <string>

void write_file(const std::string& name, const std::string& data)
{
  std::cout << "This length is: " << data.size() << std::endl;
  std::ofstream fout(name.c_str(), std::ios::binary);
  fout.write(data.data(), data.size());
  fout.close();
}

BOOST_PYTHON_MODULE(example)
{
  using namespace boost::python;
  def("write", &write_file);
}

And the example Python code (test.py):

import example

with open("t1.txt", "rb") as f:
    example.write("t2.txt", f.read())

with open("p1.png", "rb") as f:
    example.write("p2.png", f.read())

And the usage, where I download this image and create a simple text file, then use the code above to create copies of them:

[twsansbury]$ wget http://www.boost.org/style-v2/css_0/get-boost.png -O p1.png >> /dev/null 2>&1
[twsansbury]$ echo "this is a test" > t1.txt
[twsansbury]$ ./test.py
This length is: 15
This length is: 27054
[twsansbury]$ md5sum t1.txt t2.txt
e19c1283c925b3206685ff522acfe3e6 t1.txt
e19c1283c925b3206685ff522acfe3e6 t2.txt
[twsansbury]$ md5sum p1.png p2.png
fe6240bff9b29a90308ef0e2ba959173 p1.png
fe6240bff9b29a90308ef0e2ba959173 p2.png

The md5 checksums match, indicating that the file contents are the same.
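
The same check can be done without md5sum. The following is only a minimal sketch and not part of the original answer: it compares two files byte for byte in binary mode. The function names `read_all` and `files_equal` are hypothetical.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads a whole file into a std::string in binary mode.
// Returns false if the file cannot be opened.
bool read_all(const std::string& path, std::string& out)
{
  std::ifstream in(path.c_str(), std::ios::binary);
  if (!in)
    return false;
  out.assign(std::istreambuf_iterator<char>(in),
             std::istreambuf_iterator<char>());
  return true;
}

// Hypothetical helper: true only if both files can be read and hold exactly
// the same bytes, embedded null bytes included.
bool files_equal(const std::string& a, const std::string& b)
{
  std::string da, db;
  return read_all(a, da) && read_all(b, db) && da == db;
}
```

For the files above, `files_equal("p1.png", "p2.png")` would be expected to return true when the copy is intact.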

score 0 · Accepted Answer

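Please provide real code, after creating a minimal example from it. Also, which Python version are you using? Anyhow, here are a few things wrong with the code you provided:

You should be using const, as any C++ FAQ will tell you.
You are using strlen() on something that is not even guaranteed to be zero-terminated, but that can contain zeros in the middle.
You should be using std::string, which doesn't break if you have internal null characters.
Closing the file is useless; that is done automatically in the destructor. Flushing and checking the stream state is more interesting. On failure, throw an exception.
Drop the trailing return. It doesn't hurt, but it's unnecessary noise.
Read PEP 8.
Use a with statement for reading the file.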
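
The C++ suggestions above could be sketched roughly as follows. This is one possible reading of the advice, not code from the answer: the error messages are made up, and the Boost.Python registration is left out so the function stands on its own.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Sketch of a write function following the suggestions above: const
// references, std::string for the payload, no explicit close(), and an
// exception when the stream reports a failure.
void write_file(const std::string& name, const std::string& data)
{
  std::ofstream fout(name.c_str(), std::ios::binary);
  if (!fout)
    throw std::runtime_error("cannot open " + name);

  fout.write(data.data(), static_cast<std::streamsize>(data.size()));
  fout.flush();  // push buffered bytes out so that write errors surface here
  if (!fout)
    throw std::runtime_error("failed to write " + name);
}  // fout's destructor closes the file
```

If a function like this is exposed with def(), Boost.Python's default exception translation should, as far as I know, turn the std::runtime_error into a Python RuntimeError, so the failure is visible on the Python side as well.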
