1

我正在尝试将 Python 模块分解为字节码。

我必须静态或动态地导入 Python 模块才能反汇编或检查它吗?如果不是,那么(pythonic,便携式)方法是什么?

我想:

  1. 在运行时将可用的 Python 模块的二进制数据加载到内存中:
    1. 没有它作为可用模块出现在sys.modules.
    2. 我不想执行任何模块的__init__代码,或者将它添加到任何命名空间。
    3. 加载模块应该没有其他副作用。就解释器而言,它应该只是要检查的数据块。
  2. 反汇编或以其他方式检查模块的类、函数或数据。
  3. 需要时卸载模块。

我已经搜索过,我看到了许多动态模块导入的方法(具有执行模块__init__代码或其他内联代码以及插入到 sys.modules 中的副作用)。但我宁愿不处理这些副作用。

这可能吗?如果是这样,哪些方法最便携/Pythonic?

4

2 回答 2

1

我对此进行了一些研究,一种可能的解决方案是使用pyclbr模块。它的检查查看有关类和函数的基本信息,将其加载到字典中以便于访问。这是一个示例运行:

>>> import pyclbr
>>> import sys
>>> info = pyclbr.readmodule_ex('inspect')
>>> info
{'formatargvalues': <pyclbr.Function object at 0x5083e28e50>, 'walktree': <pyclbr.Function object at 0x5083e28b50>, 'getinnerframes': <pyclbr.Function object at 0x5083e29050>, 'indentsize': <pyclbr.Function object at 0x5083e28710>, 'getmodulename': <pyclbr.Function object at 0x5083e28850>, 'formatannotation': <pyclbr.Function object at 0x5083e28d50>, 'ismemberdescriptor': <pyclbr.Function object at 0x5083e283d0>, 'iscode': <pyclbr.Function object at 0x5083e28550>, 'getsource': <pyclbr.Function object at 0x5083e28b10>, 'formatargspec': <pyclbr.Function object at 0x5083e28dd0>, 'getabsfile': <pyclbr.Function object at 0x5083e288d0>, 'getsourcelines': <pyclbr.Function object at 0x5083e28ad0>, '_getfullargs': <pyclbr.Function object at 0x5083e28c10>, 'isabstract': <pyclbr.Function object at 0x5083e28610>, 'isbuiltin': <pyclbr.Function object at 0x5083e28590>, 'getlineno': <pyclbr.Function object at 0x5083e28f10>, 'getcomments': <pyclbr.Function object at 0x5083e28990>, 'getgeneratorstate': <pyclbr.Function object at 0x5083e293d0>, 'getattr_static': <pyclbr.Function object at 0x5083e29390>, 'getframeinfo': <pyclbr.Function object at 0x5083e28ed0>, 'isgenerator': <pyclbr.Function object at 0x5083e28490>, '_static_getmro': <pyclbr.Function object at 0x5083e29190>, 'isframe': <pyclbr.Function object at 0x5083e28510>, 'getouterframes': <pyclbr.Function object at 0x5083e28f90>, 'getclasstree': <pyclbr.Function object at 0x5083e28b90>, 'getfile': <pyclbr.Function object at 0x5083e287d0>, '_shadowed_dict': <pyclbr.Function object at 0x5083e29310>, 'getargvalues': <pyclbr.Function object at 0x5083e28d10>, 'getmembers': <pyclbr.Function object at 0x5083e28650>, 'BlockFinder': <pyclbr.Class object at 0x5083e28a10>, 'isfunction': <pyclbr.Function object at 0x5083e28390>, 'getargspec': <pyclbr.Function object at 0x5083e28c50>, 'currentframe': <pyclbr.Function object at 0x5083e29090>, 'namedtuple': <pyclbr.Function object at 0x5083e1b150>, 'getmoduleinfo': <pyclbr.Function object at 0x5083e28810>, 'trace': <pyclbr.Function object at 0x5083e29110>, 'isclass': <pyclbr.Function object at 0x5083db8950>, '_is_type': <pyclbr.Function object at 0x5083e29290>, 'getcallargs': <pyclbr.Function object at 0x5083e28e90>, 'ismethoddescriptor': <pyclbr.Function object at 0x5083e28310>, 'isgeneratorfunction': <pyclbr.Function object at 0x5083e28450>, 'isroutine': <pyclbr.Function object at 0x5083e285d0>, 'getfullargspec': <pyclbr.Function object at 0x5083e28cd0>, 'getmro': <pyclbr.Function object at 0x5083e286d0>, 'getargs': <pyclbr.Function object at 0x5083e28bd0>, 'stack': <pyclbr.Function object at 0x5083e290d0>, 'getdoc': <pyclbr.Function object at 0x5083e28750>, 'findsource': <pyclbr.Function object at 0x5083e28950>, 'cleandoc': <pyclbr.Function object at 0x5083e28790>, '_check_class': <pyclbr.Function object at 0x5083e29250>, '_check_instance': <pyclbr.Function object at 0x5083e29210>, 'classify_class_attrs': <pyclbr.Function object at 0x5083e28690>, 'ismodule': <pyclbr.Function object at 0x5083db8910>, 'EndOfBlock': <pyclbr.Class object at 0x5083e289d0>, 'isdatadescriptor': <pyclbr.Function object at 0x5083e28350>, 'getmodule': <pyclbr.Function object at 0x5083e28910>, 'formatannotationrelativeto': <pyclbr.Function object at 0x5083e28d90>, 'getsourcefile': <pyclbr.Function object at 0x5083e28890>, 'ismethod': <pyclbr.Function object at 0x5083e282d0>, 'isgetsetdescriptor': <pyclbr.Function object at 0x5083e28410>, 'istraceback': <pyclbr.Function object at 0x5083e284d0>, 'getblock': <pyclbr.Function object at 0x5083e28a50>}
>>> sys.modules['inspect']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'inspect'

任何更高级的东西,您都必须开始研究通过ast 模块访问抽象语法树。

于 2013-03-05T02:35:39.257 回答
0

我找到了适用于 Python 2.7 的uncompyle2,它包含加载源文件并将其编译为字节码加载模块并将其编译为字节码而不导入模块的函数。

因此,至少它看起来是可行的,但可能涉及compile()在源代码上调用,或者,如果使用 pyc 文件,则可能不可移植(uncompyle2 仅支持 Python 2.7 与 pyc 文件。)

于 2013-03-05T05:11:02.493 回答