NumPy error when converting an array of ctypes types to a void pointer - python

I want to send a list of strings to a C function:

from ctypes import c_double, c_void_p, Structure, cast, c_char_p, c_size_t, POINTER
import numpy as np


class FFIArray(Structure):
    """
    Convert sequence of structs or types to C-compatible void array

    """

    _fields_ = [("data", c_void_p), ("len", c_size_t)]

    @classmethod
    def from_param(cls, seq):
        """  Allow implicit conversions """
        return seq if isinstance(seq, cls) else cls(seq)

    def __init__(self, seq, data_type):
        array = np.ctypeslib.as_array((data_type * len(seq))(*seq))
        self._buffer = array.data
        self.data = cast(array.ctypes.data_as(POINTER(data_type)), c_void_p)
        self.len = len(array)


class Coordinates(Structure):

    _fields_ = [("lat", c_double), ("lon", c_double)]

    def __str__(self):
        return "Latitude: {}, Longitude: {}".format(self.lat, self.lon)


if __name__ == "__main__":
    tup = Coordinates(0.0, 1.0)
    coords = [tup, tup]
    a = b"foo"
    b = b"bar"
    words = [a, b]

    coord_array = FFIArray(coords, data_type=Coordinates)
    print(coord_array)
    word_array = FFIArray(words, c_char_p)
    print(word_array)

This works for e.g. c_double, but fails when I try it with c_char_p, with the following error (tested on Python 2.7.16 and 3.7.4, with NumPy 1.16.5 and 1.17.2):

Traceback (most recent call last):
  File "/Users/sth/dev/test/venv3/lib/python3.7/site-packages/numpy/core/_internal.py", line 600, in _dtype_from_pep3118
    dtype, align = __dtype_from_pep3118(stream, is_subdtype=False)
  File "/Users/sth/dev/test/venv3/lib/python3.7/site-packages/numpy/core/_internal.py", line 677, in __dtype_from_pep3118
    raise ValueError("Unknown PEP 3118 data type specifier %r" % stream.s)
ValueError: Unknown PEP 3118 data type specifier 'z'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "so_example.py", line 42, in <module>
    word_array = FFIArray(words, c_char_p)
  File "so_example.py", line 19, in __init__
    array = np.ctypeslib.as_array((data_type * len(seq))(*seq))
  File "/Users/sth/dev/test/venv3/lib/python3.7/site-packages/numpy/ctypeslib.py", line 523, in as_array
    return array(obj, copy=False)
ValueError: '<z' is not a valid PEP 3118 buffer format string

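Is there a better way to do this? I'm also not set on using numpy, although it is useful elsewhere for converting iterables of numeric types and numpy arrays to _FFIArray.

Answer

Listing [Python 3.Docs]: ctypes - A foreign function library for Python.

I haven't (yet) got to the bottom of NumPy's error (so far I have reached the _multiarray_umath (C) sources, but I don't know how the functions in _internal.py get called).

In the meantime, here is a variant that doesn't use NumPy (it isn't needed in this case, but you mentioned using it in other parts, so this may only solve part of your problem).

code03.py: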

#!/usr/bin/env python3

import sys
import ctypes
import numpy as np


class FFIArray(ctypes.Structure):
    """
    Convert sequence of structs or types to C-compatible void array
    """

    _fields_ = [
        ("data", ctypes.c_void_p),
        ("len", ctypes.c_size_t)
    ]

    @classmethod
    def from_param(cls, seq, data_type):
        """  Allow implicit conversions """
        return seq if isinstance(seq, cls) else cls(seq, data_type)

    def __init__(self, seq, data_type):
        self.len = len(seq)
        self._data_type = data_type
        self._DataTypeArr = self._data_type * self.len
        self.data = ctypes.cast(self._DataTypeArr(*seq), ctypes.c_void_p)

    def __str__(self):
        ret = super().__str__()  # Python 3
        #ret = super(FFIArray, self).__str__()  # !!! Python 2 !!!
        ret += "\nType: {0:s}\nLength: {1:d}\nElement Type: {2:}\nElements:\n".format(
            self.__class__.__name__, self.len, self._data_type)
        arr_data = self._DataTypeArr.from_address(self.data)
        for idx, item in enumerate(arr_data):
            ret += "  {0:d}: {1:}\n".format(idx, item)
        return ret


class Coordinates(ctypes.Structure):
    _fields_ = [
        ("lat", ctypes.c_double),
        ("lon", ctypes.c_double)
    ]

    def __str__(self):
        return "Latitude: {0:.3f}, Longitude: {1:.3f}".format(self.lat, self.lon)


def main():
    coord_list = [Coordinates(i + 1, i * 2) for i in range(4)]
    s0 = b"foo"
    s1 = b"bar"
    word_list = [s0, s1]

    coord_array = FFIArray(coord_list, data_type=Coordinates)
    print(coord_array)
    word_array = FFIArray(word_list, ctypes.c_char_p)
    print(word_array)


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    print("NumPy: {0:s}\n".format(np.version.version))
    main()
    print("\nDone.")

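Notes:

Fixed the bug in FFIArray.from_param (missing argument)
The NumPy usage in the initializer is awkward:

    Create a ctypes array from the bytes values
    Create an np array (out of the previous step's result)
    Create a ctypes pointer (out of the previous step's result)

Did some small refactoring of the original code

Output: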

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q058049957]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe" code03.py
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] 64bit on win32

NumPy: 1.16.2

<__main__.FFIArray object at 0x0000019CFEB63648>
Type: FFIArray
Length: 4
Element Type: <class '__main__.Coordinates'>
Elements:
  0: Latitude: 1.000, Longitude: 0.000
  1: Latitude: 2.000, Longitude: 2.000
  2: Latitude: 3.000, Longitude: 4.000
  3: Latitude: 4.000, Longitude: 6.000

<__main__.FFIArray object at 0x0000019CFEB637C8>
Type: FFIArray
Length: 2
Element Type: <class 'ctypes.c_char_p'>
Elements:
  0: b'foo'
  1: b'bar'


Done.
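
As a condensed illustration of the point made in the notes (the NumPy round trip in the initializer can be skipped), here is a minimal sketch of the direct path: build the ctypes array, cast it to a void pointer, and read it back. The helper names to_void_pointer and from_void_pointer are hypothetical and not part of code03.py; the sketch also assumes that the caller keeps the returned array alive while the pointer is in use.

```python
import ctypes


def to_void_pointer(seq, data_type):
    """Hypothetical helper: copy seq into a ctypes array, return (array, void pointer)."""
    array_type = data_type * len(seq)
    c_array = array_type(*seq)
    # The caller must keep c_array alive for as long as the pointer is in use.
    return c_array, ctypes.cast(c_array, ctypes.c_void_p)


def from_void_pointer(void_ptr, data_type, length):
    """Hypothetical helper: view the memory behind void_ptr as a ctypes array again."""
    return (data_type * length).from_address(void_ptr.value)


doubles, doubles_ptr = to_void_pointer([1.5, 2.5, 3.5], ctypes.c_double)
words, words_ptr = to_void_pointer([b"foo", b"bar"], ctypes.c_char_p)

doubles_back = list(from_void_pointer(doubles_ptr, ctypes.c_double, len(doubles)))
words_back = list(from_void_pointer(words_ptr, ctypes.c_char_p, len(words)))
print(doubles_back, words_back)
```

The same two calls work for c_double and c_char_p alike, because no buffer format string is ever parsed on this path.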

@ EDIT0

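PEP 3118 defines a standard for accessing (sharing) memory. Part of it is the set of format string specifiers used to convert between buffer contents and the relevant data. These are listed in [Python.Docs]: PEP 3118 - Additions to the struct string-syntax, and they extend the ones from [Python 3.Docs]: struct - Format Characters. ctypes types have an (!!! undocumented !!!) _type_ attribute, which (I presume) is used when converting from / to np: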

>>> import ctypes
>>>
>>> data_types = list()
>>>
>>> for attr_name in dir(ctypes):
...     attr = getattr(ctypes, attr_name, None)
...     if isinstance(attr, (type,)) and issubclass(attr, (ctypes._SimpleCData,)):
...         data_types.append((attr, attr_name))
...
>>> for data_type, data_type_name in data_types:
...     print("{0:} ({1:}) - {2:}".format(data_type, data_type_name, getattr(data_type, "_type_", None)))
...
<class 'ctypes.HRESULT'> (HRESULT) - l
<class '_ctypes._SimpleCData'> (_SimpleCData) - None
<class 'ctypes.c_bool'> (c_bool) - ?
<class 'ctypes.c_byte'> (c_byte) - b
<class 'ctypes.c_char'> (c_char) - c
<class 'ctypes.c_char_p'> (c_char_p) - z
<class 'ctypes.c_double'> (c_double) - d
<class 'ctypes.c_float'> (c_float) - f
<class 'ctypes.c_long'> (c_int) - l
<class 'ctypes.c_short'> (c_int16) - h
<class 'ctypes.c_long'> (c_int32) - l
<class 'ctypes.c_longlong'> (c_int64) - q
<class 'ctypes.c_byte'> (c_int8) - b
<class 'ctypes.c_long'> (c_long) - l
<class 'ctypes.c_double'> (c_longdouble) - d
<class 'ctypes.c_longlong'> (c_longlong) - q
<class 'ctypes.c_short'> (c_short) - h
<class 'ctypes.c_ulonglong'> (c_size_t) - Q
<class 'ctypes.c_longlong'> (c_ssize_t) - q
<class 'ctypes.c_ubyte'> (c_ubyte) - B
<class 'ctypes.c_ulong'> (c_uint) - L
<class 'ctypes.c_ushort'> (c_uint16) - H
<class 'ctypes.c_ulong'> (c_uint32) - L
<class 'ctypes.c_ulonglong'> (c_uint64) - Q
<class 'ctypes.c_ubyte'> (c_uint8) - B
<class 'ctypes.c_ulong'> (c_ulong) - L
<class 'ctypes.c_ulonglong'> (c_ulonglong) - Q
<class 'ctypes.c_ushort'> (c_ushort) - H
<class 'ctypes.c_void_p'> (c_void_p) - P
<class 'ctypes.c_void_p'> (c_voidp) - P
<class 'ctypes.c_wchar'> (c_wchar) - u
<class 'ctypes.c_wchar_p'> (c_wchar_p) - Z
<class 'ctypes.py_object'> (py_object) - O

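As seen above, the specifiers for c_char_p and c_wchar_p are either not found in the standard or don't match it. At first glance this looks like a ctypes bug, since it doesn't respect the standard, but I wouldn't rush to claim that (and maybe submit a bug) before investigating further, especially since bugs have already been reported in this area: [Python.Bugs]: ctypes arrays have incorrect buffer information (PEP-3118).

Below is a variant that also handles np arrays.

code04.py: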
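
One way to see the format string that a consumer such as NumPy receives, without involving NumPy at all, is to wrap a ctypes array in a memoryview. This is only a sketch: buffer_format is a hypothetical helper name, and the exact strings may differ between platforms (byte order prefix) and Python versions.

```python
import ctypes
import struct


def buffer_format(data_type, length=2):
    """Hypothetical helper: PEP 3118 format string exported by an array of data_type."""
    return memoryview((data_type * length)()).format


double_format = buffer_format(ctypes.c_double)
char_p_format = buffer_format(ctypes.c_char_p)
print("c_double:", double_format)
print("c_char_p:", char_p_format)

# The struct module only knows its own format characters,
# so it rejects the code reported for c_char_p.
try:
    struct.calcsize(char_p_format)
    struct_accepts_char_p = True
except struct.error:
    struct_accepts_char_p = False
print("struct accepts c_char_p format:", struct_accepts_char_p)
```

On the interpreter from the traceback in the question, the c_char_p array reports '<z', which is the string NumPy refuses to parse.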

#!/usr/bin/env python3

import sys
import ctypes
import numpy as np


class FFIArray(ctypes.Structure):
    """
    Convert sequence of structs or types to C-compatible void array
    """

    _fields_ = [
        ("data", ctypes.c_void_p),
        ("len", ctypes.c_size_t)
    ]

    _special_np_types_mapping = {
        ctypes.c_char_p: "S",
        ctypes.c_wchar_p: "U",
    }

    @classmethod
    def from_param(cls, seq, data_type=ctypes.c_void_p):
        """  Allow implicit conversions """
        return seq if isinstance(seq, cls) else cls(seq, data_type=data_type)

    def __init__(self, seq, data_type=ctypes.c_void_p):
        self.len = len(seq)
        self.__data_type = data_type  # Used just to hold the value passed to the initializer
        if isinstance(seq, np.ndarray):
            arr = np.ctypeslib.as_ctypes(seq)
            self._data_type = arr._type_  # !!! data_type is ignored in this case !!!
            self._DataTypeArr = arr.__class__
            self.data = ctypes.cast(arr, ctypes.c_void_p)
        else:
            self._data_type = data_type
            self._DataTypeArr = self._data_type * self.len
            self.data = ctypes.cast(self._DataTypeArr(*seq), ctypes.c_void_p)

    def __str__(self):
        strings = [super().__str__()]  # Python 3
        #strings = [super(FFIArray, self).__str__()]  # !!! Python 2 (ugly) !!!
        strings.append("Type: {0:s}\nElement Type: {1:}{2:}\nElements ({3:d}):".format(
            self.__class__.__name__, self._data_type,
            "" if self._data_type == self.__data_type else " ({0:})".format(self.__data_type),
            self.len))
        arr_data = self._DataTypeArr.from_address(self.data)
        for idx, item in enumerate(arr_data):
            strings.append("  {0:d}: {1:}".format(idx, item))
        return "\n".join(strings) + "\n"

    def to_np(self):
        arr_data = self._DataTypeArr.from_address(self.data)
        if self._data_type in self._special_np_types_mapping:
            dtype = np.dtype(self._special_np_types_mapping[self._data_type] + str(max(len(item) for item in arr_data)))
            np_arr = np.empty(self.len, dtype=dtype)
            for idx, item in enumerate(arr_data):
                np_arr[idx] = item
            return np_arr
        else:
            return np.ctypeslib.as_array(arr_data)


class Coordinates(ctypes.Structure):
    _fields_ = [
        ("lat", ctypes.c_double),
        ("lon", ctypes.c_double)
    ]

    def __str__(self):
        return "Latitude: {0:.3f}, Longitude: {1:.3f}".format(self.lat, self.lon)


def main():
    coord_list = [Coordinates(i + 1, i * 2) for i in range(4)]
    s0 = b"foo"
    s1 = b"bar (beyond all recognition)"  # To avoid having 2 equal strings
    word_list = [s0, s1]

    coord_array0 = FFIArray(coord_list, data_type=Coordinates)
    print(coord_array0)

    word_array0 = FFIArray(word_list, data_type=ctypes.c_char_p)
    print(word_array0)
    print("to_np: {0:}\n".format(word_array0.to_np()))

    np_array_src = np.array([0, -3.141593, 2.718282, -0.577, 0.618])
    float_array0 = FFIArray.from_param(np_array_src, data_type=None)
    print(float_array0)
    np_array_dst = float_array0.to_np()
    print("to_np: {0:}".format(np_array_dst))
    print("Equal np arrays: {0:}\n".format(all(np_array_src == np_array_dst)))

    empty_array0 = FFIArray.from_param([])
    print(empty_array0)


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    print("NumPy: {0:s}\n".format(np.version.version))
    main()
    print("\nDone.")

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q058049957]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe" code04.py
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] 64bit on win32

NumPy: 1.16.2

<__main__.FFIArray object at 0x000002484A2265C8>
Type: FFIArray
Element Type: <class '__main__.Coordinates'>
Elements (4):
  0: Latitude: 1.000, Longitude: 0.000
  1: Latitude: 2.000, Longitude: 2.000
  2: Latitude: 3.000, Longitude: 4.000
  3: Latitude: 4.000, Longitude: 6.000

<__main__.FFIArray object at 0x000002484A2267C8>
Type: FFIArray
Element Type: <class 'ctypes.c_char_p'>
Elements (2):
  0: b'foo'
  1: b'bar (beyond all recognition)'

to_np: [b'foo' b'bar (beyond all recognition)']

<__main__.FFIArray object at 0x000002484A2264C8>
Type: FFIArray
Element Type: <class 'ctypes.c_double'> (None)
Elements (5):
  0: 0.0
  1: -3.141593
  2: 2.718282
  3: -0.577
  4: 0.618

to_np: [ 0.       -3.141593  2.718282 -0.577     0.618   ]
Equal np arrays: True

<__main__.FFIArray object at 0x000002484A226848>
Type: FFIArray
Element Type: <class 'ctypes.c_void_p'>
Elements (0):


Done.

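Of course, this is just one of the possibilities. Another could involve using the (deprecated) [SciPy.Docs]: numpy.char.array, but I didn't want to overcomplicate things (without a clear scenario).

@ EDIT1

Added the FFIArray to np array conversion (I'm not an np expert, so it might look cumbersome to someone who is). Strings need special handling, so I didn't post a new code version (the changes aren't very significant) and worked on the previous one instead.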
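
To illustrate why strings need special handling, the sketch below uses only ctypes: a c_char_p array stores one pointer per element, so its size says nothing about the string lengths, whereas a fixed-width layout (which is what an "S<width>" NumPy dtype like the one built in to_np describes) stores the bytes themselves. The names FixedRow and fixed are hypothetical, and the per-element copy loop mirrors the loop in to_np rather than any NumPy API.

```python
import ctypes

words = [b"foo", b"bar (beyond all recognition)"]
ptr_array = (ctypes.c_char_p * len(words))(*words)

# Each slot holds a pointer, so the array size is independent of the strings.
pointer_bytes = ctypes.sizeof(ptr_array)

# Copy every string into a fixed-width, NUL-padded record.
width = max(len(word) for word in ptr_array)
FixedRow = ctypes.c_char * width
fixed = (FixedRow * len(words))()
for idx, word in enumerate(ptr_array):
    fixed[idx].value = word

fixed_bytes = ctypes.sizeof(fixed)
recovered = [row.value for row in fixed]
print(pointer_bytes, fixed_bytes, recovered)
```

The pointer array cannot simply be reinterpreted as the fixed-width block; the strings have to be copied element by element, which is what the special-cased branch in to_np does.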
