Extracting xyz coordinate values from a string into a list - python

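I have some data, read in as strings from a file, in the format shown below. What I want to do is create a vector (stored as a list in Python) that gives the difference in the x, y, and z directions between [x2, y2, z2] and [x1, y1, z1] for the strings shown below.

Once I have [x2, y2, z2] and [x1, y1, z1] extracted as lists of integers, I should be fine calculating the difference vector. What I need help with is creating those [x2, y2, z2] and [x1, y1, z1] lists from the data below.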

data = """x1=45 y1=74 z1=55 col1=[255, 255, 255] x2=46 y2=74 z2=55 col2=[255, 255, 255] 
x1=34 y1=12 z1=15 col1=[255, 255, 255] x2=35 y2=12 z2=15 col2=[255, 255, 255] 
x1=22 y1=33 z1=24 col1=[255, 255, 255] x2=23 y2=33 z2=24 col2=[255, 255, 255] 
x1=16 y1=45 z1=58 col1=[255, 255, 255] x2=17 y2=45 z2=58 col2=[255, 255, 255] 
x1=27 y1=66 z1=21 col1=[255, 255, 255] x2=28 y2=66 z2=21 col2=[255, 255, 255]
"""

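Just to clarify, I only need to figure out how to extract the [x2, y2, z2] and [x1, y1, z1] lists for a single line. I can work out for myself how to loop over the lines and calculate the difference vector for each one. It is just extracting the relevant data from each line and getting it into a usable format that is giving me trouble.

I suspect that regular expressions are a potential way to extract this information. I have looked at the documentation at https://docs.python.org/2/library/re.html and came away completely confused. I just want an approach that is easy to understand.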

Solution

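I know exactly where you're coming from. Until yesterday I didn't understand regular expressions either; they always confused me. But once you understand them, you realize how powerful they are. Here is one possible solution to your problem. I'll also give some intuition for how the regular expression works, to hopefully reduce the confusion around regular expressions.

In the code below, I'm assuming that you process one line at a time and that the data always has the same format.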

import re

# Example of just one line of the data
line = """x1=45 y1=74 z1=55 col1=[255, 255, 255] x2=46 y2=74 z2=55 col2=[255, 255, 255] """

# Extract the relevant x1, y1, z1 values, stored as a list of strings
p1 = re.findall(r"[x-z][1]=([\d]*)", line)

# Extract the relevant x2, y2, z2 values, stored as a list of strings
p2 = re.findall(r"[x-z][2]=([\d]*)", line)

# Convert the elements in each list from strings to integers
p1 = [int(x) for x in p1]
p2 = [int(x) for x in p2]

# Calculate difference vector (I'm assuming this is what you're trying to do)
diff = [p2[i] - p1[i] for i in range(len(p2))]
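
For completeness, here is a minimal sketch of how the single-line approach above might be applied to every line of the data. The helper names extract_point and difference_vectors are hypothetical, introduced only for illustration, and the pattern uses \d+ (one or more digits) in place of [\d]* so that an empty value cannot reach int(). It assumes every line has the same format as the sample.

```python
import re

# A shortened copy of the sample data from the question
data = """x1=45 y1=74 z1=55 col1=[255, 255, 255] x2=46 y2=74 z2=55 col2=[255, 255, 255]
x1=34 y1=12 z1=15 col1=[255, 255, 255] x2=35 y2=12 z2=15 col2=[255, 255, 255]
x1=22 y1=33 z1=24 col1=[255, 255, 255] x2=23 y2=33 z2=24 col2=[255, 255, 255]
"""

def extract_point(line, index):
    """Return [x, y, z] as integers for point number `index` (1 or 2)."""
    # \d+ requires at least one digit, so an empty value such as "x1=" is skipped
    values = re.findall(r"[x-z]%d=(\d+)" % index, line)
    return [int(v) for v in values]

def difference_vectors(text):
    """Return one [dx, dy, dz] list per non-empty line of `text`."""
    diffs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        p1 = extract_point(line, 1)
        p2 = extract_point(line, 2)
        # Pair up matching coordinates and subtract point 1 from point 2
        diffs.append([b - a for a, b in zip(p1, p2)])
    return diffs

print(difference_vectors(data))  # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
```

Using zip here pairs the coordinates directly, which avoids indexing with range(len(...)); the result is the same as the list comprehension above.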

A brief explanation of the symbols in the regular expression

# EXPLANATION OF THE REGEX. 
# Finds segments of strings that: 
#     [x-z]    start with a letter x,y, or z
#     [1]      followed by the number 1
#     =        followed by the equals sign
# 
#     But don't return any of that section of the string, only use that
#     information to then extract the following values that we do actually want 
#
#     (        Return the parts of the string that have the following pattern, 
#              given that they were preceded by the previous pattern
# 
#     [\d]     match a single numeric digit (0-9)
#     *        repeat the previous item zero or more times, i.e. keep
#              moving forward for as long as the current character is a digit
#     )        end of the pattern, now we can return the substring.
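
To see what the parentheses actually change, it may help to run the same pattern with and without them. The short sketch below is only an illustration; the variable names whole, digits, and pairs are made up for this example.

```python
import re

line = "x1=45 y1=74 z1=55 col1=[255, 255, 255] x2=46 y2=74 z2=55 col2=[255, 255, 255]"

# No parentheses: findall returns the whole matched text
whole = re.findall(r"[x-z][1]=[\d]*", line)
print(whole)   # ['x1=45', 'y1=74', 'z1=55']

# One pair of parentheses: findall returns only the captured digits
digits = re.findall(r"[x-z][1]=([\d]*)", line)
print(digits)  # ['45', '74', '55']

# Two pairs of parentheses: findall returns one tuple per match
pairs = re.findall(r"([x-z])1=(\d+)", line)
print(pairs)   # [('x', '45'), ('y', '74'), ('z', '55')]
```

Note that col1 is never picked up, because the character just before the 1 is an l, which is outside the [x-z] range.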