Remove some elements of one dictionary based on another dictionary - python

Suppose I have two dictionaries like these.

a  = {'file1': [('1.txt', 1.0),
                ('3.txt', 0.4),
                ('2.txt', 0.3)],
      'file2': [('1.txt', 0.5),
                ('2.txt', 0.2),
                ('3.txt', 1.0)]}

b =  {'file1': [('1.txt', 9),
                ('2.txt', 1),
                ('3.txt', 5),
                ('4.txt', 4)],
      'file2': [('1.txt', 0),
                ('2.txt', 2),
                ('3.txt', 3),
                ('4.txt', 0)]}

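I wrote a function that filters dictionary b based on dictionary a: for each key, only the entries of b whose file name also appears under the same key in a should be kept.

The expected result of the function is: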

c =  {'file1': [('1.txt', 9),
                ('2.txt', 1),
                ('3.txt', 5)],
      'file2': [('1.txt', 0),
                ('2.txt', 2),
                ('3.txt', 3)]}

So far I have written the following function, but its output is not what I want.

def filter():
    c = {file1:set((txt1,value2)
               for file1,dic1 in a.items()
               for file2,dic2 in b.items()
               for txt1,value1 in dic1
               for txt2,value2 in dic2
               if txt1 == txt2 and file1 == file2)
         for file1,dic1 in a.items()}

    pp({k:v for k,v in c.items()})

The current output looks like this:

{'file1': {('1.txt', 0),
           ('1.txt', 9),
           ('2.txt', 1),
           ('2.txt', 2),
           ('3.txt', 3),
           ('3.txt', 5)},
 'file2': {('1.txt', 0),
           ('1.txt', 9),
           ('2.txt', 1),
           ('2.txt', 2),
           ('3.txt', 3),
           ('3.txt', 5)}}

I don't know what went wrong.
Any help would be appreciated.

Answer from a Python expert

You can use collections.defaultdict for this kind of task:

>>> from collections import defaultdict
>>> d=defaultdict(list)
>>> for k,v in b.items():
...      for i in v:
...         if i[0] in zip(*a[k])[0]: #in python 3 next(zip(*a[k]))
...              d[k].append(i)
... 
>>> d
defaultdict(<type 'list'>, {'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3)], 'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5)]})

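Note that to check whether an entry of b also exists in a, you can use the zip function to get the file names: zip(*a[k]) transposes the list of (name, value) pairs, so its first element is the tuple of file names.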
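To make the transposition concrete, here is a minimal Python 3 sketch of what zip(*pairs) produces for one of the lists from the question. The variable names pairs, names and scores are only illustrative and do not come from the original answer.

```python
# One list of (file name, score) pairs, copied from a['file1'] in the question.
pairs = [('1.txt', 1.0), ('3.txt', 0.4), ('2.txt', 0.3)]

# zip(*pairs) regroups the pairs column by column:
# first all file names, then all scores.
names, scores = zip(*pairs)

print(names)             # ('1.txt', '3.txt', '2.txt')
print(scores)            # (1.0, 0.4, 0.3)
print('4.txt' in names)  # False, so ('4.txt', ...) would be filtered out
```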

Alternatively, you can use the dict.setdefault() method:

>>> c={}
>>> for k,v in b.items():
...      for i in v:
...         if i[0] in zip(*a[k])[0]:
...            c.setdefault(k,[]).append(i)
... 
>>> c
{'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3)], 'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5)]}

Note: in Python 3, zip returns an iterator, which cannot be indexed, so you need to change zip(*a[k])[0] to next(zip(*a[k])) in the if condition.
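As a hedged Python 3 sketch of the same idea, the filtering can also be written with a set of allowed file names per key, which avoids calling zip again for every entry. The function name filter_by_names and its parameters are hypothetical, not part of the original answer. Unlike a[k], it uses reference.get(key, []), so a key missing from a yields an empty list instead of raising KeyError; that behavior is an assumption about what is wanted. It also avoids the likely cause of the unexpected output in the question: the inner generator there loops over a.items() and b.items() again, rebinding file1, so every key receives the same set of matches.

```python
a = {'file1': [('1.txt', 1.0), ('3.txt', 0.4), ('2.txt', 0.3)],
     'file2': [('1.txt', 0.5), ('2.txt', 0.2), ('3.txt', 1.0)]}

b = {'file1': [('1.txt', 9), ('2.txt', 1), ('3.txt', 5), ('4.txt', 4)],
     'file2': [('1.txt', 0), ('2.txt', 2), ('3.txt', 3), ('4.txt', 0)]}


def filter_by_names(source, reference):
    """Keep only the (name, value) pairs of source whose name also
    appears under the same key in reference."""
    result = {}
    for key, pairs in source.items():
        # File names allowed for this key; empty if the key is absent.
        allowed = {name for name, _ in reference.get(key, [])}
        # A list comprehension preserves the original order of source.
        result[key] = [(name, value) for name, value in pairs
                       if name in allowed]
    return result


c = filter_by_names(b, a)
print(c)
```

With the dictionaries from the question, c contains the three entries '1.txt', '2.txt' and '3.txt' for both 'file1' and 'file2', as lists in the same order as in b.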
