Generate dynamic nested JSON objects and arrays - python

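As the title explains, I have been trying to generate nested JSON objects. In this case I have for loops that read data out of a dictionary dic. Here is the code: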

f = open("test_json.txt", 'w')
flag = False
temp = ""
start = "{\n\t\"filename\"" + " : \"" +initial_filename+"\",\n\t\"data\"" +" : " +" [\n"
end = "\n\t]" +"\n}"
f.write(start)
for i, (key,value) in enumerate(dic.iteritems()):
    f.write("{\n\t\"keyword\":"+"\""+str(key)+"\""+",\n")
    f.write("\"term_freq\":"+str(len(value))+",\n")
    f.write("\"lists\":[\n\t")
    for item in value:
        f.write("{\n")
        f.write("\t\t\"occurance\" :"+str(item)+"\n")
        #Check last object
        if value.index(item)+1 == len(value):
            f.write("}\n")
            f.write("]\n")
        else:
            f.write("},") # close occurrence object
    # Check last item in dic
    if i == len(dic)-1:
        flag = True
    if(flag):
        f.write("}")
    else:
        f.write("},") #close lists object
        flag = False 

#check for flag
f.write("]") #close lists array 
f.write("}")

The expected output is:

{
"filename": "abc.pdf",
"data": [{
    "keyword": "irritation",
    "term_freq": 5,
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    }]
}, {
    "keyword": "bomber",
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    }],
    "term_freq": 5
}]
}

But currently I am getting the following output:

{
"filename": "abc.pdf",
"data": [{
    "keyword": "irritation",
    "term_freq": 5,
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    },]                // Here lies the problem "," before array(last element)
}, {
    "keyword": "bomber",
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    },],                  // Here lies the problem "," before array(last element)
    "term_freq": 5
}]
}

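Please help. I have been trying to solve this but keep failing. Please don't mark it as a duplicate: I have already checked the other answers and they didn't help at all.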

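Edit 1:
The input basically comes from a dictionary dic whose mapping type is <String, List>.
For example: "irritation" => [1, 3, 5, 7, 8]
where irritation is the key, mapped to a list of page numbers.
This is what the outer for loop reads: key is the keyword and value is the list of pages on which that keyword occurs.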

Edit 2:

import collections

dic = collections.defaultdict(list)  # declare the dictionary
dic[key].append(value)  # insert the values (key and value come from parsing code not shown here)
for key in dic:
    # dic[key] is the list of pages for that keyword
    print("%s : %s\n" % (key, dic[key]))  # print the data in the dictionary
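
For reference, a minimal sketch of how such a keyword-to-pages mapping can be built. The hits list is made-up sample data standing in for whatever the real parsing step produces, and the code is written for Python 3.

```python
from collections import defaultdict

# Hypothetical (keyword, page) pairs; the real ones come from the parsing step.
hits = [("irritation", 1), ("bomber", 2), ("irritation", 3), ("bomber", 4)]

dic = defaultdict(list)
for keyword, page in hits:
    dic[keyword].append(page)  # a missing key starts out as an empty list

for keyword, pages in dic.items():
    print(keyword, ":", pages)
```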

Answer

@andrea-f's answer looks good to me; here is another solution.

Feel free to pick either one :)

import json

dic = {
        "bomber": [1, 2, 3, 4, 5],
        "irritation": [1, 3, 5, 7, 8]
      }

filename = "abc.pdf"

json_dict = {}
data = []

for k, v in dic.items():  # items() works on both Python 2 and 3
  tmp_dict = {}
  tmp_dict["keyword"] = k
  tmp_dict["term_freq"] = len(v)
  tmp_dict["lists"] = [{"occurrance": i} for i in v]
  data.append(tmp_dict)

json_dict["filename"] = filename
json_dict["data"] = data

with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)

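Same idea: I first build one big json_dict and then dump it straight to JSON. I use the with statement to write the file so that it is closed properly even if an exception is raised.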
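
Building the structure first also removes the need to track "is this the last element?" by hand. Here is a small sketch of why such checks are fragile. It assumes a page list whose last value also appears earlier; this is only a guess at what triggered the trailing comma, since the question does not show the exact input. list.index() returns the position of the first match, so the last-item test never fires in that case. The pages list is hypothetical, and the code targets Python 3.

```python
import json

pages = [1, 2, 1]  # hypothetical: the last value repeats an earlier one

# list.index() finds the FIRST occurrence, so this test misses the real last item.
last_by_index = [pages.index(p) + 1 == len(pages) for p in pages]
print(last_by_index)  # [False, False, False]

# Comparing positions from enumerate() stays correct with duplicates.
last_by_pos = [i == len(pages) - 1 for i, _ in enumerate(pages)]
print(last_by_pos)  # [False, False, True]

# Letting json serialize a plain list avoids the bookkeeping altogether.
text = json.dumps([{"occurrance": p} for p in pages])
print(text)
```

The serialized text can be parsed back with json.loads(), which is a quick way to confirm that no stray comma slipped in.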

Also, if you need to tweak the json.dumps() output later on, have a look at the documentation of the json module.
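
As a starting point, here is a short sketch of three standard json.dumps() parameters: indent, sort_keys and separators. The sample dictionary is made up for illustration, and the code is written for Python 3.

```python
import json

sample = {"filename": "abc.pdf", "data": [{"keyword": "bomber", "term_freq": 5}]}

# Compact form: no whitespace after "," or ":".
compact = json.dumps(sample, separators=(",", ":"))
print(compact)

# Pretty form: 4-space indent, keys sorted alphabetically at every level.
pretty = json.dumps(sample, indent=4, sort_keys=True)
print(pretty)
```

With sort_keys=True the "data" key is written before "filename", because the keys are ordered alphabetically rather than in the order they were inserted.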

Edit

And just for fun, if you don't like the tmp variable, you can do the whole data for loop in a one-liner :)

json_dict["data"] = [{"keyword": k, "term_freq": len(v), "lists": [{"occurrance": i} for i in v]} for k, v in dic.items()]

Taken all the way, this gives a final solution that is not very readable, something like:

import json

json_dict = {
              "filename": "abc.pdf",
              "data": [{
                        "keyword": k,
                        "term_freq": len(v),
                        "lists": [{"occurrance": i} for i in v]
                       } for k, v in dic.items()]
            }

with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)

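Edit 2

It seems that what you want is not so much to save the JSON in the desired layout as to be able to read it.

In fact, you can also use json.dumps() to print your JSON.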

with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle)
    print(json.dumps(new_json_dict, indent=4, sort_keys=True))  # print what was just loaded

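However, there is still one problem here: "filename" is printed at the end, because sort_keys=True sorts the keys alphabetically and the d of data comes before the f of filename.

To force the order, you will have to use an OrderedDict when building the dictionary. Note that the syntax is ugly (imo) with Python 2.X.

Here is the new complete solution ;)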

import json
from collections import OrderedDict

dic = {
        'bomber': [1, 2, 3, 4, 5],
        'irritation': [1, 3, 5, 7, 8]
      }

json_dict = OrderedDict([
              ('filename', 'abc.pdf'),
              ('data', [ OrderedDict([
                                        ('keyword', k),
                                        ('term_freq', len(v)),
                                        ('lists', [{'occurrance': i} for i in v])
                                     ]) for k, v in dic.items()])
            ])

with open('abc.json', 'w') as outfile:
    json.dump(json_dict, outfile)


# Now read the ordered json file back

with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle, object_pairs_hook=OrderedDict)
    print(json.dumps(new_json_dict, indent=4))  # print what was just loaded

This will output:

{
    "filename": "abc.pdf", 
    "data": [
        {
            "keyword": "bomber", 
            "term_freq": 5, 
            "lists": [
                {
                    "occurrance": 1
                }, 
                {
                    "occurrance": 2
                }, 
                {
                    "occurrance": 3
                }, 
                {
                    "occurrance": 4
                }, 
                {
                    "occurrance": 5
                }
            ]
        }, 
        {
            "keyword": "irritation", 
            "term_freq": 5, 
            "lists": [
                {
                    "occurrance": 1
                }, 
                {
                    "occurrance": 3
                }, 
                {
                    "occurrance": 5
                }, 
                {
                    "occurrance": 7
                }, 
                {
                    "occurrance": 8
                }
            ]
        }
    ]
}

But be careful: most of the time it is better to save a regular .json file, so that it stays usable across languages.
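
For readers on Python 3.7 or later: a plain dict keeps insertion order as a language guarantee, and json.dump() writes keys in that order, so OrderedDict is not strictly needed there. The sketch below assumes Python 3.7+ and uses an in-memory io.StringIO buffer in place of the abc.json file, purely to stay self-contained.

```python
import io
import json

dic = {"bomber": [1, 2, 3, 4, 5], "irritation": [1, 3, 5, 7, 8]}

json_dict = {
    "filename": "abc.pdf",
    "data": [
        {
            "keyword": k,
            "term_freq": len(v),
            "lists": [{"occurrance": i} for i in v],
        }
        for k, v in dic.items()
    ],
}

buffer = io.StringIO()          # stands in for open("abc.json", "w")
json.dump(json_dict, buffer, indent=4)

buffer.seek(0)
loaded = json.load(buffer)      # plain dicts, keys in file order
print(list(loaded))             # ['filename', 'data']
print(list(loaded["data"][0]))  # ['keyword', 'term_freq', 'lists']
```

Key order is still only a presentation detail as far as JSON itself is concerned, so consumers in other languages should not rely on it.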
