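I want to see whether it is possible to implement non-maximum suppression in Node.js on a Google Cloud Vision API response. For example, the response looks like this: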
[
{
"mid": "/m/09728",
"languageCode": "",
"name": "Bread",
"score": 0.8558391332626343,
"boundingPoly": {
"vertices": [],
"normalizedVertices": [
{
"x": 0.010737711563706398,
"y": 0.26679491996765137
},
{
"x": 0.9930269718170166,
"y": 0.26679491996765137
},
{
"x": 0.9930269718170166,
"y": 0.7275580167770386
},
{
"x": 0.010737711563706398,
"y": 0.7275580167770386
}
]
}
},
{
"mid": "/m/052lwg6",
"languageCode": "",
"name": "Baked goods",
"score": 0.6180902123451233,
"boundingPoly": {
"vertices": [],
"normalizedVertices": [
{
"x": 0.010737711563706398,
"y": 0.26679491996765137
},
{
"x": 0.9930269718170166,
"y": 0.26679491996765137
},
{
"x": 0.9930269718170166,
"y": 0.7275580167770386
},
{
"x": 0.010737711563706398,
"y": 0.7275580167770386
}
]
}
},
{
"mid": "/m/02wbm",
"languageCode": "",
"name": "Food",
"score": 0.5861617922782898,
"boundingPoly": {
"vertices": [],
"normalizedVertices": [
{
"x": 0.321802020072937,
"y": 0.2874892055988312
},
{
"x": 0.999139130115509,
"y": 0.2874892055988312
},
{
"x": 0.999139130115509,
"y": 0.6866284608840942
},
{
"x": 0.321802020072937,
"y": 0.6866284608840942
}
]
}
}
]
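So in effect the bounding box that should come out is the Food one, like this:
I have found an example that does this in Python, but that would mean spawning a child process from Node to run the Python script and then pulling the response back, which is a bit dirty.
Obviously, the box values from Google need to be multiplied by the image width and height, so, for example, assuming the image is 288 x 512: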
const left = Math.round(vertices[0].x * 288);
const top = Math.round(vertices[0].y * 512);
const width = Math.round((vertices[2].x * 288)) - left;
const height = Math.round((vertices[2].y * 512)) - top;
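The scaling above can be wrapped in a small helper. This is only a minimal sketch: `toPixelBox` is a hypothetical name, and it assumes that the first and third entries of `normalizedVertices` are the top-left and bottom-right corners, as they are in the response shown earlier.

```javascript
// Hypothetical helper: convert a Vision API boundingPoly into a pixel-space box.
// Assumes normalizedVertices[0] is the top-left corner and [2] the bottom-right.
function toPixelBox(boundingPoly, imageWidth, imageHeight) {
  const verts = boundingPoly.normalizedVertices;
  const left = Math.round(verts[0].x * imageWidth);
  const top = Math.round(verts[0].y * imageHeight);
  const right = Math.round(verts[2].x * imageWidth);
  const bottom = Math.round(verts[2].y * imageHeight);
  return { left, top, width: right - left, height: bottom - top };
}

// The "Bread" box from the response, for a 288 x 512 image.
const breadPoly = {
  normalizedVertices: [
    { x: 0.010737711563706398, y: 0.26679491996765137 },
    { x: 0.9930269718170166, y: 0.26679491996765137 },
    { x: 0.9930269718170166, y: 0.7275580167770386 },
    { x: 0.010737711563706398, y: 0.7275580167770386 },
  ],
};
console.log(toPixelBox(breadPoly, 288, 512));
```

Passing the image size as arguments keeps the 288 and 512 out of the function body, so the same helper works for images of other sizes.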
The script I adapted looks like this (it just hard-codes the threshold and takes an array of arrays from the command line):
# import the necessary packages
import numpy as np
import sys
import json

# Malisiewicz et al.
def non_max_suppression_fast():
    overlapThresh = 0.3
    # the boxes arrive as a JSON array of [x1, y1, x2, y2] arrays;
    # wrap them in a NumPy array so the slicing below works
    boxes = np.array(json.loads(sys.argv[1]))
    # if there are no boxes, return an empty list
    if len(boxes) == 0:
        return []
    # if the bounding boxes are integers, convert them to floats --
    # this is important since we'll be doing a bunch of divisions
    if boxes.dtype.kind == "i":
        boxes = boxes.astype("float")
    # initialize the list of picked indexes
    pick = []
    # grab the coordinates of the bounding boxes
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]
    # compute the area of the bounding boxes and sort the bounding
    # boxes by the bottom-right y-coordinate of the bounding box
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(y2)
    # keep looping while some indexes still remain in the indexes
    # list
    while len(idxs) > 0:
        # grab the last index in the indexes list and add the
        # index value to the list of picked indexes
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)
        # find the largest (x, y) coordinates for the start of
        # the bounding box and the smallest (x, y) coordinates
        # for the end of the bounding box
        xx1 = np.maximum(x1[i], x1[idxs[:last]])
        yy1 = np.maximum(y1[i], y1[idxs[:last]])
        xx2 = np.minimum(x2[i], x2[idxs[:last]])
        yy2 = np.minimum(y2[i], y2[idxs[:last]])
        # compute the width and height of the overlapping region
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)
        # compute the ratio of overlap
        overlap = (w * h) / area[idxs[:last]]
        # delete all indexes from the index list that have an
        # overlap greater than the threshold, plus the picked one
        idxs = np.delete(idxs, np.concatenate(([last],
            np.where(overlap > overlapThresh)[0])))
    # return only the bounding boxes that were picked using the
    # integer data type
    return boxes[pick].astype("int")

if __name__ == "__main__":
    # print the surviving boxes as JSON so the calling process can parse them
    print(json.dumps(np.asarray(non_max_suppression_fast()).tolist()))
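Can anyone give me some pointers here? I'm sure it is just a matter of computing the total area of each box, but I can't quite get my head around it.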
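The area calculation in question is usually expressed as intersection over union (IoU): the area two boxes share divided by the area they cover together. The following is a minimal sketch, assuming boxes are given as `[x1, y1, x2, y2]` corners; `iou` is a hypothetical helper name, and the coordinates are the Bread and Food boxes from the response rounded to four decimals. Note that the Python script above divides the intersection by the area of the other box rather than by the union, which is a slightly different measure.

```javascript
// Intersection over union for two boxes in [x1, y1, x2, y2] form.
function iou(a, b) {
  // Width and height of the overlapping region (0 if the boxes are disjoint).
  const interW = Math.max(0, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
  const interH = Math.max(0, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
  const inter = interW * interH;
  const areaA = (a[2] - a[0]) * (a[3] - a[1]);
  const areaB = (b[2] - b[0]) * (b[3] - b[1]);
  const union = areaA + areaB - inter;
  return union > 0 ? inter / union : 0;
}

// "Bread" and "Food" boxes from the response, in normalized coordinates.
const bread = [0.0107, 0.2668, 0.9930, 0.7276];
const food = [0.3218, 0.2875, 0.9991, 0.6866];
console.log(iou(bread, bread));           // identical boxes overlap completely
console.log(iou(bread, food).toFixed(2)); // partial overlap
```

For these two boxes the ratio comes out at roughly 0.59, so whether Food survives depends on where the overlap threshold is set.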
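Answer
OK, so this is actually very simple if you use Tensorflow.js. Use the following function on the response from Google Vision:
Note that 288 and 512 are the width and height of my image; you will need to set your own.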
const tf = require('@tensorflow/tfjs');

async function nonMaxSuppression(objects) {
  const boxes = [];
  const scores = [];
  // Loop through the objects and convert the vertices into the right format.
  for (const object of objects) {
    const verts = object.boundingPoly.normalizedVertices;
    // As noted above, 288 and 512 are my image width and image height.
    const left = Math.round(verts[0].x * 288);
    const top = Math.round(verts[0].y * 512);
    const right = Math.round(verts[2].x * 288);
    const bottom = Math.round(verts[2].y * 512);
    // We need an array of boxes AND an array of scores.
    // tf.image.nonMaxSuppression expects each box as [y1, x1, y2, x2].
    boxes.push([top, left, bottom, right]);
    scores.push(object.score);
  }
  // Params are boxes, scores, max number of boxes to select.
  const selected = tf.image.nonMaxSuppression(boxes, scores, 2);
  // The result is a 1-D tensor of zero-based indices into `objects`,
  // so read its values rather than its internal `id` property.
  const indices = await selected.array();
  selected.dispose();
  return indices;
}
Ta-da, bananas!
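If pulling in Tensorflow.js only for this step feels heavy, greedy non-max suppression is short enough to write by hand. The sketch below is not the Tensorflow.js implementation: `greedyNms` and its default `iouThreshold` of 0.5 are assumptions for illustration. It takes boxes as `[x1, y1, x2, y2]` plus a parallel array of scores and returns the zero-based indices that survive. The sample boxes are the three detections from the question scaled to a 288 x 512 image.

```javascript
// Greedy non-max suppression: visit boxes from highest to lowest score and
// keep a box only if it does not overlap an already kept box too much.
function greedyNms(boxes, scores, iouThreshold = 0.5) {
  const area = (b) => Math.max(0, b[2] - b[0]) * Math.max(0, b[3] - b[1]);
  const overlap = (a, b) => {
    const w = Math.max(0, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
    const h = Math.max(0, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
    const inter = w * h;
    const union = area(a) + area(b) - inter;
    return union > 0 ? inter / union : 0;
  };

  // Candidate indices, best score first.
  const order = scores.map((_, i) => i).sort((i, j) => scores[j] - scores[i]);
  const keep = [];
  for (const i of order) {
    const suppressed = keep.some((k) => overlap(boxes[k], boxes[i]) > iouThreshold);
    if (!suppressed) keep.push(i);
  }
  return keep;
}

// Pixel boxes for Bread, Baked goods and Food on a 288 x 512 image.
const pixelBoxes = [
  [3, 137, 286, 373],
  [3, 137, 286, 373],
  [93, 147, 288, 352],
];
const pixelScores = [0.8558, 0.6181, 0.5862];
console.log(greedyNms(pixelBoxes, pixelScores));      // default threshold
console.log(greedyNms(pixelBoxes, pixelScores, 0.6)); // looser threshold
```

With the default threshold only Bread (index 0) survives, because Baked goods has an identical box and Food overlaps Bread with an IoU of about 0.59. Raising the threshold to 0.6 lets Food (index 2) through as well.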