A Close Reading of the YOLOv3 Source Code (12): the get_map Function


The code discussed here is bubbliiing's YOLOv3 implementation on GitHub: github.com/bubbliiiing…

Reading the source code

The mAP part

The get_map function

def get_map(MINOVERLAP, draw_plot, path = './map_out'):
    """ Arguments: MINOVERLAP is the IoU threshold for the target mAP0.x; draw_plot says whether to draw plots (True is passed in). """
    
    # For each folder path below: use it if it exists, create it otherwise
    GT_PATH             = os.path.join(path, 'ground-truth')
    DR_PATH             = os.path.join(path, 'detection-results')
    IMG_PATH            = os.path.join(path, 'images-optional')
    TEMP_FILES_PATH     = os.path.join(path, '.temp_files')
    RESULTS_FILES_PATH  = os.path.join(path, 'results')

    show_animation = True
    if os.path.exists(IMG_PATH): 
        for dirpath, dirnames, files in os.walk(IMG_PATH):
            if not files:
                show_animation = False
    else:
        show_animation = False
	
    # Create a temporary folder for intermediate files
    if not os.path.exists(TEMP_FILES_PATH):
        os.makedirs(TEMP_FILES_PATH)
        
    # If a results folder already exists, delete it
    if os.path.exists(RESULTS_FILES_PATH):
        shutil.rmtree(RESULTS_FILES_PATH)
    # When drawing plots we need folders for four metrics: AP, F1, Recall, Precision
    if draw_plot:
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "AP"))
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "F1"))
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "Recall"))
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "Precision"))
    if show_animation:
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "images", "detections_one_by_one"))

    ground_truth_files_list = glob.glob(GT_PATH + '/*.txt')
    if len(ground_truth_files_list) == 0:
        error("Error: No ground-truth files found!")
    ground_truth_files_list.sort()
    gt_counter_per_class     = {}
    counter_images_per_class = {}

    # The whole block below builds temporary files that store JSON data
    for txt_file in ground_truth_files_list:
        # Loop over the ground-truth files and extract the image id
        file_id     = txt_file.split(".txt", 1)[0]
        file_id     = os.path.basename(os.path.normpath(file_id))
        # Locate the matching detection-results file
        temp_path   = os.path.join(DR_PATH, (file_id + ".txt"))
        if not os.path.exists(temp_path):
            error_msg = "Error. File not found: {}\n".format(temp_path)
            error(error_msg)
        # List of lines; each ground-truth line holds: class name, left, top, right, bottom [, difficult]
        lines_list      = file_lines_to_list(txt_file)
        bounding_boxes  = []
        # Flag for objects marked difficult
        is_difficult    = False
        already_seen_classes = []
        for line in lines_list:
            # Class names can contain spaces ("xxx xxx"), so a plain split() may yield too many fields
            # try handles the common case; the except branch deals with multi-word class names
            try:
                if "difficult" in line:
                    class_name, left, top, right, bottom, _difficult = line.split()
                    is_difficult = True
                else:
                    class_name, left, top, right, bottom = line.split()
            except:
                # Fall back: read the fields from the right-hand end of the split
                if "difficult" in line:
                    line_split  = line.split()
                    _difficult  = line_split[-1]
                    bottom      = line_split[-2]
                    right       = line_split[-3]
                    top         = line_split[-4]
                    left        = line_split[-5]
                    class_name  = ""
                    # Reassemble the class name
                    for name in line_split[:-5]:
                        class_name += name + " "
                    class_name  = class_name[:-1]
                    is_difficult = True
                else:
                    line_split  = line.split()
                    bottom      = line_split[-1]
                    right       = line_split[-2]
                    top         = line_split[-3]
                    left        = line_split[-4]
                    class_name  = ""
                    # Reassemble the class name
                    for name in line_split[:-4]:
                        class_name += name + " "
                    class_name = class_name[:-1]

            # Join the four coordinates into one string
            bbox = left + " " + top + " " + right + " " + bottom
            # Store the data in one of two formats depending on the difficult flag
            if is_difficult:
                bounding_boxes.append({"class_name":class_name, "bbox":bbox, "used":False, "difficult":True})
                is_difficult = False
            else:
                # Note the "used" flag: if the ground truth contains one aeroplane but we predict two,
                # only one prediction can be a true positive; "used" records whether this ground-truth object has already been matched
                bounding_boxes.append({"class_name":class_name, "bbox":bbox, "used":False})
                # gt_counter_per_class counts the ground-truth objects of each class
                if class_name in gt_counter_per_class:
                    gt_counter_per_class[class_name] += 1
                else:
                    gt_counter_per_class[class_name] = 1
				
                # Count the images per class
                if class_name not in already_seen_classes:
                    if class_name in counter_images_per_class:
                        counter_images_per_class[class_name] += 1
                    else:
                        counter_images_per_class[class_name] = 1
                    already_seen_classes.append(class_name)
		
        # Write the JSON data above into a temporary file
        # gt_counter_per_class: {'aeroplane': 28, 'person': 451, 'boat': 35, 'dog': 59, 'sheep': 14, 'cat': 33, 'tvmonitor': 29, 'bottle': 35, 'bus': 27, 'car': 109, 'bird': 70, 'horse': 40, 'pottedplant': 47, 'chair': 60, 'motorbike': 35, 'sofa': 31, 'diningtable': 17, 'bicycle': 36, 'cow': 22, 'train': 28}, i.e. there are 28 aeroplane objects across all images
        # counter_images_per_class: {'aeroplane': 25, 'person': 204, 'boat': 21, 'dog': 50, 'sheep': 5, 'cat': 32, 'tvmonitor': 27, 'bottle': 26, 'bus': 17, 'car': 69, 'bird': 38, 'horse': 31, 'pottedplant': 27, 'chair': 40, 'motorbike': 25, 'sofa': 28, 'diningtable': 14, 'bicycle': 28, 'cow': 13, 'train': 21}, i.e. 25 images contain at least one aeroplane; an image with several aeroplanes is counted only once
        with open(TEMP_FILES_PATH + "/" + file_id + "_ground_truth.json", 'w') as outfile:
            json.dump(bounding_boxes, outfile)

    # List of classes
    gt_classes  = list(gt_counter_per_class.keys())
    # Sorted alphabetically
    gt_classes  = sorted(gt_classes)
    # Total number of classes
    n_classes   = len(gt_classes)

    # Collect the detection-results files
    dr_files_list = glob.glob(DR_PATH + '/*.txt')
    # Sort them
    dr_files_list.sort()
    
    
    # Loop over the classes. The idea: scan every detection file, gather all boxes predicted for one class,
    # sort them by confidence and store them as JSON. Because the boxes are confidence-sorted, we can later
    # walk through them in order and try to match each one against the ground-truth objects of that class.
    for class_index, class_name in enumerate(gt_classes):
        bounding_boxes = []
        # Loop over the detection files
        for txt_file in dr_files_list:
            # Same pattern as above
            file_id = txt_file.split(".txt",1)[0]
            file_id = os.path.basename(os.path.normpath(file_id))
            temp_path = os.path.join(GT_PATH, (file_id + ".txt"))
            if class_index == 0:
                if not os.path.exists(temp_path):
                    error_msg = "Error. File not found: {}\n".format(temp_path)
                    error(error_msg)
            # Read the lines of the file
            lines = file_lines_to_list(txt_file)
            for line in lines:
                # Class names can contain spaces ("xxx xxx"), so a plain split() may yield too many fields
                # try handles the common case; the except branch deals with multi-word class names
                try:
                    tmp_class_name, confidence, left, top, right, bottom = line.split()
                except:
                    line_split      = line.split()
                    bottom          = line_split[-1]
                    right           = line_split[-2]
                    top             = line_split[-3]
                    left            = line_split[-4]
                    confidence      = line_split[-5]
                    tmp_class_name  = ""
                    for name in line_split[:-5]:
                        tmp_class_name += name + " "
                    tmp_class_name  = tmp_class_name[:-1]

                if tmp_class_name == class_name:
                    bbox = left + " " + top + " " + right + " " + bottom
                    # Two fields matter here: confidence is used for sorting, so the strongest predictions come first;
                    # file_id identifies the ground-truth file to compare against
                    bounding_boxes.append({"confidence":confidence, "file_id":file_id, "bbox":bbox})
        # Sort by descending confidence
        bounding_boxes.sort(key=lambda x:float(x['confidence']), reverse=True)
        
        # Write the intermediate data to a JSON file
        with open(TEMP_FILES_PATH + "/" + class_name + "_dr.json", 'w') as outfile:
            json.dump(bounding_boxes, outfile)

    sum_AP = 0.0
    ap_dictionary = {}
    lamr_dictionary = {}
    # Open the results file to record the output
    with open(RESULTS_FILES_PATH + "/results.txt", 'w') as results_file:
        results_file.write("# AP and precision/recall per class\n")
        count_true_positives = {}
		
        # Scan each class in turn
        for class_index, class_name in enumerate(gt_classes):
            count_true_positives[class_name] = 0
            
            # Load every predicted box of this class, already sorted by confidence
            dr_file = TEMP_FILES_PATH + "/" + class_name + "_dr.json"
            dr_data = json.load(open(dr_file))

            # Build tp/fp lists of the same length as the prediction list, to record correct and incorrect detections
            nd          = len(dr_data)
            tp          = [0] * nd
            fp          = [0] * nd
            score       = [0] * nd
            
            # Index of the last detection whose confidence exceeds 0.5
            score05_idx = 0
            
            # 然后我们对预测框列表进行循环
            for idx, detection in enumerate(dr_data):
                # image_id
                file_id     = detection["file_id"]
                # confidence
                score[idx]  = float(detection["confidence"])
                
                # Record the index: it marks how many detections have confidence above 0.5 (used later for the threshold-0.5 metrics)
                if score[idx] > 0.5:
                    score05_idx = idx
				
                # Drawing/animation-related operations, to be expanded later
                if show_animation:
                    ground_truth_img = glob.glob1(IMG_PATH, file_id + ".*")
                    if len(ground_truth_img) == 0:
                        error("Error. Image not found with id: " + file_id)
                    elif len(ground_truth_img) > 1:
                        error("Error. Multiple image with id: " + file_id)
                    else:
                        img = cv2.imread(IMG_PATH + "/" + ground_truth_img[0])
                        img_cumulative_path = RESULTS_FILES_PATH + "/images/" + ground_truth_img[0]
                        if os.path.isfile(img_cumulative_path):
                            img_cumulative = cv2.imread(img_cumulative_path)
                        else:
                            img_cumulative = img.copy()
                        bottom_border = 60
                        BLACK = [0, 0, 0]
                        img = cv2.copyMakeBorder(img, 0, bottom_border, 0, 0, cv2.BORDER_CONSTANT, value=BLACK)

                # Load the ground-truth boxes for this image id
                gt_file             = TEMP_FILES_PATH + "/" + file_id + "_ground_truth.json"
                ground_truth_data   = json.load(open(gt_file))
                ovmax       = -1
                gt_match    = -1
                
                # bounding box: the four predicted coordinates
                bb          = [float(x) for x in detection["bbox"].split()]
                
                # Scan the ground-truth objects
                for obj in ground_truth_data:
                    # If the class matches
                    if obj["class_name"] == class_name:
                        # Put the bbox coordinates into a list
                        bbgt    = [ float(x) for x in obj["bbox"].split() ]
                        # What follows is the IoU computation
                        bi      = [max(bb[0],bbgt[0]), max(bb[1],bbgt[1]), min(bb[2],bbgt[2]), min(bb[3],bbgt[3])]
                        iw      = bi[2] - bi[0] + 1
                        ih      = bi[3] - bi[1] + 1
                        if iw > 0 and ih > 0:
                            ua = (bb[2] - bb[0] + 1) * (bb[3] - bb[1] + 1) + (bbgt[2] - bbgt[0]
                                            + 1) * (bbgt[3] - bbgt[1] + 1) - iw * ih
                            # the IoU formula
                            ov = iw * ih / ua
                            # Keep the largest IoU and the matching ground-truth object
                            if ov > ovmax:
                                ovmax = ov
                                gt_match = obj

                if show_animation:
                    status = "NO MATCH FOUND!" 
                
                # The minimum overlap, MINOVERLAP (0.5 for mAP@0.5)
                min_overlap = MINOVERLAP
                
                # The match test has three conditions: the IoU must reach min_overlap; difficult objects are excluded;
                # and the ground-truth object must not already be claimed by a higher-confidence detection (used == False).
                # Only when all three hold do we count the prediction as correct.
                if ovmax >= min_overlap:
                    if "difficult" not in gt_match:
                        if not bool(gt_match["used"]):
                            # Set this position to 1: the detection is a true positive
                            tp[idx] = 1
                            # Mark this ground-truth object as used
                            gt_match["used"] = True
                            # Increment the true-positive count for this class
                            count_true_positives[class_name] += 1
                            # Write the updated ground-truth info back to disk
                            with open(gt_file, 'w') as f:
                                    f.write(json.dumps(ground_truth_data))
                            if show_animation:
                                status = "MATCH!"
                        else:
                            # The IoU is high enough, but this object was already matched; mark the detection as a false positive
                            fp[idx] = 1
                            if show_animation:
                                status = "REPEATED MATCH!"
                else:
                    # The IoU failed the threshold: mark the detection as a false positive
                    fp[idx] = 1
                    if ovmax > 0:
                        status = "INSUFFICIENT OVERLAP"

                """ Draw image to show animation """
                # Skipped when show_animation is False
                if show_animation:
                    height, width = img.shape[:2]
                    white           = (255,255,255)
                    light_blue      = (255,200,100)
                    green           = (0,255,0)
                    light_red       = (30,30,255)
                    margin          = 10
                    # 1st line
                    v_pos           = int(height - margin - (bottom_border / 2.0))
                    text            = "Image: " + ground_truth_img[0] + " "
                    img, line_width = draw_text_in_image(img, text, (margin, v_pos), white, 0)
                    text            = "Class [" + str(class_index) + "/" + str(n_classes) + "]: " + class_name + " "
                    img, line_width = draw_text_in_image(img, text, (margin + line_width, v_pos), light_blue, line_width)
                    if ovmax != -1:
                        color       = light_red
                        if status   == "INSUFFICIENT OVERLAP":
                            text    = "IoU: {0:.2f}% ".format(ovmax*100) + "< {0:.2f}% ".format(min_overlap*100)
                        else:
                            text    = "IoU: {0:.2f}% ".format(ovmax*100) + ">= {0:.2f}% ".format(min_overlap*100)
                            color   = green
                        img, _ = draw_text_in_image(img, text, (margin + line_width, v_pos), color, line_width)
                    # 2nd line
                    v_pos           += int(bottom_border / 2.0)
                    rank_pos        = str(idx+1)
                    text            = "Detection #rank: " + rank_pos + " confidence: {0:.2f}% ".format(float(detection["confidence"])*100)
                    img, line_width = draw_text_in_image(img, text, (margin, v_pos), white, 0)
                    color           = light_red
                    if status == "MATCH!":
                        color = green
                    text            = "Result: " + status + " "
                    img, line_width = draw_text_in_image(img, text, (margin + line_width, v_pos), color, line_width)

                    font = cv2.FONT_HERSHEY_SIMPLEX
                    if ovmax > 0: 
                        bbgt = [ int(round(float(x))) for x in gt_match["bbox"].split() ]
                        cv2.rectangle(img,(bbgt[0],bbgt[1]),(bbgt[2],bbgt[3]),light_blue,2)
                        cv2.rectangle(img_cumulative,(bbgt[0],bbgt[1]),(bbgt[2],bbgt[3]),light_blue,2)
                        cv2.putText(img_cumulative, class_name, (bbgt[0],bbgt[1] - 5), font, 0.6, light_blue, 1, cv2.LINE_AA)
                    bb = [int(i) for i in bb]
                    cv2.rectangle(img,(bb[0],bb[1]),(bb[2],bb[3]),color,2)
                    cv2.rectangle(img_cumulative,(bb[0],bb[1]),(bb[2],bb[3]),color,2)
                    cv2.putText(img_cumulative, class_name, (bb[0],bb[1] - 5), font, 0.6, color, 1, cv2.LINE_AA)

                    cv2.imshow("Animation", img)
                    cv2.waitKey(20) 
                    output_img_path = RESULTS_FILES_PATH + "/images/detections_one_by_one/" + class_name + "_detection" + str(idx) + ".jpg"
                    cv2.imwrite(output_img_path, img)
                    cv2.imwrite(img_cumulative_path, img_cumulative)

            # Compute the running statistics
            cumsum = 0
            # Turn the fp flags into a cumulative count; an example cumulative fp list:
            # [0, 0, ..., 0, 1, 1, 1, 1, 2, 2, 3, 3, 4, 5, ..., 121, 122]
            for idx, val in enumerate(fp):
                fp[idx] += cumsum
                cumsum += val
                
            # Same for tp; an example cumulative tp list:
            # [1, 2, 3, ..., 19, 20, 20, 21, 22, 22, 22, 23, ..., 27, 27]
            cumsum = 0
            for idx, val in enumerate(tp):
                tp[idx] += cumsum
                cumsum += val

            # Recall (how much is found): cumulative tp divided by the total number of ground-truth
            # objects of this class; the more boxes we keep, the higher the recall.
            # Example rec: [0.0357, 0.0714, 0.1071, ..., 0.9643, 0.9643]
            rec = tp[:]
            for idx, val in enumerate(tp):
                rec[idx] = float(tp[idx]) / np.maximum(gt_counter_per_class[class_name], 1)

            # Precision (how much of what we kept is correct): cumulative tp divided by all boxes
            # kept so far; the more boxes we keep, the lower the precision.
            # Example prec: [1.0, 1.0, ..., 0.1836, 0.1824, 0.1812]
            prec = tp[:]
            for idx, val in enumerate(tp):
                prec[idx] = float(tp[idx]) / np.maximum((fp[idx] + tp[idx]), 1)
			
            # Compute the AP together with mrec and mpre
            ap, mrec, mprec = voc_ap(rec[:], prec[:])
            
            # Compute the F1 score
            F1  = np.array(rec)*np.array(prec)*2 / np.where((np.array(prec)+np.array(rec))==0, 1, (np.array(prec)+np.array(rec)))
			
            # Accumulate the AP over all classes
            sum_AP  += ap
            # Build the output string
            text    = "{0:.2f}%".format(ap*100) + " = " + class_name + " AP " #class_name + " AP = {0:.2f}%".format(ap*100)
			
            # Read off the values at index score05_idx: e.g. with score05_idx = 25 we take F1, rec and prec
            # at that index and report them as the F1, recall and precision at a score threshold of 0.5.
            # The block below just formats the strings shown on the console.
            if len(prec)>0:
                F1_text         = "{0:.2f}".format(F1[score05_idx]) + " = " + class_name + " F1 "
                Recall_text     = "{0:.2f}%".format(rec[score05_idx]*100) + " = " + class_name + " Recall "
                Precision_text  = "{0:.2f}%".format(prec[score05_idx]*100) + " = " + class_name + " Precision "
            else:
                F1_text         = "0.00" + " = " + class_name + " F1 " 
                Recall_text     = "0.00%" + " = " + class_name + " Recall " 
                Precision_text  = "0.00%" + " = " + class_name + " Precision " 

            rounded_prec    = [ '%.2f' % elem for elem in prec ]
            rounded_rec     = [ '%.2f' % elem for elem in rec ]
            results_file.write(text + "\n Precision: " + str(rounded_prec) + "\n Recall :" + str(rounded_rec) + "\n\n")
            if len(prec)>0:
                print(text + "\t||\tscore_threshold=0.5 : " + "F1=" + "{0:.2f}".format(F1[score05_idx])\
                    + " ; Recall=" + "{0:.2f}%".format(rec[score05_idx]*100) + " ; Precision=" + "{0:.2f}%".format(prec[score05_idx]*100))
            else:
                print(text + "\t||\tscore_threshold=0.5 : F1=0.00% ; Recall=0.00% ; Precision=0.00%")
            # Store the AP in the dictionary
            ap_dictionary[class_name] = ap
			
            # Log-average miss rate for this class
            n_images = counter_images_per_class[class_name]
            lamr, mr, fppi = log_average_miss_rate(np.array(rec), np.array(fp), n_images)
            lamr_dictionary[class_name] = lamr

            # Everything below is plotting, not analysed in detail
            if draw_plot:
                plt.plot(rec, prec, '-o')
                area_under_curve_x = mrec[:-1] + [mrec[-2]] + [mrec[-1]]
                area_under_curve_y = mprec[:-1] + [0.0] + [mprec[-1]]
                plt.fill_between(area_under_curve_x, 0, area_under_curve_y, alpha=0.2, edgecolor='r')

                fig = plt.gcf()
                fig.canvas.set_window_title('AP ' + class_name)

                plt.title('class: ' + text)
                plt.xlabel('Recall')
                plt.ylabel('Precision')
                axes = plt.gca()
                axes.set_xlim([0.0,1.0])
                axes.set_ylim([0.0,1.05]) 
                fig.savefig(RESULTS_FILES_PATH + "/AP/" + class_name + ".png")
                plt.cla()

                plt.plot(score, F1, "-", color='orangered')
                plt.title('class: ' + F1_text + "\nscore_threshold=0.5")
                plt.xlabel('Score_Threshold')
                plt.ylabel('F1')
                axes = plt.gca()
                axes.set_xlim([0.0,1.0])
                axes.set_ylim([0.0,1.05])
                fig.savefig(RESULTS_FILES_PATH + "/F1/" + class_name + ".png")
                plt.cla()

                plt.plot(score, rec, "-H", color='gold')
                plt.title('class: ' + Recall_text + "\nscore_threshold=0.5")
                plt.xlabel('Score_Threshold')
                plt.ylabel('Recall')
                axes = plt.gca()
                axes.set_xlim([0.0,1.0])
                axes.set_ylim([0.0,1.05])
                fig.savefig(RESULTS_FILES_PATH + "/Recall/" + class_name + ".png")
                plt.cla()

                plt.plot(score, prec, "-s", color='palevioletred')
                plt.title('class: ' + Precision_text + "\nscore_threshold=0.5")
                plt.xlabel('Score_Threshold')
                plt.ylabel('Precision')
                axes = plt.gca()
                axes.set_xlim([0.0,1.0])
                axes.set_ylim([0.0,1.05])
                fig.savefig(RESULTS_FILES_PATH + "/Precision/" + class_name + ".png")
                plt.cla()
                
        if show_animation:
            cv2.destroyAllWindows()
		
        results_file.write("\n# mAP of all classes\n")
        # Divide the summed AP by the number of classes to get the final mean AP
        mAP     = sum_AP / n_classes
        text    = "mAP = {0:.2f}%".format(mAP*100)
        results_file.write(text + "\n")
        print(text)

    shutil.rmtree(TEMP_FILES_PATH)

    """ Count total of detection-results """
    det_counter_per_class = {}
    for txt_file in dr_files_list:
        lines_list = file_lines_to_list(txt_file)
        for line in lines_list:
            class_name = line.split()[0]
            if class_name in det_counter_per_class:
                det_counter_per_class[class_name] += 1
            else:
                det_counter_per_class[class_name] = 1
    dr_classes = list(det_counter_per_class.keys())

    """ Write number of ground-truth objects per class to results.txt """
    with open(RESULTS_FILES_PATH + "/results.txt", 'a') as results_file:
        results_file.write("\n# Number of ground-truth objects per class\n")
        for class_name in sorted(gt_counter_per_class):
            results_file.write(class_name + ": " + str(gt_counter_per_class[class_name]) + "\n")

    """ Finish counting true positives """
    for class_name in dr_classes:
        if class_name not in gt_classes:
            count_true_positives[class_name] = 0

    """ Write number of detected objects per class to results.txt """
    with open(RESULTS_FILES_PATH + "/results.txt", 'a') as results_file:
        results_file.write("\n# Number of detected objects per class\n")
        for class_name in sorted(dr_classes):
            n_det = det_counter_per_class[class_name]
            text = class_name + ": " + str(n_det)
            text += " (tp:" + str(count_true_positives[class_name]) + ""
            text += ", fp:" + str(n_det - count_true_positives[class_name]) + ")\n"
            results_file.write(text)

    """ Plot the total number of occurences of each class in the ground-truth """
    if draw_plot:
        window_title = "ground-truth-info"
        plot_title = "ground-truth\n"
        plot_title += "(" + str(len(ground_truth_files_list)) + " files and " + str(n_classes) + " classes)"
        x_label = "Number of objects per class"
        output_path = RESULTS_FILES_PATH + "/ground-truth-info.png"
        to_show = False
        plot_color = 'forestgreen'
        draw_plot_func(
            gt_counter_per_class,
            n_classes,
            window_title,
            plot_title,
            x_label,
            output_path,
            to_show,
            plot_color,
            '',
            )

    # """
    # Plot the total number of occurences of each class in the "detection-results" folder
    # """
    # if draw_plot:
    # window_title = "detection-results-info"
    # # Plot title
    # plot_title = "detection-results\n"
    # plot_title += "(" + str(len(dr_files_list)) + " files and "
    # count_non_zero_values_in_dictionary = sum(int(x) > 0 for x in list(det_counter_per_class.values()))
    # plot_title += str(count_non_zero_values_in_dictionary) + " detected classes)"
    # # end Plot title
    # x_label = "Number of objects per class"
    # output_path = RESULTS_FILES_PATH + "/detection-results-info.png"
    # to_show = False
    # plot_color = 'forestgreen'
    # true_p_bar = count_true_positives
    # draw_plot_func(
    # det_counter_per_class,
    # len(det_counter_per_class),
    # window_title,
    # plot_title,
    # x_label,
    # output_path,
    # to_show,
    # plot_color,
    # true_p_bar
    # )

    """ Draw log-average miss rate plot (Show lamr of all classes in decreasing order) """
    if draw_plot:
        window_title = "lamr"
        plot_title = "log-average miss rate"
        x_label = "log-average miss rate"
        output_path = RESULTS_FILES_PATH + "/lamr.png"
        to_show = False
        plot_color = 'royalblue'
        draw_plot_func(
            lamr_dictionary,
            n_classes,
            window_title,
            plot_title,
            x_label,
            output_path,
            to_show,
            plot_color,
            ""
            )

    """ Draw mAP plot (Show AP's of all classes in decreasing order) """
    if draw_plot:
        window_title = "mAP"
        plot_title = "mAP = {0:.2f}%".format(mAP*100)
        x_label = "Average Precision"
        output_path = RESULTS_FILES_PATH + "/mAP.png"
        to_show = True
        plot_color = 'royalblue'
        draw_plot_func(
            ap_dictionary,
            n_classes,
            window_title,
            plot_title,
            x_label,
            output_path,
            to_show,
            plot_color,
            ""
            )
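The inner matching loop above boils down to a small IoU computation. The sketch below isolates it as a hypothetical helper (not part of the original file); coordinates are inclusive pixel indices, which is why the widths and heights get a `+ 1`, following the VOC convention.

```python
def iou(bb, bbgt):
    """IoU between a detection bb and a ground-truth box bbgt,
    both given as [left, top, right, bottom] in inclusive pixel coords."""
    # Intersection rectangle
    bi = [max(bb[0], bbgt[0]), max(bb[1], bbgt[1]),
          min(bb[2], bbgt[2]), min(bb[3], bbgt[3])]
    iw = bi[2] - bi[0] + 1
    ih = bi[3] - bi[1] + 1
    if iw <= 0 or ih <= 0:
        return 0.0  # the boxes do not overlap
    # Union = area(detection) + area(ground truth) - intersection
    ua = ((bb[2] - bb[0] + 1) * (bb[3] - bb[1] + 1)
          + (bbgt[2] - bbgt[0] + 1) * (bbgt[3] - bbgt[1] + 1)
          - iw * ih)
    return iw * ih / ua
```

A box compared with itself gives 1.0; two 10x10 boxes sharing half their area give 1/3, since the 50-pixel intersection sits inside a 150-pixel union.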

The per-class detection-results JSON

image-20220401113719728.png

The ground-truth JSON

image-20220401113753851.png
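In case the screenshots above do not render, here is an illustrative sketch of the two temporary JSON formats the function writes. The file ids, class names and coordinates are made up, not copied from the images.

```python
# One <file_id>_ground_truth.json per image: a list of ground-truth objects.
# "used" starts as False and flips to True once a detection claims the object.
ground_truth_example = [
    {"class_name": "aeroplane", "bbox": "28 34 200 180", "used": False},
    {"class_name": "person", "bbox": "10 20 50 90", "used": False, "difficult": True},
]

# One <class_name>_dr.json per class: every detection of that class across
# all images, sorted by descending confidence (values stored as strings).
detection_results_example = [
    {"confidence": "0.99", "file_id": "000001", "bbox": "30 36 198 178"},
    {"confidence": "0.41", "file_id": "000007", "bbox": "5 5 60 60"},
]
```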

  • Calling voc_ap
def voc_ap(rec, prec):
    # The key intuition: the more boxes we keep, the lower the precision (a smaller share of them
    # match) but the higher the recall, since more boxes make it easier to cover every object.
    # So recall grows with the number of boxes while precision shrinks.
    """ --- Official matlab code VOC2012--- mrec=[0 ; rec ; 1]; mpre=[0 ; prec ; 0]; for i=numel(mpre)-1:-1:1 mpre(i)=max(mpre(i),mpre(i+1)); end i=find(mrec(2:end)~=mrec(1:end-1))+1; ap=sum((mrec(i)-mrec(i-1)).*mpre(i)); """
    rec.insert(0, 0.0) # insert 0.0 at begining of list
    rec.append(1.0) # insert 1.0 at end of list
    mrec = rec[:]
    prec.insert(0, 0.0) # insert 0.0 at begining of list
    prec.append(0.0) # insert 0.0 at end of list
    mpre = prec[:]
    """ This part makes the precision monotonically decreasing (goes from the end to the beginning) matlab: for i=numel(mpre)-1:-1:1 mpre(i)=max(mpre(i),mpre(i+1)); """
    # This step enforces a monotonically non-increasing precision envelope (scanning from the end):
    # at each index we keep the best precision achievable at that recall level or any higher one.
    for i in range(len(mpre)-2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i+1])
    """ This part creates a list of indexes where the recall changes matlab: i=find(mrec(2:end)~=mrec(1:end-1))+1; """
    # We are computing an area: recall on the x-axis, precision on the y-axis. The curve gives a set
    # of points; padding each step out to a rectangle and summing the rectangle areas yields the AP.
    i_list = []
    for i in range(1, len(mrec)):
        if mrec[i] != mrec[i-1]:
            i_list.append(i) # if it was matlab would be i + 1
    """ The Average Precision (AP) is the area under the curve (numerical integration) matlab: ap=sum((mrec(i)-mrec(i-1)).*mpre(i)); """
    # Sum the rectangle areas to obtain the AP
    ap = 0.0
    for i in i_list:
        ap += ((mrec[i]-mrec[i-1])*mpre[i])
    # Return the AP
    return ap, mrec, mpre
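A tiny worked example makes the rectangle summation concrete. The block below restates voc_ap compactly (same logic as above, but without mutating the caller's lists) and runs it on rec = [0.5, 1.0], prec = [1.0, 0.5]:

```python
def voc_ap(rec, prec):
    # Same computation as the function above, written without in-place mutation
    mrec = [0.0] + list(rec) + [1.0]
    mpre = [0.0] + list(prec) + [0.0]
    # Make the precision envelope monotonically non-increasing, scanning from the end
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Indexes where recall changes: each one contributes a rectangle
    i_list = [i for i in range(1, len(mrec)) if mrec[i] != mrec[i - 1]]
    ap = sum((mrec[i] - mrec[i - 1]) * mpre[i] for i in i_list)
    return ap, mrec, mpre

ap, mrec, mpre = voc_ap([0.5, 1.0], [1.0, 0.5])
# Two rectangles: width 0.5 at height 1.0, plus width 0.5 at height 0.5
# => ap = 0.5 * 1.0 + 0.5 * 0.5 = 0.75
```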

The AP figure: we look for the points where the recall value stays constant mainly to fix the width and height of each rectangle, which makes the area easy to compute.

image.png
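The rectangles come from the cumulative TP/FP counts computed earlier. The statistics step can be sketched with NumPy's cumsum; the numbers here are made up for illustration, and the original code builds the running totals with explicit loops, which is equivalent.

```python
import numpy as np

# Per-detection correct/incorrect flags, already sorted by descending confidence
tp = [1, 1, 0, 1]
fp = [0, 0, 1, 0]
n_gt = 3  # plays the role of gt_counter_per_class[class_name]

tp_cum = np.cumsum(tp)  # array([1, 2, 2, 3])
fp_cum = np.cumsum(fp)  # array([0, 0, 1, 1])

# Recall: found objects over all ground-truth objects of the class
rec = tp_cum / max(n_gt, 1)
# Precision: found objects over all detections kept so far
prec = tp_cum / np.maximum(tp_cum + fp_cum, 1)
```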

The final output looks like this:

image.png

Let's look at the other statistics:

  • Number of objects per class

image.png

  • Per-class AP curve

image.png

  • Per-class F1 curve

image.png

  • Per-class Precision curve

image.png

  • Per-class Recall curve

image.png

  • Log-average miss rate plot

image.png

That concludes this walkthrough of the get_map function; thanks for reading.
