Packaging (#6)

* add box visualization

* add box visualization and change thresholds for pif preprocessing

* refactor printer

* change default values

* change confidence definition

* remove redundant function

* add debug plot in preprocessing

* add task error in evaluation

* add horizontal flipping

* add evaluation table

* add evaluation table with verbosity

* add tabulate requirement and command line option verbose

* refactor evaluate

* add task error with mean absolute deviation

* add stereo baseline

* integrate stereo baseline

* refactor factory preprocessing

* add stereo command for evaluation

* fix category bug

* add interquartile range for stereo

* use left tt for translation

* refactor stereo functions

* remove redundant functions

* change names of constants

* add pixel error as function of depth

* fix bug on output directory

* add current time at the moment of saving

* add person sitting category

* remove box in pifpaf predictions

* fix printing name

* add printing of number of matches

* add cyclist category

* fix assertion error

* add travis file

* working eval

* working eval

* change source file

* renaming

* add pylint file

* fix pylint

* fix import

* add pyc files in gitignore

* pylint fix

* pylint fix

* add pytest cache

* update readme

* fix pylint

* fix pylint

* add travis file

* add pylint in pip install

* fix pylint
Lorenzo Bertoni 2019-07-19 15:39:03 +02:00 committed by GitHub
parent 519de28f4e
commit 8968f3c8a2
45 changed files with 1282 additions and 950 deletions

.gitignore

@@ -2,3 +2,5 @@
data/ data/
.DS_store .DS_store
__pycache__ __pycache__
Monoloco/*.pyc
.pytest*

.pylintrc

@@ -0,0 +1,26 @@
[BASIC]
variable-rgx=[a-z0-9_]{1,30}$ # to accept 2 (different) letter variables
good-names=xx,dd,zz,hh,ww,pp,kk,lr,w1,w2,w3,mm,im,uv,ax,COV_MIN,CONF_MIN
[TYPECHECK]
disable=E1102,missing-docstring,useless-object-inheritance,duplicate-code,too-many-arguments,too-many-instance-attributes,too-many-locals,too-few-public-methods,arguments-differ,logging-format-interpolation
# List of members which are set dynamically and missed by pylint inference
# system, and so shouldn't trigger E1101 when accessed. Python regular
# expressions are accepted.
generated-members=numpy.*,torch.*,cv2.*
ignored-modules=nuscenes, tabulate, cv2
[FORMAT]
max-line-length=120

.travis.yml

@@ -0,0 +1,13 @@
dist: xenial
language: python
python:
- "3.6"
- "3.7"
install:
- pip install openpifpaf
- pip install nuscenes-devkit
- pip install tabulate
- pip install pylint
script:
- pylint monoloco --disable=unused-variable,fixme
- pytest -vv

README.md

@@ -31,7 +31,7 @@ All details for Pifpaf pose detector at [openpifpaf](https://github.com/vita-epf
``` ```
pip install nuscenes-devkit openpifpaf pip install openpifpaf nuscenes-devkit tabulate
``` ```
### Data structure ### Data structure
@@ -63,14 +63,14 @@ Alternatively, you can download a Pifpaf pre-trained model from [openpifpaf](htt
# Interfaces # Interfaces
All the commands are run through a main file called `main.py` using subparsers. All the commands are run through a main file called `main.py` using subparsers.
To check all the commands for the parser and the subparsers run: To check all the commands for the parser and the subparsers (including openpifpaf ones) run:
* `python3 src/main.py --help`
* `python3 src/main.py prep --help`
* `python3 src/main.py predict --help`
* `python3 src/main.py train --help`
* `python3 src/main.py eval --help`
* `python3 -m monoloco.run --help`
* `python3 -m monoloco.run predict --help`
* `python3 -m monoloco.run train --help`
* `python3 -m monoloco.run eval --help`
* `python3 -m monoloco.run prep --help`
or check the file `monoloco/run.py`
# Predict # Predict
The predict script receives an image (or an entire folder using glob expressions), The predict script receives an image (or an entire folder using glob expressions),
@@ -96,7 +96,7 @@ If it does not find the file, it will generate images
with all the predictions without ground-truth matching. with all the predictions without ground-truth matching.
Below an example with and without ground-truth matching. They have been created (adding or removing `--path_gt`) with: Below an example with and without ground-truth matching. They have been created (adding or removing `--path_gt`) with:
`python3 src/main.py predict --networks monoloco --glob docs/002282.png --output_types combined --scale 2 `python3 -m monoloco.run predict --networks monoloco --glob docs/002282.png --output_types combined --scale 2
--model data/models/monoloco-190513-1437.pkl --n_dropout 100 --z_max 30` --model data/models/monoloco-190513-1437.pkl --n_dropout 100 --z_max 30`
With ground truth matching (only matching people): With ground truth matching (only matching people):
@@ -110,7 +110,7 @@ To accurately estimate distance, the focal length is necessary.
However, it is still possible to test Monoloco on images where the calibration matrix is not available. However, it is still possible to test Monoloco on images where the calibration matrix is not available.
Absolute distances are not meaningful but relative distances still are. Absolute distances are not meaningful but relative distances still are.
Below an example on a generic image from the web, created with: Below an example on a generic image from the web, created with:
`python3 src/main.py predict --networks monoloco --glob docs/surf.jpg --output_types combined --model data/models/monoloco-190513-1437.pkl --n_dropout 100 --z_max 25` `python3 -m monoloco.run predict --networks monoloco --glob docs/surf.jpg --output_types combined --model data/models/monoloco-190513-1437.pkl --n_dropout 100 --z_max 25`
![no calibration](docs/surf.jpg.combined.png) ![no calibration](docs/surf.jpg.combined.png)
@@ -124,7 +124,7 @@ Multiple visualizations can be combined in different windows.
The above gif has been obtained running on a Macbook the command: The above gif has been obtained running on a Macbook the command:
`python src/main.py predict --webcam --scale 0.2 --output_types combined --z_max 10 --checkpoint resnet50` `python3 -m monoloco.run predict --webcam --scale 0.2 --output_types combined --z_max 10 --checkpoint resnet50`
# Preprocess # Preprocess
@@ -148,7 +148,7 @@ You can create them running the predict script and using `--networks pifpaf`.
### Inputs joints for training ### Inputs joints for training
MonoLoco is trained using 2D human pose joints matched with the ground truth location provided by MonoLoco is trained using 2D human pose joints matched with the ground truth location provided by
nuScenes or KITTI Dataset. To create the joints run: `python src/main.py prep` specifying: nuScenes or KITTI Dataset. To create the joints run: `python3 -m monoloco.run prep` specifying:
1. `--dir_ann` annotation directory containing Pifpaf joints of KITTI or nuScenes. 1. `--dir_ann` annotation directory containing Pifpaf joints of KITTI or nuScenes.
2. `--dataset` Which dataset to preprocess. For nuscenes, all three versions of the 2. `--dataset` Which dataset to preprocess. For nuscenes, all three versions of the
@@ -163,12 +163,12 @@ by the image name to easily access ground truth files for evaluation and predict
# Train # Train
Provide the json file containing the preprocessed joints as argument. Provide the json file containing the preprocessed joints as argument.
As simple as `python3 src/main.py train --joints <json file path>` As simple as `python3 -m monoloco.run train --joints <json file path>`
All the hyperparameters options can be checked at `python3 src/main.py train --help`. All the hyperparameters options can be checked at `python3 -m monoloco.run train --help`.
### Hyperparameters tuning ### Hyperparameters tuning
Random search in log space is provided. An example: `python3 src/main.py train --hyp --multiplier 10 --r_seed 1`. Random search in log space is provided. An example: `python3 -m monoloco.run train --hyp --multiplier 10 --r_seed 1`.
One iteration of the multiplier includes 6 runs. One iteration of the multiplier includes 6 runs.
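For reference, random search in log space simply draws exponents uniformly and exponentiates them. A minimal sketch of the idea (the parameter names and ranges here are illustrative, not the repository's actual ones):

```python
import random

def sample_log_space(r_seed, multiplier=10, runs_per_iter=6):
    """Draw hyperparameters uniformly in log10 space."""
    random.seed(r_seed)
    samples = []
    for _ in range(multiplier * runs_per_iter):  # one multiplier iteration = 6 runs
        lr = 10 ** random.uniform(-4, -1)   # learning rate in [1e-4, 1e-1]
        wd = 10 ** random.uniform(-5, -2)   # weight decay in [1e-5, 1e-2]
        samples.append({'lr': lr, 'weight_decay': wd})
    return samples
```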
@@ -176,7 +176,7 @@ One iteration of the multiplier includes 6 runs.
Evaluate performances of the trained model on KITTI or Nuscenes Dataset. Evaluate performances of the trained model on KITTI or Nuscenes Dataset.
### 1) nuScenes ### 1) nuScenes
Evaluation on nuScenes is already provided during training. It is also possible to evaluate an existing model running Evaluation on nuScenes is already provided during training. It is also possible to evaluate an existing model running
`python src/main.py eval --dataset nuscenes --model <model to evaluate>` `python3 -m monoloco.run eval --dataset nuscenes --model <model to evaluate>`
### 2) KITTI ### 2) KITTI
### Baselines ### Baselines
@@ -186,7 +186,7 @@ and stereo Baselines:
[Mono3D](https://www.cs.toronto.edu/~urtasun/publications/chen_etal_cvpr16.pdf), [Mono3D](https://www.cs.toronto.edu/~urtasun/publications/chen_etal_cvpr16.pdf),
[3DOP](https://xiaozhichen.github.io/papers/nips15chen.pdf), [3DOP](https://xiaozhichen.github.io/papers/nips15chen.pdf),
[MonoDepth](https://arxiv.org/abs/1609.03677) and our [MonoDepth](https://arxiv.org/abs/1609.03677) and our
[Geometrical Baseline](src/eval/geom_baseline.py). [Geometrical Baseline](monoloco/eval/geom_baseline.py).
* **Mono3D**: download validation files from [here](http://3dimage.ee.tsinghua.edu.cn/cxz/mono3d) * **Mono3D**: download validation files from [here](http://3dimage.ee.tsinghua.edu.cn/cxz/mono3d)
and save them into `data/kitti/m3d` and save them into `data/kitti/m3d`
@@ -196,7 +196,7 @@ and save them into `data/kitti/3dop`
[here](https://github.com/Parrotlife/pedestrianDepth-baseline/tree/master/MonoDepth-PyTorch) [here](https://github.com/Parrotlife/pedestrianDepth-baseline/tree/master/MonoDepth-PyTorch)
and save them into `data/kitti/monodepth` and save them into `data/kitti/monodepth`
* **GeometricalBaseline**: A geometrical baseline comparison is provided. * **GeometricalBaseline**: A geometrical baseline comparison is provided.
The best average value for comparison can be created running `python src/main.py eval --geometric` The best average value for comparison can be created running `python3 -m monoloco.run eval --geometric`
#### Evaluation #### Evaluation
First the model preprocesses the joints starting from json annotations predicted from pifpaf, First the model preprocesses the joints starting from json annotations predicted from pifpaf,
@@ -205,7 +205,7 @@ in txt file with format comparable to other baseline.
Then the model performs evaluation. Then the model performs evaluation.
The following graph is obtained running: The following graph is obtained running:
`python3 src/main.py eval --dataset kitti --generate --model data/models/monoloco-190513-1437.pkl `python3 -m monoloco.run eval --dataset kitti --generate --model data/models/monoloco-190513-1437.pkl
--dir_ann <folder containing pifpaf annotations of KITTI images>` --dir_ann <folder containing pifpaf annotations of KITTI images>`
![kitti_evaluation](docs/results.png) ![kitti_evaluation](docs/results.png)

monoloco/__init__.py


@@ -1,38 +1,45 @@
"""Evaluate Monoloco code on KITTI dataset using ALE and ALP metrics""" """Evaluate Monoloco code on KITTI dataset using ALE and ALP metrics with the following baselines:
import os
import math
import logging
from collections import defaultdict
import datetime
from utils.iou import get_iou_matches
from utils.misc import get_task_error
from utils.kitti import check_conditions, get_category, split_training, parse_ground_truth
from visuals.results import print_results
class KittiEval:
"""
Evaluate Monoloco code and compare it with the following baselines:
- Mono3D - Mono3D
- 3DOP - 3DOP
- MonoDepth - MonoDepth
""" """
import os
import math
import logging
import datetime
from collections import defaultdict
from itertools import chain
from tabulate import tabulate
from ..utils.iou import get_iou_matches
from ..utils.misc import get_task_error, get_pixel_error
from ..utils.kitti import check_conditions, get_category, split_training, parse_ground_truth
from ..visuals.results import print_results
class EvalKitti:
logging.basicConfig(level=logging.INFO) logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
CLUSTERS = ('easy', 'moderate', 'hard', 'all', '6', '10', '15', '20', '25', '30', '40', '50', '>50') CLUSTERS = ('easy', 'moderate', 'hard', 'all', '6', '10', '15', '20', '25', '30', '40', '50', '>50')
dic_stds = defaultdict(lambda: defaultdict(list)) METHODS = ['m3d', 'geom', 'task_error', '3dop', 'our']
dic_stats = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(float)))) HEADERS = ['method', '<0.5', '<1m', '<2m', 'easy', 'moderate', 'hard', 'all']
dic_cnt = defaultdict(int) CATEGORIES = ['pedestrian', 'cyclist']
errors = defaultdict(lambda: defaultdict(list))
def __init__(self, thresh_iou_our=0.3, thresh_iou_m3d=0.3, thresh_conf_m3d=0.3, thresh_conf_our=0.3,
verbose=False, stereo=False):
def __init__(self, thresh_iou_our=0.3, thresh_iou_m3d=0.5, thresh_conf_m3d=0.5, thresh_conf_our=0.3):
self.dir_gt = os.path.join('data', 'kitti', 'gt') self.dir_gt = os.path.join('data', 'kitti', 'gt')
self.dir_m3d = os.path.join('data', 'kitti', 'm3d') self.dir_m3d = os.path.join('data', 'kitti', 'm3d')
self.dir_3dop = os.path.join('data', 'kitti', '3dop') self.dir_3dop = os.path.join('data', 'kitti', '3dop')
self.dir_md = os.path.join('data', 'kitti', 'monodepth') self.dir_md = os.path.join('data', 'kitti', 'monodepth')
self.dir_our = os.path.join('data', 'kitti', 'monoloco') self.dir_our = os.path.join('data', 'kitti', 'monoloco')
self.stereo = stereo
if self.stereo:
self.dir_our_stereo = os.path.join('data', 'kitti', 'monoloco_stereo')
self.METHODS.extend(['our_stereo', 'pixel_error'])
path_train = os.path.join('splits', 'kitti_train.txt') path_train = os.path.join('splits', 'kitti_train.txt')
path_val = os.path.join('splits', 'kitti_val.txt') path_val = os.path.join('splits', 'kitti_val.txt')
dir_logs = os.path.join('data', 'logs') dir_logs = os.path.join('data', 'logs')
@@ -41,106 +48,101 @@ class KittiEval:
now = datetime.datetime.now() now = datetime.datetime.now()
now_time = now.strftime("%Y%m%d-%H%M")[2:] now_time = now.strftime("%Y%m%d-%H%M")[2:]
self.path_results = os.path.join(dir_logs, 'eval-' + now_time + '.json') self.path_results = os.path.join(dir_logs, 'eval-' + now_time + '.json')
self.verbose = verbose
assert os.path.exists(self.dir_m3d) and os.path.exists(self.dir_our) \ assert os.path.exists(self.dir_m3d) and os.path.exists(self.dir_our) \
and os.path.exists(self.dir_3dop) and os.path.exists(self.dir_3dop)
self.dic_thresh_iou = {'m3d': thresh_iou_m3d, '3dop': thresh_iou_m3d, self.dic_thresh_iou = {'m3d': thresh_iou_m3d, '3dop': thresh_iou_m3d,
'md': thresh_iou_our, 'our': thresh_iou_our} 'md': thresh_iou_our, 'our': thresh_iou_our, 'our_stereo': thresh_iou_our}
self.dic_thresh_conf = {'m3d': thresh_conf_m3d, '3dop': thresh_conf_m3d, 'our': thresh_conf_our} self.dic_thresh_conf = {'m3d': thresh_conf_m3d, '3dop': thresh_conf_m3d,
'our': thresh_conf_our, 'our_stereo': thresh_conf_our}
# Extract validation images for evaluation # Extract validation images for evaluation
names_gt = tuple(os.listdir(self.dir_gt)) names_gt = tuple(os.listdir(self.dir_gt))
_, self.set_val = split_training(names_gt, path_train, path_val) _, self.set_val = split_training(names_gt, path_train, path_val)
# Define variables to save statistics
self.errors = None
self.dic_stds = None
self.dic_stats = None
self.dic_cnt = None
self.cnt_stereo_error = None
self.cnt_gt = 0
def run(self): def run(self):
"""Evaluate Monoloco performances on ALP and ALE metrics""" """Evaluate Monoloco performances on ALP and ALE metrics"""
# Iterate over each ground truth file in the training set for category in self.CATEGORIES:
cnt_gt = 0
for name in self.set_val:
path_gt = os.path.join(self.dir_gt, name)
path_m3d = os.path.join(self.dir_m3d, name)
path_our = os.path.join(self.dir_our, name)
path_3dop = os.path.join(self.dir_3dop, name)
path_md = os.path.join(self.dir_md, name)
# Iterate over each line of the gt file and save box location and distances # Initialize variables
out_gt = parse_ground_truth(path_gt) self.errors = defaultdict(lambda: defaultdict(list))
cnt_gt += len(out_gt[0]) self.dic_stds = defaultdict(lambda: defaultdict(list))
self.dic_stats = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(float))))
self.dic_cnt = defaultdict(int)
self.cnt_gt = 0
self.cnt_stereo_error = 0
# Extract annotations for the same file # Iterate over each ground truth file in the training set
if out_gt[0]: for name in self.set_val:
out_m3d = self._parse_txts(path_m3d, method='m3d') path_gt = os.path.join(self.dir_gt, name)
out_3dop = self._parse_txts(path_3dop, method='3dop') path_m3d = os.path.join(self.dir_m3d, name)
out_md = self._parse_txts(path_md, method='md') path_our = os.path.join(self.dir_our, name)
out_our = self._parse_txts(path_our, method='our') if self.stereo:
path_our_stereo = os.path.join(self.dir_our_stereo, name)
path_3dop = os.path.join(self.dir_3dop, name)
path_md = os.path.join(self.dir_md, name)
# Compute the error with ground truth # Iterate over each line of the gt file and save box location and distances
self._estimate_error(out_gt, out_m3d, method='m3d') out_gt = parse_ground_truth(path_gt, category)
self._estimate_error(out_gt, out_3dop, method='3dop') self.cnt_gt += len(out_gt[0])
self._estimate_error(out_gt, out_md, method='md')
self._estimate_error(out_gt, out_our, method='our')
# Iterate over all the files together to find a pool of common annotations # Extract annotations for the same file
self._compare_error(out_gt, out_m3d, out_3dop, out_md, out_our) if out_gt[0]:
out_m3d = self._parse_txts(path_m3d, category, method='m3d')
out_3dop = self._parse_txts(path_3dop, category, method='3dop')
# out_md = self._parse_txts(path_md, category, method='md')
out_md = out_m3d
out_our = self._parse_txts(path_our, category, method='our')
out_our_stereo = self._parse_txts(path_our_stereo, category, method='our') if self.stereo else []
# Update statistics of errors and uncertainty # Compute the error with ground truth
for key in self.errors: self._estimate_error(out_gt, out_m3d, method='m3d')
add_true_negatives(self.errors[key], cnt_gt) self._estimate_error(out_gt, out_3dop, method='3dop')
for clst in self.CLUSTERS[:-2]: # M3d and pifpaf does not have annotations above 40 meters # self._estimate_error(out_gt, out_md, method='md')
get_statistics(self.dic_stats['test'][key][clst], self.errors[key][clst], self.dic_stds[clst], key) self._estimate_error(out_gt, out_our, method='our')
if self.stereo:
self._estimate_error(out_gt, out_our_stereo, method='our_stereo')
# Show statistics # Iterate over all the files together to find a pool of common annotations
print(" Number of GT annotations: {} ".format(cnt_gt)) self._compare_error(out_gt, out_m3d, out_3dop, out_md, out_our, out_our_stereo)
for key in self.errors:
if key in ['our', 'm3d', '3dop']:
print(" Number of {} annotations with confidence >= {} : {} "
.format(key, self.dic_thresh_conf[key], self.dic_cnt[key]))
for clst in self.CLUSTERS[:-9]: # Update statistics of errors and uncertainty
print(" {} Average error in cluster {}: {:.2f} with a max error of {:.1f}, " for key in self.errors:
"for {} annotations" add_true_negatives(self.errors[key], self.cnt_gt)
.format(key, clst, self.dic_stats['test'][key][clst]['mean'], for clst in self.CLUSTERS[:-2]: # M3d and pifpaf does not have annotations above 40 meters
self.dic_stats['test'][key][clst]['max'], get_statistics(self.dic_stats['test'][key][clst], self.errors[key][clst], self.dic_stds[clst], key)
self.dic_stats['test'][key][clst]['cnt']))
if key == 'our': # Show statistics
print("% of annotation inside the confidence interval: {:.1f} %, " print('\n' + category.upper() + ':')
"of which {:.1f} % at higher risk" self.show_statistics()
.format(100 * self.dic_stats['test'][key][clst]['interval'],
100 * self.dic_stats['test'][key][clst]['at_risk']))
for perc in ['<0.5m', '<1m', '<2m']:
print("{} Instances with error {}: {:.2f} %"
.format(key, perc, 100 * sum(self.errors[key][perc])/len(self.errors[key][perc])))
print("\n Number of matched annotations: {:.1f} %".format(self.errors[key]['matched']))
print("-"*100)
print("\n Annotations inside the confidence interval: {:.1f} %"
.format(100 * self.dic_stats['test']['our']['all']['interval']))
print("precision 1: {:.2f}".format(self.dic_stats['test']['our']['all']['prec_1']))
print("precision 2: {:.2f}".format(self.dic_stats['test']['our']['all']['prec_2']))
def printer(self, show): def printer(self, show):
print_results(self.dic_stats, show) print_results(self.dic_stats, show)
def _parse_txts(self, path, method): def _parse_txts(self, path, category, method):
boxes = [] boxes = []
dds = [] dds = []
stds_ale = [] stds_ale = []
stds_epi = [] stds_epi = []
dds_geom = [] dds_geom = []
# xyzs = []
# xy_kps = []
# Iterate over each line of the txt file # Iterate over each line of the txt file
if method in ['3dop', 'm3d']: if method in ['3dop', 'm3d']:
try: try:
with open(path, "r") as ff: with open(path, "r") as ff:
for line in ff: for line in ff:
if check_conditions(line, thresh=self.dic_thresh_conf[method], mode=method): if check_conditions(line, category, method=method, thresh=self.dic_thresh_conf[method]):
boxes.append([float(x) for x in line.split()[4:8]]) boxes.append([float(x) for x in line.split()[4:8]])
loc = ([float(x) for x in line.split()[11:14]]) loc = ([float(x) for x in line.split()[11:14]])
dds.append(math.sqrt(loc[0] ** 2 + loc[1] ** 2 + loc[2] ** 2)) dds.append(math.sqrt(loc[0] ** 2 + loc[1] ** 2 + loc[2] ** 2))
@@ -155,7 +157,7 @@ class KittiEval:
with open(path, "r") as ff: with open(path, "r") as ff:
for line in ff: for line in ff:
box = [float(x[:-1]) for x in line.split()[0:4]] box = [float(x[:-1]) for x in line.split()[0:4]]
delta_h = (box[3] - box[1]) / 10 delta_h = (box[3] - box[1]) / 10 # TODO Add new value
delta_w = (box[2] - box[0]) / 10 delta_w = (box[2] - box[0]) / 10
assert delta_h > 0 and delta_w > 0, "Bounding box <=0" assert delta_h > 0 and delta_w > 0, "Bounding box <=0"
box[0] -= delta_w box[0] -= delta_w
@@ -178,13 +180,14 @@ class KittiEval:
for line_our in file_lines[:-1]: for line_our in file_lines[:-1]:
line_list = [float(x) for x in line_our.split()] line_list = [float(x) for x in line_our.split()]
if check_conditions(line_list, thresh=self.dic_thresh_conf[method], mode=method): if check_conditions(line_list, category, method=method, thresh=self.dic_thresh_conf[method]):
boxes.append(line_list[:4]) boxes.append(line_list[:4])
dds.append(line_list[8]) dds.append(line_list[8])
stds_ale.append(line_list[9]) stds_ale.append(line_list[9])
stds_epi.append(line_list[10]) stds_epi.append(line_list[10])
dds_geom.append(line_list[11]) dds_geom.append(line_list[11])
self.dic_cnt[method] += 1 self.dic_cnt[method] += 1
self.dic_cnt['geom'] += 1
# kk_list = [float(x) for x in file_lines[-1].split()] # kk_list = [float(x) for x in file_lines[-1].split()]
@@ -196,8 +199,8 @@ class KittiEval:
def _estimate_error(self, out_gt, out, method): def _estimate_error(self, out_gt, out, method):
"""Estimate localization error""" """Estimate localization error"""
boxes_gt, _, dds_gt, truncs_gt, occs_gt = out_gt boxes_gt, _, dds_gt, zzs_gt, truncs_gt, occs_gt = out_gt
if method == 'our': if method[:3] == 'our':
boxes, dds, stds_ale, stds_epi, dds_geom = out boxes, dds, stds_ale, stds_epi, dds_geom = out
else: else:
boxes, dds = out boxes, dds = out
@@ -208,19 +211,28 @@ class KittiEval:
# Update error if match is found # Update error if match is found
cat = get_category(boxes_gt[idx_gt], truncs_gt[idx_gt], occs_gt[idx_gt]) cat = get_category(boxes_gt[idx_gt], truncs_gt[idx_gt], occs_gt[idx_gt])
self.update_errors(dds[idx], dds_gt[idx_gt], cat, self.errors[method]) self.update_errors(dds[idx], dds_gt[idx_gt], cat, self.errors[method])
if method == 'our': if method == 'our':
self.update_errors(dds_geom[idx], dds_gt[idx_gt], cat, self.errors['geom']) self.update_errors(dds_geom[idx], dds_gt[idx_gt], cat, self.errors['geom'])
self.update_uncertainty(stds_ale[idx], stds_epi[idx], dds[idx], dds_gt[idx_gt], cat) self.update_uncertainty(stds_ale[idx], stds_epi[idx], dds[idx], dds_gt[idx_gt], cat)
dd_task_error = dds_gt[idx_gt] + (get_task_error(dds_gt[idx_gt], mode='mad'))**2
self.update_errors(dd_task_error, dds_gt[idx_gt], cat, self.errors['task_error'])
def _compare_error(self, out_gt, out_m3d, out_3dop, out_md, out_our): elif method == 'our_stereo':
dd_pixel_error = get_pixel_error(dds_gt[idx_gt], zzs_gt[idx_gt])
self.update_errors(dd_pixel_error, dds_gt[idx_gt], cat, self.errors['pixel_error'])
def _compare_error(self, out_gt, out_m3d, out_3dop, out_md, out_our, out_our_stereo):
"""Compare the error for a pool of instances commonly matched by all methods""" """Compare the error for a pool of instances commonly matched by all methods"""
# Extract outputs of each method # Extract outputs of each method
boxes_gt, _, dds_gt, truncs_gt, occs_gt = out_gt boxes_gt, _, dds_gt, zzs_gt, truncs_gt, occs_gt = out_gt
boxes_m3d, dds_m3d = out_m3d boxes_m3d, dds_m3d = out_m3d
boxes_3dop, dds_3dop = out_3dop boxes_3dop, dds_3dop = out_3dop
boxes_md, dds_md = out_md boxes_md, dds_md = out_md
boxes_our, dds_our, _, _, dds_geom = out_our boxes_our, dds_our, _, _, dds_geom = out_our
if self.stereo:
boxes_our_stereo, dds_our_stereo, _, _, dds_geom_stereo = out_our_stereo
# Find IoU matches # Find IoU matches
matches_our = get_iou_matches(boxes_our, boxes_gt, self.dic_thresh_iou['our']) matches_our = get_iou_matches(boxes_our, boxes_gt, self.dic_thresh_iou['our'])
@@ -234,12 +246,25 @@ class KittiEval:
if check: if check:
cat = get_category(boxes_gt[idx_gt], truncs_gt[idx_gt], occs_gt[idx_gt]) cat = get_category(boxes_gt[idx_gt], truncs_gt[idx_gt], occs_gt[idx_gt])
dd_gt = dds_gt[idx_gt] dd_gt = dds_gt[idx_gt]
self.update_errors(dds_our[idx], dd_gt, cat, self.errors['our_merged']) self.update_errors(dds_our[idx], dd_gt, cat, self.errors['our_merged'])
self.update_errors(dds_geom[idx], dd_gt, cat, self.errors['geom_merged']) self.update_errors(dds_geom[idx], dd_gt, cat, self.errors['geom_merged'])
self.update_errors(dd_gt + get_task_error(dd_gt, mode='mad'),
dd_gt, cat, self.errors['task_error_merged'])
self.update_errors(dds_m3d[indices[0]], dd_gt, cat, self.errors['m3d_merged']) self.update_errors(dds_m3d[indices[0]], dd_gt, cat, self.errors['m3d_merged'])
self.update_errors(dds_3dop[indices[1]], dd_gt, cat, self.errors['3dop_merged']) self.update_errors(dds_3dop[indices[1]], dd_gt, cat, self.errors['3dop_merged'])
self.update_errors(dds_md[indices[2]], dd_gt, cat, self.errors['md_merged']) self.update_errors(dds_md[indices[2]], dd_gt, cat, self.errors['md_merged'])
self.dic_cnt['merged'] += 1 if self.stereo:
self.update_errors(dds_our_stereo[idx], dd_gt, cat, self.errors['our_stereo_merged'])
dd_pixel = get_pixel_error(dd_gt, zzs_gt[idx_gt])
self.update_errors(dd_pixel, dd_gt, cat, self.errors['pixel_error_merged'])
error = abs(dds_our[idx] - dd_gt)
error_stereo = abs(dds_our_stereo[idx] - dd_gt)
if error_stereo > (error + 0.1):
self.cnt_stereo_error += 1
for key in self.METHODS:
self.dic_cnt[key + '_merged'] += 1
def update_errors(self, dd, dd_gt, cat, errors): def update_errors(self, dd, dd_gt, cat, errors):
"""Compute and save errors between a single box and the gt box which match""" """Compute and save errors between a single box and the gt box which match"""
@@ -320,21 +345,74 @@ class KittiEval:
self.dic_stds[clst]['prec_2'].append(prec_2) self.dic_stds[clst]['prec_2'].append(prec_2)
self.dic_stds[cat]['prec_2'].append(prec_2) self.dic_stds[cat]['prec_2'].append(prec_2)
def show_statistics(self):
print('-'*90)
alp = [[str(100 * average(self.errors[key][perc]))[:4]
for perc in ['<0.5m', '<1m', '<2m']]
for key in self.METHODS]
ale = [[str(self.dic_stats['test'][key + '_merged'][clst]['mean'])[:4] + ' (' +
str(self.dic_stats['test'][key][clst]['mean'])[:4] + ')'
for clst in self.CLUSTERS[:4]]
for key in self.METHODS]
results = [[key] + alp[idx] + ale[idx] for idx, key in enumerate(self.METHODS)]
print(tabulate(results, headers=self.HEADERS))
print('-'*90 + '\n')
if self.verbose:
methods_all = list(chain.from_iterable((method, method + '_merged') for method in self.METHODS))
for key in methods_all:
for clst in self.CLUSTERS[:4]:
print(" {} Average error in cluster {}: {:.2f} with a max error of {:.1f}, "
"for {} annotations"
.format(key, clst, self.dic_stats['test'][key][clst]['mean'],
self.dic_stats['test'][key][clst]['max'],
self.dic_stats['test'][key][clst]['cnt']))
if key == 'our':
print("% of annotation inside the confidence interval: {:.1f} %, "
"of which {:.1f} % at higher risk"
.format(self.dic_stats['test'][key][clst]['interval'],
self.dic_stats['test'][key][clst]['at_risk']))
for perc in ['<0.5m', '<1m', '<2m']:
print("{} Instances with error {}: {:.2f} %"
.format(key, perc, 100 * average(self.errors[key][perc])))
print("\nMatched annotations: {:.1f} %".format(self.errors[key]['matched']))
print(" Detected annotations : {}/{} ".format(self.dic_cnt[key], self.cnt_gt))
print("-" * 100)
print("\n Annotations inside the confidence interval: {:.1f} %"
.format(self.dic_stats['test']['our']['all']['interval']))
print("precision 1: {:.2f}".format(self.dic_stats['test']['our']['all']['prec_1']))
print("precision 2: {:.2f}".format(self.dic_stats['test']['our']['all']['prec_2']))
if self.stereo:
print("Stereo error greater than mono: {:.1f} %"
.format(100 * self.cnt_stereo_error / self.dic_cnt['our_merged']))
def get_statistics(dic_stats, errors, dic_stds, key): def get_statistics(dic_stats, errors, dic_stds, key):
"""Update statistics of a cluster""" """Update statistics of a cluster"""
dic_stats['mean'] = sum(errors) / float(len(errors)) try:
dic_stats['max'] = max(errors) dic_stats['mean'] = average(errors)
dic_stats['cnt'] = len(errors) dic_stats['max'] = max(errors)
dic_stats['cnt'] = len(errors)
except (ZeroDivisionError, ValueError):
dic_stats['mean'] = 0.
dic_stats['max'] = 0.
dic_stats['cnt'] = 0.
if key == 'our': if key == 'our':
dic_stats['std_ale'] = sum(dic_stds['ale']) / float(len(dic_stds['ale'])) dic_stats['std_ale'] = average(dic_stds['ale'])
dic_stats['std_epi'] = sum(dic_stds['epi']) / float(len(dic_stds['epi'])) dic_stats['std_epi'] = average(dic_stds['epi'])
dic_stats['interval'] = sum(dic_stds['interval']) / float(len(dic_stds['interval'])) dic_stats['interval'] = average(dic_stds['interval'])
dic_stats['at_risk'] = sum(dic_stds['at_risk']) / float(len(dic_stds['at_risk'])) dic_stats['at_risk'] = average(dic_stds['at_risk'])
dic_stats['prec_1'] = sum(dic_stds['prec_1']) / float(len(dic_stds['prec_1'])) dic_stats['prec_1'] = average(dic_stds['prec_1'])
dic_stats['prec_2'] = sum(dic_stds['prec_2']) / float(len(dic_stds['prec_2'])) dic_stats['prec_2'] = average(dic_stds['prec_2'])
def add_true_negatives(err, cnt_gt): def add_true_negatives(err, cnt_gt):
@@ -379,3 +457,8 @@ def extract_indices(idx_to_check, *args):
checks[idx_method] = True checks[idx_method] = True
indices.append(idx_pred) indices.append(idx_pred)
return all(checks), indices return all(checks), indices
def average(my_list):
"""calculate mean of a list"""
return sum(my_list) / len(my_list)
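The summary that `show_statistics` prints is rendered with `tabulate`, combining ALP percentages and ALE means per cluster. A standalone sketch of the same layout (the numbers are invented placeholders; the ALE columns show the merged mean with the unmerged one in parentheses, mirroring the `key + '_merged'` lookup above):

```python
from tabulate import tabulate

HEADERS = ['method', '<0.5', '<1m', '<2m', 'easy', 'moderate', 'hard', 'all']
results = [
    ['m3d', '13.2', '35.7', '66.9', '2.13 (2.20)', '2.85 (2.91)', '3.68 (3.75)', '2.94 (3.01)'],
    ['our', '28.9', '58.2', '85.6', '0.75 (0.81)', '1.19 (1.27)', '2.24 (2.40)', '1.43 (1.52)'],
]
print(tabulate(results, headers=HEADERS))
```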


@@ -0,0 +1,234 @@
"""Run monoloco over all the pifpaf joints of KITTI images
and extract and save the annotations in txt files"""
import math
import os
import glob
import json
import shutil
import itertools
import copy
import numpy as np
import torch
from ..predict.network import MonoLoco
from ..eval.geom_baseline import compute_distance
from ..utils.kitti import get_calibration
from ..utils.pifpaf import preprocess_pif
from ..utils.camera import xyz_from_distance, get_keypoints, pixel_to_camera
from ..utils.stereo import depth_from_disparity
class GenerateKitti:
def __init__(self, model, dir_ann, p_dropout=0.2, n_dropout=0):
# Load monoloco
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
self.monoloco = MonoLoco(model_path=model, device=device, n_dropout=n_dropout, p_dropout=p_dropout)
self.dir_out = os.path.join('data', 'kitti', 'monoloco')
self.dir_ann = dir_ann
# List of images
self.list_basename = factory_basename(dir_ann)
self.dir_kk = os.path.join('data', 'kitti', 'calib')
def run_mono(self):
"""Run Monoloco and save txt files for KITTI evaluation"""
cnt_ann = cnt_file = cnt_no_file = 0
dir_out = os.path.join('data', 'kitti', 'monoloco')
# Remove the output directory if it already exists (avoid residual txt files)
if os.path.exists(dir_out):
shutil.rmtree(dir_out)
os.makedirs(dir_out)
print("\nCreated empty output directory for txt files")
# Run monoloco over the list of images
for basename in self.list_basename:
path_calib = os.path.join(self.dir_kk, basename + '.txt')
annotations, kk, tt = factory_file(path_calib, self.dir_ann, basename)
boxes, keypoints = preprocess_pif(annotations, im_size=(1242, 374))
if not keypoints:
cnt_no_file += 1
continue
else:
# Run the network and the geometric baseline
outputs, varss = self.monoloco.forward(keypoints, kk)
dds_geom = eval_geometric(keypoints, kk, average_y=0.48)
# Save the file
uv_centers = get_keypoints(keypoints, mode='bottom') # Kitti uses the bottom center to calculate depth
xy_centers = pixel_to_camera(uv_centers, kk, 1)
outputs = outputs.detach().cpu()
zzs = xyz_from_distance(outputs[:, 0:1], xy_centers)[:, 2].tolist()
all_outputs = [outputs.detach().cpu(), varss.detach().cpu(), dds_geom, zzs]
all_inputs = [boxes, xy_centers]
all_params = [kk, tt]
path_txt = os.path.join(dir_out, basename + '.txt')
save_txts(path_txt, all_inputs, all_outputs, all_params)
# Update counting
cnt_ann += len(boxes)
cnt_file += 1
print("Saved in {} txt {} annotations. Not found {} images\n".format(cnt_file, cnt_ann, cnt_no_file))
def run_stereo(self):
"""Run monoloco on left and right images and alculate disparity if a match is found"""
cnt_ann = cnt_file = cnt_no_file = cnt_no_stereo = cnt_disparity = 0
dir_out = os.path.join('data', 'kitti', 'monoloco_stereo')
# Remove the output directory if it already exists (avoid residual txt files)
if os.path.exists(dir_out):
shutil.rmtree(dir_out)
os.makedirs(dir_out)
print("Created empty output directory for txt STEREO files")
for basename in self.list_basename:
path_calib = os.path.join(self.dir_kk, basename + '.txt')
stereo = True
for mode in ['left', 'right']:
annotations, kk, tt = factory_file(path_calib, self.dir_ann, basename, mode=mode)
boxes, keypoints = preprocess_pif(annotations, im_size=(1242, 374))
if not keypoints and mode == 'left':
cnt_no_file += 1
break
elif not keypoints and mode == 'right':
stereo = False
else:
# Run the network and the geometric baseline
outputs, varss = self.monoloco.forward(keypoints, kk)
dds_geom = eval_geometric(keypoints, kk, average_y=0.48)
uv_centers = get_keypoints(keypoints, mode='bottom') # Kitti uses the bottom to calculate depth
xy_centers = pixel_to_camera(uv_centers, kk, 1)
if mode == 'left':
outputs_l = outputs.detach().cpu()
varss_l = varss.detach().cpu()
zzs_l = xyz_from_distance(outputs_l[:, 0:1], xy_centers)[:, 2].tolist()
kps_l = copy.deepcopy(keypoints)
boxes_l = boxes
xy_centers_l = xy_centers
dds_geom_l = dds_geom
kk_l = kk
tt_l = tt
else:
kps_r = copy.deepcopy(keypoints)
if stereo:
zzs, cnt = depth_from_disparity(zzs_l, kps_l, kps_r)
cnt_disparity += cnt
else:
zzs = zzs_l
# Save the file
all_outputs = [outputs_l, varss_l, dds_geom_l, zzs]
all_inputs = [boxes_l, xy_centers_l]
all_params = [kk_l, tt_l]
path_txt = os.path.join(dir_out, basename + '.txt')
save_txts(path_txt, all_inputs, all_outputs, all_params)
# Update counting
cnt_ann += len(boxes_l)
cnt_file += 1
# Print statistics
print("Saved in {} txt {} annotations. Not found {} images."
.format(cnt_file, cnt_ann, cnt_no_file))
print("Annotations corrected using stereo: {:.1f}%, not found {} stereo files"
.format(cnt_disparity / cnt_ann * 100, cnt_no_stereo))
def save_txts(path_txt, all_inputs, all_outputs, all_params):
outputs, varss, dds_geom, zzs = all_outputs[:]
uv_boxes, xy_centers = all_inputs[:]
kk, tt = all_params[:]
with open(path_txt, "w+") as ff:
for idx in range(outputs.shape[0]):
xx = float(xy_centers[idx][0]) * zzs[idx] + tt[0]
yy = float(xy_centers[idx][1]) * zzs[idx] + tt[1]
zz = zzs[idx] + tt[2]
dd = math.sqrt(xx ** 2 + yy ** 2 + zz ** 2)
cam_0 = [xx, yy, zz, dd]
for el in uv_boxes[idx][:]:
ff.write("%s " % el)
for el in cam_0:
ff.write("%s " % el)
ff.write("%s " % float(outputs[idx][1]))
ff.write("%s " % float(varss[idx]))
ff.write("%s " % dds_geom[idx])
ff.write("\n")
# Save intrinsic matrix in the last row
for kk_el in itertools.chain(*kk): # Flatten a list of lists
ff.write("%f " % kk_el)
ff.write("\n")
def factory_basename(dir_ann):
""" Return all the basenames in the annotations folder"""
list_ann = glob.glob(os.path.join(dir_ann, '*.json'))
list_basename = [os.path.basename(x).split('.')[0] for x in list_ann]
assert list_basename, " Missing json annotations file to create txt files for KITTI datasets"
return list_basename
def factory_file(path_calib, dir_ann, basename, mode='left'):
"""Choose the annotation and the calibration files. Stereo option with ite = 1"""
assert mode in ('left', 'right')
p_left, p_right = get_calibration(path_calib)
if mode == 'left':
kk, tt = p_left[:]
path_ann = os.path.join(dir_ann, basename + '.png.pifpaf.json')
else:
kk, tt = p_right[:]
path_ann = os.path.join(dir_ann + '_right', basename + '.png.pifpaf.json')
try:
with open(path_ann, 'r') as f:
annotations = json.load(f)
except FileNotFoundError:
annotations = []
return annotations, kk, tt
def eval_geometric(keypoints, kk, average_y=0.48):
""" Evaluate geometric distance"""
dds_geom = []
uv_centers = get_keypoints(keypoints, mode='center')
uv_shoulders = get_keypoints(keypoints, mode='shoulder')
uv_hips = get_keypoints(keypoints, mode='hip')
xy_centers = pixel_to_camera(uv_centers, kk, 1)
xy_shoulders = pixel_to_camera(uv_shoulders, kk, 1)
xy_hips = pixel_to_camera(uv_hips, kk, 1)
for idx, xy_center in enumerate(xy_centers):
zz = compute_distance(xy_shoulders[idx], xy_hips[idx], average_y)
xyz_center = np.array([xy_center[0], xy_center[1], zz])
dd_geom = float(np.linalg.norm(xyz_center))
dds_geom.append(dd_geom)
return dds_geom
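Neither `depth_from_disparity` nor `get_pixel_error` is shown in this diff; both presumably build on the pinhole-stereo relation z = f * B / d. A sketch under that assumption (the matching heuristic, the thresholds and the hardcoded focal length are illustrative, not the repository's):

```python
import numpy as np

FOCAL = 721.5    # px, a typical KITTI focal length (in practice read from the calib files)
BASELINE = 0.54  # m, the KITTI stereo rig baseline

def get_pixel_error_sketch(dd_gt, zz_gt):
    """Distance estimate corrupted by a one-pixel disparity error at depth zz_gt."""
    return dd_gt + zz_gt ** 2 / (FOCAL * BASELINE)

def depth_from_disparity_sketch(zzs_mono, kps_left, kps_right):
    """Replace monocular depths with stereo ones when a right-image pose matches."""
    zzs, cnt = [], 0
    for zz_mono, kp_l in zip(zzs_mono, kps_left):
        xs_l = np.array(kp_l[0])  # poses are [x_list, y_list, conf_list]
        best_z = None
        for kp_r in kps_right:
            disparity = float(np.median(xs_l - np.array(kp_r[0])))
            if disparity > 0.5:  # left x must exceed right x for a valid match
                zz_stereo = FOCAL * BASELINE / disparity
                if best_z is None or abs(zz_stereo - zz_mono) < abs(best_z - zz_mono):
                    best_z = zz_stereo
        if best_z is not None and abs(best_z - zz_mono) < 3.:  # keep plausible corrections only
            zzs.append(best_z)
            cnt += 1
        else:
            zzs.append(zz_mono)
    return zzs, cnt
```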


@@ -6,12 +6,10 @@ from collections import defaultdict
import numpy as np import numpy as np
from utils.camera import pixel_to_camera, get_keypoints from ..utils.camera import pixel_to_camera, get_keypoints
AVERAGE_Y = 0.48 AVERAGE_Y = 0.48
CLUSTERS = ['10', '20', '30', 'all'] CLUSTERS = ['10', '20', '30', 'all']
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def geometric_baseline(joints): def geometric_baseline(joints):
@@ -30,6 +28,8 @@ def geometric_baseline(joints):
'right_ankle'] 'right_ankle']
""" """
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
cnt_tot = 0 cnt_tot = 0
dic_dist = defaultdict(lambda: defaultdict(list)) dic_dist = defaultdict(lambda: defaultdict(list))
@@ -100,7 +100,7 @@ def compute_distance(xyz_norm_1, xyz_norm_2, average_y, mode='average', dy_met=0
1. knowing specific height of the annotation (head-ankle) dy_met 1. knowing specific height of the annotation (head-ankle) dy_met
2. using mean height of people (average_y) 2. using mean height of people (average_y)
""" """
assert mode == 'average' or mode == 'real' assert mode in ('average', 'real')
x1 = float(xyz_norm_1[0]) x1 = float(xyz_norm_1[0])
y1 = float(xyz_norm_1[1]) y1 = float(xyz_norm_1[1])
@@ -115,13 +115,13 @@
cc = -dy_met cc = -dy_met
# Solving the linear system Ax = b # Solving the linear system Ax = b
Aa = np.array([[y1, 0, -xx], matrix = np.array([[y1, 0, -xx],
[0, -y1, 1], [0, -y1, 1],
[y2, 0, -xx], [y2, 0, -xx],
[0, -y2, 1]]) [0, -y2, 1]])
bb = np.array([cc * xx, -cc, 0, 0]).reshape(4, 1) bb = np.array([cc * xx, -cc, 0, 0]).reshape(4, 1)
xx = np.linalg.lstsq(Aa, bb, rcond=None) xx = np.linalg.lstsq(matrix, bb, rcond=None)
z_met = abs(np.float(xx[0][1])) # Abs take into account specularity behind the observer z_met = abs(np.float(xx[0][1])) # Abs take into account specularity behind the observer
return z_met return z_met
@@ -160,7 +160,7 @@ def calculate_heights(heights, mode):
Compute statistics of heights based on the distance Compute statistics of heights based on the distance
""" """
assert mode == 'mean' or mode == 'std' or mode == 'max' assert mode in ('mean', 'std', 'max')
heights_fin = {} heights_fin = {}
head_shoulder = np.array(heights['shoulder']) - np.array(heights['head']) head_shoulder = np.array(heights['shoulder']) - np.array(heights['head'])
@@ -193,4 +193,3 @@ def calculate_error(dic_errors):
for clst in dic_errors: for clst in dic_errors:
errors[clst] = np.float(np.mean(np.array(dic_errors[clst]))) errors[clst] = np.float(np.mean(np.array(dic_errors[clst])))
return errors return errors
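A usage sketch for `compute_distance` above, with illustrative normalized coordinates of the kind `pixel_to_camera` produces:

```python
import numpy as np

# shoulder and hip of one person in normalized camera coordinates (x/z, y/z, 1)
xy_shoulder = np.array([0.05, -0.10, 1.])
xy_hip = np.array([0.05, -0.02, 1.])
zz = compute_distance(xy_shoulder, xy_hip, average_y=0.48)  # depth in meters, roughly 0.48 / 0.08
```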


@@ -2,7 +2,7 @@
import json import json
import os import os
from openpifpaf import show from openpifpaf import show
from visuals.printer import Printer from ..visuals.printer import Printer
def factory_for_gt(im_size, name=None, path_gt=None): def factory_for_gt(im_size, name=None, path_gt=None):
@@ -24,7 +24,7 @@ def factory_for_gt(im_size, name=None, path_gt=None):
dic_gt = None dic_gt = None
x_factor = im_size[0] / 1600 x_factor = im_size[0] / 1600
y_factor = im_size[1] / 900 y_factor = im_size[1] / 900
pixel_factor = (x_factor + y_factor) / 2 pixel_factor = (x_factor + y_factor) / 2 # TODO remove and check it
if im_size[0] / im_size[1] > 2.5: if im_size[0] / im_size[1] > 2.5:
kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]] # Kitti calibration kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]] # Kitti calibration
else: else:
@@ -45,7 +45,7 @@ def factory_outputs(args, images_outputs, output_path, pifpaf_outputs, dic_out=N
keypoint_sets, scores, pifpaf_out = pifpaf_outputs[:] keypoint_sets, scores, pifpaf_out = pifpaf_outputs[:]
# Visualizer # Visualizer
keypoint_painter = show.KeypointPainter(show_box=True) keypoint_painter = show.KeypointPainter(show_box=False)
skeleton_painter = show.KeypointPainter(show_box=False, color_connections=True, skeleton_painter = show.KeypointPainter(show_box=False, color_connections=True,
markersize=1, linewidth=4) markersize=1, linewidth=4)
@@ -79,7 +79,8 @@ def factory_outputs(args, images_outputs, output_path, pifpaf_outputs, dic_out=N
printer = Printer(images_outputs[1], output_path, kk, output_types=args.output_types printer = Printer(images_outputs[1], output_path, kk, output_types=args.output_types
, z_max=args.z_max, epistemic=epistemic) , z_max=args.z_max, epistemic=epistemic)
figures, axes = printer.factory_axes() figures, axes = printer.factory_axes()
printer.draw(figures, axes, dic_out, images_outputs[1], save=True, show=args.show) printer.draw(figures, axes, dic_out, images_outputs[1], draw_box=args.draw_box,
save=True, show=args.show)
if 'json' in args.output_types: if 'json' in args.output_types:
with open(os.path.join(output_path + '.monoloco.json'), 'w') as ff: with open(os.path.join(output_path + '.monoloco.json'), 'w') as ff:


@@ -8,10 +8,10 @@ from collections import defaultdict
import torch import torch
from utils.iou import get_iou_matches, reorder_matches from ..utils.iou import get_iou_matches, reorder_matches
from utils.camera import get_keypoints, pixel_to_camera, xyz_from_distance from ..utils.camera import get_keypoints, pixel_to_camera, xyz_from_distance
from utils.monoloco import get_monoloco_inputs, unnormalize_bi, laplace_sampling from ..utils.network import get_monoloco_inputs, unnormalize_bi, laplace_sampling
from models.architectures import LinearModel from ..train.architectures import LinearModel
class MonoLoco: class MonoLoco:
@@ -64,7 +64,7 @@ class MonoLoco:
return outputs, varss return outputs, varss
@staticmethod @staticmethod
def post_process(outputs, varss, boxes, keypoints, kk, dic_gt, iou_min=0.25): def post_process(outputs, varss, boxes, keypoints, kk, dic_gt, iou_min=0.3):
"""Post process monoloco to output final dictionary with all information for visualizations""" """Post process monoloco to output final dictionary with all information for visualizations"""
dic_out = defaultdict(list) dic_out = defaultdict(list)
@@ -74,6 +74,7 @@ class MonoLoco:
if dic_gt: if dic_gt:
boxes_gt, dds_gt = dic_gt['boxes'], dic_gt['dds'] boxes_gt, dds_gt = dic_gt['boxes'], dic_gt['dds']
matches = get_iou_matches(boxes, boxes_gt, thresh=iou_min) matches = get_iou_matches(boxes, boxes_gt, thresh=iou_min)
print("found {} matches with ground-truth".format(len(matches)))
else: else:
matches = [(idx, idx) for idx, _ in enumerate(boxes)] # Replicate boxes matches = [(idx, idx) for idx, _ in enumerate(boxes)] # Replicate boxes
@@ -98,6 +99,7 @@ class MonoLoco:
xyz_real = xyz_from_distance(dd_real, xy_centers[idx]) xyz_real = xyz_from_distance(dd_real, xy_centers[idx])
xyz_pred = xyz_from_distance(dd_pred, xy_centers[idx]) xyz_pred = xyz_from_distance(dd_pred, xy_centers[idx])
dic_out['boxes'].append(box) dic_out['boxes'].append(box)
dic_out['boxes_gt'].append(boxes_gt[idx_gt] if dic_gt else boxes[idx])
dic_out['dds_real'].append(dd_real) dic_out['dds_real'].append(dd_real)
dic_out['dds_pred'].append(dd_pred) dic_out['dds_pred'].append(dd_pred)
dic_out['stds_ale'].append(ale) dic_out['stds_ale'].append(ale)
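`get_iou_matches` is imported from `..utils.iou` and not shown in this diff; a minimal greedy version for reference (a sketch, not necessarily the repository's exact logic):

```python
def get_iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0., x2 - x1) * max(0., y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def get_iou_matches_sketch(boxes, boxes_gt, thresh=0.3):
    """Greedily match each predicted box to the unused gt box with highest IoU."""
    matches, used = [], set()
    for idx, box in enumerate(boxes):
        ious = [get_iou(box, box_gt) if ii not in used else 0.
                for ii, box_gt in enumerate(boxes_gt)]
        if ious and max(ious) >= thresh:
            idx_gt = ious.index(max(ious))
            matches.append((idx, idx_gt))
            used.add(idx_gt)
    return matches
```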


@@ -107,4 +107,3 @@ class PifPaf:
for kps in keypoint_sets for kps in keypoint_sets
] ]
return keypoint_sets, scores, pifpaf_out return keypoint_sets, scores, pifpaf_out


@@ -4,10 +4,10 @@ from PIL import Image
import torch import torch
from predict.pifpaf import PifPaf, ImageList from ..predict.pifpaf import PifPaf, ImageList
from predict.monoloco import MonoLoco from ..predict.network import MonoLoco
from predict.factory import factory_for_gt, factory_outputs from ..predict.factory import factory_for_gt, factory_outputs
from utils.pifpaf import preprocess_pif from ..utils.pifpaf import preprocess_pif
def predict(args): def predict(args):


@@ -8,11 +8,12 @@ from collections import defaultdict
import json import json
import datetime import datetime
from utils.kitti import get_calibration, split_training, parse_ground_truth from ..prep.transforms import transform_keypoints
from utils.monoloco import get_monoloco_inputs from ..utils.kitti import get_calibration, split_training, parse_ground_truth
from utils.pifpaf import preprocess_pif from ..utils.network import get_monoloco_inputs
from utils.iou import get_iou_matches from ..utils.pifpaf import preprocess_pif
from utils.misc import append_cluster from ..utils.iou import get_iou_matches
from ..utils.misc import append_cluster
class PreprocessKitti: class PreprocessKitti:
@@ -29,7 +30,7 @@ class PreprocessKitti:
clst=defaultdict(lambda: defaultdict(list)))} clst=defaultdict(lambda: defaultdict(list)))}
dic_names = defaultdict(lambda: defaultdict(list)) dic_names = defaultdict(lambda: defaultdict(list))
def __init__(self, dir_ann, iou_min=0.3): def __init__(self, dir_ann, iou_min):
self.dir_ann = dir_ann self.dir_ann = dir_ann
self.iou_min = iou_min self.iou_min = iou_min
@@ -52,10 +53,7 @@ class PreprocessKitti:
def run(self): def run(self):
"""Save json files""" """Save json files"""
cnt_gt = 0 cnt_gt = cnt_files = cnt_files_ped = cnt_fnf = 0
cnt_files = 0
cnt_files_ped = 0
cnt_fnf = 0
dic_cnt = {'train': 0, 'val': 0, 'test': 0} dic_cnt = {'train': 0, 'val': 0, 'test': 0}
for name in self.names_gt: for name in self.names_gt:
@@ -73,10 +71,7 @@ class PreprocessKitti:
kk = p_left[0] kk = p_left[0]
# Iterate over each line of the gt file and save box location and distances # Iterate over each line of the gt file and save box location and distances
if phase == 'train': boxes_gt, boxes_3d, dds_gt = parse_ground_truth(path_gt, category='all')[:3]
(boxes_gt, boxes_3d, dds_gt, _, _) = parse_ground_truth(path_gt, mode='gt_all') # Also cyclists
else:
(boxes_gt, boxes_3d, dds_gt, _, _) = parse_ground_truth(path_gt, mode='gt') # only pedestrians
self.dic_names[basename + '.png']['boxes'] = copy.deepcopy(boxes_gt) self.dic_names[basename + '.png']['boxes'] = copy.deepcopy(boxes_gt)
self.dic_names[basename + '.png']['dds'] = copy.deepcopy(dds_gt) self.dic_names[basename + '.png']['dds'] = copy.deepcopy(dds_gt)
@@ -90,7 +85,11 @@ class PreprocessKitti:
with open(os.path.join(self.dir_ann, basename + '.png.pifpaf.json'), 'r') as f: with open(os.path.join(self.dir_ann, basename + '.png.pifpaf.json'), 'r') as f:
annotations = json.load(f) annotations = json.load(f)
boxes, keypoints = preprocess_pif(annotations, im_size=(1238, 374)) boxes, keypoints = preprocess_pif(annotations, im_size=(1238, 374))
keypoints_hflip = transform_keypoints(keypoints, mode='flip')
inputs = get_monoloco_inputs(keypoints, kk).tolist() inputs = get_monoloco_inputs(keypoints, kk).tolist()
inputs_hflip = get_monoloco_inputs(keypoints_hflip, kk).tolist()
all_keypoints = [keypoints, keypoints_hflip]
all_inputs = [inputs, inputs_hflip]
except FileNotFoundError: except FileNotFoundError:
boxes = [] boxes = []
@@ -98,13 +97,15 @@ class PreprocessKitti:
# Match each set of keypoint with a ground truth # Match each set of keypoint with a ground truth
matches = get_iou_matches(boxes, boxes_gt, self.iou_min) matches = get_iou_matches(boxes, boxes_gt, self.iou_min)
for (idx, idx_gt) in matches: for (idx, idx_gt) in matches:
self.dic_jo[phase]['kps'].append(keypoints[idx]) for nn, keypoints in enumerate(all_keypoints):
self.dic_jo[phase]['X'].append(inputs[idx]) inputs = all_inputs[nn]
self.dic_jo[phase]['Y'].append([dds_gt[idx_gt]]) # Trick to make it (nn,1) self.dic_jo[phase]['kps'].append(keypoints[idx])
self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_gt]) self.dic_jo[phase]['X'].append(inputs[idx])
self.dic_jo[phase]['K'].append(kk) self.dic_jo[phase]['Y'].append([dds_gt[idx_gt]]) # Trick to make it (nn,1)
self.dic_jo[phase]['names'].append(name) # One image name for each annotation self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_gt])
append_cluster(self.dic_jo, phase, inputs[idx], dds_gt[idx_gt], keypoints[idx]) self.dic_jo[phase]['K'].append(kk)
self.dic_jo[phase]['names'].append(name) # One image name for each annotation
append_cluster(self.dic_jo, phase, inputs[idx], dds_gt[idx_gt], keypoints[idx])
dic_cnt[phase] += 1 dic_cnt[phase] += 1
with open(self.path_joints, 'w') as file: with open(self.path_joints, 'w') as file:
@@ -116,7 +117,8 @@ class PreprocessKitti:
.format(dic_cnt[phase], phase)) .format(dic_cnt[phase], phase))
print("Number of GT files: {}. Files with at least one pedestrian: {}. Files not found: {}" print("Number of GT files: {}. Files with at least one pedestrian: {}. Files not found: {}"
.format(cnt_files, cnt_files_ped, cnt_fnf)) .format(cnt_files, cnt_files_ped, cnt_fnf))
print("Number of GT annotations: {}".format(cnt_gt)) print("Matched : {:.1f} % of the ground truth instances"
.format(100 * (dic_cnt['train'] + dic_cnt['val']) / cnt_gt))
print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints)) print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints))
def _factory_phase(self, name): def _factory_phase(self, name):
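`transform_keypoints(keypoints, mode='flip')` comes from the new `prep/transforms.py` at the end of this diff; presumably it mirrors the x coordinates and swaps left/right joints through the `HFLIP` table defined there. A sketch under that assumption (the hardcoded KITTI image width is illustrative):

```python
def hflip_keypoints_sketch(keypoints, im_w=1242):
    """Mirror each pose horizontally: x -> W - x, then swap left/right joints."""
    flipped = []
    for kps in keypoints:  # each pose is [x_list, y_list, conf_list]
        xs = [im_w - xx for xx in kps[0]]
        # joint j of the flipped pose takes its values from joint HFLIP[j]
        idx_map = [COCO_KEYPOINTS.index(HFLIP[name]) for name in COCO_KEYPOINTS]
        flipped.append([[xs[ii] for ii in idx_map],
                        [kps[1][ii] for ii in idx_map],
                        [kps[2][ii] for ii in idx_map]])
    return flipped
```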


@@ -13,12 +13,13 @@ import numpy as np
from nuscenes.nuscenes import NuScenes from nuscenes.nuscenes import NuScenes
from nuscenes.utils import splits from nuscenes.utils import splits
from utils.iou import get_iou_matches
from utils.misc import append_cluster from ..utils.iou import get_iou_matches
from utils.nuscenes import select_categories from ..utils.misc import append_cluster
from utils.camera import project_3d from ..utils.nuscenes import select_categories
from utils.pifpaf import preprocess_pif from ..utils.camera import project_3d
from utils.monoloco import get_monoloco_inputs from ..utils.pifpaf import preprocess_pif
from ..utils.network import get_monoloco_inputs
class PreprocessNuscenes: class PreprocessNuscenes:
@@ -35,7 +36,7 @@ class PreprocessNuscenes:
} }
dic_names = defaultdict(lambda: defaultdict(list)) dic_names = defaultdict(lambda: defaultdict(list))
def __init__(self, dir_ann, dir_nuscenes, dataset, iou_min=0.3): def __init__(self, dir_ann, dir_nuscenes, dataset, iou_min):
logging.basicConfig(level=logging.INFO) logging.basicConfig(level=logging.INFO)
self.logger = logging.getLogger(__name__) self.logger = logging.getLogger(__name__)
@@ -58,21 +59,13 @@ class PreprocessNuscenes:
""" """
Prepare arrays for training Prepare arrays for training
""" """
cnt_scenes = 0 cnt_scenes = cnt_samples = cnt_sd = cnt_ann = 0
cnt_samples = 0
cnt_sd = 0
cnt_ann = 0
start = time.time() start = time.time()
for ii, scene in enumerate(self.scenes): for ii, scene in enumerate(self.scenes):
end_scene = time.time() end_scene = time.time()
current_token = scene['first_sample_token'] current_token = scene['first_sample_token']
cnt_scenes += 1 cnt_scenes += 1
if ii == 0: time_left = str((end_scene - start_scene) / 60 * (len(self.scenes) - ii))[:4] if ii != 0 else "NaN"
time_left = "Nan"
else:
time_left = str((end_scene-start_scene)/60 * (len(self.scenes) - ii))[:4]
sys.stdout.write('\r' + 'Elaborating scene {}, remaining time {} minutes' sys.stdout.write('\r' + 'Elaborating scene {}, remaining time {} minutes'
.format(cnt_scenes, time_left) + '\t\n') .format(cnt_scenes, time_left) + '\t\n')
@@ -93,29 +86,9 @@ class PreprocessNuscenes:
for cam in self.CAMERAS: for cam in self.CAMERAS:
sd_token = sample_dic['data'][cam] sd_token = sample_dic['data'][cam]
cnt_sd += 1 cnt_sd += 1
path_im, boxes_obj, kk = self.nusc.get_sample_data(sd_token, box_vis_level=1) # At least one corner
kk = kk.tolist()
# Extract all the annotations of the person # Extract all the annotations of the person
boxes_gt = [] name, boxes_gt, boxes_3d, dds, kk = self.extract_from_token(sd_token)
dds = []
boxes_3d = []
name = os.path.basename(path_im)
for box_obj in boxes_obj:
if box_obj.name[:6] != 'animal':
general_name = box_obj.name.split('.')[0] + '.' + box_obj.name.split('.')[1]
else:
general_name = 'animal'
if general_name in select_categories('all'):
box = project_3d(box_obj, kk)
dd = np.linalg.norm(box_obj.center)
boxes_gt.append(box)
dds.append(dd)
box_3d = box_obj.center.tolist() + box_obj.wlh.tolist()
boxes_3d.append(box_3d)
self.dic_names[name]['boxes'].append(box)
self.dic_names[name]['dds'].append(dd)
self.dic_names[name]['K'] = kk
# Run IoU with pifpaf detections and save # Run IoU with pifpaf detections and save
path_pif = os.path.join(self.dir_ann, name + '.pifpaf.json') path_pif = os.path.join(self.dir_ann, name + '.pifpaf.json')
@@ -124,23 +97,24 @@ class PreprocessNuscenes:
if exists: if exists:
with open(path_pif, 'r') as file: with open(path_pif, 'r') as file:
annotations = json.load(file) annotations = json.load(file)
boxes, keypoints = preprocess_pif(annotations, im_size=(1600, 900))
else:
continue
boxes, keypoints = preprocess_pif(annotations, im_size=(1600, 900)) if keypoints:
inputs = get_monoloco_inputs(keypoints, kk).tolist()
if keypoints: matches = get_iou_matches(boxes, boxes_gt, self.iou_min)
inputs = get_monoloco_inputs(keypoints, kk).tolist() for (idx, idx_gt) in matches:
self.dic_jo[phase]['kps'].append(keypoints[idx])
matches = get_iou_matches(boxes, boxes_gt, self.iou_min) self.dic_jo[phase]['X'].append(inputs[idx])
for (idx, idx_gt) in matches: self.dic_jo[phase]['Y'].append([dds[idx_gt]]) # Trick to make it (nn,1)
self.dic_jo[phase]['kps'].append(keypoints[idx]) self.dic_jo[phase]['names'].append(name) # One image name for each annotation
self.dic_jo[phase]['X'].append(inputs[idx]) self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_gt])
self.dic_jo[phase]['Y'].append([dds[idx_gt]]) # Trick to make it (nn,1) self.dic_jo[phase]['K'].append(kk)
self.dic_jo[phase]['names'].append(name) # One image name for each annotation append_cluster(self.dic_jo, phase, inputs[idx], dds[idx_gt], keypoints[idx])
self.dic_jo[phase]['boxes_3d'].append(boxes_3d[idx_gt]) cnt_ann += 1
self.dic_jo[phase]['K'].append(kk) sys.stdout.write('\r' + 'Saved annotations {}'.format(cnt_ann) + '\t')
append_cluster(self.dic_jo, phase, inputs[idx], dds[idx_gt], keypoints[idx])
cnt_ann += 1
sys.stdout.write('\r' + 'Saved annotations {}'.format(cnt_ann) + '\t')
current_token = sample_dic['next'] current_token = sample_dic['next']
@@ -154,33 +128,55 @@ class PreprocessNuscenes:
.format(cnt_ann, cnt_samples, cnt_scenes, (end-start)/60)) .format(cnt_ann, cnt_samples, cnt_scenes, (end-start)/60))
print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints)) print("\nOutput files:\n{}\n{}\n".format(self.path_names, self.path_joints))
def extract_from_token(self, sd_token):
boxes_gt = []
dds = []
boxes_3d = []
path_im, boxes_obj, kk = self.nusc.get_sample_data(sd_token, box_vis_level=1) # At least one corner
kk = kk.tolist()
name = os.path.basename(path_im)
for box_obj in boxes_obj:
if box_obj.name[:6] != 'animal':
general_name = box_obj.name.split('.')[0] + '.' + box_obj.name.split('.')[1]
else:
general_name = 'animal'
if general_name in select_categories('all'):
box = project_3d(box_obj, kk)
dd = np.linalg.norm(box_obj.center)
boxes_gt.append(box)
dds.append(dd)
box_3d = box_obj.center.tolist() + box_obj.wlh.tolist()
boxes_3d.append(box_3d)
self.dic_names[name]['boxes'].append(box)
self.dic_names[name]['dds'].append(dd)
self.dic_names[name]['K'] = kk
return name, boxes_gt, boxes_3d, dds, kk
def factory(dataset, dir_nuscenes):
"""Define dataset type and split training and validation"""
assert dataset in ['nuscenes', 'nuscenes_mini', 'nuscenes_teaser']
if dataset == 'nuscenes_mini':
if dataset == 'nuscenes': version = 'v1.0-mini'
nusc = NuScenes(version='v1.0-trainval', dataroot=dir_nuscenes, verbose=True)
scenes = nusc.scene
split_scenes = splits.create_splits_scenes()
split_train, split_val = split_scenes['train'], split_scenes['val']
elif dataset == 'nuscenes_mini':
nusc = NuScenes(version='v1.0-mini', dataroot=dir_nuscenes, verbose=True)
scenes = nusc.scene
split_scenes = splits.create_splits_scenes()
split_train, split_val = split_scenes['train'], split_scenes['val']
else:
nusc = NuScenes(version='v1.0-trainval', dataroot=dir_nuscenes, verbose=True) version = 'v1.0-trainval'
nusc = NuScenes(version=version, dataroot=dir_nuscenes, verbose=True)
scenes = nusc.scene
if dataset == 'nuscenes_teaser':
with open("splits/nuscenes_teaser_scenes.txt", "r") as file: with open("splits/nuscenes_teaser_scenes.txt", "r") as file:
teaser_scenes = file.read().splitlines() teaser_scenes = file.read().splitlines()
scenes = nusc.scene
scenes = [scene for scene in scenes if scene['token'] in teaser_scenes]
with open("splits/split_nuscenes_teaser.json", "r") as file:
dic_split = json.load(file)
split_train = [scene['name'] for scene in scenes if scene['token'] in dic_split['train']]
split_val = [scene['name'] for scene in scenes if scene['token'] in dic_split['val']]
else:
split_scenes = splits.create_splits_scenes()
split_train, split_val = split_scenes['train'], split_scenes['val']
return nusc, scenes, split_train, split_val


@ -0,0 +1,54 @@
import numpy as np
COCO_KEYPOINTS = [
'nose', # 1
'left_eye', # 2
'right_eye', # 3
'left_ear', # 4
'right_ear', # 5
'left_shoulder', # 6
'right_shoulder', # 7
'left_elbow', # 8
'right_elbow', # 9
'left_wrist', # 10
'right_wrist', # 11
'left_hip', # 12
'right_hip', # 13
'left_knee', # 14
'right_knee', # 15
'left_ankle', # 16
'right_ankle', # 17
]
HFLIP = {
'nose': 'nose',
'left_eye': 'right_eye',
'right_eye': 'left_eye',
'left_ear': 'right_ear',
'right_ear': 'left_ear',
'left_shoulder': 'right_shoulder',
'right_shoulder': 'left_shoulder',
'left_elbow': 'right_elbow',
'right_elbow': 'left_elbow',
'left_wrist': 'right_wrist',
'right_wrist': 'left_wrist',
'left_hip': 'right_hip',
'right_hip': 'left_hip',
'left_knee': 'right_knee',
'right_knee': 'left_knee',
'left_ankle': 'right_ankle',
'right_ankle': 'left_ankle',
}
def transform_keypoints(keypoints, mode):
assert mode == 'flip', "mode not recognized"
kps = np.array(keypoints)
dic_kps = {key: kps[:, :, idx] for idx, key in enumerate(COCO_KEYPOINTS)}
kps_hflip = np.array([dic_kps[value] for key, value in HFLIP.items()])
kps_hflip = np.transpose(kps_hflip, (1, 2, 0))
return kps_hflip.tolist()
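
Note that transform_keypoints only swaps the left/right joint channels; mirroring the x-coordinates themselves is left to the caller. A quick check of the swap (a sketch, assuming the function above is importable):

```
import numpy as np

kps = np.arange(51, dtype=float).reshape(1, 3, 17).tolist()  # 1 instance, (x, y, c) rows x 17 joints
flipped = transform_keypoints(kps, mode='flip')
# the left_eye slot (index 1) now holds the right_eye values (index 2) and vice versa
assert flipped[0][0][1] == kps[0][0][2]
assert flipped[0][0][2] == kps[0][0][1]
```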


@ -1,21 +1,19 @@
# pylint: skip-file
import argparse
import os
import sys
sys.path.insert(0, os.path.join('.', 'features'))
sys.path.insert(0, os.path.join('.', 'models'))
from openpifpaf.network import nets
from openpifpaf import decoder
from features.preprocess_nu import PreprocessNuscenes
from features.preprocess_ki import PreprocessKitti from .prep.preprocess_nu import PreprocessNuscenes
from predict.predict import predict from .prep.preprocess_ki import PreprocessKitti
from models.trainer import Trainer from .predict.predict import predict
from eval.generate_kitti import generate_kitti from .train.trainer import Trainer
from eval.geom_baseline import geometric_baseline from .eval.generate_kitti import GenerateKitti
from models.hyp_tuning import HypTuning from .eval.geom_baseline import geometric_baseline
from eval.kitti_eval import KittiEval from .train.hyp_tuning import HypTuning
from visuals.webcam import webcam from .eval.eval_kitti import EvalKitti
from .visuals.webcam import webcam
def cli():
@ -37,6 +35,7 @@ def cli():
default='nuscenes')
prep_parser.add_argument('--dir_nuscenes', help='directory of nuscenes devkit',
default='data/nuscenes/')
prep_parser.add_argument('--iou_min', help='minimum iou to match ground truth', type=float, default=0.3)
# Predict (2D pose and/or 3D location from images)
# General
@ -59,9 +58,9 @@ def cli():
default="data/models/monoloco-190513-1437.pkl") default="data/models/monoloco-190513-1437.pkl")
predict_parser.add_argument('--hidden_size', type=int, help='Number of hidden units in the model', default=256) predict_parser.add_argument('--hidden_size', type=int, help='Number of hidden units in the model', default=256)
predict_parser.add_argument('--path_gt', help='path of json file with gt 3d localization', predict_parser.add_argument('--path_gt', help='path of json file with gt 3d localization',
default='data/arrays/names-kitti-190513-1754.json') default='data/arrays/names-kitti-190710-1206.json')
predict_parser.add_argument('--transform', help='transformation for the pose', default='None')
predict_parser.add_argument('--draw_kps', help='to draw kps in the images', action='store_true') predict_parser.add_argument('--draw_box', help='to draw box in the images', action='store_true')
predict_parser.add_argument('--predict', help='whether to make prediction', action='store_true')
predict_parser.add_argument('--z_max', type=int, help='maximum meters distance for predictions', default=22)
predict_parser.add_argument('--n_dropout', type=int, help='Epistemic uncertainty evaluation', default=0)
@ -87,7 +86,7 @@ def cli():
# Evaluation
eval_parser.add_argument('--dataset', help='datasets to evaluate, kitti or nuscenes', default='kitti')
eval_parser.add_argument('--geometric', help='to evaluate geometric distance', action='store_true')
eval_parser.add_argument('--generate', help='create txt files for KITTI evaluation', action='store_true')
eval_parser.add_argument('--dir_ann', help='directory of annotations of 2d joints (for KITTI evaluation')
eval_parser.add_argument('--model', help='path of MonoLoco model to load', required=True)
@ -96,7 +95,9 @@ def cli():
eval_parser.add_argument('--dropout', type=float, help='dropout. Default no dropout', default=0.2)
eval_parser.add_argument('--hidden_size', type=int, help='Number of hidden units in the model', default=256)
eval_parser.add_argument('--n_stage', type=int, help='Number of stages in the model', default=3)
eval_parser.add_argument('--show', help='whether to show eval statistics', action='store_true') eval_parser.add_argument('--show', help='whether to show statistic graphs', action='store_true')
eval_parser.add_argument('--verbose', help='verbosity of statistics', action='store_true')
eval_parser.add_argument('--stereo', help='include stereo baseline results', action='store_true')
args = parser.parse_args()
return args
@ -113,10 +114,10 @@ def main():
elif args.command == 'prep':
if 'nuscenes' in args.dataset:
prep = PreprocessNuscenes(args.dir_ann, args.dir_nuscenes, args.dataset) prep = PreprocessNuscenes(args.dir_ann, args.dir_nuscenes, args.dataset, args.iou_min)
prep.run()
if 'kitti' in args.dataset:
prep = PreprocessKitti(args.dir_ann) prep = PreprocessKitti(args.dir_ann, args.iou_min)
prep.run()
elif args.command == 'train':
@ -139,10 +140,13 @@ def main():
geometric_baseline(args.joints)
if args.generate:
generate_kitti(args.model, args.dir_ann, p_dropout=args.dropout, n_dropout=args.n_dropout) kitti_txt = GenerateKitti(args.model, args.dir_ann, p_dropout=args.dropout, n_dropout=args.n_dropout)
kitti_txt.run_mono()
if args.stereo:
kitti_txt.run_stereo()
if args.dataset == 'kitti':
kitti_eval = KittiEval() kitti_eval = EvalKitti(verbose=args.verbose, stereo=args.stereo)
kitti_eval.run()
kitti_eval.printer(show=args.show)
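
With the new package layout and the extra flags above, typical prep and eval calls could look as follows (a sketch: the exact entry point depends on how the monoloco package is invoked after this refactor, and the paths are illustrative):

```
python3 -m monoloco.main prep --dataset nuscenes_mini --dir_ann data/annotations --iou_min 0.3
python3 -m monoloco.main eval --dataset kitti --model data/models/monoloco-190513-1437.pkl \
        --dir_ann data/annotations --generate --stereo --verbose
```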


@ -3,47 +3,47 @@ import torch.nn as nn
class TriLinear(nn.Module):
"""
As Bilinear but without skip connection
"""
def __init__(self, input_size, output_size, p_dropout, linear_size=1024):
super(TriLinear, self).__init__()
self.input_size = input_size
self.output_size = output_size
self.l_size = linear_size
self.relu = nn.ReLU(inplace=True)
self.dropout = nn.Dropout(p_dropout)
self.w1 = nn.Linear(self.input_size, self.l_size)
self.batch_norm1 = nn.BatchNorm1d(self.l_size)
self.w2 = nn.Linear(self.l_size, self.l_size)
self.batch_norm2 = nn.BatchNorm1d(self.l_size)
self.w3 = nn.Linear(self.l_size, self.output_size)
def forward(self, x):
y = self.w1(x)
y = self.batch_norm1(y)
y = self.relu(y)
y = self.dropout(y)
y = self.w2(y)
y = self.batch_norm2(y)
y = self.relu(y)
y = self.dropout(y)
y = self.w3(y)
return y
def weight_init(m): def weight_init(batch):
"""TO initialize weights using kaiming initialization""" """TO initialize weights using kaiming initialization"""
if isinstance(m, nn.Linear): if isinstance(batch, nn.Linear):
nn.init.kaiming_normal_(m.weight) nn.init.kaiming_normal_(batch.weight)
class Linear(nn.Module):
@ -93,7 +93,7 @@ class LinearModel(nn.Module):
self.batch_norm1 = nn.BatchNorm1d(self.linear_size)
self.linear_stages = []
for l in range(num_stage): for _ in range(num_stage):
self.linear_stages.append(Linear(self.linear_size, self.p_dropout))
self.linear_stages = nn.ModuleList(self.linear_stages)
@ -109,11 +109,8 @@ class LinearModel(nn.Module):
y = self.batch_norm1(y)
y = self.relu(y)
y = self.dropout(y)
# linear layers
for i in range(self.num_stage):
y = self.linear_stages[i](y)
y = self.w2(y)
return y


@ -54,10 +54,3 @@ class KeypointsDataset(Dataset):
count = len(self.dic_clst[clst]['Y'])
return inputs, outputs, count


@ -1,13 +1,16 @@
import math
import os
import json
import time
import logging
import torch
import random
import datetime
import torch
import numpy as np
from models.trainer import Trainer
from .trainer import Trainer
class HypTuning:
@ -30,12 +33,10 @@ class HypTuning:
if not os.path.exists(dir_logs):
os.makedirs(dir_logs)
now = datetime.datetime.now()
now_time = now.strftime("%Y%m%d-%H%M")[2:]
name_out = 'hyp-baseline-' if baseline else 'hyp-monoloco-'
self.path_log = os.path.join(dir_logs, name_out + now_time) self.path_log = os.path.join(dir_logs, name_out)
self.path_model = os.path.join(dir_out, name_out + now_time + '.pkl') self.path_model = os.path.join(dir_out, name_out)
logging.basicConfig(level=logging.INFO)
self.logger = logging.getLogger(__name__)
@ -49,7 +50,7 @@ class HypTuning:
random.shuffle(self.sched_step)
self.bs_list = [64, 128, 256, 512, 1024, 2048] * multiplier
random.shuffle(self.bs_list)
self.hidden_list = [128, 256, 512, 128, 256, 512] * multiplier self.hidden_list = [256, 256, 256, 256, 256, 256] * multiplier
random.shuffle(self.hidden_list)
self.n_stage_list = [3, 3, 3, 3, 3, 3] * multiplier
random.shuffle(self.n_stage_list)
@ -104,11 +105,14 @@ class HypTuning:
dic_err_best = dic_err
best_acc_val = acc_val
model_best = model
torch.save(model_best.state_dict(), self.path_model)
with open(self.path_log, 'w') as f:
json.dump(dic_best, f)
# Save model and log
now = datetime.datetime.now()
now_time = now.strftime("%Y%m%d-%H%M")[2:]
self.path_model = self.path_model + now_time + '.pkl'
torch.save(model_best.state_dict(), self.path_model)
with open(self.path_log + now_time, 'w') as f:
json.dump(dic_best, f)
end = time.time()
print('\n\n\n')
self.logger.info(" Tried {} combinations".format(cnt))


@ -52,8 +52,6 @@ class CustomL1Loss(torch.nn.Module):
weights = torch.from_numpy(weights_np).float().to(self.device) # To make weights in the same cuda device
losses = torch.abs(output - target) * weights
loss = losses.mean() # Mean over the batch
# self.print_loss()
return loss
@ -66,7 +64,7 @@ class LaplacianLoss(torch.nn.Module):
self.reduce = reduce
self.evaluate = evaluate
def laplacian_1d(self, mu_si, xx):
"""
1D Gaussian Loss. f(x | mu, sigma). The network outputs mu and sigma. X is the ground truth distance.
This supports backward().
@ -84,8 +82,7 @@ class LaplacianLoss(torch.nn.Module):
if self.evaluate:
return norm_bi
else: return term_a + term_b
return term_a + term_b
def forward(self, outputs, targets):
@ -109,13 +106,12 @@ class GaussianLoss(torch.nn.Module):
self.evaluate = evaluate
self.device = device
def gaussian_1d(self, mu_si, xx):
"""
1D Gaussian Loss. f(x | mu, sigma). The network outputs mu and sigma. X is the ground truth distance.
This supports backward().
Inspired by
https://github.com/naba89/RNN-Handwriting-Generation-Pytorch/blob/master/loss_functions.py
"""
mu, si = mu_si[:, 0:1], mu_si[:, 1:2]
@ -129,8 +125,8 @@ class GaussianLoss(torch.nn.Module):
if self.evaluate:
return norm_si
else:
return term_a + term_b
def forward(self, outputs, targets):


@ -1,3 +1,9 @@
# pylint: skip-file # TODO
"""
Training and evaluation of a neural network which predicts 3D localization and confidence intervals
given 2d joints
"""
import copy
import os
@ -13,19 +19,14 @@ import torch.nn as nn
from torch.utils.data import DataLoader
from torch.optim import lr_scheduler
from models.datasets import KeypointsDataset from .datasets import KeypointsDataset
from models.architectures import LinearModel from .architectures import LinearModel
from models.losses import LaplacianLoss from .losses import LaplacianLoss
from utils.logs import set_logger from ..utils.logs import set_logger
from utils.monoloco import epistemic_variance, laplace_sampling, unnormalize_bi from ..utils.network import laplace_sampling, unnormalize_bi
class Trainer:
"""
Training and evaluation of a neural network which predicts 3D localization and confidence intervals
given 2d joints
"""
def __init__(self, joints, epochs=100, bs=256, dropout=0.2, lr=0.002,
sched_step=20, sched_gamma=1, hidden_size=256, n_stage=3, r_seed=1, n_dropout=0, n_samples=100,
baseline=False, save=False, print_loss=False):
@ -123,10 +124,7 @@ class Trainer:
best_model_wts = copy.deepcopy(self.model.state_dict())
best_acc = 1e6
best_epoch = 0
epoch_losses_tr = [] epoch_losses_tr = epoch_losses_val = epoch_norms = epoch_sis = []
epoch_losses_val = []
epoch_norms = []
epoch_sis = []
for epoch in range(self.num_epochs):
@ -138,10 +136,7 @@ class Trainer:
else:
self.model.eval() # Set model to evaluate mode
running_loss_tr = 0.0 running_loss_tr = running_loss_eval = norm_tr = bi_tr = 0.0
running_loss_eval = 0.0
norm_tr = 0.0
bi_tr = 0.0
# Iterate over data.
for inputs, labels, _, _ in self.dataloaders[phase]:
@ -156,10 +151,7 @@ class Trainer:
with torch.set_grad_enabled(phase == 'train'):
outputs = self.model(inputs)
if self.output_size == 2: outputs_eval = outputs[:, 0:1] if self.output_size == 2 else outputs
outputs_eval = outputs[:, 0:1] # Fundamental to put slices
else:
outputs_eval = outputs
loss = self.criterion(outputs, labels)
loss_eval = self.criterion_eval(outputs_eval, labels) # L1 loss to evaluation
@ -196,7 +188,8 @@ class Trainer:
time_elapsed = time.time() - since
print('\n\n' + '-'*120)
self.logger.info('Training:\nTraining complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60)) self.logger.info('Training:\nTraining complete in {:.0f}m {:.0f}s'
.format(time_elapsed // 60, time_elapsed % 60))
self.logger.info('Best validation Accuracy: {:.3f}'.format(best_acc))
self.logger.info('Saved weights of the model at epoch: {}'.format(best_epoch))
@ -251,7 +244,7 @@ class Trainer:
total_outputs = torch.empty((0, len(labels))).to(self.device)
if self.n_dropout > 0:
for ii in range(self.n_dropout): for _ in range(self.n_dropout):
outputs = self.model(inputs)
outputs = unnormalize_bi(outputs)
samples = laplace_sampling(outputs, self.n_samples)
@ -269,8 +262,6 @@ class Trainer:
if not self.baseline:
outputs = unnormalize_bi(outputs)
avg_distance = float(self.criterion_eval(outputs[:, 0:1], labels).item())
dic_err[phase]['all'] = self.compute_stats(outputs, labels, varss, dic_err[phase]['all'], size_eval)
print('-'*120)
@ -323,26 +314,25 @@ class Trainer:
if self.baseline:
return (mean_mu, max_mu), (0, 0, 0)
else: mean_bi = torch.mean(outputs[:, 1]).item()
mean_bi = torch.mean(outputs[:, 1]).item()
low_bound_bi = labels >= (outputs[:, 0] - outputs[:, 1])
up_bound_bi = labels <= (outputs[:, 0] + outputs[:, 1])
bools_bi = low_bound_bi & up_bound_bi
conf_bi = float(torch.sum(bools_bi)) / float(bools_bi.shape[0])
# if varss[0] >= 0:
# mean_var = torch.mean(varss).item()
# max_var = torch.max(varss).item()
#
# low_bound_var = labels >= (outputs[:, 0] - varss)
# up_bound_var = labels <= (outputs[:, 0] + varss)
# bools_var = low_bound_var & up_bound_var
# conf_var = float(torch.sum(bools_var)) / float(bools_var.shape[0])
dic_err['mean'] += mean_mu * (outputs.size(0) / size_eval)
dic_err['bi'] += mean_bi * (outputs.size(0) / size_eval)
dic_err['count'] += (outputs.size(0) / size_eval)
dic_err['conf_bi'] += conf_bi * (outputs.size(0) / size_eval)
return dic_err


@ -10,9 +10,9 @@ def pixel_to_camera(uv_tensor, kk, z_met):
It accepts lists or tensors of (m, 2) or (m, x, 2) or (m, 2, x)
where x is the number of keypoints
"""
if type(uv_tensor) == list: if isinstance(uv_tensor, list):
uv_tensor = torch.tensor(uv_tensor)
if type(kk) == list: if isinstance(kk, list):
kk = torch.tensor(kk)
if uv_tensor.size()[-1] != 2:
uv_tensor = uv_tensor.permute(0, 2, 1) # permute to have 2 as last dim to be padded
@ -42,7 +42,7 @@ def project_3d(box_obj, kk):
box_2d = []
# Obtain the 3d points of the box
xc, yc, zc = box_obj.center
ww, ll, hh, = box_obj.wlh ww, _, hh, = box_obj.wlh
# Points corresponding to a box at the z of the center
x1 = xc - ww/2
@ -70,7 +70,7 @@ def get_keypoints(keypoints, mode):
Input --> list or torch.tensor [(m, 3, 17) or (3, 17)]
Output --> torch.tensor [(m, 2)]
"""
if type(keypoints) == list: if isinstance(keypoints, list):
keypoints = torch.tensor(keypoints)
if len(keypoints.size()) == 2: # add batch dim
keypoints = keypoints.unsqueeze(0)
@ -109,17 +109,15 @@ def get_keypoints(keypoints, mode):
def transform_kp(kps, tr_mode):
"""Apply different transformations to the keypoints based on the tr_mode"""
assert tr_mode == "None" or tr_mode == "singularity" or tr_mode == "upper" or tr_mode == "lower" \ assert tr_mode in ("None", "singularity", "upper", "lower", "horizontal", "vertical", "lateral",
or tr_mode == "horizontal" or tr_mode == "vertical" or tr_mode == "lateral" \ 'shoulder', 'knee', 'upside', 'falling', 'random')
or tr_mode == 'shoulder' or tr_mode == 'knee' or tr_mode == 'upside' or tr_mode == 'falling' \
or tr_mode == 'random'
uu_c, vv_c = get_keypoints(kps, mode='center')
if tr_mode == "None":
return kps
elif tr_mode == "singularity": if tr_mode == "singularity":
uus = [uu_c for uu in kps[0]]
vvs = [vv_c for vv in kps[1]]
@ -131,23 +129,6 @@ def transform_kp(kps, tr_mode):
uus = kps[0]
vvs = [vv_c for vv in kps[1]]
elif tr_mode == 'lower':
uus = kps[0]
vvs = kps[1][:9] + [vv_c for vv in kps[1][9:]]
elif tr_mode == 'upper':
uus = kps[0]
vvs = [vv_c for vv in kps[1][:9]] + kps[1][9:]
elif tr_mode == 'lateral':
uus = []
for idx, kp in enumerate(kps[0]):
if idx % 2 == 1:
uus.append(kp)
else:
uus.append(uu_c)
vvs = kps[1]
elif tr_mode == 'shoulder':
uus = kps[0]
vvs = kps[1][:7] + [kps[1][6] for vv in kps[1][7:]]
@ -183,7 +164,7 @@ def xyz_from_distance(distances, xy_centers):
xy_centers --> tensor(m,3) or (3)
"""
if type(distances) == float: if isinstance(distances, float):
distances = torch.tensor(distances).unsqueeze(0)
if len(distances.size()) == 1:
distances = distances.unsqueeze(1)
@ -193,16 +174,3 @@ def xyz_from_distance(distances, xy_centers):
assert xy_centers.size()[-1] == 3 and distances.size()[-1] == 1, "Size of tensor not recognized"
return xy_centers * distances / torch.sqrt(1 + xy_centers[:, 0:1].pow(2) + xy_centers[:, 1:2].pow(2))
def pixel_to_camera_old(uv1, kk, z_met):
"""
(3,) array --> (3,) array
Convert a point in pixel coordinate to absolute camera coordinates
"""
if len(uv1) == 2:
uv1.append(1)
kk_1 = np.linalg.inv(kk)
xyz_met_norm = np.dot(kk_1, uv1)
xyz_met = xyz_met_norm * z_met
return xyz_met


@ -68,5 +68,3 @@ def reorder_matches(matches, boxes, mode='left_rigth'):
matches_left = [idx for (idx, _) in matches]
return [matches[matches_left.index(idx_boxes)] for idx_boxes in ordered_boxes if idx_boxes in matches_left]


@ -1,6 +1,7 @@
import math
import numpy as np
import math
def get_calibration(path_txt):
@ -69,28 +70,27 @@ def get_simplified_calibration(path_txt):
raise ValueError('Matrix K_02 not found in the file')
def check_conditions(line, mode, thresh=0.3): def check_conditions(line, category, method, thresh=0.3):
"""Check conditions of our or m3d txt file""" """Check conditions of our or m3d txt file"""
check = False check = False
assert mode in ['gt', 'gt_all', 'm3d', '3dop','our'], "Mode %r not recognized" % mode assert method in ['gt', 'm3d', '3dop', 'our'], "Method %r not recognized" % method
assert category in ['pedestrian', 'cyclist', 'all']
if mode == 'm3d' or mode == '3dop': if method in ('m3d', '3dop'):
conf = line.split()[15]
if line[:10] == 'pedestrian' and float(conf) >= thresh: if line.split()[0] == category and float(conf) >= thresh:
check = True
elif mode == 'gt': elif method == 'gt':
# if line[:10] == 'Pedestrian' or line[:10] == 'Person_sit': if category == 'all':
if line[:10] == 'Pedestrian': categories_gt = ['Pedestrian', 'Person_sitting', 'Cyclist']
else:
categories_gt = [category.upper()[0] + category[1:]] # Upper case names
if line.split()[0] in categories_gt:
check = True
# Consider also person sitting and cyclists categories elif method == 'our':
elif mode == 'gt_all':
if line[:10] == 'Pedestrian' or line[:10] == 'Person_sit' or line[:7] == 'Cyclist':
check = True
elif mode == 'our':
if line[4] >= thresh:
check = True
@ -130,23 +130,25 @@ def split_training(names_gt, path_train, path_val):
return set_train, set_val
def parse_ground_truth(path_gt, mode='gt'): def parse_ground_truth(path_gt, category):
"""Parse KITTI ground truth files""" """Parse KITTI ground truth files"""
boxes_gt = [] boxes_gt = []
dds_gt = [] dds_gt = []
zzs_gt = []
truncs_gt = [] # Float from 0 to 1
occs_gt = [] # Either 0,1,2,3 fully visible, partly occluded, largely occluded, unknown
boxes_3d = []
with open(path_gt, "r") as f_gt:
for line_gt in f_gt:
if check_conditions(line_gt, mode=mode): if check_conditions(line_gt, category, method='gt'):
truncs_gt.append(float(line_gt.split()[1]))
occs_gt.append(int(line_gt.split()[2]))
boxes_gt.append([float(x) for x in line_gt.split()[4:8]])
loc_gt = [float(x) for x in line_gt.split()[11:14]]
wlh = [float(x) for x in line_gt.split()[8:11]]
boxes_3d.append(loc_gt + wlh)
zzs_gt.append(loc_gt[2])
dds_gt.append(math.sqrt(loc_gt[0] ** 2 + loc_gt[1] ** 2 + loc_gt[2] ** 2))
return boxes_gt, boxes_3d, dds_gt, truncs_gt, occs_gt return boxes_gt, boxes_3d, dds_gt, zzs_gt, truncs_gt, occs_gt
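
For reference, the slices above follow the standard KITTI label layout (type, truncation, occlusion, alpha, 2D box, dimensions, location, rotation). A minimal worked example with a representative pedestrian line (values are illustrative):

```
import math

line_gt = "Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01"
fields = line_gt.split()
trunc, occ = float(fields[1]), int(fields[2])   # truncation in [0, 1], occlusion in {0, 1, 2, 3}
box_gt = [float(x) for x in fields[4:8]]        # 2D box: left, top, right, bottom
loc_gt = [float(x) for x in fields[11:14]]      # camera coordinates x, y, z
zz_gt = loc_gt[2]                               # depth, now also returned for the stereo baseline
dd_gt = math.sqrt(sum(x ** 2 for x in loc_gt))  # 3D distance, ~8.73 m here
```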


@ -1,4 +1,6 @@
import random
def append_cluster(dic_jo, phase, xx, dd, kps):
"""Append the annotation based on its distance"""
@ -24,11 +26,21 @@ def append_cluster(dic_jo, phase, xx, dd, kps):
dic_jo[phase]['clst']['>30']['Y'].append([dd])
def get_task_error(dd): def get_task_error(dd, mode='std'):
"""Get target error not knowing the gender""" """Get target error not knowing the gender"""
mm_gender = 0.0556 assert mode in ('std', 'mad')
if mode == 'std':
mm_gender = 0.0557
elif mode == 'mad': # mean absolute deviation
mm_gender = 0.0457
return mm_gender * dd
def get_pixel_error(dd_gt, zz_gt):
"""calculate error in stereo distance due to +-1 pixel mismatch (function of depth)"""
disp = 0.54 * 721 / zz_gt
random.seed(1)
sign = random.choice((-1, 1))
delta_z = zz_gt - 0.54 * 721 / (disp + sign)
return dd_gt + delta_z
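
As a quick numeric check of the relation encoded in get_pixel_error (KITTI-like stereo baseline B = 0.54 m and focal length f = 721 px), the depth error for a 1-pixel disparity mismatch grows roughly as z²/(f·B):

```
f_b = 0.54 * 721                   # f * B, ~389.3 px * m
for zz in (10., 20., 30.):
    disp = f_b / zz                # expected disparity at depth zz
    err = zz - f_b / (disp + 1)    # depth error for a +1 px mismatch
    print("z = {:2.0f} m -> error = {:.2f} m (z^2/fB = {:.2f})".format(zz, err, zz ** 2 / f_b))
```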


@ -1,7 +1,7 @@
import numpy as np
import torch
from utils.camera import get_keypoints, pixel_to_camera from ..utils.camera import get_keypoints, pixel_to_camera
def get_monoloco_inputs(keypoints, kk):
@ -16,8 +16,9 @@ def get_monoloco_inputs(keypoints, kk):
kk = torch.tensor(kk)
# Projection in normalized image coordinates and zero-center with the center of the bounding box
uv_center = get_keypoints(keypoints, mode='center')
xy1_center = pixel_to_camera(uv_center, kk, 1) * 10 xy1_center = pixel_to_camera(uv_center, kk, 10)
xy1_all = pixel_to_camera(keypoints[:, 0:2, :], kk, 1) * 10 xy1_all = pixel_to_camera(keypoints[:, 0:2, :], kk, 10)
# xy1_center[:, 1].fill_(0) #TODO
kps_norm = xy1_all - xy1_center.unsqueeze(1) # (m, 17, 3) - (m, 1, 3)
kps_out = kps_norm[:, :, 0:2].reshape(kps_norm.size()[0], -1) # no contiguous for view
return kps_out
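
A minimal shape check of the preprocessing above (a sketch, assuming get_monoloco_inputs and its camera utilities are importable): each instance is flattened into 34 zero-centered (x, y) values in normalized image coordinates.

```
import torch

kk = [[721., 0., 610.], [0., 721., 172.], [0., 0., 1.]]  # illustrative intrinsics
keypoints = torch.ones((2, 3, 17))   # 2 instances, (u, v, confidence) x 17 COCO joints
inputs = get_monoloco_inputs(keypoints, kk)
print(inputs.shape)                  # torch.Size([2, 34])
```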


@ -23,7 +23,7 @@ def get_unique_tokens(list_fin):
return list_token_scene
def split_scenes(list_token_scene, tr, val, dir_main, save=False, load=True): def split_scenes(list_token_scene, train, val, dir_main, save=False, load=True):
""" """
Split the list according tr, val percentages (test percentage is a consequence) after shuffling the order Split the list according tr, val percentages (test percentage is a consequence) after shuffling the order
""" """
@ -34,7 +34,7 @@ def split_scenes(list_token_scene, tr, val, dir_main, save=False, load=True):
random.seed(1)
random.shuffle(list_token_scene) # it shuffles in place
n_scenes = len(list_token_scene)
n_train = round(n_scenes * tr / 100) n_train = round(n_scenes * train / 100)
n_val = round(n_scenes * val / 100)
list_train = list_token_scene[0: n_train]
list_val = list_token_scene[n_train: n_train + n_val]
@ -55,18 +55,16 @@ def select_categories(cat):
""" """
Choose the categories to extract annotations from Choose the categories to extract annotations from
""" """
assert cat == 'person' or cat == 'all' or cat == 'car' assert cat in ['person', 'all', 'car', 'cyclist']
if cat == 'person':
categories = ['human.pedestrian']
elif cat == 'all':
categories = ['human.pedestrian', categories = ['human.pedestrian', 'vehicle.bicycle', 'vehicle.motorcycle']
'vehicle.bicycle', 'vehicle.motorcycle'] elif cat == 'cyclist':
categories = ['vehicle.bicycle']
elif cat == 'car':
categories = ['vehicle']
return categories

54 monoloco/utils/pifpaf.py Normal file

@ -0,0 +1,54 @@
import numpy as np
def preprocess_pif(annotations, im_size=None):
"""
Preprocess pif annotations:
1. enlarge the box of 10%
2. Constraint it inside the image (if image_size provided)
"""
boxes = []
keypoints = []
for dic in annotations:
box = dic['bbox']
if box[3] < 0.5: # Check for no detections (boxes 0,0,0,0)
return [], []
kps = prepare_pif_kps(dic['keypoints'])
conf = float(np.sort(np.array(kps[2]))[-3]) # The confidence is the 3rd highest value for the keypoints
# Add 15% for y and 20% for x
delta_h = (box[3] - box[1]) / 7
delta_w = (box[2] - box[0]) / 3.5
assert delta_h > -5 and delta_w > -5, "Bounding box <=0"
box[0] -= delta_w
box[1] -= delta_h
box[2] += delta_w
box[3] += delta_h
# Put the box inside the image
if im_size is not None:
box[0] = max(0, box[0])
box[1] = max(0, box[1])
box[2] = min(box[2], im_size[0])
box[3] = min(box[3], im_size[1])
box.append(conf)
boxes.append(box)
keypoints.append(kps)
return boxes, keypoints
def prepare_pif_kps(kps_in):
"""Convert from a list of 51 to a list of 3, 17"""
assert len(kps_in) % 3 == 0, "keypoints expected as a multiple of 3"
xxs = kps_in[0:][::3]
yys = kps_in[1:][::3] # from offset 1 every 3
ccs = kps_in[2:][::3]
return [xxs, yys, ccs]
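
A small sanity check for the conversion above: pifpaf stores each annotation as a flat list of 51 floats (x, y, confidence per joint), which prepare_pif_kps regroups into three lists of 17.

```
kps_flat = list(range(51))          # stand-in for a real pifpaf 'keypoints' entry
xxs, yys, ccs = prepare_pif_kps(kps_flat)
assert len(xxs) == len(yys) == len(ccs) == 17
assert (xxs[0], yys[0], ccs[0]) == (0, 1, 2)   # first joint's (x, y, confidence)
```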

87 monoloco/utils/stereo.py Normal file

@ -0,0 +1,87 @@
import copy
import warnings
import numpy as np
def depth_from_disparity(zzs, kps, kps_right):
"""Associate instances in left and right images and compute disparity"""
zzs_stereo = []
zzs = np.array(zzs)
kps = np.array(kps)
kps_right_list = copy.deepcopy(kps_right)
cnt_stereo = 0
expected_disps = 0.54 * 721 / np.array(zzs)
for idx, zz_mono in enumerate(zzs):
if kps_right_list:
zz_stereo, disparity_x, disparity_y, idx_min = filter_disparities(kps, kps_right_list, idx, expected_disps)
if verify_stereo(zz_stereo, zz_mono, disparity_x, disparity_y):
zzs_stereo.append(zz_stereo)
cnt_stereo += 1
kps_right_list.pop(idx_min)
else:
zzs_stereo.append(zz_mono)
else:
zzs_stereo.append(zz_mono)
return zzs_stereo, cnt_stereo
def filter_disparities(kps, kps_right_list, idx, expected_disps):
"""filter joints based on confidence and interquartile range of the distribution"""
CONF_MIN = 0.3
kps_right = np.array(kps_right_list)
with warnings.catch_warnings() and np.errstate(invalid='ignore'):
try:
disparity_x = kps[idx, 0, :] - kps_right[:, 0, :]
disparity_y = kps[idx, 1, :] - kps_right[:, 1, :]
# Mask for low confidence
mask_conf_left = kps[idx, 2, :] > CONF_MIN
mask_conf_right = kps_right[:, 2, :] > CONF_MIN
mask_conf = mask_conf_left & mask_conf_right
disparity_x_conf = np.where(mask_conf, disparity_x, np.nan)
disparity_y_conf = np.where(mask_conf, disparity_y, np.nan)
# Mask outliers using iqr
mask_outlier = get_iqr_mask(disparity_x_conf)
disparity_x_mask = np.where(mask_outlier, disparity_x_conf, np.nan)
disparity_y_mask = np.where(mask_outlier, disparity_y_conf, np.nan)
avg_disparity_x = np.nanmedian(disparity_x_mask, axis=1) # ignore the nan
diffs_x = [abs(expected_disps[idx] - real) for real in avg_disparity_x]
idx_min = diffs_x.index(min(diffs_x))
zz_stereo = 0.54 * 721. / float(avg_disparity_x[idx_min])
except ZeroDivisionError:
zz_stereo = - 100
return zz_stereo, disparity_x_mask[idx_min], disparity_y_mask[idx_min], idx_min
def verify_stereo(zz_stereo, zz_mono, disparity_x, disparity_y):
COV_MIN = 0.1
y_max_difference = (50 / zz_mono)
z_max_difference = 0.6 * zz_mono
cov = float(np.nanstd(disparity_x) / np.abs(np.nanmean(disparity_x))) # Coefficient of variation
avg_disparity_y = np.nanmedian(disparity_y)
if abs(zz_stereo - zz_mono) < z_max_difference and \
avg_disparity_y < y_max_difference and \
cov < COV_MIN:
return True
return False
def get_iqr_mask(distribution):
quartile_1, quartile_3 = np.nanpercentile(distribution, [25, 75], axis=1)
iqr = quartile_3 - quartile_1
lower_bound = quartile_1 - (iqr * 1.5)
upper_bound = quartile_3 + (iqr * 1.5)
return (distribution < upper_bound.reshape(-1, 1)) & (distribution > lower_bound.reshape(-1, 1))
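
For intuition, get_iqr_mask keeps only disparities within 1.5 IQR of the quartiles, computed row-wise while ignoring NaNs. A toy example (assuming the function above is importable) with one obvious outlier:

```
import numpy as np

# One candidate instance, eight disparity values, one outlier among values near 20 px
disps = np.array([[19.5, 20.0, 20.5, 19.8, 20.2, 90.0, 19.9, 20.1]])
mask = get_iqr_mask(disps)
print(disps[mask])   # the 90.0 px outlier is filtered out, the rest survive
```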


@ -1,15 +1,15 @@
# pylint: skip-file
import numpy as np
import os
import math
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
from visuals.printer import get_angle
from visuals.printer import get_confidence
def paper():
"""Print paper figures"""
dir_out = os.path.join('data', 'all_images', 'paper')
method = True
task_error = True
@ -75,7 +75,7 @@ def paper():
plt.yticks([])
plt.xlabel('X [m]')
plt.ylabel('Z [m]')
plt.savefig(os.path.join(dir_out, fig_name)) # plt.savefig(os.path.join('docs', fig_name))
plt.show()
plt.close()
@ -107,7 +107,7 @@ def paper():
plt.xlabel("Distance from the camera [m]") plt.xlabel("Distance from the camera [m]")
plt.ylabel("Localization error due to human height variation [m]") plt.ylabel("Localization error due to human height variation [m]")
plt.legend(loc=(0.01, 0.55)) # Location from 0 to 1 from lower left plt.legend(loc=(0.01, 0.55)) # Location from 0 to 1 from lower left
plt.savefig(os.path.join(dir_out, fig_name)) # plt.savefig(os.path.join(dir_out, fig_name))
plt.show()
plt.close()
@ -121,11 +121,21 @@ def gmm():
std_men = 7
mu_women = 165
std_women = 7
N_men = np.random.normal(mu_men, std_men, 100000) N_men_1 = np.random.normal(mu_men, std_men, 1000000)
N_women = np.random.normal(mu_women, std_women, 100000) N_men_2 = np.random.normal(mu_men, std_men, 1000000)
N_gmm = np.concatenate((N_men, N_women)) N_women_1 = np.random.normal(mu_women, std_women, 1000000)
mu_gmm = np.mean(N_gmm) N_women_2 = np.random.normal(mu_women, std_women, 1000000)
std_gmm = np.std(N_gmm) N_gmm_1 = np.concatenate((N_men_1, N_women_1))
N_gmm_2 = np.concatenate((N_men_2, N_women_2))
mu_gmm_1 = np.mean(N_gmm_1)
mu_gmm_2 = np.mean(N_gmm_2)
std_gmm = np.std(N_gmm_1)
mm_gender = std_gmm / mu_gmm_1
var_gmm = np.var(N_gmm_1)
abs_diff_1 = np.abs(mu_gmm_1 - N_gmm_1)
abs_diff_2 = np.mean(np.abs(N_gmm_1 - N_gmm_2))
mean_deviation_1 = np.mean(abs_diff_1)
mean_deviation_2 = np.mean(abs_diff_2)
# sns.distplot(N_men, hist=False, rug=False, label="Men")
# sns.distplot(N_women, hist=False, rug=False, label="Women")
# sns.distplot(N_gmm, hist=False, rug=False, label="GMM")
@ -133,7 +143,21 @@ def gmm():
# plt.ylabel("Height distributions of men and women") # plt.ylabel("Height distributions of men and women")
# plt.legend() # plt.legend()
# plt.show() # plt.show()
print("Variace of GMM distribution: {:.2f}".format(std_gmm)) print("Mean of GMM distribution: {:.2f}".format(mu_gmm_1))
mm_gender = std_gmm / mu_gmm print("Standard deviation: {:.2f}".format(std_gmm))
print("Relative error (standard deviation) {:.3f} %".format(mm_gender * 100))
print("Variance: {:.2f}".format(var_gmm))
print("Mean deviation: {:.2f}".format(mean_deviation_1))
print("Mean deviation 2: {:.2f}".format(mean_deviation_2))
print("Relative error (mean absolute deviation): {:.3f} %".format((mean_deviation_1 / mu_gmm_1) * 100))
return mm_gender
def get_confidence(xx, zz, std):
theta = math.atan2(zz, xx)
delta_x = std * math.cos(theta)
delta_z = std * math.sin(theta)
return (xx - delta_x, xx + delta_x), (zz - delta_z, zz + delta_z)
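
The sampled statistics above also have a closed form: an equal-weight mixture of two Gaussians with common standard deviation sigma and mean gap delta has variance sigma² + (delta/2)². A three-line check recovers the 0.0557 constant used in get_task_error:

```
mu_men, mu_women, std = 178, 165, 7
mu_gmm = (mu_men + mu_women) / 2                      # 171.5 cm
var_gmm = std ** 2 + ((mu_men - mu_women) / 2) ** 2   # 49 + 42.25 = 91.25
print(var_gmm ** 0.5 / mu_gmm)                        # ~0.0557, the 'std' task error
```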

278 monoloco/visuals/printer.py Normal file

@ -0,0 +1,278 @@
import math
from collections import OrderedDict
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.patches import Ellipse, Circle, Rectangle
from mpl_toolkits.axes_grid1 import make_axes_locatable
from ..utils.camera import pixel_to_camera
from ..utils.misc import get_task_error
class Printer:
"""
Print results on images: birds eye view and computed distance
"""
FONTSIZE_BV = 16
FONTSIZE = 18
TEXTCOLOR = 'darkorange'
COLOR_KPS = 'yellow'
def __init__(self, image, output_path, kk, output_types, epistemic=False, z_max=30, fig_width=10):
self.im = image
self.kk = kk
self.output_types = output_types
self.epistemic = epistemic
self.z_max = z_max # To include ellipses in the image
self.y_scale = 1
self.width = self.im.size[0]
self.height = self.im.size[1]
self.fig_width = fig_width
# Define the output dir
self.path_out = output_path
self.cmap = cm.get_cmap('jet')
self.extensions = []
# Define variables of the class to change for every image
self.mpl_im0 = self.stds_ale = self.stds_epi = self.xx_gt = self.zz_gt = self.xx_pred = self.zz_pred =\
self.dds_real = self.uv_centers = self.uv_shoulders = self.uv_kps = self.boxes = self.boxes_gt = \
self.uv_camera = self.radius = None
def _process_results(self, dic_ann):
# Include the vectors inside the interval given by z_max
self.stds_ale = dic_ann['stds_ale']
self.stds_epi = dic_ann['stds_epi']
self.xx_gt = [xx[0] for xx in dic_ann['xyz_real']]
self.zz_gt = [xx[2] if xx[2] < self.z_max - self.stds_epi[idx] else 0
for idx, xx in enumerate(dic_ann['xyz_real'])]
self.xx_pred = [xx[0] for xx in dic_ann['xyz_pred']]
self.zz_pred = [xx[2] if xx[2] < self.z_max - self.stds_epi[idx] else 0
for idx, xx in enumerate(dic_ann['xyz_pred'])]
self.dds_real = dic_ann['dds_real']
self.uv_centers = dic_ann['uv_centers']
self.uv_shoulders = dic_ann['uv_shoulders']
self.uv_kps = dic_ann['uv_kps']
self.boxes = dic_ann['boxes']
self.boxes_gt = dic_ann['boxes_gt']
self.uv_camera = (int(self.im.size[0] / 2), self.im.size[1])
self.radius = 11 / 1600 * self.width
def factory_axes(self):
"""Create axes for figures: front bird combined"""
axes = []
figures = []
# Initialize combined figure, resizing it for aesthetic proportions
if 'combined' in self.output_types:
assert 'bird' and 'front' not in self.output_types, \
"combined figure cannot be print together with front or bird ones"
self.y_scale = self.width / (self.height * 1.8) # Defined proportion
if self.y_scale < 0.95 or self.y_scale > 1.05: # allows more variation without resizing
self.im = self.im.resize((self.width, round(self.height * self.y_scale)))
self.width = self.im.size[0]
self.height = self.im.size[1]
fig_width = self.fig_width + 0.6 * self.fig_width
fig_height = self.fig_width * self.height / self.width
# Distinguish between KITTI images and general images
fig_ar_1 = 1.7 if self.y_scale > 1.7 else 1.3
width_ratio = 1.9
self.extensions.append('.combined.png')
fig, (ax1, ax0) = plt.subplots(1, 2, sharey=False, gridspec_kw={'width_ratios': [1, width_ratio]},
figsize=(fig_width, fig_height))
ax1.set_aspect(fig_ar_1)
fig.set_tight_layout(True)
fig.subplots_adjust(left=0.02, right=0.98, bottom=0, top=1, hspace=0, wspace=0.02)
figures.append(fig)
assert 'front' not in self.output_types and 'bird' not in self.output_types, \
"--combined arguments is not supported with other visualizations"
# Initialize front figure
elif 'front' in self.output_types:
width = self.fig_width
height = self.fig_width * self.height / self.width
self.extensions.append(".front.png")
plt.figure(0)
fig0, ax0 = plt.subplots(1, 1, figsize=(width, height))
fig0.set_tight_layout(True)
figures.append(fig0)
# Create front figure axis
if any(xx in self.output_types for xx in ['front', 'combined']):
ax0 = self.set_axes(ax0, axis=0)
divider = make_axes_locatable(ax0)
cax = divider.append_axes('right', size='3%', pad=0.05)
bar_ticks = self.z_max // 5 + 1
norm = matplotlib.colors.Normalize(vmin=0, vmax=self.z_max)
scalar_mappable = plt.cm.ScalarMappable(cmap=self.cmap, norm=norm)
scalar_mappable.set_array([])
plt.colorbar(scalar_mappable, ticks=np.linspace(0, self.z_max, bar_ticks),
boundaries=np.arange(- 0.05, self.z_max + 0.1, .1), cax=cax, label='Z [m]')
axes.append(ax0)
if not axes:
axes.append(None)
# Initialize bird-eye-view figure
if 'bird' in self.output_types:
self.extensions.append(".bird.png")
fig1, ax1 = plt.subplots(1, 1)
fig1.set_tight_layout(True)
figures.append(fig1)
if any(xx in self.output_types for xx in ['bird', 'combined']):
ax1 = self.set_axes(ax1, axis=1) # Adding field of view
axes.append(ax1)
return figures, axes
def draw(self, figures, axes, dic_out, image, draw_text=True, legend=True, draw_box=False,
save=False, show=False):
# Process the annotation dictionary of monoloco
self._process_results(dic_out)
# Draw the front figure
num = 0
self.mpl_im0.set_data(image)
for idx, uv in enumerate(self.uv_shoulders):
if any(xx in self.output_types for xx in ['front', 'combined']) and \
min(self.zz_pred[idx], self.zz_gt[idx]) > 0:
color = self.cmap((self.zz_pred[idx] % self.z_max) / self.z_max)
self.draw_circle(axes, uv, color)
if draw_box:
self.draw_boxes(axes, idx, color)
if draw_text:
self.draw_text_front(axes, uv, num)
num += 1
# Draw the bird figure
num = 0
for idx, _ in enumerate(self.xx_pred):
if any(xx in self.output_types for xx in ['bird', 'combined']) and self.zz_gt[idx] > 0:
# Draw ground truth and predicted ellipses
self.draw_ellipses(axes, idx)
# Draw bird eye view text
if draw_text:
self.draw_text_bird(axes, idx, num)
num += 1
# Add the legend
if legend:
draw_legend(axes)
# Draw, save or/and show the figures
for idx, fig in enumerate(figures):
fig.canvas.draw()
if save:
fig.savefig(self.path_out + self.extensions[idx], bbox_inches='tight')
if show:
fig.show()
def draw_ellipses(self, axes, idx):
"""draw uncertainty ellipses"""
target = get_task_error(self.dds_real[idx])
angle_gt = get_angle(self.xx_gt[idx], self.zz_gt[idx])
ellipse_real = Ellipse((self.xx_gt[idx], self.zz_gt[idx]), width=target * 2, height=1,
angle=angle_gt, color='lightgreen', fill=True, label="Task error")
axes[1].add_patch(ellipse_real)
if abs(self.zz_gt[idx] - self.zz_pred[idx]) > 0.001:
axes[1].plot(self.xx_gt[idx], self.zz_gt[idx], 'kx', label="Ground truth", markersize=3)
angle = get_angle(self.xx_pred[idx], self.zz_pred[idx])
ellipse_ale = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_ale[idx] * 2,
height=1, angle=angle, color='b', fill=False, label="Aleatoric Uncertainty",
linewidth=1.3)
ellipse_var = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_epi[idx] * 2,
height=1, angle=angle, color='r', fill=False, label="Uncertainty",
linewidth=1, linestyle='--')
axes[1].add_patch(ellipse_ale)
if self.epistemic:
axes[1].add_patch(ellipse_var)
axes[1].plot(self.xx_pred[idx], self.zz_pred[idx], 'ro', label="Predicted", markersize=3)
def draw_boxes(self, axes, idx, color):
ww_box = self.boxes[idx][2] - self.boxes[idx][0]
hh_box = (self.boxes[idx][3] - self.boxes[idx][1]) * self.y_scale
ww_box_gt = self.boxes_gt[idx][2] - self.boxes_gt[idx][0]
hh_box_gt = (self.boxes_gt[idx][3] - self.boxes_gt[idx][1]) * self.y_scale
rectangle = Rectangle((self.boxes[idx][0], self.boxes[idx][1] * self.y_scale),
width=ww_box, height=hh_box, fill=False, color=color, linewidth=3)
rectangle_gt = Rectangle((self.boxes_gt[idx][0], self.boxes_gt[idx][1] * self.y_scale),
width=ww_box_gt, height=hh_box_gt, fill=False, color='g', linewidth=2)
axes[0].add_patch(rectangle_gt)
axes[0].add_patch(rectangle)
def draw_text_front(self, axes, uv, num):
axes[0].text(uv[0] + self.radius, uv[1] * self.y_scale - self.radius, str(num),
fontsize=self.FONTSIZE, color=self.TEXTCOLOR, weight='bold')
def draw_text_bird(self, axes, idx, num):
"""Plot the number in the bird eye view map"""
std = self.stds_epi[idx] if self.stds_epi[idx] > 0 else self.stds_ale[idx]
theta = math.atan2(self.zz_pred[idx], self.xx_pred[idx])
delta_x = std * math.cos(theta)
delta_z = std * math.sin(theta)
axes[1].text(self.xx_pred[idx] + delta_x, self.zz_pred[idx] + delta_z,
str(num), fontsize=self.FONTSIZE_BV, color='darkorange')
def draw_circle(self, axes, uv, color):
circle = Circle((uv[0], uv[1] * self.y_scale), radius=self.radius, color=color, fill=True)
axes[0].add_patch(circle)
def set_axes(self, ax, axis):
assert axis in (0, 1)
if axis == 0:
ax.set_axis_off()
ax.set_xlim(0, self.width)
ax.set_ylim(self.height, 0)
self.mpl_im0 = ax.imshow(self.im)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
else:
uv_max = [0., float(self.height)]
xyz_max = pixel_to_camera(uv_max, self.kk, self.z_max)
x_max = abs(xyz_max[0]) # shortcut to avoid oval circles in case of different kk
ax.plot([0, x_max], [0, self.z_max], 'k--')
ax.plot([0, -x_max], [0, self.z_max], 'k--')
ax.set_ylim(0, self.z_max+1)
ax.set_xlabel("X [m]")
ax.set_ylabel("Z [m]")
return ax
def draw_legend(axes):
handles, labels = axes[1].get_legend_handles_labels()
by_label = OrderedDict(zip(labels, handles))
axes[1].legend(by_label.values(), by_label.keys())
def get_angle(xx, zz):
"""Obtain the points to plot the confidence of each annotation"""
theta = math.atan2(zz, xx)
angle = theta * (180 / math.pi)
return angle
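
A minimal usage sketch for the new Printer (paths and the annotation dictionary are illustrative; in the pipeline dic_out comes from the prediction step):

```
from PIL import Image

im = Image.open('docs/frame.png')                        # any input image
kk = [[721., 0., 610.], [0., 721., 172.], [0., 0., 1.]]  # illustrative intrinsics
dic_out = {'stds_ale': [0.5], 'stds_epi': [0.], 'dds_real': [8.1],
           'xyz_real': [[1.0, 0.5, 8.0]], 'xyz_pred': [[1.1, 0.5, 8.4]],
           'uv_centers': [[600, 250]], 'uv_shoulders': [[600, 200]],
           'uv_kps': [[[600] * 17, [200] * 17, [1] * 17]],
           'boxes': [[550, 150, 650, 400, 0.9]], 'boxes_gt': [[555, 155, 655, 405]]}
printer = Printer(im, 'out/frame', kk, output_types=['combined'], z_max=30)
figures, axes = printer.factory_axes()
printer.draw(figures, axes, dic_out, im, save=True)      # writes out/frame.combined.png
```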


@ -1,3 +1,4 @@
# pylint: disable=R0915
import os
import numpy as np
@ -5,7 +6,7 @@ import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
def print_results(dic_stats, show=False, save=False): def print_results(dic_stats, show=False):
""" """
Visualize error as function of the distance on the test set and compare it with target errors based on human Visualize error as function of the distance on the test set and compare it with target errors based on human
@ -67,7 +68,7 @@ def print_results(dic_stats, show=False, save=False):
xxs = get_distances(clusters)
yys = target_error(np.array(xxs), mm_gender)
ax[1].plot(xxs, bbs, marker='s', color='b', label="Spread b")
ax[1].plot(xxs, yys, '--', color='lightgreen', label="Task error", linewidth=2.5)
yys_up = [rec_c + ar/2 * scale * yy for yy in yys]
bbs_up = [rec_c + ar/2 * scale * bb for bb in bbs]
yys_down = [rec_c - ar/2 * scale * yy for yy in yys]
@ -81,7 +82,7 @@ def print_results(dic_stats, show=False, save=False):
for idx, xx in enumerate(xxs):
te = Ellipse((xx, rec_c), width=yys[idx]*ar*scale, height=scale, angle=90, color='lightgreen', fill=True)
bi = Ellipse((xx, rec_c), width=bbs[idx]*ar*scale, height=scale, angle=90, color='b',linewidth=1.8, bi = Ellipse((xx, rec_c), width=bbs[idx]*ar*scale, height=scale, angle=90, color='b', linewidth=1.8,
fill=False)
ax[0].add_patch(te)


@ -1,3 +1,4 @@
# pylint: disable=W0212
""" """
Webcam demo application Webcam demo application
@ -14,11 +15,11 @@ from openpifpaf import transforms
import cv2
from visuals.printer import Printer from ..visuals.printer import Printer
from utils.pifpaf import preprocess_pif from ..utils.pifpaf import preprocess_pif
from predict.pifpaf import PifPaf from ..predict.pifpaf import PifPaf
from predict.monoloco import MonoLoco from ..predict.network import MonoLoco
from predict.factory import factory_for_gt from ..predict.factory import factory_for_gt
def webcam(args):
@ -107,7 +108,7 @@ class VisualizerMonoloco:
del axes[1].patches[0] # the one became the 0
if len(axes[1].lines) > 2:
del axes[1].lines[2]
if len(axes[1].texts) > 0: # in case of no text if axes[1].texts: # in case of no text
del axes[1].texts[0] del axes[1].texts[0]
printer.draw(figures, axes, dict_ann, image) printer.draw(figures, axes, dict_ann, image)
mypause(0.01) mypause(0.01)
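The import rewrite above is the core of the packaging change: the code now lives in a proper `monoloco` package, so sibling subpackages are reached with relative imports, and `predict.monoloco` is renamed to `predict.network`. A hypothetical layout consistent with these imports:

```
# monoloco/
#     predict/
#         factory.py      # factory_for_gt
#         network.py      # MonoLoco (was predict/monoloco.py)
#         pifpaf.py       # PifPaf
#     utils/
#         pifpaf.py       # preprocess_pif
#     visuals/
#         printer.py      # Printer
#
# From a module one level below the package root, siblings are imported as:
#     from ..visuals.printer import Printer
```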


@@ -1,153 +0,0 @@
"""Run monoloco over all the pifpaf joints of KITTI images
and extract and save the annotations in txt files"""
import math
import os
import glob
import json
import shutil
import itertools
import numpy as np
import torch
from predict.monoloco import MonoLoco
from eval.geom_baseline import compute_distance
from utils.kitti import get_calibration
from utils.pifpaf import preprocess_pif
from utils.camera import xyz_from_distance, get_keypoints, pixel_to_camera
def generate_kitti(model, dir_ann, p_dropout=0.2, n_dropout=0):
cnt_ann = 0
cnt_file = 0
cnt_no_file = 0
dir_kk = os.path.join('data', 'kitti', 'calib')
dir_out = os.path.join('data', 'kitti', 'monoloco')
# Remove the output directory if it already exists (avoid residual txt files)
if os.path.exists(dir_out):
shutil.rmtree(dir_out)
os.makedirs(dir_out)
print("Created empty output directory for txt files")
# Load monoloco
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
monoloco = MonoLoco(model_path=model, device=device, n_dropout=n_dropout, p_dropout=p_dropout)
# Run monoloco over the list of images
list_basename = factory_basename(dir_ann)
for basename in list_basename:
path_calib = os.path.join(dir_kk, basename + '.txt')
annotations, kk, tt = factory_file(path_calib, dir_ann, basename)
boxes, keypoints = preprocess_pif(annotations, im_size=(1242, 374))
if not keypoints:
cnt_no_file += 1
continue
else:
# Run the network and the geometric baseline
outputs, varss = monoloco.forward(keypoints, kk)
dds_geom = eval_geometric(keypoints, kk, average_y=0.48)
# Save the file
all_outputs = [outputs.detach().cpu(), varss.detach().cpu(), dds_geom]
all_inputs = [boxes, keypoints]
all_params = [kk, tt]
path_txt = os.path.join(dir_out, basename + '.txt')
save_txts(path_txt, all_inputs, all_outputs, all_params)
# Update counting
cnt_ann += len(boxes)
cnt_file += 1
# Print statistics
print("Saved in {} txt {} annotations. Not found {} images"
.format(cnt_file, cnt_ann, cnt_no_file))
def save_txts(path_txt, all_inputs, all_outputs, all_params):
outputs, varss, dds_geom = all_outputs[:]
uv_boxes, keypoints = all_inputs[:]
kk, tt = all_params[:]
uv_centers = get_keypoints(keypoints, mode='bottom') # Kitti uses the bottom center to calculate depth
xy_centers = pixel_to_camera(uv_centers, kk, 1)
zzs = xyz_from_distance(outputs[:, 0:1], xy_centers)[:, 2].tolist()
with open(path_txt, "w+") as ff:
for idx in range(outputs.shape[0]):
xx = float(xy_centers[idx][0]) * zzs[idx] + tt[0]
yy = float(xy_centers[idx][1]) * zzs[idx] + tt[1]
zz = zzs[idx] + tt[2]
dd = math.sqrt(xx ** 2 + yy ** 2 + zz ** 2)
cam_0 = [xx, yy, zz, dd]
for el in uv_boxes[idx][:]:
ff.write("%s " % el)
for el in cam_0:
ff.write("%s " % el)
ff.write("%s " % float(outputs[idx][1]))
ff.write("%s " % float(varss[idx]))
ff.write("%s " % dds_geom[idx])
ff.write("\n")
# Save intrinsic matrix in the last row
for kk_el in itertools.chain(*kk): # Flatten a list of lists
ff.write("%f " % kk_el)
ff.write("\n")
def factory_basename(dir_ann):
""" Return all the basenames in the annotations folder"""
list_ann = glob.glob(os.path.join(dir_ann, '*.json'))
list_basename = [os.path.basename(x).split('.')[0] for x in list_ann]
assert list_basename, " Missing json annotations file to create txt files for KITTI datasets"
return list_basename
def factory_file(path_calib, dir_ann, basename):
"""Choose the annotation and the calibration files. Stereo option with ite = 1"""
p_left, p_right = get_calibration(path_calib)
kk, tt = p_left[:]
path_ann = os.path.join(dir_ann, basename + '.png.pifpaf.json')
try:
with open(path_ann, 'r') as f:
annotations = json.load(f)
except FileNotFoundError:
annotations = None
return annotations, kk, tt
def eval_geometric(keypoints, kk, average_y=0.48):
""" Evaluate geometric distance"""
dds_geom = []
uv_centers = get_keypoints(keypoints, mode='center')
uv_shoulders = get_keypoints(keypoints, mode='shoulder')
uv_hips = get_keypoints(keypoints, mode='hip')
xy_centers = pixel_to_camera(uv_centers, kk, 1)
xy_shoulders = pixel_to_camera(uv_shoulders, kk, 1)
xy_hips = pixel_to_camera(uv_hips, kk, 1)
for idx, xy_center in enumerate(xy_centers):
zz = compute_distance(xy_shoulders[idx], xy_hips[idx], average_y)
xyz_center = np.array([xy_center[0], xy_center[1], zz])
dd_geom = float(np.linalg.norm(xyz_center))
dds_geom.append(dd_geom)
return dds_geom


@@ -1,37 +0,0 @@
import glob
import logging
import os
import cv2
import sys
def resize(input_glob, output_dir, factor=2):
"""
Resize images using multiplicative factor
"""
list_im = glob.glob(input_glob)
for idx, path_in in enumerate(list_im):
basename, _ = os.path.splitext(os.path.basename(path_in))
im = cv2.imread(path_in)
assert im is not None, "Image not found"
# Resize the image to the new dimensions given by the multiplicative factor
h_im = im.shape[0]
w_im = im.shape[1]
w_new = round(factor * w_im)
h_new = round(factor * h_im)
print("resizing image {} to: {} x {}".format(basename, w_new, h_new))
im_new = cv2.resize(im, (w_new, h_new))
# Save the image
name_im = basename + '.png'
path_out = os.path.join(output_dir, name_im)
cv2.imwrite(path_out, im_new)
sys.stdout.write('\r' + 'Saving image number: {}'.format(idx) + '\t')
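A hypothetical invocation of this (now removed) helper, doubling every png in a folder (paths are illustrative):

```
import os
os.makedirs('data/images_2x', exist_ok=True)
resize('data/images/*.png', 'data/images_2x', factor=2)
```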


@@ -1,57 +0,0 @@
import numpy as np
def preprocess_pif(annotations, im_size=None):
"""
Preprocess pif annotations:
1. Enlarge the box by 10% in height and 20% in width
2. Constrain it inside the image (if im_size is provided)
"""
boxes = []
keypoints = []
for dic in annotations:
box = dic['bbox']
if box[3] < 0.5: # Check for no detections (boxes 0,0,0,0)
return [], []
else:
kps = prepare_pif_kps(dic['keypoints'])
conf = float(np.mean(np.array(kps[2])))
# Add 10% for y and 20% for x
delta_h = (box[3] - box[1]) / 10
delta_w = (box[2] - box[0]) / 5
assert delta_h > -5 and delta_w > -5, "Bounding box <=0"
box[0] -= delta_w
box[1] -= delta_h
box[2] += delta_w
box[3] += delta_h
# Put the box inside the image
if im_size is not None:
box[0] = max(0, box[0])
box[1] = max(0, box[1])
box[2] = min(box[2], im_size[0])
box[3] = min(box[3], im_size[1])
box.append(conf)
boxes.append(box)
keypoints.append(kps)
return boxes, keypoints
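A quick numerical check of the enlargement above (box values are made up):

```
box = [100., 50., 200., 250.]       # x1, y1, x2, y2
delta_h = (box[3] - box[1]) / 10    # 20.0 -> 10% of the height per side
delta_w = (box[2] - box[0]) / 5     # 20.0 -> 20% of the width per side
# enlarged box: [80.0, 30.0, 220.0, 270.0], then clipped to the image
```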
def prepare_pif_kps(kps_in):
"""Convert from a list of 51 to a list of 3, 17"""
assert len(kps_in) % 3 == 0, "keypoints expected as a multiple of 3"
xxs = kps_in[0:][::3]
yys = kps_in[1:][::3] # from offset 1 every 3
ccs = kps_in[2:][::3]
return [xxs, yys, ccs]
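For instance, with two detected joints and the remaining fifteen empty (toy values):

```
kps_in = [10., 20., 0.9, 11., 22., 0.8] + [0.] * 45   # 51 values
xxs = kps_in[0:][::3]   # [10.0, 11.0, 0.0, ...]
yys = kps_in[1:][::3]   # [20.0, 22.0, 0.0, ...]
ccs = kps_in[2:][::3]   # [0.9, 0.8, 0.0, ...]
assert len(xxs) == len(yys) == len(ccs) == 17
```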


@@ -1,243 +0,0 @@
import math
from collections import OrderedDict
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.patches import Ellipse, Circle
from mpl_toolkits.axes_grid1 import make_axes_locatable
from utils.camera import pixel_to_camera
from utils.misc import get_task_error
class Printer:
"""
Print results on images: bird's-eye view and computed distance
"""
RADIUS_KPS = 6
FONTSIZE_BV = 16
FONTSIZE = 18
TEXTCOLOR = 'darkorange'
COLOR_KPS = 'yellow'
def __init__(self, image, output_path, kk, output_types, text=True, legend=True, epistemic=False,
z_max=30, fig_width=10):
self.im = image
self.kk = kk
self.output_types = output_types
self.text = text
self.epistemic = epistemic
self.legend = legend
self.z_max = z_max # To include ellipses in the image
self.y_scale = 1
self.width = self.im.size[0]
self.height = self.im.size[1]
self.fig_width = fig_width
# Define the output dir
self.path_out = output_path
self.cmap = cm.get_cmap('jet')
self.extensions = []
self.mpl_im0 = None
def _process_results(self, dic_ann):
# Include the vectors inside the interval given by z_max
self.stds_ale = dic_ann['stds_ale']
self.stds_ale_epi = dic_ann['stds_epi']
self.xx_gt = [xx[0] for xx in dic_ann['xyz_real']]
self.zz_gt = [xx[2] if xx[2] < self.z_max - self.stds_ale_epi[idx] else 0
for idx, xx in enumerate(dic_ann['xyz_real'])]
self.xx_pred = [xx[0] for xx in dic_ann['xyz_pred']]
self.zz_pred = [xx[2] if xx[2] < self.z_max - self.stds_ale_epi[idx] else 0
for idx, xx in enumerate(dic_ann['xyz_pred'])]
self.dds_real = dic_ann['dds_real']
self.uv_centers = dic_ann['uv_centers']
self.uv_shoulders = dic_ann['uv_shoulders']
self.uv_kps = dic_ann['uv_kps']
self.uv_camera = (int(self.im.size[0] / 2), self.im.size[1])
self.radius = 14 / 1600 * self.width
def factory_axes(self):
"""Create axes for figures: front bird combined"""
axes = []
figures = []
# Initialize combined figure, resizing it for aesthetic proportions
if 'combined' in self.output_types:
assert 'bird' not in self.output_types and 'front' not in self.output_types, \
"combined figure cannot be printed together with front or bird ones"
self.y_scale = self.width / (self.height * 1.8) # Defined proportion
if self.y_scale < 0.95 or self.y_scale > 1.05: # resize only when proportions deviate by more than 5%
self.im = self.im.resize((self.width, round(self.height * self.y_scale)))
self.width = self.im.size[0]
self.height = self.im.size[1]
fig_width = self.fig_width + 0.6 * self.fig_width
fig_height = self.fig_width * self.height / self.width
# Distinguish between KITTI images and general images
if self.y_scale > 1.7:
fig_ar_1 = 1.7
else:
fig_ar_1 = 1.3
width_ratio = 1.9
self.extensions.append('.combined.png')
fig, (ax1, ax0) = plt.subplots(1, 2, sharey=False, gridspec_kw={'width_ratios': [1, width_ratio]},
figsize=(fig_width, fig_height))
ax1.set_aspect(fig_ar_1)
fig.set_tight_layout(True)
fig.subplots_adjust(left=0.02, right=0.98, bottom=0, top=1, hspace=0, wspace=0.02)
figures.append(fig)
assert 'front' not in self.output_types and 'bird' not in self.output_types, \
"--combined arguments is not supported with other visualizations"
# Initialize front figure
elif 'front' in self.output_types:
width = self.fig_width
height = self.fig_width * self.height / self.width
self.extensions.append(".front.png")
plt.figure(0)
fig0, ax0 = plt.subplots(1, 1, figsize=(width, height))
fig0.set_tight_layout(True)
figures.append(fig0)
# Create front figure axis
if any(xx in self.output_types for xx in ['front', 'combined']):
ax0.set_axis_off()
ax0.set_xlim(0, self.width)
ax0.set_ylim(self.height, 0)
self.mpl_im0 = ax0.imshow(self.im)
z_min = 0
bar_ticks = self.z_max // 5 + 1
ax0.get_xaxis().set_visible(False)
ax0.get_yaxis().set_visible(False)
divider = make_axes_locatable(ax0)
cax = divider.append_axes('right', size='3%', pad=0.05)
norm = matplotlib.colors.Normalize(vmin=z_min, vmax=self.z_max)
scalar_mappable = plt.cm.ScalarMappable(cmap=self.cmap, norm=norm)
scalar_mappable.set_array([])
plt.colorbar(scalar_mappable, ticks=np.linspace(z_min, self.z_max, bar_ticks),
boundaries=np.arange(z_min - 0.05, self.z_max + 0.1, .1), cax=cax, label='Z [m]')
axes.append(ax0)
if not axes:
axes.append(None)
if 'bird' in self.output_types:
self.extensions.append(".bird.png")
fig1, ax1 = plt.subplots(1, 1)
fig1.set_tight_layout(True)
figures.append(fig1)
if any(xx in self.output_types for xx in ['bird', 'combined']):
uv_max = [0., float(self.height)]
xyz_max = pixel_to_camera(uv_max, self.kk, self.z_max)
x_max = abs(xyz_max[0]) # shortcut to avoid oval circles in case of different kk
# Adding field of view
ax1.plot([0, x_max], [0, self.z_max], 'k--')
ax1.plot([0, -x_max], [0, self.z_max], 'k--')
ax1.set_ylim(0, self.z_max+1)
ax1.set_xlabel("X [m]")
ax1.set_ylabel("Z [m]")
axes.append(ax1)
return figures, axes
def draw(self, figures, axes, dic_out, image, save=False, show=False):
self._process_results(dic_out)
num = 0
if any(xx in self.output_types for xx in ['front', 'combined']):
self.mpl_im0.set_data(image)
for idx, uv in enumerate(self.uv_shoulders):
if min(self.zz_pred[idx], self.zz_gt[idx]) > 0:
color = self.cmap((self.zz_pred[idx] % self.z_max) / self.z_max)
circle = Circle((uv[0], uv[1] * self.y_scale), radius=self.radius, color=color, fill=True)
axes[0].add_patch(circle)
if self.text:
axes[0].text(uv[0]+self.radius, uv[1] * self.y_scale - self.radius, str(num),
fontsize=self.FONTSIZE, color=self.TEXTCOLOR, weight='bold')
num += 1
if any(xx in self.output_types for xx in ['bird', 'combined']):
for idx, _ in enumerate(self.xx_gt):
if self.zz_gt[idx] > 0:
target = get_task_error(self.dds_real[idx])
angle = get_angle(self.xx_gt[idx], self.zz_gt[idx])
ellipse_real = Ellipse((self.xx_gt[idx], self.zz_gt[idx]), width=target * 2, height=1,
angle=angle, color='lightgreen', fill=True, label="Task error")
axes[1].add_patch(ellipse_real)
if abs(self.zz_gt[idx] - self.zz_pred[idx]) > 0.001:
axes[1].plot(self.xx_gt[idx], self.zz_gt[idx], 'kx', label="Ground truth", markersize=3)
# Print prediction and the real ground truth.
num = 0
for idx, _ in enumerate(self.xx_pred):
if self.zz_gt[idx] > 0: # only the merging ones and inside the interval
angle = get_angle(self.xx_pred[idx], self.zz_pred[idx])
ellipse_ale = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_ale[idx] * 2,
height=1, angle=angle, color='b', fill=False, label="Aleatoric Uncertainty",
linewidth=1.3)
ellipse_var = Ellipse((self.xx_pred[idx], self.zz_pred[idx]), width=self.stds_ale_epi[idx] * 2,
height=1, angle=angle, color='r', fill=False, label="Uncertainty",
linewidth=1, linestyle='--')
axes[1].add_patch(ellipse_ale)
if self.epistemic:
axes[1].add_patch(ellipse_var)
axes[1].plot(self.xx_pred[idx], self.zz_pred[idx], 'ro', label="Predicted", markersize=3)
# Setup the legend to avoid repetitions
if self.legend:
handles, labels = axes[1].get_legend_handles_labels()
by_label = OrderedDict(zip(labels, handles))
axes[1].legend(by_label.values(), by_label.keys())
# Plot the number
(_, x_pos), (_, z_pos) = get_confidence(self.xx_pred[idx], self.zz_pred[idx],
self.stds_ale_epi[idx])
if self.text:
axes[1].text(x_pos, z_pos, str(num), fontsize=self.FONTSIZE_BV, color='darkorange')
num += 1
for idx, fig in enumerate(figures):
fig.canvas.draw()
if save:
fig.savefig(self.path_out + self.extensions[idx], bbox_inches='tight')
if show:
fig.show()
def get_confidence(xx, zz, std):
"""Obtain the points to plot the confidence of each annotation"""
theta = math.atan2(zz, xx)
delta_x = std * math.cos(theta)
delta_z = std * math.sin(theta)
return (xx - delta_x, xx + delta_x), (zz - delta_z, zz + delta_z)
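For intuition, `get_confidence` splits the uncertainty along the viewing ray; a worked case with toy values:

```
import math
xx, zz, std = 0., 10., 1.        # pedestrian straight ahead
theta = math.atan2(zz, xx)       # pi/2
delta_x = std * math.cos(theta)  # ~0: no lateral component
delta_z = std * math.sin(theta)  # 1.0: all uncertainty on depth
# endpoints: x in (~0, ~0), z in (9.0, 11.0)
```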
def get_angle(xx, zz):
"""Obtain the points to plot the confidence of each annotation"""
theta = math.atan2(zz, xx)
angle = theta * (180 / math.pi)
return angle


@@ -1,10 +1,12 @@
+import os
+import sys
+
+# Python does not consider the current directory to be a package
+sys.path.insert(0, os.path.join('..', 'monoloco'))
-from utils.iou import get_iou_matrix
-from utils.camera import pixel_to_camera
 def test_iou():
+    from monoloco.utils.iou import get_iou_matrix
     boxes_pred = [[1, 100, 1, 200]]
     boxes_gt = [[100., 120., 150., 160.], [12, 110, 130., 160.]]
     iou_matrix = get_iou_matrix(boxes_pred, boxes_gt)
@@ -12,6 +14,7 @@ def test_iou():
 def test_pixel_to_camera():
+    from monoloco.utils.camera import pixel_to_camera
     kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]]
     zz = 10
     uv_vector = [1000., 400.]
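The fixture values above follow the standard pinhole back-projection; assuming that is what `pixel_to_camera` computes (a sketch, not the repository's implementation), the expected coordinates are:

```
kk = [[718.3351, 0., 600.3891], [0., 718.3351, 181.5122], [0., 0., 1.]]
uu, vv, zz = 1000., 400., 10
xx = (uu - kk[0][2]) * zz / kk[0][0]  # ~5.56 m right of the optical axis
yy = (vv - kk[1][2]) * zz / kk[1][1]  # ~3.04 m below the optical axis
```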