diff --git a/.gitignore b/.gitignore
index 24dd579..fbe5966 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,10 +2,13 @@
data/
.DS_store
__pycache__
-Monoloco/*.pyc
+monoloco/*.pyc
.pytest*
-dist/
build/
+dist/
*.egg-info
tests/*.png
-kitti-eval/*
+kitti-eval/build
+kitti-eval/cmake-build-debug
+figures/
+visual_tests/
diff --git a/.pylintrc b/.pylintrc
index 4685a31..dc2e7fa 100644
--- a/.pylintrc
+++ b/.pylintrc
@@ -9,7 +9,7 @@ Good-names=xx,dd,zz,hh,ww,pp,kk,lr,w1,w2,w3,mm,im,uv,ax,COV_MIN,CONF_MIN
[TYPECHECK]
-disable=E1102,missing-docstring,useless-object-inheritance,duplicate-code,too-many-arguments,too-many-instance-attributes,too-many-locals,too-few-public-methods,arguments-differ,logging-format-interpolation
+disable=import-error,invalid-name,unused-variable,E1102,missing-docstring,useless-object-inheritance,duplicate-code,too-many-arguments,too-many-instance-attributes,too-many-locals,too-few-public-methods,arguments-differ,logging-format-interpolation,import-outside-toplevel
# List of members which are set dynamically and missed by pylint inference
diff --git a/LICENSE b/LICENSE
index 06eabdd..8ddfc99 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,4 +1,4 @@
-Copyright 2019 by EPFL/VITA. All rights reserved.
+Copyright 2020-2021 by EPFL/VITA. All rights reserved.
This project and all its files are licensed under
GNU AGPLv3 or later version.
@@ -6,4 +6,4 @@ GNU AGPLv3 or later version.
If this license is not suitable for your business or project
please contact EPFL-TTO (https://tto.epfl.ch/) for a full commercial license.
-This software may not be used to harm any person deliberately.
+This software may not be used to harm any person deliberately or for any military application.
diff --git a/README.md b/README.md
index 2b96417..018f92e 100644
--- a/README.md
+++ b/README.md
@@ -1,83 +1,67 @@
-# Monoloco
+# Monoloco library [![Downloads](https://pepy.tech/badge/monoloco)](https://pepy.tech/project/monoloco)
-> We tackle the fundamentally ill-posed problem of 3D human localization from monocular RGB images. Driven by the limitation of neural networks outputting point estimates, we address the ambiguity in the task by predicting confidence intervals through a loss function based on the Laplace distribution. Our architecture is a light-weight feed-forward neural network that predicts 3D locations and corresponding confidence intervals given 2D human poses. The design is particularly well suited for small training data, cross-dataset generalization, and real-time applications. Our experiments show that we (i) outperform state-of-the-art results on KITTI and nuScenes datasets, (ii) even outperform a stereo-based method for far-away pedestrians, and (iii) estimate meaningful confidence intervals. We further share insights on our model of uncertainty in cases of limited observations and out-of-distribution samples.
+
-```
-@InProceedings{Bertoni_2019_ICCV,
-author = {Bertoni, Lorenzo and Kreiss, Sven and Alahi, Alexandre},
-title = {MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation},
-booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
-month = {October},
-year = {2019}
-}
-```
-**2021**
-**NEW! MonoLoco++ is [available](https://github.com/vita-epfl/monstereo):**
-* It estimates 3D localization, orientation, and bounding box dimensions
-* It verifies social distance requirements. More info: [video](https://www.youtube.com/watch?v=r32UxHFAJ2M) and [project page](http://vita.epfl.ch/monoloco)
-* It works with [OpenPifPaf](https://github.com/vita-epfl/openpifpaf) 0.12 and PyTorch 1.7
+This library is based on three research projects for monocular/stereo 3D human localization (detection), body orientation, and social distancing.
-**2020**
-* Paper on [ICCV'19](http://openaccess.thecvf.com/content_ICCV_2019/html/Bertoni_MonoLoco_Monocular_3D_Pedestrian_Localization_and_Uncertainty_Estimation_ICCV_2019_paper.html) website or [ArXiv](https://arxiv.org/abs/1906.06059)
-* Check our video with method description and qualitative results on [YouTube](https://www.youtube.com/watch?v=ii0fqerQrec)
+> __MonStereo: When Monocular and Stereo Meet at the Tail of 3D Human Localization__
+> _[L. Bertoni](https://scholar.google.com/citations?user=f-4YHeMAAAAJ&hl=en), [S. Kreiss](https://www.svenkreiss.com),
+[T. Mordan](https://people.epfl.ch/taylor.mordan/?lang=en), [A. Alahi](https://scholar.google.com/citations?user=UIhXQ64AAAAJ&hl=en)_, ICRA 2021
+__[Article](https://arxiv.org/abs/2008.10913)__ __[Citation](#Citation)__ __[Video](#Todo)__
+
+
-* Live demo available! (more info in the webcam section)
+---
-* Continuously tested with Travis CI: [](https://travis-ci.org/vita-epfl/monoloco)
-
+> __Perceiving Humans: from Monocular 3D Localization to Social Distancing__
+> _[L. Bertoni](https://scholar.google.com/citations?user=f-4YHeMAAAAJ&hl=en), [S. Kreiss](https://www.svenkreiss.com),
+[A. Alahi](https://scholar.google.com/citations?user=UIhXQ64AAAAJ&hl=en)_, T-ITS 2021
+__[Article](https://arxiv.org/abs/2009.00984)__ __[Citation](#Citation)__ __[Video](https://www.youtube.com/watch?v=r32UxHFAJ2M)__
+
+
+
+---
+
+> __MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation__
+> _[L. Bertoni](https://scholar.google.com/citations?user=f-4YHeMAAAAJ&hl=en), [S. Kreiss](https://www.svenkreiss.com), [A. Alahi](https://scholar.google.com/citations?user=UIhXQ64AAAAJ&hl=en)_, ICCV 2019
+__[Article](https://arxiv.org/abs/1906.06059)__ __[Citation](#Todo)__ __[Video](https://www.youtube.com/watch?v=ii0fqerQrec)__
+
+
+
+## License
+All projects are built upon [OpenPifPaf](https://github.com/vita-epfl/openpifpaf) for the 2D keypoints and share the AGPL license.
+
+This software is also available for commercial licensing via the EPFL Technology Transfer
+Office (https://tto.epfl.ch/, info.tto@epfl.ch).
+
+
+## Quick setup
+A GPU is not required, but is highly recommended for real-time performance.
+
+The installation has been tested on macOS and Linux, with Python 3.6, 3.7, and 3.8.
+Packages have been installed with pip inside virtual environments.
+
+For quick installation, do not clone this repository, make sure there is no folder named monoloco in your current directory, and run:
-# Setup
-### Install
-Python 3 is required. Python 2 is not supported.
-Do not clone this repository and make sure there is no folder named monoloco in your current directory.
```
pip3 install monoloco
```
-For development of the monoloco source code itself, you need to clone this repository and then:
+For development of the source code itself, you need to clone this repository and then:
```
-pip3 install -e '.[test, prep]'
-```
-Python 3.6 or 3.7 is required for nuScenes development kit.
-All details for Pifpaf pose detector at [openpifpaf](https://github.com/vita-epfl/openpifpaf).
-
-
-
-### Data structure
-
- Data
- ├── arrays
- ├── models
- ├── kitti
- ├── nuscenes
- ├── logs
-
-
-Run the following to create the folders:
-```
-mkdir data
-cd data
-mkdir arrays models kitti nuscenes logs
+pip3 install setuptools wheel
+cd monoloco
+python3 setup.py sdist bdist_wheel
+pip3 install -e .
```
-### Pre-trained Models
-* Download a MonoLoco pre-trained model from
-[Google Drive](https://drive.google.com/open?id=1F7UG1HPXGlDD_qL-AN5cv2Eg-mhdQkwv) and save it in `data/models`
-(default) or in any folder and call it through the command line option `--model `
-* Pifpaf pre-trained model will be automatically downloaded at the first run.
-Three standard, pretrained models are available when using the command line option
-`--checkpoint resnet50`, `--checkpoint resnet101` and `--checkpoint resnet152`.
-Alternatively, you can download a Pifpaf pre-trained model from [openpifpaf](https://github.com/vita-epfl/openpifpaf)
- and call it with `--checkpoint `
-
-
-# Interfaces
-All the commands are run through a main file called `main.py` using subparsers.
-To check all the commands for the parser and the subparsers (including openpifpaf ones) run:
+### Interfaces
+All the commands are run through a main file called `run.py` using subparsers.
+To check all the options:
* `python3 -m monoloco.run --help`
* `python3 -m monoloco.run predict --help`
@@ -86,118 +70,238 @@ To check all the commands for the parser and the subparsers (including openpifpa
* `python3 -m monoloco.run prep --help`
or check the file `monoloco/run.py`
-
-# Prediction
-The predict script receives an image (or an entire folder using glob expressions),
-calls PifPaf for 2d human pose detection over the image
-and runs Monoloco for 3d location of the detected poses.
-The command `--networks` defines if saving pifpaf outputs, MonoLoco outputs or both.
-You can check all commands for Pifpaf at [openpifpaf](https://github.com/vita-epfl/openpifpaf).
+# Predictions
-Output options include json files and/or visualization of the predictions on the image in *frontal mode*,
-*birds-eye-view mode* or *combined mode* and can be specified with `--output_types`
+The software receives an image (or an entire folder using glob expressions),
+calls OpenPifPaf for 2D human pose detection over the image,
+and runs MonoLoco++ or MonStereo for 3D localization, social distancing, and/or orientation estimation.
+**Which Modality**
+The command `--mode` defines which network to run.
-### Ground truth matching
-* In case you provide a ground-truth json file to compare the predictions of MonoLoco,
+- select `--mode mono` (default) to predict the 3D localization of all the humans from monocular image(s)
+- select `--mode stereo` for stereo images
+- select `--mode keypoints` if just interested in 2D keypoints from OpenPifPaf
+
+Models are downloaded automatically. To use a specific model, use the command `--model`. Additional models can be downloaded from [here](https://drive.google.com/drive/folders/1jZToVMBEZQMdLB5BAIq2CdCLP5kzNo9t?usp=sharing)
+
+**Which Visualization**
+- select `--output_types multi` to visualize both the frontal view and the bird's-eye view in the same picture
+- select `--output_types bird front` for separate pictures of the two views, or just one of them
+- select `--output_types json` if you'd like the output json file
+
+If you select `--mode keypoints`, use standard OpenPifPaf arguments.
+
+**Focal Length and Camera Parameters**
+Absolute distances are affected by the camera intrinsic parameters.
+When processing KITTI images, the network uses the provided intrinsic matrix of the dataset.
+In all the other cases, we use the parameters of nuScenes cameras, with 1/1.8" CMOS sensors of size 7.2 x 5.4 mm.
+The default focal length is 5.7 mm, and this parameter can be modified using the argument `--focal`.
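As a rough illustration of how the focal length in mm relates to the focal length in pixels used by a pinhole camera model (a sketch, not the library's internal code; the 1600-px image width is an assumption for the example):

```python
def focal_mm_to_pixels(focal_mm, sensor_width_mm, image_width_px):
    """Convert a focal length in mm to pixels, given the sensor width in mm
    and the image width in pixels (pinhole camera model)."""
    return focal_mm / sensor_width_mm * image_width_px

# Default parameters quoted above: 5.7 mm focal length, 7.2 mm-wide sensor.
fx = focal_mm_to_pixels(5.7, 7.2, 1600)  # hypothetical 1600-px-wide image
print(round(fx, 1))  # 1266.7
```

This is why absolute distances scale with the assumed focal length: halving `--focal` halves the estimated depth for the same pixel observation.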
+
+## A) 3D Localization
+
+**Ground-truth comparison**
+If you provide a ground-truth json file to compare the predictions of the network,
the script will match every detection using Intersection over Union metric.
- The ground truth file can be generated using the subparser `prep` and called with the command `--path_gt`.
- Check preprocess section for more details or download the file from
- [here](https://drive.google.com/open?id=1F7UG1HPXGlDD_qL-AN5cv2Eg-mhdQkwv).
-
-* In case you don't provide a ground-truth file, the script will look for a predefined path.
-If it does not find the file, it will generate images
-with all the predictions without ground-truth matching.
-
-Below an example with and without ground-truth matching. They have been created (adding or removing `--path_gt`) with:
-`python3 -m monoloco.run predict --glob docs/002282.png --output_types combined --scale 2
---model data/models/monoloco-190513-1437.pkl --n_dropout 50 --z_max 30`
-
-With ground truth matching (only matching people):
-
-
-Without ground_truth matching (all the detected people):
-
-
-### Images without calibration matrix
-To accurately estimate distance, the focal length is necessary.
-However, it is still possible to test Monoloco on images where the calibration matrix is not available.
-Absolute distances are not meaningful but relative distance still are.
-Below an example on a generic image from the web, created with:
-`python3 -m monoloco.run predict --glob docs/surf.jpg --output_types combined --model data/models/monoloco-190513-1437.pkl --n_dropout 50 --z_max 25`
-
-
+ The ground truth file can be generated using the subparser `prep`, or directly downloaded from [Google Drive](https://drive.google.com/file/d/1e-wXTO460ip_Je2NdXojxrOrJ-Oirlgh/view?usp=sharing)
+ and passed with the command `--path_gt`.
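The matching step described above can be sketched as a standard Intersection-over-Union computation on `[x1, y1, x2, y2]` boxes (an illustrative sketch, not the library's actual matching code):

```python
def iou(box_a, box_b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two boxes overlapping on half their width: IoU = 50 / 150
print(round(iou([0, 0, 10, 10], [5, 0, 15, 10]), 3))  # 0.333
```

Each detection is then associated with the ground-truth box that gives the highest IoU.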
-# Webcam
-
+**Monocular examples**
-MonoLoco can run on personal computers with only CPU and low resolution images (e.g. 256x144) at ~2fps.
-It support 3 types of visualizations: `front`, `bird` and `combined`.
-Multiple visualizations can be combined in different windows.
+For an example image, run the following command:
-The above gif has been obtained running on a Macbook the command:
-
-```pip3 install opencv-python
-python3 -m monoloco.run predict --webcam --scale 0.2 --output_types combined --z_max 10 --checkpoint resnet50 --model data/models/monoloco-190513-1437.pkl
+```
+python -m monoloco.run predict docs/002282.png \
+--path_gt \
+-o