diff --git a/README.md b/README.md
index 04c328a..226c73a 100644
--- a/README.md
+++ b/README.md
@@ -31,7 +31,7 @@ __[Article](https://arxiv.org/abs/2009.00984)__ &nbs
> __MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation__
> _[L. Bertoni](https://scholar.google.com/citations?user=f-4YHeMAAAAJ&hl=en), [S. Kreiss](https://www.svenkreiss.com), [A.Alahi](https://scholar.google.com/citations?user=UIhXQ64AAAAJ&hl=en)_, ICCV 2019
-__[Article](https://arxiv.org/abs/1906.06059)__ __[Citation](#Todo)__ __[Video](https://www.youtube.com/watch?v=ii0fqerQrec)__
+__[Article](https://arxiv.org/abs/1906.06059)__ __[Citation](#Citation)__ __[Video](https://www.youtube.com/watch?v=ii0fqerQrec)__
@@ -50,8 +50,6 @@ Packages have been installed with pip and virtual environments.
For quick installation, do not clone this repository, make sure there is no folder named monoloco in your current directory, and run:
-
-
```
pip3 install monoloco
```
@@ -127,7 +125,7 @@ python -m monoloco.run predict docs/002282.png \

-To show all the instances estimated by MonoLoco add the argument `show_all` to the above command.
+To show all the instances estimated by MonoLoco, add the argument `--show_all` to the above command.
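+
+For instance, a minimal sketch that omits the remaining arguments of the command above (they can be kept as they are):
+
+```sh
+python -m monoloco.run predict docs/002282.png --show_all
+```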

@@ -147,7 +145,7 @@ To run MonStereo on stereo images, make sure the stereo pairs have the following
You can load one or more image pairs using glob expressions. For example:
-```
+```sh
python3 -m monoloco.run predict --mode stereo \
--glob docs/000840*.png \
--path_gt \
@@ -156,7 +154,7 @@ python3 -m monoloco.run predict --mode stereo \

-```
+```sh
python3 -m monoloco.run predict --glob docs/005523*.png \
--output_types multi \
--model data/models/ms-200710-1511.pkl \
--path_gt \
@@ -179,10 +177,12 @@ An example from the Collective Activity Dataset is provided below.
To visualize social distancing, run the command below:
-```
+
+```sh
python -m monoloco.run predict docs/frame0032.jpg \
--social_distance --output_types front bird
```
+
@@ -194,15 +194,16 @@ The network estimates orientation and box dimensions as well. Results are saved
## Training
We train on the KITTI dataset (MonoLoco/MonoLoco++/MonStereo) or the nuScenes dataset (MonoLoco), specifying the path of the JSON file containing the input joints. Please download them [here](https://drive.google.com/drive/folders/1j0riwbS9zuEKQ_3oIs_dWlYBnfuN2WVN?usp=sharing) or follow the [preprocessing instructions](#Preprocessing).
-Results for MonoLoco++ are obtained with:
+Results for [MonoLoco++](#Tables) are obtained with:
```
-python -m monoloco.run train --joints data/arrays/joints-kitti-201202-1743.json
+python -m monoloco.run train --joints data/arrays/joints-kitti-mono-210422-1600.json
```
-While for the MonStereo ones just change the input joints and add `--mode stereo`
-```
-python3 -m monoloco.run train --lr 0.002 --joints data/arrays/joints-kitti-201202-1022.json --mode stereo
+For the [MonStereo](#Tables) results, run:
+
+```sh
+python -m monoloco.run train --joints data/arrays/joints-kitti-stereo-210422-1601.json --lr 0.003 --mode stereo
```
If you are interested in the original results of the MonoLoco ICCV article (now improved with MonoLoco++), please refer to the tag v0.4.9 in this repository.
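+
+For example, after cloning the repository, the corresponding code can be checked out with (a sketch):
+
+```sh
+git checkout v0.4.9
+```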
@@ -217,7 +218,7 @@ Finally, for a more extensive list of available parameters, run:
Preprocessing and training steps are already fully supported by the code provided,
but they first require running a pose detector over
all the training images and collecting the annotations.
-The code supports this option (by running the predict script and using `--mode pifpaf`).
+The code supports this option (by running the predict script and using `--mode keypoints`).
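+
+For example, a sketch (the image path assumes the KITTI data structure described below):
+
+```sh
+python -m monoloco.run predict --mode keypoints \
+--glob "data/kitti/images/*.png"
+```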
### Data structure
@@ -246,6 +247,7 @@ Download kitti images (from left and right cameras), ground-truth files (labels)
The network takes 2D keypoint annotations as input. To create them, run PifPaf over the saved images:
+
```sh
python -m openpifpaf.predict \
--glob "data/kitti/images/*.png" \
@@ -253,6 +255,7 @@ python -m openpifpaf.predict \
--checkpoint=shufflenetv2k30 \
--instance-threshold=0.05 --seed-threshold 0.05 --force-complete-pose
```
+
**Horizontal flipping**
To augment the dataset, we apply horizontal flipping to the detected poses. To include small variations in the pose, we use the poses from the right camera (the dataset uses a stereo camera). As there are no labels for the right camera, the code automatically corrects the ground-truth depth by taking the camera baseline into account.
@@ -302,7 +305,7 @@ which for example change the name of all the jpg images in that folder adding th
PifPaf annotations should also be saved in a single folder and can be created with:
-```
+```sh
python -m openpifpaf.predict \
--glob "data/collective_activity/images/*.jpg" \
--checkpoint=shufflenetv2k30 \
@@ -310,21 +313,16 @@ python -m openpifpaf.predict \
--json-output