Update README file
The code was developed with Python 3.11.11 on Ubuntu 21.xx, using torch==2.6.0+cu124 and transformers==4.57.3 (with the Qwen3 series).

## Annotating HICO-Det

### A. Installation
1. Install the required packages and dependencies.
2. Clone this repo; we'll refer to the cloned directory as ${ROOT}.
3. Create the necessary directories:
4. Download the LLM's weights from Hugging Face into model_weights.

### B. Prepare Dataset
5. Install the COCO API:
```
pip install pycocotools
...
`--read_rules.py
```
### C. Start annotation
#### Modify data_path, model_path, and output_dir='outputs' in "{ROOT}/scripts/annotate.sh" to match your configuration.
```
IDX={YOUR_GPU_IDS}
...
fi

CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
    tools/annotate_hico.py \
    --model-path ${model_path} \
    --data-path ${data_path} \
    --output-dir ${output_dir} \
```
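torchrun launches one process per GPU listed in IDX. How tools/annotate_hico.py divides images across those processes isn't shown in this diff, but a common pattern is a round-robin split keyed on the rank that torchrun exposes through the environment. A minimal sketch of that pattern (the helper below is illustrative, not code from this repo):

```python
import os

def shard_indices(num_items, rank, world_size):
    """Round-robin split: the indices this rank should annotate."""
    return list(range(rank, num_items, world_size))

# Under torchrun, each worker process can read its shard from the environment:
rank = int(os.environ.get("RANK", 0))
world_size = int(os.environ.get("WORLD_SIZE", 1))
my_indices = shard_indices(100, rank, world_size)
```

Every index lands in exactly one shard, so the workers' output files can simply be concatenated afterwards.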
#### Start auto-annotation
```
bash scripts/annotate_hico.sh
```
### D. Annotation format
A list of dicts, each containing the following keys:
```
{
...
'object_bbox': [128, 276, 144, 313],
'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
}
```
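For downstream use, the saved list can be loaded and filtered directly. A short sketch, assuming the list is serialized as JSON and that 'object_bbox' is [x1, y1, x2, y2] as the sample above suggests (the file path is a placeholder):

```python
import json

def bbox_area(bbox):
    """Area of an [x1, y1, x2, y2] box (format assumed from the sample above)."""
    x1, y1, x2, y2 = bbox
    return max(0, x2 - x1) * max(0, y2 - y1)

def load_annotations(path):
    """Read the annotation list (assumed to be a JSON list of dicts)."""
    with open(path) as f:
        return json.load(f)

# With the sample record shown above:
print(bbox_area([128, 276, 144, 313]))  # (144-128) * (313-276) = 592
```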


## Annotate COCO
1. Download the COCO dataset.
2. Organize the dataset so that your directory tree looks like this (the files inside Configs are copied from HICO-Det):
```
{DATA_ROOT}
|-- annotations
|   |-- person_keypoints_train2017.json
|   `-- person_keypoints_val2017.json
|-- Configs
|   |-- hico_hoi_list.txt
|   `-- Part_State_76.txt
|-- train2017
|   |-- 000000000009.jpg
|   |-- 000000000025.jpg
|   ...
`-- val2017
    |-- 000000000139.jpg
    |-- 000000000285.jpg
    ...
```
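Before launching the annotation run, it can help to sanity-check the layout above. A small sketch (the helper name is ours, not part of the repo):

```python
import os

# Required entries relative to {DATA_ROOT}, taken from the tree above.
REQUIRED = [
    "annotations/person_keypoints_train2017.json",
    "annotations/person_keypoints_val2017.json",
    "Configs/hico_hoi_list.txt",
    "Configs/Part_State_76.txt",
    "train2017",
    "val2017",
]

def missing_paths(data_root):
    """Return the required files/dirs that are absent under data_root."""
    return [p for p in REQUIRED if not os.path.exists(os.path.join(data_root, p))]
```

An empty return value means the tree matches the expected layout.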

### Start annotation
#### Modify data_path, model_path, and output_dir='outputs' in "{ROOT}/scripts/annotate_coco.sh" to match your configuration.
```
IDX={YOUR_GPU_IDS}
export PYTHONPATH=$PYTHONPATH:./

data_path={DATA_ROOT}
model_path={ROOT}/model_weights/{YOUR_MODEL_NAME}
output_dir={ROOT}/outputs

if [ -d ${output_dir} ]; then
    echo "dir already exists"
else
    mkdir ${output_dir}
fi

CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
    tools/annotate_coco.py \
    --model-path ${model_path} \
    --data-path ${data_path} \
    --output-dir ${output_dir} \
```
#### Start auto-annotation
```
bash scripts/annotate_coco.sh
```
By default, the annotation script only annotates the COCO train2017 set. To annotate val2017, find the following two lines (lines 167-168 of tools/annotate_coco.py) and replace 'train2017' with 'val2017':

```
dataset = PoseCOCODataset(
    data_path=os.path.join(args.data_path, 'annotations', 'person_keypoints_train2017.json'),  # <- Line 167
    multimodal_cfg=dict(image_folder=os.path.join(args.data_path, 'train2017'),  # <- Line 168
                        data_augmentation=False,
                        image_size=336,),)
```
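Rather than editing both lines by hand, the two paths could be derived from a single split name. A sketch of that refactor (the helper is a suggestion, not part of tools/annotate_coco.py):

```python
import os

def coco_split_paths(data_root, split="train2017"):
    """Annotation-file and image-folder paths for a COCO split ('train2017' or 'val2017')."""
    ann = os.path.join(data_root, "annotations", f"person_keypoints_{split}.json")
    images = os.path.join(data_root, split)
    return ann, images
```

The returned pair would then feed the data_path and image_folder arguments in the PoseCOCODataset call above, so switching splits is a one-word change.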

## Annotation format
A list of dicts, each containing the following keys:
```
{
'file_name': '000000000009.jpg',
'image_id': 9,
'keypoints': a 51-element list (17 keypoints x 3 values: x, y, v),
'vis': a 51-element list (17 keypoints, each with 3 visibility flags),
'height': 640,
'width': 480,
'human_bbox': [126, 258, 150, 305],
'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
}
```
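The flat 'keypoints' list follows the COCO keypoint convention of 17 (x, y, v) triples. A small sketch for unpacking it (the helper name is ours):

```python
def split_keypoints(flat):
    """Reshape the 51-element keypoints list into 17 (x, y, v) triples."""
    if len(flat) != 51:
        raise ValueError("expected 17 keypoints * 3 values = 51 entries")
    return [tuple(flat[i:i + 3]) for i in range(0, 51, 3)]
```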