Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What is a proper format of labels? #202

Closed
tugot17 opened this issue Jul 22, 2020 · 4 comments
Closed

What is a proper format of labels? #202

tugot17 opened this issue Jul 22, 2020 · 4 comments
Labels
documentation Improvements or additions to documentation example request good first issue Good for newcomers help wanted Extra attention is needed question Further information is requested

Comments

@tugot17
Copy link

tugot17 commented Jul 22, 2020

I have a custom dataset. 4 Classes: ['Port', 'Necklace', 'Endo', 'ICD']

I have a dataframe where each row represents a separate detection. If some image has more then single detection, then it's image_id is in two columns

image_id label width height bbox path
00005885_000.png Port 1024 1024 [262.0, 355.0, 365.0, 435.0] /home/piotr/Desktop/Nih-Chest-X-Ray-Artifacts ....
0005776_009.png Necklace 1024 1024 [386.0, 1.0, 696.0, 216.0] /home/piotr/Desktop/Nih-Chest-X-Ray-Artifacts ....

and I wrote a custom Parser

class ArtifactParser(DefaultImageInfoParser, FasterRCNNParser):
    def __init__(self, df):
        self.df = df
        self.imageid_map = IDMap()
        self.labelid_map = IDMap()

    def __iter__(self):
        yield from self.df.itertuples()

    def __len__(self):
        return len(self.df)

    def imageid(self, o) -> int:
        return self.imageid_map[o.image_id]

    def filepath(self, o) -> Union[str, Path]:
        return o.img_path

    def height(self, o) -> int:
        return o.height

    def width(self, o) -> int:
        return o.width

    def labels(self, o) -> List[int]:
        return [self.labelid_map[o.label]]

    def bboxes(self, o) -> List[BBox]:
        return [BBox.from_xyxy(*json.loads(o.bbox))]

Everything seems to be working but I'm not sure whether it is a proper approach for creating a Parser or should I change something?

@tugot17 tugot17 added documentation Improvements or additions to documentation example request good first issue Good for newcomers help wanted Extra attention is needed labels Jul 22, 2020
@lgvaz
Copy link
Collaborator

lgvaz commented Jul 22, 2020

For labels you should use:

self.class_map = ClassMap(['Port', 'Necklace', 'Endo', 'ICD'])

And then at the imageid method:

return [self.class_map.get_name(o.label)]

We still have to better document this, but ClassMap will take care of adding the background class at index 0, which is required by the models we currently support.

On a quick look everything else seems fine on the parser =)


Note: Very clever to load the bboxes with json.loads, much easier than what I'm doing in the wheat parser ^^

@lgvaz lgvaz added the question Further information is requested label Jul 22, 2020
@tugot17
Copy link
Author

tugot17 commented Jul 22, 2020

I changed as you recommended to self.class_map = ClassMap(['Port', 'Necklace', 'Endo', 'ICD']) and it works perfectly, however in show_record I still need to give class_map= as an argument.
Maybe it would be a good idea to use the field class_map from ArtifactParser class (eg. make it a Parser argument) in show_record method? By doing so there would no more be a necessary to use this variable each time when we want to display the image?

@lgvaz
Copy link
Collaborator

lgvaz commented Jul 22, 2020

There are a couple of issues with that,

First, you only pass the record to show_record, so we would need to add a reference to class_map in each record.

Second, class_map is not a required attribute for the parser, take note that you had to define it yourself, this is by design, some datasets already give you the label_id instead of the label name, and there is no reason to use a class_map in those situations.

In summary, the way it's implemented gives more freedom to the user, you're not forced to use ClassMap if you don't want to, it's just a convenience class.

@tugot17
Copy link
Author

tugot17 commented Jul 22, 2020

Okk, makes sense. So, for now I close the issue. Thanks for your help :)

@tugot17 tugot17 closed this as completed Jul 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation example request good first issue Good for newcomers help wanted Extra attention is needed question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants