Support for explicit padding and dilations in 2D CNN layers #138

Open · wants to merge 18 commits into base: main
67 changes: 37 additions & 30 deletions src/omlt/io/onnx_parser.py
@@ -176,13 +176,15 @@ def _visit_node(self, node, next_nodes):

def _consume_dense_nodes(self, node, next_nodes):
"""Starting from a MatMul node, consume nodes to form a dense Ax + b node."""
# This should only be called when we know we have a starting MatMul node. This
# error indicates a bug in the function calling this one.
if node.op_type != "MatMul":
raise ValueError(
f"{node.name} is a {node.op_type} node, only MatMul nodes can be used as starting points for consumption."
f"{node.name} is a {node.op_type} node, but the method for parsing MatMul nodes was invoked."
)
if len(node.input) != 2:
raise ValueError(
f"{node.name} input has {len(node.input)} dimensions, only nodes with 2 input dimensions can be used as starting points for consumption."
f"{node.name} input has {len(node.input)} dimensions, but the parser requires the starting node to have 2 input dimensions."
)

[in_0, in_1] = list(node.input)
@@ -200,7 +202,7 @@ def _consume_dense_nodes(self, node, next_nodes):
raise TypeError(f"Expected a node next, got a {type_} instead.")
if node.op_type != "Add":
raise ValueError(
f"The first node to be consumed, {node.name}, is a {node.op_type} node. Only Add nodes are supported."
f"The next node to be parsed, {node.name}, is a {node.op_type} node. Only Add nodes are supported."
)

# extract biases
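The MatMul-then-Add pattern that `_consume_dense_nodes` fuses can be sketched numerically (hypothetical values, not taken from the PR's tests):

```python
import numpy as np

# Sketch of the pattern the parser collapses: a MatMul node followed by an
# Add node becomes a single dense layer computing y = x @ A + b.
x = np.ones((1, 3))                 # input
A = np.arange(6.0).reshape(3, 2)    # MatMul's second input (weights)
b = np.array([1.0, -1.0])           # Add's second input (biases)
y = x @ A + b
# y == [[7.0, 8.0]]
```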
@@ -255,11 +257,11 @@ def _consume_gemm_dense_nodes(self, node, next_nodes):
"""Starting from a Gemm node, consume nodes to form a dense aAB + bC node."""
if node.op_type != "Gemm":
raise ValueError(
f"{node.name} is a {node.op_type} node, only Gemm nodes can be used as starting points for consumption."
f"{node.name} is a {node.op_type} node, but the method for parsing Gemm nodes was invoked."
)
if len(node.input) != 3:
raise ValueError(
f"{node.name} input has {len(node.input)} dimensions, only nodes with 3 input dimensions can be used as starting points for consumption."
f"{node.name} input has {len(node.input)} dimensions, but the parser requires the starting node to have 3 input dimensions."
)

attr = _collect_attributes(node)
@@ -310,11 +312,11 @@ def _consume_conv_nodes(self, node, next_nodes):
"""
if node.op_type != "Conv":
raise ValueError(
f"{node.name} is a {node.op_type} node, only Conv nodes can be used as starting points for consumption."
f"{node.name} is a {node.op_type} node, but the method for parsing Conv nodes was invoked."
)
if len(node.input) not in [2, 3]:
raise ValueError(
f"{node.name} input has {len(node.input)} dimensions, only nodes with 2 or 3 input dimensions can be used as starting points for consumption."
f"{node.name} input has {len(node.input)} dimensions, but the parser requires the starting node to have 2 or 3 input dimensions."
)

Collaborator:
The phraseology "consumption" is not very intuitive. If I were a user and fed my CNN into OMLT and got this error, it would not be clear to me what to do to fix it.

Collaborator (author):
I was cribbing from the docstrings, but after a bit more time with the code I've reworded all of these errors slightly. The node type checks should never be hit unless there's a bug in the calling function, and `._visit_node()` doesn't have that problem. The dimension checks are hopefully a bit clearer.

Collaborator:
Agree. I think we can be more specific, like: "To define a convolution layer, the input/output channels and weight matrix are required. Bias is optional."
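A minimal sketch of the input-count convention being discussed (plain Python mirroring the parser's check, not a call into ONNX itself):

```python
def describe_conv_inputs(inputs):
    # An ONNX Conv node takes (X, W) or (X, W, B): the weight matrix W is
    # required, the bias B is optional. Mirrors the len(node.input) check.
    if len(inputs) not in (2, 3):
        raise ValueError(
            f"Conv expects 2 or 3 inputs (X, W[, B]), got {len(inputs)}."
        )
    return "with bias" if len(inputs) == 3 else "no bias"

describe_conv_inputs(["X", "W"])       # 'no bias'
describe_conv_inputs(["X", "W", "B"])  # 'with bias'
```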

if len(node.input) == 2:
@@ -359,25 +361,32 @@ def _consume_conv_nodes(self, node, next_nodes):
f"Input/output size ({input_output_size}) first dimension must match input weights channels ({in_channels})."
)

# TODO: need to check pads and dilations also have correct dimensions. Also should
# add support for autopad.
if "pads" in attr:
pads = attr["pads"]
else:
pads = 2 * (len(input_output_size) - 1) * [0]

if "dilations" in attr:
dilations = attr["dilations"]
else:
dilations = (len(input_output_size) - 1) * [1]
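The defaulting logic above can be exercised standalone (a sketch; `input_output_size` is `[channels, h, w]` as in the surrounding code):

```python
def default_conv_attrs(input_output_size):
    # One spatial dim per entry after the channel dim; ONNX "pads" stores a
    # begin and an end value per spatial axis, hence the factor of 2.
    spatial = len(input_output_size) - 1
    pads = 2 * spatial * [0]
    dilations = spatial * [1]
    return pads, dilations

pads, dilations = default_conv_attrs([3, 28, 28])
# pads == [0, 0, 0, 0]; dilations == [1, 1]
```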
Collaborator:
This line needs test coverage.

Collaborator (author):
Coverage added to the non-dilation case.

Collaborator:
I think this PR already adds tests with dilations; old tests already cover cases without dilations. We can check the coverage after the CI issue is fixed.

# Other attributes are not supported
if "dilations" in attr and attr["dilations"] != [1, 1]:
raise ValueError(
f"{node} has non-identity dilations ({attr['dilations']}). This is not supported."
)
if attr["group"] != 1:
raise ValueError(
f"{node} has multiple groups ({attr['group']}). This is not supported."
)
if "pads" in attr and np.any(attr["pads"]):
raise ValueError(
f"{node} has non-zero pads ({attr['pads']}). This is not supported."
)

# generate new nodes for the node output
padding = 0
padding = [
pads[i] + pads[i + len(input_output_size) - 1]
for i in range(len(input_output_size) - 1)
]
output_size = [out_channels]
for w, k, s in zip(input_output_size[1:], kernel_shape, strides):
new_w = int((w - k + 2 * padding) / s) + 1
for w, k, s, p in zip(input_output_size[1:], kernel_shape, strides, padding):
new_w = int((w - k + p) / s) + 1
output_size.append(new_w)

activation = "linear"
@@ -401,6 +410,8 @@ def _consume_conv_nodes(self, node, next_nodes):
output_size,
strides,
weights,
pads=pads,
dilations=dilations,
activation=activation,
input_index_mapper=transformer,
)
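As a sanity check on the new output-size computation, the formula `new_w = (w - k + p) // s + 1` (with `p` the summed begin-plus-end padding for that axis, as built from `pads` above) reproduces the usual convolution arithmetic; note it assumes identity dilations:

```python
def conv_output_size(w, k, s, p):
    # w: input width, k: kernel width, s: stride,
    # p: begin + end padding along this axis
    return (w - k + p) // s + 1

conv_output_size(28, 3, 1, 2)   # 28: 3-wide kernel, pads [1, 1] keeps "same" size
conv_output_size(28, 3, 2, 0)   # 13: unpadded, stride 2
```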
@@ -413,11 +424,11 @@ def _consume_reshape_nodes(self, node, next_nodes):
"""Parse a Reshape node."""
if node.op_type != "Reshape":
raise ValueError(
f"{node.name} is a {node.op_type} node, only Reshape nodes can be used as starting points for consumption."
f"{node.name} is a {node.op_type} node, but the method for parsing Reshape nodes was invoked."
)
if len(node.input) != 2:
raise ValueError(
f"{node.name} input has {len(node.input)} dimensions, only nodes with 2 input dimensions can be used as starting points for consumption."
f"{node.name} input has {len(node.input)} dimensions, but the parser requires the starting node to have 2 input dimensions."
)
[in_0, in_1] = list(node.input)
input_layer = self._node_map[in_0]
@@ -434,7 +445,7 @@ def _consume_pool_nodes(self, node, next_nodes):
"""
if node.op_type not in _POOLING_OP_TYPES:
raise ValueError(
f"{node.name} is a {node.op_type} node, only MaxPool nodes can be used as starting points for consumption."
f"{node.name} is a {node.op_type} node, but the method for parsing MaxPool nodes was invoked."
)
pool_func_name = "max"

@@ -445,7 +456,7 @@ def _consume_pool_nodes(self, node, next_nodes):
)
if len(node.input) != 1:
raise ValueError(
f"{node.name} input has {len(node.input)} dimensions, only nodes with 1 input dimension can be used as starting points for consumption."
f"{node.name} input has {len(node.input)} dimensions, but the parser requires the starting node to have 1 input dimension."
)

input_layer, transformer = self._node_input_and_transformer(node.input[0])
@@ -467,17 +478,11 @@ def _consume_pool_nodes(self, node, next_nodes):
kernel_depth = attr["kernel_shape"][0]
kernel_shape = attr["kernel_shape"][1:]
Collaborator:
The kernel size for maxpool2d does not include the input/output channels. There are only two options for the kernel size: (i) one integer k, giving a kernel of [1, 1, k, k]; (ii) two integers [r, c], giving a kernel of [1, 1, r, c]. We need to change lines 478-479 to get the correct kernel_depth (which should be 1) and kernel_shape (which should be [k, k] or [r, c]). After fixing that, the check in lines 490-492 makes sense. Otherwise, we will get an error message like "Kernel shape [4] has 1 dimensions. Strides attribute has 2 dimensions. These must be equal."

Collaborator (author):
How do I read the correct kernel_depth and kernel_shape from the node attribute? Do I need to count the dimensions to determine whether it's been given as k, [r, c], or [1, 1, r, c]? Or do I just have the indices wrong here?

Collaborator:
We do not need to determine which case, since ONNX already standardizes the dimensions for the kernel. For maxpooling2d, the node attribute gives [r, c] (or [k, k] for case (i)), so we just need to define kernel_shape as attr["kernel_shape"]. Since the output channels equal the input channels, defining kernel_depth as in_channels will be fine.

Collaborator (author):
(This code predates this PR.)

I've now set kernel_depth to in_channels.

In the maxpool_2d.onnx file used in the test_onnx_parser/test_maxpool test, attr["kernel_shape"] for node1 is (3, 2, 3). I'm not sure what each of these dimensions represents, but if I take all 3 it fails.

Collaborator:
I think kernel_shape = attr["kernel_shape"], since the first dimension is no longer the depth in maxpooling. Can you put some tests on this part to see which one is correct?
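A sketch of the convention the reviewer describes (hypothetical helper, not OMLT API): the MaxPool `kernel_shape` attribute is taken as spatial-only, and the depth comes from the input channel count:

```python
def pool_kernel_info(attr, in_channels):
    # Per the review: kernel_shape holds only spatial dims ([r, c], or
    # [k, k] for a square kernel), and pooling preserves channels, so
    # kernel_depth equals in_channels.
    kernel_shape = list(attr["kernel_shape"])
    kernel_depth = in_channels
    return kernel_shape, kernel_depth

ks, kd = pool_kernel_info({"kernel_shape": (3, 2)}, 16)
# ks == [3, 2]; kd == 16
```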

strides = attr["strides"] if "strides" in attr else [1] * len(kernel_shape)
pads = attr["pads"] if "pads" in attr else None
dilations = attr["dilations"] if "dilations" in attr else None

# check only kernel shape, stride, storage order are set
# everything else is not supported
if "dilations" in attr and attr["dilations"] != [1, 1]:
raise ValueError(
f"{node.name} has non-identity dilations ({attr['dilations']}). This is not supported."
)
if "pads" in attr and np.any(attr["pads"]):
raise ValueError(
f"{node.name} has non-zero pads ({attr['pads']}). This is not supported."
)
if ("auto_pad" in attr) and (attr["auto_pad"] != "NOTSET"):
raise ValueError(
f"{node.name} has autopad set ({attr['auto_pad']}). This is not supported."
Expand Down Expand Up @@ -519,6 +524,8 @@ def _consume_pool_nodes(self, node, next_nodes):
pool_func_name,
tuple(kernel_shape),
kernel_depth,
pads=pads,
dilations=dilations,
activation=activation,
input_index_mapper=transformer,
)