parseq: 'torch.Size' object has no attribute 'rank' #1354
-
Bug description

Thanks team for including parseq and vitstr! Also, do we get parseq weights from baudm if we set `pretrained=True`?

Code snippet to reproduce the bug

```python
from typing import Tuple

from PIL import Image
import torchvision.transforms as T

from doctr.models import parseq


def get_transform(img_size: Tuple[int, int], augment: bool = False, rotation: int = 0):
    transforms = []
    if augment:
        transforms.append(rand_augment_transform())  # Assuming you have your own augment function
    if rotation:
        transforms.append(T.RandomRotation(rotation))  # Apply random rotation
    transforms.extend([
        T.Resize((img_size[1], img_size[0]), T.InterpolationMode.BICUBIC),  # note height and width
        T.ToTensor(),
        T.Normalize(0.5, 0.5),  # Normalize for RGB images
    ])
    return T.Compose(transforms)
# Per baudm/parseq: Model expects a batch of images with shape: (B, C, H, W) but if you're using parseq from docTR it's B, H, W, C
img_size = (128, 32)
# Load your PIL image
pil_image = Image.open('<image_here>.png').convert('RGB')
transform = get_transform(img_size)
transformed_image = transform(pil_image)
transformed_image = transformed_image.permute(1, 2, 0) #move channel to the end
# Use the model
end_of_pipeline_detect_model = parseq(pretrained=False)
end_of_pipeline_detect_model(transformed_image.unsqueeze(0))
```

Error traceback

```
AttributeError: Exception encountered when calling layer 'patch_embedding_1' (type PatchEmbedding).
'torch.Size' object has no attribute 'rank'
```

Environment

```
#skip
Deep Learning backend
is_tf_available: True
is_torch_available: True
```
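The traceback comes from a Keras layer, i.e. the TensorFlow implementation of parseq received a `torch.Tensor`, whose `.shape` is a `torch.Size` with no `.rank` attribute. A quick hedged check of which backends docTR can see, assuming the `doctr.file_utils` helpers:

```python
# Hedged sketch: with both backends installed, docTR picks one at import time
# (TensorFlow here, judging by the Keras PatchEmbedding layer in the traceback),
# so feeding that model a torch.Tensor triggers the error above.
from doctr.file_utils import is_tf_available, is_torch_available

print(is_tf_available(), is_torch_available())  # True True, matching the environment block
```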
-
Moving this to a discussion because it's not a bug :)
Hi @temiwale88 👋🏼 ,
If you only want to predict already cropped images you can do the following:
For example:
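A minimal sketch of such a recognition-only call, assuming docTR's `recognition_predictor` and `DocumentFile` APIs; the crop file names are placeholders:

```python
from doctr.io import DocumentFile
from doctr.models import recognition_predictor

# Recognition-only predictor built around the parseq architecture
predictor = recognition_predictor("parseq", pretrained=True)

# Already cropped word images (hypothetical file names)
crops = DocumentFile.from_images(["word_crop_1.png", "word_crop_2.png"])

# One (word, confidence) tuple per crop
print(predictor(crops))
```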
Output:
[('Text', 0.9996758699417114), ('Data', 0.9975508451461792)]
I see you have installed both backends (TF and PT) - this is not recommended for any prod system, only for development and testing.
You can switch between by doing:
torch:
USE_TORCH=1 python3 /path/to/your/script.py
tensorflow:
USE_TF=1 python3 /path/to/your/script.py

Keep in mind the pretrained version of parseq (PyTorch) is only available on the …

All models are pretrained on a mindee internal dataset (~11M real world word crops).
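If you prefer to keep the backend selection inside the script rather than on the command line, a hedged sketch (the variable has to be set before docTR is imported):

```python
import os

# Select the PyTorch backend before the first docTR import; use "USE_TF" instead for TensorFlow
os.environ["USE_TORCH"] = "1"

from doctr.models import recognition_predictor

predictor = recognition_predictor("parseq", pretrained=True)
```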