Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Xmlrpc: XML documents without an .xml extension are stored as binary #5067

Closed
ohumbel opened this issue Sep 22, 2023 · 5 comments · Fixed by #5068
Closed

[BUG] Xmlrpc: XML documents without an .xml extension are stored as binary #5067

ohumbel opened this issue Sep 22, 2023 · 5 comments · Fixed by #5068

Comments

@ohumbel
Copy link
Contributor

ohumbel commented Sep 22, 2023

When an XML document without an explicit .xml extension is stored via xmlrpc, its mime type is converted to binary.
Despite the fact that the provided resource type is XMLResource.class.

Here is a sketch for a test:

@Test
public void testXMLMimeType() throws XMLDBException {
    // store an XML document without an .xml extension
    try(Collection collection = DatabaseManager.getCollection(getBaseUri() + "/db/" + COLLECTION_NAME, ...)) {
        final Class<? extends Resource> xmlResourceType = XMLResource.class;
        final XMLResource resource = (XMLResource)collection.createResource(DOCUMENT_NAME, xmlResourceType);
        resource.setContent(XML_CONTENT);
        collection.storeResource(resource);
        assertEquals(ResourceType.XML_RESOURCE, resource.getResourceType());
    }
	
    // retrieve the document and verify its resource type
    try(Collection collection = DatabaseManager.getCollection(getBaseUri() + "/db/" + COLLECTION_NAME, ...)) {
        Resource resource = collection.getResource(DOCUMENT_NAME);
        assertEquals(ResourceType.XML_RESOURCE, resource.getResourceType()); // <== fails with binary
    }
}

The expected behaviour would be, that an XML document is stored as XMLResource if so requested.
I will try to provide a pull request with a complete test and a possible solution.

The fix is needed for develop and develop-6.x.x.
On develop-4.x.x the issue does not manifest itself.

@ohumbel
Copy link
Contributor Author

ohumbel commented Sep 22, 2023

Background

All our XML documents are named like Standard_123, or Standard_123.1. After upgrading from 4.x to 6.2.0, we were not able to retrieve documents after having stored them. We exclusively access the remote eXist instance with xmlrpc.

The proposed fix works really well for us.

@adamretter
Copy link
Contributor

If I am not mistaken, eXist-db 6.x.x, 5.x.x, and 4.x.x do not have a Collection#createResource(String, Class) method, this was only introduced in eXist-db 7.x.x as part of @reinhapa's updates to the XML:DB API.

In eXist-db 7.x.x:

final Class<? extends Resource> xmlResourceType = XMLResource.class;
collection.createResource(DOCUMENT_NAME, xmlResourceType)

When a LocalXMLResource is constructed it sets the XML Mime Type here:

public LocalXMLResource(final Subject user, final BrokerPool brokerPool, final LocalCollection parent, final XmldbURI did) throws XMLDBException {
        super(user, brokerPool, parent, did, MimeType.XML_TYPE.getName());

This is then used in LocalCollection#storeXMLResource here:

final String strMimeType = res.getMimeType(broker, transaction);
final MimeType mimeType = strMimeType != null ? MimeTable.getInstance().getContentType(strMimeType) : null;

if (res.root != null) {
    collection.storeDocument(transaction, broker, resURI, res.root, mimeType, res.datecreated, res.datemodified, null, null, null);

or... when a RemoteXMLResource is constructed it sets it here:

public RemoteXMLResource(
            final RemoteCollection parent,
            final int handle,
            final int pos,
            final XmldbURI docId,
            final Optional<String> id,
            final Optional<String> type)
            throws XMLDBException {
        super(parent, docId, MimeType.XML_TYPE.getName());

In RemoteCollection#store(RemoteXMLResource) it looks like it doesn't send the Media Type to the remote XML-RPC function parse.

If we look in the RPCConnection#parse(byte[], XmldbURI, int, Date, Date) function we see this:

final MimeType mime = MimeTable.getInstance().getContentTypeFor(docUri.lastSegment());
                broker.storeDocument(transaction, docUri.lastSegment(), source, mime, created, modified, null, null, null, collection);

So it looks like parse attempts to determine the MimeType from eXist-db's MimeTable (i.e. its embedded mime-types.xml file) by inspecting the filename.
As there is no file extension, it will not find a media type, and the mime variable will be null. When broker.storeDocument is called, as the mime is null, it will store it as a binary file.

I would suggest the best solution would be to add an additional media type parameter to the XML-RPC API for parse.

@line-o
Copy link
Member

line-o commented Sep 22, 2023

node-exist which uses eXist-db's XML-RPC API allows to set, or override, the mime-type explicitly when calling parseLocal of an uploaded document (see https://github.com/eXist-db/node-exist/blob/70f4ff38f7b3d81f2e257c1f8a12f7e742c6d02d/components/documents.js#L12).

But this might not be available when using store.

@reinhapa
Copy link
Member

If I am not mistaken, eXist-db 6.x.x, 5.x.x, and 4.x.x do not have a Collection#createResource(String, Class) method, this was only introduced in eXist-db 7.x.x as part of @reinhapa's updates to the XML:DB API.

That is not quite correct, before the update to the new XML::DB API the Method signature used Collection#createResource(String, String) where the second parameter defined if it should be an XMLResource or not.

The main problem or expectation is, that as long there was no change made to the mime time (in this case after reading an existing resource) it should also not "freshly calculated on the storage side. This is also the shown by the test itself.

@ohumbel
Copy link
Contributor Author

ohumbel commented Sep 23, 2023

Regarding backport, please refer to my comment in the pull request: #5068 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants