Detect file type from MIME type rather than extension #4

Open
johngriffin opened this issue Feb 4, 2011 · 3 comments

@johngriffin
Contributor

When a file has no extension, xlwrap currently guesses that the file type is CSV. See:

https://github.com/markbirbeck/xlwrap/commit/ef3b0096d89bcd3472d5a2b1b65aefae6c08d0c6

It would be better to use the MIME type to work out the file type.

@antiguru
Contributor

antiguru commented Feb 4, 2011

Something like http://tika.apache.org/, or is that overkill? I fully agree that some change is required.

@johngriffin
Contributor Author

Looks like it would do the trick, but it might be a bit heavy for just MIME type detection. A lo-fi alternative might be to use javax.activation.MimetypesFileTypeMap. The tradeoff seems to be speed of detection vs. weight of the code dependency. We'd probably also have to manually list some of the MIME types we're detecting with javax.activation, but that's possible and not such a big deal since we only need to support CSV, Excel and OpenOffice.
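
To illustrate the "manually list the MIME types" part, here is a minimal sketch of a lookup from a detected MIME type to the three formats we care about. The class `MimeFileTypes`, the enum `FileType` and the method `fromMimeType` are hypothetical names made up for this example, not existing xlwrap code, and the sketch assumes the MIME type string has already been obtained by whichever detector we end up using.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of xlwrap): maps a MIME type string to one of
// the spreadsheet formats mentioned in this thread.
final class MimeFileTypes {

    // Hypothetical enum; UNKNOWN leaves the fallback decision to the caller.
    enum FileType { CSV, EXCEL, OPENOFFICE, UNKNOWN }

    // The manually maintained list of MIME types we recognise.
    private static final Map<String, FileType> BY_MIME = new HashMap<>();
    static {
        BY_MIME.put("text/csv", FileType.CSV);
        BY_MIME.put("application/vnd.ms-excel", FileType.EXCEL);
        BY_MIME.put("application/vnd.oasis.opendocument.spreadsheet", FileType.OPENOFFICE);
    }

    private MimeFileTypes() {}

    static FileType fromMimeType(String mimeType) {
        if (mimeType == null) {
            return FileType.UNKNOWN;
        }
        // Drop parameters such as "; charset=utf-8" and normalise case.
        int semi = mimeType.indexOf(';');
        String base = (semi >= 0 ? mimeType.substring(0, semi) : mimeType)
                .trim()
                .toLowerCase(Locale.ROOT);
        return BY_MIME.getOrDefault(base, FileType.UNKNOWN);
    }
}
```

A caller would pass in the detected type, for example `MimeFileTypes.fromMimeType("text/csv; charset=utf-8")`, and decide separately what to do with `UNKNOWN` instead of silently assuming CSV.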

@johngriffin
Contributor Author

We've committed an implementation of MIME detection, using Tika, to my fork of xlwrap. See the commit here:

johngriffin@dfa2bc9
