Add Dotnetcore support #152

Open
wants to merge 13 commits into
base: master
3 changes: 3 additions & 0 deletions .gitignore
@@ -15,3 +15,6 @@ SolutionInfo.cs
.paket/Paket.Restore.targets
paket-files
*.snk

localNuget/
*.jar
Binary file removed .paket/paket.bootstrapper.exe
Binary file not shown.
41 changes: 0 additions & 41 deletions .paket/paket.targets

This file was deleted.

84 changes: 41 additions & 43 deletions Readme.md → README.md
@@ -1,43 +1,41 @@
Tika on .NET
============

[![Build status](https://ci.appveyor.com/api/projects/status/ofc68okbo9s75okr?svg=true)](https://ci.appveyor.com/project/KevM/tikaondotnet) [![NuGet version](https://badge.fury.io/nu/TikaOnDotNet.TextExtractor.svg)](https://badge.fury.io/nu/TikaOnDotNet.TextExtractor)

This project is a simple wrapper around the very excellent and robust
[Tika](http://tika.apache.org/) text extraction Java library. This project produces two nugets:
- TikaOnDotNet - A straight [IKVM](http://www.ikvm.net/userguide/ikvmc.html) hosted port of Java Tika project.

[![Install-Package TikaOnDotNet](https://cldup.com/H-IdGdU75T.png)](https://www.nuget.org/packages/TikaOnDotnet/)

- TikaOnDotNet.TextExtractor - Use Tika to extract text from rich documents.

[![Install-Package TikaOnDotNet.TextExtractor](https://cldup.com/_BM0b5jVjU.png)](https://www.nuget.org/packages/TikaOnDotNet.TextExtractor/)

## Getting Started

The best way to get started is to:
- Add a Nuget dependency to [TikaOnDotNet.TextExtractor](https://www.nuget.org/packages/TikaOnDotNet.TextExtractor/).
- Instantiate a new `TextExtractor` object and call one of the `Extract` methods.

### Usage
```cs
// using TikaOnDotNet.TextExtraction;

var textExtractor = new TextExtractor();

var wordDocContents = textExtractor.Extract(@".\path\to\my favorite word.docx");
var webPageContents = textExtractor.Extract(new Uri("https://google.com"));
```

Take a look at [our tests](https://github.com/KevM/tikaondotnet/tree/master/src/TikaOnDotNet.Tests) for more usage examples.

## How To Contribute

Have an idea to make this project better? Great! Start out by taking a look at our [Contributing Guide](https://github.com/KevM/tikaondotnet/blob/master/Contributing.md).

## Having A Problem?

Search in the [Issues](https://github.com/KevM/tikaondotnet/issues?q=is%3Aopen+is%3Aissue)
as your problem may be a common one. If you don't find your problem, please [create an
issue](https://github.com/KevM/tikaondotnet/issues/new). Contributors here will
chime in when they can.
Tika on .NET
============

This project is a simple wrapper around the very excellent and robust
[Tika](http://tika.apache.org/) text extraction Java library. This project produces two nugets:
- TikaOnDotNet - A straight [IKVM](http://www.ikvm.net/userguide/ikvmc.html) hosted port of Java Tika project.

[![Install-Package TikaOnDotNet](https://cldup.com/H-IdGdU75T.png)](https://www.nuget.org/packages/TikaOnDotnet/)

- TikaOnDotNet.TextExtractor - Use Tika to extract text from rich documents.

[![Install-Package TikaOnDotNet.TextExtractor](https://cldup.com/_BM0b5jVjU.png)](https://www.nuget.org/packages/TikaOnDotNet.TextExtractor/)

## Getting Started

The best way to get started is to:
- Add a Nuget dependency to [TikaOnDotNet.TextExtractor](https://www.nuget.org/packages/TikaOnDotNet.TextExtractor/).
- Instantiate a new `TextExtractor` object and call one of the `Extract` methods.

### Usage
```cs
// using TikaOnDotNet.TextExtraction;

var textExtractor = new TextExtractor();

var wordDocContents = textExtractor.Extract(@".\path\to\my favorite word.docx");
var webPageContents = textExtractor.Extract(new Uri("https://google.com"));
```

Take a look at [our tests](https://github.com/KevM/tikaondotnet/tree/master/src/TikaOnDotNet.Tests) for more usage examples.

## How To Contribute

Have an idea to make this project better? Great! Start out by taking a look at our [Contributing Guide](https://github.com/KevM/tikaondotnet/blob/master/Contributing.md).

## Having A Problem?

Search in the [Issues](https://github.com/KevM/tikaondotnet/issues?q=is%3Aopen+is%3Aissue)
as your problem may be a common one. If you don't find your problem, please [create an
issue](https://github.com/KevM/tikaondotnet/issues/new). Contributors here will
chime in when they can.
4 changes: 4 additions & 0 deletions Release-Notes.md
@@ -1,3 +1,7 @@
## 2.4.1

After a long hiatus from updates following the loss of IKVM support, we are updating TikaOnDotnet to the latest version of Tika. A wonderful boon of IKVM's newly revived status is that we can now add .NET Core support. Yes: this NuGet package targets both .NET Framework and .NET Core.
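
As a rough illustration of what dual targeting means for a consumer, a project file along these lines could reference the package from both runtimes. The target framework monikers `net472` and `net6.0` are assumptions chosen for the example (the release notes do not list the exact targets), and the version simply reuses the 2.4.1 heading above, assuming the package is published under that number.

```xml
<!-- Hypothetical consumer project; the target frameworks are illustrative assumptions -->
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <!-- One .NET Framework target and one modern .NET target -->
    <TargetFrameworks>net472;net6.0</TargetFrameworks>
  </PropertyGroup>
  <ItemGroup>
    <!-- Version assumed from the release notes heading -->
    <PackageReference Include="TikaOnDotNet.TextExtractor" Version="2.4.1" />
  </ItemGroup>
</Project>
```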

## 1.17.1

- Add new overloads to the `TextExtractor.Extract` allowing users to provide their own extraction result assemblers. Example:
3 changes: 0 additions & 3 deletions Thanks.md
@@ -4,6 +4,3 @@ Thanks
Thanks goes out to everyone who has helped out with TikaOnDotnet.

- All [project contributors](https://github.com/KevM/tikaondotnet/graphs/contributors).

- Sergey Tihon - who opened up the license of his [build automation](https://github.com/sergey-tihon/Stanford.NLP.NET/blob/3cef796a872c59448de345ad8bd72ceb04920b7d/build.fsx)
so we could remove that nasty manual `ikvmc.exe` step.
27 changes: 0 additions & 27 deletions appveyor.yml

This file was deleted.

14 changes: 0 additions & 14 deletions build.cmd

This file was deleted.

159 changes: 0 additions & 159 deletions build.fsx

This file was deleted.

Binary file added icon.png
17 changes: 17 additions & 0 deletions nuget.config
@@ -0,0 +1,17 @@
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="Local" value=".\localNuget\" />
  </packageSources>
  <activePackageSource>
    <add key="All" value="(Aggregate source)" />
  </activePackageSource>
  <packageSourceMapping>
    <packageSource key="nuget.org">
      <package pattern="*" />
    </packageSource>
    <packageSource key="Local">
      <package pattern="Tika*" />
    </packageSource>
  </packageSourceMapping>
</configuration>
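
A note on the configuration above: the mapping refers to a source named `nuget.org` that this file does not declare, so it presumably relies on the user-level or machine-level NuGet configuration to supply it. With package source mapping, the most specific matching pattern wins, so `Tika*` packages resolve from `Local` and everything else from `nuget.org`. A self-contained variant might look like the sketch below; the `<clear />` element and the explicit `nuget.org` entry are additions for illustration, not part of this PR.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <!-- Assumption: drop inherited sources so only the two below apply -->
    <clear />
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
    <add key="Local" value=".\localNuget\" />
  </packageSources>
  <packageSourceMapping>
    <!-- Catch-all: any package not matched more specifically comes from nuget.org -->
    <packageSource key="nuget.org">
      <package pattern="*" />
    </packageSource>
    <!-- More specific prefix: Tika* packages come from the local folder -->
    <packageSource key="Local">
      <package pattern="Tika*" />
    </packageSource>
  </packageSourceMapping>
</configuration>
```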
15 changes: 0 additions & 15 deletions paket.dependencies

This file was deleted.
