Commit

Fix missing url in Avro Docs.
donPain committed Dec 13, 2024
1 parent a1abb6a commit 32af73f
Showing 10 changed files with 27 additions and 26 deletions.
18 changes: 9 additions & 9 deletions docs/content.zh/docs/connectors/datastream/formats/parquet.md
@@ -198,14 +198,14 @@ ds = env.from_source(source, WatermarkStrategy.no_watermarks(), "file-source")

Flink 支持三种方式来读取 Parquet 文件并创建 Avro records (PyFlink 只支持 Generic record):

-- [Generic record](https://avro.apache.org/docs/1.10.0/api/java/index.html)
-- [Specific record](https://avro.apache.org/docs/1.10.0/api/java/index.html)
-- [Reflect record](https://avro.apache.org/docs/1.10.0/api/java/org/apache/avro/reflect/package-summary.html)
+- [Generic record](https://avro.apache.org/docs/++version++/api/java/index.html)
+- [Specific record](https://avro.apache.org/docs/++version++/api/java/index.html)
+- [Reflect record](https://avro.apache.org/docs/++version++/api/java/org/apache/avro/reflect/package-summary.html)

### Generic record

-使用 JSON 定义 Avro schemas。你可以从 [Avro specification](https://avro.apache.org/docs/1.10.0/spec.html) 获取更多关于 Avro schemas 和类型的信息。
-此示例使用了一个在 [official Avro tutorial](https://avro.apache.org/docs/1.10.0/gettingstartedjava.html) 中描述的示例相似的 Avro schema:
+使用 JSON 定义 Avro schemas。你可以从 [Avro specification](https://avro.apache.org/docs/++version++/spec.html) 获取更多关于 Avro schemas 和类型的信息。
+此示例使用了一个在 [official Avro tutorial](https://avro.apache.org/docs/++version++/getting-started-java) 中描述的示例相似的 Avro schema:

```json lines
{"namespace": "example.avro",
@@ -219,11 +219,11 @@ Flink 支持三种方式来读取 Parquet 文件并创建 Avro records (PyFlin
}
```
这个 schema 定义了一个具有三个属性的的 user 记录:name,favoriteNumber 和 favoriteColor。你可以
-[record specification](https://avro.apache.org/docs/1.10.0/spec.html#schema_record) 找到更多关于如何定义 Avro schema 的详细信息。
+[record specification](https://avro.apache.org/docs/++version++/spec.html#schema_record) 找到更多关于如何定义 Avro schema 的详细信息。

在此示例中,你将创建包含由 Avro Generic records 格式构成的 Parquet records 的 DataStream。
Flink 会基于 JSON 字符串解析 Avro schema。也有很多其他的方式解析 schema,例如基于 java.io.File 或 java.io.InputStream。
-请参考 [Avro Schema](https://avro.apache.org/docs/1.10.0/api/java/org/apache/avro/Schema.html) 以获取更多详细信息。
+请参考 [Avro Schema](https://avro.apache.org/docs/++version++/api/java/org/apache/avro/Schema.html) 以获取更多详细信息。
然后,你可以通过 `AvroParquetReaders` 为 Avro Generic 记录创建 `AvroParquetRecordFormat`

{{< tabs "GenericRecord" >}}
@@ -286,7 +286,7 @@ stream = env.from_source(source, WatermarkStrategy.no_watermarks(), "file-source
基于之前定义的 schema,你可以通过利用 Avro 代码生成来生成类。
一旦生成了类,就不需要在程序中直接使用 schema。
你可以使用 `avro-tools.jar` 手动生成代码,也可以直接使用 Avro Maven 插件对配置的源目录中的任何 .avsc 文件执行代码生成。
-请参考 [Avro Getting Started](https://avro.apache.org/docs/1.10.0/gettingstartedjava.html) 获取更多信息。
+请参考 [Avro Getting Started](https://avro.apache.org/docs/++version++/getting-started-java) 获取更多信息。

此示例使用了样例 schema {{< gh_link file="flink-formats/flink-parquet/src/test/resources/avro/testdata.avsc" name="testdata.avsc" >}}:

@@ -335,7 +335,7 @@ final DataStream<GenericRecord> stream =

除了需要预定义 Avro Generic 和 Specific 记录, Flink 还支持基于现有 Java POJO 类从 Parquet 文件创建 DateStream。
在这种场景中,Avro 会使用 Java 反射为这些 POJO 类生成 schema 和协议。
-请参考 [Avro reflect](https://avro.apache.org/docs/1.10.0/api/java/index.html) 文档获取更多关于 Java 类型到 Avro schemas 映射的详细信息。
+请参考 [Avro reflect](https://avro.apache.org/docs/++version++/api/java/index.html) 文档获取更多关于 Java 类型到 Avro schemas 映射的详细信息。

本例使用了一个简单的 Java POJO 类 {{< gh_link file="flink-formats/flink-parquet/src/test/java/org/apache/flink/formats/parquet/avro/Datum.java" name="Datum" >}}:

@@ -291,4 +291,4 @@ Format 参数

除了此处列出的类型之外,Flink 还支持读取/写入可为空(nullable)的类型。 Flink 将可为空的类型映射到 Avro `union(something, null)`, 其中 `something` 是从 Flink 类型转换的 Avro 类型。

-您可以参考 [Avro Specification](https://avro.apache.org/docs/current/spec.html) 以获取有关 Avro 类型的更多信息。
+您可以参考 [Avro Specification](https://avro.apache.org/docs/++version++/specification/) 以获取有关 Avro 类型的更多信息。
2 changes: 1 addition & 1 deletion docs/content.zh/docs/connectors/table/formats/avro.md
@@ -204,4 +204,4 @@ Format 参数

除了上面列出的类型,Flink 支持读取/写入 nullable 的类型。Flink 将 nullable 的类型映射到 Avro `union(something, null)`,其中 `something` 是从 Flink 类型转换的 Avro 类型。

-您可以参考 [Avro 规范](https://avro.apache.org/docs/current/spec.html) 获取更多有关 Avro 类型的信息。
+您可以参考 [Avro 规范](https://avro.apache.org/docs/++version++/specification/) 获取更多有关 Avro 类型的信息。
@@ -91,7 +91,7 @@ Flink 基于下面的规则来支持 [POJO 类型]({{< ref "docs/dev/datastream/
### Avro 类型

Flink 完全支持 Avro 状态类型的升级,只要数据结构的修改是被
-[Avro 的数据结构解析规则](http://avro.apache.org/docs/current/spec.html#Schema+Resolution)认为兼容的即可。
+[Avro 的数据结构解析规则](https://avro.apache.org/docs/++version++/specification/_print/#schema-resolution)认为兼容的即可。

一个例外是如果新的 Avro 数据 schema 生成的类无法被重定位或者使用了不同的命名空间,在作业恢复时状态数据会被认为是不兼容的。

18 changes: 9 additions & 9 deletions docs/content/docs/connectors/datastream/formats/parquet.md
@@ -196,14 +196,14 @@ ds = env.from_source(source, WatermarkStrategy.no_watermarks(), "file-source")

Flink supports producing three types of Avro records by reading Parquet files (Only Generic record is supported in PyFlink):

-- [Generic record](https://avro.apache.org/docs/1.10.0/api/java/index.html)
-- [Specific record](https://avro.apache.org/docs/1.10.0/api/java/index.html)
-- [Reflect record](https://avro.apache.org/docs/1.10.0/api/java/org/apache/avro/reflect/package-summary.html)
+- [Generic record](https://avro.apache.org/docs/++version++/api/java/index.html)
+- [Specific record](https://avro.apache.org/docs/++version++/api/java/index.html)
+- [Reflect record](https://avro.apache.org/docs/++version++/api/java/org/apache/avro/reflect/package-summary.html)

### Generic record

-Avro schemas are defined using JSON. You can get more information about Avro schemas and types from the [Avro specification](https://avro.apache.org/docs/1.10.0/spec.html).
-This example uses an Avro schema example similar to the one described in the [official Avro tutorial](https://avro.apache.org/docs/1.10.0/gettingstartedjava.html):
+Avro schemas are defined using JSON. You can get more information about Avro schemas and types from the [Avro specification](https://avro.apache.org/docs/++version++/spec.html).
+This example uses an Avro schema example similar to the one described in the [official Avro tutorial](https://avro.apache.org/docs/++version++/getting-started-java):

```json lines
{"namespace": "example.avro",
@@ -217,10 +217,10 @@ This example uses an Avro schema example similar to the one described in the [of
}
```

-This schema defines a record representing a user with three fields: name, favoriteNumber, and favoriteColor. You can find more details at [record specification](https://avro.apache.org/docs/1.10.0/spec.html#schema_record) for how to define an Avro schema.
+This schema defines a record representing a user with three fields: name, favoriteNumber, and favoriteColor. You can find more details at [record specification](https://avro.apache.org/docs/++version++/spec.html#schema_record) for how to define an Avro schema.

In the following example, you will create a DataStream containing Parquet records as Avro Generic records.
-It will parse the Avro schema based on the JSON string. There are many other ways to parse a schema, e.g. from java.io.File or java.io.InputStream. Please refer to [Avro Schema](https://avro.apache.org/docs/1.10.0/api/java/org/apache/avro/Schema.html) for details.
+It will parse the Avro schema based on the JSON string. There are many other ways to parse a schema, e.g. from java.io.File or java.io.InputStream. Please refer to [Avro Schema](https://avro.apache.org/docs/++version++/api/java/org/apache/avro/Schema.html) for details.
After that, you will create an `AvroParquetRecordFormat` via `AvroParquetReaders` for Avro Generic records.

{{< tabs "GenericRecord" >}}
@@ -284,7 +284,7 @@ Based on the previously defined schema, you can generate classes by leveraging A
Once the classes have been generated, there is no need to use the schema directly in your programs.
You can either use `avro-tools.jar` to generate code manually or you could use the Avro Maven plugin to perform
code generation on any .avsc files present in the configured source directory. Please refer to
-[Avro Getting Started](https://avro.apache.org/docs/1.10.0/gettingstartedjava.html) for more information.
+[Avro Getting Started](https://avro.apache.org/docs/++version++/getting-started-java/) for more information.

The following example uses the example schema {{< gh_link file="flink-formats/flink-parquet/src/test/resources/avro/testdata.avsc" name="testdata.avsc" >}}:

@@ -334,7 +334,7 @@ final DataStream<GenericRecord> stream =
Beyond Avro Generic and Specific record that requires a predefined Avro schema,
Flink also supports creating a DataStream from Parquet files based on existing Java POJO classes.
In this case, Avro will use Java reflection to generate schemas and protocols for these POJO classes.
-Java types are mapped to Avro schemas, please refer to the [Avro reflect](https://avro.apache.org/docs/1.10.0/api/java/index.html) documentation for more details.
+Java types are mapped to Avro schemas, please refer to the [Avro reflect](https://avro.apache.org/docs/++version++/api/java/index.html) documentation for more details.

This example uses a simple Java POJO class {{< gh_link file="flink-formats/flink-parquet/src/test/java/org/apache/flink/formats/parquet/avro/Datum.java" name="Datum" >}}:

@@ -298,4 +298,4 @@ See the [Apache Avro Format]({{< ref "docs/connectors/table/formats/avro" >}}#da

In addition to the types listed there, Flink supports reading/writing nullable types. Flink maps nullable types to Avro `union(something, null)`, where `something` is the Avro type converted from Flink type.

-You can refer to [Avro Specification](https://avro.apache.org/docs/current/spec.html) for more information about Avro types.
+You can refer to [Avro Specification](https://avro.apache.org/docs/++version++/specification/) for more information about Avro types.
2 changes: 1 addition & 1 deletion docs/content/docs/connectors/table/formats/avro.md
@@ -218,4 +218,4 @@ So the following table lists the type mapping from Flink type to Avro type.

In addition to the types listed above, Flink supports reading/writing nullable types. Flink maps nullable types to Avro `union(something, null)`, where `something` is the Avro type converted from Flink type.

-You can refer to [Avro Specification](https://avro.apache.org/docs/current/spec.html) for more information about Avro types.
+You can refer to [Avro Specification](https://avro.apache.org/docs/++version++/specification/) for more information about Avro types.
@@ -166,7 +166,8 @@ public TypeSerializer<T> restoreSerializer() {
*
* <p>Checks whenever a new version of a schema (reader) can read values serialized with the old
* schema (writer). If the schemas are compatible according to {@code Avro} schema resolution
-     * rules (@see <a href="https://avro.apache.org/docs/current/spec.html#Schema+Resolution">Schema
+     * rules (@see <a
+     * href="https://avro.apache.org/docs/++version++/specification/_print/#schema-resolution">Schema
* Resolution</a>).
*/
@VisibleForTesting
@@ -73,7 +73,7 @@

/**
* Test the avro input format. (The testcase is mostly the getting started tutorial of avro)
- * http://avro.apache.org/docs/current/gettingstartedjava.html
+ * http://avro.apache.org/docs/current/getting-started-java
*/
public class AvroRecordInputFormatTest {

@@ -51,7 +51,7 @@

/**
* Test the avro input format. (The testcase is mostly the getting started tutorial of avro)
- * http://avro.apache.org/docs/current/gettingstartedjava.html
+ * https://avro.apache.org/docs/++version++/getting-started-java/
*/
class AvroSplittableInputFormatTest {

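Most of the link substitutions in the hunks above follow a handful of patterns. As a rough sketch only, they could be written down as an ordered rule table. The helper name `rewrite_avro_doc_url` and the `RULES` table are assumptions made for illustration and are not part of this commit. The rules cover only patterns visible in the diff, and a few changed lines deviate from them (for example, trailing slashes vary between files, and one test comment keeps the `http://.../docs/current/` prefix).

```python
import re

# Hypothetical rule table inferred from the hunks in this commit.
# Order matters: the more specific patterns are tried first.
RULES = [
    # Schema resolution anchor moved to the single-page ("_print") specification.
    (re.compile(r"^https?://avro\.apache\.org/docs/current/spec\.html#Schema\+Resolution$"),
     "https://avro.apache.org/docs/++version++/specification/_print/#schema-resolution"),
    # Unversioned "current" spec link now points at the specification page.
    (re.compile(r"^https?://avro\.apache\.org/docs/current/spec\.html$"),
     "https://avro.apache.org/docs/++version++/specification/"),
    # Getting-started page was renamed.
    (re.compile(r"^https://avro\.apache\.org/docs/1\.10\.0/gettingstartedjava\.html$"),
     "https://avro.apache.org/docs/++version++/getting-started-java"),
    # Any other pinned 1.10.0 link only has its version segment replaced.
    (re.compile(r"^https://avro\.apache\.org/docs/1\.10\.0/"),
     "https://avro.apache.org/docs/++version++/"),
]


def rewrite_avro_doc_url(url: str) -> str:
    """Return the rewritten URL, or the input unchanged if no rule matches."""
    for pattern, replacement in RULES:
        if pattern.search(url):
            return pattern.sub(replacement, url, count=1)
    return url
```

Calling `rewrite_avro_doc_url` on a link that matches none of the rules returns it unchanged, so the helper can be run over every link in a file without touching unrelated URLs.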
