forked from apache/datafusion-sqlparser-rs
Implement `Spanned` to retrieve source locations on AST nodes (apache… …#1435)

Co-authored-by: Ifeanyi Ubah <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
1 parent 0adec33, commit 3c8fd74
Showing 18 changed files with 3,092 additions and 399 deletions.
@@ -0,0 +1,52 @@

## Breaking Changes

These are the current breaking changes introduced by the source spans feature:

#### Added fields for spans (must be added to any existing pattern matches)
- `Ident` now stores a `Span`
- `Select`, `With`, `Cte`, `WildcardAdditionalOptions` now store a `TokenWithLocation`

#### Misc.
- `TokenWithLocation` stores a full `Span`, rather than just a source location. Users relying on `token.location` should use `token.location.start` instead.
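
As a rough illustration of what "must be added to any existing pattern matches" means for downstream code, the sketch below uses simplified stand-in types. `Location`, `Span`, and `Ident` here are assumptions modelled loosely on the crate's types, and `ident_text` and `ident_start` are hypothetical helpers, not part of the library.

```rust
// Simplified stand-ins for the crate's types (assumptions for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Location {
    pub line: u64,
    pub column: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: Location,
    pub end: Location,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ident {
    pub value: String,
    pub quote_style: Option<char>,
    pub span: Span, // the newly added field
}

/// Hypothetical helper: an exhaustive destructuring pattern must now mention
/// `span`, even when the span itself is not needed.
pub fn ident_text(ident: &Ident) -> &str {
    let Ident {
        value,
        quote_style: _,
        span: _, // the only line a consumer has to add
    } = ident;
    value.as_str()
}

/// Hypothetical helper: code that only needs a single position reads the
/// start of the span.
pub fn ident_start(ident: &Ident) -> Location {
    ident.span.start
}
```

Patterns that already end in `..` need no change at all, since the rest pattern absorbs any newly added field.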

## Source Span Contributing Guidelines

When contributing source span improvements, in addition to following the general [contribution guidelines](../README.md#contributing), please pay attention to the following:

### Source Span Design Considerations

- `Ident` always has a correct source span
- The impact of breaking changes on downstream users should be as small as possible
- To this end, prefer recursively merging spans over storing spans on all nodes
- Any metadata added to compute spans must not change semantics (Eq, Ord, Hash, etc.)
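
The idea of recursively merging spans, rather than storing one on every node, could be sketched as follows. This is a minimal sketch under stated assumptions: `Location`, `Span`, and `Node` are hypothetical stand-ins rather than the crate's actual definitions, and the sketch does not handle empty or unknown spans.

```rust
// Hypothetical stand-in types; not the crate's real definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Location {
    pub line: u64,
    pub column: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: Location,
    pub end: Location,
}

impl Span {
    /// Smallest span covering both `self` and `other`.
    /// Locations compare by line first, then by column (derived `Ord`).
    pub fn union(self, other: Span) -> Span {
        Span {
            start: self.start.min(other.start),
            end: self.end.max(other.end),
        }
    }
}

/// Only leaves store a span; inner nodes store nothing extra.
pub enum Node {
    Leaf(Span),
    Binary { left: Box<Node>, right: Box<Node> },
}

impl Node {
    /// Computes the span of an inner node by merging the spans of its children.
    pub fn span(&self) -> Span {
        match self {
            Node::Leaf(span) => *span,
            Node::Binary { left, right } => left.span().union(right.span()),
        }
    }
}
```

Because `Binary` carries no span field of its own, adding span support this way does not change the shape of the node for downstream pattern matches.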

The primary reason for missing and inaccurate source spans at this time is that many structures do not yet store spans for their keyword tokens and values, either due to lack of time or because adding them would significantly break downstream code.

When considering adding support for source spans on a type, consider the impact on consumers of that type and whether your change would require them to make non-trivial changes to their code.

Example of a trivial change:
```rust
match node {
    ast::Query {
        field1,
        field2,
        location: _, // add a new line to ignore the new location field
    } => { /* existing handling stays unchanged */ }
}
```

If adding source spans to a type would require a significant change, such as wrapping that type, please open an issue to discuss.

### AST Node Equality and Hashes

When adding tokens to AST nodes, make sure to store them using the [AttachedToken](https://docs.rs/sqlparser/latest/sqlparser/ast/helpers/struct.AttachedToken.html) helper to ensure that semantically equivalent AST nodes always compare as equal and hash to the same value. For example, `select 5` and `SELECT 5` would compare as different `Select` nodes if the select token were stored directly. Store the token like this instead:

```rust
struct Select {
    select_token: AttachedToken, // only used for spans
    // remaining fields (placeholder names and types)
    field1: Field1,
    field2: Field2,
    // ...
}
```
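
The effect on equality and hashing can be illustrated with a small self-contained sketch. The `AttachedToken` below is a simplified stand-in that wraps a `String` instead of a `TokenWithLocation`, and `Select` and `hash_of` are hypothetical names used only for this illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simplified stand-in for the wrapper: it keeps the token text around,
/// but the text takes no part in equality or hashing.
#[derive(Debug, Clone)]
pub struct AttachedToken(pub String);

impl PartialEq for AttachedToken {
    fn eq(&self, _: &Self) -> bool {
        true // every attached token compares equal to every other
    }
}

impl Eq for AttachedToken {}

impl Hash for AttachedToken {
    fn hash<H: Hasher>(&self, _state: &mut H) {
        // Contributes nothing to the hash, consistent with `eq` above.
    }
}

/// Hypothetical node that derives its comparisons field by field.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Select {
    pub select_token: AttachedToken,
    pub projection: Vec<String>,
}

/// Hypothetical helper that hashes a value with the standard library's default hasher.
pub fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

With these definitions, two `Select` values that differ only in the stored token text (`select` versus `SELECT`) compare as equal and produce the same `hash_of` result, while a difference in `projection` still makes them unequal.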
@@ -0,0 +1,82 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

use core::cmp::{Eq, Ord, Ordering, PartialEq, PartialOrd};
use core::fmt::{self, Debug, Formatter};
use core::hash::{Hash, Hasher};

use crate::tokenizer::{Token, TokenWithLocation};

#[cfg(feature = "serde")]
use serde::{Deserialize, Serialize};

#[cfg(feature = "visitor")]
use sqlparser_derive::{Visit, VisitMut};

/// A wrapper type for attaching tokens to AST nodes that should be ignored in comparisons and hashing.
/// This should be used when a token is not relevant for semantics, but is still needed for
/// accurate source location tracking.
#[derive(Clone)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub struct AttachedToken(pub TokenWithLocation);

impl AttachedToken {
    /// Returns a placeholder that wraps `Token::EOF`.
    pub fn empty() -> Self {
        AttachedToken(TokenWithLocation::wrap(Token::EOF))
    }
}

// `Debug` delegates to the wrapped token.
impl Debug for AttachedToken {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        self.0.fmt(f)
    }
}

// Comparison, ordering, and hashing deliberately ignore the wrapped token,
// so any two `AttachedToken` values are equal and hash identically.
impl PartialEq for AttachedToken {
    fn eq(&self, _: &Self) -> bool {
        true
    }
}

impl Eq for AttachedToken {}

impl PartialOrd for AttachedToken {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for AttachedToken {
    fn cmp(&self, _: &Self) -> Ordering {
        Ordering::Equal
    }
}

impl Hash for AttachedToken {
    fn hash<H: Hasher>(&self, _state: &mut H) {
        // Do nothing
    }
}

impl From<TokenWithLocation> for AttachedToken {
    fn from(value: TokenWithLocation) -> Self {
        AttachedToken(value)
    }
}