Fix infraspecific names
- Inserts rank between species and final epithet
- Privatizes methods used by `InatObs#name_id`
- Improves method name
JoeCohen committed Aug 10, 2024
1 parent 285522f commit 5e2c538
Showing 4 changed files with 102 additions and 70 deletions.
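
The first bullet of the commit message is easiest to see on plain data. The following is a minimal standalone sketch, not the code from this commit: `full_taxon_name`, its `genus:` keyword, and the two constants are hypothetical stand-ins for `InatObs#full_name`, the genus the commit finds among an identification's ancestors, and the inline rank lists. Only the rank lists and the example names are taken from the diff.

```ruby
# Standalone sketch; the rank lists mirror those in the diff.
INFRAGENERIC_RANKS =
  %w[subgenus section subsection stirps series subseries].freeze
INFRASPECIFIC_RANKS = %w[subspecies variety form].freeze

# taxon: hash with :name and :rank, shaped like the iNat taxon in the diff.
# genus: genus name, needed only for infrageneric ranks. This parameter is
#        an assumption of the sketch; the commit looks the genus up in the
#        ancestors of the matching identification instead.
def full_taxon_name(taxon, genus: nil)
  name = taxon[:name]
  rank = taxon[:rank]

  if INFRAGENERIC_RANKS.include?(rank)
    # iNat supplies only the epithet, e.g. "Distantes"
    "#{genus} #{rank} #{name}"
  elsif INFRASPECIFIC_RANKS.include?(rank)
    # iNat omits the rank, e.g. "Inonotus obliquus sterilis"
    words = name.split
    "#{words[0..1].join(' ')} #{rank} #{words[2]}"
  else
    name
  end
end

puts full_taxon_name({ name: "Inonotus obliquus sterilis", rank: "form" })
puts full_taxon_name({ name: "Distantes", rank: "section" }, genus: "Morchella")
```

The sketch keeps the rank spelled out exactly as iNat supplies it (`form`). How that string is later matched to an MO name such as "Inonotus obliquus f. sterilis" is outside the sketch.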
108 changes: 59 additions & 49 deletions app/classes/inat_obs.rb
@@ -178,55 +178,6 @@ def name_id
best_mo_name(mo_names)
end

# For infrageneric ranks, the iNat `:name` string is only the epithet.
# Ex: "Distantes"
# So get a complete string. Ex: "Morchella section Distantes"
def full_name
if infrageneric?
genus_rank_epithet
else
inat_taxon[:name]
end
end

def infrageneric?
%w[subgenus section subsection stirps series subseries].
include?(inat_taxon[:rank])
end

def genus_rank_epithet
# Search the identifications of this iNat observation
# for an identification of the inat_taxon[:id]
inat_identifications.each do |identification|
next unless identifies_this_obs?(identification)

# search the identification's ancestors to find the genus
identification[:taxon][:ancestors].each do |ancestor|
next unless ancestor[:rank] == "genus"

# return a string comprising Genus rank epithet
# ex: "Morchella section Distantes"
return "#{ancestor[:name]} #{inat_taxon[:rank]} #{inat_taxon[:name]}"
end
end
end

def identifies_this_obs?(identification)
identification[:taxon][:id] == inat_taxon[:id]
end

def best_mo_name(mo_names)
return Name.unknown.id if mo_names.none?
return mo_names.first.id if mo_names.one?

# iNat name maps to multiple MO Names
# So for the moment, just map it to Fungi
# TODO: refine this.
# Ideas: check iNat and MO authors, possibly prefer non-deprecated MO Name
# - might need a dictionary here
Name.unknown.id
end

def notes
return "" if description.empty?

@@ -339,6 +290,65 @@ def description
@obs[:description]
end

def full_name
if infrageneric?
# iNat :name string is only the epithet. Ex: "Distantes"
prepend_genus_and_rank
elsif infraspecific?
# iNat :name string omits the rank. Ex: "Inonotus obliquus sterilis"
insert_rank_between_species_and_final_epithet
else
inat_taxon[:name]
end
end

def infrageneric?
%w[subgenus section subsection stirps series subseries].
include?(inat_taxon[:rank])
end

def prepend_genus_and_rank
# Search the identifications of this iNat observation
# for an identification of the inat_taxon[:id]
inat_identifications.each do |identification|
next unless identifies_this_obs?(identification)

# search the identification's ancestors to find the genus
identification[:taxon][:ancestors].each do |ancestor|
next unless ancestor[:rank] == "genus"

# return a string comprising Genus rank epithet
# ex: "Morchella section Distantes"
return "#{ancestor[:name]} #{inat_taxon[:rank]} #{inat_taxon[:name]}"
end
end
end

def infraspecific?
%w[subspecies variety form].include?(inat_taxon[:rank])
end

def insert_rank_between_species_and_final_epithet
words = inat_taxon[:name].split
"#{words[0..1].join(" ")} #{inat_taxon[:rank]} #{words[2]}"
end

def identifies_this_obs?(identification)
identification[:taxon][:id] == inat_taxon[:id]
end

def best_mo_name(mo_names)
return Name.unknown.id if mo_names.none?
return mo_names.first.id if mo_names.one?

# iNat name maps to multiple MO Names
# So for the moment, just map it to Fungi
# TODO: refine this.
# Ideas: check iNat and MO authors, possibly prefer non-deprecated MO Name
# - might need a dictionary here
Name.unknown.id
end

def sequence_field?(field)
field[:datatype] == "dna" ||
field[:name] =~ /DNA/ && field[:value] =~ /^[ACTG]{,10}/
43 changes: 22 additions & 21 deletions test/inat/README_INAT_EXAMPLES.md
@@ -15,28 +15,29 @@ All data as of the time of importing. (The corresponding iNat Observation may ha

| File | iNat Obs | fotos | location | Other |
| ---- | -------- | ----- | -------- | ----- |
| amanita_flavorubens.txt | [231104466](https://www.inaturalist.org/observations/231104466) | **0** | public | Casual |
| arrhenia_sp_NY02.txt | [184219885](https://www.inaturalist.org/observations/184219885) | 1 | public | **mo-style Provisional Species Name**, **DNA** |
| calostoma_lutescens.txt | [195434438](https://www.inaturalist.org/observations/195434438) | **0** | public | |
| ceanothus_cordulatus.txt | [219631412](https://www.inaturalist.org/observations/219631412) | 1 | public | **Plant** |
| coprinus.txt | [213450312](https://www.inaturalist.org/observations/213450312) | 1 | **obscured** | Needs ID |
| distantes.txt | [215996396](https://www.inaturalist.org/observations/215996396) | 1 | **obscured** | Needs ID, jdc Obs, taxon[:name]: "Distantes" rank:"section", rank_level:13|
| donadina_PNW01.txt | [212320801](https://www.inaturalist.org/observations/212320801) | 1 | public | **non-mo-style Provisional Species Name (PNW)**, **DNA** |
| evernia.txt | [216357655](https://www.inaturalist.org/observations/216357655) | 0 | public | Casual, lichen, no fields, place: Troutdale |
| fuligo_septica.txt | [219783802](https://www.inaturalist.org/observations/219783802) | 1 | public | slime mold **Protozoa** Richmond, CA |
| gyromitra_ancilis.txt | [216745568](https://www.inaturalist.org/observations/216745568) | 3 | public | **cc-by license**, **many projects**, US 20, Linn Co.|
| import_all.txt | | | | all fungal obss (total of 5) of iNat user devin189, 2 per page (this user had few fungal observations) |
| inocybe.txt | [222904190](https://www.inaturalist.org/observations/222904190) | 5 | public | cc-by-nc, **2 tags∆∆** |
| lentinellus_ursinus.txt | [220796026](https://inaturalist.org/observations/220796026) | 2 | obscured | **ID matches many MO names** |
| listed_ids.txt | [231104466](https://www.inaturalist.org/observations/231104466) [195434438](https://www.inaturalist.org/observations/195434438) | na | na | response to request for 2 obs by number (amanita_flavorubens, evernia) |
| lycoperdon.txt | [24970904](https://www.inaturalist.org/observations/24970904) | 2 | public | cc-by-nc, projects, **multiple ids**, many fields including **DNA**, place: E. side of Metolius River, Sisters Ranger District, Deschutes National Forest, Jefferson County, Oregon, US |
| russulaceae.txt | [216675045](https://www.inaturalist.org/observations/216675045) | 2 | public | **all rights reserved**, many projects, Activity; place: Point Defiance Park, Tacoma, WA, US |
| amanita_flavorubens| [231104466](https://www.inaturalist.org/observations/231104466) | **0** | public | Casual |
| arrhenia_sp_NY02| [184219885](https://www.inaturalist.org/observations/184219885) | 1 | public | **mo-style Provisional Species Name**, **DNA** |
| calostoma_lutescens| [195434438](https://www.inaturalist.org/observations/195434438) | **0** | public | |
| ceanothus_cordulatus| [219631412](https://www.inaturalist.org/observations/219631412) | 1 | public | **Plant** |
| coprinus| [213450312](https://www.inaturalist.org/observations/213450312) | 1 | **obscured** | Needs ID |
| distantes| [215996396](https://www.inaturalist.org/observations/215996396) | 1 | **obscured** | Needs ID, jdc Obs, taxon[:name]: "Distantes" rank:"section", rank_level:13|
| donadina_PNW01| [212320801](https://www.inaturalist.org/observations/212320801) | 1 | public | **non-mo-style Provisional Species Name (PNW)**, **DNA** |
| evernia| [216357655](https://www.inaturalist.org/observations/216357655) | 0 | public | Casual, lichen, no fields, place: Troutdale |
| fuligo_septica| [219783802](https://www.inaturalist.org/observations/219783802) | 1 | public | slime mold **Protozoa** Richmond, CA |
| gyromitra_ancilis| [216745568](https://www.inaturalist.org/observations/216745568) | 3 | public | **cc-by license**, **many projects**, US 20, Linn Co.|
| import_all| | | | all fungal obss (total of 5) of iNat user devin189, 2 per page (this user had few fungal observations) |
| inocybe| [222904190](https://www.inaturalist.org/observations/222904190) | 5 | public | cc-by-nc, **2 tags∆∆** |
| i_obliquus_f_sterilis | [232919689](https://www.inaturalist.org/observations/232919689) | 1 | public | cc-by-nc, **infraspecific name** |
| lentinellus_ursinus| [220796026](https://inaturalist.org/observations/220796026) | 2 | obscured | **ID matches many MO names** |
| listed_ids| [231104466](https://www.inaturalist.org/observations/231104466) [195434438](https://www.inaturalist.org/observations/195434438) | na | na | response to request for 2 obs by number (amanita_flavorubens, evernia) |
| lycoperdon| [24970904](https://www.inaturalist.org/observations/24970904) | 2 | public | cc-by-nc, projects, **multiple ids**, many fields including **DNA**, place: E. side of Metolius River, Sisters Ranger District, Deschutes National Forest, Jefferson County, Oregon, US |
| russulaceae| [216675045](https://www.inaturalist.org/observations/216675045) | 2 | public | **all rights reserved**, many projects, Activity; place: Point Defiance Park, Tacoma, WA, US |
| somion_unicolor.json | | | | Formatted version of following; facilitates viewing iNat API response key/values test/inat/somion_unicolor.json |
| somion_unicolor.txt | [**202555552**](https://www.inaturalist.org/observations/202555552) | 5 | public | Research Grade, Notes, Activity, >1 ID, 1 field (Mushroom Observer URL), **mirrored from MO** |
| trametes.txt | [220370929](https://www.inaturalist.org/observations/220370929) | 2 | public | D. Miller observation with different collector; Notes; **Observation Fields: Collector**, place: 25th Ave NE, Seattle, WA, US, with huge error |
| tremella_mesenterica.txt | [213508767](https://www.inaturalist.org/observations/213508767) | 1 | public | place: Lewisville, TX 75057, USA |
| xeromphalina_campanella_complex.txt | [215969102](https://www.inaturalist.org/observations/215969102) | 2 | public | **Complex** |
| zero_results.txt | n.a. | | n.a. | response with total_results: 0, to expose and prevent reversion of bug |
| somion_unicolor| [**202555552**](https://www.inaturalist.org/observations/202555552) | 5 | public | Research Grade, Notes, Activity, >1 ID, 1 field (Mushroom Observer URL), **mirrored from MO** |
| trametes| [220370929](https://www.inaturalist.org/observations/220370929) | 2 | public | D. Miller observation with different collector; Notes; **Observation Fields: Collector**, place: 25th Ave NE, Seattle, WA, US, with huge error |
| tremella_mesenterica| [213508767](https://www.inaturalist.org/observations/213508767) | 1 | public | place: Lewisville, TX 75057, USA |
| xeromphalina_campanella_complex| [215969102](https://www.inaturalist.org/observations/215969102) | 2 | public | **Complex** |
| zero_results| n.a. | | n.a. | response with total_results: 0, to expose and prevent reversion of bug |

## TODO

1 change: 1 addition & 0 deletions test/inat/i_obliquus_f_sterilis.txt

Large diffs are not rendered by default.

20 changes: 20 additions & 0 deletions test/models/inat_obs_test.rb
@@ -121,6 +121,26 @@ def test_infrageneric_name
assert_equal(name.text_name, mock_inat_obs.text_name)
end

def test_infraspecific_name
name = Name.create(
user: rolf,
rank: "Form",
text_name: "Inonotus obliquus f. sterilis",
search_name: "Inonotus obliquus f. sterilis (Vanin) Balandaykin & Zmitr.",
display_name: "**__Inonotus obliquus__** f. **__sterilis__** " \
"(Vanin) Balandaykin & Zmitr.",
sort_name: "Inonotus obliquus {7f. sterilis " \
"(Vanin) Balandaykin & Zmitr.",
author: "(Vanin) Balandaykin & Zmitr.",
icn_id: 809_726
)

mock_inat_obs = mock_observation("i_obliquus_f_sterilis")

assert_equal(name.id, mock_inat_obs.name_id)
assert_equal(name.text_name, mock_inat_obs.text_name)
end

def test_names_alternative_authors
# Make sure fixtures still OK
names = Name.where(text_name: "Lentinellus ursinus", rank: "Species",