GitHub - Monty/DNA_Match_Manager_Helper: Create sortable spreadsheet from DNA Match Manager output files

DNA Match Manager Helper

DNA Match Manager is a free program that downloads DNA match data from Ancestry, 23andMe, FamilyTreeDNA, GEDmatch, and MyHeritage (in minutes instead of days). However, some important numeric data in its output files, such as Shared cMs, is in a different column for each site. That makes it difficult to sort into a useful order.

The shell script makeDNASpreadsheet.sh overcomes that problem by extracting and coalescing data from those files into a summary spreadsheet sorted by Shared cMs, Shared %, and then Match Name so that the closest DNA relatives are at the top.

The resulting spreadsheet contains the following columns:

Column	Description
Source Site	Site the data came from - e.g. Ancestry, 23andMe, etc.
Match Name	Name of the person who is a DNA match
Email	Email address of the Match (if provided)
Side	P, M, or blank - i.e. P[aternal] or M[aternal]
Gender	M, F, or blank
Shared %	Percentage of DNA shared - computed if not provided
Shared cMs	Total amount of DNA shared (in cMs)
Shared Segs	Total number of DNA segments shared
Longest Seg	Size (in cMs) of largest matching segment
Relationship	Varies - e.g. GEDmatch uses a number instead of text

Prerequisites

In addition to DNA Match Manager you'll need a few standard tools such as bash, awk, and grep which are installed by default on both macOS and Linux, and are installed by Windows Subsystem for Linux on Windows 10.

Compatibility

Tested on macOS Mojave and Catalina. Tested on Ubuntu 18.04. Should work in Windows 10 if Windows Subsystem for Linux is installed.

Instructions

Clone or download DNA Match Manager Helper.
Run DNA Match Manager to collect data from any sites you use. Save (in the directory created in step 1) either a file per site or a single file containing all sites. Include one of these key strings in the name you choose for each output file. Be consistent in your methodology!
- 23andMe
- Ancestry
- FTDNA
- GEDmatch
- MyHeritage
- AllSites - Only if you choose to use a single file to contain all data
For example, instead of the default MatchManagerExport_1.13.2020.2.48.28PM.csv save your 23andMe match data as Match-23andMe_1.13.2020.2.48.28PM.csv (or any other name containing "23andMe").
In a terminal window, type: ./makeDNASpreadsheet.sh to generate a match summary spreadsheet and (if there are any) a list of matches added since the previous run. They will have timestamped names such as Relatives-2020-01-11.103106.csv and Additions-2020-01-11.103106.csv

On macOS you can simply double click run.command in a Finder window (or right-click and select 'Open') which will automatically open a Terminal window and run ./makeDNASpreadsheet.sh for you.

By default, only matches sharing more than 100 cMs are saved. This is appropriate for adoptees or others seeking close DNA relatives. If you are doing genealogy research a lower minimum will be more helpful.

To change the minimum cMs saved, use the -m switch, e.g. in a terminal window type: ./makeDNASpreadsheet.sh -m 19.5 Note: If you want 20 cM, enter 19.5 as some web sites show cMs rounded up to the next whole number.

ProTips

Create a permanent master spreadsheet from the Relatives-timestamp.csv file and save it as a non-csv spreadsheet. Use that spreadsheet to make notes, add missing data, and change erroneous relationships (e.g. second cousin to first cousin once removed, etc.). Add as many columns as you need for things like most recent common ancestor, birth year, location, common surnames, or data from site specific columns.

Don't delete the Relatives-timestamp.csv file produced by this script. It will be used by later runs to produce an Additions-timestamp.csv file containing data that can be cut/pasted into your master spreadsheet. You can either paste that data one line at a time into its proper slot, or paste it all at the bottom and then sort it by Shared cMs (descending), Shared % (descending), and then Match Name (ascending) to move it into the correct slot.

There is no reason to run ./makeDNASpreadsheet.sh again until you know or suspect you have a new relative that shares more DNA than the minimum cMs saved. Run it monthly if you don't keep a close watch on your DNA testing sites.

If you want to see what added matches will look like without waiting for new ones to show up on the web, just reduce the minimum cMs slightly, e.g. ./makeDNASpreadsheet.sh -m 99

If you want to permanently change the minimum cMs saved, edit makeDNASpreadsheet.sh and change the line cMs_min=99.5 to a different value.

Enjoy!

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
.gitattributes		.gitattributes
.gitignore		.gitignore
DNA_Match_Manager-FieldNames.md		DNA_Match_Manager-FieldNames.md
LICENSE		LICENSE
README.md		README.md
awkStatements.txt		awkStatements.txt
fix23andMe.sh		fix23andMe.sh
fixFTDNA.sh		fixFTDNA.sh
fixFields.awk		fixFields.awk
fixMyHeritage.sh		fixMyHeritage.sh
getFieldsFromDNA.awk		getFieldsFromDNA.awk
makeDNASpreadsheet.sh		makeDNASpreadsheet.sh
run.command		run.command

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DNA Match Manager Helper

Prerequisites

Compatibility

Instructions

ProTips

About

Releases 1

Packages

Languages

License

Monty/DNA_Match_Manager_Helper

Folders and files

Latest commit

History

Repository files navigation

DNA Match Manager Helper

Prerequisites

Compatibility

Instructions

ProTips

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages