WIP: #296 filter for PhD and post docs alongside advisors #23

SaniHarouna-Mayer · 2020-03-04T22:48:14Z

copy of recent PR to the right branch now..

sbillinge

thanks @SaniHarouna-Mayer . Please can you move over the comments from the other PR?

sbillinge · 2020-03-05T00:16:37Z

regolith/schemas.py

@@ -1366,6 +1366,7 @@
            "schema": {
                "type": "dict",
                "schema": {
+                    "advisor": {"required": False, "type": "string"},


I think a description could help here.

Also, we will need an exemplar (above). Add it to an existing education item in the scopatz entry.
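A minimal sketch of what the requested description and exemplar might look like, assuming the Cerberus-style layout shown in the diff. Only the `advisor` rule mirrors the diff; the description wording, the `degree` field, the exemplar values, and the names `EDUCATION_ITEM_SCHEMA`, `EXEMPLAR_EDUCATION_ITEM` and `check_item` are illustrative assumptions, not the contents of regolith/schemas.py.

```python
# Hypothetical schema fragment; only the "advisor" rule comes from the diff.
EDUCATION_ITEM_SCHEMA = {
    "advisor": {
        "description": "name or _id of the advisor for this degree",  # assumed wording
        "required": False,
        "type": "string",
    },
    "degree": {"required": True, "type": "string"},  # assumed neighbouring field
}

# Hypothetical exemplar entry that exercises the new optional field.
EXEMPLAR_EDUCATION_ITEM = {"degree": "Ph.D.", "advisor": "example_advisor_id"}

_TYPES = {"string": str}


def check_item(item, schema):
    """Return a list of problems; a tiny stand-in for a real schema validator."""
    problems = []
    for key, rule in schema.items():
        if key not in item:
            if rule.get("required"):
                problems.append(f"missing required field: {key}")
            continue
        if not isinstance(item[key], _TYPES[rule["type"]]):
            problems.append(f"wrong type for field: {key}")
    return problems


# An empty list means the exemplar satisfies the fragment.
print(check_item(EXEMPLAR_EDUCATION_ITEM, EDUCATION_ITEM_SCHEMA))
```

Keeping `required` as `False` means existing education entries without an advisor would still pass.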

regolith/tools.py

SaniHarouna-Mayer · 2020-03-05T02:24:19Z

Previous comments:

Thanks @SaniHarouna-Mayer this looks very good, well done!

How will it be used? we will need the code that calls it. Also, it will need a test.

Finally, I think that we probably want to get PhD and postdoc advisors only for a single person, so we won't need to iterate over an input_contacts, though this is just a guess as I am not sure how you plan to deploy this.

We will also need to dereference the advisor name, so if the advisor is sbillinge we will use fuzzy_retrieval to get this person either from the people or contacts collection. You may still be working on that. A helpful comment in the PR would help me know where you are with the PR.

Thanks so much! Again, a fine piece of code, very clean!

Thank you!! I will need help with the test and the usage within the code.
I will catch up on this with Songsheng tomorrow and work on your other
comments then.

sounds good.

Basically in the main code it is just getting all the collaborators etc.
for a single person that is specified on the command line....e.g., in
general that person will be sbillinge... so if the people collection is
loaded into the variable ppl, let's say, the code will look something
like

phd_advisors = [position["advisor"] for p in ppl if p["_id"] == person
                for position in p.get("education", []) if "advisor" in position]

then we would generally load phd_advisors into some dictionary that is
passed to the template and then the template will have to be modified to
unpack that and insert it into the built document.

one more thing, the elements of the list phd_advisors (a list in case
someone does two PhDs!) could be the actual name of a person, or the _id of
a person or whatever, so we would run them through fuzzy_retrieval to
get the full person and then load their canonical name into whatever is
passed to the template.

S
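The steps described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in regolith: `find_person` is a hypothetical stand-in for `fuzzy_retrieval` (whose real signature is not shown in this thread), the `aka` field and all sample records are made up, and `template_context` is just a placeholder name for the dictionary passed to the template.

```python
def find_person(collections, value):
    """Hypothetical stand-in for fuzzy_retrieval: match on _id, name or aka."""
    needle = value.lower()
    for collection in collections:
        for doc in collection:
            candidates = [doc.get("_id", ""), doc.get("name", "")]
            candidates += doc.get("aka", [])
            if needle in (c.lower() for c in candidates):
                return doc
    return None


def phd_advisor_names(ppl, contacts, person):
    """Canonical names of the advisors listed in one person's education."""
    names = []
    for p in ppl:
        if p["_id"] != person:
            continue
        for position in p.get("education", []):
            advisor = position.get("advisor")
            if advisor is None:
                continue  # most education entries will have no advisor
            found = find_person([ppl, contacts], advisor)
            # Fall back to the raw string when nobody matches.
            names.append(found["name"] if found else advisor)
    return names


# Made-up sample data, only to exercise the sketch.
ppl = [
    {"_id": "student1", "name": "Sam Student",
     "education": [{"degree": "Ph.D.", "advisor": "advisor1"},
                   {"degree": "B.S."}]},
    {"_id": "advisor1", "name": "Ada Advisor", "aka": ["A. Advisor"]},
]
contacts = []

template_context = {"phd_advisors": phd_advisor_names(ppl, contacts, "student1")}
print(template_context)
```

Searching people before contacts is an arbitrary choice here; the real lookup order would depend on how the builder loads its databases.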

@SaniHarouna-Mayer btw this PR should be in to the recent-collaborators branch

SaniHarouna-Mayer · 2020-03-05T18:38:33Z

@sbillinge
I am struggling to understand the architecture of the whole code and how I am supposed to implement my contribution:

Where is the function supposed to be called?

then we would generally load phd_advisors into some dictionary that is
passed to the template and then the template will have to be modified to
unpack that and insert it into the built document.

Where and how will the output be passed to the template and
how will it be inserted into the built document? (latex method in RecentCollabsBuilder class in recentcollabsbuilder.py?)

We will also need to dereference the advisor name, so if the advisor is sbillinge we will use fuzzy_retrieval to get this person either from the people or contacts collection. You may still be working on that.

To dereference the advisor name, do I simply add a key to the dictionary that tells which db the information is retrieved from?

Thank you!

SaniHarouna-Mayer · 2020-03-14T17:22:05Z

Hi @sbillinge, could you please double-check whether I understood the guidelines correctly?

The filter function from tools.py is called from coabuilder.py and filters for an advisor name and the positions in people.yml and contacts.yml.
I extend the excel method in coabuilder.py to build an xlsx file which contains the required information as it is shown in the coa_template.

Right now there are no advisor entries at all in people and contacts. How should we proceed to get this updated in the end? More specifically, where can I find that information, since it won't be easily found by googling, as we did when updating the institutions and contacts?

Thank you!

sbillinge · 2020-03-14T21:49:10Z

Thanks for reaching out Sani. It looks as if you are doing Issue regro#296 which is to find PhD and post-doc advisors.

If it was me, my plan might look like this:

1. We cannot currently search for advisors because this information is not recorded, so there will have to be a schema update. The first thing, then, is to decide where the advisor info should go.
2. Once it is decided where it should go, make something that will allow us to test our code. Since we generally run it for sbillinge, we just need to put the advisor info into the sbillinge entry for now, so modify people accordingly.
3. The program needs to take that information and insert it in the correct place in the template. One thing to figure out is whether it will need to add new rows to the template (as the authors one does) or not, but in any case, write the code for putting the info into the spreadsheet, taking inspiration from the code already there.
4. Now that we have testing going on (thanks to Hung), we will have to modify the regolith schema with these new fields. Return to this when we have everything working and building properly.

sbillinge · 2020-03-14T21:54:01Z

Now, to move this forward more quickly I can share some discussion and thoughts I have already had. Let's work on (1) first, where to put the info. The obvious place is to put it into the education and employment fields of the people collection. I would then think to have an optional advisor field in the education schema and then add the advisor when the education involved research, e.g., a PhD. For the post-doc positions, they will appear in employment, not education. It doesn't make sense to me to have an advisor field in employment because most employment is things like bank manager or hairdresser or whatever, so I would think that we would need a more generic word, maybe mentor. I guess hairdressers and bank managers can have mentors, but for post-docs the mentor could be the advisor. Not 100% sure about that but that would be my first suggestion.

sbillinge · 2020-03-14T21:59:25Z

Let's work on (2). The way the program is run is regolith build recent-collabs --people sbillinge, so the person whose advisors we want to find is specified on the command line. It seems that we would then have to iterate through the education and employment lists of only that one person and find (a) all the PhD degrees in education and extract the advisor and (b) all the entries in employment that are postdocs and extract the mentors. I guess another way to do it is to have an optional advisor field in employment that will, presumably, only be used in the case of a postdoc, and then the job here is just to find all the employments that contain an advisor and get that value. This might actually be better than my previous idea of mentor!
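A rough sketch of the second variant described above, where both education and employment entries carry an optional `advisor` field. The function names `is_phd` and `collect_advisors`, the way a PhD degree is recognised, the `position` field, and the sample record are all assumptions made for illustration; the actual field names depend on the final schema.

```python
def is_phd(degree):
    """Loose check for strings such as 'PhD', 'Ph.D.' or 'Ph. D. Physics'."""
    normalized = degree.replace(".", "").replace(" ", "").lower()
    return normalized.startswith("phd")


def collect_advisors(person_doc):
    """Return (phd_advisors, postdoc_advisors) for a single person document."""
    phd_advisors = [
        edu["advisor"]
        for edu in person_doc.get("education", [])
        if "advisor" in edu and is_phd(edu.get("degree", ""))
    ]
    # Assumption: only postdoc positions ever carry an advisor field.
    postdoc_advisors = [
        job["advisor"]
        for job in person_doc.get("employment", [])
        if "advisor" in job
    ]
    return phd_advisors, postdoc_advisors


# Made-up record for the person named on the command line.
person_doc = {
    "_id": "student1",
    "education": [
        {"degree": "Ph.D.", "advisor": "First Advisor"},
        {"degree": "B.S."},
    ],
    "employment": [
        {"position": "postdoc", "advisor": "Postdoc Advisor"},
        {"position": "professor"},
    ],
}

print(collect_advisors(person_doc))
```

Because only one person is specified, the function takes a single document and needs no loop over the whole people collection.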

sbillinge · 2020-03-14T22:00:12Z

for (3) it is a matter of copy-pasting code that is already there and adapting it.

sbillinge · 2020-03-14T22:01:54Z

btw, my PhD advisors were Takeshi Egami and Peter Davies and my postdoc advisor was George Kwei.

SaniHarouna-Mayer · 2020-03-16T21:19:22Z

Thanks Simon, this is really helpful!

SaniHarouna-Mayer · 2020-03-16T21:19:33Z

(1):

Now, to move this forward more quickly I can share some discussion and thoughts I have already had. Let's work on (1) first, where to put the info. The obvious place is to put it into the education and employment fields of the people collection. I would then think to have an optional advisor field in the education schema and then add the advisor when the education involved research, e.g., a PhD. For the post-doc positions, they will appear in employment, not education. It doesn't make sense to me to have an advisor field in employment because most employment is things like bank manager or hairdresser or whatever, so I would think that we would need a more generic word, maybe mentor. I guess hairdressers and bank managers can have mentors, but for post-docs the mentor could be the advisor. Not 100% sure about that but that would be my first suggestion.

I already implemented these changes, quite similar to your ideas. I don't quite see the confusion with the advisor name. I changed it to mentor in education and employment. I'd like to stick with the same name in education and employment, since it describes the same relationship and the code would lose clarity otherwise.

sbillinge and others added 22 commits March 4, 2020 11:50

initial commit of recent-collabs builder

e396c64

now extracts people it finds in author list, but only if in people co…

a6b9851

…llection

WIP working and returns set of folks in people coll

dd4d779

proper name parsing in recentcollabs builder

54e3ee1

tweaking error handling in recent_collabs

49a105b

adding dateutil to requirements

8799307

ENH: add needed_dbs

95eff79

MAINT: replace sbillinge with people argument

2e164af

catch tbd months

68145d1

more friendly fail when no person is specified

484abe5

now extracts people it finds in author list, but only if in people co…

5db5ea0

…llection

people seems to be enforced as list in p3.8]

a6bc4ba

test file

71e55dc

added coa_template.xlsx

f2afba3

- added script coabuilder.py filling in excel template

b6fbe30

- working on fixing cells with data validation

- removed the duplicate entries

900f02f

- reformatted the names into Last, First format - sorted the names by alphabetical order (by last name)

added global variable NUM_MONTHS

4ca5487

requirements should have python-dateutil not just dateutil

abeadf0

remove missing review-man test for test_builders. This should be in a…

0b5677d

… different pr.

changing tests so that recent collabs will run with scopatz as person

232b17a

added function filter for advisors and positions, status of last comm…

c208195

…it of PR to wrong branch

add advisor to schema @ education and employment

d29c162

sbillinge reviewed Mar 5, 2020

SaniHarouna-Mayer added 3 commits March 4, 2020 21:29

deleted duplicated function filter grants @tools.py

73f3373

in EXEMPLARS - add an advisor entry for one entry in employment and o…

4f25ea5

…ne entry in education

add descripotion to advisor @employment & @education schema. descript…

1fe7cd2

…ion clear enough??

SaniHarouna-Mayer added 2 commits March 16, 2020 16:34

update schema.py -> education and employment -> advisor -> descriptio…

c1b527b

…n: mentor

update schema.py -> change key name advisor to mentor

100c5ea

sbillinge force-pushed the recent_collaborators branch from 13a36ad to 3e0ebc5 Compare April 25, 2020 11:58

sbillinge force-pushed the recent_collaborators branch from 7ee23f8 to 7983fdc Compare June 5, 2020 18:41
