ISSUE-14: LoD Reconciling + CSV Export + Release Improvements #31

Merged: 42 commits, Nov 18, 2021

Commits on Jul 25, 2021

  1. First pass on LoD Reconciling

    Not perfect yet. A few small issues remain and a re-reconciling/correcting UI is still needed. Should be ready tonight? A few dpm() calls are still left in this pull for my own enjoyment.
    DiegoPino committed Jul 25, 2021
    a157a20

Commits on Jul 26, 2021

  1. cd1b043

Commits on Jul 28, 2021

  1. Wow. We got this even with LoD Fix/Edit form!!

    @alliomeria gosh. I made it, I think?
    Of course the UI can be improved, and we need to add a LOT of options to avoid "overwriting" all your lovely manual LoD. But gosh. This is so good!!
    DiegoPino committed Jul 28, 2021
    9909ff7
  2. Because MADS RDF is case Sensitive.. add CONST mapping

    And our CSV headers are not (they are lowercased/normalized). Gosh. Well, this seems to work. Last commit for tonight.
    DiegoPino committed Jul 28, 2021
    95922b1
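
The constant mapping described in 95922b1 can be sketched roughly as follows. This is an illustration in Python, not the module's PHP code: the constant name, the helper name, and the small subset of MADS/RDF class names listed are assumptions chosen for the example.

```python
# Hypothetical constant: lowercased CSV header -> case-sensitive MADS/RDF class name.
# Only a small, illustrative subset of MADS/RDF types is listed here.
MADS_TYPE_BY_HEADER = {
    "personalname": "PersonalName",
    "corporatename": "CorporateName",
    "genreform": "GenreForm",
    "geographic": "Geographic",
    "topic": "Topic",
}


def mads_type_for(header):
    """Return the case-sensitive MADS/RDF type for a CSV header, or None if unknown."""
    return MADS_TYPE_BY_HEADER.get(header.strip().lower())
```

The lookup normalizes the incoming header the same way the CSV headers are normalized, so the case-sensitive spelling lives in exactly one place.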

Commits on Jul 30, 2021

  1. 07bd252

Commits on Aug 3, 2021

  1. Header normalization finally done the right way

    @alliomeria I will need you to test this when you can. This is finally working as expected. For DCMNY this initially gives us 714 headers, but the cleanup then reduces that significantly (e.g. for gsmt:icc to 87 or so).
    We now get more data, of course, because children provide their own correct data when not collapsing.
    
    Uff. It was super hard to find my own error, but here is the gist for my future self:
    
    Batch-driven plugins like the SolrImport one provide the headers via their ::getInfo method. Because we do not know until the batch ends how many of those headers will survive the cleanup, we do not set them in the actual config (the config stored in each AMI set), which means we have to pass them to the batch manually!
    
    Good, so good
    DiegoPino committed Aug 3, 2021
    579d4b5

Commits on Aug 4, 2021

  1. use RELS_EXT_isPageNumber_literal_intDerivedFromString_l for page ordering

    On Books with pages
    DiegoPino committed Aug 4, 2021
    36f53a8

Commits on Aug 19, 2021

  1. Does less manual cleaning and trusts fputcsv() more for escaping

    @alliomeria this should fix the double-encoding issue, but
    just to be sure, could you try Solr import 7 with a collection that has complex, unusual characters in its descriptions/titles (maybe NYHS?)
    and also with a Google Sheet using e.g. Japanese plus "", & and ' somewhere?
    
    We can test together tomorrow. The thing is, I removed a lot of code that I had added "for some reason", so I wonder whether I will now break something somewhere else, and thus "testing" is required!
    DiegoPino committed Aug 19, 2021
    fd52fb6
  2. weird left over ;

    DiegoPino committed Aug 19, 2021
    fb8cd17
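
The idea behind fd52fb6 (let the CSV writer do the escaping instead of cleaning values by hand) can be checked with a small round trip. The sketch below uses Python's csv module as a stand-in for PHP's fputcsv(); the function names and sample row are made up for the example.

```python
import csv
import io


def write_rows(rows):
    """Serialize rows to CSV text, letting the csv module do all quoting."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def read_rows(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text)))


# Values are passed through untouched: no manual escaping, no entity encoding.
rows = [
    ["label", "description"],
    ["日本語 \"quoted\"", "Tom & Jerry's, with a comma"],
]
csv_text = write_rows(rows)
round_tripped = read_rows(csv_text)
```

If the values that come back equal the values that went in, nothing was double encoded along the way, which is the property the requested tests with Japanese text, quotes, & and ' are meant to confirm.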

Commits on Sep 2, 2021

  1. Fix some silly select defaults

    I was resetting the array to give this a default value. It turns out I needed the first key, not the first value, and now that our "labels" are capitalized and no longer equal to the value (e.g. Direct vs. direct) I was breaking the Form State. What a mess!
    DiegoPino committed Sep 2, 2021
    8f1cba7
  2. a1472be
  3. Small Improvements to Ami Utility Service

    - Can now handle CSV rows that are either indexed arrays (default) or associative ones.
    - ::createAmiSet() can be passed a name (title/label) via the \stdClass $data using $data->name. For now this is used only by the CSV export, but it would be as easy as it gets to add to the normal AMI multi-step form (next pull?)
    DiegoPino committed Sep 2, 2021
    d70a726
  4. This still needs more work. Webform find and replace breaks Form State

    Mostly the logic generated by \Drupal\views_bulk_operations\Form\ConfigureAction::buildForm, where form_state does not survive the AJAX-triggered dynamic fields that internally have a submit (e.g. OpenStreetMap). So I will mark this as experimental until I find a strange/dark hack to get it rolling. Webform elements, gosh
    DiegoPino committed Sep 2, 2021
    28969c6
  5. Full Export Action to CSV

    Quite smart, may I say! So much with so little code, really. Test it and let me know how it goes
    DiegoPino committed Sep 2, 2021
    84ed6ab
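
On the select-default fix in 8f1cba7 above: the difference between "first value" and "first key" only shows up once labels stop matching values. A minimal sketch in Python, with option keys and labels invented around the "Direct vs. direct" example (the real form's options are not shown in this pull):

```python
# Select options as a form would hold them: submitted value (key) -> human label.
# The concrete keys and labels are illustrative only.
options = {"direct": "Direct", "template": "Template"}

first_label = next(iter(options.values()))  # what the buggy default picked
first_key = next(iter(options))             # what the form state expects


def is_valid_default(default, options):
    """A default is valid only if it is one of the option keys."""
    return default in options
```

With capitalized labels, `first_label` is no longer one of the keys, so using it as the default hands the form a value it cannot match against its own options.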

Commits on Sep 21, 2021

  1. Another pass on making LoD reconciling better

    - This adds new methods for offsetting and paging CSV data (a pain, because pretty-printed JSON has line breaks). Since I was here, I also added options for getting headers (or not) for offset and paged CSV access
    - Adds an actual pager to the Edit Reconciling page
    - None of this is done-done. I now need to change the "saving" algorithm to allow partial/offset saves, probably using the KeyValue already created for the set instead of reading the actual Form data as we did with large full sets. The "this was fixed" checkbox also needs to be added as a new column
    
    More tomorrow @alliomeria !
    DiegoPino committed Sep 21, 2021
    8fe9a34
  2. Now that was a mistake!

    @alliomeria paging is now working again.
    I was doing a wrong "check if this needs a header row" thing and ending up
    without a header row, which the Reconcile form needed. Good!
    DiegoPino committed Sep 21, 2021
    afffc75
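
The two Sep 21 commits touch the same problem: paging CSV by record rather than by physical line, with the header row kept or dropped on request. A rough sketch of that behavior in Python follows; the function name and parameters are assumptions for the example, not the methods added in this pull.

```python
import csv
import io
from itertools import islice


def read_csv_page(text, offset=0, limit=None, with_header=True):
    """Return `limit` data records starting at data record `offset`.

    Records are counted by the CSV parser, not by physical lines, so a cell
    holding pretty-printed JSON with line breaks still counts as one record.
    """
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    stop = None if limit is None else offset + limit
    page = list(islice(reader, offset, stop))
    if with_header and header is not None:
        return [header] + page
    return page


# Record "A" spans three physical lines because of the JSON cell.
sample = 'label,lod\nA,"{\n  ""uri"": ""x""\n}"\nB,{}\nC,{}\n'
page = read_csv_page(sample, offset=1, limit=2)
first = read_csv_page(sample, offset=0, limit=1, with_header=False)
```

Counting lines instead of records would put the page boundary inside the JSON cell of record "A"; counting records keeps every page aligned, and `with_header=True` covers the case the Reconcile form depended on.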

Commits on Sep 23, 2021

  1. Fix direct Ingest

    @patdunlavey not the last pull of this branch (LoD had to be postponed until tonight), but direct ingest should work now. Please give it a spin? (I will do some extra double-checking of the data afterwards, but it should "mostly" work out.)
    DiegoPino committed Sep 23, 2021
    6f51681
  2. a25920d

Commits on Sep 24, 2021

  1. 3758608
  2. 583e6bc

Commits on Sep 25, 2021

  1. Gosh. Restored Twig Template processing

    After dealing with Direct Ingest I broke Twig templates... @patdunlavey, you will have seen this fail. It should be back to its normal behavior now.
    DiegoPino committed Sep 25, 2021
    c59d170

Commits on Sep 27, 2021

  1. a161939
  2. Complete refactor of co-dependent Symfony Services

    It was getting weird to have such cross dependencies. Now each service deals with its own concerns and they complement each other. This changes a lot.
    
    Also:
    - JSON decoding of cells now deals with smart-quote exceptions and tries to be as smart as possible in case of errors. But if the key is one of our core ones (ap: or as:), then on failure we reset the value to NULL. This keeps code from breaking when one of our controlled values, set via a Twig template OR a direct ingest, ends up with wrong data.
    
    @todo Implement JSON Schema validation for our own internal controlled vocabulary (WIP!)
    DiegoPino committed Sep 27, 2021
    bd71514
  3. Fix Batch Size Issue capping to hundreds and not respecting exact row numbers

    @alliomeria when you come back: Solr exact fetch is fixed, and this also fixes the real-time report of the number of rows requested. Tested collapsed/uncollapsed and with offsets, to check that a set of consecutive harvests with offsets was correct and not missing anything. All checks out!
    DiegoPino committed Sep 27, 2021
    f280dad
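
The cell-decoding rule described in bd71514 above can be sketched as follows. This is a Python illustration under stated assumptions, not the refactored service: the function name, the "looks like JSON" check, and the fallback of returning the raw string for non-controlled keys are guesses made for the example. Only the ap:/as: reset-to-NULL behavior comes from the commit message.

```python
import json

# Curly quotes that spreadsheets tend to substitute for straight ones.
SMART_QUOTES = str.maketrans(
    {"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"}
)
# Prefixes of the controlled keys named in the commit message.
CONTROLLED_PREFIXES = ("ap:", "as:")


def decode_cell(key, value):
    """Decode a CSV cell that may hold JSON.

    Tries the raw value first, then a copy with smart quotes straightened.
    If both fail, controlled keys become None; other keys keep the raw string.
    """
    controlled = key.startswith(CONTROLLED_PREFIXES)
    stripped = value.strip()
    looks_like_json = stripped.startswith(("{", "["))
    if not looks_like_json and not controlled:
        return value
    for candidate in (stripped, stripped.translate(SMART_QUOTES)):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None if controlled else value
```

Resetting a controlled key to None gives downstream code one "no usable value" case to handle, rather than a half-parsed string that only fails later.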

Commits on Sep 30, 2021

  1. replace EntityChangedActionDeriver to keep other entities from popping up in the options

    This is basically the same as setting type = "node", but allows us in the future to also let certain Actions run on e.g. AMI Sets
    
    @alliomeria how to test?
    
    - git pull this branch
    - Clear Caches
    - Edit your Search and replace View. Remove the "Global: Views bulk operations" field (instead of installing/reinstalling the module, this seems to be the best option, a.k.a. the TRICK to make actions show again)
    - Add it again. The list should be refreshed. Check the ones you want to use
    - Save the view
    - test!
    DiegoPino committed Sep 30, 2021
    58b0811

Commits on Oct 5, 2021

  1. This one was driving me crazy. Stuck "Direct" for ever

    It turns out I uncommented a line I had commented out back when I used to be smart. Not anymore. If you add a #name property to an AJAX-driven form element with a select, the original value gets cached forever and never changes, even if you submit a different one. How do I know this? Because I tested it. There is NO documentation. I wonder if that is also the issue with the Webform-based find and replace??
    DiegoPino committed Oct 5, 2021
    31e70a7
  2. Consistently use only lower case columns

    Internally our CSV-to-data structure is already normalized to lower case (we need to document this better). So for Solr import, also immediately set the column names to lower case, and in general enforce checking against lowercased names when doing validation. This does not really require any change to Twig templates, because for Twig "data.HoLa" is the same as "data.hola". Just saying @alliomeria
    DiegoPino committed Oct 5, 2021
    9a7aecc

Commits on Oct 12, 2021

  1. Small typo on Docs

    DiegoPino committed Oct 12, 2021
    19ea03b

Commits on Oct 19, 2021

  1. fffc059

Commits on Nov 3, 2021

  1. Make ZIP upload field Private instead of tmp

    All ZIP files will eventually go into private://ami/zip (once the AMI set entity is saved, or when uploaded directly via "Edit" of a Set)
    DiegoPino committed Nov 3, 2021
    d71bb08
  2. New thing for me

    Since we cannot predict the final extension of a download without downloading it, I now use glob() to check for the first part of the possible filename.
    If there is already exactly one file whose name starts with the future name (which is consistent, since we generate it from the URL of the file), then we reuse the already downloaded one. In the future I will also add a button that deletes all files for an AMI set and one that forces the download even if the file is there.
    
    This piece of code also ensures ZIP files (after first upload) are moved into private://ami/zip once the AMI set is saved
    DiegoPino committed Nov 3, 2021
    d9ba478
  3. Files distributed across multiple Queue Entries First pass

    And the first pass works like a charm!
    OK, still need to add:
    1. A config entry to set what counts as "many files" and what does not.
    2. Better reuse of as:technical metadata, so we do not reprocess it (file reuse after download is already in place).
    
    Probably the best thing ever here. @aksm @alliomeria
    DiegoPino committed Nov 3, 2021
    2a470a4
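
The download-reuse check from d9ba478 above (match on the filename prefix, reuse only when exactly one file matches) might look like this. It is a sketch in Python rather than the module's PHP; the function name and the idea of passing the prefix in as a plain string are assumptions for the example.

```python
import glob
import os


def find_reusable_download(directory, name_prefix):
    """Return the path of an already downloaded file, or None.

    The extension is unknown before downloading, so match on the prefix only
    and reuse the file only when exactly one candidate exists.
    """
    pattern = os.path.join(glob.escape(directory), glob.escape(name_prefix) + ".*")
    matches = [path for path in glob.glob(pattern) if os.path.isfile(path)]
    return matches[0] if len(matches) == 1 else None
```

Requiring exactly one match is the conservative choice: with zero matches the file has to be downloaded anyway, and with several there is no safe way to tell which one belongs to the URL.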

Commits on Nov 4, 2021

  1. This looks better

    @alliomeria can you test with the same CSV with 387 objects?
    And please twice? First time may be slow-ish, second time quite fast
    DiegoPino committed Nov 4, 2021
    4581569
  2. New options. Force file processing as separate item can be set per SET

    also reprocessing of cached TECHMD, etc
    DiegoPino committed Nov 4, 2021
    f629a7c

Commits on Nov 10, 2021

  1. Enforces revisions?

    @dmer @patdunlavey can you check this code please, as soon as you can (you know the release process)? I wonder whether you removed the moderation workflows, which enable revisions by default, or whether you have not set revisions at the bundle level. Either way, this will force that.
    DiegoPino committed Nov 10, 2021
    054c668

Commits on Nov 11, 2021

  1. AMI version of Entity Preview for Format Strawberryfield

    Works. Looks good. I'm good. A good person, right @alliomeria and @aksm?
    Revisit CSV offsets once we are on PHP 8.1. I found a PHP BUG!!
    DiegoPino committed Nov 11, 2021
    e62e6ce

Commits on Nov 15, 2021

  1. 7faea54

Commits on Nov 16, 2021

  1. Fix remote CSV load

    Now need to check CSV write!
    DiegoPino committed Nov 16, 2021
    7edc6f8
  2. 88ba765
  3. 2166ce4

Commits on Nov 17, 2021

  1. C'mon ZipArchive WHY! You can not stream from remote

    I have wanted to write the first PHP remote streamer library forever and I never have the time. So for now, for the people who decide to put ALL on S3, we have to download the file; there is no other option.
    So we inject the StrawberryfieldFileMetadataService to reuse its ::ensureFileAvailability function.
    But we cannot delete the file here, since we may need it for the rest of our times.
    DiegoPino committed Nov 17, 2021
    d800b3e

Commits on Nov 18, 2021

  1. Webform based Search and Replace is working!

    But VBO actions are not respecting the Facet selection when using the "Select all" option... We should make sure that is clear? Documentation? Eventually patch VBO, or alter the form and remove Select All completely?
    DiegoPino committed Nov 18, 2021
    2b92b72