Releases: moj-analytical-services/splink
v4.0.5
What's Changed
- add EMA use case by @RobinL in #2468
- Change name of second __splink__cluster_count_row_numbered query, prevent table name conflict by @browo097302 in #2447
- Add iteration number to `neighbours_filtered` table by @ADBond in #2470
- Fix docs examples by @ADBond in #2471
- Docs - correct heading and link text by @ADBond in #2472
- Simplify Altair import by @ADBond in #2479
- Specify version range for `pytest-cov` in CI by @ADBond in #2489
- Compare two records - allow dataframes to be registered by @RobinL in #2493
- 4.0.5 release by @RobinL in #2495
Full Changelog: v4.0.4...v4.0.5
v4.0.4
What's Changed
- Handle threshold_match_probability 0 in predict() #2420 by @browo097302 in #2425
- Take converged clusters out of play by @RobinL in #2436
- Fix clustering in linky jobs with source dataset column on Postgres by @ADBond in #2444
- Cluster multiple thresholds v2 by @RobinL in #2437
- Used .blocking_rule_sql property in match_weights_interactive_history_chart() by @browo097302 in #2446
- restore pretty print of SplinkDataFrame by @RobinL in #2450
- 2440 add docstring to customrule by @RobinL in #2452
- Cluster multiple add stats by @RobinL in #2453
- Score missing intra-cluster edges by @ADBond in #2442
- Fix cluster studio docstring by @ADBond in #2455
- Docs cleanup by @Thomas-Hirsch in #2460
- Fix profile charts issue by @RobinL in #2466
- 4.0.4 release by @RobinL in #2467
New Contributors
- @browo097302 made their first contribution in #2425
Full Changelog: v4.0.3...v4.0.4
v4.0.3
v4.0.2
What's Changed
- Fix performance issue with exploding blocking rules by @RobinL in #2385
- Add cookbook to examples by @RobinL in #2388
- fix docs by @RobinL in #2389
- Create llm prompt by @RobinL in #2366
- 2351 fix spark sampling by @aymonwuolanne in #2390
- Improve number formatting and descriptions on match weight charts by @RobinL in #2392
- add labelling tool by @RobinL in #2393
- Fix ColumnsReversedLevel by @RobinL in #2395
- Add `is_in_level` and `compute_comparison_vector_value` testing functions to internals by @RobinL in #2396
- Migrate tests of comparisons and comparison levels to new testing framework by @RobinL in #2397
- Add AbsoluteDifferenceLevel by @RobinL in #2398
- TimeDifference docstring by @RobinL in #2400
- More levels docstrings by @RobinL in #2401
- add dates docs by @RobinL in #2402
- Better docstrings by @RobinL in #2404
- Add cosine similarity comparison level and comparison by @RobinL in #2405
- add gov transformation mag link by @RobinL in #2406
- Add cosine similarity tests and allow schemad data by @RobinL in #2407
- Consistency in usage of sql_dialect, sql_dialect_str, sqlglot_dialect by @RobinL in #2391
- ArraySubset comparison level by @RobinL in #2416
- Interactive comparison notebook by @RobinL in #2417
- 4.0.2 release by @RobinL in #2418
Full Changelog: v4.0.1...v4.0.2
v4.0.1
What's Changed
- Bias blog by @ericakane-moj in #2279
- Fix bug in Postgres example by @fhightower in #2352
- Added new use case to index.md by @AnthonyTacquet in #2363
- Fixing issue with read-only filesystems by @RossHammer in #2357
- Update changelog by @ADBond in #2370
- avoid attempting to cast `Infinity` to double for spark backend by @bkitej-rw in #2372
- Fix Spark 'InfinityD' bug by @ADBond in #2374
- Support duckdbpyrelation as input type by @RobinL in #2375
- Bump actions/download-artifact from 3 to 4.1.7 in /.github/workflows by @dependabot in #2377
- Splink datasets - simplify + restructure by @ADBond in #2378
- Fix docs reference for renamed class by @ADBond in #2380
- Update upload-artifact version in docs CI by @ADBond in #2381
- Allow specific m and u probabilities to be fixed during training by @RobinL in #2379
- Allow all charts to be generated as a dict by @RossHammer in #2361
- Splink 401 release by @RobinL in #2386
New Contributors
- @probjects made their first contribution in #2172
- @DavidFrenchSG made their first contribution in #2204
- @astimoore made their first contribution in #2229
- @dkaufman-rc made their first contribution in #2240
- @ericakane-moj made their first contribution in #2277
- @bnm3k made their first contribution in #2342
- @fhightower made their first contribution in #2352
- @AnthonyTacquet made their first contribution in #2363
- @RossHammer made their first contribution in #2357
- @bkitej-rw made their first contribution in #2372
Full Changelog: v4.0.0...v4.0.1
v4.0.0
See https://moj-analytical-services.github.io/splink/blog/2024/07/24/splink-400-released.html for the release announcement.
v4.0.0.dev9
What's Changed
- Comparison that has tf adjustments = True properly accounts for column expressions by @RobinL in #2267
- Adjust package top level imports by @ADBond in #2269
- Evaluation docstrings by @RobinL in #2271
- Remove broken EM training options by @ADBond in #2272
- Restore lat-long SQL test by @ADBond in #2273
- Consistent `db_api` argument name by @ADBond in #2278
- Turn off previously configured options by @ADBond in #2276
- Remove jan 1st option from date of birth comparison by @RobinL in #2281
- update release blog by @RobinL in #2284
- Small fixes by @ADBond in #2285
- Update Splink 4 docs by @ADBond in #2283
- update version by @RobinL in #2286
Full Changelog: v4.0.0.dev8...v4.0.0.dev9
Splink 4 dev 8
What's Changed
- Docs links by @RobinL in #2237
- Cherrypick various patches to master by @RobinL in #2241
- Update docstrings splink4 by @RobinL in #2246
- as spark dataframe in docs by @RobinL in #2247
- More docstrings by @RobinL in #2248
- Docstrings 3 by @RobinL in #2250
- Restore spark test mark by @ADBond in #2253
- add note about excludedocs by @RobinL in #2256
- Del accidentally committed testing script by @RobinL in #2258
- Splink 4 release blog v1 by @RobinL in #2235
- Find biggest block by @RobinL in #2260
- Blocking tutorial by @RobinL in #2262
- prevent integer overflow by @RobinL in #2263
- Remove clustering pairwise output format by @ADBond in #2264
- improve blocking below thres by @RobinL in #2265
- splink 4 dev8 release by @RobinL in #2266
Full Changelog: v4.0.0.dev7...v4.0.0.dev8
Dev 7
What's Changed
- Update docs for Splink4 by @RobinL in #2203
- Update comparison template library by @RobinL in #2214
- Further splink4 docs work by @RobinL in #2215
- Move comparison helpers by @RobinL in #2216
- Restore dev guides by @RobinL in #2217
- add back tags by @RobinL in #2218
- Splink4 docs: fix more links by @RobinL in #2225
- Athena linker splink4 migration by @RobinL in #2226
- Athena linker migration 2 by @RobinL in #2227
- Restore Athena example to docs by @RobinL in #2228
- Block to IDs by @RobinL in #2231
- dev7 release by @RobinL in #2236
Full Changelog: v4.0.0.dev6...v4.0.0.dev7
v3.9.15
What's Changed
- Document first-time developer setup, add conda option by @zmbc in #2083
- fix links by @RobinL in #2097
- Add dirty reload for much faster updates by @RobinL in #2096
- Add documentation for spellchecker and spellcheck docs by @zslade in #2025
- Add graph definition to docs by @zslade in #1979
- Minor fixes to spellchecker by @zslade in #2113
- Changing args as kwargs by @jlb52 in #2116
- Update threshold_selection_tool.json by @aalexandersson in #2120
- Fix broken link by @samnlindsay in #2098
- added tf_minimum_u_value to as_dict method by @aymonwuolanne in #2122
- Fix a bug in conda script and make minor improvements to quickstart by @zmbc in #2125
- Fix documentation Github Action for forks by @zmbc in #2126
- Add better check for whether conda is already installed by @zmbc in #2130
- Update PULL_REQUEST_TEMPLATE.md with spellchecker tick box by @zslade in #2128
- Clusters topic guide by @zslade in #1883
- Splink blog March 2024: Splink 3 update and Splink 4 development announcement by @RobinL in #2081
- Fix link to linter by @RobinL in #2121
- add probabilistic section to graphs definitions by @RossKen in #2137
- Update PULL_REQUEST_TEMPLATE.md by @zslade in #2138
- Minor bug in filtering predict table by @samnlindsay in #2152
- Update documentation on settings validation in response to code changes by @ThomasHepworth in #2149
- Remove reference to github action that will not come to be by @zslade in #2163
- Fixing spurious error messages with Databricks enable_splink by @aymonwuolanne in #2159
- Fix Splink 4 blog post link by @probjects in #2172
- Make spellcheck work cross-platform by @zmbc in #2131
- add marie curie by @RobinL in #2201
- Fix bug giving warning messages in term_frequencies.py by @DavidFrenchSG in #2204
- Fix lint by @RobinL in #2205
- Improve performance of SQL generation by using deepcopy less by @RobinL in #2212
- 3.9.15 release by @RobinL in #2213
New Contributors
- @zmbc made their first contribution in #2083
- @jlb52 made their first contribution in #2116
- @aalexandersson made their first contribution in #2120
- @probjects made their first contribution in #2172
- @DavidFrenchSG made their first contribution in #2204
Full Changelog: v3.9.14...v3.9.15