Public repository for custom blocks for Omniscope Evo.

7. Create a pull request.

## Table of blocks (Omniscope 2020+)
1. Connectors
    1. Azure
        1. [Azure Data Lake Storage Gen2 Blob](#ConnectorsAzureDataLakeBlob)
    2. Flightstats
        1. [Airports](#ConnectorsFlightstatsAirports)
        2. [Airlines](#ConnectorsFlightstatsAirlines)
        3. [Flights](#ConnectorsFlightstatsFlights)
    3. Overpass
        1. [Street Coordinates](#ConnectorsOverpassStreetCoordinates)
    4. Slack
        1. [Slack API WebClient](#ConnectorsSlackAPIWebClient)
    5. Weather
        1. [OpenWeatherMap](#ConnectorsWeatherOpenWeatherMap)
    6. [Google BigQuery Import Table](#ConnectorsBigQueryGoogleBigQueryImportTable)
    7. [Google BigQuery Custom SQL](#ConnectorsBigQueryGoogleBigQueryCustomSQL)
    8. [Etherscan](#ConnectorsEtherscan)
    9. [XPT Reader](#ConnectorsXPTReader)
    10. [HubSpot](#ConnectorsHubSpot)
    11. [Trello](#ConnectorsTrello)
    12. [Jira](#ConnectorsJira)
    13. [Yahoo Finance](#ConnectorsYahooFinance)
    14. [Flipside](#ConnectorsFlipside)
    15. [Dune](#ConnectorsDune)
2. Preparation
    1. ForEach
        1. [Project Parameters Batch Setting](#PreparationForEachProjectParameters)
        2. [ForEach multi stage](#PreparationForEachForEachMultiStage)
    2. Geo
        1. [Shapefile](#PreparationGeoShapefile)
        2. [Gridsquare](#PreparationGeoGridsquare)
    3. Interfaces
        1. [Kedro](#PreparationInterfacesKedro)
    4. JSON
        1. [JSON Normalise](#PreparationJSONNormalise)
    5. Join
        1. [Interval Join](#PreparationJoinIntervalJoin)
        2. [Inequality Join](#PreparationJoinInequalityJoin)
        3. [Fuzzy Terms Join](#PreparationJoinFuzzyJoin)
    6. Partition
        1. [Partition](#PreparationPartition)
    7. Pivot
        1. [Melt De-pivot](#PreparationPivotMeltDe-pivot)
    8. Standardisation
        1. [Standardise](#PreparationStandardisationStandardise)
    9. Workflow
        1. [For Each (Separate Workflows)](#PreparationForEachForEach)
    10. [Unstack Records](#PreparationUnstackrows)
    11. [Field Renamer](#PreparationFieldRenamer)
    12. [Unescape HTML](#PreparationUnescapeHTML)
    13. [Add row ID field](#PreparationAddrowIDfield)
    14. [URL Encode](#PreparationURLEncode)
    15. [Split Address](#PreparationSplitAddress)
3. Inputs
    1. Databases
        1. [MongoDB](#InputsDatabasesMongoDB)
    2. R
        1. [Rds Batch Append](#InputsRdsBatchAppend)
        2. [R Data Reader](#InputsRdata)
    3. [SFTP Downloader](#InputsSFTPDownloader)
    4. [PDF Reader](#InputsPDFReader)
    5. [Sharepoint Online Downloader](#InputsSharepointOnline)
4. Custom scripts
    1. [Execute Command](#CustomscriptsExecuteCommand)
5. Analytics
    1. Clustering
        1. [Gaussian Mixture Model](#AnalyticsClusteringGMM)
        2. [DBScan](#AnalyticsClusteringDBScan)
        3. [KMeans](#AnalyticsClusteringKMeans)
    2. Network Analysis
        1. [Attribute Analysis](#AnalyticsNetworkAnalysisAttributeAnalysis)
        2. [TSNE](#AnalyticsNetworkAnalysisTSNE)
    3. Prediction
        1. [K-Nearest-Neighbours](#AnalyticsPredictionKNN)
        2. [Support Vector Machine](#AnalyticsPredictionSVM)
    4. Validation
        1. [Model Validation](#AnalyticsValidationModelValidation)
    5. Website
        1. [Website Analysis](#AnalyticsWebsitesWebsiteAnalysis)
    6. [Survival Analysis](#AnalyticsSurvival)
    7. [Data Profiler](#AnalyticsDataProfiler)
6. Outputs
    1. BigQuery
        1. [Google BigQuery Export](#OutputsGoogleBigQueryWriter)
    2. Github
        1. [GitHub](#OutputsGitHub)
    3. PDF
        1. [Multi-tenant Report to PDF](#OutputsReporttoPDFbatchoutput)
        2. [Report tab to PDF](#OutputsReporttabtoPDF)
        3. [Append PDF files](#OutputsAppendPDFfiles)
        4. [Web Image-PDF output](#OutputsWebImage-PDFoutput)
    4. PowerPoint
        1. [Report to PowerPoint](#OutputsReporttoPowerPoint)
    5. Slack
        1. [Slack Bot](#OutputsSlackBot)

## Block Overview

### Azure Data Lake Storage Gen2 Blob

Azure Data Lake Storage Gen2 Blob connector to load a CSV or Parquet blob/file into Omniscope.

[Link to Github page](Connectors/Azure%20Data%20Lake%20Blob)

### Flightstats Airports

Downloads the list of airports provided by flightstats (https://www.flightstats.com). The script needs your flightstats app id and key, which can be obtained either by buying their service or by signing up for a test account.

[Link to Github page](Connectors/Flightstats/Airports)

### Flightstats Airlines

Downloads the list of airlines provided by flightstats (https://www.flightstats.com). The script needs your flightstats app id and key, which can be obtained either by buying their service or by signing up for a test account.

[Link to Github page](Connectors/Flightstats/Airlines)

### Flightstats Flights

Requests information about the flights specified in the input data from flightstats (https://www.flightstats.com). If a flight exists, the result contains live information about it; otherwise it is omitted from the output. The script needs your flightstats app id and key, which can be obtained either by buying their service or by signing up for a test account.

[Link to Github page](Connectors/Flightstats/Flights)

### Overpass Street Coordinates

Finds all streets matching a given street name and requests multiple coordinates along each street using data from the Overpass API. A row is created for every point that is part of a street matching the given name; each row includes the street name, the street ID and the coordinates of the point. The script needs an input with a field containing the street name.

[Link to Github page](Connectors/Overpass/Street%20Coordinates)

### Slack API WebClient

Allows you to call public Slack endpoints.

[Link to Github page](Connectors/Slack%20API%20WebClient)

### OpenWeatherMap

Retrieves current weather and forecasts from OpenWeatherMap.

[Link to Github page](Connectors/Weather/OpenWeatherMap)

### Google BigQuery Import Table

Imports a table from Google BigQuery.

[Link to Github page](Connectors/BigQuery/Google%20BigQuery%20Import%20Table)

### Google BigQuery Custom SQL

Executes a SQL query on Google BigQuery and imports the query results.

[Link to Github page](Connectors/BigQuery/Google%20BigQuery%20Custom%20SQL)

### Etherscan

Connector for Etherscan, the Ethereum blockchain explorer.

[Link to Github page](Connectors/Etherscan)

### XPT Reader

Reads a SAS Transport *xpt* file, extracting a dataset.

[Link to Github page](Connectors/XPT%20Reader)

### HubSpot

Retrieves contacts, companies, deals and lists.

[Link to Github page](Connectors/HubSpot)

### Trello

Retrieves boards, lists and cards, and allows you to search in Trello.

[Link to Github page](Connectors/Trello)

### Jira

Retrieves projects and issues from Jira.

[Link to Github page](Connectors/Jira)

### Yahoo Finance

Fetches price data for tickers from Yahoo Finance.

[Link to Github page](Connectors/YahooFinance)

### Flipside

Executes a SQL query on Flipside and retrieves the resulting blockchain data.

[Link to Github page](Connectors/Flipside)

### Dune

Executes queries and retrieves blockchain data from any public query on dune.com, as well as any private queries your Dune account has access to.

[Link to Github page](Connectors/Dune)

### Project Parameters Batch Setting

Sets project parameter values in batch.

[Link to Github page](Preparation/ForEach/ProjectParameters)

### ForEach multi stage

The ForEach multi stage block orchestrates the execution of another Omniscope project, running its workflow multiple times, each time with a different set of parameter values. Unlike the ForEach block, it allows multiple stages of execution, executing/refreshing from source a different set of blocks in each stage.

[Link to Github page](Preparation/ForEach/ForEachMultiStage)

### Shapefile

Matches regions defined in a shapefile with geographical points given by latitude and longitude.

[Link to Github page](Preparation/Geo/Shapefile)

### Gridsquare

Converts gridsquare / Maidenhead locators.

[Link to Github page](Preparation/Geo/Gridsquare)

### Kedro

Interfaces with Kedro workflows.

[Link to Github page](Preparation/Interfaces/Kedro)

### JSON Normalise

Normalises semi-structured JSON strings into a flat table, appending data record by record.

[Link to Github page](Preparation/JSON/Normalise)

### Interval Join

Performs a join between values in the first input and intervals in the second input. Rows are joined if the value is contained in an interval.

[Link to Github page](Preparation/Join/Interval%20Join)

### Inequality Join

Performs a join between the first (left) and second (right) input. The join can be performed using the equality/inequality comparators ==, <=, >=, < and >, so the result is a constrained Cartesian join containing all records that satisfy the inequalities.

[Link to Github page](Preparation/Join/Inequality%20Join)

### Fuzzy Terms Join

Performs a join between the first (left) and second (right) input. The field on which the join is performed must be text containing multiple terms. The result contains joined records based on how many terms they share, weighted by inverse document frequency.

[Link to Github page](Preparation/Join/Fuzzy%20Join)

### Partition

Partitions the data into chunks of the desired size. A new field called "Partition" contains a number unique to each partition.

[Link to Github page](Preparation/Partition)

### Melt De-pivot

Keeps all selected fixed fields in the output and de-pivots all other fields.

[Link to Github page](Preparation/Pivot/Melt%20De-pivot)

### Standardise

Standardises the values in the selected fields so that they lie in the range between 0 and 1: the highest value in each field becomes 1, the lowest becomes 0, and all other values are scaled proportionally.

[Link to Github page](Preparation/Standardisation/Standardise)

### For Each (Separate Workflows)

Executes another Omniscope project multiple times, each time with a different set of parameter values.

[Link to Github page](Preparation/ForEach/ForEach)

### Unstack Records

Unstacks all records by splitting text fields with stacked values, filling records with empty strings where needed.

[Link to Github page](Preparation/Unstack%20rows)

### Field Renamer

Renames the fields of a data set given a list of current names and new names.

[Link to Github page](Preparation/Field%20Renamer)

### Unescape HTML

Converts all named and numeric character references to the corresponding Unicode characters.

[Link to Github page](Preparation/Unescape%20HTML)

### Add row ID field

Adds a Row ID field with a sequential number.

[Link to Github page](Preparation/Add%20row%20ID%20field)

### URL Encode

URL-encodes strings in a field using the UTF-8 encoding scheme.

[Link to Github page](Preparation/URL%20Encode)

### Split Address

Splits an address field into street name, street number and suffix.

[Link to Github page](Preparation/Split%20Address)

### MongoDB

A connector for MongoDB.

[Link to Github page](Inputs/Databases/MongoDB)

### Rds Batch Append

Reads multiple rds files, either from an upstream block or from a folder, and appends them.

[Link to Github page](Inputs/Rds%20Batch%20Append)

### R Data Reader

Reads R data (Rdata) files, extracting the datasets they contain.

[Link to Github page](Inputs/Rdata)

### SFTP Downloader

Downloads files from an SFTP server folder.

[Link to Github page](Inputs/SFTP%20Downloader)

### PDF Reader

Extracts text from PDF files.

[Link to Github page](Inputs/PDF%20Reader)

### Sharepoint Online Downloader

Downloads a file from a Sharepoint Online site.

[Link to Github page](Inputs/Sharepoint%20Online)

### Execute Command

Executes a system command.

[Link to Github page](Custom%20scripts/ExecuteCommand)

### Gaussian Mixture Model

Performs GMM clustering on the first input data provided. The output consists of the original input with a Cluster field appended. If a second input is available, it will be used as the output instead.

[Link to Github page](Analytics/Clustering/GMM)

### DBScan

Performs DBScan clustering on the first input data provided. The output consists of the original input with a Cluster field appended. If a second input is available, it will be used as the output instead.

[Link to Github page](Analytics/Clustering/DBScan)

### KMeans

Performs KMeans clustering on the first input data provided. The output consists of the original input with a Cluster field appended. If a second input is available, it will be used as the output instead.

[Link to Github page](Analytics/Clustering/KMeans)

### Attribute Analysis

Given a dataset in which each record represents an edge between two nodes of a network, and each node has an associated categorical attribute, the block analyses connections between attributes based on the connections between the associated nodes. The result of the analysis is a list of records in which each record specifies a connection from one attribute to another. The connection contains a probability field, which answers the question: if a node has the specified categorical attribute, how probable is it that it has a connection to another node with the linked categorical attribute?

[Link to Github page](Analytics/Network%20Analysis/Attribute%20Analysis)

### TSNE

Given a dataset in which each record represents an edge between two nodes of a network, the block projects all the nodes onto a low-dimensional (e.g. 2-dimensional) plane in such a way that nodes which share many connections are close together, and nodes that do not share many connections are far apart.

[Link to Github page](Analytics/Network%20Analysis/TSNE)

### K-Nearest-Neighbours

Performs k-nearest-neighbour prediction on the data. The prediction for a new point depends on the k nearest neighbours around that point; the majority class among them is used as the prediction.

[Link to Github page](Analytics/Prediction/KNN)

### Support Vector Machine

Predicts classes of new data from old data by drawing a boundary between two classes, where the margin around the boundary is made as large as possible without touching the points.

[Link to Github page](Analytics/Prediction/SVM)

### Model Validation

Computes a confusion matrix as well as model validation statistics.

[Link to Github page](Analytics/Validation/Model%20Validation)

### Website Analysis

Extracts the structure and content of a website and its pages.

[Link to Github page](Analytics/Websites/Website%20Analysis)

### Survival Analysis

Computes an estimate of a survival curve for truncated and/or censored data using the Kaplan-Meier or Fleming-Harrington method.

[Link to Github page](Analytics/Survival)

### Data Profiler

Provides detailed statistics about a dataset.

[Link to Github page](Analytics/Data%20Profiler)

### Google BigQuery Export

Writes data to a Google BigQuery table. The table can be created or replaced, or records can be appended to an existing table.

[Link to Github page](Outputs/Google%20BigQuery%20Writer)

### GitHub

Reads data from and writes data to GitHub.

[Link to Github page](Outputs/GitHub)

### Multi-tenant Report to PDF

Prints Report tabs to PDF files for each record of the input data.

[Link to Github page](Outputs/Report%20to%20PDF%20batch%20output)

### Report tab to PDF

Prints Report tabs to PDF files for each record of the input data.

[Link to Github page](Outputs/Report%20tab%20to%20PDF)

### Append PDF files

Appends multiple PDF files, combining them into one PDF file.

[Link to Github page](Outputs/Append%20PDF%20files)

### Web Image-PDF output

Grabs screenshots of webpages, optionally producing a PDF document.

[Link to Github page](Outputs/Web%20Image-PDF%20output)

### Report to PowerPoint

Exports a Report to a PowerPoint pptx file.

[Link to Github page](Outputs/Report%20to%20PowerPoint)

### Slack Bot

Posts messages on a channel.

[Link to Github page](Outputs/Slack%20Bot)