Febrl's data set generator mod
yipeng/dsgen-big
# This is a modification to Febrl's data set generator to facilitate generation
# of large (gigabyte) sets. See "generate_bigdata.py".
#
# Yipeng Huang, Feb 22 2012
# -------------------
# My changes allow dsgen to produce sizable datasets that exceed memory
# constraints. It writes records directly to disk, and generates a
# proportional number of duplicates at regular intervals of a million original
# records (approx 100 MB). The catch is that the output is not randomly sorted
# even for small files. You should run another script if you need sorted data.
# -------------------
# =============================================================================
# AUSTRALIAN NATIONAL UNIVERSITY OPEN SOURCE LICENSE (ANUOS LICENSE)
# VERSION 1.3
#
# The contents of this file are subject to the ANUOS License Version 1.3
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at:
#
#   https://sourceforge.net/projects/febrl/
#
# Software distributed under the License is distributed on an "AS IS"
# basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
# the License for the specific language governing rights and limitations
# under the License.
#
# The Original Software is: "generate.py"
#
# The Initial Developer of the Original Software is:
#   Dr Peter Christen (Department of Computer Science, Australian National
#                      University)
#
# Copyright (C) 2002 - 2011 the Australian National University and
# others. All Rights Reserved.
#
# Contributors:
#
# Alternatively, the contents of this file may be used under the terms
# of the GNU General Public License Version 2 or later (the "GPL"), in
# which case the provisions of the GPL are applicable instead of those
# above. The GPL is available at the following URL: http://www.gnu.org/
# If you wish to allow use of your version of this file only under the
# terms of the GPL, and not to allow others to use your version of this
# file under the terms of the ANUOS License, indicate your decision by
# deleting the provisions above and replace them with the notice and
# other provisions required by the GPL. If you do not delete the
# provisions above, a recipient may use your version of this file under
# the terms of any one of the ANUOS License or the GPL.
# =============================================================================
#
# Freely extensible biomedical record linkage (Febrl) - Version 0.4.1
#
# See: http://datamining.anu.edu.au/linkage.html
#
# =============================================================================
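
The note at the top of this header describes writing records straight to disk and emitting a proportional batch of duplicates after every block of original records. The following is a minimal Python 3 sketch of that idea, not the code in "generate_bigdata.py". The function names `generate_streaming` and `corrupt`, the two-column layout, the placeholder surname list, and the `rec-N-org` / `rec-N-dup-M` identifiers are all assumptions made for illustration.

```python
import csv
import random

# Placeholder values, assumed for illustration only.
NAMES = ["smith", "nguyen", "williams", "brown", "taylor"]


def corrupt(value, rng):
    """Introduce one typo by deleting a random character (deliberately simple)."""
    if len(value) < 2:
        return value
    pos = rng.randrange(len(value))
    return value[:pos] + value[pos + 1:]


def generate_streaming(path, num_originals, dup_ratio,
                       chunk_size=1_000_000, seed=42):
    """Write originals block by block, each block followed by its duplicates.

    Only the current block is held in memory, so peak memory depends on
    chunk_size rather than on num_originals.
    Returns (originals_written, duplicates_written).
    """
    rng = random.Random(seed)
    num_org = num_dup = 0
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["rec_id", "surname"])
        for start in range(0, num_originals, chunk_size):
            end = min(start + chunk_size, num_originals)
            chunk = []
            for i in range(start, end):
                surname = rng.choice(NAMES)
                chunk.append((i, surname))
                # Each original goes to disk as soon as it is created.
                writer.writerow([f"rec-{i}-org", surname])
            num_org += len(chunk)
            # Duplicates for this block, proportional to the block size.
            dup_counts = {}
            for _ in range(round(len(chunk) * dup_ratio)):
                i, surname = rng.choice(chunk)
                n = dup_counts.get(i, 0)
                dup_counts[i] = n + 1
                writer.writerow([f"rec-{i}-dup-{n}", corrupt(surname, rng)])
                num_dup += 1
    return num_org, num_dup


if __name__ == "__main__":
    print(generate_streaming("sample.csv", 25, 0.2, chunk_size=10))
```

Because each block of originals is followed immediately by its duplicates, the output file is ordered by block rather than shuffled, which matches the caveat in the note above.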