PySyft enables a new way to do data science, where you can use non-public information without seeing or obtaining a copy of the data itself. All you need is to connect to a Datasite!
Datasites are like websites, but for data. Designed with the principles of structured transparency, they enable data owners to control how their data is protected and data scientists to use data without obtaining a copy.
PySyft supports any statistical analysis or machine learning workflow by running Python code directly, including code that uses third-party Python libraries.
✅ Linux ✅ macOS ✅ Windows ✅ Docker ✅ Kubernetes
Try out your first query against a live demo Datasite!
pip install -U "syft[data_science]"
More instructions are available here.
Launch a development server directly in your Jupyter Notebook:
import syft as sy
sy.requires(">=0.9.2,<0.9.3")
server = sy.orchestra.launch(
    name="my-datasite",
    port=8080,
    create_producer=True,
    n_consumers=1,
    dev_mode=False,
    reset=True,  # resets database
)
or from the command line:
$ syft launch --name=my-datasite --port=8080 --reset=True
Starting syft-datasite server on 0.0.0.0:8080
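Once the server reports that it has started, it can be useful to confirm that the port actually accepts connections before trying to log in. The following is a minimal sketch using only the Python standard library; the helper name `is_listening` is hypothetical and not part of PySyft, and it assumes the server runs locally on port 8080 as in the launch examples above.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the development server was launched on this machine, port 8080.
    print(is_listening("127.0.0.1", 8080))
```

This only checks that something is listening on the port; it does not verify that the listener is a Datasite server.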
Datasite servers can be deployed as a single container using Docker or directly in Kubernetes. Check out our deployment guide.
The main way to use a Datasite is via our Syft client, in a Jupyter Notebook. Check out our PySyft client guide:
import syft as sy
sy.requires(">=0.9.2,<0.9.3")
datasite_client = sy.login(
    port=8080,
    email="[email protected]",
    password="changethis",
)
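Both snippets above pin the client to the range `>=0.9.2,<0.9.3` through `sy.requires`. As a rough illustration of what such a specifier means, here is a small standard-library sketch. It is not how PySyft implements `sy.requires`; the functions `parse_version` and `satisfies` are hypothetical, and the sketch only handles plain numeric versions such as `0.9.2` (no pre-release tags).

```python
def parse_version(text: str) -> tuple:
    """Turn '0.9.2' into (0, 9, 2). Only plain numeric releases are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec such as '>=0.9.2,<0.9.3'."""
    # Two-character operators come first so '>=' is not mistaken for '>'.
    operators = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in operators.items():
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not compare(current, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


if __name__ == "__main__":
    for candidate in ("0.9.1", "0.9.2", "0.9.3"):
        print(candidate, satisfies(candidate, ">=0.9.2,<0.9.3"))
```

Under this reading, only `0.9.2` (and patch-level variants below `0.9.3`) falls inside the pinned range.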
Learn about PySyft via our getting started guide:
- PySyft from the ground up
- Part 1: Datasets & Assets
- Part 2: Client and Datasite Access
- Part 3: Propose the research study
- Part 4: Review Code Requests
- Part 5: Retrieving Results
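The five parts listed above follow a request-and-approval workflow. The sketch below is a toy, in-memory illustration of that flow in plain Python. It does not use the PySyft API: `ToyDatasite`, its methods, and the sample numbers are all hypothetical and exist only to show the idea that submitted code runs on private data only after the data owner approves it.

```python
from typing import Callable, Dict, List, Set, Tuple


class ToyDatasite:
    """Hypothetical in-memory stand-in for a Datasite (not PySyft code)."""

    def __init__(self) -> None:
        self._private: Dict[str, List[float]] = {}
        self.mock: Dict[str, List[float]] = {}
        self._requests: Dict[int, Tuple[str, Callable]] = {}
        self._approved: Set[int] = set()

    # Part 1: the data owner uploads an asset with a private and a mock version.
    def upload(self, name: str, private: List[float], mock: List[float]) -> None:
        self._private[name] = private
        self.mock[name] = mock

    # Part 3: the data scientist proposes code to run on a named asset.
    def submit(self, asset: str, func: Callable[[List[float]], float]) -> int:
        request_id = len(self._requests) + 1
        self._requests[request_id] = (asset, func)
        return request_id

    # Part 4: the data owner reviews the request and approves it.
    def approve(self, request_id: int) -> None:
        if request_id not in self._requests:
            raise KeyError(request_id)
        self._approved.add(request_id)

    # Part 5: only approved requests are executed on the private data.
    def result(self, request_id: int) -> float:
        if request_id not in self._approved:
            raise PermissionError("request has not been approved")
        asset, func = self._requests[request_id]
        return func(self._private[asset])


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


if __name__ == "__main__":
    site = ToyDatasite()
    site.upload("ages", private=[30.0, 45.0, 60.0], mock=[20.0, 40.0, 60.0])
    # Part 2: the data scientist can only inspect the mock data directly.
    print(mean(site.mock["ages"]))
    request_id = site.submit("ages", mean)
    site.approve(request_id)
    print(site.result(request_id))
```

The data scientist never touches `_private` directly; the only path to a real result goes through `submit`, `approve`, and `result`.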
📚 Check out our docs website.
Quick PySyft components links:
In a variety of domains across society, data owners have valid concerns about the risks associated with sharing their data, such as legal risks, privacy invasion (misusing the data), or loss of intellectual property (copying and redistributing it).
Datasites enable data scientists to answer questions without even seeing or acquiring a copy of the data, within the data owner's definition of acceptable use. We call this process Remote Data Science.
This means that the current risks of sharing information with someone no longer have to prevent the vast benefits, such as innovation, insights, and scientific discovery. With each Datasite, data owners can make 1000x more data accessible in each scientific field and lead, together with data scientists, breakthrough innovation.
Learn more about our work on our website.
For questions about PySyft, reach out via #support on Slack.
❗ PySyft and Syft Server must use the same version.
Latest Stable
- 0.9.2 (Stable) - Docs
- Install PySyft (Stable): pip install -U syft
Latest Beta
- 0.9.3 (Beta) - dev branch
- Install PySyft (Beta): pip install -U syft --pre
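Because the client and server must run the same version, it can help to check which release actually got installed. The sketch below uses the standard library's `importlib.metadata`; the helper names `installed_version` and `versions_match` are hypothetical, and the server version is assumed to be known from elsewhere (for example, from the server's own logs or deployment configuration).

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def versions_match(client_version: Optional[str], server_version: str) -> bool:
    """True only when the client version is known and equals the server version."""
    return client_version is not None and client_version == server_version


if __name__ == "__main__":
    client = installed_version("syft")
    # Assumption: the server version was obtained separately.
    server = "0.9.2"
    print(client, versions_match(client, server))
```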
Find more about previous releases here.
Supported by the OpenMined Foundation, the OpenMined Community is an online network of over 17,000 technologists, researchers, and industry professionals keen to unlock 1000x more data in every scientific field and industry.
OpenMined and Syft appreciate all contributors. If you would like to fix a bug or suggest a new feature, please reach out via GitHub or Slack!
OpenMined is a non-profit foundation creating technology infrastructure that helps researchers get answers from data without needing a copy or direct access. Our community of technologists is building Syft.
Apache License 2.0
Person icons created by Freepik - Flaticon