Add support for merged dataset geojson format on endpoint `/api/v1/data/<form_id>` (#2608)

* add merged dataset geojson format on endpoint /api/v1/data/<form_id>

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* mark flaky test

* set max_runs for flaky test

* update flaky test max run

* fix typo

* enhance test case

* add disclaimer for merged datasets docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* update flaky test max_runs

* update flaky test max_runs
kelvin-muchiri authored Jun 5, 2024
1 parent c3a3f1b commit 32663a1
Showing 9 changed files with 135 additions and 116 deletions.
38 changes: 24 additions & 14 deletions docs/merged-datasets.rst
@@ -1,6 +1,15 @@
Merged Datasets
***************

.. warning:: **Disclaimer: Experimental Feature**

This feature is experimental. As a result, users may encounter bugs, glitches, or unexpected behavior. While we have taken steps to ensure a stable experience, some functionality may not work as intended.

Your feedback is invaluable in helping us improve this feature. Please report any issues or provide suggestions to help us enhance the final version.

Use this feature at your own discretion and be prepared for potential interruptions or performance inconsistencies.


This endpoint provides access to data from multiple forms. Merged datasets should have the same functionality as the forms endpoint with the difference being:

- They do not accept submissions directly, submissions to individual forms will be reflected in merged datasets..
@@ -138,42 +147,43 @@ Response


Retrieving Data from a Merged Dataset
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Returns the data from both forms. The key `_xform_id_string` can be used to
differentiate data from linked forms.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Returns the data from all linked forms.

.. raw:: html

<pre class="prettyprint">
<b>GET</b> /api/v1/merged-datasets/<code>{pk}</code>/data
<b>GET</b> /api/v1/data/<code>{pk}</code>
</pre>
<pre class="prettyprint"><b>GET</b> /api/v1/merged-datasets/{pk}/data</pre>

::

curl -X GET "https://api.ona.io/api/v1/merged-datasets/1/data"
curl -X GET "https://api.ona.io/api/v1/data/1"
Example
-------

::

curl -X GET "https://api.ona.io/api/v1/merged-datasets/1/data"

Example Response
----------------
::
Response
--------

::

[
{"date": "2015-05-19", "gender": "male", "age": 32, "name": "Kendy", "_xform_id_string": "form_a"},
{"date": "2015-05-19", "gender": "female", "age": 41, "name": "Maasai", "_xform_id_string": "form_b"},
{"date": "2015-05-19", "gender": "male", "age": 21, "name": "Tom", "_xform_id_string": "form_c"}
]


For data pagination and advanced filtering options, use endpoint `/api/v1/data/{pk} <https://github.com/onaio/onadata/blob/cc188e5c83caea78421a5a68093789b64265017b/docs/data.rst#get-json-list-of-data-end-points>`_

How data in parent forms differs from and affects the merged xform
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A merged dataset combines data from multiple forms into one form. It creates a new form structure from the intersection of the fields in the forms being merged.

A merged dataset:
- Does not allow submissions or data edits, this can only be done on the individual forms.
- Data deleted from the individual forms will also not be present in the mereged dataset.
- Data deleted from the individual forms will also not be present in the merged dataset.
- Form replacement is not supported.
- It has it's own form structure, which is not replaceable the same way you could replace an individual form when changing certain aspects of a form.
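
Note on the endpoint named in the commit title: a minimal sketch of how a client might build the GeoJSON URL for a merged dataset. It assumes the `.geojson` format suffix applies to `/api/v1/data/{pk}` for merged datasets; the host is taken from the curl examples in the docs, and `merged_dataset_geojson_url` is a hypothetical helper, not part of onadata.

```python
from urllib.parse import urljoin

# Host taken from the curl examples in docs/merged-datasets.rst.
BASE_URL = "https://api.ona.io"


def merged_dataset_geojson_url(pk, base_url=BASE_URL):
    """Build the data endpoint URL for a merged dataset in GeoJSON format.

    Assumption: a ".geojson" suffix selects the GeoJSON renderer.
    """
    return urljoin(base_url, f"/api/v1/data/{pk}.geojson")


print(merged_dataset_geojson_url(1))
```

The function only builds the URL; sending the request and handling authentication are left to the caller.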
47 changes: 46 additions & 1 deletion onadata/apps/api/tests/viewsets/test_data_viewset.py
@@ -3707,6 +3707,51 @@ def test_data_paginated_past_threshold(self):
'<http://testserver/?page=4&page_size=1>; rel="last"',
)

+    def test_merged_dataset(self):
+        """Data for merged dataset is returned"""
+        merged_xf = self._create_merged_dataset(make_submissions=True)
+        view = DataViewSet.as_view({"get": "list"})
+        request = self.factory.get("/", **self.extra)
+        response = view(request, pk=merged_xf.pk)
+        self.assertEqual(response.status_code, 200)
+        self.assertEqual(len(response.data), 2)
+
+    def test_merged_dataset_geojson(self):
+        """Merged dataset geojson works"""
+        merged_xf = self._create_merged_dataset(make_submissions=True)
+        view = DataViewSet.as_view({"get": "list"})
+        request = self.factory.get("/", **self.extra)
+        response = view(request, pk=merged_xf.pk, format="geojson")
+        self.assertEqual(response.status_code, 200)
+        # we get correct content type
+        headers = dict(response.items())
+        self.assertEqual(headers["Content-Type"], "application/geo+json")
+        instance_qs = Instance.objects.all().order_by("pk")
+        self.assertEqual(
+            {
+                "type": "FeatureCollection",
+                "features": [
+                    {
+                        "type": "Feature",
+                        "geometry": None,
+                        "properties": {
+                            "id": instance_qs[0].pk,
+                            "xform": instance_qs[0].xform.pk,
+                        },
+                    },
+                    {
+                        "type": "Feature",
+                        "geometry": None,
+                        "properties": {
+                            "id": instance_qs[1].pk,
+                            "xform": instance_qs[1].xform.pk,
+                        },
+                    },
+                ],
+            },
+            response.data,
+        )


class TestOSM(TestAbstractViewSet):
"""
@@ -3721,7 +3766,7 @@ def setUp(self):
self.logger = logging.getLogger("console_logger")

# pylint: disable=invalid-name,too-many-locals
-    @flaky(max_runs=8)
+    @flaky(max_runs=10)
def test_data_retrieve_instance_osm_format(self):
"""Test /data endpoint OSM format."""
filenames = [
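
For readers consuming the new GeoJSON output, here is a minimal sketch of splitting a response by source form. It assumes the payload has the shape asserted in `test_merged_dataset_geojson`: a `FeatureCollection` whose features carry `id` and `xform` properties. `group_features_by_xform` is a hypothetical helper and the ids in `sample` are made up.

```python
from collections import defaultdict


def group_features_by_xform(feature_collection):
    """Map each source form pk to the list of instance ids it contributed."""
    groups = defaultdict(list)
    for feature in feature_collection.get("features", []):
        # "properties" may be missing or None; treat both as empty.
        props = feature.get("properties") or {}
        groups[props.get("xform")].append(props.get("id"))
    return dict(groups)


# Made-up payload mirroring the structure used in the test above.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"id": 1, "xform": 10}},
        {"type": "Feature", "geometry": None, "properties": {"id": 2, "xform": 11}},
    ],
}

print(group_features_by_xform(sample))  # {10: [1], 11: [2]}
```

Because `geometry` can be `None` for merged datasets, the helper reads only `properties` and never touches the geometry.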
3 changes: 2 additions & 1 deletion onadata/apps/api/tests/viewsets/test_xform_viewset.py
@@ -2284,7 +2284,7 @@ def test_form_clone_shared_forms(self):
self.assertEqual(response.status_code, 201)
self.assertEqual(count + 1, XForm.objects.count())

-    @flaky
+    @flaky(max_runs=8)
def test_return_error_on_clone_duplicate(self):
with HTTMock(enketo_mock):
self._publish_xls_form_to_project()
@@ -3567,6 +3567,7 @@ def test_failed_form_publishing_after_maximum_retries(
self.assertEqual(response.status_code, 202)
self.assertEqual(response.data, error_message)

+    @flaky(max_runs=3)
def test_survey_preview_endpoint(self):
view = XFormViewSet.as_view({"post": "survey_preview", "get": "survey_preview"})

15 changes: 9 additions & 6 deletions onadata/apps/api/viewsets/data_viewset.py
@@ -648,17 +648,20 @@ def list(self, request, *args, **kwargs):
             return super().list(request, *args, **kwargs)

         if export_type == "geojson":
-            # raise 404 if all instances dont have geoms
-            if not xform.instances_with_geopoints and not (
-                xform.polygon_xpaths() or xform.geotrace_xpaths()
-            ):
-                raise Http404(_("Not Found"))
+            if not is_merged_dataset:
+                # raise 404 if all instances dont have geoms
+                if not xform.instances_with_geopoints and not (
+                    xform.polygon_xpaths() or xform.geotrace_xpaths()
+                ):
+                    raise Http404(_("Not Found"))

             # add pagination when fetching geojson features
             page = self.paginate_queryset(self.object_list)
             serializer = self.get_serializer(page, many=True)

-            return Response(serializer.data)
+            return Response(
+                serializer.data, headers={"Content-Type": "application/geo+json"}
+            )

         return custom_response_handler(request, xform, query, export_type)

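
The 404 decision in this hunk can be summarized as a small pure function. This is only a sketch for review purposes, not code from the commit: `geojson_should_404` and its boolean parameters are hypothetical names standing in for `is_merged_dataset`, `xform.instances_with_geopoints`, and the combined `polygon_xpaths()` / `geotrace_xpaths()` check.

```python
def geojson_should_404(is_merged_dataset, has_geopoints, has_polygon_or_geotrace):
    """Return True when the GeoJSON list request would raise Http404."""
    if is_merged_dataset:
        # After this change, merged datasets skip the geometry check entirely.
        return False
    # A regular form 404s only when it has no geopoint instances and
    # no polygon or geotrace questions.
    return not has_geopoints and not has_polygon_or_geotrace
```

This matches the new test, where a merged dataset with `geometry: None` features still returns a 200 response.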
37 changes: 3 additions & 34 deletions onadata/apps/logger/models/tests/test_merged_xform.py
@@ -1,48 +1,17 @@
"""Tests for module onadata.apps.logger.models.merged_xform"""

from pyxform.builder import create_survey_element_from_dict
from unittest.mock import call, patch

from onadata.apps.main.tests.test_base import TestBase
from onadata.apps.logger.models.merged_xform import MergedXForm
from onadata.apps.logger.models.xform import XForm


class MergedXFormTestCase(TestBase):
@patch("onadata.libs.utils.project_utils.set_project_perms_to_xform_async.delay")
def test_perms_applied_async_on_create(self, mock_set_perms):
"""Permissions are applied asynchronously on create"""
-        md = """
-        | survey |
-        | | type | name | label |
-        | | select one fruits | fruit | Fruit |
-        | choices |
-        | | list name | name | label |
-        | | fruits | orange | Orange |
-        | | fruits | mango | Mango |
-        """
-        self._publish_markdown(md, self.user, id_string="a")
-        self._publish_markdown(md, self.user, id_string="b")
-        xf1 = XForm.objects.get(id_string="a")
-        xf2 = XForm.objects.get(id_string="b")
-        survey = create_survey_element_from_dict(xf1.json_dict())
-        survey["id_string"] = "c"
-        survey["sms_keyword"] = survey["id_string"]
-        survey["title"] = "Merged XForm"
-        merged_xf = MergedXForm.objects.create(
-            id_string=survey["id_string"],
-            sms_id_string=survey["id_string"],
-            title=survey["title"],
-            user=self.user,
-            created_by=self.user,
-            is_merged_dataset=True,
-            project=self.project,
-            xml=survey.to_xml(),
-            json=survey.to_json(),
-        )
-        merged_xf.xforms.add(xf1)
-        merged_xf.xforms.add(xf2)
+        merged_xf = self._create_merged_dataset()
+        xf1 = merged_xf.xforms.get(id_string="a")
+        xf2 = merged_xf.xforms.get(id_string="b")
calls = [
call(xf1.pk, self.project.pk),
call(xf2.pk, self.project.pk),
2 changes: 1 addition & 1 deletion onadata/apps/logger/tests/test_briefcase_client.py
@@ -169,7 +169,7 @@ def _download_submissions(self):
mocker.head(requests_mock.ANY, content=submission_list)
self.briefcase_client.download_instances(self.xform.id_string)

-    @flaky(max_runs=8)
+    @flaky(max_runs=10)
def test_download_xform_xml(self):
"""
Download xform via briefcase api
46 changes: 45 additions & 1 deletion onadata/apps/main/tests/test_base.py
@@ -13,6 +13,8 @@
from io import StringIO
from tempfile import NamedTemporaryFile

+from pyxform.builder import create_survey_element_from_dict

from django.conf import settings
from django.contrib.auth import authenticate, get_user_model
from django.core.files.uploadedfile import InMemoryUploadedFile
@@ -28,7 +30,7 @@
from six.moves.urllib.request import urlopen

from onadata.apps.api.viewsets.xform_viewset import XFormViewSet
-from onadata.apps.logger.models import Instance, XForm, XFormVersion
+from onadata.apps.logger.models import Instance, MergedXForm, XForm, XFormVersion
from onadata.apps.logger.views import submission
from onadata.apps.logger.xform_instance_parser import clean_and_parse_xml
from onadata.apps.main.models import UserProfile
@@ -570,3 +572,45 @@ def _publish_follow_up_form(self, user, project=None):
latest_form = XForm.objects.all().order_by("-pk").first()

return latest_form

+    def _create_merged_dataset(self, make_submissions=False):
+        md = """
+        | survey |
+        | | type | name | label |
+        | | select one fruits | fruit | Fruit |
+        | choices |
+        | | list name | name | label |
+        | | fruits | orange | Orange |
+        | | fruits | mango | Mango |
+        """
+        self._publish_markdown(md, self.user, id_string="a")
+        self._publish_markdown(md, self.user, id_string="b")
+        xf1 = XForm.objects.get(id_string="a")
+        xf2 = XForm.objects.get(id_string="b")
+        survey = create_survey_element_from_dict(xf1.json_dict())
+        survey["id_string"] = "c"
+        survey["sms_keyword"] = survey["id_string"]
+        survey["title"] = "Merged XForm"
+        merged_xf = MergedXForm.objects.create(
+            id_string=survey["id_string"],
+            sms_id_string=survey["id_string"],
+            title=survey["title"],
+            user=self.user,
+            created_by=self.user,
+            is_merged_dataset=True,
+            project=self.project,
+            xml=survey.to_xml(),
+            json=survey.to_json(),
+        )
+        merged_xf.xforms.add(xf1)
+        merged_xf.xforms.add(xf2)
+
+        if make_submissions:
+            # Make submission for form a
+            xml = '<data id="a"><fruit>orange</fruit></data>'
+            Instance(xform=xf1, xml=xml).save()
+            # Make submission for form b
+            xml = '<data id="b"><fruit>mango</fruit></data>'
+            Instance(xform=xf2, xml=xml).save()
+
+        return merged_xf
26 changes: 2 additions & 24 deletions onadata/libs/tests/models/test_share_project.py
@@ -1,11 +1,9 @@
"""Tests for module onadata.libs.models.share_project"""

from unittest.mock import patch, call
from pyxform.builder import create_survey_element_from_dict

from onadata.apps.logger.models.data_view import DataView
from onadata.apps.logger.models.project import Project
from onadata.apps.logger.models.merged_xform import MergedXForm
from onadata.apps.logger.models.xform import XForm
from onadata.apps.main.tests.test_base import TestBase
from onadata.libs.models.share_project import ShareProject
@@ -37,7 +35,7 @@ def setUp(self):
project = Project.objects.create(
name="Demo", organization=self.user, created_by=self.user
)
-        self._publish_markdown(md_xform, self.user, project, id_string="a")
+        self._publish_markdown(md_xform, self.user, project)
self.dataview_form = XForm.objects.all().order_by("-pk")[0]
DataView.objects.create(
name="Demo",
@@ -46,27 +44,7 @@ def setUp(self):
matches_parent=True,
columns=[],
)
-        # MergedXForm
-        self._publish_markdown(md_xform, self.user, project, id_string="b")
-        xf1 = XForm.objects.get(id_string="a")
-        xf2 = XForm.objects.get(id_string="b")
-        survey = create_survey_element_from_dict(xf1.json_dict())
-        survey["id_string"] = "c"
-        survey["sms_keyword"] = survey["id_string"]
-        survey["title"] = "Merged XForm"
-        self.merged_xf = MergedXForm.objects.create(
-            id_string=survey["id_string"],
-            sms_id_string=survey["id_string"],
-            title=survey["title"],
-            user=self.user,
-            created_by=self.user,
-            is_merged_dataset=True,
-            project=self.project,
-            xml=survey.to_xml(),
-            json=survey.to_json(),
-        )
-        self.merged_xf.xforms.add(xf1)
-        self.merged_xf.xforms.add(xf2)
+        self.merged_xf = self._create_merged_dataset()
self.alice = self._create_user("alice", "Yuao8(-)")

@patch("onadata.libs.models.share_project.safe_delete")