XMLParseError when requesting data using a dictionary key from ABS_XML #253

Open
Chowti opened this issue Feb 7, 2024 · 1 comment

Chowti commented Feb 7, 2024

Using
Python 3.11.7
pandasdmx 1.10.0

I am getting an XMLParseError while attempting to get data using a dictionary key from "ABS_XML".

import pandasdmx as sdmx

abs_xml = sdmx.Request("ABS_XML")

resp = abs_xml.data('ABS_ANNUAL_ERP_LGA2022',
                    key = dict(SEX_ABS='1'),
                    params = dict(startPeriod='2021'))
Traceback
c:\Users\timot\anaconda3\envs\SDMX\Lib\site-packages\pandasdmx\remote.py:11: RuntimeWarning: optional dependency requests_cache is not installed; cache options to Session() have no effect
  warn(



--- SS without DSD ---
{1: False}

--- <class 'pandasdmx.message.StructureMessage'> ---
{2: <pandasdmx.StructureMessage>
  <Header>
    id: 'IDREF59600'
    prepared: '2024-02-05T17:21:23.770127+11:00'
    receiver: <Agency Unknown>
    sender: <Agency Unknown>
    source: 
    test: False}

--- <class 'pandasdmx.model.DataStructureDefinition'> ---
{'ABS_ANNUAL_ERP_LGA2022': <DataStructureDefinition ABS:ABS_ANNUAL_ERP_LGA2022(1.2.0): ERP by LGA (2022), Age and Sex, 2001 to 2022>}

--- <class 'pandasdmx.model.Agency'> ---
{'ABS': <Agency ABS>}

--- <class 'pandasdmx.model.DataflowDefinition'> ---
{'ABS_ANNUAL_ERP_LGA2022': <DataflowDefinition ABS:ABS_ANNUAL_ERP_LGA2022(1.2.0): ERP by LGA (2022), Age and Sex, 2001 to 2022>}

--- <class 'pandasdmx.model.CategoryScheme'> ---
{87: <CategoryScheme ABS:PEOPLE(1.0.0) (13 items): People>, 88: <CategoryScheme ABS:PEOPLE(1.0.0) (1 items)>}

--- <class 'pandasdmx.model.Categorisation'> ---
{'CAT_ANNUAL_ERP_LGA2022': <Categorisation ABS:CAT_ANNUAL_ERP_LGA2022(1.2.0): ERP by LGA (2022), Age and Sex, 2001 to 2022>}

--- <class 'pandasdmx.model.Codelist'> ---
{'CL_AGE': <Codelist ABS:CL_AGE(1.0.0) (194 items): Age>, 'CL_ERP': <Codelist ABS:CL_ERP(1.0.0) (1 items): Measure>, 'CL_FREQ': <Codelist ABS:CL_FREQ(1.0.0) (9 items): Frequency>, 'CL_LGA_2022': <Codelist ABS:CL_LGA_2022(1.0.0) (578 items): Local Government Areas - 2022>, 'CL_OBS_STATUS': <Codelist ABS:CL_OBS_STATUS(1.0.0) (16 items): Observation Status>, 'CL_REGION_TYPE': <Codelist ABS:CL_REGION_TYPE(1.0.0) (43 items): Region Type>, 'CL_SEX': <Codelist ABS:CL_SEX(1.0.0) (3 items): Sex>, 'CL_UNIT_MEASURE': <Codelist ABS:CL_UNIT_MEASURE(1.0.0) (88 items): Unit of Measure>}

--- <class 'pandasdmx.model.ConceptScheme'> ---
{11693: <ConceptScheme ABS:CS_COMMON(1.0.0) (5 items): Common Concepts>, 11694: <ConceptScheme ABS:CS_COMMON(1.0.0) (1 items)>, 11700: <ConceptScheme ABS:CS_DEMOG(1.0.0) (25 items): Demographic Concepts>, 11701: <ConceptScheme ABS:CS_DEMOG(1.0.0) (1 items)>, 'CS_DEMOG': <ConceptScheme ABS:CS_DEMOG(1.0.0) (1 items)>, 11712: <ConceptScheme ABS:CS_GEOGRAPHY(1.0.0) (25 items): Geography Concepts>, 11713: <ConceptScheme ABS:CS_GEOGRAPHY(1.0.0) (1 items)>, 'CS_GEOGRAPHY': <ConceptScheme ABS:CS_GEOGRAPHY(1.0.0) (1 items)>, 'CS_COMMON': <ConceptScheme ABS:CS_COMMON(1.0.0) (3 items)>, 11734: <ConceptScheme ABS:CS_ATTRIBUTE(1.0.0) (6 items): Attribute Concepts>, 11735: <ConceptScheme ABS:CS_ATTRIBUTE(1.0.0) (1 items)>, 'CS_ATTRIBUTE': <ConceptScheme ABS:CS_ATTRIBUTE(1.0.0) (2 items)>}

--- <class 'pandasdmx.model.Annotation'> ---
{'obs_count': Annotation(id='obs_count', title='698478', type='sdmx_metrics', url=None, text=), 11758: Annotation(id=None, title='A', type='ReleaseVersion', url=None, text=)}

--- Name ---
{11759: ('en', 'Availability (A) for ABS_ANNUAL_ERP_LGA2022')}

--- <class 'pandasdmx.reader.sdmxml.Reference'> ---
{'ABS_ANNUAL_ERP_LGA2022': <pandasdmx.reader.sdmxml.Reference object at 0x0000023E19BFFF50>}

--- <class 'pandasdmx.model.MemberSelection'> ---
{11762: <MemberSelection MEASURE in {'ERP'}>, 11766: <MemberSelection SEX_ABS in {'1', '2', '3'}>, 11786: <MemberSelection AGE in {'8599', 'A04', 'A10', 'A15', 'A20', 'A25', 'A30', 'A35', 'A40', 'A45', 'A50', 'A55', 'A59', 'A60', 'A65', 'A70', 'A75', 'A80', 'TOT'}>, 12344: <MemberSelection LGA_2022 in {'1', '10050', '10180', '10250', '10300', '10470', '10500', '10550', '10600', '10650', '10750', '10800', '10850', '10900', '10950', '11150', '11200', '11250', '11300', '11350', '11400', '11450', '11500', '11520', '11570', '11600', '11650', '11700', '11720', '11730', '11750', '11800', '12000', '12150', '12160', '12350', '12380', '12390', '12700', '12730', '12750', '12850', '12870', '12900', '12930', '12950', '13010', '13310', '13340', '13450', '13550', '13660', '13800', '13850', '13910', '14000', '14100', '14170', '14220', '14300', '14350', '14400', '14500', '14550', '14600', '14650', '14700', '14750', '14850', '14870', '14900', '14920', '14950', '15050', '15240', '15270', '15300', '15350', '15520', '15560', '15650', '15700', '15750', '15800', '15850', '15900', '15950', '15990', '16100', '16150', '16200', '16260', '16350', '16380', '16400', '16490', '16550', '16610', '16700', '16900', '16950', '17000', '17040', '17080', '17100', '17150', '17200', '17310', '17350', '17400', '17420', '17550', '17620', '17640', '17650', '17750', '17850', '17900', '17950', '18020', '18050', '18100', '18200', '18250', '18350', '18400', '18450', '18500', '18710', '19399', '2', '20110', '20260', '20570', '20660', '20740', '20830', '20910', '21010', '21110', '21180', '21270', '21370', '21450', '21610', '21670', '21750', '21830', '21890', '22110', '22170', '22250', '22310', '22410', '22490', '22620', '22670', '22750', '22830', '22910', '22980', '23110', '23190', '23270', '23350', '23430', '23670', '23810', '23940', '24130', '24210', '24250', '24330', '24410', '24600', '24650', '24780', '24850', '24900', '24970', '25060', '25150', '25250', '25340', '25430', '25490', '25620', '25710', '25810', 
'25900', '25990', '26080', '26170', '26260', '26350', '26430', '26490', '26610', '26670', '26700', '26730', '26810', '26890', '26980', '27070', '27170', '27260', '27350', '27450', '27630', '29399', '3', '30250', '30300', '30370', '30410', '30450', '30760', '30900', '31000', '31750', '31820', '31900', '31950', '32080', '32250', '32260', '32270', '32310', '32330', '32450', '32500', '32600', '32750', '32770', '32810', '33100', '33200', '33220', '33360', '33430', '33610', '33620', '33800', '33830', '33960', '33980', '34420', '34530', '34570', '34580', '34590', '34710', '34770', '34800', '34830', '34860', '34880', '35010', '35250', '35300', '35600', '35670', '35740', '35760', '35780', '35790', '35800', '36070', '36150', '36250', '36300', '36370', '36510', '36580', '36630', '36660', '36720', '36820', '36910', '36950', '36960', '37010', '37300', '37310', '37340', '37400', '37550', '37570', '37600', '4', '40070', '40120', '40150', '40220', '40250', '40310', '40430', '40520', '40700', '40910', '41010', '41060', '41140', '41190', '41330', '41560', '41750', '41830', '41960', '42030', '42110', '42250', '42600', '42750', '43080', '43220', '43360', '43650', '43710', '43790', '44000', '44060', '44210', '44340', '44550', '44620', '44830', '45040', '45090', '45120', '45290', '45340', '45400', '45540', '45680', '45890', '46090', '46300', '46450', '46510', '46670', '46860', '46970', '47140', '47290', '47490', '47630', '47700', '47800', '47910', '47980', '48050', '48130', '48260', '48340', '48410', '48540', '48640', '48750', '48830', '49399', '5', '50080', '50210', '50250', '50280', '50350', '50420', '50490', '50560', '50630', '50770', '50840', '50910', '50980', '51080', '51120', '51190', '51260', '51310', '51330', '51400', '51470', '51540', '51610', '51680', '51710', '51750', '51820', '51860', '51890', '51960', '52030', '52100', '52170', '52240', '52310', '52380', '52450', '52520', '52590', '52660', '52730', '52800', '52870', '52940', '53010', '53080', '53150', '53220', '53290', 
'53360', '53430', '53570', '53640', '53710', '53780', '53800', '53920', '53990', '54060', '54130', '54170', '54200', '54280', '54310', '54340', '54410', '54480', '54550', '54620', '54690', '54760', '54830', '54900', '54970', '55040', '55110', '55180', '55250', '55320', '55390', '55460', '55530', '55600', '55670', '55740', '55810', '55880', '55950', '56090', '56160', '56230', '56300', '56370', '56460', '56580', '56620', '56730', '56790', '56860', '56930', '57000', '57080', '57140', '57210', '57280', '57350', '57420', '57490', '57630', '57700', '57770', '57840', '57910', '57980', '58050', '58190', '58260', '58330', '58400', '58470', '58510', '58540', '58570', '58610', '58680', '58760', '58820', '58890', '59030', '59100', '59170', '59250', '59310', '59320', '59330', '59340', '59350', '59360', '59370', '6', '60210', '60410', '60610', '60810', '61010', '61210', '61410', '61510', '61610', '61810', '62010', '62210', '62410', '62610', '62810', '63010', '63210', '63410', '63610', '63810', '64010', '64210', '64610', '64810', '65010', '65210', '65410', '65610', '65810', '7', '70200', '70420', '70540', '70620', '70700', '71000', '71150', '71300', '72200', '72300', '72330', '72800', '73600', '74050', '74550', '74560', '74660', '74680', '79399', '8', '89399', '9', '99399', 'AUS'}>, 12348: <MemberSelection REGION_TYPE in {'AUS', 'LGA2022', 'STE'}>, 12350: <MemberSelection FREQUENCY in {'A'}>}

--- <class 'pandasdmx.model.RangePeriod'> ---
{12353: RangePeriod(start=Period(is_inclusive=True, period=datetime.datetime(2001, 1, 1, 0, 0)), end=Period(is_inclusive=True, period=datetime.datetime(2022, 12, 31, 0, 0)))}

<common:KeyValue xmlns:common="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common" xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message" xmlns:structure="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure" id="TIME_PERIOD">
            <common:TimeRange/></common:KeyValue>
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
File c:\Users\timot\anaconda3\envs\SDMX\Lib\site-packages\pandasdmx\reader\sdmxml.py:299, in Reader.read_message(self, source, dsd)
    297 try:
    298     # Parse the element
--> 299     result = func(self, element)
    300 except TypeError:

File c:\Users\timot\anaconda3\envs\SDMX\Lib\site-packages\pandasdmx\reader\sdmxml.py:1190, in _ms(reader, elem)
   1189 else:
-> 1190     raise RuntimeError
   1192 if arg["values_for"] is None:

RuntimeError: 

The above exception was the direct cause of the following exception:

XMLParseError                             Traceback (most recent call last)
File c:\Pystuff\pandasdmx\Fresh.py:6
      2 import pandasdmx as sdmx
      4 abs_xml = sdmx.Request("ABS_XML")
----> 6 resp = abs_xml.data('ABS_ANNUAL_ERP_LGA2022',
      7                     key = dict(SEX_ABS='1'),
      8                     params = dict(startPeriod='2021'))

File c:\Users\timot\anaconda3\envs\SDMX\Lib\site-packages\pandasdmx\api.py:457, in Request.get(self, resource_type, resource_id, tofile, use_cache, dry_run, **kwargs)
    455     req = self._request_from_url(kwargs)
    456 else:
--> 457     req = self._request_from_args(kwargs)
    459 req = self.session.prepare_request(req)
    461 # Now get the SDMX message via HTTP

File c:\Users\timot\anaconda3\envs\SDMX\Lib\site-packages\pandasdmx\api.py:287, in Request._request_from_args(self, kwargs)
    283     raise ValueError(f"unrecognized arguments: {kwargs!r}")
    285 if validate:
    286     # Make the key, and retain the DSD (if any) for use in parsing
--> 287     key, dsd = self._make_key(resource_type, resource_id, key, dsd)
    288     kwargs["dsd"] = dsd
    290 url_parts.append(key)

File c:\Users\timot\anaconda3\envs\SDMX\Lib\site-packages\pandasdmx\api.py:184, in Request._make_key(self, resource_type, resource_id, key, dsd)
    180     pass
    181 elif self.source.supports[Resource.datastructure]:
    182     # Retrieve the DataStructureDefinition
    183     dsd = (
--> 184         self.dataflow(
    185             resource_id, params=dict(references="all"), use_cache=True
    186         )
    187         .dataflow[resource_id]
    188         .structure
    189     )
    191     if dsd.is_external_reference:
    192         # DataStructureDefinition was not retrieved with the Dataflow
    193         # query; retrieve it explicitly
    194         dsd = self.get(resource=dsd, use_cache=True).structure[dsd.id]

File c:\Users\timot\anaconda3\envs\SDMX\Lib\site-packages\pandasdmx\api.py:514, in Request.get(self, resource_type, resource_id, tofile, use_cache, dry_run, **kwargs)
    511 reader = Reader()
    513 # Parse the message, using any provided or auto-queried DSD
--> 514 msg = reader.read_message(response_content, dsd=kwargs.get("dsd", None))
    516 # Store the HTTP response with the message
    517 msg.response = response

File c:\Users\timot\anaconda3\envs\SDMX\Lib\site-packages\pandasdmx\reader\sdmxml.py:317, in Reader.read_message(self, source, dsd)
    315     self._dump()
    316     print(etree.tostring(element, pretty_print=True).decode())
--> 317     raise XMLParseError from exc
    319 # Parsing complete; count uncollected items from the stacks, which represent
    320 # parsing errors
    321
    322 # Remove some internal items
    323 self.pop_single("SS without DSD")

XMLParseError: RuntimeError

The error appears to occur while retrieving the DSD (data structure definition):

dsd = abs_xml.dataflow('ABS_ANNUAL_ERP_LGA2022', params=dict(references="all"), use_cache=True).dataflow['ABS_ANNUAL_ERP_LGA2022'].structure

Specifying references="descendants" instead, and then passing the returned DSD to the data request, allows the request to complete successfully:

dsd = abs_xml.dataflow('ABS_ANNUAL_ERP_LGA2022', params=dict(references="descendants"), use_cache=True).dataflow['ABS_ANNUAL_ERP_LGA2022'].structure

resp = abs_xml.data('ABS_ANNUAL_ERP_LGA2022',
                    key = dict(SEX_ABS='1'),
                    params = dict(startPeriod='2021'),
                    dsd = dsd)
<pandasdmx.DataMessage>
  <Header>
    id: 'IREF030445'
    prepared: '2024-02-08T02:01:51'
    sender: <Agency _Stat_V8>
    source: 
    test: False
  response: <Response [200]>
  DataSet (1)
  dataflow: <DataflowDefinition (missing id)>
  observation_dimension: <TimeDimension TIME_PERIOD>

My main suspicion is the parsing of the two content constraints returned from
https://api.data.abs.gov.au/dataflow/ABS/ABS_ANNUAL_ERP_LGA2022/latest?references=all.
These are generated automatically during a point-in-time release:
https://sis-cc.gitlab.io/dotstatsuite-documentation/using-api/embargo-management/#point-in-time-release-feature
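
To help check that suspicion, here is a minimal sketch that scans SDMX-ML text for KeyValue elements whose TimeRange has no children, like the TIME_PERIOD element printed just before the traceback. It uses only the standard library. The sample XML is an assumption: only the KeyValue for TIME_PERIOD mirrors the dump above, while the `Wrapper` root and the SEX_ABS sibling are hypothetical, and `empty_time_ranges` is an illustrative name. Whether the empty TimeRange is what the reader actually rejects has not been confirmed.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the element printed before the traceback.
COMMON = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common"

# Hypothetical sample: only the TIME_PERIOD KeyValue mirrors the dump;
# the <Wrapper> root and the SEX_ABS sibling are assumptions for illustration.
SAMPLE = f"""
<Wrapper xmlns:common="{COMMON}">
  <common:KeyValue id="SEX_ABS">
    <common:Value>1</common:Value>
  </common:KeyValue>
  <common:KeyValue id="TIME_PERIOD">
    <common:TimeRange/>
  </common:KeyValue>
</Wrapper>
"""


def empty_time_ranges(xml_text):
    """Return the ids of KeyValue elements whose TimeRange has no child elements."""
    root = ET.fromstring(xml_text)
    ids = []
    for key_value in root.iter(f"{{{COMMON}}}KeyValue"):
        time_range = key_value.find(f"{{{COMMON}}}TimeRange")
        # len() of an Element is its number of direct children.
        if time_range is not None and len(time_range) == 0:
            ids.append(key_value.get("id"))
    return ids


print(empty_time_ranges(SAMPLE))
```

Running the same function over the raw text of the references=all response (saved to a file, for example) would show whether every failing dataflow carries such an element.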

@BartStolarek

Unfortunately, I think I'm in the same boat. I'm new to pandasdmx, and to SDMX structures in general, but here is my code:

from pandasdmx import Request
import logging
import pandasdmx

abs_xml = pandasdmx.Request('ABS_XML',
                            log_level=logging.INFO)



# Dataflows
flow_msg = abs_xml.dataflow(force=True) # get dataflows
dataflows_pandas = pandasdmx.to_pandas(flow_msg.dataflow) # convert to pandas DataFrame
dataflows_pandas.to_csv('dataflows.csv')  # save dataflows to csv
sa2Data = dataflows_pandas[dataflows_pandas.str.contains('SA2+', case=False)] # filter dataflows for SA2
sa2Data.to_csv('sa2DataFlows.csv') # save SA2 dataflows to csv

example_msg = abs_xml.dataflow(resource=flow_msg.dataflow.C21_G04_SA2) # get dataflow for C21_G04_SA2

When I run that, I get the following XMLParseError (caused by a RuntimeError):

../venv/lib/python3.10/site-packages/pandasdmx/remote.py:11: RuntimeWarning: optional dependency requests_cache is not installed; cache options to Session() have no effect
  warn(
2024-02-22 14:58:04,233 pandasdmx.api - INFO: Requesting resource from https://api.data.abs.gov.au/dataflow/ABS/latest
2024-02-22 14:58:04,233 pandasdmx.api - INFO: with headers {'User-Agent': 'python-requests/2.31.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
2024-02-22 14:58:07,415 pandasdmx.reader.sdmxml - DEBUG: Truncate sub-microsecond time in <Prepared>
2024-02-22 14:58:08,110 pandasdmx.api - INFO: Requesting resource from https://api.data.abs.gov.au/dataflow/ABS/C21_G04_SA2/latest?references=all
2024-02-22 14:58:08,110 pandasdmx.api - INFO: with headers {'User-Agent': 'python-requests/2.31.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
2024-02-22 14:58:10,468 pandasdmx.reader.sdmxml - DEBUG: Truncate sub-microsecond time in <Prepared>



--- SS without DSD ---
{1: False}

--- <class 'pandasdmx.message.StructureMessage'> ---
{2: <pandasdmx.StructureMessage>
  <Header>
    id: 'IDREF23404'
    prepared: '2024-02-22T13:40:55.674541+11:00'
    receiver: <Agency Unknown>
    sender: <Agency Unknown>
    source: 
    test: False}

--- <class 'pandasdmx.model.DataStructureDefinition'> ---
{'C21_G04_SA2': <DataStructureDefinition ABS:C21_G04_SA2(1.0.0): Census 2021, G04 Age by sex, Main Statistical Areas Level 2 and up (SA2+) Datastructure>}

--- <class 'pandasdmx.model.Agency'> ---
{'ABS': <Agency ABS>}

--- <class 'pandasdmx.model.DataflowDefinition'> ---
{'C21_G04_SA2': <DataflowDefinition ABS:C21_G04_SA2(1.0.0): Census 2021, G04 Age by sex, Main Statistical Areas Level 2 and up (SA2+)>}

--- <class 'pandasdmx.model.CategoryScheme'> ---
{63: <CategoryScheme ABS:C21_ASGS(1.0.0) (5 items): Census 2021>, 64: <CategoryScheme ABS:C21_ASGS(1.0.0) (1 items)>}

--- <class 'pandasdmx.model.Categorisation'> ---
{'CAT_C21_G04_SA2': <Categorisation ABS:CAT_C21_G04_SA2(1.0.0): Census 2021, G04 Age by sex, Main Statistical Areas Level 2 and up (SA2+) Categorisation>}

--- <class 'pandasdmx.model.Codelist'> ---
{'CL_ASGS_2021': <Codelist ABS:CL_ASGS_2021(1.0.0) (2985 items): Australian Statistical Geography Standard (ASGS) Edition 3 - Main Structure>, 'CL_C21_AGEINGP13': <Codelist ABS:CL_C21_AGEINGP13(1.0.0) (102 items): Age, excludes overseas vistitors 13>, 'CL_C21_SEXP01': <Codelist ABS:CL_C21_SEXP01(1.0.0) (3 items): Sex 01>, 'CL_REGION_TYPE': <Codelist ABS:CL_REGION_TYPE(1.0.0) (43 items): Region Type>, 'CL_STATE': <Codelist ABS:CL_STATE(1.0.0) (10 items): State>}

--- <class 'pandasdmx.model.ConceptScheme'> ---
{52106: <ConceptScheme ABS:CS_C21_PERSON(1.0.0) (120 items): Census 2021 Person Concepts>, 52107: <ConceptScheme ABS:CS_C21_PERSON(1.0.0) (1 items)>, 'CS_C21_PERSON': <ConceptScheme ABS:CS_C21_PERSON(1.0.0) (1 items)>, 52118: <ConceptScheme ABS:CS_GEOGRAPHY(1.0.0) (25 items): Geography Concepts>, 52119: <ConceptScheme ABS:CS_GEOGRAPHY(1.0.0) (1 items)>, 'CS_GEOGRAPHY': <ConceptScheme ABS:CS_GEOGRAPHY(1.0.0) (2 items)>, 52134: <ConceptScheme ABS:CS_COMMON(1.0.0) (5 items): Common Concepts>, 52135: <ConceptScheme ABS:CS_COMMON(1.0.0) (1 items)>, 'CS_COMMON': <ConceptScheme ABS:CS_COMMON(1.0.0) (1 items)>}

--- <class 'pandasdmx.model.Annotation'> ---
{'obs_count': Annotation(id='obs_count', title='912186', type='sdmx_metrics', url=None, text=), 52148: Annotation(id=None, title='A', type='ReleaseVersion', url=None, text=)}

--- Name ---
{52149: ('en', 'Availability (A) for C21_G04_SA2')}

--- <class 'pandasdmx.reader.sdmxml.Reference'> ---
{'C21_G04_SA2': <pandasdmx.reader.sdmxml.Reference object at 0x7fdd3198b970>}

--- <class 'pandasdmx.model.MemberSelection'> ---
{52253: <MemberSelection AGEINGP in {'_T', '0', '0_4', '1', '10', '10_14', '11', '12', '13', '14', '15', '15_19', '16', '17', '18', '19', '2', '20', '20_24', '21', '22', '23', '24', '25', '25_29', '26', '27', '28', '29', '3', '30', '30_34', '31', '32', '33', '34', '35', '35_39', '36', '37', '38', '39', '4', '40', '40_44', '41', '42', '43', '44', '45', '45_49', '46', '47', '48', '49', '5', '5_9', '50', '50_54', '51', '52', '53', '54', '55', '55_59', '56', '57', '58', '59', '6', '60', '60_64', '61', '62', '63', '64', '65', '65_69', '66', '67', '68', '69', '7', '70', '70_74', '71', '72', '73', '74', '75', '75_79', '76', '77', '78', '79', '8', '80_84', '85_89', '9', '90_94', '95_99', 'GE100'}>, 52257: <MemberSelection SEXP in {'1', '2', '3'}>, 55239: <MemberSelection REGION in {'1', '101', ...<truncated>..., '9OTER', 'AUS'}>, 55246: <MemberSelection REGION_TYPE in {'AUS', 'GCCSA', 'SA2', 'SA3', 'SA4', 'STE'}>, 55257: <MemberSelection STATE in {'1', '2', '3', '4', '5', '6', '7', '8', '9', 'AUS'}>}

--- <class 'pandasdmx.model.RangePeriod'> ---
{55260: RangePeriod(start=Period(is_inclusive=True, period=datetime.datetime(2021, 1, 1, 0, 0)), end=Period(is_inclusive=True, period=datetime.datetime(2021, 12, 31, 0, 0)))}

<common:KeyValue xmlns:common="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common" xmlns:message="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message" xmlns:structure="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure" id="TIME_PERIOD">
            <common:TimeRange/></common:KeyValue>
        

Traceback (most recent call last):
  File "../venv/lib/python3.10/site-packages/pandasdmx/reader/sdmxml.py", line 299, in read_message
    result = func(self, element)
  File "../venv/lib/python3.10/site-packages/pandasdmx/reader/sdmxml.py", line 1189, in _ms
    raise RuntimeError
RuntimeError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "../main.py", line 17, in <module>
    example_msg = abs_xml.dataflow(resource=flow_msg.dataflow.C21_G04_SA2) # get dataflow for C21_G04_SA2
  File "..r/venv/lib/python3.10/site-packages/pandasdmx/api.py", line 514, in get
    msg = reader.read_message(response_content, dsd=kwargs.get("dsd", None))
  File "../venv/lib/python3.10/site-packages/pandasdmx/reader/sdmxml.py", line 316, in read_message
    raise XMLParseError from exc
pandasdmx.exceptions.XMLParseError: RuntimeError
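
The log above shows this request also going out with references=all, so the references="descendants" workaround from the first comment may apply here too. Below is a minimal sketch of a fallback helper that tries several `references` values in turn and returns the first that parses. The helper name `first_successful`, its parameters, and the stand-in `fake_fetch` are all hypothetical; nothing here is part of pandasdmx, and it has not been run against the ABS service.

```python
def first_successful(fetch, candidates, error_type=Exception):
    """Call fetch(candidate) for each candidate in order.

    Returns (candidate, result) for the first call that does not raise
    error_type; raises RuntimeError if every candidate fails.
    """
    candidates = list(candidates)
    last_error = None
    for candidate in candidates:
        try:
            return candidate, fetch(candidate)
        except error_type as exc:
            last_error = exc
    raise RuntimeError(f"all candidates failed: {candidates!r}") from last_error


# Stand-in for a real request, so the sketch runs offline (an assumption):
# it fails for "all" the way the thread describes, and succeeds otherwise.
class FakeParseError(Exception):
    pass


def fake_fetch(references):
    if references == "all":
        raise FakeParseError("simulated parse failure")
    return f"structure fetched with references={references}"


level, result = first_successful(fake_fetch, ["all", "descendants"], FakeParseError)
print(level, "->", result)
```

With pandasdmx, `fetch` could be something like `lambda refs: abs_xml.dataflow('C21_G04_SA2', params=dict(references=refs))`, with `pandasdmx.exceptions.XMLParseError` (the exception named in the traceback) as `error_type`.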

