Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notifying implementation inconsistency in user properties #86

Open
Martin-Rey opened this issue Dec 5, 2018 · 4 comments
Open

Notifying implementation inconsistency in user properties #86

Martin-Rey opened this issue Dec 5, 2018 · 4 comments

Comments

@Martin-Rey
Copy link

Hi,

I am getting an unexpected error following my calculation of the SFR_histogram for a single halo at the latest timestep: 'tangos write SFR_histogram SFR_10Myr --for Halo600 --latest --hmax=0'.

The error reads:

Out[10]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
    691                 type_pprinters=self.type_printers,
    692                 deferred_pprinters=self.deferred_printers)
--> 693             printer.pretty(obj)
    694             printer.flush()
    695             return stream.getvalue()

~/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    378                             if callable(meth):
    379                                 return meth(obj, self, cycle)
--> 380             return _default_pprint(obj, self, cycle)
    381         finally:
    382             self.end_group()

~/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _default_pprint(obj, p, cycle)
    493     if _safe_getattr(klass, '__repr__', None) is not object.__repr__:
    494         # A user-provided repr. Find newlines and replace them with p.break_()
--> 495         _repr_pprint(obj, p, cycle)
    496         return
    497     p.begin_group(1, '<')

~/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    691     """A pprint that just redirects to the normal repr function."""
    692     # Find newlines and replace them with p.break_()
--> 693     output = repr(obj)
    694     for idx,output_line in enumerate(output.splitlines()):
    695         if idx:

~/Documents/tangos/tangos/core/halo_data/property.py in __repr__(self)
     47             x = "<HaloProperty "
     48         if self.data_float is not None:
---> 49             return (x + self.name.text + "=%.2e" % self.data) + " of " + self.halo.short() + ">"
     50         elif self.data_array is not None:
     51             return x + self.name.text + " (array) of " + self.halo.short() + ">"

~/Documents/tangos/tangos/core/halo_data/property.py in data(self)
     65     @property
     66     def data(self):
---> 67         return self.get_data_with_reassembly_options()
     68 
     69     def get_data_with_reassembly_options(self, *options):

~/Documents/tangos/tangos/core/halo_data/property.py in get_data_with_reassembly_options(self, *options)
     68 
     69     def get_data_with_reassembly_options(self, *options):
---> 70         return extraction_patterns.HaloPropertyValueWithReassemblyOptionsGetter(*options).postprocess_data_objects([self])[0]
     71 
     72 

~/Documents/tangos/tangos/core/extraction_patterns.py in postprocess_data_objects(self, outputs)
    112 
    113     def postprocess_data_objects(self, outputs):
--> 114         return [self._postprocess_one_result(o) for o in outputs]
    115 
    116     def _setup_data_mapper(self, property_object):

~/Documents/tangos/tangos/core/extraction_patterns.py in <listcomp>(.0)
    112 
    113     def postprocess_data_objects(self, outputs):
--> 114         return [self._postprocess_one_result(o) for o in outputs]
    115 
    116     def _setup_data_mapper(self, property_object):

~/Documents/tangos/tangos/core/extraction_patterns.py in _postprocess_one_result(self, property_object)
    132         if hasattr(self._providing_class, 'reassemble'):
    133             instance = self._providing_class(property_object.halo.timestep.simulation)
--> 134             return instance.reassemble(property_object, *self._options)
    135         else:
    136             self._setup_data_mapper(property_object)

~/Documents/tangos/tangos/properties/pynbody/SF.py in reassemble(self, *options)
     35 
     36     def reassemble(self, *options):
---> 37         reassembled = super(StarFormHistogram, self).reassemble(*options)
     38         return reassembled/1e9 # Msol per Gyr -> Msol per yr
     39 

~/Documents/tangos/tangos/properties/__init__.py in reassemble(self, property, reassembly_type)
    248 
    249         if reassembly_type=='major':
--> 250             return self._reassemble_using_finding_strategy(property, strategy = rfs.MultiHopMajorProgenitorsStrategy)
    251         elif reassembly_type=='major_across_simulations':
    252             return self._reassemble_using_finding_strategy(property, strategy = rfs.MultiHopMajorProgenitorsStrategy,

~/Documents/tangos/tangos/properties/__init__.py in _reassemble_using_finding_strategy(self, property, strategy, strategy_kwargs)
    276         for t_i, hist_i in zip(t, stack):
    277             end = self.bin_index(t_i)
--> 278             start = end - len(hist_i)
    279             valid = hist_i == hist_i
    280             if t_i != previous_time:

TypeError: object of type 'numpy.float64' has no len()

I think the problem is generated because there is only one SFR_histogram property available across the merger tree of this halo, and hence a number (which has no length) is retruned rather than an array.

Am I misusing this property (in the sense that it should always be generated across the full merger tree) or is this a bug?

The error also generates a catastrophic failure of the web server preventing from accessing the page of this halo (but not the timestep page or other halo pages). I think this is because the webserver tries by default to reassemble histogram properties which fail with the same error as above.

Thanks
Martin

@apontzen
Copy link
Member

apontzen commented Dec 5, 2018

It has to be generated at least for the whole major progenitor branch, otherwise it’s not defined. But it would be nice if a neater/more helpful error resulted...

@Martin-Rey
Copy link
Author

Ok thanks for the clarification. I will leave it up to you to decide if a better error message should be added and if a better try/catch mechanism should be added to the web server to avoid it completely breaking if histogram properties fail.

Otherwise, I'll just close this issue and it will serve as documentation on SFR_histogram.

@Martin-Rey
Copy link
Author

I finally got to the bottom of this. It was due to my declaration of the property name.

I declared it as names = ["SFR_histogram"] even though only one calculation was returned.

When performing the calculation loop, the code here

def run_property_calculation(self, db_halo, property_calculator, existing_properties):
names = property_calculator.names
if type(names) is str:
listize = True
names = [names]
else:
listize = False

would evaluate the need to listize to False because the property name is a list type.

Once the actual calculation was done by my property, it would return an SFR array as result which would go through this snippet of code:

results = self._get_property_value(db_halo, property_calculator, existing_properties)
if listize:
results = [results]
self._queue_results_for_later_commit(db_halo, names, results, existing_properties)

No list rewrapping would be done as the if test evaluates to False.

When queued for later committing, the result array would be broken down by a zip call here:

def _queue_results_for_later_commit(self, db_halo, names, results, existing_properties_data):
for n, r in zip(names, results):
if isinstance(r, proxy_object.ProxyObjectBase):

and the database would end up storing only the first number in the array, rather than the full array. This one number would then be retrieved during the reassembly, leading the above failure.

This error clearly follows my own coding error when declaring the property name, but the bug was incredibly difficult to catch as TANGOS swallowed the problem very deep without complaining. One way to avoid this in the future would be to implement additional consistency checks for example verifying that the length of the calculated result matches the length of the stored result, as well as throwing a more sensible error in the reassembly method if its arguments are not what is expected.

Note that in the end, this has nothing to do with the SFR_histogram being generated for only one output.

@apontzen
Copy link
Member

Ah, I see, yes it would be useful to implement a simple test to catch this type of inconsistency

@apontzen apontzen changed the title SFR_histogram calculated on one snapshot? Notifying implementation inconsistency in user properties Jan 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants