Protos with recursive fields fail with stack overflow #5

Open

drewrobb opened this issue Dec 3, 2016 · 10 comments

Comments

@drewrobb

drewrobb commented Dec 3, 2016

Adding a recursive field to a proto breaks schema generation; see drewrobb/sparksql-scalapb-test@4cfc436 for a reproduction. I'm happy to help address this if you have a recommended approach.

Exception in thread "main" java.lang.StackOverflowError
	at shadeproto.Descriptors$FieldDescriptor.getName(Descriptors.java:881)
	at com.trueaccord.scalapb.spark.ProtoSQL$.com$trueaccord$scalapb$spark$ProtoSQL$$structFieldFor(ProtoSQL.scala:65)
	at com.trueaccord.scalapb.spark.ProtoSQL$$anonfun$1.apply(ProtoSQL.scala:62)
	at com.trueaccord.scalapb.spark.ProtoSQL$$anonfun$1.apply(ProtoSQL.scala:62)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
 

	......

	at com.trueaccord.scalapb.spark.ProtoSQL$.com$trueaccord$scalapb$spark$ProtoSQL$$structFieldFor(ProtoSQL.scala:62)
	at com.trueaccord.scalapb.spark.ProtoSQL$$anonfun$1.apply(ProtoSQL.scala:62)
	at com.trueaccord.scalapb.spark.ProtoSQL$$anonfun$1.apply(ProtoSQL.scala:62)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
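
For illustration, the recursive shape boils down to a message that contains itself, so the generated class is self-referential. A minimal stand-in (names are made up here; the actual repro is in the linked commit):

// proto: message Person { optional Person friend = 1; }
// ScalaPB generates roughly this self-referential case class:
case class Person(name: String = "", friend: Option[Person] = None)
// Deriving a Spark schema by walking the fields recurses into `friend`
// with no base case, hence the StackOverflowError above.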
@dbkegley

@drewrobb, were you able to find a resolution for this? We are facing what looks like a similar issue with a highly nested schema.

@drewrobb
Author

@dbkegley we have not found a resolution to this, nor even a proposed way to fix it.

@thesamet
Contributor

Schemas in Spark must be known ahead of time. A possible workaround would be to set a limit on the recursion depth when generating a schema. Would that be useful?
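
A rough sketch of what I mean, assuming the unshaded com.google.protobuf descriptor API (the maxDepth parameter and the DepthLimitedSchema object are hypothetical, not the actual ProtoSQL code):

import com.google.protobuf.Descriptors.{Descriptor, FieldDescriptor}
import org.apache.spark.sql.types._
import scala.collection.JavaConverters._

object DepthLimitedSchema {
  def structFieldFor(fd: FieldDescriptor, depth: Int, maxDepth: Int): Option[StructField] = {
    import FieldDescriptor.JavaType._
    val dataType: Option[DataType] = fd.getJavaType match {
      case INT         => Some(IntegerType)
      case LONG        => Some(LongType)
      case FLOAT       => Some(FloatType)
      case DOUBLE      => Some(DoubleType)
      case BOOLEAN     => Some(BooleanType)
      case STRING      => Some(StringType)
      case BYTE_STRING => Some(BinaryType)
      case ENUM        => Some(StringType)
      case MESSAGE =>
        // Recursing unconditionally here is what overflows today;
        // prune the branch once the depth budget is spent.
        if (depth >= maxDepth) None
        else Some(structFor(fd.getMessageType, depth + 1, maxDepth))
    }
    dataType.map { dt =>
      StructField(
        fd.getName,
        if (fd.isRepeated) ArrayType(dt, containsNull = false) else dt,
        nullable = true)
    }
  }

  def structFor(d: Descriptor, depth: Int = 0, maxDepth: Int = 10): StructType =
    StructType(d.getFields.asScala.flatMap(structFieldFor(_, depth, maxDepth)))
}

Fields beyond the depth limit would simply be dropped from the schema rather than overflowing the stack.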

@drewrobb
Author

That sounds like it would fix my use case. We aren't storing arbitrarily deep trees or anything; it's mostly just single-level recursion, as in the example in this issue.

@thesamet
Contributor

FWIW, for a single level, you could do something like this:

message Person { ... }

message PersonWithOtherPerson {
  optional Person main = 1;
  optional Person other_person = 2;
}

The downside is that this pushes the parent Person into a field rather than keeping it at the top level. One way to get around this is to have an implicit conversion between PersonWithOtherPerson and Person.
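
For example (a sketch only; Person and PersonWithOtherPerson stand for the ScalaPB-generated case classes, and defaultInstance is what ScalaPB puts on each companion object):

import scala.language.implicitConversions

object Conversions {
  // Treat the flattened wrapper as if it were the top-level Person.
  implicit def wrapperToPerson(w: PersonWithOtherPerson): Person =
    w.main.getOrElse(Person.defaultInstance)

  implicit def personToWrapper(p: Person): PersonWithOtherPerson =
    PersonWithOtherPerson(main = Some(p))
}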

@dbkegley

@thesamet I think this would work for us as well. Unfortunately we are only consumers, so we don't have access to update the schema. We can advise against recursive fields, but there's no guarantee the producers will follow our recommendation.

@colinlouie

colinlouie commented May 5, 2020

@thesamet, I'm in the same boat: as a consumer, I can't control the source. It would be great if the ProtoSQL driver could take a recursion-depth limit as a parameter. As a workaround, I'm looking into a way to flatten this out before it hits Spark (rough sketch below); the recursion is at most 10 levels deep, if that helps.
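
Something along these lines is what I have in mind (Node and child are stand-ins for the actual generated message and its recursive field):

// Stand-in for the generated message class and its recursive field.
case class Node(value: String = "", child: Option[Node] = None)

// Unroll the chain (at most ~10 deep in our data) into flat
// (depth, node-without-child) rows before creating the DataFrame.
def flatten(node: Node, depth: Int = 0, maxDepth: Int = 10): Seq[(Int, Node)] =
  if (depth >= maxDepth) Seq.empty
  else (depth, node.copy(child = None)) +:
    node.child.toSeq.flatMap(c => flatten(c, depth + 1, maxDepth))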

I'm using Scala 2.11.12, Spark 2.4.4, sparksql-scalapb 0.9.2, sbt-protoc 0.99.28, scalapb compilerplugin 0.9.7.

@anjshrg

anjshrg commented Aug 26, 2021

Just wondering if this issue was addressed in a newer release of ScalaPB? We are facing a similar issue.

@thesamet
Contributor

Hi @anjshrg, the issue is still not resolved. PRs are welcome!

@MCardus

MCardus commented Oct 21, 2024

I found the same issue using the Protobuf type google.protobuf.Struct. This type contains nested Struct values, so I get a StackOverflowError. Any idea how to tackle this when we can't control the schema?
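
One workaround I'm considering is rendering the Struct to a JSON string before the data reaches Spark, so the column becomes a plain StringType instead of an infinitely nested schema (sketch; JsonFormat is from protobuf-java-util):

import com.google.protobuf.Struct
import com.google.protobuf.util.JsonFormat

// Serialize the recursive Struct field to JSON; Spark then only ever
// sees a flat string column for it.
def structToJson(s: Struct): String =
  JsonFormat.printer().print(s)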
