Truncate Column Rename Step #93

yannistze · 2024-10-18T18:24:19Z

Hello,

If I understand the order of operations in SQLPushdownRule.scala correctly, the first step after adding the shared context to all the Relations is to rename all the columns in each Relation to unique names, following the normalizedExprIdMap logic:

singlestore-spark-connector/src/main/scala/com/singlestore/spark/SQLPushdownRule.scala

Lines 41 to 45 in 8e70dff

    
    // Second, we need to rename the outputs of each SingleStore relation in the tree.  This transform is
    // done to ensure that we can handle projections which involve ambiguous column name references.
    var ptr, nextPtr = normalized.transform({
      case SQLGen.Relation(relation) => relation.renameOutput
    })

This makes sense: it avoids duplicate-name issues later on, for example in joins, since they are wrapped in a selectAll statement instead of a select statement with aliases:

singlestore-spark-connector/src/main/scala/com/singlestore/spark/SQLGen.scala

Line 470 in 8e70dff

.selectAll()

singlestore-spark-connector/src/main/scala/com/singlestore/spark/SQLGen.scala

Line 488 in 8e70dff

.selectAll()

singlestore-spark-connector/src/main/scala/com/singlestore/spark/SQLGen.scala

Line 502 in 8e70dff

.selectAll()
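To make the duplicate-name concern concrete, here is a minimal sketch (not connector code) that lists which column names would appear more than once in the output of a `SELECT *` over a join of two relations. The object and function names are hypothetical and only serve the illustration.

```scala
object DuplicateNameSketch {
  // Names that occur more than once across the two sides of a join.
  // With SELECT * these would surface as duplicate output columns
  // unless each side has been renamed to unique names beforehand.
  def collisions(left: Seq[String], right: Seq[String]): Seq[String] =
    (left ++ right)
      .groupBy(identity)
      .collect { case (name, hits) if hits.size > 1 => name }
      .toSeq
      .sorted

  def main(args: Array[String]): Unit = {
    val orders    = Seq("id", "customer_id", "amount")
    val customers = Seq("id", "name")
    println(collisions(orders, customers).mkString(", "))
  }
}
```

Any name returned here is one that a wrapping selectAll cannot disambiguate on its own, which is why the rename step runs first.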

The issue I am facing with this approach is that when:

- one or more Relations have many columns (more than roughly 50-80),
- the column naming conventions produce long names, and
- the query is fully pushed down,

the query string becomes too long, which causes the schema fetch and the subsequent PrepareStatement code to fail.
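A rough way to see how the query length grows is to build a renamed SELECT list and measure it. The sketch below assumes a hypothetical alias scheme (original name plus a numeric suffix); the connector's actual alias format and the column names used here are assumptions for illustration only.

```scala
object QueryLengthSketch {
  // Hypothetical alias scheme: original name plus a unique numeric suffix.
  def alias(name: String, id: Int): String = s"`$name#$id`"

  // One "`col` AS `col#id`" entry per column, joined by ", ".
  // Each column name therefore appears twice in the generated text.
  def renamedSelectList(columns: Seq[String], firstId: Int): String =
    columns.zipWithIndex
      .map { case (c, i) => s"`$c` AS ${alias(c, firstId + i)}" }
      .mkString(", ")

  def main(args: Array[String]): Unit = {
    // 80 made-up columns with long names.
    val cols = (1 to 80).map(i => f"customer_lifetime_value_segment_$i%03d")
    val list = renamedSelectList(cols, 1000)
    println(s"${cols.size} columns -> ${list.length} characters in one SELECT list")
  }
}
```

Every additional pushed-down layer that repeats the column list adds a comparable amount of text, so wide relations with long names compound quickly.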

I tried:

- truncating the query in a few places while keeping the Connector logic the same, which helped but didn't fully solve the issue;
- using qualifiers instead of renaming each column, which also helped but required a more extensive refactoring of the Connector (e.g. joins), making it riskier.
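One way the truncation idea could look is to shorten the name part of each alias while keeping the unique suffix intact, so aliases stay distinct as long as the ids are. This is a sketch under assumptions, not the code from any linked change: `MaxAliasLength`, `truncatedAlias`, and the `name#id` format are all hypothetical.

```scala
object AliasTruncationSketch {
  // Hypothetical limit; choose whatever keeps the generated SQL manageable.
  val MaxAliasLength = 24

  // Keep the "#id" suffix whole and cut only the name prefix.
  // If the suffix alone exceeds maxLen, the alias is just the suffix,
  // which is longer than maxLen but still unique.
  def truncatedAlias(name: String, id: Long, maxLen: Int = MaxAliasLength): String = {
    val suffix = s"#$id"
    val keep   = math.max(0, maxLen - suffix.length)
    name.take(keep) + suffix
  }

  def main(args: Array[String]): Unit = {
    println(truncatedAlias("customer_lifetime_value_segment", 42))
    println(truncatedAlias("id", 7))
  }
}
```

Because uniqueness comes from the suffix rather than the name, two long names with the same prefix still map to different aliases.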

I was wondering if there is any guidance for scenarios like the one I am facing?

Thanks

AdalbertMemSQL · 2024-10-22T10:52:55Z

Hello,
Could you clarify what error you are encountering?

yannistze · 2024-10-22T16:50:26Z

> Hello, Could you clarify what error you are encountering?

Hello, for sure, the error I get is the following generic one:

java.sql.SQLTransientConnectionException: Driver has reconnect connection after a communications link failure with address=(host=10.133.121.176)(port=3306)(type=primary)
  at com.singlestore.jdbc.client.impl.MultiPrimaryClient.replayIfPossible(MultiPrimaryClient.java:212)
  at com.singlestore.jdbc.client.impl.MultiPrimaryClient.execute(MultiPrimaryClient.java:345)
  at com.singlestore.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:69)
  at com.singlestore.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:251)
  at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
  at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
  at com.singlestore.spark.JdbcHelpers$.loadSchema(JdbcHelpers.scala:137)
  at com.singlestore.spark.SinglestoreReader.schema$lzycompute(SinglestoreReader.scala:84)
  at com.singlestore.spark.SinglestoreReader.schema(SinglestoreReader.scala:84)
...

which, if I am not mistaken, comes from this code path in the JDBC driver:

        // no transaction, but connection is now up again.
        // changing exception to SQLTransientConnectionException
        throw new SQLTransientConnectionException(
            String.format(
                "Driver has reconnect connection after a communications link failure with %s",
                oldClient.getHostAddress()),
            "25S03");

and "masks" the root cause 😞

yannistze mentioned this issue Oct 18, 2024

[DP-938][DP-939][DP-943][DP-944][DP-945][DP-946][DP-947][DP-948][DP-951][DP-954] Put back Rename Column Logic With Truncation ActionIQ/singlestore-spark-connector#3

Merged
