Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with new table exist logic #19

Open
jcoveney opened this issue Jul 23, 2014 · 15 comments
Open

Issue with new table exist logic #19

jcoveney opened this issue Jul 23, 2014 · 15 comments

Comments

@jcoveney
Copy link

So, I had a chance to have a user using vertica test 2.5.4-wip-84

14/07/23 03:57:00 INFO jdbc.JDBCTap: testing table exists with select 1 from TABLE where 1 = 0
14/07/23 03:57:00 INFO jdbc.JDBCTap: creating connection: JDBC STRING
14/07/23 03:57:00 INFO jdbc.JDBCTap: executing query: select 1 from TABLE where 1 = 0
14/07/23 03:57:00 INFO jdbc.JDBCTap: 'TABLE' exists? false

note, I replaced our table name with TABLE and the jdbc stirng with JDBC. The table exists, but it didn't detect it. I had them try using the wip:

14/07/23 17:22:13 INFO jdbc.JDBCTap: testing if table exists with DatabaseMetaData

That's it...the lack of logging alone seems like a regression, but is also didn't work: it then tried to make the table (and failed, as it exists).

@fs111
Copy link
Contributor

fs111 commented Jul 23, 2014

Please use 2.5.3 release, the 2.5.4-wip seems to be made from the wrong branch. The " testing table exists with select 1 from TABLE where 1 = 0" is no longer in the codebase, so something must have gone wrong with that wip. I will have to sort that out.

@jcoveney
Copy link
Author

Ok, thanks. Will get back to you.

@jcoveney
Copy link
Author

2.5.3 has the same issue it looks like. It had the same logging. Let me make sure the version is the right one but I think it is as it has the different logging

@fs111
Copy link
Contributor

fs111 commented Jul 23, 2014

@jcoveney
Copy link
Author

Haven't had a chance to set up our env to make that easy yet...hoping to. That said, I was able to unblock our user with the following logic which worked in 2.5.2 (testing soon on 2.5.3)

case object VerticaAdapter extends JdbcAdapter {
override def getDriver = "com.vertica.Driver"
private [this] val tableRegex = new Regex("""(?:([^.]+).)?(.+)""", "schema", "table")
private[this] def getExistenceQuery(tableName: String) =
tableName match {
case tableRegex(schema, table) => {
val baseQuery =
"SELECT * FROM v_catalog.tables WHERE table_type in ('TABLE', 'VIEW') AND table_name = '"+table+"'"
Option(schema).foldLeft(baseQuery) { _ + " AND table_schema = '" + _ + "'" }
}
case _ => throw new IllegalArgumentException("Invalid table name was given: " + tableName)
}
override def getTableDesc(tableName: String, columnNames: Array[String], columnDefinitions: Array[String]) =
new TableDesc(tableName, columnNames, columnDefinitions, null, getExistenceQuery(tableName))
}

This is a vertica specific table existence query. That said, I have a feeling it is going to fail in 2.5.3, but will report back.

@jcoveney
Copy link
Author

We have another issue with 2.5.2 that I'm working out (I think unrelated), but 2.5.3 definitely doesn't work, EVEN with my proper query specified in the TableDesc

@fs111
Copy link
Contributor

fs111 commented Jul 25, 2014

I have removed the table exists query support in favor of using DatabaseMetaData, so setting it will have no effect: https://github.com/Cascading/cascading-jdbc/blob/2.5/cascading-jdbc-core/src/main/java/cascading/jdbc/JDBCTap.java#L643-L685

I don't have access to a vertica instance, but can you please run the cascading-jdbc-mysql tests against an instance of your own just to make sure, that works as it does for me and on our CI server?

Sorry for all the confusion!

@amatsukawa
Copy link

Taking this over from @jcoveney. I've tracked down why Vertica table discovery fails. We use table_schema in vertica, eg the tables are referred to as schema_name.table_name, for example can be test.amatsukawa_123.

Here: https://github.com/Cascading/cascading-jdbc/blob/2.5/cascading-jdbc-core/src/main/java/cascading/jdbc/JDBCTap.java#L655 that correct call for the example above should be

tables = dbm.getTables( null, "test", "amatsukawa_123", null );

whereas currently since TableDesc doesn't support schemas, we need to pass in the entire string "test.amatsukawa_123" as the table name, which ends up being:

tables = dbm.getTables( null, null, "test.amatsukawa_123", null );

which fails. Just using "amatsukawa_123" as the table name makes table discovery work but fails inserts instead, so I think the way forward is to extend TableDesc to know about table_schema.

@sharrake
Copy link

Is this issue fixed in cascading-jdbc ? I am facing the same issue with latest cascading-jdbc version

@fs111
Copy link
Contributor

fs111 commented Jan 28, 2016

@sharrake which db are you talking to? Can you share a minimal example with me?

@sharrake
Copy link

@fs111 I am using cascading-jdbc to talk to postgressql. I am trying to read data from hdfs, do some processing and write data to postgres sql using cascading-jdbc. The table to which, I want to load data is not in a public schema and that is where it is failing.

If I give jdbc url as following "jdbc:postgresql://localhost:5432/test_emr?currentSchema=emr&user=sharrake&password=sharrake"; It checks the table availabity properly but it doesn't find table when it has to start loading data.

But when I give the url as "jdbc:postgresql://localhost:5432/test_emr?&user=sharrake&password=sharrake"; and in tha tablename give schema as "emr.tablename", it fails before even it could find the table.

I am not sure if this is a limitation of cascading-jdbc or I am missing out something.If you have a sample cascading-jdbc code which loads data into a postgres table which is in a user schema, it will be of great help.

@sharrake
Copy link

sharrake commented Feb 1, 2016

@fs111 It is working fine and I am able to write data into a postgres table using cascading-jdbc if table is in public schema but it doesn't work if table is in a user schema. Can you please confirm if it is a known issue and if there is any plan to fix this ?

@fs111
Copy link
Contributor

fs111 commented Feb 1, 2016

This is indeed a current limitation. We are currently not working on cascading-jdbc, but I can guide you to sending us a patch, so that this gets resolved. Are you interested in such a collaboration?

@sharrake
Copy link

sharrake commented Feb 2, 2016

@fs111 Please send me the process of sending a patch. Will check if I can fix the issue and submit the patch for it.

@fs111
Copy link
Contributor

fs111 commented Feb 2, 2016

@sharrake first, we need a signed copy of your contributors agreement: http://files.concurrentinc.com/agreements/Concurrent_Contributor_Agreement.doc sent it to [email protected] and CC me [email protected]

For the actual feature: You should probably look a the TableDesc as a starting point for adding the schema. Then the table exists logic should be overwritten in a postgressql sub-class in the postgresql sub-project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants