Currently, our schema tree doesn't support external databases, external schemas and external tables for Amazon Redshift. Extraction code needs to be modified to handle these. The external schema should not show up in the current schema tree. The attached patch filters this out. We need to create a separate area just for external databases, schemas and tables. The patch sets this up.
This is going to be the layout for the external objects in the tree. Attached is a patch to build the tree objects with extraction code. Right clicking on the objects should allow the following commands:
These commands will need to be added in the code. The underlying gui menu's will need to be created to support the features for each object (databases, schemas and tables) based on the available features found in the Amazon documentation. Additionally, we will need to provide the underlying code to support the scripting features mentioned above. This includes Schema Script Generator. Remember, that external tables are stored in Amazon s3 buckets and are read only. This therefore limit the operations available.
|
102 KB
![]() |
9 KB
|
62 KB
|
93 KB
|
109 KB
|
116 KB
|
99 KB
|
80 KB
|
91 KB
|
85 KB
|
177 KB
|
218 KB
|
206 KB
|
211 KB
|
210 KB
|
206 KB
|
215 KB
|
203 KB
|
239 KB
|
219 KB
|
172 KB
|
195 KB
|
212 KB
|
210 KB
|
203 KB
|
114 KB
|
105 KB
|
208 KB
|
204 KB
|
81 KB
|
95 KB
|
106 KB
|
106 KB
|
82 KB
|
94 KB
|
131 KB
|
128 KB
|
125 KB
|
133 KB
|
124 KB
I created an external schema vn_spectrum and table in Redshift/Spectrum.
RETURNS THE LIST OF ALL DATABASES CREATED IN ATHENA
And "SELECT schemaname, tablename FROM pg_catalog.svv_external_tables;"
Returns
Am I looking into wrong place in browser or this the issue yet to be fixed?
Hi,
Can we expect a solution for this issue? We want our business team to start using Spectrum external tables, but they can't see the external tables in Aqua Data Studio.
Thanks
Hi,
Can we expect a solution for this issue? We want our business team to start using Spectrum external tables, but they can't see the external tables in Aqua Data Studio.
Thanks
Hi Manjeeth,
I pushed this up to be evaluated by our engineering team. We will see if we can add it to our V20 roadmap.
Thanks for your patience, Tom
Hi Manjeeth,
I pushed this up to be evaluated by our engineering team. We will see if we can add it to our V20 roadmap.
Thanks for your patience, Tom
Thank you Tom. Do you have any timelines on V20? We were going to let the users start using the external tables from next week, so sooner the update available on Aqua Data Studio, the better. There are other applications like SQL Workbench (recommended by Amazon), Data grip from Jet brains that already display the external tables.
Thank you Tom. Do you have any timelines on V20? We were going to let the users start using the external tables from next week, so sooner the update available on Aqua Data Studio, the better. There are other applications like SQL Workbench (recommended by Amazon), Data grip from Jet brains that already display the external tables.
Just checking to see if there is any timeline update on this issue, as Users have started asking for this option. SQLWorkbench and Datagrip have options to see the external tables already.
Just checking to see if there is any timeline update on this issue, as Users have started asking for this option. SQLWorkbench and Datagrip have options to see the external tables already.
Hi Manjeeth,
I think the next release will be 4th quarter of this year.
Tom
Hi Manjeeth,
I think the next release will be 4th quarter of this year.
Tom
Revision no. 57475
Author: harish.khattri
Revision no. 57475
Author: harish.khattri
Hi Harish,
Lets go with number 2. Perhaps we should change the wording a little bit..."An external database can only be created when creating an external schema. Do you want to create an external schema?" Also, instead of "Create External Schema", we should use a "Continue" or "Cancel" for the buttons.
Thanks,
Tom
Hi Harish,
Lets go with number 2. Perhaps we should change the wording a little bit..."An external database can only be created when creating an external schema. Do you want to create an external schema?" Also, instead of "Create External Schema", we should use a "Continue" or "Cancel" for the buttons.
Thanks,
Tom
Revision no. 57486
Author: harish.khattri
Revision no. 57486
Author: harish.khattri
Hi Harish,
Per our conversation... The reason databases tom and tomtest work correctly with the new external schema gui is that there are already internal database's with that name. There are no internal databases named harish, harish1...etc. When making a connection, remember that external databases are place holders. You cannot use our connection scheme to connect to them. Notice in the node classes like RedshiftExtSchemasNode, I use ConnectionProperties connSettings = getConnectionProperties(); to get the original connection and database information. It kind of tricks it and I didn't see any problems when doing this. I didn't look at all the new code that you checked in but I did add a quick fix(hack) to demonstrate how to work around this. This seems to work. Please test it and clean it up - empty catch.
Hope that helps,
Tom
Hi Harish,
Per our conversation... The reason databases tom and tomtest work correctly with the new external schema gui is that there are already internal database's with that name. There are no internal databases named harish, harish1...etc. When making a connection, remember that external databases are place holders. You cannot use our connection scheme to connect to them. Notice in the node classes like RedshiftExtSchemasNode, I use ConnectionProperties connSettings = getConnectionProperties(); to get the original connection and database information. It kind of tricks it and I didn't see any problems when doing this. I didn't look at all the new code that you checked in but I did add a quick fix(hack) to demonstrate how to work around this. This seems to work. Please test it and clean it up - empty catch.
Hope that helps,
Tom
Thanks Tom, it is working. I committed changes for the same.
Revision no. 57488
Author: harish.khattri
The implementation of Create External Schema on the External Schemas node in the schema tree is done.
Thanks Tom, it is working. I committed changes for the same.
Revision no. 57488
Author: harish.khattri
The implementation of Create External Schema on the External Schemas node in the schema tree is done.
Revision no. 57493
Author: harish.khattri
Revision no. 57493
Author: harish.khattri
Revision no. 57496
Author: harish.khattri
@Tom, What we have to show in the "External Database Properties" menu of the individual External Database node? As there is no direct Create External Database query and all the information saved in the SVV_EXTERNAL_DATABASE view is same as the External Schema which we are already showing in the External Schema Properties.
If we have to show the External Database Properties then we have to create ExternalDatabaseDialog class only for showing properties.
Please suggest
Revision no. 57496
Author: harish.khattri
@Tom, What we have to show in the "External Database Properties" menu of the individual External Database node? As there is no direct Create External Database query and all the information saved in the SVV_EXTERNAL_DATABASE view is same as the External Schema which we are already showing in the External Schema Properties.
If we have to show the External Database Properties then we have to create ExternalDatabaseDialog class only for showing properties.
Please suggest
Revision no. 57500
Author: harish.khattri
Revision no. 57500
Author: harish.khattri
Hi Harish,
Per your email... I know we are limited with catalog information especially for database. Let's skip the properties for database. The detail pane has enough information.
Thanks,
Tom
Hi Harish,
Per your email... I know we are limited with catalog information especially for database. Let's skip the properties for database. The detail pane has enough information.
Thanks,
Tom
Hi Harish,
One small change... Would you change the text in the create external schema menu from:
"Create external database if not exists"
to
"Creating the schema will also automatically create the database if it doesn't exist"
and use UIComponentFactory.getExampleComponent to display the message just below the database textfield.
Thank you,
Tom
Hi Harish,
One small change... Would you change the text in the create external schema menu from:
"Create external database if not exists"
to
"Creating the schema will also automatically create the database if it doesn't exist"
and use UIComponentFactory.getExampleComponent to display the message just below the database textfield.
Thank you,
Tom
Hi Tom,
I committed the changes suggested by you in following commit:
Revision no. 57506
Author: harish.khattri
Hi Tom,
I committed the changes suggested by you in following commit:
Revision no. 57506
Author: harish.khattri
Thanks Harish. Looks good.
Another thing that I noticed is when dropping an external schema using the gui on either a closed or opened tree database node, the database node does not get refreshed. Can you have a look at that? Thank you.
Thanks Harish. Looks good.
Another thing that I noticed is when dropping an external schema using the gui on either a closed or opened tree database node, the database node does not get refreshed. Can you have a look at that? Thank you.
Hi Tom,
We discussed this issue in our status call on 14th Jan, the External Databases node is not getting refreshed when we create or drop External Database through external schema node (list node as well as individual external schema node). There is utility method to refresh parent (or self) node of selected node only. In our case we need to refresh parent node of the parent node of selected node (say grandparent node), there is no utility method for that.
I fixed this issue and committed the changes in following commit:
Revision no. 57508
Author: harish.khattri
Hi Tom,
We discussed this issue in our status call on 14th Jan, the External Databases node is not getting refreshed when we create or drop External Database through external schema node (list node as well as individual external schema node). There is utility method to refresh parent (or self) node of selected node only. In our case we need to refresh parent node of the parent node of selected node (say grandparent node), there is no utility method for that.
I fixed this issue and committed the changes in following commit:
Revision no. 57508
Author: harish.khattri
Revision no. 57520
Author: harish.khattri
Revision no. 57520
Author: harish.khattri
Revision no. 57527
Author: harish.khattri
Revision no. 57527
Author: harish.khattri
Revision no. 57529
Author: harish.khattri
Revision no. 57530
Author: harish.khattri
Revision no. 57529
Author: harish.khattri
Revision no. 57530
Author: harish.khattri
Revision no. 57533
Author: harish.khattri
Revision no. 57533
Author: harish.khattri
Revision no. 57536
Author: harish.khattri
Revision no. 57536
Author: harish.khattri
Hi Harish,
We discussed the above at the meeting. I agree. Please remember to fix the column search in the auto complete.
Thanks,
Tom
Hi Harish,
We discussed the above at the meeting. I agree. Please remember to fix the column search in the auto complete.
Thanks,
Tom
Revision no. 57542
Author: harish.khattri
Revision no. 57542
Author: harish.khattri
Hi Harish,
When we display the table properties->storage, we should also list the input/output format and any serde info. I can see it in the detail window so it is available. Would you have a look at that?
See HiveColumnUtil class...
The column parms that are not usable should be removed!!
When creating an external table, I keep getting the error /* Required: Storage Location */ even when I have filled that field in. See general, storage and preview. Am I missing something?
Thanks,
Tom
Hi Harish,
When we display the table properties->storage, we should also list the input/output format and any serde info. I can see it in the detail window so it is available. Would you have a look at that?
See HiveColumnUtil class...
The column parms that are not usable should be removed!!
When creating an external table, I keep getting the error /* Required: Storage Location */ even when I have filled that field in. See general, storage and preview. Am I missing something?
Thanks,
Tom
Hi Tom,
I fixed the input/output format. Also fixed the validation of the location and input/output format.
The input/output format for standard file formats will be informative only as they will not be the part of the SQL query for standard formats.
Revision no. 57546
Author: harish.khattri
Revision no. 57547
Author: harish.khattri
Revision no. 57549
Author: harish.khattri
Hi Tom,
I fixed the input/output format. Also fixed the validation of the location and input/output format.
The input/output format for standard file formats will be informative only as they will not be the part of the SQL query for standard formats.
Revision no. 57546
Author: harish.khattri
Revision no. 57547
Author: harish.khattri
Revision no. 57549
Author: harish.khattri
Hi Harish,
Table Properties now has an extra line Would you please have a look at it?
Another thing I noticed was that for Alter/Create, the null column is available. I don't see that we can have those in the ddl. Can we either eliminate it or gray it out for external tables; whichever is easier.
When doing a script table to new window, we get this error message. We probably should just use the default internal database and schema for the new window instead of trying to switch to the external one. I don't know how you make the distinction between and a "tom" internal database and a "tom" external database. This looks problematic and does not work.If you happen to select the internal database "tom" in the QA window and then execute SELECT * FROM tom.student_data which is an external table, it errors with a not found. Hmm!! Have to think about this??
The rest of the changes look good.
Thanks,
Tom
Hi Harish,
Table Properties now has an extra line Would you please have a look at it?
Another thing I noticed was that for Alter/Create, the null column is available. I don't see that we can have those in the ddl. Can we either eliminate it or gray it out for external tables; whichever is easier.
When doing a script table to new window, we get this error message. We probably should just use the default internal database and schema for the new window instead of trying to switch to the external one. I don't know how you make the distinction between and a "tom" internal database and a "tom" external database. This looks problematic and does not work.If you happen to select the internal database "tom" in the QA window and then execute SELECT * FROM tom.student_data which is an external table, it errors with a not found. Hmm!! Have to think about this??
The rest of the changes look good.
Thanks,
Tom
Hi Tom,
I fixed the issues mentioned in your above comment.
Revision no. 57552
Author: harish.khattri
Hi Tom,
I fixed the issues mentioned in your above comment.
Revision no. 57552
Author: harish.khattri
Hi Harish,
I created two external schema's tom1 and tom2 using hive metastore but I don't see them in the database tree??
I got this from the create external schema docs...
You can't use CREATE EXTERNAL DATABASE IF NOT EXISTS with HIVE METASTORE
Thanks, Tom
CREATE EXTERNAL SCHEMA "tom2" FROM HIVE METASTORE DATABASE 'tom2' URI '172.31.32.0/20' PORT 9083 IAM_ROLE 'arn:aws:iam::789328423458:role/AWSRoleForRedshiftSpectrum'
Hi Harish,
I created two external schema's tom1 and tom2 using hive metastore but I don't see them in the database tree??
I got this from the create external schema docs...
You can't use CREATE EXTERNAL DATABASE IF NOT EXISTS with HIVE METASTORE
Thanks, Tom
CREATE EXTERNAL SCHEMA "tom2" FROM HIVE METASTORE DATABASE 'tom2' URI '172.31.32.0/20' PORT 9083 IAM_ROLE 'arn:aws:iam::789328423458:role/AWSRoleForRedshiftSpectrum'
Hi Tom,
It is handled for both create and drop in the scripting and UI.
Hi Tom,
It is handled for both create and drop in the scripting and UI.
Hi Tom,
I am facing error when i try to create external table from the external schema having "Kind : Hive Metastore".
Please refer the screenshot: Metastore table
Thanks,
Bhushan
Hi Tom,
I am facing error when i try to create external table from the external schema having "Kind : Hive Metastore".
Please refer the screenshot: Metastore table
Thanks,
Bhushan
Hi Bhushan,
The issue with External Schema Properties is fixed.
Revision no. 57556
Author: harish.khattri
Hi Bhushan,
The issue with External Schema Properties is fixed.
Revision no. 57556
Author: harish.khattri
Hi Bhushan,
I will look into the Hive Metastore table issue tomorrow.
Thanks,
Tom
Hi Bhushan,
I will look into the Hive Metastore table issue tomorrow.
Thanks,
Tom
Hi Harish,
The issue with External Schema Properties is working fine on Build 22.
Thanks,
Bhushan
Hi Harish,
The issue with External Schema Properties is working fine on Build 22.
Thanks,
Bhushan
Hi Tom,
As per my observation when we expand the external databases node, it is very slow. It is affecting the Manual Testing as well as Automation, because it has became a time consuming process.
Thanks,
Bhushan
Hi Tom,
As per my observation when we expand the external databases node, it is very slow. It is affecting the Manual Testing as well as Automation, because it has became a time consuming process.
Thanks,
Bhushan
Hi Harish,
An observation... If I drop an external schema using the gui schema tree drop with the drop database checked, it doesn't seem to drop the database in the schema tree. If I use the drop with cascade in the QA window, it also drops the database. For the gui drop, you use the suffix restrict(default) which doesn't seem to drop the database. My guess is that it senses the schema and doesn't think the database is empty.
DROP SCHEMA "test" DROP EXTERNAL DATABASE CASCADE
We should also have a checkbox in the gui external schema drop for cascade. Have it checked by default and add the suffix cascade in the ddl. If unchecked, just change the cascade to restrict.
This is important because we can't directly drop the database. It will stick around forever.
Thanks,
Tom
Hi Harish,
An observation... If I drop an external schema using the gui schema tree drop with the drop database checked, it doesn't seem to drop the database in the schema tree. If I use the drop with cascade in the QA window, it also drops the database. For the gui drop, you use the suffix restrict(default) which doesn't seem to drop the database. My guess is that it senses the schema and doesn't think the database is empty.
DROP SCHEMA "test" DROP EXTERNAL DATABASE CASCADE
We should also have a checkbox in the gui external schema drop for cascade. Have it checked by default and add the suffix cascade in the ddl. If unchecked, just change the cascade to restrict.
This is important because we can't directly drop the database. It will stick around forever.
Thanks,
Tom
Hi Bhushan,
Yes it is slow. I tried to upgrade and reboot the cluster but that didn't help. I deleted a couple of external databases to see if that makes it any better. I also tried the query from the console and it to takes over 2 minutes. I'm checking with Jimmy to see if we have AWS support. Will let you know but in the meantime, we will have to live with it.
Thanks,
Tom
Hi Bhushan,
Yes it is slow. I tried to upgrade and reboot the cluster but that didn't help. I deleted a couple of external databases to see if that makes it any better. I also tried the query from the console and it to takes over 2 minutes. I'm checking with Jimmy to see if we have AWS support. Will let you know but in the meantime, we will have to live with it.
Thanks,
Tom
Hi Tom,
It is working fine, I tested with following steps:
1. Created external schema 'bhushan2schema' with option CREATE EXTERNAL DATABASE IF NOT EXISTS database name 'bhushan2'. screenshot1
2. The external database 'bhushan2' and schema 'bhushan2schema' created successfully. (refreshing External Databases node taking around 5-6 minutes). screenshot2
3. Created another external schema 'bhushan2schema1' under database 'bhushan2'. screenshot3
4. The external schema 'bhushan2schema1' created. Now database 'bhushan2' have two external schemas 'bhushan2schema' and 'bhushan2schema1'. screenshot4
5. Click on menu 'Drop External Schema' for 'bhushan2schema1', in the dialog I checked the option 'Drop External Database' and click on Drop button. screenshot5
6. The external database 'bhushan2' is also dropped. (refreshing External Databases node taking around 5-6 minutes) screenshot6
Bhushan will attach the screenshots for these steps.
Hi Tom,
It is working fine, I tested with following steps:
1. Created external schema 'bhushan2schema' with option CREATE EXTERNAL DATABASE IF NOT EXISTS database name 'bhushan2'. screenshot1
2. The external database 'bhushan2' and schema 'bhushan2schema' created successfully. (refreshing External Databases node taking around 5-6 minutes). screenshot2
3. Created another external schema 'bhushan2schema1' under database 'bhushan2'. screenshot3
4. The external schema 'bhushan2schema1' created. Now database 'bhushan2' have two external schemas 'bhushan2schema' and 'bhushan2schema1'. screenshot4
5. Click on menu 'Drop External Schema' for 'bhushan2schema1', in the dialog I checked the option 'Drop External Database' and click on Drop button. screenshot5
6. The external database 'bhushan2' is also dropped. (refreshing External Databases node taking around 5-6 minutes) screenshot6
Bhushan will attach the screenshots for these steps.
Hi Harish,
Hmm. I tried it again and it worked. It didn't drop the database for me when I tried before. Let's have the QA team test it one more time. If they don't find an issue, I am good with it.
Thanks and sorry for the delay,
Tom
Hi Harish,
Hmm. I tried it again and it worked. It didn't drop the database for me when I tried before. Let's have the QA team test it one more time. If they don't find an issue, I am good with it.
Thanks and sorry for the delay,
Tom
Hi Tom & Harish,
I have verified the scenario on build 26 as well. It is working fine.
Thanks,
Bhushan
Hi Tom & Harish,
I have verified the scenario on build 26 as well. It is working fine.
Thanks,
Bhushan
Hi Harish,
While alter the table; when we add any column, "Not Null" value getting generated in preview SQL tab, which is not getting generated while creating table.
Please refer: Screenshot
I have covered all the possible scenarios regarding manual testing those are as follows:-
1. Create External Database
2. Create External Schema (data catalog / Hive Metastore), Drop External Schema (with & w/o delete database), External Schema Properties, Script object (to window as, to new window as, to file as)
3. Create External Table, Alter External Table, Drop External Table, Select top 1000 rows, External Table Properties, Script object (to window as, to new window as, to file as)
4. Test Cases are added to the test rail: https://idera.testrail.net/index.php?/cases/view/241584
@Tom, These are the Scenarios which are pending due to S3 files
1. Remaining 5 file types are need to be tested for "Data Catalog".
2. I am not able to create external table using "Hive Metastore".
Thanks,
Bhushan
Hi Harish,
While alter the table; when we add any column, "Not Null" value getting generated in preview SQL tab, which is not getting generated while creating table.
Please refer: Screenshot
I have covered all the possible scenarios regarding manual testing those are as follows:-
1. Create External Database
2. Create External Schema (data catalog / Hive Metastore), Drop External Schema (with & w/o delete database), External Schema Properties, Script object (to window as, to new window as, to file as)
3. Create External Table, Alter External Table, Drop External Table, Select top 1000 rows, External Table Properties, Script object (to window as, to new window as, to file as)
4. Test Cases are added to the test rail: https://idera.testrail.net/index.php?/cases/view/241584
@Tom, These are the Scenarios which are pending due to S3 files
1. Remaining 5 file types are need to be tested for "Data Catalog".
2. I am not able to create external table using "Hive Metastore".
Thanks,
Bhushan
Thanks,
Bhushan
Thanks,
Bhushan
Hi Bhushan,
I have fixed following issues in the below mentioned revision:
1. Alter External Table: when add new column, the NOT NULL was appended in the query which is not supported for External Table
2. Added Cascade checkbox in the Drop External Schema UI as per Tom's suggestion. Now External database will get deleted if DROP EXTERNAL DATABASE and Cascade checkboxes will be checked and if external schema have external tables in it.
3. External Table Properties: corrected the format for SerDe properties.
Revision no. 57579
Author: harish.khattri
Hi Bhushan,
I have fixed following issues in the below mentioned revision:
1. Alter External Table: when add new column, the NOT NULL was appended in the query which is not supported for External Table
2. Added Cascade checkbox in the Drop External Schema UI as per Tom's suggestion. Now External database will get deleted if DROP EXTERNAL DATABASE and Cascade checkboxes will be checked and if external schema have external tables in it.
3. External Table Properties: corrected the format for SerDe properties.
Revision no. 57579
Author: harish.khattri
Revision: 24083
Revision: 24083
Observation :
1. when we delete one of the schema from the database with delete database & cascade checkbox checked, the database get deleted but when we again create new database with same name, at that time remaining schema from deleted database get reflected in that new database
(due to cascade, table is getting deleted)
Thanks,
Bhushan
Observation :
1. when we delete one of the schema from the database with delete database & cascade checkbox checked, the database get deleted but when we again create new database with same name, at that time remaining schema from deleted database get reflected in that new database
(due to cascade, table is getting deleted)
Thanks,
Bhushan
Hi Harish,
These are some bugs i have observed in while creating external tables for Avro file format
1. Storage Format - Wrong reflection though created by Row format serde
2. Serde name - not visible
3. I am not able to add column in external tables for Avro file format
@Tom,
(Observation: When i try alter any table in the from any file format to Avro file format, at that time serde property field remains non editable, but when i select the alter options table gets alter with the Avro file format- Screenshot )
TC - Redshift alter table doesn't allow you to update the serde properties - limitation.
4. I am not able add multiple columns while altering any table
Thanks,
Bhushan
Hi Harish,
These are some bugs i have observed in while creating external tables for Avro file format
1. Storage Format - Wrong reflection though created by Row format serde
2. Serde name - not visible
3. I am not able to add column in external tables for Avro file format
@Tom,
(Observation: When i try alter any table in the from any file format to Avro file format, at that time serde property field remains non editable, but when i select the alter options table gets alter with the Avro file format- Screenshot )
TC - Redshift alter table doesn't allow you to update the serde properties - limitation.
4. I am not able add multiple columns while altering any table
Thanks,
Bhushan
Hi Bhushan,
Points 1, 2 and 4 are fixed in below revisions. However, in point 3, database gives proper error while add or delete columns from external table with AVRO file format. I think it will be OK.
As fixing point 3 is complex, I can remove menu items "INSERT COLUMN" and "DROP COLUMN", but the extra row is added in the columns table which I can not remove only for AVRO file format. Because this row is being added from the Column Model, here I am able to determine that it is an External Table but I can not check the file format.
If I disable columns table for AVRO file format then we can not rename the column which is permitted for AVRO file format.
TC - I agree with Harish's assessment.
Revision no. 57591
Author: harish.khattri
Revision no. 57593
Author: harish.khattri
Hi Bhushan,
Points 1, 2 and 4 are fixed in below revisions. However, in point 3, database gives proper error while add or delete columns from external table with AVRO file format. I think it will be OK.
As fixing point 3 is complex, I can remove menu items "INSERT COLUMN" and "DROP COLUMN", but the extra row is added in the columns table which I can not remove only for AVRO file format. Because this row is being added from the Column Model, here I am able to determine that it is an External Table but I can not check the file format.
If I disable columns table for AVRO file format then we can not rename the column which is permitted for AVRO file format.
TC - I agree with Harish's assessment.
Revision no. 57591
Author: harish.khattri
Revision no. 57593
Author: harish.khattri
Hi Harish,
I have verified the issues on build 32. Those are working fine.
Thanks,
Bhushan
Hi Harish,
I have verified the issues on build 32. Those are working fine.
Thanks,
Bhushan
Automation test result
Automation test result
Hi Harish,
One thing that we should make sure gets documented in the ADS documentation is the internal database connection/external schema relationship. For example, if I am connected to internal database "aquafold" and I create an external schema/table, that schema/table will not show up in the schema tree if I change the database connection to say "amazon_db" In other words, the schema tree external schemas/tables will only display what was created with the current connection. The same is true for QA. Do a "SELECT * FROM SVV_EXTERNAL_SCHEMAS" using the "aquafold" database connection. Now switch databases at the top of the QA window and run the query again. You will see different schema's listed.
I have not figured a way around this nor have I found that relationship in the catalog tables. The AWS Redshift console works the same way so I'm guessing that it is an AWS idiosyncrasy.
Do you know of any way around this? In other words, to display all external schema's in the tree.
Thanks,
Tom
Hi Harish,
One thing that we should make sure gets documented in the ADS documentation is the internal database connection/external schema relationship. For example, if I am connected to internal database "aquafold" and I create an external schema/table, that schema/table will not show up in the schema tree if I change the database connection to say "amazon_db" In other words, the schema tree external schemas/tables will only display what was created with the current connection. The same is true for QA. Do a "SELECT * FROM SVV_EXTERNAL_SCHEMAS" using the "aquafold" database connection. Now switch databases at the top of the QA window and run the query again. You will see different schema's listed.
I have not figured a way around this nor have I found that relationship in the catalog tables. The AWS Redshift console works the same way so I'm guessing that it is an AWS idiosyncrasy.
Do you know of any way around this? In other words, to display all external schema's in the tree.
Thanks,
Tom
Hi Tom,
I fixed the issue with Sequence file in the Hive metastore. Actually the output format value for Hive metastore and for data catalog are different. So, I followed the HiveColumnUtil for different output format values for Sequence and Text file formats. Now it will identify the Sequence file correctly for both.
Internal Database Connection and External Schema Relationship:
As per the Amazon Redshift Documentation, The CREATE EXTERNAL SCHEMA query creates a new external schema in the current database. The relationship between Internal Database Connection and External Schema is not well defined in the AWS Redshift documentation. Please find below my observations related to the relationship:
There are system views to get the information about External Database, External Schema and External Table named SVV_EXTERNAL_DATABASES, SVV_EXTERNAL_SCHEMAS and SVV_EXTERNAL_TABLES. Each internal database has its own copy of these system views.
When we create an external database then its entry will be added in SVV_EXTERNAL_DATABASES view for each Internal database connection. However, for external schema and external table it is not the case. When we create an external schema and/or external table then its entry will be added to the system views only for connected internal database. Hence, we are getting different results for the same query "SELECT * FROM SVV_EXTERNAL_SCHEMAS" for different internal database connection.
I also not figured out the way to get all external schemas irrespective of the internal database connection. If you remembered, we discussed this scenario when I started implementation of Create External Schema. Then we decided to use original connection in the External Schemas node from the connection properties. The Amazon Redshift database does not allow the query "SELECT * FROM '<database_name>'.SVV_EXTERNAL_SCHEMAS"
Revision no. 57597
Author: harish.khattri
Hi Tom,
I fixed the issue with Sequence file in the Hive metastore. Actually the output format value for Hive metastore and for data catalog are different. So, I followed the HiveColumnUtil for different output format values for Sequence and Text file formats. Now it will identify the Sequence file correctly for both.
Internal Database Connection and External Schema Relationship:
As per the Amazon Redshift Documentation, The CREATE EXTERNAL SCHEMA query creates a new external schema in the current database. The relationship between Internal Database Connection and External Schema is not well defined in the AWS Redshift documentation. Please find below my observations related to the relationship:
There are system views to get the information about External Database, External Schema and External Table named SVV_EXTERNAL_DATABASES, SVV_EXTERNAL_SCHEMAS and SVV_EXTERNAL_TABLES. Each internal database has its own copy of these system views.
When we create an external database then its entry will be added in SVV_EXTERNAL_DATABASES view for each Internal database connection. However, for external schema and external table it is not the case. When we create an external schema and/or external table then its entry will be added to the system views only for connected internal database. Hence, we are getting different results for the same query "SELECT * FROM SVV_EXTERNAL_SCHEMAS" for different internal database connection.
I also not figured out the way to get all external schemas irrespective of the internal database connection. If you remembered, we discussed this scenario when I started implementation of Create External Schema. Then we decided to use original connection in the External Schemas node from the connection properties. The Amazon Redshift database does not allow the query "SELECT * FROM '<database_name>'.SVV_EXTERNAL_SCHEMAS"
Revision no. 57597
Author: harish.khattri
@Tom,
Observation:
1. When we execute the "Select Top 1000 Rows" queries for some of the hive tables, i am getting this error - Screenshot1
Is it a bug or the AWS behavior?
2. When we execute the "Select Top 1000 Rows" queries for some of the AVRO table (data catalog), i am getting this error- Screenshot2
Thanks,
Bhushan
@Tom,
Observation:
1. When we execute the "Select Top 1000 Rows" queries for some of the hive tables, i am getting this error - Screenshot1
Is it a bug or the AWS behavior?
2. When we execute the "Select Top 1000 Rows" queries for some of the AVRO table (data catalog), i am getting this error- Screenshot2
Thanks,
Bhushan
Hi Bhushan,
Not sure what the issues are. Probably either s3 permissions or underlying data was deleted. Remember, that this is a little more complicated then creating a table in a database. The underlying infrastructure in s3 has to be set up - buckets, folders...etc.. Once a table is created and stored in s3, permissions have to be given to access the data from redshift. I created a few emr hive tables in s3 for you to test with.
Hi Bhushan,
Not sure what the issues are. Probably either s3 permissions or underlying data was deleted. Remember, that this is a little more complicated then creating a table in a database. The underlying infrastructure in s3 has to be set up - buckets, folders...etc.. Once a table is created and stored in s3, permissions have to be given to access the data from redshift. I created a few emr hive tables in s3 for you to test with.
Hi Tom,
Yes, it is a good idea to add tooltips as user will know which database he/she is connected. I added tooltips on the External Schemas and External Tables node.
Revision no. 57600
Author: harish.khattri
Hi Tom,
Yes, it is a good idea to add tooltips as user will know which database he/she is connected. I added tooltips on the External Schemas and External Tables node.
Revision no. 57600
Author: harish.khattri
Hi Tom,
I have executed the queries you have provided, those are working fine. I have verified on the ADS-34 build.
Thanks,
Bhushan
Hi Tom,
I have executed the queries you have provided, those are working fine. I have verified on the ADS-34 build.
Thanks,
Bhushan
Revision no. 57608
Author: harish.khattri
Revision no. 57608
Author: harish.khattri
Verified with build : 20.6 RC-2
Able to create External Database, External Schema and External Table on Amazon Redshift.
Schema tree structure for External Database , External Schema and External Table is available.
*) An external database only be created when creating an external schema. (exdbschema.jpg)
*) Cannot a create external schema and external table with same name on any external database available, since all data is stored in same location.
*) An external database can be dropped , while dropping an external schema with an option
"Drop External Database" on Drop: External schema window. (exdbdropschema.jpg)
*) An External Schema can be created in two kind
"Data Catalog" and "Hive Metastore" ( exschemadc.jpg and exschemahm.jpg)
*) Options available for external database ( exdatabase.jpg )
*) Options available for external schema ( exschemeprop.jpg )
*) Options available for external table ( extableprop.jpg )
*) An External Table can be created with multiple options ( extable.jpg, extablestformat.jpg and extablestoredas.jpg )
Storage format: Row Format Delimited and Row Format SerDe
Stored As: PARQUET, RCFILE, SEQUENCEFILE, TEXTFILE, ORC, AVRO AND OTHER
Verified with build : 20.6 RC-2
Able to create External Database, External Schema and External Table on Amazon Redshift.
Schema tree structure for External Database , External Schema and External Table is available.
*) An external database only be created when creating an external schema. (exdbschema.jpg)
*) Cannot a create external schema and external table with same name on any external database available, since all data is stored in same location.
*) An external database can be dropped , while dropping an external schema with an option
"Drop External Database" on Drop: External schema window. (exdbdropschema.jpg)
*) An External Schema can be created in two kind
"Data Catalog" and "Hive Metastore" ( exschemadc.jpg and exschemahm.jpg)
*) Options available for external database ( exdatabase.jpg )
*) Options available for external schema ( exschemeprop.jpg )
*) Options available for external table ( extableprop.jpg )
*) An External Table can be created with multiple options ( extable.jpg, extablestformat.jpg and extablestoredas.jpg )
Storage format: Row Format Delimited and Row Format SerDe
Stored As: PARQUET, RCFILE, SEQUENCEFILE, TEXTFILE, ORC, AVRO AND OTHER
Issue #15455 |
Verified |
Fixed |
Resolved |
Completion |
No due date |
Fixed Build ADS 20.6.0-dev-32-no-ofsc |
No time estimate |
1 issue link |
relates to #15773
Issue #15773Amazon Redshift - Add schema script generator support for external tables |
I created an external schema vn_spectrum and table in Redshift/Spectrum.
RETURNS THE LIST OF ALL DATABASES CREATED IN ATHENA