Redshift__在一個外部架構下建立外部表後,其他外部架構也自動生成了一樣的外部表

lusklusklusk發表於2023-04-21

官方檔案



Amazon Redshift 外部 schema 引用了外部資料目錄中的外部資料庫。您可在 Amazon Redshift、Amazon Athena、AWS Glue Data Catalog 或在 Apache Hive 元儲存(如 Amazon EMR)中建立外部資料庫。

只有建立external schema時使用不同的database,則在這些不同的external schema下建立一個外部表,其他的external schema不會自動出現生成相同的外部表,也就是說,其實external table外部表是和external database對應的,同一個external database下所有的external schema有相同的external table
create external schema spectrum_schema from data catalog
database 'spectrum_db'
iam_role 'arn:aws:iam::123456789012:role/MySpectrumRole'
create external database if not exists;




官方技術支援的回覆
Dear Customer,

Warm greetings from AWS Support Engineering. I hope you are doing well. Thank you for reaching out to us with your concern. My name is Gurpreet and I will be helping you with your case.

From the case notes and our discussion, I could understand that you found an issue where, when user creates or drops an external table under specific external schema (like spectrum), then other external schema will also generate or lose the same table. Please correct me if I misunderstood your concern in any way.

In this context, I would like to inform you that I have tested a similar scenario at my test environment and was able to replicate the similar scenario at my end. Please find below the quick summary steps:

1. Created 2 external schemas over the same Athena database - spectrum_schema_1 & spectrum_schema_2

2. Created an external table in spectrum_schema_1 and it was created in both the external schemas:

dev=# create external table spectrum_schema_1.newtbl(id int, doj varchar) location 's3://<test-buck>/f1';
CREATE EXTERNAL TABLE

dev=# select redshift_database_name, schemaname, tablename, location from svv_external_tables where tablename='newtbl';
 redshift_database_name |    schemaname     | tablename |      location
------------------------+-------------------+-----------+--------------------
 dev                    | spectrum_schema_1 | newtbl    | s3://<test-buck>/f1
 dev                    | spectrum_schema_2 | newtbl    | s3://<test-buck>/f1
(2 rows)


2. Dropped the external table from spectrum_schema_1 and it was dropped from both the external schemas:

dev=# drop table spectrum_schema_1.newtbl;
DROP TABLE

dev=# select redshift_database_name, schemaname, tablename, location from svv_external_tables where tablename='newtbl';
 redshift_database_name | schemaname | tablename | location
------------------------+------------+-----------+----------
(0 rows)

===================

However, the above behavior is not observed if the external schemas are created over two different Athena databases. Please find below the quick summary steps:

1. Created another external schema over a different Athena database - spectrum_schema_3.

2. Created an external table in spectrum_schema_3 and it is only created in one external schema (i.e. - spectrum_schema_3):

dev=# select redshift_database_name, schemaname, tablename, location from svv_external_tables where tablename='newtbl';
 redshift_database_name |    schemaname     | tablename |      location
------------------------+-------------------+-----------+--------------------
 dev                    | spectrum_schema_3 | newtbl    | s3://<test-audit-logs>/f1/logs
(1 row)


Therefore, from the above replications it can be inferred that in case the external schemas over the same database and we create a table in one of the external schema then it is created in other external schemas as well because the external table is being created at the Athena database from where other schemas are fetching/reading the data.

However, in case the external schemas created over the different Athena databases, it doesn’t exhibit the above behavior because the external table is being created in the Athena database which is not scanned by other external schema. So, I would like to inform you that the behavior being observed is expected.

With that said, in case you have any other queries/concerns related to this issue. I would be happy to hop on a call to discuss this further. And, I request you to please provide your available time slots (Preferably 2-3 time slots) along with timezone so that we can schedule a call at your convenience.

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/30126024/viewspace-2947503/,如需轉載,請註明出處,否則將追究法律責任。

相關文章