340 likes | 465 Views
Finding and Reporting Postgres Bug #7553. By: Lloyd Albin 9/28/2012. The Project.
E N D
Finding and ReportingPostgres Bug #7553 By: Lloyd Albin 9/28/2012
The Project Restructure the database by moving tables, views, and functions to new schemas and sometimes renaming those tables, views, or functions at the same time. Some tables can’t be moved so you need to create views in the new schema to reference the old tables and then update any view to reference the new view instead of the table.
Stage 1, moving items around The first task is to move the tables, views and functions to there new locations and rename them is necessary. We must rename them first to keep from having two items with the same name in the new schemas. This is done by the following simple Postgres commands. ALTER TABLE schema.table RENAME TO new_table; ALTER TABLE schema.table SET SCHEMA new_schema;
Stage 2, Creating Views Some tables can’t be moved so you need to create views in the new schema to reference the old tables. CREATE OR REPLACE VIEW new_schema.new_viewAS SELECT * FROM old_schema.old_table;
Stage 3, Updating existing Views This is the hard part. We need to write a function to update existing views to reference the new views instead of the old tables. To do this part I wrote a plpgsql function that reads the view definition from information_schema.views and modfiesby using regexp_replace to update the references. At the same time I added to the start of the view the ‘CREATE OR REPLACE VIEW schema.view AS ’ and then EXECUTE the new view thereby saving it.
The Problem I found that some views during the EXECUTE were failing. So figuring the problem must be within the function that I just wrote, time to start debugging my function. To make debugging easier, I added two lines to show me the DDL for the old and new views. RAISE NOTICE ‘Old View \n%’, old_view; RAISE NOTICE ‘New View \n%’, new_view;
Comparing the new and old views What I found was that the function was changing the code properly between the old and new views. FROM old_schema.old_table to FROM new_schema.new_view old_table.field to new_view.field
Trying the SELECT Statement I tried to execute the new SELECT statement and it also failed with error messages. In trying to figure out the problem, I ended up trying to execute the old SELECT statement and again it failed with error messages. This made me wonder if I had the correct view definition, so I then looked at the old view definition with a GUI tool. It looked the same. I clicked on the Data tab, and data appeared.
Realizing it is a Bug This is becoming really strange, how can a view work, but the SELECT statement not be executable outside of the view. At this point, I knew it was a Postgres bug, now to figure out what is going wrong.
Looking closely at the Views When I started looking closely at the views, I noticed something interesting. When Postgres updates the view during the ALTER TABLE commands, it does so using the alias syntax. The following two views show an examples of with and without using the alias syntax.
Alias Syntax Without Alias Syntax With Alias Syntax SELECT schema.table.field, schema2.table.field FROM schema.table LEFT JOIN schema2.table ON schema.table.field = schema2.table.field; SELECT a.field, b.field FROM schema.table a LEFT JOIN schema2.table b ON a.field= b.field;
Benefits of Alias Syntax As you can see from those two views, the Alias syntax can sometimes make it easier to read the view. The problem is that Postgres did not update the view correctly. The following two views show how Postgres would update the view.
How Postgres updated the view Pre ALTER commands Post ALTER commands SELECT schema.table.field, schema2.table.field FROM schema.table LEFT JOIN schema2.table ON schema.table.field = schema2.table.field; SELECT table.field, table.field FROM schema.table table LEFT JOIN schema2.table table ON table.field= table.field;
The Problem with the Alias Syntax The problem with the “Post ALTER commands’ version is that you can’t tell which table is being referenced via the Alias.
How to Submit a bug report • Please ensure you have read the bug reporting guidelines before reporting a bug. In particular, please re-read the documentation to verify that what you are trying is possible. If the documentation is not clear, please report that, too; it is a documentation bug. If a program does something different from what the documentation says, that is also a bug. • Poor performance is not necessarily a bug. Read the documentation or ask on one of the mailing lists for help in tuning your applications. Failing to comply to the SQL standard is not necessarily a bug either, unless compliance for the specific feature is explicitly claimed. • Before you continue, check on the TODO list and in the FAQ to see if your bug is already known. If you cannot decode the information on the TODO list, report your problem so we can clarify the TODO list. • To report a security bug, please send an email to security@postgresql.org. All other bugs will be forwarded to the pgsql-bugs mailing list where they will be publicly archived. • Make sure you are running the latest available minor release for your major version before reporting a bug. The current list of supported versions is 9.2.0, 9.1.5, 9.0.9, 8.4.13, 8.3.20. • This bug report form should only be used for reporting bugs and problems with the PostgreSQL database. Problems with database connectors such as ODBC and JDBC, graphical administration tools such as pgAdmin or other external projects should not be reported here; please report to those projects directly. For products closely connected with PostgreSQL, there may be an appropriate mailing list available.
Latest Version of Postgres This is a problem, we just can’t upgrade our server without testing all of our application, etc. So to be able to report this bug I needed to find a way to test it on the current version of Postgres 9.0.9 since we are running 9.0.7. What I came up with was a test script that I could run on any version of Postgres.
Setting up the Test Script CREATE SCHEMA schema_a; CREATE SCHEMA schema_b; CREATE SCHEMA schema_c; CREATE TABLE schema_a.table_a ( id varchar(11), field1 varchar(10) ); CREATE TABLE schema_b.table_a ( id varchar(11), field1 varchar(10) ); CREATE TABLE schema_b.table_b ( id varchar(11) ); INSERT INTO schema_a.table_a VALUES ('test1', 'test2'); INSERT INTO schema_b.table_a VALUES ('test1', 'test3'); INSERT INTO schema_b.table_b VALUES ('test1');
The View CREATE OR REPLACE VIEW public.view_aAS SELECT schema_a.table_a.field1, schema_b.table_a.field1 AS field2 FROM schema_b.table_b LEFT JOIN schema_b.table_a ON schema_b.table_b.id = schema_b.table_a.id LEFT JOIN schema_a.table_a ON schema_a.table_a.id = schema_b.table_b.id;
Testing the View The first test of the view is to make sure a SELECT on the view works. SELECT * FROM public.view_a; Results: "field1" "field2" "test2" "test3"
How to test the view definition CREATE OR REPLACE FUNCTION schema_c.function_a () RETURNS void AS $$ DECLARE def_row RECORD; BEGIN SELECT definition INTO def_row FROM pg_catalog.pg_views WHERE viewname = 'view_a'; EXECUTE def_row.definition; END; $$ LANGUAGE plpgsql;
Testing the View Definition The second test of the view makes sure the view definition works. SELECT * FROM schema_c.function_a(); The real result is no error messages from the function, but behind the scene we are getting this result. Results: "field1" "field2" "test2" "test3"
Causing the Bug to happen ALTER TABLE schema_a.table_a RENAME TO table_d; ALTER TABLE schema_a.table_d SET SCHEMA schema_c; ALTER TABLE schema_b.table_a RENAME TO table_e; ALTER TABLE schema_b.table_e SET SCHEMA schema_c; ALTER TABLE schema_b.table_b RENAME TO table_f; ALTER TABLE schema_b.table_f SET SCHEMA schema_c; -- Use with Postgres 8.3 --ALTER TABLE public.view_aSET SCHEMA schema_c; -- Use with Postgres 8.4+ ALTER VIEW public.view_aSET SCHEMA schema_c;
Testing the View Now to test the view again. SELECT * FROM schema_c.view_a; Results: "field1" "field2" "test2" "test3"
Testing the View Definition Testing the view definition again. SELECT * FROM schema_c.function_a(); Results: ERROR: invalid reference to FROM-clause entry for table "table_a“ LINE 1: ...hema_c.table_etable_a ON (((table_b.id)::text = (schema_c.t... HINT: There is an entry for table "table_a", but it cannot be referenced from this part of the query. QUERY: SELECT schema_c.table_a.field1, schema_c.table_a.field1 AS field2 FROM ((schema_c.table_ftable_b LEFT JOIN schema_c.table_etable_a ON (((table_b.id)::text = (schema_c.table_a.id)::text))) LEFT JOIN schema_c.table_dtable_a ON (((schema_c.table_a.id)::text = (table_b.id)::text))); CONTEXT: PL/pgSQL function schema_c.function_a() line 6 at EXECUTE statement
The Old View Code (Review) CREATE OR REPLACE VIEW public.view_a AS SELECT schema_a.table_a.field1, schema_b.table_a.field1 AS field2 FROM schema_b.table_b LEFT JOIN schema_b.table_a ON schema_b.table_b.id = schema_b.table_a.id LEFT JOIN schema_a.table_a ON schema_a.table_a.id = schema_b.table_b.id;
The New View Code CREATE VIEW schema_c.view_a( field1, field2) AS SELECT table_a.field1, table_a.field1 AS field2 FROM schema_c.table_ftable_b LEFT JOIN schema_c.table_etable_a ON table_b.id::text = schema_c.table_a.id::text LEFT JOIN schema_c.table_dtable_a ON schema_c.table_a.id::text = table_b.id::text;
Submitting the Bug 9/17/2012 To submit a bug: http://www.postgresql.org/support/submitbug/ Or on the front page of the Postgres website under shortcuts, click on “Report a Bug”. The bug will not show up on the list until it has been approved. Mine showed up about 6-7 hours later.
First Response to the Bug Report 9/18 This appears to be a variant of the what-alias-to-use-after-a-rename problem that we were discussing last winter: http://archives.postgresql.org/message-id/29791.1327718297@sss.pgh.pa.us That thread kinda died off after coming to the conclusion that there isn't a simple, obviously correct solution. The code's current behavior is definitely wrong in the face of renames, but I don't want to change it in a way that just moves the failure cases elsewhere (which is what would happen for any of the proposals made in the thread :-(). regards, tom lane
Progress on the Hackers List 9/18/2012 – Tom talks about various ways to fix this bug. 9/20/2012 – Tom wrote the first fix for the bug.
Third Note in Hackers List 9/21/2012 • ... In the attached I've hacked around this by causing the planner to > assign new aliases to RTEs that it replaces in this way (see planagg.c > and prepunion.c diffs). This seems like a bit of a kluge, but it > doesn't take much code. An alternative that I'm considering is to > have EXPLAIN make a pre-pass over the plan tree to identify which > RTEs will actually be referenced, and then consider only those RTEs > while assigning aliases. This would be a great deal more code though, > and code which would require maintenance every time we add plan node > types etc. So I'm not sure it's really a better answer. Thoughts? Attached is a second draft that does it like that. This adds about 130 lines to explain.c compared to the other way, but on reflection it's probably a better solution compared to trying to kluge things in the planner. The change in the select_views results shows that there's at least one other case of duplicated RTE names that I'd not covered with the two planner kluges. I think the next question is whether we want to back-patch this. Although the problem with incorrect view dumping is arguably a data integrity issue (cf bug #7553), few enough people have hit it that I'm not sure it's worth taking risks for. I'd feel better about this code once it'd got through a beta test cycle. Comments? regards, tom lane
Postgres Build Farm 9/22/2012, PG 9.3 Improve ruleutils.c's heuristics for dealing with rangetable aliases. The previous scheme had bugs in some corner cases involving tables that had been renamed since a view was made. This could result in dumped views that failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd Albin, as well as in some pgsql-hackers discussion back in January. Also, its behavior for printing EXPLAIN plans was sometimes confusing because of willingness to use the same alias for multiple RTEs (it was AshutoshBapat'scomplaint about that aspect that started the January thread). To fix, ensure that each RTE in the query has a unique unqualified alias, by modifying the alias if necessary (we add "_" and digits as needed to create a non-conflicting name). Then we can just print its variables with that alias, avoiding the confusing and bug-prone scheme of sometimes schema-qualifying variable names. In EXPLAIN, it proves to be expedient to take the further step of only assigning such aliases to RTEs that are actually referenced in the query, since the planner has a habit of generating extra RTEs with the same alias in situations such as inheritance-tree expansion. Although this fixes a bug of very long standing, I'm hesitant to back-patch such a noticeable behavioral change. My experiments while creating a regression test convinced me that actually incorrect output (as opposed to confusing output) occurs only in very narrow cases, which is backed up by the lack of previous complaints from the field. So we may be better off living with it in released branches; and in any case it'd be smart to let this ripen awhile in HEAD before we consider back-patching it.
SQL as entered in version 9.3 Entered 9.3 DDL CREATE OR REPLACE VIEW public.view_aAS SELECT schema_a.table_a.field1, schema_b.table_a.field1 AS field2 FROM schema_b.table_b LEFT JOIN schema_b.table_a ON schema_b.table_b.id = schema_b.table_a.id LEFT JOIN schema_a.table_a ON schema_a.table_a.id = schema_b.table_b.id; CREATE OR REPLACE VIEW public.view_a (field1, field2) AS SELECT table_a_1.field1, table_a.field1 AS field2 FROM schema_b.table_b LEFT JOIN schema_b.table_a ON table_b.id::text = table_a.id::text LEFT JOIN schema_a.table_atable_a_1 ON table_a_1.id::text = table_b.id::text;
SQL after ALTERS in version 9.3 Pre ALTER 9.3 DDL Post ALTER 9.3 DDL CREATE OR REPLACE VIEW public.view_a(field1, field2) AS SELECT table_a_1.field1, table_a.field1 AS field2 FROM schema_b.table_b LEFT JOIN schema_b.table_a ON table_b.id::text = table_a.id::text LEFT JOIN schema_a.table_a table_a_1 ON table_a_1.id::text = table_b.id::text; CREATE OR REPLACE VIEW schema_c.view_a (field1, field2) AS SELECT table_d.field1, table_e.field1 AS field2 FROM schema_c.table_f LEFT JOIN schema_c.table_e ON table_f.id::text = table_e.id::text LEFT JOIN schema_c.table_d ON table_d.id::text = table_f.id::text;
Afterward 4 days after submission, they had it fixed in the source code for Postgres 9.3 As part of a presentation on how to compile Postgres source code on a Raspberry PI, I compiled Postgres 9.3 and was able to test how the new code will work as shown on the two preceding slides. I also really like the way the new code stops using alias’s when they were no longer needed, thereby cleaning up the DDL. Many thanks to Tom Lane for the source code changes.