Idea for making COPY data Microsoft-proof

Post by Tom Lane » Wed, 13 Feb 2002 01:23:02




> Can you do something akin to what you did with the binary output - but in
> this case allow for no details.

This strikes me as solving an entirely different issue -- with great
loss of backwards compatibility.  Possibly these are good ideas, but for
the moment I'd like to keep this thread focused on the issue of coping
with newline translations.
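
To make the newline-translation problem concrete, here is a minimal Python sketch. It is not the proposal under discussion (which is not reproduced in this thread excerpt); the function names and sample data are hypothetical. It only shows what happens to tab-delimited text when every LF is turned into CRLF, and how a reader that strips one trailing CR per line recovers the rows.

```python
def split_rows_strict(data: bytes) -> list:
    """Split tab-delimited rows on LF only, as a strict reader might."""
    return [line.split(b"\t") for line in data.split(b"\n") if line]


def split_rows_tolerant(data: bytes) -> list:
    """Same as above, but drop one trailing CR from each line first."""
    rows = []
    for line in data.split(b"\n"):
        if not line:
            continue
        if line.endswith(b"\r"):
            line = line[:-1]
        rows.append(line.split(b"\t"))
    return rows


# Hypothetical two-row dump written with LF line endings.
original = b"1\talice\n2\tbob\n"

# What an LF -> CRLF translation does to the file in transit.
translated = original.replace(b"\n", b"\r\n")

strict = split_rows_strict(translated)      # last field of each row keeps the CR
tolerant = split_rows_tolerant(translated)  # rows match the original again
```

Note the limitation: a tolerant reader of this kind cannot tell a CR added in transit from a CR that was genuinely the last byte of a field, unless the dump format escapes such bytes.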

(In any case, I thought someone was already working on an optional
column-name-list clause for COPY, which would solve that problem in what
seems a cleaner fashion.)

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

 
 
 

Idea for making COPY data Microsoft-proof

Post by Philip Warner » Wed, 13 Feb 2002 09:11:54




>This strikes me as solving an entirely different issue

Well, a related issue: you are talking about (slightly) changing the
encoding of dumped data, so why not therefore allow for an (optional) header
with full encoding details?  Column headers are just a minor bonus.

>- with great
>loss of backwards compatibility.

Not if anything without the 'WITH HEADERS' is treated as per current.
Anyone having M$ problems can use the new format (and pg_dump could use it
always).

>the moment I'd like to keep this thread focused on the issue of coping
>with newline translations.

No big deal, but this seems to be an encoding issue; and it seems like a
good idea to formalize it somehow.

----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \

Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


 
 
 

Idea for making COPY data Microsoft-proof

Post by Philip Warner » Wed, 13 Feb 2002 09:23:26



>To take just one problem: how do I know that
>the first line is metadata, and not data that happens to look exactly
>like whatever my metadata layout is?

You don't, which is why you need the 'WITH HEADER' or 'WITH ENCODING'
clause on COPY. I guess COPY could issue a warning when you do not say WITH
HEADER and it looks like a valid header.
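
A rough Python sketch of that warning idea follows. Everything in it is hypothetical: the function names, the assumption that a header is simply the tab-separated column names, and the warning text. COPY has no such check as far as this thread states.

```python
from typing import List, Optional


def looks_like_header(first_line: str, column_names: List[str],
                      delimiter: str = "\t") -> bool:
    """Return True if the first line is exactly the column names, delimited."""
    fields = first_line.rstrip("\r\n").split(delimiter)
    return fields == column_names


def check_first_line(first_line: str, column_names: List[str],
                     with_header: bool) -> Optional[str]:
    """Return a warning string when a header-like line arrives unannounced."""
    if not with_header and looks_like_header(first_line, column_names):
        return "WARNING: first line looks like a header, but WITH HEADER was not given"
    return None


columns = ["name", "city"]

# A real header, loaded without the clause: the heuristic warns.
warning = check_first_line("name\tcity\n", columns, with_header=False)

# An ordinary data row: no warning.
no_warning = check_first_line("alice\tperth\n", columns, with_header=False)
```

The heuristic cannot be more than a warning: a table with two text columns may legitimately contain the row ('name', 'city'), which is byte-for-byte identical to the header line above.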

Other than that, it's a case of storing information about the dumped data,
not the database schema, in the data file. I'm not particularly attached to
the column names being there, but it does seem useful to store
instructions indicating how the file is formatted.


 
 
 

Idea for making COPY data Microsoft-proof

Post by Tom Lane » Wed, 13 Feb 2002 09:23:26



> No big deal, but this seems to be an encoding issue; and it seems like a
> good idea to formalize it somehow.

Well, currently there is a strict separation between COPY data (in the
file) and metadata (supplied as parameters to the COPY command).  I'm
not eager to revisit that decision.  What you seem to be suggesting is
shoving metadata into the data file, but I think that will create more
problems than it solves.  To take just one problem: how do I know that
the first line is metadata, and not data that happens to look exactly
like whatever my metadata layout is?

                        regards, tom lane


 
 
 

Idea for making COPY data Microsoft-proof

Post by Brent Vern » Wed, 13 Feb 2002 13:01:59


[2002-02-11 11:11] Tom Lane said:

| > Can you do something akin to what you did with the binary output - but in
| > this case allow for no details.
|
| This strikes me as solving an entirely different issue -- with great
| loss of backwards compatibility.  Possibly these are good ideas, but for
| the moment I'd like to keep this thread focused on the issue of coping
| with newline translations.
|
| (In any case, I thought someone was already working on an optional
| column-name-list clause for COPY, which would solve that problem in what
| seems a cleaner fashion.)

Yes, the work for a column list in COPY FROM is largely done.  I've
not been able to work on COPY TO, tho.

Part #1 of your original proposal is certainly the right thing to do.

I've backgrounded this problem for most of the day, and although I
know it's a severe change, your "stronger" solution seems like a
better change than the part #2, which just feels like something
that would only be undone later.  Both #2 and the "stronger" way
will require a SET option for absolute correctness; why not require
that SETting more often than not for any old-format dumps?  Yes, it
will affect a larger number of users, but we net a better dump
format for that pain.

cheers.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman


 
 
 

1. Making the regression tests locale-proof

Since locale support is now enabled by default, it is desirable that the
regression tests can pass if the cluster's locale is not C.

As a first step I have included the following statements in pg_regress
right after the database is created:

alter database "$dbname" set lc_messages to 'C';
alter database "$dbname" set lc_monetary to 'C';
alter database "$dbname" set lc_numeric to 'C';
alter database "$dbname" set lc_time to 'C';

This gets rid of a boatload of failures related to number formatting.
For that purpose I have changed the permissions on these options to
USERSET.  (I'm still debating making lc_messages SUSET, because otherwise
users can screw with admins by changing the language of the log output all
the time.  Comments?)

The remaining issue is the sort order.  I think this can be solved for
practical purposes by creating two expected files for each affected test,
say char.out and char-locale.out.  The regression test driver would try
the first one and, if that fails, try the second one.
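
The two-expected-files fallback could look roughly like the Python sketch below. This is an illustration only: the helper name is hypothetical, and it compares file contents for exact equality instead of running diff as a real test driver would.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def first_matching_expected(actual: str, candidates: List[Path]) -> Optional[Path]:
    """Return the first expected file whose contents equal the actual output."""
    for path in candidates:
        if path.exists() and path.read_text() == actual:
            return path
    return None  # no candidate matched: the test fails


# Demonstration with made-up outputs for the "char" test.
with tempfile.TemporaryDirectory() as tmp:
    expected_c = Path(tmp) / "char.out"
    expected_locale = Path(tmp) / "char-locale.out"
    expected_c.write_text("A\nB\na\nb\n")        # ordering under the C locale
    expected_locale.write_text("a\nA\nb\nB\n")   # ordering under some other locale

    candidates = [expected_c, expected_locale]
    match = first_matching_expected("a\nA\nb\nB\n", candidates)
    matched_name = match.name if match else None

    miss = first_matching_expected("b\na\n", candidates)
```

Trying the plain file first keeps the C-locale case as the default, and the second file is consulted only when the first comparison fails.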

The assumption here is that all locales will choose the same sort order as
long as they're dealing only with the core 26 letters.  This does not have
to be true in theory, but I think it works for the vast majority of
practical cases.

We could also cut down the number of affected tests by making the
select_implicit and select_having tests not use mixed-case strings in the test
tables.  Then we have only char, varchar, and select_views left.
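
Why mixed-case strings matter can be seen in a small Python sketch. Sorting by code point behaves like the C locale for ASCII text; the case-folding key below is only a rough stand-in for a typical non-C collation, not a real locale, and the sample words are made up.

```python
def c_order(words):
    """Sort by code point, which matches C-locale ordering for ASCII text."""
    return sorted(words)


def locale_like_order(words):
    """Rough stand-in for a non-C collation: compare case-insensitively first."""
    return sorted(words, key=lambda s: (s.casefold(), s))


mixed = ["banana", "Apple", "cherry", "apple", "Banana"]
lower_only = ["banana", "apple", "cherry"]

mixed_c = c_order(mixed)                  # all uppercase-initial words come first
mixed_locale = locale_like_order(mixed)   # words are interleaved regardless of case

# With lowercase-only data both orderings agree, so one expected file suffices.
same_for_lower = c_order(lower_only) == locale_like_order(lower_only)
```

Under these assumptions the mixed-case list sorts differently in the two orderings, while the lowercase-only list does not, which is the effect the suggestion above relies on.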

Comments?


2. Distributed Transactions

3. Online PROOF of Microsoft's HOPELESSNESS

4. FoxPro under OS2

5. Boss wants proof that Microsoft still supports Foxpro

6. SQL text parse util

7. Microsoft really sucks (look at this proof)

8. Request help making duplicate copies of a database (without data)

9. Making a for loop parallel instead of serial - trolling for ideas

10. Microsoft SQL Server Password to be made Case Sensitive

11. MSSQL 2000, ODBC, tmpdb, locking - What changes Microsoft has made

12. VB and macros made in Microsoft Access