infile datalines end=endfile

Post by Peetie Wheatstr » Fri, 20 Sep 2002 00:32:07



Greetings,

This has bothered me for quite some time ...

I often need to test/control for end-of-file when
programming data steps using in-stream data.

If I code:

data _null_;
  infile datalines eof=endfile;
  input x;
  sx + x;
  return;
 endfile:
  put _all_;
  return;
datalines;
2
3
;
run;

everything works fine.

But I more commonly need something like:

data _null_;
  infile datalines end=endfile;
  input x;
  sx + x;
  if endfile then put _all_;
datalines;
2
3
;
run;

Which yields:

"WARNING: The value of the INFILE END= option cannot
 be set for CARDS or DATALINES input."

Clearly (from the first data step) SAS has no problem
identifying the last record. Why won't "end=endfile"
in the second data step work?

Note that the doc states that it won't work, so it
can't be considered a 'bug'.

Is it a design error that got nicely doc'd? Or is there
another reason why it is dysfunctional?

  Thanx,
  Peetie

Post by diskin.den.. » Fri, 20 Sep 2002 01:16:51


Peetie,

I'll try to make my understanding of this clear, but forgive me if I fall
short.

If end= is specified for a SAS dataset, SAS sets the variable when it
detects that it has read the last record. This is triggered by something
in the SAS dataset, either a count or a last-record flag; I'm not sure
which.

When reading an external (non-SAS) dataset, SAS cannot try to read the
next record until the program calls for it. Among the many reasons for
this behavior is that the input stream may be interactive and dependent on
the processing.
If all external input were from static flat files, something like this could
be implemented.

HTH,
Dennis Diskin


Post by Ian Whitlo » Fri, 20 Sep 2002 03:06:52


Peetie,

You are lucky. When I did it years ago, I lost many hours debugging a messy
DATA step because there was no warning.  Remember that the two options do
different things: the END= variable is set when the last line is read, while
the jump to the EOF= label is made when you attempt to read beyond the last
record.  END= requires a test in user code, while the EOF= jump is automatic
from any appropriate INPUT statement.
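
To make the timing difference concrete, here is a minimal sketch that is not
part of the original post.  It assumes a scratch fileref called tmp and the
names last and done, all of which are made up for illustration.  It reads the
same two records once with END= and once with EOF= and writes _N_ to the log
so the iteration on which each mechanism fires is visible.

```sas
/* Assumption: TMP is a hypothetical scratch fileref */
filename tmp temp;

/* Write the two sample records to the scratch file */
data _null_;
  file tmp;
  put '2' / '3';
run;

/* END=: the flag is tested by user code on every iteration */
data _null_;
  infile tmp end=last;
  input x;
  put 'END= step: ' _n_= x= last=;
run;

/* EOF=: the jump happens when INPUT tries to read past the last record */
data _null_;
  infile tmp eof=done;
  input x;
  put 'EOF= step: ' _n_= x=;
  return;
done:
  put 'EOF= label reached: ' _n_= x=;
run;
```

In the END= step one would expect last to be 1 on the iteration that reads the
second record.  In the EOF= step the label should run one iteration later,
after the failed read, which is why x is no longer the last value read there.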

Here are two ways to get close to your comfortable "homey" code.

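/* method 1: place macro in autocall lib */
%macro endfile (eof=eof) ;
   return ;             /* normal records: end the iteration here */
   &eof:                /* target of the INFILE EOF= jump */
   retain endfile 0 ;   /* flag starts at 0 and keeps its value */
   endfile = 1 ;
   goto endtest ;       /* go back and run the end-of-file test */
%mend  endfile ;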

/* little massage of your DATA step */
data _null_;
  infile datalines eof=eof;
  input x;
  sx + x;
  endtest:
  if endfile then put _all_;
  %endfile()
datalines;
2
3
;

/* method 2 */
filename temp catalog "work.temp.datalines.source" ;
data _null_ ;
   file temp ;
   input ;
   put _infile_ ;
cards ;
2
3
;

data _null_;
  infile temp end=endfile;
  input x;
  sx + x;
  if endfile then put _all_;
run ;

Now ask yourself the big question.  Since EOF= gives you more flexible
control and works all the time, why shouldn't you standardize on it?  Well, in
your example it costs 4 extra lines of code.  Now I would ask, is 4 lines of
code significant in the amount of code you write in a day?  (In truth one
has to be a little bit careful when using the default OUTPUT statement, but
once bitten you learn.)
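
One possible reading of the OUTPUT caveat, offered as an assumption rather
than as the author's own example: when the step builds a dataset, the code at
the EOF= label can trigger the implicit OUTPUT as well and add an unwanted
extra observation.  The sketch below uses a made-up dataset name, sums, and a
made-up label, done, and ends the label section with STOP so that no extra
observation is written.

```sas
/* Hypothetical illustration: the names SUMS and DONE are made up */
data sums;
  infile datalines eof=done;
  input x;
  sx + x;
  return;      /* implicit OUTPUT: one observation per record */
done:
  put sx=;     /* end-of-file processing */
  stop;        /* end the step without another implicit OUTPUT */
datalines;
2
3
;
run;
```

If the label section ended with RETURN or simply ran off the end of the step,
the implicit OUTPUT would fire once more, with x missing.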

In practice I found method 2 works quite well when I wrote the code before
thinking about how to test it and did not want to switch to EOF=.  But now I
tend to ask whenever I write

   infile something end=...

Is this what I want to do?


Post by Peetie Wheatstr » Sat, 21 Sep 2002 00:45:28


On Wed, 18 Sep 2002 14:06:52


>Peetie,

>You are lucky, when I did it years ago, I lost many hours debugging a messy
>DATA step because there was no warning.

Not even an uninitialized var note?

>Remember the options do different
>things - the END= variable is set when the last line is touched,
>the jump to EOF= is made when you attempt to read beyond the last record.

Both were designed to signal an end-of-file condition?

>The END= requires a user code test, while the EOF= jump is automatic from
>any appropriate INPUT statement.

>Here are two ways to get close to your comfortable "homey" code.

>/* method 1: place macro in autocall lib */
>%macro endfile (eof=eof) ;
>   return ;
>   &eof:
>   retain endfile 0 ;
>   endfile = 1 ;
>   goto endtest ;
>%mend  endfile ;

>/* little massage of your DATA step */
>data _null_;
>  infile datalines eof=eof;
>  input x;
>  sx + x;
>  endtest:
>  if endfile then put _all_;
>  %endfile()
>datalines;
>2
>3
>;

Potentially unneeded complexity, here?

>/* method 2 */
>filename temp catalog "work.temp.datalines.source" ;
>data _null_ ;
>   file temp ;
>   input ;
>   put _infile_ ;
>cards ;
>2
>3
>;

>data _null_;
>  infile temp end=endfile;
>  input x;
>  sx + x;
>  if endfile then put _all_;
>run ;

Potentially unnecessary i/o, here?

Oooops. I fear the "hominess" has evaporated. :-)

Still, it's good code. It's just kludgy ...

>Now ask yourself the big question.  Since EOF gives you more flexible
>control and works all the time why shouldn't you standardize on it?

Never mind me. Why hasn't sas-l standardized on it? I rarely
see it here.

>Well in
>your example it cost 4 extra lines of code.  Now I would ask, is 4 lines of
>code significant in the amount of code you write in a day?

If there weren't many, many other things to test/do, it
would not be very significant.

If, in the alleged real world, everyone had to code four lines
when one would've done for every piddling thing, perhaps many
(myself included) might have considered an exciting and
challenging career in, say, ditch-digging. Or perhaps politics. :-)

>(In truth one
>has to be a little bit careful when using the default OUTPUT statement, but
>once bitten you learn.)

>In practice I found method 2 works quite well when I wrote the code before
>thinking about how to test it and did not want to switch to EOF=.  But now I
>tend to ask whenever I write

>   infile something end=...

>Is this what I want to do?

It is a fair and advisable question to ask. There can be
many considerations when structuring a data step.

One such consideration for many is parsimony of code.
Why do a branch (transfer-of-control) when one is
potentially unneeded? Don't we need the flexibility
to be able to choose either? Reading from a file -or-
from in-stream data?

The above begs the central question of my query. Is there
any good reason why "infile datalines end=endfile;"
should be dysfunctional given that SAS can detect the
last record of the in-stream data?

If such a good reason exists, I seek enlightenment.
I'm at a loss to discover one myself. If no one comes up
with one, I suppose I'll assume that this is a small
instance of "counter-productive engineering" that made
it through the docs and is likely to remain unchanged
forever.

  Thanx,
  Peetie
