I want a way to find the _first_ header of each email/News
article, without using C or perl. Here's the scenario:
I have a text file with concatenated email and/or News articles. Each
article has a number of headers that probably vary from one article to
another, both in which headers are present and in what order
they appear. No assumptions are made about which email or News readers
have written to the file. It could look something like this:
[...]
Path: ...
Newsgroups: ...
Subject: ...
[more headers]
[News article]
From ...
Received: ...
Subject: ...
[more headers]
[email article]
Suppose _one_ of these headers is found in _every_ article, e.g. 'Subject:'.
This is the _only_ header that can be assumed to exist in every article.
I want to find the "start" of each article, i.e. the first header that
separates the article from the previous one. There's probably also an
empty line separating the articles, but I don't think that's guaranteed.
Note that the 'Subject:' header itself may be the first header!
I want a solution that finds the line numbers of the first header of
each article containing a 'Subject:' header. Alternatively, it may find
simply the header lines themselves without the line number. As input,
you could have either the line numbers of the existing 'Subject:' headers,
or the complete file in which case you'll have to find the 'Subject:'
headers first.
The solution should use awk, sed, grep or other "standard" UNIX commands
and should be written in a csh-compatible syntax (no flames please!).
I'm _not_ looking for solutions written in C, perl or other languages.
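To make the requirement concrete, here is a minimal sketch of one possible approach, not a definitive answer. It assumes a POSIX awk, and it assumes that a "header" is any line starting with 'Name:' or with an mbox-style 'From ', and that an unbroken run of such lines (plus indented continuation lines) forms one header block. The file name first_header.awk is made up for illustration. The awk program is kept in its own file so that running it is a plain 'awk -f' command line with no shell-specific quoting; the here-document is just one way to create that file.

```shell
# Write the awk program to a file (hypothetical name: first_header.awk).
cat > first_header.awk << 'EOF'
# ASSUMPTION: a header-like line is "Name:" at the start of the line,
# or an mbox-style "From " line.
/^[A-Za-z][A-Za-z0-9-]*:/ || /^From / {
    # First header-like line after a non-header line opens a new block.
    if (!inblock) { inblock = 1; start = NR; first = $0; done = 0 }
    # Report the block's first line once, when its Subject: header shows up.
    if (!done && /^Subject:/) { print start ": " first; done = 1 }
    next
}
# An indented, non-blank line continues a folded header; the block stays open.
inblock && /^[ \t]+[^ \t]/ { next }
# Anything else (body text, blank line) closes the current block.
{ inblock = 0 }
EOF
```

Running 'awk -f first_header.awk file' would then print one 'linenumber: header' line per article. A known weakness of this sketch: if a body line that happens to look like 'Word: ...' sits directly above the next article's headers with no empty line in between, it is mistaken for the first header.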
Thanks in advance!
--
_______ __ __ _ _ ,------- Michagon -------.
/ _____|\ | \/ (_)___| |__ ___ __ _ ___ _ __ | (Thomas Michanek) |
/ <|___ \|| . . | / __) '. `-_ / _` / _ \ '. | | Trumslagaregatan 118 |
|\_____|> / |_|`'|_|_\___)_||_(_,_\__, \___/_||_| |S-58346 Linkoping SWEDEN|
\|______/ |_____________________|___/_________| |+46 13 273727(voice/fax)|