Reformatting a file

Post by Sudhir Sharm » Thu, 20 Sep 2001 20:15:58



Hi,

I need to reformat an existing tab-delimited file programmatically,
using shell, awk or sed. The input file is in the following format:

Input File :
---------------------------
Object 1234

Field_ID    Field_Name    Field_Value
19             IP Address      10.44.22.34
20             Subnet Mask    255.255.247.0

Object 1235

Field_ID    Field_Name    Field_Value
19             IP Address      10.44.22.34

Object 1245

Field_ID    Field_Name    Field_Value
20             Subnet Mask    255.255.247.0
21             MAC_ADDR  08:00:03:05:02:05

.....etc
----------------------------

The output file should be tab-delimited, as shown below, with the
fields above grouped per object:

-------------------------------------------
Object    IP Address    Subnet Mask

1234      10.44.22.34   255.255.247.0
1235      10.44.22.34
1245                            255.255.247.0
....etc
--------------------------------------------
For some Objects, the IP Address or Subnet Mask entries may not be
present, and I need to leave those columns blank.

Could someone please point me to how to do this? Any pointers will be
appreciated. Thanks in advance.

Regards,
Sudhir.

 
 
 

Reformatting a file

Post by Quanyi Su » Thu, 20 Sep 2001 21:44:35


Hi:

I suggest you use Perl's split(); it makes this very easy.
Check perldoc -f split for how to use it.

Quanyi Sun


>Hi,

>I need to reformat an existing tab-delimited file programatically -
>using Shell or awk or sed. The input file is in the following format-

>Input File :
>---------------------------
>Object 1234

>Field_ID    Field_Name    Field_Value
>19             IP Address      10.44.22.34
>20             Subnet Mask    255.255.247.0

>Object 1235

>Field_ID    Field_Name    Field_Value
>19             IP Address      10.44.22.34

>Object 1245

>Field_ID    Field_Name    Field_Value
>20             Subnet Mask    255.255.247.0
>21             MAC_ADDR  08:00:03:05:02:05

>.....etc
>----------------------------

>The output file tab-delimited format will be as shown which is a
>grouping of the above fields -

>-------------------------------------------
>Object    IP Address    Subnet Mask

>1234      10.44.22.34   255.255.247.0
>1235      10.44.22.34
>1245                            255.255.247.0
>....etc
>--------------------------------------------
>For some Objects, the IP Address or Subnet Mask entries may not be
>present and I need to fill in blank spaces for these.

>Could someone please point me on how to do this. Any pointers will be
>appreciated. Thanks in advance.

>Regards,
>Sudhir.


 
 
 

Reformatting a file

Post by eric » Thu, 20 Sep 2001 22:18:23



> Hi,

> I need to reformat an existing tab-delimited file programatically -
> using Shell or awk or sed. The input file is in the following format-

It's perl, but it might do until you have an awkly sedshell solution.
It prints out in two different formats, one for people and one for
computers.
I am sure it can be made smaller, but I think I did okay for a q&d
script.

Eric

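#!/usr/bin/perl

$in = 0;
%objects = ();
%fields = ();
while(<>) {
        chomp;
        next if /^$/;
        next if /^Field/;
        if (/^Object/) { # start new object
                ($name) = (split(' '))[-1];
                $objects{$name} = {};
                $ref = $objects{$name};
        } else {
                # This line is missing from the archived post; reconstructed
                # on the assumption that fields are tab-separated.
                @info = split(/\t+/);
                $fields{$info[1]} = $info[1];
                # progress message, sent to STDERR so it stays out of the report
                print STDERR "adding $info[1] => $info[2]\n";
                $ref->{$info[1]} = $info[2];
        }
}

# spit it back out
# (the "foreach" headers below were lost in the archive and are reconstructed)
printf("%-8s", "Object");
foreach $key (sort(keys(%fields))) {
        printf("\t%17s", $fields{$key});
}
print "\n";
foreach $key (sort(keys(%objects))) {
        printf("%-8s", $key);
        foreach $item (sort(keys(%fields))) {
                printf("\t%17s", $objects{$key}->{$item});
        }
        print "\n";
}

# or to make it more friendly to a computer (ugly for human)
print "\n\n\n";
print "Object";
foreach $key (sort(keys(%fields))) {
        printf("\t%s", $fields{$key});
}
print "\n";
foreach $key (sort(keys(%objects))) {
        print "$key";
        foreach $item (sort(keys(%fields))) {
                printf("\t%s", $objects{$key}->{$item});
        }
        print "\n";
}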

---- example output
Object                 IP Address                MAC_ADDR            Subnet Mask
1234                  10.44.22.34                                  255.255.247.0
1235                  10.44.22.35
1245                                    08:00:03:05:02:05          255.255.247.0

Object  IP Address      MAC_ADDR        Subnet Mask
1234    10.44.22.34             255.255.247.0
1235    10.44.22.35
1245            08:00:03:05:02:05       255.255.247.0

 
 
 

Reformatting a file

Post by Martien Verbrugg » Thu, 20 Sep 2001 23:20:08


On Wed, 19 Sep 2001 16:45:58 +0530,

> Hi,

> I need to reformat an existing tab-delimited file programatically -
> using Shell or awk or sed. The input file is in the following format-

Don't have an easy shell/sed/awk solution, but Perl is really good at
this sort of stuff:

#!/usr/local/bin/perl -w
use strict;

# Read the data
#
my %objects;
my $object;
while (<>)
{
    $object = $1 if /^Object (\d+)/;
    next unless $object;                        # just in case
    $objects{$object}{ip} = $1 if /^19\s+IP Address\s+([\d.]+)/;
    $objects{$object}{nm} = $1 if /^20\s+Subnet Mask\s+([\d.]+)/;

}

# Create the report
#
print "Object   IP Address       Subnet Mask\n\n";

for (sort { $a <=> $b } keys %objects)
{
    printf "%-8d %-16s %-16s\n", $_,
                                 $objects{$_}{ip} || "",
                                 $objects{$_}{nm} || "";

}

You have left quite some stuff unspecified, so I felt free to make
some assumptions.

Martien
--
Martien Verbruggen              |
Interactive Media Division      | Useful Statistic: 75% of the people
Commercial Dynamics Pty. Ltd.   | make up 3/4 of the population.
NSW, Australia                  |

 
 
 

Reformatting a file

Post by Chris F.A. Johnso » Fri, 21 Sep 2001 00:53:09



> Hi,

> I need to reformat an existing tab-delimited file programatically -
> using Shell or awk or sed. The input file is in the following format-

> Input File :
> ---------------------------
> Object 1234

> Field_ID    Field_Name    Field_Value
> 19             IP Address      10.44.22.34
> 20             Subnet Mask    255.255.247.0

> Object 1235

> Field_ID    Field_Name    Field_Value
> 19             IP Address      10.44.22.34

> Object 1245

> Field_ID    Field_Name    Field_Value
> 20             Subnet Mask    255.255.247.0
> 21             MAC_ADDR  08:00:03:05:02:05

> .....etc
> ----------------------------

> The output file tab-delimited format will be as shown which is a
> grouping of the above fields -

> -------------------------------------------
> Object    IP Address    Subnet Mask

> 1234      10.44.22.34   255.255.247.0
> 1235      10.44.22.34
> 1245                            255.255.247.0
> ....etc
> --------------------------------------------
> For some Objects, the IP Address or Subnet Mask entries may not be
> present and I need to fill in blank spaces for these.

> Could someone please point me on how to do this. Any pointers will be
> appreciated. Thanks in advance.

awk '
    BEGIN { FS = "\t"
            printf "%s\t%s\t%s\n", "Object", "IP Address", "Subnet Mask" }

    /^Object/ { if ( object ) {
                    printf "%s\t%s\t%s\n", object, ip, mask
                    }
                ip = " "  ## use whatever you like for empty fields
                mask = " "
                ## This assumes there is a tab after "Object"
                object = $2
                }
    /IP Address/ { ip = $3 }
    /Subnet Mask/ { mask = $3 }

    END { if ( object ) printf "%s\t%s\t%s\n", object, ip, mask }
' file.in    # file.in is a placeholder for your input file

--
    Chris F.A. Johnson                        http://cfaj.freeshell.org
    ===================================================================
    My code (if any) in this post is copyright 2001, Chris F.A. Johnson
    and may be copied under the terms of the GNU General Public License

 
 
 

Reformatting a file

Post by those who know me have no need of my nam » Fri, 21 Sep 2001 01:49:08



>I need to reformat an existing tab-delimited file programatically -
>using Shell or awk or sed.

using awk:

BEGIN { oid="" }
$1 == "Object" { oid=$2 }
$1 == "19" { list[oid]=1 ; ip[oid]=$(NF) }
$1 == "20" { list[oid]=1 ; nm[oid]=$(NF) }
END { for (oid in list) printf "%-6s\t%-15s\t%-15s\n", oid, ip[oid], nm[oid] }
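
The "for (oid in list)" loop visits the objects in an unspecified order, so the rows may come out shuffled. A minimal sketch of one way around that, assuming numeric object numbers and the Field_IDs 19 and 20 from the sample; the function name objects_sorted is made up for illustration, and unlike the script above it also lists objects that have neither field:

```shell
# Hypothetical helper: collect IP address and netmask per object,
# then sort the rows numerically by object number.
objects_sorted() {
    awk -F'\t' '
        # "Object" header: take the number after the space or tab
        $1 ~ /^Object/ { split($0, w, /[ \t]+/); oid = w[2]; seen[oid] = 1; next }
        $1 == "19"     { ip[oid] = $NF }
        $1 == "20"     { nm[oid] = $NF }
        END { for (oid in seen) printf "%s\t%s\t%s\n", oid, ip[oid], nm[oid] }
    ' "$@" | sort -n
}
```

Call it as "objects_sorted file.in", or pipe the data into it.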

--
okay, have a sig then

 
 
 

Reformatting a file

Post by those who know me have no need of my nam » Fri, 21 Sep 2001 01:52:39



># Create the report
>#
>print "Object   IP Address       Subnet Mask\n\n";

>for (sort { $a <=> $b } keys %objects)
>{
>    printf "%-8d %-16s %-16s\n", $_,
>                                 $objects{$_}{ip} || "",
>                                 $objects{$_}{nm} || "";
>}

why is it that almost none of the perl code i see that creates a report
uses the reporting capabilities of the practical extraction and reporting
language?

--
okay, have a sig then

 
 
 

Reformatting a file

Post by laura fairhe » Fri, 21 Sep 2001 06:00:46



>Hi,

>I need to reformat an existing tab-delimited file programatically -
>using Shell or awk or sed. The input file is in the following format-

>Input File :
>---------------------------
>Object 1234

>Field_ID    Field_Name    Field_Value
>19             IP Address      10.44.22.34
>20             Subnet Mask    255.255.247.0

>Object 1235

>Field_ID    Field_Name    Field_Value
>19             IP Address      10.44.22.34

>Object 1245

>Field_ID    Field_Name    Field_Value
>20             Subnet Mask    255.255.247.0
>21             MAC_ADDR  08:00:03:05:02:05

>.....etc
>----------------------------

>The output file tab-delimited format will be as shown which is a
>grouping of the above fields -

>-------------------------------------------
>Object    IP Address    Subnet Mask

>1234      10.44.22.34   255.255.247.0
>1235      10.44.22.34
>1245                            255.255.247.0
>....etc
>--------------------------------------------
>For some Objects, the IP Address or Subnet Mask entries may not be
>present and I need to fill in blank spaces for these.

awk '
BEGIN{
FS=sprintf("\011")
object=-1
printf "%-8s\011%-20s\011%s\n\n","Object","IP Address","Subnet Mask"
}

{
if($1=="Object"){
  if(object>=0){
    printf "%-8s\011%-20s\011%s\n",object,ip,subnet
    }
  ip="";subnet="";object=$2
  }
if($1==19)ip=$3
if($1==20)subnet=$3
}

END{
if(object>=0) {
  printf "%-8s\011%-20s\011%s\n",object,ip,subnet
  }
}' file.in

where 'file.in' is your input data file

>Could someone please point me on how to do this. Any pointers will be
>appreciated. Thanks in advance.

this is exactly the sort of thing that 'awk' is there to do
for you. if you are familiar with C programming the
syntax should be learnable in a day. 'printf' comes in useful
for formatting the output nicely. it could be done in 'sed'
in theory but 'awk' is the correct tool to use for this sort of
thing (either 'awk' or maybe 'perl' ).

cu,from

--
: ${L} # http://lf.8k.com:80


 
 
 

Reformatting a file

Post by Christoph Hintermülle » Fri, 21 Sep 2001 16:44:54


Hi

Perl can do much more for you than being a better AWK/SED/SH.

For further details, look at
   perl formats (perldoc perlform)

-----------------------------------------------
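#!/usr/bin/perl
# The format is built as a string so the column widths are easy to change.
# Parentheses are needed because "x" binds tighter than "-".
$formatstring="format STDOUT =\n^".("<" x (10-2)).
               "\t^".("<" x (15-1)).
               "\t^".("<" x (20-1))."~\n".
               "\$firstfield,\$secondfield,\$thirdfield\n.\n";

eval $formatstring;
die $@ if $@;
while(<>){
   chomp;
   # the input is tab-delimited, so split on tabs only
   ($firstfield,$secondfield,$thirdfield)=split /\t+/,$_;
   # "^" fields consume the text they print, so loop until all are empty
   while($firstfield||$secondfield||$thirdfield){
     write;
   }

}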

> Hi,

> I need to reformat an existing tab-delimited file programatically -
> using Shell or awk or sed. The input file is in the following format-

> Input File :
> ---------------------------
> Object 1234

> Field_ID    Field_Name    Field_Value
> 19             IP Address      10.44.22.34
> 20             Subnet Mask    255.255.247.0

> Object 1235

> Field_ID    Field_Name    Field_Value
> 19             IP Address      10.44.22.34

> Object 1245

> Field_ID    Field_Name    Field_Value
> 20             Subnet Mask    255.255.247.0
> 21             MAC_ADDR  08:00:03:05:02:05

> .....etc
> ----------------------------

> The output file tab-delimited format will be as shown which is a
> grouping of the above fields -

> -------------------------------------------
> Object    IP Address    Subnet Mask

> 1234      10.44.22.34   255.255.247.0
> 1235      10.44.22.34
> 1245                            255.255.247.0
> ....etc
> --------------------------------------------
> For some Objects, the IP Address or Subnet Mask entries may not be
> present and I need to fill in blank spaces for these.

> Could someone please point me on how to do this. Any pointers will be
> appreciated. Thanks in advance.

> Regards,
> Sudhir.

--
THESIS:     God is alive
PROOVE:     Who else would have scheduled the mankind and world first
             recommendation of research????
CONCLUSION: Scientists do what he wants, willing or not:)
 
 
 

Reformatting a file

Post by Sudhir Sharm » Fri, 21 Sep 2001 18:20:16


> Hi,

> I need to reformat an existing tab-delimited file programatically -
> using Shell or awk or sed. The input file is in the following format-

> Input File :
> ---------------------------
> Object 1234

> Field_ID    Field_Name    Field_Value
> 19             IP Address      10.44.22.34
> 20             Subnet Mask    255.255.247.0

> Object 1235

> Field_ID    Field_Name    Field_Value
> 19             IP Address      10.44.22.34

> Object 1245

> Field_ID    Field_Name    Field_Value
> 20             Subnet Mask    255.255.247.0
> 21             MAC_ADDR  08:00:03:05:02:05

> .....etc
> ----------------------------

> The output file tab-delimited format will be as shown which is a
> grouping of the above fields -

> -------------------------------------------
> Object    IP Address    Subnet Mask

> 1234      10.44.22.34   255.255.247.0
> 1235      10.44.22.34
> 1245                            255.255.247.0
> ....etc
> --------------------------------------------
> For some Objects, the IP Address or Subnet Mask entries may not be
> present and I need to fill in blank spaces for these.

> Could someone please point me on how to do this. Any pointers will be
> appreciated. Thanks in advance.

> Regards,
> Sudhir.

 
 
 

Reformatting a file

Post by j.. » Sun, 23 Sep 2001 15:33:14



>> Hi,
>> I need to reformat an existing tab-delimited file programatically -
>> using Shell or awk or sed. The input file is in the following format-
>> Input File :
>> ---------------------------
>> Object 1234
>> Field_ID    Field_Name    Field_Value
>> 19             IP Address      10.44.22.34
>> 20             Subnet Mask    255.255.247.0
>> Object 1235
>> Field_ID    Field_Name    Field_Value
>> 19             IP Address      10.44.22.34
>> Object 1245
>> Field_ID    Field_Name    Field_Value
>> 20             Subnet Mask    255.255.247.0
>> 21             MAC_ADDR  08:00:03:05:02:05
>> .....etc
>> ----------------------------
>> The output file tab-delimited format will be as shown which is a
>> grouping of the above fields -
>> -------------------------------------------
>> Object    IP Address    Subnet Mask
>> 1234      10.44.22.34   255.255.247.0
>> 1235      10.44.22.34
>> 1245                            255.255.247.0
>> ....etc
>> --------------------------------------------
>> For some Objects, the IP Address or Subnet Mask entries may not be
>> present and I need to fill in blank spaces for these.
>> Could someone please point me on how to do this. Any pointers will be
>> appreciated. Thanks in advance.

 Let's assume you want to use awk.  I'll make some assumptions
 about the input format and field names here:

        The input file is a set of multi-line stanzas, Object ###
        followed by a series of  Field ID, Field Name, and value
        tuples.

 Regarding the output, this is more like a SQL table, with
 attributes (columns) for Object numbers and each of the
 field names.

 My first stab at something like this would be to write a
 mini-state machine, I'm either processing an object stanza or
 I'm not.  For each object I'm building an associative array of
 column ID/names and values (I might build *two* associative
 arrays, one for Field IDs to Field Names and another to match
 either to values --- that would depend on the sanity/schema
 of the input and the requirements of the output).   I might
 also use a case structure to confirm that each Field_ID/Field_Name
 was valid.

 Depending on the actual requirements I might make two passes
 through the data; one to build a dictionary of Field IDs and Names
 and another to process the data and generate output.  If the data
 set is reasonably small (less than a few 10s of Mb) I'd do that in
 one pass, building the dictionary and the resulting table as
 arrays in memory.  If it was large (enough to strain my memory/swap)
 and/or if I had to do this frequently, I'd do it in two passes.  If
 I had a full list of the desired Field Names/IDs, I wouldn't need
 to build a dictionary (duh!).
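
 The one-pass, in-memory variant described above can be sketched roughly as follows. This is only an illustration under assumptions: the fields are separated by single tabs, data lines start with a numeric Field_ID, and the function name pivot_objects is made up. Columns appear in the order the field names are first seen, and rows in input order.

```shell
# Hypothetical helper: one pass, with the field-name dictionary and the
# whole table held in memory until END.
pivot_objects() {
    awk -F'\t' '
        /^Object/ {
            split($0, w, /[ \t]+/); obj = w[2]
            if (!(obj in seen)) { seen[obj] = 1; objs[++nobj] = obj }
            next
        }
        $1 ~ /^[0-9]+$/ && NF >= 3 {
            # first sighting of a field name: add it to the dictionary
            if (!($2 in col)) { col[$2] = 1; names[++ncol] = $2 }
            val[obj, $2] = $3
        }
        END {
            line = "Object"
            for (c = 1; c <= ncol; c++) line = line "\t" names[c]
            print line
            for (o = 1; o <= nobj; o++) {
                line = objs[o]
                for (c = 1; c <= ncol; c++) line = line "\t" val[objs[o], names[c]]
                print line
            }
        }
    ' "$@"
}
```

 A two-pass version would run awk over the file once to collect the names and a second time to emit the rows, which avoids holding the table in memory.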

 You don't say anything about the sanity of the data or how you'd
 want to handle errors.  I'd probably write warnings (to stderr) for
 any anomalous data; and I'd run a number of tests using various
 regular expressions to condition my understanding of the file's
 true format.  For example, some of the field names shown here have
 space in them.  Do any of the values?  Do any of the names or values
 have embedded tabs?  Is there some quoting/escaping mechanism in use
 for tabs?  Should I read just the first three tabs as field delimiters
 and treat the remainder of each line as the "value"?
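
 A sanity check of that kind might look like the sketch below. The rules it enforces (three tab-separated fields, numeric Field_ID, no data before the first Object header) are my guesses at the format, not a specification, and the name check_format is made up. It also assumes your awk can write to "/dev/stderr", which gawk, mawk and most systems with that device allow.

```shell
# Hypothetical checker: silent on clean input, warnings on stderr and a
# non-zero exit status otherwise.
check_format() {
    awk -F'\t' '
        function warn(msg) { printf "line %d: %s\n", NR, msg > "/dev/stderr"; bad = 1 }
        /^[ \t]*$/                   { next }             # blank line
        /^Object[ \t]+[0-9]+[ \t]*$/ { in_obj = 1; next } # stanza header
        /^Field_ID/                  { next }             # static column header
        !in_obj  { warn("data before first Object header"); next }
        NF != 3  { warn("expected 3 tab-separated fields, found " NF); next }
        $1 !~ /^[0-9]+$/ { warn("non-numeric Field_ID: " $1) }
        END { exit bad + 0 }
    ' "$@"
}
```

 Running "check_format file.in && echo clean" before the real conversion shows quickly whether the simple scripts in this thread can be trusted on the data.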

 Is every "Object" header followed by a copy of the same static
 "Field_ID    Field_Name    Field_Value" header?  What if I don't
 see one?  What if I see more than one within the same "Object"
 stanza?  From your input sample it seems that the "Object"
 header is separated from the object number/ID by a space rather
 than a tab.

 So here's a simplistic bit of code:

#!/usr/bin/awk -f
BEGIN {
        ## column/attribute dictionary
        attr[19]="IP Address"
        attr[20]="Subnet Mask"
        attr[21]="MAC_ADDR"

        FS="\t"   # written as an escape so the tab survives copy and paste
        OFS="\t"
        header = "Object ID"
        for ( i in attr ) {
                header = header "\t" attr[i]
                }
        print header
        }

function print_object_tuple(    i, output) {
        output=obj
        for ( i in attr ) {
                # a missing attribute yields an empty column
                output = output "\t" object_attr_value[i]
                # forget the value so it does not leak into the next object
                delete object_attr_value[i]
                }
        print output
        }

# ignore blank lines
/^[ \t]*$/ { next }

/^Object/ {
        # we've found an object:
        # if we were processing an earlier object: print it
        if (obj != "") {
                print_object_tuple()
                }

        sub(/^Object[ \t]+/,"")  ## ugly hack!  bad input format!
        obj=$0  # set new/next object
        next    # do not treat the header as a data line
        }

# ignore static header lines
/^[^0-9]/ { next } # all data lines start with numbers?

{
        object_attr_name[$1]=$2
        object_attr_value[$1]=$3
        }

END {
        if (obj != "") print_object_tuple()
 }

        Note: as written this doesn't give you any control over the
        order of the columns and it doesn't sort the records.  It merely
        guarantees that the column headers and the data columns will be
        the same.

        PERL or Python would be much more elegant (it's trivial to
        iterate over our array in a specific order).  Of course I could
        use a numeric array and iterate over it using a scalar/numeric
        value instead of using the associative (hash/dictionary) string
        indexing as I've done here.

        However, you specified awk, sed or sh; and this is (moderately)
        easier in awk than sh.

        Of course this will break horribly if there are any irregularities
        in your input format.  Also it could get really tedious if there are
        more than a few attribute/columns types/names to deal with.  
        Even as is the output format is ugly (though it is tab delimited,
        as specified).
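
        The numeric-array idea mentioned above can be sketched like this. It is an illustration only: the column order is passed in as a space-separated list of Field_IDs, the name table is the same hard-coded dictionary as in the script above, and the function name ordered_columns is made up.

```shell
# Hypothetical helper: columns come out in the order listed in "order",
# because ids[] is a numerically indexed array walked from 1 to ncol.
ordered_columns() {
    awk -F'\t' -v order="19 20 21" '
        function emit(    c, line) {
            if (obj == "") return
            line = obj
            for (c = 1; c <= ncol; c++) {
                line = line "\t" value[ids[c]]
                delete value[ids[c]]      # reset for the next object
            }
            print line
        }
        BEGIN {
            ncol = split(order, ids, " ")
            name[19] = "IP Address"; name[20] = "Subnet Mask"; name[21] = "MAC_ADDR"
            line = "Object ID"
            for (c = 1; c <= ncol; c++) line = line "\t" name[ids[c]]
            print line
        }
        /^Object/       { emit(); split($0, w, /[ \t]+/); obj = w[2]; next }
        $1 ~ /^[0-9]+$/ { value[$1] = $3 }
        END             { emit() }
    ' "$@"
}
```

        Rows still appear in input order; piping the body through sort would take care of that if needed.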

>> Regards,
>> Sudhir.

--
Jim Dennis              
 
 
 

1. Reformat file, tricky.

I have a file (actually a very long LaTeX log) which I'd like to
clean up for viewing, via a small script for use when I get an
error.

The file has lines like this scattered through it:
[12] [13] [14] [15
]                    
[17
]    
[33
] [34]  

Can anyone deeply versed in the ways of sed or similar
provide a method of coping with this to produce

[12] [13] [14] [15]
[17]
[33] [34]

Or, second choice, a way of deleting any line with just [ ] or numerals.

Thanks in advance.

--
David Kennedy, Dept. of Pure & Applied Physics, Queen's University of Belfast

               My .sig was so clever that it actually escaped!
