New tool idea?

Post by Andre Kost » Fri, 23 Jul 1999 04:00:00



I was wondering if there already exists a tool to do the following:

1) Given a C/C++ source file, determine if the list of includes in the
source file are all actually required (basically to determine if the source
is including "obsolete" headers)

2) Also in the same source, determine if a header included in the original
source file is also included previously (presumably through a third header
file)

If such a tool doesn't exist, does the idea sound intriguing?

As an example:

***File a.h

typedef struct
{
        int a;

} a_t;

***File b.h

typedef struct
{
        int b;

} b_t;

*** File c.h

#include "a.h"

typedef struct
{
        int c;

} c_t;

***File test.cpp

#include "c.h"
#include "b.h"
#include "a.h"

int main(int, char * [])
{
        a_t var;

}

Running this hypothetical tool on test.cpp would result in the program
stating that the include for b.h may be safely removed since no
identifiers/macros in b.h are used, and the include of a.h is superfluous
since c.h already includes it.

I guess the only purpose of this tool would be to help maintain code
"cleanliness" (no extra includes...)

Comments?

 
 
 


Post by Roy Grim » Fri, 23 Jul 1999 04:00:00



> I was wondering if there already exists a tool to do the following:

> [snip]

> I guess the only purpose of this tool would be to help maintain code
> "cleanliness" (no extra includes...)

> Comments?

I have to wonder about the usefulness of this tool when you stop to
consider what the "extra includes" give you.

Have you ever opened up someone else's code and looked at it, trying to
find where something was defined only to have to dig through dozens of
layers of include files?  It can be a nightmare.  It is far better to
list all the include files which contain code you directly reference.
Granted, you may include a header that is included in some layer of
another inclusion.  But, that's not necessarily a bad thing.

OTOH, with all the visual development environments out there, it's easy
to have your environment do that search for you and show you exactly
where something is defined, eliminating the tedious searching that must
be done.  But that only works if you are using that type of
environment.  And no, not everyone is using the latest and greatest
tools.

Now, there can be a use for a particular subfunction of that tool no
matter what your environment.  That is, to parse through the code and
print out all includes that are never used.  Say, for example, that I
include "string.h" but never do any string processing. (i.e. don't use
the types and don't use the functions)  If I could have a tool that
shows me that I have an unneeded include, that would be valuable.  I've
got a tool like that in an Ada development environment I use and would
like to have something like that for C/C++ and assembler code.

FWIW,
Roy

 
 
 


Post by Davin McCa » Sat, 24 Jul 1999 04:00:00


It does sound like an interesting idea. However, I would suggest that
such a tool not recommend the removal of superfluous #includes,
only of those which are completely unnecessary.

Consider the example you gave with files a.h, b.h, c.h and test.cpp.
c.h includes a.h, and test.cpp includes all of a.h, b.h and c.h. If
the #include of a.h in test.cpp was removed, a.h would still
effectively be included... but what if, at some later date, the c.h
include file was changed and no longer included a.h? (ie, a new
version of the library or whatever). If this occurred, and test.cpp
was recompiled, there could be problems.

Davin.

__________________________________________________________

my programming page: http://yoyo.cc.monash.edu.au/~davmac/

 
 
 


Post by Bernd Striede » Sat, 24 Jul 1999 04:00:00


Hi


> I was wondering if there already exists a tool to do the following:

> 1) Given a C/C++ source file, determine if the list of includes in the
> source file are all actually required (basically to determine if the source
> is including "obsolete" headers)

from the manuals of gcc:

       -M  [ -MG ]
              Tell the preprocessor to output a rule suitable for
              make describing the  dependencies  of  each  object
              file.   For each source file, the preprocessor out-
              puts one make-rule whose target is the object  file
              name  for  that  source file and whose dependencies
              are all the files `#include'd in it.  This rule may
              be  a single line or may be continued with `\'-new-
              line if it is long.  The list of rules  is  printed
              on  standard  output  instead of the preprocessed C
              program.


> 2) Also in the same source, determine if a header included in the original
> source file is also included previously (presumably through a third header
> file)

This is commonly overcome by include guards. Multiply included files are
so common, for example the C library headers, that there is no chance
besides include guards anyway. IMO it is best to not know what is
included by other headers, because I would consider this an
implementation detail of this file. If I know it, I will rely on it and
will get trapped later when the other header is changed.


> If such a tool doesn't exist, does the idea sound intriguing?

> As an example:

[....]

> Running this hypothetical tool on test.cpp would result in the program
> stating that the include for b.h may be safely removed since no
> identifiers/macros in b.h are used, and the include of a.h is superfluous
> since c.h already includes it.

Is it certain that c.h will always include a.h, or might the developer
of c.h decide to stop using a.h?

Checking for superfluous includes involves parsing the code to some
degree, so this would be something to include into the compiler, but
which might also yield a lot of work in the compiler to include the
source to every symbol parsed. IMO it is unlikely that anybody able to
do it needs it as well.


> I guess the only purpose of this tool would be to help maintain code
> "cleanliness" (no extra includes...)

> Comments?

There are no extra includes in well designed code. You insert includes
when you need them. You can't rely on includes included by those you
include. I've never seen documentation of libraries where it's said that
you can rely on this. In most cases it is said only what you have to
include to get certain services in the code you write. This is common
sense, if you do other things, it is likely that you confuse other
people. You can't rely purely on the code you see, because most often
there is nothing said within it you can rely on over the times. So you
even cannot write tools doing this job automatically, without having
incorrect behaviour in some cases.

I don't want to look through some levels of headers until I have found
what I want to use. E.g. If the docs tell me to include file poiuz1 to
use struct mnbv3 and I want to use functions using mnbv3 declared in
header poiuz67, I'll include both, although poiuz67 would have sufficed
now. If I want to use struct mnbv3 it means I access fields of that
struct which must not be true if you are exclusively working with it
through functions.

If you have a strong feeling that there are unnecessary includes in your
files, then it's at best called laziness, but there is the fact that
some unconscious things might have happened to your code which has been
bad all times.

Programmers tend to be lazy at times... I've done some programming, too.
And extra includes have been among the least problems I have thought
about. A much bigger nuisance are those include-everything.h files. This
is good for the laziest programmers you can think of. Have everything
available everywhere and go. There a lot more unconscious things happen
than with some extra includes. If I see a C file not including stdio.h,
then I assume that there is no printf in it, or not including stdlib.h,
then there is no malloc in it. This is important information at
(debugging) times. With those all.h headers, I just know that everything
can happen everywhere.

I hope that I convinced you that those little includes at the beginning
of your files are worth some conscious treatment, their occurrence as
well as their absence. It can't be done automatically and it shouldn't.

Bye,

Bernd.

 
 
 


Post by Andre Kost » Sat, 24 Jul 1999 04:00:00




>Hi


>> I was wondering if there already exists a tool to do the following:

>> 1) Given a C/C++ source file, determine if the list of includes in the
>> source file are all actually required (basically to determine if the
>> source is including "obsolete" headers)

>from the manuals of gcc:

>       -M  [ -MG ]
>              Tell the preprocessor to output a rule suitable for
>              make describing the  dependencies  of  each  object
>              file.   For each source file, the preprocessor out-
>              puts one make-rule whose target is the object  file
>              name  for  that  source file and whose dependencies
>              are all the files `#include'd in it.  This rule may
>              be  a single line or may be continued with `\'-new-
>              line if it is long.  The list of rules  is  printed
>              on  standard  output  instead of the preprocessed C
>              program.

This will generate a list of the headers that the file includes, but it
doesn't distinguish between headers that are actually used and ones which
aren't needed at all.  (I actually use the -M stuff in my makefile to
auto-generate dependency files....)

>> 2) Also in the same source, determine if a header included in the
>> original source file is also included previously (presumably through a
>> third header file)

>This is commonly overcome by include guards. Multiply included files are
>so common, for example the C library headers, that there is no chance
>besides include guards anyway. IMO it is best to not know what is
>included by other headers, because I would consider this an
>implementation detail of this file. If I know it, I will rely on it and
>will get trapped later when the other header is changed.

I actually wasn't thinking/worried about multiple inclusions (doing the
include guard thing in my header files is a very deeply ingrained habit
by now).

>Checking for superfluous includes involves parsing the code to some
>degree, so this would be something to include into the compiler, but
>which might also yield a lot of work in the compiler to include the
>source to every symbol parsed. IMO it is unlikely that anybody able to
>do it needs it as well.

Yep... it would be a pain to parse the file...

>> I guess the only purpose of this tool would be to help maintain code
>> "cleanliness" (no extra includes...)

>> Comments?
>There are no extra includes in well designed code. You insert includes
>when you need them. You can't rely on includes included by those you
>include. I've never seen documentation of libraries where it's said that
>you can rely on this. In most cases it is said only what you have to
>include to get certain services in the code you write. This is common
>sense, if you do other things, it is likely that you confuse other
>people. You can't rely purely on the code you see, because most often
>there is nothing said within it you can rely on over the times. So you
>even cannot write tools doing this job automatically, without having
>incorrect behaviour in some cases.

I only agree with the statement: "There are no extra includes in well
designed code", if you add "when originally written.".  It is a fact of
life that programs need to be maintained.  It then becomes more difficult
to determine when an include is no longer needed.  (If I remove all
references to symbol X from the source, do I still need header.h ?)

>I don't want to look through some levels of headers until I have found
>what I want to use. E.g. If the docs tell me to include file poiuz1 to
>use struct mnbv3 and I want to use functions using mnbv3 declared in
>header poiuz67, I'll include both, although poiuz67 would have sufficed
>now. If I want to use struct mnbv3 it means I access fields of that
>struct which must not be true if you are exclusively working with it
>through functions.

>If you have a strong feeling that there are unnecessary includes in your
>files, then it's at best called laziness, but there is the fact that
>some unconscious things might have happened to your code which has been
>bad all times.

>Programmers tend to be lazy at times... I've done some programming, too.

BTW: Programming is my job (I get paid to play! :) ).  And programmers
tend to be lazy almost all of the time :)  And in C++, being lazy is a
good thing (read Scott Meyers sometime... Effective C++, More Effective
C++) :)

>And extra includes have been among the least problems I have thought
>about. A much bigger nuisance are those include-everything.h files. This
>is good for the laziest programmers you can think of. Have everything
>available everywhere and go. There a lot more unconscious things happen

I can argue the include-everything.h point, but only in certain instances
(and I don't really want to start an argument, just discussing a
potentially useful tool) In general, I completely agree... lots of .h
files.

>than with some extra includes. If I see a C file not including stdio.h,
>then I assume that there is no printf in it, or not including stdlib.h,
>then there is no malloc in it. This is important information at
>(debugging) times. With those all.h headers, I just know that everything
>can happen everywhere.

I also want to know that if I am including stdio.h, that the source _is_
using printf/scanf, or if I include pthread.h that the source _is_ using
pthreads calls.

I will grant you that detecting header files that are included directly
as well as implicitly may not be truly handy.  However this does lead to
the exact opposite thought.  What about detecting header files that you
are using symbols from that are only included implicitly, but not
directly?

>I hope that I convinced you that those little includes at the beginning
>of your files are worth some conscious treatment, their occurrence as
>well as their absence. It can't be done automatically and it shouldn't.

I'm not saying that the tool should automatically remove them, only that
it should advise you that header such-and-such is not used at all, and
that this-other-header is implicitly included.

The idea I had for the tool is to help in managing the software project
in such a way that the source code helps document itself.  As we both
agree, programmers tend to be lazy... and since when do lazy people
properly comment their code? (I'm guilty on this one) :)

So... to revise the "specifications":

1) Given a C/C++ source file, determine if the list of includes in the
source file are all actually required (basically to determine if the
source is including "obsolete" headers)

2) Also in the same source, determine if a header included through
another header should be included on its own (to rephrase, all symbols
referenced by the source should all be in directly included headers).

Note that these should be advisory statements, not modifications to the
source.
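
A minimal sketch of how the two advisory checks above might be expressed
in C. It assumes an earlier pass has already worked out, for each header,
whether the source names it in an #include line and how many of its
symbols the source references. The names struct header, enum advice and
advise() are hypothetical, made up for illustration; nothing here parses
real code.

```c
#include <stddef.h>

/* Advice the hypothetical tool could print for one header. */
enum advice {
    ADVICE_KEEP,             /* included directly, and its symbols are used */
    ADVICE_UNUSED,           /* included directly, but nothing from it is used */
    ADVICE_INCLUDE_DIRECTLY, /* symbols used, but only reached via another header */
    ADVICE_NONE              /* reached indirectly and not used: nothing to say */
};

/* Facts about one header, assumed to be gathered by an earlier pass. */
struct header {
    const char *name;
    int included_directly;   /* non-zero if the source itself #includes it */
    int symbols_used;        /* count of its symbols referenced by the source */
};

/* Check 1 covers directly included headers, check 2 the indirect ones.
   The result is advice only; the source is never modified. */
enum advice advise(const struct header *h)
{
    if (h->included_directly)
        return h->symbols_used > 0 ? ADVICE_KEEP : ADVICE_UNUSED;
    return h->symbols_used > 0 ? ADVICE_INCLUDE_DIRECTLY : ADVICE_NONE;
}
```

For the test.cpp example, a.h (included directly, a_t used) would get
ADVICE_KEEP and b.h (included directly, nothing used) would get
ADVICE_UNUSED.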

 
 
 


Post by John Reis » Sat, 24 Jul 1999 04:00:00



>I was wondering if there already exists a tool to do the following:

>1) Given a C/C++ source file, determine if the list of includes in the
>source file are all actually required (basically to determine if the source
>is including "obsolete" headers)

>2) Also in the same source, determine if a header included in the original
>source file is also included previously (presumably through a third header
>file)

[snip]

This is an excellent idea.  Especially in projects involving many
programmers over a couple of years or more, the use of #include can
"bloat" just like the rest of the code.

The use of "#include guards" helps a little.  A guard is like:

        #ifndef INCLUDED_foo_h_  /*{*/
        #define INCLUDED_foo_h_
           [ ... entire rest of file ...]
        #endif  /*}*/

which is a way to make the rest of the file appear effectively only
once, even if the file is #included many times (directly or
indirectly).  Sidebar: I like using the comments with braces on most
conditional compilation controls.  That way, I can use the
"parenthesis matching" commands of a text editor to determine the
scope of #ifdef, without having to teach the editor about #ifdef.

The committee for the current C standard missed a golden opportunity
for better software, by not introducing "#include1" or similar
directive meaning "include this file only once".  "This file" is
determined by [at least]: same spelling with same delimiters, or same
effective path after -I processing and "parent directory removal"
(changing "/dir/../" to "/"), or any reliable implementation-defined
method (which would allow for detection of redundant NFS automounters,
for instance).  Of course, the committee might argue that the ISO
rules required the standardization of _existing practice_, but this is
a cop out when the existing practice, even the use of guards, sucks.
I consider that the committee for the current C++ standard got away
with what amounts to wholesale innovation.  Why couldn't the C
committee?  [Some say that the "#" and "##" tokenization operators
_were_ innovations.  And the ground rules need adjustment, too.]

It _is_ possible to track the "file of origin" as symbols are used.
Number the source files as encountered, remember the number of the
current file when creating a new symbol in the symbol table, update a
bitvector when looking up an existing symbol, scan the bitvector(s) at
the end of the source.  This may require some integration between the
"preprocessor" and the lexical analyzer.
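
A rough sketch of that bookkeeping in C, under some loud assumptions: the
files are numbered 0 to 63 so that a single 64-bit word can serve as the
bitvector, the symbol table is a flat array searched linearly, and the
names struct tracker, tracker_declare(), tracker_lookup() and
tracker_file_used() are hypothetical. A real tool would also want to
count only lookups made from the top-level source file, which this
sketch leaves to the caller.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FILES   64    /* assumed limit: one bit per file in a 64-bit word */
#define MAX_SYMBOLS 256   /* assumed limit for this sketch */

struct symbol {
    const char *name;     /* not copied; the caller keeps the string alive */
    int origin;           /* number of the file that declared the symbol */
};

struct tracker {
    struct symbol symbols[MAX_SYMBOLS];
    size_t nsymbols;
    unsigned long long used;   /* bit i set: a symbol from file i was looked up */
};

/* Create a new symbol, remembering the number of the current file.
   Returns 0 on success, -1 if the table is full or the number is out of range. */
int tracker_declare(struct tracker *t, const char *name, int current_file)
{
    if (t->nsymbols >= MAX_SYMBOLS || current_file < 0 || current_file >= MAX_FILES)
        return -1;
    t->symbols[t->nsymbols].name = name;
    t->symbols[t->nsymbols].origin = current_file;
    t->nsymbols++;
    return 0;
}

/* Look up an existing symbol and update the bitvector.
   Returns the file of origin, or -1 if the symbol is unknown. */
int tracker_lookup(struct tracker *t, const char *name)
{
    for (size_t i = 0; i < t->nsymbols; i++) {
        if (strcmp(t->symbols[i].name, name) == 0) {
            t->used |= 1ULL << t->symbols[i].origin;
            return t->symbols[i].origin;
        }
    }
    return -1;
}

/* Scan step: non-zero if any symbol from the given file was looked up. */
int tracker_file_used(const struct tracker *t, int file)
{
    if (file < 0 || file >= MAX_FILES)
        return 0;
    return (int)((t->used >> file) & 1ULL);
}
```

At the end of the source, any numbered header for which
tracker_file_used() returns 0 is a candidate for an "unused include"
message.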

Finally, I contend that in most cases, _each_ individual header file
should be standalone compilable.  That is, if a file contains a use of
a symbol, then _that_ file is responsible for #include-ing the
declaration.  In particular, for every header file foo.h, it should be
possible to do

        cp foo.h foo.c
        cc -c foo.c

and get _no_ errors.  Any exceptions, such as expected "parametric"
usage of #define or typedef, must be prominently documented near the
beginning of the file, and mention a concrete example of specific
files that would satisfy the rule.  This goes a long way towards
making #include understandable and maintainable.


 
 
 


Post by Derek M Jon » Sat, 24 Jul 1999 04:00:00


All,


> 1) Given a C/C++ source file, determine if the list of includes in the
> source file are all actually required (basically to determine if the source
> is including "obsolete" headers)

My company's C analysis tool does just this.

> 2) Also in the same source, determine if a header included in the original
> source file is also included previously (presumably through a third header
> file)

and this.  But it is less clear cut to say that such usages should
be removed.  For instance if the top level header is under the control
of another development group you might want to retain your own #include
to keep a degree of independence.

An important point about removing #includes is that there may be
dependencies between those #includes flagged as not needed.  In
this case it is not always possible to remove a subset of the
flagged #includes.

There are also a number of implementation issues.  For instance
OSPC assumes that any #include containing a definition cannot
be removed (because the definition may not be referenced within that
source file, but it may still be needed to satisfy an external
reference elsewhere in the program).

My experience is that in production code 20-30% of #includes are
not required and can safely be removed.

derek

--
Derek M Jones                                     tel: +44 (0) 1252 520 667

Applications Standards Conformance Testing       http://www.knosof.co.uk

 
 
 


Post by Mark Carmicha » Sun, 25 Jul 1999 04:00:00



    > I was wondering if there already exists a tool to do the following:
    >
    > 1) Given a C/C++ source file, determine if the list of includes in the
    > source file are all actually required (basically to determine if the
    > source is including "obsolete" headers)
    >
    > 2) Also in the same source, determine if a header included in the
    > original source file is also included previously (presumably through a
    > third header file)
    >
    > If such a tool doesn't exist, does the idea sound intriguing?
    >
    >

I think it is a great idea, and so do some other folks:

    http://www.research.att.com/sw/tools/reuse/packages/cia.html

If you visit that page, note the 'incl' link; it is a CIA application
that does exactly what you described.  Of course, knowing that much about
your C code, CIA can do a lot more for it as well...

If you are interested in tools, "Practical Reusable UNIX Software" is a
fascinating read.

--
Mark Carmichael                        "My phone bill, my opinions."

 
 
 


Post by Erik Kun » Thu, 29 Jul 1999 04:00:00


: The committee for the current C standard missed a golden opportunity
: for better software, by not introducing "#include1" or similar
: directive meaning "include this file only once".  "This file" is

Today's compilers will tell you if include files are used more than
once per source file.

 
 
 


Post by Andrew Hatel » Sat, 31 Jul 1999 04:00:00


...
> Consider the example you gave with files a.h, b.h, c.h and test.cpp.
> c.h includes a.h, and test.cpp includes all of a.h, b.h and c.h. If
> the #include of a.h in test.cpp was removed, a.h would still
> effectively be included... but what if, at some later date, the c.h
> include file was changed and no longer included a.h?

...

It's the total absence of the need to even think about this kind of low
level junk that makes me extremely happy to program in Ada. What's a
language designed for - the convenience of the programmer or the
convenience of the guy writing the compiler?

I don't write Makefiles either - that's a job the compiler should be
smart enough to figure out for itself.

regards
Andrew
Ada fan.

 
 
 

11. New feature idea for Apache?