Scandetail and scansummary FAQ
This document provides answers to frequently asked questions
about the STK file scanners, scandetail and
scansummary, and contains:
File scanner accuracy questions
File scanner usage questions
File scanner accuracy questions
- Q:
What causes the STK file scanners to overreport?
-
A: The file scanners do not parse your source code
files. Instead, they scan each file for tokens using the
rules of the file's language. Then they analyze groups of
tokens to decide what a particular token or identifier could
be. For example, in C, the pattern ID1 -> ID2
indicates that ID2 is the name of a structure member. In
Fortran, the pattern CALL ID indicates that
ID is a Fortran subroutine.
This method is fast but cannot handle every situation. For
example, the token strcmp could be either a
variable with the same name as the function
strcmp, or a pointer to the function
strcmp. A full parsing solution could resolve such
ambiguity, but would be much slower and still would not
provide completely accurate information.
Thus, the file scanners are not always able to accurately
determine the type of a particular identifier. In these
cases, rather than miss a possible impact, the file scanners
report impacts that apply to any of the possible types of the
identifier. In the output report, such impacts are marked
with an asterisk * to indicate that the file scanner may have
chosen the wrong type for this identifier.
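The pattern-based approach described above can be illustrated with a small sketch. This is not the STK implementation: the function name classify_tokens, the tokenizer regular expression, and the labels are all assumptions made for illustration. It shows how looking only at neighboring tokens identifies ID2 in ID1 -> ID2 as a structure member, and why a bare identifier such as strcmp stays ambiguous and would be reported with an asterisk.

```python
import re

# Hypothetical tokenizer: identifiers, the "->" operator, and single symbols.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|->|\S")

def classify_tokens(source):
    """Guess what each identifier is by looking only at its neighbors."""
    tokens = TOKEN_RE.findall(source)
    results = []
    for i, tok in enumerate(tokens):
        if not re.match(r"[A-Za-z_]", tok):
            continue  # skip operators and punctuation
        prev_tok = tokens[i - 1] if i > 0 else ""
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        if prev_tok == "->":
            results.append((tok, "struct member"))
        elif next_tok == "(":
            results.append((tok, "function call"))
        else:
            # Could be a variable or a function pointer: report it, marked "*".
            results.append((tok, "ambiguous *"))
    return results

print(classify_tokens("p->len = strcmp(a, b); cmp = strcmp;"))
```

In this sketch the first strcmp is classified as a call because it is followed by a parenthesis, while the second one has no nearby token that settles its type, so it is reported conservatively.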
- Q:
What will the STK file scanners miss?
-
A: The file scanners are conservative, meaning that
they report an impact even if they might be wrong, rather
than miss reporting it. Even so, the following issues may
cause the file scanners to miss impacts:
- The file scanners are not intended to be a replacement
for the compiler, so they detect very few syntax problems.
They report only those syntax problems that involve an API
change.
- The file scanners do not automatically scan include
files. If you have third-party tools, you may want to scan
their include files for impacts.
- If the file scanners encounter nested comments,
comments that aren't closed, or strings without ending
quotes, the scanners do not work correctly. The rest of the
line or file is treated as if it were part of the string
or comment. The scanners do not detect impacts in comments,
and only detect commands, arguments, paths, and libraries
in strings.
- The file scanners can make errors if scope braces { and
} are not balanced in C or C++ programs. The file scanners
use scope information to help reduce overreporting of local
variables, and if the braces are not balanced, they may
miss impacts on pointers to functions.
- The file scanners do not properly handle C function
definitions that have no return type. The scanners cannot
distinguish between these definitions and a call to a
function with the same name, so they may miss impacts on
pointers to functions.
- The file scanners cannot detect 64-bit data model
problems. They do not have enough information about the
types of the identifiers to report that assigning a
particular return value to a variable may result in loss of
data, or that a particular signed to unsigned comparison
could behave differently. To find 64-bit data model
problems, use the C compiler or Flexelint.
- The file scanners have difficulty with programs that
generate code. For example, if your GUI builder generates
code, the scanners cannot detect impacts in that generated
code. You can run the scanners on the code that is actually
generated by such programs, but you cannot detect impacts
in that code by scanning the application that generated the
code.
- If your Fortran code does not use spaces between words,
or adds extra spaces, the file scanners may miss function
or subroutine calls. Any of the following statements
without a space after them, or extra spaces in the keyword,
could also cause a problem: CALL, RETURN, BACKSPACE,
ENDFILE, REWIND.
- Fortran continuation lines can also cause the file
scanners to miss impacts. The file scanners may miss a
command, argument, or path in a string, if that command,
argument, or path is split across two lines. COBOL
continuation lines can cause the same problem.
- COBOL code can use external variables in C libraries,
like errno. The file scanners have a limited
ability to detect this usage, due to the numerous ways in
which the external clause can appear when defining fields.
A COBOL impact on the "EXTERNAL" keyword indicates places
where you might be using external variables.
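To see why an unclosed comment hides impacts, consider this minimal sketch of a C comment stripper. It is an illustration only, not the scanners' code, and strip_c_comments is a hypothetical helper that ignores string literals for brevity. Once a /* has no matching */, everything up to the end of the file is treated as comment text and is never examined.

```python
def strip_c_comments(text):
    """Remove /* ... */ comments; an unclosed comment swallows the rest."""
    out = []
    pos = 0
    while True:
        start = text.find("/*", pos)
        if start == -1:
            out.append(text[pos:])      # no more comments
            break
        out.append(text[pos:start])     # keep code before the comment
        end = text.find("*/", start + 2)
        if end == -1:
            break                       # unclosed: drop everything after "/*"
        pos = end + 2
    return "".join(out)

closed = "a(); /* note */ gets(buf);"
unclosed = "a(); /* note  gets(buf);"
print(strip_c_comments(closed))    # gets(buf) is still visible
print(strip_c_comments(unclosed))  # gets(buf) has disappeared
```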
- Q: Why
do the STK file scanners report that I'm using a function when
I have a variable with the same name?
-
A: In this case, the file scanners can't determine if
the token is a simple variable name or a pointer to a
function. The file scanners can only check nearby tokens and
they have a limited ability to track declarations. Variable
declarations can be missed if they are obscure or complex.
For example: int *****num; In this case,
num can be missed as a declaration because of its
complex declaration. If nothing else indicates that
num is not a function pointer, the file scanners
assume that it might be.
See What causes the STK file
scanners to overreport? for more information.
- Q:
Why do the STK file scanners report impacts on string
contents?
-
A: Several types of impacts can occur in literal
strings. For example, strings may contain paths that have
changed, or commands that could be used by system(),
popen(), or exec*().
The file scanners are conservative and report any string
content that is suspicious. For example, in the statement
printf("file %s\n",filename) the file scanners
cannot determine if file is actually the UNIX
system command file. Although file is
within a printf statement, stdout could
have been redirected to a file that is about to execute with
a system call.
If your application does not call any commands using
system(), popen(), exec*(), or similar commands,
and the output of your program is not executed as a script,
you can tell the file scanners not to look for these types of
problems. To exclude reporting of impacts on commands or
arguments to commands, use the -Y C,A option.
-
Q: Why do the STK file scanners report true and
false as invalid identifiers for C++ when they are C++
keywords?
-
A: true and false have only
recently been added as keywords to the C++ language and were
not supported by the previous HP C++ compiler. All of the new
keywords in the C++ language have associated impacts so that
you can determine if you are using these keywords
incorrectly.
If you have already made the transition to the new HP ANSI
C++ compiler, use the -C ACC option to exclude all
impacts related to this compiler.
-
Q: Why are the file scanners missing C++
impacts in my C++ files?
-
A: If the C++ file has a non-standard extension in its
name, the file scanners do not recognize it. Recognized C++
file extensions are: .C, .cxx, .cc, .cpp, .H, .hxx,
.hh, and .hpp.
If the file extension is .c or .h,
the file scanners assume the file is a C file and apply the
rules for recognizing identifiers in C files. In this case,
the scanners may miss C++ impacts. For example, structure tags
may be missed because the naming rules are more relaxed in
C++.
To remedy this problem, edit the file
/etc/opt/STKS/config/client, which contains lines
that control how various file types are recognized. Add the
extensions used for your C++ programs to the
CXX_EXTENSIONS line. If these extensions are used
by any other file types, delete them from the corresponding
*_EXTENSIONS line.
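The effect of the extension lists can be sketched as follows. The CXX_EXTENSIONS values come from the text above; the name C_EXTENSIONS, the function guess_language, and the ".c++" extension are assumptions used only for illustration, and the real configuration file syntax is not shown here. The comparison is case-sensitive, which is why .C and .c are treated as different languages.

```python
import os

# Extensions taken from the text above; the matching is case-sensitive,
# so ".C" (C++) and ".c" (C) are different.
CXX_EXTENSIONS = {".C", ".cxx", ".cc", ".cpp", ".H", ".hxx", ".hh", ".hpp"}
C_EXTENSIONS = {".c", ".h"}  # hypothetical name for the C list

def guess_language(filename, cxx_extensions=CXX_EXTENSIONS):
    """Classify a file by its extension only."""
    ext = os.path.splitext(filename)[1]
    if ext in cxx_extensions:
        return "C++"
    if ext in C_EXTENSIONS:
        return "C"
    return "unknown"

print(guess_language("widget.cpp"))   # C++
print(guess_language("widget.c"))     # C
print(guess_language("widget.c++"))   # unknown until ".c++" is added
print(guess_language("widget.c++", CXX_EXTENSIONS | {".c++"}))  # C++
```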
-
Q: Why are my comments being scanned?
-
A: There are several possibilities. First, if you
received a file of unknown type message, the file was
not preprocessed. The scanners simply break up the file,
including comments, into words and symbols before looking
them up in the impact database. See What can I do about files of "unknown
type"? for more information.
Another reason is that the syntax for comments varies
slightly across scripting languages. For example, in the
Posix shell a comment begins with any # that isn't
quoted and follows a legal separator or occurs at the
beginning of a line, while in the C shell a comment begins
with any # that isn't quoted. However, rather than
have different preprocessing steps for every language, the
file scanners use one method to remove comments from all
scripts. If this is why your comments are being scanned, add
a space before the # symbol.
The final reason that your comments may be scanned is that
in scripts, it is common practice to create a script as a
string and then execute it. If this script string contains
comments, they are scanned because the file scanners cannot
detect that the string is executable and that it actually
contains comments. You cannot prevent the file scanners from
doing this.
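The single comment-removal rule can be pictured with the sketch below. The exact rule the scanners use is not documented here; this example assumes that a # starts a comment only at the beginning of a line or after whitespace, which is consistent with the advice to add a space before the # symbol. Quoting is ignored to keep the sketch short, and strip_script_comment is a hypothetical name.

```python
import re

# Assumed rule: "#" starts a comment only at the start of a line or
# after whitespace. Quoting is ignored to keep the sketch short.
COMMENT_RE = re.compile(r"(^|\s)#.*$")

def strip_script_comment(line):
    """Remove a trailing comment from one script line under the assumed rule."""
    return COMMENT_RE.sub(r"\1", line).rstrip()

print(strip_script_comment("ls -l # list files"))  # -> "ls -l"
print(strip_script_comment("ls -l;# list files"))  # comment is kept and scanned
```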
- Q: Would other
tools overreport less than the STK file scanners?
- A: HP Softbench 6.0 CodeAdvisor can find many of the
same problems in your C or C++ code. Because it performs a full
parse, it is slower. It does not detect everything that the STK
file scanners do (for example, it does not look for commands or
paths), but what it does detect is less likely to be
overreported.
File scanner usage questions
- Q: What is
an impact statement?
- A: A document that identifies specific application
migration issues between source and destination platforms and
provides remedies or references for more information.
-
Q: How do I prevent certain files from being
scanned?
-
A: You can exclude files from a scan in four ways:
- To exclude just a few files, use the -F
file option, which prevents the specified
file(s) from being scanned. You can add this option to your
$HOME/.scanrc file so that you don't have to
repeat it on every command line. An example
.scanrc is in
/opt/STKS/examples/sample_scanrc, and contains
commented-out lines for many common file scanner
options.
- To exclude files with file types that the file scanners
do not recognize, use the -u option. While this
doesn't allow you to control exactly which files are
scanned, it does reduce overreporting caused by scanning
unrecognized files.
- A more precise option is to edit the client
configuration file in
/etc/opt/STKS/config/client. Among other
functions, this file describes the files that are not to be
scanned. If these files have easily-identified names (such
as a standard extension or a standard name before the
extension), you can add this information to the variables
EXCLUDE_FIRSTNAMES and
EXCLUDE_EXTENSIONS.
- The best way to prevent the file scanners from scanning
some files is to specify exactly what to scan. Use the
-f filelist option to specify a list
of files. Use find(1) to create a list of
files, then delete the names of the files you don't want
scanned. See the next question for more information on
using filelists.
-
Q: How can I exercise more control over which
files are scanned?
-
A: The -r rootdir option for
locating files is convenient. However, if you need more
control than that and don't want to list each file on the
command line, use the -f filelist
option. The filelist is a list of files, one per line, that
specifies which files to scan. Use find(1) to
create a list of files, then delete the names of the files
you don't want scanned.
You can use either relative or absolute path names in the
filelist. Absolute paths allow you to run the file scanners
in any directory. However, your filelists may not work for
another user. For example, if each user has a copy of the
source checked out from the source code control system,
filelists with absolute path names can only point to one of
the copies.
Relative path names do not have this problem, as long as the
basic source hierarchy is the same. However, you must run the
file scanners in a particular directory. For example, if all
path names are relative from the top of the source tree, you
need to be at the top of the source tree to do the scan.
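As an alternative to find(1) followed by hand editing, a filelist with relative path names can be generated by a short script. The sketch below is one possible approach, not part of the STK: the helper names and the default excluded extensions are assumptions. It writes one path per line, relative to the directory it walks, so the resulting list works for any user who runs the scan from the top of the same source hierarchy.

```python
import os

def build_filelist(rootdir, exclude_extensions=(".o", ".a")):
    """Return sorted paths under rootdir, relative to rootdir."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(rootdir):
        for name in filenames:
            if os.path.splitext(name)[1] in exclude_extensions:
                continue  # drop files you do not want scanned
            full = os.path.join(dirpath, name)
            paths.append(os.path.relpath(full, rootdir))
    return sorted(paths)

def write_filelist(paths, filelist):
    """Write one path per line, as described for the -f filelist option."""
    with open(filelist, "w") as fh:
        for path in paths:
            fh.write(path + "\n")
```

For example, calling write_filelist(build_filelist("."), "/tmp/filelist") from the top of the source tree produces a list that can then be passed to the scanners with -f, provided the scan is also run from that directory.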
-
Q: How do I determine which files should be
scanned?
- A: In addition to scanning your source code files,
scan your Makefiles and C or C++ header files. Scanning
Makefiles can reveal build problems such as obsolete or changed
libraries. Scanning C or C++ header files is necessary because
the scanner does not include them when scanning source
files.
- Q:
When do I need to re-scan?
-
A: The only reason to re-scan is that once you have
resolved a few impacts or modified your source files, line
numbers for impacts are no longer accurate. Running the
scanner again corrects the line numbers.
You probably don't need to re-scan as often as you might
think, because the file scanners cannot detect if you've
actually resolved an impact. You cannot measure your progress
in resolving impacts by simply re-scanning.
-
Q: Do I have to resolve all of the impacts
before my code will run?
- A: No. The file scanners were designed to help you
find possible problems with your source. The scanners are
conservative in reporting impacts and cannot tell if your code
is correct or incorrect; they can indicate only that you may be
using something that has changed. You must look at each impact
individually to decide if your code is affected or not. Even if
your code is incorrect and you resolve it, in most cases the
scanner cannot recognize the fix and still reports an
impact.
-
Q: Why should I care about Non-Critical
impacts?
-
A: There are three types of non-critical impacts:
Warning, Non-Standard, and Enhancement. While these impacts
do not cause your code to fail, you should investigate all of
them for their applicability.
- Warning impacts are informational and indicate changes
that you may want to be aware of. In most cases, warnings
won't prevent your code from running, but they may give you
additional information about potential problems. In
general, it's a good idea to scan for warnings, but only
after you have resolved any other problems in your
code.
- Non-Standard impacts indicate old or deprecated code
that is no longer supported by standards and thus may not
be good to use in the long term. These impacts do not
affect your code today, but they indicate areas that may
need to be changed in the future.
- Enhancement impacts indicate new features that you may
be able to take advantage of. These impacts are not
problems that would prevent your code from working, just
new capabilities to consider using. You do not have to scan
for these impacts, but if you do, wait until you have
resolved any other problems in your code. Adding new
features in the midst of transitioning code to a new
operating system can cause confusion, especially during
testing.
-
Q: Which impacts can I ignore?
-
A: Ideally, none of them. However, development
schedules are often tight and some impacts are not absolutely
essential to getting your application running on the new
operating system. In general, you can safely exclude
Non-Standard and Enhancement impacts using
-TNs,En.
In addition, look at each impact classification and decide
if it applies to you. If not, exclude it using -C
class. The most common classifications to exclude
are:
- ACC, if you're not transitioning C++ code to
the new HP ANSI C++ compiler
- 64, if your application does not need to be
64-bit clean
- TH, if your code is not threaded
- NW, if it is not a networking
application
- F90, if you are not transitioning Fortran
code from HP Fortran 77 to HP Fortran 90
- CBL4, if you are not transitioning COBOL
code from HP Micro Focus COBOL 3.x to 4.x
- 32, if your application does not
interoperate with 64-bit applications.
You can add any of these exclusion options to your
$HOME/.scanrc file. An example .scanrc
is in /opt/STKS/examples/sample_scanrc, and
contains commented-out lines for many common file scanner
options.
-
Q: Why would I use scansummary?
-
A: The scansummary file scanner produces an
overview of your transition issues without all the
information about file location that you would get from
scandetail. It lists each impact only once with a
number indicating how often the impact occurs.
An especially useful option to scansummary is
to sort by impact classification (-s
class). This option groups each impact by impact
classification, so that all the networking impacts are
grouped together, all the 64-bit impacts are grouped
together, and so on. This grouping helps you determine what
impacts are related by which technology or change.
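The relationship between detailed and summary output can be sketched in a few lines. The record layout, the impact names, and the summarize function below are made up for illustration and do not reflect the actual report formats; only the classification codes NW and 64 come from this document. The sketch drops file and line information, counts each impact once, and groups the results by classification.

```python
from collections import Counter
from itertools import groupby

# Made-up detail records: (file, line, classification, impact name).
detail = [
    ("net.c", 10, "NW", "impact-A"),
    ("net.c", 42, "NW", "impact-A"),
    ("io.c", 7, "64", "impact-B"),
    ("net.c", 99, "NW", "impact-C"),
]

def summarize(records):
    """Count each impact once, dropping file and line information."""
    counts = Counter((cls, name) for _file, _line, cls, name in records)
    # Sorting by (classification, name) groups related impacts together.
    return sorted(counts.items())

for cls, group in groupby(summarize(detail), key=lambda item: item[0][0]):
    print(cls)
    for (_cls, name), count in group:
        print(f"  {name}: {count}")
```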
-
Q: Why would I use the text output from the file
scanners?
- A: The text output option (-o text) was
designed so that people who use tools like Softbench or emacs
could use the tool's ability to parse compiler warning messages
to integrate an editor. If you use the text output with these
tools, you can click on any impact and use that tool's editor
to view your code.
- Q:
Can I edit the reports from the file scanners?
- A: Yes. However, if you do so, much of the
information about which options were used in the scan becomes
invalid. It is more difficult to repeat the scan based on the
information in the report. If you need reproducibility, do not
edit the reports.
- Q:
How do I get a report that groups all impacts of the same
type?
- A: For scandetail, use the -s
synopsis option, which sorts impacts by synopsis.
Thus, all impacts that are essentially the same are grouped
together.
The scansummary program does this automatically, but
doesn't show where the impacts are located.
- Q:
How do I get a listing of most to least severe
problems?
- A: For scandetail, use the -s
synopsis option, which sorts impacts by synopsis and lists
the most severe problems first. The scansummary -s type
option does the same thing and is enabled by default.
-
Q: What other file types can I scan?
-
A: The file scanners can scan almost any text file, so
you can use them in many ways. For example, you can scan
configuration files or the output of binaries after they are
passed through the strings command. You could
check your /etc/services file to see if it
contains services that may have been impacted. (Use the
command scandetail +YC,P,A /etc/services to scan
only for commands, paths, and command arguments, which
minimizes overreporting.) Or, you could scan the
strings output of your third-party library to see
if you are referencing any commands, paths, or libraries that
have changed and might therefore cause a binary compatibility
problem. First, run strings on the library and
redirect the output to a file. The file name should be some
name that the scanner won't recognize as a valid filetype,
for example libstrings. Then run
scansummary as follows: scansummary +YC,P,A,L
libstrings. Do not use scandetail, because
file and line information are useless for this scan.
Using the file scanners in this way requires more careful
interpretation. For example, what looks like a reference to
an obsolete command in the strings output of your
library does not necessarily indicate a problem. Evaluate
what the application is doing and whether it even makes sense
for the library to be using that command.
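To get a feel for what the scanner sees in this kind of scan, the sketch below approximates what strings does: it extracts runs of printable ASCII characters from binary data and writes them to a file. It is a rough stand-in for illustration, not a replacement for the strings command; the function names are hypothetical, and the minimum run length of 4 mirrors the usual strings default.

```python
import re

def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]

def dump_strings(library_path, out_path="libstrings"):
    """Write the printable strings of a binary file, one per line."""
    with open(library_path, "rb") as src:
        found = printable_strings(src.read())
    with open(out_path, "w") as dst:
        dst.write("\n".join(found) + "\n")
    return len(found)
```

The output file has no recognizable extension, so the scanner treats it as plain text, and every line is simply a candidate command, path, or library name that still needs the careful interpretation described above.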
-
Q: What happens to my existing reports if I
upgrade the STK tools?
-
A: Downloading the latest version of the STK has a
minimal impact on your existing STK reports. Impact pages
have the same name from version to version, so a report that
linked to an impact in a previous version links to the same
impact in the latest version. This ensures that reports are
compatible from release to release. However, the impact
itself may have changed.
Impacts can change in two ways. The STK team frequently
improves the content of certain impacts as better information
becomes available. Not all impact pages have the same content
as they did in the previous release. For example, an impact
that was Non-critical in a previous release may be labeled as
Critical now, or the list of identifiers may be
different.
Occasionally, an impact is removed, usually because the
information in the impact is duplicated elsewhere, or because
it was determined that the impact was incorrect. If your old
report links to an impact that is Obsolete, you
can ignore the impact, because it is no longer valid.
You must have a compatible version of the STK file scanner
database for correct operation of the STK file scanners.
Check the file /opt/STKS/lib/Version and ensure
the version is compatible with your OS version. If it is not,
see Downloading the STK Tools to
download a compatible version of the STK tools.
-
Q: Can someone other than the person doing the
scan use the STK reports?
- A: Yes, to some degree. The editor links (the file
name and line number links) typically do not work when users
share reports. The editor links use absolute path names so that
you can start the browser in any directory. Because they use
absolute path names, other users probably cannot use the
reports and cannot edit the files in the editor.
If you want to have multiple users use the same STK report, you
can edit the HTML output so that it uses only relative path
names. Then you must start your browser in the directory to
which the path names are relative. Relative file references are
always relative to the browser's current working directory. If
you make this change, the reports work for any user that has
the same directory setup for the files being scanned.
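Editing the HTML by hand is tedious for large reports, so the rewrite can be scripted. The sketch below assumes that editor links appear as href="/absolute/path..." attributes, which may not match the actual report markup; the function name relativize_links and the sample path are also hypothetical. It converts links under a chosen base directory to relative paths and leaves all other links untouched.

```python
import os
import re

# Assumption: editor links look like href="/absolute/path/to/file...".
HREF_RE = re.compile(r'href="(/[^"#]*)')

def relativize_links(html, base_dir):
    """Rewrite absolute href paths under base_dir as paths relative to it."""
    prefix = base_dir.rstrip("/") + "/"

    def repl(match):
        path = match.group(1)
        if not path.startswith(prefix):
            return match.group(0)      # leave links outside the tree alone
        return 'href="' + os.path.relpath(path, base_dir)

    return HREF_RE.sub(repl, html)

page = '<a href="/home/ann/src/lib/io.c#12">io.c</a>'
print(relativize_links(page, "/home/ann/src"))
```

Work on a copy of the report, and check a few links in the browser before sharing it.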
-
Q: How can I avoid retyping file scanner
options?
- A: The file scanners access a resource or
configuration file in your home directory called
.scanrc. They execute any options you put into this
file just as if you had typed them on the command line. See the
next two questions for information on what should and should
not be in the $HOME/.scanrc. An example
.scanrc is in
/opt/STKS/examples/sample_scanrc, and contains
commented-out lines for many common file scanner options.
-
Q: What should and shouldn't I put in the
.scanrc?
-
A: Only add options to your $HOME/.scanrc
that you want to run with every scan, so you don't have to
change this file frequently. The options you put in
.scanrc depend on the phase of the transition
process you are in.
During the investigation process, you probably don't want
much in your .scanrc file. At this stage, you are
doing many different scans, from whole source base scans for
all classifications, severities, and types to single file
scans for specific classifications, severities, and
types.
During the planning process, you know which
classifications and severities of impacts you are interested
in. At this stage, your .scanrc may exclude some
classifications (with the -C class
option) and exclude some impact types (with the -T
type option).
During the porting process, you may want to exclude
individual impacts with the -I
synopsisID option.
Be careful about adding -Y
identifier_type options. These options can easily
exclude many impacts and may not always be safe. You can use
-Y Cc,Cf,Ck if you don't have any COBOL files, or
-Y Ff,Fs,Fi,Fk if you don't have any Fortran
files, but beyond that, you risk excluding more than you
should.
Do not put files in your $HOME/.scanrc file.
While -r rootdir and -f
filelist options are OK, you cannot list the
actual file names to be scanned.
- Q:
What can I do about files of "unknown type"?
-
A: To handle files of unknown type, change their
extension to a known file type. The file scanners try to
identify each file so that they can use the relevant scanning
rules. If a file cannot be identified, they use generic rules
that can produce overreporting.
To determine a file's type, the file scanners use
information in the client configuration file
/etc/opt/STKS/config/client. This file contains
lists of extensions and names that identify different types
of files. It also contains a list of interpreters that
identify scripting languages.
You can use this information in two ways. First, you can
change the file so it can be identified. For example, if you
have a script, its first line should be
#!/path/interpreter. The file scanners
use this line to determine that the file is actually a
script.
Second, you can edit the client configuration file. For
example, if you use .perl to identify your Perl
scripts, add .perl, and any other extensions you use
for scripts, to the SCRIPT_EXTENSIONS line. If
these extensions are used by any other file types, delete
them from the corresponding *_EXTENSIONS line.
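The two identification routes, by extension and by the #! line, can be sketched as follows. The contents of SCRIPT_EXTENSIONS and INTERPRETERS below are hypothetical stand-ins for the lists in the client configuration file, and identify is an illustrative function, not the scanners' logic.

```python
import os

# Hypothetical stand-ins for the lists in the client configuration file.
SCRIPT_EXTENSIONS = {".sh", ".ksh", ".csh", ".pl"}
INTERPRETERS = {"sh", "ksh", "csh", "perl"}

def identify(filename, first_line=""):
    """Return "script" or "unknown" from the name or the #! line."""
    if os.path.splitext(filename)[1] in SCRIPT_EXTENSIONS:
        return "script"
    if first_line.startswith("#!"):
        words = first_line[2:].split()
        if words and os.path.basename(words[0]) in INTERPRETERS:
            return "script"
    return "unknown"

print(identify("report.perl"))                        # unknown
print(identify("report.perl", "#!/usr/bin/perl -w"))  # script
SCRIPT_EXTENSIONS.add(".perl")                        # like editing the config
print(identify("report.perl"))                        # script
```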
- Q: Where
can I get information on file scanner errors and
warnings?
- A: Full information on file scanner error and warning
messages can be found in File Scanner
Error Messages.
- Q: What is
the difference between scansummary, scandetail, and scanwizard?
- A: Both scansummary and scandetail are tools that scan
source code.
scansummary helps you plan your source code transition by determining
the number of instances of transition impacts in your source files.
scandetail helps you perform a transition by indicating exactly what
transition impacts occur on each line of your source files.
scanwizard is an interactive command-line utility that assists you
in generating a custom detailed or summary report using the
scandetail or scansummary utilities. This is the recommended
scanning tool for new users.