Solaris Software Transition Kit

Scandetail and scansummary FAQ

 
This document provides answers to frequently asked questions about the STK file scanners, scandetail and scansummary, and contains:
 

File scanner accuracy questions

» what causes the STK file scanners to overreport?
» what will the STK file scanners miss?
» why do the STK file scanners report that I'm using a function when I have a variable with the same name?
» why do the STK file scanners report impacts on string contents?
» why do the STK file scanners report true and false as invalid identifiers for C++ when they are C++ keywords?
» why are the file scanners missing C++ impacts in my C++ files?
» why are my comments being scanned?
» would other tools overreport less than the STK file scanners?
   

File scanner usage questions

» what is an impact statement?
» how do I prevent certain files from being scanned?
» how can I exercise more control over which files are scanned?
» how do I determine which files should be scanned?
» when do I need to re-scan?
» do I have to resolve all of the impacts before my code will run?
» why should I care about non-critical impacts?
» which impacts can I ignore?
» why would I use scansummary?
» why would I use the text output from the scanners?
» can I edit the reports from the scanner?
» how do I get a report that groups all impacts of the same type?
» how do I get a listing of most to least severe problems?
» what other file types can I scan?
» what happens to my existing reports if I upgrade the STK tools?
» can someone other than the person doing the scan use the STK reports?
» how can I avoid retyping file scanner options?
» what should and shouldn't I put in the .scanrc file?
» what can I do about files of "unknown type"?
» where can I get information on file scanner errors and warnings?
» what is the difference between scansummary, scandetail, and scanwizard?
   

File scanner accuracy questions

Q: what causes the STK file scanners to overreport?
A: The file scanners do not parse your source code files. Instead, they scan each file for tokens using the rules of the file's language. Then they analyze groups of tokens to decide what a particular token or identifier could be. For example, in C, the pattern ID1 -> ID2 indicates that ID2 is the name of a structure member. In Fortran, the pattern CALL ID indicates that ID is a Fortran subroutine.

This method is fast but cannot handle every situation. For example, the token strcmp could be either a variable with the same name as the function strcmp, or a pointer to the function strcmp. A full parsing solution could resolve such ambiguity, but would be much slower and still would not provide completely accurate information.

Thus, the file scanners are not always able to accurately determine the type of a particular identifier. In these cases, rather than miss a possible impact, the file scanners report impacts that apply to any of the possible types of the identifier. In the output report, such impacts are marked with an asterisk * to indicate that the file scanner may have chosen the wrong type for this identifier.
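
The token-pattern approach described above can be illustrated with a small sketch. This is not the STK implementation: the function name report_impacts, the one-entry IMPACTED_FUNCTIONS set, and the "function *" label are all assumptions made for illustration. It looks only at the tokens next to an impacted name, so it cannot tell a variable named strcmp from a pointer to the function and reports both with an asterisk.

```python
import re

# Hypothetical stand-in for the impact database: names of impacted functions.
IMPACTED_FUNCTIONS = {"strcmp"}

# Identifiers, numbers, the "->" operator, then any other single character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|->|\S")

def report_impacts(source):
    """Return (name, guess) for each impacted name, judged by adjacent tokens only."""
    tokens = TOKEN_RE.findall(source)
    report = []
    for i, tok in enumerate(tokens):
        if tok not in IMPACTED_FUNCTIONS:
            continue
        prev_tok = tokens[i - 1] if i > 0 else ""
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        if prev_tok == "->":
            continue  # ID1 -> ID2: ID2 is a structure member, not the function
        if next_tok == "(":
            report.append((tok, "function call"))
        else:
            # Could be a variable or a function pointer: report it, flagged with *
            report.append((tok, "function *"))
    return report
```

In this sketch, both "int strcmp;" and "cmp = strcmp;" produce a flagged report, which mirrors the kind of overreporting the answer describes.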

Q: what will the STK file scanners miss?
A: The file scanners are conservative: they report an impact even when they might be wrong, rather than risk missing it. Even so, the following issues may cause the file scanners to miss impacts:
  • The file scanners are not intended to be a replacement for the compiler, so they detect very few syntax problems. They report only those syntax problems that involve an API change.
  • The file scanners do not automatically scan include files. If you have third-party tools, you may want to scan their include files for impacts.
  • If the file scanners encounter nested comments, comments that aren't closed, or strings without ending quotes, the scanners do not work correctly. The rest of the line or file is treated as if it were part of the string or comment. The scanners do not detect impacts in comments, and only detect commands, arguments, paths, and libraries in strings.
  • The file scanners can make errors if scope braces { and } are not balanced in C or C++ programs. The file scanners use scope information to help reduce overreporting of local variables, and if the braces are not balanced, they may miss impacts on pointers to functions.
  • The file scanners do not properly handle C function definitions that have no return type. The scanners cannot distinguish between these definitions and a call to a function with the same name, so they may miss impacts on pointers to functions.
  • The file scanners cannot detect 64-bit data model problems. They do not have enough information about the types of the identifiers to report that assigning a particular return value to a variable may result in loss of data, or that a particular signed to unsigned comparison could behave differently. To find 64-bit data model problems, use the C compiler or Flexelint.
  • The file scanners have difficulty with programs that generate code. For example, if your GUI builder generates code, the scanners cannot detect impacts in that generated code. You can run the scanners on the code that is actually generated by such programs, but you cannot detect impacts in that code by scanning the application that generated the code.
  • If your Fortran code does not use spaces between words, or adds extra spaces, the file scanners may miss function or subroutine calls. Any of the following statements without a space after them, or extra spaces in the keyword, could also cause a problem: CALL, RETURN, BACKSPACE, ENDFILE, REWIND.
  • Fortran continuation lines can also cause the file scanners to miss impacts. The file scanners may miss a command, argument, or path in a string, if that command, argument, or path is split across two lines. COBOL continuation lines can cause the same problem.
  • COBOL code can use external variables in C libraries, like errno. The file scanners have a limited ability to detect this usage, due to the numerous ways in which the external clause can appear when defining fields. A COBOL impact on the "EXTERNAL" keyword indicates places where you might be using external variables.
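
Several of the issues above (unclosed comments, strings without ending quotes, unbalanced braces) can be checked before a scan. The following is a minimal sketch of such a pre-check for C or C++ text. It is not part of the STK; the function name check_c_source and its messages are hypothetical, and it does not detect nested comments or preprocessor tricks.

```python
def check_c_source(text):
    """Return a list of problems that could confuse a token-based scanner."""
    problems = []
    depth = 0
    i, n = 0, len(text)
    while i < n:
        two = text[i:i + 2]
        ch = text[i]
        if two == "/*":
            end = text.find("*/", i + 2)
            if end == -1:
                problems.append("unclosed comment")
                break
            i = end + 2
        elif two == "//":
            end = text.find("\n", i)
            i = n if end == -1 else end + 1
        elif ch in "\"'":
            # Walk to the closing quote, skipping backslash escapes.
            j = i + 1
            while j < n and text[j] != ch and text[j] != "\n":
                j += 2 if text[j] == "\\" else 1
            if j >= n or text[j] != ch:
                problems.append("unterminated string or character literal")
            i = j + 1
        else:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth < 0:
                    problems.append("unmatched closing brace")
                    depth = 0
            i += 1
    if depth > 0:
        problems.append("unmatched opening brace")
    return problems
```

An empty list means none of these particular problems were found; it does not mean the file will be scanned without missed impacts.
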
Q: why do the STK file scanners report that I'm using a function when I have a variable with the same name?
A: In this case, the file scanners can't determine if the token is a simple variable name or a pointer to a function. The file scanners can only check nearby tokens and they have a limited ability to track declarations. Variable declarations can be missed if they are obscure or complex. For example, in the declaration int *****num; the scanners can fail to recognize num as a declared variable because the declaration is complex. If nothing else indicates that num is not a function pointer, the file scanners assume that it might be.

See What causes the STK file scanners to overreport? for more information.

Q: why do the STK file scanners report impacts on string contents?
A: Several types of impacts can occur in literal strings. For example, strings may contain paths that have changed, or commands that could be used by system(), popen(), or exec*().

The file scanners are conservative and report any string content that is suspicious. For example, in the statement printf("file %s\n",filename) the file scanners cannot determine if file is actually the UNIX system command file. Although file is within a printf statement, stdout could have been redirected to a file that is about to execute with a system call.

If your application does not call any commands using system(), popen(), exec*(), or similar commands, and the output of your program is not executed as a script, you can tell the file scanners not to look for these types of problems. To exclude reporting of impacts on commands or arguments to commands, use the -Y C,A option.

Q: why do the STK file scanners report true and false as invalid identifiers for C++ when they are C++ keywords?
A: true and false have only recently been added as keywords to the C++ language and were not supported by the previous HP C++ compiler. All of the new keywords in the C++ language have associated impacts so that you can determine if you are using these keywords incorrectly.

If you have already made the transition to the new HP ANSI C++ compiler, use the -C ACC option to exclude all impacts related to this compiler.

Q: why are the file scanners missing C++ impacts in my C++ files?
A: If the C++ file has a non-standard extension in its name, the file scanners do not recognize it. Recognized C++ file extensions are: .C, .cxx, .cc, .cpp, .H, .hxx, .hh, and .hpp.

If the file extension is .c or .h, the file scanners assume the file is a C file and apply the rules for recognizing identifiers in C files. In this case, the scanners may miss C++ impacts. For example, structure tags may be missed because the naming rules are more relaxed in C++.

To remedy this problem, edit the file /etc/opt/STKS/config/client, which contains lines that control how various file types are recognized. Add the extensions used for your C++ programs to the CXX_EXTENSIONS line. If these extensions are used by any other file types, delete them from the corresponding *_EXTENSIONS line.
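
To see which of your files would fall outside the recognized C++ extensions, a rough check like the following may help. It only mirrors the extension lists given in this answer; the function name guess_language is hypothetical, and the sketch does not read or reproduce the format of the client configuration file.

```python
import os

# Extensions this FAQ lists as recognized C++ extensions (case-sensitive).
CXX_EXTENSIONS = {".C", ".cxx", ".cc", ".cpp", ".H", ".hxx", ".hh", ".hpp"}
# Extensions this FAQ says are treated as C.
C_EXTENSIONS = {".c", ".h"}

def guess_language(path):
    """Guess how a file would be treated, based on its extension alone."""
    ext = os.path.splitext(path)[1]
    if ext in CXX_EXTENSIONS:
        return "C++"
    if ext in C_EXTENSIONS:
        return "C"
    return "unknown"
```

Note that the comparison is case-sensitive: widget.C is treated as C++, while widget.c is treated as C.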

Q: why are my comments being scanned?
A: There are several possibilities. First, if you received a "file of unknown type" message, the file was not preprocessed. The scanners simply break up the file, including comments, into words and symbols before looking them up in the impact database. See What can I do about files of "unknown type"? for more information.

Another reason is that the syntax for comments varies slightly across scripting languages. For example, in the POSIX shell a comment begins with any # that isn't quoted and follows a legal separator or occurs at the beginning of a line, while in the C shell a comment begins with any # that isn't quoted. However, rather than have different preprocessing steps for every language, the file scanners use one method to remove comments from all scripts. If this is why your comments are being scanned, add a space before the # symbol.

The final reason that your comments may be scanned is that in scripts, it is common practice to create a script as a string and then execute it. If this script string contains comments, they are scanned because the file scanners cannot detect that the string is executable and that it actually contains comments. You cannot prevent the file scanners from doing this.
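
The difference between the two comment rules mentioned above can be sketched as follows. This is only an illustration of the rules as stated in this answer, not the method the file scanners use. The function name comment_start is hypothetical, only whitespace is treated as a separator, and backslash escapes are ignored.

```python
def comment_start(line, need_separator):
    """Return the index where a comment starts, or -1 if the line has none.

    need_separator=True:  an unquoted '#' must start the line or follow
                          whitespace (simplified POSIX-shell-style rule).
    need_separator=False: any unquoted '#' starts a comment
                          (C-shell-style rule as described above).
    """
    quote = None
    for i, ch in enumerate(line):
        if quote:
            if ch == quote:
                quote = None  # closing quote found
        elif ch in "'\"":
            quote = ch  # opening quote
        elif ch == "#":
            if not need_separator or i == 0 or line[i - 1].isspace():
                return i
    return -1
```

With the line echo a#b, the first rule finds no comment while the second treats #b as one, which is why a single comment-removal method cannot match every scripting language exactly.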

Q: would other tools overreport less than the STK file scanners?
A: HP Softbench 6.0 CodeAdvisor can find many of the same problems in your C or C++ code. Because it performs a full parse, it is slower. It does not detect everything that the STK file scanners do (for example, it does not look for commands or paths), but what it does detect is less likely to be overreported.

File scanner usage questions

Q: what is an impact statement?
A: A document that identifies specific application migration issues between source and destination platforms and provides remedies or references for more information.
Q: how do I prevent certain files from being scanned?
A: You can exclude files from a scan in four ways:
  1. To exclude just a few files, use the -F file option, which prevents the specified file(s) from being scanned. You can add this option to your $HOME/.scanrc file so that you don't have to repeat it on every command line. An example .scanrc is in /opt/STKS/examples/sample_scanrc, and contains commented-out lines for many common file scanner options.
  2. To exclude files with file types that the file scanners do not recognize, use the -u option. While this doesn't allow you to control exactly which files are scanned, it does reduce overreporting caused by scanning unrecognized files.
  3. A more precise option is to edit the client configuration file in /etc/opt/STKS/config/client. Among other functions, this file describes the files that are not to be scanned. If these files have easily-identified names (such as a standard extension or a standard name before the extension), you can add this information to the variables EXCLUDE_FIRSTNAMES and EXCLUDE_EXTENSIONS.
  4. The best way to prevent the file scanners from scanning some files is to specify exactly what to scan. Use the -f filelist option to specify a list of files. Use find(1) to create a list of files, then delete the names of the files you don't want scanned. See the next question for more information on using filelists.
Q: how can I exercise more control over which files are scanned?
A: The -r rootdir option for locating files is convenient. However, if you need more control than that and don't want to list each file on the command line, use the -f filelist option. The filelist is a list of files, one per line, that specifies which files to scan. Use find(1) to create a list of files, then delete the names of the files you don't want scanned.

You can use either relative or absolute path names in the filelist. Absolute paths allow you to run the file scanners in any directory. However, your filelists may not work for another user. For example, if each user has a copy of the source checked out from the source code control system, filelists with absolute path names can only point to one of the copies.

Relative path names do not have this problem, as long as the basic source hierarchy is the same. However, you must run the file scanners in a particular directory. For example, if all path names are relative from the top of the source tree, you need to be at the top of the source tree to do the scan.
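
As an alternative to find(1) followed by manual editing, a filelist with relative path names could be generated by a short script. The sketch below is one possible approach under stated assumptions: the function names build_filelist and write_filelist and the exclusion patterns are hypothetical, and the only format assumed is the one described above (one file per line).

```python
import fnmatch
import os

def build_filelist(root, exclude_patterns=()):
    """Return sorted paths under root, relative to root, skipping excluded names."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pat) for pat in exclude_patterns):
                continue
            full = os.path.join(dirpath, name)
            paths.append(os.path.relpath(full, root))
    return sorted(paths)

def write_filelist(paths, filelist_path):
    """Write one path per line, the layout the -f filelist option expects."""
    with open(filelist_path, "w") as out:
        for p in paths:
            out.write(p + "\n")
```

Because the paths are relative to root, the file scanners would need to be run from that directory when the resulting file is passed with the -f filelist option.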

Q: how do I determine which files should be scanned?
A: In addition to scanning your source code files, scan your Makefiles and C or C++ header files. Scanning Makefiles can reveal build problems such as obsolete or changed libraries. Scanning C or C++ header files is necessary because the scanner does not include them when scanning source files.
Q: when do I need to re-scan?
A: The only reason to re-scan is that once you have resolved a few impacts or modified your source files, line numbers for impacts are no longer accurate. Running the scanner again corrects the line numbers.

You probably don't need to re-scan as often as you might think, because the file scanners cannot detect if you've actually resolved an impact. You cannot measure your progress in resolving impacts by simply re-scanning.

Q: do I have to resolve all of the impacts before my code will run?
A: No. The file scanners were designed to help you find possible problems with your source. The scanners are conservative in reporting impacts and cannot tell if your code is correct or incorrect; they can indicate only that you may be using something that has changed. You must look at each impact individually to decide if your code is affected or not. Even if your code is incorrect and you resolve it, in most cases the scanner cannot recognize the fix and still reports an impact.
Q: why should I care about Non-Critical impacts?
A: There are three types of non-critical impacts: Warning, Non-Standard, and Enhancement. While these impacts do not cause your code to fail, you should investigate all of them for their applicability.
  • Warning impacts are informational and indicate changes that you may want to be aware of. In most cases, warnings won't prevent your code from running, but they may give you additional information about potential problems. In general, it's a good idea to scan for warnings, but only after you have resolved any other problems in your code.
  • Non-Standard impacts indicate old or deprecated code that is no longer supported by standards and thus, may not be good to use in the long term. These impacts do not affect your code today, but they indicate areas that may need to be changed in the future.
  • Enhancement impacts indicate new features that you may be able to take advantage of. These impacts are not problems that would prevent your code from working, just new capabilities to consider using. You do not have to scan for these impacts, but if you do, wait until you have resolved any other problems in your code. Adding new features in the midst of transitioning code to a new operating system can cause confusion, especially during testing.
Q: which impacts can I ignore?
A: Ideally, none of them. However, development schedules are often tight and some impacts are not absolutely essential to getting your application running on the new operating system. In general, you can safely exclude Non-Standard and Enhancement impacts using -TNs,En.

In addition, look at each impact classification and decide if it applies to you. If not, exclude it using -C class. The most common classifications to exclude are:

  • ACC, if you're not transitioning C++ code to the new HP ANSI C++ compiler
  • 64, if your application does not need to be 64-bit clean
  • TH, if your code is not threaded
  • NW, if it is not a networking application
  • F90, if you are not transitioning Fortran code from HP Fortran 77 to HP Fortran 90
  • CBL4, if you are not transitioning COBOL code from HP Micro Focus COBOL 3.x to 4.x
  • 32, if your application does not interoperate with 64-bit applications.

You can add any of these exclusion options to your $HOME/.scanrc file. An example .scanrc is in /opt/STKS/examples/sample_scanrc, and contains commented-out lines for many common file scanner options.

Q: why would I use scansummary?
A: The scansummary file scanner produces an overview of your transition issues without all the information about file location that you would get from scandetail. It lists each impact only once with a number indicating how often the impact occurs.

An especially useful option to scansummary is to sort by impact classification (-s class). This option groups each impact by impact classification, so that all the networking impacts are grouped together, all the 64-bit impacts are grouped together, and so on. This grouping helps you determine what impacts are related by which technology or change.

Q: why would I use the text output from the file scanners?
A: The text output option (-o text) was designed so that people who use tools like Softbench or emacs could use the tool's ability to parse compiler warning messages to integrate an editor. If you use the text output with these tools, you can click on any impact and use that tool's editor to view your code.
Q: can I edit the reports from the file scanners?
A: Yes. However, if you do so, much of the information about which options were used in the scan becomes invalid. It is more difficult to repeat the scan based on the information in the report. If you need reproducibility, do not edit the reports.
Q: how do I get a report that groups all impacts of the same type?
A: For scandetail, use the -s synopsis option, which sorts impacts by synopsis. Thus, all impacts that are essentially the same are grouped together.

The scansummary program does this automatically, but doesn't show where the impacts are located.
Q: how do I get a listing of most to least severe problems?
A: For scandetail, use the -s synopsis option, which sorts impacts by synopsis and lists the most severe problems first. scansummary's -s type option does the same thing and is enabled by default.
Q: what other file types can I scan?
A: The file scanners can scan almost any text file, so you can use them in many ways. For example, you can scan configuration files or the output of binaries after they are passed through the strings command. You could check your /etc/services file to see if it contains services that may have been impacted. (Use the command scandetail +YC,P,A /etc/services to scan only for commands, paths, and command arguments, which minimizes overreporting.) Or, you could scan the strings output of your third-party library to see if you are referencing any commands, paths, or libraries that have changed and might therefore cause a binary compatibility problem. First, run strings on the library and redirect the output to a file. The file name should be some name that the scanner won't recognize as a valid filetype, for example libstrings. Then run scansummary as follows: scansummary +YC,P,A,L libstrings. Do not use scandetail, because file and line information are useless for this scan.

Using the file scanners in this way requires more careful interpretation. For example, what looks like a reference to an obsolete command in the strings output of your library does not necessarily indicate a problem. Evaluate what the application is doing and whether it even makes sense for the library to be using that command.
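
If the strings command is not convenient, its output can be approximated with a few lines of code. The sketch below is a rough equivalent under stated assumptions: the function name printable_strings is hypothetical, only printable ASCII is considered, and the minimum length of 4 is an assumption chosen to resemble the usual default of strings.

```python
import re

def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII characters found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(data)]
```

Writing the returned strings one per line to a file such as libstrings gives input suitable for the scansummary +YC,P,A,L libstrings command shown above.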

Q: what happens to my existing reports if I upgrade the STK tools?
A: Downloading the latest version of the STK has a minimal impact on your existing STK reports. Impact pages have the same name from version to version, so a report that linked to an impact in a previous version links to the same impact in the latest version. This ensures that reports are compatible from release to release. However, the impact itself may have changed.

Impacts can change in two ways. The STK team frequently improves the content of certain impacts as better information becomes available. Not all impact pages have the same content as they did in the previous release. For example, an impact that was Non-critical in a previous release may be labeled as Critical now, or the list of identifiers may be different.

Occasionally, an impact is removed, usually because the information in the impact is duplicated elsewhere, or because it was determined that the impact was incorrect. If your old report links to an impact that is Obsolete, you can ignore the impact because it is no longer valid.

You must have a compatible version of the STK file scanner database for correct operation of the STK file scanners. Check the file /opt/STKS/lib/Version and ensure the version is compatible with your OS version. If it is not, see Downloading the STK Tools to download a compatible version of the STK tools.

Q: can someone other than the person doing the scan use the STK reports?
A: Yes, to some degree. The editor links (the file name and line number links) typically do not work when users share reports. The editor links use absolute path names so that you can start the browser in any directory. Because they use absolute path names, other users probably cannot use the reports and cannot edit the files in the editor.

If you want to have multiple users use the same STK report, you can edit the HTML output so that it uses only relative path names. Then you must start your browser in the directory to which the path names are relative. Relative file references are always relative to the browser's current working directory. If you make this change, the reports work for any user that has the same directory setup for the files being scanned.
Q: how can I avoid retyping file scanner options?
A: The file scanners access a resource or configuration file in your home directory called .scanrc. They execute any options you put into this file just as if you had typed them on the command line. See the next question for information on what should and should not be in the $HOME/.scanrc. An example .scanrc is in /opt/STKS/examples/sample_scanrc, and contains commented-out lines for many common file scanner options.
Q: what should and shouldn't I put in the .scanrc?
A: Only add options to your $HOME/.scanrc that you want to run with every scan, so you don't have to change this file frequently. The options you put in .scanrc depend on the phase of the transition process you are in.

During the investigation process, you probably don't want much in your .scanrc file. At this stage, you are doing many different scans, from whole source base scans for all classifications, severities, and types to single file scans for specific classifications, severities, and types.

During the planning process, you know which classifications and severities of impacts you are interested in. At this stage, your .scanrc may exclude some classifications (with the -C class option) and exclude some impact types (with the -T type option).

During the porting process, you may want to exclude individual impacts with the -I synopsisID option.

Be careful about adding -Y identifier_type options. These options can easily exclude many impacts and may not always be safe. You can use -Y Cc,Cf,Ck if you don't have any COBOL files, or -Y Ff,Fs,Fi,Fk if you don't have any Fortran files, but beyond that, you risk excluding more than you should.

Do not list file names in your $HOME/.scanrc file. While the -r rootdir and -f filelist options are OK, you cannot list the actual file names to be scanned.

Q: what can I do about files of "unknown type"?
A: To handle files of unknown type, change their extension to a known file type. The file scanners try to identify each file so that they can use the relevant scanning rules. If a file cannot be identified, they use generic rules that can produce overreporting.

To determine a file's type, the file scanners use information in the client configuration file /etc/opt/STKS/config/client. This file contains lists of extensions and names that identify different types of files. It also contains a list of interpreters that identify scripting languages.

You can use this information in two ways. First, you can change the file so it can be identified. For example, if you have a script, its first line should be #!/path/interpreter. The file scanners use this line to determine that the file is actually a script.

Or, you could edit the client configuration file. For example, if you use .perl to identify your perl scripts, you can add it to the variable SCRIPT_EXTENSIONS. Add the extensions used for your scripts to the SCRIPT_EXTENSIONS line. If these extensions are used by any other file types, delete them from the corresponding *_EXTENSIONS line.
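
To check whether a script already carries an interpreter line of the form #!/path/interpreter, its first line can be inspected as in the sketch below. This only illustrates the convention mentioned above; the function name interpreter_from_first_line is hypothetical, and how the file scanners match the interpreter against their configuration is not shown.

```python
def interpreter_from_first_line(first_line):
    """Return the interpreter path from a '#!' line, or None if there is none."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None
```

A result of None suggests that the file would need either an interpreter line or a recognized extension before it can be identified as a script.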

Q: where can I get information on file scanner errors and warnings?
A: Full information on file scanner error and warning messages can be found in File Scanner Error Messages.
Q: what is the difference between scansummary, scandetail, and scanwizard?
A: Both scansummary and scandetail are tools that scan source code.
scansummary helps you plan your source code transition by determining the number of instances of transition impacts in your source files.
scandetail helps you perform a transition by indicating exactly what transition impacts occur on each line of your source files.
scanwizard is an interactive command line utility which assists you in generating a custom detailed or summary report using the scandetail or scansummary utilities. This is the recommended scanning tool for new users.
© 2008 Hewlett-Packard Development Company, L.P.