What is available in the debug information?

The debug info and the API closely mirror the items available in the sources used to build an executable. To use the API efficiently, it is necessary to understand the building blocks from which the information is assembled.

Data pieces

There are several types of information:

  • Libraries
  • Lines
  • Modules
  • Scopes
  • Segments
  • Source files
  • Spans
  • Symbols
  • Types

Each item of each type has something like a primary index called an id. The ids can be thought of as array indices, so looking up something by its id is fast. Invalid ids are marked with the special value CC65_INV_ID.

Data passed back for an item may contain ids of other objects. A scope, for example, contains the id of its parent scope (or CC65_INV_ID if there is no parent scope). Most API functions use ids to look up related objects.
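
The id scheme described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: ScopeRec, INV_ID, scope_by_id and scope_depth are hypothetical names invented for this example and are not part of the debug info API. INV_ID merely stands in for CC65_INV_ID.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CC65_INV_ID; the real constant comes from
 * the debug info API header. */
#define INV_ID (~0U)

/* Hypothetical, simplified scope record with only the fields needed here. */
typedef struct {
    unsigned    id;         /* Index of this record in its table */
    const char *name;       /* Empty for a main scope */
    unsigned    parent_id;  /* INV_ID if there is no parent scope */
} ScopeRec;

/* Look up a scope by id. Since ids behave like array indices, this is a
 * bounds check plus one array access. Returns NULL for INV_ID or for any
 * id outside the table. */
static const ScopeRec *scope_by_id(const ScopeRec *table, size_t count,
                                   unsigned id)
{
    assert(table != NULL || count == 0);
    if (id == INV_ID || id >= count) {
        return NULL;
    }
    return &table[id];
}

/* Count the ancestors of a scope by following parent ids until no valid
 * parent is left. The depth < count guard stops on malformed data that
 * contains a cycle. */
static unsigned scope_depth(const ScopeRec *table, size_t count, unsigned id)
{
    unsigned depth = 0;
    const ScopeRec *scope = scope_by_id(table, count, id);

    while (scope != NULL && depth < count) {
        const ScopeRec *parent = scope_by_id(table, count, scope->parent_id);
        if (parent == NULL) {
            break;
        }
        scope = parent;
        ++depth;
    }
    return depth;
}
```

A caller that receives an id from one record can pass it straight to the lookup for the related table, as long as it checks for the invalid marker first.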

Libraries

This information comes from the linker and is currently used in only one place: to mark the origin of a module. The information available for a library is its name including the path.

Available information
Library id
Name and path of library

Lines

A line is a location in a source file. It is module dependent, which means that if two modules use the same source file, each one has its own line information for this file. While the assembler also has column information, it is dropped early because it would generate much more data. A line may have one or more spans attached if code or data is generated.

Available information
Line id
Id of the source file the line is from
The line number in the file (starting with 1)
The type of the line: Assembler/C source or macro
A count for recursive macros if the line comes from a macro

Modules

A module is actually an object file. It is generated from one or more source files and may come from a library. The assembler generates a main scope for symbols declared outside user generated scopes. The main scope has an empty name.

Available information
Module id
The name of the module including the path
The id of the main source file (the one specified on the command line)
The id of the library the module comes from, or CC65_INV_ID
The id of the main scope for this module

Scopes

Each module has a main scope that holds all symbols declared outside other scopes. Additional nested scopes may be specified in the sources. Scopes therefore form a one-to-many relation: each scope (with the exception of the main scope) has exactly one parent and may have several child scopes. Scopes may not cross modules.

Available information
Scope id
The name of the scope (may be empty)
The type of the scope: Module, .SCOPE or .PROC, .STRUCT and .ENUM
The size of the scope (the size of the span for the active segment)
The id of the parent scope (CC65_INV_ID in case of the main scope)
The id of the attached symbol for .PROC scopes
The id of the module where the scope comes from
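
Because every scope except the main scope has exactly one parent, a debugger can build a qualified scope name by walking the parent ids. The following is a minimal sketch, assuming a hypothetical in-memory table (ScopeRec, INV_ID, MAX_NESTING and scope_full_name are invented for this example) and joining names with "::", the separator ca65 uses for scoped names. The empty name of the main scope is skipped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define INV_ID      (~0U)  /* Hypothetical stand-in for CC65_INV_ID */
#define MAX_NESTING 32     /* Arbitrary nesting limit for this sketch */

/* Hypothetical, simplified scope record. */
typedef struct {
    const char *name;       /* May be empty (main scope) */
    unsigned    parent_id;  /* INV_ID for the main scope */
} ScopeRec;

/* Write the qualified name of scope `id` into buf, outermost scope first.
 * Returns 0 on success and -1 if the id is invalid, the nesting is deeper
 * than MAX_NESTING, or the buffer is too small. */
static int scope_full_name(const ScopeRec *table, unsigned count, unsigned id,
                           char *buf, size_t bufsize)
{
    unsigned chain[MAX_NESTING];
    unsigned n = 0;
    size_t used = 0;

    assert(buf != NULL && bufsize > 0);
    buf[0] = '\0';

    /* Walk from the scope up to the main scope, remembering each id. */
    while (id != INV_ID) {
        if (id >= count || n == MAX_NESTING) {
            return -1;
        }
        chain[n++] = id;
        id = table[id].parent_id;
    }
    if (n == 0) {
        return -1;  /* The caller passed the invalid id */
    }

    /* Emit the names in reverse order, skipping empty ones. */
    while (n > 0) {
        const char *name = table[chain[--n]].name;
        int written;

        if (name[0] == '\0') {
            continue;
        }
        written = snprintf(buf + used, bufsize - used, "%s%s",
                           used > 0 ? "::" : "", name);
        if (written < 0 || (size_t)written >= bufsize - used) {
            return -1;
        }
        used += (size_t)written;
    }
    return 0;
}
```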

Segments

Available information
Segment id
The name of the segment
The start address of the segment
The size of the segment
The name of the output file this segment was written to (may be empty)
The offset of the segment in the output file (only if name not empty)
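
The segment fields listed above are enough to map a memory address to a position in the output file. The sketch below assumes that a segment is written to its output file as one contiguous block, so the file position is the segment's file offset plus the distance from the segment start. SegRec and addr_to_file_offset are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified segment record mirroring the fields above. */
typedef struct {
    const char   *name;         /* Name of the segment */
    unsigned long start;        /* Start address */
    unsigned long size;         /* Size in bytes */
    const char   *output_name;  /* NULL or empty if not written to a file */
    unsigned long output_offs;  /* Only meaningful if output_name is set */
} SegRec;

/* Translate an address inside a segment into an offset in the output
 * file. Returns 1 and stores the offset on success. Returns 0 if the
 * segment has no output file or the address lies outside the segment.
 * Assumption: the segment is stored contiguously in the file. */
static int addr_to_file_offset(const SegRec *seg, unsigned long addr,
                               unsigned long *offs)
{
    assert(seg != NULL && offs != NULL);

    if (seg->output_name == NULL || seg->output_name[0] == '\0') {
        return 0;
    }
    if (addr < seg->start || addr - seg->start >= seg->size) {
        return 0;
    }
    *offs = seg->output_offs + (addr - seg->start);
    return 1;
}
```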

It is also possible to retrieve the spans for sections (a section is the part of a segment that comes from one module). Since the main scope covers a whole module, and the main scope has spans assigned (if not empty), the spans for the main scope of a module are also the spans for the sections in the segments.

Source files

Modules are generated from source files. Since some source files are used several times when generating a list of modules (header files for example), the linker will merge duplicates to reduce redundant information. Source files are considered identical if the full name including the path is identical, and the size and time of last modification match. Please note that there may still be duplicates if files are accessed using different paths.

Available information
Source file id
The name of the source file including the path
The size of the file at the time when it was read
The time of last modification at the time when the file was read

It is suggested that a debugger use the path of the file to locate it on disk, and the size and time of modification to check whether the file has been modified. Showing a warning to the user in the latter case might prevent dumb errors like debugging an executable using the wrong versions of the sources.
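
Such a check could look roughly like the sketch below. It uses the POSIX stat() call and assumes that the recorded modification time uses the same representation as st_mtime (seconds since the epoch), which should be verified against the actual debug info before relying on it. SrcStatus, check_source and warn_if_stale are hypothetical names for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

typedef enum { SRC_OK, SRC_MISSING, SRC_CHANGED } SrcStatus;

/* Compare a source file on disk with the size and modification time
 * recorded in the debug info. Assumption: `mtime` is comparable to
 * st_mtime as reported by stat(). */
static SrcStatus check_source(const char *path, unsigned long size,
                              unsigned long mtime)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0) {
        return SRC_MISSING;
    }
    if ((unsigned long)st.st_size != size ||
        (unsigned long)st.st_mtime != mtime) {
        return SRC_CHANGED;
    }
    return SRC_OK;
}

/* Print a warning to stderr if the file cannot be trusted. Returns the
 * status so the caller can decide how to proceed. */
static SrcStatus warn_if_stale(const char *path, unsigned long size,
                               unsigned long mtime)
{
    SrcStatus status = check_source(path, size, mtime);

    if (status == SRC_MISSING) {
        fprintf(stderr, "warning: source file %s not found\n", path);
    } else if (status == SRC_CHANGED) {
        fprintf(stderr, "warning: %s differs from the version used "
                        "to build the executable\n", path);
    }
    return status;
}
```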

Spans

A span is a small part of a segment. It has a start address and a size. Spans are used to record sizes of other objects. Line infos and scopes may have spans attached, so it is possible to look up which data was generated for these items.

Available information
Span id
The start address of the span. This is an absolute address
The end address of the span. This is inclusive, which means that start == end implies a size of 1
The id of the segment where the span is located
The type of the data in the span (optional, may be NULL)
The number of line infos available for this span
The number of scope infos available for this span

The last two fields will save a call to cc65_line_byspan or cc65_scope_byspan by providing information about the number of items that can be retrieved by these calls.
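
The inclusive end address is an easy source of off-by-one errors, so a small sketch may help. SpanRec and the helper functions below are hypothetical and only model the fields listed above. They are not the API's own data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified span record. */
typedef struct {
    unsigned long start;        /* Absolute start address */
    unsigned long end;          /* Absolute end address, inclusive */
    unsigned      line_count;   /* Number of line infos for this span */
    unsigned      scope_count;  /* Number of scope infos for this span */
} SpanRec;

/* Size in bytes. The end address is inclusive, hence the + 1. */
static unsigned long span_size(const SpanRec *span)
{
    assert(span != NULL && span->end >= span->start);
    return span->end - span->start + 1;
}

/* True if addr lies inside the span, both ends included. */
static int span_contains(const SpanRec *span, unsigned long addr)
{
    return addr >= span->start && addr <= span->end;
}

/* Return the index of the first span containing addr, or `count` if
 * there is none. A linear scan keeps the sketch short. */
static size_t find_span(const SpanRec *spans, size_t count,
                        unsigned long addr)
{
    size_t i;

    for (i = 0; i < count; ++i) {
        if (span_contains(&spans[i], addr)) {
            break;
        }
    }
    return i;
}

/* The count field tells the caller whether a lookup such as
 * cc65_line_byspan can return anything at all. */
static int span_has_lines(const SpanRec *span)
{
    return span->line_count > 0;
}
```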

Symbols

Available information
Symbol id
The name of the symbol
The type of the symbol, which may be label, equate or import
The size of the symbol (size of attached code or data). Only for labels. Zero if unknown
The value of the symbol. For an import, this is taken from the corresponding export
The id of the corresponding export. Only valid for imports, CC65_INV_ID for other symbols
The segment id if the symbol is segment based. For an import, taken from the export
The id of the scope this symbol was defined in
The id of the parent symbol. This is only set for cheap locals and CC65_INV_ID otherwise

Beware: Even for an import, the id of the corresponding export may be CC65_INV_ID. This happens if the module with the export has no debug information. So make sure that your application can handle it.
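
One way to handle this case is to fall back to the import itself whenever no export is available. The sketch below assumes a hypothetical symbol table (SymRec, SymType, INV_ID and definition_of are invented names, with INV_ID standing in for CC65_INV_ID) and shows only the fallback logic.

```c
#include <assert.h>
#include <stddef.h>

#define INV_ID (~0U)  /* Hypothetical stand-in for CC65_INV_ID */

typedef enum { SYM_LABEL, SYM_EQUATE, SYM_IMPORT } SymType;

/* Hypothetical, simplified symbol record. */
typedef struct {
    const char *name;
    SymType     type;
    unsigned    export_id;  /* INV_ID unless an import with a known export */
    unsigned    scope_id;   /* Scope the symbol was defined in */
} SymRec;

/* Return the record that best describes where a symbol is defined: the
 * corresponding export for an import that has one, otherwise the symbol
 * itself. Returns NULL only if `id` is not a valid index. */
static const SymRec *definition_of(const SymRec *table, unsigned count,
                                   unsigned id)
{
    const SymRec *sym;

    assert(table != NULL || count == 0);
    if (id == INV_ID || id >= count) {
        return NULL;
    }
    sym = &table[id];

    /* The export may be missing if its module has no debug information. */
    if (sym->type == SYM_IMPORT &&
        sym->export_id != INV_ID && sym->export_id < count) {
        return &table[sym->export_id];
    }
    return sym;
}
```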

Types

A type is somewhat special. You cannot retrieve data about it in a similar way as with the other items. Instead you have to call a special routine that parses the type data and returns it in a set of data structures that can be processed by a C or C++ program.

The type information is language independent and doesn't encode things like “const” or “volatile”. Instead it defines a set of simple data types and a few ways to aggregate them (arrays, structs and unions).

Type information is currently generated by the assembler for storage allocating commands like .BYTE or .WORD. For example, the assembler code

foo:    .byte $01, $02, $03

will assign the symbol foo a size of 3, but will also generate a span with a size of 3 bytes and a type “array[3] of BYTE”.

Evaluating the type of a span allows a debugger to display the data in the same way as it was defined in the assembler source.

Assembler Command    Generated Type Information
.ADDR                ARRAY OF LITTLE ENDIAN POINTER WITH SIZE 2 TO VOID
.BYTE                ARRAY OF UNSIGNED WITH SIZE 1
.DBYT                ARRAY OF BIG ENDIAN UNSIGNED WITH SIZE 2
.DWORD               ARRAY OF LITTLE ENDIAN UNSIGNED WITH SIZE 4
.FARADDR             ARRAY OF LITTLE ENDIAN POINTER WITH SIZE 3 TO VOID
.WORD                ARRAY OF LITTLE ENDIAN UNSIGNED WITH SIZE 2
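
The element size and byte order from the table above are what a debugger needs to turn raw memory into values. The following sketch decodes one array element. Endian and decode_unsigned are hypothetical names for this example and do not correspond to the data structures returned by the type parsing routine.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ENDIAN_LITTLE, ENDIAN_BIG } Endian;

/* Decode one unsigned element of 1 to 4 bytes. The bytes are visited
 * from most significant to least significant, so for little endian data
 * the walk starts at the last byte. */
static unsigned long decode_unsigned(const unsigned char *bytes, size_t size,
                                     Endian endian)
{
    unsigned long value = 0;
    size_t i;

    assert(bytes != NULL && size >= 1 && size <= 4);
    for (i = 0; i < size; ++i) {
        size_t idx = (endian == ENDIAN_LITTLE) ? size - 1 - i : i;
        value = (value << 8) | bytes[idx];
    }
    return value;
}
```

For the two bytes $34 $12 in memory, a span typed as .WORD data (little endian, size 2) decodes to $1234, while the same bytes typed as .DBYT data (big endian, size 2) decode to $3412.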
cc65/debug-info-data.txt · Last modified: 2011-09-09 14:12 by polluks
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution 3.0 Unported