====== What is available in the debug information? ======

The debug info and the API closely mirror the items available in the sources used to build an executable. To use the API efficiently, it is necessary to understand the building blocks that make up this information.

===== Data pieces =====

There are several types of information:

  * Libraries
  * Lines
  * Modules
  * Scopes
  * Segments
  * Source files
  * Spans
  * Symbols
  * Types

Each item of each type has something like a primary index called an ''id''. The ids can be thought of as array indices, so looking up something by its id is fast. Invalid ids are marked with the special value ''CC65_INV_ID''.

Data passed back for an item may contain ids of other objects. A scope, for example, contains the id of its parent scope (or ''CC65_INV_ID'' if there is no parent scope). Most API functions use ids to look up related objects.
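To illustrate how ids link items together, here is a minimal sketch that follows parent ids from a scope up to its main scope. The ''scope_rec'' struct, the ''DEMO_INV_ID'' marker and its value, and the ''scope_depth'' function are hypothetical stand-ins for this example, not the library's own declarations.

```c
#include <stddef.h>

/* Stand-in for the invalid-id marker. The name and value are assumptions
 * made for this sketch, not taken from the library header. */
#define DEMO_INV_ID (~0U)

/* Hypothetical, simplified scope record holding only what the sketch needs. */
typedef struct {
    unsigned    id;         /* Index of this scope in the table */
    const char *name;       /* Empty for the main scope */
    unsigned    parent_id;  /* DEMO_INV_ID if there is no parent */
} scope_rec;

/* Count the ancestors of a scope by following parent ids.
 * Ids are used directly as array indices. Returns 0 for a main scope
 * or for an invalid id. The depth < count check guards against a
 * malformed table containing a cycle. */
unsigned scope_depth(const scope_rec *scopes, size_t count, unsigned id)
{
    unsigned depth = 0;
    while (id != DEMO_INV_ID && id < count &&
           scopes[id].parent_id != DEMO_INV_ID && depth < count) {
        id = scopes[id].parent_id;
        ++depth;
    }
    return depth;
}
```

Because every lookup is a plain array access, walking even deeply nested scopes stays cheap.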

===== Libraries =====

This information comes from the linker and is currently used in only one place: to mark the origin of a module. The information available for a library is its name, including the path.

^  Available information                       ^
| Library id                                   |
| Name and path of library                     |

===== Lines =====

A line is a location in a source file. It is module dependent, which means that if two modules use the same source file, each one has its own line information for this file. While the assembler also has column information, it is dropped early because it would generate much more data. A line may have one or more spans attached if code or data is generated for it.

^  Available information                                           ^
| Line id                                                          |
| Id of the source file the line is from                           |
| The line number in the file (starting with 1)                    |
| The type of the line: Assembler/C source or macro                |
| A count for recursive macros if the line comes from a macro      |


===== Modules =====

A module is actually an object file. It is generated from one or more source files and may come from a library. The assembler generates a main scope for symbols declared outside user generated scopes. The main scope has an empty name.

^  Available information                                                  ^
| Module id                                                               |
| The name of the module including the path                               |
| The id of the main source file (the one specified on the command line)  |
| The id of the library the module comes from, or CC65_INV_ID             |
| The id of the main scope for this module                                |


===== Scopes =====

Each module has a main scope that holds all symbols declared outside other scopes. Additional nested scopes may be specified in the sources. Scopes therefore form a one-to-many relation: each scope (with the exception of the main scope) has exactly one parent and may have several child scopes. Scopes may not cross modules.

^  Available information                                                  ^
| Scope id                                                                |
| The name of the scope (may be empty)                                    |
| The type of the scope: Module, .SCOPE or .PROC, .STRUCT and .ENUM       |
| The size of the scope (the size of the span for the active segment)     |
| The id of the parent scope (CC65_INV_ID in case of the main scope)      |
| The id of the attached symbol for .PROC scopes                          |
| The id of the module where the scope comes from                         |


===== Segments =====

^  Available information                                                  ^
| Segment id                                                              |
| The name of the segment                                                 |
| The start address of the segment                                        |
| The size of the segment                                                 |
| The name of the output file this segment was written to (may be empty)  |
| The offset of the segment in the output file (only if name not empty)   |

It is also possible to retrieve the spans for sections (a section is the part of a segment that comes from one module). Since the main scope covers a whole module, and the main scope has spans assigned (if not empty), the spans for the main scope of a module are also the spans for the sections in the segments.

===== Source files =====

Modules are generated from source files. Since some source files are used several times when generating a list of modules (header files for example), the linker will merge duplicates to reduce redundant information. Source files are considered identical if the full name including the path is identical, and the size and time of last modification match. Please note that there may still be duplicates if files are accessed using different paths.

^  Available information                                                  ^
| Source file id                                                          |
| The name of the source file including the path                          |
| The size of the file at the time when it was read                       |
| The time of last modification at the time when the file was read        |

It is suggested that a debugger use the path of the file to locate it on disk, and the size and time of modification to check if the file has been modified. Showing a warning to the user in the latter case might prevent avoidable errors like debugging an executable using the wrong versions of the sources.
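The check suggested above could be sketched as follows. This example uses the POSIX ''stat'' call, which is a choice made for this sketch. The ''check_source'' function and the ''src_status'' enum are hypothetical, and the sketch assumes the recorded modification time uses the same units as ''st_mtime'' (seconds since the epoch).

```c
#include <sys/stat.h>

/* Result of comparing a source file on disk with the recorded data. */
typedef enum { SRC_OK, SRC_MISSING, SRC_CHANGED } src_status;

/* Compare the file at 'path' with the size and modification time that
 * were recorded when the file was read during the build.
 * Assumption: rec_mtime is in the same units as st_mtime. */
src_status check_source(const char *path,
                        unsigned long rec_size,
                        unsigned long rec_mtime)
{
    struct stat st;

    /* The file cannot be located under the recorded path */
    if (stat(path, &st) != 0) {
        return SRC_MISSING;
    }

    /* Either value differing means the file was modified after the build */
    if ((unsigned long) st.st_size != rec_size ||
        (unsigned long) st.st_mtime != rec_mtime) {
        return SRC_CHANGED;
    }
    return SRC_OK;
}
```

A debugger would typically warn the user on ''SRC_CHANGED'' and fall back to a disassembly view on ''SRC_MISSING''.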


===== Spans =====

A span is a small part of a segment. It has a start address and a size. Spans are used to record sizes of other objects. Line infos and scopes may have spans attached, so it is possible to look up which data was generated for these items.

^  Available information                                                                      ^
| Span id                                                                                     |
| The start address of the span. This is an absolute address                                  |
| The end address of the span. This is inclusive, so start==end means size==1                 |
| The id of the segment where the span is located                                             |
| The type of the data in the span (optional, may be NULL)                                    |
| The number of line infos available for this span                                            |
| The number of scope infos available for this span                                           |

The last two fields will save a call to ''cc65_line_byspan'' or ''cc65_scope_byspan'' by providing information about the number of items that can be retrieved by these calls.
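Since the end address is inclusive, size and containment checks need an explicit off-by-one adjustment. A minimal sketch, using a hypothetical ''span_range'' struct that only models the two address fields:

```c
/* Hypothetical stand-in holding only the two addresses of a span. */
typedef struct {
    unsigned long start;    /* Absolute start address */
    unsigned long end;      /* Absolute end address, inclusive */
} span_range;

/* Number of bytes covered by the span: start == end yields 1. */
unsigned long span_size(const span_range *s)
{
    return s->end - s->start + 1;
}

/* Return nonzero if 'addr' lies inside the span, both ends included. */
int span_contains(const span_range *s, unsigned long addr)
{
    return addr >= s->start && addr <= s->end;
}
```

For example, a span from ''$8000'' to ''$8002'' covers three bytes, and ''$8002'' is still part of it.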


===== Symbols =====

^  Available information                                                                      ^
| Symbol id                                                                                   |
| The name of the symbol                                                                      |
| The type of the symbol, which may be label, equate or import                                |
| The size of the symbol (size of attached code or data). Only for labels. Zero if unknown    |
| The value of the symbol. For an import, this is taken from the corresponding export         |
| The id of the corresponding export. Only valid for imports, CC65_INV_ID for other symbols   |
| The segment id if the symbol is segment based. For an import, taken from the export         |
| The id of the scope this symbol was defined in                                              |
| The id of the parent symbol. This is only set for cheap locals and CC65_INV_ID otherwise    |

Beware: Even for an import, the id of the corresponding export may be CC65_INV_ID. This happens if the module with the export has no debug information. So make sure that your application can handle it.
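One way to handle this case is to resolve an import to its export only when the export id is valid, and to report the absence otherwise. The following is a sketch under stated assumptions: ''sym_rec'', ''sym_kind'', ''DEMO_INV_ID'' and ''sym_definition'' are hypothetical names for this example and do not reflect the library's actual declarations.

```c
#include <stddef.h>

/* Stand-in for the invalid-id marker; name and value are assumptions. */
#define DEMO_INV_ID (~0U)

typedef enum { SYM_LABEL, SYM_EQUATE, SYM_IMPORT } sym_kind;

/* Hypothetical, simplified symbol record. */
typedef struct {
    unsigned    id;         /* Index of this symbol in the table */
    const char *name;
    sym_kind    kind;
    unsigned    export_id;  /* Only meaningful for imports */
} sym_rec;

/* Return the record that defines symbol 'id'.
 * Labels and equates define themselves. An import resolves to its
 * export, or to NULL if the exporting module has no debug information. */
const sym_rec *sym_definition(const sym_rec *syms, size_t count, unsigned id)
{
    const sym_rec *s;

    if (id == DEMO_INV_ID || id >= count) {
        return NULL;
    }
    s = &syms[id];
    if (s->kind != SYM_IMPORT) {
        return s;
    }
    if (s->export_id == DEMO_INV_ID || s->export_id >= count) {
        return NULL;        /* No debug info for the export */
    }
    return &syms[s->export_id];
}
```

Callers must treat a ''NULL'' result as a normal outcome rather than an error, for example by showing the import's own name and value.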

===== Types =====

A type is somewhat special. You cannot retrieve data about it in a similar way as with the other items. Instead you have to call a special routine that parses the type data and returns it in a set of data structures that can be processed by a C or C++ program.

The type information is language independent and doesn't encode things like "const" or "volatile". Instead it defines a set of simple data types and a few ways to aggregate them (arrays, structs and unions).

Type information is currently generated by the assembler for storage allocating commands like .BYTE or .WORD. For example, the assembler code
<code asm>
foo:    .byte $01, $02, $03
</code>
will assign the symbol foo a size of 3, but will also generate a span with a size of 3 bytes and a type "array[3] of BYTE".

Evaluating the type of a span allows a debugger to display the data in the same way as it was defined in the assembler source.

^ Assembler Command  ^ Generated Type Information                           ^
| ''.ADDR''          | ARRAY OF LITTLE ENDIAN POINTER WITH SIZE 2 TO VOID   |
| ''.BYTE''          | ARRAY OF UNSIGNED WITH SIZE 1                        |
| ''.DBYT''          | ARRAY OF BIG ENDIAN UNSIGNED WITH SIZE 2             |
| ''.DWORD''         | ARRAY OF LITTLE ENDIAN UNSIGNED WITH SIZE 4          |
| ''.FARADDR''       | ARRAY OF LITTLE ENDIAN POINTER WITH SIZE 3 TO VOID   |
| ''.WORD''          | ARRAY OF LITTLE ENDIAN UNSIGNED WITH SIZE 2          |
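The byte order in the table matters when a debugger displays the contents of a span. The sketch below decodes one array element from raw memory. The ''decode_unsigned'' function and its parameters are hypothetical, and the size and byte order are assumed to have been obtained from the parsed type information beforehand.

```c
/* Decode one unsigned value of 'size' bytes (1 to 4) from 'bytes'.
 * 'big_endian' is nonzero for data such as .DBYT and zero for
 * little endian data such as .WORD, .DWORD, .ADDR or .FARADDR. */
unsigned long decode_unsigned(const unsigned char *bytes,
                              unsigned size, int big_endian)
{
    unsigned long value = 0;
    unsigned i;

    if (big_endian) {
        /* Most significant byte comes first in memory */
        for (i = 0; i < size; ++i) {
            value = (value << 8) | bytes[i];
        }
    } else {
        /* Least significant byte comes first in memory */
        for (i = size; i > 0; --i) {
            value = (value << 8) | bytes[i - 1];
        }
    }
    return value;
}
```

With this helper, the bytes ''$12 $34'' decode to ''$3412'' when defined by ''.WORD'' and to ''$1234'' when defined by ''.DBYT''. The element count of an array follows from dividing the span size by the element size.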